Download as pdf or txt
Download as pdf or txt
You are on page 1of 478

DYNAMICAL

SYSTEMS
Stability,
SYITlbolic DynaITlics,
and Chaos

Clark Robinson

CRC Press
Boca Raton Ann Arbor London Tokyo
Ubral')' 01 COOI"SI CatalopDg-lo-Publicatioo Data

Robinson. C. (Clark)
Dynamical systems: stabilily. symbolic dynamics. and chaos f Clark Robinson.
p. cm.--{Studies in advanced mathematics)
Includes bibliographical references and index.
ISBN G-8493·8493-1
I. Differentiable dynamical systems. I. Title. II. Serics.
QA614.8.R63 1994
514'.74--dc20 94-24456
CIP

This book cont ai ns infonnation obtained from authentic and highly regarded sources. Reprinted material i. quoled
with pe nnission. and sources are indicated. A wide variety of references are listed. Reasonable effons have been made
to poIbIish reliable dala and infonnation. hut the author and the publisher cannol assume responsibilily for the validity
of all IDIfcriais or for the consequences of their usc.
Neither this book nor any pan may be reproduced or transmitted in any form or by any means. electronic or
mechanical. including pholocopying. microfilming. and recording. or by any infonnation storage or retrieval system.
without prior pennission in writing from the publisher.
CRC Press. I nc. ·s conscnt does not extend to copying for general distribution. for promotion. for creating new
works. or for resale. Specific penoission must be obtained in writing from CRC Press for such copying.
Direct all inquiries 10 CRC Press. Inc 2000 Corporate Blvd N.W Boca Ralon. Florida 33431.
.• .• .•

C 1995 by-CRC �ss. InlO.

No claim 10 original U.S. Governmenl works


International SIaDdard Book Number G-8493-8493-1
Ubnry of COIIpas CII'II Number 94-24456
PrInted in the United Slates of America 2 3 4 5 6 7 8 9 0
PrlIIIed on acid-free paper
Preface

In recent years, Dynamical Systems has had many applications to science and engi-
neering some of which have gone under the related headings of chaos theory or nonlinear
analysis. Behind these applications there lies a rich mathematical subject which we treat
in this book. This subject centers on the orbits of iteration of a (nonlinear) function
or of the solutions of (nonlinear) ordinary differential equations. In particular, we are
interested in the properties which persist under nonlinear change of coordinates. As
such, we are interested in the geometric or topological aspects of the orbits or solutions
more than an explicit formula for an orbit (which may not be available in any case).
However, as becomes clear in the treatment in this book, there are many properties
of a particular solution or the whole system which can be measured by some quantity.
Also, although the subject has a geometric or topological flavor, analytic analysis plays
an important role (e.g. the local analysis near a fixed point and the stable manifold
theory).
There have been several books and monographs on the subject of Dynamical Systems.
There are several distinctive aspects which together make this book unique.
First of all, this book treats the subject from a mathematical perspective with the
proofs of most of the results included: the only proofs which are omitted either (i) are
left to the reader, (Ii) are too technically difficult to include in an introductory book,
even at the graduate level, or (iii) concern a topic which is only included as a bridge
between the material covered in the book and commonly encountered concepts in Dy-
namical Systems. (Much of the material concerning measures, Liapunov exponents,
and fractal dimension is of this latter category.) Although it has a mathematical per-
spective, readers who are more interested in applied or computational aspects of the
subject should find the explicit statements of the results helpful even if they do not
concern themselves with the details of the proofs. In particular, the inclusion of explicit
formulas for the various bifurcations should be very useful.
Second, this book is meant to be a graduate textbook and not just a reference book
or monograph on the subject. This aspect of the book is reflected in the way the
background materials are carefully reviewed as we use them. (The particular prerequi-
sites from undergraduate mathematics are discussed below.) The ideas are introduced
through examples and at a level which is accessible to a beginning graduate student.
Many exercises are included to help the student learn the meaning of the theorems and
master the techniques of the proofs and topic under consideration. Since the exercises
are not usually just routine applications of theorems but involve similar proofs and or
calculations, they are best assigned in groups, weekly or biweekly. For this reason, they
are grouped at the end of each chapter rather in the individual section.
Third, the scope of the book is on the scale of a year long graduate course and is
designed to be used in such a graduate level mathematics course in Dynamical Systems.
This means that the book is not comprehensive or exhaustive but tries to treat the core
concepts thoroughly and treat others enough so the reader will be prepared to read
further in Dynamical Systems without a complete mathematical treatment. In fact,
this book grew out of a graduate course that I taught at Northwestern University many

iii
times between the early 1970s and the present. To the material that I covered in that
course, I have added a few other topics: some of which my colleagues treat when they
teach the course, others round out the treatment of a topic covered earlier in the book
(e.g. Chapter XI), and others just give greater flexibility to possible courses using this
book. Details on which sections form the core of the book are discussed in Section 1.4.
The perspective of the book is centered on multidimensional systems of real variables.
Chapters II and III concern functions of one real variable, but this is done mainly
because this makes the treatment simpler analytically than that given later in higher
dimensions: there are not lUly (or many) aspects introduced which are unique to one
dimension. Some results are proved so they apply in Banach spaces or even complete
metric but most of the results are developed in finite dimensions. In particular, no direct
connection with partial differential equations or delay equations is given. The fact that
the book concerns functions of real rather than complex variables explains why topics
such as the Julia set, Mandelbrot set, and Measurable Riemann Mapping Theorem are
not treated.
This book treats the dynamics of both iteration of functions and solutions of ordinary
differential equations. Many of the concepts are first introduced for iteration of functions
where the geometry is simpler, but an attempt has been made to interpret these results
for differential equations. A proof of the existence and continuity of solutions with
respect to initial conditions is also included to establish the beginnings of this aspect of
the subject.
Although there is much overlap in this book and one on ordinary differential equa-
tions, the emphasis is different. The dynamical systems approach centers more on
properties of the whole system or subsets of the system rather thlUl individual solu-
tions. Even the more local theory in Chapters IV-VI deals with characterizing types
of solutions under various hypotheses. Chapters VII and IX deal more directly with
more global aspects: Chapter VII centers on various examples and Chapter IX gives
the global theory.
Finally, within the various types of Dynamical Systems, this book is most concerned
with hyperbolic systems: this focus is most prominent in Chapters VII, IX, X, and XI.
However, an attempt has been made to make this book valuable to people interested in
various aspects of Dynamical Systems.
The specific prerequisites include undergraduate analysis (including the Implicit
Function Theorem), linear algebra (including the Jordan canonical form), and point set
topology (including Cantor sets). For the analysis, one of the following books should
be sufficient background: Apostol (1974), Marsden (1974), or Rudin (1964). For the
linear algebra, one of the following books should be sufficient background: Hoffman and
Kunze (1961) or Hartley and Hawkes (1970). For the point set topology, one of the
following books should be sufficient background: Croom (1989), Hocking and Young
(1961), or Munkres (1975). What is needed from these other subjects is an ability to
use these tools; knowing a proof of the Implicit Function Theorem does not particularly
help someone know how to use it. For this reason, we carefully discuss the way these
tools are used just before we use them. (See the sections on the Calculus Prerequisites,
Cantor Sets, Real Jordan Canonical Form, Differentiation in Higher Dimensions, Im-
plicit Function Theorem, Inverse Function Theorem, Contraction Mapping Theorem,
and Definition of a Manifold.) After using these tools in Dynamical Systems, the reader
should gain a much better understanding of the importance of these "undergraduate"
subjects. The terminology and ideas from differential topology or differential geometry
are also used, including that of a tangent vector, the tangent bundle, and a mani-
fold. However most surfaces or manifolds are either Euclidean space, tori, or graphs of
functions so these ideas should not be too intimidating. Although someone pursuing
Dynamical Systems further should learn manifold theory, I have tried to make this book
accessible to someone without prior background in this subject. Thus, the prerequisites
for this book are really undergraduate analysis, linear algebra, and point set topology
and not advanced graduate work. However, the reader should be warned that most
beginning graduate students do not find the material at all trivial. The main compli-
cating aspect seems to be the use of a large variety of methods and approaches. The
unifying feature is not the methods used but the type of questions which we are trying
to answer. By having patience and reviewing the mathematics from other subjects as
they are used, the reader should find the material accessible and rich in content, both
mathematical and for applications.
The main topic of the book is the dynamics induced by iteration of a (nonlinear)
function or by the solutions of (nonlinear) ordinary differential equations. In the usual
undergraduate mathematics courses, some properties of solutions of differential equa-
tions are considered but more attention is paid to the specific form of the solution. In
connection with functions, they are graphed and their minima and maxima are found,
but the Iterates of a function are not often considered. To iterate a function we repeat-
edly have the same function act on a point and its images. Thus, for a function f with
initial condition xo, we consider XI = f(xo), and then Xn = f(xn-d for n ;:: 1. We are
interested in finding the qualitative features and long time limiting behavior of a typical
orbit, for either an ordinary differential equation or the iterates of a function. Certainly
fixed points or periodic points are important, but sometimes the orbit moves densely
through a complicated set such as a Cantor set. We want to understand and bring a
structure to this seemingly random behavior. It is often expressed by saying, "we want
to bring order out of chaos." One way of finding this structure is via the tool of symbolic
dynamics. If there is a real valued function f and a sequence of intervals J i such that
the image of J i by f covers Ji+lt f(J i ) :::) Ji+lt then it is possible to show that there
is a point X whose orbit passes through this sequence of intervals, fi(X) E J i . Labels
for the intervals then can be used as symbols, hence the name of symbolic dynamics for
this approach.
Another important concept is that of structural stability. Some types of systems (it-
erated functions or ordinary differential equations) have dynamics which are equivalent
(topologically conjugate) to that of any of its perturbations. Such a system is called
structurally stable. Finally the term chaos is given a special meaning and interpreta-
tion. There is no one set definition of a chaotic system, but we discuss various ideas
and measurements related to chaotic dynamics. One of the ironies is that some chaotic
systems are also structurally stable.
Chapter I gives a more detailed introduction into the main ideas that are treated in
the book by means of examples of functions and differential equations. Suffice it to say
here that these three ideas, symbolic dynamics, structural stability, and chaos, form the
central part of the approach to Dynamical Systems presented in this book.
In the year-long graduate course at Northwestern, we cover the the material in Chap-
ter II, Sharkovskii's Theorem and Subshifts of Finite Type from Chapter III, Chapter
IV except the Perron-Frobenius Theorem, Chapter V except some of the material on
periodic orbits for planar differential equations (and sometime the proof of the Stable
Manifold Theorem is omitted), a selection of examples from Chapter VII, and most of
Chapter IX. In a given year, other selected topics are usually added from among the
following: Chapter VI on bifurcations, the material on topological entropy in Chapter
VIII, and the Kupk&-Smale Theorem. A course which did not emphasize the global
hyperbolic theory as much could be obtained by skipping Chapter IX and treating ad-
ditional topics, e.g. Chapter VI or more on the measurements of chaos. Section 1.4
discusses the content of the different chapters and possible selections of sections or
topics for a course using this book.
There are several other books which give introductions into other aspects or ap-
proaches to Dynamical Systems. For other graduate level mathematical introductions
to Dynamical Systems see Devaney (1989), Irwin (1980), Nitecki (1970), and Palis and
de Melo (1982). For a more comprehensive treatment of Dynamical Systems see Katok
and H_lbl8tt (11Xl4). Some boob which .mph..i.. the dynamic. of iteration of 8
function of one variable are Alseda, Llibre, and Misiurewicz (1993), Block and Coppel
(1992). and de Melo and Van Strien (1993). Carleson and Gamelin (1993) gives an
introduction to the dynamics of functions of a complex variable. Chow and Hale (1982)
gives a more thorough treatment of the bifurcation aspects of Dynamical Systems. The
article by Boyle (1993) gives a more thorough introduction into symbolic dynamics as a
separate subject and not just how it is used to analyze diffeomorphisms or vector fields.
Some books which concentrate on Hamiltonian dynamics are Abraham and Marsden
(1978). Arnold (1978). and Meyer and Hall (1992). For an introduction to applications
of Dynamical Systems, see Guckenheimer and Holmes (1983). Hirsch and Smale (1974).
Wiggins (1990. 1988). and Ott (1993). For applications to ecology. see Hirsch (1982,
1985. 1988. 1990). Hofbauer and Sigmund (1988). Hoppensteadt (1982). May (1975),
and Waltman (1983). There are many books written on Dynamical Systems by people
in fields outside mathematics. including Lichtenberg and Lieberman (1983), Marek and
Schreiber (1991). and Rasband (1990).
I have tried very hard to give references to original papers. However, there are
many researchers working in Dynamical Systems and I am not always aware of (or
remember) contributions by various people to which I should give credit. I apologize
for my omissions. I am sure there are many. I hope the references that I have given will
help the reader start finding the related work in the literature.
When referring to a theorem in the same chapter, we use the number as it appears
in the statement. e.g. Theorem 2.2 which is the second theorem of the second section
of the current chapter. If we are referring to Theorem 2.2 from Chapter VI in a chapter
other other than Chapter VI. we refer to it as Theorem VI.2.2 to indicate it comes from
a different chapter.
There are not any specific references in this book to using a computer to simulate a
dynamical system. However, the reader would benefit greatly by seeing the dynamics as
it unfolds by such simulation. The reader can either write a program for him or herself
or use several of the computer packages available. On an IBM Personal Computer. I
have used the program Phaser which comes with the book by K~ak (1989). Also
the program Dynamics by Yorke (1990) runs on both IBM Personal Computers and
Unix/Xll machines. There are several other programs for IBM Personal Computers
but I have not used them myself. Also the program DsTool by J. Guckenheimer, M.
R. Myers. F. J. Wicklin. and P. A. Worfolk runs on Unix/Xll machines. Many of the
programming languages come with a good enough graphics library that it is not difficult
to write one's own specialized program. However, for the X-Window environment on a
Unix computer, I found the VOGLE library (C graphics C functions) a very helpful
asset to write my own programs. There are several programs for the Macintosh including
MacMath by Hubbard and West (1992), but I have not used them.
Over the years, I have had many useful conversations with colleagues at Northwestern
University and from elsewhere. especially people attending the Midwest Dynamical
Systems Seminars. Those colleagues in Dynamical Systems at Northwestern University
include Keith Burns, John Franks, Don Saari, Robert Williams, and many postdoctoral
instructors and visitors. Those attending the Midwest Dynamical Systems Seminars are
too numerous to list, but surely Charles Conley is one who bears mentioning and will
long be remembered by many of us. I also owe a great debt to the people who taught
me about Dynamical Systems, including Morris Hirsch, Charles Pugh, and Steve Smale.
The perspective on Dynamical Systems which I learned from them is still very evident
in the selection and treatment of topics in this book.
I would also like to thank the many people who found typographical errors, conceJ>-
tual errors, or points that needed to be clarified in earlier drafts of this book. I would
eIIpeclaily like to thank Keith Burna, Beverly Diamond, Roger Kraft, and Mlng-Chla
Li: Keith Burns taught out of a preliminary version and made many suggestions for
improvements, clarifications, and changed arguments; Beverly Diamond made many
suggestions for improvements in grammar and other editing matters; Roger Kraft made
both mathematical and typographical corrections; in addition to noting out typograph-
ical errors, Ming-Chia Li pointed out aspects which needed clarifying.
This text was typeset using A,MS-1EX. The figures were produced using DsTooI, Xfig,
Maple, and Vogle graphics C Library. I would like to thank Len Evens who supplied me
with some macros which were used with ~1EX to produce the chapter and section
titles and numbers, and produce the index and table of contents.
I was supported by several National Science Foundation grants during the years this
book was written.
Clark Robinson
Department of Mathematics
Northwestern University
Evanston, Illinois 60208
clark@math.nwu.edu
Contents

Chapter I. Introduction 1
1.1 Population Growth Models, One Population 2
1.2 Iteration of Real Valued Functions as Dynamical Systems 3
1.3 Higher Dimensional Systems 5
1.4 Outline of t.he Topics of the Chapters 9
Chapter II. One Dimensional Dynamics by Iteration 13
2.1 Calculus Prerequisites 13
*2.2 Periodic Points 15
*2.2.1 Fixed Points for the Quadratic Family 20
*2.3 Limit Sets and Recurrence for Maps 22
*2.4 Invariant Cantor Sets for the Quadratic Family 26
*2.4.1 Middle Cantor Sets 26
*2.4.2 Construction of the Invariant Cantor Set 30
2.4.3 The Invariant Cantor Set for Jl > 4 33
*2.5 Symbolic Dynamics for the Quadratic Map 37
*2.6 Conjugacy and Structural Stability 40
*2.7 Conjugacy and Structural Stability of the Quadratic Map 46
2.8 Homeomorphisms of the Circle 49
2.9 Exercises 57
Chapter III. Chaos and Its Measurement 63
3.1 Sharkovskii's Theorem 63
3.1.1 Examples for Sharkovskii's Theorem 70
3.2 Subshifts of Finite Type 72
3.3 Zeta Function 78
3.4 Period Doubling Cascade 79
3.5 Chaos 81
3.6 Liapunov Exponents 86
3.7 Exercises 88
Chapter IV. Linear Systems 93
4.1 Review: Linear Maps and the Real Jordan Canonical Form 93
*4.2 Linear Differential Equations 95
*4.3 Solutions for Constant Coefficients 97
*4.4 Phase Portraits 102
*4.5 Contracting Linear Differential Equations 106
*4.6 Hyperbolic Linear Differential Equations 111
*4.7 Topologically Conjugate Linear Differential Equations 113
*4.8 Nonhomogeneous Equations 115
*4.9 Linear Maps 116
4.9.1 Perron-F'robenius Theorem 123
4.10 Exercises 127
• Core Sections
Chapter V. Analysis Near Fixed Points and Periodic Orbits 131
*5.1 Review: Differentiation in Higher Dimensions 131
*5.2 Review: The Implicit Function Theorem 134
*5.2.1 Higher Dimensional Implicit Function Theorem 136
*5.2.2 The Inverse Function Theorem 137
*5.2.3 Contraction Mapping Theorem 138
*5.3 Existence of Solutions for Differential Equations 140
*5.4 Limit Sets and Recurrence for Flows 146
*5.5 Fixed Points for Nonlinear Differential Equations 149
*5.5.1 Nonlinear Sinks 150
*5.5.2 Nonlinear Hyperbolic Fixed Points 152
*5.5.3 Liapunov Functions Near a Fixed Point 154
*5.6 Stability of Periodic Points for Nonlinear Maps 156
*5.7 Proof of the Hartman-Grobman Theorem 158
*5.7.1 Proof of the Local Theorem 163
5.7.2 Proof of the Hartman-Grobman Theorem for Flows 165
*5.8 Periodic Orbits for Flows 165
5.8.1 The Suspension of a Map 171
5.8.2 An Attracting Periodic Orbit for the
Van der Pol Equations 171
5.8.3 Poincare Map for Differential Equations in the Plane 176
*5.9 Poincare-Bendixson Theorem 179
*5.10 Stable Manifold Theorem for a Fixed Point of a Map 181
5.10.1 Proof of the Stable Manifold Theorem 185
5.10.2 Center Manifold 197
*5.10.3 Stable Manifold Theorem for Flows 199
*5.11 The Inclination Lemma 200
5.12 Exercises 202
Chapter VI. Bifurcation of Periodic Points 211
6.1 Saddle-Node Bifurcation 211
6.2 Saddle-Node Bifurcation in Higher Dimensions 213
6.3 Period Doubling Bifurcation 218
6.4 Andronov-Hopf Bifurcation for Diffeomorphisms 223
6.5 Andronov-Hopf Bifurcation for Differential Equations 224
6.6 Exercises 231
Chapter VII. Examples of Hyperbolic Sets and Attractors 235
*7.1 Definition of a Manifold 235
*7.1.1 Topology on Space of Differentiable Functions 237
*7.1.2 Tangent Space 238
*7.1.3 Hyperbolic Invariant Sets 241
*7.2 Transitivity Theorems 244
*7.3 Two Sided Shift Spaces 247
7.3.1 Subshifts for Nonnegative Matrices 247
*7.4 Geometric Horseshoe 249
7.4.1 Horseshoe for the Henon Map 255
*7.4.2 Horseshoe from a Homoclinic Point 259
7.4.3 Melnikov Method for Homoclinic Points 268
7.4.4 Fractal Basin Boundaries 274

... Core Sections


*7.5 Hyperbolic Toral Automorphisms 275
7.5.1 Markov Partitions for Hyperbolic Toral
Automorphisms 279
7.5.2 The Zeta Function for Hyperbolic Toral
Automorphisms 288
*7.6 Attractors 292
*7.7 The Solenoid Attractor 294
7.7.1 Conjugacy of the Solenoid to an Inverse Limit 299
7.8 The DA Attractor 300
7.8.1 The Branched Manifold 303
*7.9 Plykin Attractors in the Plane 304
7.10 Attractor for the Henon Map 306
7.11 Lorenz Attractor 309
7.11.1 Geometric Model for the Lorenz Equations 312
7.11.2 Homoclinic Bifurcation to a Lorenz Attractor 318
*7.12 Morse-Smale Systems 318
7.13 Exercises 326
Chapter VIII. Measurement of Chaos in Higher Dimensions 333
8.1 Topological Entropy 333
8.1.1 Proof of Two Theorems on Topological Entropy 343
8.1.2 Entropy of Higher Dimensional Examples 350
8.2 Liapunov Exponents 351
8.3 Sinai-Ruelle-Bowen Measure for an Attractor 356
8.4 Fractal Dimension 356
8.5 Exercises 362
Chapter IX. Global Theory of Hyperbolic Systems 367
9.1 Fundamental Theorem of Dynamical Systems 367
9.1.1 Fundamental Theorem for a Homeomorphism 374
9.2 Stable Manifold Theorem for a Hyperbolic Invariant Set 374
9.3 Shadowing and Expansiveness 377
9.4 Anosov Closing Lemma 381
9.5 Decomposition of Hyperbolic Recurrent Points 382
9.6 Markov Partitions for a Hyperbolic Invariant Set 388
9.7 Local Stability and Stability of Anosov Diffeomorphisms 398
9.8 Stability of Anosov Flows 401
9.9 Global Stability Theorems 403
9.10 Exercises 407
Chapter X. Generic Properties 413
10.1 Kupka-Smale Theorem 413
10.2 Transversality 417
10.3 Proof of the Kupka-Smale Theorem 419
10.4 Necessary Conditions for Structural Stability 425
10.5 Nondensity of Structural Stability 428
10.6 Exercises 430

• Core Sections
Chapter XI. Smoothness of Stable Manifolds and
Applications 433
11.1 Differentiable Invariant Sections for Fiber Contractions 433
11.2 Differentiability of Invariant Splitting 441
11.3 Differentiability of the Center Manifold 444
11.4 Persistence of Normally Contracting Manifolds 444
11.5 Exercises 448
References 451
Index 463
How many are your works, 0 Lord!
In wisdom you made them all;
the earth is full of your creatures.
-Psalm 104:24
CHAPTER I
Introduction

The main goal of the study of Dynamical Systems is to understand the long terl!
behavior of states in a system for which there is a deterministic rule for how a stat,
evolves. The systems often involve several variables and are usually nonlinear. In
variety of settings, very complicated behavior is observed even though the equation
themselves are not very complicated (only "slightly nonlinear"). Thus the simple "I
gebraic form of the equations does not mean that the dynamical behavior is simp)'
in fact, it can be very complicated or even "chaotic." Another aspect of the chaoti
nature of the system is the feature of "sensitive dependence on initial conditions." I
the initial conditions are only approximately specified then the evolution of the stat·
may be very different. This feature leads to another difficulty in using approximate. ()
even real, solutions to predict future states based on present knowledge. To develop !II
understanding of these aspects of chaotic dynamics, we want to find situations whicl
exhibit this behavior and yet for which we can still understand the important featuff'
of how a solution evolves with time.
Sometimes we cannot follow a particular solution with complete certainty becall~'
there is round off error in the calculations or we are using some numerical scheme to fill'
the solution. We are interested to know whether the approximate solution we calculat·
is related to a true solution of the exact equations. In some of the chaotic system,
we can understand how an ensemble of different initial conditions evolves, and pro\
that the approximate solution traced by a numerical scheme is shadowed by a trll
solution with some nearby initial conditions, If the system models the weather, peopl
may not be content to know the range of possible outcomes of the weather that COlli,
develop from the known precision of the previous conditions, or to know that a sma i
change of the previous conditions would have produced the weather which had be<"
predicted. However, even for a subject like weather, for which quantitative as well "
qualitative predictions are Important, it is stili useful to understand what factors ell:
lead to instabilities in the evolution of the state of the system. It is now realized t/};I
no new better simulation of weather on more accurate computers of the future will "
able to predict the weather more than about fourteen days ahead, because of the VI')
nonlinear nature of the evolution of the state of the weather. This type of knowled!,
can itself be useful.
We now proceed to discuss these ideas in a little more detail in terms of specili
equation!>, some of which arise from modeling different physical situations,
First a comment about notation. Throughout the book, points and vectors in R".
manifold, or a metric space are written in bold face, e.g. x. Points (and vectors) on th
line or the circle are denoted without bold face, e,g. x. Also the components of a poil!
or vector in R" are written without bold face, e.g. x = (Xl, ... ,xn)' In a few plac,
it seems strange to use different notation for a point in the line from a point in Rn f,
n ~ 2, but it seems like the best choice to distinguish a vector from a number.
2 I. INTRODUCTION

1.1 Population Growth Models, One Population


In calculus, the differential equation

:i; =ax

is studied, where :i; = ~;. This equation can be used to model many different situations.
In particular, when a > 0 it can model the growth of a population with unlimited
resources or the effect of continuously compounding interest. When a < 0, it can model
radioactive decay. For this simple equation, an explicit expression for the solutions can
be given,
x(t) = xoeat ,
where Xo is the value of x when t = o. If a > 0, the solution tends to infinity as t goes
to infinity, while the solution tends to zero if a < o. The behavior of the solutions is
very simple since the equation is linear. See Figure 1.1.

FIGURE 1.1. Plot of Solutions as a FUnction of Time: for a > 0 and a < 0
Often, we do not plot the solution as a function of t, but merely plot the solution in
the space of possible values of "x" , the phase portrait. The solution curves are labeled
with arrows to indicate the direction that the variable changes as t increases. (This
contains more information when more than one variable is involved.) See Figure 1.2.
This type of phase space analysis can often yield qualitatively important information,
even when an explicit representation of the solution can not be obtained.

o x
• I •

FIGURE 1.2. Phase Portrait: for a > 0 and a <0


A more sophisticated population growth model involves a crowding factor. For this
equation, it is not assumed that the growth rate, XIx, is a constant, but that this
quantity decreases as x increases, XIx = 0 - bx with a, b > o. This equation can be
used to model the growth of a population when the resources are limited. Thus we get
the equation
:i; = (0 - bx)x.

This equation is sometimes called the logistic model for population growth. This equation
can be solved explicitly (by separation of variables and partial fractions), yielding the
solution
axo
x(t) = bxo + (a - bxo)e-at
As can be seen from the differential equation or from the solution, if Xo equals either
o or alb then :i; = 0 and the solution x(t) is constant in time. Such a solution is also
1.2 ITERATION OF REAL VALUED FUNCTIONS 3

called a steady state solution or a fixed point solution. If Xo lies between 0 and alb,
then :i; > 0, and the solution continues to increase with time but can never quite reach
alb. By a simple argument about monotone solutions or by using the exact form of the
solution, it can be seen that for these initial conditions, x(t) tends to alb as t goes to
infinity. Similarly, if Xo > alb, then:i; < 0 and the solution x(t) monotonically decreases
toward alb as t goes to infinity. See Figure 1.3. (Figure 1.3A shows the graphs of four
solutions and Figure 1.3B shows the phase portrait of all solutions.) Thus, for any
initial condition Xo > 0, the solution x(t) tends to the quantity alb as t goes to infinity.
Thus, alb is the long term limit state for any positive initial condition.

O~------------------------

FIGURE 1.3A. Logistic Equation: Solutions as a Function of Time

o x
I
FIGURE 1.3B. Logistic Equation: Phase Portralt

For differential equations on the real line (one real variable), the dynamics are always
about this simple. As t goes to infinity, all solutions must tend to either a steady state
solution (a fixed point) or ±oo.

1.2 Iteration of Real Valued Functions as Dynamical


Systems
As mentioned above, the differential equation :i; = az can be used to model the
growth of capital when interest is compounded continuously. If however the Interest is
compounded at set time intervals (dally, monthly, or yearly) and added to the capital,
then the amount of money at the n-th time interval in terms of the previous time is
given by

where ~ > 1. (Here ~ is equal to one plus the interest rate.) This equation could also be
thought to model the growth of a population which reproduces at fixed time intervals
(e.g. always in the spring) rather than continuously. Letting !J,.(x) = ~x, we get that
Xn = f>.(Xn-l). We also write n(xo) for b, 0 f>.(xo) and ff(xo) = b, 0 fr-1(xo). For
the second iterate, X:l = n(xo) = b,(xt} = ~Xl = ~f>.(xo) = ~(~xo) = ~2xo, and by
4 I. INTRODUCTION

induction on n,

Xn = f>.(xn-d
= >'X n - l
= >.(>.n-l xo )
= >.nxo·
Thus, if>. > I, Xn = ff(xo) tends to infinity as n goes to infinity. If 0 < >. < 1 (some
kind of penalty situation rather than adding interest to the capital), then Xn tends to
o as n goes to infinity. The behavior of these iterates is very similar to the solutions of
the linear differential equation. Again, the dynamics are simple, because this function
is linear.
In the above situation, repeatedly crediting interest corresponded to iteration of a
simple function, f>.. We put in an initial value of xo, and then calculate the value of
x at later "times" n by the rule Xn = h.(xn-d. This use of a real valued function is
much different from the usual idea of merely graphing it. We continue to develop this
idea in this introductory chapter and subsequent chapters.
As a next model, we consider the growth of a population at discrete time periods n
where there is crowding or competing for resources. In comparison with the continuous
differential equations, we get the discrete difference equations Xn = F,.(xn-d with

F,.(x) = Ilx(1 - x).

(We have scaled the variables so the two constants a and b are equal, and use the
single parameter Il.) Robert May (1975) noted that this simple model for discrete
time intervals does not necessarily lead to a system which tends toward a steady state
population. We study this example extensively in Chapter II, but at the moment we
just remark that for 3 < Il < 1 + 6 1/:1 "'" 3.449, most initial values of Xo with 0 < Xo < 1
(including Xo = 0.5) do not tend toward a steady state solution, but tend to an orbit
of period two: b = F,.(a) and a = F,.(b). For Il slightly larger than 1 + 6 1/ 2 , repeated
iteration of Xo by F,. tends to a orbit of period four. For Il = 3.73, the dynamics are
even more complicated, and the iterates of an initial Xo do not seem to tend to any
period motion but move chaotically within the interval (0,1).
For Il > 4, the set of points which stay in the interval [O,IJ for all forward iterates of
F,.(x) turns out to be a Cantor set. The dynamics of iteration of points in this Cantor
set can be analyzed and are very chaotic. The analysis of the dynamics on the Cantor
set is by means of "symbolic dynamics." At any "time" n, the iterate Xn is either in the
left half of the Cantor set or the right half. This distinction can be made by assigning
a symbol (a 1 or a 2) for each time n. It is shown that this crude specification of the
location for all times exactly determines the location at the initial state. By specifying
a sequence of symbols, an orbit with a certain type of behavior can be shown to exist.
Thus, although the dynamics of individual orbits are chaotic, the dynamics of all orbits
can be completely understood and described by symbolic dynamics.
This example for a one dimensional map has many of the properties of what is
called a horseshoe for a two dimensional map. In two dimensions, there is again an
invariant Cantor set and the dynamics are determined by symbolic dynamics. Many.
equations used to model phenomenon are shown to be chaotic by proving the existence
of a horseshoe. The horseshoe in two and higher dimensions is treated in Section 7.3.
Subsections 7.3.1-4 treat various ways the horseshoe occurs for various functions or
differential equations.
1.3 HIGHER DIMENSIONAL SYSTEMS

There are several lessons which the dynamics of iterates of the function F,. teachf'
us. First, the simple algebraic nature of the function does not insure simple dynamb
and in fact its iterates exhibit chaotic properties. Second, for iteration of a function th,
evolution of the state is deterministic. The parameter is fixed and there is a set rule fOI
determining the next state from the previous one, so the evolution of the state from 011'
state to the next one is very predictable. Still, by taking many iterates, the state CIlI
exhibit erratic behavior. The last lesson to learn from this example is that all errati,
behavior is not always caused by changes in internal forces or stochastic effects, but CIlI
also result from the nonlinear nature of the deterministic systems itself.
Chapters II and III treat the dynamics of iteration of real valued functions. We do nfl'
study the implications for modeling physical situations, but do study the mathematic"
aspects of the subject. The quadratic example is studied extensively because of the wid,
variety of types of dynamics it exhibits. Another thing to note about Chapters II all<
III is that only simple analytic tools are used to prove quite sophisticated results: th,
Mean Value Theorem, the Intermediate Value Theorem, differentiation of real valu..,
functions, and a few tools from Point Set Topology. The fact that the mathematic;,
tools used are simple does not mean that Chapters II and III are trivial. In fact, mall
of the key ideas of the book are introduced in this low dimensional setting. Thus, th,
difficulty of the material in Chapters II and III comes from the range of ideas and th,
development of the machinery and not the technical nature of the arguments.

1.3 Higher Dimensional Systems


Solutions of differential equations in two variables can not exhibit chaos in the sem;,
we are using the term. Thus if x E R2 and f(x) is a given function with values in R
(a vector field on the plane), then
X= lex)
is an ordinary differential equation in two variables. The Polncare.Bendlxson Theorl'l'
states that if x(t) is a solution which stays bounded as t goes to infinity, then eithf'
(i) x(t) tends to a periodic solution, or (ii) x(t) repeatedly passes near the same fixf"
point. In fact in the second case the motion x(t) still can not be very complicat('':
Thus, chaos does not occur for differential equations in the plane.
The existence of periodic behavior itself is sometimes interesting. One example whl'l
it has been shown that there is an attracting periodic solution is the Van der PI
equations

x=y-x3 +x
iJ = -x.
These equations were originally introduced to model a "self exciting" electric circuil
If (XO,J/O) is any initial condition other than (0,0) (no matter how small), then t!;
solution (x(t), yet)) tends to the periodic motion. Thus the system excites itself to thi
periodic motion, and other than transitory effects, the natural motion of a solution i
the single periodic motion.
There are many other special equations which model population growth, nonlinel'
oscillators, and many other physical situations. We do not deal with the modelilJ
aspect of the subject, but develop the mathematical tools to study such equations.
Iterates of functions from R2 to R2 can be used to model populations with more th ..
one generation or stages in life (as well as many other situations). The iterates of SUI
functions can exhibit all the complexities of the quadratic map on the real line. In t\\
variables, the map can even be invertible (which the quadratic map on the real line '
6 I. INTRODUCTION

not) and still exhibit chaos. The simplest algebraic form of such a map is the Henon
map,
(:~ ) = FA,B (:) = ( A-B: - X2) .
This map has two parameters A and B. It was introduced by Henon (1976), not as a
model of any particular physical situation, but as a map with a simple algebraic form
which could easily be studied by means of computer simulation. He found that his map
exhibited a "horseshoe" and a "strange attractor" for different parameter values. For
A = 5 and B = 0.3, this map has an invariant Cantor set in the plane, a "horseshoe,"
and the dynamics of iteration of points in this Cantor set are chaotic. (This can be
proved rigorously.) For A = 1.4 and B = -0.3, it can be proved that there is a trapping
region in which points which start in this region remain bounded for all further forward
iteration. There is then an "invariant set," A, of points which remain in the trapping
region for both forward and backward iteration. Computer simulation indicates that this
map is chaotic on A, and A can not be broken up into smaller dynamically independent
pieces (FA.S appears to be topologically tmnsitive on A). See Figure 3.1. For this reason
this invariant set is called a strange attractor. The fact that F1.4,-o.3 is transitive on A
has not yet been proved rigorously, but much of the structure of the dynamics of this
map is well understood.

FIGURE 3.1. Henon Attractor

For larger values of A, A = 5 and B = -0.3, the Henon map has an invariant Cantor
set A. (The set A is homeomorphic to the cross product of the usual one dimensional
Cantor set C with itself, A R::: C X C.) Points not on A become unbounded under
either forward or backward iteration. To a point on A there corresponds a unique string
of symbols which exactly determine the orbit: in this sense the dynamics on A are
equivalent to the dynamics given by the symbolic dynamics. This type of invariant set
is called a "horseshoe". It has many of the properties of the map F,. on the line discussed
above. We study the horseshoe and the Henon map for these parameter values more in
Chapter VII.
Another way that maps can arise is by considering forced differential equations. For
example, we can add a forcing term to the Van der Pol equation and get

:i: = y - x 3 + X + g(t)
y= -x
where g(t) is a T-periodic function of t, e.g. g(t) = cos(wt) for some fixed w. The
solutions of such nonautonomous equations can cross each other as t evolves, but by
1.3 HIGHER DIMENSIONAL SYSTEMS 7

following the solutions through a complete period, we can get a well defined map

(Xl)
Yl
= (X(T»).
y(T)

Following the solution for multiples of the period yields higher iterates of the map,

(Ynxn) = (X( nT) )


y(nT)
.

This period map (a special case of what is called a Poincare map) can exhibit the type
of chaotic behavior which we have been discussing. These differential equations can be
thought of as equations in three variables, X, y, and t with dt/dt = 1. Because one of
the variables is time, some people speak of these as differential equations in two and a
half dimensions. Thus differential equations in two and a half dimensions (or three) can
exhibit chaos. The advantage of considering iteration of maps first in this book is that
this same phenomenon can be observed in lower dimensions, in fact in one dimension
for noninvertible maps.
A type of forced van der Pol type equation was one of the first equations for which
"random" or "chaotic" behavior was observed. Cartwright and Littlewood (1945) first
discovered this and later Levinson (1949) gave a much simpler analysis of the situation.
Much more recently, Levi (1981) gave a more complete analysis and showed how a
horseshoe occurs in this situation.
Another problem which has been shown to have a horseshoe is the motion of three
point masses moving under Newtonian attraction. Sitnikov (1960) studied the situation
of the motion of three masses where the third mass m3 moves on the z-axis and the
first two masses are equal, ml = m2, their positions remain symmetric with respect to
the z-axis, and they move in an elliptical orbit. The simplest description is for m3 = 0
where the first two masses affect the motion of the third but not vice versa. Then ml
and m2 move in an elliptical orbit which is nearly circular in the (x, y)-plane. The third
body oscillates up and down the z-axis. The motion of the first two bodies can be solved
independently of the motion of the third mass. Thus the forces of masses ml and m2
on m3 can be thought of as a time dependent effect on its motion. Since their motion
is periodic, the effect is a time periodic term in the equations for the motion of m3.
The state of the third mass is determined by its height up the z-axis and its velocity.
Thus the equations are time periodic in two variables. By means of a calculation which
has become called the Melnikov method, it is possible to show there is a horseshoe for
this motion, i.e., there is an invariant Cantor set for the period map and the behavior
of an orbit with initial conditions in this Cantor set can be prescribed by sequences of
symbols. One way of interpreting the symbolic dynamics for the Cantor set is that it is
possible to specify any sequence of integers nj (as long as all the nj ~ No for some No),
and then there is an orbit for which the first two masses make Rj revolutions between
the j-th and the j + 1 return of the third mass. Since the sequence is arbitrary, knowing
the length of time it took for the third mass to return the last time gives no information
about how long it will take to return the next time. This unpredictability, or chaotic
nature, of the motion even for very deterministic motion is one of the characteristics
of the type of situations we consider. In fact the number of revolutions can be set to
infinity. If Rio = 00 for some jo < 0 and all the Rj for j > jo are bounded, then
the orbit is a capture orbit, i.e., for this orbit the first two masses rotate about their
center of mass and the third mass comes in from infinity and is captured in bounded
motion as time goes to plus infinity. Similarly, if njo = 00 for some jo > 0 and the Rj
for j < jo are bounded, then there is an ejection orbit. The symbols can be thought
8 I. INTRODUCTION

of 88 a very inexact determination of the state of the system at a given time. By


prescribing this inexact information for all time the exact present state of the system
is determined: it is shown there is a unique initial condition which goes through this
lM1Quence of rough prescriptions of states at the future times. Thus, the existence of
a motion of a prescribed nature can be shown by means of such symbolic dynamics.
Because of its usefulness in determining types of behavior, symbolic dynamics is one
of the fundamental tools we use in our study of Dynamical Systems. In addition to
Sitnikov, this situation was studied extensively by Alekseev (1968a, 1968b, 1969). Also
see the book by Moser (1973) and the expository paper by Alekseev (1981).
A set of differential equations which have been much discussed are the Lorenz equa-
tions given by

:i; = -lOx + lOy


iJ= 28x - y- xz
. 8
z = -3z+xy.

These equations were introduced by Lorenz (1963) as a very rough model of the fluid
flow of the atmosphere (weather). He studied these equations by means of computer
simulation and observed chaotic behavior. After much investigation we understand the
features which are causing the chaos in these equations and have a good geometric model
for their behavior, but no one has analytically been able to verify that these particular
equations satisfy the conditions of the geometric model.

FIGURE 3.2. Lorenz Attractor

So far, we have emphasized that differential equations and iterations of functions


can exhibit chaos. Another important idea is determining when two systems have
the same dynamics, i.e., the two systems are topologically conjugate. A topological
conjugacy can be thought of 88 a continuous change of coordinates. (When this idea is
introduced, we explain why we can not usually find differentiable change of coordinates,
i.e., differentiable conjugacies.) Two systems which are topologically conjugate have the
same long term dynamical behavior: both are chaotic or not, both have the same type
of periodic motions, both are "transitive" or not, etc. Thus, topological conjugacy is an
important concept when we classify systems up to equivalent dynamical behavior.
With this idea of topological conjugacy in place, we then want to define what it
means for a system to have the same dynamics as all nearby systems in some suitable
space of dynamical systems. What is of interest in this consideration is not the stability
of a particular solution of the system, but the structure of the dynamics of the whole
1.4 OUTLINE OF THE TOPICS OF THE CHAPTERS '.

system. For this reason, Smale (1965) introduced the idea of structural stability based
on earlier ideas of Andronov and Pontryagin (1937). A system J is called Itrocturnll.,.
stable provided that every system 9 which is near enough to J is topologically conjugat.
to J. Thus, the dynamics of J are robust under perturbations. Some systems can bl
both structurally stable and chaotic, but there certainly are others with only a finitf
number of periodic orbits which are structurally stable (Morse-Smale systems).
The main analytic feature, which allows us to show that a system is either chaotil
or structurally stable, is the existence of a "hyperbolic structure. For these system~
It

the attention is focused first on the points which have some kind of recurrent behavior.
the set of "chain recurrent points." The rough idea is that a system has a hyperboli.
structure if each point in this chain recurrent set has a splitting into directions which
are contracting and those which are expanding. (For maps on the real line, there i,
only one direction and it is either everywhere contracting and the motion is periodic.
or everywhere expanding and the motion can be chaotic.) The idea of a hyperboli<
structure is a generalization of the notion of splitting into various eigenspaces for a sing\.
linear map. The splitting at the different points has to be compatible: there needs to bl
a compatibility along an orbit and also as the point varies with the invariant set. Thus.
some infinitesimal displacements (tangent vectors) at a point in the chain recurrent set
are contracted and others are expanded. The existence of such a structure is crucial
to be able either to apply the mechanism of symbolic dynamics or to show a systen I
is structurally stable. Because these topics are the focus of this book, we mainly treat
systems with such a hyperbolic structure on the set of all chain recurrent points. Thes.·
ideas can be used in other situations where merely an invariant subset has a hyperboli.
structure. The question of the existence of capture for three point masses is such "
system with a hyperbolic structure on only an invariant subset of states. We do show
that a homoclinic orbit (a orbit which tends to the same fixed point as t tends to both
±oo but has other behavior between) is a situation where the proper assumption le8(b
to a subsystem with a "horseshoe," i.e., an invariant subsystem with chaotic dynamics.
These form an important category of situations where it is possible to prove that ther..
is chaotic dynamics.

1.4 Outline of the Topics of the Chapters


In order to introduce the main perspective early in a situation where the analytic
and geometric aspects are simpler, we first consider the iteration of functions of a
single real variable. In this setting, we can show how to use sequences of interval~
whose images cover the next interval to determine orbits (symbolic dynamics), and how
some functions have dynamics which are equivalent to that of any of its perturbations
(structural stability). These two ideas which are introduced in Chapter II are central
to the approach to Dynamical Systems which we present. In fact, most of Chapter
II is used heavily in the rest of the book with the exception of the rotation numbel
of a homeomorphism on the circle, and the rotation number is an important idea ill
Dynamical Systems.
Chapter III treats topics related to iteration of a function of one variable, and ill
particular a situation which leads to complicated dynamics. One example of such dy-
namics is the invariant Cantor set for the quadratic map which we obtain in Chaptel
II. However in Chapter III, we connect these ideas with the notion of chaos. We al&·
separate these sections from Chapter II, because they can easily be postponed until
later in the book. In fact Sharkovskii's Theorem is only used to motivate subshifts 01
finite type. In turn, subshifts of finite type are not used again until Chapter VII whell
we discuss horseshoes and toral automorphisms In higher dimensions. The material 011
chaos, Liapunov exponents, period doubling cascade, and the zeta function is not used
10 I. INTRODUCTION

but is included since these concepts often occur in the literature on Dynamical Systems.
After Chapter III, we turn to higher dimensions for iteration of functions and solu-
tions of ordinary differential equations. In this setting, the analysis near a fixed point
is much more analytically complicated than the one dimensional situation treated in
Chapter II. As a first step in this analysis, Chapter IV deals with the dynamics of linear
systems, both for iteration of functions and for differential equations. The Perron-
Frobenius Theorem could easily be skipped in this chapter as well as some of the detail
on solutions of linear differential equations. Chapter V treats the local dynamics of a
nonlinear system near a fixed point as a perturbation of the linear system. If the lin-
earization has only contracting and expanding directions and no neutral directions (is
hyperbolic) then the dynamics of the linearization determines the local dynamics of the
nonlinear system. The proof of these results uses the method of finding a contraction
mapping: a mapping from a space of functions to itself is constructed, which is shown to
be a contraction mapping and whose fixed point is a conjugacy between the linear and
nonlinear system. In addition to determining the local behavior near a periodic point,
this chapter introduces some of the methods of proof which are used for the more global
results in later chapters. The method of proof of the conjugacy to the linearized system
(the Hartman-Grobman Theorem) can be used to show that linear Anosov diffeomor-
ph isms are structurally stable. The proof of the existence of the stable manifold for a
fixed point can be modified to prove the existence of stable manifolds for a "hyperbolic
invariant set." Thus Chapter V is also an introduction into the analytic methods for
multidimensional dynamical systems. Certainly some of the material could be skipped
or treated more quickly: for example, the proof of the existence of solutions of differ-
ential equations, the subsection on the Van der Pol equations, the Poincare-Bendixson
Theorem, the proof of the Stable Manifold Theorem, and the Center Manifold Theorem
could be skipped. This chapter is not dependent on Chapter III. It also does not use
the material on the invariant Cantor set from Chapter II.
In Chapter VI, we treat the local bifurcation of fixed and periodic points, i.e., how
the fixed points and their stability varies as a parameter is varied. We do not give a
thorough treatment but give the three basic and generic bifurcations for one parameter
families of functions. Chow and Hale (1982) has a much more thorough treatment of this
aspect of Dynamical Systems. The Implicit Function Theorem is heavily used to prove
these theorems. The section on the one dimensional saddle-node and period doubling
bifurcations depend only on Chapter II and could be covered at that time. The rest of
the sections depend on Chapter IV. Chapter VI is not used extensively in later parts of
the book but there are reference to some of these results.
In Chapter VII, we return to more complicated invariant sets. We give a number of
examples and introduce the basic ideas which can be used to understand the structure of
this type of dynamics. The unifying feature of these examples is their hyperbolic nature:
they have some directions which are contracted and other directions which are expanded.
A rigorous expression of these ideas requires some concepts from differential topology or
differential geometry. We introduce the ideas necessary and mainly discuss the situation
in Euclidean space or on tori where the need for machinery is minimized. This chapter
depends on the treatment of the invariant Cantor set of Chapter II (which could be
combined with the section in Chapter VII). Also the main ideas of local dynamics from
Chapter V are important for a clear understanding of this material. The key sections
are those on the Birkhoff Transitivity Theorem, the Geometric Model Horseshoe, the
Horseshoe from a homoclinic point, attractors, the Solenoid, and Morse-Smale Systems.
Section 7.5.1 on Markov Partitions for Hyperbolic Toral Anosov Automorphisms also
makes a good connection back to the material on symbolic dynamics but is only needed
for Section 9.6 on the Markov Partition for a Hyperbolic Invariant Set. Many of the
1.4 OUTLINE OF THE TOPICS OF THE CHAPTERS 11

other examples are interesting but not essential for the later chapters.
Chapter VIII returns to the theme of Chapter III on measurements of chaos. All
of these concepts appear widely in the literature in Dynamical Systems. We mainly
introduce the the definitions and main ideas without proofs except for the material on
topological entropy. None of this material is used in an essential way in the rest of the
book.
Chapter IX gives the theory of the analysis of these hyperbolic systems and their
structural stability. This chapter uses heavily the ideas from the Stable Manifold Theo-
rem in Chapter V. The examples of Chapter VII give substance to the general definition
of this chapter. Conley's Fundamental Theorem can be treated by itself at the very be-
ginning, but the ideas of a Liapunov function in Chapter V and a gradient flow given in
Chapter VII are helpful for motivation of the ideas. The more global theorems of this
chapter are interesting and important, but much of the richness of Dynamical Systems
is revealed in the examples and more local theory of the previous chapters.
Chapter X gives some generic properties, Le., properties which are true for most
systems in the sense of Baire category. This chapter also is the first place where per-
turbations of systems are treated to any extent. In the earlier chapters, we addressed
conditions which assure that the dynamics can not be changed by any perturbation. In
this chapter, we address the question of what types of perturbations do cause changes
in dynamics. In that sense, this chapter relates to Chapter VI. Although the definition
of transverse intersection is given earlier in the book, some of the proofs require more
theorems from transversality theory. Therefore, we include a section which states thlf
needed theorems and gives references. In terms of generic properties, we prove that.
most systems have hyperbolic periodic points. We also give some necessary conditionS
for structural stability and a counter example to the density of structurally stable sy!H
tems. This last example can be considered as an open set of diffeomorphisms, each o~
which can be made to undergo a bifurcation by means of a correctly chosen perturbation.
Finally, Chapter XI treats some additional topics in stable manifold theory, mainly
concerned with smoothness of the manifolds. In addition to stating some general the-
orems, we prove that the hyperbolic splitting for an Anosov diffeomorphism on T2 is.
differentiable. We also return to prove the differentiability of the center manifold. Cer-
tainly this last topic could be treated right after Chapter V.
CHAPTER II
One Dimensional Dynamics by Iteration

In this chapter we introduce the concepts of Dynamical Systems through iterati"


of functions of a single real variable. The main idea is to understand what the orbit ,
a point is like when iterated repeatedly by the same function: Xl = /(xo) and Xn
/(X n -1) for n ~ 1. In one dimension, the graph of the function can be used quite easil
to analyze the iterates of a point by a function. Also because of the restriction to 011
dimension, the concepts of a periodic orbit being attracting or repelling become a simpi
consequence of the Mean Value Theorem. Two other important Ideas In Dynamic.
Systems are the notion of topological conjugacy and symbolic dynamics. Two functiOl'
/ and 9 are said to be topologically conjugate if there is a homeomorphism h wil
g(x) = h 0 / 0 h- 1 (x). In one dimension, we are able to prove quite simply th"
simple examples such as /(x) = 2x and g(x) = 3x are topologically conjugate. For tb
quadratic family, F,,(x) = 1'2:(1- x), we can show that if 1',1" > 4 then F,.. and F,..' 81
topologically conjugate even though both maps have infinitely many periodic point
We also use the quadratic map to introduce the ideas of symbolic dynamics. The id,
is that we label two intervals II and 12 and are able to show that sequences of these t\\
intervals are in one to one correspondence with orbits of the map under iteration. Tb
final section of the chapter concerns iteration of homeomorphisms of the circle. In th I
setting the average rotation of a point under repeated iteration, the rotation numb(' 1
determines whether the map has periodic points or not.
In later chapters we return to study periodic orbits, the possibility of topologic,
conjugacy, and related topics for iteration of functions in higher dimensions. We al~
apply these notions to solutions of differential equations. The main reason to treat til
case of iteration of one variable first is that it reduces the topological complicatioll
while introducing the new concepts of dynamical systems.

2.1 Calculus Prerequisites


In this short section, we recall three results from calculus which we use extensiv('1
In the material on iterating a real valued function of one real variable. As we proce<'
through the material on dynamical systems, we discuss further results from calculll
which we need. In particular material on derivatives in higher dimensions and til
Implicit Function Theorem is given later.
Critical Points
The points where a function attains its maximum and minimum are important i
studying graphs. They are also important in determining the dynamics of iteration I
the function. Therefore, we review the definition and terminology of points where til
derivative is zero.
Assume / is differentiable. A point a is called a critical point of / provided I'(a) = t
It is called a nondegenerate critical point provided I'(a) = 0 and I"(a) :j:. O. F•.
example, if F,,(x) = I'x(l-x) then x = 1/2 is a nondegenerate critical point. The POitl
o is a degenerate fixed point for the function /(x) = x 3 •

13
14 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Intermediate Value Theorem


Assume J is an interval and I : J ~ R is a continuous function. Further assume
that a, bE J with a < b, and z is a value between I(a) and I(b). Then there is a (at
least one) c with a < c < b such that I(c) = z. This property says that I attains all the
values between I(a) and !(b). This is essentially the result that I([a, bJ) is connected.
This result is also called the Theorem 01 Darboux.
Another consequence of continuity
Assume I : J -+ R is a continuous function and Cl < I(a) < C2 for a in J. Then
there is an open set U about a in J such that CI < I(x) < C2 for all x E U.
Mean Value Theorem
Assume J is an interval and I : J -+ R is a continuous function. Further assume
that a, b E J with a < b and that I is differentiable on (a, b). Then there exists a C with
a < c < b such that
I(b) - I(a) = j'(c).
b-a
This says that there is a point C for which the slope of the tangent line at c equals the
slope of the secant line from (a, I (a» to (b, I (b». This result is also called the Theorem
01 Lagrange.

The Chain Rule


The chain rule concerns the derivative of a composition of functions. If I, 9 : R -+ R
are two functions, then we write 10 g(x) for I(g(x», i.e., for the composition of I with
g. If both I and 9 are differentiable, then

(f 0 g)' (x) = j'(g(x»g'(x) or


= j'(y)g'(x)

where y = g(x). The point is that the derivative of I must be taken at the correct
point.
Composition of functions is at the heart of the dynamics of iteration of functions:
taking a point Xo E R, we want to find XI = I(xo), X2 = !(xt>, and Xn = !(Xn-I).
Thus Xn = 10 ... o/(xo) where we take the composition of! with itself n times. We
write

!2(X) = 10 I(x) and by induction


r(x) = 10 r-I(x) forn~1.

Note that this is composition of the function and not squaring of the formula which
defines I. Thus if !(x) = x(l - x) then

12(X) = I(x(l - x» = x(l - x)[l - x(l - x)].

For higher iterates, it quickly gets impossible to write out the formula for r(x). In
this context, the chain rule to the composition of a function I with itself n times can
be written as
2.2 PERIODIC POINTS 15

where xi = li(xo). In particular, if I(x) = x(1 - x), Xo 1/3, and n = 3 then


XI = 1(1/3) = 2/9, X2 = j2(1/3) = 1(2/9) = 14/81. The derivative is f'(x) = 1 - 2x,
and
(J3)'(1/3) = 1'(14/81)/'(2/9)/'(1/3)

= (1 - 2!~)(1 - 2~)(1 - 2~)


53 5 1
= 81 ·9·3·
The point is that we do not need to compute 1 3 (x) or (J3)'(x) explicitly.
If I is invertible then rl is the inverse of I, r2(x) = (J-I)2(X), and r"(x) =
(J-I)n(x) for -n < o. We also write 1° for the identity, jO(x) = x. Using the chain
rule, it can be shown that the derivative of the inverse is the reciprocal of the derivative
of the function, •
(J-I)'(X) = I'(:-d
where X-I = I-I(x).
Terminology about types of functions
Throughout this book, we use some terminology about functions and the extent of
their differentiability. Let J be an open subset of R (possibly all of R), and let I : J --t R
be a function. If I is continuous we say that I is Co. If I is differentiable at each point
of J and both I and f' are continuous, then I is said to be continuously differentiable
or a C I function. Given r ~ I, if I together with J<il are continuous functions for
:s :s
1 j r, then I is said to be r-times continuously differentiable or a cr /unction.
A function I : X --t Y between metric spaces X and Y is called a homeomorphism
provided it is (i) one to one, (ii) onto, (iii) continuous, and (iv) its inverse rl : Y ...... X
is continuous. Finally for J an open subset of R, a function I : J ...... K c R is called
a cr -diffeomorphism from J to K provided I is a cr -homeomorphism from J onto K
with a C r inverse I-I : K --t J.

2.2 Periodic Points


Throughout this section I : R ...... R is continuous. Sometimes I is only defined on
an interval in R. We add the assumption that I is C l or C2 whenever we take first or
second derivatives of the function.
In this section we discuss the existence of a fixed point X with J(x) = x or periodic
point with I"(x) = x for some n > 1. The fixed points or points of low period can be
found by solving the equations J(x) = x and I"(x) = x. For higher periods, It is not
practical to solve these equations. Below, we discuss an analysis using the graph of I
which is more useful for determining points of higher period.
r
Definition. A point a is a periodic point 01 period n provided (a) = a and P (a) '" a
for 0 < j < n. (Note that n is the least period.) If a has period one then it is called a
fixed point. If a is a point of period n, then the forward orbit of a, O+(a), is called a
periodic orbit. The notation we use for all points fixed by r
(not all of least period n)
is
Per(n, f) = {x : r(x) = x} and
Fix(J) = Per(I, f).
Finally, a point a is eventually periodic 01 period n provided there exists an m > 0 such
that Im+"(a) = Im(a), so p+n(a) = Ij(a) for j ~ m, and Im(a) is a periodic point.
16 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Example 2.1. Let I(x) = X3 - x. The fixed points satisfy the equation x 3 - x = x, so
are the points x = 0, ±y2. The points x = ±l have their first iterates go to the fixed
point 0, I(±I) = 0, so ±l are eventually fixed.
As mentioned in the last section, the critical points satisfy f'(x) = 3x 2 - 1 = O. So
the only critical points are x = ±l/va. The second derivative is f"(x) = 6x, which is
nonzero at ± 1/ va so the critical points are nondegenerate.
As the above example illustrates, to find the fixed points it is enough to solve the
equation I(x) = x. To find the periodic points of higher period is harder. Of course,
theoretically it is possible to find the expression for the higher iterate r(x) and then
solve rex) = x. For n much larger than four this becomes algebraically very compli-
cated even in the simplest examples. Exercise 2.6 asks you to find the points of period
two and four for the map FI'(x) = I'x(1 - x). Sometimes it is possible to find points
of higher period by understanding the graph of r. See Exercises 2.4 and 2.5. Also
computer iteration can be used to find the points of higher period. As mentioned in
the preface, the programs Phaser by Ko<;ak (1989) and MacMath by Hubbard and West
(1992) are two programs which can carry out such iteration.

Definition. For a continuous function I, the lorward. orbit of a point a is the set
0+ (a) = {Jk (a) : k ~ o}. If f is invertible then the backward. orbit is defined using
negative iterates: 0- (a) = {Jk(a) : k $ OJ. The (whole) orbit of a point a is the
set O(a) = {Jk(a): - 00 < k < oo}. If I is not invertible then we sometimes make
choices and construct X-l,X-2,'" where f(x- n ) = X-n+l or X-n E f-l(x_n+l)' (For
a noninvertible map, I-l(y) = {x : f(x) = y}.)

Before discussing how to find the forward orbit using the graph of the function, we
give some more fundamental definitions connected with convergence and stability of
periodic points.
Definition. A point q is forward. asymptotic to p provided 1J3(q) - li(p)1 goes to zero
as j goes to infinity. If p is periodic of period n then q is asymptotic to p if Ifin(q) - pi
goes to zero as j goes to infinity. The stable set of p is defined to be

W"(p) = {q : q is forward asymptotic to pl.

If f is invertible then a point q is said to be backward. asymptotic to p provided If' (q) -


Ji(p)1 goes to zero as j goes to minus infinity. If I is not invertible then a point q is said
to be backward. asymptotic to p provided there are sequences P_j and q-i with Po = p,
qo = q, I(p-j) = P-j+l, f(q-j) = q-j+l, and Iq-j - p-il goes to zero as j goes to
infinity. In either case, the unstable set 0/ P is defined to be

WU(p) = {q : q is backward asymptotic to pl.

Definition. A point p is Liapunov stable (L-stable) provided given any f > 0 there is
a 6> 0 such that if Ix - pi < 6 then 1J3(x) - Ji(p)1 < f for all j ~ O. This says that for
x near enough to p the orbit of x stays near the orbit of p. A point p is asymptotically
stable provided it is L-stable and W'(p) contains a neighborhood of p. If p is a periodic
point which is asymptotically stable it is also called an attracting periodic point or a
periodic sink. If p is a periodic point for which W"(p) is a neighborhood of p, it is called
a repelling periodic point or periodic 80Un:e.
2.2 PERIODIC POINTS 17

Example 2.2. Let I(x) = x 3 . The fixed points satisfy I(x) :..: x 3 - X, so x
0, ±l. Note that the fixed points correspond to points where the graph of I, {(x,J(x)},
intersects the diagonal {(x,x)}. See Figure 2.1. The graph of I is monotone. On
(0,1) the graph of I lies below the diagonal and I(x) < x. Thus for x e (0,1),
x > I(x) > /2(X) > .. ·/"(x) > O. Because this sequence is monotone it must converge
to a fixed point and so to O. (The fact that a bounded monotone sequence of points on
an orbit must converge to a fixed point is left to Exercise 2.2.) Thus (0,1) c W'(O).
For backward iterates, x < ,-1(X) < 1 for x e (0,1). As j goes to minus infinity, Ji(x)
is monotonically increasing to 1,60 (0,1) C WU(1).
A similar analysis shows that (-1,0) C W'(O) and (-1,0) C WU(-I) since for
xe(-l,O),

x < f(x) < 12(x) < ... < rex) <0


- 1 < ,-n(x) < ... < ,-2(x) < ,-1(X) < x.
Because W'(O) contains a neighborhood of 0, the fixed point 0 is attracting.
For x > I, P(x) is monotonically increasing. If this forward orbit were bounded it
would have to converge to a fixed point. Since there is no fixed point larger than 1,
the orbit must go to infinity as j goes to infinity. As j goes to minus infinity, /i(x) is
monotonically decreasing to 1 60 WU(I) :> (1, (0).
Again for x < -1, Ii (x) is monotonically decreasing to -00 as j goes to infinity, and
li(x) is monotonically increasing as j goes to minus infinity, WU( -1) :> (-00, -1). Be-
cause we have analyzed the iterates of all points in the line, the following list summarizes
the stable and unstable sets:

W'(O) = (-1,1)
WU(O) = to}
W'(±I) = {±1}
W U (I) = (0, (0)
W U (-I) = (-00,0).

Because WU(1) is a neighborhood of 1 and (W'(l) is not), the iterates of points near
1 move away and the fixed point 1 is repelling (unstable). Similarly, -1 is repelling.

FIGURE 2.1. Stair Step Method for Example 2.2


18 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

To understand the orbit more fully we use the graph of f. We give a general de-
scription. but the above example is useful in illustrating the construction. Take a point
Xo. Draw a vertical line segment from (xo.xo) to the point on the graph (xo,J(xo)).
Then draw the horizontal line segment from this point on the graph over to the point on
the diagonal (f(xo),J(xo)). Graphically this shows how we map from the point Xo to
XI = f(xo)· Repeating this process with line segments from (Xlo xd to (Xlo f(xd) and
then from (xI.f(xd) to (f(xd.f(xd) determines the next point X2 = f(xd. Repeat-
ing this process determines the orbit. Figure 2.1 draws these stair steps for Example
2.2 for a point 0 < Xo < 1. In the figure. we extend the vertical lines down to the axis
to indicated the points Xn more clearly. Because of the appearance of this figure. this
process is often called the stair step method to determine the phase portrait. The reader
might want to draw the stair steps for Example 2.2 with Xo > 1. -1 < Xo < O. and
Xo < -1.
The stair step method can easily be adapted to find inverse iterates when the function
is one to one (monotonic). In this case take the point x. and draw the horizontal line
segment from this point (x.x) on the diagonal to the point on the graph: since the
function is monotonic there is only one such point and it is (f-I(X).X). Now draw
the vertical line segment from (f-I(X).X) to the diagonal (f-I(x).f-I(x)). or to the
axis (f-I(X),O). In Figure 2.1. applying this process to X3 we get X2 = f- I (X3).
XI = f- 2(X3). and Xo = f-3(X3). This process can either be thought of as (i) reversing
the steps to find the forward iterate or (ii) thinking of x as a function of y (interchanging
the roles of X and y) which is the way the inverse function is explained in a calculus
course (although we did not redraw the the graph with the x-axis up).
If the graph is not monotonic. in applying the stair step method to the inverse there
may be more than one point on the graph which can be reached by a horizontal line
segment from a point (x. x) on the diagonal. Each of these multiple points gives a
possible inverse image of x. See Figure 2.2 for f(x) = -x + x 3 • Xo = 0.231. and
rl(xo) = {x~l ~ -0.854.x~1 ~ -0.246. and x~l = l.l}.

FIGURE 2.2. Function f(x) = -x+x3• Xo = 0.231. and preimage f-I(XO) =


{x~f ~ -0.854.x~1 ~ -0.246. and x~f = 1.l}

The above process describes how to calculate the forward and backward orbit of
points. As is Example 2.2. this information can be used to determine whether a fixed
point. is attracting or repelling. Because this process is involved. it is useful to have
a criterion for a periodic orbit to be attracting which only uses the derivative of the
function at points along the periodic orbit. The next theorem gives just such a criterion.
2.2 PERIODIC POINTS 19

Theorem 2.1. Assume f : R - R is a Cl function. (a) Assume that P is a fixed point


with 1f'(p)1 < 1. Then P is an attracting fixed point (or asymptotically stable or a sink),
i.e., W'(p) contains a neighborhood ofp.
(b) Assume that p is a periodic point of period n with IU"),(p)1 < 1. Then p is an
attracting periodic point.
REMARK 2.1. By the Chain Rule the derivative in part (b) can be calculated as the
product of the derivative of f along the orbit:

IU")'(p)l = I!,(p"-dl" . I!'(pdl1!'(Po)1

where Pi = Ji(p).
PROOF. (a) Because 1f'(p)1 < I, there is an interval (p-f,p+fl and a ~ with 0 <,x < 1
such that 1f'(x)1 ~ A for x E (p - f,p + fl. Then by the Mean Value Theorem, for
x E (p - f,p + fl, there is a z between x and p with

If(x) - pi = If(x) - f(P)1 = 1!,(z)I'lx - pi ~ 'xIx - pi < Ix - pI·

Thus f(x) is closer to p than x so f(x) E (p- f,P+f), and we can repeat the argument.
By induction

This argument shows that fi(x) E (p - f,p + £] for all j ~ 0 proving that pis !rstable.
Because ,Xi Ix - pi goes to zero, fi(x) converges to p as j goes to infinity, proving that
p is attracting.
The above argument can be understood using the stair step method. There are two
cases, 0 < !,(p) < 1 and -1 < f'(p) < O. (The case f'(p) = 0 can be analyzed in a
similar manner, but the exact argument depends on the form of the graph of ,.) In the
first case with 0 < f'(p) < I, for x > p, x > f(x) > J2(x) > ... > p as can be seen from
the graph. Thus fi(x) converges to p from above. See Figure 2.1. Similarly, if x < p,
Ji(x) converges to p from below.
Now consider the case with -1 < f'(P) < O. If x> p, then f(x) < p and f2(x) > p.
Because 0 < ( 2 )'(p) < I, p < f2(X) < x. Thus Ji(x) converges to p also. See Figure
2.3.

FIGURE 2.3. Near an Attracting Fixed Point with Negative Derivative


20 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

(b) For the periodic case. consider 9 = r.


Then g(p) = P and Ig'(p)1 < 1. By part
(a). gi(x) converges to P for x near p. By continuity of P for 1 ~ j ~ n it follows that
IJi(x) - Ji(p)1 goes to zero as j goes to infinity for all j and not just for multiples of n.
o
The following theorem gives the comparable criterion for a periodic point to be
repelling.
Theorem 2.2. Assume I: R -+ R is a C' function. Assume that p is a periodic point
of period n with l(r),(p)1 > 1. Then p is repelling. Moreover. for all sufficiently small
intervals 1 about p and x E 1 there is a k = k", such that J'<n(x) f/. 1. This says that all
points near p go away from p under iterates of In.
We leave the proof as an exercise. (Exercise 2.3)
Example 2.2 (revisited). Notice that for I(x) = x 3 • f'(x) = 3x 2 • 1'(0) = 0. and
f'(±I) = 3. By Theorems 2.1 and 2.2. 0 is attracting and ±1 is repelling. This
conclusion is the same one we obtained by directly analyzing the orbits of points. Notice
that it is much quicker to get the fact that these points are attracting and repelling using
the theorems. i.e .• using the criterion on the derivative. However. also notice that the
theorems only tell us what happens to orbits near the fixed points. while our earlier
analysis of this example analyzed the iterates of all points.

2.2.1 Fixed Points for the Quadratic Family


In this subsection we consider the family of quadratic maps F,,(x) = ,n(1 - x) for
IJ > 0 and mainly for IJ > 1. We first find the fixed points.

Proposition 2.3. The fixed points of the family of quadratic amps F" are 0 and
PI' = IJ - 1. The fixed point 0 is attracting for 0 < IJ < 1 and repelling for IJ > 1. The
IJ
fixed point PI' is attracting for 1 < IJ < 3 and repelling for 0 < IJ < 1 and 3 < IJ. The
only critical point is x = 1/2 which is nondegenerate.
PROOF. The fixed points satisfy x = ,n_,n2. so are the points x = 0 and x = PI' == 7
as claimed.
To determine their stability. note that F;(x) = IJ - 21Jx. Thus IF; (0) I = IIJI so 0
is attracting for 0 < IJ < 1 and repelling for IJ > 1. On the other hand IF; (PI') I =
IIJ - 2(1J - 1)1 = 12 -IJI· Thus PI' is attracting for 1 < IJ < 3 and repelling for 0 < IJ < 1
and 3 < IJ.
The critical points satisfy F;(x) = 1J(1-2x) = 0. so the only critical point is x = 1/2.
Finally. F::(x) = -21J so x = 1/2 is a nondegenerate critical point. 0
We leave to Exercise 2.6 the determination of the points of period two and their
stability. As for the eventually fixed points. note that F,,(I) = 0 so 1 is eventually fixed.
By symmetry of the graph. if we let PI' = 1 - PI' = 1/IJ then F,,(p,,) = PI' so PI' is also
eventually fixed.
The following proposition indicates which points of F" go to infinity. and so which
other points are potentially periodic.
Proposition 2.4. Assume IJ > 1. If x f/. [0.1) then Fd(x) goes to minus infinity as j
goes to infinity.
PROOF. For x < 0. F;(x) = IJ - 2,n > 1. Thus for Xo < 0. 0 > Xo > F,,(xo) >
F~(xo) > ... > Fd(xo) is decreasing. If this orbit were bounded it would have to
2.2.1 QUADRATIC FAMILY

converge to a fixed point which would be a negative point. Since no such fixed poil
exists, Ft(xo) goes to minus infinity.
If Xo > I, then F,,(xo) < 0 so Ft(xo) = Ft- I 0 F,,(xo) goes to minus infinity.
The next proposition shows that all the points in (0,1) converge to the fixed poil
p" for the range of I-' for which p" is attracting. The solution to Exercise 2.6 sho"
that this proposition is false for I-' > 3. However for 3 < I-' < 1-' .. most points in (0, I
are asymptotic to an orbit of period two. For 1-'1 < I-' < 1-'2, most points in (0,1) fl!
asymptotic to an orbit of period four. This continues and there are I-'n such that t,
I-'n-I < I-' < J.tn, it can be shown that most points in (0,1) are asymptotic to an orbit·
period 2n. (Such a proof can not be done directly by calculating /3" .) The I-'n convpr·
to 1-'00' and for I-' > 1-'00 it is not always the case that most points in (0,1) are asymptOI
to a periodic orbit. In Section 2.4, we see that for I-' > 4 there are many points in (n.
which are not asymptotic to a periodic orbit. In Section 3.4, we return to a furth·
discussion of this period doubling cascade.
Proposition 2.5. Assume 1 < I-' < 3. If x E (0,1), then F,{(x) converges to p" /l.'
goes to infinity. Thus W'(p,,) = (0,1).
PROOF. (a) First consider 1 < I-' ::; 2. The maximum of the graph occurs at x = 1;
For this range of parameters, F,,(1/2) = 1-'/4 ::; 1/2. Using the graph it is then elf',
that PI' ::; 1/2. The function is thus monotonically increasing on (O,p,,) and the grsl
lies above the diagonal. Thus for Xo E (O,p,,), FJ(xo} is a monotonically increasil
sequence which must converge to the fixed point p,.. See Figure 2.4. Similarly, on t I
interval (PI" 1/2] the function is monotonically increasing and the graph lies below II
diagonal. Thus for Yo E (p", 1/2] FJ(yo) monotonically decreases to p,.. Finally, I,
Xo E (1/2,1), F,,(xo} E (O,I/2) so Ft(xo} converges to p,.. This completes the proof t·
this range of parameters.

FIGURE 2.4. Iteration of Xo and Yo for 1 < J.t < 2, 0 < Xo < PI" and
p" < Yo::; 0.5

(b) Now assume that 2 < I-' < 3. Note that PI' > 1/2.
(i) Consider the interval [1/2,p"j. Because F;
is monotone on [1/2,p"j, to find I
image it is enough to determine the iterates of the end points:

F~([1/2,p,,]) = F"(IP,,,J.t/4j)
= [J.t(~)(1 - ~),p"l.
22 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

0.5 F\5) P F (.5)


11 11 11

FIGURE 2.5. Iteration of x = 0.5 for 2 < II. < 3

We want to show that this image is contained in [1/2,pj.lJ, /1.(/1./4)(1 - /1./4) > 1/2, or
o > /1.3 -411.2+8 = (/I.-2}(1J. 2-2/1.-4). The roots of JJ.2-2/1.-4 are 1±5 1/ 2 so this factor is
negative for /I. < 3. The first factor /I. - 2 > 0 so the prod uct is negative as desired. Thus
we have shown that F;(1/2) = /I.(/I./4}(1 - /1./4) > 1/2 and F;([1/2,pj.lj) C [1/2,pj.lJ.
See Figure 2.5. The second iterate of Fj.I is monotone on [1/2,pj.lj, so the graph of F;
intersects the diagonal once on the interval [1/2,pj.lj, and this is at Pw Because the
graph of F; is above the diagonal but below 1/2 on [1/2,pj.lj, all the points in this
interval converge to Pw
(ii) Next, let Pj.l = 1//1. < 1/2 as above, so Fj.I(pj.I) = Pj.I' Fj.I ([Pj.l , 1/2]) = Fj.I([1/2,pj.I]),
and F;([P", 1/2]) C [1/2,p"j. Thus all the points in [p", 1/2J also converge to p" by the
results of the previous case.
(iii) Now, consider Xo E (O,p,,). The function F" is monotonically increasing on this
interval and the graph lies above the diagonal. Thus F~(xo) is monotonically increasing
as long as the iterates stay in this interval. Because F,,(pj.l) = P", the first time that an
iterate F~(xo) leaves the interval (O,p,,) it must land in [Pj.l,p"L Le., FI~(xo) E [P",P"J
for some k > O. Then F!+j(xo) converges to PI' as j goes to infinity.
(iv) Finally, if Xo E (p", 1), then F,,(xo) E (O,p,,), so further iterates converge to Pw
Combining the cases we have proved the proposition. 0

2.3 Limit Sets and Recurrence for Maps


We have defined periodic points and found points of low periods. In the examples we
study with complicated dynamics, there are points which are not periodic but whose
orbits keep returning near where it started. Orbits with such properties are said to have
a kind of recurrence. In this section we introduce several such concepts. Although we
give a few examples in this section, the concepts should become clearer as they are used
throughout the rest of the book. In this chapter, we mainly use the concepts of the
Q and w-limit sets of a point, an invariant set, and the nonwandering set. The other

concept could be postponed until they are used.


A related concept is that of convergence of an orbit to another. We have already
defined what it means for q to be forward asymptotic to P . ,rious times in our
study of dynamical systems we need a more general concep{,; .,,' give this in terms of
the Q and w-limit sets of a point which is introduced in this section. These concepts are
also used to define the points with a kind of recurrence called the limit set.
We give these definitions in this section for a continuous map f : X --+ X where X is
a complete metric space with metric d. Several of the definitions use only the forward
2.3 LIMIT SETS AND RECURRENCE FOR MAPS 23

iterates of f and make sense even if f is not invertible. We do not distinguish these
cases but just assume f is a homeomorphism throughout and let the reader determine
which definitions make sense for noninvertible maps. In the next section, we return
to the study of the quadratic map which gives an example of several of the types of
recurrence defined in this section.
Definition. A point y is an w-limit point of x for f provided there exists a sequence
of nk going to infinity as k goes to infinity such that

lim d(r"(x),y) =
k-oo
o.
The set of all w-Iimit points of x for f is called the w-limit set of x and is denoted by
w(x) or w(x, I). Other hooks also use the notation of Lw(x, I) or L+(x,f). If the map
f is invertihle then the a-limit set ofx for f is defined the same way but with nk going
to minus infinity. The set of all such points is denoted by a(x) or a(x, f). Again other
books also use the notation of L,,(x, I) or L - (x, I).
Example 3.1. If rex) = x is a periodic point then both limit sets equal the orbit,
w(x) = a(x) = O(x) = {x, f(x), ... , r-l(x)}. In this case the w-limit set is finite, and
not connected if the period n is greater than 1.
Example 3.2. The Section 2.8 on difl'eomorphisms of the circle gives a more complete
discussion of this example on the circle. Let p ¢ Q. If fp(x) = x + p is considered a map
on the reals, then it induces a map Tp : 8 1 --+ 8 1 of the circle by taking points modulus
one. For any x E 8 1, it is shown in Section 2.8 that the forward and backward orbits
of x, O+(x) and O-(x), are each dense in 8 1 , and also that w(x) = a(x) = 8 1 .
Next we define various types of invariance for a subset.
Definition. A subset 8 c X is said to be positively invariant provided f(x) E 8 for
all x E 8, i.e., f(8) c 8. A subset 8 c X is said to be negatively invariant provided
/-1(8) c 8. Finally a subset 8 c X is said to be invariant provided f(8) = 8. Thus,
if 8 is invariant, then the image of 8 is both into and onto 8 but we do not require that
8 is negatively invariant. If f is invertible (a homeomorphism) and 8 is an invariant
subset for f, then the conditions that f(8) = 8 and that f one to one implies that 8 is
negatively invariant. Notice that a periodic orbit is always an invariant set. We show
below that any w(x) is always positively invariant and often in~riant.
Example 3.3. Let F5(x) = 5x(1 - x) on R (which is not invertible). We show in the
next section that Fs has an invariant Cantor set A. In Section 2.5, we show that there
are points x* with w(x*) = A and O+(x*) dense in A.
The following theorem gives many of the basic properties of the limit set of a point.
Theorem 3.1. Let f : X --+ X be a continuous map on a complete metric space X.
(a) For any x, w(x) = nN>ocl(Un>N{r(x)}). If f is invertible then a(x) =
nN<ocl(Un<N{r(x)}). - -
(b) If p(x) = y for an integer j, then w(x) = w(y). Also a(x) = aCyl if / is
invertible.
(c) For any x, w(x) is closed and positively invariant. If 0) O+(x) is contained in
some compact subset of X or (ii) I is one to one, then w(x) is invariant. Similarly, if
J is invertible, then a(x) is closed and invariant, so it is both positively and negatively
invariant.
(d) 1£ O+(x) is contained in some compact subset of X (e.g. the forward or-
bit is bounded in some Euclidean space), then w(x) is nonempty and compact, and
24 11. ONE DIMENSIONAL DYNAMICS BY ITERATION

d(f"(x),w(x» goes to zero as n goes to infinity. Similarly, if O-(x) is contained in


some compact subset of X, then a(x) is nonempty and compact, and d(f"(x),w(x»
goes to zero as n goes to minus infinity.
(e) If Dc X is closed and positively invariant, and xED then w(x) C D. Similarly,
if I is invertible and D is negatively invariant and xED, then a(x) C D.
(f) Ify E w(x) then w(y) C w(x) and (if I is invertible then) a(y) C w(x). Similarly,
if I is invertible and y E a(x) then a(y),w(y) C w(x).
PROOF. (a) Let y E w(x). Then y E cl(Un>N{f"(x)}) by the definition of w-Iimit
set. Therefore y E nN>ocl(Un>N{fn(x)}). This proves one inclusion. Now assume
y E nN>ocl(Un>N{fn(x)}). Then for any N, y E cl(Un>N{fn(x)}). Now for each
N tl\ke kN > kN--1 with kN ~ N I\nd d(fkN(X),y) < j,. -;rhen d(fkN(x),y) goes to
1.I'm IIIC N I(Of'fl to Infinity 111111 y E w(x). Thill proves the other inclusion, so the sets are
equlll.
(b) This is clear by the group property of iteration.
(c) For any x, w(x) is closed because it is the intersection of closed sets by part (a).
To see that it is positively invariant, let y E w(x). Then there is a subsequence nk with
d(f"'(x),y) going to zero as k goes to infinity. For any fixed j E N, d(fno+j(x).Jj(y»
goes to zero by the continuity of I. Therefore !iCy) E w(x). This proves that w(x) is
positively invariant. If I is invertible, the above argument works for any j E Z proving
that w(x) is invariant.
If I is not invertible but O+(x) is contained in a compact subset, then we argue as
follows. Let the subsequence nk be as above with d(fno(x),y) going to zero as k goes
to infinity. At least a subsequence of the points Ino-l(x) converge to a point z which
also must be in w(x). By continuity, I(z) = y. This proves that w(x) is invariant.
(d) If O+(x) is contained in some compact subset of X, then

cl( U {fn(x)})
n?N
is compact, so
w(x) =
N?O
n cl( n?N
U {In(x)})
is compact. Also w(x) is the intersection of nested nonempty sets and so is nonempty.
Finally, assume d(fn(x),w(x» does not go to zero. Then there exists some fl > 0 and
a subsequence of iterates nk with nk going to infinity such that d(fni (x), w(x» ~ fl.
The points f"' (x) are bounded so have a limit point z outside of w(x) contradicting
the definition of w(x). Thus the distance must go to zero as n goes to infinity.
(e) This follows from either part (a) or the following argument. Let xED. By the
invariance of D, f"(x) E D. Because D is closed, all the limit points of f"(x) must be
in D, proving that w(x), a(x) C D.
(f) Let Y E w(x). The set w(x) is invariant so by part (e) w(y) C w(x). A similar
argument applies to the a limit sets. 0
We now define one type of invariant set which can not be dynamically broken into
smaller pieces: a minimal set. Then the next proposition characterizes a minimal set
in terms of the w-Iimit sets of points. With this characterization we can give a simple
example of a minimal set.
Definition. A set S is a minimal set for I provided (i) S is closed, nonempty, invariant
set and (ii) if B is a closed, nonempty, invariant subset of S then B = S. Clearly, any
periodic orbit is a minimal set.
2.3 LIMIT SETS AND RECURRENCE FOR MAPS 25

Proposition 3.2. Let X be a metric space, I : X -+ X a continuous map, and SeX


a nonempty compact subset. Then S is a minimal set if and only if w(x) = S for all
xE S.

PROOF. First assume S is minimal. Since S is invariant, for any XES, w(x) C S.
Since S is compact, w(x) is a nonempty invariant subset of S. Because S is minimal,
w(x) = S. This proves one direction of the implication of the result.
Now assume w(x) = S for all XES. Assume 0 -# Be S is closed and invariant. If
x E B, then S = w(x) c B c S so B = S. 0
Example 3.2 (revisited). Let Tp be the map of the circle SI as before with p 'i. Q.
The whole circle SI is a minimal set for Tp because w(x) = SI for all points. See Section
2.8. Maps of the circle are the principal examples in which we consider minimal sets in
this book, although their other types of maps with minimal sets.
Now we can give the basic definitions of different types of recurrence.
Definition. Using the definition of w-limit set and a-limit set, we say that x is w-
recurrent provided x E w(x) and that x is a-recurrent provided x E a(x).
Definition. For a map I : X -+ X, a point p is called nonwandering provided for every
neighborhood U of p there is an integer n > 0 such that r(U) n U -# 0. Thus there is
a point q E U with r(q) E U. The set of all nonwandering points for I is called the
nonwandering set and is denoted by n(f).
Definition. An f-chain 01 length n from x to y for a map I is a sequence {x
< f.
"0, ... ,Xn = y} such that for all} ~ j ~ n, d(f(xj-d,xj)
Definition. Let Y c X. The f-chain limit set olY lor I is the set
n;-(Y,/) ={x EX: for all n ~ } there is an y E Y and an
f-chain from y to x of length greater than n}.

Then the lorward chain limit set 01 Y is the set

n+(Y,f) = n n;-(Y,f) .
• >0

(This set is the analogous object to the w-limit set of a point but for chains.) If Y is
the whole space X, we usually write n+(f). Similarly,
n;(Y, f) ={y EX: for all n ~ } there is an x E Y and an
f-chain from y to x of length greater than n},
and the backward chain limit set 01 Y is the set

W (Y, f) = nn;
• >0
(Y, f) .

(This latter set is like the a-limit set of a point.) Finally, the chain recurrent set all
is the set
R(f) = {x : there is an f-chain from x to x for all f > O}
= {x: x E n+(x,/)} = {x: x E W(x,J)}.

It takes a little bit of work to see that x E n+(x,f) is equivalent to there being an
f-chain from x to itself for all f (without assuming the chain is arbitrarily long).
26 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Definition. We define a relation", on RU) by x '" y if Y E n+(x) and x E n+(y).


Two such points are called chain equivalent. It is clear that this is an equivalence
relation. The equivalence classes are called the chain components of R. (For flows these
chain component$ arc actually the connected components of R which is the reason for
the name.) If f has a single chain component on an invariant set A then we say that f
is chain transitive on A.
With these definitions, the reader can easily check that

c1 (U{w(x, f) ; x E Xl) c nu) c RU).


In thi~ I'hllptl'r we mainly use the conl'ept of the nonwandering set and the a and w-Ilmit
SI'!"; of \lllin!s. In Cha)lh'rS Vll, IX, and X, the chain recurrent set is used extensively.

2.4 Invariant Cantor Sets for the Quadratic Family


In this section we start our study of maps with more complicated dynamics by study-
ing the Quadratic family, FI'(x) = I'x(1-x). For suitable choices of the parameter 1', the
complicated dynamics of F" is not spread out over the whole line but is concentrated on
an invariant Cantor set. Cantor sets occur often in maps with complicated or "chaotic"
dynamics throughout this book. The use of the Quadratic map on the line allows us
to introduce such complicated dynamics in a relatively simple setting and give a rather
complete analysis. After determining the invariant Cantor set in this section, the next
section contains a discussion of symbolic dynamics. By introducing symbols to describe
the location of a point, the dynamics of a point in the Cantor set can be determined by
means of a sequence of these symbols. Because many different patterns of symbols can
be written down, points with many different types of dynamics can be shown to exist.
We start in Subsection 2.4.1 by giving the properties which characterize a Cantor
set and reviewing the construction of the middle-a Cantor set in the line. After this
treatment, we show how a Cantor set arises for the Quadratic map in Subsection 2.4.2.

2.4.1 Middle Cantor Sets


Definitions. Let X be a topological space and SeX a subset. The set S is nowhere
dense provided the interior of the closure of S is the empty set, int(cI(S)) = 0. The set
S is totally disconnected provided the connected components are single points. In the
real line a closed set is nowhere dense if and only if it is totally disconnected. In other
spaces these two concepts are different. For example a curve in the plane is nowhere
dense but it is not totally disconnected. The set S is perfect provided it is closed and
every point p in S is the limit of points qn E S with qn t p.
A set S is called a Cantor set provided it is (i) totally disconnected, (ii) perfect, and
(iii) compact.
We write L(K) for the length of an interval K.
Construction of a Middle-a Cantor Set
Let 0 < a < 1 and (3 > 0 be such that a + 2(3 = 1. Note that 0 < (3 < ~. Let
So = I = [O,lJ. Start by removing the middle open interval of length a;
G = «(3,1 - (3) and
SI = [\G.
Notice that G is the middle open interval of [ which makes up a proportion a of the
whole interval. Then
2.4.1 MIDDLE CANTOR SETS 27

is the union of 2 closed intervals each of length {3, L(Jj ) = {3 for i = 0,2. We label the
intervals 50 that Jo is to the left of J2. (We use 0 and 2 rather than 0 and 1 or 1 and
2 because of the connection we make below with the representation of numbers in base
3.) For i = 0,2, let G j be the middle open interval which makes up a proportion 0
of Jj , 50 L(Gj ) = 0{3. Let Jj .o be the left component of Jj \ Gj and Jj.'l be the right
component. For jlth .. 0 or 2, the length of the component J j ,.,. Is (J'l, L(J".Ja) ..
L(Jj,)(1 - 0)/2 = {32. Let

S2 = SI \ (Go UG2) = U Jj,.jo.


j,.,.-0.2

This set, S2, is the union of 22 closed intervals, each of length {32. By induction, assume
Sn-I is the union of 2n - 1 closed intervals each of length pn-I , and labeled as J j , •... ,;.. _,
for all comhinations of jk = 0 or 2. For each of these intervals, let Gj, ..... i._,
be the
middle open interval which makes up a proportion 0 of the interval J), ..... j._,. Thus,
L(Gj, ..... j._,) = oL(Ji, ..... j._,) = Opn-I. Let

where JiJ •...•in_,.o is to the left of Jj , •...•in_ ..2. Each of these components has length pn,
L(Jj, .....j .. _,.j..) = L(JiJ ..... jn _,)(I- 0)/2 = pn. Let

U G;II" .. ,in_l

U J j ...... j ••
iJ ... ·.i .. =0.2

Since each interval Jj , .....i. _, of Sn_I'yields two closed intervals of Sn, Sn has 2(2 n - l ) =
2n closed intervals. This completes the induction step of the construction.
Finally, let

We check that C satisfies the properties of a Cantor set in the following claims.
Claim 1. The set C is nowhere dense.
PROOF. The intervals which make up Sn have length pn. Therefore, for any point
p E C there is a point q within a distance pn which is not in Sn and 50 not in C.
Therefore C has empty interior and is nowhere dense. 0
Claim 2. The set C is perfect.
PROOF. The sets Sn are closed, and C is the intersection of these sets, 50 C is closed.
Let p E C and i a positive integer. Take n such that {3n < 2- j and let K be the
component of Sn containing p. Then K n Sn+1 is made up of two intervals. Let qj be
one of the endpoints of the interval of K n Sn+1 not containing p. Then (i) qj :I: p, (ii)
Ip - qjl < {3n < 2- j , and (iii) q E C (since all the end points of the Sj are in C). This
gives a sequence of points qj E C, qj :I: p, and qj converges to p. 0
REMARK 4.1. The total length of Sn is 2npn = (l-o)n. Since 2{3 < 1, the total length
of Sn goes to zero as n goes to infinity. Therefore, all these niiddle-o Cantor sets have
measure zero.
211 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

There are Cantor sets with positive measure, but they are not formed by taking a
uniform proportion of the intervals out at each step. Let On > 0 be numbers such that
0:=1 (1 - on) > 0, i.e., E:::I on < 00. At the n-th stage remove the middle-on from
each of the previous intervals to form the set Sn. Then the total length of Sn is 1 - On
times the total length of Sn-I. By induction, the total length of Sn is 0;=d1 - OJ).
As n goes to infinity, it follows that C = n:=1 Sn has measure 0;:1(1 - OJ) > O.
REMARK 4.2. In the construction, it is not important that each component of Sn has
length exactly (3n. Let Ln be the maximum length of a component in Sn,

Ln = max{L(K) : K is a component of Sn}.

What is important is that Ln goes to zero as n goes to infinity. This property implies
that C is nowhere dense. If the sum of the lengths of the components of Sn does not go
to zero, then C has positive melL~urf'. Also to prove that. C is perff'ct, it is not necessary
that for f'1l.Ch component K of S" t.hat K n S,,+ I has two component.s; what is nc<,essary
is that for each compoOl'nt K of S" that there is a k ~ n + 1 such that [( n Sk has at
least two components.
REMARK 4.3. We show that the quadratic map has an invariant set which is a Cantor
set in the line but which is not a middle-o Cantor set. Later in the book we see many
other invariant sets which are Cantor sets, or locally the cross product of a Cantor set
and a disk (in some dimension).
Middle-Third Cantor Set and Ternary Expansion
Any number x in the interval [0,1] can be written in ternary expansion as
00 .

X = '~
"3 Jk10 '
"=1
Most numbers have a unique expansion but

In fact for any nand ii, ... ,in,

( ~ik)+in+l
~ 3" 3n
=~ik
~ 310
+ ~ ~.
~ 310
10=1 10=1 k=n+1

Thus repeating 2's in a ternary expansion plays the same role as repeating g's in decimal
expansions, and the nonunique representations given above are the only ones that oc-
cur. expansions have nonunique representations. To make the comparison between the
ternary expansion and the middle-third Cantor set, we want to avoid the use of 1's as
much as possible. For that reason we make the following conventions on the coefficients
j = (il,h, ... ) which we use in the cases where there is a nonunique representation: (i)
we use the j for which there is an n with in = 2 and i" = 0 for all k > n but do not use
the j for which there is an n with in = 1 and i" = 2 for all k > n, (ii) we use the j for
which there is an n with jn = 0 and j" = 2 for all k > n, but do not use the j for which
there is an n with jn = 1 and j" = 0 for all k > n, and finally (iii) we use Er..1 2·3-"
to represent the number 1. With these restrictions, the representation is unique.
2.4.1 MIDDLE CANTOR SETS 29

Next we consider the set of numbers which use only O's and 2's as coefficients in their
ternary expansion:
00 .

Co = {L ~: : jn = 0 or 2}.
n=1

Note for points in Co there is a unique ternary expansion even without any restrictions.
(Their expansions use only O's and 2'8 and automatically obey the above rules for choices
of the representation.)
We want to represent Co as the intersection of sets. Define
00 .

S~ = {L ~~ : jk = 0 or 2 for 1 :5 k :5 n and jk = 0, 1, or 2 for k > n}.


k-I

We want t.o show that S~ = S" hy induct.ion on 71, where the set.s Sn are those given
above for the lIIiddltl-third CI\ntor sct. First, note that S; contains all numbers except
t.hose which nre 1/3 + y where y has an (ternary) expansion in 3- k for k > 1; thus
o < y < 1/3. The two endpoints are contained in S; because 1/3 = ~~2 2· 3- k
and 2/3 can be represented with jk = 0 for k ~ 2. (We do not use expansions whose
coefficients end with a 1 followed by repeated O's or a 1 followed by repeated 2's.)
Therefore, S; = S I. Also note that the left ends of the two intervals in S; have ternary
representations which end in all O's, and the right endpoints end in all 2's. Now assume
by induction that S~_I = Sn-I, and that all the left endpoints in S~_I have ternary
expansions whose coefficients are jk = 0 for k ~ n and the right endpoints have ternary
expansions whose coefficients are jk = 2 for k ~ n. Let JiI .....Jn_. be an interval in
Sn-I = S~_l' Since its left endpoint has a ternary expansion with jk = 0 for k ~ n,
J i ......in _. \ S~ is the.set of points

where 0 < y < 3- n • (Again, the open interval is removed because we do not use
expansions which end in 1 followed by repeated O's or 1 followed by repeated 2'8.)
Therefore, S~ n Jj, .....)n_. = Sn n Ji ......in _. for any Jj, .....in_I' and so S~ = Sn. Also
note that the left endpoints of the intervals in S~ can be represented by expansions that
end in repeated O's and the right endpoints by expansions which end in repeated 2's.
This completes the proof by induction that S~ = Sn for all n.
Now letting C be the middle-third Cantor set,

n=O

-n
-
00

n=O
S n'

= Co.

Thus the middle-third Cantor set consists of those points whose ternary expansion
contains all 0'5 or 2's and no 1'5.
Finally, we note a connection between the points in C which are not endpoints and
their ternary expansions. The endpoints of Sn for some n are those points in C which
are the endpoint of an open interval K where K C R \ C. They are also the points
30 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

which end in repeated O's or in repeated 2's. Because there are points in C with ternary
expansions which have both jk = 0 for arbitrary large k and jk' = 2 for other arbitrary
large k', there are points which are not the endpoints of any ofthe S~. These are points
which have points arbitrarily close which are not in C, but they are also accumulated
on by points of C from both sides.

2.4.2 Construction of the Invariant Cantor Set


Remember that F,,(x) = ILx(1 - x) is the quadratic map, and I = [0,1). We showed
before that if x 1. I then F,,(x) goes to minus infinity as n goes to infinity. Therefore
we want to find the x-values such that F::(x) E I for all n E N. Here, and in the rest of
this book N is the set of nonnegative integers,

N = {n E Z : n ~ a}.

The maximum of F,,(x) occurs at the critical point x = 1/2 where F,,(x) takes the
value 1L/4. We consider values of the parameter IL for which this value is greater than
one, IL > 4, so F,,(l) covers I and F;;I(l) n I = F;;I(I) is the union of two intervals
which we label II and 12 ,
F;; I (I) n I = I I U h
(See Figure 4.1.) Later we take IL > 2 + 5 1/ 2 which insures that IF~(x)1 > 1 for
x E F;;I(I). This bound on the derivative makes some calculations easier.

1 ...--+--~--,

o I I X- 1

FIGURE 4.1. Intervals for the Quadratic Family

Theorem 4.1. Assume p. > 4, and let A" = {x : F::(x) E I for all n ~ O}. Then A"
is a Cantor set.
We introduce the following notation which is used throughout the proof:

n F;;k(li.)
n-I
I io ,... ,;0_1 = = {x : F!(x) E I;. for 0 ~ k ~ n - I}
k=O

where ik = 1 or 2, and
n n-l
Sn = n F;;k(l) = n F;;k(ll
k=O k=O
U 12) = U
io.il , ...• in_1 = 1,2
1;0,;1"",;0_1'
2.4.2 CONSTRUCTION OF THE INVARIANT CANTOR SET 31

For the proof that AI' is a Cantor set, it is not necessary to label the components of the
set Sn so carefully. However, in a later subsection we use the labeling to prove that the
map F,. restricted to AI' is "topologically conjugate" to a map on a space of symbols.
(A topological conjugacy is a homeomorphism which takes orbits of one map to orbits
of another map.) We introduce the notation here to avoid proving twice the lemmas
used in both proofs. The conjugacy of F,. restricted to AI' with a map on a space of
symbols is interesting because it allows us to prove that the periodic points of F,. are
dense in AI' and that F,. has a dense orbit in AI"
We start the proof with the following lemma.
Lemma 4.2. Assume Jl > 4. For all n E III the following statements are true.
(a) For any choice of the labeling with io, ... , i n - I E {I, 2}, I,o .....,~_. n Sn
1'0 ......,,_ •. 1 U 1'0 ...... ,,_ •• 2 is the union of two nonempty disjoint closed intervals (which
are subsets of 1'0 ..... ,,,_.).
(b) For tlvo distinct choices of the labeling (io, ... , in-d f (i o,'"
,i~_d, 110 ..... ,,,_. n
I,:, ..... ,~_. = 0, so Sn is tIle union of2n disjoint intervals.
(c) The map F,. takes the component 110 ...... ,,_. of Sn homeomorphically onto the
component 1........,,_. of Sn-I.
REMARK 4.4. Notice from the definition of 1'0 ......,,_.,
n-I

1'0 ...... ,,_. = n F;;"(l•• ) = {x: F!(x) E I •• for °$ k $ n -I},


"=0
so it is characterized by the forward orbit of points. In Exercise 2.16, the reader is asked
to determine the order on the line of all the intervals with three labels, 110 ., •• ,•.
In terms of images of these intervals, part (c) says that F,.(Iio ......,,_.) = 1, ...... ,"_.
where the first label is dropped. The reader should check why this is true from the
above characterization of these intervals.
PROOF. We prove the lemma by induction on n. For n = 0, So = I = [O,IJ and there
is nothing to check.
For n = 1, let G I = {x E I : F,.(x) > I} and SI = F;;I(l) n I = 1\ G I . Then, G I is
an open interval in the middle of I because F,.(I/2) > 1, and SI = In SI is the union
of two nonempty disjoint closed intervals as claimed, II U 12 with 1'0 c I for io = 1,2.
The map F,. is monotonically increasing on [0, 1/2J, so F,. maps h homeomorphically
onto I = So. Similarly, F,. is monotonically decreasing on [1/2, I], so F,. also maps 12
homeomorphic ally onto I = So. This verifies the induction hypothesis for n = 1.
Although we do not need this step, take n = 2. The map F" is monotone on each
of the separate intervals II and h For io = 1 or 2, F,.(Iio) :) II U 12 , so S2 n 110 =
1'0 n F;I(Sd is the union of two intervals, 1'0.1 U 1'0.2 where F,.(I.o.,,) = lie, and F,. is
a homeomorphism from 1'0.,. onto I, •. Since there are four intervals 110 .• " this verifies
all three conditions of the induction hypothesis.
Now assume the lemma is true for n and we verify it for n + 1. Let 1'0 ..... ,,,_. be
a component of Sn. Then F,.(Jio''''''n_.) = 1, ....... ,,_. is a component of Sn-Io and
1••.....• ,,_. :) I"''''''n_. n Sn = I"''''''n_ •. 1 U 1, ....... ,,_ •. 2. Therefore,

1'0 ..... ,,,_. n Sn+J = F;I(Sn) n Iio''''''n_.


= F;;I(Sn n 1"''''''n_.) n 110
= [F;;I(I'l''''''n- •. d U F;;I(I"''''''n_ •. 2)J n 1'0
is the union of two nonempty diSjoint closed intervals, giving condition (a). Since there
are 2n choices of the index for l io ''''''n_.' Sn+l is the union of 2(2n) = 2n + 1 intervals,
32 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

giving condition (b). The map F,. is monotone on lio •...• i" ••• j. so it maps homeomor·
phirally onto 1.......... _•. ). giving conditions (c). This completes the verification of the
induction step. and so verifies the lemma. 0
With this lemma. some of the properties of a Cantor set can be verified. Since each
of the sets Sn is closed. A = n:=o
Sft is closed. If the lengths of the intervals in Sft go
to zero. thf'n A is perfect as before. If the lengths of the intervals do not go to zero.
then the intersection contains an interval and it is perfect. In any case. A is perfect.
However. to prove that A is nowhere dense. we need to show that the maximal length of
an interval in Sn goes to zero. This fact is easier to prove if 1F~(x)1 > 1 for all x E II uh
This condition on the derivative is true if and only if J.L > 2 + 5 1/2. Therefore in the rest
of thi!l srction we assume JJ > 2 + 5 1/ 2 and return to the case for 4 < J1 ~ 2 + 5 1/2 in
the next subsection.
Lemma 4.3. TIle absolute value of the derivative is greater thall Olle for aJJ points in
fl U f2 = 51. 1F;.(x)l > 1 for all x E 51. if alld only if JL > 2 + 5 1/ 2 .
PROOF. The derivative of F,. is given by F~(x) = J.L - 2JLx. Also F:(x) = -2JL < O.
so thl' smalll'St vahll' of 1F~(x)1 on fj occurs where F,.(x) = 1. Solving 1 = F,.(x) =
JLX - JLX 2, we get x± = [JJ ± (J12 - 4JL)I/2J/(2JL). But for these points.

IF;' (x±)1 = IlL - 2JL(JL ± (JL2 - 4J1)1/2)1 = I 'f (JL2 - 4JL)I/21 = {jL 2 - 4JL)I/2.
2J.L
Therefore, we need 1F~(x±)1 = {JL2 - 4JL)I/2 > 1, or J.L2 - 4JL -1 > O. The left side equals
zero when JL = 2 ± 5 1/ 2 , and it can be checked to be greater than zero for JL > 2 + 5 1/ 2 .
(The second value of JJ arose because we squared the equation when we solved it.) This
proves the lemma. 0
The next lemma proves a bound on the maximal length of a component of 5ft in
terms of a bound on the derivative. If J.L > 2 + 5 1/2 then this bound goes to zero as n
goes to infinity. For an interval K. let L(K) be its length.
Lemma 4.4. Let A = A,. = inf{IF;~(x)1 : x E II U 12}. Then the length of any com-
ponent f. o ..... i • _. is bounded by A-n. L(1.o ....... _.) ~ A-n for aJJ possible choices of the
labeling.
PROOF. We prove the lemma by induction. Take n = 1. Let lio = [a. b]. Since
F,.(1io) = [O,IJ, {F,.(a). F,.(b)} = {0.1}. i.e .• endpoints go to endpoints. By the Mean
Value Thoorem, there is some c E [a. b] for which F,.(b) - F,.(a) = F~(c)(b - a). Then.
1 = IF,.(b) - F,.(all = IF~(c)I'lb - al ~ )"L(/io)' Therefore L(1io) ~ ),,-1. This proves
thE- induction step for n = 1.
Assume the result is true for n - 1. Take a component lio .....' •.• of Sn' Then
the image. F,.(1.o ..... ,•.• ) = li ...... i . . . . is a component of Sn-l. and by induction
L(1i ........ _,) ~ ),,-(n-I). As above by the Mean Value Theorem. there is acE
lio •... ;._. = [a,b] with F,.(b) - F,.(a) = F;.(c)(b - a). so

L(1, ...... , •.• ) = 1F,.(b) - F,.(a)1


= 1F;'(c)(b - a)1
~ Alb - al
= AL(1io ..... i•.• ).

and L(1.o .... .,•.• ) ~ )" -1 L(1i ..... .i •.• ) ~ )". n. This completes the proof of the induction
step and the lemma. 0
2.4.3 THE INVARIANT CANTOR SET FOR,.. > 4 33

PROOF OF THEOREM 4.1. Assume that ~ > 2 + 5 1/ 2 . We mentioned above that the
Sn are closed so All is closed. We have shown that the length of the components of
Sn are shorter than A-n which goes to zero with n. The proof that All is perfect and
nowhere dense is the same as for the Middle-o-Cantor set. Take pEAl" For j E N,
take n such that A-n < 2- j . Then p E lio ..... in_1 for some choice of the component of
Sn' Then lio .....• n_1 nSn+1 = lio ....... _I.1 U/io ..... in _ I .2 is the union of two intervals. Take
YJ E lio ..... io_1 \ Sn+1 in the gap, and qj an endpoint of lio ..... in_l.i. where lio ..... in_l.i n
is chosen so that p It lio ..... i o -I.i o ' Then Yj is not in Sn+1 and so is not in All' The Yj
converge to p, so this shows that A" is nowhere dense. Also qj '" p and qj E All since
it is an endpoint. The qj converge to p proving that All is perfect. This completes the
verification that A" is a Cantor set for ~ > 2 + 5 1 / 2 • 0

2.4.3 The Invariant Cantor Set for J.1 > 4


The proof of Theorem 4.1 for alii' wit.h 4 < I' goes hack to the work of Fatou and
Julia on complex functions. (This is the theorem that if all the critical points of a
polynomial have orbits which go to infinity then the Julia set is totally disconnected.)
See Blanchard (l984) or Carleson and Gamelin (1993). The first proof using strictly
real variables is found in Henry (1973). (This proof does not prove that 1F;{x)1 ~ C)"n
for x E All') The proof given below is mainly given in terms of real variables, but we
use Schwarz Lemma of complex variables to prove the key estimate.
For values of ~ with 4 < I' < 2+5 1/ 2 , there are points x E InF;I{l) with 1F;{x)1 < 1
and other points with 1F;{x)1 > 1. Thus in terms of the usual length on the line we do
not have a )" > 1 such that L{/io ..... i .. ) ~ A-I L{lil ..... i n ) for all the subintervals lio ..... i •.
There are several ways around this difficulty. One method is to look at higher iterates
of F,. and prove there is a C ~ 0 and)" > 1 such that 1(F!),{x)1 ~ C)"k for all k ~ 0
and x E huh By taking an m ~ 1 with C)"m = ),,' > 1 we have that F;' is an
expansion on II uh and L{lio, .... i n ) ~ ()"/)-IL{li m ....... ). This inequality can then be
used to prove that A,. is nowhere dense. Guckenheimer (1979) and van Strien (1981)
have a proof in this spirit using the fact that F,. has negative "Schwarz ian derivative."
Also see Misiurewicz (1981) and de Melo and van Strien (1993). Newhouse (1979) has
a proof for two dimensional maps which can be adapted to prove this result.
Rather than give the details of this proof we give an alternative proof which proceeds
by defining a new length on the interval [O,lJ. This idea is essentially present in the
complex variable proof mentioned above.
Definition. Assume that p{x) > 0 is a continuous density function on an interval K.
If x,Y E K, then define the p-distance from x to y by

dp(x,y) = IL'l p{t)dtl·

It is easy to check that this is a metric on K: dp(x, y) > 0 for x '" y, dp(x, x) = 0,
dp(x, y) = dp(Y, x), and dp(x, z) ~ dp(x, y) + dp(Y, z). Let Lp(J) be the length of an
interval J in terms of the distance dp • We think of p(x) defining a length or norm of a
vector (infinitesimal displacement) at x by IIvllp.% = Ivlp(x) where Ivl is the usual length
of a vector.
REMARK 4.5. Assume K be an interval and p: K ..... R+ be a positive density function
for which there exist positive constants C I and C2 with C I ~ p(x) ~ C2 for all x in K.
It can be seen easily by applying estimates to the integral that
34 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

for any two points r.y E K. Therefore. if we can show that I>"lengths of a nested set of
iDten-als go to zero then the usual lengths also go to zero.
The following lemma indicates the property which we want p(r) to have.
Lemma 4.5. Let K be an interval and p : K -+ R+ be a positive density [unction. Let
/ : R -+ IR be a C t [unction. Assume that there is a ~ > 0 such that
p(f(r»/f'(x)/ = IIf'(x)vll p./(%) > ~
p(x) I/vl/ p, % -
(Ilr.r ...1, wllf'rf' J is It sIIhllltf'rvnl o[ K with /(J) c K. Tllen [or tllis subintf'.Tval J,
I".l/(.I» .~ >.J.,.(.I).
REMARK 4.6. Since the derivative of / is always greater than zero on the interval J. /
is monotone on J.
PROOF. The following estimate proves the lemma:

Lp(f(J» = 11 IE/(J)
p(t) dtl

= 11 p(f(s))j'(s) dsl (where t = /(s))

1I
IEJ

~ p(f(s))f'(s) Ip(s) ds
IEJ p(s)

~ 1 ~p(s) ds
IEJ
= ~Lp(J).
o
The following proposition relates the condition given in terms of varying norm /I I/p,%
in the last lemma to a condition for the standard norm.
Proposition 4.6. Let / : R -+ R be a C t [unction. Let K be an interval and p : K -+
R+ be a positive density [unction [or which there exist positive constants C l and C 2
with C t :::; p(x) :::; C 2 for all x ill K. Assume that there is a ~ > 0 SUell that
IIj'(x)vll p,J(%) ~ ~IIvl/p.%
provided x,f(r) E K. Then taking the positive constant C = C./C2 :::; 1, the derivative
of the 71- til iterate of f satislies
/(r)'(x)v/ ~ C~"/v/
for all n ~ 0 and all r E J in terms of the usual absolute value.
REMARK 4.7. Thus if f is immediately expanding in terms of some different norm
II II p .r, it is eventually expanding in terms of the Euclidean norm.
PROOF. The proof follows directly because the two norms are uniformly eqUivalent:
/(f")'(r)v/ ~ C;·II(r)'(x)vll p,Jn(%)
~ c;t~"1/vl/P.%
~ C;tCl~"/V/.
o
To use Lemma 4.5, we are free to define the density function p(x). We want a choice
that satisfies the assumptions of the above lemma. The metric p we use is related to
the Schwarz Lemma in Complex Variables.
2.4.3 THE INVARIANT CANTOR SET FOR ,.. >4 35

Schwarz Lemma 4.7. Let D = {Z E C : Izl < I} be the open disk in the complex
plane. Assume 1 : D --+ D is complex analytic with 1(0) = 0 and I(D) ! D (not onto).
Then 1/'(0)1 < 1.
For a proof, see Theorem 15.1.1 in Hille (1962).
Corollary 4.8. Assume I: D --+ D is complex analytic and /(D) ! D (not onto). Let
g(z) = 1 -lzl2 and
1
p(z) = g(z)'

Then
p(f(z»IJ'(z)1 = 1If'(z)vllp,f(z) < 1
p(z) IIvllp,z
for all zED.

REMARK 4.8. This norm II· IIp,z is called the Poincare norm and it induces a metric
(distance between points) called the Poincare metric. The unit disk with the Poincare
metric is an example of a non-Euclidean metric on a surface with negative curvature.
PROOF. This theorem is also proved in Hille (1962), Theorem 15.1.3, but is stated in
terms of the distance between points and not the length of vectors.
We also give a sketch of the proof using the Schwarz Lemma, Fix Zo E D and let
Wo = /(zo). For j = 1,2, there are fractional linear transformations Tj preserving D,

with lajl < 1 such that T1(0) = zo and T2 (wo) = O. Thus T2 0 /0 TdO) = O. The
fractional linear transformations preserve the length of vectors in terms of II . lip,., so

I = IIT2(wo)lIp,o
II 1IIp,wo
_ IITl(O)lIp,%o
- II 1 IIp,o

Thus by the Schwarz Lemma

1 > I(T2 0 f 0 Til'(O)1


> IIT2(wo)lIp,o . 1If'(zo)lIp,wo . IITl(O)lIp,zo
II I IIp,Ulo IIllbo IIll1p,o
1I/'(zo)lIp,wo
IIll1p,zo

o
In Corollary 4.8 the absolute value of the derivative is less than one, while in the
following lemma it is greater than one. The reason for this difference is that in Corollary
4.8, / maps D into D, while F,,(/) covers I. In Corollary 4.8, the unit disk is centered
at the origin. For the map F" the corresponding interval we use is centered at 1/2, 80
a change of variables (given in the proof) modifies the Poincare norm to give the one
stated in the following lemma.
36 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Lemma 4.9. Let p(x) = [x(l - X)J-I on (0,1). (This density is singular at x = 0,1.)
Then for 11 > 4 and x, FI'(x) E (0,1),

p(FI'(x))IF~(x)1
p(x)

PROOF. We consider FI' as a map of a complex variable, FI'(z) = Ilz{l - z). Let '0 1/ 2
be the disk of radius 1/2 centered at the point 1/2. Notice that FI' takes circles of
radius r about 1/2 onto circles of radius IJr2 about 11/4: FI' (I /2 + re i8 ) = 11/4 _llr2ei28.
In particular, FI' takes the circle of radius 1/2 onto the circle of radius 11/4 about 11/4.
Thus for 11 > 4, FI' takes the circle of radius 1/2 around the outside of the disk '0 1/ 2 ,
Also, Z = 1/2 is the critical point for FI' (the point where the derivative is 7.ero), and
for I' > 4, FI' takes this critical point outside '0 1/ 2 : F I / 2 {1/2) = 11/4 is real and greater
than one. Therefore FI'('O I /2) covers '0 1/ 2 twice, and F,-;I has two branches of the
inverse t.aking '0 1/ 2 into itself and each being one to one. (The branches correspond
to the inverse of F,. on the two intervals II and h when FI' is restricted to the real
variable x.) Because there are two branches of the inverse, neither is onto all of '0 1/ 2 .
By Corollary 4.8, each of these inverses is a contraction in terms of the Poincare metric
on '01/ 2 , so FI' is an expansion for points z with Z, FI'{z) E '0 1/ 2 ,
To complete the proof we need to determine what the Poincare norm is on this disk
which is not the unit disk centered at the origin. The map h{z) = 2z -1 takes '01/ 2 onto
the unit disk D centered at the origin. If p«) = (1- «)-I is the usual Poincare norm on
D, t.hen poh(z) = [4zz-2z -2i]-I. For real z, this gives a constant multiple of the norm
stated in the lemma. However, the correct way to use the map h to "pull back" the length
of vectors is not to just look at this composition but to define IIvll •.• = IIh'{z)vlip.h(.).
Since h'(z) == 2, this induces the norm IIvll •.• = Ivll2zz - z - i]-I which for real z gives
IItllI .... = IvI2-I[X(1-x)]-I. If we take two times this norm everywhere we get the norm
stated in the lemma and it will also be expanded by F~. 0
The norm in the previous lemma is singular at 0 and 1. To get rid of this difficulty
we modify it slightly to make the norm singular at the points - f and 1 + f which are
outside of the interval [0,1]. We accomplish this change by mapping the disk of radius
1/2 + f about 1/2 onto the unit disk rather than the disk of radius 1/2.
Proposition 4.10. Assume 4 < 11. Let p(x) = [(x + ()(1 + ( - x)]-I on [O,lJ. Then
for 0 < f < 11/4 - I and x, FI'(x) E [0,1],

IIF~(x)lIp.F~("') = p{F,.(x))IF~(x)1 > 1.


IIll1p~ p(x)

REMARK 4.9. Notice that this density is nonsingular on [0,1].


REMARK 4.10. This Proposition can be proved by a direct calculation considering only
real x rather than proving it as a corollary of the Schwarz Lemma. The difficulty is that
this argument is somewhat involved and needs to consider various intervals to prove the
inequality. Exercise 2.12 asks such a direct verification to be carried out for a different
density function for which the calculation is simpler and for which there is no easy way
t.o use a complex variable argument.
REMARK 4.11. The first iterate of FI' stretches lengths in terms of this new metric, so
that we are ahle to take C = 1 and .x > 1 in the inequality 1I(F,~)'(x)lIp.F,~(",) ?: C.xk
which definE'S a hYPE'rholic set.
2.5 SYMBOLIC DYNAMICS FOR THE QUADRATIC MAP 37

PROOF. We want to modify the proof for Lemma 4.9 by taking h(z) so that it takes
the disk of radius 1 + t centered at 1/2 onto the standard unit disk D:

h(z) = 2(1 + 2t)-I(z - 1/2).

Using this h a direct calculation shows that the induced norm is as follows:

(1 + 2t)2- 1
IIvll •.• = (x + t)(1 + t - x)

Again this norm is a constant scalar multiple of the one stated in the Proposition.
To prove the proposition we only need to show that (i) F" takes the critical point
z = 1/2 outside the disk of radius 1/2 + f centered at 1/2, V I/ 2 +.' and (ii) F,,(V I / 2+.)
covers V I/ 2 +< twice. In connection with the first condition, we saw in the proof of
Lemma 4.9 that F;. mapped the critical point x = 1/2 to the point p./4, so we need
t > 0 such that p./4 > 1 + tor f < p./4 -I, which we assumed. For the second condition,
the proof of Lemma 4.9 showed that F" mapped circles of radius r centered at x = 1/2
onto circles of radius Ilr2 centered at p./4. Thus we need p.(1/2 + £)2 > p./4 + t, which
is always true for t > D and IJ > 4. Choosing D < t < p./4 - 1 we get the result. 0
For p. > 4, this last proposition proves that the p-length of the intervals l io ..... in _.
are bounded by .x.- n . Applying Remark 4.5, we get the Euclidean length is bounded by
C,\ -n for some C > D. This proves Theorem 4.1 for all these values of the parameter.

2.5 Symbolic Dynamics for the Quadratic Map


In this section, we show there is a way to represent the dynamics of F" on A by a
map on a symbol space made up by points which are sequences of l's and 2's. The map
on the symbol space is called the symbolic representation of the map and is said to give
the symbolic dynamics for the map.
Definition. Let N = {D, 1,2, ... } be the nonnegative integers as always. For p an
integer with 2 :::: p, let {I, 2, ... ,p}N be the space of functions from N into the set
{ I, 2 ... ,p}. We also write this space as ~ or 1:, to shorten the notation. We define a
metric on 1:, by
d(8, t) = f 6(s;~
k=O
tk)

for s = (so, Sl,"') and t = (to, tl,"')' where


6 i . _ {D if i = j
(,J)- 1 ifi~j.

Exercise 2.13 is designed to clarify the topology induced by this metric. (Many authors
use 2- k in the definition of the metric, but we use 3- k which makes what are called
cylinder sets into balls in terms of this metric. See Exercises 2.13-14.) Finally, we
define a shift map on 1:, by 0'(8) = t where tk = Sk+I, i.e., O'(so, SI, ... ) = (SI' S2, ... ).
The reader can check that 0' is continuous with the above metric. The space 1:p with
the shift map 0', (1:,,0'), is called the symbol space on p symbols or the full (one-sided)
p-shift space. (Later when we are studying diffeomorphisms in dimensions greater than
one, we discuss the two-sided p-shift where s: Z -< {1,2 ... ,p}.)
Next, we define the map which takes the points in A" to points in 1:2 •
38 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Definition. Define h: A,. -+ E2 by'h(x) =j = (jo.jl •... ) where F!(x) E Ii.' Thus
x E F,.-k(I].) for all k. so for any n, x E n~=oF;k(Ii.) = Iio.it .... ,in which is a
component of Sn+1' The map h is called the itinemry map.
Theorem 5.1. Let I' > 4 for the quadratic map F,. defined above. Then the itinerary
map h : A,. -+ El defined above is a homeomorphism from A,. to E2 such that h 0 F,. =
U 0 h.

PROOF. First we check the condition that u 0 h =h0 F,.. Let x E A,., s = h(x), and
t = h(F,.(x)). Then, F!(F,.(x)) E It. and F!(F,.(x)) = F!+I(X) E 1'.+1' Sotk = Sk+1,
t = =
U(5), and h(F,.(x)) u(h(x)).
Next we check that h is onto. Let s E E2. As in the earlier theorem, the intersections
1. 0 , ..••• = n~=o F;k(I•• ) are nonempty intervals and are nested as n increases. There-
fore there is a Xo E n::,=o 1.0....... = n~o F;; " (I•• ). If x E 1'0, ... ,'. then for 0 ~ k ~ n,
x E F,-;"{/•• ) so F!(x) E I ••. Therefore F!(xo) E I •• for 0 ~ k < 00 and h(xo) = 8.
This proves that h is onto.
We give two proofs of the fact that h is one to one to illustrate different ideas. In
both these proofs we assume 11 > 2 + 5 1/ 2 for simplicity. Assume that s = h(x) = h(y).
By the above argument F!(x), F!(y) E I •• for all 0 ~ k. so x, Y E I.O, ..... n for all n.
Lemma 4.4 proved a bound on the length of I.o .... '.n' L(I'o''''''n) ~ A-(n+1) L([O, 1]). so
Ix - yl ~ A-(n+l) for all n, and x = y.
As a second proof that h is one to one, we use the ideas of expansiveness. If z E h uI2
then 1F~(z)1 ~ A. Therefore if z"Z2 E I j then for some z' E I j IF,.(zd - F,.(Z2)1 =
IF~(z')1 . IZI - z21 ~ A· IZI - z21. If s = h(x) = h(y), then F!(x), F!(y) E I .. for all
o ~ k, so 1F;(x) - F;'(y) I ~ AIF;-I(X) - F;-I(y)l ~ Anlx - YI. If x I- y then for
some n, Anlx - yl > L(/. n ) and F;(x) and F;'(y) can not be in the same interval. This
contradiction proves that h is one to one.
Last, we need to check that h is continuous. Take x E A,. and 5 = h(x). Let f > O.
Pick an n such that 3- n < f. Consider the subinterval/,o''''''n' If fJ > 0 is small enough
and YEA,. with Iy - xl < fJ then y E I'o''''''n' Then for yEA,. with Iy - xl < fJ
let t = h(y). Then tIc = SIc for 0 ~ k ~ n. Therefore d(h(x),h(y)) ~ L~=n+13-" =
3-(n+1)[1 - (1/3)J- I = 3- n 2- 1 < f. This proves the continuity of h.
Because the sets A,. and E2 are compact and h is a one to one continuous map, it
follows that h is a homeomorphism. This completes the proof of the theorem. 0
REMARK 5.1. Given two maps 1 : X --+ X and g : Y --+ Y, a homeomorphism h: X --+
Y such that hoI = g 0 h is called a topological conjugacy. See the next section for
further discussion.
The same proof given above (with only minor changes) proves the following theorem
about an arbitrary function and p intervals.
Theorem 5.2. Let 1 : R --+ R be a C l function and II .... ,Ip be p disjoint closed
bounded intervals with p ~ 2. Let I = U~_I I j . Assume that 1(/j) ::> I for 1 ~
j ~ p. Also assume there is a A > 1 such that 1/'(x)1 ~ A for x E In rl(I), Let
A = n~o I-"(I). Then A is a Cantor set. Define h : A --+ Ep by h(x) = 8 where
I"(x) E I ••. Then h is a topological conjugacy from 1 on A to u on Ep.
We now use the above result on the conjugacy to prove facts about the periodic
points of the map on the line. Before we state the theorem, we give the definition of
one more property which is preserved by the conjugacy.
Definition. A map 1 : X --+ X is (topologically) tmnsitive on an invariant set Y
provided the (forward) orbit of some some point p is dense in Y. This property is
2.5 SYMBOLIC DYNAMICS FOR THE QUADRATIC MAP 39

equivalent to the fact that given any two open sets U and V in Y there is a positive
integer n such that r(U) n V 1: 0. (This is called the Birkhoff Transitivity Theorem
and is provM in Chapter VI!.) This property indicates that f mixes up the points of
Y and the set is one piece dynamically. A stronger conditions is as follows: A map
f : X -. X is called topologically mixing on an invariant set Y provided for any pair of
open sets U and V there is a positive integer no such that j" (U) n V 1: 0 for all n ~ no.
Thus if f is topologically mixing, the iterates j"(U) intersect V for allsufllciently large
values of n.
Theorem 5.3. Let f : R -. R be a CI function and II, ... ,Ip be p disjoint closed
bounded intervals with p ~ 2. Let I = LY;'=I I). Assume that f(Ij) :J I for 1 ~ j ~ p.
Also assume there is a ,x > 1 such that 1f'(x)1 ~ ,x for x E In rl(I). Let A =
n:'ork(I). Then
(a) the cardinality of the number of periodic points is given by #(Fix(rIA) = pn,
(b) PerUIA) are dellse ill A, and
(c) f is transitive on A. In fact, f is topologically mixing on A.

REMARK 5.2. We leave to Exercise 2.15 the proof that CT p is topologically mixing on
Ep and so flA is topologically mixing.
PROOF. The map f restricted to A is conjugate to CT on Ep, so it is enough to prove
these facts for CT. (This uses the fact that a conjugacy takes a periodic orbit of period
n to a periodic orbit of period n. See Exercise 2.24.)
(a) Given n there are pn blocks of length n made up with letters in {I, ... ,pl. If
b = bo··· bn -I is one of these blocks, let b be the string which repeats the block
b, b = bo .. ·bn-Ibo .. ·bn_I·" or we could write b = bb· ... Then qn(b) = b so
b E Per(n, q). On the other hand, if qn(s) = s then sn+j = Sj for all j. Therefore if
b = So ... Sn-I then s = b. This shows that the points that are fixed by qn are exactly
those given by one of the b where b is a block of length n. There are pn distinct blocks
of length n so we have proved the claim.
(b) Let sEEp and t > O. Take n such that 31 - n2- 1 < t. Let b = 80'" Sn-l and
t = h. Then t E Per(n,q) and

d(s, t) = f: D(SJ; t)
j=n
00

~ L 3- j = 3 1- n 2- 1 < t.
j=-R

Since t is arbitrary, this proves that there is a periodic point within t of 8, and so the
periodic points are dense in Ep.
(c) To prove there is a point with a dense orbit, we describe such a point. Let t be a
sequence which first lists all the blocks of length one, then lists all the blocks of length
two, and continuing, lists all the blocks of length n for each successive n:

t = 1,2; (1,1), (2,1), (1,2), (2, 2); (1,1,1), (2, 1,1), ....
(The use of parentheses, commas, and semicolons is merely to clarify the blocks making
up the string and has no real meaning.) Then, given any SEEp and k there is a n such
that qn(t) and 8 agree in the first k places. Therefore d(qn(t),s) ~ 31:-12- 1. Since k
is arbitrary, the orbit of t gets arbitrarily near 8. Since s is arbitrary, the orbit of t is
dense in Ep.
40 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

The fact that f is topologically mixing follows from the fact that given one of the
r+
intervals I ••.....•• , I (1 ......... ) :> I. We leave the details to the reader. This completes
the proof of the theorem. 0
REMARK 5.3. It is no accident that we proved that the map f is transitive by proving it
is conjugate to a shift map and then verified the-property for the shift map. It is nearly
impossible to specify a point which has a dense orbit for a nonlinear map. However the
very nature of the coding of the shift space allows us to write down a point with a dense
orbit for the shift map. Then the conjugacy proves that the nonlinear map inherits this
property even though we can not write down the point with the dense orbit.

2.6 Conjugacy and Structural Stability


In the last section showed that the quadratic map on its invariant Cantor set is
topologically conjugate to the shift map. We used the conjugacy to determine the
numher of Pf'riodie points and to prove that the quadratic map is topologically t.ransit.ive
on its invariant cantor set. In this section we continue our discllssion of topological
conjugacy: we apply it to simple maps of the type considered in Section 2.2 and get
conjugacies on intervals in the line (and not just between two Cantor sets). These
examples also illustrate some of the properties which topological conjugacies preserve
and which they do not. In the next section we return to the quadratic map and obtain
a conjugacy between two nearby quadratic maps on the whole real line.
When constructing the conjugacy for the quadratic map, we did not give any general
motivation. We start with such motivation now before stating various conditions which
we can place on the conjugacy. The concept of conjugacy arises in many subjects
of mathematics. In Linear Algebra, the natural concept is linear conjugacy. Thus if
Xl = Ax is a linear map and X = Cy is a linear change of coordinates for which C
has an inverse. then the map on the y-variables is given as follows: Yl = C-IXI =
C-l Ax = C-l ACy. Thus the matrix for the map in the y-variables is C- l AC. As
long as t.he two maps are defined on the same space (e.g. some Rn), a conjugacy can be
considered a change of coordinates of the variables on the space on which the function
acts. In Dynamical Systems, we consider a conjugacy (or change of coordinates) which
is continuous with a continuous inverse, i.e., a homeomorphism. We could also consider
a conjugacy of two functions for which the change of coordinates h is differentiable with
a differentiable inverse, i.e., h is a diffeomorphism. A third alternative is a conjugacy
where the change of coordinates is an affine function, h : IR -+ IR is an affine map,
h(x) = ax + b. We discuss below why we usually are only able to find a conjugacy
by a homeomorphism and not a diffeomorphism. (In the last section, it would not be
possible to require that h : 11." -+ E2 is differentiable because E2 is not a Euclidean
space or a manifold.)
Definition. Let f : X -+ X and g : Y -+ Y be two maps. A map h : X -+ Y is called
a topological semi-conjugacy from f to g provided (i) h is continuous, (ii) h is onto, and
(iii) h 0 f = go h. We also say that f is topologically semi-conjugate to g by h. The map
h is called a topological conjugacy if it is a semi-conjugacy and (iv) h is one to one and
onto and has a continuous inverse (so h is a homeomorphism). We also say that I and
9 are topologically conjugate by h, or sometimes we just say that I and 9 are conjugate.

Definition. To define a differentiable conjugacy we restrict to functions on the line


where we know what we mean by differentiable. Let J, K c R be intervals. Assume
f : J - 0 J and 9 : K -+ K are two cr - maps for some r ~ 1. A map h : J -+ K is called
a cr -conjugacy from f to 9 provided (i) h is a cr -diffeomorphism (h is onto, one to one,
2.6 CONJUGACY AND STRUCTURAL STABILITY 41

c r , with a cr inverse) and (ii) hoI = 90 h. We also say that I and 9 are cr -conjugote
by h, or diiJerentiably conjugate. If the conjugacy h is affine, h(x) = ax + b, then we say
that I and g are affinely conjugote.
We defer to Exercise 2.24 to show that a topological conjugacy takes periodic orbits
of one map to periodic orbits of the same period of the other map. Thus two conjugate
maps "have the same dynamics" .
Example 6.1. In this first example we show how to match up two families of quadratic
maps: F" and a second family 90' In this case the conjugacy can even be affine. This
is a partial verification of the fact that there is only "one family of quadratic functions"
up to affine change of coordinates. Define
90(Y) = ay2 - 1.
A conjugacy must match up the fixed points. The fixed points of F" are 0 and p" =
1 - 1Ill. Those of Yo are y± = [1 ± (1 + 4a)I/21/2a. Also note that (i) 90( -y+) = y+
nnd F,.(I) = 0, IUld (ii) the critical points of Yo and F" are 0 and 1/2 respectively.
Assume x = h(y) = my + b is the change of coordinates. Because -y+ < y- < y+
and 0 < PI' < 1. we must have h(-y+) = 1. h(y-) = p", h(y+) = 0, and h(O) = 1/2.
Substituting in h, we get the equations
m(-y+)+b= 1
1
my- +b = 1 - -
p.
my+ +b = 0
1
m·O+b= 2'
From the last equation we get that b = 1/2. Subtracting the first equation from the
second (and using the form of y±), we get that m(l/a) = -1/p., or m = -a/p.. Sub-
stituting these values in the third equation we get -[1 + (1 + 4a)I/21/[2p.1 = -1/2,
Il = 1 + (1 +4a) 1/2, or 4a = p.2 - 2p.. This last two expressions give necessary conditions
for the maps to be conjugate:
p.=I+(1+4a)l/2 or
p.2 _ 2p.
a= - -
4 -• and

h(y) = ~ _ a y .
2 p.
Once we have found these conditions we can verify directly that this h indeed does work:
F" 0 h(y) = F,,( -ay/p. + 1/2)
= p.(-ay/p. + 1/2)(ay/p. + 1/2)
I' a 2 y2
= '4 --;-' while

h 0 90(Y) = h(ay2 - 1)
a 2 1
= -~(ay - 1) +2
a2 y2 a 1
= ---
p.
+ -p. +-.
2
These two quantities are equal because 4a = p.2 - 2p.. This shows that these two
functions are affinely conjugate when the parameters are correctly related.
42 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Example 6.2. Let


D(z) = 2z mod 2
be the doubling map on the circle S == {z mod 2},

T _ { 2y if 0 :5 y :5 1/2
(y) - 2(1 - y) if 1/2 :5 y :5 1

be the tent map, g2(Y) = 2y2 - 1 be the quadratic map of the last example, and
.r,.(x) = /lx(l - x) be the standard family of quadratic maps. The tent map is also
called the rooftop map. The doubling map is also called the Sqtlarin9 map because In
complex notation it can be written as D(z) = z2 on complex numbers z with Izl = 1.
We claim that (i) D is topologically semi-conjugate to both TI[O,IJ and 921[-1, IJ,
and that (Ii) TIIO, 1],9211-1, I], and F 4 i[0, I] are all topologically conjugate.

(a) (b)

(c) (d)

FIGURE 6.1. The Graphs of (a) Don Sl (or [0,1]), (b) Ton [0,1]' (c) 92
on [-1,1], and (d) F4 on [0,1]

If we consider z is equivalent to -z (or 2 - z) on the circle S == {z mod 2}, this


induces a two-to-one projection p: S --+ [0, I] given by p(z) = Izl for -1 :5 z :5 1. Also,
if 1:5 z :5 2, then p(z) = 2 - z. Then for -1/2:5 z:5 1/2, Top(z) = T(lzl) = 21zl. and
°
po D(z) = p(2lzl) = 21z1 because :5 21z1 :5 1. On the other hand, for 1/2:5 Izl :5 1,
To p(z) = T(lz!) = 2 - 21z1 and po D(z) = p(2lzl) = 2 - 2Iz1 because 1 :5 2Iz1 :5 2.
Therefore, for any z, To p(z) = po D(z). This proves that D and T are semi-conjugate.
To construct the semi-conjugacy from D to 92 we consider z E S to be the point on the
circle with angle lTZ radians and let x = h(z) = cos(lTz) be the map to the x coordinate,
h : S --+ [-1, II. Then h 0 D(z) = cos(2z) = 2 C082 (Z) - 1 = 2h(z)2 - 1 = 92(h(z». This
proves that D is topologically semi-conjugate to 92,
2.6 CONJUGACY AND STRUctuRAL STABILITY 43

Lastly, note that p(z) = p(z') if and only if z is equal to ±z' modulo 2. This is also
true for h, so k(y) = h 0 p-I(y) = cos(7rY) induces a topological conjugacy from T to
g2, k : [0,1) -+ [-1,1). Notice that k is differentiable everywhere, but that k- I is not
differentiable at ±1.
Also note, by Example 6.1, g21[-1, 1] is conjugate to F4 1[0, 1], so combining TI[O,I]
is conjugate to F4 1[0, 1) by H(y) = 1/2 - (1/2)cos(7rY) = sin2 (7rY/2). Again, His
°
differentiable everywhere on [0,1] but H- I is not differentiable at and 1.
In the above examples, some of the conjugacies were affine It.nd so differentiable ev-
erywhere, and other conjugacies were differentiable but had inverses which were not
differentiable at the endpoints. In Dynamical Systems, we usually consider the notion
of topological conjugacy and not C" -conjugacy. The question arises, why do we only
require that the conjugacy h be a homeomorphism and not a diffeomorphism? As the
situation of the last section illustrates, sometimes a topological conjugacy Is all that
exists and it is still useful to match the trajectories of the two maps. The following
proposition gives another reason: the assumption that I and 9 are differentiably con-
jugate puts a very restrictive condition on the relationship between the derivatives of
J and 9 at their respective periodic points. After this proposition, we define another
concept (structural stability) which requires a map I to be topologically conjugate to
all "small perturbations" g. After this definition, we give a specific example of a map
I that is topologically conjugate to all perturbations, and we also verify that the con-
jugacy can not be differentiable for a particular choice of g. This example thus gives
a second justification that topological conjugacy is the natural concept in Dynamical
Systems.
Proposition 6.1. Assume I,g: R -+ R are CI.
(a) Assume that I and 9 are CI-conjugate by h. Assume Xo is a n-periodic point for
I, and Yo = h(xo). Then (r)'(xo) = (g")'(yo)·
(b) Assume I has a point Xo of period n and assume every n-periodic point 1/0 for 9
has (r)'(xo) =I (g")'(yo). Then I and 9 are not CI-conjugate.
REMARK 6.1. Since most small perturbations of a map I change the derivative at
periodic points, it is very difficult for two maps to be differentiably conjugate.
REMARK 6.2. In higher dimensions, the corresponding result is that the matrices of
partial derivatives of I and 9 are linearly conjugate and so have the same eigenvalues.
PROOF. Clearly part (b) follows from part (a).
To prove part (a), we assume that h 0 I(x) = go h(x), so h 0 r(x) = gn 0 h(x) and
r(x) = h- I 0 g" 0 h(x). Taking the derivative at Xo we get that

(f")'(xo) = (h-I)(gnoh(xo»(g")(h(xo»h(xo)
= (h-I)(I/O)(g")(I/O)h(xo)
= (g")(l/o)'

(We moved the point of evaluating the derivatives to subscripts to make the product
easier to read.) This proves part (a). 0
We next want to discuss maps which are conjugate to all perturbations. Such maps
have the same dynamics as all small perturbations and so the structure of the dynamics
of the whole map is stable. For this reason, such maps are called structurally stable.
To make this definition precise, we need to define what we mean for two functions to
be close, i.e., we need to define the distance between two functions.
44 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Definition. Let r ~ 0 be an integer. Let I, 9 : IR -+ IR be C r functions and J C lR be


an interval (usually closed and bounded). Define the cr -distance from I to 9 by

dr.J(f,g) = sup{l/(x) - g(x)l. I!'(x) - g'(x)l, ... ,1/(r)(x) - g(r)(x)1 : x E J}.

Obviously, I and 9 do not need to be defined on the whole real line but only need their
domains to include the interval J (unless J = R).
Definition. Assume r ~ 1. Let I : R -+ R be a cr function. A function I is C r
structumlly stable provided there exists an t > 0 such that I is conjugate to 9 on all of
R whenever g: R -+ R is a C r function with dr.R(f,g) < t. A function I is said to be
structumlly stable provided it is C l structurally stable.
Example 6.3. Take I(x) = 3x and g(x) = 2x. We want to construct a conjugacy h
with h o/(x) = go h, or h(x) = g-I 0 II 0 I(x). If h exists then h(x) = g-j 0 h 0 Ji(x)
for any j ~ O. However, we also have h(x) = go h 0 I-I(x) so lI(x) = g-j 0 h 0 Ji(x)
for any -00 < j < 00. Both maps have 0 as a fixed point. As stated before, we need
that h(O) = o. The image of 1 by h, 11(1), is fairly arbitrary, but we take h(I) = 1 and
h( -1) = -1. Then h(3) = II 0 1(1) = go 11(1) = g(I) = 2. The definition of II for x
between 1 and 3 is also arbitrary as long as it is monotone so we let ho : [1,3]-+ [1,2]
o
be a CI map with (i) ho(I) = 1, (ii) ho(3) = 2, (iii) "0(1) = 1, and (iv) h (3) = 2/3.
Similarly define ho: [-3, -1]-+ [-2, -1].
Once h is determined on [-3, -1] U [1,3] by h(x) = ho(x), this determines it for
all nonzero x as follows. Take x i' O. There is a unique j = j(x) E Z such that
Ji(x) = :Vx E (-3, -1] U [1,3). Define h(x) = g-j 01100 P(x). This defines h for all
x E R. By construction h is a conjugacy.
In the above construction, the orbit of every x i' 0 goes through the union of the
intervals J = (-3, -1] U [1,3), and and there is no proper subset of J for which this
is true. Because of this property, the set J, is called a fundamental domain lor the
unstable set 0/0. Often it is preferred to take a fundamental domain as closed so the
characterization is changed as follows: the pair of closed intervals J = [-3, -1] U [1,3] =
cl« -3, -1] U [1, 3)) is called a fundamental domain lor the unstable set 010 because the
orbit of every x i' 0 goes through J and there is no proper closed subset of J for which
this is true.
o
Note for those x approaching 1 from below, h'(x) = (3/2)h (3x). The limit of this
expression as x approaches 1 is equal to 1 = 110(1):

= 1
= x-I
lim ho(x).
x>1

Thus this extension is differentiable at x = 1. Similarly, it is differentiable at 3 and all


other points ±J.1. Thus this extension is C I for x i' O.
Now let Xi be an arbitrary sequence that approaches 0 as i goes to infinity. Then
there is aj = j(i) such that Zi == P(i)(Xi) E (-3, -1] U [1,3). It must be that j(i) goes
to infinity as i goes to infinity. Then h(x;l = g-i(i)ho(Zi) = 2- j (i)ho(Zi) and this must
converge to 0 as i goes to infinity. Thus ho is continuous at x = O.
2.6 CONJUGACY AND STRUCTURAL STABILITY 45

Finally, we want to show that h is not differentiable at O. Now let Xi = 3- i . Then


Xi approaches 0 as i goes to infinity. Then h(Xi) = 2- i ho(3'Xi) = 2- i ho(l) = 2- i also
approaches O. However,

. h(Xi) - h(O)
I1m I' 2- i
= Im-.
i-oo Xi - 0 i-oo 3- 1

I. (3)i
= .Im -
'-00 2
= 00.

Thus, h is not differentiable at O. Another way to see that h is not continuously


differentiable is to show that the derivative of h at the points Xi. h'(x,), goes to infinity
as i goes to infinity:
lim h'(x,) = lim (g-')'(l)h'(l)(fi)'(x,)
1-00 '_00

= lim 2- i . 1 . 3'
= 00.

Notice that if 9 were any map with g'(x) ~ >. > 1 for all x, then 9 has a single fixed
point and the same proof shows that f is conjugate to g. Thus f is C l structurally
stable. Also notice that if g(x) = (3 + t)x then do(f,g) = 00, but the derivatives of f
and 9 are close for all x. Thus when we consider the perturbations of a function which
are small in terms of the distance d l on all of R (which is noncompact), this is very
restrictive. Often, we can allow the CO size of the perturbation to be larger near ±oo if
the derivatives are controlled.
Example 6.4. In this example we consider f(x) = -xI3. There are two differences
from the previous example: the origin is attracting and not repelling, and the map
switches points from one side of the fixed point to the other. However if we let J =
[1, r2(1» = [1,9) then every x # 0 has a unique j E Z such that Ji(x) E [1,9). Thus
for a fundamental domain we do not need to take an interval for negative x as well as
positive x; [1,9] is the complete fundamental domain for the stable set of O. With this
change, the proof that f is structurally stable is also the same. We leave the details to
the reader.
Note that the fact that the fundamental domain for f(x) = -x13 is one interval,
while that for g(x) = xl3 is the union of two intervals implies that these two maps
can not be topologically conjugate. In fact, f is orientation reversing on R while 9 is
orientation preserving. Two maps which are topologically conjugate must either both
be orientation preserving or orientation reversing. This is why f and 9 can not be
topologically conjugate, even though both have a unique fixed point whose basin of
attraction is the whole real line.
Example 6.5. Let f(x) = x3/2 - x12. This example has fixed points at x = 0, ±3 1/ 2
and critical points at ±1/3 1/ 2 • The fixed point x = 0 is attracting and the two fixed
points x = ±3 1/ 2 are repelling. The main changes from the previous examples are the
existence of several fixed points and the existence of critical points.
If 9 is any small C l perturbation, it will have three fixed points. However to insure
that 9 has exactly two critical points we must take 9 C2 near f. We take such a 9
and let Pj for j = -1,0, 1 be the fixed points and c:l: be the two critical points, with
P-I < c- < Po < c+ < PI! Po < g(c-) < c+, and c- < g(c+) < Po·
Notice however that the map f has f'(x) ~ 4 > 1 for all x with x ~ 31/ 2 or x < _3 1/ 2 •
Thus we take (partial) fundamental domains for the unstable sets of ±3 1/ 2 given by
46 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

[- I( -4), -4J U [4,/(4)J. We can construct a conjugacy h on (-00, -3 1/ 2J U [3 1/ 2,00)


from I to 9 with h« -00, _3 1/ 2]) = (-00, -p-d and h([3 1/ 2,00)) = [Plo 00).
Now we consider the points between ±3 1/ 2 . The point 1/3 1/ 2 is a critical point which
is a minimum of the function f. Thus there are points on either side of 1/3 1/ 2 on which
I takes the same value. Let x, and y, sequences of points converging to 1/3 1/ 2 with
X, > 1/3 1/ 2 , y, < 1/3 1/ 2 and I(x,) = I(y,). If h is a conjugacy from I to 9, then we
need that 90 hex,) = h 0 I(x,) = h o/(y,) = 90 hey,). Thus the points hex,) and hey,)
need to be points which have the same image by 9. Thus h(I/3 1/ 2 ) has two distinct
points arbitrarily near which are taken to the same point by g, and h(I/3 1/ 2 ) must be
the critical point for 9, c+. Similarly, we need h( _1/3 1/ 2 ) = C- .
Once we know that h must take the critical point of f to the critical point of g, the
rest of the construction of the conjugacy is straight forward. The map f is monotone
for x with _1/3 1/ 2 ~ x ~ 1/3 1/ 2 . We construct a conjugacy here to the perturbation
9 much as in the last example using the fundamental domain of the stable set of 0
given by [/2(1/3 1/ 2),1/3 1/ 2]. making sure that the image of the critical point by ho,
hoo I( _1/3 1/ 2 ), is 9(C-). Once the conjugacy ho is defined on this fundamental domain,
extend it to [-1/3 1/ 2 , 1/3 1/ 2 J as before, with h[-1/3 1/ 2 ,1/3 1/ 2 J = [c-, c+l. (Notice that
1(1/3 1/ 2 ) > -1/3 1/2 and 1(-1/3 1/ 2 ) < 1/3 1/ 2 .)
Next, we need to extend h to the interval (1/3 1/ 2 ,3 1/ 2 ). Let 1+ be the restriction of I
to this interval and 9+ be the restriction of 9 to [c+ ,pd. Now for x E (1/3 1/ 2 ,3 1/ 2 ) there
is a smallest j > 0 with I!(x) E [-1/3 1/ 2 ,1/3 1/ 2 J. Define hex) = (9+)-j 0 h 0 I!(x).
Make a similar construction for x E (_3 1/ 2, -1/3 1/ 2J: hex) = (g_ )-j 0 h 0 I~ (x) where
9_ is the restriction of 9 to (P-1,C-J and f- is the restriction of I to (_3 1/ 2 , _1/3 1/ 2 1.
This extension makes h defined and continuous on the whole real line. This completes
the necessary modifications. (The reader may want to check some of the claims we
made in the construction.)

2.7 Conjugacy and Structural Stability of the


Quadratic Map
In the preface we stated that we wanted to determine which systems were dynamically
equivalent t.o any of its perturbation. In this section we prove that this is the case for the
quadratic maps with invariant Cantor sets: any small perturbation 9 of FI' is conjugate
to FjI'
Befort' we prove t.hat a conjugacy exists on the whole real line, we first show that
one exists between the nonwandering set of FI' and the nonwandering set of g. Because
the nonwandering sets are contained in a compact interval, J, we only require that the
two functions are close on this interval J. To make the statement clearer, we introduce
notation for the restriction of the functions to this interval. Let f J be the restriction of
I to J. Then the nonwandering set of IJ, nUJ), are the points z such that for any open
neighborhood U there is a point y E U with Ik(y) E U and f1(y) E J for 1 ~ j ~ k.
We start by giving the definition of two functions being conjugate on their nonwan-
dering sets. We also use the definitions of two functions being close in terms of their
derivatives which is introduced in the last section. Sometimes we only require that these
values are close on an bounded interval.
Definition. Let I : R -+ R be a C r function. As a modification of the concept
of structural stability, we consider a map h that only conjugates f restricted to its
nonwandering set with 9 restricted to its nonwandering set. A function I is C r n-stable
on J provided there exists an f > 0 such that f restricted to nu
J) is topologically
2.7 CONJUGACY AND STABILITY OF THE QUADRATIC MAP 47

conjugate to 9 restricted to O(gJ) whenever 9 : J --+ R is a C r function with dr,JU, g) <


f.

Theorem 7.1. Let F,..(x) = Jlx(I - x) as before witb Jl > 4.


(a) Tben F,.. is C l O-stable on [-2,2), and
(b) F,.. is C 2 structurally stable on R.
REMARK 7.1. By stating the theorem for part (a) the way we do, it implies that F,.. is
O-conjugate to Fl" for III - ll'l small.
PROOF OF THEOREM 7.1 (a). We restrict our proof to the case when Jl > 2 + 5 1/ 2 but
the proof can be modified for Il > 4 using the metric introduced above.
°
Let 16 = [-b,I + bJ for b > 0. For Ii > small enough, if z E [16 n F;I(I6)1 u
[-2,OJ U [1, 2J. then IF~(z)1 > A > 1. Since F~(1/2) = 0, the assumptions imply that
F,..(I/2) > 1 + 6.
°
Takef > small enough so that if dt,i-2,2IU, g) < f theng(-li) < -Ii, g(I+Ii) < -6,
g(I/2) > 1 + 6, and Ig'(z)1 > A for z E [16 n g-I(16») U [-2,01 U [1, 2J. Fix such a map
g. These conditions imply that gl16 covers 16 twice. The next lemma shows that the
nonwandering set of 9 is contained in 16.
Lemma 1.2. If yi'(x) E [-2,2J for all k ~
0(gll-2,2)) c n~o g-k(l6).
°tben gk(X) E 16 for all k ~ 0. Therefore,

PROOF. If gk(X) r;. 16 then gk+l(X) < max{g(1 + Ii),g(-li)} < -Ii. If also gk+I(X) E
[-2, -IiJ, then

gk+2(X) _ (-Ii) < gk+2(X) - g( -Ii)


< A[gk+I(X) - (-Ii)J.

Therefore gk+2(x) stays less than -Ii. As long as gk+i(x) stays in [-2, -liJ,

Since A > 1, this inequality can only last for a finite number of iterates, and gk+i(x) <
°
-2 for some j ~ 0. Thus if gk(x) E [-2,2J for all k ~ then gk(X) E 16 for all k ~ 0.
The statement about the nonwandering set follows from the first statement. 0
Lemma 1.3. Let Ag = n~o g-k{l6). Tben glAg is topologically conjugate to (T on
I:2·
PROOF. Because glI6 covers 16 twice, g-I(l6) n 16 is the union of two intervals If and
Ir By the Mean Value Theorem the length of each of these intervals is less than A-I
times the length of 16. Also 9 is monotone on each I j with derivative greater than A in
absolute value. Let IJ.k = g-I{lk)n/j . Then Sf = n~=o9-i(I6) is the union ofthe four
intervals IJ.k' By the Mean Value Theorem each of these intervals has length bounded
as follows: L(/j,k) S >..-IL{lk) S >..-2L{l6). By induction, S~ = n~_og-i(I6) is the
union of 2n intervals of length less than or equal to >.. -n L(I6). Exactly as in the earlier
proof, Ag = n~o9-k(16) is a Cantor set, and glAg is topologically conjugate to (T on
I:2 0
To ill" . , •• <' proof of the theorem, note that glAg is topologically conjugate to (T on
I:2 which in turn is conjugate to F,..IA,... Because topological conjugacy is a transitive
relation, we have proved part (a) of the theorem. 0
48 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

PROOF OF THEOREM 7.1(b). In this part, we let h: AI' -+ Ag be the conjugacy shown
to exist by part (a). For part (b), we need to define a conjugacy off the Cantor set A,..
Just as in the simpler example, Example 6.5, which has only a finite number of periodic
points, a conjugacy of FI' and a perturbation on all of R must take the critical point cl'
of FI' to a critical point cg of 9. Thus 9 must have a unique critical point. In order to
be able to know that 9 has a unique critical point, we must restrict ourselves to 9 which
are C2 near F,..

FIGURE 7.1. Orbit of the Critical Point

Note that F,,(c,,) > I, and 0 > F~(cl') > F!(cl')' Also if 9 is C 2 near enough to
FI" then g(cg) is greater than the points in Ag, and g2(Cg) > g3(cg) and both points
are less than the points in A g . The conjugacy we construct must take [F!(CI')' ~(cl')1
to [g3(cg ),92(Cg )l. These two intervals are the fundamental domains for FI' and 9 re-
spectively. (These intervals are fundamental domains of the stable set of infinity or the
unstable set of the invariant Cantor sets.) Let ho be such a map. (It could be taken to
be linear on this interval.) The map F" is monotone (so has an inverse) when restricted
to x ~ 1/2. Let FI'_ be this restriction, and FI'+ be the restriction to x ~ 1/2. Simi-
larly, let g_ be the restriction of 9 to points y ~ cg, and g+ be the restriction to points
y ~ cg • Then 9_ and g+ are each monotone with inverses. As in Examples 6.1 - 6.3,
ho can be extended to points x < 0 by
h(x) = 9:' 0 ho 0 F~_ (x)
where F~_(x) E [F;(c,,), F~(c,,)I. This extension is continuous, as a map from AI' U
{x: X < O} to Ag U {y : y < z for all Z E Ag}.
Next, FI'+ {x : X > I} = {x : X < O}, and h is already defined on {x : X < OJ. Also
g+{y: y > z for all Z E Ag} = {y : y < z for all Z E Ag}. Thus we can extend h to
{x:x>l}by
h(x) = g:;1 0 h 0 FI'+(x).
With this definition, the map is still continuous and a conjugacy where it is defined.
Also h(FI'(1/2» = 9.:;1 0 h 0 F~(1/2) = g:;1 0 g2(cg) = g(cg ).
Next, if Gl,I." is the gap at the first level for F", and Gl,l.g the gap at the first level
for g, then the equation
h(x) = { 9':; 10 h 0 FI'(x) for x E G t .I.1' and x > 1/2
g:1 0 h 0 F,,(x) for x E G u ." and x < 1/2
2.8 HOMEOMORPHISMS OF THE CIRCLE 49

extends h continuously from G I •I .,.. to GI.I.g' As above, it can be checked that h(I/2) =
cg • We continue inductively tc;1 the gaps at n'h level, Gj,n,,.. for F,.. and Gj,n,9 for g. For
each j, we can extend h from Gi,n,,.. to Gj,n,g by using the appropriate branch of the
inverse of g, g::;1 or 9: 1 . We leave the details to the reader. 0

2.8 Homeomorphisms of the Circle


In this section, we discuss the periodic orbits and recurrence of orientation preserving
homeomorphisms of the circle. By restricting to invertible maps and by exploiting the
fact that the circle comes back on itself, we are able to give a rather complete description
of what dynamics can occur for such a homeomorphism. An important quantity which
makes this determination possible is the rotation number. This number measures the
average amount that a point is rotated by the homeomorphism. When this is the
rational number p/q the homeomorphism has periodic points with period q. When this
number is irrational, there are no periodic orbits. With the assumption that the map
is a C~ diffeomorphism with irrational rotation number, it follows that every orbit is
dense in the circle.
We denote by 8 1 the unit circle. It can be thought of as either (i) the real numbers
modulo 1 or (ii) the points in the plane at a distance one from the origin. In terms of
the second way of thinking about 8 1 , we often identify R2 with the complex plane C
and so 8 1 = {z E C : JzJ = I}. There is also a covering space projection 11" from R onto
8 1 • In terms of the first way of thinking about 8 1 , 1I"(t) = t mod 1. In terms of the
second way of thinking about 8 1 , 1I"(t) = e2'11"". (Note we used 11" for both the map and
the number 3.14· .. but throughout the rest of this section 11" usually denotes the map.)
Thus 11" can be thought of as taking an angle measurement to a point on the circle.
Throughout this section I : 8 1 -+ 8 1 is assumed to be an orientation preserving
homeomorphism. Given such a map, there is a (nonunique) map F : R -+ R which is
called a lift 011 such that 11" 0 F = 1011'. A lift, F, of I satisfies (i) F is monotonically
increasing and (ii) F(t + 1) = F(t) + 1 for all t, so (F - id) has period 1. For example,
if 1>. is the rotation by A on 8 1 (or by 211'A radians) then F>,(t) = t + A is a lift. But for
any integer k, F>,(t) = k + t + A is also a lift. In fact for any homeomorphism I, if FI
and F~ are two lifts then there is an integer k such that F2 (t) = Fdt) + k for all t. (For
each t, 11' 0 F2 (t) = 10 1I'(t) = 11" 0 FI(t) so there is an integer k, with F2(t) = FI(t) + k t .
Since everything is continuous and the integers are discrete, k, is independent of t.)
The aim of the section is to define an invariant of I called the rotation number and
prove that it can be used to determine whether I has any periodic points or not. The
rotation number is a measure of the average amount of rotation of a point along an
orbit.
Definition. We start by defining a number for a lift F of I. Let

(10 (F , t) -_ I'1m FR(t) - t .


n-oo n
We show below that this limit exists and is independent of t, and so we denote it by
(1o(F). We also show that if FI and F2 are two lifts then (1o(FI,t) - (1o(F2,t) is an
integer, so
p(f) = (1o(F,t) mod 1
is well defined. The number p(f) is called the rotation number of f.
REMARK 8.1. The definition of rotation number easily implies that (1o(Fk) = k{1o(F)
and p(fk) = kp(f) mod 1.
50 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Example 8.1. Let h. be the rotation by >. on SI with lift F>,(t) = t + >.. This map is
called a rigid rotation of SI by>.. Since Ff(t) = t+n>', it is easy to see that Po(F, t) = >.
c . ! ' , '!'ld p(f) = ). mod 1. In this example every point is rotated by exactly>. so the
rotation number should be )..
In this example we can see the connection between the rationality of the rotation
number and the existence of a periodic orbit. Assume). = p/q is rational, i.e., h. is a
rational rotation. Then F!(t) = t + q). = t + p. Therefore every point is periodiC with
period q.
Now assume that). is irrational. For J>. to have a point x of period q, it is necessary
for F!(x) = x+q>' = x+k for some integer k. Thus we need), = k/q which is impossible
because ). is irrational. Thus J>. has no periodic points in this case. Below, Theorem
8.3 shows that a every point in SI has a dense orbit when>. is irrational.
We start by proving that a orientation preserving homeomorphism of the circle has
a rotation number which is independent of the point.
Theorem 8.1. Let f : SI -+ SI be an orientation preserving homeomorphism with lift
F. Then
(a) for t E IR the limit defining Po(F,t) exists and is independent oft,
(b) if p(f) = Po(F, t) mod 1, then this is independent of the lift F, and
(c) p(f) depends continuously on /.
PROOF. (a) Take any two points t, s E R. There is an integer l such that t $ s+l < t+1.
By the monotonicity of F, F(t) $ F(s + l) < F(t + 1) = F(t) + 1, and by induction,
F"(t) $ P(s + l) < P(t) + 1. Subtracting t + 1,

FP(t) - t - 1 $ P(s + l) - t- 1
< FP(s + l) - s - l
< FP(t + 1) - t
= FP(t) - t + 1.
Since FP(s + l) - s - l = FP(s) - s,

FP(t) - t - 1 < FP(s) - S < FP(t) - t + 1.


Writing p+n(t) - t = FP(Fn(t)) - ~(t) + ~(t) - t and applying (*) with s = ~(t),

Applying (**) with n = p,

F 2P(t) - t < 2[FP(t) - tj + 1,


and by induction
FnP(t) - t < k[FP(t) - tj +k - 1.

For any n 2: p, write n = kp + i where 0 $ i < p. Then by (**) and (***),

Fn(t) - t = Fkp+i(t) - t
< Fkp(t) - t + Fi(t) - t + 1
< k[FP(t) - tJ + Fi(t) - t + k.
2.8 HOMEOMORPHISMS OF THE CIRCLE 51

Dividing by n and using the fact that n ~ kp,

Fn(t) - t k[FP(t) - t] Fi(t) - t k


n < kp + n +-k'
p

In the same way using the inequality P(t) - t - I < P(s) - s and the fact that
(k + I)p > n, we get

k[FP(t) - t] + Fi(t) - t _ _ k_ < _P--,(,-,t)_-_t


(k + I)p n (k + I)p n'

Using the last two inequalities and letting n (and so k) go to infinity with p fixed,

FP(t) - t I. Fn(t) - t FP(t) - t I


-~- - - ~ hm sup ~ + -.
p p n-oo 11 p P

Notice that this last set of inequalities shows that the Iimsup is finite. Now letting p go
to infinity,
Fn(t) - t FP(t) - t
lim sup ~ lim inf .
n-oo n p-oo p
Thus we have proved that the limit defining the rotation number Po(F, t) exists and is
finite for any t E R.
Inequality (*) above shows that. for two different points t, s E R,

FP(t) - t - I FP(s) - s FP(t) - t +I


-""'"--'----< < .
P P P

Since the rotation numbers for both t and s exist, it easily follows that Po(F, t) = Po(F, s)
for any two points t, s E R. Because we have shown that the rotation number Is
independent of the point, we write Po(F) from now on.
(b) Assume FJ and F2 are two lifts of f. We noted above that there is an integer k
independent of t such that F2 (t) = F J(t) + k. By induction F2'(t) = Ff(t) + nk for any
positive integer n. Therefore

'" ) = I'1m F2'(t) - t


Po ( £2
R-OO n
. (Ff(t)-t nk)
= I1m +-
n-oo n n
= Po(Fd + k.

Thus Po(F2) = Po(Fd mod I as claimed.


(c) Let f > O. Choose an integer n > 0 such that 2/n < f. Let F be a lift of f. There
is an integer psuch thatp ~ P(O) < p+l. It then follows that p-I < Fn(t)-t < p+I
for all t (possibly replacing p). For 9 near enough to f in terms of the CO topology, a
lift G of 9 can be chosen so that p - I < Gn(t) - t < p + l. (Note that n is fixed.) The
nk-th iterate of 0 can be written as

Fnk(O) = pnk(O) - 0
k-1
= Lpn 0 Fjn(O) - Fin(O)
j=O
52 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

which is greater than k(p - 1) and less than k(p + 1). Therefore

k(p - 1) < F"k(O) < k(p + 1),

and we also get that


k(p - 1) < Gnk(O) < k(p + 1).
Because the rotation numbers for F and G exist, they can be calculated by subsequences,
So Po(F) = Iimk_oo Fkn(O)/kn and Po(G) = Iimk_oo Gkn(O)/kn. The above inequalities
easily imply that

p- 1
--<Po (F) <p-+-1 and
n n
p-l (G) p+ 1
- - <Po
n
< -n- .

Therefore Ipo(F) - Po(G)1 < 2/n < t. This completes the proof of part (c) and the
theorem. 0
We leave to the exercises to show that if f has a periodic point then p(f) is rational.
The following proposition proves that p(f) being rational is also a sufficient condition
for f to have a periodic point.
Theorem 8.2. The rotation number p(f) is rational if and only if f has a periodic
point. In fact, p(f) = p/q if and only if f has a point of period q. (Here p/q is assumed
to be in reduced form with p and q integers and q positive.)
PROOF. Exercise 2.29 proves that if f has a periodic point then p(f) is rational.
Conversely, assume that p(f) = p/q is rational and expressed in lowest terms. Let
F be a lift of f. By the definitions of p(f) and Po(F), there is an integer k with
Po(F) = (p/q) + k. Then F(t) = F(t) - k is another lift of f and Po(F) = p/q. Also,
Po(Fq - p) = Po(Fq) - p = qPo(F) - p = O.
Let G(t) = Fq(t) - p. We need to show that G has a fixed point on it so f has a
point of period q on SI. There are three cases: (i) G(O) = 0, (ii) G(O) > 0, and (iii)
G(O) < O. In the first case, 0 is a fixed point, and we are done.
In case (ii), because G is increasing, 0 < G(O) < G2(O) < ... Gn(o) < .... There
are two subcases. First assume that 0 < Gn(o) < 1 for all 11. Because the orbit is
monot.onically increasing, G"(O) converges to some point Xo. As we have mentioned
before, it follows by continuity that G(xo) = Xo and G has a fixed point.
As a second subcase, assume there is a k > 0 such that Gk(O) > 1. Then G 2k (O) =
G ir co Gk(O) > G k (1) by the monotonicity of G k . But G Ir (I) = Gk(O) + 1 > 2, so
G 2k (O) > 2. By induction, G]k(O) > j. Therefore Gjk(O)fjk > l/k and Po(G) ~ l/k.
This contradiction shows that this second subcase can not occur and case (ii) implies
there is a fixed point.
In case (iii) when G(O) < 0, 0 > G(O) > G2(O) > .... By reasoning as in case (ii),
there must be a fixed point. This finishes the proof of the theorem. 0
Example 8.2. Let f be the map on SI whose lift F is given by

F(t) = t + t sin(21T1It)

for n a positive integer and 0 < E < 1/(21Tn). Then x is a fixed point of f if there is
an integer p with F(x) = x + E sin(21Tnx) = x + p, or E sin(21Tnx) = p. Since E < 1 it
follows that p = O. The solutions are x = j /2n for j = 0, ... ,2n - 1. The derivative
2.8 HOMEOMORPHISMS OF THE CIRCLE 53

FIGURE 8.1. Example 8.2

F'(x) = 1 + f21Tn COS(21TnX) so F'(j/2n) = 1 + f21Tn > 1 for j even and 0 < F'(j/2n) =
1 - ,211'71 < 1 for j odd. Tlu·rr.fore thr. points x = j /2n are fixed point sources for j even
and fixed point sinks for j odd. See Figure 8.1.
Example 8.3. Let I be the map on SI whose lift F is given by

F(t) = t + l/n + f sin(21Tnt)


for n a positive integer and 0 < f < 1/(21Tn). Let xi = j/2n. Then F(xj) = XjH 80
{xi: j is even} is one periodic orbit of period n and {xi: j is odd} is another periodic
orbit of period n. The derivative of F" at Xo satisfies

so the first orbit is a source. Similarly,

so the second orbit is a sink. It can be checked that any other point y (i) is not periodic,
(ii) has o(y) = O(xo), and (iii) has w(y) = O(xd.
Having looked at some examples with rational rotation number, we return to rotations
on the circle with irrational rotation number. We prove that every point for this map
has a dense orbit.
Theorem 8.3. Let I>. be a rotation on SI as defined above. Assume A is irrational.
Then I>. has no periodic points and every point in SI has a dense orbit in SI. Thus for
any x E SI, w(x) = SI and SI is a minimal set for 1>..
PROOF. As we noted in Example 8.1, the lift F>. is given by F>.(t) = t +A, Po{F>.) = A,
and p(l>.) = A mod 1. By Example 8.1 or Theorem 8.2, I>. has no periodic points
because A is irrational.
Now we turn to showing that all orbits are dense. Since there are no periodic points,
all the points Fi(x) are distinct modulo one. (If Ff(x) = Ff'(x) for n # m then
x + nA = x + TTlA + k for an integer k. Then A = k/{n - m) is rational.) The set
f~(x) must have a limit point in SI. Thus given f > 0 there exist integers n # m
and k such that IFf (x) - FJ:'(x) - kl < f, or in SI d(ff{x), fJ:'(x» < f. The lift F>.
preserves lengths, so letting q = n - m, d(f9(X), x) < f. Then d(f29(X),/9X) < f,
d(f3 9(x), f 29 x) < f, ... d(f(j+1)9(X),li9X) < f. These intervals eventually cover SI 80
the orbit of x is f-dense in SI. Because f > 0 is arbitrary, the orbit of x is dense in 8 1 ,
w(x) = SI, and SI is a minimal set for f>.. 0
54 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

Theorem 8.4. Assume I : SI -+ SI is a continuous orientation preserving homeomor-


phism and p(f) is irrational. Then the following are true:
(a) w(x) is independent of x,
(b) w(x) is a minimal set, and
(c) w(x) is either (i) all of SI or (ii) a Cantor subset of SI.

We leave the proof of this result to Exercise 2.26.

Theorem 8.5. Assume I: SI -+ SI is a continuous orientation preserving homeomor-


phism and T = p(f) is irrational.
(a) Then I is semi-conjugate to the rigid rotation map with rotation T, IT' The
semi-conjugacy takes the orbits of I to orbits of IT, is at most two to one on w(x,f),
and preserves orientation.
(b) If w(x, f) = 51 then I is conjugate to IT'
(c) If w(x, f) ~ 51 then the semi-conjugacy h from I to IT collapses the closure of
eacil open interVllI I in the complement ofw(x) to a point.
PROOF. Let F be a lift of I. The next lemma proves that the order of orbits on the
lift is the same as the order of the lift FT of the rigid rotation.

Lemma 8.6. Choose the lift F so that Po(F) = T, with T irrational. Let t E R
and let k, m, n, q E Z. Then (i) Fn(t) + m < Fk(t) + q for some t if and only if (ii)
Fn(t) + m < Fk(t) + q for all t if and only if (iii) nT + m < kr + q if and only if (iv)
F;'(O) + m < F:(O) + q where FT is the translation by T.
PROOF. The equivalence of conditions (iii) and (iv) is clear.
Because the rotation number is irrational, the order of ~(t) + m and Fk(t) + q in
the line is independent of t, so condition (i) is equivalent to condition (H).
In showing that condition (il) implies (Hi), replacing t by F-k(t) in condition (il)
gives that Fn-k(t) - t < q - m for all t. This in turn implies that

T = Po(F)
Fi(n-k)(t) _ F(i-I)(n-k)(t)
.
= ltm L --....!...<~--:-:----'-'-
j-co i=1
j

j(n - k)
< q-m.
- n- k

The inequality must be strict because the rotation number is irrational, so nT + m <
kT + q which is condition (iii).
Assuming condition (iii), we get that Fn-k(t) - t < q - m for some t since the
rotation number is less than (q - m)/(n - k). Again the sign of this inequality must be
the same for all t because the rotation number is irrational. Substituting Fk(t) for t we
get Fn(t) - Fk(t) < q - m for all t, or condition (H). 0

Let Xo E w(x, f) and to E R with 1r(to) = Xo. Let B = {Fn(to) + m : n, m E


Z} c R. Let A = {F.;'(O) + m : n,m E Z}. Then the map h : B -+ A defined by
ii(~(to) + m) = F;'(O) + m is order preserving by Lemma 8.6. Also h(t + 1) = h(t) + 1
by the construction. Because of the order preserving property, h has a continuous
extension to the closure, h: cl(B) -+ cl(A) = R. The extension is also order preserving
and satisfies h(t + 1) = h(t) + 1. The order preserving map ii : cl(B) -+ R induces a
map h : w(x, f) -+ SI. This map h is semi-conjugacy of Ilw(x, f) to IT'
2.8 HOMEOMORPHISMS OF THE CIRCLE

Claim 1. If cl(B) = R then h is one to one on Rand h is one to one on 8 1 . Thus part
(b) of the theorem is true.
PROOF. Assume there are Xl < X2 with h(x.) = h(X2). Letting i = 1,2, there are two
increasing sequences of points Yn; + mi E B for j ~ 1 such that Yn; + mi converges to
Xi from below. (The mi can be taken independent of j because the points Yn ' + mi
J
converge to a single point.) We can assume that Yn; + ml < Yn~ + m2 for all j and k by
- n~ - n2
taking a subsequence. Then h(Yn; +ml) = FT' (O)+ml and h(Yn~ +m2) = FT J (O)+ml
both converge to the same point and from the same side. Thus the two sequences in
the image must be interlaced while those in the domain are not. This contradicts the
order preserving property of h and proves the claim about h.
The properties of h in this case follow from those of h. 0
Claim 2. 1£ cl(JB) '# R, then h takes the two elJd points of an open interval in R 'cI(B)
to tile same point. Thus h has a continuous order preserving extension to R which takes
the c106ure of each open interval in R 'cl(B) to a single point. This map h : R -+ R
induces a semi-conjugacy h : 8 1 -+ 8 1 which satisfies the conditions of part (c) of the
theorem.
PROOF. Take an interval (eI,e2) in the complement of cI(B). No points in the orbit of
to intersect (el' e2). We need to prove that h(e.) = h(e2)' If h(e.) '# h(e2), then by the
order preserving property of h the image h(cI(B» misses the interval (h(e.),h(e2)) in
R. However h(cI(B» = cI(A) = R. This contradiction proves that it must be the case
that h(e.) = h(e2)' Thus h has a continuous order preserving extension from R to R
with h([el' e2]) = {h(ed}. The rest of the conclusions of the claim follow. 0
By Claims 1 and 2, there is always an order preserving map h : R -+ R with h(t+ 1) =
h(t) + 1 which induces a semi-conjugacy h : 8 1 -+ 8 1 from I to IT' The two claims
prove the desired properties of h. This completes the proof of Theorem 8.5. 0

Theorem 8.7 (Denjoy). Assume that I : 8 1 -+ 8 1 Is a C 2 orientation preserving


diffeomorphism. Assume that T = p(f) Is irrational. Then I Is transitive so I is
topologically conjugate to the rigid rotation IT'
See Nitecki (1971), de Melo (1989), or de Melo and Van Strien (1993) for a proof.
Theorem 8.8 (Denjoy). Let T be irrational. Then there exists a Cl orientation
preserving diffeomorphism I of 8 1 such that p(f) = T and w(x) # 8 1 .

PROOF. The proof consists of placing the open intervals which correspond to the gaps
of the Cantor set in the same order as an orbit of IT' the rigid rotation.
Let in be a sequence of p06ltive real numbers (lengths) indexed by n e Z with (I)
limn_:too(in+!/in) = I, (ii) E::'__ oo in = I, (iii) in > in+l for n ~ 0, (iv) in < in+!
for n < 0, and (v) 3in+! -in> 0 for n ~ O. For example, in = T(lnl + 2)-I(lnl + 3)-1
works where T-I = E::'=_oo(lnl + 2)-I(lnl + 3)-1.
Let In be an closed interval of length In. We place these intervals on the circle in
the same oroNl' the order of the orbit I~(O}. SO to place an interval In, consider
the sum of the lengths of the intervals I j where I~(O} is between I~(O) and O. This
determines the placement of In. Since the circle has a total length of one, the measure
of 8 1 \ UnEZ int(In} is zero.
The next step is to define I on the union of the In. It is necessary and sufficient for
f'(t)= 1 on the endpoints in order for the map to have a continuous derivative when it
II. ONE DIMENSIONAL DYNAMICS BY ITERATION

is extended to the closure. Assume In = [On, b..1, SO in = bn - an. The integral

SO

Therefore, if we define f for x E In by

f ()
X = an+! + 1'" +
an
1
6(in+! -in)
p-
n
(b n - t)(t - an)dt,

then f(b n ) = an+1 + in + in+! -in = bn+!. Also, f is differentiable on In with

I'(x) = 1 + 6(£n+;3 -in) (b n - x)(x - an).


n

Thus, f'(On) = 1 = I'(b n ). Notice that for n < 0, t n + 1 - tn > 0, that

1 ~ /'(x)
< 1 + 6(tn+1 -in) (t'n)2
- i~ 2
3tn+! -in
2tn

and (3tn+! - t n )/(2in ) goes to 1 as n goes to minus infinity. Similarly for n ~ 0 and

1 > f'(x) > 3t'n+! - in > 0


- - Un '
SO f'(x) goes to 1 as n goes to plus infinity uniformly for x E In. From these facts,
it follows that f is uniformly CIon the union of the interiors of the In and has a C l
extension to all of SI.
The second derivative I"(x) is given by

!"(x) = 6(£n+;3 - in) [(b n - x) - (x - an)l,


n

SO (i) f"((a n + bn )/2) = 0 in the middle of the interval, and (ii) as x goes to an, f"(x)
converges to

which is unbounded as Inl goes to infinity. Thus f is not C 2 , and this example of Denjoy
does not contradict the theorem of Denjoy.
Let A = SI \ UnEZ int(In). This set can be formed by successively removing open
intervals. Because an open interval is eventually removed from any of the closed intervals
obtained at a finite stage. A is a Cantor set.
2.9 EXERCISES 51

The orbit of a point z E A is dense in A since it is like the orbit of 0 for /r by


=
Theorem 8.5. Thus w(z) A. If z E int(ln), then there is a smaller interval U whose
closure is contained in int{ln). Since the interval In never returns to In but wanders
among the other 1;, z ;. w( z). Also z ;. 0(1), the nonwandering set of ,. This proves
= =
that A w(z) 0(1) for any z E Sl. This completes the proof. 0
Recent work has dealt with the existence of a differentiable conjugacy between a
diffeomorphism / with irrational rotation number T and /r. Arnold, Moser, and Berman
have obtained results. See de Melo (1989) or the revised version de Melo and Van Strien
(1993) for a discu88ion of these results and references.

2.9 Exercises
Periodic Points
2.1. A homeomorphism I of It is (_trid/,) monotonicall, i"crtui", provided z < "
implies that I(z) < 1(,,). It is (_trid/,) monoto"icall, decrtuin, if z < " implies that
I(z) > I(y)·
(a) Prove that any homeomorphism I of lit is either monotonically increasing or mono-
tonically decreasing.
(b) Prove that a homeomorphism I of lit can never have periodic points whose least
period is greater than 2.
2.2. Let I : lit - lit be continuous. A88ume for one point Zo the orbit I; (zo) is a
monotone sequence and is bounded. Prove that Ij(zo) converges to a fixed point.
2.3. Prove Theorem 2.2.
2.4. Let
2z for z ~ 1/2
T ()z ={
2 - 2z for z ;::: 1/2.
be the tent map.
=
(a) Sketch the graph on 1 [0,1] of T, T2, and (a representative graph of) T" for
n > 2.
(b) Use the graph of T" to conclude that T has exactly 2n points of period n. (These
points do not nece88arily have least period n but are fixed by T".)
(c) Prove that the set of all periodic points of T is dense in [0,1].
=
2.5. Let F.. (z) 4z(l- z) on lit.
(a) Make a rough sketch of the graph of F4'(z) for n > 2.
(b) Use the graph of F4' to conclude that F: has exactly 2n fixed points. (These
points do not necessarily have least period n but are fixed by F:.)
2.6. Consider the quadratic map F,. for parameter values I' > 1.
(a) Find the points of period two and determine their stability. (Indicate for which
parameter values they exist and the stability for different parameter values.)
(b) Find the points of period four.
( c) Let II > 1 be in the range of parameters for which the orbit of period 2 is attracting.
=
Let 0 < z < 1. Prove that either (i) there is an integer Ie ;::: 0 such that F!(z) PI'
where PI' is the fixed point or (ii) the w(z) is the orbit of period 2.
5
=
2.7. Let I(z) z3 - 4z.
(a) Find the fixed points, {O,±p}, and determine their stability.
(b) Find the critical points and show they are non degenerate. Label the critical points
±c. Draw the graph of ,.
=
(e) Find all points which satisfy I(z) -z. Use this information to find an orbit of
period 2, {±q}.
58 II. ONE DIMENSIONAL DYNAMICS BY ITERATION

(d) Show that f2 is monotone on [-c,cJ where ±c are the critical points. Use this in-
formation to prove that W'(O(q)) ::) [-c,O)U(O,cJ. Then prove that W'(O(q)) =
(-p, 0) U (0, p) where ±p are two of the fixed points.
Limit Sets
2.8. Let f : X -+ X be a continuous map on a metric space. Let p EX.
(a) Prove that cl(O+(p)) = O+(p) Uw(p).
(b) Prove that if w(p) = 0 then O+(p) is a closed subset of X.
(c) Assume that O+(p) is a compact subset of X. Prove that O+(p) = w(p).
Invariant Cantor Sets
2.9. Let I = [0, IJ, and

SX, for x :-:; 1/2


T.(x) ={
8(1 - x), for x ~ 1/2.

This is a generalized tent map. Prove that for S > 2, A = n{T.-k(I) : 0 ~ k < oo} is
the middle-a Cantor set for some a.
2.10. Let b,(x) = x 3 - Ax.
(a) Find the fixed points of b, and determine their stability for A > o.
(b) Let I), = [-(A + 1)1/2, (A + 1)1/2J. For A > 0, prove that a point x with x rt I), has
ff(x) --+ +00 or ff(x) --+ -00.
(c) Consider A = 8. Show that

Sn = nn

k=O
fik(/),)

contains 3n intervals. Show that

is a Cantor set (i.e., perfect and nowhere dense). Hint: If x E SI show that
I/s(x)1 > 1.
2.11. Use Lemma 4.4 to show that if I.l > 2(1 + 21/2) then Alfo has measure zero. Hint:
If It > 2(1 + 21/2) then F~(x) > 2 on I. U h Remark: If I.l > 4 then Alfo has measure
zero, but this fact is harder to prove.
2.12. For I.l > 4, show that FIfo(x) = I.lx(1 - x) is expanding by a factor of 2 in
terms of the density p(x) = [x(l- X)J- I / 2, i.e., show that If~(x)lp(flfo(x))/p(x) ~ 2 for
x E [0, IJ n F;I([O, 1]).
Symbolic Dynamics
2.13. Let d be the metric on E2 = {1,2}N defined by

d(s, t) = f: ISj ~ tjl.


j=O 3J

(a) Given t E E2 and n ~ 0, prove that

Also prove that this set is a closed set and an open ball in the above metric.
2.9 EXERCISES 59

(b) Given t E E2 and n ? I, prove that

is an open set (but not an open ball). Note that the range of entries of 8 starts
with j = 1 and not O.
(c) Given t E E2 and m ~ n, prove that

{s E E2 : Sj = tj for m ~ j ~ n}

is an open set. These sets are called cylinder sets.


(d) Prove that E2 with the metric d is a complete metric space.
(e) Prove that E2 is compact in terms of the metric d. Remark: The topology induced
by d is t,hc sanlc as the one obtained by considering E2 as the infinite product of
{I, 2} with itself and using the product topology. Thus by Tychonoff's Theorem
E2 is compact. In this exercise the reader is asked to verify this fact directly in
this case.
2.14. Let A > 1 and P>. be the metric on E2 = {I, 2}N defined by

~ Is -tl
p>.(s, t) = L.J ~.
j=O

Let d be the metric of the previous problem and Section 2.5.


(a) Prove that the identity map, id, is a homeomorphism from E2 to itself when the
domain is given the metric d and the range is given the metric P>.. This shows
that the two metrics have the same open sets and so induce the same topology for
anYA> 1.
(b) Given t E E2 and n ? 0, prove that

{s E E2 : Sj = tj for 0 ~ j ~ n}

is an open ball in terms of the metric P>. if and only if A > 2. Note that these sets
are open sets in terms terms of the metric P>. for any A > 1 by part (a).
2.15. Prove that the full p-shift is topologically mixing, i.e., (1" is topologically mixing
on E" for any positive integer p ? 2.
2.16. Let I = [0,1]' Fs(x) = 5x(1 - x), and

4x for 0 ~ x ~ 0.5
g ()
x = {
4x - 2 for 0.5 < x ~ 1.

The map g is called the Baker's map.


(a) Let Fs-l(I) n 1= huh For a string Sj E {1,2} for 0 ~ j ~ n, let

For n = 1,2,3, indicate on a sketch all the possible intervals, 1'0 ...•"' for different
choices of So •.. Sn. When is the interval 1. 0 ...."0 to the left of 1.0 ...."1 and when
is it to the right? What effect does having a symbol 2 (where the slope is negative
on h) have on the order of the intervals?
60 II. ONE D1MRNSJONAL DYNAMICS BY ITERATION

(b) Let g-I(I) n 1 = J I U J2; for a string Sj E {1,2} for O:s j:S n, let

For n = 1,2, indicate on a sketch all these possible intervals, J. o ...•• , for different
choices of so . .. SrI' What effect does having a symbol 2 (where the slope is negative
on 12 ) have on the order of the intervals? What is the difference between the order
of the interval in part (b) from those in part (a)?
2.17. Let
l'(x) = { 2x for x :S 1/2
2 - 2x for x ?: 1/2
he the tent map. Set II = [O,I/2J and 12 = [I/2,IJ. Define the map H : E2 ..... [0, IJ by

H(s) = n
00

k=O
T-k(l •• ).

(a) Prove that H is well defined, i.e., that the intersection is a single point.
(b) Prove that H is a semi-conjugacy from (1 on E2 to Ton [O,IJ.
(e) What sequences correspond to x = 0,1,1/2, i.e., what are H-I(O), H-I(I), and
H-I(I/2)?
(d) Prove that H is one to one on most points, and at most two to one, i.e., prove that
H-I(X) contains at most two sequences. What sequences go to the same point by
H?
(e) Prove that T is topologically transitive.
2.18. Consider F;. for It > 2 + 5 1/ 2 with invariant Cantor set AI" Prove that if t > 0 is
small enough then there is a fJ > 0 such that for any fJ-chain {x j } ~o with d( x j , A,,) < fJ
for all j there is a unique x E A" such that IXj - P(x)1 < t for all j. (A point x
satisfying the conclusion of this exercise is said to t-shadow the fJ-chain {x) }.)
Conjugacy and Structural Stability
2.19. Which of the following are topologically conjugate on all of 1R? Which are
differentiably conjugate? Prove your answer.
(a) J(x) = x/2
(b) J(x) = 2x
(e) J(x) = -2x
(d) J(x) = 5x
(e) J(x) = x 3
2.20. Let ga(Y) = ay2 - 1 and Je(x) = x 2 - e. For each paramet.er value a find a
parameter value e and a linear conjugacy from ga to Je.
2.21. Prove that the periodic points of F4 are dense in [O,IJ. Hint: Use the conjugacy
from the tent map T to F4 given in Example 6.2, and the fact from Exercise 2.4 that
the periodic points for T are dense in [O,IJ,
2.22. Assume I : It ..... R has a fixed point at Xo. Find an affine conjugacy h from I to
a new function 9 where 9 has a fixed point at O. Also give the definition of 9 in terms of
the function I. Hint: Think of I as written in terms of coordinates x, XI = I(x), 9 as
written in terms of coordinates y, YI = g(y), and the affine conjugacy h as transforming
the y coordinates into the x coordinates by means of a translation, x = hey) = x + b,
2.23. Prove that I(x) = x 3 + x/2 is CI-structurally stable.
2.9 EXERCISES 61

2.24. Assume that I : X --+ X and 9 : Y --+ Yare semi-conjugate by means of the
map h : X --+ Y. Let x be a point of least period n. Prove that h(x) is a periodic point
for 9 whose period divides n. Also prove that if h is a conjugacy then the period of h(x)
is n.
Homeomorphisms of the Circle
2.25. Let I, 9 : Sl --+ Sl be two orientation preserving homeomorphisms that are
topologically conjugate. Prove that they have the same rotation number, p(f) = p(g).
2.26. Let I : Sl --+ Sl be an orientation preserving homeomorphism with irrational
rotation number.
(a) Let 1= 1f"(x),Jm(x)] for n < m and some x E Sl. Prove that for any y E SI there
is a positive k such that IIc(y) E I. Hint: Consider j-i(m-n)(l) for consecutive
values of i.
(b) Prove that for any two x,y E SI it must be the case that w(x) = w(y). Conclude
that w(x) is minimal. Hint: Use part (a) to show that if Z E w(x) then Z E w(y).
(c) Take x E SI and let E = w(x). Prove that E is perfect. Hint: for Z E E show
that Z E w(z) = w(x}. What does this imply about how E must accumulate on Z?
(d) For E = w(x), prove that E is nowhere dense or E = SI. Hint: Show the boundary
of E is invariant. Since E is minimal, what are the possibilities for the boundary
of E?
2.27. For the following two functions on the circle, (i) find all the periodic points, (ii)
determine the stability of each periodic point, and (iii) describe the phase portrait.
(a) I(O) = 0 + t sin (nO) mod 211' for n ~ I, t < lIn.
(b) g(O) = 0 + (211'ln) + ESin(nO) mod 211' for n ~ 2, E < lIn.
2.28. This exercise applies the argument on the existence of a rotation number to show
that a subadditive sequence has a limit. Suppose that an is a sequence of real numbers
and C is a fixed real number for which a m+n $ am + an + c for all m, n E N. (In ergodic
theory, if c = 0 this type of sequence is called subadditivf.)
(a) Prove that alcp $ ka p + (k - l}c for all k,p EN.
(b) Consider a fixed p > 0 and write n = kp + i where 0 $ i < n. Prove that

an < ~ + a p +c.
n - n p

(c) By letting n and p go to infinity in the right order, deduce that

lim sup an $ lim inf a p ,


n-oo n p-oo p

and so the limit limn_co anln exists. Note that the limit could be -00.
(d) If c = 0 and the sequence is subadditive, prove that

lim an = inf an.


n-oo n nEN n

2.29. Let I : SI --+ SI be an orientation preserving homeomorphism with lift F : R --+


R. Assume I has a point Xo with least period q.
(a) Prove that F9(tO) = to + P for some integer p, where to E R is a lift of Xo E SI.
(b) Prove that Po(F) = plq. Thus if j has a periodic point, its rotation number is
rational.
CHAPTER III
Chaos and Its Measurement

The theme of the chapter is complicated dynamics or chaos of maps on the line.
The first section presents a theorem of Sharkovskii; it proves for certain n and k, that
the existence of a periodic orbit of period n forces other orbits of period k. This
theorem is not. exactly about complicated dynamics, but it does show that if a map /
Oil the lille IIIL" 11 period which is not C<lual to a power of 2, then / has infinitely many
different periods. The existellce of infinitely many different periods is an indication of the
complexity of such a map. This theorem is not used later in the book but is of interest
in itself, especially since it is proved using mainly the Intermediate Value Theorem
and combinatorial bookkeeping. Sharkovskii's Theorem motivates the treatment of
subshifts of finite type which is given in the next section. A subshift of finite type is
determined by specifying which transitions are allowed between a finite set of states.
Given such a system, it is easy to determine the periods which occur and other aspects
of the dynamics. These systems are generalizations of the symbolic dynamics which we
introduced for the quadratic map in Chapter II. In Chapter VII and IX, we give further
examples of nonlinear dynamical systems which are conjugate to subshifts of finite type.
Because we can analyze the subshift of finite type, we determine the complexity of the
dynamics of the nonlinear map.
The last few sections of this chapter deal with topics related more directly to chaos.
We give a couple of alternative definitions of chaos and introduce various properties
which chaotic systems tend to possess. We use this context to introduce the concept of
a Liapunov exponent for an orbit. This concept generalizes that of the eigenvalue for a
periodic orbit, and associates to an orbit a growth rate of the infinitesimal separation
of nearby points. This quantity can be defined even when the map does not have a
"hyperbolic structure" like the Cantor set for the quadratic map. This quantity is often
used to measure chaos in systems which can be simulated on a computer. In Chapter
VIII, we return to define Liapunov exponents in higher dimensions.

3.1 Sharkovskii's Theorem


In Chapter II we showed that the quadratic map FI'(x) = I-'x(l - x) has an invariant
Cantor set AI' with points of all periods. In this section we study the following question:
if a continuous function / : R --+ R has a point of period n, does it follow that / must
have a point of period k? Stated differently, which periods k are forced by which other
periods n? The following theorem is very simple to state and has a relatively simple
proof.
Theorem 1.1, Li and Yorke (1975). Assume / : R --+ R is continuous, and there is a
point a such that either (i) p(a) :5 a < /(a) < /2(a) or (ii) /3(a) 2: a > /(a) > /2(a).
Then / bas points of all periods.
REMARK 1.1. Note that it follows if / has a point of period three then it has points
of all other periods, hence the title of the Li and Yorke paper, "Period three implies
chaos".

63
64 III. CHAOS AND ITS MEASUREMENT

REMARK 1.2. There is a more general result of Sharkovskii (1964), which also was
proved earlier than the result of Li and Yorke. We prove the simpler result first because
the proof is simpler and the lemmas used in the proof of the result of Li and Yorke
are needed for the more general result. The treatment of this whole section follows
the paper Block, Guckenheimer, Misiurewicz, and Young (1980). Two good general
references for these results and other related one dimensional results are A1sedil., Llibre,
and Misiurewicz (1993) and Block and Coppel (1992).
We assume the first ordering in the theorem with f3(a) :S a < !(a) < j2(a). (The
proof for the other ordering is merely obtained by a reflection in the line.) Let I, =
[a,f(a)] and 12 = [J(a), !2(a)]. Then !(ld :::) 12 and f(h) :::) I, u 12 as can be seen
from the image of the endpoints of the intervals.
Lemma 1.2. If I and J Me closed intervals and !(I) :::) J then there exists a subinterval
K c I such that !(K) = J, !(int(K)) = int(J), and !(8K) = 8J.
PROOF. Let J = Ib l ,b2 ]. There exist al,a2 E I such that f(a]) = b]. Assume a, < a2'
(The other rase is similar with suprema and infima interchanged.) Let

XI = sup{x : al :S X :S a2 such that f(x) = b.}.

By continuity f(xd = bl · N~te that XI < a2. Next let

X2 = inf{x : XI :S X :S a2 such that !(x) = ~}.

Then !(X2) =~. Thus f({XI,X2}) = {b l ,b2 }. By the definitions of Xl and X2,
!«XI,X2)) n8J = 0. Thus !(int([xl,X2]) = int(J) = (b l ,b2). This proves the lemma.
o
Definition. An interval I f -covers an interval J provided f(l) :::> J. We write I -+ J.

Lemma 1.3. (a) Assume that there Me two points a i b with !(a) > a and f(b) < b
and [a. bJ is contained in the domain of!. Then there is a lixed point between a and b.
(b) If a closed interval I f-covers itself then! has a lixed point in I.
PROOF. (a) Let g(x) = !(x) - x. Then g(a) > 0 and g(b) < O. By the Intermediate
Value Theorem there is a point c between a and b where g(c) = 0 so f(c) = c. This
result can also be seen graphically by considering the two cases where (i) a < band (ii)
a > b. See Figure 1.1.

f(a)
a ....------,"---.,.

f(a)

FIGURE 1.1. Fixed Point for Lemma 1.3(a)


3.1 SHARKOVSKII'S THEOREM 65

(b) By Lemma 1.2, there is an interval K = [XI,Xl] c I with /(K) = I = [a,b].


Then either (i) /(xd = a ::; XI and /(Xl) = b ~ Xl, or (ii) /(xa) = b > XI and
/(Xl) = a < X2' If the equality holds we are done. Otherwise pan (a) applies to prove
there is a fixed point. 0
Lemma 1.4. Assume Jo -+ J I -+ .•• -+ I n = Jo is a loop with /(J,,) :::> J"+l for
k = 0, ... ,n-1.
(a.) Then there exists a. fixed point Xo of r
with /"(xo) E J" for k = 0, ... , n.
(b) fUrther assume tha.t (i) this loop is not a product loop formed by going p times
around a shorter loop of length m where mp = n, and (ii) int(J;) n int(J/c) = 0 unless
J" = Ji . If the periodic point Xo of part (a) is in the interior of Jo then it has least
period n.
REMARK 1.3. Note that the loops that we allow can repeat some intervals as for exam-
ple J o ..... ./1 -+ ........ J.. - 2 ..... Jo -+ Jo, or Jo -+ J I -+ J 2 -+ J I -+ J2 -+ J o. However
we do not allow a loop such as Jo -+ J I -+ Jo -+ J I .
PROOF. (8) We give a proof by induction on j. The induction statement is as follows.
(S) There exists a subinterval K) C Jo such that for i = 1, ... ,j, f'(K j ) C J i ,
f'(int(K j » C int(J.), and Ji(K j ) = J j •
By Lemma 1.2 the induction hypothesis is true for j = 1.
Assume (S,,-a) is true. Thus there exists a Kk-I' Then

/k(K,,_d = IU"-I(K,,_a) = I(J,,-a) :::> Jk

By Lemma 1.2, there exists a subinterval K" C K"_I such that I"(K,,) = J" with
Ik(int(K,,» = int(J/c). By the induction assumption (S,,-a) the other statements of
(S,,) are true.
Now using the statement (Sn), we have r(Kn ) = Jo. By Lemma 1.3 r
has a fixed
point Xo in Kn C Jo. Because Xo E K n , li(XO) E J i for i = 0, ... , n. This proves part
(a).
For part (b), since r(int(Kn» = int(Jo), if Xo E int(Jo) then Xo E int(Kn) and
P(xo) E int(Ji ) for i = 1, ... , n. Because the loop is not a product, Xo must have
~od~ 0
PROOF OF THEOREM 1.1.
We assume the first case where I(a) = b > a, j2(a) = I(b) = c > I(a) = b, and
13(a) = I(c) s a. Let II = [a, bl and 12 = [b, c]. Then II I-covers 12 and 12 I-covers
both II and h
First F(h) :::> 12 so there is a fixed point by Lemma 1.3.
Next we show that I has a point of period n for any n ~ 2. Take the loop of length
n with one interval being h and n - 1 intervals being repeated copies of 12 : II -+ 12 -+
12 -+ •.. -+ 12 -+ h. By Lemma 1.4, there exists an Xo E Ia such that In(xo) = Xo and
J1(xo) E 12 for j = 1, ... , n -1. If there were a k with 1 s k < n such that /"(xo) = Xo,
then we would have Xo = I"(xo) E h. Thus we would have Xo E II n 12 = {b}. We now
show that Xo = b is impossible. The argument is slightly different for n = 2 and n ~ 3.
In the case when n = 2, 12(b) = j2(xo) = Xo = b, contradicting j2(b) = 13(a) ::; a. In
the case when n ~ 3, we must have 12(b) = 12(xO) E 12 contradicting j2(b) = /3(a) ::; a.
This contradiction shows that Ji(xo) :F Xo for 1 ::; j < n, and Xo has period n. 0
Definition. In order to state the result of Sharkovskii we need to introduce a new
ordering on the positive integers using the symbol c> called the Sharkovskii ordering.
First the odd integers greater than one are put in the backward order:
66 III. CHAOS AND ITS MEASUREMENT

Next, all the integers which are two times an odd integer are added to the ordering, and
then the odd integers times increasing powers of two:

3 [> 5 [> 7 [> ••• [> 2 . 3 [> 2 . 5 [> ••• [> 22 . 3 [> 22 . 5 [> •••

[> 2n . 3 [> 2n . 5 [> ••• [> 2n +l . 3 [> 2n +l . 5 [> ••••

Finally, all the powers of two are added to the ordering in decreasing powers:

We have now given an ordering between all positive integers. This ordering seems
strange but it turns out to the be ordering which expresses which periods imply which
other periods as given in the Theorem of Sharkovskii (Sharkovskii, 1964).

Theorem 1.5 (Sharkovskii). Let I : I c IR -+ R be a continuous function from an


interval I into the real line. Assume I has a point of period nand n [> k. Then I has a
point of period k. (By period we mean least period.)
Until the proof of the theorem is complete, I is assumed to be a continuous function
from I to IR as given in the statement. The proof of the theorem involves finding intervals
which I-cover each other in certain ways. In order to express these ideas we introduce
the following definition of a type of graph.
Definition. Let A = {I" ... , I.} be a partition of I into closed intervals with disjoint
nonempty interiors. An A-graph 01 I is a directed graph with vertices given by the Ii
and a directed edge from Ii to h if Ii I-covers h. It is also called the graph lor the
partition. See Figures 1.3 and 1.4 for examples.

Example 1.1. Let I have a graph as indicated in Figure 1.2 with three intervals,
1,,12,13, Then hI-covers 12, 12 I-covers 1\ and 12 , and 13 I-covers 1\, 12, and 13.
Thus the graph for the partition is as given in Figure 1.3.

FIGURE 1.2. Graph of the Function in Example 1.1

REMARK 1.4. We first consider the case where n is an odd integer for which (i) n > 1
and (ii) I has a point x of period n and f has no points of odd period k with 1 < k < n
(i.e., k [> n.) To prove Sharkovskii's Theorem in this case, Peter Stefan had the idea to
3.1 SHARKOVSKII'S THEOREM 67

FIGURE 1.3. Graph for the Partition in Example 1.1

prove the existence of an orbit with a special pattern on the line: let XI be an n-periodic
point such that
Xn < Xn-2 < ... < Xa < XI < X2 < X4 < ... < Xn-I

where Xi = li-I(xd. (The reflection of this ordering is just as good.) A periodic point
with such an ordering of its orbit on the line is called a Stelan c!Jdr:.. Lemma 1.6 proves
that indeed such an orbit does exist. Given such an orbit, let II = [Xl, X2), 12 = [X3, Xl)'
13 = [X2,X4), I2j = [X2i+1' X2i-1I, and I2i-l = [x2i-2, X2i) for j = 2, ... ,(n - 1)/2.
Because of the nature of the orbit, (i) II I-covers II and 12, (ii) Ii I-covers 1;+1 for
2 ::s j ::s n - 2, and (iii) In-I I-covers all the Ij for j odd. Thus the existence of such
a special type of orbit proves that the A-graph of I contains a subgraph of the form
given in Figure 1.4. This subgraph is called a Stefan graph. Applying the lemmas above
to this Stefan graph can prove the existence of all the periodic implied by n in the
Sharkovskii ordering.

FIOURE 1.4. Subgraph for the Partition In Lemma 1.6

We now turn to the lemma and its proof.


Lemma 1.6. Assume n is an odd integer with n > 1. Assume that f has a point X
of period n and I has no points of odd period k with 1 < k < n (i.e., k I> n.) Let
J = [min O(x), max O(x)). Let A be the partition of J by the elements of O(x). Then
the A-graph of I contains a subgraph of the following form: The II"'" In-I can be
numbered with all the intervals having disjoint interiors such that (i) II I-covers II and
h (ii) Ij f-covers Ij+1 for 2 ::s j ::s n - 2, and (iii) In-I f-covers all the I j for j odd.
See Figure 1.4.
PROOF. Let O(x) = {ZI' Z2,"" zn} where the Zj are ordered as on the line, ZI < Z2 <
... < zn}. Then, I(zn) < Zn because I(zn) is one of the other Zj. Similarly, I(zd > ZI'
68 111. CHAOS AND ITS MEASUREMENT

Let a = max{y E O(x) : f(y) > y}. Then, a =1= Zn. Let b be the next larger than a
among O(x) in terms of the ordering of the real line. Let I, = [a, bl E A. We show that
this II can be used in the statement of the lemma.
There is a sequence of small steps which we state as claims. First we need to show
that I, covers itself (Claim 1), and eventually covers all of J (Claim 2). Claim 4 shows
that there is a shortest loop with distinct intervals I, -+ 12 -+ .•• -+ In-I -+ h. Claims
5 and 6 complete showing that these intervals are situated on the line and behave as
claimed.

Claim 1. The image of II covers itself, f(Id :::> h.

PROOF. We know that f(a) > a so f(a) ~ b. Also since b > a, f(b) < b so f(b) ~ a.
Therefore f(Id :::> I, as claimed. 0
Claim 2. Tile (n - 2) image of I, covers tile whole interval J, r- 2 (I1) :::> J.

P/lOOF. Since f(ld :::> I,.


fk+l(ld :::> fk(Id, so the iterates arc nested. The number
of points in O(x) \ {a,b} is n - 2, so Zn E fk(Id for some 0 ~ k ~ n - 2. By
the nested property, Zn E r- 2 (Id. Similarly z,
E r-2(Id. Since II is connected,
r- 2 (Ill :::> [z" znl = J. 0
Claim 3. There exists a Ko E A with Ko =1= h such that f(Ko) :::> I,.

PROOF. This proof uses the fact that n is odd, so there are more elements of O(x) on
one side of int(h) than the other. Call P the elements of O(x) on the side of int(I,)
with more elements. There is some y"Y2 E P with f(yd E P and f(Y2) E O(x) \ P.
Take adjacent points YI and Y2 with iterates as above. Let Ko be the interval from YI
to Y2. Then f(Ko) :::> I, and Ko =1= I, as claimed. 0

Claim 4. There is a loop I, -+ 12 --+ ••• -+ Ik -+ II with 12 =1= II. The shortest sucll
loop witll k ~ 2 has k = n - 1.
PROOF. Let Ko be as in Claim 3, so f(Ko) :::> I,. By Claim 2. r-
2 (Ill :::> Ko. There

are only n - 1 distinct intervals in A so there exists such a loop with 2 ~ k ~ n - 1.


Now assume the smallest k that works satisfies 2 ~ k < n - 1 and we get a con-
tradiction. Since this is the shortest loop, none of the intervals can be repeated or it
could be shortened. Either k or k + 1 is odd. Let m = k or k + 1 be this odd integer,
so 1 < m < n. Use the loop with m intervals given by h -+ 12 --+ ••• --+ lie -+ I, or
II -+ 12 -+ ..• -+ Ik -+ h -+ II depending on whether m = k or m = k + 1. By Lemma
1.4(a) there is a point Z with fm(z) = z. The point Z can not be on the boundary of
the interval because these points have period n which is greater than m. Thus Z has
least period m by Lemma 1.4(b). Since m is odd this contradicts the assumption on n
in the Lemma. This contradiction proves that k = n - 1. 0

For the rest of the proof we fix I" h, ... ,In -, as in Claim 4.

Claim 5. (a) If f(Ij) :::> I, tllen j = 1 or n - 1.


(b) For j > i + 1 there is no directed edge from Ii to I j in the graph.
(c) The interval I, I-covers only I, and 12.
PROOF. Part (a) follows from Claim 4. Parts (b) and (c) follow because the loop is the
shortest possible. 0

Claim 6. Either (i) the ordering (in terms of the real line) of the intervals I j in the
loop of Claim 4 is I n -, ~ I n -3 ~ ... ~ 12 ~ I, ~ 13 ~ ... ~ In- 2 and the order of the
3.1 SHARKOVSKII'S THEOREM 69

orbit is /"-1 (a) < /"-3(a) < ... < 12(a) < a < I(a) < 13(a) < ... < /"-2(0) or (ii)
both of these orderings are exactly reversed.
PROOF. Let II =
[a, bJ. The interval II I-covers only II and h 80 they must be next
to each other. Assume that h ~ II. (The other possibility gives the reverse order
mentioned in the claim.) Then it must be that I(a) = b and I(b) is the left endpoint of
h
Next, l(aI2 ) = ah Since one of these endpoints is I(a} = b which is above int(1d
both endpoints of 13 must be above int(1d. Also because of Claim 5a (12 does not
I-cover Id and 5b (12 does not I-cover I j for j > 3) 13 must be adjacent to II'
Continue the argument by induction. For k < n -I, since I" does not I-cover II and
1" does not I-cover Ij for j > k + I, 1,,+1 must be adjacent to 1" _ I, This covers all the
int.ervals in the claim.
Note that we have also shown the ordering on the orbit as stated in the claim. 0
Claim 7. 'I'll(' intcrvlllln_1 I-covers 1111 the I j for j odd.
PROOF. Note that In-I = [/"-I(a),/"-3(a)J. Then /(/"-I(a}) = /"(a) = a. Also
/"-3(a) E I n - 3 80 /(/"-3(a» = /"-2(a) E I n - 2 is the far right endpoint of J (the
largest element in the orbit O(x». Thus I(In-d :::> [a,/"-2(a)] = II U 13 U··· U I n- 2.
We have proved the claim. 0
All the claims together prove Lemma 1.6. o
Proposition 1.7. Theorem 1.5 is true if n is odd and maximal in the ordering for
which the theorem is true.
PROOF. Take k with nt> k. There are two cases: (a) k is even and k < n and (b) k > n
with k either even or odd.
Case a. The integer k is even and k < n.
PROOF. Consider the loop of length k given by In-I - In-" - In-"+1 - ... -In-I.
By Lemma 1.4(a) there is a Xo E In-I with I"(xo) = Xo. The point Xo can not be an
endpoint because the endpoints have period n. Therefore Xo has period k. 0
Case b. The integer k > n with k either even or odd.
PROOF. Consider the loop of length k given by II - 12 - ... - In-I - II - II -
.•• -+ II. Again by Lemma 1.4(a) there is a Xo E II with I"(xo) = Xo. If Xo E all
then Xo has period n. Thus n divides k, 80 k ~ 2n ~ n + 3. Also since /"(xo) E II
the iterate /"+1 (xo) ¢ II which contradicts the conclusion of Lemma 1.4(a}. Therefore
Xo rt all. and by Lemma 1.4(b), Xo has period k. This completes the proof of Case b
and Proposition 1.7. 0

The first step in proving the result for other values of n proves the existence of a
point of period two whenever there is a point of even period.
Lemma 1.8. If 1 has a point of even period then it has a point of period two.
PROOF. Let n be the smallest integer greater than one in the usual ordering of the
integers (not the Sharkovskii ordering) such that I has a point of period n. If n is odd
then we are done by Proposition 1.7. Therefore we can assume that n is even. Let 0,
II = [a, b], and J = [min O(a), max O(o)J = [A, B] be as before. In the proof of Lemma
1.6 we only used the fact that n is odd to show that there exists a Ko E A with Ko i: 11
and I(Ko) :::> II.
70 III. CHAOS AND ITS MEASUREMENT

First assume there is such a Ko. There is a minimal cycle as in Claim 4 with 2 :5
k :5 n - 1. As before, I" covers all the I j on the other side. Thus In-I -+ I n - 2 -+ In-I
is a cycle of length two, and there is a point of period two.
Next assume there is no Ko E A with Ko '" h and I(Ko} :::> h. It follows that (i)
all the points Xj E O(a) with Xj :5 a have I(xj) ~ band (ii) all the points Xj E O(a)
with Xj ~ b have I(xj) :5 b. Since some points in O(a) are mapped to b and B,
both b, B E f(IA, aD and so f(IA, aD :::> Ib, B]. Similarly, 1(lb, BD c lA, a]. Then
lA, a] -+ Ib, B] -+ lA, a] is a cycle of length two. The intervals are disjoint so there must
be a point of period two. 0
The proof of Sharkovskii's Theorem now splits into the following cases.
Case 1: n is odd and maximal in the Sharkovskii ordering and n I> k.
Case 2: n = 2m and nl> k.
Case 3: n = 2mp with p > 1 odd, m ~ 1, n is maximal in the Sharkovskii
ordering, and n I> k.
Case 1 is proved above in Proposition 1.7. We split Case 2 up into subcases and
prove it next.
Case 2: n = 2m and n I> k SO k = 2' with 0 :5 s < m.
Case 2a: s = 0, i.e., 1 has a fixed point.
Case 2b: s = 1.
Case 2c: s > 1.
PROOF OF CASE 2a. We can define a and b as before with f(a} ~ band f(b} :5 a.
Therefore f(la, b]) :::> la, b] and f has a fixed point. 0
Case 2b follows from Lemma 1.8.
PROOF OF CASE 2c. Let 9 = 1"/2 = p.-I. The map 9 has a point of period 2m -,+I
with m - s + 1 ~ 2. Lemma 1.8 proves that 9 has a point Xo of period 2. So Xo =
g2(xO} = I"(xo) and Xo '" g(xo) = 1"/2(xO)' Thus the period of Xo for 1 is 21 for some
t :5 s. If t < s then Xo is fixed by 9 which is impossible. Therefore t = s and Xo is a
point of period 2' = k. 0
We also split Case 3 up into subcases.
Case 3: n = 2mp with p > 1 odd, m ~ 1, n is maximal in the Sharkovskii
ordering for f, and n I> k.
Case 3a: k = 2'q with s ~ m + 1 and 1 :5 q and q odd.
Case 3b: k = 2' with s :5 m.
Case 3c: k = 2mq with q odd and q > p.
We leave the proof of these cases to the exercises. See Exercises 3.2 - 3.4. This
completes the proof of Theorem 1.5 (Sharkovskii's Theorem). 0

3.1.1 Examples for Sharkovskii's Theorem


There are examples of maps with exactly the orbits implied by the Sharkovskii or-
dering. First consider the case where the maximal period in the ordering is odd.
Example 1.2. Let n > 3 be odd. (If n = 3 there are points of all periods and there
is nothing to prove.) Let XI be a point which is a Stefan cycle for n. Make the graph
be piecewise linear connecting the adjacent points (Xj,f(Xj}) on the graph by straight
line segments. Let h, ... , In-I be the intervals as in the proof. See Figure 1.5.
We claim that such a map does not have a point of odd period k with 1 < k < n.
Assume that x is a periodic point with perino different than n. If any iterate of x hits
one of the endpoints of an I j then either :r ita" period n or is not periodic and so this
can not happen. Thus J1(x} E int(/i(j)} for each j. Because the A-graph is exactly the
3.1.1 EXAMPLES FOR SHARKOVSKII'S THEOREM 71

nr---,.--------.---~------_,

1._2

FIGURE 1-5. Example 1.2

subgraph proved to exist by Lemma 1-6, the length of cycles in the graph are exactly
those k which are implied by n in the Sharkovskii ordering. Also the graph over I, has
slope -2. There is a fixed point in h and all other points must leave I, and enter 12'
Thus all orbits passing through h are either fixed points or have periods at least n - 1-
Other orbits have to have the same period as the period of the cycle of intervals (since
the orbit must pass through the interiors)_ Thus the possible periods of periodic points
are exactly those implied by n in the Sharkovskii ordering. In particular there are no
points of odd period with 1 < k < n_

Definition. To get other examples with certain periods we introduce the doubling
operator. Let I = [O,IJ. Assume I : I -+ I is a continuous map. We denote the periods
of the orbits of I by P(f). Now we define the double 01 I, V(f) = g, by

~ + l/(3x) for 0::; x ::; l


g(x) = { [2 + I(l)](~ - x) for l ::; x ::; ~
x- ~ for ~ ::; x ::; 1.

See Figure 1.6. It is easily checked that 9 is continuous.

FIGURE 1-6. The original map I is in the left figure and the double 9 is in
the right figure
72 JII. CHAOS AND ITS MEASUREMENT

The next proposition relates the periods of I with the periods of the double of I;
this result justifies the use of the name "double" for this construction.
Proposition 1.9. The set of all periods of g, P(g), are related to the set of all periods
of I, 1'(1), by p(g) = 21'(1) U {I}. Moreover 9 has exactly one repelling lixed point,
and for each n 9 has the same number of orbits of period 2n as I has of period n and
their stability is the same.
PROOF. Let II = [0,1/3]' 12 = (1/3,2/3), and 13 = [2/3,IJ. Because g(I2) :) h 9 has a
fixed point XI in h. Because the absolute value of the slope of gin h is at least 2, there is
exactly one fixed point in 12 and it is repelling. Also any point in 12 other than the fixed
point has an orbit which leaves h Because [g(h) Ug(I3)J nI2 = 0, none of the points in
12 \ {xd can be periodic. If X E II then g2(X) = g(2/3+ 1(3x)/3) = 1(3x)/3 E II' Thus
for X to be periodic, its period must be even, 2k. But by induction, g2k(x) = I k(3x)/3
for k ~ 1. Thus g2k(x) = x if and only if 1"(3x) = 3x. We have shown that these
periods of 9 are exactly twice the periods of I. Moreover, since g'(t) = 1 on 13, for
a point x of period 2k for g, (g2k)'(x) = (lk)'(3x) so the two orbits have the same
stability type. The periodic points of 9 in 13 are the same because they are on the
orbits described above. This proves the proposition. 0
Example 1.3. Let I(x) == 1/3 for x E [O,IJ. The only periodic point of I is a fixed
point. Let /J = V(I). By the above proposition the periods of /J, P(lIl, are {1,2}.
Also, /J has one repelling fixed point and one attracting orbit of period 2. By induction,
if In = vn(f) then the periods of In, P(ln), are {I, 2, ... , 2n }, and In has one repelling
periodic orbit of pt'fiod 2i for 1 $ j < n and one at.tract.ing periodic orbit of period
2n. Finally let I"",(x) = lim n_ oo In(x). We leave to an exercise to prove that 100 is
continllolls and 1'(/00) = {1,2, ... ,2n , ... }, i.e., 100 has repelling periodic points of
periods 2n for all n and no other periods. See Exercise 3.8.
We also leave to the exercises the fact that if n = 2mp for 1 < p, p odd, and m ~ 1,
then there is a map I for which 1'(1) = {k : nc> k}. See Exercise 3.5.

3.2 Subshifts of Finite Type


In the proof of Sharkovskii's Theorem we considered graphs where intervals I-covered
each other forming a graph as given in Figure 2.1.

FIGURE 2.1. Graph of Partition in Sharkovskii's Theorem

With this graph, a point in II can go to II or 12; a point in 12 can go to 13; a point in
13 can go to I.; a point in I. can go to Ir.; a point in Is can go to 16; a point in 16 can go
to II, 13 , or I r.. Paths in the graph correspond to allowable orbits of points. We can look
at only the labeling of the intervals (as we did for the quadratic map I,,(x) = 1lX(1 - x)
3.2 SUBSHIFTS OF FINITE TYPE 73

for II- > 2 + 5 1/ 2 ) and consider sequences s = 808182 ... where 1 can be followed by 1 or
2; 2 can only be followed by 3; 3 can only be followed by 4; 4 can only be followed by
5; 5 can only be followed by 6; can 6 can be followed by 1, 3, or 5. All other adjoining
combinations are not allowed. Thus a sequence like 634561123456 ... is allowed.
Definition. Instead of looking at the graph, we can define a transition matrix to be a
matrix A = (aij) such that (i) aij = 0,1 for all i and j, (ii) E j aij ~ 1 for all i, and
(iii) E i D.ij ~ 1 for all j. Given a graph of the type in Sharkovskii's Theorem, we can
form a transition matrix A by letting aij = 0 if the transition from i to j is not allowed
(there is no arrow in the graph from Ii to I j ) and aij = 1 if the transition from i to j is
allowed (there is an arrow in the graph from Ii to I j ). The assumption that E j D.i; ~ 1
for every i means that it is possible to go to some interval from Ii; the assumption that
E i aij ~ 1 for every j means that it is possible to get back to I j from some interval. In
the above graph the transition matrix is

o1 01 01 0
0 0
0 0)
0
( 0 0 0 1 0 0
A= 0 0 0 0 1 0 .
o 0 0 001
1 0 1 010

Definition. Let En be the space of all (one-sided) sequences with symbols in the set
{I, 2, ... , n} as defined in Section 2.5, and (1 : En -+ En be the shift map given by
(1(S) = t where tk = 8k+I' This space has a metric as defined before.
Given an n by n transition matrix A let

EA = {s E En : a•••• +! = 1 for k = 0,1,2, ... }.

This space EA is made up of the allowable sequences for A. Let (1A = (1IEA' The
following proposition shows that (1 A acting on a sequence in (1 A gives another sequence
in (1A. The map (1A : EA -+ EA is called the subshift 01 finite type lor the matrix A.
The following proposition also shows that EA is closed.
Proposition 2.1. (a) The subset EA is closed in En.
(b) The map (1A leaves EA invariant, (1A(E A ) = EA.
PROOF. (a) By using cylinder sets it is easily seen that EA is closed.
(b) If S E EA and t = (1 A(S) then it follows directly that all the transitions in t
are allowed so t E EA. On the other hand, if t E EA then there is some 80 such that
a.ol o = 1 by the standing assumptions on A. Let Sic = tk-I for k ~ 1. Then s E EA
and(1A(s)=t. 0
Definition. In general, a subset SeEn is called a SUb8hift provided that it is closed
and invariant by the shift map (1. The following example gives a subshift which is not
of finite type.
Example 2.1. Let S be the subset of E2 consisting of all strings s such that between
any two 2's in the string s there are an even number of 1'5: i.e., if 8; = 2 = 81c with
j < k then there are an even number of indices i with j < i < k for which 8; = 1. This
allows the string s to start with an odd number of 1'5, and s can have an infinite tail
of aliI's or all 2's. A direct check shows that S is closed and invariant under the shift
map. Because the number of 1's between two 2's can be an arbitrary even number, S
is not a subshift of finite type.
74 III. CHAOS AND ITS MEASUREMENT

The next thing we want to do is count the number of periodic orbits in EA. A string
which has period k for q A keeps repeating the first k symbols that appear in its string,
e.g. 124212421242". has period 4. Therefore it is helpful to look at finite strings of
symbols which are called words. Therefore 1242 is a word of length 4. Given a transition
matrix A, a word w = (wo, ... ,Wk- d is called allowable provided the transition from
WJ-I to WJ isallowahle for j = 1, ... , k, i.e., aWJ_I,w, = 1 for j = 1, ... , k. As a first step
(the ind uction step) to determine the number of k-periodic points for q A, we prove the
following lemma about the number of words of length k + 1 which start at any symbol
i and end at the symbol j.
Lemma 2.2. Assume that the ij entry of Ak is p, (Ak)ij = p. Then there are p
allo"wable ""'ords of length k + 1 starting at i and ending at j, i.e., words of the form
iSlS2 ... sk-d·
PROOF. We prove the result by induction on k. Let num(k, i,j) be the number of words
of length k + 1 starting at i and ending at j. This result is certainly true for k = 1 where
num(I,i,j) is either zero or one depending on whether there is an allowable transition
from i to j or not. Now assume the lemma is true for k - 1 for all choices of i and j.
By matrix multiplication

3._1 "1 .• 2 ..... 61.:-2

""-I
= num(k, i,j).

The last equality is true because if a'._d = 0 then these wOl'ds from i to Sk-l do not
contribute to the count of the words from i to j. On the other hand, if a'._d = 1 then
each of these words contributes one word to the words from i to j. 0

Corollary 2.3. The number of fixed points of q~ is equal to the trace of Ak.
PROOF. This follows because # Fix(q~IEA) = Li num(k, i, i) = Li(Ak)ii = tr(Ak). 0

Definition. An n by n matrix of O's and 1's is called reducible provided that there is a
pair i. i with (A k )ij = 0 for all k ~ 1. An n by n matrix of O's and 1'5 is called irreducible
provided that for each 1 ~ i,i ~ n there exists a k = k(i,j) > 0 such that (Ak)ij > 0,
i.e., there is an allowable sequence from i to i for every pair of i and j. The matrix A is
called positive provided Aij > 0 for all i and i and is called eventually positive provided
there there exists a k which is independent of i and j such that (Ak)ij > 0 for all i and
j. Thus, both positive and eventually positive matrices are irreducible.

Example 2.2. (a) The following transition matrix is reducible because it is not possible
to get from 3 to 1:

(i ~ ! ~).
o 0 1 0
3.2 SUBSHIFTS OF FINITE TYPE 75

FIGURE 2.2. Graph for Partition in Example 2.2(a)

Its graph is given in Figure 2.2.


(b) The following transition matrix is irreducible:

(~1 0
~ ~1 ~)
1
o 0 1 0
Its graph is given in Figure 2.3.

FIGURE 2.3. Graph for Partition in Example 2.2(b)

We often want to exclude the case when A corresponds to a permutation of symbols.


A permutation is defined to be a transition matrix where the sum of each row is equal
to one and the sum of each column is also equal to one. An example of a permutation
is given by

Its graph is given in Figure 2.4.

1 ~ 2
\1
3
FIGURE 2.4. Graph for Permutation on Three Symbols
76 III. CHAOS AND ITS MEASUREMENT

The following lemma characterizes permutations in terms of row sums only.


Lemma 2.4. Let A be a trMsition matrix. Then A is a permutation matrix if and
only ifL) a'j = 1 for all i.
PROOF. If every row sum, Lj aij = 1 for all i, then Li,j a'j = n. Since a transition
matrix has Li ail 2: I for every j, it follows that Li ail = I for every j. This shows A
is a permutation matrix. The converse is clear. 0
In the next proposition, we show that a subshift is irreducible if and only if the shift
map has a dense orbit. As a preliminary step, we prove the following lemma about the
dense orbit.
Lemma 2.5. Let A be a transition matrix. Assume that the point s· has a dense orbit
in EA for the shift map U A and s· is not a periodic point. (This last assumption means
that A is not a permutation matrix') Then for My k > 1, u~(s') has a dense orbit.
PROOF. It is clear that u~+)(s') is dense in EA \ {uA(s') : 0 ~ i < k}. Thus we only
need to show that this orbit accumulates on {U~ (5') : 0 ~ i < k}. It is clearly sufficient
to prove that u~+i (5') accumulates on s' by taking a higher iterate to get near the other
uA(s·). Rather than look at s', we show that s' has a preimage t' and that u~+j(s')
accumulates on t·.
First, we show s· has a preimage. By assumption (iii) for a transition matrix, there
is an element tl that can make a transition to So== tl, so there is a t' E LA with
u A (t') = 5'. If t' were on the forward orbit of s· then s· would be periodic which
is not allowed. Therefore t' is not on the forward orbit of 5', and so u~(s') '" t' for
o ~ j ~ k.
Since u~(s') is dense everywhere in E A , and ~(s') '" t' for 0 ~ j ~ k, it follows
that u~+)(s') must come closer to t' than the finite set of points {u{(s') : 0 ~ j ~ k}.
Therefore ~ 0 u~ (5') comes arbitrarily near t·. This completes the proof that the
forward orbit of u~(s') is dense in EA. 0
Proposition 2.6. Let A = (ai,i) be a transition matrix. Then the following are equiv-
alent:
(a) A is irreducible, and
(b) UA has a dense forward orbit in EA.
PROOF. For a finite word w, let b(w) be the first letter of w (beginning of w), and
e(w) be the last letter of w (end of w). Given each pair i and j let tij be a choice of the
words with b(ti]) = i and with e( t ij ) = j. Such a choice exists because A is irreducible.
First we show that (a) implies (b). We describe the point with a dense orbit, s·. List
all the words of length one with the proper choice of the transition word tij between
them to make the sequence allowable. Then list all the allowable words of length two
with the proper choice of the transition word tij between them to make the sequence
allowable. Continue by induction listing all the allowable words of length n with the
proper choice of the transition word tij between them to make the sequence allowable.
In this way we construct an infinite allowable sequence which contains all the allowable
words of finite length. If u is a sequence in EA and V is a neighborhood of u, then there
is some n such that any sequence which agrees with u in the first n places is contained
in V, Now for this n there is a word somewhere in s' which agrees with this word of
length n in u, Next there is a k such that u~(s') has this word in the first n places.
Thus u~(s') E V. Since u and V were arbitrary, this proves that the orbit of s' is
dense, (The reader might consider the case when A is a permutation matrix separately,
but the above proof is also applies to this case.)
3.2 SlJBSHlFTS OF FINITE TYPE 77

Next we show that (b) implies (a). If A is a permutation matrix, then it is clearly
the case that A is irreducible. Thus we can assume that A is not a permutation matrix.
Take s· E EA whose orbit is dense in EA. Because the matrix is not a permutation
matrix, s· can not be a periodic point.
Take an arbitrary pair i and j. By assumption (ii) for a transition matrix, it is
possible to take a E EA such that ao = i. If a point t is close enough to a then to = ao.
There is some kJ such that (7~I(S') is within this distance so (7~'(S')O = ao = i, and
sk, = ao = i. Thus i appears in the sequence for s' .
Similarly there is abE EA such that bo = j. By Lemma 2.5, (7~I(S') has a dense
forward orbit. The same argument as above shows there is a k2 > kJ with (7!'(s')o =
bo = j, and so sk. = bo = j. Thus there is an allowable word in s' from the ki" entry
to the k~" entry which goes from i to j, and we can get from i to j for an arbitrary pair
i and j. 0
By Lemma 2.4, in order to assume that A is not a permutation matrix it is only
necessary to assume that E j aij ~ 2 for some i. We use this assumption to prove that
EA is perfect. First, we prove a preliminary lemma.
Lemma 2.1. Assume that A is an irreducible transition matrix such that E j aioj ~ 2
for some io. Then for each i there exists a k = k(i) for which Ej(Ak)ij ~ 2.

PROOF. Since A is irreducible there is a word wE EA such that b(w) = i and e(w) = i o.
Let the length of w be k, so there are k - 1 transitions. Thus num(k - 1, i, io) ~ 1.
Then there are least two possible choices after io. Thus

Proposition 2.8. Assume that A is an irreducible transition matrix with E j aioj ~ 2


for some i o. Then EA is perfect.
PROOF. For each i there is a k = k(i) such that Ej(Ak)ij ~ 2. Take an s E EA. Take
a cylinder set U as a neighborhood, U = {t E EA : ti = Si for 0 $ i $ n}. Then there
exists a k = k(sn) such that Ej(Ak) •• j ~ 2. Because there is more than one choice for
the transitions from the nl" to the (n + k)'" entry, there is atE U with tn+m '" Sn+m
for some m with 1 $ m $ k. This is true for all s E EA and all cylinder sets, so EA is
perfect. 0
Proposition 2.9. Assume that A is an eventually positive transition matrix. Then (7A
is topologically mixing on EA.
We leave the proof to the exercises. See Exercise 3.14.
Proposition 2.10. Assume A is a transition matrix. (We do not assume A is irre-

..
ducible.) Then the states can be ordered in such a way that A has the following block
form:

A2 •
...

o 0 ...
78 III. CHAOS AND ITS MEASUREMENT

where (i) each Aj is irreducible, (ii) the * terms are arbitrary, and (iii) all the terms
below the blocks Aj are all O. Moreover, the nonwandering set O(UA) = EAt U·· ·UE Am .
We defer the proof to the exercises. See Exercise 3.15.
REMARK 2.1. For an introductory treatment of further topics in symbolic dynamics,
see Boyle (1993).

3.3 Zeta Function


Artin and Mazur had the idea to combine the number of periodic points of all periods
into a single invariant (Artin and Mazur, 1965). If we list all these numbers there
are countably many invariants. For certain classes of maps, when these numbers are
combined together in a certain way they yield a rational function which has only a
finite number of coefficients. Thus the information given by these countable number
of invariants is contained in this finite set of coefficients. We proceed with the formal
definitions.
Definition. Let f : X --+ X be a map, and N,,(J) = #(Per,,(J)) = # Fix(J"). The
zeta function for f is defined to be
00 1
(,(t) = exp(L TcN,,(f)tk).
k=J

The zeta function is clearly invariant under topological conjugacy because the number
of points of each period is preserved. For more discussion of the zeta function see
Chaptt'r 5 of Franks (1982). In this section, we merely calculate the zeta function for
a subshift of finite type. This theorem was originally proved by Bowen and Lanford
(1970). In Chapter VII, we return to prove that the zeta function is a rational function
of t for some further types of maps (toral Anosov diffeomorphisms). Before stating the
throrem, we give a connection between the determinant, exponential, and trace of a
matrix. (The exponential of a matrix is defined by substituting the matrix into the
power series for the exponential. It is discussed further in Section 4.3 in the context of
solutions of linear differential equations.)
Lemma 3.1 (Liouville's Formula). Let B be a matrix. Then
det(e B ) = etr(B).

PROOF. Let eJ ... en be the standard basis. The following calculation uses the facts
that the determinant is alternating in the columns, that e BI = (eBle J, ... ,eBlen ), that
e BO = I. and that fte BI = Be B !. Then

d
dtdet(e BI )1100 = ' "
L.,..det(el, d BI ej II=O,ej+J. ... ,en )
... ,ej-I'dte
J

= Ldet(el, ... ,ej_J,Bej,ej+1, ... ,en )

= LbiJdet(el, ... ,ej-J,ei,ej+I, ... ,en)


i,j

= Lbjjdet(el, ... ,ej-l,ej,ej+I, ... ,en)


j
= tr(B).
3.4 PERIOD DOUBLING CASCADE 79

For t = to,

dtd det(eBt)lt=to = dtd det(eB(t-to)lt=to det(e Bto )


= tr(B) det(e Bto ).

d
Solving the scalar differential equation dt det(e Bt ) = tr(B) det(e B1 ) with initial condi-
tion det(e BO ) = 1 gives that det(e Bt ) = etr(B)t. Evaluating this solution at t = 1 gives
the result. 0

Theorem 3.2. Let (7 A : EA -+ EA be the subsmit of finite type for A = (aij) with
aij E {O, I} for every pair of i and j. Then the zeta function of (7 A is rational. Moreover
(.,. .. (t) = [det(I - tA)J-I.
PROOF. By Corollary 3.3, Nk((7A) = tr(Ak). Therefore, using the linearity of the trace,
the power series expansion of the logarithm, and Lemma 3.10 we can make the following
calculation.
1
(L i/ tr(A
00
(.,. .. (t) = exp k ))

k=1
00 1
= exp (tr(L ktk Ak»)
k=1
= exp ( tr( -log(I - tA»)
= det ( exp(log(I - tA) -I»)
= det ((I - tA)-I)
= (det(I - tA)) -I.

This proves the theorem. o


3.4 Period Doubling Cascade
The Sharkovskii Theorem tells us which periods imply which other periods. In par-
ticular, if a map / : R -+ R has finitely many periodic orbits then all the periods
must be powers of 2. For the quadratic family, F,.(x) = /Lx(1 - x), we saw that it
had only fixed points for 0 < /L ~ 3, and only fixed points and a point of period 2 for
3 < /L ~ 1 + 6 1/ 2 • In fact Douady and Hubbard (1985) proved that for the quadratic
family as /L increases new periods are added to the list of periods appearing and never
disappear once they have occurred. See de Melo and Van Strien (1993). Let /Ln be the
infimum of the parameter values /L > 0 for which F,. has a point of period 2n. By the
Sharkovskii Theorem, /Ln ~ /Ln+!' Notice that all the /Ln < 4 because F. has points of
a1l periods. Let /Loo be the limiting value of the /Ln as n goes to infinity. The dynamics
for F,."", is like the map /00 given in Exercise 3.8: there is an invariant set on which F,."",
acts like an adding machine. This sequence of bifurcations is often called the period
doubling route to chaos.
At the bifurcation value /LI = 3 for the family F,., the fixed point PI' changes from
attracting for 1 < /L < /Ll to repelling for /LI < /L. At /L = /Ll. F~. (Pp.) = -1. For p.
slightly larger than /Ll, the 2-periodic orbit O(Pp,l) is attracting with derivative just less
than one, 1 > (F;)'(p,., I ) > O. In Chapter VI, we study the period doubling bifurcation
80 III. CHAOS AND ITS MEASUREMENT

and show at I-' = 1-'2 where the period four orbit is created that (F;.),(p".,d = -1.
Again, this 2-periodic orbit O(p",d changes from attracting to repelling as I-' moves
past 1-'2. The period 4 orbit O(p",2) is initially attracting for I-' just slight larger than
1-'2 and becomes repelling for I-' > 1-'3. This process repeats itself; at I-' = I-'n the period
2n orbit O(P",n) is added. This orbit is attracting for I-'n < I-' < I-'n+1 and becomes
repelling for I-' > I-'n+l.
A natural question to ask is the rate of convergence of the parameter values I-'n to
1-'00' Consider a geometric sequence of numbers, >'n = Co - cl>.n, where 0 < >. < 1. For
this example the limiting value >'00 = Co and >. (or>. -I) gives the rate of convergence
to >'00' In general, we want to define a quantity which measures the geometric rate of
convergence to the limiting value. Feigenbaum (1978) calculated the rate of convergence
by means of the limit
() = lim I-'n - I-'n-I .
n~oo I-'n+l -I-'n

This value b is called the Feigenbaum constant. Notice for the sequence I-'n = Co - cl>.n,
the value () would equal>. -I:

lim I-'n -I-'n-l r Co - cl>.n - Co + cl>.n-1


n~oo I-'n+ I - I-'n n~~ Co - c l >.n+1 - Co + C1>.n
= >. -I.
Feigenbaum (1978) discovered that this constant is the same for several different families
of functions. The value has been calculated to be () = 4.669202. . .. Both Feigenbaum
(1978) and Coullet and Tresser (1978) suggested using the renormalization method
to prove the universality of this constant, i.e., that the constant is the same for any
one parameter family of functions which go through the period doubling sequence of
bifurcations. Much of this program has now been proved by Feigenbaum, Coullet,
Tresser, Collet, Eckmann, Lanford, and others, but there are some mathematical aspects
of this program which are still unproven. See de Melo and Van Strien (1993), Collet
and Eckmann (1980), and Lanford (1984a, 1984b, 1986).
How can these parameter values I-'n be determined for the family F,,? We mentioned
above that the period 2n orbit is attracting for I-'n < I-' < I-'n+1' In fact the critical
point xo = 0.5 must converge to the attracting periodic orbit. See the discussion of
negative Schwarzian derivative in Devaney (1989) or de Melo and Van Strien (1993).
Thus to find the attracting periodic orbit we could iterate the critical point a number
of times, say 1000, without recording the iterates, and then record or plot the next 1000
iterates. See Figure 4.1A. Note that for the family F" the limiting parameter value
I-'oc = 3.5699456. Therefore the whole period doubling bifurcation is shown in Figure
4.1B. Since this part of the orbit is near an attracting periodic orbit, we could inspect
the orbit to determine the period. By varying 1-', we could determine the value of I-'
when the orbit changed from period 2n- 1 to 2n; this value of I-' gives I-'n.
A second method to determine the I-'n is to note that the period 2n - 1 orbit O(p",n-d
becomes unstable at I-'n and

(F;:-')'(P"n,n-d =-1.
Thus we could use a numerical scheme (e.g. Newton's method) to search for a point
and a parameter value with this property. This search would determine the I-'n.
Finally, there is a third method for determining the rate of convergence given by the
Feigenbaum constant by determining slightly different parameter values. We mentioned
above that
and
3.5 CHAOS 81

FIGURE 4.1A. The Bifurcation Diagram for the Family FI': the Horizontal
Direction is the Parameter ~ Between 2.9 and 3.6; the Vertical Direction is
the Space Variable x Between 0 and 1

Between these two parameter values, there is another value ~~ for which

2" '(Pl'~.n )
(FI'J = 0,
I.e., the critical point 0.5 has least period 2". These parameter values satisfy· .. < ~ <
~~ < ~n+l < .... Using these parameter values ~~ instead of the ~ gives the same
universal constant as the rate of convergence.
For larger values of the parameter ~ but with ~ still less than 4, an orbit of a point for
the quadratic family FI' seems to be dense in the whole interval [0,1]. In fact Jakobson
(1971, 1981) proved that there is a set of parameter values Me [4 - £,4] such that (i)
M has positive Lebesgue measure (and 4 is a density point) and (U) for every ~ E M,
FI' has an invariant measure vI' on [0,1] that is absolutely continuous with respect to
Lebesgue measure on [0,1]. This result implies that most points in [0,1] have orbits
which are dense in the interval [0, I] for these parameter values. Many people have
written papers on this and related results. See de Melo and Van Strien (1993) for
further discussion of this result.
Recently, Benedi~ks and Carleson (1991) have used results about the transitivity of
this one dimensional family of maps to prove the transitivity of the two dimensional
Henon family of maps for certain parameter values. We discuss this further in Chapter
VII.

3.5 Chaos
Dynamical systems are often said to exhibit chaos without a precise definition of
what this means. In this section, we discuss concepts related to the chaotic nature of
maps and give some tentative definitions of a chaotic invariant set.
82 III. CHAOS AND ITS MEASUREMENT

FIGURE 4.1B. The Bifurcation Diagram for the Family F,,: the Horizontal
Direction is the Parameter IJ Between 3.54 and 3.5701; the Vertical Direction
is the Space Variable x Between 0.47 and 0.57

In Section 2.5, we prove that the quadratic map F", with IJ > 4, is transitive on
its invariant set AI" The property of being transitive implies that this set can not be
broken up into two closed disjoint invariant sets. For a set to be called "chaotic" it
should be (dynamically) indecomposable in some sense of the word. A weaker notion
than transitive, which still includes the some kind of dynamic indecomposability of
an invariant set, is that it is chain transitive. See Section 2.3. This latter condition
seems more natural than topological transitivity but it is not as strong and allows
certain examples which do not seem chaotic. (Both periodic motion and "quasi-periodic"
motion is chain transitive but not very chaotic.) Therefore in the first definition of a
chaotic invariant set we require the map to be topologically transitive and only give
chain transitivity as an alternative assumption.
To define a chaotic invariant set, we also want to add a second assumption which
indicates that the dynamics of the map on the invariant set are disorderly, or at least
that nearby orbits do not stay near each other under iteration. The following definition
of sensitive dependence on initial conditions is one possible such concept. In the next
section we define the Liapunov exponents of a map. Another way to express that nearby
orbits diverge is that the map has positive Liapunov exponents. See Section 3.6.
Definition. A map I on a metric space X is said to have sensitive dependence on
initial conditions provided there is an T > 0 (independent of the point) such that for
each point x E X and for each f > 0 there is a point y E X with d(x,y) < f and a
k ~ 0 such that dU"(x),f"(y» ~ T.
One of the early situations where sensitive dependence was observed was in a set
of differential equations in three variables. E. Lorenz was studying a the system men-
tioned in Section 1.3 and discussed in Section 7.10 (Lorenz, 1963). While numerically
3.5 CHAOS 83

integrating the equations, he recorded the coordinates of the trajectory to only a three-
decimal-place accuracy. After calculating an orbit, he tried to duplicate the latter part
of the trajectory by entering as a new initial point q the coordinates of some point
part way through the initial calculation, Pt.. Because the original trajectory had more
decimal places stored in memory than he entered the second time by hand, the points
Pt. and q were not the same but merely nearby points. He observed that the two tra.-
jectories, the original trajectory and the one started with the slightly different initial
condition, followed each other for a period of time and then diverged from each other
rapidly. This divergence is an indication of sensitive dependence on initial conditions of
the particular system which he was studying.
Another way that sensitive dependence is manifest is through the round off errors
of the computer. Curry (1979) reports on numerical studies of the Henon map (for
A = 1.4 and B = -0.3) using two different computers. After 60 iterates, the iterates
have nothing to do with each other. Thinking of the numerical orbit 011 a computer as
an (-chain of the function, two different (-chains diverge, giving an indication of sensitive
dependence on initial conditions. On the other hand, the plot of the orbits on the two
machines seem to fill up the same subset of the plane, giving an indication that the
function is topologically transitive (or at least chain transitive) on this invariant set.
For further discussion of an attractor for the Henon map, see Sections 1.3 and 7.9.
The concept of sensitive dependence on initial conditions is closely related to another
concept called expansive: a map is expansive provided any two orbits become at least
a fixed distance apart.
Definition. A map I on a metric space X is said to be expansive provided there is an
r > 0 (independent of the points) such that for each pair of points x, y E X there is a
k ~ 0 such that d(fk(x),lk(y» ~ r. If I is a homeomorphism, then in the definition
of expansive we allow k E Z and do not require that k is positive, i.e., there is an r > 0
such that for each pair of points x, y E X there is a k E Z such that d(fk (x), IA: (y)) ~ r.

If I is expansive and X is a perfect metric space, then it has sensitive dependence on


initial conditions. In determining the proper characterization of chaos, the assumption
that the map is expansive seems too strong. Therefore we make the following definitions.
Definition. A map I on a metric space X is said to be chaotic on an invariant set Y
or exhibits chaos provided (i) I is transitive on Y and (ii) I has sensitive dependence
on initial conditions on Y.

REMARK 5.1. The use ofthe term chaos was introduced into Dynamical Systems by Li
and Yorke (1975). They proved that if a map on the line had a point of period three,
then it had points of all periods. They also proved that if a map I on the line has a
point of period three, then it has an invariant set S such that

limsuplfn(p) - r(q)1 > 0 and


lim inf Ir(p) - r(q)1 = 0
fI.-OO

for every p, q E S with p '" q. They considered a map with this latter property as
chaotic. This property is certainly related to sensitive dependence on initial conditions.
REMARK 5.2. Devaney (1989) gave an explicit definition of a chaotic invariant set in an
attempt to clarify the notion of chaos. To our two assumptions, he adds the assumption
that the periodic points are dense in Y. Although this property is satisfied by "uniformly
hyperbolic" maps like the quadratic map, it does not seem that this condition is at the
84 III. CHAOS AND ITS MEASUREMENT

heart of the idea that the system is chaotic. (This last comment is made even though
in the original paper Li and Yorke (1975) proved the existence of periodic points.)
Therefore we leave out conditions about periodic points in our definition of chaos.
The paper of Banks, Brooks, Cairns, Davis, and Stacey (1992) proves that any map
which (i) is transitive and (ii) has dense periodic points also must have sensitive depen-
dence on initial conditions. However as stated above, we consider the conditions that
the map (i) is transitive and (ii) has sensitive dependence on initial conditions a more
dynamically reasonable choice of conditions in the definition.
REMARK 5.3. As stated above, an alternative definition of a chaotic invariant set Y
is that (i) , is chain transitive on Y and (ii) , has sensitive dependence on initial
conditions on Y.
This definition allows the following example which does not seem chaotic. Let x and
y both be mod 1 variables, so ({x,y) : x,y mod I} is the two torus, 'f2. Let

'(x, y) = (x + y, y)
he a shear map. Then, preserves the y variable. The rotation in the x direction
depends on the y variable. This map is chain transitive on 'f2 but not topologically
transitive. The controlled nature of the trajectories make it seem non-chaotic.
One way to avoid the above example and still use chain transitivity as the notion of
indecomposability is to require that solutions diverge at an exponential rate. This is
defined in the next section in terms of the Liapunov exponents. (Also see Section 8.2 for
Liapunov exponents in higher dimensions.) Using this concept, we give an an alternative
definition of a chaotic invariant set: an invariant set Y could be called chaotic provided
that (i) , is chain transitive on Y and (ii) , has a positive Liapunov exponents on
Y. Ruelle (1989a) has a long discussion of a chaotic attractor in which he includes the
requirements that it be irreducible and has a positive Liapunov exponent.
There is another measurement related to chaos which relates to the invariant set
for the system. There are various concepts of (fractal) dimensions including the box
dimension which allow the dimension to be an non integer value. We give some of these
dimensions in Section 8.4 (in higher dimensions where the concepts seem to have their
natural setting). For experimental data (without specific equations), Liapunov exp().
nents are not very computahle but the box dimension of the invariant set is computable.
Therefore in the setting of experimental measurements, the box dimension seems like a
reasonable measurement of chaos. See the discussion in Chapter 5 of Broer, Dumortier,
van Strien, and Takens (1991).
REMARK 5.4. A more mathematical solution to making the notion of chaos precise is
in terms of a quantitative measurement of chaos called topological entropy which is
defined in Section 8.1. The topological entropy of a map' is denoted by h(f) and is a
number greater than or equal to zero and less than or equal to infinity. This quantity
has a complicated definition, but can be thought of as a quantitative measurement of the
amount of sensitive dependence on initial conditions of the map. If the nonwandering
set of , is a finite number of periodic points then h(f) = O. In this case a transitive
invariant set is just a single periodic orbit, and this does not have sensitive dependence
on initial conditions. If the dynamics of , are complicated as for the quadratic map
F,. on A,. for p. > 4, then h(F,.) > O. Therefore, another characterization of a chaotic
invariant set A for' might be that h(fIA) > O. Using this definition, we do not need to
add the condition that, is transitive: if h(fIA) > 0, then with mild assumptions there
is an invariant subset A' c A on which' is transitive and for which h(fIA') > O. (This
last statement is not obvious but is true based on results of Chapter IX as well as the
more precise definition of topological entropy in Section 8.1.)
3.5 CHAOS 85

REMARK 5.5. Thus we have given four alternative definitions of a chaotic invariant set.
The characterization of chaos in terms of topological entropy is the most satisfactory
one from a mathematical perspective but is not very computable in applications (with
a computer). The definition in terms of Liapunov exponents is the most computable
(possible to estimate) on a computer. The box dimension is most computable for data
from experimental work. Thus there are a number of related important concepts, each
of which is important in the appropriate setting. We use the definition of chaos which
requires sensitive dependence on initial conditions and topological transitivity for the
definition of chaos. The other concepts we refer to by stating the system (i) has positive
topological entropy, (ll) has a positive Liapunov exponent, or (iii) has fractional box
dimension.
With the above definitions, we can state a result about the quadratic map.
Theorem 5.1. (a) The shift map u is chaotic on the full p-shift space, E,.. In fact, a
is expansive on E".
(b) For II- > 4, the quadratic map F,. is chaotic on its invariant Cantor set AI" i.e.,
F,.IA,. has sensitive dependence on initial conditions and is topological transitivity. In
fact, F,.IA,. is expansive.
PROOF. We proved in an earlier section that both u and F,. are transitive on their
respective spaces. (The fact that F,. is transitive follows from the conjugacy to a.)
To show that u is expansive and so has sensitive dependence, let r = 1. If s 1= t for
two points in E", then there is a k such that Sk 1= tk. Then uk(s) and ak(t) differ in
the O-th place and d(uk(s), uk(t» ~ 1. This proves that a is expansive.
For F,., since the itinerary map h is a homeomorphism, if x,y E A,. are distinct
points with s = hex) and t = hey), then there is a k with Sk 1= tic. Therefore F!(x) and
F!(y) are in different intervals II and h Since there is a minimum distance r between
these two intervals, 1F;(x) - F;(y)1 ~ r. This proves that F,. is expansive. 0
REMARK 5.6. The fact that F,. is expansive on A,. also follows from the follOWing
general result that a conjugacy between maps on compact sets preserves expansiveness.
Theorem 5.2. Let f : X -+ X be conjugate to 9 : Y -+ Y where both X and Y are
compact. Assume 9 has sensitive dependence (resp. is expansive) on Y. Then f has
sensitive dependence (resp. is expansive) on X.
PROOF. Let r > 0 be the constant for 9 for either sensitive dependence or expansiveness.
Let h : X -+ Y be the conjugacy. By compactness, h is uniformly continuous. Therefore
given the value r > 0 as above, there is a 6 > 0 such that if d(p, q) < 6 in X then
d(h(p), h(q» < r in Y. Thus if d(h(p), h(q» ~ r in Y then d(p,q) ~ 6 in X, or
denoting the points differently, if d(p,q) ~ r in Y then d(h-I(p), (h-I(q» ~ 6 in X.
Now we check the sensitive dependence case. Let x E X and £ > o. Then there
is an £' > 0 such that if q E Y is within £' of y = hex) then p = h-I(q) is within
£ of x. Take such a q E Y that is within e' of y and k ~ 0 as given by the con-
dition of sensitive dependence of gat y. Let p = h-I(q). Then d(gk(y),gk(q)) ~
r, so d(h-I(gk(y)),h-I(gk(q))) ~ 6. But h-I(gk(y» = fk(h-1(y» = fk(x) and
h-I(gk(q» = fk(h-I(q» = fk(p). Therefore, p is within £ of x and d(fk(x),fk(q» ~
6. Thus the 6 from the uniform continuity works as the distance by which nearby points
of f move apart in the condition of sensitive dependence. The proof for expansiveness
is similar. 0
86 III. CHAOS AND ITS MEASUREMENT

3.6 Liapunov Exponents


In discussing chaos, we referred to Liapunov exponents which measure the (infinites-
imal) exponential rate at which nearby orbits are moving apart. In this section we give
a precise definition and calculate the exponents in a few examples. In Section 8.2 we
return to discuss Liapunov exponents in higher dimensions.
Deftnltlon. Let J : R ...... R be a C t function. For each point Xo define the Liapunov
exponent of Xo, ..\(xo}, as follows:

..\(Xo} = lim sup.!. 10g(l(r),(xo)l)


n-oo n
n-1
= lim sup .!.
n-oo n
L 10g(lJ'(xj)l)
j=O

where x 1 = f1(xo}. (The first and second limits are equal by the chain rule.) Note that
the right hand side is an average along an orbit (a time average) of the logarithm of the
derivative.
The definition of these exponents goes back to the dissertation of Liapunov in 1892,
see Liapunov (1907). For a treatment from the point of view of time dependent linear
differential equations see Cesari (1959) or Hartman (1964). In higher dimensions, the
definition is more complicated than the one given above in one dimension. We discuss
this situation in Section 8.2.
Next we give three examples where we can calculate or estimate the Liapunov exp<>-
nents.
Example 6.1. Let
2x for 0 < x < 0.5
T(x} = { - -
2(1 - x) for 0.5 ~ x ~ l.
be the tent map. If Xo is such that x} = Tj(xo) = 0.5 for some j then ..\(xo) is not
defined because the derivative is not defined. Such points make up a countable set. For
other points Xo E [0, 11.1f'(x1 }1 = 2 for allj, so the Liapunov exponent, ..\(xo), is log(2).
Example 6.2. Let FI'(x) = J.lx(l-x) for J.l2:: 2+5 1/ 2 . Let AI' be the invariant Cantor
set. Then for Xo E AI" 10g(IF~(xJ)I) 2:: ..\0 > 0 for some AO. Thus the average is larger
than AO, A(XO} 2:: ..\0. Thus we may not know an exact value, but it is easy to derive an
inequality and know that the exponent is positive.
Before giving the last example, we make some connection between the Liapunov
exponent and the space average with respect to an invariant measure. If f has an
invariant Borel measure J.I with finite total measure and support on a bounded interval,
then the Birkhoff Ergodic Theorem (Theorem VII.2.2) says that the limit of the quantity
defining A(XO} actually exists, and is not just a lim sup, for J.I-almost all points Xo. In
fact, since the measure is a Borel measure and log(If'(x)l) is continuous and bounded
above, A(X} is a measurable function and

J A(X} d/l(X} = J log(If'(x}1) dJ.l(x).

If f is "ergodic" with respect to J.I, then ..\(x) is constant J.I-almost everywhere and

..\(X) = I~I J log(I!,(x)I) d/I(X) J.I-almost everywhere,


3.6 L1APUNOV EXPONENTS 87

where 11'1 is the total measure of 1'. (See Section 7.2 for the definition of ergodic.)
This says that the time average of the logarithm of the derivative is equal to the space
average (the integral) of the logarithm of the derivative for I'-almost point. The point
to understand from this discussion is that if the map preserves a reasonable measure
then the Liapunov exponent is constant almost everywhere.
In higher dimensions, the proof that the appropriate limit exists for almost every x
requires a much more complicated ergodic theorem due to Oseledec (1968). See Section
8.2.
Example 6.3. Let F4(X) = 4x(l - x) be the quadratic map for I' = 4.
If Xo is such that Xj = F~ (xo) = 0.5 for some j, then 10g(IF';(xj)l) = log(IF4(0.5)1) =
10g(0) = -00. Therefore A(XO) = -00 for these Xo.
If Xo = 0 or I, then A(Xo) = 10g(lF';(O)l) = log(4) > O.
For points Xo E (0,1) for which Xj is never equal to 0 or 1 (and so never equals 0.5),
we use the conjugacy of F4 with the tent map T, h(y) = sin 2(1I"yj2). (This conjugacy
is verified in Example 11.6.2.) Note that h is differentiable on (O,lJ so there is a K > 0
such that Ih'(y)1 < K for y E [O,lJ. Also h'(y) > 0 in the open interval (0,1), so for any
(small) 6 > 0 there is a bound K6 > 0 such that K6 < Ih'(y)1 for h(y) E (6,1 - 6J. For
Xo as above,

A(Xo) = lim sup .! 10g(I(F4)'(xo)!)


R-CX) n
= lim sup .! log(l(h 0 1'" 0 h- I )'(xo)l)
R-OO n
= lim sup.! (Iog(!h'(Yn)l) + 10g(I(1'")' (yo)!) + log(l(h -I )'(xo)l)
n-oo n
$ lim sup .! (log(K) + nlog(2) + 10g(l(h-I),(xo)!)
n-oo n
= log(2).

On the other hand for these Xo, we can pick a sequence of integers nj going to infinity
such that xn; E [6,1 - 6J. Then letting Yo = h-I(xo) and Yn = 1'"(yo),

A(XO) ~ lim sup 2. log(I(F:T(xo)l)


j-oo nj

= lim sup 2. (Iog(!h'(Yn;)I) + log(I(1'"i)'(yo)l) + log(l(h-1),(xo)l)


j-oo nj
~ lim sup 2. (log(K6) + nj log(2) + 10g(l(h-l),(xo)l)
j-oo nj
= log(2).

Therefore A(XO) = log(2) for all these points. (Note, there are points which repeatedly
come near 0.5 but never hit 0.5 for which the limit of the quantity defining the exponent
does not exist but only the lim sup.) In particular, the Liapunov exponent is positive
for all points whose orbit never hits 0 or 1 (and so never hits 0.5).
Since T preserves Lebesgue measure, the conjugacy also induces an invariant measure
I' for F4i this measure has density function ",-I(x(1-x)t l / 2 • Notice the similarity with
the density functions we used to prove that F,. is transitive for 4 < I' < 2 + 51/ 2 • By
the above argument, A(X) = log(2) for I'-aimost all points. Integrating with respect to
88 III. CHAOS AND ITS MEASUREMENT

this density function gives that

t, t A(X)
10 log(1F4(x)I) dll(X) = 10 71'[x(l _ x)]1/2 dx

_t log(2) dx
- 10 71'[x(1 - x)j1/2
= log(2).

On the other hand

1 log(iF~(x)I)
o
1 dJ.l(x) = 11
0
log(IFHx)I) dx
71'[x(1 - x)j1/2

= l' log(IT'(y)1) dy
= log(2).

These are equal as the Birkhoff Ergodic Theorem says they must be.
REMARK 6.1. In the last section we mentioned that topological entropy is a measure
of complexity of the dynamics of a map. (The formal definition of entropy is given
in the Section 8.1.) Katok (1980) has proved that if a map preserves a non-atomic
(continuous) Borel probability measure J.I for which Il-almost all initial conditions have
non-zero Liapunov exponents, then the topological entropy is positive, so the map is
chaotic. Thus a good computational criterion for chaos is whether a function has a
positive Liapunov exponent for points in a set of positive measure.

3.7 Exercises
Sharkovskii's Theorem
3.1. Let x be a point of period 71 for I, I: R --t R continuous, O(x) = {XI,.'" xn}
be the orbit of I with Xl < I2 < ... < In. Let

be the set of intervals induced by the orbit. Assume there exists II E A such that
l(Id :) I •. Define J I
= II, and define inductively

for j = 2, ... , n - 1. Further assume that I n - I :) K for K E A, where n is the period


of x.
(a) Show that there exists I n- 2 E A such that I n-2 f-covers K and I n-2 C I n- 2.
(b) Show that there exists a sequence I j E A for j = 1, ... , n - 1 with II as above,
In-I = K and such that I) f-covers Ij+! for j = 1, ... , n - 2.
3.2. Assume that f : R --t R is continuous and has a point of period n. Assume
n = 2mp with p > 1 odd, m ~ 1, and n is maximal in the Sharkovskii ordering. Further
assume that k = 2"q with s ~ m + 1 and 1 ::; q and q odd. Prove that f has a point of
period k. Thus prove Case 3a of Sharkovskii's Theorem, page 70.
3.3. Assume that f : R --t R is continuous and has a point of period n. Assume
n = 2mp with p > 1 odd, m ~ 1, and n is maximal in the Sharkovskii ordering. Further
3.7 EXERCISES 89

assume that k = 2" with 8 ::; m. Prove that I has a point of period k. Thus prove Case
3b of Sharkovskii's Theorem, page 70.
3.4. Assume that I : R ..... R is continuous and has a point of period n. Assume
n = 2ffip with p > 1 odd, m ~ 1, and n is maximal in the Sharkovskii ordering. Further
assume that k = 2mq with q odd and q > p. Prove that I has a point of period k. Thus
prove Case 3c of Sharkovskii's Theorem, page 70.
3.5. Let n = 2mp with p > 1 odd, m ~ 1. Prove that there is a continuous function
I: [0, IJ ..... [0, IJ whose periods, P(f), are exactly the set {k : n~ k}.
3.6. Construct a map of R with points of all periods except 3, i.e., construct a map
with all periods implied by period 5 from Sharkovskii's Theorem but no other periods.
Hint: Take an orbit of period 5, {Xl < X2 < X3 < X4 < X5}, with the order given as
in the proof of Sharkovskii's Theorem; let the map be linear on each interval [Xi,XHIJ;
show that this map works.
3.7. Construct a map with a point of period 10 and all periods implied by 10 by the
Sharkovskii ordering but no others. Hint: Take the double of the map in the problem
before the last.
3.8. As defined in Example 1.3, let In : [O,IJ ..... [O,IJ be the function with exactly
one point of period 2i for 0 ::; i ::; n and no other periodic points. Define lco by
lco(x) = limn_co In(x).
(a) Prove that 100 is continuous.
(b) Prove that the periods of '00 are exactly {2i : 0::; i < oc}.
(c) Prove that for each n, 100 has exactly one periodic orbit of each period 2n , that
it is repelling, and that the points of this orbit lie in the gaps Gn,j which define
the middle-(1/3) Cantor set.
(d) Let
Sn = 10, IJ \ U GIe,j
ISi:52"-1
ISleSn
be the union of the 2n intervals used to define the middle-(1/3) Cantor set. Prove
that 100(Sn) = Sn·
(e) Let A = nn>1 Sn. Prove that A is invariant for 100'
(f) Let E2 be the set of all sequences of O's and 1'5. Define A : E2 ..... E2 by
A(8081S2 ... ) = (SOSI82 ... ) + (1000 ... ) mod 2,
i.e., (1000 ... ) is added to (80S1S2 .•• ) mod 2 with carrying (so (110) + (10) =
(0010». The map A on E2 is called the adding machine. Define h : A ..... E2 by
h(p) = s where Sle = 1 if P belongs to the left hand choice of the interval in Sn-l'
Prove that h is a topological conjugacy from 100 on A to A on E2.
(g) Prove that the adding machine A on E2 has no periodic points, and every forward
orbit is dense in E.
Subshifts of Finite Type
3.9. Give the matrix of the subshift of finite type for the map in Exercise 3.6 and the
intervals Ix;, XH 11.
3.10. Let
A = (~ D,
and
An = (~ !:).
90 Ill. CHAOS AND ITS MEASUREMENT

(a) For a vector (xo. Yo). let (xn• Yn) = (xo. Yo)An. With Y_I = Xo. prove that Xn+! =
Yn. and Yn+l = Yn + Yn-I· (This is a Fibonacci sequence.)
(b) Use the fact that (l. O)An = (an, bn ) and (O.I)An = (c,." dn ) to prove that an =
an-I + an -2 and dn = dn- I + dn- 2 .
(c) Prove that tr(An) = tr(An-l) +tr(An-2).
3.11. Consider the matrix A given in the last problem. Find all the fixed points of
U A. U~. u~, and u~. Group the points into orbits and give their least period.
3.12. Let A be an n by n matrix with aij E to, I}. Lj aij ~ 1 for all i. and Li aij ~ 1
for all j. We define i to be equivalent to j. i '" j. if there exist k = k(i.j) ~ 0 and
m = m(i.j) ~ 0 such that (Ak)ij 1: 0 and (Am)ji 1: O. (Because we allow k = 0 = m. i is
equivalent to it.self.) Break {I, ...• n} into equivalence classes. {I, ... , n} = SI U ... U S"
with Si n Sj = 0 for i 1: j. Assume that for each equivalence class, Sq. there exists a
iq E Sq such that LjESq ai.j ~ 2. Prove that EA is perfect.
3.13. Let f : R -. R he a C l function. Assume there are p closed and bounded
intervals 1 1, /2 , •••• I" and A > 1 such that (i) If'(x)1 ~ A for all x E U~=I Ii == I and
(ii) if f(1;) n I j 1: 0 then f(1,) :> IJ' Let A be the matrix of the subshift of finite type
dE'fined byaij = 1 if f(1.) :> I j and aij = 0 if f{I;) n I j = 0. Further assume that (iii)
A is transitive and irreducible. Let A = U~=I f-i(I). Prove that flA is conjugate to
the subshift of finite type U A on EA.
3.14. (This exercise asks you to prove Proposition 2.9, page 77.) Let A be an eventually
positive transition matrix. with (Ak)ij 1: 0 for all i and j.
(a) Prove for n ~ k that (An)ij 1: 0 for all i and j.
(b) Prove that UA is topologically mixing on EA.
3.15. (This exercise asks you to prove Proposition 2.10, page 77.) Assume A is a
transition matrix. (We do not assume A is irreducible.)
(a) Prove that the states can be ordered in such a way that A has the following block
form:
AI * * .. .
A=
( o A2 * .. .
.
o 0 0 ... :]
where (i) each Aj is irreducible, (ii) the * terms are arbitrary, and (iii) all the
terms below the blocks Aj are all O. Hint: Define an ordering on the states as
follows. Call i ~ j provided there is a k = k(i.j) such that A~.j 1: O. Call states
i and j equivalent provided i ~ j and j ~ i. Group together the equivalent state
and order all the states in terms of the above ordering.
(b) Prove that the nonwandering set O(UA) = EA. U··· U EAm' Hint: Show that
O(UA,) = E A, for all j so O(UA):> EA. U··· U EAm' Also show that all points in
EA \ (EA. U ... U EAm) are wandering.
3.16. Let EA be a suhshift of finite type with metric d defined in Chapter II. Let
6 = 0.5. Assume {s(j) E E A } is a O.5-chain for U A on EA' Explicitly indicated the point
t E EA which O.5-shadows this 0.5-chain.
Zeta Functions
3.17. Let A and B each be square matrices and

<(t ) = exp ( ~
L
tr(Ak) - tr(Bk) k)
k t.
k=1
3.7 EXERCISES 91

Prove that
(t) = det{I - tB).
det{I - tA)
Chaos and Liapunov Exponents
3.18. Prove that FI' is expansive on R for p. > 4.
3.19. Consider fl'(x) = IJ.X mod 1, for IJ. > 1.
(a) Calculate the Liapunov exponent.
(b) Prove that fl'(x) has sensitive dependence on initial conditions.
3.20. Let f : R ..... R be CI. Assume p is a periodic point and w(xo) = O(P). Prove
that the Liapunov exponents of Xo and p are equal, >,(xo) = >.(p).
3.21. Let FI'(x) = IJ.X (1 - x) as usual. For 1 < IJ. < 1 + 61 / 2 , find the Liapunov
exponents for the different points x E [O,IJ. Hint: For 3 < IJ. < 1 + 6 1/ 2 , there is an
attracting orbit of period 2. See Exercise 2.6. Also see Exercise 3.20.
3.22. Let 0 < 0 < 0.5, and fa : [0, IJ ..... [O,IJ be defined by

2(0 - x) for 0::;: x::;: 0


fa = { 2(x - 0) for 0 ::;: x ::;: 0.5 + 0
2(0.5 + 0 - x) + 1 for 0.5 + 0 ::;: x ::;: 1.

(a) Draw the graph of fa.


(b) Find the intervals on which fa is transitive and describe its dynamics.
CHAPTER IV
Linear Systems

This chapter begins our study of systems of more than one variable with the consid-
eration of linear systems, both linear ordinary differential equations and linear maps.
In the next chapter, we apply these results to the study of nonlinear systems.
Chapters II and III treated only maps in one dimension and not differential equations.
In this chapter, we start our consideration of differential equations with the study of
linear ordinary differential equations. Most of the results obtained concern linear equa-
tions with constant coefficients, e.g. the form of the solutions, the phase portraits, and
the topological conjugacy class. These results help in the study of a system of nonlinear
differential equations near fixed points. A few results allow the coefficient to depend on
time; these are applicable to "linearized behavior" near a general orbit of a system of
nonlinear differential equations.
After studying linear ordinary differential equations, we indicate the comparable
results for linear maps.
In one dimension the linear theory is trivial so we did not need any special tools.
In several variables, we must use Linear Algebra including eigenvalues, eigenvectors,
and the real Jordan Canonical Form. The first section reviews some of this material,
especially the real Jordan Canonical Form.
The reader may already know some of the material of this chapter and can treat it
quickly. However, we give a few definitions for general flows which we use later, e.g.
topological conjugacy and topological equivalence. These definitions and the material
on phase portraits and topological conjugacy may be new and should be mastered before
these concepts are applied to nonlinear systems in the next chapter.

4.1 Review: Linear Maps and the Real Jordan


Canonical Form
We consider linear maps from Rk to R n , which we denote by L(Rk, Rn). Given bases
{vj}~=l of Rk and {Wi}:'.:l of Rn, a linear map M E L(Rk,Rn) determines an n x k
matrix A = (ai,j) by
k n k
M(LXjV J ) = L(LXjai,j)Wi .
j_l i-I j==l

We often identify such a linear map in M E L(Rk,Rn) with this n x k matrix. With
this identification, the linear map is given by

A(] (:)
where the Xj are the coefficients of the basis {Vj}~_l and the Yi are the coefficients
of the basis {Wi}:'.:l' i.e., Yi = E~=l ai,jXj as indicated by the first displayed formula
above.

93
94 IV. LINEAR SYSTEMS

The space L(Rk, Rn) is given the opemtor norm (also called the sup-norm) defined
by
_ ,Av,
IIAII = su p -,-, '
v~O V

when A E L(Rk, Rn). This norm IIAII measures the maximum stretch of the linear map.
Notice that this norm depends on norms on the domain and range space.
We are also often interested in the minimum stretch of the linear map. (This mea-
surement becomes important in some covering estimates of linear or nonlinear maps.)
For A E L(Rk,Rn), we defined the minimum norm (or conorm) of A by

. IAvl
IDf -,V-, .
m(A) = vpo

The minimum norm is a measure of the minimum expansion of A just as IIAII is a measure
of the maximum expansion. We are often interested in the case of A E L(Rn,Rn). If
such an A has m(A) > 0 then 0 is not an eigenvalue and A is invertible. In turn, if A
is invertible then it is easy to verify that m(A) = IIA-Ill- i .

For the rest of this section we review the Jordan Canonical Form, and in particular,
the real Jordan Canonical Form. Implicitly, we review eigenvalues and eigenvectors.
For the rest of this subsection, A is a n x n real matrix. First we consider the canonical
form over the complex numbers in the case where there is a basis of complex eigenvectors,
vi, ... ,v n . Letting V = (Vi ... v n ), then AV = VA where A = diag(AI, ... ,An). Thus
V-I AV = A is a diagonal matrix. (If A is symmetric, then the eigenvalues are real and
the eigenvectors can always be chosen to be real. Therefore for a symmetric matrix,
there is always a real matrix V which diagonalizes A.)
If an eigenvalue Aj = OIj + i/3j is complex, then its eigenvector vi = u i + iwi must
also be complex. Since A is real, the complex conjugate Xj = 01- i/3 is also an eigenvalue
and has eigenvector vi = u j - iw j . Since

equating the real and imaginary parts yields

Au j = OIjUj - /3jW i and


Awi = /3iuj + OIi wj .

Using the vectors u j and w j as part of a basis yields a subblock of the matrix in terms
of this basis of the form
D· = ( 01')
J -/3i
Thus if A has a basis of complex eigenvectors, then there is a real basis Zl, ... , zn in
terms of which A = diag(B 1 , ••• ,Bq) where each of the blocks B j is either (i) a 1 x 1
block with real entry Ak or (ii) B j is of the form Di given above.
Next we turn to the case of repeated eigenvalues where the eigenvectors do not span
the whole space. If the matrix A has characteristic polynomial p(x), then by substituting
A for x we get that p(A)v = 0 for all vectors v. (This is called the Cayley-Hamilton
Theorem.) In particular, if AI, ... ,Aka are the distinct eigenvalues with mUltiplicities
ml, ... , mka, then Sk = {v E en : (A - AkI)m·v = O} is a vector space of dimension
mk. Vectors in Sk are called generalized eigenvectors.
4.2 LINEAR DIFFERENTIAL EQUATIONS

Now fix an eigenvalue A = Ak and assume that there is an m x m Jordan block. This
means that there are vectors Vi, ... , v m such that (A-AI)v l = 0 and (A-Al}V) = v j - I
for 2 ~ j ~ m. In terms of this (partial) basis, the m x m (sub)matrix has the form

[~ ! ~
o
···
o
0

0
A

0
~ ~l.
0

A
0
..
1
o 0 0 0 A

This gives the Jordan Canonical Form over the complex numbers.
If we use the real and imaginary part of the eigenvectors for the complex eigenvalues
to form a basis of real vectors, we get the Real Jordan Canonical Form for A, A =
diag(BI' ... ,Bq) where B j is of one of the following four types:
(i) B j = (Ak) for some real eigenvalue Ak,
(ii)

[~ ! ~
o
···
o
0

0
A

0
~ ~l
0

A
0
..
1
.

o 0 0 0 A
for some real eigenvalue A = Ak,
(iii) B j = Dk where Dk is a 2 x 2 matrix with entries Ok and ±/3k as given above
for some complex eigenvalue Ak = Ok + i/3k, or
(iv)
Dk 1 .. . o
o Dk .. . o
(
BJ = :
o 0 Dk
o 0 0
where Dk is a 2 x 2 matrix with entries Ok and ±/3k as given above for some complex
eigenvalue Ak = Ok + i/3k and 1 is the 2 x 2 identity matrix.
See Appendix III of Hirsch and Smale (1974) for more discussion of the Jordan
Canonical Form. Also see Gantmacher (1959).

4.2 Linear Differential Equations


The next few sections are concerned with the solutions of linear differential equations.
These results are not only interesting for themselves, but are also used in the theory of
nonlinear differential equations, e.g. to determine the stability near fixed points. We
give a few definitions in the context of a general flow so we can use them throughout the
book. For this reason, we give the definition of the flow of a general, possibly nonlinear,
differential equation.
Consider the linear equation
dx
- = A(t)x
dt
with x E Rn and A(t) = (a;j(t» an n x n matrix. Often we will take the case where A
does not depend on t.
96 IV LINEAR SYSTEMS

More generally in Chapter V, we consider the ordinary differential equation x(O) = Xo


and
d
-x= I(x) (t)
dt
where 1 : Rn --+ Rn is a function all of whose coordinate functions have continuous
partial derivatives. The flow 01 the differential equation is a function <pt(:xo) for t a real
variable and Xo E Rn such that (i) <p0(:xo) = Xo and (ii)

d
di<pt(xo) = I(<pt(:xo))

for t for which it is defined, i.e., <pt(:xo) is the solution curve through :xo at t = O.
Theorem V.3.1 proves that for any:xo E R", there is an 0 = o(Xo) > 0 such that <pt(:xo)
exists and is unique for It I < o.
The time dependent linear equations (*) can be considered in the form of (t) by
forming the equations

dx
- = A(T)X
dt
dT
dt = 1.

These equation are time independent (autonomous) and their solutions are essentially
the same as (*) since T(t) = TO + t. Therefore the general theory implies that given
Xo E R", there is a unique solution x(t) of (*) with x(O) = Xo that is defined on some
open interval containing O. If there is a bound on IIA(t)1I which is independent of t,
then it can be shown that (.) has solutions for all t. (See Exercise 5.5.) For the present,
we shall take the existence of solutions as known. We shall actually construct solutions
for the constant coefficient case (when A is independent of t), giving another proof of
the existence in this case. It will also follow in this case that the solutions exist for all
t. We will also show that the solutions are unique in this case more simply than in the
general theory. For the moment, the following lemma gives the linearity of the set of all
solutions.
Lemma 2.1. If x : I --+ Rn and y : J --+ Rn are two solutions of (.) and a, b E R are
two scalars, then ax( t) + by( t) is a solution of (*) on I n J.
REMARK 2.1. Sometimes we allow solutions of (*) with complex values. In this context,
Lemma 2.1 is true for x: I --+ en and y: J --+ en and a,b E e. We use this in Lemma
3.2 below.
PROOF. The proof is very direct:

~[ax(t) + by(t)] = ax(t) + by(t)


= aA(t)x(t) + bA(t)y(t)
= A(t)[ax(t) + by(t)],
so ax(t) + by(t) is a solution on the common interval of definition. o
Once we have shown that solutions for linear equations with constant coefficients
exist for all t. the above lemma shows that the set of solutions of (.) forms a vector
space. Since solutions are determined by their initial conditions, it will follow that the
dimension of the set of solutions is n. We give the details later.
4.3 SOLUTIONS FOR CONSTANT COEFFICIENTS 97

4.3 Solutions for Constant Coefficients


In this section, we consider the form of the solutions of (.) when A is a constant
matrix which is independent of t: A = (aij) and

dx =Ax
dt
..
( )

with x ERn.
This equation is a generalization of the scalar equation ~~ = all with II E R which has
solutions II = lIoeB !. In order to motivate the form of solutions we seek, assume there is
a basis of vectors Vi, ... , v n that puts the matrix A in diagonal form, diag(AI, ... ,An).
Then in these new coordinates the differential equation (•• ) becomes the n scalar equa-
tions lil = Al 1110 ... , lin = Anlln· Using the form ofthe solutions of these scalar equations,
we get that the solution

where e1 is the standard unit vector with a one in the jlh place and all other coordinates
zeroes and Cj = I/j(O). Back in the original basis, this solution is

Rather than write down the details of the way solutions change when we change basis,
we use this discussion for motivation and we look for solutions to (•• ) of the form
eAtv. The following proposition gives the conditions that are necessary for this to be a
solution.
Before giving the proposition, we need to make one more point. If A = or + i{3 is a
complex number with or, (3 E R, then the exponential of At can be written in its real
and imaginary parts as follows:

e(a+i!3)t = eat cos({3t) + ie at sin({3t).


This can be seen to be the correct expression by comparing the power series expansion
of each side.
Proposition 3.1. Assume A is a real n x n constant matrix. The curve e"tv is a (real)
solution of (.. ) if and only if A is a (real) eigenvalue of A with (real) eigenvector v.
PROOF. If Av = Av then define x(t) = e"tv. Then :ie(t) = Ae"tv = e"t Av = Ax(t).
Thus x(t) is a solution. If A and v are real then clearly the solution e"tv is real.
Conversely, assume that eAtv is a solution. Then x(t) = AeAtv = Ae"tv , so AV = Av
and A is an eigenvalue with eigenvector v. If A = or + i{3 and v = u + iw, then
x(t) = eAtv = eat[cos({3t)u - sin({3t)w) + ie ot [sin({3t)u + cos({3t)w). For x(t) to stay
real for all t, it is necessary that {3 = 0 and w = O. Thus A and v must both be real. 0
The next step is to find real solutions in the case that the eigenvalues are complex.
The following lemma is the main additional step in finding the real form of the solutions
in this case.
98 IV. LINEAR SYSTEMS

Lemma 3.2. Let A(t) be a real matrix which can depend on time. Ifz(t) = x(t) +iy(t)
z
is Ii complex solution of = A(t)z then x(t) and y(t) are each real solutions.
PROOF. This follows from the linearity of differentiation and matrix multiplication:
z(t) = x(t) + iy(t) = A(t)(x(t) + iy(t)) = A(t)x(t) + iA(t)y(t), so by equating the
real and imaginary parts we get that x(t) = A(t)x(t) and y(t) = A(t)y(t), proving the
result. 0
Combining the lemma with the form of the solutions for complex eigenvalues given
in the proof of Proposition 3.2, we get the following result for complex eigenvalues.
Proposition 3.3. Let A be Ii real constant matrix with complex eigenvector v = u+iw
for the complex eigenvalue A = 0 + i{3, {3 '" O. Then eot[cos({3t)u - sin({3t)w] and
eQ1[sin(pt)u + cos({3t)w] are two real solutions.
REMARK 3.1. If v = u + iw is the eigenvector for the complex eigenvalue A = 0 + i{3,
then v = u - iw is the eigenve('tor for the complex conjugate eigenvalue>' = u - i{3.
It can t.hen be checked that these give the same two real solutions of the differential
equation as given above.
Next we turn to the case of repeated eigenvalues where the eigenvectors do not span
the whole space. We will write out the form of the solutions when the eigenvalues are
real (or the complex form of the solutions if a repeated eigenvalue is complex). We will
sketch a way of deriving solutions using the exponential of a matrix and then verify that
the solutions that we derive are actually valid.
For a matrix B, let e B be the matrix obtained from the power series:
B 1 2 1 k
e = 1+ B + 2iB + ... + kiB + ...

It can be shown that this series converges to a matrix just as for real numbers. However
care must be exercised because it is not always the case that e B +C is equal to eBe C ,
This is the case if BC = CB (they commute). In particular,
eAI = e~II+(At->.It)

= e,xlle(AI->.II)

= e,xlel(A->.I).

It can be shown that eAtv is a solution of (.. ) for any v, but we will give a more
specific form of the solutions that does not involve a power series. Given that there are
not always as many eigenvectors as the algebraic multiplicity of the eigenvalue, we take
a generalized eigenvector w such that (A - M)k+l w = 0 but (A - M)kw '" O. Then
(A - M)lw = 0 for j > k and

= e AI{ Iw + t(A - M )w + ... + ki


tk( A - M )k w

tk +1
+ (k + 1)!(A - M)k+ 1W + ... }
kti
= e,xt L -:;(A - M)iw.
i-O J.

With the above calculations as motivation, we can check directly that the final form
is indeed a solution (without verifying that the series converges).
4.3 SOLUTIONS FOR CONSTANT COEFFICIENTS 99

Proposition 3.4. If (A - AI)H1 W = 0 then

is a solution to ( .. ).
PROOF. Let x(t) be as in the statement. Then using the fact that (A - AI)Hl w = 0,

:ic(t) = '>'e).1 L "1Jt (A - '>'I)J w + e).1 L ~(A


k

;=0
j .

(J
t

)
k
- '>'I)'w
j=1
j - I .

k i t HI tj - I
= '>'e).1 L "1(A - AI)jw + e).1 L ~(A - .>.I)jw
i=O J ;=1 () )

k ~ k ~
= e).l{.>. L -;(A - '>'I)jw + (A - '>'1) L -;(A - AI)jw}
j=O)' j=o ).

ti
= Ae).1 L -;(A
k

J.
j=o
- AI)jw

= Ax(t).

Thus x(t) is a solution as claimed. o


We have now given the real form of solutions in all cases (except for repeated complex
eigenvalues). Using this we have existence of solutions and the form of these solutions
as given in the following theorem.
Theorem 3.5 (Existence). Given a real n x n constant matrix A and Xo E Rn, there
is a solution x(t) of:ic = Ax defined for all t such that x(O) = Xo. Moreover, each
coordinate function of x( t) is a linear combination of functions of the form
t k e nl cos(,Bt) and
where a + i,B is an eigenvalue of A and k is less than the algebraic multiplicity of the
eigenvalue.
PROOF. The Jordan normal form says that there is always a basis of generalized eigen-
vectors, vl, ... ,v n . Let xj(t) be the solution with xj(O) = v j for j = l, ... ,n. Given
any Xo ERn, solve for al," .,an such that Xo = E'l=l ajv j . Then x(t) = E'l-l ajxj(t)
is a solution with x(O) = Xo. This shows that there is such a solution, and that it exists
for all t. Also, the form of the solutions xj(t) found above proves that the coordinate
functions are linear combinations of functions of the form stated in the theorem. 0
We mentioned above that for any v, eA1v is a solution. Thus eA1v is the How for
equations (**). If we look only at the matrix e A1 , it can also be shown that
d
dt eAI = Ae At .

Thus if we allow matrix solutions to the differential equation, eAt is such a solution.
However, it is not very easy to compute. In Theorem 3.8 below, we show how to use
the solutions constructed above to get another matrix solution to the equation. Before
giving this construction, we prove directly the uniqueness of solutions using the matrix
eAI rather than using the general theorem for ordinary differential equations.
100 IV. LINEAR SYSTEMS

Theorem 3.6 (Uniqueness). Given Xo. there is a unique solution x(t) to x = Ax


with x(O) = Xo.
PROOF. Let x(t) be a solution. Let y(t) = e-Atx(t). Then

y(t) = -Ae-Atx(t) + e-A1x(t)


= -Ae-A1x(t) + e- A1 Ax(t)
= e- A1 ( -A + A)x(t)
=0.

Therefore y(t) == y(O) = eOx(O) = Xo. and e-A1x(t) == Xo. so x(t) == eA1Xo. This proves
the result. 0
REMARK 3.1. The above proof can be modified to apply to the general linear equation
(.) by using the fundamental matrix solution introduced below.
Finally, we can prove that the solutions form a vector space of dimension n.
Theorem 3.7. Given an n x n constant real matrix A, the set ofsolutiollS oE( •• ),

s= {x: R - Rn : x(t) = Ax(t)},

forms a vector space of dimension n.


PROOF. We know from above that solutions exist for all time. By Lemma 2.1, ifx,y E S
and a, bE R then ax(t) + by(t) E S. This shows that S is a vector space.
Let vi, ...• v n be a basis for Rn (for example, either the standard basis or a basis of
generalized eigenvectors of A). Let xj(t) be the solution with xj(O) = v j for j = 1, ... , n.
Given any solution Z E S, solve for al,"" an such that z(O) = E;=I aivi. Then
both z(t) and E;=I ajxj(t) are solutions, and they have the same initial condition
at t = o. By uniqueness, they are equal, so z(t) = E;=I ajxj(t). This proves that
{xl(t), .... xn(t)} span S.
If a linear combination of the solutions E;_I ajxj(t) = 0 for some al, ... ,an, then
by setting t = 0 we get that E;=I ajvi = O. Because the vi are independent, it follows
that aj = 0 for all j. This proves that the xi(t) are independent and so a basis. This
shows that S has dimension n. 0
Returning to matrix solutions of linear equations, an n x n matrix M(t) is called a
fundamental matrix solution of x = A(t)x provided

~M(t) = A(t)M(t)

for all t and M(to) is nonsingular at one time to. The following theorem justifies
the assumption the M(to) is nonsingular at one time as well as giving an alternative
construction of eAI.
Theorem 3.8. Let A(t) be an n x n real matrix, and consider the linear equation
x= A(t)x.
(a) Assume xi(t), ... ,xn(t) are n solutions, and let
4.3 SOLUTIONS FOR CONSTANT COEFFICIENTS 101

be the n x n matrix formed by putting the vector solutions in as columns. Then M(t)
satisfies the linear equation as a matrix solution:

~M(t) = A(t)M(t).

(b) If Xl (to), . .. , xR(to) are independent for one time to, then any solution can be
written as x(t) = M(t)v for some vector v.
(c) If M(t) is a matrix solution such that M(to) is nonsingular at one time to, then
M(t) is nonsingular at all times.
(d) If M(t) is a matrix solution with M(O) nonsingular, then

eAt = M(t)M(O)-I.

PROOF. Defining M(t) as stated,

~M(t) = (jcl(t), ... ,jcR(t»


= (A(t)XI(t), ... , A(t)xR(t»
= A(t)M(t),

as claimed.
Now assume the solutions are independent at to, and x(t) is any other solution.
Because M(to) is nonsingular, it is possible to solve for v such that x(to) = M(to)v.
Then both x(t) and M(t)v are solutions that agree at to. By the uniqueness of solutions
they are equal for all time.
Part (c) is equivalent to proving that if M(t) is a matrix solution that is singular at
one time, then it is singular at all times. Assume that M(t) is such a matrix solution
with M(to) singular. Thus the columns of M(to) are dependent and there is a nonzero
vector v with M(to)v = o. Then both M(t)v and the zero function are solutions that
are equal at one time. By uniqueness they are equal for all time, M(t)v = o. Thus
M(t) is singular for all time, proving the result.
Now, assume M(O) is nonsingular (or is nonsingular at some time to). Then both
M(t)M(O)-1 and eAt are matrix solutions that are equal at time t = O. By uniqueness
of solutions, they are equal for all times. (To reduce it to vector solutions, multiply
both matrix solutions by the standard basis elements ei.) 0
REMARK 3.2. Notice that if M(t) is a fundamental matrix solution of (*), then by
multiplying on the right by M(tO)-l we get that M(t,to) = M(t)M(to)-1 is another
fundamental matrix solution with M(to. to) = I. This idea is used in the proof of both
Theorem 3.8 and 3.9.
To end the section, we restate Liouville's Formula for any fundamental matrix solu-
tion of a linear differential equation which depends on time. This result is used when
we make the connection between the divergence of a vector field and the way that the
flow distorts area.
Theorem 3.9 (Liouville's Formula). Let M(t) be a fundamental matrix solution of
the linear equation (with p06Sibly nonconstant coefficients) jc = A(t)x. Then

~ det (M(t») = det (M(t») tr (A(t») and

det (M(t») = det (M(O») exp (l tr (A(s») dS)'


102 IV. LINEAR SYSTEMS

PROOF. We let ei be the standard basis. In the following calculation we use the notation
ai,j(t) for the (i,j) entry of A(t). Using the fact that the determinant is multilinear in
the columns we get the following:

ft det (M(t)M(to)-I)lt=t. =

= did det (M(t)M(to)-le 1 , .•. , M(t)M(to)-le n ) It=t.


= L det (M(to)M(to)-le 1 , .•• , M(to)M(to)-lei- 1 ,

M'(to)M(to)-lei, M(to)M(to)-lei+1,
... ,M(to)M(to)-len )
e , ... ,e1'-1 ,A(to)e1.,e1'+1 , ... ,en)
= "L... det (I
j

= L det (e l , ... ,e1- I , Lai,j(to)ei,ei+1, ... ,en)


j

= "L...ai,j(to)det(e 1 , ... ,e1'-I ,ei,e1


' + 1 , ... ,e)
n

i,i

= Laj,j(to)det(el, ... ,en )


j

= tr (A(to».

Therefore, using the multiplicative feature of the determinant,

ft det (M(t»)!t=t. = ft det (M(t)M(to)-I) det (M(to»


= det (M(to» tr (A(to».

Solving this scalar differential equation gives the formula in the theorem relating the
quantities det (M(t», det (M(O», and the exponential of 1t tr (A(s» ds. 0

4.4 Phase Portraits


In the last section, we gave the general theory of solutions to linear equations and
found the solutions of constant coefficient linear equations. In this section, we give
thf' phase portraits of two dimensional constant coefficient linear equations and some
examples in three dimensions. The phase portrait of a differential equation :it = I(x) is
a drawing of the solution curves with the direction of increasing time indicated. In some
abstract sense, the phase portrait is the drawing of all solution curves, but in practice
it only includes representative trajectories. The phase space is the domain of all x's
considered.
Example 4.1 (A Saddle). Consider

. (1 1) x.
x= 0 -2

The general solution is


4.4 PHASE PORI'RAITS 103

FIGURE 4.1. A Saddle


See Figure 4.1 for the phase portrait. (Solutions along the x-axis are moving away from
the origin and those along the line of slope -3 are coming in toward the origin.)
Example 4.2 (A Stable Node). Consider

. (-10 -20) x.
x=

The general solution is

x(t) = C1e- t (~) + C2e-2t (~) .


See Figure 4.2 for the phase portrait. (All the solutions are coming in toward the origin.)

FIGURE 4.2. A Stable Node

Example 4.3 (An Improper Node). Consider

. (-1 1) x.
x= 0 -1

The general solution is


104 IV. LINEAR SYSTEMS

FIGURE 4.3. An Improper Node

See Figure 4.3 for the phase portrait. (All the solutions are coming in toward the origin.)
Example 4.4 (A Center). Consider

. (0w -w)
x= o x.

The eigenvalues are ±iw with eigenvectors (~i). The general solution is

x(t) = C ( - Sin(wt») + C (c~(wt»).


1 cos(wt) 2 sm(wt)

See Figure 4.4 for the phase portrait. All the solutions are on closed orbits surrounding
the origin.

FIGURE 4.4. A Center

Example 4.5 (A Stable Focus). Consider

x. = (-12 -2)
-1 x.
4.4 PHASE PORI'RAITS 105

The eigenvalues are -1 ± 2i with eigenvectors (~i). The general80lution is

x(t) = C 1 e- l [c08(2t) U) - sin(2t) 0)1


+C2 e- l [sin(2t)(n +cos(2t) (~)I.
See Figure 4.5 for the phase portrait. (All the solutions are coming in toward the origin.)

FIGURE 4.5. A Stable Focus

Definition. In general, a linear system is called a node provided that every orbit tends
to the origin in a definite direction as t goes to infinity (if all the eigenvalues are real and
negative), or as t goes to negative infinity (if all the eigenvalues are real and positive).
A linear system is called a proper node provided it is a node and there is a unique orbit
going into (or coming out of) the origin for each direction (all the eigenvalues are equal,
real, and nonzero); it is called an improper node provided it is a node and there is a
direction for which there are more than one orbit going into (or coming out of) the
origin in that direction. In this terminology, both Examples 4.2 and 4.3 are improper
nodes. A linear system is called a stable focus or stable spiml (respectively, unstable
focus or unstable spiml) provided the solutions approach the origin as t goes to infinity
(respectively, minus infinity) but not from a definite direction.
Example 4.6 (A Three Dimensional Saddle). Consider

x= (-1o -2 0)
2 -lOx.
0 1

The eigenvalues are -1 ±2i and 1 with eigenvectors ( ±i) and (0


~ 0 1). The general

solution is
106 IV. LINEAR SYSTEMS

}
d- 7 :@
~
[
(C
+
FIGURE 4.6. A Three Dimensional Saddle
See Figure 4.6 for the phase portrait.

4.5 Contracting Linear Differential Equations


In this section we use the features of the solutions of (**) to characterize the solutions
as contracting, expanding, or subexponentially growing. We give the definitions of
Liapunov stable and asymptotically stable again in this context. They are essentially
the same as we gave for one dimensional maps.
Definition. The orbit of a point p is Liapunov stable for a flow tpt provided that given
any! > 0 there is a 6 > 0 such that if d(x, p) < 6, then d(tpt(x), tpt(p)) < ! for all t ~ O.
If p is a fixed point, then the condition can be written as d(tpt(x), p) < !. The orbit of
p is calif!<! asymptotically stable or attracting provided it is Liapunov stable and there is
a 61 > 0 such that if d(x, p) < 61 , then d(tp'(X), tpt(p)) goes to zero as t goes to infinity.
If p is a fixed point then it is asymptotically stable provided it is Liapunov stable and
61 > 0 such that if d(x, p) < 61 then w(x) = {pl.
REMARK 5.1. For a linear system, if the origin satisfies the second condition of asymp-
totic stability then it is also Liapunov stable. There are nonlinear systems for which
this is not the case. See Example V.5.3 for an example of Vinograd (1957).
Another example can be given in polar coordinates with the fixed point at r = 1 and
8 = 0 can be given (near r = 1) by
r=l-r
11 = sin(9/2).
Initial conditions with reO) near 1, have ret) tending to 1. Also, theta(t) tends to 0
modulo 21r. Thus for any such point, w(ro, 80 ) = {(I, O)}. On the other hand, this point
is not attracting.
The main theorem of the section proves that if all of the eigenvalues of A have
negative real part, then the origin is attracting at an exponential rate determined by the
eigenvalues. Thus one criterion for asymptotic stability of a linear system of differential
equations with constant coefficients is that all the eigenvalues of the matrix have negative
real part. Before stating the theorem, we give an example which illustrates the fact that
for a stable linear system, the usual Euclidean norm of a nonzero solution is not always
strictly dl'Creasing.
4.5 CONTRACTING LINEAR DIFFERENTIAL EQUATIONS 107

Example 5.1. Consider the system of linear differential equations given by


x = -x - y
iJ = 4x -]I.

The eigenvalue!! are -1 ± 2i, and the general solution is given by

( X(t») =e-I (COS(2t) -(1/2)Sin(2t») (xo)


y(t) 2sin(2t) cos(2t) Yo'

See Figure 5.1 for the phase portrait.

FIGURE 5.1. Phase portrait for Example 5.1.

The matrix in the above solution, with terms involving sin(2t) and cos(2t), has norm
less than 2, so

which goes to zero exponentially. The time derivative of the Euclidean norm along a
nonzero solution is given by

!!..(X2 + y2)1/2 = ~(X2 + ]l2)-1/2(2xx + 2yli)


dt 2
= (x 2 + y2)-1/2( _x2 _ y2 + 3xy).
Along the line x = y, (d/dt)(x 2 + y2)1/l = 2- 1/ 2Ixl which is positive for x = y '" O.
Therefore the Euclidean norm is not monotonically decreasing, although it dOe!! go to
zero at an exponential rate. See Figure 5.2.
In this example, it is possible to take a different norm (the norm in the coordi-
nates which give the Jordan Canonical Form) which is monotonically decreasing along
a solution. Let I ,1. be the norm defined by

l(x,y)l. == (4x 2 + y2)1/2.


The time derivative of this norm along a nonzero solution is given by

!1(x,]I)I. = ~1(x,Y)I;I(4' 2xx + 2yli)


= l(x,y)I;I(-4x2 _ y2)
= -\(x, y)I •.
108 IV. LINEAR SYSTEMS

FIGURE 5.2. Solution Curve for Example 5.1 Crossing Level Sets i(x,y)1 =
C and I(x, y)l. = C'

Solving this linear scalar equation shows that

l(x(t),y(t))I. = e-ll(xo,Yo)l.

which monotonically and exponentially decreases to zero.


Theorem 5.1. Let A be an n x n real matrix, and consider the equation x = Ax. The
following are equivalent.
(a) There is a norm I I. on RR and a constant a > 0 such that for any initial
condition x E RR, the solution satisfies

(b) For any norm II' on RR there exist constants a> 0 and C ~ I such that for any
initial condition x E RR, the solution satisfies

(c) The real parts of all the eigenvalues A of A are negative, Re(A) < O.

REMARK 5.2. A norm as in Theorem 5.I(a) is called an adapted norm. It is useful to


use because solutions immediately contract in terms of this norm as time goes forward.
The norm II. introduced in Example 5.1 is such an adapted norm. The adapted norm
for the linear equation is also used to study nonlinear equations near a fixed point.
REMARK 5.3. We give two proofs that condition (c) implies condition (a), i.e., that
the condition on the eigenvalues in part (c) implies the existence of a norm in which
the differential equation is a contraction as given in part (a). The second (alternative)
proof uses the basis which puts the matrix in Jordan canonical form to define the norm.
This proof has the advantage that it is constructive; it has the disadvantage that it
does not apply in the situation we consider in Chapter VII of a hyperbolic invariant set
(Theorem VILl.I). The first proof averages the usual norm along the orbits. It has the
advantage that it applies in the more general situation of a hyperbolic invariant set; it
has the disadvantage of being more abstract and not easily computable for a particular
example.
4.5 CONTRACTING LINEAR DIFFERENTIAL EQUATIONS 109

PROOF. First we show that (a) implies (b). Let II. be the norm given in (a) and II' be
any other norm. Then there are constants AI, A2 > 0 such that Allxl' ~ Ixl. ~ A2Ixl'.
(This is true for any two norms in finite dimensions.) Then for t ~ 0,
1
leAtxl' ~ AlleAlxl.

< ~e-allxl •
- Al
~ ~~ e-allxl'·
This proves that (b) is true with C = A 2 /A I • This completes the proof that (a) implies
(b).
Next, we show that (b) implies (c). Suppose (c) is not true: there exists an eigenvalue
A = (]( + if3 with (]( ~ o. There is a solution for the corresponding eigenvector of the
form eO I [sin(f3t)u + cos(f3t)wl, and this solution does not go to zero as t --+ 00. This
shows that (b) is not true.
Finally, we need to show that (c) implies (a). Let 0 > 0 be chosen so that Re(A) < -0
for all the eigenvalues A of A. Let v I, ... ,v" be a basis of generalized eigenvectors, and
x j (t) be the solution with x j (0) = v j . By the form ofthe solutions given in Theorem
3.5, for each j there is a Tj such that leAlvil ~ e-allvjl for all t ~ Ti. (Here we
use the Euclidean norm, or any other fixed norm.) Then any x it can be written as
x = E;=I ci vj , so the solution eAlx = E;=I Cixj (t). The form of these solutions
implies there is a T which depends on x such that leAlxl ~ e-allxl for all t ~ T. Using
the compactness of {x: Ixl = I}, there is one T which works for all x with Ixl = 1. By
linearity, we get that there is one T which works for all x. Fix this T.
What we have is that eAlx is a contraction if we take t larger than T. We average the
norm along the trajectory for times between 0 and T with a weighting factor eal and
show that in terms of this averaged norm, the linear How is an immediate contraction.
To this end, define
Ixl. == foT ea'leA'xl d,.
We now show that this norm works. Take t ~ 0 and write it as t = nT + T with
o~ T < T. In the calculation which follows, we split up the range of 8 so that 8 + t
runs from nT + T to (n + l)T and from (n + l)T to (n + I)T + T:

leAlxl. = for ea'leA'eAlxl ds


= lo
r - T ea'leAnr eA(T+O)xl ds + {r
lr-T
ea'leA(n+l)r eA(T-r+O)xl ds.

Making the substitution u = T + s in the first integral, and u = T - T + s in the second


integral, and using the estimate above for leAnrxl, we get

leAlxl. ~ £r ea(u-T-nr)leAuxl du + foT ea(u+T-T-(n+l)r)leAuxl du


= e-at for eauleAuxldu
= e-allxl.·

This proves that (a) holds using the norm I I•. o


110 IV. LINEAR SYSTEMS

ALTERNATE PROOF THAT (C) IMPLIES (a). Given E > 0, we can find a basis B of
generalized eigenvectors in terms of which A = diag{ AI, ... ,Ap} where each Ai is one
of the following types:


0

("'0
o OJ
0).
0
,
o o} 0

o 0 ... ~} o
0
0
0
OJ
0
f).
OJ

:),
d 0

n
0

(I Dj

0 D)
or
(I Dj

0
0
0

D}
0 D}
where
-(3i)
o
Let 1 18 be the norm in terms of this basis, i.e., if x =
' and 1=0 n· L:7=1 XjV j then Ixl8
[L:7=1 x;j1/2.
Using the components in the above basis,

~lx(t)18 = L:7=1 Xj(t)Xj(t) = (x(t), AX(t))8


dt Ix(t)18 Ix(t)18·
Assume that C 1 and C 2 are two constants with C I < Re('x) < C2 for all eigenvalues
,X of A. It can be shown that if f is small enough in the above Jordan form, then
Ctlxl~ < (x, AX)8 < C2Ixl~. Therefore,

fel x (t)18
C 1 < IX(t)i8 < C 2 ,
d
CI < dt log(lx(t)18) < C 2 ,
IX(t)i8
CIt < log (lx(O)18) < C 2 t,
and
ec ,IIXol8 < Ix(t)18 < e C • 1Ix oI8.
If all the eigenvalues have negative real part, then taking a = -C2 > 0 above we get
the result claimed. 0
From the above theorem, it follows that if the real part of each eigenvalue is negative,
then the origin is asymptotically stable for the linear flow. Also note that if for t ~ 0,
we set y = eA1x in the first condition of Theorem 5.1, we get that Iyl. ~ e-1ble-A1yl.
for all t ~ 0, or eb1111yl. ~ leA1YI. for all t ~ o. Thus going backward in time the
flow is an expansion. For this reason if all the eigenvalues of A have negative real part,
then we say that. the differential equation x = Ax has the origin as a sink, the origin is
attmciing, or the flow is a contmction.
If all the eigenvalues of a matrix A have positive real part, then an analogous result
is true and the flow is an expansion for t ~ 0 and a contraction for t ~ o. In this case we
say that the origin is a source, the origin is repelling, or that the flow is an expansion.
4.6 HYPERBOLIC LINEAR DIFFERENTIAL EQUATIONS 111

4.6 Hyperbolic Linear Differential Equations


In the previous section, we considered the case when the linear flow eAtx is contracting
in all directions or expanding in all directions. In this section we consider the case when
it is contracting in some directions and expanding in others. We define the linear
differential equation x = Ax to be hyperbolic provided all the eigenvalues of A have
nonzero real part. If A induces a linear hyperbolic differential equation, and there are
at least two eigenvalues of A, Au and A., with Re(Au) > 0 and Re(A.) < 0, then the
origin is called a (hyperbolic) saddle for the differential equation. Thus in the case of a
saddle, some directions expand and some contract. Both the contracting or expanding
cases are hyperbolic but are not saddles.
For any linear flow (either hyperbolic or not), we want to characterize the eigenspaces
of generalized eigenvectors corresponding to the eigenvalues with positive, zero, and
negative real parts. We first introduce some notation and then state the result. Let
A be an 7l x n matrix. Define the stable eigerl8pace, unstable eigen8pace, and center
eigen8pace to be

IE' = span {v : v is a generalized eigenvector


for an eigenvalue A with Re(A) < O},
lEu = span{v : v is a generalized eigenvector
for an eigenvalue A with Re(A) > O}, and
lEe = span{v : v is a generalized eigenvector
for an eigenvalue A with Re(A) = O},

respectively. If A is hyperbolic (so lEe = 0), then the decomposition ofRn into subspaces
given by Rn = lEU €a E' is called a hyperbolic splitting. Let

V' = {v: there exist a > 0 and C ~ 1 such that


leAtvl ::; Ce-atlvl for t ~ O},
vu = {v: there exist a > 0 and C ~ 1 such that
leAtvl ::; Ce-altllvl for t::; O}, and
Ve = {v: for all a > 0, leAtvle-altl -.0 as t -. ±oo}.

Thus the subspace V' is defined to be all vectors which contract exponentially forward
in time; ve is defined to be the vectors which grow at most 8ubexponentially both
forward and backward in time. Finally, notice that VU is defined as the set of vectors
which contract backward in time and not in terms of behavior forward in time. This
is done because any vector which is not in E' €a Ee expands forward in time, and this
is not a subspace and does not characterize VU. The following theorem shows that the
conditions defining the va characterize the subspaces EO" for (J' = s, u, c.
Theorem 6.1. Consider the linear differential equation x = Ax with x in Rn. Let EU,
lEe, IE', VU, ve, and V· be defined as above. Then the following are true.
(a) The subspaces lEa are invariant by the Bow eAt for (J' = U, C, 8,
(b) 'T'he subspace IE" = V" [or (1 = U, C, B, so eAt lIEU is an exponential expansion,
e·~tllE· is an exponential contraction, and eAtllEe grows subexponentially as t .-.
±oo, and these uniquely characterize the subspaces.
PROOF. By the form of the solutions given in Theorem 3.5, the subspaces are invariant
for eAt as claimed.
112 IV. LINEAR SYSTEMS

By Theorem 5.1, IE" C V" and E' C V'. On the other hand, if v E V" \ E" then
it has a nonzero component v' in either Ee or E'. By the form of the solutions, eAtv'
and so eAtv do not go to zero exponentially as t -+ -00, so v f/. V". This contraction
proves that V" C EU and so V" = E". Similarly, V' = E'.
For v E Ee \ {O}, by the form of the solutions, eAtv has at mOllt polynomial growth
as t -+ ±oo, so v E VU. As above, if v E ve \
Ee then there is a nonzero component in
either E" or E', and there is exponential growth as t goes to either +00 or -00. This
contradicts the fact that v Eve. This completes the proof that lEe = ve,
and so the
theorem. 0
Using Theorem 5.1, we can see that if A induces either a purely contracting linear
flow or purely expanding linear flow, then nearby B will also induce hyperbolic flows
of the same type. For hyperbolic linear flows, the subspaces in the hyperbolic splitting
can move, but a small perturbation B of A must remain hyperbolic and must have
eigenspaces of the same dimension as those for A. We state this in the following theorem.
Theorem 6.2. Assume that A induces a hyperbolic flow for the linear differential
equation x = Ax for x E Rn. If B is an n x n matrix with entries near enough to
A, then x = Bx induces a hyperbolic linear flow with the same dimension splitting
E8 $IEB as that for A. Moreover, as A varies continuously to B the subspaces also vary
continuously.
PROOF. Because A is hyperbolic, Rn = lEu $ IE' for the eigenspaces of A. Let p('\) be
the characteristic polynomial for A. Let -y be a curve in the left half of the complex
plane that surrounds all eigenvalues with negative real part for A and is oriented coun-
terclockwise. Let -y' be a curve in the right half of the complex plane that surrounds all
eigenvalues with p08itive real part for A, again with counterclockwise orientation. By
residues
dim(E') = -1-1 P'(z) dz
27ri .., p(z)
and
.
dlm(E " 1
) = -2' 1 P'(z)
-() dz.
7r1..,'PZ
For B near enough to A, the characteristic polynomial q('\) for B does not vanish on -y
or -y'. The two integrals with p( z) replaced by q( z) will count the number of roots for
q inside -y and '"Y' respectively. By continuity of these integrals with respect to changes
in p, the number of roots for p and q are the same (since a continuous integer valued
function is a constant). In particular, all the roots of q(z) are either inside -y or -y' and
noDe are on the imaginary axis. This proves that B is hyperbolic and the subspaces Ea
and EB have the same dimension as those for A.
Next,
Pv = ~ l(zI -
27r1 ..,
A)-lvdz

is a projection from R" onto E'. See Section 148 of Riesz and Nagy (1955). Similarly,

is a projection onto the stable eigenspace Ea for B. Again, this integral varies continu-
ously with changes from A to B, so the subspace varies continuously. Similar statements
are true for the unstable subspaces.
4.7 TOPOLOGICALLY CONJUGATE EQUATIONS 113

4.1 Topologically Conjugate Linear Differential


Equations
Definition. We consider two flows to have the same qualitative properties, and so to be
topologically similar, if we can match the trajectories of one with the trajectories of the
other. There are two ways of doing this depending on whether we demand that the con-
jugacy match the time parameterization of the two flows or allow a reparameterization.
The stronger condition requires that the flows be matched without a reparameterizIV
tion. We say that two flows '{" and 1/J' on a space M (M could be R" or some manifold)
are topologically conjugate provided there is a homeomorphism h : M ---+ M such that
h 0 '{"(x) = 1/J' 0 h(x) for all x E M and for all t E R. Allowing a reparameterization,
we say that '{" and 1/J' are topologically equivalent provided there is a homeomorphism
h : M ---+ M such that h takes trajectories of '{" to trajectories of I/J' while preserving
their orientation. More precisely, '{" and 1/J' are topologically equivalent if there is a
homeomorphism h : M -+ M and a (reparameterization) function 0 : R x M -+ R such
that h 0 ,{,O(I.X) (x) = 1/J' 0 h(x) for all x E M and for all t E R, where we assume for
each fixed x that o(t, x) is monotonically increasing in t and is onto all of R. We could
also assume the group property on 0 to insure that it is indeed a reparameterization:
,{,o(I+"X)(x) = ,{,O(I ...,n( •. x)(x» 0 ,{,O('.x) (x).

We use these concepts repeatedly in our study of flows. In most circumstances, it is


not possible to preserve the parameterization when we make perturbations. However,
the following theorem proves that two linear flows with the same dimensional contracting
spaces and the same dimensional expanding spaces are actually conjugate, i.e., it is
possible to preserve the parameterization.
Theorem 1.1. Let A and B be two n x n real matrices.
(a) Assume that all the eigenvalues of A and B have negative real part (both are
sinks). Then the two linear Bows eAI and eBI are topologically conjugate.
(b) Assume that all the eigenvalues of A and B have nonzero real part and the
dimension of the direct sum of all the eigenspaces with negative real part is the same for
A and B. (Thus the dimension of the direct sum of all the eigenspaces with pa6itive real
part is the same for A and B.) Then the two linear Bows eAI and eBt are topologically
conjugate.
(c) In particular, if all the eigenvalues of A have nonzero real part and B is near
enough to A, then the two linear Bows eAI and eBI are topologically conjugate.
PROOF. Part (b) follows fairly easily from part (a). The theorem that gives a conjugacy
for two systems with all their eigenvalues with negative real part (on the same dimen-
sional space) implies the same result for systems with all their eigenvalues with positive
real part (on the same dimensional space). Then given two hyperbolic linear flows as
in part (b), there is a conjugacy on the eigenspaces for the eigenvalues with negative
real part, and a conjugacy between eAI and eBI on the eigenspaces for the eigenvalues
with positive real part, i.e., there are conjugacies h" : E~ ---+ E~ between eAIIE~ and
eBIIE~ for 0' = U,5. There are projections 7r" : Rn ---+ E" for 0' = U,5, so any x E R"
can be written as x = 7r.. (x..) + 7r.,(x). The two conjugacies can be combined to give a
conjugacy on the total space. Define h(x) = h u (7ru (x» + h.(7r.(x». Using the linearity
of the flows, it is easily checked that h is a conjugacy.
Part (c) follows from part (b) because if B is near enough to A, then the dimensions
of the splittings for A and B are the same by Theorem 6.2.
It remains to prove part (a). By Theorem 5.1, there exist norms I IA and I IB and
constants a, b > 0 such that we have the estimates leAlxlA :5 e-arlxlA and leB1xlB :5
114 IV. LINEAR SYSTEMS

e-b1lxlB for t ~ 0 and for any x in Rn. Running time backward, we get the estimates
leA'xL~ ~ ea111lxlA and leB1xlB ~ eb11ljxIB for t ~ O.
We want to match up the trajectories of eAI with those for eBI. Using the above
estimates, we see that for each x "f 0 the trajectory eA1x crosses the unit sphere for IIA
exactly once, and each trajectory eB1y crosses the unit sphere for liB exactly once. Let
the unit spheres in these two norms be denoted as follows: SA = {x: IxlA = I} and
SB = {x : IxlB = l}. These spheres, SA and SB, are called the fundamental domains
for the two linear flows because of the property that each trajectory of eA1x for x "f 0
(respectively of eB1y) crosses SA (respectively SB) exactly once. Therefore, we first of all
define a homeomorphism ho from SA to SB by ho(x) = x/lxIB' (Any homeomorphism
would do.) Notice that the inverse of ho exists and is given by ha1(y) = y/lyIA'
To extend ho to all Rn we need to define the time when the trajectory that starts at
x crosses the unit sphere. Using the above inequalities for the flow e A1 , it follows that
for any x "f 0 there is a T(X) which depends continuously on x such that leAT(x)xIA = 1,
i.e., eAT(x)x E SA. Because of the definition it follows that T(eA1x) = T(X) - t.
Now using this homeomorphism ho on the unit sphere and the time T(X), we can
define a map (homeomorphism) h : Rn -+ Rn by

e-BT{X)ho(eAT(x)x) for x "f 0, and


h(x) = { 0
for x = o.
The following calculation shows that h is a conjugacy:

h(eA1x) = e-BT(eA'x)ho(eAT(eA'x)eAlx)
= e-B(T(X)-l)ho(eA(T(X)-l)eA1x)
= eBle-BT(X)ho(eAT(X)x)
= eBth(x).

Because T and the flows eAI and e B1 are continuous, it follows that h is continuous
at points x "f O. To check continuity at 0, notice that if x] converges to 0 then
Tj = T(X]) goes to minus infinity. Letting Yj = ho(eATjxj), we have that IYjlB = 1.
Thus Ih(x])IB = le-BT'YjIB ~ e-bIT,1 must go to zero. Therefore h(xj) converges to
o = h(O). This proves the continuity at O.
To show that h is one to one, take X,y with h(x) = h(y). If x = 0, then 0 = h(x) =
h(y), so y = 0 = x. Now assume x "f O. Then h(y) = h(x) "f 0 so y "f O. Letting T =
T(X), h(eATx) = eBT h(x) = e BT h(y) = h(eATy). This shows that h(eATy) = h(eATx) E
SB, so eATy E SA and T(Y) = T(X). Since ho(eATx) = h(eATx) = h(eATy) = ho(eATy),
and ho is one to one, we have eATx = eATy and so x = y. Thus, h is one to one in all
cases.
Reversing the roles of A and B in the arguments above, we get that h- 1 exists (and
so h is onto) and is continuous. This completes the proof. 0
In contrast to the above results which proved that many different linear contractions
are topologically conjugate, there is the following standard result about linear conjugacy
which implies that very few different linear differential equations are linearly conjugate.
Theorem 7.2. Let A and B be two n x n matrices, and assume that the two flows etA
and etB are linearly conjugate, i.e., there exists an invertible M with e lB = MetA M- 1 .
Then A and B have the same eigenvalues.
4.8 NONHOMOGENEOUS EQUATIONS 115

PROOF. Differentiating the equality e tB = MetA M-I with respect to t at t = 0, we get


that B = M AM-I. Then the characteristic polynomial for B equals that for A:

p(~) = det(B - AI)


= det(MAM- 1 - ~MIM-I)

= det(M)det(A - AI)det(M- 1 )
= det(A - AI).

The fact that the characteristic polynomials are equal implies that they have the same
eigenvalues. 0
REMARK 7.1. If B = sA for s > 0, then certainly the two flows are linearly equivalent.
This is the only new feature which the notion of linearly equivalent adds to that of
linearly conjugate.

4.8 Nonhomogeneous Equations


In this section we consider nonhomogeneous linear equations. These results are used
in the next chapter when studying the stability of fixed points of nonlinear equations
and other matters.
The general form of the equations considered is given by

x= A(t)x + g(t). (NH)

Given such an equation, we associate the corresponding homogeneous equation,

x= A(t)x. (H)

The following theorem gives the relationship between the solutions of (NH) and those
of (H). We leave its proof to the reader.
Theorem 8.1. (a) If x 1 (t) and x 2 (t) are two solutions of (NH), then x 1 (t) - x 2 (t) is a
solution of (H).
(b) Ifxn(t) is a solution of (NH) and xh(t) is a solution of (H), then xn(t) + xh(t) is
a solution of (NH).
(c) If xn(t) is a solution of (NH) and M(t) is a fundamental matrix solution of (H).
then any solution of (NH) can be written as xn(t) + M(t)v.
In terms of the above theorem, we know how to find solutions of the homogeneous
equation, at least in the constant coefficient case. We need to find one solution of the
nonhomogeneous equation. One way is to look for solutions of the same type as the
forcing term. For scalar second order systems, this method is often called the method of
undetermined coefficients. The other method is the method of variation of parameters.
The following theorem gives this result for systems. Also, in the scalar equations there
is an arbitrary choice that has to be made. For systems, no such choice is needed: the
process is straight forward.
Theorem 8.2 (Variation of Parameters). Let M(t) be a fundamental matrix solu-
tion of the homogeneous equation (H). Then

x(t) = M(t) (1: M(S)-lg(S) ds + v)


116 IV. LINEAR SYSTEMS

is a solution of the nonhomogeneous equation. If v is allowed to vary, then this gives


the general solution of the nonhomogeneous equation.
PROOF. To derive this equation, we look for a solution ofthe form x(t) = M(t)f(t). If
this is a solution, then

itt) = A(t)M(t)f(t) + M(t)f'(t)


= A(t)x(t) + M(t)f'(t).
Since x(t) is a solution this has to equal A(t)x(t) + g(t), so we need f'(t) = M(t)-lg(t).
Integrating from to to t, we get

f(t) = t M(s)-lg(s)ds +v
1to
for an arbitrary vector v. Substituting this for f(t), we get the form of x(t) claimed.
The above calculation can be worked backward (or a direct calculation of the deriva-
tive of the right-hand side can be made) to show that this indeed gives a solution of
the nonhomogeneous equation. The statements about the general solution follow from
Theorem 8.1 above. 0

4.9 Linear Maps


The results for linear maps are very similar to those for linear differential equations;
the groupings of the eigenvalues become those of absolute value bigger than one, equal
to one, or less than one rather than those of real part positive, zero, or negative.
Let A be an n x n real matrix. If v is an eigenvector for the eigenvalue A, then
A"v = A"V. Thus, if IAI < 1 then IA"vl = IAI"lvl which goes to zero as n goes to
infinity. Using the Jordan Canonical Form, we get a result for linear maps that is
similar to the one for linear differential equations.
Theorem 9.1. Let A be an n x n real matrix and consider the map Ax. The following
are equivalent.
(a) There is a norm I I. on R" and a constant 0 < ~ < 1 such that for any initial
condition x E R" the iterates satisfy

IA"xl. s ~"Ixl. for all n ~ o.


(b) For any norm II' on R" there exist constants 0 < Il < 1 and C ~ 1 such that for
any initial condition x E R" the iterates satisfy

IA"xl' s C~"lxl' for all n ~ o.


(c) All the eigenvalues A of A satisfy IAI < 1.
REMARK 9.1. As for a linear differential equation, a norm as in Theorem 9.1(a) is called
an adapted nann.
The proof of this theorem is similar to that for linear differential equations with
summations replacing integrals, and will be omitted. With this result, a linear map
induced by a matrix all of whose eigenvalues have absolute values less than one is said
to be a linear contraction, and the origin is called a linear sink or attracting fixed point
for this map. If all the eigenvalues have absolute values greater than one then A is
automatically nondegenerate (nonzero determinant) and so A"x is an expansion for
n > 0 and a contraction for n < o. The map induced by A is called a linear expansion,
and the origin is called a linear source or a repelling fixed point for this map.
4.9 LINEAR MAPS 117

Theorem 9.2. Assume that B Il1Id C are two n x n matrices which induce invertible
linear contractions (01' both induce linear expansions), Bx Il1Id Cx. FUrther, assume
that B Il1Id C belong to the same path components of Gl(n, R), the set of invertible
n x n matrices. Then the linear map Bx is topologically conjugate to the linear map
Cx.
REMARK 9.2. The General Linear Group, Gl(n, R), has two path components, those
with positive determinll1lt and those with negative determinll1lt. (Compare with The-
orem 9.6 at the end of this section.) Thus if A and B are two elements of Gl(n, R),
both of which have positive determinant (both are orientation preseroing), then A and
B are topologically conjugate. Similarly, if both have negative determinant (both are
orientation reversing), then they are conjugate.
PROOF. The idea of the proof is very similar to that for flows, but the conjugacy on
the "fundamental domains" is different.
Let BI be a curve in Gl(7I. R) with Bo = C and B\ = B.
Take norms liB and lie given by Theorem 9.1(a) such that Band C are contractions
in terms of these respective norms. Let

DB = {x E R" : Ixls $ I} and


Ss = {x E R" : Ixls = I}.
so Ds is the standard unit ball and SB is the standard unit sphere in R" in terms of
the norm liB. Similarly, we define De and Se using the norm lie. The Collowing two
"annuli"

As = e1[DB \ B(DB)I and


Ae = e1[De \ C(De)1

are called fundamental domains because for any x '" 0 there is a j such that B1 (x) E AB.
and for most x '" 0 there is a unique such j.
We need to construct a conjugacy ho between the two linear maps on their respective
fundamental domains, ho : As ..... Ae, such that if x, Bx E AB then ho(Bx) = Cho(x).
After constructing ho on As, we extend it to all ofR" as we did Cor differential equations.
The conjugacy ho needs to be a homeomorphism between As and Ae taking the
outer boundary to the outer boundary and the inner boundary to the inner boundary.
On the outer boundary we take ho to be essentially the identity (radial projection from
SB to Se), and on the inner boundary we take ho to be essentially the map CB- 1
(plus a radial projection onto CSe). If ho has these values on the two boundaries, it
is not hard to see that it is a conjugacy on AB. To construct ho with these properties
we separated the adjustment of the radius from the change in the "angle" variable in
S"-I. For the radial variable, the radial line segments from the inside boundary to
the outside boundary have different lengths for AB and Ae. In order to adjust this
radial component of the points, we first define a map hB from the standard annulus
[0,11 x 8"-1 to the Cundamental domain AB. Similarly, we define he from [0,11 x 8"-1
to Ae. Having adjusted the radial component by means of the maps hB and he, we
define a map H from [0,11 x 8"-1 to [0,11 x S"-1 using the path from C to B in
Gl(n,R). This map H preserves the t value in [0,11 and is the map in 8"-1 induced by
BIB-I. It is the identity for t = 1 and CB-1 for t = 0; thus H makes the adjustments
of the "angular" component in 8"-1 in a manner so that ho = he 0 H 0 h'i/ satisfies
the necessary conjugacy equation for points on the outer boundary: if x E Ss then
Cho(x) = ho(Bx).
118 IV. LINEAR SYSTEMS

FIGURE 9.1. The Construction of ho


Beginning the actual constructions, let
hB : [O,IJ X sn- I --+ AB
be given by
hB(t, x) = TB(t, x)x
where TB is the affine map in t such that (i) TB(I, x) = Ixll/ so hB(I, x) = x/lxlB E
SB and (ii) hB(O,x) = TB(O,X)X E BSB. For TB(O,X)X to be in BSB, we need
TB(O,x)B-I X E SB, TB(O,x)IB-IXI B = 1, or TB(O,X) = IB-IXIIiI. Since we choose TB
to be an affine map in t,
t 1- t
TB(t,X) = IxlB + IB-IxIB
For any x E sn-I, hB(l, x) = x/lxlB E SB is on the outer boundary of AB, and
hB(O,X) = x/IB-IxIB E BSB is on the inner boundary of A B . Thus hB takes the
radial line segment [0,11 x {x} onto the radial line segment from x/IB-'xIB to x, i.e.,
the line segment from the inner boundary of AB to the outer boundary of AB. For use
in verifying the conjugacy condition, note that for any y E SB, letting x = By/lByl,
By By
hB(O'IByl) = lylB = By or

h'i/(By) = (0, I!~I)'


Similarly, we define Te and he with
he: [0,11 x Sn-I --+ Ae.
Converting the formulae for hB into those for he, for any x =I 0, x/lxl E sn-I,

and
4.9 LINEAR MAPS 119

As stated above, next we use the curve Be in GI(n, R) with Bo = C and Bl = B to


define
H: [0,11 x sn- I -+ [0,11 X sn- I
in a manner which preserves the "radius" t E [0,11 and continuously changes the "an-
gular" component in sn-I. In fact, define

On the outer boundary {I} x sn- I , H(l,x) = (l,x) is the identity. On the inner
boundary {o} X sn-I,

so

which is essentially the map x ~ CB-I X .


Finally, we combine these maps and define

ho = hc 0 H 0 h"i/ : As -+ Ac.

First note that ho is one-to-one and onto because each of the maps hc, H, and hB are.
Next we check that ho is a conjugacy whenever x, Bx E A B , i.e., for x E SB. For these
xE SB,

Cho(x} = Chc 0 H 0 h8 1(X)

= Chc 0 H(l, 1:1)


= Chc(l, 1:1)
Cx
= Ixlc·

On the other hand,

ho(Bx) = hc 0 H hSI(Bx)
0

Bx
= hc 0 H(O, IBxl)
Cx
= hc(O, ICxl)
Cx
= Ixlc
= Cho(x}.

This checks the conjugacy of ho on AB.


Having defined ho from AB to Ac, we extend it to all of Rn by

~-j(X)ho(Bj(X)X}
forx=O
h(x) = { forx~O
120 IV. LINEAR SYSTEMS

where Bi(x)x E AB. The only difference from the case of the differential equations is
that we need to check that h is well defined. However, if both Bjx, Bj+I X E A8 then

C-i-1ho(Bj+I X ) = C-j-1ho(BBjx)
= c-j-1Cho(Bjx)
= C-iho(Bi x )

from the conjugacy property for ho verified above. Thus h is well defined. The reader
should also check that it is continuous at points on B(88). The continuity at 0 is similar
to before: if Xi converges to zero, then j(Xi) goes to minus infinity, so C- j(x.) ho(BJ(x; )x)
goes to O. The rest of the proof is similar to that for differential equations. 0
Having discussed linear contracting maps and linear expanding maps, we now con-
sider the possibility of both types of eigenvalues. If all the eigenvalues of A have absolute
value which is not equal to one, Ax is called a hyperbolic linear map. In general, we
define the eigenspaces much as before,

EU = span{v : v is a generalized eigenvector


for an eigenvalue Awith IAI > I},
Ee = span {v : v is a generalized eigenvector
for an eigenvalue A with IAI = I}. and
E' = span{v : V is a generalized eigenvector
for an eigenvalue Awith IAI < I}.
These subspaces are called the unstable eigenspace, center eigenspace, and stable eigen-
space, respectively. Much as before, E" is characterized as vectors which exponentially
contract as n -+ 00, and EU is characterized as vectors which exponentially contract as
n -+ -00. The center subspace, Ee , is characterized as vectors which grow subexponen-
tially as n -+ ±oo, Le.,

Next we prove the preservation of the hyperbolic nature of a linear map under pertur-
bation, which is analogous to Theorem 6.2 for differential equations. This result is then
used to prove that nearby hyperbolic linear maps are topologically conjugate, which is
analogous to Theorem 7.1. Let H(n, R) be the set of matrices A in Gl(n, R) such that
A induces a hyperbolic linear map, Ax.
Theorem 9.3. Assume that A E H(n, R). If B is near enough to A, then B E H(n, R)
and the splitting for B, E~ eE~, bas suhspaces with the same dimensions as those for
A. Moreover, as A varies continuously to B the subspaces also vary continuously.
PROOF. The proof is the same as that for differential equations. Let p(A) be the
characteristic polynomial for A. Let "( be a simple closed curve inside the open unit
disk of the complex plane that surrounds all the eigenvalues of A with negative real part
and is oriented counterclockwise. Let "(' be a simple closed curve outside the closed unit
disk of the complex plane that surrounds all eigenvalues of A with positive real part but
does not surround the unit disk, again with counterclockwise orientation. Then

_1_
271"i
f
"f
p'(z) dz
p(z)
4.9 LINEAR MAPS 121

counts the number of roots of p(z) inside "Y and varies continuously with the change
from p(z) to the characteristic polynomial q(z) for B. A similar statement holds for
the unstable eigenvalues. Thus for B near enough to A, all the roots for q(z) are either
inside "Y or -y' and so B is hyperbolic with the same dimensional subspaces as A.
As before, the projections,

1 .1(z/ - C)-Ivdz,
Pv = -2
71" .,

onto the stable subspace vary continuously with changes of C from A to B, so the
subspace E~ varies continuously. See Section 148 of Riesz and Nagy (1955). A simUar
statement holds for Eil. Therefore the subspaces of B are near those for A. 0
Theorem 9.4. Assume that A is an n x n matrix which induces a linear hyperbolic
map Ax. If B is near enough to A, then Bx is topologically conjugate to Ax.
This result follows from the cases of linear contractions and linear expansions (The-
orem 9.2), and the fact that the dimension of the splitting does not change (Theorem
9.3) in the same way that it did for differential equations.
In contrast to the above results which proved that many different linear contractions
are topologically conjugate, there is the following standard result about linear conjugacy
which implies that very few different linear maps are linearly conjugate.
Theorem 9.6. Let Band C be two invertible matrices in Gl(n, R). Assume Band C
are linearly conjugate, i.e., there exists an M in GI(n,R) with C = MBM-I. Then B
and C have the same eigenvalues.
REMARK 9.3. This result is a standard fact in linear algebra. Compare with the proof
of Theorem 7.2. The details are left to the reader.
REMARK 9.4. By the uniqueness of the Jordan Canonical Form, if Band C are linearly
conjugate, then they have the same Jordan canonical form if the blocks are ordered In
the same way.

Now we return to the question of the path components of Gl(n, R) and H(n, R). Let
Cont(n, R) be the set of matrices A E Gl(n, R) such that A induces a contracting linear
map, Ax. Let Exp(n,R) be the set of matrices A E GI(n,R) such that A induces a
expanding linear map, Ax. We have the following result.
Theorem 9.6. (a) Let A, B E Cont(n, R). Assume that det(A) and det(B) have the
same sign. Then there is a curve {At E Cont(n,R) : 0:5 t:5 I} such that Ao = A and
Al =B.
(b) Let A, BE H(n, R). Let E~ and E~ be the eigenspaces for A and E~ and Eil be
the eigenspaces for B. Assume that (i) the dimension ofE~ equals the dimension ofE~
(so the dimension of E~ equals the dimension of EiI), (ii) the signs of det(AIEO) and
det(BIEO) are the same, and (iii) the signs ofdet(AIEU) and det(BIEU) are the same.
Then there is a curve {At E H(n,R) : 0:5 t:5 I} such that Ao = A and Al = B.
REMARK 9.5. For u = s, u, det(AIEO') is calculated in terms of a basis of EO'. The
condition that det(AIEO') = ±1 is really the condition that A preserves or reverses the
orientation on the invariant subspace EO'.
REMARK 9.6. Because the determinant is a continuous function, if A,B E Cont(n,R)
have sign(det(A» # sign(det(B», then there can not be a curve in Gl(n, R), let alone
in Cont(n, R), connecting A and B. Thus the result of this theorem is sharp.
PROOF. We break up the proof into lemmas and leave the proof of one main step to
the exercises.
122 IV. LINEAR SYSTEMS

Lemma 9.7. Let A be the matrix given as follows:

A=(~ -:).
Assume T = (02 + (32)1/2 ~ O. Let 8 be chosen so that Q = T cos(8) and (3 = T sin(8)
and let 8, = (1 - t)8. Define the curve of matrices
A = (TCOS(8t ) -TSin(8e))
t T sin(8t ) T cos(8t ) ,
:s :s
for 0 t 1. Then (i) Ao = A, (ii) Al is a diagonal matrix, and (iii) for 0 t 1, the :s :s
eigenvalues of At have the same absolute value as those of A, namely T. Thus we have
given a curve of matrices from a block in the Jordan Canonical Form corresponding to
a complex eigenvalue to a diagonal block.
The validity of this lemma is obvious.
Lemma 9.S. Assume A is a real diagonal matrix, A = diag(AJ, ... , An), in H(n, R).
Let
if 1 < Aj
~.5
if 0 < Aj < 1
/-Ii = { -0.5 if -1 < Aj < 0
-2 ifAj <-1.
Define Ai,t = (1 - t)Aj + t/-lj, and At = diag(AI,t, ... , An,d for 0 t :s :s
1. Then (i)
:s :s
At E H(n, R) for 0 t 1, (ii) Ao = A, and (iii) AI is a diagonal matrix whose entries
are ±2 and ±0.5.
The verification of the lemma is direct and is left to the reader. See Exercise 4.12(b).
Exercises 4.11 and 4.12 ask the reader to prove the following result using Lemma 9.7.
Lemma 9.9. (a) Assume A E Cont(n, R). Then there is a curve
{At E Cont(n,R) : O:S t:S I}
such that (i) Ao = A and (ii)
diag(0.5, ... , 0.5) if det(A) > 0
AI= { . .
dlag(0.5, ... ,0.5, -0.5) If det(A) < O.
(b) Assume A E H(n,R). Then there is a curve {At E H(n,R) : 0 :s t :s I} such
that (i) Ao = A and (ii)
AIlE" = { d~ag(0.5, ... , 0.5) ~f det(AIE:) > 0
dlag(0.5, ... , 0.5, -0.5) If det(AIE ) < 0,
and
Al lEu = { d~ag(2, ... ,2) ~f det(AIE:) > 0
dlag(2, .. , ,2, -2) Jf det(AIE ) < O.
PROOF OF THEOREM 9.6. The proofs of parts (a) and (b) are essentially the same so
we consider only part (a). By Lemma 9.9, since A, B E Cont(n, R) and det(A) and
det(B) have the same sign, there are curves At and B t in Cont(n, R) with Ao = A,
Bo = B. and AI = B 1 • We can combine these two curves by defining
C t = { A2t for 0 t 0.5 :s :s
B 2- 2t for 0.5 t 1. :s :s
:s
Then Co = A. CI = B, and C 1 E Cont(n,R) for O:S t 1. This completes the proof.
o
4.9.1 PERRON-FROBENIUS THEOREM 123

4.9.1 Perron-Frobenius Theorem


In this subsection we return to considering irreducible matrices. These were defined
for transition matrices, but the definition makes sense if all the entries are nonnegative
as we give below. The proof of the theorem of this section is not essential elsewhere in
this book, although we do refer to the theorem in a few situations.
Definition. An n x n matrix A = (aij) is called nonnegative provided all the entries
are nonnegative, aij ~ O. An n x n matrix A = (aij) is called positive provided all the
entries are positive, aij > O. It is called eventually positive provided it is nonnegative
and there is an integer m > 0 for which Am is positive. An n x n matrix A = (ai;)
is called reducible provided that there is a pair i, j with (Am)ij = 0 for all m ~ 1. It
is called irreducible provided that it is not reducible, i.e., for 1 $ i,i $ n there exists
m = m(i,i) > 0 such that (Am)i; ." O. If A Is eventually positive and irreducible, then
(Ai)i; > 0 for all i ~ m.
With these definitions, there is the following result of Perron, see Perron (1901) and
Gantmacher (1959).
Theorem 9.10 (Perron-Frobenius). Assume A is an eventually positive matrix.
(a) Then there is a real positive eigenvalue Al which is a simple root of the charac-
teristic equation such that if Aj is any other eigenvalue (with i > 1) then Al > IAjl.
The eigenvector vI for the eigenvalue At can be chosen with all entries strictly positive,
vf > 0 for 1 $ i $ n. In fact, all other eigenvectors, whether for real or complex
eigenvalues, have components of both signs. This is true for both eigenvectors on the
right and on the left.
(b) If Am has all positive entries and x is any unit vector with Xi ~ 0 for all i, then
Ajx/IAjxl converges to vl/lvtl as i goes to infinity, and there are positive constants
C I and C 2 such that C I A1 $ IAjxl $ C 2 A{ and CtA{ $ (A;X)i $ C 2 A{ for i ~ m and
1 $ i $ n. (Here (Aix)i is the i-th component of Aix .)
REMARK 9.7. Frobenius proved a generalization of this result for irreducible nonnega-
tive matrices (Frobenius, 1912). A good general reference for these results and a proof
of the more general result is Gantmacher (1959).
REMARK 9.8. One application of this theorem is to a system with a finite number
of states where probabilities of making the transition from one state to another are
known. Assume there are n states and Pi; is the probability of making the transition
from state i to state i for 1 $ i,i $ n. Let P = (pijh~i.;~n be the transition matrix.
Since the probability of making the transition from state i to some state is one, it is
assumed that Li Pij = 1. Also, assume Pi; > 0 for all pairs (i, i), so there is a positive
probability of transition from any state to any other state. Note that P preserves the set
of all distributions x for which E j Xj = 1, i.e., x represents the distribution within the
finite states {I, ... , n}. Also note that (1, ... , 1) is a left eigenvector for the eigenvalue
1, (1, ... ,I)P = (L,Pilo''',L,P,n) = (1 ..... 1). Since all the entries of (1 ..... 1)
are positive, Al = 1 is the largest eigenvalue in absolute value by the conclusion of
the Perron Theorem. The eigenvalue At = 1 also has right eigenvector s· with all the
s; s;
> O. Also s· can be normalized so that L; = 1. This vector s· represents the final
steady state distribution within the states. because if x is any initial distribution with
all the x; ~ 0 and Lj Xj = 1, then p"x converges to s· as k goes to infinity because
of the inequalities between the eigenvalues and the fact that P preserves vectors with
Lj Xj = 1.
PROOF. By replacing A with Am. we can assume that A is positive. (Or perhaps only
take powers Ai for i ~ m.)
124 IV. LINEAR SYSTEMS

Let {ei : I ~ j ~ n} be the standard basis of Rn. Let

be the first "quadrant", sn- I = {x : Ixl = I} be the sphere of all unit vectors, and

A = Qnsn-I

be the simplex of unit vectors in the first quadrant.


The matrix A induces a map, lA, on sn- I by
Ax
IA(X) = lAx!,

Because A is positive, Aei = L. a.)e' E int(Q). Applying the map fA to these unit
vectors, IA(A) C int(A,Sn). where the interior is taken relative to sn. The simplex
A is homeomorphic to Dn-I, the closed unit disk in R n- I , i.e., A is the image of a
homeomorphism from Dn-I into Rn. By the Brouwer Fixed Point Theorem, IA must
have a fixed point, Vi, in A. Then Av l = AIV I for some positive real number AI' Thus
AI is an eigenvalue with unit eigenvector vi. (Here vi is a column vector because it is
multiplied on the right of A.) The components of vi are all positive because Vi E int(Q).
The above argument can be repeated with the action of A on row vectors where the
vector is multiplied on the left. We obtain a left eigenvector Wi with all positive entries
for some real eigenvalue A· where wi is a row vector, wi A = A·W I . Alternatively, apply
the above result to AIr. (We do not use the fact that A· = Al but this is true. Because
both Wi and Vi have positive entries WIV I > 0 and A·WIV I = Wi Av l = AIWlv l , so
A· = Ad
Now define the (n - I)-dimensional subspace W by

Thus, wi is the normal (co)vector to this subspace. Any nonzero vector x E W has
some component positive and other components negative, because all the components
of Wi are positive and wlx = O. It follows that W is invariant by A: if x E W then
wl(Ax) = (wIA)x = A·WIX = 0 so Ax E W. Thus A restricted to W has n - 1
eigenvalues (with multiplicity), A2," ., An. We need to prove that Al > IA) I for j > 1.
To prove this claim, first take the case where Ai is real with real eigenvector vi E W
of length one. Since vi E W, it must have both positive and negative components
as claimed in the theorem. Let V be the two dimensional subspace spanned by vi
and v j . We change to the inner product on V which makes these two vectors into an
orthonormal basis. Let Sl be the unit sphere in V in terms of this new inner product.
Any x E Sl can be represented by

for some cpo Then

A[cos(cp)vi + sin(cp)vIJ = Aj cos(cp)vi + Al sin(cp)v l , so


j
( () j +smcpv
I ACOSCPV . ( ) I)
=.
cos(cp)v + (AdAj)sin(cp)vl
.
Icos(cp)v) + (AdAj)sin(cp)vll
4.9.1 PERRON-FROBENIUS THEOREM 125

There is a I{)o with 0 < I{)o < 1r12,0 < cos(l{)o) < 1, such that Xo = cos(l{)o)v j +sin(tpo)v 1
is on the boundary of t:J. = Q n 8 1 relative to 8 1 , B(t:J., 8 1 ). Because A is a positive
matrix, f~xo E int(t:J.,8 1 ) for k ~ 1. If I~jl > ~J, then by the form of fA above,
f~(Xo) would converge to v J as k goes to infinity, which contradicts the fact that
f~Xo E int(t:J., 8 1 ). On the other hand, if I~jl = ~l then ~~ = ~I and f~(Xo) would
equal Xo, which again contradicts the fact that f~Xo E int(t:J., SI). Therefore when ~j
is real, the only possibility left is that ~l > I~jl.
Next we consider the case of a complex eigenvalue, ~j = -y[cos(!J» + isin(!J»J. Then
there is a a complex eigenvector v j +iw j with vi, w j E W such that Avj = -ycos(!J»vi +
,,),sin(!J»w j and Awj = -,,),sin(!J»v j + ,,),cos(!J»w j . Since vj, w j E W, both of these
vectors must have both positive and negative components as claimed in the theorem.
Let V be the three dimensional subspace spanned by vi, v j , and w j . We change to the
inner product on V which makes these three vectors into an orthonormal basis. Let s'l
he tllfl unit sphpw in V in tl'r/ll/i of this new inner produet. Orfillr
X(O) = cos(O)v j + sin(O)wJ,
so x(O) E S2. A direct calculation shows that Ax(O) = ,,),x(O+!J», so fA(X(O» = x(O+!J»
and these points move on the unit circle in the plane spanned by v j and wi. Any point
in 8 2 C V can be represented as cos(l{)x(O) + sin(l{)v 1 , and
A[cos(l{)x(O) + sin(l{)vIJ = -ycos(l{)x(O +!J» + ~I sin(tp)v l , so
fA (cos( )x(O) + sine )v l ) = cos(l{)x(O +!J» + (~d")')s~n(l{)vl .
I{) I{) Icos(l{)x(O + !J» + (~d")') sm(l{)vll
Assume I~jl = ")' > ~I' By the form of fA given above, if cos(cp) =F 0 then
f~(cos(l{)x(O) + sin(cp)vl) converges to the unit circle in the plane spanned by v j and
wi as k goes to infinity. In particular such a point which starts in the boundary of
t:J. = Qns'l relative to S2, B(t:J.,S2), does not remain in the int(t:J.,8 2 ) as k goes to
infinity. This contradicts the fact that A is a positive matrix.
If I~jl = ")' = ~I'
fA(COS(I{)x(O) + sin(l{)vl) = cos(l{)x(O +!J» + sin(l{)v l .
Therefore all points on 8 2 are recurrent for fA. (1f!J> is a rational multiple of 211' then
all points in S2 are periodic or fAi if!J> is an irrational mUltiple of 21r then all points
in S2 are recurrent but only ±v l are periodic.) In any case f~ (B( A, S2» can not be
contained in int(A, S2) for all positive k. Again, this contradicts the fact that A is a
positive matrix.
Combining all the cases, we have proved part (a) of the theorem: ~I > I~jl for all
j>1.
To prove part (b) of the theorem, assume that x is a unit vector with Xi ~ 0 for all
j. Let a = (X, vi )/lv I 12 . Note that a > 0 because all the components of vi are positive
and all those of x are nonnegative. Then x - av l is in the subspace W defined above.
Therefore
x = av l + v·
with v· E W,
AkX = Akav l + Akv·
= a~~vl + Akv·, and
Akx Akv·
7I =av l + >F"
I
126 IV. LINEAR SYSTEMS

where
lim IAkv"1 = o.
k-oo ,\~
From the above convergence, it follows that there are positive constants 0 < C I < C 2
such that

for 1 ~ i ~ n. This completes the proof of the


theorem. o
REMARK 9.9. The above proof shows that Al > IAjl for all j > I, so that for any
x E ~, f~ (x) converges to Vi flvll as k goes to infinity. In fact, using the comparison of
the eigenvalues it follows that fA is a contraction on ~. We did not prove that fA is a
contraction on ~ directly, but merely used the Brouwer Fixed Point Theorem to get the
eigenvector for AI. After getting the eigenvector for AI, we argued that since fA maps
~ into its interior in sn, the other eigenvalues can not be larger than '\1 in absolute
value. Then as remarked above, the inequalities between the eigenvalues implies that
fAI~ is a contraction by general arguments.

REMARK 9.10. The matrix A can have complex eigenvalues and off diagonal terms in
its Jordan canonical form, i.e., l's in its Jordan form. In fact, let vi = (1, ... ,l)tr,
W = {x : x· Vi = OJ,
and v j for 1 < j ~ n be any basis for W. Let C be any matrix in terms of the basis
{v 2 •... , vn}. Clearly, C can have any Jordan canonical form. Then consider the linear
map L which preserves W, has the matrix C in terms of the basis {v 2 , .•• , v n } on W,
and L(v l ) = AIV I . Then any of the standard unit vectors can be represented in terms
of this basis,

Because v I has all positive coefficients, and for j ~ 2 the sum of the coefficients of vi
in terms of the standard basis is zero, it follows that Yj.1 > 0 for all j. Then
L(ei) = L(2:>j.iVi)
n

= AIYj.lv l + LYi.iL(v i ).
;=2
Also let A = (ai.}) be the matrix of L in terms of the standard basis, so
L(ei) = Lai}e;.

Comparing coefficients. we get that


n

aij = AIYj,1 + "L Yj.kL(V k e .


). ;
k=2

Thus for Al large enough, aij > 0 for all i and j, A is positive, and L(ei) is in the
interior of Q for all j. This proves that A can have any type of Jordan canonical form
for the eigenvalues which correspond to the eigenvectors on W.
4.10 EXERCISES 127

4.10 Exercises
Jordan Canonical Form
4.1. Let A be an n x n matrix, and Vi, ... , v n be a basis such that Av l = AV 1 and
Avi = AV i + v i - I for j = 2, ... , n. Given E > 0, find a new basis wi such that
Awl = AWl and Awi = AW j + EWj - 1 for j = 2, ... , n. Hint: Try w j = ajvj for
suitable choices of ai'
Solutions and Phase Portraits for Constant Coefficients
4.2. Find a basis of solutions and draw the phase portrait for x = Ax for each of the
following choices of A.
( a) (~ ~). (b) C ~), (c) C \2). (d) (:2 n,
( e) (~2 ~l ~), (f) (~2
o 0 3 0
0~l).
1
4.3. Consider the second order linear equation given by x = Ax.
(a) Prove that x( t) = eAtv is a solution of x = Ax if and only if p. = A2 is an eigenvalue
of A with eigenvector v.
(b) Find a basis of (four) solutions of x = Ax for

Hyperbolic Linear Differential Equations


4.4. Let Hdiff(n, R) be the set of matrices A in Gl(n, R) that induce a hyperbolic linear
differential equation, j; = Ax. Assume that A E Hdiff{n, R).
(a) Prove there is a curve {At E Hdiff{n,R) : 0 ~ t ~ I} such that (i) Ao = A and (ii)
Al has no nonzero off diagonal terms in its Jordan canonical form.
(b) Prove there is a curve {AI E Hdiff{n, R) : 0 ~ t ~ I} such that (i) Ao = A and (ii)

AdlE' =diag{-l, ... ,-l) and


AdIEU = diag(I, ... , 1).

Conjugacy and Structural Stability


4.5. Consider linear systems of constant coefficients in R3. Construct explicit conju-
gacies h : R2 -+ R3 between x = -x and the following other systems:
(a) y = -2y,
(b) Y= (~2 and
-2
(c) Y = ( 1

4.6. Consider linear systems of constant coefficients in R2. Construct explicit conju-
gacies h : R2 -+ R2 between

4.7. Suppose that A E GL{Rn) is not hyperbolic. Show that the map Ax is not
structurally stable. Hint: Consider the family of maps Ar{x) = rA(x) for r E (I-e, I +
f).
128 IV. LINEAR SYSTEMS

4.8. Let I,g : Rn -- Rn be defined by I(x) = ax and g(x) = bx where 1 < a < b. Find
an explicit formula for a conjugacy from the map 9 to the map I on all of Rn, i.e., a
homeomorphism h for which 10 h = hog. Prove that any such h which is differentiable
must have all partial derivatives at the origin equal to 0, and h- 1 does not have partial
derivatives at o.
Nonhomogeneous Equations
4.9. Prove Theorem 8.1.
Linear Maps
4.10. (This exercise gives a direct calculation of the eventual contraction of a Jordan
block with a real eigenvalue absolute value less than one. Compare with Theorem 9.1.)
Assume 0 < 1>'1 < I. Assume A is a matrix with Ae l = >.e l and Ae k = >.e k + ae k- I for
1 <k ~ 7n.
(a) Prove that

ror 71 ~ j.
(b) Prove that (;) ~ n j Ii! ~ nj.
(c) Prove that njl>,jl goes to zero as n goes to infinity.
(d) Prove that IAnekl goes to zero as n goes to infinity for any 1 ~ k ~ m.
4.11. Let Cont(n,R) be the set of matrices A in G/(n,R) that induce a contracting
linear map, Ax.
(a) Assume that A E Cont(n, R). Prove there is a curve {AI E Cont(n, R) : 0 ~ t ~ I}
such that (i) Ao = A and (ii) AI is diagonal with entries equal to either 0.5 or
-0.5. Hint: Use Lemma 9.7. Allow for 1's in the off diagonal terms of the Jordan
canonical form.
(b) Exhibit a curve from diag( -0.5, -0.5) to diag(0.5, 0.5). Hint: The curve of matri-
ces can not remain diagonal. Use Lemma 9.7.
(c) Assume A E Cont(n,R). Prove there is a curve {AI E Cont(n,lR) : 0 ~ t ~ I}
such that (i) Ao = A and (ii)
diag(0.5, ... ,0.5) if det(A) > 0
{
AI = diag(0.5, ... ,0.5, -0.5) if det(A) < o.
Hint: Use parts (a) and (b). Notice that this proves Lemma 9.9(a) for contracting
linear maps.
4.12. Let H(n, R) be the set of matrices A in G/(n, R) that induce a hyperbolic linear
map. Ax.
(a) Assume that A E H(n,R). Prove there is a curve {AI E H(n,R): 0 ~ t ~ I} such
that (i) Ao = A and (ii) AI is a real diagonal matrix. Hint: Use Exercise 4.11.
Notice that the contracting and expanding subspaces are not necessarily spanned
by the a subset of the standard basis.
(b) Assume A E H(n,R). Prove there is a curve {AI E H(n,R): 0 ~ t ~ I} such
that (i) Ao = A and (ii) AI is diagonal with entries equal to either 2, 0.5, -0.5, or
-2, i.e., prove Lemma 9.8. Notice that l's are allowed in the off diagonal terms
in the Jordan canonical form.
(c) Prove Lemma 9.9(b). More precisely, assume A E H(n,R). Prove there is a curve
{AI E H(n,R) : 0 ~ t ~ I} such that (i) Ao = A and (ii)
A IE' = { diag(0.5, ... , 0.5) if det(AIE') > 0
I diag(0.5, ... ,0.5, -0.5) if det(AIE') < 0,
4.10 EXERCISES 129

and
AIlE" = {d~ag(2, ... ,2) if det(AIE") >0
dlag(2, ... , 2, -2) if det(AIEU) < O.
Hint: Use the previous problem as well as parts (a) and (b). (Note that for
a = s, u, det(AIE") is calculated in terms of a basis of E". The condition that
det(AIE") is positive or negative is really the condition that A preserves or reverses
orientation on the invariant subspace E".)
CHAPTER V
Analysis Near Fixed Points
and Periodic Orbits

In this chapter we consider solutions of systems of nonlinear differential equations


and the iteration of nonlinear functions of more than one variable. We also consider the
phase portraits for both types of nonlinear systems. For nonlinear systems. even the
analysis near a fixed point is more complicated. Rather than just using inequalities from
the Mean Value Theorem. we must use the more complicated linear theory from the
last chapter and nonlinear theorems from differential calculus like the Inverse Function
Theorem. Implicit Function Theorem. and the Contraction Mapping Theorem. We
are able to prove that if the linearization is hyperbolic. then the nonlinear map or
differential equation is topologically conjugate to the linearization. This theorem is
simple in one dimension. but requires a proof using the Contraction Mapping Theorem
in higher dimensions. We also introduce the nonlinear invariant manifold which is
tangent to contracting directions of the linearization. called the stable manifold. and
the corresponding invariant manifold which is tangent to the expanding directions. called
the unstable manifold. The proof that these manifolds exist is again a nontrivial fact
which needs an involved proof.
As the above summary and this chapter's title indicate. this chapter concerns only the
behavior near a single periodic orbit. In Chapters VII and IX we return to consider more
global and complicated dynamics such as those for the quadratic map on the Cantor
set and the structural stability of certain classes of examples. In between. Chapter VI
is concerned with how periodic points change or bifurcate as a parameter is varied.
The first few sections present a review of differentiation of function between Euclidean
spaces as a linear map. and the important theorems from differential calculus in the
form we use them: Inverse Function Theorem. Implicit Function Theorem. and the
Contraction Mapping Theorem. As a first application of the Contraction Mapping
Theorem. we prove the existence of solutions of nonlinear differential equations. After
these beginning sections. we begin our study of properties of solutions of nonlinear
differential equations and the iteration of nonlinear maps.

5.1 Review: Differentiation in Higher Dimensions:


The Derivative as a Linear Map
The general references for this material are Dieudonne (1960). Lang (1968). Marsden
(1974). and Smith (1971). Both Dieudonne (1960) and Lang (1968) talk about the
derivative of functions between Banach spaces. The third reference. Marsden (1974). is
concerned with derivatives between Euclidean spaces. and the approach is more elemen-
tary but not as developed. Many other books on real analysis define the total derivative
as a linear map.
We are concerned with maps f : U c R" ..... Rn. where U is an open subset of
R". If all the partial derivatives exist and are continuous. then the map is said to be
continuously differentiable, C 1 • We put the partial derivatives at a point p into a single

131
132 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

matrix, DIp,
al (p»).
DIp = (ax'
J

We identify n x k matrices with linear maps from Rk to R n , L(R k,lR n ). Thus DIp
should actually be thought of as a linear map in L(R k , Rn). Which coordinate function
is used determines the row and which partial derivative is taken determines the column.
With this choice of the entries in the matrix, the i-th coordinate of the product of this
matrix by a vector v is given by

which is what it should be in terms of the partial derivatives. In fact, if all the partial
derivatives exist in U and are continuous, then using the Mean Value Theorem it can
be shown that for p E U,

I(x) = I(p) + Dfp(x - p) + R(x,p)


where
lim R(x, p) = O.
x-p Ix - pi
This latter condition is often expressed by saying that

R(x, p) = o(lx - pl).

The fact that I(x) = I(p) + Dlp(x - p) + o(lx - pI) can be taken as the definition of
I being differentiable with its derivative being the linear map DIp. If any (matrix or)
linear map A E L(Rk,Rn) exists such that

lim I(x) - I(p) - A(x - p) = 0


x-p Ix - pi

then I is said to be differentiable at p and A is called the (Frechet) derivative at p. The


derivative at p can be shown to be unique, and if the partial derivatives exist then the
above matrix is the unique matrix which gives the derivative. The definition does not
give a way to calculate the derivative but the partial derivatives do.
The space L(R", Rn) is given the operator norm as defined in the last chapter. The
derivative is called continuous provided the map D I : U --+ L(R", Rn) is continuous
with respect to the Euclidean norm on the domain and the operator norm on the space
of linpar maps. In fact, if the partial derivatives all exist and are continuous, then the
derivative is continuous, and vice versa. Such a map is called continuously differentiable
or ct.
If U is a region where I(x) is defined and C t , then we can let K = sup{IIDlxll : x E
U}. By the Mean Value Theorem,

I/(x) - l(y)1 :S Klx - yl

if the line segment from x to y is contained in U. See Dieudonne (1960), Lang (1968),
or Marsden (1974). It is possible for a function to have this latter property but not be
differentiable. In this case, if there is some K > 0 such that I/(x) - l(y)1 :S Klx - yl for
all x and y, then f is called Lipschitz with Lipschitz constant K. We write Lip(f) = K
5.1 DIFFERENTIATION IN HIGHER DIMENSIONS 133

if K is the smallest constant that works. Thus if f is CI then it is Lipschitz. However,


the function f(x) = Ixl on R is Lipschitz but not CI. In the same way, we call f a-
Holderfor a> 0 provided there is a constant K > 0 such that If(x) - f(y)1 :5 Klx-ylQ
for all x and y. Thus if a function is l-Holder then it is Lipschitz. Finally, we call a
function c r +n for r a positive integer and 0 < a :5 1 if the rlh order partial derivatives
are a-Holder.
Given the above definition of the derivative, if f : U C Rk - Rm and 9 : V c Rm -
Rn are differentiable at p and q = f(p), respectively, then the comp06ition, go I, is
differentiable at p with derivative D(g 0 f)p = DgqDfp. Thus the derivative of the
composition is the composition of the derivatives. This is called the chain rule. In
calculus books this derivative of the composition is often written out as a product of
partial derivatives. The reader should satisfy him or herself that these ways of expressing
the derivative of the composition are compatible.
Using this approach to understand the second derivative and higher derivatives seems
complicated when first encountered. As stated above, the derivative of a map 1 : U C
Rk __ Rn at a point p gives a map D f : U _ L(Rk, Rn) as the point p varies. This
second space is itself isomorphic to the Euclidean space Rkn. It can be given either the
Euclidean norm or the operator norm. (Any two norms on finite dimensional spaces are
equivalent.) The second derivative at p, D2 f p , is then the derivative of this map and
so is an element of L(Rk, L(Rk, Rn)).
However, L(R k, L(R k, Rn)) is isomorphic to bilinear maps from Rk to Rn which we
denote by L2(Rk,Rn). (Note L2(Rk,Rn) are those maps in L(Rk x Rk,Rn) which are
linear in each factor of Rk separately.) An element B E L2 (R k, Rn) acts on two vectors
in Rk and gives a vector in Rn. In terms of the standard bases {ei}~_1 of Rk and {stltzl
of Rn, two vectors v and w can be written as v = Li Viei and w = Lj w j e3, and
n
B(v, w) = ~)L 1 :5 i,j :5 kb~,jViWj)st
t=1
n
= L 1 :5 i,j :5 k(L bt,st)ViWj.
t=1

Such a bilinear map is symmetric provided B(v, w) = B(w, v) for all vectors v and w.
bl
In terms of the entries bf,j' B is symmetric provided bf,j = i for all indices i, j, and
t. The set of all symmetric bilinear forms from Rk to Rn is denoted by L~(Rk,Rn).
Returning to the second derivative, D2/p acts on two vectors in Rk and gives a vector
in Rn. If v and w are expressed in terms of the standard basis e l , ' .. ,ek as v = Li Viei
and w = Lj wJ e3, then

(Note that each. (f}f}2f}1 ) is a vector.) The fact that the mixed cross partial derivatives
Xi Xj P
are equal,
f}2 f f}2 f
f}Xif}Xj (p) = f}Xjf}Xi (p),

implies that D2 fp is a symmetric bilinear form, D2 fp(v, w) = D2 fp(w, v) for all v and
w. This process can be continued to define higher derivatives, and the r-th derivative
at p is a symmetric r-Iinear form, Dr fp E L~(Rk, Rn). We write Dr fp(vt to mean that
134 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

the r-th derivative is acting on the same vector v r times, Dr Ip(v, ... ,v). If all the
derivatives (or all the partial derivatives) of order 1 :5 j :5 r exist and are continuous
then I is said to be r-continuously differentiable, or I is cr.
Just as for C 1 (or as for functions of one real variable), if f : U C Rk -+ Rm and
9 : V c Rm -+ R n are C r with q = f(p) E V for p E U, then 9 0 f is cr in a
neighborhood of p. The formula for the second derivative can be calculated:

This is found by differentiating D(g 0 f)p = Dg/(p)Dfp using the product rule (and
the chain rule). There is one term for each place that the variable p appears in the
formula. The result is very similar to functions of one variable, but all the derivatives
are matrices which are applied to vectors. The higher derivatives of the composition
can be calculated, but the formulas are somewhat complicated. (For explicit formulas
see Abraham and Robbin (1967).)
Using these higher derivatives, we can state Taylor's Theorem. If f : U C Rk -+ Rn
is C r , then

where
lim R(x, p) = o.
x-p Ix - plr
This last condition is often expressed by saying that R(x,p) = o(lx - pn.
If I : Rk -+ Rk is cr, DIp is a linear isomorphism at each point PERk, and I is one
to one and onto, then I is called a C r -diffeomorphism. The set of C r -diffeomorphisrns
on Rk is denoted by Diffr(Rk). By the Inverse Function Theorem, it follows that 1-1
is also cr. (The statement of the Inverse Function Theorem is given in Section 5.2.2.)
We have completed our introduction to the notion of derivatives as linear maps which
we need to start studying the dynamics of functions of several variables. Other main
topics of the multidimensional differential calculus which we use are the Implicit and
Inverse Function Theorems. They are not used immediately. Th ... Implicit Function
Theorem is used in discussions of the Poincare map of a differential equation near a
closed orbit and in bifurcation questions. The Inverse Function Theorem is used in the
proof of the Stable Manifold Theorem. Because this material can be postponed until
it is needed, we put it in a separate section made up of several subsection (Sections
5.2 - 5.2.2). Section 5.2.3 deals with the related topic of the Contraction Mapping
Theorem which is used repeatedly, including in the proof of the existence of solutions
for differential equations, Section 5.3.

5.2 Review: The Implicit Function Theorem


In a course on advanced calculus or real analysis a proof of the Implicit Function
Theorem is often given but often the students do not get a very good idea about how
it is used. On the other hand, most calculus students learn to calculate using implicit
differentiation, but have no real idea of the significance of the method of calculation.
In Dynamical Systems several uses are made of the theorem for bifurcation results
and results concerning the Poincare map. In thj, "tion, we want to discuss the idea
and meaning of the theorem. The proof will be left to books on real analysis. See
Dieudonne (1960), Lang (1968), or Marsden (1974). (Also the proof of the Hartman-
Grobman Theorem later in the chapter solves a functional equation by applying a similar
5.2 THE IMPLICIT FUNCTION THEOREM 135

contraction mapping argument.) In subsection 5.2.2 we give the statement of the Inverse
Function Theorem. Finally, in subsection 5.2.3, we give a statement and proof of the
Contraction Mapping Theorem.
We will first give the statement and interpret the result for the case when the function
is real valued. In the next subsection, 5.2.1, we discuss the case when the function is
vector valued.
Theorem 2.1 (Implicit Function Theorem). Assume that U c Rn+1 is an open
set and F : U -+ R is a C r fllnction Eor some r ~ 1. For p E R"+i we write p = (x, Y)
with x E R" and y E R. Assume that (~, Yo) E U and
8F(
By ~,Yo )40
r . (i)

Let C = F(xo, Yo) E R. Then there are open sets V containing ~ and W containing Yo
with V x W c U, and /1 C r function h : V -+ W such that
h(~) = Yo (ii)
F(x, h(x» =C for all x E V. (iii)
Further, Eor each x E V, h(x) is the unique yEW such that F(x,y) = c.
This theorem states several things. First, it says that indeed the set of (x, y) that
satisfy F(x, y) = C can be represented locally as a graph, y = h(x). Next, it says
that this function is differentiable. Once we know that it is differentiable, then implicit
differentiation gives a method of calculating its derivative: differentiating F( x, h( x» =
C with respect to Xj and using the chain rule, we get
8F 8F 8h
8xj (x, h(x» + By (x, h(x» 8xj (x) = 0 so

8h 8F [8F ]-1
-8 (x)
Xj
= --8
Xi
(x,h(x»
-8 (x,h(x»
Y
.

If we combine these partial derivatives together in the (Frechet) derivative of h, we get


8h 8h
Dh(x) = (-8 (x), ... , -{J (x»)
XI Xn

8F ]-1(8F 8F)
=- [ a..(x,h(x» -8 (x,h(x»""'-8 (x,h(x».
vv Xl xn
Thus the assumption in the theorem that *(~, Yo) -F 0 is exactly the assumption
which makes the method of implicit differentiation valid. Or formally, the assumption
that the partial derivative with respect to y is nonzero is the correct assumption to
make so that y can be solved in terms of x.
A second more geometric way of understanding the assumption on the partial deriv-
ative is in terms of the tangent line. Consider the example of two variables (n = I),
F(x, y) = x 2 + y2 = 1. The tangent line at Po = (xo, Yo) is given by
VFpo' (x - xo,y - Yo) = O.
See Figure 2.1. If *(po) -F 0, then we can represent this line as a graph of y in terms
of X, i.e., we can solve for y in terms of x:
8F aF
(x - xo) ax (Po) + (y - Yo) 8y (Po) = 0,
8F
-(x - xo)-(Po)
(y - Yo) = ax
8F (Po)
By
136 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Conversely, if we can solve the tangent line equation to give y as a function of x, then
~(Po) '" 0. For example, at (xo, Yo) = (~, ~),

y _ v'3 = -(x - ~)(2)(~) = _.2... (x _ ~)


2 (2) v'3 v'3 2
2
gives the tangent line. The Implicit Function Theorem says that if the tangent line at
Po can be represented as a graph of y in terms of x, then nearby the nonlinear level set
F(x, y) = C can be represented as a graph y = h(x). The same ideas apply for n > 1.

FIGURE 2.1. Gradient and Tangent Line to the Circle

Notice that in the example, x 2 + y2 = 1 at (1,0), ~(1,0) = 0, and the method


breaks down. The tangent line is vertical so it can not be represented as a graph over
the x variable. Also, near (1,0) the level set x 2 + y2 = 1 can not be represented as a
graph of y in terms of x. Thus both the hypothesis and the conclusion fail to be true
at this point. However, ~~ (1,0) '" 0, so reversing the roles of x and y, it is possible to
solve for x in terms of y, i.e., the level set is a graph of x in terms of y.

5.2.1 Higher Dimensional Implicit Function Theorem


Similar results are true for functions that are vector valued as given in the following
theorem. (See a calculus book on implicit differentiation involving partial derivatives,
e.g. Edwards and Penny (1990), pages 797 and 806.)
Theorem 2.2 (Higher Dimensional Implicit Function Theorem). Assume that
U c R" X Rio is IllI open set IllId F : U ..... Rio is a cr function for some r 2:: 1. Represent
a point p E U by P = (x,y) with x E R" IllId Y E Rk, and the coordinate functions of
F by II> F = (/1,." ,J,.). Assume that for (Xo,Yo) E U
ali
(ayj (Xo, YO») I $i,j$A:

is an im'ertible k x k matrix (nonzero determinllllt). Let C = F(Xo,yo) E Rk. Then


there are open sets V containing Xo IllId W containing Yo with V x W c U, IllId a cr
function h : V ..... W such that

h(Xo) = Yo
F(x, h(x» = C for all x E V.

FUrther, for each x E V, h(x) is the unique yEW such that F(x,y) = c.
5.2.2 THE INVERSE FUNCTION THEOREM 137

Just as in two dimensions, once it is known that h is a C r function, it is possible


to solve for the matrix of partial derivatives, (::;)l Si 9 .1 s ;sn in terms of the two

( aIt)
.
matnces f t' I d '
0 par la
.
envatlves ax; lStS".lS;Sn an d (aOy,
It) lSt.iS":

so

Or if we use the notation of DhXo for the Frechet derivative of h, DxFCXo.yo) for the
matrix of partial derivatives with respect to the x/s, and DyFCXo.yo) for the matrix of
partial derivatives with respect to the Yk'S, then this equation can be written as

DhXo = -(DyFCXo.yo»)-l DxFCXo.yo)·

Notice that this formula is very similar to that for a real valued function F, where
matrices have replaced numbers and the inverse of a matrix has replaced dividing by a
number.
Also, the theorem can be interpreted to say that if we can solve the linear (affine)
equation

F(Xo, Yo) aI, (Xo, Yo) ) (x -


+ (-a Xo) + (ali
-a (Xo, Yo) ) (y - Yo) = c
x; Yj
for y in terms of x, then locally near (Xo,Yo) we can (theoretically) solve the nonlinear
equation F(x,y) = C for y in terms of x. This means that if implicit differentiation
works then indeed locally y is a differentiable function of x. Also the above linear
equation gives the tangent space to the level set F(x,y) = c. If this linear tangent
space can be represented as a graph with y given as a function of x, then the nonlinear
equations can also be represented as a nearby graph.
Just as in two dimensions, it might be that at some points

is not invertible. However, if the k x (n + k) matrix

which is the Frechet derivative at (Xo, Yo), has rank k, then it is possible to select k
columns that give an invertible submatrix. The theorem then says that the correspond-
ing k variables can be solved near the point in terms of the remaining n variables to
give the level set F(x,y) = C as a graph.
For a more thorough treatment see Dieudonne (1960) or Lang (1968).

5.2.2 The Inverse Function Theorem


The Inverse FUnction Theorem can easily be proved from the Implicit FUnction The-
orem and vice versa. However, although both theorems involve the assumption that a
matrix of partial derivatives is invertible, they seem very different.
138 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Theorem 2.3 (Inverse Function Theorem). Assume that U c an is an open set


and / : U -+ Rn is a C r function for some r ;:: 1. Assume that for Xo E u, D/xo is
an invertible linear map (matrix). Then there exist open sets V containing Xo and W
containing Yo = /(xo) and a C r function 9 : W -+ V which is the inverse of / on V:
go/(x) = x for x E V and /0 g(y) = y for YEW. Further D9f(x) = [D/x ]-'.
One way of interpreting this theorem is that if the linearized (affine) equations (func-
tion) y = A(x) = /(Xo) + D/xa(x - Xo) have an inverse, then the nonlinear equations
y = /(x) have an inverse near x = Xo.
In fact, in the proof of the stable manifold theorem we need a more precise statement
of the neighborhoods on which the function is one to one. We give this statement in
the following theorem whose proof we leave to the exercises. See Exercise 5.3. (Also see
Hirsch Pugh (1970) and Lang (1968).) The statement of the theorem uses the minimum
norm of a linear map which we defined in the last chapter. Also for r > 0, let

8(X,r)={yElR n : Iy-xl::;r}

be the closed ball of radius r centered at x.


Theorem 2.4 (Covering Estimate). Assume that U C lRn is an open set and / :
U -+ Rn is a C l function. Assume that Xo E U, and that L = D/xo is an invertible
linear map with a bounded linear inverse. Let Yo = /(Xo). Take any {J with 0 < (J < 1.
Let r > 0 be such that (i) 8(Xo,r) C U and (ii) ilL - D/xll ::; m(L}(l - (J) for all
x E 8(xo, r). Then

/(8(Xo,r» ::> {Yo} + L(8(0,{Jr»::> 8(Yo,m(L){Jr).


Moreover, every point y E {yo} + L(8(0, (Jr» has exactly one preimage x E 8(xo, r),
/(x) = y, so the inverse function 9 is a C' function from

{yo} + L(8(O, (Jr» ::> B(yo, m(L){Jr)

into B(Xo, r).


As stated above, we leave the proof of the covering estimates to the exercises. See
Exercise 5.3.

5.2.3 Contraction Mapping Theorem


In this section we consider maps on some metric space Y which decrease distance, so
called contraction mappings. Finding the fixed points of a contraction map can itself be
thought of as a problem in the dynamics of iteration. The theorem below shows that
such a map 9 has a unique fixed point. Besides being interesting in itself, this result is
used in many of the proofs of other theorems. In various proofs, a map is constructed
whose fixed point gives the desired conclusion. For example, later in this chapter we
prove that a nonlinear map with a "hyperbolic" fixed point is conjugate to a linear
map in a neighborhood of the fixed point (Hartman-Grobman Theorem). This result is
proved by constructing a map e on a set of functions. The fixed point of e turns out to
be the conjugacy. A main step in the proof is verifying that e is a contraction mapping
and concluding that there exists a unique fixed point. Similarly, in the next section we
use the contraction mapping method to prove the existence of solutions of differential
equations. The contraction mapping method is also used in the proofs of the Implicit
and Inverse Function Theorems.
With this motivation, we turn to the statement and proof of the Contraction Mapping
Theorem.
~.2.3 CONTRACTION MAPPING THEOREM 139

Theorem 2.5 (Contraction Mapping Theorem). Assume Y is a complete met-


ric space with metric d, and 9 : Y -+ Y is a Lipschitz function witb Lipschitz con-
stant Lip(g) = K, < 1. Then there is a unique fixed point y., g(y.) = y.. More
specifically, if Yo is any point in Y, then {gR(YO)}REN is a Cauchy sequence and
d(yo, yO) ~ d(yo, g(yo»/[1 - Lip(g)J.

PROOF. By induction,

d(gR(YO),gR+l(yO» ~ ltd(gR-l(YO),gR(yo»
~ K,Rd(yo, g(yo».

Then
k-I
d(gR(YO),gR+k(yO» ~ ~d(gR+j(YO),gR+J+I(yO»
j=O
k-I
~ ~ K,R+jd(yo,g(yo»
j=O
K,R
~ 1- K,d(Yo,g(yo».

This latter inequality proves that the sequence gR(yO) is Cauchy and so converges to a.
limit point y •.
If there were two fixed points y. and y', then

d(y·,y') = d(g(y*),g(y'»
~ Itd(y*,y'),

which is impossible unless d(y*,y') = 0 and y. = y'. Therefore the fixed point is
unique.
To get the final conclusion, note that

d(yo,Y*) = lim d(Yo,gR(yO»


R-OO
R-I
~ n_(X)~
lim "d(gi (Yo), gi+1 (Yo»
j=O
00

= ~d(gi(Yo),gi+l(yo»
j=o
00

~ ~ K,jd(Yo,g(yo»
j=o
1
= -1-d(yo, g(yo».
-If.

This last inequality proves the desired result. o


140 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

5.3 Existence of Solutions for Differential Equations


In this section, we start our consideration of nonlinear systems of differential equa-
tions. In the last chapter we considered linear differential equations of the form :ic = Ax.
d
(As before:ic = dix is the derivative with respect to time.) A simple example of a non-
linear systems is

In general, we consider an equation of the form :ic = !(x), where x is some point
in an open set U in Rn (or on some manifold like the torus) and ! : U c Rn _ Rn
is a differentiable function. The function! is called a vector field because it assigns a
vector !(x) to each point in U. We want to consider the solution, x(t), with x(to) some
prescribed value. More precisely, given Xo and to, we want x(t) to be defined on an
open interval of times, 1, with to E 1, x(to) = Xo, and

d
dix(t) = !(x(t)) (t)

for t in 1. Thus the tangent vector to the solution curve, :ic(t), is equal to the vector
field! evaluated at the position at this time, !(x(t)).
In Chapter IV, we used the concept of a flow when discussing topologically conjugate
flows. Here we reintroduce this notion which is used extensively in the rest of the book
and make the connection with the solutions of a nonlinear differential equation. Given
the differential equation :ic = !(x), we let rpl(Xo) be the solution x(t) with the given
d
initial condition Xo at t = 0: rpO(Xo) = Xo and dirpl(Xo) = !(rpl(Xo)) for all t for which
it is defined. We also sometimes write rp(t, Xo) for rpl(Xo). (Other books often write
rpl(Xo).) The function rpl(Xo) is called the flow of the differential equation. The function
! defining the differential equation is also called the vector field which generates the flow.

The first theorem below shows that the solution is uniquely determined by the initial
conditions Xo and the time t; the notation of the flow rpl(Xo) emphasizes this depende~ce.
Later in the section we show that rpl(Xo) is a continuous function of the initial condition
Xo. The final important property of the flow is the group property: '1'1 0 rp'(Xo) =
rpl+·(Xo). Then '1'-1 0 '1'1 (xo) = rpO(Xo) = Xo so '1'-1 = ('1'1)-1, and for fixed t, '1'1 is a
homeomorphism on its domain of definition. (There might be some points for which
the solution is not defined up to time t.)
Before verifying these properties for the flow generated by a differentiable vector field,
we summarize these important properties of a flow. Since we occasionally refer to flows
on metric spaces which are not the solutions of a differential equation on Rn, we state
the definition in this context.
Definition. For a metric space X, any continuous map 'I' : U c R x X - X defined
on an open set U ::> {OJ x X is called a flow provided (i) it satisfies the group property
'1'1 0 rp'(Xo) = ",1+·(Xo) and (ii) for fixed t, '1'1 is a homeomorphism on its domain of
definition.

We now tum to verifying the properties of the flow generated by a differentiable


vector field on Rn.
5.3 EXISTENCE OF SOLUTIONS 141

Theorem 3.1 (Existence and Uniqueness of Solutions of Ordinary Differential


Equations). Let U c Rn be an open set, and f : U --+ Rn be a Lipschitz function
(or Cl). Let Xo E U and to E R. Then there exists an a > 0 and Ii solution, x(t),
of x = f(x) defined for to - a < t < to + a such that x(to) = Xo. Moreover, if y(t)
is another solution with y(to) = Xo, then x(t) = y(t) on their common interval of
definition about to.

PROOF OF EXISTENCE. For Xo E U take b > 0 such that B(Xo,b) == {x: Ix - Xol $
b} cU. The function f is Lipschitz so there is a K > 0 and M > 0 such that
If(x) - f(y)1 $ Klx - yl and If(x)1 $ M for all x,y E B(Xo,b).
If x(t) is a solution with x(to) = Xo. then

x(t) = Xo + l'
to
x(s) ds = Xo + l'
to
f(x(s» ds.

Conversely, if x : J --+ Rn satisfies (*), then x is a solution of x = f(x) with x(to) = Xo.
Thus to get solutions to the ordinary differential equation we find solutions of (*).
To this end, for y : J --+ B(Xo, b), we define

F(y)(t) = Xo + J~ f(y(s» ds.

The idea is to show F is a contraction on some function space which has a fixed point
which satisfies equation (*). and thus is a solution of the differential equation.
We need to define the function space on which F acts. First we need to specify the
length of the interval J. Take a with a < min {bIM, II K} and let J = [to - a, to + al.
We are going to consider F as acting on potential solutions defined for t in J. We take
a < blM so that F(y)(t) does not leave B(Xo,b) for t in J. We take a < 11K so that
F is a contraction by A = aK.
We now explicitly define the function space S of potential solutions on which Facts.
Let
S = {y: J --+ B(Xo,b) : y is CO, y(to) = Xo,Lip(y) $ M},
the space of M-Lipschitz curves that go through Xo at t = to and take their values in
B(xo, b). We put the CO -sup-norm on S: for y, z E S we set

lIy - zllo = sup{ly(t) - z(t)1 : t E J}.

With this norm, it can be shown that S is a complete metric space.


We need to show that F preserves S. i.e., if y is in S then Yl = F(y) is also in S.
Clearly, Yl is continuous with F(y)(to) = Xo. Next,

This shows that F(y) is Lipschitz as a function of t with Lip(F(y» $ M. Then for
It - tol $ a, Iy,(t) - Xol = Iy,(t) - ydto)1 $ Mit - tol $ Ma < b 80 F(y)(t) takes
values in B(Xo, b) for It - tol $ a. Thus we have shown that for y in S, F(y) is in S, 80
F: S --+ S.
142 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Finally, to show that F is a contraction on S,

IIF(y) - F(z) 110 = sup IF(y)(t) - F(z)(t) I


IEJ

~ sup I !' If(y(s» - f(z(s»1 ds I


IEJ 110
~ sup I !' Kly(s) - z(s)1 ds I
IEJ 110

~ sup I !' Klly - zllo ds I


IEJ 110
~ Kally - zllo
~ AllY - zllo.
Thus F is a contraction by A on S and so has a unique fixed point x· in S. But a fixed
point of F clearly satisfies equation (.). This proves the existence of a solution of the
differential equation. 0
The uniqueness of the fixed point proves that the solution is unique among the curves
which are M-Lipschitz. For the proof of the uniqueness of the solutions among all curves
together with the continuity of solutions on initial conditions, we use the following result
which is called Gronwall's Inequality.
Theorem 3.2 (Gronwall's Inequality). Let vet) and g(t) be continuous nonnegative
scalar functions on (a, b), a < to < b, C ~ 0, and

vet) ~ C + I !' v(s)g(s) ds I for a < t < b.


110
Then
vet) ~ C exp (11. 1
g(s) ds I).
10
PROOF. First we consider the case for to ~ t < b. It is not possible to differentiate
inequalities and retain the inequality, so we define

U(t) = C + !' v(s)g(s) ds.


110
Then, vet) ~ U(t) and we can differentiate U. In fact, U'(t) = v(t)g(t) ~ U(t)g(t).
First, if C > 0, then U(t) > 0 so

U'(t) < (t)


U(t) - 9 ,

log (g(~?») ~ 1: g(s) ds, and

U(t) ~ C exp ( (' qt..) ds)

since U(to) = C. Using the fact that vet) :_ l ttl, we are done when C > 0 and
to ~ t < b.
5.3 EXISTENCE OF SOLUTIONS 143

If a < t :5 to and C > 0, then we define

U(t) = C + [0 v(s)g(s)ds,

so U'(t) = -v(t)g(t) 2: -U(t)g(t). Then,


U(t ) [ '0
log ( U(;) ) 2: - I g(s) ds, and

U(t) :5 C exp ([0 g(s) ds).

This completes the modifications for C > 0 and a < t :5 to.


Next, if C = 0, we can take Cj > 0 which converge to zero and have the assumed
inequality true for C j • By the first case,

vet) :5 Cj exp (Ii g(s) ds I)·

Since this last term goes to zero as j goes to infinity, we get that vet) == 0 in this case,
which verifies the conclusion. 0
We can now give the proof of uniqueness. However, we give the proof of continuity
with respect to initial conditions at the same time.
Theorem 3.3 (Continuity with Respect to Initial Conditions). Witb tbe_
Bumptions of Tbeorem 3.1, tbe solution 'P'(Xo) depends continuously on tbe initial con-
dition Xo.
PROOF OF UNIQUENESS AND COIII.TINUITY. Assume that x(t) and yet) are two solu-
tions with x(to) = Xo and y(to) = Yo. Then,

x(t) - yet) = Xo - Yo + J~ J(x(s» - J(y(s» ds.

Let vet) = Ix(t) - y(t)l, which is nonnegative, and veto) = IXo - yol. We get that

vet) :5 veto) + ILI/(X(S» - J(y(s))! ds I

:5 veto) + IL Kv(s)ds I·
SO by Gronwall's Inequality, vet) :5 v(to)eKIt-tol, or Ix(t) - y(t)1 :5 IXo - yol eKII-tol.
This clearly implies the solutions depend continuously on Xo.
Also, if Xo = Yo then we get that vet) == 0, or x(t) == yet). This last statement gives
the uniqueness. 0
Example 3.1. If the differential equation is not Lipschitz, then it is possible to have
nonunique solutions. One example is the equation :i: = 3x i on the real line. Given any
to there is a solution x(t) which equals (t - to)3 for t 2: to and 0 for t :5 to. There is yet
another solution z(t) == 0 for all t. Then both x(to) and z(to) equal zero but x(t) and
z(t) are not equal on any interval about to.
We next discuss the maximal interoal oj definition oj the solution. That is, for fixed
x, we extend the solution so 'P' (x) is defined for t E (L, t+) but no larger open interval
of times t. (Here Land t+ possibly depend on x.) The interval is open by the existence
of solutions on a short interval starting at any point. The following example gives an
equation for which the solutions are not defined for all time.
144 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Example 3.2. Consider:i: = x 2 • If Xo > 0 then solving the equation by separation of


variables shows that <pt(xo) = ~. This solution is defined for -00 < t < I/xo, so
1- txo
t+ = l/xo < 00. What is the interval of definition for Xo < 01
The following theorem shows that if the solution is bounded then it is defined for all
time: L = -00 and t+ = 00. Thus if a solution is not defined for all time, then it must
be unbounded (it must leave all compact subsets of U).
Theorem 3.4. Let U c R" be an open set and ! : U -+ R" a Cl function.
(a) Given x E U, let (L, t+) be the maximal interval of definition for <pt(x). If
t+ < 00 then given any compact subset C C U, there is a time tc with 0 $ tc < t+
such that <pk(X) ~ C. Similarly, ifL > -00 then there is a tc- with L < tc- $ 0
such that <p1c - (x) ~ C.
(b) In particular, if! : R" -+ R" is defined on all ofR" and If(x)1 is bounded, then
the solutions exist for all t.
PROOF. (a) Given a compact C, since f is C 1 there are constants M > 0 and K > 0
such that If(y)1 :5 M and If(y) - !(z)1 $ Kly - zl for y, Z E C. By the existence proof,
as long as the solution <pt(x) remains in C it is M-Lipschitz with respect to t. Assume
that <p'(x) E C for 0 :5 t < t+. Because <pt(x) is M-Lipschitz with respect to t, it has
a unique limit point <pt+ (x) E C. By the existence of solutions there are 6 > 0 and a
solution defined for (t+ - 6, t+ + 6) that agrees with <P'+ (x) for t = t+. Thus the interval
(L, t+) is not maximal. This contradiction shows that the solution <p'(x) must leave C
before t+.
(b) If 1!(y)1 $ M for all y ERn, then I<pt(x) - xl $ Mltl for all t, and the solution
must stay in the ball 8(x, R) for time It I $ RIM. By part (a), it follows that t+ = 00
and L = -00. 0
Given an initial point Xo let (L,t+) be the maximal interval of definition. The set
O(Xo) = {<p'(Xo) : L < t < t+} is called the orbit through Xo.
Example 3.2 shows that the flow of a differential equation is not necessarily defined
for all time. The following theorem shows how a differential equation can be modified to
keep the same solution curves in phase space but change the parameterization so each
trajectory is defined for all time.
Theorem 3.5 (Reparameterization). Assume f : U C Rn -+ IR" is a vector field
defined on an open set U of R" with flow <pt. Then there is a Coo real valued function
9 : U -+ (0, I] C R such that F(x) = g(x) !(x), F : U -+ an, has a flow t/J' defined
for all time. Moreover, t/Jt(x) = <pT(t,x)(x) where T satisfies the differential equation
f(t,x) = go <pT(t,x) (x) > OJ therefore, t/J t is a reparameterization of the flow<p t with the
same oriented solution curves.
PROOF. If U = R", we let g(x) = [1 + 1!(x)i2]-1/2 where we use the Euclidean norm so
it is differentiable. Then F(x) = g(x)!(x) has IF(x)1 :5 1 for all x; the solutions exist
for all time by Theorem 3.4(b).
If U is not all of R", we let G : U -+ (0, I] C R be a Coo positive function such
that (i) IIDGxll $ 1 and (ii) IG(x)1 goes to zero as x goes to the boundary of U or as
Ixl goes to infinity. The function G can be thought of as the square of the distance to
the boundary of U. Let g(x) = G(x)2[1 + 1!(XWt 1/ 2 and F(x) = g(x) !(x). Then
IF(x)1 :5 IG(x)1 2 $ 1. By Theorem 3.4. to show that a solution for F is defined for
all time it is enough to show that Go W'(x) does not go to zero in finite time, or that
[G ° Wt(X)]-1 does not go to infinity in finite time. But.,

ft[G ° W'(X)tl = -[G ° W'(x)t2DG"'(x)F 0 wt(x)


5.3 EXISTENCE OF SOLUTIONS 145

and
rill
[G 0 ¢1(X)t1 ::; [G(X)t1 + Jo IG 0 ¢·(x)l- l 'IIDG~'(x)II'IF 0 ¢·(x)lds
rill
::; [G(X)t1 + Jo ds
::; [G(X)t1 + Itl.
Thus [Go¢I(X)]-1 does not go to infinity in finite time, and the solution ¢I(X) is defined
for all time.
For the reparameterization,

~rpT(I.X)(X) = f 0 rpT(I,X)(x)r(t, x).


dt
Therefore if r is defined as the solution of
r(t, x) = go rpT(t.x) (x),
then both rpT(I.X)(X) and ¢I(X) satisfy the same differential equation, and so are equal
by uniqueness of solutions. Because 9 is strictly positive, r( t, x) is a monotonically
increasing function of time and the orbits of rpl and ¢I have the same orientation. 0
The final result of this section gives the group property for the flow.
Theorem 3.6. The flow rpl satisfies the following group property. Letting (L, t+) be
the maximal interval of definition for the initial condition Xo, then
rpl(rp·(Xo» = rpl+O(Xo)
for all t and s for which s,t + s E (L,t+).
PROOF. Let J(y) be the maximal interval of definition for rpl(y). Then rpl(rp·(X» is
defined for t E J (rpo (x) ). Define the new function
rpT(X) for 0 ::; r ::; s
y(r) = {
rpT-·(rp·(x» for s ::; r ::; s + t.
It is easily checked that y(r) is a solution. By uniqueness it must equal rpT(X) for
s ::; r ::; s + t. Therefore s + t E J(x) and y(r) = rpT(X) for 0 ::; r ::; 8 + t, or
rpl(rp"(X» = y(s + t) = rpIH(x), verifying the claim of the theorem. 0
If the flow is generated by a differentiable vector field, we have proved above that
rpt(Xo) is a continuous function of Xo and a differentiable function of t. Actually, if
the vector field f is CT with r ~ 1 then rpl(Xo) is cr jointly in Xo and t. We do
not prove this, but a proof can be found in many graduate level ordinary differential
equations books, e.g. Hale (1969) or Hartman (1964). By differentiating the equation
frp'(X) = f(rp'(X» with respect to the initial condition x and interchanging the order
ol differentiation, we get the equation
d I I
(jiDrpx = Df'P'(x)Drpx
or
d D rpxI V = D f'P'(x)DrpxI v
(ji
for any vector v. Either form of these equations is called the First Variation Equation.
146 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

5.4 Limit Sets and Recurrence for Flows


In this section, X is a metric space with distance d, and cpt is a flow on X. In the
applications in this book, X is a Euclidean space or manifold and cpt is the flow of a
differentiable vector field f. We start with the definition of a fixed point and a periodic
orbit, and then give the definitions and results about limit sets, nonwandering sets, and
chain recurrent sets. These theorems for flows are similar to those for maps. (Compare
this section with Section 2.3.)
Definition. A point p is called a fixed point for the flow cpt if cpt(p) = P for all t.
Sometimes such a point is also called an equilibrium or singular point. If the flow is
obtained as the solutions of a differential equation :it = lex), then a fixed point is a
point for which !(p) = O. A point p is called a periodic point provided there is T > 0
such that cpT(p) = P and cpt(p) #- p for 0 < t < T. The time T > 0 which satisfies
the above conditions is called the period of the orbit. (The period is the least period
because cpt(p) #- p for 0 < t < T.) The orbit of such a point p, O(p), is called a periodic
orbit. Periodic orbits are also called closed orbits since the set of points on the orbit is
a closed curve. It easily follows that if p is a periodic point of period T, and q E O(p)
then q is a periodic point of period T.
Definition. A point y is an w-limit point of x for cpt provided there exists a sequence
of tk going to infinity such that limk_oo d(cpt. (x), y) = O. The set of all w-limit points of
x for cpt is denoted by w(x), w(x, cpt), Lw(x), or Lw(x, cpt), and is called the w-limit set.
The o-limit set of x is defined the same way but with tk going to minus infinity. The
set of all such points is denoted by o(x) or Lo(x) (or with cpt specified in the notation).
All the basic results for limit sets of maps are also true for flows as indicated in the
following theorem. In addition, if the forward orbit of a point 0+ (x) = {cpt (x) : t ~ O}
is bounded, then w(x) is connected. Remember that this is not true for a map as the
example of a periodic point shows. (The reason flows are different than maps is that an
orbit for a flow is connected but not for a map.)
Theorem 4.1. Let cpt be a flow on a metric space X.
(a) The w-limit set can be represented in terms of the forward orbit as follows:

w(x) = n
T~O
cl U {cpt (x)}.
t~T

(b) If cpt (x) = y for a real number t, then w(x) = w(y) and o(x) = o(y).
(c) The limit sets, w(x) and o(x), are closed and both positively and negatively
invariant (contain complete orbits).
(d) If O+(x) is contained in some compact subset of X, then w(x) is nonempty,
compact and connected. FUrther, d(cpl(X),W(x)) goes to zero as t goes to infinity.
Similarly, if O-(x) is contained in a compact subset of X, then o(x) is nonempty,
compact, and connected, and d(cpl(X),W(x)) goes to zero as t goes to minus infinity.
(e) If Dc X is closed and positively invariant and xED, then w(x) C D.
(f) Ify E w(x), then w(y) C w(x) and o(y) C o(x).
PROOF. All the proofs are the same as for diffeomorphisms except the new result about
w(x) being connected in part (d). For a flow, the sets UI>T{CPt(x)} are connected, so
clUI~T{cpt(x)} is a nested collection of compact and connreted sets, and so

n cl( U{cpl(X)})
T~O t~T
5.4 LIMIT SETS AND RECURRENCE FOR FLOWS 147

is connected. (See Exercise 5.10.) o


In this section, we present some examples of flows in the plane which illustrate some
of these possibilities of limit sets. In Section 5.9, we discuss the Poincare-Bendixson
Theorem which gives restriction on the possible limit sets for flows in the plane.
Example 4.1. The condition that the orbit is bounded is necessary to prove that the
limit set is connected. This example gives a flow in the plane for which the orbit is
unbounded and the limit set is not connected. Let YI < 0 < Y2 be two fixed values
of Y and L j be the horizontal line {(x,Yj)} for j = 1,2. Let cpt be the flow for which
cpt(x, yd = (x - t, yd, cpt(x, Y2) = (x + t, Y2), the origin is a fixed point, and the orbits
of points q between the lines Y = YI and Y = Y2 spiral out limiting on both LI and L'l'
See Figure 4.1. For q = (x, y) '" 0 and YI < Y < Y2, w(q) = {(x, y) : Y = YI or Y = Y2},
which is not connected. For these same q, o(q) = O. For q = (x,Yi) for j = lor 2,
w(q) = o(q) = 0.

FIGURE 4.1. Example with Disconnected Limit Set

Example 4.2. Consider the equations

:i: = -Y + In(l - x 2 _ y2)


iJ = x + IJy(l - x 2 _ y2)
for IJ > O. In polar coordinates, this is the system

8=1
r = IJr(l - r2).

The time derivative of r satisfies

r{ ><00 for 0 < r < 1


for 1 < r,

so for initial conditions p '" (0,0) (ro '" 0), the trajectory has w(p) = Sl = {(x, y) : x 2 +
y2 = I}. Thus solutions forward in time limit on the unique periodic orbit with r == 1.
Such an orbit with trajectories spiraling toward it from both sides is called a limit cycle.

Example 4.3. Consider the equations

:i:=y
iJ = x - x 3 - IJy(2y2 - 2X2 + x 4)
148 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

for l' > O. We proceed to explain the phase portrait as given in Figure 4.2. There
are three fixed points: 0 = (0,0), and p± = (±l,O). As we show later in the chapter,
the two fixed points p± are repelling and the origin is a "saddle fixed point". To
aid in the analysis of the phase portrait, consider the real valued (Liapunov) function
L(x, y) = y2/2 - x 2/2+x4 /4. (See Section 5.5.3 for a discussion of Liapunov functions.)
The time derivative of L along trajectories can be calculated as follows:
. _ d I
L(x, y) = diL 0 cp (x, y)ll=o
= yy + (-x + x 3 )±
= y[x - x 3 - l'y(2y2 - 2x2 + x 4 )] + (-x + x 3 )y
= -4I'y 2L(x,y).

Because L(x,y) == 0 along the level curve L-1(0), the trajectories are tangent to this
level curve, and so the level curve L-I(O) is invariant under the flow. (If l' = 0, then
each level curve is invariant for the flow.) The level curve L -1(0) is made up of three
trajectories: the fixed point 0 and two trajectories -y+ = {(x, y) : x > 0 and L(x, y) =
O} and -y- = {(x, y) : x < 0 and L(x, y) = O}. For q E -y±, it can be shown that
w(q) = a(q) = O. Because of this property, the trajectories -y± are called homoclinic
connections for the saddle point O. See Figure 4.2.

FIGURE 4.2. Example 4.3

For all q inside -y+, L( q) ~ 0 and L( q) > 0 if q is off the x-axis. Using these facts,
it can also be shown that for ql inside -y+ but not equal to p+, w(q.) = {O} U -y+ and
a(q.) = {p+}. Similarly, for q2 inside -y- but not equal to p-, W(q2) = {O} U -y-
and a( q2) = {p - }. Finally, for q outside both -y+ and -y-, L( q) ::s 0 and L( q) < 0
if q is off the x-axis. Again, it can be shown that for q3 outside both -y+ and -y-,
W(q3) = {O} U -y+ U -y- and a(q3) = 0. At the moment we are not worrying about the
fact that these particular equations have this phase portrait, but merely that this phase
portrait illustrates a flow with different limit sets for different points. In particular, the
a-limit set of certain points equals a single fixed point and for other points it is the
empty set. The w-limit set of certain points equals a single fixed point; for other points,
it is the fixed point at the origin together with one of the homoclinic connections (-y+
or -y-); for still other points, it is the fixed point at the origin together with both the
homoclinic connections -y+ and ,,(-.
Definition. Using the definition of w-limit set and a-limit set, we say that x is w-
recurrent provided x E w(x) and that x is a-recurrent provided x E a(x).
The only changes that are needed in the definitions of nonwandering and chain re-
current points from those in Section 2.3 is that the times must be large enough.
5.5 FIXED POINTS FOR NONLINEAR DIFFERENTIAL EQUATIONS 149

Definition. For a flow <pIon X, a point p is called nonwandering provided for every
neighborhood U of p there is a time t > 1 such that <pI (U) n U 1: 0. Thus there is a
point q E U with <pl(q) E U. (Note the times for the flow are real numbers while those
for maps are integers.)
Definition. An t-chain 01 length T from x to y for a flow <pI is a sequence {x =
XQ"",Xn = y;tl, ... ,t n } with tj ~ 1 for alII $ j $ n, d(<p'J(xj_d,xj) < t, and
tl + ... + tn = T.

The definitions of t-chain limit set, chain limit set, chain recurrent set, and chain
components remain the same. It is not hard to prove that for a flow, the chain compo-
nents are indeed the connected components of R.
As for maps, the reader can easily check that for a flow <pI,

5.5 Fixed Points for Nonlinear Differential Equations


For linear systems of differential equations, the asymptotic stability is determined
by the real parts of the eigenvalues. For a fixed point of a nonlinear system, the real
parts of the eigenvalues of the derivative also determine the stability. In the next two
subsections we state this result for nonlinear equations and prove it for the case of
an attracting fixed point. In Section 5.6, we give the comparable results for maps.
In between, in Section 5.5.3, we define a Liapunov function for a nonlinear ordinary
differential equation, which can also be used to determine the stability of a fixed point.
Definition. We consider a (nonlinear) differential equation x = I(x) with x E aft.
The point p is a fixed point if I(p) = O. In this case the flow fixes the point p for all t,
<pI (p) = p for all t. To linearize the flow near p, let A = DIp be the matrix of partial
derivatives. Then the linearized differential equation at p is given by :ic = A(x - p).
Let p be a fixed point for :ic = I(x). We call the fixed point hyperbolic provided
Re(A) 1: 0 for all the eigenvalues A of DIp. There are several special cases. A hyperbolic
fixed point is called a sink or attracting provided the real parts of all the eigenvalues of
DIp are negative. This terminology is used because Theorem 5.1 below proves that in
this case not only is the linearized equation asymptotically stable (attracting) but also
the nonlinear flow. A hyperbolic fixed point is called a source or repelling provided the
real parts of all the eigenvalues of DIp are positive. Finally, a hyperbolic fixed point is
called a saddle provided it is neither a sink nor a source, so there are two eigenvalues A+
and A_ with Re(A+) > 0 and Re(A_) < 0 (and the real parts of all the other eigenvalues
are nonzero). A nonhyperbolic fixed point is said to have a center. (In a more restrictive
sense of the word a center is a family of closed orbits surrounding a fixed point as occurs
for a linear two dimensional center or an undamped pendulum.)
We are often interested in the set of all points whose w-limit set is a fixed point (or
in a more complicated set). The basin 01 attraction of a fixed point p is the set of all
points q such that w(q) = {p}, which we denote by W"(p). Note this concept is most
interesting and most often used when p is a sink.

Theorems 5.1 and 5.2 below prove the stability of a nonlinear sink, and Theorem 5.3
and Corollary 5.4 state these results for nonlinear hyperbolic fixed points. We end this
section with several examples.
150 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Example 5.1. Consider the second order equation ij + sin(O) + 60 = O. By letting


XI= 0 and X2 = 0, this can be written as a first order system
. _(XI) _f(x) =
X - •
X2
-_(-sm(xd
. -
X2
6X2
) .

The flxocl points are x .. (ml',O) for n an Integer. The derivative of I is

Dfx = ( _ CO~(Xil ~6)'


At (0,0), or (ml',O) for n even, the derivative is given by

The characteristic equation is A2 +6 A+ 1 = 0 with eigenvalues A = (-6 ± W-


4JI/2)/2.
Since 62 - 4 < 62 , for 6 > 0 the real part of both eigenvalues is negative. Thus (0,0) is
a nonlinear sink. Notice that for 0 < 6 < 2, the eigenvalues are complex, and for 6 ~ 2
the eigenvalues are real. For 0 > 6, (0,0) is a source.
Next, for the fixed point (71',0), or (n71',0) for n odd,

and the eigenvalues are A = (-6 ± [6~ + 4J 1/2)/2. Notice that for 6 = 0, the eigenvalues
are ±1. For any 6, the fixed point is a saddle.

Example 5.2. Consider the equation x+ X - x 3 = O. The fixed points are at X = 0, ± 1


and X = O. The derivative of the system of equations is

Dfx = (-1 ~ 3x 2 ~).


The eigenvalues for the fixed point (0,0) are ±i. Thus this fixed point is nonhyperbolic;
the linearized equation is a center. The fixed points x = ±l have eigenvalues ±2 1/ 2 ,
and thus are saddles.

5.5.1 Nonlinear Sinks


In this section we consider the dynamics near a nonlinear sink. The following theorem
proves the nonlinear flow near the fixed point sink is an exponential contraction if the
linear flow is. Also, because the solutions are bounded for the nonlinear flow, they exist
for all t ~ O.
Theorem 5.1. Let p be a fixed point for the equations x = f(x) with x E an. Also
assume that there is a constant a > 0 such that all the eigenvalues A for A = Dfp have
negative real part with Re(A) < -a < O. Then the following two statements are true.
(a) There is a norm I I. on an and a neighborhood U c Rn of p , ....11 that for any
initial condition x E U, the solution is defined for all t ~ 0 and satisfies

1'P'(x) - pl. :5 e- '4 Ix - pl. for all t ~ O.


5.5.1 NONLINEAR SINKS 151

(b) For any norm II' on Rn there exist a neighborhood U C Rn ofp and a constant
C ~ 1 such that for any initial condition x E U, the solution is defined for all t ~ 0 and
satisfies

REMARK 5.1. A norm as in part (a) ofthe theorem is called an adapted nonn.
REMARK 5.2. This theorem can be proved by using either approach to the proof of
Theorem IV.5.l. We present the proof using the averaged norm of the first proof. We
leave to the exercises the proof using the Jordan canonical form. See Exercise 5.20.
PROOF. We first do a change of coordinates y = x - P to move the fixed point to the
origin. Then
y = it = I(y + p)
= Ay + g(y)
where g(y) contains all the terms which are quadratic and higher. Therefore g(O) = 0
and Dgo = O. We let .,pt(y) be the flow in the y coordinates and (l(x) be the flow
in the x coordinates. Since .,pt(yo) is the solution of y = Ay + g(y) with .,pO(yo) =
Yo, then thinking of g(.,pt(yo» as a known function of t, it is also a solution of the
nonhomogeneous linear equation
y = Ay + g(.,pt(yo»
with y = Yo at t = O. Therefore we can apply the variation of parameters formula to
the solution to get

.,pt(yo) = eAt Yo + 1t eA(t-a)g(.,pa(Yo»ds.

We now want to use the estimates for the linear flow to get estimates for the nonlinear
flow. We have to introduce a little error because of the nonlinear terms, so we take f > 0
and b = a + f such that all the eigenvalues A for A satisfy Re(A) < -b. Now let II. be
any adapted norm for the IinPRr equations which is shown to exist by Theorem IV.5.1,
Le., the norm satisfies leA1Yol. S e-1oIYol. for t ~ O. The nonlinear term 9 starts with
quadratic terms, so there is a {) > 0 such that for Iyl. S {) we have Ig(y)l. < flyl.·
We let U' be the neighborhood of y = 0 given by U' = {y : Iyl. S {)}. Applying this
estimate to the integral above, if 1.,p·(yo)l. S {) for 0 S sSt then

l.,pt(Yo)l. S leA1yol. + lleA(t-a)g(.,pa(Yo»I. ds

S e-btIYol. +l e-b(t-a)lg(.,pa(Yo»I. ds

S e-b1IYol. + l' ebae-b'fl.,pa(Yo)l. ds

ebll.,p'(Yo)l. S IYol. + l febaW(Yo)l. dB.

We can now apply Gronwall's Inequality to the function ebll.,pI(Yo)l. and get

ebtl.,p'(Yo)l. S IYol.ed ,

l.,pt(Yo)l. S IYol.e(-b+<)1 = IYol.e- OI •


152 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

It follows that if Yo E U' with IYol. $ h, then the solution will remain in U' for all
t ~ 0: 1""(Yo)l. $ IYol.e(-b+<)1 $ h. Thus the solution is defined for all t ~ 0, and
1""(Yo)l. goes to zero exponentially. In the x coordinates, let U = {x: Ix - pl. $ h}.
For Xo E U, let Yo = Xo - P and <p1(Xo) = ",t(yo) + p. Thus the solution <p1(Xo) is
defined for all t ~ 0 and 1<p'(Xo) - pl. $ e-atlXo - pl., i.e., the solution goes to the
fixed point at an exponential rate as claimed.
For any norm I I', the result for part (b) follows just as in the linear case by the
equivalence of norms. 0
The next theorem is a special case of the Hartman-Grobman Theorem which is stated
at the beginning of the next subsection. Here we state and prove it for the special case
of a fixed point sink. Instead of the linear flow, we consider the affine flow :it = A (x - p)
which has solutions p + eA1(x - pl.
Theorem 5.2. Let p be a fixed point sink for:it = I(x). Then the flow <p' of I is
conjugllte in a neighborhood of p to the affine flow p + eAI(y - p) wllere A = DIp.
More precisely, there are a neighborhood U of p and a homeomorphism h : U -+ U such
that <p'(h(x» = h(p + eA1(y - p» as long as p + eAt(y - p) E U.
PROOF. The proof is very similar to the proof of Theorem IV.7.1 for linear flows. One
of the differences is that because the nonlinear flow and the affine flow are close, it
is possible to use the identity map on the small sphere rather than some ho. Take
o < a < b with Re(.l.) < -b for all the eigenvalues .l. of A. Set f = b - a. Take 6 > 0
as in Theorem 5.1 above such that for Ix - pl. $ 6, leA'(x - p)l. $ e-a'lx - pl. and
l<p'(x) - pl. $ e-atlx - pl. for all t ~ O.
Let U = {x: Ix - pl. $ h}. For x E U take T(X) such that IeAT(x)(x - p)l. = 6.
Define h : U -+ U by

p for x = p
h ()
x = {
<p-r(x)(p + eAr(x)(x _ p» for x '# p.

The proof that h is a conjugacy is now the same as in Theorem IV.7.1 as long as times
are restricted so that T(X) $ t < 00. 0
Example 5.3. Vinograd (1957) gave an example of a nonlinear differential equation
with the origin an isolated fixed point which is not Liapunov stable but which has a
neighborhood U for every q E U has w(q) = O. The equations are

. x 2(y _ x) + y5
X = -:-(x-;;"2-+-y'""'2""')[":"""1-+~(-:x2-=+'---cy2;;-:-);;;2]
. y2(y _ 2x)
y = (x 2 + y2)[1 + (x2 + y2)2] .

The phase portrait is given in Figure 5.1 See Hahn (1967) page 191 for an analysis of
the example.

5.5.2 Nonlinear Hyperbolic Fixed Points


In the last subsection we considered nonlinear sinks. In this subsection we consider
hyperbolic fixed points of nonlinear differential equations and state the results linking
their stability with the real parts of the eigenvalues of the derivative of the vector field
at the fixed point. We give this stability result as a corollary of the Hartman-Grobman
Theorem. We defer the proof of the Hartman-Grobman Theorem to Section 5.7.
5.5.2 NONLINEAR HYPERBOLIC FIXED POINTS 153

FIGURE 5.1. Phase Portrait for Example 5.3

Theorem 5.3 (Hartman-Grobman). Let p be a hyperbolic fixed point for x = I(x).


Then the flow tpl of I is conjugate in a neighborhood ofp to the affine flow p+eAI(y_p)
where A = DIp. More precisely, there are a neighborhood U of p and a homeomorphism
h: U -+ U such that tpl(h(x» = h(p + eAI(y - p» as long as p + eAI(y - p) E U.
We delay the proof of this theorem to Section 5.7.2 because it involves much more
analysis than the special case of a sink, given in Theorem 5.2. As stated above, the
stability of hyperbolic fixed points follows from the Hartman-Grobman Theorem.
Corollary 5.4. Let p be a hyperbolic fixed point for x = I(x). lfp is a source or a
saddle then the fixed point p is not Liapunov stable (unstable). lfp is a sink then it is
asymptotically stable (attracting).
The fact that a source is not Liapunov stable follows from a result like Theorem 5.1.
It can be proved directly that a saddle is not Liapunov stable as is given in Hirsch and
Smale (1974), page 187.
Combining the conjugacy of linear hyperbolic Hows given in Theorem IV.7.1 with the
Hartman-Grobman Theorem, we can state the following corollary.
Theorem 5.5. Assume x = I(x) has a hyperbolic fixed point at p, and x = g(x) bas a
hyperbolic fixed point at q (where both equations are in RR). Assume that the number
of negative eigenvalues at the two fixed points are equal. Then there are neighborhoods
U of p and V of q and a homeomorphism h : U -+ V such tbat h is a conjugacy of the
flow of Ion U to the flow of 9 on V.
REMARK 5.3. Since any two contracting linear Hows are topologically conjugate, it is
clear that a topological conjugacy does not preserve much of the geometric nature of
the phase portrait. By contrast, a differentiable conjugacy does preserve much of this
structure.
It is possible to prove that a nonlinear How is differentiably conjugate to the linear
How near a hyperbolic fixed point if the eigenvalues satisfy a nonresonance condition.
Assume that I is Coo and the eigenvalues at a hyperbolic fixed point are Aj for 1 :$ j :$ n.
Assume that A/c ¥ E7-1 mjAj for any choice of mj ~ 0 with E7.1 mj ~ 2. Then the
conjugacy h in Theorem 5.3 can be taken to be a diffeomorphism from V onto U. This
result was initially proven by Sternberg (1958).
In two dimensions, Hartman (1960) proved that near a hyperbolic fixed point, any C2
How is CI conjugate to its linearized How. This theorem is true even when the eigenvalue
has multlpliclty two. In particular if the the linearized equations are diagonal with
154 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

a negative real eigenvalue, then the solutions for the linearized equations go straight
toward the fixed point. The nonlinear flow could have trajectories which spiral as
they approach the origin but the increase in the angle is bounded (as follows from the
C 1 conjugacy or a direct estimate). Thus the two phase portraits are similar up to
changes which C 1 nonlinear change of coordinates allows. Also see Belitskii (1973) for
C 1 conjugacies for the general hyperbolic case.
See the discussion in Hartman (1964) on differentiable conjugacies.
Finally, we remark again that the assumption of a differentiable conjugacy between
two flows in neighborhoods of their fixed points implies that they have the same eigen-
values. This follows directly from Theorem IV.7.2 about linear flows.
Theorem 5.6. Assume:ic = J(x) has a hyperbolic fixed point at p, and :ic = g(x)
has a hyperbolic fixed point at q (where both equations are in Rn). Assume that the
flows are dilferentiably conjugate in a neighborhood of these two fixed points. Then the
eigenvalues of the linearization of f at p equal the eigenvalues of the linearization of 9
at q.

5.5.3 Liapunov Functions Near a Fixed Point


In this section we introduce the use of a real valued function which is decreasing
along trajectories. For a system of equations for an oscillator, the function is often an
energy function. We start by giving a !ij)ecific example as motivation.
Example 5.4. Consider the equation of a pendulum with friction,

X=y
Y= - sin(x) - 6y

with 6 > O. The fixed point (0,0) is attracting (asymptotically stable or a fixed point
sink) as we saw in Example 5.1 using the eigenvalues and Theorem 5.1. In this section
we will find a second way to see that is true using the "energy function" L(x,y) =
!y2 _ cos(x). (The first term is the "kinetic energy" and the second is the "potential
energy" .) We are interested in the time derivative of the real valued function L along a
solution curve (x(t), y(t»:

.
L(x(t), y(t» =diL(x(t),
d
y(t»
= y(t)!i(t) + sin(x(t»x(t)
= y(t)[- sin(x(t» - 6y(t») + sin(x(t»y(t)
= _6y(t)2
$0.

Since this function is non increasing, an easy argument proves that the origin is Liapunov
stable. (See the proof of Theorem 5.7 below.) Since the function is decreasing most
of the time, we can prove that the origin is asymptotically stable. (See Theorem 5.8
below.) These theorems also give an idea (If tl.r I .in of attraction of this fixed point
sink, namely all points with L(x,y) < L(1r,U) ,aud -1r < X < 1r.
Functions L as in the above example are called Liapunov functions. We give their
important characteristic in the following definition.
5.5.3 LIAPUNOV FUNCTIONS NEAR A FIXED POINT 155

Definition. Let:it = J(x) be a differential equation on Rn with flow <p'(x) and a fixed
point p. A real valued C 1 function L is called a weak Liapunov function for the flow <p'
on an open neighborhood U of p provided L(x) > L(p) and L(x) == ftL(tp'(x»I,_o SO
for all x E U \ {pl. These conditions imply that L 0 <p'(x) S L(x) for all x E U and
for all t ~ 0 such that tp"(x) E U for all 0 S sSt. The function L is called a Liapunov
function on an open neighborhood U of p (or strong. strict. or complete Liapunov
function) provided it is a weak Liapunov function which also satisfies L(x) < 0 for all
XEU\{p}.
One feature of this method is that it is possible to calculate L without actually
knowing the solutions. This makes it possible to verify that a given function is in fact a
Liapunov function for a given differential equation in a certain neighborhood of a fixed
point. A general reference for this section is Hirsch and Smale (1974). pages 192-199.
Theorem 5.7. Let p be a fixed point for it = F(x). Let U be a neighborhood ofp and
L : U -> R a weak Liapunov function on the neighborhood U of p for the dilferential
equation. Then p is Liapunov stable.
(b) If L is a (strict) Liapunov function on the neighborhood U ofp for the dilferential
equation, then p is asymptotically stable.

PROOF. To show that p is Liapunov stable, given any E > 0 we need to find a neigh-
borhood U1 C B(p, f) such that a solution starting in U1 stays in B(p, E). First of all
we can assume that E is small enough so that B(p,E) C U. Choose a such that

L(p) < a < min{L(x) : x E 8B(p,E)},

and let
U 1 = {x E B(p, E) : L(x) < a}.
If x E U 1 then L 0 tpl(X) S L(x) < a, so the solution can not cross the boundary of
B(p, f). Therefore <p'(x) E B(p, E) for all t ~ O. This proves that p is Liapunov stable.
For the second part, take U 1 as above. For x E U lt we want to show that tpl(X) goes
to past goes to infinity. Let z be an w-limit point of x, i.e., there is a subsequence
of times tn -+ 00 such that tpln (x) -> z. Since the closed ball cl(B(p, f» is compact,
z E cl(B(p,f». We claim that z = p. Assume there is an w-limit point z # p. By
continuity, L(<p'n (x» converges to L(z). The fact that L is decreasing along the solution
implies that L(<pln(x» converges to L(z) from above. Since z # p, L(<p"(z» < L(z) for
(small) positive s. By continuity, for y sufficiently near z, L(<p"(y» < L(z). Letting
Y = tpln(x), for some large n, we get that tpln+"(x) < L(z). Then for tm > tn + s,
<pl~ (x) < <pln+"(x) < L(z). This contradicts the fact that L(<p'n (x» converges to L(z)
from above. Thus p is the only w-limit point of x, so p is asymptotically stable. 0
When applying Theorem 5.7 to the pendulum example above, we see that (0,0) is
Liapunov stable for () ~ O. However, L is not strictly negative on any deleted neighbor-
hood, so it does not imply that the point is asymptotically stable for () > O. To get this
result we need Ii refinement of the above theorem given next.
Theorem 5.S. Let p be a fixed point for:ic = F(x). Let U be a neighborhood of p
and L : U -+ R a weak Liapunov function on U. Suppose S C U is a c106ed, bounded,
positively invariant neighborhood of p. Let

z= {x E S : L(x) = O}.
156 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

FUrther suppose that {p} is the largest positively invariant subset of Z. Then p is
asymptotically stable, and S is contained in the basin of attraction of p.
PROOF. Take any xES. The w-limit set w(x) is nonempty because S is closed,
bounded, and positively invariant. Let z be an w-limit point. By the argument in
Theorem 5.7, L(<,?'(z)) = L(z) for all s > 0, so <'?'(z) E Z for all s > O. Thus z must be
in a positively invariant subset of Z and z = p. 0

5.6 Stability of Periodic Points for Nonlinear Maps


In this section we consider a map (usually a diffeomorphism) f : Rn -+ Rn or f :
M -+ At where M is a surface or manifold such as M = T" or sn. A linear map is
given by f(x) = Ax where A is an n x n matrix.
The reader should ('om pare the results of this section wit.h the results for maps on
R given in Chapter II. The conditions on the derivative at the fixed point for one
dimensional maps are replaced by conditions on the eigenvalues of the derivative.
Definition. A point p is a fixed point of f if /(p) = p. It is a periodic point of (least)
period k if /k(p) = P but P(p) oF p for 0 < j < k. A periodic point p is called Liapunov
stable provided given any E > 0, there exists a b > 0 such that IJi(x) - Ji(p)1 < E for
all Ix - pi < band j ~ O. A periodic point p is called it asymptotically stable or
attmcting provided it is Liapunov stable and there is a b > 0 such that if Ix - pi < b,
then IP(x) - P(p)1 goes to zero as j goes to infinity.
The basin of attmction of an attracting periodic orbit O(p) is the set of all points q
such that w(p) = O(p). This basin is denoted by W"(O(p)).
The following theorem gives the existence of an adapted metric for maps at an at-
tracting fixed point which is similar to Theorem IV.5.l for differential equations.
Theorem 6.1. Let p be a periodic point for / : Rn -+ Rn with period k. Assume there
is a constant 0 < I-' < 1 sllch that all the eigen1llllues >. of D/: have 1>'1 < 1-'.
(a) TlJen there is a norm I I. on Rn and a neiglJborlwod U c an of p such that for
any initial condition x E U, the iterates satisfy
II jk (x) - pk(p)l. ~ I-'jlx - pl. for all j ~ O.
(b) For any norm II' on Rn tlJere exist a neighborhood U C Rn ofp and a constant
C ~ 1 such that for any initial condition x E U, the iterates satisfy
II jk (x) _/jk(p)I' ~ CI-'jlx - pI' for all j ~ o.
REMARK 6.1. As for differential equations, a norm as in part (a) of the theorem is
called an adapted norm.
REMARK 6.2. The proof is similar to that for Hows with integrals replaced by summa-
tions and is omitted.
To look at the case that is not a sink, for a linear map with matrix A we let
E" = span{v" : v" is a generalized eigenvector
for an eigenvalue >." of A with I>." I > l},
E' = span{v' : v' is a generalized eigenvector
for an eigenvalue >.. of A with 1>..1 < l},
E = span{v
C C : V C is a generalized eigenvector
for an eigenvalue >'c of A with I>'c! = l}.
5.6 STABILITY OF PERIODIC POINTS FOR NONLINEAR MAPS 157

Definition. For a periodic point p of period k, we consider these spaces with A = D I:.
The periodic point p of period k is called hyperbolic provided EO = to}, i.e., all the
eigenvalues Aof DI; satisfy IAI "" 1. A hyperbolic periodiC point p of period k is called
a .!ink provided all the eigenvalues of DI; are less than one in absolute value, IAI < 1,
i.e., both E", EO = {a}. Theorem 6.1 above proves that a periodic sink is asymptotically
stable (or attracting). In the same way, a hyperbolic periodic point is called a source
provided all the eigenvalues are greater than one in absolute value, I~I > 1, i.e., both
E' ,Eo = {o}. Applying Theorem 6.1 to the inverse, we get that a periodic source is
repelling. Finally, a hyperbolic periodic point with E" "" {o} and E' "" {o} is a saddle.
In the same way that Theorem 5.2 followed from Theorem 5.1, we could prove that
a nonlinear sink is topologically conjugate in a neighborhood of the fixed point to the
linear map induced by the derivative at the fixed point. The following theorem states
the more general Hartman-Grobman Theorem for diffeomorphisrns.
Theorem 6.2 (Hartman-Grobman Theorem). Let I : Rn -+ Rn be a C r diffoo-
morphism with a hyperbolic fixed point p. Then there exist neighborhoods U of p and
V of 0 and a homeomorphism h : V -+ U such that I(h(x)) = h(Ax) for all x E V,
where A = DIp.
We delay the proof of this theorem to Section 5.7.1.

REMARK 6.3. Just as for differential equations, it is possible to prove that a nonlinear
How is differentiably conjugate to the linear How near a hyperbolic fixed point if the
eigenvalUes satisfy a non resonance condition. Assume that I is COO and the eigenvalues
at a hyperbolic fixed point are Aj for 1 ::; j ::; n. Assume that At "" n;=,
A';'J for any
choice of mj ~ 0 with 1:;=1 mj ~ 2. Then it is possible to prove the existence of a
conjugacy h as in Theorem 6.2 that is a diffeomorphism from V onto U. This result
was initially proven by Sternberg (1958).
In two dimensions, Hartman (1960) proved that near a hyperbolic fixed point, any
C 2 diffeomorphism is C l conjugate to its linearized map. Thus the two phase portraits
are similar up to changes which C l nonlinear change of coordinates allows. Also see
Belitskii (1973) for C l conjugacies in the general hyperbolic case.
See the discussion in Hartman (1964) on differentiable conjugacies.
As in the case of a How, the stability of a fixed point follows from the Hartman-
Grobman Theorem.
Corollary 6.3. Let I : R n -+ R n be a C r diffeomorphism with a hyperbolic fixed point
p. If p is a source or a saddle then the fixed point p is not Liapunov stable. If p is a
sink then it is asymptotically stable.
lf a fixed point is hyperbolic then the linear part determines the stability type of the
fixed point. Small changes in the linear part preserve the same stability type. To end
this section we give a result that says that a hyperbolic fixed point persists for small
changes in the map. More specifically, we consider a one-parameter family of maps I,..
We could assume that Xo is a hyperbolic fixed point for llJO' However, it is enough to
assume that 1 is not an eigenvalue of D(fIJO)x.o in order to show the fixed point persists,
as the following theorem shows.

Theorem 6.4. Let I,. (x) bea one-parameter family of differentiable maps with x ERn.
Assume that I,.(x) is C l as a function jointly of 11 and x. Assume that llJO(xo) = Xo
and 1 is not an eigenvalue of D(fIJO)xo' Then there are (i) an open set U about Xo, (ii)
an interval N about 110, and (iii) a C I function p : N -+ U such that p(~) = Xo and
158 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

1/J(p(J.L» = p{J.L). Moreover, (or J.L E N, II' has no other fixed points in U other than
p(J.L). Finally,

PROOF. We want to find points x such that II' (x) = x. We define the function
G(x, J.L) = I/J(x) - x and the condition becomes finding zeros of G(·, J.L). This is set up
in the form where the Implicit Function Theorem might apply. Note that G(Xo, J.Lo) = 0
8G
and 8x (Xo./Jo) = D(J/JO)Xo -1 is invertible since 1 is not an eigenvalue of D(J/Jo)Xo'
Therefore the Implicit Function Theorem does indeed apply to give a Cl function p(J.L)
such that G(p(J.L), J.L) = 0 for J.L in some open interval N about J.Lo and these are the only
zeroes in some set N x U where U is an open neighborhood of Xo. The calculation of
the derivative follows by implicit differentiation. 0

5.1 Proof of the Hartman-Grohman Theorem


To prove the Hartman-Grobman Theorem for a diffeomorphism in a neighborhood
of a fixed point, we first prove a case where the nonlinear map is defined on all of a
Banach (or Euclidean) space, and it is a bounded distance away from the linear map
on the whole space. Next we apply the global theorem to prove the local Hartman-
Grobman Theorem for a diffeomorphism, Theorem 6.2. Finally, we prove the local
Hartman-Grobman Theorem for a flow, Theorem 5.3.
To carry out these arguments, we need to work with a few function spaces and with
bounded linear maps. For two Banach spaces EI and E 2, if A: lEI --+ 1E2 is a linear map
we define the operator norm or sup-norm of A just as in finite dimensions by

IA v l2
IIAII = SUP-I-I .
v"o v I

The linear map A is a bounded linear map provided the norm IIAII is finite. (Note the
function A does not take on "bounded" values in 1E2') We let L(1E1t1E2) be the set of
bounded linear maps from lEI to 1E2 with the norm II . II. We also use the minimum
norm which is also defined in the same way as we defined it in Section 4.1 for finite
dimensions:
m(A) = inf IAvI2.
v"o Ivll
If A is invertible, then m(A) = IIA-Ill-i.
A function I : Rn --+ R n is said to be bounded provided there is a uniform C > 0
such that I/(x)1 ~ C for all x E Rn. (Notice the difference from the use of the term
for a bounded linear map.) We let cg (Rn) = cg (Rn , Rn) be the space of all bounded
continuous maps from Rn to itself. We put the CO-sup topology on cg(Rn),

IIvl - v2110 = sup IVI(x) - v2(x)l·


xER"

With this norm, cg(Rn) is a complete metric space. See Dieudonne (1960).
For a differentiable map 9 : Rn -+ Rk, at each point a E R n the derivative is (a
matrix or) a bounded linear map, Dg. : Rn -+ Rk. We let Ct{Rn) be the set of CI
functions from Rn to itself, 9 : Rn -+ Rn, such that 9 is in cg(R n) and such that there
is a uniform bound on the derivatives, i.e., a constant C independent of a E Rn such
that IIDg.1I ~ c.
5.7 PROOF OF THE HAKrMAN-GROBMAN THEOREM 159

We want to consider CI functions f : an -+ an such that f = A + 9 with A E


L(Rn,Rn) an invertible hyperbolic linear map and 9 E CleRn). The global Hartman-
Grobman Theorem says that such an f can be conjugated to A by a continuous map
h : Rn -+ Rn with h = id + v and v E cg(Rn).
Theorem 7.1. Let A E L(Rn,Rn) be an invertible hyperbolic linear map. There
exists an t > 0 such that if 9 E Ct(Rn) with Lip(g) < E, then f = A + 9 is topologically
conjugate to A by a map h = id + v with v E cg(Rn), and the conjugacy is unique
among maps id+k with k E cg(R n ). In fact, let 0 < a < 1 be such that each eigenvalue
A of A has eitller IAI < a or lA-II < a. If both t(1 - a)-I < 1 and m(A) - E > 0, then
this t works. (The condition on the eigenvalues can also be expressed by saying that
spectrum(A) C {A: IAI < a or lA-II < a}. The condition that m(A) - E > 0 insures
that f is one to one.)
REMARK 7.1. Palis and de Melo (1982) have two proofs of this theorem. The first
on pages 59-·63 is similar to that given here. The second on pages 80-88 is a more
geometrical proof (using the A-Lemma). We use this latter type of reasoning later in
this book. Irwin (1980), pages 92, 113-114, has a proof similar to the one given here.
MOTIVATION AND OUTLINE OF THE PROOF. Formally, the conjugacy can be proved
to exist by the Implicit Function Theorem. For 9 E Ct(Rn), we want to find a map
id + v which conjugates A + 9 with A, i.e.,

(A + g) 0 (id + v) = (id + v) 0 A,
(A + g) 0 (id + v) 0 A-I = id + v,
O=id+v-(A+g)o(id+v)oA- I , or
o= v- A0v 0 A-I - 9 0 (id + v) 0 A-I.

Therefore we define III : CHRn) x cg(a n) -+ cg(Rn) by

III (g, v) = v - A 0 v 0 A - I - 9 0 (id + v) 0 A -I.


Given 9 we want to find a Vg such that lII(g, vg) = O. Such a Vg corresponds to a semi-
conjugacy of A + 9 with A Notice that 111(0,0) = id - A 0 id 0 A -I = 0, so we can hope
to use the Implicit Function Theorem near (0,0) E Cl(Rn) x cg(Rn) to solve for the
Vg with lII(g, vg) = 0.
It can be proved (after some work) that III is a C' map, and the partial with respect
to the second variable is

au (0.0) v- = v• - A 0 v• 0 A-I
( alii)
== (id - A*)v
== .c(v),

where A*v = AoVoA- I . (See Franks (1979), Irwin (1972, 1980).) If we were to show
that III is a C' map with partial derivative.c and that.c is an isomorphism (a bounded
linear map with a bounded linear inverse), then the Implicit Function Theorem would
show that we can solve for v = Vg as a function of 9 such that lII(g,vg) == O. This would
prove the theorem.
Instead of verifying that III is CI with partial derivative .c, we verify that .c is an
isomorphism and imitate (or repeat) the proof of the Implicit Function Theorem. In
the direct proof of the Implicit Function Theorem (as opposed to the proof using the
160 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Inverse FUnction Theorem), the problem of finding a zero of III is changed into finding
a fixed point by considering the function 8: Cl(an) x Gg(Rn) -+ Gg(a n ) given by

8(g, v) = C-I{Cv -1II(g,v)}.

(The functions 9 are required to be bounded so that 8 takes its values in Gg(a n ).) After
showing that C is an isomorphism using Lemma 7.2, we prove that if 1I.en Lip(g) < I
then 8(g,') is a contraction on Gg(Rn) with a unique fixed point v g • Letting f = A + 9
and h, = id + v g , it follows that h, = f 0 h, 0 A-lor h, 0 A = f 0 h" Thus h, is a
semiconjugacy. Lemma 7.4 proves that h, is one to one and Lemma 7.5 proves that h
is onto, so h, is a conjugacy from A to f. This ends the outline of the proof.
Before starting the proof, we give more notation and then prove a preliminary lemma.
The matrix A is hyperbolic, so we can define the stable and unstable subspaces as usual:

E" = span{v" : v" is a generalized eigenvector


for an eigenvalue Au of A with IA"I > I},
E' = span{v' : v' is a generalized eigenvector
for an eigenvalue Ao of A with IAol < I}.

Then R n = E" $Eo .


We use this decomposition to decompose the space of continuous bounded functions,
Gg(Rn, an) = Gg(Rn, E") $ Gg(Rn,Es). We put norms on EU and ES such that the
linear map A is a contraction and expansion on the two subspaces: II(AIE,,)-III ~ a < I
and IIAIEsll ~ a < 1. On Rn, we put the maximum norm of the norms on EU and ES:
if v = v" + v' with vO' E EO' for (1 = U,S, then

Let
A* v = A 0 v 0 A-I
be the map on Gg(Rn, Rn) as above and

C(v) = (id- A*)v = v- AovoA- I .

For (1 = U,S, we also let A# = A*IGg(an,EO') and CO' = CIGg(Rn,EO'). Because we


use the norms induced by the maximum norm on Rn, IICII = max{IICUII, IIC'II}.
The first step is to give some results about linear maps on a Banach space. We use
these results below, applied to A*, to prove that C is an isomorphism.
Lemma 7.2. Let E be,. Banach space and G,B E L(E,E).
(a) If IIGII ~ a < I then id - G is an isomorphism and lI(id - G)-III ~ I~a' In fact,
the inverse (id - G) -I can be represented by the series L:;:o Gj.
(b) If B is an isomorpilism witll liB-III ~ a < I then B - id is IllI isomorphism with
II(B - id)-III ~ al(1 - a). Again, this inverse, (B - id)-l, can be represented by a
power series, L:~ I B - j .
PROOF. To prove (a), given y we want to find x such that x - Gx = y, or x = y + Gx.
We can find this x as a fixed point of a map, u. Let U : E x E -+ E be given by
u(x.y) = y + Gx. Then

U(XltY) - U(X2'Y) = G(XI - X2). so


Iu(xlt y) - U(X2. y)1 ~ a I(xl - x21·
5.7 PROOF OF THE HARTMAN-GROBMAN THEOREM 161

Thus for y fixed, u(·, y) is a contraction. By the contraction mapping result, there Is a
unique fixed point xy, Xy = u(Xy,y) = y+G(xy). Thusy = (id-G)xy. Theexlstence
of Xy shows that id - G is onto. The uniqueness shows that id - G Is one to one. To get
the bound on the norm of the inverse, notice that if x = (id - G)-I y , then x - Gx = y
and

Ixl - alxl :S Iyl,

Ixl <
- I-a'
J!L so
I(id - G)-Iyl < _1_
Iyl - 1- a·

Thus (id - G)-I is a bounded linear map and lI(id - G)-III :S 1/(1 - a).
We will not bother with the details of convergence to show the inverse can be given
by the series indicated. However, if the series converges then it can easily be checked
that it is the inverse as follows:
00 00 00

(id-G)2: Gj = 2: Gj - 2: Gj
j=O j=O j=l

= id.

Turning to part (b), by part (a), B-1 - id is an isomorphism. Since B - id =


B(id - B-1) it follows that it is also an isomorphism. Its inverse is (B - id)-I =
(id - B-1 )-1 B-1 so II(B - id)-III :S (1/(1 - a»a. We leave to the reader to check the
series. This completes the proof of Lemma 7.2. 0
PROOF OF THEOREM 7.1. We want to show that each £,1 = id - A*, is invertible.
Writing Gg(lE a ) for Gg(RR, Ea), the norm of A*, is given as follows:

Then for v E Gg(E a ),

IIA*vllo = sup IAv (' A-Ixl


xER"
= sup IAv(y)1
yER"
:S allvllo.
Thus IIA*II :S a, and by Lemma 7.2(a), £. = id - A* is invertible with 11(£,)-111 :S
1/(1 - a). By a similar calculation, II(A;P-III :S a, and by Lemma 7.2(b), £u = id-A*
is invertible with 1I(£u)-11i :S a/(I - a). Because the norm on Rn is the maximum of
the norms on lEU and E', we get that 11£-111 :S 1/(1 - a).
We have the map W(g, v) = v - A*(v) - 9 0 (id + v) 0 A-I and its 'linearization' at
(0,0), £v = v - A*(v). Imitating the proofofthe Implicit Function Theorem, we let

e(g, v) = Cl{£v - w(g,v)}


= £-I{v - A*(v) - v + A*(v) + 9 0 (id + v) 0 A-I}
= c I {g 0 (id + v) 0 A -I}.
162 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Thus

119(g,vt} - 9(g,1I2)lIo
:oS 11£-111 sup Igo(id+vt}oA- l x-go(id+V2)oA- l xl
xERn
1 .
:oS (-1-) Llp(g) sup IVI (y) - v2(y)1
- a yERn

:oS (1 ~ a) Lip(g)lIvl - v2110.

For a fixed 9 with (l~a)Lip(g) < I, 9(g,') is a contraction on cg(Rn). Because


cg (Rn) is a complete metric space, there is a unique fixed point Vg with 9(g, v g ) = v g •
A direct calculation shows that this is equivalent to w(g, vg ) = O. Letting 1= A+g and
hJ = id+vg, the fact that wig, vg) = 0 implies that hJ = (A+g)ohJoA -I = lohJoA-I,
or h f 0 A = 1 0 h J. All that remains is to show that h = h J is a homeomorphism. Before
proving this fact, we show that I = A + 9 is a diffeomorphism.
Lemma 1.3. The map I is one to one and onto, so is a diffeomorphism.
PROOF. Assume that I(x) = I(y). Then, 0 = I(x) - I(y) = A(x - y) + g(x) - g(y).
Therefore

0= I/(x) - l(y)1
~ m(A)lx - yl - Lip(g)lx - yl

~ (m(A) - Lip(g») Ix - yl,

so x = y. This shows that I is one to one.


The map I is onto because it is a bounded distanc· . I" t he linear map A which is
onto and one to one. o
Lemma 1.4. The map h = hJ is one to one.

PROOF. There are two types of proofs that h is one to one. One uses the uniqueness of
the conjugacy h (within maps h for which h - id is bounded) and the fact that we can
also solve for a unique k with Ao k = k 0 I. Then Aokoh = k 0 I 0 h = kohoA. Thus
k 0 h conjugates A with itself. By the uniqueness of the maps which conjugate A with
itself (within maps for which h - id is bounded), k 0 h = id. This proves that h is a one
to one, and even that h is a homeomorphism. We do not give the details of this proof.
The second proof uses a property called expansiveness. If h(x) = h(y), then hoAx =
loh(x) = loh(y) = hoAy. By induction, h(Anx) = h(Any) for n ~ O. Using the fact
that I is invertible and 1- 1 0 h = h 0 A-I, we can also show that h(Anx) = h(Any) for
n :oS 0, so for all n E Z.
Now we write x = XU + x' and y = yU + y' with xu, yU E lEU. If x "I y then either
XU "I yU or x· "I y'. If XU "I yU then IAi x u - AiyUI ~ a- i Ixu - YUI. Thus we can take
a j ~ 0 with IAi x U - AiyUI ~ 311h - idll o > O. (If h = id then h is a homeomorphism
and we are done.) Then letting Xj = Aixand Yj = Ai y , h(xj)-h(Yi) = Xi -Yi+(h-
id)(xj)-(h-id)(Yj), soO = Ih(xj)-h(Yi)1 ~ Ixj-yjl-l(h-id)(xi)I-I(h-id)(Yi)1 ~
IIh - idll o > O. This contradiction shows that it is impossible for XU "I yU. Similarly,
using negative iterates we can prove that x· = y'. This completes the proof that h is
~to~. 0
5.7.1 PROOF OF THE LOCAL THEOREM 163

Lemma 7.5. The map h = hJ is onto, so it is a homeomorphism ofRn.


PROOF. The proof that h is onto uses the fact that it Is a bounded distance from the
identity: let b = IIh - idll o. We use the notation that B(r) = B(O, r) Is the open ball
centered at the origin of radius r, cI(B(r» is the closed ball centered at the origin of
radius r, and S(r) = cI(B(r» \ B(r) is the sphere of radius r centered at the origin.
Notice that for x E cI(B(r)), Ih(x)l $ Ih(x) - xl + Ixl $ b + r, so h(cI(B(r))) c
cI(B(r + b». Similarly, for x E S(r), Ih(x)1 ~ Ixl -Ih(x) - xl ~ r - b, so h(S(r)) c
cI{B(r + b» \ B(r - b).
Because h is one to one, the Brouwer Invariance of Domain Theorem implies that h
takes an open set to an open set; in particular, the images h(B(r» are open. By taking
the union, h(Rn) is open. (See Dugundji (1966) page 359 for the Brouwer Invariance of
Domain Theorem.)
One the other hand, we show that the image h(Rn) is closed. Assume Zo E cI(h(Rn».
There exists x; E Rn with h(x;) converging to Zo. Thus h(x;) is bounded with Ih(x;)1 $
IZoI + 1 == R. Since R ~ Ih(x;)1 ~ Ix;1 - b, we get that Ix; I $ R + b, and the x; are
bounded. By compactness of cI(B(R + b)), there Is a subsequence x;. which converges
to a point Xo E cI(B(R + b». By continuity of h, h(Xo) = ZOo Therefore Zo is in the
image, and cI(h(Rn)) = h(Rn).
Because h(Rn) is both open and closed in Rn and Rn is connected, h(Rn) = Rn, i.e.,
h is onto.
In finite dimensions, a continuous bijection is a homeomorphism. We show this fact
explicitly in this situation, i.e., we show that h- I is continuous. Assume Yn is a sequence
of points contained in some cI(B(R» converging to yoo. By the above arguments,
there are Xn and Xoo in cI(B(R + b» such that h(xn) = Yn and h(Xoo) = Yoo' Thus
h-I(Yn) = Xn , h-I(yoo) = Xoo E cI(B(R + b». Assume the Xn do not converge to Xoo.
Then there is a subsequence Xn, converging to p i' Xoo. By continuity of h,
h(p) = J-OO
lim h(Xn,) = Yoo = h(xoo ).

This contradicts the fact that h is one to one. Therefore h - I (y n) must converge to
h-I(yoo), proving that h- I is continuous.
The key idea which made the above proof work is that h is proper. A map h is called
proper provided the inverse images of compact sets are compact. This completes the
proof of the lemma and Theorem 7.1.
o
5.1.1 Proof of the Local Theorem
To prove the local version, we need to use what are called bump functions. These are
functions which make the transition from being identically zero to functions which are
identically one. We give the construction in the following lemma.
Lemma 7.6. Given numbers 0 < a < b, there is a Coo function {3 on Rn such that
o $ {3(x) $ 1 for all x E R n and
{3(x) = {I for Ixl $ a
o for Ixi ~ b.
PROOF. We start by defining a function of a real variable,

{ o forx<O
a(x) = e-I/'" for x ~ O.
164 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

A direct check shows that a is Coo.


Next, for a < b, let ,.,(x) = o(x - a)o(b - x). Then ,.,(x) ~ 0 and is greater than zero
exactly on the open interval (a,b). Again,., is Coo.
Now, if we define
C( ) _ f:,.,(s)ds
x - b
v '
fa ,.,(s) ds
then 0 ~ 6(x) ~ 1 for all x E Rand
I forx<a
6(x) = { -
o for x ~ b.
Thus {} is almost the desired function on the real line.
Lastly, define .B(x) on Rn by .B(x) = {}(Ixl). This function has all the desired proper-
ti~ 0
We now use this bump function to construct a map satisfying Theorem 7.1 from a
nonlinear map in a neighborhood of a fixed point.
Proposition 1.1. Let Uo be an open neighborhood of the origin ill Rn. Let I : Uo -+ Rn
be a cr local diffeomorphism for r ~ 1 with 1(0) = 0 and A = D/o. Then given any
f > 0, there is a smaller neighborhood U C Uo of 0 and e. C r extension] : Rn -+ Rn

with]IU = IIU, (j - A) E Cl(Rn,Rn ), and Lip(] - A) < f.


PROOF. Let .B(x) be the Coo bump function given by Lemma 7.6 on R with a = 1 and
b = 2. Then there is a uniform K ~ 1 such that 1.B'(x)1 ~ K for all x E R.
Let 9 = I -A, so Dgo = O. Take r > 0 such that IIDgx ll < f/(4K) for all x E B(O,2r)
and B(O,2r) C Uo. Finally, let

<p(x) = .B( ~ )g(x) and


r
](x) = Ax + <p(x).
Clearly,] equals I inside U;: B(O,r) and A outside B(O,2r). Thus] E Cl(Rn, Rn).
All that remains is to check the bound on the Lipschitz constant of <p. Since <p ;: 0
outside B(0,2r), we can assume x, y E B(0,2r) in the following calculation of this
Lipschitz constant:

I<p(x) - 'P(y)1 = I.B( ~ )g(x) -.B( Iyl )g(Y)1


r r

~ 1.B(lxl) - .B(l!l)I·lg(x)1 + 1.B(lyl)I·lg(x) - g(y)1


r r r
1 l f
~ K· ~ 'Ix - yl· (4K) 'Ixl + 1· (4K) 'Ix - yl
1 1
~ f(2 + 4K)lx - yl
~ flx-yl·
This completes the proof of the proposition. o
For an arbitrary nonlinear map with a fixed point at p, there is a simple translation
that brings the fixed point to the origin. By Proposition 7.7, there is an extension]
which equals I in a neighborhood of 0 and satisfies Theorem 7.1. Then] and A are
conjugate on all of Rn. Because] and I are equal on a neighborhood of 0, I and A
are conjugate on a neighborhood of the origin. This shows how the local version of the
Hartman-Grobman Theorem for a diffeomorphism, Theorem 6.2, follows from Theorem
7.1.
5.8 PERIODIC ORBITS FOR FLOWS 165

5.7.2 Proof of the Hartman-Grobman Theorem for


Flows
As in the statement of the theorem of the local Hartman-Grobman Theorem for flows
near a fixed point, Theorem 5.3, we consider the differential equation i = I(x) in a
neighborhood of a hyperbolic fixed point p. By a translation we can take p = O. Let
B = D 10. Using a bump function we can find an extension I which equals I in a
neighborhood of 0 and equals B outside a larger neighborhood. Let 'P' be the flow for
the differential equation x = /(x) and e'B the flow for the linear equation x = Bx. Note
that 'PI is also the flow for I in a neighborhood of O. Take the time one maps 'PI(X)
and eBx. By Theorem 7.1 there is a conjugacy h between 'PI and eB , h 0 eB = 'PI 0 h,
and h is unique among maps for which h - id is bounded. The following lemma shows
that h actually conjugates 'PI and elB for all times t.
Lemma 7.S. TIle J//al' " satisfies 'P' 0 h 0 e- IB = h for all real t.

PROOF. Fix a real number t. First, we show that not only h but also 'PI 0 h 0 e- IB is a
conjugacy between 'PI and e B :

'PI 0 ['PI 0 h 0 e- IB ) 0 e- B = 'PI 0 ['PI 0 hoe-B) 0 e- IB


= 'PI 0 h0 e- IB

because h is a conjugacy for the time one maps of the flows. The conjugacy h is unique
among maps for which h-id is a continuous bounded map. To see that 'Plohoe-IB -id
is bounded, note that

The second quantity on the right hand side is bounded because h - id is bounded. The
two flows are equal outside a bounded set 80 the first quantity on the right hand side is
bounded by compactness. Thus 'P' 0 h 0 e- IB - id is bounded and also is a conjugacy.
By the uniqueness of Theorem 7.1, 'PI 0 hoe-IB = h and we have proved the lemma. 0
By the lemma we have a conjugacy of the extensions on all of R". By restricting to a
neighborhood of the fixed point, we have proved the local Hartman-Grobman Theorem
for flows, Theorem 5.3.
The next section discusses the behavior of a flow in a neighborhood of a periodic
orbit.

5.8 Periodic Orbits for Flows


We consider periodic orbits in this section and see how the eigenvalues of an appro-
priate matrix determine the stability. For a different approach than given here to the
analysis of orbits in a neighborhood of a periodic orbit, see Hale (1969).
Remember that a point p is a periodic point with (least) period T provided 'PT (p) = p
and 'P'(p) ~ P for 0 < t < T. If p is a periodic point with period T, then the set of
points O(p) = {'PI(p) : 0 ~ t ~ T} is called a periodic orbit or closed orbit.
It is important to understand the flow in a neighborhood of a periodic orbit. One
way to do this is to follow the nearby trajectories as they make one circuit around the
periodic orbit. The following example gives a case where the orbits can be given as
explicit functions of time.
166 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Example 8.1. Consider the equations

:i; = -y + x(l _ X2 _ y2)


iI = x + y(l - x 2 _ y2).

which in polar coordinates are given by

r = r(l - r2)
0=1.
If we look at orbits with 9(0) = 0 then 9(271") = 271". Thus the solutions return to the
surface {9 = 0 mod 271"} after a time of 271". The map P which takes the r value at
time 0 to the r value at time 271" incorporates the effect of making one circuit around
the periodic orbit. This map P is called the first return map or Poincare map from the
surface {9 = 0 mod 271"} to itself. The solutions for this system of differential equations,
and so also the Poincare map, can be explicitly calculated, and used to show that r = 1
is an attracting periodic orbit. Also in this case, it can be seen directly that r = 1 is an
attracting periodic orbit because r > 0 for r < 1 and r < 0 for r > 1.
Definition. In general, let 'Y be a periodic orbit of period T with p E 'Y. Then for some
k the k-th coordinate function of the vector field must be nonzero at p, /,,(p) =I o. We
take the hyperplane E = {x: x" = p,,}. This hyperplane E is called a cross section
or tmnsversal at p. For x E E near p the flow cpt(x) returns to E in time r(x), which
is about T, as can be seen by the Implicit Function Theorem (as carried out explicitly
in the proof of the following theorem.) Let VeE be an open set in E on which
r(x) is a differentiable function. The first return map or Poincare map is defined to be
P(x) = cpT(X) (x) for x E V.
REMARK 8.1. Example 8.1 gives a trivial example where the Poincare map can be
explicitly calculated. In two subsections below we determine properties of the Poincare
map in two different situations of differential equations in R2. These two subsections
are not necessary for the rest of the book but give some idea of how the Poincare map
could be determined when the equations can not be explicitly solved. In particular,
we determine the nature of the Poincare map for the Van der Pol equation, and we
determine the derivative of the Poincare map for equations in two dimensions in terms
of an integral of the divergence of the vector field. In the first subsection below, we show
how it is possible to take a map and recover a flow which has this map as a Poincare
map. This process is called the suspension of a map.
The proof of the following theorem carries out the construction of the Poincare map
and determines some of its properties in terms of the Implicit Function Theorem. It
should also be noted that a cross section can be taken to be a hypersurface which is
nonlinear but we do not include the necessary modifications.
Theorem 8.1. Consider a C T flow cpt for r ~ 1.
(a) If p is on a periodic orbit 'Y of period T and E is a transversal at p, then the first
return time r(x) is defined in a neighborhood V ofp in E and r : V -+ R is cr.
(b) If P : V --+ E given by P(x) = cpr(x)(x) is the first return map, then Pis cr.
PROOF. Let E = {x : Xlc = pic} be the cross Sf"" ", P with f,,(p) ::j:. O. Define
1J'!(x, t) = cp~(x) - Pic. Then 1J'!(x, t) = 0 if .'ii, x) E E. Also, 1jJ(p, T) = O.
! !
Finally, 1J'!(x, t) = /lcOcpt(X) and 1J'!(p, t)lt=T = hc(p) ::j:. O. By the Implicit Function
Theorem, there is a neighborhood V of p in E such that for x E V it is possible to
5.8 PERIODIC ORBITS FOR FLOWS 167

solve for t as a function of x, t = r(x), such that 1/J(x, r(x» == O. Moreover, r(x) is a
CT function of x. (The reader may want to look at the section on the statement of the
Implicit Function Theorem.) Once we have that r(x) is CT, it follows that the Poincare
map P is CT because rpl(X) is jointly C T in t and x. 0

Now that we have shown that the Poincare map is differentiable, we can indicate
the relationship between the eigenvalues of the derivative of the Poincare map and the
eigenvalues of the derivative of the flow, Drp~. The first step is given in the following
lemma.

Lemma 8.2. (a) Ifrpl(x) is a solution ofi = f(x), then Drp;f(x) = f(rpl(X» for any
t.
(b) If"l is a periodic orbit of period T and p E "I, then Drp~ has 1 as an eigenvalue
with eigenvector f(p).
(c) Ifp and q are two points on a T-periodic orbit "I, then the derivatives Drp~ and
Drp~ are linearly conjugate and so have the same eigenvalues.

PROOF. We have that

d
f(rpt(x» = dsrp·(x)I.=1

= :Srpl 0 rp·(x)I.=o
= Drp~f(x).

This proves part (a).


For part (b), rpT(p) = P so f(p) = f(rpT(p» = Drp~f(p). This proves part (b).
For part (c), assume that q = rpT(p). Then

so taking the spatial derivative (with respect to x) at p yields

Thus Drp~ and Drp~ are linearly conjugate by the linear map Drp~. o
Deftnition. If "I is a periodic orbit of period T with p E "I, then the above result shows
that the eigenvalues of Drp~ are I,AI," .,An-I' The n -1 eigenvalues Ab ... ,An-l are
called the characteristic multipliers (or eigenvalues) of the periodic orbit. (Note that
Lemma 8.2(c) shows that the characteristic multipliers do not depend on the point p.)
A periodic orbit "I is called hyperbolic provided IAjl '" 1 for all the characteristic
multipliers (1 ::; j ::; n - 1). It is called attracting, a periodic attractor, or a periodic
sink provided all the characteristic multipliers are less than one in absolute value. It is
called repelling, a periodic repeller, or a periodic source provided all the characteristic
multipliers are greater than one in absolute value. A hyperbolic periodic orbit which is
neither a source nor a sink is called a saddle periodic orbit.

With these results and definitions, we can now state and prove the following theorem
which says that the characteristic multipliers of the periodic orbit and the eigenvalues
of the Poincare map are the same.
168 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Theorem 8.3. Let p be a point on a periodic orbit "( of period T. Then the character-
istic mUltipliers of the periodic orbit are the same as the eigenmlues of the derimtive
of the Poincare map at p.
PROOF. Let 7r : R n ...... E be the projection along f(p). The Poincare map is P(x} =
<pT(X)(X} for x E E, so DPx = D<p~(x)IE + f 0 <pT(x)(x}DTxIE. Now taking the point
p on ,,(, DPp = 7rD<p;(p)IE because 7rf(p) = O. Lastly, the characteristic multipliers of
the periodic orbit are the eigenvalues of 7rD<p;(p) IE. This can be seen by taking a basis
of vectors v I , ... ,v n -' in E and adding f(p} to make a basis of Rn. Then for some
entries C,

D<ppT(p) _- (BC 0)
1 '
and

7rD<p;(p)IE = B.

Thus the eigenvalues of B are the characteristic multipliers. o


Now we can state and prove the theorem giving sufficient conditions for stability in
terms of the characteristic multipliers.
Theorem 8.4. (a) Let "( be a periodic orbit for which all the characteristic multipliers
AJ satisfy IAil < 1. Then,,( is asymptotically stable (attracting).
(b) With the assumptions of part (a), if q is a point for which d(<p'(q), "() goes to
zero as t goes to infinity, then there is a point Z E "( such that d( <p' (q), <p' (z» goes to
zero as t goes to infinity. This is called the in phase condition, q is in phase with z.
(c) Let "( be a periodic orbit for which there is at least one cllaracteristic multiplier
Ak with IAkl > 1. Then -y is not Liapunov stable.
PROOF. Let E be a cross section at p, and P : VeE -+ E be the Poincare map.
Then pep) = p and all of the eigenvalues Aj of DPp have IAjl < 1. Thus p is an
asymptotically stable fixed point for P, and there is a subneighborhood Vo C V of P
such that for q E Vo, d(pn(q}, p} goes to zero as n goes to infinity. The next lemma
shows that this property of the map carries over to the flow in a neighborhood of p in
E.
Lemma 8.5. If q E Vo then lim,_co d(<p'(q), "() = O.
PROOF. For q E Vo, let qn = pn(q). Then qn ...... P as n ...... 00 by the choice of Vo
(and because p is asymptotically stable). Let Tn = T(qn} be the return times used to
define the Poincare map for these points. Then Tn converges to T = T(p} because T is a
differentiable (and so continuous) function. Therefore Tn ~ 2T for n ~ N,. We need to
control the flow of points in times between crossing the transversal, i.e., for times less
than 2T. Given f > 0 there exists a 6 > 0 such that if d(x, p} ~ 6 and 0 ~ t ~ 2T,
then f > d(<p'(X),<p'(p» ~ d(<p'(x),"(}. Now for the qn, there is a N2 ~ N, such that
for n ~ N 2, d(qn, p) ~ 6, so d(<pl(qn}'''() < f for 0 ~ t ~ Tn ~ 2T. Now we want to see
that the flow is near for all sufficiently large times. Let tn = L;:~ Tj' so <p'n (q) = qn
converges to p as n goes to infinity. Thus for t ~ tN., there is some n ~ N2 with
tn ~ t < tn + Tn and d(<p'(q), -y) < f. Since this is possible for any f > 0, we have proved
the lemma. 0
The above lemma proves part (a) if we start on the transversal, E. Now let U be
an arbitrary neighborhood of "(. Let Yo c E be as in Lemma 8.5, and take V, C Vo a
smaller neighborhood of pin E such that T(X) ~ 2T for x E VI, and U I == {<p'(x} : x E
VI and 0 ~ t ~ T(X)} C U. Now take V2 C VI C E such that pn(V2) C VI for all
5.S PERIODIC ORBITS FOR FLOWS 169

n ~ O. Finally, let U2 == (!pt(x) : x E V, and 0 $ t $ T(X}}. Now take q E U,. Let


qo = !p-IO(q) E V2 C VI, so !pt(qo) E U2 C U for 0 $ t $ T(qo) $ 2T. Thus !pt(q) E U I
for 0 $ t $ T(qo) - to. Let qn = P"(qo) = !pt. (q). Then qn E VI and !p'(q) E U I C U
for tn $ t $ tn+I' Thus !pl(q) E UI C U for all t ~ 0, and so 'Y is orbitally Liapunov
stable. Also qn = !pt. (q) E VI and d(!pl (q), 'Y) goes to zero as t goes to infinity by an
argument as in Lemma 8.5. This proves part (a).
We want to find a point z E 'Y such that d(!pt(z),!pt(q» goes to zero as t goes to
infinity. It suffices to consider q E Vo C E. As above, qn = P"(q) = !pt. (q) where
tn = L.;~~ TO pj(q). We want to show that tn grows like nT, so we look at the
difference, h n = tn - nT = L.;~~(Tj - T). We claim that h n is a Cauchy sequence.
Lemma 8.6. The numbers hn form a Cauchy sequence.
PROOF. We need the fact that the derivative of T is bounded by a constant C > 0,
IIDTxll :oS: C, so T is Lipschitz, Le.,IT(x) - T(Y)I $ Cd(x,y) for x,y E Vo. Also, there is
a C I ~ 1 and v < 1 such that d(Qj,p) $ Clv'd(q,p) for all j ~ O. Then

n+k-I
Ihn+k - hkl = I L (T(Qj) - T(p»1 (since T = T(p»
j=k
n+k-I
:oS: L Cd(Qj,p)

n+k-I
$ L CClv'd(q,p)
j=k

$ CCld(q,p)--.
v"
I-v

This last quantity goes to zero as k goes to infinity, so Ihn+k - h,,1 does also. This proves
the lemma. 0
Now, since hn is a Cauchy sequence, it approaches some limit value A. Then tn ~
A + nT. We use A to define z as z = !p-A(p). We need to show that this z works. Let
h n = tn - nT or tn = h n + nT. Then

d(!p'. (z), !pt. (q» = d(!p".+nT-A(p), pn(q»


= d(!p".-A(p),pn(q» (since !pnT(p) = p)
$ d(!p"·-A(p), p) + d(pn(p), P"(q».
On the last line, the first term goes to zero because hn - A goes to zero, and the
second term goes to zero because p is asymptotically stable for the map. Therefore
d(!p'(z),!p'(q» goes to zero as t = tn goes to infinity. For arbitrary large t, there is an
n with tn $ t < tn+1 and Itn - tn+ll < 2T. Using the uniform continuity of !p'(x) for
a bounded time interval, we get that d(!p'(Z),!pI(q» goes to zero as t goes to infinity.
This proves part (b).
The instability in part (c) follows as before from the Hartman-Grobman Theorem.
o
The In Phase Property does not always hold if the attraction to the periodic orbit is
not at a geometric rate (given by the eigenvalues), as the following example shows.
170 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Example 8.2. This example is given in polar coordinates. Consider

T = -(r - rO)3
iJ = 1 +r - rO,

where rO > O. Then r = rO is an attracting periodic orbit. The solutions are given by

for ro = rO
r(t) = {ro sign(ro _ rO)
[2t + (ro - ro)-2jl /2 for ro '" rO
for ro = r'
9(t) = { : : : : + sign(ro - rO){[2t + (ro - ro)-2J1/ 2
-(ro - rO)-I} for ro '" rO.

Let r(t) and 8(t) = 90 +t+[2t+(ro _r·)-2j1/ 2 - (ro-r·)-l be a solution with ro > r·.
We would like to select a solution f(t) == r' and 8(t) = 80 + t with ro = r' and so that
the distance between the solutions goes to zero. But

19(t) - 8(t)1 = 1[2t + (ro - r·)-2jI / 2 - (ro - r·)-l + 90 - 90 1


does not go to zero for any choice of 80 . In fact the difference of the angles goes to infinity
before we take everything mod 211', which means that the solution off the periodic orbit
keeps going around the angle direction more rapidly than the solution on the periodic
orbit. This proves that the solutions do not approach the periodic orbit in phase with
a solution on the periodic orbit.
We now consider a flow near a hyperbolic periodic orbit, "Y, of period T. If p E "Y,
then
D<p~ (lE~ EB lE~) = E~ EB lE~.
We can define the splitting along the whole orbit by

if <pI (p) = q, and consider the nonnal bundle

N = {(q,y) : q E "Y and y E lE~ EBlE~}.

Then, we can consider a flow \{lIon this normal bundle N which is linear on the "fibers"
lE~ EB lE~,
\{I1(q, y) = (<pI(q), D<p~y).
Note that lE~ EB lE~ is n - 1 dimensional and the set of q on "Y is one dimensional, so
the total space S is n dimensional. Thus it makes sense to say that \{lIon this space
is conjugate to <pIon a neighborhood of "Y. We can now state the following Hartman-
Grobman Theorem near a periodic orbit for a flow.
Theorem 8.7 (Hartman-Grobman). Let "Y be a periodic orbit for the differential
equation x = f(x). Then the flow <pI of I is topologically equivalent in a neighborhood
of"( to the linear bundle flow \{II (defined above) in a neighborhood of"( x {OJ.
PROOF. The Poincare map of \{It is the derivative of the Poincare map for <pt. Since
\{ITllE~ EBE~ is hyperbolic, the Poincare map of <pI in a neighborhood of p is conjugate to
5.8.2 AN ATTRACTING PERIODIC ORBIT FOR THE VAN DER POL EQUATIONS 171

the Poincare map of lilt in a neighborhood of {OJ. From the conjugacy of the Poincare
maps, it follows that the flow cpt in a neighborhood of '"Y is topologically equivalent to
lilt in a neighborhood of'"Y x {OJ. 0
REMARK 8.2. The above result about the flow being in phase for a periodic sink can
be used to show that the two flows are topologically conjugate in this case. In fact, if
the periodic orbit is hyperbolic then it is always topologically conjugate (and not just
equivalent) to the linear bundle flow. The main step is to find a transversal E such that
for a neighborhood Eo of'"Y In E, cpT(Eo) C E. See Pugh and Shub (1970a) and Irwin
(1970a).

5.8.1 The Suspension of a Map


The above discussion took the flow near a periodic orbit and found a map on a one
dimensional lower space. This is a local map defined only at points near the periodic
orbit. Sometimes it is possible to make this construction more globally, but we do not
investigate this. See Fried (1982). On the other hand, there is a general construction
that takes a C r diffeomorphism on a space and constructs a C r flow on a space of one
higher dimension. (It also works for a homeomorphism to give a continuous flow.) The
new space is not a Euclidean space and is often not even the product of a Euclidean
space and a circle. In any case this construction is useful in order to understand what
flows are possible. The flow is called the suspension 0/ the map and is essentially what
topologists call the mapping torus. Given a map / : X -+ X we consider the space
X x R with the equivalence relation, (x, s + 1) '" (f(x) , s). Then we consider the space

x = X x R/ "'.
To get all points it is enough to consider 0 $ s $ I, but the other points are included
because they make it clear that the quotient space has a c r structure if / is cr. Now
we consider the equations on X x R given by

X=O
8 = 1.

This induces a flow <pt on X x R which passes to a flow cpt on the quotient space X.
Notice that <pl(X,O) = (x,l) '" (f(x), 0), so the flow on X indeed has / as its (global)
Poincare map. This ends our discussion of the construction.

5.8.2 An Attractin~ Periodic Orbit for the


Van der Pol Equations
In the introductory chapter, we mentioned that the Van der Pol equations

x=y-x3 +x (1)
y= -x
have an attracting periodic orbit. In this section we prove this result for a slightly more
general set of equations called the Lienard equations. The importance of this example
is that it gives a set of equations for which it is possible to verify the properties of the
Poincare map and the existence of an attracting periodic orbit for a nontrivial example.
172 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Consider the second order scalar equation given by

x + g(X)x + x = o. (2)

If g(x) is zero, this is the linear oscillator. The term involving:i; is a "frictional" term
where the friction depends on the position x. In fact for small x we are going to take
g(x) negative so it is an "anti-frictional" term, while for large x we are going to take g(x)
positive so it is a "frictional" term. As mentioned in the introductory chapter, these
equations can be used for a certain type of electrical circuit. See Hirsch and Smale
(1974). We give the precise assumptions below.
To change the equations into the form we use in our analysis we do not let v = x.
Instead we let /(x) be the antiderivative of g(x), f'(x) = g(x) with /(0) = 0, and set
y = x + /(x). Then equation (2) becomes

x = y -/(x) (3)
iJ = -x.
This set of equations is called the Lienard equations. Figure 8.1 shows the phase portrait
for /(x) = -x + x 3 .

FIGURE 8.1. Phase Portrait of the Lienard Equation for /(x) = -x + x3


with -2 S x S 2 and -2 S y S x

-1 1 2

-2

-4

-6

FIGURE 8.2. The Graph of /(x) = -x + x 3 for -2 S x S 1


5.8.2 AN ATTRACTING PERIODIC ORBIT FOR THE VAN DER POL EQUATIONS 173

We now proceed to give the assumptions on / that make our analysis work. The
model for / is /(x) = -x + x 3 which gives the Van der Pol equations (1) given above.
Notice for /(x) = -x + x 3 that (i) / is an odd function of x, (ii) /(x) < 0 for 0 < x < 1
and /(x) > 0 for x > 1, (iii) / is a strictly monotone increasing function of x for x > 1,
and (iv) /(x) goes to infinity as x goes to infinity. See Figure 8.2. We make similar
assumptions on the general function which we consider. Assume
(i) f is a C l odd function of x,
(ii) there is an x· > 0 such that /(x) < 0 for 0 < x < x· and /(x) > 0 for x > x·,
(iii) / is a monotone increasing function of x for x > x·, and
(iv) /(x) goes to infinity as x goes to infinity.
With these assumptions, we can state the main result.
Theorem 8.8. With assumptions (i)-(iv) given above, equations (3) have a unique
nontrivial periodic orbit 'Y. This orbit is globally attracting in the sense that if q ¥ 0
then w(q) = 'Y.
PROOF. The first thing to note is that the only fixed point is at the origin. Next, we
want to show that nontrivial solutions (solutions for q =I 0) go around the origin. To
check that solutions go around the origin, we look at the curves where x = 0 and iI = O.
Define

y+ = {(O,y) : y > O},


y- = {(O,y) : y < O},
g+ = {(x, y) : y = /(x), x> O}, and
g- = {(x,y) : y = /(x),x < O}.
Then x = 0 on g+ and g- , and iI = 0 on y+ and Y-. These curves are the isoclines for
these equations. Let A be the region between y+ and g+, B be the region between g+
and v- , C be the region between y- and g- , and D be the region between g- and y+.
Lemma 8.9. Every nontrivial trajectory moves clockwise around the origin crossing
y+, entering A, crossing g+, entering B, crossing Y-, entering C, crossing g-, entering
D, and returning to y+.
PROOF. Start by assuming that (xo,Yo) E y+. Let (x(t),y(t» be the solution. Then
x(O) > 0 so the solution enters A and XI = x(td > 0 for small time tl > O. As long as
the solution stays in region A, x > 0 so x(t) ~ XI' In the region

((x,y): x ~ XI,y ~ Yo, (x,y) E A},

iJ ~ -XI < 0 so the solution can only exit A by crossing the curve g+ at some time
t2 > O.
The argument in region B is slightly more delicate. Let (X2' Y2) E g+ be the solution
at time t2' The trajectory enters region B for some time t3 > t2' Moreover, all along
g+ , iI < 0, so once the trajectory leaves a smaIl neighborhood of g+, it can never return.
Therefore along the trajectory there is an upper bound on x, x ~ -a < 0 for t > t3, as
long as it stays in region B, and x(t) ~ X3 - a(t - t3) for as long as it stays in region B.
The trajectory can leave B only by crossing v- or by the solution becoming unbounded.
By the above bound, x(t) must become zero at least by time t = ta + xa/a. However,
in this time interval iI = -x ~ -X3, so y(t) ~ Y3 - X3(X3/a). Since there is an a priori
bound on y(t) on this time interval, the solution must exit by crossing y-.
By the symmetry of the equations, the arguments in the other regions are similar to
those given above. 0
174 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Because the solutions travel from v+ to v- and back to v+, we can define the
Poincare maps {3 : v+ -+ y- and u : y+ -+ v+. Because of the symmetry of the
equations, if (x(t), y(t» is a solution then (-x(t), -y(t» is also a solution. Therefore
u(q) = -{3(-{3(q». Note {3(q) E v- so -{3(q) E v+, {3(-{3(q» E v- and -{3(-{3(q» E
y+.
Lemma 8.10. The following are equivalent:
(a) p E v+ is on a periodic orbit,
(b) u(p) = p, and
(c) {3(p) = -po

PROOF. Note that solutions do not cross themselves and all solutions on y+ enter A,
so u and {3 are one to one monotone functions. Therefore the only way for p E v+ to
he a periodic solution is for q(p) = p. This proves that (1) and (2) are flquivalent.
If (J(p) = -p then u(p) = -~(-{J(p» = -,B(p) = p. 011 the ot.her hand, If
,B(p) < -p, then -,B(p) > p, /7(p) = -,B(-{3(p» > -,B(p) > p, and u(p) :-I p. The
case for ,B(p) > -p is similar. 0
Thus to find the periodic orbits we can use the Poincare map,B. To deten;nine the
properties of,B we use a "Liapunov function" L(x, y) = (1/2)(x 2 + y2). Then the time
derivative along solutions is given by

L(x, y) = -x!(x),
which is not always of one sign. The change of L as solutions move from v+ to v- is
given by
["(P)
6(p) == L(,B(p» - L(p) = Jo L(x(t),y(t»dt,

where tl(p) is the time to reach y-. Then {3(p) = -p if and only if 6(p) = o. To
calculate 6, we sometimes look at the solutions as functions of x and write (x, y(x», or
as functions of y and write (x(y), y). We write ~~ (x, y) for the total derivative along

the solution written as a function of x, (x, y(x», and similarly ~~ (x, y) for the total
derivative along the solution written as a function of y, (x(y), y). Then

dL( ) _ L(x,y)
d x x,y - x
.
-x!(x)
=y- !(x)'

and
dL BL:i; BL
-(x,y) = (-)("7) +-
dy Bx y By
= x(y - !(x») +y
-x .
=!(x).
Remember that x' is the value where !(x') = O. There is a unique point p' E v+
that Hows to (x'. 0) when it first reaches g+. Let r' = Ipol. This value is important in
determining the properties of the function 6(p).
5.8.2 AN ATTRACTING PERIODIC ORBIT FOR THE VAN DER POL EQUATIONS 175

Lemma 8.11. (a) For P E v+ with 0 < Ipi :5 r·, 6(p) > O.
(b) The function 6(p) is 8 monotonically decreasing function of Ipi for Ipi ~ r·.
(c) As Ipi goes to infinity, 6(p) goes to minus infinity.
PROOF. If 0 < Ipi :5 r· then x(t) :5 x· along the trajectory from p to (j(p). Thus
f(x(t» :5 0 (and strictly negative for most times), £(x(t), y(t» ~ 0 (and strictly positive
for most times), and so 6(p) > O. This proves part (a).

FIGURE 8.3

Now assume that Ipi > Ip/l ~ r·. Let 1'1 be the part of the trajectory of p from v+
until it hits the vertical line {(x·, y)}. Let 1'2 be the part of the trajectory from this
first time of hitting the vertical line {(x·, y)} until it hits this same vertical line again.
Finally, let 1'3 be the part of the trajectory from this second crossing of {(x·, y)} until
it reaches v -. Let l' be the combination of 1'1, 1'2, and 1'3, which is the trajectory from
p to (j(p). See Figure 8.3. Similarly, let 1'; be the parts of the trajectory for p'. We
need to compare the changes of L along 1'j and 1';. For 0 :5 x < x· if y and y' are
chosen so that (x,y) E 1'1 and (X,y') E 1'~, then y > t,
y - f(x) > y' - f(x) > 0, and
-xf(x) > 0, so the changes of L along 1'1 and 1't are compared as follows:

1.
.., L(x,y)dt =
1""
0
-xf(x)
Y _ f(x) dx

< 1",'
0
-xf(x)
y' - f(x) dx

= 1. : £(x, y) dt .

Similarly, along 1'3 and 1'3' y < y' < 0 but x is decreasing so

1.... L(x,y)dt = - 10
r' y-xf(x)
_ f(x) dx

<- 1",' 0
-xf(x)
y' - f(x) dx.

= 1 'Yo
£(X',y')dt.

For the comparison of the changes of L along 1'2 and 1'~, we need to further split up
"(2 into three subpieces. Let y~ and Y3 be the y values where -rl and 1'3 meet {(x·,y)}.
176 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Let ':b be the part of 1"2 between y\ and Y3' We use this part of the trajectory as a
graph over y and notice that y is decreasing. Also let YI and Y3 be the y values where
1"1 and 1"3 meet {(XO,y)}. Then

1.
~
L(x,y)dt = - 111; f(x)dy - 111;
~
,f(x)dy -
~
illS f(x)dy
~

<- 111; II;


f(x')dy

= 1..,;
L(x', y') dt .

Note that
-1 ..,.
f(x)dy < - L'Y.
f(x) dy

because the x values on 1"2 are larger than the corresponding x' values on 1"~ 80 f(x) >
f(x') and the integral is more negative. Combining the comparisons of integrals shown
above, we get that 6(p) < 6(p'). This proves part (b).
To get that the limit of 6(p) equals minus infinity, note that as Ipi goes to infinity,
the x values on 'Y2 go to infinity, 80

6(p) <- 111;


II,
f(x)dy -+ -00.

o
PROOF OF THEOREM 8.8. We showed above that P E v+ is on a periodic orbit if and
only if 6(p) = O. Lemma 8.11 proved that 6(p) is positive for Ipi :5 rO. For Ipi 2: rO,
6(p) is monotonically decreasing, 80 it has a unique zero at 80me Po with Ipol > rO.
Thus there is a unique periodic orbit.
fUrther, for any P with Ipi > IPol, 6(p) < O. The trajectory for P can never cross
the periodic orbit through Po, 80 it stays outside this periodic orbit and 6(u i (p» < 0
for j > O. Therefore lui(p)1 is monotonically decreasing, and the 8OIution comes inward
toward Po with each revolution. The limit of the 6(u j (p» must be a fixed point of
u and 80 must be Po. Then the trajectory for p limits on the periodic orbit for Po.
Similarly, if Ipi < IPoI, then 6(u j (p» > 0, lui(p)l is monotonically increasing, and 80
6(u i (p» must converge to Po. This completes the proof of the theorem. 0

5.8.3 Poincare Map for Differential Equations in the


Plane
In this subsection, we consider a differential equation in the plane given by

± = X(x,y)
iJ = Y(x,y).

Let V(x. y) = ( ;~:::~) be the corresponding vector field. Denote the divergence of
" at q by (diT ")(q). Assume ~ E is a tnms-.'erSal and E' c E is an open subset 011
5.8.3 POINCARE MAP IN THE PLANE 177

which the Poincare map is defined, P : E' _ E. Since the differential equatiollB are in
the plane, E is a curve which can be parameterized by"'( : ] - E with "'(1') = E' and
I",('(s)I = 1. Let V.J. (q) be the scalar component of V perpendicular to the tangent line
to E at q given by
V.J. 0 "'(s) = det(",('(s), V 0 "'(s)).
In the case where E is a horiwntalline, {(x, yO) : Xl < X < X2}, then V.J. (q) = Y(q).
In the case where E is a vertical line, {(xo,y) : Yl < Y < Y2}. then V.J.(q) = -X(q).
As in the general case, let T(q) be the return time for q E E'. so P(q) = !p1'(q)(q).
With these definitions and notation we can state the main theorem of this subsection.
Theorem 8.12. Let"'(:]' -+ E' be a parameterization of the transversal E' as above
with I",('(s)I = 1. Then for s E ]',
V 0 "'(s) 11'01'(') .
(Po"'()'(s) = V .J. P () exp( (divV) o!pt o"'(s)dt) .
o"'(s .1. 0 0
In particular, if P(qo) = qo and "'(so) = qo, then
r(qo)
(P 0 "'()'(so) = exp (10 (div V) 0 !pI (qo) dt).

REMARK 8.3. This theorem is contained in Section 28 of Andronov. Leontovich, Gor-


don, and Maier (1973). The case where P(qo) = qo is the one most often used. In the
application in the example below we use the general case.
PROOF. The first variation equation states that
d I I
diD!Pq = DV.... (q)D!pq.
Since det(D!p~) = det(id) = I, Liouville's formula for time dependent linear equations
gives that
r(q)
det(D!p~(q» = exp (10 (div V) 0 !pt(q) dt).

(We leave it as an exercise to prove this time dependent version of Liouville's formula.
See Exercise 5.35.) Notice that the right hand side of this equality is the integral in the
formula for (P 0 "'()'(s) as stated in the theorem. Therefore to complete the proof, we
must relate (P 0 ",()'(s) with det(D!p~';':)(·».
Taking the derivative of P 0 "'(s) = !pTO'Y(')(",(s» with respect to s yields
(P 0 ",()'(s) = (D!p~~:)(')h'(s) + (T "'()'(s)[V !p1' 'Y(') ("'(s»J
0 0 0

= (D!p~~:?)h'(s) + (T 0 "'()'(s)[V 0 P 0 ",(s)J.


Then
(P 0 ",()'(s)[V.1. 0 Po ",(s)J
= det«P 0 "'()'(s). V 0 P 0 "'(s»
= det«D!p~~:)(')h'(s), V 0 Po "'(s»
+ det«T 0 "'()'(s)[V 0 P 0 "'(s)J, V 0 Po "'(s»
= det«D!p~~:)(')h'(s), (D!p~~:)('»V 0 "'(s»
= det(D!p~~:)('» det(",('(s), V 0 "'(s»
(TO'Y(')
= exp (10 (div V) o!pt 0 "'(s) dt)V.1. 0 "'(s).
178 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Dividing by V.l 0 po -y(s) gives the desired formula. o


Example 8.3. Consider the Volterra-Lotka equations which model the populations of
two species which are predator and prey:

:t = x(A - By)
iJ = y(Gx - D)
with all A, B, G, D > O. There is a unique fixed point in the interior of the first quadrant
with x· = D/G and y. = A/B. Let E = {(x,y·) : x· :5 x < oo}. By using arguments
like those for the Van der Pol equation, it can be shown that every point (x, yO) of E
with x > x· returns to E, P : E -+ E. The argument below shows that this map extends
differentiably so that P(x·) = x·. We show that all points in the open first quadrant
except (x·, y.) lie on periodic orbits by applying Theorem 8.12. This fact is usually
verified by finding a real valued function L which is constant on orbits, t == o.
Let V be the vector field for the above differential equations. The divergence of V is
given by

(divV)(x,y) = (A - By) + (Gx - D)


:t iJ
= -
x
+-.
y

We write P(x) for the x-value of the Poincare map of the point (x, y.). The integral in
Tneorem 8.12 becomes

exp ( l
o
dr ):t
(-
x
+ -iJ ) dt)
y
= exp (
l
r
P (r) 1
- dx
X
+ 1-
11 '

II'
1
Y
dy)

= P(x)
x
Applying the formula of Theorem 8.12 yields

P'(x) = Y(x, y.) . P(x)


Y(P(x), yO) x

Defining I(s) = Y(s,y·)/s, we get that

10 P(x)P'(x) = I(x).

Integrating from x· to x yields

F 0 P(x) - F 0 P(x·) = F(x) - F(x·),

where F is the antiderivative of I. Since P(x·) = x·, this gives

F 0 P(x) = F(x).

Since I(s) > 0 for s > x·, F(s) is strictly monotonically increasing. Therefore, the fact
that F 0 P(x) == F(x) implies that P(x) == x. This completes the proof that all orbits
are periodic.
For other examples applying Theorem 8.12, see Robinson (1985).
5.9 POiNCARE-BENDlXSON THEOREM 179

5.9 Poincare-Bendixson Theorem


The Poincare-Bendixson Theorem is a result about flows in a region in the plane
or on the two sphere, 8 2 . The reason for the restriction to these domains is that it
depends on the Jordan Curve Theorem. It also depends on the fact that a transversal
is one dimensional. The conclusion of the theorem is that the w-limit set of a point
is either a closed orbit or contains a fixed point. In order to insure that the w-limit
set is nonempty, we need to assume that the forward orbit is bounded, i.e., O+(P) is
contained in a compact subset of the domain. See Section 5.4 for examples of limit sets
for flows in the plane. For other references with more details on this result, see Hale
(1969), Hartman (1964), and Hirsch and Smale (1974).
Theorem 9.1 (Poincare-Bendixson Theorem). Let V be a planar domain, i.e.,
either a simply connected subset ofR2 or V = fP. Let <pI (x) be a Cl Bow on 'D. Let
p E V be a point such that O+(p) is bounded and w(p) does not contain any fixed
points. Then w(p) is a periodic orbit. If we replace the assumptions on the forward
orbit with the assumption that O-(p) is bounded and a(p) does not contain any fixed
points, then the conclusion is that a(p) is a periodic orbit.
In the case where the original point p is not itself on a closed orbit, the closed orbit
is called a limit cycle. Thus a limit cycle is a periodic orbit "I such that there exists a
point p rt "I with w(p) = "I. When applying the theorem to prove the existence of a
periodic orbit, this periodic orbit is usually a limit cycle.
The above theorem can be used to prove the existence of a periodic orbit as the
following corollary shows.
Corollary 9.2. Let <pI be a C' flow on a planar region V as in Theorem 1. Let 1<. c 'D
be a positively invariant bounded region such that 'R does not contain any fixed points.
Then 'R contains a periodic orbit.
The proof of this corollary follows immediately from the Poincare-Bendixson Thecr
rem. This corollary can be applied to specific equations as the following example shows.
Example 9.1. Consider the equations

x=y
iI = -x + y(l - x 2 - 2y 2).

Let r be the polar coordinate. Then along a solution,


d (r2) . .
Cit 2' = xx +yy
= xy - xy + y2(l
- x 2 - 2y2)
2
= y2(l _ x _ 2y 2).

Then f ~ 0 on r = 2- 1/ 2 and f:5 0 on r = 1. Thus the annulus {(x,y) : 1/2:5 r2 :5 I}


is positively invariant. It also contains no fixed points. Therefore this annulus contains
a periodic orbit.
Now we begin the proof of the Poincare-Bendixson Theorem.
PROOF OF THE POINCARE-BENDlXSON THEOREM. Let Lo be a connected open trans-
versal to the flow, i.e., Lo is the image of an open interval. Let L be a connected open
(sub-)transversal with cl(L) c Lo. Let
L'= (. there is a tx > 0 such that <p1_(X) E L}.
180 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

We assume that we ace dealing with an L for which L' is nonempty. For x E L', let
T(X) > 0 be the smallest time such that ."I(X) E L. Then g(x) = ."T(X)(X) is the
Poincare map and depends differentiably on x E L', 9 : L' -+ L. Below we often say we
have a transversal L and do not mention Lo or L'.
Lemma 9.3. If gk(x) is defined for k = 0, ... , n then gk(x) is a monotone function of
k in terms of the ordering on Lo.
PROOF. Consider the Jordan curve r formed by the trajectory {."I(X) : 0 ~ t ~ T(X)}
and the line segment on L' connecting x and g(x). Then r is the boundary of a subset
Dc D. The trajectories can only leave or enter D across the line segment on L'. Points
are either all leaving across this segment or all entering. If they are all entering, then
gk(x) E int(D) for k ~ 1. Thus the entire sequence is on the same side of x as g(x).
The same argument applies to all the iterates. This shows the sequence is monotone. 0
Lemma 9.4. Lt.·t L /),. 11 transv('f.'iIlI/~~ Il/wve. Tlwn w(p) nL is I1t most /JIm point.
PROOF. Assume Yo E w(p) n L. Then ."I(p) gets near Yo infinitely often. If it gets near
enough to L (in a closed neighborhood of Yo) then it has to cross L. (This follows by
methods like the proof of the existence of the Poincare map using the Implicit Function
Theorem.) Thus the trajectory crosses L infinitely often. The sequence of points where
it crosses L is monotone by Lemma 9.3. A monotone sequence can accumulate on at
most one point, so L n w(p) = {Yo}. This proves the lemma. 0
Lemma 9.5. Assume O(p) is bounded, w(p) contaillS no fixed points, and q E w(p).
Then q is on a periodic orbit, and w(p) = w(q) = O(q).
PROOF. Since q E w(p), w(q) C w(p). Take Z E w(q). Then Z can not be a fixed point,
so we can take a transversal L at z. There is a sequence of times tn such that ."I n (q)
accumulates on z. These times can be chosen so that ."I n (q) E Lj they also must be in
w(p) by its invariance. Since w(p) n L is at most one point, .,,1 .. (q) = .,,1 1 (q) for all n.
Thus .,,1.-II(q) = q and q is a periodic point. But then w(q) = O(q) = {."I(q) : 0 ~
t ~ T}, where T = T(q) is the (minimal) period of q.
It remains to prove that w(p) = O(q). Take a transversal L at q. There is a sequence
of times Sn, going to infinity, such that Pn = ."On (p) ELand these points converge to q.
Because q is a periodic point. the times can be chosen so that 8 n +1 - Sn = T(Pn) ::::: T,
where T = T(q) is the period of q. These points are monotone on L, so they must
accumulate on exactly one point from one side. Because the return times are bounded,
it follows that the orbit O+(p) can only accumulate on O(q) and w(p) = O(p). This
completes the proof of the lemma. 0
These lemmas complete the proof of the Poincare-Bendixson Theorem. o
The following generalization of the Poincare-Bendixson Theorem allows w(p) to con-
tain a fixed point. See Hale (1969), page 56 for a proof.
Theorem 9.6. Assume that O+(p) is contained in a closed bounded subset K of the
planar domain D of the differential equation :it = !(x). Assume further that! has only
finitely many fixed points in K. Then one of the following is satisfied:
(a) w(p) is a periodic orbit,
(b) w(p) is a single fixed point of !,
(c) w(p) cOllSists of 11 finite number of fixed points, ql,"" qm, together with 11 finite
set of orbits 11, ... ,1n, such that for each j, o(-yj) is a single fixed point and
w(-Yj) is a single fixed point: w(p) = {ql,' .. , qm} U11 U· .. U1n, o(-yj) = qi(j,o),
and w(-yj) = qi(j ...,)'
5.10 STABLE MANIFOLD THEOREM FOR A FIXED POINT 181

5.10 Stable Manifold Theorem for a Fixed Point of a


Map
In this section we consider a C" differentiable map on a Banach space, f : U c E ...... E.
We allow f to be non-invertible and not onto. Assume p Is a fixed point, and let
A = D fp be the derivative at p. The spectrum of A Is the set of complex numbers for
which A - >'1 is not an isomorphism,

spec(A) = {>. E C : A - >.I is not an isomorphism}.

In finite dimensions the spectrum is the same as the set of eigenvalues. A fixed point
P for f is called hyperbolic if spec(Dfp) n {o : 101 = l} = 0. By standard results in
Spectral Theory, there are invariant subspaces EU and E" corresponding to the part of
t.he spectrum ollt.~ide lind inRide thl' unit circle, and constants 0 < I' < 1 and>' > 1 such
that IE = IE" EllIE', spec(DfpIIE U) C {o: 101> >'}, and spec(DfpllE') C {o: 101 < ,,}.
In fact, we identify IE with lEu EllE", so we write a point x E E as (XU, x") where x" E EO'
for (f = U, s. In finite dimensions, these correspond to the subspaces spanned by the
generalized eigenvectors for the eigenvalues of absolute value greater than one and less
than one respectively. Because the spectrum of EU Is bounded away from 0 (and by the
construction of lEU), DfplEu is an isomorphism on EU. FUrther, there Is C > 0 such
that IIDf;IIE'1I < c"n and IIDf;nllEull < C>.-n for n > O. By the usual change in
the norm on E we can take C = 1. Such a norm Is called an adapted nonn or adapted
metric. The subspaces IE" and lEU are called the stable and unstable subspaces for the
fixed point p respectively.
Given a hyperbolic fixed point p for a C" map f, and given a neighborhood U' C U
of p, the local stable manifold for p in the neighborhood U' is defined to be the following
set:

W·(p,U',1) = {q E U' : fj(q) E U' for j > 0 and


d(fj(q),p) ...... 0 as j ...... co}.

To define the unstable manifold, we need to look at the past history. Because f is not
necessarily invertible we need a replacement for the backward iterates. We define a
past history of a point q to be a sequence of points {q_j }~o such that qo = q and
f(q-j-d = q-j for j ~ O. The local unstable manifold for p in U' Is defined to be the
following set:

WU(p, U', I) = {q E U' : there exists some choice of the past history of q
{q-j}~o C U' 8uch that d(q_j,p) ...... 0 as j -+ co}.

Sometimes we write WI~(P, I) and W~(p, I) to indicate local stable and unstable
manifolds for a suitably small but not specified neighborhood U'. We also write W: (p, f)
for W·(p, B(p, e), I), and similarly W.U(p, I) for WU(p, B(p, e), I).
The following theorem states that these local stable and unstable manifolds are C"
embedded manifolds which can be represented as the graph of a map from a disk In
one of the subspaces to the other subspace. The Hartman-Grobman Theorem already
proves that the stable and unstable manifolds are topological disks but it does not
prove they are differentiable. In fact the hard part of the proof of the Stable Manifold
Theorem is to show that these manifolds are Lipschitz. Once this Is known, it can be
shown they are differentiable. Again the Hartman-Grobman Theorem does not prove
they are Lipschitz. On the other hand, the Hartman-Grobman Theorem proves that
182 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

all the orbits near the fixed point behave like the linear map, while the Stable Manifold
Theorem only gives information about points on the stable and unstable manifolds.
To represent a closed disk in one of the subspaces, we use the following notation:
for any Banach space £ and 6 > 0, the closed disk in £ about the origin of radius 6 is
represented by £(6) = {x E £ : Ixl ~ Ii}.
Theorem 10.1 (Stable Manifold Theorem). Let p be a llyperbolic fixed point
for a C k map I : U c JE --+ JE with k ~ 1. We assume that the derivatives are
uniformly continuous in terms of the point at which the derivative is taken. Then there
is some neighborhood of p, U' c U, such that W' (p, U', f) and WU (p, U', f) are each
C k embedded disks which are tangent to JE' and JEu respectively. In fact, considering
JE = JEu x JE', there is a small r > 0 such that taking U' == p + (JEU(r) x JE'(r»,
W'(p, U', f) is the graph of Il C k function q' : JE'(r) --+ JEU(r) with q'(O) = 0 and
Dq~ = 0:
W'(p, U', f) = {p + (q'(y), y) : y E JE"(r)}.
Similarly, there is a C k function aU : JE"(r) --+ JE'(r) with qU(O) = 0 and Dq~ = 0 such
that
WU(p, U', f) = {p + (x, qU(x» : X E JEU(r)}.
Morco\'er, for r > 0 small enough and U' = p + (JEU(r) x JE'(r»,
w' (p, U', f) = {q E U' : Ii (q) E U' for j ~ o}
= {q E U' : li(q) E U' for j ~ 0 and
d(P(q),p) ~ jlid(q,p) for all j ~ OJ.
This means that every point that is not on W'(p, U', f) leaves U' under forward itera-
tion, and that points on W'(p, U', f) converge to p at an exponential rate given by the
bound on the stable spectrum. Similarly,

lVU(p, U', f) = {q E U' : there exists some choice of the past history of q
with {q-i}~O c U'}
= {q E U' : there exists some choice of the past history of q
with {q-i}~O c U' and
d(q_},p) ~ A-id(q,p) for all j ~ OJ.

FIGURE 10.1. Stable and Unstable Manifolds in a Neighborhood of the


Fixed Point
5.10 STABLE MANIFOLD THEOREM FOR A FIXED POINT 183

Once we have local stable and unstable manifolds, then the (global) unstable maniJold
is obtained by
WU(p,f) = UJjWU(p,U',f).
j~O

If f is invertible, then the (global) stable manifold is obtained by

W"(p,f) = U riW"(p,U',f).
i~O

We end this section with a discussion of the types of proofs of the Stable Manifold
Theorem. There are two basic types, the Graph Transform Method of Hadamard (1901)
and the variation of parameters method of Perron (1929). For historical notes see
Hartman (1964), page 271.
Graph Transform Method of Hadamard
In the Graph Transform Method, the approach is to take a trial function u : EU(r) -+
E"(r) which might possibly give the unstable manifold. A new function r(0') : EU(r) -+
E" (r) is defined so that

graph[f(O'») = f(graph(O'» n (EU(r) x E"(r)).

The set of all such possible trial functions is defined by

E(r,L) = {a: EU(r) -+ E"(r) : 0' is continuous, 0'(0) = O,Lip(O') :5 L}.


It Is shown that f : E(r, L) -+ E(r, L) is a contraction in the CO-sup topology. It thus
has a fixed point function, f(O'U) = au, which gives the local unstable manifold as the
graph of a Lipschitz function. Because the map f is not a contraction on the function
space if the function space is given the C 1 or C k topology, the fixed point needs to be
proven to be C 1 (and then Ck) as a second step by different means. Notice that in this
method only one iterate of f is used to define f. Fur this approach see Hirsch and Pugh
(1970), Hirsch, Pugh, and Shub (l <i77), and Shub (1987). Also see Section 11.1.
In the proof presented in the next section, we use a modifcation of the Hadamard
proof. Let

BN = n fj(EU(r)
N-l

j=O
x E·(r».

Then B N+ 1 = f(BN) n Bo. It is prove<l that

is the graph of a Lipschitz function that Is in fact C I . It is then proved that it is C k by


induction on k.
Variation of Parameters Method of Perron
To explain the Perron method, it Is easier to consider an ordinary differential equa-
tion. AssulIle that :Ii: = J(x) = Ax + g(x) where A Is the linear part at 0, g(O) = 0,
and Dga = O. Again take a splitting so x = (x..,x.)lr. Applying the variation of
184 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

parameters to the equation itt) = Ax(t) + g(x(t», thinking of the second term as a
known nonhomogeneous term, we get

We then break this equation into the stable and unstable components. The stable
component is specified with an initial condition at t = 0, x.(O) = a., so taking t2 = t
and t\ = 0 for the stable component we get

x.(t) = eA.1a, + 10 1eA.(I-')g.(x(s»ds.

On the other hand, we can not specify the unstable component at t = 0, but it is
determined by the fact that the unstable component goes to zero (or is at least bounded)
as t goes to infinity. Therefore, taking t2 = t and t\ = 00 for the unstable component
and xu(oo) = 0, we get

xu(t) = 0 + J~ eA.(I-·)gu(x(s»ds.

Combining the right hand sides of these two equations, we can get a transformation
of trial solutions on the stable manifold. To get a transformation of the whole stable
manifold we look at trial solutions x( t, a,) where a, E E' (r) parameterizes the stable
component of the initial conditions, and let F be the function space of such trial sets
of solutions on the stable manifold. A transform T : F -+ F is defined by

[T(x)J(t, a.) = eA.1a, + 101eA.(I-·)g,(x(s, a,)) ds

+ f~ eA.(I-·)gu(x(s, a.» ds,

which takes one parameterized set of potential solutions on the stable manifold into
another set. This transformation can be shown to be a contraction which has a fixed
point which gives the stable manifold. Notice that one iterate of T uses the whole trial
orbit of x(t, a.) which is pulled back with the linear flow. For this approach see Hale
(1969), Chow and Hale (1982), or Kelley (1967).
The proof in Irwin's book, Irwin (1980), is a mixture of these methods. For N the
natural numbers, let

L:(E(r» = {q E cl(N, E(r» : q(j) -+ 0 as j -+ oo}.


A function is then defined,
F: E'(r) x C(E(r» -+ C(E(r»,
by the formulas
F(x.. q)(O) = (X.,q(O) + A~l[qu(l) - fu(q(O»]) E E'(r) x EU(r),
F(x.,q)(n) = (f.(q(n - 1»,q(n) + A~l[qu(n + 1) - fu(q(n»)]) for n > O.

It is then shown that FisC" and a contraction for each fixed x., so there exists a
C" function 9 : E'(r) -+ C(E(r» with F(x., g(x.» = g(x,). Note that for this fixed
function, f(g(x,)(n» = g(x.)(n + 1). Then h(x.) == g(x,)(O) is C" and its image is the
stable manifold. For details see Irwin (1980).
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM 185

Differentiability of the Stable Manifold


The differentiability of the stable manifold is very important for our applications.
There are several methods of proving this property. The main difficulty is that the
various transformations are not contractions of the set of possible Cl manifolds so we
can not directly look at a contraction map on this space. Hirsch and Pugh (1970) used
the Fiber Contraction Principle. See Exercise 11.4. We use the method of cones inspired
by McGehee (1973). Also compare with the argument in Moser (1973) to show that
there is a subshift (horseshoe) as a subsystem of a dynamical system. Both of these
references are based on work of Conley. Following these references, we use the cones to
not only prove that the stable manifold is Lipschitz but also to show that it is C 1 •
Cones are closely related to the idea of hyperbolicity. If L : R" ..... R" is a hyperbolic
linear map, and CU is a cone of vectors roughly in the expanding direction, then L(CO.) c
CU, i.e., the cone of vectors is invariant by L. (See following subsection for the definition
of a cone of vectors. Some authors use the name of a sector for what we call a cone.) If
the exact expanding direction is not known, then it is often easier to find an invariant
cone, Cu. Using such a cone, it is possible to show that L is hyperbolic.
The use of cones goes back at least to Alekseev (1968a). Sinai (1968,1970) also used
the concept very early. As mentioned above they are used in McGehee (1973) and Moser
(1973). Recent uses of cones include Wojtkowski (1985), Burns and Gerber (1989), and
Katok and Burns (1994).

5.10.1 Proof of the Stable Manifold Theorem


This subsection contains the proof of Theorem 10.1, the Stable Manifold Theorem.
The proof presented here follows the method developed by Conley as given in McGehee
(1973), but with some modifications which were influenced by Hirsch and Pugh (1970).
Thus it is in the spirit of the Hadamard proof (more than the Perron proof) but has some
differences. There is also a lot of similarity to the argument of Conley given in Moser
(1973) to show there is a subshift (horseshoe) as a subsystem of a dynamical system. The
arguments in both McGehee (1973) and Moser (1973) are presented in the case of two
dimensions where both the stable and unstable directions are one dimensional. Some
modifications need to be made to take care of the higher dimensional cases. Because
the map is not assumed to be invertible, we need to indicate how we show both the
stable and unstable manifolds exist.
OUTLINE OF THE PROOF. To start the proof, we want to show that W: is the graph
of a function 'P' : E'(r) ..... EU(r) which is Lipschitz with Lipschitz constant less than a
(a-Lipschitz) for some fixed a. To verify that W: is a graph, we use the characterization
of W: as points whose forward orbit remains in 8(r) = E'(r) x EU(r):

W: = {x : jj(x) E 8(r) for all j ~ O}

n ri(8(r».
00

=
;=0
See Figure 10.2. (Later in Proposition 10.7, we verify that these points tend to the
fixed point.) In order for W: to be a graph, we need for W:n [{x} x EU(r») to be a
single point for each x E E·(r). The disk {x} x EU(r) is a trivial example of a Cl graph
of a function from EU(r) into E'(r) with slope less than a-I. We define an unstable
disk to be the graph of a Cl function 'P : E"(r) -+ E'(r) with I/D'Pyl/ S a-I for all
y E £"(r). Similarly, an a-Lipschitz ,table disk is defined to be the graph of a function
'" : £'(r) ..... E"(r) with Lip("') S a. Using the bounds on the terms In the derivative
186 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

B(r)

u
{X} X E (r)

FIGURE 10.2. Intersection of the Boxes rn(B(r»

of f, Lemma 10.5 shows that if D(j is an unstable disk, then Dr = f(Dg) n B(r) is an
unstable disk (with slope less than a-I) and
diam(Do n rl(D't)) ::; (A - m-I)-I diam(Do) = (A - m- 1 )-12r.

By induction, D'J = f(D'J-I) n B(r) is an unstable disk, and


n
diam(n rj(D'J}) ::; (A - m- 1 )-n2r.
j=O

From this it follows that the infinite intersection is a single point which we call (x, ip"(x»:

w: = [{x} x EU(r)] n n rj(B(r»


00

[{x} x E"(r)] n
j=O
= {(x, ip·(x»}.

This shows that W: is a graph.


To show that it is a-Lipschitz, we consider the stable and unstable cones which are
defined by
C'(a) = {(v" v u ) E E' x EO. : IVul ::; alv,l} and
C"(a) = {(V., v u ) E E' x EO. : Iv,,1 ~ alv.I}.

See Figure 10.3. The condition that


W: n [{p} + C"(a») = {p}
for all points pEW: is equivalent to the graph being a-Lipschitz. Proposition 10.6
shows that this intersection is indeed a single point by the same argument which shows
that W: is a graph.

cs

FIGURE 10.3. Stable and Unstable Cones


5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM 187

After we have shown that W; is the graph of a Lipschitz function, Proposition 10.8
shows that it is C 1 and Theorem 10.10 proves that it is C" by induction on k.
To carry out the above argument, we need some estimates about the effect of the
map / on the stable and unstable components. Lemma 10.3 shows that the distance
between the unstable components of two points p and q is increased by the action of
the nonlinear map / provided the displacement from p to q lies in the unstable cone (is
"mainly an unstable displacement"). Before making these nonlinear estimates, we give
similar linear estimates in Lemma 10.2: the derivate D/p preserves the unstable cones
and stretches the length of vectors in the unstable cone. We prove this linear case first
because it is easier and is used in PropOSition 10.8 to prove that W; is Cl.
REMARK 10.1. There are some differences in the proof given here from those in the
references cited. In particular, if dim(E") i' I, then
n
8[n ri(8(r»] \ [8E'(r) x EU(r)]
)=0

need not be made up of two graphs over E'(r), as is true in Moser (1973) where
dim(EU) = 1. The fact that I need not be invertible necessitates some care in the
proof for the unstable manifold: 8[nj=0Ii(8(r))] n [E'(r) x {yo}] need not be in the
image by I of 8[nJn;~J1(8(r»]. 1(8[nj;JJ1(8(r»)]). In general, greater care needs to
be taken in sever of the arguments when I is not assumed to be invertible.

To start carrying out the above outline, we first introduce some notation. It is
convenient in expressing some of the ideas to use the minimum norm of a linear map
which is first introduced in the Section 4.1. We also use the estimates from the Inverse
Function Theorem given in Section 5.2.2. (Those estimates should be reviewed at this
time.) Remember that the minimum norm of a linear map A : lE - t lE is defined by

. IAvl
mf -Iv-I .
meA) = v~o

The minimum norm is a measure of the minimum expansion of A just as IIAII is a


measure of the maximum expansion. If A is invertible then meA) = IIA -III-I. For
a hyperbolic fixed point 0 for which IID/anllEull < CA- n for n > 0, it follows that
m(D/O'llEU) > CAn.
By taking local coordinates we can assume that the fixed point is 0 in the Banach
space lE and E = E' x lEu, where these subspaces come from the hyperbolic splitting at
this fixed point. (This cross product is isomorphic to the original Banach space.) Let
'II'u : lE - t lEU be the projection along E' onto EU, and 7r, : E - t E' be the projection
along EU onto E'. Then 1(0) = 0 and we use the splitting E' x EU to write

D' _(A..0 Auu0) '


10 -

where Au = 7r,Dlol(E' x {OJ) and Au .. = 'II'uD/ol({O} x E"). Because the fixed point
is hyperbolic, there exist 0 < J.I < 1 < A and nonns on each subspace E' and lEu so that
m(A .... ) > A > 1 and IIA .. II < J.I < 1. We fix these norms, and on lE we put the norm
which is the maximum of the components in the E' and E" subspaces. In any Banach
space E we denote the closed ball of radius r about 0 by lEer) = {x E E : Ixl ::s r}. We
define a neighborhood of the fixed point in E by taking the cross product of the closed
188 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

balls in JEu and E' of radius r (which is also the closed ball in E because we are using
the maximum of the norm in the stable and unstable subspaces):

B(r) = E'(r) x JEU(r) C E.

Then I : B(r) -+ E, with 1(0) = o.


We fix 0 > 0 which serves as a bound on the slopes of graphs over IE' into lEu. For
all but spE'Cial cases, we can take 0 = 1. We take t > 0 small enough so that

II+m +t < 1 and


A - m- I - 2t > 1.

(These give bounds on the effects of the off diagonal terms to the expansion and con-
traction.) Given these 0 and t, we can find r > 0 small enough 80 that for q E B(r),

with

< II,
IIA.. (q)1I
m(Auu(q» > A,
< t,
IIA.u(q)1I
< t,
IIAu.(q)1I and
IIDlq - Dlall < t.

We say that I satisfies hyperbolic estimates in a neighborhood in which estimates of


the above type are true. This name is justified because these estimates imply that the
map contracts displacements which are primarily in the stable direction and expands
displacements which are primarily in the unstable direction as is shown in Lemmas 10.3
and 10.4 below.
Having fixed the notation and choice of of t, we start by proving the linear estimates
as discussed above.
Lemma 10.2. For p E B(r), DlpCU(o) C CU(o).
PROOF. Let v = (v •• v u) E CU(o) \ {O}. and p E B(r). Then

11r.Dlpvl _ IA.u(p)vu + A •• (p)v.1


l1ruDlpvl - IAuu(p)vu + Au.(p)v.1
< IIA.u(p)II·lvul + IIA•• (p)1I ·lv.1
- m(Auu(p»lvul-IIAu.(p)II·lv.1
IIA.. (p)II(lv.l/lvu) + IIA.u(p)1I
m(Auu(p» -IIAu.(p)II(lv.l/lvui)
110 - 1 +t -I( II + fO) -I
:'S A_m-l:'So A-m- I <0 .

But we chose f and 0 so that II + m < 1 and A - m- I > 1. o


5.lD.I PROOF OF THE STABLE MANIFOLD THEOREM 189

Lemma 10.3. Let p,q E B(r) and q E {p} + C"(a). Then the following are true:
(a) 111".0 !(q) - 11". 0 !(p)1 :5 a- I (I£ + m)11I",,(q - p)l,
(b) 111"" 0 !(q) - 11"" 0 !(p)1 ~ (A - m- I )11I",,(q - p)l, and
(c) !(q) E {f(p)} + C"(a).

PROOF. (a) The idea is to calculate how 1I".o! varies along the line segment from p to
q, t/J(t) = p + t(q - p) for 0 :5 t :5 1. Using the Mean Value Theorem,

111".0 !(q) - 11". 0 !(p)1 = 11I".o! 0 t/J(l) - 11". o! 0 t/J(O)I


d
:5 sup I-d 1I".o! 0 t/J(t)1
O:51SI t
:5 sup IIA.,,(t/J(t»1I·11I",,(q - p)1
°SIS!
+ sup IIA •• (t/J(t»1I·11I".(q - p)l
oS'S!
:5 { sup IIA.,,(t/J(t»1I
°SIS!
+ a-I sup IIA •• (t/J(t»1I}11I",,(q - p)1
0S':51
:5 (£ + a- II£)11I",,(q - p)l
= a- I (I£ + ta)11I",,(q - p)l.
This proves part (a).
(b) To get an estimate of minimum expansion we would like to take the inverse of
the function. To get a function from a space to Itself, we split the displacement from p
to q into the component in the lE" direction and then the lE' direction, and apply the
Inverse Function Theorem (Theorem 2.4) to the part that concerns the displacement in
the E" direction.
Let p = (P.,p,,) and q = (q.,q,,) for p and q as in the statement, i.e., p" = 1I""p
and q" = 1I""q for C1 = u,s. Let z = (P.,q,,), so q - p = (q - z) + (z - p), with
z - p E {OJ x E" and q - z E E' x {OJ.
For y E E"(r) and x E E'(r) define h(y) = 11"" 0 !(P.,y) and g(x) = 11"" 0 !(x,O).
Then g(P.) = 11"" 0 !(p" 0) = h(O). Also Dgx = A".(x, 0), so IIDgxll :5 t. By the Mean
Value Theorem, tr ~ tlp.1 ~ Ig(p.)1 = Ih(O)I. Then, Dhy = A""(p.,p,, + y), which
is an isomorphism with minimum norm greater than A, and IIDhy - A",,(O,O)II < t.
Applying Theorem 2.4, hIE"(r) is onto lE"«A - t)r - Ih(O)l) ::,) E"«A - 2t)r) ::,) E"(r),
and h has a unique inverse on this set. Also by the Inverse Function Theorem, the
derivative of h- I is given by D(h-I)h(y) = (Dhy)-I which has norm less than A-I.
Therefore by the Mean Value Theorem,

Ih- 1 (W2) - h-l(wl)1 :5 A- 1 1w2 - wd·


Letting W2 = h(q,,) = 11"" 0 !(z) and WI = h(p,,) = 11"" 0 !(p), so h- 1 (W2) = q" and
h- l (wll = p", we get

AI1I",,(q - p)l :5 111"" 0 !(z) - 11"" 0 !(p)l.


Thrning to the displacement from z to q,

111"" 0 !(q) - 11"" 0 !(z)1 :5 sup IIA".1I·11I".(q - p)1


:5 tI1l".(q - p)l
:5 m- 1 111",,(q - p)l·
190 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Combining,

l7Tu 0 f(q) - 7Tu 0 f(p)1 ~ l7Tu 0 f(z) - 7Tu 0 f(p)1


-17Tu 0 f(q) - 7Tu 0 f(z)1
~ (oX - m-I )17Tu (q - p)l.

This completes the proof of part (b).


(c) For p and q in the statement of part (c), to show that f(q) E {/(p)} + C"(o) we
need to look at the ratio of stable and unstable components. Using part (a) to estimate
the numerator and part (b) to estimate the denominator, we get

11T.[/(q) - l(p)JI < 0- 1 (I-' + m)


l1Tu[/(q) - l(p)JI - (oX - m-I)
< 0- 1 .

This last inequality uses the fact that I-' + m < 1 and oX - m-I > 1. From the above
inequality we get that f(q) E {/(p)} + CU(o). 0
From this last lemma, we can determine the effect of I-Ion displacements within
the stable cone.
Lemma 10.4. Assume that f(q-d = q, I(p-d = p, q E {p} + C'(o), and all the
points P.q,P_I,q_1 E B(r). Then
(a) q_1 E {p-d + C'(o),
(b) 11T.q-1 -1T.p-d ~ (" + m)-II1T.q - 7T.pl·

PROOF. (a) We use the fact that C'(o) and C"(o) are complementary cones, i.e.,

C'(o) = E \ int(CU(o».
By Lemma 1O.3(c), q ft {p} + CU(o) implies q-l ft {p-d + CU(a). Thus q-l E
{p-d + C·(o).
(b) Following the method of the proof of Lemma 10.3(a), let 1{.I(t) = P-l + t(q-l -
p-d. Then

11T.q -1T.pl = 11T. 0/01{.l(1) -1T. 0/01{.l(0)1


d
:5 sup I-
d 1T. 0 I 0 1{.I(t) 1
O~t~l t
:5 sup IIA .. (1{.I(t))ll·I7T.q-1 - 1T.P_t!
O$l~1

+ sup IIA.u(1{.I(t»1I ·17Tuq-l - 7Tup-d


0$19
:5 { sup IIA .. (1{.I(t»1I
O$l~l

+ 0 sup IIA.u(1{.I(t»II} ·17T.q-l - 7T.P_ll


0~t9

:5 (I-' + Of) ·11T.q-l - 7T.p-d·


Dividing both sides of the inequality by the constant, we get the result of part (b). 0
The following lemma makes precise the induction process with unstable disks which
is discussed in a descriptive fashion earlier in this section.
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM 191

Lemma 10.5. Let D~ be a C 1 unstable disk over E"(r).


(a) Then D~ = I(D8) n B(r) is an unstable disk over E"(r) and

diam[7I".. U- I (Df) n Dg)] :s (~- fa- I )-12r.


(b) Inductively let D~ = I(D~_I)nB(r). Then D~ is an unstable disk for n 2: 1 and

diam[7I"" n rj(Dj)] :S
n
(~ - fa- I )-n2r.
j=O

See Figure 10.4.

,
~

FIGURE 10.4. Disks D~ and l-n(D~)

PROOF. (a) Since DC; is an unstable disk, it is the graph of a CI function 'Po: E"(r) -+
E'(r) with IID('Po»'1I :S 0- 1. Let 0'0: E"(r) -+ B(r) be the associated function defined
by uo(y) = ('Po(y), y). Define h = 71".. 0 I 0 0'0. We need to determine the size of the set
covered by h and on which it has a unique inverse. We obtain this by applying Theorem
2.4 for which we need to estimate IIDh)' - A.... (O)II:

IIDh)' - A".. (0) II = IID(7I".. 0 I 0 0'0»' - A" .. (O)II


= IIA".. (uo(y» + A ... (uo(y»D('Po»' - A.... (O)II
:S IIA.... (uo(y» - A..,,(O) II + II A... (uo(y))ll IID('Po»' II
:S (f + fa-I).

By Theorem 2.4, h is a C I local diffeomorphism and has an inverse defined on

To get the estimate on Ih(O)1 and so the image under h, notice that h(O) = 71".. 0
1('Po(O),O) with l'Po(O)1 :S r, and 71".. 0 1(0,0) = O. By the Mean Value Theorem
applied to 71" .. 0 I and the displacement from (0,0) to ('Po(O), 0), we get that Ih(O)1 :S
8up(IIA... IDI'P0(0) - 01 :S fro Using this estimate, we get that h has an inverse defined
on
h(E"(r» :::> 1E"«~ - fa-l)r -lh(O)l) :::> E"«~ - fa-I - f)r) :::> 1E"(r).
Since h = 71"" 0 I 00'0 is a CI diffeomorphism which covers 1E"(r), h- I (IE"(r» C E"(r).
Therefore we can define what is called the graph transform of (10,

(11 = 1 0 (10 0 h-lllE"(r).


192 v. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Notice that 1I"u 00"1 = 1I"u 0/00"00 h- I = h 0 h- I = idIJEU(r). Thus 0"1 is a graph over
EU(r).
We also need an estimate on IID(h-l)yll. For Vu E EU,

ID(1I"u 0/ 0 O"o)yVul = IAuu(O"o(Y»vu + Au,(O"o(y»D(CPo)yvul


~ IAuu(O"o(y»vul -IAu,(O"o(y»D(cpo)yvul
~ m(Auu(O"o(y)))lvul -IIAu.(O"o(y»IIIID(cpo)yIlIVul
~ (~ - m-I)lvul.

Therefore m(Dhy) ~ ~ - m-I, IID(h-l)yll ~ (~- m-I)-I, and

Ih- I (Y2) - h-l(y')l ~ (~- m- I )-IIY2 - yd·

IID(cp')yll ~ II A", (0"0 0 h-l(y»D(h-l)yli


+ IIA,,(O"o 0 h-l(y»D(cpO)h-1(y)D(h-l)yll
~ f(~ - m-I)-I + I-'a-I(~ - m-I)-I

~ a -I ( II + fa )
~ - fa-I

Note that for this argument we need that II + m < 1 and ~ - f - m- I > 1. Letting
Dr = O"I(JEU(r», we have verified the conclusions of part (a).
Part (b) follows from part (a) by induction on n. 0
REMARK 10.2. Lemma 10.5 is the heart of the Hadamard proof of the existence of a
Lipschitz unstable manifold. We could define the function space

'H = 'H(a, r) = {O" : JEU(r) -- B(r) : 1I"u 00" = id, Lip(1I", 0 0") ~ a-I}.

By an argument like that in Lemma 10.5, we can show that for any 0" in 'H, l(image(O"»
is again the graph of a Lipschitz function in the function space which we denote by
1*(0"). This map 1* on the function space 'H is called the graph transform. Then it
is possible to show that 1* is a contraction on 'H with the CO-sup topology, and so
has a unique fixed section O"u, for which 11", oO"u is a-I-Lipschitz. The graph of 11". OO"U
or the image of O"U is the unstable manifold. For this type of approach see Hirsch and
Pugh (1970), Hirsch, Pugh, and Shub (1977), and Shub (1987). Instead of following
that method, we use the above lemma on C l functions (which are unstable disks) to
prove the existence of a Lipschitz stable manifold.
The following lemma now gives the induction step and the proof that W;' is the graph
of an a-Lipschitz function. As discussed above, we use Lemma 10.5 to prove that for
each x E JE'(r), W:
n ({x} x EU(r» is exactly one point, and so is a graph. Finally,W;
Lemma 10.3 shows that W;' is a-Lipschitz.
Proposition 10.6. The l()('al stable manifold, W: = n~o j-j(B(r», is a graph of an
a-Lipschitz function cP' : E'(r) -- EU(r) with cp'(O) = O.
PROOF. Let Sn = n;=o rj(B(r» so W; = n::,,=o Sn: the forward orbit of a point in
the infinite intersection is contained in B(r) so it lies on W;.
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM 193

The first step is to use Lemma 10.5 to see that W; n [{x} x E" (r)] is at moet one
point for each x E IE." (r). Fix x E IE' (r) and let D~(x) = {x} x E" (r). By Lemma 10.5
and induction on n,
D~(x) = I(D~_1 (x» n B(r)

is an unstable disk, and

[{x} x F'(r)] n S" n


= " r i (Dj(x» C Dg(x)
i=O

is a nested sequence of closed and complete sets whose diameters are bounded above
by (~ - (Q-I )-n2r. Since the diameters of [{x} x IE" (r)] n Sn go to zero u n goee to
infinity, the intersection [{x} x JE"(r)] n W; is a single point for each x and W; is a
graph over IE.". See Figure 10.5. We show below that the WI-limit set of a point in the
intersection defining W; is the fixed point.

B(r)

FIGURE 10.5. Sets Sn Converging to W;


To show that the graph is Lipschitz, let PEW;. Assume that

q E W: n [{p} + C"(a)]

Because both p, q E W;, /i(p),/j(q) E B(r) for all j ~ O. By induction, applying


Lemma 10.3 we get that for all j ~ 0,

Ii (q) E {Ii (p)} + C"(a)

and
IlI"u(fi(q) - fi(p»1 ~ (~- £a-lYllI"u(q - p)l.
If q:F p, then since (~- (Q-l)i goee to infinity either fi(q) or Ji(p) must leave B(r).
This contradicts the fact that both p and q are in W;. Thus q =
p, and there is at
moet one point in W; n ({p} + C"( a». Applying this to x, y E JE" (r) with x :F y, since
o-u(x)= (x, 'P'(x» :F (y, 'P'(Y» =
o-U(y), it follows that UU(y) ;. {o-U(x)} + C"(a),
o-U(y) - {o-U(x)};' C"(a), and utl(y) - {UU(x)} E C'(a). Thu8 Lip(lI"u ou') Sa. This
completes the proof of Proposition 10.6. 0
Before we prove that W; is Cl, we check its characterization in terms of the rate of
convergence to O.
194 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Proposition 10.7. If pEW:, then IP(p}1 $ 0(1-' + fCt}ilpl for j ~ 0, so li(p}


converges to the fixed point 0 at an exponential rate.
PROOF. Because W:
is o-Lipschitz, l7ru (Ji(p} - 011 $ ol7r.[fJ(p} - 011 so li(p} E
to} + C·(o}. Applying Lemma 10.3 to P(p} and li(O} = 0, we get that
(Jl + fCt}IP-l(p}1 ~ (I-' + eo}I7r. oli-1(p}1
~ (11 + eo)I7r.[/ j - 1(p} - /]-1 (O}II
~ 17r.IP (p) - P (0)11
~ 17r, 0 P (p)l·
By induction. we get that

(I-' + eoJilpl ~ 17r, 0 P(p)l·

Then the unstable component is bounded by 0 times this amount:

Because we are using the maximum norm of the components,

This completes the proof of the proposition. o


Proposition 10.S. The local stable manifold, W:, is C l and is tangent to]E' at the
fixed point O.
PROOF. We have shown that W: is the graph of a Lipschitz function 'P' : ]E'(r) --+ ]EU(r}
with Lip('P') $ 0. We let (T' : E'(r) --+ B(r} be defined by (T'(x) = (x,'P'(x)). We
proceed to show that (T' and 'P' are CI.
For any pEW:. since Lip('P') $ 0, (T'(JE'(r)) C {p} + C'(o). To get the derivative
of 'P' . we use the comparison of the nonlinear action of 1 on C' (0) with the linear action
of D/q on C·(o).

FIGURE 10.6. Nested Cones

Define
c,·n(p) = (D/;)-IC'(O).
By Lemma 10.2. Df/J(p)CU(o} C CU(o) so

[DI/J(p)]-IC'(O) C C'(o}.
5.10.1 PROOF OF THE STABLE MANIFOLD THEOREM 195

By induction,
c··n(p) C C··n-1(p) C ... c C,·l(p) c C'(O).
See Figure 10.6. It follows from the argument of Proposition 10.6 applied to this se-
quence of linear maps, that C··n(p) converges to a plane Pp (which can depend on
p) which is the graph of a linear map Lp : E' -+ EU. (This means that the maximal
angle in the cone C··n(p) converges to zero as n goes to infinity.) The goal is to prove
that "P' is differentiable at Xo = 1T.P with D"P~o = Lp and Pp = TpW:. (Because
Po = IE' x {O}, this shows that W: is tangent to E' at 0.) We show that given n,
there is an T/ = T/(n) > 0 such that for Iy - Xol < T/, q'(Y) - q'(xo) E C··n(p). Define
truncated cones by

c,·n(p,T/) = C··n(p) n 1T; 1(IE'(T/)), and


C'(o,T/) = C·(O)n1T;l(E·(T/».

We use the differentiability of J" to get the image by f- n of some truncated cone
{In(p)} + C'(o, 71') inside the truncated cone {p} + C··n-1(P. 7/).
Lemma 10.9. Let pEW: and n > O. Then there exists an TJ = TJ(n) > 0 such that
the following hold.
(a) IfI1T.(q - p)1 ~ TJ and J"(q) E {/n(p)} + C'(o), then q E {p} + c··n-1(p, 17).
(b) If q E W: and 11T.(q - p)1 ~ 17, then q E {p} + C··n-1(p, 17).
PROOF. (a) We prove the contrapositive: we prove that there is an 17 > 0 such that if
11I'.(q - p)1 ~ 17 but q f/. {p} + C··n-1(p, TJ) then J"(q) f/. {In(p)} + C'(o).
We want to use the differentiability of J" to conclude the inclusion of the truncated
cones. Because the kernel of DI; doc<> not intersect the complement of c,·n-l(p,TJ),
C··n-1(p,TJ)C, there is a constant C > 0 such that for v E C·· n- 1(p,17)C, ID.t;vl ~
C-1Ivl. On the other hand, by the definition of the C""(p), Df;c··n-l(p)c equals
Dfr-'(p)C'(o)C, which in turn is contained in the interior of C·(o)C. In fact by
Lemma 10.2,
D"n_.(p)(C·(o)C) n {v: Ivl = I}
is uniformly contained in the interior of C·(o)c. Therefore there is {3 > 0 such that
if v E c,·n-l(p)C, v' = DI;v, and Iwl < {3lv'l, then Vi + wE C·(o)c. With these
constants we can take f' > 0 such that Cf' < {3. By the differentiability of In, there is
TJ > 0 such that if Ivl ~ 17, then J"(p + v) = J"(p) + Df;v + E with lEI ~ f'lvl ~
f'Clv'l < {3lv'l where v' = Df;v. Now let 11I'.(q - p)1 ~ TJ but q f/. {p} + C·· n- 1 (P,71).
Using the above inequalities with v = q - p, we get that J"(q) E {r(p)} + C'(o)C, or
r(q) f/. {In(p)} + C·(o). This proves part (a).
For the proof of part (b), we know that if q = q'(Y) then J"(q) E {In(p)} + C·(o).
By part (b), given n there is an TJ such that if 11I'.(q - p)1 ~ TJ and r(q) E {/n(p)} +
C'(o), then q E {p} + c··n-l(p). Combining these two facts proves part (b). 0
Returning to the proof of Proposition 10.8, we have shown by Lemma 10.9 that for
> 0 small enough and Ix - "01 < TJ,
TJ = TJ(n)

qU(x) _ qU(Xo) E C··n-1(p).

Given f' > 0, for n large enough, the maximum angle between vectors in c··n-l(p} is
less than t', so for Ix - "01 < 1)(n) it follows that

1"P'(x) - "P'(Xo) - L,,(x - xo)1 < f'lx - XoI·


196 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

Thus 'P' is differentiable at Xo and D'P'(Xo) = Lp.


Because the definition of C··n+l(p) only involves the derivative of a continuously
differentiable function, In+ I, this cone is a continuous function of p. More precisely, for
q near p, C··n+l(q) C C··n(p). Therefore the Image of both Lq and Lp are contained
inside C··n(p). By taking n large and so q near enough to p, we can insure that the
image of Lq is as near as we desire to the image of p. Therefore the derivative of 'P' is
continuous and 'P' is C I .
Because E' is invariant by Dlo, it follows that E' c c,·n(o) for all n and so Po = E'.
Therefore we have proved that W:is tangent to E' at the fixed point O. 0
Theorem 10.10. 1£ I is C k [or k ~ 1, then W: is C li .
PROOF. If k = 1, then this is a restatement of Proposition 10.8. Assume that I is C li
for k ~ 2, with 1(0) = O. Define a new function F(p,v) = (f(p),Dlpv). This function
F is defined for (P. v) with P E B(r) and v is a (tangent) vector at p, v E TpE ~ E.
Then F is C k - I , F(O,O) = (0,0), and

so
DF(o.o)(p, v) = (~~~e) = (Dto D~o) (e) .
Thus (0,0) is a hyperbolic fixed point for F. By induction, F has a C k - I stable
manifold, W:«O, 0), F). This manifold is characterized as points (p, v) which have
Pj E B(r), and IVjl ~ r for all j ~ 0, where (Pj, Vj) :; Fj(p, v). Thus p E W:(O, f).
In fact, IVjl = ID/~vl goes to zero exponentially as j goes to infinity. But vectors in
TpW:(O, f) have this property. By the uniqueness of W:«O, 0), F) and counting the
dimensions of TpW:(O, f) and W:«O, 0), F) n ({p} x TpE) ,

Thus, TW:(O.f) is C k - I and W:(O,f) is ek • o


Theorem 10.11. I[ I is Ck [or k ? 1, then the local unstable manifold W;'(O, f) is
Ck .
PROOF. If I is invertible, then the existence of the unstable manifold for I follows from
that of the stable manifold of I-I. We will now indicate the changes needed to take
care of the case when I is not invertible.
We can use Lemmas 10.3 and 10.4 as given. All we need is a replacement for Lemma
10.5.
Lemma 10.12. Let D' = D~ be an o-Lipschitz stable disk over E'(r).
(a) Then Dr
= I-I(D') n B(r) is an o-Lipschitz stable disk over E'(r) and

diam[/(D:>] ~ (I-' + m)2r.

(b) Inductively de1ine D~ = I-I(D~_I) n B(r). Then D~ is an o-Lipschitz stable


disk [or n ~ 1 and diam[f"(DJ)] ~ (I-' + m)n2r.
PROOF. The first step is to show that Dr
= I-I(D') contains exactly one point in
each fiber {x} x EU(r). But p E I-I(D') n [{x} x EU(r)] if and only if I(p) E D' n
I[{x} x EU(r)] and p E {x} x EU(r). By Lemma 10.5, the second set is the graph of
5.10.2 CENTER MANIFOLD 197

a Cl function '1/1 : 1E"(r) -+ E'(r) whose derivative has norm slightly less than 0-1.
There is a unique point of intersection of these two sets. This fact can be _n by
considering the composition 71'" 0 (7(71','1/1) which is a contraction (has derivative with
norm less than one). Let z be this point of intersection: {z} = D' n f[{x} x E"(r»).
By the proof of Lemma 10.5, there is a unique P E [{x} x E"(r») with f(p) = z. Thus
P E r1(D')n[{x} x 1E"(r)] is unique. By Lemma 1O.4(a), the graph is a-Lipschitz and
diam[f(DDI ~ (I' + w)2r. This indicates the changes in the proof of Lemma 10.5. 0
Finally we want to check the characterization of the points in W,!'.
Proposition 10.13. If pEW,!', then for any past history p_ j E B(r) with Po = P
and f(p-j-d = P-j, it follows that Ip-jl ~ (>. - w)-jlpl· Also, the past history is
unique in W,!'.
PROOF. Take Po = pEW,!' and P-i E B(r) a past history. Thus, Po E {OJ + C"(a).
Applying induction and using Lemma 10.3 with q = 0,

so
17I'"p-jl ~ (>. - w)-iI7l'"pol·
Thus I71'" P-jI goes to zero exponentially fast as stated. The stable component, 17I',p-jl ~
aI7l'"p-jl also goes to zero exponentially fast.
To show the uniqueness, assume that P_j and q_j are both past histories for P which
remain in B(r) for all j ~ O. Then Po = qo, so qo E {Po} + C'(a). By induction and
using Lemma 10.4, q_j E {p_j} + C'(a) for j ~ O. Applying the estimate of Lemma
10.4, the only way that both P-j-k and q-i-k can remain in B(r) for all k ~ 0 is for
P_j = q-j' This proves uniqueness in W,!'. 0

5.10.2 Center Manifold


In this section, we give the modifications of the Stable Manifold Theorem for the case
that allows eigenvalues (spectrum) on the unit circle. We only discuss the case of finite
dimensions, although the results in Banach spaces are true with some modifications of
the assumptions. First of all, the stable and unstable manifolds exist in this situation
as stated in Theorem 10.14. There also exists a manifold which is tangent to the center
eigenspace as stated in Theorem 10.15. Theorem 10.15 also gives the existence of so
called center-stable and center-unstable manifolds.
Theorem 10.14 (Stable Manifold Theorem). Let f : U c R" -+ R" be a C k
map for 1 ~ k ~ 00 with f(p) = p. Then R" splits into the eigenspaces of Dfp ,
R" = lE" EEl lEe EEl E', which correspond to the eigenvalues of Dfp greater than one,
equal to one, and less than one. There exists a neighborhood of p, V C U, such
that W'(p, V, f) and W"(p, V,!) are C le manifolds tangent to E' and E", respectively,
and are characterized by the exponential rate of convergence of orbits to P as foUows.
Assume that 0 < I' < 1 < >. and norms on E" and IE' are chosen such that liD fplE'1i < I'
and m(DfpIE") > >.. Then

W'(p, V, f) = {q E V : d(fi(q), p) ~ I'id(q,p) for all j ~ O}

and

W"(p, V,!) = {q E V: d(q-i'P) ~ >.-id(q,p) for all j ~0


where {q-i }~o is some choice of a past history of q}
198 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

REMARK 10.3. We do not give a proof of this theorem, although methods similar to
those for the proof of the earlier Stable Manifold Theorem work with some modifica-
tion. Most proofs of the Stable Manifold Theorem probably apply, but this theorem is
certainly proved in Kelley (1967) and Hirsch, Pugh, and Shub (1977).
There is also a center-stable manilold, WC'(p, V, f), which is tangent to lEC ESlE', but
it is not necessarily unique and is difficult to characterize locally. Also, if I is C k for
1 ~ k < 00, then there is a neighborhood on which the center-stable manifold is C k . See
Section 11.3. However, the neighborhood can depend on k, and so if f is Coo there are
examples where there is no neighborhood on which the manifold is Coo. See van Strien
(1979), Carr (1981), and Exercise 5.44. Finally, the manifold WC'(p, V, f) is not strictly
invariant, although all points which stay in V for all future iterates are contained in
we,(p, V, f).
In the same way, there is a center-unstable manilold, weu(p, V, f), which is charac-
terized by choices of backward iterates.
It is easier to characterize these manifolds by forming an extension, I, of I to all
of Rn which is close to /(p) + D/p(q - p) on all of an. Once the extension is fixed,
WCU(p,f) and WC'(p,f) are unique. By a translation we can assume that p = O. Let
A = Dlo. Given E > 0 there is an r > 0 and an extension I: IR n .... IR n that is C k
and such that IIB(O, r) = /IB(O, r), III - Allc. < E, and I = A off B(O, 2r). With this
notation, the manifolds are characterized in the following theorem.
Theorem 10.15 (Center Manifold Theorem). Let I: U C IRn .... lR n be a C k map
for 1 ~ k ~ 00 with /(0) = 0 and A = D 10. Let k' be chosen to be (i) k if k < 00 and
(ii) some integer with 1 ~ k' < 00 if k = 00. Assume that 0 < P. < 1 < ,\ and norms
on lEu and lE' are chosen such that IID/pllE'1i < p. and m(D/pllEU ) > A. Let E > 0
be small enough so that IID/pllE'li < p. - E and m(D/pllEU ) > A + E. Let r > 0 and
I : Rn .... Rn be a C k map with IIB(O, r) = IIB(O, r), III - Allc' < E, and I = A off
B(O,2r). 1£ r > 0 is small enough (it can depend on k'), then there exists an invariant
C k ' center-stable manifold, we,(o, I), which is a graph over lEc ESlE', which is tangent
to lEe ESlE' at 0, and which is characterized as follows:

WC'(o,f) = {q : d(Ji(q), 0)'\ -i .... 0 as j .... oo}.

This means that Ii (q) grows more slowly than :>.,i. Similarly, there exists an invariant
C k ' center-unstable manifold, WCU(O, I), which is a graph over lEuESlEc, which is tangent
to lEu ES lEc at 0, and which is characterized as follows:

WCU(o,f) = {q : d(q_i,O)p.i .... 0 as j .... 00 where {q-i}~o


is some choice of a past history of q}.

This means that q-i grows more slowly than p.-i as j .... 00, or -j .... -00. Then the
center manifold of the extension is defined as

It is C k ' and tangent to Ee, There are also local center-stable, local center-unstable,
and local center manifolds of / defined as

WCO(O, B(O, r), f) = We,(O, I) n B(O, r),


Weu(O, B(O, r), f) = WCU(o,f) n B(O, r), and
WC(O, B(O, r), f) = WC(O, I) n B(O, r),
5.10.3 STABLE MANIFOLD THEOREM FOR FLOWS 199

respectively. These local manifolds depend on the extension, but if Ii (q) E B(O, r) for
all -00 < j < 00 then q E WC(O, I) for any extension 1 and q E We(O, B(O, r), f).
REMARK 10.4. The proof that a CI center-stable and center-unstable manifolds exist
is essentially the same as in the last subsection. In Section 11.3, we return to discuss
why it is cr for 2 ~ r < 00. For a more thorough discussion of the Center Manifold
Theorem, see Carr (1981). There are proofs in Kelley (1967), Hirsch, Pugh, and Shub
(1977), and Chow and Hale (1982).

Example 10.1 (Nonunique Center Manifold). The following example illustrates


the fact that the center manifold is not unique. It is attributed to Anosov in Kelley
(1967), page 149. Consider the differential equations

y= -yo

It is easy to see that the origin is a non-hyperbolic fixed point with eigenvalues -1 and
O. The stable manifold of the origin is clearly the y-axis. To determine the various
center manifolds, note that :: = :~ and y = Cell>: is a solution for any choice of C.
Thus for any choice of C, the graph of the following function gives a center manifold:

u(x,C) = { 0 for x ~ 0
Cel/>: for x < O.

Note that as x --+ 0 with x < 0 that u(x, C) --+ O. Thus u(x, C) is continuous at x = O.
With more calculations, it can be shown that u(x, C) is Coo at x = O. For any C, the
graph of u(x, C) is tangent to the x-axis at x = 0 and is invariant. Therefore for any
choice of C, this graph is a center manifold. Thus, far from being unique, there is a one
parameter family of center manifolds.

5.10.3 Stable Manifold Theorem for Flows


The statement of the Stable Manifold Theorem for a fixed point of a flow is similar
to that for a diffeomorphism. We do not comment further on it except to mention that
the sum of the dimensions of the stable and unstable manifolds of a single fixed point
equals the total dimension of the ambient space. This is similar to a diffeomorphism
but different than the stable and unstable manifolds of a periodic orbit for a flow.
For a hyperbolic periodic orbit 1, it is certainly possible to get the stable manifold
of a Poincare map P: E --+ E, W:(p, P), and let

W:b, ei) = U ri(W:(p, Pl·


t~O

However, this representation does not tell us much about the geometry of the local
stable manifold of the periodic orbit.
To explain this geometry, we need to look at the contracting and expanding splitting
along the whole periodic orbit. Let cpt be a Row on a manifold M for a vector field X
and with a periodic orbit 1 of period T. For any q E 1, there is a splitting

TqM = IE~ Ell E~ Ell (X(q»,


200 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

where (X(q)) is the span of the vector X(q), and E~ = Dcp~E~ for 17 = s,u if q =
cpt(p). Just as we defined the normal bundle for the periodic orbit when discussing the
Hartman-Grobman Theorem, we can define the stable bundle and unstable bundle of the
periodic orbit by
E; = U{(q,y) : q E 'Y and y E E~},
and
E~ = U{(q,y) : q E 'Y and y E E:}.
Further. for 17 = S. u and f > 0 let

If the derivative of the flow restricted to the stable bundle. Dcp~IE~. is orientation
preserving (has positive determinant). then E~(f) is isomorphic to the cross product of
'Y and E~(f). If Dcp~IE~ is orientation reversing (has negative determinant). then E~(f)
is not isomorphic to the cross product of'Y and E~(f) but is a twisted product. By
going around the orbit twice an oriented basis of E~ can be brought back to itself (while
remaining a basis of the appropriate E~ the whole way) but not after only once around
'Y. If E~ is one dimensional and the stable bundle is twisted. then E~(f) is isomorphic
to a Mobius strip. In higher dimensions. it is a corresponding twisted product if the
bundle is not oriented.
The local stable manifold of 'Y. W.'(-y. cpt). can be represented as a graph over E~(f)
for some small f > O. Thus if E~ is one dimensional. the local stable manifold is either
a graph over an untwisted strip (an annulus) or a Mobius strip. Similar statements
can be made about the local unstable manifold. This geometric difference of types of
local stable and unstable manifolds for a periodic orbit does not arise for fixed points
of flows or for diffeomorphisms. The differentiation between E~ being twisted or not is
related to Floquet theory for the time periodic linear system of differential equations
(first variation equation)
d
'dtv = DX<p'(p)V
along the periodic orbit. See Hartman (1964) or Hale (1969).
Notice that the sum of the dimensions of E~ and E~ is equal to the dimension of M
plus one.
dim(E~) + dim(E~) = dim(M) + 1,

since both bundles contain the direction along 'Y. This is another difference from the
case for stable and unstable manifolds of fixed points for flows or for diffeomorphisms.
This difference is important when we consider transverse stable and unstable manifolds
for flows: for a periodic orbit W'(-y. cpt) can be transverse to WU(-y. cpt) (and so generate
a horseshoe). but for a fixed point Po. W'(Po. cpt) can not be transverse to WU(Po. cpt)
at points away from Po. See the discussion of flows in Chapters VII and IX.

5.11 The Inclination Lemma


This section concerns a result which follows from the proof of the stable manifold
theorem. or at least from similar ideas. It is used in Chapter IX to develop the gen-
eral theory of invariant sets with a hyperbolic structure (some contracting and some
expanding directions). Let p be a hyperbolic fixed point for f. The result concerns the
iterates of a disk. of the same dimension as the WU(p), which crosses W'(p}. We first
need to define what we mean by a disk crossing W'(p} transversally.
5.11 THE INCLINATION LEMMA 201

Definition. Let DI and D2 be the differentiable images in Rn of two disks, D; =


9j (Rn; (r j»
for j = 1,2 where 9j : Rn; -- Rn is Cle and one to one. We say that these
two embedded disks are transverse at p provided either p ¢ Dl n D2 or

We say that DI and D2 are transverse provided they are transverse at all points.
In the definition, note that if DI and D2 are transverse and n1 + n2 < n, then
DI n D2 = 0. Thus two transverse lines in R3 do not intersect. Next, if nl + n2 = n,
and the disks are transverse, then they intersect in isolated points. (A complete proof
of this fact uses the Implicit Function Theorem.) So, a line and a plane which are
transverse in R3 intersect in a single point. If nl + n2 > n, then the transverse objects
intersect in a curve, surface, or higher dimensional object (of dimension nl + n2 - n).
So, two planes which are transverse in R3 intersect in a line. For a more complete
treatment of transversality, see Abraham and Robbin (1967), Guillemin and Pollack
(1974), or Hirsch (1976).
Now we can state the Inclination Lemma (or Lambda Theorem).
Theorem 11.1 (Inclination Lemma). Let p be a hyperbolic fixed point for a C le
diffeomorphism f. Let r > 0 be small enough so that in the neighborhood {p} +
(JE"(r)xJE'(r» ofp the hyperbolic estimates work which prove the Stable (and Unstable)
Manifold Theorem. Let D" be an embedded disk of the same dimension as IE" and such
that D" is transverse to W:(p,f). Let Dr
= f(D") n (JE"(r) x JE'(r» and D~+l =
f(D~) n (JE"(r) x JE·(r». Then D~ converges to W;'(p, f) in the Cle topology (pointwise
and with all its derivatives). See Figure 11.1. So given E > 0, there is no such that for
all n ~ no, D~ is with E of W;'(p, f) in terms of the Ck-topology. (The latter condition
means that if D~ is given as the graph of Un : W;'(p, f) -- JE", then Un and its first k
derivatives are smaller than E.)

FIGURE 11.1. The Disks D~ Converging to the Unstable Manifold

The proof of this theorem follows from the methods of the proof of the Stable Manifold
Theorem. See Palis and de Melo (1982) for the details. Palis and de Melo (1982)
also contains another (geometric) proof of the Hartman-Grobman Theorem using the
Inclination Lemma.
This type of result can also be formulated for an invariant set with a hyperbolic
structure which we define in Chapter VII.
202 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

5.12 Exercises
Differentiation
5.1. Assume I: U C Rk -+ Rm is CI and p E U. Show that

01 .
oxJ (p) = D/pe1

where e1 = (0, ... ,0, 1,0, ... ,0) is the standard unit vector with zeroes in all but the
j-th position.
5.2. Assume I : U C Rk -+ Rm and 9 : V C R m -+ an are differentiable at p
and q = I(p), respectively. Using the matrix product write out the (i,j)-th entry
of DgqD Ip. Discuss the relationship between the chain rule in terms of products of
matrices and combinations of partial derivatives.
Inverse Function Theorem
5.3. This exercise asks for a proof of the covering estimates of the Inverse Function
Theorem stated in Theorem 2.4. Assume that U C IRn is an open set containing 0, and
that I : U -+ IRn is a CI function with 1(0) = 0. Assume that L = Dlo is an invertible
linear map (so has a bounded linear inverse). Take any /3 with 0 < /3 < 1. Let r > 0 be
such that (i) 8(0, r) C U and (ii) ilL - Dlxll $ m(L)(1 - (3) for all x E 8(0, r). Define
<p(x, y) = y + x - L - I I(x).
(a) For y E 8(0, /3r) fixed, prove that the derivative with respect to x is given by
Dx(<p(-, Y))x = 1 - L - I Dlx for any x E 8(0, r), and that IIDx(<p(·, Y))xll $ 1 - /3.
(b) For y E 8(0, /3r), prove that

for all XI,X2 E 8(0,r).


(c) For y E 8(0,/3r), prove that <p(.,y): 8(0,r) -+ 8(0,r), and that it has a unique
fixed point.
(d) For y E L(8(0, /3r)), prove that there is a unique x E 8(0, r) such that I(x) = y.
Hint: A fixed point x of <p(.,y) corresponds to a point x such that I(x) = Ly.
(e) Prove that L(8(0, /3r)) ::> 8(0, m(L)/3r).
Contraction Mapping
5.4. Assume Y is a complete metric space with metric d, A is a metric space, and
0< I'i. < 1. Assume 9 : A x Y -+ Y is uniformly continuous, and for each x E A, g(x,·)
is Lipschitz with Lip(g(x, .)) $ I'i.. Prove that there is a continuous map 0' : A -+ Y
such that g(x, O'(x)) = O'(x) for every x E A. Hint: Apply Theorem 2.5 to show that
for each x E A, there is a O'(x) E Y such that g(x,O'(x)) = O'(x). Then for x E A near
Xo E A, apply the estimate in Theorem 2.5 on the fixed point O'(x) to see how near it is
to O'(xo).
Solutions of Differential Equations
5.5. Assume A(t) is an n x n matrix with real entries such that IIA(t)1I is bounded for
°
all t, Le., there is a C > such that IIA(t)1I $ C for all t. Prove that the solutions of
the linear system of equations x = A(t)x exist for all t.
5.6. Let Co, C II and C2 be positive constants. Assume v(t) is a continuous nonnegative
real valued function on R such that

v(t) $ Co + Cdtl + C21l v(s) dsl·


5.12 EXERCISES 203

Prove that

Hint: Use
U(t) = Co + Cit + C'J l V(s) ds.

5.7. Let CI and C 2 be positive constants. Assume I : Rn -+ Rn is a C I function such


that I/(x)1 ~ C I + C 2 1xl for all x ERn. Prove that the solutions of x = I(x) exist for
all t. Hint: Use the previous exercise.
5.8. Consider the differential equation depending on a parameter

x = l(x;lJ)
where x is in an, IJ is in R", and I is a CI function jointly in x and Jl. Let 'P1(x;lJ)
be the solution with 'P°(x; IJ) = x. Prove that 'PI (x; IJ) depends continuously on the
parameter IJ.
5.9. (Differentiably flow conjugate) Let I,g : an -+ Rn be two vector fields with flows
'PI and 1/J1 respectively.
(a) Assume h : Rn -+ Rn is differentiable flow conjugacy of the flows 'PI and 1/J1, i.e.,
h 0 'PI 0 h -I = 1/J1 for all t and h is a CI diffeomorphism. Prove that
Dh,,-I(Xj/ 0 h-I(x) = g(x).

In differential geometry, the vector field given by Dh,,-l (xj/ 0 h -I (x) is labeled by
(h./)(x).
(b) Assume I and 9 are differentiably flow equivalent, i.e., they satisfy the equation
1/J1(x) = h 0 'PQ(I,X) 0 h-I(x)

where h : Rn -+ Rn is diffeomorphism and a : R x Rn -+ R is differentiable with


d
dto(t,x) > O. Prove that

Dh,,-I(Xj/ 0 h-I(x) = ,B(x)g(x)


where,B: an -+ (0,00).
5.10. (Flow Box Coordinates) Consider a differential equation x = I(x) for I : Rn -+
R n a C" vector field for r ~ 1. Let a be a point for which I(a) oF O. Prove there exists
a cr change of coordinates x = g(y) valid near a in terms of which the differential
equations become
!JI = 1, for 2 ~ j ~ n.
The outline of the proof is as follows.
(a) Let v = I(a). One of the coordinates Vk oF O. Renumber the equations so that
k = 1. Take the hyperplane
s= {(al,Y2, ... ,Yn)}'
Define g(y) = 'Pill (at. Y2,"" Yn) where 'PI is the How of the differential equation,
Prove that
VI 0
( V2 1
Dga = :
Vn 0
204 v. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

(b) Using the inverse function theorem, prove that 9 defines a C r set of coordinates
near a, i.e., 9 is cr and has a cr inverse.
(c) Prove the differential equations in the y variables is as the form given above.
Limit Sets
5.11. Let X be a metric space, and Sj C X for 1 ~ j be a sequence of nested, compact,
and connected subsets: Sj ::> Sj+l. Prove that n~l Sj is connected.
5.12. Consider the flow e'Ax for the linear differential equation

x=Ax
with x ERn.
(a) Prove that all the eigenvalues of A have nonzero real part (i.e., the fixed point 0
is hyperbolic) if and only if for each x ERn, either w(x) = {O} or w(x) = 0.
(h) For n = 4, show there is a choice of A and a point x for which w(x) :) O(x) but
O(x) is neither a fixed point nor a periodic orbit.
5.13. Let x = X(x) be a (nonlinear) differential equation in an. Assume that a
trajectory 'P' (p) is bounded and each of the coordinates of the solution is monotonic for
t ~ to, i.e., ft7rj'P ' (p) does not change sign for t ~ to and 1 ~ j ~ n, where 7rj : Rn -+ R
is the projection of the j-th coordinate. Prove that w{p) is a single fixed point.
5.14. Let f : X -+ X and 9 : Y -+ Yare continuous maps on compact metric spaces.
Assume h : X ..... Y is a topological conjugacy. Prove that h takes the chain recurrent
set of f to the chain recurrent set of g, h(n(f» = n(g), i.e., prove that the chain
recurrent set is a topological invariant.
Fixed Points for Differential Equations
5.15. Find the fixed points and classify them for the system of equations

± = v,
v=
-x +wx3 ,
W= -w.

5.16. Find the fixed points and classify them for the Lorenz system of equations

x= -lOx + lOy,
y= 28x - Y - xz,
. 8
z="3 z + xy .

5.17. Consider the equation of a pendulum with friction,

X=y
iJ = - sin(x) - 6y
with 6 > O.
(a) Using a Liapunov function, prove that the w-limit set of any point qo = (xo, YO)
is a single fixed point.
(b) Let PI: = (b,O) be the fixed points. For a fixed point PI:, let

W"(Pk) = {q : w(q) = pI:}


5.12 EXERCISES

be the basin of attraction of Pk (or the stable manifold of Pk). Discuss how the the
basins of attraction for the fixed points are located in R2. In particular, explain
how W'(P2k+d separates W'(P2k) from W'(P2k+2).
(c) Discuss the difference of the motion of a point q = (0,110) which lies in W'(P2k)
from the motion if it lies in W'(P2k+2). In particular, how many rotations through
multiple of 211" in the x-variable does each forward orbit make?
5.18. Discuss the basins of attraction of the fixed points for the following system of
differential equations with fJ > 0:

:i:=11
iJ = x - x 3 - fJ y.

5.19. Consider the equations

:i: = x(A - x - ay),


iJ = y(B - bx - y)

which model two competing species. Assume A, B, a, b > 0, A > aB, and B > bA.
(a) Find the fixed points. Hint: Consider the isoclines where :i: = 0 and iJ = O.
(b) For any point q in the interior of the first quadrant, prove, using Exercise 5.13,
that the w( q) is a fixed point with both coordinates positive. Hint: Consider the
various regions where :i: and iJ have fixed sign. (The isoclines are the boundaries
of these regions.)
5.20. This exercise asks for an alternate proof of Theorem 5.1 (on nonlinear sinks for
ordinary differential equations) using the Jordan Canonical Form approach.
Assume 0 is a fixed point for the equatlons:ic = f(x) with x ERn. Also assume that
there is a constant a > 0 such that all the eigenvalues ~ for A = D fa have negative real
part with Re(~) < -a < o.
(a) Let ( , )B be the inner product and II liB be the norm determined by an arbitrary
basis B. Show that
~II (t)11 = (x(t),/(X(t)))B
dt x B IIx(t)IIB·
(b) Let " > 0 be small enough so that Re(~) < -a - " for all the eigenvalues of A.
Let B be the basis used in the alternative proof of the linear sink theorem, so

{x, Ax)B < (-a - £)lIxll~.

Prove that if U is a small enough neighborhood of 0, then

(X,f(X»B < -allxll~·

(c) Prove that for a small enough neighborhood U of 0 and :xo E U, the solution
<p'(:xo) satisfies 1I<p'(:xo)IIB ~ e-olll:xoIiB.
(d) Prove that 0 is a nonlinear sink.
Remark: This exercise proves that IIxliB or IIxll~ is a Liapunov function in a neighbor-
hood ofO.
5.21. Let f : Rn -+ R be a C2 function. Let X(x) = 'ilf be the gradient vector field
for f.
(a) If P is a fixed point for X, prove that the eigenvalues are real.
206 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

(b) Prove that p is a hyperbolic fixed point of X if and only if DIp = 0 and D2/p("')
is a nondegenerate bilinear form.
5.22. Assume:it = Ax and y = By are both hyperbolic linear flows on an and are
differentiably conjugate, i.e., there is a C I diffeomorphism h : an -> an and a C I
function T : R x Rn -> R such that
h(eA1'(t,x)x) = eBth(x).
Prove t.hat the eigenvalues of A are proportional to the eigenvalues of B.
Periodic Points for Maps
5.23. Let h : R -> R be given by h(x) = x 3 . Find a Coo diffeomorphism I : R -> a
such that h - I 0 I 0 h is not differentiable at 1.
5.24. Consider the map given in polar coordinates by l(r,B) = (1· 2,B - 0.5 sin(B)).
This can be considered as a map on the two sphere by adding a fixed point at infinity.
(a) Find the fixed points and classify their stability. (Include the fixed point at infin-
ity).
(h) Find the basins of attractions of all the fixed point sinks including the fixed point
at infinity.
5.25. Let FAB = (A - By - x 2 ,x) be the Henon map. Find the fixed points and
classify them for different values of the parameters A and B.
5.26. Prove Theorem 6.1 in the case of a fixed point. (This is the theorem that says
that a map with a fixed point, all of whose eigenvalues are less than one in absolute
value, is a contraction.)
5.27. Let I and gk be diffeomorphisms of a given by

I(x) = x + ~ sin(x) and


1
gk(X) = x + 2hk(x), where
x3 x5 X 2k + 1
hk(X) = X - 3! + 5! - ... + (-I)\2k + I)! for k ~ 1.
For k ~ I, prove that I and 9 are not topologically conjugate.
5.28. Suppose h is a conjugacy between I : an -> an and 9 : an -> an.
(a) Show that p is a periodic point of I if and only if h(p) is a periodic point of g.
(b) Show that if Ii (p) converges to q as j goes to infinity, then gi (h(p)) converges to
h(q) as j goes to infinity.
(c) Show that for any p E Rn we have h(w(p, f)) = w(h(p), g) where w(p, f) are the
w-Iimit sets of p.
5.29. Assume I : Rn -> Rn is a CI diffeomorphism. Assume p is a hyperbolic periodic
point for I. Given any positive integer n, prove there is a neighborhood U of p such
any periodic point of I in U \ {p} has period greater than n.
Hartman-Grohman Theorem
5.30. This exercise gives another proof of Theorem 7.1. Let A E L(an,an ) be an
invertible hyperbolic linear map. Assume that I is a C I diffeomorphism of an with
A 0 I-I - id E CI:(Rn,Rn). To solve the equation A 0 h = hoI for h = id + k with
k E cg (an, Rn ), it is sufficient to solve
AO(id+k)or l =id+k,
Aor l +Aokorl =id+k,
A 0 f' id = k - A 0 k 0 f -I, or
C(k) = A 0 r l - id,
5.12 EXERCISES 207

where
C(k) = k - A 0 k 0 rl.
(a) Prove that C is an invertible bounded linear operator on Gg(R n , Rn). This includes
the fact that C preserves the space Gg(R n , Rn).
(b) Prove that there is a unique solution ko to the equation C(k) = A 0 1-1 - id.
(c) Let h = ko + id where ko is the solution obtained in part (b). Prove that h is (i)
a homeomorphism, and (ii) a conjugacy between A and I.
5.31. Assume cpl and t/JI are two flows on Rn for which 0 is a hyperbolic fixed point
sink. Show that there is a conjugacy h in a neighborhood of 0 from cpl and t/Jl (the time
one maps) that is not a conjugacy of cpl and t/J I, and in fact does not take trajectories
of cpl to trajectories of t/JI.
Periodic Orbits for Flows
5.32. Assume 'Y is a periodic orbit for the flow cpl and (3 is a periodic orbit for the flow
t/JI (both flows in n dimensional spaces). Let PIP be the Poincare map for the flow cpt for
a transversal Ep at p E 'Y and P", be the Poincare map for the flow t/JI for a transversal
Eq at q E (3. Assume that PIP in a neighborhood of pin Ep is topologically conjugate
to P", in a neighborhood of q in Eq . Prove that the flow cpt in a neighborhood of'Y is
topologically equivalent to t/J t in a neighborhood of (3.
5.33. Assume 'Y is an attracting periodic orbit for the flow cpt. Prove that cpt in a
neighborhood of'Y is topologically conjugate to the linear bundle flow defined in Section
5.8 for Theorem 8.7.
5.34. Assume that two diffeomorphisms ! and 9 are topologically conjugate. Prove
that their suspensions are topologically conjugate.
5.35. Let A(t) be a time dependent n by n curve of matrices. Let M(t) be a fun-
damental matrix solution for the system of differential equations :ic = A(t)x. Prove
Liouville's Formula:
[',
det(M(tJl) = det(M(to)) 110 tr(A(t)) dt.

5.36. Consider the Volterra-Lotka equations

x = x(A - By - ax) = xM(x, y),


iJ = y(Cx - D - by) = yN(x,y)

with all A,B,C,D,a,b > O. These equations model the populations of two species
which are predator y and prey x and an increase in either population adversely affects
its own growth rate (there is a crowding factor in both equations),
(a) Find conditions on the constants so there is a unique fixed point p' = (XO, yO)
with x' > 0 and y' > O. Hint: Look at the isoclines where x = 0 and iI = O.
(b) Letting V (x, y) be the vector field for these equations, verify that

(div V)(x, y) = xix + illY + xM., + yNII


where M., and Nil are the partial derivatives with respect to x and y of the re-
spective functions.
Let E = {(x, y') : x ~ x'} and P: E ~ E be the Poincare map. The solution of the rest
of this exercise proves that for any q in the interior of the first quadrant, w(q) = {pO}.
208 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

(c) Verify that


P'(x) = N(x,y·) . P(x) . e"(")
N(P(x), y.) x
Cor x > x· where I'(x) < O.
(d) Define !(x) = N(x, y·)/x and F the antiderivative oC!. Prove that F 0 P(x) <
F(x) Cor x > x·, and that pn(x) converges to x· as n goes to infinity.
(e) Conclude, Cor q in the interior ofthe first quadrant that w(q) = {p.}.
5.37. Assume X is a C I vector field on T2 with no fixed points. Prove that there is
a closed curve r on T2 which is a transversal to the flow. Further, prove that r can
not be contracted to a point in T2. Hint: Consider the vector field Y on T2 which is
everywhere perpendicular to X, e.g. iC X and Y are the lifts oC X and Y to R2 then it

is possible to take Y(x, y) = (X 2 (x, y), -XI (x, where X(x, y) = (XI (x, y), X 2 (x, y».
Take and point q E T2 and consider P E w(q, Y). The trajectory oC q Cor Y repeatedly
comes near p. By considering a pair oC points on the trajectory, show that it can be
modified to make a transversal Cor the flow oC X.
5.38. Consider a vector field X on y2 with lift X to R2 oC the form X(x, y) =
y».
(1, X 2 (x, Prove that X has a periodic orbit if and only if the Poincare map has a
rational rotation number.
Poincare-Bendixson Theorem
5.39. Let:it = V(x) be a differential equation in R2 with only a finite number of fixed
points. Assume Po is a point whose forward orbit, O+(po), is bounded.
(a) Assume PI E w(Po) and P2 E w(Ptl with V(P2) ~ O. Apply Lemma 9.4 and the
argument of Lemma 9.5 to prove that PI is on a periodic orbit.
(b) If PI E w(Po) is not a periodic orbit, prove that o(Ptl and w(Ptl are each single
fixed points.
(c) If w(Po) is not a periodic orbit, prove that w(Po) contains a finite set of fixed
points and a finite set of other orbits O(q;) where O(qi) and W(qi) are each single
fixed points for each qi'
5.40. Let A = 51 x la, bJ be an annulus with covering space A = R x la, bJ. (The 'angle
variable' is not taken modulo 1 in the covering space.) Let X be a vector field on A
with lift X = (X I ,X2 ) to A. Assume that XI(x,a) < 0 and XI(x,b) > 0 for all x, and
div(X) '= 0, i.e., X is area preserving. Prove that the flow of X has a fixed point in
A. Hint: Assume X has no fixed points and let Y be a nonzero vector field which is
perpendicular to X everywhere in A. Prove that Y has a periodic orbit "Y in A using the
Poincare-Bendixson Theorem. Get a contradiction to the area preserving assumption
on X by considering X along "Y.
5.41. Consider the differential equations
. 4xy
x=a-x---
1 +x2
Ii = bX(l - -y-)
1+x
2

for a,b > O.


(a) Show that x· = a/5 and y. = 1 + (x·)2 is the only fixed point.
(b) Show that the fixed point is repelling for b < 3a/5 - 25/a and a > O. Hint: Show
that det(DF(.,_.\I_» > 0 and tr(DF(",- .\1-» > O.
(c) Let XI be the value of X where the isocline {x = O} crosses the x-axis. Let
YI = 1 + (xtl2. Prove that the rectangle {(x, y) : 0 :5 x :5 XI, 0 :5 Y :5 yd is
positively invariant.
1i.l2 EXERCISES

(d) Prove that there is a periodic orbit in the first quadrant for a > 0 and 0 < b <
30/5 - 25/a.
Fiber Contractions (Stable Manifold Theory)
5.42. Let fln(r) C R" be the closed ball in R" of radius r > 0 about 0, and
fI"(r') C R" be the closed ball in R" of radius r' > 0 about O. Assume F : fln(r) x
fI"(r') -+ R" x fI"(r') is a C l map of the form F(x,y) = (J(x),g(x,y» such that
(i) /(Int(fln(r») ::> l1"(r), and (Ii) /lSn(r) Is a diffeomorphism from Sn(r) onto its
image. Let D 2g(x,y) : R" -+ R" be the derivative with respect to the variables in R".
Let K. x = SUp{II D2g(y,x)II : y E R"}. Assume that (iii) sup{K.x : x E fln(r)} < 1.
(a) Prove that for each x E fln(r) there is a unique O'(x) E fI"(r') such that (x,O'(x»
has a backward orbit by F in : fI"(r) x fI"(r'). Hint: Prove that the intersection
n::,=o F"( {/-"(x) x fI"(r')) is a single point.
(b) Prove that 0' : fI"(r) -+ fI"(r') is an invariant section, i.e.,

F(x,O'·(x» = (J(x),O'· o/(x»


for all x E /-I(fI"(r».
(c) Prove that map 0' : fln(r) -+ fI"(r') is continuous.
5.43. Using the notation of the previous exercise, assume F : fln(r) x R" -+ R" X R"
is a CI map of the form F(x,y) = (J(x),g(x,y» such that (i) /(int(fln(r))):> fln(r),
and (ii) /Ifln(r) is a diffeomorphism from fln(r) onto its image, (iii) sup{K.x : x E
fln(r)} < I, and (iv) sup{K. x Ax : x E fln(r)} < I, where Ax = II(D/x)-III and
K. x = SUp{IID29(y,x)1I : y E Rio}. Let 0'. : fln(r) -+ R" be the unique continuous
invariant section found in the previous exercise, Prove that 0'. is CI. Hint: Con-
struct a family of horizontal cones C(x,y) that are taken inside themselves by DF(x,y),
DF(x,C7(X»C(X,C7(X» C C F (X,C7(X»'
Center Manifold
5.44. (A polynomial vector field without a Coo center manifold. This example is taken
from Carr (1981) and is a modification of an example of van Strien (1979).) Consider
the equations

x=xy+x3
y=o
i=z - x2.

(a) Show that the center manifold of (x, y, z) = 0 can be written as a graph z = h(Z,II)
for Ixl ~ 6 and Iyl ~ 6 for small 6 > o.
(b) Show that the fixed points {(O, y, 0) : Iyl ~ 6} all lie on we(o).
(c) Assume that z = h(z, y) is C2n for Ixl ~ 6 and Iyl ~ 6. Take the Taylor expansion
of h in x about x = 0 (with coefficients which are functions of y):

2n
h(x, y) =L a; (y).:d + o(lxI2n).
j==l

Find a relationship between the coefficients by equating i =z- x2 = h(x, y) - x2


and i = ::x + ~y. In particular, show that (i) al(y) = 0, (iI) a2(y) :f: 0, and
(iii) (1 - jy)a;(y) = (j - 2)a;_2(Y) for j > 2.
210 V. ANALYSIS NEAR FIXED POINTS AND PERIODIC ORBITS

(d) Show that the point (O,l/(2n),O) can not lie in the domain where h is C 2n . Hint:
Show that a2i(1/(2n)) ". 0 for 1 ~ i < n, and obtain a contradiction for the
relationship involving the coefficients a2n(1/(2n)) and a2n-2(1/(2n)). Remark:
What makes this example work is the resonance between the eigenvalues at the
fixed point (O,l/(2n), 0). The resonance forces the weak unstable manifold (for
the eigenvalue 1/(2n) to be C 2n - 1 but not c2n. In turn, this manifold Is contained
in the center manifold of 0 so it can not be C 2n either.
(e) Show that there is no neighborhood of 0 on which the center manifold WC(O) is
Coo.
5.45. Assume f : Rn x R -+ Rn is a C r function for r ~ 1. Write I>. (x) for f(x, A).
The map F : Rn+l -+ Rn+l be defined by F(x, >.) = (f(x, A), >.) is also cr. Assume
that Xo is a hyperbolic fixed point for fo. Let x.\ be the corresponding hyperbolic fixed
point for A near O. Using the cr Center Manifold Theorem for F, prove that the stable
manifold of X A for fA, W"(x A, 1>.), depends C r jointly on position and parameter A, i.e.,
prove that W"(x.\, fA) for IAI < E and E > 0 small can be represented as the graph of a
cr function
q: E~o.o x (-E,E) -+ E~.o.

Inclination Lemma
5.46. Let A = (00
5 ~), and consider the linear map Ax. Let S = [-r, r] x [-r, r]
be the square. Let D be the line segment be the part of the line through the point
(1, 0) with slope s that lies within S. Assume sand r are chosen so that D intersects
the top and the bottom of S and not the sides. Prove by a direct calculation that the
n-th iterate of the line segment, An (D) n S, converges to the part of the y-axis given
by {O} x [-r, rIo Prove the convergence both in terms of the points and the slope.
CHAPTER VI
Bifurcation of Periodic Points

Throughout this chapter we consider a map with one parameter. These results also
apply to differential equations near a periodic orbit by considering the Poincare map.
We write fl'(x) = I(x,/J) where /J E R and x E RH. We proved in Theorem V.6.4 that
if f"o(Xo) = Xo is a fixed point and 1 is not an eigenvalue of D(fI'O)xo' then the fixed
point can be continued for values of the parameter /J near /Jo. This is a non-bifurcation
result. Notice that the tool we used to show that the fixed point could be continued in
this case is the Implicit Function Theorem. We repeatedly use this theorem to study
the bifurcations considered in this chapter.
The first type of bifurcation, called the saddle-node bifurcation, occurs where the
above assumption on the eigenvalues is violated and 1 is an eigenvalue. With further
assumptions on the higher derivatives of II'(x), it follows that (i) for /J on one side of
Ito there are no fixed points near Xo, and (ii) for /J on the other side of /JO there are two
fixed points. Of the two fixed points, one is attracting and the other is repelling (at
least in one dimension, n = 1).
The other types of bifurcations that we consider are ones where the fixed point
persists (1 is not an eigenvalue), but the stability type of the fixed point changes as
/J passes through Ito (one eigenvalue has absolute value equal to 1). Among this type
of bifurcation, the second type we consider, called a period doubling bifurcation or flip
bifurcation, occurs when one eigenvalue is -1. In one spatial dimension, n = 1, an
attracting point of period one becomes repelling at /Lo and a stable orbit of period 2
branches off from the fixed point.
The last type of bifurcation we consider is called the AndronotJ-Hopf bifurcation. It
occurs when a pair of complex eigenvalues have absolute value one when /J = /JO but are
not equal to ± 1. As /t passes through /Lo, the absolute value of these eigenvalues change
from less than one to greater than one. For a pair of complex eigenvalues to occur, it is
necessary to be considering a map in at least two dimensions. With further assumptions
on the derivatives, for /L slightly bigger than /Jo there is an invariant closed curve near
xo. This bifurcation is simpler for differential equations. In this case, for a differential
equation in two dimensions, a fixed point changes from attracting to repelling as a pair
of eigenvalues crosses the imaginary axis, and a stable periodic orbit branches off from
the family of fixed points.
For a more complete treatment of bifurcation theory, see Guckenheimer and Holmes
(1983), Chow and Hale (1982), Wiggins (1990) and (1988), or Hale and Koc;ak (1991).

6.1 Saddle-Node Bifurcation


As stated in the introduction, the first bifurcation we consider is the one that occurs
when the map fails to be hyperbolic because 1 is an eigenvalue of the derivative.
We first consider the case where x is a real variable. We want fl'o(xo) = Xo and
I~o (xo) = 1. We also want the tangency of the graph of 11'0 to the diagonal {(x,1/) : 1/ =
x} to occur in the simplest possible fashion so we assume that I~o(xo) 1: o. Finally
we need that the graph of I" is moving upward or downward as the paranleter varies,

211
212 VI. BIFURCATION OF PERIODIC POINTS

:~ (xo, JlO) f. O. With these assumptions, the fixed point disappears on one side of 110
and two fixed points branch off on the other side. Before stating the general theorem
we give an example.
Example 1.1. Let I,,(x) = Jl+x-ax 2 with a> O. For Jl < 0, I,,(x) -x = Jl-ax 2 < 0
for all x so there are no fixed points. For Jl = 0, 0 = lo(x) - x = -ax 2 has one root at
x = 0, so 10 has one fixed point at x = O. For Jl > 0, 0 = I,,(x) - x = Jl - ax 2 has two
roots, x± = ±('I/a)I/2. Thus I" has two fixed points. See Figure 1.1.

__________ ~~--~~------------~~x

FIGURE 1.1. Creation of Two Fixed Points

Theorem 1.1 (Saddle-Node Bifurcation). Assume that I : R2 --+ R is a C r func-


tion jointly in both variables with r ~ 2. We also write I,,(x) = l(x,Jl). Make the
following further assumptions:
(I) l(xo,Jlo)=xo,
(2) I~o(xo) = 1,
(3) 1::O(xo) f. 0, and
81
(4) --8 (xo, /lo) f. O.
II
Then there exist intervals I about Xo and N about /lo and a C r function m : I --+ N
such that (i) Im(x)(x) = x, (ii) m(xo) = /lo, and (iii) the graph ofm gives all the fixed
points in I x N. Moreover m'(xo) = 0 and

These fixed points are attracting on one side of Xo and repelling on the other. See Figure
1.2.

PROOF. To find the fixed points of II" we consider the new function

G(x, /l) = I(x, /l) - x.

Then fixed points of I" exactly correspond to zeroes of G. First we have G(xo, /10) = O.
We want to use the Implicit FUnction Theorem to solve for nearby zeroes of G. Note
that (8G/8x)(xo, /lo) = 1:.0
(xo) -I = 0 so we can not solve for x in terms of /l. However,
6.2 SADDLE-NODE BIFURCATION IN HIGHER DIMENSIONS 213

( stable

unstable

FIGURE 1.2. Bifurcation Diagram for Saddle-Node Bifurcation

so we can solve for p. in terms of x. In fact, there are intervals I about Xo and N about
P.o and a C r function m : 1 -+ N such that G(x, m(x)) == 0, and these give all the zeroes
of G in I x N. This construction proves the first facts about the fixed points.
To calculate the derivatives of m(x) we use implicit differentiation. We use subscripts
to designate partial derivatives. Thus G z = If.
Differentiating 0= G(x,m(x» with
°
respect to x, we get = G z + Gl£m'. Evaluating this at Xo and using the fact that
Gz(xo, p.o) = 0 and GI£(XO, p.o) l' 0, we get that m'(xo) = 0. (Notice that this much of
the proof only uses the fact that f(x,p.) is C 1 and does not use the second derivative.)
To get the second derivative of m, we differentiate the equation 0 = G.,(x, m(x)) +
GI£(X, m(x»m'(x) a second time with respect to x and get
0= Gzz + 2Gl£z m' + GI£I£(m')2 + Gl£m".
Evaluating this expression at Xo and using the fact that m'(xo) = 0, we get that
m"(xo) = -G:z:z(xo, p.o) = - f:z:z(xo, p.o)
GI£(XO, p.o) fl£(xo,p.o)
as claimed. (In this notation, fl£(xo, p.o) is the p.-partial derivative evaluated at the
point indicated.)
To find the stability of the fixed points we use the Taylor expansion of f" about
(xo, p.o):
of 02f 02 f
ox (x, p.) =1 + ox 2 (x - xo) + OXOI-' (I-' - p.o)
+ O(lx - xol 2 ) + O(lx - xol ,11-' - p.ol) + 0(11-' _ p.oI2).
Because m'(xo) = 0, it follows that m(x) - p.o = O(lx - xoI 2 ). Therefore,
of o2f 2
-0 (x, m(x» = 1 + -02 (x - xo) + O(lx - xol ).
x x (Zo.,..,)

Because :~ (xo, p.o) oF 0, :~ (x, m(x)) - 1 has opposite signs on the two sides of xo,

:~ (x, m( x» is less than one on one side and greater than one on the other, and so one
side has an attracting fixed point and the other side a repelling fixed point. 0

6.2 Saddle-Node Bifurcation in Higher Dimensions


In the last section, we gave the saddle-node bifurcation in one spatial dimension.
This section considers this same bifurcation in higher dimensions. Before stating the
theorem, we consider the example of the Henon map.
214 VI. BIFURCATION OF PERIODIC POINTS

Example 2.1. Let FAB(X,y) = (A - By - x 2,x) be the Henon family of maps. The
fixed points have x-coordinate given by x = _ B ; 1 ± [( B ; 1) 2 +A r/ 2, so there are
no fixed points when A < -I(B + 1)/2J2 and two fixed points when A> -I(B + 1)/2J2.
The eigenvalues of the derivative D(FAB)(z.y) are ~:I: = -x ± Ix 2 - Bj1/2. When A =
-[(B + 1)/21 2 and x = -(B + 1)/2, the eigenvalues are ~+ = Band .L = 1. Thus there
is a bifurcation from no fixed points to two fixed which occurs when A = -[(B + 1)/2J2,
and one of the eigenvalues of this fixed point is 1 at this bifurcation value. We leave
to the exercises for the reader to verify that this family satisfies the conditions of the
theorem below for a saddle-node bifurcation. See Exercise 6.2.
To state the theorem in higher dimensions, it is necessary to specify the derivative
of the "coordinate" function along the direction of the eigenvector for the eigenvalue 1.
We assume that 1 is an eigenvalue of D(f,.o)xo of multiplicity one. Let vi be the right
eigenvect.or (written as a column) for the eigenvalue 1, and w be the left eigenvector
(written lIS a row), D(f,.o)Xovl = vi and wD(f,..)xo = w. If v j for 2 ~ j ~ n are the
other generalized right eigenvectors, then wv j = 0 for 2 ~ j ~ n. (The product wv j is
zero because Aj i' 1 and wv J = (wD(f,.o)Xo)v J = w(D(f,.o)xovJ) = AjWV j .) We can
now use w I,. (x) lIS the component of I,.(x) along the direction of vi.
Theorem 2.1. Let I : Rn x R ~ Rn be C 2 jointly in all the variables, and write
I,.(x) = I(x, Il)· Assume that I,. satisfies the following conditions:
(1) I(Xo,llo) = Xo,
(2) D(f,.o)xo has eigenvalues Al = 1 and Aj with IAjl i' 1 for 2 ~ j ~ n, letv l be the
right eigenvector for the eigenvalue 1 of D(f,.o)xo and w be the left eigenvector
for the eigenvalue 1 (written as a row),
(3) W [D2(f,.o)xo (vI, vi)] i' 0, and
(4) w(81/81l)(Xo,Ilo) i' O.
Then it is possible to parameterize Il = m(s) and x = q(s) such that m(O) = Ilo,
q(O) = xo. q'(O) = vi, and I(q(s), m(s» == q(s).
REM A RK 2.1. It is possible to make a change of basis so that

D(f,.o)xo =
(o1 0• ...... 0)*
. . ... . '
o • ... •
where the terms marked by a '.' are unspecified. In terms of this basis, the left eigen-
vector w = (1,0,··· ,0) = (Vl)t (the transpose to a column vector). Thus in terms of
this basis, condition (3) is given by (8/t/81l)(Xo,IlO) i' 0 and condition (4) is given by
(82/d8xr)(Xo,llo) i' O.
REMARK 2.2. If n = 2, IA21 < 1, and the derivatives in conditions (3) and (4) have
opposite signs, then one of the fixed points is a stable node (attracting fixed point with
two unequal eigenvalues) and the other is a saddle. This is the reason for the name of
the bifurcation.
REMARK 2.3. There are two approaches to the proof. One uses the center manifold
associated to F(x, Il) = (f,.(x), IJ) which is two dimensional, one spatial dimension and
one parameter direction. The restriction of F to this invariant manifold satisfies the
assumptions of the earlier theorem.
A second approach us<'~ thp Implicit Function Theorem to do the reduction in di-
mensions. This method does not produce an invariant two dimensional set, but it does
6.2 SADDLE-NODE BIFURCATION IN HIGHER DIMENSIONS 215

produce a two dimensional set on which all the fixed points must be found. This second
method is often called Liapunov-Schmidt reduction.
We give proofs using both approaches below.
PROOF USING THE CENTER MANIFOLD THEOREM. Define th map on Rn+l given by
F(x, Il) = (f(x, Il), Il)· and its derivative is given in the following block form:

Take the eigenvectors of D(F",,)xo 50 that Ivll = 1, Iwl = 1, and wv l = 1. Make a


change of basis to the eigenvectors of D(f"" )xo; in these coordinates, the eigenvectors of
D(fl-'o)xo are the standard basis. Then w = (1,0,· .. ,0) and vi = WI. The center direc-
tions at (xo, 1-10) for the map F include the Vi and Il directions. By the Center Manifold
Theorem, we can solve for an invariant manifold x = cp(8, Il) which is a graph over the
span of Vi and the Il variable. (The total center manifold is (X,Il) = (CP(8,1l),Il).) The
parameter s can be chosen 50 that cp(O,I-Io) = Xo and W[cp(8, Il) - XoJ = s where wz is
a projection of the vector z onto the component in the direction in the vector vi. The
surface given by the image of cp is tangent to

Vi, 50 ~Cp
vS
(O,IlO) = vi. Also, from these
properties of cp, it follows that

and

With these preliminaries, define

which is the projection onto the span of vi of the difference of the iterate from Xo. We
want to apply the Saddle-Node Bifurcation Theorem in one spatial dimension. Let us
check the conditions of the one dimensional theorem for g:

g(O, 1l0) = w[Xo - XoJ = 0,


8g 8cp
8s(S,Il) = w[D(fI-')<p("I-') 88 (S,Il)] , 50

88 (0,1-10) = w [D(f",,)xo v IJ = wv I = 1.
8g

This is the first condition for a saddle-node bifurcation for g. Taking the derivative
again,
8 2g 2 8cp 8cp l)2cp]
8s 2 (0'I-Io) = w[D (f",,)xo( 88 (0,1-10), 88 (0,1-10» + D(f,.o)xo 8s 2 (0,1-10) .

We mentioned above that w~(O,I-Io) = 0,50 both

and

8 2cp
take their values in the span of v 2 through v n , and wD(f",,}xo 8s 2 (0,1-10) = 0. It follows
that
216 VI. BIFURCATION OF PERIODIC POINTS

which is nonzero by an 8SSumption of the theorem. Finally,

by another 8SSumption of the theorem. (The second term is zero because w :~ (0, /-10) =
0.)
We have checked the 8SSumptions of the saddle-node bifurcation in one spatial di-
mension applied to g, so we can solve for Il = m(s) such that g(s, m(s)) = s. Also,
m'(O) = 0 and •

10.
Next we check that this function gives us the set of fixed points that we want. First,

w[/('P(s, m(s)), m(s)) - "0] = s = w['P(s, m(s)) - xo].

Both (f('P(s,m(s)),m(s)),m(s)) and ('P(s,m(s)),m(s)) are in the center manifold and


have the same components along vi and p. Therefore they are equal,

1('P(s, m(s)), m(s)) = 'P(s, m(s)).

Letting q(s) = 'P(s.m(s)), we have the conclusion of the theorem. o


PROOF USING THE IMPLICIT FUNCTION THEOREM. This method uses only the fact
that A) f. 1 for 2 ~ j ~ n and not that IAjl 1 1. Rather than using a center manifold,
we reduce the problem to one spatial dimension by means of the Implicit Function
Theorem, a method called Liapunov-Schmidt reduction.
Wt> take a basis so that

D(fl'<»xo = ( o~.~.* ~.)


... *
In terms of partial derivatives, 88ft ("0,/-10) = 0 fo~ j ~ 2. Define!/J: Rn+1 --+ Rn- I
Xj
by !/J)(x,p) = h(x,ll) - Xj for 2 ~ j ~ n, where the I/s are the coordinate functions.
Then !/J("o, Po) = 0 and

( 8!/Ji ) = ( 8/; ) _ id
8x) 2$i,)$" 8xj 2$i,j$n
6.2 SADDLE-NODE BIFURCATION IN HIGHER DIMENSIONS 211

is invertible at (XU, ~). Therefore we can solve for (X2,' .. ,xn ) in terms of (XI,IJ),

{}1/J arp
and -a (XU,~) = 0, so -a (a,~) = 0 where a = (xuh- Also, note that ",-1(0) is
Xl Xl
not an invariant manifold, but we can use it in the same manner as the invariant center
manifold in the previous proof.
Writing a for (xoh, define

Then g(O,llO) = 0 is a fixed point, and

ag aft
-a (a,~) = -a (a + a,rp(a + a,~),~)
a Xl
all arp
+ (,,-(a + a,rp(a + a'~)'~)){-a (a + a,~)),
vXi Xl

and at (0, ~),


:!(O,~) = 1
because aaft (XU,~) = 0 for i ~ 2. Thus 9 satisfies the first two conditions for a
Xj
saddle-node bifurcation.
Taking the second derivative with respect to a,

arp aft
because -a
Xl
(a,lJo) = 0 and
V~j
"- (XU,~) = 0 for J. ~ 2.
The final assumption on 9 for a saddle-node bifurcation involves the derivative with
respect to IJ. But

ag aft aft arp


-a (O,lJo) = -a (xu,~) + (-a (xu'~))-a (a,~)
IJ IJ Xi IJ
aft
= alJ (xu,~) ¥ 0

because aaft (xu.~) = 0 for i ~ 2.


Xi
218 VI. BIFURCATION OF PERIODIC POINTS

Thus 9 has a saddle node bifurcation, and we can solve for /J., /J. = m(s) such that
9(S, m(s» == s. By the previous results, m'(O) = 0 and

This completes the proof using the Implicit Function Theorem. o

6.3 Period Doubling Bifurcation


Consider the quadratic family of maps, F,,(x) = /J.x(l- x). In Chapter II we showed
that the fixed points are 0 and p" = 1 - 1//J.. We also showed that p" is attracting
for 1 < /J. < 3 and repelling for 3 < /J.. The reason the stability can change at /J. = 3
is that FHp3) = -1, so the fixed point is not hyperbolic. We further showed that for
1 < /J. < 3, all points x E (0,1) have their forward orbit Ft(x) converge to P" as j goes
to infinity. Thus there are no points of period two for 1 < /J. < 3. A further calculation
shows that for /J. > 3, there is an orbit of period two, q;, which bifurcates off from the
fixed point. It can be shown that this orbit is attracting for 3 < /J. < 1 + 61/ 2 , and
then repelling for /J. > 1 + 6 1/ 2 . In Section 3.4, we discuss the repeated period doubling
bifurcations which takes place as the stable orbit increased in period through 1, 2, 4, 8,
... , 2n , ... . In this section we concentrate on one of these bifurcations, e.g. the one
which occurs at /J. = 3 where a fixed point becomes repelling when its derivative equals
-1 and a period two orbit is created. The following example is a model problem where
the fixed point is always at x = o.
Example 3.1. Let f,,(x) = -/.LX + ax 2 + bx 3 • Notice that f{(O) = -1. We want to
find the points of period two for /J. near 1. See Figure 3.1.

FIGURE 3.1. The Graph of f~ for (a) /J. < 1, (b) /J. = 1, and (c) /J. > 1
6.3 PERIOD DOUBLING BIFURCATION 219

A calculation shows that


f;(x) = p?x + x 2( -al-' + a1-'2) + x 3( -bl-' - 2a 21-' - b1-'3) + O(X4).
We want to find zeroes of f'f,(x) - x = O. We know that I~(O) - 0 = 0, since 11'(0) = O.
This is reflected in the fact that x is a factor of I~(x) - x. Since we want to find the
zeroes of f~(x) - x = 0 other than 0, we define

M(x, 1-') = J'f,(x) - x


x
= 1-'2 - 1 + x( -al-' + al-'~) + x 2( -bl-' - 2a 2 1-' - b1-'3) + O(x 3 ).
This function vanishes at (x,l-') = (0,1), M(O,I) = O. Also, both the constant term,
1-'2 - 1, and the coefficient of x, -al-' + a1-'2, vanish at (x,l-') = (0,1). Using the fact
that 1-'2 - 1 = (I-' - 1)(1-' + 1) ::::: 2(1-' - 1), to lowest terms the zeroes of M(x, 1-') are
approximately equal to the zeroes of 0 = 2(1-' - 1) - 2(b + a2 )x 2 , which are
1-'=1+(b+a2)x2, or
x= for bl-' - 12 > O.
±( I-' - 1 ) 1/2
b+ a2 +a
In the proof of the general theorem, this is justified by applying the Implicit Function
Theorem. The partial derivatives of Mat (0,1) are M",(O, 1) = -al-' + a1-'211'=1 = 0 and
MI'(O,l) = 21-'11'=1 = 2 # O. The Implicit Function Theorem says that I-' can be solved
for in terms of x to give zeroes of M, I-' = m(x) with M(x,m(x» == 0 so I;'(",)(x) = x.
To justify the approximation made above, we can calculate the derivatives of m(x) by
implicit differentiation and show that this gives the lowest order terms given above.
Differentiating 0 = M(x, m(x» twice with respect to x gives
0= M",(x,m(x» + MI'(x,m(x»m'(x) and
0= M""" + 2M"'l'm' + MI'I'(m,)2 + Ml'm".
Using the fact that M",(O,I) = 0 and MI'(O,I) # 0 in the first equation gives that
m'(O) = O. We use the second equation to determine m"(O). Because M""" is the only
second derivative not multiplied by m'(O) (which equals zero), this is the only one we
need to calculate. Using the explicit expression for M, M"",,(O,I) = 2( -bl-' - 2a 2 1-' -
bI-'3)11'=1 = -4b - 4a 2. Then
m"(O) = -M"",,(O, 1) - 2M"'I'(O,I)m'(0) - MI'I'(O,I)(m'(O»2
MI'(O,I)
4b+ 4a 2 -0 - 0
2
= 2(b + a 2 ).
Thus to get a quadratic shape to the new points of period two, we need to assume that
b + a2 # o. In the general theorem, we see that the sign of b + a2 also determines the
stability of the period two orbit. Note that -2(a 2 + b) is the coefficient of x 3 in Jl
where 1 is the bifurcation parameter value.
The above example is fairly general, but it does assume that the fixed point does not
vary with the parameter, so :~ (0,1) = O. In the theorem we allow :~ (xo, 1-'0) # 0 and
show the effect of this term. We also give a condition for the spatial derivative of 1 to
vary along the curve of fixed points as the parameter varies. This condition is given in
terms of derivatives of f, so it is not necessary to calculate f;(x) to apply the theorem.
The bifurcation described in the following theorem is called the period doubling or flip
bifurcation.
220 VI. BIFURCATION OF PERIODIC POINTS

Theorem 3.1 (Period Doubling Bifurcation). Assume that 1 : R2 -+ R is a


C r function jointly in both variables with r ~ 3, and that 1 satisfies the following
conditions.
(1) The point Xo is a fixed point for Il = J.'o: l(xo.J.'o) = Xo.
(2) The derivative of 1"0 at Xo is minus one: I;"'(xo) = -1. Since this derivative is
not equal to 1, there is a curve of fixed points X(Il) for Il near 1'0.
(3) The derivative of 1~(x(Il» with respect to p. is nonzero (the derivative is varying
along the family of fixed points):

(4) The graph of I'/.. has nonzero cubic term in its tangency with the diagonal (the
quadratic term is zero):

Then tIl ere is a period doubling bifurcation at (xo. J.'o). More specifically, there is a
differentiable curve of fixed points, X(Il), passing through Xo at 1'0. and the stability of
the fixed point changes at Ilo. (Which side of J.'o is attracting depends on the sign of
0.) There is also a differentiable curve "f passing through (xo. 1'0) so that "f \ {(xo. J.'o)}
is the union of hyperbolic period 2 orbits. The curve "f is tangent to the line R x {IlO}
at (XO,/lo), so "f is the graph of a function of x, Il = m(x) with m'(xo) = 0 and
m"(xo) = -2{3/0 '" O. The stability type of the period 2 orbit depends on the sign of
{3: if {3 > 0 then the period 2 orbit is attracting, and if (3 < 0 tllen the period 2 orbit is
repelling.
PROOF. By the assumptions, I~o(xo) '" 1 so there is a curve of fixed points parameter-
ized by 1-1. X(Il). Moreover,

".f'want to translate the coordinates so that 0 is a fixed point for all nearby 1-1, so
we define
g(y, Il) = I,.(y + X(Il» - X(Il)·

Then, g(O./I) == O. We write g2(y, Il) when we mean g(., 1') 0 g(y, Il). The partial deriva-
tives of 9 with respect to y are the same as those of 1 with respect to x at the corre-
sponding points,

The value of the partial derivative with respect to the position, : : (0. Il), determines
[J2
the stability of the fixed point, so all~ (0, Il) measures the change along the curve of
6.3 PERIOD DOUBLING BIFURCATION 221

fixed points, and

829 8 8/
8"Dy (0, "0) = 8" ax (x(,,), ,,)1,.=,.0

oP/ 82 / ,
= a,,8x (xo, /10) + 8x 2 (xo, /IO)x (/10)
82 / 1a2 / 0/
= 81'8x (xo,/IO) + 2ax2 (xo, /10) a" (xo,/IO)
=0",0.

This calculation shows that 0 measures the quantity described in the statement of the
theorem.
Let aj(") be the coefficient of yj in the Taylor expansion of 9 about y = 0, so

g(y, II) = al (,,)y + a2(,,)y2 + a3(,,)y3 + O(y4).


A direct calculation then shows that

where we do not exhibit the dependence of the coefficients aj on ". As in the example,
°
we want to find points where g2(y, 1') - Y = that are different than 0. Since y = is °
always a solution, we divide out by y when y '" 0. So we define
g2(y, ,,) _ y
M(y,,,) =
{ Y
for y '" °
a 2
8y (g (y, 1'»1 11 -0 - 1 for y = 0.

Notice that the definition for y =


Using the expansion of g2,
° is the natural extension of the definition for y '" 0.

M(y, 1') = (a~ - 1) + (ala2 + a2a?)y + (ala3 + 2ala~ + a3a~)y2 + O(y3).


In order to show that the Implicit Function Theorem applies, we note that M(O,/IO) = 0,
and the partial derivatives are

M\I(O, /10) = ala2 + a2a~I,.0 = -a2("0) + a2(/IO) = ° and


88g2
M,.(O, 1'0) = 8" Dy (0, ,,)1"0
8 8g 2
= 8" (8y(0,,,») I,..,
8g 8 2g
= 2 Dy (0, /10) 81'8y (0, /10)
= -20", 0.

Because M,.(O, /10) '" 0, the Implicit Function Theorem applies and there is a differen-
tiable function m(y) such that M(y,m(y» == 0.
By implicit differentiation, 0= Mil (0, /10) + M,.(O, /IO)m'(O), so
222 VI. BIFURCATION OF PERIODIC POINTS

To calculate the second derivative of m(y), differentiate the equation 0 = MII(y, m(y» +
M,,(y, m(y»m'(y) again:

0= Mill/ + 2MII"m'(y) + M",,(m,)2 + M"m".


Evaluating these at y = 0 where m'(O) = 0 gives
m"(O) = -MIIII(O,~o).
M,,(O,~)

Thus we need to calculate the numerator,

MIIII(O,~) = 2(ala3 + 2ala~ + a3a~)I"o


= 2( -a3 - 2a~ - a3)IPo = -4(a3 + a~)I"o
= -4/3 i: o.
Therefore
m"(O) = 4/3 = _ 2/3 i: O.
-2a a
lfIg2
It can also be checked that 8y3 (0, ~o) = 3MIII/(0, ~o) = -12/3 i: O.
This leaves only the stability of the period 2 orbit to check. For this we use the
8(g2)
Taylor expansion for """OJ/(y,m(y)) about y = 0 and ~ =~:

We have already calculated the value of several of these coefficients:

(by the chain rule),


8;;::) (O,~) = 0

as is noted in the calculation of MI/(O, ~), and

so

~~~ (O,~)(m(y) -~) = M,,~mll(0)y2 + O(y3)


= M,,~C::III1)y2
= 2/3y2. "
6.4 ANDRONOV-HOPF BIFURCATION FOR DIFFEOMORPHISMS 223

Finally,
1 (J3(g2) (1) 2 3
2--;3y3(O,Jlo) = 2 6(ala3 + 2ala2 + a3 al) = -6{3.
Combining these terms,

a~:) (y, m(y» = 1 + 2{3y2 _ 6{3y2 + O(y3)


= 1 - 4{3y2 + O(y3).
Thus, if {3 > 0 the period 2 orbit is attracting, and if {3 < 0 then it is repelling. This
finishes checking all the conditions of the theorem. 0

6.4 Andronov-Hopf Bifurcation for Diffeomorphisms


As stated in the introduction to the chapter, the Andronov-Hopf bifurcation for
a diffeomorphism occurs when a pair of eigenvalues for a fixed point changes from
absolute value less than one to absolute value greater than one, i.e., the fixed point
changes from stable to unstable by a pair of eigenvalues crossing the unit circle. With
further conditions on derivatives, it follows that an invariant closed curve bifurcates off
from the fixed point. The motion on this invariant curve is a rotation, whose rotation
number starts near the value determined by the eigenvalue. The formal statement of
the theorem requires some notation and preliminary change of variables. We do this
before we state the theorem.
Assume that ~ : R2 x R -+ R2 is a one-parameter family of C r diffeomorphisms
which satisfies the following conditions. (We do not state a version which allows the
fixed point to vary with the parameter.)
(a) Assume r ~ 5.
(b) Assume that the origin is a fixed point of ~I' for I-' near 0: ~I'(O, 0) = (0,0).
(c) Assume that D(~I')(o,O) has two non-real eigenvalues, >'(1-') and X(I-'), such that
d
1>'(0)1 = 1 and dl-' 1>'(1-')1 '" o. By a change of parameter, we can assume that
1>'(1-')1 = 1 + 1-'.
(d) By a change of basis on R2 (which depends on 1-'), we can assume that

(e) We further assume that >.(o)m = eim/3(O) '" 1 for m = 1,2, ... ,5. This means that
>'(0) is not a low root of unity (in addition to not being equal to ±1).
(f) Because >'(0) is not a low root of unity, there exists a change of coordinates that
bring ~I' into the form

where in polar coordinates,

We make the assumption that !teO) '" o. Thus, the radial component of the map is a
nonlinear function of r. (Notice that (1 +I-')r- h(l-')r3 = r has a solution r2 = 1-'!t(1-')-1
for those I-' with 1-'!t(I-')-1 > o. Thus the normal form terms have an invariant circle for
I-' on one side of I-' = 0.) The bifurcation described in the following theorem is called
the Andronov-Hopf Bifurrotion for diffeomorphisms.
224 VI. BIFURCATION OF PERIODIC POINTS

Theorem 4.1 (Andronov-Hopf Bifurcation). Assume ~,,(x, y) satisfies assump-


tions (a) - (f). Then for all sufficiently small,.,. with ,.,.11{,.,.)-1 > 0, ~" has an in-
variant closed curve surrounding the fixed point (O,O) of radius approximately equal to
[J.l111 (11)]1/2. FUrther, if 11 (0) > 0 then the closed curve is attracting, and jf IdO) < 0
then it is repelling.
REMARK 4.1. This theorem was proved by Naimark (1967), Sacker (1964), and Ruelle
and Takens (1971). Also see Marsden and McCracken (1976) and Carr (1981). The
name of Andronov-Hopf Bifurcation is commonly used because of the connection with
the Andronov-Hopf bifurcation for differential equations given in the next section.
REMARK 4.2. The map on the invariant closed curve can vary. This map is most likely
not conjugate to a rigid rotation for most parameter values. As J.I approaches 0 the
rot.ation number is approximately {3(/1)/{21r).
REMARK 4.3. The proof of this theorem is more difficult than those for the other
bifurcations which we treat. The reason that it is harder is that we need to find a
whole closed curve of points and not just a single point. For this reason we can not
use the Implicit Function Theorem to prove this theorem, but must apply a contraction
mapping argument to a set of potential invariant curves, and the construction is much
more involved and delicate. Because of these complications we do not give the proof
but refer the reader to the references. In the next section we do prove the simpler Hopf-
Andronov bifurcation for differential equations where we can use the Poincare map in
polar coordinates from () = 0 to itself and reduce the problem to finding a fixed point
of this map.

6.5 Andronov-Hopf Bifurcation for Differential


Equations
As stated in the introduction to the chapter, the Andronov-Hopf bifurcation for a sys-
tem of differential equations occurs when a pair of eigenvalues for a fixed point changes
from npgative real part to positive real part, Le., the fixed point changps from stable to
unstable by a pair of eigenvalues crossing the imaginary axis. With further conditions
on derivatives, it follows that a periodic orbit bifurcates off from the fixed point. Notice
that for a differential equation, the Andronov-Hopf Bifurcation gives a periodic orbit
and not just some invariant closed curve. This difference makes the analysis much sim-
pler in this case. We proceed to make some assumptions and constructions before we
state the main bifurcation theorem of the section.
\Ve consider a one parameter family of differential equations
x= I{x,,.,.) = I,,(x)
with x E ]R2 that satisfies the following assumptions.
(1) The origin is a fixed point for all values of J.I near 0: I(O,J.I) = o.
(2) The eigenvalues of D(f,,)o are o{,.,.) ± i{3{,,) with 0(0) = 0, {3(0) = f30 =F 0, and
0'(0) =F 0, so the eigenvalues are crossing the imaginary axis.
The last assumption of the theorem involves the Taylor expansion of the differential
equations, with a condition given on a combination of the coefficients when they are
expressed in polar coordinates. Therefore, after we make some preliminary change of
coordinates, we indicate in a lemma the form of the equations when transformed into
polar coordinates.
By the Implicit Function Theorem, since 0'(0) =F 0, the parameter can be changed
so that 0(11) = IJ. We use this new parameter. Then there is a change of basis on R2
such that
6.5 ANDRONOV-HOPF BIFURCATION FOR DIFFERENTIAL EQUATIONS 225

with

and
F(x ) _ (BHxl'X~'JJ)+BHxl'X~.#~)+O(IXI4))
,JJ - B~(XI,X2,JJ) + B~(XI,X2,JJ) + O(lxI4)
where Bj(X('X2,JJ) is a homogeneous polynomial of degree j in Xl and X2. Next, we
transform the equations to polar coordinates and obtain the form stated in the following
lemma.
Lemma 5.1. Consider the differential equ8tiollS (.) when expressed in polar coordi-
nates, Xl = reos(O) and X2 = rsin(O). Then

r = JJr + r2C3(0,JJ) + r3C4(0,JJ) + O(r4)


{J = !3(JJ) + rD 3(O, JJ) + r2 D 4(O, JJ) + O(r3)

where Cj{-,JJ) and Dj(',JJ) are homogeneous polynomials of degree j in sin(O) and
eos( 0). In fact,

C 3(0,JJ) = eos(O)B~(eos(O),sin(O),JJ) + sin(O)B~(cos(O),sin(O),JJ)


D3(0, JJ) = - sin(O)B~ (eos(O), sin(O), JJ) + eos(O)B~(cos(O), sin(O), JJ),
where Bj{-'" JJ) is the homogeneous term of degree j in terms of Xl and X2 of XA;.
Moreover,

PROOF. Taking the time derivatives of the equations which define polar coordinates,
we get

where Bj are functions of rcos(O), rsin(O), and JJ, Bj(rcos(O),rsin(O),JJ). Factoring


out r from the Bj (and remembering that Bj is homogeneous of degree j), we get
the form of the statement of the lemma. To check the integral of C3(O,JJ), notice that
C3 (O, JJ) is a homogeneous cubic polynomial in sin(O) and cos(O) so that
226 VI. BIFURCATION OF PERIODIC POINTS

o
(3) Using the coefficients defined in Lemma 5.1, we define

and make the assumption that K I- O.


The significance of assumption (3) is best understood in terms of a "normal form."
With assumptions (1) and (2), there is a change of coordinates R = r + ul(r,O,p) in
terms of which the differential equations become the following:

R= pR + K R3 + O(~)
9 = fj(p) + O(R).

See Section 3.2 in Carr (1981). Thus the fact that K I- 0 means that the R equation
has a nonzero cubic term and so has an invariant closed curve of approximate radius
(-Ill K)1/2 (almost a circle of this radius). This situation is similar to the discussion we
gave for the diffeomorphism case. In the following proof we avoid the use of the normal
form (so we do not need to verify it), but merely build the necessary construction into
the proof.
In the following theorem we use the radius of the solution at t = 0 as the parameter
and denote it by E. Thus we find T(E)-periodic solutions in time, x"(t, p(£», such that
their initial conditions in polar coordinates are given by r"(O, pte»~ = e and 0"(0, pte)) =
0, for some parameter p(E) which is a function of E, and where the period T(e) is a
function of E. Thus the period and the parameter value for which the periodic orbit
occurs are functions of the approximate radius of the periodic solution. The bifurcation
described in the following theorem is called the Andronov-Hop! bifurcation for flows.
Theorem 5.2 (Andronov-Hopf Bifurcation). Make assumptions (1) - (2) on the
differential equation,
x = I(x,p).
(a) Then there exists an EO > 0 such that for 0 :5 £ :5 EO, there are (i) differentiable
functions Il(E) and T(E) with T(O) = 27r/{30, p(O) = 0, and p'(O) = 0 and (ii) a T(E)-
periodic function oft, X"(t,E), that is a solution of(*) for the parameter value p = PtE)
and with initial conditions in polar coordinates given by r" (0, e) = € and 0" (0, €) = O. In
fact, for all t, r·(t, £) = e +0(£). (Uniqueness) Further, there are po> 0 and 60 > 0 such
that any T-periodic solution x(t) of(.) with Ipl :5 PO, IT-27r/.Bo1 :5 60, and Ix(t)1 :5 60,
must be x"(t,p) up to a phase shift, i.e., x(t + to) = x"(t,p) where p = 1l(lx(to)i) and
to is chosen so that the polar angle 0 is zero for x(to), Otto) = O.
(b) If we also make assumption (3), then not only is p'(O) = 0 but also Ill/(O) =
-2K I- O. (This means the periodic solutions occur for p on one side of 0 with the side
determined by the sign of K.) fUrther, the periodic solution is attracting if K < 0 and
is repelling if K > O. See Figure 5.1.

REMARK 5.1. Examples of this bifurcation are found in the work of Poincare. This
theorem was explicitly stated and proved by Andronov (1929). Also see Andronov and
Leont<lvic-Andronova (1939). Later Hopf gave an independent proof of this theorem
and extendt'd it to higher dimensions where two eigenvalues cross the imaginary axis,
Hopf (1942). See Arnold (1983) and Chow and Hale (1982) for further discussion and
references. See ~farsden and McCracken (1976) for applicatiOns.
6.5 ANDRONOV-HOPF BIFURCATION FOR DIFFERENTIAL EQUATIONS 227

(a) (b)

(c)

FIGURE 5.1. The Phase Portraits of the Andronov-Hopf Bifurcation with


K °
< for (a) Il < 0, (b) Il = 0, and (c) Il > °
REMARK 5.2. The proof given below is a combination of those found in Carr (1981)
and Chow and Hale (1982).

PROOF OF THEOREM 5.2(a). We first determine the rate of change of r with respect
to 8 to determine the Poincare map. By using the equations in polar coordinates and
dividing r by 9, we obtain

dr Il 2 1 Il
d8 = /ir +r [/iC3 (8, Il) - {32 D 3 (8, Il) 1

+ r 3[1/iC4 (8, Il) - f321 C 3 (8,Il)D 3 (8,1J)


- ;D4 (8,1l) + ;3 D3(8,1J)2j
+ 0(r 4 ).

(In this and subsequent equations, 0(r4) is really a remainder term in an expansion, so
it can be differentiated to give a term ofthe next lower order: (Oi /fJr i )0(r4) = 0(r4-i)
for 1 ::; j ::; 4.) Let r(8, E, IJ) be the solution for the parameter value IJ with r(O, E, IJ) = E.
° °
If E = then r(8, 0, IJ) ;: 0, so this gives E = as a fixed point for r(21r, E, IJ). Since we
want to find nonzero fixed points of the Poincare map, we define

=
F( E,IJ ) - r(21r, E, IJ) - E
.
E

We want to apply the Implicit Function Theorem to continue a zero of F. In fact, we


show that F(O,O) = 0, F.(O,O) = 0, and F,..(O,O) i' 0. To verify these conditions, we
need to show that r(21r, E, Il) -E = 0(E2), and r(21r, E, 0) -E = 0(E 3). Such estimates can
probably be proved using Gronwall's Inequality applied to the equation (d/dB)e-",'/fJr =
0(r2). Instead of using this approach, we scale the variable r by E, defining r = fP, and
derive estimates on p.
Thus we define r = EP, and let p(8,E,IJ) be the solution with p(O,f,lJ) = 1 (which
corresponds to r(O,E,IJ) = f). We need to verify that p(21r,E,IJ) - 1 = 0(£). The
VI. BIFURCATION OF PERIODIC POINTS

differl"ntial equation, dp/dO, is given by

dp I' 2 1 I'
dO = ~P+EP [~C3{O,I')- tpD3{O,I')]

+ E2 P3 [1~C4{B,I') - 1 C3{B,I')D3{B,I')
tp

-;2 D4{B,I') + ;3 D 3 {B,I')2]


+ O{E 3p4).
We trl"at these equations as a perturbation of the linear terms. Using the integrating
factor e-,..fJ/o, the derivative of e-,..9/0 p is as follows:

Integrating from 0 to 21T, we obtain

e- 2",../O p{21T,E,I') - 1

Jor
h 1 I'
= E e-"'1I/0p{O'E'I')2[~C3{O,I') - ,62D3{O,Il)] dO

+ E212" e-,..1I/0 p{O, E,1')3 [~C4{O,I') - ~2 C3{O,I')D3(B,Il)


-;2 D4{O, 1') + ;3D3{O, 1')2] + O{E 3p4{O,E,I'»dO
== Eh{E, Il).

Thus

F(E,I') = p{21T,E,Il)-1
= [e 2",../o -IJ + Ee 2""'/Oh{E,I').

At (E. 1') = (0,0). F(O,O) = 0 and

F. (0 0) = (21T)e 2",../o _ (21T1')(0,6)e 2""'/O


"",6 tp 01'
+ E~
01'
[e 2",../13 h{E, 1')]1 .-0.,..-0
_ _
= 211" f: O.
t30
Therefore we can solve for I' as a function of E, I'{E), such that F{E,I'{E» == O. These
points are periodic orbits with 00 = 0, p{O,E,I'{E» = 1 so r{O,E,I'(t:» = E, and
p{211", E,I'(E» = 1 so r(211", E,I'{E» = E.
We find the derivative of I'{t:) by implicit differentiation:

F.{O, O) + F,..(O, 0)1"(0) = o.


6.5 ANDRONOV·HOPF BIFURCATION FOR DIFFERENTIAL EQUATIONS 229

Thus we need to calculate F.(O, 0). Note that


F(E, 0) = O+Eh(E,O)
r2w 1
= E10 p(9,f,0)\'30)C3(9,0)dB

r 2w 1 1
+E210 {P(9,E,0)3[(.eo)C4 (9,0) - (J3g)C3(9,0)D3(8,0)]
+ O(E3p4(9,E,0»)} d9.
Using the fact that p(9,0,0) == 1,

F.(O, 0) = .eo 10
1 r 2w
p(9,0,0)2C3 (9,0)d9+0

1 r 21<
= fJo 10 C 3 (9,0)d9
=0.
The integral is zero because C3(9, 0) is a homogeneous polynomial of odd degree in sin(9)
and cos(9). Using that 1"(0) = -F.(O,O)F,,(O,O)-l, F.(O,O) = 0, and F,,(O,O) '" 0, we
get that 1"(0) = 0. This completes the proof of part (a) of the theorem. 0
PROOF OF THEOREM 5.2(b). By part (a), there is a function I'(E) such that F(f,I'(E)) ==
°
0, 1"(0) = 0, and = F.(t,I'(E» + F,,(E,I'(f)}I"(f). Differentiating this last equation
again with respect to E and evaluating at (E,I') = (0,0) gives
0= F.. (O, 0) + 2F." (0, 0)1"(0) + F",,(O, 0)1"(0)2 + F,,(O, 0)1'''(0),
so
1'''(0) = _ F•• (0, 0)
F,,(O,O)
.eo
= -(2.,..)F.. (0,0).
Thus we need only show that F•• (O,O) = (4.,../.eo)K in order to verify the derivative
given in the theorem.
°
In the calculation of F.. (O, 0), we can fix I' = and differentiate F(E,O) twice. Using
the expression for F(E,O) given in part (a),

F.(E,O) = (.eo)
1 121<
0 P(9,f,0) 2C3(9,0)dB

E
+ (.eo) 10
r 2w 8p
2p(9,E,0) 8E(9,E,0)C3(9,0) dB

+2E 10
r2w 1 1
P(9,E,0)3 [(.eo)C4 (8, 0) - (J3g)C3 (8,0)D 3 (8,0)] dB

+ 0(E2 ),
and
230 VI. BIFURCATION OF PERIODIC POINTS

because p(8, 0, 0) == 1. Thus it is enough to show the last integral, which we denote by
I, is zero. Since (or IJ = 0,

dp
d8 (8, e, 0) = ( f30e ) p(8, e, 0) 2 C3(8, 0)+ O(e 2 ),
the derivative with respect to e gives (the first variation equation)

Integrating from 0 to 8 gives

{)p 1 {'
({)e)(8,O,O) = (130) Jo C3(8,O)d8
= (63 (8,0»
- 130 .
Thus the integral I is given as follows:

2 {2ft
I = (f3~) Jo 263(8,O)C3(8,O)d8

= (;~)[63(2'11",O)2 - 6 3(0,0)2]
=0

because C3 (8,O) is a homogeneous polynomial of odd degree in sin(8) and c08(8).


This only leaves the stability of the periodic orbit to be determined. For f30 > 0, the
Poincare map, P,.(e), is given by P(e,lJ) = =
r(2'11",e,lJ) ep(2'11",e,IJ). (For f30 < 0, the
= =
Poincare map is given by P(e,lJ) r( -2'11", e,lJ), and P(e,IJ)-1 r(2'11", e,IJ). We do not
indicate the changes (or this case.) The derivative with respect to e, which is used to
determine the stability, is given by

The Taylor expansion in e and IJ is given by

p(21r,e,lJ) = F(e,IJ)+ 1
= 1 + F.(O, O)e + F,.(O, O)IJ
1 2 1 2 3
+2"F«(O,O)e +F.,.(O,O)elJ+2F,.,.(O,O)1J +O(I(e,IJ)I)

and
"(0) 1
p(2'11", e,lJ(e» = 1 + 0 . e + F,.(O, 0)[Te2 + 0(e 3») + 2F«(O, 0)e 2 + 0«3)

( )( -F«(O,O»
=l+F,.O,O
2 1 '\ 2 ( 3)
2F,.(O,O) e +2Fulll,lJ)' +Oe
= 1 + 0(e3 ).
6.6 EXERCISES 231

Next,

and

Combining,

p,.(£) = 1 + F•• (0, 0)£2 + 0(£3)


= 1+ (~)K£2 + 0(£3).

Therefore if K < 0, then 0 < P;(£) < 1 for small £ and the orbit is attracting. Similarly,
if K > 0, the orbit is repelling. (When /30 < 0, it is still the case that the orbit is
attracting for K < 0 and repelling for K > O. We leave this verification to the reader.)
This completes the proof of the theorem. 0

6.6 Exercises
Bifurcations for Maps
6.1. Let F,,(x) = I-' + x 2 for x E R.
(a) Find the point and parameter value where there is a saddle-node bifurcation of
fixed points. Verify the assumptions of the theorem.
(b) Find the point and parameter value where there is a period doubling bifurcation
from a fixed point to an orbit of period two. Verify the assumptions of the theorem.
6.2. Let FAB(X,y) = (A - By - x 2,x) be the Henon family of maps. Prove that FAB
undergoes a saddle-node bifurcation when A = -[(B + 1)/2)2.
6.3. Let f,,(x) = I-'X - x 3 for x E R.
(a) Find the fixed points. Note that the bifurcation at I-' = 0 is not one we have
studied. It is called the pitchfork bifurcation, and takes place naturally in systems
with a symmetry 1,,( -x) = - f,,(x) for alII-'.
(b) Show that there is a period doubling bifurcation at I-' = 2. (Verify the conditions
of the theorem.)
6.4. Assume that f,,(x) = - f,,( -x) for alII-' where x E R.
(a) Prove that 1,,(0) == 0 and 1;(0) == o.
(b) Assume 1:"'(0) = 1, I;~(O) '" 0, and :/~(O)I,,=I'O '" O. Prove that f,,(x) under-
goes a pitchfork bifurcation like the example in the previous exercise.
6.5. Assume I : R" x R -+ Rn is C 3 and satisfies the following conditions.
(1) There is a Xo E R" and 1-'0 E R such that f(Xo, 1-'0) = Xo.
(2) The derivative of fl'O at Xo has eigenvalues '\l(JLO) = -1 and .\;(1-'0) for 2 ~ j ~ n
with 1.\;(1-'0)1 '" 1. Let vi be the right eigenvector for the eigenvalue .\dl-'o) = -1
of D(fI'O)"'"
(3) Let x(JL) be the curve of fixed points of I,.. Let .\;(JL) be the eigenvalues of
D(f,,)x(,,)' Assume
d
dl-'.\l (1-')11'0 '" O.

(a) Prove that there is a curve of points of period 2 bifurcating off from (Xo, 1-'0) in
Rn x R, i.e., there is a differentiable curve "f passing through (xO,I-'o) so that
232 VI. BIFURCATION OF PERIODIC POINTS

1 \ {( %0, I'o)} is the union of period 2 orbits. The curve 1 is tangent to the line
< vI> x{l'o} at (Xo,l'o).
For parts (b) through (d), assume Xo =
O. Let w be a left eigenvector for the
eigenvalue -1 of D(f,.o)o and let 11' : IR n -+ R n- I be the projection along vI onto
(v 2, ... , vn). Using coordinates with vI along the zl-axis, and I/J(x, 1') = lI'[/~(x)-x],
°
construct cp(XI,I') such that I/J(ZI,cp(ZI,I'),I') == and let 9(ZI,I') be defined as in
the proof of Theorem 2.1.
(b) Prove that

::~(0'1'0)=WD2(f,.o)(VI'VI) and

::~ (0, 1'0) = w D3(f,..)(v l , vI, vi) + 3 W


D2(/,..)(v l , :28~ (0,1'0»'

(c) Prove that


82~
{} 8 (0,1'0) = -[11' D 2(f,..) - WI 88 22 (f;.)(0).
%1

(d) Use parts (b) and (c) to write the conditions on the derivatives of and I,.. I;.
which insure that the orbits of period 2'are on one side of I' 1'0. =
=
6.6. Let FAB(Z, y) (A - By - z2, %) be the Henon map. Fix B Bo. =
(a) Show that FAB. has a fixed point with one eigenvalue equal to -1 (and the other
=
eigenvalue equal to -Bo) for A 3(1 + Bo)2/4.
(b) Prove that FAB. undergoes a period doubling bifurcation at A =
3(1 + Bo)2/4
as A varies using the conditions derived in the last exercise. (Verify at least the
conditions of part (a).)
Bifurcations for Differential Equations
6.7. Let

z=y
iI = -z + I'Y + ayl.
(a) Show that this system satisfies the eigenvalue conditions for a Andronov-Hopf
bifurcation at the origin for I' 0. =
(b) Find a (right handed) basis (of eigenvectors) which changes the linear terms at
the origin to the system

Ii= IIU - 1V
v = 1U + IIV
for the appropriate choice of II and 1. Also, calculate the nonlinear equations in
terms of these variables.
(c) Express in polar coordinates the nonlinear equations found in part (b), where
= =
r2 u 2 + v 2 and tan 9 v/u.
(d) Find the constant K (used in the statement of the Andronov-Hopf bifurcation
theorem), and show that it is nonzero. Here a '" 0, but it can either be positive
=
or negative. Hint: C3(0,9) D3(0, 9) 0. =
6.8. Consider the system of differential equations given by

z =a - (1 + 6)% + Z2 y
iI = 6z - ~21J.
6.6 EXERCISES

This system of equations is a model of a certain chemical reaction and is called the
'Brusselator'.
(a) Prove that (a, b/a) is the unique fixed point.
(b) Prove that the fixed point is stable for b < 1 + a2 and unstable for b > 1 + a 2 with
a pair of complex eigenvalues crOSBing the imaginary axis at b = 1 + a2 • (It can be
further shown that a Andronov-Hopf bifurcation to a stable periodic orbit occurs
at b = 1 + a 2 • See Prigonine and Lefever (1968) and Lefever and Nicholis (1971).)

6.9. Consider the following systems in polar coordinates, where in each case 9 = 1. In
each case, find the periodic orbits and draw the phase portrait (in Cartesian coordinates)
for JI. < 0, JI. = 0, and JI. > O. Indicate which conditions of the full Andronov-Hopf
Theorem are not true (if any).
(a) r = r(Jl.2 - r2).
(b) r = Jl.r(l + r2).
(c) r = r(JI. - r2)(4J1. - r2).
(d) r = r(JI. - r 4 ).
6.10. Consider the Lorenz system of differential equations,

:i; = -lOx + lOy,


iJ = px - y-xz,
. 8
Z = 3z+xy.

(a) Find the fixed points, 0, a±.


(b) Find the eigenvalues at the fixed point O.
(c) The rest of the problem deals with the eigenvalues at the fixed points a±. (The
eigenvalues are the same at ±a.) In particular, part (h) asks the reader to verify
that a pair of complex eigenvalues crOSB the imaginary axis (some of the conditions
for a Hopf bifurcation) at a parameter value PI which is found in part (f). To start
this process, we ask the reader to find the characteristic polynomial at a±. Show
that the characteristic polynomial at a± is given by

41 8 160
ptA) = A3 + ("3 )A2 + 3(10 + p)A + 3'(p - 1).

(d) Show that ptA) has a negative real root. (Note that p(O) > 0 and all the coefficients
are positive.)
(e) For P 2': 14, show that ptA) is a monotonically increasing function of A, so it has
exactly one real root, >.0. Thus the eigenvalues are Ao and a ± iP with P .;: o.
(There is a pair of complex eigenvalues for smaller P as well, but it is not as easy
to show.)
(f) Find a parameter value PI for which a = O. Hint: Note that .>.0 + 2a = -41/3,
where 41/3 is the coefficient of A2. So find P = PI such that Ao = -41/3 is a real
root.
(g) Show that Ao > -41/3 for P < PI. and .>.0 < -41/3 for P > Pl.
(h) Show that the complex eigenvalue a + iP crosses the imaginary axis as P increases
through Pl. Note that this shows that the Lorenz equations satisfy some of the
conditions for a Hopf bifurcation at the fixed point a for P = Pl.
CHAPTER VII
Examples of Hyperbolic Sets and Attractors

In this chapter, we return to consider examples with complicated invariant sets. We


introduce the idea of a hyperbolic invariant set and show that not only periodic orbits
can have stable and unstable manifolds, but that a hyperbolic invariant set also has
a family of stable manifolds. We give a number of different types of examples which
are hyperbolic and give a method to show that the map is topologically transitive on
the invariant set. One very important type of example arises from the intersection of
stable and unstable manifolds of a saddle periodic orbit. This gives rise to an invariant
set called a Smale horseshoe. It is very similar to the invariant set which we found for
the quadratic map on the real line. Another important type of hyperbolic invariant set
occurs where all nearby orbits tend toward the invariant set. Such an invariant set is
called an attractor (with further conditions added). Thus an attractor is like a periodic
sink where the invariant set itself is more complicated topologically. The final cl_
of examples we consider are those with only a finite number of periodic points, called
Morse-Smale systems. For these systems we make a connection with the Lefschetz
theory through the Morse-Smale inequalities.

7.1 Definition of a Manifold


In this chapter, we consider diffeomorphisms (or flows) on a manifold M having a
dimension greater than or equal to two. A manifold is merely a set on which there are
local coordinates that make it a Euclidean space. We have already seen examples of
manifolds: (i) the circle which is represented by 71": RI -+ Sl, 7I"(t) = t mod 1, and (Ii)
local stable and unstable manifolds which are represented as graphs, (1' : E'(r) -+ E"(r)
and W:(O) = ({x,(1'(x» : x E E·(r)}. We now give a more formal definition of a
manifold.
Definition. A c r n dimensional manifold M is a second countable metric space to-
gether with a collection of homeomorphisms CPo : Va eRn -+ Uo c M for Q in some
index set A such that (i) cp(Vo) = U (Ii) {UO}OEA is an open cover of M, and (iii) if
Q '

Uo n Up -F 0, then

CPo,p = cp;;1 0 CPo : cp;;l(Uo n Up) C Va -+ cp;;l(Uo n Up) c Vp

is a C r diffeomorphism between open subsets of Rn. One of the allowable maps CPo :
Va C R" -+ Uo c M is called a coordinate chart on M.
Example 1.1. For the circle, we can use the map 71" : R -+ Sl to induce homeomor-
phisms on open intervals 10 of length less than one. If 10 and Ip are two such open
intervals with 71"(10) n 71"(113) -F 0, Uo = 71"(10)' and Up = 71"(113), then

71"0.13 = (71"1113 )-1 071": (7I"110)-1(U0 n U13 ) -+ Ip

is given by 7I"o,l3(t) = t + j for some integer j. Therefore 7I"o.P is Coo and the collection
of these maps clearly satisfies the conditions of the definition of " ",anifold.

235
236 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

REMARK 1.1. In the case of a local stable manifold, we allowed it to be the graph of
a function between Banach spaces. We do not make the formal definition of a Banach
manifold, but it is much the same as that given above for finite dimensional manifolds.
See Lang (1967).
REMARK 1.2. The local stable or unstable manifolds of a hyperbolic fixed point of a
diffeomorphism can be represented as a graph so it is clearly a manifold. The global
stable and unstable manifolds do not always satisfy the full conditions stated above for
a manifold. The problem is that the global stable manifold can accumulate on itself.
(See the examples of the Smale horseshoe and the toral An080V automorphism given in
Sections 7.4 and 7.5.) More specifically, let / : M -+ M be a C'" diffeomorphism with a
hyperbolic saddle fixed point p. It is always possible to define a C'" map", : JFi, -+ M
such that (i) '" is one to one, (ii) '" is onto W'(p), and (iii) the derivative of", at each
point is an isomorphism. (See the discussion of the derivative of maps into a manifold
below.) However, the map", does not always have a continuous inverse (Le., '" is not a
homeomorphism). so W'(p) is not a manifold in the full sense defined above.
A C r map '" : N -+ M from one manifold into another is called an immersion
provided the derivative of '" at each point is an isomorphism. The image of a one to
one immersion is called an immersed submani/old. If an immersion is a homeomorphism
then it is called an embedding and its image is called an embedded submanifold. (Some
people require that an embedding is also proper, Le., the inverse image of a compact
set is compact.) See Hirsch (1976) for discussion of these concepts.
REMARK 1.3. A one dimensional manifold is usually called a curve; a two dimensional
manifold is usually called a surface.
Example 1.2. The n-torus 1['n is the product of n copies ofthe circle. 1['n Sl x· .. X Sl. =
To define a map from IR n onto 1['n. we first define an equivalence on IR n by (ZI,' ..• zn) -
(J1l •...• lin) if ZJ = 11; mod 1 for all j. Define 11' : IR n -+ 1['n by letting 1I'(x) be the
equivalence class of x under -. The reader can check that this gives 1['n the structure
of a Coo manifold.
= =
Example 1.3. Let sn {x E IR"+I : Ixl I} be the n-sphere. To show that sn is a
manifold. we represent pieces of sn as graphs. Let D n be the open unit ball in IRn. For
1 :5 j :5 n. define ",t :
D n -+ sn by

",1(YI •...• lin) = (111 ..... lIj-I, ±(1 - 1I~ - .. , - 1I~)1/2, lIj,' .. , lin).

Each of these maps "'1


is a homeomorphism onto a "hemisphere" of sn. We leave to
the exercises the verification that these maps gives sn the structure of a Coo manifold.
See Exercise 7.1.
Example 1.4. A common method to specify a manifold is as the level set of a function.
Let F : lRn+l -+ lR be a C'" function for some r > 1. Assume c E IR is a value such that
for each p E F-I(c), DFp -:F O. Le .• DFp has ;-ank one, or some partial derivative of
F is nonzero at p. Let M =
F-I(c). For each p EM, the Implicit Function Theorem
proves that there is a neighborhood Up of p and a C'" function up : Vp C lR" -+ IR such
that the graph up is onto Up. More specifically. if 88F (p) -:F 0, there there is an open
Zj
set Vp in lR n and a C'" function up : Vp -+ IR such that

Up = {(1I1 •...• lIj-l. Up (lIl •... , lin). lIj," ., lin) : (111, ...• lin) E Vp }

is a neighborhood of p in M. These graphs give M the structure of a C r manifold. This


example is a generalization of the n-sphere of the last example.
7.1.1 TOPOLOGY ON SPACE OF DIFFERENTIABLE FUNCTIONS 237

See Guillemin and Pollack (1974), Hirsch (1976), Chillingworth (1976), or Lang
(1967) for more details and examples of manifolds. The only explicit examples of man-
ifolds that we use are Euclidean spaces (a trivial example), tori, and spheres.
Given the definition of a differentiable manifold, we can define a differentiable map
between two manifolds.
Definition. Let M and N be two c r manifolds for some r ~ 1. Assume I : M - 1 N is
a continuous map. We say that I is r time. rontinuousl1l differentiable, or cr, provided
for each point P E M and coordinate charts tpo : Va - 1 Uo c M and tpp : Vp - 1 Up eN
at P and I(p), respectively (i.e., p E Uo and I(p) E Up), tppl 0 I 0 tpo is differentiable
at tp;;l(p). Note that iftpplol0tpo is C r at tp;;l(p) for one pair of coordinate charts,
and tpo' : Vo' -+ Ua' C M and tpp' : Vp' - 1 Up, eN is another pair of coordinate charts
at p and I(p) then tpp.1 0 I 0 tpa' is cr at tp;;}(p) because both tpo,a' and tpfJ,fJ' are
cr. The set of all c r maps from M to N is denoted by cr(M,N). A C r maps from
M to AI is a diffeomorphism provided it is one to one, onto, and the derivative at each
point (in local coordinates) is nonsingular. The set of all C r diffeomorphisms on M is
denoted by DiW(M).

7.1.1 Topology on Space of Differentiable Functions


In this chapter and the next we consider the structural stability of dilfeomorphisms
or differential equations on a compact manifold. The definition of structural stability
uses the notion that two functions are close in the C l or cr topology. In this subsection,
we give these definitions which are used later.
Definition. The definition of the cr distance between functions is easier on the torus,
so we start with this case. If I : 1'" - 1 1"', then there is a lift F : Rn - 1 Rn such that
(i) 11' 0 F = 1011' where 11' is the projection 11' : Rn - 1 T" defined in Example 1.2, and
(Ii) F(x + j) = F(x) + j for all x E Rn and j E Z". If I, 9 : 1'" - 1 1'" are two c r maps
with lifts F, G : R" -+ Rn then the c r distance from I to 9 is defined to be

dr(f,g) = sup{d(foll'(x),goll'(x)), IIDiFx - DiGxll :


x = (X., ••• ,xn ) E R n satisfies 0 ~ Xj ~ 1
for 1 ~ j ~ n, 1 ~ i ~ r}.

In this definition, d is the distance between points on 1"'.


The next easiest case to treat is functions from a compact manifold M to a Euclidean
space RN. Let
{tpj : Vi eRn - 1 Uj c M}f=l
be a finite number of coordinate charts with uf-I Uj = M. Let Cj C Uj be compact
subsets with Uf-l C j = M. If I,g : M - 1 RN are two cr functions then the cr distance
from I to 9 is defined to be

dr(f,g) = sup{l/(x) - g(x)I, IIDi(f otpj)OPj'(x) - Di(gotpj)OPj'(x) II :


x E Cj ' 1 ~ j ~ J, and 1 ~ i ~ r}.

The case of maps between two manifolds is slightly more complicated. Rather than
define a distance between two functions, we define a base of neighborhoods for the
topology. Assume M and N are c r compact manifolds. Let {tpj : Vi C R" - 1 Uj C
M}f=1 be a finite number of coordinate charts on M, and C j C Uj be compact subsets
238 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

as above. Let {t/Jk : Vk C Rn -+ U;., c N}:=l be a finite number of coordinate charts on


N. Let f E Cr(M, N). Each Cj can be broken into finite number of compact subpieces
Cj = U;~l C j •t , and an index kU,t) chosen so that f(Cj,t) c U;"(j.t) for 1 ~ t ~ L j and
1 ~ j S; J. For! > 0, let N be the neighborhood of f given by

N = {g EC'(M, N) : g(Cj,t) C U;"(j,l)'


It/Jk(~,l) 0 f(x) - t/Jk(~.t) 0 g(x)1 < t,
IIDj(t/Jk(~,t) 0 f 0 "'A'j'(x) - Dj(t/Jk(~,l) 0 9 0 "'j)<pj'(x)1I <t
for all x E Cj,t, 1 S; t S; L j , 1 S; j S; J, 1 ~ i ~ r}.

The CO estimate It/Jk(~,t) 0 f(x) - t/Jk(},t) 0 g(x)1 < ! can be replaced by the estimate
d( f (x), g( x» < l using distances between points on the manifold. The set of these base
neighborhoods N generate the topology on cr(M, N).
REMARK 1.4. Let M and N be manifolds with M compact. The manifold N can be
embedded into some (large dimensional) Euclidean space JRL. Using this embedding, it
is possible to consider cr(M,N) c cr(M,IRL), i.e., f E cr(M,N) if f E cr(M,JRL)
and f takes its values in the subset N C JRL. Using this, the topology on cr(M, N) can
be inherited from Cr(M, JRL). Franks (1979) uses this approach.
Definition. A C r diffeomorphism f : M -+ M is called C r structurally stable provided
there is a neighborhood N in Cr(M. M) such that every 9 EN is topologically conjugate
to f. A C· structurally stable diffeomorphism f is also called just structurally stable.
Definition. For flows ",I, t/J t : M -+ M, the C 1 topology measures their derivatives as
functions from [O,lJ x M to M. A C 1 flow ",I : M -+ M is called structurally stable
provided there is a neighborhood N of ",I in the C 1 topology such that any flow t/JI E N
is flow equivalent (topologically equivalent) to ",I.
We give the definition of vector fields and the topology on the set of vector fields in
the next subsection.
For more detail on the C r topology, including the noncompact case, see Hirsch (1976).

7.1.2 Tangent Space


We are interested in invariant sets which are more complicated than a single fixed
point, The first example is a horseshoe. In Chapter II we saw that the quadratic map
had an invariant Cantor set. In two dimensions, the horseshoe is homeomorphic to
the product of two Cantor sets. It has some directions which are expanding and some
which are contracting. The directions which are expanding and contracting can vary
from point to point. These directions of expansion and contraction at a point p can be
thought of as infinitesimal displacements at p and are called tangent vectors. A tangent
vector at p is also thought of as the derivative of a curve through p: if"> : (-6,6) -+ M
is a differentiable curve with ')'(0) = p, then v = ')"(0) is a tangent 'it p. In order
to organize these ideas, we need to distinguish vectors at different /lOJ1110, The following
definition makes thf>Se ideas precise even on a manifold.
Definition. We start t '\ giving the definition of tangent vectors for Euclidean space.
Fix a point pER". A tangent vector at p is a pair (p, v) where v E an. This pair
(p, v) is often written vp. The set of all possible tangent vectors at p is denoted by
TpRn is called the tangent space at p. The tangent space at p is a vector space where
7.1.2 TANGENT SPACE 239

(p, v) + (p, w) = (p, V + w). The disjoint union ofthe tangent vectors at different points
is called the tangent bundle or the tangent space of Rn and is denoted by TRn:

TR n = {(p, v) : p E Rn and v is a tangent vector at p}


= Rn x Rn.
Thus the tangent space of a Euclidean space Rn is isomorphic to the cross product of
Rn with itself: the first copy of Rn is the set of the base points at which the vector is
situated and the second copy is the set of vectors at a given point, or (p, v) is a vector
at p in the direction of v.
For a manifold M, we need to specify what we mean by a tangent vector at a point
p .. Assume "( : ( -6, 6) c R -+ M is a Cl curve with "(0) = p. Assume <{)o : Vo -+ Uo is
a coordinate chart at p. By the above definition of differentiability, <{);;l o"(t) is Cl. In
the coordinate chart, the tangent vector determined by"( is given by (<{);;l o "()'(O) = v~.
If <{)fJ : VfJ -+ UfJ is another coordinate chart at p, then the tangent vector determined
by"( in this coordinate chart is given by (<{)/il ° "()'(O) = v~. Note that

v~ = D(<{)/il 0 <{)O)pDV~
where pO = <{);;l(p). These two vector, v~ and v~, should be considered as represen-
tatives of the same vector in different coordinate charts because they are the derivative
of coordinate representatives of the same curve. The derivative of the curoe on a mani-
fold (or tangent vector to a curoe) is the equivalence class of representatives in different
coordinate charts, where v~ '" v~ provided v~ = D(<{)pl 0 <{)o)pov~. A tangent vector
at p is a derivative of a differentiable curve through p. The set of all tangent vectors
at p is written TpM,

TpM = {vp : vp is a derivative of a differentiable curve through pl.

For the point p fixed, the set of vectors at p, TpM, forms a vector space (using the
addition in anyone of the coordinate charts at p). The disjoint union of the tangent
vectors at different points gives the tangent bundle or the tangent space of M which is
denoted by T M:

TM = {(p, v) : p EM and v is a tangent vector at p}

= U{p} x TpM.
pEM

If M is a manifold and S is a subset of M, we denote the tangent vectors to M at


points of S by
TsM = U
{p} x TpM.
pES

REMARK 1.5. If M c Rn+/c is a cr manifold, then the tangent space of M at p can be


thought of as a subspace of the tangent vectors of Rn+/c at p, TpM C TpRR+Ir.
Definition. With this idea of tangent vectors, if f : M -+ N is a Cl map between
manifolds, we consider the derivative of I at p to be a linear map from TpM to Tf(p)N,
DIp: TpM -+ Tf(p)N. If <{)o : Vo -+ Uo c M and <{)fJ : VII -+ Up c N are coordinate
charts at p and f(p), respectively, then

D(<{)ijl ° J 0 <{)O)pDV~ = v1(p)


240 VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS

takes the representative of a vector at p in the coordinate chart (!Po, Vo , Uo ) into the
representative of a vector at f(p) in the coordinate chart (!pp, Vp, Up).
In fact, when we used cones of vectors to prove the stable manifold theorem, we
used this idea of the derivative acting on tangent vectors at points implicitly in our
description.
Definition. Below, we define a hyperbolic structure in terms of expanding and con-
tracting directions. To do that, we need to be able to determine the length of vectors.
The length can be determined from an inner product on each tangent space TpM. A
Riemannian metric is an inner product on each tangent space,

which is a symmetric, positive definite bilinear form. Such an inner product can be
obtained by embedding M in some large Euclidean space RN and inheriting the inner
product from RN. For example, the two sphere inherits a Riemannian metric from R3
because it is a subset, f{J C R3.
Once the inner product is known, then

Ivpl p = (vp, v p );!2

defines a Riemannian nonn on each tangent space. We are most often interested in the
norm rather than the inner product.
Definition. A differential equation on a manifold M is given in terms of a vector field,
i.e., a function X : M - TM such that X(x) E TxM for all x EM. The vector field is
cr if the representatives of X are C r in local coordinates. The set of cr vector fields
on M is denoted by xr(M).
Let 1 ~ r < 00. The cr topology on xr(M) is determined by local coordinate charts.
Assume M is compact. Let

{!pj : ~ C Rn - Uj C M}f=,
be a finite number of coordinate charts on M and Cj C Uj be compact. subsets with
U;.l C} = M. If X, Y E xr(M), then the cr distance from X to Y is defined to be

dr(X, Y) = sup{lxj(x) - yi(x)l, IiD'(Xi)<p~'(X) - D'(yj)<p-'(x) Ii :


J J

where Xi, yi : 11; _ Rn are the representatives of X and Y in the local coordinate
chart.
Given such cr a vector field X on M, then
p = X(p)
is a differential equation on M. The flow !p' of the vector field X satisfies
d
ditp'(p) = X 0 !p'(p).

As before, a flow satisfies the group property.


For more details on the ideas of the tangent space, the derivative between tangent
spaces, and Riemannian metrics see Guillemin and Pollack (1974), Hirsch (1976), Chill-
ingworth (1976), or Lang (1967).
7.1.3 HYPERBOLIC INVARIANT SETS 241

7.1.3 Hyperbolic Invariant Sets


We want to make more precise the concept of expanding and contracting directioDl
at different points for a map. Let A be an invariant set for a diffeomorphism f. If a
periodic point p for I is hyperbolic there are subspaces t; and ii. such that (i) TpM
=
is the direct sum of n; and n; (as vector spaces), (ii) TpM II.'; e E;, and (iii) t; is
expanding and Il!t. is contracting under D f;. An invariant set A is said to be hyperbolic
provided that at each point p in A this same type of splitting into subspaces exists and
the subspaces vary continuously as the point p varies. We give the Cormal definition oC a
hyperbolic invariant set in this subsection. We give examples of hyperbolic invariant sets
in the rest of the chapter which should help to make the definition more understandable.
Definition. An invariant set A has a la,per60lic ,t,..cture for a diffeomorphism I on
M provided (i) at each point p in A the tangent space to M splits as the direct sum
ofn; and lFj., TpM = n; en;, (ii) the splitting is invariant under the action of the
= =
derivative map in the sense that D/p(lI.';) Ei(p) and D/p(lFj.) Ej(p)' (iii) t; and
lFj. vary continuously with p, and (iv) there exist 0 < ,\ < 1 and C 2! 1 independent of
p such that for all n 2! 0,

ID.t;v'l :5 c,\nlv'l Cor v' e F;., and


IDI;nv"l :5 c,\nlv"l Cor v" en;.
If an invariant set A has a hyperbolic structure for I, we also say that A is a la,per6olic
invariant set.
REMARK 1.6. Notice that if m is a positive integer such that p =c,\m < I, then
ID.t;'v'l :5 plv'l Cor v' e F;., and
IDI;mv"l:5 plv", for v" e »;,

so D.t;' IIl!t. is a contraction and D.t;' 1m; is an expansion. Thus, the constant C deter-
mines the number of iterates of I which are necessary beCore the vectors get contracted
(respectively, expanded) in the subbundle JI!1, (respectively, F;). Sometimes an invari-
ant set satisfying the above conditions is said to have a uni/orm la,lper60lic ,t,..cture
because the constants ,\ and C, and so the number oC iterates m, are independent oC the
point p. By contrast, a diffeomorphism is sometimes said to be nonuni/orml, larperho/ic
on ,orne lIet provided it has nonzero Liapunov exponents for almost all points on this
set.
REMARK 1.7. We use backward iterates to specify the unstable bundle because any
vector with a nonzero component in the unstable subbundle expands exponentially
under forward iteration. Therefore forward iteration does not characterize the vectors
in n;. However the above conditions characterize these subbundles (i.e., make them
unique).
Just as for a fixed point, it is useful to change the norm so that the cODltant C = 1.
In this case, the new norm ,. I; must vary Crom point to point, i.e., the norm oC a
vector depends on the point which is the base point of the vector. In terms of this
new norm, vectors are stretched or contracted by DIp, and not just by the derivative
oC some high iterate of I, Df;. This type oC norm is called an adapted norm. The
existence of an adapted norm for a hyperbolic invariant set can not be proved using the
Jordan Canonical Form, but must be proved using the averaging method given for a
fixed point. See the appendix oC Smale (1967) by Mather for one of the original proo'"
of the existence of an adapted norm.
242 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Theorem 1.1. Assume A is a hyperbolic invariant set for I with constants 0 < ..\ < 1
and C 2: 1 giving the hyperbolic structure. Let..\' be any number with 0 < ..\ < "\' < 1.
Then there is a continuous change of norm I . I~ for points PEA such that in terms of
this norm

ID/pY'lj(p) :S ..\'lv'l~ for Y' E F;, and


ID/;lyuli_.(p) :S ..\'lyul~ for y U E~.

PROOF. As stated above, we must average the norm and can not use some kind of
Jordan canonical form.
For pEA, any vector Y E TpM can be decomposed into components Y' En; and
y U E JE;, Y = Y' + yU. Then we can let

Thus it is enough to define the norm on each of the 8ubspaces. (The maximum of two
norms on subspaces defines a norm.) Let no be a positive integer 8uch that C(..\/..\,)n o <
I, i.e., c..\n o < (..\,)n o. For y' En;, define

"0-1
IY'I~ = 2: (..\,)-jID(fj)pY'I·
j=O
We show this norm works on n;:

"0- 1
ID/pv'lj(p) = 2: (..\,)-jID(fHl)py'l
j=O
no-2
= ( 2: (..\,)-jID(fHl)pY'O + (..\,)-no+lID(rO)pY'I
j=O
no-l
:S ..\'( 2: (..\,)-j ID(fj)py'l) + p,)-no+lc..\n oIy'l
j=1
no-l
:S"\' 2: (..\,)-jID(fj)py'I,
j=O
since C(,,\/,,\')n o < 1.
In the same way, for y U E JE; define
no-l
Ivul; = 2: (..\')-i ID(ri)pyUI·
;=0
The reader can check that a similar calculation shows that

o
If a set A is a hyperbolic invariant set then it can be proved that there are stable and
unstable manifolds through each point in A. The following theorem states this result
more precisely.
7.1.3 HYPERBOLIC INVARIANT SETS 2-43

Theorem 1.2 (Stable Manifold Theorem for a Hyperbolic Set). LetJ: M -+ M


be a C k diffeomorphism. Let A be a hyperbolic invariant set for J with hyperbolic
constants 0 < A < 1 and C ~ 1. Then there is an f > 0 such that for each pEA
there are two C k embedded disks W:(p, f) and W.U(p, f) which are tangent to E~ and
1E~ respectively. In order to consider these disks as graphs of functions, we IdentIfy a
neIghborhood of each point p with 1E~(E) x E~(f). (On a manifold, this identification can
be realized by either local coordinates or the exponential map.) Using this identification,
W:(p, J) is the graph of a C k function q~ : 1E~(f) -+ 1E~(f) with q~(Op) = Op and
D(q~)o = 0:
W:(p,f) = ((q~(y),y) :yE 1E~(f)}.

Also, the function u~ and its first k derivatives vary continuously as p varies. Similarly
there is a C k function u~ : E~(f) -+ 1E~(f) with u~(Op) = Op and D(q;)o = 0 and with
the function u~ and its first k derivatives varying continuously as p varies such that
W.U(p, f) = {(x, q;(x» :x E E;(f)}.

Moreover, identifying a neighborhood of p with 1E~(f) x 1E~(f) and taking A < A' < I,
for I' > 0 small enough

W:(p,f) = {q E 1E;(f) x 1E~(f) : Ji(q) E IEji(p)(f) x IEj,(p)(f) for j ~ O}


= {q E 1E;(f) X 1E~(f) : Ji(q) E IEji(p)(f) x IEji(p)(f) for j ~ 0 and
d(P(q), /i(p» ~ C(A')id(q, p) for all j ~ OJ.
Similarly,

W;'(p,f) = {q E E;(f) X E~(f) :


rj(q) E Ej-i(p)(f) x Ej-i(p)(f) for j ~ O}
= {q E E;(f) x E~(f) :
rj(q) E IEj-J(p)(f) x IEj-J(p)(E) for j ~ 0 and
d(f-i(q),ri(p»:S C(A')Jd(q,p) for all j ~ a}.

The proof is very similar to that for a hyperbolic fixed point. We delay the proof
until Section 9.2.
After the local stable and unstable manifolds are obtained by the above theorem, the
global manifolds are determined as follows:
W'(p, f) == {q EM: d(fJ(q), P(p» -+ 0 as j -+ co}
= U r"(w:(p, f) and
"2:0
WU(p, f) == {q EM: d(ri(q), rj(p» -+ 0 as j -+ co}
= U !"(W.U(p, f).
"2:0
An important fact about the stable (respectively, unstable) manifold of each point is
that it is an immersed copy of the linear spaces E~ (respectively, E;). This means
that the map u' : E~ -+ M (respectively, qU : E; -+ M) whose image equals W'(p, f)
(respectively, WU(p,f) is one to one but need not have a continuous inverse. The fact
that the stable and unstable manifolds of points are immersed copy of the linear spaces
implies that they can not be circles or cylinders.
244 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Proposition 1.3. Let A be a hyperbolic invariant set for a diffeomorphism I. Then


for each pEA, W'(p, f) is an immersed copy ofE~, and W"(p, f) is an immersed copy
ofE~.

PROOF. By the Stable Manifold Theorem, each W:(p, f) is the embedded image of the
closed disk E~(t) by a map O'~. Because f is a diffeomorphism, !'(W:(p, f)) is thus the
embedded image of the closed disk E~(t) by the map O'~,j = Ii 0 O'~. Thus W'(p, f) is
the union of these sets each of which is an embedded copy of a disk, and so W'(p, f)
is the immersed copy of the linear subspace. The proof for the unstable manifold is
similar. 0
In the following sections, we give examples of hyperbolic invariant sets and determine
their stable and unstable manifolds. These examples should make both the definition
and the meaning of the theorem more understandable.
Hyperbolic Structure for Flows
We end the section by mentioning the changes needed in the definitions for a How or
differential equation.
Definition. An invariant set A for the flow rpt has a hyperbolic structure, or A is a
hyperbolic invariant set, provided (i) at each point p in A the tangent space to M splits
as the direct sum ofE~, E~, and span(X(p)),

TpM = E~ E9E~ E9span(X(p)),

(ii) the splitting is invariant under the action of the derivative in the sense that

D(rpt)pE~ = E~.(p),
D(rpt)pE~ = E~.(p), and
D(rpt)pX(p) = X(rpl(p)).

(iii) E; and E~ vary continuously with p, and (iv) there exist JJ > 0 and C ~ 1 such
that for t ~ 0

IDrp~v'l ~ Ce-l'tlv'l for v' E E~, and


IDrp;tv"l ~ Ce-l'tlv"l for v" E E~.

REMARK 1.8. If A is a hyperbolic invariant set for '1'1 and is a single chain component for
'1", then either A is a single fixed point or A does not contain any fixed points (because
of the assumption that the splitting varies continuously). In Section 7.11, we consider
the Lorenz equations which have an invariant set which contains a fixed point together
with other non fixed points. This set can not have a standard hyperbolic structure. In
that situation, we discuss the possibility of a "generalized hyperbolic structure" .

7.2 Transitivity Theorems


In the following sections, we often want to prove that a map is (topologically) transi-
tive on a set X, i.e., the (forward) orbit of some point p is dense in X. When considering
the horseshoe for the quadratic map on the line, we were able to verify this condition
by means of the conjugacy with the shift map and then constructing an explicit point
with a dense orbit for the (one-sided) shift map. We will also be able to do this in the
7.2 TRANSITIVITY THEOREMS 245

first set of examples below which are horseshoes for a diffeomorphism in two dimensions
using the conjugacy to the two-sided shift map. In other cases, this type of approach
is not possible. The following theorem of Birkhoff' shows that it is enough to show that
the orbit of any open set intersects every other open set. In fact, it follows from this
hypothesis that there are many points (a residual set in the sense of Baire category)
with dense orbits. Remember that a set A is residual in Ii space X if there is a countable
number of dense open sets {Uj}~l in X such that A = n~l Uj . The Baire category
theorem says that any residual subset of Ii complete metric space is dense.
Theorem 2.1 (Birkhoff Transitivity Theorem). Let X be a complete metric space
with countable basis and I : X --+ X a continuous map.
(a) Assume that for every open set U of X, O-(U) = U.. <ol"(u) is dense in X.
Then there is a residual set (large in the sense of Baire category) 'R.+ such that for every
p E 'R.+, the forward orbit ofp, O+(p), is dense in X.
(b) Assume that I is a homeomorphism, and that for every open set U of X, O+(U) =
U.. >o I"(U) is dense in X. Then there is a residual set 'R.- such that for every p E 'R.-,
the-backward orbit ofp, O-(p), is dense in X.
(c) Combining the first two parts, if I is a homeomorphism and every open set U has
both O+(U) and O-(U) dense in X, then there is a residual subset 'R. C X such that
any p E 'R. has both O+(p) and O-(p) dense in X.
PROOF. Let {V; : i E Z} be a countable basis of X. Then

is the intersection of dense open sets and so is residual. Let p E 'R.+. Then p E 0- (V;)
for all i, and so O+(p) n V; '" 0. This proves that O+(p) is dense in X, and so part
(a). In the proof of (b), 1 is assumed to be a homeomorphism so the backward orbit
of a point is well defined. The rest of the proof is similar. Part (c) is a combination of
parts (a) and (b). 0
Ergodicity and Birkhoff Ergodic Theorem
In connection with a few concepts we refer to invariant measures: Liapunov ex-
ponents (Sections 3.6 and 8.2), measure theoretic entropy (Section 8.1), Sinai-Ruelle-
Bowen measure (Section 8.3), and Hausdorff' dimension (Section 8.4). In some of these
situations, we refer to a system being ergodic. Because the concept of ergodicity is mea-
sure theoretic type of transitivity, and so relates to the Birkhoff' Transitivity Theorem,
we introduce it in this section.
A measure IJ on a space X is called a Borel measure provided it is a measure for the
sigma algebra of Borel sets (generated by open sets). A measurable set A in X is said
to be of full measure in X provided IJ(X 'A) = O. If the measure of X is finite, a set
A is of full measure in X provided IJ(A) = ",(X). For a measure IJ on X, the support 01
the measure, supp(IJ), is the smallest closed set with full measure, or

supp(lJ) = n{C : C is closed, and IJ(X, C) = o}.


A measure IJ is invariant for a map I : X --+ X provided ",(I-I(A» = IJ(A) for all
measurable sets A. If IJ is an invariant measure for /, / is also said to be a mea.rure
preseroing trons/ormation lor IJ. The flow of a vector field with zero divergence preserves
Lebesgue measure, so this gives one example of a measure preserving system. (Geodesic
flows are such systems. See Katok and Hasselblatt (1994).)
246 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Finally, a map I : X -+ X is called ergodic with respect to invariant measure 1.1


provided IJ(X \ A) = 0 for any measurable invariant set A for I with IJ(A) > O. Thus
for an ergodic map, all invariant measurable sets either have zero measure or full measure
in X. If I is ergodic with respect to the measure 1.1, we also say that 1.1 is an ergodic
measure lor I.
Exercises 7.9-11 make some connection. between a meaaure preaerving map and
recurrence. In particular Exercise 7.10 states that if I is a measure preserving home-
omorphism for an invariant measure 1.1 then the set {x EX: x E a(x) and x E w(x)}
has full measure. Thus for such a homeomorphism, IJ-all every point is recurrent.
In the discussion in Section 3.6, the Birkhoff Ergodic Theorem is used to show that
if I : R ...... R has an invariant measure J.I. with compact support, then the Liapunov
exponents exists J.I.-almost all points. We state this theorem next but leave the proof to
the reference.
Theorem 2.2 (Birkhoff Ergodic Theorem). Assume I : X ...... X is a measure
preserving transformation for the measure J.I.. Assume 9 : X ...... R is a J.I.-integrable
function. Then limn_co(lln) L;':~ go li(x) converges J.I.-almost everywhere to an
integrable function g". Also g" is I invariant wherever it is defined, i.e., g"of(x) = g"(x)
for J.I.-almost all x. Also, (i) if I'(X) < 00 thenIx g"(x) dJ.l.(x) =Ix g(x) dlJ(x), and (ii)
if J.I. is an ergodic measure for f then g" is a constant J.I.-almost everywhere.
REMARK 2.1. Consider the case when J.I. is an ergodic measure for f. The value
Ix g(x) dJ.l.(x)
is the space average of the function g; the value g"(p) is the time av-
erage of 9 along the orbit of p. Thus for any integrable function (e.g. the characteristic
function of an open set), the time average along almost all orbits equals the space av-
erage of the function. This fact indicates that almost all orbits for an ergodic measure
are dense in the support of the measure.
The following theorem gives a result for ergodic maps which is similar to the Birkhoff
Transitivity Theorem.
Theorem 2.3. Let X be a complete metric space with countable basis. Assume f :
X ...... X is an ergodic map with respect to the measure J.I.. Assume that J.I. is positive on
open sets and J.I.(X) < 00. Then there is a set R of full measure such that the forward
orbit of every point in 'R is dense in X.
PROOF. Let {Y:, : j E Z} be a countable basis of X. For each Vj • O-CVj) is a measur-
able invariant set of positive measure, so it has full measure, i.e., X \ 0- (\Ij) has zero
measure. Therefore

has zero measure and

has full measure. For any p E 'R +, 0+ (p) is dense in X just as in the proof of the
Birkhoff Transitivity Theorem, 0
For details and an introduction to ergodic theory, see Walters (1982). For further
discussion of ergodic theory in the context of smooth dynamical systems see Katok and
Hasselblatt (1994).
7.3.1 SUBSHIFTS FOR NONNEGATIVE MATRICES 247

7.3 Two Sided Shift Spaces



In Chapters II and III, symbolic dynamics is used to specify the forward itinerary for
the orbit of a noninvertible map, and so one sided shifts and subshifts are used. In this
chapter, symbolic dynamics is used to specify both the forward and backward Itinerary
for the orbit of an invertible map, and 80 two aided ahlft spac.. are Introduced. A point
In a two sided shift space Is given by B = ( ... , S-I, SO,SI, ..• ) which includes a symbol
Sj for all j E Z. If Section 3.2 has not be read previously, it should be read at this time.

Definition. Let n be an integer that is greater than one. Let

En={l, ... ,n}Z


= {a = ( ... 8-1, So, SI,"') : Sj E {I, ... , n} for all j E Z}.

We define the distance d on En:

~ .5(Sj, tj)
des, t) = L 41jl .
j=-oo

where
.5(a,b)=
{ o1 ifa=b
ifa¥b.

AB in the case of the one sided shift, this makes En into a complete metric space.
Exercise 7.12 proves that making the factor in the denominator greater than 3 makes
the cylinder sets into open balls in terms of the metric. There is also a shift map (1
defined on En by (1(s) = t where tj = Sj+!' The space En together with the map (1 is
called the full two sided n-shift, or sometimes simply the n-shift.
The shift space is called two sided because the index is both positive and negative. It
is called the full n-shift because all possible sequences are allowed on n symbols. Notice
that this C1 is one to one on this En and so is invertible, while the one sided shift map
is n to one.
As for one sided shifts, we can also define subshifts of finite type. If A = (aij) is
an n x n matrix with each aij E to, I} then EA C En is the subspace of all points
8 = (8j) for which a"'H' = 1 for all j. The shift map restricted to EA is designated by
C1A = C1IE A. The map C1 A on EA Is called a 8ubshift of finite type. As for one sided shifts,
the number of fixed points of C1~ is given by the trace of Ak, # Fix(C1~) = tr(A k ). Also
C1 A has a dense forward orbit in EA if and only if A is irreducible (where irreducible is
defined as before).

7.3.1 Subshifts for Nonnegative Matrices


In this subsection we extend the definition of a subshift of finite type by associating
a shift space with a matrix with nonnegative integer entries. For FI' = 1'X(1 - x) with
JI. > 4, there are two subintervals 11 ,12 C [0,1) such that FI'(I,,)::> It uh Using the
itinerary map for these intervals, we get sequences in E2, i.e., for the transition matrix
(! !). However, the one interval I = [0, 1] crosses itself twice, so we could a.sssocate
the matrix (2). There are several advantages to this extension. First, the matrix which
gives the symbolic dynamics can be made to have smaller size. In the above example
we get the 1 x 1 matrix (2) rather than the 2 x 2 matrix (! !). Second, if A is a
248 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

transition matrix. then we can from the subshift of finite type for A'" for any positive
power. For example. if A = (~ ~) then A:l = (~ ~) also has a subshift of finite
type associated to it. Finally, when we consider Markov partitions for hyperbolic toral
automorphisms later in the chapter, the matrix for the subshift of finite type associated
to a hyperbolic toral automorphism can be taken to be the same as the matrix which
induces the hyperbolic toral automorphism even when it has entries which are greater
than one.
We call A an adjacency matrix provided A = (ajj) is an n x n matrix with entries
aij E N. For example, (2) and (i ~) are adjacency matrices. We call A a transition
matrix provided A = (ajj) is an n x n matrix with entries ajj E {0,1}.
Now we turn to understanding how an adjacency matrix corresponds to a subshift.
The previous subshift constructed for a matrix with entries of O's and 1's we call the
vertex subshift. For an adjacency matrix, we construct an edge subshift. Let S =
{VI, V2 , .•. , Vn } be a set with n elements. We use the adjacency matrix A to from an
oriented graph G A with vertices the elements of S. If the entry ajj > 0, then we draw
alj directed edges from vertex V; to vertex V;. For example, if aij = 2, then there are

two edges from V; to Vj . Figure 3.1 gives the graphs for the matrices (2) and (~ ~).
Thus there are N = Ej. j ajj edges in the graph GA. Notice that the adjacency matrix
tells which vertices are adjacent to each other in the sense that there is a path in the
graph directly from the first vertex to the second. Let e = {E I ,E2 , •..• EN} be the
set of all edges. If A is a {O,l}-matrix, then we get the same graph which we have
considered before with either no edge or one edge from vertex V; to vertex V;.

vl~~
n V
2

FIGURE 3.1. Graphs for the Matrices (2) and (~ ~ ) .


Next, we form the transition matrix T = (tjj) on e. This is an N x N matrix whose
entries are O's or l's. For the edge Ej E e, let b(Ej ) E S be the beginning vertex of
edge E) and e(Ej) E S be the end vertex, i.e., E j is an edge from b(Ej } to e(Ej ). The
entries of T are defined as follows:
1 if e(Ej ) = b(Ej )
{
tjj = 0 if e(Ej ) ' " b(Ej ).

Thus the transition on e from E j to Ej is allowable provided the end of the edge E j is at
the vertex which is the beginning of the edge Ej • Using the transition matrix T, we can
form the vertex subshift ET. This process takes an adjacency matrix A and constructs
a subshift UT : ET --+ ET called the edge subshift lor A. (This induced subshift on N
symbols is often labeled as UA : EA --+ EA but we do not do this. We always label the
subshift by the transition matrix which induces it.)
An allowable sequence 8 E ET can be visualized as a curve in the graph GA. An
allowable curve in the graph G A is a continuous function "y : R --+ G A such that for each
7.4 GEOMETRIC HORSESHOE 249

i, -y(i) is a vertex and -y([i, i + 1]) is an edge from -y(i) to -y(i + 1). Thus an allowable
sequence s E ET can be visualized as the allowable curve -y in G,4 with -y(i) = 8, E E.
Notice for the adjacency matrix A = (2), its transition matrix is T = (~ !). (Ei-
ther of the two edges can follow each other.) Therefore the 1 x 1 matrix (2) corresponds
to the full two-shift. In the same way the adjacency matrix A = (n) corresponds to the
full n-shift.
One case that is interesting is when A is already has all its entries aij E to, I}. Let T
be the transition matrix formed above. (The matrix T is usually a different size than A.)
We leave to Exercise 7.16 to show that vertex subshlft 0',4 : E,4 -+ EA is topologically
conjugate to the edge subshift formed from A, O'T : ET -+ ET. This result shows that
the edge subshift is a generalization of the vertex subshift for matrices with nonnegative
integer entries.
The adjacency matrix A is called irreducible provided that for each pair 1 :S i,i :S n,
there is a positive integer k = k(i,j) such that (A")ij > 0. (This is really the same
definition as for a to, 1}-matrix.) We leave to Exercise 7.15 to verify that if A Is an
irreducible adjacency matrix and T is its induced transition matrix then T is irreducible.
Thus if A Is irreducible, then O'T is topologically transitive.
For further development subshifts of finite type, see Boyle (1993) and Franks (1982).

7.4 Geometric Horseshoe


In this section, we give an example of a diffeomorphism I : S2 -+ S2 (or from R2 to
itself) that has an invariant set which is a Cantor set. This example is closely related to
the map I,,(x) = lJX(l-x) on R for IJ. > 4 discussed in Chapter II. It was introduced by
Smale and is one of the important early examples with complicated (chaotic) dynamics
on an invariant set. It is called the Smale horseshoe or geometric model horseshoe.
It has been at the heart of much of modern Dynamical Systems. See Smale (1965,
1967). In the next subsection, we show how a horseshoe arises for a polynomial map,
the Henon map. The horseshoe also leads to a better understanding of the dynamics of
points near a transverse homoclinic point as we discuss in the second subsection below.
Finally, we show how transverse homoclinic points and so horseshoes can be proven to
exist for small parameter time periodic perturbations of differential equations (Melnikov
method).
Let S = [0, IJ x [0, IJ be the unit square. Let H j for i = 1,2 be two disjoint horizontal
substrips, Hj = {(x,V) : 0 :S x :S 1,y{ :S V :S ~} with 0 :S vI < v~ < vl < y~ :S 1.
Similarly, let V; for i = 1,2 be two disjoint vertical substrips, V; = {(x, y) : x{ :S x :S
~,O:S Y :S I} with 0 :S xl < x~ < x¥ < x~ :S 1. We assume that I is a diffeomorphism
such that I(Hj ) = V; for i = 1,2, S n F-I(S) = HI U H 2 , and for p E HI U H2

D/p=(ao b~)
with lapl = IJ. < 1/2 and Ibpl = A > 2. See Figure 4.1.
This map I can be extended to all of S2 as follows. Let A be the semidisk of radius
~ on the bottom of S, B be the semidisk of radius ~ on the top of S, and N = SuAuB
be the topological disk which is the union of these three region. Let G be the horizontal
gap between HI and H2, H3 be the top horizontal strip In S above H 2 , and Ho be the
bottom horizontal strip in S below HI. We extend I so that I(G) c B and the Image
arcs from the top of VI to the top of V2 . This part of the map can be taken 80 that
I :~ (q)1 = IJ. for q E G. See Figure 4.2. For the extension, the image of Ho u H3 Is
250 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACfORS

H3
H2

G VI V
2
HI

HO

FIGURE 4.1. Horizontal and Vertical Strips in 8

contained in Ai the first coordinate function of f on HouH3 is a contraction by a factor


of 11-; the second coordinate function of f on Ho U H3 changes from an expansion by ,\
on the boundary with HI U H2 to a contraction at the boundary with Au B. We take
f so that f(A) C A. and that f is a contraction on A, so f has a unique fixed point
Po E A. Also./(B) c A. Thus 1 maps the topological disk N into itself. The map can
be extended to all of R2 so all other points enter this topological disk under forward
iteration. Finally. 1 can be extended to S2 so that it takes the point at infinity, POOl to
itself. and Poo is a source for 1 on 8 2 •
This map f restricted to N can be thought of as the composition of two maps. The
first map. L. takes the region and stretches it out to be over twice as tall and less
than half as wide. L(x, y) = (II-X, >.y). The second map 9 takes this longer and thinner
rectangle and bends it in the middle and puts it down so it crosses 8 twice. The map
9 must change the image near the ends so that 1 = 9 0 L is a contraction on A and
I(B) C A. The image I(N) is drawn in Figure 4.2 together with the region N itself.
Since I(N) C N, it follows that j2(N) C f(N) c N. Notice that L 0 f(N) is a longer
and thinner version of I(N) with two vertical strips. Then j2(N) = goLof(N) is bent
again in the middle and has its image inside f(N). Thus the part of the image in 8,
f2(N) n 8, is four vertical strips of width 11-2. See Figure 4.2.
Next, we consider the chain recurrent set of f. If q E Au B then w(q) = {po}, so
'R.(f) n (A U B) = {Po}. If q E 8 2 \ N then o(q) = {Poo}, so n(f) n (82 \ N) = {Pool.
Finally, if q E 8 n n(f) then Ii (q) must be in 8 for all j E Z (or else it wanders forward
to Po or backward to Pool. Combining,

'R.(f) c A U {Po} U {Pool


where
11.= n
iEZ
fi(8).

Below we prove that fill. is conjugate to the shift map on a symbol space. As a corollary
of this result, A C 'R.(f) so 'R.(f) = A U {Po} U {Pool.
Now we turn to the analysis of A. We define sets in a manner similar to the con-
7.4 GEOMETRIC HORSESHOE 251

FIGURE 4.2. First and Second Image of the Neighborhood N for the Ge0-
metric Horseshoe

struction of the Cantor set for the quadratic map on the line:

S;:' = n"
j-m
/)(S).

By the description above, SJ is the union of the two vertical strips VI and V2 of width
1'. AJ5 in the one dimensional case,

So = /(S;;-I) n S
Vd U [/(SO-I) n V2 ]
= [/(SO-I) n
= 1(80- n H.) U 1(~-1 n H2).
1

In particular for n = 2,

sg = [!(SJ) n Vd u [/(SJ) n V2J


= /([V1 u V2J n H.) u /([V1 U V2) n H 2 ).

Then for k = 1 or 2, /(SJ) n VA: = /((Vl U V2 ) n HA:) is the union of 2 vertical strips of
width 1'2. Therefore, sfi is the union of 22 vertical strips of width 1'2. By induction

is the union of 2" vertical strips of width 1'''. Taking the infinite intersection,

sgo = n00

".0
~ = C1 X [0,1]

is a Cantor set of vertical line segments. If q e 88" , then q e f1 (S) and /- i (q) e S for
all j ~ O. Thus S8" is the set of points whose backward iterates stay in S.
252 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Considering the sets ~m' ~I = HI U H2 is the union of two horizontal strips of


height ,\-1. Then, ~2 is the union of four horizontal strips of height ,\ -2. Continuing
by induction, S~m is the union of 2m horizontal strips of height ,\ -m, and

~oo = n
00

msO
~m = [O,lJ X C2

is a Cantor set of horizontal line segments. If q E S~oo' then for all j ~ 0, q E f- j (8)
and f1 (q) E 8. Thus ~oo is the set of points whose forward iterates stay in 8.
Intersecting these two sets, we get that

A = S~oo
=sO'n~oo
= C I x C2

is the product of two Cantor sets, and A is the set of points such that both the forward
and backward iterates stay in 8. We want to show that A has the three properties of
a Cantor set in the line: A is perfect and its connected components are points. It is
perfect because both the sets Cj are perfect. From the fact that the set S~n is the union
of 22n rectangles of dimensions p'n by ,\ -n, it follows that the connected components of
A are points. Thus A has the three properties of a Cantor set in the line. There is in
fact a theorem which says that such a set is homeomorphic to a Cantor set in the line.
(See Hocking and Young (1961), page 97.)
We now turn to considering the stable and unstable manifolds of points. We first
consider the set
n S~m =
00

~oo = [O,lJ X C2 ·
m=O

A point q is in S~oo if and only if q E J3(8) for all j ~ 0, if and only if f1(q) E 8 for
all j ~ o. Thus such a point q is in the stable manifold of the whole set A. In fact if
q E [O,lJxC2 and p E C I xC2 = A have the same y coordinate, then 1f1(q)-J3(p)1 ~ p.j
and
q E compp(W'(p) n 8).
Thus these horizontal line segments are the local stable manifolds of points in A. Simi-
larly for pEA, compp(WU(p) n 8) is the vertical line segment through p.
The union of the local unstable manifolds for all points in A, W/~c(A), is thus the set
we labeled SO'. The global unstable manifold of A, WU(A), is the forward orbit of So.
Because the forward images of the larger neighborhood N are nested, J3(N) c f"(N)
for j > k. it is not hard to see that WU(A) is the intersection of these images,
00

WU(A) = U P(SO')
j=O

= n
00

j=O
fj(N).

This set winds around and accumulates on itself. In fact, it is a continuum which can not
be written as the union of two proper subcontinua. For this reason it is an example of an
7.4 GEOMETRIC HORSESHOE

indecomposable continuum. This particular example is called the Knaster continuum.


See Barge (1986) for more discussion of this property.
To introduce the symbolic dynamics of the map 1 on A, we need to consider the two
sided shift on two symbols because 1 involves both forward and backward iterates to
determine A. The next theorem proves that IIA is topologically conjugate to (1 on 1:2.
Theorem 4.1. Let 1:2 be the full two sided two-shift space with shift map (1. Define
h : A -+ 1:2 by h(q) = s where P(q) E H' j for all j. Then h is a topological conjugacy
from IIA to (1 on 1: 2 ,
PROOF. Let h(q) =s and h(f(q» = t. Then p+l(q) E H'j+, but also li+l(q) =
po/(q) E HI,. Therefore 8i+l = tj, and (1(s) = t or (1(h(q» = h(f(q)). This proves
the first property of a conjugacy.
Next we show that h is continuous. Let h(q) = s. A neighborhood of 8 is given by

N = {t : tj = 8j for - no ~ j ~ no}.

With no fixed, the continuity of 1 insures that there is a Ii > 0 such that for pEA and
Ip - ql ~ Ii, Ij(p) E H., for -no ~ j ~ no. Thus, if t = h(p) and Ip - ql ~ Ii then
tEN. This proves the continuity of h.
To show that h is one to one, assume that h(p) h(q)= =
8. Then for all j, both
rj(p) and rj(q) are in H._ i , so p,q E P(H._,). Letting j run from 1 to Infinity,
we see that p,q E n~l Ji(H._,} so they are in the same vertical line segment. Letting
j run from minus infinity to 0, we see that p, q E n~_-oo Ii (H._,), so they are in the
same horizontal line segment. Using all j from minus infinity to infinity we see that
p = q. This proves h is one to one.
Finally, we check that h is onto. We apply induction on n to show that n;=l Ii (H._ j )
is a vertical strip of width I-'n for all strings of symbols s E 1:2. Let 8 E 1:2. For n = I,
this set is just I(HL . ) = V._. which is a vertical strip of width 1-'. Then

nn

j=l
Ij(H._,) = f(n
i=2
n
li- 1 (H._,» n I(H._.)

is a strip of width I-'R since n;=2Ii - 1 (H._ i ) is a strip of width IJn-l. Letting n go to
infinity,
n li(H._
00

i )
j=l
is a vertical line segment. Similarly, n~_-oo P (H._ j ) is a horizontal line segment, and
n~-oo P(H L , ) is a (single) point, q. In particular, the intersection is nonempty. For
this q, h(q) = s, and h is onto. This completes the proof that h is a conjugacy. 0
It is possible to have other horseshoes which are conjugate to subshifts of finite type.
Example 4.1. Let the region N in the plane be made up of three disks A, B, and C
and two strips 8 1 and 82 connecting them as in Figure 4.3. The map 1 takes N inside
itself. The disks A, B, and C are permuted with I(A) C B, I(B) c C and I(C) c A.
The map on these sets is a contraction and there is a unique attracting orbit of period
three, {Pl,P2,P3} with Pi E A, P2 E B, and Pa E C. The strip 8 1 is stretched across
81t B, and 82 with a contraction in the vertical direction as indicated. Finally 8 2 is
stretched across 8 1 . This map can be extended to S2 so it has a fixed point source
254 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

at infinity, Poo. As in the geometric horseshoe above it can be shown that the chain
recurrent set is given by

where
n
00

11.= li(Sl U S2).


j=-oo

The map I can be taken so that it has a hyperbolic structure on A with one expanding
direction and one contracting direction. Because of the manner in which I maps the
strips. 1111. is not conjugate to the the full two shift, but is conjugate to the two sided
subshift of finite type E8 for the transition matrix

There are restrictions on what combinations of periodic orbits and subshifts of finite
type which can be realized on a manifold such as the two sphere. For more details see
Franks (1982).

FIGURE 4.3. Horseshoe for a Subshift of Finite Type


7.4.1 HORSESHOE FOR THE HENON MAP 255

7.4.1 Horseshoe for the Henon Map


Let FAB : ]R2 -+ ]R2 be given by

FAB(X,y) = (A-By-x 2,x) so


-1 A - x - y2
FAB(x, y) = (y, B ).

We will often write F for FAB in this section. This map is called the Henon map. It
was written down by Henon to realize the Smale horseshoe for a specific function which
could be iterated on the computer, Henon (1976). He also observed what appeared to
be an attractor. We return to this aspect of the map in Section 7.10. Notice that

-2X
det (DFAB)(x,lI) = det ( 1
-B)
0 = B

is the amount that F changes area. If B > 0 then F preserves orientation, and if B < 0
then it reverses orientation.
The usual paramf'tf'rs discussed are A = 1.4 and B = -0.3 for which computer itera-
tion indicates there is a "strange attractor". Notice that for these parameter values FAB
decreases area and reverses orientation. It is still unproven that there is a "transitive"
attractor for these parameter values. We will discuss this more fully when we come to
examples of attractors.
The following theorem is the main result of the section. It proves that FAB has a
horseshoe for larger A, namely A = 5 and B = ±0.3 or, more generally, A ~ (5 +
2v'5)(1 + IBI)2/4. Note that for B = 0, FAO is conjugate to the quadratic map F,,(x) =
Jlx(1 - x). Therefore the following theorem is analogous to Theorem 11.4.1.

Theorem 4.2. Let B '" 0 and A ~ (5 + 2J5)(1 + IBI)2/4. In particular, B = ±0.3


and A = 5 are allowed. Let R = HI
+ IBI + [(1 + IBI)2 + 4Aj1/2} and let the square

S = {(x,y) E]R2 : Ixl $ Rand Iyl $ R}.

Let A = n~-<x> Fh(S). Then, (a) all points which are non wandering are contained
inside S, (b) FAB has a hyperbolic structure on A, (c) A is a Cantor set in the plane,
and (d) FABIA is topologically conjugate to the two sided shift on two symbols. Thus
for these parameter values, the nonwanderiag set of FAB is a horseshoe.

REMARK 4.1. The proof given below is based on Devaney and Nitecki (1979).

PROOF. To make the calculations somewhat easier, we take A = 5 and B = 0.3. Most
of the calculations are very similar with B = -0.3. We also take S = {(x, y) : Ixl $
3,lyl $ 3} which is a slight enlargement of the size given in the general definition in the
statement of the theorem.
A direct analysis of the effect of the map F on points in different regions outside S
shows that for p ~ S, P is wandering with IFj(p)1 going to infinity as j goes to either
00 or -00. See Devaney and Nltecki (1979).
To find the image of S by F we look at the image of the corners, the middle vertical
256 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

line, and horizontal lines:

F(±3, 3) = (5 -~: - 9) = ( -;39),


F(±3 -3)- (5+0.9-9) _ (-3.1)
, - ±3 - ±3 '

F(O ,II ) -- (5 - 00.3 11 ) ' and

F( Z,II0 )_(5-0.31/O-
- z
z 2) .
These images show that the four corners go to points to the left of the box S. Since
5 - 0.311 > 3 for -3 ~ II ~ 3, the middle vertical line is mapped to the right of the box.
Finally, each horizontal line goes to a parabola. See Figure 4.4.
F(c) F(d) b

F(O .y)

F(b) F(.) •

FIGURE 4.4. Image of the Square by the Henon Map

Thus F(S) n S has two horizontal strips. Similarly, F-l(S) n S has two vertical
strips, and F-l(S) n S n F(S) has four components. By induction, Fj(S) has ni=-n
4n components. With this much information we can define a semi-conjugacy h : A -+
{I, 2}Z which is onto.
The next step is to show that A has a hyperbolic structure so it is possible to prove
that the components of A are points. This will enable us to conclude that A is a Cantor
set and that the semi-conjugacy h is one to one and so is a conjugacy.
To prove that the hyperbolic structure exists, we define cones (or sectors) at each
point that are mapped into each other. This follows the ideas that we used for the proof
of the stable manifold theorem. Also see the discussion in Moser (1973).
We define the cones CU(p) and C'(p) by
CU(p) = He, '1) E TpJR 2 : I'll ~ A-llell,
C'(p) = He, '1) E TpJR 2 : I'll ~ AleIl·
For ease of estimations. we use the norm which measures the larger component of a
vector: I(e. '1)1. = max{lel.I'IIl· We find A > 1 that make the cones invariant. (For
= =
A 5 and B ±0.3. A can be taken to be 1.7.)
Lemma 4.3. There is a A> 1 (which can be taken to be 1.7 (or A 5 and B ±0.3) = =
for which the following two statements are true.
(a) ForalJp E SnF-l(S) and v E CU(p), DFpv e CU(F(p» andlDFpvl. ~ Alvl •.
(b) For alJ pES n F(S) and v e C'(p), DF;lv E C'(F-l(p» and IDF;lvl. ~
Alvl.·
PROOF. In the proof. we need some estimates which we prove first in a sublemma.
7.4.1 HORSESHOE FOR THE HENON MAP 267

Ixl > 1 and 21xl -IBI ~ 1.7 = A> 1.


Sublemma 4.4. (a) 1£ (x. y), F(x, y) E S then
(b) If(x,y),F-I(x,y) E S thenIyl > 1.
PROOF. Let (XI, yd = F(x, y). If Iyl $ 3 and Ixd $ 3 then 5 - IBly - x' $ 3, or
x 2 ~ 2 - IBI(3) = 1.1. Thus Ixl > 1. The second estimate follows from the first:
21xl -IBI ~ 2 - 0.3 = 1.7.
Let (x-Ioy-d = F-I(X,y). Then (x,y) = F(x_I,y_I), so Iyl = Ix-d > 1 by part
W. 0
Now take p= (x,y) with p,F(p) E S and (~) E C"(p). Because lei ~ AI'll> I'll,
we have that I(e, '1)1. = lei· The image of the vector by the derivative is given by

Then 1'111 = lei and led = 1- 2xe - B'II ~ 12xllel-IBII'I1 ~ (2Ixl-IBI)lel ~ Aiel ~ AI'III·
Therefore (~:) E C"(F(p)), and

I(e .. '1dl. = lell ~ Aiel = Al(e, '1)1.·


This proves the first part of the lemma.
For the second part, take p = (x, y) with p, F-I(p) E Sand (~) E C·(p). Because
lei < Aiel $ I'll, we have that I(e, '1)1. = I'll· The image of the vector by the derivative
of the inverse is given by

Then 1'1-11 ~ (l2yl-1)IBI- 1 1'11 ~ (2 -1)IBI- I I'I1 ~ AI'll = Ale-II. Therefore

e-I)
( 'I-I E C'(F-I(p)),

and

This completes the proof of the lemma. o


Now to consider the hyperbolic structure on A, the lemma shows that for each pEA,

n DF~_;(p)c"'(rj(p))
n

j=O

is a nested set of cones at p so the infinite intersection is a nonempty cone:

n DF~_j(p)C"(rj(p)) :/= 0.
00

j=O

Similar statements are true for the stable cones:


-00

n DFt_'(pp'(rj(p)) '" 0.
j=O
258 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Since Lemma 4.3 proves the expansion and contraction in these two sets of intersec-
tions, to complete the proof that the set is hyperbolic, all we need to show is that this
intersection is a line for each pEA.
In fact the maximal angle between two vectors in the intersection
n
n
j-O
DFt_'(p)C"(ri(p»

goes to zero as n goes to infinity as the following calculation shows. Take

with ~ =~' = I, 17]1,17]'1:::; A-I, and q = F-j(p) for some j > O. Then

and

~I = -2x~ - B7]
= -2x - B7]
~; = -2x~' - B7]'
= -2x - B7]'.

Thus

17]1
~I -
7]11 1 1
~1 = -2x - B7] - -2x - B7]'
1 1

= 1 B( 7] - 7]') 1
(-2x - B7])( -2x - B7]')
< IBII7] - 7]'1
- A2
< IBI I7] 7]'1
-~ ~-f

since I - 2x - B7]I, I - 2x - B7]'1 ~ A and I~I = I~'I = 1. Therefore the angle between
two vectors is contracted by IBIA -2 and so the intersection of the cones goes to a single
line in the tangent space at each point p:

n DF~_i(pPU(F-j(p»
00

= lE~.
j-O

Notice that the line EU(p) can depend on p even though the original cones did not,
because its definition involves the derivative of F along the backward orbit of p. Simi-
larly,
-00

n DF~_i(pp'(rj(p»
j=O
= lE~

is a line in the tangent space at p. This completes the proof of the hyperbolic structure
on A, or part (b) of the theorem.
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT 259

For pEA, W'(p) has to start inside the cone C'(p) + {pl. In fact, for q E W'(p)
we have TqW'(p) C C'(q). Let HI and H2 be the two (horizontal) components of
S n F(S). Then by using the estimates of Lemma 4.3. we see that for each iterate by
F. comp p (W'(p) nHi ) is shrunk by a factor of A-I. In this way we see that

~Bf length {compp [W'(p) nnJ=-n


n
Fi(S)] } ~ 6,\-n

which goes to zero as n goes to infinity. Similarly

max length {comp p [W"(p) n


~A .
)=-n
n
n
Fi(S)] } ~ 6,\-n

which also goes to zero. Therefore for q E A. compq(nj=_n Fi(S» have diameters
which go to zero and so converge to the single point {q}. Therefore the connected
components of A are points. This shows that A is a Cantor set. Also, by arguments
as before. we can prove that the semiconjugacy of FIA to E2 is one to one. and so is a
topological conjugacy. This completes the proof of the theorem. 0
REMARK 4.2. Fix B. For A < -(B + 1)2/4. FAB has (no periodic points and) empty
nonwandering set. For A> (5 + 2v'5)(1 + IBI)2/4. FAB has a horseshoe. Therefore as
A varies. FAB forms a horseshoe. There are many bifurcations which take place as this
horseshoe is formed. Many people have studied this process but it is not yet completely
understood. See Newhouse (1979). Mallet-Paret and Yorke (1982). Robinson (1983),
Holmes and Whitley (1984). Holmes (1984). Yorke and Alligood (1985). Easton (1986.
1991). and Patterson and Robinson (1988).

7.4.2 Horseshoe from a Homoclinic Point


In the last two sections. we have analyzed the geometric model horseshoe and have
shown how it arises in the Henon map for certain parameter values. In this section. we
show how it arises from a transverse intersection of the stable and unstable manifolds
of a periodic point. Such an intersection is called a homoclinic point.
Definition. Let p be a hyperbolic periodic point of period n for a diffeomorphism J.
Let
n-I
W<7(O(p» = U W<7(Ji(p)) and

W<7(O(p)) = W<7(O(p» \ O(p)

for q = s, u. A point q E W'(O(p» n W"(O(p» is called a homoclinic point Jor p. A


point q is called a tmnsverse homoclinic point provided the manifolds W'(O(p)) and
W"(O(p» have a nonempty transverse intersection at q.

The following theorem proves that the existence of a transverse homoclinic point
implies the existence of a Smale horseshoe. Figure 4.7 indicates why the invariant set
of part (a) is called a horseshoe. By analogy. we also call any invariant set like the one
constructed in part (b) a horseshoe.
260 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Theorem 4.5. Suppose that q is a transverse homoclinic point for a hyperbolic periodic
point p for a diffeomorphism f.
(a) For each neighborhood U of {p, q}, there is a positive integer n such that r has a
hyperbolic invariant set A C U with p, q E A and on which r is topologically conjugate
to the two sided shift map on two symbols, tT on E 2 . Thus A C c1(Per(f» C n(f) and
q E c1(Per(f».
(b) Let V be a neighborhood ofO(p) U O(q). Then there is a smaller neighborhood
8 = U?=, =
8 i ofO(p) U O(q) with n ~ 2, 8 C V, such that As niEzJi (8) C V is a
hyperbolic invariant set for f and flAs is topologically conjugate to the shift map tT A
on a transitive two sided subshift of finite type on n symbols, EA C En. The conjugacy
h : As -+ EA is the itinerary function given by hex) = 8 where Ii (x) E H,; for all j.
Notice that q E O(p) U O( q) C As C c1(Per(f» C n(f), 80 q E c1(Per(f». If p is a
fixed point then the transition matrix A = (aiJ) for the subshift of finite type is given
by

for j = 1,2
for j '" 1,2
for j = i + 1, 2 ~ i <n
for j '" i + 1, 2 ~ i <n
for j =1
forj",l.

See Remark 4.8 for the transition matrix when p is not a fixed point.
REMARK 4.3. The theorem is most often stated as in part (a). See Smale (1965,
1967). There are many other references including Guckenheimer and Holmes (1983),
pages 252-3; Newhouse (1980), pages 13-24; Moser (1973); and Wiggins (1990), pages
470-483.
REMARK 4.4. Given a set 8, As =niEZ Ii (8) is the maximal invariant set in 8, i.e.,
the largest invariant set contained in 8. (See Section 9.3 for further discussion of the
maximal invariant set.) In part (b), the set 8 needs to be chosen carefully 80 that this
maximal invariant set is conjugate to a relatively simple subshift of finite type.
REMARK 4.5. One way to think of the subshift of finite type in part (b) is in terms
of l-chains. Consider the case when p is a fixed point. For big enough N, and N2,
A~ = =
{p,f-N'(q), I-N,+l(q), ... ,fN'(q)} is a periodic l-chain. Let n 2 + N, + N2 ,
q, = p, and Qj = f-N,-Hi(q) for 2 ~ j ~ n. The subshift of finite type given in
the theorem corresponds to allowing the following transitions: (i) from q, =
p to q, or
q2 =r =
N• (q), (ii) from Qj to Qj+' for 2 $ j < n, and (iii) from qn IN. (q) to q, p.=
The space EA C En for this subshift is a Cantor set with dense periodic points. (The
proof is the same as for the section on one sided subshifts since one of the n symbols
has more than one option.)
The fact that any sequence of symbols in EA can be realized by an actual orbit follows
from shadowing. We prove below that the set Aq = O(p) U O( q) has a hyperbolic
structure. By the Shadowing Theorem IX.3.1, any l-chain in A' C Aq can be shadowed
by an orbit of I. The sets Hi are taken to be disjoint and neig~borhoods of the points
qi E A~; the orbit which shadows the l-chain {xi} C A~ can be taken with Ii (x) in the
same H,; as Xj. In particular, the Shadowing Theorem proves that the homoclinic point
q is in the closure of the periodic points. However, we want to prove that the maximal
invariant set As in this neighborhood 8 of Aq i8 conjugate to a 8ubshift of finite type
EA. (This is Smale'. main contribution.) If we prove the theorem u8ing the Shadowing
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT 261

Theorem, we would need to prove that the set of all these orbits which shadow the
f-chains in Aq is the maximal invariant set in B. This last step is not obvious from the
statement of the Shadowing Theorem and involves shadowing all f-chains in A~ at once.
For this reason, we give a proof below which is independent of the Shadowing Theorem
and duplicates some of its proof.
REMARK 4.6. Notice that in addition to the transverse homoclinic point q, the theorem
allows that f can have nontransverse homoclinic points for p at points other than q.
PROOF. First, we give an outline of the proof. Let Aq = O(p)UO(q). This invariant set
is closed because o(q) = w(q) = O(p). The first step is to prove that Aq has a hyperbolic
structure. The second step is to prove that if V is a small enough neighborhood of Aq
then the maximal invariant set in V, Av = niEZ fi(V), also has a hyperbolic structure.
For the third step, the proofs of both parts (a) and (b) use boxes contained in the
ambient space. These hoxes are similar to those used for the geometric horseshoe and
the Henon map. For part (a), we find two disjoint boxes VI and V2 contained in V and
an n > 0 such that each of the images r(V;) crosses both VI and V2 • It follows that r
restricted to A = n:=-oo fmn(V1 U V2 ) is conjugate to the full two sided subshift on
two symbols. For part (b), the construction uses disjoint closed boxes Bi C V such that
the union B = U Bi is a neighborhood of Aq • This neighborhood B is contained inside
n
V so A8 = mEZ fm(B) C Av has a hyperbolic structure. One difference between this
case and part (a) (or the geometric horseshoe) is that the image of each f(Bi) does not
intersect all the other B j • However, each nonempty intersection f(Bi) n B j completely
crosses B j , so A8 can be shown to be conjugate to a subshift of finite type.
Throughout much of the proof we assume p is a fixed point. (The notation and
labeling becomes a little simpler in this case). The reader can make the necessary
changes if it is not fixed. We start the proof as if we were proving part (b), although
the first part of the proofs is the same for both parts. We indicate where the bifurcation
in the proof of the two parts takes place and the reader can then choose which part to
read first.
As indicated in the outline, the first step is to show that Aq has a hyperbolic structure.
Let qm = fm(q). For each point x E Aq, let ~ = Tx(W"(p» for u = S,u. This is
clearly a continuous splitting on O( q) and p separately. We must check that it is
continuous as O(q) approaches p. Consider qm as m goes to 00, so qm approaches
p. The stable bundle E~_ = Tq_(W·(p» approaches E~ because the stable manifold
W·(p) is C I • The unstable bundle E~_ = Tq_(WU(p» approaches E~ by the linear
estimates in the Inclination Lemma. The argument as m goes to -00 is is similar.
Therefore we have a continuous splitting on Aq •
Next we check that vectors in E"IAq are uniformly contracted. We can take an
adapted metric so that on E~ the derivative is an immediate contraction, IIDfpIE~1I <
A < 1. By continuity, there is a neighborhood W of p such that liD fq_IE~_1I < .A for
all qm E W. Since there are only finitely many qm E Aq \ W, there is a C ~ 1 such
that if fi(qm) ¢ W for 0 ~ i < k then IIDf~_IE~_1I ~ C Ai for 0 < i ~ k. For any
qm E Aq \ {p}, there is at most one string of iterates for which f(qm) ¢ W: there are
i l and i2 such that fi(qm) ¢ W for il ~ i < i2 and f(qm) E W for i < i l or i ~ i2'
Combining the estimates on and off W, II D f~_IE~m II ~ C Ai for any qm E Aq and all
0< i. The estimates for EUIAq are similar interchanging f and rl. This proves the
hyperbolicity of Aq .
Now we turn to step 2 which proves that there is a small neighborhood V of Aq 8uch
that the maximal invariant set Av = n.EZ f'(V) has a hyperbolic structure. Note that
there is no neighborhood V of Aq for which Aq = Av. (This might not be obvious but
is true from the conclusion of this theorem or by the ShadOWing Theorem IX.3.1.) This
262 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

means that Aq is not an isolated invariant set. (An invariant set A is called isolated
provided there is a neighborhood V for which A = Av with Av defined as above. See
Section 9.3 for further discussion of isolated invariant sets.)
For simplicity below, we take an adapted metric on Aq and extend it to a (perhaps
smaller) neighborhood V. (The adapted metric implies that f is an immediate con-
traction and expansion on Aq.) We use cones to extend the splitting from Aq to a
neighborhood. For x E V, using the adapted metric let

and
CU(x) = {(~, '7) E E~ EB E~ : I~I ::; ~1'71}
some 0 < ~ < 1. By the hyperbolicity on Aq and continuity, there is a (perhaps smaller)
neighborhood V of Aq such that
DfxCU(x) c C"(J(x)) provided x, f(x) E V,
Df;IC'(X) c C'U-1(x)) provided x,rl(x) E V,
IDf:'v"l ~ .x-mlv"l provided fi(X) E V for 0 ::; i < m, and
IDf;mv'l ~ .x-mlv'l provided ri(x) E V for 0 ::; i < m

for any V U E C"(x) and v' E C'(x). Let Av = nez fi(V) :::) Aq as before. For this
neighborhood V and x E Av, define

i;:::O

E: = n Dfp(x)C'(t(x)).
i;:::O

Because of the expansion esimates, these are subspaces for each x E Av which depend
continuously on x, have the usual invariance properties, and satisfy E~ EB E~ = TxM.
By the inequalities for all vectors in the cones, vectors in E~ and E~ are expanded and
contracted, respectively, so Av is a hyperbolic invarinat set. This completes the proof
of step 2.
The invariant set Av defined above is probably not be conjugate to a simple subshift
of finite type. To get such an invariant set, we carefully construct a smaller neighborhood
of Aq, B c V, which is a finite union of "boxes". Each box Bi corresponds to a symbol
in the subshift. The transitions which are allowed in the subshift are exactly those for
which f(B;) n B j -F 0. The images of the boxes Bi are correctly aligned so that a string
of symbols is allowable if and only if there is a point whose orbits goes through this
sequence of boxes. Since B C V, AB C Av, so AB has a hyperbolic structure.
We take coordinates near p induced by the hyperbolic splitting. In fact identifying
IE~ with a subspace for (J" = 8, U, we can take coordinates so a neighborhood can be
considered as a subset of E~ x E~, and the local stable and unstable manifolds are
disks in the subspaces given by the splitting, W:(p) = E~(r) x {O} and W;'(p) =
{O} x E~(r), and identify these two local stable and unstable manifolds with E~(r) and
E~(r), respectively. For 0" Ou > 0, we let D' = WI. (p) and D" = Wl. (p). We also
consider D' = IE~(O,) and D" = 1E~(ou) and take their cross product in local coordinates
near p. We can take 6"0,, > 0 and k > 0 such that
q E int[r"(D') \ r(k-I)(D')]
q E int[fk(D U ) \ f("-I)(D U )],
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT 263

where the interiors are taken relative to W'(p) and WU(p), respectively. See Figure 4.5.
We can also insure that D' x DU c V (resp. U) where V (resp. U) is the neighborhood
of Aq (resp. {p, q}) given in the statement of part (b) (resp. of part (a». For the
same k to work both forward and backward, the relative sizes of 6. and 6u need to be
adjusted.

FIGURE 4.5. Images of the Disks D' and DU

Having fixed k, take iI ?: 0 such that for i ?: ii, fk(DU) crosses f-k(D' x rj(DU»
transversally in the components of the intersection containing q and p. By transversally,
we mean that it hits transversally each "horizontal fiber" f-k(D' x {y}) once and only
once for each y E f-j(DU). Thus fk(DU) is a "vertical disk" through q. See Figure
4.6.

For i ?: i .. the set

is a thin neighborhood of fk(DU) by the Inclination Lemma. Therefore for j ?: iI large


enough, fk+ j (D' x r'(DU» crosses r II r (DU»
(D' x j transversally In the components
ofthe intersection containing q and p. In particular, each "fiber" fk+i( {x} x r'(DU»
crosses each rk(D' x {y}) once and only once for each xED' and y E f-j(DU). See
Figure 4.7. Notice the similarity with the figure for the geometric horseshoe.
264 VII. EXAMPLES OF HYPERBOLIC SETS AND A'ITRACTORS

Fix an j ~ 11 large enough to satisfy the above conditions. Let n = 2k + j,


BI = D' x rj(D U ), and
V = rk(Bd.

Letting comp.(B) be the connected component of B containing z, set

VI = compp(Vn r(V» c BI ,
V2 = compq(Vnr(V», and
Bi = r- l - k - J (V2 )

for 2 ~ i ~ n. See Figures 4.7 and 4.8. The set of boxes {VI, V2} is used in the proof of
part (a), and the set of boxes {Bi : 1 ~ i ~ n} is used in the proof of part (b).
At this point we are in position to prove either part (a) or (b) of the theorem. The
reader can choose which to read first.
PROOF OF PART (b). Let B = UI:5i:5n B;. For j large enough Be V. Finally, let

As = nfi(B),
iEZ

so As is the maximal invariant set in B, As C Av c V, and As has a hyperbolic


structure. We show that the (nonlinear) boxes Bi can be used as symbols so As is
conjugate to a subshift of finite type.
Because ofthe construction, f-Ic(Bd and flc+i(Bd = r(V) cross V2 , but ft(Bd n
r
t.'2 = 0 for -k < t < k + j. It follows that (i) f(Bd crosses BI and k -j+1(V2) = B2
but not B; for i > 2, (ii) flc(V2) = f 0 f 2Ic+j-I-Ic-i(V2) = f(Bn) crosses B 1 , and (iii)
f t (V2) n BI = 0 for -k - j < t < k, so B; n BI = 0 for 2 ~ i ~ n. Thus we have
constructed the desired disjoint boxes for the symbols. The first symbol, B 1 , goes to
either itself or B 2 • The other symbols can only go to the next symbol: B; goes to Bi+1
for 2 ~ i < n, and Bn goes to B 1• Therefore the transition matrix for the subshift is
given as in the statement of the theorem. (Note that we have assumed that p is a fixed
point. The form of the transition matrix when p is not fixed is given in Remark 4.8
below.)
In the construction of these boxes, they are correctly aligned: each of these boxes
can be assigned coordinates so that the image of an unstable disk in Bi crosses Bi+1 in
the unstable direction, and the inverse images of an stable disk cross Bi crosses B'_ 1 in
the stable direction.
7.~.2 HORSESHOE FROM A HOMOCLINIC POINT
,- -,,

D
-+--+-+., - - - - - ,

FIGURE 4.8. Choice ofthe Boxes Bit ... , Bn

The conditions on the boxes are similar to the properties of a Markov partition for
a hyperbolic invariant set which we define in Section 7.5.1. One difference is that the
Markov partition is made up of "boxes" which are subsets of the hyperbolic invariant
set while these boxes are diffeomorphic to "Euclidean boxes" and are neighborhoods in
the ambient space. Because of this difference, we say the set of ambient boxes satisfies
the Markov properly rather than calling them a Markov partition.
Define h : As -+ EA as the itinerary function by hex) = 8 provided li(x) E B., for
all i E Z. Because the boxes are disjOint, h is well defined. Fix any symbol 8 E EA.
For any m ~ 0, because the images of the boxes have the correct topological alignment,
n::o li(BL , ) is a nonempty nonlinear sub-box of B.o which stretches all the way across
the unstable direction. Similarly, n?=-m
Ji(B._;} is a nonempty nonlinear sub-box of
B.o which stretches all the way across the stable direction. Therefore, ~:-m li(B._.>
n
and iEZ li(B._;} are nonempty. Thus h is onto EA. By an argument like we used
before, ho/lAs = uoh so h is a semiconjugacy. (This much ofthe argument does not use
that As has a hyperbolic structure, but can be made to work if there is a "topologically
transverse" intersection. See Burns and Weiss (1994).)
The fact that h is one to one follows from the fact that I has a hyperbolic structure
on As. The contraction and expansion implies that for any symbol sequence 8 E EA
there is only one point in the intersection niEzJi(B._.), i.e., there is only one point
x E A such that rex) is in the box V•• for all i, i.e., h is one to one. 0
PROOF OF PART (a). Let U be an open set of p and q which is contained in the set V
defined above. Now fix k, j, n = 2k + j, 6., and 6u as above. By the above choices,
VI and V2 are two correctly aligned sets, and VI U V2 C U. See Figure 4.7. By the fact
that the images of VI and V2 by In stretch vertically across V,

Sr:- I = n lin(V1 U V nlin(v)


m-l
2) =
m

i=O

has 2m components each of which stretches vertically across V. Similarly

-I 0
S::.. = n pn(V1 U V 2) = n pn(v)
i=-m

has 2m components each of which stretches horizontally across V transverse to the


266 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

vertical fibers. Combining,


m-l
S~,;; 1 = n /in(V1 U V2 )
i=-m

has 22m components. (So far we have not shown that the maximum of the diameters of
these components goes to zero as m goes to infinity.) By arguments like those for the
geometric horseshoe, it follows that there is a semiconjugacy h : A -+ E 2 , where

A= n /in(V1
00

ia:-oo
U V2 ),

which is onto and such that h 0 riA = (10 h.


The fact that h is one to one follows from the fact that r
has a hyperbolic structure
on A: the contraction and expansion implies that for anyone symbol sequence 8 there
is only one point x E A such that /in(x) is in the box V., for all i. 0
REMARK 4.7. It might seem that the invariant set for I, A B • should be the orbit of oft he
invariant set for r, O(A, f). However, O(A,f) is not the largest natural invariant set
in a neighborhood of Aq , because O(A. f) only has periodic points which are multiples
of n; A8 has points of all periods larger than n (if p is a fixed point). Another way to
see the difference is that flAB is topologically mixing (if p is a fixed point) while flO(A)
is topologically transitive but not topologically mixing.
Still Another way to characterize the difference is in terms of the topological entropy
which we define Section 8.1, which is a measure of the complexity. The characteristic
polynomial of the matrix A of Theorem 4.5(b) is p(>.) = >.n - >.n-l -1. Since p(21/n) =
2_2(n-l)/n_1 < 0, the largest eigenvalue of A is larger than 21/n. In Section 8.1 we show
that the logarithm of this eigenvalue is equal the topological entropy of flAB. On the
other hand, riA has the same entropy as 0"21E2which equals log(2). By further results
in Section 8.1. flO(A) has entropy (lin) log(2). Therefore flAB has more entropy than
/lO(A) and so has more complex dynamics.
REMARK 4.8. If p is not a fixed point but has period p then the subshift has a cycle
of period p rather than a fixed point. Therefore ai,j = 1 in the following cases:
i = 1 and i = 2,p + 1,
2 ~ i < P and i = i + 1,
i = P and i = 1,
P + 1 ~ i < n and i = i + 1. and
i=nandi=1.

For all other (i,i). ai.j = O. Thus the transition matrix is


0 0 0 0 0

0 0 0 0 0 0
1 0 0 0 0 0 0
A= 0 0 0 0 1 0 0

0 0 0 0 0 1 0
0 0 0 0 0 0 1
1 0 0 0 0 0 0
7.4.2 HORSESHOE FROM A HOMOCLINIC POINT 267

We leave the details to the reader.


In the next subsection, we show how a transverse homoclinic point arises from a time
periodic perturbation of a differential equation with a nontransverse homoclinic connec-
tion. In the remainder of this subsection, we discuss a more geometric construction of
a perturbation which changes a nontransverse homocllnic connection into a transverse
homoclinic point.
Example 4.1. We start by giving the construction of a diffeomorphism in R2 with
a nontransverse homoclinic point. In fact, our example has one branch of the stable
manifold coinciding with one branch of the unstable manifold for a saddle fixed point.
The simplest construction of such a diffeomorphisrns is by means of the flow of a system
of differential equations. Let ",' be the flow of the system of differential equations

XI = X2
X2=XI-X~,

The origin is a saddle fixed point for ",'. The real valued function H (x) = x~ - xV2 +
xf!3 is an integral of motion, H(x) ;;;; o. ( The analysis of the next subsection derives
this function as the sum of the kinetic and potential energy.) Using the level sets of H,
it can be seen that

W"(O, ",') n {x : XI > O} = WU(O, ",') n {x : XI > O}


x2 x3 1/2
= {x : X2 = ±( 21 - ;) ,XI> O}.

See Figure 4.9. Let 1 be the time one flow of ",', I(x) = ",I (x). Then

W"(O,f) n {x: XI > O} = WU(O,f) n {x: XI > O}


X2 x3 1/2
= {x: X2 = ±( 21 - 31 ) ,Xl> O},

so 1 has a nontransverse homoclinic connection for the fixed point O.

FIGURE 4.9. Homoclinic Connection for Example 4.1

The next step in the construction is to perturb 1 to a new diffeomorphism 9 which


has a transverse homoclinic point. Let x· = (1.5,0) be the point where the homoclinic
connection crosses the Xl-axis. Let U be a relatively small neighborhood of x· which
satisfies the following properties:
(i) I-I(U) n U = 0 and I(u) n U = 0, and
(ii) letting 1 = W'(O, f) n U, Un UIj(1) = 0.
jy.o
268 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Let U' CUbe a smaller neighborhood of x·. Let ,6(x) be a nonnegative real valued
bump function such that

,6()
x = {o forx~U and
1 for x E U'.

We use the bump function to define the perturbation k. by

k.(x) = x + e,6(x) ( ~2 ) ,

and the new diffeomorphism g. by

g.(x) = k. o/(x).

(Notice that k. is a sheer near x·.)


We defer to Exercise 7.19 the verification ofthe following statements for small enough
e > 0:
(a) g. is a diffeomorphism,
(b) ° is a saddle fixed point for g.,
(c) letting I = W"(O, f) n U,
00

Uli(I) c W"(O,g.),
i=O
-00

U li(1) c WU(O,g.), and


j=-I
k.(I) c WU(O, g.),

and
(d) x· is a transverse homoclinic point for g•.
Notice that the perturbation by composition with k. changes the unstable manifold in U
but leaves the stable manifold unchanged in U. (The stable manifold in l-l(U) become
I-I 0 k;I(I).) Part (d) shows that g. has a transverse homoclinic point which is the
desired result.
REMARK 4.9. The Kupka-Smale Theorem states that any diffeomorphism I can be
C'" -approximated by a diffeomorphism 9 for which
(i) all the periodic points of 9 are hyperbolic and
(ii) for any pair of periodic points p and q for g, W"(p, g) is transverse to WU(q, g).
The above example (and Exercise 7.19) gives an explicit example of the construction
which makes condition (ii) true for a perturbation. See Section 10.1 for a discussion of
the Kupka-Smale Theorem.

7.4.3 Melnikov Method for Homoclinic Points


In the last section we showed how a horseshoe arises from a transverse homoclinic
point for a hyperbolic periodic point. In this section, we give one way to verify that a
certain type of differential equations has a transverse homoclinic point. This approach
goes back to Poincare, but its recent use starts with Melnikov (1963). Some people call
this the Poincare..Melnikov-Arnold method. For a more complete treatment of this type
of result see Wiggins (1988) or (1990).
7.4.3 MELNIKOV METHOD FOR HOMOCLlN1C POINTS 269

The class of differential equations which we consider is the time periodic perturbations
of a Hamiltonian system. We introduce the concept of a Hamiltonian system on a
Euclidean space (or a Euclidean space cross a torus). Assume H(ql,' .. , qn, Ph'" ,p,.)
is a real valued function of the 2n variables. The Hamiltonian differential equations
generated by H are given by

· 8H
qj=-,
8pj
· 8H
P---
) - 8qj

for j = 1, ... , n. The corresponding Hamiltonian vector field is given by

Notice that H is conserved along a trajectory (is a weak Liapunov function):

Example 4.2. One simple example is where H is the sum of "kinetic energy", E j riJ/2,
plus a "potential energy" which depends only on the positions q, V(q),

p2
H(q,p) = L ~ + V(q).
j

Then the equations of motion are given by

qj = Pi'
· 8V
Pj = --8 .
q;

For example, with n = 1 the equations

q=p,
p = q -l = J(q)

are Hamiltonian with potential energy

V(q) = - ! q2
J(q)dq = - -
2
q"
+-
4
270 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

FIGURE 4.10. Graph of the Potential Function

-ffir----t-""*-t----ttt-q

FIGURE 4.11. Phase Portrait for Example 4.2

and Hamiltonian function H(q,p) = p2/2 + V(q). The potential function V(q) is a
quartic polynomial with two minima and one local maximum. See Figure 4.10. The
phase portrait of the equations has one saddle point and two centers. See Figure 4.11.
We next want to consider the perturbation of these equations by adding a periodic
forcing term and a "frictional" term:

q =p,
p= q - q3 + E"}'cos(wt) - t6p.

We analyze this example below.


This last form of the equations, with the periodic forcing term and a "frictional"
term added, is of the form

( : ) = XH ( : ) +tY(q,p,t)

where Y has period T in t. We can add another variable T to make the equations
independent of time:
7.4.3 MELNIKOV METHOD FOR HOMOCLINIC POINTS 271

where the variable T is taken modulo T. (In the explicit example above we could take
= =
T WI and T Wit rather than T t.) =
For the rest of the section we assume that n =1, 80 P and q are each real variables
(or a single angle variable). We assume that XH has a hyperbolic saddle fixed point
(qO,Po), 80 equations (.) have a closed orbit 'Yo for ( = O. BecaulII! the eigenvaluee
(characteristic multipliers) of 'Yo are not equal to one, for ( '" 0 but small, there persists
a closed orbit 'y..
Next, we assume that XH has a homoclinie orbit for the fixed point, i.e., a point

(q, p) e [W'«qO, po), XH) n WU«qO,po), XH») \ {(qO, Po)}.

In the (q,p, T)-space, the homoclinic orbit of XH becomes a homoclinie surface for 'Yo
for the equations (.),

f = {(q, p, T) : (q,p) is on a homoclinic orbit for XH}.


For t > 0, the closed orbit r. remains hyperbolic and its stable and unstable manifolds
vary smoothly with t on compact subsets. For each Zo E f, let a'(zo,t) be the point
where W'(-}'., i.) intersects the normal to f through zo. By the smooth dependence
=
on f, z'(zo, 0) Zo and z'(zo, t) is a smooth function of t. Similarly define ZU(zo, f).
We want to measure the separation of the stable and unstable manifolds in the
directions orthogonal to f, i.e., the separation ZU(zo, t) from z'(zo, f). The function
H is a good measure of a displacement in these directions (since the gradient of H is
=
nonzero at points in f), 80 we want to measure G(zo, () H(zb(ao, f» - H(a'(zo, (».
Since G(zo, 0) == 0, it is possible to write

A zero of G(zo, t) corresponds to a homoclinie point. Since we want to measure the rate
of separation with respect to t (the infinitesimal separation), we define

M(zo) :t
= H(zU(zo, €»I.=o - :€ H(z'(zo, f))l.=o
=G(zo,O),
which is called the Melnikov function. The function M is considered a function from r
to the real numbers. Then G(zo, f) equals M(zo) plus terms involving (. A zero of M
corresponds to a place where infinitesimally the stable and unstable manifold continue
to intersect. In fact the following theorem, which is a direct consequence of the implicit
function theorem applied to G, gives a criterion that the manifolds actually intelllect for
( '" 0, (See Melnikov (1963), Holmes (1980), or Marsden (1984) for a proof.) The use
of this function to prove the existence of a transverse homoclinic point and a horseshoe
is referred to as applying the Melnikov method.
Theorem 4.6. Suppose Zo is a point on f with M(zo) = 0 and BOrne directional
derivative a;: (zo) '" 0 (or v tangent to f. Then (or small enough € '" 0, r. bas a
transverse homoclinic intersection near Zo, In (act tbe point of transverse homoclinic
intersection varies smoothly with (.
As a consequence of this theorem, i. has a hyperbolic horseshoe near Zo of the type
indicated. In order for this result to be useful we need a method of calculating M. The
next theorem gives just such a result.
272 VII. EXAMPLES OF HYPERBOLIC SETS AND ATI'RACI'ORS

where 1P0 is the flow of XH for


M(ZO) =

f
i:
Theorem 4.1. The Melnikov function is given by the following improper integral:

DH...o(I ••o)Y(lPo(t, ZO)) dt

= 0 and X. = XH + fY.
PROOF. We need to calculate I. H(z"(zo,()) for (1 = U,S. To do this, we calculate

! H ° lP(t, z"(zo, f), ()I.-o


along the whole trajectory, and in fact

ft! H ° lP(t, z"(zo, f), f)I.=o.


From now on, all the derivatives with respect to ( are evaluated at ( = 0 even though
this is not explicitly noted. Then using several rules of differentiation (including the
chain rule and a Lelbniz rule)

~ !H 0IP(t,z"(zo,e),f) = :f~H olP(t,Z"(Zo,f),f)


a
= ae [DH. (XH + eY)l"'(I,.~(.o.'}.<l
= (DH . Y) ... (I •• ~(SO.O).O)
a
+ af [DH . XH l. . (I •• ~(.o •• ) •• )
= (DH· Y) .... (I.SO)
because DH· XH ;: 0 at all points and z"(ZO,O) = ZOo Now integrating between two
times Tl and T2 we get
a H ° IP(T2 , z"(zo,e),e) - -8a H °IP(Tloz"(zo,e),f) = h~ (DH. Y)
-8 dt.
e e T , " ' . (I .110
)

For (1 = S, we use Tl = 0 and let T2 = T2 go to infinity; for (1 = U, we use T2 = 0 and


let Tl = Ti go to minus infinity. Substituting these into the definition of M(ZO) we get
T.'
M(zo) = r
iT; 2 (DH· Y) (I
~o t~
) dt

+ :e H ° IP(T;., ZU(zo, f), e) - ! H ° IP(T~, z'(zo, e), e).


By the chain rule,

!H OIP(~,Z'(Zo,f),f) = DH...(T.....O) !IP(T3,Z'(ZO,f),f)


As T2 goes to infinity, the point IP(T2,zo, 0) goes to the fixed point P of XH in (q,p)-
space where DHp = 0, so DH...(T•• so.O) converges to the zero row vector. This quantity
acts on :e lP (T2,z·(zo,e),e). The point at which this is evaluated, IP(T:l,z'(zo,e),f),

goes to 1., so the limit of !


IP(~, z'(zo, f),e) goes to the infinitesimal movement of the
closed orbit which is bounded. (This uses the smooth dependence of the stable manifold
on the parameter.) Combining, we get that I.H olP(~,Z'(ZO,f),f) goes to zero as T2
goes to infinity. Similarly, I.H 0 IP(T:z,ZU(ZO,f),f) goes to zero as T{ goes to minus
infinity. Thus, letting Ti go to minus infinity and T2 go to infinity, we have proven that
the integral converges absolutely and the equality given in the theorem is true. 0
7.4.3 MELNIKOV METHOD FOR HOMOCLINIC POINTS 273

We end the section by giving the calculation for the example given above.
Example 4.2 (Continued). We consider the equations

q=p
p = q -l + t"(C08(T) - €6p
t =W.

The Hamiltonian function is H(q,p) = p'l/2_ q2/2+p4/4. The point (q,p) = (0,0) = 0
is a saddle fixed point with a homoclinic connection on two sides. A parameterized form
of these two connections is given by

q±(t») ( ±2!sech(t) )
( pff(t) = =F2t sech(t) tanh(t)
T(t) TO + tw

as can be verified by differentiation. To apply the theorem we use one of the two homo-
clinic orbits. We use the positive side and drop the label '+'. Other parameterizations
of homoclinic orbits are given by (qo(t - to),Po(t - to),T(t» where to is the time at
which p = O. The two variables (to, TO) parameterize the points on r. We actually only
need to calculate M on some cross section. Either to = 0 or TO = 0 is a reasonable cross
section. For now we keep both variables. Letting s = t - to in the integral,

M(to. TO) = 1: (- qo(t - to) + qg(t - to),Po(t - to» x

( "( COS(TO + tw)0 - 6Po(t - to) ) &

= -61: Po(s)2 ds + "( 1: Po(s) COS(TO + tow + ws) ds

= -61: Po(s)2 ds + "(C08(TO + tow) 1: Po(S)C08(WS) ds

- "(sin(To + toW) 1: Po(s)sin(ws) ds.

The integrand of the second integral is an odd function of s because Po(s) is an odd func-
tion and C08(WS) is an even function. Therefore the second integral is zero. Substituting
in for Po(s) in the first integral,

as can be directly calculated by a substitution. The third integral is given as follows:

-1 sin (TO + tow) 1: Po(s) sinews) ds

= 1 sin(To + tow)21 1: sech(s) tanh(s) sinews) dB

= "(sin(To + tow)2 1",wsech( "';)


274 VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS

as can be calculated by means of residues (after the substitution z = ea.) Therefore

M(to, TO) = 3-40 + 1'2 ! 7rwsech("2)


'TrW
sin(To + tow).

Now we take the cross section TO = 0, and set

This function has a nondegenerate zero as long as

The meaning of this inequality is that as long as there is not too much friction in
comparison with the periodic forcing, there is a transverse homoclinic orbit.

7.4.4 Fractal Basin Boundaries


A certain perspective on Dynamical Systems places most of the emphasis on the
attracting sets, i.e., sets A such that {q: w(q) C A} is an open neighborhood of A.
(See Section 7.6 for the actual definition.) If we are mainly interested in attracting sets,
then in what way are horseshoes important? One answer to this question is that the
stable manifolds of a horseshoe can form the separating set between two different basins
of attraction. A good introduction to this type of result is contained in Alligood and
Yorke (1989). Also see Grebogi and Yorke (1987). We start with an example.
Example 4.3. Instead of the horseshoe which gives the full two shift, we consider the
horseshoe which gives the full three shift. Figure 4.12 shows a neighborhood Nand
its image feN) by a diffeomorphism f on the two sphere. We assume that f(A) C A
and feB) c B and there is a fixed point sink in each of these regions, ql E A and
q2 E B. The square middle region S contains a hyperbolic horseshoe, A, such that flA is
conjugate to the two sided shift map on three symbols. Considering the diffeomorphism
on S2 we add a source at infinity, qoo. If we are careful in the construction, then
R(f) = AU {QI,q2,qoo}'

FIGURE 4.12
7.5 HYPERBOLIC TORAL AUTOMORPHISMS 275

The sphere can be split up into the stable manifolds of the invariant sets,

8 2 = W'(qt) U W'(q2) U W'(A) U {q",,}.


The attracting sets for this example are merely the two fixed point sinks. Their basins
are dense in the sphere. The stable manifold of the horseshoe separates them. The
reader should determine which points in the gaps 8 \ l-t(8) are in W'(qt} and which
are in W'(q2). In this case the boundary of the basin W'(q;) for either fixed point
sink is W'(A) U {qoo}' This set has the structure of a Cantor set across the stable
manifolds and so is not a smooth manifold. Such sets are called fractal because they
have fractional Hausdorff (or box) dimension. We discuss these concepts in Section 8.4.

If q is a fixed point sink for a diffeomorphism I, let B be the boundary of its


oRSin of attraction. This is the boundary in the sellse of point set topology, B =
cl(W'(q» \ W'(q). This set B is called the basin boundary. In studying a basin
boundary, an important feature is which points are accessible from within W·(q). A
point x E B is called accessible provided there is a curve 'Y starting in W'( q) and x is the
first point on 'Y which is not in W·(q). In the above example, not all points of W'(A)
are accessible. If PI, P2, and P3 are the three saddle fixed points in the corresponding
vertical strips, then the accessible points in this example are W'(pd U W'(P3) U {q",,}.
To understand this fact, let L be a vertical line segment in 8 from top to bottom. Then
W'(A) n L is a Cantor set. The only accessible points in W'(A) n L from within L
are the end points of the Cantor set (at a finite stage in its formation). But these end
points are exactly the points on the stable manifolds of Pt and P3 (and not the middle
fixed point P2). Alligood and Yorke (1989) discusses the question of showing that the
basin boundary contains a periodic point. Also see Barge and Gillette (1991).
There has also been a number of papers on understanding how the basin boundary
changes from a smooth set to a fractal set under a deformation of the diffeomorphism.
See Alligood and Yorke (1989), Grebogi, Ott, and Yorke (1987), Hammel and Jones
(1989), and Alligood, Tedeschini-Lalli, and Yorke (1991).

7.5 Hyperbolic Toral Automorphisms


The horseshoe is an example of an invertible map with infinitely many periodic
points. It is created on a piece of Euclidean space so can occur on any manifold. In this
section we consider another type of example with infinitely many periodic points which
is more global in nature; the dynamics occur on the whole torus. As such, this type of
dynamics is not as prevalent as the horseshoe, but it is an important construction of a
more global example. We use it later in the chapter to construct an attractor called the
DA attractor.
The hyperbolic toral automorphisms were introduced about the same time as the
horseshoe examples. R. Thom is credited with suggesting them to Smale as examples
with infinitely many periodic points. They are special cases of systems called Anosov
diffeomorphisms or flows. Anosov (1967) showed that the geodesic flow on a manifold
with negative curvature are examples of Anosov flows. In this section we define a general
Anosov diffeomorphism but only analyze the hyperbolic toral automorphisms.
To prove that a horseshoe has infinitely many periodic points, we show it is con-
jugate to a subshift of finite type. The proof in this section that a hyperbolic toral
automorphism has infinitely many periodic points is more straight forward and a1ge-
braic. In Section 9.4, we prove that any Anosov diffeomorphism has infinitely many
periodic points by a more analytic or geometric argument.
We start with the definitions.
276 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Definition. A diffeomorphism f : M -+ M is called an Anosov diffeomorphism pro-


vided f has a hyperbolic structure on all of M. It is ca.1led a toral Anosov diffeomorphism
provided in addition M is a torus, 1"".
Construction of hyperbolic toral automorphisms. The examples of toral Anosov
diffeomorphisms which we introduce are induced by a linear map on Rn. Let A = (aij)
be an n by n matrix with all the aij integers, det(A) = ±l, and A hyperbolic (no
eigenvalue of absolute value one). Let LA : Rn -+ Rn be the induced linear map on
Rn. Because A has all integer entries, LA takes the integer lattice zn (points with all
integer components) into itself. Because in addition the determinant of A is ± I, A-I
has integer entries and (LA)-I = LA-I a.1so takes zn into itself, so LA(zn) = zn. Let
11': R n -+ 11"" be the projection which takes a point x = (Xl, ... ,xn ) in Rn to the point
in the torus by taking each component modulo one. Because LA(zn) = zn, LA induces
a map fA from the torus 1"" to itself such that fA 01l'(x) = 11' 0 LA (x): if 1I'(x) = 1I'(x')
then x = x' + m with m E zn, so 11' 0 LA(x) = 1I'oL A(x') + 1I'oLA(m) = 11' oLA(x')
and fA 0 (x) is well defined. Because det(A) = ±l, the inverse A-I is again a matrix
with integer entries, so fA is a diffeomorphism with fA I = fA-I. These maps are ca.1led
hyperbolic toral automorphisms.
Example 5.1. Let
A= C~) and A2 = C ~).
These are both hyperbolic matrices. The eigenvalues of A are ,x± = (1 ± 5 1/ 2 )/2,
,X + ::::: 1.618 and ,X - ::::: -0.618, so I,X + 1> 1 and I,X -I < 1. The eigenvectors are

for ,X - and ,X + respectively. Notice that both eigenvectors have irrationa.1 slope. It can
be shown that this must be the case for hyperbolic integer matrices of determinant one.
Notice that A reverses orientation and has one negative eigenvalue, while A2 preserves
orientation, with two positive eigenvalues, (,X±)2. The eigenvectors for A2 are the same
as those for A.
With the above example as a model, we can now state the main theorem.
Theorem 5.1. Let fA be a hyperbolic toral automorphism. Then the following are
true.
(a) The periodic points are dense in yn. In particular, there are an infinite number of
periodic points. Also, the nonwandering set of fA is all of yn.
(b) The toral automorphism fA has a hyperbolic structure on all ofyn:
(i) E" is the space of generalized stable eigenvectors of A,
(ii) EU is the space of generalized unstable eigenvectors of A,
(iii) E~ is the translation ofE" to TpT",
(iv) E~ is the translation ofEu to TpT",
(v) fA is an Anosov diffeomorphism, and
(vi) if1l'(p) = p, then W"(p) = 1I'(p + E") and WU(p) = 1I'(p + EU).
(c) The toral automorphism fA is topologically transitive.
(d) The toral automorphism fA is expansive and so it has sensitive dependence on initial
conditions.
(e) The toral automorphism fA is structurally stable.
7.5 HYPERBOLIC TORAL AUTOMORPHISMS 277

PROOF. (a) For fixed positive integer k, let Rat(k) be the rational points in T" with
denominators k:
Rat(k) = 1I"{(it!k, ... ,inlk) : i j E Z} cT".
Then, LA({(it!k, ... ,inlk) : i j E Z}) c {(it!k, ... ,inlk) : i j E Z}, 80 fA(Rat(k» c
Rat(k). This set has a finite number of points (kn points) and fA is one to one, 80
fAIRat(k) is a permutation of Rat(k) and every point in this set is periodic. Finally,
Uk Rat(k) is dense in the torus, so the periodic points are dense.
The other two statememts in part (a) easily follow from the first.
(b) Because 11" : IR n -+ Tn is onto, these give global coordinates. If U c T" is an open
set and 1{il,l{i2 : U -+ Rn are two sets of local coordinates on U which are inverses of 11",
then 1{i2 0 1{i,1 (x) = X + m where m E zn. Therefore, for PET", the tangent space at
p can be thought of as {p} x Rn.
In the local coordinates, the map fA is given by LA. AB a map on Rn , the derivative
of LA at a point p is equal to the matrix A (or the linear map LA), D(LA)fI = A.
Therefore, D(fA)p = A as a map from TpT" = {p} x Rn to T, .. (p1"" = {f(p)} x Rn.
In dimension two, the span of the eigenvectors gives the stable and unstable directions
for A, IE" = {tv" : t E R} and lEu = {tv" : t E R}. In higher dimensions they are the
generalized stable and unstable eigenspaces of A respectively as stated in the theorem.
There is a C ~ 1,0 < " < I, and ~ > 1 such that IIA"IE"II ~ C ,," and IIA-"IEull ~
C ~ -" for k ~ 1. For any point p E 1"", define the subspaces at p to be the translates
of the stable and unstable eigenspaces of A, E~ = {p} x IE" and IE~ = {p} x lEu.
For p E Tn and v E IE~, ID(f~)pvl = IAkvl ~ C "klvl where 0 < " < 1 and C ~ 1
are the constants given above. The bound C ,,"Ivl goes to zero as k goes to infinity,
so IE~ is made up of stable vectors. A similar argument holds for v E IE~ as k goes to
minus infinity. This proves the hyperbolic structure.
Because fA has a hyperbolic structure on all of Tn and all points are nonwandering,
fA is an Anosov diffeomorphism.
Next we turn to the stable manifold for a point p = 1I"(p), For t > 0 and q = 7r(q),
let B(q,f) C Rn be the ball of radius E and U(q,E) = 7r(B(ii,f» C T". For E small
enough fAI(U(f(q), E» does not wrap around the handles ofthe torus, 80
fAI(U(f(q), E» n U(q, E) = 7r[L:4 1 (B(LA(q), E)) n B(q, E)l.
Then the local stable manifold is given as follows:

n~O

= 7r[ n L:4n(B(L~(p),t»]
n~O

= 1I"[P+ n A-n(B(O,t»].
n~O

By the properties of the linear map, 7r[p + 1E"(C-1E)] c W:(P,fA) c 7r[p + E"(E)].
Therefore the global stable manifold is given as stated:

n~O

The result about the unstable manifold is proved similarly. This proves part (b) of the
theorem.
278 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Corollary 5.2. For any point p, W'(p) and W"(p) are each dense in T" and so are
their intersections, which are transverse homoclinic points.
PROOF. For n = 2, the slopes of the lines lE' and IE" are irrational in ]R2, so the lines
W'(p) = 1T(iHIE') and W"(p) = 1T(p+lE") are each dense in 'f2, and their intersections
are dense.
For n > 2, we replace the lines with subspaces. The stable and unstable manifolds
of LA of a point q are given by W'(q, LA) = q + lE' and W"(q, LA) = q + lEu. Let q
be a periodic point of period m with lift q. Then W"(O, LA) intersects W'(q,L A ) in
a point z (because these two affine subspaces are complementary dimensions and not
at all parallel). Letting z = 1T(i), z E W"(1T(0).!A) n W'(q, fA) and d(fA(z),/A(q))
goes to zero as n goes to infinity. Therefore W"(11'(0),fA) accumulates on q at the
points fmn(z). Because the periodic points are dense in 'f", W"(11'(0).!A) is dense
in 'f". Similarly, W'(1T(0),fA) accumulates on q and so W'(11'(0).!A) is dense in 'f".
Because W"(1T(0),fA) and W'(1T(0), fA) are projections of complementary subspaces,
they intersect transversely arbitrarily near q, so the transverse homoclinic points are
dense in 'f".
For an arbitrary point p in 'f", W" (p, fA) and W' (p, fA) are translates of the mani-
folds of 0, so they are each dense in 'f", and the homoclinic intersections for p are dense
in 'f". 0
PROOF OF THEOREM 5.1 CONTINUED. (c) To prove that fA is topologically transitive
we verify the hypothesis of the Birkhoff Transitivity Theorem. Let U and V be any
two open sets in T". The stable manifold of the origin, W'(1T(0)), is dense in 'f", so
it intersects U in a point q. Let J = 1T(q + lE"(r)) where r > 0 is small enough so
that J cU. If A is a lower bound on the unstable eigenvalues, then the k-th iterate
of the disk, f~(J), contains a disk of radius at least least C-1Akr in W"(f~(q)). As
k increases, the "radius" of f~(J) becomes larger, so f~(J) accumulates on compact
pieces of W"(1T(0)), and so it must intersect V. For this iterate,

o~ f~(J) n V c f~(U) n V.

Thus 0+ (U) n V ~ 0. Similarly, 0- (U) n V ~ 0. By the Birkhoff Transitivity Theorem,


fA is topologically transitive.
(d) Given any point p and q E W"(p), the distance d(fk(p),fk(q)) ~ Akd(p,q) as
long as the distance stays less than one. The quantity Akd(p, q) grows, so the distance
gets bigger than 1/4. This proves that fA is expansive with expansive constant 1/4,
which is part (d).
(e) The proof of structural stability uses the proof of the Hartman-Grobman The-
orem, Section 5.7. After looking at the lifts of the diffeomorphisms to Rn, the main
change is that a little extra checking needs to be done in the proof that the conjugacy
is one to one.
The map LA is the lift of fA to a map on Rn. Let 9 be a C l perturbation of fA·
It is possible to choose the lift G : Rn -+ Rn for which G(O) is near 0 = LA(O). Let
a = G - LA. Then for any lattice point w E zn, LA(x + w) - LA(x) = LA(w), and
G(x + w) - G(x) = LA(w). (This latter equality can be proved by taking a homotopy
9t with 90 = fA and 91 = g. The lift G t will have Gt(x + w) - Gt(x) a lattice point
for each 0 $ t $ I, so G(x + w) - G(x) = Go(x + w) - Go(x) = LA(w).) Then,
a(x + w) = a(x). Because of the periodicity of a, it is C l small on all of Rn. This
shows that G is C t close to LA on all of Rn.
To conjugate LA and G, we solve for H = id + v. The map v should be a bounded
function on Rn and periodic. As before, let Gg(Rn) be the space of all bounded contin-
7.5.1 MARKOV PARTITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS 279

uous maps from Rn to itself. Now we let

cg,per(R") = {v E cg(R") : vex + w) = vex)


for all w E Z",x E R"}
and
Cl,per(R n ) = cg.per(R n ) n CI(Rn).
Then, if G E Cl,per(R"). Because ofthe periodicity, there is a uniform bound on IIDGxll
for all x ERn.
For G E Cl,per(R") and v E cg,per(R"), as in the proof of Hartman-Grobman we let

where
C(v) = (id - (LA)*)V = v - LA ovoLAI.

A direct check shows that 8(G,·) preserves C::,per(R n ). (We leave this verification to
the exercises. See Exercise 7.22.) Exactly as in the proof of the Hartman-Grobman
Theorem, if Lip(G) is small relative to the distance of the contraction and expansion
rates away from one, 8(G, -) has a fixed point v~ E C::,per(R"). Letting HG = id + v~
we get that

C(v~) = Go (id + v~) 0 LAI


HG = id + v~ = LA 0 LAI + LA 0 va 0 LAI + C(va)
= LA 0 HG oLAI + GoHG 0 LAI
=GoHGoLAI
on IRn. For W E zn, HG(x + w) = HG(x) + w so HG induces a map hg on 1'" that
satisfies go hg 0 fA I = hg.
Next we check that hg is one to one. If hg(x) = hg(Y), x is a lift of x, and y is a
lift of y, then HG(x) = HG(Y) + w = HG(Y + w) for some w E Z". Replacing Y with
y' = y + w we get another lift of y with Ha(x) = HG(Y')' Because Ha is one to one,
x = y' and x = y. (The proof that HG is one to one uses the fact that LA is expansive:
HG 0 L:4(x) = HG 0 L:4(y') for all n so x = y'.) Thus hg is one to one.
By invariance of domain, hg(T") is open in 1"'. Since it is also close, hg(T") = T",
and hg is onto. This completes the proof that hg is a homeomorphism, that fA is
structurally stable, and the proof of the theorem. 0
REMARK 5.1. In Section 9.7 we prove that all Anosov diffeomorphisms are structurally
stable. Manning (1974) proved that any Anosov diffeomorphism on a torus is topo-
logically conjugate to a hyperbolic toral automorphism. One conjecture which is still
unknown is whether being Anosov implies that all points are nonwandering (or chain
recurrent) .

7.5.1 Markov Partitions for Hyperbolic Toral


Automorphisms
We want to connect the dynamics of a hyper l,. ; ,.1 automorphism, f : T" -+ T",
with that of a subshift of finite type, i.e., to see how symbolic dynamics can be applied to
a hyperbolic toral automorphism. We need to find (and define) the replacements for the
geometric boxes of the horseshoe which are used to define the symbol sequences. The
theory which we give is for all dimensions, but the examples are all in two dimensions
where the situation is simpler.
280 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Example 5.2. We introduce the ideas of rectangles, a Markov partition, and the semi-
conjugacy using the toral automorphism fA induced by the matrix A = (~ ~ ). We
want to subdivide the total space into rectangles (which can be taken to be actual
parallelepipeds in two dimensions but not higher dimensions). The eigenvalues are
A" = (1 + 5 1/ 2 )/2 with eigenvector v" = (2,5 1/2 - 1) and A. = (1 - 5 1/ 2 )/2 with
eigenvector v' = (2, _5 1/ 2 - 1). Note that Au + A. = 1 = tr(A) > O. Thus A" =
tr(A) - A., and the fact that the trace of A is a positive integer insures that Au > O.
Then A"A. = det(A) = -1 < 0, 80 this insures that A. < O. Also, v" has positive slope
and v' has negative slope.
To form the rectangles for A, we look in the covering space, R2. From the origin
and other lattice points take the part of the unstable manifold of this point in R2 that
crosses the fundamental domain above and to the right of the lattice point. See Figure
5.1. Next, extend the stable manifold from the lattice point downward to the point a
where it hits the part of the unstable line segment drawn above. Similarly, extend the
stable manifold upward from a lattice point to the point b where it hits the part of the
unstable manifold drawn above. Finally, extend the unstable manifold to the point c
where it hits the line segment [a, bl. in the stable manifold. These line segments, [a, bl.
in W'(O) and [0, cl" in W"(O) (and their translates in R2), define two rectangles RI
and R2 in T2. See Figure 5.1.

FIGURE 5.1. Rectangles for Example 5.2

To find the images of the rectangles, we first consider the images of the points a, b,
and c: fA(a) = b, fA(b) = c, and fA(C) E [0, bl., where [x,yl. is a line segment in the
stable manifold from x to y. See Figure 5.1. Using these images, it follows that

fA(Rd crosses R} and R 2 ,


f A(R2 ) crosses R I •

See Figure 5.2. The pair of rectangles {R}, R 2 } have the properties of a Markov partition
for fA: (i) the collection of rectangles covers y2, (ii) the interiors of R} and R2 are
7.S.1 MARKOV PARTITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS 281

FIGURE 5.2. Images of Rectangles for Example 5.2

disjoint, and (iii) if IA(int(~» n int(Rj ) # 0, then IA(~) reaches all the way across
R j in the unstable direction and does not cross the edges of Rj is the stable direction.
(There is a fourth condition which we only discuss implicitly below in terms of the
semi-conjugacy.) We give the general definition below.
We define a transition matrix which indicates which itineraries for the orbit of a
point are allowable: for a transition from rectangle ~ to R j to be allowable, it must
be possible for an orbit of a point to pass from the interior of ~ to the interior of R j •
(We disregard the fact that the image of the boundary of R'l hits the boundary of R'l')
In this example the transition matrix is given by

B=G ~).
Notice that this transition matrix B is the same matrix as the original matrix A which
induced the toral automorphism. The shift space for B is the two sided subshift of finite
type
E8 = {s : Z -+ {I,2} : b•••• +1 = I}
with shift map (T8 = (TIE8'
To define the symbolic dynamics, we can not get a continuous map (conjugacy or
semiconjugacy) h from T2 to E8 because T'l is connected and E8 is a totally dis-
connected Cantor set. Also for a point p E 8(~) there are at least two choices of
rectangles to which p belongs. Therefore, there is no way to assign a unique symbol
sequence to points on the boundary of a rectangle. Instead, we define a map going the
other direction, h: E8 -+ T2. We want h to be a semiconjugacy (continuous, onto, and
IA 0 h = h 0 (T8). To do this we define h : E8 -+ y2 by

h(s) = n n l ..
00

n-O
cI(
n

j_-n
j(int(R.J »).

We take the images of the interiors because RI n IA'(R2) does not always equal
cI(int(RI) n IAI (int(R2» but can have extra points whose images are on the boundary
282 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

of R 2 . (We must put up with the annoyance to be able to use fewer rectangles.) Using
the general theory, Theorem 5.3 proves that this h is a semi-conjugacy. In fact, it proves
that h is at most four to one.

In order to give the precise definitions of rectangle and Markov partition, it is neces-
sary to indicate what we mean by the component of a stable or unstable manifold for
a point in a rectangle. As we have done before, we use the notation of comp.(S} to be
the connected component of the set S containing the point z. We think of W"(z, R} as
equal to comp.(RnW"(z}} for (1 = u,s and Rone of the rectangles (ifit is connected).
However, this definition does not quite work, because even in the rectangles for Example
5.2 there is a difficulty: in '1'2, RI touches itself along the projection of the line segment
from 0 to c, lI'([O,cl.}. When the total ambient manifold is a torus, a better definition
of the stable and unstable manifolds in a rectangle uses the covering space R2 as follows.
Let R he one lift of a rectangle R in '1'2 to a rectangle in )R2, so 11' : R -+ R is onto and
one to one in the interior. Let z be a lift of z, z E Rand lI'(z} = z. For (1 = U, s, define

W" (z, R) = lI'(W" (z) n R).

Note in Example 5.2, for z = 0 and rectangle R\, there are two choices for the lift RI
which touches the origin 0 in R2. (There is one choice above and to the right of 0 and
one below and to the left.) Making either of these choices, W"(O, Rd = lI'(W"(O, RI »
is a proper subset of compo(RI n W"(O}}. In fact, compo(R I n W"(O)} is the union of
the two choices for W"(O, Rd.
We now use the motivation of the rectangles defined above for the specific example
to give a general definition of both a rectangle and a Markov partition.
Definition. For a hyperbolic toral automorphism on the n-torus, 'lr', we proceed as
follows. Let R be a subset of 'lr' and z E R. Let R is a lift of R to Rn and z E R be
a lift of z, i.e., 11' : R c Rn -+ R is a homeomorphism, and 1I'(z} = z. If R is connected
then R should be taken to be connected; if R is not connected, then care must be
taken to choose the points in (1I')-I(R) in a reasonable manner, e.g. R should be in one
fundamental region of 11' : Rn -+ 'lr'. For (1 = U, s, let

W"(z, R) = lI'(W"(z) n R).

We do not give a completely precise definition for the general case of a hyperbolic
invariant set A. An isolated hyperbolic invariant set has a property called a local product
structure provided for t > 0 small enough, there is a () > 0 such that if d(x,y} < () for
x.y E A, then W.U(x} n W:(y) is a single point in A. Let A be a hyperbolic invariant
set with a local product structure, let R be a subset of A that has diameter less than 6,
and let z E R. Then
W"(z, R} == R n W:(z)
using the local stable and unstable manifolds of size t. This general case was considered
by Bowen (1970a, 1975). Also see Section 9.6.
Definition. Let f be a diffeomorphism with a hyperbolic invariant set with a local
product structure. (This includes the case where f is a hyperbolic toral automorphism.)
A nonempty set R of T" (or of A) is a (proper) rectangle provided
(i) R = cl(int(R}} (where the interior is relative to A) so that it is closed, and
(ii) p, q E R implies that W'(p, R} n WU(q, R) is exactly one point, and this point is
in R. If we are considering a hyperbolic tOrai automorphism, then the same lift
must be used for R to determine both W'(p, R) and W"(q, R).
7.5.1 MARKOV PAIUITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS 283

REMARK 5.2. In his general definition, Bowen defines W:(p) n W:'(q) == [p,q). He
then demands that for p, q E R that [p, q) is exactly one point, and that this point
is in R. Note, if we use Bowen's definition then RI is not a rectangle in Example 5.2
because there are points p and q in RI near 0, for which W.·(p) n W:'(q) is in R2 and
not in R I • Using the fact that the manifold is a torus and our definition ofthe subsets
of the stable and unstable manifolds using lifts, the sets RI and R2 given in the above
example are indeed rectangles.
Below, we define a collection of rectangles (a Markov partition) which have the prop-
erties needed to use them to define symbolic dynamics. The definitions use the notion
of the interior and boundary of a rectangle. A point pER is a boundary point 01 R if
arbitrarily near to p there is a point q in A such that q rt R. (This is the usual pointset
boundary of a subset.) If p is a boundary point of R, it follows for such q, that either
W'(p) n WU(q) or W'(q) n WU(p) is not in R. Let 8(R) be the set of all boundary
point.s of R, and the interior of R be the complement of 8(R) in R, int(R) = R \ 8(R).
Definition. Assume that I : M -+ M is a diffeomorphism which has an isolated
hyperbolic invariant set A with a local product structure. (This includes the case where
I is a hyperbolic toral automorphism with A = M.) A Markov partition for I is a finite
collection of rectangles, 'R = {R j } J= I' that satisfies the following four conditions. (All
interiors are taken relative to A.)
(i) The collection of rectangles cover A, A = U;"=1 R j •
(ii) If i ~ j then int(R;) n int(R j ) = 0 (so int(R;) n Rj = 0).
(iii) If z E int(R;) and I(z) E int(R j ) then

I(WU(z, R;» :) WU(f(z), R j ) and


I(W'(z, R;» c W'(f(z), R j ).

(iv) (The rectangles are small enough.) If z E int(R;) n r 1 (int(Rj » then

int(Rj)n/(WU(z,int(R;») = WU(j(z),int(Rj » and


int(R;) n r l (W'(f(z), int(R j ») = W'(z, int(R;»

where WO' (Zl , int( R,,» = WO' (Zl , int( R,,» n int( R,,) for = 'U, s, any point Z/,
(1 and
rectangle R".
Definition. Once we have a Markov partition. we want to set up the symbolic dynamics
of the subshift of finite type by means of a transition matrix. Given a Markov partition
'R = {R j }j!.I' the transition matrix B = (b;j) is defined by

b.. = {I if int(f(R;» n int(Rj ) ~ 0


'] 0 if int(f(R;» n int(Rj ) = 0.

The shift space lor B Is defined as

E8 = {s: Z -+ {I, ...• m} : b"' H1 = I}.


Letting (1 be the shift map on the full m-shift, Em = {I, ...• m}z. define 0'8 = 0'IE8 :
E8 -+ E8·
284 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Example 5.3. For the geometric horseshoe, let R.; = Hi n A for i = 1,2. These two
rectangles form a Markov partition with transition matrix (! !).
For the hyperbolic invariant set created for a homoclinic point, the sets R.; = A n Ai
form a Markov partition with transition matrix B given in the proof of Theorem 4.4(b).
REMARK 5.3. Notice that we do not demand that a rectangle be connected, although
the examples we give for hyperbolic toral automorphisms are connected. There are
examples where a rectangle has countably many components even for a Markov partition
of a total space which is connected. In general, for a Markov partition of a hyperbolic
invariant set, the total space is often not connected or even locally connected, so a
rectangle certainly could not be connected in this case.
REMARK 5.4. Let 8(R.;) be the boundary of R.; relative to A. Conditions (i) and (ii) in
the definition of a Markov partition imply that 8(R;) = {p E R, : pER) for some j ."
i}. This holds because clearly int(Ri) n {p E R.; : p E R j for some j ." i} = 0, so
8(Ri) :::J {p E R; : pER] for some j ." i}. Next, if p E 8(Ri ), then there are qk E Rj>
with jk ." i and qk converging to p. Because there are a finite number of rectangles, by
taking a subsequence we can take all the jk = j to be the same. Because R j is closed
it follows that p E Rj • This proves that 8(R;) C {p E R; : p E R j for some j ." i}.
REMARK 5.5. Condition (iii) in the definition of a Markov partition insures that if the
image of a rectangle hits the interior of another rectangle, then it goes all the way across
in the unstable direction and is a subset in the stable direction (goes all the way across
in the stable direction when looking at the inverse). Note that if a point z is on the
boundary of a rectangle R.;, then the image of R.; can abut on another rectangle R j
without even going into the interior of rectangle R j • (Thus the condition (iii) does not
necessarily hold for the points on the boundary.)
REMARK 5.6. Condition (iv) is not included in Bowen's definition because he only used
small rectangles. It is added to our list to make the point determined by a sequence of
rectangles allowed by the transition matrix well defined. This condition prohibits the
image of a rectangle R.; from crossing a rectangle R j twice. Note that it does allow the
image to intersect the boundary a second time. (See f(Rd and R2 in Figure 5.2.)
We could strength Condition (iv) to the following assumption:
(iv)' for z E int(R;) n f-l(int(R j »),

Rj nf(W"(z,R.;») = W"(f(z),R j ) and


R.; n r 1 (W"(f(z), R j ») = W'(z, R.;).

This condition does not allow the image of a rectangle R.; to cross the rectangle R j once
and then intersect the boundary a second time. Therefore the partition constructed
in Exanlple 5.2 satisfies assumption (iv) but not assumption (ivY. The advantage of
assumption (iv)' over (iv) is that the definition of the conjugacy in Theorem 5.3 without
taking interiors and closures. See Remark 5.10.
For some purposes, people allow the image of a rectangle to cross more than one time.
If multiple crossings are allowed, then (1) Condition (iv) is not included in the definition
and (2) the transition matrix must be allowed to have integer entries which are larger
than one, i.e., we get an adjacency matrix as defined in Section 7.3.1 on subshifts for
matrices with nonnegative integer entries. More precisely, assume there is a partition
by rectangles {R.;}:'..l which satisfies conditions (i-iii) for a Markov partition but not
necessarily condition (iv). To such a partition, we can associate an adjacency matrix
A = (aij) where the entry ail equals the number of times that the image !(R i ) crosses
7.5.1 MARKOV PARTITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS 285

the rectangle R J • Thus if aij = 2, then f(R;) crosses R j twice. We do not pursue this
connection. See Franks (1982).
REMARK 5.7. Adler and Weiss (1970) gave a method of constructing simple Markov
partitions for hyperbolic toral automorphisms on T2. Assume A is a 2 x 2 adjacency
matrix with all positive entries and which induces a hyperbolic toral automorphism on
y2. Then, there is always has a partition by two rectangles, {Rb R2}, such that (1)
the partition satisfies all the properties of a Markov partition except (iv), and (2) the
image f(R;) has aij geometric crossings of R j . The recent theses by Snavely (1990) and
Rykken (1993) give more details on constructing such a Markov partition.
REMARK 5.8. It should be noted however that even for Markov partitions for hyperbolic
toral automorphisms in T" with n ~ 3, the boundaries of the rectangles are not smooth.
Thus the "rectangles" are much different than the simple two dimensional example leads
one to believe. See Bowen (1978b).
REMARK 5.9. Bowen (1970a) proved that any hyperbolic invariant set with a local
product structure has a Markov partition. We prove this result in Section 9.6. In
this chapter, we restrict ourselves to finding Markov partitions for hyperbolic toral
automorphisms on y2 and the solenoid which is defined in Section 7.7.
We can now state the main result.
Theorem 5.3. Let 'R = {Rj}j=l be a Markov partition for a hyperbolic toral auto-
morpllism on T2. Let (EB,O'B) be tile shift space and h: EB -+ y2 be defined by

n cl( n rj(int(R.;»).
00 n
h(s) =
n-O j--n

Then h is a finite to one semiconjugacy from O'B to f. In fact h is at most m 2 to one


where m is the number of rectangles in the partition.
REMARK 5.10. If we used assumption (iv)' given in Remark 5.6 above, then we could
just use the intersection of the images f-j(R. J ) to define h,

n rj(R.
00

h(s) = J ).

j=-oo

This latter intersection is usually used to define the conjugacy. The problem is that
f(int(R;» n int(Rj ) can be nonempty and f(Ri) abut on the boundary of R j at points
for which there are no nearby interior points, so

cl (J(int(R;» n int(Rj »i: f(R;) n R j •


See Example 5.2. We allow such intersections on the boundary in order to find Markov
partitions with fewer rectangles. This forces us to use this slightly more complicated
definition of h given above.
REMARK 5.11. This theorem is used in Section VIII.1.2 to prove that the topological
entropy of FA can be calculated by the largest eigenvalue of B.
PROOF. By condition (iv), c1(int(R•• ) n f- 1 (int(R'.+I))) is a nonempty subrectangle
that reaches all the way across in the stable direction. By induction,
k+i
cl ( n rj(int(R.;»)
j-"
286 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

is a nonempty subrectangle that reaches all the way across in the stable direction for any
k E Z and i E N. The width of this set in the unstable direction decreases exponentially
at the rate given by the inverse of the minimum expansion constant. Thus

n (n
00 n
cl ri(int(R.;)) = W·(p .. R. o)
n-O j-O

for some P. E R.o ' Similarly,

n n
-00

n-O
cl(
0

j=n
ri(int(R.;))) = WU(Pu,R. o)

for some Pu E R.o . Therefore,

n n
00

n=-oo
cl(
n

)=-n
ri(int(R.;))) = W·(p.,R'o)nWU(pu,R. o)

is a unique point p = h(s). This shows that h is a well defined map.


By arguments like those used for the horseshoe, h is continuous, onto, and a semi-
conjugacy.
If Ji (p) E int( R.;) for all j, then h -I (p) is a unique symbol sequence, s, because
Ji(p) rI Rk for k '" si' Thus, h is one to one on the residual subset (in the sense of
Baire category)

Next we show that h is at most m 2 to one, where m is the number of partitions. Let
p = h(s). As we showed above we only have to worry if J"(p) is on the boundary of
some rectangle Ri .
We want to distinguish the boundary points of a rectangle R which are on the edge
of an unstable manifold in the rectangle, WU(z, R), and those which are on the edge of
a stable manifold, W'(z, R). Let

8'(R) = {x E 8(R) : x rI int(WU(x, R))} and


8U (R) = {x E 8(R) : x ¢ int(W·(x,R))}.

Here int(WU(x, R)) is the interior relative to a compact part of the manifold WU(x, R).
Similarly for int(W'(x, R)). Then 8'(R) is the union of stable manifolds W'(z, R), and
au (R) is the union of such unstable manifolds.
If J"(p) E 8'(R'n) then li(p) E 8·(R.;) for j ~ n. There are at most m choices
for Sn. (The reader can check that for a hyperbolic toral automorphism on y2, there
are at most 4 choices.) Since the transitions of interiors are unique, a choice for Sn
determines the choices of Sj for j ~ n. Similarly if J"' (p) E 8 U(R. n , ) then a choice for
Sn' determines the choices of Sj for j $ n'. Combining, there are at most m 2 choices as
claimed. 0
Example 5.4. As a second example of a hyperboh . ,'oill, let A2 =
(~ !). As we noted above, if A = (! ~), then A2 = A2. The rectangles Rl
and R2 from Example 5.2 are still rectangles for this matrix. However the image of RI
7.5.1 MARKOV PAIITITIONS FOR HYPERBOLIC TORAL AUTOMORPHISMS 287

by I A. crosses RI twice. This partition satisfies conditions (i)-(iii) and has A2 as an


adjacency matrix.
If we want to get a transition matrix with only O's and l's, we must subdivide the
rectangles (split symbols) by taking components of RI n I A. (R.): let the rectangle

Ria = comp (11'(0), cl(int(R.) n IA.(int(R.»»


= 1I'(RI n LA. (R.»
where LA. is the map on R2, and

These rectangles can also be formed by extending the unstable manifold of the origin
until it intersects the stable line segment [0, bl. at the point e = I(c). See Figures 5.3
and 5.1. The reader can check that

IA.(Rla ) crosses Ria, Rib and R2,


IA.(RlI,) crosses Ria, Rib and R 2 ,
IA.(R 2 ) crosses Rib and R2.
Thus the transition matrix is

B=(i ~ D·
This transition matrix has characteristic polynomial p(~) = _~(~2 - 3~ + I), and
eigenvalues 0, (~ .. )2, and (~.)2 where ~ .. , and ~. are the eigenvalues of A. Thus the
eigenvalues of B are those of A2 together with O. We do not prove it, but the eigenvalues
of the transition matrix are always the eigenvalues of the original matrix A together
with possibly 0 and/or roots of unity. See Snavely (1990).

FIGURE 5.3. Markov Partition for Example 5.4


288 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

7.5.2 The Zeta Function for Hyperbolic Toral


Automorphisms
As we mentioned in Section 3.3, the zeta function for a map f is defined by

with N} = #(Fix(fi». In that section, we prove that if (J A is the subshift of finite type
for the matrix A, then <"A (t) = Idet(l- tA)]-1 which is a rational function of t. The
zeta function has been proved to be a rational function for many more diffeomorphisms.
In this subsection. we prove this is true for a toral Anosov diffeomorphism. The fact
that the zeta function is rational means that the number of all the periodic points of all
periods can be determined by the finite number of invariants given by the coefficients
of the rational function.
Theorem 5.4. (a) Assume that f : T" --+ T" is an Anosc)V diffeomorphism. Then the
zeta function of f is rational.
( b) Assume that fA : 11'2 --+ 11'2 is a hyperbolic toral automorphism for the matrix A,
and Au the the unstable eigenvalue. Then

(1 - t)2
if Au > 0 and det{A) > 0,
det(l- tAl
(1 + t)2
if Au < 0 and det{A) > 0,
det(I + tAl
(I - t 2 )
if Au > 0 and det(A) < 0, and
det(l- tAl
(1 - t 2 )
if Au < 0 and det(A) < O.
det(l + tAl

REMARK 5.11. For the proof to be valid f could be any Anosov diffeomorphism which
induces the map A on the first homology group, HI (11'2, JR) = JR ffi JR.
REMARK 5.12. There are two types of proof of the rationality of the zeta function.
Manning (1971) gave a proof using Markov partitions. Also see Bowen (1978a). Man-
ning proved that the zeta function is a rational function whenever the chain recurrent
set has a hyperbolic structure. (Actually he uses the slightly weaker hypothesis that
the nonwandering set is hyperbolic and is equal to the closure of the periodic points.)
We discuss this type of proof in the exercises. See Exercise 7.30.
In this section, we use the proof using the Lefschetz index and the Lefschetz Fixed
Point Theorem. This proof goes back to Smale {1967} in the case of a toral Anosov
diffeomorphism. The zeta function for a diffeomorphism on a compact manifold was
later proved to be a rational function using this type of proof by Williams (1968) for an
attractor and by Guckenheimer (1970) whenever the chain recurrent set has a hyperbolic
structure. Finally, Fried (1987) proved that the zeta function is rational whenever the
diffeomorphism is expansive on the chain recurrent set. For more discussion of zeta
functions, see Franks (1982).
Before starting the proof, we need to define the· Lefschetz number of a fixed point
and state the Lefschetz Fixed Point Theorem. We only give these definitions in the case
when no periodic point has an eigenvalue which is a root of unity.
7.5.2 TilE ZETA FUNCTION FOR HYPERBOLIC TORAL AUTOMORPHISMS 289

Definition. Let 1 : M -- M be a C l diffeomorphism on a compact manifold M of


dimension n. If p is a fixed point, the Lelschetz index 01 p, Ip(f), is defined to be the
sign( det(I - Dip» (where the derivative and determinant are calculated using some
local coordinates). If p is a hyperbolic fixed point, then an easy analysis shows that

where u = dim(E~), and A = 1 provided D/pIE~ -- E~ preserves orientation and


A = -1 provided this linear map reverses orientation. (See Exercise 7.29.)
Let I.k be the induced map on the k·th homology group, Hk(M, R). The Lelschetz
number ollis defined as follows:
n

L(f) = ~)-I)ktr(f.k)
Ic=O

where n is the total dimension of M.

Theorem 5.5 (Lefschetz). Let I: M -- M be a C l diffeomorphism on a compact


manifold M. Then
L(f) = L Ip(f).
pEFix(f)

See Dold (1972).


To prove the main theorem using the Lefschetz number, we need make a connection
between the Lefschetz numbers of the iterates of 1 and the induced map on homology.
We start by defining a zeta function using the Lefschetz numbers of the iterates of I
which can be related to the induced map on homology by means of the Lefschetz Fixed
Point Theorem.
Definition. Let I : M -- M be a map for which none of the periodic points have
eigenvalues which are roots of unity. The homology zeta junction, Z,(t), is defined 88
follows:

where K j = L(fi).
We first relate the homology zeta function with the regular zeta function.
Proposition 5.6. Let I : T" -- 1'" be a C I Anosov toral diffeomorphism and u =
dim(E~).If D/IE" preserves orientation, then

If DIllE" reverses orientation, then

REMARK 5.13. For a Anosov diffeomorphism on 1"', the bundle

u {p}
pETn
x E~
290 VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS

is oriented, and the derivative

either preserves orientation for all points p or reverses orientation for all points p.
Therefore the statement of the proposition takes care of all cases. This is the only
use we make of the manifold being a torus except for the case of the two dimension
hyperbolic toral automorphism.
PROOF. Remember that K j = L(fj) and Nj = # Fix(fj). If D/IEu preserves ori-
entation, then the indices of all the periodic points are (_l)U, and K j = (-l)UNj or
N j = (-l)UKj . Therefore,

Similarly, if D !lEU reverses the orientation, then D P lEU reverses the orientation for
j odd and preserves the orientation for j even. Therefore the index of any fixed point
of Ii is (-l}U(-lF, N j = (-l)U(-l)jKj, and

o
By the above proposition, to prove the theorem it is enough to prove that the homol-
ogy zeta function is rational. The next proposition relates the homology zeta function
with the linear maps on the homology groups which are induced by I.
Proposition 5.1. Let I : M -+ M be a C 1 diffeomorphism on a n dimensional manifold
M. for which none of the periodic points have eigenvalues which are roots of unity. Then

Zf(t) = fI
10=0
det(I - tl.k)<-t)"+I.

The proof of the proposition uses the following lemma


Lemma 5.B. For an arbitrary matrix B (which we take as some I.k below).
7.5.2 THE ZETA FUNCTION FOR HYPERBOLIC TORAL AUTOMORPHISMS 281

PROOF. The infinite series '£';.1 Biti Ii is the formal power series for the function
- log(l - tB), so
ti .
exp (L "'7BJ)
00
= (/ - tB)-I.
j=1 J
Applying Liouville's Formula, we get

exp ( tr ( L ti Bj)) = det ( exp ( L ti Bi))


00
"'7
00
"'7
i=1 J i=1 J
= det ((l- tB)-I)
= (det(l- tB)) -1.

o
PROOF OF PROPOSITION 5.7. By the Lefschetz Fixed Point Theorem,
n
Kj = LUi) = L(-I)"tr(f!,,).
"-0
Substituting this expression for K j in the definition ofthe homology zeta function Z,(t)
as follows:

n
= II det(l- tf!,Y-I)"+J,
10:=0

where the last equality uses Lemma 5.8. o


PROOF OF THEOREM 5.4. Since the homology zeta function is rational by Proposition
5.7, the usual zeta function is rational by Proposition 5.6.
For a hyperbolic toral automorphism induced by A on T2, f.o = 1, f.l = A, and
f.2 = sign(det(A)) . 1. Therefore if det(A) > 0 then

Z ( ) = det(l- tA)
, t (1 _ t)2

by Proposition 5.7. Similarly if if det(A) < 0 then

Z,(t) = (~e~(:)0 t:~)


det(l- tA)
1 - t2
Since u = dim(E~) = 1, using the relationship between the homology zeta function and
the usual zeta function givpn in Proposition 5.6, we get the four cases as stated in the
theorem. 0
292 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

7.6 Attractors
There are various definitions of an attractor. The main difference involves which
points in a neighborhood of the attractor have to approach the set. We demand that
this holds for a whole neighborhood. There is also the question of whether the dynamics
are "indecomposahle" on the attractor itself. We demand that the map (or flow) is chain
transitive on the attractor but this requirement changes from author to author quite
drastically.
We write the definitions for a diffeomorphism I : M -+ M but only slight changes
are needed to apply for a flow.
Definitions. A compact region N c M is called a trapping region for I provided
I(N) C int(N). A set A is called an attracting set (or an attractor by the terminology
of Conley) provided there is a trapping region N such that A = nk>olk(N). A set
A is called an attractor provided it is an attracting set and fill. is chain transitive, so
A C R(f). (Sometimes we might want to assume that 1111. is topologically transitive.)
An invariant set A is called a chaotic attractor provided it is an attrador and I has
sensitive dependence on initial conditions on A. (Sometimes people require I to have a
positive Liapunov exponent on A instead of sensitive dependence. See Section 8.2 for
the definition of Liapunov exponents for a map in several dimensions. Compare with
the discussion on chaos in Chapter III.) Finally, an attractor with a hyperbolic structure
is called a hyperbolic attractor.
REMARK 6.1. Some authors do not require A attracts a whole neighborhood but only
it attracts a set of positive measure in order to call it an attractor, however this leads to
a very different concept which I might call the core 01 the limit set. See Milnor (1985)
for further discussion along these lines. Also See Section 9.1 for a discussion of Conley's
theory.
It can easily be checked that an attracting set is both negatively and positively
invariant and closed. (These properties hold because it is the intersection of the nested
sets Ik(N).)
It is useful to give a definition of an attracting set which is given more in terms of
properties of the invariant set A, and not determined by intersections of a trapping set.
The following example shows that for A to be an attracting set, it is not enough that
there is one neighborhood V of A such that w(p) C A for all p E V. Also see the
example of Vinograd given in Example V.5.3.
Example 6.1. Consider the equations

iJ = e2 (211" - ef
r= r(1 - r2)

where e is an angular variable modulo 211". See Figure 6.1 for the phase portrait. In
these polar coordinates, let B = {(O,I)} and V = {(e, r) : Ir - 11 < fl. Then, V is a
trapping set which is a neighborhood of Band w(p) = B for all p E V. However, the
attracting set and attractor for V is the circle A = {( e, 1) : 0 ~ e ~ 211"}. (This is an
example where the chain recurrent set is larger than the limit set.) (The set B is an
attractor in the sense of Milnor.)
Taking into consideration the above example, a compact attracting set can be char-
acterized by w-limit sets of points in arbitrarily small neighborhoods as given in the
following proposition.
7.6 A'ITRACTORS 293

FIGURE 6.1. Phase Portrait of Example 6.1

Proposition 6.1. Let A be a compact invariant set in a finite dimensional manifold.


Then A is an attracting set if and only if there are arbitrarily small neighborhoods V
of A such that (i) V is positively invariant and {ii} w(p} C A for all p E V.
We leave the proof of this result to the exercises. (See Exercise 7.31.) We also have
the following result about the unstable manifolds for points in an attracting sets.
Theorem 6.2. Let A be an attracting set for /. Assume either that (i) pEA is
a hyperbolic periodic point or (ii) A has a hyperbolic structure and pEA. Then
WU(p,f) C A.
PROOF. The set A is contained in the interior of a trapping region N so there is an
f> 0 such that W.u(fk(p» C N for all k E Z. Therefore for any k ~ 0,

i~O

Taking the intersection for k ~ 0 of /k(N}, we get that WU(p} c A. o


In the next few sections, we give examples of attractors which are locally the cross
product of the unstable manifold by a Cantor set.
It is often useful to talk about the dimension of an attractor or other hyperbolic
invariant set. Since such sets are often not manifolds we need another concept. What
we use is the topological dimension, which is always an integer. In other contexts,
the fractal or Hausdorff dimension is useful, which is a non-negative real number. We
postpone discussion of this concept until Section 8.4.
Definition. The definition of topological dimension is given inductively. A set A has
topological dimension zero provided for each point pEA, there is an arbitrarily small
neighborhood U of p such that 8(U} n A = 0. (It is not always possible to take U as
a ball as the example of Antoine's necklace shows.) Then inductively, a set A is said
to have dimension n > 0 provided for each point pEA, there is an arbitrarily small
neighborhood U of p such that 8(U}nA has dimension n -1. See Hurewicz and Wallman
(1941) or Edgar (1990) for a more complete discussion of topological dimension.
Definition. Using the concept of topological dimension, we say that a hyperbolic at-
tractor A is an expanding attractor provided the topological dimension of A is equal
to the dimension of the unstable splitting. (Since WU(p) C A, we always have that
the topological dimension is greater than or equal to the dimension of the unstable
splitting.} See Williams (1967, 1974).
294 VII. EXAMPLES OF HYPERBOLIC SETS AND A'ITRACTORS

7.7 The Solenoid Attractor


In this section, we introduce an example of a hyperbolic attractor which can be given
by a specific map, the solenoid. The solenoid was known to people studying topology,
see Hocking and Young (1961), but was introduced as an example in Dynamical Systems
by Smale (1967). Using this example as a model, R. Williams developed the theory of
one dimensional attractors, of which we see further examples in the sections on the
Plykin attractors and the DA-attractor. See Williams (1967, 1974).
Let D2 = {z E C : Izl ::; I}. We think of D2 as a subset of R2 even though we use
complex notation for a point (and complex multiplication). Let Sl = {t E R modulo I}
be the circle. The neighborhood of the attractor is the solid torus given by N = Sl X D2.
We define the embedding I of N into itself by means of a map 9 on SI, 9 : SI -+ SI,
given by g(t) = 2t mod 1. (The circle can also be thought of as a subset of C. Using
complex notation, g(z) = z2 for Izl = 1.) The map 9 is called the doubling map (or
squaring map). Using g, the embedding I : N -+ N (into N, not onto) is defined by

I(t, z) = (g(t), 41 z + 2e
1 2 I'
.. ').
(Below, we callI a diffeomorphism even though it is not onto N.) The constants 1/4
and 1/2 in the definition are somewhat arbitrary as long as the first is small enough to
make lone to one and the combination ofthe two insures that I(N) C N: 1/2 -1/4 > 0
and 1/2 + 1/4 < 1. Geometrically, the map can be described as stretching the solid
torus out to be twice as long in the SI direction and wrapping it twice around the SI
direction. The image is thinner across in the D2 direction by a factor of 1/4. See Figure
7.1.

FIGURE 7.1. Image of Neighborhood N inside of N

Let
D(t) = {t} x D2
be the fiber with fixed 'angle' t. The map I is a bundle map which takes a fiber D(t)
into the fiber D(2t) and is a contraction by a factor of 1/4 on each such fiber. We also
use the notation
D([t),t2]) = U{D(t) : t E [tl,t2]}'
Theorem 1.1. Let A = n~o Ik(N). Then A is a hyperbolic expanding attractor for
I of topological dimension one, called the solenoid.
We give various other properties of the solenoid throughout the section as we prove
the theorem and further analyze this example. First note that N is a trapping region
for I, so A is an attracting set. We start with the following proposition.
7.7 THE SOLENOID ATTRACTOR 295

••
FIGURE 7.2. f(N) n D(to) for to = 0 and 0 < to < 1/2

Proposition 1.2. For each fixed to, An D(to) is a Cantor set.


PROOF. If f(t, z) E D(to), then g(t) = to mod 1. so t is to/2 or to/2 + 1/2. The image
f(N) n D(to) is shown in Figure 7.2 for to = 0 /Lad 0 < to < 1/2.
Notice that

f(D(~))=(to,~D2+{~e"IOi}) and

f(D(t o ; 1)) = (to, ~D2 _ {~e"loi})

since e"'oi+'" = -e"lo'. Thus the two images are in the same fiber, but are reflections
of each other through the origin in D(to). They are disjoint because 1/2 - 1/4 > O.
Both images are in D(to) because 1/2 + 1/4 < 1, so f(N) c N. Now let

n fi(N)
k
Nk == = fk(N).
i=O

Claim 1.3. For all t E Sl, the set Nk nD(t) is the union of2 k disks of radius (1/4)k.
PROOF. The claim is trivially true for k = 0, and for k = I, it follows from the image
of N as discussed above. See Figures 7.1 and 7.2. Next,

Nk n D(t) = f(Nk- 1 n D(t/2») u f(Nk- 1 n D(t/2 + 1/2»).


By induction, N k- J nD(t/2) and N k- I nD(t/2+1/2) are each the unionof2 k - 1 disks
of radius (1/4)k-l. It follows from the fact that f is a contraction on fibers by a factor
of 1/4 that f(Nk- 1 n D(t/2» and f(Nk- 1 n D(t/2 + 1/2)) are each the union of 2k - 1
disks of radius (1/4)k. Together, they are the union of 2k disks of the stated radius. 0
Now A = n';oJi(N) = n;'oNj, so An D(to) is a Cantor set as in the earlier
example of a horseshoe. This proves Proposition 7.2. 0
Next, we give several other topological properties of A.
Proposition 1.4. The set A has the following properties:
(a) A is connected,
(b) A is not locally connected,
(c) A is not path connected, and
(d) the topological dimension of A is one.
PROOF. (a) The sets Nk are compact, connected, and nested so their intersection A is
connected. (See Exercise 5.ll.)
296 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

(b) For 0 < t2 - t\ < 1, D([tl' t2]) n Nle is the union of 21e twisted tubes. For any
neighborhood U of a point p in A, there is a choice of tl and t2 and large k such that
U contains two of these tubes. Since each tube contains some points of A, this shows
that A is not locally connected.
(c) Fix p = (to, zo) E A. By induction, for k ~ 1, there is a point qle E An D(to)
such that (i) for k ~ 2, qle is in the same component of N Ie - 1 n D(to) as qle-I, and (ii)
any path from p to qle in N" must go around SI at least 2" -I times. (This specifies the
component of N" n D(to) in which qle lies.) By construction, the sequence {q" }f-l is
Cauchy. Let q be the limit point of the qle, which is a point of A since A is closed. We
claim that there is no continuous path in A from p to q. This limit point q is in the
same component of N" n D(to) as q", and any path from p to q in N/c must intersect
D(to) at least 2"-1 times, going around SI between each of these intersections. Thus, if
there were a continuous path in A from p to q it would have to interest D(to) infinitely
many times (at least 21e - 1 times for arbitrarily large k), going around SI between each
of these intersections. A continuous path can not go around SI an infinite number of
times, so this contradicts the assumption that there is a continuous path in A from p
to q, and proves that A is not path connected.
(d) In each fiber An D(to) is totally disconnected, so has topological dimension zero.
In a segment of N, An D([tl' t21> is homeomorphic to the product of An D(t.) and
the interval [tl' t2l, so has topological dimension one. This completes the proof of the
proposition. 0
Next, we state several properties of Ion A which together imply Theorem 7.1.
Proposition 7.5. The map I restricted to A has the following properties:
(a) the periodic points of I are dense in A,
(b) I is topologically transitive on A, and
(c) I has a hyperbolic structure on A.

PROOF. As a first step in the proof of part (a), we give the following lemma about g.

Lemma 7.6. The periodic points of 9 are dense in SI.


PROOF. The point to is fixed by gle, gle(to) = to, if and only if 2kto = to + j for some
integer j. Thus to = j/(2/c - 1). For k fixed, let tie.; = j/(2/c - 1) for 0 ~ j ~ 2k - 2.
These points are evenly spaced on the circle with separation 1/(2k - 1). As k goes to
infinity, it follows that the set of all periodic points is dense. 0
Now if gk(to) = to, 1"(D(to» c D(to). The set D(to) is a disk, and Ik takes it into
itself with a contraction factor of 4-", so lie has a fixed point in D(to). By the lemma,
it follows that the fibers with a periodic point for I are dense in the set of all fibers.
We want to show the periodic points are actually dense in A. Take a point pEA
and a neighborhood U of p. There is a choice of k and tl, t2 such that the tube,
Ik(D([tl,t2]) cU. We showed above that I has a periodic point in D([tl,t2]) and so
r
in (D( [t I, t2]) cU. This proves the first part ofthe proposition.
(b) We need to verify on A the hypothesis of the Birkhoff Transitivity Theorem,
Theorem 2.1. Let U and V be two open subsets of A. Thus there are open set in SI x D2,
U ' and V', such that U ' n 11.= U and V' n 11.= V. There exist kEN, 0 < t2 - tl < 1,
and 0 < t; - tl < 1 such that lle(D([tl, t2]) C U' and lle(D([t1,t;])) c V'. Then there
is a j > 0 such that Ji(D([tl,t2]) n D([tl,t;D ,,0. In fact, since we can take j such
that Ij(D([tl,t2])) goes all the way across D([tl,t~]), we can require that
7.7 THE SOLENOID ATTRACTOR 297

Thus

Ij(fk(D([tl' t2]» n A) n Ik(D([t;, t~])) n A # 0 and


P(U) n V = li(U' n A) n [V' n AJ # 0.
By the Birkhoff Transitivity Theorem, Theorem 2.1, IIA is transitive, and we have
proved part (b) of the proposition.
(c) In terms of the coordinates on Sl x D'l,

where 12 is the identity matrix on C or R2. Let E~ = {O} X R2. Then for (0, v) E E~,

DIp (~) = (lOv ), and

DI:(~)=(tv)
which goes to zero as k goes to infinity. Therefore, this is indeed the stable bundle at
each point pEA.
To find E~, it is necessary to use cones. Let

Step 1. DlpC: C C'J(p)'

PROOF. Let (~~) E C;;, and

Then,

Iv~1 = 21vlI = ~(4Ivll)


1 1
> 2(7r1vd + 21v1 1)
1 1
~ 2(7r1vd + 41v2 1)
~ ~lv~l.

This last inequality shows that (~~) E Cj(p)' completing the proof of the first step.
o
298 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRAGrORS

n Df7-.(p)
00

Step 2. The intersection C;_.(p) = IE: is a line in the tangent space.


k-O
PROOF. By Step 1, the finite intersections
k

n
j-=O
Df;-J(p) CJ-;(p) = Df7-·(p) CJ-.(p)
are nested. To prove that the intersection is a line, we prove that the angle between
two vectors in these finite intersections goes to zero as k goes to infinity. Let

VI) ' (WI)


( V2 W2 "
E C,_.(p)

with Vt. WI > 0,

(:D = D fj •(p) ( :~), and (:D = Dfj-.(p) (:~) .


Then
Iv~ _ w~ I= l7rie 2 .. li vl + i V 2_ 7rie2"li wl + iW21
VI WI 2vI 2wI

= !IV2 _ w21,
8 VI WI
which shows there is a contraction on the difference of the slopes. By induction on k,
IV2 _ W21=(_)kIV2_
k 1 k
21, W
v~ w~ 8 VI WI
which goes to zero as k goes to infinity. Since the difference of slopes goes to zero, the
cones converge to a line. 0
Step 3. The derillBtive of f restricted to IE~, DfpllE~, is an expansion.

PROOF. Since IE~ is a graph over TISI x {OJ, let I ( :~) I. = Ivd. This is a norm on
the cone. Then,

Thus D fp is an expansion on vectors in this bundle in terms of this norm, and hence
the standard norm. This completes the proof of Step 3, part (c) of the proposition, and
the proposition. 0
REMARK 7.1. Given the bundle of vectors which expand and contract, it is easy to see
that W"(p) :) {t} X D2 if p = (t,z). The unstable manifold, W"(p), winds around
through 1\.. Each W"(p) is an immersed line. (Since f is one to one and an expansion
on W"(p), it can not be a circle which is the only other one dimensional possibility.)
The unstable manifold W" (p) hits D( t) in a countable number of points. Since I\. n D( t)
is uncountable, there are many other points in I\.nD(t) which are not in WU(p). These
are points q for which there is no curve in I\. from p to q.
REMARK 7.2. As an exercise, we ask the reader to construct Markov partitions for f
and the doubling map g. See Exercises 7.28 and 7.29.
7.7.1 CONJUGACY OF THE SOLENOID TO AN INVERSE LIMIT 299

7.7.1 Conjugacy of the Solenoid to an Inverse Limit


Williams introd uced the idea of representing certain attractors (expanding attractors)
as inverse limits. See Williams (1967, 1974).
As before N is the natural numbers, {O, 1, 2, 3, ... }. Let g( t) = 2t mod 1 as before.
Let
E = {s E (SI)N : g(Bj+d = Bj}.
Define the shift map, a, on E by a(s) = t if

Bj_1 ifj~l
{
tj = g(BO) if j = O.
If SEE, then g(Bj+1) = 8j so Bj+1 E g-I(8;) is one of the two preimages of Bj. The
pair (E, a) is called the inverse limit 01 g.
A point p = (t, z) E A is determined by a sequence of descending disks in D(t) as
we saw above. These disks are in turn determined by the preimages of t by the map g.
When a map is expanding (like the one dimensional horseshoe) we use forward images
of a point p to determine p. Because 1 contracts on fibers, we use backward images.
Define h: A -+ (SI)N by h(p) = s where rj(p) E D(sj) with Bj E 8 1 for j = 0,1, ....
Theorem 7.7. The map h defined above is a conjugacy from 1 on A to the inverse
limit of g, a on E.
PROOF.

Step 1. heAl c E.
PROOF. Let h(p) = s. Then rj(p) E D(sj) and r j - l (p) E D(sj+1)' Therefore the
intersection I(D(sj+d) n D(s;) -F 0, so f(D(Bj+1)) C D(sj). Thus g(s;+1) = s; for all
j and sEE. 0
Step 2. h 0 f = a 0 h.
PROOF. For pEA, let h(p) = s and h(f(p» = t. f-(j+1l(f(p» = f-;(p) and so is in
both D(tj+d and D(Bj). Therefore tj+1 = Sj for all j ~ O. Similarly, f(p) is in both
D(to) and f(D(so» so to = g(Bo). This proves that a(s) = t as required. 0
Step 3. The map h is one to one.
PROOF. If h(p) = h(q) = s, then p,q E n~_ofj(D(sj)). This is a nested sequence of
disks whose radii go to zero. Therefore there is only one point in the intersection and
P=~ 0
Step 4. The map h is onto E.
PROOF. Take 8 E E. Then g(sj+l) = Sj so f(D(sj+1» C D(sj), and

fj(D(sj» c f j - I (D(s;-I» c ... C D(so).


Therefore n~=oJi(D(sj)) is a nested sequence of disks with nonempty intersection,
hence
n 1;(D(s;» -F 0.
00

j-O

If p is a point in this intersection, then h(p) = s. This completes the proof of the fourth
step and the theorem. 0
300 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

7.8 The DA Attractor


The next example of an attractor we consider is constructed by modifying a toral
Anosov diffeomorphism on the two dimensional torus. For this reason it is called the
Derived-from-Anosov-diffeomorphismor the DA-diffeomorphism. It was first introduced
by Smale (1967).
Let 9 : y2 -+ T2 be the Anosov diffeomorphism induced by the linear matrix (~ ~ ).
(Any other example with a fixed point would work as well.) Let Po be a fixed point of
g corresponding to 0 in R2. Let v" and v' be the unstable and stable eigenvectors of
the matrix and use coordinates UIV" + U2V' in a (relatively small) neighborhood, U, of
Po· Let TO > 0 be small enough so the ball of radius TO about Po is contained in U. Let
6(x) be a bump function of a single variable such that 0 ~ 6(x) ~ 1 for all x, and

0 for x ~ TO
6(x} = { 1
for x ~ To/2.

Consider the differential equations

UI = 0
U2 = u26(1(UI' U2)J).

Let cpt be the flow of these differential equations, CPt(UI,U2) = (UI,CP~(UI,U2»' Then
the support, SUpp(cpl - id} c U. Also the derivative of the flow at Po in terms of the
(uI,u2)-coordinates is

Define' = cpT 0 9 for a fixed T > 0 such that eTA. > 1 where A. is the stable eigenvalue.
The map' is called the DA-diffeomorphism. Note that in the (Ulo u2)-coordinates the
derivative of , at Po is

so Po is a source.
Theorem 8.1. The DA-dilfeomorphism , described above has nu) = {Po} U A where
Po is a fixed point source and A is an expanding attractor of topological dimension one.
The map' is transitive on A and the periodic points are dense in A.
PROOF. Because the neighborhood U can be taken arbitrarily small, , can be made
arbitrarily CO near g, but not C l near 9 since eT can not be arbitrarily small. Also note
that the flow cpl preserves each stable manifold of a point for g, W'(q, g), because of
the form of the differential equations. Therefore, , preserves each W' (q, g).
The new map' has three fixed points on W'(pO,g), Po and two new fixed points PI
and P2. This fact can be seen to be true because '(Po) = Po is a source and outside
U the slope of the graph of , on W' (q, g) is still less than one. Therefore there must
be a fixed point on each side of Po along W·(q,g). See Figure 8.1. We claim that both
PI and P2 are saddles. To see this, note that in U, D/q = (all 0), with all
a:n a22
= Au
7.8 THE DA ATTRACTOR 301

• u
..
FIGURE 8.1. Graph of IIW'(Po,g)

FIGURE 8.2. Image of Open Set V

for all q, and with 0 < a22 < 1 at PI and P2 because of the nature of the graph of
IIW'(Po,g) indicated in Figure 8.1.
Let V be a neighborhood of Po (not containing PI and P2) contained in U such that
(i) a22 > 1 for q E V (f is an expansion along E' in V), (ii) 0 < a22 < 1 for q ~ I(V)
(f is a contraction along E' outside of V), and (iii) I{v) ::> V. See Figure 8.2. (We
leave as an exercisp. the existence of such a neighborhood V. See Exercise 7.36.) Clearly,
V C WU(Po,J) so it is the local unstable manifold of Po and WU(Po,J) == ~o Ji(V).
Let N == '1'2 \ V. Then N is a trapping region because I{v) ::> V. Let A = ni_OP(N).
This is an attracting set, and A = T2 \ WU(Po,f). The unstable manifold of Po for I,
WU(po, f), is a "thickened" version of the unstable manifold for g. In Step 4 below, we
prove that WU(Po,f) is still dense in T2, so A has empty interior.
We proceed to prove Theorem 8.1 through a series of steps.
Step 1. The map I has a hyperbolic structure on A.
PROOF. In terms of the splitting E~(g) e E~(g), the derivative of I, Dlq = (aji), is
lower triangular in U and diagonal outside U (a12 = 0 everywhere and a21 = 0 outside
U). The unstable term all == Au > 1 everywhere and 0 < a22 < 1 outside fey) so on
A. Because of the form of the derivative, E~(f) = E~(g) is an invariant bundle and
every vector in this bundle is contracted by Dlq for q EA. Therefore this is the stable
bundle on A. Let C be a bound on la:l1l everywhere, define L == C(~u - ~.)-l and take
the cones
c:: = {(Vb V2) E E~(g) e E~(g) : IV21 ~ Llvll}·
Then It can be checked using the lower triangular nature of the derivative of I that the
302 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

cones are invariant and


00

E~(J) = n Dfl-;(qPJ-J(q)
j=o

is an invariant bundle on which the derivative is an expansion for points q E A. Thus


this gives the unstable bundle on A and so the hyperbolic splitting. 0
Step 2. For j = 1,2, PI,P2 E A and W"(Pj,f) C A.
PROOF. The fact that PI, P2 E A follows because PI, P2 ¢ V and they are fixed points,
so
00

PI,P2 ¢ UP(V) = 11'2 \ A.


j=O

The fact about the unstable manifolds we proved for general attracting sets. 0
Step 3. The stable manifolds of f satisfy the following:

W'(PII f) U W'(P2, f) = W'(PO,g) \ {Po}

and W'(q, f) = W'(q,g) forq ¢ W'(Po,g). Thus W'(q, f) is dense in 11'2 for all q E A.
PROOF. f(W'(q,g» = W'(J(q),g) and W'(q,g) is tangent to R'(J). Also for q E A,
I is a contraction on WI:"'(q, g). Therefore W1:"'(q, f) = W1:"'(q, g), and W'(q, f) c
W'(q,g). A line segment I in W'(q,g) which does not end in V (goes all the way
across V if it intersects it) is lengthened by I-I. Any line segment whose end stays in
V for all inverse images must be a subset of W'(pO,g) and have an end in ",,~c(PO,g).
Thus, if q ¢ W'(Po.g) then W'(q,f) = W'(q,g). Also, this implies W'(Pj'/) is one
component of W'(Po, g) \ {Po} for j = 1,2. The fact that the W'(q, f) are dense in 11'2
follows because these are lines with irrational slope. 0
Step 4. The unstable manifold of I at Po, W"(po, f), is an open dense set in 11'2.

PROOF. By construction, 11'2 = A u U~o Ii (V), so we need only prove that W" (Po, f)
accumulates on A. Let pEA, and Zp be an arbitrarily small neighborhood of P in
11'2. Let I = compp(W'(p, f) n Zp). As long as l-j(I) does not intersect I(V) it is
lengthened by I-I by a uniform amount. But there is a uniform bound on the length
of comp:a[W'(z, g) \ I(V)]. Therefore for large j,

l-i(I) n I(v) # 0,
rj(Zp) n I(v) # 0,
Zp n IHI(V) # 0, and
Zp n W"(Po, f) # 0.
Since Zp is an arbitrarily small neighborhood, it follows that W"(Po, f) is dense at p.
o
Step 5. For j = 1.2, W"(Pj'/) is dense in A.
PROOF. Let P E AnW'(q, g) where q has period k for g. By Step 4, P E c1(W"(po'/»\
W"(Po. f), so P E 8(W"(po,/». In the proof of Step 4, when rjk(I) intersects V, it
must cross W"(PI, f) U W"(P2, f). (It crosses from one side to the other.) Therefore,

[W"(PI. f) U W"(P2, f)] n rik(Zp) #0 and


[W"(PI,f) U W"(P2.f)] n Zp # 0
7.8.1 THE BRANCHED MANIFOLD 303

because the unstable manifolds are invariant by f. This shows that the union of the
two unstable manifolds WU(Pt, f) U W U(P2, f) is dense in A.
We have shown that the union of the two manifolds is dense in A, and we need to
show that each manifold is dense by itself. W"(Pt, f) is dense in T2 so it must intersect
WU(P2, f). Because these are tangent to the bundles E" and E" the intersections (which
are on A) are transverse. By the Inclination Lemma, it follows that W" (1'2, f) accu-
mulates on WU(PI, J), cl(W"(P2,f) ::> W"(PI, f), and cl(W"(Pl, J) = A. SimUarly,
cl(WU(PI,J) = A. 0
Step 6. The topological dimension of A is one.
PROOF. By Step 4, WU(Po, f) is dense in T2, so A has empty interion and must have
topological dimension at most one. The manifolds WU(Pj, f) for j = 1,2 are contained
in A, so it must have topological dimension at least one. 0
Step 1. For j = 1, 2, {q E A : q is 8 transverse homoclinic point for Pj} is dense in A.
PROOF. If x E A, Steps 3 and 5 imply that both WU(Pj, f) and W'(Pj, f) come
arbitrarily near x for j equal either 1 or 2. The existence of a hyperbolic structure in
Step 1, implies that WU(Pj,J) and W"(Pj,f) intersect transversally arbitrarily near x
for j equal either 1 or 2. 0
Step 8. The set A is transitive.
PROOF. This follows from Step 7 and the Birkhoff Transitivity Theorem, Theorem 2.1.
o
Step 9. The periodic points of f are dense in A.
PROOF. This follows from Step 7 and the horseshoe theorem for transverse homoclinic
points. 0
Together, all these steps prove the theorem. o
7.S.1 The Branched Manifold
A Markov partition for the Anosov automorphism 9 is given in Figure 8.3. The
map f pushes outward from Po in the stable direction. If we form equivalence classes
of points in comp.(W"(z,f) \ V) and collapse these to points we get the branched
manifold, K, indicated in Figure 8.4. This quotient space has the differential structure
of a one dimensional manifold except there are branch points. The fact that there is
a ct structure on the quotient space is reflected in the picture by the fact that the
three curves coming into a brancli point all have the same tangent line. There is a map
defined on the quotient space (the branched manifold), g. : K -+ K. See Williams
(1967) for the definition of a one dimensional branched manifold or Williams (1974) for
the definition in any dimension. This map is an expanding map (because we quotiented
out the contracting directions and left the expanding directions), and has the following
images:
g.(A) = B
g.(B) = BCB
g.(C) = CAC.
In fact, these line segments A, B, and C can be oriented so the map preserves the
orientation. This map takes the role of the doubling map for the solenoid. It can be
proved that f on A is topologically conjugate to the inverse limit of g. on K. See
Williams (1970a). We leave the details to the reader and references.
304 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

FIGURE 8.3. Markov Partition for DA-Diffeomorphism

FIGURE 8.4. Branched Manifold for the DA-Diffeomorphism

7.9 Plykin Attractors in the Plane


Let A be a hyperbolic attractor in the plane with trapping region N. Thus N must
be diffeomorphic to a disk with some (or no) holes removed. Plykin (1974) proved that
if A is not just a periodic orbit, then N must have at least three holes removed (four
holes on the two sphere 8 2 ).
Theorem 9.1. Let N C R2 be a trapping region for f. Assume the associated attract-
ing set, A = n:. o fk(N), has a uniform hyperbolic structure for which the expanding
bundle has dimension one, dim(E~) = 1 for pEA. (So A is not just the orbit of a
periodic sink.) Then N must have at least three holes.
REMARK 9.1. For the general proof see Plykin (1974). We give a sketch of a proof that
N must have at least two holes. Because of this theorem, any nontrivial attractor (not
a periodic orbit) in the plane or sphere is called a Plykin attractor.
PROOF. The hyperbolic splitting on A can be extended to a small neighborhood U of
A. (This extension is not hard if it is not assumed that the splitting is invariant off A.
It is possible to extend it so it is invariant but this is harder and we do not need this
property.) For large k, fk(N) C U, so there is a splitting on fk(N). The neighborhood
Jk(N) has the same topological type as N, so we can assume that the splitting is on
the entire trapping neighborhood N.
Assume that the extension of the bundle E; to all points pEN is orientable on N.
Then it is possible to take X(p) E E; that is a nonvanishing vector field. Take pEA.
ThE.'n thE.' intt'gral curve of p is one side of W"(p), W"(p)+. By the Poincar~ Bendlxson
TheorE.'m, W"(p)+ accumulates on a closed orbit..., for X (because w(p, X) has no fixed
points since X is nonvanishing). But w(p, X) c A because A is closed. Thus..., c A is a
7.9 PLYKIN ATTRACTORS IN THE PLANE

closed curve which is an unstable manifold. But unstable manifolds can not be closed
curves (they are immersed lines). This contradiction shows that the extension E; can
not be orientable on N.
If N has no holes (and so is a disk), then the extended bundle E" must be orientable
on N. The above argument shows that this is impossible, so N must have at least one
hole.
Next assume that N is a disk with one hole removed (an annular region). By the
above argument the extension E" must not be orientable on N. In this case, it is possible
to take a double cover N of N on which there is an orientable bundle E" which covers
EU on N. Again, N is an annular region. It is also possible to define a map Ion N
which covers f. But this leads to a contradiction as above, so N must have at least two
holes.
As stated above, Plykin has an argument that N can not have just two holes. This
can also be proved using the theory of "pseudo-An08Ov diffeomorphisms" of Thurston.
We do not give these arguments. 0
Example 9.1. It is possible to describe a geometric model of a map f which has a
planar region with three holes, N, as a trapping region. See Figure 9.1. Consider
the map f for which the image of N is as indicated in Figure 9.2. This map takes
each of the line segments drawn in Figure 9.1 into (subsets of) another one of these
line segments. These line segments are pieces of the stable manifolds. The map f
stretches in the direction across the line segments. Also f(N) c N. The attracting Bet
A= n;::.o f"(N) has a hyperbolic structure.

I
,,
I
...
If'
I' '
I, ' '
" I '
'. I'
:: ,: ,/,,' ,i'
:f:/,/ C' ........ .
, ...... _.... ..

D ;.::- . .
,'",,,
II'"
,I I ' \
.. ..
.......
.
,'I'
,'I" \ ..
,I", . . ....
,I I' \

" I' '


: ~ '. '- '-
I

", ,
I

. , I

FIGURE 9.1. Neighborhood N


If we make equivalence classes of points which are in the sarne components of stable
manifolds in N, q'" p if q E compp(W'(p) n N), then we can form the quotient space
K = N/ "'. For this example this quotient is Indicated in Figure 9.3. Williams showed
that to an expanding attractor there is associated a bmnched manifold, Williams (1970a,
1974). In Section 11.2, we show that the tangent lines, E! to the various TxW'(p),
depend in a C· fashion on x. This differentiability can be used to show that the
quotient space can be given a smooth structure. There is a map defined on the quotient
space, 9 : K --+ K. This map is an expanding map (because we quotiented out the
contracting directions). This map takes the role of the doubling map for the solenoid.
In the example being discussed, g(C) ::> A, g(A) ::> B, g(B) ::> C, and g(D) ::> C. It
can be proved that f on A is topologically conjugate to the inverse limit of 9 on K. See
Williams (1967) and (1970a). Also see Barge (1988) for the connection between inverse
limits and attractors for diffeomorphisrns in the plane.
306 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

FIGURE 9.2. Image of N inside of N

c
D

FIGURE 9.3. Branched Manifold for Plykin Example

REMARK 9.2. Some progress has been made to understand hyperbolic attractors which
are not expanding attractors, i.e., hyperbolic attractors for which the topological dimen-
sion is greater than the dimension of the unstable manifolds. See Wen {1992}.

7.10. Attractor for the Henon Map


Again we consider the Henon map, FA.B(X,y} = (A - By - x2,x). In the earlier
section, we showed that for large values of A FA,B has a horseshoe, e.g. B = ±0.3 and
A = 5. In this section, we consider smaller values of A for which FA,B has a trapping
region. Henon introduced this map and discussed this map for B = -0.3 and A = 1.4
for which there is a trapping region N which is topologically a disk (a region with no
holes). See Henon (1976). For the following discussion, let F = F1. 4 .- O.3 • Since it is
a trapping region, A = n:,o Fk{N} is an attracting set. Numerical iteration indicates
that F is topologically transitive on A because the iteration of a (generic) point appears
to have a dense orbit in A. By the discussion of Plykin theory in the last section, F
can not have a (uniform) hyperbolic structure on A because A has arbitrarily small
neighborhoods which have no hole (topologically disks), Fk(N}. It is still possible that
there exists a point with a dense orbit in A. This point also might have a positive
Liapunov exponent. (That is, there might be some point p and a vector v for which
IDF;vl grows at an exponential rate, lim infk_oo(l/k) 10g(IDF;vi) > D.} In spite of
the lack of rigorous proof, an attracting set A for any map FA,B in the Henon family
such that FA,BIA appears to be transitive is called a Henon attmctor. See Figure 10.1
for A = 1.4 and B = -0.3. (Also see the comments below about the results of Benedicks
and Carleson.)
Since A is an attracting set, A must contain all the unstable manifolds of hyperbolic
periodic points in A. The following proposition shows that A is the closure of the
unstable manifold of the fixed point for the Henon map for many B < O.
7.10. ATTRACTOR FOR THE HENON MAP 307

FIGURE 10.1. Henon Attractor


Proposition 10.1. (a) Let 1 : R2 -+ R2 be a diffeomorphism with a fixed point p.
Assume there is a bounded region 0 C R2 which is p06itively invariant and 8(0) C
L U u L' where LU is contained in a compact piece of WU(p) and L' is contained in a
compact piece ofW'(p). Further assume 1 decreases area on 0, i.e., there is 8 0 < p < 1
such that I det(D/x)1 :5 p. Then
A = cl( U f"(L U ».
n~O

Ifp is in the interior of the £U as a subset ofWU(p) then


A = cl(WU(p}.
(b) For the Henon map F"..B there is a set S of values {A, B} with B < 0 for which
part (a) applJes. This set includes {1.4, -0.3} as well as values with 1.4 < A < 2 and B
small enough.
REMARK 10.1. The region 0 is not a trapping region since the part of the boundary
in WU{p) is usually in the image I{O}. See Figure 10.2. In the case of the Henon map
the region can be enlarged to make it a trapping region U, but then WU{p} is in the
interior of U.
PROOF. (b) We do not give the proof of part (b) but mere indicate a region which
works. A choice of 0 is the shaded region in Figure 10.2 whose boundary is made up of
the pieces of WU(p} and W'(p). This region is not a trapping region because the part
of the boundary contained in WU(p} is on the boundary of the image of O. By slightly
enlarging 0 along the part ofthe boundary contained in WU(p}, it is pOBSible to make
a trapping region.

FIGURE 10.2. Invariant Region 0 for the Henon Attractor is Shaded


308 VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS

(a) Because f decreases area by p, the absolute value of the stable eigenvalue at p
is less than p, IA,I < p. Since L' is contained in a compact piece of W'(p), there is
a k > 0 for which flc(L') C W:(p). For n ~ k, the diameter of /"(L') is less than
2fpn-lc. Given '1 > 0, there is an nl such that this diameter is less than '1/2 for n ~ nl.
Now take q E A. Take '1 > 0 as above. Since the area(fn(O» $ pnarea(O),
there is an n2 ~ nl such that D(q, '1/2) rt. /"'(0), D(q, '1/2) n 8(fn·(o» '" 0, and
D(q, '1/2) n (fn'(Lu) U /"' (L'» '" 0. Because the diameter of /"'(L') is less than '1/2,
d(q,/"'(LU» $ TJ. (The ends of 8(fn·(o» n /"'(L') lie in /"·(LU).) Because TJ is
arbitrary, q is in the closure of the union of the forward iterates of LU, and A is the
closurE' of U .. >o f"(LU) as claimed.
If p is in the interior of the LU, then Un>O f"(LU) = WU(p), flO A is t.he clollurf' of
WU(p) lIS claimed. - 0
EVf'n though the attracting set for the Henon map is the closure of the unst.ahle
manifold, it ill not l)('cesRnrily topologically transitive. In fact for sollie parameter values,
we argue below that it contains some periodic sinks. However, when we look more closely
at the attracting set A, the invariant set looks like a Cantor set of curves which seem to
be the unstahle manifolds of points in A. See Figure 10.3. If we look at the attracting
set in a smaller box (at a smaller scale), the set still looks like a Cantor set of curves.
However, between the curves which reach all the way across the box there are curves
which turn around part way across the box and come out the same edge they entered.
These latter curves look like hooks among the other curves which are relatively straight.
If all points in A had stable and unstable manifolds, then these hooks in the unstable
manifolds would be tangent to the stable manifold of some other point in the attracting
set. As the parameter A varies, these hooks move in the attracting set. For many
parameter values, it would seem likely that there are homoclinic tangencies (tangencies
of the stable and unstable manifolds of the fixed point or some periodic point). At
other parameter values the tangencies of stable and unstable manifolds may be only be
for non periodic points. In any case these tangencies prevent A from having a uniform
hyperbolic structure.
A numerical study by means of computer graphics seems to indicate that there is a
tangency for B = -0.3 and A about 1.392. However, it is known that for parameter
values near a homoclinic tangency, the attracting set is not transitive but contains
infinitely many periodic sinks. This follows from the work of Newhouse (1979) on
infinitely many periodic sinks. Also see Robinson (1983). It is still conceivable that the
placement of the hooks could be controlled enough to avoid all homoclinic tangencies.
It would be hoped that for such a parameter value that most points could be proved to
have a positive Liapunov exponent.

FIGURE 10.3. Enlargement of Piece of Henon Attractor


7.11 LORENZ ATIRACTOR 309

Recently Benedicks and Carleson (1991) have shown that there are other parameter
values for which FA,8 has a transitive attractor with positive Liapunov exponent. (The
map FA,B still can not have a uniform hyperbolic structure.) In fact, there is a set S c
{A: 1.0 < A < 2.0} of positive measure such that for A E S and B < 0 small enough
the attracting set is topologically transitive and has a positive Liapunov exponent. Their
proof uses a perturbation argument from the one dimensional case of the quadratic map,
i.e., from FA,o. For the one dimensional map, the "hooks" are the images of the critical
point, and can be controlled well enough to make the map transitive with an invariant
ergodic measure. This one dimensional result was first proved by Jakobson (1981).
There have hern many refinements of the proof, including Benedicks and Carleson
(1985). Benf'llickll nnd Cnrl<'ll<ln were able to use the knowledge about the images ofthe
critical point for the one dimensional map to control the location of the hooks for the
two dimensional map for small values of B and even prove that there is a point with a
"C'IISC' orhit. which has a pOllit.ivC' Lil\punov exponent. (See Section 8.2 for the definition
of Liapunov exponents for a two dimensional map.)
This result for the Henon map for very small B by Benedicks and Carleson gives
plausibility to the conjecture that this is true for A = 1.4 and B = -0.3 but this is
still unproven. Mora and Viana (1993) have shown how this theorem of Benedicks and
Carleson applies to maps which are not quadratic maps.

1.11 Lorenz Attractor


As stated in Chapter I, Lorenz (1963) introduced the equations

:i: = -ux +uy,


iJ = px - y - xz,
i = -{3z + xy
as a model for fluid flow of the atmosphere (weather). See Guckenheimer and Holmes
(1983) or Sparrow (1982) for further discussion of the way these equations model atmo-
spheric movement. The parameter values which have some physical significance occur
near p = I, but Lorenz discovered some very unusual dynamics for the parameter values
(T = 10, p = 28, and {3 = 8/3. In our treatment of the dynamics, we fix (T = 10 and

{3 = 8/3 for the rest of the section and discuss the situation for various values of p, but
always taking p > 1.
Before turning to the detailed discussion, note that the equations are invariant under
the substitution of (-x, -y, z) for (x, y, z). Therefore, the solutions have this type of
symmetry: if (x(t), y(t), z(t)) is a solution then (-x(t), -y(t), z(t}} is also a solution.
The fixed points of this system of equations are easy to find, and are at 0 = (0,0,0),

p+ = ([8(p - 1}/3JI/2, [8(p -1}/3J 1/ 2,p - I), and


p- = (-[8(p-I}/3J 1/ 2,_[8(p-l}/3J 1/ 2,p-l).

The eigenvalues at the origin are all real, -8/3, -11/2 ± [121 + 40(p - I}J 1/ 2/2. Thus
there is one unstable eigenvalue

Au = -11/2 + [121 + 40(p - I)JI/2/2


and two stable eigenvalues

A. = -8/3 and
A•• = -11/2 - [121 +40(p-l}]1/2/2.
310 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

For P = 28, Au ::;:: 11.83 and A•• ::;:: -22.83. The unstable manifold of the origin is one
dimensional and has two branches WU(O)*. These two branches are related to each
other by the symmetry noted above. For small values of p, the positive branch of the
I!Estable manifold, WU(O)+ stays on one side of the stable manifold W·(O). In fact,
numerical integration indicates that it goes to p+. See Figure 11.1. For P = Po ::;:: 13.93,
the unstable manifold for the origin Is seen to connect to the stable manifold of the
origin forming a homoclinic loop, WU(O) C W'(O). (Remember, if one branch forms a
homoclinic loop for a parameter value than the other branch also forms a homoclinic
loop by the symmetry.) See Figure 11.2. For P > Po each half of the unstable manifolds
for the origin (going out only one direction from the origin), WU(O)±, crosses from one
side of W'(O) to the other side. See Figure 11.3. For P near Po, WU(O)+ falls into
W'(p-) and WU(O)- falls into W·(p+). For P > PI = 470/19::;:: 24.74 this is no longer
the case as we discuss further below.

FIGURE 11.1. Unstable Manifold ofthe Origin, 1 < P < Po

FIGURE 11.2. Unstable Manifold of the Origin, P = Po

We have already referred to the stable manifolds of the fixed points p±. We now
turn to a discussion of the stability type of these fixed points. As is verified in Exercise
6.10, the characteristic equation for the fixed points p± is

8 160
p(A) = A3 + (41/3)A 2 + 3(P + IO)A + 3(P - 1) = O.
For P ~ 14, p(A) has one real negative root and two complex roots. There is a bifurcation
value at PI = 470/19 ::;:: 24.74. The two complex roots have a negative real part for
7.11 LORENZ ATTRACTOR 311

FIGURE 11.3. Unstable Manifold of the Origin, Po < P < PI

FIGURE 11.4. Two Views of the Unstable Manifold of the Origin for P = 28

14 $ P < PI, and positive real part P > PI. Thus these fixed points are stable for
14 $ P < PI, and they become unstable at PI = 470/19 R: 24.74. In fact a Hopf
bifurcation takes place for this parameter value. See Sparrow (1982). It is a subcritical
Hopf bifurcation where two unstable periodic orbits disappear at PI. For P < PI, the
fixed points p± are sinks, while for P > PI, these fixed points push outward in a two
dimensional subspace. The eigenvalue is complex so the trajectories spiral around in this
two dimensional surface as they move outward. The existence of this two dimensional
expanding subspace for the fixed points pushes the unstable manifolds. WU(O)± away
from p±. In fact for P = 28, which is greater than PI! WU(O)+ crosses from one side
of W"(O) to the other and back again, as indicated in Figure 11.4. (None of the facts
stated above are obvious, but follow by detailed calculations. Some of these calculations
are contained in Exercise 6.10, and others are referred to in Sparrow (1982).)
From now on we focus our attention on the behavior for P = 28 and fix this value.
There is a trapping region N containing the origin and not containing the other two
fixed points. In fact, there are two holes in the region where these two fixed points are
located. See Figure 11.5. Let A be the maximal invariant set in N,

A= n,l(N).
t~O

Thus A is an attracting set. The unstable manifold of the origin, WU(O), must be
completely contained in the trapping region N, and so in A.
The flow can not have a hyperbolic structure on A in the usual sense because A
contains a fixed point at 0: at the fixed point 0, Ef, has dimension two and q has
312 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

, ,
,,
I
I
\
\
,

FIGURE 11.5. Trapping Region

dimension one. while at other points x E A the splitting would be of the form
IE: $E~ $IE~
where each of the subspaces would have dimension one and E~ would be spanned by the
vector field at x. However. we could consider a generalized hyperbolic structure where
the splitting varies as above and the two dimensional subspace E~ $E~ for x E A \ {OJ
converges to E& as x converges to o. Numerical integration indicates that the equations
have this type of generalized hyperbolic structure but this is a "global" question and
has not been verified analytically.

7.11.1 Geometric Model for the Lorenz Equations


Guckenheimer (1976) introduced a geometric model of the Lorenz equations which is
compatible with the observed numerical integration of the actual equations. This model
has been analyzed in Williams (1977. 1979. 1980). Guckenheimer and Williams (1980).
Rand (1978). and Robinson (1989. 1992). See Sparrow (1982) and Guckenheimer and
Holmes (1983) for a more complete discussion of the model than we give in this section.

To understand the model. first consider the flow of the actual equations near the
origin. By a nonlinear change of coordinates the equations are differentiably conjugate
to the linearized equations in a neighborhood of the origin.
i: = ax
y=-by
i = -cz
where a = Au ~ 11.83. b = -A.. ~ 22.83, and C = -A. = 8/3. (The differentiable
conjugacy follows by the result of Sternberg (1958). Also see Hartman (1964).) There-
fore 0 < c < a < b. The solution of the linearized equations is given by x(t) = eBtxo,
y(t) = e-btyo, and z(t) = e-ctzo. We want to follow the solutions as they flow past the
fixed point. so from the time when z(t) equals to some fixed Zo until x(t) equals to some
fixed ±x1. Consider the two cross sections
E = {(x,y, ZQ) : Ixl,lyl :5 a},
E' =E\ {(x,y,ZQ) : x = OJ, and
S± = {(±x\. y, z) : lyl,lzl :5 .a}.
7.11.1 GEOMETRIC MODEL FOR THE LORENZ EQUATIONS 313

Then for (x,y,zo) E I;', the time r such that <pT(x,y,zo) E S = S+ uS- is determined
by

eGTlxl = XI
eT = (XI) '/0.
x
Then the Poincare map P, : E' -+ S is given by

PI(x,y) = (y(r),z(T))
= (e-In- y, e- CT zo)
= (yxblox;blo,xc/ozox;c/o).

Notice that if x> 0 then P,(x, y) E S+ and if x < 0 then Pdx, y) E S-. For eigenvalues
at the fixed point equal to those of the real Lorenz equations, b/a = I~ .. / ~"I :<:: 1.93 > 1
and cia = I~./~"I :<:: 0.23 < 1. Therefore, a square region {(x,y, zo) : 0 < x ~ a, Iyl ~
a} in E' comes out in a cusp shaped region in S+. See Figure 11.6. The geometric
model assumes that the Poincare map P2 from S± back to E takes the horizontal lines
z = z\ into lines x = Xl' This compatibility insures that there is a coherent set of
contracting directions. More specifically, we assume that

Let P = P2 0 PI. Then P: E' -+ E and the image of I;' by P is as in Figure 11.7.

FIGURE 11.6. Flow of E Past Fixed Point

FIGURE 11.7. Image of E' by Poincare Map


314 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Because (1) PI takes a line segment with all the same x values to a line segment with
all the same z values and (2) P2 takes a line segment with all the same z values back
to a line segment with all the same x values, the x value of the image of a point by P
is determined solely by its x value. Therefore P is of the form

P(x, y) = U(x),g(x, 1/)).

For a fixed xo, the map g(xo, y) is a contraction in the y-direction:

for 0 < 11 < 1, and If'(x)1 ~ 21/ 2 > 1 for all x with Ixl :5 a. Thus for the Poincare
map, there is a hyperbolic splitting E; e E~, with E~ = {(O, V2)} and E; mainly in the
x-direction. Because of the form of the Poincare map, it has an invariant stable foliation
W'(q, P) made up of curves with constant value of x on E. One of these line segments,
W'(q, P) for q E E', is taken into another such line segment, W"(P(q), P), with most
likely a different value of x. We make an equivalence class of points on E that lie on
the same stable line segment, W"(q, Pl. By collapsing equivalence classes to points, we
get a map 7r : E -+ [-a, a]. (In the above situation 1T(q) just gives the x-value of q.)
Because P takes an equivalence class into an equivalence class, P and 1T induce a map
I: [-a, a] \ {OJ -+ R. This description of the map 1 is more coordinate free than given
above where we wrote P(x, y) = U(x), g(x, y)) but it represents the same function.

FIGURE 11.8. Graph of 1

The assumptions on the map 1 are more specifically as follows.


(1) The symmetry ofthe differential equations implies that I(-x) = -/(x).
(2) The map 1 has a single discontinuity at x = o.
(3) The limit of I(x) as x approaches 0 from the left side is A > 0, 1(0-) = A, and
the limit of I(x) as x approaches 0 from the right side is -A, 1(0+) = -A < o.
Also 0 < 12(A) < I(A) < A and so 0 > 12( -A) > I( -A) > -A. Thus

I: [-A, A] \ {OJ -+ [-A,A].

(4) 1 is nonuniformly continuously differentiable on [-A, AJ -{OJ, with !,(x) > 21/2
for all x ~ O.
(5) The limit of I'(x) is infinity as x approaches 0 from either side.
(6) Each of the two branches of the inverse of 1 extends to a C Ha function for
some a> 0 on [/( -A), A] or [-A,/(A)I. (Thus the derivative of the extension
of the inverse is a-Holder for some a > 0.)
The graph of 1 is given in Figure 11.8. The properties of 1 follow mainly from the form
of the Poincare map of the How past the fixed point.
7.11.1 GEOMETRIC MODEL FOR THE LORENZ EQUATIONS 315

Let cpt be the flow for the geometric model. Because P(E') C E, cpt has an attracting
set A. The dynamics of the flow cpt on A are determined by the two dimensional Poincare
map P. It can be shown that the dynamics of the two dimensional Poincare map are
determined by the one dimensional Poincare map f.
The fact that P has a coherent set of contracting directions can be used to show that
there is a bundle IE:: for pEA and a complementary plane of directions E~ that are
taken into themselves by D(cpt)p,

D( cpt )pE;." = IE~~ (p)


D(cpt)plE~ = 1E~,(p)

and the IE~· is contracted more strongly than anything in the IE~ directions,

This last condition implies that cones about the IE~" are taken into themselves by the
derivative of the flow, D(cpt)p. The vector field X(p) for the differential equation is in
the center direction IE~. The center direction IE~ is also more or less "tangent" to the
"sheets" in A, while the strong stable direction IE~" points transverse to the attracting set
A. There is then a stable manifold theorem which says that there are curves W:"(p, cpt)
that are tangent to the IE~· directions which are taken into themselves by the flow,

For small f > 0,


N' == U W:"(p,cp) c N.
pEA
We can form equivalence classes of points in the same strong stable manifold W:"(p, cp),
and get a projection from N' to a branched manifold L. See Figure 11.9. See Williams
(1974) for the general definition of a branched manifold or Williams (1977) for the
branched manifold of the Geometric Model of the Lorenz attractor. The flow on N'
induces a semi-flow 'ljit on L. Only a semi-flow and not a flow is induced on L because
there are two choices of the backward trajectory at the branch set. The Poincare map
from E' = '/l'(E) to itself for 'Iji is ,. (See the references for details.)

FIGURE 11.9. The Branched Manifold

The fact that the flow cpt has the above properties is preserved under (fl perturba.-
tions. That is the content of the papers Robinson (1981, 1984).
316 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Theorem 11.1. Let <{il be a flow with the properties of the Geometric Model of the
Lorenz equations as given above. In particular, assume the one dimensional Poincare
map 1 satisfies properties (1) - (6). Then there is a neighborhood N in the C 2 topology
such that if tjJ! is a flow in N with the symmetry properties of <{it, then tjJl has a bundle
of strong stable directions (and so do the Poincare maps). Forming the quotient by the
strong stable manifolds for tjJl on the trapping region N', there is induced a seml-Bow
on the quotient space L which is a brandied manifold. The semi-Bow tjJl has a Poincare
map
P: E \ W'(O,tjJ) -+ E.
The quotient map taking W"(p, tjJ) to points, induces a one dimensional Poincare map
j that satisfies all the properties (1) - (6).
In particular, Williams was able to show j is transitive by the following argument.
We first of all make the following definition for one dimensional maps.
Definition. Let I c R be an interval and 1 : I -+ I be a continuous map. We say that
1 is locally eventually onto provided for any nonempty (small) open interval K C I,
there is an n > 0 such that I"(K) :> I. If 1 is locally eventually onto, then it is
transitive on I by the Birkhoff Transitivity Theorem.
Theorem 11.2. Assume I: [-A,O) U (O,Aj-+ [-A,Aj satisfies assumptions (1) - (6)
gh'en above.
(a) Then 1 is locally event ually onto for the interval ( - A, A).
(b) Therefore 1 is transitive on (-A, A) and nu) = [-A, Aj.
PROOF. Let I = [-A,Aj. In the proof, we repeatedly throwaway points whose iterates
hit 0 and thus do not have well-defined forward orbits.
Given an interval K C I, define
if 0 ¢ K
Ko = {Kthe longest component of K \ {O} if 0 E K.
By induction, define Ki for i 2: 0 by
K _ {/(K;) if 0 ¢ I(Ki )
1+1 - the longest component of !(Ki ) \ {OJ if 0 E I(/(i)'
Let
.x = zE/\{O}
inf {f'(x)}.

By the assumptions ,x > 21/2. Note that by construction, 0 ¢ int(/(i). Therefore,


»
tU(K i 2: Af(K,) where t(J) is the length of an interval J. Because of the choice of

eK > {Af(K;) if 0 ¢ /(K i )


( Hd - ~t(K,) if 0 E I(K,).
Similarly,
l(I< ) > { Af(KH1 ) if 0 ¢ /(Ki +1)
,+2 - jl(KHd if 0 E I(KHd.
Thus if 0 ¢ I(Ki ) or 0 r;.1(K i +1) (0 ¢ I(Ki ) n I(KH1 », then
,x2
I(KH2 ) ~ 2"l(Ki ).
Thus every two iterates of the interval makes the length of Ki increase by a factor
.x2 /2 > 1 until there is some n for which 0 E I(Kn- 2 ) n/(Kn-d. Thus, we have proved
that 0 E I(Kn-2) n I(Kn-tl for some n 2: 2.
7.11.1 GEOMETRIC MODEL FOR THE LORENZ EQUATIONS 317

Claim. For the above choice ofn, Kn = (-A,O) or [O,A).


PROOF. ° °
The point E I(Kn -2) so E 8(Kn_.}, i.e., K n- l abuts on 0. Let b >
°
be such that I(±b) = 0. Then the fact that E I(Kn -.} implies that b or -b are in
°
K n - il so K n - l :> [-b,O) or (0, b). Thus, I(Kn -.} contains either 1([-b,O» = [0, A) or
/«O,b]) = (-A,OJ. 0
Let c = /(A). From the claim, it follows that /(Kn) = (-c,A) = (-c,OJ U [O,A)
°
or (-A, c) = (-A, OJ U [0, c). Note that since ICc) = peA) > = I(b), it follows that
c > b. Then in the first case,

/2(Kn) :> /« -c, 0]) U /([0, A))


:> /« -b, 0)) U 1([0, A))
:> [O,A)U(-A,c)
:> (-A, A).

A similar argument holds when I(Kn) = (-A, 0] U [0, c). This completes the proof of
(a).
As mentioned before the theorem, the Birkhoff Transitivity Theorem implies that /
is transitive and so all points are nonwandering. 0
The next theorem makes some connections between I and P.
Theorem 11.3. (a) There is a one to one correspondence between the periodic points
of / and the periodic points of P.
(b) Let Ap = nn~o pn(E'). Then Ap = O(P).
PROOF. If P"(xo, YO) = (xo, Yo) then clearly /"(xo) = Xo. Conversely, assume that
/"(xo) = Xo· Then P(x,y) = (f(x),g(x,y)) so P"(xo.y) E {xo} x I for the interval I
of values of y. The map P is a contraction by a factor I-' < 1 on fibers W"(q, P), so

or P"(xo.·) : I --+ I is a contraction by 1-'''. Therefore P"(XO,y) has a unique fixed point
Yo in I, and P has a unique point (xo, Yo) of period k corresponding to the point Xo of
period k for /. This proves part (a).
For part (b). note that E has the property that peE) C int(E). (If this Is not the case,
then enlarging E slightly in the x-direction makes this true.) Then Ap Is an attracting
set. so O(P) cAp.
We need to show that O(P) :> A p . Take (xo. Yo) E Ap and let U be a neighborhood
in E. Then there is a small interval J containing Xo and K = [YO - f, Yo + E) containing
°
Yo such that J x K c U. Take m > such that I-'m[(I) < E. There exists a point
(Uo.Vo) E Ap such that pm(Uo.Vo) = (xo, Yo). Since / Is locally eventually onto. there
Is a point Xl E J such that r(x.} = Uo. Take any Yl E K, so (xily!l E U. and let
pn(Xil y.) = (uo. vd. Then

IPR+m(XI.YI) - (xo,Yo)1 = IPm(Uo,v.) - pm(Uo.vo)1


$l-'ml VI - Vol
$ E.

Therefore (Xily.},pn+m(XI.y.) E U. This Is true for any neighborhood of (xo,Yo) 80


(xo, Yo) E O(P). This shows Ap c O(P), completing the proof. 0
318 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

The reduction to the one dimensional Poincare map can also be used to prove that the
flow is not structurally stable. Small changes in the flow can make it have a homoclinic
orbit or not, i.e., f"(A) = 0 for some n, or Ji(A) ~ 0 for all j > o. These two different
types of flows are not conjugate or even flow equivalent. However, Guckenheimer and
Williams have analyzed much of the topological structure of the attracting set for the
Geometric Model of the Lorenz equations. See Guckenheimer (1976), Guckenheimer
and Williams (1980), and Williams (1977, 1979, 1980).
Birman and Williams (1983a, 1983b) and Williams (1983) have studied the type of
knots that occur as periodic orbits for the Geometric Model of the Lorenz equations.
The branched manifold plays an important part of this analysis. The branch manifold is
modified by removing the fixed point and get what is called a template. In particular the
periodic orbits on the template correspond to the periodic orbits in R3. The reference
sited above show that the knots which appear are prime. Also see Holmes (1988) for a
good introduction into the theory of knots in Dynamical Systems.

1.11.2 Homoclinic Bifurcation to a Lorenz Attractor


More recently, Rychlik (1990) proved that an attroctor 01 Lorenz type could occur for
specific cubic differential equations in R3. The papers by Robinson (1989, 1992) contain
a further discussion of this type of bifurcation and connections with stable manifold
theory. Also see Ushiki, Oka, and Kokubu (1984). Recently Dumortier, Kokubu, and
Oka (1992) has determined the various codimension two bifurcations of a homoclinic
connection for a fixed point of a differential equations in R3. For the Lorenz equations,
after the homoclinic bifurcation at Po there is an invariant suspension of a horseshoe
and not an attractor. Later at the Hopf bifurcation value, it appears that the gap of the
horseshoe disappears and an attractor is formed. In the equations studied by Rychlik
and Robinson there are more parameters than in the Lorenz equations so these two
bifurcations can be compressed into one. With certain (codimension two) assumptions
on the parameters, it is possible to bifurcate directly from the homoclinic connection to
an attractor. Since the homoclinic connection involves only two orbits, it is possible to
analyze completely the properties of the flow at this parameter value and prove that a
strong stable direction is preserved after the bifurcation. By controlling the expansion
rates in comparison to the distance of the unstable manifold to the stable manifold,
it is possible to show that the one dimensional Poincare map is like the one given for
the Geometric Model of the Lorenz equations. Again this control of the expansion
rates requires the extra parameter of the equations studied in these papers which is not
present in the Lorenz equations.
This work for the Lorenz equations is somewhat analogous to the comparison of the
reliults of Benedicks and Carleson (1991) for the Henon map for small B to the observed
results for the parameter values A = 1.4 and B = -0.3.

1.12 Morse-Smale Systems


The examples of the horseshoe, toral Anosov, and solenoid all have infinitely many
periodic orbits. In this section we consider a class of systems with only finitely many
periodic orbits and no other chain recurrent points (or no other nonwandering points).
If we let nu) be the set of chain recurrent points, and PerU) be the set of periodic
points, then we are assuming that Per(f) = n(f) and that there are finitely many orbits
in Per(f). We also want this system to be structurally stable. Therefore we require
that all the periodic points are hyperbolic, so they persist under small perturbations of
1 and the dynamics near the periodic points do not change.
7.12 MORSE-SMALE SYSTEMS 319

Assume 1 is a diffeomorphism with PerU) = 'R.U). For any q, a(q),w(q) C PerU),


so there must be two periodic points PltP2 with a(q) = O(p.) and w(q) = 0(1)2).
Thus for any q there must be PI,P2 E PerU) with q E W"(pd n W"(P2). Since the
periodic points are hyperbolic, for 9 sufficiently near to I, there will be nearby periodic
points Pl(g),P2(g) E Per(g). For the system 1 to be structurally stable, it is necessary
that for 9 sufficiently near to I, W"(Pl(g),g) n W"(P2(g),g) '" 0. Thus we need that
any intersection of stable and unstable manifolds can not be destroyed. The property
which assures that these intersections can not be broken is traru;;versality, which we now
define.
Definition. Two submanifolds V and W in M are transverse (in M) provided for any
point q E V n W, we have that TqV + TqW = TqM. (This allows for the possibility
that VnW=0.)
Notice that for the two submanifolds V, W to intersect transversally at a point q, it
is necessary for dimTqV + dimTqW ~ dimTqM. In R2, if V and Ware two curves
then V being transverse to W at q means that the two tangent lines Tq V and Tq W are
not colinear. In R3, two planes are transverse at q if they intersect in a line through q.
The definition of a Morse-Smale system can now be given. Notice that the definition
involves conditions on the whole phase portrait and how orbits go between various
periodic points. For this reason we only define it when the phase space is compact.
Instead of R", we need to add the point at infinity to get a system on S", or work with
some other compact phase space such as yn.
Definition. A diffeomorphism 1 (or a flow !pI) on a compact manifold M is called
Morse-Smale provided
(1) the chain recurrent set is a finite set of periodic orbits, each of which is hyperbolic,
and
(2) each pair of stable and unstable manifolds of periodic points is transverse, i.e., if
PltP2 E PerU) then W"(p.) is transverse to W"(P2). Notice that in the case of
flows, we allow periodic orbits and not just fixed points.
We now give a number of examples of Morse-Smale diffeomorphisms. At the end
of the section, we return to some examples of flows, and highlight some of the special
aspects of Morse-Smale flows.
Example 12.1. On 8 1 we consider the system

1(8) = 8 + £ sin(27rk8) mod I,

for 0 < 27rk£ < 1. The lift F of 1 has F'(8) = 1 +£cos(27rk8). This derivative is positive
with the assumption on £, so 1 is a diffeomorphism. This has 2k fixed points, {;k g~o 1 .

Half of the fixed points are attracting, {(2 j 2; 1) H:J, and the other half are repelling,

{t}J:J. All other trajectories are in the stable manifold of one of these sinks and the
unstable manifold of one of the sources. Thus the system is Morse-Smale. See Figure
12.1.
Example 12.2. Again on SI we consider the system

1(8) = 8 + I + £sin(27rk8) mod I,


320 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

FIGURE 12.1. (a) k = 1 (b) k = 3

for 0 < 2rrkl < 1. With the assumption on l, f is a diffeomorphism as in the previous
example. This system has two periodic orbits of period k,

{ l}k-I and { (2j + 1) }k-I


k )=0 2k )=0'

The first orbit is repelling and the second is attracting.


Example 12.3. On y2, take 8 1 and 82 as variable modulo 1. Let

We assume that 0 < 2rrl < 1 so that both coordinate functions are one to one and f is
a diffeomorphism. This diffeomorphism has fixed points at PI = (0,0), P2 = (1/2,0),
P3 = (0,1/2), and P4 = (1/2,1/2). The point PI is a source, P2 and P3 are saddles,
and P4 is a sink. Then W"(Pj) C WU(pd and WU(Pj) C W"(P4) for j = 2,3,

WU(pd C W'(P2) U W'(P3) U W"(P4), and


W"(P4) C WU(pd U W U(P2) U W U(P3).

It is easily checked that all these intersections are transverse and the diffeomorphism is
Morse-Smale. See Figure 12.2.

FIGURE 12.2. Example 12.3 on Torus

Example 12.4. On S'l. it is possible to have a Morse-Smale diffeomorphism with one


fixed point source at the north pole and one fixed point sink at the south pole. See
Figure 12.3.
7.12 MORSE-SMALE SYSTEMS 321

FIGURE 12.3. North Pole - South Pole Diffeomorphism 00 S'l

FIGURE 12.4. Example 12.5

Example 12.5. On S'l, it is p088ible to have a Morse-Smale diffeomorphism with one


BOurce at the north pole (infinity of the plane), two saddles, and three sinu, all fixed
points. See Figure 12.4.
The next lemma proves that for a Morse-Smale diffeomorphism, there are restrictions
on the manner in which the stable and unstable manifolds can intersect. We first define
a cycle of periodic points. The lemma then states that a Morse-Smale system can have
no cycles among the periodic orbits.
Definition. If P is a periodic point of f, for tT =u,., let
W"(O(p» =UW"(JJ(p», and

W"(O(p» =W"(O(p» ' O(p).


A collection of periodic points Po, ... , Pt-I E PerC!) is a k-c,cle provided WU(O(pj»n
W'(O(PH.) :I " for j = 0, ... , k - 1 where we let Pt = PO.
Lemma 12.1. (a) If PI, P2 E Per(J) and q E WU(p.) n W'(P2), then there is an
(-chain from PI to q and then to P2.
(b) Ifpo, ... Pt E Per(J) and q; E WU(pj) n W'(PJ+I) for j = 0, ... , k - I, then
there is an (-chain from Po to Pt which p _ through all tbe Pi and tbe q;.
(c) If there is a k-cycle Po, ... Pt-l E PerC!) and q; E WU(O(PJ» n W'(O(PJ+I»
for j= 0, ... , k - 1, then q; E X(J) for j =
0, ... , k - 1 and there are chain recurrent
points which are not periodic points.
(d) If, is Morse-Smale, then there can be no cycles among the period points of ,.
We leave the proof as an exercise for the reader. See Exercise 7.46.

Next we state the theorem about the structural stability of Morse-Smale dift'eomor-
phisms and flows.
322 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Theorem 12.2. Let M be a compact manifold, and J a C l Morse-Smale diffeomor-


phism (or flow). Then J is CI structurally stable.
The proof in the case of a diffeomorphism on SI is simple because there can only
be periodic sources and sinks and no periodic saddle points. The proof in this case is
similar to the examples treated in Section 2.6. For the proof in the general case, see
Palis and de Melo (1982). The original proof is found in Palis and Smale (1970).
Morse-Smale Flows
We now turn to flows. The definition of a Morse-Smale flow is the same as for
a diffeomorphism but with periodic orbits replaced by either fixed points or periodic
orbits or some of each. There are some differences in the implications of the assumptions
of transversality. If q E WU(-ytl n W"(-Y2) where 'YI and 'Y2 are either fixed points or
periodic orbits, then the whole orbit O(q) c WU(-ytl n W"(-Y2). Thus for transversality
we need that dim(Wu(-Yd) + dim(W"(-Y2» ?: dim(M) + 1. In particular, on a surface
(dimension two), if 'YI and 'Y2 are each fixed point saddles for a Morse-Smale flow, then
WU('Yd n W"(2) = 0.
Example 12.6. On 'f2, consider the equations (written on a 2)
01 = £1 sin(27r9tl
02 = £2 sin(27r92)

for 0 < £1, £2 < 1/(27r). Taking all the variables modulo 1, there are four fixed points:
(0,0), (1/2,0), (0,1/2), and (1/2,1/2). This is very much like Example 12.3 given above
for a diffeomorphism and has one source, two saddles, and one sink. See Figure 12.5.
This example is a Morse-Smale flow.

FIGURE 12.5. Example 12.6: Differential Equations on Torus

If <pI is a Morse-Smale flow with fixed points but no periodic orbits, then the time one
map, <pI, is a Morse-Smale diffeomorphism. (See Exercise 7.39.) Examples 12.1-12.4
of Morse-Smale diffeomorphisms discussed above are of this type, but Example 12.5 is
not..
One way to get examples of Morse-Smale flows with fixed points but no periodic
orbits is as the gradient of a real valued function, which we now define.
Definition. A flow <pIon an is called a gradient flow provided there is a real valued
function V: Rn -+ R such that

~<pl(p) = -'VV""(p)'
7.12 MORSE-SMALE SYSTEMS 323

We also say that X(p) = -VV'P'(p) is a gradient vector field.


We want to define a gradient flow on a surface or a manifold. A surface can be
embedded in JR3. In general, a manifold M can be embedded in a large dimensional
Euclidean space JR n. If this is done, then a real valued function V : M -+ R is called
differentiable if it can be extended to a neighborhood U of M in JR n, V : U ..... R with
=
VIM V. With these facts we can define a gradient system on a manifold M.
Definition. Let a manifold M be embedded in a Euclidean space Rn. A vector field
X(p) on M is called a gradient vector field provided there is a real valued function
V : U -+ JR where U is a neighborhood of M in JRn such that

where 1I"p : TpJRn -+ TpM is the orthogonal projection of vectors at pin R n to tangent
vectors to M. Also a flow 11" on M is called a gradient flow provided there is a gradient
=
vector field X(p) such that I,V"(p) X(V"(p».
As an alternative definition of a gradient vector field on a manifold (and more intrinsic
to the manifold itself), we can assume that there is a Riemannian metric on M, i.e., for
each p E M there is an inner product on TpM, a positive definite symmetric bilinear
form ( , )p, which varies smoothly with p. (An embedding of M into a Euclidean space
is merely one way to get such an inner product.) Given a tangent vector vp at p. (vp, )p
is something which acts on a tangent vector at p and gives a real number, and the map
sending wp to (vp, wp)p is linear in wp, so (vp, )p is in the dual space to TpM. Thus
using the inner product, for each p EM. there is a map from the tangent vectors at
p to the dual space, J p : TpM ..... (TpM)". The fact that the inner product is positive
definite implies that this map is an isomorphism. At each point p, the derivative for
V, DVp, is a linear map waiting to act on a vector and the outcome is a real number,
thus DVp E L(TpM,JR) ~ (TpM)". Combining these two constructions, we define the
gradient 01 V by
=
grad(V)p J;l DVp.
Finally, a flow 11" is called a gradient flow provided there is a real valued function
V : M -+ JR such that 'P' is the flow for the gradient vector field -grad(V),

!V"(p) = -grad(V)'P'(p).

Theorem 12.3. Let V : M ..... JR be a C' (unction such that each critical point is
nondegenerate, i.e., at each point x where grad(V)" = 0, the matrix o( second partial
derivatives, (1J1J21JV (x»), has nonzero determinant. Let X(x)
Zi Zj
= -grad(V)" be the
gradient vector field (or V. Then (a) all the fixed points are hyperbolic and (b) the
chain recurrent set (or X equals the set o( fixed points (or X (X has no periodic points).

PROOF. We leave the proof of part (a) to the exercises. See Exercise 7.37. Since the
function V is strictly decreasing off the fixed points and all the fixed points are isolated,
all points which are not fixed are not in the chain recurrent set. This proves part (b).
o
REMARK 12.1. Let V : M ..... JR be a C' function such that each critical point is
nondegenerate. By Theorem 12.3, then -grad(V) is a Morse-Smale vector field if and
only if the stable and unstable manifolds are transverse.
324 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

Example 12.7. For a gradient vector field on 'f2, let V : 1'2 --+ IR be the height function
when 1'2 is stood up on its end. This defines a gradient vector field with four fixed points:
a source at the maximum Pmor, a sink at the minimum Pm'n, and two saddles PI and
P2 with V(P2) > V(pd. See Figure 12.6. If the maximum, minimum, and saddles are
nondegenerate, then the fixed points are hyperbolic. If the torus is "straight up and
down", then there is a saddle connection from P2 to PI, WU(P2)nW'(pd '" 0. Thus this
example is not Morse-Smale. If the torus is slightly tipped so that WU(P2)nW'(pd = 0,
then the vector field becomes Morse-Smale.

FIGURE 12.6. Gradient Flow on 'f2

We want to make the connection between gradient systems and Liapunov functions,
so we first define (global) Liapunov functions and then state and prove the theorem.
Also see Conley (1978).
Definition. A C l function L : M --+ R is called a weak Liapunov function for a flow
<pI provided
L 0 <pI (x) ~ L(x) for all x E M and all t ~ O.

(Maybe we should call this a weak, global Liapunov function, but we do not use the
adjective global when we use this definition.) Thus if we let L(x) == fcL(<p'(X))I,~o,
then L is a weak Liapunov function if and only if L(x) ~ 0 for all points x E M. A real
valued function L on M is called a Liapunov function (or strong or strict or complete
Liapunov function) provided it is a weak Liapunov function with

L 0 <pI (x) < L(x) for all x rt R(<pl) and all t > 0,
i.e., L(x) < 0 for all x rt R(<pl). (Or for x rt S for some other set S such as Fix(<pl),
or the set P defined in Theorem IX.l.3. If necessary, the set off which the Liapunov
function is decreasing under the flow is identified.)

The reader might want to check that it is necessary that a Liapunov function L is
constant on any orbit in Per(<pl) or in fact in R('{JI).
Sometimes a flow is not a gradient flow but has a Liapunov function which is de-
creasing off the fixed points. We identify such flows by the following definition.
Definition. A flow <pI is called gmdient-like provided there is a strict Liapunov function
which is strictly decreasing off Fix('{JI).
7.12 MORSE-SMALE SYSTEMS 325

Theorem 12.4. Let X(p) = -grad(L)p be a gradient vector field on a manifold M


with flow <pt(p). Then i(p) = -lgrad(L)pI2 $ 0 for all p E M, and i(p) = 0 only
at the fixed points of X. Thus L is a Liapunov function of the flow of X, and <pI is
gradient-like.
PROOF.
. d
L(p) = diL(<pI(p»II=o
d I
= DL""(p)di<P (p)ll=o
= DLpX(p).

But DLpvp = (grad(L)p, v p) for any tangent vector vp E TpM, so


L(p) = (grad(L)p, X(p))
= (grad(L)p, -grad(L)p)
= -lgrad(L)pI2.

Thus i(p) $ O. It equals zero if and only if grad(L)p = 0, if and only if DLp = 0, if
and only if p is a fixed point of the How. 0
Corollary 12.5. The minima of L are asymptotically stable fixed points, and the
saddles and maxima of L are not Liapunov stable.
Corollary 12.6. The trajectories of a gradient system for L : M -- R cross tbe level
sets of L orthogonally at points wbicb are not fixed points.

If the real valued function L on M has only finitely many nondegenerate critical
points, then the gradient vector field for L, X, has finitely many fixed points all of
which are hyperbolic and no other periodic orbits. Thus R(X) = Fix(X) = Per(X).
It can be shown that by perturbing any such gradient vector field slightly to Y, all the
stable and unstable manifolds of Y can be made transverse, and hence Y would be 8
Morse-Smale vector field. Example 12.7 with the torus straight on the end is an example
where R(X) = Fix(X) = Per(X) but the system is not Morse-Smale. The perturbation
of the gradient system caused by slightly tipping the torus is an example of how such 8
system can be made Morse-Smale.

There are constraints on the possible dynamics in terms of the topology of the phase
space. We give the result for gradient Hows or gradient-like flows. If p is a fixed
point, then the index at p is defined to be the dimension of the unstable subspace,
index(p) = dim(E~). Let Ck = #( critical points of index k) and /310 = dimH.(M,R).
Then the Morse inequalities are as follows:
Ck -Ck-I + ... ±ec ~ /310 - /310-1 + ... ±/Jo
for k = 0, ... ,dim(M). Also for k = dim(M),
dim(M) dim(M)
E (_1)dlm(MH cj = E (_1)dim(MH/3j = X(M),
j=O j=O

where X(M) is the Euler characteristic of M. See Franks (1982) for more details in-
cluding some results for diffeomorphisms. Also see Palis and de Melo (1982) for further
extension of the index to more complicated invariant sets.
326 VII. EXAMPLES OF HYPERBOLIC SETS AND ATIRACTORS

7.13 Exercises
Manifolds
7.1. Prove that sn has the structure of a Coo n-manifold. Hint: Use the graphs given
in Example 1.3.
7.2. Let F : Rn+1 - R be a C r function for some r > 1. Assume c E R Is a value such
that for each p E F-I(c), DFp "# 0, i.e., DFx has r~k one, or some partial derivative
of F is nonzero at p. Prove that F-I(C) has the structure of a C r n-manifold.
7.3. Let F : Rn+k - Rk be a C r function for some r 2: 1. Assume C E Rk is a value
such that for each p E F-1(c), DFp has rank k. Prove that F-I(C) has the structure
of a C r n-manifold. Hint: Use Theorem V.2.2.
7.4. Let AI and N be manifolds and I : M - N a C l diffeomorphism (a C l home-
omorphism with a C l inverse). Prove that the map DIp: TpM - Tf(p)M is an
isomorphism for every p EM.
7.5. Let X = R and consider the two coordinate charts 11'1,11'2 : R - X given by
II'dx)= x and 1I'2(X) = x 3. For i = 1,2, let Mi denote the manifold consisting of the
metric space X together with the one coordinate chart lI'i'
(a) Prove that the identity map on X is not a diffeomorphism.
(b) Prove that there is a diffeomorphism from MI to M2.
7.6. Let M be a manifold and X : M - TM be a vector field. Prove that X is C r
for r 2: 1 if and only if for each cr+ 1 function I : M - R the function X (f) defined by
X(f)(p) = DJpX(p) is cr.
n-ansitivity Theorems
7.7. A homeomorphism I : X - X is called topologically mixing if for any two open
sets U, V C X there is an N > 0 such that n 2: N implies J"(U) n V "# 0.
(a) Prove that no homeomorphism of 1= [0,1) is mixing.
(b) Give an example of a homeomorphism of a compact connected metric space which
has a point with a dense orbit but which is not mixing.
7.8. Let
2x for x ::; 1/2
T ()
x = {
2 - 2x for x 2: 1/2
be the tent map.
(a) Using the Birkhoff Transitivity Theorem, prove that T has a point with a dense
orbit.
(b) Using the fact that F4(X) = 4x(1- x) is conjugate to T (see Example 11.6.2), prove
that F4 has a point with a dense orbit.
7.9. (Poincare Recurrence Theorem) Assume Il is a finite measure on a space X,
J.I(X) < 00. Assume J : X - X is a one to one map which preserves the measure Il,
i.e., Il(f-I(A» = J.I(A) for every measurable set A. Assume S is a measurable set. Let
So = S,

Sn = {x E S,,_I : li(x) E Sn-I for some j 2: I}


= Sn-I n Uri(Sn_ll
i2:1

for n 2: I, and Soo = r1n>0 Sn' Prove that Soo is measurable, Il(Soo) = Il(S), and for
x E Soo, fJ(x) returns to S an infinite number oftimes. Hint: Prove that Il(Sn) = Il(S)
for n 2: 1.
7.13 EXERCISES 327

7.10. Assume X is a separable metric space and I" is a finite Borel measure on X (so
all open sets are measurable). Assume that i : X --+ X is a homeomorphism on X
which preserves the measure 1". Using the previous exercise, prove that

Y = {x EX: x E o(x) and x E w(x)}


has full measure, i.e., p(Y) = p(X). Hint: Let {Xk}f:1 be a countable dense set in X
and consider the open balls B(Xk, 6) for 6 > O.
7.11. Assume X is a complete separable metric space and I" is a finite Borel measure
on X which is positive on every open set. Assume that i : X --+ X is a homeomorphism
on X which preserves the measure p. Prove that

Y = {x EX: x E o(x) and x E w(x)}

is a residual set ill X and so is dense in X. Hint: Let {Xk}f:1 be a countable dense set
in X. For 6> 0, let U(xk,6,O) = B(xk,6) and

U(xk,6,n) = {XE U(xk,6,n-l): ii(x) E U(xk,6,n-1) forsomej 2: I}

for n 2: I, Prove that each U(Xk, 6, n) is dense and open in B(Xk, 6).
Two Sided Shift Spaces
7.12. Let EN be the full two sided shift space on N symbols with shift map (IN. For
>. > I, let p). be the metric on EN defined by

~ 6(si,ti)
p).(s, t) = L..J >.lil
j=-oo

where

Given tEEN and k 2: 0, prove that

{s: Sj = tj for Ijl ~ k}


is an open ball in terms of the metric p). if and only if >. > 3.
7.13. Let EN be the full two sided shift space on N symbols with shift map (IN. Let A
be an N x N transition matrix and EA C EN be the subshift of finite type determined
by A and (lA = (lNIE A .
(a) Prove that (lA is topologically transitive on EA using the Birkhoff Transitivity
Theorem.
(b) Describe a symbol sequence s· E EA that has both a dense forward orbit and a
dense backward orbit in EA' Remark: Compare with Theorem 11.5.3.
7.14. Let (I: E2 --+ E2 be the full two-sided two-shift. Define r : E2 --+ E2 by rea) = b
where bj = a_j-I, and let S = (lor. Prove that ror = id, sos = id, and (I = sor.
Subshifts for Nonnegative Matrices
7.15. Let A be an n x n adjacency matrix and N = Ei,j aij. Let T be the N x N
transition matrix on the edges induced by A as defined in Section 7.3.1.
(a) Prove that there are (Ak)ij T-allowable words w of length k + 1 with b(w) = i
and e(w) = j.
(b) Prove that A is irreducible if and only if T is irreducible.
328 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

(c) Prove that #(Fix(u}» = tr(A").


=
7.16. Let A be an nxn adjacency matrix and N EiJ aij. Assume aij E {O, I} for all
1 ~ i,j ~ n. Let T be the N x N transition matrix on the edges induced by A &8 defined
in Section 7.3.1. Prove that the vertex subshift u A : EA -+ EA is topologically conjugate
to the edge subshift formed from A, UT : ET -+ ET. Hint: Define h : ET -+ EA by
h(s) = a where Bj = 6(aj) for all j, aj E E and Bj E V. Prove h is a conjugacy.
Horseshoes
7.17. Consider the map I: S2 -+ S2 which gives the Geometric Horseshoe. Let S be
the square &8 given in the chapter. Draw the inverse image of S by I, I-I(S).
7.18. Consider the Geometric Horseshoe Map. Let PI be the fixed point in HI n VI,
= =
L' compp,(W'(p!)nS), and LU compp,(WU(PI)nS). Draw the images of LU by
r
P and L' by 1- 3 , P(LU) and 3 (L').
7.19. Let I,k"g. : a' -+ a' be diffeomorphisms &8 in Example 4.1 (Section 7.4.2).
Let U be the neighborhood of the nontransverse homoclinic point x· for f. Prove that
for f > 0 small enough that following statements:
(a) g, is a diffeomorphism,
°
(b) is a saddle fixed point for g"
=
(c) letting 1 W'(O, f) n u,
00

UI j (1) c W'(O,g.),
j=O
-00

U Ij(1) c WU(O,g,), and


j=-I
k.(1) C WU(O, g.),

and
(d) x· is a transverse homoclinic point for g•.
7.20. When B = 0, show that the family of Henon maps FA,O is conjugate to the family
of quadratic maps F,. on a. (Unfortunately, the labeling of the two families of functions
is very similar.) By a conjugacy in this case we mean a continuous map h : a -+ a'
=
which is a homeomorphism onto the image of FA,o and such that h 0 F,. FA,o 0 h.
Anosov Diffeomorphisms
7.21. Let / : 1r' -+ 1r2 be the map on the two torus induced by the matrix

acting on a'. (Note that I is not invertible.) Prove that the periodic points are dense.
7.22. Let IA be a hyperbolic toral automorphism on 1r" with lift LA to a". Let 9 be a
small CI perturbation of IA with lift G to a". Finally, let G = G- LA. Let Cf,per(R"),
Cl,per(a"), and 9(0, v) be defined &8 in the proof of Theorem 5.1.
. I
(a) Prove that G E C.,Plr(lR").
(b) Prove that 9(0,·) preserves C~,per(a").
7.23. (a) Give an example of a hyperbolic toral automorphism on the three torus T3.
(b) Given an example of a hyperbolic toral automorphism on the n-torus 1r" for n > 3.
7.13 EXERCISES 329

1.24. Let I : y2 -+ y2 be the hyperbolic tora! automorphism with matrix

Prove that I has sensitive dependence on initial conditions.


Markov PartioDs for Hyperbolic Toral Automorphisms
1.25. Let I". : y2 -+ y' be the diffeomorphism induced by the matrix

discussed in Example 5.4. Let Ria be the rectangle used in the Markov partition for
=
this diffeomorphism. Let 9 11. and

n gi(Rla)'
00

A =
j=-oo

Prove that 9 : A -+ A is topologically conjugate to the two-sided full two-shift D" : E2 -+

E2.
1.26. Let I: y2 -+ y2 be the diffeomorphism induced by the matrix

Form a Markov partition with three rectangles by using the line segment [a. b]. as
in the text, and an unstable line segment [g, c]u where 9 is determined by extending
the unstable manifold of the origin through the origin 80 that it terminates at a point
g E [a,O].. Thus 0 is within the unstable segment (g, c]u. Determine the transition
matrix B for this partition. Determine the three eigenvalues for the transition matrix.
How do the eigenvalues compare with the eigenvalues for A?
1.21. Let I : y2 -+ y2 be a hyperbolic toral automorphism and A the transition
matrix for a Markov partition. Let h : E" -+ y2 be the semi-conjugacy. Prove that s is
periodic for IT" if and only if h(s) is periodic for I". Explain why the periods could be
different.
1.28. Let I" : y2 -+ y2 be a hyperbolic toral automorphism and B the transition
matrix for a Markov partition. Using the fact that I" is topologically transitive, prove
that B is irreducible.
Zeta FunctioD for a Hyperbolic Toral Automorphism
1.29. Let I : yn -+ yn be a C l diffeomorphism with a hyperbolic fixed point p. Prove
that the Lefschetz index of I at p is given as follows:

where u = dim(E;) and .:l =


1 provided DlplE; -+ - ; preserves orientation and
=
.:l -1 provided this linear map reverses orientation.
1.30. (Zeta function via Markov partitions) Let I : y2 -+ y2 be a hyperbolic tora!
automorphism and A the transition matrix for a Markov partition M. Let h : E" -+ T'
be the semi-conjugacy. The zeta function of D"" is rational by Theorem 111.3.2. A.
Manning (1911) proved that (,(t) is a rational function by relating it to (I7A' Bowen
330 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTORS

(1978a) sketched the following modification of Manning's proof. Let Pk be the collection
of families of k indices of distinct rectangles {i l , ... , ik} such that each R;j E M and
R;, n ... n R;. i: 0. For each such family, fix an ordering i = (i" . .. ,ik). For i,j E Pk,
write i -+ j provided there is a permutation T of {I, ... , k} so that Ai"iT(') = 1 for
1 $ t $ k. Define the #(Pk) x #(Pk) matrix by

I ifi-+j and T is an even permutation,


A(k)IJ = { -1 if i -+ j and T is an odd permutation, and
o otherwise.

Notice that A(l) = A and A(k) = 0 for k > 4 because J is a diffeomorphism on a two
dimensional manifold. The parts of the problem below ask you to prove that

N,(f) = ~) _1)k+1 tr(A(k)j).


k?1

(a) Prove that if p E int(R;) for some Ri E M has period t for J, then h-I(p) has
period t for U A. Prove that it does not correspond to a periodic point for U A(k)
for any k > 1. (Alternatively, h-I(p) does not contribute to the trace of A(k)i for
any k > 1 and j ~ 1.) Prove that in this case, h-I(p) contributes 1 to the right
hand side of (.) for the j which are multiples of land 0 for other j.
(b) Assume pERi, n R;2' p is not in any combination of three rectangles in M, and
p is a fixed point for J. Prove that i -+ i where i = (iI, i2)' (i) If the permutation
T for i is even, prove that h-I(p) contains exactly two points, each of which is
fixed by U A. Prove that A(2)i,1 = 1 and p does not correspond to a periodic point
of UA(k) for a k > 2. In this case, prove that the points in h-I(p) contribute 1
to the right hand side of (.) for all j. (ii) If the permutation T is odd, prove that
h-'(p) contains exactly two points, which is an orbit of period two for UA. Prove
that A(2)L = (-I)i and p does not correspond to a periodic point for U A(k) for a
k > 2. Prove that the points in h-I(p) contribute 1 to the right hand side of (*)
for all j.
(c) Assume that p is a fixed point for J. Prove that the points ill h-I(p) contribute
1 to the right hand side of (.) for all j.
(d) Assume that p has period l for J. Prove that the points in h - 1 (p) contribute 1
to the right hand side of (.) for the j which are multiples of f and 0 for other j,
Conclude that (.) is true for all j.
(e) Using part (d), prove that <J(t) is rational.
Attractors
7.31. (Theorem 6.1) Let J : M -+ M be a diffeomorphism on a finite dimensional
manifold M. Let A be a compact invariant set for J. Prove that A is an attracting
set if and only if there are arbitrarily small neighborhoods V of A such that (I) V is
positively invariant and (ii) w(p) C A for ail p E V. (Note that the neighborhoods V
can be taken to be compact since both M and A are compact.)
7.32. (Theorem 6.2) Let A be an attracting set for J. Assume either that (i) pEA is
a hyperbolic periodic point or (ii) A has a hyperbolic structure and pEA. Prove that
WU(p,f) c A.
Solenoids
7.33. Construct a Markov partition for the map g : Sl -+ SI defined by g(8) = 28 mod
1. Note in this case there is no contracting direction so WU (p, R;) = Ri and W' (p, Ri ) =
7.13 EXERCISES 331

{pl. Thus the fourth condition for a Markov partition becomes the following: ifint(Rt)n
g-l(int(R j » 1: 0 and p E int(Rt) n g-l(int(R j )) then g(int(Rt)) n int(Rj ) = int(Rj )
and g-I(g(p» n int(Rt) = {pl.
7.34. (a) Construct a Markov partition for the solenoid.
(b) Let h : EA ~ A be the map from the subshift of finite type determined by the
transition matrix A for the Markov partition to the attractor A for the solenoid
diffeomorphism defined by

h(s) = n n
00

n=O
cl(
n

j=-n
rj(int(R. j »).

Prove that h is at most two to one.


7.35. Let compp(S) denote the connected component of S containing p. Let N =
SI xD2 and f : N ~ N be the solenoid map. For p = (0, z) show that compp(W'(p, f)n
N) = D(O), and for 01 < 0 < O2 that

compp(WU(p, f) n D([OI, 02])) = n compp(D([Ol, 02]) n r(N».


n2:1

Here D([OI, O2 ]) = (0 1 ,02 ) x D2. Hint: Show the inclusion both ways. You may assume
that the left side is the one-to-one differentiable image of an interval.
DA Attractor
7.36. Prove that these exists a neighborhood V for the DA-diffeomorphism as specified
in the proof of Theorem 8.1: f(V) :::> V, a22 > 1 for q E V, and a22 < 1 for q rt f(V).
Morse-Smale Diffeomorphisms
7.37. Let V : 1'" ~ 1'" be a C2 function such that each critical point is nondegenerate,
i.e., at each point x where grad(V)x = 0, the matrix of second partial derivatives,
a::':x (x), has nonzero determinant.
j
Prove that all the fixed points of the gradient
vector field of V are hyperbolic.
7.38. Let L: M ~ R be weak Liapunov function for the flow 'PI. If x E 'R('P 1), prove
that L 0 'PI(X) is a constant function of time.
7.39. (a) Let 'PI be a Morse-Smale flow with only fixed points and no periodic orbits.
Prove that f(x) = 'PI (x) is a Morse-Smale diffeomorphism.
(b) Let 'PI be a Morse-Smale flow that has a periodic orbit. Prove that j(x) = 'PI (x)
is not a Morse-Smale diffeomorphism.
(c) Let j : M ~ M be a Morse-Smale diffeomorphism. Let 'P' be the suspension
of j. Prove that 'P' is a Morse-Smale flow without any fixed points and only periodic
orbits.
7.40. Consider the set of C l diffeomorphisms on SI with the C l topology, Diffl(SI).
Prove that any f E Diffl(SI) can be approximated by a Morse-Smale diffeomorphism,
i.e., prove that the set of Morse-Smale diffeomorphisms is dense in Diff1(SI).
7.41. Prove that the set of Morse-Smale diffeomorphisms is not dense in Diffl(T2) or
Diffl(S2) where S2 is the two sphere. Hint: Consider the examples of diffeomorphisms
we have given with infinitely many periodic points.
7.42. Consider the two torus, T2. The Beti numbers of T2 are /30 = f32 = 1, and
{31= 2. Assume that there is a diffeomorphism f on R2 with one source, C2 = 1, and
332 VII. EXAMPLES OF HYPERBOLIC SETS AND ATTRACTOR..<;

two sinks, Co = 2. Using the Morse inequalities determine how many saddles I must
have.
7.43. Which of the following diffeomorphisms are structurally stable? Prove your
answer.
(a) I: SI -+ SI defined by 1(0) = 0 + 0 mod(271'), where 0/71' is irrational.
(b) g: SI -+ SI defined by 1(0) = 0 + 0 mod(271'), where 0/71' is rational.
(c) ho : SI -+ SI defined by ho(O) = 0 + 0 sin(O) mod(271'), where 0 E (0,0.2). (Is ho
structurally stable for all these values of 0, some, or none?)
7.44. Assume that for each t E [a, bl, II : M -+ M is a diffeomorphism which is
structurally stable. Prove that la and Ib are topologically conjugate.
7.45. Let I : R -+ R be defined by I(x) = x + 1. Show there is an f > 0 such that
whenever 9 is a C l map which satisfies

I/(x) - g(x)1 <f and


I!'(x) - g'(x)1 <f

for all x E R, then I and g are differentiably conjugate. That is, there is a Cl diffeo-
morphism h : R -+ R such that I 0 h = hog.
7.46. Prove Lemma 12.1.
CHAPTER VIII
Measurement of Chaos in Higher Dimensions

In this chapter we continue the discussion of measurements of chaos started in Chap-


ter III. As mentioned there, topological entropy is a quantity which is used in the math-
ematical study of dynamical systems. A newcomer to the subject often has difficulty
understanding exactly what is being measured from the definition of topological entropy
itself. We calculate the entropy of a few simple situations. We also relate the entropy
to horseshoes caused by homoclinic points and Markov partitions for Anosov diffeomor-
phisms. Hopefully, after seeing these examples, the reader will gain an understanding
of the concept.
The next section returns to the concept of Liapunov exponents. These are defined in
one dimension where the situation is fairly simple. In this chapter we give the definition
in higher dimensions. In this discussion, we merely give the definitions and statements
of theorems. The goal is for the reader to be aware of the concept. We leave a full
treatment to the references.
The next section discusses the Sinai-Ruelle-Bowen measure for an attractor. Our
treatment in this book of invariant measures in general and this measure in particular
is very sketchy. Again, we defer to the references for a more complete treatment.
The last measurement of chaos relates to the geometry of the invariant set. We define
fractal dimension (box dimension, or Hausdorff dimension) and calculate it for a few
examples. Again the goal is for the reader to be aware of the concept without giving all
the theory or details.

8.1 Topological Entropy


As stated when discussing chaos in Section 3.5, topological entropy is a quantitative
measurement of how chaotic the map is. In fact, it is determined by how many "different
orbits" there are for a given map (or flow). To get an intuitive idea of the concept,
assume that you can not distinguish points which are closer together than a given
resolution e. Then, two orbits of length n can be distinguished provided there is some
iterate between 0 and n for which they are distance greater than e apart. Let r(n, e, f)
be the number of such orbits of length n that can be 80 distinguished. The entropy for a
given e, h(e, f), is the growth rate of r(n, f, f) as n goes to infinity. The limit of h(f, f)
as e goes to zero is the entropy of I, h(J). The key idea in this sequence of limits is the
growth rate of the number of orbits of length n that are at least e apart. Now we give
a more careful definition.
Definition. Let I : X -> X be a continuous map on the space X with metric d. A
set SeX is called (n, e)-separated lor I for n ~ positive integer and e> 0 provided for
every pair of distinct points x, yES, x '" y, there is at least one k with 0 $ k < n
such that d(J"(x),f"(y» > e. Another way of expressing this concept is to introduce
the distance
dn.,(x,y) = sup d(Ji(x),li(y».
°Sj<n
334 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Using this distance, a set SeX is (n, t)-separated for f provided dn./(x, y) > t for
every pair of distinct points x, YES, x # y.
The number of different orbits of length n (as measured by t) is defined by

r(n,t,f) = max{#(S) : SeX is a (n,t)-separated set for n,


where #(S) is the number (cardinality) of elements in S.
We want to measure the growth rate of r(n, t, f) as n increases, so we define

·
h( t, f) = IImsup log(r(n, t, f»
.
n-oo n

If r(n,t,f) = e nT , then h(t,f) = T; thus h(t,f) measures the "exponent" of the


manner in which r(n,l, f) grows with respect to n.
Note that r( n, t, f) 2: 1 for any pair (n, t), so 0 ~ h( t, f) ~ 00. We show in Lemma
1.10 that on a compact metric space, h(t,f) < 00.
Finally, we consider the way that h(t, f) varies as t goes to zero, and define the
topological entropy of f as
h(f) = lim h(t, f) .
• -0,.>0

In the next subsection we introduce another way of counting orbits, so we distinguish


the notation for the number of (n, t)-separated orbits by a subscript, i.e., we use the
notation r.ep(n,l,f) for r(n,t,f), h •• p(t,f) for h(t,f), and h •• p(f) for h(f)
REMARK 1.1. Note for 0 < t2 < tl, r(n,f2,f) 2: r(n, tit f), so h(t,f) is a monotone
function of t, h(t2' f) 2: h(ll' f). the limit defining h(f) exists, and 0 ~ h(l, f) ~ h(f) ~
00 for all t > O. If f is Cion a compact space, then it has been proved that h(f) < 00.
See Bowen (1971, 1978a).
REMARK 1.2. The concept of topological entropy was originally introduced by Adler,
Konheim, and McAndrew (1965) using a very different definition involving covers of the
set by open sets. This definition is given in the next subsection. The definition we use
was introduced by Bowen (1970b). Good general references for topological entropy are
Bowen (1978a, 1970b), Walters (1982), and Alseda, Llibre, and Misiurewicz (1993). This
last reference treats the case of the entropy of a one dimensional map quite extensively.
It also treats both the definition we have given with separated sets and the original
definition using refinements of open sets.
REMARK 1.3. It is also possible to define a measure theoretic entropy hj.l(f) for an
invariant measure JI.. Then under the correct hypothesis, h(f) = sup{hj.l(f)} where the
supremum is taken over all invariant measures JI.. See Bowen (1975b) or Walters (1982)
for a discussion of this type of entropy and its connection with topological entropy.
It is hard to get a very good idea of what entropy means directly from the above
definition. Throughout the section, we determine the entropy of a few examples which
helps give some feeling for its meaning. The first example is the "doubling map" which
is easy to show that it has entropy equal to log(2). We state this result as a proposition.
Proposition 1.1. Let f : SI -+ SI have a covering map F : IR -+ IR given by

F(x) = 2x.

This mllp is (,IIJled the doubling map. The distance on SI is the one inherited from R
by taking x to x mod 1. Thus points near 1 are close to points near O. (This ('an also
8.1 TOPOLOGICAL ENTROPY 335

be considered as the map g(z) = z2 on the circle where we use complex notation. If this
representation is used, the map is often called the squaring map.) Then h(f) = log(2).

PROOF. Two points x and y stay within l of each other for n - 1 iterates of / if and
only if Ix - yl ~ t2- n+ 1 (as points in R). If we put points exactly distance t2- n +l
apart in [0.1). then there are at most [l-12 n- I j points. However the last point to the
right is close to the first point on the left when considered modulo 1 in SI. so there are
[t- 12n- I j - 1 points in SI. These points can be spread apart slightly to make these
points actually (n.f)-separated. Therefore. r(n.f.f) = [t- 12n - I j - 1 where raj Is the
integer part of a. Then

·
h( E. I) = I Imsup log([f-12n-tj - 1)
n-oo n
I . log(t-t) + (n - 1) log(2)
= tmsup
n-(X) n
= log(2)

for any l > O. so h(f) = log(2) as claimed. o


We give a few basic results which are used to help calculate the entropy for specific
maps. The first such result relates the entropy of a map I with a power lie of I.
Theorem 1.2. Assume I is uniformly continuous (or X is compact) and k is an integer
with k ~ 1. Then the entropy of lie is equal to k times the entropy of /. h(fle) = k h(f).
PROOF. Because the points of the orbits considered for r(n. t. lie) constitute a subset
of those considered for r(nk. f. f).

so any (n.f)-separated set for lie is also an (nk.t)-separated set for I. and we have that
r(n.l. I") ~ r(nk.l. f). By uniform continuity. given.,., > 0 there is l > 0 such that if
d(x. y) ~ l then d(fj(x). Ji(y)) ~ .,., for 0 ~ j < k. Therefore any (nk •.,.,)-separated set
for I Is also a (n.f)-separated set for I". or r(n.f./") ~ r(nk •.,.,.f) where.,., is uniform
in n. Combining these two inequalities.

1
-r(nk.f.f)
n
~
1 Ie
-r(n.l.1 )
n
~
1
-r(nk •
n
.,.,.n
and taking the limits in n and then f

kh(f.f) ~ h(f.I") ~ kh(.,.,. f).


kh(f) ~ h(f") ~ k h(f).

This proves the theorem. o


REMARK 1.4. We leave to the exercises to prove that if I Is a homeomorphism. then
h(r t) = h(f). See Exercise 8.9. From equality it follows that h(f") = Ikl h(f) for any
integer k.
The next two results relate the entropy of a map with the entropy on invariant
subsets: the first result Is in terms of disjoint invariant sets and the second one is in
terms of the nonwandering set.
336 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Theorem 1.3. Let f be a continuous map on X. A&5ume X = Xl U··· U X,. is a


decomposition into disjoint closed invariant subsets which are a positive distance apart.
Then
h(f) = maxh(flXi) .

PROOF. If t is smaller than the distance between the subsets Xi. then

r(n.t.fl = Lr(n.t.fIXi).
i

Thus for each n and each j. we must have

r(n.t.fIXj ) $ r(n.t.fl $ kmaxr(n.f.fIXi) .



In passing to the limit in calculating h(f.!). for each j we have

h(t.fIXj ) $ h(t.!)
.
= Ilmsup
_lo.:::.g(,. . T.:. .n.(. :. . t...:..,,--,fl'-'.)
n-oo n
. log(k maxi r(n. t. fiX;)
$ I1m su p --='-'-----'---'--'-:..:...!---::..:..
R-OO n
$ max heft fIXi )•

so h(t.fl = maxj h(t.fIXj ). By taking a countable number of ti > 0 converging to
zero. one jo must have

for infinitely many of the li. so

h(f) = h(fIXjo )'

Since h(fIXj ) $ h(f) for all j. this proves the theorem. o


The next result of Bowen (1970b) says that all the entropy is contained in the non-
wandering set. i.e .• the wandering orbits do not contribute to the entropy.
Theorem 1.4. Let f : X -+ X be a continuous map on a compact metric space X.
Let 0 c X be the nonwandering points of f. Then the entropy of f equals the entropy
of f restricted to its nonwandering set, h(f) = h(fIO).
The proof of this theorem is delayed until the next subsection because of its length and
the relative greater complexity of its argument; it also uses a slightly different definition
of topological entropy in addition to that of separated sets. We mention that this result
shows that any Morse-Smale diffeomorphism, or any map whose nonwandering set is a
finite set of points. has zero entropy as stated in the following result.
Theorem 1.5. Let X be a compact metric space and f : X -+ X be a continuous
map for which O(f) is a finite number of periodic orbits. (For example, f could be a
Morse-Smale diffeomorphism.) Then the entropy of f is zero, h(f) = O.
PROOF. First. h(f) = h(JIO) by Theorem 1.4. Then by Theorem 1.3. h(fIO) is the
maximum of the entropy on the individual periodic orbits. However. the entropy of a
single periodic orbit is zero. 0
8.1 TOPOLOGICAL ENTROPY 337

REMARK 1.5. Let F,,(x) = #lX(1 - x), and #lie be taken in the range where F". has
one attracting periodic orbit of period 21e and repelling periodic orbits of periods 2; for
o ~ j < k. Since #lie < 4, F". preserves the interval I = [O,IJ. Theorem 1.5 implies
that the entropy h(F".II) = O. In Theorem 1.6 below, we give a direct proof of this fact
for 1 < I' < 3 without using Theorem 1.5 (and so without the proof of Theorem 1.4).
We include the proof of this special case in addition to the proof of the general case
given in Theorem 1.4 because its proof is more concrete. In the calculation of entropy
it is necessary to calculate the number of (n, f)-separated wandering orbits which start
near the repelling fixed point 0 and end up near the attracting fixed point PI" Because
these orbits can be partitioned by the iterate when they leave a neighborhood of 0, the
number of these orbits grows linearly in n. Because linear growth adds nothing to the
entropy, h(F,,) = O.
REMARK 1.6. In the proof of Theorem 1.4 we show that the wandering orbits contribute
at most a factor which grows in a polynomial fashion in the length n of the orbits
considered. A term which grows polynomially in the length n does not contribute to the
entropy, so this proves the theorem. The reason that the growth is possibly polynomial
rather than linear is that an orbit can make several transitions from a neighborhood of
the nonwandering set. For example for the horseshoe on 8 2 , an orbit could start near
the source at 00 and proceed near the horseshoe itself and finally leave and go near the
fixed point sink. Because there are two times at which these orbits leave a neighborhood
of the nonwandering set (the time it leaves a neighborhood of 00 and the time it leaves
a neighborhood of the horseshoe), the number of such orbits can grow quadratically
in n. In the general proof, we do not use any decomposition of the nonwandering set.
(However see Theorems IX.1.3 and IX.4.4.) Because we proceed without any special
knowledge of the decomposition of the nonwandering set, we must consider wandering
orbits which make several transitions between a neighborhood of the nonwandering set.
The proof shows that each of these transitions contributes a possible factor of n to the
growth rate of r(n, f, f) and so the the total number of these (n, f)-separated grows at
a polynomial rate. Such a growth rate of r(n, E, f) contributes nothing to h(J).
REMARK 1.7. On the whole real line r(n,E,F,,) = 00 (because if x,y < 0 then the
orbits F;(x) and Fi(y) diverge as j goes to infinity) so h(F,,) = 00. This illustrates
the fact that the entropy is not a good measurement of the chaotic nature of a map on
a noncompact set. In fact, let id : R -+ R be the identity map, id(x) = x for all x.
The identity map is certainly not chaotic, but r(n, E, id) = 00 for all n and f > 0, so
h(id) = 00.
As stated above, we give a direct proof that h(F,,) = 0 in the special case when F,.
when it has only two fixed points and no other nonwandering points. This proof is more
concrete and less complex than the general proof of Theorem 1.4, and we also obtain a
better upper bound on the number of (n, f)-separated orbits.
Proposition 1.6. LetF,.(x) = JlX(I-x) forI < #l < 3. Then the entropy h(F,.i[O, 1]) =
O.

PROOF. For 1 < #l < 3, F,. has a single repelling fixed point at 0, a single attracting
fixed point PI' = 1 - 1/#l, and no other periodic points or nonwandering points. We
write F for F", and P for PI"
Let a = F(0.5) be the image of the critical point and J = [O,aJ. Notice that
F([O,I]) = J. The reader can easily verify that h(FIIO,I]) = h(FIJ). (See Exercise
8.13.) We use FIJ because it has unique inverse images of points near O. Let H = FI}
to simplify notation.
338 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

°
The point p is attracting, so there exists eo > such that if x and yare two points
within a distance eo of p then for all j 2: 0, (i) Hi(x) and Hi (y) stay within a distance
°
eo of p and (ii) IHi(x) - Hi(y)1 ::; Ix - yl. (Take eo> such that IH'(z)1 < 1 for all
°
points z E (p - eo,p + eo), and apply the Mean Value Theorem.) For eo > possibly
°
smaller, if x and yare within eo of then for all j ::; 0, (i) Hi (x) and Hi (y) stay within
a distance EO of 0 and (ii) IHi(x) - Hj(y)1 ::; Ix - YI. (Notice that this would not be
true as stated for FI[O, 1].)
The idea of the proof is that as n grows, the (n, EO)-separated orbits which make the
transition from near 0 to near p can be counted by the iterate at which they start to
make the transition. Therefore the number of these orbits grows at most like a multiple
of n. Since the number of orbits which remain near either 0 or p is bounded, r(n, EO, H)
grows linearly in n and h(Eo,H) = O.
We proceed to make the above idea precise. Now fix a 0 < E ::; EO. Let No =
H«p - E/2,p + E/2) be the open interval about the attracting fixed point p, and let
Nr = H-1 ([0, f /2)) be the open (in J) interval about the repelling fixed point O. Let
U = N r U No and ue = J \ U. Then V = [0, E/2] \ N r is a fundamental domain of the
unst.able manifold of 0.
Next, there is a positive integer no such that if Hi(x) E UC for 0 ::; j < n, then
n ::; no. In particular, if x E V then Hj(x) E No for j 2: no, and U7!;;i Hi(V) :J UC.
Because U C is compact and all points are wandering, there is a {3 > 0 with 2{3 ::; E
such that Hj([y - {3, y + {3]) n [y - {3, y + {3] = 0 for all y E U e and all j 2: l.
Let E dense ({3, V) C V be a set such that
Ro-l
E dense ({3, UC) = U H i (Eden •• ({3, V))
;=0

(i) is {3-dense in U C and (ii) for any x E U e, there is ayE E dens .({3,U e) such that
dno.H(X,y) < {3. Thus for these x and y, IGi(x) - Gi(y)1 < {3 as long as Gi(x) E ue.
Finally, let
G= (O,p}UEden •• ({3,UC).
To estimate r(n, E, J), we define a map

'Pn : J --+ Gn

as follows.
Case (i): If x E No, then Hi(x) E No for all i 2: 0, so we let 'Pn(x) = (p, ... ,pl·
Case(ii): If x E N r , then there is a j ::; n such that Hi(X) E N r for 0 ::; i < j and
Hl(X) E V if j < n. If j < n, let Yj be a choice of a point in Eden.e({3, V) such that
dno,H(Hi (x), Yi) < {3. Let 'Pn(x) = (Yo, ... , Yn-d where

o forO::;i<j
Yi = { Hi-j(Yj) for j ::; i < min{n,j + no}
p for min{n,j + no} ::; i < n.

Case(iii): If x E UC, then let Yo be a choice of a point in Edenoe({3, ue) such that
dno.H(x, YO) < {3. Let 'Pn(x) = (Yo, ... , Yn-d where

Yi = {Hi(yO) for 0 ::; i ~ no


p for no ::; t < n.

We leave to the reader to check the following claim.


S.l TOPOLOGICAL ENTROPY 339

Claim 1. If x E J and rpn(x) = (Yo, ... , Yn-l), then IHi(x) - Yil < f/2 for 0 $ i < n.

Take an n with n > no. Let E(n,f) be a set with the maximal number of (n,f)-
separated points in J, #(E(n, f» = r(n, f, J, H).

Claim 2. The map 'PnIE(n, t) is one to one.


PROOF. Assume rpn(x) = rpn(z) = (yo, ... , Yn-d for x, Z E E(n, E). Then

for 0 $ i < n. Since E(n, E) is (n, f)-separated, x = Z. o


Claim 3. The cardinality of 'Pn(E(n, E» is Jess than (n + 1) #(Eden •• U1, UC».

PROOF. An n-tuple (yo, ... , Yn-d E rpn(E(n, f» has the form

o forO$i<j
Yi = { Yj for j $ i < j + k
p forj+k$i<n

where k = min{n - j,no} if j ~ I, and k = min{n - j,no} = no or 0 if j = O.


To count these n-tuples, we divide them up by the number k. If k = min{ n, j + no} >
0, then there are at most #(Eden •• {{3, U C» choices for Yj. As j varies, 0 $ j < n, we
at most n #(Ed.n •• (f3, UC» such n-tuples. The only other cases are when j = n or
j + k = O. These contribute two more n-tuples, (0, ... ,0) and (p, ... ,p). Thus

#(rpn(E(n, E))) $ n #(Eden.e{{3, U C »+ 2


$ (n + 1)#(Eden.e(l1, U C »
as claimed. o
To calculate the entropy,

r(n,f,J,H) = #(E(n,E»
= #(rpn(E(n, E)))
$ (n + 1)#(Eden.e(f3, U C ».

This quantity grows linearly with n so does not contribute to the entropy:

h( f, H) < · log(n + 1) + log(#(Eden •• {/3, UC)))


_ IImsup
R-OO n
= O.

Therefore, h(t, H) = 0 and h(F) = h(H) = 0 as claimed in the theorem. o


The next two theorems show that the topological entropy of two maps which are
topologically conjugate are equal. In fact, the second result is for maps which are only
semi-conjugate by a map which is uniformly finite to one.
340 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Theorem 1.1. Let X and Y be metric spaces with metrics d and d' respectively. Let
F: X -- X and f : Y -- Y be semi-conjugate by k: X -- Y.
(a) Assume X and Yare compact and the semi-conjugacy k is onto. Then h(F) ~ h(f).
(b) If k is one to one (but not necessarily onto), then h(F) ~ h(f).
° °
PROOF. By uniform continuity of k, given t > there is a 6 > such that d(XI' X2) ~ 6
whenever d'(k(xd,k(xd) ~ t. Let E(n,t,f) c Y be a maximal (n,t)-separated set
for f. i.e., one with #(E(n, t, f) = r(n, t. f). Form the set E(n, 6, F) c X by taking
one x E k-I(y) for each y E E(n,t,f). Thus #(E(n,6,F» = #(E(n,t,f). Then
E(n, 6, F) is a (n,6)-separated set for F by the property of uniform continuity of k
mentioned above. Therefore

r(n,6,F) ~ #(E(n,6,F» = #(E(n,t, f) = r(n.t.f).


From this it follows that h(6, F) ~ h(t. f) and h(F) ~ h(f) as desired. This proves part
(a). We leave part (b) to the reader. 0
REMARK 1.8. If two maps f and 9 are conjugate on invariant compact subsets (or
conjugate by a uniformly continuous homeomorphism), then their topological entropies
on these subsets are equal. Thus the topological entropy of the shift map (12 on 1:2 is
equal to that of FI' on AI' for #J > 4. We show below that h«(12) = log(2) 50 we can
deduce the value of the entropy for the quadratic map on AI' for #J > 4.
The next result gives a criterion for entropies of F and f to be equal when F is semi-
conjugate to f by a map k. We say that k is uniformly finite to one provided k- I (y) has
a finite number of points for each y and there is a bound C on the number of elements
in k -I (y) which is independent of y. The theorem says that if k is a uniformly finite to
one semi-conjugacy from F to f, each of which are defined on compact sets, then the
entropies of F and f are equal. This result can be used to calculate the entropy of F4 •
This theorem is due to Bowen (1971). See de Melo and Van Strien (1993).
Theorem 1.8. Assume F : X -- X and f : Y -+ Y are continuous maps where X and
Y are compact metric spaces with metries d and d' respectively. Assume k : X -+ Y is a
semi-conjugacy from F to f that is onto and uniformly finite to one. Then h(F) = h(f).
We delay the proof to the next subsection, and at this time apply it calculate the
entropy of F 4 i!0. 11.
Example 1.1. There is a semi-conjugacy k from the doubling map D(y) = 2y mod 1
to F 4i!O.lj which is two to one. (This is shown in Example II.6.2.) The entropy of Dis
log(2) by Proposition 1.1 so by the above theorem the entropy of F4i!O,lj is also log(2).
We end the section by determining the entropy of a subshift of finite type. The first
part of the theorem express the entropy of any subshift (and not just a subshift of finite
type) in terms of the growth rate of the number of words of length n as n goes to infinity.
Theorem 1.9. (a) Let (1 : 1:N -+ 1:N be the full shift on N symbols (either one or two
sided). Assume X C 1: N is a closed invariant subset, so (1IX is a subshift. Let Wn be
the number of words of length n in X, i.e.,

Wn = #{(so,.·. sn-d : Sj = Xj for ° ~ j < n for some x EX}.


Then
h«(1IX) = lim sup log(wn ).
ft-OO n
8.1 TOPOLOGICAL ENTROPY 341

(b) Let A be a transition matrix on N symbols, so A is N x N. Let 0' A : EA -+ EA be


the associated subshift of finite type (either one or two sided). Then h(O'A) = log(~l)
where ~I is the rcal eigenvalue of A such that ~I ~ I~il for all the other eigenvalues ~j
of A.
PROOF. (a) We need to consider the number of (n,f)-separated points for various (.
First, take ( = 2- 1 . Two points s, t E X are within 2- 1 if and only if So = to. For the
first n - 1 iterates, O'i(s) is within 2- 1 of 0'1(t) for 0 :5 j < n if and only if S1 = tj for
0:5 j < n. There are Wn choices of blocks (so, ... , sn-.) (by the definition of w n ), so
r(n, 2- 1, O'IX) = W n . Thus

h(2- 1,O'IX) = limsup log(wn ).


n-oo n
Next, we need to consider other values of f. Since h(e, O'IX) is monotonically increas-
ing as E decreases, it is enough to calculate the value for E = 2- 13-11:. By Exercises 2.12,
d(s, t) :5 2- 13-11: if and only if 5i = tj for 0 :5 j :5 k. Thus

d(O'i(S),O'i(t» > 2- 13-11:

for some 0 :5 i < n if and only if Sj '" tj for some 0 :5 j < n + k. Therefore

and

h(2-13-k,O'IX) = lim sup log(r(n,2- 13- k ,O'IX»


R-OO n
I . log(r(n+k,2-1,O'IX»
= 1m sup --=;.:....:~-'---'--=--!.:;.
R-OO n
Ii (n + k) log(r(n + k, 2- 1, O'IX»
= ~_s!P -n- n+k
= h(2- 1 , O'IX).

Since we have shown that h(2- 13- k ,O'IX) = h(2- 1, O'IX) for any positive k and h(f,O'IX)
is monotone in e,

h(O'IX) = h(2- 1 ,O'IX)


log(wn)
= Iimsup---.
n-oo n
This proves part (a).
(b) We prove the case for A irreducible using the Perron-Frobenius Theorem and
leave to the exercises the proof ofthe general case. See Exercise 8.14. (The general case
uses Proposition III.2.10.)
We first take the case where A is eventually positive, Aj is positive for j ~ m. Later,
we discuss the proof of the general irreducible case.
By Lemma 111.2.2, for a subshift of finite type with transition matrix A, Wn is the
sum of all the entries in An-1 which we denote by #(An-1),

Wn = L (An-1li,i
l~i~N,ISi~N

= #(A n- 1 ).
342 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Therefore to calculate the entropy, we need to estimate #(An-I). Letting e be the


column vector with all entries being one, e = (1, ... ,1)tr,

so

Applying the case of the Perron-Frobenius Theorem given in Theorem IV.9.lD, (i) there
is a positive real eigenvalue 'xl and corresponding eigenvector vi with all positive entries
such that 'xl > I'xjl for all the other eigenvalues'xj of A, (ii) An-le/IAn-lel converges
to Vi Ilvll as n goes to infinity, and (iii) there are positive constants C I , C2 such that
CI,X~-I ::; (An-Ie), ::; C2,X~-1 for 1 ::; i ::; Nand n > m. (Remember that Aj is
positive for j ~ m.) Summing on i, we get the estimate

Because constant multiples do not affect the exponential growth rate,

')
Iog ("I I' log(N)+log(Cd+(n-1)log(,Xd
= 1m
n-oo n
= lim 10g(NCI,X~-I)
n-oo n
::; lim sup log(w n )
n-oo n
.
I1m log(NC2,X~-I)
<
- R-oo n
::; 10g(,Xd,

so

. log(wn )
h(UA) = hmsup - - -
n_oo n
= log('xd·

This completes the proof of the theorem in the case when A is eventually positive.
Finally, consider the case when A is merely irreducible. It is proved in Gantmacher
(1959) that by means of a permutation A can be put into the following 'cyclic' form:

A= ( :
o o
Au o o
where the blocks along the diagonal are square. By taking the k-th power (where k is
the number of blocks),
AA: = di"g(B It ... , BA:)
where B j = A),HI'" An-I,nAn,1 ... Aj_I,j is eventually positive for each j. Thus
a general irreducible matrix is a "combination" of a cyclic permutation of blocks of
symbols and an eventually positive return map AA: on each of these blocks of symbols. All
S.l.l PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY 343

the B j have the same real eigenvalue A~, such that A~ > I~l,jl for the other eigenvalues
of Bj. In fact Ale 2d/ k for 0 $ t < k are simple eigenvalues of A; in particular, Al 2: IAil
for the other eigenvalues Aj of A. (See Gantmacher (1959) for details.) From this form,
it follows that EA is the union of k subsets which are invariant under u~, and u~ on the
j subset is UB,' By Theorem 1.3, h(u~) = maxi h(UBJ) = 10g(A~). Then the entropy of
U A is as follows:

1
h(u A) = k h(u~)
= k1 10g(AI)
Ie

= 10g(Ad·

This shows how the result for a general irreducible subshift follows from the general
Perron-Frobenius Theorem. 0
REMARK 1.9. In the exercises, we ask the reader to use this last theorem to show that
the entropy of the full shift on N symbols is 10g(N). See Exercise 8.4. We also use it
to calculate the entropy of several subshifts of finite type.

8.1.1 Proof of Two Theorems on Topological Entropy


This subsection contains the proofs of Theorems 1.4 and 1.8. We apply Theorem
1.8 to toral automorphisms using their Markov partitions in the next subsection. In
Chapter IX we apply it to general hyperbolic invariant sets. The main use of Theorem
1.4 is for Morse-Smale systems which we also discuss in the next subsection.
The proofs of Theorems 1.4 and 1.8 use another method of counting orbits in addition
to (n, e)-separated sets called (n, e)-spanning sets. We give the definition in a slightly
more general situation where we allow the initial points to be restricted to a subset K
that is not necessarily invariant. The following definition makes these ideas precise.
Definition. Let f : X - t X be a continuous map on the space X with metric d. Let
K c X be a subset. For a positive integer q, let

dq,f(W, z) = sup d(fi{w),li{z))


o:Sj<9

as we defined earlier. A set S c K is said to (n, f)-span K for n a positive integer and
f > 0 provided for each x E K there exists ayE S such that dn.f{x,y) $ E. Then the
number r.pan{n, E, K, f) is defined to be the smallest number of elements in any set S
which {n, f)-span K, and

h.pan (K · log(r.pan(n, e, K, f))


E, , I) = IImsup .
n-oo n
It is easily checked that h.pan(E, K, f) is monotonically decreasing in E (O < EI < e2 im-
plies that h.pan{fl, K, f) 2: h.pan{f2, K, f)) so the limit as f goes to zero of h.pan{f, K, f)
exists,
h.pan(K, f) = lim h.pan(E, K, f),
(-0, (>0

and for any f > 0, h.pan(f, K, f) $ h.pan{K, f).


For any integer n, f > 0, and subset K::> X, we let E.pan{n,f,K) to be a minimal
{n, f)-spanning set, so

#(E.pan{m, f, S)) = r.pan{m, E, S, f).


344 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

As in the last section, r •• p(n, 1', K, f) is the maximal number of elements in any
set S c K which are (n,E)-separated. The definitions for h •• p(E,K,f) and h •• p(K,f)
are similar. Again, we let E •• p(m,E,S) be a maximal (m,E)-separated set for K, so
#(E•• p(m,E,S)) = r ••p(m,E,S,f).
If K = X then we usually drop the specification of the set K in the notation.
The following lemma shows that h •• p(K, f) = h.pon(K, f) so the limit is the same
for separating and spanning sets (and either one can be used to define entropy). After
proving the lemma, we denote either quantity by h(K, f).
Lemma 1.10. Let K be a subset of X. For I' > 0 and n a positive integer,

and
h ••p(2E, K, f) ::; h.pon(E, K, f) ::; h ••p(E, K, f).
Therefore h •• p(K, f) = h.pon(K, f). If we further assume that the space X is compact,
then h•• p(E, K, f) < 00.
PROOF. Let E •• p ( n, 1', K) be a maximal (n, I' )-separated set for K and let x E K. There
is some y E E •• p(n,E,K) such that dn,/(x,y) ::; 1', because otherwise E •• p(n,E,K) U
{x} would be an (n,E)-separated set for K and Eup(n,E,K) would not be maximal.
Therefore E •• p(n,E,K) (n,E)-spans K, and
r•• p(n,E,K,f) = #(Eup(n,E,K» 2: r.pon(n,E,K,f).

Let E •• p(n,2E,K) be a maximal (n,2E)-separated set for K, and E.pon(n,f',K) be


a minimal (n,E)-spanning set for K. Using the fact that E.pon(n,E,K) spans, we are
going to define a map T : E •• p(n, 21', K) -+ E.pon(n, 1', K). For x E E •• p(n, 21', K) there
is a y = T(x) E E.pon(n, 1', K) with dn./(x,y) ::; I' because E.pon(n, 1', K) spans. If
T(xiJ = T(X2) for XI,X2 E E •• p(n,2f',K), then

dn,/(XI, X2) ::; dn,/(XI, y) + dn./(y, X2) ::; 21'.


Because E •• p(n, 21', K) is an (n, 2E)-separated set, Xl = X2' This shows that T is one to
one, and so

r •• p(n,2E,K,f) = #(E•• p(n,2E,K»


::; #(E.pon(n, 1', K»
= r.pon(n,E,K,f).

By taking the growth rates as n goes to infinity, we get the inequalities

h ••p(2E, K, f) ::; h.pon(E, K, f) ::; h ••p(E, K,J).


By letting I' go to zero, h •• p(K, f) = h •• p(K, f).
If X is compact, there is a finite number, N., such that N. is the maximal number of
disjoint balls of radius E. It follows from this that the maximal number of elements of a
(n, f)-separated set is bounded by N:. (There can not be two orbits with dn./(x, y) 2: I'
and Ji(x) and fi(y) in the same f'-balls forO::; j < n.) Thereforer•• p(n,f,K,f)::; N:,
and h •• p(E, K, f) ::; 10g(N,) < 00. 0
We next give the definition of topological entropy in terms of open covers. We do not
use this definition, so it can be skipped. However the reader may note some similarity
between this definition and a construction in the proof of Theorem 1.4.
8.1.1 PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY 345

Definition. A collection A is called an open cover 01 X provided (i) each A E A is an


open subset of X and (ii) UtA E A} = X. A subcollection 8 c A is called a subcover
provided U{ A E 8} = X.
For an open cover A, let

An = {n
n-I

j==O
ri(A j ) : Aj E A and n
n-I
rj(A j ) =F 0}.

Let N(A) be the minimal cardinality of a subcover 8 cA. Denote the growth rate of
the number of elements in (a minimal subcover of) An by

h(A, I) -- I'Imsup log(N(An» .


n-oo n

Finally, the entropy of I is given by

h(f) = sup{h(A,f) : A is an open cover of X}.

Note that this definition also growth rate of a number of objects determined by iterates
of I. We do not prove this fact, but the above definition is equivalent to the definitiolUl
in terms of (n,E)-separated or spanning sets.
PROOF OF THEOREM 1.4. Because 0 C X, we always have that h(fIO) ~ h(f). What
we need to prove is the reverse inequality. We use an (m, E)-spanning set of 0 to estimate
the size of an (n,2E)-separated set on all of X.
We fix an integer m ~ 1 and E > 0 for quite a while in the proof. Take the set
E.pon(m,E,O) to be a minimal (m,E)-spanning set for 110. Let

U = {x EX: dm,/(x,y) < E for some y E Em(E,O)}.

Since the orbits in E.pon (m, f, 0) also span orbits of points near 0 in X, U is an open
neighborhood of 0 in X. Since UC = X\ U is compact and all points in UC are wandering,
there exists a uniform {3 with 0 < {3 ~ E such that the forward orbit of the ball of radius
{3 about any y E UC, B(y,{3), never intersects itself, Ji(B(y,{3»nB(y,{3) = 0 for all
j ~ 1. Now take a set E.pon(m, (3, UC) which is a minimal (m, {3)-spanning set for J
with points starting in UC, so

Let G.pon(m) = E.pon(m,E,O)uE.pon(m,{3, UC). The set G.pon(m) is clearly an (m, E)-
spanning set for X, so #(G.pon(m» ~ r.pon(m,E,X,f).
Let t be a positive integer. To estimate r .ep( n, 2E, X, f), we define a map

by IPt(x) = (Yo, ... , Yt-1l where (i) y. E E.pon(m, f, 0) and dm,/(rm(x), Y.) < f if
rm(x) E U, and (ii) y. E E.pon(m,{3,UC) and dm,/(rm(x),y.) < {3 if rm(x) E UC,
Because E.pon(m,E,O) is an (m,f)-spanning set for U and E.pon(m,{3,U") is an (m,{3)-
spanning set for UC, it is always possible to make these choices to define IPt.
346 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Claim 1. Assume (Yo, ... ,Yt-d = 'Pt(x) for some x E X. Then a point Y. E
E,pon (m, {3, U C ) can not be repeated in this i-tuple.
PROOF. This claim follows because the balls B(y.. {3) are wandering for and choice of
y. E E.pon(m,{3,UC). 0
Now we take n > mr,pon(m,/3,Uc,j) be an integer. Let E •• ,,(n,2E, X) be a maximal
(n, 2t)-separated set. For such an n, let i be the positive integer with (i-l)m < n ~ im.
The following claim shows that 'Pt is one to one on E,e,,(n, 2£, X), so we can estimate
r ••,,(n, 2£, X, f) by means of #('PI(Eoe,,(n, 2£, X»).
Claim 2, The map 'PI is one to one on E. e,,(n,2£,X).
PROOF. Assume that 'Pt(x) = 'Pt(z) = (Yo, ... ,Yt-d for x,z E E. e,,(n,2£,X). For
o '5 t < m and 0 '5 s < i,

d(f'm+t(x), j"m+t(z» ~ dm,JU·m(x), Y.) + dm,J(Y" j"m(z»


< £ + £ = 2£.

The integer l is chosen so that tm ~ n, so we get that dn,J(x, z) < 2£, Since the set
E.e,,(n, 2E, X) is (n, 2E)-separated, x = z. 0
Claim 3. Let q=r.pon(m,{3,Uc,f) andp=r.pon(m,E,n,f). Then

PROOF. Let I) be the subset of i-tuples in 'Pt(E.e,,(n, 2£, X» such that there are exactly
j of the Y. that are in E.pon(m,/3,UC). Because the y. E E.pon(m,{3,U C) can not be
repeated in 'P1(X), we must have j '5 q. (Notice that this bound is independent of n
or t. Also, n > mq so t > q.) For Ii' there are (J) ways of picking these j points
Y. E E.pan(m,{3,UC); there are

i'
t· (t- 1)··· (i- j + 1) = - - '.-
(i-])I

ways of arranging these choices among the positions in the ordered i-tuples; finally,
there are at most
r.pan(m,E,O,f)t- i = pt- i ~ l

ways of picking the remaining y. from E.pan(m,E, 0). Thus

#(Ii)~ (q) (i-j)!P,


j
i! t

and
q

#('Pt(Eoe,,(n, 2E, X))) = L #(Ii )


i=O

~(q) i! t
'5 L., j (i _ j)! P .
,-0
8.1.1 PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY 347

To estimate this summation, note that (~) $ ql and

~ ,
- - = t· (t- 1)··· (t- J'
(t- j)!
+ 1) <- tJ <- f!l.

Thus
q

#(IPt(E.ep (n,2E,X))) $ Lq!f!lpt


j=O
$ (q + I)!f!ll

as claimed. o
REMARK 1.10. Notice how crudely we made the count of possible i-tuples in the set
q
IPt(E,ep(n, 2E, X». The estimate L q!f!l $ (q + I)!f!l gives a bound on the number of
j=O
wandering orbits. Because this quantity only grows polynomially in i, it does not con-
tribute to the entropy. A better bound is possible by paying attention to the dynamics
of points which wander but it would still have polynomial growth contributed by the
wandering orbits.
We could also form the open cover of sets V made up of the open sets

{x EX; dm,/(X,y.) < E}

for y. E E.pan(m,E,n) and

{x EX; dm,/(x,y,) < t1}

for y. E E.pan(m,t1,UC). Let V t be the refinement as in the definition of the entropy


by open sets. Then there is a very close connection between V t and IPt(E.ep(n,2E, X».
Since the growth rate of the number of open sets in V t as t goes to infinity give the
entropy for this open cover, it is not surprising that it can be used to get It. bound on
the growth rate of (n, 2E)-separated orbits. See Alseda, Llibre, and Misiurewicz (1993)
for a proof of Theorem 1.4 using the definition in terms of open covers.
By Claims 2 and 3,

r.ep(n, 2E, X, f) = #(IPt(E.ep(n,2E, X)))


$ (q+l)!f!ll,

where q = r.pan(m,t1,Uc,f) and p = r.pan(m,f,n,f). Then

h. ep (2E,X,f) = limsup.!. log(r. ep (n, 2f,X, f)


R-OO n
I. log«q + I)!) + q log(f) + f log(p)
$ l~~P (i _ I)m
$ log(p)
m
= log(r'fH!n(m, E, n, f).
m
348 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Next, we let m vary and go to infinity, and we obtain the bound

h ..,(2f, X,!) :5 h."..n(f, n,!)


:5 hen,!).

Finally, letting f go to zero, we get h(X,!) :5 h(n,!) as desired. o


PROOF OF THEOREM 1.8. Because k is onto, h(F) ~ h(f). Thus what we need to
show is that h(F) :5 h(f).
The obvious attempts at proofs do not work. Using the uniform continuity of k,
it can be shown that given f > 0 there is a 6 > 0 such that if E •• ,(n,6,!) c Y
is (n,6)-separated for I then k- 1 (E..,(n,6,/» is (n,f)-separated for F. However
k- 1 (E•• ,(n,6,/» is not necessarily the maximal (n,f)-separated set for F, so this fact
does not give an upper bound for r ••,(n,f,F) in terms of r •• ,(n,6,!). On the other
hand, the inverse image of a spanning set for I is not necessarily a spanning set for F.
The proof below shows that given f > 0, there is a 13 > 0 such that there is an upper
bound on the maximal size of a (n,2f)-separating set for F, r ..,(n, 2f, F), in terms of
the minimal size of a (n,13)-spanning set for I, r."..n(n,13,!).
e
Fix an f > O. Let ~ 1 be a bound on the number of points in k-1(y),

for all y E Y. Let m ~ 1 be an integer. With some work we show that the following
bound for the size of an (n, 2f)-separating set for F is true:

1 1
h••p (2f, F) :5 h."..n (13,!) + -m loge e) :5 h(f) + -m loge e),
for correctly chosen 13 > O. Since m is an arbitrarily large integer, this implies that
h •• p (2f, F) :5 h(f) and h(F) :5 h(f).
To start this proof, for y E Y, let

Uy = Uy •m,< = {w EX: dm.F(W,Z) < f for some Z E k-l(y)}.


By the continuity of F, Uy is an open neighborhood of k-1(y). By the continuity
of k and compactness of X, there is an open neighborhood Wy C Y of y such that
k-1(Wy ) C Uy . The set Y is compact, so there exists a finite cover {Wy " ••. , Wyp }.
Let {3 > 0 be the Lebesgue number of this finite cover, i.e., if y E Y there is some WYi
such that the closed 13 ball about y is contained in Wy;, cl(B(y, 13» C WYi" Note that
given a small f > 0, we have determined a small 13 > O. This is the 13 that works for the
(n,13)-spanning set for I mentioned above. We want to show that

1 1 1 1
-log(r••p(n, 2f, F» :5 -log(r."..n(n,13,/»
n n
+ -log(e)
m
+ -log(e).
n
In the proof below, we need to break up an orbit of F into segments of length m. To
this end, for an integer n let t be the integer such that

(t - l)m < n :5 tm.

Let E •• ,(n,2f,F) C X be a maximal (n,2f)-separating set for F with

#(E•• p (n,2E,F» = r •• ,(n,2E,F),


8.1.1 PROOF OF TWO THEOREMS ON TOPOLOGICAL ENTROPY 349

and E.pon(n, {J, j) c Y be a minimal (n, {J)-spanning set for f with

For y E Eopon(n,{J, f), let q(j,y) E {y" ... y,,} be chosen so that

c1(B(fi(y), {J» c Wq{j,y)'

To get an estimate on roe,,(n, 2£, F) in terms of r.pon(n, {J, F), we define the map

'PI : E •• ,,(n, 2£, F) --+ E.pon(n, {J, f) X X,

by 'P1(X) = (Yi"o, ... ,xl-d where (i) d~,/(y,k(x» ~ {J and y E E.pon(n,{J,j), and
(ii) x. E k-I(q(sm,y» satisfies dm.dF·m(x), x.) < £ for 0 ~ s < t. Because
k 0 Fom(x) = I'm 0 k(x) E c1(B(f'm(y), {J» c Wq(.m,y),
F·m(x) E k-I(Wq(.m.y» C Uq(.m,y),m .•

and it is always possible to choose the x •.


Claim. The map 'PI is one to one on E ••,,( n, 2£, F).
PROOF. If 'P1(W) = 'Pt(z) = (Yi "0, ... , xl-d, then

d(Fom+l(w), F·m+l(z» ~ d m,F(F6m(W), x.» + dm,F(X., Fom(z»


~ £ + l = 2£.

for 0 ~ t < m, and 0 ~ B ~ t. Since ml ~ n, we get that dn,F(Z, w) ~ 2£. Since


E •• ,,(n, 2l, F) is (n, 2l)-separated, we get that w = z. 0
Now we use the claim to finish the proof of the theorem. For y E E.pon (n, {J, f) fixed,

t-I
#('PI(E. e,,(n,2l,F»n({y} x xt» ~ 11 #(k-I(q(sm,y»
0=0

Because there are r.pon(n, {J, f) choices of y E E.pon(n,{J, f),

r"l'(n, 2£, F) = #('PI(E••,,(n, 2£, F)))


~ r.pon(n,{J,f)C' .

Therefore,

1 l I t
-log(r••,,(n, 2l, F» ~ -log(r.pon(n, {J, f) + - 10g(C )
n n n
1 tm
= -log(r.pon(n,{J,f» + -log(C)
n nm
1 n+m
~ -log(r.pon(n,{J,f) + --log(C)
n nm
1 1 1
~ - log(r.pon(n, {J, f) + -log(C) + -log(C).
n m n
350 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

The second to the last inequality is obtained using the definition of i. Taking the limsup
as n goes to 00, we get
1 1
h.<p(2f, F) ::; h''fJGn({3, f) + -log(C) ::; h(f) + -log(C).
m m
Since this last inequality is true for all m ~ 1, we get that h..p(2~, F) $ h(f). Letting
E > 0 go to zero we get the desired Inequality, h(F) $ h(f). (Notice that the bound

on the number of pointli in the invllTae im~1l of" point contributllcl \og(C)/m to the
bound on the entropy, which is nonzero if C > 1. It is only as m goes to infinity that
this bound can be neglected.) 0
REMARK 1.11. There is a generalization of Theorem 1.8 that does not assume k is
uniformly finite to one. In this case,
h(f) ::; h(F) $ h(f) + sup h(k-I(y), F).
yEY

Notice with the assumptions of Theorem 1.8, k-I(y) is finite so that the second term
is zero, SUPyEY h(k-I(y), F) = O. Therefore the specific result of Theorem 1.8 follows
from this more general result. The proof of the more general result is basically the same
as that given above but the number m varies with the point y E Y. This form was also
proved by Bowen (1971). See de Melo and Van Strien (1993) for a proof.

8.1.2 Entropy of Higher Dimensional Examples


In this subsection, we apply the theorems about entropy to some of the higher dimen-
sional examples discussed in Chapter VII. We start with Morse-Smale diffeomorphisms.
Because such a diffeomorphism has only finitely many points in its chain recurrent set,
the dynamics are simple and it has zero topological entropy.
Theorem 1.11. Let M be a compact manifold, and f a CI Morse-Smale diffeomor-
phism. Then the topological entropy of f is zero, h(f) = O.
PROOF. Theorem 1.4 shows that h(f) = h(fIO) where 0 = O(f) is the nonwandering
set of f. Since f is Morse-Smale, O(f) is a finite set of points. Therefore, h(f) =
h(fIO) = O. 0
As a second example, we consider an Anosov diffeomorphism and connect the entropy
to the entropy of the subshift of finite type induced by a Markov partition.
Theorem 1.12. Let fA be a hyperbolic toral automorphism on ']['2. Assume'R. =
{Rj}j~1 is a Markov partition with transition matrix B. Let Al be the eigenvalue of
the transition matrix B with largest absolute value. (This is also the absolute value of
the largest eigenvalue of the original matrix A.) Then the topological entropy of fA is
10g(Ad·
PROOF. By Theorem VII.5.3, there is a finite to one semiconjugacy h : EB -+ ']['2. Since
h is a finite to one semi-conjugacy, the entropies of f and (fB are equal by Theorem 1.8.
The topological entropy of a two sided shift is the logarithm of the largest eigenvalue
by Theorem 1.9. Combining these facts gives the result. 0
REMARK 1.10. Bowen (1971) has also related the topological entropy of a hyperbolic
toral automorphism on T" to a summation involving all the expandinp; eigenvalUes:

h(fA) = L 10g(lAil)
1.\;1>1
where Ai are the eigenvalues of A. Also see Bowen (1978a).
8.2 LIAPUNOV EXPONENTS 351

Example 1.2. As a last example consider a transverse homoc1inic point for a diffeo-
morphism f. By Theorem VII.4.5(a), r has an invariant set A which is conjugate to
172 on the full two shift. The orbit of A by f is just n sets which are homeomorphic to A
and for x E A, fj (x) returns to A every n iterates. The entropy of riA is log(2) and the
entropy of f on the or bit of A is (1/ n) log (n). On the other hand if we consider directly
invariant set A' for f which Theorem VII.4.5(b) shows is conjugate to a subshift of finite
type for the matrix B given in the proof. As noted in Remark VII.4.6, the characteristic
polynomial of B is p(A) = An - An-I - 1. Since p(21/n) = 2 - 2(n-I)/n - 1 < 0, the
largest eigenvalue of B is larger than 21/ n , i.e., the entropy of the subshift of finite type
1781E8 and so flA' is greater than the entropy of fIO(A).

REMARK 1.11. Assume f : M -+ M is a C l diffeomorphism on a compact manifold


M and f has a hyperbolic chain recurrent set (or hyperbolic nonwandering set). Let
Nn(f) = #(Fix(fk» be the number of points whose least period divides n. Bowen
(1970b) proved that
·
h(f) = IImsup 10g(Nn(f»
,
R-OO n
i.e., the entropy is equal to the growth rate of the number periodic points. Also see
Bowen (197811.).
Bowen (197811.) also introduces the connection between the entropy and the induced
maps on the homology groups. See Yomdin (1987) for a proof of the conjecture.

8.2 Liapunov Exponents


Most of this book concerns examples of systems with uniformly hyperbolic invariant
sets. In this section we define Liapunov exponents for diffeomorphisms and flows in all
dimensions, extending the treatment in Section 3.6. We give only a brief introduction
to the ideas. For more details see Ruelle (198911.), Mane (198711.), Katok (1980), and
Walters (1982).
Just as in one dimension, the exponents exist almost everywhere in terms of an
invariant measure. If these exponents are nonzero almost everywhere on an invariant
set, then it has a nonuni/ormly hyperbolic structure (which may in fact be a uniformly
hyperbolic structure in some cases). There are examples which have a nonuniformly
hyperbolic structure; in fact, the Benon map for A < 2 and IBI small can not have
a uniform hyperbolic structure by the theorem of Plykin, but Benedicks and Carleson
(1991) proved that it does have nonzero Liapunov exponents. Therefore the Benon map
for these parameter values is a.n example with a nonuniformly hyperbolic structure. For
A = 1.4 and B = -0.3, numerical simulation indicates that it has an invariant set with
a nonuniformly hyperbolic structure but this is unproven.
With this introduction we turn to the definitions for a diffeomorphism. Those for
flows are similar, but we leave the small differences to the reader.
Definition. Let f : M -+ M be a. diffeomorphism on a manifold of dimension m. Let
I . I be the norm on tangent vectors induced by a Riemannian metric (inner product on
tangent vectors) on M. For each x E M and v E TxM let

A(x, v) = lim -k110g(ID/;vl)


k_oo

wh·",·,·ef this limit exists. Note that

lim sup -k1 10g(IDf;vJ)


k-oo
354 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Remember that for a single linear map L on am. most vectors v have some component
in the direction of the eigenvector corresponding to the largest eigenvalue J.Lm. so

In the present situation. most tangent vectors v at x lie in Vxm \ V:,-I. so

Am(X) lim -k1 10g(IDf:~vl).


= k-oo
Thus we can calculate Am(X) by taking an arbitrary tangent vector v E TxM (and
assume that it does not lie in Vxm- I ) and determine Am{X) by means of A(X. v) (if the
limit converges).
We can not easily find a vector v E Vxm- I to calculate Am_l(X). However. for most
pairs of vectors v m • v m - 1 E TxM. the plane spanned by v m and v m - 1 is complementary
to Vxm - 2 • Therefore. there is a vector w m- 1 E Vxm - 1 such that w m- I = v m- I + QV m
for some scalar Q. Under application of D I:'. the area of the parallelogram determined
by DI:'v m and DI!v m- 1 equals the area of the parallelogram determined by DI!v m
and D l!w m- 1 • and so grows at a rate like ek(>'m(xl+~m-dx)). (This requires some
justification and consideration of angles between V,T(x) and Vpc;.\.) Let Iv /\ wi be the
area of the parallelogram determined by v and w. Then. for most v m, v m- I E TxM,

This gives a means of calculating Am-I (x) once we know Am (x).


Taking j-tuples of tangent vectors vm •...• v m- HI E TxM, most choices span a
subspace which is complementary to VJ. so

where Iv m/\ .. ·/\vm-j+ll is the j-dimensional volume of the j-dimensional parallelepiped


determined by v m•...• v m- HI. Thus it is possible to determine all the Liapunov ex-
ponents by induction.

In interpreting the Liapunov exponents, Aj(X) is like the logarithm of the eigenvalue
of a fixed point of a map. Therefore Aj(X) > 0 corresponds to an expanding direction,
Aj (x) < 0 corresponds to a contracting direction, and Aj (x) = 0 corresponds to a neutral
direction (at least as far as exponential growth rates are concerned).
For the rest of the section, we assume that Aj(X) 1: 0 for all j and almost all points
x E An B, where A is an invariant set for f and B, is defined by Theorem 2.1. This
assumption is analogous to a hyperbolic structure on A, and is what we called above
a nonuniform hyperbolic structure on A. Let r(x) be the integer function such that
Aj(X) < 0 for 1 $ j $ r(x) and Aj(X) > 0 for r(x) < j $ s{x). Let 1E~ = V,;(x). This
is the stable subspace at x. The unstable subspace at x can be determined using I-I,
1E~. The splitting of the tangent space by means of the stable and unstable subspaces,
TxM = E~ eE~ for x E AnB" is measurable in x but not necessarily continuous. This
type of structure is called a nonunifonnly hyperbolic structure on A.
Pesin {1976} proved that a ll(Jlllillear map f on a nonuniformIy hyperbolic invariant
set has local stable and unstable manifolds for points x E A which are invariant and
tangent to 1E~ and E~ respectively. In this case, the diameter of the local manifolds varies
8.2 LIAPUNOV EXPONENTS 355

with the base point x, so the situation is much more complicated than the uniformly
hyperbolic case. Also see Pugh and Shub (1989) for a proof more like the one given in
this book for uniformly hyperbolic sets. Ruelle (1979) has another proof.
Katok (1980) used these ideas to prove the following theorem which gives a connection
between positive Liapunov exponents and uniformly hyperbolic sets which have positive
topological entropy.
Theorem 2.2. Let f : M --+ M be a C 2 diffeomorphism on a compact manifold M.
Assume that (i) f is ergodic for some invariant Borel probability measure J.'() EMU)
where J.'() is not concentrated on a single periodic orbit (J.'() is a non-atomic measure),
and (ii) the Liapunov exponents are nonzero J.'()-almost everywhere. Then f has a
closed invariant uniformly hyperbolic subset, A, (a "horseshoe") such that (i) flA is
topologically conjugate to a subshi£t of finite type and (ii) the topological entropy of f
on A is positive, h(fIA) > O.
REMARK 2.3. With the assumptions of this theorem, it follows that there must be at
least one positive Liapunov exponent J.'()-alm06t everywhere. Thus, this theorem proves
that a positive Liapunov exponent implies positive topological entropy and so chaos.
This theorem also says that the condition on individual orbits about Liapunov ex-
ponents implies an aggregate condition on a hyperbolic invariant set. In principle, it
is easier to show the existence of nonzero Liapunov exponents than the existence of a
uniform hyperbolic structure. The theorem implies that there is not as much difference
between these two conditions than one might think or fear.
An earlier theorem of Pesin (1977) implies that if a diffeomorphism f (i) preserves a
measure 1'0 EMU) which is equivalent to the volume for the lliemannian metric, and
(ii) at least one of the Liapunov exponents is positive on a set of positive J.'() measure,
then the topological entropy is positive. In fact, there is a lower bound on the topological
entropy in terms of an integral of the sum of the positive Liapunov exponents. Let kj(x)
be the multiplicity of Aj(x), kj(x) = dim(V1) - dim(Vl- 1 ), and
,(xl
X"(x) = L kj(x)Aj(x),
j=r(x)+l

then
h(f) ~ 1M X"(X) dl'o.
This theorem requires the measure to be smooth, but does not assume that all the
exponents are nonzero. In fact what Pesin proved was that the measure theoretic
entropy with respect to J.'() is greater than or equal to the integral,

Earlier, Margulis (1969) proved the other inequality

so
hl'o(f) = 1M X"(X) dl'o·
See Katok (1980) or MRne (1987a) for further discussion of this type of result.
356 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Several recent papers have used a field of invariant cones to prove the existence of a
positive Liapunov exponent and the ergodicity of the system. See Wojtkowski (1985),
Burns and Gerber (1989), and Katok and Burns (1994) for references dealing with
the ergodicity of geodesic flows. See Sinai (1970), Chernov and Sinai (1987), Katok
and Streleyn (1986), and Liverani and Wojtkowski (1993) for references dealing with
the ergodicity of billiards problems. (The last reference gives a good introduction and
many other references.)

8.3 Sinai-Ruelle-Bowen Measure for an Attractor


In this section, I : AI -+ M is a C 2 diffeomorphism, and A is a uniformly hyperbolic
attractor. (Since we assume that all the points of an attractor are chain recurrent and
have a hyperbolic structure, the periodic points are dense in A. See Theorem IX.4.l.)
Our goal in this section is to explain why the forward orbit of most points in the basin
of attraction of A tend to be dense in A: the computer picture generated by the forward
orbit of most points x is a picture of the attractor.
We denote the basin of attraction of A by W"(A). Let U be a compact neighborhood
of A in W·(A). For x E U, a probability measure, II;, can be associated to the partial
forward orbit of length n, {x,/(x), ... ,r-I(x)}:
n-I

II;: = .!. L 6'i(xl


n i=O
where 6y is the atomic measure at the point y. The Sinai-Ruelle-Bowen Theory says
that there exist (i) a subset U' C U of full Lebesgue measure and (ii) a measure /.S with
support on A such that for any x E U' the measures II; converge weakly to /.S, i.e., for

any continuous function tp : U -+ R,


n i=O
n-I
.!. L tp 0 Ii (x) converges to J
tp(y) d/.S(y). This

measure /t is called the Sinai-Ruelle-Bowen measure 01 the attmctor, or just the SRB
measure 01 the attmctor. The fact that the average of the evaluation of the function tp
along the forward orbit of x converges to the integral means that the forward orbit is
uniformly spread out over A in terms of the measure /.S. The measure Il is characterized
by the fact that (i) there is a positive Liapunov exponent Il-a.e. and (ii) Il has absolutely
continuous conditional measures on the unstable manifolds for points of A relative to the
Riemannian measure on the unstable manifolds. See Young (1993) for an introduction
to this theory and other related measure theoretic results. Also see Sinai (1972), Bowen
(1975b), Ruelle (1976), and Bowen and Ruelle (1975).
We mentioned above that a uniformly hyperbolic attractor always has a SRB mea-
sure. The only situation where a SRB measure has been shown to exist for a nonuni-
fonnJy hyperbolic invariant set is the Henon attractor for B very small and A just less
than 2, Benedicks and Young (1993). This is the same situation where Benedicks and
Carleson (1991) proved there is a transitive attractor. See Young (1993) for a discus-
sion of this result. For A = 1.4 and B = -0.3, the Henon map appears to have a SRB
measure (the orbits of most points appear to be uniformly dense in the whole attracting
set), but this result has not been verified mathematically. The same remark about an
apparent SRB measure that has not been verified mathematically applies to the Lorenz
attractor for p = 28.

8.4 Fractal Dimension


Another measurement of the complicated or chaotic nature of the dynamics is the
dimension of the invariant set. In Section 7.6 we have defined the topological dimen-
sion of a set. This dimension is always an integer and is useful in some considerations
8.4 FRACTAL DIMENSION 357

but does not relate to the chaotic nature of the system. In this section, we discu88
some dimensions which are nonnegative real numbers and do not have to be integers.
The definitions of these concepts go back to Hausdorff, but recent interest has been
stimulated by Mandelbrot. Generally, these types of dimensions are called fractal di-
mensions. We mainly consider the box dimension (also known as capacity dimension),
but define Hausdorff dimension at the end of the section. We give references to other
sources which introduce some of the other types of dimension: information dimension,
correlation dimension, and Liapunov dimension.
Chapter 5 in Broer, Dumortier, van Strien, and Takens (1991) discusses the sense in
which the box dimension is related to chaotic nature of the system. In doing this it also
defines both the box dimension and the entropy for a time series of a system. Because
the time series can either be generated by an explicit map (or differential equation) or by
an experiment, this indicates how the ideas can be applied to experimental situations.
Definition. III our discussion of box dimension, we only consider compact subsets A of
some Euclidean space Rn. (These definitions also make sense in a metric space. Since
a manifold M can be embedded in some Euclidean space Rn , our definitions apply to
compact manifolds.) For t > 0, consider the subdivision of Rn into boxes or cubes of
sides of length t: for (jl, ... ,in) E zn, let

Rj" ... ,jn = {(x), ... x n ): i,t:5 x, < (i, + l)t for 1:5 i:5 n}.
A box of this kind is said to be a box from the t-grid. Let N(t, A) be the number of
boxes Rj among all the choices of j E zn such that An Rj '" 0.
To motivate the definition of box dimension, we consider the number of boxes from
the (-grid, N(t, A), that are needed to cover various objects. For a line segment, N{f, A)
is roughly t-I times the length. For a rectangle in a plane, N{e, A) is roughly c 2 times
the area. It is not as obvious, but for a curve, N(t, A) also is roughly e- I times the
length, and for a piece ofsurface, N{t, A) is roughly t- 2 times the area. Next, consider
a compact submanifold A of Rn of dimension d. (The dimension is here used in the
sense of the number of variables in local coordinate charts.) For such a manifold,
N{t, A)fd is roughly equal to the d-dimensionaI volume, or N{f, A) is proportional to
Cd; more precisely, there are two constants C),C2 > 0 such that C I :5 N{f,A)f d :5 C2
or C l t- d :5 N{f,A):5 C 2 t- d • Taking logarithms of the inequality, we get that

so
10g{N{f, A» - log{C2 ) < d < log{N(t, A» - Jog{CJ) and
log{c l ) - - log{c l )
d = lim 10g(N{t, A)) .
• -0 log(c l )

Notice that for real numbers 0 :5 p < d < q and 0 < t < I, tP > fd > t q so
lim N{t, A)e P = and
.-0 00

lim N{t, A)t q = 0,


.-0
Thus for a compact submanifold A of dimension d and 0 :5 p < d < q, the growth rate
of the of the p dimensional volume of boxes which cover A is infinite, the growth rate
358 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

of the of the q dimensional volume of boxes which cover A is zero, and the growth rate
of the d dimensional volume of the boxes which cover A is a finite number. Thus the
dimension d is characterized as that number p at which the lim._oN(e,A)e P changes
from being infinite to being zero. If the limit exists for p = d, then this limit is the
d-dimensional measure of A.
Thus the dimensions can both be defined in terms of the growth rate of N(e, A) in
terms of e- I , and as the number for which the d-dimensional measure makes sense. We
use the first characterization in our definition of box dimension and the second in our
definition of Hausdorff dimension. We now turn to organizing these ideas into precise
definitions.
Definition. For a general compact subset A c Rn, we define the box dimension of A,
dimb(A), by
· (A) - I' . f log(N(e, A»
d 1mb - 1m In
<-0
Iog (_\)
e
This dimension is often called the capacity dimension of A or limit capacity of A. Be-
cause the word capacity has other meanings in certain discussions of complex dynamics,
term box dimension is becoming the standard term for this measurement. If we use the
limsup instead of the liminf, we get the upper box dimension of A which is denoted by
dimB(A),
d' (A) - l' log(N(e, A))
1mB - lI~~~P log(c 1)

REMARK 4.1. Clearly dimb(A) ~ dimB(A) ~ n where A eRn.


REMARK 4.2. The box dimension of a set A is the same as the box dimension of its
closure. This is the reason we restrict to closed sets. We take the sets to be compact so
there are at most a finite number of boxes from the e-grid which intersect A.
REMARK 4.3. We use an exercise to show that
for 0 ~ p < dimb(A)
liminfN(e,A)e P =
<-0
{OO
0 for dimb(A) < p < 00
and
for 0 ~ p < dimB(A)
limsupN(e,A)eP = { 00
• -0 0 for dimB(A) < p < 00 .
See Exercise 8.30.
Instead of using a fixed set of boxes from the e-grid, it is possible to cover the set
A by a finite set of closed cubes with length e on a side and sides parallel to the axes
(without fixing the grid of hyperplanes which form the surfaces). Let N'(e, A) be the
minimum number of such cubes of size e which cover A. The following result shows that
N'(e, A) can be used to calculate the box dimension instead of N(e, A).
Proposition 4.1. Let A c Rn be a compact subset. The box dimension of A can be
calculated by the following limit:
. . . log(N'(e, A»
dlmb(A) = hmlnf
<-0
Iog (-1)
f

where N'(e, A) is defined above as the minimum number of boxes without fixing the
grid.
PROOF. Clearly N'(e,A) ~ N(e,A) since it is possible to choose the cover from boxes
from the e-grid. Also any cube of size e is contained in 2n boxes from the e-grid, so
8.4 FRACTAL DIMENSION 359

N{E, A) :5 2n N'{E, A), Taking logarithms and taking the limit, we get the following
result:

d 1mb - 1m In .-0
' (A) - I' ' f log(N(t, A))
Iog (t I)
' ' f nlog(2)+log(N'(t,A)
<I 1m In
- .-0 log(c l )
, ' f log(N'{E, A)
= IImIn
.-0 log{c l )

< ,
IImln' f log{N{t, A»
- .-0 log{c l )
= dimb{A),

o
It is also possible to cover the set A with a ball of diameter t in terms of the Euclidean
metric or any metric equivalent to the Euclidean metric, A metric d is said to be
equivalent to the Euclidean metric provided there are constants Cit C2 > 0 such that

Cllx - yl :5 d{x,y) :5 C 2 1x - yl

for all x,y ERn, Using a metric equivalent to the Euclidean metric, let N;'(A) be the
minimum number of closed balls of diameter t which cover A, We defer to Exercise 8,31
to verify the following proposition,
Proposition 4.2. Let A c Rn be a compact subset, The bwc dimension of A is given
by the following limit:
d 1mb - 1m In .-0
' (A) -I' ' f log(N"{E, A»
Iog (E- I)

where N" (t, A) is defined above as the minimum number of closed balls of diameter t
which cover A,
It is also useful to calculate the box dimension using a sequence of diameters going
to zero and not having to use all possible t, To that end we have the following result,
Proposition 4.3. Let A c Rn be a compact sub6et, Assume 0 < r < 1. Then

' (A) I' ' f log{N'(rj , A»


d 1mb = ImIn
)-00 J'1og (-I)
r ,

PROOF, For any E > 0, there is a J such that r HI < t :5 r j , In particular,

r- I r- j > t - I 2: r r- j - I ,

log{r-I) + j log{r-l) > log(t- I ) 2: log(r) + (j + 1) log(r- I ), and


[Iog(r-I) + j log(r- I WI < [log(E- 1WI :5 [Iog(r) + (j + 1) log(r- I WI,

Since N'{t, A) 2: N'(D, A) whenever t < D,

N'(r j , A) :5 N'(E, A) :5 N'(ri+l, A),


360 VIII, MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

Therefore,

. f 10g(N'(ri , A)) I' . f


I·Imm. 10g(N'(ri , A»
= Imm .
i-oo Jlog(r- I ) i-oo 10g(r-I)+Jlog(r-l)

- .-0
· . f 10g(N'(t, A))
< IImm
log(t- I )
= dimb(A)
.. 10g(N'(rHI , A»
< IImmf ,
- J-OO log(r) + (J + 1) log(r I)
· . f 10g(N'(rHI , A»
= I1m m ...,..:::.-'--:-'-:----:-'----:"'"
i-oo (J+1)log(r- l )
= lim inf 10g(N'(ri , A)).
J-OO J log(r- I )

Bt'Cause the first and last entries are equal, they must equal the box dimension. 0
Example 4.1. Let C be the middle-a Cantor set in the line. Let /3 be chosen so that
2/3 + a = 1, so that 0 < /3 < 1/2. In the formation of the Cantor set, there are 2i
intervals of length /3J which cover C, so N'(/Ji) = 2J. Therefore,

· (C)-I' . f log (N'(/J1,C»


d1mb - Imm
J-OO J'Iog (/3-1)
I' . f log(2J)
= 1m m J, Iog (/3-1 )
J-OO
log(2)
10g(/3-I) .

First of all note that 0 < dimb(C) < 1. Thus these Cantor sets have non integral box
dimension. Also the dimension depends on /3 and hence a. In fact any number between
o and 1 can be realized as the box dimension by the proper choice of a.
It is also possible to construct a Cantor set in the line (which is not a middle-a
Cantor set) with positive Lebesgue measure so dimb(C) = 1.
Example 4.2. For 0 < {3 < 1/2, a solenoid can be formed using the map

!(t, z) = (g(t), ~z + (3 e 21r1i )


where g(t) = 2t mod 1. This map! takes the neighborhood N = SI X D2 into itself.
Let A = n,,>o !"(N) be the attractor as in the usual construction of the solenoid. The
set Sic = t'rN) is the finite intersection. Let D(t) = {t} x D2 be a single fiber as before.
For each t, SIc n D(t) is the union of 2/c disks of diameter 2{3/c. Therefore

d · (AnD(t» =1' . f log (N'(2{3/c,AnD(t»)


1mb 'k~: k 10g({3-I)
· . f k log(2),
= IImIn
/c-oo k 10g({3 I)
log(2)
= log({3-I)"
8.4 FRACTAL DIMENSION 381

We do not give the details but

. log(2)
dm1b(A) = 1 + log(,B-l)

with the other dimension coming from the expanding direction. Notice that by picking
,B close to 1/2, dimb(A n D(t» can be made almost equal to one and dimb(A) can be
made almost equal to two.
REMARK 4.4. If we define Cantor sets of the type of A n D(t) in the above example
but with different rates of contraction in different directions, it is often impossible to
calculate the box dimension exactly.
Hausdorff Dimension
We mentioned above for the box dimension that
for 0 ::; p < dimb(A)
liminfN(e.A)e = P {OO
<-0 0 for dimb(A) < p < 00.

The definition of Hausdorff dimension uses a characterization like this one.


Definition. If U is a nonempty subset of Rn, let

lUI = sup{lx - yl : x,y E U}

denote the diameter of U. If A C Ui Ui where each Ui is a ball of diameter less than or


equal to 6, then lUi} is called a 6-cover 0/ A.
Let 0 ::; p. We want to define a p measure of a Borel set A. To do this first take
6 > 0 and define
00

1t:(A) = inf L lUi I",


i=-l

where the infimum is take over all countable 6-covers. Notice in this summation, all the
diameters of the different sets do not have to be equal; this contrasts with the definition
for box dimension. Next let 6 go to zero and define

1t"(A) = lim 1t~(A).


6-0

Because 1t~(A) increases as 6 decreases, the limit exists but can equal infinity. This
quantity, 1t"(A), is called the Hausdorff p-dimensional measure of A. If A eRn, then
1tn(A) is the Lebesgue measure of A. The Hausdorff dimension of A is defined to be d
if
forO::;p<d
W(A) = { :
for d < p::; 00.
The Hausdorff dimension of A is denoted by dimH(A). If a set A has Hausdorff dimen-
sion d, then the Hausdorff d-dimensional measure of A can be zero, infinity, or a finite
number.
REMARK 4.5. For a compact set A c Rn,

In many ways the Hausdorff dimension is defined in a manner more similar to Lebesgue
measure than the box dimension. Cantor subsets ofR which are defined defined in terms
362 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

of a map in a manner like that for F,x(x) = AX(l.X) are called dynamically defined. For
a dynamically defined Cantor set dimH(A) = dimb(A) = dimB(A) ~ n. See Palis
and Takens (1993), Takens (1988), or Manning and McCluskey (1983). For a further
discussion of these fractal dimensions see Edgar (1990) or Falconer (1990). For an
introduction to dimension in terms of dynamical systems see Farmer, Ott, and Yorke
(1983) or Palis and Takens (1993). For a connection with measures and entropy see
Young (1982).
REMARK 4.6. In Palis and Takens (1993), they define dynamically defined Cantor
sets which include the middle-a Cantor sets, Co, and also ones A" determined by
F,,(x) = Jlx(1 - x) for Jl > 4. They prove that if C c R is a dynamically defined
Cantor set then the Hausdorff dimension equals the upper box dimension and so also
the box dimension. In particular, dimH(Co ) = dimb(Co) = log(2)/log(2/(1 - a» for
the middle-a Cantor set.

8.5 Exercises
Topological Entropy
8.1. Let I,,(x) = JlX mod I, for Jl > 1. Calculate the entropy of I", h(f,,).
8.2. Let f be a diffeomorphism on the circle, SI. Prove that the topological entropy
of f is zero, h(f) = O.
8.3. Let d > 1 be an integer. Let f : SI --+ SI have a covering map F : lR --+ R given
by
F(x) = dx.
Prove that the entropy of I is log(d), h(f) = log(d).
8.4. Let aN : EN --+ EN be the full shift on N symbols. Prove that the entropy of aN
is 10g(N), h(aN) = 10g(N).

8.5. Let A = (~ ~). Find the entropy of the subshift of finite type with transition
matrix A.
8.6. Use Theorem 1.9(a) (but not Theorem 1.9(b» to calculate the entropy of the
subshifts of finite type with the follow transition matrices A, h(aA)' Note: These
examples illustrate again that (i) the growth rate of the number of the wandering orbits
does not contribute to the entropy and (ii) that the entropy is the maximum of the
entropy on disjoint invariant pieces of the nonwandering set.
(a)LetA=On·

(b) Let A = (~ i D·
1 11 1)
~ ~:1
8.7. Let A be an N by N transition matrix and EA C EN be the corresponding
subshift of finite type. Prove that the entropy h(aA) ~ 10g(N).
8.8. Let I : M --+ M and 9 : N --+ N be two maps on compact metric spaces. Consider
the map f x 9 : M x N --+ M x N defined by f x g(x,y) = (f(x),g(y». Prove that
h(f x g) = h(f) + h(g).
8.5 EXERCISES 363

8.9. Assume /: X -+ X is a homeomorphism. Prove that h(f-I) = h(f).


8.10. Let /w : Sl -+ Sl be a rotation by w. Prove that the entropy h(fw) = O.
8.11. Assume / : [0,1) -+ [0,1) is continuous and has two subinterval h,12 C [0,1)
such that /(Id ~ II U 12 and /(12) ~ II U h Prove that the entropy of / is at least
log(2), h(f) ~ log(2).
8.12. Assume / : [0,1) -+ [0,1) is continuous and has a periodic point of a period
which is not a power of two. Prove that / has positive entropy. Hint: Use the Stefan
cycle to get intervals which cover each other.
8.13. Assume / : X -+ X is a continuous map on a metric space X and leX) = Y.
Prove that h(f) = h(fIY).
8.14. Use Proposition III.2.IO to prove Theorem 1.9(b) in the case when A is reducible.
8.15. Let A : L2 -+ L2 be the bdding machine map:
A(SOSI82 .•• ) = (808182 ... ) + (1000 ... ) mod 2,
i.e., (1000 ... ) is added to (808182 ... ) mod 2 with carrying.
(a) Prove that h(A) = O. Hint: Take f = 3-k+ 1 2- 1 so the closed balls of radius of
radius f are cylinder sets given in Exercise 2.13. Then show that A permutes these
cylinder sets.
(b) Let /00 : [0,1) -+ [0,1) be the map defined in Example m.1.3. Prove that h(foo) =
O.
8.16. Let / be the solenoid diffeomorphism given in Section 7.7. Using a Markov
partition, prove that h(f) = log(2).
8.17. Let 9 be the hyperbolic toral automorphism for the matrix (~ ~ ). Prove that
the the entropy h(g) = (3 + 5 1/ 2 )/2.
8.18. Let / be the DA-diffeomorphism formed from the hyperbolic toral automorphism
9 for the matrix (i ~ ). Using the result of Bowen given in Remark 1.11, prove that
the entropies of / and 9 are equal, h(f) = h(g).
8.19. Let / : T2 -+ T2 be the noninvertible toral map induced by the matrix

(1 -1)
1 1 .

Calculate the entropy of /. Hint: Find a power /Ie for which it is easy to calculate the
entropy and use Theorem 1.2 relating the entropy of /Ie to the entropy of /.
8.20. Let / : [0,3) -+ [0,3) be the piecewise linear function such that /(0) = 3,
/(1) = 2, /(2) = 3, and /(3) = O. (The map / is linear between each pair of adjacent
integers.)
(a) Find a Markov partition for / and its transition matrix.
(b) Find the topological entropy of /.
Liapunov Exponents
8.21. Let / be the solenoid map given in Section 7.7. Prove that the Liapunov expo-
nents satisfy A2 = A3 = log(4- 1 ) and Al ~ log(2).
8.22. (Generalized Baker's map) Assume 0 < 1'1 < 1'2 < 1 satisfy 1'1 + 1'2 < 1. Define
the function
(2x,l'llI) for 0 $ x < 0.5, 0 $11 $ 1
{
/I-II.".(x,y) = (2x - 1.1 - 1'2 + 1'2Y) for 0.5 $ x $ 1, 0 $ Y $ 1
364 VIII. MEASUREMENT OF CHAOS IN HIGHER DIMENSIONS

on the square S = [0,11 x [0,11.


(1\) Show that the Liapunov exponents satisfy AI(X,y) = log(2) and

for all (x, y) E S.


(b) Notice that the first coordinate function of 1",1.",2 is essentially (except for x = 1)
the doubling map D(x) = 2x mod 1 and so is ergodic with respect to Lebesgue
measure on [0, IJ. (You do not need to prove this fact.) Using the Birkhoff Ergodic
Theorem, prove that A2(X, y) = 0.5[log(l-Id + log(1-I2)] almost everywhere with
respect to Lebesgue measure on S.
(c) Find three different periodic orbits (of any period) for which A2(X, y) equals
log(l-Id, log(1-I2), and [log(l-Id + log(1-I2)J/2 respectively.
8.23. Let 1 : M -+ M be a C l diffeomorphism. Assume x E M and v E TxM are a
point and a vector for which the Liapunov exponent A(X, v) exists.
°
(a) Let be a real number. Prove that

A(X,OV) = A(X,V).

(b) Prove that


A(f(X), D/xv) = A(X, v).

Fractal Dimension
8.24. Let S = to} U {11k: k is a positive integer }.
(a) Prove that dimb(S) = 1/2.
(b) Prove that dimH(S) = O.
8.25. Let S = to} U {2-k : k is a positive integer }.
(a) Prove that dimB(S) = O.
(b) Prove that dimH(S) = O.
8.26. Let F",(x) = I-Ix(l-x) and A", = nn~O~([O, 1]). Let A", = (1-1 2 _41-1)1/2. Prove
that

for 1-1 > 2(1 + 2 1/2).


8.27. Construct a Cantor set in the line with box dimension equal to one.
8.28. Let N = SI X D2 and

I(t, z) = (g(t), ~z + (3 e2..ti )

where g(t) = 4t mod 1. Let A = nk>o Ik(N).


f
(a) Prove for 0 < (3 < 2- 112 , that is an embedding of N into itself.
(b) Let D(t) = {t} x D2. Prove that

. log(4)
dlmb(AnD(t)) = 10g({3-I)'

Also prove for correction choice of (3, that dimb(AnD(t)) > 1. Note that AnD(t)
is a totally disconnected set that has box dimension greater than one.
8.5 EXERCISES 365

(c) Prove that


. log(4)
dlmb(A) = 1 + log(.B I)'

8.29. Calculate the box dimension of the invariant set A for the Geometric horseshoe.
8.30. Let A c Rn be a compact set. Prove that

for 0 ::; p < dimb(A)


lim inf N(f, A)fP = { 00
.-0 0 for dimb(A) < p < 00

and
.
hmsupN(f,A)fP =
• _0
{oo0 for 0 ::; p < dimB(A)
for dimB(A) < p < 00 .
8.31. Let A c Rn be a compact subset. Using a metric equivalent to the Euclidean
metric, let N;'(A) be the minimum number of balls of diameter E which cover A. Prove
that
d 1mb - Imm .-0
· (A) - I' . f log(N"(E, A))
Iog (E- I)

8.32. Let 1/10,./10. be the generalized Baker's map defined in Exercise 8.21.
(a) Prove there is a unique positive number d such that 1 = J.l1 + J.I~.
(b) Let d be the number given in part (a). Let

A= n1::,.".([0,
00

n=O
IJ x [0,1]).

Prove that the Hausdorff dimension of A n ({ O} x [0, 1]) is less than or equal to
d. Remark: Theorem 6.3.12 in Edgar (1990) proves that the Hausdorff dimension
of A n ({O} x [0,1]) is actually equal to d. The Hausdorff dimension of A is then
equal to 1 + d. Because the set is dynamically defined, the box dimension of
An ({O} x [0, '1]) is also d, so the box dimension of A is 1 + d.
CHAPTER IX
Global Theory of Hyperbolic Systems

In this chapter, we take the ideas introduced in the last chapter and make them
into a more complete theory. The first section shows that in some sense any map can
be decomposed into a map on chain recurrent pieces and a gradient-like map (or flow)
between the pieces. This is a very general theorem of Conley and can be proved without
using much of the material introduced earlier in this book except the definition of chain
recurrent. The next section indicates how the proof of the stable manifold theorem for
a fixed point can be modified to prove the case for a hyperbolic invariant set. Using
these results, we prove the possibility of shadowing near a hyperbolic invariant set, the
n-stability of diffeomorphisms with a hyperbolic chain recurrent set, and the structural
stability of diffeomorphisms which satisfy a "transversality" condition in addition to
having a hyperbolic chain recurrent set. These theorems form the heart of the theory of
hyperbolic diffeomorphisms (ones for which the chain recurrent set is hyperbolic) which
was articulated by Smale (1967) and carried out in the following years by Smale and
other researchers.

9.1 Fundamental Theorem of Dynamical Systems


In this section, we study a flow <pt on a metric space M. Usually M is compact.
Once we give the results for flows, we indicate how they can be carried over to homeo-
morphisms (or diffeomorphisms) in the next subsection.
The main tool is that there is a Liapunov function which is decreasing off the chain
recurrent set, i.e., off the set of points on which complicated dynamics takes place.
Different parts of this chain recurrent set may lie at different levels of the Liapunov
function and so there is no way to move from one of these pieces to another piece and
then back to the first piece (since the Liapunov function is strictly decreasing off the
chain recurrent set). Therefore if the pieces of the chain recurrent set are collapsed to
points, the flow on the quotient space is a gradient-like system, so the original flow can
be thought of as a gradient-like extension of a flow that is transitive on disjoint pieces.
The main reference and original reference for this section is Conley (1978). Franks
(1988) has a proof of many of the theorems of this section with proof directly for
diffeomorphisms. Hurley (1991,1992) has extended some ofthese results to noncompact
sets.
In Section 2.3 we defined the chain recurrent set for a map and in Section 5.4 we gave
the modifications for a flow. In particular we defined an t-chain. For a subset Y c M
we defined the f-chain limit set of Y, n:(y), and the chain limit set of Y, n+(Y).
Similarly backward sets n; (Y) and n- (Y). These definitions play an important role in
this section and should be reviewed. In particular, the chain recurrent set is given by
'R.(<pt) = {x: there is an f-chain from x to x of length greater than T
for all t > 0 and for all T >. I}
= {x : x E n+(x)}
= {x : x E W(x)}.

367
368 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

The definitions for a flow are the same once an t-chain is defined, which is done in
Section 5.4.
In order to give an interpretation of the fundamental theorem below, we form equiv-
alence classes of points in 'R.('l) for which there are chains from one point to another
and back to the the first point. This concept is made precise in the following definition.
Definition. We define a relation'" on 'R.(<pI) by x '" y ify E n+(x) and x E n+(y). It
is clear that this is an equivalence relation. The equivalence classes are called the chain
components of 'R.(<pI). It is not hard to prove that for a flow, the chain components are
indeed the connected components of 'R.(<pI).
In Section 7.12 on Morse-Smale Systems, the definition of a Liapunov function in a
global sense is given, as opposed to near a fixed point. Also, a gradient vector field and
gradient-like vector fields and flows on Euclidean spaces and manifolds are defined in
that section. The following definition of Conley seems very different than the clefinltion
of gradient-like flow but it is compatible in the sense that a corollary of the FUndamental
Theorem of Dynamical Systems (given below) says that a strongly gradient-like flow is
gradient-like.
Definition (Conley). A flow <p' is called strongly gradient-like provided 'R(<pI) is to-
tally disconnected (and consequently made up entirely of fixed points, 'R( <pI) = Fix( <pI».
Using these definitions, we can state the main theorem.
Theorem 1.1 (Conley's Fundamental Theorem of Dynamical Systems). A flow
<pIon a compact metric space has a Liapunov function L which is strictly decreasing off
'R(<pI) and such that L('R.(<pI» is a nowhere dense subset ofR.

REMARK 1.1. Conley interprets his FUndamental Theorem by looking at the induced
flow after collapsing points in the same chain component to points and stating that the
resulting flow is strongly gradient-like. More specifically, let x '" y if x and y are in the
same chain component, i.e., y E n+(x) and x E n+(y). The flow <pI induces a flow 4)1
on the quotient space M" = M/{x '" y} in a natural way. Then 'R.(4)I) is clearly totally
disconnected. The theorem says that there is a strong Liapunov function L for the
flow <pl. This function is constant on chain components so it induces a strong Liapunov
function L for 4)1, and 4)1 is gradient-like. In particular, if <pI is strongly gradient-like
(the chain recurrent set of <pI is already totally disconnected), then it is gradient-like.
The proof of the theorem requires proving the connection between the chain recurrent
set and the collection of all pairs of attracting and repelling sets. We start by giving
these definitions for a flow which are much the same as for a diffeomorphism except
that the ''time" is continuous rather than discrete.
Definition. Let <pI be a flow on a metric space M. A set U is called a trapping region
(or isolating neighborhood) provided U is positively invariant and there is aT> 0 such
that <pT(cl(U» C int(U). A set A is called an attracting set lor a flow <pI provided there
exists a trapping region U such that A = n,>o <p'(U). A set A" is called a repelling set
provided there exists a trapping region U suCh that A" = nl<o <p'(M \ U). The set A*
is also called the dual repelling set for the attracting set A. -The ordered sets (A, A")
are called the attracting-repelling pair for the trapping region U. The collection of all
attracting-repelling pairs is denoted by

A = {(A, A") : U is a trapping region for the attracting set A


and the repelling set A"}.
9.1 FUNDAMENTAL THEOREM OF DYNAMICAL SYSTEMS

A set V is called a wed trapping region for the flow '(" provided there is some T >0
such that c1e'('T(V» C inteV), i.e., V is not neceesarily positively invariant.

REMARK 1.2. The sets A and A·, which we call an attracting set and a repelling set
respectively, Conley calls an attractor and a dual repeller respectively. We reserve the
word "attractor" for an attracting set which is transitive.
REMARK 1.3. Proposition 1.10 proves that if V is a weak trapping region with A =
n,>o '("~eV) and A· = n<o '("(M \ V) then there is a trapping region with the same
attracting-repelling pair, CA, A·). Proposition 1.9 proves that an attracting-repelling
pair has a trapping region which is strongly positively invariant in the serure that
'("(c\(U» C int(U) for all t > O. Thus, these two propositions give two different ways
in which the definition of a trapping region can be changed.
The following proposition gives a couple of useful properties of attracting sets.
Proposition 1.2. (aJ Any attracting set or repelling set is closed.
(b) Both A and A· are positively and negatively invariant.
PROOF. (a) Because of the property that '('T(c\(U» C int(U), it follows that A is also
=
the intersection of closed sets and so is closed: A n>o '("( cl(U». The proof for A·
is similar. -
(b) The proofs that attracting sets and repelling sets are invariant are similar, 80
we only look at A. Let x E A and fix any real s. Then x E '("~eU) for all t ~ 0, 80
'("(x) E ,(,'+'(U). Therefore, '("(x) e n'~I'1 '('I+'(U) =A. Since s is arbitrary, A is
both positively and negatively invariant. 0

The following theorem gives the connection between the chain recurrent set and the
set of all attracting-repelling pairs, and is used to prove the existence of a Liapunov
function which is strictly decreasing off X('(") as given in Theorem 1.1.
Theorem 1.3. Let '(" be a !Iowan a compact metric space M. Let

l' =n{A U A· : (A,A·) E A}.


Then l' = X('(").
Before starting the proof we give an example.
Example 1.1. Let '(" be the gradient flow on T'l with one sink Po, two saddles PI and
P2, and one source P3. If Uo =e =e
then Ao =
and Ao T2. Next, let UI be an attracting
neighborhood of the sink po. Then AI = {po} and Ai = {pa}UW'(P2)UW'(PI). If U2
=
is a positively invariant set that contains po and PI but not P2 or Pa, then A2 {Po} U
=
WU(PI) and A; {ps}UW'(P2). If U3 is a positively invariant set that contains Po and
=
P2 but not PI or P3, then As = {po} U WU(P2) and Ai {Pal U W'(PI). If U4 is a set
such that T2 \ U4 is a repelling neighborhood of Pa, then A4 = {po} U WU(PI)UWU(P2)
and A. = {ps}. Finally, if U5 = T'l, then A5 = T'l and As = e. By taking the
intersection of these sets, l' :::> nO~i~5Ai U Aj = {Pi: 0 $ j $ 3}. By the remarks
preceding the example, we see that the other inclusion is also true and so we have
equality (as stated in Theorem 1.3).

The following lemma proves Xe'(") :::> l' which is one direction of the inclusion of the
two sets in Theorem 1.3.
370 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

FIGURE 1.1. Neighborhoods UI. U2, and U4 for Example 1.1

Lemma 1.4. Ify rt R(cp') then there exists an (A,AO) E A such that y rt Au AO.
Therefore, R(cp') :) P.
PROOF. Take y rI: R(cp'). Then there is an ( > 0 such that y rI: 0i(y). Let U =
0i(y). Then cpl(cI(U» C int(U) (or else it would be p068ible to (-chain out of U
and the set would be bigger). Let A = n,>o
cp'(U) be the attracting set for U and
A° = n, <0 cp' (M \ U) be the dual repelling set. Since y rI: u, y rI: A. From the
definition -of 0i, it follows that there is aT> 0 such that cpT(y) E U 80 cpT(y) rI: AO.
Because AO is negatively invariant, y rI: AO. This proves y rt AU AO. 0
REMARK 1.4. To prove the second inclusion, that R( cp') C P, first note that for any
attracting-repelling pair (A,AO), all the fixed points and periodic points must be con-
tained in Au AO. Thus Fix(cp') U Per(cp') C P. In the following lemma, we prove that
given any attracting-repelling pair (A, AO) there is a Liapunov function which is strictly
decreasing off AUAo. Then in Lemma 1.6 below, we use this Liapunov function to show
that R(cp') c P.
Lemma 1.5. Let (A, AO) E A. Then there is a continuous function L : M ..... Ii such
that LIA =
0, LIAo =
1, L(M \ (A U AO» C (0,1), and L is strictly decreasing off
AUAo.
PROOF. Since A and AO are disjoint closed sets in a metric space, the function V :
M -+ [0, 1] given by
Vex) _ d(x, A)
- d(x, A) + d(x, AO)
is continuous, VIA = 0, VIAO = 1, and V(M\(AUAO» c (0,1). (The existence ofsuch
a function in a normal space which is not a metric space is given by the Tietze-Urysohn
Theorem.) This function V is not necessarily decreasing or even non-increasing.
Let
VO(x) = sup{V(cp'(x» : t ~ O}.
For x E A, cp'(x) E A so VO(x) =O. Similarly, if x E AO then VO(x) =
1. If
x rt A U AO then cp'(x) approaches A as t goes to plus infinity. (See Exercise 9.7.)
Therefore, if x rt AO, then V(cp'(x» goes to zero and the maximum of V(cp'(x» is
attained. Moreover, there is a neighborhood W and a bounded time interval [0, td such
that for yEW the maximum of V (cp' (y» for t ~ 0 is attained for some t E [0, t 11; it
follows that VO is continuous. Finally, VO(cp'(x» ~ VO(x) for t ~ 0 by construction.
Note that the inequality is not necessarily strict.
To make a strictly decreasing function, define L to be a weighted average of VO over
the forward orbit:
L(x) =10 00
e-' V· 0 cp'(x) ds.
9.1 FUNDAMENTAL THEOREM OF DYNAMICAL SYSTEMS 371

This function is easily seen to be continuous. Then for t ~ 0,

L(lPt(x» = fo'JO e-' V* (lP·+t (x» ds

$fo'>O e-' V*(lP'(x)) ds


=L(x).
The second inequality follows because V*(rp·+t(x» $ V*(lP'(x» for t ~ O. Thus L is
non-increasing along orbits.
Now take t > O. If L(lPt(x» = L(x) then V*(lP,+t(x» = V*(rp'(x)) for all s > O.
In particular, taking s = nt, we get that V*(lPnt(x» = V*(x) for all n > O. However,
if x ~ Au A* then lPnt(x) goes to A so V*(lPnt(x)) goes to zero. Thus it is impossible
since V*(lPHt(X» = V*(lP'(x» for all s > O. Therefore we have shown that L(x) is
strictly decreasing off A U A * as required. 0
Lemma 1.6. (a) Let (A, A*) E A bean attracting-repelling pair. Then R(rpt) c AuA*.
(b) FUrther, R(lPt) c P.
PROOF. Part (b) easily follows from part (a). To prove part (a), we take p ~ Au A*
and show p ~ R(lPt). Let L be the Liapunov function which is decreasing off A U A*
which is given by Lemma 1.5. Let Co = L(p) and CI = L(lPI(p)). Since L is strictly
°
decreasing at all points of L-I([C),Co]), there is a 6> with 6 < (Co - cI)/2 such that
if q E L-I([cI,cI + 6]) then lPI(q) E L-I([O,CI])' For each q E L-I([O,cd) there is
°
an f > such that for q' within f of q, L(q') < CI + 6. By compactness there is one
f that works for all points at once. Now if {p = Po, ... , Pnj t), ... , t n } is an f-chain
with tAo ~ 1, then L(lPt,(Po)) $ L(rpl(Po)) = CI' Then L(pd $ CI + 6 by the choice
of f. Next, L(lPt'(pd) $ L(lPI(pd) $ CI. Again, L(P2) $ CI + 6 by the choice of f.
Continuing by induction L(pA:) $ CI + 6 for 1 $ k $ n. Thus the chain can never get
back to p. This proves that p is not chain recurrent. 0

Together, Lemmas 1.4 and 1.6 prove Theorem 1.3.


To prove Theorem 1.1, we combine the Liapunov functions for different attracting-
repelling pairs to prove that there is a Liapunov function which is strictly decreasing
off P (which equals R(lPt». To carry out this construction, we need to know that the
number of such pairs is at most countable.
Lemma 1.1. The set A is at most countable (i.e., finite or countable).
PROOF. We want a neighborhood to uniquely determine the attracting set. Since there
are often proper subsets of attracting sets which are attracting sets, we use the pair
A x A* which is the unique pair that is oontained in int(U) x int(M \ U). Since M x M
is a compact metric space, it has a countable basis. Therefore, there is at most a
countable number of such pairs A x A * and A is countable. 0
Example 1.2. The Morse-Smale examples all have finitely many attracting-repelling
pairs. The following is an example of a function on the one torus (or circle), having
countably many attracting-repelling pairs. Let TI = {x mod I} be the one torus. Let
the equations on ']['1 be given by
x= f(x) = x 2 sin(l/x) for - 2/(311') $ X $ 2/11'.
Then f(x) can be extended outside this interval to have no more zeroes and be periodic.
The fixed points are x = 0, 1/(n1l') for all integers n. Since
!,(x) = 2x sin(l/x) - oos(l/x),
372 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

f'(I/(mr» = (-1)"+1. Therefore for any integer k, x = 1/(2k7r) is a fixed point sink and
x = 1/( (2k + 1 )11-) is a source. Thus there are countably many fixed point attracting sets.
There are other attracting sets made up of intervals of the form A = [1/(2j1l"), 1/(2k7r)],
where j and k are integers and either j < 0 < k, or j and k have the same sign
and 0 < Ikl < Ijl. The dual repelling set is also an interval, if j < 0 < k then
A· = yl \ (1/((2j - 1)11"), 1/((2k - 1)11"). (There are other attracting sets made up
of intervals which do not contain 0.) This gives the different types of attracting sets.
Notice that the set {O} is not an attracting set, but is the intersection of attracting sets.

Theorem 1.8. Assume cpl is a flow on a compact metric space M. Then there is a
Liapunov function L : M -+ R which is strictly decreasing off P and such that L(P) is
a nowhere dense subset of R.
PROOF. By Lemma 1.7, A is at most countable, i.e., A = {(AJ' A;)}~l' (The case
when A is finite requires small modifications.) For each j there is a weak Liapunov
function L J given by Lemma 1.5 which is strictly decreasing off A J UAj. Also, Lj(M) c
[D. 1]. Let
00

L(x) = 2 L 3- Lj(x).
,_I
j

The sum is absolutely convergent, so L is continuous. Also L(M) C [0.1]. The function
is a combination of non-increasing functions and so is non-increasing along trajectories.
Next, if x rt P. then x rt A" U Ak for some k. Then for t > 0, Lj(cp'(X» $ Lj(x) for all
j and L,,(cpl(X» < L,,(x), so L(cpt(x» < L(x). This proves that L is strictly decreasing
off P as desired. Finally, for any point pEP, each Lj(p) is equal to either 0 or I, so
L(p) is a number whose ternary expansion has only zeroes or two's. (Note the factor of
2 in the definition of L.) Therefore L(P) is contained in the nowhere dense Cantor set
made up of points whose ternary expansion has only zeroes or two's. 0

Theorems 1.3 and 1.8 combine to prove Theorem 1.1.

We end this section with three results which follow from the above results or are
closely related.
Proposition 1.9. Any attracting set A has a neighborhood U which is a trapping
region for A and which is positively invariant in the strong sense that cpl(c1(U» C int(U)
for all t > O.
PROOF. Let L be the Liapunov function given by Lemma 1.5 and let U = L-1([O,f»
for small f > D. Then cl(U) C L-l([D,f]) and cpt(c1(U» C cpl(L-l[D,f]) C L-1([O,f»
for t > o. 0
The following proposition proves that it is enough to assume there is a weak trapping
region.
Proposition 1.10. Let V be a weak trapping region for the pair (A, A·) of attracting
and repelling sets. Then there is an trapping region U such that A = n,>o cpt(U) and
A· = nt$ocpl(M \ U). -
PROOF. Let V be a weak trapping region and T the time such that cl(cpT(V» C int(V).
Let
U= n
cpt (int(V».
O$l$T
9.1 FUNDAMENTAL THEOREM OF DYNAMICAL SYSTEMS 373

We leave to Exercise 9.5 the verification that U is open using the continuity of the flow
on initial conditions. For 0 :5 s = jT + s' with 0 :5 s' < T.

",'(U) = [ n ",'+iT+.' (int(V))] n[ n ",,+jT+.' (int(V))]

cU.

For s = T,

",T(c1(U» c n ",' ° ",T(c1(V))


O~'<T

C n ",'(int(V»
O~I<T

cU.

This proves that U is a trapping region. It is also easily checked that the attracting-
repelling pair for U is still equal to (A,A"). 0
The next proposition says that the chains from a point x E 'R.( ",') to itself can
actually be take inside 'R.(",').
Proposition 1.11. Let ",I be 8 flow on 8 compact metric space M. Then the the map
restricted to the chain recurrent set has all points chain recurrent. 'R.(",'I'R.(",')) = 'R.(",').
REMARK 1.5. This result is not true if the ambient space is not compact. Note also
that the analogous theorem is not true in general for the nonwandering set. i.e .• there
exist examples where 0(",'10(",'» '" 0(",'). Remark 4.2 later in this chapter gives one
such example.
PROOF. Let P E 'R.(",'). For each n > 0 there is 8 periodic lin-chain through p:
{p = P~ .... ,p~ = P;tl = 1..... tA: = I}. It is easily seen to be possible to take all
these tj = 1. Let Cn = {pj }jEZ be a periodic lin-chain through p. Then the set Cn is
a compact subset of M. In the Hausdorff metric on compact subsets of M. there is a
subsequence C n ; that converges to some compact set C c M. We show below that for
any q E C and t > 0 there is a periodic t-chain through q, {q = Xo .... ,XA:; 1.... , I}
with Xj E C. It follows that q E 'R.(",'IC) c 'R.(",'). This is true for any q E C, so
C c 'R.(",'). Next p E C, so P E 'R.(",'IC) c R(",'). This argument applies to any
p E 'R.(",'), so R(",') c R(",'IC) and so they are equal.
We have only to show that there is a periodic t-chain through q with Xj E C.
By uniform continuity of the flow ",' on M, there is a 6 = 6(tI3) > 0 such that for
dCa, b) < 6. d(",l(a). ",l(b)) < t13. We can also take 6 < E/3. Because the Cni
converge to C, we can find an n = nk such that lin < E/3 and the distance from Cn to
C in the Hausdorff metric is less than 6. Assume {pf} has period j so pj = p8 = p.
We can extend pf periodically to all i E Z. so pr-rj = pf for all i. For each pf
take Xi E C with d(Xi. pf) < 6, Xi+j = Xi for all i. and Xi = q for some i. Then
d(",l(Xi).Xi+1) < d(",l(Xi).",l(pf» + d(",l(pf).pf+1) + d(Pr-rl,XHd < E. Therefore,
Xi is a periodic t-chain through q. 0
374 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

9.1.1 Fundamental Theorem for a Homeomorphism


Let f : N -- N be a homeomorphism on a compact metric space. We can form
the suspension of f and obtain a flow 'P' on M = (N x R)/ -. By the Fundamental
Theorem for flows, there is a Liapunov function L : M ~ R which is strictly decreasing
off the chain recurrent set of <p'. By restricting L to N (actually (N x {O})/ -, we get
a Liapunov function for f which is strictly decreasing off the chain recurrent set of f.
This proves the following theorem.

Theorem 1.12 (Conley's Fundamental Theorem for a Homeomorphism). A


homeomorphism f on Ii compact metric space has Ii Liapunov function L which is strictly
decreasing off n(f) and such that L(n(f)) is Ii nowhere dense subset of R.
There is the following corollary just as in the case for flows.

Corollary 1.13. Let f : N -- N be a homeomorphism on a compact metric space.


Then n(fln(f») = n(f).

9.2 Stable Manifold Theorem for a Hyperbolic


Invariant Set
The last section gave some of the basic results about the chain recurrent set using
only ideas from point set topology. The other results which we give use the hyperbolic
structure on the chain recurrent set, or at least on the closure of the periodic points.
One of the key results about systems with a hyperbolic structure on the chain recurrent
set is the existence of stable and unstable manifolds as stated in Section 7.1.3. In this
section, we restate the theorem and sketch the proof.
In the statement of the theorem, we use coordinates near points in the invariant set A.
In a Euclidean space, a neighborhood Vp(t) of p can be taken as {p} +1E;(t) x 1E~(t). In
a manifold, a neighborhood can be taken of the same form if we use local coordinates.
However, it is not always possible to take one set of coordinates for all the different
points of A. Another approach is to use the exponential map from tangent vectors at p
into the manifold:
exp p : TpM -> M.

In a Euclidean space, expp(vp) = p+vp' In a manifold, the point expp(vp) is obtained


by going along a geodesic (distance minimizing curve) starting at p in the direction vp.
If the manifold is embedded in some larger Euclidean space R N, then exp p (vp) can be
visualized as taking the point p + vp which is probably not in the manifold M and then
projecting this point into M along the subspace (TpM).l of vectors perpendicular to
TpM. See Franks (1979) for a more complete explanation of this latter method. In any
case the neighborhood of p can be taken as

where
Bp(t) = 1E;(t) x E~(t).
In the proofs below, we use the fact that the D(expp)op = id, so IID(expp)vp - idll is
small for vp E Bp(t), although we do not explicitly isolate the fact that D(expp)op is
different from the identity.
9.2 STABLE MANIFOLD THEOREM FOR A HYPERBOLIC INVARIANT SET 375

Theorem 2.1 (Stable Manifold Theorem for a HyperboUc Set). Letf: M -+ M


be a C" diffeomorphism. Let A be a compact invariant set with a hyperbolic structure
for f with hyperbolic constants 0 < ~ < 1 < oX and C ~ 1. Then there is an f. > 0 such
that for each pEA there are two C" embedded disks W:(p, f) and W;'(p, f) which
are tangent to IE~ and IE~ respectively.
Using the exponential map discussed above, W:
(p, f) can be represented as the graph
of a C" function (1~ : 1E~(l) -+ E~(f.) with (1~(Op) = Op and D({1~)op = 0:

Also, the function {1~ and its first k derivatives vary continuously as p varies. Similarly,
there is a C" function (1~ : E~(t) -+ E~(f.) with (1~(Op) = Op and D({1~)op = 0 and with
the function (1~ and its first k derivatives varying continuously as p varies such that

Moreover for l > 0 small enough and using the neighborhoods Vp(l) defined above,
W:(p, f) = {q E Vp (€) : fi(q) E VfJ(p)(t) for j ~ O}
= {q E Vp(l) : fi(q) E V/;(p)(t) for j ~ 0 and
d(fi(q),fi(p» $ C~id(q,p) for all j ~ OJ.
Similarly,

W.U(p,f) = {q E Vp(t) : ri(q) E V/-;(p)(t) for j ~ O}


= {q E Vp(t) : ri(q) E V/-J(p)(t) for j ~ 0 and
d(ri(q),ri(p» $ CoX-id(q,p) for all j ~ OJ.
REMARK 2.1. The idea of the proof we sketch is to repeat the argument for a single
fixed point adding the fact that points near p get mapped to points near f(p).
It is possible to prove the Stable Manifold Theorem for a hyperbolic set as a corollary
of the Stable Manifold Theorem for a fixed point in a Banach space. See Hirsch and
Pugh (1970) or Shub (1987). These references also contain a complete proof (rather
than a sketch of a proof which we give for our approach).
PROOF. We also assume that the constant C ~ 1 in the definition of hyperbolic struc-
ture is taken to be one. This is always possible by taking an adapted norm as we proved
in Theorem VII.l.l of Section 7.1.3.
For each point pEA, we take Bp(f) C TpM and Vp(f) = expp(Bp(t» as defined
above. The map from a neighborhood of p to a neighborhood of I(p) induces a C"
map Fp : Bp(r) -+ B/(p)(Cor) for some Co ~ oX defined by

Fp(vp) = eXP/(~) 01 0 expp(vp)'


For a small v p, expp(vp) is a point in M near p, f 0 exPp(vp) is a point in M near
f(p), and eXP/C~) of oexpp(vp) is a relatively small vector in T/Cp)M. Notice that

Fp(Op) = eXP/C~) of(p)


= O/(p).
376 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Also D(Fp)op = DJp because D(expp)op = id and D(exp/(p»op = id. Therefore DJp
is a linear approximation for the nonlinear map Fp in Bp(E).
In order to consider all these maps for different p at once, we use the bundle structure
of T"M to keep the different neighborhoods disjoint. The set

B,,(r) = {vp E Bp(r) ; pEA} c T"M

is a bundle over A with all the sets Bp(r) contained In distinct tangent spaces TpM and
so do not intersect. (The neighborhoods Vq(E) and Vp(E) do intersect for q near p.)
Since A is often not a manifold (e.g. a Cantor set), this space is a metric space but not
a manifold. The map
F; B,,(r) ..... B,,(Cor)
is continuous and Ck on each fixed fiber Bp(r). Note that since A is compact, Co can
be taken independent of p.
Let 0 < I' < I < A be bounds on the derivatives on E~ and E~; IIDJpIE~1I < I'
and m(DJpIE~) > A for all pEA. (Remember that for a linear map A, m(A) is the
minimum norm of A. See Section 4.1.) Take E > 0 small enough so that I' + 2f < 1 and
A - 2E > 1. Given such an f > 0, there is an r > 0 small enough so that for q E Bp(r),

A"(q) A'U(q»)
D(Fp)q = ( A~'(q) A~U(q)

with IIA~'(q)1I < 1', m(A~U(q» > A, IIA~U(q)1I < E, and IIA~'(q)1I < E. In this expres-
sion, the derivative of Fp is taken along the fiber with p fixed. (We have used the fact
that D(expp)q is near the identity.)
We let

n Fj,(p)(B/i(p)(r»
00

W:(p) = and
j=O
W:(p) = expp(W:(p»,

where we think of the local stable manifold W;(p) as being represented in the local
coordinates given by the linear space TpM. We want to show that it is I-Lipschitz. As
before, we consider the stable and unstable cones (but now at different points);

C~ = {(v~, v~) E E~ x E~ ; Iv~1 :::; Iv~l}


= {vp E TpM ; 11r~vpl :::; 11r~vpl}, and
c; = {vp E TpM ; 11r~vpl ~ 11r~vpl}·

As before, the condition that

W:(p) n [{q} +'c;J = {q}

for all points q E W;(p) is equivalent to the graph being I-Lipschitz for each fixed p.
The following facts are proved just as in the proof of the stable manifold of a single
point.
1. For q E Bp(r), D(Fp)qC; C Cj(p)"
2. Let ql,q2 E Bp(r) with q2 E {qd +~. Then Fp(Q2) E {Fp(q2)} + Cj(p)' and
l1rj(p)Fp(q2) - 1rj(p)Fp(qdl ~ (A - E)\1r~(q2 - qdl·
9.3 SHADOWING AND EXPANSIVENESS 377

3. Let ql,q2 E Bp(r) with q2 E {qd + C;. Then F;I(q:/) E {F;I(q2)} + Ci-1(p).
and l7rj-l(p)F;I(q2) - 7rj_l(p)F;I(ql)1 ~ (I-' + l)-II7r;(q2 - ql)l.
4. Let D8. p be an unstable disk in Bp(r), i.e., the image of a Cl function t/J : E~(r) -+
E;(r) with Lip(t/J) ~ 1. Let

D~,/"(p) = F/"-I(p)(D~_I,fft-l(p» n B/ft(p){p)


be defined by induction. Then for n ~ I, D:,/"(p) is an unstable disk in Bf"(p)(r)
and
F-n(D~,J.(p» C F-n+l(D~_I./"_I(P» C ... c Do.p
is a nested set of unstable disks in Bp(r) with
n
diam[n P-j(Dj./j(p»] ~ (.\ - i)-n2r.
j=O

5. The manifold W:(p) is a graph of a I-Lipschitz function "'; : E;(r) --+ E~(r) with
",;(Op) = Op.
6. Ifq E W:(p), then IFt(q)-O/J(p)1 ~ (I-'+l)ilq-Opl for j ~ O. Ifq E W:(p) c M
is a point in the stable manifold in M, then d(fi(q),p(p» ~ (I-' + l)id(q,p)
converges to zero at an exponential rate.
7. For p fixed, the manifolds W:(p) and W:(p) are Cli W:(p) is tangent to E;
at
Op and W:(p) is tangent to E; at p.
8. For p fixed, the manifolds W:(p) and W:(p) are Ck if f is Ck.
The fact that the function u; and its derivatives vary continuously as p varies follows
easily from the fact that this is true about Fp and its derivative along fibers. Thus the
construction for p and that for p' are close to each other if p is near p'.
Since we are assuming that f is invertible (a diffeomorphism), the corresponding
facts about W;'(p) follow from looking at rl. 0
In the next section, we outline a modification of the above proof which shows that
a 6-chain can be i-shadowed near a hyperbolic invariant set. We later give another
variation which proves the structural stability of Anosov diffeomorphisms.

9.3 Shadowing and Expansiveness


In this section, we prove that it is possible to shadow in a neighborhood of a hyperbolic
invariant set and that a diffeomorphism is expansive on a hyperbolic invariant set. R.
Bowen carried over the idea of shadowing to hyperbolic invariant sets, based on the
work of D. Anosov (1967) and Ya. Sinai (1972) for Anosov systems. See Bowen (1975a)
and (1975b). In the next section, we show that if'R.(f) has a hyperbolic structure then
it breaks up into a finite number of pieces, and the results of this section show that we
can i-shadow an 6-chain by an orbit in one of these pieces.
We start by defining shadowing precisely. We also want a condition on the invariant
set which allows us to conclude that the orbit which shadows a 6-chain lies in a given
invariant set. The necessary assumption is that the invariant set is isolated, which we
define next. After these definitions, we state the result about existence of shadowing.
Definition. Let f : M --+ M be a homeomorphism. Let {Xj}~!.it be a 6-chain for f.
(In this section, often we add the requirement that either it = -00 or h = 00 or both.)
A point y E M i-shadows {Xj}~!.it provided d(fi(y),xi) < f for it ~ j ~ h.
378 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Definition. A closed invariant set A is said to be isolated provided there is a neigh-


borhood U of A such that A C int(U) and

A= n
nEZ
r(U)·

The neighborhood U is called the isolating neighborhood. Notice that the isolated in-
variant set is the maximal invariant set contained in its isolating neighborhood, i.e., any
invariant set A contained entirely inside the neighborhood U is a subset of A.
The geometric horseshoe given in Section 7.3 is an example of an isolated invariant set
which is not an attractor. The set S given in that section is an isolating neighborhood.
When we defined an isolating neighborhood (trapping region) for an attracting set
we required that U is positively invariant. With this assumption on U, the intersection
of r(U) for all integers n (given above) is the same as the intersection for all positive
integers n (used for attracting sets).
For an invariant set which is not an attracting set, we only require the set A be locally
isolated in U. There can be other points x in the neighborhood U such that the orbit of
x leaves U and later returns to U. These points could be either periodic, nonwandering,
or chain recurrent.
Theorem 3.1 (Shadowing). Let A be a compact hyperbolic invariant set. Given
t > 0, there exist 6 > 0 and TJ > 0 such that if {XjH~iI is a 6-chain for f with
d(xj, A) < TJ for il ~ i ~ h, then there is a y which t-shadows {Xj};~iI' If the 6-chain
is periodic, then y is periodic. Moreover, if il = -00 and h = 00 for the 6-chain, then
y is unique. If h = -it = 00 and A is an isolated invariant set (or has a local product
structure), then the unique point YEA.
REMARK 3.1. R. Bowen's proof of this theorem assumes that the set has a local product
structure and uses the conclusion of the stable manifold theorem and then uses point
set topology ideas. Instead, we use the proof of the stable manifold theorem as given in
the last section to prove the result directly.
REMARK 3.2. Meyer (1987) and Meyer and Sell (1989) give a proof of the shadowing
theorem using the implicit function theorem. At SOD;le general level, their proof is really
the same as the one given here but it appears very different. We present the ideas
geometrically, while their proof is cleaner analytically. Meyer and Hall (1992) give a
complete exposition of this proof in the case when the invariant set is a subset of a
Euclidean space. Also see Palmer (1984) and Chow, Lin, and Palmer (1989) for another
proof.
REMARK 3.3. Grebogi, Hammel, and Yorke (1988) have extended this result to show
that systems without a uniform hyperbolic structure can often shadow 6-chains for long
intervals of time.
PROOF. Throughout the proof, we take 0 < r < E. First extend the splitting E~ x E~
from on A to a neighborhood V C M. Take TJ small enough so the rrneighborhood
of A is inside V. Let Bp(r) = E;(r) x E~(r) C TpM and Vp(r) = expp (8p (r)) be as
in the last section. The box Bp(r) is a subset of TpM while Vp(r) is the comparable
neighborhood of pin M. Let {Xj}1!.j, be a 6-chain for f. If 6 > 0 and TJ > 0 are
small enough, then f(VxJ (r)) C VX;+l (Cor) for some Co > A. We introduce the map
Fj : Bx; (r) -+ BXH , (Cor) which should be considered as the map f represented in local
coordinates at Xj and Xj+!
9.3 SHADOWING AND EXPANSIVENESS 379

Note since f(xj) is not necessarily equal to Xj+1, IFj(Oxj) -OxHII is small, say less than
6, but is not necessarily equal to zero. However for small enough 6, the estimates for
this sequence of maps are similar to those in the last section. Here we use v > 0 instead
of f to measure the change in the derivatives because we are using f for something else.
One difference between the two proofs is that in order to know that an unstable disk
has an image which is an unstable disk, we need to take into consideration the fact that
Fj(Ox) -:;, OXJ+I: the requirement becomes (>.-v)r-6 ~ r or (>.-v-l)r ~ 6, so that the
jumps measured by 6 are bounded in terms of quantities determined by the hyperbolicity.
Similarly in the consideration of stable disks, we need that (p + v)-I r - 6 ~ r or
[(p + v)-I - llr ~ D. Thus we first take the neighborhood V of A in M and 0 < r < f
small enough so that the estimates on the derivative of

Fp = eXP/(~) of 0 expp : Bp(r) -+ Bf(p)(COr)

are true for all p E V in tenns of the extended splitting; for q E Bp(r),

A"(q) A'''(q»)
D(Fp)q = ( A~'(q) A~"(q)
the entries satisfy the following estimates: IIA~'(q)1I < p, m(A~"(q» > >., IIA~"(q)1I <
v/2, and IIA~'(q)1I < v/2. Next take 6 > 0 such that the same estimates are true for
Fj constructed from a D-chain {xi }~!.jl provided Xi E V for all j.
Assume that jl = -00. To shorten the notation, for k > 0 write F! for Fm+k-I 0
···0 Fm. Using this notation,

is an unstable disk in BXJ (r) and Dj(r) = eXPXj (Dj(r» is an unstable disk in M near
Xj' (The fact that Fj(Oxj) -:;, Ox,+! means that the disks Dj(r) do not necessarily go
through OXJ and the Dj(r) do not necessarily go through Xj but are only nearby.) If
q E D~(r) and k ~ 0, then q E F~k(Bx_.(r» so (F~k)-l(q) E Bx_.(r). In terms of
D(;(r), if y E D~(r), then rk(y) E Vx_. (r) and the backward orbit of y stays within
r of the D-chain.
Similarly, if 32 = 00 we can find stable disks. For k negative, let F! = F':;;~k 0 ..• 0
F':;;~I. Then

is an stable disk in BXj (r) and Di(r) = eXPXj (DJ(r» is a stable disk in M near Xi. If
q E D~(r), then for all k ~ 0, q E F~,,(Bx_.(r» so F::;(q) E Bx_.(r). In terms of
D&(r), ify E D~(r), then f''''(y) E Vxl.l(r) and the forward orbit ofy stays within r of
the chain.
Because of the slopes of these two disks,
00

n FJ-A:(Bxj_.(r»
k--oo

is a single point Qj E Bxj(r) and eXPx;(Qj) = Yj E VXj(r). Because of the nature of


the intersection defining Yo, P(Yo) stays within r of Xj for each j. Since r < t, Yo
380 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

(-shadows the chain. The fact that Yo is a single point shows that the point which
(-shadows the chain is unique when h = -iI = 00. Also the uniqueness shows that
Fj(lJ.i) = lJ.i+l, or f(Yj) = Yj+1 as points of M.
If the chain is It periodic, then the uniqueness shows that F~(qo) qk= = qo, so qo
is a periodic point for F, or Yo E M is a periodic point for f.
If the chain is not infinite in one direction or the other, then the intersection is a
nonempty strip but not a point. This gives the existence of shadowing but not the
uniqueness.
Now in the case that A is also an isolated invariant set, let U be an isolating neigh-
borhood and take V and r small enough so that Vp(r) C U for all p E V. Then,
P (yo) = Yj E Vx, (r) CU. Therefore Ij (Yo) E U for all j, and so Yo E A since A is the
maximal invariant set in U. 0
The next result compares the points with w-limit sets in A with the stable manifolds
of points. It shows that if a point approaches the whole set A then it has to approach
the orbit of some point in A. First we give a definition of the stable manifold of the
invariant set.
Definition. If A is an invariant set, the stable manifold of A, W'(A), is defined to be
all points q such that w(q) C A. Notice that often the stable manifold is not a manifold.
Even for the horseshoe it is a Cantor set of curves. Similarly, the unstable manifold of
A, WU(A), is defined to be all points q such that a(q) C A.
A point q in the stable manifold of a single point p in an invariant set has a forward
orbit which approaches the forward orbit of p in phase, d(fk(q),Jk(p» goes to zero as
It goes to infinity. If the invariant set is not hyperbolic then a point q might approach
the invariant set and be in the stable manifold of the invariant set, without being in
phase to anyone point in the invariant set. The following theorem proves that a point
which approaches a compact isolated hyperbolic invariant set necessarily is in phase
with some point in the invariant set.
Corollary 3.2 (10 Phase). Let A be a compact isolated hyperbolic invariant set.
Then

W'(A) = U W'(x) and


xEA
WU(A) = U WU(x).
xEA

PROOF. Let Y E W·(A). Then there is an N > 0 such that d(fj(y), A) < v for j;:: N.
Take Xj E A such that d(fj(y),Xj) < v for j;:: N. By uniform continuity of I, Xj is &
6-chain if v is small enough. Let Xj =
P-N(XN) E A for j ~ N. This 6-chain can be
uniquely (-shadowed by a point x E A. Thus for j ;:: N,

d(fj (y),Jj (x» ~ d(fi (y), Xj) + d(xj, Ij(x»


~ 6+(.

Since the forward orbit of y stays near the forward orbit of x, the Stable Manifold
Theorem implies that y E W·(x).
The proof for W"(A) is similar. 0
9.4 ANOSOV CWSING LEMMA 381

Corollary 3.3 (Expansiveness). Let A be a compact hyperbolic invariant set. There


exists a fJ > 0 such that ifx,y E A with d(fi(x),li(y» ~ fJ for all j E Z then x = y.
REMARK 3.4. A diffeomorphism with the property of this corollary is called expansive.

Any diffeomorphism that is expansive also has sensitive dependence on initial condi-
tions. (See Section 3.5 for the definition.) Therefore, a diffeomorphism restricted to a
hyperbolic invariant set has sensitive dependence on initial conditions.
PROOF. Let xi = J1(x) and Yj = J1(y). Both of these are infinite c5-chains. Each is
orbit which c5-shadows the other. By uniqueness of shadowing, x = y if c5 is small
!\II
enough. 0

9.4 Anosov Closing Lemma


In this section we prove that if A is either the set limit set L(f) or the chain recurrent
set 'R(f) and A hIlS a hyperbolic structure, then A = c1(Per(f». Thus if (i) A is one
of these two sets and is hyperbolic, and (ii) there there are transverse intersections of
stable and unstable manifolds for some points in A, then we can conclude that there are
transverse intersections of stable and unstable manifolds for periodic points. Sometimes
we can even get a homoclinic point for the same periodic point. Since a transverse
homoc1inic intersections for a periodic point implies the existence of a horseshoe, this
result can be used to get an invariant set nearby with additional periodic points.
We proceed with the statement and proof of the An060v Closing Lemma.
Theorem 4.1 (Ano80v Closing Lemma). Assume I: M -+ M is a C 1 diffeomor-
phism (or flow) on a compact manifold M.
(a) Assume that the chain recurrent set of I, 'R(f), has a hyperbolic structure. Then
the periodic points are dense in the chain recurrent set, and cl(Per(f» = 'R(f) = L(f) =
O(f).
(b) Assume that the limit set of I, L(f), has a hyperbolic structure. Then the
periodic points are dense in the limit set, c1(Per(f» = L(f).
(c) Assume that the non wandering set O(f) is hyperbolic. Then the periodic points
are dense in the nonwandering set of the map restricted to the nonwandering set,
c1(Per(f» = O(fIO(f)), i.e., cl(Per(f» = O(F) where F = fIO(l).
REMARK 4.1. In part (c), if 0(1) is hyperbolic it is not true that cl(Per(l» =.O(f). A.
Dankner (1978) gives an example of a diffeomorphism with a hyperbolic nonwandering
set for whic1I c1(Per(f» '" O(f). The problem is that there are a point x E O(f) and a
neighborhood U ofx such that ifz, I"(z) E U with k ~ 1 then {fj(z) : 0 ~ j ~ k} is not
contained in a small neighborhood of O(f). Thus the orbit segment {fj (z) : 0 ~ j ~ k}
does not stay in the region of M where I is hyperbolic and so the periodic f-chain
generated by {fi(z) : 0 ~ j ~ k} can not be shadowed by a periodic orbit.
REMARK 4.2. There are examples where L(f) is hyperbolic but L(f) '" O(f) c'R(f).
Figure 4.1 gives the phase portrait of an example where the limit set is hyperbolic, but
the nonwandering set is not hyperbolic. The point q of non-transverse intersection is
nonwandering but not a limit point, q E O(f) \ L(f). It follows that the whole orbit
of q are nonwandering points whic1I are not lim1t points. Also, q E O(f) \ 0(f10(f».
Exercise 9.14 asks the reader to prove these facts about q.
PROOF. (a) We showed in the section on Conley's Theorem that 'R(fI'R(f» = 'R(f).
This means that given x E 'R(f), there is a periodic fJ-cbain {xi} with Xo = x, Xj+/< = Xi
for all j, and Xi E 'R(f). By the Shadowing Theorem, there is a periodic point y E 'R(f)
382 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

-t
FIGURE 4.1. Phase Portrait for Remark 4.2

which f-shadows the chain. The point y is within 6 + f of x. Since this is arbitrarily
small, x is in the closure of the periodic points.
Since cl(Per(f)) C L(f) c n(f) c n(f), and the outer two are equal, we have that
cl(Per(f)) = L(f) = n(f) = n(f).
(b) Since L(f) = cl (U. w(z)) , it is sufficient to prove that the periodic points are
dense in an arbitrary w(z). Let x E w(z). Since M is compact, w(z) is compact.
In this circumstance, we proved earlier that d(fi(z),w(z)) goes to zero as j goes to
infinity. Therefore given 6 > 0, we can find k and n such that d(f/c (z), fk+n (z» ~ {),
d(f/c(z),x) < 6, and d(P(z),w(z)) ~ {) for k ~ i ~ k + n. Let zi be the n-periodic
6-chain with Zj = fi(z) for k ~ j < k + nand Z/c+n = Z/c. By construction this whole
periodic 6-chain is within 6 of w(z) and so of L(f). This chain can be f-shadowed by a
periodic point y. Then y is within 6 + f of x. The sum 6 + f can be made arbitrarily
small, so x is in the closure of the periodic points.
We leave the proof of part (c) to the exercises. See Exercise 9.15. 0

9.S Decomposition of Hyperbolic Recurrent Points


Conley's Fundamental Theorem of Dynamical Systems gives a decomposition of the
chain recurrent set into invariant sets. In this section we show that if f has a hyperbolic
structure on nu) then nu) has a finite number of chain components, each of which
is an isolated invariant set. This conclusion is not true without the added assumption
of hyperbolicity.
In other books, it is often assumed that f satisfies a condition in terms of the non-
wandering set called Axiom A. A diffeomorphism of flow I is said to satisfy Axiom
A provided it has a hyperbolic structure on n(J) and cl(Per(f)) = n(f). See Smale
(1967). We mentioned in the last section that the second condition of Axiom A does
not follow from the first condition.
Rather than assume n(J) has a hyperbolic structure or I satisfies Axiom A, we
often only assume that f has a hyperbolic structure cl(Per(f)). This latter condition
is a weaker assumption than either of the other two assumptions and isolates what is
actually necessary to make the theorem true. A second part of the main theorem assume
the limit set L(J) has a hyperbolic structure. Again this assumption is implied by the
assumption that either (I) n(J) has a hyperbolic structure or (ii) I satisfies Axiom A
9.5 DECOMPOSITION OF HYPERBOLIC RECURRENT POINTS 383

and so is a weaker assumption. This weaker assumption is exactly what is needed to


make the theorem work. The treatment given in this section closely follows the lecture
notes by Newhouse(1980) who first isolated the actual assumptions which we use. Many
of the original theorems are due to Smale. See Smale (1967).
Through this section, / is a C 1 diffeomorphism on a compact manifold M. The
results are also true for a flow on a compact manifold but we do not state the results
using this terminology.
As indicated in the introduction above, we use several types of recurrent sets in this
section: limit set, nonwandering set, and chain recurrent set. We review the notation
for these concepts in the following definition.
Definition. As we have defined before, n(f) is the set of all nonwandering points and
R(f) is the set of all chain recurrent points. As a third type of recurrent points, the
limit set 0/ / is defined to be the following set:

L(f)=cI( U w(p)ua(p»).
pEM

Using the definitions, it is easy to check that Per(f) C L(f) c n(f) c R(f). In the
theorems of this section, we often assume that cI(Per(f», L(f), or R(f) has a hyperbolic
structure. We could also assume that n(f) has a hyperbolic structure, but then we have
to add the assumption that cI(Per(f» = n(f), i.e., that / satisfies Axiom A ..
To break up the periodic points into pieces, we form equivalence classes of periodic
points using the relation given in the following definition.
Definition. We let H(f) be the set of all hyperbolic periodic points of ,. If q,p E
H(f), then we say that p is heteroclinically related to q, or p is h-related to q, provided
WU(O(p» has a nonempty transverse intersection with W'(O(q» and W"(O(q» has
a nonempty transverse intersection with W·(O(p». (Newhouse calls this property
homoclinically related.) We form equivalence classes of h-related points and write p "" q
if P is h-related to q. For p E H(f), let

Hp={qEH(f): p"'q}.

The set Hp is called the h-closs 01 p. Note if p is h-related to q then dimWU(p) =


dimWU(q) since there is a transverse intersection of the stable manifold of p and the
unstable manifold q and vice versa.
Proposition 5.1. Being h-related is an equivalence relation on H(f).
PROOF. It is clear that p "" p and that p '" q if and only if q "" p.
What we need to check is transitivity: assume p "" q and q '" r and show that p '" r.
Let n1, n:h and n3 be the periods of p, q, and r respectively. Since the manifolds for
the orbits are transverse, we can find p' E O(p) such that W"(p') intersects W'(q)
transversally at a point x. By replacing x by fiRIR·(X) for some j ~ 1, we can assume
that x lies in the local stable manifold of q. Let ~u be a small disk in WU(p') through
x. By the Inclination Lemma, fiR·(a U) accumulates on W,~(q) in a C1 manner.
Next, because q is h-related to r, there is a r' E OCr) such that W"(q') intersects
W'(r') transversally at some point y. By replacing y by r;R,RI(y) for some j ~ I,
we can assume that y lies in the local unstable manifold of q. Let~' be a small disk
in W'(r') through y. See Figure 5.1. By the Inclination Lemma,l-jR'R'(~') accumu-
lates on W,~(q) in a C 1 manner. Since W,~(q) and W,~(q) cr06S transversally at q,
and fiR,R·(a U) and l-jR.n'(a') accumulate on them in a C1 manner, it follows that
384 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

p' I X \
r'

FIGURE 5.1. Disks ~u and ~.

pnln2(~U) and l-jn2n.(~") cross transversally for large enough j. Thus WU(O(p))
has a nonempty transverse intersection with W"(O(r)). A similar argument shows that
WU(O(r)) has a nonempty transverse intersection with W·(O(p)). Combining, p is
h-related to r. 0
Next, we show that the closure of a single class Hp, cl(Hp), is topologically transitive
using the Birkhoff Transitivity Theorem.
Proposition 5.2. For any p E H(f), the set cl(Hp) is closed, invariant under I, and
topologically transitive for I.
PROOF. Let X = cl(Hp). Then X is a complete metric space with countable basis.
Clearly X is closed. It is the closure of an invariant set and so is invariant. (See
Exercise 9.4.) We need to show that the Birkhoff Transitivity Theorem applies. Take
any two open sets U1 and U2 • It is sufficient to show that O(U.) n U2 '" 0. Take
any r J E Uj n Hp for j =
1,2. The orbit O(r.) C O+(Ul) so W.U(O(rd) C O+(U.).
By iteration, we get that WU(O(r.)) C O+(U.). Because r1 is h-related to r2 (both
are in Hp), WU(O(rd) intersects W'(0(r2)) transversally at some point y. We have
shown that y E O+(U.). By an argument as above applied to r2, Y E 0-(U2 ) and
so there is some k > 0 such that I"(Y) E U2 . Because O+(U.) is positively invariant,
I"(Y) E O+(U.). Thus O+(U.) n U2 '" 0. 0
Example 5.1. Let I be the map on the standard horseshoe A. Any two periodic points
are h-related. Also A = cl(Hp). By the above theorem it follows that I is topologically
transitive on A. (The fact that I is topologically transitive on A also follows from the
conjugacy to the two sided shift.)
Example 5.2. The fact that the solenoid, or any other connected hyperbolic attracting
set, is topologically transitive follows from the following proposition.
Proposition 5.S. (a) Let N be compact and connected and I : N -+ N be a C 1
diffeomorphism into N. Assume that N is a trapping region, so that l(cl(N)) C int(N).
Let A = nji!O Ji (N) be the attracting set. Assume that I has a hyperbolic structure
on A and the periodic points are dense in A. Then I is topologically transitive on A.
(b) In fact, if A is a connected hyperbolic attracting set for a diffeomorphism f then
f is topologically transitive on A.
PROOF. A connected attracting set has a connected trapping region, so part (b) follows
from part (a).
9.5 DECOMPOSITION OF HYPERBOLIC RECURRENT POINTS 385

To simplify the notation in the proof, we write H to mean the hyperbolic periodic
points in A, H(f) n A. Because there is a hyperbolic structure on A, there is an f > 0
such that if p, q E H with d(p, q) < f then p is h-related to q. Thus, each Hp is open
in H. Take Up ::> Hp open in M with Up contained within a distance of f/3 of Hp.
Since U Hp = H, and H is dense in A, UPEH Up ::> A. Therefore we have an open
pEH
cover of A.
Next, we claim that two Up and Uq either coincide or are disjoint. If x E Up n Uq,
there is a p' E Up and q' E Uq such that d(x, p') < f/3 and d(x, q') < f/3. Then
d(p',q') ~ d(p',x) + d(x,q') < (2/3)f < E. By the choice of f, it follows that p' is
h-related to q', and so p is h-related to q. Thus Hp = Hq and Up = Uq. Therefore the
only time that Up and Uq can intersect is for them to coincide.
We have shown that the Up form a cover of the set A by disjoint open sets. Since
each teeN) is connected, A is the intersection of the connected sets Ik(N), and A is
connected. (See Exercise 5.11.) Therefore, there is only one set Up and so only one
equivalence class Hp, H = Hp, and A = cl(Hp). By Proposition 5.2, it follows that I
is topologically transitive on A. 0
The next theorem shows that the closure of the periodic points can be split up
into a finite union of invariant sets that are topologically transitive if cl(Per(f» has a
hyperbolic structure. In some sense, Proposition 5.3 is a special case of this theorem
where there is one piece. The theorem also proves that each invariant set has a local
product structure which is defined next.
Definition. Let A be an hyperbolic invariant set for a diffeomorphism I. We say that
A has a local product structure provided there is an r > 0 such that for every p, q E A,
W;'(p) n W:(q) c A.
Let A be a hyperbolic invariant set. The continuity of the stable and unstable man-
ifolds for points of A implies that there is an ro > 0 such that for any 0 < r ~ ro
there is an f = f(r} > 0 for which W;'(p) n W:(q) is a single point for any p,q E A
with d(p, q) ~ E, i.e., this intersection is nonempty and a single point. Thus if A
is a hyperbolic invariant set with a local product structure and d(p, q) < E, then
W;'(p) n W:(q) = {z} c A. Using this fact, it can be shown that given pEA,
for small r > 0 a neighborhood ofp in A is homeomorphic to [W:,(p)nA) x [W:(p)nA)
by the map

h: [W;'(p) n A) x [W:(p) n A) - A
h(x,y) = W;'(x) n W:(y).

Thus locally the invariant set is homeomorphic to a product, hence the name local
product structure. Now we give the Spectral Decomposition Theorem.
Theorem 5.4 (Spectral Decomposition). Let M be compact and I : M - M be
a C 1 diffeomorphism.
(a) Assume that I has a hyperbolic structure on cl(Per(f». Then there are a finite
number of sets, AI, ... ,AN, each of which is the closure of disjoint h-classes, such that
cl(Per(f)) = Al u··· U AN. Thus each Aj is closed, invariant by I, the periodic points
are dense, and I is topologically transitive on A j • EUrther, each Aj has a local product
structure.
(b) If we assume that L(f) has a hyperbolic structure then M = Uj W'(A j } =
Uj WU(A j ) where WU(A j ) = {q: a(q) C Aj} and W'(A j } = {q: w(q) C Aj }.
386 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Definition. Assume that cI(Per(f» has a hyperbolic structure for!. Then by the
Spectral Decomposition Theorem cI(Per(f» = Al U '" U AN. The sets Ai are called
basic sets.
REMARK 5.1. By Theorem 4.1, if we assume either (i) A = R(f) has a hyperbolic
structure, (ii) A = L(f) has a hyperbolic structure, or (iii) A = n(J) and f satisfies
Axiom A, then I\. = c1(Per(J». Therefore both parts of the Spectral Decomposition
Theorem are true if we assume that ! satisfies any of the above three assumptions.
Since it is possible for n(f) to be hyperbolic and n(f) -F L(J) = cI(Per(f», if we state
the theorem in terms of the nonwandering set we need to assume Axiom A and not just
that n(f) has a hyperbolic structure.
REMARK 5.2. It is not hard to prove that each basic set Ai can be further split up
into pieces for which a power of ! is topologically mixing: Ai = U::!I Xi,i with (i) the
sets X i .i = cI(WU(p) n W"(p» for some periodic point p E Xi,i' (ii) the sets X i .i are
pairwise clL'ljoint, (iii) !(Xj •i ) = X j •HI for 1 ~ i < nj and !(X j •n ,) = X i . I , and (iv) the
n;-power of ! restricted to each Xi,i is topologically mixing, r j IXi,i is topologically

mixing for each j and i. See Bowen (1975b). (See Section 2.5 for the definition of
topologically mixing and the difference from topologically transitive.)
PROOF. As in the proof of Proposition 5.3, there is an f > 0 such that if p, q E H(f)
with d(p, q) < f then p is h-related to q. Therefore if cI(Hp) n c1(Hq) -F 0 then
Hp = Hq. That is, the closure of h-classes either agree or are diSjoint. Also c1(Hp) is
closed, invariant, and topologically transitive by Proposition 5.2.
As in the proof of Proposition 5.3, we can take sets Up :) Hp which are open in M
with Up contained within f/3 of Hp. These open sets are disjoint (or coincide) and cover
cI(Per(f». Since cI(Per(f» is compact, a finite number of these Up cover. Therefore
there are only a finite number of distinct classes Hp.

z'k

FIGURE 5.2. Local Product Structure: W:(Xk), W,!'(Xk) , W:(Yk), and


WU(Yk)

Finally, we prove the local product structure of each Ai = cI(Hpj ). Let X,Y E Ai'
and choose Xk, Yk E H pi n Per(f) so that Xk converges to x and Yk converges to y. By
the continuity ofthe local stable and unstable manifolds both W,!'(Xk) n W:(Yk) = {zA:}
and W,!'(Yk) n W:(XIt) = {zAJ are a single point at a transverse intersection. By the
Inclination Lemma, WU(Yk) accumulates on W,!'(Xk), and so WU(Yk) has a transverse
homoclinic intersection with W:(Yk) arbitrarily near the point Zk. See Figure 5.2. Each
of these transverse homoclinic intersections for a periodic point is in the closure of the
9.5 DECOMPOSITION OF HYPERBOLIC RECURRENT POINTS 387

periodic points, so Zk E cl(PerU)) = Al U ... U AN. Because the Ai are at a finite


distance apart, Zk E Ai if l is small enough. As Xk approaches x and Yk approaches y,
Zk approaches some Z E W;'(x) n W:(y). Since Ai is closed, it follows that Z E Aj as
was to be proved. This completes the proof of part (a).
For the proof of part (b) of Theorem 5.4, let p E M. Then w(p) c LU) = Al U··· U
AN. The Ai are disjoint and a bounded distance apart, so there exists an v > 0 such that
they are distance apart greater than v. We want to show that w(p) must be contained
in a single Ai' We assume the opposite and get a contradiction, i.e., we assume that
w(p) n Aj f= 0 and w(p) n Ak f= 0 for j -f= k. Let D( Ai, v /3) be the v /3 neighborhood
of Ai. Because the sets Ai are invariant, there are neighborhoods Ui of each Ai, such
that cl(U,) c D(A"v/3) and !(cl(U,» n D(A t ,v/3) = 0 for t f= i. With the above
assumptions on p, there is an increasing sequence of iterates n,
with fn .. (p) E Uj
and In .. +! (p) E Uk. Let m, be the largest integer m such that n2, < m < n2i+1
with Im-I(p) E cl(Uj). Then Im,(p) rt Uj by the choice of m,. On the other hand,
Im,-I(p) E cl(U]) so Im.(p) = 10 Im.-I(p) rt D(A t ,v/3) :::> Ut for t f= j. Therefore
1m , (p) E M\Ut Ut • By taking a subsequence ofthe mi, we get that Im,(p) accumulates
on a point q that is not in any of the At. We have a contradiction because q is in w(p).
Thus we have shown that w(p) is contained in a single Aj . Therefore p E W·(A j ).
The proof that Q(p) C Ak for some k is similar, so p E W"(Ak). This completes the
proof of the theorem. 0
Corollary 5.5. Ifnu) has a hyperbolic structure, then I has a ilrute number of chain
components.
PROOF. The the Spectral Decomposition Theorem implies that nu) = Al U· .. UAN is
a finite decomposition. Since I is topologically transitive on each of the Aj • each of these
Aj is a chain component. Therefore. there are a finite number of chain components. 0
The following proposition shows that the stable manifold of a single periodic orbit is
dense in the stable manifold of the whole invariant set. This fact clarifies the meaning
of a cycle as defined below.
Proposition 5.6. Assume A is a basic set for I. (We could assume that A is an
isolated invariant set with PerU) dense in A and all the period points of A h-related.)
Let pEA n PerU). Then W'(O(p» is dense in W·(A). Similarly, WU(O(p» is dense
in WU(A).
REMARK 5.3. If I is topologically mixing on a basic set and pEA n PerU), then the
stable manifold ofthe single point p, W'(p), is dense in W'(A) and not just WU(O(p».
If I is not topologically mixing, then W'(p) is not dense in W'(A) and it is necessary
to take the stable manifold of the orbit of p.
PROOF. Let pEA n PerU) be as in the statement of the theorem have period n. Let
q be any other periodic point in A with period m. By the assumptions q is h-related
to p. Thus there is a PI E O(p) such that W·(ptl has a transverse intersection with
W"(q). Using the Inclination Lemma applied to Imn, W·(ptl accumulates on the local
stable manifold of q, and so by iteration of I mn it accumulates on all of W·(q). Thus
W'(O(p» accumulates on the stable manifold of any periodic point.
Because the periodic points are dense in A, it follows from the continuous dependence
of the stable manifolds on the point that U{W'(q) : q E A n PerU)} is dense in
U{W'(q) : q E A} = W·(A). Since W'(O(p» is dense in U{W'(q) : q E AnPerU)}.
it follows that W'(O(p» is dense in W·(A). This completes the proof of the theorem
for stable manifolds. The proof for unstable manifolds is similar. 0
388 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Definition. Assume that c1(Per(f», L(f), or R(f) has a hyperbolic structure for f.
(If we assume that n(f) has a hyperbolic structure, then we also have to assume that
. c1(Per(f» = n(f).) Let Al u··· U AN be the basic sets given by the Spectral Decom-
position Theorem, and define WU(A j ) = WU(Aj ) \ Aj . Define a partial ordering on the
basic sets by declaring Aj « Ak if WU(Aj) n W'(Ak) --F 0. A k-cycle is a sequence
of basic sets Aj " ... ,Aj • with Ail « Ai> « ... « Ai. « Aj ,. Thus a I-cycle
is a basic set Aj with WU(A j ) n W'(A j ) --F 0, i.e., the stable and unstable manifolds
intersect off A}.
REMARK 5.4. If L(f) (or n(f) or R(f» is hyperbolic then a cycle contains some
non-transverse intersections because otherwise all the periodic points are h-related and
intersections are all in L(f). The example given above with L(f) --F n(f) is an example
for which the limit set is hyperbolic with a 3-cycle, but the nonwandering set is not
hyperbolic, i.e., the points of non-transverse intersection are nonwandering but not
limit points.
Definition. Assume that LU) or R(f) has a hyperbolic structure. (Alternatively,
assume that n(f) has a hyperbolic structure and c1(Per(f» = n(f).) Then / is said to
have no cycles or satisfy the no cycle property provided there are no cycles among the
basic sets formed from L(f) (or n or R(f».

Theorem 5.1. If R(f) is hyperbolic then f satisfies the no cycle property.


PROOF. Assume {AjJ~=1 is a k-cycle for /. It is easy to check that all the points ofthe
intersections ~VU(Aj;) n W'(Aj;+I) are in R(f). (See Exercise 9.2 dealing with periodic
points. Also compare with Exercise 9.18.) Thus these points are not outside the basic
sets, so there is no cycle but merely one larger basic set. 0
REMARK 5.5. If we assume that L(f) or nu) has a hyperbolic structure, then the fact
that / has no cycles among the basic sets is an additional assumption.

9.6 Markov Partitions for a Hyperbolic Invariant Set


Throughout this section, f : M -+ M is a C l diffeomorphism on a manifold M, A is
a compact, isolated, hyperbolic invariant set for f with a local product structure. The
set A could be a basic set from the spectral decomposition but this is not necessary. We
take an adapted norm on tangent vectors at points of A, and d a compatible distance on
M. With this distance, two points in a stable manifold get closer together under forward
iteration and two points in an unstable manifold get closer together under backward
iteration. Therefore /-I(W:(f(p») :> W:(p) and f(W:'(f-l(p») :> W.U(p). Since A
has a local product structure, there are f 2: 0' > 0 such that W:(p) n W:'(q) a single
point in A whenever p, q E A satisfy d(p, q) ~ 0'. However by taking f 2: 0' > 0 smaller,
we can make sure that

whenever p, q E A satisfy d(p, q) ~ 0', and so the intersection of the manifolds on the
left hand side is also a single point. Note that the the two sets which are intersected on
the left hand side of the equality contain the respective sets on the right hand side. We
fix f and 0' with these properties.
For this general situation, we need to define a rectangle and then the stable and
unstable manifolds in a rectangle. Note that throughout this section the interiors of all
subsets of A are taken as subsets of A and not as a subset of the ambient manifold M;
thus int(R) means the interior of R in A.
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET 389

Definition. A set RcA is a rectangle provided


(i) it has diameter less than 0, where 0 is as above, and
(ii) p,q E R implies that W:(p) n W;'(q) E R.
The rectangle is call proper if in addition
(iii) R = cl(int(R)) 80 that it is closed.
If R is a rectangle then for pER let

W'(p, R) = W:(p) n R and


W"(p, R) = W.U(p) n R.

Note the comparison of the definition of W'(p, R) with what is used for hyperbolic toral
automorphisms in Chapter VII.
With theRe definitions we can define a Markov partition as before.
Definition. Assume that A has a local product structure for J as above. A Markov
partition oj A for J is a finite collection of proper rectangles, 'R = {Rj lj"-l' that satisfy
the following four conditions:
(i) 11.= Uj:l R j ,
(ii) if i 1: j then inteR;) n int(Rj ) = 0, (so inteR;) n R j = 0),
(iii) ifz E inteR;) nJ-l(int(Rj )) then

J(WU(z, R;)) :) WU(J(z), Rj ) and


J(W'(z, R;)) C W'(J(z), Rj ),

and
(iv) if z E inteR;) n J-l(int(Rj )) then

int(Rj ) n J(WU(z, R;) n int(R;)) = WU(J(z), Rj ) n int(R j ) and


inteR;) n rl(WB(Rj) n int(Rj )) = W'(z, R;) n int(R;).

REMARK 6.1. In the context of the toral Anosov automorphisms, condition (iv) is
needed to insure that the image of a rectangle only crosses another rectangle once; this
fact enables large rectangles to be used and still to be able to get a single orbit which
passes through the prescribed sequence of rectangles, i.e., to get symbolic dynamics. In
the present context, the rectangles found are small. Their small size is used to show
that condition (iii) implies condition (iv). (Note the added condition on the size of the
local stable manifolds and iterates of the map which we imposed above.) Because the
small rectangles automatically satisfy condition (iv), Bowen and other authors do not
add a condition like this one to the de~nition of a Markov partition.
We can now state the main result of this section which is due to Bowen (1970a). We
follow the treatment of Bowen (1975).
Theorem 6.1. Let A be a hyperbolic invariant set with a local product structure for
a diffeomorphism J. Then there exists a Markov partition of A for I with rectangles
arbitrarily small (diameter less than 0).
REMARK 6.2. Once there is a Markov partition, then it is possible to define a subshift
of finite type EA as for a toral Anosov automorphism and a semi-oonjugacy h : EA -+ A
that is finite to one. By Theorem VIII.l.S, the entropy of J/A, h(JIA). is equal to
390 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

the entropy of C1 A, h(ul1: A). By Theorem VIII.1.9, h(ul1: A ) = log(~d where ~l is the
largest eigenvalue of A.
PROOF. Let E > 0 and a> 0 be small as above. Let {3 > 0 be such that any {3-chain can
be (a/2)-shadowed in A because A is isolated. Next take -y > 0 with -y '5: min{{3/2, a/2},
so that if d(x, y) < -y then d(f(x), f(y» < (3/2 and d(f-l(x), rl(y» < {3/2.
Let P = {Pl, ... , Pr} be a finite set of -y-dense points in A. Let B = (b'j) be the
transition matrix with
b _
'J -
{I
d(f(Pi),Pj) < (3
0 d(f(Pi),Pj) ~ {3.
Because of the choice of 'Y, for each i, there is at least one j such that bij = 1. Let 1:8
be the two sided subshift of finite type determined by B. Then the cylinder sets

Cj = {s E 1:8 : So = j}
form a Markov partition of 1:8. Remember that for s E 1: 8 ,

WI~(S,C18) = {tE1:8 : ti = Si for i ~ O} and


WI~c(S,C18) = {tE1:8 :ti = Si fori '5: O}.
If s, s' E 1:8 with So = s6, then
~~(S,C18) n WI~(S',U8) = {SO}
with s· E 1:8 where
S! = {Si for i ~ 0 and
, s; for i '5: o.
We use this partition for 1:8 to construct a partition for A. We define a map

8: 1:8 -+ A

which we use to take a rectangle in 1:8 to a rectangle in A. For each s E 1:8, let 8(8)
be the point z E A which (a/2)-shadows the ,B-chain {P'; }~-oo. The main properties
of 8 are contained in the following lemma.
Lemma 6.2. The map 8: 1:8 -+ A is continuous and onto. Further, ifs,s' E 1:8 have
80 (are in the same cylinder set of the partition of 1:8 ), then
So =

d(8(s), 8(8') < a


8(WI~(S,C18» c W:(8(8),f),
8(~~(8',C18» c W:(8(s'),f), and
8(WI~(s, (8) n WI~(S',C18» = W:(8(s), f) n W:(8(s'), f).

PROOF. To show 8 is onto, let Z E A. For each j let P'; be chosen within -y of Ji(z).
Then

d(f(p';)'P'j+,) '5: d(f(p.;),fi+l(z» +d(fj+l(Z),V.,+,)


'5: {3/2 + 'Y
'5: {3,
9.6 MARKOV PARI'ITIONS FOR A HYPERBOLIC INVARIANT SET 391

so {p'j} is a /1-chain. The orbit of z 1'-shadows so it (O'/2)-shadows this /1-chain; thus


lI( s) = z and II is onto.
If two sequences s,s' E E8 have So = so'
then lI(s) and lI(s') are both within 0'/2 of
P' o so are within a of each other.
Note that two symbol sequences s and a' are close if Sj = sj for -n :s j :s n for some
large n. Thus. and .' correspond to two /1-chalns which agree for a large number of
points. By an argument like we have given in earlier sections, lI(s) and lI(s') are nearby
points. This shows that II is continuous.
For the properties about stable manifolds stated in the lemma, take a" E WI~(S,U8).
Then ill = Sj for j ~ O. Then both 8(a) and 8(a") (O'/2)-shadow the same forward /1-
chain, so
d(fj 0 lI(a"), Ij o8(s» :s a :s f
for j ~ O. It follows that

lI(s') E W!(lI(s),J) c W.'(lI(s),J) or


8(WI~c(S,0'8» C W:(8(s),f).

The proof that


lI(WI~(S,0'8» c W:(lI(s),J)
is similar using j :s o.
Finally, assume let s, s' E E8 with So = so. Then

where
S~ = {Sj for j ~ 0 and
• sj for j :s O.
By above

8(s") E W:(8(s), f) n W:(8(s'), f) = W:(lI(s), I) n W,"(8(s'), J), or


8(WI~c(s, 0'8) n WI~c(S', 0'8» = W:(8(s), f) n W." (lI(s'), f)
as claimed. o
We check in Lemma 6.3 below that the sets

Tj = 8(Cj ) = {8(s) : s E E8 and So = j}

are rectangles in A for 1 :s j :s r. Since 8 is continuous, each of the Tj is closed. We do


not know that these rectangles are proper. Also, the collection of these rectangles might
not have disjoint interiors. But, they d,· ..[ 11 cover of A, and Lemma 6.3 also checks
the first condition on the stable and unstable manifolds in the definition of a Markov
partition.
Lemma 6.3. The collection of sets {Tj : 1 :s j :s r} satisfy the following conditions.
(a) Each Tj is a rectangle in A.
(b) The {Tj } cover A, A = U;_I Tj •
(c) I£x = 8(s) for a E E8, then

I(W'(x,T,o» c W·(f(x),T,.) and


I(W"(x, T. o »:::> W"(f(x), T•• ).
388 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Definition. Assume that cl(Per(f)), L(f), or n(f) has a hyperbolic structure for f.
(If we assume that n(f) has a hyperbolic structure, then we also have to assume that
cl(Per(f» = n(f).) Let Al u··· U AN be the basic sets given by the Spectral Decom-
position Theorem, and define WU(A j ) = WU(A j ) \ Aj . Define a partial ordering on the
basic sets by declaring Aj « Ale if W"(A j ) n W'(AIe) =I 0. A k-cyc/e is a sequence
of basic sets AJt , ... ,Aj • with AJt « A;. « ... « Aj • « Ail' Thus a I-cycle
is a basic set Aj with W"(Aj) n ~V'(Aj) =I 0, i.e., the stable and unstable manifolds
intersect off AJ .
REMARK 5.4. If L(f) (or n(f) or n(f)) is hyperbolic then a cycle contains some
non-transverse intersections because otherwise all the periodic points are h-related and
intersections are all in L(f). The example given above with L(f) =I n(f) is an example
for which the limit set is hyperbolic with a 3-cycle, but the nonwandering set is not
hyperbolic, i.e., the points of non-transverse intersection are nonwandering but not
limit points.
Definition. Assume that L(f) or n(f) has a hyperbolic structure. (Alternatively,
assume that nUl has a hyperbolic structure and cl(Per(f) = nu).) Then f is said to
have no cycles or satisfy the no cycle property provided there are no cycles among the
basic sets formed from L(f) (or n or n(f».

Theorem 5.7. If n(f) is hyperbolic then f satisfies the no cycle property.


PROOF. Assume {AjJ~=1 is a k-cycle for f. It is easy to check that all the points of the
intersections ~VU(Aj;) n W'(Aj;+I) are in n(f). (See Exercise 9.2 dealing with periodic
points. Also compare with Exercise 9.18.) Thus these points are not outside the basic
sets, so there is no cycle but merely one larger basic set. 0
REMARK 5.5. If we assume that L(f) or n(f) has a hyperbolic structure, then the fact
that f has no cycles among the basic sets is an additional assumption.

9.6 Markov Partitions for a Hyperbolic Invariant Set


Throughout this section, f : M -+ M is a C l diffeomorphism on a manifold M, A is
a compact, isolated, hyperbolic invariant set for f with a local product structure. The
set A could be a basic set from the spectral decomposition but this is not necessary. We
take an adapted norm on tangent vectors at points of A, and d a compatible distance on
/If. With this distance, two points in a stable manifold get closer together under forward
iteration and two points in an unstable manifold get closer together under backward
iteration. Therefore rl(W:U(p») :) W:(p) and f(W.u(f-I(p») :) W."(p). Since A
has a local product structure, there are t ~ 0 > 0 such that W:(p) n W."(q) a single
point in A whenever p, q E A satisfy d(p, q) ~ o. However by taking t ~ 0 > 0 smaller,
we can make sure that

whenever p, q E A satisfy d(p, q) ~ 0, and so the intersection of the manifolds on the


left hand side is also a single point. Note that the the two sets which are intersected on
the left hand side of the equality contain the respective sets on the right hand side. We
fix t and 0 with these properties.
For this general situation, we need to define a rectangle and then the stable and
unstable manifolds in a rectangle. Note that throughout this section the interiors of all
subsets of A are taken as subsets of A and not as a subset of the ambient manifold M;
thus int(R) means the interior of R in A.
9.6 MARKOV PAIUITIONS FOR A HYPERBOLIC INVARIANT SET 389

Definition. A set RCA is a rectangle provided


(i) it has diameter less than 0, where 0 is as above, and
(H) p, q E R implies that W:(p) n W;'(q) E R.
The rectangle is call proper if in addition
(iii) R = cl(int(R» so that it is closed.
If R is a rectangle then for pER let

W'(p,R) = W:(p)nR and


W"(p, R) = w:'(p) n R.

Note the comparison of the definition of W· (p, R) with what is used for hyperbolic toral
automorphisms in Chapter VII.
With these definitions we can define a Markov partition as before.
Definition. Assume that A has a local product structure for 1 as above. A Markov
partition 01 A for 1 is a finite collection of proper rectangles, 'R = {Rj }j'=l' that satisfy
the following four conditions:
(i) A = U7'=l R j ,
(H) if i # j then int(~) n int(R j ) = 0, (so int(~) n Rj = 0),
(iii) if z E inteR;) n ,-l(int(Rj )) then

J(WU(z, ~» :::> WU(f(z), Ri ) and


I(W'(z,~)) C W'(f(z), Rj ),

and
(iv) if z E inteR;) n 1-1(int(Rj » then

int(Rj ) n J(WU(z,~) n int(~» = WU(f(z), R j ) n int(Rj ) and


int(~) n r1(W'(R j ) n int(Rj )) = W'(z,~) n int(~).

REMARK 6.1. In the context of the toral Anosov automorphisms, condition (iv) is
needed to insure that the image of a rectangle only crosses another rectangle once; this
fact enables large rectangles to be used and still to be able to get a single orbit which
passes through the prescribed sequence of rectangles, i.e., to get symbolic dynamics. In
the present context, the rectangles found are small. Their small size is used to show
that condition (iii) implies condition (iv). (Note the added condition on the size of the
local stable manifolds and iterates of the map which we imposed above.) Because the
small rectangles automatically satisfy condition (iv), Bowen and other authors do not
add a condition like this one to the de~nition of a Markov partition.
We can now state the main result of this section which is due to Bowen (1970a). We
follow the treatment of Bowen (1975).
Theorem 6.1. Let A be a hyperbolic invariant set with a local product structure for
a diffeomorphism J. Then there exists a Markov partition of A for f with rectangles
arbitrarily small (diameter less than 0).
REMARK 6.2. Once there is a Markov partition, then it is possible to define a sub6hift
of finite type E" as for a toral Anosov automorphism and a semi-conjugacy h : E" - A
that is finite to ODe. By Theorem VIII.l.8, the entropy of !lA. h(!lA), is equal to
390 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

the entropy of OA. h(O'IEA)' By Theorem VIII.1.9. h(O'IEA) = 10g(At} where Ai is the
largest eigenvalue of A.
PROOF. Let f > 0 and a > 0 be small as above. Let {3 > 0 be such that any {3-chain can
be (a/2)-shadowed in A because A is isolated. Next take "{ > 0 with "{ $ min{{3/2. a/2}.
so that if d(x.y) < "( then d(f(x),j(y» < (3/2 and d(f-i (X),j-i (y)) < {3/2.
Let l' = {Plo ...• Pr} be a finite set of "{-dense points in A. Let B = (bij) be the
transition matrix with
ij =b {I d(f(Pi), Pi) < (3
o d(f(Pi), Pi) ~ {3.
Because of the choice of "{. for each i, there is at least one j such that bii = 1. Let EB
be the two sided subshift of finite type determined by B. Then the cylinder sets

Cj = {s E EB : So = j}
form a Markov partition of EB. Remember that for S E EB,

WI~(S'O'B) = {t E EB : ti = Si for i ~ O} and


WI~(S,O'B) = {t E EB : ti = Si for i $ OJ.

If s.s' E EB with So = so, then

with s· E EB where
S~ = {Si for i ~ 0 and
1 S; for i $ O.
We use this partition for EB to construct a partition for A. We define a map

8: EB -+ A

which we use to take a rectangle in EB to a rectangle in A. For each s E EB, let 8(s)
be the point Z E A which (a/2)-shadows the {3-chain {P' j }~-oo. The main properties
of 8 are contained in the following lemma.
Lemma 6.2. The map 8 : EB -+ A is cO'ntinuO'us and O'ntO'. Further, if s, s' E EB have
So = So (are in the same cylinder set O'f the partitiO'n O'fEB), then

d(8(s),8(s') < a
8(WI~(s. O'B» C W~(8(s), I),
8(WI~(S"O'B)) c W:(8(s'),I), and
8(WI~(S'O'B) n WI~(S'.O'B» = W:(8(s), I) n w;'(8(s'). I).

PROOF. To show 8 is onto, let Z E A. For each j let P'j be chosen within "( of /i(z).
Then

d(f(p'j)' P'HI) $ d(f(p.j ), /i+i(z» + d(fi+1 (:1:). P.,+,)


$ {3/2+,,{
$ {3,
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET 391

so {p,,} is a .a-chain. The orbit of z -y-shadows so it (0/2)-shadows this .a-chain; thus


9(s) = z and 9 is onto.
If two sequences s, s' E EB have So = so'
then 9(s) and 9(s') are both within 0/2 of
Pso so are within or of each other.
Note that two symbol sequences s and s' are close if Sj = sj for -n ~ j ~ n for some
large n. Thus s and s' correspond to two .6-chains which agree for a large number of
points. By an argument like we have given in earlier sections, 9(s) and 9(s') are nearby
points. This shows that 9 is continuous.
For the properties about stable manifolds stated in the lemma, take s· E W,!.c(S,O"B).
Then s;= Sj for j 2: o. Then both 9(s) and 9(s·) (or/2)-shadow the same forward .6-
chain, so
d(fj 0 9(s·), Ij 0 9(s» ~ 0 ~ f
for j 2: O. It follows that

9(s·) E W~(9(s), f) c W.'(9(s), I) or


9(W,!.c(s, UB» C W~(9(s), f).

The proof that


9(W,~(S,UB» c W~(9(s),f)

is similar using j ~ o.
Finally, assume let s,s' E EB with So = so. Then

where
S! = {Sj for j 2: 0 and
• sj for j ~ o.
By above

9(s·) E W~(9(s), f) n W~(9(S/), f) = W:(9(s), f) n W.U(9(s'), f), or


9(W,~c(s, UB) n ~~(s', UB» = W:(9(s), f) n W;'(9(s'), f)
as claimed. o
We check in Lemma 6.3 below that the sets

Tj = e(Cj ) = {e(s) : s E EB and So = j}


are rectangles in A for 1 ~ j ~ T. Since 9 is continuous, each of the T j is closed. We do
not know that these rectangles are proper. Also, the collection of these rectangles might
not have disjoint interiors. But, they J... .., 11 cover of A, and Lemma 6.3 also checks
the first condition on the stable and unstable manifolds in the definition of a Markov
partition.
Lemma 6.3. The collection of sets {Tj : 1 ~ j ~ T} satisfy the following conditioDS.
(a) Each Tj is a rectangle in A.
(b) The {Tj } cover A, A = U;~l Tj •
(c) Ifx = 9(s) for s E EB, then

I(W'(x, T. o »c W'(f(x), T•• ) and


f(WU(x,T. o »::> WU(f(x),T•• ).
392 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

PROOF. (a) The diameter ofTj is less than 0, because any two points in a Tj are within
0/2 of the same point Pj. Thus condition (i) in the definition of a rectangle is true. Let
x, y E Tj with 8(8) = x and 8(8') = y. Then 8 and 8' are in the same cylinder set Cj ,

8' = Wk,.,(8,O'B) n WI~(S"O'B) E Cj , and


W:(8(s), I) n W.U(8(s'), f) = 8(s*) E Tj •
Thus condition (ii) in the definition of a rectangle is true, and the Tj are rectangles.
(b) Since 8 is onto, the {Tj } cover A.
(c) If y E WO(x, Too) c W:(x, I), then y = W:(x, I) n W.U(y, I). On the other
hand, if y = 8(8') define

8' = Wk..,(8,O'B) n WI~(8"O'B)'


• {Sj for j ~ 0 and
Sj = sj for j :S O.

By Lemma 6.2,
8(s') = W:(8(s), I) n W:'(8(8') , I),
so 8(s') = y. Also (i) 0'8(S')j = 0'8(S)j = Sj+l for j ~ 0 and 8(0'8(S» = f(x) so
8(0'8(S'» E W:(f(x), I), and (ii) 0'8(S')j = 0'8(8')j = Sj+l for j :S 0 and 8(0'8(S'» =
f(y) so 8(0'8(S'» E W.u(f(y),f):

8(0'8(S'» E W:(f(x), I) n W:'(f(y), I) = {J(y)},


8(0'8(S'» = f(y).

But 8(O'B(S'» E To .. so f(y) E W'(f(x), To.), f(W'(x, Too» c W'(f(x), To.), and we
are done.
A similar argument proves the last inclusion,

o
To get the other two properties of the Markov partition, we need to subdivide the Tj's.
The subdivision uses the different parts of the boundary of the rectangles. Therefore,
before making the refinements of the covering, we characterize the different parts of the
boundary of a rectangle in the following lemma.
Lemma 6.4. Let R be a closed rectangle of A. The boundary of R (relative to A) can
be written as the union of two subsets, 8(R) = 8"(R) u 8 U (R), where

8"(R) = {x E R : x rt int(WU(x, R))} and


aU(R) = {x E R : x rt int(WO(x,R))}.
(The interiors of W·(x. R) and WU(x. R) in the above definitions of 8"(R) and 8 U(R)
are as subsets of W:(x) n A and W.U(x) n A respectively.)

REMARK 6.3. The notation is chosen so that 8"(R) is made up of pieces of stable
manifolds and 8 U (R) is made up of pieces of unstable manifolds. See Figure 6.1.
PROOF. It is sufficient to prove that int(R) is equal to the set R \ (8"(R) U OU(R».
9.6 MARKOV PARrITIONS FOR A HYPERBOLIC INVARIANT SET 393

lwU(x)

WS(y)
. ...•......... ........ y

FIGURE 6.l. The Points x E ~(R) and Y E 8"(R)

If x E int(R) then W"(x, R) ,." R n W:(x) is a neighborhood of x in W:(x) n A for


U = 8, u. This proves that int(R) C R \ (~(R) U 8"(R».
Conversely, assume

x E R \ (~(R) U 8"(R)) or
x E int(W'(x, R» n int(W"(x, R».
For yEA near to x,

y' == W:(x) n W."(y) E int(W'(x, R» and


y" == W:(y) n W;'(x) E int(W"(x, R»

since the intersections depend continuously on y. See Figure 6.2. Thus

y = W:(y") n W."(y')

for y' E int(W'(x, R» and y" E int(W"(x, R». The set of such points forms a neigh-
borhood of x in A, so x E int(R). 0

·
• U •
....:y.......... :.....
~
y .
~

·:X ·.·
: s
.... ~ •.•.....•• :)(....
·· .
·
FIGURE 6.2. Determination of Points y' and y"

For x E A, we associate elements of the cover T = {T1' ... , Tr } of A by letting

T(x) = {Tj E T : x E 1j} and


T"(x) = {TAo E T : TAo n Tj "# 0 for some 1j E T(x)}.

Since T is a cover of A,
r
Z = A\ U 8(T;)
j .. 1
394 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

is open and dense in A. We also need to remove the extensions of the a'(Tj ) by stable
manifolds and the extensions of the au Cr,) by unstable manifolds, so we define the set

Z' = {x E A : W:(x) n 8'(T,,) = 0 and


W.U(x) naU(T,,) = 0 for all T" E T·(x)}.

This set is also open and dense by an argument like Lemma 6.4.
Now we fix Tj E T and subdivide it for T" with T j n T" "i 0 as follows:

T}." = T j nT"
= {x E Tj : WU(x, Tj ) n T" "i 0 and W'(x, T j ) n T" "i 0}
Tl~" = {x E Tj : WU(x, Tj ) n T" "i 0 and W'(x, Tj ) n T" = 0}
Tl." = {x E Tj : WU(x, Tj ) n T" = 0 and W'(x, Tj ) n T" "i 0}
Tf." = {x E Tj : WU(x, Tj ) n T" = 0 and W'(x, T1 ) n T" = 0}.

See Figure 6.3. With these definitions, T],j = Tj and TL = 1j~j = TL = 0. Each
of the I;,,, is a rectangle as follows from the following observation: if x, y E T j and
z = U"(x, T1 ) n WU(y, Tj ), then W'(z, Tj ) = W'(x,Tj ) and WU(z,Tj) = WU(y, Tj ).

T.k
j.
4

- --3 --
T j.'k
I
I 2
: T'j. k
I

T.j.lk
- ~
W
W

S
u

Tk

FIGURE 6.3. Subdivision of Rectangles

For x E Z', we take intersections of the T}:" to define a rectangle at x,

Since there are only a finite number of sets involved, R(x) is open, and nonempty since
x E R(x). Since each of the I;,,, is a rectangle, each int(Tj~") is a rectangle, so R(x) is
a rectangle.
A finite collection of the c1(R(x)), R, form the Markov partition claimed in the theo-
rem. The following lemma proves properties of the R(x)'s used to verify the conditions
of a Markov partition in the sequence of lemmas which follow. Notice that R is a
refinement of the cover T.
Lemma 6.5. (a) For x, y E Z', either R(x) = R(y) or R(x) n R(y) = 0.
(b) For x E Z', a(R(x)) n Z· = 0.
(c) For x E Z', int(c1(R(x))) = R(x), i.e., the rectangles c1(R(x)) are proper.
PROOF. AssumethatR(x)nR(y) "i 0forx,y E Z·. Thereisaz E R(x)nR(y)nZ*. By
the definition of R(x), if T j E T(x) then z E 1~I.J = 1j, so Tj E T(z) and T(z) ::) T(x).
On the other hand, if Tj r1. T(x), then Tj n R(x) = 0, so z ~ Tj and T(z) C T(x).
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET 395

Combining, we get T(z) = T(x). Similarly, T(y) = T(z) so T(y) = T(x). Thus, if
R(x) n R(y) 1= 0 then R(x) = R(y). This proves part (a).
Assume y E o(R(x» n Z·. Since R(y) is a neighborhood of y in A and y E o(R(x»,
it must be that R(y) n R(x) = 0 or else R(x) = R(y) and y is not a boundary point.
But also this neighborhood R(y) must not intersect R(x) so y can not be a boundary
point. This is a contradiction and shows that o(R(x» n Z· = 0. This proves part (b).
Since Z· is dense in A and o(R(x»nZ· = 0, int(o(R(x») = 0, and so int(cl(R(x))) =
R(x). 0
There are only a finite number of different R(x) because there are only finitely many
Tj and T?k' Let
n = {cl(R(x» : x E Z·} = {R), ... ,R",}
be an enumeration of this finite collection of rectangles. Lemma 6.5(c) proves that the
rectangles are proper. Since the collection of the interiors,

{int(Rd, ... , int(R",)},

cover Z· which is dense in A, the collection 'R satisfies the first condition for a Markov
m
partition, UR j = A. By Lemma 6.5(a,c), condition (ii) for a Markov partition is
j=1
true: int(Rj ) n inteR,,) = 0 for j 1= k. We are only left to show that the stable and
unstable manifolds relative to the rectangles behave properly, conditions (iii) and (iv)
for a Markov partition. These conditions follow from the lemmas which follow.
Lemma 6.6. The rectangles in 'R satisfy condition (iii) in the definition of a Markov
Partition: if x E int(~) n 1- 1(int(R j then»,
I(W'(x,~» c W"(f(x),R j ) and
I(WU(x,~» :;) WU(f(x), Rj ).

We prove only the inclusion for the stable manifolds, and remark how the case for
the unstable manifolds follows. The first step in the proof of this lemma is the following
sublemma.
Sublemma 6.1. Assume x, y E Z· n 1- 1 (ZO), R(x) = R(y), and y E W:(x). Then,
R(f(x» = R(f(y».
PROOF. The first step is to show that T(f(x» = T(f(y». Assume that I(x) E T j •
Then there is a s E E8 with x = 6(8) and 81 = j. Let s' be the point in E8 with
8i1 = 80 and y = 6(s'). If we let

. = {8;,
Si
8;
for i ~ 0
for i $ 0,
and

then y = 6(s·) as we argued before. Since y = 8(sO) and 8j = j, I(y) E 1';. Thus
T(f(x» c T(f(y». The other inclusion is proved by reversing the roles of x and y, 80
T(f(x») = T(f(y».
Now assume I(x), f(y) E Tj and Tj n T" 1= 0. We need to show that I(x) and I(y)
are in the same 1'?". Note that W"(f(x), Tj ) = W"(f(y),1';), so both these stable
manifolds either intersect T" or neither intersects T". If I(x) and f(y) are not in the
396 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

,, :
tw
-
u
f(y)
.............•. :.......
, ~f(x)
, WS
-- --- If(z)

Tj Tk

FIGURE 6.4

same T;\, then one ofthe unstable manifolds WU(f(x), 1j) and WU(f(y), 1j) intersects
Tk and the other does not. Assume that

J(z) Ewu(f(X), T j ) n Tk i: 0 and


W U(f(y),1j) n Tk = 0.

See Figure 6.4. Let S E E8 be such that J(x) = 80 o'(s) with 81 = j. In Lemma 6.3
we proved that it follows that WU(f(x), T j ) c f(WU(x, Too)), so J(z) E f(WU(x, Too))
and z E WU(x,T.o)'
Let s· E E8 be such that z = 8(s*) with 8i = k. Thus T. o E T(x) = T(y) and
z E Too n T.; i: 0. Since z E WU(x, T. o) n T.; and R(x) = R(y) (so they are in the
same r.:,.oo)' the intersection WU(y, T. o ) n T.; i: 0. In fact, there is a point

z' = W:(z) n W.U(y)


= W'(z, Tao) n WU(y, Too)'

See Figure 6.5. Therefore

f(z') = W:(f(z)) n W.u(f(y))


= W°(f(z), Tk) n WU(f(y),1j).

This contradicts the fact that WU(f(y), T j ) n Tk = 0. Therefore f(x) and J(y) are in
the same 1}:k' This is true for an arbitrary Tk, so R(f(x)) = R(f(y)). 0

,:
·:Y
:
-rzi - - +::-z-ir----,
···t·········
,I:: X
.... ·t- ................ ,.~ ....... .
,:
..... .
-WS

·· ..
.~

FIGURE 6.5
9.6 MARKOV PARTITIONS FOR A HYPERBOLIC INVARIANT SET 397

PROOF OF LEMMA 6.6. Let x E Z· n r'(Z·). The set

W'(x, R(x» n Z· n r'(Z·)

is open and dense in' W"(x, cl(R(x» by an argument like Lemma 6.4. By Lemma 6.7
and continuity,

f(W'(x,cl(R(x» C cl(R(f(x») n W:(f(x»


C W'(f(x), cl(R(f(x))).

If int(R;)nrl(int(Rj» '" 0, then this open subset of A contains an x E ZOnf-'(ZO)


with R; = cl(R(x» and R j = cl(R(f(x))). Therefore for such x,

f(W"(x,R;» C W·(f(x),R j ).

For any y E inteR;) n f-'(int(Rj »,


f(W'(y, R;» = f{W:(y) n W.U(z) : z E W'(x, R i }}
= {/(W:(y» n f(W.U(z» : z E W'(x, R;)}
C {/(W:(y» n W:'(z') : z' E W"(f(x), R j }}
C W'(f(y), R j ).

This proves the lemma for the stable manifolds.


The result for the unstable manifolds follows by applying the above argument to
(or modifying it for the unstable manifolds directly).
,-I.
0
Lemma 6.S. The rectangles in R satisfy a strong version of condition (iv) in the
»
definition of a Markov Partition: i(x E int(R;) n r'(int(Rj then

W'(x, R i ) = R; n r'(W"(f(x), R j » and


W"(f(x), R j ) = f(WU(x, R;» n R j •

PROOF. By Lemma 6.6,

W'(x,R;) c R; n r1(W'(f(x),Rj»
C R; n r1(W:(f(x))) = W'(x, R;).

The last equality follows by the choices of f for the size of the stable and unstable
manifolds made early in the section. Therefore

W"(x, R;) = R; n r1(W"(f(x), R j »

as claimed. The argument for the unstable manifolds is similar. This proves the lemma
and completes the proof of the theorem. 0
398 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

9.7 Local Stability and Stability of Anosov


Diffeomorphisms
In this section we prove the semi-stability and structural stability of Anosov dlffeo-
morphisms and also the stability of a single basic set. The proofs given are in the spirit
of that of Anosov, and repeat the construction of stable and unstable manifolds. We
start with an Anosov diffeomorphism. Throughout the section M is a compact manifold
(or at least the invariant set A is a compact invariant set).
Definition. Consider the set of homeomorphisms on a compact manifold M. If we
lise the usual CO-sup topology on functions, then the set of homeomorphisms are not
open and so not a complete space. Therefore to study perturbations 9 of a homeomor-
phism I which we want to remain a homeomorphism, we require that both do(f,g) and
do(f-I,g-I) to be small, i.e., we use the metric
dhomeo(f,g) = max{do(f,g),do(rl,g-In
= sup{d(f(x),g(x)),d(rl(x),g-I(x)) : x EM}.
The space of homeomorphisms is complete in terms of the metric dhomeo.
Let I : M -+ M be a homeomorphism (or diffeomorphism). We say that I is semi-
stable provided there exists a £ > 0 such that for every homeomorphism 9 : M -+ M
with the CO distance from I to 9 and from I-I to g-I less than £, dhomeo(f,g) < £,
there exists a h : M -+ M which is a semi-conjugacy from 9 to I, with h continuous,
onto, and hog = I 0 h. These conditions imply that the dynamics of 9 are at least as
complicated as those of I.
An example of two diffeomorphisms which are semi-conjugate is where I is a hyper-
bolic toral automorphism on 11'2 and 9 the DA-diffeomorphism constructed from I. In
fact, this I is semi-stable as the following theorem of Walters (1970) proves.
Theorem 7.1 (Semi.Stability of Anosov Diffeomorphisrns). Assume M is a com-
pact manifold and that I : M -+ M is an Anosov diffeomorphism on a compact (f has
a hyperbolic structure on all of M). Then I is semi-stable.
REMARK 7.1. The proof we give is in the spirit of that given by Anosov, although the
ideas are filtered through the concept of shadowing. A different type of proof was given
by Moser (1969). This latter proof solves a functional equation using a contraction
mapping.
PROOF. Let 9 be a homeomorphism within t > 0 of I, dhomeo(f,g) < t. Then for each
x E M, {gJ(x)}~_oo is an t-chain for I. Given." > 0, there is an t > 0 such that all
t-chains can be uniquely .,,-shadowed. Let y = h(x) be the point that .,,-shadows gi(x),
d(P 0 h(x), gJ (x)) < .". We now check the properties of h. Because the point which
shadows is unique, h is a well defined function.
Claim 1. The map h is continuous.
PROOF. Let Vp(r) = expp(Bp(r)) be the neighborhoods of p in M defined before in
the proof of shadowing. The proof of the shadowing result shows that

n
00

h(x) = li(Vg-;(x)(r)).
i=-oo
Given." > 0, there is a N such that all points in the finite intersection

nN

i=-N
li(Vg-l(x)(r))
9.7 LOCAL STABILITY AND STABILITY OF ANOSOV DIFFEOMORPHISMS 399

are within TJ/2 of hex). Then for y near enough to x, all points in the finite intersection
N
n
i=-N
li(Vg-i(y)(r»

are within TJ of hex). Since


N
n n
00

hey) = P(Vg-i(y)(r)) c li(Vg-i(y)(r»,


j--N

we have that d(h(x),h(y» < TJ. This proves the continuity of h. o


Claim 2. hog = loh.
PROOF. Using the point g(x) to TJ-shadow,

d(gi 0 g(x), po h 0 g(x» = d(gHI(X), p+! 0 rio h 0 g(x» < TJ.

But also
d(gi+!(x), Ii+! 0 hex»~ < TJ.
By the uniqueness of h(x), I-I 0 h 0 g(x) = h(x), or h 0 g(x) = 10 hex). o
Claim 3. The map h is onto.
PROOF. We argue that h is onto using ideas from algebraic topology. (In the proof of
Theorem 7.3 below, h is one to one and there is another proof which uses the Invariance
of Domain Theorem.) The map h is within TJ of the identity map and can be continuously
deformed into the identity (h is homotopic to the identity). The top homology group of
M, Hn(M) where n = dim(M), is nontrivial and the identity induces an isomorphism
on this group. Because h is homotopic to the identity, it also induces an isomorphism
on Hn(M). Since any map which induces an isomorphism on Hn(M) is onto, h is onto.
o
These claims combine to prove Theorem 7.1. 0
REMARK 7.2. Let 9 be the DA-diffeomorphism constructed from the hyperbolic toral
automorphism I. Then the semi-conjugacy h from 9 to I is not one to one. In fact,
h takes the line segments in W'(q,J) between W'(Pltg) and W'(Jl2,g) and collapses
them to points.
Theorem 7.2 (Openness of Anosov Dift'eomorphisms). Assume M is a compact
manifold. The set of Anosov diffeomorphisms on M is open in the CI topology.
We leave the proof of this theorem as an exercise. See Exercise 9.27. (The proof uses
cones.)
Theorem 7.3 (Structural Stability of Anosov Diffeomorphisms). Assume that
M is a compact manifold and I : M -+ M is an An080v diffeomorphism. (The map I
has a hyperbolic structure on all of M.) Then I is structurally stable (under all small
C I perturbations).
REMARK 7.3. This result was first proved by An080v (1967). Also see Moser (1969)
and the appendix by Mather in Smale (1967).
PROOF. The previous result showed there is a semi-conjugacy, h, with hog = 10 h.
We only need to show that h is one to one. Assume that hex) = hey). Then hogi(x) =
400 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

= =
Ii 0 hex) Ii 0 hey) h 0 gi(y). Therefore d(gi(x),gi(y» < 2'1. Since 9 is An08Ov,
it is expansive. Therefore if '1 is small enough, x = y. (The expansive constant can be
shown to be uniform for a neighborhood of /.)
In this case there are alternative proofs that h is onto that do not use the induced
maps on the top homology group. One such proof is given below in the case of a
hyperbolic invariant set. Another proof uses the fact that h is one to one. By the
Invariance of Domain Theorem, h is an open map so heM) is open in M. Because M
is compact, heM) is compact and so closed. If M is connected, the the image heM)
both open and closed in M and 80 is all of M, and h is onto. (This is the same type
of proof that we used for the Hartman-Grobman Theorem, Lemma V.7.S.) If M is not
connected, then the above argument can be applied to the connected components 80
show that h is onto. 0
Finally, we want to give a version of this last theorem for a hyperbolic i80lated
invariant set. See Hirsch and Pugh (1970).
Theorem 7.4 (Stability of an Hyperbolic Invariant Set). Let A, be a compact
hyperbolic i80lated invariant set for / : M -+ M with isolating neighborhood U. There
is an ( > 0 such that if 9 is a C l map within ( of /, then 9 has a hyperbolic structure
=
on A, n{gi(U) : j E Z}, and there exists a homeomorphism h : A, -+ A, (onto) that
gives a topological conjugacy.
PROOF. Given '1 > 0 there exists a 6 > 0 such that any 6-chain within 6 of A, can be
uniquely rrshadowed by an orbit in A,. Take N such that

n
N

i=-N
lieU) c {q : d(q,A,) < 6/2}.

There exists a CO neighborhood N' of / such that for 9 in N'

n
N

i=-N
gi(U) C {q: d(q,A,) < 6/2},

and gi(p) is a 6-chain for / for any p E ntgi(U): - N $ j $ N}. Let A, =


n{gi(U) : j E Z}. Using cones, it can be shown that if N c N' is a small enough
neighborhood of / in the Cl topology, then for 9 E N, 9 has a hyperbolic structure on
A,. (See Exercise 9.28.)
Take gEN. For PEA" gi(p) is a 6-chain for /. Therefore there is a unique
q = h(p) E A, such that d(fi oh(p),gi(p» < '1. This defines h: A, -+ A,. Just as
in the case of an An080v diffeomorphism, the fact that the shadowing is unique proves
=
that hog /0 h. The map h is continuous, just as for an An080v diffeomorphism.
Because 9 has a hyperbolic structure on A" we can define a map Ie : A, -+ A,
such that Ie 0 / = go Ie. In fact, if YEA" then Ii (y) is an 6-chain for g. Because
9 has a hyperbolic structure on A" this chain can be uniquely shadowed by gi (py)
for Py = Ie(y) with d(gi(py),Ji(y» small. Just as for h, Ie is continuous and satisfies
=
leo/ gole.
The following claim proves that Ie is the inverse for h, 80 h is one to one and onto.
This claim completes the proof of Theorem 7.4. 0
Claim. The map h is a homeomorphism between A, and A, and 80 is a conjugacy. In
fact, Ie is the inverse of h as a map between A, and A,.
PROOF. For pEA" d(fi 0 h(p),gi(p» < '1 is small for all j and the 9 orbit of p
shadows the / orbit of h(p), 80 Ie 0 h(p) = p. Thus if h(pt} =
h(P2) then PI =
Ie 0 h(Pl) = Ie 0 h(P2) = P2, 80 h is one to one.
9.8 STABILITY OF ANOSOV FLOWS 401

Next, for y E A" d(g1 0 k(y), p(y» is small for all i, and the f orbit of y shadows
the 9 orbit of k(y), so h 0 k(y) = y. Thus h is onto A" We showed above that h is
continuous. 0

9.8 Stability of Anosov Flows


The results of the preceding section are also true for flows. We restrict ourselves to
proving the structural stability of Anosov flows. We use the proof of this theorem to dis-
cuss expansiveness for a flow (flow expansiveness) and to introduce some constructions
which are used in the proof of the global structural stability theorem for flows. (The
global structural stability theorem for diffeomorphisms is stated in the next section.)
The reader should also notice that we solve a slightly different functional equation in
this section than that which is implicitly used in the last section by means of shadowing.
The statement of the theorem is similar to before.
Theorem 8.1 (Structural Stability of An080V Flows). Let M be a compact man-
ifold and ",I be a C· Anosov flow on M, i.e., ",I has a hyperbolic structure on all of M
and R(",I) = M. Then ",I is structurally stable, i.e., ",I is topologically equivalent to
any flow t/JI which is C· near to ",I for -2 ::; t ::; 2.
REMARK 8.1. Remember that for ",I to be topologically equivalent to t/JI it is permissible
to reparameterize either t/JI or ",I.
PROOF. The main difference in the proof is that the flow does not expand or contract
along the direction of the flow. To keep the trajectory of the perturbation from running
ahead of the trajectory for the original flow we construct a reparameterization of t/JI
that keeps its trajectory in a transversal of ",I(X).
For each point x E M, let E(x) be a small transversal at x. These can be taken 80
the vary differentiably with x. For 1/ > 0, let E(x,1/) be the subset of E(x) made up
of points within 1/ of x. As is done to define the Poincare map, there are " > 0 and a
differentiable function r(t,x,y) with r(O,x,y) = 0 and such that for -2 ::; t ::; 2 and
y E E(x, 1/),
t/JT(t.x,)')(y) E E(",I(X».
Then we can use this "reparameterization" to define

The flow FI is on a subset of M x M which can be thought of as a bundle over M.


(We could let E(x) = expx(t(x» where t(x) c TxM is a disk in a subspace which
is a complement to the line spanned by the vector field for ",I. In the construction of
the stable and unstable disks below, we would first construct disks in E(x) and then
exponentiate them to get disks in M. In this discussion we do not include these steps
explicitly. See Robinson (1975a).) The first point x gives the base point in M and the
second point y gives the point in the fiber (which is the transversal at x). Thus for
-2::; t::; 2,
FI: U {x} x E(x, 1/) -+ U {x} x E(x).
xEM xEM

This flow can be extended for all times for which t/J1'C I.X ,y)(y) the stays in the transversal
E(",I(X». As in the case for diffeomorphisms

DU(x, 1/) = nFI(",-I(x), E(",-I(X), 1/»


I~O
402 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

is an unstable disk at x,

D'(x,1/) = n
I:SO
F'(cp-'(X), E(cp-I(X), 1/»

is an unstable disk at x, and

DU(x, 7]) n D'(x, 7]) = nIER


F'(cp-'(X), E(cp-I(X), 7]»

== (x,h(x»
is a single point. By uniqueness of this point

(cpl(X), !J1T(I.X.h(x»(h(x))) = F'(X, h(x))


= (cp'(X), h 0 cpl(X»,

so
h 0 cpl(X) = !J1T(I,X,h(xJ)(h(x».
This formula has built into it the reparameterization of!J1. The function T(t, x, h(x» is
monotone function of t, so it has an inverse O'(s,x). Then 0' can be used to reparame-
terize cpl,
h 0 cp"(,,x)(x) = !J1·(h(x)).
The fact that h is onto and continuous is the same as for diffeomorphisms. The
reparameterization is continuous by the fact that both hand T (and so 0') are continuous.
Lastly, we need to check that h is one to one. Assume h(xIl = h(X2)' Then

h 0 cp"(,,xtl(xd = !J1'(h(xd)
= tjI'(h(X2»
= h 0 cp"(,,x')(X2),

so
d(cp,,(·,xtl(xd, cp,,(.,x.) (X2» < 27]
for all t. In fact, by taking tjlt nearer to cpt we can make this as small as we desire. The
following proposition gives the result about expansiveness of flows.
Proposition 8.2 (Flow Expansiveness). Let A be a compact hyperbolic invariant
set for a Bow cpl. Given E > 0, there exists a 6 > 0 such that if Xl, X2 E A with

for all t where


lim O'(t) = ±oo,
t-±oo

then X2 = cp'(xd for some lsi ~ E.

We defer the proof of the proposition to Exercise 9.30.


RETURNING TO THE PROOF OF THEOREM 8.1. By taking tjI' near enough to cpt, we
can insure that the two trajectories are within 6. Because transversals for nearby points
on the same trajectory are disjoint, we must have that Xl = X2, and h is one to one.
This completes the proof of Theorem 8.1. 0
9.9 GLOBAL STABILITY THEOREMS 403

The above proof uses the How FI on a bundle of transversals. This construction
works fine away from fixed points of the How. Since we are considering Anosov Hows
(or possibly flows on a hyperbolic invariant set without fixed points) this causes no
problem in the present situation. However, in the proof of the global stability theorem
for flows, it is necessary to allow fixed points. As other points limit on a fixed point,
the transversals do not have a continuous extension to the fixed point. In the proof of
the Hartman-Grobman Theorem near a fixed point for a flow, no reparameterization is
needed and the variation of the nonlinear How from the linear flow in all directions (not
just those transverse to the How lines) is used to construct the conjugacy. In the proof of
the global stability theorem for Hows, it is necessary to make a transition from (i) repa-
rameterizations of the flow and the conjugating function taking values in a transversal
for most points to (ii) 110 reparameterization of the flow and the conjugating function
allowing displacements in all directions near fixed points. One way to accomplish this
is given in Robinson (1975a). Although in this section we only consider flows without
fixed points, we introduce some of the ideas used in the more general situation.
It is possible to extend the flow FI used above so that it is defined for all pairs
(x, y) with y near x (not just in the transversal). As before, there are I) > 0 and a
differentiable function r( t, x, y) such that for -2 ::S t ::S 2 and d(x, y) < I),
.,pT(I,x,y)(y) E E(!p/(X».

In this context, r(O, x, y) is not necessarily equal to O. Next, let


J!(t,x,y) = r(t,x,y) - e-Olr(O,x,y)
where a > 0 is small enough so that
J!'(t,x,y) = r'(t,x,y) + ae-otr(O,x,y) > O.
(This last condition gives monotonicity of the reparameterization by J!.) Notice that
J!(O, x, y) = O. Let
F'(X, y) = (!p/(X), .,pT(t,X,Y) (y»
as before, but is defined on a larger space. It can be checked that F' satisfies the group
property, FI 0 F' = F'+'. The flow FI preserves the bundle of transversals on which
we previously defined the flow, We have altered the flow so that Ft is hyperbolic in
all the "fiber" directions: it contracts in the direction pointing off the transversal (in
the flow direction) in the fiber. Applying the construction as before, we get D"(x,l7)
and D'(x,l7) with DU(x, I) C E(x) so h(x) E E(x). (The disk D'(x, I) includes the
direction along the flow lines of .,pI in each fiber.) We refer the reader to Robinson
(1975a) for details.
In the present context, there is no real advantage to extending the How F' as indicated
above. However, this construction allows the style of proof discussed above to be used to
apply to the global stability theorem where fixed points are allowed. We present these
ideas in the present context where they are somewhat simple and can be understood
without the complicated induction construction needed for the general global stability
theorem.

9.9 Global Stability Theorems


In this section we give the global stability theorems. The first result gives the conju-
gacy only on the chain recurrent set. Because the original version of this theorem gave
the conjugacy only on the nonwandering set, nUl, we call the result the n-stability
theorem. Remember that in the last section we proved the existence of a conjugacy on
each basic set.
404 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Definition. Let I: M ..... M be a C l diffeomorphism on a compact manifold M. Then


I is 'R-stable provided there exists a neighborhood N of I in the CI topology such
that for 9 E N there is a homeomorphism h : 'RU) ..... 'R(g) (onto 'R(g» such that
hoI = go h. Similarly, I is called O-stable provided there is a homeomorphism h from
0U) onto O(g) with h 0 f = go h.

Theorem 9.1 (O-Stability Theorem). Assume that f: M -+ M [or M compact is


a CI diffeomorphism [or which 'RU) has a hyperbolic structure. Then f is'R-stable.
Alternatively using the non wandering set to express the assumptions, assume that
0U) has a hyperbolic structure, 0U) = cl(PerU», and f has no cycles. Then f is
O-stable.
REMARK 9.1. This theorem was originally given in Smale (1970).
REMARK 9.2. We give the proof with the assumptions on the chain recurrent set. With
the assumptions on the nonwandering set it can be proved that I has a filtration. The
rest of the proof is similar to the one given. See Shub (1987).
PROOF. By the assumptions, 'RU) = cl(PerU)), so by the Spectral Decomposition
Theorem 'R(f) = Al U ... U AN is the finite union of basic sets. Since we are using
the chain recurrent set, each Ale has an isolating neighborhood Ule such that Ale =
n~1 P(UIe). In this context, there are only a finite number of attracting-repelling
pairs. See Exercise 9.16.
By Conley's Fundamental Theorem, there is a Liapunov function V : M ..... R that is
strictly decreasing off of 'R(f). Because each pair of Aj and Ale can be put in different
attracting-repelling pairs, V has different values on each of the Aj . In fact the Aj
can be renumbered and V can be modified so that V(Aj) = j. For his modified V,
V : M -+ [1, NJ. Exercise 9.32 asks the reader to carry out the details of the above
modification of the Liapunov function V. This puts a total ordering on the Aj that is
compatible with the partial ordering with Aj « Ale if W"(Aj) n WU(AIe) "I: 0. (In this
ordering the orbits flow down the ordering. In many of my papers, I used the ordering
for which the index increases along a forward orbit.)
Using the modified Liapunov function V, we can define subsets of M by

These sets form what are called a filtration. They have the following properties which
characterize a filtration: for each j such that 1 ::; j ::; N,
(1) M = MN :) MN-I :) ... :) MI :) Mo = 0,
(2) f(Mj ) C int(Mj ), so each M j is a trapping region,
(3) A) C int(Mj \ M j _.},
(4) Aj = n;;:-oo fle(Mj \ M j _.}, and
(5)

n
00

1e=0
fle(Mj ) = UWU(Ai)
iSj

= Ucl(WU(Ai».
iSj

We leave it to Exercise 9.33 to check these properties. Note that we used the existence
of a Liapunov function to prove the existence of a filtration. From now on we use the
9,9 GLOBAL STABILITY THEOREMS

filtration and no longer use thfl Liapunov function. It is possible to prove the existence
of the filtration without using the Liapunov function. Properties 1-4 are used in the
proof of the n-Stability Theorem, while Property 5 is also used in the proof of the
Structural Stability Theorem. Note in Property 5 that cl(WU(A j » is not usually equal
to W" (Ai) but can have points in unstable manifolds of basic sets lower in the filtration.
Given the filtration, using the Local Stability Theorem of the last section, it is possible
to take the isolating neighborhoods Uj = Mj \ Mj -I. Then there exists a neighborhood
N of f such that for 9 EN, (i) each of the Mj \ Mj _ 1 is an isolating neighborhood for
Aj(g) = n%:_oogk(Mj\Mj-d, (ii) thereexistshj : Aj(f) -+ Aj(g) that is a conjugacy,
and (iii) g(Mj ) C int(Mj ). We define h : U:ml Aj(f) -+ U:=l Aj(g). This gives a
conjugacy on these sets. We show below that for 9 E N, R(g) = U:,.l Aj(g). Therefore
h is a R-conjugacy from f to g. It remalns to show that R(g) = U:.l A; (g) which we
do by means of the following two claims.
Claim 1. R(g) ::> U:.l Aj(g).
PROOF. The periodic points of! are dense in R(f) and so in each Aj(f). The conjugacy
h j takes these periodic points of f into periodic points for g. Therefore the periodic
points of 9 are dense in each Aj(g), and R(g) ::> cl(Per(g» ::> Aj(g). Taking the union
over j we get the result. 0
Claim 2. R(g) C U:=I Aj(g).
PROOF. Take Y E R(g). Then Y E M j \ M j - 1 for some j. We want to show that
Y E Aj .
The first step is to show that gi(y) rt Mj _ 1 for all i > O. Assume to the contrary
that gk(y) E Mj _ 1 for some k > O. Since g(Mj-d C int(Mj_d. gi(y) E int(Mj _.) for
all i ~ k. Also for f > 0 small enough, any t-chain {Yi} with Yo = y has YA: E M j _ 1
by the continuity of g. Next, g(Yk) E g(Mj _.) C int(Mj _.). For E > 0 small enough,
Yk+1 E Mj _ 1 since d(Yk+1,g(Yk» < t. Continuing by induction Yi E Mj- 1 for i ~ k.
This contradicts the fact that Y is chain recurrent. Therefore g;(y) rt Mj - 1 for all i ~ O.
Because g(Mj ) C Mj and Y E Mj , gi(y) E gi(Mj ) C Mj for all i ~ O. Combining,
g'(y) E M j \ Mj _ 1 for all i ~ O.
A similar argument applied to backward iterates shows that g;(y) E Mj \ M j _ 1 for
all i ::; O. Combining the results for positive and negative i, gi(y) E M j \ M j _ 1 for all
i E Z, I.e., Y E Aj(g). 0
Combining the claims, we have completed the proof of Theorem 9.1. o
In the exercises, we ask the reader to give a direct proof of the n-stability theorem
for a flow on the two sphere 52 with one source, one sink, and no other chain recurrent
points. See Exercise 9.31.
Definition. Assume f is a diffeomorphism with a hyperbolic structure on R(J) and
RU) = Al u··· U AN where the Aj are basic sets. We say that I satisfies the transver-
sality condition provided for every p,q E R(f). WU(p,J) is transverse to W'(q,J).
(This condition allows the fact that some unstable manifolds do not intersect other
stable manifolds.) Note that the condition is that the stable and unstable manifolds
of points are transverse and not just that the stable and unstable manifolds of basic
sets are transverse. Alternatively, we could assume that L(f) (resp. n(f» has a hyper-
bolic structure, and ! satisfies the transversality condition with respect to L(f) (resp.
n(f», I.e., WU(p, J) and W'(q, J) are transverse for all p, q E L(f) (resp. n(f». Note
that if L(J) (resp. nu)) has a hyperbolic structure and ! satisfies the transveraaUty
condition with respect to L(f) (resp. n(J» then! can be shown to have DO cyclee.
406 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

(Remember that we proved that if R(f) has a hyperbolic structure then f has no cy-
cles, so certainly it has no cycles if R(f) has a hyperbolic structure and f satisfies the
transversality condition.)

Theorem 9.2 (Structural Stability Theorem). Assume that M is a compact man-


ifold, I : M --+ M is a C l diffeomorphism, (i) I has a hyperbolic structure on R(f), and
(ii) f satisties the transversaJity condition with respect to R(f). Then f is structurally
stable. That is, there exists a neighborhood N of I in the set of C l diffeomorphisms
such that if 9 E N then 9 is topologically conjugate to f on all of M.
Assumptions (i) and (ii) and be replaced with either of the following alternatives:
(a) (i) I has a hyperbolic structure on L(f) and (ii) I satisties the transversaJity
conditioll with respect to L(f), or
(b) (i) I has a hyperbolic structure on n(f) and n(f) = c1(Per(f)) and (ii) I satisties
the transversality condition with respect to n(f).

REMARK 9.3. This theorem was proved for several special cases before the general proof
was given.
The case when R(f) = M was proved by Anosov (1967) and Moser (1969). (This case
is the Anosov Stability Theorem, Theorems 7.3 and VII.5.l(e).) These proofs applied
to both diffeomorphisms or flows (although Moser's proof for flows had to be somewhat
modified from what he gave).
The case when R(f) is a finite number of (periodic) points (so I is Morse-Smale)
was proved by Palis and Smale (1970). This proof applies to either diffeomorphisms or
flows. Also see Palis and de Melo (1982).
The case when I is a C2 diffeomorphism (but the neighborhood is still in the CI
topology) was proved by Robbin (1971). This is the first proof of the general theorem
stated above.
The case when I is a CI diffeomorphism was first proved in Robinson (1976a). (This
result has weaker hypothesis than the result of Robbin referred to above.)
The general case of the theorem when J' is a CI flow was proved by Robinson (1975a)
after first proving it for C 2 vector fields in Robinson (1974).

REMARK 9.4. Besides the original proofs referred to above, also see Robinson (1975b,
1976b, and 1977) for sketch of the proof and discussion of the ideas for the full theorem.

REMARK 9.5. The proof in one dimension is especially easy because there can only
be sources and sinks. In Section 2.6, we treated some examples on the line. The one
complication comes from the fact that the line is not compact. We also considered some
examples which have critical points which is more complicated. These methods can be
applied to show that Morse-Smale diffeomorphisms on Sl are structurally stable. See
Exercises 9.38 and 9.39 for some special cases.
We conclude by giving a few comments about the construction. The conjugacy is
built up in neighborhoods of successive basic sets. It is necessary to extend the map
onto the stable and unstable manifolds. We give the following definition.

Definition. Let A be a hyperbolic basic set. A fundamental domain lor the stable
manifold 01 A is a closed set D' C W'(A) \ A such that there exists a set D" with
D' = cl(D") and Ij(D") n D" = 0 for all integers j l' O. Note that D' n A = 0.
A fundamental domain for the urutable manifold of A is defined similarly.
9.10 EXERCISES 407

REMARK 9.6. One way to show the existence of a fundamental domain is to use a
Liapunov function. Let V : M ~ R be a Liapunov function with A. C V-I(i) and
W'(A.) C V-I([i,oo)). Let S = V-I([O,i + ED n W·(A.) be the "local stable manifold"
of A. for some choice of f, 0 < f < 0.5. Let D" = S \ I(S) and and D' = d(D·'). This
is a fundamental domain for the stable manifold. See Exercise 9.3S.
The proof is not that difficult in the case when 1 has only one repeller and one
attractor. We consider the even easier case of a north pole south pole diffeomorphism
of 8", i.e., 1 has a single fixed point source X2 and a single fixed point sink XI and
no other chain recurrent points. Let D' be a fundamental domain for the sink XI of
I. Also that D' is constructed as above with the upper edge equal to V-I(1.5). Let
9 be a small C l perturbation which is R-conjugate (so it has only two fixed points, YI
and Y2). Assume 9 is near enough to 1 so that g(V-I(l.5)) C V-I([l, 1.5)), i.e., 9 still
move this level set down in terms of V. Let ho(x) = X on V-I(1.5). On the image
of 1(V-1(1.5», define ~ by ~(x) = go ho 0 rl(x). Using a bump function ho can
be filled in to define a function ho : D' ~ 8". (This is similar to the construction in
Section 2.6.) Since D' is a fundamental domain, ho can be extended to a function h
defined on sn\ {Xl. X2} by hex) = giohoori(x) where j is chosen so that I-'(x) ED'.
Since ho is continuous on D', this extension is continuous on 8" \ {x I , X2}. Also define
h(xt} = YI and h(X2) = Y2. A little checking shows that h is continuous at these points
as well. This completes the sketch of why 1 is structurally stable in this case.

9.10 Exercises
Fundamental Theorem
9.1. Let 1 : S2 ~ S2 be the diffeomorphism with one source, one sink, and a hyperbolic
(saddle) invariant set that is a Cantor set, i.e., 1 is the horseshoe map on S2. Find
enough pairs of attracting-repelling pairs to show that their intersection as in Theorem
1.3 is equal to R(f).
9.2. Let 1 be a diffeomorphism, and all the Pj listed below are periodic points for I.
(a) Assume that q E WU(O(p.)) n W'(O(J)2)) =1= 0. Show that for all E > 0, there is
an E-chain from PI to q and then to P2.
(b) Assume that Qj E WU(O(pj))nW'(O(PH.)) =1= 0 for j = 0, ... , nand Pn+1 = Po.
Show that all the Qj are chain recurrent.
9.3. Let X,Y E R and assume Y ~ {l+(x). Prove there is a (A, AO) E A such that
X E A and Y E AO. Hint: Let U = {li(x) for small f, and for the corresponding
attracting-repelling pair, (A, AO) E A, show that x E A and yEA".
9.4. Prove that if Y is an invariant set for 1 and X = dey) then X is an invariant
set for I.
9.5. Let ",' be a continuous flow that is defined on a space X for all t, e.g. X is
compact. Assume V is an open set in X and let U = r1o<l<r""(V). Assume X E U.
(a) Prove that ",-I(X) E V for 0 :S t :S T. - -
(b) Prove that U is open.
9.6. Assume that the set Y is positively invariant for the flow <p'. Prove that y c is
negatively invariant.
9.7. Let <p' be a continuous flow on a compact metric space M. Let (A,AO) be an
attracting-repelling pair for a trapping neighborhood U. If x ~ A u A 0 , without using
a Liapunov function prove that w(x) C A and o(x) C A".
9.S. Let <p' be a continuous flow on a compact metric space M. Let A be an attracting
set. Prove its dual repelling set AO does not depend on which trapping neighborhood
U is used (for which A = n,~o ",'(U)) but only depends on A.
408 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Shadowing and Expansiveness


9.9. Let D(x) = 2x mod 1 be the doubling map on SI. Let {Xj}~o be a 6-chain for
D.
(a) Show that the inverse iterates D-i(xj) can be chosen so that

(b) Choosing the inverse iterates 88 in part (a), let y" = Iimj_OO D-H"(Xj). Prove
that D(y,,) = Yk+l.
(c) Prove that Yo 6-shadows the 6-chain {Xj}~o.
9.10. Let EA be a one-sided subshift of finite type with metric d as defined in Chapter
II. Let 6 = 0.5, Assume {s(j) E EA}~O is a O.S-chain for O'A on EA. Specify the point
t E EA which 0.5-shadows the 0.5-chain.
9.11. Prove that a hyperbolic toral automorphism on y2 is expansive without using
the result about shadowing.
9.12. Assume A is a compact hyperbolic invariant set. Prove that A is isolated if and
only if it has a local product structure.
9.13. Assume A is a compact hyperbolic invariant set that is not isolated. Prove
that if V is a small enough neighborhood of A then the maximal invariant set in V,
Av = nnEZ t(V), h88 a hyperbolic structure. Hint: See the proof of Theorem VlI.4.5,
the existence of a horseshoe for a transverse homoclinic point.
Anosov Closing Lemma
9.14. Consider the example given in Remark 4.2 and Figure 4.1.
(a) Explain why the point labeled q is in n(f) but not L(f).
(b) Explain why the point labeled q is in n(f) but not in n(fln(f», so n(fln(f» '"
n(f).
9.15. Prove Theorem 4.1(c), i.e., assume that the nonwandering set n(f) is hyperbolic,
and prove that c1(Per(f» = n(fln(f».
Decomposition of Recurrent Points
9.16. Assume M is a compact manifold and f : M -+ M is a diffeomorphism with a
hyperbolic structure on the chain recurrent set, R(f). Let A be the set of attracting-
repelling pairs for f. Let {AI," . ,AN} be the collection of basic sets.
(a) Let (A,AO) E A. If Aj n A '" 0, prove that Aj C A and WU(A j ) C A. If
Aj n AO '" 0, prove that Aj C AO and W'(A j ) C AO.
(b) Let (A, AO) E A. Prove that

A = U{WU(Aj ) : Aj n A", 0} and


AO = U{W'(A i ) : Ai n AO '" 0}.

(c) For a diffeomorphism with a hyperbolic structure on the chain recurrent set, prove
that there are a finite number of distinct attracting-repelling pairs in A.
9.17. Give an example of an attracting set which is not an attractor.
9.18. Assume that (i) the periodic points are dense in two hyperbolic basic sets Aj,
IUId Aj,' (ii) there exist ql E Ai, and q2 E Ai, such that WU(q.) h88 a non-empty
transverse inters«tion with W'(q2), and (iii) there exist qi E A" and qi E A" such
that \P(qi) has a OO[H!mpty transverse intersection with W·(q't). Prove that all the
9.10 EXERCISES 409

points of intersection, W"(A,.) n W'(A h ) and W"(Aj,) n W'(A i .), are in cl(Per(J»
and so in n(J).
9.19. Assume that M is compact, , : M -+ M has a hyperbolic structure on the
nonwandering set n(f), and , has a cycle. Prove that that nonwandering set is not
equal to the chain recurrent set, n(f) :/: 'R(f).
9.20. Assume that, : M -+ M has a hyperbolic structure on the chair recurrent set
»
n(J) and M is compact. Assume A, is a basic set for which int(W'(A i "# e. Prove
that Ai is an attractor.
9.21. Assume that' : M -+ M has a hyperbolic structure on the chair recurrent set
n(J) and M is compact. Let

Prove that S is open and dense in M.


9.22. Let,: M ..... M be an Anosov diffeomorphism on a connected manifold. (It is
not assumed that , is a hyperbolic toral automorphism.)
(a) Let p E Per(f) be a periodic point. Prove that W'(p) is dense in M. Noted that
this is the stable manifold of p and not the stable manifold of the orbit of p. Hint:
Prove that cl(W'(p» is open in M.
(b) Let q E M be any point. Prove that W'(p) is dense in M.
(c) Prove that, is topologically mixing.
9.23. Show on any compact two dimensional manifold M that there exists a diffeo-
morphism' for which (i) n(f) has a hyperbolic structure and (ii) , has infinitely many
periodic points. Hint: Take a Morse-Smale diffeomorphism on M with a fixed point
sink. Replace a neighborhood of the sink with the Smale horseshoe on the disk N (in
S2).
9.24. Assume that' : M -+ M has a hyperbolic structure on the chair recurrent set
nu) and M is compact. Let 'R(f) = AI U ... U AN be the spectral decomposition into
basic sets. Prove that each basic set Ai can be decomposed into subsets

nJ

Ai = UX,,;
i:al

with the following properties.


(i) The sets Xi,; = cl(W"(p) n W'(p» for some periodiC point p E Xi,;'
(ii) The sets Xi,; are pairwise disjoint.
(iii) The sets Xi,; are permuted, ,(Xi,;) = Xi,HI for 1 ~ i < ni and ,(Xi,nj) = Xi,l'
(iv) r
The n,-power of, restricted to each Xi,; is topologically mixing, i IXj ,; is topo-
logically mixing for each j and i.
Markov Partitions
9.25. Let E8 be a two sided subshift of finite type with shift map 0'8. Find Markov
partitions of arbitrarily small diameter. (Note that there is no differential structure, so
this map does not have a true hyperbolic structure but it does have stable and unstable
manifolds which is all that is needed,)
9,26, Let f : R2 -+ R2 be the diffeomorphism which has the geometric horseshoe A as
an invariant set, Find Markov partitions of arbitrarily small diameter.
410 IX. GLOBAL THEORY OF HYPERBOLIC SYSTEMS

Local Stability
9.27. Prove that the set of Anosov diffeomorphisms is open by proving the following
steps. Let f be all Anosov diffeomorphism and define cones C; using the splitting
lE~ E9lE~ such that Dff-'(p)Cj-,(p) C C~.
(a) Show that if 9 is Cl near enough, then Dgg-'(pp;_,(p) C Cj; C TpM.
(b) Prove that
n
n~O
Dg;-n(p)C;-n(p) C TpM

is a subspace E~·g that is near to the subspace lE~ for f. Similarly, get the subspace

E~g = n
n~O
Dg;n(pp;n(p) C TpM,

where C~ = c1(Tq \ C:).


(c) Since the subspaces lE~·g and lE:;g are near the subspaces lE~ and lE~, argue that
TpM = lE~,g E9E~,g is a direct sum decomposition.
(d) Since the decomposition for 9 is near the decomposition for f, argue that this is a
hyperbolic structure: Dgp expands vectors in lE~,g and contracts vectors in lE:;g.
9.28. Assume that Af be a hyperbolic isolated invariant set for f : M -+ M with is0-
lating neighborhood U. Prove that if f > 0 is small enough and 9 is a C 1 diffeomorphism
within f of f, then 9 has a hyperbolic structure on Ag = n{gj(U) : j E Z}.
9.29. Assume that f : M -+ M has a hyperbolic chain recurrent set on a compact
manifold M. Prove each of the basic sets Aj is an isolated ill'" lilt set.
Anosov Flows
9.30. Prove Proposition 8.2 on Flow Expansiveness.
Global Stability Theorems
9.31. Let cpl be the flow on the two sphere, 8 2 , with one sink and one source and no
other recurrent points. This flow is often called the north-pole south-pole flow. Prove
that cpl is 'R.-stable using only the local stability near the fixed points (i.e., without
using the fI-Stability Theorem).
9.32. Assume that f : M -+ M is a Cl diffeomorphism for which M compact and
'R.(f) has a hyperbolic structure. Prove that the Aj can be renumbered and there exists
a modified Liapunov function V: M -+ [I, NJ which has the following properties:
(i) V is decreasing off 'R.(f), and
(ii) V(Aj) = j.
9.33. Assume that f : M -+ M is a CI diffeomorphism for which M compact and
'R.(f) has a hyperbolic structure. Let Aj for 1 ~ j ~ N. Let V be a Liapunov function
that is strictly decreasing off of 'R.(f) with V(Aj) = j. Let

Prove these sets have the five properties of a filtration listed in Section 9.9.
9.34. Assume that f : M -+ M is a Cl diffeomorphism for which M compact and 'R.(f)
has a hyperbolic structure. Prove there is at least one basic set which is an attractor
and at least one basic set which is a repeller.
9.35. Assume that f : M -+ M is a CI diffeomorphism for which M compact. Also
assume that f has finitely many periodic orbits, all of which are hyperbolic. Prove that
9.10 EXERCISES 411

there can be no cycle between these periodic points for which all the intersections of
stable and unstable manifolds are transverse.
9.36. Assume that fJ : M j -+ M j is a Cl diffeomorphism on a compact manifold M j
for j = 1,2. Let F: Ml x M2 -+ Ml X M2 be defined by F(x,y) = (h(x), h(y».
(a) If fJ has a hyperbolic structure on the chair recurrent set 'R.(fJ) for j = 1,2, prove
that F has a hyperbolic structure on 'R.(F).
(b) If in addition to the assumptions of part (a), if fJ satisfies the transversality
condition for j = 1,2, prove that F satisfies the transversality condition and so is
structurally stable.
9.37. Let A be a basic set for a diffeomorphism f with a hyperbolic chain recurrent set
on a compact manifold M. Let V : M -+ R be a Liapunov function with A c V-l(i).
Let S = V-I([O, i + f» n W'(A) for some choice of f > 0, and D' = c1(S\f(S». Prove
for good choices of f that D' is a fundamental domain.
9.38. Let f be the diffeomorphism on Sl whose lift F : R -+ R is given by

F(8) = 8 + fsin(211'k8) mod 1,

for 0 < 211'kf < 1. Prove that f is structurally stable.


9.39. Let f be the diffeomorphism on Sl whose lift F : R -+ R is given by

F(t) = t + lIn + f sin(211'nt)


for n a positive integer and 0 < f < 1/(211'n). Prove that f is structurally stable.
9.40. Prove that the set constructed in Remark 9.6 is a fundamental domain for the
basic set.
CHAPTER X
Generic Properties

In Chapter IX, systems with hyperbolic chain recurrent sets are shown to be 'Rr
stable. Certain systems are also shown to be structurally stable: (i) Anosov systems,
(ii) Morse-Smale systems, and (iii) systems with a hyperbolic chain recurrent set which
also satisfy the transversality condition. Although we have given examples of such
systems, we have not discussed the prevalence of such systems. At one time it was
hoped that any system could be approximated by a structurally stable system. By
now, many counter examples have been constructed. For these examples, there is a
whole open set of systems that are not structurally stable or even !l-stable. However,
it is possible to prove that there are certain properties which are generic in the sense
of Baire category, i.e., any system can be approximated by another for which these
properties are true and the condition is open at least in a weak sense. This chapter
considers several of the basic generic properties. It also gives a counter-example to the
density of structurally stable systems. The proofs of the genericity use methods from
transversality theory, so a section develops these ideas.

10.1 Kupka-Smale Theorem


A generic property is one which holds for most functions in the function space under
consideration. We make this precise in the following definition.
Definition. Let F be a topological space. A subset n c F is called a residual subset
(in the sense of Baire category) provided n contains the countable intersection of dense
n;:l
open subsets, more precisely, 'R. :::> Sj where each Sj is open and dense in F. In a
complete metric space, a residual subset of X is always dense. A topological space X
is called a Baire space provided any residual subset of X is dense in X. A property is
generic in a function space F which is a Baire space provided the property is true for
functions in a residual subset of F.
If M is a compact manifold and 1 ~ k ~ 00, then CIe(M,M), Diffle(M), and the set
of C le vector fields XIe(M) are all Baire spaces. See Hirsch (1976). If M is noncompact
and these function spaces are given the Whitney topology (or strong topology as defined
in Hirsch (1976» then they are also Baire spaces.
The first generic property which we consider is the hyperbolicity of periodic points
and the transversality of the stable and unstable manifolds. Because the statement and
proof of the theorem can be expressed in terms of the set of all periodic points and
the set of hyperbolic periodic points, we give some notation for these sets. For any
diffeomorphism' on M, let Per(k,f) be the set of all periodic points with period less
than or equal to n,
Per(n, f) = {p EM: Ji(p) =p for some j ~ n},
PerU) be the set of all periodic points,
00

PerU) = U Per(n, f),


n=l

413
414 X. GENERIC PROPEJITIES

Perh(n, f) be the set of all hyperbolic periodic points with period less than or equal to
n,
Perh(n, f) = {p E Per(n, f) : p is a hyperbolic periodic point },
and Perh(f) be the set of all hyperbolic periodic points,
Perh(f) = {p e Per(f) : p ill a hyperbolic periodic point }.
(Notice that this usage ofthe notation for Per(n, f) does not agree with how it is used
in the rest of the book where it is the set of all periodic points with least period exactly
n.)
Note that all the periodic points of period less than or equal to n are hyperbolic if
and only if Per(n,f) = Perh(n,f). Using this fact, we define
'Hn = {f E Diff"(M): Per(n,f) = Perh(n,f)}, and

n=1

Therefore, f E 'H if and only if all the periodic points of f are hyperbolic.
The second half of the theorem deals with the transversality of the stable and unstable
manifolds. We let
KS(M) = {f E 'H : W'(p, f) is transverse to WU(q, f) for all p, q E Per(f)}.

Theorem 1.1 (Kupka-Smale). Assume M is a compact manifold and 1::; k::; 00.
(a) The set 'Hn defined above is dense and open in Diff"(M) , and 'H is a residual
subset of Diff"(M).
(b) The set KS(M) is a residual subset of DiffA:(M).
REMARK 1.1. This theorem is also true for vector fields (or Hows). We require that
both all the fixed points and all the periodic orbits are hyperbolic in the definition of
'H(X). In the definition of KS(X), we use the stable and unstable manifolds of periodic
orbits and not just the stable and unstable manifolds of individual points in the periodic
orbits: we require that W'btl is transverse to WU (2) where /'1 and/' vary over all
fixed points and periodic orbits.
There are two aspects which are different about Hows or vector fields than diffeo-
morphisms. First, there are both fixed points and closed orbits. This difference is
minor. Second, the periods of the periodic orbits can be any positive real number so the
induction on the period is slightly more cumbersome to implement. Even with these
differences, the main ideas of the proof are the same.
REMARK 1.2. The Kupka-Smale Theorem was proved independently by Kupka (1963)
and Smale (1963). A nice proof for the case of vector fields is given in Peixoto (1966)
which includes the case of noncompact manifolds. Palis and de Melo (1982) and Abra-
ham and Robbin (1967) also give proofs for vector fields.
REMARK 1.3. We delay the proof (for diffeomorphisms) until Section 10.3 because it
uses the transversality theorems which we discuss in Section 10.2. The idea of the
proof is that a periodic point can be approximated by a hyperbolic periodic point.
Since the hyperbolicity of a single periodic point is an open condition, the set 'Hn is
both dense and open. Similarly, a nontransverse intersection of stable and unstable
manifolds can be approximated by a transverse one which implies that. KS(M) is dense.
The transversality of the intersections on compact pieces is an open condition and the
manifolds can be represented as the countable union of compact subsets, so it can be
shown that KS(M) is residual.
10.1 KUPKA·SMALE THEOREM 415

Definition. A diffeomorphism (respectively flow) which satisfies the properties of the


set KS(M) in Theorem l.l(b) is called a Kupka·Smale diffeomorphism (respectively
flow).
The next set of results concerns the genericity of the condition that the closure of
the periodic points equals the nonwandering set. The first step is the possibility of
approximating a nonwandering point by a periodic orbit, the Closing Lemma. This
lemma was first thought to be obvious (hence the title of a lemma), which it is in the
CO topology. Its proof is very difficult in the C 1 topology and unknown in the C 2
topology. In other words, the approximating diffeomorphism 9 with the periodic orbit
can be taken to be Coo but is only Cl near to the original f. The reason that the
proof is complicated is that one localized perturbation is not enough to change an orbit
which returns near to itself into a periodic orbit. The proof for an approximation in
the Cl topology uses many localized perturbations to accomplish the feat. A proof for
an approximation in the C 2 topology (if and when it is given) will probably not use a
"localized perturbation", but will have to control the effects of an orbit passing several
times through a single perturbation.
Theorem 1.2 (Closing Lemma of Pugh). Assume M is a compact manifold, fa
c 1 diffeomorphism, Na neighborhood of I in Diffl(M), and p E flU). Then there is
agE N such that p is a periodic point for g.
REMARK 1.4. This result is also true in the spaces of Cl flows or Cl vector fields.
It was originally proved in Pugh (1967a) for flows. That paper claims to prove the
result for C 1 vector fields but there is a technical difficulty in the smoothness of the
perturbed vector field (which do not arise when considering the space of flows). These
difficulties were discovered by Pugh and are corrected (by Pugh) in Pugh and Robinson
(1983). The theorem is also proved for many other function spaces in this latter paper:
Hamiltonian and volume preserving diffeomorphisms and flows. Further papers on this
result include Liao (1979), Mai (1986), and Wen (1991). For an intuitive discussion of
the proof and its difficulties see Robinson (1978).
The next lemma is much easier than the Closing Lemma. It shows that once a periodic
orbit has been produced the diffeomorphism (vector field, or flow) can be approximated
by a new diffeomorphism (vector field, or flow) with a hyperbolic periodic point.
Lemma 1.3. Let p be a periodic point of period j for a C,. diffeomorphism 9 : M -+ M,
for 1 ~ k ~ 00. Then 9 can be approximated arbitrarily closely in the Ck topology by
a diffeomorphism g' such that p is a hyperbolic periodic point of the same period.
PROOF. Let 'P : V -+ U be a coordinate chart at p with 'P(O) = p. Let r > 0 be small
enough so that gi(B(p, r»nB(p, r) = 0 for 1 ~ i < j where B(p, r) = 'P( {x: Ixl ~ r}).
Let {3 : Rn -> R be a Coo bump function such that {3(x) = 1 for Ixl ~ r/2 and {3(x) = 0
for Ixl ~ r. Finally, let g.(q) = g(q) for q ~ B(p, r) and
g. 0 'P(x) = 9 0 'P(x + E{3(X)X)
for x E 'P-1(B(p,r» = {x: Ixl ~ r}. Then
D('P- 1 0 y!. 0 'P)o = D('P- 1 0 gi 0 'P)o(l + E)I.
Because of the form of this derivative, g. has a hyperbolic periodic point at p for
arbitrarily small E > O. Clearly, g. converges to 9 in the C,. topology as E goes to zero.
o
A corollary of the Closing Lemma is the General Density Theorem which was also
originally proved by Pugh (1967b). This result is only true in the C 1 topology because
it uses the Closing Lemma.
416 X. GENERIC PROPERI'IES

Theorem 1.4 (General Density Theorem). Assume M is a compact manifold. Let


9 = {J E Diffl(M) : c1(Perh(f)) = n(f)}. Then 9 is residual in Diff1(M).
Before starting the proof of the General Density Theorem, we give some definitions
and results about semi-continuous set valued functions which are used in the proof.
Let M be a complete metric space, and CM be the collection of all compact subsets
of M. The Hausdorff metric on CM is defined as follows: for A, BE CM,

d(A, B) =sup{d(a, B), d(b, A) : a E A, bE B},


where
d(b, A) = inf{d(b, a) : a E A}.
(Note: This metric has nothing to do with the Hausdorff property for a general topo-
logical lipace.) If M is a complete metric space, then CM is a complet.e metric splICe
with the metric d defined above. (This result is an exercise in many of the books on
topology.)
The proof also uses the concept of a semi-continuous set valued fUlIction. (The
functions we use take f to Perh(n, f).) Let An E CM for n ::?: 1. Define

liminf An = {y EM: there exist Yn E An for n::?: 1


n-<XI

such that Y = lim Yn}.


n-oo

If a point Y is contained in all the An for n ::?: N for some N, then Y E liminf n _ oo An.
Thus the set lim infn _ oo An is the set of points which are "essentially" contained by
each of the An for large n. (In fact, y only has to be the limit of points in the An and
not actually lie in the sets.) It is easy to check that if An E CM then lim inf n _ oo An is
compact so is in CM.
Let X be a topological space and M a complete metric space. A set valued function
r : X ..... CM is called lower semi-continuous at x provided
f(x) C lim inff(xn )
n-oo

for every sequence Xn E X converging to x. This means that any point in r(x) can
be approached by a sequence of points Yn E r(xn ). In an intuitive sense, the set f(x)
can be smaller than the r(Xn) for nearby Xn but can not be bigger. The set valued
function r is called lower semi-continuous provided it is lower semi-continuous at all
points x EX.
Lower semi-continuity can also be expressed in terms of a nonreftexive "semi-metric"
on CM defined by
=
d.ub(A, B) sup{d(a, B) : a E A}.
For B compact, d.ub(A, B) = 0 if and only if A C B. (In the Hausdorff metric,
d(A, B) = 0 if and only if A = B.) Then r : X ..... CM is lower semi-continuous at x
provided
lim{d.ub(Gamma(x),r(y» : y converges to xinX} = O.
Exercise 10.3 asks the reader to prove the following result. Assume r n : X ..... CM
are lower semi-continuous set valued functions for n ::?: 1, and define r : X ..... CM by
[(xl = cl(Un r n(x)). Then f is a lower semi-continuous set valued function.
Finally, assume r : X ..... CM is a lower semi-continuous set valued function. Let
R. c X be the points of continuity of f. Then R. is residual, i.e., a semi-continuous
function is continuous at a residual subset. See Choquet (1969).
10.2 TRANSVERSALITY 417

PROOF OF THEOREM 1.4 (GENERAL DENSITY THEOREM). Define rn,r: Dur1(M)-+


eM by
r nU) = Perh(n, f).
ru) = c1(PerhU»,
where Perh(n, f) and PerhU) are the set of hyperbolic periodic points of period less
than or equal to n and of all periods respectively. The sets Perh (n, f) can easily seen
to be closed and so compact. For n ~ I, the map r n is lower semi-continuous because
a hyperbolic periodic point persists under perturbations. (See Theorem V.6.4.) These
functions are not continuous at all I: if 10 is a diffeomorphism with a periodic point
Xo with eigenvalue of absolute value one, then under arbitrarily small perturbations of
10 to 9 the periodic point Xo(g) can become hyperbolic (or disappear). Thus the set
r n(g) can get bigger for small C l perturbations of an I (Xo(g) E r n(g» but can not
get smaller, so r n is lower semi-continuous but not continuous at 10' Because each of
the set valued functions r n are lower semi-continuous, the function ro = c1(Un r n('»
is also lower semi-continuous by Exercise 10.3.
Let n c Diffl (M) be the points of continuity of r. As mentioned above, a serni-
continuous set valued function is continuous at a residual subset, so n is residual in
Diffl(M).
Claim. The set neg where 9 is the set defined in the theorem, so 9 is residual.
PROOF. Suppose lEn \ g. Therefore, r is continuous at I, but there is a point
p E nu) \ c1(Per(f». By the Closing Lemma, there are gi E Diffl(M) converging to I
such that p E Per(gi). By Lemma 1.3, each g.
can be approximated by g: E Diffl(M)
such that p E Perh(g;) and the g: still converge to I. Therefore p E r(g;) and
lim{d(r(g:),rU» : i -+ oo} ~ d(p,ru» F 0,
which contradicts the fact that I is a continuity point of r. This contradiction proves
that neg which prove the claim and finishes the proof of the theorem. 0

10.2 Transversality
We have already defined what it means for two submanifolds to be transverse in an
ambient manifold. In this chapter we need to consider functions which are transverse
to a submanifold in the space where the function takes its values. After giving the
definition of this concept, we present several transversality theorems which state that
most functions are transverse to a given submanifold.
Definition. Let M and N be differentiable manifolds, V C N be a submanifold, and
J( C M be a subset. A C l function I : M -+ N is transverse to V at P E M provided
that either (i) I(p) 'i. V, or (ii) Tf(p)N is spanned by the two subspaces D/pTpM and
Tf(p) V whenever I(p) E V, i.e.,
Tf(p)N = DlpTpM + Tf(p) V.
A C l function I : M -+ N is transverse to V along K provided it is transverse to V at
all points p E K. We use the notation / rhK V to indicate that I is transverse to V
along K. If K = M, then we say that I is transverse to V, and denote it by I rh V.
We also use the notion of the codimension of a submanifold. Assume N is a manifold
and V C N a submanifold. The codimension 01 V in N is defined to be dim(N) -
dim(V), and is denoted by codim(V).
The following theorem gives a generalization of the fact that the inverse image of a
regular value is a submanifold.
418 X. GENERIC PROPERTIES

Theorem 2.1. A~ume M and N manifolds and V c N is a submanifold. (In our use of
the terminology a submanifold is an embedded submanifold.) Assume that I : M --+ N
is a C r map for r ~ 1, and it is transverse to V. Then I-'(V) is a C r submanifold of
M. Morem'f'I', the codimension of 1-' (V) in M is the same as the codimension of V in
N.
See Hirsch (1976) for a proof.
Theorem 2.2 (Thom Transversality Theorem). Let M and N be manifolds with
M compact, and let V C N be a closed submanifold. Assume that k ~ 1. The set of
functions Tk(M, V) = {f E Ck(M, N) : I rh V} is dense and open in Ck(M, N).
Again, see Hirsch (1976) for a proof.
The difficulty in using the Thorn Transversality Theorem is that we do not always
have the use of all the functions in Ck(M, N). In the proof of the Kupka-Smale Theorem
for I E Diff' (M), we consider the maps Pn(f) = (id, f") : M --+ M x M where n is a
positive integer. We want to conclude that most diffeomorphism I have Pn(f) transverse
to the diagonal in M x M. Thus we do not have all functions in C'(M, M x M)
at our disposal but only those arising as Pn(f). The Thorn Transversality Theorem
certainly implies that an open set of I have this property. (See Theorem 2.4 below.) To
prove the density of the I which satisfy the property, we must know that the function
space Diffl (M) is large enough to be able to make a perturbation of I for which Pn(f}
is t.ransverse to the diagonal. There are various ways around this difficulty. Palis
and de Melo (1982) and Peixoto (1966) use only the Thorn Transversality Theorem to
prove density. They both build into the proof the necessary step to make the Thorn
Transversality Theorem applicable to construct the perturbation of the diffeomorphism
I itself and not just of Pn(f}. Abraham and Robbin (1967) proves a very general form
of the transversality theorem which is directly applicable to representations like Pn. The
difficulty is that this proof of the Kupka-Smale Theorem requires proving that Diff' (M)
is a Banach manifold which we do not want to verify. (It is true that Diffk(M) is a
Banach manifold and p~v : Diffk(M} x M --+ M x M is C k . See Franks (1979), or
Hirsch (1976).) We give a proof more like Abraham and Robbin but only require that
Diffl (AI) contains finite dimensional subsets of functions on which Pn is differentiable.
Thus. we lise the Parametric Transversality Theorem stated below. Also see Abraham
and Robbin (1967) for many related results.
First we give one definition which we use repeatedly.
Definition. Let M and N be manifolds, :F a topological space, and P : :F --+ Ck(M, N}
a continuous map for some k ~ O. The evaluation 01 p, is defined to be the map
P' v : :F x M ..... N given by
p'V(f,x} = p(f)(x).

Theorem 2.3 (Parametric Transversality Theorem). Assume Dq is an open su~


set of some IRq, M and N are manifolds, K C M compact, V C N is a closed subman-
ifold, and k > max{O, dim(M} - codim(V)}. Assume P : Dq --+ Ck(M, N) satisfies the
following two conditions:
(i) P is continuous,
(ii) the evaluation map p. v : Dq x M --+ N is C k , and
(iii) p' v is transverse to V along ~ x K.
Then T(p, K, V) == {t E Dq : p(t} rhK V} is dense and open in Dq.
REMARK 2.1. The differentiability assumption on p'v is that it is C k where k is greater
than the dimension of M minus the codimension of V in N. Note that there is a misprint
in this assumption in the Parametric Transversality Theorem given in Hirsch (1976).
10.3 PROOF OF THE KUPKA-SMALE THEOREM 419

REMARK 2.2. The transversality 88Sumption on the evaluation map means that there
are enough parameters with which to make the necessary perturbations at one point at
a time. The conclusion of the theorem is that the function can be approximated by one
which is transverse at all points.
REMARK 2.3. See Hirsch (1976) for a proof.
The following theorem proves the openness of the set of maps which are transverse.
Theorem 2.4. Assume tbat :F is a topological space, M and N are manifolds, K c
M compact, V C N is a closed submanifold, and 1 :5 k :5 00. Assume the map
p::F -> Ck(M,N) is continuous wbere Ck(M,N) is given tbe compact open topology
on the first k derivatives. (If M is compact, then the sup topology on tbe first k
derivatives can be used. Hirsch calls the compact open topology the wealc topology.)
Tben T(p, K, V) = {J E :F : p(J) thK V} is open in :F.
SKETCH OF THE PROOF. The idea is that Tk(K, V) = {J E Ck(M, N) : f thK V} is
open in Ck(M,N) and p is continuous. Therefore p-t(Tk(K, V)) = T(p,K, V) is open.
See Hirsch (1976) for details. 0
REMARK 2.4. See Abraham and Robbin (1967) or Hirsch (1976) for a more detailed
discussion of transversality theorems and their applications.

10.3 Proof of the Kupka-Smale Theorem


In the proof of the theorem, we need to consider not only whether a periodiC point
are hyperbolic but also whether it satisfies a condition connected with a transversality
condition (or the Implicit Function Theorem). A periodic point p of I of period n is
called elementary provided 1 is not an eigenvalue of DI;. Thus a hyperbolic periodic
point is elementar), but not all elementary periodic points are hyperbolic. The condition
of being elementary is the natural first step in the proof below.
As mentioned in the last section on transversality, we consider the maps

defined by
Pn(J)(x) = (x, r(x».
Let tl. be the diagonal in M x M, tl. = {(y,y) : y EM}. Clearly, tl. is a closed
submanifold of M x M. Also Pn(J)(p) E tl. if and only if p is fixed by r, i.e., p has
period n in the weak sense of the word. The following lemma characterizes the condition
that Pn (f) is transverse to tl. as being equivalent to the condition that all fixed points
of In are elementary.
Lemma 3.1. (a) The point p is a fixed point of I if and only if Pt(J)(p) E tl..
(b) Assume pd/)(p) E tl.. Then, pdf) thp tl. if and only ifp is an elementary fixed
point.
(c) Tbe point p is a fixed point of rif and only if Pn(J)(p) E tl..
(d) Assume Pn(J)(p) E tl.. Then, Pn(J) thp tl. if and only jfp is an elementary fixed
point of r.
PROOF. (a) and (c) These statements follow easily from the definitions.
(b) Assume Pt(J)(p) E tl.. For v E TpM,
420 X. GENERIC PROPElITIES

The tangent space to the diagonal is clearly given as follows:

T(p.p)~ = {(W, W) : wE TpM}.

If PI (f) rnp ~, then for any UI, U2 E TpM, we can solve the set of equations

V+W = UI

Dlpv+W=U2

for v. WE TpM. But this means W = UI - V, so we can solve Dlpv - v = U2 - UI for


v. Dip - I is onto, and 1 is not an eigenvalue of Dip.
Conversely, assume 1 is not an eigenvalue of Dip. Then for any UI, U2 E TpM, we
can solve Dlpv - v = U2 - UI for v, and set W = UI - v. Thus we can solve the set of
equations

V+W = UI

Dlpv + W = U2

for v. wE TpM. and PI(f) rnp~.


(d) The case for higher period is proved similarly to part (b). o
We use the following notation for a disk in the proof: for r > 0 and a positive integer
J. DJ(r) is the open ball of radius r centered at 0 in R J , DJ(r) = {x E RJ : Ixl < r}.
The first step in the proof of the theorem is to prove that the set of diffeomorphisms
all of whose fixed points are elementary is open and dense.
Lemma 3.2. The set

is open alld dellSe in Diffk(M).


PROOF. The openness of T(pI'~) follows from Theorem 2.2 since PI is clearly contin-
uous. (The openness of T(PI'~) is also related to Theorem V.6.4.)
We fix a I E Diffk(M) for the rest of the proof. To prove that I is in the closure of
T(pI.~) in Diffk(M). we apply the Parametric Transversality Theorem 2.3 with a finite
dimensional subspace of perturbations. We construct a map ( : DLm(r) ---> Diffk(M)
and apply Theorem 2.3 to PI 0 (. We have assumed that M is compact. Clearly,
~ c M x At is a closed submanifold. The differentiability required to apply Theorem
2.3 is
k > dim(M) - dim(M x M) + dim(~) = m - 2m + m = 0,
or k ~ 1. Finally, we must construct a large enough space of perturbations of the given
I so that (PI 0 ()eu is transverse to ~.
Fix IE Diffk(M). To construct the perturbations, we use local coordinates (although
it is possible to use more global constructions). Cover M by a finite number of open
coordinate charts Uj , {'Pi: V; C Rm - Uj C M}f=l' with compact subsets K j C Uj
which cover M, U:=I Ki = M. Let Vj be open sets in M with Kj.1 c Vj C c1(Vj) c Uj.
Let Kj.o = {x E K; : I(x) ~ Vi}, and K j • l = c1(K, \Kj •o). Thus, each Kj is union of two
compact subsets, Kj = Ki.OuKj.t. such that I(K,.o)nKj = 0 and !(K•. d C c1(V,) CUi'
With this decomposition, there can not possibly be any fixed points in any of the sets
K •. o: PI (f)(K,.o) n ~ = 0 so PI(f) rnK •.• ~. On the other hand for each i, we need
to construct a finite dimensional set of perturbations of I. (, : Dm(r;) - Diffk(M), so
10.3 PROOF OF THE KUPKA-SMALE THEOREM 421

that (PI 0(;)"" is transverse to ~ along to} x K i•1 C Rm x M. The transversality means
that this finite dimensional subset of diffeomorphisms, {(i(Dm(ri))}, is large enough to
perturb I so that it makes all the fixed points in K i•1 elementary.
We proceed to construct the perturbations along K i•l . Let VI be open sets in M
with Ki,l c V; c cl(V;) C Vi' Let Pi : M ..... R be a Coo bump function with PilVI ;: 1
and cl( {x : Pi(X) "I O}) C Vi' The set cl({x : Pi(X) "I O}) is called the support 01 P and
is denoted by supp({3). For ri > 0, define (i, (i : Dm(ri) ..... Diff"(M) by
forx¢Vi
(;(v)(x) = { :i(rpjl(X) + Pi(X)V) for x E Vi, and
(;(v)(x) = (i(V) o/(x).

Because the set of diffeomorphisms is open, for small enough ri > 0, the image of Dm(ri)
by ( is in Diff"(M). Clearly, (PI 0 (i)e" : Dm(ri) x M ..... M x Mise" for k ~ 1 88
required. For p E Ki,l n (pd/»-l(~), p = f(p) E Ki,l C VI, Pi 0/(p) = 1, and

(PI 0 (i)'''(V, p) = (p, rpi(rpj I(p) + v)).


Differentiating with respect to v for such a p,

D(pi o (i),8,p)(v,Op) = (OP,D(rpi)op;-l(p)v) and


D(pi 0 (i)i8,p)(R m x {Op}) = top} x TpM.

As noted above,
T(p.p)~ = {(w, w) : WE TpM}.
Together top} x TpM and {(w, w) : wE TpM} span T(p.p)(M x M) = TpM x TpM.
Thus,

Before applying Theorem 2.3, we combine all the (i into a single map to get transver-
sality along all of M in one step. Let
P = Dm(rl) x ... x Dm(r/)

and ( : p ..... Diff"(M) be given by

«VI. ... , VI) = (I(VJ) 0·" 0 (/(V/) 0 I.

Then

D(pi o()io.p)(O, ... ,O,Vi'O, ... ,O, ... ,O,Op) = D(pi o (i)(O,p)(Vi,Op),

and (PI ooe" is transverse to ~ along to} x U:=l K;.I. As mentioned above, PI(f) is
transverse to ~ along U:=I K i •o, so (PI 0 ()et! is transverse to ~ along to} x M. By
openness of transverse intersection, there is some open neighborhood P' C P of 0 such
that (PI 0 ()e" is transverse to ~ along P' x M. By Theorem 2.3,

T(pl o(,~);: {t E P': PI o«t) ttI~}

is dense in P'. Because ( is continuous, ! = «0) is in the closure of «T(pi 0 (,~» in


Diffk(M). Since
422 X. GENERIC PROPERI'IES

I is in the closure of T(PI, t.) in DiffA:(M). This completes the proof of the lemma. 0
PROOF THAT 'HI IS OPEN AND DENSE. For I E T(PI,t.), Lemma 3.1 shows that if
PI (f)(p) E t. then p is a elementary fixed point. Because the elementary fixed points are
isolated, each I E T(plo t.) has only finitely many fixed points. (The set (PI(f))-I(t.) is
a manifold of the same codimension as t. which is n, so it is a 0 dimensional manifold,
i.e., isolated points.) By Lemma 1.3, each I E T(PI, t.) can be approximated by R
diffeomorphism 9 for which each fixed point of I becomes a hyperbolic fixed point of g.
In fact, there are an open neighborhood U of all the fixed points of I and a perturbation
9 of I such that Per(l,gIU) = Per(I,fIU). Also the nonexistence of fixed points on the
compact set M \ U is an open condition, i.e.,

{g E DiffA:(M) : PI (g)(M \ U) n t. = 0}

is open in DiffA:(M). Combining the consideration on and off U, the 9 which approxi-
mates I can be taken so that Per( 1, g) = Per( 1, f) and each fixed point of 9 is hyperbolic,
so 9 E 'HI' This proves that 'HI is dense in DiffA:(M).
By the openness of T(PI, t.), and the openness of the hyperbolicity of one particular
fixed point, it follows that 'HI is open. 0
To prove that 'Hn is dense in DiffA:(M), we can not just take pn : DiffA:(M) --+
CA:(M, M x M) and proceed to construct perturbations for an arbitrary I E DiffA:(M).
The difficulty is that if p is a fixed point for I for which -1 is an eigenvalue with
eigenvector Vi for DIp, then for any perturbations of I, (: DJ(r) --+ Diffk(M),

D({J2 0 ()(O,p)(v, 0)

does not span the direction corresponding (0, vi). This difficulty is overlooked by many
books giving the proof. The correct proof proceeds by induction on n and considers
Pn : 'Hn-I --+ CA:(M,M x M), i.e., only constructs perturbations for I E 'Hn- I. The
idea is that for I E 'H n - I , if P has period less than n and r(p) = p (so p has period
n/j for some integer j), then D(r)p is hyperbolic so we do not need to construct any
perturbations.
Lemma 3.3. Assume 'Hn-I is dense and open in Diffk(M). Then,

is open and dense in Diffk(M).


PROOF. By the openness of transverse intersection, T(Pn, t.) n 'Hn- I is open in 'Hn-I,
and so in DiffA:(M).
To prove the density in DiffA:(M), it is enough to prove density in 'Hn-I, so we fix IE
'H n - I . Let {'P; : Vi --+ U; C M}f=1 be open coordinate charts which cover M as before.
Let U be an open neighborhood of Per(n - 1, f) in M and N be an open neighborhood
of I in 'Hn-I such that for 9 E N, Per(n, g) n c1(U) = Per(n - l,g) C U and all the
periodic points in Per(n - l,g) are hyperbolic. If p E c1(U) and Pn(f)(p) E t., then p
has least period less than n, p is a hyperbolic periodic point, p is a hyperbolic fixed
point of r, and Pn(f) rhp t.. Therefore Pn(f) rhcl(U) t..
Let K; C U; \ U be compact subsets such that uf=1 K; = M \ U. (Note that K; = 0
is allowed.) Also, K; n Per(n - 1, f) c K; n U = 0. To proceed as in the case for n = 1,
we need to divide K; into subsets; however for n > 1, we may need more than two
subsets because the orbit of a point x E K; can pass through U; for some intermediate
10.3 PROOF OF THE KUPKA-SMALE THEOREM 423

iterate, and we must construct a perturbation 9 of I such that the orbit of a point in
Ki goes only once through the set {x : g(x) :1= I(x)}. In particular, each Ki can be
written as the union of a finite number of compact subsets, Ki = Ki,j, such thatuf,:,o
(i) f"(Ki,o) n Ki = 0, (ii) f"(Ki,j) C Ui for 1 ~ j ~ Li , and (iii) ft(K"j) n Ki,j = 0
for 0 < l < nand 1 ~ j ~ L i . (Note that Li = 0 is allowed.) For 1 ~ j ~ Li and
1 ~ i ~ I, let U;,j and U;:j be open sets of M such that (i) K"j C U:,j C c1(U:,j) C
U::j C cl(U::j ) CUi, (ii) f"(U:) CUi, and (iii) It(U:) n U::j = 0 for 0 < l < n. For
1 ~ j ~ Li and 1 ~ i ~ I, let {3i,j : M ~ R be a Coo bump function with (3i,jlUL == 1
and SUPP({3i,j) c U::j • For ri,j > 0, define (i,j'(i,j : Dm(ri,j) ~ DiIf"(M) by

, { x for x '" U::


j
(i,j(V)(X) = CPi (cpjl (x) + (3i,j(X)V) for x E U::
j , and
(i,j(V)(X) = (.,j(v) 0 f(x).

For ri,j > 0 small enough, the image (i,j(Dm(ri,j)) is in DiIf"(M). Fix p E Ki,j with
Pn(J)(p) E A. Let Iv = (i,j(V), For 1 ~ l < n, I!(p) f/. U::j so I!(p) = It(p). The nih
iterate,

I::(p) = CPi(cpjl 0 r(p) + (3i,j 0 r(p)v)


= cpi(cpjl(p) + v),
or

As for the case n = I,


(Pn 0 (i,j )e" rIl{O} x K . A.
',)

Let L = 2::=1 Lit Pi = Dm(ri,d x ... x Dm(ri,L.), and P = PI X... XPI c n. Lm .


Define ( : P ~ Diff"(M) by

«vl.l, .. · ,V/,L,) = (1,I(VI,d 0'" 0 (/,L, (v I,L,) 0 f.


As for the case n = I, (Pn 00·" is transverse to A along {O} x U:=I UJ':'I Ki,j' Also,
Pn(J) is transverse to A along cl(U) U U:=I Ki,o. Because
I L.
M = cl(U) U U (Ki,O U U K"j)'
tal j=l

(Pn o()"" is transverse to A along {O} x M. By the openness of transverse intersection,


there is an open neighborhood pi C P of 0 in RLm such that (Pn 0 ()"" is transverse to
A along pi x M. By Theorem 2.3,

T(Pn oC,A) = {t E P': Pno«t) rIl A}

is dense in P'. Because ( is continuous, f is in the closure of «T(Pn 0(, A)) in Diff"(M).
Since

«T(Pn 0 (,A)) c T(Pn, A) n 'H n - I


== {g E 'Hn-I : Pn(g) rIl A},
424 X. GENERIC PROPEIUIES

f is in the closure of T(Pn,~) n 1tn -1 in Dilfk(M). This completes the proof of the
lemma. 0
PROOF THAT 1tn IS OPEN AND DENSE. For f E T(Pn, ~), each point of least period
n can be made hyperbolic using Lemma 1.3, so 1tn is dense in T(Pn, ~). The set of
dilfeomorphisms for which a particular periodic point is hyperbolic is open, so 1tn is
open in Dilfk(M). 0
1t IS A RESIDUAL SUBSET. Since 1t = n::"=I1tn is the countable intersection of open
dense subsets of Dilfk(M), it is residual. 0
KS(M) IS A RESIDUAL SUBSET. We define a countable collection of sets of dilfeomor-
ph isms K:(n, R,), such that (i) the intersection of all the K(n, R,) is equal to KS(M)
and (ii) we can prove each set K:(n, R;) is open and dense in Dilfk(M). For a hyperbolic
fixed point P E Perh(f), let W~(p, f) be the points q in W'(p, f) for which there is
a curve {-r(t) : 0 ::; t ::; I} such that (i) the length of "( is less than or equal to R,
(ii) "((0) = q and "((1) = p, and (iii) "((t) E W'(p, f) for 0 ::; t ::; 1. Similarly, define
W}l(p, f). We say that W}l(PI,f) is transverse to W~(P2,f) as a shorter way of saying
that W"(PI,f) is transverse to W'(p2,f) at points of W}l(PI,f) n Wk(P2' f). For a
positive integer nand R > 0, let

K:(71,R) = {f E 1t n : Wil(PI,f) is transverse to WR(P2,f)


for all PI, P2 E Per( 71, I)}·

Then
KS(M) = n n
ISn<oo ISR<oo
K:(n,R).

If each K(71,R) is dense and open in 1tn then KS(M) is residual in Dilfk(M). (It is
possible to take only a countable number of R > 0 which go to infinity.)
The fact that K(n, R) is open is not difficult. For I E 1tn , there are only finitely
many points in Per(n, f). By the arguments which prove the openness of 1tn , there is a
neighborhood N of I such that for 9 E N, (i) the cardinality of Per(n, g) is the same as
the cardinality of Per(n,f) and (ii) each periodic point of gin Per(n,g) is hyperbolic,
so N c 1t n . For 9 EN, let {p,(g)}[=1 be the periodic points in Per(n,g). The stable
and unstable maps depend continuously in the C k topology on compact subsets. For
each 1 ::; i ::; I and 9 E N, there are maps O':(g) : R" -+ M such that (i) the image
of O':(g) is W'(p,(g),g) and (ii) the image of the close ball jj"(R) is WR(p,(g),g),
a:(g)(jj"(R)) = W~(Pi(g),g). Similarly, O'r(g) : R'" -+ M gives W"(Pi(g),g). The
continuity of the stable and unstable manifolds can be expressed by saying that the
maps 0': :N -+ C t (R", M) and O'r : N -+ C k (Ru" M) are continuous if C k (R", M)
and C k (R"', M) are given the compact open topology on the first k derivatives. For i
and j fixed, we can combine these into a map ai,j : N -+ Ck(R" x a";, M x M) by

a',j(g)(x,y) = (aj(g)(x),aj'(g)(y)).

The reader can check that W~(P.(g),g) is transverse to W1Hpj(g),g) if and only if
O",j(g) is transverse to ~ along jj"'(R) x jju;(R), The openness of the transverse
intersection of W~(P.(g),g) and W1Hpj(g),g) follows from Theorem 2.4. By using all
pairs 1 ::; i, j $ I, it follows that K(n, R) is open in N, and so in 1tn and Dilfk(M).
To complete the proof, we only need to prove that K:(n, R) is dense at functions
f E 1t n . We prove this density by applying Theorem 2.3 to a.,)' If we can show for
each pair (i,j), that the set of 9 for which 0"'; (g) is transverse to ~ is dense in N, then
10.4 NECESSARY CONDITIONS FOR STRUCTURAL STABILITY 425

clearly it follows that K:(n,R) is dense in N. We show that we can make W~(Pi(g),g)
is transverse to W;l(Pj(g),g) at a single point q E W~(Pi(g),g) n WJHpj(g),g). The
transversality at all points can then be obtained by combining these perturbations in a
way similar to that used to prove 'HI and 'Hn are dense. (This argument uses the fact
that D8'(R) x D"j(R) is compact.)
Assume
ai,j(g)(Xo,YO) = (q,q) E tl..
Since the stable and unstable manifolds are transverse at the periodic points, q '" Pi, Pj'
Let <p : V -+ U be a coordinate chart at q with Pi, Pj ¢ U. Then w(q) = O(Pi) and
a(q) = O(Pj), so for a small enough neighborhood UtI C U of q, gn(q) ¢ UtI for n '" O.
Let U' c UtI be a smaller neighborhood of q, and {3 : M -+ R be a bump function with
(3IU' == 1 and supp({3) C U". Finally, define (i,;, (i,j : Dm(ri,j) -+ Diffk(M) by
for x rfc UtI
(',j(V)(X) = { ;(<p-I(X) + (3(x)v) for x E U",
(i,j(V)(X) = (i,j(V) 0 g(x).
The two periodic points Pi, Pj rfc utI so they remain hyperbolic periodic points. Also,
g(q) ¢ U", so (i,j(V)(q) = g(q). Similarly, gn(q) rfc UtI for n > 0, so (i,j(V)R(q) =
gn(q), q E W"(Pi(i,j(V», (i,j(V», and

at 0 (i,;(V)(Xo) = q
is still the same point in M. For the unstable manifold, a similar argument shows that
g-I(q) E W"(Pj(i,;(V»,(i,j(V». However

(i,j(V)(g-l(q» = (i,j(V)(q)
= <p(<p-I(q) + v),
so
a'J 0 (i,j(V)(Yo) = <p(<p-I(q) + v).
Therefore the perturbation (;,j(v) of 9 changes the unstable manifold at q but not
the stable manifold, Le., the perturbation can move W"(Pj, (;,j(v» off W8(Pi,<i,;(V».
Taking the derivative with respect to v,
D(ai,j o (i,j)(O,Xo,yo)(Rm x {O,} x {O}) = {Oq} x TqM.

Because this image and T(q,q)tl. span TqM x TqM, (ai,; 0 (i,j)'" is transverse to tl. at
(0, Xo,Yo), A finite number ofthe sets ofthe type of U' cover W~(pi,g)nwJHpj,g), so
perturbations in these various U' can be combined to define a <i,; : DJ(r) -+ Diffk(M)
such that (a 0 (i,j )ev is transverse to tl. along {O} x DO, (R) x DUj (R). By Theorem 2.3,
9 is in the closure of K:(n, R), This completes the proof of Theorem 1.1. 0

10.4 Necessary Conditions for Structural Stability


Chapter IX gives sufficient conditions for R-stability and structural stability, In this
section, we consider the necessity of some of these conditions, After several people
made contributions to this question Mane (1978b, 1987b) and Liao (1980) eventually
proved that if f is a C l 'R-stable diffeomorphism on a compact manifold, then f has
a hyperbolic structure on 'R(f), The proof of this result is very difficult and beyond
the score of this book, but we discuss some of the easier aspects. We start with the
hyperbolicity of fixed points.
426 X. GENERIC PROPERTIES

Theorem 4.1. Let M be a compact manifold and f : M ..... M a diffeomorphism.


Assume f is CI 'R.-stable. Then all the periodic points of f are hyperbolic.
REMARK 4.1. This theorem was proved by Franks (1971) and Pliss (1971, 1972). The
papers by Pliss also obtained some further results related to the theorem of Mane and
Liao.
REMARK 4.2. Note that if f is structurally stable then it satisfies the assumptions of
the theorem.

REMARK 4.3. We prove this theorem assuming that f is CI 'R.-stable but it is true if
f is cr structurally stable for any r ~ 1. See Robinson (1973).

The first step in the proof of Theorem 4.1 is to show that a diffeomorphism can
be C I approximated by a new diffeomorphism which is equal to its linear part in a
neighborhood of a periodic point.
Lemma 4.2. Assume p is a periodic point for f of period n. Let <.p : V ..... U be a
coordinate chart at p with <.p(0) = p and A = D(<.p-Io/"o<.p)O. LetN bea neighborhood
of f in the C I topology.
(a) Tllen there are r > 0 and 9 E N such that (i) p has period n for 9 and (ii)
<.p-I og" o <.p(x) = Ax forxE Vn{xE R": Ixl:S r}.
(b) If liB - All is small enough, then there are r > 0 and 9 E N such that (i) p has
period n for 9 and (ii) <.p-I 0 g" 0 <.p(x) = Bx for x E V n {x E R" : Ixl :S r}.
REMARK 4.4. Lemma 4.2 is not true in the C 2 topology.

PROOF. (a) We only consider the case of a fixed point. The reader can supply the
changes for higher period. We also leave the proof of part (b) to the reader.
Let (J : R" ..... R be a Coo bump function with (i) 0 :S (J(x) :S 1 for all x, (ii) (J(x) = 1
for Ixl :S 1, and (iii) .B(x) = 0 for Ixl ~ 2. For r > 0, let .Br : lR" -+ lR be defined by
(Jr(x) = .B(x/r). Thus.Br is a bump function which is equal to 1 for Ixl <" , and is equal
to 0 for Ixl ~ 2r. The estimates on .Br and its derivative depend on r ill the following
fashion: SUp{.Br(X)} = sup{.B(x)} = 1, and

where Co = sup{IID«(J)xll}·
Using the bump function (Jr, we can define a perturbation gr. First we write f in
terms of its linear and nonlinear terms: let

<.p-I 0 f o <.p(x) = Ax + j(x)

where j(O) = 0 and D(j)o = 0, Thus IID(f)xll = o(lxIO) and Ij(x)1 = o(lxl), i.e.,
IID(ilxll and Ij(x)I/lxl both go to zero as Ixl goes to zero. We only consider r > 0 for
which SUpp«(Jr) c {x: Ixl :S 2r} C V. Let gr be equal to f outside of U, and

<.p-l 0 gr 0 <.p(x) = (Jr(x)Ax + (1 - (Jr(X»<.p-l 0 f 0 <.p(x)


= Ax + (1 - .Br(X))j(X).

for x E V. Note that <.p-l 0 gr 0 <.p(x) = <.p-l 0 f 0 <.p(x) for Ixl ~ 2r. On the other hand
for Ixl :S r, (Jr(x) = 1 and <.p-l 0 gr 0 <.p(x) = Ax.
\0.4 NECESSARY CONDITIONS FOR STRUCTURAL STABILITY 427

To check that 9r is near / for small r, we need to calculate the derivative of 9r:
D(cp-I 09r 0 cp)x = A + (1 - IJr(x»Dix + i(x)D(IJr)x
= D(cp-I 0/0 cp)x - IJr(x)Dix + i(x)D(IJr)x.
For Ixl ~ 2r, D(cp-I 0 9r 0 cp)x = D(cp-I 0/ ° cp)x so we only need to consider x with
Ixl :5 2r. For Ixl :5 2r,
IID(cp-' 09r 0 cp)x-D(cp-I 0/ ° cp)xll
:5 IJr(x)IID(i>xll + li(x)l· IID(IJr)xll
= o(ro ) + o(r)(CO )
r
= o(ro).

In this last calculation, we used the estimates given above for i(x) and Dix and that
sup{IID(IJr)xll} = Coiro From the estimate, there derivative of 9r approaches that of /
as r goes to O. Since 9r(P) = P = /(p), the Mean Value Theorem proves that the CO
distance from 9r to / goes to zero also. Therefore for r > 0 small enough, 9r E N, the
CI neighborhood of /. 0
PROOF OF THEOREM 4.1. Since / is 'R-stable, there is an open neighborhood N of
/ such that any 9 E N is 'R.-conjugate to / and so 9 is also 'R.-stable itself. Assume
that p is a nonhyperbolic periodic point of period n. By Lemma 4.2(a), there is a
91 E N such that in local coordinates 9f is linear, cp-I o9f 0 cp(x) = Ax for Ixl < r.
Since A is nonhyperbolic, it can be approximated by another matrix B that has an
eigenvalue'\' which is a j-th root of unity with eigenvector v. By Lemma 4.2(b), 91 can
be approximated by 92 EN such that in local coordinates '1'-1 0 9~ ° V'(x) = Bx for
Ixl < r. Then
'1'-1 o~n 0V'(av) = Biav = ,\,iav = av,
for lavl < r, so ~n has a curve of fixed points which are nonisolated, i.e., 92 has
nonisolated periodic points of a given period. By the Kupka-Smale Theorem, there
is a 93 E N with only hyperbolic periodic points, so the periodic points of any given
period are isolated. But the two diffeomorphisms 93 and 92 can not be 'R.-conjugate.
This contradicts the assumptions on /, and so all the periodic points of / must be
hyperbolic. 0
We end the section by stating a result about the transversaiity of stable and unstable
manifolds. We give the result in two dimensions because part (a) is easier to state in
this case. Parts (b) and (c) are true in any dimensions. We leave the proof of this
theorem to the exercises. See Exercise 10.11.
Theorem 4.3. Assume M is a compact surface (two dimens.ional manifold) and /,9 :
M -+ M are CI diffeomorprusms.
(a) Assume / is a Kupka-Smale diffeomorphism. Assume 9 has two periodic points
p and q such that W"(P,9) is tangent to W'(q,9) at z and W"(p, 9) is on one side
ofW'(q,9) locally near z. (The two manifolds are not topologically transverse at z.)
Then / and 9 are not conjugate.
(b) If / is a structurally stable then / is Kupka-Smale.
(c) Assume / is structurally stable and that 'R(f) has a hyperbolic structure. Then
/ satisfies the transversality condition.
REMARK 4.5. As stated in the opening paragraph of this section, Mane (l978b, 1987b)
and Liao (1980) proved a much stronger theorem. If / is a CI 'R.-stable diffeomorphism
428 X. GENERIC PROPERrIES

on a compact manifold, then f has a hyperbolic structure on 'R(f). Hayashi (1992) has
some improvements in the end of the proof. Also see Aoki (1991, 1992).
If f is C l structurally stable then it follows that f also satisfies the transversality
condition. By Theorem 4.3 (or Exercise 10.11(c)), it then follows that 1 also satisfies
the transversality condition.
The proof of the result of Mane and Liao for diffeomorphisms implies the same result
for flows without fixed points. Hu (1994) proved a comparable theorem for flows on
three dimensional compact manifolds even when fixed points are allowed. The result of
Mane and Liao for flows with fixed points on manifolds of dimension greater than 3 is
still unproven. .

10.5 Nondensity of Structural Stability


As we have remarked elsewhere, the set of structurally stable diffeomorphisms (or
O-stable diffeomorphisms) are not dense in Diffl(M) (unless M = 8 1 ). There were
a sequence of papers dealing with this problem including Smale (1966), Abraham and
Smale (1970), Williams (1970a), Newhouse (1970a), and Simon (1972). Later exam-
ples contradicted further conjectures of genericity of various forms of stability. Below
we describe the the example in Williams (1970a). This example is simple once the
the DA-diffeomorphism is understood. See Section 7.7 for the description of the DA-
diffeomorphism.
Theorem 5.1. There is an open set of dilfeomorphisms N c Diffl(1I"2) such that no
9E N is structurally stable.
PROOF. We start with the DA-diffeomorphism Ii on 11"2 with a fixed point source Po
and a hyperbolic invariant set A. The DA-diffeomorphism Ii can be constructed from
a hyperbolic toral automorphism 9 with Ii I(T2 \ U) = 91(11"2 \ U) where U c 11"2 is an
open set. As we saw in Section 7.7,

A= n00

n=O
If(1I"2 \ U).

We now modify Ii to form a 12 with 11I(T2 \ U) = hl(T2 \ U), so 12 still has A as


a hyperbolic invariant set. Inside U, we replace the single fixed point source with two
fixed point sources ql and q2 and a fixed point saddle Po. The construction can be
made so that

See Figure 5.1.

FIGURE 5.1
10.5 NONDENSITY OF STRUCTURAL STABILITY 429

FIGURE 5.2

Finally. 12 can be modified a third time to obtain a diffeomorphism h Let

be a fundamental domain with h(a) = b. and let c = h(b). Let ,8(x) be a bump
function such that (i) it is zero away from [b. e] and on the part of W"(Po. h) from Po
to b. (ii) it equals one in a neighborhood of the midpoint of [b. e]. and (iii) the forward
orbit by 12 of supp(,8) does not intersect supp(,8).
00

supp(,8) n U 12'(supp(,8» = 0.
n=l

Let v be a vector transverse to W"(Po.h). For small £ > O. let

h(x) = { h(x) for ,8 0 h(x) = 0


h(x) + £,8 0 h(x)v for,8 0 h(x) '" O.

(Here we write the sum as if it makes sense on T2. This should really be written in
the coordinates of the covering space R2.) Because of the support of ,8. the part of
the unstable manifold W"(po.h) from Po to b remains part of W"(Po./s); in par-
ticular. [a.b] C W"(Po.h). so also h([a.b]) C W"(Po.h). However h([a,b]) is no
longer equal to the line segment [b. e]. but it bends outward from the stable manifold
W"(Pl.h). See Figure 5.2. Because the perturbation from h to 13 does not affect
the forward orbits of points in supp(.8). these points stay on the stable manifolds of
the same points in A. and 13 has a tangency between W"(Po./3) and W"(z. h) for
some z(f3) E A. For a small enough Cl neighborhood N of h. every lEN has (i)
a hyperbolic invariant set A(f). (ii) a saddle fixed point Po(f). (iii) two fixed point
sources ql(f) and q2(f). (iv) R(f) = A(f) U {Po(f). qdf). q2(f)}. and (v) a tangency
between W"(Po(f). f) and W"(z(f). f) for some z(f) E A(f). In the Cl topology (as
opposed to the C2 topology). I may have more than one tangency. However. if we use
the extremal tangency. then the stable manifold W"(z(f). f) is well defined. (Note that
the point z(f) is not itself well defined.) This neighborhood N is the one indicated in
the statement of the theorem.
By means of a bump function near the point of tangency. the stable manifold on
which the tangency occurs, W"(z(f).f). can be moved within W"(A,f). The periodic
points of I in A(f) are dense. so the union of stable manifolds of periodic points,

U W"(x.n
xEPer(f)
430 x. GENERIC PROPERTIES

is dense in W"(A(f), f). Therefore the set N, of fEN for which W"(z(f), f) contains
a periodic point in A is dense in N. By the Kupka-Smale Theorem, the set N2 = {I' E
N: f is Kupka-Smale} is dense in N. For fEN, and f' E N 2, f is not conjugate
to f' by Theorem 4.3, since (i) f has a tangency of stable and unstable manifolds of
periodic points where locally the stable manifold is on one side of the unstable manifold
nnd (il) f' Is KupkR-Smale. (The proof of Theorem 4.3 18 an exercise. See Exercise
10.11.) Since (i) N, and N2 are both dense in Nand (ii) no fEN, is conjugate to any
f' E N 2 , no diffeomorphism in N is structurally stable. 0
REMARK 5.1. The idea of the counterexamples to O-stability is to make the tangency
between the stable and unstable manifolds take place at a point which is nonwandering.
Abraham and Smale (1970) do this for an example on S2 x y2. Simon (1972) does this
for a three dimensional manifold. Newhouse (1970a) does this on a two dimensional
manifold but nE'eds to use the C 2 topology.

10.6 Exercises
'I'cansversali ty
10.1. Let M and N be compact submanifolds of Rk with dim(M) + dim(N) < k.
Prove that {a E Rk : (M + a) n N = 0} is dense and open.
10.2. Let M be a submanifold of JRn. Fix k < n and let B be the set of all n x k
matrices A such that A has rank k. Note that for A E B, A(Rk) is a k-plane in Rn. The
set B is an open subset of all n x k matrices, so B is a manifold. (You do not need to
prove this fact.) Show that for a generic set of A E B, the k-plane A(JRk ) is transverse
toM.
Kupka-Smale Theorem
10.3. Let F be a topological space, M a compact metric space, and C be the collection
of all compact subsets of M with the Hausdorff metric. Assume r n : F -+ C is lower
semi-continuous for all n ~ 1. Define r(f) = d(Un r n(f). Prove that r : F -+ C is
lower semi-continuous.
10.4. Let MS(S') be the set of Morse-Smale diffeomorphisms on S'.
(a) Prove that MS(S') is open and dense in Diff'(S').
(b) Prove that if f E MS(S') then there is a positive integers nand k such that f
has j periodic sinks of period nand j periodic sources of period n and no other
periodic points.
10.5. Assume M is a compact manifold and L : M -+ JR is a C 2 Morse function, i.e.,
all the critical points of L are nondegenerate (or V(L) has only hyperbolic fixed points).
Assume X be a C' vector field such that L is constant along the trajectories of X (e.g.
X could be the Hamiltonian vector field for L).
(a) Prove that X can be C' approximated by a vector field Y for which n(Y) is a
finite set of hyperbolic fixed points.
(b) Prove that X can be C' approximated by a vector field Y that is Morse-Smale
and has no periodic orbits.
10.6. Assume M is a compact manifold. Let Xk(M) be the set of C le vector fields on
M. Fix k ~ 1. Let 1i~ = {X E Xk(M): all the fixed points of X are hyperbolic }.
Let Z = fOp : p EM} be the zero section of TM.
(a) Let X E Xk(M). Prove that X : M -+ TM is transverse to Z if and only if each
fixed point of X does not have 0 as an eigenvalue.
(b) Prove that Tk(M, Z) = {X E Xk(M) : X rtl Z} is dense and open in Xk(M).
(c) Prove that 1i~ is dense and open in Xk(M).
10.6 EXERCISES 431

10.7. Let KS(S2) be the set of Kupka-Smale vector fields on the two sphere. Prove
that KS(S2) is open in Xl(S2).
10.8. Prove that the suspension of a Kupka-Smale diffeomorphism is a Kupka-Smale
flow.
10.9. Prove that the set of Kupka-Smale diffeomorphisms is not open in Diff1(M).
Hint: Given an example with nu) '" cl(PerU» for which 1 is Kupka-Smale.
Necessary Conditions for Structural Stability
10.10. Let M be a compact manifold and X E Xl(M) a structurally stable vector
field on M. Prove that all the fixed points of X are hyperbolic.
10.11. (Proof of Theorem 4.3.) Assume M is a compact surface and I,g : M -+ M
are C 1 diffeomorphisms.
(a) Assume 1 is a Kupka-Smale diffeomorphism and 9 has two periodic points p and q
such that W"(p, g) has a point of nontransverse intersection with W'(q,g) where
W"(p, g) locally lies on one side of W'(q, g). Prove that 1 and 9 are not conjugate.
(b) Prove that if 1 is a structurally stable then 1 is Kupka-Smale.
(c) Assume 1 is structurally stable and that 'R(f) has a hyperbolic structure. Prove
that 1 satisfies the transversality condition.
CHAPTER XI
Smoothness of Stable Manifolds
and Applications

In Chapters V and VIII, we proved various results about stable manifolds for a fixed
point and a hyperbolic invariant set. In this chapter we present some general results
about the existence and differentiability of an invariant section for a fiber contraction.
These results generalize the parameterized version of the Contraction Mapping Theorem
which is given in Exercise 5.4. Later in the chapter, we give applications of this theory
to prove (i) the differentiability of the Center Manifold, (ii) the differentiability of the
hyperbolic splitting for an Anosov diffeomorphism on y2, and (iii) the persistence and
differentiability of a "normally contracting" invariant submanifold.

11.1 Differentiable Invariant Sections for Fiber


Contractions
In this section, we consider maps F : X x Yo ..... X x Yo of the form F(x,y)
(f(x),g(x,y)) where X is a metric space and Yo is a closed ball in a Banach space Y.
Such a map is called a bundle map, X is called the base space, and Yo or Y is called the
fiber. In applications, the set of points in the base space that we want to consider is not
always invariant. A subset Xo C X is called overflowing for! provided !(Xo) :::> Xo.
The bundle map F(·,·) = (f(-), g(., .)) is called a fiber controction over Xo provided
Xo is overflowing for f and there is a ,.. with 0 < ,.. < 1 such that g(x, .) : Yo ..... Yo is
Lipschitz and Lip(g(x,·» ::; ,.. for each x E X o, i.e.,

for all x E Xo and Yt, Y2 E Yo. An invariant section over Xo of such a map is a function
u· : Xo ..... Yo such that F(x, u·(x» = (f(x),u· o/(x» for all x E Xo n ,-t(Xo), i.e.,
the graph of u· is invariant (or overflowing) by F.
If F is a fiber contraction, then Theorem 1.1 proves that there is a continuous in-
variant section. After proving this result, we give conditions on F which imply that the
invariant section is differentiable. In the case when F is a fiber contraction and I(x) = x
for all x, the first theorem is merely the result about fixed points for a parameterized
contraction mapping. See Exercise 5.4.
The results not only apply to product spaces of the form X x Yo C X x Y but also
to subsets of vector bundles. In the applications several of the spaces are of the form
IE = UXEX{x} x Ex where the space Ex is a vector space that varies with the point x,
and the space X is at least a metric space and often a manifold. With a few assumption
on how Ex varies, such a space E is a vector bundle over X. (In the proof of Theorem
1.2, the induction step uses a bundle map on a vector bundle even if the original space
is a product.) The space X is called the base space and Ex is called the fiber over x.
The map 11' : IE ..... X taking {x} x Ex to x is called the projection. The tangent space of
a manifold is an example of a vector bundle. See Hirsch (1976) for a formal definition
of a vector bundle.

433
434 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

We say that a metric d on a vector bundle E is an admissible metric provided


(1) it induces a norm on each fiber Ex for x E X (so we write Ivx -wxl for d(vx, w x»,
(2) there is a complementary bundle E' over X such that (i) the sum of the two
bundles
E$E' =U{x} x Ex x ~
x
is isomorphic to a product bundle X x Y, where Y is a Banach space, (ii) the
product metric on X x Y induces the metric d on E, and (iii) the projection
{x} x Y onto Ex along E~ is of norm 1.
In a vector bundle with an admissible metric, the disk bundle 01 radius R is the subset

E(R) = U {x} x Ex(R)


xEX

where Ex(R) is the closed ball (or disk) of radius R in Ex.


A map F: IE(R) -+ E(R) is called a bundle map provided 11" 0 F( {xl x Ex(R» takes
on a unique value I(x). Therefore a bundle map F induces a map 1 : X -+ X such that
11" 0 F = 1011". A bundle map on a vector space IE with an admissible metric induces
a map F : X x Y(R) -+ X x Y(R) on the associated product space by means of the
composition of (i) the isomorphism from X x Y(R) to (E $ E')(R), (ii) the projection
from (E$E')(R) to E(R), (iii) the map F from E(R) to E(R), (iv) the inclusion ofE(R)
in (IE $ E')(R), and finally (v) the isomorphism from (E $ E')(R) to X x Y(R). By
using this construction, the theorems which we state for a map on a product spaces are
valid for a bundle map on a disk bundle. See Hirsch and Pugh (1970) or Shub (1987)
for details.
Now we can state the first result.
Theorem 1.1 (Invariant Section Theorem). Assume X isametricspace, Xo C X,
ll1ld Yo is a closed bounded ball in a Banach space Y. Assume F : Xo x Yo -+ X x Yo
is a continuous fiber contraction over Xo with F(x, y) = (J(x), g(x, y». (Or F is a
bundle map on a disk bundle over Xo ll1ld is a contraction on each fiber.) Assume that
(i) 1 is overflowing on X o, and (ii) IIXo is one to one. Then there is a unique invarill1lt
section over X o, u· : Xo -+ Yo, ll1ld u· is continuous.
PROOF. Consider the spaces of bounded functions and continuous functions given by

g = {u : Xo -+ Yo} and
gO = {u : Xo -+ Yo : u is continuous}.

Note that because Yo is a bounded ball, g is the space of bounded functions. On g and
go put the CO-sup topology,
lIuI - u2110 = sup lUI (x) - u2(x)l.
xEX.

This norm makes both g and gO complete normed linear spaces since Yo is complete
and the sections are bounded. See Dieudonne (1960), 7.1.3 and 7.2.I.
There is an induced map on the sections rF : {i -+ (i such that the graph of rF(U)
is a subset of the image by F of the graph of u. (These two graphs are not necessarily
equal because Xo is overflowing but not necessarily invariant for J.) This map r F is
called the graph translorm 01 F. In fact it is p068ible to give a formula for rF(U). Let
x E Xo. Then I-I(X) E Xo is well defined because 1 is one to one on XOi the value of u
11.1 DIFFERENTIABLE INVARIANT SECTIONS FOR FIBER CONTRACTIONS 435

at I-I(x} is given byuorl(x} and is in the fiber over I-I(x}; the imageofuo/-l(x}
by F must be (X,rF(U)(X}), so

(X,rF(U)(X}) = F(rl(x},u 0 rl(x))


= (x,g(rl(x},uorl(x)))
= (x,gouorl(x})

where u(x} = (x,u(x}). Thus

We claim that r F is Lipschitz on {i with Lip(r F} ~ "', where 0 < '" < 1 is the fiber
contraction constant for F. Let UI, C12 E {i, and &1, U2 : Xo -+ Xo x Yo be induced as
above. Then

IIrF(C1t}-rF(C12}1I0= sup Igo&lorl(x}-gO&20rl(X}1


xEX.
~ sup Igo&I(P}-goU2(P}1
pEX.
~ sup ",luI(p} - u2(p}1
pEX.
= "'lIuI - u2110.

Because r F is a contraction on the complete metric space g, it has a unique fixed point
u*, or F has a unique bounded invariant section over Xo. Because rF preserves the
closed subspace gO, u* E go and the invariant section is continuous. 0
We want conditions whicli imply that the invariant section is differentiable. It is
not enough for F to be a fiber contraction. The following example gives a function
which contracts more along the base space X than along the fibers Y and there is no
differentiable invariant section.
Example 1.1. We take the base space as Sl = {x E IR mod I} and fiber space R.
Let I : Sl -+ Sl be a Coo diffeomorphism with two fixed points, an attracting fixed
°
point at and a repelling fixed point at 1/2. FUrther, assume that I(x} = x/3 for
-1/3 ~ x ~ 1/3. (This is really the form of the function on the covering space R.) To
define g, let (3 : [0, I] -+ [0, I] C R be a Coo bump function with supp({3) = [1/9.1/3]
and (3(2/9) = 1. The map 9 : SI x R -+ R is defined by g(x, y) = (3(x) + y/2. Notice
that I contracts more strongly on the base SI (by a factor of 1/3) than 9 contracts on
the fiber (by a factor of 1/2). Let F = (/,g) be the bundle map on SI x R, which takes
SI x [-3,3] into its interior.
Since F is a fiber contraction, it has a unique continuous section u· , go u* I-I = u· .
°
By the iterative process used in the proof of Theorem 1.1, u·(x} = for 1/3 ~ x ~ 1.
For 1/9 ~ x ~ 1/3, rl(x} E [1/3.1/2] (because 1(1/2) = 1/2 and I is monotone}. so
u· 0 I-I (x) = and°
u·(x} = go u· 0 rl(x} = g(rl(x},O} = (3o rl(x) = 0.

However for 1/33 ~ x ~ 1/32, rl(x) = 3x E (1/9,1/31 and

u·(x) = gou'(3x) = g(lx.O) = 8(3x).


436 XI. SMOOTHNESS OF STABl-E MANIFOLDS AND APPLICATIONS

In particular, (7.(2/33 ) = 1 and (7*(1/3 3 ) = (7.(1/3 2 ) = o. Next for 1/33 +1 ~ x ~


1/32+1, f-I(x) = 3x E [1/33 ,1/3 21 and

(7*(x) =gou*(3x)
= g(3x, p(32x»
1
= 2 P(3 2 x).
In particular, (7.(2/3 4 ) = 1/2 and (7*(1/3 4 ) = (7*(1/3 3 ) = o. By induction on n, for
1/33 +n ~ X ~ 1/32+n with n ~ 1, f-I(x) = 3x E [1/3 2+n , 1/3 1+n l and
(7*(x) = go u*(3x)
1
= g(3x, 2n - 1 t1(3 n+1 x »

= ..!..t1(3n+1 X ).
2n
In particular, (7*(2/33+ n ) = 1/2n and (7*(1/33 +n ) = (7*(1/32+ n ) = O. Since (7*(0) = 0
and (7·(2/33+ n ) = 1/2n , the difference quotient
(7'(2/33+ n ) - (7·(0) 33+n
(2/33+ n ) - 0 = 2n+1

does not converge as n goes to infinity. Therefore, (7* is not differentiable at x = o.


By the calculation of the form of (7* for x '" 0, it is differentiable at all other points.
The reader can check using the Mean Value Theorem that there are points in the
interval [1/33+n, 2/33+ n l at which the derivative is at least 3n +3 /2 n , and this quantity
is unbounded as n goes to infinity. This completes the analysis of this example.
To insure that the invariant section is C l we need to assume that the contraction on
fibers is stronger than the contraction in the base space. The following theorem makes
this precise.
Theorem 1.2 (C r Section Theorem). Assume X is a manifold, Yo is a closed
bounded baIl in a Banach space Y, and Xo C X is a subset for which Xo = cl(int(Xo»
(so differentiation makes sense on Xo). Assume F : Xo x Yo --+ X x Yo is a C l
fiber contraction on Xo with uniformly bounded derivatives on Xo x Yo and F(x,y) =
(f(x), g(x, y». (Or F is a bundle map on a disk bundle over Xo.) Assume that (i) f is
overflowing on X o, and (ii) flXo is a diffeomorphism from Xo to its image I(Xo ) J Xo.
For each x E Xo, let Ax = II(Dlx)-1 II = l/m(Dfx). (Note that Ax measures the great-
est expansion of /-1 at I(x), and 1/ >'x measures the minimum expansion or greatest
contraction of I at x.) Let D2 g(x.y) : Y --+ Y be the derivative with respect to the
variables in Y. For each x E Xo, let

The map F is a fiber contraction so


sup Itx < 1.
xE/-'(Xo)

(a) Assume further that 1 ~ T ~ 00, F is cr, and

sup ItxA~ <1


xE/-'(Xo)
1LJ DIFFERENTIABLE INVARIANT SECTIONS FOR FIBER CONTRACTIONS 437

for 1 :5 j :5 r. Then the unique invariant section is cr.


(b) Assume that 0 :5 r :5 00, a > 0, F is e r +a with uniformly bounded Holder
constant,
sup /£xA~ < 1
xE/-'(Xo)
for 1 :5 j :5 r, and
sup /£xA~+a < 1.
xE/-'(Xo)
Tllen the unique invariant section is e r +a , i.e., the roth derivative is a-Holder.
REMARK 1.1. When r = 0 and a > 0, the theorem is true when Yo is merely a metric
space.
REMARK 1.2. This theorem is essentially in Hirsch and Pugh (l970). For part (b),
they make the comparisons of derivatives at different points:
[ sup ItxA~1 [ sup A~I < 1.
xE/-1(Xo) xE/-'(Xo)
Also see Hirsch, Pugh, and 8hub (1977).
REMARK 1.3. Hurder and Katok (1990) prove the result with the purely pointwise
assumption that
sup /£xA~+Q < 1.
xE/-'(Xo)
We refer to these references for the proof in the case of Holder continuous derivatives.
Also see Exercises 11.2 and 11.3.
REMARK 1.4. Below we let gl be the e l sections in go. The graph transform rF
preserves gl but it is not a contraction on gl with the e l topology, so we can not prove
that a· E gl by that means.
We do not proceed exactly like the proof of the stable manifold, but instead show
that the graph transform preserves Lipschitz sections. After this step, we show that the
derivative takes a family of appropriate cones of vectors into itself. Note the assumption
that sUPxE/-1(Xo) ItxAx < 1 is just the condition needed to verify this invariance of cones.
PROOF FOR r = 1. Let go = {a : Xo -+ Yo : a is continuous} and rF : go -+ go be
as in the proof of Theorem 1.1, rF(a) = go (id, a) 0,-1. By Theorem 1.1, there is a
unique invariant section a· E gO.
We first prove that the invariant section a· is Lipschitz. Let
gl = {a: Xo -+ Yo: a is e l },
gl(L) = {a E gl: sup IIDaxll :5 L},
.,EXo
and gLiP(L) be the closure of gl(L) in go in terms ofthe CO-sup topology. We justify
the notation for gLip(L) with the following lemma.
Lemma 1.3. Every a E gLip(L) is Lipschitz with Lip(a) :5 L,
PROOF. By the Mean Value Theorem, every a E gl(L) is Lipschitz with Lip(a) :5 L,
If a i E gl(L) converges to a E gLip(L) then

la(x) - a(y)1 = j-ao


lim lai(x) - ai(Y)1
:5 .lim Ld(x,y)
,-ao
=Ld(x,y),
438 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

which proves that CT has a Lipschitz constant as claimed. o


REMARK 1.5. The proof of the above lemma actually shows that the set of maps whose
Lipschitz constant is bounded by L is closed in go.
Also gLiP(L) is actually the set of all Lipschitz sections with Lipschitz constant less
than or equal to L, but we do not need this fact. We merely use that gLip(L} is closed
in gO and gl(L) is dense in QLip(L).
Let
p. = sup Kx ~x.
xEf-I(X.)

The derivatives of 9 are uniformly bounded so we can take C > 0 such that

for all (x. y) E Xo x Yo. where D I 9(x.),) : TxX -- Y is the derivative with respect to
the variables in X. Take Lo = C/(l - p.} > 0, i.e., Lo > 0 so that C + p. Lo = Lo. With
these constants we can show that r F preserves gLip(L o}.
Lemma 1.4. With Lo as chosen above, rF take gLip(L o} into itself.
PROOF. For a E gl(Lo) and x E Xo,

IID(rdCT»xll = IID(9 o o- o r' )xll


~ IID I 9"oJ-'(x)D(r l )xll + IID29"of-l(x)DCTf-'(x)D(r')xll
~ C + Kf-I(X) Lo Af-I(x)
~ C + p.Lo
= Lo·

Therefore rdCT) E gl(Lo). Because gl(Lo) is dense in gLiP(Lo) and gLiP(Lo ) is closed,

rF(QLip(Lo}} c gLip(L o}.


o
Take any a O E Ql(Lo). Then for n 2: 1, r~(CTO} E gl(Lo), r~(CTO) converges to CT>
as n goes to infinity, QLip(Lo) is closed, and so CT> E gLiP(Lo}. This proves that the
invariant section is Lipschitz.
To prove that the invariant section is Cl, consider the cones

C~(2 Lo) = {(v, w) E TxX x Y : Iwl ~ 2 Lo Ivl}·

The calculation of Lemma 1.4 shows that

We need to measure the maximum opening of a cone. If CU c C;:(2 Lo) is a cone over
TxX then the angle of opening of CU is defined to by

..((CU ) = sup{ 11:11 : (v, w) E C U c Tx x Y}.

The following lemma proves the related fact that the angle of opening of the cone
decreases by a factor of p..
11.1 DIFFERENTIABLE INVARIANT SECTIONS FOR FIBER CONTRACTIONS 439

Lemma 1.5. IfC" C C:(2 Lo) has L(CU) = {3 > 0, then L(DF(x.y)C") ~ 1I{3.

PROOF. Assume
(v, w), (v, w') E DF(x,y)CU c T/(x)X x Y,
then

(v, w) = DF(x.y) (v, w)


(v, w') = DF(x.y) (v, w').

The vector v = Dfxv, so v = (Dfx)-I v (so v is the same for both pairs of vectors),
Ivl ~ Ax lvi, and Ivl- I ~ Ax lvi-I. To estimate the components in Y,
w - w' = D I 9(x.y)V + D 2 9(x.y)W - D I 9(x.y)V - D 2 9(x.y)W'
= D 2 9(x.y)[w - w'I,

so
Iw - w'l ~ Itx Iw - w'l·
Thus the angle of the opening of the cone DF(x.y)CU is estimated as follows:

L(DF(x.y)C") = sup{
Iw-w'l
Ivl : (v, w), (v, w') E DF(x.y)C"}

~ sup{1tx Ax
Iw Ivl
- w'l : (v,
.•w), (v,
..,w)E C"}
~ 1-'{3.

This proves the bound on the angle of the opening which is claimed. o
By induction on n,

L(DF;.o/-n(xp;-n(x)(2 Lo) ~ I-'n L(Ci-n(x)(2 Lo»


= I-'n4L o·

As in the proof of the stable manifold, it follows that

n DF;'o/-n(x)Ci-n(x)(2Lo)
00

Px =
n_O

is a linear subspace which is a graph over TxX of slope less than or equal to Lo. By
using an argument like that for Proposition V.IO.S, we get that u" is CI. This completes
the proof of Theorem 1.2 for r = 1. 0
PROOF FOR r ;?: 2. By induction, u" is cr- I , so at least CI. Let A; = Du;. Then
Du; : TxX -+ Y is linear for each x, so Du; E L(TxX, Y) where L(V1t V2 ) is the space
of bounded linear maps between Banach spaces. We put the norm on L(Tx, Y) given by
the norm of a linear operator. (Notice that this definition uses a Riemannian norm on
TX.) We form a space of possible derivatives of such sections. Let £(X) be the bundle

£(X) = U {x} x L(TxX, Y).


xEX
440 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

Let C(Xo) = C(X)IXo be the bundle restricted to fibers over points in X o, and C x =
{x} x L(TxX, Y).
Next we want to define a bundle map on C(Xo) that is compatible with the trans-
formations of derivatives by the action of rF on gl. If u E gl, then rF(U) is C l
and
D(r F( u»x = Dgao/ -' (x) (id, Du1-' (x) )D(f-I)x
by the Chain Rule. Using this a motivation, we define a bundle map \If (f, t/J)
C(Xo) -+ C(X) by

\If (x, S) = (f(x), t/J(x, S»


= (f(x), Dgao(x) 0 (id, S) 0 D(f-I )/C x».

Note that in this definition, we consider (id, S) as taking values at tangent vectors at
(x, u·(x». Thus we only consider the space of possible derivatives along a·. The map \If
is C r - I because F is C r and u· is C r - I . Lemma 1.6 shows that \If is a fiber contraction
over x by a factor of /3x = Itx Ax :$; 11 < 1. By the definitions,

sup /3x Ai :$; 11 < 1


xEXo

for 1 :$; j :$; T - 1. The invariant section u· is C l by the induction hypothesis, so the
map

is an invariant section of \If, i.e., a fixed point of r 1jI. By applying this theorem for T - 1
to \If. A· is a C r - I invariant section. The fact that A· is C r - l implies that u· is cr.
This completes the induction step in the proof except for the following lemma. 0
Lemma 1.6. The map \If : C(Xo) -+ C(X) is a contraction on the fiber over x by
ItxAx·
PROOF. The map \If is a bundle map by its definition. For (x, S), (x, S') E cx,

1It/J(x,S) - t/J(x,S')1I = IIDgao(x)((id,S) - (id,S')]D(rl)/(x)1I


:$; IID 2 ga(x)(S - S']IIIID(f-I)/(x)1I
:$; II:x Ax liS - S'II

as claimed. o
REMARK 1.6. Note that the proof of Lemma 1.6 is essentially the same as the proof of
Lemma 1.5. Also, it is very important that both S and S' take there values at the same
point a·(x), so that the derivative of 9 is calculated at the same point for both terms.
REMARK 1. 7. Note that the proof of the induction step (the case T ~ 2) does not apply
to prove the case for T = 1. The reason is that without using the proof for T = 1 we do
not know that the invariant section A· for \If is in fact the derivative of the invariant
section u· for F.
11.2 DIFFERENTIABILITY OF INVARIANT SPLITTING 441

11.2 Differentiability of Invariant Splitting


As an application of the cr Section Theorem, we consider an Anosov diffeomorphism
on T2 and show that its splitting is CI.
Theorem 2.1. Assume f : T2 -+ T2 is a C'l An060v diffeomorphism. Then the sulr
bundles IE" and IE" in the hyperbolic splitting depend in a Cl fashion on the base point.
PROOF. We prove that the unstable bundle is CI. For this proof, we use that the stable
bundle has one dimensional fibers but not that the unstable bundle is one dimensional.
For this reason to prove that IE" is C I , we assume that I is a C2 Anosov diffeomorphism
on yn with IE" one dimensional.
In the proof we write the derivative as if I were a map on Rn , i.e., we confuse I with
its lift to Rn. By assumptions there is a continuous splitting IE" e E". This splitting
can be approximated by a C I splitting F" e F" which is almost invariant. In terms of
the splitting IF" elF",
Dfx = (~: ~:)
where Ax : IF~ -+ lFj(x) , Bx : IF~ -+ lFj(x) , etc. Because the splitting is near the
hyperbolic invariant splitting, there are A > 1, 0 < p.' < p. < 1, and E > 0 such that
A - E > 1, p. + f < 1, A-I + f + 2f/P.' < I, m(Ax) ~ A, p.' $ m(Dx) = IIDxll $ p.,
and IIBxll, IICxll $ f. (Note that the bundle IF" is one dimensional so the norm of Dx is
really an absolute value.)
The bundle IE~ is the graph of a linear map F,! -+ IF~. Let ex be the space of all
linear maps from IF~ to IF~) with norm less than or equal to 1,

Let e be the disk bundle over yn of all the ex,


e = U {x} x ex'
xET"

e
The natural transformation on is given in terms of the graph transform induced by
the derivative of I, II! = (f, "') : e -+ e where "'x: ex -+ e,(x)' To derive the formula
for "'x, let O'x E ex and v E F,!. Then
D/x(v,O'xv) = ((Ax + BxO'x)v, (Cx + DxO'x}v)
= (w, "'x(O'x}w).

So, v = (Ax + BxO'x)-I W and


"'x((7x)w = (Cx + DxO'x}{Ax + Bx(7x)-l w or
"'x((7x) = (Cx + DxO'x}{Ax + Bx(7x)-I.
To check that Ax + B x(7x is invertible note that

m(Ax + BxO'x) ~ m(Ax} - IIBxllllO'xll


~ A-f
>1.

(See Exercise 11.6.) Since m(Ax + BxO'x) > 1 > 0, Ax + B"O'x is invertible.
e e
Since I is C2, II! : -+ is CI. Also, II! covers the map I : T" -+ T". The following
lemma shows that II! is a fiber contraction.
442 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

Lemma 2.2. (a) The map ill is a fiber contraction by a [actor o[K. x = (IIDxll/m(Ax»+
2t on ex'
(b) Let e = sup{lI1/Jx(Ox)1I : x E T"}. The bundle map ill preserves the disk bundle
o[radius Lo = e/(l- K.) in e where K. = SUPxETn K.x.
PROOF. (a) If u!.u~ E ex. then by adding and subtracting the same term and applying
the triangle inequality. we get

l11/Jx(U!) -1/Jx(u~)11
= II (ex + Dxu!)(Ax + BxU!)-l - (ex + Dxu~)(Ax + Bxu~)-III
~ lI(ex + Dxu!) - (ex + Dxu~)IIII(Ax + BxU!)-11i
+ lI(ex + Dxu~)lIII(Ax + BxU!)-1 - (Ax + BxU~)-III·

Now we estimate each term in the right hand side above. The first term has the
following estimate:

lI(ex + Dxu!) - (ex + Dxu~)11 = IIDx[u! - u~JII


~ IIDxllllu! - u~lI·

Next.

II(Ax + Bxu~)-III = [m(Ax + BxU~>rl


~ [m(Ax) -IIBxll'lIu~WI
~ [m(Ax) - tr 1

and

lI(ex + Dxu~)11 ~ lI(exll + IIDxllllu~)1I


~ t + Il.
To estimate the next term. note that

lIa- 1 - b-III = lIa-Ibb- 1 - a-lab-III


~ lIa-11lllb - all lib-III.

so

II(Ax + Bxu!)-I - (Ax + B"u~)-III


~ II(Ax + Bxu!)-IIIIIBxllllu! - £1~IIII(Ax + BxU~)-lll
~ t (oX - t)-2I1u! - £1~II.

using the fact that II(Ax + Bx£1!)-III.II(A x + BxU~)-11i ~ (oX - t)-l and IIBxll ~ t.
Combining these estimates we get that

l\1/Jx(£1!) - 1/Jx(£1~}11
~ IID"lIlIu! - £1~1I [m(Ax} - tr l + (f + 1l)t{oX - f)-2I1u! - £1~1I
< [IIDxll + 2t]II£1 1 _ £1211
- m(Ax) x x
11.2 DIFFERENTIABILITY OF INVARIANT SPLITTING

as claimed. (In the last inequality we used that IIDxll [m(Ax) -fj-I ~ [IIDxII Im(Ax)] +f
and (E + J.L)E (A - E)-2 ~ t.)
(b) Assume U x E .cx(Lo). Then

IItPx(ux)1I ~ IItPx(ux) - tPx(Ox)1I + IItPx(Ox)1I


~ "x lIuxll +C
~"Lo+C
= Lo·

Therefore tPx(ux) E .cf(x)(Lo). 0


Since III covers the map /, we need the estimate on II(D/x)-III: Ax = II(D/x)-III ~
II(Dx)-11i + E = IIDxll-1 + E. (This uses the fact that IF! is one dimensional.) Then the
fiber contraction times the maximum expansion of /-1 is less than 1:

sup "x Ax ~ sup [(IIDxll Im(Ax)) + 2f][IIDxll- 1 + fj


xET" xET"
~ m(Ax)-1 + f + 2f/llDxll
~A-l+E+2f/J.L'
<1.

It is important that the product of the rates is taken before the supremum is taken so
that the factors II Dx II and II Dx 11- I multiply to give 1. Thus the contraction on fibers
.c(Lo) is stronger than the contraction within T" and the invariant section is CI. Since
the invariant section has E: as a graph, the unstable bundle is CI.
The proof for the stable bundle is similar provided that the unstable bundle has one
dimensional fiber. 0
REMARK 2.1. The bundles are not necessarily C2 even if the diffeomorphism is C3.
This is because IIAx ll-I[IIDx ll- 1 +fj is not necessarily less than 1.
However, various rigidity results have been proven. If an Anosov diffeomorphism
/ : 1'2 --+ y2 is Coo, area preserving, and has C2 invariant bundles, then the bundles are
Coo. See Hurder and Katok (1990). Also see de la Llave, Marco, and Moriyon (1986)
and de la Llave (1987).
REMARK 2.2. In higher dimensions the bundles are not necessarily CI but they are
Holder. See Hirsch and Pugh (1970). Their treatment uses the CO Section Theorem of
the last section.
The same type of argument can be used to prove that the stable bundle is CI for a
hyperbolic attractor with dim(E:) = 1.
Theorem 2.3. Assume / : M --+ M is C2 and has a hyperbolic attractor A with
dim(E:) = 1. Let E! = Tx W"(x) [or x in a neighborhood o[ A. Then the stable bundle
is CI.
PROOF. The proof is like above, but the map /-1 is overflowing on a neighborhood U
of A. The estimates similar to those of Theorem 2.1 but applied to D(f-I)x give a fiber
contraction on the space of linear maps whose graphs could potentially give E!. See
Hirsch and Pugh (1970) for more details. 0
444 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

11.3 Differentiability of the Center Manifold


The existence of a cr Center Manifold is stated in Section 5.10.2. In this section,
we indicate how this result follows from the cr Section Theorem. What is needed is
to show that the center-unstable manifold, WCU(O,f), and the center-stable manifold,
WC'(O, f), are cr. (The center manifold is the intersection of these two manifolds.) We
concentrate on showing that WC"(O, f) is C r because the prooffor WC'(O, f) is similar.
The proof in Section 5.10.2 (using the methods of Section 5.10.1) shows that WCU(O, f)
is C l . We indicate the induction argument which proves that it is cr.
Let 2 ::; T < 00 be a fixed level of differentiability. (The proof for T = 1 is done.)
We assume I : U c Rn --> Rn is a C r map with 1(0) = o. Also the tangent space at 0
splits, Rn = IE" (fl EC (fl E', where these subspaces are labeled in the usual manner. Let
0< J' < 1 be such that IIDlolE'1i < 1'. Let ~ > 1 be chosen so that JJ~r < 1. A basis
can he chosen so that in terms of the norm in this basis II D 10 I lIEu (flEe II < ~. (Notice
that for a fixed ~ > 1, it is not possihle to satisfy this inequality for all T ~ 1. This
is {'Ssentially the reason the manifold can not be proven to he Coo. Also if D lollE c is
diagonalizahle, thl'n it is pO.'l.'!iblc t.o make IIDlolllE" (fl1E'1i ::; 1. )
Next. we want to get global estimates and not just at o. We do this by extending
I using a bump function to all of Rn so it is uniformly near the derivative at 0. Let
!3 : R n -> R be a bump function with supp(!3) = B(2,O) and !3IB(I,O) = 1. Define
!3.(x) = !3(fX), so supp(!3.) = B(2f,0) and !3.IB(f,O) = 1. Let A = Dlo (in the basis
indicated above) and

F.(x) = !3.(x)/(x) + (1 - !3.(x))Ax.

As f goes to 0, F. converges to the linear map Ax in the C l topology. (See Proposition


V.7.5 and the proof of Lemma X.4.2.) In particular, if f > 0 is small enough then
IID(F.)xllE'li < I' and IID(F.);lIIEU (flEcll < ~ for all x ERn. We fix this f and write
F for F.. (Notice that even if DlollEc is diagonalizable, it is not possible to make
IIDI; I lIEu (flEc" ::; 1 for all x ERn.)
Let C(Rn) be the bundle and III : C(Rn) --> C(Rn) be the bundle map as defined in
Section 11.1 for the proof for T ~ 2. Lemma 1.6 proves that III is a fiber contraction by
a factor of JJ~. The construction above gives that (I' ~)~r-l < 1, so the the invariant
section is C r - l .
Let o' : IE" (flEc --> IE' be the C l invariant section whose graph gives WCU(O, F).
The map A; = Do: is an invariant section for III. It follows that Do: is C r - l and u·
is cr. This completes the proof.

11.4 Persistence of Normally Contracting Manifolds


In this section, we assume that we are given a C r diffeomorphism on a manifold,
I: M --> M with an invariant compact C l submanifold V c M, I(v) = V. The main
theorem gives conditions for an invariant manifold to persist for perturbations of f
which are C l small. The first step is to define a condition on the invariant 8ubmanlfold
called normally contracting for f at V.
To make the definitions, we need the notion of a normal bundle of a submarlifold.
At each point x E V, it is possible to pick a subspace N x of the tangent space TxM
which is a complementary subspace to Tx V, TxM = Tx V (fl Nx . These subspaces can
be chosen so they vary differentiably on the point x E V, so together they form a vector
bundle ovt'r V.
N = U {xl x N x·
xEV
11.4 PERSISTENCE OF NORMALLY CONTRACTING MANIFOLDS 445

This vector bundle N is called a nOTmal bundle to V (in M). For each point x E V,
there are two projections defined: 11'~ : TxM -+ TxV, the projection along N x onto
TxV, and 11'~ : TxM -+ N x , the projection along TxV onto N x .
Definition. A diffeomorphism I : M -+ M is called nOTmally controcting at V provided
V is a compact invariant submanifold for I and there are constants C ~ 1 and 0 < Il < 1
such that

111I'f,(x)DI!INxll $ CIl" and


111I'f,cx)DI:INxll $ CIl" m(DI:ITx V)
for all x E V and k ~ 1. These conditions mean that I contracts toward V and the
rate of contraction toward V is stronger than any contraction within V. (The term
I:
m( D ITx V) measures any possible contraction within V.)
To get n higher degree of smoothness of the invariant manifold of a perturbation, we
need to make further assumptions on the rate of contractions toward V relative to the
contradions within V. For r ~ 1, a diffeomorphism I : M -+ M is called r-noTmally
contmcting at V provided V is a compact invariant submanifold for I, I is cr,and
there are constants C ~ 1 and 0 < I" < 1 such that

111I'f,(x)DI!INxll $ CI""m(DI!ITxVp
for all 0 $ j $ r, x E V, and k ~ L
REMARK 4.1. There is a generalizatioJl,-of r-normally contracting to r-normally hyper-
bolic invariant manifolds; see Hirsch, Pugh, and Shub (1977) and Fenichel (1971). This
latter condition allows there to be both contracting and expanding directions within the
normal bundle.
In our definition. we did not require that the normal bundle be invariant by the
derivative map. However, if I is normally contracting along V, then it is possible to
choose another normal bundle that is invariant.
Proposition 4.1. Assume I: M -+ M is normally contracting at V.
(a) There is a continuous choice of the normal bundle that is invariant by tbe deriv-
ative of /. D/x(Nx) = Nf(x)'
(b) By changing the Riemannian norm of M, it is possible to take C = 1 in the
definition of normally contracting at V.
We leave the proof of this proposition to the exercises. See Exercise 11.9. In the rest
of this section we take the invariant normal bundle and adapted norm given by this
proposition.
Once the normal bundle is invariant and C = 1, we can leave the projection out of
the conditions on the rates of contraction, and write that IIDlxlNxll $l"m(DI!ITxV)i
for 0 $ j :5 r. This condition that DI contracts vectors in N more strongly than
vectors tangent to V implies that the derivative of I preserves a family of cones of
vectors which point more along V than in the normal direction. This fact is crucial in
the main theorem of this section which we state next.
Theorem 4.2. Let I : M -+ M be a cr diffeomorpbism, r ~ 1. Assume I is r-normally
contracting at V, where V is a compact Cl submanifold of M. If g: M -+ M is a cr
diffeomorphism which is Cl near I, then 9 bas a cr
invariant r-normally contracting
submanifold Vg which is Cl near V.
REMARK 4.2. A similar theorem is true for flows with very little change in the defini-
tions, statement, or proof.
446 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

REMARK 4.3. If 9 is C r near I, then its invariant manifold Vg is C r near V.


REMARK 4.4. This theorem has a long history. See the remarks in Hale (1969) and
Hirsch, Pugh, and Shub (1977). Sacker (1967), Fenichel (1971), and Hirsch, Pugh, and
Shub (1977) have recent results in this direction and beyond.
REMARK 4.5. This theorem has many applications in Dynamical Systems. The proof
of the Andronov-Hopf Theorem for diffeomorphisms is related to this theorem. For this
bifurcation result, the full nonlinear map is considered as a perturbation of a normal
form. The normal form of the map trivially has an invariant circle which is normally
contracting. As the parameter goes to the bifurcation value, the extent of contraction
toward the invariant circle goes to one. However, the perturbation effects of the true
nonlinear map from the normal form is small enough so the nonlinear map also has an
invariant closed curve. See Ruelle and Takens (1971).
Before starting the proof of the theorem, we discuss some constructions and results
related to the theorem. We start by showing that V is necessarily a C r manifold under
the hypothesis of the theorem. This same proposition shows that Vg is C r once we have
shown that it is C 1 •
Proposition 4.3. Assume I is C r and r-normally contracting at V, where V is a
compact invariant C 1 submanifold of M. Then V is a C r submanifold of M.

REMARK 4.6. There are examples of I, which are r-normally contracting at V but not
(r+l)-normally contracting, such that V is c r but not Cr+l; in fact, this is the generic
situation. See Mane (1978a).
PROOF. The proof is very similar to that of Theorem 2.1. Again we approximate the
invariant splitting TV EEl.N by a differentiable splitting IFv EElIF N , in terms of which

Given E > 0 with IJ + E < 1, because the splitting is near the invariant splitting,
IIBxll, IICxll ~ E, and IIDxll ~ IJm(Ax)m(DlxITxV)j for 0 ~ j ~ r - 1.
Let 'Dx be the space of all linear maps from IF:; to lFit with norm less than or equal
to 1,

Let 'D be the disk bundle over V of all the 'Dx,

'D = U{x} x 'Dx·


xEV

Let \II = (j,1/» : 'D -+ 'D be the graph transform induced by the derivative of I,

for U x E 'Dx. The proof of Lemma 2.2 shows that \II is a fiber contraction by a factor of
II Dx II m(Ax)-1 + 2t. The map on the base space V has,xx = m(DlxITxV)-I. By the
conditions above, \II satisfies the assumptions of the C r - I Section Theorem. Thus the
invariant section, whose image is TV, is cr- I and so V is cr. 0
11.4 PERSISTENCE OF NORMALLY CONTRACTING MANIFOLDS 447

Definition. Next, we introduce the notion of a tubular neighborhood of a submanifuld.


The idea is to identify a point in a neighborhood of V with a point in V and a displace-
ment in the normal direction which is represented by a vector in N. To state this more
carefully, we use the disk bundle in N. For a > 0, let Nx(a) be the vectors in N x with
length less than or equal to a; thus Nx(a) is a closed disk in N x . Then let N(a) be the
bundle of all these disks,
N(a) = U {x} x Nx(a).
xEV

The Tubular Neighborhood Theorem says that there is a embedding cp from N(a) for
small a onto a neighborhood of V in M. If M is a Euclidean space then cp can be taken
to be given by cp(v x ) = x + v x . In a manifold, cp(vx) = expx(vx ) works but may be
only C r - I if V is cr. With a little care, cp can be made to be cr. See Hirsch (1976)
for a more complete discussion.
PROOF OF THEOREM 4.2. The bundle N is defined at all points of the tubular neigh-
borhood cp(N(a» by taking tangent spaces to cp(Nx ) at points in the image of a fiber.
We continue to call this bundle N. The tangent space to V can be extended to this
neighborhood to be differentiable. (We do not give the details.) We denote the fibers
of this bundle by Tx.
Rather than use the family of cones of vectors which point more along V than in
the normal direction, we use the cones which point more in the normal direction. For
x E cp(N(a», let

Because I is normally contracting at V (with C = 1 from an adapted norm), if a is


small enough, then these cones are invariant under the action of the derivative of I-I:
D(r1)x(Cx) c C,-I(X)'
We leave the details to the reader. See Exercise 11.10.
Next let Do be a "vertical" disk in the tubular neighborhood cp(N(a» which is the
same dimension as a fiber of the normal bundle, Ny, and whose tangent space TyDo is
contained in the cone Cy . We also assume that the boundary of Do is in the boundary of
cp(N(a» and Do goes all the way across cp(N(a»: in local coordinates we could assume
that Do is the graph of a function from Nx(a) into TxV for some point x E V. Because
of the invariance of the bundles under I-I, l-n(Do) n cp(N(a» is a disk with the same
properties. As in the proof of the stable manifold theorem,
Dn = f"(rn(Do) n cp(N(a») c Do
is a nested set of disks which converge to a single point. (Here cp(N(a» is the tubular
neighborhood which is the image of the normal disk bundle over points of V.) This
point is the unique point in Do which stays in cp(N(a» for all backward iterates.
If Do = cp(Nx ) for x E V, then x E DonV stays in <p(N(a» for all backward iterates
and so in the unique point in the intersections of the Dn. Thus for these choices of
disks, using I we recover V:

V = u n f"(rn(cp(Nx»n cp(N(a»)
xEVn~O

Now if 9 is a C r diffeomorphism which is C l near to I, then 9- 1 will also preserve


the above set of cones:
448 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

Again. if Do is a "vertical" disk of the same type as above. then

n gn(g-n(Do) n cp(N(a)))
R2:0

is a single point. Thus

Vg = un
xEV n2:0
gn(g-n(cp(Nx }} n cp(N(a)))

is a graph over V in the sense that cp-I(Vg) C N(a) can be represented as the graph of
a function t/Jg : V --+ N(a) with t/Jg(x) E Nx(a). This function is also Lipschitz because
it has a unique point in cp(Cx ) for x E V. (This follows because any vertical disk has a
unique point in cp-I(Vg) just as in the proof of the Stable Manifold Theorem.)
An argument like that for the stable manifold proves that Vg is a C1 graph over V.
Just as the tangent space to V is an invariant section for the graph transform \II as
defined in the proof of Proposition 4.3. the tangent bundle to Vg is a fixed section for a
similar graph transform \IIg induced by the derivative of g. This map \IIg is CO near \II
provided 9 is C 1 near f. so the sections are CO near. The fact that the tangent space
to Vg is CO near t.he tangent space to V implies that Vg is C I near V. We leave the
details to the reader.
If 9 is C1 near enough to f. then 9 is r-normally contracting at Vg • so by Proposition
4.3 Vg is CT. 0

11.5 Exercises
Fiber Contractions
11.1. Let F: Xo x Yo --+ X x Yo be as in Theorem 1.1
(a) Prove that for each x E X o•

n
00

n=O
FR( {f-n(x)} x Yo)

is a unique point (x.u·(x». (Do not use the conclusion of Theorem 1.1.)
(b) Prove that the map u· : Xo --+ Y defined in part (a) is continuous.
11.2. Consider Theorem 1.2 for r = 0 and 0 < 0< < 1. Let

90(L) = {u E gO : lu(x) - u(x/)1 ~ L d(x. X/)O for all x. x' E Xo}.

(a) Prove that 90(L) is closed in go.


(b) Prove that for Lo > 0 large enough. fF preserves 90(£o).
(c) Prove that the invariant section u· E gO(Lo).
1l.3. Consider Theorem 1.2 for r = 1 and 0 < 0< ~ 1. Let

gI+O(C I • L) = {u ECI(Xo. yo) : IDuxl ~ C1 and


IDux - DUx'l ~ Ld(x,x')O for all x.x' E Xo}.

Henry (1981) in Lemma 6.1.6 proves that gI+O(CI. L) is closed in gO for any 0 < 0< ~ 1.
(Note this is not true for 0< = 0.) In fact he proves with the suitable definition that
gT+O(C 1 • L) is closed in go for any integer r ~ 0 and 0 < 0< ~ 1.
11.5 EXERCISES 449

(a) Prove that for C.,£o > 0 large enough, rF preserves g1+0(C., £0).
(b) Prove that the invariant section u· E g1+0(C., Lo). (You may use the result of
Henry.)
11.4. (Fiber Contraction Theorem) Assume X and Y are metric spaces with Y com-
plete. Assume F : X x Y -+ X x Y is a uniformly continuous fiber contraction on X
with F(x,y) = (f(x),g(x,y)), Le., there exists a 0 < ~ < 1 such that

for all x E X and YI,Y2 E Y. Assume there is ax· E X such that for any x E X,
d(f"(x), x·) goes to zero as n goes to 00. (Such a fixed point is called attractive. Let
y. be the fixed point of g(x·,·) : Y -+ Y. Let (x,y) be a point in X x Y. Let
11'2: X x Y -+ Y be the projection onto Y. Note that

d(7r2 0 F"(x,y),y·)
~ d(7r2 0 F"(x, y), 11'20 F"(x, y.))
n-I
+ L d(7r2 ° F"-I-;(p+I(X), g(f;(x),y·)),
;=0

(a) Show that d(7r2 oF"(X,y),1r2 oF"(x,y·)) goes to zero as n goes to infinity.
(b) Let 6; = d(g(Ji(x),y·),y·). Prove 6; goes to zero as j goes to infinity.
(c) Estimate

(d) Splitting the sum from 0 to n -1 into the two sums from 0 to k -1 and from k to
n - I, show that d(1r2 ° F"(x, y), y.) goes to zero as n goes to infinity. Thus prove
for any (x, y) E X x Y, d( F n (x, y), (x· , y.)) goes to zero as n goes to infinity, and
(x·, y.) is an attractive fixed point.
B.5. Assume F : Xo x Yo -+ X x Yo is as in Theorem 1.2 for some r ~ 1. Let C(X)
be as defined in the proof. Write Yo €a C(X) for

u
xEX
{x} x Yo x L(TxX, Y).

Define 9 : Yo €a C(Xo) -+ Yo €a C(X) by 9(x,y, S) = (f(x),g(x,y),1/J(x,y, S)), where

1/J(x,y,S) = Dg(x,y) 0 (id, S) ° D(rl)px).

(Note the Similarity to 1/J defined in Section 11.1.)


(a) Show that 9 is a cr- I fiber contraction over Xo.
(b) Note that the continuous sections of Yo €a C(Xo) can be written as gO x where no
go is as before and 110 are continuous sections of C(Xo). Let re be the graph
transform of 9 on go x no.
Show that re(u, S) = (rF(U), r ... (u, S)) where

r ... (u, S)(x) = Dg/to/-'(x) 0 (id, S/-'(x») 0 D(f-I)x.


450 XI. SMOOTHNESS OF STABLE MANIFOLDS AND APPLICATIONS

(c) Using the previous exercise. show that fa has a unique fixed point (11°. AO). Con-
clude that if 11 E gl then fa(l1. D(1) converges to (11°. AO) and 11° is e l .
(d) Prove that 11· is cr.
11.6. Let A : r;: -+ r,:,. 11 : 1F~ -+ r;:. and Bx : lF~ -+ lF~, be linear maps between
Banach spaces. Prove that m(A + B(1) ~ m(A) -IIBIIIII1I1.
Differentiability of an Invariant Splitting
11.7. Setup the bundle map for the proof of Theorem 2.2. Show that it is overflowing
on the base space and a fiber contraction with a stronger contraction on the fiber than
the contraction on the base space.
Differentiability of the Center Manifold
11.8. Let I be a e r diffeomorphism on Rn for 1 < r < 00. Assume 0 is a fixed
point that has a center (some eigenvalues with absolute value 1). Assuming that the
local unstable manifold WU(O. f) is e l • prove that it is e r . (Use the theorems of this
chapter. but do not use the theorems of Section 5.10.2.)
Normally Contracting Manifolds
11.9. (Proposition 4.1) Assume I: M -+ M is normally contracting at V.
(a) Prove that there is a continuous choice of the normal bundle that is invariant by
the derivative of I. D/x(Nx) = Nf(x)'
e
(b) Prove that it is possible to take = 1 in the definition of normally contracting at
V by changing the Riemannian norm of M.
11.10. Assume I : M -+ M is normally contracting at V. Let {ex} be the family
of cones as defined in the proof of Theorem 4.2. Prove that these cones are invariant
under the action of the derivative of I-I:

11.11. Assume I : (-fO. fO) x M -+ M is e l and I(t,·) : M -+ M is a diffeomorphism


for each f E (-fO. fO). Assume 1(0,·) has a I-normally contracting invariant manifold Vo.
Theorem 4.2 proves that there is an fl > 0 such that I(f,·) has a I-normally contracting
invariant manifold V. which is e l for If I ~ fl. Define the map F : (-fO, EO) x M -+
(-EO.fO) x M by F(£,x) = (E,/(E,X».
(a) By constructing an invariant set of cones for F, prove that the set

v= {(f,X) : x E V. for lEI ~ Ed


is a Lipschitz manifold in (-fO, fO) x M. This says that the normally contracting
invariant manifold varies in a Lipschitz manner. This manifold V could be called
the "Center Manifold" for the normally contracting invariant manifold Vo.
(b) Prove that V is a e l manifold in (-EO,EO) x M. This says that the normally
contracting invariant manifold varies differentiably.
References
Abraham, R. and Marsden, J. (1978), Foundation 0/ Mechanics (2nd Edition), Benjamin-
Cummings Publ. Co., Reading MA.
Abraham, R. and Robbin, J. (1967), Transversal Mappings and Flows, Benjamin, Reading
MA.
Abraham, R. and Smale, S. (1970), Nongenericity o/n-stability, Proc. Symp. in Pure Math.,
Amer. Math. Soc. 14, 5-8.
Adler, R. and Weiss, B. (1970), Similarity 0/ automorphisms 0/ the torus, Memoirs of Amer.
Math. Soc. 98.
Alekseev, V. M. (1968a), Quasimndom dynamical systems, I, Math. USSR-Sb. 5, 73-128.
Alekseev, V. M. (1968b), Quasimndom dynamical systems, II, Math. USSR-Sb. 6, 505-560.
Alekseev, V. M. (1969), Quasimndom dynamical systems, Ill, Math. USSR·Sb. 1, 1-43.
Alekseev, V. M. (1981), Quasimndom oscillations and qualitative questions in celestial
mechanics, Amer. Math. Soc. Translations (Three Papers on Dynamical Systems) 166
(Series 2),97-169.
Alligood, K. and Yorke, J. (1989), fractal basin boundaries and chaotic attmctors, Proc.
of Sympoeia in Applied Mathematics 39, 41-55.
Alligood, K., Tedeschini-Lalli, L., and Yorke, J. {1991}, Metamorphoses: sudden jumps in
basin boundaries, Commun. Math. Physics 141, 1-8.
AIsedA, L., Llibre, J., and Misiurewicz, M. (1993), Combinatorial Dynamics and Entropy
in Dimension One, Advanced Series in Nonlinear Dynamics, vol. 5, World Scientific
Publ., River Edge NJ & Singapore.
Aoki, N. (1991), The set 0/ Axiom A diffeomorphisrns with no cycles, in Collection: Dy-
namical Systems and related topics, World Scientific Publ, River Edge, NJ, pp. 2(}-35.
Aoki, N. (1992), The set 0/ Axiom A diffeomorphisms with no cycles, Bol. Soc. Brasil Mat.
23,21-65.
Andronov, A. A. (1929), Application 0/ Poincare's theorem on bifurcation points and
change in stability to simple autooscillatory systems, C. R. Acad. Sci. (Paris) 189,
559-561.
Andronov, A. A. and Leontovich-Andronova, E. A. (1939), Some cases o/the dependence 0/
periodic motions on a pammeter, Ucenye Zapiski Gorki Gosudarstvenny Univ. 6, 3.
Andronov, A. A. and Pontryagjn, L. (1937), Systemes Grossiers, Dok!. Akad. Nault. USSR
14, 247-251.
Andronov, A. A., Leontovich, E. A., Gordon, I. I., and Maier, A. G. (1973), Theory 0/ Bifur-
cation 0/ Dynamical Systems on the Plane, Wiley, New York.
Anosov, D. V. (1967), Geodesic flows on closed Riemannian mani/olds oj negative curva-
ture, Proc. Steklov Inst. 90, Amer. Math. Soc. (trans!. 1969).
Apostol, T. (1974), Mathematical Analysis, Addison·Wesley Publ. Co., New York and Read-
ing, MA.
Arnold, V. I. (1961), On the mapping 0/ the circle into itself, Izvestia Akad. Nault. USSR
Math. 25,21-86.
Arnold, V. I. (1971), Ordinary Differential Equations, M.I.T. Press, Cambridge MA.
Arnold, V. I. (1978), Mathematical Methods 0/ Classical Mechanics, Springer-Verlag, New
York, Heidelberg, Berlin.
Arnold, V. I. (1983), Geometric Methods in the Theory 0/ Ordinary Differential Equations,
Springer-Verlag, New York, Heidelberg, Berlin.
Arrowsmith, D. K. and Place, C. M. (1990), An Introduction to Dynamical Systerns, Cam-
bridge University Press, Cambridge, New York.
Artin, M. and Mazur B. (1965), On periodic points, Ann. Math. 81,82-89.

451
452 REFERENCES

Banks, J., Brooks, J., Cairns, G., Davis, G., and Stacey, P. (1992), On Devaney's definition
of chaos, Amer. Math. Monthly 99, 332-334.
Barge, M. (1986), Horseshoe maps and inverse limits, Pacific J. Math. 121,29-39.
Barge, M. (1988), A method for constructing attractors, Ergodic Theory Dynamical Systems
8,331-349.
Barge, M. and Gillette, R. (1990), Rotation and periodicity in plane separating continua,
Ergodic Theory Dynamical Systems 11, 619-631.
Barge, M. and Martin, J. (1990), Construction of global attractors in the plane, Proc. Amer.
Math. Soc. 110, 523-525.
Belitskii, G. R. (1973), f\inctional equations and conjugacy of local diffeomorphism of a
finite smoothness class, Functional Analysis and Its Applications 7, 268-277.
Benedicks, M. and Carleson, L. (1985), On iterates of x ...... 1 - ax 2 on (0,1), Ann. Math.
122. 1-25.
A.. n..<lirk.~, M. and Car1e11On, L. (1991), The dynamics of the Henon map, Ann. Math. 133,
73-169.
Bl'm'dkks, M. and Young. L. S. (1993), SBR measures for certain Henon maps. Inven.
Math. ll2. 541-576.
Birkhoff. G. D. (1927). Dynamical Systems. Amer. Math. Soc .• Providence. Rl.
Birman, J. and Williams. R. (1983a), Knotted periodic orbits in dynamical systems I:
Lorenz equations. Topology 22. 47-82.
Birman. J. and Williams, R. (1983b). Knotted periodic orbits in dynamical systems II:
Knot holders and fibered knots. Contemporary Math. 20. 1-60.
Blanchard. P. (1984). Complex analytic dynamics on the Riemann sphere. Bull. Amer.
Math. Soc. 11.85-141.
Block, L. and Coppel. W. (1992). Dynamics in One Dimension (Lecture Notes in Mathe-
matics), vol. 1513. Springer-Verlag. New York. Heidelberg. Berlin.
Block. L.• Guckenheimer. J .• Misiurewicz, M., and Young. L. S. (1980), Periodic points and
topological entropy of one dimensional maps, Lecture Notes in Math. 819, Springer-
Verlag. New York. Heidelberg. Berlin, 18--34.
Bowen. R. (1970a). Markov partitions for Axiom A diffeomorphisms, Amer. J. Math. 92,
725-747.
Bowen. R. (1970b). Topological entropy and Axiom A. Global Analysis. Proc. Sympos. Pure
Math., Amer. Math. Soc. 14. 23--42.
Bowen. R. (1971), Entropy for group endomorphisms and homogeneous spaces, 'frans.
Amer. Math. Soc. 153, 401-414.
Bowen. R. (1975a). w-Limit sets for Axiom A diffeomorphisms, J. Diff. Eq. 18.333--339.
Bowen. R. (1975b), Equilibrium States and the Ergodic Theory of Anosov Diffeomor-
phisms (Lecture Notes in Mathematics). vol. 470, Springer-Verlag, New York, Heidel-
berg. Berlin.
Bowen. R. (1978&). On Axiom A Diffeomorphisms (Conference Board Math. Science),
vol. 35. Amer. Math. Soc., Providence, RI.
Bowen. R. (1978b), Markov partitions are not smooth, Proc. Amer. Math. Soc. 71, 130-132.
Bowen. R. and Lanford. O. (1970), Zeta function of restrictions of the shift transformation,
Global Analysis. Proc. Sympos. Pure Math., Amer. Math. Soc. 14. 43--49.
Bowen. R. and Ruelle, D. (1975), The ergodic theory of Axiom A flows, Inven .. Math. 29,
181-202.
Boyle, M. (1993). Symbolic dynamics and matrices, IMA Vol. in Math. and Its Appl. 50,
Springer-Verlag, New York, Heidelberg, Berlin, 1-38.
Braun. M. (1978), Differential Equations and Their Applications, Springer-Verlag, New
York, Heidelberg, Berlin.
REFERENCES 453

Broer, H., Dumortier, F., van Strien, S., and Takens, F. (1991), Structures in Dvnamics;
Finite Dimensional Deterministic Studies, North-Holland, Ellievier Sci. Publ., Amster-
dam.
Brunovsky, P. (1971), On one parameter families 0/ diffeomorphisms 1/; generic branching
in higher dimensions, Comm. Math. Univ. Carolinae 12, 765-784.
Burns, K. and Gerber, M. (1989), Continuous invariant cone families and ergodicity 0/
flows in dimension three, Ergodic Theory Dynamical Systems 9, 19-25.
Burns, K. and Weiss, H. (1994), A geometric criterion lor positive topological entropy,
Preprint.
Carleson, L. and Gamelin, T. (1993), Complex Dynamics, Springer-Verlag, New York, Hei-
delberg, Berlin.
Carr, J. (1981), Applications 0/ Centre Mani/old Theory, Springer-Verlag, New York, Hei-
delberg, Berlin.
Cartwright, M. and Littlewood, J. (1945), On nonlinear differential equations 0/ the second
order: J. The equation ii-k(1-y2)y+y = b"\kcos(..\t+Q), k large, J. London Math.
Soc. 20, 180-189.
Cesari, L. (1959), Asymptotic Behavior and Stability Problems in Ordinary Differential
Equations, Springer-Verlag, New York, Heidelberg, Berlin.
Chernov, N.1. and Sinai, Ya. G. (1987), Ergodic properties 0/ some systems 0/2-dimensional
discs and 9-dimensional spheres, RUS8. Math. Surveys 42, 181-207.
Chillingworth, D. (1976), Differential Topology with a View to Applications, Pitman Publ.,
London, San Francisco, Melbourne.
Choquet, G. (1969), Lectures on Analysis, I, Benjamin, New York.
Chow, S.-N. and Hale, J. (1982), Methods 0/ Bifurcation Theory, Springer-Verlag, New York,
Heidelberg, Berlin.
Chow, S.-N., Lin, X. B., and Palmer, K. (1989), A shadowing lemma with applications to
semilinear parabolic equations, SIAM J. Math. Anal. 20, 547-557.
Collet, P. and Eckmann, J. P. (1980), Iterated Maps on the Interval as Dynamical Systems,
Progress on Physics, vol. 1, Birkhill5er, Boston.
Conley, C. (1978), Isolated Invariant Sets and Morse Index, Amer. Math. Soc., CBMS 3S.
Coullet, C. and '!Teaser, C. (1978), Iteration d'endomorphismes et groupe de renormalisa-
tion, J. de Physique Colloque 39, C5-C25.
Croom, F. (1989), Principles 0/ Topology, Saunders College Pub., Philadelphia.
Curry, J. H. (1979), On the Henon trans/ormation, Comm. Math. Phys. 6S, 129-140.
Dankner, A. (1978), On Smale's Axiom A dvnamical systems, Ann. Math. 107,517-553.
Denjoy, A. (1932), Sur les courbes dlfinies par les equations differentielles Ii la sUrface
du tore, J. Math. Pure Appl. 11,333-375.
Devaney, R. (1989), Chaotic Dvnamical Systems, Addison-Wesley Publ. Co., New York &t
Reading, MA.
Devaney, R. and Nitecki, Z. (1979), Shift automorphism in the Henon mapping, Comm.
Math. Phys. 67, 137-148.
Dieudonne, J. (1960), Foundations 0/ Modem Analysis, Academic Press, New York.
Dold, A. (1972), Lectures on Algebraic Topology, Springer-Verlag, New York, Berlin, Heidel-
berg.
Douady, A. and Hubbard, J. H. (1985), On the dynamics o/polynomial-like mapping8, Ann.
Sc. E. N. S. 4 seril§ IS, 287-343.
Dugundji, J. (1966), Topology, Allyn and Bacon, Boston, London, Sydney, Toronto.
Dumortier, F., Kokubu, H., and Oka, H. (1992), On degenerate singularities that generate
geometric Lorenz attractors, Preprint.
Easton, R. (1986), 7rellises formed by stable and unstable manifolds, '!Tans. Amer. Math.
Soc. 294, 719-731.
454 REFERENCES

Easton, R. (1991), Transport through chaos, Nonlinearity 4, 583-590.


Edgar, G. (1990), Measure, Topology, and Fractal Geometry, Springer-Verlag. New York.
Berlin, Heidelberg.
Edwards. C. H. and Penney. D. E. (1990). Calculus and Analytical Geometry, Prentice Hall.
Englewood Cliffs, NJ.
Falconer, K. (1990). Fractal Geometry, John Wiley & Sons. New York.
Farmer, D., Ott, E., and Yorke. J. (1983). The dimension of chaotic at tractors. Physica D
3, 153-180.
Feigenbaum, M. (1978), Quantitative universality for a class of non-linear transforma-
tions, J. Stat. Phys. 21,25--52.
Fen ichel , N. (1971), Persistence and smoothness of invariant manifolds for flows, Indiana
Univ. Math. J. 21. 193-226.
Fenichel, N. (1974), Asymptotic stability with rate conditions. Indiana Univ. Math. J. 23.
1109-1137.
Fenichel. N. (1977), Asymptotic stability with rate conditions, II, Indiana Univ. Math. J.
26,81-93.
Franks, J. (1969), Anosov diffeomorphisms on tori. Trans. Amer. Math. Soc. 145. 117-124.
Franks. J. (1970). Anosov diffeomorphisms. Global Analysis. Proc. Sympos. Pure Math .•
Amer. Math. Soc. 14,61-94.
Franks, J. (1971). Necessary conditions for stability of diffeomorphisms. Trans. Amer.
Math. Soc. 158,301-308.
Franks. J. (1972). Differentiably {l-stable diffeomorphisms. Topology 11. 107-114.
Franks, J. (1973). Absolutely structurally stable diffeomorphisms. Proc. Amer. Math. Soc.
37, 293-296.
Franks, J. (1974). Time dependent structural stability. Inven. Math. 24. 163-172.
Franks, J. (1977a), Constructing structurally stable diffeomorphisms. Ann. Math. 105.
343-359.
Franks, J. (1977b). Invariant sets of hyperbolic toral automorphisms. Amer. J. Math. 99.
1089--1095.
Franks, J. (1979), Manifolds ofer mappings and applications to differentiable dynamical
systems, Adv. Math. Suppl. Studies (Studies in Analysis) 4. 271-290.
Franks, J. (1982). Homology and Dynamical Systems. Amer. Math. Soc .• CBMS 49. Provi-
dence, RI.
Franks, J. (1988). A variation on the Poincare-Birkhoff Theorem. Contemporary Math.
81, 111-117.
Franks, J. and Shub, M. (1981). The existence of Morse Smale diffeomorphisms. Topology
20. 273-290.
Fried, D. (1982), Geometry of cross-sections to flows, Topology 21. 353-371.
Fried, D. (1987). Rationality for isolated expansive sets. Adv. Math. 65. 35--38.
Frobenius, G. (1912), Uber Matrizen aua nicht negativen Elementen. S.-B. Deutsch. Akad.
Wiss. Berlin Math.-Nat. Kl.. 456-77.
Gantmacher, F.R. (1959), The Theory of Matrices, Volume I, II. Chelsea Publ. Co .• New
York.
Grebogi, C .• Hammel. S .• and Yorke. J. (1988). Numerical orbits of chaotic processes rep-
resent true orbits. Bull. Amer. Math. Soc. 19. 465--469.
Grebogi, C., Ott, E .• and Yorke. J. (1987), Basin boundary metamorphoses: changes in
accessible boundary orbits. Physica D 24, 211
Guckenheimer. J. (1970). Axiom A + no cycles ,minKS <,(t) rational. Bull. Amer. Math.
Soc. 76. 592-595.
Guckenheimer, J. (1972), Absolutely {l-stable diffeomorphisms, Topology 11, 195--197.
REFERENCES 455

Guckenheimer. J. (1976). A strange, strange attractor, in The Hopr Bifurcation and Its
Applications by Marsden and McCracken. Springer-Verlag, New York. Heidelberg, Berlin.
Guckenheimer. J. (1979). Sensitive dependence to initial conditions for one dimensional
maps, Comm. Math. Phys. TO, 133-160.
Guckenheimer. J. (1980), Bifurcations of dynamical systems, in Dynamical Systems, Pro-
gress in Math. #8, Birkhauser. Boston. Basel. Stuttgart, pp. 11&-232.
Guckenheimer. J. and Holmes. P. (1983). Nonlinear Oscillations, Dynamical Systems and
Bifurcations of Vector Fields. Springer-Verlag, New York, Heidelberg. Berlin.
Guckenheimer. J. and Williams R. (1980), Structural stability of the Lorenz attractor, Publ.
Math. I.H.E.S. 50.73-100.
Guillemin. V. and Pollack A. (1974). Differential Topology, Prentice-Hall, Englewood Cliffs,
NJ.
Hadamard. J. (1901). Sur l'iteration it les solutions asymptotiques des equations differ-
entielln. Rill I. SO<". Math. France 29. 224-228.
Hahn. W. (1967), Stability of Motion. Springer-Verlag, New York. Heidelberg. Berlin.
Hale. J. (1969), Ordinary Differential Equations. Wiley. New York.
Hale. J. and KOI;ak, H. (1991). Dynamics and Bifurcation. Springer-Verlag. New York. Hei-
delberg, Berlin.
Hammel, S. and Jones. C. (1989). Jumping stable manifolds for dissipative maps of the
plane, Physica D 35, 87-106.
Hartley. R. and Hawkes, T. (1970), Rings, Modules, and Linear Algebra, Chapman and Hall,
New York and London.
Hartman. P. (1960), On local homeomorphisms of Euclidean spaces, 801. Soc. Mat. Mexi-
cana 5. 220- 241.
Hartman, P. (1964). Ordinary Differential Equations, Wiley, New York.
Hayyashi, S. (1992), Diffeomorphisms in Fl(M) satisfy Axiom A. Ergodic Theory Dynam-
ical Systems 12. 233-253.
Henon, M. (1976). A f1l'n·dimensional mapping with a strange attractor, Comm. Math.
Phys. 50, 69- 7 •
Henry. B. R. (1973). Escape from the unit interval under the transformation x ..... Ax(l -
x), Proc. Amer. Math. Soc. 41, 146-150.
Henry. D. (1981). Geometric Theory of Semilinear Parabolic Equations (Lecture Notes in
Math.). vol. 840, Springer-Verlag, New York, Heidelberg, Berlin.
Herman, M. (1979), Sur la conjugation differentiable des diffeomorphisms du cercle desa
rotations, Publ. Math. I.H.E.S. 49, &-233.
Hille. E. (1962), Analytic Function Theory" vol. II, Chelsea Publ. Co .• New York.
Hirsch. M. (1976), Differential Topology, Springer-Verlag, New York, Heidelberg, Berlin.
Hirsch. M. (1982), Systems of differential equations that are competitive or cooperative,
I: limit sets, SIAM J. Math. Anal. 13, 167-79.
Hirsch, M. (1984), The dynamical systems approach to differential equations, Bull. Amer.
Math. Soc. 11, 1-64.
Hirsch, M. (1985), Systems of differential equations that are competitive or cooperative,
II: conve1!Jence almost everywhere, SIAM J. Math. Anal. 16, 423-39.
Hirsch, M. (1988), Systems of differential equations that are competitive or cooperative,
III: competitive species, Nonlinearity 1, 51-71.
Hirsch, M. (1990), Systems of differential equations that are competitive or cooperative,
IV: structural stability in 3-dimensional systems, SIAM J. Math. Anal. 21, 1225-34.
Hirsch, M. and Pugh. C. (1970), Stable manifolds and hyperbolic sets, Proc. Symp. Pure
Math. 14, 133-163.
Hirsch, M., Pugh, C., and Shub, M. (1977), Invariant Manifolds (Lecture Notes in Math.),
vol. 583, Springer-Verlag, New York, Heidelberg, Berlin.
456 REFERENCES

Hirsch, M. and Smale, S. (1974), Differential Equations, Dynamical Systems, and Linear
Algebra, Academic Press, New York.
Hocking, J. G. and Young, G. S. (1961), Topology, Addison-Wesley Publ. Co., New York &
Reading, MA.
Hoffman, K. and Kunze, K. (1961), Linear Algebra, Prentice-Hall, Englewood Cliffs, NJ.
Holmes, P. (1980), Averaging and chaotic motions in forced oscillations, SIAM J. Appl.
Math. 38, 65-80.
Holmes, P. (1984), Bifurcation sequences in horseshoe maps: infinitely many routes to
chaos, Phys. Lett. A 104, 299-302.
Holmes, P. (1988), Knots and orbit genealogies in nonlinear oscillations, in New Directions
in Dynamical Systems, London Math. Soc. Lecture Notes, vol. 127, Cambridge University
Press, Cambridge, New York, pp. 150-191.
Holmes, P. and Whitley, D. (1984), Bifurcation in one and two dimensional maps, Phil.
Trans. Roy. Soc. London (Ser. A) 1515,43-102.
Hofbauer, J. and Sigmund, K. (1988), The Theory of Evolution and Dynamical Systems,
Cambridge University Press, Cambridge, New York.
Hopf, E. (1942), Abzweigung einer periodischen Losung von einer stationiiren Losung
eines Differentialsystems, Ber. Math. Phys. Siichsische Akad. der Wissen. Leipzig 94,
1-22 (See Marsden and McCracken (1976) for an English translation).
Hoppensteadt, F. C. (1982), Mathematical Methods of Population Biology, Cambridge Uni-
versity Press, Cambridge, New York.
Hu, S. (1994), A proof of C 1 stability conjecture for 3-dimensional flows, Transact. Amer.
Math. Soc. 342, 753-772.
Hubbard, J. and West, B. (1992), MacMath: A Dynamical Systems Software Package for
the Macintosh, Springer-Verlag, New York, Heidelberg, Berlin.
Hurder, S. and Katok, A. (1990), Differentiability, rigidity, and Godbillon- Vey Classes for
Anosov flows, Pub!. Math. I.H.E.S. 12,5-61.
Hurewicz, W. (1958), Lectures on Onlinary Differential Equations, MIT Press, Cambridge,
MA.
Hurewicz, W. and Wallman, H. (1941), Dimension Theory, Princeton University Press,
Princeton, NJ.
Hurley, M. (1991), Chain recurrence and attraction in noncompact spaces, Ergodic Theory
Dynamical Systems 11, 709-729.
Hurley, M. (1992), Chain recurrence and attraction in noncompact spaces, II, Proc. ofthe
Amer. Math. Soc. lIS, 1139-1148.
Irwin, M. C. (19708), A classification of elementary cycles, Topology 9, 35-47.
Irwin, M. C. (1970b), On the stable manifold theorem, Bull. London Math. Soc. 2, 196-198.
Irwin. M. C. (1972), On the smoothness of the composition map, Q. J. Math. Oxford Ser.
23, 113-133.
Irwin, M. C. (1980), Smooth Dynamical Systems, Academic Press, New York.
Jakobson, M. (1971), On smooth mappings of the circle into itself, Math. USSR Sb. 14,
161-185.
Jakobson, M. (1981), Absolutely continuous invariant measures for one-parameter fami-
lies of one-dimensional maps, Comm. Math. Phys. 81, 39--88.
Katok, A. (1980), Lyapunov exponents, entropy, and periodic orbits for diffeomorphisms,
Publ. Math. I.H.E.S. 51, 137-173.
Katok, A. and Burns, K. (1994), Infinitesimal Lyapunov functiOns, invariant cone families
and stochastic properties of smooth dynamical systems, Ergodic Theory Dynamical
Systems 14, 757-786.
Katok, A. and Hasselblatt, B. (1994), Introduction to the Modern Theory of Smooth Dy-
namical Systems, Cambridge University Press, Cambridge, New York.
REFERENCFS 457

Katok, A. and Streicyn, J.-M. (1986), Invariant manifolds, entropy, and billiards; smooth
maps with singularities (Lecture Notes in Mathematics), vol. 1986, Springer-VerlatJ,
New York, Heidelberg, Berlin.
Kelley, A. (1967), The stable, center-stable, center center-unstable, and unstable mani-
folds (Appendix C), in Transversal Mappinp and Flows by R. Abraham and J. Robbin ..
Benjamin, Reading, MA.
K~, H. (1986, revised 1989), Differential and Difference Equations through Computer
Experiments, Springer-Verlag, New York, Heidelberg, Berlin.
Kupka, I. (1963), Contribution d la theorie des champs generiques, Contrib. to Dill. Eq. 2,
457-484.
Lanford, O. (1982), A computer-assisted proof of the Feigenbaum conjecture, Bull. Amer.
Math. Soc. 6, 427--434.
Lanford, O. (1984), A shorter proof of the existence of Feigenbaum fixed point, Commun.
Math. Phys. 96, 521-538.
Lanford, O. (1986), Computer-assisted proofs in analysis, Proc. Int. Congress of Math. 1,2,
1385-1394.
Lang, S. (1967), Introduction to Differentiable Manifolds, Interscience Publ. (Wiley &. Sons,
Inc.), New York/London/ Sydney.
Lang, S. (1968), Analysis I, Addison-Wesley Pub\. Co., New York &. Reading, MA.
Lefever, F. and Nicholls, C. (1971), Chemical instabilities and sustained oscillations, J.
Theoretical Biology 30, 267-284.
Levi, M. (1981), Qualitative analysis of the periodically forced relaxation oscillator, Mem-
oirs Amer. Math. Soc. 32.
Levinson, N. (1949), A second order differential equation with singular solutions, Ann.
Math. 50, 127-153.
Li, T. and Yorke, J. (1975), Period three implies chaos, Amer. Math. Monthly 82, 985-992.
Liao, S. T. (1979), An extension of the C' closing lemma, Acta Scientiarum Naturalium
Universitatis Pekinensls 3, 1-14. (Chinese)
Liao, S. T. (1980), Chinese Ann. Math. 1,9--30.
Liapunov, A. (1907), Probleme gene,lll de La stabilite du mouvement, Ann. Fac. Sci. Univ.
Toulouse 9, 203--475 [reproduced in Ann. Math. Study (17) Princeton (1947)].
Lichtenberg, A. and Lieberman, M. (1983), Regular and Stochastic Motion, Springer-VerlatJ,
New York, Heidelberg, Berlin.
Liverani, C. and Wojtkowski, M. (1993), Ergodicity in Hamiltonian systems, Dyuamies
Reported (to appear).
de la Llave, R. (1987), Invariants for smooth conjugacy of hyperbolic dynamical .y.tems
II, Commun. Math. Phys. 109, 369--378.
de la Llave, R., Marco, J. M., and Moriyon, R. (1986), Canonical perturbation theory of
Anosov systems and regularity results for the Livsic cohomology equation, Ann. Math.
123, 537~11.
Lorenz, E. N. (1963), Deterministic non-periodic flow, J. Atmos. Sci. 20, 13(}-141.
Mai, J. (1986), A simpler proof of c' closing lemma, Scientia Sinica (Series A) 29 (10),
1021-1031.
Mallet-Paret, J. and Yorke, J. (1982), Snakes: oriented families of periodic orbits, their
sources, sinks, and continuation, J. Dill. Eq. 43, 411).-450.
Mane R. (19788), Persistence manifolds are normally hyperbolic, Trans. Amer. Math. Soc.
246,261-283.
Mane R. (1978b), Contributions to the stability conjecture, Topology 17,383-396.
Mane R. (1987a), Ergodic Theory and Differentiable Dynamics, Springer-VerlatJ, New York,
Heidelberg, Berlin.
Mane R. (1987b), A proof of the C' stability conjecture, Pub\. Math. I.H.E.S. 66, 161-210.
458 REFERENCES

Manning. A. (1971). Axiom A diffeomorphisms have rational zeta junctions. Bull. London
Math. Soc. 3. 215-220.
Manning. A. (1974). There are no new Anosov diffeomorphisms on tori., Amer. J. Math.
96. 422-429.
Manning. A. and McCluskey. H. (1983). Hausdorff dimension for horseshoes. Ergodic Theory
Dynamical Systems 3, 251-261.
Marek. M. and Schreiber. I. (1991). Chaotic Behaviour of Deterministic Dissipative Sys-
tems. Cambridge University Press. Cambridge. New York.
Marsden. J. (1974). Elementary Classical Analysis. Freeman and Co .• New York.
Marsden. J. (1984). Chaos in dynamical systems by the Poincare-Melnikov-Amold meth-
od. in Nonlinear Dynamical Systems (edited by J. Chandra), SIAM. Philadelphia. pp. 19-
31.
Marsden. J. and McCracken. M. (1976). The Hopf Bifurcation and Its Applications. Springer-
Verlag. New York. Heidelberg. Berlin.
Margulis. G. A. (1969). On some applications of ergodic theory to the study of manifolds
of negative curvature. Funct. Analy. Appl. 3. 89-90.
May. R. (1975). Stability and Complexity in Model Ecosystems, 2nd edn .• Princeton Univ.
Press. Princeton. NJ.
McGehee. R. (1973). A stable manifold theorem for degenerate fixed points with applica-
tions to celestial mechanics. J. Diff. Eq. 14. 70-88.
Melnikov. V. K. (1963). On the stability of the center for time periodic perturbations.
Trans. Moscow Math. Soc. 12. 1-57.
de Melo. W. (1973). Structural stability of diffeomorphisms on two manifolds, Inven. Math.
21.233-246.
de Melo. W. (1989). Lectures on One-Dimensional Dynamics. Instituto de Matematica Pura
e Aplicada. Rio de Janeiro. Brazil.
de Melo. W. and Van Strien. S. J. (1993). One-Dimensional Dynamics. Springer-Verlag. New
York. Heidelberg. Berlin.
Meyer. K. (1987). An analytic proof of the shadowing lemma, Funkcialaj Ekvacioj 30.
127-133.
Meyer. K. and Hall. G. H. (1992). Introduction to Hamiltonian Dynamical Systems and
the N-Body Problem. Springer-Verlag. New York, Heidelberg. Berlin.
Meyer. K. and Sell. G. (1989). Trans. Amer. Math. Soc. 314, 63-105.
Milnor. J. (1985). On the concept of attract or. Comm. Math. Phys. 99.177-195.
Misiurewicz. M. (1981). Absolutely continuous measures for certain maps of an interval.
Publ. Math. I.H.E.S. 53. 17-51.
Mora. L. and Viana. M. (1993). Abundance of strange at tractors, Acta Math. 171, 1-71.
Moeer. J. (1962). On invariant curves of area preserving mappings of an annulus, Nachr.
Akad. Wiss. Gottingen. Math. Phys. KI. 1-20.
Moser. J. (1969). On a theorem of Anosov. J. Diff. Eq. 5. 411-440.
Moser. J. (1973). Stable and Random Motions in Dynamical Systems, Princeton University
Press. Princeton. NJ.
Munkres. J. (1975). Topology, A First Course. Prentice-Hall, Englewood Cliffs. NJ.
Naimark. J. (1967). Motions close to doubly asymptotic motions, Soviet Math. Dok!. 8,
228-231.
Newhouse. S. (1970a). Nondensity of Axiom A (a) on S2. Proc. Symp. in Pure Math., Amer.
Math. Soc. 14. 191-202.
Newhouse. S. (1970b). On codimension one Anosov diffeomorphisms. Amer. J. Math. 92.
761-770.
Newhouse. S. (1972). Hyperbolic limit sets. Trans. Amer. Math. Soc. 167. 125-150.
REFERENCES 459

Newhouse, S. (1979), The abundance of wild hyperbolic sets and non-smooth stable sets
for diffeomorphism, Pub!. Math. I.H.E.S. 50, 101-151.
Newhouse, S. (1980), Lectures on Dynamical Systems, in Dynamical Systems, Progress in
Math. #8, Birkhauser, Boston, Basel, Stuttgart, pp. 1-114.
Nitecki, Z. (1971), Differentiable Dynamics, MIT Press, Cambridge MA.
Nitecki, Z. (1971), On semi-stability for diffeomorphisms, Inven. Math. 14,83-122.
Oseledec, V. I. (1968), A multiplicative ergodic theorem. Liapunov characteristic numbers
for dynamical systems, Trans. Moscow Math. Soc. 19, 197-221.
Ott, E. (1993), Chaos in Dynamical Systems, Cambridge University Press, Cambridge, New
York.
Palis, J. (1968), On the local structure of hyperbolic fixed points in Banach space, Anais
Acad. Brasil Ciencais 40, 263-266.
Palis, J. and de Melo, W. (1982), Geometric Theory of Dynamical Systems, An Introduc-
tion, Springer-Verlag, New York, Heidelberg, Berlin.
Palis, J. and Smale, S. (1970), Structural Stability Theorems, Proc. Symp. in Pure Math.,
Amer. Math. Soc. 14, 223-232.
Palis, J. and Takens, F. (1993), Hyperbolicity and Sensitive Chaotic Dynamics at Homo-
clinic Bifurcations, Cambridge University Press, Cambridge, New York.
Palmer, K. (1984), Exponential dichotomies and transversal homoclinic points, J. Ditr. Eq.
55, 225-256.
Patterson, S. and Robinson, C. (1988), Basins for general nonlinear Henon attracting sets,
Proc. Amer. Math. Soc. 103, 615-623.
Peixoto, M. (1962), Structurally stability on two dimensional manifolds, Topology I, 101-
120.
Peixoto, M. (1966), On an appraximation theorem of Kupka and Smale, J. Ditr. Eq. 3,
214-227.
Perron, O. (1907), Uber Matrizen, Math. Ann. 64, 248-63.
Perron, O. (1929), Uber stabilitiit und asymptotishes Vemalten der Losungen eines Sys-
temes endlicher Differenzengleichungen, J. Reine Angew. Math. 161,41-64.
Pesin, Ja. (1976), Families of invariant manifolds corresponding to nonzero characteristic
exponents, Math. USSR-Izvestia 10, 1261-1305.
Pesin, Ja. (1977), Characteristic Lyapunov exponents and smooth ergodic theory, Russian
Math. Surveys 32, 55-114.
Pliss, V. A. (1971), Properties of solutions of a sequence of second-order periodic system
with small nonlinearities, Ditr. Eq. 7,501-508.
Pliss, V. A. (1972), A hypothesis due to Smale, Ditr. Eq. 8, 203-214.
Plykin, R. V. (1974), Sources and sinks for A-diffeomorphisms of surfaces, Mathematics
of the USSR, Sbomik 23, 233-253.
Pollicott, M. (1993), Lectures on Ergodic Theory and Pesin Theory on Compact Mani-
folds, London Math. Soc. Lecture Note Series No. 180, Cambridge University Press,
Cambridge, New York.
Prigongine,l. and Lefever, R. (1968), Symmetry breaking instabilities in dissipative systems
II, J. Chem. Phys. 48, 1695-1700.
Pugh, C. (1967a), The Closing Lemma, Amer. J. Math. 89, 956-1009.
Pugh, c. (1967b), An improved Closing Lemma and a General Density Theorem, Amer.
J. Math. 89, 1010-1021.
Pugh, C. (1969), On a theorem of Hartman, Amer. J. Math. 91, 363-367.
Pugh, C. and Robinson, C. (1983), The C' Closing Lemma, including Hamiltonians, Er-
godic Theory Dynamical Systems 3, 261-313.
Pugh, C. and Shub, M. (19708), Linearization of normally hyperbolic diffeomorphisms and
flows, Inven. Math. 10, 187-198.
460 REFERENCES

Pugh, C. and Shub, M. (1970b), O-stability for flows, Inven. Math. 11, 150-158.
Pugh, C. and Shub, M. (1989), Ergodic attractors, Transact. Amer. Math. Soc. 312, 1-54.
Rand, D. (1978), The topological classification of the Lorenz attractor, Math. Proc. Camb.
Phil. Soc. 83, 451-460.
Rasband, N. (1990), Chaotic Dynamics of Nonlinear Systems, Wiley-Interscience Publ.,
New York.
Riesz, Z. and Nagy, B. (1955), i'Unctional Analysis, Fredrick Ungar Publ. Co., New York.
Robbin, J. (1971), A structural stability theorem, Ann. Math. 94, 447-493.
Robinson, C. (1973). C r structural stability implies Kupka-Smale, in Dynamical Systems
(edited by M. M. Peixoto), Academic Press. New York, pp. 443-449.
Robinson, C. (1974), Structural stability of vector fields, Ann. Math. 99,154-174.
c
Robinson, C. (1975a), Structural stability of l flows. in Lecture Notes in Math., vol. 468,
Springer-Verlag, New York, Heidelberg, Berlin, pp. 262-277.
Robinson, C. (1975b), The geometry of the structural stability proof using unstable disks,
801. Soc. Brasil. Math. 6, 129-144.
Robinson, C. (1976a), Structural stability ofC I diJJeomorphisms, J. Dilf. Eq. 22, 28-73.
Robinson, C. (1976b), Structural stability theorems, in Dynamical Systems, Vol. II, Academic
Press, New York, pp. 33-36.
Robinson, C. (1977), Stability theorems and hyperbolicity in dynamical systems, Rocky
Mountain J. Math. 1,425-437.
Robinson. C. (1978), Introduction to the Closing Lemma, in The Structure of Attractors
in Dynamical Systems (edited by Markley, Martin, and Perrizo) Lecture Notes in Math.,
vol. 668. Springer-Verlag, New York. Heidelberg, Berlin, pp. 225-230.
Robinson, C. (1981), Differentiability of the stable foliation of the model Lorenz equations,
in Dynamical Systems and Thrbulence (edited by Rand and Young) Lecture Notes in
Math., vol. 898, Springer-Verlag, New York, Heidelberg, Berlin, pp. 302-315.
Robinson, C. (1983), Bifurcation to infinitely many sinks, Commun. Math. Phys. 90, 433-
459.
Robinson, C. (1984), 1hlnsitivity and invariant measures for geometric model of the
Lorenz equations, Ergodic Theory Dynamical Systems 4, 605-611.
Robinson, C. (1985), Phase analysis using the Poincare map, Nonlinear Anal. Theory Meth-
ods Appl. 9, 1159-1164.
Robinson, C. (1989), Homoclinic bifurcation to a transitive attractor of Lorenz type, Non-
linearity 2, 495-518.
Robinson, C. (1992). Homoclinic bifurcation to a transitive attractor of Lorenz type, 11,
SIAM J. Math. Anal. 23, 1255-1268.
Rudin, W. (1964), Principles of Mathematical Analysis (1964), McGraw-Hill, New York.
Ruelle, D. (1976), A measure associated with Axiom A attractor. Amer. J. Math. 98,
619--654.
Ruelle, D. (1979), Ergodic theory of differentiable dynamical systems, Publ. Math. I.H.E.S.
50,27-58.
Ruelle, D. (1989a). Chaotic Evolution and Strange Attractors, Cambridge University Press,
Cambridge, New York.
Ruelle, D. (1989b), Element8 of Differentiable Dynamics and Bifurcation Theory, Aca-
demic Press, New York.
Ruelle, D. and Takens. F. (1971), On the nature of turbulence. Comm. Math. Phys. 20,
167-192.
Rychlik, M. (1990), Lorenz attractors through Sil'nikov-type bifurcation, Part I, Ergodic
Theory Dynamical Systems 10, 793-822.
Rykken, E. (1993). Markov partitions and the expanding factor for pseudo-Anosov home-
omorphisms, Thesis, Northwestern University.
REFERENCES 461

Sacker, R. (1964), On invariant su.rfaces and bifurcation of periodic solutions of ordinary


differential equations, New York Univ. IMM-NYU 333.
Sacker, R. (1967), A Perturbation for invariant Riemannian manifolds, in Proc. Symp.
at Univ. of Puerto Rico (edited by J. Hale and J. LaSalle), Academic Press, New York,
pp.43-54.
Sharkovskii, A. N. (1964), Coexistence of cycles of a continuous map of a line into itself,
Ukrainian Math. J. 16,61-71.
Shub, M. (1987), Global Stability of Dynamical Systems, Springer-Verlag, New York, Hei-
delberg, Berlin.
Simon, C. (1972), Instability in Diffr(T3) and the non-genericity of rational zeta func-
tions, Trans. Amer. Math. Soc. 114,217-242.
Sinai, Ya. (1968), Markov partitions and C-diffeomorphisms, f\mc. Anal. AppJ. 2, 64-89.
Sinai, Ya. (1970), Dynamical systems with elastic reflections, Russ. Math. Surveys 25,
137-189.
Sinai, Ya. (1972), Gibbs measures in ergodic theory, RUBS. Math. Surveys 166, 21~9.
Sitnikov, K. (1960), Existence of oscillating motions for the three body problem, Ookl.
Akad. Nauk. 133, 303-306.
Smale, S. (1963), Stable manifolds for differential equations and diffeomorphisms, Ann.
Scuo1a Normale Piss 18, 97-116.
Smale, S. (1965), Diffeomorphisms with many periodic points, Differential and Combinato-
rial Topology, Princeton University Press, Princeton, NJ, 63-80.
Smale, S. (1966), Structurally stable systems are not dense, Amer. J. Math. 88, 491-496.
Smale, S. (1967), Differentiable dynamical systems, Bull. Amer. Math. Soc. 13,747-817.
Smale, S. (1970), The n-Stability Theorem, Proc. Symp. Pure Math. 14,289-297.
Smale, S. (1980), The Mathematics of Time: Essays on Dynamical Systems, Economic
Processes, and Related Topics, Springer-Verlag, New York, Heidelberg, Berlin.
Smith, K. (1971), Primer of Modem Analysis, Bogden & Quigley, Inc., Publishers, Tarrytown-
on-Hudson, New York, Belmont CA.
Snavely M. (1990), Markov Partitions for Hyperbolic Automorphisms of the Two-Dimen-
sional Torus, Thesis, Northwestern University.
Sotomayor, J. (1973a), Generic one parameter families of vector fields on two manifolds,
Pub!. Math. Inst. Hautes Etudes Scientifiques 43, 5-46.
Sotomayor, J. (1973b), Generic bifurcations of dynamical II/Items, in Dynamical Systems
(edited by M. Peixoto), Academic Press, New York, pp. 561-582.
Sparrow, C. (1982), The Lorenz Equations: Bifurcations, Chaos, and Strange Attroctors,
Springer-Verlag, New York, Heidelberg, Berlin.
Stefan, P. (1977), A theorem 01 Sarkovskii on the existence olperiodic orbits of continuous
endomorphisms 01 the real line, Comm. Math. Phys. 54, 237-248.
Sternberg, S. (1958), On the structure of local homeomorphisms 01 Euclidean n-space, II,
Amer. J. Math. 81, 623-631.
van Strien, S. (1979), Center manilolds are not Coo, Math. Z. 166, 143-145.
van Strien, S. (1981), On the bifurcation creating horseshoes, in Dynamical Systems and
Thrbulence (edited by Rand and Young) Lecture Notes in Math., vol. 898, Springer-Verlag,
New York, Heidelberg, Berlin, pp. 316-351.
Strogatz, S. (1994), Nonlinear Dynamics and Chaos, Addison-Wesley Pub!. Co., New York
& Reading, MA.
Takens, F. (1974), Singularities 01 vector fields, Publ. Math. Inst. Hautes Etudes Scientiflquee
43,47-100.
Takens, F. (1975), Tolerance stability, in Dynamical Systems - Warwick 1974, Lecture Notes
in Math., vo!. 468, Springer-Verlag, New York, Heidelberg, Berlin, pp. 293-304.
462 REFERENCES

Takens, F. (1988), Limit capacity and Hausdorff dimension of dynamically defined Cantor
sets, Lecture Notes in Math. (Dynamical Systems) 1331, Springer-Verlag, New York,
Heidelberg, Berlin, 196-212.
Thorn, R. (1973), Stabilite Structurelle et Morphogenese, Addison-Wesley Publ. Co., New
York and Reading, MA; English (1975).
Ushiki, S., Oka, H., and Kokubu, H. (1984), Existence of strange attractors in the unfolding
of a degenerate singularity of a translation invariant vector field, C. R. Acad. Ac.
Paris, Ser. I Math. 298, 39-42.
Vinograd, R. E. (1957), The inadequacy of the method of characteristic exponents for the
study of nonlinear differential equations, Mat. Sbornik 41, 431-438.
Walters, P. (1970), Anosov diffeomorphisms are topologically stable, Topology 9, 71-78.
Walters, P. (1982), An Introduction to Ergodic Theory, Springer-Verlag, New York, Heidel-
berg, Berlin.
Waltman, P. (1983), Competition Models in Population Biology, Society for Industrial and
Applied Mathematics, Philadelphia, PA.
Wen, L. (1991), The C· closing lemma for non-singular endomorphisms, Ergodic Theory
Dynamical Systems 11, 393-412.
Wen, L. (1992), Anosov endomorphisms on branched surfaces, J. Complexity 8, 239-264.
Wiggins, S. (1988), Global Bifurcation and Chaos, Springer-Verlag, New York, Heidelberg,
Berlin.
Wiggins, S. (1990), Introduction to Applied Nonlinear Dynamical Systems and Chaos,
Springer-Verlag, New York, Heidelberg, Berlin.
Williams, R. (1967), One-dimensional nonwandering sets, Topology 6, 473-487.
Williams, R. (1968), The zeta function of an attractor, in Conference on the Topology of
Manifolds (J. C. Hocking, ed.), Prindle Weber and Smith, Boston MA, pp. 155-161.
Williams, R. (19708), The 'DA' maps of Smale and structural stability, Global Analysis,
Proc. Sympos. Pure Math., Amer. Math. Soc. 14, 329-334.
Williams, R. (1970b), The zeta function in global analysis, Global Analysis, Proc. Sympos.
Pure Math., Amer. Math. Soc. 14, 335-340.
Williams, R. (1974), Expanding attractors, Publ. Math. I.H.E.S. 43, 169-203.
Williams, R. (1977), The structure of Lorenz attractors, in Thrbulence Seminar (edited by
P. Bernard) Lect. Notes in Math., vol. 615, Springer-Verlag, New York, Heidelberg, Berlin.
Williams, R. (1979), The bifurcation space of the Lorenz attractor, in Bifurcation Theory
and Its Applications in Scientific Disciplines, (edited by O. Gurel and o. E. ROesler) Ann.
NY Acad. Sci., vol. 316, pp. 393-399.
Williams, R. (1980), Structure of Lorenz attractors, Publ. Math. I.H.E.S. 50,59-72.
Williams, R. (1983), Lorenz knots are prime, Ergodic Theory Dynamical Systems 4,147-163.
Wilson. W. (1967), The structure of the level surfaces of a Lyapunov function, J. Dift'. Eq.
3,323-329.
Wojtkowski, M. (1985), Invariant families of cones and Lyapunov exponents, Ergodic The-
ory Dynamical Systems 5, 145-161.
Yomdin, Y. (1987), Volume growth and entropy, Israel J. Math. 57, 285-318.
Yorke, J. and Alligood, K. (1985), Period doubling cascades of attractors: a prerequisite
for horseshoes, Commun. Math. Phys. 101,305-321.
Young, L. S. (1982), Dimension, entropy, and Lyapunov exponents, Ergodic Theory Dy-
namical Systems 2, 109-124.
Young, L. S. (1993), Ergodic theory of chaotic dynamical systems, Lecture Notes in Math.
(From Topology to Computation: Proceedings of the Smalefest, editors Hirsch, Marsden,
and Shub), Springer-Verlag, New York, Heidelberg, Berlin.
Zeeman, E. C. (1980), Population dynamics from game theory, Lecture Notes in Math. 819,
Springer-Verlag, New York, Heidelberg, Berlin, 471-497.
Index

A-graph 66 bounded continuous maps 158


a Cantor set 26 bounded linear map 158
a-limit point 23, 146 box dimension 358
a-limit set 23, 146 branched manifold 303, 305, 315
a-recurrent 25, 148 Brusselator 233
accessible points 275 bump functions 163
adapted metric 181 bundle map 433
adapted norm 108, 116, 151, 156, 181,
241 CO-sup-norm 141
adding machine 89 C l 15,132
adjacency matrix 248 or 15, 134
admissible metric 434 C r Section Theorem 436
affine conjugacy 41 or structurally stable 238
Andronov-Hopf Bifurcation for diffeo- or topology 238
morphisrns 223 or topology on vector fields 240
Andronov-Hopf bifurcation for flows or-conjugacy 41
226 or-diffeomorphism 134
Ano80v Closing Lemma 381 or -distance between functions 44
Ano80v diffeomorphism 276 or(M,N) 237
asymptotic 16 Cantor set 26
asymptotically stable 16, 106, 156 capacity dimension 358
attracting 16, 106, 110, 149, 156, 167 center eigenspace 111, 120
attracting fixed point 116 center 104, 149
attracting set 292, 368 center manifold 198
attractor 292 center-stable manifold 197
attractor of Lorenz type 318 center-unstable manifold 197
Axiom A 382 chain components 26, 149,368
chain equivalent 26
backward orbit 16 chain limit set 25, 368
Baker's map 59 chain recurrent set 25, 149, 368
Baker's map, generalized 363 chain transitive 26
base space 433 chaos 83, 355
basic sets 386 chaotic attractor 292
basin boundary 275 chaotic map 83
basin of attraction 149, 156, 204, 275 characteristic multipliers 167
Benedicks and Carleson 308 closed orbit 146, 165
Birkhoff Ergodic Theorem 86, 246, 353 codimension 417
Birkhoff Transitivity Theorem 245 cones 185, 186, 256, 297
Borel measure 245 conjugacy 40
Borel probability measure 352 conjugate flows 113
Borel sets 352 conjugate maps 40

41>1
464 INDEX

Conley's Fundamental Theorem 368 equivalent flows 113


continuation of a fixed point 157 ergodic 246, 352
Continuity of solutions with respect to eventually periodic 15
initial conditions 143 eventually positive matrix 74, 123
continuously differentiable 15, 132, 134 Existence of Solutions of Differential
Cont(n, R) 121 Equations 141
contracting linear map 116 expanding attractor 293
contraction mapping 139 expanding linear map 116
Contraction Mapping Theorem 139 expansion 110
coordinate chart 235 expansive 83, 381
covering estimate 138
critical point 13 I-covers 64
cross section to a flow 166 family of quadratic maps 20
cycle 321, 388 Feigenbaum constant 80
fiber contraction 433
Diffr(M) 237 fiber 433
Diffr(Rk) 134 filtration 404
DA-diffeomorphism 300 first ret urn map 166
Denjoy's Example 55 First Variation Equation 145
Denjoy's Theorem 55 Fix(f) 15
derivative 132 fixed point 15, 146, 156
diffeomorphism 15, 134, 237 flip bifurcation 219
differentiable 132, 134 Floquet theory 200
differentiable linearization 154, 157 Flow Box Coordinates 203
differentiable map between manifolds flow conjugate 203
237 flow 140
differentiably conjugacy 41 flow equivalent 113, 203
differentiably conjugate 332 flow expansiveness 402
differential equation on a manifold 240 flow of a vector field 240
dimension, box 358 focus 105
dimension, capacity 358 forward orbit 16
dimension, Hausdorff 361 Frechet derivative 132
dimension, topological 293 full n-shift 247
disk bundle disk 434 fundamental domain 44, 45, 46, 48,
double of a map 71 114, 117,406
doubling map (squaring map) 42, 335 fundamental matrix solution 100
dynamically defined 362
generalized Baker's map 363
Ee 111, 156 generalized eigenvector 94
E' Ill, 156 generalized tent map 58
EU 111, 156 geometric horseshoe 249
E~ 241, 354 geometric model of Lorenz equations
E; 241, 354 312
e-chain 25, 149 global stable manifold 183
f-shadow 60, 377 global stable manifolds of points 243
edge subshift 248 global unstable manifold 183
eigenvalues of a periodic orbit 167 global unstable manifolds of points 243
embedded submanifold 236 gradient flow 323
embedding 236 gradient vector field 323, 368
entropy, topological 334 gradient-like 324
equilibrium 146 gradient-like vector fields 368
INDEX 465

graph for the partition 66 itinerary map 38


graph transform 183, 434
Gronwall's inequality 142 Jakobson 81, 309
Jordan canonical form 94
Hamiltonian differential equations 269
Hamiltonian vector field 269 k-cycle 321, 388
Hartman-Grobman Theorem 152, 157, knots 318
170
Hausdorff dimension 361 L(Rk,Rn) 94
Hausdorff metric 416 Lambda Lemma 200
Henon attractor 306 least period 15
Henon map 6, 231, 255, 306 Lefschetz Fixed Point Theorem 289
Holder 133 Lefschetz index of a fixed point 289
heteroclinkally related 383 Lefschetz number of a diffeomorphism
higher derivatives 134 289
homeomorphism 15 Liapunov exponent 86
homoclinic point 259 Liapunov exponents 84, 352
homoclinically related 383 Liapunov function 155, 324, 368
homology zeta function 289 Liapunov stable 16, 106, 156
horseshoe 6, 249, 259 Lienard equations 171, 172
h-related 383 lift 49
hyperbolic 111,120,149,157,167,181 limit cycle 147,179
hyperbolic estimates 188 limit set 383
hyperbolic fixed point 149 linear conjugacy 41
hyperbolic invariant set 241, 244 linear contraction 116
hyperbolic linear differential equation linear expansion 116
III linear sink 116
hyperbolic linear map 120, 181 linear source 116
hyperbolic periodic point 157 Liouville's formula 78,101,177,207
hyperbolic splitting III Lip(f) 132
hyperbolic structure 241, 244 Lipschitz 132
hyperbolic toral automorphisms 276 local product structure 282, 385, 388
local stable manifold 181
immersed submanifold 236 local unstable manifold 181
immersion 236 locally eventually onto 316
Implicit Function Theorem 135, 136 Lorenz attractor 312
In Phase Theorem 380 Lorenz equations 8, 309
Inclination Lemma 200 L-stable 16
index of fixed point 325
Intermediate Value Theorem 14 m(A) 94, 138, 158
invariant measure 81, 352 manifold 235
invariant section 433 Markov partition 283, 389
Invariant Section Theorem 434 Markov property for ambient boxes 265
invariant set 23 maximal interval of definition of the
Inverse Function Theorem 138 solution 143
inverse limit 299 maximal invariant set 260, 378
irreducible 247 Mean Value Theorem 14
irreducible matrix 74, 123 measure 352
isoclines 173, 205 Melnikov function 271
isolated invariant set 262, 378 Melnikov method 7, 271
isolating neighborhood 368, 378 middle-a Cantor set 26
466 INDEX

middle-third Cantor 28 Perh(f) 414


minimal set 24 Perh(k, f) 414
minimum norm 94, 138, 158, 187 Per(k, f) 414
mixing 39 Per(n,f) 15
monotonically decreasing 57 perfect 26
monotonically increasing 57 period doubling bifurcation 219
Morse inequalities 325 period doubling route to chaos 79
Morse-Smale diffeomorphism or flow period IS, 146, 156
319 periodic attractor 167
Morse-Smale flow 322 periodic orbit IS, 146, 165
Multiplicative Ergodic Theorem 352 periodic point IS, 146, 156, 165
periodic repeller 167
N30 periodic sink 167
(n, t )-separated 334 periodic source 167
(n, t)-spanning set 343 Perron method for stable manifolds 183
negatively invariant set 23 Perron-Frobenius Theorem 123, 341
no cycle property 388 persistence of a fixed point 157
node 105 phase portrait 2, 102, 148
nondegenerate critical point 13, 323 phase space 102
nonhomogeneous linear equations 115 pitchfork bifurcation 231
nonnegative matrix 123 Plykin attractor 304
nonuniformly hyperbolic structure 351, Poincare map 166
354 Poincare metric 35
nonwandering 149 Poincare norm 35
nonwandering point 25 Poincare Recurrence Theorem 326
nonwandering set 25 Poincare-Bendixson Theorem 179
normal bundle 445 positive matrix 74, 123
normal bundle of a closed orbit 170 positively invariant set 23
normally contracting 445 predator-prey 178
normally hyperbolic 445 proper 163
nowhere dense 26
n-shift 37, 247 quadratic map 20
n-sphere 236
'R-stable 404
w-limit point 23, 146 p-distance 33
w-limit set 23, 146 r-continuously differentiable 134
w-recurrent 25, 148 rectangle 282, 389
O-Stability Theorem 404 repelling 16, 1l0, 149, 157, 167
O-stable 47, 404 repelling fixed point 116
one-sided shift 37 repelling set 368
openness of Anosov diffeomorphisms residual subset 245
399 Riemannian metric 240
operator norm 94, 158 Riemannian norm 240
orbit 16, 144 rigid rotation of 8 1 50
orbitally Liapunov stable 169 rooftop map 42
orientation preserving, reversing 117 rotation number 49
overflowing 433 rotation of 8 1 50

partial ordering of basic sets 388 saddle Ill, 149, 157


past history 181 saddle linear differential equation III
Per(f) 414 Saddle-Node Bifurcation 212
INDEX 467

Schwarz Lemma 34 structural stability of Anosov flows 401


second derivative 133 structural stability of hyperbolic toral
sectors 185 automorphisms 276
semi-conjugacy 40 structural stability of Morse-Smale dif-
semi-stability of Anosov Diffeomor- feomorphisms 321
ph isms 398 Structural Stability Theorem 406
semi-stable 398 structurally stable 9, 44, 238
sensitive dependence on initial condi- subadditive 61
tions 82, 381 subshift 73
separated set 334 subshift of finite type 73, 248, 254
shadow 60, 377 subshifts of finite type 247
Sharkovskii ordering 65 sup-norm 94, 141, 158
Sharkovskii's Theorem 66 support 421
shift space 37, 247, 283 surface 236
Sinai-Ruelle-Bowen measure 356 suspension of a map 171
sink 16, 110, 149, 157, 167 symbol space 37
Smale horseshoe 249, 259 symbolic dynamics 4, 37, 247, 253, 283
solenoid 294
source 16, 110, 149, 157, 167 tangent bundle 238
spanning set 343 tangent space 238
Spectral Decomposition Theorem 385 tangent vector 238
spectrum of a linear map 181 Taylor's expansion 134
sphere 236 template 318
spiral 105 tent map 42,57,58,60
squaring map (doubling map) 335 topological conjugacy 38, 40
SRB measure 356 topological dimension 293
stability of an hyperbolic invariant set topological entropy 84, 334
400 topologically conjugate flows 113
stable bundle of a periodic orbit 200 topologically equivalent flows 113
stable cones 186 topologically mixing 39, 266
stable disk 185 topologically transitive 6, 39, 244
stable eigenspace 111, 120 toral Anosov diffeomorphism 276
stable focus 105 torus 236
stable manifold 183 totally disconnected 26
stable manifold of an invariant set 380 transition matrix 73, 248, 283
Stable Manifold Theorem 182, 197 transitive 39, 244
Stable Manifold Theorem for a hyper- transversal to a flow 166
bolic set 242, 374 transversality condition 405
stable node 105 transverse 200, 319
stable set 16 transverse homoclinic point 259
stable spiral 105 trapping region 292, 368
stable subbundle 241 tubular neighborhood 447
stable subspaces 181 two shift 37
stair step method 18 two sided shift space 247
Stefan cycle 67
Stefan graph 67 uniform hyperbolic structure 241
Sternberg linearization 154, 157 Uniqueness of Solutions of Differential
strongly gradient-like 368 Equations 141
structural stability 44, 238 unstable cones 186
structural stability of Anosov diffeo- unstable disk 185
morphisms 399 unstable eigenspace 111, 120

You might also like