Henk Bruin - Topological and Ergodic Theory of Symbolic Dynamics
GRADUATE STUDIES IN MATHEMATICS 228
EDITORIAL COMMITTEE
Matthew Baker
Marco Gualtieri
Gigliola Staffilani (Chair)
Jeff A. Viaclovsky
Rachel Ward
Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting
for them, are permitted to make fair use of the material, such as to copy select pages for use
in teaching or research. Permission is granted to quote brief passages from this publication in
reviews, provided the customary acknowledgment of the source is given.
Republication, systematic copying, or multiple reproduction of any material in this publication
is permitted only under license from the American Mathematical Society. Requests for permission
to reuse portions of AMS publication content are handled by the Copyright Clearance Center. For
more information, please visit www.ams.org/publications/pubpermissions.
Send requests for translation rights and licensed reprints to reprint-permission@ams.org.
© 2022 by the author. All rights reserved.
Printed in the United States of America.
∞ The paper used in this book is acid-free and falls within the guidelines
established to ensure permanence and durability.
Visit the AMS home page at https://www.ams.org/
Preface
The scope and aim of the book: Although the text grew from two courses
I gave on symbolic dynamics at the University of Vienna, there is no attempt
to shape it as a textbook for a course on symbolic dynamics. The material
is too diverse, hardly balanced in depth and detail, and I have made no at-
tempt to indicate which sections together would constitute such a course. My
experience is that hardly any book, however well-conceived and written, com-
pletely fits the purpose and taste of the lecturers teaching the actual class.
Instead, I hope this book can serve a purpose for topics courses or reading
courses or as a reference book for anyone wishing to acquaint him/herself with
a particular topic.
Also the exercises are not devised as a testing tool of the students’ un-
derstanding of the material. Symbolic dynamics is at the intersection of
dynamical systems, topological dynamics, combinatorics, and of course cod-
ing theory, and there are a lot of trivia to share. Some of these trivia are
disguised as exercises. Most of the exercises have solutions in the back of
the book. I have given, probably multiple times, proofs of simple results
that are used as exercises in comparable books, but I wanted to avoid the
annoying situation that you cannot refer to a well-known result because it is
only presented as an exercise without solution.
The necessary background for the book varies: for most of it a solid
knowledge of real analysis and linear algebra and first courses in probability
and measure theory (a few times, conditional expectations are used, and
martingales in the proof of Theorem 6.118), metric spaces, number theory,
topology, and set theory suffice. Chapter 6 is not meant as an introduction
to ergodic theory, so a course in ergodic theory and in Hilbert spaces and
Fourier analysis for Section 6.8 are probably necessary. Section 8.1 uses some
Galois theory.
By adding an extensive index and cross-references within the main text,
I tried to enable the reader to study the chapters independently. However,
readers without any prior knowledge of symbolic dynamics should not skip
(the first halves of) Chapters 1 and 2. To follow Chapter 5 one should have a
good understanding of SFTs (Section 3.1), substitution shifts (Section 4.2),
and Sturmian shifts (Section 4.3). To follow Chapter 6, an additional un-
derstanding of BV-systems (Section 5.4) is required. Chapter 8 can be read
largely independently of the rest, except for several examples with direct
references to earlier parts in the text.
The book is largely, but not entirely, self-contained. That would go
too far, because various topics are covered much better in other textbooks.
General books on ergodic theory are [165, 277, 305, 346, 408, 456, 479, 509,
551]; for proofs of Birkhoff’s Ergodic Theorem we suggest [341, 346, 349,
551], and for the proof of the Variational Principle we recommend [551]. In
topological dynamics the book by Auslander is a classic (although difficult to
navigate). Other texts are [12,22,198,199,284,381]. In symbolic dynamics
there are Kitchens [364] and Lind & Marcus [398], both specializing in
SFTs, and the more general book by Kůrka [381], and topic collections
by Blanchard et al. [85] and Bedford et al. [57]. Substitution shifts have
the expert monographs Queffélec [465], “Phyteas Fogg” [249], and other
groups of authors [68]. I should not fail to mention Viana’s monograph [548]
on interval exchange transformations. Continued fractions and Diophantine
approximation are the subject of monographs [98,175,360,377]. For general
dynamical systems there are [17, 113, 346, 474], and [22, 87, 116, 164, 414,
462] for one-dimensional dynamics in particular.
Various milestones in the theory have been treated extensively by people
far more expert and well-placed than I am. For example, this holds for
Williams’s Theorem on conjugacy in subshifts of finite type; see [364, 398].
This also holds for Ornstein’s Theorem [438] that entropy is a complete
invariant among invertible Bernoulli shifts, which we only briefly introduce
in Section 6.5 because the full proof is beyond the scope of this book. We refer
to [208,209,216,352,456] for further developments, new (more conceptual)
proofs, and more detailed expositions.
How much detail is given for each topic relied on my own taste and
judgment, and if I overstretched the reader and/or my own abilities and
understanding, then so be it. My apologies to the true experts. Thus I
decided to include Li-Yorke chaos but not distributional chaos, some version
of entropy, but no dynamical ζ-functions, no IP-sets, only a few variations
of shadowing and specification (see [461] for a monograph on shadowing),
no higher-dimensional shifts, and no automorphism groups of shift spaces.
Although the first use of symbolic dynamics by Hadamard ([296] in 1898)
was for geodesic flows on modular surfaces, this topic does not appear in
this book; see e.g. [494]. We cover Kakutani-Rokhlin partitions, cutting
and stacking, S-adic transformations, and Bratteli-Vershik systems, but no graph
covers, although they describe the Cantor system in even further generality.
See [273, 500–502] for their constructions of some of the earliest uses.
Partly, this book is a compendium of bits of knowledge and curiosities
that are scattered over the literature if not on Wikipedia pages. The material
that I present is not equally up to date. For instance, notions such as B-free
shifts (or at least the current state) and amorphic complexity are from the
past decade or even less, whereas in the section on SFTs, the material is all
from before 1990. Topics that are to my knowledge new to textbook and
monograph literature include gap shifts, spacing shifts, power-free shifts,
B-free shifts, amorphic complexity, enumeration systems. The breadth of
topics allows one to see similarities of methods of proof in different subfields
of symbolic dynamics. I hope that my extensive treatment of Bratteli-Vershik
systems and unique ergodicity, as well as my treatment of Sturmian shifts
and Rauzy fractals (redoing and sometimes reproving results of Arnoux),
have some added value. The sections on weak mixing for Bratteli-Vershik
That is, for a right-infinite sequence, the first symbol disappears, and all
other symbols move a place to the left. For a bi-infinite sequence, the dot
that indicates position zero moves one place to the right. A closed σ-invariant
subset of sequences over some fixed set of symbols (the alphabet), combined
with this left-shift operation σ, is called a subshift. In this chapter, we give
the basic notions and examples of subshifts and discuss the number and
frequency of their subwords.
1. First Examples and General Properties of Subshifts
1 We will rather not use this word, because of possible confusion with the factor of a subshift
Shift spaces with product topology are metrizable. One of the usual³
metrics that generates the product topology is
$$d(x, y) = \begin{cases} 0 & \text{if } x = y, \text{ or} \\ 2^{-m} & \text{for } m = \sup\{n \ge 0 : x_i = y_i \text{ for all } |i| < n\}; \end{cases} \tag{1.1}$$
so in particular $d(x, y) = 1$ if $x_0 \ne y_0$, and $\operatorname{diam}(\Sigma) = 1$. If $(x^k)_{k \in \mathbb{N}}$ is a
sequence of sequences such that $x^k \to x$, then for every $m$ there is $k_0 \in \mathbb{N}$ such that
$d(x^k, x) < 2^{-m}$ for every $k \ge k_0$. The definition of the metric $d$ implies
that $x^k_i = x_i$ for all $|i| \le m$. In other words, $x^k \to x$ means that $x^k_{[a,b]}$ is
eventually equal to $x_{[a,b]}$ on every finite window $[a, b]$.
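The shift map or left-shift $\sigma : \Sigma \to \Sigma$, defined as
$$\sigma(x)_i = x_{i+1}, \quad \text{for } i \in \mathbb{N} \text{ or } \mathbb{Z},$$
is invertible on $\mathcal{A}^{\mathbb{Z}}$ (with inverse $\sigma^{-1}(x)_i = x_{i-1}$) but non-invertible on $\mathcal{A}^{\mathbb{N}}$.
We can use the ε-δ definition of continuity for $\delta = \varepsilon/2$ to show that σ is
uniformly continuous. This is even true if $\#\mathcal{A} = \infty$.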
Definition 1.2. A pair $(X, \sigma)$ with $X \subset \Sigma$ and σ the left-shift is a subshift
(often called simply shift) if X is closed (in product topology) and strongly
shift-invariant; i.e. $\sigma(X) = X$. If σ is invertible, then we also stipulate that
$\sigma^{-1}(X) = X$. For example, if $\Sigma = \{0, 1\}^{\mathbb{Z}}$ and
$$x = \dots 000.111111 \dots,$$
then $X = \{\sigma^n(x) : n \ge 0\}$ is not a subshift, because $x \in X$ but $\sigma^{-1}(x) \notin X$.
In Examples 1.3–1.6, we use A = {0, 1}.
Example 1.3. The set $X = \{x \in \Sigma : x_i = 1 \Rightarrow x_{i+1} = 0\}$ is called the
Fibonacci shift⁴. It disallows sequences with two consecutive 1's. This
Fibonacci shift is an example of a subshift of finite type (SFT); see Section 3.1.
The collection X can be represented by a graph in multiple ways:
• X is the collection of labels of infinite paths through the vertex-
labeled graph in Figure 1.1 (left). Labels are given to the vertices
of the graph, and no label is repeated.
• X is the collection of labels of infinite paths through the edge-
labeled graph in Figure 1.1 (right). Labels are given to the arrows
of the graph, and labels can be repeated (different arrows with the
same label can occur).
3 Other metrics are $d'(x, y) = \sum_i |x_i - y_i| 2^{-|i|}$ or $d''(x, y) = \frac{1}{m}$ with $m$ as in (1.1). They are
equivalent to $d(x, y)$: the former in the sense that there is some $C$ such that
$\frac{1}{C} d(x, y) \le d'(x, y) \le C d(x, y)$ for all $x, y \in \Sigma$, the latter in the weaker sense that
the embedding $i : (\Sigma, d'') \to (\Sigma, d)$ as well as its inverse $i^{-1}$ are uniformly continuous.
In either case, they generate the same topology.
4 Warning: There is also a Fibonacci substitution shift = Fibonacci Sturmian shift (see Ex-
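[Figure 1.1. A vertex-labeled graph (left) and an edge-labeled graph (right) representing the Fibonacci shift.]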
Example 1.4. $X_{even} \subset \{0, 1\}^{\mathbb{N}}$ is the collection of infinite sequences in which
the 1's appear only in blocks of even length and also $1111 \cdots \in X$. We call
$X_{even}$ the even shift. Similarly, the odd shift $X_{odd}$ is the collection of
infinite sequences in which the 0's appear only in blocks of odd length and
also $0000 \cdots \in X$; see Figure 1.2.
Figure 1.2. Edge-labeled graphs for $X_{odd}$, $X_{even}$, and $X_{odd} \cap X_{even}$.
1.2. Word-Complexity
Definition 1.8. Given a subshift X, the collection
$$\mathcal{L}(X) = \{\text{words of any finite length in } X\}$$
is called the language of X. We use the notation $\mathcal{L}_n(X)$ for all the words
in the language of length n.
Definition 1.9. The function $p := p_X : \mathbb{N} \to \mathbb{N}$ defined by $p(n) = \#\mathcal{L}_n(X)$
is called the word-complexity of X.
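As an illustration of Definition 1.9, the following minimal sketch counts the words of the Fibonacci shift of Example 1.3 by brute force. It assumes (as is standard for this SFT) that every finite 0-1 word without the factor 11 occurs in X; the function names are hypothetical and chosen only for this example.

```python
from itertools import product

def fibonacci_language(n):
    """All 0-1 words of length n that contain no two consecutive 1's."""
    words = ("".join(w) for w in product("01", repeat=n))
    return [w for w in words if "11" not in w]

def word_complexity(n):
    """p(n) = number of words of length n in the language."""
    return len(fibonacci_language(n))

values = [word_complexity(n) for n in range(1, 11)]
print(values)
```

The counts satisfy the Fibonacci recursion p(n+1) = p(n) + p(n-1), which is where the shift gets its name.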
Example 1.10. For the Fibonacci SFT of Example 1.3, let
$$F_n = \#\{w \in \mathcal{L}_n(X) : w_n = 0\}.$$
Then $F_1 = 1$, $F_2 = 2$, and $F_{n+1} = F_n + F_{n-1}$ for $n \ge 3$ because $F_n$
is the cardinality of the set of $(n+1)$-words ending in 00 and $F_{n-1}$ is the
5 After the Norwegian mathematician Axel Thue (1863–1922) and the American Marston
Morse (1892–1977), but the corresponding sequence was used before by the French mathematician
Eugène Prouhet (1817–1867), a student of Sturm.
Clearly
p(n + 1) − p(n) = #{left-special words of length n}
= #{right-special words of length n}.
The following result goes back to Morse & Hedlund [425].
Proposition 1.12. If the word-complexity of a subshift (X, σ) satisfies
p(n) ≤ n for some n, then (X, σ) consists of finitely many periodic sequences.
constant C such that p(n) ≤ Cn. Sturmian sequences (see Section 4.3)
have p(n) = n + 1; in fact all recurrent words with this word-complexity are
Sturmian. There are further possibilities for non-recurrent subshifts. The
sequences
x = . . . 000.10000 . . . and y = 00001111.00000 . . .
both have p(n) = n+1. They are not uniformly recurrent, but asymptotically
fixed for $n \to \pm\infty$. Ormes & Pavlov [435, Theorems 1.2 & 1.3] showed
that for non-recurrent shifts $(X, \sigma)$ that are not asymptotically periodic in
both directions, $\liminf_n p(n)/n \ge \frac{3}{2}$ and that this bound is sharp, as is
demonstrated by
$$z = 0000.1\,0^{n_0}\,1\,0^{n_1}\,1\,0^{n_2}\,1\,0^{n_3}\,1 \dots$$
for a carefully chosen increasing sequence of gaps $(n_i)_{i \ge 1}$. In fact, given
any non-decreasing function $g : \mathbb{N} \to \mathbb{N}$ that tends to infinity, there is $x \in X$
such that $p_{\{x\}}(n) := \#\{w \text{ is subword of } x : |w| = n\} < \frac{3}{2} n + g(n)$. In further
detail, if a transitive⁶ shift $(X, \sigma)$ with a recurrent point contains $m$ minimal
subsystems, of which $m_\infty$ are infinite, then
$$\limsup_{n \to \infty} p_X(n) - (m + m_\infty + 1)n = \infty, \qquad \liminf_{n \to \infty} p_X(n) - (m + m_\infty)n = \infty,$$
and these bounds are sharp. The second estimate holds also without the
existence of a recurrent point. See [230], specifically Theorems 1.2 and 1.3.
Symbolic spaces associated with interval exchange transformations on
$d$ intervals have $p(n) = (d - 1)n + 1$; see Proposition 4.80. The Chacon
substitution shift and primitive Chacon substitution shift (see Example 1.27)
have word-complexity $p(n) = 2n - 1$ (for $n \ge 2$) and $p(n) = 2n + 1$, respectively; see [243].
For many subshifts, $p_X(n)/n$ is bounded in $n$ but hard to compute exactly;
often $\lim_n p(n)/n$ doesn't exist. For instance, the word-complexity of the
Thue-Morse shift (i.e. the closure $\overline{\{\sigma^n(\rho_{TM}) : n \in \mathbb{N}_0\}}$ of Example 1.6) is
$$p(n) = \begin{cases} 3 \cdot 2^m + 4r & \text{if } 0 \le r < 2^{m-1}, \\ 4 \cdot 2^m + 2r & \text{if } 2^{m-1} \le r < 2^m, \end{cases} \tag{1.2}$$
where $n = 2^m + r + 1$; see [115, 406]. In [129], the word-complexity of
certain (Fibonacci-like) unimodal restrictions to the critical ω-limit set is
computed.
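Formula (1.2) can be checked numerically. The sketch below is only an illustration: it assumes the usual description of the Thue-Morse sequence (the i-th symbol is the parity of the number of 1's in the binary expansion of i), it counts subwords in a finite prefix that is assumed to be long enough to contain every word of the lengths tested, and it applies the formula only for n ≥ 3, where m ≥ 1. All function names are hypothetical.

```python
def thue_morse_prefix(length):
    """Prefix of the Thue-Morse sequence: symbol i is the parity of the binary digit sum of i."""
    return "".join(str(bin(i).count("1") % 2) for i in range(length))

def count_subwords(x, n):
    """Number of distinct subwords of length n occurring in the finite word x."""
    return len({x[i:i + n] for i in range(len(x) - n + 1)})

def formula(n):
    """Right-hand side of (1.2) for n >= 3, writing n = 2^m + r + 1 with 0 <= r < 2^m."""
    m = (n - 1).bit_length() - 1
    r = n - 1 - 2 ** m
    if r < 2 ** (m - 1):
        return 3 * 2 ** m + 4 * r
    return 4 * 2 ** m + 2 * r

rho = thue_morse_prefix(4096)
for n in range(3, 20):
    print(n, count_subwords(rho, n), formula(n))
```

The ratio p(n)/n oscillates between 3 and 10/3 along this sequence, which is why the limit of p(n)/n does not exist here.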
The following curious result is due to Heinis; see [150, 311].
Proposition 1.13. If limn pX (n)/n exists and is finite, then it has to be an
integer.
All substitution shifts, in fact all linearly recurrent shifts, have sublinear
complexity; see Theorem 4.4.
6 See Definition 1.18 below.
To show that the limit in (1.3) exists, we need one more notion and one
well-known lemma.
Definition 1.14. We call a real-valued sequence $(a_n)_{n \ge 1}$ subadditive if
$$a_{m+n} \le a_m + a_n \quad \text{for all } m, n \ge 1.$$
Analogously, $(a_n)_{n \ge 1}$ is superadditive if $a_{m+n} \ge a_m + a_n$ for all $m, n \in \mathbb{N}$.
Lemma 1.15 (Fekete's Subadditive Lemma). If $(a_n)_{n \ge 1}$ is subadditive, then
$\lim_n \frac{a_n}{n} = \inf_{r \ge 1} \frac{a_r}{r}$ (possibly $-\infty$). Analogously, if $(a_n)_{n \ge 1}$ is superadditive,
then $\lim_n \frac{a_n}{n} = \sup_{r \ge 1} \frac{a_r}{r}$ (possibly $+\infty$).
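To see Fekete's Lemma at work, one can take $a_n = \log p(n)$ for the Fibonacci shift of Example 1.3; this sequence is subadditive because every word of length m + n is a concatenation of a word of length m and one of length n. The sketch below is a numerical illustration only: the helper `p` counts 0-1 words without the factor 11 by a two-state recursion, and the comparison value log((1+√5)/2) is the well-known growth rate of the Fibonacci numbers.

```python
import math

def p(n):
    """Number of 0-1 words of length n without two consecutive 1's."""
    end0, end1 = 1, 1  # words of length 1 ending in 0, resp. in 1
    for _ in range(n - 1):
        # a 0 may follow anything; a 1 may only follow a 0
        end0, end1 = end0 + end1, end0
    return end0 + end1

a = {n: math.log(p(n)) for n in range(1, 201)}
golden = math.log((1 + math.sqrt(5)) / 2)

# subadditivity a_{m+n} <= a_m + a_n on a finite range
subadditive = all(a[m + n] <= a[m] + a[n] + 1e-12
                  for m in range(1, 100) for n in range(1, 100))

for n in (1, 10, 50, 200):
    print(n, a[n] / n, golden)
```

In this example the quotients $a_n/n$ decrease towards their infimum, in line with the lemma.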
Petersen's shift mentioned below Theorem 2.77 has zero entropy and is topologically mixing,
while Grillenberger's [287] construction gives minimal shifts of positive entropy (and therefore
lacking periodic orbits), with further examples among Toeplitz shifts (see Theorem 4.94) and
B-free shifts (Section 4.6).
1.3. Transitive and Synchronized Subshifts
Even for the simplest direct generalization of the Fibonacci SFT, namely
0-1-patterns on $\mathbb{Z}^2$ where no two 1's occur directly next to each other
(horizontally or vertically), the entropy $\lim_{m,n \to \infty} \frac{1}{mn} \log p_X(m, n)$ is unknown.
There are however numerical approximations (e.g. for this example, the
entropy equals 0.5878116 . . . with these digits certainly correct; see [251])
and characterizations of which values can occur; see e.g. [259, 260, 289, 313,
314, 399].
Proof. First assume that π is continuous and commutes with the shift. For
each $a \in \tilde{\mathcal{A}}$, the cylinder $[a] = \{y \in Y : y_0 = a\}$ is clopen, so $V_a := \pi^{-1}([a])$
is clopen too. Since $V_a$ is open, it can be written as the union of cylinders,
and since $V_a$ is closed (and hence compact) it can be written as the finite
union of cylinders: $V_a = \bigcup_{i=1}^{r_a} U_{a,i}$. Let $N$ be so large that every $U_{a,i}$ is
9 Sometimes the window can have memory and anticipation of different lengths, so the window
would be [−m, n], but calling their maximum N covers all cases.
10 Curtis and Lyndon were working for the military at the time, so their work was “classified”,
and the paper was published under Hedlund’s name only, [308].
1.4. Sliding Block Codes
Proof. Let 2N + 1 be the maximal window size among the sliding block
codes from X to Y and from Y to X. Then every k-word in Y is obtained
from an N + k-word in X, so pY (k) ≤ pX (N + k). Replacing the role of X
and Y gives the other inequality.
Exercise 1.26. If ψ : X → Y is an onto sliding block code which is k-to-one
for some fixed k, show that htop (X, σ) = htop (Y, σ).
Example 1.27. The following substitutions (see Section 4.2) are called the
Chacon substitution and primitive Chacon substitution:
$$\chi_{chac} : \begin{cases} 0 \to 0010, \\ 1 \to 1 \end{cases} \qquad \text{and} \qquad \chi_{Chac} : \begin{cases} 0 \to 0021, \\ 1 \to 021, \\ 2 \to 21, \end{cases} \tag{1.4}$$
with fixed points
$$\rho_{chac} = 0010\ 0010\ 1\ 0010\ 0010001010\ 1\ 0010 \dots,$$
$$\rho_{Chac} = 0021\ 0021\ 21\ 021\ 0021002121021 \dots.$$
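They can be transformed into each other using the sliding block code
$$f : \begin{cases} 00a \to 0, \\ 10a \to 1, \\ 1 \to 2, \end{cases} \quad a \in \{0, 1\}, \qquad \text{and} \qquad f^{-1} : \begin{cases} 0 \to 0, \\ 1 \to 0, \\ 2 \to 1, \end{cases}$$
and this extends to the shift orbit closures
$$X_{chac} = \overline{\{\sigma^n(\rho_{chac}) : n \ge 0\}} \quad \text{and} \quad X_{Chac} = \overline{\{\sigma^n(\rho_{Chac}) : n \ge 0\}}.$$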
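The relation between the two fixed points can be tested on finite prefixes. The following sketch iterates both substitutions from the seed 0 and applies the letter-to-letter map $f^{-1}$. The function `apply_f` is an assumed reading of the three-block rule for $f$ (the symbol 1 becomes 2, a 0 preceded by 1 becomes 1, and a 0 preceded by 0 or standing at the start stays 0); this reading, like all names below, is hypothetical and only meant as an illustration.

```python
def iterate(subst, seed, steps):
    """Apply the substitution `subst` to `seed` the given number of times."""
    word = seed
    for _ in range(steps):
        word = "".join(subst[c] for c in word)
    return word

chi_chac = {"0": "0010", "1": "1"}
chi_Chac = {"0": "0021", "1": "021", "2": "21"}

rho_chac = iterate(chi_chac, "0", 6)[:300]
rho_Chac = iterate(chi_Chac, "0", 6)[:300]

f_inv = {"0": "0", "1": "0", "2": "1"}

def apply_f_inv(x):
    """Letter-to-letter code from the primitive Chacon alphabet to {0, 1}."""
    return "".join(f_inv[c] for c in x)

def apply_f(x):
    """Assumed reading of f: it looks at the current symbol and the one before it."""
    out = []
    for i, c in enumerate(x):
        if c == "1":
            out.append("2")
        elif i > 0 and x[i - 1] == "1":
            out.append("1")
        else:
            out.append("0")
    return "".join(out)

print(apply_f_inv(rho_Chac)[:40])
print(rho_chac[:40])
```

On these prefixes the two codes undo each other, which is the finite-word counterpart of the statement about the orbit closures.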
11 If (X, σ) is a one-sided subshift, with window size [0, N ] (so no memory, only anticipation),
then this part of the proof still works. The first part of the proof fails: one must first extend
(X, σ) to a two-sided shift before the Curtis–Hedlund–Lyndon Theorem can be applied in full.
Figure 1.3. The edge-labeled transition graph of the 2-block even shift.
Proof. We do the proof for invertible shifts; the one-sided shifts work as
well, but then we cannot allow a memory in the sliding block code, only
anticipation. Let $\varphi : X \to \tilde{X}$ be the sliding block code that recodes the
$(2N+1)$-blocks in $X$ into the letters of $\tilde{\mathcal{A}}$; i.e. $\varphi(x)_i = f(x_{i-N} \cdots x_{i+N})$.
Then $\tilde{\pi} = \pi \circ \varphi^{-1}$ is the required sliding block code.
that assigns a number to every cylinder set, according to the following rules:
The Kolmogorov Extension Theorem (see e.g. [56, Section 21.10]) implies
that μ can be extended uniquely for every set in the σ-algebra B generated
by the cylinder sets. Thus, if x ∈ X is such that fw (x) exists for every
w ∈ L(X), then there is a shift-invariant probability measure μ such that
μ([w]) = fw (x) for all w ∈ L(X).
$$x = 1001110000111110000001111111 \cdots 0^n 1^{n+1} \cdots$$
is associated to a combination of Dirac measures $\frac{1}{2}(\delta_{0^\infty} + \delta_{1^\infty})$, and this
measure is clearly not ergodic.
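The frequencies behind this measure can be estimated on a long prefix of x. The sketch below assumes that the frequency $f_w(x)$ is the limiting proportion of positions at which the word w occurs, and it replaces the limit by a finite prefix consisting of the first 400 blocks $1^1 0^2 1^3 0^4 \cdots$; both the truncation and the function names are choices made for this illustration.

```python
def prefix(blocks):
    """The word 1^1 0^2 1^3 0^4 ... with the given number of blocks."""
    return "".join(("1" if k % 2 else "0") * k for k in range(1, blocks + 1))

def frequency(x, w):
    """Proportion of windows of length |w| in the finite word x that equal w."""
    windows = len(x) - len(w) + 1
    return sum(x[i:i + len(w)] == w for i in range(windows)) / windows

x = prefix(400)
for w in ("0", "1", "00", "11", "01", "10"):
    print(w, round(frequency(x, w), 4))
```

Words that mix both symbols have frequency close to 0, while constant words of either kind have frequency close to 1/2, as expected for $\frac{1}{2}(\delta_{0^\infty} + \delta_{1^\infty})$.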
The full shift is obviously not uniquely ergodic: it has, for instance, a
Bernoulli measure for every probability vector p. Neither are SFTs, sofic
shifts, or β-shifts (which are, in fact, intrinsically ergodic). The Thue-Morse
shift on the other hand is uniquely ergodic. Clearly, unique ergodicity implies
intrinsic ergodicity, but not the other way around. It follows from Oxtoby’s
Theorem 6.20 that a recurrent subshift (X, σ) is uniquely ergodic if and only
if fw (x) exists and is the same for every x ∈ X. In this case, the convergence
in the limit (1.5) is uniform in x.
12 Named after Jacob Bernoulli, a member of the family of mathematicians originating from Basel,
who wrote the book “Ars conjectandi”, one of the first books on probability theory.
1.6. Symbolic Itineraries
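[Figure 1.4. Left: the circle rotation $R_\alpha$; middle: the β-transformation $T_\beta$; right: the map $f_4(x) = 4x(1-x)$.]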
dynamics emerges from the dynamical system $(X, T)$ by coding the $T$-orbits
of the points $x \in X$. To this end, for a finite or countable alphabet $\mathcal{A}$, we
let $\mathcal{J} = \{J_a\}_{a \in \mathcal{A}}$ be a partition of $X$. Then to each $x \in X$ we assign an
itinerary $i(x) \in \mathcal{A}^{\mathbb{N}_0}$:
$$i_n(x) = a \quad \text{if } T^n(x) \in J_a.$$
If $T$ is invertible, then we can extend itineraries to sequences in $\mathcal{A}^{\mathbb{Z}}$. It is
clear that $i \circ T(x) = \sigma \circ i(x)$. Therefore, $i(X)$ is σ-invariant and if $T : X \to X$
is onto, then $\sigma(i(X)) = i(X)$. In general, however, $i(X)$ is not closed, so
we need to take the closure before it can be called a subshift. Using this
subshift, we can often show the abundance of different trajectories (periodic
or with other properties) of the original system $(X, T)$.
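The coding procedure can be sketched in a few lines. As a concrete (assumed) test case the sketch uses the doubling map on [0, 1) with the partition $J_0 = [0, \frac{1}{2})$, $J_1 = [\frac{1}{2}, 1)$, and exact rational arithmetic so that orbits are computed without rounding errors; neither this choice of map nor the function names come from the text above.

```python
from fractions import Fraction

def itinerary(T, symbol, x, length):
    """First `length` symbols of i(x), where i_n(x) = symbol(T^n(x))."""
    out = []
    for _ in range(length):
        out.append(symbol(x))
        x = T(x)
    return out

def doubling(x):
    """The doubling map x -> 2x mod 1."""
    return (2 * x) % 1

def symbol(x):
    """Index of the partition element containing x: J_0 = [0, 1/2), J_1 = [1/2, 1)."""
    return 0 if x < Fraction(1, 2) else 1

x = Fraction(1, 7)
code = itinerary(doubling, symbol, x, 12)
shifted = itinerary(doubling, symbol, doubling(x), 11)
print(code)
print(shifted)
```

The second list is the first one with its initial symbol removed, which is the relation $i \circ T = \sigma \circ i$ on this orbit.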
Example 1.34. Let $X$ be the closure of the collection of symbolic itineraries
of a circle rotation $R_\alpha : \mathbb{S}^1 \to \mathbb{S}^1$ over angle $\alpha \in [0, 1] \setminus \mathbb{Q}$; see Figure 1.4
(left). We use the partition $\mathcal{J} = \{J_0, J_1\}$ with $J_0 = [0, \alpha)$ and $J_1 = [\alpha, 1)$.
Hence, if $y \in \mathbb{S}^1$ and $n \in \mathbb{Z}$, then
$$i(y)_n = \begin{cases} 0 & \text{if } R_\alpha^n(y) \in [0, \alpha), \\ 1 & \text{if } R_\alpha^n(y) \in [\alpha, 1). \end{cases}$$
Slightly different coding comes from the partition $\{(0, \alpha], (\alpha, 1]\}$, but the
closure of $i(\mathbb{S}^1)$ is the same for both partitions. The resulting shift is called
a Sturmian shift; see Definition 4.60.
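A rotation itinerary is easy to generate, and one can check on it the word-complexity p(n) = n + 1 that Section 1.2 quotes for Sturmian sequences. The sketch below takes α equal to the golden mean minus one and the starting point y = 0; these choices, the use of floating-point arithmetic, and the assumption that a prefix of length 5000 already shows every word of length at most 10 are all made for illustration only.

```python
import math

def rotation_itinerary(alpha, y, length):
    """Symbols i(y)_n for n = 0, ..., length-1: 0 if R^n(y) lies in [0, alpha), else 1."""
    return "".join("0" if (y + n * alpha) % 1.0 < alpha else "1"
                   for n in range(length))

def count_subwords(x, n):
    """Number of distinct subwords of length n occurring in the finite word x."""
    return len({x[i:i + n] for i in range(len(x) - n + 1)})

alpha = (math.sqrt(5) - 1) / 2
x = rotation_itinerary(alpha, 0.0, 5000)
print(x[:30])
print([count_subwords(x, n) for n in range(1, 11)])
```

For a rational angle the same code produces a periodic itinerary, whose complexity is eventually constant, in line with Proposition 1.12.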
Example 1.35. Consider the β-transformation $T_\beta : [0, 1] \to [0, 1]$, $T_\beta(x) = \beta x \bmod 1$
(see Figure 1.4 (middle)), and $i(x)_n = a$ if $T_\beta^n(x) \in J_a := [\frac{a}{\beta}, \frac{a+1}{\beta})$.
The closure of $i([0, 1])$ is called a β-shift; see Section 3.5.
Example 1.36. Let $X = [0, 1]$ and $T(x) = f_4(x) = 4x(1 - x)$; see Figure 1.4
(right). Let $J_0 = [0, \frac{1}{2}]$ and $J_1 = (\frac{1}{2}, 1]$. Then $i(X)$ is not closed, because
From now on, assume that $X$ is a compact metric space without isolated
points. We will now discuss the properties of the coding map $i$ itself. First of
all, for $i$ to be continuous, it is crucial that $T|_{J_a}$ is continuous on each element
$J_a \in \mathcal{J}$. But this is not enough: if $x$ is a common boundary of two elements
of $\mathcal{J}$, then (no matter how you assign the symbol to $x$ in Example 1.36), for
each neighborhood $U \ni x$, $\operatorname{diam}(i(U)) = 1$, so continuity fails at $x$. It is only
by using quotient spaces of $i(X)$ (so changing the topology of $i(X)$) that we
can make $i$ continuous. Normally, we choose to live with the discontinuity,
because it affects only a few points:
Lemma 1.38. Let $\partial\mathcal{J}$ denote the collection of common boundary points of
different elements in a partition $\mathcal{J}$. If $\operatorname{orb}(x) \cap \partial J = \emptyset$ for all $J \in \mathcal{J}$, then
the coding map $i : X \to \mathcal{A}^{\mathbb{N}_0}$ or $\mathcal{A}^{\mathbb{Z}}$ is continuous at $x$.
Proof. We carry out the proof for invertible maps. Let $\varepsilon > 0$ be arbitrary
and fix $N \in \mathbb{N}$ such that $2^{-N} < \varepsilon$. For each $n \in \mathbb{Z}$ with $|n| \le N$, let
$U_n \ni T^n(x)$ be such a small neighborhood that it is contained in a single
partition element $J_{i_n(x)}$. Since $\operatorname{orb}(x) \cap \partial J = \emptyset$, this is possible. Then
$U := \bigcap_{|n| \le N} T^{-n}(U_n)$ is an open neighborhood of $x$ and $i_n(y) = i_n(x)$ for
all $|n| \le N$ and $y \in U$. Therefore $\operatorname{diam}(i(U)) \le 2^{-N} < \varepsilon$, and continuity at
$x$ follows.
Definition 1.39. A transformation $T : X \to X$ of a metric space $(X, d)$ is
called expansive if there exists $\delta > 0$ such that for all distinct $x, y \in X$,
there is $n \ge 0$ (or $n \in \mathbb{Z}$ if $T$ is invertible) such that $d(T^n(x), T^n(y)) > \delta$.
We call δ the expansivity constant.
Every subshift $(X, \sigma)$ is expansive. Indeed, if $x \ne y$, then there is
$n \in \mathbb{N}$ (or $n \in \mathbb{Z}$ if $(X, \sigma)$ is a two-sided shift) such that $x_n \ne y_n$, so
$d(\sigma^n(x), \sigma^n(y)) = 1$. This makes every $\delta \in (0, 1)$ an expansivity constant.
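The expansivity of the shift can be watched on finite truncations of one-sided sequences. The sketch below implements the metric (1.1) in the one-sided case, where m is the number of leading positions on which the two sequences agree; representing sequences by strings of equal length is a simplification made only for this illustration, and the function names are hypothetical.

```python
def dist(x, y):
    """d(x, y) = 2^{-m}, with m the number of leading positions where x and y agree."""
    if x == y:
        return 0.0
    m = 0
    while m < min(len(x), len(y)) and x[m] == y[m]:
        m += 1
    return 2.0 ** (-m)

def shift(x, n=1):
    """The left-shift applied n times to a one-sided (truncated) sequence."""
    return x[n:]

x = "0010100"
y = "0010010"
distances = [dist(shift(x, n), shift(y, n)) for n in range(5)]
print(distances)
```

The two words first differ at position 4, so the distance doubles with each shift until it reaches 1 after four steps.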
Lemma 1.40. Suppose that $T$ is a continuous expansive map and injective
on each $J_a$ of some partition $\mathcal{J}$. If the expansivity constant $\delta > \sup_{a \in \mathcal{A}} \operatorname{diam}(J_a)$,
then the coding map $i : X \to \mathcal{A}^{\mathbb{N}_0 \text{ or } \mathbb{Z}}$ is injective.
To obtain injectivity of the coding map, it often suffices (but not always;
see Example 1.43 below) that T is expanding on each partition element
Ja . Expanding (and expansion) should not be confused with expansive (and
expansivity) of Definition 1.39.
Definition 1.41. Let $T : X \to Y$ be a map between metric spaces. We call
$T$ expanding if there is $\rho > 1$ such that $d_Y(T(x), T(y)) \ge \rho\, d_X(x, y)$ for all
$x, y \in X$ and locally expanding if there are $\varepsilon > 0$ and $\rho > 1$ such that
$d_Y(T(x), T(y)) \ge \rho\, d_X(x, y)$ for all $x, y \in X$ with $d_X(x, y) < \varepsilon$.
Proposition 1.42 (Gottschalk & Hedlund [284]). Let $T : X \to X$ be a
homeomorphism on a compact metric space $(X, d)$. If $T$ is locally expanding,
then $X$ is finite.
Compactness is important. For example $T : \mathbb{R} \to \mathbb{R}$, $x \mapsto 2x$, would be a
counterexample without the compactness assumption.
such that $\operatorname{diam}(U_i) < \delta$. Then $\{T^{-1}(U_i)\}_{i=1}^N$ is an open cover of $X$, and
$\operatorname{diam} T^{-1}(U_i) < \varepsilon$, so by local expansion, $\operatorname{diam} T^{-1}(U_i) < \operatorname{diam}(U_i)/\rho \le \delta/\rho$.
Repeating this argument, we find that $\{T^{-n}(U_i)\}_{i=1}^N$ is a finite open cover
of $X$ with $\operatorname{diam}(T^{-n}(U_i)) < \delta\rho^{-n}$. Since $n$ is arbitrary, $\#X \le N$.
Example 1.43. In this example, we show that despite $T$ being expanding
on partition elements $J_a$, $a \in \mathcal{A}$, this may still not result in an injective
coding map $i : X \to \mathcal{A}^{\mathbb{N}_0}$ if the diameter of some of the $J_i$'s is too big.
Let $T : \mathbb{S}^1 \to \mathbb{S}^1$, $x \mapsto 2x \bmod 1$, be the doubling map, and let $J_0 = (\frac{1}{4}, \frac{3}{4})$
and $J_1 = \mathbb{S}^1 \setminus J_0$. Clearly $T'(x) = 2$ for all $x \in \mathbb{S}^1$, but $T$ is not
expanding on the whole of $\mathbb{S}^1$, because for instance $d(T(\frac{1}{4}), T(\frac{3}{4})) = 0 < \frac{1}{2} = d(\frac{1}{4}, \frac{3}{4})$.
More importantly, $T$ is not expanding on $J_0$ or $J_1$ either; for
example $d(T(\frac{1}{4} + \varepsilon), T(\frac{3}{4} - \varepsilon)) = 4\varepsilon < \frac{1}{2} - 2\varepsilon = d(\frac{1}{4} + \varepsilon, \frac{3}{4} - \varepsilon)$ for each
$\varepsilon \in (0, \frac{1}{12})$. The corresponding coding map is not injective. The way to see
this is by noting that the involution $S(x) = 1 - x$ commutes with $T$ and also
preserves each $J_a$. It follows that $i(x) = i(S(x))$ for all $x \in \mathbb{S}^1$, and only
$x = 0$ and $x = \frac{1}{2}$ have unique itineraries. For the more general partition
$J_0^b = (b, b + \frac{1}{2})$ and $J_1^b = \mathbb{S}^1 \setminus J_0^b$ for $b \in [0, \frac{1}{2})$, see Remark 3.102.
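The failure of injectivity in Example 1.43 can be observed directly. The sketch below codes the doubling map with the partition $J_0 = (\frac{1}{4}, \frac{3}{4})$, $J_1 = \mathbb{S}^1 \setminus J_0$ and compares the itineraries of $x$ and $S(x) = 1 - x$ for a few rational points, using exact fractions; the sample points and the function names are arbitrary choices made for this illustration.

```python
from fractions import Fraction

def doubling(x):
    """The doubling map x -> 2x mod 1 on the circle [0, 1)."""
    return (2 * x) % 1

def symbol(x):
    """0 if x lies in J_0 = (1/4, 3/4), and 1 otherwise."""
    return 0 if Fraction(1, 4) < x < Fraction(3, 4) else 1

def itinerary(x, length):
    """First `length` symbols of the itinerary of x."""
    out = []
    for _ in range(length):
        out.append(symbol(x))
        x = doubling(x)
    return out

def involution(x):
    """S(x) = 1 - x on the circle."""
    return (1 - x) % 1

points = [Fraction(1, 7), Fraction(1, 5), Fraction(3, 11), Fraction(1, 3)]
for x in points:
    print(x, itinerary(x, 12), itinerary(involution(x), 12))
```

Each pair of distinct points $x$ and $1 - x$ produces the same list of symbols, so the coding map cannot separate them.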
Chapter 2
Topological Dynamics
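The set $\operatorname{orb}^+(x) = \{T^n(x) : n \ge 0\}$ is the forward orbit of $x$. This notation
is useful if $T$ is invertible; if $T$ is non-invertible, then $\operatorname{orb}^+(x) = \operatorname{orb}(x)$.
In formula,
$$\omega(x) = \bigcap_{n \in \mathbb{N}} \overline{\bigcup_{m \ge n} T^m(x)} = \{y \in X : \exists\, n_i \to \infty, \ \lim_{i \to \infty} T^{n_i}(x) = y\}.$$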
Definition 2.10. Two dynamical systems $(X, f)$ and $(Y, g)$ are called orbit
equivalent if there is a homeomorphism $\psi : X \to Y$ such that $\psi(\operatorname{orb}_f(x)) = \operatorname{orb}_g(\psi(x))$
for all $x \in X$; i.e. ψ sends orbits to orbits (set-wise, not necessarily
point-wise).
Remark 2.14. The notion of dense orbit may need further explanation if
the subshift is two-sided. Consider the sequence
$$x = \cdots 000000000000000000.101000101000000000101000101 \cdots. \tag{2.3}$$
This sequence emerges from the Cantor substitution
$$\chi_{Cantor} : \begin{cases} 0 \to 000, \\ 1 \to 101 \end{cases}$$
from the seed 0.1. This sequence has a dense forward orbit $\operatorname{orb}^+(x)$ within
its forward orbit closure $\overline{\operatorname{orb}^+(x)}$ as well as a dense backward orbit $\operatorname{orb}^-(x)$
within its backward orbit closure $\overline{\operatorname{orb}^-(x)}$. However, $\operatorname{orb}^-(x)$ is not dense in
its two-sided orbit closure.
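The right half of the sequence (2.3) can be generated by iterating the Cantor substitution on the symbol 1. The sketch below does this and also records the longest run of zeros in each iterate; the helper names are hypothetical, and the sketch only illustrates the finite words involved, not the density statements themselves.

```python
import re

cantor = {"0": "000", "1": "101"}

def substitute(word, rules):
    """Apply a substitution letter by letter."""
    return "".join(rules[c] for c in word)

def iterate(seed, rules, steps):
    """Apply the substitution `steps` times to the seed."""
    word = seed
    for _ in range(steps):
        word = substitute(word, rules)
    return word

def longest_zero_run(word):
    """Length of the longest block of consecutive zeros in the word."""
    return max((len(run) for run in re.findall("0+", word)), default=0)

right_half = iterate("1", cantor, 3)
print(right_half)
print([longest_zero_run(iterate("1", cantor, n)) for n in range(1, 6)])
```

The zero blocks grow by a factor 3 with every iteration, so forward shifts of x come arbitrarily close to the fixed point $0^\infty$ while still seeing the pattern 101 further to the right.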
its closure) as the phase-space. An interesting example with only recurrent orbits but no minimal
subset is due to Auslander [38, page 27].
2.2. Transitive and Minimal Systems
6 The expression “almost periodic” is frequently used as well, e.g. in [284, 381, 398, 465], but
it is not the same with all authors and sometimes refers to a different notion. For instance, in
[482] it is used as “periodically recurrent” in our Definition 2.19.
Since N was arbitrary, this contradicts the uniform recurrence and hence
such Y cannot exist.
Definition 2.18. Uniform recurrence means that the set
$$N(x, U) := \{n \in \mathbb{Z} \text{ or } \mathbb{N} : x \in T^{-n}(U)\}$$
is syndetic for every $x \in X$; i.e. it has bounded gaps (from the Greek
συνδετικός = bound together). A set that is not syndetic has a complement
that is thick: for every $N \in \mathbb{N}$ it contains blocks $\{n, n + 1, \dots, n + N\}$.
Definition 2.19. A dynamical system is called periodically recurrent if
for every non-empty open set $U$, there is $N$ such that $U \subset T^{-kN}(U)$ for all
$k \in \mathbb{N}$ (or $k \in \mathbb{Z}$ if $T$ is invertible).
Clearly, essentially minimal maps can have at most one periodic orbit,
but as the subshift $X := \{\sigma^k(\cdots 000001000000 \cdots)\}_{k \in \mathbb{Z}} \cup \{0^\infty\}$ shows,
$X \setminus Y \ne \emptyset$ is possible. However, the two-sided orbit closure of (2.3) does
not give an essentially minimal shift.
Proposition 2.26. Given a dynamical system $(X, T)$ and a point $y \in X$,
the following are equivalent:
(i) $(X, T)$ is essentially minimal and $y$ is contained in its minimal set.
(ii) For every $x \in X$, $\omega(x) \ni y$.
If, in addition, $T$ is invertible, then two further equivalent statements are:
(iii) For every $x \in X$, $\alpha(x) \ni y$.
(iv) For every open set $U \ni y$, $\bigcup_{n \in \mathbb{Z}} T^n(U) = X$.
See [324, Proposition 1.1] for more general results in this direction.
Proof. Suppose $z \in X$ has a dense orbit. Take $\varepsilon > 0$ arbitrary and choose
$\delta \in (0, \varepsilon/3)$ such that $d(x, y) < \delta$ implies $d(T^n(x), T^n(y)) < \varepsilon/3$ for all $n \ge 0$.
Choose $N \in \mathbb{N}$ so large that $\bigcup_{n=0}^{N-1} B_\delta(T^n(z)) = X$ and $d(T^N(z), z) < \delta$. Now
let $x$ be arbitrary and take $0 \le n < N$ such that $d(T^n(z), x) < \delta$. Then
$$d(T^N(x), x) \le d(T^N(x), T^{N+n}(z)) + d(T^{n+N}(z), T^n(z)) + d(T^n(z), x) \le \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \delta < \varepsilon$$
as required.
However, equicontinuity also gives for every $\varepsilon > 0$ some $\delta > 0$ (and $\delta \to 0$
as $\varepsilon \to 0$) such that $d(x, y) < \delta$ implies $d_\infty(x, y) < \varepsilon$, and therefore $x_n \to x$
in the metric $d$ if and only if $x_n \to x$ in the metric $d_\infty$. Hence both metrics
generate the same topology.
If $T$ is itself a strict contraction, then also $d_\infty(T(x), T(y)) < d_\infty(x, y)$,
but if $X$ is compact and $T$ is surjective, then the dynamical system $(X, T)$
is an isometry in the metric $d_\infty$.
Proposition 2.31. If a dynamical system $(X, T)$ is equicontinuous and
surjective on a compact metric space $(X, d)$, then $T$ preserves $d_\infty$.
Proof. We have already seen that $d_\infty(T(x), T(y)) \le d_\infty(x, y)$ for all $x, y \in X$.
Assume by contradiction that we have strict inequality for some choice
$a \ne b$, say $d_\infty(a, b) = d_\infty(T(a), T(b)) + 9\varepsilon$ for some $\varepsilon > 0$.
Auslander & Ellis (see e.g. [13]) proved that for every $x \in X$, there exists
a $y \in X$ such that $\operatorname{orb}(y)$ is a minimal subset of $X$ and $(x, y)$ is a proximal
pair. Note that proximality is not an equivalence relation: it is not transitive.
For example, $(101)(000)^2(101)^3(000)^4 \cdots$ and $(000)(101)^2(000)^3(101)^4 \cdots$
are both proximal to $0^\infty$ under the shift, but not to each other. A stronger
version of proximality that does give an equivalence relation is the following:
Definition 2.33. Let $(X, T)$ be a dynamical system on a metric space $(X, d)$.
Then a pair of points $(x, y)$ is syndetically proximal if for every $\varepsilon > 0$,
the set $\{n \in \mathbb{N} \text{ or } \mathbb{Z} : d(T^n(x), T^n(y)) < \varepsilon\}$ is syndetic.
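The example showing that proximality is not transitive can be explored on finite prefixes. The sketch below builds the first 40 blocks of the two sequences $(101)(000)^2(101)^3 \cdots$ and $(000)(101)^2(000)^3 \cdots$ and measures, for each shift $n$, how many leading symbols two shifted sequences share; in the one-sided metric (1.1) a long agreement means a small distance. The truncation to 40 blocks and all function names are assumptions of this illustration.

```python
def blocks(first, second, count):
    """The word (first)^1 (second)^2 (first)^3 (second)^4 ... with `count` blocks."""
    return "".join((first if k % 2 else second) * k for k in range(1, count + 1))

def agreement(u, v, n):
    """Number of leading positions on which the n-fold shifts of u and v agree."""
    m = 0
    while n + m < min(len(u), len(v)) and u[n + m] == v[n + m]:
        m += 1
    return m

x = blocks("101", "000", 40)
y = blocks("000", "101", 40)
zero = "0" * len(x)

best_x_zero = max(agreement(x, zero, n) for n in range(len(x)))
best_y_zero = max(agreement(y, zero, n) for n in range(len(y)))
best_x_y = max(agreement(x, y, n) for n in range(len(x)))
print(best_x_zero, best_y_zero, best_x_y)
```

Both sequences agree with $0^\infty$ on ever longer stretches as more blocks are added, whereas their shifts never agree with each other on more than one leading symbol, so the distance between them stays at least 1/2.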
The following result for subshifts goes back to [156, 562]; see also [434,
Theorem 19] for the proof.
Theorem 2.34. Given a subshift $(X, \sigma)$, the following are equivalent:
(1) Proximality is an equivalence relation.
(2) Every proximal pair is syndetically proximal.
(3) The orbit closure $\overline{\{\sigma^n \times \sigma^n(x, y) : n \in \mathbb{N} \text{ or } \mathbb{Z}\}}$ of every $(x, y) \in X \times X$
contains exactly one minimal set in the product shift.
2.3. Equicontinuous and Distal Systems
Exercise 2.37. (a) Show that the map T (x, y) = (x, x + y) on the two-torus
T2 is distal but not equicontinuous.
(b) Let α ∈ [0, 1] be irrational. Show that the map T (x, y) = (x + α, x + y)
on the two-torus T2 is distal but not equicontinuous. (Here showing mini-
mality is the hard part; see Proposition 6.26).
Proof. First assume that the shift is one-sided. If it is distal, then it has
to be invertible, and therefore a homeomorphism. But a one-sided shift
is locally expanding, and locally expanding homeomorphisms only exist on
finite spaces; see Proposition 1.42. Hence, there are no distal one-sided shifts
other than finite unions of periodic orbits.
Now if $(X, \sigma)$ is a two-sided shift, then its one-sided restriction $(X^+, \sigma)$
is a subshift too. Here we need to check that $\sigma : X^+ \to X^+$ is surjective,
but this follows because if $x^+$ is the one-sided restriction of $x \in X$, then
$y^+ := \sigma^{-1}(x)^+ \in X^+$ and $\sigma(y) = x$. Furthermore, since $X$ has a
non-periodic minimal set, $X^+$ has a non-periodic minimal set too. Thus the
above argument shows that $(X^+, \sigma)$ cannot be distal.
Definition 2.39. Given a dynamical system (X, T ), we say that (Y, S) is
the maximal equicontinuous factor (MEF) if it is equicontinuous and
semi-conjugate to (X, T ) and every other equicontinuous factor of (X, T ) is
also a factor of (Y, S).
Every dynamical system has an MEF, and it can be shown that the MEF
is unique up to conjugacy. This goes back to a result of Ellis & Gottschalk
[236]. The proof we give is for invertible dynamical systems7 and relies on
the notion of regional proximality:
Definition 2.40. Let (X, T ) be a dynamical system on a metric space (X, d).
Two points x, y ∈ X are regionally proximal if there are sequences $x_i \to x$
and $y_i \to y$ and $(n_i) \subset \mathbb{N}$ such that $d(T^{n_i}(x_i), T^{n_i}(y_i)) \to 0$. In this case
we write $x \sim_{rp} y$. It is not obvious that $\sim_{rp}$ is a transitive relation8, and
therefore we take the transitive hull: $x \sim_{trp} y$ if there is a sequence
$x = z_0 \sim_{rp} z_1 \sim_{rp} \cdots \sim_{rp} z_N = y$.
Proposition 2.41. Every continuous invertible dynamical system (X, T ) on
a compact metric space (X, d) has a maximal equicontinuous factor.
Therefore (x, y) is not a distal pair, but equicontinuous maps are distal; see
Corollary 2.35.
The (transitive hull) relation $\sim_{trp}$ is an equivalence relation that is T-invariant
and also $T^{-1}$-invariant. The equivalence classes are closed, and if
$x_k \to x$, $y_k \to y$ are such that $x_k \sim_{trp} y_k$, then also $x \sim_{trp} y$. Therefore
the quotient space $X_{eq} = X/\!\sim_{trp}$ is a well-defined Hausdorff space (and in
fact a metric space with quotient metric $d_{eq}$), and the maps T and $T^{-1}$ are
well-defined on it.
Now suppose by contradiction that T and hence $T^{-1}$ is not equicontinuous
on the quotient space $X_{eq}$. Then there is ε > 0 such that for all i ∈ N,
there are $x_i, y_i \in X_{eq}$ with $d_{eq}(x_i, y_i) < 1/i$, and $n_i \in \mathbb{N}$ such that $d_{eq}(x'_i, y'_i) > \varepsilon$
for $x'_i = T^{-n_i}(x_i)$ and $y'_i = T^{-n_i}(y_i)$. By passing to a subsequence, we can
assume that $x'_i \to x$ and $y'_i \to y$ and $d_{eq}(x, y) \geq \varepsilon$. But $x \sim_{trp} y$ by construction,
contradicting that $X_{eq}$ has only trivial regionally proximal pairs.
9 This was defined as: for every ε > 0 there is δ > 0 such that d(x, y) < δ implies
$d(T^n(x), T^n(y)) < \varepsilon$ for all n ∈ N except for a set of density zero. This is equivalent to
Definition 2.42 by Lemma 8.53.
34 2. Topological Dynamics
However, it was proved in [211] for minimal dynamical systems (and [263,
464] in more generality) that (X, T ) is mean equicontinuous if and only if
for every ε > 0 there is a δ > 0 and N ∈ N such that d(x, y) < δ implies
\[ \frac{1}{n-m} \sum_{i=m}^{n-1} d(T^i x, T^i y) < \varepsilon \quad \text{for all } m \text{ and } n \geq m + N. \]
Let (Y, σ) be the symbolic system associated to (X, T, P), i.e. the smallest
subshift such that the itinerary i(x) ∈ Y for every x ∈ X. Then (Y, σ) is
mean equicontinuous.
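\begin{align*}
\limsup_{n\to\infty} \frac1n \sum_{j=0}^{n-1} d_\sigma(\sigma^j(y), \sigma^j(y'))
 &= \limsup_{n\to\infty} \frac1n \sum_{j=0}^{n-1} d_\sigma(\sigma^j(i(x)), \sigma^j(i(x'))) \\
 &\leq \limsup_{n\to\infty} \frac1n \sum_{j=0}^{n-1} d_\sigma(i(T^j(x)), i(T^j(x')))
\end{align*}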
12 We will apply Oxtoby’s Ergodic Theorem for the indicator function $1_{V_\varepsilon}$, which is discontinuous.
But by assumption (2), $\mu(\partial V_\varepsilon)$ can be made arbitrarily small by taking ε small, so that
$1_{V_\varepsilon}$ can be approximated by a continuous function with negligible error.
13 Note, however, that the Adler, Konheim & McAndrew definition requires only a topology,
inequality, this ε/2-ball is disjoint from the ε/2-balls centered around all
other points in En (ε). Therefore,
(2.5) $r_n(\varepsilon) \leq s_n(\varepsilon) \leq r_n(\varepsilon/2)$.
Thus we can equally well define
(2.6) $h_{top}(T) = \lim_{\varepsilon \to 0} \limsup_{n\to\infty} \frac1n \log r_n(\varepsilon)$.
Example 2.46. Let (X, σ) be the full shift on N symbols. Let ε > 0 be
arbitrary, and take m minimal such that $2^{-m} < \varepsilon$. If we select a point from
each (n + m)-cylinder, this gives an (n, ε)-spanning set, whereas selecting one
point from each n-cylinder gives an (n, ε)-separated set. Therefore
\begin{align*}
\log N = \limsup_{n\to\infty} \frac1n \log N^n &\leq \limsup_{n\to\infty} \frac1n \log s_n(\varepsilon) \leq h_{top}(\sigma) \\
 &\leq \limsup_{n\to\infty} \frac1n \log r_n(\varepsilon) \leq \limsup_{n\to\infty} \frac1n \log N^{n+m} = \log N.
\end{align*}
Exercise 2.47. Show that for subshifts the definition of (1.3) coincides with
(n, ε)-definition in this section.
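Example 2.48. Consider the β-transformation $T_\beta : [0, 1) \to [0, 1)$, $x \mapsto \beta x \bmod 1$
for some β > 1. Take $\varepsilon < \frac{1}{2\beta^2}$ and $G_n = \{ \frac{k}{\beta^n} : 0 \leq k < \beta^n \}$.
Then $G_n$ is (n, ε)-separated, so $s_n(\varepsilon) \geq \beta^n$. On the other hand,
$G'_n = \{ \frac{2k\varepsilon}{\beta^n} : 0 \leq k < \frac{\beta^n}{2\varepsilon} \}$ is (n, ε)-spanning, so $r_n(\varepsilon) \leq \frac{\beta^n}{2\varepsilon}$. Therefore
\[ \log \beta = \limsup_{n\to\infty} \frac1n \log \beta^n \leq h_{top}(T_\beta) \leq \limsup_{n\to\infty} \frac1n \log \frac{\beta^n}{2\varepsilon} = \log \beta. \]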
Circle rotations, or in general isometries, have zero topological entropy.
Indeed, if E(ε) is an ε-separated set (or ε-spanning set), it will also be (n, ε)-
separated (or (n, ε)-spanning) for every n ≥ 1. Hence sn (ε) and rn (ε) are
independent of n, and their exponential growth rates are equal to zero. In
more generality:
Proposition 2.49. Every equicontinuous transformation (X, T ) on a com-
pact metric space (X, d) has zero entropy.
Proof. For any k ∈ N, a (kn, ε)-separated set for T is also an (n, ε)-separated
set for T k . Therefore
\[ h_{top}(T^k) = \lim_{n\to\infty} \frac1n \log s_n(\varepsilon, T^k) = k \lim_{n\to\infty} \frac{1}{kn} \log s_{kn}(\varepsilon, T) = k\, h_{top}(T). \]
Clearly the identity $T^0$ has zero entropy. If T is invertible and $E_n(\varepsilon)$ is
an (n, ε)-separated set, then $T^{n-1}(E_n(\varepsilon))$ is an (n, ε)-separated set for $T^{-1}$.
Therefore $h_{top}(T^{-1}) = h_{top}(T)$. Combined with the first part, it follows that
$h_{top}(T^k) = |k|\, h_{top}(T)$ for all k ∈ Z.
Corollary 2.51. If (Y, S) is a continuous factor of (X, T ) (where (X, d)
is a compact metric space), then htop (S) ≤ htop (T ). In particular, conju-
gate dynamical systems on compact metric spaces have the same topological
entropy.
Further properties concern iterates and factors; see [264, Proposition 1.3].
Lemma 2.58. Let (X, T ) and (Y, S) be two dynamical systems on compact
metric spaces.
• If (Y, S) is a topological factor of (X, T ), then ac(S) ≤ ac(T ). In
particular, amorphic complexity is preserved under conjugacy.
• ac(T n ) = ac(T ) for every n ∈ N.
• ac(S × T ) = ac(S) + ac(T ).
Proof. Since X is infinite and has a dense orbit, no periodic point is isolated,
and there are at least two periodic orbits, say orb(p) and orb(q). Let δ :=
min{d(x, y) : x, y ∈ orb(p) ∪ orb(q), x ≠ y}/6 > 0. Take x ∈ X and ε > 0
arbitrary. Then $B_\varepsilon(x)$ contains a periodic point r ∉ orb(p) ∪ orb(q). If
there is n ≥ 0 such that $d(T^n(x), T^n(r)) > \delta$, then sensitive dependence is
Remark 2.65. There is also an analogue for mean equicontinuity, see [394]
and also [270], saying that every minimal dynamical system is either mean
equicontinuous or mean sensitive, which means that there is a δ > 0
such that for every x ∈ X and neighborhood $U \ni x$, there is y ∈ U such
that $\limsup_n \frac1n \sum_{i=0}^{n-1} d(T^i x, T^i y) > \delta$. A measure-theoretic version of the
The paper of Li & Yorke [395] from 1973 might be called a popular
(partial) rediscovery of Sharkovskiy’s theorem [498] from 196415 , but it also
initiated the study of the following notions.
Definition 2.66. Let (X, T ) be a dynamical system on a metric space (X, d).
A pair of points x, y ∈ X is called a Li-Yorke pair if
\[ \liminf_{n\to\infty} d(T^n(x), T^n(y)) = 0 \quad \text{and} \quad \limsup_{n\to\infty} d(T^n(x), T^n(y)) > 0. \]
A set S ⊂ X is called scrambled if (x, y) is a Li-Yorke pair for every two
distinct x, y ∈ S. The dynamical system is chaotic in the sense of Li and
Yorke if there is an uncountable scrambled set.
15 Sharkovskiy’s Theorem states that if a continuous map of the real line has a periodic point
of period n, it also has a periodic point of period m for every m ≺ n in the Sharkovskiy order
1 ≺ 2 ≺ 4 ≺ 8 ≺ · · · ≺ 4 · 7 ≺ 4 · 5 ≺ 4 · 3 · · · ≺ 2 · 7 ≺ 2 · 5 ≺ 2 · 3 · · · ≺ 7 ≺ 5 ≺ 3.
Sharkovskiy related during the 2018 IWCTA: International Workshop and Conference on Topology
& Applications (Kochi, India) in honor of his 1,000-th moon that the printer of his original
publication didn’t have the sign ≺ at his disposal, and therefore he suggested to use the letter
Y turned sideways. The publisher followed this suggestion but turned the Y in the opposite
direction to the one Sharkovskiy had intended, and therefore the Sharkovskiy order was first printed as
Y
This is the main result of [83]; see also [482, Chapter 5] and [210]. The
converse is, however, not true. There exist examples of continuous (so-called
$2^\infty$) interval maps which have periodic points of period $2^n$ for each n ∈ N and
no periodic points with other periods, which have (therefore) zero topological
entropy, but which still are Li-Yorke chaotic; see [516, 566]. Example 2.53
gives a subshift which has zero entropy but is Li-Yorke chaotic.
Theorem 2.71. Let $X = \{1, \dots, d\}^{\mathbb{N}}$. For every probability vector
$p = (p_1, \dots, p_d)$, every scrambled set has zero p-Bernoulli measure.
Proof. Take $x_0 \neq x_1 \in X$, and choose $0 < \varepsilon < d(x_0, x_1)/3$. Let $U_0$ and $U_1$
be the ε-neighborhoods of $x_0$ and $x_1$, respectively. By topological exactness,
there is N ∈ N such that $T^N(U_0) = X = T^N(U_1)$. Hence, for an arbitrary
n ∈ N and every $w = w_0 w_1 \cdots w_{n-1} \in \{0, 1\}^n$, there is $x_w \in X$ such that
$T^{kN}(x_w) \in U_{w_k}$ for all 0 ≤ k < n. If $w \neq w' \in \{0, 1\}^n$, then the nN-distance
$d_{nN}(x_w, x_{w'}) > \varepsilon$. Hence, every (nN, ε)-spanning set must contain at least
$2^n$ elements and $h_{top}(T) \geq \frac1N \log 2 > 0$.
Theorem 2.75. If T : [0, 1] → [0, 1] is a continuous transitive interval
map, then $h_{top}(T) \geq \frac12 \log 2$. If in addition T is topologically mixing, then
$h_{top}(T) > \frac12 \log 2$.
This result is due to Blokh; see [90, 92]. A compact exposition of this
and related results can be found in [482, Proposition 4.70].
Definition 2.76. A dynamical system (X, T ) on a topological space is
called weakly topologically mixing if for every four non-empty open sets
$U_1, U_2, V_1, V_2$, there is n such that $U_1 \cap T^{-n}(V_1) \neq \emptyset$ and $U_2 \cap T^{-n}(V_2) \neq \emptyset$,
or equivalently, the product system T × T on X × X is transitive.
and weak mixing. These are discussed in Section 6.7. Some specific differ-
ences exist; for instance, there is no topological analog of Theorem 6.86.
From the definition it is clear that topological weak mixing implies that
the product system $(X^2, T \times T)$ is transitive. In fact, Furstenberg [266]
showed that this holds for every N-fold Cartesian product $(X^N, T \times \cdots \times T)$.
An important result on topological weak mixing is the following multiple
recurrence (a dynamical version of Van der Waerden’s Theorem) due to
Furstenberg & Weiss [267]: if (X, T) is minimal, then for every non-empty open set
U ⊂ X and m ∈ N, there is n ∈ N such that
$U \cap T^n(U) \cap T^{2n}(U) \cap \cdots \cap T^{mn}(U) \neq \emptyset$. Glasner [276] extended this to multiple transitivity: if (X, T)
is minimal and topologically weak mixing, then for x in a residual subset of
X, the m-tuple (x, . . . , x) has a dense orbit under $T \times T^2 \times \cdots \times T^m$. Further
results can be found in e.g. [139, 424].
The following hierarchy (which also holds for the measure-theoretic ana-
log) will not come as a surprise:
In the symbolic setting, i.e. when a subshift (X, σ) takes the place of (M, f), we
can define
\[ W^s_{loc}(q) = \{x \in X : x_n = q_n \text{ for all } n \geq 0\}, \]
\[ W^u_{loc}(q) = \{x \in X : x_n = q_n \text{ for all } n \leq 0\}. \]
Remark 2.84. For subshifts (X, σ), this definition can be simplified. We
give the version for strong specification, because it is the one in most frequent
use in this context. There is a gap size $N^*$ such that for all K ∈ N and every
K-tuple $x^1, \dots, x^K \in X$ and iterates $m_1 \leq n_1 < m_2 \leq n_2 < \cdots < m_K \leq n_K$
with $m_{k+1} - n_k \geq N^*$, there is x ∈ X such that
(2.11) $x_j = x^k_{j-m_k}$ for all k ∈ {1, . . . , K}, $m_k \leq j < n_k$.
The next result is due to Bowen [101] and in more generality to Sigmund
[508, Proposition 3].
Proposition 2.86. Every continuous dynamical system with specification
for all K ∈ N on a compact metric space has positive topological entropy.
Proof. Take distinct points a, b ∈ X and let ε = d(a, b)/3. Let N be the
gap size associated to ε. Now for every K ∈ N and chain $\{x_1, \dots, x_K\} \subset \{a, b\}^K$
and the integers $m_k = n_k = m_{k+1} - N$, there is a point x such that
$d(T^{m_k}(x), x_k) < \varepsilon$ for k = 1, . . . , K. There are $2^K$ choices of $\{x_1, \dots, x_K\}$
and the corresponding points x are $(n_K, \varepsilon)$-separated. Hence, according to
Definition (2.4), $h_{top}(T) \geq \frac{1}{1+N} \log 2 > 0$.
Subshifts of
Positive Entropy
Lemma 3.2. Every SFT (X, σ) on a finite alphabet can be recoded such that
the list of forbidden words consists of 2-words only.
Proof. Assume that (X, σ) is a subshift over the alphabet A and the longest
forbidden word has length M + 1 ≥ 2. Take a new alphabet $\tilde{A} = A^M$,
say $a_1, \dots, a_n$ are its letters. Recode every x ∈ X using a sliding block
code π, where for each index i, $\pi(x)_i = a_j$ if $a_j$ is the symbol used for
$x_i x_{i+1} \cdots x_{i+M-1}$. Effectively, this is replacing X by its M-block code. Then
every (M + 1)-word is uniquely coded by a 2-word in the new alphabet Ã, and
vice versa, every $a_1 a_2$ such that the (M − 1)-suffix of $\pi^{-1}(a_1)$ equals the (M − 1)-prefix
of $\pi^{-1}(a_2)$ encodes a unique (M + 1)-word in $A^*$. Now we forbid a 2-word
$a_1 a_2$ in $\tilde{A}^2$ if $\pi^{-1}(a_1 a_2)$ contains a forbidden word of X. Since A is finite,
and therefore Ã is finite, this leads to a finite list of forbidden 2-words in the
recoded subshift.
Example 3.3. Let X be the SFT with forbidden words 11 and 101, so
M = 2. We recode using the alphabet a = 00, b = 01, c = 10, and
d = 11. Draw the vertex-labeled transition graph (see Figure 3.1); labels at
the arrows indicate which word in $\{0, 1\}^3$ they stand for. For example, the
edge a → b labeled 001 has prefix a = 00 and suffix b = 01. Each arrow
containing a forbidden word is dashed and then removed in the right panel
of Figure 3.1.
Figure 3.1. The recoding of the SFT with forbidden words 11 and 101.
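The recoding of Example 3.3 can also be checked mechanically. The following Python sketch is an illustration only: the function name `two_block_edges` is ours and does not come from the text, and it assumes that symbols are single characters and that all forbidden words have length at most M + 1. It lists the allowed 2-words of the M-block recoding as pairs (prefix block, suffix block).

```python
from itertools import product

def two_block_edges(alphabet, forbidden, M):
    """Return the allowed arrows u -> v between M-blocks.

    An arrow corresponds to an (M+1)-word w with M-prefix u and M-suffix v;
    it is kept only if w contains none of the forbidden words.
    """
    edges = []
    for letters in product(alphabet, repeat=M + 1):
        word = "".join(letters)
        if any(bad in word for bad in forbidden):
            continue  # this arrow would be dashed and removed
        edges.append((word[:M], word[1:]))
    return edges

# Example 3.3: forbidden words 11 and 101, so M = 2.
edges = two_block_edges("01", ["11", "101"], 2)
for prefix, suffix in edges:
    print(prefix, "->", suffix)
```

With a = 00, b = 01, c = 10, d = 11, the surviving arrows are a → a, a → b, b → c and c → a, so the vertex d no longer appears.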
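\[ T(x) = \begin{cases} \gamma\left(x + \frac{2-\gamma}{\gamma}\right) & \text{if } x \in J_1 := [0, \frac{\gamma-1}{\gamma}], \\ \gamma(1 - x) & \text{if } x \in J_2 := [\frac{\gamma-1}{\gamma}, 1], \end{cases} \qquad \gamma = \frac{\sqrt{5}+1}{2}. \]
Figure 3.2. The tent map with slope equal to the golden mean.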
The map $T_M$ has a Markov partition1, that is, a partition $\{J_i\}_{i \in A}$ into
sets such that:
(1) The $J_i$ have disjoint interiors and $\bigcup_i J_i = \mathbb{T}^d$.
(2) If $T_M(J_i^\circ) \cap J_j^\circ \neq \emptyset$, then $T_M(J_i)$ stretches across $J_j^\circ$ in the unstable
direction (i.e. the direction spanned by the unstable eigenspaces of
M).
(3) If $T_M^{-1}(J_i^\circ) \cap J_j^\circ \neq \emptyset$, then $T_M^{-1}(J_i)$ stretches across $J_j^\circ$ in the stable
direction (i.e. the direction spanned by the stable eigenspaces of
M).
Every hyperbolic toral automorphism has a Markov partition (see [100]),
but in general they are fiendishly difficult to find explicitly, especially in
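dimension ≥ 3 where the boundaries of the $J_i$ might have to be fractal (see
[104]). Therefore we confine ourselves to the simpler case of $M = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, for which a
Markov partition of three rectangles $J_i$ for i = 1, 2, 3 can be constructed; see
Figure 3.3. The corresponding transition matrix is
\[ A = (a_{i,j}) = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{where} \quad a_{ij} = \begin{cases} 1 & \text{if } T_M(J_i^\circ) \cap J_j \neq \emptyset, \\ 0 & \text{if } T_M(J_i^\circ) \cap J_j = \emptyset. \end{cases} \]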
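As a sanity check on this transition matrix, one can compare the exponential growth rate of the number of paths in its graph with the leading eigenvalue of M, the golden mean. The sketch below is illustrative only: it assumes that the matrix has rows (0, 1, 1), (1, 0, 1), (0, 1, 0), and the helper name `path_growth_rate` is ours. The characteristic polynomial of this matrix is $\lambda^3 - 2\lambda - 1 = (\lambda + 1)(\lambda^2 - \lambda - 1)$, so agreement is expected.

```python
import math

# Transition matrix of the three-rectangle Markov partition (assumed as read above).
A = [[0, 1, 1],
     [1, 0, 1],
     [0, 1, 0]]

def path_growth_rate(matrix, steps=80):
    """Estimate the leading eigenvalue as the ratio of successive path counts."""
    size = len(matrix)
    counts = [1] * size  # counts[i] = number of paths of the current length starting at i
    ratio = 0.0
    for _ in range(steps):
        new_counts = [sum(matrix[i][j] * counts[j] for j in range(size))
                      for i in range(size)]
        ratio = sum(new_counts) / sum(counts)
        counts = new_counts
    return ratio

growth = path_growth_rate(A)
golden_mean = (1 + math.sqrt(5)) / 2
print(growth, golden_mean)
```

Exact integer arithmetic is used for the path counts, so the only rounding happens in the final division.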
1 The construction of Markov partitions for toral automorphisms on T2 goes back to Berg [58]
and Adler & Weiss [10], extended to more general settings in [99, 100, 258, 512] among others.
3.1. Subshifts of Finite Type 55
[Figure 3.3: the Markov partition of the torus into the rectangles J1, J2, J3, with the stable and unstable directions of M indicated.]
Proof. We give the proof for $X \subset A^{\mathbb{N}_0}$ only; the two-sided case follows in a
similar way.
⇐: Let (X, σ) be an SFT of memory M (see below Definition 3.1) so
M + 1 is the length of the longest forbidden word. Let ε > 0 be arbitrary
and choose m ≥ M + 1 so large that $2^{-m} < \varepsilon$. Take $\delta = 2^{2-m}$. We need to
and $d(\sigma(x^n), x^{n+1}) \leq 2^{-N+2} < \delta$. Hence $(x^n)_{n \geq 0}$ is a δ-pseudo-orbit, which
can be ε-shadowed by some z ∈ X. But then $z_n = x^n_0 = y_n$ for every n ≥ 0,
so z = y ∈ X. Since y was arbitrary, up to the condition that each of its
N -blocks belongs to L(X), it follows that the only restriction of X involves
forbidden blocks of length ≤ N . Therefore X is an SFT.
It follows that $C^{-1} \lambda^n \leq p(n) \leq (\#A)^2 C n^{\#A} \lambda^n$ (where p(n) stands for the
word-complexity; see Definition 1.9) and $\lim_n \frac1n \log p(n) = \log \lambda$.
Proposition 3.15. If (Y, σ) is a factor of (X, σ), then htop (Y, σ) ≤ htop (X, σ).
If (X, σ) and (Y, σ) are conjugate, then htop (X, σ) = htop (Y, σ).
The result also holds in general, i.e. not just in the context of subshifts,
see Corollary 2.51, but using the word-complexity and sliding block codes,
the proof is particularly straightforward here.
This proves the first statement. Using this in both directions, we find
htop (X, σ) = htop (Y, σ).
As shown by Parry [446], see Theorem 6.67, irreducible SFTs are intrin-
sically ergodic. This follows also from Theorem 3.48 and Proposition 3.41.
Weiss [556] showed that factors of irreducible SFTs are intrinsically ergodic
as well.
Let G = (V, E) where V is the vertex set and E the edge set. For each
v ∈ V, let $E_v \subset E$ be the set of edges starting in v and let $E^v \subset E$ be the
set of edges terminating in v.
Proof. We give the proof for an elementary outsplit graph Ĝ; the general
outsplit and (elementary) insplit graphs follow similarly. By Theorem 1.23,
it suffices to give sliding block code representations for π : X̂ → X and
π̂ : X → X̂.
• The factor map π : X̂ → X is simple. If ê ∈ Ê replaces e ∈ E, then
f(ê) = e and $\pi(x)_i = f(x_i)$.
Sketch of proof. Prove it first for an elementary outsplit, and then com-
pose elementary outsplits to a general outsplit. For the first step, we compute
the elementary outsplit for the example of Figure 3.4.
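\[ A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix} \quad \text{and} \quad \hat{A} = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \end{pmatrix}. \]
Also
\[ D = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad C = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 0 \end{pmatrix}. \]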
Matrix multiplication confirms that DC = A and CD = Â.
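The two products can be verified with a few lines of Python. This is only a check of the arithmetic: the helper `matmul` is ours, and the entries are those of the 3 × 3 matrix A, the 4 × 4 matrix Â, the 3 × 4 matrix D and the 4 × 3 matrix C of this example.

```python
def matmul(P, Q):
    """Multiply two integer matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

A = [[1, 1, 1],
     [0, 1, 1],
     [1, 0, 0]]
A_hat = [[0, 0, 0, 1],
         [1, 1, 1, 0],
         [0, 0, 1, 1],
         [1, 1, 0, 0]]
D = [[1, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
C = [[0, 0, 1],
     [1, 1, 0],
     [0, 1, 1],
     [1, 0, 0]]

print(matmul(D, C) == A)      # D C should reproduce A
print(matmul(C, D) == A_hat)  # C D should reproduce A-hat
```

Note that D and C have non-negative integer entries only, which is what the definition of strong shift equivalence requires.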
Exercise 3.20. Do the same for the elementary insplit graph in the example
of Figure 3.4.
Definition 3.21. Two matrices A and Â are strongly shift equivalent
(of lag ℓ) (denoted as A ≈ Â) if there are (rectangular) matrices $D_i$, $C_i$ and
$A_i$, 1 ≤ i ≤ ℓ over $\mathbb{N}_0$ such that
(3.2) $A = A_0, \quad A_{i-1} = D_i C_i, \quad C_i D_i = A_i, \quad i = 1, \dots, \ell, \quad A_\ell = \hat{A}$.
Remark 3.22. One important restriction of this definition is that the conjugating
matrices must have non-negative integer entries. Even if a square
matrix has determinant ±1, its inverse may still have negative integers among
its entries. For example
\[ A = \begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \hat{A} = \begin{pmatrix} 3 & 2 \\ 2 & 1 \end{pmatrix} \]
are similar via $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} A = \hat{A} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$. From this, we can easily compute that the
traces $\mathrm{tr}(A^n) = \mathrm{tr}(\hat{A}^n)$ for all n ∈ Z, so A and Â share ζ-functions
$\zeta_A(t) := \exp\left( \sum_{n=1}^\infty \frac{t^n}{n} \mathrm{tr}(A^n) \right)$. However, A and Â are not (strongly) shift equivalent.
This is Williams’s [559, Example 3] counterexample to Bowen’s question of
whether sharing ζ-functions for SFTs suffices to conclude conjugacy.
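The similarity and the equality of traces in Remark 3.22 are easy to confirm numerically. The sketch below assumes that the conjugating matrix has rows (1, 1) and (1, −1); the helper names are ours. It only checks these two facts for small n and says nothing about (strong) shift equivalence itself.

```python
def matmul(P, Q):
    """Multiply two integer matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def trace_of_power(M, n):
    """Return tr(M^n) for a square integer matrix M and n >= 1."""
    power = M
    for _ in range(n - 1):
        power = matmul(power, M)
    return sum(power[i][i] for i in range(len(M)))

A = [[4, 1], [1, 0]]
A_hat = [[3, 2], [2, 1]]
P = [[1, 1], [1, -1]]  # conjugating matrix; note the negative entry

intertwines = matmul(P, A) == matmul(A_hat, P)
traces_agree = all(trace_of_power(A, n) == trace_of_power(A_hat, n)
                   for n in range(1, 11))
print(intertwines, traces_agree)
```

The negative entry of the conjugating matrix is exactly what the definition of strong shift equivalence rules out.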
Exercise 3.23. Show that strong shift equivalence ≈ is indeed an equiv-
alence relation between non-negative square matrices. Show that A ≈ Â
implies that A and  have the same leading eigenvalue λ = λ̂.
[Commutative diagram: the maps A on $\mathbb{Z}^n$ and Â on $\mathbb{Z}^{\hat{n}}$, linked by D and C.]
3.2. Sofic Shifts 61
Shift equivalence means that the ℓ-th powers $A^\ell$ and $\hat{A}^\ell$ are strong shift
equivalent (with lag 1). Shift equivalence is easier to verify than strong shift
equivalence, although verification can still be very complicated. But, and
this is Williams’s Conjecture, it is still not fully2 known if it is a complete
invariant; see [398, Section 7.3] and [106, Problem 19.1]. If A ≁ Â, then $X_A$
and $X_{\hat{A}}$ cannot be conjugate, but if A ∼ Â, this is insufficient to conclude
that $(X_A, \sigma)$ and $(X_{\hat{A}}, \sigma)$ are conjugate.
Exercise 3.26. Show that (i) $A \sim_\ell \hat{A}$ implies $A \sim_k \hat{A}$ for all k ≥ ℓ, (ii)
shift equivalence ∼ is an equivalence relation between non-negative square
matrices, and (iii) strong shift equivalence implies shift equivalence, with the
same value of ℓ.
Shift equivalent matrices have the same ζ-function, and many other
properties coincide too.
Lemma 3.27. If A and Â are shift equivalent (of lag ℓ), then they have the
same non-zero eigenvalues (so also $h_{top}(X_A, \sigma) = h_{top}(X_{\hat{A}}, \sigma)$).
In order to say what can be proved with shift equivalence, we define SFTs
(XA , σ) and (XÂ , σ) to be eventually conjugate if the n-block shifts are
conjugate for all sufficiently large n. Then, see [398, Theorem 7.5.15]:
Theorem 3.28. Two SFTs (XA , σ) and (XÂ , σ) are eventually conjugate if
and only if A and  are shift equivalent.
There remain many open (classification) problems in SFT, as well as in
sofic and other subshifts. The survey of Boyle [106] contains a long list of
open problems, many of which remain open today.
was coined by Benjy Weiss; it comes from the Hebrew word for “finite”. Much
of this section can be found in concise form in [364, Section 6.1].
Proof. Assume that the SFT has memory M . Let G be the vertex-labeled
M-block transition graph of the SFT; i.e. each $a_1 \cdots a_M \in L_M(X)$ is the
label of a unique vertex. We have an edge $a_1 \cdots a_M \to b_1 \cdots b_M$ if and only
if $a_1 \cdots a_M b_M = a_1 b_1 \cdots b_M \in L_{M+1}(X)$, and then this (M + 1)-word is also
the label of the edge. Since each infinite vertex-labeled path is in one-to-
one correspondence with an infinite edge-labeled path and also in one-to-one
correspondence with an infinite word in X, we have represented X as a sofic
shift.
Remark 3.31. Not every sofic shift is an SFT. For example the even shift
(Example 1.4) has an infinite collection of forbidden words, but it cannot be
described by a finite collection of forbidden words. Sofic shifts that are not
of finite type are called strictly sofic.
The following theorem shows that we can equally define the sofic subshifts
as those that are a factor of a subshift of finite type.
Corollary 3.33. Every factor of a sofic shift is again a sofic shift. Every
shift conjugate to a sofic shift is again sofic.
called follower-separated if for each vertex v ∈ G, the follower set (i.e. the
set of labeled words associated to paths starting in v) is different from the
follower set of every other vertex.
For example, the Fibonacci SFT of Example 1.3 and the even shift of
Example 1.4 are both coded subshifts, with sets of code words C = {0, 01}
and C = {0, 01}, respectively. On the other hand, the SFT $(X_A, \sigma)$ on the
alphabet {0, 1} with transition matrix $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ is not transitive, and it is
also not a coded shift, because no code word containing 01 can ever be used
twice in a concatenation.
Proof. Rewrite the SFT to an SFT with memory M = 1; i.e. all forbidden
words have length ≤ 2. Let G be the transition graph; since the SFT is
transitive, G is strongly connected. Fix vertices a, b such that the arrow a → b
occurs in G. Now let S contain the codes of all finite paths b → · · · → a;
these can be freely concatenated.
Remark 3.42. Naturally, the set C of codes may not be the most economical,
but the idea of the proof of Proposition 3.41 is quite general. It can also be
used to show that sofic and synchronized subshifts are coded. Therefore we
have the inclusion:
All these inclusions are strict. For instance, Dyck shifts are coded but
not synchronized; see Section 3.10. Coded shifts are always transitive, but
not always totally transitive; indeed, if the lengths of all code words are
multiples of N ≥ 2, then $\sigma^N$ can easily be non-transitive (but not necessarily;
see [180, Theorem 4.1]). Totally transitive coded subshifts are always weak-
mixing (since they have a dense set of periodic orbits, see [325, Corollary
3.6]), and also topologically mixing, see [180, Theorem 2.2]. Thus for coded
systems, these three notions coincide.
· · · 01
00
00
10
10 0 · · · = · · · 0 10
01 00
01
01
00 · · ·
10
Formula (3.6) suggests that the topological entropy $h_{top}(\sigma) = \frac12 \log 3$, and
this is indeed true.
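[Figure 3.5: the transition graphs, based at the vertex v0, of the coded shift with code words 00, 01, 10 and of the coded shift with code words 01, 23, 45.]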
We see this by considering $X_{\tilde{C}}$ for $\tilde{C} = \{01, 23, 45\}$. These have isomorphic
transition graphs (with isomorphic path spaces), see Figure 3.5, but the
latter is clearly uniquely decipherable with entropy $\frac12 \log 3$. Since $(X_C, \sigma)$ is
a factor of $(X_{\tilde{C}}, \sigma)$ via the sliding block-code π : 0 ↦ 0, 1 ↦ 1, 2 ↦ 1, 3 ↦ 0,
4 ↦ 0, 5 ↦ 0, the factor map is at most 2-to-1 and hence doesn’t decrease
entropy.
i.e. the n-th code word is a concatenation of all words in {0, 1}* of length n.
Then $q_\ell = 1$ if $\ell = n2^n$ and $q_\ell = 0$ otherwise. Since every word appears in
$X_C$, the true entropy is $h_{top}(X_C) = \log 2$, but (3.6) yields
$e^{-2h} + e^{-8h} + e^{-24h} + e^{-64h} + \cdots = 1$, which gives h = log 1.1809 · · · < log 2.
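The value of h predicted by formula (3.6) can be approximated numerically. The following sketch is not taken from the text: it assumes that $q_\ell$ is given as a finite dictionary (length ↦ number of code words of that length), truncates the second example to the first eight code words, and uses plain bisection; the names `F` and `entropy_root` are ours.

```python
import math

def F(h, q):
    """Evaluate F(h) = sum over lengths l of q_l * exp(-l * h)."""
    return sum(count * math.exp(-length * h) for length, count in q.items())

def entropy_root(q, tol=1e-13):
    """Solve F(h) = 1 by bisection; F is decreasing in h and F(0) >= 1 is assumed."""
    lo, hi = 0.0, 1.0
    while F(hi, q) > 1.0:  # enlarge the bracket until F(hi) <= 1
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid, q) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Three code words of length 2, as for the code {01, 23, 45}.
h_three = entropy_root({2: 3})
# One code word of length n * 2^n for n = 1, ..., 8 (a truncation of the example).
h_long = entropy_root({n * 2**n: 1 for n in range(1, 9)})
print(h_three, 0.5 * math.log(3))
print(math.exp(h_long))
```

The first value agrees with $\frac12 \log 3$, and the exponential of the second is close to 1.1809, well below 2; the truncation is harmless because the omitted terms are astronomically small at the root.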
Theorem 3.45 ([450, Theorems 1.7 and 1.8]). Recall from (3.6) that
$F(h) = \sum_{\ell \geq 1} q_\ell e^{-\ell h}$.
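Proof. (i) Let $\mathrm{Pre}_n$ and $\mathrm{Suf}_n$ denote the length n prefixes and suffixes of
code words $C \in \mathcal{C}$. Note that $\mathrm{Pre}_n \cup \mathrm{Suf}_n \subset W_n$. Every word in $L(X_C)$ can
be written as the concatenation of one suffix, some code words, and one
prefix, and therefore
\[ L_n(X_C) = W_n \cup \bigcup_{k=2}^{n} \ \bigcup_{\substack{n_1 + \cdots + n_k = n \\ n_i \geq 1}} \mathrm{Suf}_{n_1} \mathcal{C}_{n_2} \cdots \mathcal{C}_{n_{k-1}} \mathrm{Pre}_{n_k}, \]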
where the inner union really runs over the concatenations of all words in the
indicated sets. Note that if the concatenation starts with a full code word,
then this counts as a suffix, and similarly if the concatenation ends with a
full code. Therefore it is justified to assume that ni ≥ 1 for each i.
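This gives
\[ \#L_n(X_C) \leq \#W_n + \sum_{k=2}^{n} \ \sum_{\substack{n_1 + \cdots + n_k = n \\ n_i \geq 1}} \#\mathrm{Suf}_{n_1} \cdot q_{n_2} \cdots q_{n_{k-1}} \cdot \#\mathrm{Pre}_{n_k}. \]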
Since $\lim_n \frac1n \log \#W_n = h(U_C)$, our assumption $h > h(U_C)$ implies that there
is a constant K such that
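Therefore, setting $m = n_1 + n_k$,
\begin{align*}
\#L_n(X_C) &\leq K e^{nh} + \sum_{k=2}^{n} \ \sum_{\substack{n_1 + \cdots + n_k = n \\ n_i \geq 1}} K^2 e^{(n_1 + n_k)h} \prod_{j=2}^{k-1} q_{n_j} \\
 &= e^{nh} \left( K + K^2 \sum_{m=0}^{n} \sum_{k=2}^{n-m} \ \sum_{\substack{n_2 + \cdots + n_{k-1} = n-m \\ n_i \geq 1}} \prod_{j=2}^{k-1} q_{n_j} e^{-n_j h} \right),
\end{align*}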
3.3. Coded Subshifts 69
where the empty product counts as 1. All the terms in the last sum are part
of the expansion of $F(h)^{k-2} = \left( \sum_{j=1}^{\infty} q_j e^{-jh} \right)^{k-2}$. By the assumption that
F(h) < 1, we obtain
\[ \#L_n(X_C) \leq e^{nh} \left( K + (n+1)K^2 \sum_{k=2}^{n} F(h)^{k-2} \right) \leq e^{nh} \left( K + \frac{(n+1)K^2}{1 - F(h)} \right). \]
Taking logarithms, dividing by n, and taking the limit n → ∞ gives htop (XC )
≤ h.
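(ii) If F(h) > 1, then for all t ∈ N sufficiently large, $S_t := \sum_{j=1}^{t} q_j e^{-jh} > 1$.
For k ∈ N, we have the expansion
\[ S_t^k = \left( \sum_{j=1}^{t} q_j e^{-jh} \right)^k = \sum_{n=k}^{tk} e^{-nh} \sum_{\substack{n_1 + \cdots + n_k = n \\ 1 \leq n_i \leq t}} \prod_{j=1}^{k} q_{n_j}. \]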
We can now state the consequence for the entropy of coded shifts, para-
phrasing results of Pavlov [450, Theorems 1.1–1.3].
Corollary 3.46. Let $h(U_C) = \lim_n \frac1n \log p_{U_C}(n)$ be the exponential growth
rate of words in $U_C$, and recall the function $F(h) = \sum_{\ell \geq 1} q_\ell e^{-\ell h}$ from (3.6).
(a) Assume that $X_C$ has the unique decomposition property. If $F(h(U_C)) \geq 1$,
then $F(h_{top}(X_C, \sigma)) = 1$. In fact, $h = h_{top}(X_C)$ is the only
solution of F(h) = 1.
(b) If F (h(UC )) < 1, then htop (XC , σ) = h(UC ).
Also htop (XC , σ) = h(UC ) if and only if F (h(UC )) ≤ 1.
Proof. The map h → F (h) has a critical value hc such that F (h) = ∞ for
h < hc and F (h) < ∞ is strictly decreasing for h > hc . At h = hc , F (h) can
be finite or infinite.
(a) If 1 < F (h(UC )) is finite, then hc ≤ h(UC ), as there is a unique
h1 > h(UC ) such that F (h1 ) = 1. Theorem 3.45 gives that htop (XC ) = h1 .
(b) If F (h(UC )) < 1, then by Theorem 3.45(i) we have htop (XC ) <
h(UC ) + ε for every ε > 0. Since XC ⊃ UC , we have htop (XC ) ≥ h(UC ),
so htop (XC ) = h(UC ) follows.
Combining (a) and (b) shows that htop (XC ) = h(UC ) if and only if
F (h(UC )) ≤ 1.
Corollary 3.47. Every non-periodic coded shift (XC , σ) has positive en-
tropy.
Hereditary shifts first appeared in [356, page 882]. It is clear that this
rule is shift-invariant, but it is not necessarily closed under taking limits.
For example, the collection
(3.7) $X = \{x \in \{0, 1\}^{\mathbb{N}} : x_i = 0 \text{ infinitely often}\}$
is hereditary, but it contains the sequence $1^\infty$ in its closure. Therefore, some
authors [382] make the distinction between hereditary shift and subordi-
nate shift, the latter being hereditary and closed. We will write hereditary
subshift, meaning it is indeed closed. SFTs are hereditary, if the collection of
forbidden words of length M is exactly the largest in the partial order ≤her
on AM . A similar fact holds for sofic shifts.
Lemma 3.51. The hereditary closure of (i.e. smallest hereditary subshift
containing) the sofic shift (X, σ) is sofic.
We will see later that also β-shifts (Corollary 3.71) and spacing shifts
are hereditary. Another way to create hereditary subshifts is by stipulating
an upper bound of the frequency of non-zero digits.
Definition 3.52. Let A = {0, 1, . . . , N − 1} be the alphabet. The (upper)
density of the subshift $X \subset A^{\mathbb{N} \text{ or } \mathbb{Z}}$ is
\[ \bar{d}(X) = \sup\{\bar{d}(x) : x \in X\}, \]
where $\bar{d}(x)$ is the upper density (see Definition 8.52) of the set of indices
j such that $x_j \neq 0$; i.e. $\bar{d}(x) = \limsup_k \frac1k \#\{0 \leq j < k : x_j \neq 0\}$. Let
$X_\delta := \{x \in A^{\mathbb{N}} : \bar{d}(x) \leq \delta\}$.
Proof. Let X be a one-sided hereditary shift (the two-sided case goes sim-
ilarly). Assume that X is not a single periodic orbit, which for hereditary
shifts means $X \neq \{0^\infty\}$. If $\bar{d}(X) > 0$, then for every ε > 0 there are
x ∈ X and infinitely many integers n such that $\#\{1 \leq i \leq n : x_i \neq 0\} \geq (\bar{d}(X) - \varepsilon)n$.
Since X is hereditary,
\[ \frac1n \log p(n) \geq \frac1n \log 2^{(\bar{d}(X) - \varepsilon)n} = (\bar{d}(X) - \varepsilon) \log 2. \]
But $\lim_n \frac1n \log p(n)$ exists according to Fekete’s Lemma 1.15, and ε > 0 is
arbitrary, so $h_{top}(\sigma) \geq \bar{d}(X) \log 2$. Note that if X, for every ε > 0, contains
sequences x such that $\#\{1 \leq i \leq n : x_i = N - 1\} \geq (\bar{d}(X) - \varepsilon)n$, then we find
$h_{top}(\sigma) \geq \bar{d}(X) \log N$.
For the converse, assume that $\bar{d}(X) = 0$, so for every ε > 0 there is $n_0$
such that for all $n \geq n_0$,
\[ p(n) \leq \sum_{k=0}^{n\varepsilon} \binom{n}{k} \leq n\varepsilon \binom{n}{n\varepsilon}. \]
$X_\delta := \{x \in A^{\mathbb{N} \text{ or } \mathbb{Z}} : \bar{d}(x) \leq \delta\}$
Corollary 3.60. If $\sigma^m(z) \preceq_{sum} z$ for all m ≥ 0, then z is the maximal
sequence of the density shift $X_f$ for $f(n) = \sum_{i=1}^{n} z_i$.
In particular, SFTs (X, σ) that are also density shifts are transitive, because,
unless $X = \{0^\infty\}$, there is a non-trivial word v and x ∈ X that
contains v infinitely often as subword. In fact, density SFTs are completely
characterized as those for which the canonical function f satisfies
$\inf_n f(n)/n = f(p)/p$ for some p ∈ N; see [523, Theorem 4.3]. On the other
hand, if f is bounded, then all $x \in X_f$ end with $0^\infty$. They can be represented
by a finite edge-labeled transition graph [523, Theorem 2.16] and
also have a finite collection of follower sets. Hence such density shifts are
non-transitive sofic shifts.
Sofic density shifts, in general, are characterized [523, Theorem 6.3] as
those for which the maximal sequence z is eventually periodic ($z_n = z_{n+p}$
for n sufficiently large), or equivalently f(n + p) = f(n) + k (where
$k = \sum_{i=n}^{n+p-1} z_i$ and k > 0 if and only if $X_f$ is transitive).
Since every infinite subshift is expansive (see below Definition 1.39), The-
orems 3.61 and 3.62 allow the following characterizations of chaos for density
shifts.
Corollary 3.63. Let (Xf , σ) be a density shift with canonical function f .
Then:
(1) $(X_f, \sigma)$ is Devaney chaotic if and only if $\inf_n f(n)/n > 0$.
(2) (Xf , σ) is Auslander-Yorke chaotic if and only if f is unbounded.
(3) (Xf , σ) is Li-Yorke chaotic if and only if f is unbounded.
In other words, $x_k = T_\beta^k(x)$ for the map $T_\beta : x \mapsto \beta x \bmod 1$, and $b_{k+1}$ is the
integer part of $\beta x_k$.
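The greedy algorithm just described is short enough to write out. The following Python sketch is an illustration under stated assumptions: it works in floating point, so it is only reliable for a moderate number of digits and for inputs where rounding does not interfere, and the function name `greedy_beta_expansion` is ours.

```python
import math

def greedy_beta_expansion(x, beta, length):
    """Return the first `length` greedy digits b_1, b_2, ... of x in base beta.

    At each step the next digit is the integer part of beta * x_k and the
    new remainder is x_{k+1} = beta * x_k mod 1.
    """
    digits = []
    for _ in range(length):
        scaled = beta * x
        digit = math.floor(scaled)
        digits.append(digit)
        x = scaled - digit  # remainder, i.e. T_beta applied to the old x
    return digits

binary_digits = greedy_beta_expansion(0.625, 2.0, 5)
ternary_digits = greedy_beta_expansion(0.5, 3.0, 4)
print(binary_digits, ternary_digits)
```

For β = 2 and x = 0.625 this reproduces the binary digits 1, 0, 1, 0, 0, and for β = 3 and x = 0.5 it gives the constant digit 1; both inputs are exactly representable, so rounding plays no role here.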
Definition 3.64. The closure of the greedy β-expansions of all x ∈ [0, 1] is
a subshift of $\{0, \dots, \lfloor \beta \rfloor\}^{\mathbb{N}}$; it is called the β-shift and we will denote it as
$(X_\beta, \sigma)$.
If $b = (b_k)_{k=1}^{\infty}$ is the β-expansion of some x ∈ [0, 1], then σ(b) is the
β-expansion of $T_\beta(x)$. The following lemma from [445] characterizes the
β-shift in terms of the lexicographic order $\preceq_{lex}$:
Lemma 3.65. Let $c = c_1 c_2 c_3 \cdots$ be the β-expansion of 1, and suppose it is
not finite; i.e. $c_i > 0$ infinitely often5. Then $b \in X_\beta$ if and only if
\[ \sigma^n(b) \preceq_{lex} c \quad \text{for all } n \geq 0. \]
5 This condition is required for the “if” direction. For example, if $c = 1110^\infty$ as in Example 3.66,
then $b = (110)^\infty <_{lex} c$, but there is no point x ∈ [0, 1] with this itinerary. In fact, b is
the lazy expansion of the point 1; it is “the other” canonical itinerary that 1 has.
3.5. β-Shifts and β-Expansions 79
(of intervals of length $\leq \beta^{-r}$) is a single point x with $(b_k(x))_{k \geq 1} = (b_k)_{k \geq 1}$.
If $n_{s+1} = \infty$ for some s ≥ 0 and we set $A_{b_{n_s+1} b_{n_s+2} \cdots} = \{1\}$, then
$\{x\} = \bigcap_{r=0}^{s} T_\beta^{-n_r}(A_{b_{n_r+1} \cdots b_{n_{r+1}}})$ gives again the unique point with
$(b_k(x))_{k \geq 1} = (b_k)_{k \geq 1}$.
The greedy expansion above is not the only way of expressing
$x = \sum_{k\ge1} b_k \beta^{-k}$ for $b_k$ in the digit set $\{0, \dots, \lfloor\beta\rfloor\}$. For instance, in the lazy
expansion we always take the smallest possible6 digit $b_k$ such that the
sum x can still be achieved. For β = 2, choosing the greedy and lazy
6 In terms of the algorithm given for the greedy expansion, we need to take
$b_k = \lceil \beta x_{k-1} - \lfloor\beta\rfloor/(\beta - 1) \rceil$ so that $x_k \le \sum_{j>k} \lfloor\beta\rfloor \beta^{k-j}$; i.e. $x_k$ (and therefore x) can still be reached choosing
the remaining digits $b_j$ maximal.
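The lazy rule of footnote 6 can be sketched in the same way as the greedy algorithm. The function name `lazy_expansion`, the guard `max(0, ...)` for points that are still far below the maximal tail, and the test value β = 3/2 are assumptions of this sketch rather than part of the text.

```python
from fractions import Fraction
from math import ceil, floor


def lazy_expansion(x, beta, n_digits):
    """First n_digits lazy digits of x in base beta, with digits in {0, ..., floor(beta)}.
    Assumes 0 <= x <= floor(beta) / (beta - 1)."""
    d_max = floor(beta)
    tail_max = Fraction(d_max) / (beta - 1)   # value of sum_{j>k} d_max * beta^(k-j)
    digits = []
    for _ in range(n_digits):
        # smallest digit such that the remainder can still be reached with maximal digits;
        # the max(0, ...) guard is an addition made for this sketch
        b = max(0, ceil(beta * x - tail_max))
        x = beta * x - b
        digits.append(b)
    return digits, x


digits, remainder = lazy_expansion(Fraction(1), Fraction(3, 2), 5)
```

For this choice of β the lazy digits of 1 begin with 0, whereas the greedy digits begin with 1, so the two expansions already differ in the first position.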
80 3. Subshifts of Positive Entropy
Figure 3.7. The map $T_\beta$ with switch regions $\Delta_i \cap \Delta_{i+1}$.
On the other hand, points whose forward Tβ -orbits avoid switch regions
(and then these forward orbits are indeed uniquely defined) have only one
expansion. Such points are called univoque; we denote the set of univoque
points in (0, β/(β − 1)) by Uβ . Larger values of β lead to smaller switch
regions and thus smaller univoque sets, that is, until β becomes integer and
the digit set is increased by one. The following theorem is a summary of
results from [278, 368], just for the digit set {0, 1}.
(3) #Uβ = ℵ0 for βc < β < βKL ≈ 1.787 . . . , the so-called Komornik-
Loreti constant7 ;
(4) $\#U_\beta = 2^{\aleph_0}$ for $\beta_{KL} \le \beta < 2$; it is a Cantor set of positive Hausdorff dimension;
(5) $U_\beta = (0, 1) \setminus \{\text{dyadic rationals}\}$ if β = 2.
In fact, the Lebesgue measure $\mathrm{Leb}(U_\beta) = 0$ for all β ∈ [1, 2).
Further details are given also in [201]. Previously, Erdős and coauthors
[238–240] studied the number of β-representations of 1 as a function of β. For
similar results for larger digit sets {0, 1, . . . , m}, see e.g. [42, 200], among a
by now very extensive literature.
Proposition 3.68. The β-shift is a coded shift.
Apart from the single infinite word, these are exactly the indices of the
intervals $A_{c_1 \cdots c_k j}$ in (3.11). We know from (3.12) that $T_\beta^{k+1}(A_{c_1 \cdots c_k j}) = [0, 1)$,
so free concatenations of such code words all represent $(b_k(x))_{k\ge1}$ for some
x ∈ [0, 1]. Any concatenation in $S^*$ also satisfies Lemma 3.65, so that $S^*$ is
dense in (and in fact equal to) $X_\beta$.
Figure 3.8. The edge-labeled transition graph for a β-shift with c = 21020102 . . . .
Proof. This was first shown by Hofbauer [315]; see also [159] based on
a weakened form of specification9 . Implementing Theorem 3.48, we have
#{s ∈ S : |s| = n} ≤ β for each n, so the exponential growth rate of these
words is 0. Hence Theorem 3.48 even implies that every factor of the β-shift
is intrinsically ergodic.
Remark 3.70. For the β-transformation with slope β > 1, the measure
of maximal entropy is absolutely continuous w.r.t. Lebesgue measure, and
there is an explicit formula for the density:
\[ \frac{d\mu}{dx} = \frac{1}{\Lambda} \sum_{n\ge1} \beta^{-n} \mathbf{1}_{[0, T_\beta^n(1)]} \]
The following result was probably first stated in [382, Section 6].
Corollary 3.71. For every β ∈ [1, 2], the β-shift (Xβ , σ) is hereditary.
Proof. This follows directly from Lemma 3.65 which determines the
shape of the code words in Proposition 3.68. Indeed, let $x \in X_\beta$ and
$n = \min\{i \ge 1 : x_i \ne c_i\}$. Then $x_n < c_n$ and $x_1 \cdots x_n$ is a code word. Now
repeat the argument with $\sigma^n(x)$.
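The parsing argument in this proof can be made concrete for finite words. The following is a minimal sketch: the function name `split_into_code_words` is hypothetical, the prefix of c is taken from the caption of Figure 3.8, and the supplied prefix of c is assumed to be at least as long as the longest block of agreement.

```python
def split_into_code_words(x, c):
    """Cut the finite word x into code words c_1 ... c_{n-1} j with j < c_n,
    following the proof of Corollary 3.71. Returns (code_words, unparsed_tail)."""
    words, rest = [], list(x)
    while rest:
        # first position (0-based) where the remaining word differs from c
        n = next((i for i, (a, b) in enumerate(zip(rest, c)) if a != b), None)
        if n is None:
            break                      # rest agrees with a prefix of c: no complete code word
        if rest[n] > c[n]:
            raise ValueError("word violates the lexicographic condition of Lemma 3.65")
        words.append(rest[: n + 1])
        rest = rest[n + 1:]
    return words, rest


c_prefix = [2, 1, 0, 2, 0, 1, 0, 2]            # c = 21020102..., as in Figure 3.8
words, tail = split_into_code_words([2, 0, 1, 2, 1, 0, 1, 0, 2, 1], c_prefix)
```

The unparsed tail is always a prefix of c, which reflects the single infinite word that is not itself a concatenation of code words.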
Theorem 3.72. The Tβ -orbit of 1
(1) contains 0 if and only if Xβ is conjugate to an SFT;
(2) is preperiodic if and only if Xβ is sofic10 ;
9 This is because specification as in Definition 2.83 and hence Lemma 2.87 do not apply.
10 Since 1 is not in the range of $T_\beta$, the orbit of 1 cannot be periodic. If $T^n(1) = j/\beta$ for some
j ∈ N, then $T^{n+1}(1) = 0$ and case (1) applies, even though $\lim_{y \nearrow j/\beta} T_\beta(y) = 1$.
We give a proof below but refer to [446,488] for other proofs and related
results.
$T_\beta^{M+N}(y) = z$. Symbolically, this means that for every word w ∈ L(X) such
that $vw \in L(X_\beta)$, also $uvw \in L(X_\beta)$. In other words, v is synchronizing.
Conversely, suppose that v ∈ L(X) is some word. Then v corresponds
to the domain Z of some branch of $T_\beta^N$. If orb(1) is dense, then there is
n ∈ N such that $T_\beta^n(1) \in Z$. Therefore there is a one-sided neighborhood
Y of 1 such that $T_\beta^n(Y) = [0, T_\beta^n(1)]$, and there is $x \in Z \setminus T_\beta^n(Y)$. Let w
be the itinerary of $T_\beta^N(x)$; since x ∈ Z, $vw \in L(X_\beta)$. Similarly, taking
$u = c_1 c_2 \cdots c_n$, since $T_\beta^n(1) \in Z$, also $uv \in L(X_\beta)$. However, $uvw \notin L(X_\beta)$,
because there is no y ∈ Y such that $T_\beta^n(y) = x$. This shows that v is not
synchronizing, and since v was arbitrary, $X_\beta$ is not synchronized.
Finally, for statement (4), take N such that the cylinder set $[0^N]$ corresponds
to a subinterval $Z_N$ contained in [0, δ]. Then $T_\beta^N(Z_N) = [0, 1]$.
Also, for any k-cylinder [x] corresponding to an interval $Z_x \subset [0, 1]$, we have
$T_\beta^k(Z_x) \supset [0, \delta] \supset Z_N$. Specification follows from this.
On the other hand, if 0 is an accumulation point of orb(1), then for any
M, N ∈ N, there is some word $x \in L_M(X_\beta)$ corresponding to an interval $Z_x$
such that $T_\beta^M(Z_x) \subset [0, \beta^{-N+1}]$. Then there is no word $y \in L_N(X_\beta)$ such
that $xy1 \in L(X_\beta)$, and thus specification fails.
holds. The subshift $X_\beta$ is itself not of finite type, because there are infinitely
many forbidden words $1110^k 1$, k ≥ 0, but by some recoding it can be seen
to be conjugate to an SFT (see the middle panel of Figure 3.6), and it has a
simple edge-labeled transition graph. Also, $X_\beta$ is the image of the length one
sliding block code π(a) = π(b) = 0, π(c) = π(d) = 1, because a, b ⊂ [0, 1/β]
and c, d ⊂ [1/β, 1].
Figure 3.9. The transition graph for a sofic β-shift for β = 1.801937735 . . . , with partition intervals a = [0, β(β − 1) − 1], b = [β(β − 1) − 1, 1/β], c = [1/β, β − 1], d = [β − 1, 1].
Remark 3.75. We refer to [490] and [85, Chapter 7] for more results in
this spirit. If $X_\beta$ is sofic, then the $T_\beta$-orbit of 1 is a finite set, say
$0 = x_0 < x_1 < x_2 < \cdots < x_d = 1$, where $x_0 = 0$ is added for convenience, also
if it is not part of $\mathrm{orb}_{T_\beta}(1)$. The intervals $\tau_i = [x_{i-1}, x_i]$ form a Markov
partition with associated matrix $M = (m_{ij})_{i,j=1}^{d}$ where $m_{ij} = 1$ if $T_\beta(\tau_i) \supset \tau_j$
and $m_{ij} = 0$ otherwise. This also defines a substitution $\chi_\beta(a) = a_1 \ldots a_t$
(with the letters $a_i$ in increasing order) if $T_\beta(\tau_a) = \tau_{a_1} \cup \cdots \cup \tau_{a_t}$, with
fixed point $\rho = \lim_n \chi_\beta^n(a_1)$ and substitution shift $(X_\rho, \sigma)$ for $X_\rho = \overline{\mathrm{orb}_\sigma(\rho)}$.
See [15, 16] for studies of these kinds of substitution systems. The Pisot
substitution conjecture states that this subshift has pure point spectrum
(see Section 6.8.3) if and only if β is a Pisot number. This special version of
the Pisot substitution conjecture was proved by Barge [50].
Proof. This result comes from [435, Theorem 2.25], but we give a different
dynamical proof. Set β > 1, and assume that $c = c_1 c_2 c_3 \cdots$ is the β-expansion
of 1. Let $D_0 = [0, 1]$ and in general11 $D_n = [0, T_\beta^n(1)]$. First
assume that all points $T_\beta^n(1)$ are distinct. The proof will be by induction.
11 This notation is derived from the Hofbauer tower construction from Section 3.6.3 applied
to β-transformations. If the orbit of 1 is infinite, then there are n + 1 levels in the tower of height
≤ n. The image of each n-cylinder under $T_\beta^n$ is one of these, and therefore #F(n) = n + 1. The
same result holds for unimodal maps. More generally, for interval maps with d + 1 branches, we
have #F(n) ≤ dn + 1.
Theorem 3.77. The β-shift for β > 1 has topological entropy log β.
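As a numerical illustration of Theorem 3.77 (not a substitute for its proof), one can count free concatenations of code words. The sketch below assumes that there are exactly $c_k$ code words of length k, which is what the description of the code words $c_1 \cdots c_{k-1} j$, $0 \le j < c_k$, gives, and it uses the golden mean, where the expansion of 1 is the finite word 11; the function name `count_concatenations` is hypothetical.

```python
from math import log


def count_concatenations(c, n_max):
    """p[N] = number of words of length N that are free concatenations of code words,
    computed from p[N] = sum_k c_k * p[N - k] with c_k code words of length k."""
    p = [1] + [0] * n_max                      # p[0] = 1 counts the empty word
    for N in range(1, n_max + 1):
        p[N] = sum(c[k - 1] * p[N - k] for k in range(1, min(N, len(c)) + 1))
    return p


c = [1, 1]                                     # digits of the expansion of 1; all later digits are 0
p = count_concatenations(c, 40)
growth = log(p[40] / p[39])                    # crude estimate of the exponential growth rate
```

In this example the counts are Fibonacci numbers and the estimated growth rate agrees with $\log \beta$ for $\beta = (1 + \sqrt5)/2$ to many digits.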
12 The fact that $\{A_{c_1 \cdots c_k j} : k \in \mathbb{N}, 0 \le j < c_{k+1}\}$ is a partition of [0, 1) shows that $(b_k)_{k\ge1}$
starts with a code word rather than the suffix of a code word for every x ∈ [0, 1).
power series
\[
\begin{aligned}
1 + f_{S^*}(t) f_S(t) &= 1 + \sum_{n\ge0} p_{S^*}(n) t^n \sum_{m\ge1} \#\{s \in S : |s| = m\} t^m \\
&= 1 + \sum_{N\ge1} \sum_{k=1}^{N} p_{S^*}(N-k) t^{N-k}\, \#\{s \in S : |s| = k\} t^k \\
&= 1 + \sum_{N\ge1} p_{S^*}(N) t^N = f_{S^*}(t).
\end{aligned}
\]
Therefore $f_{S^*}(t) = \frac{1}{1 - f_S(t)}$. Since $1 = \sum_{n\ge1} c_n \beta^{-n} = f_S(\beta^{-1})$, $\beta^{-1}$ is a
(simple) pole of $f_{S^*}$ and $f_{S^*}(t)$ is well-defined for $|t| < \beta^{-1}$. Hence $\beta^{-1}$ is
One can ask whether β-shifts are density shifts and vice versa. After
all, the one-sided β-shift $(X_\beta, \sigma)$ is characterized as $\{x \in A^{\mathbb{N}} : x \preceq_{lex} c\}$
for the $\preceq_{lex}$-maximal sequences c of Lemma 3.65 and the one-sided density
shift $(X_f, \sigma)$ is characterized as $\{x \in A^{\mathbb{N}} : x \preceq_{sum} z\}$ for the $\preceq_{sum}$-maximal
sequence z of (3.8). If $\sigma^n(y) \preceq_{sum} x$ for all n ≥ 0, then $\sigma^n(y) \preceq_{lex} x$ for
all n ≥ 0; see [523, Lemma 8.1]. Therefore the shift-maximal sequence of a
density shift is also shift-maximal for a β-shift, and every one-sided density
shift is also a β-shift. The converse, however, is false. For example, $c = 302^\infty$
is shift-maximal w.r.t. $\preceq_{lex}$ but not w.r.t. $\preceq_{sum}$ because $\sigma^2(c) \not\preceq_{sum} c$ (in
fact, these two sequences are not comparable). A way of finding (non-sofic)
density β-shifts with β ∈ [1, 2]13 is as follows: Given a β-transformation
$T_\beta : [0, 1] \to [0, 1]$, define $\bar T_\beta : [1 - \frac{1}{\beta}, 1] \to [1 - \frac{1}{\beta}, 1]$ by
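\[
(3.16) \qquad \bar T_\beta(x) =
\begin{cases}
T_\beta(x) = \beta x & \text{if } 1 - \frac{1}{\beta} \le x \le \frac{1}{\beta}, \\
1 - \frac{1}{\beta} & \text{if } \frac{1}{\beta} < x < \frac{2\beta - 1}{\beta^2}, \\
T_\beta(x) = \beta x - 1 & \text{if } \frac{2\beta - 1}{\beta^2} \le x \le 1;
\end{cases}
\]
see Figure 3.10.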
Figure 3.10. The maps $T_\beta$ and $\bar T_\beta$, with the plateau A.
13 This is for simplicity of exposition; similar constructions for β > 2 are of course possible.
Since $\bar T_\beta(1 - \frac{1}{\beta}) = \bar T_\beta(1) = \beta - 1$, this map can be considered as a
non-decreasing circle endomorphism on $[1 - \frac{1}{\beta}, 1]/_{1 - \frac{1}{\beta} \sim 1}$, with plateau
$A = [\frac{1}{\beta}, \frac{2\beta - 1}{\beta^2}]$. If $\bar T_\beta^n(1) \notin A$ for all n ≥ 1, then the rotation number
$\alpha := \rho(\bar T_\beta) \notin \mathbb{Q}$, and c = i(1) is shift-maximal both w.r.t. $\preceq_{lex}$ and $\preceq_{sum}$, and
it is in particular a sequence with maximum frequency of 1's. It is also a
Sturmian sequence; more specifically, the itinerary of α for the circle rotation
$R_\alpha : S^1 \to S^1$ w.r.t. the partition (0, α] (with symbol 1) and (α, 1] (with
symbol 0); cf. Definition 4.48. The canonical function of the density shift
equals the Beatty sequence $f(n) = \lceil n\alpha \rceil$.
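The link between the rotation itinerary and the Beatty sequence can be checked numerically. The sketch below is an illustration under stated assumptions: the rotation number is chosen as the golden mean rotation (not derived from any particular $\bar T_\beta$), floating point arithmetic is used, and the function name `rotation_itinerary` is hypothetical.

```python
from math import ceil, sqrt


def rotation_itinerary(alpha, n_symbols):
    """Itinerary of the point alpha under x -> x + alpha mod 1,
    with symbol 1 on (0, alpha] and symbol 0 on (alpha, 1]."""
    symbols = []
    for k in range(1, n_symbols + 1):
        x = (k * alpha) % 1.0            # R_alpha^{k-1}(alpha) = k * alpha mod 1
        symbols.append(1 if 0.0 < x <= alpha else 0)
    return symbols


alpha = (sqrt(5) - 1) / 2                # illustrative irrational rotation number
word = rotation_itinerary(alpha, 50)
prefix_sums = [sum(word[:n]) for n in range(1, 51)]
```

For each n up to 50 the number of 1's among the first n symbols equals $\lceil n\alpha \rceil$, in line with the Beatty sequence description of the canonical function.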
14 Decreasing in our definition; in the frequently used family $f_c(z) = z^2 + c$, $c \in [-2, \frac{1}{2}]$, the
roles of increasing and decreasing are reversed.
15 Also for multimodal maps (i.e. continuous interval maps with multiple critical points), symbolic
dynamics have been studied. Much of the structure presented here has a direct analogue, albeit
on a larger alphabet, for the multimodal case. However, since the multimodal case doesn't present
substantially different phenomena from the unimodal case, we omit it from this text, but see e.g.
[92, 414, 420].
3.6. Unimodal Subshifts 89
16 Also $h_{top}(f) = \max\{0, \lim_{n\to\infty} \frac{1}{n} \log \mathrm{Var}(f^n)\} = \limsup_n \frac{1}{n} \log \#\{n\text{-periodic points}\}$, so
specifically, $h_{top}(T_s) = \max\{0, \log s\}$ for the tent map $T_s$ with slope s ∈ [0, 2].
17 This is quite a weaker sufficient condition for specification than for β-transformations, but
Find the sliding block code transforming θ(x) into i(x). Is there an inverse
sliding block code transforming i(x) into θ(x)?
Proof. First let x, y ∈ [0, 1] be such that x < y. If $c \notin [f^k(x), f^k(y)]$ for
all k ≥ 0, then i(x) = i(y). Otherwise, take n ≥ 0 minimal such that
$c \in [f^n(x), f^n(y)]$. Then $i_k(x) = i_k(y)$ for 0 ≤ k < n. Furthermore, $f^n|_{(x,y)}$
is increasing/decreasing precisely if $|i_0(x) \cdots i_{n-1}(x)|_1$ is even/odd. In the
former case, $i_n(x) < i_n(y)$, so $i(x) \prec_{pl} i(y)$. In the latter case, $i_n(x) > i_n(y)$,
so again $i(x) \prec_{pl} i(y)$.
This shows that
(3.18) $i : ([0, 1], <) \to (\Sigma, \prec_{pl})$ is order preserving.
Since $f^2(c) \le x \le f(c)$ for every x in the core, (3.17) follows.
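The parity-lexicographical comparison used in this proof is easy to state in code. The following is a minimal sketch for finite words over {0, 1}; the function names and the use of the full tent map as a test case are assumptions made for illustration, and orbits that hit the critical point are not handled.

```python
from fractions import Fraction


def pl_less(e, d):
    """True if the finite 0/1 word e precedes d in the parity-lexicographical order."""
    ones = 0                                   # number of 1s in the common prefix
    for a, b in zip(e, d):
        if a != b:
            # even parity: usual order of the symbols; odd parity: reversed order
            return (a < b) if ones % 2 == 0 else (a > b)
        ones += a
    return False                               # no difference within the common length


def tent_itinerary(x, n, s=2):
    """First n itinerary symbols of x under the tent map of slope s
    (0 to the left of c = 1/2, 1 to the right)."""
    c, word = Fraction(1, 2), []
    for _ in range(n):
        word.append(0 if x < c else 1)
        x = s * x if x <= c else s * (1 - x)
    return word


ix, iy, iz = (tent_itinerary(Fraction(p, 5), 5) for p in (1, 3, 4))
```

Since 1/5 < 3/5 < 4/5, order preservation as in (3.18) predicts that `pl_less(ix, iy)` and `pl_less(iy, iz)` both hold, and they do for these three points.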
For the converse, assume that $e \in \{0, 1\}^{\mathbb{N}_0}$ satisfies (3.17). We need
to find $x \in [f^2(c), f(c)]$ with e = i(x). Define cylinder sets
$Z_n = \{x \in [f^2(c), f(c)] : i_k(x) = e_k \text{ for } 0 \le k < n\}$. By definition $Z_{n+1} \subset Z_n$ for all n.
We will show that $Z_\infty := \bigcap_n Z_n \ne \emptyset$, and then each $x \in Z_\infty$ has itinerary
i(x) = e.
Using (3.18)
\[
Z_1 =
\begin{cases}
[f^2(c), c) \text{ and } \sigma(\nu) \preceq_{pl} e & \text{if } e_0 = 0, \\
(c, f(c)] \text{ and } e \preceq_{pl} \nu & \text{if } e_0 = 1.
\end{cases}
\]
Now assume by induction that $Z_{n-1}$ is found and that $\partial Z_{n-1} \in \bigcup_{j=-1}^{n-1} f^{-j}(c)$.
Therefore $f^n(Z_{n-1})$ is contained in an interval $[f^a(c), f^b(c)]$, 0 < a, b ≤ n+2.
First assume $c \in f^n(Z_{n-1})$. Regardless of the value of $e_n$, we can take $Z_n$
to be the closure of the component of $Z_{n-1} \setminus f^{-n}(c)$ such that $i_n(Z_n) = e_n$. If
$c \notin f^n(Z_{n-1})$, then $Z_n = Z_{n-1}$, but because e satisfies (3.17), $i_n(Z_n) = e_n$.
This induction step proves that all the $Z_n$'s are closed non-empty nested
intervals and i(x) = e for every $x \in \bigcap_n Z_n$. This concludes the proof.
3.6.3. Cutting Times and the Kneading Map. Let ν be the kneading
sequence. We can split any sequence e into maximal pieces (up to the last
symbol) that coincide with a prefix of ν. To this end, define
(3.19) $\rho : \mathbb{N} \to \mathbb{N}$, $\rho(n) = \max\{k > n : e_{n+1} e_{n+2} \cdots e_{k-1} \text{ is a prefix of } \nu\}$.
That is, the function ρ depends on e and ν, but we will suppress this dependence.
When we apply this for e = ν itself, we obtain the sequence of
cutting times which were introduced in the late 1970s by Hofbauer [316].
They are given recursively by
$S_0 = 1$, $S_{k+1} = \rho(S_k)$,
or in other words $S_k = \rho^k(1)$ for e = ν and k ≥ 0.
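For finite prefixes, ρ and the cutting times can be computed directly from (3.19). The sketch below uses 1-based indices as in the text; the function names are hypothetical, and the 21-symbol prefix of ν was worked out by hand for this illustration so that the cutting times come out as Fibonacci numbers.

```python
def rho(e, nu, n):
    """rho(n) from (3.19): the first k > n with e_k != nu_{k-n} (1-based indices).
    Returns None if the given finite prefixes are too short to decide."""
    k = n + 1
    while k <= len(e) and k - n <= len(nu):
        if e[k - 1] != nu[k - n - 1]:
            return k
        k += 1
    return None


def cutting_times(nu):
    """Cutting times S_0 = 1, S_{k+1} = rho(S_k) for e = nu, as far as the prefix allows."""
    times = [1]
    while (nxt := rho(nu, nu, times[-1])) is not None:
        times.append(nxt)
    return times


# hand-computed prefix of a kneading sequence with Fibonacci cutting times
nu = [int(ch) for ch in "100111011001010011100"]
S = cutting_times(nu)
```

The function stops as soon as the prefix no longer determines the next cutting time, so longer prefixes are needed to see further terms.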
Example 3.84. There is a unique transitive unimodal map, up to conjugacy
and homtervals18 , that has cutting times S0 , S1 , S2 , S3 , S4 , . . . = 1, 2, 3, 5, 8, . . .
equal to the Fibonacci numbers. We call this the Fibonacci (unimodal)
map, and, as one would expect, it has connections with Fibonacci substitu-
tions and golden mean rotations; see Proposition 5.26 and [125].
Lemma 3.85. Let ν be an admissible kneading sequence. The integer n ≥ 1
is a cutting time if and only if ν1 · · · νn is admissible w.r.t. ν in the sense
that (3.17) holds for it. In this case also ν1 · · · νn contains an odd number of
ones.
18 An interval J ⊂ [0, 1] is called a homterval if f n : J → f n (J) is a homeomorphism for
every n ∈ N.
Proof. We argue by induction. Since ν starts with 10, the statement holds
for n = 1, 2. For the induction step, assume the assertion holds for all
j < n. Let k be maximal such that $S_k < n$ and assume $\rho(S_k) < \infty$, because
otherwise, ν is $S_k$-periodic and there is nothing to prove. We distinguish
four cases:
• $n < \rho(S_k)$ and $n - S_k$ is not a cutting time. Then the word
$\nu_{S_k+1} \cdots \nu_n = \nu_1 \cdots \nu_{n-S_k}$
is not admissible by induction, and hence
$\sigma^{S_k}(\nu_1 \cdots \nu_n)$ fails (3.17).
(c) Show that if n is such that |f n (c) − c| < |f k (c) − c| for all 1 ≤ k < n,
then n is a cutting or a co-cutting time. That is, closest returns of c happen
at cutting or co-cutting times.
(d) If Q̂(k) is bounded, show that c is not recurrent.
(e) Give an example of a unimodal map with bounded kneading map but
c is non-periodic and recurrent.
Proof. Note that a tent map $T_s$ is long-branched if and only if $\liminf_n |D_n| > 0$,
and since $|D_{n+1}| = s|D_n|$ unless n is a cutting time, this is equivalent
to $\liminf_k |D_{1+S_k}| > 0$.
If c is periodic, then $\{|D_n| : n \in \mathbb{N}\}$ is a finite collection and hence $T_s$ is
long-branched. So let us assume that c is not periodic. If Q(k) ≤ B, then
$S_k - S_{k-1} \le S_B$. It follows that $\liminf_k |c - c_{S_k}| > 0$, because otherwise, the
time between cutting times is unbounded. Therefore $\liminf_k |D_{1+S_k}| > 0$
and $T_s$ is long-branched.
If, on the other hand, $\limsup_k Q(k) = \infty$, then $\limsup_k S_k - S_{k-1} = \infty$,
and hence $\liminf_k |c - c_{S_k}| \le \liminf_k s^{-(S_k - S_{k-1})} = 0$. This gives
$\liminf_k |D_{1+S_k}| = 0$, and $T_s$ is not long-branched.
The disjoint union $D = \bigsqcup_{n\ge1} D_n$ supports a map
\[
\hat f(x \in D_n) = f(x) \in
\begin{cases}
D_{n+1} & \text{if } c \notin [c_n, x], \\
D_{S_{Q(k)}+1} & \text{if } c \in [c_n, x], \text{ so } n = S_k \text{ is a cutting time.}
\end{cases}
\]
The collection $\{D_n\}_{n\ge1}$ forms a countable Markov partition for $(D, \hat f)$. It
is easy to see that the inclusion map $\pi : x \in D_n \mapsto x \in [0, 1]$ satisfies
Figure 3.12. The Hofbauer tower and extended Hofbauer tower for the Fibonacci map.
Hofbauer saw $(D, \hat f)$ as an infinite Markov chain extending the interval
dynamics (I, f) and explicitly added arrows $D_i \to D_j$ if $\hat f(D_i) \supset D_j$. We
can edge-label this graph by setting
\[
D_i \xrightarrow{\;0\;} D_j \ \text{ if } \hat f^{-1}(D_j) \cap D_i \subset [0, c], \qquad
D_i \xrightarrow{\;1\;} D_j \ \text{ if } \hat f^{-1}(D_j) \cap D_i \subset [c, 1].
\]
The infinite paths on this graph starting in $D_1$ are thus put in one-to-one
correspondence with $X = \{i(x) : x \in [0, f(c)]\} = \{x \in \{0, 1\}^{\mathbb{N}_0} : \sigma^k(x) \preceq_{pl} \nu\}$.
Therefore the edge-labeled Hofbauer tower is immediately the countable
state automaton accepting the language L(X); see Figure 3.13 for the
Fibonacci map. Such automata are discussed at length in [565, Chapters 5 &
6]. If c has an infinite orbit, then the sets $D_n$ are all different. Therefore
the corresponding unimodal shift has F(k) = k + 1 distinct follower sets
associated to k-words; see (3.14) and Theorem 3.76. If c is preperiodic, then
there are only finitely many different levels $D_n$, and L(X) is sofic, and in
fact it is an SFT if c is periodic.
Figure 3.13. The edge-labeled Markov graph for the Fibonacci map.
One can extend the Hofbauer tower so as to account for the co-cutting
times as well. Set $\hat D_1 = [0, 1]$ and inductively
\[
\hat D_{n+1} =
\begin{cases}
f(\hat D_n) & \text{if } c \notin \hat D_n, \\
f(E_n) & \text{if } c \in \hat D_n \text{ and } E_n \text{ is the component of } \hat D_n \setminus \{c\} \text{ containing } c_n.
\end{cases}
\]
See Figure 3.12 for the Hofbauer tower and extended Hofbauer tower of
the unimodal Fibonacci map. Then $c_n \in \hat D_n$ for all n ≥ 1 and there is a
neighborhood $Z_{n-1} \ni f(c)$ such that $f^{n-1} : Z_{n-1} \to \hat D_n$ is monotone onto.
Also, if $c \in \hat D_n$, then n is a cutting or a co-cutting time. More precisely,
\[
(3.21) \qquad \text{if } c \in \hat D_n, \text{ then } n \text{ is a }
\begin{cases}
\text{cutting time} & \text{if } c \in D_n, \\
\text{co-cutting time} & \text{if } c \in \hat D_n \setminus D_n.
\end{cases}
\]
It is clear from this that the cutting and co-cutting times are disjoint sequences.
Theorem 3.90. A sequence ν = 10 · · · is an admissible kneading sequence
if one of the following equivalent conditions is satisfied:
(a) $\sigma(\nu) \preceq_{pl} \sigma^n \nu \preceq_{pl} \nu$ for all $n \in \mathbb{N}_0$.
(b) The kneading map is well-defined by (3.20) above, and (according
to Hofbauer [316])
(3.22) $\{Q(k + j)\}_{j\ge1} \succeq_{lex} \{Q(Q^2(k) + j)\}_{j\ge1}$
for all k ≥ 1, where $\succeq_{lex}$ stands for the lexicographical order on
sequences. Here we set Q(0) = 0 by convention.
(c) If ρ(m) < ∞, then ρ(m) − m is a cutting time.
(d) The sequences of cutting times $\{S_k\}_{k\ge0}$ and co-cutting times $\{\hat S_\ell\}_{\ell\ge0}$
(see Exercise 3.88) are disjoint.
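Condition (b) can be tested on a finite list of cutting times. The sketch below assumes that (3.20), which is not reproduced in this passage, is the relation $S_k - S_{k-1} = S_{Q(k)}$; this reading agrees with how Q is used in the proofs of this section. Only the entries of Q that are available are compared, so a positive answer means that no violation of (3.22) is visible in the window. The function names are hypothetical.

```python
def kneading_map(S):
    """Kneading map Q with S[k] - S[k-1] = S[Q(k)] and Q(0) = 0 by convention.
    Returns None if some difference of cutting times is not itself a cutting time."""
    Q = [0]
    for k in range(1, len(S)):
        gap = S[k] - S[k - 1]
        if gap not in S:
            return None
        Q.append(S.index(gap))
    return Q


def hofbauer_condition_on_window(Q):
    """Check (3.22) for every k, comparing only the entries of Q that are available."""
    for k in range(1, len(Q) - 1):
        lhs = Q[k + 1:]                        # Q(k+1), Q(k+2), ...
        rhs = Q[Q[Q[k]] + 1:]                  # Q(Q^2(k)+1), Q(Q^2(k)+2), ...
        m = min(len(lhs), len(rhs))
        if lhs[:m] < rhs[:m]:                  # Python compares lists lexicographically
            return False
    return True


S = [1, 2, 3, 5, 8, 13, 21]                    # Fibonacci cutting times (Example 3.84)
Q = kneading_map(S)
ok = hofbauer_condition_on_window(Q)
```

For the Fibonacci cutting times the kneading map is Q(k) = max{k − 2, 0} on the computed range, and no violation of (3.22) appears.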
Proof. We first show that admissibility implies the four conditions (a)–(d).
The necessity of condition (a) is shown in Theorem 3.83.
Condition (d) follows directly from (3.21).
Define the closest precritical point ζ ∈ [0, 1] as any point such that
$f^n(\zeta) = c$ for some n ≥ 1 and $f^k(x) \ne c$ for all k ≤ n and x ∈ (ζ, c).
By symmetry, if ζ is a closest precritical point, $\hat\zeta = 1 - \zeta$ is also a closest
Figure 3.14. The points ζQ(k) < cSk−1 < ζQ(k)−1 and their images
under f SQ(k) .
In particular,
(3.25) $f^{S_{k-1}}(c) \in \Upsilon_{Q(k)} = (\zeta_{Q(k)-1}, \zeta_{Q(k)}] \cup [\hat\zeta_{Q(k)}, \hat\zeta_{Q(k)-1})$,
see Figure 3.14, and the larger Q(k), the closer $f^{S_{k-1}}(c)$ is to c.
Formula (3.22) can be interpreted geometrically as $c_{S_k} \in [c, c_{S_{Q^2(k)}}]$; see
Figure 3.14. To see this, apply $f^{S_{Q(k)}}$ to the points $\zeta_{Q(k)}$, $c_{S_{k-1}}$, and $\zeta_{Q(k)-1}$.
We find $c_{S_k} \in (c, c_{S_{Q^2(k)}})$, so $c_{S_k}$ is closer to c than $c_{S_{Q^2(k)}}$ is. This implies
that $Q(k + 1) \ge Q(Q^2(k) + 1)$. If the inequality is strict, then (3.22) holds.
Otherwise, i.e. if $Q(k + 1) = Q(Q^2(k) + 1)$, then both $c_{S_k}$ and $c_{S_{Q^2(k)}} \in (\zeta_{Q(k+1)}, \zeta_{Q(k+1)-1})$
and we apply $f^{S_{Q(k+1)}}$, which maps $(\zeta_{Q(k+1)}, \zeta_{Q(k+1)-1})$
into $[c, f^{S_{Q^2(k+1)}}(c))$. Therefore $c_{S_{k+1}} \in (c, c_{S_{Q^2(k)+1}})$. This shows that
$Q(k + 2) \ge Q(Q^2(k) + 2)$. If the inequality is strict, then again (3.22)
holds; otherwise both $c_{1+S_{k+1}}$ and $c_{1+S_{Q^2(k)+1}} \in \Upsilon_{Q(k+2)}$ and we can apply
$f^{S_{Q(k+2)}}$. Repeating the argument shows that (3.22) holds in any case, and
(b) is proven.
(c) ⇒ (a): Since ρ(m) − m is a cutting time, $\#\{m < j \le \rho(m) : \nu_j = 1\}$
is even by Lemma 3.85. Hence $\sigma^m(\nu) \preceq_{pl} \nu$ (cf. Exercise 3.86). Since $\nu_1 = 1$,
the parity-lexicographical order implies that $\sigma^{m+1}(\nu) \succeq_{pl} \sigma(\nu)$ for all m.
$= \nu_1 \cdots \nu_n \cdots \succ_{pl} \nu$,
where n is not a cutting time because $S_{Q^2(k)+j_0} < n < S_{Q^2(k)+j_0+1}$. This
contradicts (3.17).
(b) ⇒ (d): First we claim that if $S_{k-1} < \hat S_\ell < S_k < \hat S_{\ell+1}$, then $S_k - \hat S_\ell = S_{Q^2(k)}$.
This is true for $\hat S_0 = \kappa$ and $S_k = \kappa + 1$, because then $S_k - \hat S_\ell = 1 = S_{Q^2(k)}$.
Assume now by induction that $S_{k-1} < \hat S_\ell = S_k - S_{Q^2(k)} < S_k$ for some
k, ℓ. Then (3.22) gives $\hat S_{\ell+1} = S_{k'} + S_j$ for some k′ ≥ k and j < Q(k′ + 1).
But since $\nu_{S_{k'}+1} \cdots \nu_{S_{k'}+S_j} \cdots \nu_{S_{k'+1}} = \nu_1 \cdots \nu_{S_j} \cdots \nu_{S_{Q(k'+1)}}$, the integers
$S_{k'} + S_j, S_{k'} + S_{j+1}, \dots, S_{k'} + S_{j'}$ are co-cutting times for all j ≤ j′ < Q(k′ + 1).
The largest such integer is $\hat S_{\ell'} := S_{k'} + S_{Q(k'+1)-1}$, so
\[ S_{k'+1} - \hat S_{\ell'} = S_{k'+1} - (S_{k'} + S_{Q(k'+1)-1}) = S_{Q(k'+1)} - S_{Q(k'+1)-1} = S_{Q^2(k'+1)}, \]
and this completes the induction step. But repeating this step also shows
that $\{S_k\}_{k\ge0}$ and $\{\hat S_\ell\}_{\ell>0}$ are disjoint, so (d) holds.
It remains to prove that (d) implies admissibility. For this we will use
the quadratic family $f_a(x) = ax(1 - x)$, a ∈ [0, 4], with critical point $c = \frac{1}{2}$
to which we assign the symbol ∗. Let
$A_{\nu_1 \cdots \nu_n} := \{a \in [0, 4] : \text{the kneading sequence } \nu(f_a) \text{ starts with } \nu_1 \cdots \nu_n\}$.
Then $A_0 = [0, 2)$, $A_1 = (2, 4]$ while $f_2(c) = c$. Also $A_{11} = (2, 1 + \sqrt{5})$,
$A_{10} = (1 + \sqrt{5}, 4]$ while $f^2_{1+\sqrt{5}}(c) = c$. We are only interested in kneading
sequences starting with 10, so we continue with $A_{10}$.
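Membership of a parameter in one of the sets $A_{\nu_1 \cdots \nu_n}$ can be checked by iterating the critical point. The following sketch uses exact rational parameters, so parameters such as $1 + \sqrt5$ are outside its scope; the function name `kneading_prefix` and the sample parameters are assumptions made for illustration.

```python
from fractions import Fraction


def kneading_prefix(a, n):
    """First n symbols of the kneading sequence of f_a(x) = a x (1 - x):
    '0' left of c = 1/2, '1' right of c, '*' at c (where the prefix stops)."""
    c = Fraction(1, 2)
    x, symbols = c, []
    for _ in range(n):
        x = a * x * (1 - x)                    # x = f_a^k(c) for k = 1, 2, ...
        if x == c:
            symbols.append("*")
            break
        symbols.append("0" if x < c else "1")
    return "".join(symbols)


samples = {a: kneading_prefix(Fraction(a), 2) for a in ("3/2", "2", "3", "19/5", "4")}
```

The sample parameters 3/2, 3 and 19/5 land in $A_0$, $A_{11}$ and $A_{10}$ respectively, while a = 2 gives the symbol ∗ at the first step.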
Define $\varphi_n(a) := f_a^n(c)$. It is easy to check that $\varphi_2 : A_{10} \to [0, c] =
[f_4^2(c), f^2_{1+\sqrt{5}}(c)] = [f_4^2(c), c]$ is monotone onto. We claim that this holds in
general: for all prefixes $\nu_1 \cdots \nu_n$ of some ν satisfying (b),
where $S_{max}$ and $\hat S_{max}$ are the largest cutting and co-cutting times in $\nu_1 \cdots \nu_n$
(see Exercise 3.88) and $a_1, a_2$ are the boundary points of $A_{\nu_1 \cdots \nu_n}$ and the
order in boundary points in $[f_{a_1}^{n-S_{max}}(c), f_{a_2}^{n-\hat S_{max}}(c)]$ may be the other way
around. Also $\hat S_{max} = 0$ if $\nu_1 \cdots \nu_n = 10 \cdots 0$ by convention.
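If n + 1 is neither a cutting nor co-cutting time, then $\nu_{n-S_{max}+1} = \nu_{n-\hat S_{max}+1} = \nu_{n+1}$,
so $(f_{a_1}^{n-S_{max}}(c), f_{a_2}^{n-\hat S_{max}}(c)) \not\ni c$. Therefore $A_{\nu_1 \cdots \nu_{n+1}} = A_{\nu_1 \cdots \nu_n}$
and $S_{max}$ and $\hat S_{max}$ remain the same. Also $f_a : f^n(A_{\nu_1 \cdots \nu_{n+1}}) \to
\varphi_{n+1}(A_{\nu_1 \cdots \nu_{n+1}})$ is monotone onto, so (3.26) holds for $A_{\nu_1 \cdots \nu_{n+1}}$.
If n + 1 is a cutting time, then $\nu_{n-S_{max}+1} \ne \nu_{n-\hat S_{max}+1} = \nu_{n+1}$, so
\[ (f_{a_1}^{n-S_{max}+1}(c), f_{a_2}^{n-\hat S_{max}+1}(c)) \ni c. \]
Now $A_{\nu_1 \cdots \nu_{n+1}}$ is a proper subset of $A_{\nu_1 \cdots \nu_n}$ and $S_{max} = 0$ and $\hat S_{max}$ remains
the same. Again $f_a : f^n(A_{\nu_1 \cdots \nu_{n+1}}) \to \varphi_{n+1}(A_{\nu_1 \cdots \nu_{n+1}})$ is monotone onto, so
(3.26) holds for $A_{\nu_1 \cdots \nu_{n+1}}$.
If n + 1 is a co-cutting time, then $\nu_{n-\hat S_{max}+1} \ne \nu_{n-S_{max}+1} = \nu_{n+1}$, so
\[ (f_{a_1}^{n-S_{max}+1}(c), f_{a_2}^{n-\hat S_{max}+1}(c)) \ni c. \]
Again, $A_{\nu_1 \cdots \nu_{n+1}}$ is a proper subset of $A_{\nu_1 \cdots \nu_n}$ and $\hat S_{max} = 0$ and $S_{max}$ remains
the same. Also $f_a : f^n(A_{\nu_1 \cdots \nu_{n+1}}) \to \varphi_{n+1}(A_{\nu_1 \cdots \nu_{n+1}})$ is monotone onto, so
(3.26) holds for $A_{\nu_1 \cdots \nu_{n+1}}$.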
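Since $A_{\nu_1 \cdots \nu_{n+1}} \subset A_{\nu_1 \cdots \nu_n}$, $\bigcap_{n\ge2} A_{\nu_1 \cdots \nu_n} \ne \emptyset$. If ν is periodic and
$\nu_n = *$, then there is $a \in \partial A_{\nu_1 \cdots \nu_n}$ with $\nu(f_a) = \nu$. Otherwise,
$A_{\nu_1 \cdots \nu_{n+1}} \subsetneq A_{\nu_1 \cdots \nu_n}$ infinitely often, so $\bigcap_{n\ge2} A_{\nu_1 \cdots \nu_n} \ne \emptyset$ and $\nu(f_a) = \nu$ for each
$a \in \bigcap_{n\ge2} A_{\nu_1 \cdots \nu_n}$.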
Exercise 3.91. Show that if Sk−1 < n < Sk ≤ ρ(n), then Sk − n is a cutting
time.
Example 3.92. Hofbauer [315] showed that tent maps $T_s(x)$ are intrinsically
ergodic. If s > 1, then the measure of maximal entropy is absolutely
continuous w.r.t. Lebesgue measure and its density is given explicitly as
\[ (3.29) \qquad \frac{d\mu}{dx} = \sum_{n\ge1} \frac{\theta(n)}{s^n} \mathbf{1}_{[T_s^{n+1}(\frac{s}{2}), \frac{s}{2}]}(x), \]
for θ(n) as in (3.27); see [195, Section 5.3]. In fact, (3.29) extends to skew
tent maps
\[
(3.30) \qquad T_{s,t} : [0, 1] \to [0, 1], \qquad T_{s,t}(x) =
\begin{cases}
sx & \text{if } 0 \le x \le c := \frac{t}{s+t}, \\
t(1 - x) & \text{if } c \le x \le 1,
\end{cases}
\]
see [330]. Further results in this direction can be found in [282, 370].
The main result of this section relates the kneading determinant to the
topological entropy of the map. The rest of this section leads up to its proof.
Theorem 3.93. The topological entropy htop (f ) > 0 if and only if t0 :=
inf{t > 0 : Df (t) = 0} ∈ (0, 1) and in this case htop (f ) = − log t0 .
By setting 0 < ∗ < 1 we can extend ≺pl to sequences in {0, ∗, 1}N with
the property that if em = ∗, then σ m (e) = ν is the kneading sequence ν of
f.
Milnor & Thurston [420] used formal power series rather than symbolic
dynamics to phrase their kneading theory. This is a bit more involved, but
for many purposes a very powerful method. Let us interpret the intervals
19 Exercise 7.15 gives a precise recursive formula of the lap-number of the Feigenbaum map,
and Exercise 3.98 gives the kneading determinant of the quadratic map with a period 3 critical
point.
for every point x. Here the Kronecker delta (δi (Ej ) = 1 if i = j and δi (Ej ) =
0 otherwise) is extended by linearity to vectors with Q[t]-valued coefficients.
Example 3.95. Before proving this lemma, let us see how this works out
for the fixed points of f . The orientation-reversing fixed point α ∈ E1
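Proof. We write $\sum_{j=0}^{1} (1 - \varepsilon(E_j)t)\,\delta_j(\Theta(x, t))$ as a double sum and assume
for simplicity that $\mathrm{orb}_f(x) \not\ni c$:
\[
\begin{aligned}
\sum_{j=0}^{1} \sum_{n\ge0} (1 - \varepsilon(E_j)t)\,\delta_j(\Theta_n(x))\,t^n
&= \sum_{j=0}^{1} (1 - \varepsilon(E_j)t) \sum_{\substack{n\ge0 \\ f^n(x)\in E_j}} \Big(\prod_{k=0}^{n-1} \varepsilon_k(x)\Big) t^n \\
&= \sum_{\substack{n\ge0 \\ f^n(x)\in E_0}} \Big(\prod_{k=0}^{n-1} \varepsilon_k(x)\Big) t^n
 - \sum_{\substack{n\ge0 \\ f^n(x)\in E_0}} \Big(\prod_{k=0}^{n} \varepsilon_k(x)\Big) t^{n+1} \\
&\quad + \sum_{\substack{n\ge0 \\ f^n(x)\in E_1}} \Big(\prod_{k=0}^{n-1} \varepsilon_k(x)\Big) t^n
 - \sum_{\substack{n\ge0 \\ f^n(x)\in E_1}} \Big(\prod_{k=0}^{n} \varepsilon_k(x)\Big) t^{n+1}.
\end{aligned}
\]
Formally, all positive powers $t^n$ cancel, leaving only $\mathbf{1}_{E_0}(x)\,t^0 + \mathbf{1}_{E_1}(x)\,t^0 = 1$.
If $f^n(x) = c$ for some n, then the definition of $\Theta_n(x)$ allows a similar
proof.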
The qualitative behavior of the entire interval map is given by the invariant
coordinate of the critical value. In this terminology, the kneading
increment
\[ \nu(t) := \lim_{x \nearrow c} \Theta(x, t) - \lim_{x \searrow c} \Theta(x, t) \]
is the object closest to our kneading sequence.20 This formula obviously
expresses the change of kneading coordinate Θ(x) as the point x moves
20 We changed the sign from the definition on page 483 of [420] because in our setting f
assumes a maximum rather than a minimum at the critical point. The same construction, with d
formal unit vectors $E_k$, k = 0, 1, . . . , d − 1, can be carried out for a (d − 1)-modal interval map (i.e.
with d − 1 critical points) and also (although not covered in [420]) for piecewise continuous maps.
Milnor & Thurston [420] continue to define kneading matrices and kneading
determinants $D_f(t)$, which in the unimodal case are equal to
\[ (3.32) \qquad D_f(t) = \frac{1}{2}\big(\delta_0(\nu(t)) - \delta_1(\nu(t))\big) = 1 + \varepsilon_1(c_1)t + \varepsilon_1(c_1)\varepsilon_2(c_1)t^2 + \cdots \]
\[ D_f(t) = \frac{P(t)}{1 \pm t^p}, \]
21 But not necessarily strictly monotone; even if $f^n|_J$ has flat pieces, this does not count
towards the lap-number.
so that
\[ (3.33) \qquad \sum_{n=1}^{\infty} \ell(f^n|_J)\,t^{n-1} = \frac{1}{1 - t}\Big(1 + \sum_{n=0}^{\infty} \gamma_n(J) t^n\Big). \]
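Proof. The difference $\lim_{x \nearrow b} \Theta(x, t) - \lim_{x \searrow a} \Theta(x, t)$ is the sum of the increments
of all precritical points z. Each precritical point of order n gives a
contribution of $t^n \nu(t)$, and γ(J) counts how many order n precritical points
there are, giving them weight $t^n$. So the first formula follows.
Since
\[ \Theta(\beta) = E_0(1 + t + t^2 + \dots) = \frac{1}{1 - t} E_0 \]
and
\[ \Theta(-\beta) = E_1 - E_0(t + t^2 + t^3 + \dots) = E_1 - \frac{t}{1 - t} E_0, \]
we can use formula (3.32) to simplify for J = [0, 1]:
\[
\begin{aligned}
\gamma([0, 1]) D(t) &= \frac{1}{2}\big(\delta_0(\gamma(t)\nu(t)) - \delta_1(\gamma(t)\nu(t))\big) \\
&= \frac{1}{2}\Big(\delta_0\big(\lim_{x \nearrow 1} \Theta(x, t) - \lim_{x \searrow 0} \Theta(x, t)\big)
 - \delta_1\big(\lim_{x \nearrow 1} \Theta(x, t) - \lim_{x \searrow 0} \Theta(x, t)\big)\Big) \\
&= \frac{1}{2}\Big(\frac{1}{1 - t} + \frac{t}{1 - t} + 1\Big) = \frac{1}{1 - t}.
\end{aligned}
\]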
(1) Show that the kneading determinant is $D_{f_a}(t) = \frac{1 - t - t^2}{1 - t^3}$.
(2) Show that
\[ 1 + \gamma(t) = \frac{2}{1 - t - t^2} = \sum_{n=0}^{\infty} 2F_n t^n, \]
Proof of Theorem 3.93. The power series $\sum_{n=1}^{\infty} \ell(f^n) t^{n-1}$ converges for
all t less than its radius of convergence R. But by [422], the lap-number
$\ell(f^n) \sim e^{h_{top}(f) n}$, so $-\log R = h_{top}(f)$. By (3.34), this is also the first zero
of the kneading determinant, as required.
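Theorem 3.93 suggests a simple numerical recipe: truncate the series (3.32) and locate its smallest positive zero. The sketch below is an illustration under two assumptions that are not spelled out in this passage: the sign $\varepsilon_k(c_1)$ is taken to be +1 when $\nu_k = 0$ and −1 when $\nu_k = 1$, and a finite prefix of ν is taken to be long enough for the truncation error to be negligible. The function names are hypothetical.

```python
from math import log


def kneading_determinant(nu, t):
    """Truncated series 1 + e_1 t + e_1 e_2 t^2 + ... with e_k = +1 if nu_k = 0, -1 if nu_k = 1."""
    total, sign, power = 1.0, 1.0, 1.0
    for symbol in nu:
        sign *= -1.0 if symbol == 1 else 1.0
        power *= t
        total += sign * power
    return total


def entropy_from_kneading(nu, steps=1000, tol=1e-12):
    """Return -log t0 for the smallest zero t0 in (0, 1) of the truncated determinant,
    or 0.0 if no sign change is found on the scan grid."""
    lo = 0.0
    for i in range(1, steps):
        hi = i / steps
        if kneading_determinant(nu, hi) <= 0.0:
            while hi - lo > tol:               # bisection on the first sign change
                mid = (lo + hi) / 2
                if kneading_determinant(nu, mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return -log(hi)
        lo = hi
    return 0.0


nu_full = [1] + [0] * 80                       # kneading sequence 1000... of the full tent map
h = entropy_from_kneading(nu_full)
```

For the kneading sequence 1000..., the series sums to (1 − 2t)/(1 − t), so the first zero is t = 1/2 and the computed value is close to log 2, as expected for the full tent map.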
The following theorem is one of the main results in [420], namely Theorem 9.2
and Corollary 10.7. We don't prove or need it for our purpose, but
we mention it for completeness' sake.
of f : R → R satisfies
\[
(3.35) \qquad \zeta(t)^{-1} =
\begin{cases}
(1 - t) D(t) & \text{if } c \text{ is non-periodic}, \\
(1 - t)(1 - t^p) D(t) & \text{if } c \text{ is periodic of period } p.
\end{cases}
\]
3.6.5. Complex Kneading Theory. The standard and most direct extension
of unimodal dynamics to the complex plane is via the quadratic
family23 $f_c(z) = z^2 + c$. This family is conjugate to $f_a(w) = aw(1 - w)$ via
$w = \psi(z) = \frac{1}{2} - \frac{z}{a}$ and $c = \frac{a}{2} - \frac{a^2}{4}$, $a = 1 + \sqrt{1 - 4c}$.
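The conjugacy relation $\psi \circ f_c = f_a \circ \psi$ can be verified exactly for sample values. In the sketch below the parameter a = 7/2 and the test points are arbitrary choices made for illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def f_a(w, a):
    """Logistic form w -> a w (1 - w)."""
    return a * w * (1 - w)


def f_c(z, c):
    """Quadratic form z -> z^2 + c."""
    return z * z + c


def psi(z, a):
    """Change of coordinates w = 1/2 - z/a."""
    return Fraction(1, 2) - z / a


a = Fraction(7, 2)                             # arbitrary illustrative parameter
c = a / 2 - a * a / 4                          # corresponding parameter of z^2 + c
checks = [psi(f_c(z, c), a) == f_a(psi(z, a), a)
          for z in (Fraction(1, 3), Fraction(-2, 5), Fraction(0))]
```

Both sides equal $\frac{a}{4} - \frac{z^2}{a}$ once $c = \frac{a}{2} - \frac{a^2}{4}$ is substituted, which is why the check succeeds for every test point.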
The family fc has its own kneading theory with features interesting
enough to devote a separate section to. Instead of symbolic dynamics on
a core interval in the real setting, we now address the symbolic dynamics on
a tree H (called the Hubbard tree) or dendrite24 .
The Hubbard tree models the closed connected hull of the critical orbit
within the Julia set Jc provided Jc is a dendrite. This applies to most of the
parameters c in the Mandelbrot set M that do not lie in the closure of any of
its hyperbolic components. But also if Jc is not a dendrite (because it is not
locally connected, or there are bounded Fatou components), there is always
a topological model for the Hubbard tree that satisfies Definition 3.101. The
22 Here we count at most one k-periodic point in each lap of f k ; if there are two such orbits
with the period of one twice the period of the other (as is the case shortly after a period doubling
bifurcation), then only the orbit of the smaller period is counted. Note, however, that the period
need not be minimal: a k-periodic point also counts for 2k, 3k, . . . .
23 We use a different font to distinguish from $f_a(w) = aw(1 - w)$. Also we will write $c_0 = 0$
for the critical point of $f_c$ (to distinguish it from the parameter), so $c_1 = f_c(c_0) = c$.
24 A dendrite is a compact, connected, locally connected (and therefore arc-connected) set
without loops.
25 Although rational maps can also have Herman rings (as Siegel disks, but now on an annulus
instead of a disk).
Figure 3.15. The Hubbard tree inside the disk model and the Julia set of $f_i : z \mapsto z^2 + i$, of the external angle $\vartheta_c = 1/6$ and kneading sequence ν = 110.
Figure 3.16. A schematic Julia set for external angle γ = 25/56 (with
ν = 100101) and the Mandelbrot set with some external rays.
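if x = ±β (but $-\beta \in [c_1, \beta)$ only if $c_1 = -2$), and $\#L^{-1}(x) = 2$ for all other
$x \in [c_1, \beta)$). In particular, $h_{top}(T|_{K_\gamma}) = h_{top}(f_c|_{[c_1, \beta]})$. Moreover,
\[
\begin{cases}
c \mapsto (\theta_n)_{n\ge1} \text{ (lexicographic order) is order preserving;} \\
(\theta_n)_{n\ge1} \mapsto \gamma = \frac{1}{2} \sum_{n\ge1} \frac{1 - \theta_n}{2}\, 2^{-n} \text{ is order reversing;} \\
\gamma \mapsto K_\gamma \text{ (inclusion order) is order preserving;} \\
K_\gamma \mapsto h_{top}(T|_{K_\gamma}) \text{ is order preserving.}
\end{cases}
\]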
We can extend the ρ-function from (3.19) to this complex case without
changing the definition:
29 It was shown by Lavaurs [388] that between two hyperbolic components of the same period,
there is always a hyperbolic component of a lower period, and therefore this list determines the
hyperbolic components on this arc uniquely.
This completes Case II and proves that $\mathrm{orb}_\rho(1)$ and $\mathrm{orb}_\rho(n)$ intersect at
the latest at ρ(m), where m is minimal with the property that ρ(m) − m = n.
For an arbitrary (i.e. not necessarily minimal) m with ρ(m) − m = n, let m′
be minimal with this property. Then the ρ-orbits $\mathrm{orb}_\rho(1)$ and $\mathrm{orb}_\rho(n)$ meet
at the latest at ρ(m′) = n + m′ ≤ n + m = ρ(m), so the statement holds for
arbitrary m. This proves IH[n].
Proof. Assume by contradiction that $(A_h)_{h\ge0} = \mathrm{orb}_{\rho_x}(A_0)$, $(B_i)_{i\ge0} =
\mathrm{orb}_{\rho_x}(B_0)$, and $(C_j)_{j\ge0} = \mathrm{orb}_{\rho_x}(C_0)$ are pairwise disjoint $\rho_x$-orbits. There
are infinitely many triples $(A_h, B_i, C_j)$ such that $B_{i-1} < A_{h-1} < B_i < A_h$
and $C_{j-1} < A_{h-1} < C_j < A_h$ (possibly with the roles of $A_h$, $B_i$, and
$C_j$ permuted). Assume that $(A_h, B_i, C_j)$ is one of such triples, with span
$d(A_h, B_i, C_j) := \max\{A_h - A_{h-1}, B_i - B_{i-1}, C_j - C_{j-1}\}$ taking the minimal
value $d_{min}$ among the span of all such triples; see Figure 3.17.
We have
\[ \nu_{B_{i-1}+1} \cdots \nu_{A_{h-1}} \cdots \nu_{B_i} = \nu_1 \cdots \nu_{A_{h-1} - B_{i-1}} \cdots \nu_{B_i - B_{i-1}}, \]
Remark 3.107. This proof is more general than Thurston’s proof, because
it applies also to non-admissible kneading sequences, i.e. those that do not
come with a Thurston lamination. For instance, if ν = 101100 · · · , then
there is a periodic point (see Figure 3.18, right) with itinerary x = 101. The
ρx -orbits of A0 = 3, B0 = 6, and C0 = 1 are disjoint, with span dmin = 6.
The precritical branch-points are not covered by this proposition, because
of the issue of assigning a proper symbol to the critical point. Each choice of
0 or 1 allows for one or two branches according to whether ν is an end-point
in the Julia set or not (unless ν is eventually periodic). Therefore 0ν and 1ν
together account for two or four arms.
Figure 3.18. The Hubbard tree of ν = 1 10 111 1100 has two periodic
orbits of branch-points. The Hubbard tree of ν = 1 0 11 0 0 has an orbit
of evil branch-points.
Admissibility Condition:
If ν ∈ {0, 1}N is such that (3.37) fails for every m ∈ N, then
ν is the kneading sequence of some quadratic polynomial.
If m ≥ ρ(m) − m ∈ orbρ (1) for all m, then all characteristic periodic
points have two arms, according to Propositions 3.108 and 3.109, and the
Hubbard tree is an arc [c1 , c2 ]. But this condition gives the existence of a
kneading map Q, which is central to having a real kneading sequence.
Figure 3.19. Graphs of a sofic shift, SFT, and tent map with topological entropy $\frac12 \log 2$.
Proof. Gap shifts are coded shifts, with $C = \{10^s : s \in S\}$ as code words.
Therefore the results of Section 3.3 apply, but the situation here is simpler
because $U_C$ from (3.5) reduces to $\{0^\infty\}$. In fact, we can pass directly to the
representation of a gap shift by an infinite transition graph consisting of a
single central vertex from which loops of length s + 1, s ∈ S, emerge. So
Theorem 3.114 follows directly from Theorem 8.73.
Exercise 3.115. Use Theorem 3.114 to compute the entropy of the Fi-
bonacci SFT, the odd shift, and the even shift.
Proof. We give the proof first for the truncated Sa := Sa ∩{0, . . . , N }. Since
the entropy increases as N increases, the theorem follows by taking N → ∞.
31 If $S_a \ni 0$, then the symbol a can be “jumped”; if $S_a \supset \{0, 1\}$ for each a, then $X_S = A^{\mathbb N}$ or $A^{\mathbb Z}$.
Also, we use the rome technique from Section 8.7.3 as opposed to the proofs
in [183, 398, 409].
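Let B be the n × n transition matrix (for some n ≤ d(N + 1)) for the
cyclic S-gap shift. Then by Theorem 8.72,
$$\det(B - \lambda I_n) = (-\lambda)^{n-d} \det(A_{rome}(\lambda) - \lambda I_d)$$
for
$$A_{rome}(\lambda) = \begin{pmatrix}
0 & \Sigma_0 & 0 & \cdots & 0\\
0 & 0 & \Sigma_1 & \ddots & \vdots\\
\vdots & & \ddots & \ddots & 0\\
0 & & & 0 & \Sigma_{d-2}\\
\Sigma_{d-1} & 0 & \cdots & \cdots & 0
\end{pmatrix}$$
and $\Sigma_a := \sum_{s\in S_a} \lambda^{1-s}$, a ∈ A. A straightforward computation gives that
$$\det(B - \lambda I) = (-\lambda)^n - (-\lambda)^{n-d} \prod_{a\in A} \Sigma_a = (-\lambda)^n \Big(1 - \prod_{a\in A} \sum_{s\in S_a} \lambda^{-s}\Big).$$
Therefore the leading root satisfies $\prod_{a\in A} \sum_{s\in S_a} \lambda^{-s} = 1$.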
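The equation for the leading root can be solved numerically. The following is a minimal sketch, not part of the proof: the function name `leading_root` is hypothetical, the sets $S_a$ are assumed finite (as in the truncation used above) with all gaps s ≥ 1, and bisection is used only because the left-hand side is decreasing in λ.

```python
import math

def leading_root(gap_sets, tol=1e-12):
    """Solve prod_a sum_{s in S_a} x**(-s) = 1 for x >= 1 by bisection.

    gap_sets is a list of finite sets S_a of positive integers
    (an assumption of this sketch, not of the theorem).
    """
    if any(s < 1 for S in gap_sets for s in S):
        raise ValueError("this sketch assumes all gaps s >= 1")

    def F(x):
        return math.prod(sum(x ** (-s) for s in S) for S in gap_sets)

    lo, hi = 1.0, 2.0
    if F(lo) <= 1.0:          # no root above 1
        return 1.0
    while F(hi) > 1.0:        # F is decreasing, so enlarge the bracket
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# With a single symbol and S = {1, 2} the equation reads 1/x + 1/x^2 = 1.
print(leading_root([{1, 2}]))
```

For d = 1 and S = {1, 2} the root is the golden mean, since the equation becomes λ² = λ + 1.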
Proof. If P is finite, then no point x ∈ [1] can return to [1] infinitely often,
so topological transitivity fails. Conversely, assume that u, v ∈ L(X_P) both
start and end with a 1, and let p ∈ P be arbitrary. We claim that $w = u0^{p-1}v$
also belongs to L(X_P). Indeed, let 1 ≤ i < j ≤ |w| be such that $w_i = w_j = 1$.
If j ≤ |u|, then |j − i| ∈ P because u ∈ L(X_P), and if |u| + p ≤ i, then
|j − i| ∈ P because v ∈ L(X_P). The case i ≤ |u| < |u| + p ≤ j follows because
P is closed under addition. Therefore (X_P, σ) is topologically transitive.
The same proof gives that (X_P, σ) has a dense set of periodic orbits, so if
P is closed under addition and non-empty, then (X_P, σ) is automatically Devaney chaotic.
Contrary to gap shifts, spacing shifts are hereditary, which gives a certain
freedom in constructing proofs.
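The gluing step in the proof can be tested on small examples. The sketch below assumes the usual description of a spacing shift (any two 1s in a word must be at a distance belonging to P); the helper names `in_spacing_language` and `glue` are hypothetical, and P is passed as a membership predicate.

```python
def in_spacing_language(word, allowed):
    """True if every pair of 1s in word is at a distance d with allowed(d)."""
    ones = [i for i, c in enumerate(word) if c == "1"]
    return all(allowed(j - i)
               for k, i in enumerate(ones)
               for j in ones[k + 1:])

def glue(u, v, p):
    """The word u 0^(p-1) v used in the proof."""
    return u + "0" * (p - 1) + v

# P = 2N is closed under addition, so gluing stays in the language.
even = lambda d: d % 2 == 0
print(in_spacing_language(glue("101", "10001", 4), even))

# P = {2, 3} is not closed under addition, and gluing can fail.
small = lambda d: d in {2, 3}
print(in_spacing_language(glue("101", "1001", 2), small))
```

In the second example the 1s at distance 2 + 2 = 4 violate the spacing condition, which is exactly where closure under addition is needed.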
Theorem 3.120. A spacing shift (XP , σ) is Li-Yorke chaotic if and only if
the set P is infinite.
mixing; see [387, Theorem 1.3]. It is known from [266] that a dynamical
system (X, T ) is topologically weak mixing if the product of (X, T ) with
every transitive system is again transitive. However, in [387, Theorem
1.1], it is pointed out that the converse is false. Namely, if P and P′ are
disjoint thick sets, then (X_P, σ) and (X_{P′}, σ) are both topologically weak
mixing, but their product is not topologically transitive (since [1] × [1] is not
infinitely recurrent). Topologically weak mixing precludes the existence of
non-constant Borel-measurable eigenfunctions [359]; i.e. no Borel function
f : X → C satisfies UT f := f ◦ T = λf . However, [387, Theorem 1.2]
presents a spacing shift that is not topologically weak mixing, but which
has a non-constant Borel eigenfunction. All this shows that there is only
a partial topological analog of the characterization of measure-theoretically
weak mixing in Theorem 6.86.
Regarding topological entropy, $h_{top}(X_P, \sigma) \ge \frac1k \log 2$ if P ⊃ kℕ; cf.
Theorem 3.54 and [47, Lemma 3.1]. Other than this, there seems to be no
easy way to compute the topological entropy $h_{top}(X_P, \sigma)$ from the properties
of P. However, [47, Theorem 3.6] gives a criterion for $h_{top}(X_P, \sigma) = 0$.
Theorem 3.122. The smallest alphabet size for which square-free subshifts
exist is 3. The Thue-Morse sequence is square+ε-free (i.e. overlap-free) in
the sense that $www_1 \notin L(X)$ for every w ∈ L(X), where $w_1$ is the first letter
of w.
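The overlap-freeness of the Thue-Morse sequence can be checked by brute force on a finite prefix. This is only an illustration, not a proof; the function names are hypothetical, and the prefix is generated with the substitution 0 → 01, 1 → 10.

```python
def thue_morse_prefix(n_iter):
    """Prefix of length 2**n_iter of the Thue-Morse sequence starting with 0."""
    w = "0"
    for _ in range(n_iter):
        w = "".join("01" if c == "0" else "10" for c in w)
    return w

def has_square(x):
    """True if x contains a factor of the form ww with w non-empty."""
    n = len(x)
    return any(x[i:i + p] == x[i + p:i + 2 * p]
               for p in range(1, n // 2 + 1)
               for i in range(n - 2 * p + 1))

def has_overlap(x):
    """True if x contains a factor w w w_1, with w_1 the first letter of w."""
    n = len(x)
    for p in range(1, n // 2 + 1):
        for i in range(n - 2 * p):
            if x[i:i + p] == x[i + p:i + 2 * p] and x[i + 2 * p] == x[i]:
                return True
    return False

tm = thue_morse_prefix(8)
print(has_square(tm), has_overlap(tm))
```

The prefix contains squares such as 11, but no overlap, in line with the statement that over two letters only overlap-freeness can be achieved.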
This is different from Theorem 3.122 in the sense that x can be any 0-1-
word, not just words from the Thue-Morse language L(XρTM ). On the other
hand, it applies to every k ≥ 1, not just to k = 1 (i.e. overlap-free/squares-
free).
Proof. It is immediate that if w has a k-overlap (or k-power), so has χTM (w).
Hence we only have to prove the “if”-direction, and we start with a prelimi-
nary remark. By the shape of χTM ,
(3.38) $|\chi_{TM}(x)|_0 = |\chi_{TM}(x)|_1 = |x|$ for all $x \in \{0, 1\}^*$.
Suppose that $\chi_{TM}(w)$ contains a k-overlap; i.e.
$\chi_{TM}(w) = av^k v_1 b$ where $v_1$ is the first letter of v, but w itself does not
contain a k-overlap. Assume also that w is the shortest word with this
property, so |a|, |b| ≤ 1. Suppose by contradiction that there is x such that
χTM (x) = avvc for |a|, |c| ≤ 1. Since |χTM (x)| = |avvb| is even, |a| = |b|.
If |v| is odd, then |vv|0 − |vv|1 = ±2, so a, c = . But then a = c = and
when we now divide avvc in blocks of two, then v is chopped in two-block
in two different way. Each such block is 01 or 10; therefore a = v2 = v4 =
· · · = vn and c = v1 = v3 = · · · = vn−1 = a, where we wrote v = v1 · · · vn .
But then |avvc|0 = 1 + 2|v|0 = 1 + 2|v|1 = |avvc|1 , contradicting (3.38).
Therefore |v| is even. If |a| = 1, then if we divide avvc in blocks of
two, we see that a = v1 and c = vn . The parity of (3.38) gives a = c, so
a = vn = v1 = c. But then w shortened by its last letter has the same
property as w, contradicting that w is shortest.
Hence $a = c = \epsilon$ and $\chi_{TM}(x) = vv$. But then $x^k x_1$ is a prefix of w, so w
contains a k-overlap after all.
The next lemma (see [107, Theorem 2]) can be used to produce any
number of square-free languages.
Lemma 3.124. Let χ : A → B ∗ be a constant length substitution; i.e.
|χ(a)| = |χ(b)| for all a, b ∈ A. If χ(w) is square-free for every square-
free 3-letter word w, then χ(x) is square-free for square-free words x of any
length.
Proof. Clearly χ(a) ≠ χ(b) for all a ≠ b because otherwise χ(aba) is not
square-free. If |χ(a)| = 1 for all a ∈ A, then χ is a simple permutation of
letters, and χ preserves the square-freeness. So let us assume that |χ(a)| =:
d ≥ 2.
Assume by contradiction that a square-free word $x = x_1 \cdots x_n$ maps to a
non-square-free word χ(x) = rsst. Assume that x is the shortest such word,
so |x| ≥ 4 and $\chi(x_1) = rr'$ for some non-empty prefix r′ of ss and $\chi(x_n) = t't$
for some non-empty suffix t′ of ss. However, $|\chi(x_1)| = |\chi(x_n)|$, so there is
some 1 < k < n such that $\chi(x_k) = yy'$ and
$$x = x_1 u x_k v x_n \xrightarrow{\ \chi\ } r\, \underbrace{r' \chi(u) y}_{s}\, \underbrace{y' \chi(v) t'}_{s}\, t.$$
• If |r | > |y |, then χ(w1 ) = ry r where r = and χ(v1 ) = r r .
Since |χ(w1 )| = |χ(v1 )| = d, also r = . Now χ(w1 v1 w1 ) =
ry r r r ry r is not square-free, so w1 = v1 . Thus we can rewrite
χ(w1 ) = r qr for some q = , because otherwise not even χ(w1 ) is
square-free. But r is also a prefix of χ(u), so r qr r is a prefix of
χ(w1 u), contradicting the minimality of w.
• If |r | < |y |, then χ(wk ) = yr y where y = and χ(u1 ) = y y .
Since |χ(wk )| = |χ(u1 )| = d, also y = . Now χ(wk u1 wk ) =
yr y y y yr y is not square-free, so wk = u1 . Thus we can rewrite
χ(wk ) = y qy for some q = , because otherwise not even χ(wk )
is square-free. But y is also a prefix of χ(v), so y qy y is a prefix
of χ(wk v), contradicting the minimality of w.
This proves the lemma. Note that χ : a → ab, b → cb, c → cd is square-free
on all 2-letter words, but χ(abc) = abcbcd is not square-free. Therefore the
minimal length |w| = 3 in the hypothesis of the lemma is optimal.
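The optimality example is easy to verify mechanically. The sketch below (helper names hypothetical) checks that the substitution a → ab, b → cb, c → cd maps every square-free 2-letter word to a square-free word, while the image of abc contains a square.

```python
from itertools import product

def is_square_free(w):
    """True if w has no factor of the form uu with u non-empty."""
    n = len(w)
    return not any(w[i:i + p] == w[i + p:i + 2 * p]
                   for p in range(1, n // 2 + 1)
                   for i in range(n - 2 * p + 1))

def apply(sub, w):
    """Apply a substitution, given as a dict letter -> word, to the word w."""
    return "".join(sub[c] for c in w)

chi = {"a": "ab", "b": "cb", "c": "cd"}

two_letter = ["".join(t) for t in product("abc", repeat=2)
              if is_square_free("".join(t))]
print(all(is_square_free(apply(chi, w)) for w in two_letter))  # 2-letter test passes
print(apply(chi, "abc"), is_square_free(apply(chi, "abc")))    # 3-letter test fails
```

The same two helpers can be used to test the hypothesis of Lemma 3.124 for any constant length substitution by running over the square-free 3-letter words instead.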
Lemma 3.124 is a building block for the proof of the following result; see
[107, Theorem 5].
Proof. The idea is to start with a square-free word $x \in \{0, 1, 2\}^n$, from
which we can create $2^n$ different square-free words in a 6-letter alphabet
A = {a, a′, b, b′, c, c′} by replacing occurrences of 0 by a or a′, occurrences of
1 by b or b′, and occurrences of 2 by c or c′, all independently. Let y be any
of the resulting words, and apply the following length-22 substitution to y:
$$\chi: \begin{cases}
a \to 0102012021012102010212,\\
a' \to 0102012021201210120212,\\
b \to 0102012101202101210212,\\
b' \to 0102012101202120121012,\\
c \to 0102012102010210120212,\\
c' \to 0102012102120210120212.
\end{cases}$$
Methods other than the one in this proof have also been designed; see e.g.
[60, 114, 235, 366, 472, 518]. If p(n) indicates the number of square-free
words in $\{0, 1, 2\}^n$, then $h_{top}(X, \sigma) = \lim_n \frac1n \log p(n)$. For the square-free
subshift of $\{0, 1, 2\}^*$, the most accurate estimate to date is $h_{top}(X, \sigma) = \log \alpha$ for
1.3017597 < α < 1.3017619, see [503, 504], which also contain numerical
estimates for topological entropy for k-power-free shifts for various values of
k and alphabet sizes.
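For small n the numbers p(n) can be found by exhaustive extension. The sketch below is a naive enumeration (function names hypothetical) and has nothing to do with the refined methods of [503, 504]; since the language is factorial, p is submultiplicative, so each value $p(n)^{1/n}$ is only an upper bound for α.

```python
from math import log

def ends_with_square(w):
    """True if some non-empty square uu is a suffix of w."""
    n = len(w)
    return any(w[n - 2 * p:n - p] == w[n - p:] for p in range(1, n // 2 + 1))

def count_square_free(max_n, alphabet="012"):
    """Return [p(1), ..., p(max_n)] by extending square-free words letter by letter."""
    counts = []
    level = [""]
    for _ in range(max_n):
        level = [w + c for w in level for c in alphabet
                 if not ends_with_square(w + c)]
        counts.append(len(level))
    return counts

counts = count_square_free(12)
for n, p in enumerate(counts, start=1):
    print(n, p, round(log(p) / n, 4))   # crude upper bounds for log(alpha)
```

A word is square-free if and only if its longest proper prefix is square-free and no square is a suffix, which is why only suffixes need to be tested at each extension.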
Definition 3.126. If w is a finite word, its repetition exponent is the
largest rational $\frac pq$ such that there is a prefix v of w such that w is a prefix of
$v^\infty$ and $|w| = \frac pq |v|$. If x is an infinite word, then the critical exponent of
x is the supremum of the repetition exponents of all its subwords w.
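For a finite word the repetition exponent is |w| divided by the smallest period of w, because the shortest admissible prefix v gives the largest ratio. A small sketch under that reading (function names hypothetical), with a finite-word analog of the critical exponent obtained by maximizing over all factors:

```python
from fractions import Fraction

def smallest_period(w):
    """Smallest q >= 1 with w[i] == w[i+q] for all valid i (w non-empty)."""
    n = len(w)
    for q in range(1, n + 1):
        if all(w[i] == w[i + q] for i in range(n - q)):
            return q

def repetition_exponent(w):
    """|w| / (smallest period of w) as an exact fraction."""
    return Fraction(len(w), smallest_period(w))

def max_exponent(x):
    """Largest repetition exponent among the non-empty factors of the finite word x."""
    n = len(x)
    return max(repetition_exponent(x[i:j])
               for i in range(n) for j in range(i + 1, n + 1))

print(repetition_exponent("0110110"))   # 7/3
print(max_exponent("01101001"))         # 2
```

The word 0110110 = (011)(011)0 has period 3, hence exponent 7/3; the Thue-Morse prefix 01101001 contains squares but no overlaps, so its largest exponent is 2.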
The proof was completed in a series of articles [173, 189, 427, 470]. See
[467, 468] for related results. This raises the question of the topological entropy
of fractional repetition-free subshifts. For example, in [343] it is shown
that the 7/3-repetition-free subshift over 3 letters has polynomial word-complexity,
whereas k-repetition-free shifts have positive entropy if k > 7/3.
In fact [343, Theorems 7 and 11], the word-complexity of the k-repetition-free
language satisfies
$$\begin{cases}
p(n) = O(n^{\log_2 25}) & \text{if } 2 < k \le 7/3,\\
0 < \limsup_n \frac{\log p(n)}{n} \le \frac{1}{63} \log 2 & \text{if } k > 7/3.
\end{cases}$$
Theorem 3.128. Let $(X_{OF}, \sigma)$ be the overlap-free shift, and $p_{OF}(n)$ its
word-complexity. Then:
• $\liminf_{n\to\infty} \frac{\log p_{OF}(n)}{\log n} \in [1.2690, 1.2736]$.
• $\limsup_{n\to\infty} \frac{\log p_{OF}(n)}{\log n} \in [1.3322, 1.3326]$.
Proof. If (X, σ) were sofic, then there would be a finite edge-labeled transition
graph representing X; see Theorem 3.32. But then we can pass through a loop
arbitrarily often, creating powers of any order. (This is basically the Pumping
Lemma 7.9 from Section 7.2.2.)
Since k-power-free shifts don't contain blocks $0^k$, they are not S-gap
or spacing shifts either. Specifically, because power-free shifts contain no
periodic sequences, we can ask whether power-free shifts are minimal. The
entire k-power-free shift is not minimal, because it contains non-recurrent
sequences, but there exist minimal k-power-free subshifts for any value of k >
0. Naturally this holds for the Thue-Morse shift, and by Lemma 3.123 other
overlap-free shifts are obtained by applying substitutions to $(X_{TM}, \sigma)$.
Theorem 4.4 in Section 4.1 shows that linearly recurrent shifts with constant
L are (L + 1)-power-free. Sturmian shifts are also k-power-free for k
sufficiently large if and only if their frequencies are of bounded type; see
Example 4.46 in Section 4.2.5.
Every Dyck shift has positive entropy, because it contains the coded shift
with code words () and (()). However, a Dyck shift with at least two pairs
of brackets is not synchronized. Indeed, there is no way that any word v
can synchronize so that ([v)] and [(v]) both become admissible. On
the other hand, the Dyck shift is coded; see [450, Example 5.5]. Indeed,
let C be the collection of all the well-formed expressions with brackets
where each opening bracket is closed, without linking. In the terminology of
groups generated by the brackets, these are the expressions that reduce to
the identity if each pair of brackets is set to ( ) = [ ] = · · · = Id.
Example 3.131. The language of the Dyck shift with one pair of brackets
is isomorphic to the language of the full shift on two symbols (with entropy
log 2), because every word v ∈ {(, )}∗ can be extended by brackets on the left
to supply brackets ( for every unopened ) and on the right to supply ) for
every unclosed (. The collection Lext of such extended words in which every
opening bracket is closed, and vice versa without illegally linked pairs, has
a representation with a countable automaton; see Figure 3.20. It can also
be represented as a push-down automaton (see Section 7.2.2), where we put
or remove a plate on/from the stack whenever we read an opening/closing
bracket.
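The stack picture can be sketched as follows. This is a minimal illustration with a hypothetical helper `reduce_brackets`; it treats a finite word as admissible when reading it never forces a closing bracket to be linked to an opening bracket of a different type, which is the usual description of the Dyck language and is assumed here.

```python
def reduce_brackets(word, pairs):
    """Read word with a stack (the plates of a push-down automaton).

    pairs maps each opening bracket to its closing bracket.
    Returns None if a closing bracket meets an opener of the wrong type;
    otherwise returns (unmatched closers, unmatched openers).
    """
    stack = []
    unmatched_closers = 0
    for c in word:
        if c in pairs:                    # opening bracket: put a plate on the stack
            stack.append(c)
        elif stack:                       # closing bracket with a plate available
            if pairs[stack[-1]] != c:     # illegally linked pair
                return None
            stack.pop()
        else:                             # closing bracket whose opener lies to the left of the word
            unmatched_closers += 1
    return unmatched_closers, len(stack)

two = {"(": ")", "[": "]"}
print(reduce_brackets("([])()", two))   # (0, 0): well-formed
print(reduce_brackets(")]([", two))     # (2, 2): admissible, can be extended on both sides
print(reduce_brackets("([)]", two))     # None: linked pairs
```

With a single pair of brackets the `None` branch is never reached, which reflects the statement of Example 3.131 that every word over {(, )} can be extended to a well-formed expression.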
Figure 3.20. A countable automaton for the two bracket Dyck shift.
34 Named after Walther von Dyck (1856–1934) who, being a student of Klein, was more
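$$G_{Cat}(x) := \sum_{n=0}^{\infty} C_n x^n = \frac{1 - \sqrt{1 - 4x}}{2x} = \frac{2}{1 + \sqrt{1 - 4x}}.$$
More precise asymptotics are
$$C_n = \frac{4^n}{\sqrt{\pi}} \Big( n^{-3/2} - \frac{9}{8} n^{-5/2} + \frac{145}{128} n^{-7/2} + \cdots \Big),$$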
so the entropy of the Dyck shift with one pair of brackets is indeed log 2,
just as in the full shift of two symbols: The allowed 2n-words are a small
fraction of all 2n-words, but not an exponentially small fraction.
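The slow convergence of the growth rate to log 2 can be observed numerically. The sketch below uses the closed formula $C_n = \binom{2n}{n}/(n+1)$ for the Catalan numbers; the function names are hypothetical.

```python
from math import comb, log

def catalan(n):
    """The n-th Catalan number, binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def growth_rate(n):
    """(1 / 2n) log C_n: exponential growth rate per symbol of well-formed 2n-words."""
    return log(catalan(n)) / (2 * n)

print([catalan(n) for n in range(6)])
for n in (10, 100, 1000):
    print(n, growth_rate(n), log(2))
```

The polynomial factor $n^{-3/2}$ only contributes a correction of order (log n)/n to the growth rate, so the values approach log 2 from below.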
To compute the entropy of the Dyck shift with k types of bracket pairs,
we obtain the well-formed expressions of length 2n by starting with the well-formed
expressions with one pair of brackets and then, for each joined pair of
open-and-closing brackets, choosing one of the k bracket types independently.
Thus there are $C_n k^n$ well-formed expressions with k types of bracket pairs,
and the entropy is $\ge \log 2 + \frac12 \log k$. This is only a lower bound, because not
every 2n-word in this Dyck shift is a well-formed expression. The topological
entropy is really log(k + 1), which follows from the next result by Krieger
[373].
Theorem 3.133. The Dyck shift (X, σ) on k ≥ 2 types of bracket pairs has
exactly two ergodic measures of maximal entropy log(k + 1), and each one is
fully supported and isomorphic to a Bernoulli shift.
Proof. Let B− ⊂ X be the set of all sequences in which every left bracket
has a corresponding right bracket, and let B+ be the set of all sequences in
which every right bracket has a corresponding left bracket. Note that B+ and
B− are shift-invariant. One can show that every shift-invariant measure has
μ(B− ∪ B+ ) = 1 by partitioning the complement into a countable collection
of disjoint sets indexed by the location of the first/last left/right bracket
with no partner.
Define a map $\pi_+ : B_+ \to \{0, 1, \dots, k\}^{\mathbb Z}$ by sending the k left brackets
to the symbols {1, . . . , k} and sending every right bracket to the symbol 0.
Then $\pi_+$ is an isomorphism between the two shift spaces because every right
bracket has a corresponding left bracket, and hence its identity is uniquely
determined by the rules of the shift. Similarly, the analogous map $\pi_- : B_- \to \{0, 1, \dots, k\}^{\mathbb Z}$ is an isomorphism.
Because every ergodic invariant measure on X is supported on either
B− or B+ , we conclude that htop (X, σ) = log(k + 1) and that there are
exactly two ergodic measures of maximal entropy μ± = ν ◦ π± , where ν is
the Bernoulli measure on the full shift on k + 1 symbols that gives equal
weight to all symbols. Each of these measures gives positive measure to
every open set in X.
Finally, note that if k = 1, then $B_+$ and $B_-$ largely overlap. If we
let ν be the $(\frac12, \frac12)$-Bernoulli measure, then by the Law of Large Numbers,
the mass is concentrated on sequences with zeros and ones occurring with
frequency 1/2, so that the number of opening brackets and closing brackets
is asymptotically the same and $\mu_+ = \mu_-$.
This and the next result from [373] have been shown in simplified form
in the Math Blog of Climenhaga [157].
Proposition 3.134. The set of ergodic measures for the Dyck shift is arc-
wise connected but is not dense in the Choquet simplex of invariant measures
(see Section 6.1 for the definition).
Subshifts of
Zero Entropy
134 4. Subshifts of Zero Entropy
Proof. (i) Linear recurrence implies that for every n ∈ ℕ and every word
$u \in L_n(X)$ and x ∈ X, the occurrence frequency of u in x satisfies
$$\liminf_{k\to\infty} \frac1k \#\{1 \le i \le k : x_i \cdots x_{i+n-1} = u\} \ge \frac{1}{Ln}.$$
Therefore there is no space for more than Ln words of length n.
(ii) If $v \in L_n(X)$, then the gap between two occurrences of v is ≤ Ln,
so every word u of length (L + 1)n − 1 contains v at least once. If $v^{L+1} \in L(X)$,
then all words of length n are cyclic permutations of v because the
gap between any other words of length n becomes too large otherwise; cf.
Proposition 1.12. But then X is periodic.
1 As shown in [221, Theorem 1], a linearly recurrent subshift has, up to isomorphism, only
(iii) Take u ∈ L(X) and $w \in R_u$. If |u| ≥ L|w|, then the word wu
(which starts and ends with u) must have $w^{L+1}$ as prefix. This contradicts
(ii).
(iv) Take u ∈ L(X) and v ∈ L(X) of length $(L + 1)^2 |u|$. By the proof of
(ii), every word of length ≤ (L + 1)|u| occurs in v and, in particular, every
return word $w \in R_u$ occurs in v. Now return words in v don't overlap, see
(4.1), so using the minimal length |w| ≥ |u|/L of return words (from item
(iii)), we find $\#R_u \le |v|/(|u|/L) = L(L + 1)^2$.
(v) Finally, suppose that the subshift (Y, σ) over alphabet B is a factor
of (X, σ) and $f : A^{2N+1} \to B$ is the corresponding sliding block code, so
2N + 1 is its window size. Take u ∈ L(X) of length |u| ≥ 2N + 1 and v its
image under f. Then |v| = |u| − 2N. If $w \in R_v$, then $|w| \le \max\{|s| : s \in R_u\} \le L|u| \le (|v| + 2N)L \le (2N + 1)|v|L$. Therefore Y is linearly recurrent
with constant (2N + 1)L.
The bound (2N + 1)L is not the sharpest. One can show that for every
ε > 0, there is L0 such that for all n ≥ L0 , x ∈ X, and v ∈ Ln (X), the gap
between two occurrences of v in x is at most (L + ε)n.
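Return words are easy to compute on a long prefix. The sketch below uses the Fibonacci word as a test case and assumes the description of return words used in item (iii): w is a return word to u if wu starts and ends with u and contains no other occurrence of u. The function names are hypothetical, and a finite prefix can of course only exhibit return words that occur in it.

```python
def fibonacci_word(n_iter):
    """Prefix of the fixed point of 0 -> 01, 1 -> 0 after n_iter iterations."""
    w = "0"
    for _ in range(n_iter):
        w = "".join("01" if c == "0" else "0" for c in w)
    return w

def return_words(x, u):
    """Set of words x[i:j] where i < j are consecutive occurrences of u in x."""
    occ = [i for i in range(len(x) - len(u) + 1) if x.startswith(u, i)]
    return {x[i:j] for i, j in zip(occ, occ[1:])}

x = fibonacci_word(15)
for u in ("0", "01", "010"):
    R = return_words(x, u)
    print(u, sorted(R), max(len(w) for w in R) / len(u))
```

The last printed number is the largest observed ratio |w|/|u|, a quantity that linear recurrence bounds from above by L.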
is a map for which ψ(xy) = ψ(x)ψ(y) holds, provided concatenations (or products) are properly
defined on X and Y . The word substitution agrees better with our intuition of this concept, so
we will use substitution.
The lengths of $\chi^n(0)$ are exactly the Fibonacci numbers. The limit word $\rho_{Fib}$
is also a Sturmian sequence, namely the one associated to the golden mean
as rotation number; see Section 4.3.
Lemma 4.7. Assume that χ(a) is non-empty for every a ∈ A. Then for
every a ∈ A, $\chi^n(a)$ tends to a periodic orbit of χ as n → ∞.
Proof. As can be seen in Example 4.6, if a is the first symbol of χ(a), then
χ(a) is a prefix of $\chi^2(a)$, which is a prefix of $\chi^3(a)$, etc. Therefore $\chi^n(a)$
tends to a fixed point of χ as n → ∞.
Since #A = N, there must be p < r ≤ N such that $\chi^p(a)$ and $\chi^r(a)$
start with the same symbol b. Now we can apply the above argument to
$\chi^{r-p}$ and b.
The second line of this example is not interesting, so we will usually make
the assumption
This sequence appears as the kneading sequence (i.e. itinerary of the crit-
ical value) of the (infinitely renormalizable) Feigenbaum interval map; see
Section 4.7.1. It is also a Toeplitz sequence; see Example 4.86.
Lemma 4.13. Each one-sided substitution shift space (Xρ , σ) admits a two-
sided substitution shift extension.
Proof. By Lemma 4.7, we can assume that χ(a) starts with a. First define
χ on two-sided sequences as
$$\chi(\cdots x_{-2} x_{-1} . x_0 x_1 x_2 x_3 \cdots) = \cdots \chi(x_{-2}) \chi(x_{-1}) . \chi(x_0) \chi(x_1) \chi(x_2) \chi(x_3) \cdots,$$
where the central dot indicates where the zeroth coordinate is.
To create a two-sided substitution shift, take some i > 1 such that $\rho_i = a$,
and let $a' = \rho_{i-1}$. Similar to the argument of Lemma 4.7, there are b ∈ A and
p < q ∈ ℕ such that $\chi^p(a')$ and $\chi^q(a')$ both end in b. Set K = q − p, so $\chi^K(b)$
ends with b. Next apply $\chi^K$ to b.a repeatedly, so that $\lim_n \chi^{nK}(b.a) =: \hat\rho$ is a
two-sided fixed point of $\chi^K$. Finally, set $\hat X_\rho = \overline{\{\sigma^n(\hat\rho) : n \in \mathbb Z\}}$.
Even though $\hat\rho$ need not be unique (because the choices of b and K are
not unique), due to minimality (see below), the shift space $\hat X_\rho$ is unique.
This way of writing the associated matrix (and not its transpose) ensures
that composition of substitutions and composition of associated matrices
work in the same way: $A_{\tilde\chi \circ \chi} = A_{\tilde\chi} \cdot A_\chi$.
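The composition rule can be checked on a small example. The sketch below assumes the convention that the entry in row a and column b of the associated matrix counts the occurrences of a in χ(b), which is the convention under which letter frequencies form a right eigenvector; the two substitutions and all function names are hypothetical illustrations.

```python
def associated_matrix(sub, alphabet):
    """Entry (a, b) is the number of letters a in sub[b] (assumed convention)."""
    return [[sub[b].count(a) for b in alphabet] for a in alphabet]

def compose(outer, inner):
    """The substitution a -> outer(inner(a))."""
    return {a: "".join(outer[c] for c in inner[a]) for a in inner}

def matmul(A, B):
    """Product of two square integer matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

alphabet = "01"
chi = {"0": "001", "1": "0"}
chi_tilde = {"0": "01", "1": "1"}

lhs = associated_matrix(compose(chi_tilde, chi), alphabet)
rhs = matmul(associated_matrix(chi_tilde, alphabet),
             associated_matrix(chi, alphabet))
print(lhs, rhs)
```

With the transposed convention the order of the factors in the matrix product would be reversed, which is the point of the remark above.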
Lemma 4.15. Let a primitive substitution χ with χ(0) = 0 · · · have the
fixed point ρ = 0 · · · and associated matrix A. Let v be the right eigenvector
of the leading eigenvalue of A. If v is scaled so that $\sum_i v_i = 1$, then $v_j = \lim_n \frac1n \#\{1 \le i \le n : \rho_i = j\}$ is the frequency of the j-th letter in ρ.
Proof. Let $u = (u_j)_{j\in A}$, $u_j = |w|_j/|w|$, be the frequency vector of some
word $w = 0 \cdots \in A^*$ and let u′ be the frequency vector of χ(w). Then
$$u'_i = \frac{\sum_j a_{ij} u_j}{\sum_{i,j} a_{ij} u_j}, \quad\text{so}\quad u' = f(u) := \frac{Au}{\|Au\|_1}.$$
Since χ is primitive, the Perron-Frobenius Theorem 8.58 assures that $f^n(u)$ converges to the leading
eigenvector, which is therefore the frequency vector of the letters in the fixed point
$\rho = \lim_n \chi^n(w)$.
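The iteration u ↦ Au/‖Au‖₁ from the proof can be run directly. The sketch below does this for the Fibonacci substitution 0 → 01, 1 → 0 and compares the result with the empirical frequency of 0 in a long prefix of the fixed point; the function names are hypothetical.

```python
def normalized_step(A, u):
    """One step u -> Au / ||Au||_1 for a non-negative matrix A."""
    Au = [sum(a * x for a, x in zip(row, u)) for row in A]
    total = sum(Au)
    return [x / total for x in Au]

def iterate_frequency(A, u, steps):
    for _ in range(steps):
        u = normalized_step(A, u)
    return u

def prefix_of_fixed_point(sub, seed, n_iter):
    w = seed
    for _ in range(n_iter):
        w = "".join(sub[c] for c in w)
    return w

A = [[1, 1], [1, 0]]                         # associated matrix of 0 -> 01, 1 -> 0
v = iterate_frequency(A, [1.0, 0.0], 60)     # start from the frequency vector of w = 0
w = prefix_of_fixed_point({"0": "01", "1": "0"}, "0", 20)
print(v, w.count("0") / len(w))
```

Both numbers approximate $(\sqrt5 - 1)/2 \approx 0.618$, the first coordinate of the normalized leading eigenvector.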
Remark 4.16. By taking the associated matrix A instead of the substitu-
tion, we lose the order structure of the substitution words. For instance, the
where δa,b is the Dirac delta and an empty product is 1. Then A(1, . . . , 1) =
A and A(x) satisfies the composition rule for substitutions: Aχ◦ψ (x) =
Aχ (ATψ (x)) · Aψ (x). See e.g. [132].
We follow the exposition of Durand [221, 222] here; the paper [176]
shows that for substitution shifts, linear recurrence is equivalent to minimal-
ity.
Deviatov [197] extended these results to S-adic shifts; see Section 4.2.5.
Then χ has a unique fixed point, which for e.g. r = 3 looks like
$$\rho = ab_3 \,.\, b_3 b_2 \,.\, b_3 b_2 b_2 b_1 \,.\, b_3 b_2 b_2 b_1 b_2 b_1 b_1 b_0 \,.\, b_3 b_2 b_2 b_1 b_2 b_1 b_1 b_0 b_2 b_1 b_1 b_0 b_1 b_0 b_0 \cdots.$$
Set $v_i = \chi^i(ab_r)$ for i ≥ 0. The dots separate the blocks $w_i$, where $w_0 = ab_r$
and $w_i$ is the suffix of $v_i$ of length $|v_i| - |v_{i-1}|$. Then symbol $b_k$ appears
exactly $\binom{i}{r-k}$ times in $w_i$.
Next apply an erasing substitution $\tilde\chi : A \to \{0, 1\}^*$ given by
$$\tilde\chi : \begin{cases}
a \to \epsilon,\\
b_k \to 0, & \text{for } k = 0, \dots, r - 1,\\
b_r \to 1
\end{cases}$$
to ρ. Then
$$\tilde\rho := \tilde\chi(\rho) = 1.10^{n_0}.10^{n_1}.10^{n_2}.10^{n_3}.10^{n_4}.10^{n_5} \cdots \quad\text{for } n_i = \sum_{k=1}^{r} \binom{i}{k} \approx i^r/r!.$$
That is, $\bar w_j$ is the j-th word of length ℓ inside w = χ(u). Note that $|\chi_\ell(u)|$ is
equal to $|\chi(u_1)|$, which is not necessarily the same as the number of ℓ-words
that fit in χ(u). For example, if $\chi = \chi_{Fib} : 0 \to 01, 1 \to 0$ is the Fibonacci
substitution on the alphabet A = {0, 1}, and ℓ = 3, then the new alphabet is
$A_3 = \{001, 010, 100, 101\} = \{a, b, c, d\}$ and
$$\chi : \begin{cases}
001 \to 01010,\\
010 \to 01001,\\
100 \to 00101,\\
101 \to 0010,
\end{cases}
\qquad
\chi_3 : \begin{cases}
a \to bd & \text{because } |\chi(u_1)| = 2,\\
b \to bc & \text{because } |\chi(u_1)| = 2,\\
c \to a & \text{because } |\chi(u_1)| = 1,\\
d \to a & \text{because } |\chi(u_1)| = 1,
\end{cases}$$
with associated matrices
$$A = \begin{pmatrix} 1 & 1\\ 1 & 0 \end{pmatrix}
\quad\text{and}\quad
A_3 = \begin{pmatrix} 0 & 0 & 1 & 1\\ 1 & 1 & 0 & 0\\ 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix}$$
and eigenvalues $\frac12(1 \pm \sqrt5)$ and $\frac12(1 \pm \sqrt5), 0, 0$, respectively. For the second
iterate:
$$\chi^2 : \begin{cases}
001 \to 01001001,\\
010 \to 01001010,\\
100 \to 01010010,\\
101 \to 0101001,
\end{cases}
\qquad
\chi_3^2 : \begin{cases}
a \to bca & \text{because } |\chi^2(u_1)| = 3,\\
b \to bca & \text{because } |\chi^2(u_1)| = 3,\\
c \to bd & \text{because } |\chi^2(u_1)| = 2,\\
d \to bd & \text{because } |\chi^2(u_1)| = 2.
\end{cases}$$
This example shows that powers of χ and powers of $\chi_\ell$ match: $f \circ \chi^n(x) = \chi_\ell^n \circ f(x)$ if f is the transposition of words $x \in A^*$ into words in $A_\ell^*$.
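The ℓ-block substitution of this example can be recomputed mechanically. The sketch below follows the description in the text (the image of an ℓ-block u consists of the first |χ(u₁)| words of length ℓ read off from χ(u)); the function name `block_substitution` and the dictionary `names` are hypothetical.

```python
def block_substitution(sub, block):
    """First |sub(block[0])| words of length len(block) inside sub(block)."""
    image = "".join(sub[c] for c in block)
    ell = len(block)
    return [image[j:j + ell] for j in range(len(sub[block[0]]))]

fib = {"0": "01", "1": "0"}
fib2 = {a: "".join(fib[c] for c in fib[a]) for a in fib}       # second iterate
names = {"001": "a", "010": "b", "100": "c", "101": "d"}

def recoded(sub):
    """The 3-block substitution of sub, written in the letters a, b, c, d."""
    return {names[u]: "".join(names[v] for v in block_substitution(sub, u))
            for u in names}

chi3 = recoded(fib)
chi3_squared = {x: "".join(chi3[y] for y in chi3[x]) for x in chi3}
print(chi3)            # the 3-block version of chi
print(chi3_squared)    # its second iterate
print(recoded(fib2))   # the 3-block version of chi^2
```

The last two dictionaries coincide, which is the statement that powers of χ and powers of its ℓ-block version match.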
Proposition 4.22. Let $\chi_\ell$ be the ℓ-block version of the substitution χ, with
associated matrix $A_\ell$. If χ is a primitive substitution, then so is $\chi_\ell$, and the
leading eigenvalue of $A_\ell$ is equal to the leading eigenvalue of $A_1$. The
remaining eigenvalues of $A_\ell$ are the same as those of $A_2$, possibly with
extra eigenvalues 0.
χp χ
A∗ A∗ A∗
ψ2 ψ ψ ψ2
χ2 χp2
A∗2 A∗2 A∗2
see [3]. Clearly they both have the same associated matrix with eigenvalues
1 and 3. However, the fixed point of χ is a shift-periodic sequence
121212121 · · · and $\chi_\ell = \chi$ for each ℓ if we recode the two ℓ-blocks by their
first letters. For ψ with {a, b, c, d, e, f} = {112, 212, 121, 211, 122, 221} we
have
$$\psi_2 : \begin{cases}
a \to abd,\\
b \to bcd,\\
c \to aef,\\
d \to bcd,\\
e \to aef,\\
f \to bef
\end{cases}
\quad\text{with associated matrix}\quad
A_2 = \begin{pmatrix}
1 & 0 & 1 & 0 & 1 & 0\\
1 & 1 & 0 & 1 & 0 & 1\\
0 & 1 & 0 & 1 & 0 & 0\\
1 & 1 & 0 & 1 & 0 & 0\\
0 & 0 & 1 & 0 & 1 & 1\\
0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}$$
which has eigenvalues $3, 1, \frac12(1 \pm \sqrt3\, i), 0, 0$. In Table 4.1 we worked out some
of the details for the Fibonacci substitution.
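Columns: χ on ℓ-blocks; $\chi_\ell$; associated matrix; leading left eigenvector; $\psi_\ell$; associated matrix.

ℓ = 2:
$$\chi : \begin{cases} a := 00 \to 0101,\\ b := 01 \to 010,\\ c := 10 \to 001 \end{cases}
\qquad
\chi_2 : \begin{cases} a \to bc,\\ b \to bc,\\ c \to a \end{cases}
\qquad
\begin{pmatrix} 0 & 0 & 1\\ 1 & 1 & 0\\ 1 & 1 & 0 \end{pmatrix}
\qquad
\begin{pmatrix} 1\\ \gamma\\ \gamma \end{pmatrix}$$

ℓ = 3, p = 2:
$$\chi : \begin{cases} a := 001 \to 01010,\\ b := 010 \to 01001,\\ c := 100 \to 00101,\\ d := 101 \to 0010 \end{cases}
\qquad
\chi_3 : \begin{cases} a \to bd,\\ b \to bc,\\ c \to a,\\ d \to a \end{cases}
\qquad
\begin{pmatrix} 0 & 0 & 1 & 1\\ 1 & 1 & 0 & 0\\ 0 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 \end{pmatrix}$$
$$\begin{pmatrix} 1 + \gamma\\ 1 + 2\gamma\\ 1 + \gamma\\ \gamma \end{pmatrix}
\qquad
\psi_3 : \begin{cases} a \to bca,\\ b \to bca,\\ c \to bd \end{cases}
\qquad
\begin{pmatrix} 1 & 1 & 0\\ 1 & 1 & 1\\ 1 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}$$

ℓ = 4, p = 3:
$$\chi : \begin{cases} a := 0010 \to 0101001,\\ b := 0100 \to 0100101,\\ c := 0101 \to 010010,\\ d := 1001 \to 001010,\\ e := 1010 \to 001001 \end{cases}
\qquad
\chi_4 : \begin{cases} a \to ce,\\ b \to bd,\\ c \to bd,\\ d \to a,\\ e \to a \end{cases}
\qquad
\begin{pmatrix} 0 & 0 & 0 & 1 & 1\\ 0 & 1 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 & 0\\ 0 & 1 & 1 & 0 & 0\\ 1 & 0 & 0 & 0 & 0 \end{pmatrix}$$
$$\begin{pmatrix} 1 + 2\gamma\\ 1 + 2\gamma\\ 1 + \gamma\\ 1 + 2\gamma\\ 1 + \gamma \end{pmatrix}
\qquad
\psi_4 : \begin{cases} a \to bdace,\\ b \to bdace,\\ c \to bda \end{cases}
\qquad
\begin{pmatrix} 1 & 1 & 1\\ 1 & 1 & 1\\ 1 & 1 & 0\\ 1 & 1 & 1\\ 1 & 1 & 0 \end{pmatrix}$$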
Proof. Recall from Theorem 4.18 that a primitive substitution shift is linearly
recurrent; let L be the corresponding constant. Then, independently
of the prefix $u \ne \epsilon$ of ρ, we have by Theorem 4.4 that
$$\#A_u = \#R_u \le L(L + 1)^2, \qquad \frac{|u|}{L} \le |v| \le L|u|, \qquad\text{and}\qquad |\chi(v)| \le KL|u|$$
for $K = \sup_{a\in A} |\chi(a)|$. Therefore there is no space for more than finitely
many different substitutions.
Proposition 4.27. All derived substitutions of a primitive substitution χ
have the same eigenvalues, possibly with extra eigenvalues 0.
Table 4.2 shows the derived substitution for the first few prefixes of ρ.
putting in the dot to indicate the zeroth position, there are three ways of
dividing x into three-blocks,
(4.10) x = · · · |010|101|010| · · · = · · · 0|101|010|10 · · · = · · · 01|010|101|0 · · · ,
and each with their own inverse. The way to cut x into blocks χ(a) is called
a 1-cutting of x. The problem is thus: can a sequence x ∈ χ(Xρ ) have
multiple 1-cuttings if we don’t know a priori where the first block starts?
In this definition, the sequence x from (4.10) is not recognizable, but for
example the fixed point of the Fibonacci substitution χFib is recognizable
with recognizability index 2. The Thue-Morse sequence ρ0 (or ρ1 ) is recog-
nizable with recognizability index 4. The following result is due to Mentzen
(1989) and Apparicio [29]:
Theorem 4.31. Every primitive injective constant length substitution with
aperiodic fixed point is one-sided recognizable.
and just based on the word u = 010001, we cannot say if the cut is directly
before its occurrence or not. This problem does not disappear if we take
longer words. The latter substitution χchac is called the primitive Chacon
substitution; see Example 6.124.
Using the above computation, we see that also $|||\lambda^k |\chi^n(b)|\,||| \to 0$ exponentially,
and therefore the $\lambda^k$ are eigenvalues as well; cf. Theorem 8.8. If the
minimal polynomial of λ has degree d = #A, then $1, \lambda, \dots, \lambda^{d-1}$ are linearly
independent, but $\lambda^d$ is a linear combination of $1, \lambda, \dots, \lambda^{d-1}$. Thus
$$g : X_\rho \to \mathbb T^{d-1}, \quad x \mapsto (g_\lambda, g_{\lambda^2}, \dots, g_{\lambda^{d-1}})$$
is a semi-conjugacy between $(X_\rho, \sigma)$ and the toral rotation $R_{\boldsymbol\lambda} : \mathbb T^{d-1} \to \mathbb T^{d-1}$, $x \mapsto x + \boldsymbol\lambda \bmod 1$ for the translation vector $\boldsymbol\lambda = (\lambda, \dots, \lambda^{d-1})$. Again,
since $1, \lambda, \dots, \lambda^{d-1}$ are linearly independent, $R_{\boldsymbol\lambda}$ is minimal and uniquely
ergodic, with Lebesgue measure as its only $R_{\boldsymbol\lambda}$-invariant probability measure.
It is widely believed that, for every irreducible Pisot substitution, (Xρ , σ, μ)
is isomorphic to (Td−1 , Rλ , Leb); i.e. the semi-conjugacy π is one-to-one μ-
a.e. This is a corollary of Halmos & von Neumann’s Structure Theorem
6.100, together with the Pisot substitution conjecture which states that every
irreducible Pisot substitution has a pure point spectrum; see Section 6.8.3.
In this section, we will give some more properties of Pisot substitutions,
leading to a more geometrical understanding of g.
The letter frequencies $f_a = \lim_n \frac1n |x_1 \cdots x_n|_a$ of substitution shifts exist
for all a ∈ A, independently of x ∈ X. Frequency is a limit notion, but
there are ways to measure how often subwords and letters appear in finite
words, without taking limits. Given a word $v = v_1 \cdots v_n \in A^*$, let $|v|_a = \#\{1 \le i \le n : v_i = a\}$ be the number of appearances of the letter a in v.
Similarly, $|v|_u$ stands for the number of occurrences of the word u in v.
Definition 4.36. A language L(X) is called R-balanced if there is an
R ∈ N such that
||v|a − |w|a | ≤ R
for all a ∈ A, n ∈ N, and words v, w ∈ Ln (X). If R is not specified, then we
just say balanced. Similarly, we call L(X) balanced on words if there is
R ∈ N such that
||v|u − |w|u | ≤ R
for all u ∈ L, integers n ≥ |u|, and words v, w ∈ Ln (X).
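The balance condition on letters can be explored numerically on a long prefix. The sketch below (function names hypothetical) computes, for each length n, the largest difference $|v|_a - |w|_a$ over the words of length n that occur in a prefix of the Fibonacci word; since only a finite prefix is scanned, this gives a lower bound for the true balance constant rather than a proof.

```python
def fibonacci_word(n_iter):
    """Prefix of the fixed point of 0 -> 01, 1 -> 0 after n_iter iterations."""
    w = "0"
    for _ in range(n_iter):
        w = "".join("01" if c == "0" else "0" for c in w)
    return w

def letter_imbalance(x, n, letter):
    """max |v|_a - min |w|_a over all factors v, w of length n occurring in x."""
    counts = {x[i:i + n].count(letter) for i in range(len(x) - n + 1)}
    return max(counts) - min(counts)

x = fibonacci_word(15)
print(max(letter_imbalance(x, n, "0") for n in range(1, 31)))
```

For the Fibonacci word the observed value is 1, in agreement with Theorem 4.37 and with the fact that Sturmian sequences are 1-balanced.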
Theorem 4.37. Every primitive Pisot substitution shift is balanced.
Proof. Let $f = (f_a)_{a\in A}$ be the frequency vector; it is the right eigenvector
of the associated matrix A of χ; see Lemma 4.15. Let λ, μ be the largest two
eigenvalues of A. Because λ is a Pisot number, λ > 1 > |μ|. Assume that
μ has multiplicity m. Then, using the Jordan decomposition $A = UJU^{-1}$
where f is the leftmost column of U and writing $\mathbf 1_b$ for the unit column
vector with a single 1 at position b, we find
$$|\chi^n(b)|_a = (A^n \mathbf 1_b)_a = (U J^n U^{-1} \mathbf 1_b)_a = f_a \lambda^n (U^{-1})_{1,b} + O(n^{m-1} \mu^n).$$
4.2. Substitution Shifts 153
We sum over a ∈ A, noting that $\sum_{a\in A} f_a = 1$: $|\chi^n(b)| = \sum_{a\in A} (A^n \mathbf 1_b)_a = \lambda^n (U^{-1})_{1,b} + O(n^{m-1} \mu^n)$. Therefore
$$\big|\, |\chi^n(b)|_a - f_a |\chi^n(b)| \,\big| = O(n^{m-1} \mu^n), \tag{4.14}$$
proving that the discrepancy is bounded at the words $\chi^n(b)$; see (8.18) and
Definition 8.40 in Section 8.3.1. We can split an arbitrary word w ∈ L(ρ) as
$$w = v_0 \chi(v_1) \cdots \chi^{n-1}(v_{n-1}) \chi^n(v_n) \chi^{n-1}(v'_{n-1}) \cdots \chi(v'_1) v'_0 \tag{4.15}$$
for some maximal n such that $v_n \ne \epsilon$ and each $v_k$ and $v'_k$ have length $\le L := \max_{a\in A} |\chi(a)|$. Applying (4.14) to each of $\chi^j(v_j)$ and $\chi^j(v'_j)$ we get bounded
discrepancy altogether. It follows by Proposition 8.43 that ρ is balanced; see
also Proposition 4.22.
Remark 4.38. The above proof can be adapted to show that also whole
words $v \in L_\ell(\rho)$ appear with bounded discrepancy, namely by considering
the ℓ-block shift, which is also Pisot, and in which v is simply a single letter.
Without proof (see [3, 4]), we remark that if λ is not a Pisot number, then
the discrepancy satisfies
$$n D_n^*(\rho) \approx \begin{cases}
(\log n)^m\, n^{\log|\mu|/\log|\lambda|} & \text{if } |\mu| > 1,\\
(\log n)^m \text{ or } (\log n)^{m-1} & \text{if } |\mu| = 1 \text{ and } \mu \text{ is a root of unity},\\
(\log n)^m & \text{if } |\mu| = 1 \text{ and } \mu \text{ is not a root of unity},
\end{cases}$$
where again μ is the second largest eigenvalue, of geometric multiplicity m.
[Figure: the broken line of ρ = 02100202102 · · · for the substitution χ : 0 → 02, 1 → 0, 2 → 1, drawn with the expanding eigendirection $E_\lambda^+$ and the projection π onto V.]
is called the Rauzy fractal of χ, [32, 53, 471]. See Figure 4.4 for some
examples in dimension two. Strictly speaking, for Rauzy fractals that are
topological disks, it is only the boundary of R that is fractal.
We can transfer the shift action σ from Xρ to the space of broken lines
via
Also the substitution can be carried to the space of broken lines. Set
χ̂(1a ) = u1 · · · uχ(xi ) , uj is parallel to 1χ(a)j ,
and extend this to a broken line L by concatenating the broken arcs χ̂(xi )
such that χ̂(x1 ) starts at the origin and χ̂(xi ) and χ̂(xi+1 ) have a boundary
point in common, namely the vector $(|\chi(x_1\cdots x_i)|_a)_{a\in A}$. It also follows that
(4.17) $h\circ\pi = \pi\circ\hat\chi$, $h = A|_V : V \to V$.
Theorem 4.39. The map π̂ : orbσ (ρ) → R ⊂ V defined by π̂ ◦ σ n (ρ) = π ◦
n (ρ) extends continuously to Xρ and commutes with the piecewise translation
(4.18) T : R → R, y → y + π(1a ) if y ∈ p([a]).
In particular, p(Xρ ) = R. In fact, T is a group translation on V /Λ for some
lattice Λ. If A is unimodular, then R is a fundamental domain of Λ, and
π̂ : Xρ → R % V /Λ is a measure-theoretic isomorphism.
α
−α α − log(d(σ n1 (ρ) , σ n2 (ρ)))
&π̂(σ (ρ)) − π̂(σ (ρ))& ≤ λ
n2 n1 kα
≤N n = .
N log 2
This implies the required uniform continuity of π̂ : orbσ (ρ) → V and allows
us to extend π̂ continuously to Xρ .
However, on each domain $\hat\pi([a])$, the translation vector $\pi_a := \pi(\mathbf{1}_a)$ is different. When we divide the hyperplane V by a well-chosen lattice Λ, these translation vectors become the same. That is, we need $\pi_a - \pi_{a'}$ to be lattice points for all $a, a' \in A = \{0,\dots,d-1\}$. The simplest way of achieving this is by letting $\Lambda_j = \pi_j - \pi_{j-1}$, for $j \in \{1,\dots,d-1\}$, be the vectors spanning Λ. Let us compute the $\pi_i$ more explicitly. Let $u_j$, $j \in \{0,\dots,d-1\}$, be the (generalized) right eigenvectors of A, where $u_0$ is associated to the leading eigenvalue λ. Since the $u_j$ are the columns of U in the Jordan decomposition $A = UJU^{-1}$, we have $e_j = (U^{-1}U)_j = \sum_{i=0}^{d-1} u^{-1}_{ij}u_i$, where $U^{-1} = (u^{-1}_{ij})_{i,j=0}^{d-1}$. Hence $\pi_j = e_j - u^{-1}_{0,j}u_0 = \sum_{i=1}^{d-1} u^{-1}_{i,j}u_i$, and
The result was first shown for general irreducible unit Pisot substitutions
by Sirvent & Wang [515], although special cases were around, see e.g. [33,
125, 329]. In particular, Arnoux & Ito [33] gave a condition under which
the tiles R(i) overlap at most on a null-set. We follow the proof presented
in [69], which is Chapter 5 in [68].
Proof. Recall that $L = \{\ell_i\}_{i\ge0}$ is the broken line associated to the fixed point ρ of the Pisot substitution χ. The subtile R(i) is the closure of the points $\{\pi\circ\ell_{n-1} : \rho_n = i\}$. Since χ(ρ) = ρ, for each such n, there is m such that $\rho_1\cdots\rho_n = \chi(\rho_1\cdots\rho_m)p$, where $\rho_m = j$ and χ(j) = pis. By (4.17), we get
Taking the union of such points for all n with ρn = i and then taking the
closure, we arrive at
where $i \xrightarrow{(p,i,s)} j$ are the labeled arrows of the prefix-suffix graph. Now h contracts the (d − 1)-dimensional Lebesgue measure Leb of V by a factor 1/λ because A is a unimodular Pisot matrix. Therefore, writing $w_i = \mathrm{Leb}(R(i))$, we obtain that
Here A = (aij ) is both the associated matrix of the substitution χ and the
transition matrix of the prefix-suffix graph. However, the Perron-Frobenius
Theorem 8.58 (part (c)) tells us that if A is a non-negative matrix with leading eigenvalue λ and w a non-negative vector, then $\lambda w \le Aw$ coordinatewise (that is, (4.19)) can only hold if w is a multiple of the leading eigenvector of A, and then we have equality. Therefore $\lambda\,\mathrm{Leb}(R(i)) = \sum_{i \xrightarrow{(p,i,s)} j}\mathrm{Leb}(R(j))$ for every $i \in A$, and $R(i) = \bigcup_{i \xrightarrow{(p,i,s)} j} h(R(j)) + \hat\pi(p)$ as claimed.
A priori, the limit need not exist, or can depend on the choice of letters
an ∈ An , but if ρ exists and is an infinite sequence, then we have the following
definition.
The word S-adic was first used by Ferenczi [244] and the S in S-adic
stands for substitution. If the sequence (χn )n∈N itself is periodic, then the
S-adic shift reduces to a substitution; the reverse question of when S-adic
shifts are isomorphic to substitution shifts was addressed in [318].
The following simple set of conditions implies the existence of ρ: $A_n = A \ni 0$, $a_n \equiv 0$, and $\chi_n(0)$ starts with 0 for each n ∈ N. However, this
by itself doesn’t imply that (Xρ , σ) is minimal. We use a straightforward
generalization of Definition 4.14.
always form primitive S-adic shifts because every letter occurs in every image
of every composition of two substitutions. The problem is the word 20
which only occurs when straddling the concatenated images of two words
χ1 ◦ · · · ◦ χn (a), a ∈ A. As a result, two appearances of 20 in χn ◦ χ̃(w)
are always 3n+1 places apart. Hence, to achieve linear recurrence, we need a
bound on the distance between occurrences of two-letter words, but this is
8 Some, but not all, authors require S to be finite. We will not require finiteness, because in
the few results where this requirement matters, it can easily be assumed separately.
9 In [70] a weaker notion of primitive is used, namely that for every m, there is n such that
χm+1 ◦ · · · ◦ χn has a strictly positive associated matrix. This is strong enough to conclude
minimality, but not for linear recurrence.
≤ DK 2
min {|χN (c)| : c ∈ AN } · max {|χ1 ◦ · · · ◦ χN −1 (c)|}
c∈AN −1 c∈AN −1
Proof. For the proof, see [221, Proposition 1.1 of the addendum].
see (4.31). By themselves they are not primitive, and neither are their iterates $\chi_0^a$ and $\chi_1^a$, but
$\chi_1\circ\chi_0^a$ : 0 → $\chi_1(0) = 01$, 1 → $\chi_1(1^a0) = 1^a01$
and
$\chi_0\circ\chi_1^a$ : 0 → $\chi_0(0^a1) = 0^a10$, 1 → $\chi_0(1) = 10$
are primitive. The limit sequence $\rho = \lim_n \chi_0^{a_1}\circ\chi_1^{a_2}\circ\cdots\circ\chi_1^{a_n}(0)$ is linearly
recurrent if and only if (an )n∈N is a bounded sequence. Since all Sturmian
sequences can be found this way, where the corresponding frequency α has
continued fraction expansion α = [0; a1 , a2 , . . . ], Sturmian sequences are lin-
early recurrent if and only if α is of bounded type; see Durand [221, Propo-
sition 10 and Proposition 5.1 of the addendum]. Note that {χ0 , χ1 } is not a
collection of proper substitutions; a proper S-adic representation of Sturmian
sequences was given in [179].
Well before Durand’s work, it was shown by Mignosi [418] and [20, Theorem 10.6.1] that Sturmian sequences are k-power-free for some k ∈ N if and only if the corresponding frequency α has a continued fraction expansion of bounded type. Of course, one direction follows, because if α is of unbounded type, say $a_{r_k} > k$ for arbitrary k, then (4.34) shows the occurrence of a k-power $\chi_0^{a_1}\circ\chi_1^{a_2}\circ\cdots\circ\chi_0^{a_{r_k}}(b)$ for b = 0, 1 depending on whether $r_k$ is even or odd. Mignosi’s proof shows that there are no unexpected k-powers for $k > \sup_n a_n$.
are at least three equivalent defining properties, to which we will devote sep-
arate sections.
The name Sturmian was given by Morse & Hedlund [425], seemingly
because these sequences appear in connection with the work of the French
mathematician Jacques Sturm (1803–1855) on the number of zeroes that
sin(αx + β)π has in the interval [n, n + 1), but the sequences as such were
certainly not studied by Sturm. There are multiple other ways to obtain
Sturmian sequences. For instance, take a piece of paper with a square grid,
draw a line on it with slope α, and write a 0 whenever it crosses a horizontal
grid-line and a 1 whenever it crosses a vertical grid-line (see Figure 4.6, left).
Then we obtain a Sturmian sequence. Also, the trajectory of a billiard ball
moving frictionless on a rectangular billiard table can be coded symbolically
by writing a 0 for each collision with a long edge and a 1 for each collision
with a short edge (see Figure 4.6, right). If the motion is not periodic, then
the resulting sequence is Sturmian.
Equivalently, Sturmian sequences can be obtained as the difference sequence $b_{n+1} - b_n$ of a Beatty sequence $b_n = \lfloor\alpha n\rfloor$ for some irrational number α ∈ (0, 1). For irrational α > 1, we would obtain Sturmian sequences on a larger alphabet $\{0, 1, \dots, \lceil\alpha\rceil\}$, but we will not address these in this text.
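The Beatty construction is easy to experiment with. The sketch below takes $\alpha = (\sqrt5 - 1)/2$ as an illustrative choice, computes $\lfloor\alpha n\rfloor$ exactly with integer arithmetic, and checks on a finite prefix that the difference sequence has n + 1 factors of length n and is balanced. The helper names are hypothetical, and a finite prefix can of course only give evidence for these properties, not a proof.

```python
from math import isqrt

def beatty(n):
    """floor(n * alpha) for alpha = (sqrt(5) - 1) / 2, using exact integers.

    n * alpha = (sqrt(5 n^2) - n) / 2, and floor(z / 2) = floor(floor(z) / 2).
    """
    return (isqrt(5 * n * n) - n) // 2

def sturmian_prefix(length):
    """First `length` terms of the difference sequence b_{n+1} - b_n."""
    return [beatty(n + 1) - beatty(n) for n in range(length)]

def complexity(seq, n):
    """Number of distinct factors of length n occurring in seq."""
    return len({tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)})

def ones_range(seq, n):
    """Smallest and largest number of 1s over all windows of length n."""
    sums = {sum(seq[i:i + n]) for i in range(len(seq) - n + 1)}
    return min(sums), max(sums)

u = sturmian_prefix(5000)
```

For instance, `sturmian_prefix(8)` gives `[0, 1, 0, 1, 1, 0, 1, 0]`, and `complexity(u, n)` can be compared with n + 1 for small n.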
Remark 4.49. The additional sequences obtained by taking the closure can also be obtained by taking the half-open interval the other way around:
$u_n = 1$ if $R_\alpha^n(x) \in (0,\alpha]$, and $u_n = 0$ if $R_\alpha^n(x) \notin (0,\alpha]$.
Either way, the resulting two-sided subshift $(X_\alpha,\sigma)$ for $X_\alpha = \overline{\mathrm{orb}_\sigma(u)}$ is an extension of $(S^1, R_\alpha)$, where $i : S^1 \to X_\alpha$ is the inverse factor map $i = \psi^{-1}$. Therefore the points $x_n = R_\alpha^n(0)$, n ∈ Z, have fibers $\psi^{-1}(x_n)$ consisting of two points, whereas $\#\psi^{-1}(x) = 1$ for all other x. Thus $(X_\alpha,\sigma)$ is an almost one-to-one extension of the circle rotation; see Section 2.3.1.
Exercise 4.53. Show that every bi-special word of a rotational sequence (so
Sturmian sequence by Lemma 4.63) is a palindrome.
For the proof we refer to [414, Chapter I, Theorem 2.1], but let us give some details on how non-minimal circle homeomorphisms f with irrational rotation numbers can be constructed. Start with the rotation $R_\rho : S^1 \to S^1$ and select some $x_1 \in S^1$ (or any finite or countable set of points $x_j \in S^1$ having disjoint orbits under $R_\rho$). For each k and n ∈ Z, replace $R_\rho^n(x_k)$ by a closed interval $I_{k,n}$ of length $2^{-(k+|n|)}$; this creates a new circle K with circumference $1 + \sum_k\sum_{n\in\mathbb{Z}} 2^{-(k+|n|)} = 1 + 3\sum_k 2^{-k}$. Define $f : I_{k,n} \to I_{k,n+1}$ as an affine (or any orientation-preserving) homeomorphism, and for all $x \in S^1 \setminus \bigcup_{k,n} R_\rho^n(x_k)$ set $f(x) = R_\rho(x)$. Then f : K → K is indeed a homeomorphism, and $h : K \to S^1$,
(4.24) $h(x) = R_\rho^n(x_k)$ if $x \in I_{k,n}$, and $h(x) = x$ otherwise,
is a semi-conjugacy; see Figure 4.8. Such circle homeomorphisms f are called Denjoy circle maps. There is some restriction on how smooth such homeomorphisms can be. Denjoy proved that if f is a $C^1$ diffeomorphism such that $\log f'$ has bounded variation¹⁰, then f is minimal. On the other hand, for every γ ∈ [0, 1), there are $C^{1+\gamma}$ Denjoy circle maps; see [309].
[Figure 4.8: the Denjoy circle map f : K → K with the inserted intervals $I_{k,n}$, and the semi-conjugacy h onto the rotation $R_\rho : S^1 \to S^1$.]
Take $R_\rho$, split open the orbit of 0, replacing the points $R_\rho^n(0)$ by intervals $I_n$, and denote the corresponding Denjoy circle map by f : K → K. Then $X := K \setminus \bigcup_n I_n^\circ$ is a minimal Cantor set. If we code $[\sup I_0, \inf I_1] \cap X$ by 1 and $[\sup I_1, \inf I_0] \cap X$ by 0, then the coding map $i : X \to \{0,1\}^{\mathbb{Z}}$ is precisely a conjugacy between (X, f) and a two-sided rotational shift $X_\rho$ with frequency ρ = ρ(f).
If we split open S1 only along the backward orbit of 0, then the map f
is not invertible at α, and we obtain a one-sided rotational shift.
Remark 4.55. In this construction, we have split open only a single orbit,
and this leads to a rotational shift. It is of course possible to split open the
circle at several orbits. This still leads to an almost one-to-one extension of
the circle map, but no longer to a rotational shift of Definition 4.48. The
following result on amorphic complexity holds for these more general Denjoy
examples, and the proof given works in this generality. An easier proof for
rotational shifts is given in [288].
Theorem 4.56. The amorphic complexity of any non-periodic two-sided ro-
tational subshift (Xρ , σ) is 1. Equivalently, ac(f ) = 1 for any Denjoy circle
map f : K → K.
Proof. Since the two-sided shift $\sigma : X_\rho \to X_\rho$ is conjugate to f : C → C for $C = K \setminus \bigcup_{k,n} I_{k,n}^\circ$, it suffices to show that $\mathrm{ac}(f|_C) = 1$.
Take three points $\xi_1, \xi_2, \xi_3 \in \bigcup_{k,n} I_{k,n}$ such that $d(h(\xi_i), h(\xi_j)) \ge \frac14$ for $i \ne j$. Let $\delta := \min\{|I_{k,n}| : I_{k,n} \ni \xi_j \text{ for some } j\}$ be the minimal length of the intervals corresponding to the $\xi_i$’s.
10 For the definition of variation, see before Theorem 8.42
Since $h(\bigcup_{k,n} I_{k,n})$ is a countable set, we can take $N := \lfloor 1/v\rfloor$ points in C such that $S := \{h(x_i) : i = 1,\dots,N\}$ is an equidistant lattice in $S^1$ with minimal mutual distance 1/N. Set $J = [x_i, x_j]$ for some $i \ne j$, ordered in such a way that $|h(J)| < \frac12$. Whenever $R_\rho^n(h(J)) \ni \xi_1$, we have $|f^n(J)| \ge \delta$, but $S^1 \setminus R_\rho^n(h(J))$ has length ≥ 1/2, so it must contain $\xi_2$ and/or $\xi_3$. Therefore also $|K \setminus f^n(J)| \ge \delta$, and thus $d(f^n(x_i), f^n(x_j)) \ge \delta$.
Since
$$\lim_{n\to\infty}\frac1n\#\{0 \le k < n : R_\rho^k(h(J)) \ni \xi_1\} = \mathrm{Leb}(h(J)) \ge \frac1N \ge v,$$
we obtain $\limsup_n \frac1n\#\{0 \le k < n : d(f^k(x_i), f^k(x_j)) \ge \delta\} \ge v$, so S is (δ, v)-separated. We have $\#S \ge \frac1v - 1$ and therefore ac(f) ≥ 1.
Now for the other direction, we will use (δ, v)-spanning sets; see Re-
mark 2.56. For v ∈ (0, 1], define a function ψv : S1 → [0, |K|], where |K| is
the circumference of K, as
Note that $d(x,y) \le \mathrm{Leb}(h^{-1}([h(x),h(y)]))$ (because d(x, y) measures the shortest arc between x and y) and $\psi_v(x) \ge \mathrm{diam}(h^{-1}([x,x+v]))$ for all v sufficiently small and x outside the countable set $h(\bigcup_{k,n} I_{k,n})$. Therefore $\psi_v$ is an $L^1$-function. The Birkhoff Ergodic Theorem 6.13 implies that for Leb-a.e. $y \in S^1$,
(4.25) $\lim_{n\to\infty}\frac1n\#\{0 \le k < n : \psi_v(R_\rho^k(y)) \ge \delta|K|\} = \mathrm{Leb}(\{\psi_v \ge \delta|K|\})$.
$$\sum_{i=1}^{\tilde N}\mathrm{Leb}(h^{-1}([\xi_i,\xi_i+v])) = \sum_{i=1}^{\tilde N}\psi_v(\xi_i) \ge \tilde N\delta|K| \ge (1+\delta)|K|,$$
Now take y ∈ K arbitrary and i such that $y \in [y_i, y_{i+1 \bmod N})$. Then $h(y) \in [h(y_i), h(y_i)+v)$ and $d(f^k(y_i), f^k(y)) \le \psi_v(R_\rho^k(h(y_i)))$. Therefore
$$\limsup_{n\to\infty}\frac1n\#\{0 \le k < n : d(f^k(y_i), f^k(y)) \ge \delta\} \le \limsup_{n\to\infty}\frac1n\#\{0 \le k < n : \psi_v(R_\rho^k(h(y_i))) \ge \delta\} = m_v,$$
which means that S is (δ, mv )-spanning. Using the spanning set equivalent
of (2.8), we obtain
log 2v(1/δ + 1)
ac(f ) ≤ sup lim sup = 1,
δ|K|>0 v→0 − log v
and the result follows.
k l k l
u = 0 · · · 0 · · · 1 · · · 0, u = 0 · · · 0 · · · 0 · · · 0,
· · 1 · · · 0 · · · 1,
v = 1 · v = 1 · · · 1 · · · 1 ·
· · 1 .
shorter u,v shorter u,v
This contradicts the minimality of K. The proof is complete, but note that
we have proved that |w| ≤ K − 2 as well.
Proof. Let i(x) denote the itinerary of x ∈ S1 w.r.t. {[0, α), [α, 1)}. If
ik (x) = ik (y) for 0 ≤ k < n, then Rαk (x) and Rαk (y) belong to the same
set [0, α) or [α, 1) for each 0 ≤ k < n. In other words, the interval [x, y)
contains no point in Qn := {Rα−k (α) : 0 ≤ k ≤ n}. But Qn consists of
exactly n + 1 points, and it divides the circle into n + 1 intervals. Each
such interval corresponds to a unique word of length n in the language, so
p(n) = n + 1.
Example 4.64. This lemma depends crucially on the partition of S1 into
intervals [0, α) and [α, 1). If we take the intervals [0, γ) and [γ, 1) for some γ
rationally independent of α ∈ [0, 1] \ Q instead, then p(n) = 2n for all n ≥ 1.
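The contrast between the two partitions can be observed numerically. The sketch below codes the rotation by $\alpha = (\sqrt5-1)/2$ once with respect to [0, α), [α, 1) and once with respect to [0, γ), [γ, 1) for $\gamma = \sqrt2 - 1$, and counts the factors occurring in a finite piece of an orbit. The parameters, the starting point 0.1, the orbit length, and the function names are assumptions of this sketch; floating-point arithmetic is used, so the counts are only reliable for small n.

```python
import math

def rotation_coding(alpha, gamma, x0, length):
    """Itinerary of x0 under x -> x + alpha mod 1 w.r.t. [0, gamma), [gamma, 1)."""
    return [0 if (x0 + k * alpha) % 1.0 < gamma else 1 for k in range(length)]

def complexity(seq, n):
    """Number of distinct factors of length n occurring in seq."""
    return len({tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)})

alpha = (math.sqrt(5) - 1) / 2
gamma = math.sqrt(2) - 1

sturmian = rotation_coding(alpha, alpha, 0.1, 20000)   # partition point alpha
generic = rotation_coding(alpha, gamma, 0.1, 20000)    # partition point gamma

sturmian_counts = [complexity(sturmian, n) for n in range(1, 9)]
generic_counts = [complexity(generic, n) for n in range(1, 4)]
```

With these parameters the first list is 2, 3, ..., 9 (that is, n + 1), while the second starts 2, 4, 6, in line with p(n) = 2n.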
Exercise 4.65. Given N ∈ N, find a subshift with complexity p(n) = 2n
for n ≤ N and p(n) = N + n for n ≥ N .
Theorem 4.66. A non-periodic sequence is Sturmian if and only if it is
balanced.
Proof. Define
(4.28) Mn = min{|uk+1 · · · uk+n |1 : k ≥ 0}.
Since u is balanced, max{|uk+1 · · · uk+n |1 : k ≥ 0} = Mn + 1, so
|uk+1 · · · uk+n |1 = Mn or Mn + 1 for every k ∈ N.
For q, n ∈ N such that $n > q^2$, we can write n = kq + r for a unique k ≥ q and 0 ≤ r < q. We have
(4.29) $kM_q \le M_{kq+r} = M_n \le k(M_q+1) + r$.
Dividing by n gives
$$\frac{M_q - 1}{q} \le \frac{M_n}{n} \le \frac{M_q}{q} + \frac{2}{q}.$$
As this holds for all $q \le q^2 < n$, we conclude that $\{M_n/n\}_{n\in\mathbb{N}}$ is a Cauchy sequence, say with limit α.
Now to prove that α is irrational, assume by contradiction that $\alpha = \frac{p}{q}$ and take $k = 2^m$, r = 0, and $n = 2^mq$ in (4.29) for increasing m ∈ N. This gives
$$\frac{M_q}{q} \le \frac{M_{2q}}{2q} \le \frac{M_{4q}}{4q} \le \cdots \le \frac{M_{2^mq}+1}{2^mq} \le \frac{M_{2q}+1}{2q} \le \frac{M_q+1}{q},$$
so $\{M_{2^mq}/(2^mq)\}_m$ is increasing and $\{(M_{2^mq}+1)/(2^mq)\}_m$ is decreasing in m. They converge to $\frac{p}{q}$, so $p = M_q$ or $M_q + 1$.
Lemma 4.68. If u and $u' \in \{0,1\}^{\mathbb{N}\text{ or }\mathbb{Z}}$ are balanced words with the same frequency α, then u and u′ generate the same language.
Proof. From the proof of Proposition 4.67 we know that $\alpha \in (\frac{M_n}{n}, \frac{M_n+1}{n})$ and $\alpha \in (\frac{M'_n}{n}, \frac{M'_n+1}{n})$, where $M_n$ and $M'_n$ are given by (4.28) for u and u′, respectively. This implies that $M_n = M'_n$ for all n ∈ N. For each n ∈ N, u and u′ each have only one right-special word in $\mathcal{L}_n(X)$; we first show that these right-special words, say w and w′, are the same. Assume by contradiction that there is some minimal n such that $w \ne w'$. Hence there is an (n − 1)-word v such that w = 0v and w′ = 1v (or vice versa). But v is right-special, so all four of 0v0, 0v1, 1v0, and 1v1 occur in the combined languages. But then $M_{n+1} = |v|_1 \le M'_{n+1} - 1$, a contradiction. By uniform recurrence (of minimal subshifts), every word of length n appears in any sufficiently long word, specifically in every sufficiently long right-special word. But as these right-special words of u and u′ are the same, u and u′ have the same subwords altogether.
We finish this section by proving the last implication for the three equiva-
lent characterizations of Sturmian sequences, due to Morse & Hedlund [425].
Figure 4.9. The Rauzy graph Γ6 based on the Fibonacci Sturmian sequence.
Figure 4.10. The two types of Rauzy graphs for a Sturmian sequence.
Cantor systems, was introduced by Gambaudo & Martens [273] and studied
by several authors, especially Shimomura; see e.g. [500–502].
Theorem 4.70. For each k ∈ N, there are at most three values that the frequency
$$\lim_{n\to\infty}\frac1n\#\{1 \le j \le n : x_{j+1}\cdots x_{j+k} = w\}$$
can take for a k-word w in a Sturmian sequence x. These three values depend only on k and the rotation angle α.
Remark 4.71. This is the symbolic version of what is known as the Three Gap Theorem, which was conjectured by Hugo Steinhaus and eventually proven by Vera Sós [520, 521]:
For every α ∈ R \ Q and n ∈ N, the set $\{j\alpha \bmod 1\}_{j=0}^{n-1}$ divides the circle into intervals of at most three different sizes.
Indeed, since Lebesgue measure is the only probability measure that is preserved by the rotation $R_\alpha : x \mapsto x + \alpha \bmod 1$, the frequencies in Theorem 4.70 correspond to the Lebesgue measures (i.e. lengths) of the intervals.
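The Three Gap Theorem is easy to test numerically. The sketch below takes $\alpha = \sqrt2$ as an illustrative choice, places the points jα mod 1 for 0 ≤ j < n on the circle, and counts the distinct gap lengths. Floating-point values are used, so gap lengths are regarded as equal when they differ by less than a tolerance; both the tolerance and the function names are assumptions of this sketch.

```python
import math

def gap_lengths(alpha, n):
    """Lengths of the arcs into which {j*alpha mod 1 : 0 <= j < n} cuts the circle."""
    points = sorted((j * alpha) % 1.0 for j in range(n))
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(1.0 - points[-1] + points[0])  # arc wrapping around through 0
    return gaps

def count_distinct(values, tol=1e-9):
    """Number of clusters of values, where values closer than tol are identified."""
    ordered = sorted(values)
    return 1 + sum(1 for a, b in zip(ordered, ordered[1:]) if b - a > tol)

alpha = math.sqrt(2)
distinct_gap_counts = [count_distinct(gap_lengths(alpha, n)) for n in range(1, 300)]
```

In this range every entry of `distinct_gap_counts` is at most 3.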
Proof. This is a special case of the more general statement that the fre-
quency can take at most 3(p(n + 1) − p(n)) values, which we will prove here.
For Sturmian sequences 3(p(n + 1) − p(n)) = 3.
Let n ∈ N be arbitrary and let Γn be the word-graph (Rauzy-graph)
of the language. For every vertex a ∈ Γn let a− and a+ be the number of
incoming and outgoing arrows. That is, a is left-special, resp. right-special,
if a− ≥ 2, resp. a+ ≥ 2.
Let V1 = {a ∈ Γn : a+ ≥ 2} be the collection of right-special words of
length n. Then
$$\#V_1 = \sum_{a_+\ge2} 1 \le \sum_{a_+\ge2}(a_+ - 1) \le p(n+1) - p(n).$$
According to this proof applied to Figure 4.10, the upper and lower arrow
indicate two maximal paths, both with an extra vertex in V2 , with distinct
frequencies. The middle path (including the left-special and right-special
word) has the sum of these frequencies. In the right panel, the bi-special
word is the unique vertex that occurs with the sum of the frequencies.
Proof. Assume that s is of Type 0 (the proof for the other type goes likewise). Note that since 11 doesn’t appear in s, it has a unique 1-cutting into words 0 and 10. The choice of whether in this 1-cutting $s_1$ is a block by itself or the second letter of a block is determined by whether the symbol $a_0$ that can be put in front of s is 0 or 1. With this caveat, the choice of t is unique.
Now for the second statement, suppose by contradiction that s is not Sturmian (and hence not balanced; see Theorem 4.66); then by Lemma 4.59, there is a word w such that both 0w0 and 1w1 appear in s. Since s doesn’t contain 11, w = 0v0 for some v, and $v0 = \chi_0(v')$. Now 10v01 is the prefix of $\chi_0(1v'1)$ and 00v00 is the prefix of $\chi_0(10v'0)$ or $\chi_0(00v'0)$. This means that both 0v′0 and 1v′1 are factors of t, so t is not balanced.
For the converse, suppose by contradiction that t is not Sturmian (hence not balanced) and that w is such that 0w0 and 1w1 both appear in t. Then a0w0 appears as a factor too for some a ∈ {0, 1}, unless 0w0 only appears in t as initial word. Let $w' = \chi_0(w)$. Now $\chi_0(1w1) = 10w'10$ and $\chi_0(a0w0) = \chi_0(a)0w'0$. Because $\chi_0(a)$ ends in 0, both 10w′1 and 00w′0 appear in s, so s is not Sturmian.
Finally, if 0w0 is the only prefix of t and doesn’t appear elsewhere in t, then a = 0 and $0\chi_0(0w0) = 00w'0$ appears in s. As above, also 10w′1 appears in s, so s is not Sturmian.
Inverting this relation gives α = g(α). As we already know from Lemma 4.72
that a Sturmian s ∈ Xα can always be written as χi (t) or σ ◦ χi (t) for some
Sturmian sequence t, we have now also determined that t ∈ Xg(α) .
Definition 4.74. If we iterate this procedure, then we get a sequence of
types (τj )j≥1 which is called the additive coding of the Sturmian shift Xα .
We can abbreviate this sequence as
$\tau_1\tau_2\tau_3\cdots = 0^{a_1}1^{a_2}0^{a_3}1^{a_4}\cdots$,
see Figure 4.11 (right). If x has $\tilde R_1$-itinerary t, then its $R_\alpha$-itinerary is $\chi_1(t)$. If $x \in S^1 \setminus J_1$, then $i_{-1}(x)i_0(x) = 01$ and $R_\alpha^{-1}(x) \in J_1$. In this case, if t is the $\tilde R_1$-itinerary of $R_\alpha^{-1}(x)$, then the $R_\alpha$-itinerary of x is $i(x) = \sigma\circ\chi_1(t)$.
In conclusion, this renormalization operation turns Rα into Rg(α) . To
obtain the itinerary of any other point x ∈ S1 , we need to apply shifts as in
the second line of (4.32) every time the renormalization image of x doesn’t
belong to J0 or J1 .
Figure 4.11. First returns of circle rotations for α < 1/2 (Type 0) and α > 1/2 (Type 1).
and
$$\rho = \lim_j \chi_1\circ\sigma\circ\chi_0\circ\underbrace{\chi_1\circ\chi_0\circ\cdots\circ\chi_1\circ\chi_0}_{j\text{ pairs}}(1).$$
Viana [548] is strongly recommended), but we say some more about unique
ergodicity and counterexamples to unique ergodicity in Section 6.3.5.
$$T(x) = x - \sum_{j<i}\lambda_j + \sum_{\pi(j)<\pi(i)}\lambda_j \quad\text{if } x \in \Delta_i = [\gamma_{i-1},\gamma_i).$$
Equivalently, writing $\gamma_0 = 0$ and $\gamma_i = \sum_{j\le i}\lambda_j$, we have
Instead of [0, 1), it may be more convenient to define IETs on the circle $S^1$. Every IET on two intervals then becomes a rotation, so IETs are a generalization of circle rotations.
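The defining formula of an IET translates directly into code. In the sketch below, `lengths` holds $\lambda_1,\dots,\lambda_d$ and `perm[i]` is the position of the i-th interval after the exchange (0-based indices). The use of exact rationals and the function name `make_iet` are choices of this sketch, made so that the comparison with a rotation is exact.

```python
from fractions import Fraction

def make_iet(lengths, perm):
    """Return the map T of the IET with interval lengths `lengths`.

    perm[i] is the (0-based) position of interval i after the exchange.
    """
    d = len(lengths)
    gamma = [sum(lengths[:i]) for i in range(d + 1)]  # gamma[0] = 0

    def T(x):
        # Find the interval Delta_i = [gamma_i, gamma_{i+1}) containing x.
        i = next(i for i in range(d) if gamma[i] <= x < gamma[i + 1])
        before = sum(lengths[j] for j in range(i))                      # sum over j < i
        after = sum(lengths[j] for j in range(d) if perm[j] < perm[i])  # pi(j) < pi(i)
        return x - before + after

    return T

# Two intervals, swapped: this is the rotation by lambda_2 = 7/10.
rotation = make_iet([Fraction(3, 10), Fraction(7, 10)], [1, 0])

# Three intervals in reversed order.
reversal = make_iet([Fraction(2, 10), Fraction(3, 10), Fraction(5, 10)], [2, 1, 0])
```

For example, `rotation(Fraction(1, 2))` returns `Fraction(1, 5)`, which is 1/2 + 7/10 mod 1.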
Example 4.79. To illustrate that Poincaré maps for polygonal billiard flows can be IETs, we present an example of a torus $T^2 = \mathbb{R}^2/\mathbb{Z}^2$ with a single horizontal wall $[\frac14,\frac34) \times \{\frac12\}$ in it; see Figure 4.12 (left). A particle moves with constant speed and angle θ w.r.t. the positive horizontal axis. When it hits the wall, it reflects elastically: the angle of incidence equals the angle of reflection. Thus the outgoing angle is −θ until the next hit at the other side of the wall, when the angle returns to θ.
We take 2 < tan θ < 4. The upper side of the wall is parametrized by $x \in [0,\frac12)$ and the lower side of the wall is parametrized by $x \in [\frac12,1)$ (left to right). Then the map T : [0, 1) → [0, 1) assigning to x the coordinate of the
Figure 4.12. Billiards on the torus against a wall and its IET.
Using the alphabet A = {1, . . . , d} and the partition $\{\Delta_i\}$, define the corresponding symbolic shift space $X_T = \overline{\{i(x) : x \in [0,1)\}} \subset A^{\mathbb{Z}}$. It is easily seen that the closure can be taken care of as follows:
$$X_T = \{i(x) : x \in [0,1)\} \cup \{\lim_{y\nearrow x} i(y) : x \in (0,1]\}.$$
for some 0 ≤ k < n, because otherwise the set J can be enlarged with points
with the same first n symbols in its itinerary.
Assume by induction that $\#\mathcal{J}_n \le (d-1)n + 1$. Then $J \in \mathcal{J}_n$ but $J \notin \mathcal{J}_{n+1}$ if and only if $T^{-n}(\gamma_i) \in J^\circ$ for some i = 1, . . . , d − 1. Since T is invertible,
Note that irreducibility follows from minimality. The Keane condition is,
however, not implied by minimality. Without the Keane condition, minimal-
ity can fail, but as shown in [489], an interval exchange transformation (and
in fact interval translation maps12 ) with d pieces can have at most d/2
distinct minimal subsets.
Proof. Our first claim is that T cannot have periodic points. Clearly $T(0) \ne 0$ and $\lim_{y\nearrow1}T(y) \ne 1$ because otherwise the permutation would be reducible. Now assume by contradiction that there is x ∈ (0, 1) with $x = T^k(x)$. Let $y = \max\{T^n(\gamma_i) : 0 \le n < k, 1 \le i < d\}$. Then $T^k$ restricted to [y, x] is continuous and therefore the identity. So $T^k(y) = y$, contradicting the Keane condition.
Now assume that there is a point z whose full orbit orb(z) is not dense in [0, 1). Then there is an interval J = [a, b) disjoint from orb(z). For each $\gamma_i$, 1 ≤ i < d, there is at most one point $\gamma'_i \in J$ such that $T^{k_i}(\gamma'_i) = \gamma_i$. Similarly a and b have such first preimages a′, b′ ∈ J. Thus there is a partition of J into half-open intervals with points in $\{\gamma'_i\} \cup \{a, b, a', b'\}$ as boundary points. Any such interval J′ is mapped continuously by $T^n$ until $T^n(J') \subset J$ for some finite minimal n = n(J′). Then the union $K = \bigcup_{J'}\bigcup_{j=0}^{n(J')-1} T^j(J')$ is a T-invariant set consisting of finitely many intervals, all disjoint from orb(z).
For any x ∈ ∂K \ {0, 1}, either T(x) ∈ ∂K or $x = \gamma_i$ for some 1 ≤ i < d. Since ∂K is a finite set, x must be periodic in forward time (but that contradicts our first claim) or eventually map to some $\gamma_i$. In backward time, x must also be periodic (contradicting the first claim) or eventually map to a point $\gamma_j$. But then $T^m(\gamma_j) = \gamma_i$ for some m ≥ 1, contradicting the Keane condition. Therefore no such z can exist. In other words, every orbit is dense.
[Figure: the exchange A B C D E ↦ E A D C B, with the intervals e and d marked, for Type 0 and Type 1.]
Type 0:
$\lambda' = \frac{1}{1-\lambda_e}(\lambda_1,\dots,\lambda_{d-1},\lambda_d-\lambda_e)$,
$\pi'(j) = \pi(j)$ if $\pi(j) \le \pi(d)$, $\pi'(e) = \pi(d)+1$, $\pi'(j) = \pi(j)+1$ if $\pi(j) > \pi(d)$,
χ : j → j if $j \ne e$, e → ed.
Type 1:
$\lambda' = \frac{1}{1-\lambda_d}(\lambda_1,\dots,\lambda_e-\lambda_d,\lambda_d,\dots,\lambda_{d-1})$,
$\pi'(j) = \pi(j)$ if $j \le e$, $\pi'(e+1) = \pi(d)$, $\pi'(j) = \pi(j-1)$ if $j > e+1$,
χ : j → j if $j \le e$, e + 1 → ed, j → j − 1 if $j > e+1$.
$\chi_n(1)$ starts with 1 for every n and there is a fixed point of the corresponding S-adic transformation:
$$\rho_T := \lim_{n\to\infty}\chi_1\circ\chi_2\circ\cdots\circ\chi_n(1).$$
Since the iterates of the Rauzy induction represent first return maps to
shorter and shorter one-sided neighborhoods of 0 (assuming the Keane con-
dition holds), every letter will eventually play the role of d and e, and there-
fore this S-adic substitution is primitive. This gives another proof that
irreducible IETs satisfying the Keane condition are minimal.
Since χ1 ◦ χ2 ◦ · · · ◦ χn (1) represents the successive intervals Δi that
0 visits before its first return time associated to the n-th Rauzy induction
step, $\rho_T = i(0)$. Since x = 0 has a dense orbit, the one-sided subshift is $X_T = \overline{\mathrm{orb}_\sigma(\rho_T)}$.
Lemma 4.88. A Toeplitz shift (Xq , σ) is uniformly rigid and hence minimal.
Proof. We give the proof for one-sided Toeplitz sequences; the proof for two-sided sequences goes likewise. Let $[x_1x_2\cdots x_n]$ be any cylinder set. Then every digit $x_i$ reappears with gap $q_i$. Hence, if $L_n = \mathrm{lcm}(q_1,\dots,q_n)$ is the least common multiple of $q_1,\dots,q_n$, then $\sigma^{kL_n}([x_1x_2\cdots x_n]) \subset [x_1x_2\cdots x_n]$ for all k ∈ N. This is uniform rigidity. The minimality of the corresponding subshift follows from Lemma 2.24 and Corollary 2.20.
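The periodicity used in this proof can be seen on a concrete example. The sketch below builds a prefix of a Toeplitz sequence by hole filling: at step j every other remaining hole is filled with the symbol j mod 2, so that the positions filled at step j recur with gap $2^{j+1}$. This particular construction, the 0-based indexing, and the function names are illustrative assumptions and are not taken from the text.

```python
from functools import reduce
from math import gcd

def toeplitz_prefix(length, steps):
    """Fill holes periodically; return the prefix and the gap of each position.

    Entries that are still holes after `steps` rounds are left as None.
    """
    x = [None] * length
    gap = [None] * length
    for j in range(steps):
        holes = [i for i in range(length) if x[i] is None]
        for i in holes[::2]:          # every other remaining hole
            x[i] = j % 2
            gap[i] = 2 ** (j + 1)     # this position recurs with gap 2^(j+1)
    return x, gap

def lcm_of(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)

x, gap = toeplitz_prefix(64, 6)
L8 = lcm_of(gap[:8])  # plays the role of L_n for the first 8 digits
```

Here the first eight digits are 0, 1, 0, 0, 0, 1, 0, 1 and `L8` equals 16, so this initial block reappears at every multiple of 16 within the prefix.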
Proof. Let $N_i = |\chi_i(a)|$ be the length of the words from the i-th substitution. By the condition that $x_1 = \chi_1(a)_1$ for all $a \in A_1$, we find $x_{1+kN_1} = x_1$ for all k ∈ N. By composing $\chi_1\circ\chi_2$, we obtain $x_1\cdots x_{N_1} = x_{1+kN_1N_2}\cdots x_{N_1+kN_1N_2}$ for all k ∈ N. In general, the initial block $x_1\cdots x_{N_1N_2\cdots N_r}$ repeats with period $N_1N_2\cdots N_rN_{r+1}$, so x is Toeplitz.
Conversely, if $x = x_1x_2x_3\cdots$ is Toeplitz on alphabet $A_0$, then there is $N_1$ such that $x_{1+kN_1} = x_1$ for all k ∈ N, and there is a finite collection of $N_1$-words $b_k$, $k = 1,\dots,K_1$, all starting with $x_1$ such that $x = b_{k_1}b_{k_2}b_{k_3}\cdots$. Consider $\{b_k\}_{k=1}^{K_1}$ as the letters of alphabet $A_1$, and define the substitution word $\chi_1(b_k)$ (as letter) $= b_k$ (as $N_1$-word). Then $x = \chi_1(b_{k_1}b_{k_2}b_{k_3}\cdots)$. Since the $N_1$-words $b_{k_i}$ appear with their own gap, $b_{k_1}b_{k_2}b_{k_3}\cdots \in A_1^{\mathbb{N}}$ is a Toeplitz sequence in its own right, and we can repeat the construction.
With some more work, and for the two-letter alphabet, we could improve the upper bound to $\mathrm{ac}(\sigma) \le \limsup_j \frac{\log q_j}{-\log Sk^*(q_j)}$. By stipulating further properties on the Toeplitz sequence, one can (see [264, Section 5]) give examples showing that this upper bound is sharp and also that for a dense set of values a ∈ [1, ∞) (including a = 1), there is a regular Toeplitz shift with ac(σ) = a.
Proof of Theorem 4.93. Note that the densities $Sk^*(q_j)$ are decreasing in j, and by regularity of the Toeplitz shift, $\lim_j Sk^*(q_j) = 0$. Choose δ > 0 arbitrary and m ∈ N such that $2^{-m} < \delta$. Next choose v arbitrary and j ∈ N such that $(2m+1)Sk^*(q_{j+1}) < v \le (2m+1)Sk^*(q_j)$. Then
$$\mathrm{Sep}(\delta, v) \le \mathrm{Sep}(2^{-m}, (2m+1)Sk^*(q_j)).$$
We claim that the right-hand side is bounded by $q_{j+1}$. Indeed, assume by contradiction that there is a $(2^{-m}, (2m+1)Sk^*(q_j))$-separated set S with more than $q_{j+1}$ elements. Then at least two of them, say x, y ∈ S, share the same $q_{j+1}$-skeleton. This means that x and y differ at most in $q_{j+1}Sk^*(q_{j+1})$ positions in every $q_{j+1}$-block. Since $d(\sigma^k(x), \sigma^k(y)) \ge \delta$ only if $x_i \ne y_i$ for some i with |i − k| ≤ m,
$$\frac{1}{nq_{j+1}}\#\{0 \le k < nq_{j+1} : d(\sigma^k(x),\sigma^k(y)) \ge \delta\} \le \frac{2m+1}{nq_{j+1}}\#\{0 \le k < nq_{j+1} : x_k \ne y_k\} \le (2m+1)Sk^*(q_{j+1}).$$
When taking the limit n → ∞, we get a contradiction with the choice of j. This proves the claim.
Therefore $\mathrm{Sep}(\delta, v) \le q_{j+1}$. Take logarithms and divide left- and right-hand sides by $-\log v \ge -\log((2m+1)Sk^*(q_j))$, respectively. This gives
$$\frac{\log\mathrm{Sep}(\delta,v)}{-\log v} \le \frac{\log q_{j+1}}{-\log(2m+1) - \log Sk^*(q_j)}.$$
Note that m depends only on δ. Thus taking the superior limit v → 0 (and hence j → ∞), we obtain $\mathrm{ac}(\sigma) \le \limsup_j \frac{\log q_{j+1}}{-\log Sk^*(q_j)}$ as claimed.
Proof. We start with the positive entropy Toeplitz sequence, following [381, Theorem 4.77], which in turn follows [560]. Let A be an alphabet such that $\log\#A \ge 2K$ and take a sequence $(k_i)_{i\in\mathbb{N}}$ such that $\prod_{i=1}^\infty(1 - \frac{1}{k_i}) = \frac{2K}{\log\#A} \in (0,1)$. Start with an $L_0$-word V(0) containing $r_0 = L_0/2$ symbols ∗. We construct the i-th skeleton $V(i)^\infty$ with $|V(i)| = L_i$ recursively. Given V(i), let W(i) be the concatenation of the $(\#A)^{r_i}$ copies of V(i) where the $r_i$ symbols ∗ are replaced by the $(\#A)^{r_i}$ $r_i$-words in A. Then set
$$V(i+1) := W(i)\,V(i)^{(\#A)^{r_i}(k_i-1)},$$
It follows that $\lim_i \frac{r_i}{L_i} = \frac{r_0}{L_0}\prod_{i=1}^\infty(1 - \frac{1}{k_i}) = \frac{r_0}{L_0}\cdot\frac{2K}{\log\#A} > 0$ (so regularity
topological entropy is $\lim_n \frac{\log p(n)}{n} = K$ by Fekete’s Lemma 1.15.
We will not give the examples with logarithmic complexity $\lim_n \frac{\log p(n)}{\log n} = K \ge 1$, but the technique is the same.
4.5.2. Adding Machines. Just like the more general enumeration system
in Section 5.3, adding machines are a class of symbolic systems that are
not subshifts. They are also called odometers13 , after the device in a car to
measure distance. Such an odometer consists of a number of disks, with the
digits 0, . . . , 9 written on the edge. A single “tick” moves the rightmost disk
by one unit, and if the 9 is passed (so the disk is back at position 0), it ticks
over the second disk by one unit. A mathematical odometer has infinitely
many disks, and the number of digits may vary from disk to disk.
The most common one is the dyadic adding machine or dyadic
odometer a : Σ → Σ for Σ = {0, 1}N . For x ∈ Σ, let k = inf{i : xi = 0}.
Then a is defined as
(4.37) $$a(x)_i = \begin{cases} 0, & i < k,\\ 1, & i = k,\\ x_i, & i > k.\end{cases}$$
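Formula (4.37) is addition of 1 with carry to the right. A minimal sketch on finite truncations follows; the truncation to finitely many digits, the convention that the all-ones word is sent to the all-zeros word, and the function names are assumptions of this sketch.

```python
def add_one(x):
    """Apply the dyadic adding machine to a finite 0/1 list (first digit first)."""
    y = list(x)
    for i, digit in enumerate(y):
        if digit == 0:
            y[i] = 1      # position k = first zero: set it to 1 and stop
            return y
        y[i] = 0          # positions i < k: the carry turns each 1 into 0
    return y              # all digits were 1: the carry leaves the truncation

def orbit(x, steps):
    """Return the list x, a(x), ..., a^(steps-1)(x)."""
    result = []
    for _ in range(steps):
        result.append(x)
        x = add_one(x)
    return result
```

For example, `add_one([1, 1, 0, 1])` returns `[0, 0, 1, 1]`, and the orbit of `[0, 0, 0]` runs through all eight words of length 3 before returning to its starting point.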
13 After the ancient Greek words ὁδός and μέτρον for road and measure.
$$\tilde\Sigma_q = \{y = (y_j)_{j=1}^\infty : y_j \in \{0,\dots,q_j-1\},\ q_{j-1} \text{ divides } (y_j - y_{j-1}) \text{ for all } j \in \mathbb{N}\},$$
Remark 4.98. There is yet another, less common, way to write the adding
machine, provided all the pi ’s are pairwise coprime. Let Σ̂p = Σp and define
â : Σ̂p → Σ̂p as
â(y)i = yi + 1 mod pi for all i ≥ 1.
Then $(\Sigma_p, a)$ and $(\hat\Sigma_p, \hat a)$ are conjugate via $\psi : \Sigma_p \to \hat\Sigma_p$ defined as
$$\psi(x)_i = \sum_{j=1}^i x_jp_{j-1} \bmod p_i \quad\text{with } p_0 = 1.$$
The inverse of this map ψ can be computed using the Chinese Remainder Theorem, which states that, whenever $p_1,\dots,p_k$ are pairwise coprime integers greater than 1, $N = \prod_{i=1}^k p_i$, and integers $0 \le a_i < p_i$ are given, the congruence equations
(4.40) $x \bmod p_i = a_i$, 1 ≤ i ≤ k,
have a unique solution 0 ≤ x < N. A constructive solution can be found inductively. Since $\gcd(p_1, p_2) = 1$, Bézout’s identity (effectively the Euclidean algorithm) gives $n_1, n_2 \in \mathbb{Z}$ such that $n_1p_1 + n_2p_2 = 1$. Then $x = a_{1,2} := a_1n_2p_2 + a_2n_1p_1 \bmod p_1p_2$ solves the first two congruence equations. Now replace these first two congruence equations by $x \equiv a_{1,2} \bmod p_1p_2$ and continue by induction. This inductive procedure also shows that if we increase k to k + 1, the new solution is in the same congruence class mod N as the previous. Hence $\psi^{-1}(y)$ can be computed term by term.
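The inductive procedure can be sketched in a few lines of Python. The extended Euclidean algorithm provides the Bézout coefficients $n_1, n_2$, and each step merges one more congruence into the running solution. The function names `ext_gcd` and `crt` are hypothetical, and the moduli are assumed to be pairwise coprime as in the statement above.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, s, t = ext_gcd(b, a % b)
    return g, t, s - (a // b) * t

def crt(residues, moduli):
    """Solve x = residues[i] mod moduli[i] for pairwise coprime moduli."""
    x, n = residues[0] % moduli[0], moduli[0]
    for a, p in zip(residues[1:], moduli[1:]):
        g, n1, n2 = ext_gcd(n, p)          # n1*n + n2*p == 1
        if g != 1:
            raise ValueError("moduli must be pairwise coprime")
        # Same formula as in the text, with (x, n) in the role of (a_1, p_1).
        x = (x * n2 * p + a * n1 * n) % (n * p)
        n *= p
    return x
```

For instance, `crt([2, 3, 2], [3, 5, 7])` first finds 8 as the solution modulo 15 and then 23 as the solution modulo 105; note that 23 is congruent to 8 modulo 15, as the last remark of the paragraph predicts.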
Proposition 4.99. Every odometer is uniformly rigid and hence periodically
recurrent.
Proof. Let $\varepsilon > 0$ be arbitrary and take $k$ such that $2^{-k} < \varepsilon$. Let $q_k =
p_1 p_2 \cdots p_k$. Then $a^{q_k}(x)_i = x_i$ for all $i \le k$; i.e. $d(a^{q_k}(x), x) < \varepsilon$ as required.
Periodic recurrence follows by Lemma 2.24.
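The key step of this proof, that $a^{q_k}$ leaves the first $k$ digits untouched, can be checked on finite prefixes. The sketch below assumes a hypothetical base sequence $p = (2, 3, 5, 7)$ and a hypothetical starting prefix; `odometer_step` and `iterate` are names introduced here.

```python
from math import prod


def odometer_step(x, p):
    """Add one with carry to the digit prefix x, where 0 <= x[i] < p[i]."""
    y = list(x)
    for i in range(len(y)):
        if y[i] + 1 < p[i]:
            y[i] += 1
            return y
        y[i] = 0          # digit rolls over, carry to the next position
    return y


def iterate(x, p, n):
    """Apply the odometer n times."""
    for _ in range(n):
        x = odometer_step(x, p)
    return x


p = [2, 3, 5, 7]          # hypothetical base sequence p_1, ..., p_4
x = [1, 2, 0, 4]          # hypothetical digit prefix
k = 2
q_k = prod(p[:k])         # q_2 = p_1 * p_2 = 6
y = iterate(x, p, q_k)
```

After $q_k$ steps the first $k$ digits have returned, and only the digits beyond position $k$ have changed.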
Proposition 4.100. Every odometer is strictly ergodic; i.e. it is minimal
and has a unique invariant probability measure; see Section 6.3.
Theorem 4.103. Let (Xq , σ) be a Toeplitz shift with periodic structure q and
assume that p = (pi )i≥1 with p1 = q1 , pi = qi /qi−1 is an integer sequence.
Then (Σp , a) is the maximal equicontinuous factor of (Xq , σ), and (Xq , σ)
is a non-trivial almost one-to-one extension of (Σp , a).
Proof. Let Xq be the orbit closure of the Toeplitz sequence x with periodic
structure q. Let Sk(qj ) be the j-th skeleton of x, so it is a qj -periodic
sequence in (A ∪ {∗})∞ . For y ∈ Xq , define
$\pi_j(y) = r \in \{0, \dots, p_j - 1\}$ if $y_i = \mathrm{Sk}(q_j)_{i+r}$ whenever $\mathrm{Sk}(q_j)_{i+r} \ne *$.
Therefore $\pi_j(\sigma^n y) = \pi_j(y) + n \bmod q_j$, so $\pi_j$ is surjective, and $\pi_j^{-1}(r)$, $r =
0, \dots, q_j - 1$, are $q_j$ disjoint clopen sets in $X_q$. For $y \in X_q$, it may not be
clear from the first qj entries what πj (y) is. However, for every j, there is
mj such that the first mj entries determine the value of πj (y). Therefore πj
is continuous.
Note that π(y)j − π(y)j−1 is always a multiple of qj−1 . Thus we can
define π : Xq → Σ̃q by
π(y)j = πj (y).
Then $\pi^{-1}(z) = \bigcap_j \pi_j^{-1}(z_j)$, as the intersection of nested non-empty closed
sets, is itself non-empty. Thus $\pi$ is surjective, continuous, and $\pi \circ \sigma =
\tilde a \circ \pi$, where $\tilde a$ is defined in Remark 4.97. Via $b$ we can recode $(\tilde\Sigma_q, \tilde a)$ to
the adding machine (Σp , a) as Remark 4.97 explains. This adding machine
is thus a factor of the Toeplitz shift and, as with all adding machines, it is
equicontinuous.
If we set $\tilde\pi = b \circ \pi$, we see further that $\tilde\pi(\sigma^n(x)) = a^n(00000\cdots) =: (n)$
for each $n \in \mathbb{N}_0$ and that also $\tilde\pi^{-1}((n)) = \{\sigma^n(x)\}$. Therefore $(X_q, \sigma)$ is
an almost one-to-one extension of $(\Sigma_p, a)$. However, there must be $z \in \Sigma_p$
such that $\#\tilde\pi^{-1}(z) \ge 2$, because otherwise $(\Sigma_p, a)$ would be conjugate to the
(expansive) subshift $(X_q, \sigma)$, contradicting Proposition 4.102.
It follows from Theorem 2.43 that (Σp , a) is the maximal equicontinuous
factor of (Xq , σ).
Proof. Recall from Proposition 2.31 that $(X, T)$ preserves the metric
\[
d_\infty(x, y) := \sup_{n \ge 0} d(T^n(x), T^n(y)).
\]
XB = {σ n (· · · 000101000 · · · ) : n ∈ Z} ∪ {0∞ }
The B-free sets date back to the first half of the 20th century; research
from that time includes the question of under which conditions the density
$d(F_B) = \lim_n \frac{1}{n} \#(F_B \cap \{1, 2, \dots, n\})$ exists; see [76, 155, 184, 186, 187].
Davenport & Erdős [186] showed that the logarithmic density $\delta(F_B)$ (see
Definition 8.55) always exists and is equal to the upper density $\bar d(F_B)$. Besicovitch
[75] gave the following sufficient condition for $d(F_B)$ to exist:
\[
B \text{ is pairwise coprime and thin; i.e. } \sum_{b \in B} \frac{1}{b} < \infty. \tag{4.41}
\]
Since $\bar d\big(\bigcup_{B \ni b > K} b\mathbb{Z}\big) \le \sum_{B \ni b > K} 1/b$, every
thin sequence $B$ has light tails,
which means that the upper densities $\bar d\big(\bigcup_{B \ni b > K} b\mathbb{Z}\big) \to 0$ as $K \to \infty$.
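As a numerical illustration of condition (4.41), take $B$ to be the set of squares of primes, which is pairwise coprime and thin; the $B$-free integers are then the square-free integers, whose density is known to be $6/\pi^2$. The following Python sketch (all function names and the cutoff $n = 100000$ are choices made here, not taken from the text) estimates the density from an initial segment.

```python
from math import pi


def b_free_indicator(n, B):
    """eta[m] = 1 iff m in {1, ..., n} has no divisor in B (eta[0] is unused)."""
    eta = [1] * (n + 1)
    eta[0] = 0
    for b in B:
        for m in range(b, n + 1, b):
            eta[m] = 0
    return eta


def primes_up_to(n):
    """Sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for m in range(i * i, n + 1, i):
                sieve[m] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


n = 100_000
B = [p * p for p in primes_up_to(int(n ** 0.5))]  # squares of primes up to n
eta = b_free_indicator(n, B)
density = sum(eta) / n                            # approximates d(F_B)
```

Only the elements of $B$ below $n$ matter for the first $n$ entries of $\eta$, which is why a finite truncation of $B$ suffices here.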
The set B might contain superfluous elements b0 in the sense that FB\{b0 }
and its related shifts have the same properties as FB and its related shifts.
A condition on B to avoid superfluous elements is the following:
Having light tails implies tautness, but not the other way around.
Proof. We sketch the proof from [231, Proposition K and Theorem 2.28].
It is easy to see that $h_{top}(X_B^{her}, \sigma) \ge d(F_B)$. Indeed, among the first $n$
entries, $\eta$ has at least $d(F_B)n$ ones, and therefore $p_{X_B^{her}}(n) \ge 2^{d(F_B)n}$. Since
16 Keller [353] derives this conclusion under the weaker assumption that B is taut and pairwise
coprime.
$h_{top}(X_B^{her}, \sigma) = \inf_n \frac{1}{n} \log p_{X_B^{her}}(n)$, the inequality $h_{top}(X_B^{her}, \sigma) \ge d(F_B)$ follows.
One step in [231, Proposition K] is therefore to show that the other inequality
$p_{X_B^{her}}(n) \le 2^{\bar d(F_B)n + \varepsilon}$ holds.
where μ denotes the Möbius function, roughly counting the parity of distinct
prime factors; see (4.44). In particular, $\mathrm{Per}(n) = a^n - a$ if $n$ is a prime. Derive
Fermat's Little Theorem, $a^{n-1} \equiv 1 \pmod{n}$ if $n$ is a prime not dividing $a$.
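A quick numerical check of this exercise is sketched below in Python. It assumes that $\mathrm{Per}(n)$ is given by the standard Möbius inversion formula $\mathrm{Per}(n) = \sum_{d \mid n} \mu(n/d)\, a^d$ for the number of points of least period $n$ in the full shift on $a$ symbols; this formula is not visible in the passage above, so it is an assumption, as are the function names.

```python
def mobius(n):
    """Möbius function: 0 if a square divides n, else (-1)^(number of prime factors)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def per(n, a):
    """Assumed formula: number of points of least period n, full shift on a symbols."""
    return sum(mobius(n // d) * a ** d for d in range(1, n + 1) if n % d == 0)


value = per(7, 3)                 # equals 3**7 - 3 because 7 is prime
```

Since points of least period $n$ come in orbits of exactly $n$ points, $n$ divides $\mathrm{Per}(n)$; for prime $n$ this gives $n \mid a^n - a$, and dividing by $a$ when $n \nmid a$ yields the congruence of the exercise.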
The connection between B-free shifts and Toeplitz shifts is that every
B-free shift has a unique minimal “core” that is a Toeplitz shift, although it
is usually a very simple one, namely ({0∞ }, σ). This is summarized in the
next result, which is [231, Theorem A].
Theorem 4.112. Every B-free shift (XB , σ) has a unique minimal subshift,
which is a Toeplitz shift, and every x ∈ XB is proximal to this subshift.
the property that every continuous $f : X \to \mathbb{R}$ is orthogonal to the Möbius function, which
means that averages $\frac{1}{n} \sum_{k=0}^{n-1} \mu(k) \cdot f \circ T^k(x)$ tend to zero for every $x \in X$. Many dynamical
systems satisfy this conjecture, e.g. circle rotations [185]. It is known that the converse is false:
There are continuous positive entropy systems such that every continuous function is orthogonal
to the Möbius function; see Downarowicz & Serafin [217]. A recent account of the progress on
this problem can be found in [245].
proximal (i.e. every pair (x, y) ∈ XB2 is proximal) if and only if its maximal
equicontinuous factor is trivial; see [231, Theorem 3.22].
are extensions with the shortest possible blocks of 1’s and longest possible
blocks of 0’s available. Then y := limn An is a two-sided sequence of which
each subword reappears periodically, so it is Toeplitz. Therefore orbσ (y) is
a minimal subset of (XB , σ). Because the blocks An appear with the same
periods in η, it is the only minimal subset.
The case that {0∞ } is the minimal subset is easy to determine from B:
Lemma 4.113. The set B contains an infinite set of pairwise coprime inte-
gers if and only if 0∞ ∈ XB . In this case {0∞ } is the unique minimal set in
XB and in XBher .
Proof. ⇒: Let $b_1, \dots, b_k \in B$ be pairwise coprime, and let $N = \prod_{i=1}^{k} b_i$.
By the Chinese Remainder Theorem, there is m ∈ {0, . . . , N − 1} such that
be the adding machine as described in Remark 4.98. Then (Σ̂B , â) is called
the canonical odometer of (XB , σ). We abbreviate 0 = 0∞ and 1 =
â(0) = 1∞ ∈ Σ̂B . Note that, contrary to Remark 4.98, we did not make the
assumption that B = {b1 , b2 , b3 , . . . } consists of pairwise coprime integers.
Therefore orbâ (1) need not be dense in Σ̂B ; we have xi ≡ xj mod gcd(bi , bj )
for all i, j ∈ N. In more detail, Σ̂B is an abelian group under addition, and
{. . . , −x, 0, x, 2x, 3x, . . . } is dense if and only if B is pairwise coprime. As
a consequence, (Σ̂B , â) is minimal and uniquely ergodic if and only if B is
pairwise coprime.
Any sequence $(n_k)_{k \ge 1}$ such that $\frac{1}{n_k} \#(F_B \cap \{0, \dots, n_k - 1\}) \to \bar d(F_B)$ suffices,
so if $d(F_B)$ exists, then $\eta$ is typical$^{20}$ for $\nu_\eta$.
This combined with (4.47) gives (4.46), and the proof follows.
[Figure: panels labeled $f_a^2$ and $f_a$, showing $J_1$, $f(J_1)$ and the points $1 - p$, $c$, $p$.]
twice renormalizable. Figure 4.14 shows how these intervals are permuted if
q1 = 2, q2 = 4.
Similarly, there are maps that are 3, 4, . . . times renormalizable, or even
infinitely renormalizable. In this case there is an infinite sequence of
nested intervals
· · · ⊂ J4 ⊂ J3 ⊂ J2 ⊂ J1 ⊂ [0, 1]
and a sequence of periods $(q_k)_{k \in \mathbb{N}}$ (where $q_k$ divides $q_{k+1}$), such that
\[
f^{q_k}(J_k) \subset J_k \quad \text{and} \quad J_k, f(J_k), \dots, f^{q_k - 1}(J_k)
\]
have pairwise disjoint interiors. This is what happens, with $q_k = 2^k$, during
the first period doubling cascade in the quadratic family. There is an
increasing sequence of parameters $(\alpha_k)_{k \in \mathbb{N}}$ such that $Q_\alpha$ becomes $k$ times
renormalizable if $\alpha \ge \alpha_k$. At the limit parameter
\[
\alpha_{feig} = \lim_{k \to \infty} \alpha_k \approx 3.569945672\ldots
\]
the map becomes infinitely renormalizable. This behavior was first observed
in the 1970s by Tresser & Coullet [538] and Feigenbaum$^{21}$ [241], and an
amazing observation was that the ratios of successive distances between those parameters
converge:
\[
\frac{|\alpha_{k+1} - \alpha_k|}{|\alpha_{k+2} - \alpha_{k+1}|} \to \delta = 4.669201609102990\ldots. \tag{4.50}
\]
This phenomenon has been a major source of inspiration since the 1970s;
see e.g. [112, 140, 386] and the monograph in [414, Section VI]. The next
proposition gives the effect of having periodic intervals on the kneading map.
21 Mitchell Feigenbaum (1944–2019).
Proof. Recall the closest precritical points ζk , ζ̂k from the proof of Theo-
rem 3.90 and (3.25).
(1) If $Q(k + 1) > k$, then $f^{S_k}(c) \in [\zeta_k, \hat\zeta_k]$, so $f^{S_k}$ maps one of the
intervals $[\zeta_k, c]$ or $[c, \hat\zeta_k]$ monotonically into itself (in an orientation-reversing
way). This interval contains an attracting $S_k$-periodic or $2S_k$-periodic point.
If $J \ni c$ is $n$-periodic, then $f^n(J^\circ) \ni c$ because otherwise $f^n$ maps $[p, c]$
into itself, producing an attracting $n$-periodic point. Therefore $J \ni \zeta_k, \hat\zeta_k$ for
some minimal $k$, and $n = S_k$. Additionally, $f^j(J) \ni c$ only if $j$ is a multiple
of $n$. In particular, $S_{k+j}$ are all multiples of $n$, and thus $Q(k + j) \ge k$ for all
$j \ge 1$.
Conversely, if $Q(k + j) \ge Q(k + 1) = k$ for all $j \ge 1$, then $f^{S_k}(c) \in
(\zeta_{k-1}, \hat\zeta_{k-1})$, and $f^{S_k}$ maps one of the intervals $[\zeta_{k-1}, \zeta_k]$ or $[\hat\zeta_k, \hat\zeta_{k-1}]$ in an
orientation-reversing way onto itself, producing an orientation-reversing $S_k$-periodic
point $p$. The other interval contains a preperiodic point $\hat p$. If there
are more such points, then we can take $p, \hat p$ furthest away from $c$. If $f^{S_k}(c) \notin
[p, \hat p]$, then $f^{jS_k}(c) \notin (\zeta_{k-1}, \hat\zeta_{k-1})$ for some $j \ge 1$. Then $Q(k + j) < k$ for this
$j$, contrary to our assumption. Therefore $f^{S_k}(c) \in [p, \hat p]$, making $J := [p, \hat p]$
periodic.
(2) Let p be an attracting periodic point, and assume it is the one closest
to c on its orbit. We can assume without loss of generality that p < c. Since
f is quadratic, Singer’s Theorem22 [414, Chapter II.6] implies that c is in the
immediate domain of p, so f kn (c) → p as k → ∞, and there is no 0 < j < n
such that f j (c) ∈ [p, c].
If p reverses orientation, there is an interval [ζ, p] that is mapped mono-
tonically onto [p, c] by f n . But this means that ζ = ζk is a closest precritical
point, so n = Sk .
If $p$ preserves orientation, then $f^n([p, c]) \subset [p, c)$, so $n$ is not a cutting
time. Take $y > f(c)$ maximal such that $f^{n-1}$ is monotone on $[f(p), y]$. Then
$f^{n-1}([f(p), y]) \ni c$, so $n$ is a co-cutting time.
22 It suffices for this that $f$ has negative Schwarzian derivative: $\frac{f'''}{f'} - \frac{3}{2}\left(\frac{f''}{f'}\right)^2 \le 0$, as quadratic
maps do; see [414, Chapter II.6].
This means that c has to be accumulated by ω(c) from both the right
and the left.
This proof shows that the subshift Xν has only one infinite left-special
sequence, like Sturmian sequences. However, there may be left-special words
of finite length in L(Xν ) that cannot be extended indefinitely to the right
and remain left-special. This happens for example in the Feigenbaum case
where 11 is left-special and also right-special, but both 110 and 111 are
no longer left-special. This corresponds to property (iii) in Theorem 4.128
below.
Excluding the periodic and infinitely renormalizable cases, the first ex-
amples of kneading sequences for maps f so that f |ω(c) are homeomorphisms
were described in [119], in terms of the kneading map. The construction is
flexible enough to provide, say for the tent family $T_\lambda$, uncountably many
parameters within every open subinterval of $[\sqrt{2}, 2]$ of slopes. Further examples
emerged with the discovery of strange adding machines in [23, 89]. The
following characterization comes from [26, Theorem 4.3].
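Take $k_n = \sum_{j=0}^{n} a_j$ and then cutting times as follows:
\[
\begin{cases}
S_k = k + 1 & \text{for } 0 \le k \le a_1, \\
S_{k_n} = q_n & \text{for } n \ge 1, \\
S_{k_n + a} = a q_n + q_{n-1} & \text{for } 1 \le a \le a_n,\ n \ge 1.
\end{cases}
\]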
It is clear that Q(k) → ∞ in this case, and the Sk ’s interpolate between the
qn ’s; cf. [125]. However, f : ω(c) → ω(c) is in general not invertible, since c
itself and/or other points in the backward orbit of c have two preimages in
ω(c); see [119].
Yet also if Q(k) is bounded (and even if Q(k) ≤ 1), there are examples
where (ω(c), f ) is Sturmian; see [117, Chapter III, 3.6]. Let ϕ : [0, 1] → [0, 1]
be a Lorenz-like map, i.e. an interval map that is continuous and increasing
both on $[0, c)$ and $(c, 1]$ with $\lim_{x \nearrow c} \varphi(x) = 1$ and $\lim_{x \searrow c} \varphi(x) = 0$. Thus it
has a discontinuity at the critical point $c$. In addition, we will assume that
$\varphi$ is symmetric: $\varphi(1 - x) = 1 - \varphi(x)$ (i.e. $\varphi(\hat x) = \widehat{\varphi(x)}$ with the notation
$\hat x = 1 - x$) for all $x \in [0, 1] \setminus \{c\}$, so the critical point $c = \frac{1}{2}$.
Every symmetric unimodal map $f : [0, 1] \to [0, 1]$ with $f(c) = 1$ can
be made into a symmetric Lorenz-like map by flipping the right half of the
graph vertically around $c = \frac{1}{2}$ (see [28, 117]):
\[
\varphi(x) = \begin{cases} f(x) & \text{if } x \in [0, c), \\ 1 - f(x) & \text{if } x \in (c, 1]. \end{cases}
\]
Then $\varphi$ is semi-conjugate to $f$: $f \circ \varphi = f \circ f$; see Figure 4.15. In fact
\[
\varphi^n(x) = \begin{cases} f^n(x) & \text{if } f^n \text{ is increasing at } x, \\ 1 - f^n(x) & \text{if } f^n \text{ is decreasing at } x. \end{cases}
\]
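The flipping construction can be tried out on a concrete map. The Python sketch below uses the full tent map as a (hypothetical) choice of symmetric unimodal map with $f(c) = 1$ and exact rational arithmetic; for this choice the resulting Lorenz-like map is the doubling map away from the endpoint 1.

```python
from fractions import Fraction

C = Fraction(1, 2)                      # critical point c = 1/2


def f(x):
    """Full tent map: symmetric, unimodal, f(c) = 1."""
    return 1 - 2 * abs(x - C)


def phi(x):
    """Lorenz-like map obtained by flipping the right branch of f (undefined at c)."""
    if x == C:
        raise ValueError("phi is not defined at the critical point")
    return f(x) if x < C else 1 - f(x)


grid = [Fraction(k, 64) for k in range(65) if Fraction(k, 64) != C]
semi_conjugacy_holds = all(f(phi(x)) == f(f(x)) for x in grid)
```

The identity $f \circ \varphi = f \circ f$ only uses the symmetry $f(1 - y) = f(y)$, so the same check works for any symmetric unimodal map in place of the tent map.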
We will use the itinerary map i for ϕ with codes +1 for [0, c) and −1 for
(c, 1].
[Figure 4.15: graphs of $f$ and $\varphi$, with $c = \frac{1}{2}$ marked.]
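for $n \ge 1$. It follows that $\theta(f(x)) = \sigma(\theta(x))$ if $i_0(x) = 0$ and $\theta(f(x)) =
-\sigma(\theta(x))$ if $i_0(x) = 1$. For the itinerary $i^\varphi$ of $x \in I \setminus \bigcup_{j=0}^{n} \varphi^{-j}(c)$ under the
function $\varphi$ this means that
\[
i_n^\varphi(x) = 0 \iff \begin{cases} i_n(x) = 0 \text{ and } \theta_n(x) = +1 \\ \text{or} \\ i_n(x) = 1 \text{ and } \theta_n(x) = -1 \end{cases} \iff \theta_{n+1}(x) = +1
\]
and
\[
i_n^\varphi(x) = 1 \iff \begin{cases} i_n(x) = 1 \text{ and } \theta_n(x) = +1 \\ \text{or} \\ i_n(x) = 0 \text{ and } \theta_n(x) = -1 \end{cases} \iff \theta_{n+1}(x) = -1.
\]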
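[Figure: graph of the stunted map $\bar\varphi$, with the value $\varphi(1)$ and the points $a$, $c = \frac{1}{2}$, $b$ marked.]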
In the latter case, the kneading map $Q(k) \le 1$ for all $k \in \mathbb{N}$, and if $\alpha \notin \mathbb{Q}$,
then $f : \omega(c) \to \omega(c)$ is a minimal homeomorphism.
Proof. Recall that f (c) = 1 and assume that there is a minimal integer
n ≥ 1 such that ϕn (1) ∈ (c, b]. Then ϕ̄n+1 (1) ∈ (0, a] and ϕ̄n+2 (1) = ϕ̄(1) is
periodic with period n + 1.
Recall that $b > c$ is such that $\bar\varphi(b) = a$, so $f(b) = \hat a > c$, and $f^2(b) =
f(a) = f^2(c) > c$. Therefore $b \in (\hat\zeta_2, \hat\zeta_1)$ for closest precritical points $\hat\zeta_1 >
\hat\zeta_2 > c$, see (3.23), and $\hat b \in (\zeta_1, \zeta_2)$. There are two possibilities:
• $\varphi^n(1) = f^n(1)$. In this case $f^n$ is increasing at 1 and thus $n + 1 = S_k$
is a cutting time.
• $\varphi^n(1) = \widehat{f^n(1)}$. In this case $f^n$ is decreasing at 1 and again $n + 1 =
S_k$ is a cutting time.
By minimality of $k$, $f^{S_j}(c) \notin [\hat b, b] \setminus \{c\}$ for all $j < k$, and hence the kneading
sequence $\nu$ of $f$ consists of blocks 0 or 11. For example,
ν = 1. 0. 0. 1 1. 0. 1 1. 0. 1 1. 1 0 ··· ,
θ = +1 − 1 − 1 − 1 + 1 − 1 − 1 + 1 − 1 − 1 + 1 − 1 + 1 + 1 ··· ,
ν^ϕ = 1. 1. 1. 0 1. 1. 0 1. 1. 0 1. 0 0 ···
where dots indicate cutting times and the bold symbol the position Sk . Since
n + 1 is the period of ϕ̄(1), this shows that #{1 ≤ j ≤ Sk : θj = −1} = k,
and in view of (4.53) we have α = k/Sk .
If there is no such minimal $n$, i.e. $\varphi^n(1) \notin (\hat b, b)$ for all $n \ge 1$, then $f^n(1) \notin
(\hat b, b)$ for all $n \ge 1$ (and in particular $Q(j) \le 1$ for all $j \ge 1$). A counting
argument similar to the above shows that $\alpha = \limsup_k k/S_k = \lim_k k/S_k$. It
is possible that $\alpha$ is rational, e.g. for the logistic map $f_a(x) = 1 - a(x - \frac{1}{2})^2$
with $a = 3.5097$. In this case, $\nu = (101)^\infty$ and $\bar\varphi^i(1)$ converges to an
attracting orbit of period 3. Also for the tent map $T_s(x) = 1 - s|x - \frac{1}{2}|$
with $s = \frac{1}{2}(1 + \sqrt{5})$, the critical orbit $\{\frac{1}{2}, 1, \frac{3}{4} - \frac{1}{4}\sqrt{5}\}$ has period three and
avoids $[0, a]$.
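The period-three claim for the golden mean tent map is easy to confirm numerically. The following Python sketch uses floating point arithmetic, so the comparison is up to rounding; the names `tent` and `orbit` are introduced here for illustration.

```python
from math import sqrt, isclose

s = (1 + sqrt(5)) / 2            # slope: the golden mean, so s * s == s + 1


def tent(x):
    """Tent map T_s(x) = 1 - s * |x - 1/2|."""
    return 1 - s * abs(x - 0.5)


orbit = [0.5]                    # start at the critical point c = 1/2
for _ in range(3):
    orbit.append(tent(orbit[-1]))
```

The return to $\frac{1}{2}$ after three steps is the identity $1 - \frac{1}{2}(s^2 - s) = \frac{1}{2}$, which holds exactly because $s^2 = s + 1$.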
If $\alpha \notin \mathbb{Q}$, then $\omega_{\bar\varphi}(c)$ is a Cantor set, disjoint from $[0, a]$ and minimal
w.r.t. the action of ϕ̄. Under the semi-conjugacy f between f and ϕ (indeed
f ◦ f = f ◦ ϕ), this projects to a minimal map f : ωf (c) → ωf (c). We will
show that f : ωϕ (c) → ωf (c) is a homeomorphism, from which it follows that
f : ωf (c) → ωf (c) is also a homeomorphism. Assume by contradiction that
x < c < x̂ are points in ωϕ (c) such that f (x) = f (x̂) = y ∈ ωf (c). Then,
since f is the semi-conjugacy between ϕ and f , we must have f (ϕn (x̂)) =
f (ϕn (x)) = f n (y) for every n ∈ N. Note that ϕn (x̂) = ϕ n (x) for every
We argued so far that there exist stunted Lorenz maps for which orbϕ̄ (c)
is a Cantor set with dynamics similar to circle rotations (or more precisely
to Denjoy circle maps) with irrational rotation number and that there are
also unimodal maps with kneading map bounded by 1, such that f |ω(c) is
semi-conjugate to a circle rotation. The rotation number is α = limk k/Sk .
Therefore (ω(c), f ) represents a Sturmian shift.
214 4. Subshifts of Zero Entropy
In fact, every irrational rotation number (hence every Sturmian shift) can
be realized this way, as we can prove by studying this rotation number closer.
Indeed, let α = [0; a1 , a2 , a3 , . . . ] be the continued fraction expansion of α,
with convergents pi /qi . For the irrational rotation Rα , the denominators qi
are the times of closest returns of any point x ∈ S1 to itself, and these returns
occur alternatingly on the left and on the right; see Section 8.2.
For the map ϕ̄, the closest returns on the left indeed accumulate on c,
but the right neighborhood [c, b) is the preimage of the plateau [0, a) and no
further iterates of c enter that region. Instead, returns on the left accumulate
on b.
Translating this back to the unimodal map f with kneading sequence
$\nu = \nu_1 \nu_2 \nu_3 \cdots$, the closest returns on the left correspond to closest returns
at co-cutting times (recall that there are no cutting times $S_j$ so that $f^{S_j}(c) \in
(\hat b, b)$). If $q_i$ is such a co-cutting time, then (recalling the function $\rho$ from
(3.19) and using the above argument) the Farey convergents $\rho^a(q_i) = q_i +
a q_{i+1}$ are also the next co-cutting times for $1 \le a \le a_{i+1}$, and in particular,
$\rho^{a_{i+1}}(q_i) = q_{i+2}$.
The closest returns on the right correspond to cutting times, but this
time the f qi (c) accumulate on b, and because f 3 (b) = f 3 (c), the itinerary of
b is
where dots indicate cutting times and primes co-cutting times. The bold
symbols indicate the positions qi . In fact, for each i
Therefore $c$ has two limit itineraries $\lim_{x \nearrow c} i(x) = 0\nu$ and $\lim_{x \searrow c} i(x) = 1\nu$,
but c has only one preimage in ω(c).
Outside maps: Boyland, de Carvalho & Hall in [105, Section 3] present
a different way of creating a circle endomorphism from a unimodal map.
They call this the outside map B and use it to study the inverse limit space
of the unimodal map as attractors of sphere homeomorphisms. Starting from
a unimodal map f : I → I such that the second branch is surjective (i.e.
Figure 4.17. Constructing the outside map and the stunted Lorenz
map for a tent map $T_s$.
Let us carry this out for the family of cores of tent maps $T_s : I \to I$,
\[
T_s(x) = \begin{cases} sx + 2 - s, & x \in [0, \frac{s-1}{s}], \\ -sx + s, & x \in [\frac{s-1}{s}, 1], \end{cases}
\]
Further Minimal
Cantor Systems
In this chapter we present three main types of dynamical systems that, al-
though not subshifts themselves, are popular tools to describe (minimal)
continuous maps on Cantor sets. Cutting and stacking goes back to von
Neumann and Kakutani in the early 1960s. These systems were originally
used to create examples to test specific ergodic properties. Enumeration
systems are a generalization of and a more number-theoretic approach to
both odometers and Ostrowski numeration systems. Bratteli-Vershik sys-
tems came in the 1980s and seem to have become the most frequently used
tool to describe Cantor systems. We explain how to represent some of the
subshifts from earlier chapters in terms of these tools.
$T_Y(y) = T^r(y)$ for the return time $r = r(y) := \min\{i > 0 : T^i(y) \in Y\}$.
and the Rokhlin Lemma [479, Theorem 3.10] in the measure-preserving set-
ting are classical techniques associated with first return maps.
of $X$ into clopen sets that are pairwise disjoint and together cover $X$. We call
$B = \bigcup_{i=1}^{N} B_i$ the base of the KR-partition and the integers $h_i$ the heights.
Also we assume that $T^{h_i}(B_i) \subset B$. (If $T$ is invertible, this is automatic.)
This result comes from [310] and was extended from minimal to transi-
tive aperiodic in [78].
Proof. Let B(1) be any clopen subset of X. By minimality, the first entry
time
r(x) = min{i ≥ 1 : T i (x) ∈ B(1)}
is well-defined and finite for every $x \in X$. Take $B_r(1) = \{x \in B(1) :
r(x) = r\}$. Then
\[
B_r(1) = \big(T^{-r}(B(1)) \cap B(1)\big) \setminus \bigcup_{j=1}^{r-1} T^{-j}(B(1)).
\]
From this it follows that the $B_r(1)$'s are clopen and pairwise disjoint. By
compactness of $B(1)$ (or uniform recurrence), $B(1)$ is the union of finitely
many such $B_r(1)$'s, say $B(1) = \bigcup_{i=1}^{N} B_{r_i}(1)$. Then
\[
X = \bigcup_{i=1}^{N} \bigcup_{j=0}^{r_i - 1} T^j(B_{r_i}(1)),
\]
and the sets in this union are pairwise disjoint. Hence, we have found our
first KR-partition $P_1 = \{T^j(B_{r_i}(1)) : 1 \le i \le N, 0 \le j < r_i\}$.
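The first step of this construction can be mimicked on a finite toy model. The Python sketch below replaces the Cantor system by the cyclic rotation $T(x) = x + 1 \bmod N$ on $\{0, \dots, N-1\}$, which is an assumption made purely for illustration (it is not a Cantor system); the function name `kr_partition` and the sample base are hypothetical.

```python
def kr_partition(N, base):
    """Towers over `base` for T(x) = x + 1 mod N on X = {0, ..., N-1}.

    Returns a dict mapping (r, j) to the floor T^j(B_r), where
    B_r = {x in base : first return time r(x) = r} and 0 <= j < r.
    """
    base = set(base)
    T = lambda x: (x + 1) % N
    by_height = {}
    for x in base:
        r, y = 1, T(x)
        while y not in base:          # first return time of x to the base
            r, y = r + 1, T(y)
        by_height.setdefault(r, set()).add(x)
    return {(r, j): {(x + j) % N for x in B_r}
            for r, B_r in by_height.items() for j in range(r)}


P = kr_partition(12, {0, 5, 7})
```

Here the base splits into $B_5 = \{0, 7\}$ and $B_2 = \{5\}$, and the floors of the two towers are pairwise disjoint and cover the whole space, as in the displayed union.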
To continue, take a clopen set B(2) inside one of the Bri (1)’s in the pre-
vious partition, and repeat the above construction. In this way, we can con-
struct inductively a sequence (Pn )n∈N , where Pn+1 refines Pn . The heights
hi (n) = ri for step n in this construction.
Without loss of generality, we can assume that $\mathrm{diam}(B(n)) < 1/n$, so
$\bigcap_n B(n)$ is a singleton, so that (KR1)–(KR3) hold. By renumbering the
$B_{r_i}(n)$'s, (KR6) holds as well.
To show (KR4), we can view the sets $T^j(B_{r_i}(n))$, $1 \le i \le N_n$, $0 \le j <
h_i(n)$, achieved at the previous step as the targets in which to choose the next
clopen set $B'$. That is, for fixed $\eta > 0$ and for each pair $(i, j)$, choose as the
next clopen set $B' \subset T^j(B_{r_i}(n))$ so that $\mathrm{diam}(B')$ and $\mathrm{diam}(T^j(B_{r_i}(n)) \setminus
B') < (1 - \eta)\, \mathrm{diam}(T^j(B_{r_i}(n)))$. After going through all these $1 \le i \le
N_n$, $0 \le j < h_i(n)$, we return to taking the next clopen set $B(n + 1) \ni x$.
The corresponding $P_{n+1} = \{T^j(B_{r_i}(n + 1)) : 1 \le i \le N, 0 \le j < r_i\}$
refines all the intermediate KR-partitions, and therefore $\{P_k\}_{k \ge 1}$ generates
the topology of $X$.
Finally, (KR5) can be achieved using minimality and taking a subsequence
of $(P_n)_{n \in \mathbb{N}}$ if necessary.
Remark 5.3. If $T : X \to X$ is equicontinuous, then the above construction
can be refined so as to obtain that $N_n \equiv 1$ and $T^{h_1(n)}(B(n)) = B(n)$.
is the graph cover and (KR6) provides the positive directional property. The
dynamics f : Γ → Γ is by following the arrows as in equation (4.30).
Example 5.5 (Kakutani). Start with the half-open interval [0, 1) as first
stack. Cut it in half and put the right half on top of the left half. Repeat
this procedure.
[Figure: the successive stacks of the construction, with $T$ mapping each level to the level above it.]
The limit map $T : [0, 1) \to [0, 1)$ is called the von Neumann-Kakutani
map and the resulting formula is
\[
T(x) = \begin{cases}
x + \frac{1}{2} & \text{if } x \in [0, \frac{1}{2}), \\
x - \frac{1}{4} & \text{if } x \in [\frac{1}{2}, \frac{3}{4}), \\
x - \frac{3}{4} + \frac{1}{8} & \text{if } x \in [\frac{3}{4}, \frac{7}{8}), \\
\quad \vdots & \quad \vdots \\
x - (1 - \frac{1}{2^n}) + \frac{1}{2^{n+1}} & \text{if } x \in [1 - \frac{1}{2^n}, 1 - \frac{1}{2^{n+1}}),\ n \ge 0.
\end{cases}
\]
In terms of the binary expansion $x = 0.b_1 b_2 b_3 \ldots$, the map
$T$ acts as the adding machine or odometer: add 0.1 with carry. That
is, if $k = \min\{i \ge 1 : b_i = 0\}$, then $T(0.b_1 b_2 b_3 \ldots) = 0.0 \ldots 0 1 b_{k+1} b_{k+2} \ldots$.
If $k = \infty$, so $x = 0.111111\ldots$, then $T(x) = 0.0000\ldots$. However, $x =
0.11111\cdots = 1$, so we need to extend the domain of $T$.
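The piecewise formula and its reading as "add 0.1 with carry" can be compared directly. The Python sketch below evaluates the general branch of the formula with exact rationals; the function names `kakutani` and `binary_digits` are introduced here for illustration.

```python
from fractions import Fraction


def kakutani(x):
    """von Neumann-Kakutani map on [0, 1), via the general branch of the formula."""
    if not 0 <= x < 1:
        raise ValueError("x must lie in [0, 1)")
    n = 0
    while x >= 1 - Fraction(1, 2 ** (n + 1)):   # find n with x in [1 - 2^-n, 1 - 2^-(n+1))
        n += 1
    return x - (1 - Fraction(1, 2 ** n)) + Fraction(1, 2 ** (n + 1))


def binary_digits(x, length):
    """First `length` binary digits b_1 b_2 ... of x in [0, 1)."""
    digits = []
    for _ in range(length):
        x *= 2
        digits.append(int(x >= 1))
        x -= digits[-1]
    return digits


orbit = [Fraction(0)]
for _ in range(7):
    orbit.append(kakutani(orbit[-1]))
```

Starting from 0, the first eight points of the orbit are exactly the multiples of $\frac{1}{8}$ in $[0, 1)$, visited in bit-reversed order, which reflects the odometer structure.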
Definition 5.9. The rank is r = lim inf n #{stacks used in the step n}, re-
gardless of whether spacers are used or not. Note that this is the number of
stacks after piling the slices of the previous stacks on top of each other, so
not the number of slices.
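[Figure: interval with marked points $0, \gamma^{-5}, \gamma^{-4}, \gamma^{-3}, \gamma^{-2}, \gamma^{-1}, 1$, where $\gamma = \frac{1}{2}(1 + \sqrt{5})$; 0 is the minimal point, and $1 = \bullet$ and $\gamma^{-1} = \ast$ are maximal points.]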
Example 5.11.
I: Take spacer $[r^+, 1] = [\frac{1}{2}, 1]$ and at every step slice the stack in
two halves, stack the right half on the left, and put a single layer
of spacer on top; see Figure 5.3 (left). Doubling the discontinuity
points will not produce a minimal map, irrespective of how we define
$T$ at 1 (note that $T(r^-) = \frac{3}{4}$). If we set $T(1) = 1$, then $T$ has a
fixed point where it is discontinuous. If we set $T(1) = 0$, then $T$ is
continuous, but not minimal because $T^n(r^-) \in S$ for all $n \ge 1$.
II: Take spacer $[r^+, 1] = [\frac{2}{3}, 1]$ and at every step slice the stack in three
equal thirds, stack the second slice on the first, then put in a single
layer of spacer, and finally stack the third slice on top; see Figure 5.3
(right). This is the Chacon map (or one of the Chacon maps; see
Example 1.27), related to the non-primitive Chacon substitution
\[
\chi_{chac} : \begin{cases} 0 \to 0010, \\ 1 \to 1. \end{cases}
\]
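To see the shape of the Chacon substitution, one can iterate it on the letter 0. The short Python sketch below does this; the dictionary `CHACON` and the function `substitute` are names introduced here, and reading 1 as the spacer symbol is an interpretation suggested by the stacking rule rather than a statement from the text.

```python
CHACON = {"0": "0010", "1": "1"}


def substitute(word, rule=CHACON, times=1):
    """Apply the substitution `rule` letter by letter, `times` times."""
    for _ in range(times):
        word = "".join(rule[letter] for letter in word)
    return word


words = [substitute("0", times=n) for n in range(4)]
lengths = [len(w) for w in words]
```

The lengths satisfy $\ell_{n+1} = 3\ell_n + 1$, matching a stack that is cut into three slices and restacked with one extra layer of spacer.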
1 Extending a particular case in [35], see also [226, 392, 393] for later results; also for repre-
$\{\Delta_i^j\}_{j=0}^{h_i - 1}$ is stacked inside the $i'$-th stack $\{\Delta_{i'}'^{\,j}\}_{j=0}^{h'_{i'} - 1}$ at cutting and stacking
step $n'$. This definition is the equivalent of primitivity in substitution and
S-adic subshifts.
Proposition 5.13. If a cutting and stacking transformation $(I^*, T)$ is primitive
and the maximal number $s^*$ of consecutive layers of spacer is finite, then
$(I^*, T)$ is minimal.
Proof. Let an open interval $U$ be compactly contained in $[0, 1]$, and take
$n$ so large that the base $\Delta$ at the $n$-th cutting and stacking step has length
$|\Delta| < \frac{1}{2}|U|$ and the part of $U$ inside the spacer is all included in the stacks.
Then for at least one $i$
(5.2) there exists $0 \le j < h_i$ such that $\Delta_i^j \subset U$,
$h_i < \infty$ by our assumption on the spacer. By primitivity, there is $n'$ such that
for each $i'$, part of the stack $\{\Delta_i^j\}_{j=0}^{h_i - 1}$ finds its way into the stack $\{\Delta_{i'}'^{\,j}\}_{j=0}^{h'_{i'} - 1}$
N0 = j x j G j .
Remark 5.15. Sometimes the greedy expansion is the only possible expansion,
for example the binary expansion, if $G_n = 2^{n-1}$ and $x_n \in \{0, 1\}$. Zeckendorf's
Theorem [567] states that the expansion is infinite and unique if the
$G_n$ are the Fibonacci numbers and the digits $x_j \in \{0, 1\}$ satisfy $x_j x_{j+1} = 0$.
\[
x_0 G_0 + x_1 G_1 + \cdots + x_n G_n < G_{n+1} \quad \text{for all } n \ge 0.
\]
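The greedy expansion and the displayed inequality can be tested on the Fibonacci scale. The Python sketch below assumes the indexing $G_0 = 1$, $G_1 = 2$, $G_{n+1} = G_n + G_{n-1}$ (an assumption about the convention), and the function names are hypothetical.

```python
def fibonacci_scale(limit):
    """Scale G_0 = 1, G_1 = 2, G_{n+1} = G_n + G_{n-1}, with all terms <= limit."""
    G = [1, 2]
    while G[-1] + G[-2] <= limit:
        G.append(G[-1] + G[-2])
    return G


def greedy_expansion(N, G):
    """Digits x_0, x_1, ... with N = sum x_j G_j, using the largest G_j first."""
    digits = [0] * len(G)
    for j in reversed(range(len(G))):
        digits[j], N = divmod(N, G[j])
    return digits


G = fibonacci_scale(200)
x = greedy_expansion(100, G)      # 100 = 89 + 8 + 3
```

For this scale the greedy digits are 0 or 1 with no two consecutive 1's, and every partial sum $x_0 G_0 + \cdots + x_n G_n$ stays below $G_{n+1}$.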
Proof. If x = y ∈ / XG , there is a largest nx such that x0 G0 + · · · + xnx Gnx =
Gnx +1 − 1 and y0 G0 + · · · + yny Gny = Gny +1 − 1. Take n = max{nx , ny } + 1.
If no such nx , ny exist, then take n = 1. Then a(x)j = xj and a(y)j = yj
for all j ≥ n. Hence, if xj = yj for some j > n, then a(x) = a(y). If xj = yj
for some j > n, then a(x)0 , . . . , a(x)n , 0, 0, . . . and a(y)0 , . . . , a(y)n , 0, 0, . . .
are the greedy expansions of r(x) := 1 + x0 G0 + · · · + xn Gn and r(y) :=
1 + y0 G0 + · · · + yn Gn , respectively, and hence they are not equal because
x = y. This proves that a is injective on XG \ XG . Surjectivity follows from
∞
Proposition 5.17, so a : XG \ {0 } → XG \ XG is well-defined.
Proof. Let $\varepsilon > 0$ be arbitrary and let $N$ be such that $2^{-N} < \varepsilon$. Set $Q :=
\inf_{n > N} Q(n)$. Let $x \in X_G$ be arbitrary. For $a(x)$ to have a carry beyond
index $N$, we need $\sum_{i=1}^{N} x_i G_i \ge G_Q - 1$, and this happens at most once
every $G_Q$ iterates of $a$. At such an iterate there can be a carry or not, so it
can no more than double the number of points in an $(n, \varepsilon)$-separated set
and still have an $(n + G_Q, \varepsilon)$-separated set. The maximal cardinality of a
$(G_n, \varepsilon)$-separated set is bounded by $\varepsilon^{-1} 2^{G_n / G_Q}$, so
\[
h_{top}(a) \le \lim_{\varepsilon \to 0} \limsup_{n \to \infty} \frac{\log(\varepsilon^{-1} 2^{G_n / G_Q})}{G_n} \le \lim_{N \to \infty} \frac{\log 2}{G_Q} = 0.
\]
then
\[
g : X_G \to \mathbb{T}^1, \quad x \mapsto e^{2\pi i \left( \sum_{j=0}^{\infty} \rho x_j G_j \right)}
\]
3 However, [125] also gives examples where (ω(c), f ) factors to a torus (of any dimension)
panel in Figure 5.4 is constructed in this way from the sequence $(G_n)_{n \ge 0} =
1, 2, 3, 4, 6, 9, 13, 19, \dots$ (sometimes called the Narayana cow sequence$^4$).
The picture suggests, and this is indeed true, that the boundary of such
a Rauzy fractal is a fractal, non-rectifiable, curve. It has infinite length, but
the interesting question is whether it has positive two-dimensional Lebesgue
measure or not. Occasionally, this can be decided upon by a simple geometric
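argument. For the case $x^3 = x^2 + 1$ with solutions $\lambda_0 > 1$ and $\lambda_1 = \bar\lambda_2$, the
space $X_G$ consists of sequences in which every two 1's are separated by at
least two 0's. Define $\pi : X_G \to \mathbb{C}$ as
\[
\pi(x) = \lim_{n \to \infty} \mathrm{sf}\Big(\lambda_0 \sum_{j=0}^{n} x_j G_j\Big) + i\, \mathrm{sf}\Big(\lambda_0^2 \sum_{j=0}^{n} x_j G_j\Big)
\]
and set $P = \pi(X_G)$. Identify the two-dimensional torus $\mathbb{T}^2$ with the quotient
space $\mathbb{C}/(\mathbb{Z} + i\mathbb{Z})$, and note that we have
\[
T \circ \pi = \pi \circ a \quad \text{for the translation } T : \mathbb{T}^2 \to \mathbb{T}^2,\ z \mapsto z + \lambda_0 + i\lambda_0^2.
\]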
In Figure 5.4 (left), the three shades refer to three cylinder sets $P_{00} =
\pi(00X_G)$, $P_{100} = \pi(100X_G)$, and $P_{0100} = \pi(0100X_G)$. As shown in [125],
$P = P_{00} \cup P_{100} \cup P_{0100}$ is the attractor (in the sense of Hutchinson [326])
of an iterated function system (IFS):
\[
\begin{cases}
\psi_{00} : P \to P_{00}, & z \mapsto \lambda_1^2 z, \\
\psi_{100} : P \to P_{100}, & z \mapsto \lambda_1^4 + \lambda_1^3 z, \\
\psi_{0100} : P \to P_{0100}, & z \mapsto \lambda_1^5 + \lambda_1^4 z.
\end{cases} \tag{5.7}
\]
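Since $\lambda_0 \lambda_1 \lambda_2 = 1$, the squares of the absolute values of the contraction
factors sum to
\begin{align*}
|\lambda_1^2|^2 + |\lambda_1^3|^2 + |\lambda_1^4|^2 &= \lambda_1^2 \lambda_2^2 + \lambda_1^3 \lambda_2^3 + \lambda_1^4 \lambda_2^4 = \lambda_0^{-2} + \lambda_0^{-3} + \lambda_0^{-4} \\
&= \lambda_0^{-4} (\lambda_0^2 + \lambda_0 + 1) \\
&= \lambda_0^{-4} \big( \lambda_0^4 - (\lambda_0 + 1) \underbrace{(\lambda_0^3 - \lambda_0^2 - 1)}_{=0} \big) = 1.
\end{align*}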
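This identity is easy to confirm numerically. The Python sketch below finds $\lambda_0$ by bisection and, as a side check, generates the scale under the assumption that it obeys the recurrence $G_n = G_{n-1} + G_{n-3}$ with initial values 1, 2, 3 (an assumption consistent with the listed terms and with $x^3 = x^2 + 1$); the names are introduced here for illustration.

```python
def real_root(lo=1.0, hi=2.0, steps=200):
    """Bisection for the real root lambda_0 of x^3 = x^2 + 1 in [lo, hi]."""
    p = lambda x: x ** 3 - x ** 2 - 1
    for _ in range(steps):
        mid = (lo + hi) / 2
        if p(lo) * p(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


lam0 = real_root()
# |lambda_1|^2 = lambda_1 * lambda_2 = 1 / lambda_0, so the squared moduli of the
# contraction factors lambda_1^k are lambda_0^(-k) for k = 2, 3, 4.
total = lam0 ** -2 + lam0 ** -3 + lam0 ** -4

# Assumed recurrence for the scale: G_n = G_{n-1} + G_{n-3}.
G = [1, 2, 3]
while len(G) < 40:
    G.append(G[-1] + G[-3])
growth = G[-1] / G[-2]            # should be close to lambda_0
```

That the squared contraction ratios sum to exactly 1 is the borderline case for an area count: the three pieces have total area equal to that of $P$ if and only if they overlap in a null set.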
4 The Indian mathematician Narayana Pandita (1325–1400) studied cows in much the same
way that Fibonacci studied rabbits, only cows take more time to mature than rabbits.
the left and right panels of Figure 5.4. Continuing our example, consider the
substitution
\[
\chi : \begin{cases} 0 \to 02, \\ 1 \to 0, \\ 2 \to 1, \end{cases}
\quad \text{with associated matrix } A = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}.
\]
This matrix has left and right eigenvectors $(\lambda_i^2, \lambda_i, 1)$ and $(\lambda_i^2, 1, \lambda_i)^T$, where
$\lambda_i$, $i = 0, 1, 2$, are the roots of the characteristic polynomial $p(x) = x^3 -
x^2 - 1$. Hence $p(x) = 0$ is exactly the characteristic equation of our enumeration
scale. This means that the attracting right eigenspace of $A$ is
$V = (\lambda_0^2, \lambda_0, 1)^\perp$. Applying Theorem 4.40 to this substitution (its prefix-
suffix graph is given in the right panel of Figure 4.5) we get that its Rauzy
fractal R ⊂ V is the attractor of the graph-directed IFS
\[
\begin{cases}
R(0) = h(R(0)) \cup h(R(1)), \\
R(1) = h(R(2)), \\
R(2) = h(R(0)) + \pi(1_0)
\end{cases}
\quad \text{where } h = A|_V. \tag{5.8}
\]
[Figure: examples of Bratteli diagrams with top vertex $v_0$, vertex levels $V_1, V_2, V_3$ and incidence matrices $M(1), M(2), M(3)$.]
Exercise 5.31. Show that also if $M(1) \ne (1, \dots, 1)$, then there is an equivalent
Bratteli diagram with $M(1) = (1, \dots, 1)$.
for all $i \ge 1$. In this way, it is analogous to the notion of primitive used for
incidence matrices of SFTs or S-adic transformations, except that primitive
requires that $\sup_i (m_i - m_{i-1}) < \infty$; see Definition 4.42.
If the Bratteli diagram is stationary (i.e. M (i) = M is the same matrix for
all i ≥ 2), then the path space XBV is identical to the path space of an edge-
labeled transition graph associated to M . The Vershik map τ : XBV → XBV
that we will define below, however, is quite different5 from the left-shift σ.
The latter is hyperbolic and of positive entropy6 whereas τ is not hyperbolic
and, on stationary Bratteli diagrams, has zero entropy.
If $x = x_1 \dots x_N$ and $y = y_1 \dots y_N$, with $x_i, y_i \in E_i$ for $1 \le i \le N$, are
finite paths, then we can compare $x$ and $y$ if they have the same endpoint in
$V_N$. Let $m < N$ be the largest index such that $x_m \ne y_m$. This means that
$t(x_m) = t(y_m)$, and we say that $x < y$ if $x_m < y_m$. This gives a partial order
on the set of $N$-paths and a total order on the set of $N$-paths ending in the
same $v \in V_N$. For every $v \in V_i$, there is a unique minimal path from $v$ up to
$v_0$ and at least one $e \in E_i$ with $s(e) = v$. From this it follows that there are
infinite paths $x \in X_{BV}$ such that the initial $N$-path $x^{min}_{[1,N]}$ is minimal among
all $N$-paths with the same terminal vertex. That is, the collection $X_{BV}^{min}$ of
\[
1 \le \#X_{BV}^{min},\ \#X_{BV}^{max} \le \liminf_{n \to \infty} \#V_n.
\]
5 Without going into details, the shift and Vershik map can be likened to the geodesic flow
and horocycle flow on a manifold of curvature −1; see the interesting exposition in [487].
6 See [487] for a comparison to geodesic and horocycle flow.
can make this choice. Medynets [413] gave an example of a Bratteli diagram
that doesn’t allow any ordering by which τ is continuously extendable, even if
$\#X_{BV}^{min} = \#X_{BV}^{max}$; see Figure 5.6 (right). For this diagram the only incoming
edges to u ∈ Vn come from u ∈ Vn−1 , and therefore there is a minimal and
a maximal path going through vertices u only. By the same token, there is
a minimal and a maximal path going through vertices w only. No matter
how τ is defined on these two maximal paths, there is no way of putting
an order on the incoming edges to v ∈ Vn such that this definition makes τ
continuous at these maximal paths.
Remark 5.34. For Bratteli-Vershik systems associated to a Kakutani-Rokhlin partition,
the partition $P_n$ is formed by the $n$-cylinders, represented by $n$-paths
connecting $v_0$ with some $v \in V_n$. There are then $h_v(n)$ such paths, and the
smallest path corresponds to the base elements $B_v(n)$.
Example 5.35. There are examples where XBV is not a Cantor set.
The two examples in Figure 5.6 (left) have opposite ordering. On the
left, $X_{BV}^{\min}$ consists of the (vertex-labeled) paths $e := v_0 \to v \to v \to v \to \cdots$
[Figure 5.6: three ordered Bratteli diagrams with edge labels 0 and 1, two on the vertices v (left) and one on the vertices u, v, w (right); diagram not reproduced.]
The first part was already addressed in [310, Section 2]; they called
such Bratteli-Vershik systems “essentially simple” where currently the word
“properly ordered” is used; see Definition 5.38. The question has been inves-
tigated in detail by Bezuglyi et al. [81, 82]7 , calling such Bratteli-Vershik
systems perfect. A different account is due to Downarowicz & Karpel
[212, 213]. They call a Bratteli-Vershik system decisive if τ can be ex-
tended to a homeomorphism in a unique way. According to [212, Lemma
6.11], a Bratteli-Vershik system is decisive if and only if τ is uniformly continuous
on $X_{BV} \setminus X_{BV}^{\max}$ and the interior of $X_{BV}^{\max}$ is either empty or a single
isolated point.
Example 5.37. There are four ways to assign a stationary order to a stationary
Bratteli diagram with incidence matrix $\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$; see Figure 5.7.
• The right two cases represent the Thue-Morse shift. They have
two minimal and two maximal infinite paths, but it is impossible
to define $\tau : X_{BV}^{\max} \to X_{BV}^{\min}$ so that $\tau : X_{BV} \to X_{BV}$ becomes
continuous, let alone a homeomorphism. As shown in [81, Example
3.5] (see also [212, Example 6.12]), it is not possible to extend τ
continuously, also for non-stationary orders, as soon as there are
two minimal paths.
7 Also for infinite rank Bratteli diagrams, see Definition 5.41 below.
[Figure 5.7: the four stationary orderings of the diagram, with edges labeled 0 and 1; diagram not reproduced.]
• The left two cases have only one minimal and maximal path. Now τ
can be extended continuously to a homeomorphism of XBV , and it
is conjugate to the dyadic odometer; see Remark 5.55, [81, Propo-
sition 2.20], or [255, Section 5] which are concerned with the char-
acterization of odometers as Bratteli-Vershik systems. This also
follows because (XBV , τ ) represents an invertible Toeplitz shift; see
[208, below Theorem 5.1] and Theorem 5.54 and the text below it.
Definition 5.38. A Bratteli-Vershik system (XBV , τ ) is called properly
ordered if it is simple (as in Definition 5.32) and $\# X_{BV}^{\min} = \# X_{BV}^{\max} = 1$.
Figure 5.6 illustrates the effect of reversing the order of each collection
of incoming edges:
Lemma 5.40. The system (XBV , ≤, τ ) is conjugate to (XBV , ≥, τ −1 ) wher-
ever τ is well-defined and injective. That is, if we reverse the order on the
incoming edges everywhere, we obtain a system whose inverse Vershik map
is conjugate to the original system. In particular, the sets of minimal and
maximal paths exchange roles.
Definition 5.41. A Bratteli diagram B = ((Ei )i∈N , (Vi )i∈N ) has rank $r_B := \liminf_i \# V_i$.
A Bratteli diagram has rank r if r is the smallest integer such
that (XBV , τ ) is isomorphic to a system on a Bratteli diagram with $r_B = r$.
If no such finite r exists, then the Bratteli diagram is said to have infinite
rank.
[Figure: stationary ordered Bratteli diagram on the vertices 0, 1, 2, 3, shown together with the substitution χ : 0 → 01, 1 → 01, 2 → 02031, 3 → 02131; diagram not reproduced.]
hence not minimal. The subdiagram (XBV , τ ) using only symbols 0 and 1
[Figure 5.9: two ordered Bratteli diagrams; diagram not reproduced.]
Example 5.48. The Bratteli-Vershik systems in Figure 5.9 are both iso-
morphic to the Chacon substitution shift (see Example 1.27) generated
by the fixed point
The one on the left is not properly ordered, because it has two minimal
sequences x and y and two maximal sequences x and z as we saw below
Definition 5.46.
The one on the right, constructed by Park [444], is properly ordered.
See [225] for general results on finding properly ordered Bratteli-Vershik
systems.
\[ s = \lim_{i\to\infty} \chi_2 \circ \chi_3 \circ \cdots \circ \chi_i(v), \]
where v is taken from Vi . For completeness, we also set χ1 (v) = 0 for every
v ∈ V1 . Using the irreducibility conditions,
Example 5.49. The best-known examples are of course the stationary sub-
stitutions; i.e. Vi ≡ V and χi ≡ χ. For example, the Fibonacci substitution
acts on the alphabet {0, 1} by
\[ \chi_{\mathrm{Fib}} : \begin{cases} 0 \to 01, \\ 1 \to 0 \end{cases} \]
and s = 0100101001001 . . . . This sequence is equal to the sequence of first
labels of {τ n (xmin )}n≥0 in the Fibonacci Bratteli diagram in Figure 5.13. As
a result, the Fibonacci substitution is isomorphic to the Fibonacci Bratteli
diagram, which in turn is isomorphic to the Fibonacci enumeration system.
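The fixed point s of Example 5.49 can be generated by iterating χFib on the letter 0. The following Python lines are a minimal sketch of this; the names `CHI_FIB` and `iterate` are ours and serve only as an illustration.

```python
# Fibonacci substitution chi_Fib: 0 -> 01, 1 -> 0 (Example 5.49).
# CHI_FIB and iterate are illustrative names, not notation from the text.
CHI_FIB = {"0": "01", "1": "0"}

def iterate(chi, word, n):
    """Apply the substitution chi to `word` n times."""
    for _ in range(n):
        word = "".join(chi[letter] for letter in word)
    return word

# chi^5(0) has length 13 and is a prefix of the fixed point s.
print(iterate(CHI_FIB, "0", 5))
```

Each iterate χ^n(0) is a prefix of the next one, so longer prefixes of s are obtained by increasing n.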
Lemma 5.50. Every S-adic shift such that each letter v ∈ Vi−1 , i ≥ 2, ap-
pears in some word χi (w), w ∈ Vi , is isomorphic to a Vershik transformation
on an ordered Bratteli diagram and vice versa.
5.4. Bratteli Diagrams and Vershik Maps 243
Proof. The vertices of the Bratteli diagram coincide with the alphabets
Vi (for this reason we choose the same notation), except that the Bratteli
diagram has a first level V0 = {v0 }. Let8 E1 = {v0 → v | v ∈ V1 }. For
each v ∈ Vi , i ≥ 1, there is an incoming edge w → v for each appearance of
w ∈ Vi−1 in χi (v), and the ordering of the incoming edges in v is the same
as the order of the letters in χi (v). It follows that the incidence matrices of
the substitution χi coincide with incidence matrices associated to the edges
Ei . Hence M (i) is the transpose of the matrix associated to χi .
Clearly, the Bratteli diagrams and substitutions (χi )i≥2 are in one-to-
one correspondence, provided every w ∈ Vi−1 appears in at least one χi (v),
v ∈ Vi .
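The passage from a substitution to ordered incoming edges, as described in this proof, can be sketched in a few lines of Python. We assume a stationary situation (the same alphabet and the same substitution at every level) and use the Fibonacci substitution as input; the function names are hypothetical.

```python
# Stationary example: chi_i = chi for all i >= 2 (an assumption made for simplicity).
CHI = {"0": "01", "1": "0"}

def incoming_edges(chi):
    """Ordered incoming edges of each vertex v.

    The k-th incoming edge of v starts at the k-th letter of chi(v),
    so the edge order is the order of the letters in chi(v).
    """
    return {v: [(w, v) for w in image] for v, image in chi.items()}

def edge_counts(chi):
    """counts[v][w] = number of edges w -> v = number of occurrences of w in chi(v)."""
    letters = sorted(chi)
    return {v: {w: chi[v].count(w) for w in letters} for v in letters}

print(incoming_edges(CHI))
print(edge_counts(CHI))
```

For non-stationary S-adic sequences one would call the same functions once per level with the corresponding χi.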
Let vi ∈ Vi be the symbol indicated by (5.9); then it easily follows that
vi−1 is the first symbol of χi (vi ) and that xmin := v0 → v1 → v2 → v3 → · · ·
is the unique minimal element.
The sequence s = limj χ2 ◦ · · · ◦ χj (v) can be read off as
sn = s(τ n (xmin )2 ).
In other words, sn records the vertex in V1 that the n-th τ -image of xmin
goes through. The way to see this is the following: Since the incoming edges
to w ∈ V2 are ordered as in χ2 (w), a path starting with χ2 (w)1 → w is
followed by a path starting with χ2 (w)2 → w, etc. Because this is true for
every vertex in every level Vi , the required sequence s will emerge.
Remark 5.51. A graph cover map f : Γ → Γ on the inverse limit
Figure 5.10. The first return map to [0, γd− ] for the case |Δd | < |Δe |.
• If |Δd | > |Δe |, then there is k < d and r ≥ 0 such that $T^{1+r}(\Delta_k) \ni \gamma_d$
and $T^j(\Delta_k) \subset \Delta_d$ for 1 ≤ j ≤ r. Thus r ≥ 1 is the minimal
iterate such that $T^r(1^-) \notin \Delta_d$. The intervals Δj mapped into Δd
remain in Δd for r or r + 1 steps, depending on whether T (Δj )
is to the left or right of T (Δk ); see Figure 5.11. The first return
map to [0, γd−1 ) consists of r′ Type 1 Rauzy steps and a single
Type 0 Rauzy step, where
$r' = r \times \#\{j : T(\Delta_j) \cap \Delta_d \neq \emptyset\} + \#\{j : T(\Delta_j) \text{ lies to the right of } T(\Delta_k)\}$.
Figure 5.11. The first return map to [0, γd−1 ) for the case |Δd | > |Δe |.
In fact, Gjerde & Johansen [274] proved this for invertible Toeplitz shifts,
provided the Bratteli diagram is properly ordered, so it has both a unique
minimal and a unique maximal path.
Remark 5.55. If the equal path number property holds, then the sequence
p = (pi )i∈N ⊂ N, pi := #t−1 (v) for v ∈ Vi is well-defined. We label the
incoming edges e ∈ Ei with t(e) = v in increasing order with the labels
0, 1, . . . , pi −1. Then the labeling map ψ : XBV → Σp assigning the sequence
of labels to each path is a continuous factor onto the p-adic odometer (Σp , a)
and ψ◦τ = a◦ψ. This gives another way of seeing that odometers are factors
of Toeplitz shifts.
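To make the target of the factor map ψ concrete, here is a small Python sketch of the addition a on the p-adic odometer, truncated to finitely many coordinates (the truncation and the name `odometer_step` are our assumptions; on Σp itself the carry may run through infinitely many coordinates).

```python
def odometer_step(x, p):
    """Add 1 with carry to the right on a finite truncation of a point of Sigma_p.

    x[i] is a digit in {0, ..., p[i] - 1}. The word with all digits maximal
    is mapped to the all-zero word (the carry leaves the truncated window).
    """
    y = list(x)
    for i, base in enumerate(p):
        if y[i] + 1 < base:
            y[i] += 1
            return y
        y[i] = 0  # digit overflows: reset it and carry to the next coordinate
    return y

p = [2, 3, 2]                      # hypothetical values of p_1, p_2, p_3
print(odometer_step([1, 2, 0], p))  # carry through the first two coordinates
```

On this truncation the map is a cyclic permutation of the p1 p2 p3 = 12 label sequences, in line with the picture of τ running through the paths of a fixed length in the Vershik order.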
Remark 5.56. Relating Toeplitz shifts to Kakutani-Rokhlin partitions, for
a Toeplitz sequence with periodic structure $(q_n)_{n=1}^{\infty}$, the elements of Pn are
the sequences that share the same qn -skeleton, and the heights hi (n) = qn .
In order to prove Theorem 5.54, we start with a lemma that holds for
general Bratteli-Vershik systems with a unique minimal path.
Proof. First remove the minimal path xmin from the Bratteli diagram. Since
there is only one minimal path, for no v ∈ Vi , i ∈ N, there remains an infinite
minimal path starting at v. That is, there is an increasing sequence (ik )k∈N ,
such that no minimal path connects Vik−1 to Vik . Therefore, after telescoping
between Vik−1 and Vik for all k ∈ N, obtaining a Bratteli diagram (Êk , V̂k )k∈N ,
there is no minimal edge connecting V̂k−1 to V̂k for any k ∈ N. Now reinsert
the (telescoped version of the) minimal path in (Êk , V̂k )k∈N . This achieves
the required property.
We write X = orbσ (θ) for the (one-sided) Toeplitz shift space. Let q1 be the
period of θ0 , and let V1 be the collection of q1 -prefixes of {σ kq1 (θ) : k ≥ 0}.
Next set the first base
Bv (1) := {x ∈ X : x and θ share q1 -skeleton and x0 . . . xq1 −1 = v},
for v ∈ V1 , so that we obtain the first Kakutani-Rokhlin partition
P1 := {σ j (Bv (1)) : 0 ≤ j < q1 , v ∈ V1 }.
To continue the induction, suppose we have found qn , Vn , (Bv (n))v∈Vn , and
Pn = {σ j (Bv (n)) : 0 ≤ j < qn , v ∈ Vn }. Let qn+1 be the minimal period
with which the word θ0 . . . θqn appears in θ. Since θ0 . . . θqn−1 −1 is a prefix of
θ0 . . . θqn , qn+1 is a multiple of qn . Let Vn+1 be the collection of qn -prefixes
of {σ kqn+1 (θ) : k ≥ 0}. Next set the n + 1-st base
Bv (n + 1) := {x ∈ X : x and θ share qn -skeleton and x0 . . . xqn −1 = v},
for v ∈ Vn+1 , so that we obtain the n + 1-st Kakutani-Rokhlin partition
Pn+1 := {σ j (Bv (n + 1)) : 0 ≤ j < qn+1 , v ∈ Vn+1 }.
Clearly Pn+1 refines Pn and the height of each base element Bv (n + 1) is the
same, namely qn+1 , for each v ∈ Vn+1 .
Since all v ∈ Vn+1 have θ0 . . . θqn −1 as prefix,
$\bigcup_{v \in V_{n+1}} B_v(n+1) \subset B_{\theta_0 \dots \theta_{q_n - 1}}(n)$,
verifying condition (KR6) of Section 5.1. To check (KR4),
suppose that x ≠ x′ ∈ X and let k be the smallest entry for which $x_k \neq x'_k$.
Take n such that qn > k. If x, x′ are in different qn -skeletons, then they
belong to different levels in Pn . If x, x′ are in the same qn -skeleton, then
there is j < qn , but two different w, w′ ∈ Vn such that $x \in \sigma^j(B_w(n))$ and
$x' \in \sigma^j(B_{w'}(n))$. This shows that (Pn )n≥1 separates points. Since all Pn ’s
are partitions into clopen sets, the Pn ’s generate the topology of X. Hence
(Pn )n≥1 satisfies (KR1)–(KR4) and (KR6). Condition (KR5) may fail but
can be achieved by taking a subsequence of (Pn )n≥1 .
Finally, to construct the Bratteli-Vershik system, Vn are the vertex sets.
For each v ∈ V1 , the edge set E1 contains q1 edges connecting v0 to each
v. To get a simple cap of the Bratteli diagram, we can microscope between
{v0 } and V1 by inserting a level $V_{1/2} = \mathcal{A}$ such that there is a single edge
(labeled a) between v0 and $a \in V_{1/2}$. There will be q1 edges between a and
v ∈ V1 , ordered in the same way as the letters appear in v. The general
En , n ≥ 2, will, for each v ∈ Vn , contain qn /qn−1 edges connecting v with
u ∈ Vn−1 ordered in the same way that σ (rv +k)qn−1 (θ) visits the u’s. Here
rv = min{r ≥ 0 : σ rqn−1 (θ) ∈ v}. Clearly the equal path number property is
satisfied.
[Figures: Bratteli diagrams with vertices labeled by the words 10, 11; 1011, 1010; 10111010, 10111011 on successive levels (with and without the inserted level $V_{1/2}$), and edges labeled 0 and 1; diagrams not reproduced.]
Now define a weight function $w : \bigcup_n E_n \to [0, 1]$ by setting
\[ w(e) := \frac{\mu([x_1 \dots x_n e])}{\mu([x_1 \dots x_n])} \quad \text{for any path } x_1 \dots x_n \text{ with } t(x_n) = s(e) \in V_n. \]
By τ -invariance of μ, any path x1 . . . xn with t(xn ) = v ∈ Vn has the same
mass as any other path with the same terminal vertex v, so the above
definition makes sense. Also $\sum_{s(e)=v} w(e) = 1$ for every n ∈ N0 and v ∈ Vn .
Now define a function ϕ : XBV → [0, 1] by
\[ \varphi(x) = \sum_{s(e)=v_0,\ e <_s x_1} w(e) + \sum_{n=2}^{\infty} \mu([x_1 \dots x_{n-1}]) \sum_{s(e)=s(x_n),\ e <_s x_n} w(e). \]
and xm is not a maximal incoming edge (we can take m minimal with this
property), then $\tau(x_w^{\max}) = x_v^{\max}$ and $\tau(x_{w'}^{\min}) = x_{v'}^{\min}$ for consecutive n-paths
v = τ (w) and v′ = τ (w′ ). In this case, $T(y) = \varphi(x_v^{\max}) = \varphi(x_{v'}^{\min})$ so T is
well-defined (and in fact continuous) at y after all. Therefore, for every n
12 Recall that μ is assumed to be non-atomic.
there are at most $\sum_{i,j} m_{ij}(n)$ pairs $x_w^{\max}, x_{w'}^{\min}$ (corresponding to the total
number of slices of the n − 1-st level stacks to create the n-th level stacks)
where T can be discontinuous at $y = \varphi(x_w^{\max}) = \varphi(x_{w'}^{\min})$. At all other points,
T acts as a local isometry, that is, precisely as a cutting and stacking map.
This proves the proposition.
Proof. We start by showing that all vertices have incoming and outgoing
edges. Indeed, if m ∈ Vn , so Q(m) < n ≤ m, then there are
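\[ \text{incoming edges } \begin{cases} n-1 \to m & \text{if } n-1 \in R_G(m), \\ m \to m & \text{if } n-1 \notin R_G(m), \text{ so } Q(m) < n-1, \end{cases} \]
and
\[ \text{outgoing edges } \begin{cases} m \to m+1 & \text{if } m = n, \\ m \to m & \text{if } Q(m) < n < m. \end{cases} \]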
We call the path $v_0 \xrightarrow{0} 1 \xrightarrow{0} 2 \xrightarrow{0} \cdots$ with code xmin := 0000 . . . the spine
of the Bratteli diagram. It is clearly the unique minimal path. A path x
is maximal if, whenever it passes through the first vertex n ∈ Vn , it passes
through Q(n) ∈ VQ(n) as well, but not through m ∈ Vm for Q(n) < m < n.
The edge-labeled paths with this property are exactly the sequences in XG .
We set τ (x) = x min for every x ∈ XG .
[Figure: ordered Bratteli diagram with G0 = 1, G1 = 2, G2 = 7, G3 = 23, G4 = 74, G5 = 237 marked at the levels V0 , . . . , V5 ; diagram not reproduced.]
Thus there are finitely many maximal paths if n − Q(n) is bounded and
only one maximal path if there are infinitely many n such that Q(m) ≥ n
for all m > n. In this case, the Bratteli diagram corresponds to an adding
machine based on these particular Gn ’s. So far, this proves that the ordered
Bratteli diagram is well-defined, with a single minimal and finitely many
maximal paths.
Claim: The number of paths from v0 to n ∈ Vn is equal to Gn .
Since there are G1 = d1 (1) edges $v_0 \xrightarrow{a} 1 \in V_1$, the claim holds for
G1 . We continue to prove the claim by induction, assuming the claim is
true for all m < n. The number of paths through both n − 1 ∈ Vn−1 and
n ∈ Vn equals d1 (n)Gn−1 . This proves the inductive step if Q(n) = n − 1.
Otherwise, the remaining incoming edge n → n is the last edge of a strand
$V_{Q(n)} \ni Q(n) \to n \to n \to \cdots \to n \to n \in V_n$. Going up this strand, we
accumulate $d_2(n) G_{n-2} + d_3(n) G_{n-3} + \cdots + d_{Q(n)} G_{Q(n)}$ paths. Together, this
adds up to Gn , proving the induction step, and thus the claim.
If 0 ≤ k < Gn , then, counting from the spine, the k-th path from v0
through 1 ∈ Vn in the Vershik order, i.e. x = τ k (00000 . . . ), satisfies
$k = \sum_{i=1}^{n} x_i G_{i-1}$. From this it follows that τ (x) = a(x) for every x ∈ XG , and by
continuity also τ (x) = a(x) for every x ∈ XG . This completes the proof.
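The identity k = Σ xi Gi−1 can be tried out numerically. The sketch below assumes that the digits xi are the greedy ones (largest possible digit first, starting from the top) and uses the scale G0 , . . . , G5 = 1, 2, 7, 23, 74, 237 shown in the figure of this proof; the function name `greedy_digits` is ours.

```python
G = [1, 2, 7, 23, 74, 237]  # G_0, ..., G_5 as in the figure

def greedy_digits(k, scale):
    """Return digits [x_1, ..., x_n] with k = sum_i x_i * scale[i-1].

    Assumption: digits are chosen greedily from the top, so x_i is the
    quotient of the current remainder by G_{i-1}. Since G_0 = 1, the
    final remainder is always 0.
    """
    digits = [0] * len(scale)
    for i in range(len(scale) - 1, -1, -1):
        digits[i], k = divmod(k, scale[i])
    return digits

x = greedy_digits(100, G)
print(x, sum(d * g for d, g in zip(x, G)))
```

Under this assumption the expansion of 100 uses 74 + 23 + 2 + 1, and applying the function to consecutive integers k, k + 1 shows the carry behaviour of the map a.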
[Figure: ordered Bratteli diagram with S0 = 1, S1 = 2, S2 = 3, S3 = 4, S4 = 6, S5 = 9 marked at successive levels; diagram not reproduced.]
Methods from Ergodic Theory
258 6. Methods from Ergodic Theory
limit probability measure μ and a subsequence (ni )i∈N such that for every
continuous function ψ : X → R,
\[ \int_X \psi \, d\nu_{n_i} \to \int_X \psi \, d\mu \quad \text{as } i \to \infty. \tag{6.1} \]
In a metric space, for any ε > 0 and closed set A, we can find a continuous
function ψA : X → [0, 1] such that ψA (x) = 1 if x ∈ A and
\[ \mu(A) \le \int_X \psi_A \, d\mu \le \mu(A) + \varepsilon, \qquad \mu(T^{-1}A) \le \int_X \psi_A \circ T \, d\mu \le \mu(T^{-1}A) + \varepsilon. \]
Here we use outer regularity of the measure μ: μ(A) = inf{μ(U ) : A ⊂
U is open}. We take U ⊃ A so small that μ(U ) − μ(A) < ε and ψA (x) = 0 for
all x ∉ U . Note that it is important that A is closed, because if there exists
a ∈ ∂A \ A, then the above property fails for μ = δa .
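By the definition of μ,
\begin{align*}
|\mu(T^{-1}(A)) - \mu(A)| &\le \left| \int \psi_A \circ T \, d\mu - \int \psi_A \, d\mu \right| + \varepsilon \\
&= \lim_{i\to\infty} \left| \int \psi_A \circ T \, d\nu_{n_i} - \int \psi_A \, d\nu_{n_i} \right| + \varepsilon \\
&= \lim_{i\to\infty} \frac{1}{n_i} \left| \sum_{j=0}^{n_i - 1} \left( \int \psi_A \circ T^{j+1} \, d\nu - \int \psi_A \circ T^{j} \, d\nu \right) \right| + \varepsilon \\
&\le \lim_{i\to\infty} \frac{1}{n_i} \left| \int \psi_A \circ T^{n_i} \, d\nu - \int \psi_A \, d\nu \right| + \varepsilon \\
&\le \lim_{i\to\infty} \frac{2}{n_i} \|\psi_A\|_\infty + \varepsilon = \varepsilon.
\end{align*}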
Since ε > 0 is arbitrary, μ(T −1 (A)) = μ(A). The closed sets generate the σ-
algebra of Borel sets, so μ(T −1 (A)) = μ(A) also for arbitrary Borel sets.
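The telescoping step in this proof can be observed numerically. In the following Python sketch we take, purely as an illustration, a circle rotation T, a Dirac mass ν = δx0 and ψ(x) = cos(2πx); none of these choices comes from the text. For the averaged measures νn = (1/n) Σj<n δT^j(x0) the defect |∫ ψ ∘ T dνn − ∫ ψ dνn| is bounded by 2‖ψ‖∞ /n.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # rotation angle, an arbitrary illustrative choice

def T(x):
    """Circle rotation x -> x + ALPHA mod 1."""
    return (x + ALPHA) % 1.0

def psi(x):
    """A continuous test function with sup norm 1."""
    return math.cos(2 * math.pi * x)

def invariance_defect(x0, n):
    """|int psi(T(.)) d(nu_n) - int psi d(nu_n)| for nu_n = (1/n) sum_{j<n} delta_{T^j x0}."""
    orbit = [x0]
    for _ in range(n):
        orbit.append(T(orbit[-1]))
    shifted = sum(psi(x) for x in orbit[1:]) / n    # average of psi(T(.)) over x0, ..., T^{n-1} x0
    original = sum(psi(x) for x in orbit[:-1]) / n  # average of psi over the same points
    return abs(shifted - original)

for n in (10, 100, 1000):
    print(n, invariance_defect(0.0, n), 2 / n)
```

The two sums differ only in their first and last terms, which is exactly the cancellation used between the third and fourth lines of the estimate.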
Exercise 6.3. To demonstrate the role of the compactness assumption in
Theorem 6.2, consider the fixed point ρCantor of the Cantor substitution
χCantor in Remark 2.14 and Example 6.19. Let X = {σ n (ρCantor ) : n ≥ 0}, so
no closure is taken! Show that (X, σ) has no invariant probability measure.
Remark 6.4. Invariant measures are related to fixed points of the trans-
fer operator of a dynamical system (X, T ). First define the Koopman
operator UT g = g ◦ T . With respect to some reference measure m (e.g.
2
6.1. Ergodicity
The notion of ergodicity says that the space X does not fall apart into separate
T -invariant components, both of positive measure.
Definition 6.5. A measure μ is called ergodic if for every T -invariant set
A ∈ B (i.e. T −1 (A) = A mod μ) either μ(A) = 0 or μ(Ac ) = 0. That is, the
only T -invariant sets are nullsets or the whole space up to a nullset.
Example 6.6. For the full shift ($\mathcal{A}^{\mathbb{N}}$ or $\mathcal{A}^{\mathbb{Z}}$, σ), Bernoulli measures (see
Definition 1.32) are ergodic σ-invariant measures. So are equidistributions on
periodic orbits. Non-trivial convex combinations of such measures are still
σ-invariant, but not ergodic.
Corollary 6.7. A dynamical system (X, T, μ) is ergodic if and only if the
only T -invariant L1 (μ)-functions (i.e. ψ = ψ ◦ T μ-a.e.) are constant μ-a.e.
Theorem 6.13. Let μ be a probability measure and let ψ ∈ L1 (μ). Then the
ergodic average
\[ \psi^*(x) := \lim_{n\to\infty} \frac1n \sum_{i=0}^{n-1} \psi \circ T^i(x) \]
exists μ-a.e., and ψ∗ is T -invariant. If in addition μ is ergodic, then
\[ \psi^* = \int_X \psi \, d\mu \quad \mu\text{-a.e.} \tag{6.3} \]
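As a numerical illustration of (6.3), the following Python sketch computes ergodic averages for an irrational circle rotation with Lebesgue measure, which is an ergodic system chosen by us for convenience; ψ is the indicator of [0, 1/2), so the limit should be 1/2. The function names are hypothetical.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # irrational rotation angle (illustrative choice)

def indicator(x):
    """psi = indicator function of [0, 1/2); its Lebesgue integral is 1/2."""
    return 1.0 if x < 0.5 else 0.0

def birkhoff_average(psi, x, n):
    """(1/n) * sum_{i<n} psi(T^i x) for the rotation T(x) = x + ALPHA mod 1."""
    total = 0.0
    for _ in range(n):
        total += psi(x)
        x = (x + ALPHA) % 1.0
    return total / n

for n in (100, 1000, 10000):
    print(n, birkhoff_average(indicator, 0.1, n))
```

For this particular system the convergence holds for every starting point, not just almost every one, which anticipates the discussion of unique ergodicity later in this chapter.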
5 Named after George Birkhoff (1884–1944). Details of the controversy on priority of the
Ergodic Theorem (John von Neumann was earlier in proving his L1 -version
$\frac{1}{n} \sum_{k=0}^{n-1} U_T^k \psi \to \int_X \psi \, d\mu$ in L1 (μ), but Birkhoff delayed its publication until after the appearance of his own
paper) can be found in [569].
In fact [277, Theorem 4.10], if every point is typical for a generic mea-
sure7 , then (X, T ) is uniquely ergodic.
A major consequence of unique ergodicity is the uniform existence of
visit frequencies; i.e. for a uniquely ergodic subshift (X, σ, μ)
\[ \mu([a_1 \dots a_N]) = \lim_{n\to\infty} \frac1n \#\{0 \le j < n : x_{j+1} \dots x_{j+N} = a_1 \dots a_N\}, \tag{6.4} \]
for every word a1 . . . aN and all x ∈ X.
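Formula (6.4) can be tested on a concrete uniquely ergodic subshift. The Python sketch below uses a prefix of the Thue-Morse sequence (our choice of example; the helper names are hypothetical) and counts occurrences of short words.

```python
def thue_morse(n):
    """First n letters of the Thue-Morse sequence: parity of the binary digit sum of k."""
    return "".join(str(bin(k).count("1") % 2) for k in range(n))

def frequency(x, word):
    """Fraction of the windows of length len(word) in x that equal `word`."""
    n = len(x) - len(word) + 1
    return sum(x.startswith(word, j) for j in range(n)) / n

x = thue_morse(2 ** 14)
for word in ("0", "01", "00", "000"):
    print(word, frequency(x, word))
```

The printed values are close to 1/2, 1/3 and 1/6 for the first three words, and the word 000 does not occur at all, since the Thue-Morse sequence is cube-free.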
Proof. If μ and ν were two different ergodic measures, then we can find a
continuous function ψ : X → R such that $\int \psi \, d\mu \neq \int \psi \, d\nu$. Using Birkhoff’s
Ergodic Theorem 6.13 for both measures (with their own typical points x
and y), we see that
\[ \lim_{n\to\infty} \frac1n \sum_{k=0}^{n-1} \psi \circ T^k(x) = \int \psi \, d\mu \neq \int \psi \, d\nu = \lim_{n\to\infty} \frac1n \sum_{k=0}^{n-1} \psi \circ T^k(y), \]
so there is no uniform convergence to a constant function.
Conversely, we know by Birkhoff’s Ergodic Theorem 6.13 that
\[ \lim_{n\to\infty} \frac1n \sum_{k=0}^{n-1} \psi \circ T^k(x) = \int \psi \, d\mu \]
is constant μ-a.e. But if the convergence is not uniform, then there are
sequences (yi )i∈N ⊂ X and (ni )i∈N ⊂ N such that
$\lim_i \frac{1}{n_i} \sum_{k=0}^{n_i - 1} \psi \circ T^k(y_i) \neq \int_X \psi \, d\mu$.
Define probability measures $\nu_i := \frac{1}{n_i} \sum_{k=0}^{n_i - 1} \delta_{T^k(y_i)}$. This sequence
(νi )i∈N has a weak∗ accumulation point ν, which is shown to be a T -invariant
measure in the same way as in the proof of Theorem 6.2. But ν ≠ μ because
$\int \psi \, d\nu \neq \int \psi \, d\mu$. Hence (X, T ) cannot be uniquely ergodic.
On the positive side, there are several conditions implying unique ergod-
icity.
Theorem 6.22. Let (X, T ) be an equicontinuous surjection on a compact metric
space (X, d). Then the following are equivalent:
(a) T is transitive.
(b) T is uniquely ergodic.
(c) Every T -invariant probability measure is ergodic.
The main implication (a) ⇒ (b) is due to Fomin [250] for minimal dy-
namical systems and was generalized to transitive systems by Oxtoby [440],
as we shall do in the proof below (but see Exercise 2.28). In fact, Oxtoby’s
proof applies to transitive mean equicontinuous systems as well.
is δ > 0 such that d(T n x, Y ) > δ for all n ∈ N. Hence, for any weak∗
accumulation point
\[ \mu_1 = \lim_{k\to\infty} \frac{1}{n_k} \sum_{i=0}^{n_k - 1} \delta_{T^i x}, \]
the support satisfies supp(μ1 ) ∩ Y = ∅. Therefore $\frac12(\mu_0 + \mu_1)$ is not ergodic.
Lemma 6.23. If a shift (X, σ) is balanced on words, see Definition 4.36,
then it is uniquely ergodic.
Proof. Suppose that μ and ν are two different ergodic invariant measures
and that u ∈ L(X) is such that μ([u]) ≠ ν([u]). Take typical points x and y
(in the sense of the Birkhoff Ergodic Theorem 6.13) for μ and ν, respectively.
Then | |x1 . . . xn |u − |y1 . . . yn |u | ∼ n|μ([u]) − ν([u])| → ∞ as n → ∞, so L(X)
cannot be balanced.
Example 6.24. If L(X) is R-balanced on words, then it is R-balanced on
letters, but the other direction fails of course. For example, the full shift
X := {0, 1}Z is not balanced, but χT M (X) for the Thue-Morse substitution
χTM is balanced on letters but not even balanced on 2-words. This example
shows that balancedness on letters is not sufficient for Lemma 6.23.
Adding machines are minimal isometries, and therefore uniquely ergodic
by Theorem 6.22. For the more general class of Toeplitz shifts, the question of
unique ergodicity is more interesting. The following result (requiring regularity;
see Section 4.5.1) is by Jacobs & Keane [332].
Theorem 6.25. Every regular Toeplitz shift is uniquely ergodic.
Proof. The original result is [332, Theorem 5]; we follow [381, Theo-
rem 4.78], using the notation from Theorem 4.92. In particular, Li =
lcm(q1 , . . . , qi ), where (qi )i∈N is the periodic structure of the Toeplitz se-
quence. Let u ∈ L(x) be an arbitrary word; then for i so large that Li > |u|,
the frequency $\mu_i(u) := \frac{1}{L_i} |V(i)|_u$ of the word u in V (i) is a lower bound for
$\inf_{j \ge 0} \liminf_n \frac1n \#\{1 \le k \le n : x_{j+k+1} \dots x_{j+k+|u|} = u\}$, whereas $\mu_i + \frac{|u| + r_i}{L_i}$
Proof. Since every word w ∈ Ln reoccurs with gap ≤ Ln, Birkhoff’s Ergodic
Theorem 6.13 implies that μ([w]) ≥ 1/(Ln) > 0 for every ergodic invariant
measure μ. Therefore Theorem 6.28 applies.
Proof. The implication ⇒ uses the same argument as in the previous proof.
The reverse implication ⇐ is due to Boshernitzan (unpublished) and appears
in [74]12 . Let u ∈ Ln be arbitrary, and let N = N (n, u) be the length of the
longest return word x = x1 x2 . . . xN associated to u. Note that N > n be-
cause otherwise x is periodic; see Example 4.3. For n ≤ k ≤ N , let Uk be the
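collection of words ending in x1 . . . xk . Then
$\mu(\bigcup_{v \in U_k} [v]) = \mu([x_1 \dots x_k]) \ge \hat\varepsilon / k$.
Since the sets Uk are all disjoint (after all u cannot reappear in the
return word x), it follows that
\[ 1 \ge \sum_{k=n}^{N} \frac{\hat\varepsilon}{k} \ge \int_n^{N+1} \frac{\hat\varepsilon}{x} \, dx \ge \hat\varepsilon \log \frac{N}{n}. \]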
12 I am grateful to Fabien Durand for pointing this out to me and the streamlined argument.
Proof. Since every primitive substitution shift is minimal, see Theorem 4.17,
it remains to prove unique ergodicity. Recall that
|w|u = #{1 ≤ i ≤ |w| − |u| + 1 : wi wi+1 . . . wi+|u|−1 = u}
stands for the number of occurrences of u in w. We claim that for every non-
empty u ∈ L(X), there is a frequency pu ∈ (0, 1) and a function εu (n) → 0
as n → ∞ such that
\[ \left| \frac{|w|_u}{|w| - |u| + 1} - p_u \right| \le \varepsilon_u(|w|) \quad \text{for all } w \in L(X). \tag{6.5} \]
If μ is a σ-invariant probability measure on X, then for every non-empty
u ∈ L(X) and every n ∈ N, we have
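\begin{align*}
|\mu([u]) - p_u| &= \left| \int_X \frac1n \sum_{j=0}^{n-1} 1_{[u]}(\sigma^j(x)) \, d\mu - p_u \right| \\
&\le \int_X \left| \frac1n |x_{[1, n+|u|]}|_u - p_u \right| d\mu \\
&\le \int_X \varepsilon_u(n) \, d\mu = \varepsilon_u(n) \to 0.
\end{align*}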
Therefore there is only one σ-invariant probability measure given by μ([u]) =
pu (and extended to the Borel σ-algebra by the Kolmogorov Extension The-
orem).
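For single letters, the frequencies pa can be observed directly from the letter counts of χ^n(b). The Python sketch below does this for the Fibonacci substitution 0 → 01, 1 → 0 (our choice of example), where the limiting frequencies are 1/φ and 1/φ² with φ the golden mean; only the counts are propagated, not the words themselves, and the function name is hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # leading eigenvalue of the Fibonacci substitution matrix

def letter_counts(n, start="0"):
    """Return (|chi^n(start)|_0, |chi^n(start)|_1) for chi: 0 -> 01, 1 -> 0."""
    c0, c1 = (1, 0) if start == "0" else (0, 1)
    for _ in range(n):
        # every 0 produces one 0 and one 1; every 1 produces one 0
        c0, c1 = c0 + c1, c0
    return c0, c1

for n in (5, 10, 20):
    c0, c1 = letter_counts(n)
    print(n, c0 / (c0 + c1), 1 / PHI)
```

The error decays geometrically in n, as one expects from a gap between the leading eigenvalue and the second eigenvalue of the substitution matrix.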
Now the proof of the claim (6.5) is fairly direct from Theorem 8.58, at
least for single letters u = a ∈ A. Indeed, let A be the matrix associated to
the substitution χ. Then $A^n_{b,a} = |\chi^n(b)|_a$. By the Perron-Frobenius Theorem
8.58, there are 1 < ρ < λ (the leading eigenvalue of A) and C > 0 such that
\[ \big| |\chi^n(b)|_a - \lambda^n p_a q_b \big| \le C \rho^n, \]
where p and q are the left and right leading eigenvectors of A, scaled so that
$\sum_{a \in \mathcal{A}} p_a q_a = 1$ (and, as the next step uses, $\sum_{a \in \mathcal{A}} p_a = 1$). Also, by the triangle inequality,
\[ \big| |\chi^n(b)| - \lambda^n q_b \big| = \Big| \sum_{a \in \mathcal{A}} \big( |\chi^n(b)|_a - p_a \lambda^n q_b \big) \Big| \le \sum_{a \in \mathcal{A}} \big| |\chi^n(b)|_a - \lambda^n p_a q_b \big| \le C \, \#\mathcal{A} \, \rho^n \]
and
\[ \big| |\chi^n(b)|_a - p_a |\chi^n(b)| \big| \le \big| |\chi^n(b)|_a - \lambda^n p_a q_b \big| + p_a \big| |\chi^n(b)| - \lambda^n q_b \big| \le C' \rho^n \]
for C′ = C(1 + pa #A). Therefore we have
\[ \left| \frac{|\chi^n(b)|_a}{|\chi^n(b)|} - p_a \right| \le C'' (\rho/\lambda)^n \to 0, \]
for some maximal n such that / kvn = and each/ vk and vk has length ≤ L :=
maxa∈A |χ(a)|. Therefore /|χ (vk )|a − pa λk / ≤ LC ρk and the same holds
for each χk (vk ). Additionally, there can be at most 2n|u| appearances of |u|
in between the words χk (vk ) and χk (vk ) in (6.6). Altogether
\[ \big| |w|_a - p_a |v| \big| \le 2n|u| + 2LC' \sum_{k=0}^{n} \rho^k \le C''' \rho^n. \]
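\[ K(n) = (k_{v,w}(n))_{v \in V_{n-1}, w \in V_n}, \qquad k_{v,w}(n) = m_{v,w}(n) \frac{h_v(n-1)}{h_w(n)}. \tag{6.7} \]
Therefore
\[ \tilde p(n-1) = K(n) \tilde p(n), \tag{6.8} \]
and
\[ \sum_{v \in V_{n-1}} k_{v,w}(n) = \sum_{v \in V_{n-1}} m_{v,w}(n) \frac{h_v(n-1)}{h_w(n)} = \frac{1}{h_w(n)} \sum_{v \in V_{n-1}} h_v(n-1) m_{v,w}(n) = \frac{h_w(n)}{h_w(n)} = 1, \]
as claimed.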
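The computation above is easy to reproduce in exact arithmetic. The following Python sketch builds the heights and the matrices K(n) from given incidence matrices; we assume the indexing M(n)[v][w] = number of edges from v ∈ Vn−1 to w ∈ Vn , and the function name `heights_and_K` is ours. The Fibonacci diagram with M(1) = (1 1) and M(n) having rows (1 1), (1 0) serves as input.

```python
from fractions import Fraction

def heights_and_K(incidence):
    """incidence[n-1] is M(n), with M(n)[v][w] = number of edges from v in V_{n-1} to w in V_n.

    Returns the height vectors h(n) and the matrices K(n), n >= 1, where
    k_{v,w}(n) = m_{v,w}(n) * h_v(n-1) / h_w(n) as in (6.7).
    """
    h = [1]  # h(0): a single path of length 0 at v0
    heights, Ks = [], []
    for M in incidence:
        rows, cols = range(len(M)), range(len(M[0]))
        new_h = [sum(h[v] * M[v][w] for v in rows) for w in cols]
        K = [[Fraction(M[v][w] * h[v], new_h[w]) for w in cols] for v in rows]
        heights.append(new_h)
        Ks.append(K)
        h = new_h
    return heights, Ks

fib_step = [[1, 1], [1, 0]]
heights, Ks = heights_and_K([[[1, 1]], fib_step, fib_step])
print(heights)
print(Ks[2])
```

Each column of each K(n) sums to 1, which is the statement just proved.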
Example 6.33. To illustrate this lemma, we repeat the telescoping of Ex-
ample 5.30 with probabilistic incidence matrices:
2
12 1 3
K(1, 3) = 1 1 1 1 = 1 .
2 0 3
Let $\Sigma_n = \{x \in \mathbb{R}^{\#V_n}_{\ge 0} : \sum_{j=1}^{\#V_n} x_j = 1\}$ be the unit simplex in $\mathbb{R}^{\#V_n}_{\ge 0}$. The
ergodic measures correspond to the extremal points of the sets
\[ S_n := \bigcap_{j \ge 1} K(n+1) \cdot K(n+2) \cdots K(n+j)(\Sigma_{n+j}). \]
If r is the rank of the BV-system, then #Vn = r infinitely often, and Sn can
have no more than r extremal points for every n. This proves that (X, τ )
cannot preserve more than r ergodic measures.
Lemma 6.35. Let K(n) = (kv,w )v∈Vn−1 ,w∈Vn be the probabilistic matrices of
a Bratteli diagram as defined in (6.7). Define
\[ \rho_n := \max_{w, w' \in V_n} \sqrt{ \frac{\max\{k_{v,w}/k_{v,w'} : v \in V_{n-1}\}}{\min\{k_{v,w}/k_{v,w'} : v \in V_{n-1}\}} } \]
for n ≥ 1. If $\sum_{n=1}^{\infty} 1/\rho_n = \infty$, then the BV-system is uniquely ergodic.
Proof. This goes as the proof of Proposition 6.34, but now because of
Lemma 8.61, there is a unique solution to (6.8) if $\sum_n 1/\rho_n = \infty$. Hence,
under this assumption, unique ergodicity follows. See also [121] and [80, Section 4].
This result gives another proof that minimal linearly recurrent shifts are
uniquely ergodic, because (after telescoping to make the transition matrices
M (n) ≥ 1), the M (n) are still bounded, and also the entries of K(n)
are bounded and bounded away from zero. Therefore supn ρn < ∞ and
$\sum_{n=1}^{\infty} 1/\rho_n = \infty$. In fact, we have a type of exponential mixing:
Lemma 6.36. Assume that the Bratteli diagram of a minimal linearly re-
current (with constant L) shift is telescoped so that its transition matrices
are strictly positive. Then there exist C > 0 and β = β(L) ∈ (0, 1) such that
for every 0 ≤ k ≤ n and v ∈ Vn−k , w ∈ Vn ,
\[ \left| \frac{\mu([v] \cap [w])}{\mu([w])} - \mu([v]) \right| \le C \beta^k. \]
Here [v] = {x ∈ XBV : t(xn−k ) = v} and [w] = {x ∈ XBV : t(xn ) = w}
are cylinder sets.
Proof. Let K = K(n − k + 1) · · · K(n) and let Σn denote the unit simplex
in $\mathbb{R}^{\#V_n}_{\ge 0}$. Then there is β ∈ (0, 1) (which can be derived from (8.29)), such
that diam(K(Σn )) ≤ Cβ k diam(K). This means that for each v ∈ Vn−k , the
entries Kv,w are no more than Cβ k apart from each other or from any of
their convex combinations. That is,
\begin{align*}
|\mu([v] \cap [w]) - \mu([v]) \mu([w])| &= \Big| K_{v,w} \mu([w]) - \sum_{w' \in V_n} K_{v,w'} \mu([w']) \mu([w]) \Big| \\
&\le C \beta^k \mu([w]).
\end{align*}
Now divide by μ([w]) to get the lemma.
Example 6.37. Let F0 , F1 , F2 , F3 , F4 , . . . = 1, 1, 2, 3, 5, . . . be the Fibonacci
numbers. For the Fibonacci Bratteli-Vershik system of Figure 5.13 (i.e. the
diagram is stationary with M (1) = (1 1) and $M(n) = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$ for every n ≥ 2),
we find hvn (n) = Fn and hwn (n) = Fn−1 for the vertex sets Vn = {vn , wn }.
Therefore
\[ K(n) = \begin{pmatrix} \frac{F_{n-1}}{F_n} & 1 \\ \frac{F_{n-2}}{F_n} & 0 \end{pmatrix}. \]
After telescoping
\[ K(n-1) K(n) = \begin{pmatrix} \frac{2F_{n-2}}{F_n} & \frac{F_{n-2}}{F_{n-1}} \\ \frac{F_{n-3}}{F_n} & \frac{F_{n-3}}{F_{n-1}} \end{pmatrix} \]
by Binet’s formula (8.5). Lemma 6.35 shows that the Fibonacci Bratteli-
Vershik system is uniquely ergodic. Indeed, we can compute
\[ \tilde\rho_n := \rho(K(2n-1) K(2n)) = \sqrt{ \frac{2F_{2n-1}/F_{2n}}{F_{2n-1}/F_{2n}} } = \sqrt 2, \]
so $\sum_n 1/\tilde\rho_n = \infty$.
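The value √2 in Example 6.37 can be double-checked with exact rational arithmetic. The sketch below is only an illustration: it hard-codes the matrices K(n) of the example, computes the telescoped product, and evaluates the square of the quantity ρ from Lemma 6.35 (the ratio under the square root); all function names are ours.

```python
from fractions import Fraction

def fib(n):
    """Fibonacci numbers with F_0 = F_1 = 1."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def K(n):
    """K(n) of Example 6.37, for n >= 2."""
    return [[Fraction(fib(n - 1), fib(n)), Fraction(1)],
            [Fraction(fib(n - 2), fib(n)), Fraction(0)]]

def matmul(A, B):
    """Plain matrix product for lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def rho_squared(mat):
    """max over column pairs w != w' of max_v(k_vw/k_vw') / min_v(k_vw/k_vw').

    Requires all entries of `mat` to be strictly positive.
    """
    best = Fraction(0)
    cols = range(len(mat[0]))
    for w in cols:
        for w2 in cols:
            if w != w2:
                ratios = [row[w] / row[w2] for row in mat]
                best = max(best, max(ratios) / min(ratios))
    return best

for n in range(2, 6):
    print(n, rho_squared(matmul(K(2 * n - 1), K(2 * n))))
```

The printed value is 2 for every n, so ρ̃n = √2 independently of n, in agreement with the example.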
Lemma 6.35 gives a sufficient condition for unique ergodicity, but not a necessary
one; see Example 6.39 below. An “if and only if” condition for unique
ergodicity is the following.
Theorem 6.38. A Bratteli-Vershik system (XBV , τ ) is uniquely ergodic if
and only if, after sufficient telescoping,
\[ \lim_{n\to\infty} \max_{w \neq w' \in V_n} \sum_{v \in V_{n-1}} |k_{vw}(n) - k_{vw'}(n)| = 0. \]
6.3. Unique Ergodicity 275
This is [77, Theorem 3.1]13 and this paper also contains a description of
BV-systems with any fixed number of ergodic measures.
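Example 6.39. Assume that we have a rank 2 simple Bratteli diagram with
Vn = {vn , wn } and incidence matrices
\[ M(1) = (1, 1) \quad \text{and} \quad M(n) = \begin{pmatrix} a_n - 1 & 1 \\ 1 & a_n - 1 \end{pmatrix} \quad \text{for } a_n \ge 2,\ n \ge 2. \]
Then $h_{v_n}(n) = h_{w_n}(n) = \prod_{j=1}^{n} a_j$. This gives
\[ K(n) = \begin{pmatrix} 1 - \varepsilon_n & \varepsilon_n \\ \varepsilon_n & 1 - \varepsilon_n \end{pmatrix}, \qquad \varepsilon_n = \frac{1}{a_n}, \quad n \ge 2. \]
We compute that $\rho_n = \frac{1 - \varepsilon_n}{\varepsilon_n}$ for ρn as in Lemma 6.35, so if $\sum_n \varepsilon_n = \infty$ (hence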
is given.
Proof. The cutting times (Sk )k≥1 form an enumeration scale with Sk =
Sk−1 + SQ(k) . The boundedness of k − Q(k) implies that Sk → ∞ exponentially,
and therefore condition (2) in Theorem 6.40 is satisfied.
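The recursion Sk = Sk−1 + SQ(k) is easy to run. In the Python sketch below we assume S0 = 1 and, as a hypothetical kneading map with k − Q(k) bounded, Q(k) = max(k − 3, 0); this choice reproduces the values S0 , . . . , S5 = 1, 2, 3, 4, 6, 9.

```python
def cutting_times(Q, n):
    """Return [S_0, ..., S_n] with S_0 = 1 and S_k = S_{k-1} + S_{Q(k)}."""
    S = [1]
    for k in range(1, n + 1):
        S.append(S[k - 1] + S[Q(k)])
    return S

def Q(k):
    """Hypothetical kneading map with k - Q(k) <= 3."""
    return max(k - 3, 0)

S = cutting_times(Q, 30)
print(S[:6])
print(S[30] / S[29])  # consecutive ratios stay bounded away from 1
```

Since SQ(k) is at least Sk−3 for this choice of Q, the sequence satisfies Sk ≥ Sk−1 + Sk−3 , which forces exponential growth.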
Unique ergodicity is not given for minimal critical ω-limit sets; see [121].
In fact, the collection of kneading maps is so rich that for every simplex Σ
with a compact totally disconnected set of extremal points, there is a knead-
ing map Q such that the corresponding system (ω(c), f ) has its Choquet
simplex homeomorphic to Σ; see [168]. This goes to the extent that the
invariant measures can form a Poulsen simplex.
Note that a : XG → XG need not be continuous, so the existence of an
a-invariant measure does not follow from the Krylov-Bogolyubov Theorem
6.2. Theorem 6.40 shows, however, that unique ergodicity holds whenever
lim inf j Gj+1 /Gj > 1, so also if e.g. Gj = 2Gj−1 + 1 for all j ≥ 1, even
though in this case a is not continuous. As shown in [121, 168], there are
non-uniquely ergodic enumeration systems,
Example 6.42. Condition (1) in Theorem 6.40 is fairly strict. Having
$k/G_k \to 0$ is not enough, as [49, Example 6] shows. Here we choose
$J_n = \sum_{k=1}^{n} k!$ for n ∈ N, G0 = 1, and
\[ G_j = \begin{cases} (n+1) G_{j-1} + 1 & \text{if } j = J_n, \\ G_{j-1} + 1 & \text{otherwise.} \end{cases} \]
In this case a : XG → XG is discontinuous at (0) (because Gj − Gj−1 = 1
infinitely often). Also (m)0 = 0 for each GJn ≤ m ≤ GJn+1 , but one can
compute that (m)0 = 1 for a definite proportion of the integers GJn−1 <
m < GJn . As a result, the cylinder [1] has no well-defined visit frequency for
the a-orbit of (0), or in fact of any x ∈ XG . Therefore a admits no invariant
measure.
uniquely ergodic (in fact, it is an IET on three pieces with eigenvalue −1, and
therefore T 2 on five pieces allows two independent T -invariant functions).
Keane [348] found a class of examples with four pieces, and this is the
minimal possible number, because an IET with d pieces can have at most
d/2 ergodic probability measures; see [345,347,542]. Chaika & Masur [151]
gave an example of a six piece IET with two ergodic measures and one extra
generic (but non-ergodic) measure.
We start with Keane’s counterexample; the technique of proving non-
unique ergodicity is similar to what is discussed in Section 6.3.3. This ex-
ample T has permutation π : {1, 2, 3, 4} → {3, 2, 4, 1} and lengths λi = |Δi |
satisfying λ3 > λ4 > λ1 ; see Figure 6.2.
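To experiment with this example, one can code an IET directly from its permutation and length vector. The Python sketch below reads π as "Δi is moved to position π(i)" and uses the hypothetical lengths (1/10, 2/10, 4/10, 3/10), which satisfy λ3 > λ4 > λ1 ; these are not Keane's actual parameters, and the function name is ours.

```python
from fractions import Fraction

PI = {1: 3, 2: 2, 3: 4, 4: 1}  # Delta_i is moved to position PI[i]
LENGTHS = {1: Fraction(1, 10), 2: Fraction(2, 10),
           3: Fraction(4, 10), 4: Fraction(3, 10)}  # hypothetical lengths

def make_iet(pi, lengths):
    """Return the interval exchange T determined by pi and lengths."""
    d = len(lengths)
    left, pos = {}, Fraction(0)
    for i in range(1, d + 1):          # left endpoints of Delta_1, ..., Delta_d
        left[i] = pos
        pos += lengths[i]
    inverse = {pi[i]: i for i in pi}   # inverse[j] = interval sitting at position j after T
    image_left, pos = {}, Fraction(0)
    for j in range(1, d + 1):          # left endpoints of the images, read left to right
        image_left[inverse[j]] = pos
        pos += lengths[inverse[j]]

    def T(x):
        for i in range(1, d + 1):
            if left[i] <= x < left[i] + lengths[i]:
                return x - left[i] + image_left[i]
        raise ValueError("x must lie in [0, total length)")

    return T

T = make_iet(PI, LENGTHS)
print(T(Fraction(0)), T(Fraction(1, 10)), T(Fraction(3, 10)), T(Fraction(7, 10)))
```

With these conventions the images appear in the order Δ4 , Δ2 , Δ1 , Δ3 , and T translates Δ2 by λ4 − λ1 .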
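[Figure 6.2: the intervals Δ1 , Δ2 , Δ3 , Δ4 and their images in the order Δ4 , Δ2 , Δ1 , Δ3 , with the sub-pieces d, c, b, a of Δ4 and the iterate counts m − 1 and n; diagram not reproduced.]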
The first return map to Δ4 is again an IET with four pieces and permutation
π′ = π provided we number the sub-pieces of Δ4 in reverse order.
Since λ4 > λ1 , T maps the right part Δ′ of Δ4 into Δ2 , and then T translates
T (Δ′ ) some m − 1 ∈ N times within Δ2 until it covers the right end-point
γ2 of Δ2 . Thus there is a decomposition Δ′ = b ∪ a into intervals such
that γ2 is the common boundary point of T m (b) and T m (a). Since T |Δ2
is a translation over λ4 − λ1 = |Δ′ |, T m+1 (b) is adjacent to the other side
of T m (a). Now the adjacent intervals T m (a), T m+1 (b) are mapped n times
within Δ3 (for some n ∈ N) before returning to Δ4 . The left part Δ′′ of
Δ4 is first mapped onto Δ1 , then into Δ3 , and after n − 1 more iterates it
covers γ3 . Thus there is a decomposition Δ′′ = d ∪ c into intervals such that
γ3 is the common boundary point of T n+1 (d) and T n+1 (c). The interval c
is now back in Δ4 and d needs one more iterate to return. As a whole, the
itineraries of the four sub-pieces of Δ4 before returning to Δ4 are described
From the proof it is clear that there are actually uncountably many such
IETs. Whether Lebesgue measure is ergodic for any of them is still an open
question.
Despite these examples, the prevalent case is that IETs are uniquely
ergodic.
Keane & Rauzy showed in [351] that a residual set of the IETs in
$\bigcup_{d \ge 2} \Sigma^{d-1} \times S_d$ is uniquely ergodic. In [348], Keane stated the conjecture
that Lebesgue-a.e. IET is uniquely ergodic, and this was proven in separate
papers (but in the same issue of the Annals of Mathematics) by Veech [543]
and Masur [410].
[Figure: Rauzy graphs; the vertices are permutations, among them (12), (132), (13), (123), (1234), (1324), (14), (13)(24), (1432), (1423), (1342), (142), (143), (14)(23), (1243), (124), (134), and each edge is labeled by its type 0 or 1.]
The restrictions of Θ to the half-simplices $\Sigma^{d-1}_{0/1}$ are expanding diffeomorphisms.
This expansion is achieved by the normalization (i.e. division by
$1 - \lambda_d$ or $1 - \lambda_{\pi^{-1}(d)}$) and therefore it is not uniform. Indeed, $\Sigma_0^{d-1}$ has
a hyperplane $\{\lambda_d = 0\}$ and $\Sigma_1^{d-1}$ has a hyperplane $\{\lambda_{\pi^{-1}(d)} = 0\}$ of neutral
points, which can in fact contain fixed points. To overcome this lack
of expansion, we can accelerate Θ (i.e. take an induced map); the result is called Zorich
acceleration: $Z : \Sigma^{d-1} \times S_d \to \Sigma^{d-1} \times S_d$, defined as follows. Let
\[
\tau(\lambda) = \min\{n \ge 1 : \Theta^n \text{ changes type at } \lambda\} \quad\text{and}\quad Z(\lambda, \pi) = \Theta^{\tau(\lambda)}(\lambda, \pi).
\]
There are countably many connected components of level sets of τ, and it
can be shown that Z is uniformly expanding on each of them. The following
theorem, after Veech [543] and Masur [410], is the main ingredient for the
proof of the Keane conjecture.
Proof. Assume we are in some Rauzy class R; the proof for every Rauzy
class is the same. Rauzy induction Θ “removes” the rightmost interval of
length λd or λπ−1 (d) , whichever is shorter. So by applying Θ repeatedly,
each interval j will eventually become rightmost, so some χ-image will have
j as second letter; see Table 4.3 in Section 4.4. Hence, we can find an open
set U on which Z N is continuous and U visits all parts Σd−1 × {π} in R
sufficiently often in these N iterates that the telescoping of the corresponding
substitution χ1 ◦· · ·◦χN has a strictly positive transition matrix A. Therefore
ρ(A) as computed in (8.29) is a fixed positive number, and consequently, A is
a strict contraction in Hilbert metric. By Birkhoff’s Ergodic Theorem 6.13,
μZ -a.e. (λ, π) ∈ R visits U infinitely often, so we can apply Lemma 8.61 and
conclude that such (λ, π) corresponds to a uniquely ergodic IET. Since μZ is
equivalent to Lebesgue measure × counting measure, the Keane conjecture
follows.
Remark 6.46. The collection of non-uniquely ergodic IETs of d pieces has
Lebesgue measure 0 in (d − 1)-dimensional parameter space, but their Hausdorff
dimension is equal to d − 3/2 for d ≥ 4; see [37, 152]. Dimension d = 2, 3
is too low to get any non-unique ergodicity other than via a rational relation
between the lengths of the pieces (for d = 2, i.e. circle rotations, this
means a rational rotation number) and therefore the Hausdorff dimension of
non-uniquely ergodic IETs is d − 2.
Remark 6.47. Katok (quoted in [165]) showed that for every IET, uniquely
ergodic or not, Lebesgue measure is not mixing; see Section 6.7. Avila &
Forni [40] showed that typical IETs are weak mixing. This comes after the
result of Nogueira & Rudolph [432] that generic IETs have no continuous
eigenfunctions apart from constant functions, and results by Sinaı̆ & Ulci-
grai [514] showing that IETs for which the Rauzy induction has a certain
type of periodicity are weak mixing. Conditions ensuring that IETs have a
continuous spectrum and satisfy Sarnak’s conjecture were given in [131] and
[342], respectively.
with $\varphi(0) := \lim_{x \searrow 0} \varphi(x) = 0$. Clearly $\varphi'(x) = -(1 + \log x)$, so ϕ(x) assumes
its maximum at 1/e and ϕ(1/e) = 1/e. Also $\varphi''(x) = -1/x < 0$, so ϕ is
strictly concave.
Given a finite partition P of a probability space (X, μ), let
where we can ignore the partition elements with μ(P ) = 0 because ϕ(0) = 0.
For a T -invariant probability measure μ on (X, B, T ) and a partition P,
define the entropy of μ w.r.t. P as
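\[
h_\mu(T, \mathcal{P}) = \lim_{n \to \infty} \frac{1}{n} H_\mu\Big( \bigvee_{k=0}^{n-1} T^{-k} \mathcal{P} \Big). \tag{6.11}
\]
This limit exists by Fekete’s Lemma 1.15 and the fact that $H_\mu\big( \bigvee_{k=0}^{n-1} T^{-k} \mathcal{P} \big)$
is subadditive in n; see [551, Corollary 4.9.1]. Finally, the measure-theoretic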
entropy of μ is
15 That is, (X, B, μ) is isomorphic to ([0, 1], Leb) ⊔ (countable set with counting measure).
and [127]. The matrix A in that example is the transition matrix of a countably piecewise
Markov map T : [0, 1] → [0, 1] such that the slope |T′| > 4 wherever defined. Yet the entropy is
$h_\mu(T) < h_{top}(T) = \log 4$. Similar examples can be found in [94, 95, 421].
Any two natural extensions are isomorphic, see [456, page 13], so it
makes sense to speak of the natural extension. Sometimes natural extensions
have explicit formulas, e.g. the baker transformation
\[
b : [0, 1]^2 \to [0, 1]^2, \qquad b(x, y) = \begin{cases} (2x, \frac{y}{2}) & \text{if } x \le \frac12, \\ (2x - 1, \frac{1+y}{2}) & \text{if } x > \frac12 \end{cases}
\]
is the natural extension of the doubling map T2 (x) = 2x mod 1. There is
also a general construction: Set
Y = {(xi )i≥0 : T (xi+1 ) = xi ∈ X for all i ≥ 0}
with S(x0 , x1 , . . . ) = (T (x0 ), x0 , x1 , . . . ). Then S is invertible (with the left
shift σ = S −1 ) and
\[
\nu(A_0, A_1, A_2, \dots) = \inf_i \mu(A_i) \quad \text{for } (A_0, A_1, A_2, \dots) \subset S
\]
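The relation between the baker transformation and the doubling map can be checked by direct computation. The following is a minimal sketch in exact rational arithmetic; the function names and the starting point are illustrative choices, and the boundary line x = 1/2 (where the two formulas disagree on a set of measure zero) is avoided.

```python
from fractions import Fraction


def doubling(x):
    """Doubling map T2(x) = 2x mod 1."""
    return (2 * x) % 1


def baker(x, y):
    """Baker transformation b on the unit square."""
    if x <= Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (1 + y) / 2


def baker_inverse(x, y):
    """Inverse of b, valid away from the line y = 1/2."""
    if y < Fraction(1, 2):
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1


# Follow one orbit; the first coordinate should evolve under the doubling map.
point = (Fraction(3, 7), Fraction(2, 5))
orbit = [point]
for _ in range(6):
    orbit.append(baker(*orbit[-1]))

first_coordinate_commutes = all(
    orbit[k + 1][0] == doubling(orbit[k][0]) for k in range(6)
)
# b is invertible, unlike the doubling map itself.
round_trip_ok = all(baker_inverse(*baker(*p)) == p for p in orbit)
```

The second coordinate records the past of the first one, which is what makes b invertible while the doubling map is two-to-one.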
see Definition 6.65 below, because these systems are isomorphic to Bernoulli
shifts [261]. Ornstein’s Theorem also holds for infinite alphabet shifts; see
[209]. A short and elegant proof was given by Downarowicz & Serafin [216].
Ornstein’s Theorem strengthened a result by Sinaı̆ [511] from 1962:
Theorem 6.61 (Sinaı̆’s Theorem). Every ergodic measure-preserving trans-
formation (X, B, T, μ) with entropy hμ (T ) has every p-Bernoulli shift with
hμp (σ) ≤ hμ (T ) as measure-theoretic factor.
Sinaı̆’s Theorem says, for example, that if two Bernoulli shifts (with
probability vectors p and p′) have the same entropy, then there are measure-preserving
factor maps ψ and ψ′ from the one to the other and vice versa.
But this leaves unanswered whether ψ′ = ψ^{−1}. Ornstein’s Theorem settles
this in the positive.
We stress that (unlike Sinaı̆’s Theorem) Ornstein’s Theorem holds only for
two-sided shifts, because in the one-sided shift setting the number of
preimages is, almost surely, preserved under isomorphisms. Walters [550]
showed that the one-sided (p1, . . . , pm)-Bernoulli shift is isomorphic to the
(p′1, . . . , p′n)-Bernoulli shift if and only if m = n and (p1, . . . , pm) is a permutation
of (p′1, . . . , p′n).
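The contrast between the two-sided and one-sided settings can be illustrated with a pair of probability vectors of equal entropy. The sketch below uses the standard formula −Σ p_i log p_i for the entropy of a p-Bernoulli shift; the two vectors are an illustrative choice, not taken from the text.

```python
import math


def bernoulli_entropy(p):
    """Entropy -sum p_i log p_i of the p-Bernoulli shift (natural logarithm)."""
    assert abs(sum(p) - 1.0) < 1e-12
    return -sum(q * math.log(q) for q in p if q > 0)


p = [1 / 4, 1 / 4, 1 / 4, 1 / 4]
p_prime = [1 / 2, 1 / 8, 1 / 8, 1 / 8, 1 / 8]

h = bernoulli_entropy(p)
h_prime = bernoulli_entropy(p_prime)

# Equal entropy: by Ornstein's Theorem the two-sided shifts are isomorphic.
same_entropy = math.isclose(h, h_prime)
# Different alphabets: by Walters' result the one-sided shifts are not.
same_up_to_permutation = sorted(p) == sorted(p_prime)
```

Both entropies equal log 4, while the vectors have different lengths, so the two criteria give opposite answers.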
The isomorphism for the two-sided setting is very complicated and has
nothing to do with sliding block codes (no continuity is required). The
proof of the existence of the isomorphism by Ornstein is not constructive,
but in 1979, Keane & Smorodinsky [352] (sketched also in [456]) gave a
constructive proof showing that the isomorphism can be made finitary.
Definition 6.62. A factor map ψ : (X, μ) → (Y, ν) is called finitary if one
of the following equivalent properties holds:
• ψ is continuous μ-a.e.
• For μ-a.e. x ∈ X, there is N = N (x) such that the zeroth entry of
ψ(x) depends only on [x−N , . . . , xN ]. In this sense, a finitary factor
map is a sliding block code with window size depending on x.
If ψ is invertible ν-a.e. and ψ −1 satisfies the above two properties, then ψ is
a finitary isomorphism.
For β-transformations and tent maps (or any interval map T of constant
slope s > 1, so htop (T ) = log s), the absolutely continuous (w.r.t.
Lebesgue measure) invariant probability measures are also the measures of
maximal entropy. This follows from the Rokhlin formula (6.14). See Re-
mark 3.70 and Example 3.92 for precise formulas for these measures.
Full shifts on A = {0, . . . , d − 1} have a unique measure of maximal entropy,
namely the (1/d, . . . , 1/d)-Bernoulli measure. A generalization of Bernoulli
measures for SFTs is Markov measures. For such measures, the probability
of $x_k$ depends on the value of $x_{k-1}$ but not on the further past $\dots x_{k-3}, x_{k-2}$.
Definition 6.65. Let A = {0, . . . , d − 1} be our alphabet. Define a d × d
probability transition matrix $P = (p_{ij})_{i,j=0}^{d-1}$ where all rows are probability
vectors. Let $\pi \in \mathbb{R}^d$ be a probability row-vector. The measure defined
on cylinders as
\[
\mu_\pi([x_0 \dots x_n]) = \pi_{x_0} p_{x_0 x_1} p_{x_1 x_2} \cdots p_{x_{n-1} x_n}
\]
and extended to the Borel σ-algebra B of $A^{\mathbb{N}_0}$ by the Kolmogorov Extension
Theorem is called a Markov measure. It is shift-invariant if πP = π, and in that case (provided P
is irreducible) ergodic.
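The cylinder formula of Definition 6.65 is easy to evaluate by hand or by machine. The sketch below uses a hypothetical 2 × 2 transition matrix and its stationary vector (an assumption made so that πP = π holds) and checks two things: cylinders of a fixed length have total mass 1, and the mass of a cylinder equals the mass of its preimage under the shift.

```python
import math
from itertools import product

# Hypothetical example: transitions of the golden mean shift type.
P = [[0.5, 0.5],
     [1.0, 0.0]]
pi = [2 / 3, 1 / 3]  # stationary row-vector: pi P = pi


def cylinder_measure(word):
    """mu_pi([x_0 ... x_n]) = pi_{x_0} p_{x_0 x_1} ... p_{x_{n-1} x_n}."""
    mass = pi[word[0]]
    for a, b in zip(word, word[1:]):
        mass *= P[a][b]
    return mass


def total_mass(length):
    """Sum of the measures of all cylinders of the given length."""
    return sum(cylinder_measure(w) for w in product(range(2), repeat=length))


def preimage_measure(word):
    """Measure of sigma^{-1}[word], the union of the cylinders [a word]."""
    return sum(cylinder_measure((a,) + word) for a in range(2))


invariant_on_cylinders = all(
    math.isclose(preimage_measure(w), cylinder_measure(w), abs_tol=1e-12)
    for w in product(range(2), repeat=3)
)
```

Replacing `pi` by a non-stationary probability vector makes the invariance check fail, which is why stationarity is needed.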
For subshifts of finite type, Shannon [495] and Parry [446] (see also
[364, Section 6.2] and [346, Section 4.4]) demonstrated how to construct
the measure of maximal entropy. Let (ΣA , σ) be a subshift of finite type on
\[
\sum_{i=0}^{d-1} u_i v_i = 1.
\]
\[
p_i := u_i v_i = \mu_{SP}([i]), \qquad p_{i,j} := \frac{A_{i,j} v_j}{\lambda v_i} = \mu_{SP}([ij] \mid [i]),
\]
Proof. This measure was introduced by Shannon [495], and Parry showed
in [446] that it is indeed the unique measure. In this proof, we will only show
that hμSP (σ) = htop (σ) = log λ and skip the (more complicated) uniqueness
part; see [550, Theorem 8.10].
The definitions of the masses of 1-cylinders and 2-cylinders are compat-
ible, because (since v is a right eigenvector)
Summing over i, we get $\sum_{i=0}^{d-1} \mu_{SP}([i]) = \sum_{i=0}^{d-1} u_i v_i = 1$, due to our scaling.
17 In fact, if $k = A_{ij} \in \mathbb{N} \setminus \{1\}$, we can interpret this as k paths from state i to state
j. The theory doesn’t change.
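\begin{align*}
\mu_{SP}(\sigma^{-1} Z) &= \sum_{i=0}^{d-1} \mu_{SP}([i\, i_m \dots i_n]) = \sum_{i=0}^{d-1} \frac{p_i\, p_{i,i_m}}{p_{i_m}}\, \mu_{SP}([i_m \dots i_n]) \\
&= \mu_{SP}([i_m \dots i_n]) \sum_{i=0}^{d-1} u_i v_i\, \frac{A_{i,i_m} v_{i_m}}{\lambda v_i}\, \frac{1}{u_{i_m} v_{i_m}} \\
&= \mu_{SP}(Z) \sum_{i=0}^{d-1} \frac{u_i A_{i,i_m}}{\lambda u_{i_m}} = \mu_{SP}(Z)\, \frac{\lambda u_{i_m}}{\lambda u_{i_m}} = \mu_{SP}(Z).
\end{align*}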
This invariance carries over to all sets in the σ-algebra B generated by the
cylinder sets.
Based on the interpretation of conditional probabilities, the identities
\[
\begin{cases}
\sum_{i_{m+1},\dots,i_n=0}^{d-1} p_{i_m}\, p_{i_m,i_{m+1}} \cdots p_{i_{n-1},i_n} = p_{i_m}, \\[4pt]
\sum_{i_m,\dots,i_{n-1}=0}^{d-1} p_{i_m}\, p_{i_m,i_{m+1}} \cdots p_{i_{n-1},i_n} = p_{i_n}
\end{cases} \tag{6.16}
\]
follow because the left-hand side indicates the total probability of starting
in state $i_m$ and reaching some state after n − m steps, respectively, starting
at some state and reaching state $i_n$ after n − m steps.
To compute hμSP (σ), we will confine ourselves to the partition P of 1-
cylinder sets; this partition is generating, so this restriction is justified by
Theorem 6.48.
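\begin{align*}
H_{\mu_{SP}}\Big( \bigvee_{k=0}^{n-1} \sigma^{-k} \mathcal{P} \Big)
&= - \sum_{\substack{i_0,\dots,i_{n-1}=0 \\ A_{i_k,i_{k+1}}=1}}^{d-1} \mu_{SP}([i_0 \dots i_{n-1}]) \log \mu_{SP}([i_0 \dots i_{n-1}]) \\
&= - \sum_{\substack{i_0,\dots,i_{n-1}=0 \\ A_{i_k,i_{k+1}}=1}}^{d-1} p_{i_0}\, p_{i_0,i_1} \cdots p_{i_{n-2},i_{n-1}} \big( \log p_{i_0} + \log p_{i_0,i_1} + \dots + \log p_{i_{n-2},i_{n-1}} \big) \\
&= - \sum_{i_0=0}^{d-1} p_{i_0} \log p_{i_0} - (n-1) \sum_{i,j=0}^{d-1} p_i\, p_{i,j} \log p_{i,j},
\end{align*}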
The first term in the brackets is zero because Ai,j ∈ {0, 1}. The second term
(summing first over i) simplifies to
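\[
- \sum_{j=0}^{d-1} \frac{\lambda u_j v_j}{\lambda} \log v_j = - \sum_{j=0}^{d-1} u_j v_j \log v_j,
\]
Hence these two terms cancel each other. The remaining term is
\[
\sum_{i,j=0}^{d-1} \frac{u_i A_{i,j} v_j}{\lambda} \log \lambda = \sum_{i=0}^{d-1} \frac{u_i \lambda v_i}{\lambda} \log \lambda = \sum_{i=0}^{d-1} u_i v_i \log \lambda = \log \lambda.
\]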
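The conclusion that the Shannon-Parry measure has entropy log λ can be verified numerically for a small matrix. The sketch below is one possible way to do this, assuming a primitive 0/1 matrix so that plain power iteration converges; the helper names are hypothetical, and the entropy of the Markov measure is computed as −Σ p_i p_{i,j} log p_{i,j}, the coefficient of n − 1 in the computation above.

```python
import math


def leading_eigen(A, iterations=2000):
    """Power iteration: leading eigenvalue and positive right eigenvector."""
    d = len(A)
    v = [1.0 / d] * d  # kept normalized so that sum(v) == 1
    lam = 1.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(d)) for i in range(d)]
        lam = sum(w)  # equals the eigenvalue once v has converged
        v = [x / lam for x in w]
    return lam, v


def shannon_parry(A):
    """Return (lambda, p, P) with p_i = u_i v_i and p_ij = A_ij v_j / (lambda v_i)."""
    d = len(A)
    lam, v = leading_eigen(A)
    transpose = [[A[j][i] for j in range(d)] for i in range(d)]
    _, u = leading_eigen(transpose)  # left eigenvector of A
    scale = sum(u[i] * v[i] for i in range(d))
    u = [x / scale for x in u]  # scaling so that sum u_i v_i = 1
    p = [u[i] * v[i] for i in range(d)]
    P = [[A[i][j] * v[j] / (lam * v[i]) for j in range(d)] for i in range(d)]
    return lam, p, P


def markov_entropy(p, P):
    """Entropy -sum p_i p_ij log p_ij of the Markov measure."""
    d = len(p)
    return -sum(
        p[i] * P[i][j] * math.log(P[i][j])
        for i in range(d) for j in range(d) if P[i][j] > 0
    )


A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
lam, p, P = shannon_parry(A)
entropy = markov_entropy(p, P)
```

For this matrix λ = 2 and the entropy comes out as log 2, in line with the computation in the proof.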
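[Figure 6.5: left panel, an example with transition matrix A = (0 1 1; 1 0 1; 1 1 0) and λ = 2; right panel, an example with transition matrix A = (1 1; 1 0) and λ = ½(1 + √5).]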
the left panel of Figure 6.5 (see footnote 18), but from the measure-theoretic viewpoint this
doesn’t matter. As a piecewise expanding map, T preserves a probability
measure μ ≪ Leb with density $\rho = \frac{d\mu}{d\,\mathrm{Leb}}$ constant on each $P_i$.
If we denote the lengths of the partition elements by $v_i = |P_i|$ and set
$u_i = \rho|_{P_i}$, then $\sum_i u_i v_i = \sum_i \mu(P_i) = 1$. By the Rokhlin formula (6.14)
the entropy is $h_\mu(T) = \int \log |T'| \, d\mu = \log \lambda = h_{top}(T)$, so μ is a measure of
maximal entropy. Also
\[
\sum_{j=0}^{d-1} A_{ij} v_j = \mathrm{Leb}(T(P_i)) = \lambda |P_i| = \lambda v_i,
\]
for all $x \in P_j^\circ$. An intuitive way to see this is that the expansion factor λ
of T dilutes the density by a factor 1/λ, but summing over all preimages of
the interval $P_j$ gives (6.17). Thus u and v are left and right eigenvectors of
A for the leading eigenvalue λ. Finally $A_{ij} \frac{v_j}{\lambda v_i}$ is the relative measure of the
subinterval of $P_i$ that is mapped to $P_j$ by T.
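Example 6.68. We carry out the computation for the Fibonacci SFT with
associated matrix
\[
A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and leading eigenvalue } \lambda = \gamma = \tfrac12 (1 + \sqrt{5}).
\]
In this case, we can let T be the tent map $T_\gamma(x) = \min\{\gamma x, \gamma(1 - x)\}$; see
the right panel of Figure 6.5. Then $P_0 = [\frac12, \frac{\gamma}{2}]$ has length $v_0 = \frac12(\gamma - 1)$ and
$P_1 = [\frac12(\gamma - 1), \frac12]$ has length $v_1 = \frac12(2 - \gamma)$. Solving
\[
\mu_0 = u_0 v_0, \quad \mu_1 = u_1 v_1, \quad \mu_0 + \mu_1 = 1, \quad u_1 = u_0 / \gamma,
\]
we find $\mu_0 = \frac{1}{3 - \gamma} = \frac{\gamma^2}{\gamma^2 + 1}$ and $\mu_1 = \frac{1}{\gamma^2 + 1}$, which is in agreement with what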
Theorem 6.67 would provide. Although the Fibonacci substitution χFib has
the same associated matrix A, μ does not describe the frequency of symbols
0, 1 or of words in the Fibonacci substitution shift (XFib , σ). This is because
the fixed point ρFib = 0100101001001 . . . of χFib is not the itinerary of a
μ-typical point.
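The difference between the two notions of frequency can be made concrete. The sketch below, assuming the Fibonacci substitution in the form 0 → 01, 1 → 0 (which reproduces the prefix 0100101001001 quoted above), compares the frequency of the symbol 0 in a long prefix of the fixed point with the mass μ0 = γ²/(γ² + 1) of the measure of maximal entropy.

```python
import math


def fibonacci_word(iterations):
    """Iterate the substitution 0 -> 01, 1 -> 0 starting from the letter 0."""
    word = "0"
    for _ in range(iterations):
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word


gamma = (1 + math.sqrt(5)) / 2
word = fibonacci_word(25)

# Frequency of 0 along the substitution fixed point (tends to 1/gamma).
frequency_of_0 = word.count("0") / len(word)
# Mass of the cylinder [0] for the measure of maximal entropy of the SFT.
mu_0 = gamma ** 2 / (gamma ** 2 + 1)
```

The two numbers are roughly 0.618 and 0.724, so the substitution fixed point is indeed not typical for μ.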
The next example applies this to S-gap shifts and finds their measure of
maximal entropy.
18 Since in the example of Figure 6.5 the matrix has zeros on the diagonal, there is no fixed
Example 6.69. Let the directed graph G consist of a vertex $v_0$ from which
$q_\ell$ loops of length ℓ emerge. Let (X, σ) be the corresponding SFT. We want
to find the Shannon-Parry measure. First we can replace the directed graph
by one with $\sum_\ell \ell q_\ell$ vertices $v_{i,j}$, $1 \le i \le L := \sum_\ell q_\ell$ and $1 \le j \le \ell$ if i is
the index of a loop of length ℓ (let us write $\ell = \ell_i$ in this case). For such i,
the edges of the graph are $v_{i,j} \to v_{i,j+1}$ if $j < \ell_i$ and $v_{i,\ell_i} \to v_{i',1}$ for each
i′ ∈ {1, . . . , L}. Thus the collection $R = \{v_{i,1} : i = 1, \dots, L\}$ is a rome; see
Definition 8.71. Once in vertex $v_{i,1}$ it takes $\ell_i$ steps to return to R, and the
first return map to R has the full graph as transition graph.
By Theorem 8.73, htop (σ) = log λ, where λ is the unique positive solution
to
\[
\sum_{\ell \ge 1} q_\ell \lambda^{-\ell} = \sum_{i=1}^{L} \lambda^{-\ell_i} = 1. \tag{6.18}
\]
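Equation (6.18) rarely has a closed-form solution, but it is easy to solve numerically. The sketch below reads (6.18) as the condition that the sum of λ^{−ℓ_i} over the loop lengths ℓ_i equals 1 (this reading, and the function name, are assumptions) and finds λ by bisection, using that the left-hand side is decreasing in λ.

```python
def rome_eigenvalue(loop_lengths, tolerance=1e-12):
    """Solve sum_i lam**(-l_i) = 1 for lam >= 1 by bisection."""
    def excess(lam):
        return sum(lam ** (-length) for length in loop_lengths) - 1.0

    # excess(1) = L - 1 >= 0 and excess(L + 1) < 0 for loops of length >= 1.
    low, high = 1.0, float(len(loop_lengths)) + 1.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if excess(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# One loop of length 1 and one of length 2: the golden mean.
golden = rome_eigenvalue([1, 2])
# Two loops of length 1: the full shift on two symbols.
full_two_shift = rome_eigenvalue([1, 1])
```

The topological entropy is then the logarithm of the returned value.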
Clearly, uniquely ergodic systems are intrinsically ergodic, and for zero
entropy systems, the two notions are equivalent. But for positive entropy,
most intrinsically ergodic dynamical systems are not uniquely ergodic.
As shown in Theorem 2.88, specification implies intrinsic ergodicity. For
this reason, irreducible SFTs19 , irreducible sofic shifts (Theorem 3.48), and
factors thereof [556] are all intrinsically but not uniquely ergodic. Intrinsic
19 However, for SFTs in higher dimension, intrinsic ergodicity can fail [133, 134].
ergodicity need not hold for SFTs on infinite alphabets, as the next example
shows.
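Example 6.71. Take the infinite alphabet N and let the infinite transition
matrix $A = (A_{i,j})_{i,j \in \mathbb{N}}$ be given by
\[
A_{i,j} = \begin{cases} 1 & \text{if } j \ge i - 1, \\ 0 & \text{if } j < i - 1, \end{cases}
\qquad
A = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & \dots \\
1 & 1 & 1 & 1 & 1 & \dots \\
0 & 1 & 1 & 1 & 1 & \\
0 & 0 & 1 & 1 & 1 & \\
0 & 0 & 0 & 1 & 1 & \\
\vdots & & & \ddots & \ddots &
\end{pmatrix}.
\]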
Then htop (σ) = log 4, but there is no measure of maximal entropy. For the
proof, see [127].
There are weaker versions of specification that still imply intrinsic er-
godicity. This approach has been used to show that β-shifts (Corollary 3.69)
and unimodal shifts [159, 315] (see Example 3.92) are intrinsically ergodic.
Some further results, not relying on (any form of) specification, follow:
• S-gap shifts are intrinsically ergodic; see [159].
• Coded shifts and their factors are intrinsically ergodic under the
conditions given by Theorem 3.48.
• The hereditary B-free subshift $(X_B^{her}, \sigma)$ is intrinsically ergodic ([378]
and [231, Theorem J]), and B-free shifts themselves are also often
intrinsically ergodic (such as the B-free shift for $B = \{p^2 :
p \text{ prime}\}$; see [451]), but $(X_B, \sigma)$ need not be intrinsically ergodic
if $h_{top}(X_\eta) > 0$ [379].
On the other hand, there exist transitive and even minimal shifts that are
not intrinsically ergodic; see e.g. [306] and [194, Example 27.2] where there
are infinitely many measures of maximal entropy.
Definition 6.72. A dynamical system (X, T ) is called entropy dense if for
every invariant measure μ, there is a sequence of ergodic measures μn such
that μn → μ in the weak∗ topology and the entropies hμn (T ) → hμ (T ).
Obviously, uniquely ergodic systems are entropy dense, but there are
many more systems which have this property for non-trivial reasons.
• Every dynamical system with specification is entropy dense; see
[234] with an extended result in [458, Theorem 2.1] and [459,536].
Thus topological mixing unimodal maps are entropy dense (they
have specification; see [91, 92]). Weaker versions of specification
hold for β-shifts, and this can be used to show that also β-shifts are
entropy dense [458, Theorem 2.1 and Proposition 5.1].
• Every transitive dynamical system with the shadowing property is
entropy dense, see [383, Corollary 31].
• General conditions for entropy denseness in the context of hyperbolic
measures were given in [283], and for B-free shifts in [344].
Remark 6.73. Entropy denseness implies that the Choquet simplex is
Poulsen, which in its turn implies that the Choquet simplex is arc-wise con-
nected. The reverse implications are not true in general. For Dyck shifts,
the Choquet simplex is arc-wise connected, but not Poulsen; see Proposi-
tion 3.134. In [369] it was shown that the set of ergodic measures of a
hereditary shift is arc-wise connected, but according to [379] not necessarily
Poulsen. In [271, Proposition 4.29] examples are given where the Poulsen
simplex is not entropy dense; using [207] one can create minimal shifts with
this property.
6.7. Mixing
Whereas a Bernoulli process consists of totally independent trials, mixing
refers to an asymptotic independence:
Definition 6.74. A dynamical system (X, B, μ, T ) preserving the probabil-
ity measure μ is called mixing (or strongly mixing) if
(6.20) μ(T −n (A) ∩ B) → μ(A)μ(B) as n → ∞
for every A, B ∈ B.
Lemma 6.75. Every Bernoulli system is mixing.
Proof. Take A, B ⊂ X measurable such that μ(A), μ(B) > 0 and inf{d(a, b) :
a ∈ A, b ∈ B} =: ε > 0. Next take any n such that $d(T^n(x), x) < \varepsilon$ for all
x ∈ X. Then $A \cap T^n(B) = \emptyset$, so $\mu(T^{-n}(A) \cap B) = 0 \ne \mu(A)\mu(B)$. Since n
can be arbitrarily large, $\mu(T^{-n}(A) \cap B) \not\to \mu(A)\mu(B)$.
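For a Bernoulli shift, condition (6.20) of Definition 6.74 can be checked directly on cylinder sets, where it holds with equality as soon as the two windows of coordinates no longer overlap. The sketch below does this by brute-force enumeration for a one-sided Bernoulli measure on two symbols; the probability vector and the test words are illustrative choices.

```python
import math
from itertools import product

p = {0: 0.3, 1: 0.7}  # hypothetical probability vector


def cylinder(word):
    """Bernoulli measure of the cylinder [word]."""
    mass = 1.0
    for symbol in word:
        mass *= p[symbol]
    return mass


def correlation(a, b, n):
    """mu(sigma^{-n}[a] ∩ [b]), i.e. x starts with b and x_n x_{n+1} ... spells a."""
    length = max(len(b), n + len(a))
    total = 0.0
    for x in product(p, repeat=length):
        if x[:len(b)] == b and x[n:n + len(a)] == a:
            total += cylinder(x)
    return total


a, b = (0, 1), (1, 1, 0)
# Once n >= len(b) the two windows are disjoint and the events are independent.
independent_from_n3 = all(
    math.isclose(correlation(a, b, n), cylinder(a) * cylinder(b))
    for n in range(3, 7)
)
```

For smaller n the windows overlap and the correlation can differ from the product; for instance it vanishes when the two words are incompatible on the overlap.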
6.7.1. Linearly Recurrent Shifts and Mixing. Dekking & Keane [191]
gave the first general proof that constant length substitution shifts are never
strongly mixing. A short and more general result [167] states that no lin-
early recurrent shift20 can be strongly mixing, and this includes all primitive
substitution shifts.
Theorem 6.78. A linearly recurrent shift (X, σ) is not mixing w.r.t. its
unique21 invariant probability measure.
\[
D(n) = [u_m] \cap \sigma^{h_n}([u_m]) \quad \text{and} \quad E(n) = \{0 \le j < h_{n-1} : \sigma^j([u_{n-1}]) \subset [u_m]\}.
\]
20 In fact, [167] applies to any linearly recurrent mapping of the Cantor set.
21 Recall from Corollary 6.29 that linear recurrent shifts are uniquely ergodic.
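(X, σ) gives $\lim_{n \to \infty} \frac{\#E(n)}{h_n} = \mu([u_m])$. Combining all of the above, we get
\begin{align*}
\liminf_{n \to \infty} \mu(D(n)) &\ge \liminf_{n \to \infty} \#E(n)\, \mu([u_n]) \\
&\ge \liminf_{n \to \infty} h_{n-1}\, \mu([u_n])\, \mu([u_m]) && \text{(by Theorem 4.4(iii))} \\
&\ge \liminf_{n \to \infty} \frac{h_n}{L}\, \mu([u_n])\, \mu([u_m]) && \text{(bounded gaps)} \\
&\ge \frac{h_n}{L}\, \frac{1}{L h_n}\, \mu([u_m]) && \text{(by (6.23))} \\
&> \mu([u_m])^2 && \text{(by the choice of } m\text{)}.
\end{align*}
Also,
\begin{align*}
\mu(D(n)) &= \mu([u_m] \cap \sigma^{h_n}([u_m])) = \mu(\sigma^{-h_n}([u_m] \cap \sigma^{h_n}([u_m]))) \\
&= \mu(\sigma^{-h_n}([u_m]) \cap \sigma^{-h_n} \circ \sigma^{h_n}([u_m])).
\end{align*}
But $h_{top}(\sigma|_X) = h_\mu(\sigma) = 0$, so by Proposition 6.52, σ is invertible μ-a.s.
Therefore $\mu([u_m] \,\triangle\, \sigma^{-h_n} \circ \sigma^{h_n}([u_m])) = 0$ and thus
\[
\liminf_{n \to \infty} \mu(\sigma^{-h_n}([u_m]) \cap [u_m]) > \mu([u_m])^2.
\]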
6.7.2. Cutting and Stacking and Mixing. A similar proof as in the previous
section holds for cutting and stacking systems of finite rank and with
a bound on the number of layers of spacer.
Theorem 6.79. Let (Δ, T, μ) be a finite rank cutting and stacking system
such that at each step of the construction, at most s layers of spacer are
inserted between stacked slices. Then (Δ, T, μ) is not strongly mixing.
Proof. Let w(n) be the number of stacks $\Delta_i(n)$ at step n in the construction,
and let $h_i(n)$ be their heights, so $\Delta_i(n) = \bigcup_{j=0}^{h_i(n)-1} \Delta_i^j(n)$, 1 ≤ i ≤ w(n).
Also let hmin (n) = mini hi (n) and hmax (n) = maxi hi (n). By speeding up
the cutting and stacking construction, we can assume that hmin (n) ≥ 2n .
Let Sn be the spacer left at step n; the assumption on hmin (n) also implies
that μ(Sn ) = O(2−n ). Because the system is of finite rank r, 1 ≤ w(n) ≤ r
for all n, but there need not be a fixed upper bound on the number of slices
each stack is cut into.
−r
Choose ε ∈ (0, 2rs ). Take m so large that εhmin (m) ≥ 100(1 + s),
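$\mu(S_m) < \varepsilon/100$, and $e^{-4s/2^m} \ge \frac89$. Now let
Clearly $\frac43 \varepsilon \ge \mu(A) \ge \frac34 \varepsilon$. We claim
\[
a_n := \min_{i=1,\dots,w(n)} \left\{ \frac{\mu(A \cap \Delta_i(n))}{\mu(\Delta_i(n))} \right\} \ge \frac{2\varepsilon}{3} \quad \text{for all } n \ge m.
\]
By the choice of A, $a_m \ge \frac34 \varepsilon$. Each of the slices into which each stack $\Delta_i(m)$
is cut receives at most s layers of spacer. Therefore $a_{m+1} \ge a_m \frac{h_{min}(m)}{h_{min}(m) + s} \ge
a_m \big(1 - \frac{s}{h_{min}(m)}\big)$. By induction
\begin{align*}
a_n &\ge a_m \prod_{j=m}^{n-1} \left(1 - \frac{s}{h_{min}(j)}\right) \ge \frac{3\varepsilon}{4} \exp\left( \sum_{j=m}^{\infty} \log\left(1 - \frac{s}{h_{min}(j)}\right) \right) \\
&\ge \frac{3\varepsilon}{4} \exp\left( -2s \sum_{j=m}^{\infty} 2^{-j} \right) = \frac{3\varepsilon}{4}\, e^{-4s/2^m} \ge \frac{2\varepsilon}{3},
\end{align*}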
Note that although in Theorem 6.79 every stack can receive no more
than s layers of spacer throughout the procedure, we don’t require a bound
on the number of slices a stack is cut into. Without the bound s, and still
without a bound on the number of slices, strong mixing may be achieved.
This was first hinted at by Ornstein in [436]. In detail, in the n-th step
of the construction, we cut the stack in z(n) slices and add i − 1 layers of
spacer to the i-th slice before stacking them (slice i on top of slice i − 1) into
22 See [5, 172, 244]; the conjecture was probably stated by Meir Smorodinsky in personal
discussions with Nat Friedman. Ferenczi, building on the example from a symbolic point of view,
states that the map was proposed by Smorodinsky but not published before [6].
[Figure: the stack Δ(n) with a layer of spacer on top, split into D1 (the top t levels, containing a level I ⊂ Ē1; drawn for t = 11 and j + 1 = 4) and D2 (the remaining h(n) − t levels).]
Since E1 and E2 are disjoint, we can add (6.25) and (6.26) to obtain
\[
\mu(T^{i\rho_n}(E) \cap E) \to \mu(E)^2 \quad \text{as } n \to \infty.
\]
By T-invariance of μ, we obtain for $i_1 - i_2 = i$,
\[
\mu(T^{i_1 \rho_n}(E) \cap T^{i_2 \rho_n}(E)) = \mu(T^{(i_1 - i_2)\rho_n}(E) \cap E) \to \mu(E)^2,
\]
as n → ∞. Choosing $\ell_n \in \mathbb{N}$ sufficiently small that the above convergence
holds uniformly over $1 \le i_1, i_2 < \ell_n$, $i_1 \ne i_2$, but still so that $\ell_n \to \infty$, we
arrive at
\begin{align*}
\int_X \Big| \frac{1}{\ell_n} \sum_{i=0}^{\ell_n - 1} 1_E(T^{-i\rho_n}(x)) - \mu(E) \Big| \, d\mu
&\le \sqrt{ \int_X \Big| \frac{1}{\ell_n} \sum_{i=0}^{\ell_n - 1} 1_E(T^{-i\rho_n}(x)) - \mu(E) \Big|^2 \, d\mu } \qquad \text{(by Cauchy-Schwarz)} \\
&= \sqrt{ \int_X \frac{1}{\ell_n^2} \sum_{i,j=0}^{\ell_n - 1} \big(1_E(T^{-i\rho_n}(x)) - \mu(E)\big) \big(1_E(T^{-j\rho_n}(x)) - \mu(E)\big) \, d\mu } \\
&= \sqrt{ \int_X \frac{1}{\ell_n^2} \sum_{i,j=0}^{\ell_n - 1} 1_E(T^{-i\rho_n}(x))\, 1_E(T^{-j\rho_n}(x)) - \mu(E)^2 \, d\mu }.
\end{align*}
Since the terms in this sum tend to 0 for i ≠ j and the terms with i = j are
only an $\ell_n$-th part of the whole, the average tends to 0 as n → ∞. This is
what we needed to prove.
Theorem 6.82. Let (X, T, μ) be a staircase cutting and stacking system such
that z(n) → ∞ and $z(n)^2/h(n) \to 0$. Then (X, T, μ) is strongly mixing.
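The growth condition in Theorem 6.82 can be explored numerically. The sketch below assumes the height recursion h(n + 1) = z(n)h(n) + z(n)(z(n) − 1)/2, which follows from the description of the staircase (slice i receives i − 1 layers of spacer) but is not quoted from the text; the choice z(n) = n + 1 and the starting height are hypothetical.

```python
def staircase_heights(z, steps, initial_height=1):
    """Heights h(1), ..., h(steps) of a staircase construction with z(n) cuts.

    Assumed recursion: cutting into z(n) slices and putting i - 1 spacer
    levels on slice i adds 0 + 1 + ... + (z(n) - 1) levels in total.
    """
    heights = [initial_height]
    for n in range(1, steps):
        cuts = z(n)
        heights.append(cuts * heights[-1] + cuts * (cuts - 1) // 2)
    return heights


def z(n):
    """Hypothetical cutting sequence with z(n) -> infinity."""
    return n + 1


heights = staircase_heights(z, 10)
# The quantity z(n)^2 / h(n) from Theorem 6.82, for n = 1, ..., 10.
ratios = [z(n) ** 2 / h for n, h in enumerate(heights, start=1)]
```

With this choice the heights grow factorially, so the ratio z(n)²/h(n) drops below 10⁻⁴ within ten steps.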
Note that for any ε > 0 we can take n0 ∈ N so large and A′, B′ consisting
of full levels of Δ(n0) such that the symmetric differences satisfy
μ(A△A′), μ(B△B′) < ε. Then A′, B′ also consist of full levels of Δ(n) for
all n ≥ n0. If (6.27) holds for every such A′, B′ and ε > 0, then (6.27) holds
for A and B as well. So there is no loss of generality to work with sets A, B
that consist of full levels of Δ(n) for every n ≥ n0.
Next take m ∈ N arbitrary, and let n ∈ N be such that
\[
h(n) \le m = k_n h(n) + t_n < h(n+1), \qquad 1 \le k_n \le z(n), \quad 0 \le t_n < h(n).
\]
We can assume that m is so large that n ≥ n0 .
Divide Δ(n) into pieces D1 , D2 , D3 where
D1 = {kn + 1 rightmost slices of Δ(n)},
D2 = {top tn levels of Δ(n) \ D1 },
D3 = {bottom h(n) − tn levels of Δ(n) \ D1 };
see Figure 6.7.
Mixing on D1: Since the levels of D1 get interspersed with layers of
spacer, D1 occupies the top $(k_n + 1)h(n) + (z(n) - 1) + (z(n) - 2) + \dots +
(z(n) - k_n - 1) = (k_n + 1)h(n) + \frac12 (k_n + 1)(2z(n) - k_n - 2)$ levels of Δ(n + 1).
[Diagram: on the left, Δ(n) with spacer on top, divided into D1, D2 (top $t_n$ levels) and D3 (bottom $h(n) - t_n$ levels), with levels I ⊂ Ā1 and I ⊂ Ā2 marked; on the right, Δ(n + 1), where D1 sits at the top, D2 and D3 are interspersed with spacer below it, and the image $T^m(I)$ is marked.]
Figure 6.7. The staircase represented as Δ(n) (left) and Δ(n + 1) (right).
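\begin{align*}
& \frac{1}{z(n) - k_n - 1} \sum_{i=0}^{z(n) - k_n - 2} \mu(T^{-i(k_n+1)}(A_2^*) \cap B) - \mu(A_2^*)\mu(B) \\
&\quad = \frac{1}{z(n) - k_n - 1} \sum_{i=0}^{z(n) - k_n - 2} \mu(A_2^* \cap T^{i(k_n+1)}(B)) - \mu(A_2^*)\mu(B) \\
&\quad = \frac{1}{z(n) - k_n - 1} \sum_{i=0}^{z(n) - k_n - 2} \int_{A_2^*} 1_B \circ T^{-i(k_n+1)} - \mu(B) \, d\mu \\
&\quad \le \int_X \Big| \frac{1}{z(n) - k_n - 1} \sum_{i=0}^{z(n) - k_n - 2} 1_B \circ T^{-i(k_n+1)} - \mu(B) \Big| \, d\mu.
\end{align*}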
n −1
Choose kn ≥ 1 minimal such that kn (kn + 1) ≥ h(p). Then z(n)−k
kn →∞
as well. By Lemma 6.81 we can choose a sequence n → ∞ so slowly that
n −1
an := z(n)−k
n k → ∞, but
n
/ /
0 // n −1
/
/
/1 (k +1)
−ikn /
(6.28) / 1B ◦ T n
− μ(B)/ dμ → 0 as n → ∞.
X // n i=0
/
/
−ikn
g◦S
0 // n −1
/
/ 0 // n −1
/
/
/1
−ikn / /1 +j
−ikn /
/ g◦S / dμ = / g◦S − μ(B)/ dμ
X / n i=0
/ X / n i=0
/
for 0 ≤ j < kn . Taking the average of this expression over j = 0, . . . , kn − 1,
we find
/ /
0 // n −1 /
/ 0 / kn −1 n −1 /
/1 / / 1 1 +j /
/ g◦S −ikn
/ dμ = / g◦S −ikn / dμ
/ /
X / n i=0 / X / kn j=0 n i=0 /
/ /
0 / /
/ 1 kn n −1 /
(6.29) = / −i /
g ◦ S / dμ.
/ k n
X/ n /
i=0
where in the last line we used the opposite argument of (6.29). Combining
this result with (6.29) and (6.28), we get the required convergence. This
finishes the proof for D2 .
Mixing on D3: The argument here is the same as for D2, except that
$T^m(I)$ now only goes through the roof $k_n$ times.
Recall that, since this staircase example uses only one stack, it is of
rank 1. In [244] the word-complexity of a small variation of this example is
computed. Namely, instead of (6.24), the recursion is
for every A, B ∈ B.
for all A, B ∈ B. Note the absence of absolute value bars compared to (6.32).
Proof. (1) ⇔ (2): Use Lemma 8.53 for ai = |μ(T −i (A) ∩ B) − μ(A)μ(B)|.
(2) ⇒ (3): For every A, B, C, D ∈ B, there are subsets E1 and E2 of N
of zero density such that
\begin{align*}
\frac1n \sum_{i=0}^{n-1} \mu(T^{-i}(A) \cap B)\, \nu(S^{-i}(C) \cap D)
&= \frac1n \sum_{i=0}^{n-1} \mu(A)\mu(B)\, \nu(S^{-i}(C) \cap D) \\
&\quad + \frac1n \sum_{i=0}^{n-1} \big(\mu(T^{-i}(A) \cap B) - \mu(A)\mu(B)\big)\, \nu(S^{-i}(C) \cap D).
\end{align*}
By ergodicity of S (see Lemma 6.84), $\frac1n \sum_{i=0}^{n-1} \nu(S^{-i}(C) \cap D) \to \nu(C)\nu(D)$,
so the first term in the above expression tends to μ(A)μ(B)ν(C)ν(D). The
second term is majorized by $\frac1n \sum_{i=0}^{n-1} |\mu(T^{-i}(A) \cap B) - \mu(A)\mu(B)|$, which
tends to 0 because T is weak mixing.
(4) ⇒ (5): By assumption T × S is ergodic for the trivial map S : {0} →
{0}. Therefore T itself is ergodic, and hence T × T is ergodic.
In fact, if μ is ergodic, then σ(UT ) = S1 , but for the proof of this result
we refer to [429].
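\begin{align*}
0 &= \lim_{n \to \infty} \frac1n \sum_{i=0}^{n-1} |(U_T^i (f - (f, 1)), g)| = \lim_{n \to \infty} \frac1n \sum_{i=0}^{n-1} |(U_T^i f - (f, 1), g)| \\
&= \lim_{n \to \infty} \frac1n \sum_{i=0}^{n-1} |(U_T^i f, g) - (f, 1)(1, g)|.
\end{align*}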
Take f = 1A , g = 1B to get the definition of weak mixing.
Example 6.97. Circle rotations $R_\alpha$, of any rotation angle α ∈ [0, 1), are
neither mixing nor weakly mixing. To prove non-mixing, set A = B =
[0, 1/3]. There are infinitely many n such that $R_\alpha^{-n}(A) \cap B \supset [0, 1/4]$, so
$1/4 \le \mu(R_\alpha^{-n}(A) \cap B) \not\to \mu(A)\mu(B) = 1/9$. Furthermore, $R_\alpha$ has a non-constant
eigenfunction $\psi : S^1 \to \mathbb{C}$ defined as $\psi(x) = e^{2\pi i x}$ because $\psi \circ
R_\alpha(x) = e^{2\pi i (x + \alpha)} = e^{2\pi i \alpha} \psi(x)$. Therefore $R_\alpha$ is not weakly mixing.
Since the Sturmian shift with rotation number α ∈ [0, 1] \ Q (with its
unique invariant probability measure) is isomorphic to (S1 , Rα , μ), the ab-
sence of (weak) mixing carries over to the Sturmian shift.
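The failure of mixing for a circle rotation can be seen in a short computation. The sketch below takes A = B = [0, 1/3] as in Example 6.97 and the golden rotation number as an illustrative choice of α; since both arcs have length 1/3 < 1/2, their overlap is 1/3 minus the circular distance between them (or 0 if that is negative).

```python
import math

alpha = (math.sqrt(5) - 1) / 2  # illustrative irrational rotation number
arc = 1 / 3                     # length of A = B = [0, 1/3]


def overlap(n):
    """Lebesgue measure of R_alpha^{-n}(A) ∩ A for A = [0, 1/3]."""
    shift = (n * alpha) % 1.0
    distance = min(shift, 1.0 - shift)  # circular distance between the arcs
    return max(0.0, arc - distance)


N = 10000
values = [overlap(n) for n in range(1, N + 1)]

# Times at which the overlap is at least 1/4, far above mu(A)mu(B) = 1/9.
large_returns = [n for n, v in enumerate(values, start=1) if v >= 0.25]
# The Cesaro average still converges to 1/9, reflecting ergodicity.
cesaro_average = sum(values) / N
```

The overlaps keep returning to values near 1/3, so they cannot converge to 1/9, although their averages do.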
The structure theorem by Halmos & von Neumann [299] associates pure
point spectrum to group rotations:
Theorem 6.100. An ergodic probability measure-preserving system
(X, B, μ, T ) on a compact metric space has pure point spectrum if and only
if it is isomorphic to a rotation on a compact metrizable abelian group G
with Haar measure μG , so there is g0 ∈ G such that T x = φ−1 (φ(x) + g0 ),
where φ : X → G is the isomorphism.
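can choose δ > 0 such that d(x, y) < δ implies that $|f_n(x) - f_n(y)| < \varepsilon/2$ for
all n ≤ N. Therefore
\[
\rho(x, y) = \sum_{n=1}^{N} \frac{1}{2^n} |f_n(x) - f_n(y)| + \sum_{n > N} \frac{1}{2^n} |f_n(x) - f_n(y)|
< \sum_{n=1}^{N} \frac{1}{2^n} \frac{\varepsilon}{2} + \frac{\varepsilon}{2} < \varepsilon.
\]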
Thus (X, ρ), as a continuous image of the compact space (X, d), is compact
itself. Also T is assumed to be transitive, so by Exercise 2.28 it is minimal
(and in fact uniformly rigid; see Lemma 2.30). It remains to give (X, ρ) a
group structure. Fix x0 ∈ X, and define a homomorphism h : Z → orb(x0 )
by h(n) = T n (x0 ). Since T is an isometry on (X, ρ), it is easy to check that
the addition on Z transfers to a uniformly continuous action on orb(x0 ) and
$T(x) = h(h^{-1}(x) + 1)$. But $\overline{\mathrm{orb}(x_0)} = X$, so this action extends continuously
to X and the group G is the compactification of Z in the topology that Z
inherits from (X, ρ) via h−1 .
⇐: Let Ĝ be the group of characters of G; i.e. each γ ∈ Ĝ is a continuous
function γ : G → S1 such that γ(g1 + g2 ) = γ(g1 )γ(g2 ) for all g1 , g2 ∈ G.
Define T̂ : G → G as T̂ (g) = g + g0 , so ψ ◦ T̂ = T ◦ φ μG -a.e. Then
Proof. Since a group rotation is an isometry, it has zero topological entropy,
so Haar measure has zero entropy. By Theorem 6.100, and since isomorphisms preserve entropy,
each ergodic T -invariant measure has therefore zero entropy. Now use the
Variational Principle 6.63.
Definition 6.102. We say that $U_T$ has simple spectrum if there is a single
$h \in L^2(\mu)$ such that $\overline{\mathrm{Span}}(U_T^n h : n \in \mathbb{Z}) = L^2(\mu)$. In other words, for
the decomposition of Theorem 6.92, the sequence (hj ) consists of a single
function h.
Proof. Since the Sturmian shift with frequency α (with its unique invariant
probability measure) is isomorphic to (S1 , B, Leb, Rα ), it suffices to consider
the irrational circle rotation; it preserves Lebesgue measure Leb, so we take
μ to be Lebesgue measure. The Koopman operator $U_{R_\alpha}$ has eigenfunctions
$\psi_n : S^1 \to \mathbb{C}$ defined as $\psi_n(x) = e^{2\pi i n x}$, n ∈ Z, because
But the (ψn )n∈Z form the standard basis of Fourier modes spanning L2 (μ),
so URα has pure point spectrum.
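The eigenfunction property of the Fourier modes can be confirmed numerically. The sketch below checks that ψ_n ∘ R_α = e^{2πinα} ψ_n on a grid of sample points; the rotation number and the grid are illustrative choices.

```python
import cmath
import math

alpha = math.sqrt(2) - 1  # illustrative irrational rotation number


def rotation(x):
    """Circle rotation R_alpha(x) = x + alpha mod 1."""
    return (x + alpha) % 1.0


def psi(n, x):
    """Fourier mode psi_n(x) = exp(2 pi i n x)."""
    return cmath.exp(2j * math.pi * n * x)


def eigen_defect(n, x):
    """|psi_n(R_alpha x) - exp(2 pi i n alpha) psi_n(x)|, zero up to rounding."""
    eigenvalue = cmath.exp(2j * math.pi * n * alpha)
    return abs(psi(n, rotation(x)) - eigenvalue * psi(n, x))


max_defect = max(
    eigen_defect(n, k / 97) for n in range(-5, 6) for k in range(97)
)
```

The defect is of the order of floating-point rounding, so each ψ_n is an eigenfunction with eigenvalue e^{2πinα}.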
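Now for the simple spectrum part, irrational rotations $R_\alpha$ indeed have
a simple spectrum, but the Fourier modes, i.e. the eigenfunctions, don’t
play the role of h. Quite the opposite: take $h \in L^2(\mu)$ such that $\hat h(n) =
\int_0^1 h(x) e^{2\pi i n x} \, dx \ne 0$ for all n ∈ Z. We show that the orthogonal complement
$\mathrm{Span}(U_T^n h : n \in \mathbb{Z})^\perp = \{0\}$. Indeed, suppose that $g \in L^2(\mu)$ satisfies
$g \perp U_T^n h$ for all n ∈ Z. Write $h = \sum_j \hat h(j) e^{-2\pi i j x}$ and $g = \sum_k \hat g(k) e^{-2\pi i k x}$,
where both sequences of Fourier coefficients belong to $\ell^2(\mathbb{C})$. Then
\begin{align*}
0 = (U_T^n h, g) &= \Big( U_T^n \Big( \sum_{j \in \mathbb{Z}} \hat h(j) e^{-2\pi i j x} \Big), \sum_{k \in \mathbb{Z}} \hat g(k) e^{-2\pi i k x} \Big) \\
&= \Big( \sum_{j \in \mathbb{Z}} \hat h(j) e^{-2\pi i j (x + n\alpha)}, \sum_{k \in \mathbb{Z}} \hat g(k) e^{-2\pi i k x} \Big) \\
&= \sum_{j,k \in \mathbb{Z}} e^{-2\pi i j n \alpha} \big( \hat h(j) e^{-2\pi i j x}, \hat g(k) e^{-2\pi i k x} \big) = \sum_{j \in \mathbb{Z}} \hat h(j) \overline{\hat g(j)}\, e^{-2\pi i j n \alpha}.
\end{align*}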
There are sufficient conditions for substitution shifts to have pure point
spectrum, such as [519, Lemma 3.2] (see also Host’s results quoted in [465,
Section VI.27]). We state it for the associated BV-system:
The next theorem is the main result in [332] of Jacobs & Keane.
Theorem 6.108. A regular Toeplitz shift (X, σ) has pure point spectrum.
Proof. Let (X, σ), $X \subset A^{\mathbb{N} \text{ or } \mathbb{Z}}$, be our regular Toeplitz shift; from Theorem 6.25
we know that it preserves a unique probability measure μ. Let
x ∈ X be a sequence with skeletons $Sk_j(x)$ of periods $q_j$. Set
$r_j = \frac{1}{q_j} \#\{1 \le i \le q_j : Sk_j(x)_i = *\}$. By our assumption of regularity, $r_j \to 0$
as j → ∞.
First we need to modify the skeleton to $\widetilde{Sk}_j(x)$ as follows. If $Sk_j(x)_m = *$
for some $1 \le m \le q_j$ and $x_{m + k q_j} = a$ for all k, for a single a ∈ A, then set
$\widetilde{Sk}_j(x)_{m + k q_j} = a$ for all k. This new skeleton has period $\tilde q_j$ which is equal
to, or at least divides, $q_j$. Continue these modifications for all $1 \le m \le \tilde q_j$;
after a finite number of steps the procedure stabilizes to a skeleton $\widetilde{Sk}_j(x)$ of period $\tilde q_j$ with
the property that for each $1 \le m \le \tilde q_j$ with $\widetilde{Sk}_j(x)_m = *$, the sequence
$(x_{m + k \tilde q_j})_k$ contains at least two different letters from A. The corresponding
sequence $\tilde r_j \to 0$.
Next define
Clearly
\[ A^j_m = A^j_{m'} \ \text{ if } m \equiv m' \bmod \tilde q_j, \qquad A^j_m \cap A^j_{m'} = \emptyset \ \text{ otherwise,} \]
6.8. Spectral Properties 321
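and since $(X, \sigma)$ is minimal, $\bigcup_{m=1}^{\tilde q_j} A^j_m = X$. It follows that$^{27}$ $\{A^j_m\}_{m=1}^{\tilde q_j}$ is a
clopen partition of X, independently of the choice of $x \in X$. Also
\[ (6.35) \qquad \sigma(A^j_m) = A^j_{m+1 \bmod \tilde q_j} \quad \text{for all } 1 \le m \le \tilde q_j. \]
Let
\[ f_{r,j}(x) = \sum_{m=1}^{\tilde q_j} \lambda_{r,j}^m 1_{A^j_m}(x) \quad \text{for } \lambda_{r,j} = e^{2\pi i r/\tilde q_j}. \]
By (6.35), $f_{r,j} \circ \sigma = \lambda_{r,j} f_{r,j}$, so the $\lambda_{r,j}$'s are eigenvalues to the eigenfunctions
$f_{r,j}$ associated with the clopen partition $\{A^j_m\}_{m=1}^{\tilde q_j}$.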
To show that {fr,j }j∈N,1≤r≤q̃j spans L2 (μ), first note that because the
fr,j ’s are linear combinations of the indicator functions 1Ajm , the indicator
functions 1Ajm are linear combinations of the fr,j ’s. The Borel σ-algebra
is generated by the cylinder sets, so let us show that for each cylinder set
B = σ m ([b1 , . . . , bn ]), 1B is the L1 (μ)-limit of 1Ajm ’s. Set
showing that the sets Ajm are in general more complicated than q̃j -cylinders.
27 Later these sets were used in [560]; see also [381, Chapter 4.6].
322 6. Methods from Ergodic Theory
The next few results on the spectrum of substitution shifts where the sub-
stitution χ is of constant length q go back to Kamae [340] and Dekking [190].
In this case, because the height vectors of the corresponding Bratteli diagram
(En , Vn )n∈N are hv (n) = q n for every v ∈ Vn , the Koopman operator Uσ has
an eigenvalue e2πiα for each α = p/q n . According to [465, Proposition VI.11],
there are no irrational continuous eigenvalues, but other rational eigenvalues
are not excluded.
Theorem 6.111. A constant length substitution shift has pure point spec-
trum if and only if its pure base substitution has a coincidence.
28 This “period” 2 is called the height by Dekking, see [190, Definition 8] and [320, page 531],
but to avoid confusion with height vectors, we will not adopt that terminology.
Proof. First recall that (XTM , σ, μTM ) is a 2-to-1 extension of the two-sided
Feigenbaum substitution shift (Xfeig , σ, μfeig ); see Example 4.9. Since Xfeig is
also a regular Toeplitz shift, it has pure point spectrum by Theorem 6.108;
in fact, e2πiα is an eigenvalue for every dyadic rational. Define a two-point
skew-product extension over (Xfeig , σ) by
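\[ \tilde\sigma : \tilde X_{feig} := X_{feig} \times \{-1, 1\} \to \tilde X_{feig}, \qquad (x, y) \mapsto (\sigma(x), (-1)^{x_0} \cdot y). \]
Then $(X_{TM}, \sigma)$ is conjugate to $(\tilde X_{feig}, \tilde\sigma)$. Indeed, if $\pi_1 : \tilde X_{feig} \to X_{feig}$ is
the projection onto the first coordinate and $\psi : X_{TM} \to \tilde X_{feig}$ is defined as$^{29}$
\[ \psi(x)_n = \Big( \pi(x)_n, \prod_{j=0}^{n} (-1)^{x_j} \Big), \]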
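[Commutative diagram: ψ maps $X_{TM}$ to the skew-product extension, π maps $X_{TM}$ to $X_{feig}$, and $\pi_1$ maps the extension to $X_{feig}$; each space carries its shift map.]
Figure 6.8. The Thue-Morse shift, the Feigenbaum shift, and its skew-product extension.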
and likewise
\[ f(x, y) = \underbrace{\frac{f(x, y) + f(x, -y)}{2}}_{f_{even}} + \underbrace{\frac{f(x, y) - f(x, -y)}{2}}_{f_{odd}}, \]
so Heven ⊥ Hodd . Since (Heven , μfeig ) is isomorphic to (Xfeig , μfeig ), it has pure
point spectrum. It is the Kronecker factor of (Xfeig , μfeig ).
It remains to show that L2 (μfeig ) \ Heven contains no eigenfunctions. In-
deed, if f ∈ L2 (μfeig ) satisfies f ◦ σ = λf μfeig -a.e., then
Remark 6.116. With the help of a result by Kesten [357], see also the end
of Section 8.3, Petersen [455] used this for circle rotations Rα : S1 → S1
to show that 1[0,β] − β is a function of bounded discrepancy if and only if
β ∈ Z[α] if and only if e2πiβ is an eigenvalue of the Koopman operator URα .
Proof. The word w has bounded discrepancy; see Remark 4.38. Therefore
\[ \sup_{n \in \mathbb{N}} \sup_{x \in X_\rho} \Big| \sum_{k=0}^{n-1} (1_{[w]} - \mu([w])) \circ \sigma^k(x) \Big| < \infty. \]
6.9. Eigenvalues of Bratteli-Vershik Systems 327
Lemma 6.115 implies in this case that 1[w] − μ([w]) = g − g ◦ σ for some
continuous map g : Xρ → R. This gives
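\[ e^{2\pi i g} \circ \sigma = e^{2\pi i g \circ \sigma} = e^{2\pi i (g - 1_{[w]} + \mu([w]))} = e^{2\pi i \mu([w])} \cdot e^{2\pi i g}, \]
because $e^{2\pi i 1_{[w]}(x)} = 1$ for all $x \in X_\rho$. Therefore $e^{2\pi i \mu([w])}$ is the eigenvalue
to the (continuous) eigenfunction $e^{2\pi i g}$.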
For primitive substitution shifts (Xρ , σ), Host [320, Theorem 1.4] for-
mulated necessary and sufficient conditions for e2πiα to be an eigenvalue;
namely
\[ (6.37) \qquad \lim_{n \to \infty} e^{2\pi i \alpha |\chi^n(a)|} = h(a) \]
This theorem invites the question of whether there are systems that
have measurable but no continuous eigenfunctions (other than constant func-
tions). Such systems indeed exist; see [110, Theorem 2.5].
30 Cases where h ≡ 1 have to do with periodic structure of the fixed point χ = χ(ρ), similar
to Example 6.110.
This matrix has integer eigenvalues 1 and 4. However, the height vectors
$h(n) = A^n \mathbf{1}$ have only odd components. Therefore Theorem 6.118 implies
that this substitution shift is weak mixing. If we alter this substitution to
\[ \chi : \begin{cases} 0 \to 01010, \\ 1 \to 0111 \end{cases} \quad \text{with associated matrix } A = \begin{pmatrix} 3 & 1 \\ 2 & 3 \end{pmatrix}, \]
then the eigenvalues of A are $3 \pm \sqrt{2} > 1$. According to Theorem 8.10 and
Exercise 8.12, the conditions of Theorem 6.118 cannot be met. Hence also
this substitution shift is weak mixing.
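The associated matrix and its eigenvalues can be recomputed directly from the substitution words. The sketch below assumes the convention that entry (i, j) counts the occurrences of letter i in χ(j), which is consistent with the matrix displayed above; the helper names are illustrative.

```python
import math

def substitution_matrix(chi, alphabet):
    """Entry (i, j) counts occurrences of alphabet[i] in chi[alphabet[j]]."""
    return [[chi[b].count(a) for b in alphabet] for a in alphabet]

def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = tr * tr - 4 * det
    root = math.sqrt(disc)  # assumes a non-negative discriminant
    return ((tr - root) / 2, (tr + root) / 2)

chi = {"0": "01010", "1": "0111"}
A = substitution_matrix(chi, "01")
small, large = eigenvalues_2x2(A)
print(A, small, large)
```

Both eigenvalues exceed 1 in modulus, which is the property used in the text.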
Let us first give an idea of what eigenfunctions should look like. This
step requires neither (6.38) nor linear recurrence. Let
be the number of iterates of the Vershik maps necessary to change the path
x beyond the n-th edge.
It denotes a kind of average phase of the cylinder [v]. For μ-a.e. x ∈ XBV
e−2πiα(rn (x)+ρn (t(xn ))) = λ−rn λ−ρn (t(xn )) = λ−rn (x) Eμ (Ψn |Pn ) → f (x).
31 This set corresponds to the v-th stack of the n-th tower in a cutting and stacking
construction.
Take M ∈ N arbitrary. Then the distance d(τ rn (x) (x), xmin ) < 2−n <
2−M for all n > M . In other words, τ rn (x) → xmin uniformly, so
Since f is uniformly continuous, we can find δ > 0 such that f (Bδ (x)) ⊂
Bε (f (x)) for every x ∈ XBV . Therefore, for y ∈ orb(x) ∩ Bδ (xmax ) with
τ (y) ∈ Bδ (τ (xmax )), we have
The Martingale Theorem (see e.g. [86, Theorem 35.6]) implies that fn → f
in L2 (μ) and μ-a.e. as n → ∞.
Let [v min ] denote the cylinder set of the minimal path connecting v0
with v ∈ Vn . Then the elements of Pn are of the form τ j ([v min ]) for v ∈
Vn , 0 ≤ j < hv (n), and for each v ∈ Vn , the measures μ(τ j ([v min ])) for
0 ≤ j < hv (n) all coincide. Assume now that rn (x) ≥ 2, so τ (x) ∈ [v]. Then
32 For background on martingales and conditional expectation, see e.g. [56, Chapter 21] and
\[ \sum_{n \ge 1} \|f_{n+1} - f_n\|_2^2 = \Big\| \sum_{n \ge 1} (f_{n+1} - f_n) \Big\|_2^2 = \|f\|_2^2 < \infty. \]
As before, let v min denote the path connecting v0 and v ∈ Vn , and let [v min ]
be the corresponding n-cylinder. Define for v ∈ Vn and w ∈ Vn+1
Then for j ∈ J(v, w) and x ∈ τ j+k ([wmin ]) ⊂ τ k ([v min ]) we have by (6.47):
fn+1 (x) = e2πiα(j+k) cn+1 (w)λρw (n+1) ,
fn (x) = e2πiαk cn (v)λρv (n) ,
for 0 ≤ k < hv (n). Because all the sets τ j+k ([wmin ]) have the same mass, it
follows that
hv (n)−1 0
&fn+1 − fn &22 ≥ |fn+1 − fn |2 dμ
k=0 [wmin ]
hv (n)−1 0
≥ |e2πiαj cn+1 (w)λρw (n+1) − cn (v)λρv (n) |2 dμ
k=0 [wmin ]
1 2πiαj
≥ |e cn+1 (w)λρw (n+1) − cn (v)λρv (n) |2 ,
L2
(6.51) max max |e2πiαj cn+1 (w)λρw (n+1) − cn (v)λρv (n) |2 < ∞.
v∈Vn ,w∈Vn+1 j∈J(v,w)
n≥1
ρvmin (n) 2
max |cn+1 (w)λρw (n+1) − cn (vnmin )λ n | < ∞.
w∈Vn+1
n∈N
is summable.
By (6.50), 1 ≥ maxv∈Vn cn (v) ≥ minv∈Vn cn (v) → 1 as n → ∞. Therefore
/ / / / 2
/ ρv (n) / / c (v)λρv (n) /
/ c n (v)λ / / n /
|e2πiαj − 1|2 ≤ /e2πiαj − / + / − 1 /
/ cn+1 (w)λ ρw (n+1) / / cn+1 (w)λρw (n+1) /
≤ 4 |e2πiαj cn+1 (w)λρw (n+1) − cn (v)λρv (n) |
2
+|cn+1 (w)λρw (n+1) − cn (v)λρv (n) | .
Furthermore,
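\begin{align*}
|\lambda^{h_v(n)} - 1|^2 = |e^{2\pi i \alpha h_v(n)} - 1|^2 &= |e^{2\pi i \alpha (j + h_v(n))} - e^{2\pi i \alpha j}|^2 \\
&\le \big( |e^{2\pi i \alpha (j + h_v(n))} - 1| + |e^{2\pi i \alpha j} - 1| \big)^2 \\
&\le 4 \max\big\{ |e^{2\pi i \alpha (j + h_v(n))} - 1|, |e^{2\pi i \alpha j} - 1| \big\}^2.
\end{align*}
For $j \in J(v, w)$, we have $j + h_v(n) \in J(v', w)$ for some $v' \in V_n$, and therefore
the summability in (6.53) implies that $\max_{v \in V_n} |\lambda^{h_v(n)} - 1|^2$ is summable as
well, completing this direction of the proof.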
By assumption, $\sum_n \kappa_n^2 < \infty$. Define
and show that both (Yn ) and (Zn ) are summable in L2 (μ). For this we
replace the mod Z by the closest distance ||| · ||| to the integers.
The $L^2(\mu)$-norm of $Z_n$ satisfies
\[ \|Z_n\|_2 = \|\theta_n - E_\mu(\theta_n | P_n)\|_2 \le \|\theta_n\|_2 + \|E_\mu(\theta_n | P_n)\|_2 \le 2\|\theta_n\|_2 \le 2L\kappa_n, \]
Therefore,
Eμ ⎝ Zm ⎠ = Eμ 2
Zm
m=n+1 m=n+1
∞ ∞
≤ Eμ (Zm
2
) ≤C κ2m < ∞.
m=n+1 m=n+1
Now for Yn , let θn (e) be the value of θn at the cylinder set [e] = {x ∈
XBV , xn+1 = e} and recall that μ([v]) ≥ 1/L by linear recurrence. Then
&Yn &2 = &Eμ (θn − Eμ (θn )|Pn )&2 = &Eμ (θn |Pn ) − Eμ (θn )&2
* *
* *
* *
* hv (n) μ([w]) hv (n)μ([w]) *
=** 1{s(xn+1 )=v} − θn (e) *
hw (n + 1) μ([v]) hw (n + 1) *
*w∈Vn+1 e>xn+1 *
* v∈V t(e)=t(x ) *
n n+1
* * 2
* *
* *
* hv (n)μ([w]) 1 *
=** 1{s(xn+1 )=v} − 1 θn (e) *
*
*w∈Vn+1 e>xn+1 hw (n + 1) μ([v]) *
* v∈V t(e)=t(x ) *
n n+1 2
hv (n)mv,w (n)
≤ μ([w])(L − 1)Lκn
hw (n + 1)
w∈Vn+1 v∈Vn
= μ([w])(L − 1)Lκn ≤ L2 κn .
w∈Vn+1
where in the last line we used $\int_{X_{BV}} Y_n \, d\mu = \sum_{w \in V_n} \mu([w]) \, q(w) = 0$. Since
$(X_{BV}, \tau)$ is linearly recurrent and $|q(w)| \le L\kappa_n$, Lemma 6.36 gives us a
$C > 0$ and $\beta \in (0, 1)$ such that $\left| \frac{\mu([vw])}{\mu([v])} - \mu([w]) \right| \le C\beta^k$. It follows that
Therefore
* *2 ⎛⎛ ⎞2 ⎞
* n *
* * n
* Yj * ⎝⎝ Yj ⎠ ⎠ =
* * = Eμ Eμ (Yj Yk )
*j=m+1 * j+m+1 m<j,k≤n
2
n−m−1 n−i
2 j−k i
= 2L C β κj κk = 2C β κj κj+i
m<j≤k≤n i=0 j=m+1
2 n
2L C
≤ κ2j < ∞,
1−β
j=m+1
The arguments for $I^+_{even}$ and $I^-_{odd/even}$ are similar, so adding the four variants
of (6.55) gives $\sum_{n \in \mathbb{N}} \max_{v \in V_n} |||\alpha h_v(n)||| < \infty$, as required.
Continuous eigenvalues, “if ” direction: Set fn (x) = λj if x ∈ τ j ([v min ]),
v ∈ Vn , and 0 ≤ j < hv (n). Then fn is continuous because it is constant
on the elements of Pn . Also, if t(τ (x)n ) = t(xn ), i.e. j < hv (n) − 1, then
fn ◦ τ (x) = λj+1 = λfn (x).
Now $\frac{f_{n+1}(x)}{f_n(x)} = \lambda^{u_n(x)}$ where $u_n(x) = \sum_{e \in E_{n+1},\, e < x_{n+1}} h_{s(e)}(n)$. Using
linear recurrence (so each vertex has at most L incoming edges),
\[ |f_{n+1}(x) - f_n(x)| = \Big| \frac{f_{n+1}(x)}{f_n(x)} - 1 \Big| \le |||\alpha u_n(x)||| \le L \max_{v \in V_n} |||\alpha h_v(n)|||. \]
We have $\sum_n \max_{v \in V_n} |||\alpha h_v(n)||| < \infty$ by assumption, so $f_n$ converges uniformly
to some $f : X_{BV} \to \mathbb{C}$ and therefore f is continuous as well.
For x ∈/ X max , there is N such that t(τ (x)n ) = t(xn ) for all n ≥ N ,
and therefore f (τ (x)) = limn fn (τ (x)) = limn λfn (x) = λf (x). Finally, for
xmax ∈ X max we use the argument of (6.44).
Automata and
Linguistic Complexity
7.1. Automata
In this section we discuss Turing machines and variations of them and ask the
question of what languages they can recognize or generate. The terminology
is not entirely consistent in the literature, so some of the notions below may
be called differently depending on which book you read.
• A reading device that can read a symbol at one position on the tape
at a time. It can also erase the symbol and write a new one, and
it can move to the next or previous position on the tape. At the
start, the reading device is located at the first non-blank position.
• A finite collection of states Q = {q0 , . . . , qN −1 }, so N is the size
of the Turing machine. One state, say q0 , is the initial state. A
collection H of states is called halting states, which splits into
accepting, rejecting, and indecisive states. When a state q ∈ H
is reached, the machine stops and accepts or rejects the input or
remains undecided according to the status of q.
• Each state comes with a short list of instructions:
– read the symbol;
– replace the symbol or not;
– move to the left or right position;
– move to another (or the same) state.
This instruction list is the outcome of the transition function
δ : (Q \ H) × A Q × A × {L, N, R},
where {L, N, R} stands for the moves of the reading device (left,
no move, right). Some authors disallow the N ; it can always be re-
placed by move right + move left, using a few extra states. Because
the Turing machine halts when it reaches a state q ∈ H, δ need not
be defined on H.
Furthermore, in all generality, the transition function can be
multivalued (so we write rather than →). When reading a ∈ A
in state q ∈ Q\H, the Turing machine can choose which instruction
list in δ(q, a) it will perform. In order to emphasize this, we speak of
a non-deterministic Turing machine, whereas the Turing machine
is deterministic if δ is a proper single-valued function.
Example 7.1. Let χFib be the Fibonacci substitution of Example 4.6. The
following Turing machine replaces the input w ∈ {0, 1}∗ by χFib (w). The
word w is written as · · · bbbwbbbb · · · on the tape, and the reading device
(cursor) starts at the first symbol of w. The initial state is q0 and there is
one halting state q8 . The transition function δ is as follows (with a ∈ {0, 1}):
transition                 purpose                                       configuration
(q0, 0) → (q1, b, R)                                                     q0 : 01bbbb
(q0, 1) → (q4, b, R)       read input symbol                             q1 : b1bbbb
(q0, b) → (q8, b, R)                                                     q1 : b1bbbb
(q1, a) → (q1, a, R)                                                     q2 : b1bbbb
(q1, b) → (q2, b, R)       move cursor right until the second blank      q3 : b1b0bb
(q2, a) → (q2, a, R)                                                     q6 : b1b01b
(q2, b) → (q3, 0, R)       write 01                                      q6 : b1b01b
(q3, b) → (q6, 1, L)                                                     q7 : b1b01b
(q4, a) → (q4, a, R)                                                     q7 : b1b01b
(q4, b) → (q5, b, R)       move cursor right until the second blank      q0 : b1b01b
(q5, a) → (q5, a, R)                                                     q4 : bbb01b
(q5, b) → (q6, 0, L)       write 0                                       q5 : bbb01b
(q6, a) → (q6, a, L)                                                     q5 : bbb01b
(q6, b) → (q7, b, L)       move cursor left until the second blank       q5 : bbb01b
(q7, a) → (q7, a, L)                                                     q6 : bbb010
(q7, b) → (q0, b, R)       start at next input symbol                    q6 : bbb010
                                                                         q6 : bbb010
                                                                         q7 : bbb010
                                                                         q0 : bbb010
                                                                         q8 : bbb010
The rightmost column gives an example of how the Turing machine works
step by step. If we remove the halting state q8 and replace the instruction in
the third line by (q0 , b) → (q0 , b, R), then the Turing machine will never halt,
but instead replace w by χFib (w) and then by χ2Fib (w), χ3Fib (w), χ4Fib (w), etc.
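The machine of Example 7.1 is small enough to simulate. The following is a minimal sketch, assuming the transition table as read from the example (states q0 to q8 encoded as integers 0 to 8, blank symbol b) and a sparse tape stored in a dictionary; the names `build_delta`, `run`, and `chi_fib` are illustrative.

```python
BLANK = "b"

def build_delta():
    """Transition table: (state, symbol) -> (new state, written symbol, move)."""
    delta = {
        (0, "0"): (1, BLANK, +1), (0, "1"): (4, BLANK, +1), (0, BLANK): (8, BLANK, +1),
        (1, BLANK): (2, BLANK, +1),
        (2, BLANK): (3, "0", +1), (3, BLANK): (6, "1", -1),   # write 01
        (4, BLANK): (5, BLANK, +1),
        (5, BLANK): (6, "0", -1),                             # write 0
        (6, BLANK): (7, BLANK, -1),
        (7, BLANK): (0, BLANK, +1),
    }
    for a in "01":
        for q in (1, 2, 4, 5):      # skip over symbols while moving right
            delta[(q, a)] = (q, a, +1)
        for q in (6, 7):            # skip over symbols while moving left
            delta[(q, a)] = (q, a, -1)
    return delta

def run(word, halting=(8,), max_steps=100_000):
    """Run the machine on word and return the non-blank tape content."""
    delta = build_delta()
    tape = dict(enumerate(word))    # cells not present are blank
    state, pos = 0, 0
    for _ in range(max_steps):
        if state in halting:
            break
        state, write, move = delta[(state, tape.get(pos, BLANK))]
        tape[pos] = write
        pos += move
    else:
        raise RuntimeError("no halting state reached")
    return "".join(tape[i] for i in sorted(tape) if tape[i] != BLANK)

def chi_fib(word):
    """Fibonacci substitution 0 -> 01, 1 -> 0 applied letter by letter."""
    return "".join({"0": "01", "1": "0"}[c] for c in word)

print(run("01"), chi_fib("01"))
```

On the input 01 the simulation passes through the same twenty configurations as the run listed in the example.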
(7.1) M = (Q, A, q0 , f )
where:
Example 7.2. The language of the even shift of Example 1.4 is recognized
by the automaton in Figure 7.1. The tape is written over the alphabet
A = {0, 1, b}. The arrow qi → qj labeled a ∈ A represents δ(qi , a) = qj .
[Figure 7.1: transition graph of the automaton, with states q0 (start), q1, q2, q3 (accept), and q4 (reject), and arrows labeled 0, 1, and b.]
Proof. Let (G, A) be the edge-labeled directed graph representing the finite
automaton for L. Assume without loss of generality that the automaton di-
rects every path to a single state, say qe , when the entire input is read. Then
the reverse graph (G R , A) in which the directions of all arrows are reversed
and qe becomes the initial state and vice versa recognizes LR . However,
even if every outgoing arrow in G has a different label (so the automaton is
deterministic), this is no longer true for (G R , A). But by Theorem 7.3 there
is also a DFA that recognizes LR .
1 Said differently, if we include the movement of the reading device in the transition function
left:
(sentence) → (articled noun phrase)(transitive verb)
(articled noun phrase)
(articled noun phrase) → (article)(noun phrase)
(noun phrase) → (adjective)(noun phrase)
(noun phrase) → (noun)
(noun) → mouse, cat, book, fluency
(article) → the, a
(adjective) → big, small, high, low, red, green, orange, yellow
(transitive verb) → chases, eats, hits, reads
This produces sentences such as
a small yellow mouse chases the big green cat,
(7.2)
a high low red fluency eats a orange book.
Here the first sentence is fine; the second is nonsense. But apart from the
fact that “a orange” should be “an orange” it is grammatically correct.
In arithmetic, we can make the following example:
(expression) → (expression) ∗ (expression),
(expression) → (expression) + (expression),
(expression) → ((expression)),
(expression) → 0, 1, 2, 3, 4, 5, 6, 7, 8, 9.
This can generate all kinds of arithmetic expressions by repeatedly adding
and multiplying the numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 that a pocket calculator
should be able to compute. For instance
9 + (5 ∗ 3) + 7, (9 + 5) ∗ 3 + 7, 9 + 5 ∗ (3 + 7), (9 + 5) ∗ (3 + 7),
all with different outcomes.
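The four expressions can be evaluated with a small recursive-descent evaluator. This is a sketch under the assumption that ∗ binds more tightly than + (the grammar itself is ambiguous and does not fix a precedence), with the multiplication sign written as the ASCII character `*`; the function name `evaluate` is illustrative.

```python
def evaluate(expr):
    """Evaluate single digits, +, * and parentheses, with * before +."""
    tokens = [c for c in expr if not c.isspace()]
    pos = 0

    def parse_sum():
        nonlocal pos
        value = parse_product()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            value += parse_product()
        return value

    def parse_product():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            value *= parse_atom()
        return value

    def parse_atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            value = parse_sum()
            pos += 1  # skip the closing parenthesis
            return value
        return int(tok)

    return parse_sum()

for e in ("9 + (5 * 3) + 7", "(9 + 5) * 3 + 7", "9 + 5 * (3 + 7)", "(9 + 5) * (3 + 7)"):
    print(e, "=", evaluate(e))
```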
Formally, this grammar has the components
G = (V, T, P, S),
where
V = set of variables to which production rules can be applied.
T = set of terminals which remain unchanged.
P = set of production rules to replace variables with words in V ∪ T.
S = a special variable, called the starting symbol.
7.2. The Chomsky Hierarchy 347
Example 7.5. The even shift (Example 1.4) is recognized by the following
left- and right-linear regular grammars with T = {0, 1}:
left-linear      right-linear
S → S0           S → 0S
S → S11          S → 11S
S → ε            S → ε
Note that the language L is closed under taking reverses, i.e. LR = L, and
this property makes it so simple to convert the left-linear productions into
the right-linear productions.
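The words generated by the right-linear productions can be compared with a three-state deterministic automaton. The sketch below is an illustration only: the state names `even`, `odd`, and `dead` are hypothetical and are not the states of Figure 7.1, and the automaton is taken to accept exactly the concatenations of the blocks 0 and 11.

```python
def generated_by_grammar(word):
    """True if word is derivable via S -> 0S, S -> 11S, S -> empty word."""
    if word == "":
        return True
    if word[0] == "0":
        return generated_by_grammar(word[1:])
    if word.startswith("11"):
        return generated_by_grammar(word[2:])
    return False

# Hypothetical DFA: "even"/"odd" track the parity of the current run of 1s,
# "dead" is a sink entered when a run of 1s of odd length is followed by 0.
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "1"): "even", ("odd", "0"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}

def accepted_by_dfa(word):
    state = "even"
    for letter in word:
        state = DELTA[(state, letter)]
    return state == "even"

print(accepted_by_dfa("0110"), accepted_by_dfa("010"))
```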
Example 7.8. Consider the language L := {0n 1n : n ≥ 1}. That is, every
maximal block of 0’s is followed by an equally long word of 1’s.
This is a context-free language, generated by the productions
S → 0S1,
S → 01.
This example shows that context-free grammars are a strictly wider class
than the regular grammars, and it also illustrates the working of a general
class of lemmas, called Pumping Lemmas that are frequently used as a
tool to distinguish grammars. The simplest (applied in Example 7.8) is:
Lemma 7.9 (Pumping Lemma for Regular Languages). Let L be a regular
language. Then there is N such that for every $w \in L$ of length $|w| \ge N$, we
can decompose $w = tuv$ such that $|uv| \le N$, $u \neq \epsilon$, and $tu^k v \in L$ for all
$k \ge 0$.
Proof. Suppose by contradiction that the language L(x) was regular. Then
by the Pumping Lemma 7.9, there are words $tu^k v \in L(x)$ for some $u \neq \epsilon$ and
any $k \ge 1$. But the frequency of 1's is $\lim_k \frac{|tu^k v|_1}{|tu^k v|} \in \mathbb{Q}$ and this contradicts
that the rotation number of x is irrational. See [249, Corollary 6.1.11].
Lemma 7.11 (Pumping Lemma for Context-Free Languages). Let L be a
context-free language. Then there is N such that for every w ∈ L of length
|w| ≥ N , we can decompose w = rstuv such that 1 ≤ |su| ≤ |stu| ≤ N , and
rsk tuk v ∈ L for all k ≥ 1.
The very form of the Pumping Lemma 7.11 implies that square-free (or
power-free) subshifts cannot be context-free. The same holds for linear recur-
rent subshifts and in fact most other subshifts discussed in previous chapters.
Corollary 7.12. The language L(x) of a Sturmian sequence x is not context-
free.
|su| ∈
1 1
|rsk tuk v|
Q, contradicting that Sturmian sequences have irrational frequencies.
A stronger form of Lemma 7.11 is Ogden’s Lemma; see [319, 433] and
[203, Chapter 4]. In this lemma, we mark positions in words w in any way
we like, but after our choice, the marking cannot change anymore.
[Diagram: the levels of the tower joined by edges labeled 0 and 1.]
Figure 7.2. The edge-labeled Hofbauer tower for the Feigenbaum map.
(2) Since ν is fourth-power-free, neither Bj Bj Bj Bj nor Bj Bj Bj Bj can
be a subword of x after the first appearance of Bj+1 .
Now take m ∈ N so large that 2m−1 > p from Ogden’s Lemma 7.13, and
take
(7.5) z = Bm Bm Bm Bm = Bm Bm Bm Bm−1 Bm−1 ∈ L(Xfeig )
marked
From the shape of its production rules, it is clear that the language
of Example 7.8 is context-free. No finite automaton can keep track of the
precise number of 0’s before starting on the 1’s, but there is a simple memory
device that can. Imagine that for every 0, we put a card on a stack, until
log (f n )
2 That is, lim sup 1
n
log (f n ) = 0 but lim inf n log n
= ∞.
we reach the first 1. At every 1 we remove a card again. If at the end of the
word no cards are left on the stack, the word is accepted.
This device is simple in construction: we can only add or remove at the
top of the stack; what is further down cannot be read until all the cards
above it are removed. On the other hand, the stack is unbounded, so it
requires unbounded memory.
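The card-stack procedure for the language of Example 7.8 can be sketched in a few lines. This is an illustration under the assumption that a Python list serves as the stack and that a flag records whether the first 1 has been read; the function name `accepts_0n1n` is hypothetical.

```python
def accepts_0n1n(word):
    """Accept exactly the words 0^n 1^n with n >= 1, using a stack of cards."""
    stack = []
    seen_one = False
    for letter in word:
        if letter == "0":
            if seen_one:          # a 0 after the first 1 is not allowed
                return False
            stack.append("card")  # one card per 0
        elif letter == "1":
            seen_one = True
            if not stack:         # more 1s than 0s
                return False
            stack.pop()           # remove one card per 1
        else:
            return False
    return seen_one and not stack  # all cards removed, and n >= 1

print(accepts_0n1n("000111"), accepts_0n1n("00111"))
```

Only the top of the list is ever touched, which mirrors the restriction that cards further down cannot be read.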
Formally, the (push-down) stack has its (finite) stack alphabet C (think
of cards of different color) which is different from A and an “empty stack”
symbol e. The transition function needs to include instructions for the stack:
δ : Q × A × (C ∪ {}) Q × (C ∪ {, r})
where c ∈ C refers to adding a card of color c to the stack, r refers to
removing the top card from the stack, and refers to leaving the stack
unchanged. When the input is read entirely, its acceptance depends on the
status of both the stack and state finally reached. The resulting automaton
with stack is called a push-down automaton.
Theorem 7.16. A language is (not more complicated than) context-free if
and only if it is recognized by a push-down automaton.
See [319, Section 5.3] for the proof. Using this theorem, it becomes
clear that Dyck shifts from Example 7.7 are context-free languages (and
non-regular if there are at least two types of brackets). We use a different
color card for each type of bracket, add a card of the correct color for every
opening bracket, and remove it for the corresponding closing bracket. If the
correct color is not at the top of the stack, then there are linked sets of
brackets.
them by “dummy” variables: P E → DD, Sai → SD, and S → D, and finally put all dummies at
the end by an extra production DB → BD for each B ∈ T ∪ V .
Complete proofs of this (regarding the itineraries of all x ∈ [0, 1], not just
those contained in the attractor) were given in [153] and [565, Section 6.2].
We give an explicit set of production rules. In [554] it is also shown that
two types of languages derived from Fibonacci unimodal maps are context-
sensitive as well.
Proof. Recall the structure of L(Xfeig ) from Corollary 7.14. The production
rules will be in groups, doing specific tasks.
(1) The initial word is BCr 1HE, where B and E are begin- and end-
markers that will also be used to eventually remove symbols from
the left and right.
(2) Produce a length 2n prefix of νfeig (for n ≥ 1 arbitrary) with markers
H at positions 2k , 0 ≤ k ≤ n. The markers Cr , Cl are cursors
running right and left, respectively.
Cs a → SaCs , Cs H → Cl H,
XCl → Cl X, X = H, HCl Sa → aHCa , a ∈ {0, 1},
HCl SaS → aCa S, HCl SaX → a Ca Xa = 1 − a, X = S,
HCl → HCs a, HCl a → Cr ,
XCl →
Cl X, X = H, Cr X → XCr if X = H,
Cr H → HCs , Cr H → HCr .
With this restriction, a Turing machine can still be very powerful. It can
compute prime numbers in the sense that {1p }p prime is a context-sensitive
language (but not context-free); see [486]. Theorem 7.23 suggests that to
find languages which are not context-sensitive, one needs to search for prob-
lems that take a lot of memory to solve. The class EXPSPACE is the class
of problems whose solution requires memory space of order 2p(n) (but not
less) for some polynomial p(n) of the input length n. The known examples
of such problems are complicated, and even more so to state in the form of
a language, so we will not try to give an example.
[Diagram: states q0/1 (start), q1/0, and q2/0, with arrows labeled 0 and 1.]
Figure 7.3. The DFAO for the even shift. The label qi /t stands for qi , τ (qi ).
[Diagram: states q0/0 (start) and q1/1, with arrows labeled 0 and 1.]
Proof. The proof relies on a way to rewrite the DFAO as a pair of substitu-
tions χ and ψ, and vice versa. The state space Q is the alphabet of both sub-
stitutions, the base N of the N -automaton is the length of the substitution
words χ(q) and also the cardinality of the input alphabet A = {0, . . . , N −1},
and the output function τ : Q → B is the letter-to-letter ψ : Q → B.
First assume that χ and ψ are given so that χ(q0 ) starts with q0 . Since
χ has the fixed point ρ, the letter q0 is the zeroth letter of ρ = ρ0 ρ1 ρ2 . . . ,
and we take it as the initial state of the DFAO. Now define the transition
function δ : Q × A → Q as
δ(q, a) is the a-th letter of χ(q).
Then for n ∈ {0, . . . , N − 1}, the representation of n in base N is simply
w(n) = n. If this is the input word, then q = δ(q0 , w(n)) is the n-th letter
of χ(q0 ), which is the n-th letter of ρ.
We continue by induction using that the induction hypothesis is
$\delta(q_0, w(n)) = \rho_n$. We verified this for $0 \le n < N$ and assume that it
holds for all $m < n$. Write $n = n'N + n''$. Then
\begin{align*}
\delta(q_0, w(n)) &= \delta(q_0, w(n)_1 \cdots w(n)_\ell) \\
&= \delta(\delta(q_0, w(n)_1 \cdots w(n)_{\ell-1}), w(n)_\ell) && \text{(by (7.7))} \\
&= \delta(\delta(q_0, w(n')), w(n)_\ell) \\
&= \delta(\rho_{n'}, n'') && \text{(by induction)} \\
&= \text{the } n''\text{-th letter of } \chi(\rho_{n'}) \\
&= \rho_{Nn' + n''} = \rho_n.
\end{align*}
This completes the induction. It follows that $\tau(q_0, w) = \tau(\rho_n) = \psi(\rho_n) = x_n$,
as required for an automatic sequence.
Now for the converse, we are given the DFAO, and we can assume that
δ(q0 , 0) = q0 for initial state q0 , because this simply deals with the insignificant
digits 0 in a base N representation of n. Set ψ = τ : Q → B and define
χ : Q → QN by
χ(q) = δ(q, 0)δ(q, 1) . . . δ(q, N − 1).
Then χ(q0 ) = q0 and χ has a fixed point ρ starting with the letter q0 . For
n ∈ {0, . . . , N − 1} and its representation w(n) = n in base N , we find that
δ(q0 , w(n)) is the n-th letter of χ(q0 ), which is the n-th letter of ρ.
The induction hypothesis is again $\delta(q_0, w(n)) = \rho_n$. We verified this for
$0 \le n < N$ and assume that it holds for $m < n$. Write $n = n'N + n''$. Then
\begin{align*}
\delta(q_0, w(n)) &= \delta(q_0, w(n)_1 \cdots w(n)_\ell) \\
&= \delta(\delta(q_0, w(n)_1 \cdots w(n)_{\ell-1}), w(n)_\ell) && \text{(by (7.7))} \\
&= \delta(\rho_{n'}, n'') && \text{(by induction)} \\
&= \text{the } n''\text{-th letter of } \chi(\rho_{n'}) \\
&= \rho_{Nn' + n''} = \rho_n.
\end{align*}
This completes the induction. Again $\tau(q_0, w) = \tau(\rho_n) = \psi(\rho_n) = x_n$.
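The correspondence between a constant-length substitution and a DFAO used in this proof can be tested on a concrete case. The sketch below takes the Thue-Morse substitution 0 → 01, 1 → 10 as an assumed example, with the identity as the letter-to-letter map ψ; the function names are illustrative.

```python
def fixed_point_prefix(chi, start, length):
    """Prefix of the fixed point of chi that starts with the letter start."""
    word = start
    while len(word) < length:
        word = "".join(chi[q] for q in word)
    return word[:length]

def digits(n, base):
    """Base-N digits of n, most significant first (empty list for n = 0)."""
    out = []
    while n > 0:
        out.append(n % base)
        n //= base
    return out[::-1]

def dfao_state(chi, start, n):
    """Feed the base-N digits of n to the automaton with
    delta(q, a) = a-th letter of chi(q), starting in the state start."""
    base = len(chi[start])
    state = start
    for a in digits(n, base):
        state = chi[state][a]
    return state

chi_tm = {"0": "01", "1": "10"}
rho = fixed_point_prefix(chi_tm, "0", 32)
print(rho)
print("".join(dfao_state(chi_tm, "0", n) for n in range(32)))
```

Both printed lines agree, as the induction in the proof predicts.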
All sequences that are eventually periodic are N -automatic for every
N ∈ N. In particular, every indicator sequence x = 1E of a finite set E is
automatic for every N ∈ N. However, as soon as sup E > N #Q for some
N -automaton Mout with set of states Q, then there must be a loop. That
is, for some m = [w]N ∈ E, the automaton Mout reading w must reach the
same state q ∈ Q twice: there must be a loop from q to q. The (proof of the)
Pumping Lemma 7.9 gives that Mout must accept the words that take this
loop an arbitrary number of times. These loops explain the existence of geo-
metric progressions in automatic sequences, such as the indicator sequence
$x = 1_{\{2^k : k \ge 0\}}$ of the powers of 2; its 2-automaton is shown in Figure 7.5. It
also shows that 1{n!:n∈N} , or the indicator sequence of any superexponentially
increasing sequence, cannot be automatic.
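A 2-automaton for the indicator sequence of the powers of 2 can be written down and tested. The transition table below is an assumption about how such a three-state DFAO may look (start state q0 with output 0, q1 with output 1 after reading a single 1, and q2 with output 0 once a second 1 has been read); it reads binary digits with the most significant digit first.

```python
# States are encoded as integers 0, 1, 2 for q0, q1, q2.
DELTA = {
    (0, "0"): 0, (0, "1"): 1,   # leading zeros are ignored in q0
    (1, "0"): 1, (1, "1"): 2,   # the loop at q1 reads the trailing zeros
    (2, "0"): 2, (2, "1"): 2,   # q2 is absorbing
}
OUTPUT = {0: 0, 1: 1, 2: 0}

def indicator_power_of_two(n):
    """Output of the DFAO on the binary expansion of n."""
    state = 0
    for digit in format(n, "b"):
        state = DELTA[(state, digit)]
    return OUTPUT[state]

print([n for n in range(70) if indicator_power_of_two(n) == 1])
```

The loop at q1 labeled 0 is the loop that the pumping argument detects: repeating it produces the geometric progression of powers of 2.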
[Figure 7.5, left: the 2-automaton with states q0/0 (start), q1/1, and q2/0, and arrows labeled 0 and 1.]
  k    3^k in base 2
  0    1
  1    11
  2    1001
  3    11011
  4    1010001
  5    11110011
  6    1011011001
  7    100010001011
  8    1100110100001
This is not surprising in view of Theorem 7.27 and the fact that fixed
points of substitution shifts are syndetic.
For example, the indicator sequence x = 1{3k :k≥0} of the powers of 3
is trivially 3-automatic, but the powers of 3 written in base 2 betray no
pattern; see Figure 7.5. The natural question that inspired Büchi’s paper
[130] is whether x is 2-automatic. The answer relies on an elegant application
of the Pumping Lemma 7.9.
Proposition 7.29. If N and Ñ are multiplicatively independent, i.e. $\frac{\log \tilde N}{\log N} \notin \mathbb{Q}$,
then the indicator sequence $x = 1_{\{\tilde N^k : k \ge 0\}}$ of the powers of Ñ is not N-automatic.
Before giving the proof, we need a simple result from number theory.
Lemma 7.30. If N, Ñ ∈ N are multiplicatively independent, then for every
ε > 0 there are r, r̃ ∈ N such that |Ñ r̃ − N r | < εN r .
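For given N, Ñ and ε, a pair of exponents as in Lemma 7.30 can be found by brute force. The sketch below uses exact rational arithmetic; the function name `close_powers` and the search bound `max_exp` are assumptions made for the illustration.

```python
from fractions import Fraction

def close_powers(N, N_tilde, eps, max_exp=200):
    """Smallest r (with a matching r_tilde) such that
    |N_tilde**r_tilde - N**r| < eps * N**r, or None if none is found."""
    eps = Fraction(eps)
    for r in range(1, max_exp + 1):
        target = N ** r
        r_tilde, power = 1, N_tilde
        while power < target:          # first power of N_tilde >= target
            r_tilde += 1
            power *= N_tilde
        # candidates: that power and the previous one (which is < target)
        for rt, p in ((r_tilde, power), (r_tilde - 1, power // N_tilde)):
            if rt >= 1 and abs(p - target) < eps * target:
                return r, rt
    return None

print(close_powers(2, 3, Fraction(1, 10)))
```

For N = 2 and Ñ = 3 with ε = 1/10 the search stops at 2^8 = 256 and 3^5 = 243.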
for some word v. Hence, when $M_{out}$ parses $10^{\#Q}v$, it must see some state
$q \in Q$ twice before it finishes reading $10^{\#Q}$. Say the corresponding loop
from q to q has length s. By the Pumping Lemma 7.9, $M_{out}$ has to accept
$10^{\#Q+ks}v$ for every integer $k \ge 0$. Therefore there is an $\ell(k) \in \mathbb{N}$ such that
\[ (7.8) \qquad [10^{\#Q+ks}v]_N = \tilde N^{\ell(k)} = N^{\#Q+ks+|v|} + [v]_N = N^{r+ks} + [v]_N. \]
Note that $\ell(k+1) - \ell(k)$ is bounded in k. Subtracting (7.8) for k from (7.8)
for k + 1, we obtain
\[ \tilde N^{\ell(k)} (\tilde N^{\ell(k+1)-\ell(k)} - 1) = N^{\#Q+ks+|v|} (N^s - 1). \]
This can only hold for all $k \ge 0$ if $\tilde N^{\tilde m} = N^m$ for some $m, \tilde m \in \mathbb{N}$, which
contradicts our assumption that N and Ñ are multiplicatively independent.
This concludes the proof.
Proof. The overlap of neighboring intervals Ik and Ik+1 is large enough that
the period pk extends to Ik ∪ Ik+1 . This is a special case of the Fine-Wilf
Theorem [247, Theorem 3]. In detail:
(i) Suppose that n, n + pk ∈ Ik ∪ Ik+1 . If n + pk ∈ Ik , then xn = xn+pk
by the local periodicity on Ik .
(ii) Otherwise $n, n + p_k \in I_{k+1}$ and by local periodicity on $I_{k+1}$ we can
find $n' \equiv n \bmod p_{k+1}$ such that $n', n' + p_k \in I_k \cap I_{k+1}$; see Figure 7.6. But then
$x_n = x_{n'} = x_{n'+p_k} = x_{n+p_k}$, using local periodicity on $I_{k+1}$, on $I_k$, and again
on $I_{k+1}$, respectively.
4 Eilenberg [233] called Cobham’s proof correct, but long and unreasonable, but Cobham’s
proof is just six pages, and whether unreasonably technical is in the eye of the reader.
7.3. Automatic Sequences and Cobham’s Theorems 363
Therefore the local period pk carries over to Ik+1 . By the same argument,
the period pk carries over to all next neighbors, both left and right. By
induction, the period carries over to all Ij , j ∈ N. Hence xn = xn+pk for all
n ≥ min I1 . This is true for all k, so in particular to p = mink pk .
Proof of Theorem 7.31. Assume that the sequence (xn )n≥0 is both N -
automatic and Ñ -automatic. The aim is to prove that (xn )n≥0 is locally
periodic. We will specify the integer intervals Ik , show that they have a
local period, and finally show that they have sufficient overlap to apply
Lemma 7.33. An important idea to achieve this overlap is to use larger
input alphabets A = {0, 1, . . . , 2N − 1} and à = {0, 1, . . . , 2Ñ − 1} than the
bases suggest, but according to [20, Theorem 6.8.6], there are DFAOs that
produce the same automatic sequences in this case. We don’t change the
bases, so integers may have multiple representations in the extended input
alphabets, but for all representations, the DFAO will give the same output.
Without loss of generality, assume that N < Ñ . Let Q and Q̃ be the
sets of states of the corresponding automata. Since we only have to prove
that (xn )n≥0 is eventually periodic, it suffices to consider states q such that
δ(q0 , w) = q for infinitely many w ∈ A∗ .
If $w, w' \in A^*$ are such that $\delta(q_0, w) = \delta(q_0, w')$, then also $\delta(q_0, wz) =
\delta(q_0, w'z)$ for every $z \in A^*$. For the N-automatic sequence $(x_n)_{n \ge 0}$ and any
$r \in \mathbb{N}$, this means that
Since $|I_k \cap I_{k+1}| = \frac{1}{3} N^r > 2 \max_{q \in Q} p_q$, the overlap of these integer intervals
is as large as required in Lemma 7.33.
Theorem 7.34. Let ρ and ρ̃ be the fixed points of two primitive$^5$ substitutions
whose associated matrices have leading eigenvalues λ and λ̃. If there are
substitutions ψ and ψ̃ such that ψ(ρ) = ψ̃(ρ̃) and this is not an eventually
periodic sequence, then $\frac{\log \tilde\lambda}{\log \lambda} \in \mathbb{Q}$.
That non-trivially different substitutions can have the same fixed point
is shown in e.g. Example 4.28.
χ :   1 → 2,    2 → 3,   3 → 4,    4 → 4321          1 → 2,    2 → 3,   3 → 4,    4 → 1234,
χ :   1 → 23,   2 → 4,   3 → 34,   4 → 21            1 → 32,   2 → 4,   3 → 43,   4 → 12,
χ :   1 → 34,   2 → 4,   3 → 32,   4 → 1             1 → 43,   2 → 4,   3 → 23,   4 → 1.
[Three panels, each marking the points c2, c, and c1 on the interval.]
Figure 7.7. Partitions for three different tent maps with critical period 5.
Miscellaneous
Background Topics
The set of algebraic numbers is countable, as one can check from the
fact that for each n ∈ N, there are only finitely many algebraic numbers
that are the root of a degree d polynomial with integer coefficients ai such
that $d + \sum_{i=0}^{d} |a_i| = n$. Hence, most real numbers are transcendental. They
are more difficult to specify (short of writing down all their decimal dig-
its), but examples of transcendental numbers are e, π (in fact, π α for every
algebraic number α) and ζ(3), as proved by Hermite (1873), Lindemann
(1882), and Apéry (1978), respectively. Hilbert asked in his 1900 address
of the International Mathematical Congress whether numbers such as $2^{\sqrt{2}}$ are
transcendental (Hilbert’s 7-th Problem). This was solved a good thirty years
later by Gelfond [272] and Schneider [491]: every number of the form $a^b$
where $a \neq 0, 1$ is algebraic and b is an algebraic irrational is transcendental.
1 For definiteness, we assume that the coefficients have no common prime divisor and the first
coefficient is positive.
368 8. Miscellaneous Background Topics
Among the algebraic numbers there are some classes that are responsible
for special properties in various dynamical systems.
Definition 8.2. An algebraic integer α > 1 is called a Pisot number if all
its Galois conjugates, i.e. the other roots of its minimal polynomial (called
the Pisot polynomial), are in the open unit disk. If the Galois conjugates are
in the closed unit disk, with at least one on the boundary, then α is a Salem number.
For example, all the multinacci numbers, i.e. the leading solutions of the
equations
$x^d = x^{d-1} + x^{d-2} + \cdots + 1,$
are Pisot numbers. The numbers $x_a = \frac{a+\sqrt{a^2+4}}{2}$, i.e. the leading roots of
$x^2 - ax - 1$, $a \in \mathbb{N}$, are all Pisot numbers. In particular, $x_1 = \frac12(1 + \sqrt5)$ is
the golden mean, $x_2 = 1+\sqrt2$ is the silver mean, and in general, the numbers
$x_a$ are called the metallic means; see also (8.21). Salem [485] showed that
the set of Pisot numbers is closed, so there is a smallest Pisot number. This
turns out to be the cubic irrational $x = 1.3247\ldots$ solving $x^3 = x + 1$. It
is known as the plastic number (see [1] for more on the history of this
terminology), and it is isolated in the set of Pisot numbers [506]. The next
one is the leading root of $x^4 = x^3 + 1$ and every other one is larger than $\sqrt2$.
The smallest known Salem number is called Lehmer’s number λ =
1.17628 . . . [312, 391]; it is the leading root of Lehmer’s polynomial
$p(x) = x^{10} + x^9 - x^7 - x^6 - x^5 - x^4 - x^3 + x + 1.$
There are polynomials of lower degree that non-trivially have roots on the
unit circle, for example $x^4 - 2x^3 - 2x + 1 = 0$, which has smallest possible
degree, but the leading root is larger than Lehmer’s number. It is an
open question whether all characteristic polynomials of non-negative integer
matrices with roots on the unit circle are reducible (so not of Salem type).
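As a numerical sanity check of the value quoted for Lehmer’s number, the following minimal sketch locates the leading root of Lehmer’s polynomial by bisection. The function names and the bracketing interval [1, 2] are choices made here for illustration, not taken from the text.

```python
def lehmer_poly(x):
    """Lehmer's polynomial p(x) = x^10 + x^9 - x^7 - x^6 - x^5 - x^4 - x^3 + x + 1."""
    return x**10 + x**9 - x**7 - x**6 - x**5 - x**4 - x**3 + x + 1


def leading_root(poly, lo=1.0, hi=2.0, steps=200):
    """Bisection, assuming poly(lo) < 0 < poly(hi) and one sign change in [lo, hi]."""
    assert poly(lo) < 0 < poly(hi)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if poly(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


lehmer = leading_root(lehmer_poly)
print(lehmer)               # close to 1.17628
print(lehmer_poly(1 / lehmer))  # close to 0, since p is palindromic
```

The second printed value illustrates Proposition 8.3: because the polynomial is palindromic, 1/λ is a root whenever λ is.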
Proposition 8.3. If α > 1 is a Salem number, then its minimal polynomial
is palindromic; i.e.
$p(x) = a_d x^d + a_{d-1} x^{d-1} + \cdots + a_1 x + a_0 = a_0 x^d + a_1 x^{d-1} + \cdots + a_{d-1} x + a_d.$
Except for 1/α, all Galois conjugates of α lie on the unit circle but are not
roots of unity.
other Galois conjugate α′ can have |α′| < 1, because then |1/α′| > 1 and
this contradicts that α is a Salem number. Thus all the remaining Galois
conjugates α′ lie on the unit circle, and the complex conjugate $\overline{\alpha'} = 1/\alpha'$ is
also a root of p(x) and of p∗(x). But then α′ is also a root of the polynomial
$a_0 p(x) - a_d p^*(x)$, which has degree < d. This contradicts that p(x) is the
minimal polynomial of α, unless p(x) = p∗(x). If α′ is a root of unity, then
its minimal polynomial, which is (a factor of) $x^r - 1$, divides p(x), but is not
equal to it. This contradicts that p is irreducible. The proof is complete.
Definition 8.4. An algebraic number α > 1 is a Perron number if all its
algebraic conjugates α′ satisfy |α′| < α.
This result goes back to Livshits; see [402, 403] and also [519]. For
Pisot matrices, we can trivially choose g(x) = x and $\alpha = \lambda_1$, the leading
eigenvalue. By choosing $g(x) = x^k$, we see that $|||\lambda_1^k G_n||| \to 0$ for all k ∈ N.
Example 8.9. In [246, Section 4], the following example is presented:
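$$\chi:\ \begin{cases} 0 \to 0133,\\ 1 \to 12,\\ 2 \to 3,\\ 3 \to 0 \end{cases} \qquad \text{with associated matrix } A = \begin{pmatrix} 1 & 0 & 0 & 1\\ 1 & 1 & 0 & 0\\ 0 & 1 & 0 & 0\\ 2 & 0 & 1 & 0 \end{pmatrix}.$$
and
$$\lambda_3 = \frac{1 + i\sqrt{-5 + 4\sqrt2}}{2}, \qquad \lambda_4 = \frac{1 - i\sqrt{-5 + 4\sqrt2}}{2}$$
within the unit circle. For $g(x) = x^2 - x - 1$, we have $g(\lambda_1) = g(\lambda_2) = \sqrt2$,
so $\sqrt2$ solves (8.2).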
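The relation between g and the eigenvalues in Example 8.9 can be checked with integer arithmetic only. The sketch below (helper names are illustrative) builds the matrix A from the letter counts of the substitution and verifies that $g(A)^2 = 2I$ for $g(x) = x^2 - x - 1$; hence every eigenvalue λ of A satisfies $g(\lambda) = \pm\sqrt2$.

```python
# Column j of A counts the letters in chi(j): chi(0)=0133, chi(1)=12, chi(2)=3, chi(3)=0.
A = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [2, 0, 1, 0],
]


def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def g_of(M):
    """Evaluate g(M) = M^2 - M - I."""
    n = len(M)
    M2 = matmul(M, M)
    return [[M2[i][j] - M[i][j] - (1 if i == j else 0) for j in range(n)]
            for i in range(n)]


G = g_of(A)
G_squared = matmul(G, G)
print(G_squared)  # expected to be 2 times the identity matrix
```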
Since $F_n \in \mathbb{Z}$, $|||\alpha G_n||| \to 0$ as claimed. To obtain that also $|||\alpha^k G_n||| \to 0$,
we repeat the argument with $g^k(x) = p(x)q(x) + r(x)$, noting that $p(\lambda_i) = 0$
for all i and $r(\lambda_i) = \alpha^k$ for $|\lambda_i| \ge 1$.
The proof of Theorem 8.10 uses a fair amount of Galois theory and
properties of the Galois group G associated to the field extension Q(λj ∈ Λ).
This is the group G of automorphisms on Q(λj ∈ Λ) that fixes Q itself. It
turns out that every τ ∈ G permutes the roots in Λ, and if we extend τ
coordinate-wise to the eigenvectors³ vj corresponding to the eigenvalues λj ,
then it permutes the eigenvectors in the same way as the eigenvalues:
Lemma 8.13. Given τ ∈ G we have τ (vj ) = vj′ if and only if τ (λj ) = λj′ .
Proof. Let G be the Galois group of the splitting field K, and for τ ∈ G,
denote the coordinate-wise action on vectors in $K^d$ again by τ . Because the
eigenvectors satisfy $Av_j = \lambda_j v_j$, indeed $v_j \in \mathbb{Q}(\lambda_j)^d \subset K^d$. Since A has
integer coefficients, we have for every τ ∈ G and vector $v_j \in K^d$,
$$A\,\tau(v_j) = \tau(A v_j) = \tau(\lambda_j v_j) = \tau(\lambda_j)\,\tau(v_j).$$
This shows that $\tau(v_j)$ is an eigenvector of A with eigenvalue $\tau(\lambda_j)$.
Proof. Let $G_i$ be the Galois group associated to the splitting field $K_i$ of the
irreducible factor $q_i$. If $\deg(q_i) = 1$, then $\dim(H_i) = 1$ and since $w \notin H_i^\perp$,
there is nothing to prove.
Now suppose $\deg(q_i) \ge 2$, so $\dim H_i \ge 2$. Suppose by contradiction that
there are $\lambda_j, \lambda_{j'} \in \Lambda_i$ such that $c_j = 0 \ne c_{j'}$. Since $G_i$ acts transitively on
$\Lambda_i$, we can find $\tau \in G_i$ such that $\tau(\lambda_j) = \lambda_{j'}$. Therefore
$$\tau(w) = \tau\Big(\sum_{j=0}^{d-1} c_j v_j\Big) = \sum_{j=0}^{d-1} \tau(c_j)\,\tau(v_j).$$
But this is a contradiction to linear independence of the $v_j$, because the vector
$\tau(v_j)$ is collinear with $v_{j'}$, and the coefficient of $v_{j'}$ in the above expression
is $c_{j'} - \tau(c_j) \ne 0$.
Proof. We have $\alpha w - z = \sum_j (\alpha c_j - d_j) v_j$. If $\alpha = d_j/c_j$ for all $\lambda_j \in \Lambda^+$,
then
$$|||A^n(\alpha w)||| = |||A^n(\alpha w - z)||| = \Big|\Big|\Big| A^n \sum_{\lambda_j \in \Lambda} (\alpha c_j - d_j) v_j \Big|\Big|\Big| = \Big|\Big|\Big| \sum_{\lambda_j \in \Lambda^-} (\alpha c_j - d_j) \lambda_j^n v_j \Big|\Big|\Big| \to 0.$$
Conversely, if $|||A^n \alpha w||| \to 0$, then $\alpha w \pmod 1$ belongs to the stable manifold
of the toral endomorphism $T_A : \mathbb{T}^d \to \mathbb{T}^d$ induced by A. Hence $\alpha w - z$ belongs
to the stable manifold of 0 for some $z \in \mathbb{Z}^d$, and in the decomposition
$\alpha w - z = \sum_j (\alpha c_j - d_j) v_j$ all the coefficients belonging to non-stable eigenvectors
are zero. Hence $\alpha = d_j/c_j$ for all $\lambda_j \in \Lambda^+$.
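Lemma 8.16. Assume that all $\lambda_j \in \Lambda^+$ have multiplicity one. The coefficients
$c_j$ and $d_j$ in the previous lemma belong to the field extension $\mathbb{Q}(\lambda_j)$
for every $\lambda_j \in \Lambda^+$. In particular, $\alpha \in \bigcap_{\lambda_j \in \Lambda^+} \mathbb{Q}(\lambda_j)$.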
Proof. Since Avj = λj vj , the eigenvector can be scaled such that vj ∈
Q(λj )d . Note also that the matrix V with these eigenvectors as columns
satisfies V DV −1 = A, where D is the Jordan matrix with eigenvalues λj on
the diagonal. Since we assumed that all λj ∈ Λ+ have multiplicity one, all
the corresponding Jordan blocks are trivial.
Then also $(V^{-1})^T D V^T = A^T$ and $(V^{-1})^T$ has the eigenvectors of $A^T$
as columns. Because the transpose matrix $A^T$ has the same eigenvalues
$\lambda_0, \ldots, \lambda_j$, the same argument as before gives that these eigenvectors belong
to $\mathbb{Q}(\lambda_j)$, respectively. Hence, the j-th row of $V^{-1}$ belongs to $\mathbb{Q}(\lambda_j)$.
Now $w = Vc$, so $c = V^{-1}w$; in other words, the j-th component of c
satisfies $c_j \in \mathbb{Q}(\lambda_j)$. Similarly $d_j \in \mathbb{Q}(\lambda_j)$ and hence $\alpha = d_j/c_j \in \mathbb{Q}(\lambda_j)$.
Since this is true for all $\lambda_j \in \Lambda^+$, we find $\alpha \in \bigcap_{\lambda_j \in \Lambda^+} \mathbb{Q}(\lambda_j)$.
Remark 8.17. Note that this proof gives that the j-th and j′-th rows of
$V^{-1}$ are obtained from each other by replacing every $\lambda_j$ by $\lambda_{j'}$.
Now we can finish the proof of Theorem 8.10.
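By the same reasoning, there is a polynomial $\tilde f_i \in \mathbb{Q}[x]$ of lowest degree such
that $d_j = \tilde f_i(\lambda_j)$, and $d_{j'} = \tilde f_i(\lambda_{j'})$ for each $\lambda_{j'} \in \Lambda_i \cap \Lambda^+$. Therefore
$$\alpha = \frac{d_k}{c_k} = \frac{\tilde f_i(\lambda_k)}{f_i(\lambda_k)} \quad \text{for all } \lambda_k \in \Lambda_i \cap \Lambda^+.$$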
Note that the splitting field Ki of the factor qi (x) is a finite separable field
extension, and by the Primitive Element Theorem (see [36, Chapter 2, The-
orem 27]), there exists γi ∈ Ki such that Ki = Q(γi ). Let mi (x) be the
minimal polynomial of γi over Q. Then by [525, Theorem 5.12] there is
an isomorphism Q[x]/∼ → Q(γ), where Q[x]/∼ denotes the quotient field
of polynomials in Q[x], where r1 ∼ r2 if mi (x) divides r1 (x) − r2 (x). This
isomorphism is realized by [r(x)] → r(γ), where [r(x)] is the equivalence
class of r(x) in Q[x]. This isomorphism tells us that Q[x]/∼ is a field of
polynomials, so the quotient f˜i /fi can be expressed in Q[x]/∼ as a single
polynomial gi . This concludes the proof.
5 See [98, 175, 360] for some of the many general references on continued fractions.
8.2. Continued Fractions 377
Every irrational has a unique continued fraction expansion, but every ra-
tional θ1 ∈ (0, 1) has two (finite) continued fraction expansions, namely
[0; a1 , . . . , an ] and [0; a1 , . . . , an − 1, 1] for an ≥ 2.
Show that there is n ≥ 0 such that $f^n(p, q) = (\gcd(p, q), \gcd(p, q))$.
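Define recursively
$$\begin{aligned} p_0 &= 0, & p_1 &= 1, & p_{n+1} &= a_{n+1} p_n + p_{n-1},\\ q_0 &= 1, & q_1 &= a_1, & q_{n+1} &= a_{n+1} q_n + q_{n-1}. \end{aligned} \qquad (8.6)$$
The fractions $p_n/q_n$ are called the convergents of $\theta_1$. They are the best rational
approximations of $\theta_1$ and $|\theta_1 - \frac{p_n}{q_n}| \le \frac{1}{q_n q_{n+1}}$ as we shall see.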
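The recursion (8.6) translates directly into a few lines of code. The following sketch (the function name is chosen here for illustration) returns the convergents of $[0; a_1, \ldots, a_n]$ as exact fractions.

```python
from fractions import Fraction


def convergents(partial_quotients):
    """Convergents p_n/q_n of [0; a_1, a_2, ...] via the recursion (8.6)."""
    a = list(partial_quotients)
    p_prev, p = 0, 1       # p_0, p_1
    q_prev, q = 1, a[0]    # q_0, q_1
    result = [Fraction(p_prev, q_prev), Fraction(p, q)]
    for a_next in a[1:]:
        p_prev, p = p, a_next * p + p_prev
        q_prev, q = q, a_next * q + q_prev
        result.append(Fraction(p, q))
    return result


example = convergents([2, 1, 3])
print(example)  # 0, 1/2, 1/3, 4/11
```

For instance, $[0; 2, 1, 3] = 1/(2 + 1/(1 + 1/3)) = 4/11$, which is the last convergent returned.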
Exercise 8.19. Denote the convergents $\frac{p_n}{q_n} = \frac{p_n(a_1, \ldots, a_n)}{q_n(a_1, \ldots, a_n)}$. Show that
$$q_n(a_1, \ldots, a_{n-1}, a_n) = q_n(a_n, a_{n-1}, \ldots, a_1),$$
$$p_n(a_1, a_2, \ldots, a_{n-1}, a_n) = q_{n-1}(a_n, a_{n-1}, \ldots, a_2).$$
$$\frac{z p_n + p_{n-1}}{z q_n + q_{n-1}} = [0; a_1, a_2, \ldots, a_n, z].$$
for arbitrary sequences of non-zero integers $(a_n)$ and $(b_n)$ have convergents
$p_n/q_n$ satisfying the recursive relations
$$\begin{aligned} p_0 &= 0, & p_1 &= b_1, & p_{n+1} &= a_{n+1} p_n + b_{n+1} p_{n-1},\\ q_0 &= 1, & q_1 &= a_1, & q_{n+1} &= a_{n+1} q_n + b_{n+1} q_{n-1}. \end{aligned}$$
Figure 8.1. Farey convergents of x for some odd n and an+1 = an+2 = 2.
in mathematics, but his studies in number theory led to what are now called Farey fractions.
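Proof. The first return map T to [0, α) is piecewise affine and invertible
because $R_\alpha$ is. The first return time of 0 is $\lfloor 1/\alpha \rfloor + 1$. Taking into account the
rescaling of [0, α) to unit size, we obtain
$$T(0) = \frac{1}{\alpha}\Big( \Big( \Big\lfloor \frac{1}{\alpha} \Big\rfloor + 1 \Big) \alpha - 1 \Big) = 1 - \Big( \frac{1}{\alpha} - \Big\lfloor \frac{1}{\alpha} \Big\rfloor \Big) = 1 - G(\alpha).$$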
[Figure: Farey fractions in [0, 1], level by level, with θ = 0.3580 . . . marked: 0/1, 1/1; 1/2; 1/3, 2/3; 1/4, 2/5, 3/5, 3/4; 1/5, 2/7, 3/8, 3/7, 4/7, 5/8, 5/7, 4/5; 4/11.]
Sometimes $\frac{p}{q} \oplus \frac{p'}{q'}$ is called the Farey child of the Farey parents $\frac{p}{q}$
and $\frac{p'}{q'}$, but as this Farey child produces another child with either of its
parents and then again with all of its Farey children, the rationals become
a rather incestuous collection. The collection of all rationals (in [0, 1]) with
lines connecting Farey parents with children is called the Farey web; see
Figure 8.3.
In order to find the continued fraction of some $\theta \in (\frac{p}{q}, \frac{p'}{q'})$ using this web,
perform the Euclidean algorithm as in the proof of Theorem 8.22. That is,
to find the Farey convergents of θ, starting with $\frac{p_0}{q_0} = \frac{0}{1}$ and $\frac{p_1}{q_1} = \frac{1}{a_1}$, take
Farey sums “towards θ”, i.e. with the Farey parent on the other side of θ.
Just before you cross the vertical line at θ, we have a “true” convergent; see
Figure 8.3.
Every next Farey convergent that we find in this algorithm is the Farey
sum of the previous Farey convergent and “true” convergent, and it is neighbor
to both of them. At some point it is the turn of $\frac{p}{q}$ and $\frac{p'}{q'}$; in fact, since
$\theta \in (\frac{p}{q}, \frac{p'}{q'})$, at least one of them is a “true” convergent of θ.
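The procedure of taking Farey sums “towards θ” can be sketched as follows. This is only an illustration under the assumption that one starts from the Farey parents 0/1 and 1/1 and that θ ∈ (0, 1); the function name `farey_path` is hypothetical.

```python
from fractions import Fraction


def farey_path(theta, steps):
    """Successive Farey sums approaching theta, starting from 0/1 and 1/1."""
    left, right = Fraction(0, 1), Fraction(1, 1)
    path = []
    for _ in range(steps):
        # Farey sum (mediant) of the two current Farey neighbors.
        child = Fraction(left.numerator + right.numerator,
                         left.denominator + right.denominator)
        path.append(child)
        if child < theta:
            left = child      # theta lies to the right of the child
        elif child > theta:
            right = child     # theta lies to the left of the child
        else:
            break             # theta is rational and has been reached
    return path


path = farey_path(Fraction(358, 1000), 5)
print(path)  # 1/2, 1/3, 2/5, 3/8, 4/11
```

With θ = 0.358 the path ends at 4/11 after five steps, matching the fraction at the bottom of the Farey web figure.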
Clearly $0 = \frac{0}{1}$ and $1 = \frac{1}{1}$ are Farey neighbors, and their Farey sum is
$\frac{0}{1} \oplus \frac{1}{1} = \frac{1}{2}$. We see that $\frac{1}{2}$ is Farey neighbor to both $\frac{0}{1}$ and $\frac{1}{1}$. This is no
coincidence.
Lemma 8.26. Two rationals $0 \le \frac{p}{q} < \frac{p'}{q'} \le 1$ are Farey neighbors if and
only if $p'q - pq' = 1$. In this case they are both neighbors with $\frac{p}{q} \oplus \frac{p'}{q'}$ as well.
Proof. Clearly $\frac{p}{q} = \frac{0}{1}$ and $\frac{p'}{q'} = \frac{1}{1}$ are Farey neighbors and $p'q - pq' =
1 - 0 = 1$. We continue by induction, assuming that $\frac{p}{q} < \frac{p'}{q'}$ are Farey
neighbors and $p'q - pq' = 1$. Clearly $\frac{p}{q} < \frac{p}{q} \oplus \frac{p'}{q'} = \frac{p+p'}{q+q'} < \frac{p'}{q'}$ and both
$(p + p')q - p(q + q') = p'q - pq' = 1$ and $p'(q + q') - (p + p')q' = p'q - pq' = 1$.
Note also that if $\frac{p}{q} < \frac{a}{b} < \frac{p'}{q'}$ and $aq - pb = p'b - aq' = 1$, then $a(q + q') =
b(p + p')$ so $\frac{a}{b} = \frac{p+p'}{q+q'}$, but that doesn’t contribute to the proof.
Instead, it remains to check that $\frac{p}{q} \oplus \frac{p'}{q'}$ is the fraction with smallest
denominator between $\frac{p}{q}$ and $\frac{p'}{q'}$. Take any fraction $\frac{a}{b} \in (\frac{p}{q}, \frac{p'}{q'})$. Then there is
some η ∈ (0, 1) such that
$$0 < \frac{a}{b} - \frac{p}{q} = \eta \Big( \frac{p'}{q'} - \frac{p}{q} \Big) = \frac{\eta}{qq'} \quad \text{and} \quad 0 < \frac{p'}{q'} - \frac{a}{b} = (1 - \eta) \Big( \frac{p'}{q'} - \frac{p}{q} \Big) = \frac{1 - \eta}{qq'}.$$
Multiply out the denominators:
$$0 < q'(aq - pb) = \eta b \quad \text{and} \quad 0 < q(p'b - aq') = (1 - \eta) b.$$
Thus $aq - pb$ and $p'b - aq'$ are positive integers, and therefore $q + q' \le
q'(aq - pb) + q(p'b - aq') \le b$. This means that $\frac{p+p'}{q+q'}$ has indeed the smallest
denominator of all fractions in $(\frac{p}{q}, \frac{p'}{q'})$.
Exercise 8.27. Consider two circles of radii $\frac12$ and centers $(0, \frac12)$ and $(1, \frac12)$
in the plane. They are tangent to each other and “perched” on the horizontal
axis, with base-points 0 and 1. Inscribe the maximal circle between these
two and the horizontal axis; it clearly touches the axis in base-point $\frac12$. Continue
inscribing new circles in between the axis and neighboring circles; see Figure 8.4.
These circles are called Ford circles after the American mathematician Lester
Ford (1886–1967). Show that if the base-points of the neighboring circles are
$\frac{p}{q}$ and $\frac{p'}{q'}$ (in lowest terms), then the base-point of the new circle is
$\frac{p}{q} \oplus \frac{p'}{q'}$. Show that the diameter of a circle with base-point $\frac{p}{q}$ is $\frac{1}{q^2}$.
See [98, Chapter 9] for more details on Ford circles.
[Figure 8.4: Ford circles with base-points 0/1, 1/3, 1/2, 2/3, 1/1.]
see Figure 8.5. The Kepler tree contains all fractions $\frac{p}{q} \in (0, 1)$ and the
Calkin-Wilf tree contains all fractions $\frac{p}{q} \in (0, \infty)$, and all the indicated
fractions appear exactly once. For both trees, going up along branches in
the tree mimics the Euclidean algorithm in the sense that the descendant is
$\frac{a}{b}$ and the parent has denominator and numerator equal to max{a, b} and
|b − a| or vice versa.
[Figure 8.5: the Kepler tree, with rows 1/2; 1/3, 2/3; 1/4, 3/4, 2/5, 3/5; 1/5, 4/5, 3/7, 4/7, 2/7, 5/7, 3/8, 5/8, and the Calkin-Wilf tree, with rows 1/1; 1/2, 2/1; 1/3, 3/1, 2/3, 3/2; 1/4, 4/1, 3/4, 4/3, 2/5, 5/2, 3/5, 5/3.]
7 The tree was introduced earlier by Jean Berstel and Aldo de Luca [73] as a Raney tree,
since they drew some ideas from a paper by George Raney [469].
There is a single function f of the positive rationals, called the Calkin-
Wilf function, defined as
$$f(x) = \frac{1}{2\lfloor x \rfloor - x + 1},$$
such that the f-orbit of 1 denumerates all rationals in the Calkin-Wilf tree
row by row.
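The Calkin-Wilf function is easy to iterate with exact rational arithmetic. The sketch below (function names are illustrative) lists the first points of the f-orbit of 1; the rows of the tree appear as consecutive blocks of lengths 1, 2, 4, . . .

```python
from fractions import Fraction
from math import floor


def calkin_wilf_next(x):
    """One step of the Calkin-Wilf function f(x) = 1 / (2*floor(x) - x + 1)."""
    return 1 / (2 * floor(x) - x + 1)


def orbit(start, length):
    """The first `length` points of the f-orbit of `start`."""
    points = [Fraction(start)]
    while len(points) < length:
        points.append(calkin_wilf_next(points[-1]))
    return points


first_points = orbit(1, 7)
print(first_points)  # 1, 1/2, 2, 1/3, 3/2, 2/3, 3
```

As sets, the blocks {1}, {1/2, 2}, {1/3, 3/2, 2/3, 3} agree with the first three rows of the Calkin-Wilf tree; the order within a row depends on how the tree is drawn.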
Exercise 8.30. Express the rules (8.9) of the Kepler tree and the Calkin-
Wilf tree in terms of continued fractions. Find a function, similar to the
Calkin-Wilf function, that denumerates the rationals in (0, 1) row by row in
the Kepler tree.
8.2.2. Closest Returns for the Circle Rotation. Let θ ∈ [0, 1] \ Q, and
consider the integers q such that $R_\theta^q(0)$ is closer to $0 \in \mathbb{S}^1$ than $R_\theta^i(0)$ for all
0 < i < q. This means that qθ is closer to some integer p than ever before,
so $\frac{p}{q}$ is closer to θ than $\frac{a}{b}$ is for each integer 1 ≤ b < q. Theorem 8.22 tells
us that this happens if $\frac{p}{q}$ are Farey convergents.
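The closest return times can be found by brute force. The following floating-point sketch (names chosen here; the golden-mean rotation number is just an example) records every q at which qθ comes closer to an integer than at all earlier times.

```python
from math import sqrt


def closest_return_times(theta, max_time):
    """Times q <= max_time at which R_theta^q(0) is closer to 0 than ever before."""
    best = 1.0
    times = []
    for q in range(1, max_time + 1):
        x = (q * theta) % 1.0
        dist = min(x, 1.0 - x)   # distance to 0 on the circle
        if dist < best:
            best = dist
            times.append(q)
    return times


theta = (sqrt(5) - 1) / 2        # theta = [0; 1, 1, 1, ...]
returns = closest_return_times(theta, 40)
print(returns)  # 1, 2, 3, 5, 8, 13, 21, 34
```

For this θ the recursion (8.6) with all $a_n = 1$ gives Fibonacci denominators, and these are exactly the recorded times.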
left are $(q_1 + 2q_2)\theta \bmod 1$, $(q_1 + 3q_2)\theta \bmod 1$, . . . until $(q_1 + a_3 q_2)\theta \bmod 1 =
q_3\theta \bmod 1$. In this way, we obtain all the closest returns at times indicated
in the proposition.
Sketch of Proof. For the “only if” part, take an interval [a, b] ⊂ S1 ; uniform
distribution is then equivalent to (8.11) applied to the indicator function
1[a,b] . Next approximate 1[a,b] in L1 by continuous functions.
For the “if” part split f into its real and imaginary part (which are both
Riemann integrable), and approximate Re f and Im f by step functions.
Theorem 8.34 (Weyl’s Criterion). A sequence $(x_n)_{n\in\mathbb{N}}$ is uniformly distributed
if and only if
$$\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} e^{2\pi i j x_n} = 0 \quad \text{for every integer } j \ne 0. \qquad (8.12)$$
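Weyl’s criterion is easy to probe numerically. The sketch below (the helper name `weyl_average` and the choice α = √2 are illustrative assumptions) evaluates the averages in (8.12) for the sequence $x_n = \{n\alpha\}$.

```python
import cmath
from math import pi, sqrt


def weyl_average(xs, freq):
    """The average (1/N) * sum_n exp(2*pi*i*freq*x_n) appearing in (8.12)."""
    return sum(cmath.exp(2 * pi * 1j * freq * x) for x in xs) / len(xs)


N = 10_000
alpha = sqrt(2)
rotation_orbit = [(n * alpha) % 1.0 for n in range(1, N + 1)]
averages = [abs(weyl_average(rotation_orbit, freq)) for freq in (1, 2, 3)]
print(averages)  # all small compared to 1
```

For an irrational rotation the sum is geometric, so its modulus is at most $1/(N |\sin(\pi j \alpha)|)$, which explains why the averages are of order 1/N here.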
Proof. Since $x \mapsto e^{2\pi i j x}$ is continuous with integral 0, the “only if” part
follows immediately from Lemma 8.33.
For the “if” part use Lemma 8.33 again, and approximate an arbitrary
continuous function $f : \mathbb{S}^1 \to \mathbb{C}$ with $\int_{\mathbb{S}^1} f(x)\,dx = 0$ by Fourier series
$\sum_{j=-K}^{K} f_j e^{2\pi i j x}$. Here the Fourier coefficient $f_0 = 0$ because $\int_{\mathbb{S}^1} f(x)\,dx = 0$.
Now (8.12) implies that $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) = 0$, so $(x_n)_{n\in\mathbb{N}}$ is uniformly
distributed.
Theorem 8.35 (Van der Corput’s Difference Theorem). Let $(x_n)_{n\in\mathbb{N}}$ be a
sequence in $\mathbb{S}^1$. If $(x_{n+k} - x_n)_{n\in\mathbb{N}}$ is uniformly distributed for every
integer k ≥ 1, then $(x_n)_{n\in\mathbb{N}}$ is uniformly distributed.
$$f : \mathbb{T}^2 \to \mathbb{T}^2, \quad (x, y) \mapsto (x + \alpha,\ y + 2x + \alpha).$$
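A short induction shows that this skew product satisfies $f^n(0, 0) = (n\alpha, n^2\alpha) \bmod 1$, which is presumably how the quadratic sequence enters. The sketch below checks the identity exactly; a rational α is used here only so that the arithmetic is exact (the identity itself does not depend on α being irrational), and the function name is illustrative.

```python
from fractions import Fraction


def skew_step(point, alpha):
    """One application of f(x, y) = (x + alpha, y + 2x + alpha) on the 2-torus."""
    x, y = point
    return ((x + alpha) % 1, (y + 2 * x + alpha) % 1)


alpha = Fraction(3, 17)
point = (Fraction(0), Fraction(0))
orbit_matches = True
for n in range(1, 50):
    point = skew_step(point, alpha)
    # Compare with the closed form (n*alpha, n^2*alpha) mod 1.
    if point != ((n * alpha) % 1, (n * n * alpha) % 1):
        orbit_matches = False
print(orbit_matches)
```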
Exercise 8.39. Use Example 8.38 to show that $(\{\alpha n^p\})_{n\in\mathbb{N}}$ is well-distributed
for each p ∈ N. Conclude that $(\{\alpha p(n)\})$ is well-distributed for every
polynomial with rational coefficients.
$$\sup_N N D_N^*([a, b]) := \sup_N \sup_k \big| \#\{k + 1 \le n \le k + N : x_n \in [a, b)\} - N(b - a) \big|$$
8.3. Uniformly Distributed Sequences 389
is finite. Indeed, $\{(n - j)\alpha - a\} = \{n\alpha - a\} - \{j\alpha\} + 1_{[a,b)}(\{n\alpha\})$, and hence
$$\Big| \sum_{n=0}^{N-1} \big( 1_{[a,b)}(\{n\alpha\}) - (b - a) \big) \Big| = \Big| \sum_{n=0}^{N-1} \big( \{(n - j)\alpha - a\} - \{n\alpha - a\} \big) \Big| = \Big| \sum_{n=-j}^{-1} \{n\alpha - a\} - \sum_{n=N-j}^{N-1} \{n\alpha - a\} \Big| \le j.$$
Proof. Without loss of generality, we can reorder the points $x_n$ such that
$0 := x_0 \le x_1 \le \cdots \le x_N \le x_{N+1} := 1$. Integration by parts and telescoping
series give
$$\begin{aligned} \sum_{n=0}^{N} \int_{x_n}^{x_{n+1}} \Big( t - \frac{n}{N} \Big)\, df(t) &= \int_0^1 t\, df(t) - \sum_{n=0}^{N} \frac{n}{N} \big( f(x_{n+1}) - f(x_n) \big)\\ &= [t f(t)]_0^1 - \int_0^1 f(t)\, dt - \sum_{n=0}^{N} \Big( \frac{n+1}{N} f(x_{n+1}) - \frac{1}{N} f(x_{n+1}) - \frac{n}{N} f(x_n) \Big)\\ &= f(1) - \int_0^1 f(t)\, dt + \frac{1}{N} \sum_{n=0}^{N} f(x_{n+1}) - \frac{N+1}{N} f(x_{N+1}) + \frac{0}{N} f(x_0)\\ &= \frac{1}{N} \sum_{n=1}^{N} f(x_n) - \int_0^1 f(t)\, dt. \qquad (8.16) \end{aligned}$$
We used the Stieltjes integral with df(t) instead of our usual notation f′(t)dt,
because f need not be differentiable to carry out this step. Also $\int_{x_n}^{x_{n+1}} |df| =
\mathrm{Var}(f|_{[x_n, x_{n+1}]})$. Note that
$$\begin{aligned} D_N^* &= \max_{n=0,\ldots,N}\ \sup_{x_n < a \le x_{n+1}} \Big| \frac{1}{N} \#\{1 \le i \le N : x_i \in [0, a)\} - a \Big|\\ &= \max_{n=0,\ldots,N}\ \sup_{x_n < a \le x_{n+1}} \Big| \frac{n}{N} - a \Big| = \max_{n=0,\ldots,N} \max\Big\{ \Big| \frac{n}{N} - x_n \Big|, \Big| \frac{n}{N} - x_{n+1} \Big| \Big\}. \end{aligned}$$
From (8.16), we get
$$\begin{aligned} \Big| \int_0^1 f(t)\, dt - \frac{1}{N} \sum_{n=1}^{N} f(x_n) \Big| &= \Big| \sum_{n=0}^{N} \int_{x_n}^{x_{n+1}} \Big( t - \frac{n}{N} \Big)\, df(t) \Big|\\ &\le \sum_{n=0}^{N} \int_{x_n}^{x_{n+1}} \max\Big\{ \Big| x_n - \frac{n}{N} \Big|, \Big| x_{n+1} - \frac{n}{N} \Big| \Big\}\, |df|\\ &\le D_N^* \sum_{n=0}^{N} \mathrm{Var}(f|_{[x_n, x_{n+1}]}) = D_N^*\, \mathrm{Var}(f), \end{aligned}$$
as required.
The special case that xn = {αn} for irrational α is called the Denjoy-
Koksma inequality. It applies to specific values of N .
Theorem 8.42. Let $f : \mathbb{S}^1 \to \mathbb{R}$ have bounded variation and let p/q be a
convergent of the irrational α. Then
$$\Big| \frac{1}{q} \sum_{j=1}^{q} f(\alpha j) - \int_{\mathbb{S}^1} f(x)\, dx \Big| \le \frac{1}{q} \mathrm{Var}(f). \qquad (8.17)$$
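The bound (8.17) can be illustrated numerically. The sketch below makes several assumptions for the sake of a concrete example: the test function $f(x) = |x - \frac12|$ (variation 1 on the circle, integral 1/4), the rotation number α = √2 − 1 = [0; 2, 2, 2, . . . ], and its convergent denominators 2, 5, 12, 29, 70 obtained from (8.6).

```python
from math import sqrt


def birkhoff_error(f, alpha, q, integral):
    """| (1/q) * sum_{j=1}^{q} f(j*alpha mod 1) - integral |."""
    average = sum(f((j * alpha) % 1.0) for j in range(1, q + 1)) / q
    return abs(average - integral)


def tent(x):
    """f(x) = |x - 1/2|: continuous on the circle, Var(f) = 1, integral 1/4."""
    return abs(x - 0.5)


alpha = sqrt(2) - 1
denominators = [2, 5, 12, 29, 70]   # q_n for [0; 2, 2, 2, ...]
errors = {q: birkhoff_error(tent, alpha, q, 0.25) for q in denominators}
for q, err in errors.items():
    print(q, err, 1 / q)            # the error stays below Var(f)/q = 1/q
```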
and rational solutions to systems of linear equations, and he didn’t go for approximations.
Proof. Take N ∈ N arbitrary and let P = {0, {x}, {2x}, . . . , {(N − 1)x}, 1}
⊂ [0, 1], where {x} denotes the fractional part of x. Since #P = N + 1, by
the pigeon hole principle at least two of them must have a distance ≤ 1/N .
Suppose these two points are {mx} and {nx}. Then
$$\frac{1}{N} \ge \big| \{mx\} - \{nx\} \big| = \big| (mx - m') - (nx - n') \big| = |qx - p|$$
for $q = |m - n| < N$ and $p = |m' - n'|$. Therefore $\big| x - \frac{p}{q} \big| \le \frac{1}{Nq} \le \frac{1}{q^2}$. Similar
arguments hold if the two points are 0 and {mx}, or {nx} and 1.
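The pigeonhole argument guarantees a denominator q < N with $|qx - p| \le 1/N$, and such a pair can be found by direct search. A minimal sketch (function name chosen here, N ≥ 2 assumed):

```python
from math import pi


def dirichlet_approximation(x, N):
    """Smallest 1 <= q < N with |q*x - p| <= 1/N for the nearest integer p."""
    for q in range(1, N):
        p = round(q * x)
        if abs(q * x - p) <= 1 / N:
            return p, q
    raise ValueError("no approximation found; N should be at least 2")


p, q = dirichlet_approximation(pi, 10)
print(p, q)  # 22 7
```

For x = π and N = 10 the search returns the classical approximation 22/7, and indeed $|\pi - 22/7| \le 1/(10 \cdot 7)$.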
However, Freiman [257] proved in 1975 that the maximal line (called Hall
line) that L contains is $[k_*, \infty)$ for
$$k_* = \frac{2221564096 + 283748\sqrt{462}}{491993569} \approx 4.52782956616\ldots.$$
The numbers x ∈ R for which k(x) in (8.20) is finite are irrationals of
bounded type, i.e. with a bounded sequence $(a_i)_{i\in\mathbb{N}}$ in their continued fraction
expansion. One special class of these is the quadratic numbers, i.e.
irrational solutions to quadratic equations $ax^2 + bx + c = 0$ for a, b, c ∈ Z,
because these have an eventually periodic sequence $(a_i)_{i\in\mathbb{N}}$. This is
Lagrange’s¹¹ Theorem; see [385].
Theorem 8.45. An irrational x is a quadratic number if and only if the
sequence $(a_i)_{i\in\mathbb{N}}$ of partial quotients (i.e. digits in its continued fraction
expansion) is (eventually) periodic.
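For square roots of non-square integers the periodicity can be seen explicitly with the classical integer recurrence for the partial quotients. This is a standard algorithm, not taken from the text; it uses the known fact that the period of $\sqrt{n}$ ends with the partial quotient $2\lfloor\sqrt{n}\rfloor$.

```python
from math import isqrt


def sqrt_continued_fraction(n):
    """Return (a0, period) with sqrt(n) = [a0; period, period, ...]."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        raise ValueError("n must not be a perfect square")
    m, d, a = 0, 1, a0
    period = []
    # The complete quotients are (sqrt(n) + m) / d with integers m, d.
    while a != 2 * a0:
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        period.append(a)
    return a0, period


print(sqrt_continued_fraction(7))   # (2, [1, 1, 1, 4])
print(sqrt_continued_fraction(2))   # (1, [2])
```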
Proof. Write
$$Q_{k,\nu} = \bigcup_{p/q \in \mathbb{Q}} \Big( \frac{p}{q} - \frac{1}{k q^{2+\nu}},\ \frac{p}{q} + \frac{1}{k q^{2+\nu}} \Big).$$
for at most finitely many q ∈ N. But every x ∈ [0, 1] with this property is
Diophantine.
e = [2; 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, 1, 1, 10, 1, 1, . . . ];
8.5. Density and Banach Density 395
are called the lower and upper density of E. If they coincide, then $d(E) :=
\lim_n \frac{1}{n} \#(E \cap \{1, \ldots, n\})$ is the density of E.
More generally, the lower Banach density and upper Banach density
of E are
$$\begin{cases} d_*(E) := \liminf_{m,n\to\infty} \frac{1}{m} \#(E \cap \{n + 1, \ldots, n + m\}),\\ d^*(E) := \limsup_{m,n\to\infty} \frac{1}{m} \#(E \cap \{n + 1, \ldots, n + m\}). \end{cases}$$
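Proof. ⇐: Assume that $\lim_{E \not\ni n \to \infty} a_n = 0$, and for ε > 0, take N such that
$a_n < \varepsilon$ for all $E \not\ni n \ge N$. Also let $A = \sup a_n$. Then
$$0 \le \frac{1}{n} \sum_{i=1}^{n} a_i = \frac{1}{n} \sum_{E \not\ni i=1}^{n} a_i + \frac{1}{n} \sum_{E \ni i=1}^{n} a_i \le \frac{NA + (n - N)\varepsilon}{n} + \frac{A}{n} \#(E \cap \{1, \ldots, n\}) \to \varepsilon,$$
as n → ∞. Since ε > 0 is arbitrary, $\lim_n \frac{1}{n} \sum_{i=1}^{n} a_i = 0$.
⇒: Let $E_m = \{n : a_n \ge \frac{1}{m}\}$. Then clearly $E_1 \subset E_2 \subset E_3 \subset \cdots$ and each
$E_m$ has density 0 because
$$0 = m \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} a_i \ge \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} 1_{E_m}(i) = \lim_{n\to\infty} \frac{1}{n} \#(E_m \cap \{1, \ldots, n\}).$$
Now take $0 = N_0 < N_1 < N_2 < \cdots$ such that $\frac{1}{n} \#(E_m \cap \{1, \ldots, n\}) < \frac{1}{m}$ for
every $n \ge N_{m-1}$. Let $E = \bigcup_m (E_m \cap \{N_{m-1} + 1, \ldots, N_m\})$.
Then, taking m = m(n) maximal such that $N_{m-1} \le n$,
$$\begin{aligned} \frac{1}{n} \#(E \cap \{1, \ldots, n\}) &\le \frac{1}{n} \#(E_{m-1} \cap \{1, \ldots, N_{m-1}\}) + \frac{1}{n} \#(E_m \cap \{N_{m-1} + 1, \ldots, n\})\\ &\le \frac{1}{N_{m-1}} \#(E_{m-1} \cap \{1, \ldots, N_{m-1}\}) + \frac{1}{n} \#(E_m \cap \{1, \ldots, n\})\\ &\le \frac{1}{m-1} + \frac{1}{m} \to 0 \end{aligned}$$
as n → ∞.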
Corollary 8.54. For every non-negative sequence $(a_n)_{n\ge0}$,
$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} a_i = 0 \quad \text{if and only if} \quad \lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} a_i^2 = 0.$$
Proof. By the previous lemma, the average $\lim_n \frac{1}{n} \sum_{i=1}^{n} a_i = 0$ if and only
if $\lim_{E \not\ni n \to \infty} a_n = 0$ for a set E of zero density. But the latter is clearly
equivalent to $\lim_{E \not\ni n \to \infty} a_n^2 = 0$ for the same set E. Applying the lemma
again, we arrive at $\lim_n \frac{1}{n} \sum_{i=1}^{n} a_i^2 = 0$.
Take ε > 0 arbitrary, and find $x_0$ so large that $d(E) - \varepsilon \le \frac{A(x)}{x} \le d(E) + \varepsilon$
for all $x \ge x_0$. Substitute these inequalities into the integral of (8.25) to
obtain
$$L(x) \ge \int_{x_0}^{x} \frac{d(E) - \varepsilon}{t}\, dt \ge (d(E) - \varepsilon)(\log x - \log x_0)$$
and, since $\frac{A(t)}{t} \le 1$ for all t,
$$L(x) \le 1 + \int_1^{x_0} \frac{1}{t}\, dt + \int_{x_0}^{x} \frac{d(E) + \varepsilon}{t}\, dt \le 1 + \log x_0 + (d(E) + \varepsilon)(\log x - \log x_0).$$
Divide these inequalities by log x and take the limit x → ∞. Finally letting
ε → 0 gives the result.
Exercise 8.57. Let $E = \bigcup_{n\ge1} \{k \in \mathbb{N} : 2^{2n} \le k < 2^{2n+1}\}$. Show that
$$\frac13 = \underline{d}(E) < \frac12 = \delta(E) < \frac23 = \overline{d}(E).$$
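The two extreme values in Exercise 8.57 can be observed numerically (this is a sanity check, not a solution). The sketch below counts the elements of E up to x exactly and evaluates the ratio at the ends and at the beginnings of the blocks; the function name is illustrative.

```python
def count_E(x):
    """Number of k <= x lying in some block 2^(2n) <= k < 2^(2n+1), n >= 1."""
    total, n = 0, 1
    while 2 ** (2 * n) <= x:
        block_start = 2 ** (2 * n)
        block_end = 2 ** (2 * n + 1) - 1
        total += min(x, block_end) - block_start + 1
        n += 1
    return total


x_high = 2 ** 21 - 1   # just after a block has ended: ratio near 2/3
x_low = 2 ** 20 - 1    # just before a block starts: ratio near 1/3
ratio_high = count_E(x_high) / x_high
ratio_low = count_E(x_low) / x_low
print(ratio_high, ratio_low)
```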
12 See Definition 3.6. We emphasize that A need not be an integer matrix in this theorem.
8.6. The Perron-Frobenius Theorem 399
Section 6.3.5) or [121, 168, 169]. This is also why rank r transformations
(namely if all the A(i) are $d_i \times d_i$-matrices with $d_i \le r$) have at most r
ergodic measures. The following example shows that this upper bound r is
strict.
Lemma 8.59. For n ≥ 1, let $A(n) = \begin{pmatrix} c_n & 1\\ 1 & c_n \end{pmatrix}$, with $\sum_{n\ge1} \frac{1}{c_n + 1} < \frac12$.
Then $S_\infty$ in (8.26) is a non-degenerate arc.
Proof. Let $(a_1, b_1) = (1, 0)$ and inductively $(a_{n+1}, b_{n+1}) = (a_n, b_n)A(n)$.
Set $\lambda_n = a_n/(a_n + b_n) \in [0, 1]$, so $\lambda_1 = 1$. If we parametrize the simplex by
S = {(t, 1 − t) : t ∈ [0, 1]}, then
$$\lambda_{n+1} = f_n(\lambda_n) = \frac{c_n \lambda_n + 1 - \lambda_n}{c_n + 1} = \lambda_n + \frac{1 - 2\lambda_n}{c_n + 1} \ge \lambda_n - \frac{1}{c_n + 1}.$$
Therefore $(\lambda_n)_{n\ge1}$ is a decreasing sequence with limit $\lambda_\infty \ge 1 - \sum_{n\ge1} \frac{1}{c_n + 1} > \frac12$
by assumption. By symmetry, the same procedure starting with $(a_1, b_1) =
(0, 1)$ produces an increasing sequence with limit $1 - \lambda_\infty < \frac12$. Therefore $S_\infty$
is the non-degenerate arc $\{(t, 1 - t) : t \in [1 - \lambda_\infty, \lambda_\infty]\}$.
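The recursion for $\lambda_n$ in the proof can be followed with exact fractions. In the sketch below the choice $c_n = 2^{n+2}$ is an assumption made only for illustration; it satisfies $\sum_n 1/(c_n + 1) < 1/4 < 1/2$, so the proof predicts a decreasing sequence bounded below by 3/4.

```python
from fractions import Fraction


def lambda_sequence(c_values):
    """lambda_1 = 1 and lambda_{n+1} = (c_n*lambda_n + 1 - lambda_n) / (c_n + 1)."""
    lam = Fraction(1)
    sequence = [lam]
    for c_n in c_values:
        lam = (c_n * lam + 1 - lam) / (c_n + 1)
        sequence.append(lam)
    return sequence


c_values = [2 ** (n + 2) for n in range(1, 21)]   # hypothetical choice of c_n
lams = lambda_sequence(c_values)
print(float(lams[-1]))  # stays well above 1/2
```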
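If $T : \mathbb{R}^d_{\ge0} \to \mathbb{R}^{d'}_{\ge0}$ has the matrix representation $A = (a_{i,j})_{i=1,j=1}^{d',d}$, then
$\mathrm{Diam}(T(\mathbb{R}^d_{\ge0})) = 2 \log \rho(A)$ for
$$\rho(A) := \max_{1 \le j, j' \le d} \sqrt{\frac{\max\{a_{k,j}/a_{k,j'} : 1 \le k \le d\}}{\min\{a_{k,j}/a_{k,j'} : 1 \le k \le d\}}}, \qquad (8.29)$$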
as ρ → 1. In particular, if d′ = d, i.e. T maps the cone $\mathbb{R}^d_{\ge0}$ into itself, and its
matrix representation is strictly positive, then $\bigcap_n T^n(\mathbb{R}^d_{\ge0})$ is a single half-line,
and the convergence to this half-line is exponential. More generally, we
have the following result (see [121, Proposition 3]):
Lemma 8.61. Let $(C_n)_{n\ge1}$ be a sequence of cones and let $T_n : C_{n+1} \to C_n$ be
linear transformations with matrix representations $A_n$ such that $\rho_n := \rho(A_n)$
satisfies $\sum_{n=1}^{\infty} 1/\rho_n = \infty$. Then $\bigcap_n T_1 \circ T_2 \circ \cdots \circ T_n(C_{n+1})$ is a single half-line.
Proof. By (8.30), the contraction factor of $T_n$ is $\frac{\sqrt{\rho_n} - \sqrt{1/\rho_n}}{\sqrt{\rho_n} + \sqrt{1/\rho_n}} \sim 1 - 2/\rho_n$.
Thus the infinite product of contraction factors $\prod_n (1 - 2/\rho_n) = 0$ provided
$\sum_n 1/\rho_n = \infty$. This proves the lemma.
matrix A — compare the last row of Table 8.1 and Proposition 8.64. We
call corresponding classes of matrices transient, null recurrent, weakly
positive recurrent, and strongly positive recurrent.
Remark 8.65. Note that $F'_{ii}(x) = \frac{1}{x} \sum_{n\ge1} n f_{ii}^{(n)} x^n$, and several authors
give the second line of this table in the form of $\sum_{n\ge1} n f_{ii}^{(n)} R^n$.
In order to find out which box in Table 8.1 a matrix fits in, one can use
Salama’s criteria (see [481, 484]); they depend on whether the underlying
graph G can be enlarged/reduced (in the class of strongly connected directed
graphs) without changing the entropy.
Also Φii < R for some (and hence every) vertex i implies that G is positive
recurrent.
8.7. Countable Graphs and Matrices 405
Example 8.67. Let G have vertex set N and directed edges n + 1 → n and
1 → n for each n ∈ N. Then the truncated n × n transition matrix is
$$A_n = \begin{pmatrix} 1 & 1 & 1 & \cdots & 1\\ 1 & 0 & 0 & \cdots & 0\\ 0 & 1 & 0 & \ddots & \vdots\\ \vdots & \ddots & \ddots & \ddots & 0\\ 0 & \cdots & 0 & 1 & 0 \end{pmatrix},$$
so $\det(A_1 - \lambda \mathbb{1}) = 1 - \lambda$, and
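$$F_{11}(z) = \sum_{n=1}^{\infty} z^n = 1 \quad \text{for } z = \frac12 = e^{-h_G(G)}$$
and
$$F'_{11}(z) = \sum_{n=1}^{\infty} n z^{n-1} = \frac{1}{(1 - z)^2} = 4 < \infty \quad \text{for } z = \frac12.$$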
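That the Gurevich entropy in Example 8.67 equals log 2 can be made plausible by computing the leading eigenvalue of the truncated matrices $A_n$. The sketch below uses plain power iteration; the helper names and the truncation size n = 12 are choices made here. The leading eigenvalue of $A_n$ solves $\sum_{k=1}^{n} x^{-k} = 1$ and increases to 2 as n grows.

```python
def truncated_matrix(n):
    """A_n: first row all ones (edges 1 -> j) and ones below the diagonal (edges i+1 -> i)."""
    A = [[0] * n for _ in range(n)]
    for j in range(n):
        A[0][j] = 1
    for i in range(1, n):
        A[i][i - 1] = 1
    return A


def spectral_radius(A, iterations=500):
    """Power iteration for a primitive non-negative matrix."""
    n = len(A)
    v = [1.0] * n
    estimate = 0.0
    for _ in range(iterations):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        estimate = max(w)            # v has maximum entry 1, so this tends to the eigenvalue
        v = [entry / estimate for entry in w]
    return estimate


rho_12 = spectral_radius(truncated_matrix(12))
print(rho_12)  # slightly below 2
```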
Figure 8.8. The transition graph and matrix for the symmetric random
walk on Z. Here $A = (a_{i,j})_{i,j\in\mathbb{Z}}$ with $a_{i,j} = 1$ if |i − j| = 1 and $a_{i,j} = 0$ otherwise.
Since every vertex has exactly two outgoing (and two incoming) edges,
the Gurevich entropy $h_G(G) = \log 2$. In more detail, using the reflection
principle (see Exercise 3.132) and Stirling’s formula, we obtain the following
for every i ∈ Z and even n ∈ N:
$$\begin{aligned} f_{ii}^{(n+2)} &= 2\left[ \binom{n}{n/2} - \binom{n}{(n-2)/2} \right]\\ &= 2\left[ \frac{n!}{(n/2)!\,(n/2)!} - \frac{n!}{(n/2 - 1)!\,(n/2 + 1)!} \right]\\ &= \frac{2\, n!}{(n/2)!\,(n/2 - 1)!} \left[ \frac{1}{n/2} - \frac{1}{n/2 + 1} \right]\\ &= \frac{4}{n + 2} \binom{n}{n/2} \sim \sqrt{\frac{2}{\pi n}}\, \frac{2^{n+2}}{n + 2}. \end{aligned}$$
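The closed formula for the number of first-return loops can be compared with a brute-force count over all ±1 step sequences. The sketch below (function names are illustrative) does this for small even n.

```python
from itertools import product
from math import comb


def first_return_count(length):
    """Number of +-1 walks of the given length that return to the start for the first time at the end."""
    count = 0
    for steps in product((-1, 1), repeat=length):
        position = 0
        for k, step in enumerate(steps, start=1):
            position += step
            if position == 0:
                break               # first return happens at time k
        if position == 0 and k == length:
            count += 1
    return count


def formula(n):
    """f_ii^(n+2) = 4/(n+2) * binom(n, n/2) for even n."""
    return 4 * comb(n, n // 2) // (n + 2)


comparison = [(first_return_count(n + 2), formula(n)) for n in range(0, 11, 2)]
print(comparison)
```

Each pair in the output consists of two equal numbers, namely twice a Catalan number.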
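P = {[−N ], . . . , [N ]} as
$$\begin{aligned} h(\mu_N, P) &= \lim_{n\to\infty} \frac{1}{n} H\Big( \bigvee_{k=0}^{n-1} \sigma^{-k}(P) \Big)\\ &= -\lim_{n\to\infty} \frac{1}{n} \sum_{i_0, \ldots, i_{n-1} = -N}^{N} \mu_N([i_0 \ldots i_{n-1}]) \log \mu_N([i_0 \ldots i_{n-1}])\\ &= \lim_{n\to\infty} \frac{1}{n} \sum_{i_0 = -N}^{N}\ \sum_{i_1, \ldots, i_{n-1} = -N}^{N} p_{i_0}\, 2^{-\#\{1 \le k < n\, :\, p_{i_{k-1}, i_k} = \frac12\}} \Big( -\log p_{i_0} + \#\Big\{ 1 \le k < n : p_{i_{k-1}, i_k} = \frac12 \Big\} \log 2 \Big)\\ &= \lim_{n\to\infty} \frac{1}{n} \sum_{i_0 = -N}^{N}\ \sum_{i_1, \ldots, i_{n-1} = -N}^{N} p_{i_0}\, 2^{-\#\{0 \le k < n-1\, :\, i_k \ne \pm N\}} \big( -\log p_{i_0} + \#\{0 \le k < n - 1 : i_k \ne \pm N\} \log 2 \big). \end{aligned}$$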
By the Birkhoff Ergodic Theorem 6.13, μ-a.e. n-path satisfies $\#\{0 \le k <
n - 1 : i_k \ne \pm N\} \sim \frac{2N-1}{2N} n$. Therefore $h(\mu_N, P) = \frac{2N-1}{2N} \log 2$. The partition
P obviously generates the truncated path space, and since there is no invariant
measure more evenly spread out, $h_G(G_N) = h(\mu_N) = \frac{2N-1}{2N} \log 2$. However,
in the weak∗ topology, $\mu_N$ doesn’t converge, and in the vague topology
(i.e. weak∗ topology restricted to compact subsets) $\mu_N$ converges to the zero-
measure. The entire system G has no measure of maximal entropy. Indeed,
regarding the existence of measures of maximal entropy and intrinsic ergodicity,
Gurevich [292] proved the following:
Not only mass but also entropy “escapes to infinity”, and one would like to
quantify how much entropy is carried “at infinity”, i.e. outside every compact
subgraph. This is addressed by papers by Buzzi [136] and Iommi, Todd &
Velozo [327].
As it was shown in [327, Theorem 1.4] that these three quantities coincide,
we can call them the entropy at infinity of G.
It was shown in [294] and [293, Theorem 3.8] that G is strongly positive
recurrent if and only if δ∞ < hG (G). The entropy at infinity δ∞ is precisely
the defect that may occur in upper semi-continuity of the entropy function;
see [327, Theorem 1.1].
Theorem 8.70. Let G be a countable directed graph with path space XG and
finite Gurevich entropy. Let (μn )n∈N be a sequence of probability measures
that converge to μ on cylinders (i.e. μn ([Z]) → μ(Z) for every cylinder set).
Then
lim sup h(μn ) = μ(XG ) h(μ) + (1 − μ(XG )) δ∞ .
n→∞
The name rome16 was coined by Misiurewicz, since all roads lead to
Rome. Clearly G itself is a rome, as is every connected subgraph of G that
contains a rome R.
Let $B = (b_{i,j})_{i,j=1}^{n}$ be the transition matrix associated to G, so we enumerated
vertices of G as {1, . . . , n}. A simple (i.e. without self-intersections)
path p of length l(p) is given by $i = i_0 \to i_1 \to \cdots \to i_{l(p)} = j$, where i, j ∈ R,
but the intermediate vertices belong to G \ R. Let $w(p) = \prod_{k=1}^{l(p)} b_{i_{k-1}, i_k}$ be
the weight of p. The rome matrix $A_{rome}(x) = (a_{i,j}(x))$, where i, j run over
the vertices of R, is given by
$$a_{i,j}(x) = \sum_{p} w(p)\, x^{1 - l(p)},$$
where the sum runs over all simple paths p as above. (Note that with the
convention that $x^0 = 1$ for x = 0, $A_{rome}(0)$ reduces to the weighted transition
matrix of the rome R.) The result from [88, Theorem 1.7]¹⁷ is:
Theorem 8.72. Let G be a transition graph containing a rome R of cardi-
nalities n and r, respectively. The characteristic polynomial of its associated
matrix B is equal to
$$\det(B - x I_n) = (-x)^{n-r} \det(A_{rome}(x) - x I_r), \qquad (8.32)$$
where In and Ir are the identity matrices of the appropriate dimensions.
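Formula (8.32) can be tested on a small example with exact arithmetic. The sketch below is an illustration only: the three-vertex graph (a cycle 1 → 2 → 3 → 1 with an extra loop at vertex 1), the rome R = {1}, and all function names are assumptions made here, and vertices are indexed from 0 in the code.

```python
from fractions import Fraction


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def rome_matrix(B, rome, x):
    """a_ij(x) = sum over paths p from i to j with interior outside the rome of w(p) * x^(1 - l(p))."""
    n = len(B)
    A = [[Fraction(0)] * len(rome) for _ in rome]

    def extend(row, vertex, weight, length, visited):
        for nxt in range(n):
            if B[vertex][nxt] == 0:
                continue
            w = weight * B[vertex][nxt]
            if nxt in rome:
                A[row][rome.index(nxt)] += w * x ** (-length)   # path length is length + 1
            elif nxt not in visited:
                extend(row, nxt, w, length + 1, visited | {nxt})

    for row, start in enumerate(rome):
        extend(row, start, 1, 0, frozenset())
    return A


def both_sides(B, rome, x):
    """Left-hand and right-hand side of (8.32) at a non-zero rational x."""
    n, r = len(B), len(rome)
    lhs = det([[B[i][j] - (x if i == j else 0) for j in range(n)] for i in range(n)])
    A = rome_matrix(B, rome, x)
    rhs = (-x) ** (n - r) * det([[A[i][j] - (x if i == j else 0) for j in range(r)]
                                 for i in range(r)])
    return lhs, rhs


B = [[1, 1, 0], [0, 0, 1], [1, 0, 0]]
print(both_sides(B, [0], Fraction(3)))  # both entries equal -17
```

Here $a_{1,1}(x) = 1 + x^{-2}$, so the right-hand side is $x^2(1 + x^{-2} - x) = -x^3 + x^2 + 1$, which is indeed the characteristic polynomial of B.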
Proof. Clearly G itself is a rome of G, and in this case all the path lengths
l(p) = 1 and ai,j (x) = bi,j , so det(Arome (x) − xIr ) = det(B − xIn ) holds.
Now we argue by induction, decreasing the number of vertices in steps
of one until we get down to R. Recall that the graphs of the intermediate
steps are all romes by themselves.
Let $S = \{s_1, s_2, \ldots, s_k\}$ be the vertex set of such an intermediate rome,
and let $S' = \{s_0\} \cup S$ be the vertex set of the rome at the previous step. Set
16 The original definition in [88] says that there are no loops disjoint from R, but that is for
Theorem 8.73. Let the directed graph G consist of a vertex $v_0$ from which $q_\ell$
loops of length ℓ emerge. Let $H_* := \limsup_n \frac{1}{n} \log \#\{\text{closed } n\text{-paths in } G\}$.
Then $e^{H_*}$ is the positive solution of the equation
$$1 = \sum_{\ell \ge 1} q_\ell\, x^{-\ell},$$
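following:
• If $x_\infty < x_* - 2\varepsilon$ for some positive ε, then there is N such that
$\sum_{\ell=1}^{N} q_\ell\, x_N^{-\ell} \ge \sum_{\ell=1}^{N} q_\ell\, (x_* - \varepsilon)^{-\ell} > 1$.
• If $x_\infty > x_* + 2\varepsilon$ for some positive ε, then there is N such that
$x_N > x_* + \varepsilon$ and $\sum_{\ell=1}^{\infty} q_\ell\, x_*^{-\ell} > \sum_{\ell=1}^{N} q_\ell\, x_N^{-\ell} = 1$.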
Let $G_N$ be the subgraph of G consisting of loops of length ≤ N and define
$$H_N := \lim_{n\to\infty} \frac{1}{n} \log \#\{\text{closed } n\text{-paths in } G_N\}.$$
The existence of the limit $H_N$ follows from Fekete’s Lemma 1.15 because
#{closed n-paths in $G_N$} is supermultiplicative in n. Clearly $H_N$ is non-
decreasing in N ∈ N; set $H_\infty = \lim_{N\to\infty} H_N \in [0, \infty]$. Since $H_* \ge H_N$ for
all N , also $H_* \ge H_\infty$. The above argument on the finite graphs case shows
that $x_N = e^{H_N}$, and therefore $x_\infty = e^{H_\infty} \le e^{H_*}$.
Next assume $x_\infty < \infty$. Assume by contradiction that $e^{H_*} > x_\infty$. Then
there is N such that $\log x_\infty < \frac{1}{N} \log K$ for K := #{closed N -paths in G}.
There are also K N -loops in the graph $G_N$, and #{closed rN -paths in $G_N$} ≥
$K^r$ for every r ∈ N. Therefore $\log x_N = H_N \ge \frac{1}{N} \log K > \log x_\infty$, and this
contradicts the monotonicity $x_N \nearrow x_\infty$.
We have $H_* \ge \limsup_\ell \frac{1}{\ell} \log q_\ell = Q$, because #{closed ℓ-paths in G} ≥
$q_\ell$. This covers the case Q = ∞ with $H_* = \infty$ as well.
Assume that Q < ∞ and $x_*$ doesn’t exist because $\sum_\ell q_\ell\, x_\infty^{-\ell} < 1$. If
$H_* > Q$, then we can find N such that $H_* \ge H_N > Q$, and similarly as
in the above argument, $x_N = e^{H_N} > e^{Q}$. However, this contradicts that
$\sum_\ell q_\ell\, e^{-\ell Q} < 1$. Therefore $H_* = Q = \log x_\infty$.
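For a graph with finitely many loops the statement of Theorem 8.73 can be checked numerically. The sketch below makes two simplifying assumptions: the loop counts $q_\ell$ are given as a finite dictionary, and only closed paths based at $v_0$ are counted (via the renewal recursion $c_n = \sum_\ell q_\ell c_{n-\ell}$), which has the same exponential growth rate. All names are illustrative.

```python
from math import log


def loop_sum(q, x):
    """sum over l of q_l * x^(-l) for a finite dictionary {length: number of loops}."""
    return sum(count * x ** (-length) for length, count in q.items())


def solve_loop_equation(q, steps=200):
    """Bisection for the solution x >= 1 of 1 = sum_l q_l x^(-l); the sum is decreasing in x."""
    lo, hi = 1.0, 1.0 + sum(q.values())
    for _ in range(steps):
        mid = (lo + hi) / 2
        if loop_sum(q, mid) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def closed_paths_at_v0(q, n):
    """Number of closed n-paths from v0 to v0, by decomposing into successive loops."""
    counts = [1] + [0] * n
    for m in range(1, n + 1):
        counts[m] = sum(c * counts[m - l] for l, c in q.items() if l <= m)
    return counts[n]


q = {1: 1, 2: 1}                       # one loop of length 1 and one of length 2
x_star = solve_loop_equation(q)
growth = log(closed_paths_at_v0(q, 200)) / 200
print(x_star, log(x_star), growth)     # x_star is the golden mean
```

With one loop of length 1 and one of length 2 the equation reads $1 = x^{-1} + x^{-2}$, the path counts are Fibonacci numbers, and both computations give approximately log of the golden mean.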
Remark 8.74. The fact that $x_\infty = e^{Q}$ if $\sum_\ell q_\ell\, x_\infty^{-\ell} < 1$ also confirms that
$F_{ii}(R) \le 1$ in the Vere-Jones classification (first line in Table 8.1); in this
case $R = e^{-Q}$ is the radius of convergence of $\sum_\ell q_\ell\, z^\ell$. If $1 = \sum_\ell q_\ell\, x_\infty^{-\ell} <
\sum_\ell q_\ell\, e^{-\ell Q}$, then G is positive recurrent (cf. the last statement of Theorem 8.66),
and hence it has a unique measure of maximal entropy by Theorem 8.68.
If $1 = \sum_\ell q_\ell\, x_\infty^{-\ell} = \sum_\ell q_\ell\, e^{-\ell Q}$, then it requires more information
on $(q_\ell)_{\ell\in\mathbb{N}}$ to decide whether G is positive recurrent or not. Ruette
[481, Example 2.9] and Pavlov [450, Section 5] give examples illustrating
this distinction.
Appendix
Solutions to Exercises
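$$p_Y(n) \ge \frac{1}{k}\, p_X(n + 2N),$$
so
$$\begin{aligned} h_{top}(Y, \sigma) = \lim_{n\to\infty} \frac{1}{n} \log p_Y(n) &\ge \lim_{n\to\infty} \frac{1}{n} \log\Big( \frac{1}{k}\, p_X(n + 2N) \Big)\\ &= \lim_{n\to\infty} \frac{n + 2N}{n} \cdot \frac{1}{n + 2N} \log p_X(n + 2N) - \frac{1}{n} \log k\\ &= \lim_{n\to\infty} \frac{1}{n + 2N} \log p_X(n + 2N) = h_{top}(X, \sigma). \end{aligned}$$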
414 A. Solutions to Exercises
Then n is $2^k$-periodic if $2^k \le n < 2^{k+1}$, and f |X has two fixed points, 1 and
∞, but f |Y has only one, so they are not conjugate. There is a factor map
π : X → Y given by π(1) = π(∞) = ∞ and π(n) = n otherwise. The reverse
factor map π̃ : Y → X can be taken as π̃(∞) = ∞ and $\tilde\pi(n) = 2^{k-1} + n
\pmod{2^{k-1}}$ if $2^k \le n < 2^{k+1}$ and k ≥ 1 (halving the period of each n < ∞).
Solution to Exercise 2.28: If $(X, T)$ is minimal, then it is of course transitive.
Suppose it is transitive but not minimal. Let $x$ have a dense orbit and $y$
a non-dense orbit. Then there are $\varepsilon > 0$ and $z \in X$ such that $d(z, \mathrm{orb}(y)) > 2\varepsilon$.
Take $\delta > 0$ so small that $d(u, v) < \delta$ implies $d(T^n u, T^n v) < \varepsilon$ for all
$n \ge 1$. Since $\overline{\mathrm{orb}(x)} = X$, there are $0 \le m < n$ such that $d(T^m x, y) < \delta$ and
$d(T^n x, z) < \varepsilon$. Then $d(T^{n-m} y, z) \le d(T^{n-m} y, T^n x) + d(T^n x, z) < \varepsilon + \varepsilon = 2\varepsilon$,
contradicting the choice of $z$ and $\varepsilon$.
Solution to Exercise 2.47: First note that $p_X(m + n) \le p_X(m)\, p_X(n)$,
so $\log p_X(n)$ is subadditive. Thus by Fekete's Lemma 1.15,
\[ \lim_{n} \frac{1}{n} \log p_X(n) = \inf_{n} \frac{1}{n} \log p_X(n) \]
exists. Next take $\varepsilon > 0$ arbitrary and $N \in \mathbb{N}$ such that $2^{-N} < \varepsilon$. Then
every $(n + N)$-cylinder is an $(n, \varepsilon)$-ball, and we need exactly $p_X(n + N)$
of them to cover the space. Therefore, writing $m = n + N$,
$h_{\mathrm{top}}(\sigma) = \lim_{\varepsilon \to 0} \lim_{n} \frac{1}{n} \log p_X(n + N) = \lim_{m} \frac{1}{m} \log p_X(m)$.
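As a small numerical illustration of this solution, the sketch below counts admissible words by brute force and checks submultiplicativity. It assumes, purely as an example, that $X$ is the golden mean shift (binary sequences without the block 11); the helper name `complexity` and the range of lengths are choices made for this sketch, not part of the exercise.

```python
import math
from itertools import product

def complexity(n, forbidden=("11",), alphabet="01"):
    """Count words of length n over `alphabet` that contain no forbidden block."""
    count = 0
    for letters in product(alphabet, repeat=n):
        word = "".join(letters)
        if not any(block in word for block in forbidden):
            count += 1
    return count

# Word complexity p_X(n) of the golden mean shift for small n.
p = {n: complexity(n) for n in range(1, 15)}

# Submultiplicativity p(m+n) <= p(m) p(n), i.e. log p is subadditive.
subadditive = all(p[m + n] <= p[m] * p[n]
                  for m in range(1, 8) for n in range(1, 8))

# By Fekete's lemma the rates (1/n) log p(n) stay above their limit.
rates = [math.log(p[n]) / n for n in range(1, 15)]
golden = (1 + math.sqrt(5)) / 2

print(subadditive, rates[-1], math.log(golden))
```

For this example the rates approach $\log \frac12(1+\sqrt5)$ from above, in line with the infimum in Fekete's lemma.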
    where the last step can be done using an induction proof on Pascal's
    triangle.
(3) Since $F_n \sim \gamma^n$ for the golden ratio $\gamma = \frac12 (1 + \sqrt{5})$, the radius of
    convergence is $\gamma^{-1}$ and $h_{\mathrm{top}}(f_a) = \log \frac12 (1 + \sqrt{5})$. An easier proof
    follows from the fact that $f_a$ is Markov with transition matrix $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$.
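The Markov argument in item (3) can be checked numerically. The sketch below is only an illustration: it computes the leading eigenvalue of the transition matrix from its trace and determinant, and compares its logarithm with the exponential growth rate of the number of paths. The function names and the path length 40 are arbitrary choices for this sketch.

```python
import math

# Transition matrix from item (3).
A = [[1, 1],
     [1, 0]]

def mat_mul(P, Q):
    """Product of two square integer matrices given as nested lists."""
    size = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def path_count(A, n):
    """Number of paths with n edges in the graph with adjacency matrix A."""
    power = [[int(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(n):
        power = mat_mul(power, A)
    return sum(sum(row) for row in power)

# Leading eigenvalue of a 2x2 matrix from its trace and determinant.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
leading = (trace + math.sqrt(trace * trace - 4 * det)) / 2

entropy = math.log(leading)                    # log of the golden ratio
estimate = math.log(path_count(A, 40)) / 40    # growth rate of path counts

print(leading, entropy, estimate)
```

The estimate converges slowly (the error is of order $1/n$), so the eigenvalue computation is the more practical route.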
Solution to Exercise 3.115: The Fibonacci shift is a gap shift with set
of gaps $S = \mathbb{N}$, and $1 = x^{-2} + x^{-3} + \cdots = \frac{x^{-2}}{1 - x^{-1}}$ is equivalent to $x^2 = x + 1$.
The largest solution is the golden mean $\frac12 (1 + \sqrt{5})$, so the entropy is
$h_{\mathrm{top}}(\sigma) = \log(\frac12 (1 + \sqrt{5}))$.
The odd shift is a gap shift with set of gaps $\{0, 1, 3, 5, 7, \dots\}$. The equation
$1 = x^{-1} + x^{-2} + x^{-4} + x^{-6} + \cdots = x^{-1} + \frac{x^{-2}}{1 - x^{-2}}$ is equivalent to $x^3 - x^2 - 2x + 1 = 0$.
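The two gap equations above can also be solved numerically. The following sketch uses bisection on the truncated series $\sum_{s \in S} x^{-(s+1)} = 1$; the cutoff `CUTOFF`, the bracket $[1, 4]$ and the function names are assumptions of this sketch, and the truncation error is negligible for the roots in question.

```python
import math

def gap_sum(x, gaps):
    """Partial sum of x**-(s+1) over the listed gaps s."""
    return sum(x ** -(s + 1) for s in gaps)

def solve(gaps, lo=1.0 + 1e-9, hi=4.0, steps=200):
    """Bisection for gap_sum(x) = 1; gap_sum is decreasing in x > 1."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if gap_sum(mid, gaps) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

CUTOFF = 400  # truncation of the infinite gap sets (an assumption of this sketch)
fib_gaps = range(1, CUTOFF)                  # S = {1, 2, 3, ...}
odd_gaps = [0] + list(range(1, CUTOFF, 2))   # S = {0, 1, 3, 5, ...}

x_fib = solve(fib_gaps)   # root of x^2 = x + 1
x_odd = solve(odd_gaps)   # root of x^3 - x^2 - 2x + 1 = 0 larger than 1

print(x_fib, math.log(x_fib))
print(x_odd, math.log(x_odd))
```

The logarithms of the two roots are then numerical values for the entropies of the respective gap shifts, under the stated truncation.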
[Figure: grid of lattice points with marked points $p$ and $\tilde p$ and the line $\{y = -1\}$.]
as required.
Solution to Exercise 4.53: For every length $n$, there is at most one bi-special
word $w$, namely if the unique left-special word coincides with the unique
right-special word. Since rotational shifts are palindromic, we can take a
palindrome $v$ in the language large enough to contain $awb$ as a subword for
all possible letters $a, b$. Reversing $v$, we see that the reverse of $awb$ also occurs
in the language, so the reverse of $w$ is bi-special as well. The uniqueness
of $w$ shows that it is a palindrome.
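The statement can be tested experimentally on one concrete example. The sketch below assumes the Fibonacci word (fixed point of the substitution 0 → 01, 1 → 0) as a representative rotational sequence, and reads off the language from a long finite prefix; the prefix length 5000 and the function names are choices made for this illustration only.

```python
def fibonacci_word(length):
    """Prefix of the fixed point of the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]

def factors(word, n):
    """Set of all subwords of length n occurring in `word`."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def bispecial(word, n, alphabet="01"):
    """Words of length n that extend in two ways both to the left and to the right."""
    longer = factors(word, n + 1)
    result = []
    for w in sorted(factors(word, n)):
        left_special = all(a + w in longer for a in alphabet)
        right_special = all(w + b in longer for b in alphabet)
        if left_special and right_special:
            result.append(w)
    return result

u = fibonacci_word(5000)
found = {n: bispecial(u, n) for n in range(1, 30)}

for n, words in found.items():
    for w in words:
        print(n, w, w == w[::-1])   # each bi-special word found is a palindrome
```

In this range there is at most one bi-special word per length, and each one found is a palindrome, as the solution predicts.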
Solution to Exercise 4.65: Take $\alpha \in [0, 1] \setminus \mathbb{Q}$ and partition $\mathbb{S}^1$ into intervals
$[0, N\alpha \bmod 1)$ and $[N\alpha \bmod 1, 1)$. Then the symbolic dynamics of the
rotation $R_\alpha$ w.r.t. this partition has the required complexity function.
Solution to Exercise 4.96: Take $\varphi : X \to X$, $x \mapsto -x$, i.e. the additive
inverse under the group action on the odometer. Then $\varphi \circ a(x) = \varphi(x + 1) =
-(x + 1) = -x - 1 = a^{-1} \circ \varphi(x)$.
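The identity $\varphi \circ a = a^{-1} \circ \varphi$ can be verified digit by digit on a finite truncation. The sketch below assumes the dyadic odometer cut off after `K` binary digits (so it really works in $\mathbb{Z}/2^K\mathbb{Z}$); the digit convention, with the least significant digit first and carry to the right, and all function names are assumptions of this sketch.

```python
from itertools import product

K = 6  # number of binary digits kept; a finite truncation, assumed for this sketch

def add_one(x):
    """Odometer a: add 1 to the first digit x[0], carrying to the right."""
    digits = list(x)
    for i in range(len(digits)):
        if digits[i] == 0:
            digits[i] = 1
            return tuple(digits)
        digits[i] = 0
    return tuple(digits)  # 11...1 wraps around to 00...0

def subtract_one(x):
    """Inverse a^{-1}: subtract 1, borrowing to the right."""
    digits = list(x)
    for i in range(len(digits)):
        if digits[i] == 1:
            digits[i] = 0
            return tuple(digits)
        digits[i] = 1
    return tuple(digits)  # 00...0 wraps around to 11...1

def negate(x):
    """phi(x) = -x: flip every digit, then add one."""
    return add_one(tuple(1 - d for d in x))

points = list(product((0, 1), repeat=K))
conjugates = all(negate(add_one(x)) == subtract_one(negate(x)) for x in points)
involution = all(negate(negate(x)) == x for x in points)

print(conjugates, involution)
```

Since the check passes for every truncation level, it is consistent with the identity on the full odometer, where a point is determined by its finite truncations.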
• If $d = n/p$ for some prime $p$, then we count $-a_d$ because they
  were already counted in $a_n$, but not anywhere else, because $d$ has
  no other multiple that divides $n$. Indeed $\mu(\frac{n}{d}) = \mu(p) = -1$.
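The inclusion-exclusion count described here can be compared with a brute-force count. The sketch below assumes that $a_d$ denotes the number of points fixed by the $d$-th iterate and uses the full shift on two symbols ($a_d = 2^d$) as a test case; both the example and the function names are assumptions made for illustration.

```python
def mobius(n):
    """Moebius function via trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a square factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result

def exact_period_count(n, a):
    """Points of least period n, given a(d) = number of points fixed by the d-th iterate."""
    return sum(mobius(n // d) * a(d) for d in range(1, n + 1) if n % d == 0)

def brute_force(n):
    """Count binary words of length n whose least cyclic period is exactly n."""
    count = 0
    for code in range(2 ** n):
        w = format(code, "b").zfill(n)
        if all(w[k:] + w[:k] != w for k in range(1, n)):
            count += 1
    return count

def full_shift(d):
    """a_d for the full 2-shift: every word of length d gives a d-periodic point."""
    return 2 ** d

counts = {n: exact_period_count(n, full_shift) for n in range(1, 13)}
print(counts)
```

For $n = 6$, for instance, the terms are $2^6 - 2^3 - 2^2 + 2^1 = 54$, where the two subtracted terms correspond to $d = n/p$ with $p = 2$ and $p = 3$.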
\[ \mu(A) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \psi \circ T^i(y) = \mu_1(A). \]
(1) Since node 0 and node 1 have arrows to themselves, but no arrows
    back from higher nodes, the relations $\ell_n = \ell_{n-1} + a_{n-1}$ and
    $a_n = a_{n-1} + b_{n-1}$ follow.
(2) Since beyond node 2, only even nodes can have two outgoing arrows,
we have b2n = b2+1 .
(3) Again from node 2 onwards, if we remove every odd-numbered node,
    we obtain the original graph from node 1 onwards. Therefore $b_{2n} = a_n$.
Bibliography
[1] J. Aarts, R. Fokkink, and G. Kruijtzer, Morphic numbers (Dutch, with Dutch summary),
Nieuw Arch. Wiskd. (5) 2 (2001), no. 1, 56–58. MR1823158
[2] E. H. El Abdalaoui, M. Lemańczyk, and T. de la Rue, A dynamical point of view
on the set of B-free integers, Int. Math. Res. Not. IMRN 16 (2015), 7258–7286, DOI
10.1093/imrn/rnu164. MR3428961
[3] B. Adamczewski, Balances for fixed points of primitive substitutions: Words, Theoret. Com-
put. Sci. 307 (2003), no. 1, 47–75, DOI 10.1016/S0304-3975(03)00092-6. MR2014730
[4] B. Adamczewski, Symbolic discrepancy and self-similar dynamics (English, with English
and French summaries), Ann. Inst. Fourier (Grenoble) 54 (2004), no. 7, 2201–2234 (2005).
MR2139693
[5] T. M. Adams, Smorodinsky’s conjecture on rank-one mixing, Proc. Amer. Math. Soc. 126
(1998), no. 3, 739–744, DOI 10.1090/S0002-9939-98-04082-9. MR1443143
[6] T. Adams and N. Friedman, Staircase mixing, Preprint, 1997.
[7] M. Adamska, S. Bezuglyi, O. Karpel, and J. Kwiatkowski, Subdiagrams and invariant mea-
sures on Bratteli diagrams, Ergodic Theory Dynam. Systems 37 (2017), no. 8, 2417–2452,
DOI 10.1017/etds.2016.8. MR3719266
[8] R. L. Adler and L. Flatto, Uniform distribution of Kakutani’s interval splitting proce-
dure, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 38 (1977), no. 4, 253–259, DOI
10.1007/BF00533157. MR447521
[9] R. L. Adler, A. G. Konheim, and M. H. McAndrew, Topological entropy, Trans. Amer. Math.
Soc. 114 (1965), 309–319, DOI 10.2307/1994177. MR175106
[10] R. L. Adler and B. Weiss, Entropy, a complete metric invariant for automorphisms of the
torus, Proc. Nat. Acad. Sci. U.S.A. 57 (1967), 1573–1576, DOI 10.1073/pnas.57.6.1573.
MR212156
[11] C. Aistleitner, M. Hofer, and V. Ziegler, On the uniform distribution modulo 1 of multi-
dimensional LS-sequences, Ann. Mat. Pura Appl. (4) 193 (2014), no. 5, 1329–1344, DOI
10.1007/s10231-013-0331-0. MR3262635
[12] E. Akin, The general topology of dynamical systems, Graduate Studies in Mathemat-
ics, vol. 1, American Mathematical Society, Providence, RI, 1993, DOI 10.1090/gsm/001.
MR1219737
[13] E. Akin, J. Auslander, and E. Glasner, The topological dynamics of Ellis actions, Mem.
Amer. Math. Soc. 195 (2008), no. 913, vi+152, DOI 10.1090/memo/0913. MR2437846
[14] E. Akin, J. Auslander, and K. Berg, When is a transitive map chaotic?, Convergence in
ergodic theory and probability (Columbus, OH, 1993), Ohio State Univ. Math. Res. Inst.
Publ., vol. 5, de Gruyter, Berlin, 1996, pp. 25–40. MR1412595
[15] S. Akiyama, Cubic Pisot units with finite beta expansions, Algebraic number theory and
Diophantine analysis (Graz, 1998), de Gruyter, Berlin, 2000, pp. 11–26. MR1770451
[16] S. Akiyama, On the boundary of self affine tilings generated by Pisot numbers, J. Math.
Soc. Japan 54 (2002), no. 2, 283–308, DOI 10.2969/jmsj/05420283. MR1883519
[17] K. T. Alligood, T. D. Sauer, and J. A. Yorke, Chaos: An introduction to dynamical systems,
Textbooks in Mathematical Sciences, Springer-Verlag, New York, 1997, DOI 10.1007/978-
3-642-59281-2. MR1418166
[18] J.-P. Allouche and M. Cosnard, The Komornik-Loreti constant is transcendental, Amer.
Math. Monthly 107 (2000), no. 5, 448–449, DOI 10.2307/2695302. MR1763399
[19] J.-P. Allouche and J. Shallit, The ubiquitous Prouhet-Thue-Morse sequence, Sequences and
their applications (Singapore, 1998), Springer Ser. Discrete Math. Theor. Comput. Sci.,
Springer, London, 1999, pp. 1–16. MR1843077
[20] J.-P. Allouche and J. Shallit, Automatic sequences: Theory, applications, generaliza-
tions, Cambridge University Press, Cambridge, 2003, DOI 10.1017/CBO9780511546563.
MR1997038
[21] J.-P. Allouche, J. Shallit, and R. Yassawi, How to prove that a sequence is not automatic,
Expo. Math. 40 (2022), no. 1, 1–22, DOI 10.1016/j.exmath.2021.08.001. MR4388977
[22] L. Alsedà, J. Llibre, and M. Misiurewicz, Combinatorial dynamics and entropy in dimension
one, 2nd ed., Advanced Series in Nonlinear Dynamics, vol. 5, World Scientific Publishing
Co., Inc., River Edge, NJ, 2000, DOI 10.1142/4205. MR1807264
[23] L. Alvin, The strange star product, J. Difference Equ. Appl. 18 (2012), no. 4, 657–674, DOI
10.1080/10236198.2011.608066. MR2905289
[24] L. Alvin, Toeplitz kneading sequences and adding machines, Discrete Contin. Dyn. Syst. 33
(2013), no. 8, 3277–3287, DOI 10.3934/dcds.2013.33.3277. MR3021357
[25] L. Alvin, Uniformly recurrent sequences and minimal Cantor omega-limit sets, Fund. Math.
231 (2015), no. 3, 273–284, DOI 10.4064/fm231-3-3. MR3397281
[26] L. Alvin, Homeomorphisms on minimal Cantor sets in the unimodal setting, Topology Appl.
282 (2020), 107292, 10, DOI 10.1016/j.topol.2020.107292. MR4119460
[27] D. Anosov, Tangential fields of transversal foliations in Y -systems, Math. Notes 2 (1967),
818–823.
[28] A. Anušić, H. Bruin, and J. Činč, Topological properties of Lorenz maps derived
from unimodal maps, J. Difference Equ. Appl. 26 (2020), no. 8, 1174–1191, DOI
10.1080/10236198.2020.1760260. MR4164085
[29] C. Apparicio, Reconnaissabilité des substitutions de longueur constante, Stage de Maîtrise
de l’ENS Lyon, 1999.
[30] V. I. Arnold and A. Avez, Ergodic problems of classical mechanics, Translated from the
French by A. Avez, W. A. Benjamin, Inc., New York-Amsterdam, 1968. MR0232910
[31] P. Arnoux and A. M. Fisher, The scenery flow for geometric structures on the
torus: the linear setting, Chinese Ann. Math. Ser. B 22 (2001), no. 4, 427–470, DOI
10.1142/S0252959901000425. MR1870070
[32] P. Arnoux and E. Harriss, What is ... a Rauzy fractal?, Notices Amer. Math. Soc. 61 (2014),
no. 7, 768–770, DOI 10.1090/noti1144. MR3235844
[33] P. Arnoux and S. Ito, Pisot substitutions and Rauzy fractals (English, with English and
French summaries), Journées Montoises d’Informatique Théorique (Marne-la-Vallée, 2000),
Bull. Belg. Math. Soc. Simon Stevin 8 (2001), no. 2, 181–207. MR1838930
[34] P. Arnoux, M. Mizutani, and T. Sellami, Random product of substitutions with the same
incidence matrix, Theoret. Comput. Sci. 543 (2014), 68–78, DOI 10.1016/j.tcs.2014.06.002.
MR3225711
[35] P. Arnoux and G. Rauzy, Représentation géométrique de suites de complexité 2n+1 (French,
with English summary), Bull. Soc. Math. France 119 (1991), no. 2, 199–215. MR1116845
[36] E. Artin, Galois Theory, Notre Dame Mathematical Lectures 2 (1971).
[37] J. S. Athreya and J. Chaika, The Hausdorff dimension of non-uniquely ergodic directions
in H(2) is almost everywhere 1/2, Geom. Topol. 19 (2015), no. 6, 3537–3563, DOI
10.2140/gt.2015.19.3537. MR3447109
[38] J. Auslander, Minimal flows and their extensions, Notas de Matemática [Mathematical
Notes], 122, North-Holland Mathematics Studies, vol. 153, North-Holland Publishing Co.,
Amsterdam, 1988. MR956049
[39] J. Auslander and J. A. Yorke, Interval maps, factors of maps, and chaos, Tohoku Math. J.
(2) 32 (1980), no. 2, 177–188, DOI 10.2748/tmj/1178229634. MR580273
[40] A. Avila and G. Forni, Weak mixing for interval exchange transformations and translation
flows, Ann. of Math. (2) 165 (2007), no. 2, 637–664, DOI 10.4007/annals.2007.165.637.
MR2299743
[41] M. Baake and U. Grimm, Squirals and beyond: substitution tilings with singular con-
tinuous spectrum, Ergodic Theory Dynam. Systems 34 (2014), no. 4, 1077–1102, DOI
10.1017/etds.2012.191. MR3227148
[42] S. Baker, Generalized golden ratios over integer alphabets, Integers 14 (2014), Paper No.
A15, 28, DOI 10.15546/aeei-2014-0005. MR3239596
[43] S. Baker and A. E. Ghenciu, Dynamical properties of S-gap shifts and other shift spaces, J.
Math. Anal. Appl. 430 (2015), no. 2, 633–647, DOI 10.1016/j.jmaa.2015.04.092. MR3351972
[44] V. Baker, M. Barge, and J. Kwapisz, Geometric realization and coincidence for reducible
non-unimodular Pisot tiling spaces with an application to β-shifts, Numération, pavages,
substitutions, Ann. Inst. Fourier (Grenoble) 56 (2006), no. 7, 2213–2248. MR2290779
[45] F. Balibrea, J. Smítal, and M. Štefánková, The three versions of distributional chaos, Chaos
Solitons Fractals 23 (2005), no. 5, 1581–1583, DOI 10.1016/j.chaos.2004.06.011. MR2101573
[46] J. Banks, J. Brooks, G. Cairns, G. Davis, and P. Stacey, On Devaney’s definition of chaos,
Amer. Math. Monthly 99 (1992), no. 4, 332–334, DOI 10.2307/2324899. MR1157223
[47] J. Banks, T. T. D. Nguyen, P. Oprocha, B. Stanley, and B. Trotta, Dynamics
of spacing shifts, Discrete Contin. Dyn. Syst. 33 (2013), no. 9, 4207–4232, DOI
10.3934/dcds.2013.33.4207. MR3038059
[48] G. Barat, T. Downarowicz, A. Iwanik, and P. Liardet, Propriétés topologiques et combi-
natoires des échelles de numération. part 2 (French, with English summary), dedicated
to the memory of Anzelm Iwanik, Colloq. Math. 84/85 (2000), no. part 2, 285–306, DOI
10.4064/cm-84/85-2-285-306. MR1784198
[49] G. Barat, T. Downarowicz, and P. Liardet, Dynamiques associées à une échelle de numéra-
tion (French), Acta Arith. 103 (2002), no. 1, 41–78, DOI 10.4064/aa103-1-5. MR1904893
[50] M. Barge, The Pisot conjecture for β-substitutions, Ergodic Theory Dynam. Systems 38
(2018), no. 2, 444–472, DOI 10.1017/etds.2016.44. MR3774828
[51] M. Barge, H. Bruin, and S. Štimac, The Ingram conjecture, Geom. Topol. 16 (2012), no. 4,
2481–2516, DOI 10.2140/gt.2012.16.2481. MR3033522
[52] M. Barge and B. Diamond, Coincidence for substitutions of Pisot type (English, with Eng-
lish and French summaries), Bull. Soc. Math. France 130 (2002), no. 4, 619–626, DOI
10.24033/bsmf.2433. MR1947456
[53] M. Barge and J. Kwapisz, Geometric theory of unimodular Pisot substitutions, Amer. J.
Math. 128 (2006), no. 5, 1219–1282. MR2262174
[54] M. Barge, S. Štimac, and R. F. Williams, Pure discrete spectrum in substitution tiling spaces,
Discrete Contin. Dyn. Syst. 33 (2013), no. 2, 579–597, DOI 10.3934/dcds.2013.33.579.
MR2975125
[55] A. D. Barwell, C. Good, and P. Oprocha, Shadowing and expansivity in subspaces, Fund.
Math. 219 (2012), no. 3, 223–243, DOI 10.4064/fm219-3-2. MR3001240
[56] R. Bass, Real analysis for graduate students, CreateSpace Independent Publishing Platform
(2016), Version 3.1 online on: http://bass.math.uconn.edu/3rd.pdf.
[57] T. Bedford et al., Ergodic theory, symbolic dynamics, and hyperbolic spaces (Trieste, 1989),
Oxford Sci. Publ., Oxford Univ. Press, New York, 1991.
[58] K. R. Berg, On the conjugacy problem for K-systems, ProQuest LLC, Ann Arbor, MI, 1967.
Thesis (Ph.D.)–University of Minnesota. MR2616688
[59] E. R. Berlekamp, J. H. Conway, and R. K. Guy, Winning ways for your mathematical plays.
Vol. 1: Games in general, Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers],
London-New York, 1982. MR654501
[60] J. Berstel, Growth of repetition-free words—a review, Theoret. Comput. Sci. 340 (2005),
no. 2, 280–290, DOI 10.1016/j.tcs.2005.03.039. MR2150766
[61] J. Berstel, A. Lauve, C. Reutenauer, and F. V. Saliola, Combinatorics on words: Christoffel
words and repetitions in words, CRM Monograph Series, vol. 27, American Mathematical
Society, Providence, RI, 2009, DOI 10.1090/crmm/027. MR2464862
[62] V. Berthé, P. Cecchi Bernales, and R. Yassawi, Coboundaries and eigenvalues of finitary
S-adic systems, Preprint, 2022, arXiv:2202.07270.
[63] V. Berthé, P. Cecchi Bernales, F. Durand, J. Leroy, D. Perrin, and S. Petite, On the di-
mension group of unimodular S-adic subshifts, Monatsh. Math. 194 (2021), no. 4, 687–717,
DOI 10.1007/s00605-020-01488-3. MR4228544
[64] V. Berthé and V. Delecroix, Beyond substitutive dynamical systems: S-adic expansions,
Numeration and substitution 2012, RIMS Kôkyûroku Bessatsu, B46, Res. Inst. Math. Sci.
(RIMS), Kyoto, 2014, pp. 81–123. MR3330561
[65] V. Berthé, T. Jolivet, and A. Siegel, Substitutive Arnoux-Rauzy sequences have pure discrete
spectrum, Unif. Distrib. Theory 7 (2012), no. 1, 173–197. MR2943167
[66] V. Berthé, T. Jolivet, and A. Siegel, Connectedness of fractals associated with Arnoux-
Rauzy substitutions, RAIRO Theor. Inform. Appl. 48 (2014), no. 3, 249–266, DOI
10.1051/ita/2014008. MR3302487
[67] V. Berthé and H. Nakada, On continued fraction expansions in positive characteristic:
equivalence relations and some metric properties, Expo. Math. 18 (2000), no. 4, 257–284.
MR1788323
[68] V. Berthé et al., Combinatorics, automata and number theory, Eds. V. Berthé and M. Rigo.
[69] V. Berthé, A. Siegel, and J. Thuswaldner, Substitutions, Rauzy fractals and tilings, Com-
binatorics, automata and number theory, Encyclopedia Math. Appl., vol. 135, Cambridge
Univ. Press, Cambridge, 2010, pp. 248–323. MR2759108
[70] V. Berthé, W. Steiner, and J. M. Thuswaldner, Geometry, dynamics, and arithmetic of
S-adic shifts (English, with English and French summaries), Ann. Inst. Fourier (Grenoble)
69 (2019), no. 3, 1347–1409. MR3986918
[71] V. Berthé, W. Steiner, J. M. Thuswaldner, and R. Yassawi, Recognizability for sequences
of morphisms, Ergodic Theory Dynam. Systems 39 (2019), no. 11, 2896–2931, DOI
10.1017/etds.2017.144. MR4015135
[72] V. Bergelson, G. Kolesnik, and Y. Son, Uniform distribution of subpolynomial func-
tions along primes and applications, J. Anal. Math. 137 (2019), no. 1, 135–187, DOI
10.1007/s11854-018-0068-1. MR3938000
[73] J. Berstel and A. de Luca, Sturmian words, Lyndon words and trees, Theoret. Comput. Sci.
178 (1997), no. 1-2, 171–203, DOI 10.1016/S0304-3975(96)00101-6. MR1453849
[74] A. Besbes, M. Boshernitzan, and D. Lenz, Delone sets with finite local complexity: linear
repetitivity versus positivity of weights, Discrete Comput. Geom. 49 (2013), no. 2, 335–347,
DOI 10.1007/s00454-012-9455-z. MR3017915
[75] A. S. Besicovitch, On the density of certain sequences of integers, Math. Ann. 110 (1935),
no. 1, 336–341, DOI 10.1007/BF01448032. MR1512943
[76] E. Bessel-Hagen, Zahlentheorie, Teubner, 1929.
[77] S. Bezuglyi, O. Karpel, and J. Kwiatkowski, Exact number of ergodic invariant mea-
sures for Bratteli diagrams, J. Math. Anal. Appl. 480 (2019), no. 2, 123431, 49, DOI
10.1016/j.jmaa.2019.123431. MR4000100
[78] S. Bezuglyi, J. Kwiatkowski, and K. Medynets, Aperiodic substitution systems and
their Bratteli diagrams, Ergodic Theory Dynam. Systems 29 (2009), no. 1, 37–72, DOI
10.1017/S0143385708000230. MR2470626
[79] S. Bezuglyi, J. Kwiatkowski, K. Medynets, and B. Solomyak, Invariant measures on station-
ary Bratteli diagrams, Ergodic Theory Dynam. Systems 30 (2010), no. 4, 973–1007, DOI
10.1017/S0143385709000443. MR2669408
[80] S. Bezuglyi, J. Kwiatkowski, K. Medynets, and B. Solomyak, Finite rank Bratteli diagrams:
structure of invariant measures, Trans. Amer. Math. Soc. 365 (2013), no. 5, 2637–2679,
DOI 10.1090/S0002-9947-2012-05744-8. MR3020111
[81] S. Bezuglyi, J. Kwiatkowski, and R. Yassawi, Perfect orderings on finite rank Brat-
teli diagrams, Canad. J. Math. 66 (2014), no. 1, 57–101, DOI 10.4153/CJM-2013-041-6.
MR3150704
[82] S. Bezuglyi and R. Yassawi, Orders that yield homeomorphisms on Bratteli diagrams, Dyn.
Syst. 32 (2017), no. 2, 249–282, DOI 10.1080/14689367.2016.1197888. MR3638433
[83] F. Blanchard, E. Glasner, S. Kolyada, and A. Maass, On Li-Yorke pairs, J. Reine Angew.
Math. 547 (2002), 51–68, DOI 10.1515/crll.2002.053. MR1900136
[84] F. Blanchard and G. Hansel, Systèmes codés (French, with English summary), Theoret.
Comput. Sci. 44 (1986), no. 1, 17–49, DOI 10.1016/0304-3975(86)90108-8. MR858689
[85] F. Blanchard et al., Topics in Symbolic Dynamics and Applications, London Mathematical
Society Lecture Note Series, Editors: F. Blanchard, A. Maass, and A. Nogueira, Cambridge
Univ. Press, 2000, ISBN 9780521796606.
[86] P. Billingsley, Probability and measure, 3rd ed., A Wiley-Interscience Publication, Wiley
Series in Probability and Mathematical Statistics, John Wiley & Sons, Inc., New York,
1995. MR1324786
[87] L. S. Block and W. A. Coppel, Dynamics in one dimension, Lecture Notes in Mathematics,
vol. 1513, Springer-Verlag, Berlin, 1992, DOI 10.1007/BFb0084762. MR1176513
[88] L. Block, J. Guckenheimer, M. Misiurewicz, and L. S. Young, Periodic points and topologi-
cal entropy of one-dimensional maps, Global theory of dynamical systems (Proc. Internat.
Conf., Northwestern Univ., Evanston, Ill., 1979), Lecture Notes in Math., vol. 819, Springer,
Berlin, 1980, pp. 18–34. MR591173
[89] L. Block, J. Keesling, and M. Misiurewicz, Strange adding machines, Ergodic Theory Dy-
nam. Systems 26 (2006), no. 3, 673–682, DOI 10.1017/S0143385705000635. MR2237463
[90] A. M. Blokh, Sensitive mappings of an interval (Russian), Uspekhi Mat. Nauk 37 (1982),
no. 2(224), 189–190. MR650765
[91] A. M. Blokh, Decomposition of dynamical systems on an interval (Russian), Uspekhi Mat.
Nauk 38 (1983), no. 5(233), 179–180. MR718829
[92] A. M. Blokh, The “spectral” decomposition for one-dimensional maps, Dynamics reported,
Dynam. Report. Expositions Dynam. Systems (N.S.), vol. 4, Springer, Berlin, 1995, pp. 1–59.
MR1346496
[93] A. Blokh and L. Oversteegen, Wandering triangles exist (English, with English and
French summaries), C. R. Math. Acad. Sci. Paris 339 (2004), no. 5, 365–370, DOI
10.1016/j.crma.2004.06.024. MR2092465
[94] J. Bobok and H. Bruin, Constant slope maps and the Vere-Jones classification, Entropy 18
(2016), no. 6, Paper No. 234, 27, DOI 10.3390/e18060234. MR3530057
[95] J. Bobok and M. Soukenka, On piecewise affine interval maps with countably many laps,
Discrete Contin. Dyn. Syst. 31 (2011), no. 3, 753–762, DOI 10.3934/dcds.2011.31.753.
MR2825637
[96] M. Boshernitzan, A unique ergodicity of minimal symbolic flows with linear block growth,
J. Analyse Math. 44 (1984/85), 77–96, DOI 10.1007/BF02790191. MR801288
[97] M. D. Boshernitzan, A condition for unique ergodicity of minimal symbolic flows, Er-
godic Theory Dynam. Systems 12 (1992), no. 3, 425–428, DOI 10.1017/S0143385700006866.
MR1182655
[98] W. Bosma et al., Continued fractions, Eds. W. Bosma and C. Kraaikamp, URL:
https://www.math.ru.nl/~bosma/Students/CF.pdf.
[99] R. Bowen, Markov partitions for Axiom A diffeomorphisms, Amer. J. Math. 92 (1970),
725–747, DOI 10.2307/2373370. MR277003
[100] R. Bowen, Markov partitions and minimal sets for Axiom A diffeomorphisms, Amer. J.
Math. 92 (1970), 907–918, DOI 10.2307/2373402. MR277002
[101] R. Bowen, Periodic points and measures for Axiom A diffeomorphisms, Trans. Amer. Math.
Soc. 154 (1971), 377–397, DOI 10.2307/1995452. MR282372
[102] R. Bowen, Equilibrium states and the ergodic theory of Anosov diffeomorphisms, Second
revised edition, with a preface by David Ruelle; edited by Jean-René Chazottes, Lecture
Notes in Mathematics, vol. 470, Springer-Verlag, Berlin, 2008. MR2423393
[103] R. Bowen, Some systems with unique equilibrium states, Math. Systems Theory 8 (1974/75),
no. 3, 193–202, DOI 10.1007/BF01762666. MR399413
[104] R. Bowen, Markov partitions are not smooth, Proc. Amer. Math. Soc. 71 (1978), no. 1,
130–132, DOI 10.2307/2042234. MR474415
[105] P. Boyland, A. de Carvalho, and T. Hall, Natural extensions of unimodal maps: virtual
sphere homeomorphisms and prime ends of basin boundaries, Geom. Topol. 25 (2021),
no. 1, 111–228, DOI 10.2140/gt.2021.25.111. MR4226229
[106] M. Boyle, Open problems in symbolic dynamics, Geometric and probabilistic struc-
tures in dynamics, Contemp. Math., vol. 469, Amer. Math. Soc., Providence, RI,
and updates on http://www2.math.umd.edu/~mboyle/open/, 2008, pp. 69–118, DOI
10.1090/conm/469/09161. MR2478466
[107] F.-J. Brandenburg, Uniformly growing kth power-free homomorphisms, Theoret. Comput.
Sci. 23 (1983), no. 1, 69–82, DOI 10.1016/0304-3975(88)90009-6. MR693069
[108] O. Bratteli, Inductive limits of finite dimensional C*-algebras, Trans. Amer. Math. Soc.
171 (1972), 195–234, DOI 10.2307/1996380. MR312282
[109] X. Bressaud, F. Durand, and A. Maass, Necessary and sufficient conditions to be an eigen-
value for linearly recurrent dynamical Cantor systems, J. London Math. Soc. (2) 72 (2005),
no. 3, 799–816, DOI 10.1112/S0024610705006800. MR2190338
[110] X. Bressaud, F. Durand, and A. Maass, On the eigenvalues of finite rank Bratteli-Vershik
dynamical systems, Ergodic Theory Dynam. Systems 30 (2010), no. 3, 639–664, DOI
10.1017/S0143385709000236. MR2643706
[111] J. Brillhart and L. Carlitz, Note on the Shapiro polynomials, Proc. Amer. Math. Soc. 25
(1970), 114–118, DOI 10.2307/2036537. MR260955
[112] K. Briggs, A precise calculation of the Feigenbaum constants, Math. Comp. 57 (1991),
no. 195, 435–439, DOI 10.2307/2938684. MR1079009
[113] M. Brin and G. Stuck, Introduction to dynamical systems, Cambridge University Press,
Cambridge, 2002, DOI 10.1017/CBO9780511755316. MR1963683
[114] J. Brinkhuis, Nonrepetitive sequences on three symbols, Quart. J. Math. Oxford Ser. (2) 34
(1983), no. 134, 145–149, DOI 10.1093/qmath/34.2.145. MR698202
[115] S. Brlek, Enumeration of factors in the Thue-Morse word, First Montreal Conference on
Combinatorics and Computer Science, 1987, Discrete Appl. Math. 24 (1989), no. 1-3, 83–96,
DOI 10.1016/0166-218X(92)90274-E. MR1011264
[116] K. M. Brucks and H. Bruin, Topics from one-dimensional dynamics, London Mathemat-
ical Society Student Texts, vol. 62, Cambridge University Press, Cambridge, 2004, DOI
10.1017/CBO9780511617171. MR2080037
[117] H. Bruin, Invariant measures of interval maps, ProQuest LLC, Ann Arbor, MI, 1994. Thesis
(Dr.)–Technische Universiteit Delft (The Netherlands). MR2714793
[118] H. Bruin, Combinatorics of the kneading map, Proceedings of the Conference “Thirty Years
after Sharkovskiı̆’s Theorem: New Perspectives” (Murcia, 1994), Internat. J. Bifur. Chaos
Appl. Sci. Engrg. 5 (1995), no. 5, 1339–1349, DOI 10.1142/S0218127495001010. MR1361922
[119] H. Bruin, Homeomorphic restrictions of unimodal maps, Geometry and topology in dynam-
ics (Winston-Salem, NC, 1998/San Antonio, TX, 1999), Contemp. Math., vol. 246, Amer.
Math. Soc., Providence, RI, 1999, pp. 47–56, DOI 10.1090/conm/246/03773. MR1732370
[120] H. Bruin, Inverse limit spaces of post-critically finite tent maps, Fund. Math. 165 (2000),
no. 2, 125–138, DOI 10.4064/fm-165-2-125-138. MR1808727
[121] H. Bruin, Minimal Cantor systems and unimodal maps, dedicated to Professor Alexander N.
Sharkovsky on the occasion of his 65th birthday, J. Difference Equ. Appl. 9 (2003), no. 3-4,
305–318, DOI 10.1080/1023619021000047743. MR1990338
[122] H. Bruin and J. Hawkins, Exactness and maximal automorphic factors of unimodal
interval maps, Ergodic Theory Dynam. Systems 21 (2001), no. 4, 1009–1034, DOI
10.1017/S0143385701001481. MR1849599
[123] H. Bruin and J. Hawkins, Rigidity of smooth one-sided Bernoulli endomorphisms, New York
J. Math. 15 (2009), 451–483. MR2558792
[124] H. Bruin, A. Kaffl, and D. Schleicher, Existence of quadratic Hubbard trees, Fund. Math.
202 (2009), no. 3, 251–279, DOI 10.4064/fm202-3-4. MR2476617
[125] H. Bruin, G. Keller, and M. St. Pierre, Adding machines and wild attractors, Ergodic
Theory Dynam. Systems 17 (1997), no. 6, 1267–1287, DOI 10.1017/S0143385797086392.
MR1488317
[126] H. Bruin and D. Schleicher, Admissibility of kneading sequences and structure of Hubbard
trees for quadratic polynomials, J. Lond. Math. Soc. (2) 78 (2008), no. 2, 502–522, DOI
10.1112/jlms/jdn033. MR2439637
[127] H. Bruin and M. Todd, Transience and thermodynamic formalism for infinitely branched
interval maps, J. Lond. Math. Soc. (2) 86 (2012), no. 1, 171–194, DOI 10.1112/jlms/jdr081.
MR2959300
[128] H. Bruin and S. Troubetzkoy, The Gauss map on a class of interval translation mappings,
Israel J. Math. 137 (2003), 125–148, DOI 10.1007/BF02785958. MR2013352
[129] H. Bruin and O. Volkova, The complexity of Fibonacci-like kneading sequences, Theoret.
Comput. Sci. 337 (2005), no. 1-3, 379–389, DOI 10.1016/j.tcs.2005.02.001. MR2141232
[130] J. R. Büchi, Weak second-order arithmetic and finite automata, Z. Math. Logik Grundlagen
Math. 6 (1960), 66–92, DOI 10.1002/malq.19600060105. MR125010
[131] A. Bufetov, Y. G. Sinai, and C. Ulcigrai, A condition for continuous spectrum of an in-
terval exchange transformation, Representation theory, dynamical systems, and asymptotic
combinatorics, Amer. Math. Soc. Transl. Ser. 2, vol. 217, Amer. Math. Soc., Providence, RI,
2006, pp. 23–35, DOI 10.1090/trans2/217/03. MR2276099
[132] A. I. Bufetov and B. Solomyak, On the modulus of continuity for spectral measures in
substitution dynamics, Adv. Math. 260 (2014), 84–129, DOI 10.1016/j.aim.2014.04.004.
MR3209350
[133] R. Burton and J. E. Steif, Non-uniqueness of measures of maximal entropy for sub-
shifts of finite type, Ergodic Theory Dynam. Systems 14 (1994), no. 2, 213–235, DOI
10.1017/S0143385700007859. MR1279469
[134] R. Burton and J. E. Steif, New results on measures of maximal entropy, Israel J. Math. 89
(1995), no. 1-3, 275–300, DOI 10.1007/BF02808205. MR1324466
[135] J. Buzzi, Subshifts of quasi-finite type, Invent. Math. 159 (2005), no. 2, 369–406, DOI
10.1007/s00222-004-0392-1. MR2116278
[136] J. Buzzi, Puzzles of quasi-finite type, zeta functions and symbolic dynamics for multi-
dimensional maps (English, with English and French summaries), Ann. Inst. Fourier (Greno-
ble) 60 (2010), no. 3, 801–852. MR2680817
[137] J. Buzzi and P. Hubert, Piecewise monotone maps without periodic points: rigidity, mea-
sures and complexity, Ergodic Theory Dynam. Systems 24 (2004), no. 2, 383–405, DOI
10.1017/S0143385703000488. MR2054049
[138] V. Canterini and A. Siegel, Automate des préfixes-suffixes associé à une substitution primi-
tive (French, with English and French summaries), J. Théor. Nombres Bordeaux 13 (2001),
no. 2, 353–369. MR1879663
[139] F. Cai, A characterization of weak-mixing for minimal systems, Topology Appl. 267 (2019),
106844, 11, DOI 10.1016/j.topol.2019.106844. MR4001122
[140] M. Campanino and H. Epstein, On the existence of Feigenbaum’s fixed point, Comm. Math.
Phys. 79 (1981), no. 2, 261–302. MR612250
[141] I. Carbone, A van der Corput-type algorithm for LS-sequences of points, Preprint, 2012,
arXiv:1209.3611, and Extension of van der Corput algorithm to LS-sequences, Appl. Math.
Comput. 255 (2015), 207–213.
[142] I. Carbone, Discrepancy of LS-sequences of partitions and points, Ann. Mat. Pura Appl.
(4) 191 (2012), no. 4, 819–844, DOI 10.1007/s10231-011-0208-z. MR2993975
[143] I. Carbone, M. R. Iacò, and A. Volčič, A dynamical system approach to the Kakutani-
Fibonacci sequence, Ergodic Theory Dynam. Systems 34 (2014), no. 6, 1794–1806, DOI
10.1017/etds.2013.20. MR3272771
[144] L. Carleson and T. W. Gamelin, Complex dynamics, Universitext: Tracts in Mathematics,
Springer-Verlag, New York, 1993, DOI 10.1007/978-1-4612-4364-9. MR1230383
[145] J. Cassaigne, Counting overlap-free binary words, STACS 93 (Würzburg, 1993), Lecture
Notes in Comput. Sci., vol. 665, Springer, Berlin, 1993, pp. 216–225, DOI 10.1007/3-540-
56503-5_24. MR1249296
[146] J. Cassaigne, Special factors of sequences with linear subword complexity, Developments in
language theory, II (Magdeburg, 1995), World Sci. Publ., River Edge, NJ, 1996, pp. 25–34.
MR1466182
[147] J. Cassaigne, Subword complexity and periodicity in two or more dimensions, Developments
in language theory (Aachen, 1999), World Sci. Publ., River Edge, NJ, 2000, pp. 14–21.
MR1880477
[148] J. Cassaigne, S. Ferenczi, and A. Messaoudi, Weak mixing and eigenvalues for Arnoux-Rauzy
sequences, Ann. Inst. Fourier (Grenoble) 58 (2008), no. 6, 1983–2005. MR2473626
[149] J. Cassaigne, S. Ferenczi, and L. Q. Zamboni, Imbalances in Arnoux-Rauzy sequences
(English, with English and French summaries), Ann. Inst. Fourier (Grenoble) 50 (2000),
no. 4, 1265–1276. MR1799745
[150] J. Cassaigne and F. Nicolas, Factor complexity, Combinatorics, automata and number the-
ory, Encyclopedia Math. Appl., vol. 135, Cambridge Univ. Press, Cambridge, 2010, pp. 163–
247. MR2759107
[151] J. Chaika and H. Masur, There exists an interval exchange with a non-ergodic generic
measure, J. Mod. Dyn. 9 (2015), 289–304, DOI 10.3934/jmd.2015.9.289. MR3412151
[152] J. Chaika and H. Masur, The set of non-uniquely ergodic d-IETs has Hausdorff codi-
mension 1/2, Invent. Math. 222 (2020), no. 3, 749–832, DOI 10.1007/s00222-020-00978-3.
MR4169051
[153] X. Chen, Q.-H. Lu, and H.-M. Xie, Grammatical complexity of Feigenbaum attractor,
Advances in Mathematics (China) 22 (1993), 185–6.
[154] D. K. Childers, Wandering polygons and recurrent critical leaves, Ergodic Theory Dynam.
Systems 27 (2007), no. 1, 87–107, DOI 10.1017/S0143385706000526. MR2297088
[155] S. Chowla, On abundant numbers, J. Indian Math. Soc. New Ser. 1 (1934), 41–44.
[156] J. P. Clay, Proximity relations in transformation groups, Trans. Amer. Math. Soc. 108
(1963), 88–96, DOI 10.2307/1993827. MR154269
[157] V. Climenhaga, MathBlog: http://vaughnclimenhaga.wordpress.com/2012/04/18/a-useful-example-for-the-space-of-ergodic-measures/.
[158] V. Climenhaga, Specification and towers in shift spaces, Comm. Math. Phys. 364 (2018),
no. 2, 441–504, DOI 10.1007/s00220-018-3265-y. MR3869435
[159] V. Climenhaga and D. J. Thompson, Intrinsic ergodicity beyond specification: β-shifts, S-gap
shifts, and their factors, Israel J. Math. 192 (2012), no. 2, 785–817,
DOI 10.1007/s11856-012-0052-x. MR3009742
[160] V. Climenhaga and D. J. Thompson, Intrinsic ergodicity via obstruction entropies, Er-
godic Theory Dynam. Systems 34 (2014), no. 6, 1816–1831, DOI 10.1017/etds.2013.16.
MR3272773
[161] A. Cobham, On the base-dependence of sets of numbers recognizable by finite automata,
Math. Systems Theory 3 (1969), 186–192, DOI 10.1007/BF01746527. MR250789
[162] A. Cobham, Uniform tag sequences, Math. Systems Theory 6 (1972), 164–192, DOI
10.1007/BF01706087. MR457011
[163] H. Cohn, A short proof of the simple continued fraction expansion of e, Amer. Math.
Monthly 113 (2006), no. 1, 57–62, DOI 10.2307/27641837. MR2202921
[164] P. Collet and J.-P. Eckmann, Iterated maps on the interval as dynamical systems, Progress
in Physics, vol. 1, Birkhäuser, Boston, Mass., 1980. MR613981
[165] I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinaı̆, Ergodic theory, translated from the Russian by
A. B. Sosinskiı̆, Grundlehren der mathematischen Wissenschaften [Fundamental Principles
of Mathematical Sciences], vol. 245, Springer-Verlag, New York, 1982,
DOI 10.1007/978-1-4615-6927-5. MR832433
[166] J. Coquet, Représentation lacunaires des entières naturelles I, II, Arch. Math. Basel 38
(1982), 184–188, and 41 (1983), 238–242.
[167] M. I. Cortez, F. Durand, B. Host, and A. Maass, Continuous and measurable eigenfunctions
of linearly recurrent dynamical Cantor systems, J. London Math. Soc. (2) 67 (2003), no. 3,
790–804, DOI 10.1112/S0024610703004320. MR1967706
[168] M. I. Cortez and J. Rivera-Letelier, Invariant measures of minimal post-critical sets of logis-
tic maps, Israel J. Math. 176 (2010), 157–193, DOI 10.1007/s11856-010-0024-y. MR2653190
[169] M. I. Cortez and J. Rivera-Letelier, Topological orbit equivalence classes and numeration
scales of logistic maps, Ergodic Theory Dynam. Systems 32 (2012), no. 5, 1501–1526, DOI
10.1017/S0143385711000435. MR2974208
[170] E. M. Coven, I. Kan, and J. A. Yorke, Pseudo-orbit shadowing in the family of tent maps,
Trans. Amer. Math. Soc. 308 (1988), no. 1, 227–241, DOI 10.2307/2000960. MR946440
[171] D. Creutz and C. E. Silva, Mixing on a class of rank-one transformations, Ergodic Theory
Dynam. Systems 24 (2004), no. 2, 407–440, DOI 10.1017/S0143385703000464. MR2054050
[172] D. Creutz and C. E. Silva, Mixing on rank-one transformations, Studia Math. 199 (2010),
no. 1, 43–72, DOI 10.4064/sm199-1-4. MR2652597
[173] J. Currie and N. Rampersad, A proof of Dejean’s conjecture, Math. Comp. 80 (2011),
no. 274, 1063–1070, DOI 10.1090/S0025-5718-2010-02407-X. MR2772111
[174] V. Cyr and B. Kra, Counting generic measures for a subshift of linear growth, J. Eur. Math.
Soc. (JEMS) 21 (2019), no. 2, 355–380, DOI 10.4171/JEMS/838. MR3896204
[175] K. Dajani and C. Kraaikamp, Ergodic theory of numbers, Carus Mathematical Monographs,
vol. 29, Mathematical Association of America, Washington, DC, 2002. MR1917322
[176] D. Damanik and D. Lenz, A condition of Boshernitzan and uniform convergence in the
multiplicative ergodic theorem, Duke Math. J. 133 (2006), no. 1, 95–123,
DOI 10.1215/S0012-7094-06-13314-8. MR2219271
[177] M. Damron and J. Fickenscher, On the number of ergodic measures for minimal shifts with
eventually constant complexity growth, Ergodic Theory Dynam. Systems 37 (2017), no. 7,
2099–2130, DOI 10.1017/etds.2015.138. MR3693122
[178] M. Damron and J. Fickenscher, The number of ergodic measures for transitive subshifts
under the regular bispecial condition, Ergodic Theory Dynam. Systems 42 (2022), no. 1,
86–140, DOI 10.1017/etds.2020.134. MR4348411
[179] P. Dartnell, F. Durand, and A. Maass, Orbit equivalence and Kakutani equivalence with
Sturmian subshifts, Studia Math. 142 (2000), no. 1, 25–45, DOI 10.4064/sm-142-1-25-45.
MR1792287
[180] D. A. Dastjerdi and M. Dabbaghian Amiri, Mixing coded systems, Georgian Math. J. 26
(2019), no. 4, 637–642, and arXiv:1507.08048, DOI 10.1515/gmj-2017-0058. MR4036605
[181] D. A. Dastjerdi and S. Jangjoo, Computations on sofic S-gap shifts, Qual. Theory Dyn.
Syst. 12 (2013), no. 2, 393–406, DOI 10.1007/s12346-013-0096-2. MR3101268
[182] D. A. Dastjerdi and S. Jangjoo, Dynamics and topology of S-gap shifts, Topology Appl. 159
(2012), no. 10-11, 2654–2661, DOI 10.1016/j.topol.2012.04.002. MR2923435
[183] D. Dastjerdi and S. Shaldehi, (S, S′)-gap shifts as a generalization of run-length-limited
codes, British J. of Math. & Comp. Science 4 (2014), 2765–2780.
[184] H. Davenport, Über numeri abundantes, Sitzungsberichte Preuss. Akad. Wiss. (1933), 830–
837.
[185] H. Davenport, On some infinite series involving arithmetical functions. II, Quart. J. Math.
Oxford 8 (1937), 313–320.
[186] H. Davenport and P. Erdős, On sequences of positive integers, Acta Arithm. 2 (1936), 147–
151.
[187] H. Davenport and P. Erdős, On sequences of positive integers, J. Indian Math. Soc. (N.S.)
15 (1951), 19–24. MR43835
[188] H. Davenport and K. F. Roth, Rational approximations to algebraic numbers, Mathematika
2 (1955), 160–167, DOI 10.1112/S0025579300000814. MR77577
[189] F. Dejean, Sur un théorème de Thue (French), J. Combinatorial Theory Ser. A 13 (1972),
90–99, DOI 10.1016/0097-3165(72)90011-8. MR300959
[190] F. M. Dekking, The spectrum of dynamical systems arising from substitutions of constant
length, Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 41 (1977/78), no. 3, 221–239, DOI
10.1007/BF00534241. MR461470
[191] F. M. Dekking and M. Keane, Mixing properties of substitutions, Z. Wahrscheinlichkeitsthe-
orie und Verw. Gebiete 42 (1978), no. 1, 23–33, DOI 10.1007/BF00534205. MR466485
[192] V. Delecroix, C. Matheus, and C. G. Moreira, Approximations of the Lagrange and Markov
spectra, Math. Comp. 89 (2020), no. 325, 2521–2536, DOI 10.1090/mcom/3513. MR4109576
[193] A. Denjoy, Sur les courbes définies par les équations différentielles à la surface du tore,
Journal de Mathématiques Pures et Appliquées 11 (1932), 333–375.
[194] M. Denker, C. Grillenberger, and K. Sigmund, Ergodic theory on compact spaces, Lecture
Notes in Mathematics, Vol. 527, Springer-Verlag, Berlin-New York, 1976. MR0457675
[195] B. Derrida, A. Gervois, and Y. Pomeau, Iteration of endomorphisms on the real axis and
representation of numbers (English, with French summary), Ann. Inst. H. Poincaré Sect. A
(N.S.) 29 (1978), no. 3, 305–356. MR519698
[196] R. L. Devaney, An introduction to chaotic dynamical systems, The Benjamin/Cummings
Publishing Co., Inc., Menlo Park, CA, 1986. MR811850
[197] R. Deviatov, On subword complexity of morphic sequences, Computer science—theory and
applications, Lecture Notes in Comput. Sci., vol. 5010, Springer, Berlin, 2008, pp. 146–157,
DOI 10.1007/978-3-540-79709-8_17. MR2475157
[198] J. de Vries, Elements of topological dynamics, Mathematics and its Applications, vol. 257,
Kluwer Academic Publishers Group, Dordrecht, 1993, DOI 10.1007/978-94-015-8171-4.
MR1249063
[199] J. de Vries, Topological dynamical systems: An introduction to the dynamics of contin-
uous mappings, De Gruyter Studies in Mathematics, vol. 59, De Gruyter, Berlin, 2014.
MR3752609
[200] M. de Vries, V. Komornik, and P. Loreti, Topology of the set of univoque bases, Topology
Appl. 205 (2016), 117–137, DOI 10.1016/j.topol.2016.01.023. MR3493310
[201] M. de Vries, V. Komornik, and P. Loreti, Topology of univoque sets in real base expan-
sions, Topology Appl. 312 (2022), Paper No. 108085, 36, DOI 10.1016/j.topol.2022.108085.
MR4403655
[202] E. I. Dinaburg, A correlation between topological entropy and metric entropy (Russian),
Dokl. Akad. Nauk SSSR 190 (1970), 19–22. MR0255765
[203] P. Dömösi and M. Ito, Context-free languages and primitive words, World Scientific Pub-
lishing Co. Pte. Ltd., Hackensack, NJ, 2015. MR3243137
[204] S. Donoso, F. Durand, A. Maass, and S. Petite, On automorphism groups of low
complexity subshifts, Ergodic Theory Dynam. Systems 36 (2016), no. 1, 64–95, DOI
10.1017/etds.2015.70. MR3436754
[205] A. Douady, Julia sets and the Mandelbrot set, in: H.-O. Peitgen and P. Richter, The beauty
of fractals, Springer-Verlag, New York, 1986, pp. 161–173.
[206] A. Douady and J. Hubbard, Étude dynamique des polynômes complexes I & II, Publ. Math.
Orsay (1984–1985) (The Orsay notes).
[207] T. Downarowicz, The Choquet simplex of invariant measures for minimal flows, Israel J.
Math. 74 (1991), no. 2-3, 241–256, DOI 10.1007/BF02775789. MR1135237
[208] T. Downarowicz, Survey of odometers and Toeplitz flows, Algebraic and topological dy-
namics, Contemp. Math., vol. 385, Amer. Math. Soc., Providence, RI, 2005, pp. 7–37, DOI
10.1090/conm/385/07188. MR2180227
[209] T. Downarowicz, Entropy in dynamical systems, New Mathematical Monographs,
vol. 18, Cambridge University Press, Cambridge, 2011, DOI 10.1017/CBO9780511976155.
MR2809170
[210] T. Downarowicz, Positive topological entropy implies chaos DC2, Proc. Amer. Math. Soc.
142 (2014), no. 1, 137–149, DOI 10.1090/S0002-9939-2013-11717-X. MR3119189
[211] T. Downarowicz and E. Glasner, Isomorphic extensions and applications, Topol. Methods
Nonlinear Anal. 48 (2016), no. 1, 321–338, DOI 10.12775/TMNA.2016.050. MR3586277
[212] T. Downarowicz and O. Karpel, Dynamics in dimension zero: a survey, Discrete Contin.
Dyn. Syst. 38 (2018), no. 3, 1033–1062, DOI 10.3934/dcds.2018044. MR3808986
[213] T. Downarowicz and O. Karpel, Decisive Bratteli-Vershik models, Studia Math. 247 (2019),
no. 3, 251–271, DOI 10.4064/sm170519-5-2. MR3937447
[214] T. Downarowicz and Y. Lacroix, A non-regular Toeplitz flow with preset pure point spectrum,
Studia Math. 120 (1996), no. 3, 235–246. MR1410450
[215] T. Downarowicz and A. Maass, Finite-rank Bratteli-Vershik diagrams are expansive, Er-
godic Theory Dynam. Systems 28 (2008), no. 3, 739–747, DOI 10.1017/S0143385707000673.
MR2422014
[216] T. Downarowicz and J. Serafin, A short proof of the Ornstein theorem, Ergodic Theory
Dynam. Systems 32 (2012), no. 2, 587–597, DOI 10.1017/S0143385711000265. MR2901361
[217] T. Downarowicz and J. Serafin, A strictly ergodic, positive entropy subshift uniformly
uncorrelated to the Möbius function, Studia Math. 251 (2020), no. 2, 195–206, DOI
10.4064/sm180719-13-12. MR4045659
[218] M. Drmota and R. F. Tichy, Sequences, discrepancies and applications, Lecture Notes
in Mathematics, vol. 1651, Springer-Verlag, Berlin, 1997, DOI 10.1007/BFb0093404.
MR1470456
[219] F. Durand, A characterization of substitutive sequences using return words, Discrete Math.
179 (1998), no. 1-3, 89–101, DOI 10.1016/S0012-365X(97)00029-0. MR1489074
[220] F. Durand, A generalization of Cobham’s theorem, Theory Comput. Syst. 31 (1998), no. 2,
169–185, DOI 10.1007/s002240000084. MR1491657
[221] F. Durand, Linearly recurrent subshifts have a finite number of non-periodic subshift factors.
Ergodic Theory Dynam. Systems 20 (2000) 1061–1078, and corrigendum and addendum,
Ergodic Theory Dynam. Systems 23 (2003) 663–669. MR1779393; MR1972245
[222] F. Durand, Combinatorics on Bratteli diagrams and dynamical systems, Combinatorics,
automata and number theory, Encyclopedia Math. Appl., vol. 135, Cambridge Univ. Press,
Cambridge, 2010, pp. 324–372. MR2759109
[223] F. Durand, Cobham’s theorem for substitutions, J. Eur. Math. Soc. (JEMS) 13 (2011), no. 6,
1799–1814, DOI 10.4171/JEMS/294. MR2835330
[224] F. Durand, A. Frank, and A. Maass, Eigenvalues of minimal Cantor systems, J. Eur. Math.
Soc. (JEMS) 21 (2019), no. 3, 727–775, DOI 10.4171/JEMS/849. MR3908764
[225] F. Durand, B. Host, and C. Skau, Substitutional dynamical systems, Bratteli diagrams
and dimension groups, Ergodic Theory Dynam. Systems 19 (1999), no. 4, 953–993, DOI
10.1017/S0143385799133947. MR1709427
[226] F. Durand and J. Leroy, S-adic conjecture and Bratteli diagrams (English, with English
and French summaries), C. R. Math. Acad. Sci. Paris 350 (2012), no. 21-22, 979–983, DOI
10.1016/j.crma.2012.10.015. MR2996779
[227] F. Durand and J. Leroy, The constant of recognizability is computable for primitive mor-
phisms, J. Integer Seq. 20 (2017), no. 4, Art. 17.4.5, 15. MR3622264
[228] F. Durand, J. Leroy, and G. Richomme, Do the properties of an S-adic representation
determine factor complexity?, J. Integer Seq. 16 (2013), no. 2, Article 13.2.6, 30. MR3032389
[229] F. Durand and M. Rigo, On Cobham’s theorem, Chapter in Automata: from Mathematics
to Applications, Eur. Math. Soc., Editor J.-E. Pin.
[230] A. Dykstra, N. Ormes, and R. Pavlov, Subsystems of transitive subshifts with linear complex-
ity, Ergodic Theory Dynam. Systems 42 (2022), no. 6, 1967–1993, DOI 10.1017/etds.2021.8.
MR4417341
[231] A. Dymek, S. Kasjan, J. Kułaga-Przymus, and M. Lemańczyk, B-free sets and dynamics,
Trans. Amer. Math. Soc. 370 (2018), no. 8, 5425–5489, DOI 10.1090/tran/7132. MR3803141
[232] F. J. Dyson and H. Falk, Period of a discrete cat mapping, Amer. Math. Monthly 99 (1992),
no. 7, 603–614, DOI 10.2307/2324989. MR1176587
[233] S. Eilenberg, Automata, languages, and machines. Vol. A, Pure and Applied Mathemat-
ics, Vol. 58, Academic Press [Harcourt Brace Jovanovich, Publishers], New York, 1974.
MR0530382
[234] A. Eizenberg, Y. Kifer, and B. Weiss, Large deviations for $\mathbb{Z}^d$-actions, Commun. Math.
Phys. 644 (1994), 33–54.
[235] S. B. Ekhad and D. Zeilberger, There are more than $2^{n/17}$ n-letter ternary square-free
words, J. Integer Seq. 1 (1998), Article 98.1.9 (1 HTML document). MR1677077
[236] R. Ellis and W. H. Gottschalk, Homomorphisms of transformation groups, Trans. Amer.
Math. Soc. 94 (1960), 258–271, DOI 10.2307/1993310. MR123635
[237] J. Epperlein, D. Kwietniak, and P. Oprocha, Mixing properties in coded systems, New trends
in one-dimensional dynamics, Springer Proc. Math. Stat., vol. 285, Springer, Cham, [2019]
©2019, pp. 183–200, DOI 10.1007/978-3-030-16833-9_10. MR4043215
[238] P. Erdős, M. Horváth, and I. Joó, On the uniqueness of the expansions $1 = \sum q^{-n_i}$, Acta
Math. Hungar. 58 (1991), no. 3-4, 333–342, DOI 10.1007/BF01903963. MR1153488
[239] P. Erdős and I. Joó, On the number of expansions $1 = \sum q^{-n_i}$, Ann. Univ. Sci. Budapest.
Eötvös Sect. Math. 35 (1992), 129–132. MR1198106
[240] P. Erdős, I. Joó, and V. Komornik, Characterization of the unique expansions
$1 = \sum_{i=1}^{\infty} q^{-n_i}$ and related problems (English, with French summary), Bull. Soc. Math. France
118 (1990), no. 3, 377–390. MR1078082
[241] M. J. Feigenbaum, Quantitative universality for a class of nonlinear transformations, J.
Statist. Phys. 19 (1978), no. 1, 25–52, DOI 10.1007/BF01020332. MR501179
[242] D.-J. Feng, M. Furukado, S. Ito, and J. Wu, Pisot substitutions and the Hausdorff dimen-
sion of boundaries of atomic surfaces, Tsukuba J. Math. 30 (2006), no. 1, 195–223, DOI
10.21099/tkbjm/1496165037. MR2248292
[243] S. Ferenczi, Les transformations de Chacon: combinatoire, structure géométrique, lien avec
les systèmes de complexité 2n + 1 (French, with English and French summaries), Bull. Soc.
Math. France 123 (1995), no. 2, 271–292. MR1340291
[244] S. Ferenczi, Rank and symbolic complexity, Ergodic Theory Dynam. Systems 17 (1996),
271–292.
[245] S. Ferenczi, J. Kułaga-Przymus, and M. Lemańczyk, Sarnak’s conjecture: what’s new, Er-
godic theory and dynamical systems in their interactions with arithmetics and combinatorics,
Lecture Notes in Math., vol. 2213, Springer, Cham, 2018, pp. 163–235. MR3821717
[246] S. Ferenczi, C. Mauduit, and A. Nogueira, Substitution dynamical systems: algebraic char-
acterization of eigenvalues, Ann. Sci. École Norm. Sup. (4) 29 (1996), no. 4, 519–533.
MR1386224
[247] N. J. Fine and H. S. Wilf, Uniqueness theorems for periodic functions, Proc. Amer. Math.
Soc. 16 (1965), 109–114, DOI 10.2307/2034009. MR174934
[248] A. M. Fisher, Nonstationary mixing and the unique ergodicity of adic transformations,
Stoch. Dyn. 9 (2009), no. 3, 335–391, DOI 10.1142/S0219493709002701. MR2566907
[249] N. P. Fogg, Substitutions in dynamics, arithmetics and combinatorics, edited by V. Berthé,
S. Ferenczi, C. Mauduit and A. Siegel, Lecture Notes in Mathematics, vol. 1794, Springer-
Verlag, Berlin, 2002, DOI 10.1007/b13861. MR1970385
[250] S. Fomin, On dynamical systems with a purely point spectrum (Russian), Doklady Akad.
Nauk SSSR (N.S.) 77 (1951), 29–32. MR0043397
[251] S. Forchhammer and J. Justesen, Entropy bounds for constrained two-dimensional random
fields, IEEE Trans. Inform. Theory 45 (1999), no. 1, 118–127, DOI 10.1109/18.746776.
MR1677852
[252] R. H. Fox and R. B. Kershner, Concerning the transitive properties of geodesics on a rational
polyhedron, Duke Math. J. 2 (1936), no. 1, 147–150, DOI 10.1215/S0012-7094-36-00213-2.
MR1545913
[253] A. S. Fraenkel, Systems of numeration, Amer. Math. Monthly 92 (1985), no. 2, 105–114,
DOI 10.2307/2322638. MR777556
[254] S. B. Frick and N. Ormes, Dimension groups for polynomial odometers, Acta Appl. Math.
126 (2013), 165–186, DOI 10.1007/s10440-013-9812-9. MR3077947
[255] S. Frick, K. Petersen, and S. Shields, Dynamical properties of some adic systems with
arbitrary orderings, Ergodic Theory Dynam. Systems 37 (2017), no. 7, 2131–2162, DOI
10.1017/etds.2015.128. MR3693123
[256] S. Frick, K. Petersen, and S. Shields, Periodic codings of Bratteli-Vershik systems, Math.
Scand. 126 (2020), no. 2, 298–320, DOI 10.7146/math.scand.a-117570. MR4102566
[257] G. A. Freı̆man, Diofantovy priblizheniya i geometriya chisel (zadacha Markova) (Russian),
Kalinin. Gosudarstv. Univ., Kalinin, 1975. MR0485714
[258] D. Fried, Finitely presented dynamical systems, Ergodic Theory Dynam. Systems 7 (1987),
no. 4, 489–507, DOI 10.1017/S014338570000417X. MR922362
[259] S. Friedland, On the entropy of $\mathbb{Z}^d$ subshifts of finite type, Linear Algebra Appl. 252 (1997),
199–220, DOI 10.1016/0024-3795(95)00676-1. MR1428636
[280] C. Good, J. Meddaugh, and J. Mitchell, Shadowing, internal chain transitivity and α-limit
sets, J. Math. Anal. Appl. 491 (2020), no. 1, 124291, 19, DOI 10.1016/j.jmaa.2020.124291.
MR4109096
[281] P. Góra, Invariant densities for generalized β-maps, Ergodic Theory Dynam. Systems 27
(2007), no. 5, 1583–1598, DOI 10.1017/S0143385707000053. MR2358979
[282] P. Góra, Invariant densities for piecewise linear maps of the unit interval, Ergodic
Theory Dynam. Systems 29 (2009), no. 5, 1549–1583, DOI 10.1017/S0143385708000801.
MR2545017
[283] A. Gorodetski and Y. Pesin, Path connectedness and entropy density of the space of hy-
perbolic ergodic measures, Modern theory of dynamical systems, Contemp. Math., vol. 692,
Amer. Math. Soc., Providence, RI, 2017, pp. 111–121, DOI 10.1090/conm/692. MR3666070
[284] W. H. Gottschalk and G. A. Hedlund, Topological dynamics, American Mathematical Society
Colloquium Publications, Vol. 36, American Mathematical Society, Providence, R.I., 1955.
MR0074810
[285] P. J. Grabner, P. Hellekalek, and P. Liardet, The dynamical point of view of low-discrepancy
sequences, Unif. Distrib. Theory 7 (2012), no. 1, 11–70. MR2943160
[286] P. J. Grabner, P. Liardet, and R. F. Tichy, Odometers and systems of numeration, Acta
Arith. 70 (1995), no. 2, 103–123, DOI 10.4064/aa-70-2-103-123. MR1322556
[287] C. Grillenberger, Constructions of strictly ergodic systems. I. Given entropy, Z. Wahrschein-
lichkeitstheorie und Verw. Gebiete 25 (1972/73), 323–334, DOI 10.1007/BF00537161.
MR340544
[288] M. Gröger, Examples of dynamical systems in the interface between order and chaos, PhD.
Thesis, University of Bremen (2015).
[289] J. G. Simonsen, On the computability of the topological entropy of subshifts, Discrete Math.
Theor. Comput. Sci. 8 (2006), no. 1, 83–95. MR2247517
[290] J. Grytczuk, H. Kordulewski, and A. Niewiadomski, Extremal square-free words, Electron.
J. Combin. 27 (2020), no. 1, Paper No. 1.48, 9, DOI 10.37236/9264. MR4075246
[291] B. Gurevič, Topological entropy for denumerable Markov chains, Dokl. Akad. Nauk SSSR
10 (1969), 911–915.
[292] B. M. Gurevič, Shift entropy and Markov measures in the space of paths of a countable
graph (Russian), Dokl. Akad. Nauk SSSR 192 (1970), 963–965. MR0268356
[293] B. M. Gurevich and S. V. Savchenko, Thermodynamic formalism for symbolic Markov chains
with a countable number of states (Russian), Uspekhi Mat. Nauk 53 (1998), no. 2(320), 3–
106, DOI 10.1070/rm1998v053n02ABEH000017; English transl., Russian Math. Surveys 53
(1998), no. 2, 245–344. MR1639451
[294] B. M. Gurevich and A. S. Zargaryan, Conditions for the existence of a maximal measure
for a countable symbolic Markov chain (Russian), Vestnik Moskov. Univ. Ser. I Mat. Mekh.
5 (1988), 14–18, 103; English transl., Moscow Univ. Math. Bull. 43 (1988), no. 5, 18–23.
MR1051173
[295] A. Haar, Der Massbegriff in der Theorie der kontinuierlichen Gruppen (German), Ann. of
Math. (2) 34 (1933), no. 1, 147–169, DOI 10.2307/1968346. MR1503103
[296] J. Hadamard, Les surfaces à courbures opposées et leurs lignes géodesiques, Journ. Math.
Pures et Appliqués 4 (1898), 27–73.
[297] M. Hall Jr., On the sum and product of continued fractions, Ann. of Math. (2) 48 (1947),
966–993, DOI 10.2307/1969389. MR22568
[298] P. R. Halmos, Introduction to Hilbert Space and the theory of Spectral Multiplicity, Chelsea
Publishing Co., New York, N. Y., 1951. MR0045309
[299] P. R. Halmos and J. von Neumann, Operator methods in classical mechanics. II, Ann. of
Math. (2) 43 (1942), 332–350, DOI 10.2307/1968872. MR6617
[300] T. E. Harris, Transient Markov chains with stationary measures, Proc. Amer. Math. Soc.
8 (1957), 937–942, DOI 10.2307/2033696. MR91564
[301] G. Hansel, A propos d’un théorème de Cobham, Actes de la Fête des mots, Ed. Perrin, Greco
de Programmation, Rouen, 1982 (55–59).
[302] G. H. Hardy and J. E. Littlewood, Some problems of diophantine approximation, Acta Math.
37 (1914), no. 1, 193–239, DOI 10.1007/BF02401834. MR1555099
[303] G. H. Hardy and E. M. Wright, An introduction to the theory of numbers, 5th ed., The
Clarendon Press, Oxford University Press, New York, 1979. MR568909
[304] B. Hasselblatt and A. Katok, Principal structures, Handbook of dynamical systems, Vol.
1A, North-Holland, Amsterdam, 2002, pp. 1–203, DOI 10.1016/S1874-575X(02)80003-0.
MR1928518
[305] J. Hawkins, Ergodic dynamics—from basic theory to applications, Graduate Texts in
Mathematics, vol. 289, Springer, Cham, [2021] ©2021, DOI 10.1007/978-3-030-59242-4.
MR4221210
[306] N. T. A. Haydn, Phase transitions in one-dimensional subshifts, Discrete Contin. Dyn. Syst.
33 (2013), no. 5, 1965–1973, DOI 10.3934/dcds.2013.33.1965. MR3002738
[307] E. Hecke, Über analytische Funktionen und die Verteilung von Zahlen mod. eins (German),
Abh. Math. Sem. Univ. Hamburg 1 (1922), no. 1, 54–76, DOI 10.1007/BF02940580.
MR3069388
[308] G. A. Hedlund, Endomorphisms and automorphisms of the shift dynamical system, Math.
Systems Theory 3 (1969), 320–375, DOI 10.1007/BF01691062. MR259881
[309] M.-R. Herman, Sur la conjugaison différentiable des difféomorphismes du cercle à des ro-
tations (French), Inst. Hautes Études Sci. Publ. Math. 49 (1979), 5–233. MR538680
[310] R. H. Herman, I. F. Putnam, and C. F. Skau, Ordered Bratteli diagrams, dimension
groups and topological dynamics, Internat. J. Math. 3 (1992), no. 6, 827–864, DOI
10.1142/S0129167X92000382. MR1194074
[311] A. Heinis, Arithmetics and combinatorics of words of low complexity, PhD. Thesis, Univer-
sity of Leiden (2001).
[312] E. Hironaka, What is . . . Lehmer’s number?, Notices Amer. Math. Soc. 56 (2009), no. 3,
374–375. MR2494103
[313] M. Hochman, Multidimensional shifts of finite type and sofic shifts, Combinatorics, words
and symbolic dynamics, Encyclopedia Math. Appl., vol. 159, Cambridge Univ. Press, Cam-
bridge, 2016, pp. 296–358. MR3525488
[314] M. Hochman and T. Meyerovitch, A characterization of the entropies of multidimensional
shifts of finite type, Ann. of Math. (2) 171 (2010), no. 3, 2011–2038,
DOI 10.4007/annals.2010.171.2011. MR2680402
[315] F. Hofbauer, On intrinsic ergodicity of piecewise monotonic transformations with posi-
tive entropy, Israel J. Math. 34 (1979), no. 3, 213–237 (1980), DOI 10.1007/BF02760884.
MR570882
[316] F. Hofbauer, The topological entropy of the transformation x → ax(1 − x), Monatsh. Math.
90 (1980), no. 2, 117–141, DOI 10.1007/BF01303262. MR595319
[317] M. Hollander and B. Solomyak, Two-symbol Pisot substitutions have pure dis-
crete spectrum, Ergodic Theory Dynam. Systems 23 (2003), no. 2, 533–540, DOI
10.1017/S0143385702001384. MR1972237
[318] C. Holton and L. Q. Zamboni, Directed graphs and substitutions, Theory Comput. Syst. 34
(2001), no. 6, 545–564, DOI 10.1007/s00224-001-1038-y. MR1865811
[319] J. E. Hopcroft and J. D. Ullman, Introduction to automata theory, languages, and computa-
tion, Addison-Wesley Series in Computer Science, Addison-Wesley Publishing Co., Reading,
Mass., 1979. MR645539
[320] B. Host, Valeurs propres des systèmes dynamiques définis par des substitutions de longueur
variable (French), Ergodic Theory Dynam. Systems 6 (1986), no. 4, 529–540, DOI
10.1017/S0143385700003679. MR873430
[321] B. Host, B. Kra, and A. Maass, Nilsequences and a structure theorem for topological dy-
namical systems, Adv. Math. 224 (2010), no. 1, 103–129, DOI 10.1016/j.aim.2009.11.009.
MR2600993
[322] W. Huang, J. Li, J.-P. Thouvenot, L. Xu, and X. Ye, Bounded complexity, mean equicon-
tinuity and discrete spectrum, Ergodic Theory Dynam. Systems 41 (2021), no. 2, 494–533,
DOI 10.1017/etds.2019.66. MR4177293
[323] W. Huang, P. Lu, and X. Ye, Measure-theoretical sensitivity and equicontinuity, Israel J.
Math. 183 (2011), 233–283, DOI 10.1007/s11856-011-0049-x. MR2811160
[324] W. Huang and X. Ye, Devaney’s chaos or 2-scattering implies Li-Yorke’s chaos, Topology
Appl. 117 (2002), no. 3, 259–272, DOI 10.1016/S0166-8641(01)00025-6. MR1874089
[325] W. Huang and X. Ye, Dynamical systems disjoint from any minimal system, Trans. Amer.
Math. Soc. 357 (2005), no. 2, 669–694, DOI 10.1090/S0002-9947-04-03540-8. MR2095626
[326] J. E. Hutchinson, Fractals and self-similarity, Indiana Univ. Math. J. 30 (1981), no. 5,
713–747, DOI 10.1512/iumj.1981.30.30055. MR625600
[327] G. Iommi, M. Todd, and A. Velozo, Escape of entropy for countable Markov shifts, Adv.
Math. 405 (2022), Paper No. 108507, DOI 10.1016/j.aim.2022.108507. MR4438058
[328] S. Ito and H. Rao, Purely periodic β-expansions with Pisot unit base, Proc. Amer. Math.
Soc. 133 (2005), no. 4, 953–964, DOI 10.1090/S0002-9939-04-07794-9. MR2117194
[329] S. Ito and H. Rao, Atomic surfaces, tilings and coincidence. I. Irreducible case, Israel J.
Math. 153 (2006), 129–155, DOI 10.1007/BF02771781. MR2254640
[330] S. Ito, S. Tanaka, and H. Nakada, On unimodal linear transformations and chaos II, Tokyo
J. Math. 2 (1979), 241–259.
[331] A. Iwanik, Toeplitz flows with pure point spectrum, Studia Math. 118 (1996), no. 1, 27–35,
DOI 10.4064/sm-118-1-27-35. MR1373622
[332] K. Jacobs and M. Keane, 0-1-sequences of Toeplitz type, Z. Wahrscheinlichkeitstheorie und
Verw. Gebiete 13 (1969), 123–131, DOI 10.1007/BF00537017. MR255766
[333] H. Jager and C. Kraaikamp, On the approximation by continued fractions, Nederl. Akad.
Wetensch. Indag. Math. 51 (1989), no. 3, 289–307. MR1020023
[334] M. Jellali, M. Mkaouar, K. Scheicher, and J. M. Thuswaldner, Beta-continued fractions over
Laurent series, Publ. Math. Debrecen 77 (2010), no. 3-4, 443–463. MR2741860
[335] T. Jolivet, B. Loridant, and J. Luo, Rauzy fractals with countable fundamental group, J.
Fractal Geom. 1 (2014), no. 4, 427–447, DOI 10.4171/JFG/13. MR3299819
[336] U. Jung, On the existence of open and bi-continuing codes, Trans. Amer. Math. Soc. 363
(2011), no. 3, 1399–1417, DOI 10.1090/S0002-9947-2010-05035-4. MR2737270
[337] R. M. Jungers, V. Y. Protasov, and V. D. Blondel, Overlap-free words and spectra of matri-
ces, Theoret. Comput. Sci. 410 (2009), no. 38-40, 3670–3684, DOI 10.1016/j.tcs.2009.04.022.
MR2553320
[338] L. Kailhofer, A classification of inverse limit spaces of tent maps with periodic critical
points, Fund. Math. 177 (2003), no. 2, 95–120, DOI 10.4064/fm177-2-1. MR1992527
[339] S. Kakutani, A problem of equidistribution on the unit interval [0, 1], Measure theory (Proc.
Conf., Oberwolfach, 1975), Springer, Berlin, 1976, pp. 369–375. Lecture Notes in Math., Vol.
541. MR0457678
[340] T. Kamae, A topological invariant of substitution minimal sets, J. Math. Soc. Japan 24
(1972), 285–306, DOI 10.2969/jmsj/02420285. MR293611
[341] T. Kamae, A simple proof of the ergodic theorem using nonstandard analysis, Israel J. Math.
42 (1982), no. 4, 284–290, DOI 10.1007/BF02761408. MR682311
[342] A. Kanigowski, M. Lemańczyk, and M. Radziwiłł, Rigidity in dynamics and Möbius disjoint-
ness, Fund. Math. 255 (2021), no. 3, 309–336, DOI 10.4064/fm931-11-2020. MR4324828
[343] J. Karhumäki and J. Shallit, Polynomial versus exponential growth in repetition-free binary
words, J. Combin. Theory Ser. A 105 (2004), no. 2, 335–347, DOI 10.1016/j.jcta.2003.12.004.
MR2046086
[344] S. Kasjan, G. Keller, and M. Lemańczyk, Dynamics of B-free sets: a view through the
window, Int. Math. Res. Not. IMRN 9 (2019), 2690–2734, DOI 10.1093/imrn/rnx196.
MR3947636
[345] A. B. Katok, Invariant measures of flows on orientable surfaces (Russian), Dokl. Akad.
Nauk SSSR 211 (1973), 775–778. MR0331438
[346] A. Katok and B. Hasselblatt, Introduction to the modern theory of dynamical systems,
with a supplementary chapter by Katok and Leonardo Mendoza, Encyclopedia of Mathe-
matics and its Applications, vol. 54, Cambridge University Press, Cambridge, 1995, DOI
10.1017/CBO9780511809187. MR1326374
[347] M. Keane, Interval exchange transformations, Math. Z. 141 (1975), 25–31, DOI
10.1007/BF01236981. MR357739
[348] M. Keane, Non-ergodic interval exchange transformations, Israel J. Math. 26 (1977), no. 2,
188–196, DOI 10.1007/BF03007668. MR435353
[349] M. S. Keane, Ergodic theory and subshifts of finite type, Ergodic theory, symbolic dynamics,
and hyperbolic spaces (Trieste, 1989), Oxford Sci. Publ., Oxford Univ. Press, New York,
1991, pp. 35–70. MR1130172
[350] M. Keane, A continued fraction titbit, Symposium in Honor of Benoit Mandelbrot (Curaçao,
1995), Fractals 3 (1995), no. 4, 641–650, DOI 10.1142/S0218348X95000576. MR1410284
[351] M. S. Keane and G. Rauzy, Stricte ergodicité des échanges d’intervalles (French), Math. Z.
174 (1980), no. 3, 203–212, DOI 10.1007/BF01161409. MR593819
[352] M. Keane and M. Smorodinsky, Bernoulli schemes of the same entropy are finitarily iso-
morphic, Ann. of Math. (2) 109 (1979), no. 2, 397–406, DOI 10.2307/1971117. MR528969
[353] G. Keller, Tautness for sets of multiples and applications to B-free dynamics, Studia Math.
247 (2019), no. 2, 205–216, DOI 10.4064/sm180305-9-4. MR3920387
[354] K. Keller, Invariant factors, Julia equivalences and the (abstract) Mandelbrot set, Lecture
Notes in Mathematics, vol. 1732, Springer-Verlag, Berlin, 2000, DOI 10.1007/BFb0103999.
MR1761576
[355] J. Kepler, Harmonices mundi, Linz, 1619.
[356] D. Kerr and H. Li, Independence in topological and $C^*$-dynamics, Math. Ann. 338 (2007),
no. 4, 869–926, DOI 10.1007/s00208-007-0097-z. MR2317754
[357] H. Kesten, On a conjecture of Erdős and Szüsz related to uniform distribution mod 1, Acta
Arith. 12 (1966/67), 193–212, DOI 10.4064/aa-12-2-193-212. MR209253
[358] H. B. Keynes and D. Newton, A “minimal”, non-uniquely ergodic interval exchange trans-
formation, Math. Z. 148 (1976), no. 2, 101–105, DOI 10.1007/BF01214699. MR409766
[359] H. B. Keynes and J. B. Robertson, On ergodicity and mixing in topological transformation
groups, Duke Math. J. 35 (1968), 809–819. MR234441
[360] A. Ya. Khinchin, Continued fractions, with a preface by B. V. Gnedenko; reprint of the
1964 translation, translated from the third (1961) Russian edition, Dover Publications, Inc.,
Mineola, NY, 1997. MR1451873
[361] K. H. Kim and F. W. Roush, Williams’s conjecture is false for reducible subshifts, J. Amer.
Math. Soc. 5 (1992), no. 1, 213–215, DOI 10.2307/2152756. MR1130528
[362] J. L. King, A map with topological minimal self-joinings in the sense of del Junco, Er-
godic Theory Dynam. Systems 10 (1990), no. 4, 745–761, DOI 10.1017/S0143385700005873.
MR1091424
[363] J. F. C. Kingman, The exponential decay of Markov transition probabilities, Proc. London
Math. Soc. (3) 13 (1963), 337–358, DOI 10.1112/plms/s3-13.1.337. MR152014
[364] B. P. Kitchens, Symbolic dynamics: One-sided, two-sided and countable state Markov shifts,
Universitext, Springer-Verlag, Berlin, 1998, DOI 10.1007/978-3-642-58822-8. MR1484730
[365] J. Kiwi, Wandering orbit portraits, Trans. Amer. Math. Soc. 354 (2002), no. 4, 1473–1485,
DOI 10.1090/S0002-9947-01-02896-3. MR1873015
[366] R. Kolpakov, Efficient lower bounds on the number of repetition-free words, J. Integer Seq.
10 (2007), no. 3, Article 07.3.2, 16. MR2291946
[367] R. Kolpakov, G. Kucherov, and Y. Tarannikov, On repetition-free binary words of minimal
density, WORDS (Rouen, 1997), Theoret. Comput. Sci. 218 (1999), no. 1, 161–175, DOI
10.1016/S0304-3975(98)00257-6. MR1687788
[368] V. Komornik and P. Loreti, Unique developments in non-integer bases, Amer. Math.
Monthly 105 (1998), no. 7, 636–639, DOI 10.2307/2589246. MR1633077
[369] J. Konieczny, M. Kupsa, and D. Kwietniak, Arcwise connectedness of the set of ergodic
measures of hereditary shifts, Proc. Amer. Math. Soc. 146 (2018), no. 8, 3425–3438, DOI
10.1090/proc/14029. MR3803667
[370] C. Kopf, Invariant measures for piecewise linear transformations of the interval, Appl.
Math. Comput. 39 (1990), no. 2, 123–144, DOI 10.1016/0096-3003(90)90027-Z. MR1071209
[371] T. J. P. Krebs, A more reasonable proof of Cobham’s theorem, Internat. J. Found. Comput.
Sci. 32 (2021), no. 2, 203–207, DOI 10.1142/S0129054121500118. MR4218824
[372] W. Krieger, On entropy and generators of measure-preserving transformations, Trans.
Amer. Math. Soc. 149 (1970), 453–464, and Erratum 168 (1972) 519, DOI 10.2307/1995407.
MR259068
[373] W. Krieger, On the uniqueness of the equilibrium state, Math. Systems Theory 8 (1974/75),
no. 2, 97–104, DOI 10.1007/BF01762180. MR399412
[374] W. Krieger, On topological Markov chains, Dynamical systems, Vol. II—Warsaw, Soc. Math.
France, Paris, 1977, pp. 193–196. Astérisque, No. 50. MR0500874
[375] W. Krieger, On a dimension for a class of homeomorphism groups, Math. Ann. 252
(1979/80), no. 2, 87–95, DOI 10.1007/BF01420115. MR593623
[376] W. Krieger, On dimension functions and topological Markov chains, Invent. Math. 56
(1980), no. 3, 239–250, DOI 10.1007/BF01390047. MR561973
[377] L. Kuipers and H. Niederreiter, Uniform distribution of sequences, Pure and Applied
Mathematics, Wiley-Interscience [John Wiley & Sons], New York-London-Sydney, 1974.
MR0419394
[378] J. Kułaga-Przymus, M. Lemańczyk, and B. Weiss, On invariant measures for B-free sys-
tems, Proc. Lond. Math. Soc. (3) 110 (2015), no. 6, 1435–1474, DOI 10.1112/plms/pdv017.
MR3356811
[379] J. Kułaga-Przymus, M. Lemańczyk, and B. Weiss, Hereditary subshifts whose simplex of
invariant measures is Poulsen, Ergodic theory, dynamical systems, and the continuing in-
fluence of John C. Oxtoby, Contemp. Math., vol. 678, Amer. Math. Soc., Providence, RI,
2016, pp. 245–253, DOI 10.1090/conm/678. MR3589826
[380] M. Kulczycki, D. Kwietniak, and J. Li, Entropy of subordinate shift spaces, Amer. Math.
Monthly 125 (2018), no. 2, 141–148, DOI 10.1080/00029890.2018.1401875. MR3756340
[381] P. Kůrka, Topological and symbolic dynamics, Cours Spécialisés [Specialized Courses],
vol. 11, Société Mathématique de France, Paris, 2003. MR2041676
[382] D. Kwietniak, Topological entropy and distributional chaos in hereditary shifts with ap-
plications to spacing shifts and beta shifts, Discrete Contin. Dyn. Syst. 33 (2013), no. 6,
2451–2467, DOI 10.3934/dcds.2013.33.2451. MR3007694
[383] D. Kwietniak, M. Ła̧cka, and P. Oprocha, A panorama of specification-like properties and
their consequences, Dynamics and numbers, Contemp. Math., vol. 669, Amer. Math. Soc.,
Providence, RI, 2016, pp. 155–186, DOI 10.1090/conm/669/13428. MR3546668
[384] D. Kwietniak, P. Oprocha, and M. Rams, On entropy of dynamical systems with almost
specification, Israel J. Math. 213 (2016), no. 1, 475–503, DOI 10.1007/s11856-016-1339-0.
MR3509480
[385] J. Lagrange, Additions au mémoire sur la résolution des équations numériques, Mém. Acad.
royale sc. et belles-lettres, Berlin 24 (1770); also in Oeuvres II, 581–652.
[386] O. E. Lanford III, A computer-assisted proof of the Feigenbaum conjectures, Bull. Amer.
Math. Soc. (N.S.) 6 (1982), no. 3, 427–434, DOI 10.1090/S0273-0979-1982-15008-X.
MR648529
[387] K. Lau and A. Zame, On weak mixing of cascades, Math. Systems Theory 6 (1972/73),
307–311, DOI 10.1007/BF01740722. MR321058
[388] P. Lavaurs, Une description combinatoire de l’involution définie par M sur les rationnels à
dénominateur impair (French, with English summary), C. R. Acad. Sci. Paris Sér. I Math.
303 (1986), no. 4, 143–146. MR853606
[389] P. D. Lax, Functional analysis, Pure and Applied Mathematics (New York), Wiley-
Interscience [John Wiley & Sons], New York, 2002. MR1892228
[390] F. Ledrappier, Some properties of absolutely continuous invariant measures on an interval,
Ergodic Theory Dynam. Systems 1 (1981), no. 1, 77–93, DOI 10.1017/s0143385700001176.
MR627788
[391] D. H. Lehmer, Factorization of certain cyclotomic functions, Ann. of Math. (2) 34 (1933),
no. 3, 461–479, DOI 10.2307/1968172. MR1503118
[392] J. Leroy, Some improvements of the S-adic conjecture, Adv. in Appl. Math. 48 (2012),
no. 1, 79–98, DOI 10.1016/j.aam.2011.03.005. MR2845508
[393] J. Leroy and G. Richomme, A combinatorial proof of S-adicity for sequences with linear
complexity, Integers 13 (2013), Paper No. A5, 19. MR3083467
[394] J. Li, S. Tu, and X. Ye, Mean equicontinuity and mean sensitivity, Ergodic Theory Dynam.
Systems 35 (2015), no. 8, 2587–2612, DOI 10.1017/etds.2014.41. MR3456608
[395] T. Y. Li and J. A. Yorke, Period three implies chaos, Amer. Math. Monthly 82 (1975),
no. 10, 985–992, DOI 10.2307/2318254. MR385028
[396] D. Lima, C. Matheus, C. G. Moreira, and S. Romaña, Classical and dynamical Markov and
Lagrange spectra—dynamical, fractal and arithmetic aspects, World Scientific Publishing
Co. Pte. Ltd., Hackensack, NJ, [2021] ©2021. MR4274593
[397] D. A. Lind, The entropies of topological Markov shifts and a related class of al-
gebraic integers, Ergodic Theory Dynam. Systems 4 (1984), no. 2, 283–300, DOI
10.1017/S0143385700002443. MR766106
[398] D. Lind and B. Marcus, An introduction to symbolic dynamics and coding, Cambridge
University Press, Cambridge, 1995, DOI 10.1017/CBO9780511626302. MR1369092
[399] D. Lind, K. Schmidt, and T. Ward, Mahler measure and entropy for commuting
automorphisms of compact groups, Invent. Math. 101 (1990), no. 3, 593–629, DOI
10.1007/BF01231517. MR1062797
[400] J. Lindenstrauss, G. Olsen, and Y. Sternfeld, The Poulsen simplex (English, with French
summary), Ann. Inst. Fourier (Grenoble) 28 (1978), no. 1, vi, 91–114. MR500918
[401] K. Lindsey and R. Treviño, Infinite type flat surface models of ergodic systems, Discrete
Contin. Dyn. Syst. 36 (2016), no. 10, 5509–5553, DOI 10.3934/dcds.2016043. MR3543559
[402] A. N. Livshits, On the spectra of adic transformations of Markov compact sets (Russian),
Uspekhi Mat. Nauk 42 (1987), no. 3(255), 189–190. MR896889
[403] A. N. Livshits, Sufficient conditions for weak mixing of substitutions and of station-
ary adic transformations (Russian), Mat. Zametki 44 (1988), no. 6, 785–793, 862, DOI
10.1007/BF01158030; English transl., Math. Notes 44 (1988), no. 5-6, 920–925 (1989).
MR983550
[421] M. Misiurewicz and S. Roth, Constant slope maps on the extended real line, Ergodic Theory
Dynam. Systems 38 (2018), no. 8, 3145–3169, DOI 10.1017/etds.2017.3. MR3868025
[422] M. Misiurewicz and W. Szlenk, Entropy of piecewise monotone mappings, Studia Math. 67
(1980), no. 1, 45–63, DOI 10.4064/sm-67-1-45-63. MR579440
[423] J. Mitchell, On origin of orbits and the shadow of chaos, Ph.D. Thesis, 2020, University of
Birmingham.
[424] T. K. S. Moothathu, Diagonal points having dense orbit, Colloq. Math. 120 (2010), no. 1,
127–138, DOI 10.4064/cm120-1-9. MR2652611
[425] M. Morse and G. A. Hedlund, Symbolic dynamics II. Sturmian trajectories, Amer. J. Math.
62 (1940), 1–42, DOI 10.2307/2371431. MR745
[426] B. Mossé, Puissances de mots et reconnaissabilité des points fixes d’une substitu-
tion (French), Theoret. Comput. Sci. 99 (1992), no. 2, 327–334, DOI 10.1016/0304-
3975(92)90357-L. MR1168468
[427] J. Moulin Ollagnier, Proof of Dejean’s conjecture for alphabets with 5, 6, 7, 8, 9, 10 and 11
letters (English, with French summary), Theoret. Comput. Sci. 95 (1992), no. 2, 187–205,
DOI 10.1016/0304-3975(92)90264-G. MR1156042
[428] J. Myhill, Finite automata and the representation of events, WADD technical report (1957),
112–137.
[429] M. G. Nadkarni, Spectral theory of dynamical systems, Birkhäuser Advanced Texts: Basler
Lehrbücher. [Birkhäuser Advanced Texts: Basel Textbooks], Birkhäuser Verlag, Basel, 1998,
DOI 10.1007/978-3-0348-8841-7. MR1719722
[430] A. Nerode, Linear automaton transformations, Proc. Amer. Math. Soc. 9 (1958), 541–544,
DOI 10.2307/2033204. MR135681
[431] J. von Neumann, Zur Operatorenmethode in der klassischen Mechanik (German), Ann. of
Math. (2) 33 (1932), no. 3, 587–642, DOI 10.2307/1968537. MR1503078
[432] A. Nogueira and D. Rudolph, Topological weak-mixing of interval exchange maps, Ergodic
Theory Dynam. Systems 17 (1997), no. 5, 1183–1209, DOI 10.1017/S0143385797086276.
MR1477038
[433] W. Ogden, A helpful result for proving inherent ambiguity, Math. Systems Theory 2 (1968),
191–194, DOI 10.1007/BF01694004. MR233645
[434] P. Oprocha and G. Zhang, Topological aspects of dynamics of pairs, tuples and sets, Recent
progress in general topology. III, Atlantis Press, Paris, 2014, pp. 665–709, DOI 10.2991/978-
94-6239-024-9_16. MR3205496
[435] N. Ormes and R. Pavlov, On the complexity function for sequences which are not uniformly
recurrent, Dynamical systems and random processes, Contemp. Math., vol. 736, Amer.
Math. Soc., [Providence], RI, [2019] ©2019, pp. 125–137, DOI 10.1090/conm/736/14833.
MR4011909
[436] D. S. Ornstein, On the root problem in ergodic theory, Proceedings of the Sixth Berkeley
Symposium on Mathematical Statistics and Probability (Univ. California, Berkeley, Calif.,
1970/1971), Univ. California Press, Berkeley, Calif., 1972, pp. 347–356. MR0399415
[437] D. Ornstein, A Kolmogorov automorphism that is not a Bernoulli shift, Matematika 15
(1971), 131–150.
[438] D. S. Ornstein, Ergodic theory, randomness, and dynamical systems, James K. Whittemore
Lectures in Mathematics given at Yale University, Yale Mathematical Monographs, No. 5,
Yale University Press, New Haven, Conn.-London, 1974. MR0447525
[439] A. Ostrowski, Bemerkungen zur Theorie der Diophantischen Approximationen (German),
Abh. Math. Sem. Univ. Hamburg 1 (1922), no. 1, 77–98, DOI 10.1007/BF02940581.
MR3069389
[440] J. C. Oxtoby, Ergodic sets, Bull. Amer. Math. Soc. 58 (1952), 116–136, DOI 10.1090/S0002-
9904-1952-09580-X. MR47262
[441] J.-J. Pansiot, Complexité des facteurs des mots infinis engendrés par morphismes itérés
(French, with English summary), Automata, languages and programming (Antwerp, 1984),
Lecture Notes in Comput. Sci., vol. 172, Springer, Berlin, 1984, pp. 380–389, DOI 10.1007/3-
540-13345-3_34. MR784265
[442] J.-J. Pansiot, À propos d’une conjecture de F. Dejean sur les répétitions dans les mots
(French, with English summary), Discrete Appl. Math. 7 (1984), no. 3, 297–311, DOI
10.1016/0166-218X(84)90006-4. MR736893
[443] J.-J. Pansiot, On various classes of infinite words obtained by iterated mappings, Automata
on infinite words (Le Mont-Dore, 1984), Lecture Notes in Comput. Sci., vol. 192, Springer,
Berlin, 1985, pp. 188–197, DOI 10.1007/3-540-15641-0_34. MR814743
[444] S. S. Park, Bratteli diagram isomorphic to Chacon homeomorphism, Bull. Korean Math.
Soc. 37 (2000), no. 3, 519–536. MR1779242
[445] W. Parry, On the β-expansions of real numbers (English, with Russian summary), Acta
Math. Acad. Sci. Hungar. 11 (1960), 401–416, DOI 10.1007/BF02020954. MR142719
[446] W. Parry, Intrinsic Markov chains, Trans. Amer. Math. Soc. 112 (1964), 55–66, DOI
10.2307/1994009. MR161372
[447] W. Parry, Representations for real numbers, Acta Math. Acad. Sci. Hungar. 15 (1964),
95–105, DOI 10.1007/BF01897025. MR166332
[448] O. G. Parshina, On arithmetic progressions in the generalized Thue-Morse word, Combina-
torics on words, Lecture Notes in Comput. Sci., vol. 9304, Springer, Cham, 2015, pp. 191–196,
DOI 10.1007/978-3-319-23660-5_16. MR3446321
[449] M. E. Paul, Construction of almost automorphic symbolic minimal flows, General Topology
and Appl. 6 (1976), no. 1, 45–56. MR388365
[450] R. Pavlov, On entropy and intrinsic ergodicity of coded subshifts, Proc. Amer. Math. Soc.
148 (2020), no. 11, 4717–4731, DOI 10.1090/proc/15145. MR4143389
[451] R. Peckner, Uniqueness of the measure of maximal entropy for the squarefree flow, Israel J.
Math. 210 (2015), no. 1, 335–357, DOI 10.1007/s11856-015-1255-8. MR3430278
[452] C. Penrose, On quotients of shifts associated with dendrite Julia sets of quadratic polyno-
mials, Ph.D. Thesis, University of Coventry, 1994.
[453] D. Perrin, Finite automata, Handbook of theoretical computer science, Vol. B, Elsevier,
Amsterdam, 1990, pp. 1–57. MR1127186
[454] K. E. Petersen, A topologically strongly mixing symbolic minimal set, Trans. Amer. Math.
Soc. 148 (1970), 603–612, DOI 10.2307/1995392. MR259884
[455] K. Petersen, On a series of cosecants related to a problem in ergodic theory, Compositio
Math. 26 (1973), 313–317. MR325927
[456] K. Petersen, Ergodic theory, Cambridge Studies in Advanced Mathematics, vol. 2, Cambridge
University Press, Cambridge, 1983, DOI 10.1017/CBO9780511608728. MR833286
[457] K. Petersen, Chains, entropy, coding, Ergodic Theory Dynam. Systems 6 (1986), no. 3,
415–448, DOI 10.1017/S014338570000359X. MR863204
[458] C.-E. Pfister and W. G. Sullivan, Large deviations estimates for dynamical systems without
the specification property. Applications to the β-shifts, Nonlinearity 18 (2005), no. 1, 237–
261, DOI 10.1088/0951-7715/18/1/013. MR2109476
[459] C.-E. Pfister and W. G. Sullivan, On the topological entropy of saturated sets, Ergodic
Theory Dynam. Systems 27 (2007), no. 3, 929–956, DOI 10.1017/S0143385706000824.
MR2322186
[460] R. R. Phelps, Lectures on Choquet’s theorem, 2nd ed., Lecture Notes in Mathematics,
vol. 1757, Springer-Verlag, Berlin, 2001, DOI 10.1007/b76887. MR1835574
[461] S. Yu. Pilyugin, Shadowing in dynamical systems, Lecture Notes in Mathematics, vol. 1706,
Springer-Verlag, Berlin, 1999. MR1727170
[462] C. Preston, Iterates of maps on an interval, Lecture Notes in Mathematics, vol. 999,
Springer-Verlag, Berlin, 1983, DOI 10.1007/BFb0061749. MR706078
[463] W. E. Pruitt, Eigenvalues of non-negative matrices, Ann. Math. Statist. 35 (1964), 1797–
1800, DOI 10.1214/aoms/1177700401. MR168579
[464] J. Qiu and J. Zhao, A note on mean equicontinuity, J. Dynam. Differential Equations 32
(2020), no. 1, 101–116, DOI 10.1007/s10884-018-9716-5. MR4061636
[465] M. Queffélec, Substitution dynamical systems—spectral analysis, Lecture Notes in Mathe-
matics, vol. 1294, Springer-Verlag, Berlin, 1987, DOI 10.1007/BFb0081890. MR924156
[466] M. Queffélec, Une nouvelle propriété des suites de Rudin-Shapiro (French, with English
summary), Ann. Inst. Fourier (Grenoble) 37 (1987), no. 2, 115–138. MR898934
[467] N. Rampersad and J. Shallit, Repetitions in words, Combinatorics, words and symbolic
dynamics, Encyclopedia Math. Appl., vol. 159, Cambridge Univ. Press, Cambridge, 2016,
pp. 101–150. MR3525483
[468] N. Rampersad, J. Shallit, and É. Vandomme, Critical exponents of infinite balanced words,
Theoret. Comput. Sci. 777 (2019), 454–463, DOI 10.1016/j.tcs.2018.10.017. MR3961908
[469] G. N. Raney, On continued fractions and finite automata, Math. Ann. 206 (1973), 265–283,
DOI 10.1007/BF01355980. MR340166
[470] M. Rao, Last cases of Dejean’s conjecture, Theoret. Comput. Sci. 412 (2011), no. 27, 3010–
3018, DOI 10.1016/j.tcs.2010.06.020. MR2830264
[471] G. Rauzy, Nombres algébriques et substitutions (French, with English summary), Bull. Soc.
Math. France 110 (1982), no. 2, 147–178. MR667748
[472] C. Richard and U. Grimm, On the entropy and letter frequencies of ternary square-free
words, Electron. J. Combin. 11 (2004), no. 1, Research Paper 14, 19. MR2035308
[473] M. Rigo and L. Waxweiler, A note on syndeticity, recognizable sets and Cobham’s theorem,
Bull. Eur. Assoc. Theor. Comput. Sci. EATCS 88 (2006), 169–173. MR2222340
[474] C. Robinson, Dynamical systems: Stability, symbolic dynamics, and chaos, 2nd ed., Studies
in Advanced Mathematics, CRC Press, Boca Raton, FL, 1999. MR1792240
[475] V. A. Rokhlin, Exact endomorphisms of a Lebesgue space (Russian), Izv. Akad. Nauk SSSR
Ser. Mat. 25 (1961), 499–530. MR0143873
[476] K. F. Roth, Rational approximations to algebraic numbers, Mathematika 2 (1955), 1–20;
corrigendum, 168, DOI 10.1112/S0025579300000644. MR72182
[477] W. Rudin, Some theorems on Fourier coefficients, Proc. Amer. Math. Soc. 10 (1959), 855–
859, DOI 10.2307/2033608. MR116184
[478] W. Rudin, Functional analysis, McGraw-Hill Series in Higher Mathematics, McGraw-Hill
Book Co., New York-Düsseldorf-Johannesburg, 1973. MR0365062
[479] D. J. Rudolph, Fundamentals of measurable dynamics: Ergodic theory on Lebesgue spaces,
Oxford Science Publications, The Clarendon Press, Oxford University Press, New York,
1990. MR1086631
[480] D. Ruelle, Statistical mechanics on a compact set with Z^ν action satisfying expansiveness
and specification, Trans. Amer. Math. Soc. 187 (1973), 237–251, DOI 10.2307/1996437.
MR417391
[481] S. Ruette, On the Vere-Jones classification and existence of maximal measures for
countable topological Markov chains, Pacific J. Math. 209 (2003), no. 2, 365–380, DOI
10.2140/pjm.2003.209.365. MR1978377
[482] S. Ruette, Chaos on the interval, University Lecture Series, vol. 67, American Mathematical
Society, Providence, RI, 2017, DOI 10.1090/ulect/067. MR3616574
[483] Y. Saiki, M. A. F. Sanjuán, and J. A. Yorke, Low-dimensional paradigms for high-
dimensional hetero-chaos, Chaos 28 (2018), no. 10, 103110, 7, DOI 10.1063/1.5045693.
MR3865616
[484] I. A. Salama, Topological entropy and recurrence of countable chains, Pacific J. Math. 134
(1988), no. 2, 325–341. MR961239
[485] R. Salem, A remarkable class of algebraic integers. Proof of a conjecture of Vijayaraghavan,
Duke Math. J. 11 (1944), 103–108. MR10149
[486] A. Salomaa, Theory of automata, International Series of Monographs in Pure and Applied
Mathematics, Vol. 100, Pergamon Press, Oxford-New York-Toronto, Ont., 1969. MR0262021
[487] O. Sarig and M. Schmoll, Adic flows, transversal flows, and horocycle flows, Ergodic theory
and dynamical systems, De Gruyter Proc. Math., De Gruyter, Berlin, 2014, pp. 241–259.
MR3220105
[488] J. Schmeling, Symbolic dynamics for β-shifts and self-normal numbers, Ergodic Theory
Dynam. Systems 17 (1997), no. 3, 675–694, DOI 10.1017/S0143385797079182. MR1452189
[489] J. Schmeling and S. Troubetzkoy, Interval translation mappings, Dynamical systems
(Luminy-Marseille, 1998), World Sci. Publ., River Edge, NJ, 2000, pp. 291–302. MR1796167
[490] K. Schmidt, On periodic expansions of Pisot numbers and Salem numbers, Bull. London
Math. Soc. 12 (1980), no. 4, 269–278, DOI 10.1112/blms/12.4.269. MR576976
[491] T. Schneider, Transzendenzuntersuchungen periodischer Funktionen, Teil 1, 2, Journal für
die Reine und Angewandte Mathematik 172 (1934), 65–69, 70–74.
[492] B. Schweizer and J. Smítal, Measures of chaos and a spectral decomposition of dynam-
ical systems on the interval, Trans. Amer. Math. Soc. 344 (1994), no. 2, 737–754, DOI
10.2307/2154504. MR1227094
[493] E. Seneta, Nonnegative matrices and Markov chains, 2nd ed., Springer Series in Statistics,
Springer-Verlag, New York, 1981, DOI 10.1007/0-387-32792-4. MR719544
[494] C. Series, Geometrical methods of symbolic coding, Ergodic theory, symbolic dynamics, and
hyperbolic spaces (Trieste, 1989), Oxford Sci. Publ., Oxford Univ. Press, New York, 1991,
pp. 125–151. MR1130175
[495] C. E. Shannon, A mathematical theory of communication, Bell System Tech. J. 27 (1948),
379–423, 623–656, DOI 10.1002/j.1538-7305.1948.tb01338.x. MR26286
[496] H. S. Shapiro, Extremal problems for polynomials and power series, ProQuest LLC, Ann
Arbor, MI, 1953. Thesis (Ph.D.)–Massachusetts Institute of Technology. MR2938495
[497] S. Shao and X. Ye, Regionally proximal relation of order d is an equivalence one for minimal
systems and a combinatorial consequence, Adv. Math. 231 (2012), no. 3-4, 1786–1817, DOI
10.1016/j.aim.2012.07.012. MR2964624
[498] A. Sharkovskiy, Coexistence of cycles of a continuous map of the line into itself (Russian),
Ukrain. Math. Zh. 16 (1964), 61–71.
[499] A. Sharkovskiy, Coexistence of cycles of a continuous mapping of the line into itself, trans-
lation into English by J. Tolosa, International Journal of Bifurcation and Chaos 05 (1995)
1263–1273.
[500] T. Shimomura, Special homeomorphisms and approximation for Cantor systems, Topology
Appl. 161 (2014), 178–195, DOI 10.1016/j.topol.2013.10.018. MR3132360
[501] T. Shimomura, Graph covers and ergodicity for zero-dimensional systems, Ergodic Theory
Dynam. Systems 36 (2016), no. 2, 608–631, DOI 10.1017/etds.2014.72. MR3503037
[502] T. Shimomura, Bratteli-Vershik models and graph covering models, Adv. Math. 367 (2020),
107127, 54, DOI 10.1016/j.aim.2020.107127. MR4080580
[503] A. M. Shur, Growth rates of complexity of power-free languages, Theoret. Comput. Sci. 411
(2010), no. 34-36, 3209–3223, DOI 10.1016/j.tcs.2010.05.017. MR2676864
[504] A. M. Shur, Growth properties of power-free languages, Developments in language the-
ory, Lecture Notes in Comput. Sci., vol. 6795, Springer, Heidelberg, 2011, pp. 28–43, DOI
10.1007/978-3-642-22321-1_3. MR2862712
[505] C. Siegel, Approximation algebraischer Zahlen (German), Math. Z. 10 (1921), no. 3-4, 173–
213, DOI 10.1007/BF01211608. MR1544471
[506] C. L. Siegel, Algebraic integers whose conjugates lie in the unit circle, Duke Math. J. 11
(1944), 597–602. MR10579
[507] K. Sigmund, Generic properties of invariant measures for Axiom A diffeomorphisms, Invent.
Math. 11 (1970), 99–109, DOI 10.1007/BF01404606. MR286135
[508] K. Sigmund, On dynamical systems with the specification property, Trans. Amer. Math. Soc.
190 (1974), 285–299, DOI 10.2307/1996963. MR352411
[509] C. E. Silva, Invitation to ergodic theory, Student Mathematical Library, vol. 42, American
Mathematical Society, Providence, RI, 2008, DOI 10.1090/stml/042. MR2371216
[510] S. Silverman, On maps with dense orbits and the definition of chaos, Rocky Mountain J.
Math. 22 (1992), no. 1, 353–375, DOI 10.1216/rmjm/1181072815. MR1159963
[511] Ja. G. Sinaı̆, A weak isomorphism of transformations with invariant measure (Russian),
Dokl. Akad. Nauk SSSR 147 (1962), 797–800. MR0161960
[512] Ja. G. Sinaı̆, Construction of Markov partitionings (Russian), Funkcional. Anal. i Priložen.
2 (1968), no. 3, 70–80 (Loose errata). MR0250352
[513] Ya. G. Sinai, Introduction to ergodic theory, translated by V. Scheffer, Mathematical Notes,
vol. 18, Princeton University Press, Princeton, N.J., 1976. MR0584788
[514] Ya. G. Sinai and C. Ulcigrai, Weak mixing in interval exchange transformations of peri-
odic type, Lett. Math. Phys. 74 (2005), no. 2, 111–133, DOI 10.1007/s11005-005-0011-0.
MR2191950
[515] V. F. Sirvent and Y. Wang, Self-affine tiling via substitution dynamical systems and
Rauzy fractals, Pacific J. Math. 206 (2002), no. 2, 465–485, DOI 10.2140/pjm.2002.206.465.
MR1926787
[516] J. Smítal, Chaotic functions with zero topological entropy, Trans. Amer. Math. Soc. 297
(1986), no. 1, 269–282, DOI 10.2307/2000468. MR849479
[517] J. Smítal and M. Štefánková, Distributional chaos for triangular maps, Chaos Solitons Frac-
tals 21 (2004), no. 5, 1125–1128, DOI 10.1016/j.chaos.2003.12.105. MR2047330
[518] M. Sollami, C. C. Douglas, and M. Liebmann, An improved lower bound on the number of
ternary squarefree words, J. Integer Seq. 19 (2016), no. 6, Article 16.6.7, 21. MR3546621
[519] B. Solomyak, On the spectral theory of adic transformations, Representation theory and
dynamical systems, Adv. Soviet Math., vol. 9, Amer. Math. Soc., Providence, RI, 1992,
pp. 217–230. MR1166205
[520] V. T. Sós, On the theory of diophantine approximations. I, Acta Math. Acad. Sci. Hungar.
8 (1957), 461–472, DOI 10.1007/BF02020329. MR93510
[521] V. Sós, On the distribution (mod 1) of the sequence {nα}, Ann. Univ. Sci. Budapest. Eötvös
Sect. Math. 1 (1958), 127–134.
[522] C. Spandl, Computing the topological entropy of shifts, MLQ Math. Log. Q. 53 (2007),
no. 4-5, 493–510, DOI 10.1002/malq.200710014. MR2351946
[523] B. Stanley, Bounded density shifts, Ergodic Theory Dynam. Systems 33 (2013), no. 6, 1891–
1928, DOI 10.1017/etds.2013.38. MR3122156
[524] P. Štefan, A theorem of Šarkovskii on the existence of periodic orbits of continuous endo-
morphisms of the real line, Comm. Math. Phys. 54 (1977), no. 3, 237–248. MR445556
[525] I. Stewart, Galois Theory, Chapman & Hall, 2004.
[526] S. Štimac, A classification of inverse limit spaces of tent maps with finite critical or-
bit, Topology Appl. 154 (2007), no. 11, 2265–2281, DOI 10.1016/j.topol.2007.03.003.
MR2328011
[527] D. Sullivan, Conformal dynamical systems, Geometric dynamics (Rio de Janeiro,
1981), Lecture Notes in Math., vol. 1007, Springer, Berlin, 1983, pp. 725–752, DOI
10.1007/BFb0061443. MR730296
[528] F. Svanström, Properties of a generalized Arnold’s discrete cat map, Master Thesis,
Linnaeus University Uppsala, 2014,
https://www.diva-portal.org/smash/get/diva2:725545/fulltext01.pdf.
[529] S. Tabachnikov, Dragon curves revisited, Math. Intelligencer 36 (2014), no. 1, 13–17, DOI
10.1007/s00283-013-9428-y. MR3166985
[530] Y. Tarannikov, The minimal density of a letter in an infinite ternary square-free word is
0.2746..., J. Integer Seq. 5 (2002), no. 2, Article 02.2.2, 8. MR1938221
[531] A. Thue, Über unendliche Zeichenreihen, Norske Vid. Selsk. Skr. I. Mat. Nat. Kl. Christiania
7 (1906), 1–12.
[532] A. Thue, Über Annäherungswerte algebraischer Zahlen (German), J. Reine Angew. Math.
135 (1909), 284–305, DOI 10.1515/crll.1909.135.284. MR1580770
[533] A. Thue, Über die gegenseitige Lage gleicher Teile gewisser unendliche Zeichenreihen,
Norske Vid. Selsk. Skr. I. Mat. Nat. Kl. Christiania 1 (1912), 1–67.
[534] H. Thunberg, A recycled characterization of kneading sequences: Discrete dynamical sys-
tems, Internat. J. Bifur. Chaos Appl. Sci. Engrg. 9 (1999), no. 9, 1883–1887, DOI
10.1142/S0218127499001371. MR1728748
[535] W. Thurston, On the geometry of iterated rational maps, Preprint, Princeton University,
1985.
[536] X. Tian, Different asymptotic behavior versus same dynamical complexity: recurrence
& (ir)regularity, Adv. Math. 288 (2016), 464–526, DOI 10.1016/j.aim.2015.11.006.
MR3436391
[537] O. Toeplitz, Ein Beispiel zur Theorie der fastperiodischen Funktionen (German), Math.
Ann. 98 (1928), no. 1, 281–295, DOI 10.1007/BF01451594. MR1512405
[538] C. Tresser and P. Coullet, Itérations d’endomorphismes et groupe de renormalisation
(French, with English summary), C. R. Acad. Sci. Paris Sér. A-B 287 (1978), no. 7, A577–
A580. MR512110
[539] R. Tijdeman, Periodicity and almost-periodicity, More sets, graphs and numbers, Bolyai
Soc. Math. Stud., vol. 15, Springer, Berlin, 2006, pp. 381–405, DOI 10.1007/978-3-540-
32439-3_18. MR2223402
[540] W. Veech, The necessity of Harris’ condition for the existence of a stationary measure,
Proc. Amer. Math. Soc. 14 (1963), 856–860, DOI 10.2307/2035014. MR156379
[541] W. A. Veech, Strict ergodicity in zero dimensional dynamical systems and the Kronecker-
Weyl theorem mod 2, Trans. Amer. Math. Soc. 140 (1969), 1–33, DOI 10.2307/1995120.
MR240056
[542] W. A. Veech, Interval exchange transformations, J. Analyse Math. 33 (1978), 222–272, DOI
10.1007/BF02790174. MR516048
[543] W. A. Veech, Gauss measures for transformations on the space of interval exchange maps,
Ann. of Math. (2) 115 (1982), no. 1, 201–242, DOI 10.2307/1971391. MR644019
[544] D. Vere-Jones, Geometric ergodicity in denumerable Markov chains, Quart. J. Math. Oxford
Ser. (2) 13 (1962), 7–28, DOI 10.1093/qmath/13.1.7. MR141160
[545] D. Vere-Jones, Ergodic properties of nonnegative matrices. I, Pacific J. Math. 22 (1967),
361–386. MR214145
[546] A. M. Vershik, Uniform algebraic approximation of shift and multiplication operators
(Russian), Dokl. Akad. Nauk SSSR 259 (1981), no. 3, 526–529. MR625756
[547] A. M. Vershik, A theorem on Markov periodic approximation in ergodic theory (Russian),
Boundary value problems of mathematical physics and related questions in the theory of
functions, 14, Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI) 115 (1982),
72–82, 306. MR660072
[548] M. Viana, Ergodic theory of interval exchange maps, Rev. Mat. Complut. 19 (2006), no. 1,
7–100, DOI 10.5209/rev_REMA.2006.v19.n1.16621. MR2219821
[549] M. Viana and K. Oliveira, Foundations of ergodic theory, Cambridge Studies in Ad-
vanced Mathematics, vol. 151, Cambridge University Press, Cambridge, 2016, DOI
10.1017/CBO9781316422601. MR3558990
[550] P. Walters, Some results on the classification of non-invertible measure preserving trans-
formations, Recent advances in topological dynamics (Proc. Conf. Topological Dynamics,
Yale Univ., New Haven, Conn., 1972; in honor of Gustav Arnold Hedlund), Lecture Notes
in Math., Vol. 318, Springer, Berlin, 1973, pp. 266–276. MR0393424
[551] P. Walters, An introduction to ergodic theory, Graduate Texts in Mathematics, vol. 79,
Springer-Verlag, New York-Berlin, 1982. MR648108
[552] P. Walters, On the pseudo-orbit tracing property and its relationship to stability, The struc-
ture of attractors in dynamical systems (Proc. Conf., North Dakota State Univ., Fargo, N.D.,
1977), Lecture Notes in Math., vol. 668, Springer, Berlin, 1978, pp. 231–244. MR518563
[553] H. Wang, X. Long, and H. Fu, Sensitivity and chaos of semigroup actions, Semigroup Forum
84 (2012), no. 1, 81–90, DOI 10.1007/s00233-011-9335-5. MR2885999
[554] Y. Wang, L. Yang, and H. Xie, Complexity of unimodal maps with aperiodic kneading
sequences, Nonlinearity 12 (1999), no. 4, 1151–1176, DOI 10.1088/0951-7715/12/4/323.
MR1709853
[555] B. Weiss, Intrinsically ergodic systems, Bull. Amer. Math. Soc. 76 (1970), 1266–1269, DOI
10.1090/S0002-9904-1970-12632-5. MR267076
[556] B. Weiss, Subshifts of finite type and sofic systems, Monatsh. Math. 77 (1973), 462–474,
DOI 10.1007/BF01295322. MR340556
[557] H. Weyl, Über die Gleichverteilung von Zahlen mod. Eins (German), Math. Ann. 77 (1916),
no. 3, 313–352, DOI 10.1007/BF01475864. MR1511862
[558] N. Wiener, Generalized harmonic analysis, Acta Math. 55 (1930), no. 1, 117–258, DOI
10.1007/BF02546511. MR1555316
[559] R. F. Williams, Classification of subshifts of finite type, Ann. of Math. (2) 98 (1973), 120–
153; errata, ibid. (2) 99 (1974), 380–381, DOI 10.2307/1970908. MR331436
[560] S. Williams, Toeplitz minimal flows which are not uniquely ergodic, Z. Wahrsch. Verw.
Gebiete 67 (1984), no. 1, 95–107, DOI 10.1007/BF00534085. MR756807
[561] C. Williamson, An overview of the Thue-Morse sequence, unpublished manuscript, 2012,
https://sites.math.washington.edu/~morrow/336_12/papers/christopher.pdf.
[562] T. S. Wu, Proximal relations in topological dynamics, Proc. Amer. Math. Soc. 16 (1965),
513–514, DOI 10.2307/2034685. MR179775
[563] X. Wu, Y. Luo, X. Ma, and T. Lu, Rigidity and sensitivity on uniform spaces, Topology
Appl. 252 (2019), 145–157, DOI 10.1016/j.topol.2018.11.014. MR3883168
[564] H. M. Xie, On formal languages in one-dimensional dynamical systems, Nonlinearity 6
(1993), no. 6, 997–1007. MR1251254
[565] H. Xie, Grammatical complexity and one-dimensional dynamical systems, Directions in
Chaos, vol. 6, World Scientific Publishing Co., Inc., River Edge, NJ, 1996, DOI 10.1142/2877.
MR1470955
[566] J. C. Xiong, A chaotic map with topological entropy, Acta Math. Sci. (English Ed.) 6 (1986),
no. 4, 439–443, DOI 10.1016/S0252-9602(18)30503-4. MR924033
[567] E. Zeckendorf, Représentation des nombres naturels par une somme de nombres de Fibonacci
ou de nombres de Lucas (French, with English summary), Bull. Soc. Roy. Sci. Liège 41
(1972), 179–182. MR308032
[568] A. N. Zemljakov and A. B. Katok, Topological transitivity of billiards in polygons (Russian),
Mat. Zametki 18 (1975), no. 2, 291–300. MR399423
[569] J. D. Zund, George David Birkhoff and John von Neumann: a question of priority and the
ergodic theorems, 1931–1932 (English, with English and French summaries), Historia Math.
29 (2002), no. 2, 138–156, DOI 10.1006/hmat.2001.2338. MR1896971
Index
    sum, 381
    web, 381
Farey convergents, 214
Fatou set, 107
Feigenbaum
    map, 100, 205, 207–209, 350
    sequence, 350, 355
    shift, 186, 194
    substitution, 136, 193, 207, 250, 323, 350
Feigenbaum map, 100
Fekete’s Lemma, 8, 76, 89, 188, 190, 282, 411, 414
Ferenczi, 159, 161, 223, 298, 299
Fermat’s Little Theorem, 198
Fibonacci
    Bratteli-Vershik system, 242, 251, 274
    number, 6, 92, 105, 134, 226, 228, 274, 371, 393
    SFT, 3–5, 9, 11, 51, 53, 66, 74, 117, 119, 228, 292
    substitution, 3, 92, 134, 135, 143, 145, 180, 222, 242, 292
    unimodal map, 92, 95, 96, 355
Fibonacci substitution, 46
Fine-Wilf Theorem, 362
finitary, 287
first return map, 179, 183, 204, 217
first return time, 217, 218, 261, 288
fixed point, 20
flip-conjugacy, 23
follower separated, 64
follower set, 63, 63, 76, 83, 85, 118
Ford circles, 383
formula
    Abel’s summation, 397
    Abramov’s, 293
    Binet’s, 371
    Rokhlin’s, 284, 292
    Stirling’s, 73, 406
forward orbit, 20
frequency, 12, 127, 134, 151, 171, 172, 175, 177, 262, 263, 350, 354, 391
Furstenberg, 46, 265

Galois conjugate, 367, 368, 369
gap shift, 50, 117
Gauß map, 380
generalized odometer, 225
generating, 34, 283
generic, 267
generic measure, 13, 266, 277
golden mean, 54, 80, 92, 136, 274, 368, 392, 393, 395, 416
Gottschalk, 325
grammar, 346
    context-free, 348, 349
    context-sensitive, 352, 353
    recursively enumerable, 347, 356
    regular, 347, 349
graph cover, 174, 220, 243
greedy expansion, 77, 79, 225, 226, 253
Gurevich entropy, 66, 71, 402, 405

Haar measure, 201, 266, 315, 317
Halmos & von Neumann Structure Theorem, 155, 315
halting state, 342
Hedlund, 6, 10, 17, 163, 172, 176, 325
height, 218, 221, 234, 271, 327
hereditary, 71, 71, 72, 82, 121, 197
    closure, 72
    subshift, 72
hereditary subshift, 197
Hilbert, 367, 400
Hilbert metric, 281, 400, 400
Hilbert space, 309
Hofbauer, 82, 95, 96, 100
Hofbauer tower, 81, 85, 95
homoclinic point, 374
homterval, 92
Host, 134, 149, 161, 269, 319, 327
Host’s conjecture, 161
Hubbard tree, 106
hyperbolic, 47, 54, 131

IET, 162, 181, 244
incidence matrix, 234, 242, 243, 271
inequality
    Denjoy-Koksma, 390
    Parseval, 318
infinitely renormalizable, 136, 205, 207, 229, 255
Ingram conjecture, 365
initial state, 342
insplit graph, 58
internal address, 111
interval exchange transformation (IET), 161, 162, 181, 263, 281
intrinsically ergodic, 14, 50, 57, 64, 70, 82, 100, 119, 201, 202, 293, 407
invariant coordinate, 101
invariant measure, 13, 257
inverse limit space, 137, 365
irreducible, 9, 53, 139, 151, 181, 288, 402
isometry, 28, 37, 40, 192, 265
isomorphic, 152, 284, 309
isomorphism, 34, 252, 315
iterated function system (IFS), 157, 231
itinerary, 15, 90, 109

Julia set, 106, 107, 108, 114

Kac’s Lemma, 261
Kakutani, 220
Kakutani-Rokhlin partition, 218, 329
Keane, 180, 263, 265, 277, 279, 287, 296, 320, 328, 380, 399
Keane condition, 181, 245, 276, 279
Keane conjecture, 279–281
Kepler tree, 383
Kleene’s Theorem, 347
Knaster continuum, 137, 365
kneading determinant, 100, 103
kneading increment, 102
kneading map, 48, 204, 209, 229, 255, 276, 354
kneading sequence, 91, 107, 109, 186, 203, 208, 350, 355
kneading theory, 90
Koksma inequality, 389
Kolmogorov entropy, 282
Kolmogorov Extension Theorem, 13, 14, 270, 273, 288, 295, 319, 321
Komornik-Loreti constant, 81
Koopman, 258
Koopman operator, 122, 151, 258, 309, 315, 326
KR-partition (Kakutani-Rokhlin), 218, 248
Krieger, 128, 129, 263, 283
Kronecker factor, 312, 315, 324, 325
Krylov-Bogol’yubov Theorem, 276

Lagrange spectrum, 392
Lagrange’s Theorem, 354, 393
Lang’s conjecture, 395
language, 5, 65, 345
lap-number, 89, 103, 351
lazy expansion, 78, 79
leading eigenvalue, 398
Lebesgue Density Theorem, 44
Lebesgue space, 283
left-linear, 347
left-shift, 3
left-special, 6, 138, 164, 169, 176, 209
Lehmer’s number, 368
Lemma
    Anosov Closing, 48
    Anosov Shadowing, 48
    Borel-Cantelli, 394
    Fekete’s, 8, 76, 89, 188, 190, 282, 411, 414
    Kac’s, 261
    Ogden’s, 350
    Pumping, 349, 360
    Riemann-Lebesgue, 312
leo (locally eventually onto), 45, 279
Li-Yorke chaos, 43, 77, 121
Li-Yorke chaotic, 44
Li-Yorke pair, 43
lift, 22, 165
light tails, 196
linear complexity, 39
linearly recurrent, 7, 133, 141, 160–162, 219, 273, 296, 327, 349
Liouville number, 393
local rule, 10
locally eventually onto, 45, 279
locally expanding, 17, 208
logarithmic density, 196, 197, 397
long-branched, 90, 94
Lorenz-like map, 210
low enumeration scale, 204, 229
Lyapunov stable, 28, 33

Maass, 239, 327
main cardioid, 108, 111
Mandelbrot set, 108, 108, 111
map
    Chacon, 223
    doubling, 285
    Feigenbaum, 205, 207–209, 350
    skew tent, 100, 105
    tent, 22, 40, 48, 88, 90, 94, 100, 105, 117, 137, 213, 365
    unimodal, 88, 229, 354
Markov, 94
Markov measure, 288, 406
Markov partition, 53, 83–85
Markov triple, 392
martingale, 330
Martingale Convergence Theorem, 330
maximal equicontinuous factor, 32, 32, 34, 45, 194, 199
maximal measure, 14, 50, 71, 82, 128, 129, 288, 289, 293, 294, 403, 405, 407, 411
maximal spectral type, 312
mean equicontinuous, 33, 264, 317
mean sensitive, 42
measurable eigenvalue, 309
measure
    Bernoulli, 14, 259
    generic, 13, 266, 277
    Haar, 266, 315, 317
    Mirsky, 201
    of maximal entropy, 14, 71, 82, 128, 129, 288, 289, 293, 294, 403, 405, 407, 411
    Shannon-Parry, 289, 293, 295, 406
    spectral, 311, 312, 315
measure-theoretic entropy, 282, 287, 406
mediant, 381
memory, 10, 52
metallic mean, 368, 393
metric entropy, 282
microscoping, 234
minimal, 24, 25, 42, 157, 159, 183, 200, 208, 238, 298
minimal polynomial, 152, 367, 369
mirror invariant, 164
Mirsky measure, 201, 201
Misiurewicz, 207, 209, 409
mixed spectrum, 323
mixing, 295, 296
    strong, 295, 296
    topological, 45, 49, 63, 65, 76, 118
    weak, 39, 224, 306, 307, 339
    weak topological, 45
    weak topologically, 65
Möbius function, 198, 198
Montel’s Theorem, 111
morphism, 135
Morse, 5, 6, 163, 172, 176
multinacci number, 368
multiplicative coding, 178
Myhill-Nerode Theorem, 63

natural extension, 285
neutral, 20
non-deterministic finite automaton, 344
non-erasing, 141
non-wandering, 47
non-wandering point, 21
non-wandering set, 21, 28, 38
Non-wandering Triangle Theorem, 114
null recurrent, 71, 403
nullset, 257
number
    algebraic, 367
    Catalan, 129
    Diophantine, 393
    Fibonacci, 6, 92, 105, 134, 226, 228, 274, 371, 393
    Lehmer’s, 368
    Liouville, 393
    multinacci, 368
    Pell, 393
    Perron, 9, 84
    Pisot, 84, 151, 230, 339, 368
    plastic, 368
    quadratic, 354, 393
    rotation, 22, 87, 165
    Salem, 368
    transcendental, 5, 81, 367
    tribonacci, 154, 230

odd shift, 4, 66, 74, 119, 120
odometer, 190, 224, 238, 239, 248, 265, 319
    canonical, 200
    dyadic, 240
    simple, 193
Ogden’s Lemma, 350
omega-limit set, 7, 20, 21, 204
orb, 20
orbit, 20
orbit cocycle, 23
orbit equivalent, 23
ordered Bratteli system, 233
Ornstein, 286, 298
Ornstein’s Theorem, 286
orthogonal, 310
Ostrowski numeration, 228
outsplit graph, 58
overlap, 122
overlap-free, 122
Oxtoby’s Theorem, 14, 262

palindromic, 137, 164, 368
paper folding sequence, 136
parity-lexicographical order, 89, 91, 203
Parry, 57, 288, 289, 291
Parseval inequality, 318
partial quotients, 393
Pavlov, 68, 70, 410
Pell number, 393
perfect Bratteli-Vershik system, 237