Bounded Variable Logics and Counting A Study in Finite Models 1st Edition Martin Otto

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 69

Bounded Variable Logics and Counting

A Study in Finite Models 1st Edition


Martin Otto
Visit to download the full and correct content document:
https://ebookmeta.com/product/bounded-variable-logics-and-counting-a-study-in-finite
-models-1st-edition-martin-otto/
More products digital (pdf, epub, mobi) instant
download maybe you interests ...

Models in Microeconomic Theory She Edition Martin J


Osborne

https://ebookmeta.com/product/models-in-microeconomic-theory-she-
edition-martin-j-osborne/

Martin Rauch Refined Earth Construction Design of


Rammed Earth Otto Kapfinger (Editor)

https://ebookmeta.com/product/martin-rauch-refined-earth-
construction-design-of-rammed-earth-otto-kapfinger-editor/

Sociology of the Renaissance 1st Edition Alfred Wilhelm


Otto V Martin Gertrud Lenzer Wallace K Ferguson

https://ebookmeta.com/product/sociology-of-the-renaissance-1st-
edition-alfred-wilhelm-otto-v-martin-gertrud-lenzer-wallace-k-
ferguson/

Agrarian Socialism The Cooperative Commonwealth


Federation in Saskatchewan A Study in Political
Sociology Seymour Martin Lipset

https://ebookmeta.com/product/agrarian-socialism-the-cooperative-
commonwealth-federation-in-saskatchewan-a-study-in-political-
sociology-seymour-martin-lipset/
A Spin and Momentum Resolved Photoemission Study of
Strong Electron Correlation in Co Cu 001 1st Edition
Martin Ellguth

https://ebookmeta.com/product/a-spin-and-momentum-resolved-
photoemission-study-of-strong-electron-correlation-in-co-
cu-001-1st-edition-martin-ellguth/

Modal Logics and Philosophy Second Edition Rod Girle

https://ebookmeta.com/product/modal-logics-and-philosophy-second-
edition-rod-girle/

Study Skills for International Postgraduates Bloomsbury


Study Skills 2nd Edition Martin Davies

https://ebookmeta.com/product/study-skills-for-international-
postgraduates-bloomsbury-study-skills-2nd-edition-martin-davies/

R-Calculus, V: Description Logics 1st Edition Li

https://ebookmeta.com/product/r-calculus-v-description-
logics-1st-edition-li/

Statistical Analysis of Graph Structures in Random


Variable Networks V. A. Kalyagin

https://ebookmeta.com/product/statistical-analysis-of-graph-
structures-in-random-variable-networks-v-a-kalyagin/
Bounded Variable Logics and Counting

Since their inception, the Perspectives in Logic and Lecture Notes in Logic series
have published seminal works by leading logicians. Many of the original books in the
series have been unavailable for years, but they are now in print once again.
In this volume, the 9th publication in the Lecture Notes in Logic series, Martin
Otto gives an introduction to finite model theory that indicates the main ideas and lines
of inquiry that motivate research in this area. Particular attention is paid to bounded
variable infinitary logics, with and without counting quantifiers, related fixed-point
logics, and the corresponding fragments of Ptime. The relations with Ptime exhibit
the fruitful exchange between ideas from logic and from complexity theory that is
characteristic of finite model theory.

M a r t i n O t t o works in the Department of Mathematics at RWTH Aachen


University, Germany.
L E C T U R E N OT E S I N L O G I C

A Publication of The Association for Symbolic Logic

This series serves researchers, teachers, and students in the field of symbolic
logic, broadly interpreted. The aim of the series is to bring publications to
the logic community with the least possible delay and to provide rapid
dissemination of the latest research. Scientific quality is the overriding
criterion by which submissions are evaluated.

Editorial Board
Jeremy Avigad,
Department of Philosophy, Carnegie Mellon University
Zoe Chatzidakis
DMA, Ecole Normale Supérieure, Paris
Peter Cholak, Managing Editor
Department of Mathematics, University of Notre Dame, Indiana
Volker Halbach,
New College, University of Oxford
H. Dugald Macpherson
School of Mathematics, University of Leeds
Slawomir Solecki
Department of Mathematics, University of Illinois at Urbana–Champaign
Thomas Wilke,
Institut für Informatik, Christian-Albrechts-Universität zu Kiel

More information, including a list of the books in the series, can be found at
http://www.aslonline.org/books-lnl.html
L E C T U R E N OT E S I N L O G I C 9

Bounded Variable Logics and Counting


A Study in Finite Models

MARTIN OTTO
Rheinisch-Westfälische Technische Hochschule,
Aachen, Germany

association for symbolic logic


University Printing House, Cambridge CB2 8BS, United Kingdom
One Liberty Plaza, 20th Floor, New York, NY 10006, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
4843/24, 2nd Floor, Ansari Road, Daryaganj, Delhi – 110002, India
79 Anson Road, #06–04/06, Singapore 079906

Cambridge University Press is part of the University of Cambridge.


It furthers the University’s mission by disseminating knowledge in the pursuit of
education, learning, and research at the highest international levels of excellence.

www.cambridge.org
Information on this title: www.cambridge.org/9781107167940
10.1017/9781316716878
First edition © 1997 Springer-Verlag Berlin Heidelberg
This edition © 2016 Association for Symbolic Logic under license to
Cambridge University Press.
Association for Symbolic Logic
Richard A. Shore, Publisher
Department of Mathematics, Cornell University, Ithaca, NY 14853
http://www.aslonline.org
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
A catalogue record for this publication is available from the British Library.
ISBN 978-1-107-16794-0 Hardback
Cambridge University Press has no responsibility for the persistence or accuracy
of URLs for external or third-party Internet Web sites referred to in this publication
and does not guarantee that any content on such Web sites is, or will remain,
accurate or appropriate.
Preface

Viewed as a branch of model theory, finite model theory is concerned with


finite structures and their properties under logical, combinatorial, algorithmic
and complexity theoretic aspects. The connection of classical concerns of logic
and model theory with issues in complexity theory has contributed very much
to the development of finite model theory into a field with its own specific
flavour.
I like to think of this monograph as a study which — with a partic-
ular theme of its own — exemplifies and reflects some central ideas and
lines of research in finite model theory. The particular theme is that of
bounded variable infinitary logics, with and without counting quantifiers,
related fixed-point logics, and corresponding fragments of PTIME. The re-
lations with PTIME exhibit that fruitful exchange between ideas from logic
and from complexity theory that is characteristic of finite model theory and,
more specifically, of the research programme of descriptive complexity.
Among the main particular topics and techniques I would emphasize:
— the importance of games as a fundamental tool from classical logic; their
use in the analysis of finite structures also with respect to algorithmic and
complexity theoretic concerns is amply illustrated.
- the role of cardinality phenomena, which clearly are amongst the most
fundamental guidelines in the analysis of finite structures.
- the importance of combinatorial techniques, and of dealing with concrete
combinatorial problems over finite domains. Examples here range from
applications of the stable colouring technique in the formation of structural
invariants to certain colourings of squares that come up in canonization for
logics with two variables.
In order that this study may be useful also as an introduction to some
of the important concepts in the field, I have tried to treat the particular
theme in a detailed and mostly self-contained manner. On the other hand
this treatment leads up to specific results of a more technical nature, and I
welcome the opportunity to present some contributions in a broader context.
VI

With respect to the work presented here, I personally owe much to two
sources that I would like to mention. One is the Freiburg logic group where
I had the opportunity to participate in an intensified engagement in finite
model theory after finishing my doctorate with Professor Ebbinghaus. Pro-
fessor Flum and Professor Ebbinghaus created an encouraging atmosphere
for taking up this new field actively. The second important source is the col-
laboration with Professor Gradel which has had much impact on the research
presented here. I am grateful to Erich Gradel for this collaboration and also
for his advice and support.
The text itself is a revised version of my Habilitationsschrift, presented
at the RWTH Aachen in 1995. I am grateful to Professor Ebbinghaus for
numerous comments on the earlier version of this text.

Aachen, October 1996 Martin Otto


Table of Contents

Preface V

0. Introduction 1
0.1 Finite Models, Logic and Complexity 1
0.1.1 Logics for Complexity Classes 1
0.1.2 Semantically Defined Classes 4
0.1.3 Which Logics Are Natural? 7
0.2 Natural Levels of Expressiveness 7
0.2.1 Fixed-Point Logics and Their Counting Extensions . . . 8
0.2.2 The Framework of Infinitary Logic 9
0.2.3 The Role of Order and Canonization 11
0.3 Guide to the Exposition 12

1. Definitions and Preliminaries 15


1.1 Structures and Types 15
1.1.1 Structures 15
1.1.2 Queries and Global Relations 17
1.1.3 Logics 18
1.1.4 Types 19
1.2 Algorithms on Structures 21
1.2.1 Complexity Classes and Presentations 22
1.2.2 Logics for Complexity Classes 23
1.3 Some Particular Logics 24
1.3.1 First-Order Logic and Infinitary Logic 24
1.3.2 Fragments of Infinitary Logic 25
1.3.3 Fixed-Point Logics 30
1.3.4 Fixed-Point Logics and the L^ω 33
1.4 Types and Definability in the L^ω and C^ω 35
1.5 Interpretations 38
1.5.1 Variants of Interpretations 38
1.5.2 Examples 40
1.5.3 Interpretations and Definability 41
VIII Table of Contents

1.6 Lindstrδm Quantifiers and Extensions 43


1.6.1 Cardinality Lindstrδm Quantifiers 43
1.6.2 Aside on Uniform Families of Quantifiers 44
1.7 Miscellaneous 47
1.7.1 Canonization and Invariants 47
1.7.2 Orderings and Pre-Orderings 49
1.7.3 Lexicographic Orderings 49

2. The Games and Their Analysis 51


2.1 The Pebble Games for £,*,„ and C^ω 51
2.1.1 Examples and Applications 54
2.1.2 Proof of Theorem 2.2 60
2.1.3 Further Analysis of the C^-Game 62
2.1.4 The Analogous Treatment for Lk 66
2.2 Colour Refinement and the Stable Colouring 67
2.2.1 The Standard Case: Colourings of Finite Graphs 67
2.2.2 Definability of the Stable Colouring 68
2.2.3 C^ω and the Stable Colouring 71
2.2.4 A Variant Without Counting 72
2.3 Order in the Analysis of the Games 73
2.3.1 The Internal View 74
2.3.2 The External View 76
2.3.3 The Analogous Treatment for Lk 77

3. The Invariants 79
3.1 Complete Invariants for Lk and Ck 80
3.2 The ^-Invariants 81
3.3 The L*-Invariants 85
3.4 Some Applications 87
3.4.1 Fixed-Points and the Invariants 87
3.4.2 The Abiteboul-Vianu Theorem 90
3.4.3 Comparison of ICk and ILk 91
3.5 A Partial Reduction to Two Variables 93

4. Fixed-Point Logic with Counting 97


4.1 Definition of FP+C and PFP-f C 98
k
4.2 FP+C and the C -Invariants 106
4.3 The Separation from PTIME 109
4.4 Other Characterizations of FP-hC Ill

5. Related Lindstrom Extensions 115


5.1 A Structural Padding Technique 117
5.2 Cardinality Lindstrom Quantifiers 124
5.2.1 Proof of Theorem 5.1 125
5.3 Aside on Further Applications 128
Table of Contents IX

6. Canonization Problems 131


6.1 Canonization 131
6.2 PTIME Canonization and Fragments of PTIME 134
6.3 Canonization and Inversion of the Invariants 136
6.4 A Reduction to Three Variables 139
6.4.1 The Proof of Theorems 6.16 and 6.17 141
6.4.2 Remarks on Further Reduction 147

7. Canonization for Two Variables 149


7.1 Game Tableaux and the Inversion Problem 150
7.1.1 Modularity of Realizations 156
7.2 Realizations for IC2 160
7.2.1 Necessary Conditions 160
7.2.2 Realizations of the Off-Diagonal Boxes 162
7.2.3 Magic Squares 163
7.2.4 Realizations of the Diagonal Boxes 166
7.3 Realizations for ILτ 169
7.3.1 Necessary and Sufficient Conditions 169
7.3.2 On the Special Nature of Two Variables 174

Bibliography 177

Index 181
0. Introduction

0.1 Finite Models, Logic and Complexity


Finite model theory deals with the model theory of finite structures. As a
branch of model theory it is concerned with the analysis of structural prop-
erties in terms of logics. The attention to finite structures is not so much
a restriction in scope as a shift in perspective. The main parts of classical
model theory (the model theory related to first-order logic) as well as of ab-
stract model theory (the comparative model theory of other logics) almost
exclusively concern infinite structures; finite models are disregarded as triv-
ial in some respects and as intractable in others. In fact, the most successful
tools of classical model theory fail badly in restriction to finite structures.
The compactness theorem in particular, which is one of the corner stones of
classical model theory, does not hold in the realm of finite structures. Several
examples of other important theorems from classical model theory that are
no longer true in the finite case are discussed in [Gur84].
There are on the other hand specific new issues to be considered in the
finite. These issues mainly account for the growing interest in finite model
theory and promote its development into a theory in its own right. One of the
specific issues in a model theory of finite structures is complexity. Properties
and transformations of finite structures can be considered under algebraic and
combinatorial aspects, under the aspect of logical definability, and also under
the aspect of computational complexity. Issues of computational complexity
form one of the main links also between finite model theory and theoretical
computer science.
In this introduction I merely intend to indicate selectively some main
ideas and lines of research that motivate the present investigations. There
are a number of surveys that also cover various other aspects of finite model
theory — see for instance [Fag90, Gur84, Gur88, Imm87a, Imm89]. A general
reference is the new textbook on finite model theory by Ebbinghaus and Flum
[EF95].

0.1.1 Logics for Complexity Classes


The study of the relationship between logical definability and computational
complexity of structural properties is an essential branch of finite model the-
2 0. Introduction

ory. This topic is most pronounced in the search for semantic matches be-
tween levels of computational complexity and logics. The search for logics for
complexity classes, also suggestively described as capturing complexity classes
[Imm87b] or tailoring logic for complexity [Gur84], has lead to an active re-
search programme in finite model theory.
Consider any of the standard classes in computational complexity as a
class of problems for finite structures, say for finite graphs. The class of all
properties of finite graphs that can be recognized by PTIME algorithms is a
typical example. A logical characterization of PTIME on finite graphs would
have to provide a logic for PTIME on graphs in the sense that exactly all
PTIME properties of finite graphs are definable by sentences of this logic. For
the present purposes we work with a slightly informal notion of a logic for
PTIME. The exact definition underlying our treatment is due to Gurevich
[Gur88]. A more detailed discussion will be provided with Definition 1.7 in
the next chapter. It is also shown in [Gur88] that the restriction to graphs
rather than finite structures of arbitrary type is inessential for the present
issue.
Definition 0.1 (Sketch). A logic C is a logic for PTIME if exactly those
properties of finite graphs that are PTIME recognizable are C-definable.
Minimal requirements on candidate logics to be imposed are the following:
C has recursive syntax and recursive semantics that associates with each C-
sentence a PTIME algorithm for checking its truth in finite models.
It is a central open problem in the field whether there is a logic for PTIME.
Relationships between computational complexity classes and definability
in logical systems are interesting for a number of reasons:
(a) The potential for theoretical transfer between different fields. Techniques
from complexity theory may be brought to bear on logical and model
theoretic issues and vice versa. To give an example, several of the out-
standing open problems of complexity theory like the PTIME = NPTIME?
or PTIME = PSPACE? questions have found appealing non-trivial model
theoretic reformulations in terms of semantic equivalences of particular
logical systems over finite structures (compare [AV91, DLW95, Daw95b]).
Short of solving the original problems this offers new perspectives, and
investigations of related logical issues may at least lead to a better under-
standing of these problems.
(b) Logical analysis of the required kind may yield deeper insights into the
fundamental notion of complexity. Definability in logical systems can be
viewed as a kind of complexity in itself. Whereas computational com-
plexity controls the computational resources required in the solution of
a problem, definability considerations control the logical or descriptive
resources required in the specification of the problem: hence the term
descriptive complexity as used in [Imm89]. The relationship between the
0.1 Finite Models, Logic and Complexity 3

structure imposed by these completely different resources thus becomes


part of a broader view of complexity theory.
(c) Exact matches between computational complexity classes and logics pro-
vide an appealing notion of semantic completeness for model theoretic
considerations. A logic for PTIME say would be a logic that is complete
for the world of PTIME computability over structures — or for compu-
tationally feasible problems, if PTIME computability is identified with
efficient solvability or feasibility. The classical classes of computational
complexity have emerged as natural levels of computational power, certi-
fied by robustness criteria and the existence of natural complete problems.
Matching logics constitute naturally distinguished levels of expressiveness.
(d) In this connection there is also a strong theoretical interest from computer
science. Problems related to structures like graphs (and more generally ar-
bitrary relational structures corresponding to instantiations of relational
databases) are ubiquitous in computer science applications and in par-
ticular in the theory of databases. A natural logic for PTIME would be
a theoretically ideal database language for exactly all feasible queries:
anything that can be specified in this language is guaranteed to have an
efficient algorithmic solution; by semantic completeness for PTIME such
a logic constitutes a universal language for all efficient tasks. And indeed
this context is one of the original sources for the problem of capturing
PTIME, as formulated by Chandra and Harel in [CH82].
The following are some of the well known results concerning complete matches
between distinguished levels of computational complexity and logical systems.
Regular languages and monadic second-order logic: words over any finite al-
phabet can in a canonical way be identified with linearly ordered struc-
tures over an otherwise monadic vocabulary (one unary predicate for each
letter to mark its occurrences in the word). Monadic second-order logic
^lίn f°Γ tne resulting word models defines exactly the regular languages,
i.e. those languages that are recognized by finite automata. This is a clas-
sical result of Bύchi [BiicGO], Elgot [ElgGl] and Trakhtenbrot [TraGl] that
fits into the present framework as a precursor to the recent development
of finite model theory (compare the treatment in [EF95]).
NPTIME and existential second-order logic: Fagin's theorem [Fag74] is the
first result of this branch of finite model theory proper. It equates non-
deterministic polynomial time recognizability with definability in exis-
tential second-order logic Σ\.
PTIME and fixed-point logic with order: in restriction to linearly ordered fi-
nite structures PTIME has been characterized logically by Immerman
[Imm86] and Vardi [Var82] through the very natural extension of first-
order logic to fixed-point logic FP by means of an operator for monotone
relational induction.
So there are the following semantic equivalences:
4 0. Introduction

Finite Automata = £^n (for word models)


NPTIME = Σ\
PTIME = FP (in the presence of linear order)

It is remarkable that all major complexity classes, in particular LOGSPACE,


NLOGSPACE, PTIME and PSPACE, are captured by very natural extensions of
first-order logic in the presence of order. The fundamental question whether
similar matches can be found in the general case of not necessarily ordered
structures is open. In particular the question whether there is a logic for
PTIME, as raised by Chandra and Harel in [CH82], is a notorious open prob-
lem in finite model theory. In fact there is no capturing result at all for any
standard complexity class below NPTIME that applies to the general case.
Fagin's theorem NPTIME = Σ\ essentially remains the only general result
on a strict match between a complexity class and a logic on finite structures.
This phenomenon will be further discussed below.

0.1.2 Semantically Defined Classes

Consider the class of all PTIME recognizable graph properties — for the
moment denote it graph-PτiME. It serves as a typical example of a complexity
class on finite structures.
Why is it difficult to find a logic for graph-PτiME?
Recall that ordinary PTIME is the class of all problems that can be solved by
polynomially time bounded Turing machines. A priori Turing machines work
with words or strings as inputs. As far as recognition (i.e. decision) problems
are concerned a problem is a set of words over some alphabet. Words over
this alphabet are rejected or accepted, according to membership in the set,
in time polynomial in their length.
In particular a Turing machine does not work with abstract graphs as
inputs but rather with encodings of these. The standard encoding scheme for
finite graphs uses adjacency matrices for the input representation. The ad-
jacency matrix of a graph whose vertices are labelled v\,..., υn is the n x n
boolean matrix with entries α^ = 0 or 1 according to whether (v»,Vj) is
an edge. But obviously different adjacency matrices may encode the same,
more precisely isomorphic, graphs. Any rearrangement of the vertices in a
different order induces an equivalent representation that is different from the
given one unless the rearrangement happens to be an automorphism of the
abstract graph. Any graph algorithm, i.e. any algorithm that recognizes a
graph property, must therefore satisfy a non-trivial semantic invariance con-
dition: a graph algorithm must produce the same result on any two inputs
that represent isomorphic abstract graphs. In other words it may not reject
one graph and accept an isomorphic copy of that same graph. We adopt
the terminology of complexity theory as presented in [Pap94] to distinguish
0.1 Finite Models, Logic and Complexity 5

semantic presentations and syntactic presentations of complexity classes, or


semantic and syntactic classes according to their presentation. Semantic pre-
sentations are given in terms of semantic constraints on algorithms. Owing
to the invariance condition, graph-PτiME is clearly a semantically presented
class. A syntactic presentation of a complexity class in contrast consists of a
recursive or at least recursively enumerable set of algorithms of the required
complexity, that contains at least one realization for every problem in the
given class.1 We shall mostly speak of recursive presentations in this sense.
For an example of a class that is not a priori syntactically defined but
nevertheless admits a simple recursive presentation consider PTIME in the
ordinary sense as a class of problems for words over finite alphabets. It is
clearly presentable by the set of algorithms that limit their computation
time by means of a step counter that is initialized in each computation to a
polynomial in the input size. This presentation is suggestively referred to as
through polynomially clocked machines.
Semantic invariance conditions like the one for graph-PTlME are non-
recursive conditions on algorithms. In fact the set of all (syntactic descrip-
tions of) graph algorithms is not even recursively enumerable (as an index
set). It can furthermore be shown that the same applies to any of its intersec-
tions with standard complexity classes. In particular the ad-hoc presentation
of graph-PTlME through the set of all PTIME graph algorithms does not pro-
vide a recursively enumerable presentation.
A logic C for PTIME in the sense of Definition 0.1 above, however, would
induce the following recursive presentation for graph-PTlME. Let 5 be the
recursive semantic mapping that associates a PTIME algorithm with each sen-
tence of C (in the language of graphs). Obviously image(S) consists of PTIME
graph algorithms. By semantic completeness of C for PTIME on graphs, any
PTIME graph property is realized by some member of image(S). The recur-
sively enumerable subset image(S) C {.4 | A a PTIME graph algorithm }
therefore provides a recursive presentation for graph-PTlME. In the termi-
nology of complexity theory, image(5) is a syntactic presentation of the se-
mantically defined class graph-PTlME. In fact it can be shown that there is a
logic for PTIME (in the sufficiently general sense of Definition 0.1) if and only
if graph-PTlME admits a syntactic, i.e. recursive or recursively enumerable,
presentation.

For properties of linearly ordered structures — properties of linearly or-


dered graphs say — these problems do not arise because there are canonical
encodings for ordered structures. For ordered graphs we may use the adja-
cency matrix based on the natural labelling of the vertices as v\,..., υn in
increasing order. This observation is easily turned into a recursive presenta-
tion for the class of all PTIME properties of ordered finite graphs.
1
The difference between recursive and recursively enumerable syntax is not impor-
tant in this kind of question. If A\, Aa, ... is a recursive enumeration of syntactic
descriptions of algorithms, then the syntax (Λi, 1), (Λ.2,2), ... is recursive.
6 0. Introduction

This crucial difference between the ordered and the unordered case is
at the root of the apparent mismatch with respect to capturing complexity
classes in the case of ordered structures or in the general case of not nec-
essarily ordered structures. For the standard complexity classes it is almost
trivial to see that the induced classes over ordered structures are presentable
as syntactic classes and therefore can be captured by logics. The point of
the corresponding capturing results indeed rather is that moreover they are
captured by very natural logical systems.
As mentioned above, no complexity class below NPTIME has been cap-
tured or shown to be recursively presentable in the general case. NPTIME here
marks a threshold because in NPTIME and above, the invariance problem can
be side-stepped as follows. Consider the class graph-NPτiME of all NPTIME
graph properties. From a graph property Q € graph-NPτiME we may pass to
its ordered version <2<, tne class °f all ordered graphs that possess the given
property:

Q< — {(G, <) I G G <9, < a linear ordering of the vertices }.

Q< is NPTIME recognizable, essentially through the algorithm for Q itself.


But a plain graph G belongs to Q if and only if any expansion (G, <) by
a linear ordering of its vertices belongs to Q< (and also if and only if all
such expansions belong to Q<). It follows that graph-NPTlME is presentable
through the class of all NPTIME algorithms that first guess a linear ordering
and then evaluate an NPTIME property of ordered graphs on the result.
From this observation we obtain a recursive presentation for graph-NPTlME
in a standard manner. It is worth noting that this trick is also directly used
in Fagin's proof that graph-NPTlME coincides with the class of all graph
properties that are definable in existential second-order logic. The existential
quantification over linear orderings < that is implicit in the passage from Q<
to Q is explicitly available in existential second-order logic.

Note that capturing results for complexity classes in the general case of
not necessarily ordered structures are not merely of theoretical interest. The
challenge is well motivated by the potential applications in database theory.
Natural abstract databases often are not ordered. Their realizations at the
machine level may involve an implicit linear ordering for representational
purposes (naively: a numbering of memory cells). Even though an ordering is
present then, it is not considered part of the intended data. A sound database
query in this case corresponds to a property of unordered relational struc-
tures. In the query specification it is desirable to hide this ordering. Logically,
one would have to have a query language corresponding to a logic for PTIME
on unordered structures in order to achieve semantic completeness within
PTIME and simultaneously to guarantee soundness — soundness in the sense
of independence of a linear ordering that is an artifact of the realization.
0.2 Natural Levels of Expressiveness 7

0.1.3 Which Logics Are Natural?

Consider possible solutions, positive or negative, to the problem whether


there is logic for PTIME. The above definition with its very liberal conditions
on candidate logics is theoretically appealing because of its connection with
recursive presentability. A negative solution in the sense of this definition
would be a strong result to the effect that no reasonable logic at all can
possibly capture PTIME. A positive result, however, might still leave much to
be desired owing to the liberal notion of a logic. In other words, a recursive
presentation of graph-PτiME might intuitively be far from constituting a
natural logical system. As with the known positive results in the ordered
case or for NPTIME much may depend on the style of the logic obtained.
The logics to be considered are extensions of first-order logic, as first-order
sentences can be evaluated in LOGSPACE. The systematic study of extensions
of first-order logic belongs to the domain of abstract model theory. It is
worth to pursue this systematic study with particular focus on logics for finite
structures. A systematic study of this kind is a possible approach to problems
like that concerning the existence of a logic for PTIME. In particular if one
conjectures that the problem of a logic for PTIME has a negative solution,
then results that state the impossibility of capturing PTIME by logics that
satisfy certain stronger criteria can be interesting approximations.
To some extent the formal framework available in abstract model theory is
not necessarily well adapted to the finite case. Complexity considerations and
considerations that concern logics under a procedural aspect are not a priori
accommodated. The standard formalism in abstract model theory is that of
Lindstrόm extensions or of extensions by generalized quantifiers (Lindstrδm
quantifiers); compare the overview in [Ebb85]. Roughly, each such quantifier
incorporates one single new structural property and the resulting extension is
a minimal one to make this new property available under some natural closure
conditions. While this formalism is universally applicable for many purposes
— any extension of first-order logic that satisfies some corresponding closure
properties is equivalent with a Lindstrδm extension — it may be argued that
it is not always optimally adapted to the demands of finite model theory.
It seems that a framework for extensions of logics for finite structures that
is sufficiently fine grained to reflect algorithmic constraints is still lacking.
This issue is connected with the above-mentioned lack of criteria for the
'naturalness' of a logic for finite structures.

0.2 Natural Levels of Expressiveness


First-order logic is not well adapted to the programme of logics for com-
plexity classes. While any individual finite structure is characterized up to
isomorphism by a single sentence of first-order logic, natural properties that
are of very low complexity are not first-order definable. For instance neither
8 0. Introduction

connectedness nor regularity are first-order properties of finite graphs. In fact


these examples are typical of the two most apparent defects in the expres-
sive power of first-order logic: first-order logic does not provide expressive
means to capture any relational process that requires true recursion (like the
generation of the transitive closure of the edge predicate required for connect-
edness), and first-order logic has no means to express non-trivial cardinality
properties (like the equality of the numbers of direct neighbours required for
regularity). In short, first-order logic lacks recursion and counting.

0.2.1 Fixed-Point Logics and Their Counting Extensions

The first defect is taken care of in the extension to fixed-point logics. The
adjunction of fixed-point operators leads to logics that capture certain levels
of relational recursion. Least or inductive fixed-point logic FP in particular
is a very natural logic that has been studied extensively. Inductively defined
and increasing relational processes are captured by FP. The generation of the
transitive closure is a simple but typical example for the expressive power of
FP above that of first-order logic. An important point is that the increasing
nature of these relational processes guarantees termination in a stationary
value within polynomially many steps. A further extension in terms of rela-
tional recursion for arbitrary rather than increasing processes (that therefore
may or may not terminate in a stationary value) is partial fixed-point logic
PFP. The interest in FP is justified because by the theorem of Immerman
and Vardi it captures PTIME for ordered structures. Similarly, PFP captures
PSPACE in the presence of order [Var82, AV89]. In particular, in the presence
of order, FP and PFP automatically remedy the second shortcoming of first-
order logic: on ordered structures FP and PFP also capture all counting and
PTIME, respectively PSPACE, cardinality properties of definable predicates.
In the absence of order, however, this is not at all true. In the extreme case
of pure sets (graphs without edges) it is easy to see that relational recursion
and all of FP and PFP collapse to first-order. Simple cardinality properties
of the size of sets like evenness of the number of vertices are not definable
in FP or PFP. Moreover, all the simple examples of properties that are not
FP-definable but may be recognized in PTIME involve such cardinality prop-
erties.
One of the themes underlying our present investigations is the attempt
to treat these two most apparent shortcomings of first-order logic over finite
structures — recursion and counting — on an equal footing and to consider
the extensions FP and PFP to a framework that incorporates counting.
We thus obtain fixed-point logic with counting FP+C and partial fixed-
point logic with counting PFP+C. Roughly speaking we deal with two-sorted
variants of the given finite structures, augmented by a second ordered arith-
metical sort. A link between the sorts is induced by counting terms that
associate cardinality values with formulae that define sets. The usual fixed-
0.2 Natural Levels of Expressiveness 9

point operations can now be applied in this framework to combine relational


recursion with the processing of cardinalities.
The conception of fixed-point logic with counting is due to Immerman
[Imm87a]. It has not been studied in its own right or even rigorously formal-
ized in the work of Immerman though. The fact that counting is the most
obvious defect in FP as compared with PTIME had led Immerman to conjec-
ture that an appropriate extension of FP to FP+C should even be a logic for
PTIME in the general case. This conjecture was disproved in a strong sense
by Cai, Fίirer and Immerman [CFI89]. The sophisticated nature of their ex-
ample for the separation of FP+C from PTIME indicates on the other hand
that FP+C may still be regarded as an interesting level of expressiveness
within PTIME that captures many PTIME properties that naturally arise for
instance in graph theory. This view has since been corroborated by model
theoretic as well as complexity oriented results in [G093, Ott96a] and we
shall see much of this in the sequel. The claim for the naturalness of FP+C
mainly rests on the following:
• The expressive power of FP+C can be understood very well in terms of
certain FP+C-definable structural invariants. An analogous phenomenon
was first discovered and exploited in the analysis of FP itself in the work
of Abiteboul and Vianu [AV91] and lead to their beautiful result that FP
collapses to PFP if and only if PSPACE = PTIME. This approach could
successfully be extended to FP+C and PFP+C. In some respects the link
between the expressive power of FP+C and PFP+C and the associated
invariants is even neater than for FP and PFP themselves. The result-
ing characterization of the expressive power of FP+C and its relation to
PFP+C show that even though FP+C falls short of PTIME it extends to
the general case some of the computational and model theoretic features
that apply to FP only in the ordered case.
• FP+C and PFP+C are very robust with respect to the actual formalization
of the counting extension. There are a number of equivalent characteriza-
tions of the expressive power of FP+C, both in terms of logical systems
that turn out to be equivalent with FP+C and in computational terms.
Intuitively these show that FP+C constitutes a natural level of expressiveness
that at the same time corresponds to some natural level in complexity — even
though it is clear that this level is strictly contained in standard PTIME.

0.2.2 The Framework of Infinitary Logic

A different but related approach to the investigation of logics with respect to


complexity classes focuses on the a priori logical framework given by certain
fragments of infinitary logic. Consider firstly full-fledged infinitary logic LOOUM
the logic generated by the usual first-order rules for the formation of formulae
together with infinite disjunctions and conjunctions over arbitrary sets of
10 0. Introduction

formulae. Now any finite graph is characterized up to isomorphism by a first-


order sentence. It follows that every property of finite graphs is definable in
Looω by a countable disjunction over first-order sentences that characterize
all positive instances of this property. Looω is a universal logic for finite
structures — and overshoots all sensible levels of expressiveness.
Particular fragments of infinitary logic, however, have emerged as very
useful tools in finite model theory. These are defined in terms of restrictions
on the number of variables that may occur (bound or free). These restrictions
are well adapted to the study of relational recursion since the processes con-
sidered in relational recursion always involve a fixed bound on the maximal
arity of auxiliary relations. Thus pure relational recursion is fully contained
tne
within Z/rou;> fragment of LO^ that consists of all formulae that use a fi-
nite number of variable symbols each. In particular fixed-point logic FP and
partial fixed-point logic PFP are properly embedded in L(^oω. It is important
to note that the completely non-uniform constructors of infinite disjunctions
and conjunctions allow to define non-recursive properties of finite structures
as well. L^, too, is completely at odds with complexity on finite structures.
Here this may be seen as an advantage. If we consider problems related to log-
(
ics for complexity in restriction to the framework of L ^oω then it is important
that this restriction in itself does not trivialize the issues. Since L^oω allows to
define properties of arbitrary complexity, the class of all those PTIME proper-
ties of finite graphs, that at the same time are L^-definable, is a non-trivial
subclass of graph-PτiME for our purposes. Furthermore the restriction to
bounded arity auxiliary relations can be considered as a natural restriction
also in terms of the computational complexity of relational problems.
The points made about the inclusion of counting in connection with FP
versus FP-f C also apply to the framework of L^ω. Evenness of the number
of vertices or regularity of graphs are not L^-definable. The reason is that
although each individual expression of the from 3~mxφ(x) asserting the ex-
istence of exactly m elements that satisfy φ is in first-order logic, the number
of variables required in its formalization grows unboundedly with m. L^ω
compensates completely all defects of first-order that concern relational re-
cursion but fails for the defects related to counting. It is natural therefore
to study also the fragment C^ω of infinitary logic with only finitely many
=m
variables in each formula but allowing all counting quantifiers 3 instead of
the usual existential quantifier. These were also first considered in the work
of Immerman on the counting extension of FP. FP+C and PFP+C are com-
prised in C^ω just as FP and PFP are in L^ω. We denote the constituent
sublogics with a fixed finite bound k on the number of variables C^ω and
L*^ so that C^ω = U fc C^ω and L^ω = \Jk L^ω. The infinitary logics C^ω
and L^ω and their constituents C^ω and L^ω will be used extensively as
frameworks in our exposition. On the one hand they are used in the analysis
of mostly still open restricted problems on capturing complexity classes. On
0.2 Natural Levels of Expressiveness 11

the other hand they provide the setting for the comparative analysis of the
expressive powers of FP+C, PFP+C, FP and PFP.
The main asset of the fragments C^ω and L^ω is in fact a methodologi-
cal point. Both possess very elegant and tractable Ehrenfeucht-Fraϊsse style
characterizations in terms of games. Such games that capture the expres-
sive power of a logic are an important tool in classical model theory. The
classical Ehrenfeucht-Fraϊsse theorem relates first-order equivalence of two
structures to the existence of a strategy in a game played on these struc-
tures. See [EFT94] for a textbook treatment of this technique in the classical
context. Variants of these games have been found and employed for various
other logics besides first-order. It is remarkable that such games are among
the few tools from classical or abstract model theory that are fully available
in restriction to finite structures without any alteration. See also [EF95].
A large part of the present work is devoted to the detailed analysis of
corresponding games for the C^ω and L^. The games are due to Barwise
[Bar77] and Immerman [Imm82], and Immerman and Lander [IL90] respec-
tively. The analysis of the games leads to the abstraction of concise PTIME
computable structural invariants that characterize finite relational structures
exactly up to equivalence in C^ω or L^ω. As mentioned above such invari-
ants were first considered by Abiteboul and Vianu in [AV91] in the context
of a computational model for relational recursion and applied to the study
of fixed-point logics. A formalization in terms of the underlying fragments
of infinitary logic has been presented in Dawar's dissertation [Daw93] and in
[DLW95] for the L^ω and in [G093, Ott96a] for the C^ω. The relationship
between the expressive power of FP-hC and the invariants for the C^ω is
also one of the main topics here. We shall investigate this relationship in
comparison with FP and the invariants for the L^ω as well.

0.2.3 The Role of Order and Canonization


Consider once more the problem of logics for complexity classes. As out-
lined above the essential difficulty in capturing classes below NPTIME in the
absence of order can be attributed to the ambiguity in the input representa-
tion — it is this ambiguity that imposes the problematic semantic invariance
condition on graph algorithms.
Canonization addresses the problem of providing well defined and unique
representatives up to a given equivalence relation. Consider canonization up
to isomorphism for finite graphs. Suppose there were a PTIME functor defined
on the class of all finite graphs that maps each graph of size n to an isomor-
phic representative over the standard universe {0,... , n — 1} in such a way
that any two isomorphic graphs get mapped to the same representative. Such
a mapping would constitute what is called PTIME canonization up to isomor-
phism or PTIME normalization. It is not known whether finite graphs admit
PTIME normalization. It is clear that PTIME normalization would induce a
PTIME algorithm for graph isomorphism; whether the graph isomorphism
12 0. Introduction

problem itself is in PTIME is not known. Note that the entire problem of
capturing PTIME is solved trivially if there should be PTIME normalization.
Any algorithm applied to the standard encoding of a normalized version of
the input graph becomes a graph algorithm. PTIME normalization would in
fact reduce the capturing of PTIME in the general case to the ordered case.
Some of our investigations concern the variant of this approach in re-
striction to the framework of the C^ω and L^, respectively. We consider
canonization up to equivalence in these logics rather than up to isomorphism.
There is a direct connection between PTIME canonization for these rougher
equivalences and the capturing of the PTIME fragments of these fragments
of infinitary logic. Linking these considerations with the above-mentioned
PTIME invariants for the C^ω we find appealing sufficient criteria that FP+C
indeed captures PTIME in restriction to all of C^ω. It remains a challeng-
ing open problem whether these conditions are fulfilled. A reduction proce-
dure furthermore shows that the general cases hinge on the three-dimensional
cases, i.e. on canonization up to equivalence in the three-variable fragments
Of LOGO;.
A main result that will be treated in full detail in the last chapter concerns
the two-variable case [Ott95a, Ott95b]. For L\oω and C^ω we exhibit a strong
form of PTIME canonization and thus obtain non-trivial capturing results.
The classes of all PTIME properties that are C^ω- or L^3θω-definable are
indeed recursively presentable and may be captured naturally in terms of the
complete invariants for C^ω or L^, respectively.

0.3 Guide to the Exposition


We summarize the investigations and results that are presented here in order
to provide an outlook that may also help to make the overall organization of
the material transparent.

• Chapter 1 reviews and introduces basic terminology, summarizes some


facts and simple results related in particular to the fragments of infinitary
logic and fixed-point logics. Typical examples illustrate the expressive power
of these basic logics.

• Chapter 2 provides an introduction to the games for the bounded variable


fragments of infinitary logic with and without counting quantifiers. Proofs of
the corresponding Ehrenfeucht-Praϊsse type theorems are given. The analysis
of these games is carried further to support the definition of the associated
invariants in Chapter 3.

• Chapter 3 is devoted to the definition and discussion of the invariants


associated with the games. We review those known applications to fixed-
point logic without counting that will later be paralleled by, and contrasted
with, the corresponding picture for fixed-point logic with counting.
0.3 Guide to the Exposition 13

• Chapter 4 is about fixed-point logic with counting. The formal definitions


of FP+C and PFP-hC are provided here. Some material is collected to cor-
roborate the view of fixed-point logic with counting as a distinguished level
of expressiveness within PTIME. The central results rest on applications of
the invariants for the C^>oω.

• Chapter 5 considers the formalism of Lindstrόm quantifiers and exten-


sions of fixed-point logic in restriction to cardinality properties. A structural
padding technique is developed which among other applications shows that
the extension of fixed-point logic by all cardinality Lindstrδm quantifiers is
still to weak to comprise the full power of proper fixed-point logic with count-
ing.

• Chapter 6 provides the general treatment of the connection between can-


onization up to equivalence in the bounded variable fragments of infinitary
logic and recursive presentations of the related fragments of PTIME.

• Chapter 7 finally is concerned with the positive results related to the two-
variable fragments. In particular there are detailed proofs that the PTIME
fragments of both Li^Qω and C^ω can be captured.

Throughout the entire text I have attempted to give an almost self-


contained exposition. In the first four chapters in particular numerous ex-
amples are given and comments and background material provided towards
a thorough introduction to the leading concepts, along with the technical
development. The last three chapters on the other hand are more specifically
devoted to individual results and correspondingly are of a more technical
nature.
The main ideas that concern fixed-point logics and the bounded vari-
able fragments of infinitary logic without counting quantifiers are reviewed
and developed along with the corresponding notions for the counting case.
This seems justified because the results in the case without counting may be
obtained through obvious specializations. And also because the comparison
between the two scenarios is a major source of motivation for our investiga-
tions.
This two-tiered treatment is also intended to make the main results con-
cerning either individual case individually accessible as far as possible. This
is true in particular of those new results that concern the case without count-
ing, mainly in Chapters 6 and 7. The only chapters that are devoted to the
counting case proper are Chapters 4 and, to some extent, 5.
The main dependencies between chapters are the following. Chapters 2-4
each build on their predecessors. Chapters 5, 6 and 7 are to a large extent
14 0. Introduction

independent of each other. For those developments of Chapters 6 and 7, that


concern the case without counting, all major prerequisites can be found in
Chapters 1-3.
I should also point out that several sections of Chapter 1 (in particular
1.5-1.7) become essential only for the understanding of specific further devel-
opments. Chapter 1 may therefore be read selectively and called upon again
where necessary.
In the beginning of each chapter there is a brief summary of its contents.
Where appropriate I have appended to the individual chapters short sections
that discuss summarily the main sources of ideas and results reported or
clarify connections with other work.
1. Definitions and Preliminaries

A major part of this chapter serves to review and fix notation and terminol-
ogy. The material is standard. Readers familiar with the notions addressed
might therefore only want to refer back to particular definitions at later
points. The main issues of the individual sections are the following:

• Section 1.1 sums up the basics about structures, global relations, logics
and types that are relevant for our purposes.

• In Section 1.2 we consider algorithms that deal with structures as inputs


and fix some corresponding conventions. Recognizability of classes of finite
structures and computability of global relations are discussed.

• The bounded variable fragments of infinitary logic, and the fixed-point


logics, are presented in Section 1.3. We also provide some typical examples
for the expressive power of these logics.

• Section 1.4 contains some preliminary material about types and definability
in the relevant fragments of infinitary logic.

• Section 1.5 deals with interpretations, a concept that plays an important


role in many definability considerations.

• In Section 1.6 we review the notions of generalized quantifiers and Lind-


strόm extensions. In particular we define the class of cardinality Lindstrδm
quantifiers.

• Section 1.7 fixes some terminology with respect to the notion of can-
onization and of complete invariants for arbitrary equivalence relations. We
also sketch some technicalities and conventions concerning orderings and pre-
orderings.

1.1 Structures and Types


1.1.1 Structures
We deal with finite structures exclusively, finfr] is the class of all finite r-
structures. Unless explicitly stated otherwise, τ stands for some finite and
16 1. Definitions and Preliminaries

purely relational vocabulary. A structure in fin[r] consists of its universe


together with interpretations for the symbols in r. If r = {R\,..., R8}, where
Ri is a relation or predicate symbol of arity r;, we write 21 = (A, Rf,..., Rf)
Ti a
for a r-structure. Thus Rf C A . The superscripts are mostly omitted
when there is no danger of confusion.
It is sometimes convenient in special circumstances to admit some varia-
tions and extensions of the basic concept of structures:
(i) In order to deal with fixed tuples of parameters over some structure 21,
one may think of those parameters as interpretations for a correspond-
ing tuple of extra constant symbols. We here prefer to stick with an
entirely relational vocabulary and treat parameters as interpretations
for variable symbols. The distinction between parameters and variables
becomes purely intentional. The class of all r-structures with fixed tu-
ples of r parameters is denoted

finfr r] = {(21,α) | 21 <Ξ fin[r],α G A 7 *}.

(ii) In our formalization of fixed-point logic with counting in Chapter 4 we


deal with two-sorted structures. These are structures over two disjoint
universes, one for each sort. Each relation symbol comes with a specifi-
cation telling which components range over the first sort and which over
the second. Similarly terms and in particular variables have a designated
status with respect to the sorts. There is a standard way to represent
two-sorted structures by ordinary one-sorted structures that have two
additional unary predicates to distinguish the sorts. A structure of the
form
(A,[/ι,[7 2 ,...) with A = Uι\JU2
can thus naturally encode a two-sorted structure with universes Ui for
the two sorts. A binary relation R for instance whose i-ih component
ranges over the i-ih sort for i = 1,2 then gets interpreted as a binary
relation over A that satisfies VxVy(Rxy ->• U\x Λ t/22/)
(iii) At some places we consider weighted structures. These are structures
together with some functions from their domains to some external stan-
dard domain, mostly and without loss of generality to the set ω of the
natural numbers. A standard example is that of graphs (V, E) with
weights put on the edges, formalized by a weight function v: V2 -> ω.
Linearly ordered structures play a special role. Assume that r contains a
designated binary relation symbol < for a linear ordering. Then the class of
all finite r-structures which are linearly ordered is denoted

ord[τ] = 121 G fin[τ] <a a linear ordering of A\.

When talking of classes of finite structures it is generally understood that


these are closed with respect to isomorphism. The only exception in our treat-
ment being that in places we restrict attention to structures over standard
1.1 Structures and Types 17

domains, meaning structures with an initial segment of the natural numbers


for their universe. We denote by stan[τ] the set of all finite r-structures over
l
standard domains n = {0,..., n — 1} :

stan[r] = J21 G fin[r] A = n,n ^ 1 j.

There is a direct correspondence between linearly ordered structures and


α
structures over standard domains. Each structure (21, < ) in ord[r ύ {<}] has
a unique representative (21) in stan[τ] determined by the requirement that
21 ~ (21) and that the linear ordering <a translates into the natural ordering
on the standard domain of (21) under the isomorphism. Obviously the map-
ping (21, <21) i-» (21) induces a bijective correspondence between isomorphism
classes of linearly ordered structures and structures over standard domains

): ord[rύ{<}] /~ —> stan[r].

1.1.2 Queries and Global Relations

A class Q of finite r-structures may be identified with a boolean valued


functor x on finfr] that maps structures to 1 or 0 according to membership
in Q: Q = {21 G fin[τ] | χ(2l) = l}. The term boolean queries for classes of
structures stresses this functorial view. Since classes of structures are tacitly
assumed to be closed under isomorphisms, their characteristic functions x
are invariant under isomorphisms.
Consider similarly an isomorphism-invariant boolean valued function on
fin[τ;r], χ:finfr;r] -> {0,1}. Such a functor constitutes an r-ary query on
fin[τ]. An alternative view is that of a mapping from 21 G fin[τ] to a new
r-ary predicate R® over 21:

The mapping β:2l •->> R* is a global relation of arity r. At the level


of this mapping isomorphism invariance of x turns into equivariance under
isomorphisms: if ττ:2t -)• 03 is an isomorphism, then π(R*) = R®. This
is in fact the standard defining condition on global relations or queries as
introduced in [CH80]. We note in particular that the value of a global relation
over 21 must be invariant under all automorphisms of 21.
Definition 1.1. A global relation or query R of arity r over fin[τ] is a map-
ping sending each structure 21 G fin[τ] to an r-ary predicate R* C Ar in
~-compatible fashion. Whenever π : 21 -> 05 is an isomorphism, then π also
preserves R: π(R^) = R®. The characteristic functor XR of R is the boolean
valued mapping on fin[τ;r] that sends (21,ά) to 1 if a G R*. Compatibility
1
We apply the usual convention to identify the natural number n G ω with the
set of its predecessors {0,..., n — 1} C ω.
18 1. Definitions and Preliminaries

of R with isomorphisms is equivalent with invariance of XR under isomor-


phisms.
It is often convenient to regard boolean queries (boolean global relations)
as special, namely 0-ary cases of r-ary queries. To accommodate this view
formally, we may naturally identify 0-ary predicates with boolean values and
fin[τ;0] withfin[r].
Some remark on our usage of the term functor is in order. We generally
apply it to a mapping / whose domain is a class of structures (or of struc-
tures with parameters and the like) if / is required to be invariant under
isomorphisms: 21 ~ <B => /(2l) = /(2ϊ). If also the range of / consists of
structures, for instance /: fin[τ] -» fin[σ] then the appropriate form of invari-
ance is 21 ~ <B => /(21) ~/(<B).

1.1.3 Logics

Let £ be a logic. We do not require any formal general notion of a logic,


the apparent generality here only serves to collect some notions, that we
later apply to a few individual concrete logics, into common statements. £[τ]
denotes the class of all formulae of C in vocabulary r. A formula φ G £[τ]
without free variables (one that semantically evaluates to a boolean value
over each r-structure) is a sentence. Sentences define classes of structures,
concentrating on finite structures we put

fmod(<p): = J2t G fin[r] 21 |= φ\.

We mostly use letters φ,^,χ,... to denote formulae. Let φ G C[τ\. Vari-


ables displayed in brackets like the Xi in φ(x\,..., xr) indicate that semanti-
cally we consider φ as defining a global relation of arity r on finjr]. Over 21,
φ(xι,...,xr) evaluates to the predicate

21 |= <p[α] says that φ is satisfied over 21 when the free variables are interpreted
as indicated. In this usage the notation φ(x\,..., Xk) does not imply that the
displayed Xi must all be syntactically free in φ, but that the free variables of φ
are among those displayed. We speak of a formula in free variables x\,..., Xk
with this meaning: free(<^) C {χ 1 ? ... ,#*.}. For instance, we allow to regard
the formula x\ —x^ also as a formula in free variables #ι,£2 ? #3 5 and write
φ(xι, x 2 , #3) = xι=X2 if this view is intended.
Similar conventions apply to second-order variables (predicate variables)
where such occur. In particular notation like φ(X,x) G £[τ] indicates that
given a r-structure plus additional interpretations for the second-order vari-
ables X by extra predicates and for the x by elements, φ evaluates to a
boolean value. 21 |= φ[P,a] expresses that φ is satisfied in 21 with the indi-
cated interpretations for X and x.
1.1 Structures and Types 19

It is sometimes convenient to consider interpretations for some free first-


or second-order variables as momentarily fixed. The notation (21, Γ) for some
partial interpretation of free variables through Γ indicates this meaning.
Definition 1.2. (i) The sentence φ G C[τ\ defines the boolean query Q C
fin[τ] if Q = fmod(φ).
(ii) The formula φ(x\,. . . ,x r ) G C[τ\ defines the global relation R on fin[τ]

The expressive power of a logic is determined in terms of those global


relations that are definable in this logic.
Definition 1.3. Two logics are semantically equivalent if they define exactly
the same global relations on finite structures.

A Ξ £2

denotes this semantic equivalence over finite relational structures. The possi-
ble weakening of this requirement, that the two logics define the same classes
of finite structures is explicitly indicated. We write "£ι = £2 for sentences "
or "£ι = £2 for boolean queries ".
Observe that L\ Ξ £2 says that for every formula of C\ there is a formula
of £2 that is equivalent over finite structures, and vice versa. The weaker
notion of equivalence expresses the same requirement in restriction to sen-
tences. The distinction between the two notions of equivalence is of a purely
formal nature for our considerations. Most natural logics admit a faithful re-
duction from definable global relations to definable boolean global relations
so that their expressive power is fully determined by their strength in defining
classes, i.e. by their sentences.
The notation = for semantic equivalence extends with analogous meaning
to classes of queries that are not specified by logics. For instance if C is a class
of queries and C a logic then C = C says that every query in C is definable
by a formula of C and that conversely all £-definable queries are in C.

1.1.4 Types

We are interested in ^-definable properties of element tuples. The £-type


of a tuple α = (αi, . . . , α^) of elements of a r-structure 21 is the class of all
/^-formulae in free variables x = (xi, . . . , Xk) that are satisfied by α in 2t:

£(a) = (φ(x) € C(
£
Tp (τ; k) is the class of all £[r]-types in variables xι,...,xk:

Tp£(τ;A;) = {tp£(ά) a efinM.se A*}.


Tp£(2l; k) denotes the set of all £[r]-types of fe-tuples over a particular 21:
20 1. Definitions and Preliminaries

We use Greek letters α, /?,... to denote types. If α G Tp£(τ; k) then a \= φ


means that 21 1= φ[a] whenever 21 and ά are such that tp^(α) = α. In case φ
is also in C[r] this is just to say that φ € α.
£ £
We often think of Tp (τ;r) for 1 ^ r < k as embedded into Tp (τ;fc)
via
tp£(αι,...,α r ) i—> tp% (αi, . . . ,0^,01,..., α r ) .
fc-r

Some of the logics that play a central role in the following possess only
a bounded supply of variable symbols. If C only has variables xi, . . . , Xk we
agree to apply the notion of £-type only to tuples α of length at most k. We
adopt the convention that Tp £ (τ;r) = 0 for r > k in this context.
The most basic types considered are the atomic or quantifier free types.
They are obtained in the above formalism if C is chosen to be the quantifier
free fragment of first-order logic. We write atp^ (α) for the collection of all
quantifier free formulae that hold true of a in 21. Note that each such type
can be fully represented by the set of atomic formulae contained in it. In
this sense, and for finite relational vocabularies, each atomic type is finite
and we may identify an atomic type θ with a single quantifier free formula:
the conjunction over all atomic formulae contained in θ together with the
negations of all those not contained in θ. Some such syntactic normal form is
tacitly assumed when we deal with sets of atomic types. The set of all atomic
r-types in variables a?ι, . . . , Xk is denoted by Atp(τ; k):

Atp(r; k) = { atp* (ά) 21 e fin[r] , α G Ak } .


Clearly only structures of size up to k need be considered since r is purely
relational. For finite r therefore, Atp(τ; fc) is obviously finite. In fact a finite
representation of Atp(τ; fc) in terms of the above syntactic normalization is
immediately obtained.
Atomic types in vocabulary 0 — in the language of pure sets, where only
equality is available — are here called equality types. We write eq(α) for the
equality type of a and Eq(fc) for the finite set of all equality types in variables
#1,. . . ,£fc.
For indistinguishability of structures or structures with parameters in a
logic we use the following notation.
Definition 1.4. For the logic C we denote by =c the equivalence relation of
indistinguishability in C or /^-equivalence both of structures and of structures
with parameters:
(i) 21 =£ 21' if 21 and 21' satisfy exactly the same C-sentences.
(ii) (21, α) =c (2t',α') if a and a1 satisfy exactly the same C-formulae over
21 and 21' respectively — equivalently 2/tp£(ά) = tp§,(ά').
Note that Tp£(r; *) may be identified with fin[r; k] /=c.
1.2 Algorithms on Structures 21

1.2 Algorithms on Structures


The informal notion of algorithms on structures simply refers to algorithms
that are intended to take finite structures as inputs. Algorithms do not deal
with abstract structures directly but with presentations or encodings of these.
In standard models of computation — we mainly think of Turing machines
— algorithms directly deal with words, ordered strings of symbols over some
fixed finite alphabet. Straightforward encoding schemes that faithfully map
structures to words are available for structures over standard domains. The
implicit ordering of the standard domain allows to enumerate all instantia-
tions of atoms lexicographically. The entire structural information can thus
be coded in a binary string that lists the boolean values of all instantiations
of atoms in this ordering. Having fixed any such convention for the direct en-
coding of standard structures we can identify standard structures with their
encodings as bit-strings. Without loss of generality we may thus pretend that
algorithms for computations on finite r-structures directly accept elements
of stanfr] as inputs. We think of such an algorithm A as realizing a mapping

A: stan[τ] —>• range(*4)


a h-*
The same considerations apply to algorithms that take structures with pa-
rameters as inputs; we replace stan[τ] by stan[r; r] for some r. With respect
to the range of A we may distinguish two cases (the distinction is purely
intentional): either we regard range(*4) simply as a set of words, or we simi-
larly identify output words with standard objects they encode. In particular
we adopt the latter view if we want A to realize a mapping from structures
to structures. As for the input domain, we identify the output domain with
some stan[σ] and pretend for instance that A directly realizes a mapping
A: stanfr] -»• stan[σ]. Similar conventions can be employed to algorithms
which are to output natural numbers: A: stan[τ] -» ω.
Algorithms are unproblematic as far as they realize mappings between
certain domains of standard objects (objects with standard encodings). The
picture becomes fundamentally different if we want to realize functors on
structures. Consider the algorithmic evaluation of a boolean query on fin[τ].
No matter whether we restrict the domain to stan[τ] or not, there remains
the crucial invariance condition that A(%) = *4(2t') whenever 21 and 2t; are
isomorphic. Note that this condition really arises from two sources:
(a) when encoding even a single abstract 21 G fin[τ] through an element of
stan[τ] then a priori the representative in stan[τ] is determined only up
to isomorphism.
(b) since a boolean query by definition corresponds to an isomorphism in-
variant functor, its restriction to stan[τ] still has to be invariant under
isomorphisms.
22 1. Definitions and Preliminaries

Of course there are situations in which uniquely determined representa-


tives of abstract input structures by elements of stan[τ] are available. Most
notably this applies to linearly ordered structures. As pointed out above we
may identify ord[τU{<}]/~ with stan[τ]. Linearly ordered structures, even
when viewed only up to isomorphism, therefore are themselves objects with
standard encodings — whence their notorious special status in considerations
concerning logics for complexity classes arises. Somewhat more generally such
exceptions occur wherever there is some adequate normalization or canon-
ization procedure available. Variations on this issue will concern us in later
chapters.
Entirely similar considerations apply of course to the evaluation of r-ary
queries that we choose to realize in the boolean format

,A:stan[τ;r] —> {0,1}

which is also subject to invariance under isomorphism. Finally, functors from


structures to structures are realized as

A: stan[r] —ϊ stan[σ]

with invariance condition 21 ~ <B => .4(21) ~ A(*B).

Definition 1.5. (%) The algorithm ,4:stan[τ] ->• {0,1} computes the bool-
ean query Q C fin[r] if for all 21 G stan[τ]: .4(21) = 1 if and only if
21 G Q. We also say that A recognizes the class Q.
(ii) An algorithm A: stan[τ; r] -» {0,1} computes the r-ary query R on fin[τ]
if for all (21, a) G stan[r;r]: .4(21, α) = 1 if and only if a G Rm.
(Hi) An algorithm .4:stan[τ] -> stan[σ] computes the functor F:fin[τ] ->
fin[σ] if for all 21 G stan[r]: .4(21) ^ F(%).
(iv) An algorithm A: stan[τ] -> S computes the functor F: fin[τ] -> S whose
range is some domain of standard objects S if for all 21 G stan[τ]:
,4(a)=F(a).

1.2.1 Complexity Classes and Presentations

We are interested in the complexity of problems that concern structures.


Consider a structural problem, any of the several kinds of computational
problems considered in Definition 1.5. The complexity of such problems is
the complexity in the standard sense of its algorithmic realizations A. When
dealing with relational input structures we identify the input size with the
size of the universe of the input structure. Although this parameter may differ
from the length of an actual encoding of the input structure, the difference
does riot matter for our purposes because the complexity classes considered —
mainly PTIME and PSPACE, but the same applies to all standard classes from
LOGSPACE upward — are robust under polynomially bounded re-scalings of
the input size.
1.2 Algorithms on Structures 23

A boolean query Q C fin[τ], for instance, is in PTIME if there is a PTIME


algorithm A that computes Q in the sense of Definition 1.5 (i). More precisely,
if there is an algorithm Λ for Q that terminates its computation on all 21 =
(n,...) £ stanfr] in time polynomial in n. It is customary to denote the class
of all PTIME queries again by PTIME, and similar conventions apply to all
the usual complexity classes. It should always be clear from context whether
we think of for instance PTIME either as the class of all polynomial time
computable functions (on the natural numbers, or on some other domain
of objects with standard encodings), or as the class of all polynomial time
computable queries on finite relational structures. In order to emphasize the
latter interpretation we shall sometimes speak of PTIME or other complexity
classes 05 complexity classes of queries, a notion introduced by Chandra and
Harel [CH80].
As pointed out in the introduction the issue of logics for complexity classes
is closely related with the abstract notion of recursive presentations for com-
plexity classes of queries.
Definition 1.6. Let C be a complexity class of queries. C is recursively pre-
sented by a recursive or recursively enumerable set M of algorithms if each
A G M is an algorithmic realization of a query in complexity C, and if M is
semantically complete for C: C = {Q \ Q realized by some A £ M }.
We write C = M to stress the underlying semantic equivalence and speak
of M as a recursive presentation for C. For short we also just call C recur-
sively enumerable if it admits a recursive presentation.
This notion of a recursive presentation similarly applies to any class of
problems C that is specified by algorithmic criteria. Recall from the intro-
duction that ordinary PTIME, as the class of all polynomial time computable
problems on natural numbers say, is recursively presentable through the
subclass of polynomially clocked PTIME machines (compare Section 0.1.2).
PTIME as a class of queries is a paradigmatic semantic class. As a subclass of
ordinary PTIME, PTIME as a class of queries is characterized by the semantic
condition of invariance under isomorphism. The problem whether there are
logics for complexity classes of queries essentially is the problem of finding
recursive presentations of these semantically defined classes.

1.2.2 Logics for Complexity Classes


This following notion was first presented in precise terms in [Gur88].
Definition 1.7. Let C be a complexity class of queries. Assume that C is
a logic with recursive syntax and semantics: for finite r the set C[τ] of T-
2
formulae of C is recursive and there is a recursive mapping from C[τ\ to
2
Note that just as for recursive presentations it does not really matter whether
we require recursive or recursively enumerable syntax. If (ψi)i^ι is a recursive
enumeration then the recursive set {(ψi, i)\i ^ 1} can replace the original syntax
if necessary.
24 1. Definitions and Preliminaries

algorithms, φ »->• Λφ such that Λφ evaluates (the query defined by) φ over
fin[r].
£ is a logic for C or captures C if C coincides with the class of queries that
are definable in £, C = C, and if the recursive semantics φ *-+ Aφ maps £[τ]
to algorithmic realizations within C.
Of course the same notion applies to other classes of queries that are
defined in terms of algorithmic criteria, in particular to subclasses of com-
plexity classes of queries. The well known theorem of Immerman and Vardi
for instance says that the class of all PTIME queries on ordered structures is
captured in exactly this sense by fixed-point logic.
Sometimes it is useful to strengthen these requirements so that other
important data also become recursive in terms of the formulae of £, compare
[Gur88, EF95]. For instance one may require that data describing complexity
bounds on Aφ be recursive in φ. While such strengthenings are crucial for
certain arguments we can here stick to the basic notion.
It is worth to note the essential equivalence between the notions of cap-
turing by some logic and that of a recursive presentation. It is clear that a
logical representation as in the last definition provides a recursive presenta-
tion through {Aφ \ φ £ £} . Conversely, any recursive set of algorithms may
be regarded as a logic with recursive syntax in the abstract sense; for the
semantics choose the obvious one embodied in the algorithms. In this way
any recursive presentation of C can essentially be regarded as a logic that
captures C. There are some fine points to be considered if as usual we want
abstract logics to satisfy some appropriate regularity criteria as outlined in
[Ebb85]. For this it is obviously necessary that C itself as a set of queries
satisfies corresponding closure criteria. At least for classes C that are natural
in such respects it follows that indeed the two notions are equivalent. We
do not here enlarge on this issue, in fact an informal concept of 'logics for
complexity classes' will be quite sufficient for our purposes.

1.3 Some Particular Logics


1.3.1 First-Order Logic and Infinitary Logic
We write Lωω for first-order logic. The expressive power of first-order logic
over finite structures is very unsatisfactory in terms of computational com-
plexity. While all L^-definable queries are LOGSPACE computable, first-
order logic fails to define fundamental structural properties in LOGSPACE
or even below. This was briefly discussed in the introduction. First-order
L
equivalence = »», however, turns out to be too strong a notion of equiva-
lence over finite structures. Two finite structures are first-order equivalent
if and only if they are isomorphic: a first-order sentence, that uses enough
variables to enumerate all elements of a given structure and specify all basic
relations between them, characterizes that structure up to isomorphism.
1.3 Some Particular Logics 25

Full infinitary logic is the logic LOOW that has the usual first-order rules
for the formation of formulae and in addition is closed under infinitary con-
junctions and disjunctions: if # is any set (!) of formulae of L(X)ω then /\ #,
the conjunction over #, and V ^> the disjunction over Ψ, are also formulae of
LOOU,. Their semantics is the obvious one: /\ Ψ evaluates to true if all formulae
in \P evaluate to true and V & evaluates to true if at least one formula in Ψ
does. Note that one often has to deal with families of formulae (t/Όίe/ and
then writes for instance /\ ίe/ ψi instead of f\{ψi | i € /}.
As mentioned in the introduction any query on finite structures is defin-
able in LOOU This follows from the observation that any finite structure 21 is
characterized up to isomorphism by some first-order sentence φ^. If Q is a
boolean query, for instance, then the infinite disjunction ψ = \/ φ<% over all
φ% for 21 £ Q Π stan[τ] clearly defines Q. Recall, however, that φ<& typically
requires n -f 1 variables if the size of 21 is n. This motivates the introduction
of the finite variable fragments of LOOU

1.3.2 Fragments of Infinitary Logic

Definition 1.8. L^ω is the fragment of Looω that consists of formulae us-
ing only variable symbols from {#ι , . . . , Xk}- The union of the L^ω is denoted
Lt^oω . It consists of all formulae of Looω that use finitely many variable sym-
bols (from the standard supply {xi\i ^ I}).
We also consider the corresponding bounded variable fragments of Lωω :
let L^ω denote first-order logic with variable symbols {#ι, ...,#&}.
The union of the L^ω is full first-order logic Lωω. In actual formalizations
we often use variable symbols x, ?/, z, ... instead of the standardized X{
for the sake of easier readability. The official restriction to standard sets of
variables is convenient, however, to have syntactic closure under conjunctions
and disjunctions for each L^ω. We give some examples for the expressive
power of the L^ω. Formalizations with few variable symbols typically require
clever re-use of already quantified variables. Examples 1.9 and 1.11 are from
[KVa92a], Example 1.16 plays an important role in [DLW95] in a context
that will also concern us here later.
Example 1.9. Over linear orderings (A, <) two different variable symbols
suffice to produce first-order formulae ψi(x), for i ^ 0, which express that x
is the z'-th element with respect to <. Equivalently, for the standard linear
orderings (n, <):
(n, <) \= φi[τn] exactly for ra = i.
To obtain these formulae put ψo(x) := -~>3y y<xto define the bottom element
in any linear ordering (A, <). Inductively let

ψi+ι(x):= f\ ^ψj(x) Λ
26 1. Definitions and Preliminaries

where ψj (y) is the result of exchanging x and y throughout the formula ψj (x) .
Example 1.10. The class of acyclic directed graphs (A, E) is definable by
a sentence of L\oω. Observe that a finite graph is acyclic if it has no infinite
-E-paths, or equivalently if there is some finite bound on the length of E-
paths. Put ξo(x) := -~<3yEyx to characterize those vertices that have no
E-predecessors. Inductively let

Then (A, E) \= ξi[υ] if and only if there is no E-path of length greater than
i reaching v. It follows that

characterizes acyclic directed finite graphs as desired.


The sequence of formulae & from Example 1.10 can be extended to ordinal
indices to form formulae ξa(x) asserting (over arbitrary structures) that the
E-rank of x is at most α: inductively ξα(x) = Vy(Eyx ->• V0< α £/?($/)) E l8
well-founded if there are no infinite descending E-paths, which is equivalent
with the existence of some λ such that (A, E) f= \Ja<χVxξa(x) It follows
that the class of well-founded relations of rank less than λ is L^-definable
over arbitrary structures, for each λ. We shall return to two variables, linear
orderings, and well-foundedness in Example 1.12 and Corollaries 1.13 and
1.14 below.
Example 1.11. The reflexive transitive closure of a binary relation E is
definable in L^X)ω. The formula ^i(#,2/) := x = y V Exy describes the pairs of
J£-distance at most 1. Inductively, ψi+ι(x,y) := ψi(x,y) V3z(ψi(x,z) /\Ezy)
defines those pairs (x,y), whose E-distance is at most ί + 1. Thus ξ(x,y) :=
Vi^i Ψί(χιy) defines the reflexive transitive closure of E.
It is a well known fact (that will also be illustrated in Example 2.6 with a
typical game argument) that two variables do not suffice to define transitive
closures or to assert transitivity of a given binary relation. The following
observation, which is also a direct consequence of the very first exercise in
Poizat's [Poi82], is therefore quite intriguing. The way it is proved here is
inspired by an argument from [GOR96a].
Example 1.12. The class of finite linear orderings is L^-definable (even
over not necessarily finite structures). Let ξ1 be an L^ou;[<]-sentence asserting
that < is acyclic (obtained from ξ in Example 1.10 through replacing E by
<). We claim that

¥>ord := ξ1 Λ φQ ,
whereto := VxVy(x=yVx<yVy<x),
1.3 Some Particular Logics 27

defines the class of all finite linear orderings. It remains to argue that φord
enforces
(i) irreflexivity — Vx-ix < x : it does since £' forbids loops.
(ii) antisymmetry — VxV2/-»(x < y Λ y < x) : it does since ξ' also forbids
cycles of length two.
(iii) transitivity — VxVyVz (x < y Λ y < z -»> x < z ) : if x < y and y < z
then x Φ z (forbidden cycle of length two) and z £ x (forbidden cycle
of length three), whence ΨQ forces x < z.
In fact, over not necessarily finite structures and for any ordinal λ the class
of all well-orderings of order type less than λ is L^-definable. It follows
that for instance the class of all countable well-orderings is L^-definable
over arbitrary structures. This claim is justified just as in the last example,
if we replace £' by a sentence that expresses that < is a well-founded rela-
tion of rank less than λ (compare the remark following Example 1.10). ξ' in
particular forbids loops and cycles, so that irreflexivity, antisymmetry and
transitivity follow as above.
Returning to finite structures, we have the following:
Corollary 1.13. Let <£ r and let τ have no relation symbols of arity greater
than 2. Then any query of linearly ordered finite r -structures Q C ord[τ] is
L^-definable.
Sketch of Proof. Consider without loss r = {<,E} with one extra binary
relation E besides <. Let 21 = (n, <,£?) have standard domain with the
natural interpretation of <. Put, for formulae ψi as in Example 1.9 that
characterize the ΐ-th element of the ordering,

Λ Vz V ψi(x) Λ f\ 3x3y(ψi(x) Λ ψj(y) Λ Exy)


i<n (i,J)£E

Λ /\

Then φ<& characterizes 21 up to isomorphism. The disjunction V φ^ over those


of these 21 that are in Q defines Q. D
In particular, for any set W C ω \ {0}, the class of those linear orderings
whose size is in W is definable in L2QQω. This immediately gives the following:
Corollary 1.14. L^ω is sufficiently expressive to define arbitrarily complex
and even non-recursive queues.
Example 1.15. The class of all finite trees is definable in L^. A structure
(A, E) is a tree if E is acyclic and if
(i) there is exactly one element without E-predecessors (the root).
(ii) each element has at most one E-predecessor.
28 1. Definitions and Preliminaries

Note that connectedness is implied by (i) and (ii) if there cannot be cycles.
Now cycles can be forbidden in L\oω according to Example 1.10. (i) can even
be formalized in L^ω through

3xVy-<Eyx Λ VxVt/ ί (Vy^Eyx Λ Vx->Exy) —ϊ x = y j .

(ii) actually needs a third variable for mere counting, as for instance in the
formalization
Λ Ezx -+y = z.
A binary tree is a tree in which all nodes have out-degree 0 or 2. A full
binary tree is a binary tree in which all leaves (nodes of out-degree 0) are at
the same distance from the root (at the same height).
Example 1.16. The class of all full binary trees is definable in L^ω. The
condition on equal height of all leaves is formalized in three variables us-
ing auxiliary formulae ψi that define the set of vertices at height i. These
are constructed inductively similar to the formulae used in Example 1.9.
ΨQ(X) := -iByEyx defines the root. Inductively ψi+ι(x) := 3y(ψi(y) Λ Eyx)
is as desired. The L^-sentence \J ^x(-^yExy -» ψi(x)) forces all leaves to
be at the same height.
It might look surprising that the condition on out-degree 2 should also
be expressible in just three variables. It is however, and quite simply, in the
context of trees. The sentence

ι3zι3z23z3 (/\ Xi φ Xj Λ /\ (ixi f\


<=1,2,3

expresses that there are no three vertices with common direct predecessors
for any two of them. If the in-degree can at most be 1 then this is equivalent
with a bound of 2 on the out-degree. The condition that the out-degree of
all vertices apart from the leaves must be at least 2 is trivially first-order
expressible in three variables.
In Example 1.15 some variables had to be spent for mere counting in the
conditions on the degree of vertices. In general the explicit formalization of
3^πιxφ(χ) requires at least m variables as for instance in

3xι ... 3xm ( f\ Xi φ Xj Λ /\

Indeed, over the empty vocabulary it follows easily from game arguments that
are treated in Chapter 2, that no sentence of L^~l can express the existence
of at least m distinct elements. It is therefore natural to consider fragments
of first-order and infinitary logic with a bounded supply of variables that
admit, however, arbitrary counting quantifiers
1.3 Some Particular Logics 29

Definition 1.17. C^ω is the fragment of Looω that consists of formulae


using only variable symbols from {xι,...,x^} but allows arbitrary counting
m
quantifiers 3^ , m ^ 1, instead of the standard existential quantifier 3. The
union of the C^ω is denoted C^ω .
Some simplifications in actual formalizations are achieved with the fol-
=m m
lowing agreements. We write 3 xφ(x) as an abbreviation for 3^ xφ(x) Λ
rn+l =0
-^ xφ(x). This notation may be extended to allow 3 xφ(x) as short-
hand for -ι3xφ(x). Quantifiers 3>m, 3^m and 3<m are similarly introduced.
We give a few simple examples for the expressive power of C^. A much
more central example, the expressibility of the stable colouring of graphs in
Cooωί ιs presented in detail in Chapter 2.
Example 1.18. The class of regular graphs is definable in C^)0ω. Apart from
the standard axioms for graphs, which are in two variables, take

χ: = Va?Vj/ /\ (3>myExy <-» 3>mxEyx)


mGu>

to express regularity.
Example 1.19. Even C%ω does not have the finite model property, i.e. there
are sentences of C%ω without finite models that are satisfiable in infinite
structures. For instance it is expressible in C*ω that a binary relation E is
the graph of an injective total function that is not surjective:
\/x3=lyExy Λ Vy3^lxExy Λ 3yVχ-^Exy.
First-order logic with two variables l?ωω on the other hand is known to have
the finite property by a result of Mortimer's, see Theorem 7.35.
Example 1.20. Any finite equivalence relation is characterized up to iso-
morphism among all finite equivalence relations by its C^ -theory. The ax-
iomatization of equivalence relations, however, needs at least three variables.
Let 21 = (A, ~) where ~ is an equivalence relation over A. A complete in-
variant that characterizes any finite equivalence relation up to isomorphism
is its spectrum, the set of pairs (i,Vi) such that there are exactly Vi classes
of size ί in (A, ~). This information is expressed in two variables by

That two variables even with counting quantifiers do not suffice to express
transitivity of a binary relation E is shown easily with games discussed in
Chapter 2, see Example 2.6.
Example 1.21. The classes of all trees and of all full binary trees are de-
finable in C^ou In f act we observed in Example 1.15 and 1.16 that a third
variable was only needed for counting purposes. For instance (ii) of Exam-
ple 1.15 now simply becomes
30 1. Definitions and Preliminaries

1.3.3 Fixed-Point Logics

Fixed-point logics provide extensions of first-order logic that capture some


recursion on relations. Consider a formula φ(X,x) in free variables X and x
of matching arities, X a second-order variable for a predicate of arity r and
x — (#!>••• ,£»•)• Over structures 21 that interpret φ up to X and the x the
formula φ induces a mapping on r-ary predicates:

F*:P(Ar) — > P(Ar)


P h—> {aeAr \&\=φ[P,a]}.
r r r r
P(A ) denotes the power set of A . For any mapping F:P(A ) -> P(A )
consider the following two recursive processes.
The partial fixed point of F. Inductively, a sequence of r-ary predicates
is generated through iterated application of F to the empty predicate:

With this sequence the partial fixed point of F is defined to be either the
stationary value of this sequence if such exists or the empty set otherwise:

Xi if -Xi+1 — Xi
PFP[F] := I
0 if Xi+1 φ Xi for all i.
The inductive or inflationary fixed point associated with F. Induc-
tively, the following increasing sequence of r-ary predicates is generated:

Note that this sequence may alternatively be obtained through iterated ap-
plication of a modified operation Fίnfl to the empty predicate. F infl is the
inflationary variant of F, defined by Finfl(P) := P U F(P); generally an op-
eration G on sets is called inflationary if always G(P) D P.
Since the domain A is finite, the sequence of the Xi must become station-
ary within polynomially many steps of the iteration. The limit reached is the
inductive or inflationary fixed point:

IFP[F] := Xio where i0 = min{i|Xi+ι = Xi}.

Equivalently: IFP[F] = (J; Xi,


or WP[F\ = PFP[Finfl].

The Xi in these definitions are referred to as the stages in the generation of


the respective fixed points.
Fixed-point logics provide constructors for the definition of those predi-
cates that are obtained as fixed points of operators F = Fφ as above. For the
1.3 Some Particular Logics 31

inductive definition of their syntax and semantics it is important to consider


formulae φ that may have other free first- and second-order variables than
X and x. To make clear which variables are those involved in the fixed-point
process, we write PFPχ^φ for the partial fixed point associated with the
operator Fφ, where free variables other than X and x in φ are kept fixed
as parameters with respect to the operation Fφ. The same applies to the
inductive fixed point.
Syntax and semantics of partial fixed-point logic PFP are defined as fol-
lows. Syntactically, PFP is the closure of atomic formulae (where second-
order variables are admitted) under the usual first-order constructions and
the partial fixed-point constructor. The latter allows to construct a new for-
mula ψ = [P¥Pχ^φ]x from any formula φ £ PFP, where X is a second-order
variable of some arity r and x an r- tuple of distinct first-order variables. For
the free occurrence of variables put free(^) := (free(^) U {#}) \ {X}.
The semantics for the PFP-constructor is that induced by the partial fixed
point of the corresponding operator. If (21, Γ) is a structure of the appropriate
vocabulary together with interpretations for all variables in free(φ) \ {X,x},
then
if α
where F^Γ:P(Ar) ->• P(Ar) is the operation induced over (a, Γ) by φ.
Definition 1.22. Partial fixed-point logic PFP is the smallest logic closed
under the usual first- order constructors and the PFP-constructor.
Clearly PFP C PsPACE. Recall that first-order queries are in LOGSPACE.
Over structures of size n PFP-processes in arity r terminate within at most
2m + 1 steps, where ra = nr. A corresponding step counter can be run in
PSPACE and only two stages of the fixed-point generation need be kept si-
multaneously.

For fixed-point logic FP one considers the inductive or inflationary fixed-


point operator instead of the partial fixed-point operator. The syntax is ex-
actly the same as for PFP, only that we write FP instead of PFP in fixed-point
formulae and choose the semantics based on the inductive fixed point of the
underlying operation: for φ and (21, Γ) as above, and ψ — [FPx^Jx put

(2t,Γ)μV[δ] if α
Definition 1.23. Fixed-point logic FP is the smallest logic closed under first-
order constructors and the FP- constructor.

It is obvious that FP C PTIME as IFP-processes terminate within a poly-


nomial number of iterations.
We regard FP as a sublogic of PFP in the sense that [FPx^Jaf may
be identified with [PFPχ^(Xx V φ)]x, since the inflationary variant of the
operation Fφ is the same as Fφ> for φ1 — Xx V φ.
32 1. Definitions and Preliminaries

Theorem 1.24 (Immerman, Vardi). Over the class of linearly ordered fi-
nite structures FP captures PTIME: a class of linearly ordered structures is
recognized by a PTIME algorithm if and only if it is the class of finite mod-
els of some sentence of FP; a global relation on ordered structures is PTIME
computable if and only if it is definable in FP.
Theorem 1.25 (Abiteboul, Vardi, Vianu). Over linearly ordered finite
structures PFP captures PSPACE: a class of linearly ordered structures is
recognized by a PSPACE algorithm if and only if it is the class of finite models
of some sentence of PFP; a global relation on ordered structures is PSPACE
computable if and only if it is definable in PFP.
Theorem 1.24 was obtained in [Imm86] and [Vaτ82], Theorem 1.25 com-
bines results from [Var82] and [AV89] (via equivalences with relational While-
queries).
Let us briefly indicate how the fixed-point processes available in FP and
PFP can be used to code PTIME, respectively PSPACE, computations over
linearly ordered structures — an observation that is the key to the preceding
theorems.
Example 1.26. Polynomially space bounded configurations of a Turing ma-
chine can be encoded by predicates over the ordered domains of the input
structures. The arity of the predicate depends on the degree of the space
bound. A definable indexing of the tape cells is provided by the lexicographic
ordering on an appropriate power of the universe. In terms of these encodings
by predicates it is easily seen that the successor step from one configuration
to the next becomes first-order definable.
For a PSPACE machine that computes a boolean query say, consider the
partial fixed-point process based on this first-order formalization of the com-
putational successor (with a corresponding definition of the initial configura-
tion). Provided that we adapt the transition function of the machine in such
a way that an eventual halting configuration is formally repeated indefinitely,
the limit value of this partial fixed-point process is an encoding of the halting
configuration if the machine halts on the given input structure.
For a PTIME machine we may pass from the encoding of individual con-
figurations to a cumulative encoding of initial segments of the computation
path. We use a fixed-point variable with additional entries for a lexicographic
indexing also for a polynomial number of time steps. This leads to an induc-
tive fixed-point process whose i-th level describes the computation path up
to step i. The limit reached in this process is an encoding of the entire com-
putation path on the given input structure.
In either case the actual result of the computation (acceptance or rejection
in the case of a boolean query) is first-order definable in terms of the final
configuration.
The original and intuitively also very appealing definition of fixed-point
logic FP in terms of a least fixed-point operator LFP rather than the in-
1.3 Some Particular Logics 33

ductive or inflationary operator IFF is equivalent in expressive power. This


equivalence between the least fixed-point extension of first-order logic and the
formalization using IFF is shown by Gurevich and Shelah [GS86], see also
[Lei90]. The formalization using inductive fixed points will be more convenient
in connection with the counting extensions to be introduced in Chapter 4.
As a further indication of the versatility of fixed-point constructs we
briefly consider systems of simultaneous fixed points.

Example 1.27. Let φ = (ψi(X\,... ,-ϊmj^))<=ι,...,m be a tuple of formu-


lae in which the arities r» of the x^ match those of the Xi. With φ one may
associate an operation Fψ on P(Arι) x ••• x P(AΓm), over structures that
interpret the ψi up to the Xi and 5?W, through:

ι=I,....m

Iteration in the spirit of IFF or PFP yields inflationary or partial fixed points
for such systems of interrelated relational transformations. Standard tech-
niques for the encoding of tuples of relations through a single relation of
higher arity can be employed to show that fixed-point logic FP and partial
fixed-point logic PFP are closed under the formation of respective fixed points
for systems.

1.3.4 Fixed-Point Logics and the L^ω

The fixed-point logics PFP and FP both are sublogics of L^ω. Since FP itself
is a sublogic of PFP it suffices to review the argument that yields PFP C
L^oω. Since we have already seen that Lt^oω expresses queries of arbitrary
complexity, whereas PFP-definable queries obviously are in PSPACE, it follows
that PFP £ L^ω.
For a convenient statement of the following arguments it is useful to elimi-
nate first-order parameters in fixed-point processes. This is easily done at the
expense of an increase in the arity of the second-order fixed-point variable.

Lemma 1.28. Any formula in PFP is equivalent with one in which fixed-
point operators are applied only in the form PFPχ$φ, where all free first-
order variables of φ are among those in x. The same holds o f F P .
Sketch of Proof. Consider φ with free first-order variables of, ~z. Assume that
x and ~z together consist of pairwise distinct variables and that no variable in ~z
is bound in φ. Let φ1 be the result of replacing any atom Xΰ in φ by the atom
Z~zϋ, where Z is a new second-order variable whose arity is that of X plus
arity of z. It is easily seen that {zx \ [PFPx^φ]x} = {zx \ [PFPz^Έφ']zx}.
D

Lemma 1.29. Let φ £ Lf^Qω be a formula, possibly with free second-order


variables. Let X be a second-order variable of arity r, and x an r-tuple of
Another random document with
no related content on Scribd:
The return traverse via Farafra to Assiut is left out of consideration
owing to its great length and consequent low value in the
determination of the longitude of Zubbo. It agrees however, very
closely with the others, owing to compensation of errors.
[12] Lyons, Proceedings of the Royal Society, vol. 71.
[13] Op. cit.
[14] The occurrence of this fern in the ravines of the Fayum may also
be recorded here.
[15] Beadnell, H. J. L., Geological Magazine. Jan., 1900, No. 427, pp.
46-48; The Cretaceous Region of Abu Roash, near the Pyramids of
Giza. Geol. Surv. Egypt, Report 1900, Part II, 1902.
[16] See Reports of the Geological Survey on Farafra, Kharga and the
Eastern Desert.
CHAPTER III.

The Roads connecting the Oasis of Baharia with the Nile


Valley and other Oases.

The roads traversed by the Survey parties between Baharia and


other places are three in number, viz., from near Maghagha and from
Minia in the Nile Valley, and from Baharia to Farafra Oasis. Other
well-known routes run from the Fayum, from Bahnessa, Samalut
(Ascherson) and Delga, in the Nile Valley, from Alexandria, via
Mogara, and from Siwa (Jordan, Cailliaud). The Survey’s return
traverse to Minia from the south end of Baharia did not follow any
defined road, but kept on the open plateau on a course computed
from the known positions of the points of departure and destination.
[17]

Road from
The road from Feshn and Maghagha to Baharia
Feshn and leaves the edge of the Nile Valley cultivation at Qasr el
Maghagha Lamlum Bey, which bears 51½° west of true north from
to Baharia.Maghagha railway station, and is distant 15·4
kilometres. From this point the road is well-defined and easily
followed right into the oasis. In the following description the
distances are given from the edge of the Nile Valley cultivation.
The road at first leads over a strip of drift sand, half a kilometre
broad, with short prickly scrub, passing a white mosque on the left
and then turning off somewhat to a direction 26° south of west, and
continuing in a straight line for 15 kilometres over an undulating
gravelly plain. The high prominent cliffs, about 7 or 8 kilometres to
the north-west, are the flanks of Jebel Muailla, and a valley known as
Wadi Muailla leads through them to the Wadi Rayan in the Fayum
depression.[18] At 19 kilometres the valley scarp, with a number of
isolated peaks, is approached on the right, while ridges and low
mounds form the plain below, well-marked lines of drainage running
from here in a south-east direction towards the cultivation. At 23
kilometres the scarp runs back, enclosing a large bay, across which
the road runs and ascends to the plateau beyond at 27·8 kilometres.
Numerous isolated parallel sand-dunes in the form of small ridges
are seen running out into the bay from the cliff at the far end, all lying
slightly west of north and east of south, or parallel to the normal wind
direction.
The escarpment bounding the Nile Valley at this point is only
some 15 metres in height, being thus quite insignificant compared
with the cliffs on the east side. The plateau here was found to be
about 140 metres above the cultivation, the road having risen
gradually throughout. The latter continues for about 1½ kilometres
across the strip of plateau when it again descends, making a slight
detour to the left for easy descent. It then continues 9° south of west,
slightly winding, over gravelly undulating ground. At 31-32 kilometres
a line of low hills is passed on the right, while a dark well-marked
range lies 6-7 kilometres to the left.
A ridge of sandstone, known as Jebel el Ghudda, is passed on
the right at 45 kilometres, from the end of which a small dune runs
out; beyond, the plain resumes its monotonous undulating character,
a low ridge being crossed at 61 kilometres. There, the road,
consisting of a number of more or less parallel well-marked narrow
paths worn by camels, which have a somewhat general habit of
marching in line one behind the other, changes its direction to 36°
south of west, falling gradually in level until a patch of scrub is
reached 6½ kilometres further on. This scrub was dead at the time of
the visit, and furnished a useful supply of fuel. From this point the
course is 7° south of west (true), which direction is maintained for the
next 42 kilometres over a remarkably monotonous undulating gravel-
covered desert, the typical “serir” of the Arabs. At 92 and 93½
kilometres some more patches of dead scrub were passed on the
right, while logs of silicified wood were noticed strewing the plain on
the left. An Arab grave was met with at 96 kilometres, while
skeletons of camels lie about near the roadside at frequent intervals;
at 110 kilometres the eye of the traveller is relieved by a small grove
of green thorny flat-topped acacia trees (Acacia nilotica, or “sunt” of
the Arabs) with a patch of coarse grass; four gazelle (probably
Gazella dorcas) were observed browsing on the scrub here.
The course now continues 12½° south of west, over gently
undulating gravelly “serir”, until the eastern scarp of El Bahr is
reached at 125 kilometres from the cultivation of the Nile Valley. The
“serir” or undulating gravelly type of desert then ceases.
El Bahr is a depression, some 60-70 metres deep, cut out in
white limestone rocks; its breadth at the point crossed by the road
was 8 kilometres. Within it are several high prominent hills, one of
which near the centre on the left side of the road is called Jebel Gar
Marzak. The bottom of the depression was quite green with
vegetation; sufficient water is said to fall every year to keep these
plants alive, and in 1894 rain is said to have fallen to such an extent
that a pool of considerable size was formed; the silt deposited by this
is plainly visible at the present time. A good deal of blown sand
occurs within the depression. El Bahr evidently corresponds to the
Bahr Bela Ma, (river without water) figured on some authors’ maps,
which has been frequently but erroneously referred to as an old
river-course; although this idea was shown to be untenable by
Zittel[19] and Ascherson[20] it has subsequently been maintained by
non-scientific writers. No traces of any river deposit occur in the
depression, which consists simply of a series of unconnected
depressions, eroded by wind-borne sand.[21]
The track leaves the depression at 134 kilometres, rising over
heavy sand; it then continues 3° north of west. The character of the
desert has now completely changed, and instead of the smooth
undulating gravelly “serir,” its surface is rough and hummocky, being
formed of hard bare limestone, cut up into sharp knobs and grooved
into furrows by the powerful action of wind-borne sand; it resembles
closely the surface of the rough open sea. This type of desert is
spoken of as “kharafish” by the Arabs. While the “serir” forms an
ideal surface for travelling over, the “kharafish” is the worst
imaginable, the innumerable hillocks necessitating incessant small
deviations, while the hard rough surface is in some places very
troublesome to camels; moreover, an extensive view is out of the
question and no tracks are visible on the surface, so that the road is
easily lost except where marked by frequent cairns built of loose
stones.
Occasional patches of blown sand are here met with, and the first
well-marked dunes were crossed at 141 kilometres. From here
onwards for kilometres the whole area was more or less sandy with
occasional narrow well-marked ridges or dunes, running almost due
north and south, and varying in breadth from that of a single line to a
number of parallel ridges side by side half a kilometre broad. The
largest dune of this group at 146½ kilometres is known as Ghard el
Shubbab. The steepest sides are those facing west where the angle
may reach 30°.
At the particular locality crossed by this road the sand area is
very easily crossed, a circuitous route being followed in order to take
advantage of the flatter dunes with the easiest slopes when crossing
the steeper ridges. Probably the road crosses at one of the easiest
points. This remarkable line of dunes, known as the Abu Moharik,
has its origin in the neighbourhood of the oasis of Mogara and runs
southward, almost without a break, across the desert until Kharga is
reached, whence with a slight break owing to the broken character of
the ground it continues southward within the oasis-depression. The
total width of the sand-belt on the road under description is about 6
kilometres.
At 153 to 156 kilometres a number of black conical hills, Gar el
Hamra, are situate from 1 to 2 kilometres from the road on the south
side. One or two more sand-dunes were crossed and then the road,
maintaining its direction of 2°-3° north of west, lay over a more or
less uneven dark-coloured limestone desert broken up into a number
of small hills. At 169 kilometres a broad ridge of sand-dunes was
encountered, running 18° west of true north. These light yellow
dunes afford a beautiful and remarkable sight, running northwards
away to the horizon over a dark brown-coloured desert in an almost
perfectly straight line and with a sharply maintained junction-line
between the edge of the sand and the desert surface adjoining.
Within a few hundred metres of the western side of the sand-
dunes the road commences the descent from the plateau into the
oasis-depression. The road enters at the most northerly extremity of
the oasis, the descent being particularly easy at this point, passing
the large dark-coloured hill, Jebel Horabi (or Morabi?), on the right
almost immediately afterwards.
A fine view of the depression is obtainable from the top of the
escarpment, a broad low-lying expanse, bounded by steep
escarpments or walls, stretching away to the south, its monotony
relieved by several large flat-topped hill-masses, near which, on the
lowest portions of the floor, dark areas, the cultivated lands and
palm-groves can be distinguished. The road crosses the depression
in a south-westerly direction, passes a spring known as Ain el Gidr,
the first watering place, and divides in front of the great hill-masses
separating the two groups of villages, the eastern branch keeping
close under the eastern scarp of Jebel Mayesra, to avoid a large
area of soft salty ground, and leading to the villages of Zubbo and
Mandisha, While the western branch continues its course to the
cultivation surrounding El Qasr and Bawitti. The distance by this road
from Qasr el Lamlum Bey to the village of Zubbo is 190 kilometres
and to Bawitti 195 kilometres.

Geology of
Having now described the topographical features of
the Feshn- this road, the chief geological characters may be
Baharia noticed. The plain between the Nile Valley cultivation
and the scarp of the plateau is covered with sandy
gravel, partly downwash from the higher ground in Recent times, and
partly the remains of definite gravel deposits belonging to the Nile
Valley Pleistocene series.[22] The pebbles now found strewn over the
plain consist chiefly of flints, doubtless derived from the Eocene
limestones forming the deserts on both sides of the Nile Valley, and
occasional pebbles of hard felspar porphyry which must have
originally been derived from the igneous massifs of the Red Sea
Hills. Both are well rounded, although the former are frequently
broken up into angular fragments by temperature changes. White
granular beds of gypsum, of various degrees of impurity, crop out on
the plain in places, and in all probability there was in Pleistocene
times an extensive deposit of this mineral all over the surface of the
low-lying country. In the desert lying between the Fayum and the Nile
Valley further to the north, these gypseous beds occur of great
thickness and wide extent, and the deposits crossed on this road are
doubtless part of the same series.
The cliffs of Jebel Muailla to the north are capped by a hard dark
bed of limestone, which weathers with a vertical face, while the more
gentle slopes, generally more or less hidden with sand, are
doubtless formed of softer limestones, marls, and clays. During the
survey of the Fayum (see foot-note, p. 17) the hills surrounding Wadi
Muailla were found to be formed of Lower Mokattam beds (Middle
Eocene) and the hills seen from this road are doubtless composed of
the same beds. The ridges crossed at 20 kilometres are formed of
hard, compact, close-grained crystalline limestone, covered with
more or less gypsum and flint gravel; the limestone beds forming
these ridges show dips which suggest the existence of a fault
running N.E.-S.W., parallel to the trend of the cliff behind, and this
may be part of the extensive faults and folds of the Nile Valley. In
one small hill (22 kilometres) shales with Ostrea were noticed at the
base, with occasional hard oyster-limestone bands; the upper part
was formed of 10 metres of gravel consisting of well-rounded
limestone pebbles. This superficial deposit must be classed as
Pleistocene and may be a sea-beach, though no conclusive
evidence was obtainable on this point. The escarpment passed at 23
kilometres is capped by a bed of white limestone, shales forming the
slope, but was not examined at close quarters. The floor of the bay
formed by the receding cliff shows outcropping brown limestone with
Ostrea, and the escarpment on the far side is capped by a hard
white crystalline limestone with much flint, the latter forming bands.
On the surface is a thin calcareous gypseous gravel deposit,
doubtless of the same age as the gypseous beds already mentioned
as occurring on the plain below. The flanks of the scarp are hidden
by downwash. The cliff bounding this strip of plateau, 1½ kilometres
further on, is composed of the same beds, the limestone being here
silicified, with large silicified Conidæ. With regard to the age of these
limestones and clays they are probably equivalent to part of the
Lower Mokattam series already mentioned as forming the hill-
masses round Wadi Rayan, although no Nummulites gizehensis
beds were observed in the sections examined. A conspicuous black
knob among the low gravelly hills left two kilometres on the right at
32 kilometres, was found to be a neck of hard dark andesitic basalt,
one of the few occurrences of igneous rocks in the Western Desert.
Several other similar looking dark hills were in sight, but time did not
admit of their examination. The dark well-marked range 6-7
kilometres to the left of the road is probably identical with a range of
hills occurring 10 kilometres west of Bahnessa, which was
mapped[23] during the survey of the Nile Valley in 1899, and found to
consist of a mass of andesitic basalt similar to that forming the small
neck on this road. Doubtless they are both parts of the same
intrusion. The surface of the plain is still composed in part of
gypseous deposits, with occasional outcrops of the underlying
limestone, the surface being covered with a certain amount of loose
sand with rounded flints and their broken fragments. In the
neighbourhood of Jebel el Ghudda the plain consists of limestone
with numerous individuals of the large Nummulites gizehensis, and
are thus of Lower Mokattam age. Much of the limestone is
crystalline. The hills of Jebel el Ghudda are formed by younger
overlying beds consisting of hard silicified sandstones and grits
(quartzites), which lithologically are very similar to the beds of Jebel
Ahmar near Cairo, of Oligocene age. They may, however, belong to
the Upper Eocene series, so well developed above the Upper
Mokattam in the escarpments to the north of the Fayum, as this
series contains similar beds with similar silicified wood. They enclose
bands of coarse conglomerate and have a peculiar blackish burnt
colour. Occasional patches of these grits are here and there met with
right up to the depression of El Bahr, and the silicified wood passed
at 92 kilometres belongs to the same series of beds. From the scrub
area at 93½ kilometres till the road approached El Bahr there was
hardly a sign either of the grits or of section anywhere met with on
this road. The depression is cut down through the upper series of
sandstones and grits into the fossiliferous white limestones and
sandstones below. The eastern scarp showed the following beds:—
Top.
Soft white Sandstone.
⎧ Limestone with Nummulites gizehensis. ⎫
Middle Lower
⎨ Limestone with Ostrea, Exogyra and Lucina, ⎬
Eocene. Mokattam.
⎩ etc. ⎭

The floor is covered in places with numerous weathered out


individuals of N. gizehensis and large oysters. Some of the hills
occurring within the depression showed brown siliceous limestone
overlying white limestones with beds containing Nummulites
gizehensis, Ostrea, etc., below. In some of these hills the beds show
evidences of folding, which like that within the Baharia Oasis may
have partly caused the formation of the depression by bringing the
softer beds to the surface within the reach of the agents of
denudation. On ascending the western scarp, beds with Ostrea,
Turritella, etc., were crossed, and near the top thick yellowish
sandstones crowded with Ostrea occur. Two of the species of Ostrea
were afterwards examined by Dr. Blanckenhorn, one of which he
regarded as nearly related to O. Fraasi, and the other as a new
species with affinities to O. Hess, May-Eym.
Probably the whole of these fossiliferous beds belong to the
Middle Eocene Mokattam Series; the lower beds with N. gizehensis,
are certainly Lower Mokattam, and probably the upper, although the
latter may represent the base of the Upper Mokattam. Whether the
upper bed of sandstone capping the eastern scarp belongs to the
same series or is equivalent of the silicified grits with silicified wood
passed further back is open to some doubt.
Beyond the Bahr the desert is formed of a hard limestone much
cut up by the action of wind-borne sand as already mentioned. This
limestone is very unfossiliferous, occasional obscure nummulites
seen on fractured or smooth surfaces being the only indications of its
Eocene age.
The conical-peaked hills of Gar el Hamra on the left of the road at
151 kilometres were so striking that a detour was made specially to
examine them. They were found to be composed of black
ferruginous silicified sandstone or quartzite at top, with false-bedded
sandstones below. They overlie nummulitic limestones forming the
surrounding plain, and are not improbable Post Eocene lacustrine
deposits, similar to those within Baharia Oasis (see pp. 60-62).
From here onwards up to the oasis-depression the plateau
consists of hard brown limestone, more or less silicified, and
contains nummulites and oysters. The Eocene strata, thus extend
right up to the escarpment of the depression on the north side of the
oasis. The geological structure within the depression will be found
discussed in Chapter V.

Road from
Road from Minia to Baharia.—The cultivated land
Minia to extends for some 9 or 10 kilometres west of Minia town,
Baharia. and at the time of the Survey expedition (early in
October, 1897) this area was mostly covered by flood-
water. Boats were therefore taken to the edge of the desert, and the
march west was commenced from Nasl Nadiub Lengat, a small
village on the cultivation-limit in lat. 28° 6′ 24″, long. 30° 39′ 45″ E.,
bearing from Minia railway station 9° north of west, distance very
nearly 10 kilometres. The road westward being ill-defined, and the
party knowing that any course followed a little north of west would
surely lead to the oasis, no attempt was made to follow the track
precisely, the course being rather chosen to take advantage of
commanding points so as to map as much topography as possible
en route. As will be seen further on, however, the road was struck
some distance before reaching the oasis, and was thence followed
to Zubbo.
On leaving Nasl Nadiub Lengat the course first taken was about
32° north of west. The desert here rises very gradually from the
flood-level, (i.e. about 40 metres above sea-level), there being no
cliff bounding the Nile Valley on the West at this point. For about 6
kilometres the ground was sandy, with occasional patches of flood-
water and some grass; then came a stretch of level gravelly ground,
and at 11½ kilometres some low sand-dunes were crossed. The
course was now changed to about 3° south of west, and at 14
kilometres a descent was made from the gravel-plain into a
limestone-depression, with gravel-capped eminences on either side.
At 22 kilometres the gravel-strewn plain was again come on, a few
low mounds, also gravelly, being passed at 31 kilometres, and
camel-tracks, doubtless part of the Minia-Baharia road, being noticed
coincident with the survey line at 37 kilometres. Continuing following
these tracks, another camel-road was found to branch off at 46
kilometres to the north-west. The march was continued along the
same track going a little more south of west, over a monotonous
gravelly plain, with limestone showing through it in small patches, till
at 78 kilometres from Nasl Nadiub Lengat a conspicuous though
small mass of fissile gritty limestone was come on, which afforded a
good survey-station. North and west of this are low gravel-covered
hills. Turning a little north of west, a large extent of low table-like hills
was seen on the south. At about 90 kilometres low plateaux of
limestone, capped by gravel, closed in so as to form a defile about a
kilometre wide, through which the line of survey passed, and at 98
kilometres a large sandy and gravelly depression, bounded by
limestone scarps and containing numerous large limestone hills, was
entered. Within the depression the course followed was about 30°
north of west, so as to map roughly the escarpments on either hand.
About 10 kilometres from the point where the depression is entered
a long stretch of heavy sand was crossed beyond which tracks going
west were come on, and the edge of the limestone plateau on the
left curved round so as to cross the road. Ascending this low scarp at
116 kilometres, the course lay over white limestone, with sharp
angular flints on its surface, and low limestone hills on either side. At
124 kilometres the broad belt of sand-dunes, the Abu Moharik, was
reached. The dunes, which are of considerable height, run nearly
north and south, and have a total width of some 4 kilometres; they
formed the most serious obstacle of the entire journey. Beyond the
dunes is a hard limestone plain, crossed by a low ridge (forming part
of a higher plateau) at 136 kilometres. Beyond this ridge are low
limestone-hills on either side, the floor being generally of hard white
limestone. At 151 kilometres an oval depression in the limestone
plateau, with its longer axis (about 4 kilometres) running W.N.W.,
was entered; this depression is full of hills of white chalk, which also
forms the floor here. Ascending the opposite scarp of the depression
at 159 kilometres, a narrow strip of higher plateau of much harder
limestone was crossed, the edge of the scarp of the oasis of Baharia
being reached at 160 kilometres, after marching for 57 hours from
Nasl Nadiub Lengat. Up to this point there had been a gradual rise of
the ground, from 40 metres at the Nile Valley to 265 metres above
sea-level at the edge of the oasis.
The descent into the oasis is not difficult, the scarp being sandy
and the fall gradual. At about 4 kilometres W.N.W. from the edge of
the scarp is the first well or spring of the oasis, Ain Gelid (lat. 28° 19′
28″ N., long. 29° 8′ 40″ E. of Greenwich), a small pool surrounded by
grasses and with a tree growing near it. The barometric observations
gave for this point the level 134 metres above sea, so that the total
drop from the edge of the scarp is about 130 metres. From Ain Gelid
the village of Zubbo bears about 15° north of west, and is some 17
kilometres distant, but owing to the hills between the two places
necessitating a slight detour, the actual distance to be traversed is a
few kilometres greater.

The road from Minia to Baharia shows essentially


Geology of
the Minia- the same geological features as the one from
Baharia Maghagha already described. The gravel which strews
Road. so large a portion of the road to the east is mainly
composed of quartzite and flint pebbles, of a prevailing brown colour.
Some of the rounded flints show a concentric structure, and are
evidently segregation-nodules derived from chalk-beds; a very fine
specimen of these, measuring some 20 c.m. in diameter, of almost
perfectly spheroidal form, was obtained. The gravelly covering is
frequently very thin, the limestones underlying it showing through in
numerous places. No evidences of the precise age of this gravel
have come to the authors’ knowledge, but it is certainly Post-
Eocene. It appears to be the same formation which is found covering
the edge of the plateau west of Girga[24] and south of Assuan (east
bank) and at various other points of the Nile Valley. It overlies beds
of every age from the Nubian Sandstone to the Lower Mokattam.
The limestones which underlie the gravel near the Nile, and
which are well exposed in the depression crossed 14 kilometres from
the edge of the cultivation, are crowded with nummulites of various
species, N. gizehensis and N. curvispira being specially common;
these rocks belong therefore to the Lower Mokattam Series (Middle
Eocene).
The limestone underlying the gravel further west (from about 30
to 80 kilometres west of the cultivation-limit) is only seen in small
patches; it varies in character, being in some places loose and
tufaceous in texture, and in others gritty and fissile, passing into a
calcareous grit. The mass of gritty limestone at 78 kilometres, noted
above, is about 6 metres in length by 2 metres high and broad; it
shows a peculiar stalagmitic structure, the layers always parallel to
the free surface. Several smaller masses found around all show a
similar structure, and where the rock is exposed on the floor
concentric fissuring is frequently seen. The sand which strews the
surface here is largely calcareous, being doubtless in part derived
from the gritty limestones.
The eastern part of the low hills lying to the left of the track at
about 85 kilometres from the cultivation-limit show the following
section (total height about 13 metres):—
Top.
1. Flinty gravel, the flints containing nummulites.
2. Tufaceous white limestone.
3. Hard pink siliceous limestone.
4. Fissile sandy marls and gritty limestones.

On the western part of the hills the gravel covering is absent, the
succession here seen being:—
Top.
1. Sandy marls, 60 centimetres.
2. Very hard pink calcareous grit, 1 metre.
3. White sandy marl, fissile, 2 metres.
4. Soft red and white marls, 10 metres.

No fossils were seen in these beds except the nummulites in the


gravel, which prove the latter to be at least in part derived from
Eocene deposits.
The hills left of the road a little further on show the following
section:—
Top.
1. Flinty gravel; thin deposit unconformably overlying the limestone below.
2. Tufaceous white limestone (thin bed).
3. Hard grey crystalline limestone, 1·2 m.
4. Slope of sand and debris, doubtless covering soft marly beds.

On entering the depression at 98 kilometres a band of earthy


reddish limestone with Bryozoa is crossed, this bed appearing to
overlie the fissile sandy limestone already mentioned. From here a
good view of the numerous limestone hills is obtained, the top beds
of hard limestone showing out sharply from the lower sand-covered
slopes; the beds show a stratigraphical depression here in addition
to the eroded surface depression. The floor of the depression is
covered with sandy gravel and Ostrea shells; an examination of the
left scarp, about 3 kilometres after entering the hollow, showed the
following succession of beds (total height of section 20 metres):—
Top.
1. Hard white limestone with numerous shell-casts and containing some
shaly layers, 3 m.
2. Debris-covered slope of softer beds with Ostrea shells, 16 m.
3. Red earthy limestone with shells at base of hill.

One of the hills within the depression, to the left of the track about
6 kilometres further on, showed in a face of about 28 metres the
following beds:—
Top.
1. Hard white limestone, sand-eroded.
2. Brown fossiliferous limestone.
3. White limestone with many small fossils.

The slopes of this hill were covered with large black Ostrea
shells.
The floor of the depression further west showed limestones of
varying character, frequently highly fossiliferous. Beyond the
depression the ground passed over was mainly white limestone,
strewn with large sharp angular flints, and many spheroidal flint-
masses derived from the limestones. The surrounding hills show that
we have here two white limestones with brown beds between, the
floor and the hillcaps being white, while the feet of the hills are
brown.
The plateau-rock near the great belt of sand-dunes is a hard
white limestone with large Conidæ. Just beyond the dunes this rock
is very siliceous, the exposed surfaces showing a smoky-black
colour; the rock is however quite white on fracture. The hills rising
from the plateau here consist entirely of limestone, beds of white
chalk alternating with harder yellowish and brown limestones, in
which no fossils were noticed. Further on the plateau rises, so that at
the entrance to the depression the surface is formed of the same
brown limestone-beds which are seen in the hills behind. The hills
within the oval depression here are composed of alternations of
chalky with harder limestones. The narrow ridge which separates the
depression just described from the oasis-depression is of very hard
horny siliceous limestone.
On the descent into the oasis the following beds are passed
through:—
Top.
1. Very hard yellowish and reddish-brown siliceous limestone, in part
crystalline.
2. Fissile sandy limestone.
3. Soft yellow ochreous marly limestone.
4. Greyish-white chalky limestone, with ferruginous layers.
5. Sand-covered slope, mainly consisting of clays.

It seems probable that the entire mass of the limestones forming


the surface of the plateau are Eocene, occupying an horizon in and
below that of the Mokattam series. That the fissile sandy limestone
and calcareous grit cannot be the equivalent of the Post Eocene
sandstone deposits encountered on the road from Maghagha is
proved by their being intercalated in the Eocene limestones of the
plateau.
Traverse
The return journey from Baharia Oasis to Minia was
from south made, as already mentioned, across the open desert,
end of not following any road. The starting-point was the point
Baharia to on the southern scarp of the oasis, where the Farafra
Minia.
road leaves Baharia. Our observations give for the
position of this point[25] lat. 27° 46′ 13″ N. long. 28° 32′ 47″ E. of
Greenwich; height above sea level, 247 metres. The course was
shaped so as to reach the village of Nasl Nadiub Lengat, whence the
outward traverse had started, so as to give a closed polygon of
survey-lines. The following topographical and geological notes were
taken on the journey.
The first 2½ kilometres of the way, going about S.S.E., lay over a
plateau of sandy limestone, often mammilated and strewn with
limonite-fragments. Sand-dunes of small size were passed on the
left. An ascent was then made of some 35 metres on to a flat-topped
ridge, consisting of white and yellow marls and clays, capped by a
horizontal bed of hard brown calcareous grit, passing into brown
crystalline limestone with calcite shell-casts. On this ridge, which is
about 600 metres in width at the place seen, a turn was made so as
to go almost due east. The descent from the ridge on the other side
is on to a plain strewn with whitish-brown laminated and mammilated
limestone and some limonite, and from the plain rise hills showing
the same structure as the ridge just crossed. A camel-road running
south was crossed 7 kilometres from the starting-point, and beyond
this, the ground having gradually risen to the level of the brown
calcareous grit, we came among numerous low hills; these consist of
white chalk beds, capped by sandy greyish limestone and then by
thick beds of harder greyish-white limestone, weathered grey on the
surface. No fossils were seen here, but continuing the journey east,
over uneven ground of white and greyish limestone of considerable
hardness, a small hill was met with, at about 12 kilometres from the
starting-point, and found to consist of limestone with Nummulites,
and these foraminifera were found in great abundance a little further
on. The finding of these forms, so distinctively Eocene, is important
as showing that whatever the age of the limestones at the actual
edge of the south part of the oasis may be, the beds forming the
plateau only a few kilometres east of the oasis are, like those which
form the top of the northern scarp, of undoubted Eocene age. The
nummulites were visible in the rocks of the plateau till about 17
kilometres from the starting-point, none being noticed further on till
the neighbourhood of the great sand-dunes was reached. At 25
kilometres a slight depression containing numerous hills of white
chalk was entered. The floor of this depression, 267 metres above
sea-level, is strewn with fragments of crystalline calcite, probably
derived from veins or druses in the chalk. The beds here dip slightly
to the south. About 3 kilometres further on the beds dip about 5°
E.S.E., so that higher beds were come on; the dip however soon
diminished. At 29 kilometres the plateau rock consisted of porous
siliceous limestone, generally white, but in places yellow, closely
resembling that found capping the scarp in the northern parts of the
oasis. The march was continued over white limestones of varying
hardness, with flints and fragments of chalcedony on the surface,
and numerous low limestone-hills. At 36 kilometres a chalky area
was entered on, covered with countless small rounded hills of chalky
limestone; these hills, of which there are literally millions, cover the
ground like haycocks in a field; they are generally about 20 metres
high, and up to 100 metres in diameter. No fossils were seen in the
rocks. At 70 kilometres the hills began to get smaller, becoming
presently mere chalk-mounds. A slight depression, about a kilometre
wide, with larger chalk hills, was crossed at 73 kilometres, beyond
which was a long stretch of flat ground composed of snow-white
limestone, chalky to fairly hard, unfossiliferous, with thin siliceous
bands. At 83 kilometres occasional low limestone hills were seen on
ground otherwise fairly level, strewn with flints. A camel-road going
south-east was crossed at 77½ kilometres. At 92 kilometres
numerous small gasteropod casts were noticed in the chalk, but no
other fossils. The plateau further on was seen to be formed of a hard
thin bed of siliceous limestone, with chalk below; it shows occasional
depressions with low hills within them. At 103 kilometres the great
belt of sand-dunes running N.W.-S.E. was entered on. Close to the
dunes and in the interspaces between them the plateau-rock is hard
semi-crystalline white to brownish limestone, with nummulites and
small gasteropod-casts. The sand contains a good deal of
calcareous matter in addition to the grains of quartz. The dunes have
a total width of about 3½ kilometres, some of them are of great
height, and the passage with heavily laden camels is not without
some difficulty.
Beyond the dunes the limestone is frequently siliceous,
weathered smoky-grey on surface, and crowded with nummulites;
the surface of the ground is generally sandy and flint-strewn. At 113
kilometres a narrow band of highly silicified, superficially blackened,
limestone is crossed. This band, which is only a few centimetres
broad, running north and south, stands up like a vein above the
plateau, and is evidently caused by infiltration of siliceous solutions
in a crack. In the silicified part of the limestone here the fossils can
be easily seen; they are chiefly corals. Further on, the plateau
consists of hard semi-crystalline and horny white limestone in which
fossils are not seen.
At 123 kilometres the course, hitherto nearly eastward, was
changed about 30° to the north, over the same hard semi-crystalline
limestone, with hillocks showing alternations of soft chalky beds with
harder ones. Some large flat-topped hills passed at 130 kilometres
consist of horizontal limestone-beds, the lower ones being chalky,
the upper ones hard, grey, and somewhat porous. Beyond this small
rounded flints are seen on the plain, and these continue, increasing
in number, till a low plateau in front is reached. At 134 kilometres a
broad camel-road going south-east was crossed, and at 135
kilometres the scarp of a higher plateau was ascended. The lower
plain has an altitude of 127 metres above sea-level, the top of the
plateau being some 70 metres higher; the ground however rises
gradually before reaching the escarpment, so that the actual rise at
the scarp is only about 40 metres. The beds passed through on the
ascent are—
Top.
1. Gravels of rounded flints and pebbles, thin covering.
2. White chalk, 2 metres.
3. Earthy limestone, 60 centimetres.
4. Red and white clays and marls, about 30 metres.
5. Sandy limestone, 20 centimetres.
6. Clayey limestone, 15 centimetres.
7. Sandy limestone, 60 centimetres.
8. Limestone conglomerate, 1 metre.
9. Red and yellow clays at base.

These beds appear to dip slightly eastward. No fossils were


seen.
The top of the plateau is a huge level gravel-plain, with white
chalky limestone showing through occasionally. Sometimes this rock
shows a loose tufaceous texture; at other places it is sandy and
fissile, and now and again it encloses small pebbles. A broad camel-
road going south-east was crossed at 145 kilometres, and another in
the same direction was crossed at 157 kilometres. At 162 kilometres,
the gravelly ground having gradually fallen to 179 metres above sea-
level, a turn so as to take again an eastward course was made, and
a long stretch of gently undulating gravelly ground, with white
limestone showing through in small patches, was traversed. A
camel-road going east was crossed obliquely at 185 kilometres. At
196 kilometres, low gravel-covered hills of some extent were passed.
The pebbles range up to 10 or 15 centimetres in diameter, and are
well-rounded. In the lower part of one of these hills, sandy chalky
limestone is seen cropping out through the gravel which covers the
slope. Then a little further on are other low hills of white and grey
nummulitic limestone, only thinly sprinkled with pebbles and sand.
Some of the beds here contain cylindrical nearly vertical holes, one
being measured and found to be 8 centimetres diameter and 45
centimetres deep.
Beyond the hills just mentioned is a further stretch of gravel, till at
199 kilometres a broad low ridge of white and creamy limestone,
crowded with large nummulites (N. gizehensis) is reached. The beds
of this ridge dip slightly west. The hollows of the ridge are full of
blown sand. Beyond is a flat depression, the floor of which is strewn
with small nummulites (N. curvispira), and with Ostrea and other
shells; then another small nummulitic ridge is crossed, after which
comes some hard smoky-grey silicified limestone, with much sand
on the surface. At 202 kilometres from the starting-point a slight
descent was made from the limestone on to a shelving gravelly tract
with some sand-dunes. Some 4 kilometres further on the wide belt of
low sand-dunes fringing the cultivated area west of Minia was
entered. This sandy tract has a width of about 4 kilometres; it
contains some patches of grassy land with pools of water.
The village of Nasl Nadiub Lengat, at the edge of the valley-
cultivation, was reached after covering 212 kilometres from the south
point of the oasis, the march having occupied about 60 hours.
To summarise the geology of the area crossed in the traverse just
described, it will be clear that although the beds which form the top
of the scarp at the south end of the oasis are probably of Upper
Cretaceous age, yet these beds are overlain by Eocene limestone
with nummulites at a short distance (about 10 kilometres) east of the
Baharia-Farafra road. It is probable that the entire stretch of
limestone-plateau between this point and Minia is Eocene,
nummulites being recorded both from near the centre of the tract and
from the edge of the desert near Minia. The softer chalky beds
traversed appear stratigraphically lower than the Nummulite-
limestone, and would seem to correspond to the chalky-limestones
with Operculina libyca and Lucina thebaica which occur so
constantly near the base of the Eocene in Kharga Oasis. The
gravels, as already remarked in referring to the Minia-Baharia road,
are of uncertain age, but are certainly Post-Eocene. The sand-dune
belt crossed near the centre of the tract is the same as that crossed
on the outward journey from Minia, and extends for a great distance
north and south.

Baharia-
The road from Baharia to Farafra, traversed by
Farafra Cailliaud in 1820,[26] and by Jordan in 1874[27], was
road. taken by a party of the Geological Survey in proceeding
to Farafra after surveying the west side of Baharia. The start was
made from the same point as the return traverse to Minia, viz., the
point of ascent of the Farafra road at the south scarp of Baharia (lat.
27° 48′ 13″ N., long. 28° 32′ 49″ E. of Greenwich), and the general
course taken was in a direction 30°-40° west of south. A second,
almost disused, road from Baharia to Farafra ascends the western

You might also like