
Lecture Notes in Computer Science 675

Edited by G. Goos and J. Hartmanis

Advisory Board: W. Brauer D. Gries J. Stoer


Anne Mulkers

Live Data Structures


in Logic Programs

Derivation by Means of Abstract Interpretation

Springer-Verlag
Berlin Heidelberg New York
London Paris Tokyo
Hong Kong Barcelona
Budapest
Series Editors

Gerhard Goos
Universität Karlsruhe
Postfach 69 80
Vincenz-Priessnitz-Straße 1
W-7500 Karlsruhe, FRG

Juris Hartmanis
Cornell University
Department of Computer Science
4130 Upson Hall
Ithaca, NY 14853, USA

Author
Anne Mulkers
Department of Computer Science, K.U. Leuven
Celestijnenlaan 200 A, B-3001 Heverlee, Belgium

CR Subject Classification (1991): F.3.1, D.3.4, I.2.2-3

ISBN 3-540-56694-5 Springer-Verlag Berlin Heidelberg New York


ISBN 0-387-56694-5 Springer-Verlag New York Berlin Heidelberg

This work is subject to copyright. All rights are reserved, whether the whole or part
of the material is concerned, specifically the rights of translation, reprinting, re-use
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other
way, and storage in data banks. Duplication of this publication or parts thereof is
permitted only under the provisions of the German Copyright Law of September 9,
1965, in its current version, and permission for use must always be obtained from
Springer-Verlag. Violations are liable for prosecution under the German Copyright
Law.
© Springer-Verlag Berlin Heidelberg 1993
Printed in Germany
Typesetting: Camera ready by author/editor
45/3140-543210 - Printed on acid-free paper
Preface

Abstract interpretation is a general approach for program analysis to discover
at compile time properties of the run-time behavior of programs, as a basis to
perform sophisticated compiler optimizations. Several frameworks of abstract
interpretation for logic programs have been presented [11, 25, 27, 43, 48, 49, 51,
55, 57, 65, 81, 82]. A framework is a parameterized construction for the static
analysis of programs, together with theorems that ensure the soundness and
termination of the analysis. To complete the construction, an application specific
domain and primitive operations satisfying certain safety conditions must be
provided.
This book elaborates on an application for such a generic framework. The
framework used [11] belongs to the class of top-down abstract interpretation
methods and collects the information derived in an abstract AND-OR-graph that
represents the set of concrete proof trees that can possibly occur when executing
the source program. The starting point of the present work is the previously
developed application of integrated type and mode analysis [38]. The purpose
of that application was to guide the compiler, based on a characterization of the
entry uses of the program, to generate code that is more specific for the calls
that can occur at run time.
In an attempt to give further guidance to the compiler, we address the prob-
lem of compile-time garbage collection, the purpose of which is to (partially) shift
run-time storage reclamation overhead to compile time. In applicative program-
ming languages, the programmer has no direct control over storage utilization,
and run-time garbage collection is necessary. Garbage collection involves a pe-
riodic disruption of the program execution, during which usually a marking and
compaction algorithm is employed. Such schemes are expensive in time. Our
research shows that at compile time useful and detailed information about the
liveness of term substructures can be deduced which the compiler can use to
improve the allocation of run-time structures. In fact, it provides a technique
to automatically introduce destructive assignments into logic languages in a safe
and transparent way, thereby reducing the rate at which garbage cells are cre-
ated. The resulting system gets near to the methods of storage allocation used
in imperative programming languages.
The global flow analysis to be performed on Prolog source programs in order
to derive the liveness of data structures is constructed in three layers. The
first layer, consisting of the type and mode analysis, basically supplies the logical
terms to which variables can be bound. The two subsequent layers of the analysis
heavily rely on these descriptions of term values. The sharing analysis derives
how the representation of logical terms as structures in memory can be shared,
and the liveness analysis uses the sharing information to determine when a term
structure in memory can be live.

Acknowledgments
This book is based on my Ph.D. dissertation [59] conducted at the Department
of Computer Science of the K.U.Leuven, Belgium. The research presented has
been carried out as part of the RFO/AI/02 project of the Diensten voor de
programmatie van het wetenschapsbeleid, which started in November 1987 and
was aimed at the study of implementation aspects of logic programming: 'Logic
as a basis for artificial intelligence: control and efficiency of deductive inferencing
and parallelism'.
I am indebted to Professor Maurice Bruynooghe, my supervisor, for giving
me the opportunity to work on the project and introducing me to the domain
of abstract interpretation, for sharing his experience in logic programming, his
invaluable insights and guidance. I wish to thank Will Winsborough for many
helpful discussions, for his advice on the design of the abstract domain and
safety proofs and his generous support; Gerda Janssens for her encouragement
and support, and for allowing the use of the prototype for type analysis as the
starting point for implementing the liveness analysis; Professors Yves Willems
and Bart Demoen, for managing the RFO/AI/02 project and providing me with
optimal working facilities; Professor Marc Gobin, my second supervisor, and
Professors Baudouin Le Charlier and Danny De Schreye, for their interest and
helpful comments, and for serving on my Ph.D. thesis committee. I also want to
thank my family, friends and colleagues for their support and companionship.

Leuven, March 1993 Anne Mulkers


Contents

1 Introduction

2 Abstract Interpretation
2.1 Basic Concepts
2.2 Abstract Interpretation Framework
2.2.1 Overview of the Framework
2.2.2 Concrete and Abstract Domains of Substitutions
2.2.3 Primitive Operations
2.2.4 Abstract Interpretation Procedure
2.3 Example: Integrated Type and Mode Inference
2.3.1 Rigid and Integrated Type Graphs
2.3.2 Type-graph Environments
2.3.3 Primitive Operations for Type-graph Environments

3 Related Work
3.1 Aliasing and Pointer Analysis
3.2 Reference Counting and Liveness Analysis
3.3 Code Optimization

4 Sharing Analysis
4.1 Sharing Environments
4.1.1 Concrete Representation of Shared Structure
4.1.2 Abstract Representation of Shared Structure
4.1.3 The Concrete and Abstract Domains
4.1.4 Order Relation and Upperbound Operation
4.2 Primitive Operations
4.2.1 Unification
4.2.1.1 Xi = Xj
4.2.1.2 Xi = f(Xi1, ..., Xij)
4.2.2 Procedure Entry
4.2.3 Procedure Exit
4.3 Evaluation
4.3.1 Example: insert/3
4.3.2 Relevance of Sharing Edges
4.3.3 Imprecision in the Sharing Analysis
4.3.4 Efficiency of the Sharing Analysis

5 Liveness Analysis
5.1 Liveness Environments
5.1.1 Concrete Representation of Liveness Information
5.1.2 Abstract Representation of Liveness Information
5.1.3 The Concrete and Abstract Domains
5.1.4 Order Relation and Upperbound Operation
5.2 Primitive Operations
5.2.1 Unification
5.2.1.1 Xi = Xj
5.2.1.2 Xi = f(Xi1, ..., Xij)
5.2.2 Procedure Entry
5.2.3 Procedure Exit
5.3 Evaluation
5.3.1 Example: qsort/3
5.3.2 Precision of the Liveness Analysis
5.3.3 The Practical Usefulness of Liveness Information

6 Conclusion

Appendix: Detailed Examples
A.1 List of Types
A.2 append/3
A.3 nrev/2
A.4 buildtree/2 and insert/3
A.5 permutation/2 and select/3
A.6 split/3
A.7 qsort/2 and partition/4
A.8 sameleaves/2 and profile/2
A.9 sift/2 and remove/3

Bibliography

Chapter 1

Introduction

In conventional languages, such as C or Pascal, the programmer explicitly
controls the utilization of memory by means of declarations and destructive
assignments. For example, when reversing a linear list L, the list cells of the
original list can be reused to construct the reversed list in the case that the
original list is no longer needed for further computations. It is up to the
programmer to decide whether he needs to preserve the old list intact and
construct a reversed list which has only the list elements in common with the
list L (e.g. Rev_L1 in Figure 1.1), rather than reuse the list-constructor cells
of L as well (e.g. Rev_L2).

[Diagram not reproduced: the list L and the reversed lists Rev_L1 and Rev_L2.]

Figure 1.1: Reversing a linear list.
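
The two alternatives of Figure 1.1 can be illustrated with a small sketch. The Python fragment below is not taken from the original text: the class Cell and the functions from_items, to_items, reverse_copy and reverse_in_place are hypothetical names that we introduce only for illustration. Under these assumptions, reverse_copy corresponds to Rev_L1 (new list-constructor cells, shared elements) and reverse_in_place to Rev_L2 (the cells of L are reused by destructive assignment).

```python
class Cell:
    """A list-constructor cell: one element field and one tail pointer."""

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def from_items(items):
    """Build a linked list of Cells from a Python sequence."""
    lst = None
    for item in reversed(items):
        lst = Cell(item, lst)
    return lst


def to_items(lst):
    """Collect the elements of a linked list into a Python list."""
    items = []
    while lst is not None:
        items.append(lst.head)
        lst = lst.tail
    return items


def reverse_copy(lst):
    """Like Rev_L1: allocate fresh cells; only the elements are shared."""
    result = None
    while lst is not None:
        result = Cell(lst.head, result)   # a new cell for every element
        lst = lst.tail
    return result


def reverse_in_place(lst):
    """Like Rev_L2: redirect the tail pointers; no cell is allocated."""
    result = None
    while lst is not None:
        rest = lst.tail
        lst.tail = result                 # destructive assignment
        result = lst
        lst = rest
    return result
```

After a call to reverse_in_place, a variable that still refers to the first cell of the original list sees only the last cell of the reversed list. The in-place variant is therefore safe only when the original list is no longer needed.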

Applicative languages, in their pure form, do not have destructive assignments.
Type declarations are also often absent. The declarative nature of these
languages is often cited as an important advantage, which allows programmers
to focus on the logic of the problems they have to solve, rather than on more
technical aspects such as search control and efficient memory usage.
Unfortunately, the performance of current implementations of applicative
languages does not yet compare well with that of procedural languages. To
achieve better utilization of memory, global flow analysis techniques are being
developed that are concerned with determining the type and liveness of data
structures that are dynamically created during program execution. Knowledge
about the lifetime of data structures guides the compiler in the generation of
target code to reuse heap storage that is no longer accessible from program
variables, i.e. to introduce destructive operations and avoid the copying of
data structures that have no subsequent references.

append(nil, _Y, _Y).
append([_E | _U], _Y, [_E | _W]) :- append(_U, _Y, _W).
nrev(nil, nil).
nrev([_E | _U], _Y) :- nrev(_U, _KU), append(_KU, [_E], _Y).

Program 1.1: nrev/2 (Naive reverse)

In this book, we address the problem of liveness analysis for the class of pure
Horn clause logic programs. The language considered has a countable set of
variables (Vars), and countable sets of function and predicate symbols. A term
is a variable, a constant, or a compound term $f(t_1, \ldots, t_n)$ where $f$ is
an $n$-ary function symbol and the $t_i$ are terms. An atom has the form
$p(t_1, \ldots, t_m)$ where $p$ is an $m$-ary predicate symbol and the $t_i$ are
terms. A body is a (possibly empty) finite conjunction of atoms, written
$A_1, \ldots, A_n$. A clause consists of an atom (its head) and a body and is
written A :- B. A program consists of a finite number of clauses. A query or
goal consists of a body only, written ?- B. We assume that
the reader is acquainted with the basic terminology of logic programming and the
execution mechanism of Prolog which is based on unification and backtracking.
Features such as assert and retract are not considered, i.e. we assume that any
source code for the predicates that can be executed at run time is available to
the compiler.
The handling of data structures is very flexible in Prolog. Data manipulation
(record allocation as well as record access and parameter passing) is achieved
entirely via unification. An optimizing compiler can translate general unification
to more conventional memory manipulation operations if information is available
about the mode of use of the predicates. When at run time a compound term
becomes accessible for the first time, we can say the term is being constructed.
When a pattern is matched against a compound term that is already accessible,
we can say the components of the term are being selected. In many cases,
integrated type and mode analysis makes it possible to predict at compile time
whether a unification is a selection rather than a construction operation.
Selection statements in particular are good candidates to check for the possible
creation of garbage cells, i.e. cells that have no further references.
Consider the Prolog Program 1.1 for naive list reversal. We use the convention
that variable names start with an underscore. If we assume that queries to
nrev/2 are restricted to have as first argument a list that is no longer referenced
after the call, and as second argument a free variable to return the output, then
it is possible to generate target code for this program that allocates no new
list-constructor cells, but rather reuses the list cells of the first argument.
Indeed, under this assumption, the integrated type and mode analysis will infer
that each call to the recursive clause of nrev/2 has as its first argument a list,
and as second argument a free variable. The unification of the call with the
clause head selects the head and tail of the first argument list. The principal
list-constructor cell of this list, by contrast, has no subsequent references in
the clause following the unification of the call with the clause head. This means
that the compiler can recognize the principal list cell as garbage and generate
target code that reuses it. For instance, consider the call to append/3 made by
the same clause. A single element list [_E] needs to be constructed. Instead of
allocating a new cell, the compiler can reuse the garbage cell that was detected.
Note that the problem is more complex if there may be multiple references to
the cells of the input list. Most implementations of unification unify a variable
and a compound structure by making the variable a reference to the structure,
not a copy of the structure. The representations in memory of the logical terms
to which variables can be bound typically share some of their structure: while
the denoted terms make up a forest of trees, their representations form a more
general directed acyclic graph. This is why in general the sharing analysis plays
a crucial part in the liveness analysis.
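
A minimal sketch of why multiple references matter, written in Python under our own assumptions (Cell, to_items and demo are hypothetical names, not part of the original text). Two variables are bound to the same representation in memory, so reusing a cell through one of them silently changes the term seen through the other.

```python
class Cell:
    """A list-constructor cell: one element field and one tail pointer."""

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail


def to_items(lst):
    """Collect the elements of a linked list into a Python list."""
    items = []
    while lst is not None:
        items.append(lst.head)
        lst = lst.tail
    return items


def demo():
    x = Cell(1, Cell(2, Cell(3)))   # representation of the list [1, 2, 3]
    y = x                           # y is bound by reference, not by copy
    before = to_items(y)
    x.tail = None                   # "reuse" the principal cell of x as [1]
    after = to_items(y)             # y has changed as a side effect
    return before, after
```

Here demo() returns ([1, 2, 3], [1]): the reuse is unsafe because the cell was still reachable through y. A sharing analysis is what allows a compiler to rule out such situations before introducing destructive updates.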
In the above example, we can also infer that the first two arguments in
each call to append/3 will be lists and that the third argument will be a free
variable. Again, it is possible to detect that, after invocation of the recursive
clause of append/3, the principal cell of the first argument is garbage and can be
reused to construct the value of the third (output) argument. Thus, all list
constructions in this example can reuse garbage list cells, eliminating all
allocation operations. Since the reused cells would otherwise be garbage, we have
eliminated the garbage-collection overhead associated with the nrev/2 procedure.
Moreover, a compiler can detect that the element field of each reused list cell
already contains the value desired in the cell's new use. The operations filling in
these car fields can be eliminated from the generated target code. The resulting
code closely resembles how a programmer using an imperative language would
solve the problem of reversing a linear list of linked records.
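
The effect of this optimization on nrev/2 can be simulated as follows. This is only a sketch of the behavior that the generated target code might have, written in Python under our own assumptions: Cell, append_reuse and nrev_reuse are hypothetical names, and the sketch presupposes that the input list has no other references. The counter Cell.allocated makes it visible that no list cell is allocated during the reversal.

```python
class Cell:
    """A list-constructor cell; `allocated` counts every cell ever created."""

    allocated = 0

    def __init__(self, head, tail=None):
        Cell.allocated += 1
        self.head = head
        self.tail = tail


def append_reuse(x, y):
    """append/3 where the principal cell of x is reused for the output."""
    if x is None:                       # append(nil, _Y, _Y).
        return y
    # The head field already holds _E, so only the tail field is rewritten.
    x.tail = append_reuse(x.tail, y)
    return x


def nrev_reuse(lst):
    """nrev/2, assuming that lst is not referenced anywhere else."""
    if lst is None:                     # nrev(nil, nil).
        return None
    cell, rest = lst, lst.tail          # selection of head cell and tail
    reversed_rest = nrev_reuse(rest)
    cell.tail = None                    # reuse the garbage cell as [_E]
    return append_reuse(reversed_rest, cell)
```

Reversing a three-element list with nrev_reuse leaves Cell.allocated unchanged and returns a list built from exactly the cells of the input. Only tail fields are written; the element fields are never touched.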
In the present work, we propose an abstract domain and operations to analyze
the liveness of data structures within a framework of abstract interpretation.
Chapter 2 presents the principles of abstract interpretation for logic programs,
and the application of type and mode analysis on which the domain for liveness
analysis is based. In Chapter 3, we discuss work related to the application of
compile-time garbage collection in the context of both logic and functional
programming languages. In Chapter 4, we formalize an abstract interpretation for
analyzing how the terms to which program variables are bound at run time can
share substructure in storage. We also augment the usual concrete semantics
with information about sharing of term structures and discuss whether any
implementation commitments are implied. As argued above, the sharing analysis
constitutes a prerequisite for the liveness analysis. The latter is presented in
Chapter 5. In both Chapters 4 and 5, the emphasis is mainly on the precision
and on the soundness of the results that can be obtained, rather than on the
efficiency of the analysis. Due to imprecision that is inherent to the global
analysis algorithms, not all garbage cells can be detected in arbitrary cases. We
will extensively discuss the strength of the analyses that are proposed.
The study of code optimization schemes that explicitly reclaim or reuse
garbage cells is beyond the scope of the present book. In [52], Mariën et al.
discussed some preliminary experiments on code optimization based on liveness
information. Only opportunities for local reuse of storage cells are considered,
i.e. reuse within the same clause where a cell is turned into garbage. Non-local
reuse would require extra run-time data areas to keep track of the free space.
Although possible in principle, non-local reuse therefore will be less beneficial for
code optimization. The reuse of storage also introduces some new requirements
on the trailing mechanism of standard Prolog implementations that will affect
the performance. We will briefly discuss these issues in Sections 3.3 and 5.3.3.
Chapter 2

Abstract Interpretation

In this chapter, we first set forth the basic principles of abstract interpretation.
Then, we sketch one framework in particular, as it will be used in the remainder
of the book. Finally, we describe an application, namely type and mode analysis,
which constitutes the first layer of the global flow analysis to derive the liveness
of data structures. A more detailed introduction to abstract interpretation of
declarative languages can be found in [1].

2.1 Basic Concepts
Static program analysis is a general technique for deriving properties of the
run-time behavior of a program. The information obtained in this way can be
used to drive the optimization phase of a compiler, or to guide source level
program transformation and program development tools (e.g. for debugging).
Often, program analysis can be viewed as executing the program over a symbolic
or abstract domain of data descriptions, instead of over the normal concrete
data domain, and therefore it is called abstract interpretation. To mimic the
concrete execution of a program, the basic operations defined on the standard
domain are replaced by abstract operations defined on the abstract domain. The
resulting flow analysis produces a so-called abstract semantics: for each possible
point of control in the program, it gives a finite description of the set of data
states that the program could be in when execution passes through that point.
The program properties that compiler optimizations are based on are usually
undecidable. Since static analyses are expected to be finitely computable, the
data descriptions will be imprecise in general. The abstract interpretation is
said to be sound if the data descriptions computed for each program point give
upper approximations of the set of concrete data states that can occur during
program execution.
Patrick and Radhia Cousot [24] provided a general framework for data-flow
analysis problems of imperative languages, and they defined conditions which
ensure the soundness of an abstraction. Based on that work, a variety of abstract
interpretation frameworks have been developed for the specification and
verification of analyses for logic programs [11, 25, 27, 43, 48, 49, 51, 55, 57, 65, 81, 82].
A framework is a formally based, generic construction for program analysis that
provides a basis for sound optimizations. To this end, it includes theorems that
ensure the safety and termination of the analysis if the application dependent
domains and operations supplied to complete the construction obey certain safety
requirements.
The standard theory of lattices provides the conceptual framework for pro-
gram analysis. In order to apply the method for a new application, the space
of properties (the abstract domain) capturing the information of interest should
be a complete lattice (i.e. a set with a partial ordering such that each subset
has a least upper bound and a greatest lower bound), and the functions used
should be monotone or order-preserving on the lattice. If these conditions are
satisfied, then the Tarski-Knaster Fixpoint Theorem [71] guarantees the exis-
tence of a solution for the fixpoint problem posed by the abstract semantics of
the program.
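
As an illustration of how such a fixpoint might be computed in practice, the following Python sketch iterates a monotone function from the bottom element until the value stabilizes. The fixpoint theorem only guarantees existence; the iteration shown here additionally assumes a lattice of finite height so that it terminates. The names least_fixpoint, EDGES and step are hypothetical, and the example lattice (sets of graph nodes ordered by inclusion) is chosen by us for brevity.

```python
def least_fixpoint(f, bottom):
    """Iterate f from bottom until the value no longer changes.

    Assumes that f is monotone and that the lattice has finite height.
    """
    x = bottom
    while True:
        nxt = f(x)
        if nxt == x:
            return x
        x = nxt


# Hypothetical example: subsets of graph nodes, ordered by inclusion.
EDGES = {"a": {"b"}, "b": {"c"}, "c": {"a"}, "d": {"a"}}


def step(reached):
    """Monotone function: the start node plus all successors of `reached`."""
    result = {"a"}
    for node in reached:
        result |= EDGES.get(node, set())
    return frozenset(result)
```

Calling least_fixpoint(step, frozenset()) yields the set of nodes reachable from "a", namely {"a", "b", "c"}. An abstract interpreter follows the same pattern, with abstract substitutions in place of node sets.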
Let $(C, \sqsubseteq)$ be the concrete domain, and $(A, \leq)$ the domain of
descriptions. In the context of logic programs, the concrete domain $C$ will
typically be the powerset of the set of substitutions. A substitution is a set
$\sigma = \{X_1 \leftarrow \mathcal{K}_1, \ldots, X_c \leftarrow \mathcal{K}_c\}$,
where the $X_i$ are distinct variables and the $\mathcal{K}_i$ are terms. The set
$D = \{X_1, \ldots, X_c\}$ is called the domain of the substitution $\sigma$
(denoted as $dom(\sigma)$).
An interesting application of program analysis for logic programming languages
is mode analysis. Significant performance improvements can be achieved in
interpreters if it is known how the logical variables are used in a relation
(i.e. as input or output or a mixture of the two). For this application, the
abstract domain $A$ will describe the instantiation state of the terms assigned
to the variables by the concrete substitutions that are possible at run time. A
simple mode analysis can be based on the set $\{a, g, f\}$, partially ordered
such that $f < a$ and $g < a$.
The modes $f$, $g$ and $a$ are abbreviations for respectively free, ground and any.
An element of the abstract domain $A$ is either $\bot$ or an abstract substitution
of the form $\{X_i \leftarrow m_i \mid X_i \in D \mathrel{\&} m_i \in \{f, g, a\}\}$
that assigns a mode to each variable of some domain $D$.
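
The simple mode domain can be sketched in a few lines of Python. The order on modes is the one given in the text; the pointwise extension of the order and of the least upper bound to abstract substitutions over the same variables is our own assumption for this illustration, as are the names BOTTOM, mode_leq, mode_lub, subst_leq and subst_lub.

```python
BOTTOM = None   # stands for the bottom element of the abstract domain


def mode_leq(m1, m2):
    """Order on modes: f < a and g < a; f and g are incomparable."""
    return m1 == m2 or m2 == "a"


def mode_lub(m1, m2):
    """Least upper bound of two modes."""
    return m1 if m1 == m2 else "a"


def subst_leq(s1, s2):
    """Pointwise order on abstract substitutions over the same variables."""
    if s1 is BOTTOM:
        return True
    if s2 is BOTTOM:
        return False
    return all(mode_leq(s1[v], s2[v]) for v in s1)


def subst_lub(s1, s2):
    """Pointwise least upper bound; BOTTOM is the neutral element."""
    if s1 is BOTTOM:
        return s2
    if s2 is BOTTOM:
        return s1
    return {v: mode_lub(s1[v], s2[v]) for v in s1}
```

For example, combining {"_X": "g", "_Y": "f"} with {"_X": "g", "_Y": "g"} gives {"_X": "g", "_Y": "a"}: the mode of _X is kept, while the two incomparable modes of _Y are joined to any.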
The semantics of the descriptions is given by a concretization function
$\gamma : A \to C$, such that $\gamma(x)$ is the set of concrete objects described
by $x$. Conversely, given a collection of concrete objects, the abstraction
function $\alpha : C \to A$ yields the best overall description of the objects.
Suppose that the normal interpretation of a program is defined in terms of an
operator $F_C$, which has as abstract counterpart $F_A$ (e.g. for the execution
of logic programs, unification is the central operation). The traditional
abstract interpretation scheme requires that the following conditions are
satisfied.
1. $(C, \sqsubseteq)$ and $(A, \leq)$ are complete lattices,
2. $\gamma : A \to C$ and $\alpha : C \to A$ are monotonic, i.e.
   • $x \sqsubseteq y \Rightarrow \alpha(x) \leq \alpha(y)$, for all $x, y \in C$,
   • $x \leq y \Rightarrow \gamma(x) \sqsubseteq \gamma(y)$, for all $x, y \in A$,
3. $x = \alpha(\gamma(x))$ for all $x \in A$,
4. $x \sqsubseteq \gamma(\alpha(x))$ for all $x \in C$,
5. $F_C : C \to C$ and $F_A : A \to A$ are monotonic, and
6. $F_C(\gamma(x)) \sqsubseteq \gamma(F_A(x))$ for all $x \in A$.

Conditions 2, 3 and 4 ensure that C and A have a similar structure. The last
condition is called the safety or soundness requirement, and ensures that the
abstract operation mimics the concrete operation such that no false conclusion
can be drawn about the concrete operation's behavior. The framework described
in the next section requires a set of similar but slightly weaker conditions.
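
On a small finite example, conditions of this kind can be checked by brute force. The Python sketch below is a toy instance of our own making, not the domain used in the book: terms are classified as "var", "ground" or "partial", the concrete domain consists of all sets of such classes ordered by inclusion, and the abstract domain is bottom plus the three modes. The operation f_concrete / f_abstract (unification with a ground term) and all other names are hypothetical. The checks cover monotonicity of the abstraction and concretization functions, x = alpha(gamma(x)), x below gamma(alpha(x)), and the safety requirement that f_concrete(gamma(x)) lies below gamma(f_abstract(x)).

```python
from itertools import combinations

KINDS = ("var", "ground", "partial")     # toy classification of terms

# Concrete domain: all subsets of KINDS, ordered by inclusion.
C = [frozenset(c) for r in range(len(KINDS) + 1)
     for c in combinations(KINDS, r)]

# Abstract domain: bottom plus the three modes.
A = ("bot", "f", "g", "a")

GAMMA = {
    "bot": frozenset(),
    "f": frozenset({"var"}),
    "g": frozenset({"ground"}),
    "a": frozenset(KINDS),
}


def leq(x, y):
    """Order on A: bot below everything, a above everything."""
    return x == y or x == "bot" or y == "a"


def gamma(x):
    """Concretization: the set of term classes described by x."""
    return GAMMA[x]


def alpha(s):
    """Abstraction: the least x in A whose concretization covers s."""
    candidates = [x for x in A if s <= gamma(x)]
    return next(x for x in candidates if all(leq(x, y) for y in candidates))


def f_concrete(s):
    """Toy operation: unify every term in s with a ground term."""
    return frozenset({"ground"}) if s else frozenset()


def f_abstract(x):
    """Abstract counterpart of f_concrete."""
    return "bot" if x == "bot" else "g"


def check_conditions():
    """Return the truth values of the checked conditions as a dictionary."""
    return {
        "alpha monotone": all(leq(alpha(x), alpha(y))
                              for x in C for y in C if x <= y),
        "gamma monotone": all(gamma(x) <= gamma(y)
                              for x in A for y in A if leq(x, y)),
        "x = alpha(gamma(x))": all(alpha(gamma(x)) == x for x in A),
        "x below gamma(alpha(x))": all(x <= gamma(alpha(x)) for x in C),
        "safety": all(f_concrete(gamma(x)) <= gamma(f_abstract(x))
                      for x in A),
    }
```

In this toy instance every entry of check_conditions() is True. Replacing f_abstract by a function that maps every mode to "f" would make the safety check fail, which is exactly the kind of false conclusion the requirement is meant to exclude.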
Two basic classes of abstract interpretation of logic programs can be
distinguished. The top-down analyses are based on the standard operational
semantics of Prolog, i.e. SLD-resolution with a left-to-right computation rule
which always resolves the leftmost goal in the current resolvent. In order to
derive the states in which clauses and their calls can be reached, a query
specification is provided that characterizes the entry uses of the program. The
bottom-up analyses [18, 53] are based on the fixpoint semantics of logic
programs. Because the evaluation of a bottom-up semantics does not correspond to
the operational behavior of a program, it does not readily provide information
about call patterns describing how the clauses of a program will be used in
actual computations. Furthermore, the bottom-up computations tend to derive a
lot of redundant facts not actually needed to answer a specific query. To
overcome both deficiencies, the normal top-down Prolog execution is simulated in
recent work by first transforming the program and the query according to the
magic templates method and then executing the new program bottom up [18, 66].
This method is adopted from the field of deductive databases, where magic
templates are used to provide for efficient bottom-up evaluation of database
queries. The resulting bottom-up abstract interpretation method seems to be as
well suited to all sorts of applications as the corresponding top-down methods
using the same abstract domain and operations.

2.2 Abstract Interpretation Framework

In this section, we explain the framework of abstract interpretation that forms
the base for our analysis and was proposed by Bruynooghe in [11]. It is a
top-down analysis method adequate to determine an upper approximation of the set
of substitutions that can actually occur while executing the program, given an
initial goal. It is a parameterized construction: each application requires the
design of the application dependent components, namely the abstract domain
and the abstract interpretation operations that are used in the abstract
interpretation procedure, which is itself domain independent. The soundness of
the elementary abstract operations implies the soundness of the abstract
interpretation.

2.2.1 Overview of the Framework


In the abstract interpretation framework, data descriptions are computed at a
fixed number of program points: in each clause just after the head, between two
calls in the body and after the last call. Consider, for example, the clause

where the program points are indicated by circled labels.


A canonical form of the clauses is used¹: procedure calls and headings are
normalized such that the parameters are distinct variables. This is done by
breaking unifications up into sequences of simple operations. For example, the
append app/3 procedure in normal form is as follows.

app(_X,_Y,_Z) :- _X=nil, _Y=_Z.
app(_X,_Y,_Z) :- _X=_E._U, app(_U,_Y,_W), _Z=_E._W.
As a consequence, parameter passing becomes very simple. It reduces to passing
values from actual to formal parameters on procedure entry, and vice versa on
procedure exit. The real unification work is done by the sequence of primitive
unification operations.
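
The normalization step can be sketched as a small program transformation. The Python fragment below is only an illustration under our own assumptions, not the procedure of [11]: terms are encoded as strings (variables start with an underscore) or as tuples (functor, arg1, ..., argn), fresh variables are named _N1, _N2, ... and are assumed not to clash with program variables, and all names (is_var, normalize_clause, flatten, normalize_atom) are hypothetical.

```python
import itertools


def is_var(t):
    """Variables are strings that start with an underscore."""
    return isinstance(t, str) and t.startswith("_")


def normalize_clause(head, body):
    """Return (head, body) in which every call has distinct variables as
    arguments and every unification is X = Y or X = f(X1, ..., Xn)."""
    fresh = (f"_N{i}" for i in itertools.count(1))   # assumed fresh names

    def flatten(var, term, goals):
        """Emit var = term, naming nested non-variable arguments."""
        if not isinstance(term, tuple):
            goals.append(("=", var, term))
            return
        functor, *args = term
        new_args, pending = [], []
        for arg in args:
            if is_var(arg):
                new_args.append(arg)
            else:
                v = next(fresh)
                new_args.append(v)
                pending.append((v, arg))
        goals.append(("=", var, (functor, *new_args)))
        for v, arg in pending:
            flatten(v, arg, goals)

    def normalize_atom(atom):
        """Replace non-variable or repeated arguments by fresh variables."""
        pred, *args = atom
        seen, new_args, goals = set(), [], []
        for arg in args:
            if is_var(arg) and arg not in seen:
                seen.add(arg)
                new_args.append(arg)
            else:
                v = next(fresh)
                new_args.append(v)
                flatten(v, arg, goals)
        return (pred, *new_args), goals

    new_head, new_body = normalize_atom(head)
    for atom in body:
        new_atom, goals = normalize_atom(atom)
        new_body.extend(goals)      # unifications precede the call
        new_body.append(new_atom)
    return new_head, new_body
```

Applied to the recursive clause of app/3, the sketch yields the head app(_N1,_Y,_N2) with body _N1=_E._U, _N2=_E._W, app(_U,_Y,_W). This differs from the normal form shown above in the choice of variable names and in placing both head unifications before the recursive call; where a unification is placed is a design decision that this sketch does not try to optimize.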
The information inferred by the abstract interpretation process is gathered in
an abstract AND-OR-graph. Such a graph is the finite representation by means
of a rational tree² of a set of (possibly infinite) proof trees that correspond to
the concrete executions of the program. An 'Or'-node represents a call, and its
children represent the different clauses matching with the call. An 'And'-node,
on the other hand, represents a clause head, and its children the subgoals of
the clause body. A graphical representation of an AND-OR-graph for the app/3
program is given in Figure 2.1. Each program point is adorned with an element
of the abstract domain, an abstract substitution. The abstract substitution to
the left of the root, $\beta_0$, represents a set of initial queries and is supposed
to be specified by the user. For instance, consider the application of mode
analysis mentioned above. We would have
$\beta_0 = \{\mathtt{\_I1} \leftarrow g, \mathtt{\_I2} \leftarrow g, \mathtt{\_I3} \leftarrow a\}$
to specify that the first and second argument are ground terms in any initial
query and the third argument may be any term. The other abstract substitutions,
$\beta_1, \ldots, \beta_8$, represent sets of concrete substitutions over the
variables of the clause or query they adorn, and the soundness of the abstract
interpretation procedure will guarantee that they are safe over-approximations
of the set of concrete substitutions
that can actually occur at run time when control reaches that point for any
concrete derivation originating from one of the queries represented by the initial
abstract substitution. The abstract substitution to the left of a call is often
called the abstract call-substitution, and the one to the right is called the
abstract success-substitution of the call. E.g. in Figure 2.1, $\beta_5$ is both
the abstract success-substitution for the call _X=_E._U, and the abstract
call-substitution for the call app(_U, _Y, _W).
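
A possible in-memory representation of such a graph is sketched below in Python. The classes OrNode and AndNode, the function names, and the labels b0 to b8 (standing for the abstract substitutions adorning the program points of the app/3 example) are our own illustrative choices; treating a primitive unification as an 'Or'-node without children is a simplification, and the recursive call is left unexpanded rather than linked back to an ancestor.

```python
from dataclasses import dataclass, field


@dataclass
class OrNode:
    """A call, adorned with its call- and success-substitution labels."""
    call: str
    call_subst: str
    success_subst: str
    clauses: list = field(default_factory=list)   # AndNode children


@dataclass
class AndNode:
    """A clause head with the subgoals of the clause body as children."""
    head: str
    goals: list = field(default_factory=list)     # OrNode children


def app_graph():
    """Build the graph for a query to app/3 with labels b0 .. b8."""
    first = AndNode("app(_X,_Y,_Z)", [
        OrNode("_X=nil", "b1", "b2"),
        OrNode("_Y=_Z", "b2", "b3"),
    ])
    second = AndNode("app(_X,_Y,_Z)", [
        OrNode("_X=_E._U", "b4", "b5"),
        OrNode("app(_U,_Y,_W)", "b5", "b6"),
        OrNode("_Z=_E._W", "b6", "b7"),
    ])
    return OrNode("app(_I1,_I2,_I3)", "b0", "b8", [first, second])


def program_points(and_node):
    """List the labels of the program points of a clause, left to right."""
    points = [and_node.goals[0].call_subst]
    for goal in and_node.goals:
        points.append(goal.success_subst)
    return points
```

For the second clause, program_points returns ["b4", "b5", "b6", "b7"]: the success-substitution of each call coincides with the call-substitution of the next one, so a clause with n body goals has n + 1 program points.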
¹ We use a slightly simplified version of the framework developed in [11].
² A rational tree is a tree that has only a finite number of distinct subtrees [20].

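[Diagram not reproduced. Root 'Or'-node: $\beta_0$ app(_I1,_I2,_I3) $\beta_8$, with two 'And'-node children app(_X,_Y,_Z).
First clause: $\beta_1$ _X=nil $\beta_2$ _Y=_Z $\beta_3$.
Second clause: $\beta_4$ _X=_E._U $\beta_5$ app(_U,_Y,_W) $\beta_6$ _Z=_E._W $\beta_7$.]

Figure 2.1: AND-OR-graph for a query ?- app(_I1, _I2, _I3).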

The abstract AND-OR-graph provides the information required by program
specialization techniques [31, 81, 82]. In general, the AND-OR-graph can
contain several instances of a particular clause, but with different abstract
call-substitutions. The compiler can decide to generate target code specialized
for each version (taking care that the proper version is called at run time for
each program point) or to generate target code that is general enough to be used
for all the versions.
The abstract interpretation procedure to construct an abstract AND-OR-graph
(see Section 2.2.4) follows essentially a top-down strategy, starting from
the query form. It takes into account the depth-first left-to-right computation
rule of Prolog, but not the search rule nor the effects of cut (!). This means that
the order of the child nodes of an 'And'-node is relevant, but for an 'Or'-node,
the order is irrelevant. Also, there might be calls which are considered possible
but will never occur at run time; consequently, some imprecision might result.
Typical for the framework is that the abstract substitutions only mention
the variables of the query or clause that they adorn. As a consequence, each
call requires two unification-based operations, not just one as in standard SLD-
resolution. By confining attention to the state of the variables in one clause at
a time, the analysis can recognize the similarity between clause invocations that
differ only in the context of their use. This property is essential for the treatment
of recursive calls. It also makes possible an optimization of the implementation
such that a repeated computation of the same subgraph in different branches of
the AND-OR-graph is avoided.

When a clause is invoked, the analysis restricts attention to the portion of
the state pertaining to the variables in the clause. The restriction is achieved
by an operation called procedure entry, which abstracts the usual unification
between call and clause head. When control returns to the caller, the analysis
must reintroduce the variables in the calling environment³ and convey the
effect of the clause execution upon those variables. This is achieved by an operation
called procedure exit, which abstracts a second unification between call and
(instantiated) clause head. We further need a third primitive operation to handle
the nontrivial unification operations occurring in the source program⁴. The
safety requirements imposed by the framework (see Section 2.2.3) assure that the
outcomes of abstract procedure entry, procedure exit and primitive unification
cover the outcomes of their concrete counterparts. In the next section, we first
look at the requirements imposed on the abstract domain in order to get a finite
analysis.

2.2.2 Concrete and Abstract Domains of Substitutions


The set of variables in the clause (or query) that a concrete (or abstract)
substitution adorns is called the domain D of the substitution. The restriction of a
concrete substitution $\sigma$ to a set D of variables is defined as

\[ \sigma|_D = \{ X_i \leftarrow t_i \mid X_i \leftarrow t_i \in \sigma \wedge X_i \in D \}. \]
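
As a minimal illustration of the restriction, a concrete substitution can be modelled as a mapping from variable names to terms. The dictionary representation and the name `restrict` are assumptions made for this sketch only; they are not part of the framework.

```python
def restrict(sigma, domain):
    """Return sigma restricted to D: keep only the bindings X <- t with X in D."""
    return {var: term for var, term in sigma.items() if var in domain}


# Hypothetical bindings; terms are kept as plain strings for readability.
sigma = {"_X": "[1,2]", "_Y": "[3]", "_U": "[2]"}
restricted = restrict(sigma, {"_X", "_Y"})
print(restricted)
```

Variables of D that are not bound by the substitution simply do not appear in the result, which matches the set-comprehension reading of the definition.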

We denote the set of all idempotent⁵ concrete substitutions having domain D
by $ConcrSub_D$, and the set of all abstract substitutions having domain D by
$AbstrSub_D$. The concretization function $\gamma$, to be defined, is a mapping

\[ \gamma : AbstrSub_D \rightarrow 2^{ConcrSub_D}. \]

Although the framework does not refer explicitly to the abstraction function

\[ \alpha : 2^{ConcrSub_D} \rightarrow AbstrSub_D, \]

we assume such a function is given and allows the user to abstract the initial set
of queries.
The set $AbstrSub_D$ is not required to be finite, nor does it need to be a
complete lattice; the abstract interpretation framework of [11] imposes a slightly
weaker algebraic structure on $AbstrSub_D$ that still guarantees termination of the
abstract interpretation procedure. There must exist a binary relation $\le$ that is
reflexive and transitive (a preordering), and that satisfies the following property

\[ \forall \beta_1, \beta_2 \in AbstrSub_D : \beta_1 \le \beta_2 \Rightarrow \gamma(\beta_1) \subseteq \gamma(\beta_2). \]


3 An environment represents the part of the run-time control structure that records the
current bindings of variables to terms.
4 When the language is extended with other built-ins (e.g. var/1, is/2, atom/1, ...), a primitive
abstract operation has to be associated with each of them (see [21]).
5 A substitution $\sigma$ is idempotent if $\sigma\sigma = \sigma$.

We define an equivalence relation on the set $AbstrSub_D$ as follows

\[ \forall \beta_1, \beta_2 \in AbstrSub_D : \beta_1 \equiv \beta_2 \Leftrightarrow \beta_1 \le \beta_2 \wedge \beta_2 \le \beta_1. \]

The relation $\le$ can also be used modulo this equivalence relation, which yields
a partial order for the set of equivalence classes. The existence of a uniquely
determined representative for each equivalence class may be convenient, but is
not required by the framework.
Further requisites for the framework are

• An upperbound operator $Upp : AbstrSub_D \times AbstrSub_D \rightarrow AbstrSub_D$, such
  that

  \[ \forall \beta_1, \beta_2 \in AbstrSub_D : \beta_1 \le Upp(\beta_1, \beta_2) \wedge \beta_2 \le Upp(\beta_1, \beta_2). \]

• A maximal element $\beta_{max} \in AbstrSub_D$ (i.e. $\forall \beta \in AbstrSub_D : \beta_{max} \le \beta \Rightarrow \beta \equiv \beta_{max}$).

• A minimal element $\bot \in AbstrSub_D$ (i.e. $\forall \beta \in AbstrSub_D : \beta \le \bot \Rightarrow \beta \equiv \bot$)
  such that $\gamma(\bot) = \emptyset$.

• A subset $F\text{-}AbstrSub_D \subseteq AbstrSub_D$, such that $\bot, \beta_{max} \in F\text{-}AbstrSub_D$, and
  there does not exist an ascending chain for $\le$ in $F\text{-}AbstrSub_D$. An ascending
  chain for $\le$ is an infinite sequence of elements $(a_1, a_2, \ldots, a_n, \ldots)$ such that
  $a_1 < a_2 < \ldots < a_n < \ldots$ (here $<$ is defined as $\le$ but not $\equiv$). An operator
  $R : AbstrSub_D \rightarrow F\text{-}AbstrSub_D$ must be defined, satisfying the property
  $\forall \beta \in AbstrSub_D : \beta \le R(\beta)$.

We will use a generalization of the Upp operation for a finite set $\{\beta_1, \ldots, \beta_n\}$ of
abstract substitutions, which is defined in the obvious way.

\[ Upp(\{\beta_1, \ldots, \beta_n\}) = Upp(\beta_1, Upp(\ldots, Upp(\beta_{n-1}, \beta_n) \ldots)). \]

Note that for the sake of precision of the analysis, Upp should return a value as
small as possible (e.g. the least upperbound if AbstrSubD is a lattice).
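
To make the requirements on Upp concrete, the following sketch instantiates them for a toy mode domain. The four-element ordering (bot below g and f, both below a), the dictionary representation of abstract substitutions and all function names are assumptions chosen for illustration; the mode domains used in practice are richer.

```python
from functools import reduce

# Assumed toy mode ordering: bot <= g <= a and bot <= f <= a
# (g = ground, f = free, a = any).
def mode_leq(m1, m2):
    return m1 == m2 or m1 == "bot" or m2 == "a"

def mode_upp(m1, m2):
    if mode_leq(m1, m2):
        return m2
    if mode_leq(m2, m1):
        return m1
    return "a"  # g and f are incomparable; their only upper bound is a

def leq(beta1, beta2):
    """Pointwise preorder on abstract substitutions over the same domain."""
    return all(mode_leq(beta1[v], beta2[v]) for v in beta1)

def upp(beta1, beta2):
    """Pointwise upper bound of two abstract substitutions."""
    return {v: mode_upp(beta1[v], beta2[v]) for v in beta1}

def upp_set(betas):
    """Upp(b1, Upp(..., Upp(b_{n-1}, b_n)...)) for a non-empty list."""
    return reduce(lambda acc, beta: upp(beta, acc), reversed(betas))

betas = [
    {"_X": "g", "_Y": "g"},
    {"_X": "f", "_Y": "g"},
    {"_X": "bot", "_Y": "bot"},
]
joined = upp_set(betas)
print(joined)
```

Because this toy ordering is a lattice, `upp` returns the least upper bound, which is the most precise choice permitted by the requirement.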

2.2.3 Primitive Operations


In this section, we give the correctness conditions for the primitive abstract
operations that guarantee the soundness of the abstract interpretation procedure
of Section 2.2.4. The input for each operation consists of a partially constructed
AND-OR-graph, a (collection of) abstract substitution(s), and a call represented
by one of the nodes of the AND-OR-graph. The output of the operation consists
of a (set of) abstract substitution(s) and the updated AND-OR-graph. We
consider the AND-OR-graph as a global data structure for which the updating
is an implicit side-effect of the primitive operations.
As in [38], we benefit from the simplifications that are possible, compared
with the framework developed in [11], due to the normal form of the source
programs. Below, the most general unifier of two terms $K_1$, $K_2$ is denoted by
$mgu(K_1, K_2)$. We denote the set of variables that occur in a particular term $K$
as $Vars(K)$.

[Figure 2.2: Procedure entry extends the abstract AND-OR-graph.]

1. Procedure-Entry($\beta_{in}$, P) = $\{\beta_{in}^1, \ldots, \beta_{in}^p\}$
   Applied to a call P with abstract call-substitution $\beta_{in}$, the procedure-entry
   operation extends the abstract AND-OR-graph at the 'Or'-node representing
   the call P, by adding a branch for each clause $H^j$ :- $B_1^j, \ldots, B_{n_j}^j$
   ($1 \le j \le p$) defining the predicate P (see Figure 2.2), and it computes the
   abstract substitutions $\beta_{in}^1, \ldots, \beta_{in}^p$. The domain of the abstract substitution
   $\beta_{in}^j$ is the set of variables $D_j$ in the jth clause. The computation of the
   abstract substitutions is done in two steps out of convenience. First, the
   abstract call-substitution $\beta_{in}$ is restricted by the call AbstrRestrict($\beta_{in}$, P)
   to a substitution $\beta_{in}^{rstr} \in F\text{-}AbstrSub_{D_P}$, only referring to the domain $D_P$ of
   the call P⁶. This step is independent of the set of clauses defining P. In the
   second step, $\beta_{in}^{rstr}$ is renamed for the domain of variables for each of the
   clauses, and the local variables of the clause are initialized. For instance,
   for the application of mode analysis mentioned above, the local variables
   of the clause (i.e. the variables not occurring in the head) are initialized to
   the mode f (free).
The safety requirements that are sufficient for the correctness of the two
steps of the procedure-entry operation are respectively:

6 This is not essential, but it improves the performance; we assume that the procedure-entry
operation uses elements of $F\text{-}AbstrSub_{D_P}$ for the recursive predicates.

[Figure 2.3: Procedure exit computes $\beta_{out}$.]

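   (a) If $\sigma \in \gamma(\beta_{in})$ then $\sigma|_{D_P} \in \gamma(\beta_{in}^{rstr})$.

   (b) For $1 \le j \le p$, if $\sigma \in \gamma(\beta_{in}^{rstr})$ and $\theta = mgu(P\sigma, H^j)$, with $H^j$ the head
       of the properly renamed jth clause defining P, then $\theta|_{D_j} \in \gamma(\beta_{in}^j)$.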

Due to the normal form of the source programs, proving the safety of the
second step is rather straightforward for most applications. The first step
may be more involved, especially for the application of liveness analysis.
For the example of Figure 2.1, procedure entry computes $\beta_1$ and $\beta_4$ from
the call app(_11, _12, _13), represented by the root node, and the
call-substitution $\beta_0$.

2. Procedure-Exit($\beta_{in}$, $\{\beta_{out}^1, \ldots, \beta_{out}^p\}$, P) = $\beta_{out}$
   Procedure-exit is illustrated in Figure 2.3. It can be applied when the
   final abstract success-substitution $\beta_{out}^j$ has been derived for each clause
   defining P. The initial step computes an abstract substitution $\beta_{out}^{rstr}$ that
   has as domain $D_P$, the set of variables of the call P. We have $\beta_{out}^{rstr}$ =
   Upp($\{\beta^1, \ldots, \beta^p\}$), where $\beta^j$ ($1 \le j \le p$) is the renaming to $D_P$ of
   AbstrRestrict($\beta_{out}^j$, $H^j$). In the second step, the extension operation
   AbstrExtend reintroduces the other variables in the domain $D_{Env}$ of the
   calling environment (note: $D_P \subseteq D_{Env}$, where $D_{Env}$ is the domain of $\beta_{in}$
   and the calling clause C :- $C_1, \ldots, P, \ldots, C_c$), and makes explicit the effect on
   those variables of the execution of the procedure:
   $\beta_{out}$ = AbstrExtend($\beta_{in}$, $\beta_{out}^{rstr}$, P).

The safety requirements that are sufficient for the correctness of the two
steps of the procedure-exit operation are respectively:

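   (a) If for some j such that $1 \le j \le p$, $\sigma_{in} \in \gamma(\beta_{in})$ and $\sigma_j \in \gamma(\beta_{out}^j)$,
       and ($\exists \sigma$ over $Vars(P\sigma_{in})$ : $P\sigma_{in}\sigma = H^j\sigma_j$), with $H^j$ the head of the
       properly renamed jth clause defining P, and the only variables of
       (C :- $C_1, \ldots, P, \ldots, C_c$)$\sigma_{in}$ occurring in $\sigma$ also occur in $P\sigma_{in}$, then
       $(\sigma_{in}\sigma)|_{D_P} \in \gamma(\beta_{out}^{rstr})$.
   (b) If $\sigma_{in} \in \gamma(\beta_{in})$ and ($\exists \sigma$ over $Vars(P\sigma_{in})$) : $(\sigma_{in}\sigma)|_{D_P} \in \gamma(\beta_{out}^{rstr})$, and
       the only variables of (C :- $C_1, \ldots, P, \ldots, C_c$)$\sigma_{in}$ occurring in $\sigma$ also
       occur in $P\sigma_{in}$, then $(\sigma_{in}\sigma)|_{D_{Env}} \in \gamma(\beta_{out})$.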
   Proving the safety of the first step follows almost immediately from the
   properties of the Upp operation and the monotonicity of the concretization
   function $\gamma$. We will illustrate this in Section 2.3.3.
   For the example of Figure 2.1, procedure exit computes $\beta_8$ from the call
   represented by the root node, app(_11, _12, _13), the abstract
   call-substitution $\beta_0$, and the final abstract success-substitutions of the two clauses,
   $\beta_3$ and $\beta_7$.
3. AbstrUnify($\beta_{in}$, P) = $\beta_{out}$
   For programs in normal form, the unification call P can have either of two
   forms: $X_i = X_j$ or $X_i = f(X_{i_1}, \ldots, X_{i_k})$. Let $D_{Env}$ be the domain of $\beta_{in}$.
   The safety requirement that is sufficient for the correctness of the abstract
   unification operation is:

   • If $\sigma \in \gamma(\beta_{in})$ and executing $P\sigma$ instantiates the call by $\theta$, then
     $(\sigma\theta)|_{D_{Env}} \in \gamma(\beta_{out})$.

   For the example of Figure 2.1, abstract unification computes $\beta_5$ from the
   call _X=_E._U, and the call-substitution $\beta_4$.
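
The two steps of procedure entry can be sketched for the mode-analysis example as follows. This is a simplified illustration under stated assumptions: abstract substitutions are dictionaries from variable names to modes, call and head arguments are distinct variables (normal form), aliasing is ignored, and the names `abstr_restrict` and `procedure_entry` as written here are hypothetical helpers rather than the operations of [11].

```python
def abstr_restrict(beta_in, call_args):
    """Step 1: keep only the variables of the call."""
    return {var: beta_in[var] for var in call_args}

def procedure_entry(beta_in, call_args, clauses):
    """Step 2: rename to each clause head and initialize local variables to f.

    clauses is a list of (head_args, clause_vars) pairs.
    """
    beta_rstr = abstr_restrict(beta_in, call_args)
    results = []
    for head_args, clause_vars in clauses:
        beta_j = {var: "f" for var in clause_vars}  # locals start as free
        for call_var, head_var in zip(call_args, head_args):
            beta_j[head_var] = beta_rstr[call_var]  # renaming call -> head
        results.append(beta_j)
    return results

# The query app(_11, _12, _13) with beta_0 from the mode-analysis example.
beta_0 = {"_11": "g", "_12": "g", "_13": "a"}
app_clauses = [
    (["_X", "_Y", "_Z"], ["_X", "_Y", "_Z"]),
    (["_X", "_Y", "_Z"], ["_X", "_Y", "_Z", "_E", "_U", "_W"]),
]
beta_1, beta_4 = procedure_entry(beta_0, ["_11", "_12", "_13"], app_clauses)
print(beta_1)
print(beta_4)
```

In this sketch the head variables inherit the modes of the corresponding call arguments, while _E, _U and _W, which do not occur in the head, receive the mode f.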

2.2.4 Abstract Interpretation Procedure


We now sketch the abstract interpretation procedure to construct an abstract
AND-OR-graph. It is the application-independent kernel of the framework for
abstract interpretation. It is convenient to assume that queries consist of a single
call; this does not incur any loss of generality. The abstract AND-OR-graph for
a set of queries, described by the goal Q and the abstract call-substitution $\beta$, is
initialized with the root node representing the call Q and adorned to the left by
the abstract call-substitution $\beta$. The recursively defined abstract interpretation
procedure, outlined below, is then called for this initial abstract AND-OR-graph:
AbstrInterpretation($\beta$, Q).

Algorithm 2.1 AbstrInterpretation($\beta_{in}$, P)

Input: A program (a set of clauses), a partially constructed AND-OR-graph
and a call node in the AND-OR-graph representing a predicate P and
adorned to the left with an abstract call-substitution $\beta_{in}$.

Output: The abstract success-substitution $\beta_{out}$, and an updated AND-OR-graph
such that $\beta_{out}$ adorns the call node for P to the right.

We distinguish between four cases.

Case 1: If P is a primitive unification operation, then call AbstrUnify($\beta_{in}$, P).

Case 2: If P has no ancestor node in the partially constructed AND-OR-graph
for a call to the same predicate, then

• call Procedure-Entry($\beta_{in}$, P);
• call AbstrInterpretation for each of the calls in the bodies of each of the
  clauses $H^j$ :- $B_1^j, \ldots, B_{n_j}^j$ ($1 \le j \le p$) added to the AND-OR-graph by
  Procedure-Entry in the previous step, always working on a call whose
  abstract call-substitution is already computed;
• call Procedure-Exit($\beta_{in}$, $\{\beta_{out}^j \mid 1 \le j \le p\}$, P).

Case 3: If P has an ancestor node in the partially constructed AND-OR-graph
with a call P' for the same predicate, and if they have equivalent abstract
call-substitutions ($\beta_{in}^{rstr} \equiv \beta_{in}^{rstr'}$ up to renaming of variables), then
$\beta_{out}^{rstr}$ is a renaming of the abstract success-substitution $\beta_{out}^{rstr'}$ which is
computed by the first step of the Procedure-Exit procedure for the call
P', Procedure-Exit($\beta_{in}'$, $\{\beta_{out}^{j'} \mid 1 \le j \le p\}$, P'), using $\bot$ as
success-substitution for all the branches j for which $\beta_{out}^{j'}$ is not yet computed.
Next, $\beta_{out}$ is computed by means of the extension operation: $\beta_{out}$ =
AbstrExtend($\beta_{in}$, $\beta_{out}^{rstr}$, P).
At some point, the Procedure-Exit procedure will be called for P', and a
new value will be obtained for $\beta_{out}^{rstr'}$. Let $\beta_1$ be the old and $\beta_2$ be the
new value. If $\beta_2 \le \beta_1$, then the computation proceeds as usual with the
old value $\beta_1$ and node P refers back to ancestor node P'; otherwise the
computation starts over at the call P using R(Upp($\beta_1$, $\beta_2$)) as new value
for $\beta_{out}^{rstr}$ and $\beta_{out}^{rstr'}$.

Case 4: If P has an ancestor node with a call P' to the same predicate, but the
abstract call-substitutions are not equivalent ($\beta_{in}^{rstr} \not\equiv \beta_{in}^{rstr'}$ up to renaming
of variables), then if $\beta_{in}^{rstr} \le \beta_{in}^{rstr'}$, the computation proceeds as in Case
3, otherwise, the subgraph of P' is recomputed using R(Upp($\beta_{in}^{rstr}$, $\beta_{in}^{rstr'}$))
as new value for $\beta_{in}^{rstr'}$.⁷

7 The application of R corresponds in fact to the use of a widening operation as discussed
by P. Cousot in [23].

Note that the iterations in Case 3 and Case 4 terminate because there does
not exist an ascending chain in $F\text{-}AbstrSub_{D_P}$. For proofs of the correctness and
termination of the above procedure, as well as for further details, we refer the
reader to [11].
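
The role of R and Upp in the termination argument can be illustrated by a small sketch of the restart loop of Case 3. The loop below is a generic stand-in, not the procedure of [11]: the domain (non-negative integers extended with infinity), the recomputation function `step`, and the cap used by `widen` are all assumptions chosen so that, without R, the iteration would never stabilize.

```python
import math

def iterate(step, bottom, leq, upp, widen):
    """Recompute a value until the new result is covered by the old one.

    When the new value is not below the old one, restart from R(Upp(old, new)).
    """
    old = bottom
    while True:
        new = step(old)
        if leq(new, old):
            return old
        old = widen(upp(old, new))

CAP = 3  # assumed bound: the finite-chain subset is {0, 1, 2, 3, infinity}

result = iterate(
    step=lambda x: x + 1,          # hypothetical recomputation that keeps growing
    bottom=0,
    leq=lambda a, b: a <= b,
    upp=max,
    widen=lambda x: x if x <= CAP else math.inf,  # plays the role of R
)
print(result)
```

Since `widen` maps every value above the cap to the maximal element, only finitely many distinct values can be visited, which mirrors the absence of ascending chains in the restricted subset.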

2.3 Example: Integrated Type and Mode Inference
In this section we discuss the type-inference system of Janssens and Bruynooghe
[12, 13, 38, 40] to illustrate the application of the abstract interpretation frame-
work. The type system also forms the base of our own abstract domain for
deriving the liveness of data structures. The type graphs inferred by the system
integrate the functionalities of both modes, which are descriptions of instantiation
states, and types, which are descriptions of sets of ground terms. The
mode information derivable from the type graphs makes it possible to distinguish the
selection operations from the construction operations at compile time, and the
type graphs themselves provide the structure information necessary to express
liveness of subterms at an adequate level of granularity.

2.3.1 Rigid and Integrated Type Graphs


Logic programming languages generally do not contain any type declarations,
in contrast with procedural languages. Well-known compiler techniques exist to
exploit type information to generate highly optimized code or to perform static
type checking. In the literature, several approaches to derive type information
automatically by means of a static program analysis have been suggested (for
instance in [44, 58, 69, 83, 85, 86]). Usually, types are seen as descriptions of
sets of ground terms. This is not the case for the type-inference system devel-
oped by Janssens and Bruynooghe. It introduces the concept of type graphs,
which are essentially deterministic top-down tree automata [75], and describe
sets of partially instantiated terms as well as ground terms. If combined with
information constraining how the same variable can occur at various positions
in an environment, they provide a very expressive domain for the analysis of the
structures to which clause variables can be bound at various points in a program.
In [38, 40], two kinds of type graphs are defined: rigid types and integrated
types. In this section, we summarize the basic results that were derived for both
type systems. The rigid types have the property of being closed under substi-
tution application and they have simple semantic operations. The integrated
types on the other hand provide a greater expressivity, and are therefore more
suited for the application of code optimization. In the sequel, we assume that
the reader is familiar with the basic terminology of graph theory (full definitions
of the concepts presented here can be found in [38]).

Definition 2.3.1
A rigid type graph is an ordered triple, $\mathcal{T} = (Nodes_{\mathcal{T}}, ForwardArcs_{\mathcal{T}}, BackArcs_{\mathcal{T}})$,
where $Nodes_{\mathcal{T}}$ is a finite, non-empty set of nodes, $ForwardArcs_{\mathcal{T}}$ and $BackArcs_{\mathcal{T}}$
are disjoint sets of directed arcs over $Nodes_{\mathcal{T}}$. $\mathcal{T}_{tree} = (Nodes_{\mathcal{T}}, ForwardArcs_{\mathcal{T}})$
is a tree called the underlying tree of the type graph. Each backarc, $(n, n') \in
BackArcs_{\mathcal{T}}$, is a directed arc connecting a node n to one of n's proper ancestors,
n', with respect to $\mathcal{T}_{tree}$.
Each node of $\mathcal{T}$ is labeled by a function symbol, 'Or', 'Max', $\bot$ or a primitive
type name (Int, Real, ...). Each node labeled with a function symbol has ordered
children corresponding to the function symbol's arguments⁸. Each node labeled
'Or' has two or more unordered children. Other nodes have no children.

[Figure 2.4: The graphical representation of some type graphs (T1, T2 and T3).]

Recall that a tree is a directed acyclic graph (DAG) satisfying the following
properties:
• There is exactly one node, called the root, with indegree 0.
• Every node except the root has indegree 1.
• There is a path from the root to each node.
The root of the tree $\mathcal{T}_{tree}$ (and of the type graph $\mathcal{T}$) is denoted Root($\mathcal{T}$). The
label of a node $n \in Nodes_{\mathcal{T}}$ is denoted Label(n). The i-th child of a functor node n
(with respect to $ForwardArcs_{\mathcal{T}} \cup BackArcs_{\mathcal{T}}$) is denoted Child(i, n). The elements
of $ForwardArcs_{\mathcal{T}}$ are called forward arcs, the elements of $BackArcs_{\mathcal{T}}$ backward
arcs. The information described by a type graph can also be expressed by a
context free grammar. We use the convention that (non-terminal or primitive)
type symbols start with a capital letter, and that the other functor node labels
consist of small letters only.
Example: Figure 2.4 gives the graphical representation of the following type
graphs:

    T1 ::= nil | '.'(Int, T1),
    T2 ::= a | f(a | b | T2) | g(Int),
    T3 ::= f(T3) | Max.

In the figure, the nodes of a type graph are represented by their label. We do not
explicitly indicate the direction of the forward arcs. We use the convention that
forward arcs are drawn downwards, while backward arcs are drawn upwards.
The root of the type graph is the topmost node.
8 A unique natural number n is associated with each function symbol f and referred to as
the arity of the symbol. A function symbol of arity 0 is referred to as a constant. The outdegree
of a node labeled with a function symbol is equal to the arity of the function symbol.
There exist several approaches to specify the meaning of a type graph, i.e.
the set of concrete terms that it represents. It is the set of terms recognized by
the type graph considered as a finite automaton, or the language accepted by
the context free grammar that can be derived from the type graph. Here, it is
understood that the nodes labeled 'Int' represent the set of integers, the nodes
labeled 'Real' the set of real numbers. The node labeled $\bot$ represents the empty
set of terms. The node labeled 'Max' represents the set of all terms, including
variables and partially instantiated terms.
Another approach is related to the scheme presented by Yardeni and Shapiro
in [85] where a subclass of logic programs, the Regular Unary Logic (RUL)
programs, is used as a specification language for a class of regular types. A RUL
program is a logic program satisfying certain syntactic rules: every predicate
is unary, no two head arguments of clauses of the same predicate are top-level
unifiable, every body goal is of the form p(_X) with p a predicate name and _X
a variable, every variable in a clause occurs exactly once in its head and once
in its body. Given a type graph as defined above, a unary Prolog predicate can
be associated with every node of the type graph, and a set of defining clauses
can be constructed according to the structure of the type graph. Consider for
example the type graph T1 of Figure 2.4, and number the nodes in a depth-first
left-to-right manner, assigning number 1 to the root node. If we associate with
node i the predicate $p_i$, then we get the following Prolog program.

    p1(_X) :- (p2(_X) ; p3(_X)).
    p2(nil).
    p3('.'(_X, _Y)) :- integer(_X), p1(_Y).

After a few partial evaluation steps, we obtain a RUL program.

    p1(nil).
    p1('.'(_X, _Y)) :- integer(_X), p1(_Y).
    p2(nil).
    p3('.'(_X, _Y)) :- integer(_X), p1(_Y).

The predicate associated with a node labeled $\bot$ would be 'fail', and a node
labeled 'Max' would be associated with a predicate generating all the terms that
can be constructed from the functors and constants in the source program, the
variables, and the terms in the primitive types (e.g. Int, Real).
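
The reading of a type graph as a recognizer of terms can also be sketched directly, without going through a logic program. The representation below is an assumption made for illustration: nodes are numbered as in the example above and stored as (label, children) pairs, back arcs are ordinary references to an ancestor, constants are Python strings, integers stand for the primitive type Int, and compound terms are tuples (functor, arg1, ..., argk).

```python
# Type graph T1 ::= nil | '.'(Int, T1), nodes numbered depth-first left-to-right.
T1 = {
    1: ("Or", [2, 3]),
    2: ("nil", []),
    3: (".", [4, 1]),   # the second child is the back arc to the root
    4: ("Int", []),
}

def functor_args(term):
    """Split a term into its functor name and argument list."""
    if isinstance(term, tuple):
        return term[0], list(term[1:])
    return term, []

def accepts(graph, node, term):
    """Does the subgraph rooted at node recognize the term?"""
    label, children = graph[node]
    if label == "Or":
        return any(accepts(graph, child, term) for child in children)
    if label == "Int":
        return isinstance(term, int)
    name, args = functor_args(term)
    return (
        name == label
        and len(args) == len(children)
        and all(accepts(graph, c, a) for c, a in zip(children, args))
    )

int_list = (".", 1, (".", 2, "nil"))      # the list [1, 2]
bad_list = (".", "a", "nil")              # the list [a]
print(accepts(T1, 1, int_list), accepts(T1, 1, bad_list))
```

The sketch omits the labels 'Max' and $\bot$; handling them would require a representation of variables, which is left out here.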
The meaning or denotation of the type graph T1 coincides with the meaning
of the predicate p1/1, namely, the set of all lists of integer elements. A formal
definition of the denotation of a type graph based on fixpoint semantics can
also be given directly, i.e. without the intermediate step of associating a Prolog
program with the type graph. A related approach is taken for instance by
Mishra [58], although in a different formalism, to define the interpretation of
the regular trees that form the basis of his type system. Let Vars be the set of
variables that can occur in the terms, and $S_{Max}$ the set of all the terms that can
be constructed from the functors and constants in the source program P, the
elements of Vars and the terms in the primitive types⁹. We assume that each
primitive type T represents a set $S_T$ of ground terms with depth one, and that
these sets are mutually disjoint.
An interpretation of a type graph $\mathcal{T}$ consists of the domain $S_{Max}$ of terms
and a mapping that associates a set of terms to each node.

\[ I : Nodes_{\mathcal{T}} \rightarrow 2^{S_{Max}}. \]

Equipped with the pointwise defined operations $\cup$ and $\cap$, and the subset ordering
$\subseteq$, the set of interpretations of a type graph $\mathcal{T}$ forms a complete lattice.

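\[ I_1 \cup I_2 : Nodes_{\mathcal{T}} \rightarrow 2^{S_{Max}} : n \mapsto I_1(n) \cup I_2(n), \]
\[ I_1 \cap I_2 : Nodes_{\mathcal{T}} \rightarrow 2^{S_{Max}} : n \mapsto I_1(n) \cap I_2(n), \]
\[ I_1 \subseteq I_2 \Leftrightarrow \forall n \in Nodes_{\mathcal{T}} : I_1(n) \subseteq I_2(n). \]

The empty interpretation, defined as

\[ \emptyset : Nodes_{\mathcal{T}} \rightarrow 2^{S_{Max}} : n \mapsto \emptyset, \]

is the bottom element.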
To define the meaning of a type graph using fixpoint concepts, we need
a monotonic operator on the complete lattice of interpretations for this type
graph. The next definition first introduces an auxiliary function: given a node
of the type graph and an interpretation I, it associates a set of terms with the
node depending on its label and the set of terms associated with its child nodes
by the given interpretation I.

Definition 2.3.2 For a type graph $\mathcal{T}$, a node $n \in Nodes_{\mathcal{T}}$ and I an interpretation
of $\mathcal{T}$,

\[
\mathcal{D}_{\mathcal{T}}(n, I) =
\begin{cases}
\emptyset & \text{if } Label(n) = \bot \\
S_{Int} & \text{else if } Label(n) = Int \\
S_{Real} & \text{else if } Label(n) = Real \\
S_{Max} & \text{else if } Label(n) = Max \\
\bigcup \{ I(m) \mid m \text{ is a child of } n \} & \text{else if } Label(n) = Or \\
\{ f(t_1, \ldots, t_k) \mid t_j \in I(Child(j, n)) \text{ for all } j : 1 \le j \le k,\ k \text{ the outdegree of } n \} & \text{else if } Label(n) = f
\end{cases}
\]

We get an operator on interpretations as follows.

\[ \mathcal{D}_{\mathcal{T}}(I) : Nodes_{\mathcal{T}} \rightarrow 2^{S_{Max}} : n \mapsto \mathcal{D}_{\mathcal{T}}(n, I). \]

9 Note that the Herbrand universe $U_P$ of a program P only contains ground terms, so
$U_P \neq S_{Max}$.

We now use this operator in a way that is similar to the use of the immediate
consequence operator $T_P$ when defining the meaning of logic programs [50]. The
operator $\mathcal{D}_{\mathcal{T}}$ can be shown to be monotonic and continuous (i.e. it preserves the
least upperbound of each increasing sequence). The powers of the operator are
defined as usual.

\[ \mathcal{D}_{\mathcal{T}} \uparrow 0(I) = I, \]
\[ \mathcal{D}_{\mathcal{T}} \uparrow i(I) = \mathcal{D}_{\mathcal{T}}(\mathcal{D}_{\mathcal{T}} \uparrow (i-1)(I)), \]
\[ \mathcal{D}_{\mathcal{T}} \uparrow \omega(I) = \bigcup_{i < \omega} \mathcal{D}_{\mathcal{T}} \uparrow i(I). \]

Here, $\omega$ is the first infinite ordinal and $\bigcup$ is the pointwise operation mentioned
above. According to the Fixpoint Theorem of Knaster and Tarski, the operator
has a least fixpoint, and because it is continuous, the least fixpoint is given by
$\mathcal{D}_{\mathcal{T}} \uparrow \omega(\emptyset)$. Note that $\mathcal{D}_{\mathcal{T}} \uparrow \omega(\emptyset)(n)$, called the denotation of the node n, can
be an infinite set of finite terms if the type graph $\mathcal{T}$ contains backward arcs.

Definition 2.3.3 (Concretization of a type graph) For a type graph $\mathcal{T}$, we
define

\[ TGConc(\mathcal{T}) = \mathcal{D}_{\mathcal{T}} \uparrow \omega(\emptyset)(Root(\mathcal{T})). \]

Example: For the type graphs of Figure 2.4, we have

    TGConc(T1) = {nil, [1, 2, 3], [4, 4], ...},
    TGConc(T2) = {a, f(a), f(b), f(f(a)), f(f(b)), g(7), f(g(5)), ...},
    TGConc(T3) = {a, f(Y), X, [a, Y, b], f(Z, Z, 3), ...} = $S_{Max}$.
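
The finite powers of the operator can be computed mechanically, which gives approximations from below of the concretization. The sketch below does this for T2; the node numbering, the tuple representation of compound terms, and in particular the finite set `INT_SAMPLE` standing in for the infinite set $S_{Int}$ are assumptions made so that each iterate stays finite.

```python
from itertools import product

# T2 ::= a | f(a | b | T2) | g(Int); node 4 has a back arc to the root.
T2 = {
    1: ("Or", [2, 3, 7]),
    2: ("a", []),
    3: ("f", [4]),
    4: ("Or", [5, 6, 1]),
    5: ("a", []),
    6: ("b", []),
    7: ("g", [8]),
    8: ("Int", []),
}

INT_SAMPLE = {5, 7}  # assumed finite stand-in for the set of integers

def d(graph, node, interp):
    """The auxiliary function: terms for one node, given an interpretation."""
    label, children = graph[node]
    if label == "Or":
        return set().union(*(interp[c] for c in children))
    if label == "Int":
        return set(INT_SAMPLE)
    if not children:
        return {label}  # a constant denotes itself
    return {(label, *args) for args in product(*(interp[c] for c in children))}

def power(graph, i):
    """The i-th power of the operator applied to the empty interpretation."""
    interp = {n: set() for n in graph}
    for _ in range(i):
        interp = {n: d(graph, n, interp) for n in graph}
    return interp

root_after_4 = power(T2, 4)[1]
print(root_after_4)
```

Each further iteration adds deeper terms such as f(f(a)) or f(g(5)); the denotation of the root is the union of all these finite stages.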

When using rigid types, it is not possible to distinguish the set of free variables
from the universe of all terms: both are represented by a node labeled 'Max'.
Consequently, when the value of a term is changed because of the instantiation
of some free variable subterm, then the old type is still a correct description
for the new term value. This property significantly simplifies the definition and
the correctness proofs of the operations for rigid type graphs. For instance, the
unification of two type graphs can be stated in terms of the intersection of the
underlying finite automata, for which well-known algorithms exist.

Property 2.3.4 The concretization of a rigid type graph is closed under
substitution, i.e. for a rigid type graph $\mathcal{T}$ and a substitution $\theta$ binding variables to
terms in $S_{Max}$, we have

\[ t \in TGConc(\mathcal{T}) \Rightarrow t\theta \in TGConc(\mathcal{T}). \]

The proof is given in [38].


In general, several distinct type graphs can have the same concretization.
For instance, both T1 defined above, and T4 defined as

    T4 ::= nil | '.'(Int, T4) | '.'(Int, nil),

represent the set of all lists of integer elements. This causes the algorithms that
check for equality or inclusion of type graphs (relations which are needed by the
abstract interpretation algorithm), to be quite complex and inefficient.
In order to reduce the cardinality of the domain of type graphs (which is
infinite), a number of restrictions on the structure of type graphs are introduced.
Some restrictions merely prohibit redundant nodes in the type graphs, while
others reduce the number of sets expressible by type graphs. The restricted
class of type graphs is better behaved than the unrestricted class, while still
highly expressive. In a first step compact type graphs are defined. Essentially,
these are type graphs having no nodes with an empty set as denotation, no
circular paths consisting only of nodes labeled 'Or', no 'Or' nodes with only one
predecessor also labeled 'Or', and no 'Or' nodes having a child labeled 'Max'.
The formal definition and a compaction algorithm can be found in [39]. For
example, the type graph T3 in Figure 2.4 is replaced by a single node labeled
'Max'. Another restriction that our algorithms rely upon is based on the next
definition.

Definition 2.3.5 For a type graph $\mathcal{T}$ and a node $n \in Nodes_{\mathcal{T}}$,

\[ PrincipalNodes(n) = \text{if } Label(n) \neq \text{'Or'} \text{ then } \{n\} \text{ else } \bigcup \{ PrincipalNodes(m) \mid m \text{ is a child of } n \}. \]

The principal label restriction states that, for all nodes n labeled 'Or',

\[ \forall m_1, m_2 \in PrincipalNodes(n) : m_1 \neq m_2 \Rightarrow Label(m_1) \neq Label(m_2). \]

Note that the compactness of the type graph assures that neither $m_1$ nor $m_2$
are labeled 'Max'. The principal label restriction limits the expressive power
of the type graphs, but it makes them deterministic as recognizers of terms.
For instance, the set of terms {f(a, b), f(c, d)} cannot be represented; it will be
approximated by the tuple-distributive set {f(a, b), f(c, d), f(a, d), f(c, b)} [58,
85]. Compact type graphs satisfying the principal label restriction are called
normal type graphs.
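
Both the principal label restriction and the tuple-distributive approximation can be checked on small examples. The sketch below assumes a hypothetical representation of type graphs as dictionaries from node numbers to (label, children) pairs and of terms as tuples; `tuple_distributive` only handles flat terms that share one functor and arity, which suffices for the example in the text.

```python
from itertools import product

def principal_nodes(graph, n):
    """PrincipalNodes(n): the non-'Or' nodes reachable through 'Or' nodes only."""
    label, children = graph[n]
    if label != "Or":
        return {n}
    return set().union(*(principal_nodes(graph, m) for m in children))

def satisfies_principal_label(graph):
    """No 'Or' node may have two distinct principal nodes with the same label."""
    for n, (label, _) in graph.items():
        if label == "Or":
            labels = [graph[m][0] for m in principal_nodes(graph, n)]
            if len(labels) != len(set(labels)):
                return False
    return True

def tuple_distributive(terms):
    """Closure of flat terms (functor, arg1, ..., argk) under argument mixing."""
    functor = next(iter(terms))[0]
    columns = zip(*(t[1:] for t in terms))
    return {(functor, *args) for args in product(*map(set, columns))}

# An attempt to represent exactly {f(a, b), f(c, d)}: two principal nodes labeled f.
EXACT = {
    1: ("Or", [2, 5]),
    2: ("f", [3, 4]), 3: ("a", []), 4: ("b", []),
    5: ("f", [6, 7]), 6: ("c", []), 7: ("d", []),
}
# The normal graph f(a | c, b | d), which denotes the tuple-distributive set.
NORMAL = {
    1: ("f", [2, 5]),
    2: ("Or", [3, 4]), 3: ("a", []), 4: ("c", []),
    5: ("Or", [6, 7]), 6: ("b", []), 7: ("d", []),
}
closure = tuple_distributive({("f", "a", "b"), ("f", "c", "d")})
print(satisfies_principal_label(EXACT), satisfies_principal_label(NORMAL))
print(sorted(closure))
```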
In the examples, we will also refer to the depth restriction for type graphs.
For every function symbol, the depth restriction indicates the maximum number
of nodes having that function symbol as label that can occur on a path in the
underlying tree $\mathcal{T}_{tree}$ of the type graph and that starts from the root node. The
restriction is required in order to get a finite subdomain of type-graph environments,
and to guarantee termination of the abstract interpretation procedure.
The depth bound will also turn out to be an important parameter to tune the
precision of the sharing and liveness analysis. The particular value of the bound
can be fixed for the whole program, or chosen on a per clause and functor basis
according to some heuristic. Normal type graphs satisfying the depth restriction
are called restricted type graphs. An algorithm to construct a restricted type
graph with a denotation that contains at least the denotation of some given
normal type graph, can be found in [39]. The report also presents a formal
correctness proof for the algorithm.
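
Checking whether a given type graph satisfies the depth restriction only requires a walk over the underlying tree. The following sketch assumes the same dictionary representation of nodes as (label, children) pairs, an explicit set of back arcs, and a single bound shared by all function symbols; the construction of a restricted type graph from a normal one, as described in [39], is not attempted here.

```python
# Labels that are not function symbols ("bot" stands for the bottom label).
SPECIAL = {"Or", "Max", "V", "Int", "Real", "bot"}

def functor_depths(graph, back_arcs, root):
    """For each function symbol, the maximal number of nodes carrying it
    on a path of the underlying tree that starts at the root."""
    best = {}

    def walk(node, counts):
        label, children = graph[node]
        if label not in SPECIAL:
            counts = dict(counts)
            counts[label] = counts.get(label, 0) + 1
            best[label] = max(best.get(label, 0), counts[label])
        for child in children:
            if (node, child) not in back_arcs:  # follow forward arcs only
                walk(child, counts)

    walk(root, {})
    return best

def satisfies_depth_restriction(graph, back_arcs, root, bound):
    return all(d <= bound for d in functor_depths(graph, back_arcs, root).values())

# T1 ::= nil | '.'(Int, T1) with its back arc from node 3 to the root.
T1 = {1: ("Or", [2, 3]), 2: ("nil", []), 3: (".", [4, 1]), 4: ("Int", [])}
T1_BACK = {(3, 1)}

# The same set of lists, with the recursion unrolled once.
UNROLLED = {
    1: ("Or", [2, 3]), 2: ("nil", []), 3: (".", [4, 5]), 4: ("Int", []),
    5: ("Or", [6, 7]), 6: ("nil", []), 7: (".", [8, 5]), 8: ("Int", []),
}
UNROLLED_BACK = {(7, 5)}

print(functor_depths(T1, T1_BACK, 1))
print(functor_depths(UNROLLED, UNROLLED_BACK, 1))
```

With a bound of 1, T1 is accepted while the unrolled variant is rejected, although both describe lists of integers; a larger bound admits more graphs and therefore allows more precise descriptions.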
Unfortunately, rigid types are not sufficiently precise to deal with partially
instantiated terms. For instance, in Figure 2.5, T1 is a rigid type graph
representing open-ended trees; an empty open-ended tree is a free variable, so its
type is 'Max'. However, T1 is not a compact type graph. The compaction
algorithm transforms the type graph T1 into the type graph T2, and causes all
structure information to be lost. The compact representation, corresponding to
mode a (any), does not provide any useful information for code optimization.
Therefore, rigid types are generalized to integrated types, which are defined in
the same way as rigid types, except that nodes with the label 'V' are allowed
additionally (see type graph T3 in Figure 2.5). The symbol 'V' represents the
set of all variables (i.e. if Label(n) = V, then $\mathcal{D}_{\mathcal{T}}(n, I) = Vars$). In general,
integrated types are not closed under substitution, and the operations become
more complex. Information capturing the dependencies between the values of
variables is needed to obtain sufficiently precise operations. We return to this
topic in Section 2.3.2, and in the sequel, by type we will mean an integrated type,
unless stated otherwise.

[Figure 2.5: Some type graphs representing open-ended binary trees.]
The structural properties of compactness and depth restriction are also im-
posed on integrated type graphs. In spite of these properties, there are still
different restricted type graphs representing the same set of terms. For instance,
there exist several alternative representations for the set of all terms.

$$\mathcal{T}_{Max1} ::= \mathit{Max}$$
$$\mathcal{T}_{Max2} ::= V \mid \mathit{Int} \mid \mathit{Real} \mid a \mid f(\mathcal{T}_{Max2}) \mid \ldots \mid t(\mathcal{T}_{Max2}, \mathcal{T}_{Max2})$$

Here a/0, f/1, ..., t/2 enumerates all the functor labels that occur in the source
program. Also, the set of restricted type graphs ordered by set inclusion of the
concretizations does not form a lattice, because there does not exist a least
upperbound operation. Fortunately, the set of normal type graphs has the algebraic
structure required by the framework of abstract interpretation. There exists a
preorder relation $\le$ for normal type graphs satisfying

$$\mathcal{T}_1 \le \mathcal{T}_2 \iff \mathit{TGConc}(\mathcal{T}_1) \subseteq \mathit{TGConc}(\mathcal{T}_2).$$

Algorithms implementing an Upp and a restriction operation R can be found
in [38], both for rigid and integrated types. It is also proved that there are no
infinite ascending chains under $\le$ in the subset of restricted type graphs. It is beyond
the present scope to discuss those operations in detail.
The type graphs that we use as the basis for the abstract domain for sharing
and liveness analysis, are a special kind of restricted integrated type graphs, in
the sense that we assume that none of the nodes is labeled 'Max' (a subgraph
equivalent with some explicit representation such as $\mathcal{T}_{Max2}$ must be used instead).
We also assume that the type graphs have as their root a node that is not labeled
'Or'. Each type graph that we use to represent an execution environment will
have its root labeled by either the tupling functor, $\langle\rangle$, or $\bot$.

2.3.2 Type-graph Environments

The type graphs are augmented with sets of constraints expressing dependencies
between variables. The constraints play an important role in the precision that
can be achieved by the primitive abstract operations. The abstract domains
that we propose for the sharing and liveness analysis in Chapters 4 and 5 do not
rely on any particular representation for the variable dependencies. In order to
give the reader some intuition about the issues involved, we describe the sets of
constraints as they are used in [40] for the domain of integrated types. Another
approach can be found in [83].
First, we introduce some additional notational conventions to select nodes in
a type graph.

Definition 2.3.6 For a type graph $\mathcal{T}$ and $n \in Nodes_{\mathcal{T}}$,

• $n/\epsilon$ selects the node $n$ itself,

• $n/(f_1,i_1)\ldots(f_{p-1},i_{p-1}).(f_p,i_p)$ (with $1 \le i_1 \le arity(f_1), \ldots, 1 \le i_p \le arity(f_p)$)
selects the $i_p$-th child of the principal node with
label $f_p$ of the node selected by $n/(f_1,i_1)\ldots(f_{p-1},i_{p-1})$.

We call $\epsilon$ and $(f_1,i_1)\ldots(f_{p-1},i_{p-1}).(f_p,i_p)$ selectors,^10 and "." denotes the
operation of selector concatenation.

If there are no 'Or'-nodes on the path between n and the selected node, then
the selector $(f_1,i_1)\ldots(f_{p-1},i_{p-1}).(f_p,i_p)$ can be abbreviated as $i_1 \ldots i_{p-1}.i_p$,
and is called a determinate selector. If there are 'Or'-nodes on the path, then
the selector is called non-determinate and cannot be abbreviated. To select a
subterm in a particular concrete term $\mathcal{K}$, an abbreviated determinate selector
can be used, e.g. for $\mathcal{K} = f(g(a, b))$, the expression $\mathcal{K}/1.2$ selects b. We only use
selectors that are well defined for the type graph (or term) considered.
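
The use of an abbreviated determinate selector on a concrete term can be sketched in a few lines of Python. The nested-tuple term representation and the helper name `select` are assumptions made for this illustration only; they do not come from [40].

```python
# Assumed representation: a compound term is a tuple (functor, arg1, ..., argn),
# a constant is a plain string.

def select(term, selector):
    """Follow an abbreviated determinate selector such as "1.2".

    Each step is a 1-based argument position; the empty selector
    returns the term itself.
    """
    steps = selector.split(".") if selector else []
    for step in steps:
        index = int(step)
        if not isinstance(term, tuple) or not 1 <= index < len(term):
            raise ValueError(f"selector step {index} is not defined for {term!r}")
        term = term[index]  # position 0 holds the functor
    return term


K = ("f", ("g", "a", "b"))   # the term f(g(a, b))
print(select(K, "1.2"))      # selects the constant b
```

The error branch mirrors the convention that only selectors that are well defined for the term considered are used.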
A type-graph environment $\mathcal{T}^e \in AbstrSub_D$, i.e. an abstract substitution over
a domain $D = \{X_1, \ldots, X_c\}$, is a tuple $(\mathcal{T}, SVal_{\mathcal{T}}, NUni_{\mathcal{T}}, PShr_{\mathcal{T}})$ such that

• The Type component $\mathcal{T}$ is a normal integrated type graph with its root
labeled by either the tupling functor $\langle\rangle$ of arity c, or $\bot$. In the former case,
the value of a variable $X_i$ in a concrete substitution represented by $\mathcal{T}^e$ is
meant to belong to the denotation of the node $Child(i, Root(\mathcal{T}))$. In the
latter case, the empty set of substitutions is represented.

• The Same Value component $SVal_{\mathcal{T}}$ is a set of SVal-constraints. An
SVal-constraint represents information about what subterms of the values for
the variables $X_i$ in a concrete substitution represented by $\mathcal{T}^e$ are known to
be identical. An SVal-constraint is of the form $\{X_i/s_1, X_j/s_2\}$, where
$i \ne j$, and $s_1$, $s_2$ are determinate selectors in the subgraphs^11 of $\mathcal{T}$
with roots respectively $Child(i, Root(\mathcal{T}))$, $Child(j, Root(\mathcal{T}))$. The
expression $\{X_i/s_1, X_i/s_2\}$ is allowed if $s_1 \ne s_2$.

• The Not Unique component $NUni_{\mathcal{T}}$ is a subset of the domain D. If
$X_i \in NUni_{\mathcal{T}}$, then the value of $X_i$ in the concrete substitutions represented by
$\mathcal{T}^e$ can have multiple occurrences of some free variable (i.e. internal sharing
of a variable).

• The Possible Sharing component $PShr_{\mathcal{T}}$ is a set of elements of the form
$\{X_i, X_j\}$, where $i \ne j$. An element $\{X_i, X_j\} \in PShr_{\mathcal{T}}$ expresses that
the values of $X_i$ and $X_j$ in the concrete substitutions represented by $\mathcal{T}^e$
can have occurrences of the same free variable (i.e. external sharing of a
variable).

The domain $F\text{-}AbstrSub_D$ is defined to be the subset of type-graph environments
of which the type component consists of a restricted integrated type graph.

^10 In Chapter 4, we will use a different definition of selectors, unrelated to this one.
The SVal constraints are used in the type analysis to prevent a potential
loss of precision due to the normal form of the source programs. The SVal
constraints express equality of structured terms, which does not necessarily imply
a shared representation of those structures. It is precisely information about
shared representations that plays a key role in the application of liveness analysis.
The NUni and PShr constraints provide information about possibly shared free
variables only. Therefore, we will develop a more specific formalism of constraints
in Chapter 4, concerning the sharing of structured terms.
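As explained in Section 2.2, a concretization function
$\mathit{TGEnvConc} : AbstrSub_D \rightarrow 2^{ConcrSub_D}$ has to be defined.

Definition 2.3.7 (Concretization of a type-graph environment) Given
a domain $D = \{X_1, \ldots, X_c\}$, and a type-graph environment
$\mathcal{T}^e = (\mathcal{T}, SVal_{\mathcal{T}}, NUni_{\mathcal{T}}, PShr_{\mathcal{T}}) \in AbstrSub_D$,

$$\mathit{TGEnvConc}(\mathcal{T}^e) = \left\{ \theta \in ConcrSub_D \;\middle|\; \begin{array}{l} \theta = \{X_i \leftarrow \mathcal{K}_i \mid 1 \le i \le c\} \text{ for } \mathcal{K} = \langle \mathcal{K}_1, \ldots, \mathcal{K}_c \rangle \in \mathit{TGConc}(\mathcal{T}) \\ \text{and } \theta \text{ satisfies the } SVal_{\mathcal{T}},\ NUni_{\mathcal{T}} \text{ and } PShr_{\mathcal{T}} \text{ constraints} \end{array} \right\}$$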

The fully formalized definitions can be found in [39]. Here, we restrict ourselves
to some examples of type-graph environments over the domain D = {X1, X2}.

Examples:

1. $\mathcal{T}_1^e = (\mathcal{T}_1, \emptyset, \emptyset, \emptyset)$ with $\mathcal{T}_1 ::= \langle f(a \mid b, a \mid c), g(a \mid b, Int \mid V) \rangle$. Although the
SVal-constraint set does not enforce any dependencies between the values of
$X_1$ and $X_2$, identical subterms may occur; e.g. the following substitutions
belong to TGEnvConc($\mathcal{T}_1^e$):
$\theta_1 = \{X_1 \leftarrow f(a, a), X_2 \leftarrow g(b, 3)\}$,
$\theta_2 = \{X_1 \leftarrow f(a, c), X_2 \leftarrow g(a, V)\}$.

^11 An algorithm for constructing the selected subgraph is in [40]. In some cases, nodes are
duplicated.
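2. $\mathcal{T}_2^e = (\mathcal{T}_1, \{\{X_1/1, X_2/1\}\}, \emptyset, \emptyset)$. The SVal-constraint enforces the first
arguments of the values of $X_1$ and $X_2$ to be identical, so $\theta_2 \in$ TGEnvConc($\mathcal{T}_2^e$),
but $\theta_1 \notin$ TGEnvConc($\mathcal{T}_2^e$).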

3. $\mathcal{T}_3^e = (\mathcal{T}_3, \{\{X_1/1, X_2/1\}\}, \{X_1\}, \{\{X_1, X_2\}\})$ where $\mathcal{T}_3 ::= \langle g(V, a \mid V, V), h(V, V, b) \rangle$.
For the free variables $U, W, Y \in Vars$, the following concrete
substitutions belong to TGEnvConc($\mathcal{T}_3^e$):
$\theta_3 = \{X_1 \leftarrow g(U, a, W), X_2 \leftarrow h(U, Y, b)\}$,
$\theta_4 = \{X_1 \leftarrow g(W, W, U), X_2 \leftarrow h(W, U, b)\}$.
Note that the SVal-constraint enforces the sharing of free variables and
that the NUni-component does not allow the value of $X_2$ to have multiple
occurrences of a free variable.
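
The constraint part of these examples can be checked mechanically. The following Python sketch tests whether a concrete substitution satisfies given SVal, NUni and PShr sets; it deliberately ignores the type component, so it covers only part of the concretization. The term representation (nested tuples, variables as capitalized strings) and all function names are assumptions made for this illustration, not the formal definitions of [39].

```python
from itertools import combinations


def is_var(t):
    # Assumption: free variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()


def var_occurrences(t):
    """List every variable occurrence in a term, with repetitions."""
    if is_var(t):
        return [t]
    if isinstance(t, tuple):
        return [v for arg in t[1:] for v in var_occurrences(arg)]
    return []


def select(term, selector):
    for step in selector.split("."):
        term = term[int(step)]
    return term


def satisfies(theta, sval, nuni, pshr):
    # SVal: the selected subterms must be identical.
    for (x, s1), (y, s2) in sval:
        if select(theta[x], s1) != select(theta[y], s2):
            return False
    # NUni: only the listed variables may hold a repeated free variable.
    for x, term in theta.items():
        occ = var_occurrences(term)
        if x not in nuni and len(occ) != len(set(occ)):
            return False
    # PShr: only the listed pairs may have a free variable in common.
    for x, y in combinations(theta, 2):
        shared = set(var_occurrences(theta[x])) & set(var_occurrences(theta[y]))
        if shared and frozenset((x, y)) not in pshr:
            return False
    return True


# Constraint sets of the third example above.
sval3 = [(("X1", "1"), ("X2", "1"))]
nuni3 = {"X1"}
pshr3 = {frozenset(("X1", "X2"))}

theta3 = {"X1": ("g", "U", "a", "W"), "X2": ("h", "U", "Y", "b")}
theta4 = {"X1": ("g", "W", "W", "U"), "X2": ("h", "W", "U", "b")}
print(satisfies(theta3, sval3, nuni3, pshr3), satisfies(theta4, sval3, nuni3, pshr3))
```

A substitution that binds $X_2$ to h(U, U, b) would be rejected by this check, because $X_2$ is not in the NUni set.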

Note: We use the symbol ::= if we specify a type graph by means of a grammar
rule instead of a triple as in Definition 2.3.1.
In general, if both $\{X_i/s_1, X_j/s_2\}$ and $\{X_i/s_1.s, X_j/s_2.s\}$ belong to some
SVal-constraint set, then the latter is in fact redundant. Similarly, if the
SVal-constraint set enforces the sharing of free variables, it may be possible to
simplify the NUni- or PShr-constraint sets. A normal form of the type-graph
environments can be computed such that most of the redundant components are
removed, while the denotation is left unchanged. It is beyond the present scope
to discuss the issue in detail (see [39]). Also in [39], it is proved that the algebraic
structure imposed by the framework is satisfied for the domain of type-graph
environments. For instance, a preorder relation $\le_{TG}$ and an upperbound operation
TGUpp are defined, and the following property is established.

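Property 2.3.8 For two type-graph environments $\mathcal{T}_1^e$ and $\mathcal{T}_2^e$ such that
$arity(Label(Root(\mathcal{T}_1))) = arity(Label(Root(\mathcal{T}_2)))$,

1. $\mathcal{T}_1^e \le_{TG} \mathcal{T}_2^e \Rightarrow \mathit{TGEnvConc}(\mathcal{T}_1^e) \subseteq \mathit{TGEnvConc}(\mathcal{T}_2^e)$,

2. $\mathcal{T}_1^e \le_{TG} \mathit{TGUpp}(\mathcal{T}_1^e, \mathcal{T}_2^e)$ & $\mathcal{T}_2^e \le_{TG} \mathit{TGUpp}(\mathcal{T}_1^e, \mathcal{T}_2^e)$.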


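A maximal element for $AbstrSub_D$ is $\mathcal{T}_\top^e = (\mathcal{T}_\top, \emptyset, D, D^2)$, where $\mathcal{T}_\top ::= \langle \mathcal{T}_{Max}, \ldots, \mathcal{T}_{Max} \rangle$,
with $\mathcal{T}_{Max}$ an explicit representation of the maximal type graph, and
$D^2 = \{\{X_i, X_j\} \mid X_i, X_j \in D \;\&\; i \ne j\}$. A minimal element of $AbstrSub_D$ is
$\mathcal{T}_\bot^e = (\mathcal{T}_\bot, \emptyset, \emptyset, \emptyset)$, where $\mathcal{T}_\bot ::= \bot$.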

2.3.3 Primitive Operations for Type-graph Environments


In [38, 39] primitive operations AbstrUnify, Procedure-Entry and Procedure-Exit
for type-graph environments are defined and proven to satisfy the safety
requirements as imposed by the framework of abstract interpretation. In this section,
we summarize some of the basic results. The operations do not deal explicitly
with the trivial case of $\mathcal{T}_\bot^e$, because the result of any abstract operation on the
bottom element is again $\mathcal{T}_\bot^e$. The bottom element represents that the program
point it decorates is never reached during program execution.
We first fix the notation for concrete substitutions that will be used in the
remainder of the book. For a fixed enumeration of program variables, $X_1, \ldots, X_n$,
a concrete term environment is a term $\mathcal{K} = \langle \mathcal{K}_1, \ldots, \mathcal{K}_n \rangle$, which represents the
concrete substitution $\theta = \{X_1 \leftarrow \mathcal{K}_1, \ldots, X_n \leftarrow \mathcal{K}_n\}$. The subterm $\mathcal{K}_i$ is also
represented as $\mathcal{K}/i$.
Recall that a unification call P can have either of two forms: $X_i = X_j$ or
$X_i = f(X_{i_1}, \ldots, X_{i_j})$. Therefore, AbstrUnify is defined as a function that is
polymorphic in its third argument:

TGUnify($\mathcal{T}_{in}^e$, i, j): returns the type-graph environment $\mathcal{T}_{out}^e$ resulting from
abstract unification of the i-th and j-th component of $\mathcal{T}_{in}$. This corresponds to
the basic operation $X_i = X_j$.

TGUnify($\mathcal{T}_{in}^e$, i, $f(i_1, \ldots, i_j)$): returns the type-graph environment $\mathcal{T}_{out}^e$ resulting
from abstract unification of the i-th component of $\mathcal{T}_{in}$ and a type graph
having root node labeled f and as sons the $i_1$-th, ..., $i_j$-th components of $\mathcal{T}_{in}$.
This corresponds to the basic operation $X_i = f(X_{i_1}, \ldots, X_{i_j})$.
The unification of type-graph environments is a rather complex operation where
the constraint sets play an important role. We do not present any details here,
but we will need the following safety results for type-graph environments when
defining the abstract operations for sharing and liveness analysis.
Theorem 2.3.9 (Safety of TGUnify(-, i, j) [38]) Let $\mathcal{T}_{in}^e$ be a type-graph
environment and $\mathcal{K}_{in} \in$ TGEnvConc($\mathcal{T}_{in}^e$). Let $1 \le i < j \le arity(Label(Root(\mathcal{T}_{in})))$.
If $\sigma = mgu(\mathcal{K}_{in}/i, \mathcal{K}_{in}/j)$ is not fail, $\mathcal{K}_{out} = \mathcal{K}_{in}\sigma$, and $\mathcal{T}_{out}^e$ = TGUnify($\mathcal{T}_{in}^e$, i, j),
then $\mathcal{K}_{out} \in$ TGEnvConc($\mathcal{T}_{out}^e$).

Theorem 2.3.10 (Safety of TGUnify(-, i, $f(i_1, \ldots, i_j)$) [38]) Let $\mathcal{T}_{in}^e$ be a
type-graph environment and $\mathcal{K}_{in} \in$ TGEnvConc($\mathcal{T}_{in}^e$). Let $1 \le i, i_1, \ldots, i_j \le arity(Label(Root(\mathcal{T}_{in})))$
such that $i, i_1, \ldots, i_j$ are pairwise distinct, and f is a
functor of arity j. If $\sigma = mgu(\mathcal{K}_{in}/i, f(\mathcal{K}_{in}/i_1, \ldots, \mathcal{K}_{in}/i_j))$ is not fail, $\mathcal{K}_{out} = \mathcal{K}_{in}\sigma$,
and $\mathcal{T}_{out}^e$ = TGUnify($\mathcal{T}_{in}^e$, i, $f(i_1, \ldots, i_j)$), then $\mathcal{K}_{out} \in$ TGEnvConc($\mathcal{T}_{out}^e$).
Before considering the Procedure-Entry and Procedure-Exit operations, we
introduce an auxiliary function. Recall that all Prolog programs are assumed to be
in normal form, so the arguments of a call (or of the head of a clause) can
be given as a subset $\{X_{i_1}, \ldots, X_{i_m}\}$ of the domain of the clause environment
$\{X_1, \ldots, X_n\}$, and an injection $i : \{1, \ldots, m\} \rightarrow \{1, \ldots, n\} : k \mapsto i_k$ can be used
to formally define the restriction of the clause environment to the arguments of
the call (or the head of the clause).

Definition 2.3.11 For a concrete term environment $\mathcal{K} = \langle \mathcal{K}_1, \ldots, \mathcal{K}_n \rangle$, and
an injection $i : \{1, \ldots, m\} \rightarrow \{1, \ldots, n\}$ (for $m \le n$), we define
$Proj(\mathcal{K}, i) = \langle \mathcal{K}_{i(1)}, \ldots, \mathcal{K}_{i(m)} \rangle$.
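
As an illustration of this projection, here is a minimal Python sketch. It assumes that a concrete term environment is stored as a tuple of its components (without the tupling functor) and that an injection is given as the list [i(1), ..., i(m)]; both choices, and the name `proj`, are made for this example only.

```python
def proj(env, injection):
    """Proj(K, i): keep the components K_i(1), ..., K_i(m) of a term environment.

    env       -- tuple of the n component terms (1-based positions in the text)
    injection -- list [i(1), ..., i(m)] of distinct positions in 1..n
    """
    if len(set(injection)) != len(injection):
        raise ValueError("the mapping is not injective")
    if any(not 1 <= k <= len(env) for k in injection):
        raise ValueError("position outside the environment")
    return tuple(env[k - 1] for k in injection)


# Environment <f(A), b, C, g(C)> restricted to a call p(X3, X1).
K = (("f", "A"), "b", "C", ("g", "C"))
print(proj(K, [3, 1]))
```

Note that the order of the result follows the argument order of the call, not the order of the clause variables.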
In the first step of procedure entry, Procedure-Entry($\mathcal{T}_{in}^e$, P), the type-graph
environment $\mathcal{T}_{in}^e$ is restricted by means of a function call TGRestrict($\mathcal{T}_{in}^e$, $i_{call}$)
to a type-graph environment $\mathcal{T}_{rstr}^e$ for the domain of the call P. Assume
$\mathcal{T}_{in}^e = (\mathcal{T}_{in}, SVal_{\mathcal{T}_{in}}, NUni_{\mathcal{T}_{in}}, PShr_{\mathcal{T}_{in}})$, and $\mathcal{T}_{in} ::= \langle T_1, \ldots, T_n \rangle$. Assume that
the call, $P(X_{i_1}, \ldots, X_{i_m})$, is determined by the injection $i_{call}$. Then
$\mathcal{T}_{rstr}^e = (\mathcal{T}_{rstr}, SVal_{\mathcal{T}_{rstr}}, NUni_{\mathcal{T}_{rstr}}, PShr_{\mathcal{T}_{rstr}})$, where $\mathcal{T}_{rstr} ::= \langle T_{i_{call}(1)}, \ldots, T_{i_{call}(m)} \rangle$, and
the constraint sets are reduced to the information pertaining to the variables
$X_{i_1}, \ldots, X_{i_m}$.

$$SVal_{\mathcal{T}_{rstr}} = \{\{W_a/s_1, W_b/s_2\} \in SVal_{\mathcal{T}_{in}} \mid W_a, W_b \in \{X_{i_1}, \ldots, X_{i_m}\}\},$$
$$NUni_{\mathcal{T}_{rstr}} = \{W_a \in NUni_{\mathcal{T}_{in}} \mid W_a \in \{X_{i_1}, \ldots, X_{i_m}\}\},$$
$$PShr_{\mathcal{T}_{rstr}} = \{\{W_a, W_b\} \in PShr_{\mathcal{T}_{in}} \mid W_a, W_b \in \{X_{i_1}, \ldots, X_{i_m}\}\}.$$
The safety requirement of the first step of procedure entry is stated in the fol-
lowing theorem.

Theorem 2.3.12 (Safety of TGRestrict [38]) Let $\mathcal{T}_{in}^e$ be a type-graph
environment and $\mathcal{K}_{in} \in$ TGEnvConc($\mathcal{T}_{in}^e$). Let $arity(Label(Root(\mathcal{T}_{in}))) = n$, and
$i_{call} : \{1, \ldots, m\} \rightarrow \{1, \ldots, n\}$ an injection (for $m \le n$), and $\mathcal{K}_{rstr} = Proj(\mathcal{K}_{in}, i_{call})$.
If $\mathcal{T}_{rstr}^e$ = TGRestrict($\mathcal{T}_{in}^e$, $i_{call}$), then $\mathcal{K}_{rstr} \in$ TGEnvConc($\mathcal{T}_{rstr}^e$).
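
The reduction of the three constraint sets in the first step of procedure entry amounts to a simple filter. The Python sketch below illustrates only that part; the restriction of the type component is not modeled. The encoding (an SVal-constraint as a frozenset of (variable, selector) pairs, a PShr element as a frozenset of variables) and the function name are assumptions for this illustration, not the definitions of [38].

```python
def restrict_constraints(sval, nuni, pshr, call_vars):
    """Keep only the constraints that mention call variables exclusively."""
    call_vars = set(call_vars)
    sval_r = {c for c in sval if {x for x, _ in c} <= call_vars}
    nuni_r = {x for x in nuni if x in call_vars}
    pshr_r = {p for p in pshr if p <= call_vars}
    return sval_r, nuni_r, pshr_r


# Clause domain {X1, ..., X4}; the call is p(X1, X3).
sval = {
    frozenset({("X1", "1"), ("X3", "2")}),
    frozenset({("X2", "1"), ("X3", "1")}),   # mentions X2: dropped
}
nuni = {"X2", "X3"}
pshr = {frozenset({"X1", "X3"}), frozenset({"X3", "X4"})}

sval_r, nuni_r, pshr_r = restrict_constraints(sval, nuni, pshr, ["X1", "X3"])
print(sval_r, nuni_r, pshr_r)
```

Dropping a constraint never removes concrete substitutions from the denotation, which is in line with the safety statement of the theorem above.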

In the second step of procedure entry, $\mathcal{T}_{rstr}^e$ is renamed for the domain of
variables for each of the clauses matching with the call P and the local variables
of those clauses are initialized. Assume that $\mathcal{T}_{rstr}^e = (\mathcal{T}_{rstr}, SVal_{\mathcal{T}_{rstr}}, NUni_{\mathcal{T}_{rstr}}, PShr_{\mathcal{T}_{rstr}})$,
and $\mathcal{T}_{rstr} ::= \langle T_1, \ldots, T_m \rangle$. Assume that the call is $P(X_1, \ldots, X_m)$,
and the j-th clause is $P(Z_1^j, \ldots, Z_m^j) \text{ :- } B_1^j, \ldots, B_{q_j}^j$ with domain
$\{Z_1^j, \ldots, Z_m^j, Z_{m+1}^j, \ldots, Z_{m_j}^j\}$. Then $\mathcal{T}_j^e = (\mathcal{T}_j, SVal_{\mathcal{T}_j}, NUni_{\mathcal{T}_j}, PShr_{\mathcal{T}_j})$, where
$\mathcal{T}_j ::= \langle T_1, \ldots, T_m, V, \ldots, V \rangle$, which reflects that the local variables
$\{Z_{m+1}^j, \ldots, Z_{m_j}^j\}$ are free variables on procedure entry. Also the constraint sets
are renamed for the domain of variables of the clause.

$$SVal_{\mathcal{T}_j} = \{\{Z_a^j/s_1, Z_b^j/s_2\} \mid \{X_a/s_1, X_b/s_2\} \in SVal_{\mathcal{T}_{rstr}}\},$$
$$NUni_{\mathcal{T}_j} = \{Z_a^j \mid X_a \in NUni_{\mathcal{T}_{rstr}}\},$$
$$PShr_{\mathcal{T}_j} = \{\{Z_a^j, Z_b^j\} \mid \{X_a, X_b\} \in PShr_{\mathcal{T}_{rstr}}\}.$$
This means that the sharing and same-value relationships existing for the actual
parameters are passed on to the formal parameters, but that the local variables
of the clause are not involved in any new sharing, nor in the same-value relation.
Proving the safety of the second step of procedure entry is straightforward.
Procedure exit, Procedure-Exit($\mathcal{T}_{in}^e$, $\{\mathcal{T}_{out_1}^e, \ldots, \mathcal{T}_{out_p}^e\}$, P), can be applied
when the final type-graph environment $\mathcal{T}_{out_j}^e$ has been derived for each clause
defining P. The initial step of procedure exit computes a type-graph environment
$\mathcal{T}_{rstr}^e$ for the domain of the call P. We have $\mathcal{T}_{rstr}^e$ = TGUpp($\{\mathcal{T}_1^e, \ldots, \mathcal{T}_p^e\}$),
where $\mathcal{T}_j^e$ is the renaming to the domain of the call of TGRestrict($\mathcal{T}_{out_j}^e$, $i_{head_j}$),
for $1 \le j \le p$. The injection $i_{head_j}$ specifies how to restrict the environment
of the j-th clause defining P to the head of that clause. Note that for the type
component and for a concrete term environment the renaming is done implicitly
by the injection.
The next theorem reformulates the safety requirement as introduced in
Section 2.2.3 in terms of type-graph environments and concrete term environments.
It states that if P is called with a concrete substitution represented by the term
environment $\mathcal{K}_{in} \in$ TGEnvConc($\mathcal{T}_{in}^e$) and is solved by means of the k-th
defining clause for P which returns the term environment $\mathcal{K} \in$ TGEnvConc($\mathcal{T}_{out_k}^e$),
then the projection of the latter term environment on the arguments of the head
of that clause is described by the abstract type-graph environment $\mathcal{T}_{rstr}^e$. The
safety of the first step of procedure exit follows from the safety of TGRestrict, the
properties of the TGUpp operation and the monotonicity of the concretization
function TGEnvConc.

Theorem 2.3.13 (Safety of first step of procedure exit) Let $\mathcal{T}_{in}^e$, and
$\mathcal{T}_{out_1}^e, \ldots, \mathcal{T}_{out_p}^e$ be type-graph environments, and $\mathcal{K}_{in} \in$ TGEnvConc($\mathcal{T}_{in}^e$),
$\mathcal{K} \in$ TGEnvConc($\mathcal{T}_{out_k}^e$) for some k such that $1 \le k \le p$. Let $arity(Label(Root(\mathcal{T}_{in}))) = n$
and for $1 \le j \le p$ : $m_j = arity(Label(Root(\mathcal{T}_{out_j})))$. Let m be the
arity of the call P, $i_{call} : \{1, \ldots, m\} \rightarrow \{1, \ldots, n\}$ an injection defining how
the environment of P is mapped into the environment of the calling clause,
and $i_{head_j} : \{1, \ldots, m\} \rightarrow \{1, \ldots, m_j\}$ the injection relating the environment
of the j-th clause defining P to the environment of its head. Suppose that
($\exists \sigma$ over $Vars(Proj(\mathcal{K}_{in}, i_{call}))$ : $Proj(\mathcal{K}_{in}, i_{call})\sigma = Proj(\mathcal{K}, i_{head_k})$), and the only
variables of $\mathcal{K}_{in}$ occurring in $\sigma$ also occur in $Proj(\mathcal{K}_{in}, i_{call})$. Then if
$\mathcal{T}_{rstr}^e$ = TGUpp($\{\mathcal{T}_1^e, \ldots, \mathcal{T}_p^e\}$), where $\mathcal{T}_j^e$ is the renaming to the domain of the call
of TGRestrict($\mathcal{T}_{out_j}^e$, $i_{head_j}$) for $1 \le j \le p$, it follows that
$Proj(\mathcal{K}, i_{head_k}) \in$ TGEnvConc($\mathcal{T}_{rstr}^e$).

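Proof. By the safety of TGRestrict (Theorem 2.3.12), we have
$Proj(\mathcal{K}, i_{head_k}) \in$ TGEnvConc($\mathcal{T}_k^e$). By the properties of TGUpp,
$\mathcal{T}_k^e \le_{TG}$ TGUpp($\{\mathcal{T}_1^e, \ldots, \mathcal{T}_p^e\}$) = $\mathcal{T}_{rstr}^e$.
By the monotonicity of TGEnvConc, we have TGEnvConc($\mathcal{T}_k^e$) $\subseteq$
TGEnvConc($\mathcal{T}_{rstr}^e$). It follows that $Proj(\mathcal{K}, i_{head_k}) \in$ TGEnvConc($\mathcal{T}_{rstr}^e$). □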

In the second step of procedure exit, the extension operation TGExtend
reintroduces the other variables of the domain of the calling environment and makes
explicit the effect on those variables of the execution of the procedure:
$\mathcal{T}_{out}^e$ = TGExtend($\mathcal{T}_{in}^e$, $\mathcal{T}_{rstr}^e$, $i_{call}$). Due to sharing existing prior to the call, the value
of variables not involved in the call can be changed too. Hence, the extension
operation has to derive how their type component and the constraint sets are
affected by executing the call. We do not present any details here.
The safety requirement of the second step is stated in the following theorem.

Theorem 2.3.14 (Safety of TGExtend [38]) Let $\mathcal{T}_{in}^e$, $\mathcal{T}_{rstr}^e$ be type-graph
environments and $\mathcal{K}_{in} \in$ TGEnvConc($\mathcal{T}_{in}^e$), $\mathcal{K}_{rstr} \in$ TGEnvConc($\mathcal{T}_{rstr}^e$). Let
$arity(Label(Root(\mathcal{T}_{in}))) = n$ & $arity(Label(Root(\mathcal{T}_{rstr}))) = m$ & $m \le n$, and
$i_{call} : \{1, \ldots, m\} \rightarrow \{1, \ldots, n\}$ an injection such that
($\exists \sigma$ over $Vars(Proj(\mathcal{K}_{in}, i_{call}))$ : $Proj(\mathcal{K}_{in}, i_{call})\sigma = \mathcal{K}_{rstr}$) and the only
variables of $\mathcal{K}_{in}$ occurring in $\sigma$ also occur in $Proj(\mathcal{K}_{in}, i_{call})$.
If $\mathcal{K}_{out} = \mathcal{K}_{in}\sigma$ and $\mathcal{T}_{out}^e$ = TGExtend($\mathcal{T}_{in}^e$, $\mathcal{T}_{rstr}^e$, $i_{call}$), then
$\mathcal{K}_{out} \in$ TGEnvConc($\mathcal{T}_{out}^e$).

For the development of an abstract domain for sharing and liveness analysis
(Chapters 4 and 5), we regard the set of type-graph environments and the op-
erations TGUnify, TGRestrict and TGExtend as an abstract data type and its
operations. The safety of the abstract operations for procedure entry and pro-
cedure exit will be based on the safety results for type-graph environments.
Chapter 3

Related Work

In this chapter, we review some of the papers that inspired our design of the
abstract domain. In the first section, we consider several approaches for sharing
analysis, the prerequisite for liveness analysis. In the second section, we
summarize related research on determining the liveness of data structures, both in the
context of logic programming languages and for functional languages. The last
section discusses experimental results of code optimization based on run-time
properties derived by abstract interpretation.

3.1 Aliasing and Pointer Analysis


Aliasing (or sharing) information provides the basis for a variety of program anal-
ysis applications such as occur-check reduction [43, 56, 57, 68, 70], groundness
analysis [17, 19, 22], mode and type analysis [28, 38, 40, 44, 57, 83], independence
analysis [32, 37, 46, 63], and liveness analysis [9, 13, 15, 62].
Previous work on aliasing in logic programming languages mainly addressed
the sharing of (unbound) variables. Indeed, for applications such as occur-check
reduction, mode, type or independence analysis, tracing the dependencies be-
tween free variables plays a key role in preserving the soundness of the analysis,
because variable bindings propagate through shared variables. On the other
hand, for the application of liveness analysis, the sharing of structured terms
becomes relevant as well.
The sharing of free variables was first studied for the application of
occur-check reduction. Before unifying a variable X with a term t, according to the
general unification algorithm, it must be checked whether X occurs in t. If X
is a proper subterm of t, the unification process should halt with failure. If the
unification algorithm omits the occur check, e.g. for reasons of efficiency, then
structures having loops may be created. For instance, when t is the term f(X),
the infinite term f(f(f(...))) is produced. Colmerauer [20] introduced a
theoretical model of Prolog for using infinite trees, but such a new semantics of
terms is not always desirable for ordinary Prolog applications (see [54] for
further discussion). Plaisted [68] described a general method for detecting places
in a Prolog program where structures with loops might be created and proposed
a preprocessor for adding tests at such places that cause subgoals to fail if they
return infinite terms. We discuss Plaisted's method more extensively than
others because it first introduced the idea of alternating paths (a closure operation
for propagating the sharing relation without full transitive closure). This
concept inspired our definitions of the primitive operations (Section 4.1.2) for the
derivation of sharing information that is sufficiently precise for the application
of liveness analysis.
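
The role of the occur check can be made concrete with a small unification routine. The following Python sketch is a textbook-style illustration, not Plaisted's method and not the algorithm of any particular Prolog system; the term representation (nested tuples, capitalized strings as variables) and all names are assumptions of this example.

```python
def is_var(t):
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()


def walk(t, subst):
    """Dereference a variable through the bindings collected so far."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(var, t, subst):
    t = walk(t, subst)
    if t == var:
        return True
    return isinstance(t, tuple) and any(occurs(var, a, subst) for a in t[1:])


def unify(s, t, subst=None, occur_check=True):
    """Return an extended (triangular) substitution, or None on failure."""
    subst = {} if subst is None else subst
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if is_var(s):
        if occur_check and occurs(s, t, subst):
            return None                      # X would occur in its own binding
        return {**subst, s: t}
    if is_var(t):
        return unify(t, s, subst, occur_check)
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst, occur_check)
            if subst is None:
                return None
        return subst
    return None


print(unify("X", ("f", "X")))                     # fails with the occur check
print(unify("X", ("f", "X"), occur_check=False))  # creates a cyclic binding
```

Without the check, the binding of X refers to X itself, which is the loop that corresponds to the infinite term f(f(f(...))).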
The abstract domain used by Plaisted is based on the so-called Binary
Term/Literal Schemata. In essence, these schemata provide a finite way to
represent infinite sets of terms/literals. The schemata retain partial information
about term instances: some of their subterms are represented by term locators $r_i$,
causing information about substructures and the multiplicity of variables to be
lost. A given term locator may occur more than once in a binary term schema,
indicating more than one occurrence of some subterm. The information that is
relevant for detecting possible infinite loops, namely the pairs of positions in the
subterms that contain common variables, is represented by means of functions
returning (an approximation of) the multiplicity of variable occurrences.

• Rep($r_i$, $r_j$): returns the number of repeated variables between locators
$r_i$, $r_j$ (i and j may be identical).

• Is($r_i$): returns the number of isolated variables of term locator $r_i$, that is,
variables that occur only in $r_i$ and nowhere else.

• VC($r_i$): returns the total number of occurrences of variables in $r_i$.

These values may be integers in the range 0 through b, where b is some bound.
A value of b indicates the possibility of b or more such variable occurrences.
Multiplicities of variables above this bound are not distinguished. A term t
is a binary schema instance of binary schema s, if all the locators $r_i$ in s are
systematically replaced by terms $t_i$, such that the number of distinct common
variables between $t_i$ and $t_j$, for $i \ne j$, is less than or equal to Rep($r_i$, $r_j$), unless
Rep($r_i$, $r_j$) = b, in which case the number of distinct common variables is
unspecified; and similarly for the Is and VC functions. We illustrate the concept by
means of an example. Consider the binary literal schemata $P(r_1, r_4, r_6, c)$ and
$P(g(r_2, f(r_3)), r_5, r_3, r_7)$, where the $r_i$ are term locators. Suppose that the
functions Rep, Is, and VC are specified by: Rep($r_1$, $r_4$) = Rep($r_2$, $r_5$) = Rep($r_2$, $r_6$)
= Rep($r_7$, $r_1$) = 1, and VC($r_1$) = VC($r_2$) = 2, VC($r_3$) = VC($r_4$) = VC($r_5$) =
VC($r_6$) = VC($r_7$) = 1, Is($r_1$) = Is($r_3$) = Is($r_7$) = 1, and all other values zero.
The literals P(g(X, U), X, Y, c) and P(g(h(V, Y), f(Z)), V, Z, W) are instances
of the above binary literal schemata respectively.
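
The instance relation of this example can be checked with a short Python sketch. It reads the two instance literals as assigning $r_1$ = g(X, U), $r_2$ = h(V, Y), $r_3$ = Z, $r_4$ = X, $r_5$ = V, $r_6$ = Y and $r_7$ = W, and treats every Rep, Is and VC value as an upper bound. This reading, the omission of the saturation bound b, and all helper names are assumptions of the illustration rather than Plaisted's definitions.

```python
from itertools import combinations


def is_var(t):
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()


def var_occurrences(t):
    if is_var(t):
        return [t]
    if isinstance(t, tuple):
        return [v for arg in t[1:] for v in var_occurrences(arg)]
    return []


def common(t1, t2):
    """Number of distinct variables that t1 and t2 have in common."""
    return len(set(var_occurrences(t1)) & set(var_occurrences(t2)))


def isolated(t, others):
    """Number of variables of t that occur in none of the other subterms."""
    elsewhere = {v for o in others for v in var_occurrences(o)}
    return len(set(var_occurrences(t)) - elsewhere)


def respects_bounds(locators, rep, is_bound, vc_bound):
    for a, b in combinations(locators, 2):
        if common(locators[a], locators[b]) > rep.get(frozenset((a, b)), 0):
            return False
    for r, t in locators.items():
        others = [u for q, u in locators.items() if q != r]
        if isolated(t, others) > is_bound.get(r, 0):
            return False
        if len(var_occurrences(t)) > vc_bound.get(r, 0):
            return False
    return True


locators = {"r1": ("g", "X", "U"), "r2": ("h", "V", "Y"), "r3": "Z",
            "r4": "X", "r5": "V", "r6": "Y", "r7": "W"}
rep = {frozenset(p): 1
       for p in [("r1", "r4"), ("r2", "r5"), ("r2", "r6"), ("r7", "r1")]}
is_bound = {"r1": 1, "r3": 1, "r7": 1}
vc_bound = {"r1": 2, "r2": 2, "r3": 1, "r4": 1, "r5": 1, "r6": 1, "r7": 1}

print(respects_bounds(locators, rep, is_bound, vc_bound))
```

Replacing the term for $r_7$ by X would violate the bounds, since $r_4$ and $r_7$ would then share a variable although Rep($r_4$, $r_7$) is zero.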
We now informally describe how the general unification algorithm is extended
in order to unify term schemata containing term locators. The first part of
the algorithm applies the usual unification algorithm [47] until no more can
be done. The next part of the algorithm deals with term locators. The first
and second step consist of several rules to adjust the Rep, Is and VC functions
reflecting their behavior under unification. For example, when a term locator
is unified against a ground term, its variable count VC is set to zero. This
in turn will influence other Rep relations. When unifying the example binary
literal schemata given above, Rep($r_7$, $r_1$), Is($r_7$) and VC($r_7$) will be set to zero,
because the term locator $r_7$ is unified against the constant 'c'. We do not discuss
these rules in further detail as there is no direct relationship with our work. The
last step of the algorithm proceeds by constructing a graph, the nodes of which
are term locator occurrences of the term/literal schemata being unified, and
edges representing the Rep relation. Note that a term locator may label more
than one node in the graph if it has several occurrences in the term schemata
being unified. In the example, there are two occurrences of the term locator
$r_3$; we distinguish them as $r_3'$ and $r_3''$. The resulting graph is augmented with
unification edges (connecting nodes labeled with term locators having non-zero
variable count VC, such that one of the locators occurs in a term that the other
locator is unified against) and label edges (connecting nodes labeled with the
same term locator). Figure 3.1 shows the result of this step when unifying
the above example binary literal schemata. Unification edges are represented as
dashed edges, Rep edges as full lines, and label edges as full lines marked 'lb'. The
alternating cycles of Rep and label edges on the one hand and unification edges
on the other hand, indicate places where circular terms might be introduced
by the unification. For Figure 3.1, we have such a cycle $r_4 - r_5 - r_2 - r_1 - r_4$.
Indeed, this reflects the loop that is created when unifying the instances
P(g(X, U), X, Y, c) and P(g(h(V, Y), f(Z)), V, Z, W) mentioned above. The loop
$r_1 - r_2 - r_6 - r_3'' - r_3' - r_1$ is not an alternating cycle, but it could be interpreted
as indicating the possible creation of internal sharing in terms represented by
the term locator $r_1$. Unfortunately, Plaisted's paper does not address soundness
proofs for these operations; the intuition behind the graph transformation rules
is not always apparent.

Figure 3.1: Detecting loops by constructing alternating cycles.
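
Whether a given cycle alternates can be tested mechanically. The Python sketch below classifies each edge of the example as either a unification edge or an "old" edge (Rep or label) and checks that consecutive edges of the closed walk, including the wrap-around pair, belong to different classes. The edge classification is our reading of the description above, and the function name is an assumption; this is not Plaisted's algorithm.

```python
def is_alternating_cycle(nodes, kind):
    """nodes: closed walk [n0, n1, ..., n0]; kind: undirected edge -> class."""
    classes = [kind[frozenset((a, b))] for a, b in zip(nodes, nodes[1:])]
    # Compare every edge with its cyclic successor.
    return len(classes) >= 2 and all(
        classes[k] != classes[(k + 1) % len(classes)]
        for k in range(len(classes))
    )


UNIF, OLD = "unification", "rep-or-label"
kind = {
    frozenset(("r4", "r5")): UNIF,    # second arguments are unified
    frozenset(("r1", "r2")): UNIF,    # r2 occurs in the term unified with r1
    frozenset(("r1", "r3'")): UNIF,   # so does the first occurrence of r3
    frozenset(("r6", "r3''")): UNIF,  # third arguments are unified
    frozenset(("r1", "r4")): OLD,     # Rep(r1, r4) = 1
    frozenset(("r2", "r5")): OLD,     # Rep(r2, r5) = 1
    frozenset(("r2", "r6")): OLD,     # Rep(r2, r6) = 1
    frozenset(("r3'", "r3''")): OLD,  # label edge
}

print(is_alternating_cycle(["r4", "r5", "r2", "r1", "r4"], kind))
print(is_alternating_cycle(["r1", "r2", "r6", "r3''", "r3'", "r1"], kind))
```

The second walk has an odd number of edges, so two unification edges necessarily meet at $r_1$ and the cycle cannot alternate.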
We do not use the concept of term locators in our abstract domain, because it
does not provide a proper abstraction for representing sharing of term structures.
However, for the specifications of our abstract operations, we borrow the idea of
alternating paths to propagate the sharing relation. Roughly speaking, the Rep
and label edges correspond with the sharing (or old) edges in our application,
the unification edges to the binding (or new) edges (see Section 4.1.2).

The paper of Søndergaard [70] is concerned with the same problem of
determining circumstances under which the occur check may be safely dispensed
with. His method is a simplification of Plaisted's within a framework of
top-down abstract interpretation for Prolog programs. He does not use the concept
of term locators. Informally, variables are aliased or share if in some execution
of the program they may be bound to terms which contain a common variable.
Abstract substitutions are pairs of the form $\lambda = (G, R)$ where G is a set of
variables and R is a binary relation on variables. The intended meaning is that
$\lambda$ describes the set of all substitutions, each of which makes (at least) all the
variables in G ground and has no more sharing than that specified by R. A pair
$(X, Y) \in R$, where $X \ne Y$, allows X and Y to share a variable, while $(X, X) \in R$
allows X to have some multiple variable constituent. Note that R is symmetric
but not necessarily reflexive. No sharing of ground terms is represented (if
$X \in G$ then for any variable Y, $(X, Y) \notin R$). When restricted to a finite set of
variables (e.g. the variables of some program clause), the domain yields a lattice
of finite height under the lexical ordering based on set inclusion. Søndergaard
further introduces the concept of preunification. It is a kind of partial
evaluation of the unification of predicate arguments which derives templates for each
pair of subgoal and clause heading. The abstract unification algorithm is only
described informally via a series of examples.
Codish, Dams and Yardeni [17, 19] use the same notion of abstract
substitution as defined by Søndergaard, but in a framework for bottom-up abstract
interpretation of logic programs. Their contribution provides a rigorous
semantics for the analysis and a soundness proof for the abstract unification algorithm.
We include some of the formal definitions here.
The set Sub represents the set of idempotent concrete substitutions, defined
in the usual way. The set Vars denotes the set of variables and $PVar \subseteq Vars$
denotes the set of variables that may occur in programs.
The set of abstract substitutions, ASub, is the set of pairs
$(G, R) \in 2^{PVar} \times 2^{(PVar \times PVar)}$ for which

1. G is finite

2. R is symmetric

3. $R \cap (G \times PVar) = \emptyset$

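Let $ASub_\bot = ASub \cup \{\bot\}$. The abstraction function, $\alpha : 2^{Sub} \rightarrow ASub_\bot$, is
defined by $\alpha(\emptyset) = \bot$, and for $\Theta \ne \emptyset$, $\alpha(\Theta) = (G, R)$ where

$$G = \bigcap_{\theta \in \Theta} \{X \mid vars(X\theta) = \emptyset\},$$

$$R = \bigcup_{\theta \in \Theta} \left\{ (X, Y) \;\middle|\; \begin{array}{l} (X \ne Y \wedge vars(X\theta) \cap vars(Y\theta) \ne \emptyset) \;\vee \\ (X = Y \wedge X\theta \text{ contains two occurrences of a variable}) \end{array} \right\}.$$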

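The ordering $\sqsubseteq$ on $ASub_\bot$ is defined by $\bot \sqsubseteq (G, R)$, and $(G_1, R_1) \sqsubseteq (G_2, R_2)$
iff $G_1 \supseteq G_2$ and $R_1 \subseteq R_2$. The concretization function, $\gamma : ASub_\bot \rightarrow 2^{Sub}$, is
defined by $\gamma(\bot) = \emptyset$, and for $\lambda \ne \bot$,

$$\gamma(\lambda) = \{\theta \in Sub \mid \alpha(\{\theta\}) \sqsubseteq \lambda\}.$$

For example, let $X, Y \in PVar$, $U \in Vars \setminus PVar$, and consider the concrete
substitutions $\theta = \{X \leftarrow f(U, U), Y \leftarrow g(U)\}$ and $\varphi = \{X \leftarrow b\}$, then

$$\alpha(\{\theta, \varphi\}) = (\emptyset, \{(X, X), (X, Y), (Y, X)\}).$$

Note that the internal sharing in the variable X is represented by a reflexive pair
(X, X) in the sharing relation. Also, the sharing relation is not transitive; e.g.
an abstract substitution whose sharing relation consists of (X, Y), (Y, Z) and
their symmetric counterparts specifies potential sharing between
X and Y and between Y and Z, but not between X and Z.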
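
The abstraction of the example can be recomputed with a few lines of Python. The sketch follows the definitions of G and R given above for a finite set of program variables; the term representation (nested tuples, capitalized strings as variables), the use of `None` for the bottom element, and the function names are assumptions of this illustration and are not taken from [17, 19].

```python
def is_var(t):
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()


def var_occurrences(t):
    if is_var(t):
        return [t]
    if isinstance(t, tuple):
        return [v for arg in t[1:] for v in var_occurrences(arg)]
    return []


def alpha(substs, pvar):
    """Abstract a collection of substitutions to a pair (G, R)."""
    if not substs:
        return None                      # stands for the bottom element
    ground, sharing = set(pvar), set()
    for theta in substs:
        # An unbound program variable is mapped to itself.
        image = {x: var_occurrences(theta.get(x, x)) for x in pvar}
        ground &= {x for x in pvar if not image[x]}
        for x in pvar:
            if len(image[x]) != len(set(image[x])):
                sharing.add((x, x))      # a variable occurs twice in X theta
            for y in pvar:
                if x != y and set(image[x]) & set(image[y]):
                    sharing.add((x, y))
    return ground, sharing


theta = {"X": ("f", "U", "U"), "Y": ("g", "U")}
phi = {"X": "b"}
print(alpha([theta, phi], ["X", "Y"]))
```

The groundness component is empty because neither X nor Y is ground under both substitutions, while the sharing component collects the pairs contributed by either substitution.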
The core of the analysis is the abstract unification algorithm, which is a
transition system mimicking the concrete algorithm for finding the most general
solution of a set of equations. It is divided into two parts. The input is a set of
equations, together with an abstract substitution. The first step (called preuni-
fication in [70]) solves the concrete part of the equations and corresponds to the
standard Herbrand algorithm for solving a set of equations. The second step of
the algorithm solves the abstract part of the equations. The solution of a single
abstract equation $e_\lambda$, where $e \equiv t_1 = t_2$ and $\lambda = (G, R)$, is approximated by
considering the variable multiplicity of the equation. For example, if one side of
the abstract equation is ground then the solution grounds all other variables and
causes no sharing; i.e. if $vars(t_1) \subseteq G$ or $vars(t_2) \subseteq G$, then $e_\lambda$ results in the
abstract substitution $\lambda' = (G \cup vars(e), R \setminus vars(e))$, where $R \setminus S$ is a shorthand
for $R \setminus ((S \times PVar) \cup (PVar \times S))$. In this way, the propagation of groundness
is reflected. In the other cases, aliasing propagation is required. For example, if
$e \equiv p(X, Y, Z) = p(A, f(A, B), B)$ and $\lambda = (\emptyset, \emptyset)$, then we get the abstract substitution
$\lambda' = (\emptyset, R)$, where $R = \{(A, X), (A, Y), (B, Y), (B, Z), (X, Y), (Y, Z)\}$.
Note that no sharing between A and B is derived. Indeed, such sharing could
only be introduced by the unification if there was internal sharing in Y prior to
the unification. We refer to [17, 19] for the formal definitions of the abstract uni-
fication algorithm and its soundness proof. The concept of alternating sequences
(or mixed paths) is implicitly used in the algorithm.
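The groundness rule for a single abstract equation can be sketched in Python as follows. Only the case in which one side is ground is covered; the function name `solve_ground_case` and the representation of an equation by the variable sets of its two sides are assumptions of this sketch, and the aliasing cases of the actual algorithm in [17, 19] are not reproduced.

```python
def solve_ground_case(equation_vars, lam):
    """Apply the groundness rule to one abstract equation.

    equation_vars: pair (vars(t1), vars(t2)) of sets of variable names.
    lam: abstract substitution (G, R).
    Returns the new pair (G', R') when one side is ground, else None.
    """
    vars_t1, vars_t2 = equation_vars
    G, R = lam
    if not (vars_t1 <= G or vars_t2 <= G):
        return None  # aliasing propagation would be needed; not covered here
    vars_e = vars_t1 | vars_t2
    new_G = G | vars_e
    # R \ vars(e): drop every pair that mentions a variable of the equation.
    new_R = {(x, y) for (x, y) in R if x not in vars_e and y not in vars_e}
    return new_G, new_R

# Hypothetical input: A is ground, and X, Y, Z may share pairwise via Y.
lam = ({"A"}, {("X", "Y"), ("Y", "X"), ("Y", "Z"), ("Z", "Y")})
result = solve_ground_case(({"X"}, {"A"}), lam)   # equation X = f(A)
```

In this hypothetical input, X becomes ground and every pair mentioning X disappears, while the potential sharing between Y and Z is kept.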
In the sharing analysis approach of Jacobs and Langen [37] and Muthukumar
and Hermenegildo [63], a concrete substitution $\theta$ for the set of variables in a
clause, PVar, is abstracted by means of a set of sets of variables. First, each
variable U in the universe of all variables, UVar, is associated with the set of
variables through which it occurs in the given substitution $\theta$.

$$occs(\theta, U) = \{ X \in PVar \mid U \in vars(X\theta) \}$$

The substitution $\theta$ is then approximated by an abstract sharing defined as follows.

$$\mathcal{A} : Sub \rightarrow 2^{2^{PVar}} : \theta \mapsto \{ occs(\theta, U) \mid U \in UVar \}.$$

For example, the substitution $\theta = \{X \leftarrow f(U, V), Y \leftarrow g(V), Z \leftarrow a\}$ is represented
by the abstract sharing $\mathcal{A}(\theta) = \{\emptyset, \{X, Y\}, \{X\}\}$. Intuitively, variables
that occur together in a set have a variable in common; variables that do not
occur in any set are ground. The abstraction function $\mathcal{A}$ is extended to sets of
substitutions as follows.

$$\mathcal{A} : 2^{Sub} \rightarrow 2^{2^{PVar}} : \Theta \mapsto \bigcup_{\theta \in \Theta} \mathcal{A}(\theta).$$

The set of abstract sharings can be ordered by set inclusion $\subseteq$, and set union
$\cup$ yields a safe least upper bound operation. The concretization function $\gamma$ maps
an abstract sharing to the set of all substitutions it approximates.

$$\gamma : 2^{2^{PVar}} \rightarrow 2^{Sub} : \lambda \mapsto \{ \theta \mid \mathcal{A}(\theta) \subseteq \lambda \}$$

The abstract substitution which makes all program variables in a clause ground
is $\{\emptyset\}$. The bottom element in the lattice is $\emptyset$, representing failure. The top
element in the lattice is the powerset of all the clause variables.
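As an illustration, the set-sharing abstraction of a single substitution can be computed with a few lines of Python. The term encoding and the names `term_vars`, `occs` and `abstract_sharing` are assumptions of this sketch, and the infinite universe UVar is replaced by a finite sample of variables.

```python
def term_vars(term):
    """Variables of a term; variables are capitalised strings and
    compound terms are tuples ("functor", arg1, ..., argn)."""
    if isinstance(term, tuple):
        result = set()
        for arg in term[1:]:
            result |= term_vars(arg)
        return result
    return {term} if term[:1].isupper() else set()

def occs(theta, u, pvar):
    """Program variables whose binding under theta contains variable u."""
    return frozenset(x for x in pvar if u in term_vars(theta.get(x, x)))

def abstract_sharing(theta, pvar, uvar_sample):
    """Abstract sharing of theta, using a finite sample of UVar."""
    return {occs(theta, u, pvar) for u in uvar_sample}

theta = {"X": ("f", "U", "V"), "Y": ("g", "V"), "Z": "a"}
sharing = abstract_sharing(theta, ["X", "Y", "Z"], ["U", "V", "W"])
```

The variable W does not occur in any binding, which is what contributes the empty set; Z is ground and therefore occurs in no sharing set.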
In this abstract domain, internal sharing within a variable is not represented
explicitly. This causes a loss of precision in the unification algorithm, as was
pointed out by Codish et al. in [17]. Consider the unification X = f(Y, Z)
and an abstract sharing $\lambda_{in} = \{\{X\}, \{Y\}, \{Z\}\}$, i.e. none of the variables
is ground, nor shares any free variable with the others. A substitution such
as $\theta = \{X \leftarrow f(U, U), Y \leftarrow g(V), Z \leftarrow W\}$ belongs to the concretization of
$\lambda_{in}$. Consequently, sharing between Y and Z has to be derived by the abstract
unification, even for those cases where no internal sharing in the variable X is
possible ($\lambda_{out} = \{\{X, Y\}, \{X, Z\}, \{X, Y, Z\}\}$).
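The loss of precision can be reproduced with a small Python sketch of abstract unification on sharing sets. The formulation used here (remove the sets relevant to either side, then add all pairwise unions of the union-closures of the two relevant families) is the one commonly given in the literature for this domain; it is an assumption of this sketch and is not taken from the text above. All function names are hypothetical.

```python
from itertools import combinations

def relevant(sharing, variables):
    """Sharing sets that contain at least one of the given variables."""
    return {s for s in sharing if s & variables}

def star(sets):
    """Close a family of sets under union."""
    closed = set(sets)
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(closed), 2):
            if a | b not in closed:
                closed.add(a | b)
                changed = True
    return closed

def amgu(x, term_variables, sharing):
    """Abstract unification of variable x with a term over term_variables."""
    rel_x = relevant(sharing, frozenset({x}))
    rel_t = relevant(sharing, frozenset(term_variables))
    unaffected = set(sharing) - rel_x - rel_t
    combined = {a | b for a in star(rel_x) for b in star(rel_t)}
    return unaffected | combined

lam_in = {frozenset({"X"}), frozenset({"Y"}), frozenset({"Z"})}
lam_out = amgu("X", {"Y", "Z"}, lam_in)   # X = f(Y, Z)
```

The set {X, Y, Z} in the result is the source of the derived sharing between Y and Z: the domain cannot express that X has no internal sharing, so the closure under union must be taken.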
A formalization of the abstract unification algorithm and the other prim-
itive operations required in the framework of Bruynooghe [11], is given by
Muthukumar and Hermenegildo in [64]. There they argue that the combined
determination of sharing and freeness information yields operations that are
more precise than if they were computed separately. Freeness information makes
it possible to distinguish between variables that are just bound to another variable
and variables possibly bound to a complex term. For that purpose, they use
abstract substitutions consisting of two components: a sharing component as
described above and a freeness component given by means of a mapping from
PVar into the set {G, F, NF}, where G, F, NF represent respectively ground,
free, and potentially non-free. The second component thus provides the mode
information discussed in the previous chapter. The interaction between the two
components in the unification algorithm increases the accuracy of the analysis to
some extent. For instance, if X, Y, Z are known to be free before the unification
X = f(Y, Z), then no sharing between Y and Z will be derived. However, if X
has mode NF, but no internal sharing, imprecision will still be introduced by
the abstract unification.
The combined sharing and freeness analysis presented by Muthukumar and
Hermenegildo is intended to be applied for the determination of run-time goal
independence. Many parallel Prolog implementations restrict themselves to Independent/
Restricted And-Parallelism (IAP). In IAP, subgoals in the body of a clause are
executed in parallel provided they are independent: that is, if further binding
of one goal cannot cause further binding of another goal, they can be executed
in parallel. Compile-time knowledge about the variables that may share can
be used in such systems to eliminate run-time goal-independence checks. In
strict independent and-parallelism, only goals which do not share variables are
executed in parallel. Freeness information is used for the detection of non-strict
goal independence, which allows parallelism among goals that share variables,
provided they do not compete for the bindings of such variables. This will be
the case for instance if for one of the goals it is derived that variables that are
free before the call are still free after the call [84].
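A compile-time test for strict independence on top of a set-sharing result might look like the following Python sketch. The function name and the representation of a goal by the set of its program variables are assumptions of this sketch; the non-strict case, which needs freeness information, is not covered.

```python
def strictly_independent(goal1_vars, goal2_vars, sharing):
    """True if no sharing set links a variable of one goal to a variable
    of the other. Ground variables occur in no sharing set, so they never
    cause a dependence."""
    return not any(s & goal1_vars and s & goal2_vars for s in sharing)

# Hypothetical analysis result: X and Y may share, Z is independent.
sharing = {frozenset({"X", "Y"}), frozenset({"Z"})}
can_run_p_and_q = strictly_independent({"X"}, {"Z"}, sharing)
can_run_p_and_r = strictly_independent({"X"}, {"Y"}, sharing)
```

When the test succeeds at compile time, the corresponding run-time independence check can be omitted.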
The basis of the sharing analysis presented in our work is the integrated
type and mode inference system discussed in the previous chapter, which was
developed by Janssens and Bruynooghe [38, 40]. In that system, types represent
sets of possibly non-ground terms. Groundness as well as freeness information is
obtained by the analysis. In order to handle variable dependencies, sets of SVal,
PShr and NUni constraints are used. The SVal constraints represent program
variables having the same, possibly non-ground, (sub)term values. Such a con-
straint can imply sharing of free variables, but does not provide an indication
for the sharing of structures. The PShr and NUni constraints abstract possible
shared representations of free variables, between different clause variables (ex-
ternal sharing), and within a clause variable (internal sharing), respectively. The
sharing of (partially) instantiated structures is not represented. However, the
type graphs derived by the analysis, provide us with the structure information
that is essential for expressing the sharing of term values with an adequate level
of granularity. Such structure information is not derived by any of the sharing
analysis methods discussed earlier.
Bruynooghe et al. [9, 10, 13] discuss the kind of analysis needed to enable
compile-time garbage collection and introduce a formalism for representing shar-
ing of term structures and corresponding inference rules. The technique differs
from previous methods in that the abstract domain captures information about
the sharing of general term structures: sharing of ground terms and partially
instantiated terms, as well as variable sharing. A relation AL (alias) is introduced
such that X AL Y represents possible sharing of the instantiations of program variables
X and Y. The expression Comp(T, X) represents parts of instantiations of
X having type T (for T a component type of the type of X). It can be used to
express potential sharing of parts of instantiations (e.g. Comp(T, X) AL Y represents
possible sharing of the instantiation of Y and a part of the instantiation
of X having type T). The nesting of expressions, e.g. Comp(T1, Comp(T2, X)),
is not allowed because of the finiteness requirement for abstract domains. The
inference rules are transitive-like operations propagating the alias relation. For
example,

$$Comp(T_1, X) \; AL \; Y \;\;\&\;\; Comp(T_2, X) \; AL \; Z \;\;\&\;\; T_2 \text{ is a component type of } T_1 \;\;\Longrightarrow\;\; Comp(T_2, Y) \; AL \; Z.$$

Such rules can cause imprecision because they do not fully exploit the structure
information available in type graphs for relating the (component) types. The
formalism is suited for liveness analysis if expressions of the form Live(X) and
Live(Comp(T, X)) are added to represent the (parts of) instantiations that may
not be deallocated. The correctness of the method was argued only informally.
In the paper [60], we formalized an abstract domain based on type graphs,
and an abstract unification operation to analyze how the terms to which program
variables are bound, can share substructures at run time. Formal proofs of the
soundness of the operation are given in [61]. The abstract domain in the present
work is an enhanced version of the one given there, yielding more precise results
for the extension operation.
The problem of structure sharing is significantly different in the context of
functional or imperative languages, primarily because the sharing introduced by
unification is significantly more complicated to analyze at compile time than
parameter passing, term construction and subterm selection. Nevertheless, so-
lutions to related structured-data dependency problems have appeared in the
literature. For instance, in order to transform Lisp programs for concurrent
execution, Larus and Hilfinger [46] describe a technique for detecting conflicts
between pairs of read and write accesses to structures linked by pointers. Their
method is based on alias graphs that model a collection of linked structures
in memory: the nodes correspond to instances of a structure type; the directed
edges represent pointers between the structures; summary nodes are used to join
several nodes of the graph into a single node, thereby making it possible to represent
unbounded chains of run-time objects. The alias graphs finitely represent
the potential sharing of structure, visible at any point in the program. Horwitz,
Pfeiffer and Reps [32] describe a similar technique to determine an approxima-
tion of the actual layouts of memory that can arise at each program point during
execution. Their method is also intended for languages that have pointer-valued
variables, heap-allocated storage and a destructive update operation.

3.2 Reference Counting and Liveness Analysis


For functional programming languages as well as for logic programming lan-
guages, run-time garbage collection is an expensive process. The substantial
prior work on liveness analysis for functional programming, however, does not
transfer effectively to logic programming, because of the differences in the parameter
passing mechanisms.
Hudak [33] addresses the problem of efficiently implementing aggregates (e.g.
arrays) in functional programming systems. He uses abstract interpretation to
infer at various points in the source program upper bounds on possible reference
counts for structures. Reference counts larger than a certain chosen bound are
approximated by the special value $\infty$ (with increment $\infty + 1 \rightarrow \infty$, and decrement
$\infty - 1 \rightarrow \infty$ operations). A reference count operation is considered as a side-effect
on a store that emulates the real store of an interpreter. The abstraction of
the store consists of a finite set of locations, one for each occurrence of a location
generating function in the source program (e.g. 'cons' in Lisp-like languages and
'new' in Pascal). The idea is that each such occurrence can be approximated by
an operator that generates the same location every time it is called. On function
call, the reference count of each actual parameter is increased by one less than
the number of occurrences of the corresponding formal parameter in the body
of the function. New cells allocated in the store have initial reference count one.
This approach is inadequate for logic programming because whether and where
new references are introduced upon unification depends on the run-time values
being unified.
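The bounded reference counts described above can be sketched in Python as follows. The function names and the choice of bound are assumptions of this sketch; the abstract store and the rest of the analysis in [33] are not modelled.

```python
INF = float("inf")  # stands for the special value "infinity"

def clamp(count, bound):
    """Approximate every count larger than the chosen bound by infinity."""
    return INF if count > bound else count

def increment(count, bound):
    return clamp(count + 1, bound)

def decrement(count, bound):
    # Assumes count >= 1; infinity stays infinity.
    return clamp(count - 1, bound)

def on_call(count, formal_occurrences, bound):
    """On a function call, the count of an actual parameter grows by one
    less than the number of occurrences of the formal parameter."""
    return clamp(count + formal_occurrences - 1, bound)

BOUND = 2  # hypothetical bound
trace = [
    increment(1, BOUND),     # 2
    increment(2, BOUND),     # exceeds the bound
    decrement(INF, BOUND),   # stays at infinity
    on_call(1, 3, BOUND),    # 1 + 3 - 1 = 3, exceeds the bound
]
```

Once a count has reached infinity it can never be decremented back to a finite value, which keeps the set of abstract counts finite.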
Similarly, the work of Inoue, Seki and Yagi [35, 36] handles the data manipulation
operations of cons, car, cdr, and parameter passing, but is insufficient
to handle general unification. The approach consists of deriving from the functional
program text the set of strings of list-type primitive functions (cons, car,
cdr) that are possibly applied to program variables. These strings of primitive
operations are translated into 'occurrences' that are similar to the 'occurrences'
that we use to locate subterms relative to the root of a term (Section 4.1.1).
The occurrences (for list-type values only) are expressions of the form {0, 1}*,
where 0 selects the head and 1 the tail of a list. Overlapping (shared) cells are
cells that have more than one occurrence relative to the root. Sets of concrete
occurrences are approximated by a common prefix that is a regular expression.
They do not aim at introducing destructive assignment locally in one clause but
(more generally) at embedding, into the executable program, code to reclaim
garbage cells by means of operations that link the detected garbage cells to
the list of available cells. However, no full sharing analysis is performed; input
arguments are assumed to be non-overlapping. Some sufficient conditions are
discussed for newly created cells to be non-overlapping. In this way, the cells
of the input arguments cannot be reused and no information is derived about
sharing between the results of different functions with shared input arguments.
Another shortcoming of the method is that it derives assertions efficiently for
the selectors $1^*$ and $0^*$ only.
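The notion of occurrences over {0, 1}* and of overlapping cells can be made concrete with a short Python sketch. The `Cons` class, the left-to-right reading of an occurrence from the root, and the restriction to acyclic structures are assumptions of this sketch.

```python
class Cons:
    """A list cell; identity of the object stands for its memory address."""
    def __init__(self, head, tail):
        self.head, self.tail = head, tail

def select(cell, occurrence):
    """Follow an occurrence from the root: 0 selects the head, 1 the tail."""
    for step in occurrence:
        cell = cell.head if step == "0" else cell.tail
    return cell

def occurrences_of_cells(root):
    """Map each reachable cell to all of its occurrences (acyclic input)."""
    found = {}
    def walk(node, path):
        if isinstance(node, Cons):
            found.setdefault(id(node), []).append(path)
            walk(node.head, path + "0")
            walk(node.tail, path + "1")
    walk(root, "")
    return found

def shared_cells(root):
    """Occurrence sets of the cells reachable along more than one path."""
    return [sorted(paths) for paths in occurrences_of_cells(root).values()
            if len(paths) > 1]

inner = Cons("a", None)
root = Cons(inner, Cons(inner, None))   # the same cell is used twice
overlapping = shared_cells(root)
```

Here the cell `inner` is reachable both as the head of the list and as the head of its tail, so it has the two occurrences 0 and 10.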
Jensen and Mogensen [42] define a domain of contexts to count the usage of
expressions in a functional language and discuss a strategy for reusing cells via a
compile-time free-list of cells that are no longer used but still addressable by the
variables in the current scope. Operations that request new store cells should
then be replaced by operations that reuse the cells in this free-list. The context
semantics consists of the standard semantics instrumented with a mechanism
to compute the exact number of uses of expressions. The standard semantics
is then abstracted away to yield an approximating analysis. They consider the
following domain, associating counts to terms and their subterms.
$$D' = \{ABS, 0\} \cup \{1(d_1, d_2) \mid d_1, d_2 \in D'\} \cup \{2(d_1, d_2) \mid d_1, d_2 \in D'\}.$$
ABS represents an undefined context or a value not used in the remaining computation,
0 represents an atomic value, $n(d_1, d_2)$ means that the value may be
structured and the top-level cell may be used n times (one time if n = 1, many
times if n = 2), the substructures are used as described by $d_1$ and $d_2$. To guarantee
termination of the analysis, they use approximations based on the concept
of a grammar: The identifiers of an environment are used as nonterminals, and
each (context) binding is represented by a production rule.
Vataja and Ukkonen [79] consider the problem of garbage collection in the
context of backtrack-based Prolog implementations. The key-concept of their
analysis is that of a temporary term, which is a structured term that is not
shared at some program point with the calling environment. Intuitively, a tem-
porary term is a term needed for the local purposes of some clause. A term
is strongly temporary if it is not shared with any term of a backtrack point in
the search tree for the clause activation. At the end of the clause activation,
the storage space reserved for the strongly temporary terms can be reclaimed.
This concept is rather strong and prevents the introduction of destructive as-
signments in many cases. Also, it assumes some determinacy analysis. On the
other hand, the trailing mechanism does not have to be adapted, as is the case
in our proposal. Although their approach includes a full sharing analysis (based
on depth-k abstractions and color variables), it is described only informally and
no correctness proofs are provided.
An approach to liveness analysis for the significantly restricted case of so-called
Ground Prolog has been developed by Kluźniak [45]. Ground Prolog is a
restricted form of pure Prolog in the sense that the query must be known, mode
declarations must be given for each predicate and the groundness restriction is
imposed such that whenever a call is executed, its input parameters must be
ground and the output parameters must be free. These restrictions imply that
general unification need not be handled and that programs can be converted to a
normal form, which is a notation that makes explicit the uses of unification pos-
sible in Ground Prolog (term comparison, assignment, subterm selection, term
construction). Run-time representations of variable instances are characterized
by means of sets of variable names that occur at the left-hand side of construc-
tion operations in the source program. Different term construction operations
are assumed to create different objects in memory, as is indeed the case in stan-
dard Prolog implementations. Based on some elementary observations about
the run-time flow of information in a proof tree, inaccessibility criteria for the
term representations of a variable instance can be derived. Kluźniak proposes
an algorithm for checking whether a program construct likely to release storage
(e.g., a selection operation that has in its left-hand side the last occurrence of
a variable _V in the clause) does indeed release it. The analysis performs a sys-
tematic search for variables which will still be used and from which the value of
_V might be accessible.
The work of Chase, Wegman and Zadeck [15] is not directly applicable to logic
programming either. They look at liveness analysis for imperative languages,
where full unification is not an issue. Their approach summarizes the linked
data structures allocated in a heap by a storage shape graph, in which one node
corresponds to possibly many nodes in the heap. As in the approach of Janssens
and Bruynooghe [12, 38, 40], the authors avoid the loss of information incurred
in prior contributions [32, 46] by imposing a k-depth bound beyond which the
structure is collapsed to a single node. As for the type inferencing used as the first
layer of our liveness analysis, they rather seek to fold the structure into something
like a rational tree or tree automaton. The heuristic method they use to fold
the tree consists of superimposing only structure cells that are generated by the
same instruction. The approach of Janssens and Bruynooghe has been to impose
a fixed bound on the depth beyond which folding is required. The approach of
Chase, Wegman and Zadeck in effect enables different bounds to be used for
different programs, while preserving the finite ascending chain property of the
abstract domain, needed to ensure termination. To derive information useful
for compile-time garbage collection, the storage shape analysis is followed by a
heap reference counting analysis, during which the nodes of the storage shape
graphs are annotated with reference counts from the lattice (0, 1, $\infty$) similar to
the approach of Hudak [33]. An extended model of heap reference counting is
also suited to reason about acyclic data structures represented by cyclic storage
shape graphs.

3.3 Code Optimization


An exhaustive study of the code optimizations that come within reach of the
compiler, thanks to the information about the run-time behavior of programs
derived by static program analysis, is beyond the scope of the present book.
However, in order to evaluate the sharing and liveness analyses, an understand-
ing of some elementary notions of code generation for WAM (Warren Abstract
Machine) based Prolog systems [2, 41, 80] and the possible use of liveness infor-
mation, is necessary.
The Warren Abstract Machine is a standard Prolog implementation tech-
nique. We briefly sketch the instruction set and the data areas that the in-
structions operate on. The data areas are the code area, containing the source
program, the control area, containing the abstract machine registers, and three
areas operated as stacks. The local stack contains execution environments and
choice-points and represents the goals still to be executed. The heap contains
structures, lists, and value cells created during program execution. The trail
records the bindings that have to be undone on backtracking. A value cell con-
sists of a tag and a value. The possible types are references (REF), variables
(UNDEF), structures (STRUCT), lists (LIST), integers (INT) and constants
(CONST). The registers of the control area determine the computation state.
They include argument registers ($A_i$) and temporary registers ($X_i$), used for argument
passing and unification, the program pointer P, the continuation pointer
CP, the (top of) heap pointer H, the (top of) trail pointer TR, the current environment
E, and the current choice-point B. An environment corresponds to an
activation of some clause and contains information about bindings for the clause
variables and about how to continue the execution. For each call to a predicate
with more than one defining clause, a choice-point is created on the local stack,
containing the information needed on backtracking. In order to reduce the space
needed for an environment on the local stack, clause variables are classified ac-
cording to their local life-time in the clause: variables that occur only once in
the clause are called void; variables that have more than one occurrence, but
do not need to survive a call, are called temporary; all other variables are called
permanent. Only the bindings for the permanent variables have to be stored in
an environment on the local stack.
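The classification of clause variables can be sketched in Python as follows. The sketch assumes that every body goal is a call and that the head is grouped with the first body goal, so that a variable needs to survive a call exactly when it occurs in more than one group; the function name and input format are hypothetical.

```python
from collections import Counter

def classify_variables(head_vars, body_goals):
    """Classify clause variables as 'void', 'temporary' or 'permanent'.

    head_vars: variable occurrences in the head (with repetitions).
    body_goals: one list of variable occurrences per body goal.
    """
    first = list(body_goals[0]) if body_goals else []
    groups = [list(head_vars) + first]
    groups += [list(goal) for goal in body_goals[1:]]
    total = Counter(v for group in groups for v in group)
    classes = {}
    for var, count in total.items():
        in_groups = sum(1 for group in groups if var in group)
        if count == 1:
            classes[var] = "void"          # a single occurrence
        elif in_groups == 1:
            classes[var] = "temporary"     # never has to survive a call
        else:
            classes[var] = "permanent"     # must be kept in the environment
    return classes

# p(X, Y, W) :- q(X, Z), r(Z, Y).
classes = classify_variables(["X", "Y", "W"], [["X", "Z"], ["Z", "Y"]])
```

In this example only Y and Z need a slot in the environment, because both are still used after the call to q/2 returns.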
Prolog programs are compiled into intermediate code, approximately one instruction
per Prolog symbol. The instruction set is classified into get, put, unify,
procedural and indexing instructions. The put-instructions put the arguments of
a call into the argument registers. The indexing instructions determine the set
of clauses that possibly match with the call and create a choice-point if needed.
The procedural instructions deal with control transfer and environment han-
dling. The get-instructions take care of unifying a call and the head of some
clause. Components of compound terms are encoded by means of the unify in-
structions. Unify instructions operate in either of two modes: in write mode,
they create a new structure on the heap, and in read mode, they perform unification
with an existing term. The put_list and put_struct instructions always switch
to write mode, the get_list and get_struct instructions switch to write mode if
the corresponding argument register contains an uninstantiated variable. If the
corresponding argument register does not contain an uninstantiated variable,
then the get_list and get_struct instructions switch to read mode. For instance,
if the instruction unify_atom cte is executed in write mode, it allocates a new
value cell on the heap, initializes it with tag CONST and value field cte, and
increments the heap pointer H. Likewise, the instruction unify_val Xi copies the
dereferenced value cell determined by Xi on top of the heap if it is not a free
variable. If the dereferenced value cell represents a free variable, then a refer-
ence to that variable is copied on top of the heap. The instruction unify_var Xi
allocates an UNDEF value cell on top of the heap and sets Xi pointing to it.
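The write-mode behaviour of the three unify instructions can be mimicked by a small Python model. The `Machine` class, the representation of the heap as a list of (tag, value) pairs and the convention that registers hold heap addresses are assumptions of this sketch; read mode and all other instructions are left out.

```python
class Machine:
    """Toy model of the heap and registers, write mode only."""

    def __init__(self):
        self.heap = []   # list of (tag, value) cells; H is len(self.heap)
        self.regs = {}   # register name -> heap address

    def deref(self, addr):
        """Follow REF chains and return the address of the final cell."""
        tag, value = self.heap[addr]
        while tag == "REF":
            addr = value
            tag, value = self.heap[addr]
        return addr

    def unify_atom(self, cte):
        # Allocate a CONST cell; appending also advances H.
        self.heap.append(("CONST", cte))

    def unify_var(self, reg):
        # Allocate a fresh UNDEF cell and let the register point to it.
        self.heap.append(("UNDEF", None))
        self.regs[reg] = len(self.heap) - 1

    def unify_val(self, reg):
        addr = self.deref(self.regs[reg])
        tag, value = self.heap[addr]
        if tag == "UNDEF":
            self.heap.append(("REF", addr))    # reference to the free variable
        else:
            self.heap.append((tag, value))     # copy of the value cell

m = Machine()
m.unify_atom("a")     # heap[0]
m.unify_var("X1")     # heap[1], X1 -> 1
m.unify_val("X1")     # heap[2] refers to the free variable at address 1
m.regs["X2"] = 0
m.unify_val("X2")     # heap[3] is a copy of the CONST cell
```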
Mariën et al. [52] describe an experiment set up to assess the impact of
the information about modes, types, length of reference chains and liveness of
data structures on the extent of code optimizations that can be performed by a
compiler. The paper illustrates how mode information can be used to enhance
clause indexing and to specialize the unify instructions generated for arguments
known to be input or output (thus eliminating run-time checking). Type infor-
mation can be used to remove tag-checking instructions, for instance to improve
the processing of arithmetic expressions. Knowledge about the reference chain
length can be used to deal more efficiently with dereferencing. Liveness infor-
mation makes it possible to recognize at compile time whether storage occupied
by a data structure can be reused to create new data structures in the same
procedure.
The experiments discussed consider only a restricted case of local reuse of
garbage cells, namely reuse in a construction operation within the same chunk
of the clause where the garbage is created. A chunk is defined as either a chunk-
without-head or a head immediately followed by a chunk-without-head. A chunk-
without-head is a number, possibly zero, of in-line calls followed by an out-of-line
call. In-line calls are calls which use only argument registers of the WAM (e.g.
built-in primitive predicates). Consider the following clause.

append(_X, _Y, _Z) :- _X = [_E | _U], _Z = [_E | _W],
                      append(_U, _Y, _W).

The clause consists of one chunk only, so the method will be applicable. However,
for the recursive clause of nrev/2, the selection and the construction operation
belong to different chunks because nrev(_U, _RU) is an out-of-line call.

nrev(_X, _Y) :- _X = [_E | _U], nrev(_U, _RU), _Last = [_E],
                append(_RU, _Last, _Y).
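The division of the two clauses into chunks can be checked with the following Python sketch. Goals are encoded as tuples, unification is taken to be the only in-line call, and trailing in-line calls that are not followed by an out-of-line call are collected in a final group; these are assumptions of the sketch, as are all names.

```python
def split_into_chunks(body, is_inline):
    """Split a clause into chunks; the head belongs to the first chunk."""
    chunks, current = [], ["<head>"]
    for goal in body:
        current.append(goal)
        if not is_inline(goal):      # an out-of-line call closes the chunk
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)       # leftover in-line calls, if any
    return chunks

def is_inline(goal):
    return goal[0] == "="            # only unification is treated as in-line

append_body = [("=", "_X", "[_E|_U]"), ("=", "_Z", "[_E|_W]"),
               ("append", "_U", "_Y", "_W")]
nrev_body = [("=", "_X", "[_E|_U]"), ("nrev", "_U", "_RU"),
             ("=", "_Last", "[_E]"), ("append", "_RU", "_Last", "_Y")]

append_chunks = split_into_chunks(append_body, is_inline)
nrev_chunks = split_into_chunks(nrev_body, is_inline)
```

For append/3 the selection and the construction fall in the same chunk, whereas for nrev/2 the call to nrev/2 closes the first chunk before the construction of _Last is reached.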

Within a chunk, a temporary register suffices to keep track of the garbage cell
until it is needed. Because the contents of these registers do not survive the
call of a Prolog predicate, a more sophisticated technique is needed in general
to benefit from the garbage detected by program analysis. For instance, by
changing the classification of the clause variables, a permanent variable on the
environment stack can be used to keep the address of the garbage cell until it is
needed; or maybe some kind of inter-procedural register allocation can be useful.
We will discuss some examples in Section 5.3.3.
The scheme of code generation proposed by Mariën et al. [52] for reusing
garbage cells is as follows.

• If there is a term which can be reused, save a reference to that term in a
  temporary WAM register before doing the unification.

• Before constructing the new term, set a register, D, to the saved structure.
  D is the destructive overwrite register.

• Proceed in destruct mode:

  - If an argument is the same, generate a destr_unify_skip instruction.

  - If it is different, trail if needed (both value and cell address) and
    perform the unification as if it were write-mode, but using D instead
    of H (generate destr_get, destr_unify instructions instead of the usual
    get and unify instructions).

The best results are obtained when a term must be constructed with the same
functor as the garbage cell and which has many arguments in common. Basically,
some additional move instructions are introduced and/or replace instructions for
pushing new terms on the heap.
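The choice between skipping and overwriting an argument in destruct mode can be sketched in Python. The function name is hypothetical, and the emitted strings are placeholders: `destr_unify_skip` is named in the scheme above, while `destr_unify <arg>` merely stands for whichever member of the destr_unify family applies and is not an actual instruction format.

```python
def destruct_mode_code(old_args, new_args):
    """Emit one placeholder instruction per argument of the reused cell."""
    code = []
    for old, new in zip(old_args, new_args):
        if old == new:
            code.append("destr_unify_skip")       # field already correct
        else:
            code.append(f"destr_unify {new}")     # overwrite via register D
    return code

# Reusing the cell [_E | _U] to build [_E | _W] in the append/3 clause.
code = destruct_mode_code(["_E", "_U"], ["_E", "_W"])
```

Only the tail field has to be overwritten in this case, which is why terms with many arguments in common benefit most.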
The code was tested by means of a straightforward extension of a commercial
Prolog compiler [4]. According to the tests, there is little or no measurable time
overhead caused by the new instructions for local reuse and a significant gain can
be expected from avoiding run-time garbage collection. Because in this scheme
the garbage collection is only partial, we will briefly speculate in Section 5.3.3 on
other opportunities for reuse. Further investigations and concrete experiments
are needed to evaluate the effectiveness in an implementation.
Note that the reuse of storage introduces some new requirements on the trail-
ing mechanism of standard Prolog implementations. The trail records actions
that must be undone explicitly on backtracking. Our notion of liveness considers
only forward execution of the source program, completely ignoring backtrack-
ing. Before overwriting the cell, the unify instructions working in destructive
mode have to perform a trail test and, if necessary, trail not only the address
but also the content of the field to be overwritten, so that the cell can be com-
pletely restored on backtracking. Such an extension of the trail has also been
proposed, for example, to assist implementation of delay mechanisms in logic
programs [6, 76] and for a scheme allowing the removal of dereferencing code in
an optimizing compiler [73, 74]. Of course, value-trailing will be more expen-
sive than the WAM's address-trailing; in fact heap space is traded for space on
the trail. A technique of destructive update therefore should avoid reinitializing
structure links that are already set correctly, so that they need not be trailed
and a net gain in memory usage will result.
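Value-trailing for destructive updates can be modelled by the following Python sketch. The `Store` class is hypothetical, and the trail test used (a cell is trailed when it is older than the heap top recorded in the most recent choice-point) is the usual WAM convention, taken here as an assumption.

```python
class Store:
    """Toy heap with a value trail of (address, old_content) entries."""

    def __init__(self, cells):
        self.heap = list(cells)
        self.trail = []

    def destructive_write(self, addr, new_cell, choice_point_heap_top):
        if self.heap[addr] == new_cell:
            return                    # field already correct: nothing to trail
        if addr < choice_point_heap_top:
            # The cell is older than the choice-point: save address and content.
            self.trail.append((addr, self.heap[addr]))
        self.heap[addr] = new_cell

    def backtrack(self, trail_mark):
        """Undo all trailed writes made since the given trail position."""
        while len(self.trail) > trail_mark:
            addr, old = self.trail.pop()
            self.heap[addr] = old

store = Store([("CONST", "a"), ("CONST", "b")])
store.destructive_write(0, ("CONST", "c"), choice_point_heap_top=2)
store.destructive_write(1, ("CONST", "b"), choice_point_heap_top=2)
trail_length = len(store.trail)
heap_after_write = list(store.heap)
store.backtrack(0)
```

The second write leaves the trail untouched because the field already holds the required value, which illustrates why avoiding the reinitialization of correct fields saves trail space.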
Debray [26] considers a code improvement scheme that is complementary to
the method described by Mariën et al. [52]. It is suited to address low-level implementation
details (at the level of the intermediate code) such as the reduction
of tag manipulation, environment allocation and redundant bounds checks. The
transformation scheme proposed consists of a pair of dual transformations on
flow graphs (called code hoisting) and three generic transformations (code in-
troduction, code elimination, and code migration). Code migration consists of
moving a sequence of instructions from one point of a block to another provided
certain conditions are satisfied such as the absence of externally visible side ef-
fects. The conditions ensure that the behavior of the program is not altered. In
Section 5.3.3, we will discuss a few examples where code migration is useful to
allow local reuse of garbage cells.
Performance measurements from a few prototype optimizing compilers have
been reported in the literature. Van Roy and Despain [77, 78] presented the
Aquarius compiler for the BAM (Berkeley Abstract Machine), which is a general-
purpose machine extended with a carefully chosen set of instructions to support
Prolog implementation. The compiler incorporates a simple analysis scheme de-
riving modes that distinguish between ground, non-variable, dereferenced and
uninitialized variables. Because no alias information is deduced, non-trivial
modes are obtained for only part of the predicate arguments (about 56 %).
Nevertheless, the global analysis seems to be effective and allows the compiler
to remove much of the overhead of the powerful features of logic programming,
when they are not used. For larger programs, the quality of the results is restricted,
and suggests that further work should be done to enhance the abstract
domain.
In [72, 73, 74], Taylor describes an abstract domain suited for mode, reference
chain and choice-point analysis. The analyzer differentiates between different
categories of constants, list and compound term types, and keeps track of alias
information. The Parma compiler developed by Taylor is a native (machine)
code compiler for the MIPS RISC architecture; it is not based on the WAM
instruction set, but uses a similar memory model. The information derived by
global analysis is used to reduce the cost of dereferencing and trailing operations.
The statistics presented in the paper on performance improvement and code
size reduction are very promising and indicate that sophisticated native code
compilers using global data-flow analysis can lead to high-performance Prolog
implementations.
Chapter 4

Sharing Analysis
This chapter consists of three parts. In the first section, we motivate the de-
sign of the sharing environments by first identifying the concrete information
of interest and then constructing an appropriate abstraction. The second sec-
tion contains the formal definitions of the primitive operations for the domain
of sharing environments and their soundness proofs. The final section provides
a few examples and discusses the strength of the sharing analysis.

4.1 Sharing Environments


The central problem in program analysis of logic programs for compile-time
garbage collection is detecting the sharing of term substructure that can occur
during program execution. The sharing of structure is a property not of the
standard semantics but of the language implementation. To justify our analysis
using abstract interpretation, we need a concrete semantics that reflects shared
structure. Toward this end, we instrument the standard domain of substitu-
tions with information about the shared structures that would be created by the
implementation, and we augment the primitive operations to maintain this infor-
mation. Implementations [7] share structure among all occurrences of the same
variable and introduce new shared structure when unifying two free variables
or when unifying a free variable with a term. We show that our instrumented
versions of the concrete domain and operations characterize the sharing that
takes place in standard implementations.
Then we sort out an abstraction of the instrumented concrete domain that
leads to sufficiently precise abstract operations for the application of compile-
time garbage collection. Next, we give the formal definitions of these concrete
and abstract domains, which we consider promising for the purpose of liveness
analysis, and not too complex to formally prove the soundness of the operations
(see Section 4.2). Finally, we show that the domain of sharing environments has
the algebraic structure imposed by the framework of abstract interpretation.

4.1.1 Concrete Representation of Shared Structure


In this section, we propose an instrumentation of the standard concrete domain
such that it captures an aspect of logic programming language implementations
normally not expressed, namely the sharing of structure. Primitive operations
for the instrumented domain, mimicking the way concrete implementations
introduce sharing, are informally described and illustrated by means of an example.
The formal definitions and specifications are postponed until Section 4.1.3 and
Section 4.2.
We begin by introducing some auxiliary notions that allow us to discuss the
sharing of structure in the representation of terms. Following [34], an occurrence
in a term $\mathcal{K}$ is a finite sequence of integers describing a position in
$\mathcal{K}$. The indicated position is arrived at by following a path from the
root, selecting each arc in the path according to the successive integers of the
occurrence. We use unordered pairs of occurrences to indicate the positions of
subterms whose representations are shared. In our notation, $\epsilon$ denotes the
empty sequence, "." denotes the operation of sequence concatenation, $Vars$ denotes
the (infinite) set of variables that can occur in terms, $Vars(\mathcal{K})$ denotes
the set of variables that occur in a particular term $\mathcal{K}$, and $\equiv$
denotes syntactic identity.

Definition 4.1.1 For a (possibly non-ground) term $\mathcal{K}$, the set of
occurrences in $\mathcal{K}$, $O(\mathcal{K})$, is defined as follows.

$$O(\mathcal{K}) = \{\epsilon\} \cup \{\, i.s \mid \mathcal{K} \equiv f(\mathcal{K}_1,\ldots,\mathcal{K}_n) \text{ for some functor } f \text{ of arity } n,\ (1 \le i \le n), \text{ and } s \in O(\mathcal{K}_i) \,\}$$

$O(\mathcal{K})$ is finite because we consider only finite, non-circular terms. An
occurrence in $\mathcal{K}$ determines a subterm and a functor or variable in
$\mathcal{K}$.

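Definition 4.1.2 For a (possibly non-ground) term $\mathcal{K}$ and
$s \in O(\mathcal{K})$, the subterm $\mathcal{K}/s$ and functor or variable
$\mathcal{K}[s]$ determined by $s$ are defined as follows.

$$\mathcal{K}/s = \text{if } (s = \epsilon) \text{ then } \mathcal{K} \text{ else } \mathcal{K}_i/s', \text{ where } (s = i.s') \text{ and } (\mathcal{K} \equiv f(\mathcal{K}_1,\ldots,\mathcal{K}_n)),$$
$$\mathcal{K}[s] = \text{if } (\mathcal{K}/s \in Vars) \text{ then } \mathcal{K}/s \text{ else } f, \text{ where } (\mathcal{K}/s \equiv f(\mathcal{K}_1,\ldots,\mathcal{K}_n)).$$

TermShift is a closure operation on sets of pairs of occurrences. We use it in
correspondence with the idea that if two terms occupy the same storage cells
in an implementation, then their corresponding subterms also occupy the same
storage cells. (Note, $\mathcal{K}/r \equiv \mathcal{K}/s \Rightarrow (r.t \in O(\mathcal{K}) \Leftrightarrow s.t \in O(\mathcal{K}))$.)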

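Definition 4.1.3 For $\mathcal{K}$ a term and $P_{\mathcal{K}}$ a set of unordered
pairs $(r, s)$ such that $r, s \in O(\mathcal{K})$ and
$\mathcal{K}/r \equiv \mathcal{K}/s$, we define

$$TermShift(\mathcal{K}, P_{\mathcal{K}}) = \{\, (r.t, s.t) \mid (r, s) \in P_{\mathcal{K}} \ \&\ r.t, s.t \in O(\mathcal{K}) \,\}.$$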
Some subterms cannot share their representation, while in the implementations
for which our analysis is intended, multiple occurrences of the same free variable
always share their representation. So, we introduce two concepts to characterize
meaningful sets of sharing pairs for a given term $\mathcal{K}$.

Definition 4.1.4 For a term $\mathcal{K}$, we call a set $P_{\mathcal{K}}$ of
unordered pairs $(r, s)$ such that $r, s \in O(\mathcal{K})$, a pre-sharing
component for $\mathcal{K}$ if the following conditions are met.

1. $(r, s) \in P_{\mathcal{K}} \Rightarrow \mathcal{K}/r \equiv \mathcal{K}/s$
   Only identical subterms can share.

2. $(r, s) \in P_{\mathcal{K}} \Rightarrow r \neq s$
   Reflexive pairs are meaningless and disallowed out of convenience.

3. $(r, s) \in P_{\mathcal{K}} \Rightarrow TermShift(\mathcal{K}, \{(r, s)\}) \subseteq P_{\mathcal{K}}$
   The set is closed under TermShift.

4. $((r, s), (s, t) \in P_{\mathcal{K}}) \ \&\ (r \neq t) \Rightarrow (r, t) \in P_{\mathcal{K}}$
   The relation is almost transitive.

Definition 4.1.5 For a term $\mathcal{K}$, we call a set $P_{\mathcal{K}}$ of
unordered pairs $(r, s)$ such that $r, s \in O(\mathcal{K})$, a concrete sharing
component for $\mathcal{K}$ if $P_{\mathcal{K}}$ is a pre-sharing component for
$\mathcal{K}$ and it satisfies the following condition, which corresponds to the
property that multiple occurrences of a free variable share their representation.

$$(\mathcal{K}/r \equiv \mathcal{K}/s) \ \&\ (\mathcal{K}/r \in Vars) \ \&\ (r \neq s) \Rightarrow (r, s) \in P_{\mathcal{K}} \qquad (4.1)$$


Recall that we use concrete term environments to represent substitutions. A
concrete sharing environment is a concrete term environment augmented with
a sharing component. The sharing component gives the set of pairs of occur-
rences indicating subterms that occupy the same storage cells in the language
implementation.
We will now illustrate the concrete primitive operations of unification, re-
striction and extension, for the domain of concrete sharing environments. They
compute the sharing of structured terms, as created by standard implementations
of logic programming languages. The auxiliary operation TransitiveClosure
is a closure operation used to join pre-sharing components.

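Definition 4.1.6 For a term $\mathcal{K}$ and a set $P_{\mathcal{K}}$ of unordered
pairs $(r, s)$ such that $r, s \in O(\mathcal{K})$, we define

$$TransitiveClosure(P_{\mathcal{K}}) = \{\, (r_1, s_n) \mid \exists (r_1, s_1) \ldots (r_n, s_n) \text{ a finite sequence over } P_{\mathcal{K}} \text{ such that } (r_1 \neq s_n) \text{ and } (\forall l \in \mathbb{N} : 1 \le l < n \Rightarrow s_l = r_{l+1}) \,\}$$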
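A minimal sketch of this closure in Python may help to fix the idea. It assumes that pairs are given as two-element collections of hashable endpoints; the function name `transitive_closure` and the naive fixpoint iteration are choices of this sketch, not of the text.

```python
def transitive_closure(pairs):
    """Close a set of unordered pairs under chaining, omitting reflexive pairs.

    Two distinct pairs that share exactly one endpoint, {r, s} and {s, t},
    contribute the pair {r, t}; the loop repeats until nothing new is added.
    """
    closure = {frozenset(p) for p in pairs}
    changed = True
    while changed:
        changed = False
        for p in list(closure):
            for q in list(closure):
                # p ^ q is the symmetric difference: the two outer endpoints.
                if p != q and len(p & q) == 1 and (p ^ q) not in closure:
                    closure.add(p ^ q)
                    changed = True
    return closure

closed = transitive_closure({("a", "b"), ("b", "c")})
```

Because the pairs are unordered, a sequence in the sense of the definition may traverse each pair in either direction, which is why the sketch only checks for a common endpoint.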

We first consider the operation
$Unify(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, i, j)$ for
unifying the $i$th and $j$th terms in $\mathcal{K}_{in}$ in the presence of the
sharing component, $CSharing_{\mathcal{K}_{in}}$. If we assume $\mathcal{K}_{in}/l$
is the binding of $X_l$, for each $l$, $1 \le l \le arity(\mathcal{K}_{in}[\epsilon])$,
then $Unify(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, i, j)$
corresponds to the basic operation $X_i = X_j$ employed in [11]. (The other basic
unification operation, $X_i = f(X_{i_1},\ldots,X_{i_n})$, is considered in
Section 4.2.)
Let $Unify(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, i, j) = \langle\mathcal{K}_{out}, CSharing_{\mathcal{K}_{out}}\rangle$.
Then both the term part ($\mathcal{K}_{out}$) and the sharing part
($CSharing_{\mathcal{K}_{out}}$) must reflect the effect of unification. For the
term part, this means applying the most general unifier of $\mathcal{K}_{in}/i$
and $\mathcal{K}_{in}/j$, defined for example as in [47] and denoted by
$mgu(\mathcal{K}_{in}/i, \mathcal{K}_{in}/j)$. (Note that, from the definition of
substitution application, we have the property
$\mathcal{K}_{out} = \mathcal{K}_{in}\sigma \Rightarrow O(\mathcal{K}_{in}) \subseteq O(\mathcal{K}_{out})$.)

Figure 4.1: Concrete input (a) and output (b) sharing environments for the Unify
operation, the DAGs in structure-copying implementations before (c) and after
(d) unification, and the representations (e), (f) in structure-sharing
implementations.

For the sharing part, expressing the effect of unification means adding pairs
indicating positions where unification has introduced new sharing of subterms.
The language implementations that we address create new sharing when binding
a variable to an existing structure. The sharing existing in the input term prior
to the unification is preserved and extended downwards in the case that the
term is further instantiated by the unification (cf. the TermShift operation).
Finally, the new sharing propagates through the old sharing because, for the
representations of terms that we consider, sharing is transitive.
For example, consider the following term and sharing component (see also
Figure 4.1.a).

    $\mathcal{K}_{in}$ = ⟨h(_X, f(_Z)), h(g(_Y), _Y)⟩
    $CSharing_{\mathcal{K}_{in}}$ = {(2.1.1, 2.2)}

When computing $Unify(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, 1, 2)$,
a set BindingPairs is constructed by locating each variable occurrence in the input
terms that are to be unified, $\mathcal{K}_{in}/1$ and $\mathcal{K}_{in}/2$, and
introducing a pair giving the position of that variable occurrence and the
corresponding position in the other term being unified (the other of
$\mathcal{K}_{in}/1$ and $\mathcal{K}_{in}/2$). Then both the sets
$CSharing_{\mathcal{K}_{in}}$ and BindingPairs are shifted according to the
instantiated term $\mathcal{K}_{out}$. The result is given by the pre-sharing
components Old and New respectively. Finally the function TransitiveClosure
generates the concrete sharing component $CSharing_{\mathcal{K}_{out}}$ by
propagating the new sharing according to the old sharing existing in the terms.

    $\mathcal{K}_{out}$ = ⟨h(g(f(_Z)), f(_Z)), h(g(f(_Z)), f(_Z))⟩
    BindingPairs = {(1.1, 2.1), (1.2, 2.2)}
    Old = $CSharing_{\mathcal{K}_{in}}$ ∪ {(2.1.1.1, 2.2.1)}
    New = BindingPairs ∪
          {(1.1.1, 2.1.1), (1.1.1.1, 2.1.1.1), (1.2.1, 2.2.1)}
    $CSharing_{\mathcal{K}_{out}}$ = TransitiveClosure(Old ∪ New)
          = Old ∪ New ∪ {(1.1.1, 1.2), (1.1.1, 2.2), (1.2, 2.1.1),
            (1.1.1.1, 1.2.1), (1.1.1.1, 2.2.1), (1.2.1, 2.1.1.1)}

Figures 4.1.a and 4.1.b show these
$\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle$ and
$\langle\mathcal{K}_{out}, CSharing_{\mathcal{K}_{out}}\rangle$;
sharing edges are represented by dotted lines.
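The worked example can be replayed with a small Python sketch. It follows the informal description given here rather than the formal definition of Section 4.2, and everything in it is an assumption of the sketch: the term encoding (variables as strings, other terms as tuples headed by their functor, "<>" for the tupling functor), the helper names, the unifier without occur check, and the reading of BindingPairs as "positions common to both terms where at least one side is a variable". It also assumes that the two terms are unifiable.

```python
def is_var(t):
    return isinstance(t, str)

def occurrences(t):
    occs = {()}
    if not is_var(t):
        for i, arg in enumerate(t[1:], start=1):
            occs |= {(i,) + s for s in occurrences(arg)}
    return occs

def subterm(t, occ):
    for i in occ:
        t = t[i]
    return t

def pair(r, s):
    return frozenset((r, s))

def term_shift(term, pairs):
    return {pair(r + t, s + t) for r, s in map(tuple, pairs)
            for t in occurrences(subterm(term, r))}

def transitive_closure(pairs):
    closure, changed = set(pairs), True
    while changed:
        changed = False
        for p in list(closure):
            for q in list(closure):
                if p != q and len(p & q) == 1 and (p ^ q) not in closure:
                    closure.add(p ^ q)
                    changed = True
    return closure

def walk(t, subst):
    while is_var(t) and t in subst:
        t = subst[t]
    return t

def unify(a, b, subst):
    """Extend subst to a unifier of a and b (no occur check); None on failure."""
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return {**subst, a: b}
    if is_var(b):
        return {**subst, b: a}
    if a[0] != b[0] or len(a) != len(b):
        return None
    for x, y in zip(a[1:], b[1:]):
        subst = unify(x, y, subst)
        if subst is None:
            return None
    return subst

def substitute(t, subst):
    t = walk(t, subst)
    return t if is_var(t) else (t[0],) + tuple(substitute(x, subst) for x in t[1:])

def binding_pairs(env, i, j):
    a, b = env[i], env[j]
    return {pair((i,) + t, (j,) + t) for t in occurrences(a) & occurrences(b)
            if is_var(subterm(a, t)) or is_var(subterm(b, t))}

def unify_env(env, csharing, i, j):
    out = substitute(env, unify(env[i], env[j], {}))   # assumes unification succeeds
    old = term_shift(out, csharing)
    new = term_shift(out, binding_pairs(env, i, j))
    return out, old, new, transitive_closure(old | new)

K_IN = ("<>", ("h", "_X", ("f", "_Z")), ("h", ("g", "_Y"), "_Y"))
CSHARING_IN = {pair((2, 1, 1), (2, 2))}
k_out, old, new, csharing_out = unify_env(K_IN, CSHARING_IN, 1, 2)
added = csharing_out - old - new   # pairs contributed by the closure alone
```

Under these assumptions, `old`, `new` and `added` coincide with the sets Old, New and the six extra pairs listed above.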
The language implementer wishing to use our analysis must know which
implementations create only sharing captured by the Unify operation, since these
will be the only implementations for which our analysis is suited. In Section 4.2
we give an abstract characterization of these implementations. Here we show
that our analysis is suitable for the two most common structure representation
schemes, structure sharing and structure copying.
The case of structure copying is straightforward. Terms are represented as
directed acyclic graphs (DAGs) in memory. The term occurrences are directly
applicable on DAGs with the understanding that $\mathcal{K}/s$ selects the
dereferenced storage cell. A pair $(r, s)$ in $CSharing_{\mathcal{K}}$ expresses
that $\mathcal{K}/r$ and $\mathcal{K}/s$ are represented by the same storage cell
in the DAG. This is a variable cell, the address of a record on the heap, or the
address of a constant. Conditions 4.1.4.1-4.1.4.4 express the obvious properties
of shared storage cells. Condition (4.1) requires that a single variable cell be
used to represent all occurrences of the same variable. A typical unification
algorithm traverses $\mathcal{K}/i$ and $\mathcal{K}/j$ synchronously until it hits
a variable cell and inserts a pointer from the variable to the other element (which
can be a variable or a structure). From that point on, the dereferenced variable
refers to the same storage cell as the other element. This fact is reflected in the
addition of a pair to BindingPairs. See Figures 4.1.c and 4.1.d for an illustration.
The case of structure sharing is more complex because a bound variable
is represented by a pair of pointers, the first to a skeleton, the second to an
environment. This makes the meaning of $\mathcal{K}/s$ a bit tricky. Such a pair of
pointers is selected when $\mathcal{K}/s$ is a non-variable term, while the address
of a free variable is selected when $\mathcal{K}/s$ is a free variable. For example,
if $\mathcal{K}$ = ⟨h(_X, f(_Z)), h(g(_Y), _Y)⟩ (see Figure 4.1.e), $\mathcal{K}/1$
selects the skeleton h(_X, f(_Z)) and the (local) environment with _X and _Z,
whereas $\mathcal{K}/2.2$ selects the address of the free variable _Y. The pair
$(r, s) \in CSharing_{\mathcal{K}}$ expresses that $\mathcal{K}/r$ and
$\mathcal{K}/s$ either return the same pair of pointers or return the same free
variable address. As in the structure-copying case, Conditions 4.1.4.1-4.1.4.4 and
(4.1) express properties of structure-sharing implementations; the creation of a
pointer during unification in a structure-sharing implementation is soundly
reflected in our instrumented unification by the addition of a pair to BindingPairs.
See Figures 4.1.e and 4.1.f for an illustration.

Figure 4.2: The order in which unification subproblems are solved affects the
sharing created.
The order in which unification subproblems are solved affects the sharing
created. An illustration is given in Figure 4.2. In the top figures, a possible
memory layout is sketched for a WAM-based implementation. On the left, a binding
state is shown prior to the unification of two program variables X1 and X2.
Depending on whether the arguments of corresponding functors are unified in the
order from left to right or vice versa, we end up with different representations,
as is shown in the middle and right top figures respectively. In the bottom figures,
the corresponding concrete sharing environments are shown. The concrete
instrumented unification operation captures the sharing that can be introduced by
unifying such dynamic structures in any order. The set of sharing arcs derived
for this example is the transitively closed union of the sets of arcs shown in the
bottom middle and right figures. In fact the Unify operation yields an upper
approximation of the sharing created in any particular implementation. We expect
the redundancy introduced by this strategy to be negligible. Also note that
the order in which static substructures are unified is predetermined by the normal
form of the program. For instance, the program fact p(_X1, _X1, _X1) could
be transformed to the normal-form clauses p(_X1, _X2, _X3) :- _X1 = _X2, _X2 = _X3,
or p(_X1, _X2, _X3) :- _X2 = _X3, _X1 = _X2. We assume that the normal form used
reflects the program transformations (e.g. reordering of argument unifications)
as performed by the optimization algorithm of the underlying compiler.
Next we consider the call
$Restrict(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, call)$ for
restricting the concrete term environment $\mathcal{K}_{in}$ and its sharing
component $CSharing_{\mathcal{K}_{in}}$, describing the binding state of the
program variables at the program point preceding some call, to a description of the
binding state for the variables pertaining to the call. This operation is quite
simple as we assume that all programs are in normal form, wherein parameter passing
is separated from unification. For the calling term environment,
$\mathcal{K}_{in} = \langle\mathcal{K}_{in}/1,\ldots,\mathcal{K}_{in}/c\rangle$,
describing the substitution
$\{X_1 \leftarrow \mathcal{K}_{in}/1,\ldots,X_c \leftarrow \mathcal{K}_{in}/c\}$,
and the set of variables occurring in the call, e.g. $\{X_{i_1},\ldots,X_{i_m}\}$
where $1 \le i_1,\ldots,i_m \le c$ and $i_1,\ldots,i_m$ are pairwise distinct, the
restriction operation computes a concrete term environment
$\mathcal{K}_{rstr} = \langle\mathcal{K}_{in}/i_1,\ldots,\mathcal{K}_{in}/i_m\rangle$.
For the sharing part $CSharing_{\mathcal{K}_{in}}$, the restriction operation
implies a translation (and restriction) of pairs of occurrences for the term
$\mathcal{K}_{in}$ into pairs of occurrences for the term $\mathcal{K}_{rstr}$.
For example, consider the following concrete term environment and sharing
component (Figure 4.3.a).

    $\mathcal{K}_{in}$ = ⟨f(_U), h(_X, _Y), _Y, h(g(_Z), _Z)⟩
    $CSharing_{\mathcal{K}_{in}}$ = {(2.2, 3), (4.1.1, 4.2)}

Suppose that the call to be executed is P(_X2, _X4), where the predicate P/2 is
defined by the single clause P(_Y1, _Y2) :- _Y1 = _Y2. Computing
$Restrict(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle,$ P(_X2, _X4)$)$
then results in the sharing environment
$\langle\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}\rangle$ of Figure 4.3.b.

    $\mathcal{K}_{rstr}$ = ⟨h(_X, _Y), h(g(_Z), _Z)⟩
    $CSharing_{\mathcal{K}_{rstr}}$ = {(2.1.1, 2.2)}
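The translation of occurrences performed by the restriction can be sketched as follows. This is only an illustration of the informal description: the formal definition is deferred to Section 4.2, and the term encoding (variables as strings, other terms as tuples headed by their functor, "<>" for the tupling functor) and the function name `restrict` are assumptions of the sketch.

```python
def restrict(env, csharing, indices):
    """Restrict env to the components listed in indices (1-based, distinct).

    A sharing pair survives only if both occurrences lie inside retained
    components; its leading integers are renumbered to the new positions.
    """
    new_pos = {i: k for k, i in enumerate(indices, start=1)}
    rstr = (env[0],) + tuple(env[i] for i in indices)
    pairs = set()
    for r, s in map(tuple, csharing):
        if r[0] in new_pos and s[0] in new_pos:
            pairs.add(frozenset(((new_pos[r[0]],) + r[1:],
                                 (new_pos[s[0]],) + s[1:])))
    return rstr, pairs

K_IN = ("<>", ("f", "_U"), ("h", "_X", "_Y"), "_Y", ("h", ("g", "_Z"), "_Z"))
CSHARING_IN = {frozenset(((2, 2), (3,))), frozenset(((4, 1, 1), (4, 2)))}

# The call P(_X2, _X4) retains components 2 and 4.
k_rstr, csharing_rstr = restrict(K_IN, CSHARING_IN, [2, 4])
```

In this run the pair (2.2, 3) is dropped because component 3 is not part of the call, and (4.1.1, 4.2) is renumbered to (2.1.1, 2.2).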

The Restrict operation described is intended for implementations that maintain a
stack of active execution environments. When a call is resolved with a matching
clause, the current execution environment (with domain the set of variables of
the current clause) is pushed onto the stack and a new execution environment
(with domain the set of variables of the matching clause) is created. No more
sharing is created between the old and new environments than follows from
unifying the arguments of the call and the head of the matching clause. The
Restrict operation forms the first step in such a switch of execution environments.
It reduces the current execution environment to the arguments of the call that
are needed for the second step, which consists of the unification of the call and
the head of the matching clause. For programs in normal form, this unification
is trivial, and merely boils down to parameter passing. In the third step, where
the execution environment is extended to the variables of the matching clause
not occurring in the head, no sharing is created. Together, the three steps
implement an application of SLD resolution, where the matching clause is first
renamed such that none of its variables occur in any of the previous derivation
steps.

Figure 4.3: Concrete input (a) and output (b) sharing environments for the
Restrict operation. Concrete input (a), (c) and output (d) sharing environments
for the Extend operation.
Finally, we consider the call
$Extend(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, \langle\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}\rangle, call)$.
The binding state $\langle\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}\rangle$
represents the output concrete sharing environment that resulted from executing
some call for the concrete input sharing environment
$\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle$. However,
$\langle\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}\rangle$ is restricted to
the variables (the arguments) of that call. The Extend operation extends this
restricted binding state to a binding state for all the variables in the concrete
calling environment. The operation is similar to Unify. It applies a substitution
to the concrete term environment $\mathcal{K}_{in}$ to obtain the output term
environment $\mathcal{K}_{out}$. It then propagates the new sharing (resulting from
executing the call) through the old sharing that existed prior to the call. The new
sharing is obtained by translating the sharing component
$CSharing_{\mathcal{K}_{rstr}}$ for the term environment $\mathcal{K}_{rstr}$ to
the term environment $\mathcal{K}_{out}$. The sets of old and new sharings need not
be disjoint.
For example, consider the following concrete sharing environments (see
Figure 4.3.a and 4.3.c).

    $\mathcal{K}_{in}$ = ⟨f(_U), h(_X, _Y), _Y, h(g(_Z), _Z)⟩
    $CSharing_{\mathcal{K}_{in}}$ = {(2.2, 3), (4.1.1, 4.2)}
    $\mathcal{K}_{rstr}$ = ⟨h(g(_Z), _Z), h(g(_Z), _Z)⟩
    $CSharing_{\mathcal{K}_{rstr}}$ = {(2.1.1, 2.2), (1.1.1, 1.2), (2.1.1, 1.1.1), (2.2, 1.2),
                  (1.1, 2.1), (1.1.1, 2.2), (1.2, 2.1.1)}

The concrete sharing environment
$\langle\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}\rangle$ describes the
result of executing the call P(_X2, _X4) used above to illustrate the restriction
operation. When computing
$Extend(\langle\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}\rangle, \langle\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}\rangle,$ P(_X2, _X4)$)$,
the translation of the sharing component $CSharing_{\mathcal{K}_{rstr}}$ back to
the term environment $\mathcal{K}_{out}$ gives the pre-sharing component New. The
function TransitiveClosure is used to propagate the new sharings according to the
old sharings. The result,
$\langle\mathcal{K}_{out}, CSharing_{\mathcal{K}_{out}}\rangle$, is shown in
Figure 4.3.d.

    $\mathcal{K}_{out}$ = ⟨f(_U), h(g(_Z), _Z), _Z, h(g(_Z), _Z)⟩
    Old = {(2.2, 3), (4.1.1, 4.2)}
    New = {(4.1.1, 4.2), (2.1.1, 2.2), (4.1.1, 2.1.1), (4.2, 2.2),
           (2.1, 4.1), (2.1.1, 4.2), (2.2, 4.1.1)}
    $CSharing_{\mathcal{K}_{out}}$ = TransitiveClosure(Old ∪ New)
           = Old ∪ New ∪ {(3, 2.1.1), (3, 4.2), (3, 4.1.1)}

Note that the sharing edge (4.1.1, 4.2) is both an edge existing before the
unification and an edge present in the translation of the sharing component
$CSharing_{\mathcal{K}_{rstr}}$ to the term environment $\mathcal{K}_{out}$
(i.e. (4.1.1, 4.2) ∈ Old ∩ New).
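The sharing part of this example can be checked with a short Python sketch. It covers only the translation of the restricted sharing component and the closure step; the term substitution is not modelled, and Old is taken as given, which suffices here because the old pairs relate variables and so need no further shifting. The encoding of occurrences as integer tuples and the names `translate_back` and `transitive_closure` are assumptions of the sketch.

```python
def pair(r, s):
    return frozenset((r, s))

def translate_back(pairs, indices):
    """Map occurrences of the restricted environment to the calling one.

    indices lists, for each component k of the restricted environment,
    the position i_k it has in the calling environment.
    """
    pos = dict(enumerate(indices, start=1))      # k -> i_k
    def back(occ):
        return (pos[occ[0]],) + occ[1:]
    return {pair(back(r), back(s)) for r, s in map(tuple, pairs)}

def transitive_closure(pairs):
    closure, changed = set(pairs), True
    while changed:
        changed = False
        for p in list(closure):
            for q in list(closure):
                if p != q and len(p & q) == 1 and (p ^ q) not in closure:
                    closure.add(p ^ q)
                    changed = True
    return closure

CSHARING_RSTR = {
    pair((2, 1, 1), (2, 2)), pair((1, 1, 1), (1, 2)), pair((2, 1, 1), (1, 1, 1)),
    pair((2, 2), (1, 2)), pair((1, 1), (2, 1)), pair((1, 1, 1), (2, 2)),
    pair((1, 2), (2, 1, 1)),
}
old = {pair((2, 2), (3,)), pair((4, 1, 1), (4, 2))}
new = translate_back(CSHARING_RSTR, [2, 4])       # call P(_X2, _X4)
csharing_out = transitive_closure(old | new)
added = csharing_out - old - new
```

With these inputs `new` reproduces the set New, and `added` consists of the three pairs that connect occurrence 3 to the occurrences of _Z in components 2 and 4.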

4.1.2 Abstract Representation of Shared Structure


In this section, we discuss the design of an abstract domain for sharing analysis.
We start from a structure that is analogous to the structure of the concrete
sharing environments used so far. However, we argue that, in order to get
abstract primitive operations that are sufficiently precise for the application
of liveness analysis, some further enhancement is needed for the concrete and
abstract domains.
A set of concrete term environments can be approximated with a type-graph
environment, as introduced in Section 2.3. In the present discussion we augment
these abstract term environments with a sharing component. The notions of
occurrence and subterm for concrete terms are extended to type graphs, yielding
selectors and subgraphs. A selector is a sequence of ordered pairs $(i, l)$. Each
pair tells how to move from one non-Or node to another: $i$ is an integer selecting
a child of the first node and $l$ is a non-Or label selecting a (unique) principal
node of that child.
Definition 4.1.7 For a type graph $\mathcal{T}$, an integer $i \in \mathbb{N}$, and
$f$ a functor, a constant or V, define

$$\mathcal{T}/(i, f) = \begin{cases} \text{the connected subgraph rooted at } n \text{ where } Label(n) = f \text{ and } n \in PrincipalNodes(Child(i, Root(\mathcal{T}))), & \text{if such an } n \text{ exists,} \\ \text{undefined}, & \text{otherwise.} \end{cases}$$

An algorithm for constructing the selected subgraph is in [38, 40]. In some cases,
nodes are duplicated to obtain a normal type graph.
Definition 4.1.8 For a type graph $\mathcal{T}$, define the set of selectors in
$\mathcal{T}$ as follows.

$$S(\mathcal{T}) = \{\epsilon\} \cup \{\, (i, f).S \mid S \in S(\mathcal{T}') \text{ where } \mathcal{T}' = \mathcal{T}/(i, f) \text{ is not undefined} \,\}.$$

Notice that because of the principal label restriction, $\mathcal{T}/(i, f)$ is
unique when it is defined. Also, $S(\mathcal{T})$ can be infinite if $\mathcal{T}$
contains backarcs.
Definition 4.1.9 For a type graph $\mathcal{T}$ and a selector
$S \in S(\mathcal{T})$, define the subgraph $\mathcal{T}/S$ and the node
$\mathcal{T}[S]$ determined by $S$ in $\mathcal{T}$, as follows.

$$\mathcal{T}/S = \begin{cases} \mathcal{T} & \text{if } S = \epsilon, \\ (\mathcal{T}/(i, f))/S' & \text{if } S = (i, f).S', \end{cases}$$
$$\mathcal{T}[S] = Root(\mathcal{T}/S).$$

Notice that selectors always determine nodes not labeled 'Or'. The type-graph
analogue of the TermShift operation is a closure operation on sets of pairs of
type-graph nodes.

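Definition 4.1.10 For $\mathcal{T}$ a type graph and $P_{\mathcal{T}}$ a set of
pairs $(r, s)$ such that $r, s \in Nodes_{\mathcal{T}}$ and $Label(r) = Label(s)$
and $Label(s) \neq$ 'Or', define

$$TGShift(\mathcal{T}, P_{\mathcal{T}}) = \{\, (\mathcal{T}[R.T], \mathcal{T}[S.T]) \mid \exists R, S, T : (\mathcal{T}[R], \mathcal{T}[S]) \in P_{\mathcal{T}} \ \&\ R.T, S.T \in S(\mathcal{T}) \,\}.$$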

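Definition 4.1.11 For a term $\mathcal{K}$ and $s \in O(\mathcal{K})$, define

$$Sel(\mathcal{K}, s) = \begin{array}[t]{l} \text{if } s = \epsilon \text{ then } \epsilon \\ \text{else let } s = i.s' \text{ in} \\ \quad \text{if } \mathcal{K}/i \in Vars \text{ then } (i, V) \\ \quad \text{else if } \mathcal{K}/i \in S_{Int} \text{ then } (i, Int) \\ \quad \text{else if } \mathcal{K}/i \in S_{Real} \text{ then } (i, Real) \\ \quad \text{else } (i, \mathcal{K}[i]).Sel(\mathcal{K}/i, s'). \end{array}$$

The Sel function translates the term occurrence $s$ for a term $\mathcal{K}$ into a
type-graph selector, e.g.,
Sel(⟨b, f(a, g(c))⟩, 2.2) = (2, f).(2, g).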
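The recursion in Sel can be traced with a small Python sketch. The encoding is an assumption of the sketch: variables are strings, integers and reals are Python `int` and `float` values, every other term is a tuple headed by its functor (with "<>" for the tupling functor), and a selector is a tuple of (position, label) pairs.

```python
def sel(term, occ):
    """Translate an occurrence in term into a type-graph selector."""
    if not occ:                       # s = epsilon
        return ()
    i, rest = occ[0], occ[1:]
    child = term[i]                   # K/i; term[0] holds the functor
    if isinstance(child, str):        # K/i is a variable
        return ((i, "V"),)
    if isinstance(child, int):        # K/i is an integer constant
        return ((i, "Int"),)
    if isinstance(child, float):      # K/i is a real constant
        return ((i, "Real"),)
    # Otherwise record the functor of K/i and continue below it.
    return ((i, child[0]),) + sel(child, rest)

K = ("<>", ("b",), ("f", ("a",), ("g", ("c",))))
selector = sel(K, (2, 2))
```

For the term ⟨b, f(a, g(c))⟩ and the occurrence 2.2 the sketch yields the selector (2, f).(2, g), as in the example above; variables and numbers are abstracted to the labels V, Int and Real.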
Type graphs are the basis of our abstract term environment. We use type graphs
having their root labeled by the tupling functor ⟨⟩ with arity corresponding to the
number of program variables in the domain of the abstract term environment.
As is explained in Section 2.3, for the purposes of a type inference system a type
graph is augmented with information about what positions in the denoted terms
can be occupied by the same variable.

Definition 4.1.12 An abstract term environment $\mathcal{T}^e$ has the form
$\langle\mathcal{T}, VSharing_{\mathcal{T}}\rangle$. $\mathcal{T}$ is a type graph.
The root node of $\mathcal{T}$, $\mathcal{T}[\epsilon]$, has label ⟨⟩.
$VSharing_{\mathcal{T}}$ is an abstract variable-sharing component.

The type graph $\mathcal{T}$ captures the structure of the terms in the set denoted
by $\mathcal{T}^e$. The sharing of free variables within those terms is captured by
the sharing component $VSharing_{\mathcal{T}}$. We do not require any particular
representation for $VSharing_{\mathcal{T}}$. We regard $\mathcal{T}^e$ as an
abstract data type and TGUnify, TGRestrict, and TGExtend as its operations. This
abstract data type hides the variable sharing component but exports the type-graph
environment, $\mathcal{T}$. For this abstract data type, we assume an operation
TGEnvConc, which maps $\mathcal{T}^e$ to the subset of $TGConc(\mathcal{T})$ that
conforms to $VSharing_{\mathcal{T}}$. In the sequel, we will also use the
following property.

Property 4.1.13 For an abstract term environment $\mathcal{T}^e$ and a concrete
term $\mathcal{K} \in TGEnvConc(\mathcal{T}^e)$, we have

The property can be proved by induction on the length of the path $s$, by using
the inclusion $TGEnvConc(\mathcal{T}^e) \subseteq TGConc(\mathcal{T})$ and the
definition of TGConc.
In order to represent abstract sharing of structure, we borrow such an ex-
isting abstract term environment and instrument it by analogy to the concrete
sharing environment. The instrumentation needs the portion of the abstract
term environment that represents term structure. It does not need the portion
that represents variable sharing. The borrowed abstract term environment and
operations are used to infer what terms program variables can be bound to. The
instrumentation is used to infer shared structures in the representation of those
terms.
Definition 4.1.14 For an abstract term environment $\mathcal{T}^e$ containing the
type graph
$\mathcal{T} = \langle Nodes_{\mathcal{T}}, ForwardArcs_{\mathcal{T}}, BackArcs_{\mathcal{T}}\rangle$,
we call a set of unordered pairs $(m, n)$ such that $m, n \in Nodes_{\mathcal{T}}$,
$Label(m) = Label(n)$, and $Label(m) \neq$ 'Or' an abstract sharing component for
$\mathcal{T}$.
An abstract sharing component, $ASharing_{\mathcal{T}}$, for a type graph
$\mathcal{T}$, constrains the sharing that may be possible in the terms represented
by the type graph $\mathcal{T}$. An abstract sharing environment, consisting of a
type graph and an abstract sharing component, characterizes the current binding
environment in a Prolog-style computation: free variables in a binding must match
variable nodes (nodes labeled 'V') in the type graph; multiple occurrences of the
same variable must match type-graph nodes that are connected by a sharing edge (an
element of $ASharing_{\mathcal{T}}$), since the language implementation is assumed
to give them a shared representation. Unlike concrete sharing components, which are
TermShift closed, abstract sharing components need not be closed w.r.t. TGShift.
The concretization function, defined in Section 4.1.3 below, assures a correct
interpretation by first calling the TGShift operation.
Note that, while a concrete sharing component contains pairs of occurrences
indicating identical subterms, the definition of an abstract sharing component
requires only that sharing edges connect nodes having the same label. We opted
for the weaker condition in the case of abstract sharing components because it
simplifies the primitive operations and safety proofs, while retaining enough of
the precision for the liveness analysis. As an alternative, we could define shar-
ing edges to connect nodes determining subgraphs with non-empty intersection
(as defined in [38, 40]). An even stronger condition is described below in Sec-
tion 4.3.2. The weaker condition allows edges in an abstract sharing component
that may not be relevant with respect to the type graph they are associated
with. We will discuss the consequences for the precision of the sharing analysis
in Section 4.3.2.
While $CSharing_{\mathcal{K}}$ does not contain reflexive pairs, $ASharing_{\mathcal{T}}$ possibly does.
Because of backarcs, a node in a type graph can correspond to several subterms
in a concrete term represented by the type graph. Therefore, self-edges are
significant in an abstract sharing component.

Example: Figure 4.4.a shows an abstract type graph and sharing component
representing open-ended lists whose free-variable elements may share (note the
self-edge). In the type-graph representation, all the elements of the list are
represented by the same node. The following are among the concrete terms
represented: _U, [_X | _U], [_X, _X | _Y], [_X, _Y, _Z, _Y | _W]. [_X | _X] is not.

Figure 4.4: An abstract type graph and sharing component (a) and an abstract
sharing environment (b).

Also, while $CSharing_{\mathcal{K}}$ is (almost) transitive,
$ASharing_{\mathcal{T}}$ is not. For example, if _L1, _L2, and _L3 are (concrete)
lists, it is possible for _L2 to share elements with both _L1 and _L3, although
_L1 and _L3 do not share any element. The abstract sharing environment of
Figure 4.4.b represents such a concrete term environment ⟨_L1, _L2, _L3⟩.
Recall that in the case of concrete term environments, we used the transitivity
property for the sharing relation and the transitive closure operation to mimic
the propagation of sharing. One of the main consequences of abstracting concrete
terms and their definite sharing pairs, by means of type graphs and corresponding
possible sharing edges, is that the transitivity property is no longer appropriate.
It introduces imprecision that would make the abstract domain inappropriate
for the application of liveness analysis. By dropping the transitivity requirement
and using an alternating closure operation, we obtain a level of precision that is
satisfactory for several simple data manipulating programs.
As discussed in the chapter on related work, the idea of alternating paths
was first used by Plaisted [68] in a technique for automatic detection of points
in Prolog programs where the occur check might be violated. Later it was also
used by Søndergaard for a similar flow analysis [70]. Unlike TransitiveClosure,
AlternatingClosure adds only the pairs of nodes connected by a sequence of edges
in which the sharing edges of the input type graphs and the new edges created
by an operation on those graphs (e.g. unification) alternate.
The alternating closure of two sets of edges is formally defined as follows.

Definition 4.1.15 For a type graph $\mathcal{T}$, and two sets $B$, $C$ of
unordered pairs $(R, S)$ such that $R, S \in Nodes_{\mathcal{T}}$, we define

$$AlternatingClosure(B, C) = \{\, (R_1, S_n) \mid \exists (R_1, S_1) \ldots (R_n, S_n) \text{ a finite sequence over } B \cup C \text{ such that } (\forall l \in \mathbb{N} : 1 \le l < n \Rightarrow S_l = R_{l+1}) \text{ and }$$
$$[\forall k \in \mathbb{N} : ((1 \le 2k \le n \Rightarrow (R_{2k}, S_{2k}) \in B) \ \&\ (1 \le 2k+1 \le n \Rightarrow (R_{2k+1}, S_{2k+1}) \in C)) \ \lor$$
$$((1 \le 2k \le n \Rightarrow (R_{2k}, S_{2k}) \in C) \ \&\ (1 \le 2k+1 \le n \Rightarrow (R_{2k+1}, S_{2k+1}) \in B))] \,\}$$

Figure 4.5: Abstract unification: _OT = t(_L, _F, _R).

    insert(_E, _OT, _NT) :- _OT = empty,
        _X = empty, _Y = empty, _NT = t(_X, _E, _Y).
    insert(_E, _OT, _NT) :- _OT = t(_L, _F, _R), _E =< _F,
        _NT = t(_NL, _F, _R), insert(_E, _L, _NL).
    insert(_E, _OT, _NT) :- _OT = t(_L, _F, _R), _E > _F,
        _NT = t(_L, _F, _NR), insert(_E, _R, _NR).

Program 4.1: insert/3
Consider the first recursive clause of Program 4.1 for inserting an (integer)
element into a binary tree. Figure 4.5 shows the abstract sharing environments
before and after the selection operation _OT = t(_L, _F, _R), for the case where
there is no internal sharing in the input binary tree _OT and alternating
closure is used¹. If the abstract unification operation were based on the transitive
closure operation, the new edges resulting from the unification, (1)-(2), would
interact with themselves, and the output type graph would have self-edges in
the functor nodes labeled t of both the subgraphs for the program variables _L
and _R. So, the recursive call insert(_E, _L, _NL) would pass an input binary
tree to the program variable _L for which possible internal sharing between its
subtrees is derived; hence the opportunity for in-place reuse of tree cells would
not be recognized for the recursive call. Of course, using alternating closure
does not cure all causes of imprecision, as will be illustrated by the examples in
Section 4.3.3.
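The difference between the two closure operations can be made concrete with a small Python sketch of the intended reading of AlternatingClosure: a pair is added when its nodes are joined by a path whose consecutive edges come alternately from the two edge sets. The function name, the breadth of the search and the node names n1 ... n4, ot, l, r are hypothetical and chosen for this sketch only; they do not correspond to the nodes of Figure 4.5.

```python
def alternating_closure(b, c):
    """Pairs of nodes joined by a path that alternates between edges of b and c."""
    def adjacency(edges):
        adj = {}
        for e in edges:
            m, n = (tuple(frozenset(e)) * 2)[:2]   # a self-edge {m} becomes (m, m)
            adj.setdefault(m, set()).add(n)
            adj.setdefault(n, set()).add(m)
        return adj

    adj = [adjacency(b), adjacency(c)]
    result = set()
    for start in set(adj[0]) | set(adj[1]):
        seen = set()
        # A state (node, k) means: the next edge must be taken from edge set k.
        frontier = [(start, 0), (start, 1)]
        while frontier:
            node, k = frontier.pop()
            for nxt in adj[k].get(node, ()):
                if (nxt, 1 - k) not in seen:
                    seen.add((nxt, 1 - k))
                    result.add(frozenset((start, nxt)))
                    frontier.append((nxt, 1 - k))
    return result

old_edges = {("n1", "n2")}
new_edges = {("n2", "n3"), ("n3", "n4")}
closed = alternating_closure(old_edges, new_edges)

# Two new edges meeting in one node do not interact with each other.
no_interaction = alternating_closure(set(), {("ot", "l"), ("ot", "r")})
```

In the first call the pair (n1, n3) is added because an old edge is followed by a new one, while (n2, n4) is not, since both of its edges are new. In the second call nothing is added, whereas a transitive closure would also relate l and r.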
¹ Depth bound two is used for the tree constructor nodes.

[Figure 4.6: Abstract input (a) and output (b) sharing environments for the
AbstrUnify operation.]

We now discuss the abstract versions of the unification, restriction and
extension operations. The definition of the abstract unification is similar to
the definition of the concrete unification. Let
$\mathrm{AbstrUnify}((\mathcal{T}_{in}^e, \mathrm{ASharing}_{\mathcal{T}_{in}}), i, j) = (\mathcal{T}_{out}^e, \mathrm{ASharing}_{\mathcal{T}_{out}})$.
For the type-graph part, we take
$\mathcal{T}_{out}^e = \mathrm{TGUnify}(\mathcal{T}_{in}^e, i, j)$,
where the function TGUnify returns the abstract term environment resulting
from abstract unification of the ith and jth components of $\mathcal{T}_{in}$.
(See [38, 40, 83].)
Let us consider the sharing component. As in the concrete case, structures that
are (possibly) sharing before unification remain (possibly) sharing after
unification. However, the abstract sharing component for $\mathcal{T}_{in}$ must
be converted into an abstract sharing component for $\mathcal{T}_{out}$. To this
end, we will define (see Section 4.2) a function Convert that takes the input
sharing $\mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}})$,
expressed in terms of the nodes of $\mathcal{T}_{in}$, and reexpresses it in
terms of the nodes of $\mathcal{T}_{out}$. Special care has to be taken with
edges in $\mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}})$
that involve nodes labeled 'V'. As the output type graph $\mathcal{T}_{out}$
reflects the effect of unification, nodes labeled 'V' in $\mathcal{T}_{in}$ can
correspond to nodes in $\mathcal{T}_{out}$ not labeled 'V', indicating variables
that got bound to some structure.
As in the concrete unification, AbstrUnify introduces new sharing where a
variable is unified against some other structure. These new sharing edges are
collected and mapped onto the output type graph $\mathcal{T}_{out}$ by a
function BindingEdges. This function returns an edge for every node in
$\mathcal{T}_{out}$ corresponding to a node labeled 'V' in the ith or jth
component of $\mathcal{T}_{in}$.
The main difference with concrete unification is that the abstract unification
uses AlternatingClosure, rather than TransitiveClosure, to propagate the sharing
relation. The AlternatingClosure operation adds only the pairs of nodes
connected by a sequence of edges in which the old and new edges alternate.
Example: Figure 4.6.a represents a type graph $\mathcal{T}$ and an abstract
sharing component $\mathrm{ASharing}_{\mathcal{T}}$ = {(m1, m8), (m9, m10)}. For
the nodes of the type graph, we show both their name (e.g., m3) and label
(e.g., f = Label(m3)). Note that
(_X, f(_X,f(_Y,a)), f(a,f(a,a))) and (_X, f(_Y,f(_X,a)), f(a,f(a,a)))
are both represented by $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})$, but
(l, f(_X,f(_X,a)), f(a,f(a,a))) is not, because m8 is not self-sharing. To
compute $\mathrm{AbstrUnify}((\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}), 2, 3)$,
we first use $\mathrm{TGUnify}(\mathcal{T}^e, 2, 3)$ to construct the output
type graph $\mathcal{T}'$ (see Figure 4.6.b). Then we compute

Convert($\mathcal{T}$, $\mathrm{ASharing}_{\mathcal{T}}$, $\mathcal{T}'$) = {(n5, n6), (n5, n10), (n12, n13)} = C,
BindingEdges($\mathcal{T}$, $\mathcal{T}'$, 2, 3) = {(n6, n8), (n10, n12)} = B,
AlternatingClosure(C, B) = C ∪ B ∪ {(n5, n8), (n5, n12), (n10, n13), (n5, n13)}.

In Figure 4.6.b the elements of BindingEdges are represented as dashed lines and
the elements of Convert as dotted lines. Note that the sharing edge (m1, m8) in
$\mathcal{T}$ corresponds to two sharing edges (n5, n6), (n5, n10) in $\mathcal{T}'$.
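
The computation in this example can be checked mechanically. The following is a minimal Python sketch, not the author's implementation: it assumes that sharing edges are unordered pairs, represents the nodes n5, n6, ... by the integers 5, 6, ..., and treats an edge reached by any non-empty alternating path as part of the closure (including a self-edge, should a path return to its start; this treatment of self-edges is an assumption). The function name `alternating_closure` is hypothetical.

```python
from collections import deque


def norm(u, v):
    """Represent an unordered edge as a sorted tuple."""
    return (u, v) if u <= v else (v, u)


def alternating_closure(old, new):
    """Pairs of nodes joined by a path whose edges alternate between old and new."""
    edge_sets = ({norm(*e) for e in old}, {norm(*e) for e in new})
    # adj[kind][node] = neighbours of node via edges of that kind (0 = old, 1 = new)
    adj = ({}, {})
    for kind, edges in enumerate(edge_sets):
        for u, v in edges:
            adj[kind].setdefault(u, set()).add(v)
            adj[kind].setdefault(v, set()).add(u)
    result = set()
    for start in set(adj[0]) | set(adj[1]):
        # A search state is (node reached, kind of the last edge used).
        seen = set()
        queue = deque()
        for kind in (0, 1):
            for v in adj[kind].get(start, ()):
                seen.add((v, kind))
                queue.append((v, kind))
        while queue:
            node, kind = queue.popleft()
            result.add(norm(start, node))
            other = 1 - kind  # the next edge must come from the other set
            for v in adj[other].get(node, ()):
                if (v, other) not in seen:
                    seen.add((v, other))
                    queue.append((v, other))
    return result


# The sets C and B of the example, with node n5 written as 5, and so on.
C = {(5, 6), (5, 10), (12, 13)}
B = {(6, 8), (10, 12)}
closure = alternating_closure(C, B)
```

The pair (n6, n10) is absent from the result: it would need the two edges (n5, n6) and (n5, n10) of C in a row, which a transitive closure allows but an alternating path does not.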
The definition of the abstract restriction operation AbstrRestrict is also very
similar to the definition of the concrete restriction. Let
$\mathrm{AbstrRestrict}((\mathcal{T}_{in}^e, \mathrm{ASharing}_{\mathcal{T}_{in}}), call) = (\mathcal{T}_{intr}^e, \mathrm{ASharing}_{\mathcal{T}_{intr}})$.
For the type-graph part, we take
$\mathcal{T}_{intr}^e = \mathrm{TGRestrict}(\mathcal{T}_{in}^e, call)$, where
the function TGRestrict returns the abstract term environment resulting from
the restriction of the term environment $\mathcal{T}_{in}^e$ to the variables of
the call. (See [38, 40].) For the sharing component, we need a function that
takes the input sharing
$\mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}})$,
expressed in terms of the nodes of $\mathcal{T}_{in}$, and reexpresses the edges
pertaining to the call in terms of the nodes of $\mathcal{T}_{intr}$. Figure 4.7
shows an example input (a) and output (b) of the AbstrRestrict operation for
some call p(_Y,_Z,_U).
Finally, we consider the call
$\mathrm{AbstrExtend}((\mathcal{T}_{in}^e, \mathrm{ASharing}_{\mathcal{T}_{in}}), (\mathcal{T}_{intr}^e, \mathrm{ASharing}_{\mathcal{T}_{intr}}), call) = (\mathcal{T}_{out}^e, \mathrm{ASharing}_{\mathcal{T}_{out}})$,
for extending the abstract binding state
$(\mathcal{T}_{intr}^e, \mathrm{ASharing}_{\mathcal{T}_{intr}})$, which is the
result of the abstract interpretation of some call, for the variables of that
call, to an abstract binding state for the variables in the calling environment
$(\mathcal{T}_{in}^e, \mathrm{ASharing}_{\mathcal{T}_{in}})$. For the type-graph
part, we take
$\mathcal{T}_{out}^e = \mathrm{TGExtend}(\mathcal{T}_{in}^e, \mathcal{T}_{intr}^e, call)$,
where the function TGExtend returns the abstract term environment resulting
from extending the term environment $\mathcal{T}_{intr}^e$ to the variables of
the calling abstract term environment $\mathcal{T}_{in}^e$. (See [38, 40].) The
sharing component is obtained by propagating the new sharing (resulting from
executing the call) through the old sharing existing prior to the call, using
the AlternatingClosure operation. However, computing the new sharing as in the
concrete case, by translating the sharing component
$\mathrm{ASharing}_{\mathcal{T}_{intr}}$ for the term environment
$\mathcal{T}_{intr}^e$ to the term environment $\mathcal{T}_{out}^e$,
introduces imprecision.

Example: (motivation for using a twofold sharing component) Consider the
environment of Figure 4.7.a, involving the program variables _X, _Y, _Z, _U,
and the environment 4.7.c representing the result after a call involving three
of those variables, e.g. p(_Y,_Z,_U). Suppose the definition of p/3 consists of
the single clause p(_Y,_Z,_U) :- _U=f(_Z). In Figure 4.7.d, the edges (2), (3)
and (4) are the result of translating the sharing component of Figure 4.7.c to
the term environment $\mathcal{T}_{out}^e$. Treating these edges as new results
in redundant edges (5,6,7,8). (Note that edges (7) and (8) are irrelevant edges
in the sense defined below in Section 4.3.2.) Subtracting the old (input)
sharing edge (2) from the translated sharing component would avoid redundant
edges (5,7,8) in this case, but is not safe in general. Suppose the definition
of p/3 consists of the single clause p(_Y,_Z,_U) :- _U=f(_Z), _Y=_Z. Then edge
(2) corresponds to newly created sharing and edge (5) is not redundant. Remember
that abstract sharing edges constrain the sharing possible for a set of terms,
while concrete sharing pairs represent definite sharing in one particular term.
On the other hand, subtracting the new sharing edges from the old can be proved
to be sound, but still causes imprecision (e.g. the edges (5) and (6)).

[Figure 4.7: Abstract input (a) and output (b) sharing environments for the
AbstrRestrict operation. Abstract input (a), (c) and output (d) sharing
environments for the AbstrExtend operation.]

Hence, the extension operation needs rather precise knowledge about which
sharing edges in the result of the call correspond to structure sharing
possibly created by the call, and which do not. The AbstrUnify operation can
provide such information. So, we change the abstract sharing environments
proposed so far, such that they preserve the distinction between old (input)
and new (created) sharing edges, returned by the different primitive
operations. The full set of sharing edges, consisting of the alternating
closure of these sets, can be left implicit most of the time.
For example, assume again that the definition of p/3 consists of the single
clause p(_Y,_Z,_U) :- _U=f(_Z). In Figure 4.7.d, only edge (3) will be returned
as a new edge by the AbstrExtend operation. The set of old edges contains (1)
and (2). When computing the full set of sharing edges, we get edge (4) from the
alternating path (2)-(3), but no redundant edges such as (5,6,7,8).
We will prove the soundness of the abstract primitive operations described
above in Section 4.2, by relating the abstract to the concrete operations, and
showing that each sharing pair created by one of the auxiliary functions
TermShift, BindingPairs, and TransitiveClosure has an abstract counterpart
created by the abstract operations TGShift, BindingEdges, and
AlternatingClosure, and that the abstract counterparts of the initial sharing
pairs are preserved by the Convert operation.

4.1.3 The Concrete and Abstract Domains
Having introduced the basic concepts, we are now in the position to formally
define the concrete and abstract domains for the sharing analysis. We use a
twofold sharing component to maintain detailed information about what sharings
are passed down from a calling environment and what sharings are created in
a local environment. The distinction plays a major role in the precision of the
extension operation, as was illustrated in Section 4.1.2.

D e f i n i t i o n 4.1.16 A concrete sharing environment has the form (/C,CSharing~:).


IC is a term that has principal functor O; the arity corresponds to the num-
ber of program variables described by the environment. The sharing component
CSharin~c = (CShr~, CShr~:) is such that CShr~r CShr~c are pre-sharing compo-
nents for IC, and TransitiveCIosure(CShr~c U CShr2) is a concrete sharing compo-
nent for IC.
D e f i n i t i o n 4.1.17 An abstract sharing environment has the form ( T e,
ASharingT-) where 7-~ is an abstract term environment containing the type graph
7- = (Nodes7, ForwardArcsT-, BackArcsT-) with () or _L labeling its root. The
sharing component ASharing7 - (AShr~-,AShr~-) is such that AShr~-, AShr~-,
and AlternatingCIosure('FGShift(7-,AShr~-),TGShift(7-,AShr~-)) are abstract shar-
ing components for 7-.
The abstract interpretation procedure computing abstract AND-OR-graphs
(Section 2.2.1) will associate with every program point an abstract sharing
environment. The first component of $\mathrm{CSharing}_{\mathcal{K}}$ (resp.
$\mathrm{ASharing}_{\mathcal{T}}$) stands for the sharing pushed down from the
calling environment to the local environment (associated with the clause being
interpreted); the second component stands for the sharing created in the local
environment up to the current program point. So, the complete set of sharings
for a program point is obtained by the TransitiveClosure (resp.
AlternatingClosure) operation.
The concretization function InstrEnvConc for abstract sharing environments
is defined as follows.
D e f i n i t i o n 4.1.18 For an abstract sharing environment (7-', (AShr~-, AShr~-)),
InstrEnvConc((7-', (AShr~, AShr~-))) =
(CShr , CSh ?c)) I E TGEnvConc(7- ~)
&~ CShr~o CShr~c are pre-sharing components f o r IC
& TransitiveCIosure(CShr~c LJ CShr~c) is a concrete
sharing component for IC
((r, s) E CShr~: ~
(7-[Sel(/C, r)], 7-[Sel(/C, s)]) E TGShift(T, AShr~-) )
& ((r, s) E CShr~ :=>.
(7-[Sel(/C, r)], 7-[Sel(/C, s)]) (5 TGShift(7", AShr~-) )
Note: In the sequel when there is no confusion possible, we use CSharing~c to
denote either TransitiveCIosure(CShr~c U CShr~) or (CShr~c , CShr~), and similarly
ASharing:r to denote either AlternatingClosure(TGShift(7-,AShr~-),TGShift(7-,
AShr~-)) or (AShr~-,AShr~-). It will be clear from the context what is meant.
We also introduce some notation that is convenient for proving the safety
results.
Definition 4.1.19 For an abstract term environment $\mathcal{T}^e$, a term
$\mathcal{K} \in \mathrm{TGEnvConc}(\mathcal{T}^e)$, and $P_{\mathcal{K}}$ a set
of unordered pairs of the form $(r, s)$ where $r, s \in O(\mathcal{K})$, define
\[
\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}}) = \{ (\mathcal{T}[\mathrm{Sel}(\mathcal{K}, r)], \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)]) \mid (r, s) \in P_{\mathcal{K}} \}.
\]
The $\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}$ function maps a concrete
sharing pair for a term $\mathcal{K}$ into an abstract sharing edge for the type
graph $\mathcal{T}$ recognizing the term $\mathcal{K}$. The concretization
function for abstract sharing environments can now be reformulated as follows.
D e f i n i t i o n 4.1.20 ( R e f o r m u l a t i o n of D e f i n i t i o n 4.1.18) For an abstract
sharing environment (7-', (AShr~-, AShr,~)),

{
InstrEnvConc((7-', (AShr~-, AShr~-))) =
(K:,(CShr~:,CShr~:)) I JC 9 TGEnvConc(7-') ']
& CShr~, CShr~: are pre-sharing components for ]~
& TransitiveCIosure(CShr~c U CShr~:) is a concrete
sharing component for IC
& AbsPairT-,lc(CShr~c ) _C TGShift(7-, AShr~
& AbsPairT-,/c(CShr~) C_ TGShift(7-, AShr~)
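The following lemma states that the alternating closure operation of abstract
sharing components safely approximates the transitive closure operation of
concrete pre-sharing components.
Lemma 4.1.21 (Safety of AlternatingClosure) For an abstract term environment
$\mathcal{T}^e$ and a term $\mathcal{K} \in \mathrm{TGEnvConc}(\mathcal{T}^e)$,
let $P_{\mathcal{K}1}$, $P_{\mathcal{K}2}$ be two pre-sharing components for
$\mathcal{K}$. Then
\[
\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(\mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup P_{\mathcal{K}2})) \subseteq
\mathrm{AlternatingClosure}(\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2})).
\]
Proof. An element of the first set has the form
$(\mathcal{T}[\mathrm{Sel}(\mathcal{K}, r)], \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)])$,
where $r \neq s$ and $(r, s) \in \mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup P_{\mathcal{K}2})$.
We must prove that
\[
(\mathcal{T}[\mathrm{Sel}(\mathcal{K}, r)], \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)]) \in
\mathrm{AlternatingClosure}(\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2})).
\]
By the definition of TransitiveClosure, there exists a finite sequence $E$ over
$P_{\mathcal{K}1} \cup P_{\mathcal{K}2}$ such that
\[
\begin{array}{l}
E = (r_1, s_1), \ldots, (r_n, s_n) \text{ for some } n \in \mathbb{N}, \\
r = r_1,\; s = s_n, \\
\forall l \in \mathbb{N} : 1 \le l < n \Rightarrow s_l = r_{l+1}.
\end{array}
\qquad (4.2)
\]
Given such a finite sequence, we can transform it into an "alternating" sequence
$E'$ satisfying
\[
\begin{array}{l}
E' = (r'_1, s'_1), \ldots, (r'_m, s'_m) \text{ for some } m \in \mathbb{N}, \\
r = r'_1,\; s = s'_m, \\
\forall l \in \mathbb{N} : 1 \le l < m \Rightarrow s'_l = r'_{l+1}, \\
\forall k \in \mathbb{N} : ((1 \le 2k \le m \Rightarrow (r'_{2k}, s'_{2k}) \in P_{\mathcal{K}1}) \;\&\;
(1 \le 2k+1 \le m \Rightarrow (r'_{2k+1}, s'_{2k+1}) \in P_{\mathcal{K}2})) \;\vee \\
\qquad\quad ((1 \le 2k \le m \Rightarrow (r'_{2k}, s'_{2k}) \in P_{\mathcal{K}2}) \;\&\;
(1 \le 2k+1 \le m \Rightarrow (r'_{2k+1}, s'_{2k+1}) \in P_{\mathcal{K}1})).
\end{array}
\qquad (4.3)
\]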

We first introduce two reduction steps for finite sequences satisfying (4.2). At
least one of these reduction steps will be applicable if (4.3) is not satisfied.

reduction 1: Let $E = (r_1, s_1), \ldots, (r_n, s_n)$ be a sequence satisfying
    (4.2), and $k \in \mathbb{N}$ such that $1 \le k < n$ and
    $(r_k, s_k), (r_{k+1}, s_{k+1}) \in P_{\mathcal{K}1}$. If $r_k = s_{k+1}$,
    we remove $(r_k, s_k), (r_{k+1}, s_{k+1})$ from the finite sequence $E$.
    This gives us a new finite sequence $E'$ (of shorter length) satisfying
    (4.2). Otherwise, we have

    $r_k \neq s_{k+1} \;\&\; ((r_k, s_k), (r_{k+1}, s_{k+1}) \in P_{\mathcal{K}1}) \;\&\; s_k = r_{k+1}$.   (4.4)

    Because $P_{\mathcal{K}1}$ is a pre-sharing component, property 4.1.4.4 is
    satisfied, so it follows from (4.4) that there exists an element
    $(r_k, s_{k+1}) \in P_{\mathcal{K}1}$. In the finite sequence $E$, we
    replace $(r_k, s_k), (r_{k+1}, s_{k+1})$ by $(r_k, s_{k+1})$. Again, we get
    a new finite sequence $E'$ (of shorter length) satisfying (4.2).
reduction 2: The same as reduction 1, but $P_{\mathcal{K}1}$ replaced by
    $P_{\mathcal{K}2}$.

Clearly, the reduction steps can only be applied a finite number of times and
lead to an alternating sequence $E' = (r'_1, s'_1), \ldots, (r'_m, s'_m)$
satisfying (4.3). For every $k \in \mathbb{N}$ such that $1 \le k \le m$, let
$(R_k, S_k) = (\mathcal{T}[\mathrm{Sel}(\mathcal{K}, r'_k)], \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s'_k)])$.
From the definition of $\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}$, it follows
that
\[
\begin{array}{l}
(r'_k, s'_k) \in P_{\mathcal{K}1} \Rightarrow (R_k, S_k) \in \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \\
(r'_k, s'_k) \in P_{\mathcal{K}2} \Rightarrow (R_k, S_k) \in \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2}).
\end{array}
\]
Using the sequence $(R_1, S_1), \ldots, (R_m, S_m)$ in the definition of
AlternatingClosure, we have
\[
(R_1, S_m) = (\mathcal{T}[\mathrm{Sel}(\mathcal{K}, r)], \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)]) \in
\mathrm{AlternatingClosure}(\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2})),
\]
as desired. []
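
The two reduction steps of the proof can be sketched as a small procedure. The Python below is only an illustration under stated assumptions: each pair carries a tag (1 or 2) naming the pre-sharing component it comes from, the input already satisfies the chaining condition of (4.2), and property 4.1.4.4 is taken for granted, so the merged pair is assumed to belong to the same component without being checked. The name `reduce_to_alternating` is hypothetical.

```python
def reduce_to_alternating(sequence):
    """Apply reduction 1 / reduction 2 until the component tags alternate.

    `sequence` is a list of (r, s, tag) triples with tag in {1, 2} and
    s_l == r_{l+1} for consecutive triples.
    """
    stack = []
    for pair in sequence:
        stack.append(pair)
        # Two adjacent pairs from the same component: cancel or merge them.
        while len(stack) >= 2 and stack[-1][2] == stack[-2][2]:
            r2, s2, tag = stack.pop()
            r1, s1, _ = stack.pop()
            if r1 == s2:
                continue  # (r1, s1), (r2, s2) with r1 == s2: remove both pairs
            # Otherwise replace them by (r1, s2), relying on property 4.1.4.4.
            stack.append((r1, s2, tag))
    return stack


merged = reduce_to_alternating(
    [("a", "b", 1), ("b", "c", 1), ("c", "d", 2), ("d", "e", 2), ("e", "f", 1)]
)
cancelled = reduce_to_alternating([("a", "b", 1), ("b", "a", 1), ("a", "c", 2)])
```

Each step shortens the sequence, which mirrors the termination argument in the proof; the endpoints of the whole sequence are unchanged.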

Based on the previous result, we can now show that the full sharing relation
of some concrete environment described by an abstract sharing component can
be safely approximated by the AlternatingCIosure operation.

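Lemma 4.1.22 For an abstract sharing environment
$(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})$ and a concrete sharing
environment
$(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \in \mathrm{InstrEnvConc}((\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}))$,
we have
\[
\mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(\mathrm{CSharing}_{\mathcal{K}}) \subseteq \mathrm{ASharing}_{\mathcal{T}}.
\]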
Proof. From the definition of InstrEnvConc, we know that
AbsPairT,/c(CShr~c ) C TGShift(T, AShr~-)&
AbsPalr,T,/c(CShr~: ) C TGShift(7-, AShr~-). (4.5)

We derive that

AbsPairT,)C (CSharingK:)
-- AbsPair]-,/c(TransitiveCIosure(CShr~c U CShr~c)),
by the definition of CSharingK:,
C_AlternatlngCIosure(AbsPairT,lc(CShr~c), AbsPairT, lc(CShr~c)) ,
by Lemma 4.1.21 (the safety of AlternatingCIosure),
C AlternatingClosure(TGShift(~, AShr~-), TGShift('/', AShr~-)),
by (4.5),
and the monotonicity of AlternatingClosure,
C_ASharingT,
by the definition of ASharlngcr.
Summarizing, we have AbsPairT,ic(CSharingK: ) c_ ASharing~-, as desired. []

4.1.4 Order Relation and Upperbound Operation


In this section we prove that the domain of sharing environments has the
algebraic structure imposed by the framework of abstract interpretation. We
first define a (pre)order relation and an upperbound operation for abstract
sharing environments having the same arity. In the definitions we use the
(pre)order relation $\le_{TG}$ and the upperbound operation TGUpp for abstract
term environments. The function Translate takes an abstract sharing component
expressed for a type graph $\mathcal{T}_1$ and reexpresses it for a type graph
$\mathcal{T}_2$ that describes a superset of terms represented by the type graph
$\mathcal{T}_1$.
Definition 4.1.23 For two abstract term environments $\mathcal{T}_1^e$,
$\mathcal{T}_2^e$ such that
$\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$
and $\mathcal{T}_1^e \le_{TG} \mathcal{T}_2^e$, and for an abstract sharing
component $P_{\mathcal{T}_1}$ for $\mathcal{T}_1$, define
\[
\mathrm{Translate}(\mathcal{T}_1, P_{\mathcal{T}_1}, \mathcal{T}_2) =
\{ (\mathcal{T}_2[R], \mathcal{T}_2[S]) \mid \exists R, S \in S(\mathcal{T}_1) : (\mathcal{T}_1[R], \mathcal{T}_1[S]) \in \mathrm{TGShift}(\mathcal{T}_1, P_{\mathcal{T}_1}) \}.
\]
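
To make the definition concrete, here is a rough Python sketch under simplifying assumptions that are not taken from the text: a type graph is modelled as a dictionary from selectors (tuples of argument positions) to node names, every selector of the first graph is also a selector of the second, and the edge set passed in is assumed to be the result of TGShift already. The function name, the selectors, and the node names are all hypothetical.

```python
def translate(tg1, shifted_edges, tg2):
    """Re-express edges over the nodes of tg1 as edges over the nodes of tg2.

    tg1, tg2: dicts mapping a selector (tuple of ints) to a node name.
    shifted_edges: set of frozensets, each an unordered pair of tg1 nodes.
    """
    result = set()
    for r in tg1:
        for s in tg1:
            if frozenset((tg1[r], tg1[s])) in shifted_edges:
                result.add(frozenset((tg2[r], tg2[s])))
    return result


# Hypothetical graphs: two selectors of tg1 lead to the same node m8,
# whereas tg2 distinguishes them as n6 and n10.
tg1 = {(1,): "m1", (2, 1): "m8", (2, 2, 1): "m8"}
tg2 = {(1,): "n5", (2, 1): "n6", (2, 2, 1): "n10"}
edges = {frozenset(("m1", "m8"))}
translated = translate(tg1, edges, tg2)
```

In this toy instance one edge of the smaller graph becomes two edges of the larger graph, because the larger graph has a separate node for each selector.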
D e f i n i t i o n 4.1.24 ( < s h ) F o r abstract sharing environments (7-~,(AShr~l ,
AShr~l)) and (7-~, (AShr~2 , AShr~2)) such that arity(Tl[e])= arity(T2[e]),
( ,(AShr I,AShr I)) <_sh

<re
&~ Translate(T1, AShr~-l,'/'2) C_ TGShift(T2, AShr~-2)
&~ Translate(7"1, AShr~z , 7"~) _C TGShift(T~, AShr~-~)
Note that in the case where (7"~ --- T~) 2 it follows from the definition that
(T~, (AShr~-l, AShr~-l) ) ~Sh (T~, (AShr.~2 ,Ashr~-2) )

TGShlft(T1, AShr~- 1) C TGShift(T2, AShr~-2)


& TGShift(7"l, AShr~-,) C TGShift(T2, AShr~-2)
² By $\mathcal{T}_1^e \equiv \mathcal{T}_2^e$ we mean syntactical identity, not a shorthand for $\mathcal{T}_1^e \le_{TG} \mathcal{T}_2^e \;\&\; \mathcal{T}_2^e \le_{TG} \mathcal{T}_1^e$.

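Lemma 4.1.25 For abstract sharing environments
$(\mathcal{T}_1^e, \mathrm{ASharing}_{\mathcal{T}_1})$ and
$(\mathcal{T}_2^e, \mathrm{ASharing}_{\mathcal{T}_2})$ such that
$\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$,
\[
(\mathcal{T}_1^e, \mathrm{ASharing}_{\mathcal{T}_1}) \le_{Sh} (\mathcal{T}_2^e, \mathrm{ASharing}_{\mathcal{T}_2})
\;\Rightarrow\;
\mathrm{InstrEnvConc}((\mathcal{T}_1^e, \mathrm{ASharing}_{\mathcal{T}_1})) \subseteq \mathrm{InstrEnvConc}((\mathcal{T}_2^e, \mathrm{ASharing}_{\mathcal{T}_2})).
\]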


Proof. Assume that (TI, ASharing:rl) -<Sh (T~, ASharingT-2) and t/C, CSharingpc)
9 InstrEnvConc((T~., ASharingc%)). By the definition of -<sh, we have
T; -<TO T~ & Translate(T1, AShr~l , T2) C TCShift(T2, AShr~-2)
& Translate(T1, AShr~-, T2) _CTGShift(T2, AShr~2 ).
From Property 2.3.8 of <TO we know that TGEnvConc(T~) C_TGEnvConc(T~)
and hence, by the definition of InstrEnvConc, we have that K: E TGEnvConc(T~),
CShr~c and CShr~c are pre-sharing components for K:, TransitiveClosure(CShr~: U
CShr~c ) is a concrete sharing component for K:, and

(r, s) E CShr~: ~ (TI[SeI(/C,r)],'/'I[SeI(/C,s)]) E TGShift(T1, AShr~-1),


(r,s) E CShr~c =~ (Tl[Sel(/C,r)],Tl[Sel(/C,s)]) E TGShift(T1,AShr~-l).
For (r, s) E CShr~: (resp. CShr~c) it follows from Property 4.1.13 that SeI(K:, r),
Sel(/C, s) E S(T1)V~ S(T2). So using the definition of Translate we derive

(r, s) 9 CShr~c =:~ (T2[SeI(K~, r)], T2[SeI(K~, s)]) 9 Translate(T1, AShr~-l, T2),
(r,s) 9 CShr~c ::~ (T2[SeI(/C,r)I,T2[SeI(/C,s)]) E Translate(T1,AShr~-l, T2 ),
and by the definition of -<s~

(r, s) E CShr~c ::> (T2[SeI()C,r)], T2[SeI(/C,s)]) E TGShift(T2, AShr~-2),


(r, s) E CShr~: =~ (T2[SeI(/C,r)], T2[SeI(/C,s)]) E TGShift(T2, AShr~2).

Hence, we have (/C, CSharingx:) E InstrEnvConc((T~,ASharingT-2)), as desired.


[]

The minimal element of the abstract domain of sharing environments is
$(\mathcal{T}_\bot^e, (\emptyset, \emptyset))$, with $\mathcal{T}_\bot^e$ as
defined in Section 2.3.2. The maximal element of the abstract domain pairs
$\mathcal{T}_\top^e$, as defined in Section 2.3.2, with a sharing component
whose two parts are both the maximal set of sharing edges, containing all pairs
of function nodes of $\mathcal{T}_\top$ having the same label. The subdomain of
sharing environments based on restricted type graphs does not contain any
infinite ascending chains, because the domain of restricted type graph
environments is finite, and for each finite type graph, there are only finitely
many different sharing components.

D e f i n i t i o n 4.1.26 (Upp)For abstract sharing environments (T~,(AShr~-l,


AShr~,)) and ITS, (AShram,AShram)) s~ch that arity(T~[~])= arity(T~[~]),

Upp((T;, (AShr~-l, AShr~-l)), (T;, (AShr~-2, AShr~-~))) = (T', (AShr~-, AShr~-))

where
~
T = TGUpp(T~, T~),
AShr~r = Translate('Tl, AShr~l , T) U Translate(T2, AShr.~2, T),
AShr~- = Translate(T1, AShr.~l , 7") U Translate(T2, AShr~-~, T).

Note that when T~ -~ T[ - T ~, we can define Upp as follows

Upp((T~, (AShr.~l , AShr~-l)), (T~, (AShr.~, AShr~-2))) =


(T', ((AShr~- 1 U AShr~-2), (AShr~- I U AShr~-a))).

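Definition 4.1.26 can straightforwardly be generalized for a finite set
$\{\beta_1, \ldots, \beta_n\}$ of abstract sharing environments. For all $l$
satisfying $1 \le l \le n$, let
$\beta_l = (\mathcal{T}_l^e, \mathrm{ASharing}_{\mathcal{T}_l})$, then we define
\[
\mathrm{Upp}(\{\ldots, \beta_l, \ldots\}) = \mathrm{Upp}(\beta_1, \mathrm{Upp}(\ldots \mathrm{Upp}(\beta_{n-1}, \beta_n) \ldots)).
\]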
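
As an illustration of this generalization, the following Python sketch covers only the special case noted above in which all environments share a syntactically identical type graph, so that Upp reduces to a component-wise union of the two sharing components. The environment encoding `(type_graph, (first, second))` and the names `upp` and `upp_all` are assumptions made for the example; the general case would need TGUpp and Translate.

```python
from functools import reduce


def upp(env1, env2):
    """Upper bound of two environments over the same type graph."""
    tg1, (first1, second1) = env1
    tg2, (first2, second2) = env2
    if tg1 != tg2:
        raise ValueError("this sketch only covers identical type graphs")
    # Component-wise union of the two sharing components.
    return (tg1, (first1 | first2, second1 | second2))


def upp_all(envs):
    """Upp(b1, Upp(b2, ... Upp(b_{n-1}, b_n) ...)) for a non-empty list."""
    envs = list(envs)
    # Fold from the right: start with b_n, then combine b_{n-1}, ..., b_1.
    return reduce(lambda acc, env: upp(env, acc), reversed(envs[:-1]), envs[-1])


e1 = ("T", ({(1, 2)}, set()))
e2 = ("T", ({(2, 3)}, {(4, 5)}))
e3 = ("T", (set(), {(6, 7)}))
bound = upp_all([e1, e2, e3])
```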

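Lemma 4.1.27 For abstract sharing environments
$(\mathcal{T}_1^e, \mathrm{ASharing}_{\mathcal{T}_1})$ and
$(\mathcal{T}_2^e, \mathrm{ASharing}_{\mathcal{T}_2})$ such that
$\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$,
let
$\mathrm{Upp}((\mathcal{T}_1^e, \mathrm{ASharing}_{\mathcal{T}_1}), (\mathcal{T}_2^e, \mathrm{ASharing}_{\mathcal{T}_2})) = (\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})$.
Then
\[
(\mathcal{T}_1^e, \mathrm{ASharing}_{\mathcal{T}_1}) \le_{Sh} (\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}) \;\&\;
(\mathcal{T}_2^e, \mathrm{ASharing}_{\mathcal{T}_2}) \le_{Sh} (\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}).
\]

Proof. This follows from Property 2.3.8 of TGUpp and the definitions of
$\le_{Sh}$ and Upp. []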

4.2 Primitive Operations


In this section, we specify the primitive operations for the concrete and abstract
domain of sharing environments (unification, procedure entry, and procedure
exit) and we formally prove the safety of the abstract operations with respect to
the concrete operations. The specifications are based on the operations for the
type-graph environments (see Section 2.3.3).

4.2.1 Unification
First we give the specifications of the concrete and abstract unification
corresponding to the basic operation $X_i = X_j$. Then we reformulate these
specifications for the basic operation $X_i = f(X_{i_1}, \ldots, X_{i_j})$. For
Prolog programs in normal form, these are the only two forms of unification
that can occur.

4.2.1.1 Xi = Xj

D e f i n i t i o n 4.2.1 For a concrete sharing environment (ICi,,, (CShr~c,.,, CShr~c,,.) )


and 1 <_ i < j <_ arity(Igi,~[r
Unify((/Ein, (CShr~c,,,, CShr~:,.)), i,j)) =
if mgu(JC,./i,/c.,/j) is fail
then fail
else (]Co,a, (CShr~c.,,,, s
where
~ o . t =-- Idi. o', f o r a = mgu(lCin/i, ]Cin/j),
and
CShr~c.. ' = TermShift(/Co~,t, CShr~:,.),
AccNew = TermShift(/Co,,t, CShr~:~.),
New = mermShift(/Co.t, BindingPairs(/Ci,,, i, j)),
CShr~:.~, = TransitiveCIosure(AccNew U New),
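\[
\mathrm{BindingPairs}(\mathcal{K}_{in}, i, j) =
\{ (i.p, j.p) \mid i.p, j.p \in O(\mathcal{K}_{in}) \;\&\; (\mathcal{K}_{in}/i.p \in \mathit{Vars} \;\vee\; \mathcal{K}_{in}/j.p \in \mathit{Vars}) \}.
\]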
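
The BindingPairs set can be computed by walking the ith and jth components in parallel. The Python sketch below is illustrative only and rests on an assumed encoding that is not part of the text: a compound term is a tuple `(functor, arg1, ..., argn)`, a constant is a plain string, a variable is a string starting with an underscore, and a position is a tuple of argument indices.

```python
def is_var(term):
    """Variables are strings starting with '_' (an encoding assumption)."""
    return isinstance(term, str) and term.startswith("_")


def subterm(term, path):
    """Return the subterm at `path`, or None if `path` is not a position."""
    for idx in path:
        if not isinstance(term, tuple) or not 1 <= idx < len(term):
            return None
        term = term[idx]
    return term


def binding_pairs(env, i, j):
    """All (i.p, j.p) where both positions exist and one holds a variable."""
    pairs = set()
    stack = [()]
    while stack:
        p = stack.pop()
        a = subterm(env, (i,) + p)
        b = subterm(env, (j,) + p)
        if a is None or b is None:
            continue
        if is_var(a) or is_var(b):
            pairs.add(((i,) + p, (j,) + p))
        # Deeper positions exist on both sides only if both are compound.
        if isinstance(a, tuple) and isinstance(b, tuple):
            for k in range(1, min(len(a), len(b))):
                stack.append(p + (k,))
    return pairs


# Hypothetical environment (_X, f(_Y, a), f(a, g(_Z))).
env = ("()", "_X", ("f", "_Y", "a"), ("f", "a", ("g", "_Z")))
pairs_2_3 = binding_pairs(env, 2, 3)
pairs_1_2 = binding_pairs(env, 1, 2)
```

Note that the set is computed on the input term alone, as in the definition, without checking whether the unification succeeds.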
The following lemma states that the initial sharing is preserved.

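Lemma 4.2.2 Let $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}})$ be a
concrete sharing environment. Let
$(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}) = \mathrm{Unify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}), i, j)$
(Unify does not fail), where
$1 \le i < j \le \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$. Then
$\mathrm{CSharing}_{\mathcal{K}_{in}} \subseteq \mathrm{CSharing}_{\mathcal{K}_{out}}$.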
Proof. From the properties of TermShift and the definition of Unify, we have

CShr~,. C_ TermShift(/Co.t, CShr~q,.) = CShr~c..,, (4.6)


CShr~c,~ C_ TermShift(/go,,t, CShr~,.) = AccNew. (4.7)

From (4.7), the properties of TransitiveCIosure and the definition of Unify, we


have

CShr~,. C TransitiveClosure(AccNewU New) = CShr~o.. (4.8)

From (4.6), (4.8) and the monotonicity of TransitiveCIosure, it follows that

TransitiveClosure(CShr~:,,, U CShr~:,,,) C_ TransitiveClosure(CShr~:.., u CShr2_,)


or shortly,
CSharingjq~ C CSharingK:o~t.
[]

To prove that Definition 4.2.1 guarantees the concrete unification operation
Unify(-, i, j) to be well defined on the domain of concrete sharing
environments, we proceed as follows. First we demonstrate that
New = TermShift($\mathcal{K}_{out}$, BindingPairs($\mathcal{K}_{in}$, i, j)) is
a pre-sharing component for $\mathcal{K}_{out}$. Second, using the previous
result, we prove that the two components of the output sharing, AccNew, and
$\mathrm{CSharing}_{\mathcal{K}_{out}}$ (the transitive closure of the union of
these two components) are pre-sharing components for $\mathcal{K}_{out}$.
Third, we introduce the high-level unification algorithm. Fourth, we
demonstrate that $\mathrm{CSharing}_{\mathcal{K}_{out}}$ is a concrete sharing
component for $\mathcal{K}_{out}$. Using the fact that it is a pre-sharing
component, we still need to show that it satisfies Condition (4.1), concerning
multiple occurrences of a free variable in $\mathcal{K}_{out}$.

Lemma 4.2.3 Let $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}})$ be a
concrete sharing environment. Let
$(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}) = \mathrm{Unify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}), i, j)$
(Unify does not fail), where
$1 \le i < j \le \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$. Then
New = TermShift($\mathcal{K}_{out}$, BindingPairs($\mathcal{K}_{in}$, i, j)) is
a pre-sharing component for $\mathcal{K}_{out}$.

Proof. From the definitions of TermShift and BindingPairs, it is clear that New
is a set of unordered pairs of the form $(r, s)$ where
$r, s \in O(\mathcal{K}_{out})$. We must prove that New satisfies Conditions
4.1.4.1-4.

Condition 4.1.4.1 is satisfied by New: Given $(r, s) \in$ New, we must prove
    that $\mathcal{K}_{out}/r \equiv \mathcal{K}_{out}/s$. The property holds
    for the set BindingPairs($\mathcal{K}_{in}$, i, j) as, after unifying
    $\mathcal{K}_{in}/i$ and $\mathcal{K}_{in}/j$, we have that
    $\mathcal{K}_{out}/i.s \equiv \mathcal{K}_{out}/j.s$ for all paths
    $s \in O(\mathcal{K}_{out}/i)$. The function TermShift preserves the
    property because corresponding subterms of identical terms are identical.

Condition 4.1.4.2 is satisfied by New: Given $(r, s) \in$ New, we must prove
    that $r \neq s$. The property holds for the set
    BindingPairs($\mathcal{K}_{in}$, i, j), as $i \neq j$. The function
    TermShift preserves the property.

Condition 4.1.4.3 is satisfied by New: Given $(r, s) \in$ New, we have to prove
    that TermShift($\mathcal{K}_{out}$, $\{(r, s)\}$) $\subseteq$ New. This
    follows from the idempotence of TermShift w.r.t. $\mathcal{K}_{out}$.

Condition 4.1.4.4 is satisfied by New: Given $(r, s), (s, t) \in$ New where
    $r \neq t$, we must prove that $(r, t) \in$ New. From the definitions of
    TermShift and BindingPairs, we know that $(r, s)$ and $(s, t)$ are both of
    the form $(i.p.q, j.p.q)$ where $(i.p, j.p)$ is a member of
    BindingPairs($\mathcal{K}_{in}$, i, j). Suppose that $s = i.p.q$, then
    $r = j.p.q = t$, i.e. there are no two different consecutive pairs in New.
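
The idempotence of TermShift used for Condition 4.1.4.3 can be illustrated with a small sketch. The definition of TermShift lies outside this passage, so the Python below assumes the reading suggested by the later proof of Lemma 4.2.4: for each pair $(r, s)$, TermShift adds every pair $(r.t, s.t)$ such that both $r.t$ and $s.t$ are positions of the term. The term encoding (tuples for compound terms, strings for constants and variables) and the function names are hypothetical, and pairs are stored as ordered tuples for simplicity.

```python
def subterm(term, path):
    """Return the subterm at `path`, or None if `path` is not a position."""
    for idx in path:
        if not isinstance(term, tuple) or not 1 <= idx < len(term):
            return None
        term = term[idx]
    return term


def term_shift(term, pairs):
    """Extend each pair (r, s) with all pairs (r.t, s.t) of existing positions."""
    result = set()
    for r, s in pairs:
        stack = [()]
        while stack:
            t = stack.pop()
            a = subterm(term, r + t)
            b = subterm(term, s + t)
            if a is None or b is None:
                continue
            result.add((r + t, s + t))
            if isinstance(a, tuple) and isinstance(b, tuple):
                for k in range(1, min(len(a), len(b))):
                    stack.append(t + (k,))
    return result


# Hypothetical unified environment (f(_X, a), f(_X, a)).
k_out = ("()", ("f", "_X", "a"), ("f", "_X", "a"))
shifted = term_shift(k_out, {((1,), (2,))})
```

Applying `term_shift` to its own output adds nothing new, which is the idempotence property in this toy setting.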

L e m m a 4.2.4 Let (l~i,~, (CShr~:,~, CShr~,~)) be a concrete sharing environment.


Let (]Co,,, (CShr~co.,, CShr~c.~,) ) -- Unify((E,n, (CShr~q~, CShr~q~)), i, j) (Unify
does not fail), where 1 <_ i < j < arity(ICin[e]). Then CShr~c..,, AccNew,
CShr~c.~~ and CSharingjco., are pre-sharing components for the term lCo,,t (where
CSharingjc.., = TransitiveCIosure(CShr~c.~ ' U CShr~~

Proof. From the definitions of Unify and its auxiliary functions, it is clear
that s AccNew, s and s are sets of unordered pairs
of the form (r, s) where r, s E O(/Co~t). We must prove that s AccNew,
s t and s satisfy Conditions 4.1.4.1-4.

C o n d i t i o n 4.1.4.1 is s a t i s f i e d b y t h e sets s176 AccNew, s , and


CSharingjc~
Given (r, s) E CShr~co., (resp. AccNew, CShr~c~ CSharing~:~ we must
prove that /Co~t/r --/Co~t/s.
By the assumptions of the lemma, CShr~q. and CShr~. satisfy Condi-
tion 4.1.4.1. The function TermShift preserves the property because cor-
responding subterms of identical terms are identical. So, for (r,s) E
CShr~c~ ' = TermShift(K:o~t, CShr~c,~) (and similar for the set AccNew --
TermShift(/Co~,t, CShr~c,,,)), we have ICo~,t/r =- ICo~,t/s.
From Lemma 4.2.3 we know that for (r,s) E New, we have ICo=t/r -~
ICo~,t/s. Using the definition of TransitiveClosure and the transitivity of
term equality, the property follows for CShr~c.. ' = TransitiveCIosure(New U
AccNew) and CSharingjc~ -- TransitlveCIosure(CShr~co~' U CShr~o~,).
C o n d i t i o n 4.1.4.2 is s a t i s f i e d b y t h e sets CShr~co~,,AccNew, CShr~co~,, and
CSharingjco~,:
Given (r,s) E CShr~:o., (resp. AccNew, CShr~c~ CSharingKo.,), we must
prove that r ~ s. By the assumptions of the lemma, CShr~q. and CShr~c~.
satisfy Condition 4.1.4.2. The function TermShift preserves the property,
thus passing it on to CShr~c~ ' and AccNew.
From Lemma 4.2.3 we know that for (r,s) E New, we have r r s. Us-
ing the definition of TransitiveCIosure it follows that the property holds
for both CShr~:~ ' = TransitiveCIosure(AccNew U New) and CSharingjcol= =
TransitiveCIosure(CShr~:~ U CShr~:~
C o n d i t i o n 4.1.4.3 is s a t i s f i e d b y t h e sets CShr~c~ AccNew, CShr~:~ and
CSharingjc~
Given (r, s) E CShr~co~, (resp. AccNew, CShr~co~,, CSharingjCo~), we must
prove that TermShift(K:o.,, {(r, s)}) C_ CSh,~c~' (resp. AccNew, CShr~co.,,
CSharlngJc~
For (r, s) CShr :~ = TermShift( :o.. CShr ,.), ( and resp. AccNew =
TermShift(/Co~t, C S h r ~ ) this follows from the idempotence of TermShift.
Consider (r,s) E CShr~c~ , = TransitiveC[osure(AccNew U New). Expand-
ing the definition of TermShift, we must prove that for each t such that
r.t, s.t E O(Ko=t), (r.t, 8.t) E CShr~c~ .. Fix arbitrary such r, s,t. Let
(rz, s z ) , . . . , (r,~, s,~) be the finite sequence over AccNew U New that leads
to the inclusion of (r, s) in CShr~~ according to the definition of the op-
eration TransitiveCIosure. Using Condition 4.1.4.1 proved above, we have
Ko=t/r =-- Ko,t/rz -- Ko~,t/Sl =---... -- Ko=t/r,~ = Ko,,t/s,~ =- /Co=t/s. F r o m
r.t, s.t E O(Ko=t) we can conclude that rt.t,st.t E O(Ko,,t) for 1 < I < n.

For (rt, st) E AccNew ----TermShlft(/Co~,t, CShr~r it follows from the prop-
erties of TermShift that also (rt.t, st.t) E AccNew. For (rt, st) E New it
follows from Lemma 4.2.3 that (rt.t, st.t) 9 New.
Thus, there is a finite sequence in AccNew U New of the form (rl.L s l . t ) , 9 9
(r,~.t, s~.t) with rl -- r, s,~ -- s, st = rt+l for 1 <_ l < n. So by the definition
of TransltiveClosure, also (rA, s.t) 9 CShr~co.~.
Similarly, we can prove the property for the sharing component CSharingjCo.~
: TransitiveCIosure(CShr~c.. ' U CShr~o.,).
C o n d i t i o n 4.1.4.4 is s a t i s f i e d b y t h e sets CShr~co**, AccNew, CShr~c~ and
CSharingpCo.,:
Given (r, s), (s, t) E CShr~co.t (resp. AccNew, CShr~co.,, CSharing~c~
where r # t, we must prove that (r,t) 9 CShr~co.' (resp. AccNew, CShr~o.,,
CSharing~Co.,).
By the assumptions of the lemma, CShr~c~ and CShr~,. satisfy Condi-
tion 4.1.4.4.
For (r, s), (s, t ) 9 CShr~r = TermShift(/Co,a, CShr~q~) where r#~, 3r', #,
., r r m : (r, s) = (r r ~ % t) = (r r ~ (r r (r e') 9
CShr~,~ ~ ~'.~ # e'.m. We have r = s = r 9 0 ( ; o , , ) ~ r S II 9
O(/Ci,). Assume, without loss of generality, that # --- s'.p, hence p.n =
m & r' 9~ E'.p. From (d',E') 9 CShr~q. & s' = #'.p 9 O(/Ci,,), we
have (s".p,E'.p) 9 CShr~q~, because CShr~q., satisfies Condition 4.1.4.1
and 4.1.4.3. From (r',s'),(s".p,t".p) 9 CShr~c,,* & # = s".p & r' # t".p,
we have (r', t".p) 9 CShr~q,,, because CShr~q,, satisfies Condition 4.1.4.4.
Finally, (r, t) = (r'.n, t".m) = (r'.n, t".p.n) 9 CShrTc~ ' = TermShlft(/Co,,t,
CShr~q~), from the definition of TermShift.
In a similar way we can prove the property for AccNew = TermShift(/Co=t,
CShr~q~).
For (r, s), (s, t) 9 CShr~r = TransitiveCIosure(AccNew U New), let (rl, sl),
. . . , (r,~, s,~) be the finite sequence for (r, s) as specified in the definition
of TransitiveCIosure, and (r~, s ~ ) , . . . , ( r ' , s ' ) the finite sequence for (s, t).
Since s,, = s = rl, the finite sequence (r~, s~),..., (r,,, s,), (r'~, s'z),...,
(rk, s ' ) then assures that (r, t) 9 CShr~co.,.
Similarly, we can prove the property for the sharing component CSharinglc~
= TransitiveCIosure(CShr~co. ' U CShr~:o.,).
[]

The algorithm for finding a most general unifier in Definition 4.2.1, is the Solved
Form Algorithm for simplifying sets (systems) of equations given in [47]. Based
on Herbrand's original unification algorithm, this algorithm views unification as
a process of transforming a solvable equation set into an equivalent solved form
equation set. When applied to an unsolvable equation set, the algorithm halts
with failure.

Algorithm 4.1 Solved Form Algorithm [47]

Non-deterministically choose an equation from the set to which a numbered
step applies. The action taken by the algorithm is determined by the form of
the equation:

1. $f(t_1, \ldots, t_n) = f(r_1, \ldots, r_n)$
   → replace by the equations $t_1 = r_1, \ldots, t_n = r_n$

2. $f(t_1, \ldots, t_n) = g(r_1, \ldots, r_m)$ where $f \neq g$ or $n \neq m$
   → halt with failure

3. $X = X$
   → delete the equation

4. $t = X$ where $t$ is not a variable
   → replace by the equation $X = t$

5. $X = t$ where $t \neq X$ and $X$ has another occurrence in the set of equations
   → if $X$ appears in $t$ then halt with failure; otherwise replace $X$ by $t$ in
   every other equation

The algorithm terminates when no step can be applied or when failure has been
returned. []
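
The five steps can be sketched in Python as follows. This is an illustrative sketch rather than the formulation of [47]: it picks equations in a fixed order instead of non-deterministically, applies step 5 eagerly to every variable equation, and uses an assumed term encoding (tuples `(functor, arg1, ...)` for compound terms, plain strings for constants, strings starting with an underscore for variables). The name `solved_form` is hypothetical; the function returns the bindings of the solved form as a dictionary, or None on failure.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("_")


def parts(t):
    """Functor and argument tuple; a constant is a functor with no arguments."""
    return (t[0], t[1:]) if isinstance(t, tuple) else (t, ())


def occurs(x, t):
    if t == x:
        return True
    return isinstance(t, tuple) and any(occurs(x, a) for a in t[1:])


def substitute(t, x, value):
    if t == x:
        return value
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, x, value) for a in t[1:])
    return t


def solved_form(equations):
    todo = list(equations)
    solved = {}
    while todo:
        lhs, rhs = todo.pop()
        if is_var(rhs) and not is_var(lhs):
            lhs, rhs = rhs, lhs                      # step 4: orient as X = t
        if is_var(lhs):
            if lhs == rhs:
                continue                             # step 3: drop X = X
            if occurs(lhs, rhs):
                return None                          # step 5: X appears in t
            # step 5: replace X by t in every other equation
            todo = [(substitute(l, lhs, rhs), substitute(r, lhs, rhs))
                    for l, r in todo]
            solved = {v: substitute(t, lhs, rhs) for v, t in solved.items()}
            solved[lhs] = rhs
        else:
            (f, f_args), (g, g_args) = parts(lhs), parts(rhs)
            if f != g or len(f_args) != len(g_args):
                return None                          # step 2: clash
            todo.extend(zip(f_args, g_args))         # step 1: decompose
    return solved


bindings = solved_form([(("f", "_X", "b"), ("f", "a", "_Y"))])
```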

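The following lemma states that the set of concrete sharing environments is
closed under the Unify(-, i, j) operation.

Lemma 4.2.5 Let $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}})$ be a
concrete sharing environment. Let
$(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}) = \mathrm{Unify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}), i, j)$
(Unify does not fail), where
$1 \le i < j \le \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$. Then
$(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}})$ is a concrete
sharing environment.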

Proof. By the definition of Unify, we have 𝒦_out ≡ 𝒦_in σ, where σ = mgu(𝒦_in/i,
𝒦_in/j) is not fail. Because (𝒦_in, CSharing_{𝒦_in}) is a concrete sharing environment,
it follows from the definition of substitution application that the term 𝒦_out
has principal functor 0, and that O(𝒦_in) ⊆ O(𝒦_out). From Lemma 4.2.4 we
know that CShr~c.~,, CShr~c..,, and CSharingjc.~, = TransitiveCIosure(CShr~c.. ` u
CShr~..,) are pre-sharing components for/Co,,t. So, we still have to prove that
CSharing_{𝒦_out} satisfies Condition (4.1), i.e.

(𝒦_out/r ≡ 𝒦_out/s) & (𝒦_out/r ∈ Vars) & (r ≠ s) ⇒ (r, s) ∈ CSharing_{𝒦_out}.

From the relationship between mgu's and solved form equation sets [47] we know
that mgu(𝒦_in/i, 𝒦_in/j) can be obtained by transforming the solvable equation
set
E_0 = { 𝒦_in/i = 𝒦_in/j }
into its equivalent solved form (σ is a most general unifier of E_0).
The Solved Form Algorithm throws away the upper structure of terms when
the latter does not occur in the variable bindings that solve the equation set. We

wish to observe the effect of the algorithm on the terms in the environment as
it incrementally isolates and applies variable bindings. To facilitate this we add
an extra variable and auxiliary equation, the purpose of which is simply to force
the algorithm to leave the entire environment in the equation set. We consider
the equation set

E_in = { X = 𝒦_in, 𝒦_in/i = 𝒦_in/j }

formed from E_0 by adding an auxiliary equation of the form X = 𝒦_in, where X
is a new variable not occurring elsewhere in the system. Clearly E_in is solvable
iff E_0 is. Let

E_out = { X = 𝒦, ... }

be the solved form of E_in. Since E_in and E_out are equivalent, they have the
same most general unifiers. Let γ be one. Then it follows that

𝒦 = Xγ = 𝒦_in σ (modulo renaming) = 𝒦_out
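The effect of the auxiliary equation can be observed on a small instance. The Python sketch below is only an illustration under stated assumptions: variables are strings, other terms are tuples headed by their functor, the environment functor is written `env`, the fresh variable is called `X0`, and the compact unifier is one possible realisation of the Solved Form Algorithm rather than the formulation of [47]. It checks that the term bound to the fresh variable in the solved form of E_in coincides with 𝒦_in σ, where σ is read off from the solved form of E_0.

```python
def is_var(t):
    return isinstance(t, str)

def apply(sigma, t):
    """Apply an idempotent substitution (a dict) to a term."""
    if is_var(t):
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, a) for a in t[1:])

def occurs(x, t):
    return x == t if is_var(t) else any(occurs(x, a) for a in t[1:])

def solved_form(equations):
    """Return {variable: term} for a solvable set, or None on failure."""
    pending, solved = list(equations), {}
    while pending:
        lhs, rhs = pending.pop()
        if is_var(lhs) and lhs == rhs:
            continue
        if not is_var(lhs) and is_var(rhs):
            lhs, rhs = rhs, lhs
        if is_var(lhs):
            if occurs(lhs, rhs):
                return None
            bind = {lhs: rhs}
            pending = [(apply(bind, l), apply(bind, r)) for l, r in pending]
            solved = {v: apply(bind, t) for v, t in solved.items()}
            solved[lhs] = rhs
        elif lhs[0] == rhs[0] and len(lhs) == len(rhs):
            pending.extend(zip(lhs[1:], rhs[1:]))
        else:
            return None
    return solved

# Hypothetical environment env(A, f(B), C); unify components i = 1 and j = 2.
k_in = ("env", "A", ("f", "B"), "C")
e_0 = [(k_in[1], k_in[2])]
e_in = [("X0", k_in)] + e_0          # auxiliary equation X0 = k_in

sigma = solved_form(e_0)
k_out = solved_form(e_in)["X0"]
```

In this instance `k_out` is env(f(B), f(B), C), which is exactly `apply(sigma, k_in)`.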

We prove the lemma by induction on the number of steps applied by the Solved
Form Algorithm. Notice that none of the steps will ever select the first equation.
We show that the following induction hypotheses hold at each stage of the Solved
Form Algorithm given by an equation set E and a term 𝒦, where

E = { X = 𝒦, t_1 = t_2, ... }

Induction hypotheses:

(a) i.s, j.s ∈ O(𝒦) & (𝒦/i.s ∈ Vars ∨ 𝒦/j.s ∈ Vars) ⇒ (i.s, j.s) ∈ New

(b) ∀r, q ∈ O(𝒦): (𝒦/r ≡ 𝒦/q & 𝒦/r ∈ Vars & r ≠ q ⇒ (r, q) ∈ CSharing_{𝒦_out})

(c) For each t_1 = t_2 in E that is not of the auxiliary form X = 𝒦 and that has
not yet been the subject of Step 5, there exists a path s ∈ O(𝒦/i) ∩ O(𝒦/j)
such that either t_1 ≡ 𝒦/i.s, t_2 ≡ 𝒦/j.s or t_2 ≡ 𝒦/i.s, t_1 ≡ 𝒦/j.s

Clearly, Condition (b) expressed for the stage given by the solved form equation
set E_out and the term 𝒦 gives the conclusion of the lemma.

Basis: The induction hypotheses hold for the stage given by E_in and 𝒦_in.
Condition (a): The pairs (i.s, j.s) are in BindingPairs(𝒦_in, i, j) by definition, so
from the fact that TermShift is an increasing function, they are in New.
Condition (b): Because CSharingpc,~ = TransitiveCIosure(CShr~c~ U CShr~c~ ) is
a sharing component for 𝒦_in, the pairs (r, q) are in CSharing_{𝒦_in}. So, using
Lemma 4.2.2, they are in CSharing_{𝒦_out}.

Condition (c) holds because the only equation in E_in not of the auxiliary form
X = 𝒦_in is 𝒦_in/i = 𝒦_in/j, and ε ∈ O(𝒦_in/i) ∩ O(𝒦_in/j).

Step: We consider each step of the Solved Form Algorithm and show that
its application preserves the induction hypotheses. We use E_before, 𝒦_before and
E_after, 𝒦_after to denote the two consecutive stages just before and after the
application of a step of the Solved Form Algorithm. That is, we assume that the
induction hypotheses hold for E_before, 𝒦_before. We must prove that they hold for
E_after, 𝒦_after.

case 1: When Step 1 is applied, 𝒦_before ≡ 𝒦_after, so Conditions (a) and (b) are
preserved. By Induction Assumption (c) there exists s ∈ O(𝒦_before/i) ∩
O(𝒦_before/j) such that (without loss of generality) f(t_1, ..., t_n) ≡
𝒦_before/i.s, and f(r_1, ..., r_n) ≡ 𝒦_before/j.s. So we have for all ℓ satisfying
1 ≤ ℓ ≤ n

(t_ℓ ≡ 𝒦_before/i.s.ℓ) & (r_ℓ ≡ 𝒦_before/j.s.ℓ)

As 𝒦_before ≡ 𝒦_after, Condition (c) is preserved.

case 2: By assumption, the unification succeeds, so Step 2 is not applied.

case 3: When Step 3 is applied, 𝒦_before ≡ 𝒦_after, so Conditions (a) and (b) are
preserved. No new equations are introduced, so Condition (c) is preserved.

case 4: Again, 𝒦_before ≡ 𝒦_after, preserving Conditions (a) and (b). Clearly,
Condition (c) is also preserved.

case 5: By assumption, the algorithm does not fail. Let Z = t be the selected
equation.
Condition (a): Consider a path s such that i.s, j.s ∈ O(𝒦_after) &
(𝒦_after/i.s ∈ Vars ∨ 𝒦_after/j.s ∈ Vars). We must prove that (i.s, j.s) ∈
New. By the way 𝒦_after is constructed from 𝒦_before, we know that
there exist paths p and r such that s = p.r & i.p, j.p ∈ O(𝒦_before) &
(𝒦_before/i.p ∈ Vars ∨ 𝒦_before/j.p ∈ Vars). By the Induction Assumption
(a), (i.p, j.p) ∈ New. Because ∃δ: 𝒦_after δ ≡ 𝒦_out, it follows from
the definition of substitution application that O(𝒦_after) ⊆ O(𝒦_out).
Because New is a pre-sharing component for 𝒦_out (Lemma 4.2.3), it
follows that (i.p.r, j.p.r) = (i.s, j.s) ∈ New.
Condition (b): Consider paths r, q such that r, q ∈ O(𝒦_after) & r ≠
q & 𝒦_after/r ≡ 𝒦_after/q & 𝒦_after/r ∈ Vars. We must prove that
(r, q) ∈ CSharing_{𝒦_out}. Let 𝒦_after/r ≡ 𝒦_after/q ≡ Y. Since 𝒦_after
is constructed from 𝒦_before according to Step 5, we know that each
s ∈ O(𝒦_after) such that 𝒦_after/s ≡ Y must satisfy one of the following
(Figures 4.8.a and 4.8.b).
(s ∈ O(𝒦_before) & 𝒦_before/s ≡ Y) (4.9)

Figure 4.8: (a) 𝒦_before when s satisfies (4.9),
(b) 𝒦_before when s satisfies (4.10),
(c) 𝒦_before satisfying Induction Assumption (c),
(d) 𝒦_before when q, r both satisfy (4.11).

or

∃s', s'': (s = s'.s'' & s' ∈ O(𝒦_before) & s'' ∈ O(t)
& 𝒦_before/s' ≡ Z & 𝒦_after/s ≡ t/s'' ≡ Y) (4.10)

Shortly, we shall show that (r, q) ∈ CSharing_{𝒦_out} by a case analysis of
r and q based on which of (4.9) or (4.10) holds when s is taken as r
and, respectively, q. But first we derive a convenient consequence of
(4.10).
Recall that Z = t is the selected equation. By Induction Assumption
(c), there exists a path p ∈ O(𝒦_before/i) ∩ O(𝒦_before/j) such
that 𝒦_before/i.p ≡ Z & 𝒦_before/j.p ≡ t (without loss of generality).
Fix such a p (Figure 4.8.c). Consider any s ∈ O(𝒦_after) satisfying
(4.10) and the corresponding values of s' and s''. Since 𝒦_before/s' ≡
Z & 𝒦_before/j.p ≡ t & Z ≢ t we have s' ≠ j.p, hence s'.s'' ≠ j.p.s''.
Since t/s'' ≡ Y and 𝒦_after is derived from 𝒦_before by replacing Z
by t, we have 𝒦_after/i.p.s'' ≡ 𝒦_after/j.p.s'' ≡ 𝒦_before/j.p.s'' ≡ Y.
Using Condition (a), proved above, since 𝒦_after/i.p.s'' ∈ Vars, we
have (i.p.s'', j.p.s'') ∈ New ⊆ CSharing_{𝒦_out}. From (4.10) we have
s' ∈ O(𝒦_before) & 𝒦_before/s' ≡ Z which, by Induction Assumption
(b), implies that (s', i.p) ∈ CSharing_{𝒦_out} or s' = i.p, which implies
that (s'.s'', i.p.s'') ∈ CSharing_{𝒦_out}, since CSharing_{𝒦_out} is a pre-sharing
component for 𝒦_out (Lemma 4.2.4), or s'.s'' = i.p.s''.
Since we now have either (s'.s'', i.p.s''), (i.p.s'', j.p.s'') ∈ CSharing_{𝒦_out} &
s'.s'' ≠ j.p.s'' or s'.s'' = i.p.s'' ≠ j.p.s'' & (i.p.s'', j.p.s'') ∈ CSharing_{𝒦_out},
we also have (s'.s'', j.p.s'') ∈ CSharing_{𝒦_out} (Lemma 4.2.4). Thus, if
s ∈ O(𝒦_after) satisfies (4.10), then s satisfies

∃s', s'': (s = s'.s'' & (s, j.p.s'') ∈ CSharing_{𝒦_out}
& s' ∈ O(𝒦_before) & 𝒦_before/j.p.s'' ≡ Y) (4.11)

Suppose r and q each satisfy (4.9) with r and q, respectively, standing
for s. Then it follows immediately from Induction Assumption (b)

that (r, q) ∈ CSharing_{𝒦_out}, as desired. Otherwise, at least one of r
and q must satisfy (4.11) when substituted for s. Without loss of
generality, assume that it is q. The strategy is now to show that each
of r and q is related to some occurrence of Y in 𝒦_before/j.p. (That is
why (4.11) uses 𝒦_before/j.p.s''.) This will allow us to apply Induction
Assumption (b), in case the occurrences related to r and to q are
different, and Lemma 4.2.4, to show the required result.
By assumption, q is an s that satisfies (4.11), so there exist q' and q'',
corresponding to s' and s'', satisfying

(q, j.p.q'') ∈ CSharing_{𝒦_out} & 𝒦_before/j.p.q'' ≡ Y

If r satisfies (4.9), then

𝒦_before/r ≡ Y ≡ 𝒦_before/j.p.q''

so (r, j.p.q'') ∈ CSharing_{𝒦_out}, by Induction Assumption (b), and
consequently (q, r) ∈ CSharing_{𝒦_out}, by Lemma 4.2.4 as required. Otherwise,
r satisfies (4.11), so there must exist r' and r'' (Figure 4.8.d),
corresponding to s' and s'', such that

(r, j.p.r'') ∈ CSharing_{𝒦_out} & 𝒦_before/j.p.r'' ≡ Y

Since 𝒦_before/j.p.r'' ≡ Y ≡ 𝒦_before/j.p.q'', either r'' = q'' or (j.p.r'',
j.p.q'') ∈ CSharing_{𝒦_out}, by Induction Assumption (b). In the former
case one application of Lemma 4.2.4 and in the latter case two
applications derive the required result

(r, q) ∈ CSharing_{𝒦_out}

from (q, j.p.q''), (r, j.p.r'') ∈ CSharing_{𝒦_out}, which was shown above.
Condition (c): For each equation t_1^before = t_2^before in E_before other than
the selected Z = t and not yet selected in a previous application of
Step 5, and not of the auxiliary form X = 𝒦_before, there exists by
Induction Assumption (c), a path s such that i.s, j.s ∈ O(𝒦_before)
and (t_1^before ≡ 𝒦_before/i.s) & (t_2^before ≡ 𝒦_before/j.s) (without loss of
generality). Since replacing Z by t throughout the system has the
same effect on t_1^before, t_2^before as it has on 𝒦_before/i.s, 𝒦_before/j.s, the
relationship (t_1^after ≡ 𝒦_after/i.s) & (t_2^after ≡ 𝒦_after/j.s) holds in the
resulting system E_after.

[]

The following corollary formalizes the relationship between CSharing_{𝒦_out} as
computed by Unify and the high-level unification algorithm given by the Solved Form
Algorithm.

Corollary 4.2.6 Let (𝒦_in, CSharing_{𝒦_in}) be a concrete sharing environment. Let
1 ≤ i < j ≤ arity(𝒦_in[ε]), and (𝒦_out, CSharing_{𝒦_out}) = Unify((𝒦_in, CSharing_{𝒦_in}), i, j)
(i.e., Unify does not fail). Let the equation sets E_in and E and the term 𝒦 be as
defined in the proof of Lemma 4.2.5. Then, for all computations of the Solved
Form Algorithm on E_in and for all stages in such a computation, given by E
and 𝒦, CSharing_{𝒦_out} satisfies the following.

(a) i.s, j.s ∈ O(𝒦) & (𝒦/i.s ∈ Vars ∨ 𝒦/j.s ∈ Vars) ⇒ (i.s, j.s) ∈ New

(b) ∀r, q ∈ O(𝒦): (𝒦/r ≡ 𝒦/q & 𝒦/r ∈ Vars & r ≠ q ⇒ (r, q) ∈ CSharing_{𝒦_out})

Proof. Follows immediately from the inductive proof of Lemma 4.2.5. []
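Property (b), which is Condition (4.1) read at an intermediate stage, can be checked mechanically on small terms. The Python sketch below is an illustration only: the tuple encoding of terms, the representation of paths as tuples of argument positions, the use of `frozenset` for unordered pairs, and the function names are all assumptions of the sketch.

```python
from itertools import combinations

def is_var(t):
    # Assumption: variables are strings, other terms are tuples (functor, args...).
    return isinstance(t, str)

def subterm(k, path):
    """K/path, following 1-based argument positions."""
    for i in path:
        k = k[i]
    return k

def occurrences(k, prefix=()):
    """O(K): every path to a subterm of k, the empty path included."""
    yield prefix
    if not is_var(k):
        for i, arg in enumerate(k[1:], start=1):
            yield from occurrences(arg, prefix + (i,))

def satisfies_condition_4_1(k, sharing):
    """Every two distinct occurrences of one variable must be paired."""
    var_paths = [p for p in occurrences(k) if is_var(subterm(k, p))]
    return all(frozenset({r, s}) in sharing
               for r, s in combinations(var_paths, 2)
               if subterm(k, r) == subterm(k, s))

# Hypothetical environment env(X, f(X), Y): X occurs at paths 1 and 2.1.
k = ("env", "X", ("f", "X"), "Y")
sharing = {frozenset({(1,), (2, 1)})}
```

With this `sharing` the check succeeds; with an empty set it fails, because the two occurrences of X are not related.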

We are now in the position to define the abstract operation. The following
definition uses auxiliary functions Convert and BindingEdges, defined below.

D e f i n i t i o n 4.2.7 F o r a n abstract sharing environment (Tin t, (AShr~-, AShr~- ))


and i <__i < j <_ ~rit~(~,,[d),

AbstrUnify((Ti.', (AShr~-, AShr~-)), i, j) =


if TGUnify(Ti,', i, j) is fail
then fail
else (To,,t', (AS h r~-..,, AS h r~-..,)),

where
To~,t" = TGUnlfy(Tin', i, j),
and

AShr~-.., = Convert(Tin, AShr~-,To,,t),


C = TGShift(To,,t, Convert(Tin, AShr~-,To,,t)),
B = TGShift(To~t, BindingEdges(Ti,, To,,t, i,j)),
TGShift(To,,t, AShrL~,) = AlternatingClosure(C, B).

Recall that failure is represented by 𝒯_⊥ in the case of type graph environments
(Section 2.3.2), and by (𝒯_⊥, (∅, ∅)) for the abstract domain of sharing
environments.
We still have to define the auxiliary functions. The function Convert takes
an abstract sharing component for a type graph 𝒯_in and reexpresses it for a type
graph 𝒯_out. In particular, the function will be used for a type graph 𝒯_out that
describes a superset of instantiations of terms represented by the type graph 𝒯_in.

Definition 4.2.8 For two abstract term environments 𝒯_in^e, 𝒯_out^e such that
arity(𝒯_in[ε]) = arity(𝒯_out[ε]), and an abstract sharing component P_{𝒯_in} for 𝒯_in,
define

Convert(𝒯_in, P_{𝒯_in}, 𝒯_out) =
{ (𝒯_out[R], 𝒯_out[S]) | ∃R, S ∈ S(𝒯_out):
  ([R, S ∈ S(𝒯_in) & (𝒯_in[R], 𝒯_in[S]) ∈ TGShift(𝒯_in, P_{𝒯_in})]
   ∨
   [∃R', S', k, l, f: R = R'.(k, f) & S = S'.(l, f) &
    R'.(k, V), S'.(l, V) ∈ S(𝒯_in)
    & f is a functor or a constant &
    (𝒯_in[R'.(k, V)], 𝒯_in[S'.(l, V)]) ∈ TGShift(𝒯_in, P_{𝒯_in})]) }
The function BindingEdges is the type-graph analogue of the BindingPairs
function for concrete terms.

Definition 4.2.9 For an abstract term environment 𝒯_in^e, 1 ≤ i < j ≤
arity(𝒯_in[ε]) and 𝒯_out^e = TGUnify(𝒯_in^e, i, j),

BindingEdges(𝒯_in, 𝒯_out, i, j) =
{ (𝒯_out[R], 𝒯_out[S]) | ∃R, S ∈ S(𝒯_out), ∃f:
  f is a functor, a constant or V &
  ([R = (i, f) & S = (j, f) & ((i, V) ∈ S(𝒯_in) ∨ (j, V) ∈ S(𝒯_in))]
   (either the i-th or the j-th component of 𝒯_in is itself a V-node)
   ∨
   [∃g, T, l: R = (i, g).T.(l, f) & S = (j, g).T.(l, f) &
    ((i, g).T.(l, V) ∈ S(𝒯_in) ∨ (j, g).T.(l, V) ∈ S(𝒯_in))]
   (or there is a V-node inside the i-th or j-th component of 𝒯_in)) }
We now formulate the theorem stating the safety of the abstract unification.
Before proving the theorem itself, we first prove several lemmas that contribute
to its proof. Most of the propositions in the remainder of the section use the
relationships given by the following condition.

C o n d i t i o n 4.2.10

~. (%.', (ASh,~-, ASh,;,.)) is an ab,t~act ,har~ng environment


~. qC.,, (CSh%,., CShr~:,,.)) is a concrete sh~ring environment
3. (/Ci,=, (CShr~c~,CShr~c..)) E InstrEnvConc((Ti,~ t, (AShr~r~,AShr~-))), that
is,

(a} tCi,, E TGEnvConc(~')


(b) CShr~c~,CShr~c~ are pre-sharing components for ICi,=, and
TransitlveClosure(CShr~c~ U CShr~) is a concrete sharing component
for/Ci.
(c) AbsPai%.,X:,.(CSh4c,. ) c TGShift(~., AShr~,~) and
AbsPair~,=,/Ci,=(CShr~:~..) C_TGShift(T~., AShr.~)

4. 1 ≤ i < j ≤ arity(𝒯_in[ε]) = arity(𝒦_in[ε])


5. (/Co,,,, (CShr~:.,.,, CShr~:.,.)) = Unify((/Ci,. (CShr~.., CShr~;..)), i, j), is not
fail
6. (7-o~t', (AShr~-..,, AShrL.,) ) : AbstrUnify((Ti,~', (AShr~q~, AShr~- )), i, j)

T h e o r e m 4.2.11 (Safety of AbstrUnify(-,i,j)) Assuming Condition 4.~.10,


it follows that

(/Co.t, (CShr~.,., CShr~:...)) e InstrEnvConc((To~,t', (AShr~-...,, AShr~-o,,,))),

that is,

Part 1. ICont E TGEnvConc('To~,t'),


Part 2. CShr~c..,,CShr~c.= ' are pre-sh.aring components for ICo~,t and
TransitiveCIosure(CShr~c.. ` U CShr~c..,) is a concrete sharin# component for
ICo~t,
P a r t 3. AbsPairTo,t, iCo,t(CShr~.., ) C_ TGShift(To,t, AShr~-..,) and
AbsPairTo,jt, iCo,,,( CShr~c..,) C TGShift('To,,t, AShrL,., ).
Part 1 is a restatement of Theorem 2.3.9. Part 2 follows from the fact that
Unify is well defined (Lemma 4.2.5). The proof of Part 3 will follow at the end
of the section, after we have related the auxiliary functions used in the definition
of AbstrUnify to those used in the definition of Unify.

Lemma 4.2.12 (Safety of TGShift) For an abstract term environment 𝒯^e and
a term 𝒦 ∈ TGEnvConc(𝒯^e), let P_𝒦 be a set of unordered pairs (r, s) such that
r, s ∈ O(𝒦) and 𝒦/r ≡ 𝒦/s. Then

AbsPair_{𝒯,𝒦}(TermShift(𝒦, P_𝒦)) ⊆ TGShift(𝒯, AbsPair_{𝒯,𝒦}(P_𝒦)).

Proof. Expanding the definitions of AbsPair_{𝒯,𝒦} and TermShift we have

AbsPair_{𝒯,𝒦}(TermShift(𝒦, P_𝒦)) = (4.12)
{(𝒯[Sel(𝒦, r.t)], 𝒯[Sel(𝒦, s.t)]) | (r, s) ∈ P_𝒦 & r.t, s.t ∈ O(𝒦)}.

Fix arbitrary r, s, t such that (r, s) ∈ P_𝒦 and r.t, s.t ∈ O(𝒦). From
Property 4.1.13, it follows that Sel(𝒦, r.t), Sel(𝒦, s.t) ∈ S(𝒯). Since 𝒦/r ≡ 𝒦/s, for
T = Sel(𝒦/r, t), we have, by the definition of Sel,

Sel(𝒦, r.t) = Sel(𝒦, r).T,
Sel(𝒦, s.t) = Sel(𝒦, s).T.
Therefore, we have

(𝒯[Sel(𝒦, r.t)], 𝒯[Sel(𝒦, s.t)])
∈ TGShift(𝒯, {(𝒯[Sel(𝒦, r)], 𝒯[Sel(𝒦, s)])})
by the definition of TGShift,
= TGShift(𝒯, AbsPair_{𝒯,𝒦}({(r, s)}))
by the definition of AbsPair_{𝒯,𝒦}.
In summary,

(𝒯[Sel(𝒦, r.t)], 𝒯[Sel(𝒦, s.t)]) ∈ TGShift(𝒯, AbsPair_{𝒯,𝒦}({(r, s)})). (4.13)

The lemma follows from (4.12), (4.13), and the monotonicity of TGShift and
AbsPair_{𝒯,𝒦}. []
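The concrete side of this lemma, TermShift as expanded in (4.12), can be sketched in Python: each pair (r, s) is extended by every path t for which r.t and s.t are occurrences of the term. The tuple encoding of terms, the `frozenset` encoding of unordered pairs, the example environment, and the helper names are assumptions of this sketch, not notation from the text.

```python
def is_var(t):
    # Assumption: variables are strings, other terms are tuples (functor, args...).
    return isinstance(t, str)

def subterm(k, path):
    """K/path, following 1-based argument positions."""
    for i in path:
        k = k[i]
    return k

def occurrences(k, prefix=()):
    """O(K): every path to a subterm of k, the empty path included."""
    yield prefix
    if not is_var(k):
        for i, arg in enumerate(k[1:], start=1):
            yield from occurrences(arg, prefix + (i,))

def term_shift(k, pairs):
    """Extend each pair {r, s} to {r.t, s.t} for all common extensions t."""
    paths = set(occurrences(k))
    shifted = set()
    for pair in pairs:
        r, s = tuple(pair)
        for t in occurrences(subterm(k, r)):
            if r + t in paths and s + t in paths:
                shifted.add(frozenset({r + t, s + t}))
    return shifted

# Hypothetical environment env(f(X, Y), f(X, Y)) with components 1 and 2 paired.
k = ("env", ("f", "X", "Y"), ("f", "X", "Y"))
shifted = term_shift(k, {frozenset({(1,), (2,)})})
```

Because t may be the empty path, the original pair is always contained in the result, which matches the observation used earlier that TermShift is increasing.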

Lemma 4.2.13 (Safety of Convert) Let 𝒯_in^e and 𝒯_out^e be abstract term
environments such that arity(𝒯_in[ε]) = arity(𝒯_out[ε]) and let 𝒦_in ∈ TGEnvConc(𝒯_in^e)
and 𝒦_out ∈ TGEnvConc(𝒯_out^e) such that 𝒦_out ≡ 𝒦_in σ for some substitution σ.
For P_{𝒦_in}, a pre-sharing component for 𝒦_in, we have that

AbsPair_{𝒯_out,𝒦_out}(P_{𝒦_in}) ⊆ Convert(𝒯_in, AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in}), 𝒯_out).

Proof. Consider any (r, s) ∈ P_{𝒦_in}. Applying the definition of AbsPair_{𝒯_out,𝒦_out},
we need to show that

(𝒯_out[Sel(𝒦_out, r)], 𝒯_out[Sel(𝒦_out, s)])
∈ Convert(𝒯_in, AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in}), 𝒯_out).
From Definition 4.1.4, we have r, s ∈ O(𝒦_in). From 𝒦_in ∈ TGEnvConc(𝒯_in^e) and
Property 4.1.13, we have

Sel(𝒦_in, r), Sel(𝒦_in, s) ∈ S(𝒯_in). (4.14)

It follows from 𝒦_out ≡ 𝒦_in σ and the definition of substitution application that
O(𝒦_in) ⊆ O(𝒦_out). Hence r, s ∈ O(𝒦_out). So from 𝒦_out ∈ TGEnvConc(𝒯_out^e)
and Property 4.1.13 it follows that

Sel(𝒦_out, r), Sel(𝒦_out, s) ∈ S(𝒯_out). (4.15)

Since 𝒦_in/r ≡ 𝒦_in/s by Definition 4.1.4, there are two cases.
case 1: (𝒦_in/r, 𝒦_in/s ∉ Vars). From 𝒦_out ≡ 𝒦_in σ and the definition of
substitution application, we have that

Sel(𝒦_out, r) = Sel(𝒦_in, r) & Sel(𝒦_out, s) = Sel(𝒦_in, s). (4.16)

We then observe that

(𝒯_in[Sel(𝒦_out, r)], 𝒯_in[Sel(𝒦_out, s)])
= (𝒯_in[Sel(𝒦_in, r)], 𝒯_in[Sel(𝒦_in, s)]),


by (4.16),
∈ AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in}),
from (r, s) ∈ P_{𝒦_in} and the definition of AbsPair_{𝒯_in,𝒦_in},
⊆ TGShift(𝒯_in, AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in})),
by the definition of TGShift.

Summarizing, we have

(𝒯_in[Sel(𝒦_out, r)], 𝒯_in[Sel(𝒦_out, s)])
∈ TGShift(𝒯_in, AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in})). (4.17)

The desired result is now obtained from (4.14), (4.15), (4.16), (4.17) and
an application of the first disjunct in the definition of Convert.
case 2: (𝒦_in/r, 𝒦_in/s ∈ Vars). Here we have

Sel(𝒦_in, r) = Sel(𝒦_in, r').(l, V) & Sel(𝒦_in, s) = Sel(𝒦_in, s').(k, V), (4.18)

and

Sel(𝒦_out, r) = Sel(𝒦_in, r').(l, f) & Sel(𝒦_out, s) = Sel(𝒦_in, s').(k, f), (4.19)

for r', s' ∈ O(𝒦_in), l, k ∈ ℕ such that r = r'.l, s = s'.k and f either V,
a functor or a constant, depending on whether the substitution σ further
instantiates 𝒦_in/r ≡ 𝒦_in/s. If the variable is not instantiated by the
substitution, then Sel(𝒦_out, r) = Sel(𝒦_in, r) & Sel(𝒦_out, s) = Sel(𝒦_in, s),
and the result is obtained as in case 1 above. Otherwise we observe that

(𝒯_in[Sel(𝒦_in, r)], 𝒯_in[Sel(𝒦_in, s)])
∈ AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in}),
from (r, s) ∈ P_{𝒦_in} and the definition of AbsPair_{𝒯_in,𝒦_in},
⊆ TGShift(𝒯_in, AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in})),
by the definition of TGShift.

Summarizing, we have

(𝒯_in[Sel(𝒦_in, r)], 𝒯_in[Sel(𝒦_in, s)])
∈ TGShift(𝒯_in, AbsPair_{𝒯_in,𝒦_in}(P_{𝒦_in})). (4.20)

The desired result is now obtained from (4.14), (4.15), (4.18), (4.19), (4.20)
and an application of the second disjunct in the definition of Convert.
[]

Lemma 4.2.14 (Safety of BindingEdges(-, -, i, j))
Assuming Condition 4.2.10, it follows that
AbsPair_{𝒯_out,𝒦_out}(BindingPairs(𝒦_in, i, j)) ⊆ BindingEdges(𝒯_in, 𝒯_out, i, j).

Proof. Each element of AbsPair_{𝒯_out,𝒦_out}(BindingPairs(𝒦_in, i, j)) has the form

(𝒯_out[Sel(𝒦_out, i.p)], 𝒯_out[Sel(𝒦_out, j.p)]),
where

(i.p, j.p ∈ O(𝒦_in)), (4.21)

(𝒦_in/i.p ∈ Vars ∨ 𝒦_in/j.p ∈ Vars). (4.22)

Using (4.21), Condition 4.2.10.3a and Property 4.1.13, we have

Sel(𝒦_in, i.p), Sel(𝒦_in, j.p) ∈ S(𝒯_in).

From (4.22) and the definition of Sel, it follows in the case where p = ε that

Sel(𝒦_in, i.p) = (i, V) ∨ Sel(𝒦_in, j.p) = (j, V).

If p ≠ ε, it follows from (4.22), Condition 4.2.10.5 and the definition of Sel that
there exist a functor g, a selector T, a path p' ∈ O(𝒦_in/i) ∩ O(𝒦_in/j) and l ∈ ℕ
satisfying

g = 𝒦_in[i] = 𝒦_in[j] & p = p'.l & T = Sel(𝒦_in/i, p') = Sel(𝒦_in/j, p'),

and
Sel(𝒦_in, i.p) = (i, g).T.(l, V) ∨ Sel(𝒦_in, j.p) = (j, g).T.(l, V).
It follows from the definition of substitution application and Condition 4.2.10.5
that i.p, j.p ∈ O(𝒦_out), and if p = ε then

Sel(𝒦_out, i.p) = (i, f) & Sel(𝒦_out, j.p) = (j, f),

else
Sel(𝒦_out, i.p) = (i, g).T.(l, f) & Sel(𝒦_out, j.p) = (j, g).T.(l, f),
where f is either V, a functor or a constant. From the safety of TGUnify
(Theorem 2.3.9), we have 𝒦_out ∈ TGEnvConc(𝒯_out^e), so from Property 4.1.13, it
follows that for both cases Sel(𝒦_out, i.p), Sel(𝒦_out, j.p) ∈ S(𝒯_out). Finally, by the
definition of BindingEdges, we have

(𝒯_out[Sel(𝒦_out, i.p)], 𝒯_out[Sel(𝒦_out, j.p)]) ∈ BindingEdges(𝒯_in, 𝒯_out, i, j),

as desired. []
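For intuition about the concrete set that BindingEdges approximates, the characterisation (4.21)-(4.22) of BindingPairs(𝒦_in, i, j) can be turned into a short Python sketch: collect the pairs (i.p, j.p) of corresponding occurrences at which at least one side is a variable. The tuple encoding of terms, the representation of a pair as an ordered tuple of two paths, the example environment, and the function names are assumptions of this sketch.

```python
def is_var(t):
    # Assumption: variables are strings, other terms are tuples (functor, args...).
    return isinstance(t, str)

def binding_pairs(k, i, j):
    """Pairs (i.p, j.p) with i.p, j.p in O(k) and a variable on either side."""
    def walk(a, b, p):
        if is_var(a) or is_var(b):
            # A variable has no deeper occurrences, so the walk stops here.
            yield ((i,) + p, (j,) + p)
        else:
            # Descend into the argument positions that both terms share.
            for n, (x, y) in enumerate(zip(a[1:], b[1:]), start=1):
                yield from walk(x, y, p + (n,))
    return set(walk(k[i], k[j], ()))

# Hypothetical environment env(f(X, g(Y)), f(a, Z)); unify components 1 and 2.
k_in = ("env", ("f", "X", ("g", "Y")), ("f", ("a",), "Z"))
pairs = binding_pairs(k_in, 1, 2)
```

In this instance the variable X faces the constant a at path 1, and the variable Z faces g(Y) at path 2, so two pairs are produced.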

We are now in a position to prove the main result of the section. We restate the
proposition.

Theorem 4.2.11, Part 3: Assuming Condition 4.~.10, it follows that


1. AbsPair.-r ~ (CShr~ C TGShift(To,,t, AShr~-o.,),
2. AbsPairTZ O ~ t J~-
~O~t~
(CShr~) ~ o ~ t
C TGShift(To, mAShr~o.,).
--

Proof. Applying Conditions 4.2.10.5 and 4.2.10.6 and expanding the definitions
of Unify and AbstrUnify, we can rewrite 1. as

AbsPairT-o~,t,K:o.t(TermShift(K:o,,t, CShr~c,..)) (4.23)


C TGShift(To.t, Convert(~.,AShr~-,To.t)).
We observe that

A bsPair'/'o.t,K:o.t (Term Shift (K: o,,t, CShr~c,~))


C TGShift(7"o,,t, AbsPairT'o.t, ICo,,t(CShr~..)),
by Lemma 4.2.12 (the safety of TGShift),
C TGShift('To.t, Convert(~,,, AbsPairTi,jCi,~(CShr~c,. ' ), To.,)),
by Lemma 4.2.13 (the safety of Convert),
and the monotonicity of TGShift,
C_ T GShift( 7"o,t, Convert(7~,,, TGShift(Ti,,, AShr~-~.),To,t)),
by Condition 4.2.10.3c,
and the monotonicity of TGShift and Convert,
= TGShift(7"o.t, Convert(Ti,., AShr~-,.., To.t)),
by the definition of Convert,
and the idempotence of TGShift.

Summarizing, we get (4.23), as desired.


Applying Conditions 4.2.10.5 and 4.2.10.6 and expanding the definitions of
Unify and AbstrUnify, we can rewrite 2. as

AbsPair_{𝒯_out,𝒦_out}(TransitiveClosure(AccNew ∪ New))
⊆ AlternatingClosure(C, B). (4.24)

First we relate AccNew and C.

AbsPaira- /C (AccNew)
= AbsPairTo~t,/~o.t(TermShift(/Co.t, CShr~c,.)),
by the definition of AccNew,
C TGShift(To.t, AbsPairTo.t,/Co.t(CShr~c,~)),
by Lemma 4.2.12 (the safety of TGShift),
c_ TGShift(7"..,, Convert( ., AbsPa .),
by Lemma 4.2.13 (the safety of Convert),
and the monotonicity of TGShift,
C TGShift(To.t, Convert(~,,, TGShift(~,~, AShr~-), To.t)),
by Condition 4.2.10.3c,
and the monotonicity of TGShift and Convert,
= TGShift(To.,, Convert(~., AShr~-, To.t)),
by the definition of Convert,
and the idempotence of TGShift,
= C,
by the definition of C.

Summarizing, we have

AbsPair_{𝒯_out,𝒦_out}(AccNew) ⊆ C. (4.25)

Similarly,
AbsPair_{𝒯_out,𝒦_out}(New)
= AbsPair_{𝒯_out,𝒦_out}(TermShift(𝒦_out, BindingPairs(𝒦_in, i, j))),
by the definition of New,
⊆ TGShift(𝒯_out, AbsPair_{𝒯_out,𝒦_out}(BindingPairs(𝒦_in, i, j))),
by Lemma 4.2.12 (the safety of TGShift),
⊆ TGShift(𝒯_out, BindingEdges(𝒯_in, 𝒯_out, i, j)),
by Lemma 4.2.14 (the safety of BindingEdges),
and the monotonicity of TGShift,
= B,
by the definition of B.
Summarizing, we have

AbsPair_{𝒯_out,𝒦_out}(New) ⊆ B. (4.26)
Finally, we prove (4.24).
AbsPair_{𝒯_out,𝒦_out}(TransitiveClosure(AccNew ∪ New))
⊆ AlternatingClosure(AbsPair_{𝒯_out,𝒦_out}(AccNew),
AbsPair_{𝒯_out,𝒦_out}(New)),
by Lemma 4.1.21 (the safety of AlternatingClosure),
⊆ AlternatingClosure(C, B),
by (4.25), (4.26),
and the monotonicity of AlternatingClosure.
[]

4.2.1.2 Xi = f(Xi_1, ..., Xi_j)
The main difference with the basic operation Xi = Xj is the specification of
the set BindingPairs describing places where new sharing is introduced by the
concrete unification. If the left-hand side Xi is a free variable, then the operation
is in fact a construction operation; otherwise we call it a selection operation. The
set BindingPairs is defined as the union of two sets; however, one of these sets will
be empty depending on whether the operation is a selection or a construction.
D e f i n i t i o n 4.2.15 For a concrete sharing environment QCi,~,(CShr~:,~,CShr~c,,.))
and 1 < i, i l , . . . , ij < arity(lCin[r such that i, i l , . . . , ij are pairwise distinct and
f is a functor of arity j,
Unify((/Cin, (CShr~c,. , CShr~,.)), i, f( il, . . . , ij ) ) =
if mgu(ICinli, f(ICinlil,...,ICinlij)) is fail
then fail
else (ICo,a, (CShr~co,,,, CShr~co,,,)),

where
ICont ~ ICing, for a = mgu(ICinli, f(K.i,~lil,..., PC,,/i,)),
and

CShr~co," = TerrnShift(/Cont, CShr~c,.,),


AccNew - TermShiff(/Cout, CShr~:,,),
New = TerrnShift(ico~t, B i n d i n g P a i r s ( I C i , ~ , i , f ( i l , . . . , i j ) ) ) ,
CShr~:o,, - TransitiveCIosure(AccNew U New),
BindingPairs(ici~ , i, f ( i l , . . . , ij ) )

= { (i.t.p, Q.p) (ICi,Ji.t.p e Vars v ICi,JQ.p 9 Vars) }


u {(i.l, 11 <_ t <_ j 9 Vars}.

The next lemma, stating that the initial sharing is preserved, can be proved in
the same way as Lemma 4.2.2 for the basic operation Xi = Xj.

Lemma 4.2.16 Let (𝒦_in, CSharing_{𝒦_in}) be a concrete sharing environment. Let
(𝒦_out, CSharing_{𝒦_out}) = Unify((𝒦_in, CSharing_{𝒦_in}), i, f(i_1, ..., i_j)) (Unify does not
fail), where 1 ≤ i, i_1, ..., i_j ≤ arity(𝒦_in[ε]) are pairwise distinct, and f is a
functor of arity j. Then CSharing_{𝒦_in} ⊆ CSharing_{𝒦_out}.

The proof that Definition 4.2.15 guarantees the concrete unification operation
Unify(-, i, f(i_1, ..., i_j)) to be well defined on the domain of concrete sharing
environments is almost identical to the proof of the well-definedness of the
Unify(-, i, j) operation. First we demonstrate that New = TermShift(𝒦_out,
BindingPairs(𝒦_in, i, f(i_1, ..., i_j))) is a pre-sharing component for 𝒦_out. Second,
using the previous result, we prove that the two components of the output
sharing, AccNew, and CSharing_{𝒦_out} (the transitive closure of the union of these two
components) are pre-sharing components for 𝒦_out. Third, we demonstrate that
CSharing_{𝒦_out} is a concrete sharing component for 𝒦_out. Using the fact that it is
a pre-sharing component, we still need to show that it satisfies Condition (4.1),
concerning multiple occurrences of a free variable in 𝒦_out.

Lemma 4.2.17 Let (𝒦_in, CSharing_{𝒦_in}) be a concrete sharing environment. Let
(𝒦_out, CSharing_{𝒦_out}) = Unify((𝒦_in, CSharing_{𝒦_in}), i, f(i_1, ..., i_j)) (Unify does not
fail), where 1 ≤ i, i_1, ..., i_j ≤ arity(𝒦_in[ε]), such that i, i_1, ..., i_j are pairwise
distinct, and f is a functor of arity j.
Then New = TermShift(𝒦_out, BindingPairs(𝒦_in, i, f(i_1, ..., i_j))) is a pre-sharing
component for 𝒦_out.

Proof. From the definitions of TermShift and BindingPairs, it is clear that New
is a set of unordered pairs of the form (r, s) where r, s ∈ O(𝒦_out). We must
prove that New satisfies Conditions 4.1.4.1-4.

Condition 4.1.4.1 is satisfied by New: Given (r, s) ∈ New, we must prove
that 𝒦_out/r ≡ 𝒦_out/s. The property holds for the set BindingPairs(𝒦_in, i,
f(i_1, ..., i_j)) as, after unifying 𝒦_in/i and f(𝒦_in/i_1, ..., 𝒦_in/i_j), we have

that 𝒦_out/i.ℓ.p ≡ 𝒦_out/i_ℓ.p for all ℓ satisfying 1 ≤ ℓ ≤ j and for all paths
p ∈ O(𝒦_out/i.ℓ). The function TermShift preserves the property because
corresponding subterms of identical terms are identical.

Condition 4.1.4.2 is satisfied by New: Given (r, s) ∈ New, we must prove
that r ≠ s. The property holds for the set BindingPairs(𝒦_in, i, f(i_1, ..., i_j)),
as i, i_1, ..., i_j are pairwise distinct. The function TermShift preserves the
property.

Condition 4.1.4.3 is satisfied by New: Given (r, s) ∈ New, we have to prove
that TermShift(𝒦_out, {(r, s)}) ⊆ New. This follows from the idempotence
of TermShift w.r.t. 𝒦_out.

Condition 4.1.4.4 is satisfied by New: Given (r, s), (s, t) ∈ New where r ≠
t, we must prove that (r, t) ∈ New. From the definitions of TermShift
and BindingPairs, we know that (r, s) and (s, t) are both of the form
(i.ℓ.p.q, i_ℓ.p.q), where (i.ℓ.p, i_ℓ.p) ∈ BindingPairs(𝒦_in, i, f(i_1, ..., i_j)), so
1 ≤ ℓ ≤ j and p is possibly ε. Suppose that s = i.ℓ.p.q (without loss
of generality), then r = i_ℓ.p.q = t, i.e. there are no two different
consecutive pairs in New.

[]
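The four conditions verified in this proof can be read off as an executable check. The Python sketch below paraphrases Conditions 4.1.4.1-4 as they are used in the proof (identical subterms, distinct paths, closure under TermShift, and transitivity for distinct end points); the exact wording of Definition 4.1.4 is not reproduced here, so the check is an approximation under that reading. The tuple encoding of terms, the `frozenset` pairs, and the names are likewise assumptions.

```python
def is_var(t):
    # Assumption: variables are strings, other terms are tuples (functor, args...).
    return isinstance(t, str)

def subterm(k, path):
    for i in path:
        k = k[i]
    return k

def occurrences(k, prefix=()):
    yield prefix
    if not is_var(k):
        for i, arg in enumerate(k[1:], start=1):
            yield from occurrences(arg, prefix + (i,))

def is_pre_sharing(k, pairs):
    """Check the four conditions, as paraphrased from the proof above."""
    paths = set(occurrences(k))
    for pair in pairs:
        if len(pair) != 2:                        # 4.1.4.2: r differs from s
            return False
        r, s = tuple(pair)
        if r not in paths or s not in paths:
            return False
        if subterm(k, r) != subterm(k, s):        # 4.1.4.1: identical subterms
            return False
        for t in occurrences(subterm(k, r)):      # 4.1.4.3: closed under TermShift
            if frozenset({r + t, s + t}) not in pairs:
                return False
    for a in pairs:                               # 4.1.4.4: transitivity
        for b in pairs:
            if a != b and a & b and (a ^ b) not in pairs:
                return False
    return True

# Hypothetical environment env(X, X, X).
k = ("env", "X", "X", "X")
p12, p23, p13 = (frozenset({(1,), (2,)}), frozenset({(2,), (3,)}),
                 frozenset({(1,), (3,)}))
```

The set {p12, p23} fails the transitivity condition, whereas adding p13 yields a set that passes all four checks.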

L e m m a 4.2.18 Let (]ci,,, (CShr~c,..,CShr~c.~)) be a concrete sharin9 environ-


ment. Let (]co,,t, (CShr~c..,, CShr~..,)) = Unify((]ci,, (CShr~c,~ , CShr~c,,,)),i,
f ( i ~ , . . . , is) ) (Unify does not fail}, where 1 < i, i l , . . . , i s < arit~(]c,,[c]), such
that i, i l , . . . , i j are pairwise distinct, and f is a functor of arity j. Then
CShr~:..,, AccNew, CShr~c.. * and CSharingx:.., are pre-sharing components for
]co~,t (where CSharingx:.., = TransitlveCIosure(CShr~:.., U CShr~c..,) }.

Proof. The proof of this lemma is identical to the proof of Lemma 4.2.4, except
that Lemma 4.2.17 is used instead of Lemma 4.2.3. []

The following lemma states that the set of concrete sharing environments is
closed under the Unify(-, i, f(i_1, ..., i_j)) operation.

Lemma 4.2.19 Let (𝒦_in, CSharing_{𝒦_in}) be a concrete sharing environment. Let
(𝒦_out, CSharing_{𝒦_out}) = Unify((𝒦_in, CSharing_{𝒦_in}), i, f(i_1, ..., i_j)) (Unify does not
fail), where 1 ≤ i, i_1, ..., i_j ≤ arity(𝒦_in[ε]) such that i, i_1, ..., i_j are pairwise
distinct, and f is a functor of arity j. Then (𝒦_out, CSharing_{𝒦_out}) is a concrete
sharing environment.

Proof. By the definition of Unify, we have 𝒦_out ≡ 𝒦_in σ, where σ = mgu(𝒦_in/i,
f(𝒦_in/i_1, ..., 𝒦_in/i_j)) is not fail. Because (𝒦_in, CSharing_{𝒦_in}) is a concrete
sharing environment, it follows from the definition of substitution application that
the term 𝒦_out has principal functor 0, and that O(𝒦_in) ⊆ O(𝒦_out). From
Lemma 4.2.18 we know that the sets CShr~:o.,, CShr~.,, and CSharing,~:.,., =

TransltiveClosure(CShr~co," U CShr~co.,) , are pre-sharing components for/Co,,t. So,


we still have to prove that CSharing_{𝒦_out} satisfies Condition (4.1), i.e.,
(𝒦_out/r ≡ 𝒦_out/s) & (𝒦_out/r ∈ Vars) & (r ≠ s) ⇒ (r, s) ∈ CSharing_{𝒦_out}.
From the relationship between mgu's and solved form equation sets [47] we know
that mgu(𝒦_in/i, f(𝒦_in/i_1, ..., 𝒦_in/i_j)) can be obtained by transforming the
solvable equation set

E_0 = { 𝒦_in/i = f(𝒦_in/i_1, ..., 𝒦_in/i_j) }
into its equivalent solved form (σ is a most general unifier of E_0).
Again, we wish to observe the effect of the algorithm on the terms in the
environment as it incrementally isolates and applies variable bindings. We consider
the equation set

E_in = { X = 𝒦_in, 𝒦_in/i = f(𝒦_in/i_1, ..., 𝒦_in/i_j) }

formed from E_0 by adding an auxiliary equation of the form X = 𝒦_in, where X
is a new variable not occurring elsewhere in the system. Clearly E_in is solvable
iff E_0 is. Let

E_out = { X = 𝒦, ... }

be the solved form of E_in. Since E_in and E_out are equivalent, they have the
same most general unifiers. Let γ be one. Then it follows that
𝒦 = Xγ = 𝒦_in σ (modulo renaming) = 𝒦_out.
From the fact that Unify((𝒦_in, CSharing_{𝒦_in}), i, f(i_1, ..., i_j)) does not fail, it
follows that two cases are possible.
case 1: 𝒦_in/i ∈ Vars & 𝒦_in/i ∉ Vars(f(𝒦_in/i_1, ..., 𝒦_in/i_j)).
case 2: 𝒦_in[i] = f, where f is a functor of arity j.
In case 1, assuming 𝒦_in/i ≡ Z, Step 5 of the Solved Form Algorithm is applicable,
replaces Z by f(𝒦_in/i_1, ..., 𝒦_in/i_j) in the auxiliary equation X = 𝒦_in,
and then halts. Given paths r, q such that r, q ∈ O(𝒦_out) & r ≠ q &
𝒦_out/r ≡ 𝒦_out/q & 𝒦_out/r ∈ Vars, we must prove that (r, q) ∈ CSharing_{𝒦_out}. Let
𝒦_out/r ≡ 𝒦_out/q ≡ Y. Since 𝒦_out is constructed from 𝒦_in according to Step 5,
we know that each s ∈ O(𝒦_out) such that 𝒦_out/s ≡ Y must satisfy one of the
following.

(s ∈ O(𝒦_in) & 𝒦_in/s ≡ Y) (4.27)

or

∃s', ℓ_s, s'': (s = s'.ℓ_s.s'' & s' ∈ O(𝒦_in) & 𝒦_in/s' ≡ Z & 1 ≤ ℓ_s ≤ j
& s'' ∈ O(𝒦_in/i_{ℓ_s}) & 𝒦_out/s ≡ 𝒦_in/i_{ℓ_s}.s'' ≡ Y) (4.28)

Note that for case (4.28), we have 𝒦_in/s' ≡ Z ≡ 𝒦_in/i. So, because CSharing_{𝒦_in}
(the transitive closure of the union of its two components) satisfies Condition (4.1),
we have that either s' = i or (s', i) ∈ CSharing_{𝒦_in}. So, using Lemma 4.2.18, and
CSharing_{𝒦_in} ⊆ CSharing_{𝒦_out} (Lemma 4.2.16), we have either s = i.ℓ_s.s'' or
(s, i.ℓ_s.s'') ∈ CSharing_{𝒦_out}. From 𝒦_in/i ∈ Vars and the definition of Unify, we
know that

(i.ℓ_s.s'', i_{ℓ_s}.s'') ∈ New ⊆ CSharing_{𝒦_out}.

So, using again Lemma 4.2.18, a more convenient consequence of (4.28) is

∃ℓ_s, s'': 1 ≤ ℓ_s ≤ j & 𝒦_in/i_{ℓ_s}.s'' ≡ Y & (s, i_{ℓ_s}.s'') ∈ CSharing_{𝒦_out} (4.29)

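We shall now show that $(r, q) \in \mathrm{CSharing}_{\mathcal{K}_{out}}$ by a case analysis based on which
of (4.27) or (4.29) holds for $r$ and $q$. Suppose $r$ and $q$ both satisfy (4.27), i.e. $r, q \in
O(\mathcal{K}_{in})$ and $\mathcal{K}_{in}/r \equiv Y \equiv \mathcal{K}_{in}/q$. Because $\mathrm{CSharing}_{\mathcal{K}_{in}}$ satisfies Condition (4.1)
and $r \neq q$, we have $(r, q) \in \mathrm{CSharing}_{\mathcal{K}_{in}} \subseteq \mathrm{CSharing}_{\mathcal{K}_{out}}$, as desired.
Suppose $r$ and $q$ both satisfy (4.29), i.e. $\exists \ell_r, r'', \ell_q, q'' : 1 \le \ell_r, \ell_q \le j$ &

$$(r, i_{\ell_r}.r'') \in \mathrm{CSharing}_{\mathcal{K}_{out}} \;\&\; (q, i_{\ell_q}.q'') \in \mathrm{CSharing}_{\mathcal{K}_{out}} \;\&\; (\mathcal{K}_{in}/i_{\ell_r}.r'' \equiv Y \equiv \mathcal{K}_{in}/i_{\ell_q}.q'').$$

Because $\mathrm{CSharing}_{\mathcal{K}_{in}} = \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}})$ satisfies Condition (4.1),
we have

$$(i_{\ell_r}.r'', i_{\ell_q}.q'') \in \mathrm{CSharing}_{\mathcal{K}_{in}} \;\vee\; i_{\ell_r}.r'' = i_{\ell_q}.q''.$$

It follows from $r \neq q$, $\mathrm{CSharing}_{\mathcal{K}_{in}} \subseteq \mathrm{CSharing}_{\mathcal{K}_{out}}$ and Lemma 4.2.18, that
$(r, q) \in \mathrm{CSharing}_{\mathcal{K}_{out}}$.
Otherwise, one of $r$ and $q$ satisfies (4.27) (assume it is $r$, without loss of
generality), and the other (thus $q$) satisfies (4.29). From $r$ satisfying (4.27) we
know $\mathcal{K}_{in}/r \equiv Y$, and from $q$ satisfying (4.29) we know $\exists \ell_q, q'' : 1 \le \ell_q \le
j$ & $\mathcal{K}_{in}/i_{\ell_q}.q'' \equiv Y$ & $(q, i_{\ell_q}.q'') \in \mathrm{CSharing}_{\mathcal{K}_{out}}$. Because $\mathrm{CSharing}_{\mathcal{K}_{in}}$ satisfies
Condition (4.1), we have either $r = i_{\ell_q}.q''$ or $(r, i_{\ell_q}.q'') \in \mathrm{CSharing}_{\mathcal{K}_{in}} \subseteq
\mathrm{CSharing}_{\mathcal{K}_{out}}$. It follows from $r \neq q$, $(q, i_{\ell_q}.q'') \in \mathrm{CSharing}_{\mathcal{K}_{out}}$ and Lemma 4.2.18,
that $(r, q) \in \mathrm{CSharing}_{\mathcal{K}_{out}}$.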
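In case 2, step 1 of the Solved Form Algorithm is applicable, and results in
the set of equations

$$E_{in} = \{\, X = \mathcal{K}_{in},\;\; \mathcal{K}_{in}/i.1 = \mathcal{K}_{in}/i_1,\;\; \ldots,\;\; \mathcal{K}_{in}/i.j = \mathcal{K}_{in}/i_j \,\}.$$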

Similarly as in the proof of Lemma 4.2.5, one can show that the following induction
hypotheses hold at each stage of the Solved Form Algorithm given by

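an equation set $E$ and a term $\mathcal{K}$, where

$$E = \{\, X = \mathcal{K},\;\; t_1 = t_2,\;\; \ldots \,\}.$$

Induction hypotheses:

(a) $(i.\ell.s, i_\ell.s \in O(\mathcal{K})$ & $(\mathcal{K}/i.\ell.s \in \mathit{Vars} \vee \mathcal{K}/i_\ell.s \in \mathit{Vars})) \Rightarrow (i.\ell.s, i_\ell.s) \in \mathit{New}$
(b) $\forall r, q \in O(\mathcal{K})$: $(\mathcal{K}/r \equiv \mathcal{K}/q$ & $\mathcal{K}/r \in \mathit{Vars}$ & $r \neq q \Rightarrow (r, q) \in \mathrm{CSharing}_{\mathcal{K}_{out}})$

(c) For each $t_1 = t_2$ in $E$ that is not of the auxiliary form $X = \mathcal{K}$ and that has not
yet been the subject of Step 5, there exists a path $s \in O(\mathcal{K}/i.\ell) \cap O(\mathcal{K}/i_\ell)$
such that either $t_1 \equiv \mathcal{K}/i.\ell.s$, $t_2 \equiv \mathcal{K}/i_\ell.s$ or $t_2 \equiv \mathcal{K}/i.\ell.s$, $t_1 \equiv \mathcal{K}/i_\ell.s$

Clearly, Condition (b) expressed for the stage given by the solved form equation
set $E_{out}$ and the term $\mathcal{K}$ gives the conclusion of the lemma. □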

Next, we give the definition of the abstract operation.

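Definition 4.2.20 For an abstract sharing environment $(\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}}))$
and $1 \le i, i_1, \ldots, i_j \le \mathrm{arity}(\mathcal{T}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are pairwise
distinct, and $f$ is a functor of arity $j$,

$\mathrm{AbstrUnify}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})), i, f(i_1, \ldots, i_j)) =$

if $\mathrm{TGUnify}(\mathcal{T}^e_{in}, i, f(i_1, \ldots, i_j))$ is fail
then fail
else $(\mathcal{T}^e_{out}, (\mathrm{AShr}^o_{\mathcal{T}_{out}}, \mathrm{AShr}^n_{\mathcal{T}_{out}}))$,
where
$\mathcal{T}^e_{out} = \mathrm{TGUnify}(\mathcal{T}^e_{in}, i, f(i_1, \ldots, i_j))$,
and

$\mathrm{AShr}^o_{\mathcal{T}_{out}} = \mathrm{Convert}(\mathcal{T}_{in}, \mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathcal{T}_{out})$,
$C = \mathrm{TGShift}(\mathcal{T}_{out}, \mathrm{Convert}(\mathcal{T}_{in}, \mathrm{AShr}^n_{\mathcal{T}_{in}}, \mathcal{T}_{out}))$,
$B = \mathrm{TGShift}(\mathcal{T}_{out}, \mathrm{BindingEdges}(\mathcal{T}_{in}, \mathcal{T}_{out}, i, f(i_1, \ldots, i_j)))$,
$\mathrm{TGShift}(\mathcal{T}_{out}, \mathrm{AShr}^n_{\mathcal{T}_{out}}) = \mathrm{AlternatingClosure}(C, B)$.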

We still have to define the auxiliary procedure BindingEdges. It is the type-graph


analogue of the BindingPairs function for concrete terms.

Definition 4.2.21 For an abstract term environment $\mathcal{T}^e_{in}$, $1 \le i, i_1, \ldots, i_j \le
\mathrm{arity}(\mathcal{T}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are pairwise distinct, $f$ is a functor of arity $j$

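and $\mathcal{T}^e_{out} = \mathrm{TGUnify}(\mathcal{T}^e_{in}, i, f(i_1, \ldots, i_j))$,

$\mathrm{BindingEdges}(\mathcal{T}_{in}, \mathcal{T}_{out}, i, f(i_1, \ldots, i_j)) =$
$\{\, (\mathcal{T}_{out}[R], \mathcal{T}_{out}[S]) \mid \exists R, S \in S(\mathcal{T}_{out}), \exists h, \ell :$
$h$ is a functor, a constant or $V$ & $1 \le \ell \le j$ &
$([R = \langle i, f\rangle.\langle \ell, h\rangle$ & $S = \langle i_\ell, h\rangle$ & $(\langle i, V\rangle \in S(\mathcal{T}_{in}) \vee
\langle i, f\rangle.\langle \ell, V\rangle \in S(\mathcal{T}_{in}) \vee \langle i_\ell, V\rangle \in S(\mathcal{T}_{in}))]$ (either the $i$-th,
the $\langle i, f\rangle.\ell$-th or the $i_\ell$-th component of $\mathcal{T}_{in}$ is itself a V-node) $\vee$
$[\exists g, T, k : R = \langle i, f\rangle.\langle \ell, g\rangle.T.\langle k, h\rangle$ & $S = \langle i_\ell, g\rangle.T.\langle k, h\rangle$ &
$(\langle i, f\rangle.\langle \ell, g\rangle.T.\langle k, V\rangle \in S(\mathcal{T}_{in}) \vee \langle i_\ell, g\rangle.T.\langle k, V\rangle \in S(\mathcal{T}_{in}))])$
(or there is a V-node inside the $\langle i, f\rangle.\ell$-th or $i_\ell$-th component of $\mathcal{T}_{in}$) $\}$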

The following theorem gives the correctness condition for the AbstrUnify opera-
tion.

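Condition 4.2.22

1. $(\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}}))$ is an abstract sharing environment
2. $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$ is a concrete sharing environment
3. $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})))$, that
is,
(a) $\mathcal{K}_{in} \in \mathrm{TGEnvConc}(\mathcal{T}^e_{in})$
(b) $\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}$ are pre-sharing components for $\mathcal{K}_{in}$, and
$\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}})$ is a sharing component for $\mathcal{K}_{in}$
(c) $\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CShr}^o_{\mathcal{K}_{in}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^o_{\mathcal{T}_{in}})$ and
$\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CShr}^n_{\mathcal{K}_{in}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^n_{\mathcal{T}_{in}})$
4. $1 \le i, i_1, \ldots, i_j \le \mathrm{arity}(\mathcal{T}_{in}[\epsilon]) = \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are
pairwise distinct and $f$ is a functor of arity $j$
5. $(\mathcal{K}_{out}, (\mathrm{CShr}^o_{\mathcal{K}_{out}}, \mathrm{CShr}^n_{\mathcal{K}_{out}})) = \mathrm{Unify}((\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})), i,
f(i_1, \ldots, i_j))$ is not fail
6. $(\mathcal{T}^e_{out}, (\mathrm{AShr}^o_{\mathcal{T}_{out}}, \mathrm{AShr}^n_{\mathcal{T}_{out}})) = \mathrm{AbstrUnify}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})), i,
f(i_1, \ldots, i_j))$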

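Theorem 4.2.23 (Safety of $\mathrm{AbstrUnify}(-, i, f(i_1, \ldots, i_j))$) Assuming Condition 4.2.22,
it follows that

$$(\mathcal{K}_{out}, (\mathrm{CShr}^o_{\mathcal{K}_{out}}, \mathrm{CShr}^n_{\mathcal{K}_{out}})) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_{out}, (\mathrm{AShr}^o_{\mathcal{T}_{out}}, \mathrm{AShr}^n_{\mathcal{T}_{out}}))).$$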

Proof. The proof of this theorem is identical to the proof of Theorem 4.2.11,
except that Theorem 2.3.10 is used instead of Theorem 2.3.9, and Lemma 4.2.24
instead of Lemma 4.2.14. []

Lemma 4.2.24 (Safety of $\mathrm{BindingEdges}(-, -, i, f(i_1, \ldots, i_j))$) Assuming Condition 4.2.22,
it follows that

$$\mathrm{AbsPair}_{\mathcal{T}_{out},\mathcal{K}_{out}}(\mathrm{BindingPairs}(\mathcal{K}_{in}, i, f(i_1, \ldots, i_j))) \subseteq \mathrm{BindingEdges}(\mathcal{T}_{in}, \mathcal{T}_{out}, i, f(i_1, \ldots, i_j)).$$

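Proof. Each element of $\mathrm{AbsPair}_{\mathcal{T}_{out},\mathcal{K}_{out}}(\mathrm{BindingPairs}(\mathcal{K}_{in}, i, f(i_1, \ldots, i_j)))$ has
the form
$$(\mathcal{T}_{out}[\mathrm{Sel}(\mathcal{K}_{out}, i.\ell.p)], \mathcal{T}_{out}[\mathrm{Sel}(\mathcal{K}_{out}, i_\ell.p)]),$$
where

$$(i.\ell.p, i_\ell.p \in O(\mathcal{K}_{in})) \;\&\; (1 \le \ell \le j) \;\&\; (\mathcal{K}_{in}/i.\ell.p \in \mathit{Vars} \vee \mathcal{K}_{in}/i_\ell.p \in \mathit{Vars}), \qquad (4.30)$$

or

$$(p = \epsilon \;\&\; (1 \le \ell \le j) \;\&\; \mathcal{K}_{in}/i \in \mathit{Vars}). \qquad (4.31)$$

We first consider case (4.30). Using Condition 4.2.22.3a and Property 4.1.13, we
have
$$\mathrm{Sel}(\mathcal{K}_{in}, i.\ell.p), \mathrm{Sel}(\mathcal{K}_{in}, i_\ell.p) \in S(\mathcal{T}_{in}).$$
From (4.30), Condition 4.2.22.5 (the unification succeeds) and the definition of
Sel, it follows that if $p = \epsilon$ then

$$\mathrm{Sel}(\mathcal{K}_{in}, i.\ell.p) = \langle i, f\rangle.\langle \ell, V\rangle \;\vee\; \mathrm{Sel}(\mathcal{K}_{in}, i_\ell.p) = \langle i_\ell, V\rangle,$$

otherwise there exist a functor $g$, a selector $T$, a path $p' \in O(\mathcal{K}_{in}/i.\ell) \cap O(\mathcal{K}_{in}/i_\ell)$
and $k \in \mathbb{N}$, satisfying

$$g = \mathcal{K}_{in}[i.\ell] = \mathcal{K}_{in}[i_\ell] \;\&\; p = p'.k \;\&\; T = \mathrm{Sel}(\mathcal{K}_{in}/i.\ell, p') = \mathrm{Sel}(\mathcal{K}_{in}/i_\ell, p'),$$

and

$$\mathrm{Sel}(\mathcal{K}_{in}, i.\ell.p) = \langle i, f\rangle.\langle \ell, g\rangle.T.\langle k, V\rangle \;\vee\; \mathrm{Sel}(\mathcal{K}_{in}, i_\ell.p) = \langle i_\ell, g\rangle.T.\langle k, V\rangle.$$

From the definition of substitution application and Condition 4.2.22.5, we have
that $i.\ell.p, i_\ell.p \in O(\mathcal{K}_{out})$, and if $p = \epsilon$ then

$$\mathrm{Sel}(\mathcal{K}_{out}, i.\ell.p) = \langle i, f\rangle.\langle \ell, h\rangle \;\&\; \mathrm{Sel}(\mathcal{K}_{out}, i_\ell.p) = \langle i_\ell, h\rangle,$$

else

$$\mathrm{Sel}(\mathcal{K}_{out}, i.\ell.p) = \langle i, f\rangle.\langle \ell, g\rangle.T.\langle k, h\rangle \;\&\; \mathrm{Sel}(\mathcal{K}_{out}, i_\ell.p) = \langle i_\ell, g\rangle.T.\langle k, h\rangle,$$

where $h$ is either $V$, a functor or a constant. In case (4.31), it follows from
Condition 4.2.22.3a and the definition of Sel that

$$\mathrm{Sel}(\mathcal{K}_{in}, i) = \langle i, V\rangle \in S(\mathcal{T}_{in}).$$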

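From Condition 4.2.22.5 (the unification succeeds) and the definition of substitution
application, we have that $i.\ell, i_\ell \in O(\mathcal{K}_{out})$, and

$$\mathrm{Sel}(\mathcal{K}_{out}, i.\ell) = \langle i, f\rangle.\langle \ell, h\rangle \;\&\; \mathrm{Sel}(\mathcal{K}_{out}, i_\ell) = \langle i_\ell, h\rangle,$$

where $h$ is either $V$, a functor or a constant. From the safety of TGUnify (Theorem 2.3.10)
we have $\mathcal{K}_{out} \in \mathrm{TGEnvConc}(\mathcal{T}^e_{out})$. So, from Property 4.1.13 it
follows that for all cases $\mathrm{Sel}(\mathcal{K}_{out}, i.\ell.p), \mathrm{Sel}(\mathcal{K}_{out}, i_\ell.p) \in S(\mathcal{T}_{out})$. Finally, by the
definition of BindingEdges, we have

$$(\mathcal{T}_{out}[\mathrm{Sel}(\mathcal{K}_{out}, i.\ell.p)], \mathcal{T}_{out}[\mathrm{Sel}(\mathcal{K}_{out}, i_\ell.p)]) \in \mathrm{BindingEdges}(\mathcal{T}_{in}, \mathcal{T}_{out}, i, f(i_1, \ldots, i_j)),$$

as desired.
□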

4.2.2 Procedure Entry


The procedure-entry operation is applied by the abstract interpretation proce-
dure when expanding the AND-OR-graph. It corresponds to resolving the next
call with an applicable clause while constructing a concrete proof tree.
A first step of the operation, independent of the clause(s) applicable, consists
of restricting the information present in the current environment to the argu-
ments of the call. The second step, which depends on the clause(s) that is (are)
applicable, consists of an initialization of the new environment(s). We will only
specify the first step of the operation as the second step is straightforward. As
discussed in Section 2.3.3 for type-graph environments, the second step serves
to pass on the type and sharing information existing for the actual parameters
to the formal parameters and to initialize the local variables of the clause to the
type 'V'. Note that the local variables initially do not take part in the sharing
relation.
We represent a concrete environment having domain $\{X_1, \ldots, X_n\}$ as a term
with principal functor $\langle\rangle$ and arity $n$ (see Section 4.1.1). We assume that all Prolog
programs are in normal form, so the arguments of the next call can be given
as a subset $\{X_{i_1}, \ldots, X_{i_m}\}$ of the domain and an injection $i_{call} : \{1, \ldots, m\} \to
\{1, \ldots, n\} : k \mapsto i_k$ can be used to formally define the restriction of the sharing
information to the arguments of the call.

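Definition 4.2.25 For a concrete term $\mathcal{K} = \langle \mathcal{K}_1, \ldots, \mathcal{K}_n \rangle$, a set $L_\mathcal{K} \subseteq O(\mathcal{K})$,
and $P_\mathcal{K}$ a set of unordered pairs $(r, s)$ such that $r, s \in O(\mathcal{K})$, and an injection
$i : \{1, \ldots, m\} \to \{1, \ldots, n\}$ (for $m \le n$), we define

$\mathrm{Proj}(\mathcal{K}, i) = \langle \mathcal{K}_{i(1)}, \ldots, \mathcal{K}_{i(m)} \rangle$,
$\mathrm{Proj}(L_\mathcal{K}, i) = \{\, k.s \mid k \in \{1, \ldots, m\} \;\&\; i(k).s \in L_\mathcal{K} \,\}$,
$\mathrm{Proj}(P_\mathcal{K}, i) = \{\, (k.s, l.r) \mid k, l \in \{1, \ldots, m\} \;\&\; (i(k).s, i(l).r) \in P_\mathcal{K} \,\}$.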
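The three projections can be illustrated with a small Python sketch. The encoding is an assumption made only for this illustration and is not part of the formal development: an environment is a tuple of its components, a path such as 3.1 is the tuple `(3, 1)`, an unordered pair is a `frozenset` of two paths, and the injection is a dictionary mapping k to i(k). The function names are hypothetical.

```python
from typing import Dict, FrozenSet, Set, Tuple

Path = Tuple[int, ...]      # (3, 1) stands for the path 3.1
Pair = FrozenSet[Path]      # unordered pair of two distinct paths


def proj_env(env: tuple, inj: Dict[int, int]) -> tuple:
    """Proj(K, i): the tuple of components K_i(1), ..., K_i(m)."""
    return tuple(env[inj[k] - 1] for k in sorted(inj))


def proj_paths(paths: Set[Path], inj: Dict[int, int]) -> Set[Path]:
    """Proj(L, i): keep the paths i(k).s and rename them to k.s."""
    inverse = {target: k for k, target in inj.items()}
    return {(inverse[p[0]],) + p[1:] for p in paths if p and p[0] in inverse}


def proj_pairs(pairs: Set[Pair], inj: Dict[int, int]) -> Set[Pair]:
    """Proj(P, i): keep the pairs whose two paths both start in the image of i."""
    inverse = {target: k for k, target in inj.items()}
    result = set()
    for pair in pairs:
        if all(p and p[0] in inverse for p in pair):
            result.add(frozenset((inverse[p[0]],) + p[1:] for p in pair))
    return result
```

Pairs with one path outside the image of the injection are simply dropped, which is why the restriction operation below first closes the sharing relation transitively.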

Property 4.2.26 For a concrete term $\mathcal{K}$ such that $\mathrm{arity}(\mathcal{K}[\epsilon]) = n$, an injection
$i : \{1, \ldots, m\} \to \{1, \ldots, n\}$ (for $m \le n$), $\sigma$ a substitution over $\mathit{Vars}(\mathrm{Proj}(\mathcal{K}, i))$,
and $k.r \in O(\mathrm{Proj}(\mathcal{K}, i))$, we have

1. $\mathrm{Proj}(\mathcal{K}, i)\sigma = \mathrm{Proj}(\mathcal{K}\sigma, i)$
2. $\mathrm{Proj}(\mathcal{K}, i)/k.r \equiv \mathcal{K}/i(k).r$
3. $\mathrm{Proj}(O(\mathcal{K}), i) = O(\mathrm{Proj}(\mathcal{K}, i))$
Proof. These equalities follow immediately from the definition of Proj and the
definition of substitution application. □
The arguments of the restriction operation (i.e. the first step of the procedure-entry
operation) comprise the abstract sharing environment at the call point and
the $i_{call}$ injection. The operation always succeeds, because of the normal form
of programs.

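Definition 4.2.27 For a concrete sharing environment $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$
such that $\mathrm{arity}(\mathcal{K}_{in}[\epsilon]) = n$, $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$, and $i_{call} : \{1, \ldots, m\} \to
\{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$,

$\mathrm{Restrict}((\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})), i_{call}) = (\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \emptyset))$

where
$\mathcal{K}_{rstr} = \mathrm{Proj}(\mathcal{K}_{in}, i_{call})$,
$\mathrm{CSharing}_{\mathcal{K}_{in}} = \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}})$,
$\mathrm{CShr}^o_{\mathcal{K}_{rstr}} = \mathrm{Proj}(\mathrm{CSharing}_{\mathcal{K}_{in}}, i_{call})$.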
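A minimal Python sketch of the concrete restriction step follows. It rests on two assumptions that are not taken from the definition itself: paths are encoded as integer tuples and unordered pairs as two-element `frozenset`s, and TransitiveClosure is taken to be the smallest superset closed under "(r, s) and (s, t) with r ≠ t imply (r, t)", computed here with a union-find structure. All function names are hypothetical.

```python
from itertools import combinations


def transitive_closure(pairs):
    """Close a set of unordered pairs under (r,s),(s,t), r != t => (r,t)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for pair in pairs:
        r, s = tuple(pair)
        parent[find(r)] = find(s)

    classes = {}
    for x in list(parent):
        classes.setdefault(find(x), []).append(x)
    return {frozenset(c) for members in classes.values()
            for c in combinations(members, 2)}


def proj_pairs(pairs, inj):
    """Proj(P, i): keep pairs with both paths in the image of i, renamed to k.s."""
    inverse = {target: k for k, target in inj.items()}
    return {frozenset((inverse[p[0]],) + p[1:] for p in pair)
            for pair in pairs
            if all(p and p[0] in inverse for p in pair)}


def restrict(env, old, new, inj):
    """Restrict: project the environment and the closed sharing relation.

    The projected sharing becomes the old component; the new one is empty.
    """
    sharing = transitive_closure(old | new)
    env_rstr = tuple(env[inj[k] - 1] for k in sorted(inj))
    return env_rstr, (proj_pairs(sharing, inj), set())
```

In the test case used to check this sketch, components 1 and 3 share only through component 2, which is not passed to the call. Projecting the raw pairs would lose that sharing, whereas projecting the closure keeps it.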
The following lemma states that the operation Restrict is well-defined on the set
of concrete sharing environments.

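Lemma 4.2.28 Let $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$ be a concrete sharing environment
such that $\mathrm{arity}(\mathcal{K}_{in}[\epsilon]) = n$, $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$, and $i_{call} : \{1, \ldots, m\} \to
\{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$. Let
$(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}})) = \mathrm{Restrict}((\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})), i_{call})$.
Then $(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}}))$ is a concrete sharing environment.

Proof. By the definition of Restrict, we have $\mathcal{K}_{rstr} = \mathrm{Proj}(\mathcal{K}_{in}, i_{call})$, $\mathrm{CShr}^n_{\mathcal{K}_{rstr}} =
\emptyset$ and $\mathrm{CShr}^o_{\mathcal{K}_{rstr}} = \mathrm{Proj}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}), i_{call})$. The empty
set is a pre-sharing component for any term. Because $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$
is a concrete sharing environment, and because the function Proj preserves
Conditions 4.1.4 and (4.1), it follows that $\mathrm{CShr}^o_{\mathcal{K}_{rstr}}$ is a concrete sharing component
for $\mathcal{K}_{rstr}$. Hence $(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}}))$ is a concrete sharing environment.
□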

To define the abstract restriction operation, we first introduce a type-graph
analogue of the function Proj for concrete terms that translates sharing edges
for a type graph $\mathcal{T}$ into sharing edges for a type graph $\mathcal{T}_{rstr}$ constructed from $\mathcal{T}$
by selecting subgraphs according to an injection $i$.

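Definition 4.2.29 For an abstract term environment $\mathcal{T}^e$ with $\mathrm{arity}(\mathcal{T}[\epsilon]) =
n$, a set $P_\mathcal{T}$ of unordered pairs $(p, q)$ such that $p, q \in \mathit{Nodes}_\mathcal{T}$ & $\mathrm{Label}(p) \neq$
'Or' & $\mathrm{Label}(q) \neq$ 'Or', and an injection $i : \{1, \ldots, m\} \to \{1, \ldots, n\}$ (for
$m \le n$) such that $\mathcal{T}^e_{rstr} = \mathrm{TGRestrict}(\mathcal{T}^e, i)$, we define

$\mathrm{AbstrProj}(\mathcal{T}, P_\mathcal{T}, \mathcal{T}_{rstr}, i) =$
$\{\, (\mathcal{T}_{rstr}[\langle k, f\rangle.S], \mathcal{T}_{rstr}[\langle l, g\rangle.R]) \mid$
$\langle k, f\rangle.S, \langle l, g\rangle.R \in S(\mathcal{T}_{rstr})$ & $k, l \in \{1, \ldots, m\}$ &
$\langle i(k), f\rangle.S, \langle i(l), g\rangle.R \in S(\mathcal{T})$ &
$(\mathcal{T}[\langle i(k), f\rangle.S], \mathcal{T}[\langle i(l), g\rangle.R]) \in P_\mathcal{T} \,\}$.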

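Definition 4.2.30 For an abstract sharing environment $(\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}},
\mathrm{AShr}^n_{\mathcal{T}_{in}}))$ such that $\mathrm{arity}(\mathcal{T}_{in}[\epsilon]) = n$, $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$, and $i_{call} : \{1, \ldots, m\}
\to \{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$,

$\mathrm{AbstrRestrict}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})), i_{call}) = (\mathcal{T}^e_{rstr}, (\mathrm{AShr}^o_{\mathcal{T}_{rstr}}, \emptyset))$

where
$\mathcal{T}^e_{rstr} = \mathrm{TGRestrict}(\mathcal{T}^e_{in}, i_{call})$,
$\mathrm{ASharing}_{\mathcal{T}_{in}} = \mathrm{AlternatingClosure}(\mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^o_{\mathcal{T}_{in}}), \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^n_{\mathcal{T}_{in}}))$,
$\mathrm{TGShift}(\mathcal{T}_{rstr}, \mathrm{AShr}^o_{\mathcal{T}_{rstr}}) = \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}, \mathcal{T}_{rstr}, i_{call})$.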
The following theorem gives the correctness condition for the AbstrRestrict op-
eration.

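Theorem 4.2.31 (Safety of AbstrRestrict) Assuming the following conditions

1. $(\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}}))$ is an abstract sharing environment
2. $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$ is a concrete sharing environment
3. $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})))$, that
is,
(a) $\mathcal{K}_{in} \in \mathrm{TGEnvConc}(\mathcal{T}^e_{in})$
(b) $\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}$ are pre-sharing components for $\mathcal{K}_{in}$, and
$\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}})$ is a concrete sharing component
for $\mathcal{K}_{in}$
(c) $\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CShr}^o_{\mathcal{K}_{in}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^o_{\mathcal{T}_{in}})$ and
$\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CShr}^n_{\mathcal{K}_{in}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^n_{\mathcal{T}_{in}})$

5. $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is an injection (for $m \le n$) such that
$i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$
6. $(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}})) = \mathrm{Restrict}((\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})), i_{call})$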

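7. $(\mathcal{T}^e_{rstr}, (\mathrm{AShr}^o_{\mathcal{T}_{rstr}}, \mathrm{AShr}^n_{\mathcal{T}_{rstr}})) = \mathrm{AbstrRestrict}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})), i_{call})$

it follows that

$$(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}})) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_{rstr}, (\mathrm{AShr}^o_{\mathcal{T}_{rstr}}, \mathrm{AShr}^n_{\mathcal{T}_{rstr}}))).$$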
Before proving the theorem itself, we first give a lemma that contributes to the
proof.

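Lemma 4.2.32 (Safety of AbstrProj) Let $\mathcal{T}^e_{in}$ be an abstract term environment
such that $n = \mathrm{arity}(\mathcal{T}_{in}[\epsilon])$, and $\mathcal{K}_{in} \in \mathrm{TGEnvConc}(\mathcal{T}^e_{in})$, and $P_{\mathcal{K}_{in}}$ a set of
unordered pairs of the form $(r, s)$ where $r, s \in O(\mathcal{K}_{in})$. Let $i_{call} : \{1, \ldots, m\} \to
\{1, \ldots, n\}$ (for $m \le n$) be an injection such that $\mathcal{T}^e_{rstr} = \mathrm{TGRestrict}(\mathcal{T}^e_{in}, i_{call})$ &
$\mathcal{K}_{rstr} = \mathrm{Proj}(\mathcal{K}_{in}, i_{call})$. Then

$$\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{Proj}(P_{\mathcal{K}_{in}}, i_{call})) \subseteq \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(P_{\mathcal{K}_{in}}), \mathcal{T}_{rstr}, i_{call}). \qquad (4.32)$$

Proof. After expanding the definitions of $\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}$ and Proj, we have

$\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{Proj}(P_{\mathcal{K}_{in}}, i_{call})) =$
$\{\, (\mathcal{T}_{rstr}[\mathrm{Sel}(\mathcal{K}_{rstr}, k.s)], \mathcal{T}_{rstr}[\mathrm{Sel}(\mathcal{K}_{rstr}, l.r)]) \mid k, l \in \{1, \ldots, m\} \;\&\; (i_{call}(k).s, i_{call}(l).r) \in P_{\mathcal{K}_{in}} \,\}$.

For arbitrary $k, l, s, r$ such that $k, l \in \{1, \ldots, m\}$ and $(i_{call}(k).s, i_{call}(l).r) \in P_{\mathcal{K}_{in}}$,
we must prove that

$$(\mathcal{T}_{rstr}[\mathrm{Sel}(\mathcal{K}_{rstr}, k.s)], \mathcal{T}_{rstr}[\mathrm{Sel}(\mathcal{K}_{rstr}, l.r)]) \in \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(P_{\mathcal{K}_{in}}), \mathcal{T}_{rstr}, i_{call}).$$

By the definition of AbstrProj, it is sufficient to prove that

$$\mathrm{Sel}(\mathcal{K}_{rstr}, k.s), \mathrm{Sel}(\mathcal{K}_{rstr}, l.r) \in S(\mathcal{T}_{rstr}), \qquad (4.33)$$
$$\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(k).s), \mathrm{Sel}(\mathcal{K}_{in}, i_{call}(l).r) \in S(\mathcal{T}_{in}), \qquad (4.34)$$
$$(\mathcal{T}_{in}[\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(k).s)], \mathcal{T}_{in}[\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(l).r)]) \in \mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(P_{\mathcal{K}_{in}}), \qquad (4.35)$$

such that if $\mathrm{Sel}(\mathcal{K}_{rstr}, k.s) = \langle k, f\rangle.S$, then $\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(k).s) = \langle i_{call}(k), f\rangle.S$,
and similarly if $\mathrm{Sel}(\mathcal{K}_{rstr}, l.r) = \langle l, g\rangle.R$, then $\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(l).r) = \langle i_{call}(l), g\rangle.R$.

From the definition of $\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}$ and $(i_{call}(k).s, i_{call}(l).r) \in P_{\mathcal{K}_{in}}$, we derive (4.35).
Because $\mathcal{K}_{in} \in \mathrm{TGEnvConc}(\mathcal{T}^e_{in})$, we get (4.34) from $i_{call}(k).s, i_{call}(l).r \in O(\mathcal{K}_{in})$
and Property 4.1.13.
From $\mathcal{K}_{rstr} = \mathrm{Proj}(\mathcal{K}_{in}, i_{call})$, $k, l \in \{1, \ldots, m\}$, $i_{call}(k).s, i_{call}(l).r \in O(\mathcal{K}_{in})$ and
Property 4.2.26.3, it is clear that $k.s, l.r \in O(\mathcal{K}_{rstr})$, $\mathcal{K}_{rstr}[k] \equiv \mathcal{K}_{in}[i_{call}(k)]$,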

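$\mathcal{K}_{rstr}[l] \equiv \mathcal{K}_{in}[i_{call}(l)]$, $\mathcal{K}_{rstr}/k \equiv \mathcal{K}_{in}/i_{call}(k)$ and $\mathcal{K}_{rstr}/l \equiv \mathcal{K}_{in}/i_{call}(l)$. Now,
suppose that $\mathrm{Sel}(\mathcal{K}_{rstr}, k.s) = \langle k, f\rangle.S$. If $s = \epsilon$, then so is $S$, and $f$ is $V$, a functor
or a constant, depending on the value of $\mathcal{K}_{rstr}/k$. But then $\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(k)) =
\langle i_{call}(k), f\rangle$, because $\mathcal{K}_{rstr}/k \equiv \mathcal{K}_{in}/i_{call}(k)$. If $s \neq \epsilon$, then $f = \mathcal{K}_{rstr}[k] =
\mathcal{K}_{in}[i_{call}(k)]$, and $S = \mathrm{Sel}(\mathcal{K}_{rstr}/k, s) = \mathrm{Sel}(\mathcal{K}_{in}/i_{call}(k), s)$, by the definition of
Sel. Hence $\mathrm{Sel}(\mathcal{K}_{in}, i_{call}(k).s) = \langle i_{call}(k), f\rangle.S$.
From the assumptions of the lemma and Theorem 2.3.12, we have $\mathcal{K}_{rstr} \in
\mathrm{TGEnvConc}(\mathcal{T}^e_{rstr})$. So, from Property 4.1.13 and $k.s, l.r \in O(\mathcal{K}_{rstr})$, we derive (4.33).
□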

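Proof. [Theorem 4.2.31 (Safety of AbstrRestrict)]
We must prove that

$$(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}})) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_{rstr}, (\mathrm{AShr}^o_{\mathcal{T}_{rstr}}, \mathrm{AShr}^n_{\mathcal{T}_{rstr}}))),$$

that is,

Part 1. $\mathcal{K}_{rstr} \in \mathrm{TGEnvConc}(\mathcal{T}^e_{rstr})$,

Part 2. $\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}}$ are pre-sharing components for $\mathcal{K}_{rstr}$ and
$\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{rstr}} \cup \mathrm{CShr}^n_{\mathcal{K}_{rstr}})$ is a concrete sharing component
for $\mathcal{K}_{rstr}$,

Part 3. $\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{CShr}^o_{\mathcal{K}_{rstr}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{rstr}, \mathrm{AShr}^o_{\mathcal{T}_{rstr}})$ and
$\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{CShr}^n_{\mathcal{K}_{rstr}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{rstr}, \mathrm{AShr}^n_{\mathcal{T}_{rstr}})$.

Part 1 is a restatement of Theorem 2.3.12. Part 2 follows from the fact that
Restrict is well defined (Lemma 4.2.28). Expanding the definitions of Restrict
and AbstrRestrict, we can rewrite Part 3 as

$$\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{CShr}^o_{\mathcal{K}_{rstr}}) \subseteq \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}, \mathcal{T}_{rstr}, i_{call}), \qquad (4.36)$$

$$\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\emptyset) \subseteq \mathrm{TGShift}(\mathcal{T}_{rstr}, \emptyset). \qquad (4.37)$$

Obviously, (4.37) is satisfied. To prove (4.36) we observe that

$\mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{CShr}^o_{\mathcal{K}_{rstr}})$
$= \mathrm{AbsPair}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathrm{Proj}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}), i_{call}))$,
by the definition of Restrict,
$\subseteq \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}})),
\mathcal{T}_{rstr}, i_{call})$,
by Lemma 4.2.32 (the safety of AbstrProj),
$\subseteq \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{AlternatingClosure}(\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CShr}^o_{\mathcal{K}_{in}}),
\mathrm{AbsPair}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CShr}^n_{\mathcal{K}_{in}})), \mathcal{T}_{rstr}, i_{call})$,
by Lemma 4.1.21 (the safety of AlternatingClosure),
and the monotonicity of AbstrProj,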

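$\subseteq \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{AlternatingClosure}(\mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^o_{\mathcal{T}_{in}}),
\mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^n_{\mathcal{T}_{in}})), \mathcal{T}_{rstr}, i_{call})$,
by Condition 4.2.31.3c,
and the monotonicity of AbstrProj and AlternatingClosure,
$= \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}, \mathcal{T}_{rstr}, i_{call})$,
by the definition of $\mathrm{ASharing}_{\mathcal{T}_{in}}$.

Summarizing, we get (4.36), as desired. □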

For the procedure-exit operation (Section 4.2.3), a special version of the restric-
tion operation is needed that restricts the abstract sharing environment at the
last program point of some clause to the variables of the head. It differs from
the version needed on procedure entry, in that the new sharing edges created in
the current environment, are not joined with the old sharing edges to become
the set of old edges for a lower level of the proof tree (resp. AND-OR-graph
in the abstract case). Procedure exit is needed to move one level upwards in-
stead. The new edges of the current environment will also be considered as new
edges at that higher level. We omit the proofs of the well-definedness of the
concrete operation and the safety of the abstract operation as they are rather
straightforward.

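Definition 4.2.33 For a concrete sharing environment $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$
such that $\mathrm{arity}(\mathcal{K}_{in}[\epsilon]) = n$, $\mathit{dom}_{head} \subseteq \{1, \ldots, n\}$, and $i_{head} : \{1, \ldots, m\} \to
\{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{head}(\{1, \ldots, m\}) = \mathit{dom}_{head}$,

$\mathrm{Restrict2}((\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})), i_{head}) = (\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}}))$

where
$\mathcal{K}_{rstr} = \mathrm{Proj}(\mathcal{K}_{in}, i_{head})$,
$\mathrm{CShr}^o_{\mathcal{K}_{rstr}} = \mathrm{Proj}(\mathrm{CShr}^o_{\mathcal{K}_{in}}, i_{head})$,
$\mathrm{CShr}^n_{\mathcal{K}_{rstr}} = \mathrm{Proj}(\mathrm{CShr}^n_{\mathcal{K}_{in}}, i_{head})$.

Definition 4.2.34 For an abstract sharing environment $(\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}}))$
such that $\mathrm{arity}(\mathcal{T}_{in}[\epsilon]) = n$, $\mathit{dom}_{head} \subseteq \{1, \ldots, n\}$, and $i_{head} : \{1, \ldots, m\} \to
\{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{head}(\{1, \ldots, m\}) = \mathit{dom}_{head}$,

$\mathrm{AbstrRestrict2}((\mathcal{T}^e_{in}, (\mathrm{AShr}^o_{\mathcal{T}_{in}}, \mathrm{AShr}^n_{\mathcal{T}_{in}})), i_{head}) = (\mathcal{T}^e_{rstr}, (\mathrm{AShr}^o_{\mathcal{T}_{rstr}}, \mathrm{AShr}^n_{\mathcal{T}_{rstr}}))$

where

$\mathcal{T}^e_{rstr} = \mathrm{TGRestrict}(\mathcal{T}^e_{in}, i_{head})$,
$\mathrm{TGShift}(\mathcal{T}_{rstr}, \mathrm{AShr}^o_{\mathcal{T}_{rstr}}) = \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^o_{\mathcal{T}_{in}}), \mathcal{T}_{rstr}, i_{head})$,
$\mathrm{TGShift}(\mathcal{T}_{rstr}, \mathrm{AShr}^n_{\mathcal{T}_{rstr}}) = \mathrm{AbstrProj}(\mathcal{T}_{in}, \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{AShr}^n_{\mathcal{T}_{in}}), \mathcal{T}_{rstr}, i_{head})$.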

4.2.3 Procedure Exit


The procedure-exit operation is applied as soon as the abstract interpretation
of the different clauses matching a call is completed. In a first step, the abstract

sharing environment at the last program point of each applicable clause is re-
stricted (see Section 4.2.2) to the arguments of the head of the clause. Then
these restricted abstract sharing environments (all having the same arity) are
combined by the upperbound operation (i.e. an over approximation is computed,
see Section 4.1.4). In a second step, the extension operation regains the infor-
mation about the calling environment that was lost by the restriction operation
on procedure entry (it computes the effect of the call on the variables of the
environment that are not passed as arguments to the call). In this section, we
present only the extension operation that is needed in the second step of the
procedure-exit operation. Proving the safety of the first step is straightforward,
as was discussed in Section 2.3.3.
Again, we start with the instrumentation of the concrete extension operation
for the domain of concrete sharing environments. We need the following auxiliary
function.

Definition 4.2.35 For a concrete term $\mathcal{K}$ such that $\mathrm{arity}(\mathcal{K}[\epsilon]) = m$ & $m \le n$,
an injection $i : \{1, \ldots, m\} \to \{1, \ldots, n\}$ and a set $P_\mathcal{K}$ of unordered pairs $(r, s)$
such that $r, s \in O(\mathcal{K})$, we define

$\mathrm{RevProj}(P_\mathcal{K}, i) = \{\, (i(k).r, i(l).s) \mid k, l \in \{1, \ldots, m\} \;\&\; (k.r, l.s) \in P_\mathcal{K} \,\}$.
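As a small illustration, RevProj can be sketched in Python as follows, assuming (for this illustration only) that paths are integer tuples, unordered pairs are two-element `frozenset`s, and the injection is a dictionary from k to i(k). The function name is hypothetical.

```python
def rev_proj_pairs(pairs, inj):
    """RevProj(P, i): rename the leading index k of every path to i(k)."""
    return {frozenset((inj[p[0]],) + p[1:] for p in pair) for pair in pairs}
```

Because the injection is one-to-one, two distinct paths stay distinct after renaming, so every renamed pair still has two elements.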


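Definition 4.2.36 For concrete sharing environments $(\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}}))$
and $(\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}}))$, such that $\mathrm{arity}(\mathcal{K}_{in}[\epsilon]) = n$ & $\mathrm{arity}(\mathcal{K}_{rstr}[\epsilon]) =
m$ & $m \le n$, and an injection $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$,

$\mathrm{Extend}((\mathcal{K}_{in}, (\mathrm{CShr}^o_{\mathcal{K}_{in}}, \mathrm{CShr}^n_{\mathcal{K}_{in}})), (\mathcal{K}_{rstr}, (\mathrm{CShr}^o_{\mathcal{K}_{rstr}}, \mathrm{CShr}^n_{\mathcal{K}_{rstr}})), i_{call}) =$

if $(\exists \sigma$ over $\mathit{Vars}(\mathrm{Proj}(\mathcal{K}_{in}, i_{call})) : \mathrm{Proj}(\mathcal{K}_{in}, i_{call})\sigma \equiv \mathcal{K}_{rstr})$ &
$\mathit{Vars}(\mathcal{K}_{rstr}) \cap \mathit{Vars}(\mathcal{K}_{in}) \subseteq \mathit{Vars}(\mathrm{Proj}(\mathcal{K}_{in}, i_{call}))$ & $\mathrm{CShr}^o_{\mathcal{K}_{rstr}} =
\mathrm{TermShift}(\mathcal{K}_{rstr}, \mathrm{Proj}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}), i_{call}))$
then $(\mathcal{K}_{out}, (\mathrm{CShr}^o_{\mathcal{K}_{out}}, \mathrm{CShr}^n_{\mathcal{K}_{out}}))$,
else fail

where
$\mathcal{K}_{out} = \mathcal{K}_{in}\sigma$,
for the substitution $\sigma$ over $\mathit{Vars}(\mathrm{Proj}(\mathcal{K}_{in}, i_{call}))$ such that $\mathrm{Proj}(\mathcal{K}_{in}, i_{call})\sigma \equiv
\mathcal{K}_{rstr}$, and

$\mathrm{CShr}^o_{\mathcal{K}_{out}} = \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^o_{\mathcal{K}_{in}})$,
$\mathit{AccNew} = \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^n_{\mathcal{K}_{in}})$,
$\mathit{New} = \mathrm{RevProj}(\mathrm{CShr}^n_{\mathcal{K}_{rstr}}, i_{call})$,
$\mathrm{CShr}^n_{\mathcal{K}_{out}} = \mathrm{TransitiveClosure}(\mathit{AccNew} \cup \mathit{New})$.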
Note that for a given pair $\mathcal{K}_{in}$, $\mathcal{K}_{rstr}$, there is at most one substitution $\sigma$ over
$\mathit{Vars}(\mathrm{Proj}(\mathcal{K}_{in}, i_{call}))$ such that $\mathrm{Proj}(\mathcal{K}_{in}, i_{call})\sigma \equiv \mathcal{K}_{rstr}$. The conditions on the
relationship between $\mathcal{K}_{in}$ and $\mathcal{K}_{rstr}$ reflect properties of the computed answer
substitution of an SLD-derivation [3, 50]. The domain and range of a substitution,
accumulated in a subderivation, consist of the variables occurring either
in the subgoal that is executed, or in the renamed clauses used during the subderivation,
but no other variables. The condition on the relationship between
the sharing components of $\mathcal{K}_{in}$ and $\mathcal{K}_{rstr}$ further characterizes the implementations
for which our analysis is intended, namely, we address implementations in
which sharing, once introduced between term representations, persists until final
success or failure of the computation.

Figure 4.9: Extend: Construction of the sharing component.
Notes:

• The specification of the Extend operation does not use the value of $\mathrm{CShr}^o_{\mathcal{K}_{rstr}}$
to construct the output sharing environment (see Figure 4.9). However,
the properties of $\mathrm{CShr}^o_{\mathcal{K}_{rstr}}$ are crucial to the proof of Lemma 4.2.37.

• The extend operation differs from the unification operation only in the
definition of New.
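The construction of the output sharing components of Extend can be sketched in Python. This is only an illustration under stated assumptions: a compound term is a tuple `(functor, arg1, ..., argn)`, variables and constants are strings, paths are integer tuples, unordered pairs are two-element `frozenset`s, and TransitiveClosure is taken to be the closure under "(r, s) and (s, t) with r ≠ t imply (r, t)". The sketch takes $\mathcal{K}_{out}$ as given and does not check the applicability conditions of the definition. All function names are hypothetical.

```python
from itertools import combinations


def occurrences(term, prefix=()):
    """O(t): all paths of a term, including the empty path for the root."""
    paths = {prefix}
    if isinstance(term, tuple):
        for i, arg in enumerate(term[1:], start=1):
            paths |= occurrences(arg, prefix + (i,))
    return paths


def term_shift(term, pairs):
    """TermShift(K, P): all (r.t, s.t) with (r, s) in P and r.t, s.t in O(K)."""
    occ = occurrences(term)
    shifted = set()
    for pair in pairs:
        r, s = tuple(pair)
        for path in occ:
            if path[:len(r)] == r and s + path[len(r):] in occ:
                shifted.add(frozenset({path, s + path[len(r):]}))
    return shifted


def rev_proj_pairs(pairs, inj):
    """RevProj(P, i): rename the leading index k of every path to i(k)."""
    return {frozenset((inj[p[0]],) + p[1:] for p in pair) for pair in pairs}


def transitive_closure(pairs):
    """All pairs of distinct paths that are connected through the given pairs."""
    groups = []
    for pair in pairs:
        merged = set(pair)
        rest = []
        for g in groups:
            if g & merged:
                merged |= g
            else:
                rest.append(g)
        groups = rest + [merged]
    return {frozenset(c) for g in groups for c in combinations(sorted(g), 2)}


def extend_sharing(k_out, old_in, new_in, new_rstr, inj):
    """Return the old and new sharing components of the output environment."""
    old_out = term_shift(k_out, old_in)        # old sharing, shifted to K_out
    acc_new = term_shift(k_out, new_in)        # AccNew
    new = rev_proj_pairs(new_rstr, inj)        # New
    return old_out, transitive_closure(acc_new | new)
```

In the test case used to check this sketch, the call is made on components 1 and 3 of an environment whose components 1 and 2 already share. The sharing that the call introduces between 1.1 and 3.1 then reaches 2.1 once both output components are closed together.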

The following lemma states that the sharing derived by the extension operation
comprises both the initial sharing and the sharing that resulted from the call.

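Lemma 4.2.37 Let $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}})$ and $(\mathcal{K}_{rstr}, \mathrm{CSharing}_{\mathcal{K}_{rstr}})$ be concrete
sharing environments such that $\mathrm{arity}(\mathcal{K}_{in}[\epsilon]) = n$ & $\mathrm{arity}(\mathcal{K}_{rstr}[\epsilon]) = m$ & $m \le n$,
and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection. Let $(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}) =
\mathrm{Extend}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}), (\mathcal{K}_{rstr}, \mathrm{CSharing}_{\mathcal{K}_{rstr}}), i_{call})$ (Extend does not fail).
Then

1. $\mathrm{CSharing}_{\mathcal{K}_{in}} \subseteq \mathrm{CSharing}_{\mathcal{K}_{out}}$.
2. $\mathrm{RevProj}(\mathrm{CSharing}_{\mathcal{K}_{rstr}}, i_{call}) \subseteq \mathrm{CSharing}_{\mathcal{K}_{out}}$.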

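Proof. We first prove 1. From the properties of TermShift and the definition of
Extend, we have

$$\mathrm{CShr}^o_{\mathcal{K}_{in}} \subseteq \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^o_{\mathcal{K}_{in}}) = \mathrm{CShr}^o_{\mathcal{K}_{out}}, \qquad (4.38)$$
$$\mathrm{CShr}^n_{\mathcal{K}_{in}} \subseteq \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^n_{\mathcal{K}_{in}}) = \mathit{AccNew}. \qquad (4.39)$$

From (4.39), the properties of TransitiveClosure and the definition of Extend, we
have

$$\mathrm{CShr}^n_{\mathcal{K}_{in}} \subseteq \mathrm{TransitiveClosure}(\mathit{AccNew} \cup \mathit{New}) = \mathrm{CShr}^n_{\mathcal{K}_{out}}. \qquad (4.40)$$

From (4.38), (4.40) and the monotonicity of TransitiveClosure, it follows that

$$\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}) \subseteq \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathrm{CShr}^n_{\mathcal{K}_{out}})$$

or shortly,
$$\mathrm{CSharing}_{\mathcal{K}_{in}} \subseteq \mathrm{CSharing}_{\mathcal{K}_{out}}.$$
To prove 2., consider $(r, s) \in \mathrm{RevProj}(\mathrm{CSharing}_{\mathcal{K}_{rstr}}, i_{call})$. From the definition
of RevProj, we know that $(r, s) = (i_{call}(k).r', i_{call}(l).s')$ where $1 \le k, l \le m$ &
$(k.r', l.s') \in \mathrm{CSharing}_{\mathcal{K}_{rstr}}$. Let $(k_1.r_1, l_1.s_1), \ldots, (k_p.r_p, l_p.s_p)$ be the finite sequence
over $\mathrm{CShr}^o_{\mathcal{K}_{rstr}} \cup \mathrm{CShr}^n_{\mathcal{K}_{rstr}}$ that leads to the inclusion of $(k.r', l.s')$ in
$\mathrm{CSharing}_{\mathcal{K}_{rstr}} = \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{rstr}} \cup \mathrm{CShr}^n_{\mathcal{K}_{rstr}})$ according to the definition
of TransitiveClosure. We have

$$(\forall i : 1 \le i \le p \Rightarrow k_i, l_i \in \{1, \ldots, m\}) \;\&\; (\forall i \in \mathbb{N} : 1 \le i < p \Rightarrow l_i = k_{i+1} \;\&\; s_i = r_{i+1}) \;\&\; (k.r', l.s') = (k_1.r_1, l_p.s_p). \qquad (4.41)$$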
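From the definition of Extend we know that the extension operation succeeds
only if the input sharing components satisfy the relation

$$\mathrm{CShr}^o_{\mathcal{K}_{rstr}} = \mathrm{TermShift}(\mathcal{K}_{rstr}, \mathrm{Proj}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}), i_{call})).$$

Hence it follows from the definition of TermShift that for $(k_i.r_i, l_i.s_i) \in \mathrm{CShr}^o_{\mathcal{K}_{rstr}}$,
we have

$$\exists t_i, r'_i, s'_i :\; r_i = r'_i.t_i \;\&\; s_i = s'_i.t_i \;\&\; (k_i.r'_i, l_i.s'_i) \in \mathrm{Proj}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}), i_{call}).$$

By the definition of Proj, $(i_{call}(k_i).r'_i, i_{call}(l_i).s'_i) \in \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup
\mathrm{CShr}^n_{\mathcal{K}_{in}}) = \mathrm{CSharing}_{\mathcal{K}_{in}}$. Let $(u_1, v_1), \ldots, (u_q, v_q)$ be the finite sequence over
$\mathrm{CShr}^o_{\mathcal{K}_{in}} \cup \mathrm{CShr}^n_{\mathcal{K}_{in}}$ that leads to the inclusion of $(i_{call}(k_i).r'_i, i_{call}(l_i).s'_i)$ in
$\mathrm{CSharing}_{\mathcal{K}_{in}}$ ($u_1 = i_{call}(k_i).r'_i$, $v_q = i_{call}(l_i).s'_i$). Because $\mathrm{CSharing}_{\mathcal{K}_{in}}$ is a sharing
component for $\mathcal{K}_{in}$, we have that for all $j$ satisfying $1 \le j \le q$,

$$\mathcal{K}_{in}/u_j \equiv \mathcal{K}_{in}/v_j \equiv \mathcal{K}_{in}/i_{call}(k_i).r'_i \equiv \mathcal{K}_{in}/i_{call}(l_i).s'_i. \qquad (4.42)$$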


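Using Property 4.2.26.1 and the definition of Extend we have that for $\sigma$ a substitution
over $\mathit{Vars}(\mathrm{Proj}(\mathcal{K}_{in}, i_{call}))$,
$$\mathrm{Proj}(\mathcal{K}_{out}, i_{call}) \equiv \mathrm{Proj}(\mathcal{K}_{in}\sigma, i_{call}) \equiv \mathrm{Proj}(\mathcal{K}_{in}, i_{call})\sigma \equiv \mathcal{K}_{rstr}. \qquad (4.43)$$

From (4.43) and Property 4.2.26.3, we have

$$O(\mathcal{K}_{rstr}) = O(\mathrm{Proj}(\mathcal{K}_{out}, i_{call})) = \mathrm{Proj}(O(\mathcal{K}_{out}), i_{call}).$$

From $k_i.r'_i.t_i, l_i.s'_i.t_i \in O(\mathcal{K}_{rstr})$ and the definition of Proj, it then follows that
$i_{call}(k_i).r'_i.t_i, i_{call}(l_i).s'_i.t_i \in O(\mathcal{K}_{out})$, and because of (4.42) also $u_j.t_i, v_j.t_i \in
O(\mathcal{K}_{out})$ for all $j$ satisfying $1 \le j \le q$. Using the definition of TermShift, we obtain
a finite sequence $(u_1.t_i, v_1.t_i), \ldots, (u_q.t_i, v_q.t_i)$ over $\mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^o_{\mathcal{K}_{in}})
\cup \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^n_{\mathcal{K}_{in}})$ such that $u_1.t_i = i_{call}(k_i).r_i$ & $v_q.t_i = i_{call}(l_i).s_i$. From
the definition of Extend, $\mathrm{CShr}^o_{\mathcal{K}_{out}} = \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^o_{\mathcal{K}_{in}})$ and $\mathit{AccNew} =
\mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^n_{\mathcal{K}_{in}})$. Summarizing we have

$$(k_i.r_i, l_i.s_i) \in \mathrm{CShr}^o_{\mathcal{K}_{rstr}} \Rightarrow (i_{call}(k_i).r_i, i_{call}(l_i).s_i) \in \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathit{AccNew}). \qquad (4.44)$$

From $\mathit{New} = \mathrm{RevProj}(\mathrm{CShr}^n_{\mathcal{K}_{rstr}}, i_{call})$ and the definition of RevProj, we have

$$(k_i.r_i, l_i.s_i) \in \mathrm{CShr}^n_{\mathcal{K}_{rstr}} \Rightarrow (i_{call}(k_i).r_i, i_{call}(l_i).s_i) \in \mathit{New}. \qquad (4.45)$$

Given the finite sequence $(k_1.r_1, l_1.s_1), \ldots, (k_p.r_p, l_p.s_p)$ over $\mathrm{CShr}^o_{\mathcal{K}_{rstr}} \cup \mathrm{CShr}^n_{\mathcal{K}_{rstr}}$
satisfying (4.41), it follows from (4.44) and (4.45) that $(i_{call}(k_1).r_1, i_{call}(l_1).s_1), \ldots,
(i_{call}(k_p).r_p, i_{call}(l_p).s_p)$ is a finite sequence over $\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup
\mathit{AccNew}) \cup \mathit{New}$, and

$$(r, s) = (i_{call}(k_1).r_1, i_{call}(l_p).s_p) \in \mathrm{TransitiveClosure}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathit{AccNew}) \cup \mathit{New}).$$

We observe that

$\mathrm{TransitiveClosure}(\mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathit{AccNew}) \cup \mathit{New})$
$= \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathrm{TransitiveClosure}(\mathit{AccNew} \cup \mathit{New}))$,
by the associativity of TransitiveClosure,
$= \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathrm{CShr}^n_{\mathcal{K}_{out}})$,
by the definition of $\mathrm{CShr}^n_{\mathcal{K}_{out}}$,
$= \mathrm{CSharing}_{\mathcal{K}_{out}}$,
by the definition of $\mathrm{CSharing}_{\mathcal{K}_{out}}$.

Therefore, $(r, s) = (i_{call}(k).r', i_{call}(l).s') \in \mathrm{CSharing}_{\mathcal{K}_{out}}$, as desired. □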

The proof that Definition 4.2.36 guarantees the concrete operation Extend to
be well defined on the domain of concrete sharing environments is analogous to

the proof of the well-definedness of the Unify operations. First we demonstrate
that $\mathit{New} = \mathrm{RevProj}(\mathrm{CShr}^n_{\mathcal{K}_{rstr}}, i_{call})$ is a pre-sharing component for $\mathcal{K}_{out}$. Second,
using the previous result, we prove that $\mathrm{CShr}^n_{\mathcal{K}_{out}}$, $\mathit{AccNew}$, $\mathrm{CShr}^o_{\mathcal{K}_{out}}$ and
$\mathrm{CSharing}_{\mathcal{K}_{out}} = \mathrm{TransitiveClosure}(\mathrm{CShr}^o_{\mathcal{K}_{out}} \cup \mathrm{CShr}^n_{\mathcal{K}_{out}})$ are pre-sharing components
for $\mathcal{K}_{out}$. Third, we demonstrate that $\mathrm{CSharing}_{\mathcal{K}_{out}}$ is a concrete sharing
component for $\mathcal{K}_{out}$. Using the fact that it is a pre-sharing component, we still
need to show that it satisfies Condition (4.1), concerning multiple occurrences
of a free variable in $\mathcal{K}_{out}$.

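Lemma 4.2.38 Let $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}})$ and $(\mathcal{K}_{rstr}, \mathrm{CSharing}_{\mathcal{K}_{rstr}})$ be concrete
sharing environments such that $\mathrm{arity}(\mathcal{K}_{in}[\epsilon]) = n$ & $\mathrm{arity}(\mathcal{K}_{rstr}[\epsilon]) = m$ & $m \le n$,
and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection. Let $(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}) =
\mathrm{Extend}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}), (\mathcal{K}_{rstr}, \mathrm{CSharing}_{\mathcal{K}_{rstr}}), i_{call})$ (Extend does not fail).
Then $\mathit{New} = \mathrm{RevProj}(\mathrm{CShr}^n_{\mathcal{K}_{rstr}}, i_{call})$ is a pre-sharing component for $\mathcal{K}_{out}$.

Proof. From the definition of RevProj, it is clear that New is a set of unordered
pairs of the form $(r, s)$ where $r, s \in O(\mathcal{K}_{out})$. We must prove that New satisfies
Conditions 4.1.4.1-4.

Condition 4.1.4.1 is satisfied by New: Given $(r, s) \in \mathit{New}$, we must prove
that $\mathcal{K}_{out}/r \equiv \mathcal{K}_{out}/s$. For $(r, s) \in \mathit{New} = \mathrm{RevProj}(\mathrm{CShr}^n_{\mathcal{K}_{rstr}}, i_{call})$, it
follows from the definition of RevProj that $(r, s) = (i_{call}(k).r', i_{call}(l).s')$
for $k, l \in \{1, \ldots, m\}$ & $(k.r', l.s') \in \mathrm{CShr}^n_{\mathcal{K}_{rstr}}$. Because $\mathrm{CShr}^n_{\mathcal{K}_{rstr}}$ is a
pre-sharing component for $\mathcal{K}_{rstr}$, satisfying Condition 4.1.4.1, we have

$$\mathcal{K}_{rstr}/k.r' \equiv \mathcal{K}_{rstr}/l.s'. \qquad (4.46)$$

Using Property 4.2.26.1 and the definition of Extend we have

$$\mathrm{Proj}(\mathcal{K}_{out}, i_{call}) \equiv \mathrm{Proj}(\mathcal{K}_{in}\sigma, i_{call}) \equiv \mathrm{Proj}(\mathcal{K}_{in}, i_{call})\sigma \equiv \mathcal{K}_{rstr}. \qquad (4.47)$$

Using Property 4.2.26.2 and (4.47), we have

$$\mathcal{K}_{out}/r \equiv \mathcal{K}_{out}/i_{call}(k).r' \equiv \mathrm{Proj}(\mathcal{K}_{out}, i_{call})/k.r' \equiv \mathcal{K}_{rstr}/k.r' \;\&\;$$
$$\mathcal{K}_{out}/s \equiv \mathcal{K}_{out}/i_{call}(l).s' \equiv \mathrm{Proj}(\mathcal{K}_{out}, i_{call})/l.s' \equiv \mathcal{K}_{rstr}/l.s'. \qquad (4.48)$$

Summarizing (4.46) and (4.48), we have $\mathcal{K}_{out}/r \equiv \mathcal{K}_{out}/s$, as desired.

Condition 4.1.4.2 is satisfied by New: Given $(r, s) \in \mathit{New}$, we must prove
that $r \neq s$. By the assumptions of the lemma, $\mathrm{CShr}^n_{\mathcal{K}_{rstr}}$ satisfies Condition 4.1.4.2.
The function RevProj preserves the property, thus passing it
on to New.

Condition 4.1.4.3 is satisfied by New: Given $(r, s) \in \mathit{New}$, we have to prove
that $\mathrm{TermShift}(\mathcal{K}_{out}, \{(r, s)\}) \subseteq \mathit{New}$. Expanding the definition of TermShift,
we must prove that for each $t$ such that $r.t, s.t \in O(\mathcal{K}_{out})$, $(r.t, s.t) \in \mathit{New}$.
Fix arbitrary such $r, s, t$.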

As (r, s) E New = RevProj(CShr~c..,. , i:,n), it follows from the definition of


RevProj that (r, s) = (i,,z~(k).r', i,=zt(/).s') for k, l E { 1 , . . . , m}&(k.r', l.s') E
CShr~.,, . From r.t = i=:zl(k).r'.t, s.t = i,:ll(1).s'.t E O(/co,t), Prop-
erty 4.2.26.3 and (4.43) above, we have

k.r'.t, l.s'.t E Proj(O(/Co,,t), lea/l) = O(Proj(/Con,, icat,)) = O(IC,,t,).

Because CShr~..,. satisfies Condition 4.1.4.3 and (k.r',l.s') E CShr~ .... ,


we have that (k.r'.t,l.s'.t) E CShr~c.... . So, for (r,s) E New, we have
(r.t, s.t) = (icazZ(k).r'.t, icat,(l).s'.t) E New = RevProj(CShr~..,., icau).

Condition 4.1.4.4 is satisfied by New: Given $(r, s), (s, t) \in New$ where
$r \neq t$, we must prove that $(r, t) \in New$. By the assumptions of the lemma,
CShr~:... satisfies Condition 4.1.4.4. The function RevProj preserves the
property, thus passing it on to New.

[]
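
The two set operations that the proof above relies on can be sketched in Python. This is only an illustration under stated assumptions: a path is encoded as a tuple of argument indices, an unordered pair as a sorted 2-tuple, `i_call` as a dictionary, and the set O(K) as an explicit set of paths. The names `upair`, `rev_proj` and `term_shift` are hypothetical; the behaviour follows the expansions of RevProj and TermShift quoted in the proof.

```python
def upair(r, s):
    """Normalise an unordered pair of paths so that (r, s) and (s, r) coincide."""
    return (r, s) if r <= s else (s, r)


def rev_proj(pairs, i_call):
    """Map pairs over the restricted environment back to the caller's positions.

    Each path k.r' becomes i_call(k).r' (only the first index is renamed).
    """
    return {
        upair((i_call[r[0]],) + r[1:], (i_call[s[0]],) + s[1:])
        for r, s in pairs
    }


def term_shift(occurrences, pairs):
    """Add (r.t, s.t) for every suffix t such that both paths occur in the term."""
    shifted = set()
    for r, s in pairs:
        for q in occurrences:
            if q[:len(r)] == r:          # q = r.t for some suffix t
                t = q[len(r):]
                if s + t in occurrences:
                    shifted.add(upair(r + t, s + t))
    return shifted


# Assumed example: argument 1 of the callee is argument 3 of the caller, etc.
new = rev_proj({((1,), (2, 1))}, {1: 3, 2: 1})
```

With this encoding, closure under TermShift (Condition 4.1.4.3) amounts to `term_shift(occ, new) <= new` for the occurrence set `occ` of the output term.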

L e m m a 4.2.39 Let (/cin, (CShr~c,,~, CShr~c,.~)) and (/cr,tr, (CShr~c..,., CShr~.o,.))


be concrete sharing environments such that arity(/cin[e]) = n & arity(K~,t~[e]) =
rn&ra < n, and i , , n : { 1 , . . . , m]. --* { 1 , . . . , n} an injection. Let (ICo,,t, (CShr~c.. ,,
CShr~c..,) ) = Extend((/Cin, (CShr~c,,,,CShr~c,~)) , (/c,,t,,(CShr~c..,., CShr~c..,.)),
icalz) (Extend does not fail). Then CShr~c..,, AccNew, CShr~c.~ ' and CSharlng~c..,
= XransitiveCIosure(CShr~c.. ' UCShr~..,), are pre-sharing components for/co~t.

Proof. The proof of this lemma is identical to the proof of Lemma 4.2.4, except
that Lemma 4.2.38 is used instead of Lemma 4.2.3. []

The following lemma states that the operation Extend is well-defined on the
set of concrete sharing environments.

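Lemma 4.2.40 Let $(\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}})$ and $(\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}})$ be concrete
sharing environments such that $arity(\mathcal{K}_{in}[\epsilon]) = n$ & $arity(\mathcal{K}_{rstr}[\epsilon]) = m$ & $m \le n$,
and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection. Let $(\mathcal{K}_{out}, CSharing_{\mathcal{K}_{out}}) =
Extend((\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}}), (\mathcal{K}_{rstr}, CSharing_{\mathcal{K}_{rstr}}), i_{call})$ (Extend does not fail).
Then $(\mathcal{K}_{out}, CSharing_{\mathcal{K}_{out}})$ is a concrete sharing environment.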

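Proof. By the definition of Extend, we have $\mathcal{K}_{out} \equiv \mathcal{K}_{in}\sigma$ for $\sigma$ a substitution
over $Vars(Proj(\mathcal{K}_{in}, i_{call}))$ such that $Proj(\mathcal{K}_{in}, i_{call})\sigma \equiv \mathcal{K}_{rstr}$. Because
$(\mathcal{K}_{in}, CSharing_{\mathcal{K}_{in}})$ is a concrete sharing environment, it follows from the
definition of substitution application that the term $\mathcal{K}_{out}$ has principal functor
$\langle\rangle$ and that $O(\mathcal{K}_{in}) \subseteq O(\mathcal{K}_{out})$. From Lemma 4.2.39 we know that both sharing
sets computed by Extend for $\mathcal{K}_{out}$, as well as $CSharing_{\mathcal{K}_{out}}$, the transitive closure
of their union, are pre-sharing components for $\mathcal{K}_{out}$. So, we still have to prove
that $CSharing_{\mathcal{K}_{out}}$ satisfies Condition (4.1), i.e.,

$(\mathcal{K}_{out}/r \equiv \mathcal{K}_{out}/s)$ & $(\mathcal{K}_{out}/r \in Vars)$ & $(r \neq s) \Rightarrow (r, s) \in CSharing_{\mathcal{K}_{out}}$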


Figure 4.10: $\mathcal{K}_{out} \equiv \mathcal{K}_{in}\sigma$ and $q = q'.q'' \in O(\mathcal{K}_{out})$

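Let $\mathcal{K}_{out}/r \equiv \mathcal{K}_{out}/s \equiv Y$. Since $\mathcal{K}_{out} \equiv \mathcal{K}_{in}\sigma$ for a substitution $\sigma$ over
$Vars(Proj(\mathcal{K}_{in}, i_{call}))$ such that $Proj(\mathcal{K}_{in}, i_{call})\sigma \equiv \mathcal{K}_{rstr}$, we know that each
$q \in O(\mathcal{K}_{out})$ such that $\mathcal{K}_{out}/q \equiv Y$ must satisfy one of the following.

$(q \in O(\mathcal{K}_{in})$ & $\mathcal{K}_{in}/q \equiv \mathcal{K}_{out}/q \equiv Y)$ (4.49)

or

$\exists q', q'' : (q = q'.q''$ & $q' \in O(\mathcal{K}_{in})$ & $\mathcal{K}_{in}/q' \equiv X_q \in Vars(Proj(\mathcal{K}_{in}, i_{call}))$
    & $X_q\sigma \equiv t_q$ & $q'' \in O(t_q)$ & $\mathcal{K}_{out}/q \equiv t_q/q'' \equiv Y)$ (4.50)

Note that

$X_q \in Vars(Proj(\mathcal{K}_{in}, i_{call}))$
$\Updownarrow$
$\exists k_q.q^* \in O(Proj(\mathcal{K}_{in}, i_{call})) : k_q \in \{1, \ldots, m\}$ & $\mathcal{K}_{in}/i_{call}(k_q).q^* \equiv X_q.$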

Because $CSharing_{\mathcal{K}_{in}}$, the transitive closure of the union of its two pre-sharing
components, satisfies Condition (4.1), we have in case (4.50) that either
$q' = i_{call}(k_q).q^*$ or $(q', i_{call}(k_q).q^*) \in CSharing_{\mathcal{K}_{in}}$ (see Figure 4.10). So,
using Condition 4.1.4.3 proved above, and $CSharing_{\mathcal{K}_{in}} \subseteq CSharing_{\mathcal{K}_{out}}$ (see
Lemma 4.2.37), we have either $q = i_{call}(k_q).q^*.q''$ or $(q, i_{call}(k_q).q^*.q'') \in
CSharing_{\mathcal{K}_{out}}$. So, a more convenient consequence of (4.50) is

$\exists k_q, q^*, q'' : ((q = i_{call}(k_q).q^*.q'' \lor (q, i_{call}(k_q).q^*.q'') \in CSharing_{\mathcal{K}_{out}})$
    & $\mathcal{K}_{rstr}/k_q.q^*.q'' \equiv Y)$ (4.51)

We shall now show that $(r, s) \in CSharing_{\mathcal{K}_{out}}$ by a case analysis based on
which of (4.49) or (4.51) holds for r and s. Suppose r and s both satisfy (4.49),
i.e., $r, s \in O(\mathcal{K}_{in})$ and $\mathcal{K}_{in}/r \equiv Y \equiv \mathcal{K}_{in}/s$. Because $CSharing_{\mathcal{K}_{in}}$ satisfies
Condition (4.1) and $r \neq s$, we have $(r, s) \in CSharing_{\mathcal{K}_{in}} \subseteq CSharing_{\mathcal{K}_{out}}$, as
desired.

Suppose r and s both satisfy (4.51), i.e., $\exists k_r, r^*, r'', k_s, s^*, s''$ :
    $(r = i_{call}(k_r).r^*.r'' \lor (r, i_{call}(k_r).r^*.r'') \in CSharing_{\mathcal{K}_{out}})$ &
    $(s = i_{call}(k_s).s^*.s'' \lor (s, i_{call}(k_s).s^*.s'') \in CSharing_{\mathcal{K}_{out}})$ &
    $(\mathcal{K}_{rstr}/k_r.r^*.r'' \equiv Y \equiv \mathcal{K}_{rstr}/k_s.s^*.s'')$.
Because $CSharing_{\mathcal{K}_{rstr}}$, the transitive closure of the union of its two pre-sharing
components, satisfies Condition (4.1), we have
    $k_r.r^*.r'' = k_s.s^*.s'' \lor (k_r.r^*.r'', k_s.s^*.s'') \in CSharing_{\mathcal{K}_{rstr}}$.
It follows from $RevProj(CSharing_{\mathcal{K}_{rstr}}, i_{call}) \subseteq CSharing_{\mathcal{K}_{out}}$ (Lemma 4.2.37),
$r \neq s$, and Condition 4.1.4.4, proved above, that $(r, s) \in CSharing_{\mathcal{K}_{out}}$.
Otherwise, one of r and s satisfies (4.49) (assume it is r, without loss of
generality) and the other (thus s) satisfies (4.51). From r satisfying (4.49) we know
$\mathcal{K}_{in}/r \equiv Y \in Vars(\mathcal{K}_{in})$, and from s satisfying (4.51) we know $\mathcal{K}_{rstr}/k_s.s^*.s'' \equiv
Y \in Vars(\mathcal{K}_{rstr})$. Because $Vars(\mathcal{K}_{rstr}) \cap Vars(\mathcal{K}_{in}) \subseteq Vars(Proj(\mathcal{K}_{in}, i_{call}))$, we have
$\exists k_r, r' : k_r \in \{1, \ldots, m\}$ & $\mathcal{K}_{in}/i_{call}(k_r).r' \equiv \mathcal{K}_{rstr}/k_r.r' \equiv Y$. From $CSharing_{\mathcal{K}_{in}}$
and $CSharing_{\mathcal{K}_{rstr}}$ satisfying Condition (4.1) and s satisfying (4.51), we know
$\exists k_r, r', k_s, s^*, s''$ :
    $(r = i_{call}(k_r).r' \lor (r, i_{call}(k_r).r') \in CSharing_{\mathcal{K}_{in}})$ &
    $(k_r.r' = k_s.s^*.s'' \lor (k_r.r', k_s.s^*.s'') \in CSharing_{\mathcal{K}_{rstr}})$ &
    $(s = i_{call}(k_s).s^*.s'' \lor (s, i_{call}(k_s).s^*.s'') \in CSharing_{\mathcal{K}_{out}})$.
It follows from $r \neq s$, $CSharing_{\mathcal{K}_{in}} \subseteq CSharing_{\mathcal{K}_{out}}$, $RevProj(CSharing_{\mathcal{K}_{rstr}}, i_{call}) \subseteq
CSharing_{\mathcal{K}_{out}}$ (Lemma 4.2.37), and Condition 4.1.4.4, proved above, that $(r, s) \in
CSharing_{\mathcal{K}_{out}}$. []
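
Condition (4.1), which the proof above establishes for the output environment, can be made concrete with a small Python sketch. The encoding is an assumption made only for illustration: a compound term is a tuple whose first element is the functor, a variable is a string starting with an upper-case letter, and a path is a tuple of argument indices. The function names are hypothetical.

```python
def occurrences(term, path=()):
    """Yield (path, subterm) for every occurrence in the term, root first."""
    yield path, term
    if isinstance(term, tuple):
        for k, arg in enumerate(term[1:], start=1):
            yield from occurrences(arg, path + (k,))


def is_var(term):
    """Assumed convention: variables are strings starting with an upper-case letter."""
    return isinstance(term, str) and term[:1].isupper()


def required_sharing(term):
    """Pairs of distinct paths r != s that lead to the same variable."""
    positions = {}
    for path, sub in occurrences(term):
        if is_var(sub):
            positions.setdefault(sub, []).append(path)
    return {
        (r, s) if r <= s else (s, r)
        for paths in positions.values()
        for i, r in enumerate(paths)
        for s in paths[i + 1:]
    }


def satisfies_condition_4_1(term, sharing):
    """Check that every pair demanded by Condition (4.1) is in the sharing set."""
    normalised = {(r, s) if r <= s else (s, r) for r, s in sharing}
    return required_sharing(term) <= normalised


env = ('<>', 'X', ('f', 'X', 'a'), 'Y')   # X occurs at paths 1 and 2.1
```

Here `required_sharing(env)` contains the single pair of paths `(1,)` and `(2, 1)`, so any sharing component for `env` must include that pair.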

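Definition 4.2.41 For abstract term environments $\mathcal{T}^e$ and $\mathcal{T}_{rstr}^e$, such that
$arity(\mathcal{T}[\epsilon]) = n$ & $arity(\mathcal{T}_{rstr}[\epsilon]) = m$ & $m \le n$, an injection $i : \{1, \ldots, m\} \to
\{1, \ldots, n\}$, and $P_{\mathcal{T}_{rstr}}$ a set of unordered pairs $(p, q)$ such that $p, q \in Nodes_{\mathcal{T}_{rstr}}$,
we define

$AbstrRevProj(\mathcal{T}_{rstr}, P_{\mathcal{T}_{rstr}}, \mathcal{T}, i) =$
    $\{\, (\mathcal{T}[\langle i(k), f\rangle.S], \mathcal{T}[\langle i(l), g\rangle.R]) \mid k, l \in \{1, \ldots, m\}$ &
        $\langle i(k), f\rangle.S, \langle i(l), g\rangle.R \in S(\mathcal{T})$ &
        $\langle k, f\rangle.S, \langle l, g\rangle.R \in S(\mathcal{T}_{rstr})$ &
        $(\mathcal{T}_{rstr}[\langle k, f\rangle.S], \mathcal{T}_{rstr}[\langle l, g\rangle.R]) \in P_{\mathcal{T}_{rstr}} \,\}$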
D e f i n i t i o n 4.2.42 For abstract sharing environments (7-i,~',(AShr~r~,,,AShr~- ))
and ( .,~ (AShr~-.,., AShr~..,.)), such that a~/t~(Z,~[~]) = ,~ ~ arit~(7;,,.[~]) =
m ~ r . < n, and an injection i . . : { l , . . . , m } --, {1,...,n},

AbstrExtend((~,, ~,(Ashr~- , A S h r ~ - ) ) , ( ~ , t , ' , (AShr~-, AShr~-)),i~,,)


if TGExtend(Ti,~~,T~,t~~,icon) is fail
then f a i l
else (]-o,,,e,(AShr~-:,.,,AShr~.,,)),

where
T.o,~t" = TGExtend (~,,~, ~,t~ ', i~.u),
and
AShr~ro., = Convert(~,~,AShr~-,To~t),
C = TGShift(2-o~t,Convert(~,~, AShr~-,]-o,,t)),
B = AbstrRevProj(Trstr, TGShift(Tr,t~, AShr~..,.), To,,t, icau),
TGShift(To,,t, AShr~-o.,) = AlternatingClosure(C, B).
Theorem 4.2.44, below, gives the correctness condition for the AbstrExtend op-
eration. It assumes the following condition.

Condition 4.2.43
1. ( i,~, <AShr~-~,AShr~- )) is an abstract sharing en~ironmen/.
~. (Jq., <CSh4c,~, CShr~,.>) i, a concrete sharing environment
3. </Ci,~,(CShr~c,.., CShr~c..)) E InstrEnvConc((Ti.', <AShr~-, AShr~,,.)))
4. (T.str', (AShr~-.t., AShr~..,.)) is an abstract sharing environment
s. (to..., (CSh4c. . . . , CSh,~ .... )) is a eo,~erete sha~i,~g environment

6. ()C,st,, (CShr~c.... , CShr~c.... )) E InstrEnvConc((T~,t~', (AShr~r..,.,


AShr~- )))
7. $arity(\mathcal{T}_{in}[\epsilon]) = n$ & $arity(\mathcal{T}_{rstr}[\epsilon]) = m$ & $m \le n$

8. $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is an injection
O. (ICo,t, (CShr~:o.,, CShr~o.,)) ----Extend(QCi,~,(CShr~;., CShr~,.)), (/C,.,tr ,
(CSh,~.... , CSh,~..,.)), i~o.), is ~ot Mt
10. (To,~te, (AShr~ro.,, AShr~-o,.)) = AbstrExtend((Ti. t, (AShr~-, AShr~.~)),
(7:,,,tr ' , (AShr~r,., AShr.~.,.)), ic~u)
Theorem 4.2.44 (Safety of AbstrExtend) Assuming Condition 4.2.43, it follows
that

(/Co~t, (CShr~co.,, CShr~co.,)) E InstrEnvConc((7o,,t', (AShr~-o.,, AShr~-o.,))),


that is,

P a r t 1. /Co,,t E TGEnvConc(To,,t'),
P a r t 2. CShr~co~,,CShr~o~, are pre-sharing components for Ko,,t and
TransitiveCIosure(CShr~co. ' U CShr~c~ is a concrete sharing component for
~ O~g s

Part 3. Ab~Pai%.dCo./CShr~:o.,) C_TGShift(=ro~.AShr~o.,) and



Part 1. is a restatement of Theorem 2.3.14. Part 2. follows from the fact that
Extend is well-defined (Lemma 4.2.40). The proof of Part 3. will follow at the
end of the section, after we have related the auxiliary function AbstrRevProj used
in the definition of AbstrExtend to its counterpart RevProj in the definition of
Extend.

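Lemma 4.2.45 (Safety of AbstrRevProj) Let $\mathcal{T}^e$ and $\mathcal{T}_{rstr}^e$ be abstract term
environments, such that $arity(\mathcal{T}[\epsilon]) = n$ & $arity(\mathcal{T}_{rstr}[\epsilon]) = m$ & $m \le n$,
and $\mathcal{K} \in TGEnvConc(\mathcal{T}^e)$ & $\mathcal{K}_{rstr} \in TGEnvConc(\mathcal{T}_{rstr}^e)$. Let $i_{call} : \{1, \ldots, m\} \to
\{1, \ldots, n\}$ be an injection such that $Proj(\mathcal{K}, i_{call}) \equiv \mathcal{K}_{rstr}$. Let $P_{\mathcal{K}_{rstr}}$ be a set of
unordered pairs of the form $(r, s)$ where $r, s \in O(\mathcal{K}_{rstr})$. Then

$AbsPair_{\mathcal{T},\mathcal{K}}(RevProj(P_{\mathcal{K}_{rstr}}, i_{call}))$
    $\subseteq AbstrRevProj(\mathcal{T}_{rstr}, AbsPair_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(P_{\mathcal{K}_{rstr}}), \mathcal{T}, i_{call}).$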

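Proof. After expanding the definitions of $AbsPair_{\mathcal{T},\mathcal{K}}$ and RevProj, we have

$AbsPair_{\mathcal{T},\mathcal{K}}(RevProj(P_{\mathcal{K}_{rstr}}, i_{call})) =$
    $\{\, (\mathcal{T}[Sel(\mathcal{K}, i_{call}(k).s)], \mathcal{T}[Sel(\mathcal{K}, i_{call}(l).r)]) \mid$
        $k, l \in \{1, \ldots, m\}$ & $(k.s, l.r) \in P_{\mathcal{K}_{rstr}} \,\}.$

For arbitrary k, s, l, r such that $k, l \in \{1, \ldots, m\}$ and $(k.s, l.r) \in P_{\mathcal{K}_{rstr}}$, we must
prove that

$(\mathcal{T}[Sel(\mathcal{K}, i_{call}(k).s)], \mathcal{T}[Sel(\mathcal{K}, i_{call}(l).r)])$
    $\in AbstrRevProj(\mathcal{T}_{rstr}, AbsPair_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(P_{\mathcal{K}_{rstr}}), \mathcal{T}, i_{call}).$

By the definition of AbstrRevProj, it is sufficient to prove that

$Sel(\mathcal{K}, i_{call}(k).s), Sel(\mathcal{K}, i_{call}(l).r) \in S(\mathcal{T}),$ (4.52)

$Sel(\mathcal{K}_{rstr}, k.s), Sel(\mathcal{K}_{rstr}, l.r) \in S(\mathcal{T}_{rstr}),$ (4.53)

$(\mathcal{T}_{rstr}[Sel(\mathcal{K}_{rstr}, k.s)], \mathcal{T}_{rstr}[Sel(\mathcal{K}_{rstr}, l.r)]) \in AbsPair_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(P_{\mathcal{K}_{rstr}}),$ (4.54)

such that if $Sel(\mathcal{K}, i_{call}(k).s) = \langle i_{call}(k), f\rangle.S$, then $Sel(\mathcal{K}_{rstr}, k.s) = \langle k, f\rangle.S$, and
similarly if $Sel(\mathcal{K}, i_{call}(l).r) = \langle i_{call}(l), g\rangle.R$, then $Sel(\mathcal{K}_{rstr}, l.r) = \langle l, g\rangle.R$.
From the definition of $AbsPair_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}$ and $(k.s, l.r) \in P_{\mathcal{K}_{rstr}}$, we derive (4.54).
Because $\mathcal{K}_{rstr} \in TGEnvConc(\mathcal{T}_{rstr}^e)$, we get (4.53) from $k.s, l.r \in O(\mathcal{K}_{rstr})$ and
Property 4.1.13.
From $Proj(\mathcal{K}, i_{call}) \equiv \mathcal{K}_{rstr}$ and $k.s, l.r \in O(\mathcal{K}_{rstr})$, it is clear that $i_{call}(k).s,
i_{call}(l).r \in O(\mathcal{K})$, $\mathcal{K}/i_{call}(k) \equiv \mathcal{K}_{rstr}/k$ and $\mathcal{K}/i_{call}(l) \equiv \mathcal{K}_{rstr}/l$. Now, suppose
that $Sel(\mathcal{K}, i_{call}(k).s) = \langle i_{call}(k), f\rangle.S$. If $s = \epsilon$, then so is S, and f is V, a functor
or a constant, depending on the value of $\mathcal{K}/i_{call}(k)$. But then $Sel(\mathcal{K}_{rstr}, k) =
\langle k, f\rangle$, because $\mathcal{K}/i_{call}(k) \equiv \mathcal{K}_{rstr}/k$. If $s \neq \epsilon$, then $f = \mathcal{K}[i_{call}(k)] = \mathcal{K}_{rstr}[k]$
and $S = Sel(\mathcal{K}/i_{call}(k), s) = Sel(\mathcal{K}_{rstr}/k, s)$, by the definition of Sel. Hence
$Sel(\mathcal{K}_{rstr}, k.s) = \langle k, f\rangle.S$.
Because $\mathcal{K} \in TGEnvConc(\mathcal{T}^e)$, we derive (4.52) from Property 4.1.13 and $i_{call}(k).s,
i_{call}(l).r \in O(\mathcal{K})$. []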
T h e o r e m 4.2.44, P a r t 3: Assuming Condition 4.2.43, it follows that
1. AbsPairo- r
g O~tl~O~t
( C S h r ~o ~cl ) C
--
TGShift(To~,t, AShr~-.~,)

2. AbsPairTo~,t, iCo,,t(CShr'~c..,) C_ TGShift(Tout, AShr~-..,)


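Proof. Applying Conditions 4.2.43.9 and 4.2.43.10 and expanding the definitions
of Extend and AbstrExtend, we can rewrite 1. as

$AbsPair_{\mathcal{T}_{out},\mathcal{K}_{out}}(TermShift(\mathcal{K}_{out}, CShr^{o}_{\mathcal{K}_{in}}))$
    $\subseteq TGShift(\mathcal{T}_{out}, Convert(\mathcal{T}_{in}, AShr^{o}_{\mathcal{T}_{in}}, \mathcal{T}_{out})).$ (4.55)

We observe that

$AbsPair_{\mathcal{T}_{out},\mathcal{K}_{out}}(TermShift(\mathcal{K}_{out}, CShr^{o}_{\mathcal{K}_{in}}))$
    $\subseteq TGShift(\mathcal{T}_{out}, AbsPair_{\mathcal{T}_{out},\mathcal{K}_{out}}(CShr^{o}_{\mathcal{K}_{in}})),$
        by Lemma 4.2.12 (the safety of TGShift),
    $\subseteq TGShift(\mathcal{T}_{out}, Convert(\mathcal{T}_{in}, AbsPair_{\mathcal{T}_{in},\mathcal{K}_{in}}(CShr^{o}_{\mathcal{K}_{in}}), \mathcal{T}_{out})),$
        by Lemma 4.2.13 (the safety of Convert),
        and the monotonicity of TGShift,
    $\subseteq TGShift(\mathcal{T}_{out}, Convert(\mathcal{T}_{in}, TGShift(\mathcal{T}_{in}, AShr^{o}_{\mathcal{T}_{in}}), \mathcal{T}_{out})),$
        by Condition 4.2.43.3,
        and the monotonicity of TGShift and Convert,
    $= TGShift(\mathcal{T}_{out}, Convert(\mathcal{T}_{in}, AShr^{o}_{\mathcal{T}_{in}}, \mathcal{T}_{out})),$
        by the definition of Convert,
        and the idempotence of TGShift.
Summarizing, we get (4.55), as desired.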
Applying Conditions 4.2.43.9 and 4.2.43.10 and expanding the definitions of
Extend and AbstrExtend, we can rewrite 2. as

AbsPair~, /C (TransitiveClosure(AccNew U New))


ou~, out (4.56)
C AlternatingCIosure(C, B).
First we relate AccNew and C.
AbsPairTout,ICout ( AccNew )
= AbsPairTo,,t,iCout(TermShift(ICout, CShr~i~)),
by the definition of AccNew,
C TGShift(Tout, AbsPairT.
--
e- (CShr~: i n ~,
o,a, t l r ~ . o u t ~ iz

by Lemma 4.2.12 (the safety of TGShift),


C_XGShift (Touh Convert(~in, AbsPa r~,,,/C,n(CSh
r~,~. ), To,t)),

by Lemma 4.2.13 (the safety of Convert),


and the monotonicity of TGShift,
C_TGShift(To,`t, Convert(T~., TGShift(T~,~, AShr~-), To,`t)),
by Condition 4.2.43.3,
and the monotonicity of TGShift and Convert,
= TGShift(To,`t, Convert(~,,, AShr~-,.., ~o,`t)),
by the definition of Convert,
and the idempotence of TGShift,
---- C,
by the definition of C.
Summarizing, we have

AbsPairTo,`t,/Co,`t(AccNew ) C_ C. (4.57)

On the other hand,

AbsPalrTo,`t,/C o,`t(New)
= AbsPair.-r ~- (RevProj(CShr~: i~Iz)),
by the definition of New,
C AbstrRevProj(T~,,,, AbsPairT,,,,jC,,t,(CShr~..,.), To,,,, icazz),
by Lemma 4.2.45 (the safety of AbstrRevProj),
C AbstrRevProj(T,,,,, TGShift(T,0t,, AShr~-.,.), To,`,, it,u),
by Condition 4.2.43.6,
and the monotonicity of AbstrRevProj,
-- B,
by definition of B.
Summarizing, we have

AbsPairTo,`t,iCo,`t(New ) C B. (4.58)
Finally, we prove (4.56).
AbsPair.]'o~,,.iCo,,(TransitiveCIosure(AccNew U New))
C AlternatingCl~ /C (AccNew), AbsPair~ /C (New)),
by Lemma 4.1.21 (the safety of AlternatingClosure),
C_AlternatingClosure(C, B),
by (4.57), (4.58),
and the monotonicity of AlternatingCIosure.
[]

4.3 Evaluation
In this section, we show some of the results obtained by a prototype implemen-
tation based on the specifications for primitive operations given in Section 4.2.

Because most Prolog implementations do not create any sharing when a variable
gets bound to an atom or an integer, a small optimization with respect to the
formal specifications is incorporated, avoiding the introduction of such sharing
edges. Detailed tables representing the results produced by the prototype are
given in Appendix A. So here we discuss only one example in detail, after which
we comment on a possible refinement of the sharing analysis, based on a new
concept that characterizes meaningful sharing edges for a given type graph (cfr.
Section 4.1.2). Next, based on our experience with the prototype implementa-
tion, we discuss the strength of the sharing analysis (e.g. the gains that can be
expected from using the alternating closure) and we point to a few shortcomings
(e.g. the remaining causes of imprecision and the main sources of inefficiency).

4.3.1 Example: insert/3


Consider Program 4.1 (page 59) for inserting an element into a sorted binary
tree. Assume that the program is called with the abstract sharing environment
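$\beta_{in} = (\mathcal{T}_{in}^e, ASharing_{\mathcal{T}_{in}})$, where

    $\mathcal{T}_{in} ::= \langle Int, Tree, V \rangle$,
    $Tree ::= empty \mid t(Tree, Int, Tree)$,
    $ASharing_{\mathcal{T}_{in}} = (\emptyset, \emptyset)$.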

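For the analysis, we use depth bound two for the tree nodes labeled t. The
success substitution $\beta_{out} = (\mathcal{T}_{out}^e, ASharing_{\mathcal{T}_{out}})$, computed by the abstract
interpretation procedure, is given by

    $\mathcal{T}_{out} ::= \langle Int, Tree, TreeOne \rangle$,
    $TreeOne ::= t(Tree, Int, Tree)$,
    $ASharing_{\mathcal{T}_{out}} = (\emptyset, \{\, (\mathcal{T}_{out}[\langle 3,t\rangle.\langle 3,t\rangle], \mathcal{T}_{out}[\langle 2,t\rangle]),$
                          $(\mathcal{T}_{out}[\langle 3,t\rangle.\langle 1,t\rangle], \mathcal{T}_{out}[\langle 2,t\rangle]) \,\})$.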
Only the principal sharing edges are represented. The full set of sharing edges
is obtained by applying the TGShift operation. Figure 4.11 represents both $\beta_{in}$
and $\beta_{out}$. We see that the output argument _NT gets bound to a tree containing
at least one element and that after the call there is potential sharing between
the input and output trees. This corresponds exactly to what one intuitively
expects from the source code of the program. Note that the sharing relation
is not transitive and that no internal sharing is derived for the output tree.
The top tree cell is not shared either. We obtain such a precise result thanks
to using depth bound two for the nodes labeled t. If depth bound one were
used instead, the output type graph would have been folded and the top tree
cell would be indistinguishable from the other tree cells. There is no sharing
between the integer _E and the integer elements of the output tree _NT because
of the optimization mentioned above.
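
The sharing pattern described above can be observed on a small executable model. The following Python sketch is an illustration only: trees are modelled as nested tuples `(left, value, right)` with `None` for the empty tree, and the clauses for the empty tree and for the greater-than case are assumptions, since Program 4.1 is not reproduced in this section. Object identity (`is`) plays the role of structure sharing.

```python
def insert(e, tree):
    """Insert e into a sorted binary tree, reusing untouched subtrees."""
    if tree is None:                     # assumed base clause: empty tree
        return (None, e, None)
    left, value, right = tree
    if e <= value:                       # mirrors the recursive clause shown below
        return (insert(e, left), value, right)
    return (left, value, insert(e, right))   # assumed symmetric clause


old = ((None, 1, None), 5, (None, 9, None))
new = insert(3, old)
```

In this run `new[2] is old[2]` holds (the untouched right subtree is shared between input and output), whereas `new` itself and the rebuilt left subtree are fresh cells, which matches the observation that the top tree cell of the output is not shared.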
Next, let us consider one of the recursive clauses.

Figure 4.11: Abstract sharing environments $\beta_{in}$ and $\beta_{out}$ for the predicate
insert/3.

    insert(_E, _OT, _NT) :-
        _OT = t(_L, _F, _R),               % ① before, ② after this goal
        _E =< _F, _NT = t(_NL, _F, _R),    % ③
        insert(_E, _L, _NL).

In Section 4.1.2, we already discussed the binding states derived for the
program points ① and ②. Figure 4.5 (page 59) represents the restricted
sharing environments before and after the selection operation _OT = t(_L, _F,
_R). Here, we restrict ourselves to a discussion of the results obtained for the
program point ③, just before the recursive call. If we order the variables of
the clause according to the tuple $\langle$_E, _OT, _NT, _NL, _L, _F, _R$\rangle$, then the
abstract sharing environment $\beta = (\mathcal{T}^e, ASharing_{\mathcal{T}})$ computed by the abstract
interpretation procedure, is given by

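    $\mathcal{T} ::= \langle Int, TreeOne, t(V, Int, Tree), V, Tree, Int, Tree \rangle$,
    $ASharing_{\mathcal{T}} = (\emptyset, \{\, (\mathcal{T}[\langle 3,t\rangle.\langle 3,t\rangle], \mathcal{T}[\langle 2,t\rangle.\langle 3,t\rangle]),$
                      $(\mathcal{T}[\langle 3,t\rangle.\langle 3,t\rangle], \mathcal{T}[\langle 7,t\rangle]),$
                      $(\mathcal{T}[\langle 3,t\rangle.\langle 1,V\rangle], \mathcal{T}[\langle 4,V\rangle]),$
                      $(\mathcal{T}[\langle 2,t\rangle.\langle 3,t\rangle], \mathcal{T}[\langle 7,t\rangle]),$
                      $(\mathcal{T}[\langle 2,t\rangle.\langle 1,t\rangle], \mathcal{T}[\langle 5,t\rangle]) \,\})$.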
It is represented in Figure 4.12. The output argument _NT is only partially
constructed. The left subtree is a free variable that has yet to be filled in. It
will receive its value through the sharing with the program variable _NL, which
is the third argument of the recursive call. After restricting the environment
to the variables of the recursive call, we obtain an abstract substitution that is
equivalent to $\beta_{in}$.
Figure 4.12: Abstract sharing environment $\beta$ for point ③ in the predicate
insert/3.

Starting from the substitution $\beta_{in}$, all program points have an empty first
sharing component. Therefore, assume as a second example that the program is
called with the abstract sharing environment $\beta_1 = (\mathcal{T}_1^e, ASharing_{\mathcal{T}_1})$ where

    $\mathcal{T}_1 ::= \langle d(Int), DTree, V \rangle$,
    $DTree ::= empty \mid t(DTree, d(Int), DTree)$,
    $ASharing_{\mathcal{T}_1} = (\{\, (\mathcal{T}_1[\langle 1,d\rangle], \mathcal{T}_1[\langle 2,t\rangle.\langle 2,d\rangle]) \,\}, \emptyset)$.

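The substitution $\beta_1$ represents that the element to be inserted, a structured
term of type d(Int), possibly shares with (at most) one element already present
in the input tree (at most one, because there is no internal sharing in the input
tree of the second argument). The success substitution $\beta_2 = (\mathcal{T}_2^e, ASharing_{\mathcal{T}_2})$,
computed by the abstract interpretation procedure, is given by

    $\mathcal{T}_2 ::= \langle d(Int), DTree, DTreeOne \rangle$,
    $DTreeOne ::= t(DTree, d(Int), DTree)$,
    $ASharing_{\mathcal{T}_2} = (\{\, (\mathcal{T}_2[\langle 1,d\rangle], \mathcal{T}_2[\langle 2,t\rangle.\langle 2,d\rangle]) \,\},$
                   $\{\, (\mathcal{T}_2[\langle 3,t\rangle.\langle 3,t\rangle], \mathcal{T}_2[\langle 2,t\rangle]),$
                      $(\mathcal{T}_2[\langle 3,t\rangle.\langle 1,t\rangle], \mathcal{T}_2[\langle 2,t\rangle]),$
                      $(\mathcal{T}_2[\langle 3,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 2,t\rangle.\langle 2,d\rangle]),$
                      $(\mathcal{T}_2[\langle 3,t\rangle.\langle 3,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 1,d\rangle]),$
                      $(\mathcal{T}_2[\langle 3,t\rangle.\langle 1,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 1,d\rangle]),$
                      $(\mathcal{T}_2[\langle 3,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 1,d\rangle]) \,\})$.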
Remember that the full sharing relation is obtained by computing the alternating
closure of the two components of $ASharing_{\mathcal{T}_2}$. This set contains edges between all
the nodes of the subtree for the third argument that are labeled d (there are six
implicit sharing edges of which three are self-edges). So, it is correctly derived
that there may be internal sharing in the output tree for the third argument. For
example, the following edges are implicit in the representation above.

    $(\mathcal{T}_2[\langle 3,t\rangle.\langle 3,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 3,t\rangle.\langle 2,d\rangle])$
    $(\mathcal{T}_2[\langle 3,t\rangle.\langle 1,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 3,t\rangle.\langle 2,d\rangle])$
    $(\mathcal{T}_2[\langle 3,t\rangle.\langle 2,d\rangle], \mathcal{T}_2[\langle 3,t\rangle.\langle 2,d\rangle])$

Note that the latter edge is in fact an irrelevant edge, as will be defined in the
next section.
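
How the implicit edges of the second example arise can be sketched in Python. The precise definition of AlternatingClosure is given in Section 4.1 and is not repeated here, so the sketch rests on an assumption: two nodes are connected when some path joins them whose consecutive edges are taken alternately from the two edge sets. Node names such as `"3t.2d"` are hypothetical shorthand for type-graph nodes, and only the explicitly listed edges are used (no TGShift is applied).

```python
from collections import deque


def upair(m, n):
    """Normalise an unordered edge."""
    return (m, n) if m <= n else (n, m)


def alternating_closure(first, second):
    """Edges (a, b) joined by a path that alternates between the two edge sets."""
    adj = ({}, {})
    for kind, edges in enumerate((first, second)):
        for m, n in edges:
            adj[kind].setdefault(m, set()).add(n)
            adj[kind].setdefault(n, set()).add(m)
    closure = set()
    for start in set(adj[0]) | set(adj[1]):
        seen = set()
        queue = deque([(start, 0), (start, 1)])   # (node, set of the next edge)
        while queue:
            node, kind = queue.popleft()
            for nxt in adj[kind].get(node, ()):
                closure.add(upair(start, nxt))
                state = (nxt, 1 - kind)
                if state not in seen:
                    seen.add(state)
                    queue.append(state)
    return closure


first = {("1d", "2t.2d")}
second = {("3t.3t", "2t"), ("3t.1t", "2t"), ("3t.2d", "2t.2d"),
          ("3t.3t.2d", "1d"), ("3t.1t.2d", "1d"), ("3t.2d", "1d")}
edges = alternating_closure(first, second)
```

Under this reading, the edge between `"3t.3t.2d"` and `"3t.2d"` is found through the path `3t.3t.2d`, `1d`, `2t.2d`, `3t.2d`, whose edges come from the second, first, and second set in turn, and the self-edge on `"3t.2d"` is found in the same way.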

4.3.2 Relevance of Sharing Edges


For an abstract sharing environment of the form (T t, (AShr~-, AShr~-)>, the full
abstract sharing component is given by AlternatingClosure(TGShift(T, AShr~-),
TGShift(T, AShr~-)). The meaning (and relevance) of the elements of ASharing~r
is determined by the concretization function InstrEnvConc and depends on the
context, namely the type graph $\mathcal{T}$. The conditions on an element $(m, n) \in
ASharing_{\mathcal{T}}$, imposed by the definition of an abstract sharing component
(Definition 4.1.14), namely $m, n \in Nodes_{\mathcal{T}}$ and $Label(m) = Label(n)$ and $Label(n) \neq$
'Or', do not guarantee that the element is meaningful or relevant in its context $\mathcal{T}$
(meaningful in the sense that it corresponds to sharing in some of the concrete
terms represented). Intuitively, we can distinguish several kinds of irrelevant
edges (for an illustration, see Figure 4.13):
(a) Edges that carry no (interesting) information. In the corresponding concrete
    terms, they only express sharing of a subterm with itself.
(b) Edges between disjoint nodes. There are no corresponding concrete terms
    in which such sharing can exist, because only identical concrete terms can
    share.
(c) Edges between mutually exclusive nodes. There are no corresponding concrete
    terms in which such sharing can exist, because only one of the alternatives
    allowed by an Or-node of a type graph occurs in a concrete term.

Figure 4.13: Irrelevant sharing arcs (panels (a) to (d)): a concrete term exhibiting
such sharing cannot be constructed.

The semantics of the type graph, and hence of the sharing edges, is formally
defined by a concretization function that may be adjusted to a particular
application. Figure 4.13.d shows a self-edge that is irrelevant if the concretization
function generates non-circular terms only (which is the case in our application).
This is appropriate if the abstract domain is intended to be used for the
analysis of systems that perform the occur check. If the intended application is,
e.g., loop detection for systems not performing the occur check, the concretization
function has to consider circular terms too, and the self-edge in Figure 4.13.d
will be relevant, indicating possibly circular terms. We will come back to this in
Section 4.3.3 below.

Figure 4.14: Ignoring the creation of irrelevant edges causes imprecision.

The definition of an abstract sharing component and the specifications of the
abstract primitive operations ignore the possible presence or creation of
irrelevant edges. An irrelevant edge does not affect the outcome of the concretization
function for an abstract sharing environment, because it does not correspond
to sharing that is possible in any of the concrete terms represented by the
underlying abstract term environment. Removing the irrelevant edges from an
abstract sharing environment $\beta_1$ leads to an equivalent abstract sharing
environment $\beta_2$, in the sense that $InstrEnvConc(\beta_1) = InstrEnvConc(\beta_2)$. Note that
the order relation, defined in Section 4.1.4, does not recognize such equivalent
sharing environments either. We have

$\beta_2 \le_{Sh} \beta_1$ & $\beta_1 \not\le_{Sh} \beta_2.$


Consequently, a few more iteration steps may be needed in the abstract
interpretation procedure before a fixpoint is recognized. Moreover, Figure 4.14
illustrates that, if irrelevant edges are not detected by the operations that create
them, they can cause imprecision in subsequent steps of the abstract interpretation,
because shifting down irrelevant edges can result in relevant ones. Suppose we have
two program variables, _A and _B, that are both bound to lists of free variables,
and that the elements of list _A possibly share with one another (type graph
$\mathcal{T}_{in}$ of Figure 4.14). The abstract interpretation of the unification _A = [_B]
consists of the following steps (in the figure, initial sharing edges are drawn as
short dashed lines, accumulated new edges as long dashed lines, and edges in
the alternating closure as dotted lines),

  • Expressing the sharing component for type graph $\mathcal{T}_{in}$ (input sharing edge
    (1)) in terms of type graph $\mathcal{T}_{out}$ by the operation Convert (which leads
    to edges (2, 8)), immediately followed by an application of the TGShift
    operation w.r.t. $\mathcal{T}_{out}$ (which gives edge (5)).
  • Constructing the edges that correspond to places where unification possibly
    introduces new sharing by the operation BindingEdges (which leads to
    edges (3, 9)), also immediately followed by an application of the TGShift
    operation w.r.t. $\mathcal{T}_{out}$ (which gives edge (6)).
  • Computing the alternating closure of the sets obtained in the previous two
    steps (which gives the edges (4, 7, 10) by the paths (3-2-3), (6-5-6) and
    (9-8-9) respectively).
However, because the program variable _A is known to be a single element list
after a successful unification, the input sharing edge (1) is in fact converted into the
irrelevant edges (2) and (8) (corresponding to class (c) and (d) in Figure 4.13,
respectively). If these edges are not removed at this intermediary stage, the
subsequent TGShift operation introduces the relevant edge (5). So, if irrelevant
edges are not removed by the Convert operation itself, but only after abstract
interpretation of the unification has been fully completed, then we can only
remove the edges (2, 4, 8, 10) as irrelevant edges (assuming that the concretization
function excludes circular terms), and we end up with an imprecise output sharing
component, containing the redundant edges (5) and (7). In general, when
as a result of unification, some 'Or'-alternatives or back-arcs of some input type
graph $\mathcal{T}_{in}$ do not have corresponding arcs in the output type graph $\mathcal{T}_{out}$, the
converted version of relevant sharing edges for $\mathcal{T}_{in}$ may be irrelevant for $\mathcal{T}_{out}$.
Relevant edges can be formally defined as follows.

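Definition 4.3.1 For an abstract term environment $\mathcal{T}^e$ and $m, n \in Nodes_{\mathcal{T}}$
such that $Label(m) = Label(n)$ and $Label(m) \neq$ 'Or', define

    $Relevant(\mathcal{T}^e, (m, n))$
    iff
    $\exists \mathcal{K} \in TGEnvConc(\mathcal{T}^e)$, $CSharing_{\mathcal{K}}$ a sharing component for $\mathcal{K}$,
    $\exists (r, s) \in CSharing_{\mathcal{K}} : m = \mathcal{T}[Sel(\mathcal{K}, r)]$ & $n = \mathcal{T}[Sel(\mathcal{K}, s)]$.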

Using this concept, we can now give a more restrictive definition for abstract
sharing components.
Definition 4.3.2 For an abstract term environment $\mathcal{T}^e$ containing the type
graph $\mathcal{T} = (Nodes_{\mathcal{T}}, ForwardArcs_{\mathcal{T}}, BackArcs_{\mathcal{T}})$, we call a set of unordered pairs
$(m, n)$ such that $m, n \in Nodes_{\mathcal{T}}$, $Label(m) = Label(n)$, $Label(m) \neq$ 'Or', and
$Relevant(\mathcal{T}^e, (m, n))$, an abstract sharing component for $\mathcal{T}$.

The specifications of the primitive operations (Section 4.2) must be adapted
accordingly, and the safety proofs will become more involved.
The algorithms for detecting the irrelevant edges shown in Figure 4.13 are
rather straightforward. Edges as in Figure 4.13.b can be characterized as edges
that cannot be shifted down all the way to leaf nodes in at least one way (or
as edges connecting two nodes representing type graphs with an empty
intersection [38, 39]). The irrelevant edges shown in Figure 4.13.a, 4.13.c and 4.13.d,
in fact connect the roots of two subgraphs that represent sets of terms of which
no elements can occur as (distinct and independent) subterms within a term
represented by the enclosing type graph. These edges can be characterized as
the edges connecting two nodes $n_1$ and $n_2$ for which the sets of ancestor nodes
w.r.t. the type graph do not have a functor node $n_f$ in common, such that $n_1$
and $n_2$ are descendants of (or equal to) two different children of $n_f$.
We expect that, based on properties of normalized type graphs (such as the
absence of empty nodes), and the precise definition of the concretization function
of type graphs, it can be proved that these are the only possible irrelevant edges,
and that the characterization given above is correct.
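
The ancestor-based characterization can be tried out with a short Python sketch. It is an illustration under assumptions, not the author's algorithm: a type graph is encoded as a dictionary from node name to `(label, children)`, back-arcs are simply children that point to an ancestor, the root tuple node counts as a functor node, and "two different children" is read as two different argument positions. The function names are hypothetical.

```python
def reachable(graph, start):
    """All nodes reachable from start (including start), following all arcs."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for child in graph[node][1]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def has_separating_functor(graph, n1, n2):
    """True if some functor node has n1 and n2 below two different children."""
    for label, children in graph.values():
        if label == 'Or':
            continue
        below = [reachable(graph, child) for child in children]
        for i, first in enumerate(below):
            for j, second in enumerate(below):
                if i != j and n1 in first and n2 in second:
                    return True
    return False


# A list of free variables: Or -> nil | '.'(V, back-arc to Or).
list_graph = {
    'root': ('<>', ['or']),
    'or': ('Or', ['nil', 'cons']),
    'nil': ('nil', []),
    'cons': ('.', ['v', 'or']),
    'v': ('V', []),
}
# A non-recursive term f(a) and two mutually exclusive alternatives.
flat_graph = {'root': ('<>', ['f']), 'f': ('f', ['a']), 'a': ('a', [])}
or_graph = {
    'root': ('<>', ['or']),
    'or': ('Or', ['f', 'g']),
    'f': ('f', ['x']), 'g': ('g', ['y']),
    'x': ('V', []), 'y': ('V', []),
}
```

With this encoding the self-edge on `'v'` in `list_graph` passes the test (distinct list elements can share), while the self-edge on `'a'` in `flat_graph` and the edge between `'x'` and `'y'` in `or_graph` fail it, in line with cases (a) and (c) above.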
Further investigations should address the following issues:
1. Which operations possibly create irrelevant edges. (For example, the
   irrelevant edge in the success substitution $\beta_2$ for the example in the previous
   section is created by the AlternatingClosure operation.)
2. What is the impact of detecting irrelevant edges at intermediary stages on
the precision and the efficiency of the analysis.

4.3.3 Imprecision in the Sharing Analysis


Although for the test programs that we used (see Appendix A) the analysis based
on the domain of abstract sharing environments derives fairly precise results, it is
not hard to think of specific programs where it does not lead to the most precise
sharing information expressible in the domain. In this section, we discuss a few
parameters that influence the precision of the sharing analysis and that can be
further tuned in order to improve the precision:
1. The depth bound imposed on the type graphs (the smaller the depth
bound, the less precise sharing information can be represented). Although
the optimal depth bound may be program dependent in general, for pro-
grams in normal form, the analysis seems to attain sufficient precision
using depth bound two.
2. The treatment of irrelevant edges. In Section 4.3.2, we illustrated that not
removing irrelevant edges at intermediary steps of the abstract interpreta-
tion can cause imprecision. The importance of this kind of imprecision in
the analysis of (realistic) programs is probably small, but further investi-
gations are needed.

3. The choice of the upperbound operation. The upperbound operation as


defined loses information about what sharings can occur in the same exe-
cution state. The upperbound operation computes an over approximation
summarizing the sharing states that can occur at a given program point.
The abstract domain chosen represents possible sharing (the relation is
not transitive) and the operations are very conservative in that they do

not perform a case analysis when computing the interaction of a set of


input sharings with a set of newly created sharings according to the
sharings that can coexist (e.g. transitive subrelations). This is a compromise
between the complexity and the precision of the operations. An alterna-
tive upperbound operation could keep track of sets of sharing components;
however this might not be worth the extra cost.

In the remainder of the section, we illustrate the first and the third cause of
imprecision.
The larger the type graphs, the more precise sharing information can be
represented. However, as one can expect, the execution time for the abstract
interpretation of a program grows with the size of the type graphs used. Hence,
we must look for some compromise. The size (i.e. the number of nodes) of the
type graphs, as derived by the abstract interpretation of a program, depends on
the depth bound restriction for the function symbols that occur in the recursive
type-graph branches. The type-graph operations for abstract interpretation fold
the type graphs at places where the depth bound is violated.
A general solution should choose an appropriate depth bound based on the
analysis of the source code of the program. We expect the optimal depth bound
restriction for a functor occurring in a recursive branch of a type graph to be
one larger than the maximal depth of the terms in selection operations occurring
in the source code and involving that functor.
In the prototype the depth bound is filled in by hand. For the test programs
(see Appendix A), which are all in normal form (all terms have maximal depth
one), depth bound two is sufficient to get good precision (depth bound one is
not). Consider Program 4.2, for splitting an input list into two lists, the first
consisting of the elements at even positions, the second of the elements at odd
positions. The program on the left is in normal form, the program on the right
is not.
For the recursive clause of the program on the left, the prototype using depth
bound two derives that, after the extension operation that follows the first two
calls (_X = [_a|_X1], _X1 = [_b|_r]) (see Figure 4.15.a), the top list cell of the
_X variable is not shared with any other node, and the top list cell of the variable
_X1 is only shared with the second list cell of _X. The sharing information is
precise enough to allow compile-time garbage collection: the top list cells of _X
and _X1 can be reused in the two subsequent construction operations, because
they do not share with any of the variables needed in the remainder of the
clause (_Y, _a, _r1, _Z, _b, _r2, _r), nor with the output arguments _Y, _Z
(remember that the abstract sharing relation is not transitive). However, for the
program on the right, which is not in normal form, depth bound three is needed
to obtain a similar precision. The domain of this program is smaller: we lose
detailed information about the program variable _X1. Using depth bound two
(see Figure 4.15.b), we derive that the second list cell of _X possibly shares with
list cells of program variable _r. This means that the prototype detects only
that the top list cell of _X can be reused, not its second list cell.
4.3. EVALUATION 119

(a)                                     (b)
split(_X, _Y, _Z) :-                    split(_X, _Y, _Z) :-
    _X = [_a|_X1], _X1 = [_b|_r],           _X = [_a, _b|_r],
    _Y = [_a|_r1],                          _Y = [_a|_r1],
    _Z = [_b|_r2],                          _Z = [_b|_r2],
    split(_r, _r1, _r2).                    split(_r, _r1, _r2).
split(_X, _Y, _Z) :-                    split(_X, _Y, _Z) :-
    _X = [_a|_X1], _X1 = [],                _X = [_a],
    _Y = [_a|_Y1], _Y1 = [],                _Y = [_a],
    _Z = [].                                _Z = [].
split(_X, _Y, _Z) :-                    split(_X, _Y, _Z) :-
    _X = [],                                _X = [],
    _Y = [],                                _Y = [],
    _Z = [].                                _Z = [].

Program 4.2: split/3

[Diagrams (a) and (b) not reproduced: type graphs over V, Int and nil nodes.]

Figure 4.15: The relation between the depth bound restriction, imprecision in
the sharing analysis and the normal form of Prolog programs.

process_max_min(_nn, _mm) :-
    max_min(_nn, _mm, _min, _max),
    process1(_max),
    process2(_min).

max_min(d(_n,_n1), d(_m,_m1), _n1, _m1) :-
    _n =< _m.
max_min(d(_n,_n1), d(_m,_m1), _m1, _n1) :-
    _n > _m.

Program 4.3: process_max_min/2

alternate(_x, _z) :-          p(_a, _b, _c, _d) :- _a = _b.
    p(_x, _y, _z, _u),        p(_e, _f, _g, _h) :- _g = _h.
    q(_y, _u).                q(_v, _w) :- _v = _w.

Program 4.4: alternate/2

Next, we illustrate the advantage of using alternating closure and twofold
sharing components. Namely, they avoid to some extent the loss of precision
due to combining sharing components of different execution states (cause 3 of
imprecision).
Consider Program 4.3. Intuitively, it is clear that after the call max_min(_nn,
_mm, _min, _max), the first (input) argument _nn, being of the form d(_n,_n1),
will either share with _min or _max, but not with both. That is precisely what the
sharing analysis based on AlternatingClosure derives. However, if the abstract
unification operation were based on a transitive closure operation, we would
derive possible sharing between the program variables _max and _min. This
would result in _max being possibly live for the call process1(_max), because it
possibly shares with a term that is still needed after returning from the call.
Opportunities for destructive assignments within the definition of the process1
predicate would not be detected.
For the example Program 4.4, the technique of alternating paths is not
enough to avoid imprecision caused by combining sharing information from
different execution states. For a call alternate(_x, _z), with both _x and _z
free variables, we derive possible sharing between _x and _z, because the call
q(_y,_u) creates a sharing edge (_y,_u) that combines with its input sharing
edges (_x,_y) and (_z,_u), into an alternating path.
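The difference between the two closures for Programs 4.3 and 4.4 can be made concrete with a simplified sketch. It rests on assumptions that go beyond the text: program variables stand in for type-graph nodes, self-edges are ignored, and an alternating path is read as a path whose consecutive edges alternate between old (input) and new sharing edges. The function names are hypothetical and the formal definition of AlternatingClosure given elsewhere in this work is not reproduced here.

```python
from collections import deque


def _neighbours(edges):
    """Undirected adjacency map for a collection of sharing edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def transitive_pairs(old, new):
    """Pairs of distinct nodes connected by any path over old and new edges."""
    adj = _neighbours(list(old) + list(new))
    pairs = set()
    for start in adj:
        seen, todo = {start}, deque([start])
        while todo:
            node = todo.popleft()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        pairs |= {frozenset((start, n)) for n in seen if n != start}
    return pairs


def alternating_pairs(old, new):
    """Pairs connected by a path whose edges alternate between old and new."""
    adj = {"old": _neighbours(old), "new": _neighbours(new)}
    other = {"old": "new", "new": "old"}
    pairs = set()
    for start in set(adj["old"]) | set(adj["new"]):
        # A search state is (node, kind of edge that must be followed next).
        seen = {(start, "old"), (start, "new")}
        todo = deque(seen)
        while todo:
            node, kind = todo.popleft()
            for nxt in adj[kind].get(node, ()):
                if nxt != start:
                    pairs.add(frozenset((start, nxt)))
                state = (nxt, other[kind])
                if state not in seen:
                    seen.add(state)
                    todo.append(state)
    return pairs


# Program 4.3: the two clauses of max_min/4 contribute one new edge each.
new_43 = [("_nn", "_min"), ("_nn", "_max")]
# Program 4.4: p/4 leaves two input edges for q/2, which adds one new edge.
old_44, new_44 = [("_x", "_y"), ("_z", "_u")], [("_y", "_u")]
```

With these inputs the transitive variant relates _min and _max while the alternating variant does not, and for Program 4.4 the alternating variant still relates _x and _z, which is the imprecision described above.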
A similar imprecision arises when unifying the left and right subgraphs of the
input type graph shown in Figure 4.16 on the left (using depth bound one for
list constructor nodes). Old edges are drawn as short dashed lines, accumulated
new edges as long dashed lines, and edges in the alternating closure as dotted
lines. The self-edges (2) and (4), in the output type graph are the result of

[Diagram not reproduced: input type graph Tin, the operation AbstrUnify(A,B) and output type graph Tout, with numbered sharing edges.]

Figure 4.16: Abstract unification involving imprecision.

combining edge (3) with itself (it is both a converted input sharing edge (1) and
a binding edge created by the abstract unification operation). However, in the
concrete terms represented by the input type graph, either both input terms are
ground lists and unification will not create any new sharing, or at least one of
them is a free variable such that there is no input sharing of list constructor
cells. In other words, the corresponding concrete unification will not give rise
to any alternating paths: the self-edges derived by the abstract unification are
redundant. Note that in this case these edges happen to be irrelevant w.r.t. the
type graph (see Section 4.3.2). However, when trees of integers are used instead
of lists, removing irrelevant edges will not solve the problem of imprecision in
this example.
In Figure 4.17, a similar unification of subgraphs is considered (with depth
bound one for list constructor nodes). In the output type graph on the right, the
edge connecting the list constructor nodes is again both an input sharing edge
and an edge created by the abstract unification operation. In this case however,
the corresponding concrete unifications can give rise to alternating paths (e.g.
consider the input term (.(1, .(2, _Z)), .(1, .(2, .(3, _Z))))): the self-edges derived
by the abstract unification correctly indicate the possible creation of circular
terms. In the case of a system performing the occur-check, such concrete unifi-
cations will result in failure and the corresponding abstract unification can safely
remove irrelevant edges.
Note that the underlying integrated type and mode analysis can also be used
to predict the possible creation of circular terms. The unification of A and B can
create a circular binding if the values of A and B just before unification share
a variable or if they both have internal sharing. Such information is available
in the constraint sets SVal, NUni and PShr too (see Section 2.3.2). However,
according to that criterion, a lot more abstract unification calls will possibly
create circular terms than will be derived by the structure sharing analysis.
The following example motivates the use of twofold sharing components.
Consider the recursive clause of the append/3 program, called with the first two
arguments linear lists of integers that possibly share at the list cell level and the
third argument a free variable (Figure 4.18.a).
append(_X, _Y, _Z) :- _X = [_E|_U], _Z = [_E|_W], append(_U, _Y, _W).

[Diagram not reproduced: input type graph Tin, the operation AbstrUnify(A,B) and output type graph Tout.]

Figure 4.17: Abstract unification indicating possible creation of circular terms.

[Diagrams (a) and (b) not reproduced.]

Figure 4.18: Abstract sharing environments for the append/3 program, at the
entry point (a) and before the recursive call (b).

At the program point preceding the recursive call, we have the abstract
sharing environment shown in Figure 4.18.b. Old edges are drawn as short
dashed lines, accumulated new edges as long dashed lines, and edges in the
alternating closure as dotted lines. If we do not make that distinction, then
the primitive operations will not derive that the sharing edge (2) between _U
and _Y, which is input sharing for the recursive call append(_U, _Y, _W), does
not represent possibly new sharing created by the recursive call. As mentioned
in Section 4.1.2, a safe extension operation exists for an abstract domain of
sharing environments that contain a single sharing component. This extension
operation considers all output sharing edges of a call as possibly created by the
call. However, for the example at hand, such an operation would consider the
sequence of edges (1-2-3) as an alternating sequence and hence would generate a
self-edge in the list node marked with *, indicating that the input list _X possibly
ends up with internal sharing of structure after the recursive call (at the element
level in the case that the concretization function only generates non-circular
terms, or at the list cell level in the case that the concretization function allows
circular terms). Of course, one can argue that, in this particular case where
the input terms are ground, no new sharing would be introduced in the input
list _X by the implementations that we consider. So, instead of treating sharing

[Diagram not reproduced: input type graph Tin, the operation AbstrUnify(Y,Z) and output type graph Tout, in which some edges are marked as redundant.]

Figure 4.19: Using groundness information to reduce imprecision.

components straightforwardly as binary relations in sets of nodes, we could define
the primitive operations to take advantage of the context of the sharing edges
to reduce the imprecision. For example, by comparing the input and the output
type graphs of the corresponding type-graph operations, we can decide whether
sharings involving ground parts of the output type graphs can be introduced as a
result of executing the operation. In general however, the initial sharing and the
newly accumulated sharing cannot be distinguished from one another by merely
comparing the input and output type graphs; e.g. consider a similar program
for merging two sorted binary trees of variable elements that possibly share at
the tree cell level on input. Using twofold sharing components adequately solves
the problem of deriving what output edges are possibly new. Nevertheless, an
optimization based on comparing input and output type graphs can still be useful
even when twofold sharing components are used, as is illustrated by Figure 4.19.

4.3.4 Efficiency of the Sharing Analysis


A prototype was developed in order to investigate the strength of the sharing
analysis based on the abstract sharing environments. It is an extension of a
prototype for integrated type and mode inferencing [38]. We experimented with
a set of well-known, but rather small (pure Horn clause) logic programs. For
larger programs, some problems arise with the efficiency and the computational
complexity of the sharing analysis. The efficiency is influenced by:

1. The size of the type graphs (for large type graphs, large sets of sharing
edges must be dealt with). The size of the type graphs depends on the
depth bound for function symbols occurring in recursive branches, but also
on program specific properties such as the number of different functors
occurring in the source code, and the size of the domain of the clauses (see
Program 4.2).

2. The depth of the AND-OR-graph (the number of unfolds of recursive
predicates) and the number of iterations over different parts of the
AND-OR-graph. Although this is mainly program dependent, an upperbound on the
number of expansions of recursive predicates can be imposed.

3. The use of an extension table, which is a memo structure that records
intermediate results during data-flow analysis, thus avoiding costly
recomputations. This also allows an incremental analysis with smaller
AND-OR-graphs.
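The reuse aspect of an extension table can be sketched as follows. The class `ExtensionTable`, the function `analyse` and the string encoding of call and success patterns are hypothetical and serve only as an illustration; the table used by the prototype also stores intermediate approximations during the fixpoint computation, which this sketch only hints at through the return value of `record`.

```python
class ExtensionTable:
    """Memo structure mapping (predicate, call pattern) to a success pattern."""

    def __init__(self):
        self._entries = {}

    def lookup(self, predicate, call_pattern):
        """Return the recorded success pattern, or None if there is none."""
        return self._entries.get((predicate, call_pattern))

    def record(self, predicate, call_pattern, success_pattern):
        """Store a result and report whether the entry changed."""
        key = (predicate, call_pattern)
        changed = self._entries.get(key) != success_pattern
        self._entries[key] = success_pattern
        return changed


def analyse(table, predicate, call_pattern, compute):
    """Reuse a recorded result when available, otherwise compute and record it."""
    cached = table.lookup(predicate, call_pattern)
    if cached is not None:
        return cached
    result = compute(predicate, call_pattern)
    table.record(predicate, call_pattern, result)
    return result
```

A second request for the same predicate and call pattern is answered from the table, so the (possibly expensive) `compute` function runs only once per distinct call pattern.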
As the precision presently obtained is satisfactory, it is interesting to investigate
in future work how to improve the efficiency and reduce the complexity of the
operations. As an example, we give less precise but more efficient specifications
for the function BindingEdges.
Definition 4.3.3 For an abstract term environment $T_{in}$, $1 \le i < j \le arity(T_{in}[\langle\rangle])$, and $T_{out} = TGUnify(T_{in}, i, j)$,

$$BindingEdges_2(T_{in}, T_{out}, i, j) = \{\, (T_{out}[(i,f)],\ T_{out}[(j,f)]) \mid \exists f : f \text{ is a functor, a constant or } V \ \&\ (i,f), (j,f) \in S(T_{out}) \,\}$$

Definition 4.3.4 For an abstract term environment $T_{in}$, $1 \le i, i_1, \ldots, i_j \le arity(T_{in}[\langle\rangle])$ such that $i, i_1, \ldots, i_j$ are pairwise distinct, $f$ is a functor of arity $j$ and $T_{out} = TGUnify(T_{in}, i, f(i_1, \ldots, i_j))$,

$$BindingEdges_2(T_{in}, T_{out}, i, f(i_1, \ldots, i_j)) = \{\, (T_{out}[(i,f).(s,h)],\ T_{out}[(i_s,h)]) \mid \exists h, s : h \text{ is a functor, a constant or } V \ \&\ 1 \le s \le j \ \&\ (i,f).(s,h), (i_s,h) \in S(T_{out}) \,\}$$
These functions do not require a matching of input and output type graphs to
find out where unification possibly introduced new bindings. Of course these
functions are less precise. When unifying general (partially instantiated) terms,
sharing edges between ground parts of the terms may be derived redundantly.
However, a lot of programs, especially programs in normal form, do not use
such general unification. Consider again the first recursive clause of the insert
program:
insert(_E, _OT, _NT) :-
    _OT = t(_L, _F, _R),
    _E =< _F, _NT = t(_NL, _F, _R),
    insert(_E, _L, _NL).
Both the selection operation _OT = t(_L, _F, _R) and the construction
operation _NT = t(_NL, _F, _R) are special cases of unification in which one of the
arguments of the unification builtin only contains free variables (_L, _F, _R for
the selection, _NT for the construction). For those special cases of unification, the
simplified BindingEdges function will not introduce any imprecision. The safety
of the alternative BindingEdges function can be proved using the inclusions (4.59)
and (4.60), which follow immediately from the specifications.
$$BindingEdges(T_{in}, T_{out}, i, j) \subseteq TGShift(T_{out}, BindingEdges_2(T_{in}, T_{out}, i, j)) \qquad (4.59)$$

[Diagram not reproduced: a type graph over the functors f and g with two sharing edges.]

Figure 4.20: Edges can be mutually related by TGShift.

and

$$BindingEdges(T_{in}, T_{out}, i, f(i_1, \ldots, i_j)) \subseteq TGShift(T_{out}, BindingEdges_2(T_{in}, T_{out}, i, f(i_1, \ldots, i_j))) \qquad (4.60)$$
Another issue to investigate is the representation of the abstract sharing com-
ponents. Formally, the abstract domain we are working with is the set of equiv-
alence classes of abstract sharing environments (see the definition of the con-
cretization function InstrEnvConc (Definition 4.1.18) and the specifications of
Section 4.2). The prototype implementation manipulates maximal representa-
tions of the abstract sharing components, i.e. sets of sharing edges that are closed
under TGShift. Also, the safety of the Convert operation (Lemma 4.2.13) cru-
cially depends on the application of the TGShift operation on its argument before
translating it to the output type graph. It is not obvious at what intermediary
stages of the primitive operations the TGShift operation can be dispensed with
in order to improve the efficiency. Based on the properties of normalized type
graphs, it may be interesting to search for a unique and well-defined canoni-
cal representative for a class of equivalent abstract sharing environments that
contain the same abstract term environment. (Note: Equivalent abstract term
environments do not have a unique representative.) Removing irrelevant edges
(defined in Section 4.3.2) and edges that result from some other edge using the
TGShift operation in general does not lead to a unique minimal representation
for an abstract sharing component, because edges can be mutually related by
TGShift, due to the existence of back-arcs. For example, removing either of the
two edges in Figure 4.20 will yield a minimal abstract sharing component.
Chapter 5

Liveness Analysis

This chapter has the same overall structure as the previous one. In the first
section, we motivate the design of the liveness environments. The second section
contains the formal definitions of the primitive operations and their soundness
proofs. The final section discusses the strength of the liveness analysis.

5.1 Liveness Environments


Standard Prolog implementations keep the representation of a data structure
intact when its parts are used to construct a new data structure, although there
may be no further reference to the old structure. The run-time garbage collect-
ing processes that are called when the interpreter runs out of memory deallocate
the resulting dead data structures, but constitute a costly remedy, as they re-
quire an interruption of the program execution. A run-time garbage collecting
procedure basically consists of a marking and compaction algorithm [2, 8, 67].
During the marking phase, the state of the computation is analyzed to detect
the references that will be used in the remaining computations. The structures
on the heap that are accessible from the set of active argument registers, the
chain of active execution environments or the choice-points are marked. During
the compaction phase, the heap-allocated term structures that are unmarked
are considered as garbage and are discarded. Static liveness analysis on the
other hand derives at compile time an upper approximation of the set of term
structures that are needed to finish the execution of the program. The compiler
can exploit such information to generate code minimizing memory consumption
by performing destructive operations that locally reuse heap storage that is no
longer referenced by program variables [29, 30], and thereby reduce the role of
run-time garbage collection. That is why the technique is also called
compile-time garbage collection.
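The marking phase described above amounts to a reachability computation from the active roots. The following sketch assumes a toy heap given as a dictionary from cell addresses to the addresses they reference; the names `mark` and `garbage` and the heap layout are illustrative assumptions, and the compaction phase is not modelled.

```python
def mark(heap, roots):
    """Return the set of cell addresses reachable from the given roots."""
    marked, stack = set(), list(roots)
    while stack:
        addr = stack.pop()
        if addr in marked:
            continue
        marked.add(addr)
        stack.extend(heap.get(addr, ()))
    return marked


def garbage(heap, roots):
    """Cells that stay unmarked and could be discarded by the compaction phase."""
    return set(heap) - mark(heap, roots)


# Toy heap: cell 1 references cells 2 and 3; cells 4 and 5 reference each other.
toy_heap = {1: [2, 3], 2: [], 3: [2], 4: [5], 5: [4]}
print(sorted(garbage(toy_heap, roots=[1])))
```

In this toy heap the cells 4 and 5 are unreachable from the root and are therefore reported as garbage, even though they reference each other. Static liveness analysis aims to approximate this kind of information at compile time instead.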
Before giving formal definitions of concrete and abstract domains enclosing
liveness information, we first illustrate how the liveness analysis can be per-
formed and how it relies on the sharing analysis. A representation for live data

structures very naturally follows from the representation of terms and sharing
between term substructures used in the previous chapter. Further, we show that
the domain of liveness environments has the algebraic structure required by the
framework of abstract interpretation.

5.1.1 Concrete Representation of Liveness Information


Once information is available about the term structures that variables can be
bound to and about the sharing that can occur at run time between the rep-
resentations of those terms as structures in memory, we can infer the live (or
active) data structures from knowledge about the flow of control in the program-
ming language (i.e. the order in which calls are processed). In the present work,
we focus on standard sequential logic programming language implementations,
which use a left-to-right computation rule.
The amount of garbage created by programs depends on their mode of use.
Strictly speaking, we have to consider all the computation states that can occur
at run time, in order to deduce for each program point which term structures
might still be accessed. For instance, consider the following Prolog code and
initial query.

Q :- A1,...,An.
R :- D1,...,P,...,Dq.

P :- B1,..., (1) Bj,...,Bm.

?- Q.

Figure 5.1 represents a computation state when program point (1) is reached.


The control provided in (pure) Prolog is a depth-first left-to-right traversal of
the search tree. Triangles denote completed subtrees. The current resolvent is

(Bj, Bj+1, ..., Bm, Dk, ..., Dq, ..., Ai+1, ..., An)σ.

The left-to-right computation rule selects the call Bjσ for execution. Bj+1, ...,
An are the pending calls (in fact renamed copies of calls appearing in the bodies
of the program clauses) and the substitution σ is the current variable-binding
environment. The path of active calls, as represented by the run-time control
structures of an interpreter, consists of the calls Q, Ai, ..., P, Bj. The binding
environments associated with the active clauses would be marked as accessible
in the case of run-time garbage collection.
Deducing information about the future need to access term structures from
the source text of the program, is a complicated task. We reduce the prob-
lem by considering forward executions only. This restriction is justified if the
trailing mechanism is enhanced such that at run time, before overwriting any
structures, it preserves a copy of the old values that are needed on backtracking
(see Section 3.3). Further, in order to facilitate passing the liveness information
inductively down the proof trees, we introduce the distinction between globally

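[Diagram not reproduced: a partial proof tree with the rows
A1 ... Ai Ai+1 ... An
D1 ... P Dk ... Dq
B1 ... Bj Bj+1 ... Bm
in which completed subtrees are drawn as triangles.]

Figure 5.1: A partial proof tree.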

and locally live structures. Under a forward execution regime, term structures
that represent Bjσ and that are shared with the term structures of (Bj+1, ...,
An)σ, must not be reused by the code generated for the predicate that resolves
with Bjσ; such structures must be marked as live structures, because they will
probably be referenced in a later stage of the execution. We call a structure
globally live if it may be referenced in a subsequent program point in one of the
ancestor clause instances. For Figure 5.1, the structures that Bjσ shares with
(Dk, ..., Dq, ..., Ai+1, ..., An)σ are globally live. A structure is called
locally live, if it may be referenced in one of the subsequent program points of
the current clause instance. For Figure 5.1, these are the structures that Bjσ
shares with (Bj+1, ..., Bm)σ. The distinction is convenient when processing
the liveness information, because the set of locally live structures depends on the
program variables referenced in the tail of the current clause, while the set of
globally live structures does not. Global and local liveness for logic programs
correspond to inter- and intra-procedural liveness for imperative programs.
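The syntactic part of this distinction can be sketched for the recursive clause of append/3 in normal form. The representation of a clause body as a list of (goal text, variables) pairs and the name `tail_variables` are assumptions of the sketch; it only collects the program variables referenced in the tail of the clause, from which the locally live structures are derived, and it says nothing about globally live structures, which depend on the calling context.

```python
def tail_variables(body, next_goal):
    """Variables occurring in the goals that follow body[next_goal]."""
    live = set()
    for _goal, variables in body[next_goal + 1:]:
        live |= set(variables)
    return live


# Body of: append(_X,_Y,_Z) :- _X = [_E|_U], _Z = [_E|_W], append(_U,_Y,_W).
append_body = [
    ("_X = [_E|_U]", ["_X", "_E", "_U"]),
    ("_Z = [_E|_W]", ["_Z", "_E", "_W"]),
    ("append(_U, _Y, _W)", ["_U", "_Y", "_W"]),
]

# Just before the construction _Z = [_E|_W] is executed:
print(sorted(tail_variables(append_body, 1)))
# Just before the recursive call, which is the last goal of the clause:
print(sorted(tail_variables(append_body, 2)))
```

Before the construction operation the tail references _U, _Y and _W only, so _E and the top list cell of _X are not kept alive by the current clause; before the last call the tail is empty.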
We now illustrate the basic ideas by means of a detailed concrete example
that operates on the concrete sharing environments introduced in the previous
chapter. Consider the proof tree of Figure 5.2 for some subderivation calling the
append/3 predicate. The normal form of the append/3 predicate is as follows.

append(_X, _Y, _Z) :- _X = nil, _Y = _Z.
append(_X, _Y, _Z) :- _X = [_E|_U], _Z = [_E|_W], append(_U, _Y, _W).

In the figure, variables are renamed to avoid name conflicts between different
clause instances. The circled labels indicate points in the computation
where we consider snapshots of the local clause environment. The concrete
sharing environments are shown in Figure 5.3. In the figures, program variable
names label the roots' outarcs. Shared substructures are shown as connected
by dashed arcs. Corresponding substructures of shared structures also share;

[Diagram not reproduced.]

Figure 5.2: The proof tree of a derivation from the query
?- append(_X0, _Y0, _Z0).

however, we show only the principal sharing arcs to avoid cluttering the figures.
Variables are represented by nodes labeled V, with aliases connected by sharing
arcs. Nodes decorated with * are locally live, nodes decorated with ** are
globally live. Substructures of live structures are also live; again, we indicate
only the principal live nodes.
The concrete sharing environment (1) of Figure 5.3 represents the initial
binding state for the call of append(_X0, _Y0, _Z0).

$\mathcal{K} = ([1], [2], V)$
$CSharing_{\mathcal{K}} = (\emptyset, \emptyset)$

The arguments _X0 and _Y0 are ground single element lists without any sharing.
Assuming that _Z0 is the query's intended solution, _Z0 is marked globally live at
point (1). The structures of _E1, _U1, _Y1 and _W1 are locally live at point (2)
because they occur in the second and third call of that clause instance. The fact
that _Z1 is globally live at point (2) is inherited from the (global) liveness of
_Z0 at point (1). This illustrates how (local and/or global) liveness information
is passed down on procedure entry to the environments at lower levels of the
proof tree. The unification _X1 = [_E1|_U1] after program point (2) boils
down to a selection of subterms, because the program variable _X1 is a ground
list on entry of the call and _E1, _U1 are uninstantiated variables. In particular,
the variable _E1 gets bound to the value 1 and the variable _U1 gets the value nil
(see the environment (3)). In the logic programming systems that we consider,
a copy is made whenever a variable is bound to an atom or an integer; so no new
sharing is introduced by this unification. If the list elements were compound
term structures and/or the tail of the list was not empty, pointers would be

[Diagrams not reproduced: the concrete sharing environments, drawn as term graphs over the constants 1, 2 and nil and variable nodes V.]

Figure 5.3: The concrete sharing environments at the points shown in Figure 5.2,
annotated with liveness information.

created from the variables _E1, _U1 to the respective subterms of _X1 and the
sharing of structure thus introduced would be represented by new sharing edges
in the concrete environment (3). For this point, we now discuss how the liveness
information can be used to determine which storage cells become garbage after
they are referenced. Figure 5.3.(3) shows the liveness information of interest.
The program variable _Z1 is still globally live and _U1, _Y1, _W1 are locally live.
The program variable _E1 is no longer locally live, because the only reference
is in the next call. Because _Z1 is a free variable at that point, we know that
the unification _Z1 = [_E1|_W1] is in fact a construction operation requiring
the allocation of a new list-cell on the heap. However, because the top list-cell
of _X1 is neither locally, nor globally live, it is a candidate to be used instead.
Note that the list-cell of variable _Y1 is not; it will be referenced by the recursive
call. At point (4), the concrete sharing environment indicates that sharing
between _Z1 and _W1 is created by the construction operation. The set of locally
live structures is empty because the next call is the last one within the current
clause environment. The set of globally live structures still contains the output
argument _Z1. Since _Z1 got instantiated by the unification, its substructures
are live too. Moreover, also _W1 is globally live because it shares with the tail
of _Z1. Indeed, the liveness property propagates through shared structures. We
introduce a LiveClosure operation to make explicit the interaction between the
sharing and the liveness relations when needed.

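Definition 5.1.1 For a concrete term $\mathcal{K}$, a set $L_{\mathcal{K}} \subseteq O(\mathcal{K})$ and $P_{\mathcal{K}}$ a set of
unordered pairs $(r, s)$ such that $r, s \in O(\mathcal{K})$, we define

$$LiveClosure(L_{\mathcal{K}}, P_{\mathcal{K}}) = \{\, s \in O(\mathcal{K}) \mid \exists r \in L_{\mathcal{K}} : (r, s) \in P_{\mathcal{K}} \,\}$$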


Each time the sharing relation is changed, the set of live subterms in fact changes
implicitly too. By keeping part of the liveness information implicit, we have a
more compact representation of the environments. Such reduced representations
are preferable, as they save space and time during the abstract interpretation
process. However, to ensure a sound interpretation, the changed liveness in-
formation has to be made explicit at certain intermediate transition steps; for
instance, at the program points where the compiler checks for the safe dealloca-
tion of some data structure, and also at procedure entry, just before applying the
restriction operation. Consider the program point (4). Because on procedure
entry only the part of the sharing relation pertaining to the call is passed down,
the environment at the lower level does not know of the sharing between _Z1 and
_W1, which is necessary to derive the liveness of the term _W1. So, the
procedure-entry operation has to make the liveness information explicit first, by
propagating the (locally and globally) live parts according to the TransitiveClosure
of the sharing components. In the example, the first sharing component and
the set of locally live terms are empty. The call pattern for the recursive call
append(_U1, _Y1, _W1) that results from making the liveness information explicit
is basically the same as that of the initial query: the first and second arguments

are ground lists without any sharing and the third argument is a free variable
which is live. The sharing environments ( ~ and (~) illustrate the binding
states after the recursive call append(_U1, _Y1, _W1) and after the initial query
append(_X0, _Y0, _Z0) respectively.
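The way liveness propagates through shared structures can be sketched as follows. The sketch assumes one reading of the LiveClosure operation, namely the set of occurrences that are paired by the sharing relation with some live occurrence, and it models occurrences as plain strings. The function `propagate`, which iterates this step to follow the transitive closure of the sharing pairs, and all names used are illustrative assumptions.

```python
def live_closure(live, pairs):
    """Occurrences s such that some live r forms the unordered pair (r, s)."""
    result = set()
    for r, s in pairs:
        if r in live:
            result.add(s)
        if s in live:
            result.add(r)
    return result


def propagate(live, pairs):
    """Extend the live set along the transitive closure of the sharing pairs."""
    current = set(live)
    while True:
        extended = current | live_closure(current, pairs)
        if extended == current:
            return current
        current = extended


# Before the recursive call of append/3: the tail of _Z1 shares with _W1.
sharing = {("_Z1.tail", "_W1")}
globally_live = {"_Z1", "_Z1.tail"}
print(sorted(propagate(globally_live, sharing)))
```

Making the liveness explicit in this way adds _W1 to the live set before the sharing between _Z1 and _W1 is dropped by the restriction to the arguments of the call.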
If we assume that the intended solution of the initial query consists of the
values of _Z0 and _X0, then both must be marked globally live in point (1). At
program point (3), this results in _X1 and _Z1 being globally live; hence, the top
list-cell of _X1 is no longer a candidate for reuse by the construction operation
_Z1 = [_E1|_W1].
If both kinds of subgoals are likely to occur at run time, the compiler can
generate two specialized versions for the append procedure: one that leaves
the input list of the first argument intact and another less memory consuming
version that reuses the list-cells of the first argument to build the output list
returned in the third argument. The purpose of liveness analysis is to derive at
compile time, which calls of the append/3 predicate in the source programs can
be specialized to the more efficient version.
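The effect of the two specialized versions can be imitated outside Prolog. The following sketch uses a hypothetical `Cell` class for heap-allocated list cells and two hypothetical functions; it illustrates the difference in memory behaviour only and is not the code a Prolog compiler would generate.

```python
class Cell:
    """A heap-allocated list cell; the empty list is represented by None."""

    def __init__(self, head, tail):
        self.head, self.tail = head, tail


def append_copy(xs, ys):
    """Leave the cells of xs intact: allocate one new cell per element of xs."""
    if xs is None:
        return ys
    return Cell(xs.head, append_copy(xs.tail, ys))


def append_reuse(xs, ys):
    """Reuse the cells of xs for the result; only safe if they are not live."""
    if xs is None:
        return ys
    xs.tail = append_reuse(xs.tail, ys)
    return xs


def to_list(cell):
    """Collect the elements of a cell chain into a Python list."""
    out = []
    while cell is not None:
        out.append(cell.head)
        cell = cell.tail
    return out
```

After `append_copy` the first argument still denotes its original list, whereas `append_reuse` returns the very cells of its first argument, now extended. The second version is only correct for calls where the analysis shows that those cells are neither locally nor globally live.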

5.1.2 Abstract Representation of Liveness Information


In this section we discuss how the new concepts transfer to the abstract setting,
where we have to deal with type graphs and a non-transitive sharing relation.
Once more, we will try to retain as much of the precision as possible.
The abstract AND-OR-graphs used by the abstract interpretation frame-
work to gather the information derived by the abstract interpretation process,
abstractly represent all the proof trees that can occur at run time; hence, they
are adequate to derive abstract information about the future need to access term
structures. Recall that the operations required by the parameterized abstract
interpretation framework in order to induce the liveness analysis are procedure
entry, primitive unification, and procedure exit. We consider each of the prim-
itive operations in turn, first reviewing the concrete version of the operation
and then considering the abstract version. Each primitive operation will modify
the set of marked nodes such that it reflects the traversal of the concrete, re-
spectively the abstract tree. The formal constructions and verifications of these
operations are given in Section 5.2.
The first operation is procedure entry. To illustrate the issues involved we
refer to Program 5.1, which shows the naive list reversal program in normal form.
We first consider the program point marked (2), just prior to the recursive call
to nrev/2. Suppose that the environment involves the variable instances _X1,
_Y1, _E1, _U1, _RU1, and _Last1, and that when control reaches point (2) in some
invocation of the recursive clause we have the concrete liveness environment
shown on the left in Figure 5.4.
The call to be executed is nrev(_U1, _RU1). The purpose of procedure entry is
to compute the concrete liveness environment at point (1) in the child invocation
of the recursive clause for nrev/2, which we assume uses new instances of the

nrev(_X, _Y) :-                         append(_X, _Y, _Z) :-
    _X = nil, _Y = nil.                     _X = nil, _Y = _Z.
nrev(_X, _Y) :- (1)                     append(_X, _Y, _Z) :-
    _X = [_E|_U], (2)                       _X = [_E|_U],
    nrev(_U, _RU),                          _Z = [_E|_W],
    _Last = [_E],                           append(_U, _Y, _W).
    append(_RU, _Last, _Y).

Program 5.1: nrev/2 (Naive reverse): normal form.

o o

9- v" ;" .: v" v" v'" v v v v

a . ..... .- SJ b ni l b nil

b nil

Figure 5.4: Concrete liveness environment at point Q and then at


point O in the recursive invocation.

clause variables named _X2, _Y2, _E2, _U2, _RU2, _Last2. Apart from determining
the liveness component, procedure entry proceeds as discussed in Section 4.1.
The value of _U1 is passed to _X2, that of _RU1 is passed to _Y2, and sharing
between subterms of _U1 and _RU1 is passed on as sharing between subterms of
_X2 and _Y2 (at least it would be if there were any sharing between _U1 and
_RU1 in the example). The program variables _E2, _U2, _RU2, _Last2, which occur
only in the body of the child invocation of the recursive clause, are initialized as
free variables that do not take part in the sharing relation.
Computing the liveness component is more involved. First, we must make
explicit the set of terms that are needed to complete the computation after
returning from the call nrev(_U1, _RU1). The set consists of the terms that are
(locally and/or globally) live at this program point, or that share with any
such live terms. The set of globally live terms is passed down from the calling
environment. For Figure 5.4 on the left, if we assume that the term _Y1 is the
intended solution of the nrev/2 invocation, then _Y1 (labeled **) will be the only
globally live part of the environment; it does not share with other subterms. We
will represent globally live terms as a set of occurrences, the concrete liveness
component, and add it to the concrete sharing environments. The set of locally
live terms can be derived from the source text of the clause and comprises the
terms _Last1, _E1, _RU1 (labeled * in Figure 5.4 on the left). Note that _Y1 is

Figure 5.5: Abstract liveness environment at the point of the recursive call and then at the marked point in the recursive invocation.

Figure 5.6: Propagation of abstract globally live terms. Panels (a), (b), (c), (d).

in fact both locally and globally live at this point. Because the set of locally
live terms is only temporarily needed during the procedure-entry operation and
because locally live components of successive program points do not depend on
each other (they are determined by the source code only), we will not include
them as a separate component in the environments.
From the set of (locally and/or globally) live subterms, as determined above,
we must select those that compose the structures of _U1 and _RU1, the arguments
of the next call. Finally, we must establish the corresponding subterms in
the structures of _X2 and _Y2 as the (globally) live component of the child
clause invocation (labeled ** in Figure 5.4 on the right).
At the abstract level, although terms are replaced by type graphs, the process
is similar. Nodes of type graphs needed to complete the computation are
determined, the information is restricted to the type graphs for _U1 and _RU1, the
arguments of the call, and passed on to _X2 and _Y2, the arguments of the head.
Figure 5.5 illustrates the abstract analogue of the operation shown in Figure 5.4.
Remember that, before restricting the information to the arguments of the
call, the liveness information has to be made explicit first. While in the concrete
case liveness propagates through the TransitiveClosure of the sharing components,
for abstract environments it propagates through the
AlternatingClosure of the twofold sharing component. However, as Figure 5.6
illustrates, when propagating the abstract global liveness information, we can

get an even more precise result if we take into account that the liveness is
already closed with respect to the first sharing component. Suppose that the
predicate p/3, defined by the single clause p(_Y, _Z, _U) :- _Y = _U, q(_Z, _U)., is
called in the environment of Figure 5.6.a. Here, sharing edges from the first (or
old) sharing component are represented as dashed arcs, those of the second (or
new) sharing component as dotted arcs. The _X0 subgraph nodes are marked
locally live, the first sharing component is empty, the second component
represents potential sharing between _X0 and _Y0 and between _Y0 and _Z0, but not
between _X0 and _Z0. On procedure entry to p/3, the sharing information of
Figure 5.6.a results in a non-empty first sharing component for the environment of
Figure 5.6.b and an empty second sharing component (see Section 4.2.2).
Propagation of the liveness information through the AlternatingClosure operation will
mark the term _Y1 as globally live but not the term _Z1 (see Figure 5.6.b).
Intuitively, it means that the global liveness of the term _Y1 and the sharing between
_Y1 and _Z1 cannot occur in the same execution state. Figure 5.6.c represents
the environment at the program point just before the call of q(_Z1, _U1). The
unification _Y1 = _U1 has introduced new sharing between _Y1 and _U1. Now, if
procedure entry to the predicate q/2 propagates the global liveness information
through the alternating closure of the twofold sharing component, then both
_Z2 and _U2 will be marked as globally live (Figure 5.6.d). It should be clear
that marking _Z2 as a live term introduces imprecision because, as mentioned
above, the global liveness of the term _Y1 and the sharing between _Y1 and _Z1
cannot occur in the same execution state. To overcome this problem, we define
an AltLiveClosure operation that considers only paths starting with a new sharing
edge from a live node.

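Definition 5.1.2 For a type graph $\mathcal{T}$, a subset $L_{\mathcal{T}}$ of $\mathrm{Nodes}_{\mathcal{T}}$ and two sets
$P_{\mathcal{T}1}$, $P_{\mathcal{T}2}$ of unordered pairs $(R, S)$ such that $R, S \in \mathrm{Nodes}_{\mathcal{T}}$, we define

\[
\mathrm{AltLiveClosure}(L_{\mathcal{T}}, P_{\mathcal{T}1}, P_{\mathcal{T}2}) =
\left\{ S \;\middle|\;
\begin{array}{l}
S \in L_{\mathcal{T}} \;\lor\; \exists\, (R_1, S_1), \ldots, (R_n, S_n) \text{ a finite sequence over } P_{\mathcal{T}1} \cup P_{\mathcal{T}2} \\
\text{such that } S = S_n \;\&\; R_1 \in L_{\mathcal{T}} \;\& \\
(\forall l \in \mathbb{N} : 1 \le l < n \Rightarrow S_l = R_{l+1}) \;\& \\
{[\forall k \in \mathbb{N} : (1 \le 2k+1 \le n \Rightarrow (R_{2k+1}, S_{2k+1}) \in P_{\mathcal{T}2}) \;\&} \\
\quad (1 \le 2k \le n \Rightarrow (R_{2k}, S_{2k}) \in P_{\mathcal{T}1})]
\end{array}
\right\}
\]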

Note however that in the case of the naive reverse program, using the improved
closure operation has no effect, because all relevant sharing edges belong to the
second sharing component; the first sharing component is empty for all program
points.
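
As an illustration of Definition 5.1.2, the following is a minimal Python sketch, not the construction used in the text. It assumes nodes are hashable labels and both sharing components are given as sets of 2-tuples read as unordered pairs; the names `alt_live_closure`, `old_pairs` and `new_pairs` are hypothetical. The search tracks from which component the next edge must be taken, starting with the second (new) component.

```python
from collections import defaultdict


def _adjacency(pairs):
    """Build an undirected adjacency map from a set of unordered pairs."""
    adj = defaultdict(set)
    for r, s in pairs:
        adj[r].add(s)
        adj[s].add(r)
    return adj


def alt_live_closure(live, old_pairs, new_pairs):
    """Nodes reachable from a live node along paths whose 1st, 3rd, ... edges
    come from new_pairs and whose 2nd, 4th, ... edges come from old_pairs.
    The live nodes themselves are always included."""
    old_adj, new_adj = _adjacency(old_pairs), _adjacency(new_pairs)
    result = set(live)
    # A search state is (node, component the next edge must belong to).
    frontier = [(node, "new") for node in live]
    seen = set(frontier)
    while frontier:
        node, kind = frontier.pop()
        adj = new_adj if kind == "new" else old_adj
        next_kind = "old" if kind == "new" else "new"
        for succ in adj.get(node, ()):
            result.add(succ)
            state = (succ, next_kind)
            if state not in seen:
                seen.add(state)
                frontier.append(state)
    return result


# Situation modelled on Figure 5.6.c: Y is live, the old component links
# Y and Z, the new component links Y and U.
closure = alt_live_closure({"Y"}, {("Y", "Z")}, {("Y", "U")})
```

In this small instance `closure` contains Y and U but not Z, because the only edge to Z is an old edge leaving a live node, which is the kind of path the definition excludes.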
The next operation to consider is unification. The two basic forms of
unification that can occur in a normal-form logic program (i.e., X = Y and
X = f(X1, ..., Xn)) are discussed at length in Section 4.1 for sharing environments.
We now extend the unification to incorporate the liveness component. As is
mentioned in the previous section, for concrete terms the liveness property is
closed under substructures. Therefore, when unification further instantiates a

Figure 5.7: The concrete liveness environment at the marked program point (before the unification).

Figure 5.8: The abstract liveness environment at the marked program point (before the unification).

live structure, we add the new nodes to the concrete liveness component. Also,
unification can introduce new sharing, but as nodes that share with a live node
become live implicitly, no special action must be undertaken.
At the abstract level, the effect of unification is to modify the type graphs,
say T into T'. When T has live nodes, updating the liveness component entails
determining the corresponding nodes in the new type graph T'. As type graphs
are essentially deterministic, top-down tree automata, the problem is similar to
determining whether two automata recognize intersecting sets. Once the
corresponding nodes are determined, it is straightforward to update the liveness
components. The propagation of liveness across new sharing edges occurs
implicitly. The propagation of liveness down new descendant arcs follows from
the interpretation of the liveness environments, as defined by the concretization
function (Section 5.1.3). Figures 5.7, 5.8 and 5.9, 5.10 illustrate the concrete and
abstract liveness environments before and after the unification _Z = [_E | _W]
in the append/3 predicate.
Finally, we consider procedure exit. Figure 5.11 shows a concrete liveness
environment in an invocation of nrev at the point where the final append is called.
Figure 5.12 shows the concrete liveness environment of
the append clause at the point where execution of the body is completed.
The restriction of this state to the variables _X2, _Y2, and _Z2 has to
be passed back to _RU1, _Last1, and _Y1, respectively. No liveness information
need be returned. The result is shown in Figure 5.13. We see that _Y1, a live
variable, has been bound to the list [b,a], and that, due to the generated sharing,
_Last1 becomes live as well, although implicitly.

Figure 5.9: The concrete liveness environment at the marked program point (after the unification).

Figure 5.10: The abstract liveness environment at the marked program point (after the unification).

Figure 5.11: The concrete liveness environment in the caller's environment at the marked program point. First of two inputs to procedure exit.

Figure 5.12: The concrete liveness environment in the callee's environment at the marked program point. Second of two inputs to procedure exit.

Figure 5.13: Concrete liveness environment in caller's environment at the marked program point. Output for procedure exit.

When moving from the concrete to the abstract level, we have similar op-
erations. The abstract liveness environment must be restricted to the variables
occurring in the head of the clause and the type graphs and sharing information
has to be returned to the caller. The extension operation (Section 4.2.3) prop-
agates the updates of the variables participating in the call to the variables in
the caller's environment: Further type graphs may need to be modified owing to
shared variables; sharing has to be updated in terms of the new type graphs; the
new sharing interacts with the sharing present in the caller's environment before
the call, inducing further sharing. The updating of the liveness component in the
caller's environment implicitly follows from the modifications in the type graphs

Figure 5.14: The abstract liveness environment in the caller's environment at the marked program point. First of two inputs to procedure exit.

Figure 5.15: The abstract liveness environment in the callee's environment at the marked program point. Second of two inputs to procedure exit.

Figure 5.16: Abstract liveness environment in caller's environment at the marked program point. Output for procedure exit.

and the sharing relation. Figures 5.14-5.16 illustrate the abstract analogue of
the operation shown in Figures 5.11-5.13. Notice the possible sharing between
_Y1 and _Last1 in Figure 5.16, which is not present in Figure 5.13. This is due
to the possibility that _U1 is the empty list in Figure 5.14.

5.1.3 The Concrete and Abstract Domains


This section contains the formal definitions of the concrete and abstract domains
for the liveness analysis. They are based on the sharing environments defined
in Section 4.1.3. First we introduce some extensions of the basic concepts of
Sections 4.1.1 and 4.1.2.

Definition 5.1.3 (Generalization of Definition 4.1.3) For $\mathcal{K}$ a term and
$L_{\mathcal{K}}$ a subset of $\mathcal{O}(\mathcal{K})$, we define

\[
\mathrm{TermShift}(\mathcal{K}, L_{\mathcal{K}}) = \{\, r.t \mid r \in L_{\mathcal{K}} \;\&\; r.t \in \mathcal{O}(\mathcal{K}) \,\}.
\]

Note the overloading of the operation TermShift. It is a closure operation on
either sets of occurrences or sets of pairs of occurrences. We use it here in
correspondence with the idea that if a term is live at some program point, then its
subterms are live too. We will use a similar overloading for other functions. For
example, we generalize the definition of $\mathrm{TGShift}(\mathcal{T}, E_{\mathcal{T}})$ such that the elements of
$E_{\mathcal{T}}$ can be either elements of $\mathrm{Nodes}_{\mathcal{T}}$ or unordered pairs of elements of $\mathrm{Nodes}_{\mathcal{T}}$.
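
To make the TermShift operation on sets of occurrences concrete, here is a small Python sketch under stated assumptions: a term is modelled as either a string (an atom or variable) or a tuple `(functor, arg1, ..., argn)`, and an occurrence is a tuple of argument positions. This representation and the helper names `occurrences` and `term_shift` are illustrative choices, not the notation of the text.

```python
def occurrences(term, prefix=()):
    """All occurrences (argument-position paths) of a term.

    A term is a string (atom or variable) or a tuple
    (functor, arg1, ..., argn); the root occurrence is the empty path ()."""
    occ = {prefix}
    if isinstance(term, tuple):
        for k, arg in enumerate(term[1:], start=1):
            occ |= occurrences(arg, prefix + (k,))
    return occ


def term_shift(term, live):
    """Occurrences r.t of the term such that r is in the given set."""
    return {o for o in occurrences(term)
            if any(o[:len(r)] == r for r in live)}


# The list [b, a] written with '.' as list constructor.
lst = (".", "b", (".", "a", "nil"))
# Marking the tail (occurrence 2) also marks its subterms 2.1 and 2.2.
shifted = term_shift(lst, {(2,)})
```

The result is closed under subterms by construction, so applying `term_shift` to `shifted` again returns the same set, which mirrors the closure property used for liveness components.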

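Definition 5.1.4 (Generalization of Definition 4.1.10) For $\mathcal{T}$ a type graph
and $L_{\mathcal{T}}$ a set of nodes $n$ such that $n \in \mathrm{Nodes}_{\mathcal{T}}$ and $\mathrm{Label}(n) \neq$ 'Or', define

\[
\mathrm{TGShift}(\mathcal{T}, L_{\mathcal{T}}) = \{\, \mathcal{T}[S.T] \mid \exists S, T : \mathcal{T}[S] \in L_{\mathcal{T}} \;\&\; S.T \in \mathcal{S}(\mathcal{T}) \,\}.
\]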
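
The following Python sketch gives one possible reading of TGShift on sets of nodes. It rests on assumptions that are not stated in this section: a type graph is modelled as a dictionary from node identifiers to `(label, children)` pairs, and the nodes addressed by selectors are taken to be exactly the nodes not labeled 'Or'. Under that reading, TGShift collects every non-'Or' node reachable from a marked node, including the marked node itself. The function name `tg_shift` and the node identifiers are hypothetical.

```python
def tg_shift(graph, live):
    """Non-'Or' nodes reachable from the given nodes (inclusive).

    graph maps a node id to (label, list_of_child_ids); back arcs are
    allowed, so visited nodes are tracked to guarantee termination."""
    result, visited = set(), set()
    stack = list(live)
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        label, children = graph[node]
        if label != "Or":
            result.add(node)
        stack.extend(children)
    return result


# A type graph for lists of atoms: n0 = Or(nil, '.'(atom, n0)).
list_of_atoms = {
    "n0": ("Or", ["n1", "n2"]),
    "n1": ("nil", []),
    "n2": (".", ["n3", "n0"]),
    "n3": ("atom", []),
}
shifted_nodes = tg_shift(list_of_atoms, {"n2"})
```

Because of the back arc from the list cell to the 'Or' node, marking the cell `n2` also marks the `nil` alternative `n1`, which reflects that any tail of a live list is live.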



In order to derive liveness information, we augment the sharing environments


with a component that, if propagated through the sharing components, repre-
sents the globally live subterms of the environment. As discussed in the previous
section, a concrete liveness component is closed with respect to the first sharing
component and also with respect to the TermShift operation.

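Definition 5.1.5 A concrete liveness environment has the form $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}})$
where $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}})$ is a concrete sharing environment and $\mathrm{CLive}_{\mathcal{K}}$ is
a subset of $\mathcal{O}(\mathcal{K})$ closed under TermShift, i.e. $\mathrm{CLive}_{\mathcal{K}} = \mathrm{TermShift}(\mathcal{K}, \mathrm{CLive}_{\mathcal{K}})$,
and closed with respect to $\mathrm{CShr}^1_{\mathcal{K}}$, i.e. $\mathrm{CLive}_{\mathcal{K}} = \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}}, \mathrm{CShr}^1_{\mathcal{K}})$. We
call $\mathrm{CLive}_{\mathcal{K}}$ a concrete liveness component for $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}})$.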
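
The two closure conditions of Definition 5.1.5 can be checked mechanically. The sketch below is illustrative only and makes several assumptions: terms are strings or tuples `(functor, arg1, ..., argn)`, occurrences are tuples of argument positions, the first sharing component is a set of pairs of occurrences read as unordered, and LiveClosure(L, P) is taken to add every occurrence that P pairs with a member of L (the reading suggested by the expansion used in the proof of Lemma 5.1.10). All function names are hypothetical.

```python
def occurrences(term, prefix=()):
    """All argument-position paths of a term (str or (functor, args...))."""
    occ = {prefix}
    if isinstance(term, tuple):
        for k, arg in enumerate(term[1:], start=1):
            occ |= occurrences(arg, prefix + (k,))
    return occ


def term_shift(term, live):
    """Occurrences of the term that have a prefix in the given set."""
    return {o for o in occurrences(term)
            if any(o[:len(r)] == r for r in live)}


def live_closure(live, pairs):
    """The given set plus every occurrence paired with one of its members."""
    result = set(live)
    for r, s in pairs:
        if r in live:
            result.add(s)
        if s in live:
            result.add(r)
    return result


def is_liveness_component(term, cshr1, clive):
    """Check both closure conditions of a concrete liveness component."""
    clive = set(clive)
    return (clive <= occurrences(term)
            and term_shift(term, clive) == clive
            and live_closure(clive, cshr1) == clive)


# Environment with two list arguments whose first elements share.
env = ("env", (".", "a", "nil"), (".", "a", "nil"))
cshr1 = {((1, 1), (2, 1))}
not_closed = {(1,), (1, 1), (1, 2)}
closed = not_closed | {(2, 1)}
```

Here `is_liveness_component(env, cshr1, not_closed)` is false, since occurrence 2.1 shares with the live occurrence 1.1 through the first sharing component, while the set `closed` satisfies both conditions.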

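Definition 5.1.6 An abstract liveness environment has the form $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})$
where $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})$ is an abstract sharing environment and $\mathrm{ALive}_{\mathcal{T}}$ is
a subset of $\mathrm{Nodes}_{\mathcal{T}}$, consisting of nodes not labeled 'Or'. We call $\mathrm{ALive}_{\mathcal{T}}$ an
abstract liveness component for $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})$.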

The abstract interpretation procedure computing the abstract AND-OR-graphs
(Section 2.2.1) will associate with every program point an abstract liveness
environment. A correct interpretation requires that the sharing relation
is taken into account to derive from the component $\mathrm{CLive}_{\mathcal{K}}$ (resp. $\mathrm{ALive}_{\mathcal{T}}$) the
full set of globally live terms, i.e. live with respect to some calling environment.
The information that a compiler needs at a program point to decide upon the
liveness of some term in addition requires deriving the set of locally live terms.
We will formalize how to obtain that set in Section 5.2.2.
The concretization function InstrEnvConc for abstract sharing environments
defined in Section 4.1.3 can be generalized for abstract liveness environments as
follows.

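Definition 5.1.7 For an abstract liveness environment $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})$,

\[
\mathrm{LiveEnvConc}((\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})) =
\left\{ (\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}}) \;\middle|\;
\begin{array}{l}
(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \in \mathrm{InstrEnvConc}((\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})) \\
\&\; \mathrm{CLive}_{\mathcal{K}} \text{ is a concrete liveness component for } (\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \\
\&\; (r \in \mathrm{CLive}_{\mathcal{K}} \Rightarrow \mathcal{T}[\mathrm{Sel}(\mathcal{K}, r)] \in \mathrm{TGShift}(\mathcal{T}, \mathrm{ALive}_{\mathcal{T}}))
\end{array}
\right\}
\]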
As in Section 4.1.3, we introduce some notation that is convenient when proving
safety results for the operations on the domain of liveness environments.

Definition 5.1.8 (Generalization of Definition 4.1.19) For $\mathcal{T}^e$ an abstract
term environment, a term $\mathcal{K} \in \mathrm{TGEnvConc}(\mathcal{T}^e)$ and a set $L_{\mathcal{K}} \subseteq \mathcal{O}(\mathcal{K})$, define

\[
\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}) = \{\, \mathcal{T}[\mathrm{Sel}(\mathcal{K}, r)] \mid r \in L_{\mathcal{K}} \,\}.
\]

The $\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}$ function maps a concrete occurrence for a term $\mathcal{K}$ into an
abstract node for the type graph $\mathcal{T}$ recognizing the term $\mathcal{K}$. The concretization
function for abstract liveness environments can now be reformulated as follows.
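
Definition 5.1.9 (Reformulation of Definition 5.1.7) For an abstract liveness
environment $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})$,

\[
\mathrm{LiveEnvConc}((\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})) =
\left\{ (\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}}) \;\middle|\;
\begin{array}{l}
(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \in \mathrm{InstrEnvConc}((\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}})) \\
\&\; \mathrm{CLive}_{\mathcal{K}} \text{ is a concrete liveness component for } (\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \\
\&\; \mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{CLive}_{\mathcal{K}}) \subseteq \mathrm{TGShift}(\mathcal{T}, \mathrm{ALive}_{\mathcal{T}})
\end{array}
\right\}
\]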

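The following lemma states that the AltLiveClosure operation for abstract liveness
components safely approximates the LiveClosure operation of concrete liveness
components.

Lemma 5.1.10 (Safety of AltLiveClosure) Let $\mathcal{T}^e$ be an abstract term environment,
$\mathcal{K} \in \mathrm{TGEnvConc}(\mathcal{T}^e)$, and $P_{\mathcal{K}1}$, $P_{\mathcal{K}2}$ pre-sharing components for $\mathcal{K}$, such
that $L_{\mathcal{K}} \subseteq \mathcal{O}(\mathcal{K})$ is a concrete liveness component for $(\mathcal{K}, (P_{\mathcal{K}1}, P_{\mathcal{K}2}))$. Then

\[
\begin{aligned}
&\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{LiveClosure}(L_{\mathcal{K}}, \mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup P_{\mathcal{K}2}))) \\
&\quad \subseteq \mathrm{AltLiveClosure}(\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2})).
\end{aligned}
\]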

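Proof. After expanding the definitions of $\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}$ and LiveClosure, we have

\[
\begin{aligned}
&\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{LiveClosure}(L_{\mathcal{K}}, \mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup P_{\mathcal{K}2}))) = \\
&\quad \{\, \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)] \mid s \in L_{\mathcal{K}} \lor (\exists r \in L_{\mathcal{K}} : (r, s) \in \mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup P_{\mathcal{K}2})) \,\}.
\end{aligned}
\]

Fix an arbitrary $s$ satisfying

\[
s \in L_{\mathcal{K}} \lor (\exists r \in L_{\mathcal{K}} : (r, s) \in \mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup P_{\mathcal{K}2})).
\]

If $s \in L_{\mathcal{K}}$, then it follows from the definition of $\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}$ that $\mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)] \in
\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}})$, and by the definition of AltLiveClosure we have $\mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)] \in
\mathrm{AltLiveClosure}(\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2}))$, so we are
done. Now, suppose $s \notin L_{\mathcal{K}}$, but $(\exists r \in L_{\mathcal{K}} : (r, s) \in \mathrm{TransitiveClosure}(P_{\mathcal{K}1} \cup
P_{\mathcal{K}2}))$. By the definition of TransitiveClosure, there exists a finite sequence $E$
over $P_{\mathcal{K}1} \cup P_{\mathcal{K}2}$ such that

\[
\begin{aligned}
& r \in L_{\mathcal{K}}, \\
& E = (r_1, s_1), \ldots, (r_n, s_n) \text{ for some } n \in \mathbb{N}, \\
& r = r_1,\; s = s_n, \\
& \forall l \in \mathbb{N} : 1 \le l < n \Rightarrow s_l = r_{l+1}.
\end{aligned}
\]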
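
In Lemma 4.1.21 (safety of AlternatingClosure), we proved that such a finite
sequence can be transformed into an "alternating" sequence $E'$ satisfying

\[
\begin{aligned}
& r \in L_{\mathcal{K}}, \\
& E' = (r'_1, s'_1), \ldots, (r'_m, s'_m) \text{ for some } m \in \mathbb{N}, \\
& r = r'_1,\; s = s'_m, \\
& \forall l \in \mathbb{N} : 1 \le l < m \Rightarrow s'_l = r'_{l+1}, \\
& \forall k \in \mathbb{N} : ((1 \le 2k \le m \Rightarrow (r'_{2k}, s'_{2k}) \in P_{\mathcal{K}1}) \;\&\; (1 \le 2k+1 \le m \Rightarrow (r'_{2k+1}, s'_{2k+1}) \in P_{\mathcal{K}2})) \;\lor \\
& \qquad\quad\;\; ((1 \le 2k \le m \Rightarrow (r'_{2k}, s'_{2k}) \in P_{\mathcal{K}2}) \;\&\; (1 \le 2k+1 \le m \Rightarrow (r'_{2k+1}, s'_{2k+1}) \in P_{\mathcal{K}1})).
\end{aligned}
\]

If $(r'_1, s'_1) \in P_{\mathcal{K}1}$, then $s'_1 \in L_{\mathcal{K}}$, because $r = r'_1 \in L_{\mathcal{K}}$ and $L_{\mathcal{K}} = \mathrm{LiveClosure}(L_{\mathcal{K}},
P_{\mathcal{K}1})$ by the definition of a concrete liveness component for $(\mathcal{K}, (P_{\mathcal{K}1}, P_{\mathcal{K}2}))$.
Hence, we can further reduce the sequence $E'$ to a sequence $E''$ that starts in $L_{\mathcal{K}}$ with
an element of $P_{\mathcal{K}2}$, i.e.

\[
\begin{aligned}
& E'' = (r''_1, s''_1), \ldots, (r''_p, s''_p) \text{ for some } p \in \mathbb{N}, \\
& r''_1 \in L_{\mathcal{K}},\; s = s''_p, \\
& \forall l \in \mathbb{N} : 1 \le l < p \Rightarrow s''_l = r''_{l+1}, \\
& \forall k \in \mathbb{N} : (1 \le 2k+1 \le p \Rightarrow (r''_{2k+1}, s''_{2k+1}) \in P_{\mathcal{K}2}) \;\& \\
& \qquad\quad\;\; (1 \le 2k \le p \Rightarrow (r''_{2k}, s''_{2k}) \in P_{\mathcal{K}1}).
\end{aligned}
\]

For every $l \in \mathbb{N}$ such that $1 \le l \le p$, let $(R_l, S_l) = (\mathcal{T}[\mathrm{Sel}(\mathcal{K}, r''_l)], \mathcal{T}[\mathrm{Sel}(\mathcal{K}, s''_l)])$.
Using the sequence $(R_1, S_1), \ldots, (R_p, S_p)$ in the definition of AltLiveClosure, we
have that

\[
\mathcal{T}[\mathrm{Sel}(\mathcal{K}, s)] \in \mathrm{AltLiveClosure}(\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}1}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}2})),
\]

as desired. □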
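Based on the previous result, we can now show that the full set of live terms
in some concrete environment described by an abstract liveness environment can
be safely approximated by the AltLiveClosure operation.

Lemma 5.1.11 For an abstract liveness environment $(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})$,
and a concrete liveness environment $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}}) \in \mathrm{LiveEnvConc}((\mathcal{T}^e,
\mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}}))$, we have

\[
\begin{aligned}
&\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}}, \mathrm{TransitiveClosure}(\mathrm{CShr}^1_{\mathcal{K}} \cup \mathrm{CShr}^2_{\mathcal{K}}))) \subseteq \\
&\quad \mathrm{AltLiveClosure}(\mathrm{TGShift}(\mathcal{T}, \mathrm{ALive}_{\mathcal{T}}), \mathrm{TGShift}(\mathcal{T}, \mathrm{AShr}^1_{\mathcal{T}}), \mathrm{TGShift}(\mathcal{T}, \mathrm{AShr}^2_{\mathcal{T}})).
\end{aligned}
\tag{5.1}
\]

Proof. From the definitions of LiveEnvConc and InstrEnvConc, we know that

\[
\begin{aligned}
& \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(\mathrm{CShr}^1_{\mathcal{K}}) \subseteq \mathrm{TGShift}(\mathcal{T}, \mathrm{AShr}^1_{\mathcal{T}}) \\
\&\;& \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(\mathrm{CShr}^2_{\mathcal{K}}) \subseteq \mathrm{TGShift}(\mathcal{T}, \mathrm{AShr}^2_{\mathcal{T}}) \\
\&\;& \mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{CLive}_{\mathcal{K}}) \subseteq \mathrm{TGShift}(\mathcal{T}, \mathrm{ALive}_{\mathcal{T}}).
\end{aligned}
\tag{5.2}
\]

We derive that

\[
\begin{aligned}
&\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}}, \mathrm{TransitiveClosure}(\mathrm{CShr}^1_{\mathcal{K}} \cup \mathrm{CShr}^2_{\mathcal{K}}))) \\
&\quad \subseteq \mathrm{AltLiveClosure}(\mathrm{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathrm{CLive}_{\mathcal{K}}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(\mathrm{CShr}^1_{\mathcal{K}}), \mathrm{AbsPair}_{\mathcal{T},\mathcal{K}}(\mathrm{CShr}^2_{\mathcal{K}})), \\
&\qquad \text{by Lemma 5.1.10 (the safety of AltLiveClosure),} \\
&\quad \subseteq \mathrm{AltLiveClosure}(\mathrm{TGShift}(\mathcal{T}, \mathrm{ALive}_{\mathcal{T}}), \mathrm{TGShift}(\mathcal{T}, \mathrm{AShr}^1_{\mathcal{T}}), \mathrm{TGShift}(\mathcal{T}, \mathrm{AShr}^2_{\mathcal{T}})), \\
&\qquad \text{by (5.2) and the monotonicity of AltLiveClosure.}
\end{aligned}
\]

Summarizing, we have (5.1), as desired. □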

5.1.4 Order Relation and Upperbound Operation


In this section we prove that the domain of liveness environments has the alge-
braic structure imposed by the framework of abstract interpretation. We first
extend the (pre)order relation and the upperbound operation defined in Sec-
tion 4.1.4 for abstract sharing environments, to abstract liveness environments
having the same arity. The function Translate is generalized to reexpress an ab-
stract liveness component for a type graph T1 as an abstract liveness component
for a type graph T2 that describes a superset of terms represented by the type
graph T1.
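Definition 5.1.12 (Generalization of Definition 4.1.23) For two abstract
term environments $\mathcal{T}^e_1$, $\mathcal{T}^e_2$ such that $\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$ and $\mathcal{T}^e_1 \le_{TG}
\mathcal{T}^e_2$, and $L_{\mathcal{T}_1}$ a subset of $\mathrm{Nodes}_{\mathcal{T}_1}$ consisting of nodes not labeled 'Or', define

\[
\mathrm{Translate}(\mathcal{T}_1, L_{\mathcal{T}_1}, \mathcal{T}_2) = \{\, \mathcal{T}_2[R] \mid \exists R \in \mathcal{S}(\mathcal{T}_1) : \mathcal{T}_1[R] \in \mathrm{TGShift}(\mathcal{T}_1, L_{\mathcal{T}_1}) \,\}.
\]

Definition 5.1.13 ($\le_{Lv}$) For abstract liveness environments $(\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1},
\mathrm{ALive}_{\mathcal{T}_1})$ and $(\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})$ such that $\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$,

\[
\begin{aligned}
& (\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}) \le_{Lv} (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2}) \\
& \text{iff} \\
& (\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}) \le_{Sh} (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}) \\
& \&\; \mathrm{Translate}(\mathcal{T}_1, \mathrm{ALive}_{\mathcal{T}_1}, \mathcal{T}_2) \subseteq \mathrm{TGShift}(\mathcal{T}_2, \mathrm{ALive}_{\mathcal{T}_2}).
\end{aligned}
\]

Lemma 5.1.14 For abstract liveness environments $(\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1})$
and $(\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})$ such that $\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$,

\[
\begin{aligned}
& (\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}) \le_{Lv} (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2}) \\
& \Rightarrow \\
& \mathrm{LiveEnvConc}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1})) \subseteq \mathrm{LiveEnvConc}((\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})).
\end{aligned}
\]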
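Proof. Assume that $(\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}) \le_{Lv} (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})$
and $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}}) \in \mathrm{LiveEnvConc}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}))$. We must
prove that

\[
(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}}) \in \mathrm{LiveEnvConc}((\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})). \tag{5.3}
\]

From $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}, \mathrm{CLive}_{\mathcal{K}}) \in \mathrm{LiveEnvConc}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}))$ and the
definition of LiveEnvConc, we know that $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_1,
\mathrm{ASharing}_{\mathcal{T}_1}))$, $\mathrm{CLive}_{\mathcal{K}}$ is a concrete liveness component for $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}})$ and
$(t \in \mathrm{CLive}_{\mathcal{K}} \Rightarrow \mathcal{T}_1[\mathrm{Sel}(\mathcal{K}, t)] \in \mathrm{TGShift}(\mathcal{T}_1, \mathrm{ALive}_{\mathcal{T}_1}))$. By the definition of
$\le_{Lv}$ and Lemma 4.1.25, we now have that $\mathrm{InstrEnvConc}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1})) \subseteq
\mathrm{InstrEnvConc}((\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}))$, hence

\[
(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}}) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2})). \tag{5.4}
\]

For $t \in \mathrm{CLive}_{\mathcal{K}}$, we have $t \in \mathcal{O}(\mathcal{K})$ by Definition 5.1.5. By the definition of
InstrEnvConc, we have $\mathcal{K} \in \mathrm{TGEnvConc}(\mathcal{T}^e_1)$ and $\mathcal{K} \in \mathrm{TGEnvConc}(\mathcal{T}^e_2)$. From
Property 4.1.13, it follows that $\mathrm{Sel}(\mathcal{K}, t) \in \mathcal{S}(\mathcal{T}_1) \cap \mathcal{S}(\mathcal{T}_2)$. So, using the definition
of Translate we derive

\[
t \in \mathrm{CLive}_{\mathcal{K}} \Rightarrow \mathcal{T}_2[\mathrm{Sel}(\mathcal{K}, t)] \in \mathrm{Translate}(\mathcal{T}_1, \mathrm{ALive}_{\mathcal{T}_1}, \mathcal{T}_2).
\]

By the definition of $\le_{Lv}$, we have that $\mathrm{Translate}(\mathcal{T}_1, \mathrm{ALive}_{\mathcal{T}_1}, \mathcal{T}_2) \subseteq \mathrm{TGShift}(\mathcal{T}_2,
\mathrm{ALive}_{\mathcal{T}_2})$, hence

\[
t \in \mathrm{CLive}_{\mathcal{K}} \Rightarrow \mathcal{T}_2[\mathrm{Sel}(\mathcal{K}, t)] \in \mathrm{TGShift}(\mathcal{T}_2, \mathrm{ALive}_{\mathcal{T}_2}). \tag{5.5}
\]

Finally, (5.3) follows from (5.4), (5.5) and the fact that $\mathrm{CLive}_{\mathcal{K}}$ is a concrete
liveness component for $(\mathcal{K}, \mathrm{CSharing}_{\mathcal{K}})$. □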
The minimal element of the abstract domain of liveness environments is $(\mathcal{T}^e_\bot, (\emptyset,
\emptyset), \emptyset)$, with $\mathcal{T}^e_\bot$ as defined in Section 2.3.2. The maximal element of the abstract
domain is $(\mathcal{T}^e_\top, (\mathrm{AShr}^1_{\mathcal{T}_\top}, \mathrm{AShr}^2_{\mathcal{T}_\top}), \mathrm{ALive}_{\mathcal{T}_\top})$, with $\mathcal{T}^e_\top$ as defined in
Section 2.3.2, $\mathrm{AShr}^1_{\mathcal{T}_\top}$, $\mathrm{AShr}^2_{\mathcal{T}_\top}$ the maximal sets of sharing edges, containing all
pairs of function nodes of $\mathcal{T}_\top$ having the same label, and $\mathrm{ALive}_{\mathcal{T}_\top}$ the maximal
set of live nodes. The subdomain of liveness environments based on restricted
type graphs does not contain any infinite ascending chains, because the domain of
restricted sharing environments is finite and for each finite type graph, there are
only finitely many different liveness components.
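Definition 5.1.15 (LiveUpp) For two abstract liveness environments $(\mathcal{T}^e_1,
\mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1})$ and $(\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})$ such that $\mathrm{arity}(\mathcal{T}_1[\epsilon]) =
\mathrm{arity}(\mathcal{T}_2[\epsilon])$,

\[
\mathrm{LiveUpp}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}), (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2}))
= (\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}}),
\]

where

\[
\begin{aligned}
(\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}) &= \mathrm{Upp}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}), (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2})), \\
\mathrm{ALive}_{\mathcal{T}} &= \mathrm{Translate}(\mathcal{T}_1, \mathrm{ALive}_{\mathcal{T}_1}, \mathcal{T}) \cup \mathrm{Translate}(\mathcal{T}_2, \mathrm{ALive}_{\mathcal{T}_2}, \mathcal{T}).
\end{aligned}
\]

Definition 5.1.15 can straightforwardly be generalized for a finite set $\{\beta_1, \ldots, \beta_n\}$
of abstract liveness environments. For all $l$ satisfying $1 \le l \le n$, let $\beta_l =
(\mathcal{T}^e_l, \mathrm{ASharing}_{\mathcal{T}_l}, \mathrm{ALive}_{\mathcal{T}_l})$, then we define

\[
\mathrm{LiveUpp}(\{\ldots, \beta_l, \ldots\}) = \mathrm{LiveUpp}(\beta_1, \mathrm{LiveUpp}(\ldots \mathrm{LiveUpp}(\beta_{n-1}, \beta_n) \ldots)).
\]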
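
Lemma 5.1.16 For two abstract liveness environments $(\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1},
\mathrm{ALive}_{\mathcal{T}_1})$ and $(\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})$ such that $\mathrm{arity}(\mathcal{T}_1[\epsilon]) = \mathrm{arity}(\mathcal{T}_2[\epsilon])$,
let $\mathrm{LiveUpp}((\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}), (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2})) = (\mathcal{T}^e,
\mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}})$. Then

\[
\begin{aligned}
& (\mathcal{T}^e_1, \mathrm{ASharing}_{\mathcal{T}_1}, \mathrm{ALive}_{\mathcal{T}_1}) \le_{Lv} (\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}}) \;\& \\
& (\mathcal{T}^e_2, \mathrm{ASharing}_{\mathcal{T}_2}, \mathrm{ALive}_{\mathcal{T}_2}) \le_{Lv} (\mathcal{T}^e, \mathrm{ASharing}_{\mathcal{T}}, \mathrm{ALive}_{\mathcal{T}}).
\end{aligned}
\]

Proof. This follows from the property of Upp and the definitions of $\le_{Lv}$ and
LiveUpp. □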

5.2 Primitive Operations


In this section, we specify the primitive operations for the concrete and abstract
domain of liveness environments (unification, procedure entry, and procedure
exit), and we formally prove the safety theorems for the abstract operations
with respect to the concrete operations. The specifications are based on the
operations for the sharing analysis (see Section 4.2). Only minor extensions are
necessary to incorporate the liveness component.

5.2.1 Unification
As for the sharing analysis, we specify separately a concrete and an abstract liveness
unification for the two basic forms of unification that can occur in normal-form
Prolog programs: Xi = Xj and Xi = f(Xi1, ..., Xin). The unification
operation has to ensure that whenever the term environment becomes further
instantiated due to the unification, the associated liveness component of the
concrete environment is closed with respect to the TermShift operation (see Sec-
tion 5.1.2). Changes in the global liveness of terms that are due to an update of
the sharing component are not made explicit.

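5.2.1.1 Xi = Xj
Definition 5.2.1 For a concrete liveness environment $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}},
\mathrm{CLive}_{\mathcal{K}_{in}})$ and $1 \le i < j \le \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$,

\[
\begin{aligned}
& \mathrm{LiveUnify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}, \mathrm{CLive}_{\mathcal{K}_{in}}), i, j) = \\
& \quad \text{if } \mathrm{mgu}(\mathcal{K}_{in}/i, \mathcal{K}_{in}/j) \text{ is fail} \\
& \quad \text{then fail} \\
& \quad \text{else } (\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}, \mathrm{CLive}_{\mathcal{K}_{out}}),
\end{aligned}
\]

where

\[
\begin{aligned}
(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}) &= \mathrm{Unify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}), i, j), \\
\mathrm{CLive}_{\mathcal{K}_{out}} &= \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}).
\end{aligned}
\]

The following lemma states that the concrete unification operation LiveUnify(-, i,
j) is well-defined on the domain of concrete liveness environments.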
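
Lemma 5.2.2 Let $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}, \mathrm{CLive}_{\mathcal{K}_{in}})$ be a concrete liveness environment.
Let $(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}, \mathrm{CLive}_{\mathcal{K}_{out}}) = \mathrm{LiveUnify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}},
\mathrm{CLive}_{\mathcal{K}_{in}}), i, j)$ (LiveUnify does not fail), where $1 \le i < j \le \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$. Then
$(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}, \mathrm{CLive}_{\mathcal{K}_{out}})$ is a concrete liveness environment.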

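Proof. By the definition of LiveUnify and Lemma 4.2.5 (well-definedness of the
Unify(-, i, j) operation), we know that $(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}})$ is a concrete sharing
environment. From $\mathrm{CLive}_{\mathcal{K}_{out}} = \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}})$ and the idempotence
of the TermShift operation, it follows that $\mathrm{CLive}_{\mathcal{K}_{out}}$ is closed under
$\mathrm{TermShift}(\mathcal{K}_{out}, \cdot)$. We still have to prove that $\mathrm{CLive}_{\mathcal{K}_{out}} = \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}_{out}},
\mathrm{CShr}^1_{\mathcal{K}_{out}})$. We first prove the following property.

\[
\begin{aligned}
& \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}_{in}}, \mathrm{CShr}^1_{\mathcal{K}_{in}})) \\
& \quad = \mathrm{LiveClosure}(\mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}), \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^1_{\mathcal{K}_{in}})).
\end{aligned}
\tag{5.6}
\]

For an arbitrary element $s \in \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}_{in}}, \mathrm{CShr}^1_{\mathcal{K}_{in}}))$, it
follows from the definitions of TermShift and LiveClosure that

\[
\exists s', t : s = s'.t \in \mathcal{O}(\mathcal{K}_{out}) \;\&\; (s' \in \mathrm{CLive}_{\mathcal{K}_{in}} \lor (\exists r' \in \mathrm{CLive}_{\mathcal{K}_{in}} : (r', s') \in \mathrm{CShr}^1_{\mathcal{K}_{in}})).
\]

Applying once more the definition of TermShift, we have

\[
\begin{aligned}
\exists s', t :\; & s = s'.t \in \mathcal{O}(\mathcal{K}_{out}) \;\&\; (s'.t \in \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}) \;\lor \\
& (\exists r'.t \in \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}) : (r'.t, s'.t) \in \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^1_{\mathcal{K}_{in}}))).
\end{aligned}
\]

Hence, by the definition of LiveClosure,

\[
s \in \mathrm{LiveClosure}(\mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}), \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^1_{\mathcal{K}_{in}})),
\]

which proves (5.6). Next, we observe that

\[
\begin{aligned}
\mathrm{CLive}_{\mathcal{K}_{out}} &= \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}), \\
&\qquad \text{by the definition of LiveUnify,} \\
&= \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}_{in}}, \mathrm{CShr}^1_{\mathcal{K}_{in}})), \\
&\qquad \text{by the assumptions of the lemma,} \\
&= \mathrm{LiveClosure}(\mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CLive}_{\mathcal{K}_{in}}), \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^1_{\mathcal{K}_{in}})), \\
&\qquad \text{by (5.6) proved above,} \\
&= \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}_{out}}, \mathrm{TermShift}(\mathcal{K}_{out}, \mathrm{CShr}^1_{\mathcal{K}_{in}})), \\
&\qquad \text{by the definition of LiveUnify,} \\
&= \mathrm{LiveClosure}(\mathrm{CLive}_{\mathcal{K}_{out}}, \mathrm{CShr}^1_{\mathcal{K}_{out}}), \\
&\qquad \text{by the definition of Unify.}
\end{aligned}
\]

We conclude that $\mathrm{CLive}_{\mathcal{K}_{out}}$ is a concrete liveness component for $(\mathcal{K}_{out},
\mathrm{CSharing}_{\mathcal{K}_{out}})$. □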
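
Definition 5.2.3 For an abstract liveness environment $(\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}},
\mathrm{ALive}_{\mathcal{T}_{in}})$ and $1 \le i < j \le \mathrm{arity}(\mathcal{T}_{in}[\epsilon])$,

\[
\begin{aligned}
& \mathrm{AbstrLiveUnify}((\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}, \mathrm{ALive}_{\mathcal{T}_{in}}), i, j) = \\
& \quad \text{if } \mathrm{TGUnify}(\mathcal{T}^e_{in}, i, j) \text{ is fail} \\
& \quad \text{then fail} \\
& \quad \text{else } (\mathcal{T}^e_{out}, \mathrm{ASharing}_{\mathcal{T}_{out}}, \mathrm{ALive}_{\mathcal{T}_{out}}),
\end{aligned}
\]

where

\[
\begin{aligned}
(\mathcal{T}^e_{out}, \mathrm{ASharing}_{\mathcal{T}_{out}}) &= \mathrm{AbstrUnify}((\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}), i, j), \\
\mathrm{ALive}_{\mathcal{T}_{out}} &= \mathrm{Convert}(\mathcal{T}_{in}, \mathrm{ALive}_{\mathcal{T}_{in}}, \mathcal{T}_{out}).
\end{aligned}
\]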

We still have to generalize the Convert procedure for liveness components. The
function Convert takes an abstract liveness component for a type graph $\mathcal{T}_{in}$ and
reexpresses it for a type graph $\mathcal{T}_{out}$. In particular, the function will be used for a
type graph $\mathcal{T}_{out}$ that describes a superset of instantiations of terms represented
by the type graph $\mathcal{T}_{in}$.

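Definition 5.2.4 (Generalization of Definition 4.2.8) For abstract term
environments $\mathcal{T}^e_{in}$, $\mathcal{T}^e_{out}$ such that $\mathrm{arity}(\mathcal{T}_{in}[\epsilon]) = \mathrm{arity}(\mathcal{T}_{out}[\epsilon])$, and $L_{\mathcal{T}_{in}}$ a subset
of $\mathrm{Nodes}_{\mathcal{T}_{in}}$ consisting of nodes not labeled 'Or', define

\[
\mathrm{Convert}(\mathcal{T}_{in}, L_{\mathcal{T}_{in}}, \mathcal{T}_{out}) =
\left\{ \mathcal{T}_{out}[R] \;\middle|\;
\begin{array}{l}
R \in \mathcal{S}(\mathcal{T}_{out}) \;\&\; ( \\
\quad [R \in \mathcal{S}(\mathcal{T}_{in}) \;\&\; \mathcal{T}_{in}[R] \in \mathrm{TGShift}(\mathcal{T}_{in}, L_{\mathcal{T}_{in}})] \\
\quad \lor \\
\quad [\exists R', k, f : R = R'.(k, f) \;\&\; R'.(k, V) \in \mathcal{S}(\mathcal{T}_{in}) \\
\qquad \&\; f \text{ is a functor or a constant } \& \\
\qquad \mathcal{T}_{in}[R'.(k, V)] \in \mathrm{TGShift}(\mathcal{T}_{in}, L_{\mathcal{T}_{in}})] )
\end{array}
\right\}
\]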

The following theorem gives the correctness condition for the liveness unification.

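Theorem 5.2.5 (Safety of AbstrLiveUnify(-, i, j)) Assuming the following
conditions

1. $(\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}, \mathrm{ALive}_{\mathcal{T}_{in}})$ is an abstract liveness environment
2. $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}, \mathrm{CLive}_{\mathcal{K}_{in}})$ is a concrete liveness environment
3. $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}, \mathrm{CLive}_{\mathcal{K}_{in}}) \in \mathrm{LiveEnvConc}((\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}, \mathrm{ALive}_{\mathcal{T}_{in}}))$,
   that is,
   (a) $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}) \in \mathrm{InstrEnvConc}((\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}}))$
   (b) $\mathrm{CLive}_{\mathcal{K}_{in}}$ is a concrete liveness component for $(\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}})$
   (c) $\mathrm{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathrm{CLive}_{\mathcal{K}_{in}}) \subseteq \mathrm{TGShift}(\mathcal{T}_{in}, \mathrm{ALive}_{\mathcal{T}_{in}})$
4. $1 \le i < j \le \mathrm{arity}(\mathcal{T}_{in}[\epsilon]) = \mathrm{arity}(\mathcal{K}_{in}[\epsilon])$
5. $(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}, \mathrm{CLive}_{\mathcal{K}_{out}}) = \mathrm{LiveUnify}((\mathcal{K}_{in}, \mathrm{CSharing}_{\mathcal{K}_{in}}, \mathrm{CLive}_{\mathcal{K}_{in}}),
   i, j)$ is not fail
6. $(\mathcal{T}^e_{out}, \mathrm{ASharing}_{\mathcal{T}_{out}}, \mathrm{ALive}_{\mathcal{T}_{out}}) = \mathrm{AbstrLiveUnify}((\mathcal{T}^e_{in}, \mathrm{ASharing}_{\mathcal{T}_{in}},
   \mathrm{ALive}_{\mathcal{T}_{in}}), i, j)$

it follows that

\[
(\mathcal{K}_{out}, \mathrm{CSharing}_{\mathcal{K}_{out}}, \mathrm{CLive}_{\mathcal{K}_{out}}) \in \mathrm{LiveEnvConc}((\mathcal{T}^e_{out}, \mathrm{ASharing}_{\mathcal{T}_{out}}, \mathrm{ALive}_{\mathcal{T}_{out}})).
\]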


The proof of the theorem (see page 152) uses generalizations for liveness
components of results given in Section 4.2 for sharing components.

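Lemma 5.2.6 (Safety of TGShift) For an abstract term environment $\mathcal{T}^e$ and a term $\mathcal{K} \in \mathit{TGEnvConc}(\mathcal{T}^e)$, let $L_{\mathcal{K}} \subseteq \mathcal{O}(\mathcal{K})$. It follows that

\[ \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathit{TermShift}(\mathcal{K}, L_{\mathcal{K}})) \subseteq \mathit{TGShift}(\mathcal{T}, \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}})). \]

Proof. Expanding the definitions of $\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}$ and TermShift we have

\[ \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathit{TermShift}(\mathcal{K}, L_{\mathcal{K}})) = \{ \mathcal{T}[\mathit{Sel}(\mathcal{K}, r.t)] \mid r \in L_{\mathcal{K}} \;\&\; r.t \in \mathcal{O}(\mathcal{K}) \}. \tag{5.7} \]

Fix arbitrary $r, t$ such that $r \in L_{\mathcal{K}}$ and $r.t \in \mathcal{O}(\mathcal{K})$. From $\mathcal{K} \in \mathit{TGEnvConc}(\mathcal{T}^e)$ and Property 4.1.13, it follows that $\mathit{Sel}(\mathcal{K}, r.t) \in S(\mathcal{T})$. By the definition of Sel, we have that for $S = \mathit{Sel}(\mathcal{K}/r, t)$,

\[ \mathit{Sel}(\mathcal{K}, r.t) = \mathit{Sel}(\mathcal{K}, r).S. \]

It follows that

$\mathcal{T}[\mathit{Sel}(\mathcal{K}, r.t)] \in \mathit{TGShift}(\mathcal{T}, \{\mathcal{T}[\mathit{Sel}(\mathcal{K}, r)]\})$
    by the definition of TGShift,
$= \mathit{TGShift}(\mathcal{T}, \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(\{r\}))$
    by the definition of $\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}$.

In summary,

\[ \mathcal{T}[\mathit{Sel}(\mathcal{K}, r.t)] \in \mathit{TGShift}(\mathcal{T}, \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(\{r\})). \tag{5.8} \]

The lemma follows from (5.7), (5.8), and the monotonicity of TGShift and $\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}$. □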
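
The concrete side of this lemma can be made tangible with a small sketch. It assumes a hypothetical encoding that is not part of the text: compound terms are Python tuples `(functor, arg1, ..., argn)`, atoms and variables are strings, and a path is a tuple of argument positions. Under that assumption, `term_shift` mirrors the expansion used in (5.7): it collects every occurrence `r.t` of the term tuple that extends some path `r` of the given set.

```python
def subterm_paths(term):
    """Yield every path into a term; () denotes the term itself."""
    yield ()
    if isinstance(term, tuple):          # compound term: (functor, arg1, ..., argn)
        for k, arg in enumerate(term[1:], start=1):
            for rest in subterm_paths(arg):
                yield (k,) + rest


def occurrences(env):
    """Paths k.s of a term tuple (K_1, ..., K_n); stands in for O(K)."""
    return {(k,) + s
            for k, term in enumerate(env, start=1)
            for s in subterm_paths(term)}


def term_shift(env, paths):
    """All occurrences r.t of env that extend some path r in `paths`."""
    return {p for p in occurrences(env)
            if any(p[:len(r)] == r for r in paths)}


# Hypothetical environment: K_1 is the list [a, b], K_2 is a variable.
env = [(".", "a", (".", "b", "nil")), "X"]
print(sorted(term_shift(env, {(1, 2)})))
```

Marking the tail of the list in the first argument as live makes all of its subterms live as well, which is the closure property the liveness components are required to have.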

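Lemma 5.2.7 (Safety of Convert) Let $\mathcal{T}_{in}^e$, $\mathcal{T}_{out}^e$ be two abstract term environments, such that $\mathit{arity}(\mathcal{T}_{in}[\epsilon]) = \mathit{arity}(\mathcal{T}_{out}[\epsilon])$, and $\mathcal{K}_{in} \in \mathit{TGEnvConc}(\mathcal{T}_{in}^e)$, $\mathcal{K}_{out} \in \mathit{TGEnvConc}(\mathcal{T}_{out}^e)$, such that $\mathcal{K}_{out} \equiv \mathcal{K}_{in}\sigma$ for some substitution $\sigma$. For $L_{\mathcal{K}_{in}}$ a subset of $\mathcal{O}(\mathcal{K}_{in})$, we have that

\[ \mathit{AbsNode}_{\mathcal{T}_{out},\mathcal{K}_{out}}(L_{\mathcal{K}_{in}}) \subseteq \mathit{Convert}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}}), \mathcal{T}_{out}). \]

Proof. Consider any $r \in L_{\mathcal{K}_{in}}$. Applying the definition of $\mathit{AbsNode}_{\mathcal{T}_{out},\mathcal{K}_{out}}$, we need to show that

\[ \mathcal{T}_{out}[\mathit{Sel}(\mathcal{K}_{out}, r)] \in \mathit{Convert}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}}), \mathcal{T}_{out}). \]

From Definition 5.1.5, we have $r \in \mathcal{O}(\mathcal{K}_{in})$. From $\mathcal{K}_{in} \in \mathit{TGEnvConc}(\mathcal{T}_{in}^e)$ and Property 4.1.13, we have

\[ \mathit{Sel}(\mathcal{K}_{in}, r) \in S(\mathcal{T}_{in}). \tag{5.9} \]

It follows from $\mathcal{K}_{out} \equiv \mathcal{K}_{in}\sigma$ and the definition of substitution application, that $\mathcal{O}(\mathcal{K}_{in}) \subseteq \mathcal{O}(\mathcal{K}_{out})$. Hence $r \in \mathcal{O}(\mathcal{K}_{out})$, so from $\mathcal{K}_{out} \in \mathit{TGEnvConc}(\mathcal{T}_{out}^e)$ and Property 4.1.13, we know that

\[ \mathit{Sel}(\mathcal{K}_{out}, r) \in S(\mathcal{T}_{out}). \tag{5.10} \]

Now, there are two cases.

case 1: ($\mathcal{K}_{in}/r \notin \mathit{Vars}$). From $\mathcal{K}_{out} \equiv \mathcal{K}_{in}\sigma$ and the definition of substitution application, we have that

\[ \mathit{Sel}(\mathcal{K}_{out}, r) = \mathit{Sel}(\mathcal{K}_{in}, r). \tag{5.11} \]

We then observe that

$\mathcal{T}_{in}[\mathit{Sel}(\mathcal{K}_{out}, r)] = \mathcal{T}_{in}[\mathit{Sel}(\mathcal{K}_{in}, r)]$,
    by (5.11),
$\in \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}})$,
    from $r \in L_{\mathcal{K}_{in}}$ and the definition of $\mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}$,
$\subseteq \mathit{TGShift}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}}))$,
    by the definition of TGShift.

Summarizing, we have

\[ \mathcal{T}_{in}[\mathit{Sel}(\mathcal{K}_{out}, r)] \in \mathit{TGShift}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}})). \tag{5.12} \]

The desired result is now obtained from (5.9), (5.10), (5.11), (5.12) and an application of the first disjunct in the definition of Convert (Definition 5.2.4, page 149).

case 2: ($\mathcal{K}_{in}/r \in \mathit{Vars}$). Here we have

\[ \mathit{Sel}(\mathcal{K}_{in}, r) = \mathit{Sel}(\mathcal{K}_{in}, r').\langle k, V \rangle, \tag{5.13} \]

and

\[ \mathit{Sel}(\mathcal{K}_{out}, r) = \mathit{Sel}(\mathcal{K}_{in}, r').\langle k, f \rangle, \tag{5.14} \]

for $r' \in \mathcal{O}(\mathcal{K}_{in})$, and $k \in \mathbb{N}$ such that $r = r'.k$, and $f$ either $V$, a functor or a constant, depending on whether the substitution $\sigma$ further instantiates $\mathcal{K}_{in}/r$. If the variable is not instantiated by the substitution, then $\mathit{Sel}(\mathcal{K}_{out}, r) = \mathit{Sel}(\mathcal{K}_{in}, r)$, and the result is obtained as in case 1 above. Otherwise we observe that

$\mathcal{T}_{in}[\mathit{Sel}(\mathcal{K}_{in}, r)] \in \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}})$,
    from $r \in L_{\mathcal{K}_{in}}$ and the definition of $\mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}$,
$\subseteq \mathit{TGShift}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}}))$,
    by the definition of TGShift.

Summarizing, we have

\[ \mathcal{T}_{in}[\mathit{Sel}(\mathcal{K}_{in}, r)] \in \mathit{TGShift}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(L_{\mathcal{K}_{in}})). \tag{5.15} \]

The desired result is now obtained from (5.9), (5.10), (5.13), (5.14), (5.15) and an application of the second disjunct in the definition of Convert (Definition 5.2.4, page 149).
□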
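Proof. [Theorem 5.2.5 (Safety of AbstrLiveUnify(-, i, j))]
We must prove that

\[ (\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}})), \]

that is, the following conditions must be satisfied.
1. $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}) \in \mathit{InstrEnvConc}((\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}))$.
2. $\mathit{CLive}_{\mathcal{K}_{out}}$ is a concrete liveness component for $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}})$.
3. $\mathit{AbsNode}_{\mathcal{T}_{out},\mathcal{K}_{out}}(\mathit{CLive}_{\mathcal{K}_{out}}) \subseteq \mathit{TGShift}(\mathcal{T}_{out}, \mathit{ALive}_{\mathcal{T}_{out}})$.

Condition 1. is a restatement of Theorem 4.2.11. Condition 2. follows from Lemma 5.2.2. Using the assumptions of Theorem 5.2.5.5 and 5.2.5.6 and expanding the definitions of LiveUnify and AbstrLiveUnify, we can rewrite Condition 3. as

\[ \mathit{AbsNode}_{\mathcal{T}_{out},\mathcal{K}_{out}}(\mathit{TermShift}(\mathcal{K}_{out}, \mathit{CLive}_{\mathcal{K}_{in}})) \subseteq \mathit{TGShift}(\mathcal{T}_{out}, \mathit{Convert}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}}, \mathcal{T}_{out})). \tag{5.16} \]

We observe that

$\mathit{AbsNode}_{\mathcal{T}_{out},\mathcal{K}_{out}}(\mathit{TermShift}(\mathcal{K}_{out}, \mathit{CLive}_{\mathcal{K}_{in}}))$
$\subseteq \mathit{TGShift}(\mathcal{T}_{out}, \mathit{AbsNode}_{\mathcal{T}_{out},\mathcal{K}_{out}}(\mathit{CLive}_{\mathcal{K}_{in}}))$
    by Lemma 5.2.6 (the safety of TGShift),
$\subseteq \mathit{TGShift}(\mathcal{T}_{out}, \mathit{Convert}(\mathcal{T}_{in}, \mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathit{CLive}_{\mathcal{K}_{in}}), \mathcal{T}_{out}))$
    by Lemma 5.2.7 (the safety of Convert),
    and the monotonicity of TGShift,
$\subseteq \mathit{TGShift}(\mathcal{T}_{out}, \mathit{Convert}(\mathcal{T}_{in}, \mathit{TGShift}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}}), \mathcal{T}_{out}))$
    by the assumption of Theorem 5.2.5.3c,
    and the monotonicity of TGShift and Convert,
$= \mathit{TGShift}(\mathcal{T}_{out}, \mathit{Convert}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}}, \mathcal{T}_{out}))$
    by the definition of Convert,
    and the idempotence of TGShift.

Summarizing, we get (5.16), as desired. □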

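5.2.1.2 $X_i = f(X_{i_1}, \ldots, X_{i_j})$

Definition 5.2.8 For a concrete liveness environment $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ and $1 \le i, i_1, \ldots, i_j \le \mathit{arity}(\mathcal{K}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are pairwise distinct, and $f$ is a functor of arity $j$,

$\mathit{LiveUnify}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i, f(i_1, \ldots, i_j)) =$

    if $\mathit{mgu}(\mathcal{K}_{in}/i, f(\mathcal{K}_{in}/i_1, \ldots, \mathcal{K}_{in}/i_j))$ is fail
    then fail
    else $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}})$,
where
    $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}) = \mathit{Unify}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}), i, f(i_1, \ldots, i_j))$,
    $\mathit{CLive}_{\mathcal{K}_{out}} = \mathit{TermShift}(\mathcal{K}_{out}, \mathit{CLive}_{\mathcal{K}_{in}})$.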

The following lemma states that the unification operation $\mathit{LiveUnify}(-, i, f(i_1, \ldots, i_j))$ is well-defined on the domain of concrete liveness environments.

Lemma 5.2.9 Let $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ be a concrete liveness environment. Let $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) = \mathit{LiveUnify}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i, f(i_1, \ldots, i_j))$ (i.e. LiveUnify does not fail), where $1 \le i, i_1, \ldots, i_j \le \mathit{arity}(\mathcal{K}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are pairwise distinct and $f$ is a functor of arity $j$. Then $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}})$ is a concrete liveness environment.

Proof. By the definition of LiveUnify and Lemma 4.2.19 (well-definedness of the operation $\mathit{Unify}(-, i, f(i_1, \ldots, i_j))$), we know that $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}})$ is a concrete sharing environment.
The proof that $\mathit{CLive}_{\mathcal{K}_{out}}$ is a concrete liveness component for $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}})$ is similar to the proof of Lemma 5.2.2. □

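Definition 5.2.10 For an abstract liveness environment $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ and $1 \le i, i_1, \ldots, i_j \le \mathit{arity}(\mathcal{T}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are pairwise distinct, and $f$ is a functor of arity $j$,

$\mathit{AbstrLiveUnify}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), i, f(i_1, \ldots, i_j)) =$

    if $\mathit{TGUnify}(\mathcal{T}_{in}^e, i, f(i_1, \ldots, i_j))$ is fail
    then fail
    else $(\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}})$,
where
    $(\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}) = \mathit{AbstrUnify}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}), i, f(i_1, \ldots, i_j))$,
    $\mathit{ALive}_{\mathcal{T}_{out}} = \mathit{Convert}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}}, \mathcal{T}_{out})$.

The following theorem gives the correctness condition for the liveness unification.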

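Theorem 5.2.11 (Safety of AbstrLiveUnify(-, i, f(i_1, ..., i_j))) Assuming the following conditions

1. $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ is an abstract liveness environment
2. $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ is a concrete liveness environment
3. $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}))$
4. $1 \le i, i_1, \ldots, i_j \le \mathit{arity}(\mathcal{T}_{in}[\epsilon]) = \mathit{arity}(\mathcal{K}_{in}[\epsilon])$ such that $i, i_1, \ldots, i_j$ are pairwise distinct and $f$ is a functor of arity $j$
5. $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) = \mathit{LiveUnify}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i, f(i_1, \ldots, i_j))$ is not fail
6. $(\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}}) = \mathit{AbstrLiveUnify}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), i, f(i_1, \ldots, i_j))$

it follows that

\[ (\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}})). \]

Proof. The proof of this theorem is similar to the proof of Theorem 5.2.5, except that Theorem 4.2.23 is used instead of Theorem 4.2.11, and Lemma 5.2.9 instead of Lemma 5.2.2. □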

5.2.2 Procedure Entry


As for the sharing analysis, we consider only the first (non-trivial) step of the procedure-entry operation: the restriction of the liveness environment to the variables occurring in the call. The computation of the liveness information that has to be passed down on procedure entry requires knowledge of the variables that occur in the tail of the current goal statement. This set, $\mathit{dom}_{tail}$, is easily obtained for each program point from the source text of the current clause. (Remember that a left-to-right computation rule is assumed.¹) Given such a set, an auxiliary restriction operation $|_{\mathit{dom}_{tail}}$ for liveness components can be defined.

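Definition 5.2.12 For a concrete term $\mathcal{K} = (\mathcal{K}_1, \ldots, \mathcal{K}_n)$, a set $L_{\mathcal{K}} \subseteq \mathcal{O}(\mathcal{K})$, and $\mathit{dom}_{tail} \subseteq \{1, \ldots, n\}$, we define

\[ L_{\mathcal{K}}|_{\mathit{dom}_{tail}} = \{ k.s \mid k \in \mathit{dom}_{tail} \;\&\; k.s \in L_{\mathcal{K}} \}. \]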
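
As a minimal sketch of this restriction, assume (hypothetically, for illustration only) that a path `k.s` is encoded as a Python tuple whose first component is the argument position `k`. The restriction then keeps exactly the paths that start in one of the tail positions.

```python
def restrict_to_tail(paths, dom_tail):
    """Keep the paths k.s whose leading argument position k is in dom_tail."""
    return {p for p in paths if p and p[0] in dom_tail}


# Hypothetical liveness component over an environment with four arguments.
live = {(1,), (1, 2), (3,), (4, 1)}
print(sorted(restrict_to_tail(live, {3, 4})))
```

With tail positions 3 and 4, the two paths below argument 1 are dropped and the paths `(3,)` and `(4, 1)` remain.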


The arguments of the restriction operation that is the first step of the procedure-entry operation comprise the abstract liveness environment at the call point, the $i_{call}$ injection determined by the variables of the call (see Section 4.2.2) and the set $\mathit{dom}_{tail}$ at the call point. The operation always succeeds, because of the normal form of programs. Note that the full liveness component is computed as an intermediate step, before applying the function Proj that selects the parts of the environment pertaining to the call.

¹ The abstract interpretation procedure constructing the AND-OR-graph is based on the left-to-right computation rule.

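Definition 5.2.13 For a concrete liveness environment $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ such that $\mathit{arity}(\mathcal{K}_{in}[\epsilon]) = n$, $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$, $\mathit{dom}_{tail} \subseteq \{1, \ldots, n\}$, and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$,

\[ \mathit{LiveRestrict}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i_{call}, \mathit{dom}_{tail}) = (\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}) \]

where

$(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}) = \mathit{Restrict}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}), i_{call})$,

$\mathit{CLive}_{\mathcal{K}_{rstr}} = \mathit{Proj}(\mathit{LiveClosure}(\mathit{CLive}_{\mathcal{K}_{in}}, \mathit{CSharing}_{\mathcal{K}_{in}}) \cup \mathit{LiveClosure}(\mathcal{O}(\mathcal{K}_{in})|_{\mathit{dom}_{tail}}, \mathit{CSharing}_{\mathcal{K}_{in}}), i_{call})$.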
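
The liveness part of this definition can be sketched as a short pipeline. The encoding below is an assumption made for illustration and is not taken from the text: paths are tuples of argument positions, a sharing component is a set of unordered pairs (`frozenset`s) of paths, and the injection $i_{call}$ is a dictionary from call positions to environment positions. `live_closure` follows the one-step reading of LiveClosure, and `proj` renames the leading position of each path back through the injection.

```python
def live_closure(live, sharing):
    """Add every path that is paired in `sharing` with an already live path."""
    live = set(live)
    closed = set(live)
    for pair in sharing:                 # pair: frozenset of one or two paths
        if pair & live:
            closed |= pair
    return closed


def restrict_to_tail(paths, dom_tail):
    """Paths k.s with k in dom_tail."""
    return {p for p in paths if p and p[0] in dom_tail}


def proj(paths, i_call):
    """Paths k.s such that i_call(k).s is in `paths`."""
    inverse = {n: k for k, n in i_call.items()}
    return {(inverse[p[0]],) + p[1:] for p in paths if p and p[0] in inverse}


def live_restrict_component(clive, csharing, occ, i_call, dom_tail):
    """Liveness component passed to the callee (sketch of CLive_rstr)."""
    full = (live_closure(clive, csharing)
            | live_closure(restrict_to_tail(occ, dom_tail), csharing))
    return proj(full, i_call)


# Hypothetical call point: four clause variables, the call uses positions 2 and 3,
# position 4 occurs in the tail, and paths 2.1 and 4.1 share.
occ = {(1,), (2,), (2, 1), (3,), (4,), (4, 1)}
csharing = {frozenset({(2, 1), (4, 1)})}
result = live_restrict_component({(1,)}, csharing, occ, {1: 2, 2: 3}, {4})
print(result)
```

In this example the subterm at `2.1` becomes live for the callee only because it shares with a term that is still needed in the tail of the clause; after projection it appears as path `1.1` of the call.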

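The following lemma states that the operation LiveRestrict is well-defined on the set of concrete liveness environments.

Lemma 5.2.14 Let $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ be a concrete liveness environment such that $\mathit{arity}(\mathcal{K}_{in}[\epsilon]) = n$, $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$, $\mathit{dom}_{tail} \subseteq \{1, \ldots, n\}$, and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$. Let $(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}) = \mathit{LiveRestrict}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i_{call}, \mathit{dom}_{tail})$. Then $(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}})$ is a concrete liveness environment.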

Proof. By the definition of LiveRestrict and Lemma 4.2.28, we know that (K~,tr,
CSharingpc.,,./is a concrete sharing environment. From

CLivex: .... = Pro](LiveClosure(CLivex:,~, CSharingjc..)


u LiveOosure(O(PC.Oldom~o,vcsharing~:,~),i~.),
because CSharing~. ,,., O(K;i,~)laA__uu,.tai,.and CLivejc~,~ are closed under TermShift,
a property that is preserved by the functions Proj and LiveCIosure, we have that
s .... is closed under TermShift. We still have to prove that Ckive~c.... =
LiveCIosure(Ckivejc.,,., CShr~c.,,. ). From the definition of LiveClosure, it is clear
that fiLive~c.,,. C kiveClosure(fikivejc .... , CShr~c.,,. ). So, we only have to prove
that CLivejc..,. _D Livefilosure(Ckivejc ..... CShr~c.... ). Let

Lpc~,, = LiveCIosure(CLivetq., CSharing~.,)


U LiveC osure(O(lCi,Oldomtaa , CSharing~c,,~),

then by the definition of LiveRestrict, CLivejc . . . . = Proj(Ljq., icau). By the defini-


tion of Restrict, CShr~c. . . . = Proj(CSharing~q., icau). We first prove the following
property.

LiveClosore(Pro](Ljc,~,
icGzz),Pro~(CSharimg~,it.ll))
C__ Pro](LiveClosore(bc,~, CSh~ring,c,.),i~.). (s.iw)

For an arbitraryelement s 9 LiveClosure(Proj(Lx:~,ic=z~),Proj(CSharing~., i~=~t)),


it follows from the definitionof LiveClosurethat

s e Proj(Ljc,.., icszz) V (qr 9 Proj(L~c,.., ic=ll): (r,s) 9 Proj(CSharingjc,.~, ieazz)).


From the definition of Proj, we know that there exist k, s' such that s = k.s' and

i c.,(k).s 9 v 9 (ioo,(0.r ico,(k).r 9 CSSsring ,.).


Applying the definition of LiveClosure, we have ic=zz(k).s' 9 LiveClosure(Ljc~,
CSharingjc,.,), i.e., s : k.s' 9 Proj(LiveClosure(Ljc,.,, CSharingjc,,), i~a.) and (5.17)
is proved.
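Next, we observe that

$\mathit{LiveClosure}(L_{\mathcal{K}_{in}}, \mathit{CSharing}_{\mathcal{K}_{in}})$
$= \mathit{LiveClosure}(\mathit{LiveClosure}(\mathit{CLive}_{\mathcal{K}_{in}}, \mathit{CSharing}_{\mathcal{K}_{in}}) \cup \mathit{LiveClosure}(\mathcal{O}(\mathcal{K}_{in})|_{\mathit{dom}_{tail}}, \mathit{CSharing}_{\mathcal{K}_{in}}), \mathit{CSharing}_{\mathcal{K}_{in}})$,
    by the definition of $L_{\mathcal{K}_{in}}$,
$= \mathit{LiveClosure}(\mathit{LiveClosure}(\mathit{CLive}_{\mathcal{K}_{in}}, \mathit{CSharing}_{\mathcal{K}_{in}}), \mathit{CSharing}_{\mathcal{K}_{in}}) \cup \mathit{LiveClosure}(\mathit{LiveClosure}(\mathcal{O}(\mathcal{K}_{in})|_{\mathit{dom}_{tail}}, \mathit{CSharing}_{\mathcal{K}_{in}}), \mathit{CSharing}_{\mathcal{K}_{in}})$,
    by the definition of LiveClosure,
$= \mathit{LiveClosure}(\mathit{CLive}_{\mathcal{K}_{in}}, \mathit{CSharing}_{\mathcal{K}_{in}}) \cup \mathit{LiveClosure}(\mathcal{O}(\mathcal{K}_{in})|_{\mathit{dom}_{tail}}, \mathit{CSharing}_{\mathcal{K}_{in}})$,
    because $\mathit{CSharing}_{\mathcal{K}_{in}}$ is transitively closed,
    and thus $\mathit{LiveClosure}(-, \mathit{CSharing}_{\mathcal{K}_{in}})$ idempotent,
$= L_{\mathcal{K}_{in}}$,
    by the definition of $L_{\mathcal{K}_{in}}$.

Summarizing, we have

\[ \mathit{LiveClosure}(L_{\mathcal{K}_{in}}, \mathit{CSharing}_{\mathcal{K}_{in}}) = L_{\mathcal{K}_{in}}. \tag{5.18} \]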
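
The role of transitive closure in this step can be checked on a toy instance. The sketch below assumes a hypothetical encoding (live elements as strings, sharing as a set of unordered pairs) and implements LiveClosure as a single propagation step. On a sharing relation that is not transitively closed, a second application still adds elements; once the missing pair is added, one application reaches the fixpoint.

```python
def live_closure(live, pairs):
    """One propagation step: add everything paired with a live element."""
    live = set(live)
    closed = set(live)
    for pair in pairs:                   # pair: frozenset of elements
        if pair & live:
            closed |= pair
    return closed


def pair(p, q):
    """Unordered pair of two elements."""
    return frozenset((p, q))


chain = {pair("a", "b"), pair("b", "c")}      # not transitively closed
closed = chain | {pair("a", "c")}             # transitive closure of `chain`

once_chain = live_closure({"a"}, chain)
twice_chain = live_closure(once_chain, chain)
once_closed = live_closure({"a"}, closed)
twice_closed = live_closure(once_closed, closed)
print(once_chain != twice_chain, once_closed == twice_closed)
```

This is only an illustration of why the proof appeals to transitivity; it does not replace the argument given above.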


Finally,

LiveClosure( CLive~c.,,., CShr~c..,.)


= LiveClosure(Proj(Lx:,,., ic=zz), Proj(CSharingx:,., it=u)),
by the definitions of LiveRestrict, Restrict and Ljc~,
C_ Proj(LiveCIosure(Ljc~,~, CSharingx:,.), ic=zz),
by (5.17) proved above,
= Proj(Ljc,,,, ic=lz),
by (5.18),
= CLivejc..,.,
by the definitions of Ljc., and LiveRestrict.

Summarizing, we have LiveClosure(CLivejc..,~, CShr~c..,. ) C_ CLivejc.o,., as de-


sired. 0

To define the abstract liveness restriction operation, we first introduce a type-graph analogue of the $|_{\mathit{dom}_{tail}}$ operation. Also, we generalize the AbstrProj function (Definition 4.2.29) for liveness components, and we allow the LiveClosure operation to be applied on an abstract liveness and sharing component.

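Definition 5.2.15 For an abstract term environment $\mathcal{T}^e$ with $\mathit{arity}(\mathcal{T}[\epsilon]) = n$, a set $L_{\mathcal{T}}$ of nodes $q \in \mathit{Nodes}_{\mathcal{T}}$ such that $\mathit{Label}(q) \ne$ 'Or' and $P_{\mathcal{T}}$ a set of unordered pairs $(p, q)$ such that $p, q \in \mathit{Nodes}_{\mathcal{T}}$ & $\mathit{Label}(p) \ne$ 'Or' & $\mathit{Label}(q) \ne$ 'Or', an injection $i : \{1, \ldots, m\} \to \{1, \ldots, n\}$ (for $m \le n$) such that $\mathcal{T}_{rstr}^e = \mathit{TGRestrict}(\mathcal{T}^e, i)$ and $\mathit{dom}_{tail} \subseteq \{1, \ldots, n\}$, we define

\[ \mathit{AbstrProj}(\mathcal{T}, L_{\mathcal{T}}, \mathcal{T}_{rstr}, i) = \{ \mathcal{T}_{rstr}[\langle k, f \rangle.S] \mid \langle k, f \rangle.S \in S(\mathcal{T}_{rstr}) \;\&\; \langle i(k), f \rangle.S \in S(\mathcal{T}) \;\&\; \mathcal{T}[\langle i(k), f \rangle.S] \in L_{\mathcal{T}} \}, \]

\[ \mathit{Rstr}(\mathcal{T}, L_{\mathcal{T}}, \mathit{dom}_{tail}) = \{ \mathcal{T}[\langle k, f \rangle.S] \mid \langle k, f \rangle.S \in S(\mathcal{T}) \;\&\; \mathcal{T}[\langle k, f \rangle.S] \in L_{\mathcal{T}} \;\&\; k \in \mathit{dom}_{tail} \}, \]

\[ \mathit{LiveClosure}(L_{\mathcal{T}}, P_{\mathcal{T}}) = \{ p \mid p \in L_{\mathcal{T}} \lor (\exists\, q \in L_{\mathcal{T}} : (p, q) \in P_{\mathcal{T}}) \}. \]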

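Definition 5.2.16 For an abstract liveness environment $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ such that $\mathit{arity}(\mathcal{T}_{in}[\epsilon]) = n$, $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$, $\mathit{dom}_{tail} \subseteq \{1, \ldots, n\}$, and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$,

\[ \mathit{AbstrLiveRestrict}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), i_{call}, \mathit{dom}_{tail}) = (\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}}) \]

where
$(\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}) = \mathit{AbstrRestrict}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}), i_{call})$,
$\mathit{TGShift}(\mathcal{T}_{rstr}, \mathit{ALive}_{\mathcal{T}_{rstr}}) = \mathit{AbstrProj}(\mathcal{T}_{in}, C \cup B, \mathcal{T}_{rstr}, i_{call})$,
and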

C = AItLiveClosure(TGShift(T,,,,ALive:q~),TGShift(T~,,,AShr~- ),
TGShift(Ti,~, AShr~2..)) ,
B = LiveCIosure(Rstr(Ti,~, Nodes=r~,domt(~it), ASharingT,~).
The following theorem gives the correctness condition for the AbstrLiveRestrict
operation.

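Theorem 5.2.17 (Safety of AbstrLiveRestrict) Assuming the following conditions

1. $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ is an abstract liveness environment
2. $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ is a concrete liveness environment
3. $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}))$, that is,
   (a) $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}) \in \mathit{InstrEnvConc}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}))$
   (b) $\mathit{CLive}_{\mathcal{K}_{in}}$ is a concrete liveness component for $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}})$
   (c) $\mathit{AbsNode}_{\mathcal{T}_{in},\mathcal{K}_{in}}(\mathit{CLive}_{\mathcal{K}_{in}}) \subseteq \mathit{TGShift}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}})$
4. $\mathit{arity}(\mathcal{T}_{in}[\epsilon]) = n$ & $\mathit{dom}_{tail} \subseteq \{1, \ldots, n\}$ & $\mathit{dom}_{call} \subseteq \{1, \ldots, n\}$
5. $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is an injection (for $m \le n$) such that $i_{call}(\{1, \ldots, m\}) = \mathit{dom}_{call}$
6. $(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}) = \mathit{LiveRestrict}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i_{call}, \mathit{dom}_{tail})$
7. $(\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}}) = \mathit{AbstrLiveRestrict}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), i_{call}, \mathit{dom}_{tail})$

it follows that

\[ (\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}})). \]
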


Before proving the theorem itself, we first give several lemmas that contribute
to the proof.

L e m m a 5.2.18 ( S a f e t y o f AbstrProj) Let 7-i,,e be an abstract term environ-


ment, n = arity(~n[e]), and K;i,~ E TGEnvConc(Ti,~'), and Lpc,,, a subset of
O(K:i,,). Let icau : { 1 , . . . , m } ---* { 1 , . . . , n } (for m < n) be an injection such
that T,,t," = TGRestrict(Ti,,=, icau) & E~,t~ = Proj(K:i,~, i,au). Then

AbsNodeTr, t.,K:.,t.(Proj (Ljc~, it.ll))


C AbstrProj(Ti,~,AbsNodeTin,/Cin(Ljc,~),Tr,t.,ic~u).
Proof. After expanding the definitions of AbsNodeT-r,t.,IC..,tr and Proj, we have

AbsNode'T,.,t,.,IC,.,t,.(Proj(L~,=, icau))
{~r,t,[Sel()C,,tr, k.S)] I k E {I,..., m} ~ ic~u(k).sE Lx:,~}.
=

For arbitrary k, s such that k E {I,..., m} and icau(k).s E Lic,~, we must prove
that

:Tr, t,{Sel(JC, r,t,, k.s)] E AbstrProj(Ti,,, AbsNodeTi,~,/Ci.,(L~c~. ' ), T.,t., icau).


By the definition of AbstrProj, it is sufficient to prove that

Sel(IC,..,tr,k.s) E S('/-r,tr), (5.19)


Se1(JCi,,ioo,(k).s) e S(Z,), (5.20)
~.[Sel(/Ci.,icau(k).s)] E AbsNodeTi,,/Ci,(Lx:,,), (5.21)

such that if Sel(/C,,tr, k.s) = (k,/>.S, then Sel(/Ci,~, ican(k).s) = (lean(k), I ) . S .

From the definition of AbsNode~n,/C~,, and i,au(k).s 9 L~c,~, we derive (5.21).


Because /Ci,, e TGEnvConc(7~,,'), we get (5.20) from ic,u(k).s E O(/Ci,,) and
Property 4.1.13.
From K:,,,, -- Proj(/Ci,,,ic,u), k E {1,...,m}, ic,,Kk).s E O(/Ci,,) and Prop-
erty 4.2.26.3, it is clear that k.s 9 O(/C~,,~), IC,,t~[k] =/Ci,,[i,=u(k)] and lC~,t~/k =
/Ci,,/i,=l~k). Now, suppose that Sel(K:rstr, k.s) = (k, I ) . S . If s = e, then so is S,
and f is V, a functor or a constant, depending on the value of tCrstr/k. But then
Sel(K:i,~, i:,u(k)) = (i=,n(k), .f), because IC,,t~/k =_ ICi,,li=au(k). If s ~: e, then
f = K:,,t~[k] = K:i,[i,:u(k)] and S = Sel(ld,,tr/k, s) = Sel(/di,,/i~,u(k), s), by the
definition of Sel. Hence Sel(.~:i,,, ic:u(k).s) = (i::u(k), f ) . S .
From the assumptions of the lemma and Theorem 2.3.12, we have ]Crstr 9
TGEnvConc(T~,t,'). From Property 4.1.13 and k.s 9 O(/C~,t,), we derive (5.19).
[]

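Lemma 5.2.19 (Safety of LiveClosure) Let $\mathcal{T}^e$ be an abstract term environment and $\mathcal{K} \in \mathit{TGEnvConc}(\mathcal{T}^e)$, $L_{\mathcal{K}} \subseteq \mathcal{O}(\mathcal{K})$, and $P_{\mathcal{K}}$ a set of unordered pairs of the form $(r, s)$ where $r, s \in \mathcal{O}(\mathcal{K})$. Then

\[ \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathit{LiveClosure}(L_{\mathcal{K}}, P_{\mathcal{K}})) \subseteq \mathit{LiveClosure}(\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}), \mathit{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}})). \]

Proof. After expanding the definitions of $\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}$ and LiveClosure, we have

\[ \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(\mathit{LiveClosure}(L_{\mathcal{K}}, P_{\mathcal{K}})) = \{ \mathcal{T}[\mathit{Sel}(\mathcal{K}, r)] \mid r \in L_{\mathcal{K}} \lor (\exists\, s \in L_{\mathcal{K}} : (r, s) \in P_{\mathcal{K}}) \}. \]

For an arbitrary $r$ satisfying $r \in L_{\mathcal{K}} \lor (\exists\, s \in L_{\mathcal{K}} : (r, s) \in P_{\mathcal{K}})$, it follows from the definitions of $\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}$ and $\mathit{AbsPair}_{\mathcal{T},\mathcal{K}}$, that

\[ \mathcal{T}[\mathit{Sel}(\mathcal{K}, r)] \in \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}) \lor (\exists\, \mathcal{T}[\mathit{Sel}(\mathcal{K}, s)] \in \mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}) : (\mathcal{T}[\mathit{Sel}(\mathcal{K}, r)], \mathcal{T}[\mathit{Sel}(\mathcal{K}, s)]) \in \mathit{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}})), \]

and hence $\mathcal{T}[\mathit{Sel}(\mathcal{K}, r)] \in \mathit{LiveClosure}(\mathit{AbsNode}_{\mathcal{T},\mathcal{K}}(L_{\mathcal{K}}), \mathit{AbsPair}_{\mathcal{T},\mathcal{K}}(P_{\mathcal{K}}))$, as desired. □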

L e m m a 5.2.20 ( S a f e t y o f Rstr) Let T ~ be an abstract term environment,


arit~(7-[,]) = n, ~C 9 TGEnvConc(7-'), and dom,oa C { 1 , . . . , n}. Then

AbsNodeT,/d(O(E)ldomt,a) C Rstr(T, Nodes:r, domt,a).

Proof. After expanding the definitions of AbsNodeT,/C and ]domt=a, we have

AbsNodeT,/d(O(~)ldom,,a ) = .[~[SeI(/C, k.s)] I k ~ domtaa &~ k.s 9 O(/C)}.



For arbitrary k, s satisfying k E domt~iz & k.s E O(IC), we must prove that

7"[Sel(/C, k.s)] E Rstr(7-, NodesT", domtaiz).


By the definition of Rstr, it is sufficient to prove that

Sel(/C,k.s) e S('T), (5.22)


7-[Sel(IC, k.s)] E Nodes~r. (5.23)
where Sel(/C, k.s) is of the form (k, :).S.

From K: E TGEnvConc(7-'), k.s e O(/C) and Property 4.1.13, we derive (5.22).


From the definition of Sel, we have SeI(/C, k.s) = (k, f ) . Sel(/C/k, s), where f is
V, a functor or a constant, depending on the value of IC/k.
From (5.22) and Definition 4.1.9, we derive (5.23). [3

Proof. [ T h e o r e m 5.2.17 ( S a f e t y o f AbstrLiveRestrict)]


We must prove that (K:.,t.,CSharingjc.,,., CLivejc.... ) E LiveEnvConc((T.,t. ~,
ASharingT-.,,., ALivesr ,.)), that is, the following conditions must be satisfied.
1. (ICe,t,, CSharingx:..,.) e InstrEnvConc((Tr,,,',ASharingT-,.)).
2. CLivejc... is a concrete liveness component for (K:~,t~, CSharingjc..,.).

3. AbsNodeTr,t,.K:r,tr(CLivex:..,.)_CTGShift(Tr,tr,ALiver..,.).
Condition 1. follows from Theorem 4.2.31, and Condition 2. follows from Lemma
5.2.14. Expanding the definitions of LiveRestrict and AbstrLiveRestrict, we can
rewrite Condition 3. as

A bsN~ eT-~,tr,/Cl.,t~( Proj( LiveCIosure(CLiver:.~, CSharingx::~)


U LiveCIosure(O()Ci~) Idom,~,,, CSharingjc~), i~u))
C AbstrProj(Ti,,, C u B, T~a,, i~ll). (5.24)
First, we observe that

AbsNodeT-r, tr,iCr,tr(Proj(LiveCIosure( CLivelc~,,, CSharingjc~), icon))


C AbstrProj('Tin, AbsNodeTin,/Ci,~(LiveClosure(CLivejc~.~, CSharing~Q,~)),

by Lemma 5.2.18 (the safety of AbstrProj),


= AbstrProj(Ti,,, AbsNodeTi.,)Ci.(LiveClosure(CLivejc,~ ,
Transit veC osure(CShr~c~ U CShr~c~))), 7-rstr, icaZZ),
by the definition of CSharingx:~,
C AbstrProj(T~,~, AItLiveCIosure(AbsNodeT~n,/Ci,~(CLivejc~),
AbsPairTi,~,/Ci,~(CS hr~c~ ), Abs PairTi,~,K:in (CSh r~ci,")), "-l"rstr,icall),
by Lemma 5.1.10 (the safety of AItLiveClosure),
because CLivex:.. is a concrete liveness component
for (K:in, CSharingx:..,), and by the monotonicity of AbstrProj,

C AbstrProj(T~,,, AltLiveClosure(TGShift(~,~, ALive:r,.),


TGShift(T/., AShr~,.), TGShift(Ti.~, AShr~-)), T,,t,, icall),
by the assumption of Theorem 5.2.17.3a and 3%
and the monotonicity of AbstrProj and AItLiveClosure,
= AbstrProj(Ti,,, C, ~,t., icaH),
by the definition of C.

Summarizing, we have

AbsN~ - /C - (Pmj(LiveClosure(CLive~c ~,CSharingx:,~),i~a,Z))


Ir8 ~-I T'J i t " 9

C_AbstrProj(~n, C, T,,t~, ieaU).


(5.25)

Next, we observe that

AbsNode~r s t r , /Cr s t r (Proj (LiveCIosu re (O(/C,,L)1dora tail , CSh aringjc,..), icau))


C AbstrProj(T,,~, AbsNodeT~,~,/Co~(LiveCIosure(O(/Ci,~)ldomta.,
CSharingpc~..)), ~,tr, icalt),
by Lemma 5.2.18 (the safety of AbstrProj),
C AbstrProj(Tin, LiveCIosure(AbsNode~,~,/Ci,~((9(/C,n)ldomtau),
AbsPairTi,~,/C~,,(CSharingJc~ )), 7-~,tr, icalt),
by Lemma 5.2.19 (the safety of LiveCIosure),
and the monotonicity of AbstrProj,
C_ AbstrProj(T~,~, LiveClosure(AbsNodeT~,~,/Ci,~(O(/Ci,~)ldomt~i,),
ASharingT-~), 7"rst,, icalt),
by the assumption of Theorem 5.2.17.3a,
Lemma 4.1.22,
and the monotonicity of AbstrProj and LiveCIosure,
C_ AbstrProj(Ti,,, LiveCIosure(Rstr(T,,,, Nodesz,~, domta~,), ASharing7.,~),
7-rstf, icall),
by Lemma 5.2.20 (the safety of Rstr),
and the monotonicity of AbstrProj and LiveCIosure,
= AbstrProj(Ti,~, B, ~,tr, ican),
by the definition of B.

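Summarizing, we have

\[ \mathit{AbsNode}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}(\mathit{Proj}(\mathit{LiveClosure}(\mathcal{O}(\mathcal{K}_{in})|_{\mathit{dom}_{tail}}, \mathit{CSharing}_{\mathcal{K}_{in}}), i_{call})) \subseteq \mathit{AbstrProj}(\mathcal{T}_{in}, B, \mathcal{T}_{rstr}, i_{call}). \tag{5.26} \]

So, (5.24) follows from (5.25), (5.26), and the properties of $\mathit{AbsNode}_{\mathcal{T}_{rstr},\mathcal{K}_{rstr}}$, Proj and AbstrProj. □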

To decide upon the liveness of some term at a particular program point, a compiler needs the set of locally live terms at that program point, in addition to the full set of globally live terms. Given a set $\mathit{dom}_{tail}$, characterizing the variables occurring in the subgoals of the clause that follow the program point considered, the formalization of the operation is similar to the intermediate step of the Restrict operation.

\[ L_{\mathcal{K}} = \mathit{LiveClosure}(\mathit{CLive}_{\mathcal{K}}, \mathit{CSharing}_{\mathcal{K}}) \cup \mathit{LiveClosure}(\mathcal{O}(\mathcal{K})|_{\mathit{dom}_{tail}}, \mathit{CSharing}_{\mathcal{K}}). \]

The abstract counterpart is as follows.

TGShiff(T, LT) = Cu B, (5.27)


C = AItLiveCIosure(TGShiff(T, ALive~r),
TGShift(T, AShr~-), TGShift(T, AShr})),
= LiveCIosure(Rstr(T, NodesT,domtail),ASharingT).

The special versions of the liveness restriction operation needed by the procedure-exit operation (cf. Section 4.2.2) are given below. Note that the full liveness component need not be computed, because the set of locally live terms is empty at the end of the clause, and the set of sharing edges relevant for propagating the global liveness property is passed upwards on procedure exit. We omit the proofs of the well-definedness of the concrete operation and the soundness of the abstract operation, as they are straightforward.

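Definition 5.2.21 For a concrete liveness environment $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ such that $\mathit{arity}(\mathcal{K}_{in}[\epsilon]) = n$, $\mathit{dom}_{head} \subseteq \{1, \ldots, n\}$, and $i_{head} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{head}(\{1, \ldots, m\}) = \mathit{dom}_{head}$,

\[ \mathit{LiveRestrict}_2((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), i_{head}) = (\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}) \]

where

$(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}) = \mathit{Restrict}_2((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}), i_{head})$,

$\mathit{CLive}_{\mathcal{K}_{rstr}} = \mathit{Proj}(\mathit{CLive}_{\mathcal{K}_{in}}, i_{head})$.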

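Definition 5.2.22 For an abstract liveness environment $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ such that $\mathit{arity}(\mathcal{T}_{in}[\epsilon]) = n$, $\mathit{dom}_{head} \subseteq \{1, \ldots, n\}$, and $i_{head} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection (for $m \le n$) such that $i_{head}(\{1, \ldots, m\}) = \mathit{dom}_{head}$,

\[ \mathit{AbstrLiveRestrict}_2((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), i_{head}) = (\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}}) \]

where

$(\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}) = \mathit{AbstrRestrict}_2((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}), i_{head})$,

$\mathit{TGShift}(\mathcal{T}_{rstr}, \mathit{ALive}_{\mathcal{T}_{rstr}}) = \mathit{AbstrProj}(\mathcal{T}_{in}, \mathit{TGShift}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}}), \mathcal{T}_{rstr}, i_{head})$.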

5.2.3 Procedure Exit

In the first step of procedure exit, the abstract liveness environments at the last program point of the applicable clauses are restricted to the arguments of the head of each clause by means of the $\mathit{AbstrLiveRestrict}_2$ operation, and the LiveUpp operation is used to compute an upper approximation of these restricted liveness environments. In this section, we present only the extension operation that is needed in the second step of the procedure-exit operation to regain the information about the calling environment that was lost on procedure entry.

D e f i n i t i o n 5.2.23 For two concrete liveness environments (tCin, CSharing~:~.,


CLivejc,~) and (IC~,tr, CSharingx:.,,., CLive~:.... ), such that arity(ICin[e]) = n &
arity(IC~,tr[e]) = m & rn <__n, and an injection i,au: { 1 , . . . , m} --~ { 1 , . . . , n},

LiveExtend((/CI,~,CSharingjq.,CLivex:..,), (/C,.,:,., CSharinglc..,,.,CLivelc .... ), ieazl)


if ( 3 a over Vars(Proj(/Ci,,icau)) : Proj(/Cin, icaH) a : / C r , t r ) &
Vars(/Cr~'tr) fq Vars(JCi.) C Vars(Proj(/Ci., icall)) &
: CShr~:.... : TermShift(/Cr, t~, Proj(CSharingx:~, lean))
then (ICo,~t,CSharlngjc.~,, CLive~..,)
else fail

where

(IEo,~t, CSharingjc..,) -:- Extend((/Ci,, CSharingtci.), (K:r,tr, CSharing~c.... ), icon),


CLivex:.~, = TermShift(/Co~,:, CLivex:~.).

Note that the value of $\mathit{CLive}_{\mathcal{K}_{out}}$ does not depend on $\mathit{CLive}_{\mathcal{K}_{rstr}}$ and that the conditions on the relationship between $\mathcal{K}_{in}$ and $\mathcal{K}_{rstr}$ and their sharing components are the same as in Section 4.2.3. Changes in the global liveness of terms that are due to an update of the sharing component are not made explicit.
The following lemma states that the operation LiveExtend is well-defined on the set of concrete liveness environments.

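Lemma 5.2.24 Let $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ and $(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}})$ be concrete liveness environments, such that $\mathit{arity}(\mathcal{K}_{in}[\epsilon]) = n$ & $\mathit{arity}(\mathcal{K}_{rstr}[\epsilon]) = m$ & $m \le n$, and $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ an injection. Let $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) = \mathit{LiveExtend}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), (\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}), i_{call})$. Then $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}})$ is a concrete liveness environment.

Proof. By the definition of LiveExtend and Lemma 4.2.40 (well-definedness of the Extend operation), we know that $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}})$ is a concrete sharing environment.
The proof that $\mathit{CLive}_{\mathcal{K}_{out}}$ is a concrete liveness component for $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}})$ is similar to the proof of Lemma 5.2.2. □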

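Definition 5.2.25 For two abstract liveness environments $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ and $(\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}})$, such that $\mathit{arity}(\mathcal{T}_{in}[\epsilon]) = n$ & $\mathit{arity}(\mathcal{T}_{rstr}[\epsilon]) = m$ & $m \le n$, and an injection $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$,

$\mathit{AbstrLiveExtend}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), (\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}}), i_{call}) =$

    if $\mathit{AbstrExtend}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}), (\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}), i_{call})$ is fail
    then fail
    else $(\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}})$,
where
    $(\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}) = \mathit{AbstrExtend}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}), (\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}), i_{call})$,
    $\mathit{ALive}_{\mathcal{T}_{out}} = \mathit{Convert}(\mathcal{T}_{in}, \mathit{ALive}_{\mathcal{T}_{in}}, \mathcal{T}_{out})$.

The following theorem gives the correctness condition for the AbstrLiveExtend operation.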
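Theorem 5.2.26 (Safety of AbstrLiveExtend) Assuming the following conditions,

1. $(\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$ is an abstract liveness environment
2. $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}})$ is a concrete liveness environment
3. $(\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}))$
4. $(\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}})$ is an abstract liveness environment
5. $(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}})$ is a concrete liveness environment
6. $(\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}}))$
7. $\mathit{arity}(\mathcal{T}_{in}[\epsilon]) = n$ & $\mathit{arity}(\mathcal{T}_{rstr}[\epsilon]) = m$ & $m \le n$
8. $i_{call} : \{1, \ldots, m\} \to \{1, \ldots, n\}$ is an injection
9. $(\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) = \mathit{LiveExtend}((\mathcal{K}_{in}, \mathit{CSharing}_{\mathcal{K}_{in}}, \mathit{CLive}_{\mathcal{K}_{in}}), (\mathcal{K}_{rstr}, \mathit{CSharing}_{\mathcal{K}_{rstr}}, \mathit{CLive}_{\mathcal{K}_{rstr}}), i_{call})$ is not fail
10. $(\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}}) = \mathit{AbstrLiveExtend}((\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}}), (\mathcal{T}_{rstr}^e, \mathit{ASharing}_{\mathcal{T}_{rstr}}, \mathit{ALive}_{\mathcal{T}_{rstr}}), i_{call})$

it follows that

\[ (\mathcal{K}_{out}, \mathit{CSharing}_{\mathcal{K}_{out}}, \mathit{CLive}_{\mathcal{K}_{out}}) \in \mathit{LiveEnvConc}((\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}})). \]

Proof. The proof of this theorem is similar to the proof of Theorem 5.2.5, except that Theorem 4.2.44 is used instead of Theorem 4.2.11, and Lemma 5.2.24 instead of Lemma 5.2.2. □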

5.3 Evaluation

After discussing one example in detail, we relate the precision of the liveness
analysis to the strength of the sharing analysis. Next, we take the viewpoint of
a compiler writer and investigate the practical usefulness of the results obtained
by the abstract interpretation procedure for the liveness environments introduced
in Section 5.1.

5.3.1 Example: qsort/3

    qsort(_L, _Acc, _Res) :-
        _L = nil, _Acc = _Res.
    qsort(_L, _Acc, _Res) :-
        _L = [_H|_T],                  % (select)
        partition(_H, _T, _U1, _U2),
        qsort(_U2, _Acc, _Acc1),
        _Acc2 = [_H|_Acc1],            % (construct)
        qsort(_U1, _Acc2, _Res).

    partition(_M, _L, _Sm, _Gr) :-
        _L = nil, _Sm = nil, _Gr = nil.
    partition(_M, _L, _Sm, _Gr) :-
        _L = [_H|_T],                  % (select)
        _H =< _M,
        _Sm = [_H|_Sm1],               % (construct)
        partition(_M, _T, _Sm1, _Gr).
    partition(_M, _L, _Sm, _Gr) :-
        _L = [_H|_T],                  % (select)
        _H > _M,
        _Gr = [_H|_Gr1],               % (construct)
        partition(_M, _T, _Sm, _Gr1).

Program 5.2: qsort/3 (Quicksort)
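
For readers less familiar with accumulating parameters, the control flow of Program 5.2 can be mirrored in a few lines of Python. This is only an illustrative transcription under the assumption that list elements are directly comparable; the names `qsort` and `partition` follow the Prolog predicates, but Python lists are copied here, so the sketch says nothing about the sharing or liveness behaviour that the analysis derives for the Prolog program.

```python
def partition(pivot, items):
    """Split items into those <= pivot and those > pivot, preserving order."""
    smaller = [x for x in items if x <= pivot]
    greater = [x for x in items if x > pivot]
    return smaller, greater


def qsort(items, acc):
    """Return the sorted items followed by acc (the accumulating parameter)."""
    if not items:
        return acc                        # first clause: _Acc = _Res
    head, tail = items[0], items[1:]
    smaller, greater = partition(head, tail)
    acc1 = qsort(greater, acc)            # sort the larger elements first
    return qsort(smaller, [head] + acc1)  # then prepend head and the smaller ones


print(qsort([3, 1, 2], []))
```

The accumulator is returned unchanged when the input list is empty, which corresponds to the first clause of `qsort/3` and is the source of the possible sharing between the second and third argument discussed below.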

Consider Program 5.2, implementing the quicksort algorithm using an accumulating parameter. Assume that the program is called with the abstract liveness environment $\beta_{in} = (\mathcal{T}_{in}^e, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}})$, where

$\mathcal{T}_{in}$ ::= (DList, DList, V),
DList ::= nil | '.'(d(Int), DList),
$\mathit{ASharing}_{\mathcal{T}_{in}} = (\emptyset, \emptyset)$,
$\mathit{ALive}_{\mathcal{T}_{in}} = \{ \mathcal{T}_{in}[\langle 3, V \rangle] \}$.

See Figure 5.17 (top left) for a graphical representation. The third argument is specified to be globally live, because it is the intended solution of the initial goal. Note that we specify the type of the second argument (which is the accumulating parameter) to be DList. This is an overestimation because in an initial concrete call, its value will usually be nil. However, for the recursive calls, the accumulating parameter is a list in general. Our analysis uses depth bound two for the list cell nodes. The success substitution $\beta_{out} = (\mathcal{T}_{out}^e, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}})$, computed by the abstract interpretation procedure (Figure 5.17 top right), is

[Figure 5.17 (diagram not reproduced): type-graph panels $\mathcal{T}_{in}$ and $\mathcal{T}_{out}$ (top), and a third panel over the clause variables M, L, H, T, Sm, Gr, Sm1 (bottom).]

Figure 5.17: Abstract liveness environments/3~.,/3o~,t and ~j for q s o r t / 3 .

such that

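\begin{align*}
\mathcal{T}_{out} &::= \langle \mathit{DList}, \mathit{DList}, \mathit{DList} \rangle, \\
\mathit{ASharing}_{\mathcal{T}_{out}} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_{out}} \rangle, \\
\mathit{AShr}_{\mathcal{T}_{out}} &= \{\, ( \mathcal{T}_{out}[\langle 2, \texttt{'.'} \rangle],\ \mathcal{T}_{out}[\langle 3, \texttt{'.'} \rangle] ), \\
&\qquad ( \mathcal{T}_{out}[\langle 1, \texttt{'.'} \rangle . \langle 1, d \rangle],\ \mathcal{T}_{out}[\langle 3, \texttt{'.'} \rangle . \langle 1, d \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_{out}} &= \{ \mathcal{T}_{out}[\langle 3, \texttt{'.'} \rangle] \}.
\end{align*}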

Only the principal sharing edges are represented. The full set of sharing edges
is obtained by applying the TGShift operation. We see that the output
argument, _Res, gets bound to a list that possibly shares with the input value
of the accumulating parameter. Intuitively, this is correct because in the
first program clause, the value of the accumulating parameter is passed on as
the output value. Note that there is potential sharing between the input list
that is to be sorted and the output list, but only on the level of the list
elements (which are structured terms of type d(Int)).
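The analysis of the quicksort program qsort/3 entails an analysis of the
partition/4 predicate, namely for the abstract liveness environment
$\beta_1 = \langle \mathcal{T}_1, \mathit{ASharing}_{\mathcal{T}_1}, \mathit{ALive}_{\mathcal{T}_1} \rangle$,
where

\begin{align*}
\mathcal{T}_1 &::= \langle d(\mathit{Int}), \mathit{DList}, V, V \rangle, \\
\mathit{ASharing}_{\mathcal{T}_1} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_1} &= \{ \mathcal{T}_1[\langle 1, d \rangle],\ \mathcal{T}_1[\langle 3, V \rangle],\ \mathcal{T}_1[\langle 4, V \rangle] \}.
\end{align*}

The success substitution
$\beta_2 = \langle \mathcal{T}_2, \mathit{ASharing}_{\mathcal{T}_2}, \mathit{ALive}_{\mathcal{T}_2} \rangle$,
computed by the abstract interpretation procedure, is given by

\begin{align*}
\mathcal{T}_2 &::= \langle d(\mathit{Int}), \mathit{DList}, \mathit{DList}, \mathit{DList} \rangle, \\
\mathit{ASharing}_{\mathcal{T}_2} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_2} \rangle, \\
\mathit{AShr}_{\mathcal{T}_2} &= \{\, ( \mathcal{T}_2[\langle 2, \texttt{'.'} \rangle . \langle 1, d \rangle],\ \mathcal{T}_2[\langle 3, \texttt{'.'} \rangle . \langle 1, d \rangle] ), \\
&\qquad ( \mathcal{T}_2[\langle 2, \texttt{'.'} \rangle . \langle 1, d \rangle],\ \mathcal{T}_2[\langle 4, \texttt{'.'} \rangle . \langle 1, d \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_2} &= \{ \mathcal{T}_2[\langle 1, d \rangle],\ \mathcal{T}_2[\langle 3, \texttt{'.'} \rangle],\ \mathcal{T}_2[\langle 4, \texttt{'.'} \rangle] \}.
\end{align*}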

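Note that there is potential sharing between the input list and both the lists
of the third and fourth argument, at the level of the list elements, but not
between the output lists among themselves. Now, let us consider the program
point in the second clause of the partition/4 predicate, right after the
selection operation. The point is of particular interest to the compiler when
it checks for the possible creation of garbage cells. If we order the variables
of the clause according to the tuple ( _M, _L, _H, _T, _Sm, _Gr, _Sm1 ), then
the environment derived by the analysis is given by
$\beta_s = \langle \mathcal{T}_s, \mathit{ASharing}_{\mathcal{T}_s}, \mathit{ALive}_{\mathcal{T}_s} \rangle$,

\begin{align*}
\mathcal{T}_s &::= \langle d(\mathit{Int}), \texttt{'.'}(d(\mathit{Int}), \mathit{DList}), d(\mathit{Int}), \mathit{DList}, V, V, V \rangle, \\
\mathit{ASharing}_{\mathcal{T}_s} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_s} \rangle, \\
\mathit{AShr}_{\mathcal{T}_s} &= \{\, ( \mathcal{T}_s[\langle 2, \texttt{'.'} \rangle . \langle 1, d \rangle],\ \mathcal{T}_s[\langle 3, d \rangle] ), \\
&\qquad ( \mathcal{T}_s[\langle 2, \texttt{'.'} \rangle . \langle 2, \texttt{'.'} \rangle],\ \mathcal{T}_s[\langle 4, \texttt{'.'} \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_s} &= \{ \mathcal{T}_s[\langle 6, V \rangle],\ \mathcal{T}_s[\langle 5, V \rangle],\ \mathcal{T}_s[\langle 1, d \rangle] \}.
\end{align*}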

From the representation in Figure 5.17, it is clear that the top list cell of
the program variable _L is not shared with any live term. We marked both the
globally and locally live subterms. Remember that
$\mathit{ALive}_{\mathcal{T}_s}$ contains only the globally live terms. Similar
results are obtained for the selection operations in the third clause of
partition/4 and the second clause of qsort/3.
The result of the liveness analysis can be interpreted as follows. For an
initial goal described by
$\langle \mathcal{T}_{in}, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}} \rangle$,
such that the input list of the first argument is not live when qsort/3 is
called, the top list cell of the program variable _L becomes garbage after each
of the selections in the source code, as it will not be shared with any live
term at run time. Consequently the compiler can generate code such that all the
list cells of the input list are reused to construct the output list. The list
cell containing the pivot element can be reused in the qsort predicate to
construct _Acc2 = [_H | _Acc1]; and in the call partition(_H, _T, _U1, _U2),
all the list cells that _T is made of can be reused to construct _U1 and _U2.
The recursive calls of qsort/3 will then once again reuse the list cells of
_U1 and _U2. We conclude that the sorting algorithm can work in-place; the only
heap memory required is that occupied by the input data structures.
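
The reuse opportunities described above sit at specific points of the source
program. As a point of reference, the following sketch restates Program 5.2 in
ordinary Prolog syntax (head unification and [] instead of the normal form with
nil) and marks those points in comments. Using plain integers as list elements
instead of d(Int) terms is an assumption made here for brevity; the destructive
reuse itself would be generated by the compiler and is not visible at the
source level.

```prolog
% qsort(+List, +Acc, -Res): Res is the sorted List followed by Acc.
qsort([], Acc, Acc).
qsort([H|T], Acc, Res) :-             % select: the top cell of the input dies here
    partition(H, T, U1, U2),
    qsort(U2, Acc, Acc1),
    Acc2 = [H|Acc1],                  % construct: candidate for reusing the pivot cell
    qsort(U1, Acc2, Res).

% partition(+Pivot, +List, -Smaller, -Greater)
partition(_, [], [], []).
partition(M, [H|T], [H|Sm1], Gr) :-   % select and construct in the same chunk
    H =< M,
    partition(M, T, Sm1, Gr).
partition(M, [H|T], Sm, [H|Gr1]) :-   % select and construct in the same chunk
    H > M,
    partition(M, T, Sm, Gr1).
```

The query ?- qsort([3,1,2], [], R). gives R = [1,2,3]. With a non-empty
accumulator, ?- qsort([2,1], [9,8], R). gives R = [1,2,9,8]; the tail [9,8] of
the result is the accumulator itself, in line with the sharing between the
second and third argument derived by the analysis.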
We believe that our analysis performs quite well compared with other related
methods reported in the literature. In the work of Hudak [33], a version of the
quicksort program for sorting arrays of integers is discussed. The results are
very similar to ours: the analysis infers that all the updates in the program
can be done destructively. The analysis of a Lisp-like version of the quicksort
program in the framework of Inoue, Seki and Yagi [35] also deduces that the
garbage list cells created by the auxiliary functions (namely for extracting
the elements that are greater, resp. smaller, than the pivot element, and for
appending two sublists) can be reclaimed (i.e. appended to a free-list) after
the calling function has finished its execution; they do not address the issue
of local storage overwriting, reusing the input list cells to construct the
output list.

    sameleaves( _x, _y ) :- profile( _x, _w ), profile( _y, _w ).

    profile( _tr, _pr ) :- _tr = lv(_u), _pr = [_u].
    profile( _tr, _pr ) :-
        _tr = t(lv(_u), _y),
        _pr = [_u | _prtail],        ①
        profile( _y, _prtail ).
    profile( _tr, _pr ) :-
        _tr = t(t(_x, _y), _z),      ②
        _t2 = t(_x, t(_y, _z)),
        profile( _t2, _pr ).

Program 5.3: sameleaves/2

5.3.2 Precision of the Liveness Analysis


The amount of detectable garbage mainly depends on the precision of the sharing
analysis. In Section 4.3.3, we extensively discussed the strength of the
sharing analysis. Here, we illustrate the effect on the liveness information of
the imprecision in the sharing analysis.
Consider Program 5.3 that checks whether two trees have the same leaf profile.
Note that some of the unification operations in Program 5.3 are not strictly in
normal form. This reduces the number of local variables in the clauses, but
causes some imprecision in the analysis discussed below. Assume that the
program is called with the abstract liveness environment
$\beta_{in} = \langle \mathcal{T}_{in}, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}} \rangle$
(Figure 5.18 on the left), where

\begin{align*}
\mathcal{T}_{in} &::= \langle \mathit{LvTree}, \mathit{LvTree} \rangle, \\
\mathit{LvTree} &::= lv(\mathit{Int}) \mid t(\mathit{LvTree}, \mathit{LvTree}), \\
\mathit{ASharing}_{\mathcal{T}_{in}} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_{in}} &= \{ \mathcal{T}_{in}[\langle 2, t \rangle],\ \mathcal{T}_{in}[\langle 2, lv \rangle] \}.
\end{align*}

The intended solution of the initial goal basically is a yes/no answer.
However, we assume that the input tree of the second argument is also needed in
some further computations. Therefore, the second argument is specified to be
globally live. Our analysis uses depth bound two for the tree nodes labeled t.
The success substitution
$\beta_{out} = \langle \mathcal{T}_{out}, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}} \rangle$,
computed by the abstract interpretation procedure, is equivalent to
$\beta_{in}$. The analysis of the program sameleaves/2 entails an analysis of
the predicate profile/2; once for

Figure 5.18: Abstract liveness environments $\beta_{in}$ for sameleaves/2 and $\beta_s$ for profile/2.

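the abstract liveness environment
$\beta_1 = \langle \mathcal{T}_1, \mathit{ASharing}_{\mathcal{T}_1}, \mathit{ALive}_{\mathcal{T}_1} \rangle$
such that

\begin{align*}
\mathcal{T}_1 &::= \langle \mathit{LvTree}, V \rangle, \\
\mathit{ASharing}_{\mathcal{T}_1} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_1} &= \{ \mathcal{T}_1[\langle 2, V \rangle] \},
\end{align*}

and once for the abstract liveness environment
$\beta_2 = \langle \mathcal{T}_2, \mathit{ASharing}_{\mathcal{T}_2}, \mathit{ALive}_{\mathcal{T}_2} \rangle$,
such that

\begin{align*}
\mathcal{T}_2 &::= \langle \mathit{LvTree}, \mathit{List} \rangle, \\
\mathit{List} &::= \mathit{nil} \mid \texttt{'.'}(\mathit{Int}, \mathit{List}), \\
\mathit{ASharing}_{\mathcal{T}_2} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_2} &= \{ \mathcal{T}_2[\langle 1, t \rangle],\ \mathcal{T}_2[\langle 1, lv \rangle] \}.
\end{align*}

In the first call, the profile/2 predicate has to construct a linear list
containing the leaf nodes of the input tree; in the second call it only has to
test whether a given linear list contains the leaf nodes of the input tree. Let
us first consider program point ② (in the third clause of profile/2, right
after the selection operation) for the case that the analysis starts from the
liveness environment $\beta_1$. If we order the variables of the clause
according to the tuple ( _tr, _pr, _x, _y, _z, _t2 ), then the environment
derived is given by
$\beta_s = \langle \mathcal{T}_s, \mathit{ASharing}_{\mathcal{T}_s}, \mathit{ALive}_{\mathcal{T}_s} \rangle$,

\begin{align*}
\mathcal{T}_s &::= \langle t(\mathit{LvTreeOne}, \mathit{LvTree}), V, \mathit{LvTree}, \mathit{LvTree}, \mathit{LvTree}, V \rangle, \\
\mathit{LvTreeOne} &::= t( lv(\mathit{Int}) \mid \mathit{LvTreeOne},\ lv(\mathit{Int}) \mid \mathit{LvTreeOne} ), \\
\mathit{ASharing}_{\mathcal{T}_s} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_s} \rangle, \\
\mathit{AShr}_{\mathcal{T}_s} &= \{\, ( \mathcal{T}_s[\langle 1, t \rangle . \langle 1, t \rangle],\ \mathcal{T}_s[\langle 3, t \rangle] ), \\
&\qquad ( \mathcal{T}_s[\langle 1, t \rangle . \langle 1, t \rangle],\ \mathcal{T}_s[\langle 4, t \rangle] ), \\
&\qquad ( \mathcal{T}_s[\langle 1, t \rangle . \langle 2, t \rangle],\ \mathcal{T}_s[\langle 5, t \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_s} &= \{ \mathcal{T}_s[\langle 2, V \rangle] \}.
\end{align*}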
See Figure 5.18 for the graphical representation. We derive that the second
t cell of _tr possibly shares with t cells of the program variables _x and _y.
Because these variables are needed in the subsequent call, it means that the
analysis detects only that the top t cell of _tr can be reused, not its second
t cell. Intuitively, we expect that the second t cell of _tr is also turned
into garbage, and because the construction operation that follows requires two
heap cells for tree nodes labeled t, detecting this would be most desirable. As
we discussed in Section 4.3.3 for the split/3 predicate where a similar problem
occurred, the precision of the sharing analysis can be improved, either by
using depth bound three for the tree nodes, or by breaking the unification up
into simple normal-form operations, thereby introducing a few more local
program variables. The result of the analysis for the latter case can be found
in Appendix A. It confirms our intuition that the rearrangement of the subtrees
in the third clause of the profile/2 program can be done in-place.
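
To experiment with the rearrangement discussed above, Program 5.3 can be
written in ordinary Prolog syntax as in the following sketch. Head unification
replaces the explicit selection operations, which is a presentational
assumption and not the normal form used by the analysis. The comment in the
third clause marks the point where two t cells are released and two t cells are
needed.

```prolog
% sameleaves(+Tree1, +Tree2): both trees have the same sequence of leaves.
sameleaves(X, Y) :- profile(X, W), profile(Y, W).

% profile(+Tree, ?Leaves): Leaves is the left-to-right list of leaf values.
profile(lv(U), [U]).
profile(t(lv(U), Y), [U|PrTail]) :-
    profile(Y, PrTail).
profile(t(t(X, Y), Z), Pr) :-
    % the selection releases two t cells; building T2 needs exactly two
    T2 = t(X, t(Y, Z)),
    profile(T2, Pr).
```

For instance, ?- sameleaves(t(t(lv(1),lv(2)),lv(3)), t(lv(1),t(lv(2),lv(3)))).
succeeds, because both trees have the leaf profile [1,2,3].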
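Now, for the case that the analysis starts from the liveness environment
$\beta_2$, we obtain for program point ② (the same point in the third clause)
the abstract substitution
$\beta_r = \langle \mathcal{T}_r, \mathit{ASharing}_{\mathcal{T}_r}, \mathit{ALive}_{\mathcal{T}_r} \rangle$,

\begin{align*}
\mathcal{T}_r &::= \langle t(\mathit{LvTreeOne}, \mathit{LvTree}), \mathit{List}, \mathit{LvTree}, \mathit{LvTree}, \mathit{LvTree}, V \rangle, \\
\mathit{ASharing}_{\mathcal{T}_r} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_r} \rangle, \\
\mathit{AShr}_{\mathcal{T}_r} &= \{\, ( \mathcal{T}_r[\langle 1, t \rangle . \langle 1, t \rangle],\ \mathcal{T}_r[\langle 3, t \rangle] ), \\
&\qquad ( \mathcal{T}_r[\langle 1, t \rangle . \langle 1, t \rangle],\ \mathcal{T}_r[\langle 4, t \rangle] ), \\
&\qquad ( \mathcal{T}_r[\langle 1, t \rangle . \langle 2, t \rangle],\ \mathcal{T}_r[\langle 5, t \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_r} &= \{ \mathcal{T}_r[\langle 1, t \rangle] \}.
\end{align*}

Thus, we get the same imprecise sharing information as in $\beta_s$. But in
this case the program variable _tr is globally live instead of _pr, so there
are no garbage cells created and the imprecision is insignificant. Two new tree
cells have to be allocated anyway.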
In Section 4.3.3, we discussed a few other causes of imprecision in the sharing
analysis, not all of which can be resolved. More experiments with concrete pro-
grams will be needed to measure the impact of such imprecision on the amount
of garbage that our analysis cannot detect. For those cases, we will have to rely
on run-time garbage collection to compensate for the imprecision in the liveness
analysis. However for future work, we think it is more important to look for
efficient ways to reuse the garbage cells that are detected by the analysis.
Before we consider that issue in the next section, we want to make some final
remarks concerning the analysis of the profile/2 program.
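Consider program point ① (in the second clause of profile/2, right after the
unification _pr = [_u | _prtail]) and the variables of the clause ordered
according to the tuple ( _tr, _pr, _u, _prtail, _y ). The environment derived
by the analysis in the case that the mode of use is described by $\beta_2$ is
given by
$\beta_p = \langle \mathcal{T}_p, \mathit{ASharing}_{\mathcal{T}_p}, \mathit{ALive}_{\mathcal{T}_p} \rangle$,

\begin{align*}
\mathcal{T}_p &::= \langle t(lv(\mathit{Int}), \mathit{LvTree}), \mathit{ListOne}, \mathit{Int}, \mathit{List}, \mathit{LvTree} \rangle, \\
\mathit{ListOne} &::= \texttt{'.'}(\mathit{Int}, \mathit{List}), \\
\mathit{ASharing}_{\mathcal{T}_p} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_p} \rangle, \\
\mathit{AShr}_{\mathcal{T}_p} &= \{\, ( \mathcal{T}_p[\langle 1, t \rangle . \langle 2, t \rangle],\ \mathcal{T}_p[\langle 5, t \rangle] ), \\
&\qquad ( \mathcal{T}_p[\langle 2, \texttt{'.'} \rangle . \langle 2, \texttt{'.'} \rangle],\ \mathcal{T}_p[\langle 4, \texttt{'.'} \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_p} &= \{ \mathcal{T}_p[\langle 1, t \rangle] \}.
\end{align*}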
This means that the top list cell of _pr is turned into garbage by the
selection operation _pr = [_u | _prtail]. However, there is no local
opportunity to reuse the cell. If the analysis starts from a mode of use
corresponding to the first call of profile/2 in the sameleaves/2 program, as
described by $\beta_1$, we derive
$\beta_q = \langle \mathcal{T}_q, \mathit{ASharing}_{\mathcal{T}_q}, \mathit{ALive}_{\mathcal{T}_q} \rangle$
where

\begin{align*}
\mathcal{T}_q &::= \langle t(lv(\mathit{Int}), \mathit{LvTree}), \texttt{'.'}(\mathit{Int}, V), \mathit{Int}, V, \mathit{LvTree} \rangle, \\
\mathit{ASharing}_{\mathcal{T}_q} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_q} \rangle, \\
\mathit{AShr}_{\mathcal{T}_q} &= \{\, ( \mathcal{T}_q[\langle 1, t \rangle . \langle 2, t \rangle],\ \mathcal{T}_q[\langle 5, t \rangle] ), \\
&\qquad ( \mathcal{T}_q[\langle 2, \texttt{'.'} \rangle . \langle 2, V \rangle],\ \mathcal{T}_q[\langle 4, V \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_q} &= \{ \mathcal{T}_q[\langle 2, \texttt{'.'} \rangle] \}.
\end{align*}

Now, _pr = [_u | _prtail] is a construction operation and _pr is globally live.
Indeed, the list that is constructed will be referenced in the second call of
the profile/2 predicate in the sameleaves/2 program. The top t and lv cells of
_tr on the other hand are derived to be garbage; hence they are candidates for
reuse in the construction operation. However, overwriting a tree cell by a list
cell requires that the full contents of the cell be trailed, whereas reusing a
list cell for another list cell requires trailing of the changed arguments
only.

5.3.3 The Practical Usefulness of Liveness Information


In this subsection we discuss the usefulness of liveness information to enable
reuse of memory cells in structure-copying implementations, either based on the
WAM or some other basic instruction set (e.g. [74, 78]).
It is important to note that a correct interpretation of the live-structure
information obtained by the analysis is implementation dependent. Indeed, in
WAM-like implementations it is possible that a live argument of a dead
structure cell is accessible through a reference chain that passes through an
argument field of the structure cell. For instance, consider the following
piece of Prolog code:

    p :- _X = f(_), _Y = f(_), _X = _Y, _X = f(g(3)), ① _Z = h(a), use(_Y).


On the left in Figure 5.19, we sketch a memory layout that can occur in a real
implementation when program point ① is reached. On the right, the
corresponding abstract liveness environment computed by the live-structure
analysis is shown. It indicates that the principal functor cell of the
structured term that variable _X is bound to is dead. Moreover, the next
operation (_Z = h(a)) is known to be a construction needing the same amount of
storage. However, because of the reference chain passing through the argument
field of the structure cell representing the f/1 functor on the heap, local
reuse is not safe. In addition to requiring the principal functor to be dead,
its immediate children should also be dead.
Note that the difficulty is not inherent in all WAM-based implementations.
E.g. in the (didactical) PAM-3 machine [41], the problem does not occur because
there is an entry for each clause variable in an environment frame on the local
stack (all clause variables are permanent variables) and no heap cell ever gets
the tag VAR. As a consequence, no reference chain will ever pass through the
argument field of a structure cell on the heap.

Figure 5.19: A reference chain passing through an argument of a dead structure.

Also, the safety requirement can be significantly weakened at a program point
immediately following a selection operation. It suffices that the functor cell
is dead after the selection, and the children corresponding to argument fields
that are going to be overwritten are dead, non-variable and not shared prior to
the selection.
In the implementation of Taylor [74], aliased free variables are represented by
a circular list. When a variable becomes instantiated, all the cells in the
list of aliases are assigned the same reference pointing to the structure that
the variable is unified with. As a consequence, reference chains are never
longer than one. So, for the safe reuse of a dead structure cell after a
selection operation, it is sufficient to have the children be non-var prior to
the selection, since references to non-var terms cannot pass through an
argument field of the structure cell.
The use that a compiler can make of the liveness information that is derived
also varies with the sequencing of construction and selection operations in the
source programs. In [52], some preliminary experiments on code optimization
based on liveness information are discussed. The only case considered is that
of a selection and subsequent construction operation involving the same kind of
structured term and occurring in a single chunk (for the definition see
Section 3.3). Starting from those experiments, we now speculate on some other
potential uses. We have to keep in mind that heap space is traded for space on
the trail. As explained in Section 5.1.1, we assume that an enhanced trailing
mechanism is used that saves the contents of the heap cells that are
overwritten whenever backtracking may require the old value to be restored. But
in many cases, the space needed on the trail (a pointer to the cell that is
overwritten and a copy of the old value) is smaller than the heap space that is
reused. Most issues discussed in this section require further investigation.

Local Reuse

If we focus on the local reuse of garbage cells, applicability is restricted to
programs in which the selection operation that creates garbage is followed by a
construction operation within the same clause that requires the same or less
amount of storage as released by the selection statement. If the selection and
construction operation occur within a single chunk, a temporary WAM-register
suffices to keep track of the garbage cell until it is used. In the other
cases, a temporary register cannot be used because the contents of such a
register does not survive the call of a Prolog procedure. One solution consists
in changing the classification of the clause variables, such that a permanent
variable on the environment stack keeps the address of the garbage cell until
it is needed.

    query                      #S   #C   functor   #D
    nrev(List,V)                1    1   ./2        0
    append(List,List,V)         1    1   ./2        1
    qsort(List,List,V)          1    1   ./2        0
    partition(Int,List,V,V)     2    2   ./2        2
    profile(LvTree,V)           3    2   t/2        2
                                2    0   lv/1       0
                                0    2   ./2        0
    buildtree(List,Tree,V)      1    0   ./2        0
    insert(Int,Tree,V)          2    3   t/3        2
    sift(List,V)                1    1   ./2        1
    remove(Int,List,V)          2    1   ./2        1
    permutation(List,V)         0    1   ./2        0
    select(V,List,V)            2    1   ./2        1

Table 5.1: Opportunities for destructive operations.

Table 5.1 lists some of the list and tree manipulating programs that we used
to evaluate the liveness analysis. For each program we show the type component
of a query specification. For this table, we assume that no sharing exists between
the different arguments and that the output arguments (the free variables) are
globally live. The second column shows the number of selection operations in
the normal-form source code of the program, and for the given mode of use.
The third column gives the number of construction operations. We also indicate
the kind of structure that is involved. (Test and assignment operations are not
counted.) The last column indicates the number of selection-construction pairs
occurring in a single chunk and concerning the same kind of structure. This
reflects the number of destructive operations that can be introduced when using
the technique described in [52]. Of course there is no direct relationship with
the amount of storage that is reused for a particular call, because the specific
concrete input arguments determine the execution frequency of the different
clauses of a predicate.
For queries such as append(List,List,V) and partition(Int,List,V,V), we have
#S = #C = #D. This means that our prototype analyzer recognizes that the data
structures being consumed are turned into garbage and that the program offers
the opportunity to reuse the garbage cell.
For queries such as nrev(List,V) and qsort(List,List,V), we have #S = #C, but
#D = 0. This means that the construction and selection do not occur in the same
chunk. In order to allow a destructive update, either a permanent variable must
be used to keep track of the garbage cell, or a more sophisticated technique
such as code migration [26] must be used. If that extension is added to the
compiler described in [52], then it will be possible to generate code such that
the program works in-place. The only heap memory required is that occupied by
the input data structures.

    buildtree(_L, _OT, _NT) :-
        _L = nil,
        _NT = _OT.
    buildtree(_L, _OT, _NT) :-
        _L = [_E | _K],
        insert(_E, _OT, _T),
        buildtree(_K, _T, _NT).

    insert(_E, _OT, _NT) :-
        _OT = empty,
        _NT = t(empty, _E, empty).
    insert(_E, _OT, _NT) :-
        _OT = t(_L, _F, _R),
        _E =< _F, _NT = t(_NL, _F, _R),
        insert(_E, _L, _NL).
    insert(_E, _OT, _NT) :-
        _OT = t(_L, _F, _R),
        _E > _F, _NT = t(_L, _F, _NR),
        insert(_E, _R, _NR).

Program 5.4: buildtree/3

For queries such as select(V,List,V) and remove(Int,List,V), we have
#S > #C = #D. The select/3 predicate is used by the permutation/2 procedure
(see page 177). The remove/3 predicate (see Appendix A) is called as an
auxiliary procedure by the sift/2 program to sift out the prime numbers
according to Eratosthenes' sieve algorithm. For these queries, more garbage
cells are created (and detected) than can actually be reused within the clauses
themselves. Run-time garbage collection is still needed to discard the garbage
cells, unless some techniques for non-local reuse are developed.
The same holds for the tree-manipulating program buildtree/3, which transforms
a list into a sorted binary tree (see Program 5.4). The analysis detects that
the list cells become garbage in the second clause; however, they cannot be
reused locally. Analyzing buildtree/3 entails analyzing insert/3, which inserts
one element into a binary tree. We have #C > #S = #D, indicating that the tree
can be modified in place, but extra heap space is needed for the new element
that is inserted. Interestingly enough, this results in the same memory usage
as the skillful programmer obtains by using open-ended trees.
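
The open-ended trees mentioned above are not spelled out in the text. One
common formulation, given here only as a sketch under that assumption,
represents an empty subtree by an unbound variable, so that an insertion binds
a variable inside the existing tree instead of copying the path from the root
to the new leaf. The predicate names ins/2 and build/2 are hypothetical and do
not occur in the programs analyzed here.

```prolog
% ins(+E, ?Tree): insert E into an open-ended ordered tree.
% An unbound variable stands for an empty subtree.
ins(E, Tree) :- var(Tree), !, Tree = t(_, E, _).   % allocate one new node
ins(E, t(L, F, _)) :- E =< F, !, ins(E, L).        % descend left, no copying
ins(E, t(_, _, R)) :- ins(E, R).                   % E > F: descend right

% build(+List, ?Tree): insert all elements of List into Tree.
build([], _).
build([E|Es], Tree) :- ins(E, Tree), build(Es, Tree).
```

With ?- build([2,1,3], T). the answer has the shape
T = t(t(_,1,_), 2, t(_,3,_)): each insertion allocates a single t/3 node, which
is the same amount of new heap space that the in-place version of insert/3
needs.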
If the query specifications for these programs allow sharing at the element
level of the input lists or trees, essentially the same results are obtained as
above. For example, when reversing a list of free variables that share (e.g.
nrev([X,Y,X,Y,Z], Out)), the reverse procedure can still work in-place (see
Section A.3). However, if sharing is allowed between different input lists
(trees) at the list-cell (tree-cell) level, it is generally unsafe to reuse the
cells (see Section A.2).
In some cases a reordering of subgoals (or code migration [26]) may be
desirable. For instance, when the append/3 program is used to split a list into
two sublists, the compiler has to reorder the unification operations in order
to benefit from the liveness information. The normal form of the append/3
predicate is as follows.

    append( _X, _Y, _Z ) :- _X = nil, _Y = _Z.
    append( _X, _Y, _Z ) :- _X = [_E | _U], _Z = [_E | _W],
                            append( _U, _Y, _W ).

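Consider the initial abstract liveness environment
$\beta_{in} = \langle \mathcal{T}_{in}, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}} \rangle$,
where

\begin{align*}
\mathcal{T}_{in} &::= \langle V, V, \mathit{List} \rangle, \\
\mathit{List} &::= \mathit{nil} \mid \texttt{'.'}(\mathit{Int}, \mathit{List}), \\
\mathit{ASharing}_{\mathcal{T}_{in}} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_{in}} &= \{ \mathcal{T}_{in}[\langle 1, V \rangle],\ \mathcal{T}_{in}[\langle 2, V \rangle] \}.
\end{align*}

The first and second arguments are specified to be globally live, because they
are the intended solution for the initial goal. The success substitution
$\beta_{out} = \langle \mathcal{T}_{out}, \mathit{ASharing}_{\mathcal{T}_{out}}, \mathit{ALive}_{\mathcal{T}_{out}} \rangle$,
computed by the abstract interpretation procedure, is

\begin{align*}
\mathcal{T}_{out} &::= \langle \mathit{List}, \mathit{List}, \mathit{List} \rangle, \\
\mathit{ASharing}_{\mathcal{T}_{out}} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_{out}} \rangle, \\
\mathit{AShr}_{\mathcal{T}_{out}} &= \{ ( \mathcal{T}_{out}[\langle 2, \texttt{'.'} \rangle],\ \mathcal{T}_{out}[\langle 3, \texttt{'.'} \rangle] ) \}, \\
\mathit{ALive}_{\mathcal{T}_{out}} &= \{ \mathcal{T}_{out}[\langle 1, \texttt{'.'} \rangle],\ \mathcal{T}_{out}[\langle 2, \texttt{'.'} \rangle] \}.
\end{align*}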
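The sharing between the second and the third argument after the call obviously
results from the unification _Y = _Z in the first program clause. In the
recursive clause, the unification _X = [_E | _U] is now a construction
operation, while the unification _Z = [_E | _W] is a selection operation. If we
order the variables of the clause according to the tuple
( _X, _Y, _Z, _E, _U, _W ), then the environment derived by the analysis for
the program point just before the recursive call is given by
$\beta_s = \langle \mathcal{T}_s, \mathit{ASharing}_{\mathcal{T}_s}, \mathit{ALive}_{\mathcal{T}_s} \rangle$,

\begin{align*}
\mathcal{T}_s &::= \langle \texttt{'.'}(\mathit{Int}, V), V, \mathit{ListOne}, \mathit{Int}, V, \mathit{List} \rangle, \\
\mathit{ASharing}_{\mathcal{T}_s} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_s} \rangle, \\
\mathit{AShr}_{\mathcal{T}_s} &= \{\, ( \mathcal{T}_s[\langle 1, \texttt{'.'} \rangle . \langle 1, \mathit{Int} \rangle],\ \mathcal{T}_s[\langle 4, \mathit{Int} \rangle] ), \\
&\qquad ( \mathcal{T}_s[\langle 1, \texttt{'.'} \rangle . \langle 2, V \rangle],\ \mathcal{T}_s[\langle 5, V \rangle] ), \\
&\qquad ( \mathcal{T}_s[\langle 3, \texttt{'.'} \rangle . \langle 2, \texttt{'.'} \rangle],\ \mathcal{T}_s[\langle 6, \texttt{'.'} \rangle] ) \,\}, \\
\mathit{ALive}_{\mathcal{T}_s} &= \{ \mathcal{T}_s[\langle 2, V \rangle],\ \mathcal{T}_s[\langle 1, \texttt{'.'} \rangle] \}.
\end{align*}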

Figure 5.20: Abstract liveness environments $\beta_{in}$ and $\beta_s$ for append/3.

From the representation in Figure 5.20, it is clear that the top list cell of
the program variable _Z is not shared with any live term, nor is its value
needed for the recursive call. So, if the construction operation
_X = [_E | _U] were delayed until the selection operation _Z = [_E | _W] is
done, it could reuse the garbage cell. Note the sharing of the integer term
between the program variables _X and _E. This sharing is created by the
construction operation, at a time the variable _E is still unbound. But the
selection operation, giving the integer value, does not introduce any sharing
with the first integer element of the list _Z, because of the optimization
mentioned in the beginning of Section 4.3. After restricting the abstract
substitution to the arguments of the recursive call, we obtain a liveness
environment that is equivalent to the initial substitution $\beta_{in}$.
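
The mode of use analyzed here corresponds to a query that splits a given list.
The sketch below shows the recursive clause with the two unifications in the
reordered form suggested above, selection first and construction second. The
predicate is named app/3 (a hypothetical name) to keep it apart from the
library append/3 of common Prolog systems, and [] is used instead of nil; the
reuse of the cell itself would be introduced by the compiler and is only
indicated in comments.

```prolog
% app(?X, ?Y, +Z): X and Y are a prefix and the matching suffix of Z.
app(X, Y, Z) :- X = [], Y = Z.
app(X, Y, Z) :-
    Z = [E|W],        % selection: the top list cell of Z becomes garbage
    X = [E|U],        % construction: could overwrite that garbage cell
    app(U, Y, W).
```

For example, ?- findall(X-Y, app(X, Y, [1,2]), L). enumerates all splits and
gives L = [[]-[1,2], [1]-[2], [1,2]-[]].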

Non-local Reuse

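To allow non-local reuse, an integration of the liveness analysis and the
storage allocation algorithm of the compiler is needed in order to have the
liveness environments reflect what garbage cells are back in use. Presently,
the analyzer assumes that no garbage cells are reused. Consequently, the dead
cells detected at program points following some procedure call that is not a
selection operation itself are only true garbage cells if they were not already
reused inside the procedure (otherwise they may be dangling references). For
example, consider Program 5.5 to generate all the permutations of a given list
and an initial abstract liveness environment
$\langle \mathcal{T}_{in}, \mathit{ASharing}_{\mathcal{T}_{in}}, \mathit{ALive}_{\mathcal{T}_{in}} \rangle$,
such that

\begin{align*}
\mathcal{T}_{in} &::= \langle \mathit{List}, V \rangle, \\
\mathit{List} &::= \mathit{nil} \mid \texttt{'.'}(\mathit{Int}, \mathit{List}), \\
\mathit{ASharing}_{\mathcal{T}_{in}} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_{in}} &= \{ \mathcal{T}_{in}[\langle 2, V \rangle] \}.
\end{align*}

The analysis of the program permutation/2 entails an analysis of the select/3
predicate, namely for the abstract liveness environment
$\beta_1 = \langle \mathcal{T}_1, \mathit{ASharing}_{\mathcal{T}_1}, \mathit{ALive}_{\mathcal{T}_1} \rangle$
such that

\begin{align*}
\mathcal{T}_1 &::= \langle V, \mathit{List}, V \rangle, \\
\mathit{ASharing}_{\mathcal{T}_1} &= \langle \emptyset, \emptyset \rangle, \\
\mathit{ALive}_{\mathcal{T}_1} &= \{ \mathcal{T}_1[\langle 1, V \rangle],\ \mathcal{T}_1[\langle 3, V \rangle] \}.
\end{align*}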

    permutation( _X1, _Y1 ) :-
        _X1 = nil,
        _Y1 = nil.
    permutation( _X1, _Z1 ) :-
        select( _Z, _X1, _Ys ),      ①
        _Z1 = [ _Z | _Zs ],
        permutation( _Ys, _Zs ).

    select( _X, _Y1, _Z1 ) :-
        _Y1 = [ _X | _Z1 ].          ②
    select( _X, _Y1, _Z1 ) :-
        _Y1 = [ _Y | _Ys ],          ③
        _Z1 = [ _Y | _Zs ],
        select( _X, _Ys, _Zs ).

Program 5.5: permutation/2

    sort( _Xs, _Ys ) :-
        _emp = empty,
        buildtree( _Xs, _emp, _T ),  ①
        _nil = nil,
        makelist( _T, _nil, _Ys ).

    makelist( _X, _Y, _Z ) :-
        _X = empty, _Y = _Z.
    makelist( _X, _Y, _Z ) :-
        _X = t(_L, _E, _R),
        makelist( _R, _Y, _Y1 ),
        _Y2 = [_E | _Y1],
        makelist( _L, _Y2, _Z ).

Program 5.6: sort/2

For program point ② (resp. ③) in the definition of select/3, it is detected
that the top-list cell of the input argument _Y1 becomes garbage. In the second

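clause, the cell can be reused in the construction that follows. For the first
clause, there is no local opportunity to reuse the cell. Now, consider program
point ① of the permutation/2 predicate (right after the call of select/3) and
an ordering of the program variables according to the tuple
( _X1, _Z, _Ys, _Z1, _Zs ). For that point, we derive the liveness environment
$\beta_s = \langle \mathcal{T}_s, \mathit{ASharing}_{\mathcal{T}_s}, \mathit{ALive}_{\mathcal{T}_s} \rangle$,
such that

\begin{align*}
\mathcal{T}_s &::= \langle \mathit{ListOne}, \mathit{Int}, \mathit{List}, V, V \rangle, \\
\mathit{ASharing}_{\mathcal{T}_s} &= \langle \emptyset, \mathit{AShr}_{\mathcal{T}_s} \rangle, \\
\mathit{AShr}_{\mathcal{T}_s} &= \{ ( \mathcal{T}_s[\langle 1, \texttt{'.'} \rangle . \langle 2, \texttt{'.'} \rangle],\ \mathcal{T}_s[\langle 3, \texttt{'.'} \rangle] ) \}, \\
\mathit{ALive}_{\mathcal{T}_s} &= \{ \mathcal{T}_s[\langle 4, V \rangle] \}.
\end{align*}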

This means that the top-list cell of the _X1 program variable becomes garbage.
However, the latter is only true if the code generated for the select/3
predicate did not introduce destructive assignments. If the code for the
select/3 predicate is specialized such that the garbage cell it produces in the
second clause is locally reused, then the liveness analysis for the
permutation/2 program should be redone. In order to reuse the garbage cell
created in the first clause of the select/3 predicate, for the construction
operation in the second clause of permutation/2, we need some technique to pass
on its address. The garbage cell may or may not be the first list cell of the
variable _X1.
As another illustration of an opportunity for non-local reuse of garbage cells,
consider Program 5.6 for sorting a linear list via the construction of an
ordered binary tree. In the program point ①, it is detected that the full list
that the program variable _Xs is bound to is turned into garbage. Intuitively,
the list could be reused by the makelist/3 predicate that has to construct a
new list of the same length. However, additional run-time bookkeeping data
areas will be needed in an interpreter to bring about this kind of storage
overwriting. We did not investigate the problem of keeping track of garbage
cells, which may be quite complicated.
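
To make the non-local opportunity concrete, Programs 5.4 and 5.6 can be
combined into the runnable sketch below. It uses ordinary Prolog syntax with
head unification, and the top-level predicate is renamed to treesort/2 (a
hypothetical name) because sort/2 is a built-in predicate in most Prolog
systems. The comments indicate where the input list dies and where a list of
the same length is built; passing the dead cells from one place to the other is
exactly the bookkeeping problem left open above.

```prolog
% treesort(+Xs, -Ys): Ys is Xs sorted via an ordered binary tree.
treesort(Xs, Ys) :-
    buildtree(Xs, empty, T),     % after this call, every list cell of Xs is dead
    makelist(T, [], Ys).         % builds a new list with as many cells as Xs had

buildtree([], T, T).
buildtree([E|K], OT, NT) :- insert(E, OT, T), buildtree(K, T, NT).

insert(E, empty, t(empty, E, empty)).
insert(E, t(L, F, R), t(NL, F, R)) :- E =< F, insert(E, L, NL).
insert(E, t(L, F, R), t(L, F, NR)) :- E > F, insert(E, R, NR).

% makelist(+Tree, +Acc, -List): in-order traversal with an accumulator.
makelist(empty, Y, Y).
makelist(t(L, E, R), Y, Z) :-
    makelist(R, Y, Y1),
    makelist(L, [E|Y1], Z).
```

The query ?- treesort([3,1,2], Ys). gives Ys = [1,2,3].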
Chapter 6

Conclusion

This book addresses the problem of memory reuse for logic programs through
program analysis rather than by run-time garbage collection. The aim is to
derive run-time properties that can be used at compile time to specialize the
target code for a given set of queries and to introduce destructive assignments
in a safe and transparent way. The derivation process is constructed as an
application of abstract interpretation for logic programs. The development of
the application is greatly facilitated by the structure of the underlying
framework [11], which allows one to focus on local properties of the abstract
domain and operations in order to guarantee correctness and termination of the
global analysis.
A modular design of the abstract domain in successive layers was essential to
handle the complexity of the problem at hand. The first layer consists of the type
analysis for logic programming languages that was previously developed in [40]
and that provides a characterization of the logical terms to which variables can
be bound during program execution. Our contribution consists of two additional
layers consisting of an abstract domain and primitive operations for sharing and
liveness analysis, respectively.
The central problem in program analysis for compile-time garbage collection
is detecting the sharing of term substructures that can occur during program
execution. In order to justify our analysis based on abstract interpretation, we
introduce variants of the concrete domain and operations that are augmented
with information about the term structures shared in actual implementations.
We show that these instrumented versions of the concrete domain and opera-
tion characterize the sharing that takes place in standard implementations and
that they are safely approximated by the abstract domain and operation. In a
subsequent step, we enhance the abstract domain and operations for type and
sharing analysis, into an abstract domain representing liveness information as
well. The abstract operations for this final domain compute for each program
point an upper approximation of the set of term structures that are needed to
finish the execution of the program. The compiler can exploit such information
to generate code avoiding the copying of data structures that have no further
references.
For the formal specifications of the domains and operations, we follow a
stepwise design, taking care that each step is sufficiently simple to allow its
correctness to be proved in a clean and comprehensible way. The approach
results in rigorous and well-motivated definitions. The emphasis throughout
this work is mainly on the precision and on the soundness of the analysis.
In order to evaluate the expressiveness and precision of the domain and op-
erations introduced, we extended a prototype implementation for the abstract
interpretation framework and the type analysis by Gerda Janssens [40]. Thanks
to the modular design of that prototype and the use of abstract data types, it
could be changed to perform sharing and liveness analysis in a rather straight-
forward way. Algorithms implementing the formal specifications of the primitive
operations for the augmented abstract domain were developed and incorporated
into the prototype. We then applied it to a set of small and well-known pro-
grams manipulating term structures. The examples discussed throughout the
book are produced by the prototype. To assess the degree of precision that our
analysis can reach on larger programs, more work is needed on the efficiency of
the prototype. Currently, the three layers of the analysis, i.e. type, sharing and
liveness analysis, are performed at the same time. Executing them successively
might be more space efficient, as the space used on the various Prolog stacks
during one layer of the analysis can be reclaimed for the next layer. Also, the
question remains whether separating the fixpoint computation for type infer-
encing from the fixpoint computation of sharing, resp. liveness analysis reduces
the analysis time. On the other hand, an interface between the layers has to
be designed that allows the full AND-OR-graph, annotated with the abstract
substitutions, to be passed on. For the analysis of large programs to be practical,
the issues and tradeoffs involved need to be investigated.
Previous work on sharing analysis for Prolog programs mainly addressed the
sharing of free variables. The sharing of structure has received far less attention
and, to our knowledge, no existing methods achieved a similar degree of precision.
Apart from the imprecision inherent to the framework of abstract interpretation,
the precision of a particular analysis depends partly on the expressivity of the
abstract domain, partly on the abstract operations defined on it to model the
concrete operations, and partly on the choice of an upperbound operation to
merge the results of the analysis over different execution paths. We discussed
our choice for each of these parameters and indicated a few shortcomings. But
even with this unavoidable inaccuracy, the quality of the sharing information
obtained by our analysis appears to be sufficiently precise to allow the deduction
of valuable information about the liveness of data structures.
Previous work on code generation by Mariën et al. [52] has shown that liveness
information allows worthwhile optimizations. According to the experiments,
there is little or no measurable time overhead caused by the instructions for local
reuse and a gain in execution time can be expected from reducing the role of run-
time garbage collection. Further investigations are needed to find efficient ways
for reusing garbage cells, and to measure the impact on the execution time. A
more appropriate balance between the precision and practicality of the analysis
may be needed, because the main limitation of the current approach is the com-
plexity of the analysis. However, we expect that this problem may be overcome
in the future and we believe that practical analyses based on our work can be
used in compilers and will lead to substantially more efficient implementations
of logic programming languages.
Other possible applications of the sharing information derived include occur-
check reduction and detection of goal independence for AND-parallel program
execution. In previous analysis systems aimed at the application of occur-check
reduction, the detection of circular structures is stated in terms of properties of
the abstract substitutions holding in the program point prior to the unification,
or in terms of conditions holding during the abstract unification algorithm. The
abstract substitution holding after the unification does not provide any informa-
tion about the loops that are possibly created. The idea is that the substitution
only has to provide information about unifications that succeed in a system
performing the occur-check. For instance, the underlying integrated type and
mode analysis can also be used to predict the possible creation of circular terms.
The structure-sharing analysis developed in this book yields a better criterion
for loop detection. In many cases, the precision is sufficient to detect the absence
of circular terms after unification in a system that does not perform the occur
check, or to provide detailed information about the nature of the loops that are
possibly created.
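
As a small illustration of the run-time check that such an analysis aims to make unnecessary, the sketch below uses the ISO builtin unify_with_occurs_check/2. The helper name unifies_without_loop/2 is hypothetical and is not part of the analysis or the prototype described here.

```prolog
% Sketch only: unifies_without_loop/2 is a hypothetical helper.
% It succeeds if X and Y unify in a system performing the occur-check,
% i.e. if the unification does not create a circular term.
% The double negation undoes the bindings made by the test.
unifies_without_loop(X, Y) :-
    \+ \+ unify_with_occurs_check(X, Y).

% ?- unifies_without_loop(L, [a|T]).   % succeeds: L and T do not share
% ?- unifies_without_loop(T, [a|T]).   % fails: T would contain itself
```

When the sharing information shows that the two terms cannot share, as in the first query, the cheaper unification without the occur-check is safe.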
Many parallel Prolog implementations restrict themselves to Independent/Restricted
And-Parallelism (IAP). In such systems, the usual semantics of a sequential
program execution is preserved. The subgoals in the body of a clause are ex-
ecuted in parallel provided that they are independent: that is, if the bindings
of one goal cannot interact with the bindings of another goal. Compile-time
knowledge about the variables that may share and about which goals instan-
tiate shared variables, is useful to reduce run-time dependency checking and
scheduling overhead. Although knowledge about the sharing of structured terms
is not required for the automatic parallelization of logic programs using IAP, the
greater precision of such sharing information may be beneficial.
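
The run-time dependency check that compile-time sharing information helps to avoid can be sketched as follows. This is a minimal illustration only: the name independent/2 is hypothetical, and the test shown is the simple syntactic one (no common variables at call time), not the analysis of this book.

```prolog
:- use_module(library(lists)).   % for member/2

% Sketch only: independent/2 is a hypothetical helper.
% Two goals are treated as independent if, at the time of the call,
% they have no variable in common.
independent(G1, G2) :-
    term_variables(G1, Vs1),
    term_variables(G2, Vs2),
    \+ ( member(V1, Vs1), member(V2, Vs2), V1 == V2 ).

% ?- independent(p(X, Y), q(Z)).      % succeeds
% ?- independent(p(X, Y), q(Y, Z)).   % fails: Y occurs in both goals
```

If the analysis derives at compile time that the arguments of two subgoals cannot share, a test of this kind need not be executed at run time.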
Appendix A

Detailed Examples

We have used the prototype to analyze a set of small and well-known (pure)
Prolog programs manipulating data structures. We give a detailed description
of the abstract AND-OR-graphs computed for these programs. The abstract
liveness environments obtained for the different program points in a predicate
definition, are collected in a table.
A type graph is represented as a tuple of types. The i-th type in the tuple is the
type of the i-th variable in the tuple of variable names representing the domain
of the program point that the type graph is associated with. E.g., (_X, _Y,
_Z) represents the domain at some program point, and (V, List, .(Int,V))
represents an abstract type graph T for this domain, defined by the following
two grammar rules.

T ::= (V, List, .(Int,V))
List ::= nil | .(Int, List)
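
To make the reading of such grammar rules concrete, the following sketch encodes them as recognizers for concrete terms. The predicate names list_type/1 and in_type_graph/3 are hypothetical, and the encoding (free variables for V, nil-terminated lists) is an assumption made for illustration; the prototype works on type graphs, not on predicates like these.

```prolog
% Sketch only: hypothetical recognizers for the example type graph.

% List ::= nil | .(Int, List)
list_type(L) :-
    nonvar(L),
    (   L == nil
    ;   L = [H|T], integer(H), list_type(T)
    ).

% T ::= (V, List, .(Int,V)): the first variable is free, the second is
% bound to a List, and the third to a list cell with an integer head
% and a free tail.
in_type_graph(X, Y, Z) :-
    var(X),
    list_type(Y),
    nonvar(Z), Z = [H|Tail], integer(H), var(Tail).

% ?- in_type_graph(_, [1,2|nil], [3|_]).   % succeeds
% ?- in_type_graph(a, nil, [3|_]).         % fails: first component is not free
```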

Type-graph nodes involved in sharing edges or liveness sets are represented by
means of their shortest selector (i.e. the selector with respect to the underlying
tree of the type graph). A selector, e.g. <2,'.'>.<1,Int>, for the domain and type
graph given above, is represented as [(_Y,./2),(1,Int)], for reasons of clarity.
Note that we consider a functor node label to consist of both the functor name
and arity.
We use reduced representations for the sharing and liveness components,
omitting edges (resp. nodes) that are redundant w.r.t. some other edge (node)
because of the TGShift operation. Irrelevant edges (if they occur) are not
removed.
The tables are organized as follows.

• The heading gives the tuple of variable names appearing in the initial goal,
  and the input and output abstract liveness environments.

• For each clause, the first entry gives the clause number (as shown in the
  Prolog code defining the predicate), and the tuple of variable names
  appearing in the clause, i.e. the domain of the clause. For some of the
  interesting program points of the clause, an entry in the table is given with
  the name of the abstract substitution at the program point and its value,
  the abstract liveness environment. Empty sharing or liveness components
  in the abstract environments are not shown.

Recall that the full set of sharing edges, holding at some program point, is in fact
the AlternatingClosure of the two AShr_T components (old and new edges) shown
in the tables. Also, in order to interpret the ALive_T component, one has to take
into account the sharing relation, first applying the AltLiveClosure operation to
ALive_T and the two AShr_T components to obtain the set of globally live terms.
The set of locally live terms is
not represented in the liveness environments (see Section 5.1.2). To ease the
interpretation of the results, we added for the selection operations the output
abstract liveness environment, restricted to the domain of the selection
operation. These liveness environments are particularly important for the application
of compile-time garbage collection. The liveness components shown in these
restricted output liveness environments (indicated by a superscript r) consist of
the full set of locally and globally live terms, as defined by (5.27) on page 162.
Because most Prolog implementations do not create any sharing when a
variable gets bound to an atom or an integer, a small optimization with respect
to the formal specifications is incorporated into the prototype, avoiding the
introduction of such sharing edges.
Note that for the sharing component of the query form, the user can choose
to specify the sharing edges as old or as new edges, or to use a combination of
the two.

A.1 List of Types
We first give some shorthands for type (sub)graphs used frequently throughout
the examples. Unless explicitly stated otherwise, we assume that for the func-
tors that occur in some recursive branch of a type graph, the depth restriction
is two.

List ::= nil | '.'(Int, List)
ListOne ::= '.'(Int, List)

DList ::= nil | '.'(d(Int), DList)
DListOne ::= '.'(d(Int), DList)

VList ::= nil | '.'(V, VList)
VListOne ::= '.'(V, VList)

Tree ::= empty | t(Tree, Int, Tree)
TreeOne ::= t(Tree, Int, Tree)

DTree ::= empty | t(DTree, d(Int), DTree)
DTreeOne ::= t(DTree, d(Int), DTree)

LvTree ::= lv(Int) | t(LvTree, LvTree)
LvTreeOne ::= t(LvTree, LvTree)

A.2 append/3
The following two tables give the abstract liveness environments that result from
the abstract interpretation of the append/3 program, when the mode of use is
to append the ground input lists of the first and second arguments. In the first
table, we assume that there is no input sharing between the arguments; in the
second table, there is potential sharing between the input arguments. Note that
for β_4^r in the second table, [(_X,./2)] ∈ ALive_T, whereas in the first table
[(_X,./2),(2,./2)] ∈ ALive_T. Thus, it is correctly derived that in the case of
input sharing between the first two arguments, the selection and construction
operations in the second clause cannot be replaced by a destructive assignment.
Note also that the value of the AShr_T component for the new sharing edges is
the same for the corresponding entries of the two tables.

Prolog code:

(1) append(_X, _Y, _Z) :-
        β_1  _X = nil,
        β_2  _Y = _Z.  β_3
(2) append(_X, _Y, _Z) :-
        β_4  _X = [_E|_U],
        β_5  _Z = [_E|_W],  β_5^r
        β_6  append(_U, _Y, _W).  β_7

Call: β_in append(_A, _B, _C)

        (_A, _B, _C)
β_in    T        (List, List, V)
        ALive_T  { [(_C,V)] }
β_out   T        (List, List, List)
        AShr_T   { ([(_C,./2)], [(_B,./2)]) }
        ALive_T  { [(_C,./2)] }
1)      (_X, _Y, _Z)
β_1     T        (List, List, V)
        ALive_T  { [(_Z,V)] }
β_2     T        (nil, List, V)
        ALive_T  { [(_Z,V)] }
β_3     T        (nil, List, List)
        AShr_T   { ([(_Y,./2)], [(_Z,./2)]) }
        ALive_T  { [(_Z,./2)] }
2)      (_E, _U, _W, _X, _Y, _Z)
β_4     T        (V, V, V, List, List, V)
        ALive_T  { [(_Z,V)] }
β_4^r   T        (Int, List, _, ListOne, _, _)
        AShr_T   { ([(_X,./2),(2,./2)], [(_U,./2)]) }
        ALive_T  { [(_X,./2),(2,./2)], [(_U,./2)], [(_E,Int)] }
β_5     T        (Int, List, V, ListOne, List, V)
        AShr_T   { ([(_X,./2),(2,./2)], [(_U,./2)]) }
        ALive_T  { [(_Z,V)] }
β_6     T        (Int, List, V, ListOne, List, .(Int,V))
        AShr_T   { ([(_X,./2),(2,./2)], [(_U,./2)]),
                   ([(_Z,./2),(2,V)], [(_W,V)]) }
        ALive_T  { [(_Z,./2)] }
β_7     T        (Int, List, List, ListOne, List, ListOne)
        AShr_T   { ([(_Y,./2)], [(_Z,./2),(2,./2)]),
                   ([(_Y,./2)], [(_W,./2)]),
                   ([(_X,./2),(2,./2)], [(_U,./2)]),
                   ([(_Z,./2),(2,./2)], [(_W,./2)]) }
        ALive_T  { [(_Z,./2)] }

Call: β_in append(_A, _B, _C)

        (_A, _B, _C)
β_in    T        (List, List, V)
        AShr_T   { ([(_A,./2)], [(_B,./2)]) }
        ALive_T  { [(_C,V)] }
β_out   T        (List, List, List)
        AShr_T   { ([(_A,./2)], [(_B,./2)]) }
        AShr_T   { ([(_B,./2)], [(_C,./2)]) }
        ALive_T  { [(_C,./2)] }
1)      (_X, _Y, _Z)
β_1     T        (List, List, V)
        AShr_T   { ([(_X,./2)], [(_Y,./2)]) }
        ALive_T  { [(_Z,V)] }
β_2     T        (nil, List, V)
        AShr_T   { ([(_X,nil)], [(_Y,nil)]) }
        ALive_T  { [(_Z,V)] }
β_3     T        (nil, List, List)
        AShr_T   { ([(_X,nil)], [(_Y,nil)]) }
        AShr_T   { ([(_Y,./2)], [(_Z,./2)]) }
        ALive_T  { [(_Z,./2)] }
2)      (_E, _U, _W, _X, _Y, _Z)
β_4     T        (V, V, V, List, List, V)
        AShr_T   { ([(_X,./2)], [(_Y,./2)]) }
        ALive_T  { [(_Z,V)] }
β_4^r   T        (Int, List, _, ListOne, _, _)
        AShr_T   { ([(_X,./2),(2,./2)], [(_U,./2)]) }
        ALive_T  { [(_X,./2)], [(_U,./2)], [(_E,Int)] }
β_5     T        (Int, List, V, ListOne, List, V)
        AShr_T   { ([(_X,./2)], [(_Y,./2)]),
                   ([(_X,./2),(2,./2)], [(_Y,./2)]) }
        AShr_T   { ([(_X,./2),(2,./2)], [(_U,./2)]) }
        ALive_T  { [(_Z,V)] }
β_6     T        (Int, List, V, ListOne, List, .(Int,V))
        AShr_T   { ([(_X,./2)], [(_Y,./2)]),
                   ([(_X,./2),(2,./2)], [(_Y,./2)]) }
        AShr_T   { ([(_X,./2),(2,./2)], [(_U,./2)]),
                   ([(_Z,./2),(2,V)], [(_W,V)]) }
        ALive_T  { [(_Z,./2)] }
β_7     T        (Int, List, List, ListOne, List, ListOne)
        AShr_T   { ([(_X,./2)], [(_Y,./2)]),
                   ([(_X,./2),(2,./2)], [(_Y,./2)]) }
        AShr_T   { ([(_Y,./2)], [(_Z,./2),(2,./2)]),
                   ([(_Y,./2)], [(_W,./2)]),
                   ([(_X,./2),(2,./2)], [(_U,./2)]),
                   ([(_Z,./2),(2,./2)], [(_W,./2)]) }
        ALive_T  { [(_Z,./2)] }

The following table illustrates how the sharing and liveness analyses perform
when one uses the invertibility property of Prolog. The table shows the
abstract liveness environments that result from the abstract interpretation of the
append/3 program, when the mode of use is to split the ground input list of the
third argument into two lists, returned in the first and second arguments.
From β_5, the compiler derives that _Z = [_E|_W] is a selection operation.
From β_5^r, it is derived that the top-list cell can be reused in a construction.
However, the only construction in the clause, _X = [_E|_U], precedes the selection.
The compiler has to reorder the subgoals in order to benefit from the liveness
information provided.
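
The two modes of use, and the kind of reordering meant here, can be tried out with the following sketch. It is an illustration under stated assumptions: standard []-terminated lists are used instead of nil-terminated ones, the predicates are named app/3 and app_split/3 (hypothetical names) to avoid a clash with a library append/3, and the reordered clause is written by hand rather than produced by a compiler.

```prolog
% Sketch only: app/3 mirrors the append/3 program of this section.
app(X, Y, Z) :- X = [], Y = Z.
app(X, Y, Z) :- X = [E|U], Z = [E|W], app(U, Y, W).

% Hand-reordered variant for the split mode: the selection on the third
% argument now precedes the construction of the first argument, so the
% cell freed by the selection is available when the construction runs.
app_split(X, Y, Z) :- X = [], Y = Z.
app_split(X, Y, Z) :- Z = [E|W], X = [E|U], app_split(U, Y, W).

% ?- app([1,2], [3], Z).
%    Z = [1,2,3]
% ?- findall(X-Y, app_split(X, Y, [1,2]), Splits).
%    Splits = [[]-[1,2], [1]-[2], [1,2]-[]]
```

Both orderings compute the same answers; only the order of the selection and the construction within the clause differs.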
Call: β_in append(_A, _B, _C)

        (_A, _B, _C)
β_in    T        (V, V, List)
        ALive_T  { [(_B,V)], [(_A,V)] }
β_out   T        (List, List, List)
        AShr_T   { ([(_B,./2)], [(_C,./2)]) }
        ALive_T  { [(_B,./2)], [(_A,./2)] }
1)      (_X, _Y, _Z)
β_1     T        (V, V, List)
        ALive_T  { [(_Y,V)], [(_X,V)] }
β_2     T        (nil, V, List)
        ALive_T  { [(_Y,V)], [(_X,nil)] }
β_3     T        (nil, List, List)
        AShr_T   { ([(_Y,./2)], [(_Z,./2)]) }
        ALive_T  { [(_Y,./2)], [(_X,nil)] }
2)      (_E, _U, _W, _X, _Y, _Z)
β_4     T        (V, V, V, V, V, List)
        ALive_T  { [(_Y,V)], [(_X,V)] }
β_5     T        (V, V, V, .(V,V), V, List)
        AShr_T   { ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_X,./2),(2,V)], [(_U,V)]) }
        ALive_T  { [(_Y,V)], [(_X,./2)] }
β_5^r   T        (Int, _, List, _, _, ListOne)
        AShr_T   { ([(_W,./2)], [(_Z,./2),(2,./2)]) }
        ALive_T  { [(_Z,./2),(2,./2)], [(_W,./2)], [(_E,Int)] }
β_6     T        (Int, V, List, .(Int,V), V, ListOne)
        AShr_T   { ([(_X,./2),(1,Int)], [(_E,Int)]),
                   ([(_X,./2),(2,V)], [(_U,V)]),
                   ([(_W,./2)], [(_Z,./2),(2,./2)]) }
        ALive_T  { [(_Y,V)], [(_X,./2)] }
β_7     T        (Int, List, List, ListOne, List, ListOne)
        AShr_T   { ([(_Z,./2),(2,./2)], [(_Y,./2)]),
                   ([(_X,./2),(1,Int)], [(_E,Int)]),
                   ([(_X,./2),(2,./2)], [(_U,./2)]),
                   ([(_W,./2)], [(_Z,./2),(2,./2)]),
                   ([(_Y,./2)], [(_W,./2)]) }
        ALive_T  { [(_Y,./2)], [(_X,./2)] }

A.3 nrev/2
We now consider the program for naive list reversal, called with a list of free
variables that may share with one another. From the table below, it is derived
that the procedure can still work in-place.

Prolog code:

(1) nrev(_X, _Y) :-
        β_1  _X = nil,
        β_2  _Y = nil.  β_3
(2) nrev(_X, _Y) :-
        β_4  _X = [_E|_U],  β_4^r
        β_5  nrev(_U, _RU),
        β_6  _Last = [_E],
        β_7  append(_RU, _Last, _Y).  β_8

Note that the selection _X = [_E|_U] and the construction operation _Last
= [_E] of the second clause do not occur in the same chunk (Section 3.3).
Moreover, the variable _X in the recursive clause is a temporary variable because
it only occurs in the head and the first call. If it is made a permanent variable by
the compiler, its value will be saved on the local stack, and the garbage cell it is
pointing to at program point β_6 will be accessible for reuse in the construction
_Last = [_E].
Call: β_in nrev(_X, _Y)

        (_X, _Y)
β_in    T        (VList, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        ALive_T  { [(_Y,V)] }

The query above causes a recursive call of the nrev/2 predicate for a more
general abstract substitution β_in (obtained as a restriction of the substitution β_5
to the domain of the call nrev(_U, _RU)), for which the list elements of the first
argument are also live. Indeed, when nrev(_U, _RU) is called, the list elements
of _U possibly share with _E due to the input sharing in _X, and _E is still needed
for the construction operation _Last = [_E] that follows the recursive call of
nrev/2. The table below only contains the program points of interest for this
more general query.
Call: β_in nrev(_X, _Y)

        (_X, _Y)
β_in    T        (VList, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        ALive_T  { [(_X,./2),(1,V)], [(_Y,V)] }
β_out   T        (VList, VList)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(1,V)], [(_Y,./2),(1,V)]) }
        ALive_T  { [(_X,./2),(1,V)], [(_Y,./2)] }
1)      (_X, _Y)
β_1     T        (VList, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        ALive_T  { [(_X,./2),(1,V)], [(_Y,V)] }
β_3     T        (nil, nil)
        ALive_T  { [(_Y,nil)] }
2)      (_E, _Last, _RU, _U, _X, _Y)
β_4     T        (V, V, V, V, VList, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        ALive_T  { [(_X,./2),(1,V)], [(_Y,V)] }
β_4^r   T        (V, _, _, VList, VListOne, _)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(2,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_U,./2)], [(_X,./2),(2,./2)]) }
        ALive_T  { [(_X,./2),(2,./2)], [(_U,./2)], [(_E,V)],
                   [(_X,./2),(1,V)] }
β_5     T        (V, V, V, VList, VListOne, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(2,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_U,./2)], [(_X,./2),(2,./2)]) }
        ALive_T  { [(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)],
                   [(_Y,V)] }
β_6     T        (V, V, VList, VList, VListOne, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(2,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(2,./2),(1,V)], [(_RU,./2),(1,V)]),
                   ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_U,./2)], [(_X,./2),(2,./2)]),
                   ([(_U,./2),(1,V)], [(_RU,./2),(1,V)]) }
        ALive_T  { [(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)],
                   [(_Y,V)] }
β_7     T        (V, .(V,nil), VList, VList, VListOne, V)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(2,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_RU,./2),(1,V)]),
                   ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_U,./2)], [(_X,./2),(2,./2)]),
                   ([(_U,./2),(1,V)], [(_RU,./2),(1,V)]),
                   ([(_Last,./2),(1,V)], [(_E,V)]) }
        ALive_T  { [(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)],
                   [(_Y,V)] }

During the iteration process of the fixpoint computation, the append/3 predicate
is called in the second clause of the nrev/2 program for a sequence of liveness
environments of increasing generality. The variable _RU is successively bound to
the empty list, a list of at most one element, at most two elements, and finally
to a list of any possible length. The introduction of recursive types during the
abstract interpretation process is controlled by the depth restriction, which is
two for the list-functor ./2. The following table shows the successive abstract
substitutions (restrictions of β_7) for which append/3 is called.
Call: β_in append(_RU, _Last, _Y)

        (_Last, _RU, _Y)
        T        (.(V,nil), nil, V)
        AShr_T   { ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]) }
        ALive_T  { [(_Last,./2),(1,V)], [(_Y,V)] }

        T        (.(V,nil), nil | .(V,nil), V)
        AShr_T   { ([(_RU,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_RU,./2),(1,V)]) }
        ALive_T  { [(_Last,./2),(1,V)], [(_RU,./2),(1,V)], [(_Y,V)] }

        T        (.(V,nil), nil | .(V, nil | .(V,nil)), V)
        AShr_T   { ([(_RU,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(2,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(2,./2),(1,V)], [(_RU,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_RU,./2),(1,V)]),
                   ([(_RU,./2),(2,./2),(1,V)], [(_RU,./2),(2,./2),(1,V)]) }
        ALive_T  { [(_Last,./2),(1,V)], [(_RU,./2),(2,./2),(1,V)],
                   [(_RU,./2),(1,V)], [(_Y,V)] }

        T        (.(V,nil), VList, V)
        AShr_T   { ([(_RU,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_RU,./2),(1,V)]) }
        ALive_T  { [(_Last,./2),(1,V)], [(_RU,./2),(1,V)], [(_Y,V)] }

The table below only contains the interesting program points for the most general
of these queries. The program points are as indicated in Section A.2. Note that
the input sharing edge ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]) is in
fact an irrelevant edge. The program variable _Last is bound to a single element
list, consisting of one free variable: there is no internal sharing possible.
From β_4 we see that _X = [_E|_U] is a selection, from β_4^r that the top
list cell of _X is turned into garbage, and from β_5 that _Z = [_E|_W] is a
construction that needs the allocation of a list cell.

Call: β_in append(_RU, _Last, _Y)

        (_Last, _RU, _Y)
β_in    T        (.(V,nil), VList, V)
        AShr_T   { ([(_RU,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_RU,./2),(1,V)]) }
        ALive_T  { [(_Last,./2),(1,V)], [(_RU,./2),(1,V)], [(_Y,V)] }
β_out   T        (.(V,nil), VList, VListOne)
        AShr_T   { ([(_Last,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_Last,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_RU,./2),(1,V)]) }
        AShr_T   { ([(_Y,./2),(2,./2)], [(_Last,./2)]),
                   ([(_Y,./2)], [(_Last,./2)]),
                   ([(_RU,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_RU,./2),(1,V)], [(_Y,./2),(2,./2),(1,V)]) }
        ALive_T  { [(_Last,./2),(1,V)], [(_Y,./2)],
                   [(_RU,./2),(1,V)] }
1)      (_X, _Y, _Z)
β_1     T        (VList, .(V,nil), V)
        AShr_T   { ([(_X,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_Y,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        ALive_T  { [(_Y,./2),(1,V)], [(_X,./2),(1,V)], [(_Z,V)] }
β_3     T        (nil, .(V,nil), .(V,nil))
        AShr_T   { ([(_Y,./2),(1,V)], [(_Y,./2),(1,V)]) }
        AShr_T   { ([(_Z,./2)], [(_Y,./2)]) }
        ALive_T  { [(_Y,./2),(1,V)], [(_Z,./2)] }
2)      (_E, _U, _W, _X, _Y, _Z)
β_4     T        (V, V, V, VList, .(V,nil), V)
        AShr_T   { ([(_X,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_Y,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_X,./2),(1,V)], [(_X,./2),(1,V)]) }
        ALive_T  { [(_Y,./2),(1,V)], [(_X,./2),(1,V)], [(_Z,V)] }
β_4^r   T        (V, VList, _, VListOne, _, _)
        AShr_T   { ([(_X,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(2,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_U,./2)], [(_X,./2),(2,./2)]) }
        ALive_T  { [(_X,./2),(2,./2)], [(_U,./2)], [(_E,V)],
                   [(_X,./2),(1,V)] }
β_5     T        (V, VList, V, VListOne, .(V,nil), V)
        AShr_T   { ([(_X,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_Y,./2),(1,V)], [(_Y,./2),(1,V)]),
                   ([(_X,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(1,V)]),
                   ([(_X,./2),(2,./2),(1,V)], [(_X,./2),(2,./2),(1,V)]) }
        AShr_T   { ([(_X,./2),(1,V)], [(_E,V)]),
                   ([(_U,./2)], [(_X,./2),(2,./2)]) }
        ALive_T  { [(_Y,./2),(1,V)], [(_X,./2),(2,./2),(1,V)],
                   [(_X,./2),(1,V)], [(_Z,V)] }

A.4 buildtree/2 and insert/3

We discussed the results of the sharing analysis for the insert/3 predicate in
Section 4.3. The following table contains the results of both the sharing and
liveness analysis for the first and the second clause. The analysis results for the
third clause are similar to those for the second clause. (Note: β_5 is not included
in the table as it is identical to β_6.)

Prolog code:

(1) insert(_E, _OT, _NT) :-
        β_1   _OT = empty,
        β_2   _NT = t(empty, _E, empty).  β_3
(2) insert(_E, _OT, _NT) :-
        β_4   _OT = t(_L, _F, _R),
        β_5   _E =< _F,
        β_6   _NT = t(_NL, _F, _R),
        β_7   insert(_E, _L, _NL).  β_8
(3) insert(_E, _OT, _NT) :-
        β_9   _OT = t(_L, _F, _R),
        β_10  _E > _F,
        β_11  _NT = t(_L, _F, _NR),
        β_12  insert(_E, _R, _NR).  β_13

Call: β_in insert(_A, _B, _C)

        (_A, _B, _C)
β_in    T        (Int, Tree, V)
        ALive_T  { [(_C,V)] }
β_out   T        (Int, Tree, TreeOne)
        AShr_T   { ([(_B,t/3)], [(_C,t/3),(3,t/3)]),
                   ([(_B,t/3)], [(_C,t/3),(1,t/3)]) }
        ALive_T  { [(_C,t/3)] }
1)      (_E, _NT, _OT)
β_1     T        (Int, V, Tree)
        ALive_T  { [(_NT,V)] }
β_2     T        (Int, V, empty)
        ALive_T  { [(_NT,V)] }
β_3     T        (Int, t(empty,Int,empty), empty)
        ALive_T  { [(_NT,t/3)] }
2)      (_E, _F, _L, _NL, _NT, _OT, _R)
β_4     T        (Int, V, V, V, V, Tree, V)
        ALive_T  { [(_NT,V)] }
β_4^r   T        (_, Int, Tree, _, _, TreeOne, Tree)
        AShr_T   { ([(_L,t/3)], [(_OT,t/3),(1,t/3)]),
                   ([(_R,t/3)], [(_OT,t/3),(3,t/3)]) }
        ALive_T  { [(_OT,t/3),(1,t/3)], [(_OT,t/3),(3,t/3)],
                   [(_R,t/3)], [(_L,t/3)], [(_F,Int)] }
β_6     T        (Int, Int, Tree, V, V, TreeOne, Tree)
        AShr_T   { ([(_L,t/3)], [(_OT,t/3),(1,t/3)]),
                   ([(_R,t/3)], [(_OT,t/3),(3,t/3)]) }
        ALive_T  { [(_NT,V)] }
β_7     T        (Int, Int, Tree, V, t(V,Int,Tree), TreeOne, Tree)
        AShr_T   { ([(_OT,t/3),(3,t/3)], [(_NT,t/3),(3,t/3)]),
                   ([(_L,t/3)], [(_OT,t/3),(1,t/3)]),
                   ([(_R,t/3)], [(_OT,t/3),(3,t/3)]),
                   ([(_NT,t/3),(1,V)], [(_NL,V)]),
                   ([(_NT,t/3),(3,t/3)], [(_R,t/3)]) }
        ALive_T  { [(_NT,t/3)] }
β_8     T        (Int, Int, Tree, TreeOne, t(TreeOne,Int,Tree),
                  TreeOne, Tree)
        AShr_T   { ([(_NT,t/3),(1,t/3)], [(_OT,t/3),(1,t/3)]),
                   ([(_OT,t/3),(1,t/3)], [(_NL,t/3),(3,t/3)]),
                   ([(_OT,t/3),(1,t/3)], [(_NL,t/3),(1,t/3)]),
                   ([(_NT,t/3),(1,t/3)], [(_L,t/3)]),
                   ([(_OT,t/3),(3,t/3)], [(_NT,t/3),(3,t/3)]),
                   ([(_L,t/3)], [(_OT,t/3),(1,t/3)]),
                   ([(_R,t/3)], [(_OT,t/3),(3,t/3)]),
                   ([(_NT,t/3),(1,t/3)], [(_NL,t/3),(1,t/3)]),
                   ([(_NT,t/3),(1,t/3)], [(_NL,t/3),(3,t/3)]),
                   ([(_NT,t/3),(1,t/3)], [(_NL,t/3)]),
                   ([(_NT,t/3),(3,t/3)], [(_R,t/3)]),
                   ([(_L,t/3)], [(_NL,t/3),(3,t/3)]),
                   ([(_L,t/3)], [(_NL,t/3),(1,t/3)]) }
        ALive_T  { [(_NT,t/3)] }

From β_4, the compiler derives that _OT = t(_L, _F, _R) is a selection
operation. From β_4^r, it is derived that the top-tree cell can be reused in a construction.
The unification _NT = t(_NL, _F, _R) is such a construction operation, within
the same chunk as the selection operation.

Prolog code:

(1) buildtree(_L, _OT, _NT) :-
        β_1  _L = nil,
        β_2  _NT = _OT.  β_3
(2) buildtree(_L, _OT, _NT) :-
        β_4  _L = [_E|_R],  β_4^r
        β_5  insert(_E, _OT, _T),
        β_6  buildtree(_R, _T, _NT).  β_7

Call: β_in buildtree(_A, _B, _C)

        (_A, _B, _C)
β_in    T        (List, Tree, V)
        ALive_T  { [(_C,V)] }
β_out   T        (List, Tree, Tree)
        AShr_T   { ([(_C,t/3)], [(_B,t/3)]) }
        ALive_T  { [(_C,t/3)] }
1)      (_L, _NT, _OT)
β_1     T        (List, V, Tree)
        ALive_T  { [(_NT,V)] }
β_2     T        (nil, V, Tree)
        ALive_T  { [(_NT,V)] }
β_3     T        (nil, Tree, Tree)
        AShr_T   { ([(_NT,t/3)], [(_OT,t/3)]) }
        ALive_T  { [(_NT,t/3)] }
2)      (_E, _L, _NT, _OT, _R, _T)
β_4     T        (V, List, V, Tree, V, V)
        ALive_T  { [(_NT,V)] }
β_4^r   T        (Int, ListOne, _, _, List, _)
        AShr_T   { ([(_R,./2)], [(_L,./2),(2,./2)]) }
        ALive_T  { [(_L,./2),(2,./2)], [(_R,./2)], [(_E,Int)] }
β_5     T        (Int, ListOne, V, Tree, List, V)
        AShr_T   { ([(_R,./2)], [(_L,./2),(2,./2)]) }
        ALive_T  { [(_NT,V)] }
β_6     T        (Int, ListOne, V, Tree, List, TreeOne)
        AShr_T   { ([(_R,./2)], [(_L,./2),(2,./2)]),
                   ([(_OT,t/3)], [(_T,t/3),(3,t/3)]),
                   ([(_OT,t/3)], [(_T,t/3),(1,t/3)]) }
        ALive_T  { [(_NT,V)] }
β_7     T        (Int, ListOne, Tree, Tree, List, TreeOne)
        AShr_T   { ([(_OT,t/3)], [(_NT,t/3)]),
                   ([(_R,./2)], [(_L,./2),(2,./2)]),
                   ([(_OT,t/3)], [(_T,t/3),(3,t/3)]),
                   ([(_OT,t/3)], [(_T,t/3),(1,t/3)]),
                   ([(_NT,t/3)], [(_T,t/3)]),
                   ([(_NT,t/3)], [(_T,t/3),(3,t/3)]),
                   ([(_NT,t/3)], [(_T,t/3),(1,t/3)]) }
        ALive_T  { [(_NT,t/3)] }
From β_4, the compiler derives that _L = [_E|_R] is a selection operation.
From β_4^r, it is derived that the top-list cell can be reused in a construction.
However, there is no local opportunity to reuse the list cell.

A.5 permutation/2 and select/3


We use the select/3 program to point out that the garbage cells detected at
a program point following a procedure call different from the unification builtin
=/2 are only true garbage cells if they are not already reused by the called
procedure. The reason is that the analysis presently assumes that no garbage
cells are reused.

Prolog code:

(1) select(_X, _Y1, _Z1) :-
        β_1  _Y1 = [_X|_Z1].
(2) select(_X, _Y1, _Z1) :-
        β_2  _Y1 = [_Y|_Ys],  β_2^r
        β_3  _Z1 = [_Y|_Zs],
        β_4  select(_X, _Ys, _Zs).  β_5

Call: β_in select(_A, _B, _C)

        (_A, _B, _C)
β_in    T        (V, List, V)
        ALive_T  { [(_A,V)], [(_C,V)] }
β_out   T        (Int, ListOne, List)
        AShr_T   { ([(_B,./2),(2,./2)], [(_C,./2)]) }
        ALive_T  { [(_A,Int)], [(_C,./2)] }
1)      (_X, _Y1, _Z1)
β_1     T        (V, List, V)
        ALive_T  { [(_X,V)], [(_Z1,V)] }
β_1^r   T        (Int, ListOne, List)
        AShr_T   { ([(_Y1,./2),(2,./2)], [(_Z1,./2)]) }
        ALive_T  { [(_Y1,./2),(2,./2)], [(_X,Int)], [(_Z1,./2)] }
2)      (_X, _Y1, _Z1, _Y, _Ys, _Zs)
β_2     T        (V, List, V, V, V, V)
        ALive_T  { [(_X,V)], [(_Z1,V)] }
β_2^r   T        (_, ListOne, _, Int, List, _)
        AShr_T   { ([(_Y1,./2),(2,./2)], [(_Ys,./2)]) }
        ALive_T  { [(_Y1,./2),(2,./2)], [(_Y,Int)], [(_Ys,./2)] }
β_3     T        (V, ListOne, V, Int, List, V)
        AShr_T   { ([(_Y1,./2),(2,./2)], [(_Ys,./2)]) }
        ALive_T  { [(_X,V)], [(_Z1,V)] }
β_4     T        (V, ListOne, .(Int,V), Int, List, V)
        AShr_T   { ([(_Z1,./2),(2,V)], [(_Zs,V)]),
                   ([(_Y1,./2),(2,./2)], [(_Ys,./2)]) }
        ALive_T  { [(_X,V)], [(_Z1,./2)] }
T ( Int, .(Int,List0ne), List0ne, Int, List0ne,
List)
AShr~- { ([(_ZI,./2),(2,./2)], [(_YI,./2),(2,./2)]),
([(_Zs,./2)], [(_YI,./2),(2, /2)]),
( [(_Ys, .12), (2, .12)], [( 7.1, 12), (2,. 12)3 ),
( [(_Ys, .12), (2, .12)], [(_Zs, 12)]),
([(31, .12), (2, .12)], [(_Ys, 12),(2,.12)]),
( [(_Y1, .12), (2,. 12)], [(_Ys, 12)]),
( [(_Zl, ./2), (2, ./2)], [(_Zs, 12)])}
ALive~r { [(1,Int)], [(i1,.12)]}

From the abstract substitution β_1^r in the first clause, it is clear that a garbage
cell is created at run time. However, there is no opportunity for the local reuse
of that cell in the same clause. If the procedure is called by the permutation/2
program shown below, then there will correspond one construction operation _Z1
= [_Z|_Zs] in the permutation/2 program to each selection operation _Y1
= [_X|_Z1] in the select/3 program. Unfortunately, we do not know of an
easy strategy to pass on the address of the garbage cell for such non-local reuse.
A free-list or garbage-trail could be added to the run-time control structures of
an interpreter in order to record the address and the size of the garbage cells
detected. But the operations for handling the garbage cells in such a free-list
will be more complicated than stack operations, and the overhead introduced
may exceed the gain to be expected from reusing storage. Unless this problem
is solved, the permutation/2 procedure cannot really work in-place.

Prolog code:

(1) permutation(_X1, _Y1) :-
        β_1  _X1 = nil,
        β_2  _Y1 = nil.  β_3
(2) permutation(_X1, _Z1) :-
        β_4  select(_Z, _X1, _Ys),  β_4^r
        β_5  _Z1 = [_Z|_Zs],
        β_6  permutation(_Ys, _Zs).  β_7

Call: #~ permutation( _A, _B )


       ( _A, _B )
β_in   T ( List, V )
       ALive_T { [(_B,V)] }
β_out  T ( List, List )
       ALive_T { [(_B,./2)] }
1)     ( _X1, _Y1 )
β1     T ( List, V )
       ALive_T { [(_Y1,V)] }
β2     T ( nil, V )
       ALive_T { [(_Y1,V)] }
β3     T ( nil, nil )
       ALive_T { [(_Y1,nil)] }
2)     ( _X1, _Ys, _Z, _Z1, _Zs )
β4     T ( List, V, V, V, V )
       ALive_T { [(_Z1,V)] }
β'4    T ( ListOne, List, Int, _, _ )
       AShr_T { ([(_Ys,./2)], [(_X1,./2),(2,./2)]) }
       ALive_T { [(_Ys,./2)], [(_Z,Int)], [(_X1,./2),(2,./2)] }
β5     T ( ListOne, List, Int, V, V )
       AShr_T { ([(_Ys,./2)], [(_X1,./2),(2,./2)]) }
       ALive_T { [(_Z1,V)] }
β6     T ( ListOne, List, Int, .(Int,V), V )
       AShr_T { ([(_Ys,./2)], [(_X1,./2),(2,./2)]),
                ([(_Z1,./2),(2,V)], [(_Zs,V)]) }
       ALive_T { [(_Z1,./2)] }
β7     T ( ListOne, List, Int, ListOne, List )
       AShr_T { ([(_Ys,./2)], [(_X1,./2),(2,./2)]),
                ([(_Z1,./2),(2,./2)], [(_Zs,./2)]) }
       ALive_T { [(_Z1,./2)] }
Note that the top-list cell of the variable _X1 is derived to become a garbage
cell in β'4. However, this is only true if the code generated for the select/3
predicate did not introduce destructive assignments. The garbage cell created
in the first clause of the select/3 predicate may or may not be the first list
cell of the variable _X1.
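
The pairing of one selection in select/3 with one construction in permutation/2 is easier to see in a plain, non-normalised rendering of the two predicates. The following is only a minimal sketch and not the normal-form code analysed in the tables: the names perm/2 and sel/3 are our own (chosen to avoid clashes with library predicates of the same name in some Prolog systems), and [] is written where the analysed program uses nil.

```prolog
% sel(X, Ys, Zs): Zs is the list Ys with one occurrence of X taken out.
% Clause 1 only selects from its second argument: the cell [X|Zs] is
% inspected but no new cell is built in the same clause.
sel(X, [X|Zs], Zs).
% Clause 2 selects the cell [Y|Ys] and constructs the cell [Y|Zs].
sel(X, [Y|Ys], [Y|Zs]) :-
    sel(X, Ys, Zs).

% perm(Xs, Zs): Zs is a permutation of the list Xs.
perm([], []).
% The construction [Z|Zs] in the head is the operation that could, in
% principle, reuse the cell released by clause 1 of sel/3.
perm(Xs, [Z|Zs]) :-
    sel(Z, Xs, Ys),
    perm(Ys, Zs).
```

A query such as ?- perm([1,2,3], P). enumerates the six permutations on backtracking. Each solution builds exactly as many list cells as the input list contains, which is why in-place execution would be attractive if the non-local reuse problem were solved.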

A.6 split/3
The split/3 predicate illustrates that for programs in normal form, depth bound
two yields sufficient precision to detect the garbage cells. A drawback of normal-
form Prolog programs is the large set of program variables and consequently the
large type graphs, sharing and liveness sets.
Garbage list-cells are detected for the program points β'1, β'2 and β'7.

Prolog code:

(1) split( _X, _Y, _Z ) :-
      β1 _X = [_a|_X1], β'1
      β2 _X1 = [_b|_r], β'2
      β3 _Y = [_a|_r1],
      β4 _Z = [_b|_r2],
      β5 split( _r, _r1, _r2 ). β6
(2) split( _X, _Y, _Z ) :-
      β7 _X = [_a|_X1], β'7
      β8 _X1 = [],
      β9 _Y = [_a|_Y1],
      β10 _Y1 = [], β11 _Z = []. β12
(3) split( _X, _Y, _Z ) :-
      β13 _X = [], β14 _Y = [], β15 _Z = []. β16

Call: β_in split( _A, _B, _C )


       ( _A, _B, _C )
β_in   T ( List, V, V )
       ALive_T { [(_B,V)], [(_C,V)] }
β_out  T ( List, List, List )
       ALive_T { [(_B,./2)], [(_C,./2)] }
1)     ( _X, _Y, _Z, _a, _X1, _b, _r, _r1, _r2 )
β1     T ( List, V, V, V, V, V, V, V, V )
       ALive_T { [(_Y,V)], [(_Z,V)] }
β'1    T ( ListOne, _, _, Int, List, _, _, _, _ )
       AShr_T { ([(_X1,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_X,./2),(2,./2)], [(_X1,./2)], [(_a,Int)] }
β2     T ( ListOne, V, V, Int, List, V, V, V, V )
       AShr_T { ([(_X1,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Y,V)], [(_Z,V)] }
β'2    T ( _, _, _, _, ListOne, Int, List, _, _ )
       AShr_T { ([(_r,./2)], [(_X1,./2),(2,./2)]) }
       ALive_T { [(_X1,./2),(2,./2)], [(_b,Int)], [(_r,./2)] }
β3     T ( .(Int,ListOne), V, V, Int, ListOne, Int, List, V, V )
       AShr_T { ([(_r,./2)], [(_X,./2),(2,./2)]),
                ([(_r,./2)], [(_X1,./2),(2,./2)]),
                ([(_X1,./2),(2,./2)], [(_X,./2),(2,./2)]),
                ([(_X1,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Y,V)], [(_Z,V)] }
β4     T ( .(Int,ListOne), .(Int,V), V, Int, ListOne, Int, List, V, V )
       AShr_T { ([(_Y,./2),(2,V)], [(_r1,V)]),
                ([(_X1,./2)], [(_X,./2),(2,./2)]),
                ([(_X1,./2),(2,./2)], [(_X,./2),(2,./2)]),
                ([(_r,./2)], [(_X1,./2),(2,./2)]),
                ([(_r,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Y,./2)], [(_Z,V)] }
β5     T ( .(Int,ListOne), .(Int,V), .(Int,V), Int, ListOne, Int, List, V, V )
       AShr_T { ([(_Z,./2),(2,V)], [(_r2,V)]),
                ([(_r,./2)], [(_X,./2),(2,./2)]),
                ([(_r,./2)], [(_X1,./2),(2,./2)]),
                ([(_X1,./2),(2,./2)], [(_X,./2),(2,./2)]),
                ([(_X1,./2)], [(_X,./2),(2,./2)]),
                ([(_Y,./2),(2,V)], [(_r1,V)]) }
       ALive_T { [(_Y,./2)], [(_Z,./2)] }
β6     T ( .(Int,ListOne), ListOne, ListOne, Int, ListOne, Int, List, List, List )
       AShr_T { ([(_Z,./2),(2,./2)], [(_r2,./2)]),
                ([(_r,./2)], [(_X,./2),(2,./2)]),
                ([(_r,./2)], [(_X1,./2),(2,./2)]),
                ([(_X1,./2),(2,./2)], [(_X,./2),(2,./2)]),
                ([(_X1,./2)], [(_X,./2),(2,./2)]),
                ([(_Y,./2),(2,./2)], [(_r1,./2)]) }
       ALive_T { [(_Y,./2)], [(_Z,./2)] }
2)     ( _X, _Y, _Z, _a, _X1, _Y1 )
β7     T ( List, V, V, V, V, V )
       ALive_T { [(_Y,V)], [(_Z,V)] }
β'7    T ( ListOne, _, _, Int, List, _ )
       AShr_T { ([(_X1,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_X,./2),(2,./2)], [(_X1,./2)], [(_a,Int)] }
β8     T ( ListOne, V, V, Int, List, V )
       AShr_T { ([(_X1,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Y,V)], [(_Z,V)] }
β9     T ( .(Int,nil), V, V, Int, nil, V )
       AShr_T { ([(_X1,nil)], [(_X,./2),(2,nil)]) }
       ALive_T { [(_Y,V)], [(_Z,V)] }
β10    T ( .(Int,nil), .(Int,V), V, Int, nil, V )
       AShr_T { ([(_Y,./2),(2,V)], [(_Y1,V)]),
                ([(_X1,nil)], [(_X,./2),(2,nil)]) }
       ALive_T { [(_Y,./2)], [(_Z,V)] }
β11    T ( .(Int,nil), .(Int,nil), V, Int, nil, nil )
       AShr_T { ([(_Y,./2),(2,nil)], [(_Y1,nil)]),
                ([(_X1,nil)], [(_X,./2),(2,nil)]) }
       ALive_T { [(_Y,./2)], [(_Z,V)] }
β12    T ( .(Int,nil), .(Int,nil), nil, Int, nil, nil )
       AShr_T { ([(_Y,./2),(2,nil)], [(_Y1,nil)]),
                ([(_X1,nil)], [(_X,./2),(2,nil)]) }
       ALive_T { [(_Y,./2)], [(_Z,nil)] }
3)     ( _X, _Y, _Z )
β13    T ( List, V, V )
       ALive_T { [(_Y,V)], [(_Z,V)] }
β14    T ( nil, V, V )
       ALive_T { [(_Y,V)], [(_Z,V)] }
β15    T ( nil, nil, V )
       ALive_T { [(_Y,nil)], [(_Z,V)] }
β16    T ( nil, nil, nil )
       ALive_T { [(_Y,nil)], [(_Z,nil)] }
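
To make the cost of the normal form concrete, the sketch below places a compact rendering of split/3 next to a normal-form version written with explicit unifications. Both are illustrative only: the name split_nf/3 is our own, and the variable names follow the listing above without the leading underscore. The compact version has three variables in its recursive clause head positions, whereas the normal-form clause needs nine, which is what makes the type graphs, sharing sets and liveness sets in the table grow.

```prolog
% Compact form: elements at odd positions go to Ys, even positions to Zs.
split([A,B|R], [A|R1], [B|R2]) :- split(R, R1, R2).
split([A], [A], []).
split([], [], []).

% Normal form: every head argument is a distinct variable and each
% unification selects or constructs a single cell, as in the analysed program.
split_nf(X, Y, Z) :-
    X = [A|X1],          % selection of the first cell of X
    X1 = [B|R],          % selection of the second cell
    Y = [A|R1],          % construction, may reuse the first cell
    Z = [B|R2],          % construction, may reuse the second cell
    split_nf(R, R1, R2).
split_nf(X, Y, Z) :-
    X = [A|X1], X1 = [],
    Y = [A|Y1], Y1 = [],
    Z = [].
split_nf(X, Y, Z) :-
    X = [], Y = [], Z = [].
```

For the query ?- split_nf([1,2,3,4,5], Y, Z). both versions bind Y to [1,3,5] and Z to [2,4].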

A.7 qsort/2 and partition/4

In Section 5.3, we discussed the results of the liveness analysis for the quicksort
program using an accumulating parameter. Similar results are obtained for the
simple quicksort program using append/3. First we consider the analysis of
the partition/4 predicate, which is called by both versions of the quicksort
program. We specify input sharing and liveness components according to the
use of the predicate in the quicksort program given below.

Prolog code:

(1) partition( _M, _L, _Sm, _Gr ) :-
      β1 _L = nil,
      β2 _Sm = nil,
      β3 _Gr = nil. β4
(2) partition( _M, _L, _Sm, _Gr ) :-
      β5 _L = [_H | _T], β'5
      β6 _H =< _M,
      β7 _Sm = [_H | _Sm1],
      β8 partition( _M, _T, _Sm1, _Gr ). β9
(3) partition( _M, _L, _Sm, _Gr ) :-
      β10 _L = [_H | _T], β'10
      β11 _H > _M,
      β12 _Gr = [_H | _Gr1],
      β13 partition( _M, _T, _Sm, _Gr1 ). β14

Call: β_in partition( _A, _B, _C, _D )

       ( _A, _B, _C, _D )
β_in   T ( Int, List, V, V )
       ALive_T { [(_D,V)], [(_C,V)], [(_A,Int)] }
β_out  T ( Int, List, List, List )
       ALive_T { [(_A,Int)], [(_C,./2)], [(_D,./2)] }
1)     ( _M, _L, _Sm, _Gr )
β1     T ( Int, List, V, V )
       ALive_T { [(_Gr,V)], [(_Sm,V)], [(_M,Int)] }
β2     T ( Int, nil, V, V )
       ALive_T { [(_Gr,V)], [(_Sm,V)], [(_M,Int)] }
β3     T ( Int, nil, nil, V )
       ALive_T { [(_Gr,V)], [(_Sm,nil)], [(_M,Int)] }
β4     T ( Int, nil, nil, nil )
       ALive_T { [(_Gr,nil)], [(_Sm,nil)], [(_M,Int)] }
2)     ( _M, _L, _H, _T, _Sm, _Gr, _Sm1 )
β5     T ( Int, List, V, V, V, V, V )
       ALive_T { [(_Gr,V)], [(_Sm,V)], [(_M,Int)] }
β'5    T ( _, ListOne, Int, List, _, _, _ )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]) }
       ALive_T { [(_L,./2),(2,./2)], [(_T,./2)], [(_H,Int)] }
β6     T ( Int, ListOne, Int, List, V, V, V )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]) }
       ALive_T { [(_Gr,V)], [(_Sm,V)], [(_M,Int)] }
β8     T ( Int, ListOne, Int, List, .(Int,V), V, V )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]),
                ([(_Sm,./2),(2,V)], [(_Sm1,V)]) }
       ALive_T { [(_Gr,V)], [(_Sm,./2)], [(_M,Int)] }
β9     T ( Int, ListOne, Int, List, ListOne, List, List )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]),
                ([(_Sm,./2),(2,./2)], [(_Sm1,./2)]) }
       ALive_T { [(_Gr,./2)], [(_Sm,./2)], [(_M,Int)] }
3)     ( _M, _L, _H, _T, _Sm, _Gr, _Gr1 )
β10    T ( Int, List, V, V, V, V, V )
       ALive_T { [(_Gr,V)], [(_Sm,V)], [(_M,Int)] }
β'10   T ( _, ListOne, Int, List, _, _, _ )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]) }
       ALive_T { [(_L,./2),(2,./2)], [(_T,./2)], [(_H,Int)] }
β11    T ( Int, ListOne, Int, List, V, V, V )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]) }
       ALive_T { [(_Gr,V)], [(_Sm,V)], [(_M,Int)] }
β13    T ( Int, ListOne, Int, List, V, .(Int,V), V )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]),
                ([(_Gr,./2),(2,V)], [(_Gr1,V)]) }
       ALive_T { [(_Gr,./2)], [(_Sm,V)], [(_M,Int)] }
β14    T ( Int, ListOne, Int, List, List, ListOne, List )
       AShr_T { ([(_T,./2)], [(_L,./2),(2,./2)]),
                ([(_Gr,./2),(2,./2)], [(_Gr1,./2)]) }
       ALive_T { [(_Gr,./2)], [(_Sm,./2)], [(_M,Int)] }
We omitted β7 and β12 from the table, as they are identical to β6 and β11,
respectively.
From β5 (resp. β10), the compiler derives that _L = [_H | _T] is a selection
operation. From β'5 (resp. β'10), it is derived that the top-list cell can be reused
in the construction _Sm = [_H | _Sm1] (resp. _Gr = [_H | _Gr1]). For both
clauses, the selection and construction operations occur within a single chunk.
Note that there is no sharing between the output lists of the third and the
fourth argument.
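
The local reuse opportunity is visible when the predicate is written compactly: in each recursive clause, exactly one cell is taken apart in the second argument and exactly one cell is built in an output argument. The following is a minimal sketch, assuming integer elements; the name part/4 is our own (partition/4 already denotes a different library predicate in some Prolog systems), and [] replaces nil.

```prolog
% part(M, L, Sm, Gr): Sm holds the elements of L that are =< M,
% Gr holds the elements that are > M, both in their original order.
part(_, [], [], []).
% Cell [H|T] is selected; cell [H|Sm1] is constructed and could reuse it.
part(M, [H|T], [H|Sm1], Gr) :-
    H =< M,
    part(M, T, Sm1, Gr).
% Cell [H|T] is selected; cell [H|Gr1] is constructed and could reuse it.
part(M, [H|T], Sm, [H|Gr1]) :-
    H > M,
    part(M, T, Sm, Gr1).
```

For example, ?- part(3, [5,1,4,2,3], Sm, Gr). binds Sm to [1,2,3] and Gr to [5,4]; the two output lists have no cells in common, in line with the absence of sharing noted above.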

Prolog code:

(1) qsort( _X, _Res ) :-
      β1 _X = nil,
      β2 _Res = nil. β3
(2) qsort( _X, _Res ) :-
      β4 _X = [_H | _T], β'4
      β5 partition( _H, _T, _U1, _U2 ),
      β6 qsort( _U1, _Res1 ),
      β7 qsort( _U2, _Res2 ),
      β8 _R = [_H | _Res2],
      β9 append( _Res1, _R, _Res ). β10

Call: β_in qsort( _A, _B )

       ( _A, _B )
β_in   T ( List, V )
       ALive_T { [(_B,V)] }
β_out  T ( List, List )
       ALive_T { [(_B,./2)] }
1)     ( _Res, _X )
β1     T ( V, List )
       ALive_T { [(_Res,V)] }
β2     T ( V, nil )
       ALive_T { [(_Res,V)] }
β3     T ( nil, nil )
       ALive_T { [(_Res,nil)] }
2)     ( _H, _R, _Res, _Res1, _Res2, _T, _U1, _U2, _X )
β4     T ( V, V, V, V, V, V, V, V, List )
       ALive_T { [(_Res,V)] }
β'4    T ( Int, _, _, _, _, List, _, _, ListOne )
       AShr_T { ([(_T,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_X,./2),(2,./2)], [(_T,./2)], [(_H,Int)] }
β5     T ( Int, V, V, V, V, List, V, V, ListOne )
       AShr_T { ([(_T,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Res,V)] }
β6     T ( Int, V, V, V, V, List, List, List, ListOne )
       AShr_T { ([(_T,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Res,V)] }
β7     T ( Int, V, V, List, V, List, List, List, ListOne )
       AShr_T { ([(_T,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Res,V)] }
β8     T ( Int, V, V, List, List, List, List, List, ListOne )
       AShr_T { ([(_T,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Res,V)] }
β9     T ( Int, ListOne, V, List, List, List, List, List, ListOne )
       AShr_T { ([(_T,./2)], [(_X,./2),(2,./2)]),
                ([(_R,./2),(2,./2)], [(_Res2,./2)]) }
       ALive_T { [(_Res,V)] }
β10    T ( Int, ListOne, List, List, List, List, List, List, ListOne )
       AShr_T { ([(_Res2,./2)], [(_Res,./2)]),
                ([(_T,./2)], [(_X,./2),(2,./2)]),
                ([(_R,./2),(2,./2)], [(_Res2,./2)]),
                ([(_Res,./2)], [(_R,./2)]),
                ([(_Res,./2)], [(_R,./2),(2,./2)]) }
       ALive_T { [(_Res,./2)] }
From β4, the compiler derives that _X = [_H | _T] is a selection operation.
From β'4, it is derived that the top-list cell can be reused in the construction
_R = [_H | _Res2]. However, the selection and construction operations do not
occur within a single chunk. If the temporary variable _X is made a permanent
variable by the compiler, its value will be saved on the local stack, and the
garbage cell it is pointing to will be accessible for reuse in the construction
_R = [_H | _Res2] at program point β8. Code migration would provide an alternative
solution (see Section 3.3).
Note that the restriction of β9 to the variables of the call append( _Res1,
_R, _Res ) yields the abstract call substitution for which the analysis results are
shown in Section A.2.
Also, it can be seen from β6 that the terms bound to _U1 and _U2 are ground
and that the terms bound to _Res1 and _Res2 are independent free variables
when the subgoal qsort( _U1, _Res1 ), in the body of the recursive clause for
qsort/2, is called. Therefore, if the results of the analysis were used for an
IAP-implementation of Prolog (Section 3.1), the subgoals qsort( _U1, _Res1 ) and
qsort( _U2, _Res2 ) could be run in parallel without any run-time groundness
or independence checks.
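
The distance between the selection and the construction can be seen in a compact rendering of the program. The sketch below is illustrative only: the names qs/2, part/4 and app/3 are our own stand-ins for qsort/2, partition/4 and append/3 (renamed to avoid clashes with library predicates in some Prolog systems), integer elements are assumed, and [] replaces nil.

```prolog
% part/4 and app/3 are plain versions of partition/4 and append/3.
part(_, [], [], []).
part(M, [H|T], [H|Sm1], Gr) :- H =< M, part(M, T, Sm1, Gr).
part(M, [H|T], Sm, [H|Gr1]) :- H > M, part(M, T, Sm, Gr1).

app([], Ys, Ys).
app([X|Xs], Ys, [X|Zs]) :- app(Xs, Ys, Zs).

% qs(Xs, Res): Res is the sorted version of the integer list Xs.
qs([], []).
qs([H|T], Res) :-            % selection of the cell [H|T] in the head
    part(H, T, U1, U2),      % three calls separate the selection ...
    qs(U1, Res1),            % (U1 and U2 are ground here, Res1 and Res2
    qs(U2, Res2),            %  are distinct free variables)
    app(Res1, [H|Res2], Res).% ... from the construction of [H|Res2]
```

For example, ?- qs([3,1,2,5,4], S). binds S to [1,2,3,4,5]. The two recursive calls work on disjoint ground inputs and distinct free outputs, which is the property that would allow them to run in parallel.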

A.8 sameleaves/2 and profile/2


In Section 5.3, the sameleaves/2 program served to clarify the relationship
between the depth restriction for type graphs, the normal form of the program
and the precision of the liveness analysis. The results of this section are for the
program in normal form and depth bound two for the t/2 functor.

Prolog code:

(1) sameleaves( _x, _y ) :-
      β1 profile( _x, _w ),
      β2 profile( _y, _w ). β3

(1) profile( _tr, _pr ) :-
      β4 _tr = lv(_u), β'4
      β5 _pr = [_u]. β6
(2) profile( _tr, _pr ) :-
      β7 _tr = t(_t1, _y), β'7
      β8 _t1 = lv(_u), β'8
      β9 _pr = [_u | _prtail],
      β10 profile( _y, _prtail ). β11
(3) profile( _tr, _pr ) :-
      β12 _tr = t(_t1, _z), β'12
      β13 _t1 = t(_x, _y), β'13
      β14 _t2 = t(_x, _t3),
      β15 _t3 = t(_y, _z),
      β16 profile( _t2, _pr ). β17

The intended solution of an initial call to the sameleaves/2 predicate usually
is a yes/no answer. In the query as specified below, the input list of the second
argument is assumed to be needed in some further computations. Note that
for program point β2, it is derived that the program variable _w is bound to a
ground list containing at least one element.

Call: β_in sameleaves( _A, _B )

       ( _A, _B )
β_in   T ( LvTree, LvTree )
       ALive_T { [(_B,t/2)], [(_B,lv/1)] }
β_out  T ( LvTree, LvTree )
       ALive_T { [(_B,lv/1)], [(_B,t/2)] }
1)     ( _w, _x, _y )
β1     T ( V, LvTree, LvTree )
       ALive_T { [(_y,t/2)], [(_y,lv/1)] }
β2     T ( ListOne, LvTree, LvTree )
       ALive_T { [(_y,t/2)], [(_y,lv/1)] }
β3     T ( ListOne, LvTree, LvTree )
       ALive_T { [(_y,t/2)], [(_y,lv/1)] }
The analysis of the predicate sameleaves/2 entails an analysis of the predicate
profile/2 for two different abstract liveness environments. In the first call, the
profile/2 predicate has to construct a linear list containing the leaf nodes of
the input tree.

Call: β_in profile( _A, _B )

       ( _A, _B )
β_in   T ( LvTree, V )
       ALive_T { [(_B,V)] }

In the second call it only has to test whether a given linear list contains the leaf
nodes of the input tree.

Call: β_in profile( _A, _B )

       ( _A, _B )
β_in   T ( LvTree, ListOne )
       ALive_T { [(_A,t/2)], [(_A,lv/1)] }

We only consider the first query, for which the table below indicates that the
rearrangement of the input tree in the third clause can be done in-place. For
the second query this will not be the case, as the input tree of the first argument
is live. Note that for large type graphs (e.g. in β14, the type graph has 39 nodes)
the sets of sharing edges may blow up exponentially. This is a major difficulty
for the approach taken.

       ( _A, _B )
β_in   T ( LvTree, V )
       ALive_T { [(_B,V)] }
β_out  T ( LvTree, ListOne )
       ALive_T { [(_B,./2)] }
1)     ( _pr, _tr, _u )
β4     T ( V, LvTree, V )
       ALive_T { [(_pr,V)] }
β'4    T ( _, lv(Int), Int )
       ALive_T { [(_u,Int)] }
β5     T ( V, lv(Int), Int )
       ALive_T { [(_pr,V)] }
β6     T ( .(Int,nil), lv(Int), Int )
       ALive_T { [(_pr,./2)] }
2)     ( _pr, _prtail, _t1, _tr, _u, _y )
β7     T ( V, V, V, LvTree, V, V )
       ALive_T { [(_pr,V)] }
β'7    T ( _, _, LvTree, LvTreeOne, _, LvTree )
       AShr_T { ([(_t1,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,lv/1)], [(_tr,t/2),(1,lv/1)]),
                ([(_y,t/2)], [(_tr,t/2),(2,t/2)]),
                ([(_y,lv/1)], [(_tr,t/2),(2,lv/1)]) }
       ALive_T { [(_tr,t/2),(1,lv/1)], [(_tr,t/2),(1,t/2)],
                 [(_tr,t/2),(2,lv/1)], [(_tr,t/2),(2,t/2)],
                 [(_y,t/2)], [(_y,lv/1)], [(_t1,t/2)], [(_t1,lv/1)] }
β8     T ( V, V, LvTree, LvTreeOne, V, LvTree )
       AShr_T { ([(_t1,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,lv/1)], [(_tr,t/2),(1,lv/1)]),
                ([(_y,t/2)], [(_tr,t/2),(2,t/2)]),
                ([(_y,lv/1)], [(_tr,t/2),(2,lv/1)]) }
       ALive_T { [(_pr,V)] }
β'8    T ( _, _, lv(Int), _, Int, _ )
       ALive_T { [(_u,Int)] }
β9     T ( V, V, lv(Int), t(lv(Int),LvTree), Int, LvTree )
       AShr_T { ([(_t1,lv/1)], [(_tr,t/2),(1,lv/1)]),
                ([(_y,t/2)], [(_tr,t/2),(2,t/2)]),
                ([(_y,lv/1)], [(_tr,t/2),(2,lv/1)]) }
       ALive_T { [(_pr,V)] }
3)     ( _pr, _t1, _t2, _t3, _tr, _x, _y, _z )
β12    T ( V, V, V, V, LvTree, V, V, V )
       ALive_T { [(_pr,V)] }
β'12   T ( _, LvTree, _, _, LvTreeOne, _, _, LvTree )
       AShr_T { ([(_t1,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,lv/1)], [(_tr,t/2),(1,lv/1)]),
                ([(_z,t/2)], [(_tr,t/2),(2,t/2)]),
                ([(_z,lv/1)], [(_tr,t/2),(2,lv/1)]) }
       ALive_T { [(_tr,t/2),(1,lv/1)], [(_tr,t/2),(1,t/2)],
                 [(_tr,t/2),(2,lv/1)], [(_tr,t/2),(2,t/2)],
                 [(_z,t/2)], [(_z,lv/1)], [(_t1,t/2)], [(_t1,lv/1)] }
β13    T ( V, LvTree, V, V, LvTreeOne, V, V, LvTree )
       AShr_T { ([(_t1,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,lv/1)], [(_tr,t/2),(1,lv/1)]),
                ([(_z,t/2)], [(_tr,t/2),(2,t/2)]),
                ([(_z,lv/1)], [(_tr,t/2),(2,lv/1)]) }
       ALive_T { [(_pr,V)] }
β'13   T ( _, LvTreeOne, _, _, _, LvTree, LvTree, _ )
       AShr_T { ([(_x,t/2)], [(_t1,t/2),(1,t/2)]),
                ([(_x,lv/1)], [(_t1,t/2),(1,lv/1)]),
                ([(_y,t/2)], [(_t1,t/2),(2,t/2)]),
                ([(_y,lv/1)], [(_t1,t/2),(2,lv/1)]) }
       ALive_T { [(_t1,t/2),(1,lv/1)], [(_t1,t/2),(1,t/2)],
                 [(_t1,t/2),(2,lv/1)], [(_t1,t/2),(2,t/2)],
                 [(_y,t/2)], [(_y,lv/1)], [(_x,t/2)], [(_x,lv/1)] }
β14    T ( V, LvTreeOne, V, V, t(LvTreeOne,LvTree), LvTree, LvTree, LvTree )
       AShr_T { ([(_tr,t/2),(1,t/2)], [(_x,t/2)]),
                ([(_tr,t/2),(1,t/2),(2,lv/1)], [(_x,lv/1)]),
                ([(_tr,t/2),(1,t/2),(1,lv/1)], [(_x,lv/1)]),
                ([(_tr,t/2),(1,t/2)], [(_y,t/2)]),
                ([(_tr,t/2),(1,t/2),(2,lv/1)], [(_y,lv/1)]),
                ([(_tr,t/2),(1,t/2),(1,lv/1)], [(_y,lv/1)]),
                ([(_t1,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,t/2),(2,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,t/2),(1,t/2)], [(_tr,t/2),(1,t/2)]),
                ([(_t1,t/2),(2,lv/1)], [(_tr,t/2),(1,t/2),(1,lv/1)]),
                ([(_t1,t/2),(1,lv/1)], [(_tr,t/2),(1,t/2),(2,lv/1)]),
                ([(_z,t/2)], [(_tr,t/2),(2,t/2)]),
                ([(_z,lv/1)], [(_tr,t/2),(2,lv/1)]),
                ([(_x,t/2)], [(_t1,t/2),(1,t/2)]),
                ([(_x,lv/1)], [(_t1,t/2),(1,lv/1)]),
                ([(_y,t/2)], [(_t1,t/2),(2,t/2)]),
                ([(_y,lv/1)], [(_t1,t/2),(2,lv/1)]) }
       ALive_T { [(_pr,V)] }
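
The tree rearrangement performed by the third clause of profile/2 is easier to follow in a compact rendering. The sketch below is illustrative only and is not the normal-form code analysed above; it assumes trees built from lv/1 leaves and t/2 inner nodes, as in the listing.

```prolog
% profile(Tree, Leaves): Leaves is the list of leaf values of Tree,
% read from left to right.
profile(lv(U), [U]).
profile(t(lv(U), Y), [U|PrTail]) :-
    profile(Y, PrTail).
% Rotation: t(t(X,Y),Z) is rewritten to t(X,t(Y,Z)). Two t/2 cells are
% taken apart and two are built, which is the in-place candidate.
profile(t(t(X, Y), Z), Pr) :-
    profile(t(X, t(Y, Z)), Pr).

% sameleaves(X, Y): the trees X and Y have the same sequence of leaves.
sameleaves(X, Y) :-
    profile(X, W),
    profile(Y, W).
```

For example, ?- sameleaves(t(t(lv(1),lv(2)),lv(3)), t(lv(1),t(lv(2),lv(3)))). succeeds, since both trees have the leaf profile [1,2,3]. In the first call to profile/2 the list W is constructed; in the second call it is only tested.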

A.9 sift/2 and remove/3


The sift/2 program can be used to sift out the prime numbers according to
Eratosthenes' sieve algorithm. In the tables below, we only show the abstract
substitutions of interest. Garbage cells are detected in the program point β'4
for the sift/2 program, and in the program points β'4 and β'9 for the remove/3
program. We analyze the remove/3 predicate for an initial query corresponding
to the restriction of the abstract substitution β6 in the second clause of the
sift/2 program. Note that there is no local opportunity to reuse the garbage
cell in the third clause of the remove/3 predicate.

Prolog code:

(1) sift( _X, _Y ) :-
      β1 _X = [], β2 _Y = []. β3
(2) sift( _X, _Y ) :-
      β4 _X = [_I|_Is], β'4
      β5 _Y = [_I|_Ps],
      β6 remove( _I, _Is, _New ),
      β7 sift( _New, _Ps ). β8

Call: β_in sift( _A, _B )

       ( _A, _B )
β_in   T ( List, V )
       ALive_T { [(_B,V)] }
β_out  T ( List, List )
       ALive_T { [(_B,./2)] }
1)     ( _X, _Y )
β1     T ( List, V )
       ALive_T { [(_Y,V)] }
β3     T ( nil, nil )
       ALive_T { [(_Y,nil)] }
2)     ( _I, _Is, _New, _Ps, _X, _Y )
β4     T ( V, V, V, V, List, V )
       ALive_T { [(_Y,V)] }
β'4    T ( Int, List, _, _, ListOne, _ )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_X,./2),(2,./2)], [(_Is,./2)], [(_I,Int)] }
β5     T ( Int, List, V, V, ListOne, V )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Y,V)] }
β6     T ( Int, List, V, V, ListOne, .(Int,V) )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]),
                ([(_Y,./2),(2,V)], [(_Ps,V)]) }
       ALive_T { [(_Y,./2)] }

Prolog code:

(1) remove( _P, _X, _Y ) :-
      β1 _X = [], β2 _Y = []. β3
(2) remove( _P, _X, _Y ) :-
      β4 _X = [_I|_Is], β'4
      β5 _Y = [_I|_Nis],
      β6 not( 0 is _I mod _P ),
      β7 remove( _P, _Is, _Nis ). β8
(3) remove( _P, _X, _Nis ) :-
      β9 _X = [_I|_Is], β'9
      β10 0 is _I mod _P,
      β11 remove( _P, _Is, _Nis ). β12

Call: β_in remove( _I, _Is, _New )

       ( _I, _Is, _New )
β_in   T ( Int, List, V )
       ALive_T { [(_New,V)] }
β_out  T ( Int, List, List )
       ALive_T { [(_New,./2)] }
1)     ( _P, _X, _Y )
β1     T ( Int, List, V )
       ALive_T { [(_Y,V)] }
β3     T ( Int, nil, nil )
       ALive_T { [(_Y,nil)] }
2)     ( _I, _Is, _Nis, _P, _X, _Y )
β4     T ( V, V, V, Int, List, V )
       ALive_T { [(_Y,V)] }
β'4    T ( Int, List, _, _, ListOne, _ )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_X,./2),(2,./2)], [(_Is,./2)], [(_I,Int)] }
β5     T ( Int, List, V, Int, ListOne, V )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Y,V)] }
β7     T ( Int, List, V, Int, ListOne, .(Int,V) )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]),
                ([(_Y,./2),(2,V)], [(_Nis,V)]) }
       ALive_T { [(_Y,./2)] }
3)     ( _I, _Is, _Nis, _P, _X )
β9     T ( V, V, V, Int, List )
       ALive_T { [(_Nis,V)] }
β'9    T ( Int, List, _, _, ListOne )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_X,./2),(2,./2)], [(_Is,./2)], [(_I,Int)] }
β11    T ( Int, List, V, Int, ListOne )
       AShr_T { ([(_Is,./2)], [(_X,./2),(2,./2)]) }
       ALive_T { [(_Nis,V)] }
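
A compact rendering of the sieve makes it visible why the third clause of remove/3 offers no local reuse: it takes a cell apart without building one. The sketch below is illustrative only and is not the normal-form code analysed above; the helper predicates ints/3 and primes/2 are our own additions for driving the example, and \+ is used in place of not/1.

```prolog
% sift(Xs, Ps): Ps is Xs with every multiple of an earlier element removed.
sift([], []).
sift([I|Is], [I|Ps]) :-        % one cell selected, one cell constructed
    remove(I, Is, New),
    sift(New, Ps).

% remove(P, Xs, Ys): Ys is Xs without the multiples of P.
remove(_, [], []).
remove(P, [I|Is], [I|Nis]) :-  % I is kept: the selected cell can be reused
    \+ (0 is I mod P),
    remove(P, Is, Nis).
remove(P, [I|_Is0], Nis) :-    % I is dropped: the cell becomes garbage
    0 is I mod P,              % and nothing is constructed in this clause
    _Is0 = Is,
    remove(P, Is, Nis).

% ints(Lo, Hi, L): L is the list of integers from Lo to Hi (hypothetical helper).
ints(Lo, Hi, []) :- Lo > Hi.
ints(Lo, Hi, [Lo|T]) :- Lo =< Hi, Lo1 is Lo + 1, ints(Lo1, Hi, T).

% primes(N, Ps): Ps is the list of primes up to N (hypothetical helper).
primes(N, Ps) :- ints(2, N, Is), sift(Is, Ps).
```

For example, ?- primes(20, Ps). binds Ps to [2,3,5,7,11,13,17,19].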
Bibliography
[1] S. Abramsky and C. Hankin, editors. Abstract Interpretation of Declarative
Languages. Ellis Horwood Series in Computers and their Applications. Ellis
Horwood, Chichester, 1987.

[2] K. Appleby, M. Carlsson, S. Haridi, and D. Sahlin. Garbage collection for
Prolog based on WAM. Commun. ACM, 31(6):719-741, 1988.

[3] K. R. Apt. Logic programming. In J. Van Leeuwen, editor, Handbook of
Theoretical Computer Science, Volume B: Formal Models and Semantics,
pages 493-574. Elsevier, 1990.

[4] B.I.M., B-3078, Everberg, Belgium. ProLog by BIM -3.0- Reference Man-
ual, Nov. 1990.

[5] G. Birkhoff. Lattice Theory, volume 25 of American Mathematical Society
Colloquium Publications. American Mathematical Society, Providence, 1979.

[6] P. Boizumault. PROLOG l'implantation. Masson, Paris, 1988.

[7] M. Bruynooghe. The memory management of Prolog implementations. In
Clark and Tärnlund [16], pages 83-98.

[8] M. Bruynooghe. Garbage collection in Prolog interpreters. In J. A. Camp-
bell, editor, Implementations of Prolog, Ellis Horwood Series in Artificial
Intelligence, pages 259-267. Ellis Horwood Limited, 1984.

[9] M. Bruynooghe. Compile-time garbage collection. Report CW43, Depart-
ment of Computer Science, Katholieke Universiteit Leuven, Apr. 1986.

[10] M. Bruynooghe. Compile-time garbage collection or how to transform pro-
grams in an assignment-free language into code with assignments. In L. G.
L. T. Meertens, editor, Program Specification and Transformation, pages
113-129. North-Holland, 1987.

[11] M. Bruynooghe. A practical framework for the abstract interpretation of
logic programs. Journal of Logic Programming, 10(2):91-124, Feb. 1991.

[12] M. Bruynooghe and G. Janssens. An instance of abstract interpretation in-
tegrating type and mode inferencing. In R. A. Kowalski and K. A. Bowen,
editors, Proceedings of the Fifth International Conference and Symposium
on Logic Programming, pages 669-683, Seattle, 1988. MIT Press, Cam-
bridge.

[13] M. Bruynooghe, G. Janssens, A. Callebaut, and B. Demoen. Abstract
interpretation: Towards the global optimization of Prolog programs. In
Proceedings of the Fourth Symposium on Logic Programming, pages 192-
204, San Francisco, 1987. IEEE Computer Society Press.

[14] J. A. Campbell, editor. Implementations of Prolog. Ellis Horwood Series in
Artificial Intelligence. Ellis Horwood Limited, 1984.

[15] D. R. Chase, M. Wegman, and F. K. Zadeck. Analysis of pointers and struc-
tures. Proceedings of the ACM SIGPLAN'90 Conference on Programming
Language Design and Implementation, SIGPLAN Notices, 25(6):296-310,
1990.

[16] K. L. Clark and S.-Å. Tärnlund, editors. Logic Programming. Academic
Press, N.Y., 1982.

[17] M. Codish, D. Dams, and E. Yardeni. Abstract unification and a bottom-
up analysis to detect aliasing in logic programs. Technical Report CS90-10,
Department of Computer Science, Weizmann Institute of Science, Israel,
May 1990.

[18] M. Codish, D. Dams, and E. Yardeni. Bottom-up abstract interpretation
of logic programs. Technical Report CS90-24, Department of Computer
Science, Weizmann Institute of Science, Israel, Oct. 1990.

[19] M. Codish, D. Dams, and E. Yardeni. Derivation and safety of an abstract
unification algorithm for groundness and aliasing analysis. In K. Furukawa,
editor, Proceedings of the Eighth International Conference on Logic Pro-
gramming, pages 79-93, Paris, 1991. MIT Press, Cambridge.

[20] A. Colmerauer. Prolog and infinite trees. In Clark and Tärnlund [16], pages
231-251.

[21] A. Cortesi and G. Filé. Abstract interpretation of Prolog: The treatment of
the built-ins. Rapporto Interno 11, Department of Mathematics, University
of Padova, 1991.

[22] A. Cortesi, G. Filé, and W. Winsborough. Prop revisited: Propositional
formula as abstract domain for groundness analysis. In Proceedings of the
Sixth Annual IEEE Symposium on Logic in Computer Science, pages 322-
327. IEEE Computer Society Press, 1991.

[23] P. Cousot. Semantic foundations of program analysis. In S. S. Muchnick
and N. D. Jones, editors, Program Flow Analysis: Theory and Applications,
pages 303-342. Prentice-Hall, 1981.
[24] P. Cousot and R. Cousot. Abstract interpretation: A unified lattice model
for static analysis of programs by construction or approximation of fix-
points. In Proceedings of the Fourth ACM Symposium on Principles of
Programming Languages, pages 238-252, Los Angeles, 1977.
[25] S. K. Debray. Efficient dataflow analysis of logic programs. In Proceedings
of the Fifteenth ACM Symposium on Principles of Programming Languages,
pages 260-273, San Diego, California, 1988.
[26] S. K. Debray. A simple code improvement scheme for Prolog. In G. Levi
and M. Martelli, editors, Proceedings of the Sixth International Conference
on Logic Programming, pages 17-32, Lisbon, 1989. MIT Press, Cambridge.
Also in Journal of Logic Programming, 13(1):57-88, 1992.
[27] S. K. Debray. Static inference of modes and data dependencies in logic
programs. ACM Trans. Prog. Lang. Syst., 11(3):418-450, 1989.
[28] S. K. Debray and D. S. Warren. Automatic mode inference for Prolog
programs. In Proceedings of the Third Symposium on Logic Programming,
pages 78-88, Salt Lake City, Utah, 1986. IEEE Computer Society Press.
[29] I. Foster. Copy avoidance through local reuse. Technical Report Preprint
MCS-P99-0989, Mathematics and Computer Science Division, Argonne
National Laboratory, Sept. 1989.
[30] I. Foster and W. Winsborough. Copy avoidance through compile-time anal-
ysis and local reuse. In Proceedings of the International Logic Programming
Symposium, Cambridge, 1991. MIT Press.
[31] J. Gallagher and M. Bruynooghe. The derivation of an algorithm for pro-
gram specialisation. In D. H. D. Warren and P. Szeredi, editors, Proceedings
of the Seventh International Conference on Logic Programming, pages 732-
746, Jerusalem, 1990. MIT Press, Cambridge. Also in New Generation
Computing, Vol. 9, Nos. 3,4, 1991.
[32] S. Horwitz, P. Pfeiffer, and T. Reps. Dependence analysis for pointer
variables. Proceedings of the ACM SIGPLAN'89 Conference on Program-
ming Language Design and Implementation, SIGPLAN Notices, 24(7):28-
40, 1989.
[33] P. Hudak. A semantic model for reference counting and its abstraction. In
Abramsky and Hankin [1], pages 45-62.
[34] G. Huet. Confluent reductions: Abstract properties and applications to
term rewriting systems. Journal of the Association for Computing Machin-
ery, 27(4):797-821, 1980.
[35] K. Inoue, H. Seki, and H. Yagi. Analysis of functional programs to detect
run-time garbage cells. ACM Transactions on Programming Languages and
Systems, 10(4):555-578, 1988.
[36] K. Inoue and K. Torii. Implementation and analysis of compile-time garbage
collection. New Generation Computing, 10(1):101-119, 1991.
[37] D. Jacobs and A. Langen. Accurate and efficient approximation of variable
aliasing in logic programs. In E. Lusk and R. Overbeek, editors, Proceedings
of the North American Conference on Logic Programming, pages 154-165,
Cambridge, 1989. MIT Press.
[38] G. Janssens. Deriving Run Time Properties of Logic Programs by Means of
Abstract Interpretation. Ph.D. thesis, Department of Computer Science,
Katholieke Universiteit Leuven, Mar. 1990.
[39] G. Janssens and M. Bruynooghe. Deriving descriptions of possible values
of program variables by means of abstract interpretation: Definitions and
proofs. Report CW108, Department of Computer Science, Katholieke Uni-
versiteit Leuven, Mar. 1990.
[40] G. Janssens and M. Bruynooghe. Deriving descriptions of possible values
of program variables by means of abstract interpretation. Journal of Logic
Programming, 13(2&3):205-258, July 1992.
[41] G. Janssens, B. Demoen, and Y. Willems. Execution mechanism for Prolog.
Report CW53, Department of Computer Science, Katholieke Universiteit
Leuven, 1987.
[42] T. P. Jensen and T. Æ. Mogensen. A backwards analysis for compile-
time garbage collection. In N. Jones, editor, ESOP'90 Proceedings Third
European Symposium on Programming, Lecture Notes in Computer Science
432, pages 227-239. Springer-Verlag, N.Y., 1990.
[43] N. D. Jones and H. Søndergaard. A semantic-based framework for the
abstract interpretation of Prolog. In Abramsky and Hankin [1], pages 123-
142.
[44] F. Kluźniak. Type synthesis for ground Prolog. In J.-L. Lassez, editor,
Proceedings of the Fourth International Conference on Logic Programming,
pages 788-816, Melbourne, 1987. MIT Press.
[45] F. Kluźniak. Compile-time garbage collection for ground Prolog. In R. A.
Kowalski and K. A. Bowen, editors, Proceedings of the Fifth International
Conference and Symposium on Logic Programming, pages 1490-1505, Seat-
tle, 1988. MIT Press, Cambridge.
[46] J. R. Larus and P. N. Hilfinger. Detecting conflicts between structure
accesses. Proceedings of the ACM SIGPLAN'88 Conference on Program-
ming Language Design and Implementation, SIGPLAN Notices, 23(7):21-
34, 1988.

[47] J.-L. Lassez, M. J. Maher, and K. Marriott. Unification revisited. In


J. Minker, editor, Foundations of Deductive Databases and Logic Program-
ming, pages 587-625. Morgan Kaufmann Publishers Inc., Los Altos, 1988.

[48] B. Le Charlier, K. Musumbu, and P. Van Hentenryck. A generic abstract in-


terpretation algorithm and its complexity analysis. In K. Furukawa, editor,
Proceedings of the Eighth International Conference on Logic Programming,
pages 64-78, Paris, 1991. MIT Press, Cambridge.

[49] B. Le Charlier and P. Van Hentenryck. Experimental evaluation of a generic


abstract interpretation algorithm for Prolog. Technical Report CS-91-55,
Institute of Computer Science, University of Namur, and Dept. of Computer
Science, Brown University, Aug. 1991.

[50] J. W. Lloyd. Foundations of Logic Programming. Springer Series : Symbolic


Computation - Artificial Intelligence. Springer-Verlag, second, extended edi-
tion, 1987.

[51] O. Mallet. Interprdtation Abstraite Appliqude ~la Compilation et la Par-


alldlisation en Programmation Logique. Ph.D. thesis, L'l~cole Polytechnique
de Paris, June 1992.

[52] A. Mari~n, G. Janssens, A. Mulkers, and M. Bruynooghe. The impact of


abstract interpretation: An experiment in code generation. In G. Levi and
M. Martelli, editors, Proceedings of the Sizth International Conference on
Logic Programming, pages 33-47, Lisbon, 1989. MIT Press, Cambridge.

[53] K. Marriott and H. SOndergaard. Bottom-up abstract interpretation of


logic programs. In R. A. Kowalski and K. A. Bowen, editors, Proceedings of
the Fifth International Conference and Symposium on Logic Programming,
pages 733-748, Seattle, 1988. MIT Press, Cambridge.

[54] K. Marriott and H. Sendergaard. On Prolog and the occur-check problem.


SIGPLAN Notices, 24(5):76-82, 1989.

[55] K. Marriott and H. Søndergaard. Semantics-based dataflow analysis of logic
programs. In G. Ritter, editor, Information Processing 89, pages 601-606.
North-Holland, N.Y., 1989.

[56] C. S. Mellish. Some global optimizations for a Prolog compiler. Journal of
Logic Programming, 2:43-66, 1985.

[57] C. S. Mellish. Abstract interpretation of Prolog programs. In Abramsky
and Hankin [1], pages 181-198.

[58] P. Mishra. Towards a theory of types in Prolog. In Proceedings of the 1984
International Symposium on Logic Programming, pages 289-298, Atlantic
City, 1984. IEEE Computer Society Press.

[59] A. Mulkers. Deriving Live Data Structures in Logic Programs by Means
of Abstract Interpretation. Ph.D. thesis, Department of Computer Science,
Katholieke Universiteit Leuven, Dec. 1991.
[60] A. Mulkers, W. Winsborough, and M. Bruynooghe. Analysis of shared data
structures for compile-time garbage collection in logic programs. In D. H. D.
Warren and P. Szeredi, editors, Proceedings of the Seventh International
Conference on Logic Programming, pages 747-762, Jerusalem, 1990. MIT
Press, Cambridge.
[61] A. Mulkers, W. Winsborough, and M. Bruynooghe. Analysis of shared data
structures for compile-time garbage collection in logic programs (extended
version). Report CW117, Department of Computer Science, Katholieke
Universiteit Leuven, Oct. 1990.
[62] A. Mulkers, W. Winsborough, and M. Bruynooghe. Static analysis of logic
programs to detect run-time garbage cells. In P. Dewilde and J. Vandewalle,
editors, Proceedings of the International Conference on Computer Systems
and Software Engineering, pages 526-531. IEEE Computer Society Press,
May 1992.
[63] K. Muthukumar and M. Hermenegildo. Determination of variable dependence
information through abstract interpretation. In E. Lusk and R. Overbeek,
editors, Proceedings of the North American Conference on Logic Programming,
pages 166-185, Cambridge, 1989. MIT Press.
[64] K. Muthukumar and M. Hermenegildo. Combined determination of shar-
ing and freeness of program variables through abstract interpretation. In
K. Furukawa, editor, Proceedings of the Eighth International Conference on
Logic Programming, pages 49-63, Paris, 1991. MIT Press, Cambridge.
[65] U. Nilsson. Systematic semantic approximations of logic programs. In
P. Deransart and J. Małuszyński, editors, Proceedings of the International
Workshop on Programming Language Implementation and Logic Programming,
Lecture Notes in Computer Science 456, pages 293-306. Springer-Verlag,
1990.
[66] U. Nilsson. Abstract interpretation: A kind of magic. In P. Deransart
and J. Małuszyński, editors, Proceedings of the International Workshop on
Programming Language Implementation and Logic Programming, Lecture
Notes in Computer Science. Springer-Verlag, 1991.
[67] E. Pittomvils, M. Bruynooghe, and Y. Willems. Towards a real time garbage
collector for Prolog. In Proceedings of the Symposium on Logic Program-
ming, pages 185-198, Boston, 1985. IEEE Computer Society Press.
[68] D. A. Plaisted. The occur-check problem in Prolog. J. New Generation
Computing, 2(4):309-322, 1984. Also in: Proceedings of the International
Symposium on Logic Programming, pages 272-280, Atlantic City, 1984.
IEEE Computer Society Press.

[69] C. Pyo and U. S. Reddy. Inference of polymorphic types for logic programs.
In E. L. Lusk and R. A. Overbeek, editors, Proceedings of the 1989 North
American Conference on Logic Programming, pages 1115-1132, Cambridge,
1989. MIT Press.

[70] H. Søndergaard. An application of abstract interpretation of logic programs:
Occur check reduction. In B. Robinet and R. Wilhelm, editors, ESOP'86
Proceedings European Symposium on Programming, Lecture Notes in Computer
Science 213, pages 327-338. Springer-Verlag, N.Y., 1986.

[71] A. Tarski. A lattice-theoretical fixpoint theorem and its applications. Pacific
J. Math., 5:285-309, 1955.

[72] A. Taylor. Removal of dereferencing and trailing in Prolog compilation.
In G. Levi and M. Martelli, editors, Proceedings of the Sixth International
Conference on Logic Programming, pages 48-60, Lisbon, 1989. MIT Press,
Cambridge.

[73] A. Taylor. LIPS on a MIPS, results from a Prolog compiler for a RISC.
In D. H. D. Warren and P. Szeredi, editors, Proceedings of the Seventh
International Conference on Logic Programming, pages 174-185, Jerusalem,
1990. MIT Press, Cambridge.

[74] A. Taylor. High Performance Prolog Implementation. Ph.D. thesis,
University of Sydney, June 1991.

[75] W. Thomas. Automata on infinite objects. In J. Van Leeuwen, editor,
Handbook of Theoretical Computer Science, Volume B: Formal Models and
Semantics, pages 133-191. Elsevier, 1990.

[76] M. Van Caneghem. L'anatomie de Prolog. InterEditions, 1986.

[77] P. Van Roy and A. M. Despain. The benefits of global dataflow analysis for
an optimizing Prolog compiler. In S. Debray and M. Hermenegildo, editors,
Proceedings of the 1990 North American Conference on Logic Programming,
pages 501-515, Austin, 1990. MIT Press, Cambridge.

[78] P. L. Van Roy. Can Logic Programming Execute as Fast as Imperative
Programming? Ph.D. thesis, University of California, Berkeley, Dec. 1990.

[79] P. Vataja and E. Ukkonen. Finding temporary terms in Prolog programs. In
ICOT, editor, Proceedings of the International Conference on Fifth Generation
Computer Systems, pages 275-282, Tokyo, 1984. Ohmsha, LTD. and
North-Holland.

[80] D. H. D. Warren. An abstract Prolog instruction set. Technical report, SRI
International, Artificial Intelligence Center, 1983.

[81] W. Winsborough. Path-dependent reachability analysis for multiple
specialization. In E. Lusk and R. Overbeek, editors, Proceedings of the North
American Conference on Logic Programming, pages 133-153, Cambridge,
1989. MIT Press.
[82] W. Winsborough. Multiple specialization using minimal-function graph
semantics. Journal of Logic Programming, 13(2&3):259-290, July 1992.
[83] W. Winsborough and M. Bruynooghe. Approximating unification over rep-
resentations of sets of nonground terms. Draft, 1990.
[84] W. Winsborough and A. Wærn. Transparent and-parallelism in the presence
of shared free variables. In R. A. Kowalski and K. A. Bowen, editors,
Proceedings of the Fifth International Conference and Symposium on Logic
Programming, pages 749-764, Seattle, 1988. MIT Press, Cambridge.
[85] E. Yardeni and E. Shapiro. A type system for logic programs. In E. Shapiro,
editor, Concurrent Prolog: Collected Papers (Volume 2), chapter 28, pages
211-244. MIT Press, Cambridge, 1987. Also in Journal of Logic Programming,
Vol. 10(2):125-154 (1991).
[86] J. Zobel. Derivation of polymorphic types for Prolog programs. In J.-L.
Lassez, editor, Proceedings of the Fourth International Conference on Logic
Programming, pages 817-838, Melbourne, 1987. MIT Press.
Lecture Notes in Computer Science
Vol. 675: A. Mulkers, Live Data Structures in Logic Programs. VIII, 220 pages. 1993.