Efficiently Computing Static Single Assignment Form and the Control Dependence Graph
RON CYTRON, JEANNE FERRANTE, BARRY K. ROSEN, and MARK N. WEGMAN, IBM Research Division, and F. KENNETH ZADECK, Brown University
In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point that advanced optimization features become undesirable. Recently, static single assignment form and the control dependence graph have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications. We also give analytical and experimental evidence that all of these data structures are usually linear in the size of the original program. This paper thus presents strong evidence that these structures can be of practical use in optimization.

Categories and Subject Descriptors: D.3.3 [Programming Languages]: Language Constructs - control structures; data types and structures; procedures, functions and subroutines; D.3.4 [Programming Languages]: Processors - compilers; optimization; I.1.2 [Algebraic Manipulation]: Algorithms - analysis of algorithms; I.2.2 [Artificial Intelligence]: Automatic Programming - program transformation
A preliminary version of this paper, "An Efficient Method of Computing Static Single Assignment Form," appeared in the Conference Record of the 16th ACM Symposium on Principles of Programming Languages (Jan. 1989). F. K. Zadeck's work on this paper was partially supported by the IBM Corporation, the Office of Naval Research, and the Defense Advanced Research Projects Agency under contract N00014-83-K-0146 and ARPA order 6320, Amendment 1. Authors' addresses: R. Cytron, J. Ferrante, and M. N. Wegman, Computer Sciences Department, IBM Research Division, P.O. Box 704, Yorktown Heights, NY 10598; B. K. Rosen, Mathematical Sciences Department, IBM Research Division, P.O. Box 218, Yorktown Heights, NY 10598; F. K. Zadeck, Computer Science Department, P.O. Box 1910, Brown University, Providence, RI 02912. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1991 ACM 0164-0925/91/1000-0451 $01.50
ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991, Pages 451-490.
General Terms: Algorithms, Languages

Additional Key Words and Phrases: Control dependence, def-use chain, dominator, optimizing compilers
1. INTRODUCTION

In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point where advanced optimization features become undesirable. Recently, static single assignment (SSA) form [5, 43] and the control dependence graph [24] have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use [4]. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications. We also give analytical and experimental evidence that the sum of the sizes of all dominance frontiers is usually linear in the size of the original program. This paper thus presents strong evidence that SSA form and the control dependence graph can be of practical use in optimization.

As Figure 1 illustrates, SSA form serves as an intermediate representation:
code is put into SSA form, optimized in various ways, and then translated back out of SSA form. Optimizations that can benefit from using SSA form include code motion [22] and elimination of partial redundancies [43], as well as the constant propagation discussed later in this section. Variants of SSA form have been used for detecting program equivalence [5, 52] and for increasing parallelism in imperative programs [21].

The representation of simple data flow information (def-use chains) may be made more compact through SSA form. If a variable has D definitions and U uses, then there can be D x U def-use chains. When similar information is encoded in SSA form, there can be at most E def-use chains, where E is the number of edges in the control flow graph [40]. Moreover, the def-use information encoded in SSA form can be updated easily when optimizations are applied. This is important for a constant propagation algorithm that deletes branches to code proven at compile time to be unexecutable [50]. Specifically, the def-use information is just a list, for each variable, of the places in the program text that use the variable. Every use of V is indeed a use of the value provided by the (unique) assignment to V.

To see the intuition behind SSA form, it is helpful to begin with straight-line code. Each assignment to a variable is given a unique name (shown as a subscript in Figure 2), and all of the uses reached by that assignment are renamed to match the assignment's new name. Most programs, however, have branch and join nodes. At the join nodes, we add a special form of assignment called a φ-function.
[Figure 1. A program P0 is translated into SSA form, optimized through P1, ..., Pn, and translated back. Vertical arrows represent translation to/from SSA form; horizontal arrows represent optimizations.]
[Figure 2. Straight-line code before and after renaming: each assignment to V becomes an assignment to a new variable Vi.]

[Figure 3. Code with a branch and a join; the join point receives a φ-function V3 ← φ(V1, V2).]
In Figure 3, the operands of the φ-function indicate which assignment to V reaches the join point along each inedge, and the φ-function assigns its result to a new variable V3. The old variable V has thus been replaced by several new variables Vi, one per assignment, and each use of a Vi is reached by exactly one assignment to Vi in the entire program. This simplifies the record keeping for several optimizations. An especially clear example is constant propagation; the next subsection sketches this application.

1.1 Constant Propagation

Figure 4 is a more elaborate version of Figure 3; it includes an assignment of the constant true to Q. A compiler propagating this constant can determine
that the else branch never falls through to the join point. If Q is the constant true, then all uses of V after the join point are really uses of the constant 4. Such possibilities can be taken into account without SSA form, but the processing is either costly [48] or difficult to understand [49]. With SSA form, the algorithm initially assumes that each edge is unexecutable (i.e., never followed at run time) and that each variable is constant with an as-yet unknown value (denoted ⊤). Worklists are initialized appropriately, and the assumptions are corrected until they stabilize. Suppose the algorithm finds that variable P is not constant (denoted ⊥) and, hence, that either branch of
the if statement may be followed.

[Figure 4. A more elaborate version of Figure 3 and its SSA form, used in the constant propagation example.]

Under this assumption, the uses of
V1 are notified that they are uses of 4. (Each use of V1 is indeed a use of this value, thanks to the single assignment property.) In particular, the φ-function combines 4 with V2. The second operand of the φ-function, however, is associated with the inedge of the join point that corresponds to falling through the else branch. So long as this edge is considered unexecutable, the algorithm uses ⊤ for the second operand, no matter what is currently assumed about V2. Combining 4 with ⊤ yields 4, so uses of V3 are still tentatively assumed to be uses of 4. Eventually, the assumption about Q may change from ⊤ to a known constant or to ⊥. A change to either false or ⊥ would lead to the discovery that the second inedge of the join point can be followed, and then the 6 at V2 combines with the 4 at V1 to yield ⊥ at V3. A change to true would have no effect on the assumption at V3. Traditional constant propagation algorithms, on the other hand, would see that assignments of two different constants to V seem to reach all uses after the join point. They would thus decide that V is not constant at these uses.

The algorithm just sketched is linear in the size of the SSA program it sees [50], but this size might be nonlinear in the size of the original program. In particular, it would be safe but inefficient to place a φ-function for every variable at every join point. This paper shows how to obtain SSA form efficiently. When φ-functions are placed carefully, nonlinear behavior is still possible but is unlikely in practice. The size of the SSA program is typically linear in the size of the original program; the time to do the work is essentially linear in the SSA size.
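The combining rule just sketched (4 with ⊤ yields 4; 4 with 6 yields ⊥) is the meet of a three-level lattice. A minimal illustration in Python (the names TOP, BOT, and meet are ours, not the paper's):

```python
# A sketch (ours, not the paper's) of the three-level constant-propagation
# lattice: TOP = "constant with as-yet unknown value", BOT = "not constant",
# anything else = a known constant.
TOP, BOT = "TOP", "BOT"

def meet(a, b):
    """Combine the values flowing into a pair of phi-function operands."""
    if a == TOP:
        return b
    if b == TOP:
        return a
    if a == BOT or b == BOT:
        return BOT
    return a if a == b else BOT  # unequal constants fall to BOT
```

With this rule, meet(4, TOP) stays 4 while the edge is assumed unexecutable, and meet(4, 6) falls to BOT once both inedges are discovered to be executable.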
1.2 Where to Place φ-Functions

At first glance, careful placement of φ-functions might seem to require examining many pairs of assignment statements for each variable. Checking whether there are two assignments to V that reach a common point might seem to be intrinsically nonlinear. In fact, however, it is enough to look at the dominance frontier of each node in the control flow graph. Leaving the technicalities to later sections, we sketch the method here.
Suppose that a variable V has just one assignment in the original program, so that any use of V will be either a use of the value V0 at entry to the program or a use of the value V1 from the most recent execution of the assignment to V. Let X be the basic block of code that assigns to V, so X will determine the value of V when control flows along any edge X → Y to a basic block Y. When entered along this edge, the code in Y will see V1 and be unaffected by V0. If Y ≠ X, but all paths to Y must still go through X (in which case X is said to strictly dominate Y), then the code in Y will always see V1. Indeed, any node strictly dominated by X will always see V1, no matter how far from X it may be. Eventually, however, control may be able to reach a node Z not strictly dominated by X. Suppose Z is the first such node on a path, so that Z sees V1 along one inedge but may see V0 along another inedge. Then Z is said to be in the dominance frontier of X and is clearly in need of a φ-function for V. In general, no matter how many assignments to V may appear in the original program and no matter how complex the control flow may be, we can place φ-functions for V by finding the dominance frontier of every node that assigns to V, then the dominance frontier of every node where a φ-function has already been placed, and so on.

The same concept of dominance frontiers used for computing SSA form can also be used to compute control dependences [20, 24], which informally identify those branch statements for which one branch edge definitely causes a given statement to execute while another edge can cause the statement to be skipped. Such information is vital for parallelism detection and program analysis [28].
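The placement rule just described, taking dominance frontiers of assignment nodes and then of newly placed φ-functions "and so on," is a worklist iteration. The following Python sketch assumes a precomputed mapping DF from each node to its dominance frontier; the function and variable names are ours, not the paper's:

```python
# A sketch of the placement rule described above: put phi-functions for a
# variable V at the iterated dominance frontier of the nodes assigning to V.
# DF maps each node to its dominance frontier (assumed precomputed).
def place_phis(assign_nodes, DF):
    """Return the set of nodes that need a phi-function for one variable."""
    has_phi = set()
    worklist = list(assign_nodes)
    while worklist:
        x = worklist.pop()
        for y in DF.get(x, ()):
            if y not in has_phi:
                has_phi.add(y)
                # the phi at y is itself an assignment, so process y too
                worklist.append(y)
    return has_phi
```

Because each node enters the worklist at most once per variable, the iteration stays proportional to the total size of the dominance frontiers visited.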
1.3 Overview of the Rest of the Paper

Section 2 reviews the representation of control flow by a directed graph. Section 3 explains SSA form and sketches how it is computed; it also considers variants of SSA form and how the concepts defined here can be adjusted to deal with these variants. Section 4 reviews the dominator tree and formalizes the concept of dominance frontiers. Then we show how to compute SSA form (Section 5) and the control dependence graph (Section 6) efficiently. Section 7 shows how to translate out of SSA form. Section 8 shows that our algorithms are linear in the size of programs restricted to certain control flow structures. We also give evidence of general linear behavior by reporting on experiments with FORTRAN programs. Section 9 summarizes the algorithms and time bounds, compares our technique with other techniques, and presents some conclusions.

2. CONTROL FLOW GRAPHS

The statements of a program are organized into (not necessarily maximal) basic blocks, where program flow enters a basic block at its first statement and leaves the basic block at its last statement [1, 36]. Basic blocks are indicated by the column of numbers in parentheses in Figure 5. A control flow graph is a directed graph whose nodes are the basic blocks of a program and two additional nodes, Entry and Exit. There is an edge from Entry to
[Figure 5. A simple program and its control flow graph.]
any basic block at which the program can be entered, and there is an edge to Exit from any basic block that can exit the program. The other edges of the graph represent transfers of control (jumps) between the basic blocks. For technical reasons explained in Section 6, there is also an edge from Entry to Exit. Succ(X) is the set of all successors of X; similarly for predecessors. A node with more than one successor is a branch node; a node with more than one predecessor is a join node. Finally, each variable is considered to have an assignment in Entry to represent whatever value the variable may have when the program is entered. This assignment is treated just like the ones that appear explicitly in the code. Throughout this paper, CFG denotes the control flow graph of the program under discussion.

For any nonnegative integer J, a path of length J in CFG consists of a sequence of J + 1 nodes (denoted X0, ..., XJ) and a sequence of J edges (denoted e1, ..., eJ) such that ej runs from Xj-1 to Xj for all j with 1 ≤ j ≤ J. (We write ej: Xj-1 → Xj.) As is usual with sequences, one item (node or edge) may occur several times. The null path, with J = 0, is
allowed.

Nonnull paths are sometimes important. We write p: X0 →* XJ for an unrestricted path p, but p: X0 →+ XJ if p is known to be nonnull. Two nonnull paths p: X0 →+ XJ and q: Y0 →+ YK are said to converge at a node Z if

    X0 ≠ Y0;                                  (1)
    XJ = Z = YK;                              (2)
    (Xj = Yk) implies (j = J or k = K).       (3)

Intuitively, the paths p and q start at different nodes and converge at the end. The paths are almost node-disjoint, but we still allow paths such as Z →+ Z, in which the convergence node itself is revisited.
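The conventions of this section (Entry and Exit pseudo-nodes, the extra Entry → Exit edge, Succ and Pred sets, branch and join nodes) can be captured by a small data structure. The following Python sketch uses our own names; it is an illustration of the definitions, not code from the paper:

```python
from collections import defaultdict

# A sketch of the CFG conventions above: Entry and Exit pseudo-nodes,
# successor/predecessor sets, and branch/join tests.
class CFG:
    def __init__(self):
        self.succ = defaultdict(set)
        self.pred = defaultdict(set)

    def add_edge(self, x, y):
        self.succ[x].add(y)
        self.pred[y].add(x)

    def is_branch(self, x):  # more than one successor
        return len(self.succ[x]) > 1

    def is_join(self, x):    # more than one predecessor
        return len(self.pred[x]) > 1

g = CFG()
g.add_edge("Entry", 1)
g.add_edge("Entry", "Exit")  # the technical Entry -> Exit edge
g.add_edge(1, 2); g.add_edge(1, 3); g.add_edge(2, 4); g.add_edge(3, 4)
```

In this example node 1 is a branch node and node 4 is a join node, matching the definitions above.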
3. STATIC SINGLE ASSIGNMENT FORM

This section initially assumes that programs have been put into an intermediate form in which each statement evaluates some expressions and uses the results, either to determine a branch or to assign to some variables. (Other program constructs are considered in Section 3.1.) An assignment statement A has the form LHS(A) ← RHS(A), where the left-hand side LHS(A) is a tuple of distinct target variables (U, V, ...) and the right-hand side RHS(A) is a tuple of expressions with the same length as the LHS tuple. Each target variable receives the value of the corresponding expression from RHS(A). When the tuples are 1-tuples, there is no need to distinguish between a tuple and its single component. Section 3.1 sketches how longer tuples can represent other statements (such as procedure calls) so as to make explicit the variables used and/or changed by the statements in the source program. Such explicitness is a prerequisite for most optimization anyway. The only novelty in our use of tuples is the fact that they provide a simple and uniform way to fold the results of analysis into the intermediate text. Practical concerns about the size of the intermediate text are addressed in Section 3.1.

Translating a program into SSA form is a two-step process. In the first step, some trivial φ-functions V ← φ(V, ..., V) are inserted at some of the join nodes in the control flow graph. In the second step, new variables Vi (for i = 0, 1, 2, ...) are generated. Each mention of a variable V in the program is replaced by a mention of one of the new variables Vi, where a mention may be in a branch expression or on either side of an assignment statement. Throughout this paper, an assignment may be either an ordinary assignment statement or a φ-function. Figure 6 displays the result of translating a simple program to SSA form.

A φ-function at entrance to a node X has the form V ← φ(R, S, ...), where V, R, S, ... are variables. The number of operands R, S, ... is the number of control flow predecessors of X. The predecessors of X are listed in some arbitrary fixed order, and the jth operand of φ is associated with the jth
[Figure 6. A simple program and its SSA form.]
predecessor. If control reaches node X along its jth inedge, then the run-time support1 remembers j, and the value of φ(R, S, ...) is the value of the jth operand. Each execution of a φ-function uses only one of the operands, but which one depends on the flow of control just before entering X. Any φ-functions in X are executed before the ordinary statements in X. Some variants of φ-functions as defined here are useful for special purposes. For example, each φ-function can be tagged
1 When SSA form is used only as an intermediate form (Figure 1), there is no need actually to provide this support. For us, the semantics of φ are only important when assessing the correctness of intermediate steps in a sequence of program transformations beginning and ending with code that has no φ-functions. Others [6, 11, 52] have found it useful to give φ another parameter that incidentally encodes j, under various restrictions on the control flow.
with the node X where it appears [5]. When the control flow of a language is suitably restricted, each φ-function can be tagged with information about conditionals or loops [5, 6, 11, 52]. The algorithms in this paper apply to such variants of SSA form as well.

SSA form may be considered as a property of a single program or as a relation between two programs. A single program is defined to be in SSA form if each variable is a target of exactly one assignment statement in the program text. Translation to SSA form replaces the original program by a new program with the same control flow graph. For every original variable V, the following conditions are required:
(1) If two nonnull paths X →+ Z and Y →+ Z converge at a node Z, and nodes X and Y contain assignments to V (in the original program), then a trivial φ-function V ← φ(V, ..., V) has been inserted at Z (in the new program).

(2) Each mention of V in the original program or in an inserted φ-function has been replaced by a mention of a new variable Vi, leaving the new program in SSA form.

(3) Along any control flow path, consider any use of a variable V (in the original program) and the corresponding use of Vi (in the new program). Then V and Vi have the same value.
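Condition (3) relies on the operand-to-predecessor correspondence of φ-functions defined earlier: the jth operand belongs to the jth predecessor, and an execution of the φ uses only the operand for the edge actually taken. A small Python sketch of that semantics (eval_phi and all the names below are ours, for illustration only):

```python
# A sketch of phi-function evaluation: operands[j] corresponds to preds[j],
# and the value is the operand for the predecessor actually taken.
def eval_phi(operands, preds, came_from, env):
    """came_from is the predecessor along which control entered the node;
    env maps variable names to their current values."""
    j = preds.index(came_from)
    return env[operands[j]]

env = {"V1": 4, "V2": 6}
preds = ["then_block", "else_block"]  # arbitrary fixed order of predecessors
# control fell through the else branch, so V3 <- phi(V1, V2) yields V2's value:
result = eval_phi(["V1", "V2"], preds, "else_block", env)
```

This mirrors the constant propagation discussion of Section 1.1: while an inedge is assumed unexecutable, its operand simply never gets selected.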
Translation to SSA form is minimal with the proviso that the number of φ-functions inserted is as small as possible, subject to Condition (1) above. The optimizations that depend on SSA form are still valid if there are some extraneous φ-functions beyond those that appear in minimal SSA form, but extraneous φ-functions may cause information to be lost, and they add unnecessary overhead to the optimization process. A variant of translation would sometimes forego placing a φ-function for V at a convergence point Z where there are no more uses for V in or after Z. The φ-function could then be omitted without any risk of losing Condition (3). This variant has been called pruned SSA form [16], and it is sometimes preferable to placing φ-functions at all our convergence points. However, minimal SSA form is sometimes
preferable to pruned form. When desired, pruned SSA form can be obtained by a simple adjustment of our algorithm [16, Section 5.1].

For any variable V, the nodes at which we should insert φ-functions in the original program can be defined recursively by Condition (1) in the definition of SSA form. A node Z needs a φ-function for V if Z is a convergence point for two paths originating at two different nodes, both containing assignments to V or needing φ-functions for V. Nonrecursively, we may observe that a node Z needs a φ-function for V because Z is a convergence point for two nonnull paths X →+ Z and Y →+ Z originating at nodes X and Y already containing assignments to V. If Z did not already contain an assignment to V, then the φ-function inserted at Z adds Z to the set of nodes that contain assignments to V. With more nodes to consider as origins of paths, we may observe more nodes appearing as convergence points of paths originating at nodes with assignments to V. The same set of nodes could thus be found by iterating the observation; the algorithm presented here obtains the same end results without the expense of iteration.2
3.1 Other Constructs in the Source Program

If the source program only evaluates expressions over scalar variables to assign values to scalar variables, then it is straightforward to derive an intermediate text that is ready to be put into SSA form (or otherwise analyzed and transformed by an optimizing compiler). However, source programs commonly use other constructs as well: Some variables are not scalars, and some computations do not explicitly indicate which variables they use or change. This section sketches how to map some important constructs to an explicit intermediate text suitable for use in a compiler.
3.1.1 Arrays. It will suffice to consider one-dimensional arrays here. Arrays of higher dimension involve more notation but no more concerns. If A and B are array variables, then array assignment statements like A ← B or A ← 0, if allowed by the source language, can be treated like assignments to scalar variables. Many of the mentions of A in the source program, however, will be mentions of A(i) for an integer variable i, and A(i) may not be taken to be a variable. Treating A(i) like a variable would be awkward, both because an assignment to A(i) may change the value of A(j), and because the value of A(i) could be changed by assigning to i rather than to A(i). An easier approach is illustrated in Figure 7. The entire array is treated like a single scalar variable, which may be one of the operands of Access or Update.3 The expression Access(A, i) evaluates to the ith component of A; the expression Update(A, j, V) evaluates to an array value that is of the same size as A and has the same component values, except for V as the value of the jth component. Assigning a scalar value V to A(j) is equivalent to assigning an array value to the entire array A, where the new array value depends on the old one, as well as on the index j and on the new scalar value V. The translation to SSA form is unconcerned with whether the values of variables are large objects or what the operators mean.

As with scalars, translation of array references to SSA form removes some anti- and output-dependences [32]. In the program in Figure 7a, dependence analysis may prohibit reordering the use of A(i) by the first statement and the definition of A(j) by the second statement. After translation to SSA form, the two references to A have been renamed, and reordering is then possible. For example, the two statements can execute concurrently. Where optimization does not reorder the two references, Section 7.2 describes how transla-
2 In typical cases, paths originating at φ-functions add nothing beyond the contributions from paths originating at ordinary assignments. The straightforward iteration is still too slow, even in typical cases.
3 This notation is similar to Select and Update [24], which, in turn, is similar to notation for referencing aggregate structures in a data flow language [23].
[Figure 7. Source code with array component references (a) is equivalent to code with explicit Access and Update operators that treat the array A just like a scalar (b). Transformation to SSA form proceeds as usual (c).]
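The Access/Update semantics described above is purely functional: Update builds a new array value rather than mutating the old one, which is exactly what allows A to be renamed like a scalar during SSA translation. A Python sketch (0-based indexing and all names are ours, for illustration only):

```python
# A sketch (ours, not the paper's) of the pure Access/Update operators.
def access(a, i):
    """Access(A, i): the ith component of A (0-based here for brevity)."""
    return a[i]

def update(a, j, v):
    """Update(A, j, V): a fresh array value, same as A except component j."""
    b = list(a)   # copy rather than mutate, so the old value A survives
    b[j] = v
    return b

a0 = [10, 20, 30]
a1 = update(a0, 1, 99)   # like A(1) <- 99 in SSA: A1 <- Update(A0, 1, 99)
```

Because a0 is left intact, both the "before" and "after" array values can coexist under distinct SSA names, just as the text requires; Section 7.2's translation out of SSA form is what later reclaims this storage.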
tion out of SSA form reclaims storage that would otherwise be necessary to maintain distinct variables.

Consider the loop shown in Figure 8a, which assigns to every component of array A. The value assigned to each component A(i) does not depend (even indirectly) on any of the values previously assigned to components of A. In effect, the whole loop is like an assignment of the form A ← (...), where A is not used in (...). Any assignment to a component of A that is not used before entering such an initialization loop is dead code, but the Update operator in Figure 8b appears to use the old value of A. One response to the crudeness of the Update operator is to observe that index calculations and other genuine scalar calculations can still be optimized extensively. Another response is to perform dependence analysis [3, 10, 32, 51], which can sometimes determine that no subsequent accesses of A require the values assigned before a loop like the one in Figure 8a. The problem then, for us or for anyone who would like to formulate optimizations (like dead code elimination) in terms of scalar variables, is to communicate the results of dependence analysis to code that processes scalars. A simple solution to this formal problem is shown in Figure 8c, where the HiddenUpdate operator does not mention the assigned array A. The actual code generated for an assignment from a HiddenUpdate expression is exactly the same as for an assignment from the corresponding Update expression, where the hidden operand is supplied by the target of the assignment.
3.1.2 Structures. A structure can be generally regarded as an array, where references to structure fields are treated as references to elements of the array. Thus, an assignment to a structure field is translated into an Update of the structure, and a use of a structure field is translated into an Access of the structure. In the prevalent case of simple structure field references, this treatment results in arrays whose elements are indexed by constants. Dependence analysis can often determine independence among such accesses, so that optimization may move an assignment to one field far from an assignment to another field. If analysis or language semantics reveals a structure whose n fields are always accessed disjointly, then the
[Figure 8. Source loop with array assignment (a) is equivalent to code with an Update operator that treats the array A just like a scalar (b). As Section 7.2 explains, the eventual translation out of SSA form will leave just one array here. Using HiddenUpdate (c) is a purely formal way to summarize some results of dependence analysis, if available.]
structure can be decomposed into n distinct variables. When the fields are grouped together only for organizational reasons, such decomposition can make more optimization opportunities apparent in SSA form.

3.1.3 Implicit References. Translation to SSA form, and subsequent use of SSA form, depends on knowing those variables explicitly or implicitly modified or used by each statement. Some statements refer to variables not mentioned by the statement itself. Examples of such references are global variables modified or used by a procedure call, aliased variables, and dereferenced pointer variables. To obtain SSA form,
we must account for implicit as well as explicit references, either by conservative assumptions or by analysis. Heap storage can be conservatively modeled by representing the entire heap as a single variable that is both modified and used by any statement that may change the heap. More refined modeling is also possible [15], but the conservative approach is already strong enough to support optimization of code that does not affect the heap but is interspersed with heap-dependent code.

For any statement S, three types of references must be considered when translating into SSA form:

(1) MustMod(S) is the set of variables whose values must be modified by S.
(2) MayMod(S) is the set of variables whose values may be modified by S.
(3) MayUse(S) is the set of variables whose values may be used by S.
We represent the implicit references for any statement S as an assignment statement A, where all the variables in MayMod(S) appear in LHS(A) and all the variables in MayUse(S) ∪ (MayMod(S) − MustMod(S)) appear in RHS(A).

A compiler may or may not have access to the text of all procedures called (directly or indirectly) or to summaries of their effects. If no specific information is available, it is customary to make conservative assumptions. A call might change any global variable or any parameter passed by reference, but it is reasonable to assume that variables local to the caller will not change. Such assumptions limit the extent of the tuples. Techniques are available to determine aliasing and to extract more detailed information: for the effects of procedures on global variables, see [7-9], [19], [37], and [42]; for parameters and pointer variables, see [14, 15, 27, 29, 33, 34], and [44]. When there is little aliasing and a sophisticated side-effect analysis, the tuples are often small, and small tuples (both LHS and RHS) can be represented directly. Such analysis is often unavailable, however; many compilers do no interprocedural analysis at all. Consider a call to an external procedure that has not been analyzed. The tuples for the call must contain all global variables. A direct representation of the tuples can still be compact: the representation structure can include a flag (set to indicate that all global variables appear) plus a tuple of the few local variables involved. Our transformations are conveniently formulated in terms of explicit tuples; compact representations can still be used in the implementations.
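One way to realize the tuple construction just described is sketched below. The precise set formula for RHS(A), MayUse(S) together with the variables that may, but need not, be modified, is our reading of the text, and all names are ours:

```python
# A sketch (ours) of representing a statement's implicit references as a
# tuple assignment: LHS gets MayMod(S); RHS gets MayUse(S) plus any
# variable that may, but need not, be modified, so that its old value
# can flow through the statement.
def tuple_assignment(must_mod, may_mod, may_use):
    lhs = sorted(may_mod)
    rhs = sorted(set(may_use) | (set(may_mod) - set(must_mod)))
    return lhs, rhs

# unanalyzed call: may modify globals g1 and g2, definitely writes g1,
# and reads g2
lhs, rhs = tuple_assignment({"g1"}, {"g1", "g2"}, {"g2"})
```

Since g2 is only possibly modified, it appears on both sides, which is what forces later SSA renaming to treat the call as a potential redefinition of g2.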
3.2 An Overview of the SSA Algorithm

Translation to minimal SSA form proceeds in three steps:

(1) The dominance frontier mapping is constructed from the control flow graph (Section 4.2).
(2) Using the dominance frontiers, the locations of the φ-functions for each variable are determined (Section 5.1).
(3) The variables are renamed (Section 5.2) by replacing each mention of an original variable V by an appropriate mention of a new variable Vi.

4. DOMINANCE

Section 4.1 reviews the dominance relation [47] between nodes in the control flow graph and how to summarize this relation in a dominator tree. Section 4.2 introduces the dominance frontier mapping and gives an algorithm for its computation.

4.1 Dominator Trees
Let X and Y be nodes in the control flow graph CFG of a program. If X appears on every path from Entry to Y, then X dominates Y. Domination is both reflexive and transitive. If X dominates Y and X ≠ Y, then X strictly
[Figure 9. Control flow graph and dominator tree of the simple program. The sets of nodes listed in ( ) and [ ] brackets summarize the dominance frontier calculation from Section 4.2. Each node X is annotated with two sets [DFlocal(X) | DF(X)] and a third set (DFup(X)).]
dominates Y. In formulas, we write X ≥ Y for domination and X > Y for strict domination; X ≯ Y means that X does not strictly dominate Y. The immediate dominator of Y (denoted idom(Y)) is the closest strict dominator of Y on any path from Entry to Y. In a dominator tree, the children of a node X are all the nodes immediately dominated by X. The root of a dominator tree is Entry, and any node Y other than Entry has idom(Y) as its parent in the tree. The dominator tree for CFG from Figure 5 is shown in Figure 9.

Let N and E be the numbers of nodes and edges in CFG. The dominator tree can be constructed in O(Eα(E, N)) time [35] or (by a more difficult algorithm) in O(E) time [26]. For all practical purposes, α(E, N) is a small constant,4 so this paper will consider the dominator tree to have been found in linear time. The dominator tree of CFG has exactly the same set of nodes as CFG but a very different set of edges. Here, the words predecessor, successor, and path always refer to CFG, while the words parent, child, ancestor, and descendant always refer to the dominator tree.
4Under the definition of a ueed in analyzing the dominator tree algorithm [35, p. 123], N s E implies that a(17, N) = 1 when logz N < 16 and a(ll, N) = 2 when 16 < logz N < 216.
ACM TransactIons on Programming Languages and Systems, Vol. 13, No 4, October 1991
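As an illustration of how the idom mapping might be computed in practice, here is a sketch of a simple iterative scheme in the style of Cooper, Harvey, and Kennedy. It is not the near-linear algorithm of [35] or [26] that the text cites; the graph encoding, the example CFG, and all function names below are our own assumptions, not the paper's.

```python
def compute_idom(succ, entry):
    # Number the nodes in reverse postorder of a depth-first search.
    order, seen = [], {entry}
    def dfs(n):
        for s in succ[n]:
            if s not in seen:
                seen.add(s)
                dfs(s)
        order.append(n)
    dfs(entry)
    rpo = order[::-1]
    index = {n: i for i, n in enumerate(rpo)}
    preds = {n: [] for n in rpo}
    for n in rpo:
        for s in succ[n]:
            preds[s].append(n)

    idom = {entry: entry}
    def intersect(a, b):
        # Walk the two dominator chains upward until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for n in rpo:
            if n == entry:
                continue
            processed = [p for p in preds[n] if p in idom]
            new = processed[0]
            for p in processed[1:]:
                new = intersect(new, p)
            if idom.get(n) != new:
                idom[n] = new
                changed = True
    idom[entry] = None          # the root has no immediate dominator
    return idom

# A small diamond-shaped CFG: Entry -> A -> {B, C} -> D.
cfg = {'Entry': ['A'], 'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
idom = compute_idom(cfg, 'Entry')
```

The fixpoint loop converges in one or two passes on typical graphs; the result is exactly the parent relation of the dominator tree described above.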
4.2 Dominance Frontiers

The dominance frontier DF(X) of a CFG node X is the set of all CFG nodes Y such that X dominates a predecessor of Y but does not strictly dominate Y:

    DF(X) = {Y | (there is P ∈ Pred(Y)) (X dominates P and X !>> Y)}.

Computing DF(X) directly from the definition would require searching much of the dominator tree, and the total time to compute DF(X) for all nodes X would be quadratic, even when the sets themselves are small. To compute the dominance frontier mapping in time linear in the total size Σ |DF(X)| of the sets, we define two intermediate sets DFlocal(X) and DFup(Z) for which the following equation holds:

    DF(X) = DFlocal(X) ∪ ⋃_{Z ∈ Children(X)} DFup(Z).    (4)

Given any node X, some of the successors of X may contribute to DF(X). This local contribution DFlocal(X) is defined by

    DFlocal(X) = {Y ∈ Succ(X) | X !>> Y}.

Given any node Z that is not the root Entry of the dominator tree, some of the nodes in DF(Z) may contribute to DF(idom(Z)). The contribution DFup(Z) that Z passes up to idom(Z) is defined by

    DFup(Z) = {Y ∈ DF(Z) | idom(Z) !>> Y}.
LEMMA 1. Equation (4) is correct for any node X.

PROOF. Because domination is reflexive, DFlocal(X) ⊆ DF(X), and because domination is transitive, each child Z of X has DFup(Z) ⊆ DF(X). We must still show that everything in DF(X) has been accounted for. Suppose Y ∈ DF(X), and let U → Y be an edge such that X dominates U but does not strictly dominate Y. If U = X, then Y ∈ DFlocal(X), and we are done. If U ≠ X, on the other hand, then there is a child Z of X that dominates U. The node Z cannot strictly dominate Y, because then X would strictly dominate Y. Hence Y ∈ DF(Z), and because idom(Z) = X does not strictly dominate Y, we have Y ∈ DFup(Z). □

LEMMA 2. DFlocal(X) can be computed with simple equality tests: for Y ∈ Succ(X), Y ∈ DFlocal(X) iff idom(Y) ≠ X.

PROOF. It suffices to show that X >> Y iff idom(Y) = X. One direction is immediate, because the immediate dominator of Y strictly dominates Y. For the other direction, suppose that X >> Y and, hence, that some child V of X dominates Y. The node V appears on every path from Entry to Y, in particular on any path that reaches X and then follows the edge X → Y; hence either V = Y or V dominates X. Because X >> V, the node V cannot dominate X, so V = Y, and idom(Y) = idom(V) = X. □

LEMMA 3. Let Z be any node that is not the root of the dominator tree, and let X = idom(Z). Then, for any Y ∈ DF(Z), Y ∈ DFup(Z) iff idom(Y) ≠ X.

PROOF. By definition, Y ∈ DFup(Z) iff X !>> Y, so it suffices to show that X >> Y iff idom(Y) = X. One direction is again immediate. For the other, we assume that X >> Y and, hence, that some child V of X dominates Y. Choose a predecessor U of Y such that Z dominates U; this is possible because Y ∈ DF(Z). The node V appears on every path from Entry to Y, in particular on any path that reaches U and then follows the edge U → Y, so either V = Y or V dominates U. If V = Y, then idom(Y) = idom(V) = X, and we are done. Suppose V ≠ Y (and, hence, that V dominates U) and derive a contradiction. Only one child of X can dominate U (the dominators of U form a chain in the dominator tree), so V = Z, and Z dominates Y. Because V ≠ Y excludes Z = Y, the node Z strictly dominates Y, contradicting the hypothesis that Y ∈ DF(Z). □

Fig. 10. Algorithm for computing the dominance frontier mapping.

    for each X in a bottom-up traversal of the dominator tree do
        DF(X) ← ∅
        for each Y ∈ Succ(X) do                /* local */
            if idom(Y) ≠ X then DF(X) ← DF(X) ∪ {Y}
        end
        for each Z ∈ Children(X) do            /* up */
            for each Y ∈ DF(Z) do
                if idom(Y) ≠ X then DF(X) ← DF(X) ∪ {Y}
            end
        end
    end
The algorithm for computing dominance frontiers is given in Figure 10. The /*local*/ line effectively computes DFlocal(X) and uses it in (4) without needing to devote storage to it; the /*up*/ line is similar for DFup(Z). We traverse the dominator tree bottom-up, visiting each node X only after visiting each of its children. To illustrate the working of this algorithm, we have annotated the dominator tree in Figure 9 with the information in [ ] and ( ) brackets.

THEOREM 1. The algorithm in Figure 10 is correct.

PROOF. Direct from the preceding lemmas. □

Let CFG have N nodes and E edges. The loop over Succ(X) in Figure 10 examines each edge just once, so all executions of the /*local*/ line
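The Figure-10 scheme can be transcribed almost line for line into Python. The dictionary encoding, helper names, and the small example graph below are our own illustrative assumptions; succ describes the CFG and idom the precomputed dominator tree.

```python
def dominance_frontiers(succ, idom):
    children = {n: [] for n in succ}       # dominator-tree children
    for n, d in idom.items():
        if d is not None:
            children[d].append(n)
    df = {n: set() for n in succ}

    def walk(x):                           # bottom-up traversal
        for z in children[x]:
            walk(z)
        for y in succ[x]:                  # /* local */
            if idom[y] != x:
                df[x].add(y)
        for z in children[x]:              # /* up */
            for y in df[z]:
                if idom[y] != x:
                    df[x].add(y)

    for n, d in idom.items():
        if d is None:                      # Entry is the root
            walk(n)
    return df

# A diamond with a tail: B and C both branch to D, which exits.
succ = {'Entry': ['A'], 'A': ['B', 'C'], 'B': ['D'],
        'C': ['D'], 'D': ['Exit'], 'Exit': []}
idom = {'Entry': None, 'A': 'Entry', 'B': 'A',
        'C': 'A', 'D': 'A', 'Exit': 'D'}
df = dominance_frontiers(succ, idom)
```

Only the branch arms B and C have nonempty frontiers here, namely the merge point D, which is where a φ-function would be needed.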
complete in time O(E). Similarly, all executions of the /*up*/ line complete in time O(size(DF)), where size(DF) is the sum over all X of |DF(X)|. The overall time is thus O(E + size(DF)), which Section 8 shows to be linear in practice.

4.3 Relating Dominance Frontiers to Joins

We can now be more precise by stating a characterization of where the φ-functions should be located.
The join J(S) of a set S of CFG nodes is defined to be the set of all nodes Z such that there are two nonnull CFG paths that start at two distinct nodes in S and converge at Z. The iterated join J+(S) is the limit of the increasing sequence of sets of nodes

    J_1 = J(S);
    J_(i+1) = J(S ∪ J_i).

If S is the set of assignment nodes for a variable V, then J+(S) is the set of φ-function nodes for V. The join and iterated join operations map sets of nodes to sets of nodes. We extend the dominance frontier mapping from nodes to sets of nodes in the natural way:

    DF(S) = ⋃_{X ∈ S} DF(X).

As with join, the iterated dominance frontier DF+(S) is the limit of the increasing sequence

    DF_1 = DF(S);
    DF_(i+1) = DF(S ∪ DF_i).

The actual computation of DF+(S) is performed by the worklist algorithm given in Section 5.1. Here we establish the formulation that relates iterated joins to iterated dominance frontiers: if S is the set of assignment nodes for a variable V, then

    J+(S) = DF+(S)

(this equation depends on the fact that Entry is in S) and, hence, the locations of the φ-functions for V can be computed by the worklist algorithm for computing DF+(S) that is given in Section 5.1. The following lemmas do most of the work by relating dominance frontiers to joins:
LEMMA 4. For any nonnull path p: X →+ Z in CFG, there is a node X' ∈ {X} ∪ DF+({X}) on p that dominates Z. Unless X dominates every node on p, the node X' can be chosen in DF+({X}).

PROOF. If X dominates every node on p, then choose X' = X, using the dominance properties of X. We may therefore assume that some nodes on p are not dominated by X. Let the sequence of nodes on p be X = X_0, ..., X_J = Z. For the first i such that X does not dominate X_i, the predecessor X_(i-1) is dominated by X and so puts X_i into DF(X). Thus, there are choices of j with X_j ∈ DF+({X}). Consider X' = X_j for the largest j with X_j ∈ DF+({X}). We will show that X' dominates Z. Suppose not. Then j < J, and there is a first k with j < k <= J such that X' does not dominate X_k. The predecessor X_(k-1) is dominated by X' and so puts X_k into DF(X'). Thus, X_k ∈ DF(DF+({X})) ⊆ DF+({X}), contradicting the choice of j. □
LEMMA 5. If two nonnull paths p: X →+ Z and q: Y →+ Z converge at Z, then Z ∈ DF+({X}) ∪ DF+({Y}).

PROOF. We consider three cases that are obviously exhaustive. In the first two cases, we prove that Z ∈ DF+({X}) ∪ DF+({Y}); then we show that the first two cases are actually exhaustive, because the third case leads to a contradiction. Let X' be from Lemma 4 for the path p, with the sequence of nodes X = X_0, ..., X_J = Z. Let Y' be from Lemma 4 for the path q, with the sequence Y = Y_0, ..., Y_K = Z.

Case 1. We suppose that X' is on q and show that Z ∈ DF+({X}). By the definition of convergence (specifically, (3) in Section 2), X' = Z. We may now assume that Z = X and that X dominates every node on p. (Otherwise, Lemma 4 asserts that X' = Z can be chosen in DF+({X}), and we are done.) Because X dominates the predecessor X_(J-1) of Z but does not strictly dominate Z, we have Z ∈ DF(X) ⊆ DF+({X}).

Case 2. We suppose that Y' is on p. By reasoning symmetric to Case 1, Z ∈ DF+({Y}).

Case 3. We suppose that X' is not on q and that Y' is not on p, and derive a contradiction. From Lemma 4, X' dominates Y_K = Z; because X' is not on q, X' ≠ Z, so X' >> Y_K and, therefore, X' dominates all predecessors of Y_K. In particular, X' dominates Y_(K-1). But X' ≠ Y_(K-1), so X' >> Y_(K-1), and we continue inductively to show that X' >> Y_k for all k. In particular, X' >> Y'. On the other hand, by similar reasoning from the supposition that Y' dominates Z but Y' is not on p, we can show that Y' >> X'. Two nodes cannot strictly dominate each other, so Case 3 is impossible. □
LEMMA 6. For any set S of CFG nodes, J(S) ⊆ DF+(S).

PROOF. Any Z ∈ J(S) is the convergence point of two nonnull paths that start at two distinct nodes X and Y in S. By Lemma 5, Z ∈ DF+({X}) ∪ DF+({Y}) ⊆ DF+(S). □
LEMMA 7. For any set S of CFG nodes such that Entry ∈ S, DF(S) ⊆ J(S).

PROOF. Consider any X ∈ S and any Y ∈ DF(X). There is a path from X to Y where all nodes before Y are dominated by X. There is also a path from Entry to Y where none of the nodes is dominated by X. The paths therefore converge at Y. □
THEOREM 2. The set of nodes that need φ-functions for any variable V is the iterated dominance frontier DF+(S), where S is the set of nodes with assignments to V.

PROOF. By Lemma 6 and induction on i in the definition of J+, we can show that

    J+(S) ⊆ DF+(S).    (5)

The induction step uses the monotonicity of J together with Lemma 6:

    J_(i+1) = J(S ∪ J_i) ⊆ J(S ∪ DF+(S)) ⊆ DF+(S ∪ DF+(S)) = DF+(S).

The node Entry is in S, so Lemma 7 and induction on i yield the reverse containment DF+(S) ⊆ J+(S). The induction step here is as follows:

    DF_(i+1) = DF(S ∪ DF_i) ⊆ DF(S ∪ J+(S)) ⊆ J(S ∪ J+(S)) = J+(S).

The set of nodes that need φ-functions for V is precisely J+(S), and the two containments prove the theorem. □
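Theorem 2's equation J+(S) = DF+(S) can be checked by brute force on a small acyclic example. The path enumeration below is our own naive implementation of the convergence definition, and the hardcoded DF sets are what the Figure-10 computation produces for this particular graph; names and encoding are our assumptions, not the paper's.

```python
from itertools import product

succ = {'Entry': ['A', 'Exit'], 'A': ['B', 'C'],
        'B': ['D'], 'C': ['D'], 'D': ['Exit'], 'Exit': []}
# Dominance frontiers of this graph, as Figure 10 would compute them.
df = {'Entry': set(), 'A': {'Exit'}, 'B': {'D'},
      'C': {'D'}, 'D': {'Exit'}, 'Exit': set()}

def paths_from(x):
    # Every nonnull path leaving x (the graph above is acyclic).
    result = []
    def grow(p):
        for n in succ[p[-1]]:
            result.append(p + [n])
            grow(p + [n])
    grow([x])
    return result

def join(s):
    # Z joins two paths from distinct nodes of s that share only Z.
    j = set()
    for x, y in product(s, repeat=2):
        if x == y:
            continue
        for p in paths_from(x):
            for q in paths_from(y):
                if p[-1] == q[-1] and set(p) & set(q) == {p[-1]}:
                    j.add(p[-1])
    return j

def iterated(op, s):
    cur = op(s)
    while op(s | cur) != cur:
        cur = op(s | cur)
    return cur

def df_of(s):
    out = set()
    for x in s:
        out |= df[x]
    return out

s = {'Entry', 'B', 'C'}          # assignment nodes for some variable
jplus = iterated(join, s)
dfplus = iterated(df_of, s)
```

Both computations agree on this example: the merge point D and, because of the Entry to Exit edge, the node Exit.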
5. CONSTRUCTION OF MINIMAL SSA FORM

5.1 Placing φ-Functions

The algorithm in Figure 11 inserts φ-functions by maintaining a worklist W of CFG nodes being processed. In each iteration of the outer loop, which runs once for each variable V, the worklist W is initialized to the set A(V) of nodes that contain assignments to V. Each node X taken from the worklist ensures that each node Y in DF(X) receives a φ-function for V; the inner iteration terminates when the worklist is empty. HasAlready(X) is an array of flags, one flag for each node, that records whether a φ-function for the current variable has already been inserted at X. Work(X) is an array of flags recording whether X has ever been added to W during the current iteration of the outer loop. The flags could have been implemented with the values true and false, but this would require time to reset any true flags between iterations, at the cost of a pass over all the nodes. It is simpler to devote an integer to each flag and to test flags by comparing them with the current iteration count.

Let each node X have Aorig(X) original assignments to variables, where each ordinary assignment statement LHS ← RHS contributes the length of the tuple LHS to Aorig(X). Counting assignments to variables is one of
Fig. 11. Placement of φ-functions.

    IterCount ← 0
    for each node X do
        HasAlready(X) ← 0
        Work(X) ← 0
    end
    for each variable V do
        IterCount ← IterCount + 1
        W ← ∅
        for each X ∈ A(V) do
            Work(X) ← IterCount
            W ← W ∪ {X}
        end
        while W ≠ ∅ do
            take X from W
            for each Y ∈ DF(X) do
                if HasAlready(Y) < IterCount
                    then do
                        place (V ← φ(V, ..., V)) at Y
                        HasAlready(Y) ← IterCount
                        if Work(Y) < IterCount
                            then do
                                Work(Y) ← IterCount
                                W ← W ∪ {Y}
                            end
                    end
            end
        end
    end
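The Figure-11 worklist, including the integer IterCount trick that avoids resetting boolean flags between variables, might be sketched as follows. The dictionary encoding and the example df values (taken from the small graph used above) are our own assumptions.

```python
def place_phis(nodes, df, assigned):
    # assigned: variable -> set A(V) of nodes containing assignments to it
    has_already = {n: 0 for n in nodes}
    work = {n: 0 for n in nodes}
    iter_count = 0
    phi_nodes = {v: set() for v in assigned}
    for v, sites in assigned.items():
        iter_count += 1                       # flags reset implicitly
        w = []
        for x in sites:                       # W starts as the set A(V)
            work[x] = iter_count
            w.append(x)
        while w:
            x = w.pop()
            for y in df[x]:
                if has_already[y] < iter_count:
                    phi_nodes[v].add(y)       # place V <- phi(V, ..., V) at Y
                    has_already[y] = iter_count
                    if work[y] < iter_count:
                        work[y] = iter_count
                        w.append(y)
    return phi_nodes

nodes = ['Entry', 'A', 'B', 'C', 'D', 'Exit']
df = {'Entry': set(), 'A': {'Exit'}, 'B': {'D'},
      'C': {'D'}, 'D': {'Exit'}, 'Exit': set()}
phis = place_phis(nodes, df, {'v': {'B', 'C'}})
```

Starting from assignments in B and C, the worklist first places a φ-function at D and then, because D itself now assigns v, one more at Exit: exactly the iterated dominance frontier of {B, C}.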
several measures of program size.⁵ By this measure, the program expands from size Aorig = Σ_X Aorig(X) to size Atot = Σ_X Atot(X), where Atot(X) = Aorig(X) + Aφ(X) and each φ-function placed at X contributes 1 to Aφ(X). There is a similar expansion in the number of mentions of variables, from Morig = Σ_X Morig(X) to Mtot = Σ_X Mtot(X), where each φ-function placed at X contributes 1 plus the indegree of X to Mtot(X).

Placing a φ-function at Y in Figure 11 has cost linear in the indegree of Y, so there is an O(Σ_X Mtot(X)) contribution to the running time from the work done when HasAlready(Y) < IterCount. Replacing mentions of variables will contribute at least O(Mtot) to the running time of any SSA translation algorithm, so the cost of placement can be ignored in analyzing the contribution of Figure 11 to the O(...) bound for the whole process. The O(N) cost of initialization can be similarly ignored because it is subsumed by the cost of computing the dominance frontier mapping. What cannot be ignored is the cost of managing the worklist: the statement take X from W in Figure 11 is performed Atot(X) times, and each time a cost linear in |DF(X)| is incurred because all Y ∈ DF(X) are polled. The contribution of Figure 11 to the running time of the whole process is therefore

    O(Σ_X (Atot(X) × |DF(X)|)).    (6)

Sizing the output in the natural way as Atot, we can also describe the contribution (6) as

    O(Atot × avrgDF),    (7)

where avrgDF is a weighted average of dominance frontier sizes that emphasizes the dominance frontiers of nodes with many assignments. As Section 8 shows, the dominance frontiers are small in practice, and Figure 11 is effectively O(Atot). This, in turn, is effectively O(Aorig).
5.2 Renaming

The algorithm in Figure 12 renames all mentions of variables. New variables Vi, where i is an integer, are generated for each variable V. Figure 12 begins a top-down traversal of the dominator tree by calling SEARCH at the root node Entry. The visit to any node processes the statements associated with that node in sequential order, starting with any φ-functions that may have been inserted. The renaming process requires work only for those variables actually mentioned in a statement. In contrast with Figure 11, we need a loop over all variables only when we initialize two arrays among the following data structures:

S(*) is an array of stacks, one stack for each variable V. The stacks hold integers; the integer i at the top of S(V) identifies the variable Vi that should replace a use of V.

C(*) is an array of integers, one counter for each variable V. The counter C(V) tells how many assignments to V have been processed.

WhichPred(Y, X) is an integer telling which predecessor of Y in CFG is X. The j-th operand of a φ-function in Y corresponds to the j-th predecessor of Y in the listing of the in-edges of Y.

Each assignment statement A has the form LHS(A) ← RHS(A), where the right-hand side RHS(A) is a tuple of expressions and the left-hand side LHS(A) is a tuple of distinct target variables. These tuples change as mentions of variables are renamed, but the original target tuple is still remembered as oldLHS(A). To minimize notation, a conditional branch is treated as an ordinary assignment to a special variable whose value indicates which edge the branch should follow. A real implementation would recognize that a conditional branch involves a little less work than a genuine assignment: The LHS part of the processing can be omitted.
Fig. 12. Renaming mentions of variables.

    for each variable V do
        C(V) ← 0
        S(V) ← EmptyStack
    end
    call SEARCH(Entry)

    SEARCH(X):
        for each statement A in X do
            if A is an ordinary assignment
                then for each variable V used in RHS(A) do
                    replace the use of V by Vi, where i = Top(S(V))
                end
            for each V in LHS(A) do
                i ← C(V)
                replace V by Vi in LHS(A)
                push i onto S(V)
                C(V) ← i + 1
            end
        end
        for each Y ∈ Succ(X) do
            j ← WhichPred(Y, X)
            for each φ-function F in Y do
                replace the j-th operand V of RHS(F) by Vi, where i = Top(S(V))
            end
        end
        for each Y ∈ Children(X) do
            call SEARCH(Y)
        end
        for each assignment A in X do
            for each V in oldLHS(A) do
                pop S(V)
            end
        end
    end SEARCH
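A compact sketch of the Figure-12 renaming pass appears below. The statement encoding, the nominal V0 assignment assumed at Entry, and the tiny diamond program are our own illustrative choices, not the paper's notation.

```python
def rename_program(entry, children, succ, preds, code, variables):
    # code[n]: ordered, mutable statements of node n:
    #   ['assign', target, [used variable names...]]
    #   ['phi',    target, [one operand slot per predecessor of n]]
    counter = {v: 1 for v in variables}     # V0 is the nominal Entry value
    stack = {v: [0] for v in variables}

    def top(v):
        return f"{v}{stack[v][-1]}"

    def search(x):
        pushed = []
        for st in code[x]:
            if st[0] == 'assign':
                st[2] = [top(v) for v in st[2]]   # rename uses first
            v = st[1]
            i = counter[v]
            counter[v] = i + 1
            stack[v].append(i)
            pushed.append(v)
            st[1] = f"{v}{i}"                     # then rename the target
        for y in succ[x]:
            j = preds[y].index(x)                 # WhichPred(Y, X)
            for st in code[y]:
                if st[0] == 'phi':
                    st[2][j] = top(st[2][j])      # fill the j-th phi operand
        for y in children.get(x, []):
            search(y)                             # dominator-tree descent
        for v in reversed(pushed):
            stack[v].pop()                        # restore the stacks

    search(entry)
    return code

succ = {'Entry': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
preds = {'Entry': [], 'B': ['Entry'], 'C': ['Entry'], 'D': ['B', 'C']}
children = {'Entry': ['B', 'C', 'D']}             # dominator tree of Entry
code = {'Entry': [],
        'B': [['assign', 'v', []]],
        'C': [['assign', 'v', []]],
        'D': [['phi', 'v', ['v', 'v']],
              ['assign', 'w', ['v']]]}
rename_program('Entry', children, succ, preds, code, ['v', 'w'])
```

After the pass, the φ-function at D has its first operand filled from the B side and its second from the C side, and the use in the following statement refers to the φ target.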
The processing of an assignment statement A considers the mentions of variables in A. The simplest case is that of a target variable V in the tuple LHS(A). We need a new variable Vi. By keeping a count C(V) of the number of assignments to V that have already been processed, we can find an appropriate new variable by using i = C(V) and then incrementing C(V). To facilitate renaming uses of V in the future, we also push i (which identifies Vi) onto the stack S(V) of integers that identify the variables replacing V.
A subtler computation is needed for the right-hand side RHS(A); that is, for any variable V that appears in at least one of the expressions in the tuple, we want to replace V by the Vi whose assignment produces the value for V actually used in RHS(A). The statement whose target is Vi may be either an ordinary assignment or a φ-function. Both subcases get Vi from the top of the stack S(V), but they inspect S(V) at different times in Figure 12. Lemma 10, introduced later in this section, shows that Vi is correctly chosen. The correctness proof for renaming depends on three more lemmas. We start by showing that each new variable in the transformed program is assigned exactly once.

LEMMA 8. Each new variable Vi in the transformed program is a target of exactly one assignment.

PROOF. Because the counter C(V) is incremented after processing each assignment to V, there can be at most one assignment to Vi. To show that there is at least one assignment to Vi, we consider the two ways Vi can be mentioned. If Vi is mentioned on the LHS of an assignment, then there is nothing more to show. If Vi is used, on the other hand, then i = Top(S(V)) was true at the time the algorithm renamed the old use of V to Vi; the integer i was on the stack because it had been pushed at the time the algorithm renamed an old assignment to V to an assignment to Vi. □
A node X may contain assignments to the variable V that appeared in the original program or that were introduced to receive the value of a φ-function for V. Let TopAfter(V, X) denote the new variable in effect for V after all statements in node X have been considered by the first loop in Figure 12. Specifically, we consider the top of each stack S(V) at the end of this loop and define

    TopAfter(V, X) = Vi, where i = Top(S(V)) at the end of the first loop for X.

If there are no assignments to V in X, then the visit to X leaves the top of S(V) unchanged, so TopAfter(V, X) is inherited: TopAfter(V, X) = TopAfter(V, idom(X)).

LEMMA 9. For any variable V and any CFG edge X → Y such that Y does not have a φ-function for V,

    TopAfter(V, X) = TopAfter(V, idom(Y)).    (8)

PROOF. Because the φ-functions for V were placed at the iterated dominance frontier of the nodes that assign to V (Theorem 2), no node that dominates a predecessor of Y without strictly dominating Y can assign to a new variable derived from V; otherwise Y would have a φ-function for V. We use this fact twice below.

If idom(Y) = X, then (8) is immediate, so suppose idom(Y) ≠ X. By Lemma 2, Y ∈ DFlocal(X) ⊆ DF(X), and, thus, X does not assign to a new variable derived from V. Let U be the first node in the sequence idom(X), idom(idom(X)), ... that assigns to such a variable; U exists because Entry makes a nominal assignment to every variable. Then

    TopAfter(V, X) = TopAfter(V, U).    (9)

But U dominates X, a predecessor of Y, so by the fact above U strictly dominates Y. The node idom(Y) also strictly dominates Y and therefore dominates the predecessor X; since idom(Y) ≠ X, it appears in the sequence idom(X), idom(idom(X)), ..., and U, being a strict dominator of Y, dominates idom(Y), so idom(Y) lies at or below U in the sequence. By the choice of U, the nodes in the sequence strictly below U do not assign to a new variable derived from V, so

    TopAfter(V, idom(Y)) = TopAfter(V, U),

and (8) follows from (9). □
The preceding lemma helps establish Condition (3) in the definition of SSA form, but first we must extend TopAfter so as to specify which Vi corresponds to V just before and just after the processing of a statement A by the first loop in Figure 12. For the top of the stack just before and just after this processing, define

    TopBefore(V, A) = Vi, where i = Top(S(V)) just before processing A;
    TopAfter(V, A) = Vi, where i = Top(S(V)) just after processing A.

In particular, if A happens to be the last statement in block X, then TopAfter(V, A) = TopAfter(V, X). If, on the other hand, A is followed by another statement B in the same block, then TopAfter(V, A) = TopBefore(V, B).
LEMMA 10. Consider any control flow path through the transformed program and the same path through the original program, and consider any occurrence of a statement A along the path. If A is from the original program, then the value of any variable V just before executing A in the original program is the same as the value of TopBefore(V, A) just before executing A in the transformed program.

PROOF. We use induction along the path. We consider the kth statement A along the path and assume that, for all j < k, the jth statement T is either not from the original program⁶ or has each variable V agreeing with TopBefore(V, T) just before executing T. We show that the kth statement A, if it is from the original program, has each variable V agreeing with TopBefore(V, A) just before executing A.

Case 1. Suppose that A is not the first original statement in its basic block Y. Let T be the statement just before A in Y. By the induction hypothesis and the semantics of assignment, V agrees with TopAfter(V, T) just after T. Thus, V agrees with TopBefore(V, A) just before A.

Case 2. Suppose that A is the first original statement in its basic block Y. Let X → Y be the edge followed by the control path. We claim that V agrees with TopAfter(V, X) when control flows along this edge. If X = Entry, then TopAfter(V, X) is V0 and does hold the entry value of V. If X ≠ Entry, let T be the last statement in X. By the induction hypothesis and the semantics of assignment, V agrees with TopAfter(V, T)

⁶It could be a φ-function or a nominal assignment at Entry.
Fig. 13. In the proof of Lemma 10, the node Y may or may not have a φ-function for V. [Diagram not reproduced.]
just after T and, still, with TopAfter(V, X) = TopAfter(V, T) when control flows along the edge X → Y. We must now show that V agrees with TopBefore(V, A) just before doing A in Y, without knowing ahead of time whether Y has a φ-function for V. As Figure 13 illustrates, there may or may not be a φ-function for V in the transformed program in Y.

Case 2.1. Suppose that V has a φ-function Vi ← φ(...) in Y. The operand corresponding to the edge X → Y is TopAfter(V, X), thanks to the loop over successors of X in Figure 12. This is the operand whose value is assigned to Vi, which is TopBefore(V, A). Thus, V agrees with TopBefore(V, A) just before doing A in Y.

Case 2.2. Suppose that V does not have a φ-function in Y. By Lemma 9, TopAfter(V, X) = TopAfter(V, idom(Y)) = TopBefore(V, A). Thus, V agrees with TopBefore(V, A) just before doing A in Y. □
THEOREM 3. Any program can be put into minimal SSA form by computing the dominance frontiers and then applying the algorithms in Figure 11 and Figure 12. Let Atot be the number of assignments to variables in the resulting program, and let Mtot be the total number of mentions of variables in the resulting program. Let E be the number of edges in CFG, and let avrgDF be the weighted average (7) of dominance frontier sizes. Then the running time of the whole process is

    O(E + Σ_X |DF(X)| + (Atot × avrgDF) + Mtot).

PROOF. For each variable V, Figure 11 places the φ-functions at exactly the nodes that need them: by Theorem 2, these are the nodes in the iterated dominance frontier of the set of nodes with assignments to V in the original program. We have therefore obtained Condition (1) in the definition of SSA form with the fewest possible φ-functions. The translation of mentions is done correctly by Figure 12. We must still show Conditions (2) and (3) in the definition: Condition (2) follows from Lemma 8, and Condition (3) follows from Lemma 10.

Let N be the number of nodes in CFG. The dominator tree has O(N) edges, and Figure 12 runs in O(N + E + Mtot) time. The N + E term is subsumed by the cost of computing the dominance frontiers, and Figure 11 contributes the O(Atot × avrgDF) term, as shown at the end of Section 5.1. □
6. CONSTRUCTION OF CONTROL DEPENDENCE

In this section we show that control dependences [24] are essentially the dominance frontiers in the reverse graph of the control flow graph. Let X and Y be nodes in CFG. If X appears on every path from Y to Exit, then X postdominates Y.⁸ Like the dominator relation, the postdominator relation is reflexive and transitive. If X postdominates Y but X ≠ Y, then X strictly postdominates Y. The immediate postdominator of Y is the closest strict postdominator of Y on any path from Y to Exit. In a postdominator tree, the children of a node X are all immediately postdominated by X.

A CFG node Y is control dependent on a CFG node X if and only if (iff) both of the following hold:

(1) There is a nonnull path p: X →+ Y such that Y postdominates every node after X on p.
(2) The node Y does not strictly postdominate the node X.

Intuitively, this means that there is some edge from X that causes Y to execute, although Y is not always executed after X. Our definition here can easily be shown to be equivalent to the one in [24].
LEMMA 11. Let X and Y be CFG nodes. Then Y postdominates a successor of X if and only if (iff) there is a nonnull path p: X →+ Y such that Y postdominates every node after X on p.

PROOF. Suppose that Y postdominates a successor U of X. Choose any path q from U to Exit. Then Y appears on q. Let r be the initial segment of q ending with the first appearance of Y on q. For any node V on r, we can get from U to Exit by following r to V and then by taking any path from V to Exit. Because Y postdominates U but does not appear before the end of r, Y must postdominate V as well. Let p be the path that starts with the edge X → U and proceeds along r. Then p: X →+ Y, and Y postdominates every node after X on p.

⁸The postdominance relation in [24] is irreflexive, whereas the definition we use here is reflexive. The two relations are identical on pairs of distinct elements. We choose the reflexive definition here to make postdominance the dual of the dominance relation.
Fig. 14. Algorithm for computing the set CD(X) of nodes that are control dependent on X.

    build RCFG
    apply the algorithm in Figure 10 to RCFG to find the dominance frontier mapping RDF
    for each node X do
        CD(X) ← ∅
    end
    for each node Y do
        for each X ∈ RDF(Y) do
            CD(X) ← CD(X) ∪ {Y}
        end
    end
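Corollary 1 and Figure 14 can be sketched by running the dominance frontier computation on the reversed graph. The dominance_frontiers helper repeats the Figure-10 sketch; the example if-then-else graph, the hand-written postdominator tree, and every name below are our own assumptions.

```python
def dominance_frontiers(succ, idom):
    children = {n: [] for n in succ}
    for n, d in idom.items():
        if d is not None:
            children[d].append(n)
    df = {n: set() for n in succ}
    def walk(x):
        for z in children[x]:
            walk(z)
        for y in succ[x]:
            if idom[y] != x:
                df[x].add(y)
        for z in children[x]:
            for y in df[z]:
                if idom[y] != x:
                    df[x].add(y)
    for n, d in idom.items():
        if d is None:
            walk(n)
    return df

# An if-then-else, plus the Entry -> Exit edge discussed in this section.
cfg = {'Entry': ['T', 'Exit'], 'T': ['B1', 'B2'],
       'B1': ['J'], 'B2': ['J'], 'J': ['Exit'], 'Exit': []}

rcfg = {n: [] for n in cfg}                    # reverse every edge
for x, ys in cfg.items():
    for y in ys:
        rcfg[y].append(x)

# Immediate postdominators of cfg = immediate dominators of rcfg,
# written down by hand for this small example.
ipdom = {'Exit': None, 'Entry': 'Exit', 'J': 'Exit',
         'T': 'J', 'B1': 'J', 'B2': 'J'}

rdf = dominance_frontiers(rcfg, ipdom)
cd = {n: set() for n in cfg}                   # CD(X), as in Figure 14
for y, xs in rdf.items():
    for x in xs:
        cd[x].add(y)
```

The two branch arms come out control dependent on the test T, while T and the join J are control dependent on Entry, matching the definition directly.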
Fig. 15. Control dependences of the program in Figure 5. [Table not reproduced.]
For the converse, given a path p with these properties, let U be the node just after X on p. Then U is a successor of X, and Y postdominates U. □

The reverse control flow graph RCFG has the same nodes as the given control flow graph CFG, but has an edge Y → X for each edge X → Y in CFG. The roles of Entry and Exit are also reversed. The postdominator relation on CFG is the dominator
relation on RCFG.

COROLLARY 1. Let X and Y be nodes in CFG. Then Y is control dependent on X in CFG iff X ∈ DF(Y) in RCFG.

PROOF. Using Lemma 11 to simplify the first condition in the definition of control dependence, we find that Y is control dependent on X iff Y postdominates a successor of X but does not strictly postdominate X. In RCFG, this means that Y dominates a predecessor of X but does not strictly dominate X; that is, X ∈ DF(Y) in RCFG. □

Figure 14 applies this result to the computation of control dependences. After building the dominator tree for RCFG in time O(E α(E, N)), we spend O(size(RDF)) finding the dominance frontier mapping RDF of RCFG and then inverting it. The total time is thus O(E + size(RDF)), which is linear for all practical purposes. By applying the algorithm in Figure 14 to the control flow graph in Figure 5, we obtain the control dependences shown in Figure 15. Note that the edge from Entry to Exit was added so that the control dependence relation, viewed as a graph, would be rooted at Entry.
7. TRANSLATING FROM SSA FORM

Data-flow analysis and optimization can be performed on programs in SSA form. Eventually, however, a program must be executed. The φ-functions have precise semantics, but they are generally not represented in existing target machines. This section describes how to translate out of SSA form by replacing each φ-function with some ordinary assignment statements. Naive translation could yield inefficient object code, but efficient code can be generated if two useful optimizations are applied: dead code elimination and allocation of storage by coloring.

Naively, a k-input φ-function at entrance to a node X can be replaced by k ordinary assignments, one at the end of each control flow predecessor of X. This is always correct, but these ordinary assignments may perform a good deal of useless work. If the naive replacement is preceded by dead code elimination and then followed by coloring, however, the resulting code is efficient.

7.1 Dead Code Elimination

A source program may have dead code (i.e., code that cannot affect the program output), and optimizations (such as code motion out of loops or procedure integration) are additional sources for dead code. Because there are many sources for dead code, it is natural to perform (or repeat) dead code elimination late in the optimization process, rather than burden all the other steps with concerns about dead code.
Translation to SSA form is one of the compilation steps that may introduce dead code. Suppose that V is assigned and then used along each branch of an if...then...else... but that V is never used after the join point. The original assignments to V are live, but the assignment added by the φ-function is dead. Often, such dead φ-functions are useful, as in the equivalencing and redundancy elimination algorithms that are based on SSA form [5, 43]. One such use is shown in Figure 16. Although others have avoided placement of dead φ-functions in translating to SSA form [16, 52], we prefer to include the dead φ-functions to increase optimization opportunities.

There are many different definitions of dead code in the literature. Dead code is sometimes defined to be unreachable code and sometimes defined (as it is here) to be ineffectual code. In both cases, it is desirable to use the broadest possible definition, subject to the correctness condition that dead code can be safely removed.⁹ A procedural version of the definition used here, which is similar to that of faint code, is given below.
Fig. 16. On the left is an unoptimized program containing a dead φ-function that assigns to Y3. The value numbering technique in [5] can determine that Y3 and Z3 have the same value. Thus, Z3 and many of the computations that produce it can be eliminated. The dead φ-function is brought to life by using Y3 in place of Z3. [Diagram not reproduced.]
The marking process is iterative rather than recursive. Initially, all statements are marked dead. Statements are then marked live when they satisfy one of the conditions given below; because marking these statements live may cause others to be marked live, a worklist is maintained. When the worklist eventually empties, any statements that are still marked dead are indeed dead and can be safely removed from the code.

A statement is marked live iff at least one of the following conditions holds:

(1) The statement should be assumed to affect program output, such as an I/O statement, an assignment to a reference parameter, or a call to a routine that may have side effects.
(2) The statement is an assignment, and there are statements already marked live that use its result.
(3) The statement is a conditional branch, and there are statements already marked live that are control dependent on it.

Several published algorithms eliminate dead code in the narrower sense that requires every conditional branch to be marked live [30, 31, 38].¹⁰ Our algorithm, given in Figure 17, goes one step further in eliminating dead conditional branches.¹¹ (An unpublished algorithm by R. Paige does this also.)

¹⁰The more readily accessible [31] contains a typographical error; the technical report [30] should be consulted for the correct version.
Fig. 17. Dead code elimination.

    for each statement S do
        if S ∈ PreLive
            then Live(S) ← true
            else Live(S) ← false
    end
    WorkList ← PreLive
    while WorkList ≠ ∅ do
        take S from WorkList
        for each definition D that provides a value used by S do
            if Live(D) = false then do
                Live(D) ← true
                WorkList ← WorkList ∪ {D}
            end
        end
        for each block B ∈ CD⁻¹(Block(S)) do
            if Live(Last(B)) = false then do
                Live(Last(B)) ← true
                WorkList ← WorkList ∪ {Last(B)}
            end
        end
    end
    for each statement S do
        if Live(S) = false then delete S from Block(S)
    end

The following data structures are used:

Live(S) indicates whether statement S is live.

PreLive is the set of statements whose execution is initially assumed to affect the program output; statements that result in I/O or in side effects outside the scope being analyzed are typically included.

WorkList is a list of statements whose liveness has been recently discovered.

Block(S) is the basic block containing statement S.

Last(B) is the statement that terminates basic block B.

ipdom(B) is the basic block that immediately postdominates block B.
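The Figure-17 marking pass might be sketched as follows. The statement encoding, the example program, and the helper names are our own assumptions; note how the conditional branch s2 is never marked live because nothing live is control dependent on it.

```python
def dead_statements(stmts, prelive, cd_inv, last):
    # stmts: id -> {'block': b, 'defs': var-or-None, 'uses': [vars]}
    # prelive: ids assumed to affect output (condition (1))
    # cd_inv[b]: blocks on which block b is control dependent
    # last[b]: the conditional branch terminating block b, if any
    def_site = {s['defs']: sid for sid, s in stmts.items() if s['defs']}
    live = {sid: sid in prelive for sid in stmts}
    worklist = list(prelive)
    while worklist:
        sid = worklist.pop()
        needed = [def_site[v] for v in stmts[sid]['uses'] if v in def_site]
        needed += [last[b] for b in cd_inv[stmts[sid]['block']] if b in last]
        for t in needed:
            if not live[t]:
                live[t] = True
                worklist.append(t)
    return {sid for sid, l in live.items() if not l}

# A branch whose arms compute only an unused phi; one live output statement.
stmts = {
    's1': {'block': 'Entry', 'defs': 'y1', 'uses': []},
    's2': {'block': 'T',     'defs': None, 'uses': ['y1']},   # branch
    's3': {'block': 'B1',    'defs': 'x1', 'uses': []},
    's4': {'block': 'B2',    'defs': 'x2', 'uses': []},
    's5': {'block': 'J',     'defs': 'x3', 'uses': ['x1', 'x2']},  # phi
    's6': {'block': 'J',     'defs': None, 'uses': ['y1']},   # output
}
cd_inv = {'Entry': set(), 'T': set(), 'B1': {'T'}, 'B2': {'T'}, 'J': set()}
last = {'T': 's2'}
dead = dead_statements(stmts, {'s6'}, cd_inv, last)
```

Everything feeding the output statement survives; the branch, both arm assignments, and the φ-function are reported dead, exactly the "statements dead until proven live" behavior discussed below.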
CD⁻¹(B) is the set of nodes on which block B is control dependent; it is the inverse of the control dependence mapping CD computed in Figure 14.

Other optimizations (such as code motion) can leave empty blocks behind, so there is already ample reason for a late pass that removes empty blocks; the empty blocks left by dead code elimination can be removed then, along with any other empty blocks. The fact that statements are considered dead until marked live is crucial for condition (2): Statements that depend (transitively) on themselves are never marked live unless required by some other live statement. Condition (3) is handled by the loop over CD⁻¹(Block(S)), which marks live the branch terminating any block whose termination controls whether S executes.

7.2 Allocation by Coloring
At first, it might seem possible simply to map each new variable Vi back to the original V and to delete all of the φ-functions. However, the new variables introduced by translation to SSA form cannot always be eliminated, because optimization may have capitalized on the storage independence of the new variables. The useful persistence of the new variables is illustrated by the code motion example in Figure 18. The source program (Figure 18a) assigns to V twice. Translation to SSA form (Figure 18b) introduces separate variables for the separate purposes, and code motion can then hoist the invariant assignment out of the loop (Figure 18c). The dead assignment to V3 will be eliminated, but the variables V1 and V2 are simultaneously live in the optimized program. Thus, both variables are required: The original variable V cannot substitute for both renamed variables.

Any graph coloring algorithm [12, 13, 17, 18, 21] can be used to reduce the number of variables needed and thereby to remove most of the associated assignment statements. The choice of coloring technique should be guided by the eventual use of the output. If the goal is to produce readable source code, then it is desirable to consider each original variable V separately and to color just the variables derived from V. Coloring changes most of the φ-functions into identity assignments, that is, assignments of the form V ← V. These identity assignments can all be deleted.

Storage savings are especially noticeable for arrays. If optimization does not perturb the order of the first two statements in Figure 7, then two of the renamed arrays there can be assigned the same color and, hence, can share the same storage. Such an array is then assigned an Update from an identically colored array. Such operations can be implemented inexpensively by
¹¹A dead conditional branch is deleted by changing it to an unconditional branch to any one of its prior targets.
assigning to just one element of the actual storage. (The operation, a HiddenUpdate, is always implemented without branching.)

Fig. 18. (a) Source program. (b) After translation to SSA form. (c) After code motion. [Diagram not reproduced.]
8. ANALYTICAL AND EXPERIMENTAL RESULTS

The behavior of the dominance frontier mapping can be described analytically for simple classes of programs and experimentally for real programs.

THEOREM 4. If a control flow graph CFG is comprised of straight-line code, if-then-else constructs, and while-do constructs, then the dominance frontier of any node contains at most two nodes.

PROOF. We consider a top-down parse of a program using the grammar shown in Figure 19. Initially, we have a parse tree with a single (program) node and a control flow graph CFG with two nodes Entry and Exit. The initial dominance frontiers are DF(Entry) = ∅ = DF(Exit). For each production, we consider the associated changes to CFG and to the dominance frontiers of nodes. When a production expands a nonterminal parse tree node S, a new subgraph is inserted into CFG in place of S. In this new subgraph, each node corresponds to a symbol on the right-hand side of the production. We will show that applying any of the productions preserves the following invariants:

(1) Each CFG node S corresponding to an unexpanded (statement) symbol has at most one node in its dominance frontier.
(2) Each CFG node corresponding to a terminal symbol has at most two nodes in its dominance frontier.

Fig. 19. Grammar for the simple structured programs of Theorem 4. [Grammar not reproduced.]
(1) This production adds a CFG node S and CFG edges Entry → S → Exit, yielding DF(S) = {Exit}.

(2) When this production is applied, S is replaced by two nodes S1 and S2. Edges that previously entered S now enter S1, and edges that previously left S now leave S2. A single edge is inserted from S1 to S2. In the dominator tree, S1 and S2 dominate all nodes that S previously dominated; additionally, S1 dominates S2. Thus, we have DF(S1) = DF(S2) = DF(S).

(3) When this production is applied, S is replaced by nodes T_if, S_then, S_else, and T_endif. Edges that previously entered S now enter T_if; edges that previously left S now leave T_endif. Edges are inserted from T_if to both S_then and S_else; edges are also inserted from S_then and S_else to T_endif. In the dominator tree, T_if and T_endif both dominate all nodes that were previously dominated by S. Additionally, T_if dominates S_then, S_else, and T_endif. By the argument made for production (2), we have DF(T_if) = DF(S) = DF(T_endif). Now consider nodes S_then and S_else. From the definition of a dominance frontier, we obtain DF(S_then) = DF(S_else) = {T_endif}.

(4) When this production is applied, S is replaced by nodes T_while and S_do. All edges previously associated with S are now associated with node T_while. Edges are inserted from T_while to S_do and from S_do to T_while. Node T_while dominates all nodes that were dominated by node S. Additionally, T_while dominates S_do. Thus, we have DF(T_while) = DF(S) ∪ {T_while} and DF(S_do) = {T_while}.
(5) After this production is applied, the new flow graph is isomorphic to the old flow graph, so all dominance frontiers are unchanged.
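The bound of the theorem can be checked mechanically on a concrete structured CFG. The sketch below is our own illustration (not the paper's code): it builds the CFG that the parse would produce for a while loop whose body is an if-then-else, computes dominators iteratively and dominance frontiers directly from the definition, and verifies that no frontier exceeds two nodes.

```python
# CFG for: while (..) do { if (..) then S_then else S_else }
edges = {
    "Entry":   ["T_while"],
    "T_while": ["T_if", "Exit"],
    "T_if":    ["S_then", "S_else"],
    "S_then":  ["T_endif"],
    "S_else":  ["T_endif"],
    "T_endif": ["T_while"],
    "Exit":    [],
}

def dominators(succ, entry):
    # Iterative data flow: dom(n) = {n} ∪ ⋂ dom(p) over predecessors p.
    preds = {n: [] for n in succ}
    for u in succ:
        for v in succ[u]:
            preds[v].append(u)
    dom = {n: set(succ) for n in succ}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == entry or not preds[n]:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds

dom, preds = dominators(edges, "Entry")

# DF(X) = { Y : X dominates a predecessor of Y, but X does not
#           strictly dominate Y }
df = {n: set() for n in edges}
for y in edges:
    for p in preds[y]:
        for x in edges:
            if x in dom[p] and not (x != y and x in dom[y]):
                df[x].add(y)

assert all(len(df[n]) <= 2 for n in edges)    # the bound of Theorem 5
assert df["T_while"] == {"T_while"}           # loop header: in its own frontier
assert df["S_then"] == df["S_else"] == {"T_endif"}
```

Note how the results match productions (3) and (4): the branch arms have frontier {T_endif}, and the loop header appears in its own frontier.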
COROLLARY 2. If a program is comprised of straight-line code, if-then-else, and while-do constructs, then each node of its control dependence graph is control dependent on at most two nodes.

PROOF. Consider such a program P and its associated control flow graph CFG.
The reverse control flow graph RCFG is then itself a structured control flow graph.
Table I. Statements per procedure for the test suites

    Min   Median   Max   Description
     22       89   327   Dense matrix eigenvectors
      9       54   351   Flow past an airfoil
      8       43   753   Circuit simulation
      8       55   753   All 221 FORTRAN procedures
For all nodes Y in RCFG, the dominance frontier DF(Y) computed in RCFG contains at most two nodes by Theorem 5. Since a node is control dependent on exactly the nodes in its dominance frontier in RCFG (Section 6), Y is control dependent on at most two nodes.
Unfortunately, these results do not hold for all control structures. In particular, consider a nest of n repeat-until loops. The entrance of each loop leads to a dominance frontier that includes the entrances of the surrounding loops, so the total size of the dominance frontier mapping is O(n^2); yet the computation of φ-functions actually uses only O(n) of these dominance frontier nodes, since each variable needs at most one φ-function at each loop entrance. The full dominance frontier mapping is thus larger than what φ-placement actually needs. We therefore wish to measure the number of dominance frontier nodes actually used, as a function of program size, over a diverse set of programs.

We implemented our algorithms for constructing dominance frontiers and placing φ-functions in a local library system, which already offered the required control flow and data flow analysis [2]. We ran these algorithms on 61 procedures from Eispack [46] and 160 procedures from two Perfect benchmarks [39]. Some summary statistics of these procedures are shown in Table I. These FORTRAN programs were chosen because they contain irreducible intervals and other unstructured constructs.

As the plot in Figure 20 shows, the size of the dominance frontier mapping appears to vary linearly with program size. The ratio of these sizes ranged from 0.6 (the Entry node has an empty dominance frontier) to 2.1. For the programs we tested, the plot in Figure 21 shows that the number of φ-functions is also linear in the size of the original program. The ratio of these sizes ranged from 0.5 to 5.2. The largest ratio occurred for a procedure of only 12 statements, and 95 percent of the procedures had a ratio under 2.3. All but one of the remaining procedures contained fewer than 60 statements. Finally, the plot in Figure 22 shows that the size of the control dependence graph is linear in the size of the original program. The ratio of these sizes ranged from 0.6 to 2.4, which is very close to the range of ratios for dominance frontiers. The ratio avrgDF (defined by (7) in Section 5.1) measures the cost of placing φ-functions relative to the number of assignments in the resulting program.
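The quadratic behavior of repeat-until nests is easy to reproduce. The following sketch is ours (the CFG shape assumed for the nest is our modeling choice, not the paper's figure): it computes dominance frontiers from the definition and checks that their total size grows quadratically in the nesting depth n, while a variable assigned only in the innermost body needs φ-functions at just the n loop entrances.

```python
def repeat_until_nest(n):
    # Assumed shape: Entry -> E1 -> ... -> En -> B (innermost body) -> Tn;
    # each test Ti branches back to Ei (repeat) or out to T(i-1) (until),
    # with T1 falling out to Exit.
    succ = {"Entry": ["E1"], "B": [f"T{n}"], "Exit": []}
    for i in range(1, n + 1):
        succ[f"E{i}"] = [f"E{i+1}"] if i < n else ["B"]
        succ[f"T{i}"] = [f"E{i}", f"T{i-1}" if i > 1 else "Exit"]
    return succ

def dominators(succ, entry):
    preds = {n: [] for n in succ}
    for u in succ:
        for v in succ[u]:
            preds[v].append(u)
    dom = {n: set(succ) for n in succ}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == entry or not preds[n]:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds

def frontiers(succ, entry):
    dom, preds = dominators(succ, entry)
    df = {n: set() for n in succ}
    for y in succ:
        for p in preds[y]:
            for x in succ:
                if x in dom[p] and not (x != y and x in dom[y]):
                    df[x].add(y)
    return df

n = 8
df = frontiers(repeat_until_nest(n), "Entry")
total = sum(len(s) for s in df.values())
assert total >= n * (n + 1) // 2   # mapping size grows quadratically in n
# A variable assigned only in body B gets phi-functions at the iterated
# dominance frontier of {B}: exactly the n loop entrances, i.e. O(n).
assert df["B"] == {f"E{i}" for i in range(1, n + 1)}
```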
Fig. 20. Size of dominance frontier mapping versus number of program statements.
Fig. 21. Number of φ-functions versus number of program statements.
This ratio varied from 1 to 2, with median 1.3. We also measured the expansion A_tot / A_orig in the number of assignments when translating to SSA form. This ratio varied from 1.3 to 3.8. Finally, we measured the expansion M_tot / M_orig in the number of mentions (assignments
Fig. 22. Size of control dependence graph versus number of program statements.
or uses) when translating to SSA form. This ratio varied from 1.6 to 9.6. There was no correlation of these ratios with program size.

9.1 Complexity

Translation to SSA form proceeds in three steps:
(1) The dominance frontier mapping is constructed from CFG (Section 4.2). Let CFG have N nodes and E edges; mapping nodes to their dominance frontiers requires computing the dominator tree and then the frontiers themselves, in time proportional to the size of the resulting mapping.

(2) Using the dominance frontiers, the locations of the φ-functions for each variable in the original program are determined (Section 5.1). Let A_tot be the total number of assignments to variables in the resulting program, where each ordinary assignment statement LHS ← RHS contributes the length of the tuple LHS to A_tot, and each φ-function contributes 1 to A_tot. Placing φ-functions contributes O(A_tot × avrgDF) to the overall time, where avrgDF is the weighted average (7) of the sizes |DF(X)|.

(3) Variables are renamed (Section 5.2). Let M_tot be the total number of mentions of variables in the resulting program. Renaming contributes O(M_tot) to the overall time.
To state the time bounds in terms of the original program, let R be the maximum of N nodes, E edges, A_orig original assignments to variables, and M_orig original mentions of variables. In the worst case, avrgDF = Ω(N) = Ω(R), and the placement can require Ω(R^2) insertions of φ-functions; thus A_tot = Ω(R^2) at worst. In the worst case, a φ-function has Ω(R) operands.
Thus, M_tot = Ω(R^3) at worst. The one-parameter worst-case time bounds are thus O(R^2) for finding dominance frontiers and O(R^3) for translation to SSA form. However, the data in Section 8 suggest that the entire translation to SSA form will be effectively linear in practice: the dominance frontier of each node in CFG is small, avrgDF is effectively constant, and so is the number of φ-functions added for each variable. In effect, A_tot = O(A_orig) and M_tot = O(M_orig), and the entire translation process is effectively O(R). Control dependences are read off from the dominance frontiers RDF in the reverse graph RCFG (Section 6) in time linear in the size of the output. The entire control dependence calculation is likewise linear in the size of the output; the only quadratic behavior is caused by the output being Ω(R^2) in the worst case. The data in Section 8 suggest that the control dependence calculation is also effectively O(R).
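Step (2) above can be sketched as a worklist over dominance frontiers; its cost is proportional to the number of assignments processed times the frontier sizes visited, which is the O(A_tot × avrgDF) term. The sketch below is our own illustration of the placement step for a single variable (node names are hypothetical), assuming the DF mapping from step (1) has already been computed.

```python
def place_phis(df, assignment_nodes):
    """Return the set of nodes needing a phi-function for one variable.
    df: node -> dominance frontier (a set of nodes).
    assignment_nodes: nodes containing ordinary assignments to the variable."""
    has_phi = set()
    work = list(assignment_nodes)
    seen = set(work)
    while work:
        x = work.pop()
        for y in df[x]:                  # one step of the iterated frontier
            if y not in has_phi:
                has_phi.add(y)
                if y not in seen:        # a phi is itself an assignment,
                    seen.add(y)          # so its node must be processed too
                    work.append(y)
    return has_phi

# Two branches A and B assign the variable; their frontiers meet at join J.
df = {"Entry": set(), "A": {"J"}, "B": {"J"}, "J": set()}
print(place_phis(df, {"A", "B"}))  # {'J'}
```

Each node enters the worklist at most once, so the work done is bounded by the sum of the visited frontier sizes.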
9.2 Related Work
SSA form is a refinement of Shapiro and Saint's [45] notion of a pseudoassignment. The minimal pseudoassignment nodes for V are exactly the nodes that need φ-functions for V. A closer precursor [22] of SSA form associated new names for V with pseudoassignment nodes and inserted explicit assignments from one new name to another. Without φ-functions, however, it was difficult to manage the new names or reason about the flow of values.

Suppose the control flow graph CFG has N nodes and E edges for a program with Q variables. One algorithm [41] requires O(E α(E, N)) bit vector operations (where each vector is of length Q) to find all of the pseudoassignments. A simpler algorithm [43] for reducible programs computes SSA form in time O(E × Q). With lengths of bit vectors taken into account, both of these algorithms are essentially O(R^2) on programs of size R, and the simpler algorithm sometimes inserts extraneous φ-functions. The method presented here is O(R^3) at worst, but Section 8 gives evidence that it is O(R) in practice. The earlier O(R^2) algorithms have no provision for running faster in typical cases; they appear to be intrinsically quadratic.

For CFG with N nodes and E edges, previous general control dependence algorithms [24] can take time quadratic in (N + E). This analysis is based on the worst-case O(N) depth of the (post)dominator tree [24, p. 326]. Section 6 shows that control dependence can be determined by computing dominance frontiers in the reverse graph RCFG. In general, our approach can also take quadratic time, but the only quadratic behavior is caused by the output being Ω(N^2) in the worst case. In particular, suppose a program is comprised only of straight-line code, if-then-else, and while-do constructs. By Corollary 2, our algorithm computes control dependence in linear time. We obtain a better time bound for such programs because our algorithm is based on dominance frontiers, whose sizes are not necessarily related to the depth of
the dominator tree. For languages that offer only these constructs, control dependence can also be computed from the parse tree [28] in linear time, but our algorithm is more robust: it handles all cases in quadratic time and typical cases in linear time.
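The reduction of control dependence to dominance frontiers on the reverse graph can be demonstrated concretely. The sketch below is our own illustration on a diamond-shaped CFG (the helper routines compute dominators iteratively and frontiers from the definition): after reversing the edges and treating Exit as the entry, each node's frontier is exactly the set of nodes it is control dependent on.

```python
def dominators(succ, entry):
    preds = {n: [] for n in succ}
    for u in succ:
        for v in succ[u]:
            preds[v].append(u)
    dom = {n: set(succ) for n in succ}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in succ:
            if n == entry or not preds[n]:
                continue
            new = {n} | set.intersection(*(dom[p] for p in preds[n]))
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds

def frontiers(succ, entry):
    dom, preds = dominators(succ, entry)
    df = {n: set() for n in succ}
    for y in succ:
        for p in preds[y]:
            for x in succ:
                if x in dom[p] and not (x != y and x in dom[y]):
                    df[x].add(y)
    return df

def reverse(succ):
    rsucc = {n: [] for n in succ}
    for u in succ:
        for v in succ[u]:
            rsucc[v].append(u)   # edge u -> v becomes v -> u
    return rsucc

# Diamond CFG: T_if branches to S_then / S_else, which rejoin at T_endif.
succ = {"Entry": ["T_if"], "T_if": ["S_then", "S_else"],
        "S_then": ["T_endif"], "S_else": ["T_endif"],
        "T_endif": ["Exit"], "Exit": []}

# Dominance frontiers in RCFG (entered at Exit): node Y is control
# dependent on exactly the nodes in rdf[Y].
rdf = frontiers(reverse(succ), "Exit")
assert rdf["S_then"] == {"T_if"} and rdf["S_else"] == {"T_if"}
assert rdf["T_endif"] == set()   # the join is not control dependent on T_if
```

Both branch arms come out control dependent on the test, while the join node (which executes regardless of the branch taken) depends on nothing, as expected.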
9.3 Conclusions
Previous work has shown that SSA form and control dependence can support powerful code optimization algorithms that are highly efficient, with time and space bounds based on the size of the program after translation to these forms. We have shown that this translation can be performed efficiently, that it leads to only a moderate increase in program size, and that applying the early steps of the SSA translation to the reverse graph is an efficient way to compute control dependence.
ACKNOWLEDGMENTS
We thank Fran Allen, Avery, Padget, Bob Paige, and others for helpful discussions, and the referees for their reports on this paper.
REFERENCES

1. AHO, A. V., SETHI, R., AND ULLMAN, J. D. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Mass., 1986.
2. ALLEN, F. E., BURKE, M., CHARLES, P., CYTRON, R., AND FERRANTE, J. An overview of the PTRAN analysis system for multiprocessing. J. Parallel Distrib. Comput. 5 (Oct. 1988), 617-640.
3. ALLEN, J. R. Dependence analysis for subscripted variables and its application to program transformations. Ph.D. thesis, Department of Computer Science, Rice Univ., Houston, Tex., Apr. 1983.
4. ALLEN, J. R., AND JOHNSON, S. Compiling C for vectorization, parallelization and inline expansion. In Proceedings of the SIGPLAN 88 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 23, 7 (June 1988), 241-249.
5. ALPERN, B., WEGMAN, M. N., AND ZADECK, F. K. Detecting equality of values in programs. In Conference Record of the 15th ACM Symposium on Principles of Programming Languages (Jan. 1988). ACM, New York, pp. 1-11.
6. BALLANCE, R. A., MACCABE, A. B., AND OTTENSTEIN, K. J. The program dependence web: A representation supporting control-, data-, and demand-driven interpretation of imperative languages. In Proceedings of the SIGPLAN 90 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 25, 6 (June 1990).
7. BANNING, J. B. An efficient way to find the side effects of procedure calls and the aliases of variables. In Conference Record of the 6th ACM Symposium on Principles of Programming Languages (Jan. 1979). ACM, New York, pp. 29-41.
8. BARTH, J. M. An interprocedural data flow analysis algorithm. In Conference Record of the 4th ACM Symposium on Principles of Programming Languages (Jan. 1977). ACM, New York, pp. 119-131.
9. BURKE, M. An interval-based approach to exhaustive and incremental interprocedural data flow analysis. ACM Trans. Program. Lang. Syst. 12, 3 (July 1990).
10. BURKE, M., AND CYTRON, R. Interprocedural dependence analysis and parallelization. In Proceedings of the SIGPLAN 86 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 21, 7 (July 1986), 162-175.
11. CARTWRIGHT, R., AND FELLEISEN, M. The semantics of program dependence. In Proceedings of the SIGPLAN 89 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 24, 7 (June 1989).
12. CHAITIN, G. J. Register allocation and spilling via graph coloring. In Proceedings of the SIGPLAN 82 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 17, 6 (June 1982), 98-105.
13. CHAITIN, G. J., AUSLANDER, M. A., CHANDRA, A. K., COCKE, J., HOPKINS, M. E., AND MARKSTEIN, P. W. Register allocation via coloring. Comput. Lang. 6 (1981), 47-57.
14. CHASE, D. R. Safety considerations for storage allocation optimizations. In Proceedings of the SIGPLAN 88 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 23, 7 (June 1988), 1-10.
15. CHASE, D. R., WEGMAN, M., AND ZADECK, F. K. Analysis of pointers and structures. In Proceedings of the SIGPLAN 90 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 25, 6 (June 1990), 296-310.
16. CHOI, J., CYTRON, R., AND FERRANTE, J. Automatic construction of sparse data flow evaluation graphs. In Conference Record of the 18th ACM Symposium on Principles of Programming Languages (Jan. 1991). ACM, New York, pp. 55-66.
17. CHOW, F. C. A portable machine-independent global optimizer: Design and measurements. Ph.D. thesis and Tech. Rep. 83-254, Computer Systems Laboratory, Stanford Univ., Stanford, Calif., Dec. 1983.
18. CHOW, F. C., AND HENNESSY, J. L. The priority-based coloring approach to register allocation. ACM Trans. Program. Lang. Syst. 12, 4 (Oct. 1990), 501-536.
19. COOPER, K. D. Interprocedural data flow analysis in a programming environment. Ph.D. thesis, Dept. of Mathematical Sciences, Rice Univ., Houston, Tex., 1983.
20. CYTRON, R., AND FERRANTE, J. An improved control dependence algorithm. Tech. Rep. RC
27. HORWITZ, S., PFEIFFER, P., AND REPS, T. Dependence analysis for pointer variables. In Proceedings of the SIGPLAN 89 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 24, 7 (June 1989).
28. HORWITZ, S., PRINS, J., AND REPS, T. Integrating non-interfering versions of programs. ACM Trans. Program. Lang. Syst. 11, 3 (July 1989), 345-387.
29. JONES, N. D., AND MUCHNICK, S. S. Flow analysis and optimization of LISP-like structures. In Program Flow Analysis, S. S. Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cliffs, N.J., 1981, chap. 4, pp. 102-131.
30. KENNEDY, K. W. Global dead computation elimination. Tech. Rep. SETL Newsl. 111, Courant Institute of Mathematical Sciences, New York Univ., New York, N.Y., Aug. 1973.
31. KENNEDY, K. W. A survey of data flow analysis techniques. In Program Flow Analysis, S. S. Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cliffs, N.J., 1981.
32. KUCK, D. J. The Structure of Computers and Computations. Wiley, New York, 1978.
33. LARUS, J. R. Restructuring symbolic programs for concurrent execution on multiprocessors. Tech. Rep. UCB/CSD 89/502, Computer Science Dept., Univ. of California at Berkeley, Berkeley, Calif., May 1989.
34. LARUS, J. R., AND HILFINGER, P. N. Detecting conflicts between structure accesses. In Proceedings of the SIGPLAN 88 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 23, 7 (July 1988), 21-34.
35. LENGAUER, T., AND TARJAN, R. E. A fast algorithm for finding dominators in a flowgraph. ACM Trans. Program. Lang. Syst. 1, 1 (July 1979), 121-141.
36. MUCHNICK, S. S., AND JONES, N. D., EDS. Program Flow Analysis. Prentice-Hall, Englewood Cliffs, N.J., 1981.
37. MYERS, E. W. A precise interprocedural data flow algorithm. In Conference Record of the 8th ACM Symposium on Principles of Programming Languages (Jan. 1981). ACM, New York, pp. 219-230.
38. OTTENSTEIN, K. J. Data-flow graphs as an intermediate form. Ph.D. thesis, Dept. of Computer Science, Purdue Univ., W. Lafayette, Ind., Aug. 1978.
39. POINTER, L. Perfect report: 1. Tech. Rep. CSRD 896, Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Urbana, Ill., July 1989.
40. REIF, J. H., AND LEWIS, H. R. Efficient symbolic analysis of programs. J. Comput. Syst. Sci. 32, 3 (June 1986), 280-313.
41. REIF, J. H., AND TARJAN, R. E. Symbolic program analysis in almost linear time. SIAM J. Comput. 11, 1 (Feb. 1982), 81-93.
42. ROSEN, B. K. Data flow analysis for procedural languages. J. ACM 26, 2 (Apr. 1979), 322-344.
43. ROSEN, B. K., WEGMAN, M. N., AND ZADECK, F. K. Global value numbers and redundant computations. In Conference Record of the 15th ACM Symposium on Principles of Programming Languages (Jan. 1988). ACM, New York, pp. 12-27.
44. RUGGIERI, C., AND MURTAGH, T. P. Lifetime analysis of dynamically allocated objects. In Conference Record of the 15th ACM Symposium on Principles of Programming Languages (Jan. 1988). ACM, New York, pp. 285-293.
45. SHAPIRO, R. M., AND SAINT, H. The representation of algorithms. Tech. Rep. CA-7002-1432, Massachusetts Computer Associates, Feb. 1970.
46. SMITH, B. T., BOYLE, J. M., DONGARRA, J. J., GARBOW, B. S., IKEBE, Y., KLEMA, V. C., AND MOLER, C. B. Matrix Eigensystem Routines: Eispack Guide. Springer-Verlag, New York, 1976.
47. TARJAN, R. E. Finding dominators in directed graphs. SIAM J. Comput. 3, 1 (1974), 62-89.
48. WEGBREIT, B. Property extraction in well-founded property sets. IEEE Trans. Softw. Eng. SE-1, 3 (Sept. 1975), 270-285.
49. WEGMAN, M. N., AND ZADECK, F. K. Constant propagation with conditional branches. In Conference Record of the 12th ACM Symposium on Principles of Programming Languages (Jan. 1985). ACM, New York, pp. 291-299.
50. WEGMAN, M. N., AND ZADECK, F. K. Constant propagation with conditional branches. ACM Trans. Program. Lang. Syst. To be published.
51. WOLFE, M. J. Optimizing supercompilers for supercomputers. Ph.D. thesis, Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign, Urbana, Ill., 1982.
52. YANG, W., HORWITZ, S., AND REPS, T. Detecting program components with equivalent behaviors. Tech. Rep. 840, Dept. of Computer Science, Univ. of Wisconsin at Madison, Madison, Wis., Apr. 1989.

Received July 1989; revised March 1991; accepted March 1991