
Efficiently Computing Static Single Assignment Form and the Control Dependence Graph

RON CYTRON, JEANNE FERRANTE, BARRY K. ROSEN, and MARK N. WEGMAN, IBM Research Division, and F. KENNETH ZADECK, Brown University

In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point that advanced optimization features become undesirable. Recently, static single assignment form and the control dependence graph have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications. We also give analytical and experimental evidence that all of these data structures are usually linear in the size of the original program. This paper thus presents strong evidence that these structures can be of practical use in optimization.

Categories and Subject Descriptors: D.3.3 [Programming Languages]: Language Constructs - control structures; data types and structures; procedures, functions and subroutines; D.3.4 [Programming Languages]: Processors - compilers; optimization; I.1.2 [Algebraic Manipulation]: Algorithms - analysis of algorithms; I.2.2 [Artificial Intelligence]: Automatic Programming - program transformation

A preliminary version of this paper, "An Efficient Method of Computing Static Single Assignment Form," appeared in the Conference Record of the 16th ACM Symposium on Principles of Programming Languages (Jan. 1989). F. K. Zadeck's work on this paper was partially supported by the IBM Corporation, the Office of Naval Research, and the Defense Advanced Research Projects Agency under contract N00014-83K-0146 and ARPA order 6320, Amendment 1. Authors' addresses: R. Cytron, J. Ferrante, and M. N. Wegman, Computer Sciences Department, IBM Research Division, P.O. Box 704, Yorktown Heights, NY 10598; B. K. Rosen, Mathematical Sciences Department, IBM Research Division, P.O. Box 218, Yorktown Heights, NY 10598; F. K. Zadeck, Computer Science Department, P.O. Box 1910, Brown University, Providence, RI 02912. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. (c) 1991 ACM 0164-0925/91/1000-0451 $01.50
ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991, Pages 451-490.

General Terms: Algorithms, Languages

Additional Key Words and Phrases: Control dependence, control flow graph, def-use chain, dominator, optimizing compilers

1. INTRODUCTION

In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point where advanced optimization features become undesirable. Recently, static single assignment (SSA) form [5, 43] and the control dependence graph [24] have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use [4]. We present new algorithms that efficiently compute these data structures for arbitrary control flow graphs. The algorithms use dominance frontiers, a new concept that may have other applications.

We also give analytical and experimental evidence that the sum of the sizes of all dominance frontiers is usually linear in the size of the original program. This paper thus presents strong evidence that SSA form and control dependence can be of practical use in optimization.

Figure 1 illustrates the role of SSA form in a compiler. The intermediate code is put into SSA form, optimized in various ways, and then translated back out of SSA form. Optimizations that can benefit from using SSA form include code motion [22] and elimination of partial redundancies [43], as well as the constant propagation discussed later in this section. Variants of SSA form have been used for detecting program equivalence [5, 52] and for increasing parallelism in imperative programs [21].

The representation of simple data flow information (def-use chains) may be made more compact through SSA form. If a variable has D definitions and U uses, then there can be D × U def-use chains. When similar information is encoded in SSA form, there can be at most E def-use chains, where E is the number of edges in the control flow graph [40]. Moreover, the def-use information encoded in SSA form can be updated easily when optimizations are applied. This is important for a constant propagation algorithm that deletes branches to code proven at compile time to be unexecutable [50]. Specifically, the def-use information is just a list, for each variable, of the places in the program text that use the variable. Every use of V is indeed a use of the value provided by the (unique) assignment to V.

To see the intuition behind SSA form, it is helpful to begin with straight-line code. Each assignment to a variable is given a unique name (shown as a subscript in Figure 2), and all of the uses reached by that assignment are renamed to match the assignment's new name. Most programs, however, have branch and join nodes. At the join nodes, we add a special form of assignment called a φ-function.
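In modern terms, the renaming step for straight-line code can be sketched as a single pass that numbers each assignment and points every use at the latest version. This is only an illustration in Python; the function name and the statement encoding are ours, not the paper's.

```python
# Sketch: rename straight-line code into single assignment form.
# A statement is (target, [operand variables]); constants are omitted.
def rename_straightline(stmts):
    count = {}      # variable -> number of versions created so far
    current = {}    # variable -> name of the current version
    out = []
    for target, operands in stmts:
        # Uses are renamed to the most recent version of each operand.
        ops = [current.get(v, v + "0") for v in operands]
        # The assignment itself creates a fresh version of its target.
        count[target] = count.get(target, 0) + 1
        name = "%s%d" % (target, count[target])
        current[target] = name
        out.append((name, ops))
    return out

# The straight-line example of Figure 2: V <- 4; <- V + 5; V <- 6; <- V + 7
# ("t" stands for the anonymous targets of the two uses).
prog = [("V", []), ("t", ["V"]), ("V", []), ("t", ["V"])]
```

Running `rename_straightline(prog)` yields versions V1, t1, V2, t2, with each use reached by exactly one assignment, as in Figure 2.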

[Figure 1: the original intermediate code is translated into SSA form P0, transformed through optimized SSA programs P1, P2, ..., Pn, and translated back to optimized intermediate code.]

Fig. 1. Vertical arrows represent translation to/from static single assignment form. Horizontal arrows represent optimizations.

V ← 4            V1 ← 4
  ← V + 5          ← V1 + 5
V ← 6            V2 ← 6
  ← V + 7          ← V2 + 7

Fig. 2. Straight-line code and its single assignment version.

if P                            if P
  then V ← 4                      then V1 ← 4
  else V ← 6                      else V2 ← 6
                                V3 ← φ(V1, V2)
/* Use V several times. */      /* Use V3 several times. */

Fig. 3. if-then-else and its single assignment version.

In Figure 3, the operands to the φ-function indicate which assignments to V reach the join point. Subsequent uses of V become uses of V3. The old variable V is thus replaced by new variables V1, V2, V3, ..., and each use of a variable Vi is reached by just one assignment to Vi. Indeed, there is only one assignment to Vi in the entire program. This simplifies the record keeping for several optimizations. An especially clear example is constant propagation based on SSA form [50]; the next subsection sketches this application.

1.1 Constant Propagation

Figure 4 is a more elaborate version of Figure 3. The else branch includes a test of Q, and propagating the constant true to this use of Q tells a compiler that the else branch never falls through to the join point. If Q is the constant true, then all uses of V after the join point are really uses of the constant 4. Such possibilities can be taken into account without SSA form, but the processing is either costly [48] or difficult to understand [49]. With SSA form, the algorithm is fast and simple. Initially, the algorithm assumes that each edge is unexecutable (i.e., never followed at run time) and that each variable is constant with an as-yet unknown value (denoted ⊤). Worklists are initialized appropriately, and the assumptions are corrected until they stabilize. Suppose the algorithm finds that variable P is not constant (denoted ⊥) and, hence, that either branch of the outer conditional may be taken.

if P                        if P
  then do                     then do
    V ← 4                       V1 ← 4
  end                         end
  else do                     else do
    V ← 6                       V2 ← 6
    if Q then return            if Q then return
  end                         end
... ← V + 5                 V3 ← φ(V1, V2)
                            ... ← V3 + 5

Fig. 4. Example of constant propagation.

The outedges of the test of P are marked executable, and the statements they reach are processed. When V1 ← 4 is processed, the assumption about V1 is changed from ⊤ to 4, and all uses of V1 are notified that they are uses of 4. (Each use of V1 is indeed a use of this value, thanks to the single assignment property.) In particular, the φ-function combines 4 with V2. The second operand of the φ-function, however, is associated with the inedge of the join point that corresponds to falling through the else branch. So long as this edge is considered unexecutable, the algorithm uses ⊤ for the second operand, no matter what is currently assumed about V2. Combining 4 with ⊤ yields 4, so uses of V3 are still tentatively assumed to be uses of 4. Eventually, the assumption about Q may change from ⊤ to a known constant or to ⊥. A change to either false or ⊥ would lead to the discovery that the second inedge of the join point can be followed, and then the 6 at V2 combines with the 4 at V1 to yield ⊥ at V3. A change to true would have no effect on the assumption at V3. Traditional constant propagation algorithms, on the other hand, would see that assignments of two different constants to V seem to reach all uses after the join point. They would thus decide that V is not constant at these uses.

The algorithm just sketched is linear in the size of the SSA program it sees [50], but this size might be nonlinear in the size of the original program. In particular, it would be safe but inefficient to place a φ-function for every variable at every join point. This paper shows how to obtain SSA form efficiently. When φ-functions are placed carefully, nonlinear behavior is still possible but is unlikely in practice. The size of the SSA program is typically linear in the size of the original program; the time to do the work is essentially linear in the SSA size.
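The three-level value lattice used by this algorithm (⊤ for as-yet unknown, a constant, ⊥ for not constant) and the treatment of φ-operands on unexecutable edges can be sketched as follows; the helper names are ours, not the paper's.

```python
# Sketch of the constant propagation lattice: TOP (unknown value),
# a constant, or BOTTOM (not constant).
TOP, BOTTOM = "T", "_|_"

def meet(a, b):
    if a == TOP:
        return b
    if b == TOP:
        return a
    return a if a == b else BOTTOM

def phi_value(operands, executable):
    # Operand j flows in along inedge j; operands on edges still
    # assumed unexecutable contribute nothing (i.e., TOP).
    val = TOP
    for v, ex in zip(operands, executable):
        if ex:
            val = meet(val, v)
    return val
```

For V3 ← φ(V1, V2) of Figure 3 with V1 = 4: while the else inedge is unexecutable, `phi_value([4, 6], [True, False])` is still 4; once both edges are executable, 4 meets 6 and V3 drops to ⊥.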

1.2 Where to Place φ-Functions

At first glance, careful placement might seem to require the enumeration of pairs of assignment statements for each variable. Checking whether there are two assignments to V that reach a common point might seem to be intrinsically nonlinear. In fact, however, it is enough to look at the dominance frontier of each node in the control flow graph. Leaving the technicalities to later sections, we sketch the method here.

Suppose that a variable V has just one assignment in the original program, so that any use of V will be either a use of the value V0 at entry to the program or a use of the value V1 from the most recent execution of the assignment to V. Let X be the basic block of code that assigns to V, so X will determine the value of V when control flows along any edge X → Y. When entered along X → Y, the code in Y will always see V1 and be unaffected by V0. If Y ≠ X, but all paths to Y must still go through X (in which case X is said to strictly dominate Y), then the code in Y will see V1. Indeed, any node strictly dominated by X will always see V1, no matter how far from X it may be. Eventually, however, control may be able to reach a node Z not strictly dominated by X. Suppose Z is the first such node on a path, so that Z sees V1 along one inedge but may see V0 along another inedge. Then Z is said to be in the dominance frontier of X and is clearly in need of a φ-function for V. In general, no matter how many assignments to V may appear in the original program and no matter how complex the control flow may be, we can place φ-functions for V by finding the dominance frontier of every node that assigns to V, then the dominance frontier of every node where a φ-function has already been placed, and so on.

The same concept of dominance frontiers used for computing SSA form can also be used to compute control dependences [20, 24], which identify the branch conditions affecting statement execution. Informally, a statement is control dependent on a branch if one edge from the branch definitely causes that statement to execute while another edge can cause the statement to be skipped. Such information is vital for detection of parallelism [2], program optimization, and program analysis [28].
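The placement rule just sketched, taking the dominance frontier of every node that assigns to V, then of every node that received a φ-function, and so on, is a simple worklist computation once the dominance frontiers DF(X) are available. A sketch, assuming DF has been precomputed (the details appear in later sections):

```python
# Sketch: place phi-functions for one variable V in the iterated
# dominance frontier of the nodes that assign to V.
def place_phis(assign_nodes, DF):
    work = list(assign_nodes)
    has_phi = set()
    while work:
        x = work.pop()
        for z in DF.get(x, ()):
            if z not in has_phi:
                has_phi.add(z)
                work.append(z)  # a new phi is itself an assignment to V
    return has_phi

# Diamond CFG: A branches to B and C, which join at D, so
# DF(B) = DF(C) = {D}. Assignments to V in B and C put a phi at D.
DF = {"B": {"D"}, "C": {"D"}}
```

Each node enters the worklist at most once, so the work is proportional to the total size of the dominance frontiers visited.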

1.3 Outline of the Rest of the Paper

Section 2 reviews the representation of control flow by a directed graph. Section 3 explains SSA form and sketches how to construct it. This section also considers variants of SSA form as defined here. Our algorithm can be adjusted to deal with these variants. Section 4 reviews the dominator tree concept and formalizes dominance frontiers. Then we show how to compute SSA form (Section 5) and the control dependence graph (Section 6) efficiently. Section 7 explains how to translate out of SSA form. Section 8 shows that our algorithms are linear in the size of programs restricted to certain control structures. We also give evidence of general linear behavior by reporting on experiments with FORTRAN programs. Section 9 summarizes the algorithms and time bounds, compares our technique with other techniques, and presents some conclusions.

2. CONTROL FLOW GRAPHS

The statements of a program are organized into (not necessarily maximal) basic blocks, where program flow enters a basic block at its first statement and leaves the basic block at its last statement [1, 36]. Basic blocks are indicated by the column of numbers in parentheses in Figure 5. A control flow graph is a directed graph whose nodes are the basic blocks of a program and two additional nodes, Entry and Exit. There is an edge from Entry to

I ← 1                  (1)
J ← 1                  (1)
K ← 1                  (1)
L ← 1                  (1)
repeat                 (2)
  if (P)               (2)
    then do            (3)
      J ← I            (3)
      if (Q)           (3)
        then L ← 2     (4)
        else L ← 3     (5)
      K ← K + 1        (6)
    end                (6)
    else K ← K + 2     (7)
  print (I, J, K, L)   (8)
  repeat               (9)
    if (R)             (9)
      then L ← L + 4   (10)
  until (S)            (11)
  I ← I + 6            (12)
until (T)              (12)

Fig. 5. Simple program and its control flow graph. (The graph itself, with nodes Entry, 1 through 12, and Exit, is not reproduced here.)

any basic block at which the program can be entered, and there is an edge to Exit from any basic block that can exit the program. For reasons related to the representation of control dependence and explained in Section 6, there is also an edge from Entry to Exit. The other edges of the graph represent transfers of control (jumps) between the basic blocks. We assume that each node is on a path from Entry and on a path to Exit. For each node X, a successor of X is any node Y with an edge X → Y in the graph; Succ(X) is the set of all successors of X, and similarly for predecessors. A node with more than one successor is a branch node; a node with more than one predecessor is a join node. Finally, each variable is considered to have an assignment in Entry to represent whatever value the variable may have when the program is entered. This assignment is treated just like the ones that appear explicitly in the code. Throughout this paper, CFG denotes the control flow graph of the program under discussion.

For any nonnegative integer J, a path of length J in CFG consists of a sequence of J + 1 nodes (denoted X0, ..., XJ) and a sequence of J edges (denoted e1, ..., eJ) such that ej runs from Xj-1 to Xj for all j with 1 ≤ j ≤ J. (We write ej: Xj-1 → Xj.) As is usual with sequences, one item (node or edge) may occur several times. The null path, with J = 0, is


allowed. We write p: X0 →* XJ for an unrestricted path p, but p: X0 →+ XJ if p is known to be nonnull. Paths p: X0 →* XJ and q: Y0 →* YK are said to converge at a node Z if

(1) X0 ≠ Y0;
(2) XJ = Z = YK;
(3) (Xj = Yk) implies (j = J or k = K).

Intuitively, the paths p and q start at different nodes and come together at the end. The paths are almost node-disjoint, but the use of or rather than and in (3) is deliberate. One of the paths may happen to be a cycle Z →+ Z, but we still need to consider the possibility of convergence.
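Conditions (1) through (3) can be transcribed directly into a small checker over explicit node sequences. This is only an illustration; the quadratic pair scan is written for clarity, not efficiency.

```python
# Paths are given as node sequences p = [X0, ..., XJ] and q = [Y0, ..., YK].
def converge_at(p, q, z):
    J, K = len(p) - 1, len(q) - 1
    if p[0] == q[0]:                 # (1) the paths start at different nodes
        return False
    if not (p[J] == z == q[K]):      # (2) both paths end at Z
        return False
    for j in range(J + 1):           # (3) any other meeting is at an end
        for k in range(K + 1):
            if p[j] == q[k] and not (j == J or k == K):
                return False
    return True
```

For example, `converge_at(["Z", "A", "Z"], ["Y", "Z"], "Z")` holds, illustrating that one of the converging paths may itself be a cycle through Z.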

3. STATIC SINGLE ASSIGNMENT FORM

This section initially assumes that programs have been expanded to an intermediate form in which each statement evaluates some expressions and uses the results either to determine a branch or to assign to some variables. (Other program constructs are considered in Section 3.1.) An assignment statement A has the form LHS(A) ← RHS(A), where the left-hand side LHS(A) is a tuple of distinct target variables (U, V, ...) and the right-hand side RHS(A) is a tuple of expressions, with the same length as the LHS tuple. Each target variable in LHS(A) is assigned the corresponding value from RHS(A). In the examples already discussed, all tuples are 1-tuples, and there is no need to distinguish between V and (V). Section 3.1 sketches the use of longer tuples in expanding other program constructs (such as procedure calls) so as to make explicit the variables used and/or changed by the statements in the source program. Such explicitness is a prerequisite for most optimization anyway. The only novelty in our use of tuples is the fact that they provide a simple and uniform way to fold the results of analysis into the intermediate text. Practical concerns about the size of the intermediate text are addressed in Section 3.1.

Translating a program into SSA form is a two-step process. In the first step, some trivial φ-functions V ← φ(V, V, ...) are inserted at some of the join nodes in the control flow graph. In the second step, new variables Vi (for i = 0, 1, 2, ...) are generated. Each mention of a variable V in the program is replaced by a mention of one of the new variables Vi, where a mention may be in a branch expression or on either side of an assignment statement. Throughout this paper, an assignment statement may be either an ordinary assignment or a φ-function. Figure 6 displays the result of translating a simple program to SSA form.

A φ-function at entrance to a node X has the form V ← φ(R, S, ...), where V, R, S, ... are variables. The number of operands R, S, ... is the number of control flow predecessors of X. The predecessors of X are listed in some arbitrary fixed order, and the jth operand of φ is associated with the jth

(The left column of the figure repeats the program of Figure 5; the right column is its SSA form:)

I1 ← 1
J1 ← 1
K1 ← 1
L1 ← 1
repeat
  I2 ← φ(I3, I1)
  J2 ← φ(J4, J1)
  K2 ← φ(K5, K1)
  L2 ← φ(L9, L1)
  if (P)
    then do
      J3 ← I2
      if (Q)
        then L3 ← 2
        else L4 ← 3
      L5 ← φ(L3, L4)
      K3 ← K2 + 1
    end
    else K4 ← K2 + 2
  J4 ← φ(J3, J2)
  K5 ← φ(K3, K4)
  L6 ← φ(L5, L2)
  print (I2, J4, K5, L6)
  repeat
    L7 ← φ(L9, L6)
    if (R)
      then L8 ← L7 + 4
    L9 ← φ(L8, L7)
  until (S)
  I3 ← I2 + 6
until (T)

Fig. 6. Simple program and its SSA form.

predecessor. If control reaches X from its jth predecessor, then the run-time support1 remembers j while executing the φ-functions in X. The value of φ(R, S, ...) is just the value of the jth operand. Each execution of a φ-function uses only one of the operands, but which one depends on the flow of control just before entering X. Any φ-functions in X are executed before the ordinary statements in X. Some variants of φ-functions as defined here are useful for special purposes. For example, each φ-function can be tagged

1When SSA form is used only as an intermediate form (Figure 1), there is no need actually to provide this support. For us, the semantics of φ are only important when assessing the correctness of intermediate steps in a sequence of program transformations beginning and ending with code that has no φ-functions. Others [6, 11, 52] have found it useful to give φ another parameter that incidentally encodes j, under various restrictions on the control flow.

with the node X where it appears [5]. When the control flow of a language is suitably restricted, each φ-function can be tagged with information about conditionals or loops [5, 6, 11, 52]. The algorithms in this paper apply to such variants of SSA form as well.

SSA form may be considered as a property of a single program or as a relation between two programs. A single program is defined to be in SSA form if each variable is a target of exactly one assignment statement in the program text. Translation to SSA form replaces the original program by a new program with the same control flow graph. For every original variable V, the following conditions are required of the new program:

(1) If two nonnull paths X →+ Z and Y →+ Z converge at a node Z, and nodes X and Y contain assignments to V (in the original program), then a trivial φ-function V ← φ(V, ..., V) has been inserted at Z (in the new program).

(2) Each mention of V in the original program or in an inserted φ-function has been replaced by a mention of a new variable Vi, leaving the new program in SSA form.

(3) Along any control flow path, consider any use of a variable V (in the original program) and the corresponding use of Vi (in the new program). Then V and Vi have the same value.
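Condition (3) relies on the operational reading of φ given earlier: on entry from the jth predecessor, every φ-function in the node selects its jth operand, and all φ-functions execute before the node's ordinary statements. A sketch of that rule (the helper name and encoding are ours):

```python
# phis: list of (target, [operand names]); j: index of the predecessor
# along which control entered; env: current variable values.
def execute_phis(phis, j, env):
    # All phi-functions read their operands simultaneously, before any
    # of their targets are written.
    values = {target: env[ops[j]] for target, ops in phis}
    env.update(values)
    return env
```

The simultaneous read matters when φ targets feed each other: executing a2 ← φ(b1, ...) and b2 ← φ(a1, ...) on entry from predecessor 0 exchanges the two values rather than copying one twice.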
Minimal SSA form is translation to SSA form with the proviso that the number of φ-functions inserted is as small as possible, subject to Condition (1) above. The optimizations that depend on SSA form are still valid if there are some extraneous φ-functions beyond those that would appear in minimal SSA form. However, extraneous φ-functions can cause information to be lost, and they always add unnecessary overhead to the optimization process itself. Thus, it is important to place φ-functions only where they are required. One variant of SSA form [52] would sometimes forego placing a φ-function at a convergence point Z, so long as there are no more uses for V in or after Z. The φ-function could then be omitted without any risk of losing Condition (3). This has been called pruned SSA form [16], and it is sometimes preferable to our placement of φ-functions at all convergence points. However, as Figure 16 in Section 7 illustrates, our form is sometimes preferable to pruned form. When desired, pruned SSA form can be obtained by a simple adjustment of our algorithm [16, Section 5.1].

For any variable V, the nodes at which we should insert φ-functions in the original program can be defined recursively by Condition (1) in the definition of SSA form. A node Z needs a φ-function for V if Z is a convergence point for two paths that originate at two different nodes, both containing assignments to V or needing φ-functions for V. Nonrecursively, a node Z needs a φ-function for V because Z is a convergence point for two nonnull paths X →+ Z and Y →+ Z that start at nodes X and Y already containing assignments to V. If Z did not already contain an assignment to V, then the φ-function inserted at Z adds Z to the set of nodes that contain assignments to V. With more nodes to consider as origins of paths, we may observe more nodes appearing as convergence points of

nonnull paths originating at nodes with assignments to V. The same set of nodes needing φ-functions could thus be found by iterating the observation/insertion cycle.2 The algorithm presented here obtains the same end results in much less time.

3.1 Other Program Constructs

If the source program computes expressions using constants and scalar variables to assign values to scalar variables, then it is straightforward to derive an intermediate text that is ready to be put into SSA form (or otherwise analyzed and transformed by an optimizing compiler).

However,

source programs commonly use other constructs as well: Some variables are not scalars, and some computations do not explicitly indicate which variables they use or change. This section sketches how to map some important constructs to an explicit intermediate text suitable for use in a compiler.

3.1.1 Arrays. It will suffice to consider one-dimensional arrays here. Arrays of higher dimension involve more notation but no more concerns. If A and B are array variables, then array assignment statements like A ← B or A ← 0, if allowed by the source language, can be treated just like assignments to scalar variables. Many of the mentions of A in the source program, however, will be mentions of A(i) for some index i, which may be an integer variable. Treating A(i) as a variable would be awkward, because A(i) may be the same variable as A(j), and because the value of A(i) could be changed by assigning to i rather than to A(i). An easier approach is illustrated in Figure 7. The entire

array is treated like a single scalar variable, which may be one of the operands of Access or Update.3 The expression Access(A, i) evaluates to the ith component of A; the expression Update(A, j, V) evaluates to an array value that is of the same size as A and has the same component values, except for V as the value of the jth component. Assigning a scalar value V to A(j) is equivalent to assigning an array value to the entire array A, where the new array value depends on the old one, as well as on the index j and on the new scalar value V. The translation to SSA form is unconcerned with whether the values of variables are large objects or what the operators mean. As with scalars, translation of array references to SSA form removes some

anti- and output-dependences [32]. In the program in Figure 7a, dependence analysis may prohibit reordering the use of A(i) by the first statement and the definition of A(j) by the second statement. After translation to SSA form, the two references to A have been renamed, and reordering is then possible. For example, the two statements can execute concurrently. Where optimization does not reorder the two references, Section 7.2 describes how transla-

2In typical cases, paths originating at φ-functions add nothing beyond the contributions from paths originating at ordinary assignments. The straightforward iteration is still too slow, even in typical cases.
3This notation is similar to Select and Update [24], which, in turn, is similar to notation for referencing aggregate structures in a data flow language [23].

  ← A(i)            ← Access(A, i)           ← Access(A8, i7)
A(j) ← V           A ← Update(A, j, V)      A9 ← Update(A8, j1, V1)
  ← A(k) + 2       T ← Access(A, k)         T1 ← Access(A9, k1)
                     ← T + 2                  ← T1 + 2
     (a)                  (b)                      (c)

Fig. 7. Source code with array component references (a) is equivalent to code with explicit Access and Update operators that treat the array A just like a scalar (b). Transformation to SSA form proceeds as usual (c).
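The semantics of Access and Update can be modeled directly by pure functions on array values. The copy below models the meaning only; a real compiler would not generate a copy. (Python sketch; the concrete values are ours.)

```python
def Access(A, i):
    # The ith component of the array value A.
    return A[i]

def Update(A, j, v):
    # A new array value, equal to A except that component j is v.
    B = list(A)
    B[j] = v
    return B

# Assigning to A(j) becomes an assignment of a whole new array value:
A0 = [10, 20, 30]
A1 = Update(A0, 1, 99)
```

A1 differs from A0 only at index 1, and A0 itself is untouched, which is why the renamed versions A8 and A9 in Figure 7c can be treated as distinct single-assignment values.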

tion out of SSA form reclaims storage that would otherwise be necessary to maintain distinct variables.

Consider the loop shown in Figure 8a, which assigns to every component of array A. The value assigned to each component A(i) does not depend (even indirectly) on any of the values previously assigned to components of A. In effect, the whole loop is like an assignment of the form A ← (...), where A is not used in (...). Any assignment to a component of A that is not used before entering such an initialization loop is dead code that should be eliminated. With or without SSA form, the Update operator in Figure 8b makes A appear to be live at entry to the loop.
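The liveness effect described here is easy to see mechanically: because Update mentions the old array on its right-hand side, a naive scan of the loop body of Figure 8b finds a use of the array's entry value A0. (The statement encoding below is ours, not the paper's.)

```python
# Each statement is (target, [right-hand-side variables]).
loop_body = [
    ("i2", ["i1", "i3"]),   # i2 <- phi(i1, i3)
    ("A1", ["A0", "A2"]),   # A1 <- phi(A0, A2)
    ("A2", ["A1", "i2"]),   # A2 <- Update(A1, i2, i2)
    ("i3", ["i2"]),         # i3 <- i2 + 1
]

def rhs_uses(stmts):
    used = set()
    for _, ops in stmts:
        used.update(ops)
    return used
```

Since A0 is in `rhs_uses(loop_body)`, the whole array appears live at loop entry even though every component is overwritten; HiddenUpdate in Figure 8c removes exactly that spurious use.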

One reasonable response to the crudeness of the Update operator is to accept it. Address calculations and other genuine scalar calculations can still be optimized extensively. Another response is to perform dependence analysis [3, 10, 32, 51], which can sometimes determine that no execution of the assignment to A(i) in Figure 8a uses values produced by any other assignment, as is usually the case for initialization of arrays. The problem, for us or for anyone who uses Update, is to communicate some results of dependence analysis, formulated in terms of accesses of A, to optimizations (like dead code elimination) that are formulated in terms of scalar variables. A simple solution to this formal problem is shown in Figure 8c, where the HiddenUpdate operator does not mention the assigned array operand. The actual code generated for an assignment from a HiddenUpdate expression is exactly the same as for an assignment from the corresponding Update expression, where the hidden operand is supplied by the target of the assignment.
3.1.2 Structures. A structure can be generally regarded as an array, where references to structure fields are treated as references to elements of the array. Thus, an assignment to a structure field is translated into an Update of the structure, and a use of a structure field is translated into an Access of the structure. In the prevalent case of simple structure field references, this treatment results in arrays whose elements are indexed by constants. Dependence analysis can often determine independence among such accesses, so that optimization may move an assignment to one field far from an assignment to another field. If analysis or language semantics reveals a structure whose n fields are always accessed disjointly, then the

integer A(1:100)       integer A0(1:100)           integer A0(1:100)
                       integer A1(1:100)           integer A1(1:100)
                       integer A2(1:100)           integer A2(1:100)
i ← 1                  i1 ← 1                      i1 ← 1
repeat                 repeat                      repeat
  A(i) ← i               i2 ← φ(i1, i3)              i2 ← φ(i1, i3)
  i ← i + 1              A1 ← φ(A0, A2)              A1 ← φ(A0, A2)
until i > 100            A2 ← Update(A1, i2, i2)     A2 ← HiddenUpdate(i2, i2)
                         i3 ← i2 + 1                 i3 ← i2 + 1
                       until i3 > 100              until i3 > 100
      (a)                    (b)                         (c)

Fig. 8. Source loop with array assignment (a) is equivalent to code with an Update operator that treats the array A just like a scalar (b). As Section 7.2 explains, the eventual translation out of SSA form will leave just one array here. Using HiddenUpdate (c) is a purely formal way to summarize some results of dependence analysis, if available.

structure can be decomposed into n distinct variables. The elements of such structures are united in the source program only for organizational reasons, and the decomposition makes the actual use of the structure apparent. Translation of such programs to SSA form makes them more amenable to subsequent optimization.


implicit Implicit References to Variables.

The and referenced,

construction used by a statement

of

SSA may

form In or use

knowing to those

those variables

variables explicitly

modified

a statement.

variables not mentioned by the statement itself. Examples of such references are global variables modified or used by a procedure call, variables, and dereferenced pointer variables. To obtain SSA form,

aliased

we must account for implicit as well as explicit references, either by conservative assumptions or by analysis. Heap storage can be conservatively modeled by representing the entire heap as a single variable that is both modified and used by any statement that may change the heap. More refined modeling is also possible [15], but the conservative approach is already strong enough to support optimization of code that does not involve affect the heap but is interspersed with heap-dependent code. For any statement S, three types of references form: (1) (2) (3) MustMod(S) is the set of variables is the set of variables that that whose
must

translation

into

SSA

be modified be modified prior

by execution by execution to execution

of of of S

s.
MayMod(S) may

s.
May Use(S) is the set of variables may be used by S. values

We represent an assignment
ACM Transactions

implicit references for any statement statement A, where all the variables
on Programmmg Languages and Systems, Vol,

S by transformation to in MayMod(S) appear


1991

13, No, 4, October

Static Single Assignment Form and the Control Dependence Graph in An LHS( A) and appear called all in compiler (directly the RHS( may variables A). or may not have access to the of their in May Use(S) u

463

( ikfcz~illod(s) bodies effects.

MustMod(S)) optimizing procedures

of all If no

or indirectly)

or to summaries

specific information is available, it is customary sumptions. A call might change any global variable by reference, will be performed. To determine but it is reasonable Such assumptions aliasing to assume limit and that not change. the extent to extract effects

to make conservative asor any parameter passed variables that more local to the caller can variinformation: see [14, 15, result are is that small. is at all. Both own transformations detailed aliasing, the usual and analysis, been The on global

Techniques parameter

are available

of procedures pointer

ables, see [7-9], [19], [37], and [42]; to determine 27, 29, 33, 34], and [44]. When there Small often a sophisticated side analysis and technique the tuples directly. are few tuples effects Many (both

is applied, LHS Sophisticated

RHS)

can be represented

however,

unavailable.

compilers

do no interprocedural

analysis analyzed. compilers

Consider a call to an external procedure that has not tuples for the call must contain all global variables. representation structure plus that a direct of the tuples includes a flag representation can still of the be compact. that few local (set to indicate

The representation all globals that

caln be a tuple

are in the tuple) are in the

variables

because of parameter transmission. The cal analyses of optimization techniques

intuitive (with

explanations and theoretior without SSA form) are compact representations

conveniently formulated in terms of explicit can still be used in the implementations. 3.2 Overview of the SSA Algorithm to minimal SSA form

tuples;
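The MustMod/MayMod/MayUse classification and the tuple transformation can be sketched in code. The Python fragment below is a hypothetical illustration, not an API from the paper: the CallEffects class and its fields are invented for the example. It encodes the conservative assumptions for an unanalyzed call (any global or by-reference parameter may be modified, nothing must be) and builds the assignment tuples with MayMod(S) on the LHS and MayUse(S) ∪ (MayMod(S) − MustMod(S)) on the RHS.

```python
from dataclasses import dataclass

@dataclass
class CallEffects:
    """Hypothetical summary of a call site's visible variables."""
    globals_: frozenset       # global variables visible at the call
    ref_params: frozenset     # actuals passed by reference
    value_params: frozenset   # actuals passed by value

    def may_mod(self):
        # Conservative: any global or by-reference parameter may change;
        # other locals of the caller may not.
        return self.globals_ | self.ref_params

    def must_mod(self):
        # Without interprocedural analysis, no modification is certain.
        return frozenset()

    def may_use(self):
        return self.globals_ | self.ref_params | self.value_params

def as_assignment(effects):
    """LHS gets MayMod(S); RHS gets MayUse(S) | (MayMod(S) - MustMod(S))."""
    lhs = sorted(effects.may_mod())
    rhs = sorted(effects.may_use() | (effects.may_mod() - effects.must_mod()))
    return lhs, rhs
```

With this sketch, a call that can see globals g1 and g2, one by-reference parameter p, and one by-value parameter x is modeled as the assignment (g1, g2, p) ← (g1, g2, p, x).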

3.2 Overview of the SSA Algorithm

Translation to minimal SSA form is done in three steps:

(1) The dominance frontier mapping is constructed from the control flow graph (Section 4.2).
(2) Using the dominance frontiers, the locations of the φ-functions for each variable in the original program are determined (Section 5.1).
(3) The variables are renamed (Section 5.2) by replacing each mention of an original variable V by an appropriate mention of a new variable Vi.

4. DOMINANCE

Section 4.1 reviews the dominance relation [47] between nodes in the control flow graph and how to summarize this relation in a dominator tree. Section 4.2 introduces the dominance frontier mapping and gives an algorithm for its computation.

4.1 Dominator Trees

Let X and Y be nodes in the control flow graph CFG of a program. If X appears on every path from Entry to Y, then X dominates Y. Domination is both reflexive and transitive. If X dominates Y and X ≠ Y, then X strictly
[Figure 9: the control flow graph of the simple program and its dominator tree, with each node X annotated as described in the caption.]

Fig. 9. Control flow graph and dominator tree of the simple program. The sets of nodes listed in ( ) and [ ] brackets summarize the dominance frontier calculation from Section 4.2. Each node X is annotated with two sets [DFlocal(X) | DF(X)] and a third set (DFup(X)).

dominates Y. In formulas, we write X ≫ Y for strict domination and X ≥ Y for domination; if X does not strictly dominate Y, we write X !≫ Y. The immediate dominator of Y (denoted idom(Y)) is the closest strict dominator of Y on any path from Entry to Y. In a dominator tree, the children of a node X are all immediately dominated by X. The root of a dominator tree is Entry, and any node Y other than Entry has idom(Y) as its parent in the tree. The dominator tree for CFG from Figure 5 is shown in Figure 9.

Let N and E be the numbers of nodes and edges in CFG. The dominator tree can be constructed in O(Eα(E, N)) time [35] or (by a more difficult algorithm) in O(E) time [26]. For all practical purposes, α(E, N) is a small constant,⁴ so this paper will consider the dominator tree to have been found in linear time. The dominator tree of CFG has exactly the same set of nodes as CFG but a very different set of edges. Here, the words predecessor, successor, and path always refer to CFG. The words parent, child, ancestor, and descendant always refer to the dominator tree.

⁴Under the definition of α used in analyzing the dominator tree algorithm [35, p. 123], N ≤ E implies that α(E, N) = 1 when log₂N < 16 and α(E, N) = 2 when 16 ≤ log₂N < 2¹⁶.
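Dominators can also be computed by a much simpler method than the near-linear algorithms cited above. The sketch below uses the classic iterative data-flow formulation — Dom(Entry) = {Entry}, and Dom(Y) = {Y} ∪ ⋂ Dom(P) over the predecessors P of Y — which is quadratic in the worst case but easy to check by hand. The graph encoding and function names are assumptions of this sketch, not constructs from the paper.

```python
def dominators(succ, entry):
    """Map each node to its set of dominators by iterating to a fixed point.

    `succ` maps every node (including Exit) to its list of CFG successors.
    """
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for x, ys in succ.items():
        for y in ys:
            pred[y].add(x)
    dom = {n: set(nodes) for n in nodes}   # start from the top element
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            # Dom(n) = {n} union the intersection over predecessors.
            new = {n} | set.intersection(*(dom[p] for p in pred[n])) if pred[n] else {n}
            if new != dom[n]:
                dom[n], changed = new, True
    return dom

def idoms(dom, entry):
    """idom(Y): the strict dominator of Y dominated by all its other strict
    dominators (the strict dominators of Y form a chain; pick the deepest).
    Assumes every node except `entry` is reachable from `entry`."""
    result = {}
    for n, ds in dom.items():
        if n == entry:
            continue
        strict = ds - {n}
        result[n] = next(d for d in strict if strict <= dom[d])
    return result
```

On a diamond-shaped CFG (Entry → A, A → {B, C}, B → D, C → D, D → Exit), this yields idom(B) = idom(C) = idom(D) = A, matching the chain structure described above.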

4.2 Dominance Frontiers

The dominance frontier DF(X) of a CFG node X is the set of all CFG nodes Y such that X dominates a predecessor of Y but does not strictly dominate Y:

    DF(X) = {Y | (∃P ∈ Pred(Y)) (X ≥ P and X !≫ Y)}.

Computing DF(X) directly from the definition would require searching much of the dominator tree. The total time to compute DF(X) for all nodes X would be quadratic, even when the sets themselves are small. To compute the dominance frontier mapping in time linear in the size ΣX |DF(X)| of the mapping, we define two intermediate sets DFlocal and DFup for each node, such that the following equation holds:

    DF(X) = DFlocal(X) ∪ ⋃ over Z ∈ Children(X) of DFup(Z).    (4)

Given any node X, some of the successors of X may contribute to DF(X). This local contribution DFlocal(X) is defined by

    DFlocal(X) =def {Y ∈ Succ(X) | X !≫ Y}.

Given any node Z that is not the root Entry of the dominator tree, some of the nodes in DF(Z) may contribute to DF(idom(Z)). The contribution DFup(Z) that Z passes up to idom(Z) is defined by

    DFup(Z) =def {Y ∈ DF(Z) | idom(Z) !≫ Y}.

LEMMA 1. The dominance frontier equation (4) is correct.

PROOF. Because dominance is reflexive, DFlocal(X) ⊆ DF(X). Because dominance is transitive, each child Z of X has DFup(Z) ⊆ DF(X). We must still show that everything in DF(X) has been accounted for. Suppose Y ∈ DF(X), and let U → Y be an edge such that X dominates U but does not strictly dominate Y. If U = X, then Y ∈ DFlocal(X), and we are done. If U ≠ X, on the other hand, then there is a child Z of X that dominates U. This Z cannot strictly dominate Y, because X does not. Thus Y ∈ DFup(Z). □

The intermediate sets can be computed with simple equality tests, as follows:

LEMMA 2. For any node X,

    DFlocal(X) = {Y ∈ Succ(X) | idom(Y) ≠ X}.

PROOF. We assume that Y ∈ Succ(X) and show that (X !≫ Y) ⟺ (idom(Y) ≠ X). The ⇐ part is true because the immediate dominator is defined to be a strict dominator. For the ⇒ part, suppose that X strictly dominates Y and, hence, that some child V of X dominates Y. Then V appears on any path from Entry to Y. One such path goes to X and then follows the edge X → Y, so either V = Y or V dominates X. But the child V cannot dominate X, so V = Y and idom(Y) = idom(V) = X. □

LEMMA 3. For any node X and any child Z of X in the dominator tree,

    DFup(Z) = {Y ∈ DF(Z) | idom(Y) ≠ X}.

PROOF. We assume that Y ∈ DF(Z) and show that (X !≫ Y) ⟺ (idom(Y) ≠ X). The ⇐ part is true because strict dominance is the transitive closure of immediate dominance. For the ⇒ part, suppose that X strictly dominates Y and, hence, that some child V of X dominates Y. Choose a predecessor U of Y such that Z dominates U. Then V appears on any path from Entry to Y. One such path goes to U and then follows the edge U → Y, so either V = Y or V dominates U. If V = Y, then idom(Y) = idom(V) = X, and we are done. Suppose V ≠ Y (and, hence, that V dominates U) and derive a contradiction. Only one child of X can dominate U, so V = Z and Z strictly dominates Y. This contradicts the hypothesis that Y ∈ DF(Z). □

These results imply the correctness of the dominance frontier algorithm in Figure 10. The /*local*/ line effectively computes DFlocal(X) on the fly and uses it in (4) without needing to devote storage to it. The /*up*/ line is similar for DFup(Z). We traverse the dominator tree bottom-up, visiting each node X only after visiting each of its children. To illustrate the working of this algorithm, we have annotated the dominator tree in Figure 9 with the information in [ ] and ( ) brackets.

    for each X in a bottom-up traversal of the dominator tree do
        DF(X) ← ∅
        for each Y ∈ Succ(X) do                 /*local*/
            if idom(Y) ≠ X then DF(X) ← DF(X) ∪ {Y}
        end
        for each Z ∈ Children(X) do
            for each Y ∈ DF(Z) do               /*up*/
                if idom(Y) ≠ X then DF(X) ← DF(X) ∪ {Y}
            end
        end
    end

Fig. 10. Calculation of DF(X) for each CFG node X.

THEOREM 1. The algorithm in Figure 10 is correct.

PROOF. Direct from the preceding lemmas. □

Let CFG have N nodes and E edges. The loop over Succ(X) in Figure 10 examines each edge just once, so all executions of the /*local*/ line complete in time O(E). Similarly, all executions of the /*up*/ line complete in time O(size(DF)). The overall time is thus O(E + size(DF)), which amounts to a worst-case complexity of O(E + N²). However, Section 8 shows that, in practice, the size of the DF mapping is usually linear. We have implemented this algorithm in the PTRAN compiler [2] and have observed that it is faster than standard data-flow computations.

4.3 Relating Dominance Frontiers to Joins

We start by stating more formally the nonrecursive characterization of where the φ-functions should be located. Given a set 𝒮 of CFG nodes, the set J(𝒮) of join nodes is defined to be the set of all nodes Z such that there are two nonnull CFG paths that start at two distinct nodes in 𝒮 and converge at Z. The iterated join J⁺(𝒮) is the limit of the increasing sequence of sets of nodes

    J1 = J(𝒮);
    Ji+1 = J(𝒮 ∪ Ji).

In particular, if 𝒮 happens to be the set of assignment nodes for a variable V, then J⁺(𝒮) is the set of φ-function nodes for V. The join and iterated join operations map sets of nodes to sets of nodes. We extend the dominance frontier mapping from nodes to sets of nodes in the natural way: DF(𝒮) = ⋃ over X ∈ 𝒮 of DF(X). As with join, the iterated dominance frontier DF⁺(𝒮) is the limit of the increasing sequence of sets of nodes

    DF1 = DF(𝒮);
    DFi+1 = DF(𝒮 ∪ DFi).

The actual computation of DF⁺(𝒮) is performed by the efficient worklist algorithm in Section 5.1; the formulation here is convenient for relating iterated dominance frontiers to iterated joins. If the set 𝒮 is the set of assignment nodes for a variable V, then we will show that

    J⁺(𝒮) = DF⁺(𝒮)

(this equation depends on the fact that Entry is in 𝒮) and, hence, that the location of the φ-functions for V can be computed by the worklist algorithm for computing DF⁺(𝒮) that is given in Section 5.1. The following lemmas do most of the work by relating dominance frontiers to joins:
LEMMA 4. For any nonnull path p: X →+ Z in CFG, there is a node X′ ∈ {X} ∪ DF⁺({X}) on p that dominates Z. Moreover, unless X dominates every node on p, the node X′ can be chosen in DF⁺({X}).

PROOF. If X dominates every node on p, then we just choose X′ = X to get all the claimed properties of X′. We may therefore assume that some of the nodes on p are not dominated by X. Let the sequence of nodes on p be X = X0, ..., XJ = Z. For the smallest i such that X does not dominate Xi, the predecessor Xi−1 is dominated by X and so puts Xi into DF(X). Thus, there are choices of j with Xj ∈ DF⁺({X}). Consider X′ = Xj for the largest j with Xj ∈ DF⁺({X}). We will show that X′ ≥ Z. Suppose not. Then j < J, and there is a first k with j < k ≤ J such that X′ does not dominate Xk. The predecessor Xk−1 is dominated by X′ and so puts Xk into DF(X′). Thus, Xk ∈ DF(DF⁺({X})) ⊆ DF⁺({X}), contradicting the choice of j. □

LEMMA 5. Let X ≠ Y be two nodes in CFG, and suppose that nonnull paths p: X →+ Z and q: Y →+ Z in CFG converge at Z. Then Z ∈ DF⁺({X}) ∪ DF⁺({Y}).

PROOF. We consider three cases. In the first two cases, we prove that Z ∈ DF⁺({X}) ∪ DF⁺({Y}). Then we show that the first two cases are (unobviously) exhaustive because the third case leads to a contradiction. Let X′ be from Lemma 4 for the path p, with the sequence of nodes X = X0, ..., XJ = Z. Let Y′ be from Lemma 4 for the path q, with the sequence of nodes Y = Y0, ..., YK = Z.

Case 1. We suppose that X′ is on q and show that Z ∈ DF⁺({X}). By the definition of convergence (specifically, (3) in Section 2), X′ = Z. We may now assume that Z = X′ = X and that X dominates every node on p. (Otherwise, Lemma 4 already asserts that Z ∈ DF⁺({X}).) Because X dominates the predecessor XJ−1 of Z but does not strictly dominate Z, we have Z ∈ DF(X) ⊆ DF⁺({X}).

Case 2. We suppose that Y′ is on p and show that Z ∈ DF⁺({Y}), reasoning just as in Case 1.

Case 3. We derive a contradiction from the suppositions that X′ is not on q and Y′ is not on p. Because X′ ≥ Z but X′ is not on q, X′ ≫ YK = Z and, therefore, X′ dominates all predecessors of YK. In particular, X′ ≥ YK−1. But X′ ≠ YK−1, so X′ ≫ YK−1, and we continue inductively to show that X′ ≫ Yk for all k; in particular, X′ ≫ Y′, since Y′ is one of the Yk. On the other hand, by similar reasoning from the supposition that Y′ ≥ Z but Y′ is not on p, we can show that Y′ ≫ X′. Two nodes cannot strictly dominate each other, so Case 3 is impossible. □

LEMMA 6. For any set 𝒮 of CFG nodes, J(𝒮) ⊆ DF⁺(𝒮).

PROOF. We apply Lemma 5 to the two converging paths that witness each join node. □

LEMMA 7. For any set 𝒮 of CFG nodes such that Entry ∈ 𝒮, DF(𝒮) ⊆ J(𝒮).

PROOF. Consider any X ∈ 𝒮 and any Y ∈ DF(X). There is a path from X to Y where all the nodes before Y are dominated by X. There is also a path from Entry to Y where none of the nodes are dominated by X. The paths therefore converge at Y. □

THEOREM 2. The set of nodes that need φ-functions for any variable V is the iterated dominance frontier DF⁺(𝒮), where 𝒮 is the set of nodes with assignments to V.

PROOF. By Lemma 6 and induction on i in the definition of J⁺, we can show that J⁺(𝒮) ⊆ DF⁺(𝒮). The induction step is as follows:

    Ji+1 = J(𝒮 ∪ Ji) ⊆ J(𝒮 ∪ DF⁺(𝒮)) ⊆ DF⁺(𝒮 ∪ DF⁺(𝒮)) = DF⁺(𝒮).    (5)

The node Entry is in 𝒮, so Lemma 7 and another induction yield DF⁺(𝒮) ⊆ J⁺(𝒮). The induction step is as follows:

    DFi+1 = DF(𝒮 ∪ DFi) ⊆ DF(𝒮 ∪ J⁺(𝒮)) ⊆ J(𝒮 ∪ J⁺(𝒮)) = J⁺(𝒮).    (6)

The set of nodes that need φ-functions for V is precisely J⁺(𝒮), so (5) and (6) prove the theorem. □
5. CONSTRUCTION OF MINIMAL SSA FORM

5.1 Using Dominance Frontiers to Find Where φ-Functions Are Needed

The algorithm in Figure 11 inserts trivial φ-functions. The outer loop of this algorithm is performed once for each variable V in the program. Several data structures are used:

W is the worklist of CFG nodes being processed. In each iteration of the outer loop, W is initialized to the set 𝒜(V) of nodes that contain assignments to V. Each node X in the worklist ensures that each node Y in DF(X) receives a φ-function. Each iteration terminates when the worklist becomes empty.

Work(*) is an array of flags, one flag Work(X) for each node X, where Work(X) indicates whether X has ever been added to W during the current iteration of the outer loop.

HasAlready(*) is an array of flags, one for each node, where HasAlready(X) indicates whether a φ-function for V has already been inserted at X.

The flags Work(X) and HasAlready(X) are independent: we need two flags because the property of assigning to V is independent of the property of needing a φ-function for V. The flags could have been implemented with just the values true and false, but this would require additional record keeping to reset any true flags between iterations without the expense of looping over all the nodes. It is simpler to devote an integer to each flag and to test flags by comparing them with the current iteration count.

    IterCount ← 0
    for each node X do
        HasAlready(X) ← 0
        Work(X) ← 0
    end
    W ← ∅
    for each variable V do
        IterCount ← IterCount + 1
        for each X ∈ 𝒜(V) do
            Work(X) ← IterCount
            W ← W ∪ {X}
        end
        while W ≠ ∅ do
            take X from W
            for each Y ∈ DF(X) do
                if HasAlready(Y) < IterCount then
                    place ⟨V ← φ(V, ..., V)⟩ at Y
                    HasAlready(Y) ← IterCount
                    if Work(Y) < IterCount then
                        Work(Y) ← IterCount
                        W ← W ∪ {Y}
                    end
                end
            end
        end
    end

Fig. 11. Placement of φ-functions.

Let each node X have Aorig(X) original assignments to variables, where each ordinary assignment statement LHS ← RHS contributes the length of the tuple LHS to Aorig(X). Counting assignments to variables is one of several measures of program size.⁵ By this measure, the translation to SSA form expands the program from size Aorig = ΣX Aorig(X) to size Atot = ΣX Atot(X), where each φ-function placed at X contributes 1 to Atot(X) = Aorig(X) + Aφ(X). There is a similar expansion in the number of mentions of variables, from Morig = ΣX Morig(X) to Mtot = ΣX Mtot(X), where each φ-function placed at X contributes 1 plus the indegree of X to Mtot(X) = Morig(X) + Mφ(X).

Placing a φ-function at Y in Figure 11 has cost linear in the indegree of Y, so there is an O(ΣX Mφ(X)) contribution to the running time from the work done when HasAlready(Y) < IterCount. Replacing mentions of variables will contribute at least O(Mtot) to the running time of any SSA translation algorithm, so the cost of placement can be ignored in analyzing the contribution of Figure 11 to the O(...) bound for the whole process. The O(N) cost of initialization can be similarly ignored because it is subsumed by the cost of the dominance frontier calculation. What cannot be ignored is the cost of managing the worklist W. The statement take X from W in Figure 11 is performed Atot(X) times, and each time all Y ∈ DF(X) are polled. The contribution of Figure 11 to the running time of the whole process is therefore O(ΣX (Atot(X) × |DF(X)|)). Sizing the output in the natural way, we can also describe the contribution as O(Atot × avrgDF), where avrgDF is the weighted average

    avrgDF =def (ΣX (Atot(X) × |DF(X)|)) / (ΣX Atot(X))    (7)

that emphasizes the dominance frontiers of nodes with many assignments. As Section 8 shows, the dominance frontiers are small in practice, and Figure 11 is effectively O(Atot). This, in turn, is effectively O(Aorig).

⁵The various measures relevant here are reviewed when the whole SSA translation process is summarized in Section 9.1.
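The two algorithms of Sections 4.2 and 5.1 can be sketched together in Python. The first function follows the Figure 10 bottom-up computation of DF, using the equality tests of Lemmas 2 and 3; the second follows the Figure 11 worklist, returning the iterated dominance frontier of each variable's assignment nodes. The dictionary-based CFG encoding is an assumption of this sketch, and Python sets stand in for the integer-flag arrays, so the constant-factor trick with IterCount is not reproduced here.

```python
def dominance_frontiers(succ, idom, entry):
    """Figure 10: bottom-up walk of the dominator tree, DF via (4)."""
    children = {}
    for y, x in idom.items():
        children.setdefault(x, []).append(y)
    df = {n: set() for n in succ}

    def walk(x):
        for z in children.get(x, []):      # visit children first (bottom-up)
            walk(z)
        for y in succ[x]:                  # /*local*/  (Lemma 2)
            if idom.get(y) != x:
                df[x].add(y)
        for z in children.get(x, []):      # /*up*/     (Lemma 3)
            for y in df[z]:
                if idom.get(y) != x:
                    df[x].add(y)

    walk(entry)
    return df

def place_phis(df, assigns):
    """Figure 11: for each variable, the iterated dominance frontier DF+
    of its assignment nodes, computed by a worklist."""
    phis = {}
    for v, nodes in assigns.items():
        has_phi, on_work = set(), set(nodes)
        work = list(nodes)
        while work:
            x = work.pop()
            for y in df[x]:
                if y not in has_phi:
                    has_phi.add(y)         # place a phi for v at y
                    if y not in on_work:   # enqueue y at most once
                        on_work.add(y)
                        work.append(y)
        phis[v] = has_phi
    return phis
```

On the diamond CFG (A branches to B and C, which rejoin at D), a variable assigned in both B and C gets exactly one φ-function, at the join node D.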

5.2 Renaming

The algorithm in Figure 12 renames all mentions of variables. New variables are denoted Vi, where i is an integer, for each variable V. Figure 12 begins with a top-down traversal of the dominator tree by calling SEARCH at the root node Entry. The visit to a node processes the statements associated with the node in sequential order, starting with any φ-functions that may have been inserted. The processing of a statement requires work only for those variables actually mentioned in the statement. In contrast with Figure 11, we need a loop over all variables only when we initialize two arrays among the following data structures:

S(*) is an array of stacks, one stack S(V) for each variable V. The stacks hold integers. The integer i at the top of S(V) is used to construct the variable Vi that should replace a use of V.

C(*) is an array of integers, one counter C(V) for each variable V. The integer C(V) tells how many assignments to V have been processed.

WhichPred(X, Y) is an integer telling which predecessor of Y in CFG is X. The jth operand of a φ-function in Y corresponds to the jth predecessor of Y in the listing of the inedges of Y.

Each assignment statement A has the form LHS(A) ← RHS(A), where the right-hand side RHS(A) is a tuple of expressions and the left-hand side LHS(A) is a tuple of distinct target variables. These tuples change as mentions of variables are renamed, but the original target tuple is still remembered as oldLHS(A). To minimize notation, a conditional branch is treated as an ordinary assignment to a special variable whose value indicates which edge the branch should follow. A real implementation would recognize that a conditional branch involves a little less work than a genuine assignment: The LHS part of the processing can be omitted.

    for each variable V do
        C(V) ← 0
        S(V) ← EmptyStack
    end
    call SEARCH(Entry)

    SEARCH(X):
        for each statement A in X do
            if A is an ordinary assignment then
                for each variable V used in RHS(A) do
                    replace use of V by use of Vi, where i = Top(S(V))
                end
            end
            for each V in LHS(A) do
                i ← C(V)
                replace V by Vi in LHS(A)
                push i onto S(V)
                C(V) ← i + 1
            end
        end
        for each Y ∈ Succ(X) do
            j ← WhichPred(Y, X)
            for each φ-function F in Y do
                replace the j-th operand V in RHS(F) by Vi, where i = Top(S(V))
            end
        end
        for each Y ∈ Children(X) do
            call SEARCH(Y)
        end
        for each assignment A in X do
            for each V in oldLHS(A) do
                pop S(V)
            end
        end
    end SEARCH

Fig. 12. Renaming mentions of variables. The list of uses of Vi grows with each replacement of V by Vi in a RHS.

The processing of an assignment statement A considers the mentions of variables in A. The simplest case is that of a target variable V in the tuple LHS(A). We need a new variable Vi. By keeping a count C(V) of the number of assignments to V that have already been processed, we can find an appropriate new variable by using i = C(V) and then incrementing C(V). To facilitate renaming uses of V in the future, we also push i (which identifies Vi) onto a stack S(V) of integers that identify new variables replacing V.
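The counter-and-stack discipline just described can be sketched as follows. The statement encoding — mutable ["assign", target, uses] and ["phi", target, {predecessor: operand}] triples, with the operand map standing in for WhichPred — is a hypothetical toy representation, not the paper's. The search procedure mirrors Figure 12: rename RHS uses from the stack tops, generate new target names, fill in the φ-operands of CFG successors, recurse over dominator-tree children, and pop the stacks on exit.

```python
from collections import defaultdict

def rename(nodes, succ, children, entry):
    """Top-down dominator-tree walk in the style of Figure 12."""
    counter = defaultdict(int)           # C(V): assignments to V processed
    stack = defaultdict(list)            # S(V): subscripts, top = current Vi

    def top(v):                          # Vi for i = Top(S(V)); V0 by default
        return "%s%d" % (v, stack[v][-1] if stack[v] else 0)

    def search(x):
        pushed = []                      # oldLHS names, popped on exit
        for s in nodes[x]:
            if s[0] == "assign":         # phi operands are filled by predecessors
                s[2][:] = [top(v) for v in s[2]]
            v = s[1]                     # rename the target V to Vi
            i = counter[v]
            stack[v].append(i)
            counter[v] = i + 1
            pushed.append(v)
            s[1] = "%s%d" % (v, i)
        for y in succ[x]:                # fill our operand slot in successors' phis
            for s in nodes[y]:
                if s[0] == "phi":
                    s[2][x] = top(s[2][x])
        for y in children.get(x, []):    # recurse over dominator-tree children
            search(y)
        for v in pushed:                 # restore the stacks
            stack[v].pop()

    search(entry)
    return nodes
```

Running this on the diamond CFG with v assigned in B and C and a φ-function already placed at D renames the φ to v3 ← φ(v1, v2) and the subsequent use of v in D to v3.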

Static Single Assignment Form and the Control Dependence A subtler any variable expressions computation V used in in the tuple. is needed RHS( A); We want for the right-hand that is, V appears V by to replace

Graph A).

473

side RHS( Vi, where

Consider one of the

in at least

V, is the target

of the assignment that There are two subcases

produces the value for V actually used in RHS( A). because A maybe either an ordinary assignment or a but they later in 4 and of

~-function. Both subcases get V, from the top of the stack S(V), inspect S(V) at different times in Figure 12. Lemma 10, introduced this section, shows that The correctness proof V, is correctly for renaming chosen depends

in both subcases. on results from Section it makes program.

on three more lemmas. We start by showing that the assignment to a variable in the transformed
LEMMA 8.

sense to speak

Each

new variable

Vi in the transformed

program

is a target

of

exactly

one assignment. Because the counter C(V) is incremented after processing each

PRooF.

assignment to V, there can be at most one assignment to V,. To show that there is at least one assignment to V,, we consider the two ways V, can be mentioned. If V, is mentioned on the LHS on an assignment, then there is nothing more to show. If the value of when time Vi when is used, on the other onto hancl, S(V), then it was i = Zop(S( V )) was true pushed to vi. because at the time the algorithm renamed the old use

of V to a use of Vi. At the earlier an assignment

i was pushed been changed

to V had just

to an assignment

K
X may program contain or were assignments introduced to the variable to receive V that appeared in the for all 12. loop

A node original

the value

of a o-function

V. Let TopA/7er( V, X) denote the new variable in effect for V after statements in node X have been considered by the first loop in Figure Specifically, and define we consider the top of each stack S(V) at the end of this

TopAfter If there that assigns


LEMMA

( V, X)

~f

V,

where X, then from the

i = Top(S(V)). the top-down closest traversal of ensures X that

are no assignments V, X) is to V.

to V in inherited

TopAf3er(

dominator

not

9. For any variable have a rp-function for V,

V and

any CFG

edge X ~ Y such

that

Y does

TopAfter(V, PRooF. o-function We may for assume

X) that Z has

= TopAfter( X # idom( Ye DF(Z),

V, idom( Y). then

Y)). Y does not have

(8) a

Because

V, if a node

Z does not assign

to a new to a

variable derived from V. We use this fact twice below. By Lemma 2, Ye DFIOC=l( X) G DF( X), and, thus, X does not new idom( variable X), derived from X)), V. Let U be the first node in idom( idom( . . . that ( V, X) assigns to such a variable. ( V, U).
Vol.

assign

the Then

sequence (9)

TopAfter
ACM Transactions

= TopAfter
Languages

on Programming

and Systems,

13, No. 4, October

1991.

474 Because DF(U). any

Ron Cytron et al U assigns to a new variable Y), we get derived from V, it follows dominates that that Y $ is

But

U dominates

a predecessor

of Y, so U strictly Z >> X because implies

Y. For

Z with

U >> Z >> idom(

Z >> Y and there

an edge X * Y. By the choice assign to a new variable derived 7opAfter( and (8) follows The preceding form, but first from (9). V, U)

of U, U>> Z >> X from V. Therefore, = TopAfler

Z does not

( V, idom(

Y)),

K
helps extend establish TopAfter Condition (3) in the definition which
A.

lemma we must

of SSA

so as to specify

new variable There is one top of each

corresponds to V just before and just after iteration of the first loop in Figure 12 for stack just before and just ~f ~f V, V, after this where where iteration

each statement A. We consider to define

the

TopBefore( TopAfter

V, A) ( V, A)

i = Top( S ( V ) ) before i = TOP( S( V ) ) after

processing processing

A; A.

to be the last In particular, if A happens TopAfter( V, A) = TopAfler( V, X). If, on the another
LEMMA

statement in block X, then A is followed by other hand, V, B). program V and any

statement

B, then

TopA/ler(

V, A) = TopBefore(

and

10. Consider any control flow the same path in the original program.

path in the transformed Consider any variable

occurrence of a statement A along the path. If A is from then the value of Vjust before executing A in the original as the value program. PRooF. executed is either TopBefore( the original before A. Case of TopBefore( V, A) Just before executing

the original program, program is the same A in the transformed

We use induction

along

the path.

We consider

the

kth

statement

along the path and assume that, for all j < k, the jth statement T not from the original programG or has each variable V agreeing with V, T) just program before and T. We assume show that that the kth statement TopBefore( A is from V, A) just V agrees with

1. Suppose

that

is not

the

first

original

statement

in

its

basic

block Y. hypothesis just after

Let T be the statement just before A in Y. By the induction and the semantics of assignment, V agrees with TopAfter(V, T)
T.

Thus,
Suppose

V agrees
that edge

with

TopBefore(
first by the original control

V, A) just
statement path. We

before
in its

A. basic that block Y,

Case 2.
Let X +

A is the followed

Y be the

claim

V agrees

with TopAfler( V, X) when control flows along this edge. If X = Entry, then TopAfter( V, X) is VO and does hold the entry value of V. If X # Entry, let T be the last statement in X. By the induction hypothesis and the semantics of assignment, V agrees with TopAfter( V, T)

It

could be a nominal

Entry

assignment

or a @function, 13, No, 4, October 1991

ACM Transactions

on Programming

Languages and Systems, Vol

Static Single Assignment Form and the Control Dependence Graph

475

... ?
~ + Y

... +
T t

...

...

@(..., TopAfter(V,X),

. ..) Y

/*

No ~ for

*/

V4K

V 2 ZopAfter(V,

idom(Y))

1
Fig. 13. In the proof of Lemma 10, the node Y mayor may not have a +-function for V.

just

after

and, still

hence, bridge along before

with the X +

TopAfter( gap Y between and A in for V

V, X)

when

control that V V in the

flows agrees agrees

along with with there

x+
We

Y.
must knowing V, X) may not knowing ahead of that A

TopAfter( TopBefore( may or program. Case

V, A) just

doing

Y. As Figure

13 illustrates,

be a @-function

transformed

2.1.

Suppose

that

has

a ~-function

Vi+-

4(.

..)

in

Y.

The

operand corresponding to X + Y is TopAfter ( V, X), thanks to the loop over successors of X in Figure 12. This is the operand whose value is assigned to Vi, which doing is TopBefore( A in Y. that Y. V does not have V, X) = TopAfter( a ~-function V, idom( Y)) in Y. By Lemma 9, V, A). Thus, V agrees with TopBefore( V, A) just before

Case 2.2. V agrees just before

Suppose with doing


3.

TopAfter( A in Any

= TopBefore(

V, A)

K
can be put and then number into minimal SSA form by computin Figure of variables Let avr.gDF 11 in be in the

THEOREM

program frontiers

ing

the dominance

applying

the algorithms

and Figure

12. Let A ~Ot be the total

of assignments

to variables

of mentions resulting program. ~ Let MtOt be the total number the resulting program. Let E be the number of edges in CFG. the weighted the whole average is E + ~ x I DF(X) I + (A,o,
x

(7) of dominance

frontier

sizes. Then

the running

time of

process O

avrgDF)

+ M,O,

)
. to V in the that need

PROOF. Figure 11 places the φ-functions for V at the nodes in the iterated dominance frontier DF⁺(S), where S is the set of nodes with assignments to V in the original program. By Theorem 2, DF⁺(S) is the set of nodes that need φ-functions for V, so we have correctly obtained the φ-functions, with the fewest possible φ-functions. We must still show that the translation to SSA form is done correctly by Figure 12. Condition (1) in the definition of SSA form follows from the renaming; Condition (2) follows from Lemma 8; Condition (3) follows from Lemma 10. Let N be the number of nodes in CFG. The dominator tree has O(N) edges, and Figure 12 runs in O(N + E + M_tot) time. The N + E term is subsumed by the O(E + Σ_X |DF(X)|) cost of computing dominance frontiers, so Figure 12 contributes O(M_tot). As explained at the end of Section 5.1, Figure 11 contributes O(A_tot × avrgDF). □

⁷Each ordinary assignment LHS ← RHS contributes the length of the tuple LHS to A_tot, and each φ-function contributes 1 to A_tot.

ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991.
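The placement step that the theorem charges to O(A_tot × avrgDF) can be sketched as a small worklist pass over the dominance frontiers. The Python below is an illustrative sketch, not the paper's Figure 11: it assumes DF is supplied as a dict from node to set, and all names are ours.

```python
def place_phi_functions(df, assign_nodes):
    """Worklist pass in the spirit of Figure 11: a phi-function for a
    variable V is needed at every node of DF+(S), the iterated
    dominance frontier of the set S of nodes that assign to V."""
    has_phi = set()                  # nodes that receive a phi for V
    worklist = list(assign_nodes)    # start from the assignment nodes
    on_list = set(assign_nodes)
    while worklist:
        x = worklist.pop()
        for y in df.get(x, ()):      # every frontier node needs a phi
            if y not in has_phi:
                has_phi.add(y)
                if y not in on_list:     # a phi is itself an assignment,
                    on_list.add(y)       # so its frontier is needed too
                    worklist.append(y)
    return has_phi
```

Each node enters the worklist at most once, so the cost is proportional to the frontier sizes visited, which is the A_tot × avrgDF bound of the theorem.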

6. CONSTRUCTION OF CONTROL DEPENDENCE

In this section we show that control dependences [24] are essentially the dominance frontiers in the reverse graph of the control flow graph. Let X and Y be nodes in CFG. If X appears on every path from Y to Exit, then X postdominates Y.⁸ Like the dominator relation, the postdominator relation is reflexive and transitive. If X postdominates Y but X ≠ Y, then X strictly postdominates Y. The immediate postdominator of Y is the closest strict postdominator of Y on any path from Y to Exit. In a postdominator tree, the children of a node X are all of the nodes immediately postdominated by X. A CFG node Y is control dependent on a CFG node X if both of the following hold:

(1) There is a nonnull path p: X →⁺ Y such that Y postdominates every node after X on p.
(2) The node Y does not strictly postdominate the node X.

In other words, there is some edge from X that definitely causes Y to execute, and there is also some path from X that avoids executing Y. We associate with this control dependence from X to Y the label on the control flow edge from X that causes Y to execute. Our definition of control dependence here can easily be shown to be equivalent to the original definition [24].
LEMMA 11. Let X and Y be CFG nodes. Then Y postdominates a successor of X if and only if (iff) there is a nonnull path p: X →⁺ Y such that Y postdominates every node after X on p.
PROOF. Suppose that Y postdominates a successor U of X. Choose any path q from U to Exit. Then Y appears on q. Let r be the initial segment of q that reaches the first appearance of Y on q. For any node V on r, we can get from U to Exit by following r to V and then by taking any path from V to Exit. Because Y postdominates U but does not appear before the end of r, Y must postdominate V as well. Let p be the path that starts with the edge X → U and proceeds along r. Then Y postdominates every node after X on p.
⁸The postdominance relation in [24] is irreflexive, whereas the definition we use here is reflexive. The two relations are identical on pairs of distinct elements. We choose the reflexive definition here to make postdominance the dual of the dominance relation.
build RCFG
build dominator tree for RCFG
apply the algorithm in Figure 10 to RCFG to find the dominance frontier mapping RDF
for each node X do
    CD(X) ← ∅
end
for each node Y do
    for each X ∈ RDF(Y) do
        CD(X) ← CD(X) ∪ {Y}
    end
end

Fig. 14. Algorithm for computing the set CD(X) of nodes that are control dependent on X.
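Figure 14 transliterates almost directly into executable form. The sketch below is an illustrative Python version under stated assumptions: a naive iterative dominator computation stands in for the efficient method cited as [35], and graphs are encoded as dicts from node to successor set (our encoding, not the paper's).

```python
def dominators(succ, root):
    """All-dominators by iterative set intersection; adequate for small
    examples, not the near-linear method the paper cites as [35]."""
    nodes = set(succ)
    pred = {n: set() for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            pred[v].add(u)
    dom = {n: set(nodes) for n in nodes}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for n in nodes - {root}:
            ps = [dom[p] for p in pred[n]]
            new = (set.intersection(*ps) if ps else set()) | {n}
            if new != dom[n]:
                dom[n], changed = new, True
    return dom, pred

def dominance_frontiers(succ, root):
    """DF(d) contains y iff d dominates a predecessor of y
    but does not strictly dominate y."""
    dom, pred = dominators(succ, root)
    df = {n: set() for n in succ}
    for y in succ:
        for p in pred[y]:
            for d in dom[p]:
                if d == y or d not in dom[y]:
                    df[d].add(y)
    return df

def control_dependences(succ, exit_node):
    """Figure 14: CD(X) = { Y : X is in DF(Y) computed on RCFG }."""
    rsucc = {n: set() for n in succ}
    for u, vs in succ.items():
        for v in vs:
            rsucc[v].add(u)            # reverse every edge
    rdf = dominance_frontiers(rsucc, exit_node)   # Exit is the RCFG root
    cd = {n: set() for n in succ}
    for y, xs in rdf.items():
        for x in xs:
            cd[x].add(y)
    return cd
```

On a small diamond-shaped CFG with an Entry → Exit edge, this reproduces the expected dependences: the branch node is the sole controller of its two arms, and Entry controls the spine.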

Node    CD(Node)
Entry   1, 2, 8, 9, 11, 12
1
2       3, 6, 7
3       4, 5
4
5
6
7
8
9       10
10
11      9, 11
12      2, 8, 9, 11, 12

Fig. 15. Control dependences of the program in Figure 5.
Conversely, given a path p with these properties, let U be the first node after X on p. Then U is a successor of X and Y postdominates U. □

The reverse control flow graph RCFG has the same nodes as the control flow graph CFG, but has an edge Y → X for each edge X → Y in CFG. The roles of Entry and Exit are also reversed. The postdominator relation on CFG is the dominator relation on RCFG.

COROLLARY 1. Let X and Y be nodes in CFG. Then Y is control dependent on X in CFG iff X ∈ DF(Y) in RCFG.

PROOF. Using Lemma 11 to simplify the first condition in the definition of control dependence, we find that Y is control dependent on X iff Y postdominates a successor of X but does not strictly postdominate X. In RCFG this means that Y dominates a predecessor of X but does not strictly dominate X; that is, X ∈ DF(Y). □

Figure 14 applies this result to the computation of control dependences.
After building the dominator tree for RCFG by the standard method [35] in time O(Eα(E, N)), we spend O(size(RDF)) finding dominance frontiers and then inverting them. The total time is thus O(E + size(RDF)) for all practical purposes. By applying the algorithm in Figure 14 to the control flow graph in Figure 5, we obtain the control dependences in Figure 15. Note that the edge from Entry to Exit was added to CFG so that the control dependence relation, viewed as a graph, would be rooted at Entry.
7. TRANSLATING FROM SSA FORM

Many powerful analysis and transformation techniques can be applied to programs in SSA form. Eventually, however, a program must be executed. The φ-functions have precise semantics, but they are generally not represented in existing target machines. This section describes how to translate out of SSA form by replacing each φ-function with some ordinary assignments. Naive translation could yield inefficient object code, but efficient code can be generated if two already useful optimization techniques are applied: dead code elimination and storage allocation by coloring.

Naively, a k-input φ-function at entrance to a node X can be replaced by k ordinary assignments, one at the end of each control flow predecessor of X. This is always correct, but these ordinary assignments sometimes perform a good deal of useless work. If the naive replacement is preceded by dead code elimination and then followed by coloring, however, the resulting code is efficient.
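The naive replacement can be sketched as follows (illustrative Python over a hypothetical tuple-encoded IR of our own: a block is a list of ('phi', target, args) or ('copy', target, source) statements, with φ arguments listed in predecessor order):

```python
def naive_phi_removal(blocks, preds):
    """Replace each k-input phi 'v <- phi(a1,...,ak)' at the entry of
    block X by k ordinary copies 'v <- aj', one appended at the end of
    the j-th control flow predecessor of X."""
    for x in list(blocks):
        phis = [s for s in blocks[x] if s[0] == 'phi']
        blocks[x] = [s for s in blocks[x] if s[0] != 'phi']
        for _, target, args in phis:
            for arg, p in zip(args, preds[x]):
                blocks[p].append(('copy', target, arg))
    return blocks
```

A production translator would also have to split critical edges and place the copies before a predecessor's terminating branch; the sketch ignores both concerns.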

7.1 Dead Code Elimination

The original source program may have dead code (i.e., code that has no effect on any program output). Some of the intermediate steps in compilation (such as procedure integration) may also introduce dead code. Code that once was live may become dead in the course of optimization. With so many possible sources for dead code, it is natural to perform (or repeat) dead code elimination late in the optimization process, rather than burden many intermediate steps with concerns about dead code.
Translation to SSA form is one of the compilation steps that may introduce dead code. Suppose that V is assigned and then used along each branch of an if...then...else..., but that V is never used after the join point. The original assignments to V are live, but the added assignment by the φ-function is dead. Often, such dead φ-functions are useful, as in the equivalencing and redundancy elimination algorithms that are based on SSA form [5, 43]. One such use is shown in Figure 16. Although others have avoided placement of dead φ-functions in translating to SSA form [16, 52], we prefer to include the dead φ-functions to increase optimization opportunities.

There are many different definitions of dead code in the literature. Dead code is sometimes defined to be unreachable code and sometimes defined (as it is here) to be ineffectual code. In both cases, it is desirable to use the broadest possible definition, subject to the correctness condition that dead code really can be safely removed.⁹ A procedural version of the definition is

⁹The definition used here is broader than the usual one [1, p. 595] and similar to that of faint variables [25, p. 489].
    if P1                            if P1
        then do                          then do
            Y1 ← 1                           Y1 ← 1
            use of Y1                        use of Y1
        end                              end
        else do                          else do
            Y2 ← X1                          Y2 ← X1
            use of Y2                        use of Y2
        end                              end
    Y3 ← φ(Y1, Y2)                   Y3 ← φ(Y1, Y2)
    ...                              ...
    if P1                            use of Y3
        then Z1 ← 1                  use of Y3
        else Z2 ← X1
    Z3 ← φ(Z1, Z2)
    use of Z3
    use of Y3

Fig. 16. On the left is an unoptimized program containing a dead φ-function that assigns to Y3. The value numbering technique in [5] can determine that Y3 and Z3 have the same value. Thus, Z3 and many of the computations that produce it can be eliminated. The dead φ-function is brought to life by using Y3 in place of Z3.
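The equivalence exploited in Figure 16 can be found mechanically by simple value numbering in the spirit of [5]: hash each right-hand side over the value numbers of its operands, keying φ-functions by their join point so that congruent φ-functions at the same join share a number. The Python sketch below uses an illustrative three-tuple statement encoding of our own, not the paper's representation.

```python
from itertools import count

def value_numbers(stmts):
    """Assign a value number to each SSA definition by hashing the
    operator together with the value numbers of its operands."""
    vn, table, fresh = {}, {}, count()
    for target, op, args in stmts:
        # Unknown operands (e.g., inputs like X1) number as themselves.
        key = (op, tuple(vn.get(a, a) for a in args))
        if key not in table:
            table[key] = next(fresh)
        vn[target] = table[key]
    return vn
```

Running this on the left-hand program of Figure 16 gives Y3 and Z3 the same number, so uses of Z3 may be rewritten to Y3.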

more intuitive than a recursive version, so we prefer the procedural style. Initially, all statements are tentatively marked dead. Some of the statements, however, need to be marked live because of the conditions listed below. Marking these statements live may cause others to be marked live. When the worklist eventually empties, any statements that are still marked dead are truly dead and can be safely removed from the code. A statement is marked live iff at least one of the following holds:

(1) The statement is one that may be assumed to affect program output, such as an I/O statement, an assignment to a reference parameter, or a call to a routine that may have side effects.
(2) The statement is an assignment statement, and there are statements already marked live that use some of its outputs.
(3) The statement is a conditional branch, and there are statements already marked live that are control dependent on this conditional branch.

Several published algorithms eliminate dead code in the narrower sense that requires every conditional branch to be marked live [30, 31, 38].¹⁰ Our algorithm, given in Figure 17, goes one step further in eliminating conditional branches. (An unpublished algorithm by R. Paige does this also.)
¹⁰The more readily accessible [31] contains a typographical error; the earlier technical report [30] should be consulted for the correct version.
for each statement S do
    if S ∈ PreLive
        then Live(S) ← true
        else Live(S) ← false
end
WorkList ← PreLive
while (WorkList ≠ ∅) do
    take S from WorkList
    for each D ∈ Definers(S) do
        if Live(D) = false
            then do
                Live(D) ← true
                WorkList ← WorkList ∪ {D}
            end
    end
    for each block B in CD⁻¹(Block(S)) do
        if Live(Last(B)) = false
            then do
                Live(Last(B)) ← true
                WorkList ← WorkList ∪ {Last(B)}
            end
    end
end
for each statement S do
    if Live(S) = false
        then delete S from Block(S)
end

Fig. 17. Dead code elimination.
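As a cross-check of the marking rules, the algorithm can be phrased over plain lookup tables (an illustrative Python sketch; the table names mirror the legend that follows, but the encoding is an assumption of ours):

```python
def dead_code_elimination(stmts, pre_live, definers, block_of, last, cd_inv):
    """Worklist pass in the style of Figure 17: a statement is live if
    it is in PreLive, if it feeds a live statement, or if it is the
    terminating branch Last(B) of a block B on which a live
    statement's block is control dependent."""
    live = {s: (s in pre_live) for s in stmts}
    worklist = list(pre_live)
    while worklist:
        s = worklist.pop()
        needed = set(definers.get(s, ()))          # condition (2)
        for b in cd_inv.get(block_of[s], ()):      # condition (3)
            needed.add(last[b])
        for d in needed:
            if not live[d]:
                live[d] = True
                worklist.append(d)
    return [s for s in stmts if live[s]]
```

Note that a statement that depends only on itself (a self-feeding update) never enters the worklist and is correctly deleted, exactly as the text below observes for condition (2).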

The following data structures are used:

- Live(S) indicates that statement S is live.
- PreLive is the set of statements whose execution is initially assumed to affect program output. Statements that result in I/O or side effects outside the current procedure scope are typically included.
- WorkList is a list of statements whose liveness has been recently discovered.
- Definers(S) is the set of statements that provide values used by statement S.
- Block(S) is the basic block containing statement S.
- Last(B) is the statement that terminates basic block B.
- ipdom(B) is the basic block that immediately postdominates block B.
- CD⁻¹(B) is the set of nodes that are control dependence predecessors of the node corresponding to block B. This is the same as RDF(B) in Figure 14.

After the algorithm in Figure 17 discovers all of the live statements, those still not marked live are deleted.¹¹ Other optimizations (such as code motion) may leave empty blocks, so there is already ample reason for a late optimization to remove empty blocks; the empty blocks left by dead code elimination can be removed along with any other empty blocks. The fact that statements are considered dead until marked live is crucial for condition (2): statements that depend (transitively) on themselves are never marked live unless required by some other live statement. Condition (3) is handled by the loop over CD⁻¹(Block(S)); a basic block whose termination controls a block with live statements is itself live.

7.2 Allocation by Coloring
At first, it might seem possible simply to map all occurrences of Vi back to V and to delete all of the φ-functions. However, the new variables introduced by translation to SSA form cannot always be eliminated, because optimization may have capitalized on the storage independence of the new variables. The persistence of the new variables can be illustrated with the code motion example in Figure 18. The source program (Figure 18a) assigns to V twice and uses it twice. The assignments serve separate purposes, and translation to SSA form (Figure 18b) introduces the separate variables V1 and V2. Code motion can then move the invariant assignment to V2 out of the loop (Figure 18c), and the dead assignment to V3 will be eliminated. The optimized program has a region in which V1 and V2 are simultaneously live. Thus, both variables are required: The original variable V cannot substitute for both renamed variables.

Any graph coloring algorithm [12, 13, 17, 18, 21] can be used to reduce the number of variables needed and thereby can remove most of the associated assignment statements. The choice of coloring technique should be guided by the eventual use of the output. If the goal is to produce readable source code, then it is desirable to consider each original variable V separately, coloring just the SSA variables derived from V. If the goal is machine code, then all of the SSA variables should be considered at once. In both cases, the process of coloring changes most of the assignments that were inserted to model φ-functions into identity assignments, that is, assignments of the form V ← V. These identity assignments can all be deleted. Storage savings are especially noticeable for arrays. If optimization does not perturb the order of the first two statements in Figure 7, then arrays A1 and A2 can be assigned the same color and, hence, can share the same storage. The array A2 is then assigned an Update from an identically colored array. Such operations can be implemented inexpensively by

¹¹A conditional branch can be deleted by transforming it to an unconditional branch to any one of its prior targets.

(a)
    while (...) do
        read V
        W ← V + W
        V ← 6
        W ← V + W
    end

(b)
    while (...) do
        W3 ← φ(W0, W2)
        V3 ← φ(V0, V2)
        read V1
        W1 ← V1 + W3
        V2 ← 6
        W2 ← V2 + W1
    end

(c)
    V2 ← 6
    while (...) do
        W3 ← φ(W0, W2)
        read V1
        W1 ← V1 + W3
        W2 ← V2 + W1
    end

Fig. 18. Program that really uses two instances for a variable. (a) Source program; (b) unoptimized SSA form; (c) result of code motion.

assigning to just one component if the arrays share storage. In particular, the actual operation performed by HiddenUpdate is always of this form.
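The effect of coloring on the copies inserted for φ-functions can be illustrated in a few lines (Python sketch; the coloring itself is assumed to be given, e.g., by one of the algorithms cited above, and is represented here as a plain dict from SSA name to storage color):

```python
def rewrite_copies(copies, color):
    """Rewrite each copy 'dst <- src' through the coloring; identity
    assignments (same storage on both sides) are deleted."""
    out = []
    for dst, src in copies:
        d, s = color[dst], color[src]
        if d != s:               # keep only copies that move data
            out.append((d, s))
    return out
```

When all SSA variables derived from V receive the same color, every inserted copy becomes V ← V and disappears; a coloring that must keep two instances apart (as in Figure 18) leaves a genuine copy behind.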

8. ANALYSIS AND MEASUREMENTS

The number of nodes that contain φ-functions for a variable V is a function of the program control flow structure and the assignments to V. Program structure alone determines dominance frontiers and the number of control dependences. It is possible that dominance frontiers may be larger than necessary for computing φ-function locations for some programs, since the actual assignments are not taken into account. In this section we prove that the size of the dominance frontiers is linear in the size of the program when control flow branching is restricted to if-then-else constructs and while-do loops. (We assume that predicates and expressions perform no internal branching.) Such programs can be described by the grammar given in Figure 19. We also give experimental results that suggest linear behavior for actual programs.

THEOREM 4. For programs comprised of straight-line code, if-then-else, and while-do constructs, the dominance frontier of any CFG node contains at most two nodes.

PROOF. Consider a top-down parse of a program using the grammar shown in Figure 19. Initially, we have a single node <program> in the parse tree and a control flow graph CFG with two nodes and one edge: Entry to Exit. The initial dominance frontiers are DF(Entry) = ∅ = DF(Exit). For each production, we consider the associated changes to CFG and to the dominance frontiers of nodes. When a production expands a nonterminal parse tree node S, a new subgraph is inserted into CFG in place of S. In this new subgraph, each node corresponds to a symbol on the right-hand side of the production. We will show that applying any of the productions preserves the following invariants:

- Each CFG node S corresponding to an unexpanded <statement> symbol has at most one node in its dominance frontier.
<program> ::= <statement>                                          (1)
<statement> ::= <statement> <statement>                            (2)
<statement> ::= if <predicate> then <statement> else <statement>   (3)
<statement> ::= while <predicate> do <statement>                   (4)
<statement> ::= <variable> ← <expression>                          (5)

Fig. 19. Grammar for control structures.
- Each CFG node T corresponding to a terminal symbol has at most two nodes in its dominance frontier.

We consider the productions in turn.

(1) This production adds a CFG node S and CFG edges Entry → S → Exit, yielding DF(S) = {Exit}.

(2) When this production is applied, S is replaced by two nodes S1 and S2. Edges previously entering S now enter S1, and edges previously leaving S now leave S2. A single edge is inserted from S1 to S2. Although the control flow graph has changed, consider how this production affects the dominator tree: Nodes S1 and S2 dominate all nodes that were previously dominated by S; additionally, S1 dominates S2. Thus, we have DF(S1) = DF(S) and DF(S2) = DF(S).

(3) When this production is applied, a CFG node S is replaced by nodes Tif, Sthen, Selse, and Tendif. Edges that previously entered S now enter Tif, and edges that previously left S now leave Tendif. Edges are inserted from Tif to both Sthen and Selse; edges are also inserted from Sthen and Selse to Tendif. In the dominator tree, Tif and Tendif both dominate all nodes that were dominated by S; additionally, Tif dominates Sthen and Selse. By the argument made for production (2), we have DF(Tif) = DF(S) = DF(Tendif). Now consider nodes Sthen and Selse. From the definition of a dominance frontier, we obtain DF(Sthen) = DF(Selse) = {Tendif}.

(4) When this production is applied, a CFG node S is replaced by nodes Twhile and Sdo. All edges previously associated with node S are now associated with node Twhile. Edges are inserted from Twhile to Sdo and from Sdo to Twhile. Node Twhile dominates all nodes that were dominated by node S; additionally, Twhile dominates Sdo. Thus, we have DF(Twhile) = DF(S) ∪ {Twhile} and DF(Sdo) = {Twhile}.

(5) After application of this production, the new control flow graph is isomorphic to the old graph. □
COROLLARY 2. For programs comprised of straight-line code, if-then-else, and while-do constructs, every node is control dependent on at most two nodes.
PROOF. Consider a program

P composed CFG.

of the allowed control

constructs graph

and its R(CFG is

associated

control
ACM

flow

graph

The reverse
Languages

flow
Vol.

Transactions

on Programmmg

and Systems,

13, No. 4~October

1991.

484

Ron Cytron et al.


Table I. Summary Statistics of Our Experiment

Package name | Statements in all procedures | Statements per procedure (Min / Median / Max) | Description
EISPACK      |  7,034                       | 22 /  89 / 327                                | Dense matrix eigenvectors and values
FL052        |  2,054                       |  9 /  54 / 351                                | Flow past an airfoil
SPICE        | 14,093                       |  8 /  43 / 753                                | Circuit simulation
Totals       | 23,181                       |  8 /  55 / 753                                | 221 FORTRAN procedures
The reverse control flow graph RCFG is then itself a structured control flow graph for some program. For all Y in RCFG, DF(Y) contains at most two nodes by Theorem 4. By Corollary 1, Y is control dependent on at most two nodes. □
Unfortunately, the linearity results do not hold for all program structures. In particular, consider the nest of repeat-until loops illustrated in Figure 5: the dominance frontier of each loop entrance includes each of the entrances to surrounding loops. For n nested loops, this leads to a dominance frontier mapping whose total size is O(n²), yet the mapping is used in placing at most O(n) φ-functions for each variable. Most of the dominance frontier computation is thus not actually used in placing φ-functions, so it seems that computing dominance frontiers might take excessive time with respect to the resulting number of actual φ-functions. We therefore wish to measure the number of dominance frontier nodes as a function of program size over a diverse set of programs.

We implemented our algorithms for constructing dominance frontiers and placing φ-functions in the PTRAN system, which already offered the required control flow and data flow analysis [2]. We ran these algorithms on 61 procedures from our local EISPACK library [46] and on 160 procedures from two Perfect benchmarks [39]. Some summary statistics of these procedures are shown in Table I. These FORTRAN programs were chosen because they contain irreducible intervals and other unstructured constructs. As the plot in Figure 20 shows, the size of the dominance frontier mapping appears to vary linearly with program size. The ratio of these sizes ranged from 0.6 (the Entry node has an empty dominance frontier) to 2.1. For the programs we tested, the plot in Figure 21 shows that the number of φ-functions is also linear in the size of the original program. The ratio of

these sizes ranged from 0.5 to 5.2. The largest ratio occurred for a procedure of only 12 statements, and 95 percent of the procedures had a ratio under 2.3. All but one of the remaining procedures contained fewer than 60 statements. Finally, the plot in Figure 22 shows that the size of the control dependence graph is linear in the size of the original program. The ratio of these sizes ranged from 0.6 to 2.4, which is very close to the range of ratios for dominance frontiers. The ratio avrgDF (defined by (7) in Section 5.1) measures the cost of placing φ-functions relative to the number of assignments in the resulting
Fig. 20. Size of dominance frontier mapping versus number of program statements.

Fig. 21. Number of φ-functions versus number of program statements.
SSA form program. no correlation with

This ratio varied program size.

from

1 to 2, with

median

1.3. There

was

We also measured the expansion A,., / A.... in the number of assignments when translating to SSA form. This ratio varied from 1.3 to 3.8. Finally, we measured the expansion MtOt / MO,,~ in the number of mentions (assignments
ACM Transactions on Programming Languages and Systems, Vol. 13, No. 4, October 1991.

486
160 150 1400 130 120 110 100 900 800 700 ! 600 500 400 300 1 200 loo 0

Ron Cytron et al.

* ** *

* *******

** * * **

**

**2** $*

?@% ****
I
200

0
Fig. 22.

100

1
300 400

I
500 600 700

I
800

Size of control

dependence graph versus number

of program

statements.

or uses) of variables when translating to SSA form. This ratio varied from 1.6 to 6.2. For both of these ratios, there was no correlation with program size.

9. DISCUSSION

9.1 Summary of Algorithms and Time Bounds

The conversion to SSA form is done in three steps:

(1) The dominance frontier mapping is constructed from the control flow graph CFG (Section 4.2). Let CFG have N nodes and E edges, and let DF be the mapping from nodes to their dominance frontiers. The time to compute the dominator tree and then the dominance frontiers in CFG is O(E + Σ_X |DF(X)|).

(2) Using the dominance frontiers, the locations of the φ-functions for each variable in the original program are determined (Section 5.1). Let A_tot be the total number of assignments to variables in the resulting program, where each ordinary assignment statement LHS ← RHS contributes the length of the tuple LHS to A_tot, and each φ-function contributes 1 to A_tot. Placing φ-functions contributes O(A_tot × avrgDF) to the overall time, where avrgDF is the weighted average (7) of the sizes |DF(X)|.

(3) The variables are renamed (Section 5.2). Let M_tot be the total number of mentions of variables in the resulting program. Renaming contributes O(M_tot) to the overall time.

To state the time bounds in terms of fewer parameters, let the overall size R of the original program be the maximum of the relevant numbers: N
nodes, E edges, A_orig original assignments to variables, and M_orig original mentions of variables. In the worst case, avrgDF = Ω(N) = Ω(R), and translation can require Ω(R²) insertions of φ-functions. Thus, A_tot = Ω(R²) at worst. In the worst case, a φ-function has Ω(R) operands. Thus, M_tot = Ω(R³) at worst. The one-parameter worst-case time bounds are thus O(R²) for finding dominance frontiers and O(R³) for translation to SSA form. However, the data in Section 8 suggest that the entire translation to SSA form will be linear in practice. The dominance frontier of each node in CFG is small, as is the number of φ-functions added for each variable. In effect, avrgDF is constant, A_tot = O(A_orig), and M_tot = O(M_orig). The entire translation process is effectively O(R).

Control dependences are read off of the dominance frontiers in the reverse graph RCFG (Section 6) in time O(E + size(RDF)). Since the size of RDF is the size of the output from the control dependence calculation, this algorithm is linear in the size of the output. The only quadratic behavior is caused by the output being Ω(R²) in the worst case. The data in Section 8 suggest that the control dependence calculation is effectively O(R).

9.2 Related Work

Minimal SSA form is a refinement of Shapiro and Saint's [45] notion of a pseudoassignment. The pseudoassignment nodes for V are exactly the nodes that need φ-functions for V. A closer precursor [22] of SSA form associated new names for V with pseudoassignment nodes and inserted assignments from one new name to another. Without explicit φ-functions, however, it was difficult to manage the new names or reason about the flow of values. Suppose the control flow graph CFG has N nodes and E edges for a program with Q variables. One algorithm [41] requires O(Eα(E, N)) bit vector operations (where each vector is of length Q) to find all of the pseudoassignments. A simpler algorithm [43] for reducible programs computes SSA form in time O(E × Q). With lengths of bit vectors taken into account, both of these algorithms are essentially O(R²) on programs of size R, and the simpler algorithm sometimes inserts extraneous φ-functions. The method presented here is O(R³) at worst, but Section 8 gives evidence that it is O(R) in practice. The earlier O(R²) algorithms have no provision for running faster in typical cases; they appear to be intrinsically quadratic.

For CFG with N nodes and E edges, previous general control dependence algorithms [24] can take time quadratic in (N + E). This analysis is based on the worst-case O(N) depth of the (post)dominator tree [24, p. 326]. Section 6 shows that control dependences can be determined by computing dominance frontiers in the reverse graph RCFG. In general, our approach can also take quadratic time, but the only quadratic behavior is caused by the output being Ω(N²) in the worst case. In particular, suppose a program is comprised only of straight-line code, if-then-else, and while-do constructs. By Corollary 2, our algorithm computes control dependences in linear time. We obtain a better time bound for such programs because our algorithm is based on dominance frontiers, whose sizes are not necessarily related to the depth of the dominator tree. For languages that offer only these constructs, control dependences can also be computed from the parse tree [28] in linear time, but our algorithm is more robust. It handles all cases in quadratic time and typical cases in linear time.

9.3 Conclusions

Previous work has shown that SSA form and control dependences can support powerful code optimization algorithms that are highly efficient in terms of time and space bounds based on the size of the program after translation to the forms. We have shown that this translation can be performed efficiently, that it leads to only a moderate increase in program size, and that applying the early steps in the SSA translation to the reverse graph is an efficient way to compute control dependences. This is strong evidence that SSA form and control dependences form a practical basis for optimization.

ACKNOWLEDGMENTS

We would like to thank Fran Allen for early encouragement and Fran Allen, Trina Avery, Julian Padget, Bob Paige, Tom Reps, Randy Scarborough, and Jin-Fan Shaw for various helpful comments. We are especially grateful to the referees, whose thorough reports led to many improvements in this paper.

REFERENCES

1. AHO, A. V., SETHI, R., AND ULLMAN, J. D. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, Mass., 1986.
2. ALLEN, F. E., BURKE, M., CHARLES, P., CYTRON, R., AND FERRANTE, J. An overview of the PTRAN analysis system for multiprocessing. J. Parallel Distrib. Comput. 5 (Oct. 1988), 617-640.
3. ALLEN, J. R. Dependence analysis for subscripted variables and its application to program transformations. Ph.D. thesis, Department of Computer Science, Rice Univ., Houston, Tex., Apr. 1983.
4. ALLEN, J. R., AND JOHNSON, S. Compiling C for vectorization, parallelization and inline expansion. In Proceedings of the SIGPLAN '88 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 23, 7 (June 1988), 241-249.
5. ALPERN, B., WEGMAN, M. N., AND ZADECK, F. K. Detecting equality of values in programs. In Conference Record of the 15th ACM Symposium on Principles of Programming Languages (Jan. 1988). ACM, New York, pp. 1-11.
6. BALLANCE, R. A., MACCABE, A. B., AND OTTENSTEIN, K. J. The program dependence web: A representation supporting control-, data-, and demand-driven interpretation of imperative languages. In Proceedings of the SIGPLAN '90 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 25, 6 (June 1990), 257-271.
7. BANNING, J. B. An efficient way to find the side effects of procedure calls and the aliases of variables. In Conference Record of the 6th ACM Symposium on Principles of Programming Languages (Jan. 1979). ACM, New York, pp. 29-41.
8. BARTH, J. M. An interprocedural data flow analysis algorithm. In Conference Record of the 4th ACM Symposium on Principles of Programming Languages (Jan. 1977). ACM, New York, pp. 119-131.
9. BURKE, M. An interval-based approach to exhaustive and incremental interprocedural data flow analysis. ACM Trans. Program. Lang. Syst. 12, 3 (July 1990), 341-395.
10. BURKE, M., AND CYTRON, R. Interprocedural dependence analysis and parallelization. In Proceedings of the SIGPLAN '86 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 21, 7 (June 1986), 162-175.
11. CARTWRIGHT, R., AND FELLEISEN, M. The semantics of program dependence. In Proceedings of the SIGPLAN '89 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 24, 7 (July 1989), 13-27.
12. CHAITIN, G. J. Register allocation and spilling via graph coloring. In Proceedings of the SIGPLAN '82 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 17, 6 (June 1982), 98-105.
13. CHAITIN, G. J., AUSLANDER, M. A., CHANDRA, A. K., COCKE, J., HOPKINS, M. E., AND MARKSTEIN, P. W. Register allocation via coloring. Comput. Lang. 6 (1981), 47-57.
14. CHASE, D. R. Safety considerations for storage allocation optimizations. In Proceedings of the SIGPLAN '88 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 23, 7 (June 1988), 1-10.
15. CHASE, D. R., WEGMAN, M., AND ZADECK, F. K. Analysis of pointers and structures. In Proceedings of the SIGPLAN '90 Symposium on Compiler Construction. SIGPLAN Not. (ACM) 25, 6 (June 1990), 296-310.
16. CHOI, J., CYTRON, R., AND FERRANTE, J. Automatic construction of sparse data flow evaluation graphs. In Conference Record of the 18th ACM Symposium on Principles of Programming Languages (Jan. 1991). ACM, New York, pp. 55-66.
17. CHOW, F. C. A portable machine-independent global optimizer: Design and measurements. Ph.D. thesis and Tech. Rep. 83-254, Computer Systems Laboratory, Stanford Univ., Stanford, Calif., Dec. 1983.
18. CHOW, F. C., AND HENNESSY, J. L. The priority-based coloring approach to register allocation. ACM Trans. Program. Lang. Syst. 12, 4 (Oct. 1990), 501-536.
19. COOPER, K. D. Interprocedural data flow analysis in a programming environment. Ph.D. thesis, Dept. of Mathematical Sciences, Rice Univ., Houston, Tex., 1983.
20. CYTRON, R., AND FERRANTE, J. An improved control dependence algorithm. Tech. Rep. RC 13291, IBM Corp., Armonk, N.Y., 1987.
21. CYTRON, R., AND FERRANTE, J. What's in a name? In Proceedings of the 1987 International Conference on Parallel Processing (Aug. 1987), pp. 19-27.
22. CYTRON, R., LOWRY, A., AND ZADECK, F. K. Code motion of control structures in high-level languages. In Conference Record of the 13th ACM Symposium on Principles of Programming Languages (Jan. 1986). ACM, New York, pp. 70-85.
23. DENNIS, J. B. First version of a data flow procedure language. Tech. Rep. Comput. Struct. Group Memo 93 (MAC Tech. Memo 61), MIT, Cambridge, Mass., May 1975.
24. FERRANTE, J., OTTENSTEIN, K. J., AND WARREN, J. D. The program dependence graph and its use in optimization. ACM Trans. Program. Lang. Syst. 9, 3 (July 1987), 319-349.
25. GIEGERICH, R. A formal framework for the derivation of machine-specific optimizers. ACM Trans. Program. Lang. Syst. 5, 3 (July 1983), 478-498.
26. HAREL, D. A linear time algorithm for finding dominators in flow graphs and related problems. In Proceedings of the 17th ACM Symposium on Theory of Computing (May 1985).

22, Whats in a name? In Proceedings of the 1987 International (Aug. 1987), pp. 19-27. CYTRON, R., LOWRY, A., AND ZADECK, F. K. Code motion of control structures in high-level on Principles of Programming languages. In Conference Record of the 13th ACM Symposium Languages (Jan. 1986). ACM, New York, pp. 70-85. DENNIS, J. B. First version of a data flow procedure language. Tech. Rep. Comput. Strut. Group Memo 93 (MAC Tech. Memo 61), MIT, Cambridge, Mass., May 1975. FERRANTE, J., OTTENSTEIN, K. J., AND WARREN, J. D. The program dependence gmph and ACM Trans. Program. Lang. Syst. 9, 3 (July 1987), 319 349. its use in optimization. ACM GIEGERICH, R. A formal framework for the derivation of machine-specific optimizers. Trans. Program. Lang. Syst. 5, 3 (July 1983), 478-498. HAREL, D. A linear time algorithm for finding dominators in flow graphs and related of the 17th ACM Symposium on Theory of Computing (May 1985). problems. In Proceedings
Conference on Parallel Processing CYTRON, R., AND FERRANTE, J.

23. 24. 25. 26.

ACM, New York, pp. 185-194. 27. HORWITZ, S., PFEIFFER, P., AND REPS, T.
Proceedings of the SIGPLAN 89 Symposium

Dependence
on

analysis

for pointer

variables.
SIGPLf~N

In
Not.

Compiler

Construction.

(ACM) 24, 7 (June 1989). 28. HORWITZ, S., PRINS, J., AND REPS, T.
ACM Trans. Program. Lang. Syst.

29.

30. 31. 32. 33.

Integrating non-interfering versions of programs. 1989), 345-387. JONES, N. D., AND MUCHNICK, S. S. Flow analysis and optimization of lLISP-like structures. Flow Analysis, S. S. Muchnick and N, D. Jones, Eds. Prentice-Hall, Englewood In Program Cliffs, N. J., 1981, chap. 4, pp. 102-131. KENNEDY, K. W. Global dead computation elimination. Tech. Rep. SETL Newsl. 111, Courant Institute of Mathematical Sciences, New York Univ., New York, N. Y., Aug. 1973. In Program Flow Analysis, KENNEDY, K. W. A survey of data flow analysis techniques. S. S. Muchnick and N. D. Jones, Eds. Prentice-Hall, Englewood Cliffs, N. J., 1981. of Computers and Computations. Wiley, New York, 1978. KUCK, D. J. The Structure LARUS, J. R. Restructuring symbolic programs for concurrent execution on multiprocessors. at Berkeley, Tech. Rep. UCB/CSD 89/502, Computer Science Dept., Univ. of California Berkeley, Calif., May 1989.
11, 3 (July

ACM Transactions

on Proq-amming

Languages and Systems, Vol. 13, No. 4, October 1991.

490
34

.
LARUS, J
Proceedings

Ron Cytron et al R., AND HILFINGER, P. N


of the ACM SIGPLAN

Detecting

conflicts

between

structure

accesses
SIGPLAN

In
Not.

88 Sympostum

on Compder

Construction.

23, 7 (July 1988), 21-34, 35. LENGAUER, T., AND TARJAN, R. E. ACM Trans. Program. Lang. Syst.

36. 37

38. 39. 40 41. 42. 43.

44.

A fast algorithm for finding dominators m a flowgraph. 1 (July 1979), 121-141 Prentice-Hall, EngleProgram Flow Analysis. MUCHNICK, S. S., AND JONES, N. D , EDS wood Cliffs, NJ., 1981 Record of the MYEKS, E. W. A precise interprocedural data flow algorithm, In Conference 8th ACM Symposium on Principles of Programming Languages (Jan. 1981). ACM, New York, pp. 219-230. OTTENSTEIN, K. J. Data-flow graphs as an intermediate form. Ph.D. thesis, Dept. of Computer Science, Purdue Univ., W. Lafayette, Ind., Aug. 1978. POINTER, L. Perfect report: 1. Tech. Rep CSRD 896, Center for Supercomputing Research and Development, Univ. of Illinois at Urbana-Champaign, Urbana, 111,, July 1989 REIF, J. H., AND LEWIS, H. R. Efficient symbolic analysis of programs. J. Comput. Syst, SCZ. 32, 3 (June 1986), 280-313. REIF, J. H., AND TARJAN, R. E Symbolic program analysis in almost linear time. SIAM J. Comput. 11, 1 (Feb. 1982), 81-93 ROSEN, B. K. Data flow analysis for procedural languages J. ACM 26, 2 (Apr. 1979), 322-344. ROSEN, B. K , WEGMAN, M. N., AND ZADECK, F. K. Global value numbers and redundant Record of the 15th ACM Symposui m on Przn ciples of Programcomputations. In Conference ming Languages, (Jan. 1988). ACM, New York, pp. 12-27, RUGGIERI, C., AND MURTAGH, T. P. Lifetime analysis of dynamically allocated objects. In
1, Conference Record of the 15th ACM Symposzum on Principles of Programming Languages

(Jan. 1988). ACM, New York, pp. 285-293. The representation of algorithms. Tech. Rep, CA-7002-1432, 45. SHAPIRO, R. M , AND SAINT, H Massachusetts Computer Associates, Feb. 1970. 46. SMITH, B. T,, BOYLE, J. M., DONGARRA, J. J,, GARBOW, B, S., IIIEBE, Y., KLEMA, V. C,, AND MOLER, C B. Matrzx Eigensystem Routines Eispack Guide. Springer-Verlag, New York, 1976. 3, 1 (1974), 62-89, 47 TARJAN, R. E. Finding dominators in directed graphs. SIAM J. Comput Property extraction in well-founded property sets. IEEE Trans. Softw. Eng. 48. WEGBREIT, B SE-1, 3 (Sept. 1975), 270-285. 49. WEGMAN, M. N , AND ZADECK, F, K. Constant propagation with conditional branches. In
Conference Record of the 12th ACM Symposium on Prlnczples of Programm mg Languages

(Jan.). ACM, New York, pp. 291-299. 50. WEGMAN, M. N., AND ZADECK, F. K. Constant propagation with conditional branches. ACM Trans. Program. Lang. Syst. To be published. 51 WOLFE, M. J. Optimizing supercompilers for supercomputers. Ph.D, thesis, Dept of Computer Science, Univ. of Illinois at Urbana-Champaign, Urbana 111., 1982 52. YANG, W., HORWITZ, S., AND REPS, T. Detecting program components with equivalent behaviors. Tech Rep. 840, Dept. of Computer Science, Univ. of Wisconsin at Madison, Madison, Apr. 1989 Received July 1989; revised March 1991; accepted March 1991

ACM

TransactIons

on Programming

Languages

and Systems,

Vol

13, No

4, October

1991
