Download as pdf
Download as pdf
You are on page 1of 12
Partitioning Dataflow Analyses Using Types Erik Ruf Microsoft Research ‘One Microsoft Way, Redmond, WA 98052-6399 USA erikruf@nicrosoft.com Abstract We present a simple method for partitioning a dataflow analysis problem into a series of related subproblems, We use type information (either declared by the pro- grammer, or computed via nonstandard type inference) to conservatively approximate analysis-time data de- pendences between program quantities. ‘This depen- dence information frees us from the need to model all program quantities simultaneously in a single, mono- lithic analysis. Instead, we can model quantities in any order, or even in parallel, so long as we respect the dependences. Our approach is independent of the ‘means used to solve the subproblems, and enables the ‘wider application of existing sparse approaches previ- ously restricted to “separable” dataflow problems. Pre- liminary experiments applying our technique to flow- sensitive points-to analysis of C programs have achieved storage savings of 1.3-7.2x over existing methods 1 Introduction Compilers are becoming increasingly dependent on whole-program dataflow analyses such as inter- procedural constant propagation and pointer alias anal- ysis. At present, most. such analyses are monolithic in that all relevant program quantities (e.g., variables, lo- cations, expressions) are modeled simultaneously at all relevant program points. Although sparse representa- tion methods have made great strides in reducing the umber of points at which each quantity must be mod- led, further cost reductions can be achieved by par- titioning the dataflow problem into a set of related, smaller problems, each treating a subset of the pr gram Penta copyegt in by permioion of he ACM, le. To copy oherwite, Eepubl 2 at on mers oto edhe oli regres pee termini andor fe. BOpL"99, Paris, France ©1957 ACM '0-89791-853.3/96/01 $3.50 15 quantities and/or points. cost improvements Doing so allows for several © Some optimizations can be performed on a per- subproblem basis, allowing all analysis memory to be reclaimed after each subproblem has been ana- lyzed. For example, an assignment to a dead vari able can always be removed, irrespective of the liveness of any other variable or any other assign- ment statement. Similarly, a primitive operation can be folded whenever its operands are constants, without knowledge of the constancy of any other program quantities. In such situations, partition- ing reduces analysis memory requirements to those of the the most expensive subproblem. © Often, storing a subproblem’s solution requires less storage than computing it. This is true even when sparse techniques are used to solve the subprob- em. Causes include the need to model quanti- ties at; meet points, auxiliary data structures (€g., dependence graphs) used by the analysis, and in- complete usage of the solution by the client (e.g., a browser may never issue queries about compiler- generated temporaries, but these must be modeled anyway). Partitioning allows the excess intermedi- ate storage to be reclaimed after each subproblem is solved, « Independent subproblems can be analyzed in par- allel. Existing partitioning strategies fall into two cate- gories. Point-based strategies analyze only a subset of program points in each subproblem. Examples include interval-based dataflow analyses and interprocedural analyses that separate intraprocedural analysis from in- terprocedural propagation. Quantity-based strategies continue to analyze all points, but model only a sub- set of the program quantities in each subproblem. To date, quantity-based partitioning has been pos- sible only for “separable” dataflow problems such as reaching definitions and live variables, which have the property that the solution for each quantity (e.., vari able) is independent of those for all others. Efficiency improvements for separable problems have been ob- tained using specialized program representations such as sparse evaluation graphs (CCF91] and sparse schedul- ing techniques such as the slotwise (DRZ92] and tabu- lation methods [RHS95]. Unfortunately, many inter- esting dataflow problems, such as constant propagation and pointer alias analysis, are not separable; existing techniques are forced to model all relevant quantities simultaneously. ‘This paper describes a way to extend quantity-based partitioning to non-separable problems. The basic task in quantity-based partitioning is to conservatively es- timate the analysis-time dependences between program. quantities, and to schedule the analysis of each quantity after the analysis of all other quantities upon which it depends. For separable problems, this is trivial, as all ‘quantities are known to be independent. Our insight is that the required dependences can be cheaply and con- servatively approximated using type information either declared by the programmer or computed by nonstan- dard type inference. ‘We begin with a small example, Section 3 describes a partitioning algorithm based on user-declared types, as well as modifications needed for analyzing type-unsafe languages lke C. Section 4 describes how the same algo- rithm can be improved by using nonstandard, inferred types. We then, in Section 5, give experimental results for both algorithms applied to a points-to analysis. Sec- tion 6 suggests other potential applications for our tech- nique. We conclude with a discussion of related work 2 Example Figure 1 contains type declarations for data structures used for compile-time name resolution in a hypothetical compiler for a block-structured language. The primary data structure is a tree of scopes, each of which has a symbol table consisting of a linked list of symbol table entries. The data structure for symbols is used both in symbol table entries and for labeling the scopes them- selves. Each data structure also contains various scalar information, such as sizes or level numbers, A points-to analysis of a program using these data types must model the potential values of all pointer- valued storage (i.e., variables and data structures). Do- ing so is clearly not a separable problem, as the value ‘of one pointer may depend on those of others. A tra- ditional analysis would thus model all values of pointer type! simultaneously, Assuming, of cours that we can trust the type eystem: Ls that struct Scope {int lavel, tranesize; Sytab snyatab; Sym label; Scope sparent, efiretcnild, next: y struct syatable (int munsyns; ‘Scope tacope; SyaTableEntzy efiret, » struct SyaTabletntry (int franeofteets Sym eym: Syatabletntry eprev, next; » eeruct Sym (Cine Length: char nase; Figure 1: Compile-time scope representation for a hy- pothetical compiler. Figure 2: “Pointed-to-by” dependences among types in example of Figure 1. An are from a type th to a type tp means that dereferencing a value of type t» may ac- cess/modify a value of type t 16 Figure 3: Acyclic “pointed-to-by” dependences on pointer types of Figure 2 Such a modeling strategy is unnecessarily coarse, as it makes the worst-case assumption that all pointers in- teract, We use dependence information computed from the types (shown in Figure 2) to establish an upper bound on the interactions between values having par- ticular types, This is an analysis-specific computation; in the case of points-to analysis, we use the “pointed-to- by" relation: a type fh depends on a type tz whenever dereferencing a value of type f, may access/modify @ value of type ¢;. For example, quantities of type char * points only to scalars, and thus cannot affect quantities of any other pointer type. Similarly, SymTable * quan- tities may point to SynTableEntry * quantities, but not vice versa, indicating that it is safe to compute the solution for SynTable * quantities first, represent the solution efficiently, then compute the SynTableEntry * solution, Quantities having mutually recursive types, such as Scope * and SyaTable +, must be modeled si ‘multaneously, Collapsing mutual recursions and elim- inating irrelevant (scalar) types yields the dependence graph of Figure 3, which (being a total ordering) in- duces a four-subproblem schedule. ‘The type informa- tion has enabled us to decompose a single analysis mod- cling quantities of five types into a series of four analy- ses, each modeling quantities of either one or two types. ‘We can now use existing methods to solve each subprob- lem as efficiently as possible. Now suppose that, instead of performing points-to analysis, we're interested in performing integral con- stant propagation? on our hypothetical compiler pro- gram. The declared-type-based dependences of Fig- lure 2 are useless here, as all integral values in the entire program (not only the slots level, numSyms, length, ‘quantities of acalne type will not be used to transmit pointer values We will nddreneunaafe Inngunges in Sections 9. and 4 "rv range anayti, uninitilzed-value analysis, o icing— that volves modeling the value of integer valued program quant Indeed, any am 7 Figure 4: Non-standard types and their “pointed-to- by” dependences. Multiple nonstandard instances of single user type are qualified with the relevant program quantities in brackets franeSize, and franeOffset, but also literals, inter- rediate expression values, and other integral variables) are represented by a single type, int. Under the ap- proach above, all such quantities must be scheduled si- tmultaneously, yielding no partitioning benefit. In this case, finer-grained dependence information is, of potential use. Suppose that the level, muSyms, and ength fields are completely independent (e.g, they are never copied to one another, added together, etc), and that the frameSize for a scope and the franeDftset fields for its entries are mutually dependent. We can ‘obtain such information cither by requiring the user to make additional type distinctions (e.g., by declaring dis- tinct subtypes of integers) or by using nonstandard type inference. Such an inference infers a typing where only quantities with potentially dependent? values (such as frane(ffect and franeSize in this example) share a type, producing the dependence graph of Figure 4, This typing yields a schedule in which quantities having each of the four inferred integer types can be modeled inde- pendently. The finer dependences of Figure 4 also aid points- of dependence is consermative approximation of any dependence ‘example, the stem of (OVCatlahan and Jeckton (0585, 0208] can infr the integral types we scopes are distinct from those in symbol tables, we are able to partition the analysis of Sym * quantities into two independent phases 3. Partitioning Using Declared Types 3.1 Basic Algorithm Given a dataflow problem to be solved over a pro- gram p, we wish to produce ‘* a set of subproblems {go,41,---,dn} where each qi is a set of program quantities drawn from p such that Uses contains all quantities inp relevant to the dataflow problem, and ‘© a total ordering < on the gi such that the dataflow solution for all quantities q € qi can be solved given only the results for all g; < qi Our algorithm (c.f. Figure 5) operates as follows. We begin by computing a dependence relation on all rele- vant types declared in p. The pair (ty,¢2) is in the relation iff the dataflow solution for a program quantity ‘of type t; can possibly be affected by the dataflow solu- tion for a quantity of type ta. This is an analysis-specific computation: in the case of points-to analysis, we use the “pointed-to-by” relation described above. Other analyses may require other relations such as structural inclusion, coercion, or O'Callahan and Jackson's “eom- patibility” relation {096} Given the types and the dependence relation, we are faced with the problem of computing the total ordering, ‘or schedule. This is not all that different from schedul- ing the nodes of a control flow graph, except that each node (type) in our graph represents an entire dataflow problem to be solved. The standard solutions, reverse postorder iteration and interval analysis, are unattrac- tive in this case. Reverse postorder schedules must be evaluated multiple times, requiring that we preserve the analysis data of each phase lest they be needed on a sub- sequent iteration, Elimination methods are difficult to apply due to the often-irreducible nature of the type dependence graph, and the difficulty in combining the dataflow problem for two types into a single problem that can be solved any more efficiently than those for each type considered independently. Instead, we re- move all circularities in the type graph by collapsing ‘each strongly connected component into a single rep- resentative (a set of types). Any dependences between the members of a representative can be safely removed because all quantities denoted by those members will be analyzed in the same subproblem. The resulting de- pendence graph of type representatives will be a DAG 18 corresponding to a partial ordering on the representa- tives ‘The partial ordering constrains our final schedule, in that a representative cannot be scheduled before those ‘on which it depends, but we can choose the relative order of equally-constrained representatives, Our topo- logical sort orders such representatives according to a heuristic based on the estimated intermediate and final storage costs for analyzing each representative (based ‘on the number of program quantities denoted by the representative's type, the number of program points at ‘which those quantities need to be modeled during anal- ysis, and the number of program points at which a solu- tion is required). We reduce peak storage requirements by scheduling representatives with higher intermediate storage needs earlier, and those with higher final storage needs later, Finally, we map each representative to a set of pro- ‘gram quantities to produce the final schedule. 3.2 Complexity Our algorithm requires several traversals of the program (to gather the types and compute the dependences, and to map the type representatives back to program quasl- tities). Tt also performs a strongly connected compo- nents analysis and a topological sort of the dependence relation on types. Summing these terms yields a worst- case asymptotic complexity of O(P) + O(T*), where P is the size of the program and T is the number of types in the program. ‘The need to represent the dependence relation on types dominates the storage cost, making it O(T?) in the worst case. In practice, both the space and time costs of our partitioning algorithm are small, compared to those of the (typically O(P4) to O(P%)) dataflow analysis being partitioned. 3.3 Complications Real-world programming languages have features that require modifications to the analysis described above: * Polymorphism. Runtime polymorphism in the form of disjoint union types, subtypes, or “pointer to any” types can induce false dependences in the type graph, forcing all of a polymorphic type’s potential instance types to be grouped into the same type representative, and thus be scheduled together. In type-safe languages (even runtime- typed ones) without coercions, we can preserve “The OCT) term i based on the atsumption that every type de pends on ll ther types, andi thus quite conservative, For example, in points-toanalyin, ach type t depends only on the types “pointer {to € and on al typen “pointer to aggregate ° where + conthing xn slement of type Domains Programs (P) Program Quantities (Q) ‘Types (T) Inputs a program p: P consisting of quantities ¢ CQ. Output a sequence of sets of program quantities S = go,...,gq where g: C q and Ua Steps 1. Compute types and type dependences (analysis-specific) © aset of types t: T, © a mapping T ; Q + T from program quantities to types, and a corresponding inverse image map T-!: T + PS(Q), and + a dependence graph G with verti 4; where T(q) = t: and T(q) = ts ste € t and edges < tit, > for all q depending on 2, Flatten recursive type dependences: « Compute an acyclic graph G" by collapsing each strongly connected component of G into a single vertex. « Label each new vertex with the set of collapsed vertices (types). 3. Compute a total ordering: # Compute vertex schedule $ by topologically sorting G’ 4. Map components to program quantities: # Replace each vertex v, in S with set of quantities ¢ = U,7~"(label(vi),) # return $, Figure 5: Partitioning algorithm. 19 nalysis-time polymorphism by omitting polymor- phic types from the dependence relation, and then explicitly adding them back to each type represen- tative containing a potential instance type. This allows quantities of polymorphic type to partic pate in multiple schedule elements: this is both (2) sound, because without coercion, the instance types cannot possibly interfere (no interactions between quantities will be missed), and (2) effi- cient, because even though the polymorphic quan- tity will be modeled in multiple phases, each phase will model a disjoint set of runtime values (no ad- ditional space will be consumed) This scheme allows us to model types such as Modula-3's ref any type without undue conser- vatism. When analyzing unsafe types such as C's void * type, we must either accept the collapse of the dependence graph, or insist on additional programmer specifications guaranteeing that poly- morphism is not being used to subvert the type system, Coercions. In languages like C having type cast- ing, the programmer can force quantities of one type (even a monomorphic one) to carry values associated with some other type: if this is not re- flected in the dependence relation, we will build an underconstrained schedule and compute incor- rect results. We account for this by traversing the program to compute a “coerced-to” relation on types, and then add all elements of this rela- tion to the dependence relation for types. In the worst case (where the “coerced-to” relation con- tains circularities), this can produce substantially larger strongly connected components, and thus construct a shorter schedule, each of whose entries denotes a larger set of types. This is a particular problem in analyzing older C programs where the char + type is used simultaneously as the memory allocator type, the polymorphic structure pointer type, and the string type: the result is that vir- tivally all interesting pointer types end up in the same schedule element as char * First-class functions. The input dependence rela- tion on types must conservatively approximate any possible analysis-time interaction between pro- gram quantities having those types. ‘This works well for data dependences, which are often appar- cent from the user types (e.g., struct foo types depend on struct foo in points-to analysis), but less well for control dependences, which are embedded in the program code, not in the types. ‘This becomes a problem when we consider in- terprocedural analyses involving first-class func- 20 tions (eg., C function pointers in a points-to analysis), as the dataflow solution for a function value determines (and is determined by) control fiow edges between procedures, and affects arbi- trary non-function quantities. A conservative so- lution could establish dependences between func- tion types and the formal, actual, return, result, and modified/referenced global types at all indi- rect function calls, but this is clearly impractical. Our implementation avoids this by precomputing the function solution by other means. 4 Partitioning Using Nonstandard Types 4.1 Motivation Partitioning using nonstandard types is quite similar to partitioning using declared types, except that we ob- tain the initial set of types, and dependences between those types, from a nonstandard type inference mecha- nism. For example, the systems of Steensgaard [Ste96b, ‘Ste96a] and O'Callahan and Jackson (0.95, 0396] both compute type information that can be used by our par- titioning algorithm. Such systems conservatively ap- proximate runtime (and thus analysis-time) value flow by building a typing where all program quantities whose ‘values may appear in a common context (e.g., as the ar- sgument to an indirect pointer operation, or as the value of a particular formal parameter) are assigned the same type Using inferred types instead of declared types has several benefits. First, it can produce much finer-grained dependence information. For example, the “assembler” benchmark (see Section 5) contains 574 expressions of type char +, yet the vast majority of these expressions are unrelated (e.g., many of them are string constants that flow only to library procedures such as printf, and never appear as the value of any other expression of char * type). While the approach of Section 3 need- lessly forces all 574 quantities to be modeled simultane- ously, the analysis of (Ste96al infers 177 distinct types for unrelated char * expressions, allowing their analy- sis to be broken into that many distinct phases. Second, inferred types help address the complica- tions we encountered with user-declared types: ‘+ Polymorphism. Even a monomorphic inference system can be of some use here, because it can infer a distinct monomorphic type for each pro- gram quantity having a polymorphic user type This avoids some false dependences, but will still be overly conservative, Given a polymorphic type inferencer, we can handle this case as we did be- fore, by allowing a polymorphic type to partic pate in multiple members of the final partition. Indeed, a polymorphic inferencer can often infer polymorphic types in monomorphic programs (for ‘example, {0J98] finds polymorphic representation types in C programs), allowing for additional levels, of partitioning. Coercion. We no longer need to worry about type casts explicitly; the inference system must. deal with them when assigning types. In general, ex- pressions related via a cast will share the same inferred type; the difference here is that the type unification takes place at the level of individual cast expressions (each of which has its own inferred input/output types), rather than at the level of en- tire progeam types (where we are forced to unify, for example, char and int). This further reduces false dependences. First-class functions. As part of its operation, the type inference system must infer function types for every function and at every application. We can use these types to compute a conservative approx imation to the function values reaching each ap- plication, and remove the function values from the dataflow analysis. This eliminates a large category of potential dependences at some cost in precision. 4.2 Complexity The complexity results of our partitioning algorithm remain the same when nonstandard type inference is used, except that the the number of types, 7", becomes a function of the inference scheme rather than an at- tribute of the source program (though it is known to be bounded by the program size, P). Different type in- ference systems have complexities varying from almost- linear through exponential; the experiments described in the next section used Steensgaard’s structure union analysis (Ste96a], which has worst-case space complex- ity linear, and time complexity quadratic, in the total size of program variables. 5 Empirical Results ‘We implemented an instance of our partitioning strat- egy targeted to points-to analysis of C programs. The basic relationships between program quantities in points- to analysis are: 1. Two quantities may (potentially) carry the same value. We model this using types, in that (mod- ulo coercion and polymorphism) only values of like type can carry a common value. a1 2. One quantity may hold a pointer to another quan- tity’s value. We model this using type depen- dences: a type fy is considered to depend upon another type ¢2 whenever fy is a pointer type hav: ing as its referent either fy or some aggregate type (transitively) containing a member of type t,° We used three different sources for the type in- formation: (1) the declared C types, ignoring type casting,® (2) C types, accounting for type casting, and (3) nonstandard types computed by Steensgaard’s anal- ysis citeSteensgaard:CO96. We used these types to com- pute partitions for a small suite of pointer-intensive benchmark programs. The programs were repre- sented as arity-raised value dependence graphs [Ruf95b, WCES94]; the program quantities of interest. were pointer-valued expressions (“ports in VDG parlance) For purposes of this discussion, it is sufficient to think of the program quantities as pointer-valued variables in an SSA-form representation. Using this representation assures that we will not ascribe to partitioning benefits which could just as easily be achieved using SSA tech- niques; it also improves the precision of type inference Figure 6 summarizes the generated partitions in terms of the number of program quantities per sub- problem. We can see that all three type schemes gen- erated a large number of partitions, greatly varying in size, As we expected, the modeling of type casts re- duced the number of partitions and (most importantly, for memory consumption purposes) increased the size of the largest subproblem. ‘The nonstandard type infer- ence scheme consistently produced more, smaller sub- problems, and achieved a significant. reduction in the size of the largest subproblem, Figure 6 does not. show the space cost of executing each partition, One might expect the cost to be propor- tional to the number of quantities being modeled, but other factors often dominate: © The cost of modeling each program quantity can vary greatly, even among program quantities of equivalent type. This is true both because some ‘quantities carry more values (e., some pointer variables have more targets than others), and be- cause some quantities cost more to model (e.g., indirectly-accessed quantities must be modeled us- ing a store), Quantities of polymorphic and aggregate (struc ture) types may appear in multiple partition ele- These tnmsitive dependences allow ut to eaily model the “inte soe pointer” of C. Alternatively, we could have choaen toad thi ‘ind of dependence denoting containment in an aguregse tyPe this cam yeldincorectreaulta for C programe containing cst; swe seit merely av Dasline for aueatng the cont of the ndlitions Apendences due to type casting Tame [relevant | type | numberof | win | max ] mean [median quantities | source | subproblems | quants | quants | quants | quants root 75 | wer sy a, as] a6 = cast s| a] 36] a6 8 infer a] if | 2 1 agra THO | user iy] 4}, a] te cast n Ge oti oi]|e ai infer 4 1] x] os 1 ember 355 | user ay ez fee 20) | eo cast ro} 23] 745) x62] 98 infer 206 1| ase | os 1 ackprop TiS | user ayy a0] ae east «|| 2] | 2 infor 25 1] _as| 5 2 ve 287 [user 35/68] tase | 2908 cast 20| 68] iss} 01 | ass inter Pll = nllaralle ap 1 corapller Wa [user ep ty aes cast 6} 1f aol on] ss infer we] 1] 168] 4 1 ‘compress 300 | user 1] 2] 139) 3] To cast o| 2] ase] sr} ao infer n | os] 5 1 Tels THT | user op aye] 2 7 cast ee il ie 3d 4 infer 62 1] ss] 2 1 Toader 07 | oner Tr rs cast u 4} ae] os] a9 infer 88 nl eine Jee 1 part WT [user wp 8] so] os} as cast ise) oes]. as) as infer 01 1|_as| os 1 ‘mutator 1158 | user 109 ese) | geo [eee 100] eee cast wu} oa) as) m2] as infor oe! euler a0| eee 1 pan 35 [user rs cat Se || oie) | 108) a infer oo] i] ws] 6 1 wea 1035 | over 99) ees | ear | 29a cast 2} a9] ame] as] ao infer 23 it etisel emai 1 Figure 6: Partitioning results for our pointer benchmark suite, Column 2 gives the number of program quantities carrying pointer values, The remaining columns give the number of subproblems constructed, and the minimum, ‘maximum, mean, and median of the number of program quantities per subproblem for each of three partitioning schemes: “user” (declared types), “cast” (declared types plus coercion relationships), and “infer” (nonstandard inferred types). Column 2 differs from the port count in [Ruf95b] because quantities of type “store” are not included As aggregate and polymorphic quantities can appear in multiple subproblems, and because type inference can prove certain quantities irrelevant, the product of columns 4 and 7 does not necessarily equal column 2 22 Tame peak nal Tene sparse erg_[ part [ratio | orig | part [ratio am] imo] 22] xf sx] ia] 2 ia [333 [27 | tes] re] 2.3 | 96 7osss0"| To7664 | 16 | 007 | 264 [1.4] eos. C7 ET A wsi2t | Bos19 [1s | 3099 | 2o09 | 13 [ar ziase | isoss | ie] 550 [ 239 [23] 200, FL oss | os23 | 18] a2 [75 [28 | 106 ‘se76 | 1963 | 30 | 460 | 140 | 34] 207 7a73 | ars | 16 | 7a] sar] 21] 225 simulator | 178697 | o0sas | 2.0 | 2008 | ove | 21] a0 span ‘sere i005 | 20] 729] 64 [20] 195, yacr | aorta [seer [72 Pinas [| ams [a0] 87 Figure 7: Storage usage by points-to analyses under dense and sparse storage models, measured as a count of “points-to pairs” relating container and containee locations. The “orig” columns give counts for monolithie analysis, while the “part” columns give counts for the partitioned case. The final column is the cost of storing the dataflow. solution only at the location inputs to memory reads and writes. ments (those of their instances and slots, respec tively), which leads to conservative estimates of overall modeling costs when we add up the mum- ber of quantities modeled over all phases. ‘The analysis of each schedule element need only model, the relevant instances (slots). We estimated the benefits of partitioning by ana- lyzing the points-to relations computed by our exist- ing context-insensitive interprocedural points-to analy- sis [Ruf95a]. For each partition, we computed an upper bound” on the number of points-to pairs produced by the analysis of that subproblem’s program quantities, as well as the number of points-to pairs appearing in the solution for that subproblem (many of the intermedi- ate points-to pairs, such as those belonging to the store ‘model or to control flow merges, can be discarded). Fig- ure 7 gives cost, results for the nonstandard-type-based partitions under two different store models: (1) a dense model, in which a points-to pair arising from a write is ‘modeled at that write and at every control flow merge, and (2) a sparse model, in which store modeling for each subproblem is restricted to relevant control flow ‘merges on the dominance frontiers of write operations appearing in the subproblem. ‘The results show that. partitioning produced a mea- surable benefit for all of our subject. programs under both dense and sparse store models. ‘The latter is im- "For quantities of aggregate or polymorphic type, which may ap- pear in multiple aubpeablema our entimate assigns the fall modeling nt to all rlevank mubprablers, thas averestimating the cout per portant since it demonstrates that our partitioning ex- ploits a temporal form of sparseness that is orthogonal to the spatiat sparseness captured by existing sparse evaluation graph techniques. 6 Other Applications 6.1 Speed In our experience with quadratic space/time interprod- edural dataflow analyses, the single most important de- terminant of performance has been keeping the analysis information within the working set of the machine. We believe that virtually any reduction in the intermedi- ate memory consumption of such dataflow analyses are likely to have positive effects on speed in practice Because partitioning enables the application of sparse representation techniques such as [CCF91] to the individual subproblems, we can also hope to gain speed by such means. The usual argument is that the sum of the analysis times for the partitions can be less than the analysis time for the program considered as 1a whole, provided that the costs of constructing the partitions and their sparse representations remain suf- ficiently small® Because our initial experiments are based on a post-analysis of an existing monolithic anal- ysis, we have no empirical evidence for speedup at present, si clearly not alraya the cat; eg, the spate repreventation [Ruf] achieved breakeven only on the larget of the benchavarks| tn thie paper. 23 6.2 Parallelism In the penultimate phase of our partitioning algorithm, Wwe use a topological sort to serialize the evaluation of our partitions, which are related only via a partial or- dering. We could just as easily evaluate unrelated par tions in parallel on a multiprocessor or distributed sys- tem. Previous approaches to parallelization of dataflow analyses [LRF94] have focused on elimination analyses, where the per-processor tasks consist of summarizing and propagating solutions for distinet regions of the control flow graph. Used alone, type-based partition- ing is likely to generate larger-granularity tasks (since each partition is a whole-program analysis problem), which is good for amortizing communication costs, but likely to make load balancing more difficult. Our ap- proach, being quantity-based rather than point-based, is orthogonal to elimination-based techniques, and thus might achieve finer-grained parallelism in combination with them. 6.3 Simplifying dataflow representations This work was partially inspired by Landi and Ryder's work (LR91, LR92] on alias analyses for languages hav- ing only single-level pointers (e.g., pointers denoting only scalar values), Such languages are convenient to analyze because every alias induced by a procedure can be attributed to a single alias holding on entry to that procedure. A precise treatment of multi-level pointers Js significantly more complex due to the need to rep- resent multiple layers of dependences (Ruf95a, WL95}. Our partitioning scheme effectively converts a multi- level problem into a sequence of predominantly single level subproblems (multi-level modeling of recursive data types cannot be avoided). Of the 3960 subproblems wwe constructed using nonstandard type inference, 3901 (98.5%) were single-level in nature. We envision using single-level techniques on the single-level subproblems, and reserving the more expensive, exponential analysis machinery for the remaining subproblems. 7 Related Work 7.1 Partitioning of Analyses Zhang et. al [ZRL96a, ZRL96b] used automatically in- ferred dependence information (a flow-insensitive points- to analysis) to decompose an alias analysis problem into a set of completely independent subproblems (essen- tially our scheme, but using only weakly connected com- ponents instead of our collapse-and-schedule heuristic). They reduced analysis times by over half, not only by 24 reducing subproblem size, but by falling back on flow- insensitive alias analysis for nonrecursive dependence components, where the empirical precision cost of do- ing so appears to be low. Grammar flow analysis is an analogue of dataflow analysis that propagates information on context-free grammars rather than control-flow graphs. Such sys- tems [JP90] use a variety of partitioning and sorting techniques that appear very similar to our topological sort and size heuristics. 7.2 Dataflow Analysis Using Types In some cases, programmer-declared type information is suficent to serve as a dataflow solution; for example, im a type-safe language, values of incomparable types cannot be aliased. Often, types are used as the base for a more complex analysis; e.g, Deutsch (Deu94] uses (trusted) type information to determine the structure of the analysis domains used to represent alias relation- ships among structured heap data, Alternatively, ex- plicit specification types such as those of ADDS [HH92] can be used to restrict the dataflow solution, improving its precision. In most cases, however, the declared type informa- tion is either untrustworthy or overly coarse, For this and other reasons, several researchers have developed dataflow analyses that operate by inferring nonstan- dard types. Henglein was the first to use type infer- ence to perform binding time analysis in almost lin- ear time [Hen91], and later applied a similar approach to optimization problems such as unboxing (Hen92] Henglein's algorithm inspired Steensgeard's type-based almost-linear points-to analyses for C [Ste96b, Ste96a], which we used in Section 5. Zhang et. al's flow- insensitive alias analysis (ZRL96a] infers equivalence classes of object names which can be considered as types. O'Callahan and Jackson (095, 0.196) subject C programs to forms of polymorphic type inference that capture various notions of equivalence of data represen- tation, 7.3 Sparse Dataflow Analysis ‘The costs of a flow-sensitive dataflow analysis are pro- portional to three factors: 1. the cost of modeling a single program quantity at fa program point, 2. the number of quantities modeled, and 3. the number of points at which each quantity is modeled. Factor (1) is specific to the particular analysis prob- Jem; reducing itis primarily a matter of choosing an effi cient data structure for the elements of the dataflow lat- tice, Factor (2) is also constrained by the dataflow prob- lem; the absolute number of relevant program quanti- ties typically cannot be reduced. Quantity-based par- titioning targets this factor by reducing the number of program quantities modeled simultaneously ‘A large body of work on sparse representations targets factor (3). Program representations that di- rectly model producer-consumer dependences, such as static single assignment (SSA) form [CFR*91] and its many relatives represent each quantity only at statically-determined definition and meet points, but assure that. each use point can locate the appropri- ate model in constant time. These are general repre- sentations; [CCF91, Ruf95b] build similar graphs for analysis-specific purposes. Similar effects are achieved by systems that directly simplify the dataflow equations at each program point (DGS94), or use a hierarchical control representation to avoid modeling quantities in regions where their values are not. changed [JPP94} Still other schemes leave the program representation intact, and dynamically represent an analysis-specific producer-consumer relation in an efficient auxiliary data structure [CW290, WL95}, ‘The sparse store model used in the estimates of Sec- tion 5 is similar to the static dependence graphs con- structed by SEG construction [CCF91] and store split- ting [Ste95] and the dominator-based store models of [CWZ90, WL95}, Previous publications on this topic have not provided empirical data; our results (shown in Figure 7) confirm that such methods can save up to a factor of 85 in space usage at analysis time. 8 Conclusion Partitioning of dataflow analyses is useful for reducing ‘memory consumption, enabling parallel analysis, and simplifying the representations of dataflow values. We have presented a new means for partitioning dataflow analyses based on type information. Our key insight is ‘that type information not only bounds the runtime be- havior of programs, but also the analysis-time behavior of datafiow analyses of those programs.” This allows us to partition any dataflow analysis whose analysis-time data dependences can be abstracted by types, whether those be language-level types declared by the program- mer or analysis-specific types computed via nonstan- dard type inference. Our initial experiments, which fo- cus on reducing the peak memory usage of pointer alias based [CCT?]datafow annigacs are themcivesprowaly conersative 25 analysis, achieved memory savings both alone and in conjunction with sparse evaluation graph techniques. Acknowledgements ‘The author would like to thank Bjarne Steensgaard and Daniel Weise for discussions about type-based parti- tioning, Bjarne Steensgaard for assistance in using his nonstandard type inference systems, Bruce Duba and Todd Knoblock for comments on this submission, and William Landi and Todd Austin for sharing their bench- ‘mark programs. References (CCTT} P. Cousot and R. Cousot. Abstract interpretation: A Unified lattice mode! for static analysis of programs bby construction or approximation of fxpoints. In Proceedings of the Fourth Annual ACM Symposium on Principles of Programming Languages, pages 238 252, Los Angeles, January 1977. J-D. Choi, R. Cytron, and J. Ferrante, Automatic ‘onstruction of sparse dataflow evaluation graphs. In Proceedings of the Bighleenth Annual ACM Sympo- sium on Principles of Programming Languages, pages 155-65, ACM Pres, January 1991, Cytron, J. Ferrante, B. K. Rosen, M.N. Wegman, and F.K. Zadeck. Efciently computing static single assignment form and the control dependence graph, ACM Transactions on Programming Languages and ‘Systems, 19(4):451-490, October 1991. D. R Chase, M. Wegman, and F. K. Zadeck. Anal ysis of pointers and structures, In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and implementation, pages 296-310, White Plains, NY, June 20-22, 1990. A. Deutsch, Interprocedural may-alias analysis for pointers: Beyond Kelimiting. In Proceedings of the SIGPLAN '94 Conference on Programming Language Devign and Implementation, pages 230-239. ACM Press, 1994 E, Duesterwald, R. Gupta, and M.L. Soffa. Reducing the cost of data Row analysis by congruence partition: ing. In CC '94: Fifth International Conference on Compiler Construction, pages 351-373, April 1994 D. M. Dhamdhere, B. K. Rosen, and F. K. Zadeck How to analyze large programs efficiently and infor rmatively. In Proceedings of the SIGPLAN '92 Con: Jerence on Programming Language Design and m- plementation, pages 212-223. ACM Press, June 1992. FF Henglein, Efficient type inference for higher-order Dinding-time analysis. Tn Functional Programming and Computer Architecture, pages 448-472, 1991, ocrs1) [crR+9y] few200] [Devo4} (Dcso4) [pRz02} {Hen9i} [Heng2] _F, Henglein. Global tagging optimization by type in- ference. In Proceedings of The Conference on LISP and Functional Programming, pages 205-215, 1992 (ino) {3P90} (uPPaq) (uro3} (LR9) [uRF9s] [0195] (96) [RHs95 (Pesos) [urst) [steas} L. J. Hendren and J. Hummel Abstractions for recursive pointer dat improving the analysis and transformation of imperative programs. Im Proceedings of the SIGPLAN ‘98 Conference on Programming Language Design and Implementation, pages 249-260. ACM Press, June 1992. M, Jourdan and D. Parigot. Techniques for improv ing grammar fow analysis. In Srd European Sympo sium on Programming, number 432 in Lecture Notes {in Computer Science, pages 240-255. Belin: Springer. ‘Verlag, 1990 R. Johnson, D. Pearson, and K, Pingali, The program structure tree: Computing controt regions in linear time. In Proceedings ofthe SIGPLAN '94 Conference fom Programming Language Design and Implementa- tiom, pages 171-185, Orland, Florida, June 20-24, 1994, ACM Press W. Landi and B. G. Ryder, Pointerinduced alias ing: A problem clasifeation. In Procedings of the Bighteenth Annual ACM Symposium on Principles of Programming Languages, pages 99-103. ACM Press, January 1991 W. Landi and B. G. Ryder. A safe approximate algo- ‘thm for interprocedural pointer aliasing. In Proceed. ings of the SIGPLAN '92 Conference on Program: rming Language Design and Implementation, pager 235-248, ACM Press, June 1992, Y-f, Lee, B. G. Ryder, and M. B. Fiucrynski, Re- sion analysis: A parallel elimination method for data flow analysis. In International Conference on Com: puter Languages, pages 31-42. IEEE Computer Soci- ‘ety, May 1994 R, O°Callaban and D. Jackson. Detecting shared representations using type inference. Techical Re port CMU-CS-95-202, School of Computer Science, Carnegi-Mellon University, September 1995, R, O'Callahan and D. Jackson. Practical program understanding with type inference. Technical Re- port CMU-CS.96-180, School of Computer Science, Carnegie-Mellon University, May 1996. T. Reps, 8. Horwite, and M. Sagiv. Precise interpro: cedural datafow analysi vin graph reachability. In Proceedings 22nd ACM SIGPLAN-SIGACT Sympo- sium on Principles of Programming Languages, pages 49-61, January 1998, E. Ruf, Contextintensitive alias analysis reconsid- ‘ered. In Proceedings of the SIGPLAN '95 Conference ‘on Programming Language Design and implementa ‘ion, pages 13-22, June 1995, B. Ruf. Optimising sparse representations for ‘datafiow analysis, In ACM SIGPLAN Workshop on Intermediate Representations (1°95), pages 50-6, January 1995. Proceedings published as ACM SIG- PLAN Notices 80(3), March 1988. B. Steensgaard ative programs. Sparse functional stores for imper In ACM SIGPLAN Workshop on 26 [sten6a] [ste96t) Iwesses) [wLosy [2RL96al [arose Intermediate Representations (IR), pages 62-70, “Tenuary 1995, Proceedings published as ACM SIG- PLAN Notices 30(3), March 1985. B. Steensguard. Points-o analysis by type inference of, rogeams with structures and unions, In International Conference on Compiler Construction, number 1060 in Lectore Notes in Computer Science, pages 136-150, Apeil 1996. B. Stoonsgaard. Pointsto anelysis in almost lin- fear time, In Proceedings 29rd ACM SIGPLAN. SIGACT Symposium on Principles of Programming Languages, pages 32-41, January 1996, D. Weise, R. P. Crew, M. Ernst, and B. Steensgasrd. Value dependence graphs: Representation without taxation. Technical Report MSR-TR-94-03, Microsoft Research, Redmond, WA, April 13, 1994 R. P. Wilson and M. 8. Lam, Bficient context- sensitive pointer analysis for C programs. In Proceed: ingr of the SIGPLAN '95 Conference on Program: ring Language Design and Implementation. ACM Press, June 1995, S. Zhang, B. G. Ryder, and W. Landi. Program de- ‘compasition for pointer aliasing: A step towards prac tical analyses. In Fourth Symposium on the Pound tions of Software Engineering (FSB4), October 1996. S. thang, B. G. Ryder, and W. Landi ‘gram decomposition for pointer-induced aliasing anal- ysis, Technical Report LCSR-TR-259, Laboratory for Computer Science, Rutgers University, March 1996, Pro-

You might also like