Professional Documents
Culture Documents
Intermediate Representation: Goals
Intermediate Representation: Goals
Goals:
encode knowledge about the program
facilitate analysis
facilitate retargeting
facilitate optimization
HIR intermediate
code gen.
LIR
optim
LIR code
gen.
Intermediate representation
Components
code representation
symbol table
analysis information
string table
Issues
IR selection
Using an existing IR
cost savings due to reuse
it must be expressive and appropriate for the
compiler operations
Designing an IR
IR classification: Level
High-level
closer to source language
used early in the process
usually converted to lower form later on
Example: AST
IR classification: Level
Medium-level
try to reflect the range of features in the source
language in a language-independent way
most optimizations are performed at this level
algebraic simplification
copy propagation
dead-code elimination
common subexpression elimination
loop-invariant code motion
etc.
IR classification: Level
Low-level
very close to target-machine instructions
architecture dependent
useful for several optimizations
loop unrolling
branch scheduling
instruction/data prefetching
register allocation
etc.
IR classification: Level
i := op1
instructions
endfor
High-level
i := i + step
goto L1
L2: if i < op2 goto L3
instructions
i := i + step
goto L2
L3:
Medium-level
7
IR classification: Structure
Graphical
Trees, graphs
Not easy to rearrange
Large structures
Linear
Hybrid
Combine graphical and linear IRs
Example:
low-level linear IR for basic blocks, and
graph to represent flow of control
8
Graphical IRs
Parse tree
Abstract syntax tree
High-level
Useful for source-level information
Retains syntactic structure
Common uses
source-to-source translation
semantic analysis
syntax-directed editors
Graphical IRs
Tree, for basic block*
root: operator
up to two children: operands
can be combined
gt, t2
Uses:
algebraic simplifications
may generate locally optimal code.
L1: i := 2
assgn, i
add, t1
assgn, i
if t2 goto L1
gt, t2
t1:= i+1
t2 := t1>0
add, t1
t1
2
10
Graphical IRs
Directed acyclic graphs (DAGs)
Like compressed trees
leaves: variables, constants available on entry
internal nodes: operators
11
Graphical IRs
Generating DAGs
Check whether an operand is already present
if not, create a leaf for it
Check whether there is a parent of the operand that
represents the same operation
if not create one, then label the node representing
the result with the name of the destination
variable, and remove that label from all other
nodes in the DAG.
12
Graphical IRs
Directed acyclic graphs (DAGs)
Example
m := 2 * y * z
n := 3 * y * z
p := 2 * y - z
13
Graphical IRs
Control flow graphs (CFGs)
Each node corresponds to a
basic block, or
part of a basic block, or
a single statement
14
Graphical IRs
Dependence graphs : they represents constraints
Data dependence
Graphical IRs
Dependence graphs
Example:
s1
s2
s3
s4
s5 L1:
s1
s2
a := b + c
if a>10 goto L1
d := b * e
e := d + 1
d := e / 2
s3
s
4
s5
control dependence:
s3 and s4 are executed only when a<=10
true or flow dependence:
s2 uses a value defined in s1
This is read-after-write dependence
antidependence:
s4 defines a value used in s3
This is write-after-read dependence
output dependence:
s5 defines a value also defined in s3
This is write-after-write dependence
input dependence:
s5 uses a value also uses in s3
This is read-after-read situation. It places no constrain
16
in the execution order, but is used in some optimizatio
Basic blocks
2.
Basic blocks
read(n)
unsigned int fibonacci (unsigned int n) {
f0 := 0
unsigned int f0, f1, f2;
f1 := 1
f0 = 0;
if n<=1 goto L0
f1 = 1;
i := 2
if (n <= 1)
L2: if i<=n goto L1
return n;
return f2
for (int i=2; i<=n; i++) {
L1: f2 := f0+f1
f2 = f0+f1;
f0 := f1
f0 = f1;
f1 := f2
f1 = f2;
i := i+1
}
go to L2
return f2;
L0: return n
}
18
Basic blocks
entry
Leaders:
read(n)
f0 := 0
f1 := 1
if n<=1 goto L0
i := 2
L2: if i<=n goto L1
return f2
L1: f2 := f0+f1
f0 := f1
f1 := f2
i := i+1
go to L2
L0: return n
read(n)
f0 := 0
f1 := 1
n <= 1
i := 2
return n
i<=
n
f2 := f0+f1
f0 := f1
f1 := f2
i := i+1
return f2
exit
19
Linear IRs
Sequence of instructions that execute in order
of appearance
Control flow is represented by conditional
branches and jumps
Common representations
20
Linear IRs
Stack machine code
Assumes presence of operand stack
Useful for stack architectures, JVM
Operations typically pop operands and push results.
Advantages
Easy code generation
Compact form
Disadvantages
Difficult to rearrange
Difficult to reuse expressions
21
Linear IRs
Three-address code
Compact
Generates temp variables
Level of abstraction may vary
Loses syntactic structure
Quadruples
operator
up to two operands
destination
Triples
similar to quadruples but the results are not named
explicitly (index of operation is implicit name)
Implement as table, array of pointers, or list
22
Linear IRs
L1: i := 2
(1)
t1:= i+1
(2)
i st (1)
t2 := t1>0
(3)
i+1
if t2 goto L1
(4)
(3) > 0
(5)
if (4), (1)
Quadruples
Triples
23
SSA form
Static Single Assignment Form
Encodes information about data and control flow
Two constraints:
Each definition has a unique name
Each use refers to a single definition
Advantages:
Simplifies data flow analysis & several optimizations
SSA size is linear to program size
Eliminates certain dependences (write-after-read,
write-after-write)
x := 5
x0 := 5
Example: x := x +1
y := x * 2
x1 := x0 +1
y0 := x1 * 2
24
SSA form
Consider a situation where two control-flow
read(x)
x>0
y := 5
y := 10
x := y
read(x0)
x0 > 0
y0 := 5
y1 := 10
x1 := y
should this be y0 or y1?
25
SSA form
The compiler inserts special join functions
y0 := 5
y1 := 10
y2 = (y0, y1)
x1 := y2
y1 :=
10
y2 :=
y1
x1 := y2
26
SSA form
x0 := 0
i0 := 1
Example 2:
x := 0
i := 1
while (i<10)
x := x+i
i := i+1
x := 0
i := 1
x1 := (x0, x2)
i1 := (i0, i2)
i1 < 10
i < 10
x := x +
1
i := i +
1
exit
x2 := x1 + 1
i2 := i1 + 1
exit
27
SSA form
B1
x0 := 0
i0 := 1
functions?
Intuition:
x1 := (x0, x2)
i := (i0, i2)
B3 1
i1 < 10
B2
x2 := x1 + 1
i2 := i1 + 1
exit
There is a definition of x in B1
and another in B2
There is a path from B1 to B3
There is a path from B2 to B3
B3 is the first "common" node
in these two paths.
A function for x should be
placed in B3.
28
SSA form
A program is in SSA form if
Each variable is assigned a value in exactly one
statement
Each use of a variable is dominated by the definition.
Domination
Dominators
A node N may have several dominators, but
30
entry
B1
B2
B3
B4
B5
B1
B6
B2
B3
B4
B7
B5
B8
B9
B6
B7
B8
B10
exit
B9
B10
31
a flow graph:
Natural Loop = A set of basic blocks with
Loop-finding algorithm:
Find an edge BA where A dominates B. This is
called a back-edge. Add A and B to the loop
Find all nodes that can reach B without going through
A. Add them to the loop.
32
entry
Loops
B1
B2
B3
B4
B5
B6
B7
B8
B9
B10
exit
Nested Loops
Loop K is nested within loop L if
L and K have different headers, and respectively,
and
is dominated by
34
graph.
35
Dominance frontier
B0
B1
B2
B11
B3
B4
B5
B6
B7
B8
B10
B9
B8.
36
Dominance frontier
The dominance frontier of a node B can be
computed as follows:
DF(B) = DFlocal(B)
U U
DFup(C)
where:
37
entry
entry
exit
1
{ex}
3
4
6
{8}
{6}
{6}
{8}
1
0
1
0
{11}
1
1
exit
8
{2, ex}
{8}
{ex}
1
2
{2, ex}
1
1
{9, 2, ex}
1
2
{local,
up}
{2, ex}39
entry
entry
B0
B0
B1
B2
B11
B3
B4
B5
B7
B8
B6
B9
B11
B1
{B6}
{B6, exit}
B8
B2
{}
B6
exit
{exit}
B7
B10
{exit}
B10
exit
B3
B4
B5
B9
{B5}
{B8}
{B6}
{B10}
40