10-Runtime & Code Gen PDF
http://www.labouseur.com/courses/compilers/compilers/alan/
3/10/2015
Storage Allocation
Storage allocation roles
OS: allocates physical memory to back virtual memory
Storage Allocation
[Figure: memory layout between the low address and the high address, with regions for code, static data, stack, and heap. The stack holds activation records, each containing parameters and return value, control link and saved status, and local temps; the stack pointer marks the top of the stack.]
Procedure Calls
Storage for procedure calls
Storage is allocated when the procedure is called
When exiting the procedure, the storage is deallocated
The region dynamically created for a procedure is called
an activation record
Activation records are mostly stack based
Some use heap storage
So that the deallocation can be flexible
In case there is static information in the procedure that needs to
Procedure Calls
Activation record
Return value
Size is defined by the type of the function
Kept at the top area so that the address can be easily determined and the caller can access the region easily
Actual parameters
Caller copies its local values to this region
Callee accesses this region to get the input data
For output parameters, reverse the action
Control link
Pointers to the AR of the caller
Return address
Stored by the caller
Other control pointer
[Figure: activation record layout. Return values and actual parameters are accessed by caller and callee; control links and local data are accessed only by the callee.]
Procedure Calls
Activation record
Local data
All data declared in the procedure
All temporary variables generated in the procedure
Dynamic data are in the heap
But the pointers are in the AR (stack)
Calling sequence
P calls Q, Q calls R
ARs are kept track of in the stack
[Figure: the stack holding P's AR, Q's AR, and R's AR, with R's AR at the top of the stack.]
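The calling sequence above can be sketched as a small simulation. This is only an illustration under assumed names: the `ActivationRecord` class, its fields, and the `call`/`ret` helpers are hypothetical and simply mirror the regions listed on the slides.

```python
from dataclasses import dataclass, field

@dataclass
class ActivationRecord:
    """Hypothetical model of one AR; fields follow the regions on the slides."""
    procedure: str
    params: dict
    control_link: "ActivationRecord | None"  # AR of the caller
    return_value: object = None
    local_data: dict = field(default_factory=dict)

stack = []  # run-time stack of ARs; the last element is the top of stack

def call(procedure, **params):
    """Allocate an AR when the procedure is called."""
    caller = stack[-1] if stack else None
    ar = ActivationRecord(procedure, params, control_link=caller)
    stack.append(ar)
    return ar

def ret(value=None):
    """Deallocate the top AR when the procedure exits."""
    ar = stack.pop()
    ar.return_value = value
    return ar

call("P")            # P calls Q, Q calls R
call("Q", n=1)
call("R", n=2)
snapshot = [ar.procedure for ar in stack]   # P at the bottom, R on top
r = ret(42)                                 # R returns; its AR is popped
```

After `ret`, only the ARs of P and Q remain, and the popped record's control link still identifies Q as the caller.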
Procedure Calls
Nested procedures
Some languages support nested procedure definitions
Scope problem
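[Figure: nested procedure definitions and their scopes. P1 declares int x, y; P2 declares int a, b; P3 uses a and x; another procedure declares int m, n and uses m. A second diagram shows the nesting and calling structure of procedures P1 through P6.]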
Procedure Calls
Access links
To keep track of where to access non-local data
Linked to the closest scope parent, not the caller
One of the control links
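[Figure: the nested procedures (P1 declares int x, y; P2 declares int a, b; P3 uses a and x; int m, n are declared and m is used in another procedure), with access links drawn between the activation records of P1 through P6.]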
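A minimal sketch of non-local lookup through access links follows. The `Frame` class, the `lookup` helper, and the concrete nesting (P3 declared inside P2, P2 inside P1) are assumptions made for illustration; the variable values are invented.

```python
class Frame:
    """Hypothetical activation record with both kinds of link."""
    def __init__(self, name, access_link, caller, **local_vars):
        self.name = name
        self.access_link = access_link  # AR of the closest enclosing scope
        self.control_link = caller      # AR of the caller (not used for lookup)
        self.locals = dict(local_vars)

def lookup(frame, var):
    """Follow access links outward until some scope declares var."""
    while frame is not None:
        if var in frame.locals:
            return frame.locals[var]
        frame = frame.access_link
    raise NameError(var)

# Assumed nesting: P3 is declared inside P2, which is declared inside P1.
p1 = Frame("P1", None, None, x=10, y=20)
p2 = Frame("P2", p1, p1, a=1, b=2)
# A second activation of P2, called from the first one:
p2b = Frame("P2", p1, p2, a=100, b=200)
p3 = Frame("P3", p2b, p2b)

a_seen_by_p3 = lookup(p3, "a")   # found in the enclosing P2 activation
x_seen_by_p3 = lookup(p3, "x")   # found two scopes out, in P1
```

Note that the second P2 activation is called by the first, yet its access link still points to P1, its closest scope parent.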
Memory Management
The compiler deals with the virtual address space
Stack, heap, and static storage are all in the virtual address space
Mapping the virtual space to physical memory
An OS issue
Some issues that the compiler can help with
Reduce stack space
The stack generally stays in main memory
Garbage collection
The virtual address space may also run out
More consumption of virtual space may result in a higher chance of page faults
Garbage collection is program dependent (it needs data analysis) and needs the compiler to generate code to do the job
Garbage Collection
Principle
Should be safe and conservative,
so that useful data is never damaged
Problem
How can we tell whether an object is garbage now?
Approach
Reachability analysis
A program can only use the objects it can reference
An object that can no longer be reached from the program is garbage
How to determine reachability?
Garbage Collection
How to determine whether an object is reachable
Can check through roots and their references recursively
The roots for referencing any object are in
Static memory: static/global pointers defined in the program
Registers: store the state of temporary pointers
Stack: temporary pointers defined in the procedures
Type safety
Some languages are type safe (e.g., Java)
Can safely determine the reachability of objects by checking
through objects with reference (pointer) type
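Reachability from the roots can be sketched as a graph traversal. The heap encoding below (a dictionary from object name to the names it references) and the function name `reachable` are assumptions for illustration; a worklist is used in place of explicit recursion.

```python
def reachable(roots, references):
    """Return the set of objects reachable from the roots.

    references maps each object to the objects it points to.
    """
    marked = set()
    worklist = list(roots)
    while worklist:
        obj = worklist.pop()
        if obj in marked:
            continue                      # already visited
        marked.add(obj)
        worklist.extend(references.get(obj, ()))
    return marked

# Hypothetical heap: D and E reference each other but nothing reaches them.
heap = {"A": ["B"], "B": ["C"], "C": [], "D": ["E"], "E": ["D"]}
roots = ["A"]  # e.g. pointers found in static memory, registers, and the stack
live = reachable(roots, heap)
garbage = set(heap) - live
```

Everything not marked is garbage, including the D/E cycle.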
Reference Counting
Try to do incremental garbage collection
Rather than waiting for the memory to be exhausted, try to
Reference Counting
Change of references
Object allocations
Via memory management calls: malloc, new, etc.
Set the reference count of the new object to 1
Reference assignment: p := q
Decrease the reference count of the object originally pointed to by p
Increase the reference count of the object referenced by q
Procedure calls
References to objects may be passed from actual to formal parameters
Increase the reference count of each reference object passed to the
procedure
Reference Counting
Change of references
Procedure returns
All objects referenced by local/temporary variables in the frame are now
unreachable, unless they are referenced by multiple references
References to objects may be passed from the returned objects to the
caller
But the reference count does not change, just got transferred
For each object referenced in the AR, decrement its reference count
Transitive rule
When an object o's reference count becomes 0
For each object referenced by o, decrement its reference count
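The update rules above can be sketched as follows. The `Obj` class, the `assign` helper, and the `freed` log are hypothetical. One simplification is assumed: a new object starts with count 0 and the first reference assignment raises it to 1, rather than the allocator setting it to 1 directly.

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.fields = {}   # field name -> Obj or None

freed = []  # names of reclaimed objects, kept only for illustration

def inc(obj):
    if obj is not None:
        obj.refcount += 1

def dec(obj):
    if obj is None:
        return
    obj.refcount -= 1
    if obj.refcount == 0:
        # Transitive rule: release everything the dead object references.
        for child in obj.fields.values():
            dec(child)
        freed.append(obj.name)

def assign(holder, field_name, new):
    """holder.field_name := new"""
    old = holder.fields.get(field_name)
    inc(new)               # increment first so that p := p is safe
    dec(old)
    holder.fields[field_name] = new

root = Obj("root")         # stands in for a stack frame holding references
a, b = Obj("a"), Obj("b")
assign(root, "p", a)
assign(a, "next", b)
assign(root, "p", None)    # a drops to 0, then b transitively

c, d = Obj("c"), Obj("d")  # a circular structure
assign(root, "q", c)
assign(c, "next", d)
assign(d, "next", c)
assign(root, "q", None)    # c and d keep each other's count at 1
```

The second half shows the circular-reference problem: once the only outside reference is dropped, c and d are unreachable but neither count reaches 0.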
Reference Counting
Good points
Done incrementally, does not need to halt program
execution
Easy to implement
Problems
Cannot handle circular references
Need to update reference count for each reference
assignment
Very expensive
Rarely used in real systems
Copying Collector
Principle
Use two memory heaps
One in use by the program
The other sits idle
GC
Assume that now A is in use and B is sitting idle
When A is running out of space
Copy all reachable objects from A to B
Unreachable objects are automatically discarded
Switch heap after copying (A becomes idle and B is in use)
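A minimal sketch of the copy step, assuming a toy heap encoding: each heap is a dictionary from address to a list of fields, plain integers are data, and `("ref", addr)` tuples are references. The function name `collect` and this encoding are hypothetical. A forwarding table records where each object moved so that references can be updated.

```python
def collect(from_space, roots):
    """Copy everything reachable from roots into a fresh to-space.

    Returns (to_space, new_roots).
    """
    to_space = {}
    forwarding = {}   # old address -> new address
    scan = []         # new addresses whose fields still need updating

    def copy(addr):
        if addr not in forwarding:
            new_addr = len(to_space)      # allocate at the top of the heap
            forwarding[addr] = new_addr
            to_space[new_addr] = list(from_space[addr])
            scan.append(new_addr)
        return forwarding[addr]

    new_roots = [copy(r) for r in roots]
    while scan:
        obj = to_space[scan.pop(0)]
        for i, f in enumerate(obj):
            if isinstance(f, tuple) and f[0] == "ref":
                obj[i] = ("ref", copy(f[1]))   # update the reference
    return to_space, new_roots

A = {
    10: [1, ("ref", 30)],
    20: [2],                  # unreachable, never copied
    30: [3, ("ref", 10)],     # points back to 10
}
B, roots = collect(A, [10])   # B becomes the heap in use
```

The live objects end up contiguous at addresses 0 and 1 in B, and the unreachable object at 20 is simply left behind in A.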
Copying Collector
Good points
Simple
Automatically eliminates fragmentation
Can have simpler malloc implementation
Since the memory is going to be compacted, just allocate the top of
the heap to new objects instead of keeping track of all free slots
Problem
When copying, each reference needs to be updated since
Generational GC
An important observation
In a long-running system
If an object has been reachable for a long time, it is likely to
remain so
Most of the new objects become garbage shortly
Statistics: Less than 10% stays alive
Generational GC
Remember set
Avoid scanning everything in tenured set
In practice, tenured objects are unlikely to point to new objects and
new objects are unlikely to point to tenured objects
The compiler inserts extra code to catch modifications to tenured objects
When a tenured object is modified to point to a new object, it is put
into the remember set
Algorithm
When collecting garbage in G0
Roots for GC: registers, stacks, G0, and remember set of G1
While collecting garbage in G0, record references to G1
Periodically switch objects from G0 to G1
Occasionally collect garbage in G1
Roots for GC: registers, stacks, G1, and part of G0 that references G1
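The remember set and a G0 collection can be sketched as below. All names (`allocate`, `write`, `minor_gc`) are hypothetical, and the sketch assumes a simplification: every survivor of a G0 collection is promoted to G1 immediately, so the remember set can be emptied afterwards.

```python
young = set()          # G0
tenured = set()        # G1
remember_set = set()   # tenured objects that point into G0
fields = {}            # object -> set of objects it references

def allocate(name):
    young.add(name)
    fields[name] = set()
    return name

def write(src, dst):
    """Write barrier (the extra code inserted by the compiler): src now points to dst."""
    fields[src].add(dst)
    if src in tenured and dst in young:
        remember_set.add(src)

def minor_gc(roots):
    """Collect G0 only, without scanning the whole tenured set."""
    worklist = [r for r in roots if r in young]
    for t in remember_set:
        worklist.extend(o for o in fields[t] if o in young)
    live = set()
    while worklist:
        o = worklist.pop()
        if o in live:
            continue
        live.add(o)
        worklist.extend(c for c in fields[o] if c in young)
    dead = young - live
    for o in dead:
        del fields[o]
    tenured.update(live)   # assumed policy: promote all survivors
    young.clear()
    remember_set.clear()
    return dead

old = allocate("old")
first = minor_gc(roots=[old])     # "old" survives and becomes tenured
tmp = allocate("tmp")
kept = allocate("kept")
write(old, kept)                  # tenured -> new reference is remembered
dead = minor_gc(roots=[])
```

In the second collection, `kept` survives only because the write barrier put `old` into the remember set; `tmp` is reclaimed.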
Generational GC
Good points
Much more efficient
Garbage collection can be done generation by generation
Generally more than 2 generations can be considered
Avoid large pauses
The cross references among different generations are recorded
Unlikely to have a large remember set in practice
So GC in each generation can be done almost independently
Code Generation
Code Generation
Use registers during execution
Whenever possible, perform computation in registers
Memory loads and stores are much more expensive
Instruction selection
Map the intermediate code to the set of machine instructions
Peephole optimization
Code Generation
Various methods for register allocation and
instruction scheduling
Tree
scheduling
Global
Global register allocation
Do not have corresponding scheduling algorithm, just follow
Assumptions:
The system has two registers, r0 and r1
Only y is alive at the exit of the block
Instruction format: op reg reg/mem reg (the first reg is the result)
op a b c means a := b op c
15 instructions
10 load, 5 store
load r0, a
add r0, b, r0
store t1, r0
load r0, c
mul r0, d, r0
store t2, r0
load r0, e
add r0, f, r0
store t3, r0
load r0, t2
add r0, t3, r0
store t4, r0
load r0, t1
mul r0, t4, r0
store y, r0
Basic block:
t1 := a + b
t2 := c * d
t3 := e + f
t4 := t2 + t3
y := t1 * t4
[Figure: expression tree for the basic block. Root * (y) with children + (t1) over a and b, and + (t4) over * (t2) and + (t3).]
load r0, a
add r0, b, r0
load r1, c
mul r1, d, r1
store t1, r0
load r0, e
add r0, f, r0
add r0, r1, r0
load r1, t1
mul r0, r0, r1
store y, r0
11 instructions
7 load, 2 store (1 spill)
load r0, c
mul r0, d, r0
load r1, e
add r1, f, r1
...
9 instructions
6 load, 1 store (0 spill)
Optimal!
Can we always achieve optimal execution?
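The instruction counts can be checked with a tiny simulator of the two-register machine. Only fragments of the 9-instruction listing survive on the slide, so the schedule below is an assumption: one ordering (t2, t3, t4, then t1) that is consistent with the stated counts. The `run` function and the input values are hypothetical.

```python
def run(program, memory):
    """Execute 'op reg, reg/mem, reg' code; the first operand is the result.

    Returns (memory, loads, stores). Memory operands of add/mul count as loads.
    """
    regs, mem = {}, dict(memory)
    loads = stores = 0

    def value(operand):
        nonlocal loads
        if operand in ("r0", "r1"):
            return regs[operand]
        loads += 1                      # operand fetched from memory
        return mem[operand]

    for line in program.strip().splitlines():
        op, rest = line.split(None, 1)
        args = [a.strip() for a in rest.split(",")]
        if op == "load":                # load reg, mem
            regs[args[0]] = value(args[1])
        elif op == "store":             # store mem, reg
            mem[args[0]] = regs[args[1]]
            stores += 1
        elif op == "add":
            regs[args[0]] = value(args[1]) + value(args[2])
        elif op == "mul":
            regs[args[0]] = value(args[1]) * value(args[2])
    return mem, loads, stores

# Assumed schedule: evaluate t4 = c*d + (e+f) first, then t1 = a+b.
schedule = """
load r0, c
mul r0, d, r0
load r1, e
add r1, f, r1
add r0, r0, r1
load r1, a
add r1, b, r1
mul r0, r1, r0
store y, r0
"""
inputs = dict(a=1, b=2, c=3, d=4, e=5, f=6)
mem, loads, stores = run(schedule, inputs)
```

With these inputs the block computes y = (a + b) * (c * d + (e + f)) using 6 loads and 1 store, with no spill.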
Assumptions:
From here onwards,
three-address code needs to be:
op reg reg reg
L(leaf) = 1 if it is an identifier
L(leaf) = 0 if it is a constant
L(nonleaf node) =
If L(left child) = L(right child) then L(left child) + 1
Otherwise max(L(left child), L(right child))
Steps: compute register requirement, assign registers, generate code
[Figure: the expression tree annotated with labels and registers. Each leaf a, b, c, d, e, f has label 1. The leaves are loaded with load r1, a; load r2, b; load r1, c; load r2, d; load r2, e; load r3, f. Interior nodes carry register sets such as (r1, r2) and (r1, r2, r3), and the e + f subtree is computed with add r3, r2, r3.]
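The labeling step can be sketched as a recursive function. It assumes the standard Sethi-Ullman rule for interior nodes (equal child labels: add one; otherwise take the larger), and a hypothetical tree encoding: identifiers are strings, constants are integers, and interior nodes are `(op, left, right)` tuples.

```python
def label(node):
    """Number of registers needed to evaluate node without spilling."""
    if isinstance(node, str):    # identifier leaf
        return 1
    if isinstance(node, int):    # constant leaf
        return 0
    _, left, right = node
    l, r = label(left), label(right)
    return l + 1 if l == r else max(l, r)

# The basic block used earlier: y := (a + b) * (c * d + (e + f))
t1 = ("+", "a", "b")
t4 = ("+", ("*", "c", "d"), ("+", "e", "f"))
y = ("*", t1, t4)

need_t1 = label(t1)
need_t4 = label(t4)
need_y = label(y)
```

The root needs three registers, which agrees with the (r1, r2, r3) annotation in the figure.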
Principle:
Two variables that are alive simultaneously
interfere
They cannot be allocated to the same register
Register interference graph:
One vertex for each variable in the graph
At each point p in the CFG
L is the live set at p
If two variables x and y are both in L, then x should not get
the same register as y:
add an edge (x, y)
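Building the graph from live sets can be sketched as below. The function name and the example live sets are hypothetical; liveness analysis itself is assumed to have been done already.

```python
from itertools import combinations

def interference_graph(live_sets):
    """live_sets: the set of live variables at each program point."""
    nodes, edges = set(), set()
    for live in live_sets:
        nodes |= set(live)
        # Every pair that is live at the same point interferes.
        for x, y in combinations(sorted(live), 2):
            edges.add((x, y))
    return nodes, edges

live_sets = [{"a", "b"}, {"b", "c"}, {"c"}]   # hypothetical program points
nodes, edges = interference_graph(live_sets)
```

Here a and c are never live together, so they have no edge and may share a register.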
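[Figure: graph-coloring example on an interference graph over the variables a, b, c, d, e. Nodes b and d have fewer than 4 edges; d is chosen, removed from the graph, and pushed on the stack. The remaining nodes are removed the same way. When the nodes are popped and colored, a has degree 2, so there is no problem coloring it.]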
Spill
When no way is found to color with k colors
Choose one node to spill
Continue to spill if necessary, till a node can be removed
For each spilled node
For each definition, store the value
For each use, load the value
Rewrite code
Use a new temporary variable for each load; it will have a
very short lifetime and is likely to have very few edges
Redo liveness analysis and register allocation
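The simplify, spill, and color steps can be sketched as follows. This is an illustration, not a full allocator: the spill choice (highest remaining degree) is an assumed heuristic, spilled nodes are only reported, and the code rewriting and repeated liveness analysis are left out.

```python
def color(graph, k):
    """graph: dict node -> set of neighbours. Returns (colors, spilled)."""
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        # A node with fewer than k edges can always be colored later.
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            # No such node: choose one to spill (assumed: highest degree).
            node = max(sorted(work), key=lambda n: len(work[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for m in work.pop(node):        # remove the node from the graph
            work[m].discard(node)
    colors = {}
    while stack:                        # pop and assign the lowest free color
        node = stack.pop()
        used = {colors[m] for m in graph[node] if m in colors}
        colors[node] = min(c for c in range(k) if c not in used)
    return colors, spilled

# Hypothetical graph: a triangle a-b-c plus d attached to a.
graph = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
colors3, spilled3 = color(graph, 3)
colors2, spilled2 = color(graph, 2)
```

With three colors the graph is colored without spilling; with two, the triangle forces one node to be spilled.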
DAG Construction
Versioning
a := b - c
b := a + d
d := b - c
a := a * d
b := b - c
a1 := b0 - c0
b1 := a1 + d0
d1 := b1 - c0
a2 := a1 * d1
b2 := b1 - c0
Goal:
If a variable is redefined, it is no longer
the same as the previous version.
Use a version number to avoid confusion.
Method:
Use the table to keep track of the variables
and their version numbers.
Initialize the version number to 0.
Increase the version number each time
the variable is defined.
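The versioning method can be sketched with a dictionary as the table. The statement encoding `(target, left, op, right)` and the function name are hypothetical. The operator in three of the statements did not survive on the slide; "-" is assumed here.

```python
def add_versions(statements):
    """Rename every variable to name + version number."""
    version = {}   # the table: variable -> current version

    def use(var):
        # A variable that has not been defined yet has version 0.
        return f"{var}{version.setdefault(var, 0)}"

    out = []
    for target, left, op, right in statements:
        l, r = use(left), use(right)                   # uses see current versions
        version[target] = version.get(target, 0) + 1   # a definition bumps the version
        out.append((f"{target}{version[target]}", l, op, r))
    return out

block = [
    ("a", "b", "-", "c"),   # "-" is an assumed operator
    ("b", "a", "+", "d"),
    ("d", "b", "-", "c"),
    ("a", "a", "*", "d"),
    ("b", "b", "-", "c"),
]
versioned = add_versions(block)
```

The result matches the versioned block on the slide: the second definition of a becomes a2 while its right-hand side still reads a1.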
DAG Construction
DAG
Leaves are identifiers or constants
[Figure: an operator node with label list t1, t3 over the leaves b0 and c0, and a table entry pointing to the node in the DAG.]
The node has t1 in its label list
The node has the highest version of t1
DAG Construction
For a copy statement x := y
Find node x
If nonexistent
Find node y
If nonexistent, create it (could be from external)
[Figure: node N with label list y, x and the table pointer referring to N.]
DAG Construction
For statement x := y op z
Find node x (do the same as the previous case)
Check whether <op node(y) node(z)> exists
If so, let N be the root node of the subtree for <op
node(y) node(z)>
If not, check the operands and create the subtree
Find node y, if nonexistent, create node(y)
Find node z, if nonexistent, create node(z)
Create the node in table
Create a leaf node in the dag
Create the operator node, say N, and link it to node(y)
and node(z)
Add x to the list of labels of N and update the
table pointer
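The steps for x := y op z can be sketched as below. The node layout (a list `[op_or_leaf, left, right, labels]`), the helper names, and the six-statement example are assumptions; the "-" operator in particular is assumed, since the operator is missing in the extracted statements.

```python
def build_dag(statements):
    """statements: list of (x, y, op, z), each meaning x := y op z."""
    table = {}     # variable -> id of the node currently holding its value
    version = {}   # variable -> current version number
    nodes = []     # id -> [op_or_leaf_name, left_id, right_id, labels]
    index = {}     # (op, left_id, right_id) -> id of an existing operator node

    def node_for(var):
        if var not in table:                  # nonexistent: create a leaf
            nodes.append([var + "0", None, None, []])
            table[var] = len(nodes) - 1
        return table[var]

    for x, y, op, z in statements:
        key = (op, node_for(y), node_for(z))
        if key not in index:                  # create the operator node N
            nodes.append([op, key[1], key[2], []])
            index[key] = len(nodes) - 1
        n = index[key]
        version[x] = version.get(x, 0) + 1
        nodes[n][3].append(f"{x}{version[x]}")   # add x to the labels of N
        table[x] = n                             # update the table pointer
    return nodes, table

block = [
    ("a", "b", "-", "c"),
    ("b", "a", "+", "d"),
    ("d", "b", "-", "c"),
    ("a", "a", "*", "d"),
    ("b", "b", "-", "c"),   # same <op node(y) node(z)> as the third statement
    ("e", "a", "*", "b"),
]
nodes, table = build_dag(block)
```

The fifth statement finds the existing node for b1 - c0, so that node ends up labeled with both d1 and b2, and no new node is created.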
DAG Construction
Dag construction example
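Statements:
a := b - c
b := a + d
d := b - c
a := a * d
b := b - c
e := a * b
With versions, after building the DAG:
a1 := b0 - c0
b1 := a1 + d0
d1 := b1 - c0
a2 := a1 * d1
e1 := a2 * d1
b2 := d1
[Figure: the resulting DAG. Leaves b0, c0, d0; node a1 over b0 and c0; node + (b1) over a1 and d0; node labeled d1, b2 over b1 and c0; node * (a2) over a1 and d1; node * (e1) over a2 and d1.]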
Instruction Selection
Goal
Determine parts of the tree that can match the
instruction tiles
Desirable to achieve optimal tiling
Get the instruction set with least cost (not easy)
The maximal munch algorithm (greedy)
Start from the tree root and find all matching tiles
Select the one with the maximum number of nodes
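A minimal sketch of maximal munch follows. The tile set, the tile names, and the tree encoding are all hypothetical: a tree is a leaf string ("reg" or "const") or a tuple `(operator, child, ...)`, and "_" in a pattern stands for any subtree that is tiled recursively.

```python
# Hypothetical instruction tiles: (name, pattern).
TILES = [
    ("LOAD_INDEXED", ("mem", ("+", "_", "const"))),
    ("ADDI", ("+", "_", "const")),
    ("ADD", ("+", "_", "_")),
    ("LOAD", ("mem", "_")),
    ("CONST", "const"),
    ("REG", "reg"),
]

def match(pattern, tree, holes):
    """True if the pattern covers the top of tree; uncovered subtrees go to holes."""
    if pattern == "_":
        holes.append(tree)
        return True
    if isinstance(pattern, str):
        return tree == pattern
    return (isinstance(tree, tuple) and len(tree) == len(pattern)
            and tree[0] == pattern[0]
            and all(match(p, t, holes) for p, t in zip(pattern[1:], tree[1:])))

def size(pattern):
    """Number of tree nodes a tile covers."""
    if pattern == "_":
        return 0
    if isinstance(pattern, str):
        return 1
    return 1 + sum(size(p) for p in pattern[1:])

def munch(tree):
    """Pick the largest tile matching at the root, then tile what it left uncovered."""
    best = None
    for name, pattern in TILES:
        holes = []
        if match(pattern, tree, holes) and (best is None or size(pattern) > best[0]):
            best = (size(pattern), name, holes)
    if best is None:
        raise ValueError(f"no tile matches {tree!r}")
    _, name, holes = best
    code = []
    for sub in holes:
        code.extend(munch(sub))     # operands are emitted before the instruction
    code.append(name)
    return code

tree = ("mem", ("+", ("mem", "reg"), "const"))
code = munch(tree)
```

At the root, both LOAD and LOAD_INDEXED match; the greedy rule picks LOAD_INDEXED because it covers three nodes instead of one. Being greedy, the result is not guaranteed to be the least-cost tiling.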