BASIC COMPILATION TECHNIQUES

• The compilation process is summarized in Figure 5.11.
• Compilation begins with high-level language code such
as C and generally produces assembly code.
• The high-level language program is parsed to break it
into statements and expressions.
• In addition, a symbol table is generated, which includes
all the named objects in the program.
• Some compilers may then perform higher-level
optimizations that can be viewed as modifying the
high-level language program input without reference to
instructions.
• For example, consider the following array access code:
x[i] = c*x[i];
• A simple code generator would generate the address
for x[i] twice, once for each appearance in the
statement.
• The later optimization phases can recognize this as an
example of common expressions that need not be
duplicated.
• While in this simple case it would be possible to create
a code generator that never generated the redundant
expression, taking into account every such
optimization at code generation time is very difficult.
• We get better code and more reliable compilers by
generating simple code first and then optimizing it.
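
The difference between the two forms can be sketched by hand in C. The function names below are hypothetical and not from the slides: the first spells out what a simple code generator would do for x[i] = c*x[i] (the address of x[i] is formed twice), and the second shows the result once the common expression has been recognized.

```c
#include <assert.h>
#include <stddef.h>

/* Simple code generation: the address x + i is computed twice,
   once for the read of x[i] and once for the write. */
void scale_naive(int *x, size_t i, int c)
{
    assert(x != NULL);
    int *load_addr = x + i;    /* address for the read of x[i] */
    int value = *load_addr;
    int *store_addr = x + i;   /* same address, computed again */
    *store_addr = c * value;
}

/* After the common expression is removed: one address computation. */
void scale_optimized(int *x, size_t i, int c)
{
    assert(x != NULL);
    int *addr = x + i;
    *addr = c * *addr;
}
```

Both functions leave the array in the same state; only the number of address computations differs.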
Statement Translation
Procedures
• Another major code generation problem is the creation
of procedures.
• Generating code for procedures is relatively
straightforward once we know the procedure linkage
appropriate for the CPU.
• At the procedure definition, we generate the code to
handle the procedure call and return.
• At each call of the procedure, we set up the procedure
parameters and make the call from compiled code.
• Procedure stacks are typically built to grow down from
high addresses.
• A stack pointer (sp) defines the end of the current
frame, while a frame pointer (fp) defines the end of
the last frame.
• The ARM Procedure Call Standard (APCS) is a good
illustration of a typical procedure linkage
mechanism.
• r0 - r3 are used to pass parameters into the
procedure.
• r0 is also used to hold the return value. If more
than four parameters are required, they are put on
the stack frame.
• r4 - r7 hold register variables.
• r11 is the frame pointer and r13 is the stack pointer.
• r10 holds the limiting address on stack size, which is
used to check for stack overflows.
• Other registers have additional uses in the protocol.
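
The parameter-passing rule and the downward-growing stack can be illustrated with a toy model in C. This is only a sketch under stated assumptions: the stack is an ordinary array, the registers r0 - r3 are modelled as a four-element array, and all names (stack_mem, push, pop, pass_args) are hypothetical. It is not the actual APCS frame layout, and the order in which the extra arguments are pushed is an assumption of the model.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

uint32_t stack_mem[STACK_WORDS];
uint32_t *sp = stack_mem + STACK_WORDS;   /* empty stack: sp at the high end */
uint32_t *const stack_limit = stack_mem;  /* plays the role of the limit in r10 */

/* Push one word; the stack grows toward lower addresses. */
void push(uint32_t word)
{
    assert(sp > stack_limit);             /* stack overflow check */
    *--sp = word;
}

/* Pop one word; sp moves back toward higher addresses. */
uint32_t pop(void)
{
    assert(sp < stack_mem + STACK_WORDS); /* stack underflow check */
    return *sp++;
}

/* Model of the caller's job: the first four arguments go into
   regs[0..3] (standing in for r0 - r3), the rest go on the stack.
   Returns the number of words placed on the stack. */
int pass_args(const uint32_t *args, int n, uint32_t regs[4])
{
    int stacked = 0;
    for (int k = n - 1; k >= 4; k--) {    /* push extras last-to-first */
        push(args[k]);
        stacked++;
    }
    for (int k = 0; k < n && k < 4; k++)
        regs[k] = args[k];
    return stacked;
}
```

With six arguments, this model leaves four values in regs and two words on the stack, with sp two words lower than before the call setup.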
Data Structures
• The compiler must also translate references to data
structures into references to raw memories. In
general, this requires address computations.
• Some of these computations can be done at
compile time while others must be done at run
time.
• Arrays are interesting because the address of an
array element must in general be computed at run
time, since the array index may change.
• Let us first consider one-dimensional arrays: a[i]
• The layout of the array in memory is shown in Figure 5.13.
• The zeroth element is stored as the first element of the
array, the first element directly below, and so on.
• Create a pointer for the array that points to the array’s
head, namely, a[0].
• If we call that pointer aptr for convenience, then we can
rewrite the reading of a[i] as
*(aptr + i)
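
As a minimal sketch in C (the function name read_element is hypothetical), the rewritten access looks like this. Note that C pointer arithmetic scales i by the element size automatically, so the byte address actually computed is the base address plus i times sizeof(int).

```c
#include <assert.h>
#include <stddef.h>

/* Read a[i] through a pointer to the head of the array. */
int read_element(const int *aptr, size_t i)
{
    assert(aptr != NULL);
    return *(aptr + i);   /* equivalent to aptr[i] */
}
```

Calling read_element(a, 2) on an int array a returns the same value as a[2].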
• There are multiple possible ways to lay out a two-dimensional array in
memory, as shown in Figure 5.14.
• In one form, known as row major, the inner variable of the array
(j in a[i, j]) varies most quickly. (Fortran uses a different organization known
as column major.)
• Let us consider the row-major form. If the a[ ] array is of size N × M, then we
can turn the two-dimensional array access into a one-dimensional array
access.
Thus, a[i,j]
becomes
a[i*M + j] where the maximum value for j is M - 1.
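
The index calculation can be sketched in C as follows. The function names and the explicit n_rows and n_cols parameters (standing for N and M) are assumptions made for illustration; also note that C itself writes the two-dimensional access as a[i][j] rather than a[i, j].

```c
#include <assert.h>
#include <stddef.h>

/* Map (i, j) in an n_rows x n_cols row-major array to a flat index. */
size_t row_major_index(size_t i, size_t j, size_t n_rows, size_t n_cols)
{
    assert(i < n_rows && j < n_cols);   /* j runs from 0 to n_cols - 1 */
    return i * n_cols + j;
}

/* Read element (i, j) from a flat buffer holding the array row by row. */
int read_2d(const int *flat, size_t i, size_t j, size_t n_rows, size_t n_cols)
{
    return flat[row_major_index(i, j, n_rows, n_cols)];
}
```

For a 2 × 3 array, element (1, 2) maps to flat index 1*3 + 2 = 5, the last element of the buffer.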
• A C structure is easier to address. As shown in Figure 5.15, a
structure is implemented as a contiguous block of memory.
• Fields in the structure can be accessed using constant offsets to the
base address of the structure.
• In this example, if field1 is four bytes long, then field2 can be
accessed as *(aptr + 4), where aptr is the base address of the
structure and the offset is counted in bytes.
• This addition can usually be done at compile time, requiring only the
indirection itself to fetch the memory location during execution.
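
A small C sketch makes the constant offset visible. The structure definition here is an assumption (Figure 5.15 is not reproduced in this text): field1 is taken to be a 4-byte integer and field2 a single char. The standard offsetof macro yields the compile-time offset, and the access is done on a byte pointer because C would otherwise scale the addition by the size of the pointed-to type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout standing in for the structure in Figure 5.15. */
struct mystruct {
    int32_t field1;   /* 4 bytes at offset 0 */
    char field2;      /* next field, at offset 4 */
};

/* Fetch field2 using base address + constant byte offset. */
char read_field2(const struct mystruct *aptr)
{
    assert(aptr != NULL);
    const unsigned char *base = (const unsigned char *)aptr;
    char value;
    /* offsetof(...) is a compile-time constant; only the load
       happens at run time. */
    memcpy(&value, base + offsetof(struct mystruct, field2), sizeof value);
    return value;
}
```

In ordinary C code the compiler performs the same computation when it translates aptr->field2.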
