
18CSC304J

COMPILER DESIGN

UNIT 4
SESSION 6
Topics that will be covered

• Assignment Statements
• Boolean Expressions
• Case Statements
ASSIGNMENT STATEMENTS
Assignment Statements
• As part of the translation of assignments into three-address code, we will learn
• How names can be looked up in the symbol table
• How elements of arrays can be accessed
• How records can be accessed
Names in Symbol Table

• The lexeme for the name represented by
id is given by attribute id.name
• The operation lookup(id.name) checks if
there is an entry for this occurrence of
the name in the symbol table. If so, a
pointer to the entry is returned.
Otherwise it returns nil to indicate that
no entry was found
• The top of the stack tblptr contains a
pointer to the symbol table for the
current procedure
• When a name is encountered inside a
procedure, the lookup operation does
the following:
• It first checks if the name appears in
the current symbol table, accessible
through top(tblptr)
• If not, the pointer in the header of
the table is used to search the
symbol table for the enclosing
procedure
• If the name cannot be found in any
of these scopes, then it returns nil
• Consider the scenario shown here
• For each of these procedures, a separate symbol
table is created
• Each such symbol table has a header containing
a pointer to the table for the enclosing procedure
• Assume that the body of procedure
partition is examined currently and a
name k is used in an expression
• lookup(k) will search for the entry k
in the symbol table of partition. It is
not found
• lookup(k) will follow the pointer in
header and will search the symbol
table of quicksort (enclosing
procedure). The name k is found
• So, the use of k in an expression in
the procedure partition is valid
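The lookup described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the class SymbolTable, its parent field (standing in for the header pointer), the entry contents, and the name i declared in partition are all assumptions made for the example.

```python
class SymbolTable:
    """One table per procedure; `parent` plays the role of the header pointer."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # table of the enclosing procedure, or None
        self.entries = {}

    def enter(self, ident, info):
        self.entries[ident] = info


def lookup(ident, tblptr):
    """Search from top(tblptr) outward; return the entry, or None for nil."""
    table = tblptr[-1]            # top(tblptr): table of the current procedure
    while table is not None:
        if ident in table.entries:
            return table.entries[ident]
        table = table.parent      # follow the header pointer outward
    return None


# Hypothetical contents: k is declared in quicksort, i in partition.
quicksort = SymbolTable("quicksort")
quicksort.enter("k", {"declared_in": "quicksort"})
partition = SymbolTable("partition", parent=quicksort)
partition.enter("i", {"declared_in": "partition"})

tblptr = [quicksort, partition]   # the body of partition is being examined
found_k = lookup("k", tblptr)     # found via the enclosing procedure
missing = lookup("zz", tblptr)    # not declared in any enclosing scope
```

Here lookup returns Python's None where the slides say nil.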
Reusing Temporary Names
• We have assumed that “newtemp” generates a new temporary name each time it is called.
• Space must be allotted in the symbol table to store all the temporary names and their values.
• To make use of space efficiently, temporary names can be reused
1. Initialize a count c to zero
2. Whenever a new temporary name is generated, use $c and increase c by 1
3. Whenever a temporary name is used as an operand, decrement c by 1
• Example: Consider the assignment x := a * b + c * d – e * f
Three address statements:
Note:
• Temporaries that may be used
more than once cannot be
assigned names in this method
• This problem may arise when
we perform code optimization
such as combining common
subexpressions
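The counter scheme can be traced on the assignment x := a * b + c * d – e * f with a short Python sketch. The tuple representation of the expression and the helper name gen_assignment are assumptions for illustration; the sketch also assumes that no source variable name begins with $.

```python
def gen_assignment(target, expr):
    """Emit three-address code, reusing temporaries with a counter c.

    expr is either a variable name or a tuple (op, left, right).
    """
    code = []
    c = 0  # number of temporaries currently holding live values

    def gen(node):
        nonlocal c
        if isinstance(node, str):
            return node
        op, left, right = node
        l = gen(left)
        r = gen(right)
        # Each temporary used as an operand is released: decrement c.
        for operand in (l, r):
            if operand.startswith("$"):
                c -= 1
        temp = f"${c}"   # a new temporary takes the name $c ...
        c += 1           # ... and c is increased by 1
        code.append(f"{temp} := {l} {op} {r}")
        return temp

    result = gen(expr)
    code.append(f"{target} := {result}")
    return code


expr = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
statements = gen_assignment("x", expr)
for s in statements:
    print(s)
```

Only two temporary names, $0 and $1, are needed for this statement, because each temporary is used exactly once as an operand.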
Addressing Array Elements
• Elements of an array are usually stored in a block of consecutive locations
Single Dimensional Array
• The ith element of array A begins in location
base + ( i – low ) * w --------------- (1)
where
low → lower bound on the subscript
base → relative address of the storage allocated for the array
w → width of each array element
• Expression (1) can be rewritten as
i * w + ( base – low * w ) ---------(2)
• Let c = base – low * w
• The subexpression c may be evaluated when the declaration of the array is seen, and c is saved
in the symbol table entry for A
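As a small illustration of why expression (2) is useful, the Python sketch below computes c once at declaration time and reuses it for every access. The dictionary used as a symbol-table entry and the sample values (base 100, low 1, w 4) are assumptions chosen for the example.

```python
def declare_array(base, low, w):
    """Evaluate c = base - low * w once, when the declaration is seen."""
    return {"c": base - low * w, "w": w}


def address(entry, i):
    """Expression (2): i * w + c, using the constant saved in the entry."""
    return i * entry["w"] + entry["c"]


# Hypothetical array: base address 100, subscripts start at 1, width 4.
A = declare_array(base=100, low=1, w=4)
addr_of_third = address(A, 3)   # same value as base + (3 - low) * w
```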
Addressing Array Elements – cont..
Two Dimensional Array
• A two dimensional array is normally stored in one of the two forms
• Row major (row-by-row)
• Column major (column-by-column)
• The figure below shows the layout of a 2x3 array A
Addressing Array Elements – cont..
Two Dimensional Array stored in row major form
• The relative address of A[i1,i2] can be calculated by the formula
base + ( ( i1 – low1 ) * n2 + i2 – low2 ) * w ---------- (3)
where
low1 → lower bound on the value of i1
low2 → lower bound on the value of i2
n2 → no of values that i2 can take
n2 = high2 – low2 + 1
high2 → upper bound on the value of i2
• Assuming that i1 and i2 are the only values that are not known at compile time, expression (3)
can be rewritten as
( ( i1 * n2 ) + i2 ) * w + ( base – ( ( low1 * n2 ) + low2 ) * w ) ---------- (4)
The second term in this expression can be determined at compile time
Addressing Array Elements – cont..

Multi Dimensional Array


• The relative address of A[i1,i2,…,ik] can be found by the expression
( (…( ( i1 * n2 + i2 ) * n3 + i3 )… ) * nk + ik ) * w + base – ( (…( ( low1 * n2 + low2 ) * n3 + low3 )… ) * nk + lowk ) * w ---------- (5)
• The second term, beginning at base, can be computed by the compiler and saved with the symbol-table
entry for A, since, for all values of j, nj = highj – lowj + 1 is fixed
• The first term can be generated by the recurrence
e1 = i1
em = e(m-1) * nm + im ---------- (6)
• Some languages permit the sizes of arrays to be specified dynamically at run-time.
• The formulas for accessing the elements of such arrays are the same as for fixed-size arrays,
but the upper and lower limits are not known at compile time
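The split of the address into a run-time term and a compile-time constant can be sketched in Python for any number of dimensions. The function names and the representation of bounds as a list of (low, high) pairs are assumptions for illustration; the loop in variable_term follows the recurrence em = e(m-1) * nm + im.

```python
def sizes(bounds):
    """n_j = high_j - low_j + 1 for each dimension j."""
    return [high - low + 1 for low, high in bounds]


def constant_term(base, bounds, w):
    """Second term: base - ((...(low1*n2 + low2)*n3 ...)*nk + lowk) * w."""
    acc = 0
    for (low, _), n in zip(bounds, sizes(bounds)):
        acc = acc * n + low      # the first pass yields low1
    return base - acc * w


def variable_term(indices, bounds, w):
    """First term: e_k * w, with e_1 = i_1 and e_m = e_(m-1) * n_m + i_m."""
    e = 0
    for i, n in zip(indices, sizes(bounds)):
        e = e * n + i            # the first pass yields e_1 = i_1
    return e * w


def address(indices, base, bounds, w):
    return variable_term(indices, bounds, w) + constant_term(base, bounds, w)


# Hypothetical 10x20 array with lower bounds 1, width 4, base 0.
bounds_2d = [(1, 10), (1, 20)]
c = constant_term(0, bounds_2d, 4)
addr = address((2, 3), 0, bounds_2d, 4)
```

For a dynamically sized array the same functions apply, but constant_term would have to be evaluated at run time once the bounds are known.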
Grammar for addressing array elements
Translation Scheme for addressing array elements
If L is a simple name, a normal
assignment will be generated;
otherwise, an indexed assignment is
generated

Code for arithmetic expressions

When an array reference L is reduced
to E, we want the r-value of L. We
use indexing to obtain the contents
of the location L.place[L.offset]

L.offset is a new temporary
representing the first term of
expression (5)

Note: The function c(Elist.array) returns the second term of expression (5)
Translation Scheme for addressing array elements – cont..

A null offset indicates a simple
name

Elist1.place corresponds to e(m-1)
in expression (6) and Elist.place
corresponds to em

E.place holds both the value of
the expression E and the value
of expression (6) for m=1

Note:
The function limit(array, j) returns nj, the number of
elements along the jth dimension of the array
Example:
Let A be a 10x20 array with low1=low2=1
Therefore, n1=10 and n2=20
Take w=4
Use the formula
( ( i1 * n2 ) + i2 ) * w + ( base – ( ( low1 * n2 ) + low2 ) * w )
The second term is the constant c = base – ( ( 1 * 20 ) + 1 ) * 4 = base – 84
Consider the statement x := A[y,z]
Here i1=y and i2=z. The expression now becomes
( ( y * 20 ) + z ) * 4 + c
Three address code:
t1 := y * 20
t1 := t1 + z
t2 := c
t3 := 4 * t1
t4 := t2 [ t3 ]
x := t4
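To check that this three-address code really fetches A[y,z], the Python sketch below mirrors it statement by statement against a simulated memory. The dictionary used as memory, the base address 1000, and the choice to store the pair (i, j) in element A[i,j] are assumptions made only so that the result is easy to verify.

```python
def build_memory(base):
    """Lay out a 10x20 array in row-major order, 4 bytes per element.

    Element A[i, j] holds the tuple (i, j) so that fetches are easy to check.
    """
    memory = {}
    for i in range(1, 11):
        for j in range(1, 21):
            memory[base + ((i - 1) * 20 + (j - 1)) * 4] = (i, j)
    return memory


def fetch(memory, base, y, z):
    """Mirror of the three-address code for x := A[y, z]."""
    c = base - 84          # constant c = base - ((1 * 20) + 1) * 4
    t1 = y * 20            # t1 := y * 20
    t1 = t1 + z            # t1 := t1 + z
    t2 = c                 # t2 := c
    t3 = 4 * t1            # t3 := 4 * t1
    t4 = memory[t2 + t3]   # t4 := t2 [ t3 ]
    x = t4                 # x := t4
    return x


memory = build_memory(1000)
x = fetch(memory, 1000, 3, 7)
```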
Semantic action for E → E1 + E2
Type conversions within Assignments
• In a program there will be many different types of
variables and constants
• So the compiler must either reject certain mixed-
type operations or generate appropriate type
conversion instructions
• Suppose there are two types – real and integer
with integers converted to reals when necessary
Example
• Consider the input
x := y + i * j
• Assuming x and y have type real and i and j have
type integer
t1 := i int* j (integer multiplication)
t2 := inttoreal t1 (convert int to real)
t3 := y real+ t2 (floating-point addition)
x := t3
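A semantic action that inserts the conversion can be sketched in Python as follows. The class CodeGen, the (place, type) pairs passed between actions, and the order in which temporaries are allocated are assumptions chosen so that the output matches the example above; they are not taken from the original translation scheme.

```python
class CodeGen:
    def __init__(self):
        self.code = []
        self.count = 0

    def newtemp(self):
        self.count += 1
        return f"t{self.count}"

    def emit(self, statement):
        self.code.append(statement)

    def binary(self, op, e1, e2):
        """Action for E -> E1 op E2; each operand is a (place, type) pair."""
        (p1, ty1), (p2, ty2) = e1, e2
        if ty1 == "integer" and ty2 == "integer":
            place = self.newtemp()
            self.emit(f"{place} := {p1} int{op} {p2}")
            return place, "integer"
        # Mixed or real operands: convert any integer operand to real first.
        if ty1 == "integer":
            u = self.newtemp()
            self.emit(f"{u} := inttoreal {p1}")
            p1 = u
        if ty2 == "integer":
            u = self.newtemp()
            self.emit(f"{u} := inttoreal {p2}")
            p2 = u
        place = self.newtemp()
        self.emit(f"{place} := {p1} real{op} {p2}")
        return place, "real"


# x := y + i * j with x, y real and i, j integer
g = CodeGen()
product = g.binary("*", ("i", "integer"), ("j", "integer"))
total = g.binary("+", ("y", "real"), product)
g.emit(f"x := {total[0]}")
```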
BOOLEAN EXPRESSIONS
Boolean Expressions
• In programming languages, Boolean expressions have 2 primary purposes
1. They are used to compute logical values
2. They are used as conditional expressions in statements that alter the flow of control such
as if-then, if-then-else and while-do statements
• Boolean expressions are composed of Boolean operators (and, or, and not) applied to elements
that are Boolean variables or relational expressions
• Grammar for Boolean expressions
E → E or E | E and E | not E | (E) | id relop id | true | false
Methods of Translating Boolean Expressions
• Encode true and false numerically
Examples:
1. True is denoted by 1 and false by 0
2. Any non-zero quantity denotes true and 0 denotes false
3. Any non-negative quantity denotes true and a negative quantity denotes false
• Represent the value of an expression by a position reached in a program
Numerical Representation
Consider the implementation of Boolean expressions using 1 to denote true and 0 to denote false
Example:
Translation for a or b and not c is
t1 := not c
t2 := b and t1
t3 := a or t2
Example:
Consider the relational expression a<b
It is equivalent to the conditional statement if a<b then 1 else 0
Three address code:
100: if a<b goto 103
101: t := 0
102: goto 104
103: t :=1
104:
Translation Scheme
Assume
1. nextstat gives the index of the next three address statement
2. emit increments nextstat after producing each 3-address statement
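Under these two assumptions, a numerical translation can be sketched in Python. The class Emitter and its method names are hypothetical; the starting index 100 follows the earlier a<b example. The sketch is applied to the expression a<b or c<d and e<f.

```python
class Emitter:
    def __init__(self, start=100):
        self.nextstat = start    # index of the next three-address statement
        self.code = {}
        self.temps = 0

    def newtemp(self):
        self.temps += 1
        return f"t{self.temps}"

    def emit(self, text):
        self.code[self.nextstat] = text
        self.nextstat += 1       # emit increments nextstat

    def relop(self, a, op, b):
        """E -> id1 relop id2: set E.place to 1 if true, 0 if false."""
        place = self.newtemp()
        self.emit(f"if {a} {op} {b} goto {self.nextstat + 3}")
        self.emit(f"{place} := 0")
        self.emit(f"goto {self.nextstat + 2}")
        self.emit(f"{place} := 1")
        return place

    def boolop(self, op, p1, p2):
        """E -> E1 or E2 | E1 and E2, evaluated like an arithmetic operator."""
        place = self.newtemp()
        self.emit(f"{place} := {p1} {op} {p2}")
        return place


em = Emitter()
t1 = em.relop("a", "<", "b")
t2 = em.relop("c", "<", "d")
t3 = em.relop("e", "<", "f")
t4 = em.boolop("and", t2, t3)    # and binds tighter than or
t5 = em.boolop("or", t1, t4)
for index in sorted(em.code):
    print(f"{index}: {em.code[index]}")
```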
Annotated parse tree for the expression
a<b or c<d and e<f
[Figure: annotated parse tree. E.place = t1 for a<b, t2 for c<d, t3 for e<f, t4 for c<d and e<f, and t5 at the root]
Short-Circuit Code
• We can translate a Boolean expression into three-address code without generating code for any
of the Boolean operators and without necessarily evaluating the entire expression
• This style of evaluation is called “short-circuit” or “jumping” code
Flow of Control Statements
Syntax directed definition
• The grammar for if-then, if-then-else and while-do
statements is
S → if E then S1 |
if E then S1 else S2 |
while E do S1
1. Three address statements are symbolically
labelled
2. A function newlabel returns a new symbolic
label each time it is called
3. Two labels are associated with a Boolean
expression E
• E.true – the label to which control flows if E
is true
• E.false – the label to which control flows if
E is false
4. S.next is an inherited attribute. It is a label that
is attached to the first three-address instruction
to be executed after the code for S
5. The code for E generates a jump to E.true if E is
true and a jump to E.false if E is false
Flow of Control Statements – cont..
Syntax directed definition
Control Flow Translation of Boolean Expressions
• We now discuss E.code, the code produced for the Boolean expression E in the previous syntax
directed definition
• Suppose E is of the form a < b, the generated code is of the form
if a<b goto E.true
goto E.false
• Suppose E is of the form E1 or E2,
• If E1 is true, then we need not evaluate E2. We immediately know that E itself is true. So
E1.true is the same as E.true
• If E1 is false, then E2 must be evaluated. So E1.false is the label of the first statement in
the code for E2
• No code is needed for an expression E of the form not E1. The true and false exits of E are
simply interchanged to give the false and true exits of E1
Control Flow Translation of Boolean Expressions – cont..
Example:

The expression a<b or c<d and e<f
produces the following code:
if a<b goto L.true
goto L1
L1: if c<d goto L2
goto L.false
L2 : if e<f goto L.true
goto L.false
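The jumping code above can be reproduced with a short Python sketch in which the true and false exits are passed down as inherited attributes. The class JumpGen, the tuple encoding of expressions, and the way a pending label is attached to the next emitted statement are assumptions for illustration. The case for and is written by analogy with the rule for or.

```python
class JumpGen:
    def __init__(self):
        self.code = []
        self.labels = 0
        self.pending = ""        # label(s) waiting for the next statement

    def newlabel(self):
        self.labels += 1
        return f"L{self.labels}"

    def place_label(self, name):
        self.pending += f"{name}: "

    def emit(self, text):
        self.code.append(self.pending + text)
        self.pending = ""

    def gen(self, e, true, false):
        kind = e[0]
        if kind == "rel":                    # E -> id1 relop id2
            _, a, op, b = e
            self.emit(f"if {a} {op} {b} goto {true}")
            self.emit(f"goto {false}")
        elif kind == "or":                   # E1.true = E.true
            first_of_e2 = self.newlabel()
            self.gen(e[1], true, first_of_e2)
            self.place_label(first_of_e2)
            self.gen(e[2], true, false)
        elif kind == "and":                  # E1.false = E.false
            first_of_e2 = self.newlabel()
            self.gen(e[1], first_of_e2, false)
            self.place_label(first_of_e2)
            self.gen(e[2], true, false)
        elif kind == "not":                  # no code: swap the exits
            self.gen(e[1], false, true)


expr = ("or", ("rel", "a", "<", "b"),
        ("and", ("rel", "c", "<", "d"), ("rel", "e", "<", "f")))
jg = JumpGen()
jg.gen(expr, "L.true", "L.false")
for line in jg.code:
    print(line)
```

The second statement, goto L1, jumps to the very next statement and is therefore redundant. A later optimization pass could remove it.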
CASE STATEMENTS
Case Statements
• The “switch” or “case” statement is available in a variety of languages
• Syntax of switch statement
switch expression
begin
case value: statement
case value: statement

case value: statement
default: statement
end
The intended translation of a switch:
1. Evaluate the expression
2. Find which value in the list of cases is the same as the value of the expression. The default
value matches the expression if none of the values in the cases match
3. Execute the statement associated with the value found
Implementing n-way branch
1. If the number of cases is not too great, say 10 at most, then a sequence of conditional goto’s
can be used
A more compact way to implement this sequence of conditional goto’s is to create a table of
pairs, each consisting of a value and the label of the code for the corresponding statement
2. If the number of values exceeds 10 or so, it is more efficient to construct a hash table for the
values, with the labels of various statements as entries
3. A more efficient implementation: If all the values lie in some small range, say imin to imax, and the
number of values is a reasonable fraction of imax – imin, then an array of labels can be
constructed, with the label of the statement for value j in the entry of the table with offset j – imin
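The three strategies can be compared with a small Python sketch in which each function returns the label that control should jump to. The function names are hypothetical. Appending the expression's own value paired with the default label, so that the linear search always finds a match, is one common way to handle the default and is an assumption here.

```python
def branch_linear(pairs, default_label, value):
    """Scan a table of (value, label) pairs, as a run of conditional gotos."""
    table = pairs + [(value, default_label)]   # sentinel: always matches
    for v, label in table:
        if v == value:
            return label


def branch_hash(pairs, default_label, value):
    """Hash table from case values to labels."""
    return dict(pairs).get(value, default_label)


def build_jump_table(pairs, default_label):
    """Array of labels indexed by j - imin; gaps hold the default label."""
    imin = min(v for v, _ in pairs)
    imax = max(v for v, _ in pairs)
    table = [default_label] * (imax - imin + 1)
    for v, label in pairs:
        table[v - imin] = label
    return imin, table


def branch_jump_table(imin, table, default_label, value):
    offset = value - imin
    if 0 <= offset < len(table):
        return table[offset]
    return default_label                       # outside the range imin..imax


pairs = [(1, "L1"), (2, "L2"), (4, "L4")]
imin, jump_table = build_jump_table(pairs, "Ldefault")
```

The jump table costs one range check and one indexed load per switch, but it wastes space when the case values are sparse.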
Syntax Directed Translation of Case Statements
Consider the following switch statement:
Steps involved in Translation:
1. When the keyword ‘switch’ is seen, generate
two labels test and next and a new
temporary t
2. As the expression E is parsed, generate code
to evaluate E into t
3. After processing E, generate the jump
goto test
4. When the keyword ‘case’ is seen, create a
new label Li and enter it into the symbol
table. A pointer to this symbol-table entry
and the value Vi are placed on a stack
5. When the statement case Vi: Si is
processed, emit the newly created label Li,
followed by the code for Si, followed by ‘goto
next’
6. When the keyword ‘end’ is found, read the
pointer-value pairs on the stack from the
bottom to the top and generate three-
address statements
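The translation steps can be sketched in Python as a function that emits the code layout they imply. Several things are assumptions for illustration: the function name translate_switch, the representation of each statement as a single string, the use of a plain list as the stack, and the handling of default by giving it a label of its own and an unconditional goto at the end of the tests.

```python
def translate_switch(expr, cases, default):
    """cases is a list of (value, statement) pairs; default is a statement."""
    code = [f"t := {expr}",      # steps 1-2: evaluate E into the temporary t
            "goto test"]         # step 3
    stack = []                   # (label, value) pairs, bottom first
    for i, (value, stmt) in enumerate(cases, start=1):
        label = f"L{i}"          # step 4: new label Li, pushed with Vi
        stack.append((label, value))
        code.append(f"{label}: {stmt}")   # step 5: Li, code for Si, goto next
        code.append("goto next")
    default_label = f"L{len(cases) + 1}"
    code.append(f"{default_label}: {default}")
    code.append("goto next")
    # Step 6: read the stack from bottom to top and generate the tests.
    tests = [f"if t = {value} goto {label}" for label, value in stack]
    tests.append(f"goto {default_label}")
    tests[0] = "test: " + tests[0]
    code.extend(tests)
    code.append("next:")
    return code


program = translate_switch("a + b", [(1, "x := 10"), (2, "x := 20")], "x := 0")
for line in program:
    print(line)
```

Placing all the tests at the end, after the statement code, lets the compiler emit them in one pass once every case value and label is known.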
Two ways of translation of a case statement
