Intermediate Code Generation: Logical Structure of Compiler
Rekhanjali Sahoo
CSE Department
Prof. Rekhanjali Sahoo
Parser -> Static Checker -> Intermediate Code Generator
Earlier we discussed the parser and the two types of parsing, i.e. top-down
parsing and bottom-up parsing. The parser parses the input string and generates
the parse tree or the syntax tree. The interior nodes of the syntax tree
represent the operations, and the leaves represent the operands.
As parsing proceeds, semantic information (attribute values) is attached to
these nodes. The resulting tree is called an annotated parse tree.
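SDD = CFG + Semantic Rules, i.e. a syntax-directed definition (SDD) is a
context-free grammar (CFG) together with semantic rules.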
Static Checker
Static checking verifies, at compile time, that the program can be compiled
successfully. It catches programming errors early, so the programmer can
correct them before the program executes. Static checking performs two types
of checking:
Syntactic Checking
This kind of checking identifies the syntactic errors present in the
program.
Type Checking
It checks that every operation in the program respects the type system of
the source language. If the operand types do not match what an operator
expects, the compiler performs a type conversion where the language permits
one, and reports a type error otherwise.
o Coercion
In coercion, the compiler implicitly converts the type of an operand
so that both operands have the type the operator requires. For example,
consider the expression 2 * 3.14. Here 2 is an integer and 3.14 is a
floating-point number. The coercion specified by the language converts
the integer 2 to the floating-point 2.0. Now that both operands are
floating-point, the compiler performs a floating-point multiplication,
which produces a floating-point result.
o Overloading
We have studied the concept of overloading in Java. For
example, the operator ‘+’ applied to two integers performs
integer addition, and applied to two strings performs
concatenation of the two strings. Thus, the meaning of the operator
changes according to the types of its operands.
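The coercion and overloading rules above can be sketched as a tiny type checker for binary expressions. This is only an illustration: the function name check_binary, the (type, code) pair representation, and the inttofloat and concat markers are assumptions made for this sketch, not part of any particular compiler.

```python
def check_binary(op, left, right):
    """Type-check `left op right`; each operand is a (type_name, code) pair."""
    lt, lc = left
    rt, rc = right
    # Coercion: when int and float are mixed, widen the int operand to float.
    if {lt, rt} == {"int", "float"}:
        if lt == "int":
            lc = f"inttofloat({lc})"
        else:
            rc = f"inttofloat({rc})"
        lt = rt = "float"
    if lt != rt:
        raise TypeError(f"operator {op!r} cannot combine {lt} and {rt}")
    # Overloading: '+' means concatenation when the operands are strings.
    if op == "+" and lt == "string":
        return "string", f"concat({lc}, {rc})"
    if lt in ("int", "float"):
        return lt, f"({lc} {op} {rc})"
    raise TypeError(f"operator {op!r} is not defined for {lt}")


print(check_binary("*", ("int", "2"), ("float", "3.14")))
print(check_binary("+", ("string", '"ab"'), ("string", '"cd"')))
```

The first call shows the integer operand being widened before the floating-point multiplication; the second shows the same '+' symbol resolving to a different operation because of its operand types.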
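Ex: A = B * C
A = BC*
ABC*= (postfix expression)
Ex: a = b * -c + b * -c
(in the steps below, a "-" written after c is the unary minus applied to c)
a = b * c- + b * c-
a = bc-* + bc-*
a = bc-*bc-*+
abc-*bc-*+= (postfix expression)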
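The conversions above amount to a postorder walk of the syntax tree: emit the operands first, then the operator. A minimal sketch follows, assuming the tree is written as nested Python tuples (operator, child, ...) with plain strings as leaves; this representation and the name to_postfix are choices made for the example only.

```python
def to_postfix(node):
    """Return the postfix string for a syntax tree given as nested tuples."""
    if isinstance(node, str):          # leaf: an operand
        return node
    op, *children = node               # interior node: operator plus children
    return "".join(to_postfix(child) for child in children) + op


# a = b * -c + b * -c, with ("-", "c") standing for unary minus
tree = ("=", "a", ("+", ("*", "b", ("-", "c")), ("*", "b", ("-", "c"))))
print(to_postfix(tree))                          # abc-*bc-*+=
print(to_postfix(("=", "A", ("*", "B", "C"))))   # ABC*=
```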
t1 = y * z
t2 = x + t1
E=t2
Ex:
a := b * (-c) + b * (-c)
Three-address code for the above arithmetic expression, generated from its
syntax tree:
t1 := -c
t2 := b * t1
t3 := -c
t4 := b * t3
t5 := t2 + t4
a := t5
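A sequence like the one above can be produced by walking the syntax tree bottom-up and allocating a fresh temporary for every operator node. The sketch below assumes the same nested-tuple tree representation as a convenience; gen_tac and its helper are illustrative names, not a standard API.

```python
import itertools


def gen_tac(target, tree):
    """Generate three-address statements that assign `tree` to `target`."""
    code = []
    counter = itertools.count(1)

    def walk(node):
        if isinstance(node, str):       # operand: already an address
            return node
        op, *children = node
        args = [walk(child) for child in children]
        temp = f"t{next(counter)}"      # fresh temporary for this operator
        if len(args) == 1:              # unary operator
            code.append(f"{temp} := {op}{args[0]}")
        else:                           # binary operator
            code.append(f"{temp} := {args[0]} {op} {args[1]}")
        return temp

    result = walk(tree)
    code.append(f"{target} := {result}")
    return code


tree = ("+", ("*", "b", ("-", "c")), ("*", "b", ("-", "c")))
for line in gen_tac("a", tree):
    print(line)
```

Because the walk never checks whether a subtree was already computed, -c and b * t1 are generated twice, exactly as in the listing above; recognising such repeats is a separate optimization.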
Ex: a=((b*c)-(d/e))+f
t1=e
t2=d/t1
t3=c
t4=b*t3
t5=t4-t2
t6=t5+f
a=t6
The reason for the term “Three-address code” is that each statement
usually contains three addresses, two for the operands and one for the
result.
t1 = uminus c
t2 = b * t1
t3 = uminus c
t4 = b * t3
t5 = t2 + t4
a = t5
Advantage –
Disadvantage –
Temporaries are implicit, which makes the code difficult to rearrange.
Triples are difficult to optimize because optimization involves moving
intermediate code: when a triple is moved, every other triple that refers to
it must also be updated. With the help of a pointer, one can directly access
the symbol table entry.
--------------------------------------------------X----------------------------------------------
Question – Write the quadruples, triples and indirect triples for the following
expression: (x + y) * (y + z) + (x + y + z)
Explanation – The three-address code is:
t1 = x + y
t2 = y + z
t3 = t1 * t2
t4 = t1 + z
t5 = t3 + t4
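The three representations can be derived mechanically from this three-address code. The sketch below is one possible encoding, with assumptions stated plainly: each statement is stored as (result, op, arg1, arg2), a reference to an earlier triple is written as its position in parentheses, and the indirect-triple statement list is simply a list of positions into the triple table. The function names are illustrative.

```python
tac = [
    ("t1", "+", "x", "y"),
    ("t2", "+", "y", "z"),
    ("t3", "*", "t1", "t2"),
    ("t4", "+", "t1", "z"),
    ("t5", "+", "t3", "t4"),
]


def to_quadruples(tac):
    """Quadruple: (op, arg1, arg2, result); temporaries are named explicitly."""
    return [(op, a1, a2, res) for res, op, a1, a2 in tac]


def to_triples(tac):
    """Triple: (op, arg1, arg2); a temporary becomes a reference '(position)'."""
    position = {}                      # temporary name -> index of its triple
    triples = []

    def ref(arg):
        return f"({position[arg]})" if arg in position else arg

    for index, (res, op, a1, a2) in enumerate(tac):
        triples.append((op, ref(a1), ref(a2)))
        position[res] = index
    return triples


def to_indirect_triples(tac):
    """Indirect triples: a statement list of pointers plus the triple table."""
    triples = to_triples(tac)
    statement_list = list(range(len(triples)))
    return statement_list, triples


for row in to_quadruples(tac):
    print(row)
for index, row in enumerate(to_triples(tac)):
    print(f"({index})", row)
```

With indirect triples, reordering statements only permutes the statement list; the triples themselves, and the references between them, stay unchanged.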
Advantages of intermediate code generation –
Easier to implement: Intermediate code generation can simplify the code generation process by reducing the complexity of the input code, making it easier to implement.
Facilitates code optimization: Intermediate code generation can enable the use of various code optimization techniques, leading to improved performance and efficiency of the generated code.
Platform independence: Intermediate code is platform-independent, meaning that it can be translated into machine code or bytecode for any platform.
Code reuse: Intermediate code can be reused in the future to generate code for other platforms or languages.
Easier debugging: Intermediate code can be easier to debug than machine code or bytecode, as it is closer to the original source code.
Disadvantages of intermediate code generation –
Increased compilation time: Intermediate code generation can significantly increase the compilation time, making it less suitable for real-time or time-critical applications.
Additional memory usage: Intermediate code generation requires additional memory to store the intermediate representation, which can be a concern for memory-limited systems.
Increased complexity: Intermediate code generation can increase the complexity of the compiler design, making it harder to implement and maintain.
----------------------------------------------------------------------X-------------------------------------------------------------------
Backpatching
Backpatching is the process of filling in information that was left
unspecified when the code was first generated. This information is the
target labels of jumps.
While producing three-address code for a given expression, the address of the
label in a goto statement may not yet be known. Assigning the positions of
these labels in a single pass is difficult, so conceptually two passes are
used: the addresses are left unspecified in the first pass and filled in
during the second. Backpatching achieves the same effect in one pass by
keeping lists of incomplete jumps and filling them in once the target is known.
Ex:
x < 100 || x > 200 && x != y evaluates to either true or false.
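A minimal sketch of backpatching for this expression follows. It assumes instructions are numbered from 0, that && binds tighter than ||, and that each boolean sub-expression returns a pair (truelist, falselist) of instruction numbers whose jump targets are still unknown. The helper names emit, backpatch, relop, or_ and and_ are hypothetical and chosen for this example.

```python
code = []   # each entry is [text, target]; target None means "not yet filled in"


def emit(text):
    """Append an instruction with an unknown jump target; return its number."""
    code.append([text, None])
    return len(code) - 1


def backpatch(jumps, target):
    """Fill in `target` for every incomplete jump listed in `jumps`."""
    for number in jumps:
        code[number][1] = target


def relop(condition):
    """Code for a relational test: one jump if true, one jump if false."""
    true_jump = emit(f"if {condition} goto")
    false_jump = emit("goto")
    return [true_jump], [false_jump]


def or_(b1, generate_b2):
    true1, false1 = b1
    start_b2 = len(code)               # address where B2's code will begin
    true2, false2 = generate_b2()
    backpatch(false1, start_b2)        # if B1 is false, go on to test B2
    return true1 + true2, false2


def and_(b1, generate_b2):
    true1, false1 = b1
    start_b2 = len(code)
    true2, false2 = generate_b2()
    backpatch(true1, start_b2)         # if B1 is true, go on to test B2
    return true2, false1 + false2


# x < 100 || (x > 200 && x != y)
truelist, falselist = or_(
    relop("x < 100"),
    lambda: and_(relop("x > 200"), lambda: relop("x != y")),
)
for number, (text, target) in enumerate(code):
    print(number, text, "_" if target is None else target)
```

After this runs, the jumps in truelist and falselist are still open; they are filled in later, when the code that should execute for a true or false result has been placed.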
-------------------------------------------------------------------------------
Target program: The target program is the output of the code generator.
The output may be absolute machine language, relocatable machine
language, or assembly language.
Absolute machine language as output has the advantage that it
can be placed in a fixed memory location and executed
immediately. For example, WATFIV is a compiler that
produces absolute machine code as output.
Relocatable machine language as output allows subprograms
and subroutines to be compiled separately. Relocatable object
modules can be linked together and loaded by a linking loader,
at the added expense of linking and loading.
Assembly language as output makes code generation easier.
We can generate symbolic instructions and use the macro
facilities of the assembler in generating code, but an
additional assembly step is needed after code generation.
table entry for the name. Then from the symbol table entry, a relative
address can be determined for the name.
P = Q+R
S = P+T
MOV Q, R0
ADD R, R0
MOV R0, P
MOV P, R0
ADD T, R0
MOV R0, S
Here the fourth statement (MOV P, R0) is redundant: it reloads into R0 the
value of P that the previous statement has just stored from R0. This leads to
an inefficient code sequence.
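A redundant load of this kind can be removed by a simple peephole pass over the generated instructions. The sketch below assumes instructions are (opcode, source, destination) tuples and that the store is written MOV R0, P; the function name is illustrative. It also assumes the load is not the target of a jump, since in that case removing it would not be safe.

```python
def remove_redundant_loads(instructions):
    """Drop `MOV a, r` when it immediately follows `MOV r, a`."""
    result = []
    for instruction in instructions:
        if result and instruction[0] == "MOV" and result[-1][0] == "MOV":
            _, prev_src, prev_dst = result[-1]
            _, src, dst = instruction
            if src == prev_dst and dst == prev_src:
                continue               # value is already in the register
        result.append(instruction)
    return result


generated = [
    ("MOV", "Q", "R0"),
    ("ADD", "R", "R0"),
    ("MOV", "R0", "P"),
    ("MOV", "P", "R0"),    # redundant: R0 already holds P
    ("ADD", "T", "R0"),
    ("MOV", "R0", "S"),
]
for opcode, src, dst in remove_redundant_loads(generated):
    print(f"{opcode} {src}, {dst}")
```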
Here,
All the statements execute in a sequence one after the other.
Thus, they form a basic block.
Three Address Code for the expression If A<B then 1 else 0 is-
Here,
The statements do not execute in a sequence one after the other.
Thus, they do not form a basic block.
The characteristics of basic blocks are-
They do not contain any jump statements, except possibly as the last statement of the block.
All the statements from a leader up to, but not including, the next leader
form one basic block.
The first statement of the code is called the first leader.
The block containing the first leader is called the initial block.
Problem-01:
Compute the basic blocks for the given three address statements-
(1) PROD = 0
(2) I = 1
(3) T2 = addr(A) – 4
(4) T4 = addr(B) – 4
(5) T1 = 4 x I
(6) T3 = T2[T1]
(7) T5 = T4[T1]
(8) T6 = T3 x T5
(9) PROD = PROD + T6
(10) I = I + 1
(11) IF I <=20 GOTO (5)
Solution-
We have-
PROD = 0 is a leader, since the first statement of the code is a leader.
T1 = 4 x I is a leader, since the target of a conditional goto statement is a
leader.
Now, the given code can be partitioned into two basic blocks: B1 consists of
statements (1) to (4), and B2 consists of statements (5) to (11).
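The partitioning can be sketched in code as follows. Statements are assumed to be held in a dictionary keyed by statement number, and jumps are recognised by the text GOTO (n); both are conventions of this example. The sketch also applies the usual third leader rule (the statement immediately following a jump is a leader), which is not needed for this problem because the jump is the last statement.

```python
import re


def find_leaders(statements):
    """Return the sorted statement numbers that start a basic block."""
    leaders = {min(statements)}                  # first statement is a leader
    for number, text in statements.items():
        match = re.search(r"GOTO \((\d+)\)", text)
        if match:
            leaders.add(int(match.group(1)))     # target of a jump is a leader
            if number + 1 in statements:
                leaders.add(number + 1)          # statement after a jump
    return sorted(leaders)


def basic_blocks(statements):
    """Each block runs from one leader up to, not including, the next."""
    leaders = find_leaders(statements)
    bounds = leaders + [max(statements) + 1]
    return [list(range(start, end)) for start, end in zip(bounds, bounds[1:])]


program = {
    1: "PROD = 0",
    2: "I = 1",
    3: "T2 = addr(A) - 4",
    4: "T4 = addr(B) - 4",
    5: "T1 = 4 x I",
    6: "T3 = T2[T1]",
    7: "T5 = T4[T1]",
    8: "T6 = T3 x T5",
    9: "PROD = PROD + T6",
    10: "I = I + 1",
    11: "IF I <=20 GOTO (5)",
}
print(find_leaders(program))
print(basic_blocks(program))
```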
Problem-02:
Draw a flow graph for the three address statements given in problem-01.
Solution-
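One way to derive the edges of the flow graph is sketched below, taking the two basic blocks of Problem-01 (statements (1) to (4) and statements (5) to (11)) as given. Blocks are numbered from 0 here, so block 0 is the initial block. The jumps dictionary, which maps a jump statement to its target and to whether it is conditional, is an assumption of this example.

```python
def flow_graph(blocks, jumps):
    """Return edges (from_block, to_block) between basic blocks.

    blocks: list of lists of statement numbers, in program order.
    jumps:  statement number -> (target statement, is_conditional).
    """
    block_of = {stmt: index for index, block in enumerate(blocks) for stmt in block}
    edges = set()
    for index, block in enumerate(blocks):
        last = block[-1]
        if last in jumps:
            target, conditional = jumps[last]
            edges.add((index, block_of[target]))   # edge to the jump target
            if not conditional:
                continue                           # no fall-through edge
        if index + 1 < len(blocks):
            edges.add((index, index + 1))          # fall through to next block
    return sorted(edges)


blocks = [[1, 2, 3, 4], [5, 6, 7, 8, 9, 10, 11]]
jumps = {11: (5, True)}        # IF I <= 20 GOTO (5)
print(flow_graph(blocks, jumps))
```

The result has an edge from the initial block to the loop block and an edge from the loop block back to itself, which corresponds to the loop formed by the conditional goto.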
1. The leaves of the graph are labelled by unique identifiers, which can be
variable names or constants.
2. Interior nodes of the graph are labelled by an operator symbol.
3. Nodes are also given a sequence of identifiers as labels, recording which
names hold the computed value.
Method:
Each three-address statement has one of the following forms:
1. Case (i) x:= y OP z
2. Case (ii) x:= OP y
3. Case (iii) x:= y
Step 1:
If y operand is undefined then create node(y).
If z operand is undefined then for case(i) create node(z).
Step 2:
For case(i), create node(OP) whose right child is
node(z) and left child is node(y).
Output:
For node(x) delete x from the list of identifiers. Append x to attached
identifiers list for the node n found in step 2. Finally set node(x) to n.
Example:
Consider the following three address statements:
1. S1:= 4 * i
2. S2:= a[S1]
3. S3:= 4 * i
4. S4:= b[S3]
5. S5:= S2 * S4
6. S6:= prod + S5
7. prod:= S6
8. S7:= i+1
9. i := S7
10. if i<= 20 goto (1)
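The method can be sketched in code and applied to the nine assignments of this example. This appears to be the standard DAG construction for a basic block, in which an existing node is reused whenever the same operator is applied to the same children. The sketch makes several assumptions: statements are tuples (x, op, y, z), array access a[S1] is encoded as the operator "[]" on a and S1, a copy x := y has op set to None, and the final conditional jump is left out. All names are illustrative.

```python
def build_dag(statements):
    nodes = []       # node id -> (label, left child id, right child id)
    lookup = {}      # (label, left, right) -> node id, to reuse existing nodes
    current = {}     # identifier -> node id that currently holds its value
    attached = {}    # node id -> identifiers attached to that node

    def node_for(label, left=None, right=None):
        key = (label, left, right)
        if key not in lookup:
            lookup[key] = len(nodes)
            nodes.append(key)
            attached[lookup[key]] = []
        return lookup[key]

    def operand(name):
        if name not in current:                    # Step 1: undefined operand
            current[name] = node_for(name)         # create a leaf for it
        return current[name]

    for x, op, y, z in statements:
        if op is None:                             # case (iii) x := y
            n = operand(y)
        elif z is None:                            # case (ii)  x := OP y
            n = node_for(op, operand(y))
        else:                                      # case (i)   x := y OP z
            n = node_for(op, operand(y), operand(z))
        # Final step: move x from its old node to the node n just found.
        if x in current and x in attached[current[x]]:
            attached[current[x]].remove(x)
        attached[n].append(x)
        current[x] = n
    return nodes, attached, current


block = [
    ("S1", "*", "4", "i"),
    ("S2", "[]", "a", "S1"),
    ("S3", "*", "4", "i"),
    ("S4", "[]", "b", "S3"),
    ("S5", "*", "S2", "S4"),
    ("S6", "+", "prod", "S5"),
    ("prod", None, "S6", None),
    ("S7", "+", "i", "1"),
    ("i", None, "S7", None),
]
nodes, attached, current = build_dag(block)
print(attached[current["S1"]])     # identifiers sharing the node for 4 * i
```

Statements 1 and 3 both compute 4 * i, so S1 and S3 end up attached to the same node; this is how the construction exposes the common subexpression.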