
RIFT VALLEY UNIVERSITY

DEPARTMENT OF COMPUTER SCIENCE


COMPILER DESIGN Assignment 2
By
Tekaligne woldemariam Regassa. RVGCSDE 0041/18

Lecturer Name: Ararsa L.


Addis Ababa Ethiopia
Submission Date: Dec 26, 2022
For the following questions, write clear, brief and to the point answers.
Unreadable answers will be skipped.

1) Describe Intermediate Code Generation and its types (1pt)

Intermediate code is a representation of the source program that sits between the
source language and the target machine code.
It is generated because translating a source program directly into machine code in a
single step is difficult and ties the compiler to one target machine.
Therefore, the compiler first converts the source program into intermediate code, from
which machine code can then be generated more efficiently.
Intermediate code can be represented as postfix notation, a syntax tree, a directed
acyclic graph, or three-address code (stored as quadruples or triples).

Intermediate code types are:

Intermediate code can be either language-specific (e.g., bytecode for Java) or
language-independent (e.g., three-address code).

1. Three-Address Code: These are statements of the form c = a op b, i.e., statements
with at most three addresses: two for the operands and one for the result.
Each instruction has at most one operator on the right-hand side.

Example: the statement a = b + c + d becomes t1 = b + c, t2 = t1 + d, a = t2.

2. Postfix Notation: In postfix notation, the operator comes after its operands.

Example
 Postfix notation for the expression (a+b) * (c+d) is ab+ cd+ *
 Postfix notation for the expression (a*b) - (c+d) is ab* cd+ -

3. Syntax Tree:
A syntax tree is a condensed form of a parse tree in which the leaves are identifiers
(operands) and the interior nodes are operators. The operator and keyword nodes of the
parse tree are moved up to their parents, and a chain of single productions is replaced
by a single link. Fully parenthesizing the expression first makes it easy to see which
operands belong to which operator.
2) Describe Three-Address Code with its examples (1pt)

Three-address code is a sequence of statements of the form A = B op C, where A, B,
and C are programmer-defined names, constants, or compiler-generated temporary names,
and op is an operator such as a fixed- or floating-point arithmetic operator or a
logical operator on Boolean-valued data.
It is called "three-address code" because each statement generally contains three
addresses: two for the operands and one for the result.
Every statement therefore has at most three addresses.

Example:
Expression a = b + c + d can be converted into the following Three Address Code.
t1 = b + c
t2 = t1 + d
a = t2
Where t1 and t2 are temporary variables generated by the compiler.
Sometimes a statement includes fewer than three references (for example a = t2), but it
is still known as a three-address statement.
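
The translation in the example can be sketched in a few lines of Python. This is only an illustration, not how a real compiler is built: it borrows Python's own `ast` module to parse the expression, and the function name `to_three_address` and the temporary names t1, t2, ... are assumptions made for this sketch.

```python
import ast
from itertools import count

# Operators supported by this sketch, mapped to their printed form.
OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_three_address(statement):
    """Translate one assignment such as 'a = b + c + d' into three-address code."""
    temps = count(1)      # numbering for compiler-generated temporaries
    code = []

    def emit(node):
        # Names and constants are already addresses, so no instruction is needed.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            left = emit(node.left)
            right = emit(node.right)
            temp = f"t{next(temps)}"
            # One operator per instruction: result = left op right.
            code.append(f"{temp} = {left} {OPERATORS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    assign = ast.parse(statement).body[0]
    if not isinstance(assign, ast.Assign) or not isinstance(assign.targets[0], ast.Name):
        raise ValueError("expected a simple assignment")
    result = emit(assign.value)
    code.append(f"{assign.targets[0].id} = {result}")
    return code

for line in to_three_address("a = b + c + d"):
    print(line)
```

Each binary operator in the expression produces exactly one instruction with a fresh temporary, which is why no instruction ever needs more than three addresses.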

3) Describe Syntax Directed Translation - SDT (1pt)

Syntax-directed translation (SDT) is a method of compiler implementation in which the
translation of the source language is completely driven by the parser.
In other words, the parsing process and the parse tree are used to direct semantic
analysis and the translation of the source program.

The parser uses a context-free grammar to check the input sequence. Semantic actions
attached to the grammar's productions run as the productions are recognized and
deliver the output for the compiler's next stage.
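
As a small illustration of translation driven by the parser, the sketch below is a recursive-descent parser for arithmetic expressions whose only output comes from actions placed inside the parsing functions. The grammar (expr, term, factor) and the name `translate_to_postfix` are assumptions chosen for this example; the actions emit postfix notation.

```python
import re

def translate_to_postfix(source):
    """Parse an infix expression and emit postfix through embedded actions."""
    # Simplified tokenizer: characters it does not recognize are skipped.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|[-+*/()]", source)
    pos = 0
    output = []

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():                      # expr -> term { (+|-) term  {emit op} }
        term()
        while peek() in ("+", "-"):
            op = advance()
            term()
            output.append(op)        # semantic action

    def term():                      # term -> factor { (*|/) factor  {emit op} }
        factor()
        while peek() in ("*", "/"):
            op = advance()
            factor()
            output.append(op)        # semantic action

    def factor():                    # factor -> id {emit id} | ( expr )
        token = peek()
        if token == "(":
            advance()
            expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
        elif token is not None and re.fullmatch(r"\w+", token):
            output.append(advance())  # semantic action
        else:
            raise SyntaxError(f"unexpected token {token!r}")

    expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return " ".join(output)

print(translate_to_postfix("(a + b) * (c + d)"))
```

No separate translation pass exists here: the output is produced entirely as a side effect of parsing, which is the idea behind syntax-directed translation.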

4) Describe Directed Acyclic Graph - DAG (1pt)

In compiler design, a Directed Acyclic Graph (DAG) is a directed graph that does not
contain any cycles.

A DAG depicts the structure of a basic block, helps to see how values flow among the
statements of the block, and supports optimization, because transformations on the
basic block are easy to apply to it. In a DAG, leaf nodes represent identifiers, names,
or constants, and interior nodes represent operators; a subexpression that occurs more
than once is represented by a single shared node.
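
A minimal sketch of building a DAG for one basic block is shown below. It assumes the block is given as (result, left, op, right) tuples; the function name `build_dag` and the sample block, which is three-address code written by hand for x = (a + b * c) / (a - b * c), are illustrative assumptions.

```python
def build_dag(block):
    """Build a DAG from three-address tuples (result, left, op, right)."""
    nodes = []       # node id -> ("leaf", name) or (op, left_id, right_id)
    index = {}       # node description -> node id, so identical nodes are shared
    current = {}     # variable name -> id of the node holding its current value

    def node_for(key):
        if key not in index:
            index[key] = len(nodes)
            nodes.append(key)
        return index[key]

    def operand(name):
        # A name not yet assigned in this block is an initial value: a leaf node.
        if name not in current:
            current[name] = node_for(("leaf", name))
        return current[name]

    for result, left, op, right in block:
        key = (op, operand(left), operand(right))
        current[result] = node_for(key)   # reuse the node if it already exists
    return nodes, current

block = [
    ("t1", "b", "*", "c"),
    ("t2", "a", "+", "t1"),
    ("t3", "b", "*", "c"),    # same subexpression as t1
    ("t4", "a", "-", "t3"),
    ("x", "t2", "/", "t4"),
]
nodes, current = build_dag(block)
for node_id, node in enumerate(nodes):
    print(node_id, node)
print("t1 and t3 share a node:", current["t1"] == current["t3"])
```

Because t1 and t3 end up on the same node, the optimizer can see that b * c only needs to be computed once.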

5) Describe the following phrases. Use examples for each description (4pts)

a) Production
A production, or production rule, is a rewrite rule that specifies a symbol
substitution which can be applied recursively to generate new symbol sequences.
It has the form α -> β, where α is a non-terminal symbol that can be replaced by β,
a string of terminal and/or non-terminal symbols.

Example-1:

Consider Grammar G1 = <N, T, P, S>

N = {A}    #Set of non-terminal symbols
T = {a,b}    #Set of terminal symbols
P = {A->Aa, A->Ab, A->a, A->b, A->ε}    #Set of all production rules
S = A    #Start symbol

Starting from the start symbol A, we can produce Aa, Ab, a, b, or the empty string.
Wherever A still appears, it can again be replaced using the production rules,
and hence this grammar can be used to produce strings of the form (a+b)*.

Derivation of strings:

A => a    #using production rule 3
OR
A => Aa    #using production rule 1
Aa => ba    #using production rule 4
OR
A => Aa    #using production rule 1
Aa => Aba    #using production rule 2
Aba => ba    #using production rule 5
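
Derivations in G1 can be replayed mechanically, which is a useful check that each step really follows from a production. The sketch below assumes that rule 5 of G1 is the empty production (A -> ε); the names `G1_RULES` and `derive` are made up for this illustration.

```python
EPSILON = ""   # the empty string stands for the empty production

# Productions of G1, numbered as in the text (rule 5 assumed to be A -> epsilon).
G1_RULES = {
    1: ("A", "Aa"),
    2: ("A", "Ab"),
    3: ("A", "a"),
    4: ("A", "b"),
    5: ("A", EPSILON),
}

def derive(start, rule_numbers, rules=G1_RULES):
    """Apply each rule to the leftmost matching non-terminal.

    Returns every sentential form of the derivation, starting with `start`.
    """
    forms = [start]
    for number in rule_numbers:
        head, body = rules[number]
        current = forms[-1]
        if head not in current:
            raise ValueError(f"rule {number} does not apply to {current!r}")
        forms.append(current.replace(head, body, 1))
    return forms

print(" => ".join(derive("A", [1, 4])))       # rule 1, then rule 4
print(" => ".join(derive("A", [1, 2, 5])))    # rules 1, 2, then the empty production
```

Both rule sequences end in the string ba, showing that one string can have more than one derivation.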

Example-2:

Consider Grammar G2 = <N, T, P, S>

N = {A}    #Set of non-terminal symbols
T = {a}    #Set of terminal symbols
P = {A->Aa, A->AAa, A->a, A->ε}    #Set of all production rules
S = A    #Start symbol

Starting from the start symbol A, we can produce Aa, AAa, a, or the empty string.
Wherever A still appears, it can again be replaced using the production rules,
and hence this grammar can be used to produce strings of the form a*.

Derivation of strings:
A => a    #using production rule 3
OR
A => Aa    #using production rule 1
Aa => aa    #using production rule 3
OR
A => AAa    #using production rule 2
AAa => Aa    #using production rule 4
Aa => aa    #using production rule 3

Equivalent Grammars:
Grammars are said to be equivalent if they produce the same language.

Different Types of Grammars:
Grammars can be classified on the basis of:
 Type of production rules
 Number of derivation trees
 Number of strings
b) Semantic Rules
Semantic rules are the rules enforced by semantic analysis, the third phase of a
compiler.
Semantic analysis makes sure that the declarations and statements of a program are
semantically correct.
It is a collection of procedures that the parser calls as and when required by the
grammar.
Both the syntax tree from the previous phase and the symbol table are used to check
the consistency of the given code.
Type checking is an important part of semantic analysis, in which the compiler makes
sure that each operator has matching operands.

Semantic Analyzer:
It uses the syntax tree and the symbol table to check whether the given program is
semantically consistent with the language definition.
It gathers type information and stores it in either the syntax tree or the symbol table.
This type information is subsequently used by the compiler during intermediate-code
generation.
Semantic Errors:
Errors recognized by semantic analyzer are as follows:
 Type mismatch
 Undeclared variables
 Reserved identifier misuse

Example:
float x = 10.1;
float y = x*30;
In the above example, the semantic analyzer converts (type casts) the integer 30 to the
float 30.0 before the multiplication.
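
The checks described above (undeclared variables, type mismatch, and the implicit int-to-float conversion) can be sketched as follows. The type names, the `int_to_float` marker, and the functions `type_of` and `check_arithmetic` are assumptions made for this illustration; a real semantic analyzer works on the syntax tree and a much richer symbol table.

```python
def type_of(operand, symbols):
    """Return the type of a numeric literal or of a declared variable name."""
    if isinstance(operand, float):
        return "float"
    if isinstance(operand, int):
        return "int"
    if operand not in symbols:
        raise NameError(f"undeclared variable: {operand}")
    return symbols[operand]

def widen(operand, operand_type):
    # Mark an int operand for conversion; leave other operands unchanged.
    return ("int_to_float", operand) if operand_type == "int" else operand

def check_arithmetic(left, op, right, symbols):
    """Type-check `left op right`; return (result_type, annotated_expression)."""
    left_type = type_of(left, symbols)
    right_type = type_of(right, symbols)
    numeric = {"int", "float"}
    if left_type not in numeric or right_type not in numeric:
        raise TypeError(f"type mismatch: {left_type} {op} {right_type}")
    if left_type == right_type:
        return left_type, (op, left, right)
    # Mixed int/float operands: widen the int side so both operands match.
    return "float", (op, widen(left, left_type), widen(right, right_type))

symbol_table = {"x": "float"}          # float x = 10.1;
print(check_arithmetic("x", "*", 30, symbol_table))
```

For `x * 30` the sketch reports a float result and wraps the literal 30 in a conversion, mirroring the cast described in the example.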

c) Annotated Parse Tree

An annotated parse tree is a parse tree showing the values of the attributes
at each node. The process of computing the attribute values at the nodes is
called annotating or decorating the parse tree.
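
As an example, the sketch below annotates the parse tree of 3 * 5 + 4 with a synthesized attribute val. The grammar (E -> E + T | T, T -> T * F | F, F -> digit) and the names `Node`, `annotate`, and `show` are assumptions chosen for this illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    symbol: str                                   # grammar symbol: "E", "T", "F", "+", "*", "digit"
    children: list = field(default_factory=list)
    lexeme: Optional[str] = None                  # set on digit leaves only
    val: Optional[int] = None                     # attribute filled in by annotate()

def annotate(node):
    """Compute val at every node bottom-up (children first)."""
    for child in node.children:
        annotate(child)
    kids = node.children
    if node.symbol == "digit":
        node.val = int(node.lexeme)
    elif len(kids) == 1:                          # unit production: copy the value up
        node.val = kids[0].val
    elif len(kids) == 3 and kids[1].symbol == "+":
        node.val = kids[0].val + kids[2].val
    elif len(kids) == 3 and kids[1].symbol == "*":
        node.val = kids[0].val * kids[2].val

def show(node, depth=0):
    """Print the tree with the attribute value next to each annotated node."""
    label = node.symbol if node.val is None else f"{node.symbol}.val = {node.val}"
    print("  " * depth + label)
    for child in node.children:
        show(child, depth + 1)

def factor(digit_text):
    return Node("F", [Node("digit", lexeme=digit_text)])

# Parse tree of 3 * 5 + 4, built by hand for the example.
tree = Node("E", [
    Node("E", [Node("T", [Node("T", [factor("3")]), Node("*"), factor("5")])]),
    Node("+"),
    Node("T", [factor("4")]),
])
annotate(tree)
show(tree)
```

After annotation the root carries the value of the whole expression, and every interior node shows the value of the subexpression below it.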
6) Code Optimization: Describe the following terms with examples (4pts)
Code optimization in the synthesis phase is a program transformation technique that
tries to improve the intermediate code by making it consume fewer resources (i.e.,
CPU, memory) so that faster-running machine code will result.

The optimizing process of a compiler should meet the following objectives:
 The optimization must be correct; it must not, in any way, change the
meaning of the program.
 Optimization should increase the speed and performance of the program.
 The compilation time must be kept reasonable.
 The optimization process should not delay the overall compiling process.

Optimization of the code is often performed at the end of the development stage,
since it reduces readability and adds code whose only purpose is to increase
performance.

a) Type of Code Optimization
The optimization process can be broadly classified into two types:

1. Machine-Independent Optimization:
This code optimization phase attempts to improve the intermediate code to get
better target code as the output.
The part of the intermediate code that is transformed here does not involve any
CPU registers or absolute memory locations.

2. Machine-Dependent Optimization:
Machine-dependent optimization is done after the target code has been generated,
when the code is transformed according to the target machine architecture.
It involves CPU registers and may use absolute memory references rather than
relative references.
Machine-dependent optimizers try to take maximum advantage of the memory hierarchy.

Example:

Copy Propagation:
 It is an extension of constant propagation.
 After a is assigned to x (x = a), use a in place of x until a or x is assigned
again.
 It helps reduce the run time of the program, as it reduces copying.

//Before Optimization
c = a * b                                              
x = a                                                 
till                                                          
d = x * b + 4
 
 
//After Optimization
c = a * b 
x = a
till
d = a * b + 4
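
A minimal sketch of copy propagation over straight-line code is given below. It assumes each statement is a (target, tokens) pair and leaves out the "till" placeholder line; the function name `copy_propagate` is made up for this illustration.

```python
def copy_propagate(block):
    """Replace uses of copied variables by their source.

    block is a list of (target, tokens) pairs such as
    ("d", ["x", "*", "b", "+", "4"]) for d = x * b + 4.
    """
    copies = {}      # x -> a for every copy x = a that is still valid
    result = []
    for target, tokens in block:
        # Replace each use of a copied variable by the variable it was copied from.
        tokens = [copies.get(token, token) for token in tokens]
        # Assigning to target invalidates every recorded copy that mentions it.
        copies = {dst: src for dst, src in copies.items() if target not in (dst, src)}
        # A statement of the form target = source is itself a copy; record it.
        if len(tokens) == 1 and tokens[0].isidentifier() and tokens[0] != target:
            copies[target] = tokens[0]
        result.append((target, tokens))
    return result

before = [
    ("c", ["a", "*", "b"]),
    ("x", ["a"]),
    ("d", ["x", "*", "b", "+", "4"]),
]
for target, tokens in copy_propagate(before):
    print(target, "=", " ".join(tokens))
```

The copy x = a is still present after this pass; it only becomes removable once no later statement uses x.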

b) Dead-code Elimination
 Copy propagation often turns assignment statements into dead code.
 A variable is said to be dead if it is never used after its last definition.
 In order to find the dead variables, a data flow analysis should be done.
Example:

//Before elimination:
c = a * b                                               
x = a                                               
till                                                         
d = a * b + 4  
 
//After elimination :
c = a * b
till
d = a * b + 4
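
The data flow analysis needed here can be sketched for a single straight-line block as a backward scan that tracks which variables are still live. The sketch assumes statements are (target, tokens) pairs, that expressions have no side effects, and that the caller states which variables are needed after the block; the name `eliminate_dead_code` is made up for this illustration.

```python
def eliminate_dead_code(block, live_out):
    """Remove assignments whose target is never used afterwards.

    block is a list of (target, tokens) pairs; live_out is the set of
    variables that are still needed after the block ends.
    """
    live = set(live_out)
    kept = []
    # Scan backwards so that every later use is known before its definition.
    for target, tokens in reversed(block):
        if target in live:
            kept.append((target, tokens))
            live.discard(target)                       # this definition satisfies the use
            live.update(t for t in tokens if t.isidentifier())
        # Otherwise the assignment is dead and is dropped.
    kept.reverse()
    return kept

block = [
    ("c", ["a", "*", "b"]),
    ("x", ["a"]),                         # dead: x is never used again
    ("d", ["a", "*", "b", "+", "4"]),
]
for target, tokens in eliminate_dead_code(block, live_out={"c", "d"}):
    print(target, "=", " ".join(tokens))
```

With c and d assumed live at the end of the block, only the assignment x = a is removed, matching the result shown in the example.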

c) Why is code optimization important?
Optimization helps to:
 Reduce the space consumed and increase the execution speed of the compiled program.
 Avoid tedious manual work. Manually analyzing datasets takes a lot of time, so we
make use of software like Tableau for data analysis. Similarly, manually performing
optimization is tedious and is better done using a code optimizer.
 Optimized code often promotes re-usability.

7) Target Code Generation (3pts)

a) Describe Intermediate Code with example
In the analysis-synthesis model of a compiler, the front end of the compiler
translates a source program into a machine-independent intermediate code, and the
back end of the compiler then uses this intermediate code to generate the target
code (which can be understood by the machine).
Intermediate code can be either language-specific (e.g., bytecode for Java) or
language-independent (e.g., three-address code).

The following are commonly used intermediate code representations:

1. Postfix Notation:
Also known as reverse Polish notation or suffix notation.
The ordinary (infix) way of writing the sum of a and b is with the operator in the
middle: a + b.
The postfix notation for the same expression places the operator at the right end:
ab +.
In general, if e1 and e2 are any postfix expressions and + is any binary operator,
the result of applying + to the values denoted by e1 and e2 is written in postfix
notation as e1 e2 +.
No parentheses are needed in postfix notation, because the position and arity
(number of arguments) of the operators permit only one way to decode a postfix
expression.
In postfix notation, the operator follows its operands.

Example 1:
The postfix representation of the expression (a + b) * c is: ab + c *

Example 2:
The postfix representation of the expression (a - b) * (c + d) + (a - b)
is: ab - cd + * ab - +
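
The claim that a postfix expression can be decoded in only one way can be seen in a stack-based evaluator: operands are pushed, and each operator pops exactly as many values as its arity. The sketch below is illustrative; the names `OPS` and `eval_postfix` and the sample variable values are assumptions.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def eval_postfix(tokens, env):
    """Evaluate a postfix token list; env maps variable names to values."""
    stack = []
    for token in tokens:
        if token in OPS:
            # Every operator here is binary, so it pops exactly two operands.
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(env[token])
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

values = {"a": 5, "b": 2, "c": 1, "d": 3}
print(eval_postfix("a b + c *".split(), values))
print(eval_postfix("a b - c d + * a b - +".split(), values))
```

The evaluator never needs parentheses or precedence rules; the order of the tokens alone determines which operands each operator receives.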

2. Three-Address Code:
A statement involving no more than three references (two for operands and one
for the result) is known as a three-address statement. A sequence of three-address
statements is known as three-address code. A three-address statement has the
form x = y op z, where x, y, and z each have an address (memory location).
Sometimes a statement contains fewer than three references, but it is still
called a three-address statement.
Example:
The three-address code for the expression a + b * c + d is:
T1 = b * c
T2 = a + T1
T3 = T2 + d
T1, T2, and T3 are temporary variables.

There are 3 ways to represent three-address code in compiler design:
i) Quadruples
ii) Triples
iii) Indirect Triples
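
The three representations can be compared on the three-address code for a + b * c + d from the example above. The sketch below is illustrative only; the field names of `Quad` and the function `to_triples` are assumptions.

```python
from collections import namedtuple

# A quadruple stores the operator, both arguments, and an explicit result name.
Quad = namedtuple("Quad", "op arg1 arg2 result")

quadruples = [
    Quad("*", "b", "c", "T1"),
    Quad("+", "a", "T1", "T2"),
    Quad("+", "T2", "d", "T3"),
]

def to_triples(quads):
    """Drop the result field and refer to earlier results by instruction index."""
    position = {}      # temporary name -> index of the triple that computes it
    triples = []
    for index, quad in enumerate(quads):
        arg1 = position.get(quad.arg1, quad.arg1)
        arg2 = position.get(quad.arg2, quad.arg2)
        triples.append((quad.op, arg1, arg2))
        position[quad.result] = index
    return triples

triples = to_triples(quadruples)

# Indirect triples keep a separate list of pointers into the triple table.
# Reordering statements then only changes this list, not the triples themselves.
statement_order = list(range(len(triples)))

for index in statement_order:
    print(index, triples[index])
```

In the triples, the integer arguments 0 and 1 are references to earlier instructions, which is why temporaries such as T1 and T2 no longer need names.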

3. Syntax Tree:
A syntax tree is a condensed form of a parse tree. The operator and keyword nodes
of the parse tree are moved up to their parents, and a chain of single productions
is replaced by a single link, so that the internal nodes are operators and the child
nodes are operands. Fully parenthesizing the expression first makes it easy to
recognize which operand should come first.

Example: x = (a + b * c) / (a - b * c)
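
The syntax tree for this example can be written out as nested tuples of the form (operator, left, right). The sketch below borrows Python's `ast` module as a ready-made parser, which is an assumption of convenience; the function name `syntax_tree` is made up for this illustration.

```python
import ast

# Operators supported by this sketch, mapped to their printed form.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def syntax_tree(node):
    """Convert a parsed statement into nested (operator, left, right) tuples."""
    if isinstance(node, ast.Assign):
        return ("=", syntax_tree(node.targets[0]), syntax_tree(node.value))
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return (OPS[type(node.op)], syntax_tree(node.left), syntax_tree(node.right))
    if isinstance(node, ast.Name):
        return node.id                    # leaves are identifiers
    raise ValueError("unsupported construct")

tree = syntax_tree(ast.parse("x = (a + b * c) / (a - b * c)").body[0])
print(tree)
```

The parentheses of the source expression do not appear in the tree; the grouping is expressed by the nesting of the operator nodes.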
b) Describe Target Languages with example

Target code generation converts the optimized intermediate code into a format the
machine can understand.
The target code can be machine code or assembly code.
Each line of optimized code may map to one or more lines of machine (or)
assembly code; hence there is a 1:N mapping between them.
1: N Mapping

Target code generation is the final phase of the compiler.

1. Input: Optimized intermediate representation.
2. Output: Target code.
3. Task performed: Register allocation and optimization, and generation of
assembly-level code.
4. Method: Three popular strategies for register allocation and optimization.
5. Implementation: Algorithms.
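
The 1:N mapping from optimized code to target code can be illustrated with a very small code generator. The target here is a hypothetical one-register machine: the mnemonics LOAD, ADD, SUB, MUL, DIV, STORE and the register R0 are assumptions for this sketch, not the instruction set of any real processor, and no register allocation is attempted.

```python
# Hypothetical mnemonics for the arithmetic operators.
MNEMONICS = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}

def generate_target(tac):
    """Translate (result, left, op, right) tuples into pseudo-assembly lines."""
    assembly = []
    for result, left, op, right in tac:
        # Each three-address statement expands into three target instructions.
        assembly.append(f"LOAD R0, {left}")
        assembly.append(f"{MNEMONICS[op]} R0, {right}")
        assembly.append(f"STORE {result}, R0")
    return assembly

three_address_code = [
    ("T1", "b", "*", "c"),
    ("T2", "a", "+", "T1"),
    ("T3", "T2", "+", "d"),
]
for line in generate_target(three_address_code):
    print(line)
```

Here every statement maps to exactly three instructions. A real code generator would keep values in registers across statements, which is where register allocation comes in.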

== Thank you =====
