
ASSIGNMENT | DIVYANSH JAIN(17100BTCMCI01669)

Compiler Design(BTCS601)
Assignment-1

Ques.1 What are compilers and de-compilers? Why are they needed?
Ans.1
COMPILER: A compiler is a software program that transforms high-level source code, written by a developer in a high-level programming language, into low-level object code (binary code) in machine language, which can be understood by the processor. The process of converting a high-level program into machine language is known as compilation.
The processor executes the object code, which indicates when binary high and low signals are required in the arithmetic logic unit of the processor.
DE-COMPILER: A de-compiler is a programming tool that converts an executable program
or low-level/machine language into a format understandable to software programmers. It
performs the operations of a compiler, which translates source code into an executable format,
but in reverse. A de-compiler’s recipient is a human user, whereas the compiler’s is the
machine.
A de-compiler can be useful in some cases for the following purposes:

● Recovery of lost source code, to archive or maintain the code
● Debugging programs
● Antivirus analysis, to find vulnerabilities in a program
● Interoperability, to facilitate migration of a program across platforms

Ques.2 Explain “Analysis Synthesis Model” of compiler in brief?

Ans.2

A compiler can broadly be divided into two phases based on the way they compile.

Analysis Phase

Known as the front-end of the compiler, the analysis phase of the compiler reads the source
program, divides it into core parts and then checks for lexical, grammar and syntax errors. The
analysis phase generates an intermediate representation of the source program and a symbol
table, which are fed to the synthesis phase as input.


Synthesis Phase

Known as the back-end of the compiler, the synthesis phase generates the target program with the
help of intermediate source code representation and symbol table.
A compiler can have many phases and passes.
Pass : A pass refers to the traversal of a compiler through the entire program.
Phase : A phase of a compiler is a distinguishable stage, which takes input from the previous
stage, processes and yields output that can be used as input for the next stage. A pass can have
more than one phase.

Ques. 3 Explain the complete execution process of c/c++ program ?

Ans. 3

Execution of a C/C++ program involves four stages, each handled by a different tool; these tools are sets of programs which together complete the C/C++ program's execution process.

1. Preprocessor
2. Compiler
3. Linker
4. Loader

These tools take the program from source code to a running process.

1) Preprocessor

This is the first stage of any C/C++ program execution process; in this stage the preprocessor processes
the program before compilation. The preprocessor includes header files and expands macros.

2) Compiler

This is the second stage of any C/C++ program execution process; in this stage the output file generated
after preprocessing (with source code) is passed to the compiler for compilation. The compiler will
compile the program, check for errors and generate the object file (this object file contains machine
code).

3) Linker

This is the third stage of any C/C++ program execution process; in this stage the linker links
one or more object files or libraries and generates the executable file.

4) Loader

This is the fourth and final stage of any C/C++ program execution process; in this stage the
loader loads the executable file into main/primary memory, and the program runs.

Different files during the process of execution


Suppose you save a C program as prg1.c – here .c is the extension of C code, and the prg1.c file contains
the program (the source code of a C program). The preprocessor reads the file and generates prg1.i (prg1.ii
for C++ source code); this file contains the preprocessed code.
The compiler reads the prg1.i file, converts it into assembly code in prg1.s,
and then finally generates object code in the prg1.o file.
The linker reads the prg1.o file, links in other object code or library files, and generates an
executable file named prg1.exe.
The loader loads the prg1.exe file into main/primary memory, and finally the program runs.
One more file is created that contains the source code, named prg1.bak; it's a backup file of the
program files.

Ques.4 Explain the term Editor, preprocessor, linker, loader?


Ans.4
Loader: A loader is a program that loads the machine code of a program into system memory. In
computing, a loader is the part of an operating system that is responsible for loading programs. It is
one of the essential stages in the process of starting a program, because it places programs into
memory and prepares them for execution. Loading a program involves reading the contents of the
executable file into memory. Once loading is complete, the operating system starts the program by
passing control to the loaded program code. All operating systems that support program loading
have loaders. In many operating systems the loader is permanently resident in memory.

Linker: In high-level languages, some built-in header files or libraries are provided. These libraries are
predefined and contain basic functions which are essential for executing the program. These
functions are linked to the libraries by a program called the linker. If the linker does not find the
library for a function, it informs the compiler, and the compiler then generates an error. The compiler
automatically invokes the linker as the last step in compiling a program. Besides the built-in libraries, it
also links user-defined functions to user-defined libraries. Usually a longer program is divided
into smaller subprograms called modules, and these modules must be combined to execute the
program. The process of combining the modules is done by the linker.


Pre-processor
A source program may be divided into modules stored in separate files. The task of collecting the
source program is entrusted to a separate program called the pre-processor. It may also expand macros
into source language statements.

Ques.5 Give the short description of tools LEX and YACC?

Ans.5

LEX: Lex is officially known as a "Lexical Analyser".

Its main job is to break up an input stream into more usable elements, or, in other words, to identify the "interesting bits" in a text file.

For example, if you are writing a compiler for the C programming language, the symbols { }
( ) ; all have significance on their own. The letter a usually appears as part of a keyword or
variable name, and is not interesting on its own. Instead, we are interested in the whole
word. Spaces and newlines are completely uninteresting, and we want to ignore them
completely, unless they appear within quotes "like this".

All of these things are handled by the Lexical Analyser.

YACC: Yacc is officially known as a "parser".

Its job is to analyse the structure of the input stream, and operate on the "big picture".

In the course of its normal work, the parser also verifies that the input is syntactically sound.

Consider again the example of a C compiler. In the C language, a word can be a function name
or a variable, depending on whether it is followed by a ( or a =. There should be exactly one }
for each { in the program.

YACC stands for "Yet Another Compiler Compiler". This is because this kind of analysis of text
files is normally associated with writing compilers.

Ques. 6 What is LEX? Describe auxiliary definitions and translation for LEX with suitable
example.
Ans. 6
o Lex is a program that generates a lexical analyzer. It is used with the YACC parser generator.
o The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
o Lex reads its input specification and produces, as output, the source code of the lexical
analyzer as a C program.

The function of Lex is as follows:


o Firstly, a program lex.l is written in the Lex language. Then the Lex compiler runs the lex.l
program and produces a C program lex.yy.c.
o Finally, the C compiler runs on the lex.yy.c program and produces an object program a.out.
o a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.

Lex file format


A Lex program is separated into three sections by %% delimiters. The format of a Lex source file is as
follows:
1. { definitions }
2. %%
3. { rules }
4. %%
5. { user subroutines }
Definitions include declarations of constants, variables and regular definitions.
Rules are statements of the form p1 {action1} p2 {action2} ... pn {actionn},
where pi is a regular expression and actioni describes what action the lexical analyzer
should take when pattern pi matches a lexeme.
User subroutines are auxiliary procedures needed by the actions. The subroutines can be
compiled separately and loaded with the lexical analyzer.


Ques.7 What are the contents of executable files?

Ans.7

Processors understand programs in terms of opcodes, so your intuition about executables
containing opcodes is correct: any executable has to have opcodes and operands for executing
the program on a processor.

However, programs mostly execute with the help of operating systems (you can write programs
which do not use an OS to execute, but that would be a lot of unnecessary work) - which provide
abstractions on top of the hardware which the programs can use. The OS is responsible for setting up
a "context" for any program to run i.e. provide the program the memory it needs, provide general
purpose libraries which the program can use for doing common stuff such as write to files, print to
console etc.

However, to set up the context for the program (provide it memory, load its data, set up a stack for
it), the OS needs to read a program's executable file and needs to know a few things about the
program such as the data which the program expects to use, size of that data, the initial values stored
in that data region, the list of opcodes that make up the program (also called the text region of a
process), their size etc. All of this data and a lot more (debugging information, readonly data such as
hardcoded strings in the program, symbol tables etc) is stored within the executable file. Each OS
understands a different format of this executable file, since they expect all this info to be stored in
the executable in different ways.

A couple of formats that have been used for storing information in an executable file are ELF
and COFF on UNIX systems and PE on Windows.

Not all programs need executable formats. Look up bootloaders on Google. These are special
programs which occupy the first sector of a bootable partition on the hard-disk and are used to
load the OS itself.
Ques. 8 Explain Compiler construction tools?
Ans. 8
The compiler writer can use some specialized tools that help in implementing various phases of a
compiler. These tools assist in the creation of an entire compiler or its parts. Some commonly used
compiler construction tools include:
1. Parser Generator –
It produces syntax analyzers (parsers) from input based on a grammatical
description of a programming language or on a context-free grammar. It is useful because the
syntax analysis phase is highly complex and consumes a great deal of manual effort and
time.
Example: PIC, EQM


2. Scanner Generator –
It generates lexical analyzers from an input that consists of regular-expression descriptions
of the tokens of a language. It generates a finite automaton to recognize the regular
expressions.
Example: Lex

3. Syntax directed translation engines –


It generates intermediate code in three-address format from an input that consists of a parse
tree. These engines have routines to traverse the parse tree and produce the intermediate
code. Each node of the parse tree is associated with one or more translations.
4. Automatic code generators –
It generates the machine language for a target machine. Each operation of the intermediate
language is translated using a collection of rules and then taken as input by the code
generator. A template-matching process is used: an intermediate-language statement is replaced
by its equivalent machine-language statement using templates.
5. Data-flow analysis engines –
It is used in code optimization. Data-flow analysis is a key part of code optimization; it
gathers information about the values that flow from one part of a program to another.
6. Compiler construction toolkits –
It provides an integrated set of routines that aids in building compiler components or in the
construction of various phases of compiler.


Assignment-02

Q.1 Difference between Parse tree and Syntax tree ?


Parse Tree
● An ordered, rooted tree that represents the syntactic structure of a string according to some
context-free grammar.
● Also known as a parsing tree, derivation tree, or concrete syntax tree.
● Contains records of the rules (tokens) used to match the input text.

Syntax Tree
● A tree representation of the abstract syntactic structure of source code written in a
programming language.
● Also known as an abstract syntax tree.
● Contains records of the syntax of the programming language.

Q.2 Explain the Recursive-Descent Parsing & Predictive Parsing.


Recursive Descent Parser:

It is a kind of Top-Down Parser. A top-down parser builds the parse tree from the top to down,
starting with the start non-terminal. A Predictive Parser is a special case of Recursive Descent
Parser, where no Back Tracking is required.

By carefully writing the grammar, that is, by eliminating left recursion and left-factoring it, we
obtain a grammar that can be parsed by a recursive descent parser.

Predictive Parser

Predictive parser is a recursive descent parser, which has the capability to predict which
production is to be used to replace the input string. The predictive parser does not suffer from
backtracking.
To accomplish its tasks, the predictive parser uses a look-ahead pointer, which points to the
next input symbols. To make the parser back-tracking free, the predictive parser puts some
constraints on the grammar and accepts only a class of grammar known as LL(k) grammar.


Predictive parsing uses a stack and a parsing table to parse the input and generate a parse tree.
Both the stack and the input contain an end symbol $ to denote that the stack is empty and
the input is consumed. The parser refers to the parsing table to take any decision on the
combination of input and stack element. In recursive descent parsing, the parser may have more than
one production to choose from for a single instance of input, whereas in a predictive parser,
each step has at most one production to choose. There might be instances where no
production matches the input string, making the parsing procedure fail.

Q.3 Describe the Bottom–up parser.


Bottom-up parsing starts from the leaf nodes of a tree and works in upward direction till it
reaches the root node. Here, we start from a sentence and then apply production rules in reverse
manner in order to reach the start symbol. The image given below depicts the bottom-up
parsers available.

Shift-Reduce Parsing
Shift-reduce parsing uses two unique steps for bottom-up parsing. These steps are known as
shift-step and reduce-step.


● Shift step: The shift step refers to the advancement of the input pointer to the next input
symbol, which is called the shifted symbol. This symbol is pushed onto the stack. The
shifted symbol is treated as a single node of the parse tree.
● Reduce step: When the parser finds a complete grammar rule (RHS) and replaces it with the
(LHS), it is known as a reduce-step. This occurs when the top of the stack contains a
handle. To reduce, a POP function is performed on the stack, which pops off the handle
and replaces it with the LHS non-terminal symbol.

LR Parser
The LR parser is a non-recursive, shift-reduce, bottom-up parser. It uses a wide class of
context-free grammar which makes it the most efficient syntax analysis technique. LR parsers
are also known as LR(k) parsers, where L stands for left-to-right scanning of the input stream;
R stands for the construction of right-most derivation in reverse, and k denotes the number of
lookahead symbols to make decisions.
There are three widely used algorithms available for constructing an LR parser:
● SLR(1) – Simple LR Parser:
○ Works on smallest class of grammar
○ Few number of states, hence very small table
○ Simple and fast construction
● LR(1) – LR Parser:
○ Works on complete set of LR(1) Grammar
○ Generates large table and large number of states
○ Slow construction
● LALR(1) – Look-Ahead LR Parser:
○ Works on intermediate size of grammar
○ Number of states are same as in SLR(1)

Q.4 What is Operator Precedence Grammar? Apply the Operator Precedence Parsing
on the String id*id+id.

Ans. A grammar that is used to define mathematical operators is called an operator grammar
or operator precedence grammar. Such grammars have the restriction that no production has
either an empty right-hand side (null productions) or two adjacent non-terminals in its right-
hand side.

An operator precedence grammar is a kind of grammar for formal languages.


Technically, an operator precedence grammar is a context-free grammar that has the property
(among others) that no production has either an empty right-hand side or two adjacent
nonterminals in its right-hand side. These properties allow precedence relations to be defined
between the terminals of the grammar. A parser that exploits these relations is considerably
simpler than more general-purpose parsers such as LALR parsers. Operator-precedence parsers
can be constructed for a large class of context-free grammars.
Operator precedence grammars rely on the following three precedence relations between the
terminals:

Relation    Meaning

a < b       a yields precedence to b

a = b       a has the same precedence as b

a > b       a takes precedence over b

Q.5 Explain the LR Parser? Explain its features and Types.

The LR parser is a non-recursive, shift-reduce, bottom-up parser. It uses a wide class of


context-free grammar which makes it the most efficient syntax analysis technique. LR parsers
are also known as LR(k) parsers, where L stands for left-to-right scanning of the input stream;
R stands for the construction of right-most derivation in reverse, and k denotes the number of
lookahead symbols to make decisions.

There are three widely used algorithms available for constructing an LR parser:

● SLR(1) – Simple LR Parser:


○ Works on smallest class of grammar
○ Few number of states, hence very small table
○ Simple and fast construction
● LR(1) – LR Parser:
○ Works on complete set of LR(1) Grammar
○ Generates large table and large number of states
○ Slow construction
● LALR(1) – Look-Ahead LR Parser:
○ Works on intermediate size of grammar
○ Number of states are same as in SLR(1)


Reasons for attractiveness of LR parser


• LR parsers can handle a large class of context-free grammars.
• The LR parsing method is the most general non-backtracking shift-reduce parsing method.
• An LR parser can detect syntax errors as soon as they occur.
• LR grammars can describe more languages than LL grammars.
Drawbacks of LR parsers
• It is too much work to construct an LR parser by hand; an automated parser generator is needed.
• If the grammar contains ambiguities or other problematic constructs, it is difficult to parse
in a left-to-right scan of the input.

Model of LR Parser

LR parser consists of an input, an output, a stack, a driver program and a parsing table that has
two functions
1. Action
2. Goto


The driver program is the same for all LR parsers. Only the parsing table changes from one
parser to another.
The parsing program reads characters from an input buffer one at a time. Where a shift-reduce
parser would shift a symbol, an LR parser shifts a state; each state summarizes the information
contained in the stack below it.

The stack holds a sequence of states s0, s1, ..., sm, where sm is on top.

Action: This function takes as arguments a state i and a terminal a (or $, the input end marker).
The value of ACTION[i, a] can have one of four forms:
i) Shift j, where j is a state.
ii) Reduce by a grammar production A -> β.
iii) Accept.
iv) Error.
Goto: This function takes a state and a grammar symbol as arguments and produces a state.
If GOTO[Ii, A] = Ij, then GOTO maps state i and nonterminal A to state j.
Q.6 Draw the block diagram of LR Parser.

LR parsing is divided into four parts: LR(0) parsing, SLR parsing, CLR parsing and LALR
parsing.


Q.7 What is a shift-reduce Parser?

A shift-reduce parser attempts to construct a parse tree in the same manner as bottom-up
parsing, i.e. the parse tree is constructed from the leaves (bottom) to the root (up). A more general
form of shift-reduce parser is the LR parser.

This parser requires some data structures i.e.

● An input buffer for storing the input string.

● A stack for storing and accessing the grammar symbols of the production rules.

Basic Operations –

● Shift: This involves moving symbols from the input buffer onto the stack.
● Reduce: If the handle appears on top of the stack, it is reduced by using the
appropriate production rule, i.e. the RHS of the production rule is popped off the
stack and the LHS of the production rule is pushed onto the stack.
● Accept: If only the start symbol is present in the stack and the input buffer is
empty, then the parsing action is called accept. When the accept action is obtained,
it means parsing has completed successfully.
● Error: This is the situation in which the parser can perform neither a shift action
nor a reduce action, nor even an accept action.


Q.8 Write a short note on Ambiguous Grammar.


Context Free Grammars(CFGs) are classified based on:

● Number of Derivation trees


● Number of strings

Depending on Number of Derivation trees, CFGs are subdivided into 2 types:

● Ambiguous grammars
● Unambiguous grammars

Ambiguous grammar:

A CFG is said to be ambiguous if there exists more than one derivation tree for a given input
string, i.e., more than one leftmost derivation tree (LMDT) or rightmost derivation tree
(RMDT).

Definition: A CFG G = (V, T, P, S) is said to be ambiguous if and only if there exists a string
in T* that has more than one parse tree,

where V is a finite set of variables.

T is a finite set of terminals.

P is a finite set of productions of the form A -> α, where A is a variable and α ∈ (V ∪ T)*, and S is a

designated variable called the start symbol.

For Example:

1. Let us consider this grammar: E -> E+E | id

We can create two parse trees from this grammar for the string id+id+id.

The two parse trees generated by leftmost derivation bracket the string as (id+id)+id and id+(id+id).


Both the above parse trees are derived from the same grammar rules but both parse trees are
different. Hence the grammar is ambiguous.

Assignment-03

Q.1 Write Short notes on:


(i) S-attribute Definitions (Synthesized) (ii) L-attribute Definitions (Inherited)
(iii) Dependency Graphs
Sol
1. S-attribute (Synthesized)
A Synthesized attribute is an attribute of the non-terminal on the left-hand side of a
production. Synthesized attributes represent information that is being passed up the
parse tree. The attribute can take value only from its children (Variables in the RHS
of the production).
For example, let's say A -> BC is a production of a grammar; if A's attribute is
dependent on B's attributes or C's attributes, then it will be a synthesized attribute.
2. L-attribute (Inherited)


An attribute of a nonterminal on the right-hand side of a production is called an
inherited attribute. The attribute can take its value either from its parent or from its
siblings (variables in the LHS or RHS of the production).
For example, let's say A -> BC is a production of a grammar; if B's attribute is
dependent on A's attributes or C's attributes, then it will be an inherited attribute.
3. Dependency Graphs
A program is a collection of statements, the ordering and scheduling of which
depends on dependence constraints. Dependencies are broadly classified into two
categories:

1. Data Dependencies: when statements compute data that are used by other
statements.

2. Control Dependencies: those which arise from the ordered flow of control
in a program.

A dependence graph can be constructed by drawing edges that connect dependent
operations. These arcs impose a partial ordering among operations that prohibits
a fully concurrent execution of a program. Use-definition chaining is a form
of dependency analysis, but it leads to overly conservative estimates of data
dependence.

Q.2 Parameter passing techniques. Explain various techniques.


SOL:
The communication medium among procedures is known as parameter passing. The
values of the variables from a calling procedure are transferred to the called procedure by some
mechanism.
Basic terminology :
● R-value: The value of an expression is called its r-value. The value contained in a single
variable also becomes an r-value if it appears on the right side of the assignment operator.
An r-value can always be assigned to some other variable.
● L-value: The location of memory (address) where the expression is stored is known
as the l-value of that expression. It always appears on the left side of the assignment
operator.
● Call by Value
In call by value the calling procedure passes the r-value of the actual parameters, and the
compiler puts it into the called procedure's activation record. Formal parameters hold the
values passed by the calling procedure; thus any changes made to the formal parameters
do not affect the actual parameters.
● Call by Reference
In call by reference the formal and actual parameters refer to the same
memory location. The l-value of the actual parameter is copied to the activation record of
the called function. Thus the called function has the address of the actual parameter. If
the actual parameter does not have an l-value (e.g. i+3), then it is evaluated in a new
temporary location and the address of that location is passed. Any changes made to the
formal parameter are reflected in the actual parameter (because changes are made at the
address).
● Call by Copy Restore
In call by copy-restore the compiler copies the values into the formal parameters when the
procedure is called and copies them back into the actual parameters when control returns
to the calling function. The r-values are passed, and on return the r-values of the formals
are copied into the l-values of the actuals.
● Call by Name
In call by name the actual parameters are substituted for the formals in all the places the
formals occur in the procedure. It is also referred to as lazy evaluation because parameters
are evaluated only when needed.

Q.3 What do you mean by three address code?


SOL:
Three-address code is a type of intermediate code which is easy to generate and can be
easily converted to machine code. It makes use of at most three addresses and one operator to
represent an expression, and the value computed at each instruction is stored in a temporary
variable generated by the compiler. The compiler decides the order of operations given by the
three-address code.
General representation –
a = b op c
where a, b and c represent operands such as names, constants or compiler-generated temporaries,
and op represents the operator.
Example: Convert the expression a * – (b + c) into three address code:
t1 = b + c
t2 = uminus t1
t3 = a * t2

Q.4 Translate the expression


-(a+b)*(c+d)+(a+b+c) into Quadruples, Triples, Indirect triples.
SOL:
Taking the unary minus into account, the three-address code is:
t1 = a + b
t2 = uminus t1
t3 = c + d
t4 = t2 * t3
t5 = t1 + c
t6 = t4 + t5

QUADRUPLE

    Op      Arg1  Arg2  Result
(0) +       a     b     t1
(1) uminus  t1          t2
(2) +       c     d     t3
(3) *       t2    t3    t4
(4) +       t1    c     t5
(5) +       t4    t5    t6

TRIPLE

    Op      Arg1  Arg2
(0) +       a     b
(1) uminus  (0)
(2) +       c     d
(3) *       (1)   (2)
(4) +       (0)   c
(5) +       (3)   (4)

INDIRECT TRIPLE

Statement pointers:
100 (0)
101 (1)
102 (2)
103 (3)
104 (4)
105 (5)

    Op      Arg1  Arg2
(0) +       a     b
(1) uminus  (0)
(2) +       c     d
(3) *       (1)   (2)
(4) +       (0)   c
(5) +       (3)   (4)

Q. 5 Construct the DAG for following basic blocks


(i) S1 = 4*i (ii) S2 = a[S1] (iii) S3 = 4*i
(iv) S4 = b[S3] (v) S5 = S2*S4 (vi) S6 = prod+S5
(vii) prod = S6 (viii) S7 = i+1 (ix) i = S7
(x) if i<=20 goto step (1)
SOL:


Q.6 Construct the DAG for the following basic block:


d := b * c
e := a + b
b := b * c
a := e – d
SOL:


1. Only a is live on exit from the block.

e=a+b
d=b*c
a=e-d

2. a, b, and c are live on exit from the block.

e=a+b
b=b*c
a=e-b

Q.7 What is the difference between dynamic and static storage management. Explain the
importance of run time storage management in compiler.
SOL:

Static Memory Allocation vs Dynamic Memory Allocation:

1. In static memory allocation, variables get allocated permanently. In dynamic memory
allocation, variables get allocated only when the program unit becomes active.
2. Static memory allocation is done before program execution. Dynamic memory allocation
is done during program execution.
3. Static allocation uses the stack for managing the allocation of memory. Dynamic
allocation uses the heap for managing the allocation of memory.
4. Static allocation is less efficient. Dynamic allocation is more efficient.
5. In static memory allocation there is no memory re-usability. In dynamic memory
allocation, memory can be re-used and freed when not required.

The information required during an execution of a procedure is kept in a block of storage
called an activation record. The activation record includes storage for names local to the
procedure.
We can describe addresses in the target code using the following ways:
1. Static allocation
2. Stack allocation
In static allocation, the position of an activation record is fixed in memory at compile time.
In stack allocation, for each execution of a procedure a new activation record is pushed
onto the stack. When the activation ends, the record is popped.

Q.8 What is register allocation and assignment?

SOL
Registers are the fastest locations in the memory hierarchy, but unfortunately this resource is
limited: registers are among the most constrained resources of the target processor. Register
allocation is an NP-complete problem; however, it can be reduced to graph coloring
to achieve allocation and assignment. A good register allocator therefore computes an
effective approximate solution to a hard problem.

The register allocator determines which values will reside in registers and which register
will hold each of those values. It takes as input a program that uses an arbitrary number of
registers and produces a program that uses the finite register set of the target machine.
Allocation –
● Maps an unlimited name space onto the register set of the target machine.
● Ensures that the code will fit the target machine's register set at each instruction.
Assignment –
● Maps the allocated name set onto the physical register set of the target machine.
● Assumes allocation has been done, so that the code will fit into the set of physical registers.
● No more than 'k' values are assigned to registers, where 'k' is the number of physical
registers.
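The graph-coloring idea can be sketched with a greedy colorer over an interference graph. This is a deliberately tiny illustration (the interference matrix and the spill-on-failure convention are invented for the example; real allocators use Chaitin/Briggs-style simplification, not this naive order):

```c
#include <assert.h>

#define N 4  /* virtual registers (values) in this toy example        */
#define K 2  /* physical registers available on the target machine    */

/* interference[i][j] = 1 when values i and j are live at the same
 * time, so they cannot share a physical register. */
static const int interference[N][N] = {
    {0, 1, 1, 0},
    {1, 0, 1, 0},
    {1, 1, 0, 1},
    {0, 0, 1, 0},
};

/* Pick the lowest color not used by an already-colored neighbour;
 * return -1 (spill to memory) when no color is free. */
static int color_node(int node, const int *colors) {
    for (int c = 0; c < K; c++) {
        int clash = 0;
        for (int j = 0; j < node; j++)
            if (interference[node][j] && colors[j] == c) clash = 1;
        if (!clash) return c;
    }
    return -1;
}

/* Color every node in order; count how many values must be spilled. */
static int allocate(int *colors) {
    int spills = 0;
    for (int i = 0; i < N; i++) {
        colors[i] = color_node(i, colors);
        if (colors[i] < 0) spills++;
    }
    return spills;
}
```

Values 0, 1, 2 form a triangle in the graph and need three colors, so with K = 2 one of them is spilled, which is exactly the allocation-versus-spilling trade-off described above.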


ASSIGNMENT - V

Q1. Describe the Compiler and its phases with a neat and clean diagram.
Ans. COMPILER: A compiler is a software program that transforms high-level
source code that is written by a developer in a high-level programming language
into a low level object code (binary code) in machine language, which can be
understood by the processor. The process of converting high-level programming
into machine language is known as compilation.
The processor executes object code, which indicates when binary high and low
signals are required in the arithmetic logic unit of the processor.

The phases of a compiler are described below.

Lexical Analysis
The first phase of the compiler, the scanner, works as a text scanner. This phase scans the
source code as a stream of characters and converts it into meaningful lexemes. The lexical
analyzer represents these lexemes in the form of tokens as:


<token-name, attribute-value>

Syntax Analysis
The next phase is called the syntax analysis or parsing. It takes the token
produced by lexical analysis as input and generates a parse tree (or syntax tree).
In this phase, token arrangements are checked against the source code grammar,
i.e. the parser checks if the expression made by the tokens is syntactically correct.

Semantic Analysis
Semantic analysis checks whether the parse tree constructed follows the rules of the
language: for example, that values are assigned only between compatible data types, or that
a string is not added to an integer. The semantic analyzer also keeps track of identifiers,
their types and expressions, and checks whether identifiers are declared before use. It
produces an annotated syntax tree as output.

Intermediate Code Generation


After semantic analysis the compiler generates an intermediate code of the
source code for the target machine. It represents a program for some abstract
machine. It is in between the high-level language and the machine language. This
intermediate code should be generated in such a way that it makes it easier to be
translated into the target machine code.

Code Optimization
The next phase does code optimization of the intermediate code. Optimization
can be assumed as something that removes unnecessary code lines, and arranges
the sequence of statements in order to speed up the program execution without
wasting resources (CPU, memory).

Code Generation
In this phase, the code generator takes the optimized representation of the
intermediate code and maps it to the target machine language. The code
generator translates the intermediate code into a sequence of (generally)
relocatable machine code. This sequence of machine instructions performs the
same task as the intermediate code.

Symbol Table
It is a data-structure maintained throughout all the phases of a compiler. All the
identifier's names along with their types are stored here. The symbol table makes
it easier for the compiler to quickly search the identifier record and retrieve it.
The symbol table is also used for scope management.
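A symbol table of the kind described above can be sketched as a simple lookup structure. This is a minimal illustration (a linked list with fixed-size fields and no scope nesting; production compilers use hash tables and scoped tables):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Minimal symbol-table entry: identifier name and its type. */
struct symbol {
    char name[32];
    char type[16];
    struct symbol *next;
};

static struct symbol *table = NULL;

/* Record a declaration (names assumed shorter than the buffers). */
static void insert(const char *name, const char *type) {
    struct symbol *s = malloc(sizeof *s);
    strcpy(s->name, name);
    strcpy(s->type, type);
    s->next = table;   /* prepend to the list */
    table = s;
}

/* Return the type of an identifier, or NULL if it was never declared
 * (which is how "declared before use" checks can be driven). */
static const char *lookup(const char *name) {
    for (struct symbol *s = table; s; s = s->next)
        if (strcmp(s->name, name) == 0) return s->type;
    return NULL;
}
```

Every phase that sees an identifier can call `lookup` to retrieve the recorded type, which is the "quickly search the identifier record" role described above.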


Q2. Explain the Cousins of Compiler.


COUSINS OF COMPILER
1. Preprocessor 2. Assembler 3. Loader and Link-editor
Preprocessor
A preprocessor is a program that processes its input data to produce output that
is used as input to another program. The output is said to be a preprocessed form
of the input data, which is often used by some subsequent programs like
compilers.
They may perform the following functions:
1. Macro processing
2. File inclusion
3. Rational preprocessors
4. Language extension

1. Macro processing:
A macro is a rule or pattern that specifies how a certain input sequence should
be mapped to an output sequence according to a defined procedure. The mapping
process that instantiates a macro into a specific output sequence is known as
macro expansion.
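Macro expansion in the C preprocessor illustrates this directly. `SQUARE` below is a hypothetical macro written for this example (not from any real header); the preprocessor textually expands `SQUARE(2 + 1)` into `((2 + 1) * (2 + 1))` before compilation proper begins:

```c
/* Fully parenthesized: expansion preserves the intended grouping. */
#define SQUARE(x) ((x) * (x))

/* Without the inner parentheses, SQUARE(2 + 1) would expand to
 * (2 + 1 * 2 + 1), which evaluates to 5 instead of 9 -- a classic
 * demonstration that macro expansion is purely textual. */
#define BAD_SQUARE(x) (x * x)
```

The difference between the two macros shows why macro definitions are conventionally wrapped in parentheses: expansion happens on text, not on values.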

2. File Inclusion:
Preprocessor includes header files into the program text. When the preprocessor
finds an #include directive it replaces it by the entire content of the specified file.
3. Rational Preprocessors:
These preprocessors augment older languages with more modern flow-of-control and
data-structuring facilities.

4. Language extension :


These processors attempt to add capabilities to the language by what amounts to


built-in macros. For example, the language Equel is a database query language
embedded in C.
Assembler
Assembler creates object code by translating assembly instruction mnemonics
into machine code. There are two types of assemblers:

● One-pass assemblers go through the source code once and assume that all
symbols will be defined before any instruction that references them.

● Two-pass assemblers create a table with all symbols and their values in the first
pass, and then use the table in a second pass to generate code.


Linker and Loader

A linker or link editor is a program that takes one or more objects generated by
a compiler and combines them into a single executable program. Three tasks of
the linker are
1. Searches the program to find library routines used by the program, e.g. printf()
and math routines.


2. Determines the memory locations that code from each module will occupy
and relocates its instructions by adjusting absolute references.
3. Resolves references among files.
A loader is the part of an operating system that is responsible for loading
programs in memory, one of the essential stages in the process of starting a
program.

Q5. Differentiate Top-down And Bottom-up parsing techniques.

1. Strategy: The top-down approach starts evaluating the parse tree from the top (the root)
and moves downwards to parse the other nodes, while the bottom-up approach starts
evaluating the parse tree from the lowest level (the leaves) and moves upwards.

2. Attempt: Top-down parsing attempts to find the leftmost derivation for a given string,
while bottom-up parsing attempts to reduce the input string to the start symbol of the
grammar.

3. Derivation type: Top-down parsing uses leftmost derivation, while bottom-up parsing
traces a rightmost derivation in reverse.

4. Objective: Top-down parsing searches for a production rule to be used to construct a
string, while bottom-up parsing searches for a production rule to be used to reduce a string
to the start symbol of the grammar.

Q. 6 Construct the DAG for following basic blocks


(i) S1=4*i (ii) S2=a[S1] (iii) S3=4*i

(iv) S4=b[S3] (v) S5=S2*S4 (vi) S6=prod+S5

(vii) prod=S6 (viii) S7=i+1 (ix) i=S7

(x) if i<=20 goto step (i)

SOL:


Q6. Translate the expression -(a+b)*(c+d)+(a+b+c) into Quadruples, Triples and
Indirect Triples.

Ans. Taking the unary minus to apply to (a+b), the three-address code is:

t1 = a + b

t2 = uminus t1

t3 = c + d

t4 = t2 * t3

t5 = t1 + c

t6 = t4 + t5

QUADRUPLE

    Op     Arg1  Arg2  Result

(0) +      a     b     t1

(1) uminus t1          t2

(2) +      c     d     t3

(3) *      t2    t3    t4

(4) +      t1    c     t5

(5) +      t4    t5    t6

TRIPLE

    Op     Arg1  Arg2

(0) +      a     b

(1) uminus (0)

(2) +      c     d

(3) *      (1)   (2)

(4) +      (0)   c

(5) +      (3)   (4)

INDIRECT TRIPLE

Statement pointers:

100 (0)

101 (1)

102 (2)

103 (3)

104 (4)

105 (5)

with the triple table:

    Op     Arg1  Arg2

(0) +      a     b

(1) uminus (0)

(2) +      c     d

(3) *      (1)   (2)

(4) +      (0)   c

(5) +      (3)   (4)
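Quadruples can be given a concrete data representation and even executed directly. The sketch below is illustrative only: the `struct quad` layout and the convention of using indices into a flat value array are choices made for this example, not how a production compiler stores operands.

```c
/* One quadruple: an operator, up to two operand slots, and a result
 * slot. Operands and results are indices into a flat value array here;
 * a real compiler would point into the symbol table instead. */
struct quad { char op; int arg1, arg2, result; };

/* Execute a quadruple sequence over the value array. */
static void run_quads(const struct quad *q, int n, int *val) {
    for (int i = 0; i < n; i++) {
        switch (q[i].op) {
        case '+': val[q[i].result] = val[q[i].arg1] + val[q[i].arg2]; break;
        case '*': val[q[i].result] = val[q[i].arg1] * val[q[i].arg2]; break;
        case 'u': val[q[i].result] = -val[q[i].arg1];                 break; /* uminus */
        }
    }
}
```

With slots 0-3 holding a, b, c, d and slots 4-9 holding the temporaries, encoding -(a+b)*(c+d)+(a+b+c) and running it with a=1, b=2, c=3, d=4 yields -(3)*(7)+(6) = -15.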

Q9. Explain the process of peephole optimization.

Ans. Peephole optimization is a type of code optimization performed on a small part
of the code, i.e. on a very small set of instructions in a segment of code. The small
set of instructions on which peephole optimization is performed is known as the
peephole or window. It works on the principle of replacement: a part of the code is
replaced by shorter and faster code without changing the output.
Peephole optimization is machine dependent.
Objectives of Peephole Optimization:

The objectives of peephole optimization are:


1. To improve performance
2. To reduce memory footprint
3. To reduce code size

Peephole Optimization Techniques:


1. Redundant load and store elimination:


In this technique the redundancy is eliminated.


Initial code:

y = x + 5;
i = y;
z = i;
w = z * 3;

Optimized code:
y = x + 5;
i = y;
w = y * 3;

2. Constant folding:
Expressions whose operands are all constants are evaluated by the compiler at compile time.
Initial code:
x = 2 * 3;

Optimized code:
x = 6;

3. Strength Reduction:
The operators that consume higher execution time are replaced by the operators
consuming less execution time.
Initial code:
y = x * 2;
Optimized code:
y = x + x; or y = x << 1;
Initial code:


y = x / 2;
Optimized code:
y = x >> 1;
4. Null sequences:
Useless operations are deleted.
5. Combine operations:
Several operations are replaced by a single equivalent operation.
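The constant-folding and strength-reduction rules above can be checked in code. The functions below are written for this illustration; the "slow" forms stand for what a naive code generator emits and the "fast" forms for the peephole-optimized replacements (the shift forms assume non-negative x, since `>>` and `/` round differently for negative values):

```c
/* Before optimization: the operator forms a naive generator emits. */
static int double_slow(int x) { return x * 2; }
static int half_slow(int x)   { return x / 2; }

/* After strength reduction: cheaper operators, same results for
 * non-negative x. */
static int double_fast(int x) { return x << 1; }
static int half_fast(int x)   { return x >> 1; }

/* Constant folding: 2 * 3 is computed at compile time, so the
 * generated code simply loads 6. */
enum { FOLDED = 2 * 3 };
```

Equality of the slow and fast forms on sample inputs is exactly the "replacement without change in output" property that makes a peephole rewrite legal.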

Q10. What is Static and Dynamic memory allocation?


1. Static Memory Allocation

Static memory allocation is performed when the compiler compiles the program and
generates object files, linker merges all these object files and creates a single executable
file, and loader loads this single executable file in main memory, for execution. In static
memory allocation, the size of the data required by the process must be known before
the execution of the process initiates.

If the data sizes are not known before the execution of the process, they have to be
guessed. If the guessed size is larger than required, memory is wasted; if it is smaller,
the process cannot execute properly.

The static memory allocation method needs no memory allocation operations during the
execution of the process, since all the allocation the process requires is done before its
execution starts. This leads to faster execution of the process.

Static memory allocation provides more efficiency when compared to dynamic
memory allocation.

2. Dynamic Memory Allocation

Dynamic memory allocation is performed while the program is in execution. Here, the
memory is allocated to the entities of the program when they are to be used for the first
time while the program is running.

The actual size of the data required is known at run time, so the exact memory space is
allocated to the program, thereby reducing memory wastage.


Dynamic memory allocation provides flexibility to the execution of the program, since the
program can decide at run time what amount of memory it requires. If the program is large,
dynamic memory allocation can be performed on just the parts of the program that are
currently in use. This reduces memory wastage and improves the performance of the system.
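The two allocation styles correspond directly to two C idioms. The sketch below is illustrative (`fixed_buffer` and `make_buffer` are names chosen for this example): the array's size is fixed when the program is compiled, while `malloc` picks the size while the program runs.

```c
#include <stdlib.h>

/* Static allocation: the size (100 ints) is fixed at compile time and
 * the storage exists for the whole run of the program. */
static int fixed_buffer[100];

/* Dynamic allocation: the size n is chosen at run time, and the memory
 * can later be freed and reused when no longer required. */
static int *make_buffer(size_t n) {
    return malloc(n * sizeof(int));
}
```

A caller of `make_buffer` must eventually `free` the returned pointer, which is the memory re-usability the comparison above refers to; `fixed_buffer` can never be released early.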

Q11. What do you mean by Activation Record?


Ans. An activation record is a block of storage that holds the information needed by a
single execution of a procedure. Depending upon the source language, it may store
temporary and intermediate values of expressions, the local data of the called procedure,
and the other units listed below.

● Control stack is a run time stack which is used to keep track of the live
procedure activations i.e. it is used to find out the procedures whose execution
have not been completed.
● When a procedure is called (its activation begins), its name is pushed onto the
stack, and when it returns (its activation ends), it is popped.
● Activation record is used to manage the information needed by a single
execution of a procedure.
● An activation record is pushed into the stack when a procedure is called and it
is popped when the control returns to the caller function.

The diagram below shows the contents of activation records:


Return Value: It is used by the called procedure to return a value to the calling
procedure.
Actual Parameters: They are used by the calling procedure to supply parameters to the
called procedure.
Control Link: It points to the activation record of the caller.
Access Link: It is used to refer to non-local data held in other activation records.
Saved Machine Status: It holds information about the status of the machine just before
the procedure is called.
Local Data: It holds the data that is local to the execution of the procedure.
Temporaries: It stores the values that arise in the evaluation of an expression.
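Recursion makes the one-record-per-activation idea visible. In the sketch below (written for this illustration; the explicit `depth` parameter and `max_depth` counter are bookkeeping added for the example, not part of any real calling convention), each recursive call gets its own activation record with its own copies of `n` and `depth`, and each return pops that record:

```c
static int max_depth = 0;  /* deepest the control stack grew */

/* factorial(n, 1) creates n nested activations before unwinding:
 * one record per live call, each holding its own n and depth. */
static int factorial(int n, int depth) {
    if (depth > max_depth) max_depth = depth;
    if (n <= 1) return 1;
    return n * factorial(n - 1, depth + 1);
}
```

Computing factorial(5, 1) pushes five records in turn (n = 5 down to n = 1), so the recorded maximum depth equals the argument, matching the push-on-call, pop-on-return behavior described above.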

