CD Paper Solution 2022-23
B. TECH.
(SEM V) THEORY EXAMINATION 2022-23
COMPILER DESIGN
SECTION A
A Lex program is separated into three sections by %% delimiters. The format of a Lex source
file is as follows:
{ definitions }
%%
{ rules }
%%
{ user subroutines }
Example: Construct the following for the input string 1 * 2 + 3, using any
grammar you know:
• Parse tree
• Syntax tree
• Annotated parse tree
Solution
Properties of IRs:
The priorities of different properties across all compilers are not uniform.
The five properties of IRs are:
1. Ease of generation
2. Ease of manipulation
3. Freedom of expression
4. Size of the procedure
5. Level of abstraction
LL parser | LR parser
First L of LL is for left to right and second L is for leftmost derivation. | L of LR is for left to right and R is for rightmost derivation.
It follows the leftmost derivation. | It follows the reverse of the rightmost derivation.
The parse tree is constructed in a top-down manner. | The parse tree is constructed in a bottom-up manner.
Ends when the stack becomes empty. | Starts with an empty stack.
Pre-order traversal of the parse tree. | Post-order traversal of the parse tree.
int a,b
// AFTER RECOVERY:
int a,b; //Semicolon is added by the compiler
Optimized code:
for(int i=0; i<5; i++)
{
a = i + 5;
b = i + 10;
}
(j). What is induction variable?
Ans: Loops are well-known targets for optimization since they execute repeatedly and
significant execution time is spent in loop bodies. An important class of loop optimizations
centers on special variables called induction variables (IVs). An induction variable is any
variable whose value can be represented as a function of loop invariants, the number of loop
iterations that have executed, and other induction variables.
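As a small illustration (a sketch written for this answer, not taken from the paper), in the loop below i is a basic induction variable and j = 4 * i + 3 is a derived induction variable. Strength reduction replaces the multiplication with one addition per iteration. The function names sum_naive and sum_reduced are hypothetical.

```cpp
#include <cassert>

// Sums j = 4*i + 3 for i in [0, n).
// i is a basic induction variable; j is a linear function of i.
long sum_naive(int n) {
    assert(n >= 0);
    long total = 0;
    for (int i = 0; i < n; ++i) {
        long j = 4L * i + 3;   // one multiplication on every iteration
        total += j;
    }
    return total;
}

// The same loop after strength reduction on j.
long sum_reduced(int n) {
    assert(n >= 0);
    long total = 0;
    long j = 3;                // value of 4*i + 3 when i == 0
    for (int i = 0; i < n; ++i) {
        total += j;
        j += 4;                // the addition replaces the multiplication
    }
    return total;
}
```

Both functions return the same value for every n >= 0; only the cost of updating j inside the loop differs.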
SECTION B
while a < b do
    if c < d then
        x = y * z
    else
        x = y + z
Ans: Boolean Expression
The translation of conditional statements such as if-else statements and while-do
statements is tied to the translation of Boolean expressions. Boolean expressions have
two main uses:
• Boolean expressions are used as conditional expressions in statements that alter the
flow of control.
• A Boolean expression can compute logical values, true or false.
A Boolean expression is composed of Boolean operators like &&, ||, !, etc. applied to
elements that are Boolean or relational expressions. Relational expressions have the form
E1 rel E2.
Let us consider the following grammar:
B => B1 || B2
B => B1 && B2
B => !B1
B => (B)
B => E1 rel E2
B => true
B => false
If we compute that B1 is true in the first expression, then the entire expression will be true.
We don’t need to compute B2. In the second expression, if B1 is false, then the entire
expression is false.
The comparison operators <, <=, =, !=, >, and >= are represented by rel.op.
We also assume that || and && are left-associative. || has the lowest precedence, followed by
&&, and then !.
PRODUCTION: B => B1 || B2
SEMANTIC RULES:
B1.true = B.true
B1.false = newlabel()
B2.true = B.true
B2.false = B.false
B.code = B1.code || label(B1.false) || B2.code
Numerical:
(b) Discuss the stack allocation and heap allocation strategies of the
runtime environment with an example.
Ans: Stack Allocation: The allocation happens on contiguous blocks of memory. It is called
stack memory allocation because the allocation happens in the function call stack. The
size of the memory to be allocated is known to the compiler, and whenever a function is called,
its variables get memory allocated on the stack. Whenever the function call is over, the
memory for the variables is de-allocated. This all happens using predefined routines
in the compiler, so a programmer does not have to worry about memory allocation and
de-allocation of stack variables. This kind of memory allocation is also known as temporary
memory allocation because as soon as the method finishes its execution, all the data
belonging to that method is flushed from the stack automatically. This means any value
stored in stack memory is accessible only as long as the method has not completed its
execution and is currently running.
Key Points:
• It’s a temporary memory allocation scheme where the data members are
accessible only if the method( ) that contained them is currently running.
• It allocates or de-allocates the memory automatically as soon as the
corresponding method completes its execution.
• If the stack memory is filled completely, the JVM throws
java.lang.StackOverflowError.
• Stack memory allocation is considered safer as compared to heap memory
allocation because the data stored can only be accessed by the owner thread.
• Memory allocation and de-allocation are faster as compared to Heap-memory
allocation.
• Stack memory has less storage space as compared to Heap-memory.
• C++
int main()
{
    // All these variables get memory
    // allocated on stack
    int a;
    int b[10];
    int n = 20;
    // Variable-length array: a compiler extension
    // (e.g. in GCC), not standard C++
    int c[n];
}
Heap Allocation: The memory is allocated during the execution of instructions written by
programmers. Note that the name heap has nothing to do with the heap data structure. It is
called a heap because it is a pile of memory space available to programmers to allocate and
de-allocate. Every object we create is placed in heap space, and the reference to that
object is stored in stack memory. Heap memory allocation is not as safe as stack memory
allocation because the data stored in this space is accessible or visible to all threads.
If a programmer does not handle this memory well, a memory leak can occur in the program.
In the JVM, heap memory is further divided into three categories. These categories help
prioritize which data (objects) stay in heap memory and which are reclaimed by
garbage collection.
• Young Generation – The portion of the memory where all new data (objects) are
allocated. Whenever this memory is completely filled, a garbage collection is
triggered to reclaim space.
• Old or Tenured Generation – The part of heap memory that holds the older
data objects that are not in frequent use or not in use at all.
• Permanent Generation – The portion of heap memory that contains the
JVM’s metadata for the runtime classes and application methods.
Key Points:
• If the heap space is entirely full, the JVM throws
java.lang.OutOfMemoryError.
• Unlike stack allocation, this scheme does not de-allocate memory automatically
when a method returns. A garbage collector is needed to remove old unused
objects so that memory is used efficiently.
• Access to this memory is slower than access to stack memory.
• Heap memory is also not as thread-safe as stack memory because data stored
in heap memory is visible to all threads.
• The heap is much larger than the stack.
• Heap memory exists as long as the whole application (or Java program) runs.
• CPP
int main()
{
    // This memory for 10 integers
    // is allocated on heap.
    int *ptr = new int[10];
    // Heap memory is not released automatically;
    // free it explicitly to avoid a leak.
    delete[] ptr;
}
Intermixed example of both kinds of memory allocation, heap and stack, in Java:
• Java
class Emp {
int id;
String emp_name;
The following conclusions can be drawn from the above example:
• As we start execution of the program, all the run-time classes are stored in
the heap memory space.
• The main() method in the next line is stored on the stack along with all its
primitive (or local) variables. The reference variable Emp of type Emp_detail
is also stored on the stack and points to the corresponding object stored in
heap memory.
• The next line calls the parameterized constructor Emp(int, String) from main(),
and its frame is allocated on top of the same stack memory block. This stores:
    • The object reference of the invoked object in the stack memory.
    • The primitive value (primitive data type) int id in the stack memory.
    • The reference variable of the String emp_name argument, which points
    to the actual string from the string pool in heap memory.
• The main method then calls the Emp_detail() static method, for which
allocation is made in a stack memory block on top of the previous
memory block.
• The newly created object Emp of type Emp_detail and all its instance
variables are stored in heap memory.
Pictorial representation as shown in Figure.1 below:
Fig.1
Conceptually, with both syntax-directed definition and translation schemes, we parse the
input token stream, build the parse tree, and then traverse the tree as needed to evaluate
the semantic rules at the parse tree nodes. Evaluation of the semantic rules may generate
code, save information in a symbol table, issue error messages, or perform any other
activities. The translation of the token stream is the result obtained by evaluating the
semantic rules.
Definition
Syntax Directed Translation augments the grammar with rules that facilitate semantic
analysis. SDT involves passing information bottom-up and/or top-down through the parse tree
in the form of attributes attached to the nodes. Syntax-directed translation rules use 1) lexical
values of nodes, 2) constants and 3) attributes associated with the non-terminals in their
definitions.
The general approach to Syntax-Directed Translation is to construct a parse tree or syntax
tree and compute the values of attributes at the nodes of the tree by visiting them in some
order. In many cases, translation can be done during parsing without building an explicit
tree.
Example
E -> E+T | T
T -> T*F | F
F -> INTLIT
This is a grammar to syntactically validate an expression having additions and
multiplications in it. Now, to carry out semantic analysis we will augment SDT rules to
this grammar, in order to pass some information up the parse tree and check for semantic
errors, if any. In this example, we will focus on the evaluation of the given expression, as
we don’t have any semantic assertions to check in this very basic example.
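A minimal sketch of that evaluation, assuming each non-terminal carries one synthesized attribute val. The left-recursive productions are handled with loops, which is an implementation choice made here so that a hand-written recursive-descent parser can be used. The names Evaluator and evaluate are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Evaluator for E -> E+T | T, T -> T*F | F, F -> INTLIT.
// Each parse function returns the synthesized attribute val of its non-terminal.
struct Evaluator {
    std::string src;
    std::size_t pos;

    explicit Evaluator(std::string text) : src(std::move(text)), pos(0) {}

    bool at_digit() const {
        return pos < src.size() && std::isdigit(static_cast<unsigned char>(src[pos]));
    }

    void skip_spaces() {
        while (pos < src.size() && std::isspace(static_cast<unsigned char>(src[pos]))) ++pos;
    }

    // F -> INTLIT          { F.val = INTLIT.lexval }
    int parse_F() {
        skip_spaces();
        assert(at_digit());  // this sketch does no error recovery
        int val = 0;
        while (at_digit()) {
            val = val * 10 + (src[pos] - '0');
            ++pos;
        }
        skip_spaces();
        return val;
    }

    // T -> T1 * F          { T.val = T1.val * F.val }
    int parse_T() {
        int val = parse_F();
        while (pos < src.size() && src[pos] == '*') {
            ++pos;
            val *= parse_F();
        }
        return val;
    }

    // E -> E1 + T          { E.val = E1.val + T.val }
    int parse_E() {
        int val = parse_T();
        while (pos < src.size() && src[pos] == '+') {
            ++pos;
            val += parse_T();
        }
        return val;
    }
};

// Returns E.val at the root for the given input text.
int evaluate(const std::string& text) {
    Evaluator ev(text);
    return ev.parse_E();
}
```

For example, evaluate("1 * 2 + 3") computes T.val = 2 for the product first and then adds 3, since * sits lower in the grammar than +.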
(d) Construct the NFA and DFA for the following regular expression.
(0+1)*(00+11)(0+1)*
Ans:
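The NFA and DFA diagrams are not reproduced here. As a hedged sketch: the regular expression describes the binary strings that contain 00 or 11 as a substring, and a four-state DFA is enough to recognize them. The state numbering and the function name accepts are assumptions made for this illustration.

```cpp
#include <cassert>
#include <string>

// DFA for (0+1)*(00+11)(0+1)*.
// State 0: start, nothing read yet.
// State 1: last symbol was 0, no 00 or 11 seen so far.
// State 2: last symbol was 1, no 00 or 11 seen so far.
// State 3: 00 or 11 has been seen (accepting, absorbing).
bool accepts(const std::string& input) {
    static const int delta[4][2] = {
        {1, 2},   // from state 0 on 0 / on 1
        {3, 2},   // from state 1: a second 0 completes 00
        {1, 3},   // from state 2: a second 1 completes 11
        {3, 3},   // state 3 loops on both symbols
    };
    int state = 0;
    for (char ch : input) {
        assert(ch == '0' || ch == '1');  // alphabet is {0, 1}
        state = delta[state][ch - '0'];
    }
    return state == 3;
}
```

Strings such as 0110 and 1011 are accepted, while alternating strings such as 0101 never leave states 1 and 2 and are rejected.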
(e) Explain the lexical analysis and syntax analysis phases of the
compiler with a suitable example. Explain error reporting in
these two phases as well.
Ans:
1. Lexical Analyzer –
It is also called a scanner. It takes as input the output of the preprocessor
(which performs file inclusion and macro expansion), which is in a pure
high-level language. It reads the characters from the source program and
groups them into lexemes (sequences of characters that “go together”). Each
lexeme corresponds to a token. Tokens are defined by regular expressions
which are understood by the lexical analyzer. It also reports lexical errors
(e.g., erroneous characters) and removes comments and white space.
2. Syntax Analyzer – It is sometimes called a parser. It constructs the parse tree.
It takes all the tokens one by one and uses Context-Free Grammar to construct
the parse tree.
Why Grammar?
The rules of programming can be entirely represented in a few productions.
Using these productions we can represent what the program actually is. The
input has to be checked whether it is in the desired format or not.
The parse tree is also called the derivation tree. Parse trees are generally
constructed to check for ambiguity in the given grammar. There are certain
rules associated with the derivation tree.
• Any identifier is an expression
• Any number can be called an expression
• Performing any operations in the given expression will always
result in an expression. For example, the sum of two expressions is
also an expression.
• The parse tree can be compressed to form a syntax tree.
Types of lexical error that can occur in a lexical analyzer are as follows:
1. Exceeding length of identifier or numeric constants.
Example:
#include <iostream>
using namespace std;
int main() {
This is a lexical error since signed integer lies between −2,147,483,648 and 2,147,483,647
2. Appearance of illegal characters
Example:
#include <iostream>
using namespace std;
int main() {
printf("Geeksforgeeks");$
return 0;
}
This is a lexical error since an illegal character $ appears at the end of the statement.
3. Unmatched string
Example:
#include <iostream>
using namespace std;
int main() {
/* comment
cout<<"GFG!";
return 0;
}
This is a lexical error since the ending of comment “*/” is not present but the beginning is
present.
4. Spelling Error
#include <iostream>
using namespace std;
int mian()
{
    /* 'main' is misspelled here ('i' and 'a' are transposed);
       spelling mistakes of this kind are classified here
       as lexical errors */
    cout << "GFG!";
    return 0;
}
Syntax Error
This type of error appears during the syntax analysis phase. Syntax errors are detected at
compile time, before the program runs. Common causes are:
o Error in structure
o Missing operators
o Unbalanced parentheses
A syntax error can also occur when an invalid calculation is entered into a calculator. This
can be caused by entering several decimal points in one number or by opening brackets
without closing them.
if (number=200)
    cout << "number is equal to 200";
else
    cout << "number is not equal to 200"
In this code, the if expression uses the equal sign, which is the assignment operator, not
the relational operator that tests for equality.
Because of the assignment, number is set to 200 and the expression number=200 is
always true, since the expression's value is 200. For this example the correct code
would be:
if (number==200)
Compiler message:
SECTION C
3. Attempt any one part of the following: 10 x 1 = 10
(a) Construct the CLR parse table for the following Grammar:
A → BB
B → cB
B → d
Ans:
(b) Construct the SLR parsing table for the following Grammar.
S→0S0
S→1S1
S→ 10
Ans:
4. Attempt any one part of the following: 10 x 1 = 10
(a) What is backpatching? Generate three-address code for the
following Boolean expression using backpatching:
a < b or c > d and e < f
Backpatching can be used to generate code for Boolean expressions and flow-of-control
statements in a single pass. The synthesized attributes truelist and falselist of the
non-terminal B are used to manage labels in the jumping code for Boolean expressions.
B.truelist is a list of jump or conditional jump instructions into which the label to
which control goes if B is true must be inserted. B.falselist is likewise the list of
instructions that eventually get the label to which control goes when B is false. When
code is generated for B, the jumps to the true and false exits are left incomplete, with
the label field unfilled. These incomplete jumps are placed on the lists B.truelist and
B.falselist, respectively.
Similarly, a statement S has a synthesized attribute S.nextlist, a list of jumps to the
instruction immediately following the code for S. Instructions are generated into
an instruction array, with labels serving as indexes into it. Three functions manipulate
the lists of jumps:
• makelist(i): Creates a new list containing only i, an index into the array of
instructions, and returns a pointer to the newly created list.
• merge(p1, p2): Concatenates the lists pointed to by p1 and p2 and returns a
pointer to the concatenated list.
• backpatch(p, i): Inserts i as the target label for each of the instructions on
the list pointed to by p.
With a suitable translation scheme, code for Boolean expressions can be generated during
bottom-up parsing. In the grammar, a marker non-terminal M triggers a semantic action that
picks up the index of the next instruction to be generated at the proper time.
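The three list functions and the marker M can be sketched as follows for the expression a < b or c > d and e < f. This is an illustration under stated assumptions, not the paper's worked table: instruction numbering is assumed to start at 100, and the names Gen, BoolAttr, relop, reduce_and, reduce_or and translate_example are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

using List = std::vector<int>;            // instruction indexes with unfilled targets

struct Gen {
    int base = 100;                       // assumed index of the first instruction
    std::vector<std::string> code;        // instruction text without its jump target
    std::vector<int> target;              // jump target, -1 while still unfilled

    int nextinstr() const { return base + static_cast<int>(code.size()); }

    int emit(const std::string& text) {
        code.push_back(text);
        target.push_back(-1);
        return nextinstr() - 1;
    }

    static List makelist(int i) { return List{i}; }

    static List merge(const List& p1, const List& p2) {
        List result = p1;
        result.insert(result.end(), p2.begin(), p2.end());
        return result;
    }

    void backpatch(const List& p, int i) {
        for (int instr : p) target[instr - base] = i;
    }

    // Text of one instruction; "_" marks a target that is not yet known.
    std::string line(int index) const {
        int t = target[index - base];
        return std::to_string(index) + ": " + code[index - base] + " " +
               (t < 0 ? std::string("_") : std::to_string(t));
    }
};

struct BoolAttr { List truelist, falselist; };

// B -> E1 rel E2: emit a conditional jump and an unconditional jump, both unfilled.
BoolAttr relop(Gen& g, const std::string& cond) {
    BoolAttr b;
    b.truelist = Gen::makelist(g.emit("if " + cond + " goto"));
    b.falselist = Gen::makelist(g.emit("goto"));
    return b;
}

// B -> B1 and M B2: when B1 is true, control continues at M.instr.
BoolAttr reduce_and(Gen& g, const BoolAttr& b1, int m, const BoolAttr& b2) {
    g.backpatch(b1.truelist, m);
    return {b2.truelist, Gen::merge(b1.falselist, b2.falselist)};
}

// B -> B1 or M B2: when B1 is false, control continues at M.instr.
BoolAttr reduce_or(Gen& g, const BoolAttr& b1, int m, const BoolAttr& b2) {
    g.backpatch(b1.falselist, m);
    return {Gen::merge(b1.truelist, b2.truelist), b2.falselist};
}

// a < b or c > d and e < f, with "and" binding tighter than "or".
BoolAttr translate_example(Gen& g) {
    BoolAttr b1 = relop(g, "a < b");
    int m1 = g.nextinstr();               // marker M before the right operand of "or"
    BoolAttr b2 = relop(g, "c > d");
    int m2 = g.nextinstr();               // marker M before the right operand of "and"
    BoolAttr b3 = relop(g, "e < f");
    BoolAttr conj = reduce_and(g, b2, m2, b3);
    return reduce_or(g, b1, m1, conj);
}
```

After translate_example runs on a fresh Gen, instruction 101 jumps to 102 and instruction 102 jumps to 104. The jumps at 100 and 104 remain on the truelist and those at 103 and 105 on the falselist, waiting to be backpatched once the enclosing statement is known.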
For Example, Backpatching using boolean expressions production rules table:
Step 1: Generation of the production table
Step 3: Now we will make the parse tree for the expression:
(b) What is top-down parsing? What are the problems in top-down
parsing? Explain each with a suitable example.
Ans: Top-down parsing
o Top-down parsing is also known as recursive parsing or predictive parsing.
o Top-down parsing is used to construct a parse tree for an input string.
o In top-down parsing, the parsing starts from the start symbol and transforms it into
the input string.
Access Link: It is used to refer to non-local data held in other activation records.
Saved Machine Status: It holds the information about status of machine before the
procedure is called.
Local Data: It holds the data that is local to the execution of the procedure.
In the source program, every name possesses a region of validity, called the scope of that
name.
o These scope rules need a more complicated organization of the symbol table than a list
of associations between names and attributes.
o Tables are organized into a stack, and each table contains a list of names and their
associated attributes.
o Whenever a new block is entered, a new table is pushed onto the stack. The new
table holds the names that are declared as local to this block.
o When a declaration is compiled, the table is searched for the name.
o If the name is not found in the table, the new name is inserted.
o When a reference to a name is translated, each table is searched, starting from the
table at the top of the stack.
For example:
int x;
void f(int m) {
    float x, y;
    {
        int i, j;
        int u, v;
    }
}
int g(int n)
{
    bool t;
}
Fig: Symbol table organization that complies with static scope information rules
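The stack-of-tables organization described above can be sketched as follows. This is a minimal illustration that stores only a type string per name; the class name ScopedSymbolTable and its member names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

class ScopedSymbolTable {
public:
    // Entering a block pushes a new, empty table onto the stack.
    void enter_block() { tables_.emplace_back(); }

    // Leaving a block pops its table, so its local names disappear.
    void exit_block() {
        assert(!tables_.empty());
        tables_.pop_back();
    }

    // Declaration: only the innermost table is searched.
    // Returns false if the name is already declared in this block.
    bool declare(const std::string& name, const std::string& type) {
        assert(!tables_.empty());
        return tables_.back().emplace(name, type).second;
    }

    // Reference: tables are searched from the top of the stack outwards.
    // Returns the type, or an empty string if the name is undeclared.
    std::string lookup(const std::string& name) const {
        for (auto it = tables_.rbegin(); it != tables_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return found->second;
        }
        return "";
    }

private:
    std::vector<std::unordered_map<std::string, std::string>> tables_;
};
```

With the example program, a lookup of x inside f finds the float declaration in the table of f before the search reaches the global int x, and i stops being visible once the inner block is exited.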
y=x
z=3+y
z=3+x
A DAG for a basic block is a directed acyclic graph with the following labels on nodes:
1. The leaves of the graph are labeled by unique identifiers, which can be variable
names or constants.
2. Interior nodes of the graph are labeled by an operator symbol.
3. Nodes are also given a sequence of identifiers as labels, to store the computed value.
Step 2:
For case(i), create node(OP) whose right child is node(z) and left child is node(y).
For case(ii), check whether there is node(OP) with one child node(y).
Output:
For node(x) delete x from the list of identifiers. Append x to attached identifiers list for the
node n found in step 2. Finally set node(x) to n.
Example:
1. S1 := 4 * i
2. S2 := a[S1]
3. S3 := 4 * i
4. S4 := b[S3]
5. S5 := S2 * S4
6. S6 := prod + S5
7. prod := S6
8. S7 := i + 1
9. i := S7
10. if i <= 20 goto (1)
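A hedged sketch of the construction for statements 1 to 9 of this block, covering binary statements x := y op z and copies x := y only (unary operators and the final conditional jump are left out). Array indexing is modeled as a binary operator written "[]", which is an assumption of this sketch, as are the names Dag, node_of, attach, assign and copy.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <tuple>
#include <vector>

struct Dag {
    struct Node {
        std::string label;              // operator symbol, or leaf name/constant
        int left;                       // child indexes, -1 for a leaf
        int right;
        std::vector<std::string> ids;   // identifiers attached to this node
    };

    std::vector<Node> nodes;
    std::map<std::tuple<std::string, int, int>, int> interior;  // (op, left, right) -> node
    std::map<std::string, int> current;                         // identifier -> node holding its value

    // node(y): the node currently holding y, or a new leaf if y is undefined.
    int node_of(const std::string& name) {
        auto it = current.find(name);
        if (it != current.end()) return it->second;
        nodes.push_back(Node{name, -1, -1, {}});
        int n = static_cast<int>(nodes.size()) - 1;
        current[name] = n;
        return n;
    }

    // Delete x from its old identifier list, append it to node n, set node(x) to n.
    void attach(const std::string& x, int n) {
        auto old = current.find(x);
        if (old != current.end()) {
            std::vector<std::string>& ids = nodes[old->second].ids;
            ids.erase(std::remove(ids.begin(), ids.end(), x), ids.end());
        }
        nodes[n].ids.push_back(x);
        current[x] = n;
    }

    // x := y op z: reuse an existing node(op) with the same children if there is one.
    int assign(const std::string& x, const std::string& op,
               const std::string& y, const std::string& z) {
        int l = node_of(y);
        int r = node_of(z);
        auto key = std::make_tuple(op, l, r);
        auto it = interior.find(key);
        int n;
        if (it != interior.end()) {
            n = it->second;
        } else {
            nodes.push_back(Node{op, l, r, {}});
            n = static_cast<int>(nodes.size()) - 1;
            interior[key] = n;
        }
        attach(x, n);
        return n;
    }

    // x := y: x simply becomes another identifier on node(y).
    int copy(const std::string& x, const std::string& y) {
        int n = node_of(y);
        attach(x, n);
        return n;
    }
};
```

Feeding statements 1 to 4 into assign creates a single node for 4 * i that carries both S1 and S3, which exposes the common subexpression. After statement 9 rebinds i, a later 4 * i would get a fresh node because its right child is no longer the original leaf.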
Stages in DAG Construction:
(b) Write quadruples, triples and indirect triples for the following expression:
a = b * -c + b * -c
Ans:
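The answer tables are not reproduced here. As a sketch of the representations, the code below lists one possible set of quadruples for the expression and derives the triples from them by replacing each temporary with the position of the triple that computes it. The temporary names t1 to t5, the operator name uminus, and the names Quad, Triple, example_quads and to_triples are assumptions of this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

struct Quad   { std::string op, arg1, arg2, result; };
struct Triple { std::string op, arg1, arg2; };

// Quadruples for a = b * -c + b * -c (op, arg1, arg2, result).
std::vector<Quad> example_quads() {
    return {
        {"uminus", "c",  "",   "t1"},
        {"*",      "b",  "t1", "t2"},
        {"uminus", "c",  "",   "t3"},
        {"*",      "b",  "t3", "t4"},
        {"+",      "t2", "t4", "t5"},
        {"=",      "t5", "",   "a"},
    };
}

// Triples have no result field: a temporary is replaced by "(k)",
// the position of the triple that computes it.
std::vector<Triple> to_triples(const std::vector<Quad>& quads) {
    std::map<std::string, std::string> where;   // temporary name -> "(k)"
    std::vector<Triple> triples;
    auto ref = [&where](const std::string& arg) -> std::string {
        auto it = where.find(arg);
        return it == where.end() ? arg : it->second;
    };
    for (const Quad& q : quads) {
        if (q.op == "=") {
            // Assignment: the target name goes in arg1, the value in arg2.
            triples.push_back(Triple{"=", q.result, ref(q.arg1)});
        } else {
            triples.push_back(Triple{q.op, ref(q.arg1), ref(q.arg2)});
            where[q.result] = "(" + std::to_string(triples.size() - 1) + ")";
        }
    }
    return triples;
}
```

Under these assumptions the triples are (0) uminus c, (1) * b (0), (2) uminus c, (3) * b (2), (4) + (1) (3) and (5) = a (4). Indirect triples keep the same triple table and add a separate list of pointers to the triples in execution order, so statements can be reordered by changing that list only.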