
Part – A

1)C
2)B
3)B
4)C
5)D
Part – B

6) State the compiler construction tools. Explain them.

A compiler is a computer program that translates source code written in one
computer language (the source language) into another computer language (the
target language, often a binary form referred to as object code). The most
common reason to translate source code is to create an executable program.
• Parser Generator
• A parser generator produces syntax analyzers (parsers) from a context-free
grammar that describes the syntax of a programming language. It is helpful
because the syntax analysis phase is quite complex and consumes considerable
manual effort and compilation time.
• Example: EQM, PIC

• Scanner Generator
• A scanner generator produces lexical analyzers from regular-expression
descriptions of the tokens of a language. It builds a finite automaton that
recognizes those regular expressions.
• Example: LEX is a scanner generator provided by UNIX systems.
• Syntax Directed Translation Engines
• Syntax Directed Translation Engines take a parse tree as input and
generate intermediate code in three-address format. These engines
contain routines to traverse the parse tree and generate intermediate code.
Each parse tree node has one or more translations associated with it.
• Automatic Code Generators
• Automatic Code Generators take intermediate code as input and convert it
into machine language. Each intermediate language operation is translated
using a set of rules supplied to the code generator as input. A
template matching process is used: by using the templates, an
intermediate language statement is replaced by its machine language
equivalent.
• Data-Flow Analysis Engines
• Data-Flow Analysis Engines are used for code optimization. Data-flow
analysis is an essential part of code optimization: it collects information
about how values flow from one part of a program to another.

7) What is a syntax tree? Draw the syntax tree for the expression
a=(b/c)*(a+b)c*d*
8) Design a Deterministic Finite Automaton (DFA) to accept strings that
begin with a and end with b over Σ={a,b}. Write the formal definition of
the DFA.
Part – C

9) Explain the phases of compiler and draw the translation of statement


(b+c)(b+c)2
b) Draw the language processing system with a neat diagram.

10) Interpret a DFA for the given RE = x(x+y)x*y using the Direct Method and
discuss the input buffering techniques in detail.
Without buffering, the lexical analyzer would have to access secondary memory
each time it identifies a token, which is time-consuming and costly. So the
input is first read into a buffer, and the lexical analyzer scans the buffer.
The lexical analyzer scans the input from left to right, one character at a
time, to identify tokens. It uses two pointers to scan tokens −
• Begin Pointer (bptr) − It points to the beginning of the string to be read.
• Look Ahead Pointer (lptr) − It moves ahead to search for the end of the
token.
Example − For statement int a, b;
• Both pointers start at the beginning of the string, which is stored in the
buffer.
• The Look Ahead Pointer scans the buffer until the end of the token is found.
• The character (a blank space) beyond the token ("int") has to be
examined before the token ("int") can be determined.
• After the token ("int") is processed, both pointers are set to the start of
the next token ('a'), and this process is repeated for the whole program.
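The two-pointer scan described above can be sketched in Python. This is only an illustration under simple assumptions: the function name `scan_tokens` is hypothetical, identifiers and keywords are runs of letters, digits and underscores, and every other non-blank character is treated as a one-character token.

```python
def scan_tokens(buffer):
    """Split `buffer` into lexemes using a begin pointer and a look-ahead pointer."""
    tokens = []
    n = len(buffer)
    bptr = 0  # begin pointer: start of the lexeme being read
    while bptr < n:
        if buffer[bptr].isspace():
            bptr += 1  # blanks separate tokens and are skipped
            continue
        lptr = bptr  # look-ahead pointer starts at the begin pointer
        if buffer[lptr].isalnum() or buffer[lptr] == "_":
            # Move ahead until the character beyond the lexeme is seen.
            while lptr < n and (buffer[lptr].isalnum() or buffer[lptr] == "_"):
                lptr += 1
        else:
            lptr += 1  # punctuation: a one-character token
        tokens.append(buffer[bptr:lptr])
        bptr = lptr  # both pointers now sit at the start of the next token
    return tokens


print(scan_tokens("int a, b;"))
```

For the statement `int a, b;` the sketch yields the lexemes `int`, `a`, `,`, `b` and `;`. Note that `int` is only recognized once the blank after it has been examined.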

The buffer can be divided into two halves. When the Look Ahead Pointer reaches
the end of the first half, the second half is filled with new characters to be
read. When the Look Ahead Pointer reaches the right end of the second half,
the first half is refilled with new characters, and so on.
Advantages
• Most of the time only one test is performed per character: whether the
forward pointer is pointing to an eof.
• Further tests are performed only when the pointer reaches the end of a
buffer half or the end of the input.
• The average number of tests per input character is extremely close to 1,
since N input characters are encountered between eofs.
SRM Institute of Science and Technology
College of Engineering and Technology
SCHOOL OF COMPUTING
SRM Nagar, Kattankulathur – 603203, Chengalpattu District, Tamilnadu
Mode of Exam: OFFLINE

Academic Year: 2022-23 (EVEN) SET-A


Test: CLAT-1 Date: 17.2.2023
Course Code & Title: 18CSC304J -COMPILER DESIGN Duration: 1 HOUR
Year & Sem: III & VI Max. Marks: 25

Course Articulation Matrix:


S.No. Course Outcome PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12

1 CO1 3 3 3

Part – A ( 5 x 1 = 5 Marks)
Instructions: Answer ALL
Q. No Question Marks BL CO PO PI Code
1 The regular expression (0|1)*(0|1) represents a language with 1 2 1 1 1.4.1
a) Nonempty binary strings
b) Empty and nonempty binary strings
c) Odd nonempty strings
d) Even nonempty strings

Answer: a
2 The total number of states to build the given language using DFA: 1 3 1 2 2.1.3
L={w|w has exactly 2 a’s and at least 2 b’s}
a) 10 b) 11 c)12 d)13
Answer: a
3 Which of the following is not a regular expression? 1 2 1 2 2.1.2
a) [(a+b)*-(aa+bb)]*
b) [(0+1)-(0b+a1)*(a+b)]*
c) (01+11+10)*
d) (1+2+0)*(1+2)*

Answer: b
4 Regular expression Φ* is equivalent to 1 1 1 1 1.2.1
a) ϵ b) Φ c) 0 d) 1
Answer :a
5 ________________takes collection of rules that define the translation of 1 1 1 1 1.3.1
each operation of the intermediate language into the machine language
for the target machine.
a. Parser generators
b. Scanner generators
c. Syntax-directed translation engines
d. Automatic code generators
Answer : D

Part – B ( 2 x 4 = 8 Marks)
Instructions: Answer any TWO
6 Can the two tests per character in the input buffering technique be 4 1 1 1 1.3.1
reduced to one? Justify your answer with an algorithm.
The two tests can be reduced to one if each buffer half holds a
sentinel character at the end.
The sentinel is a special character, eof, that cannot be part of the
source program.

forward := forward + 1;
if forward↑ = eof then begin
    if forward at end of first half then begin
        reload second half;
        forward := forward + 1
    end
    else if forward at end of second half then begin
        reload first half;
        move forward to beginning of first half
    end
    else /* eof within a buffer half marks the end of the input */
        terminate lexical analysis
end
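The single-test idea can be sketched in Python. This is a minimal illustration rather than part of the answer key: the half size `N`, the use of "\0" as the eof sentinel, and the generator name `read_chars` are assumptions made for the example. Unlike the pseudocode, it advances `forward` after reading a character.

```python
EOF = "\0"  # sentinel; assumed never to occur in the source text
N = 4       # size of each buffer half (kept tiny for illustration)


def read_chars(source, n=N):
    """Yield the characters of `source` through two sentinel-terminated halves."""
    pos = 0  # number of source characters loaded so far

    def load():
        nonlocal pos
        chunk = source[pos:pos + n]
        pos += len(chunk)
        # A short chunk places the sentinel inside the half: end of input.
        return list(chunk) + [EOF]

    halves = [load(), None]
    half, forward = 0, 0
    while True:
        ch = halves[half][forward]
        if ch != EOF:        # the only test made for an ordinary character
            yield ch
            forward += 1
        elif forward == n:   # sentinel at the end of a half: reload the other half
            half = 1 - half
            halves[half] = load()
            forward = 0
        else:                # eof inside a half: terminate lexical analysis
            return


print("".join(read_chars("int a, b;")))
```

The extra checks (which half ended, or whether the input is finished) run only when the sentinel is seen, so ordinary characters cost one comparison each.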

7 Construct a syntax tree with firstpos and lastpos for all nodes of 4 2 1 2 2.3.1
(a|b)*abb.

8 Construct the minimal DFA for the below diagram. 4 3 1 2 2.3.1


Answer

First Construct Transition table for the given diagram

Part – C ( 1 x 12 = 12 Marks)
Instructions: Answer any ONE
9 (i). Consider the input c=a+b*5. With a neat sketch, illustrate how the 8 2 1 2 2.2.1
input is transformed into assembly code, using all the phases of
compiler.
(Problem solving-4 marks, explanation-4)
(ii). Illustrate LEX code with an example. 4
Answer:
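%{
/* Token codes LETTER and DIGIT are defined in y.tab.h (generated by yacc). */
#include <stdio.h>
#include "y.tab.h"
int c;
extern int yylval;
%}
%%
" "             ;   /* skip blanks */
[a-z]           {
                    /* lowercase letter: pass its index (0..25) to the parser */
                    c = yytext[0];
                    yylval = c - 'a';
                    return(LETTER);
                }
[0-9]           {
                    /* digit: pass its numeric value to the parser */
                    c = yytext[0];
                    yylval = c - '0';
                    return(DIGIT);
                }
[^a-z0-9\b]     {
                    /* any other character is returned as its own token */
                    c = yytext[0];
                    return(c);
                }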

OR
10 (i). Convert the following Non-Deterministic Finite Automata (NFA) to 8 3 1 3 3.3.2
Deterministic Finite Automata (DFA) using subset construction
method.

Answer : Accept any method of conversion for this question


(ii). Infer the importance of the compiler construction tools
Answer :
Some commonly used compiler-construction tools include
1. Parser generators.
2. Scanner generators.
3. Syntax-directed translation engines.
4. Automatic code generators.
5. Data-flow analysis engines.
6. Compiler-construction toolkits.
Parser Generators
Input: Grammatical description of a programming language
Output: Syntax analyzers.
A parser generator takes the grammatical description of a
programming language and produces a syntax analyzer.
Scanner Generators
Input: Regular expression description of the tokens of a language
Output: Lexical analyzers.
A scanner generator generates lexical analyzers from a regular
expression description of the tokens of a language.
Syntax-directed Translation Engines
Input: Parse tree.
Output: Intermediate code.
Syntax-directed translation engines produce collections of
routines that walk a parse tree and generate intermediate
code.
Automatic Code Generators
Input: Intermediate language.
Output: Machine language.
A code generator takes a collection of rules that define the
translation of each operation of the intermediate language into
the machine language for a target machine.
Data-flow Analysis Engines
A data-flow analysis engine gathers information about how
values are transmitted from one part of a program to each of the
other parts. Data-flow analysis is a key part of code
optimization.
Compiler Construction Toolkits
Compiler construction toolkits provide an integrated set of
routines for constructing the various phases of a compiler.

*Performance Indicators are available separately for Computer Science and Engineering in AICTE examination
reforms policy.

Course Outcome (CO) and Bloom’s level (BL) Coverage in Questions


Approved by the Audit Professor/Course Coordinator
Academic Year: 2022-23 (EVEN)
Test: CLAT-1 Date: 17.2.2022
Course Code & Title: 18CSC304J COMPILER DESIGN Duration: 1 HOUR
Year & Sem: III & V Max. Marks: 25
Part – A ( 5 x 1 = 5 Marks) Instructions: Answer ALL
Q. Question Marks BL CO PO PI
No Code
1 NFA with ϵ transitions _______ 1 1 1 1 1.3.1
a) Increases computations
b) Decreases computations
c) Decreases number of states
d) Increases uncertainty
Ans: a
2 What is the maximum number of tokens generated in the 1 2 1 1 1.1.2
lexical analysis phase for the following statement?
printf("a = %f, &a = %d, b=%d", a, &a,b);
a) 10
b) 12
c) 17
d) 18
Ans: b
3 If L,D, S denote the sets of letters, digits and underscore 1 2 1 1 1.1.2
respectively. Then , which can possibly define an identifier?
a) S(LUD)+
b) (LUS)(LUDUS)*
c) (LUS)(LUD)*
d) L(L.D.S)*
Ans: b
4 The error of missing parenthesis detection occurs in _______ 1 1 1 1 1.3.1
phase.

a) Semantic

b) Lexical

c) Syntax

d) Syntax and lexical

Ans: c
5 I: DFA’s can be constructed for all the languages 1 2 1 2 2.1.1
II: The strings accepted by DFA will be accepted by NFA
What can be said about these two statements?
a) Only II is false
b) Only I is false
c) I is false and II is true
d) II is true and I is false
Ans: c or d
Part – B ( 2 x 4 = 8 Marks) Instructions: Answer TWO
6 Explain the process of input buffering for the given source 4 3 1 2 2.1.1
code.
int i,j;
i=i+1;
j=j+1;
Explain the process with one buffer(size:5) and two buffer (size
5 ) concepts

Answer: Definition and one buffer scheme with example (2 marks),
two buffer scheme (2 marks)
• Sometimes lexical analyzer needs to look ahead some
symbols to decide about the token to return
• In C language: we need to look after -, = or <
to decide what token to return
• In Fortran: DO 5 I = 1.25
• We need to introduce a two buffer scheme to handle
large look-aheads safely

Two pointers – Begin pointer (bp), Forward pointer (fp)

7 Raju is authoring a book on compiler. He makes sure that the 4 2 1 2 1.1.2
first page is an index page followed by two acknowledgement
pages. Design a DFA for the language L=all strings over {a,b}.
Note: the index page and the acknowledgement pages are referred to
by the strings ‘a’ and ‘b’ respectively.
Answer: Recognition – 2marks, DFA – 2marks

8 Draw the transition diagrams for unsigned integers and 4 1 1 1 1.3.1
relational operators.
Answer:
unsigned integers – 2marks
relational operators – 2marks

Unsigned integers:

Relational operators:
Part – C ( 1 x 12 = 12 Marks) Instructions: Answer any ONE
9 Convert the following RE=a(a|b)*abb to DFA using subset 12 3 1 2 2.1.2
construction method and minimize it.
Answer: RE to NFA – 4marks, NFA to DFA – 4marks,
minimization of DFA – 4marks
OR
10 a. Perform minimization technique on the following 5+7 3 1 2 2.1.2
DFA
Answer:

b. Define token, pattern and lexeme with example


Definitions – each 1 mark
A token is a pair consisting of a token name and an optional attribute value.
A pattern is a description of the form that the lexemes of a token may take.
A lexeme is a sequence of characters in the source program that matches the
pattern for a token.
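The three terms can be illustrated with a small regular-expression tokenizer in Python. The token names and patterns in `TOKEN_SPEC` are assumptions chosen for the example, not a definition taken from the course; characters that match no pattern are silently skipped in this sketch.

```python
import re

# Each entry pairs a token name with the pattern its lexemes must match.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=]"),
    ("PUNCT", r"[;,()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (token name, lexeme) pairs for `text`, ignoring white space."""
    tokens = []
    for match in MASTER.finditer(text):
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
    return tokens


for name, lexeme in tokenize("i = i + 1;"):
    print(name, repr(lexeme))
```

Here `ID` is a token name, `[A-Za-z_]\w*` is its pattern, and `i` is a lexeme that matches the pattern.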
SRM Institute of Science and Technology
College of Engineering and Technology SET D
School of Computing
SRM Nagar, Kattankulathur – 603203, Chengalpattu District, Tamilnadu
Academic Year: 2022-23 (Even)

Test: CLA-T2 Date: 04-04-2023


Course Code & Title: 18CSC304J & Compiler Design Duration: 1 hour 40 min
Year & Sem: III Year / VI Sem Max. Marks: 50

Course Articulation Matrix: (to be placed)


S.No. Course Outcome PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12 PSO1 PSO2 PSO3
1 CO3 3 3 3

Part – A
(10 x 1 = 10 Marks)
Instructions: Answer all
Q. Question Marks BL CO PO PI
No Code
1 Given the following expression grammar: 1 L1 3 2 2.1.1

E -> E * F | F + E | F

F -> F - F | id
which of the following is true?

A. * has higher precedence than +
B. - has higher precedence than *
C. + has higher precedence
D. + has higher precedence than *
ANS :B

2 Which of the following derivations does a bottom up 1 L1 3 2 2.1.1
parser use while parsing an input string?
A. Leftmost derivation
B. Leftmost derivation in reverse
C. Rightmost derivation
D. Rightmost derivation in reverse
ANS :D
3 Among simple LR (SLR), canonical LR, and look-ahead 1 L1 3 2 2.1.2
LR (LALR), which of the following pairs identify the
method that is very easy to implement and the method
that is the most powerful, in that order?
A. SLR, LALR
B. Canonical LR, LALR
C. SLR, canonical LR
D. LALR, canonical LR
ANS :C
4 Which of the following is correct? 1 L2 3 2 2.1.2
A. r? = r . R
B. r? = r / ε
C. r? = r / r*
D. r? = r . r*
ANS :B
5 A form of recursive-descent parsing that does not require 1 L2 3 2 2.1.2
any back-tracking is known as?

A. predictive parsing
B. non-predictive parsing
C. recursive parsing
D. non-recursive parsing
ANS :A

6 A programmer, by mistake, writes an instruction to 1 L2 3 1 1.6.1
divide instead of multiply. Such an error can be detected
by a/an

A. Compiler
B. Interpreter
C. Linker
D. Not by Compiler/ Interpreter/ Linker
ANS :D
7 Which one of the following is TRUE at any valid state in 1 L2 3 2 2.6.1
shift-reduce parsing?
A. Viable prefixes appear only at the bottom of the stack
and not inside
B. Viable prefixes appear only at the top of the stack and
not inside
C. The stack contains only a set of viable prefixes
D. The stack never contains viable prefixes
ANS :C
8 Choose a correct statement after left factoring A-> 1 L1 3 2 2.6.1
aA’/ε
A. A’ -> AB/A/b
B. A’ -> AB/A/ε
C. A -> AB/A/a/b
D. Left factoring cannot be done
ANS :B
9 Consider the grammar , 1 L2 3 2 2.1.2
X→a
X→Y
Z→d
Z→XYZ
Y→c
Y→ε
Identify the FIRST(Y)
A. {$}
B. { c, ε }
C. { a, c, ε }
D. { c}
ANS :B
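The answer can be checked by computing FIRST sets with the usual fixed-point iteration. The sketch below is an illustration only: the dictionary encoding of the grammar and the function name `first_sets` are assumptions, and any symbol that is not a key of the dictionary is treated as a terminal.

```python
EPS = "ε"

# Grammar of question 9: each nonterminal maps to a list of production bodies.
GRAMMAR = {
    "X": [["a"], ["Y"]],
    "Z": [["d"], ["X", "Y", "Z"]],
    "Y": [["c"], [EPS]],
}


def first_sets(grammar):
    """Compute FIRST for every nonterminal by iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                before = len(first[nt])
                all_nullable = True  # stays True if every symbol can derive ε
                for sym in body:
                    if sym == EPS:
                        break
                    if sym not in grammar:          # terminal symbol
                        first[nt].add(sym)
                        all_nullable = False
                        break
                    first[nt] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        all_nullable = False
                        break
                if all_nullable:
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


for nt, symbols in first_sets(GRAMMAR).items():
    print(nt, sorted(symbols))
```

Under these assumptions FIRST(Y) comes out as { c, ε }, which agrees with option B.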
10 Consider the augmented grammar given : 1 L1 3 2 2.1.1
S' → S
S → 〈L〉 | id
L → L,S | S
Let I0 = CLOSURE ({[S' → ●S]}). The number of items
in the set GOTO (I0 , 〈 ) is: _____.
A. 5
B. 4
C. 3
D. 1
ANS :A
SRM Institute of Science and Technology
College of Engineering and Technology SET C
SCHOOL OF COMPUTING
SRM Nagar, Kattankulathur – 603203, Chengalpattu District, Tamilnadu
Academic Year: 2022-23 (EVEN)
Test: CLAT-2 Date: 04.04.2022
Course Code & Title: 18CSC304J COMPILER DESIGN Duration: 2 periods
Year & Sem: III & V Max. Marks: 50

Q. Question
No
1 Grammar of the programming is checked at which phase of compiler?
A. Lexical analysis
B. Syntax analysis
C. Semantic analysis
D. Syntax directed translation
ANS :B
2 Which of the following regular expression operator has highest precedence
A. Concatenation
B. Union
C. Positive closure
D. Kleene closure
ANS :D
3 An LALR(1) parser for a grammar G can have shift-reduce (S-R) conflicts if and only if
A. the SLR(1) parser for G has S-R conflicts
B. the LR(1) parser for G has S-R conflicts
C. the LR(0) parser for G has S-R conflicts
D. the LALR(1) parser for G has reduce-reduce conflicts
ANS :B
4 The grammar C → CC | (C) | ε is not suitable for predictive-parsing because the
grammar is
A. Ambiguous
B. Left recursive
C. Right recursive
D. An operator grammar
ANS :B
5 Consider the grammar E → E + n | E × n | n

For a sentence n + n × n, the handles in the right-sentential form of the reduction are

A. n, E + n and E + n × n
B. n, E + n and E + E × n
C. n, n + n and n + n × n
D. n, E + n and E × n

ANS :D
6 For the grammar rules given below, what is FIRST(S)? S -> Aa | bB, A -> c | ε
A. b,c
B. a,c
C. a,b,c
D. a,b,c,ε
ANS :C
7 Consider the grammar defined by the following production rules, with two operators ∗
and +

S --> T * P

T --> U | T * U

P --> Q + P | Q

Q --> Id

U --> Id

Which one of the following is TRUE?

A. '+' is left associative, while '∗' is right associative

B. '+' is right associative, while '∗' is left associative

C. Both + and ∗ are right associative

D. Both + and ∗ are left associative


ANS :B
8 The grammar A → AA | (A) | ε is not suitable for predictive-parsing because the
grammar is

A. ambiguous

B. left-recursive

C. right-recursive

D. both (A) and (B)


ANS :D
9 Consider the following grammar:

S → FR

R→S|ε

F → id

In the predictive parser table, M, of the grammar, the entries M[S, id] and M[R, $]
are, respectively:

A. {S → FR} and {R → ε }
B. {S → FR} and { }

C. {S → FR} and {R → *S}

D. {F → id} and {R → ε}

ANS :A
10 Consider the grammar

S → aAbB | bAaB | ε

A→S

B→S

What is Follow(S)

A. {a,b,$}

B. {a,$}

C. {b,$}

D. {a}
ANS :A

Course
S.No. Outcome PO1 PO2 PO3 PO4 PO5 PO6 PO7

1 CO3 3 3 3

11 Construct the LR(0) item sets for the following grammar

S->SS, S->a S->

12 Differentiate between Top down parsing and Bottom up parsing

13 Ragu has to find the CFG for the below pseudo code and find leading and trailing for all the non-terminals
from the following grammar
Statement
{
While expression then statement OR
for expression then else statement
}
expression = b
14 Parse the input string “ibtaea”using shift reduce parsing for the following grammar
S->iEtS/iEtSeS/a
E->b
STACK       INPUT       ACTION
$           ibtaea$     Shift
$i          btaea$      Shift
$ib         taea$       Reduce E->b
$iE         taea$       Shift
$iEt        aea$        Shift
$iEta       ea$         Reduce S->a
$iEtS       ea$         Shift
$iEtSe      a$          Shift
$iEtSea     $           Reduce S->a
$iEtSeS     $           Reduce S->iEtSeS
$S          $           Accept

15 Eliminate Left Recursion for the following grammar


A->ABd/Aa/a
B->Be/b

16 a) Perform the Backtracking for the input xyxyz for the following grammar (4 marks)
S->xPz
P->yW/y
W-> xy
b) What are the problems in Top Down Parsing ( 4marks)

c) Write the procedure of Recursive Descent Parsing (procedure 3 marks and input checking 3 marks) for the following
grammar
E->num T
T->*num T/
17 Consider the Context free grammar
T->A/L
A-> no/id
L -> (S)
S->T,S/S

i) Find the issue in the above grammar and eliminate (4 marks)


ii) Construct the First and Follow for the Non-terminals (4marks)
iii) Construct the LL(1) parsing table (4 marks)
18 i) Justify that the below grammar is an Operator Precedence Grammar with the rules (2 marks)

E->EAE
E->(E)
E->id
A->*
A-> / ( NOTE: / represents the divide symbol)

ANSWER :

E-> E*E
E-> E/E
E-> (E)

ii)Write down the rules for Leading , Trailing and operator precedence table (5 marks )

ANSWER :

Leading(E) = {*, /, (, id}

Trailing(E) = {*, /, ), id}
iii) Construct the operator precedence table for the operators using Leading and Trailing (4 marks)

ANSWER:

E-> E*E

Trailing (E) > *

* < leading (E)

E-> E/E

Trailing (E) > /

/ < leading (E)

E-> (E)

(=)

Trailing (E) > )

( < leading (E)

iv) Draw the operator precedence graph using operator precedence functions and find the longest path (4 marks)

19
Show that the following grammar is LR(1) but not LALR(1)
S -> aEa | bEb | aFb | bFa
E -> e
F -> e

Item Sets Construction (5 marks), Table of LR(1) (3 marks), Table of LALR (3 marks) and Justification (1 mark)
I0:
S' -> .S [$]
S -> .aEa [$]
S -> .aFb [$]
S -> .bFa [$]
S -> .bEb [$]

I1 = Goto (I0, S):
S' -> S. [$]

I2 = Goto (I0, a):
S -> a.Ea [$]
S -> a.Fb [$]
E -> .e [a]
F -> .e [b]

I3 = Goto (I0, b):
S -> b.Eb [$]
S -> b.Fa [$]
E -> .e [b]
F -> .e [a]

I4 = Goto (I2, E):
S -> aE.a [$]

I5 = Goto (I2, F):
S -> aF.b [$]

I6 = Goto (I2, e):
E -> e. [a]
F -> e. [b]

I7 = Goto (I3, E):
S -> bE.b [$]

I8 = Goto (I3, F):
S -> bF.a [$]

I9 = Goto (I3, e):
E -> e. [b]
F -> e. [a]

I10 = Goto (I4, a):
S -> aEa. [$]

I11 = Goto (I5, b):
S -> aFb. [$]

I12 = Goto (I7, b):
S -> bEb. [$]

I13 = Goto (I8, a):
S -> bFa. [$]

1. S -> aEa
2. S-> bEb
3. S-> aFb
4. S-> bFa
5. E -> e
6. F -> e

LR(1) TABLE

STATE   a      b      e      $        S    E    F
0       s2     s3                     1
1                            Accept
2                     s6                   4    5
3                     s9                   7    8
4       s10
5              s11
6       r5     r6
7              s12
8       s13
9       r6     r5
10                           r1
11                           r3
12                           r2
13                           r4

LALR TABLE

STATE   a       b       e      $        S    E    F
0       s2      s3                      1
1                              Accept
2                       s6                   4    5
3                       s9                   7    8
4       s10
5               s11
69      r5/r6   r6/r5
7               s12
8       s13
10                             r1
11                             r3
12                             r2
13                             r4

Since merging states 6 and 9 produces reduce-reduce conflicts in state 69
(r5/r6 on both a and b), the grammar is not LALR(1).
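The justification can be replayed with a short Python sketch. It is a simplified illustration: only states I6 and I9 are modelled, items are written as (production, dot position) pairs mapped to their lookahead sets, every item is assumed to be complete, and the helper names `reduce_conflicts` and `merge` are hypothetical.

```python
from collections import defaultdict

# LR(1) states 6 and 9: same core, different lookaheads.
I6 = {("E -> e", 1): {"a"}, ("F -> e", 1): {"b"}}
I9 = {("E -> e", 1): {"b"}, ("F -> e", 1): {"a"}}


def reduce_conflicts(state):
    """Return the lookaheads on which more than one complete item wants to reduce."""
    by_lookahead = defaultdict(set)
    for (production, _dot), lookaheads in state.items():
        for lookahead in lookaheads:
            by_lookahead[lookahead].add(production)
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}


def merge(state1, state2):
    """LALR merge: states with the same core take the union of their lookaheads."""
    assert state1.keys() == state2.keys(), "states must share the same core"
    return {item: state1[item] | state2[item] for item in state1}


I69 = merge(I6, I9)
print("I6 conflicts: ", reduce_conflicts(I6))
print("I9 conflicts: ", reduce_conflicts(I9))
print("I69 conflicts:", reduce_conflicts(I69))
```

States 6 and 9 are conflict-free on their own, while the merged state asks for both E -> e and F -> e on each of the lookaheads a and b.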


SRM Institute of Science and Technology
College of Engineering and Technology
SET-B
SCHOOL OF COMPUTING
SRM Nagar, Kattankulathur – 603203, Chengalpattu District, Tamilnadu
Academic Year: 2022-23 (EVEN)
Test: CLAT-3 Date: 04.5.2023
Course Code & Title: 18CSC304J COMPILER DESIGN Duration: 2 periods
Year & Sem: III & V Max. Marks: 50
---------------------------------------------------------------------------------------------------------------------------------------
Course Articulation Matrix:
S.No. Course Outcome PO1 PO2 PO3 PO4 PO5 PO6 PO7 PO8 PO9 PO10 PO11 PO12
1 CO4 H H H H M L L L M M L H
2 CO5 H H H H M L L L M M L H
3 CO6 H H H H M L L L M M L H

Part – A ( 10 x 1 = 10 Marks) Instructions: Answer all


Q. Question Marks BL CO PO PI
No Code
1 Three address statement has 1 1 4 1 1.1.3
(i) Maximum of 3 references, of which 2 are for operands and one for the result
(ii) Exactly 3 references, all 3 for operands only
(iii) Exactly 3 references, of which 2 are for operands and one for the result
(iv) Minimum of 3 references, of which 2 are for operands and one for the result

2 Syntax Directed Translations are 1 1 4 1 1.1.3

(i) The other representation of context-free grammars


for specifying translations for programming language
constructs.
(ii) Context-free grammar symbols are associated
with set of Attributes
(iii) Context-free grammar productions are
associated with Semantic Rules
(iv) All of the above

3 Intermediate code tends to be 1 1 4 1 1.1.1


(i) Machine-independent code
(ii) Machine-dependent code
(iii) Both machine-independent and machine-
dependent code
(iv) Machine code
4 It enables the optimizers to liberally re-position the 1 1 5 1 1.1.1
sub-expression to produce an optimized code.
(i) Quadruples
(ii) Triples
(iii) Indirect Triples
(iv) Quadriples

5 Semantic rules in a S-Attributed Definition can be 1 1 5 1 1.1.1


evaluated by a
(i) Bottom-up order
(ii) PostOrder traversal
(iii) InOrder traversal
(iv) Either (i) or (ii)

6 The evaluation order of Synthesized attributes and 1 2 4 1 1.1.3


inherited attribute are ----------- and -----------
(i) In-Order and Pre-Order
(ii) Pre-Order and Post-Order
(iii) Bottom-up order and Post-Order
(iv) Post-Order and Pre-Order

7 ---------------- Keeps track of location where current 1 1 6 1 1.1.1


value of the name can be found and -------------
informs the availability of registers to the code
generator.
(i) Register descriptor and address descriptor
(ii) Address descriptor and register descriptor
(iii) Register tracker and address descriptor
(iv) Address descriptor and register tracker

8 Peep-hole optimization is a form of 1 1 6 1 1.1.1

a) loop optimization
b) local optimization
c) constant folding

d) data flow analysis

9 Substitution of values for names whose values are 1 1 5 1 1.1.1


constant, is done in
a) local optimization
b) loop optimization
c) constant folding
d) none of these

10 Local and loop optimization in turn provide 1 1 5 1 1.1.1


motivation for

a) data flow analysis


b) constant folding
c) peep hole optimization
d) DFA and constant folding

Part – B ( 4 x 4 = 16 Marks) Instructions: Answer FOUR


11 Find the polish and reverse polish notation using stack method for the following expression
(a+(b*c))^d-e/(f+q)
Reverse Polish Notation or POSTFIX

Polish Notation or PREFIX


12 Consider the following pseudo code
if ( a>b) then x=a+b else x =a-b ,
write the quadruple, triple and indirect triple.

13 Draw the DAG by converting the following 4 4 5 3 2.2.3


expression into three address code.
a=b+e
b=c[i]+d[j];
c= a+b
a= a+b*c-(a+b)

14 Write the comparison among Static allocation, Stack allocation and Heap Allocation with
their merits and limitations.

15 Discuss in detail about optimization of basic blocks


• A number of code-improving transformations such as structure-preserving transformations,
dead-code elimination and algebraic transformations can be applied for basic blocks
• Many of the structure-preserving transformations can be implemented by constructing a DAG
for a basic block
• Consider the block. The DAG for the block is
a := b + c
b := a – d
c := b + c
d := a – d
• Consider the block. The DAG for the block is
a := b + c
b := b – d
c := c + d
e := b + c

The use of Algebraic Identities


Algebraic identities represent an important class of optimizations on basic blocks
1. x+0 = 0+x = x
x-0 = x
x*1 = 1*x = x
x/1 = x
2. Reduction in strength
x**2 = x*x
2.0*x = x+x
x/2 = x*0.5
3. Constant folding: Constant expressions are evaluated at compile time and the constant expressions are
replaced by their values
4. Commutativity
x*y = y*x
5. Associativity
(x+y)+z = x+(y+z)
Sometimes, associative laws may also be applied to expose common sub expressions
Example:
Source Code
a := b + c
e := c + d + b
Intermediate Code
a := b + c
t := c + d
e := t + b
OR (rewriting c + d + b as (b + c) + d exposes the common subexpression b + c)
a := b + c
e := a + d
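A few of the identities listed above can be applied mechanically to a single three-address operation. The Python sketch below is illustrative only: the function name `simplify` and the tuple encoding `(op, x, y)` are assumptions, operands are variable names (strings) or numeric constants, and the rewrite x/2 = x*0.5 is only meaningful for floating-point operands.

```python
import operator

OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
}


def is_const(value):
    return isinstance(value, (int, float))


def simplify(op, x, y):
    """Return a simpler equivalent of `x op y`, or (op, x, y) if no rule applies."""
    # Constant folding: evaluate at compile time (division by zero is left alone).
    if is_const(x) and is_const(y) and not (op == "/" and y == 0):
        return OPS[op](x, y)
    # Identity elements.
    if op == "+" and y == 0:
        return x
    if op == "+" and x == 0:
        return y
    if op == "-" and y == 0:
        return x
    if op == "*" and y == 1:
        return x
    if op == "*" and x == 1:
        return y
    if op == "/" and y == 1:
        return x
    # Reduction in strength.
    if op == "**" and y == 2:
        return ("*", x, x)
    if op == "*" and x == 2:
        return ("+", y, y)
    if op == "*" and y == 2:
        return ("+", x, x)
    if op == "/" and y == 2:
        return ("*", x, 0.5)
    return (op, x, y)


print(simplify("+", "x", 0))
print(simplify("**", "x", 2))
print(simplify("*", 2.0, "x"))
print(simplify("*", 3, 4))
```

A real optimizer would apply such rules repeatedly over the DAG of a basic block. This sketch rewrites one operation at a time.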

Part – C ( 2 x12 = 24 Marks)


16 Consider the following expression

a<b or c<d and e<f

How will you generate the three address (4 marks) code by forming annotated parse tree (4
marks) using the translation scheme with backtracking.
Three Address Code

100 : if a<b goto 103
101 : t1=0
102 : goto 104
103 : t1=1
104 : if c<d goto 107
105 : t2=0
106 : goto 108
107 : t2=1
108 : if e<f goto 111
109 : t3=0
110 : goto 112
111 : t3=1
112 : t4=t2 and t3
113 : t5=t1 or t4
ii) Describe the semantic rules for translating Boolean Expressions (4 marks)

OR
17 Write the three address code (4 marks) for the following and find the quadruple (3 marks)
,triple(3 marks) and indirect triple (2 marks)
i)
Switch (a+b)
{
case 1:x-x+1:
case 2: y-y+2
case 3: +3 default-1;
}

t=a+b

if t = 1 goto L1

if t = 2 goto L2

if t = 3 goto L3

L1:

T1 = x-x

T2 = T1+1

L2:

T3 = y-y

T4 = T3+2

L3:
T4 = z+3

QUADRUPLE
Location OP Arg1 Arg2 Result

(1) + a b t

(2) = t 1 (3) GOTO L1

(3) - x X T1

(4) + T1 1 T2

(5) = t 2 (6) GO TO L2

(6) - y y T3

(7) + T3 2 T4

(8) = t 3 (9) GOTO L3

(9) + z 3 T3

Triples

Location OP Arg1 Arg2

(1) + a b

(2) = t 1

(3) - x x

(4) + (3) 1

(5) = t 2

(6) - y y

(7) + (6) 2

(8) = t 3

(9) + z 3

Indirect triples

Statement

(31) (1)

(32) (2)
(33) (3)

(34) (4)

(35) (5)

(36) (6)

(37) (7)

(38) (8)

(39) (9)

ii) int sum = 0;


for (int i = 1; i <= n; i++) { sum += i*i; }

1. sum = 0;
2. i = 1;
3. if (i >n) goto 9
4. t1 = i * i;
5. sum = sum+t1;
6. t2 = i + 1;
7. i = t2;
8. goto(3)
9. goto calling program
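To check that the nine instructions compute the sum of squares, they can be run on a tiny interpreter. The Python sketch below is an illustration: the tuple encoding in `PROGRAM` and the operation names (`copy`, `add`, `mul`, `if_gt_goto`, `goto`, `return`) are assumptions made for the example, not a standard format.

```python
# The nine three-address instructions, keyed by their statement number.
PROGRAM = {
    1: ("copy", "sum", 0),
    2: ("copy", "i", 1),
    3: ("if_gt_goto", "i", "n", 9),
    4: ("mul", "t1", "i", "i"),
    5: ("add", "sum", "sum", "t1"),
    6: ("add", "t2", "i", 1),
    7: ("copy", "i", "t2"),
    8: ("goto", 3),
    9: ("return",),
}


def run(program, n):
    """Execute `program` with the given n and return the final value of sum."""
    env = {"n": n}

    def value(arg):
        # Strings are variable names; anything else is a constant.
        return env[arg] if isinstance(arg, str) else arg

    pc = 1
    while True:
        instr = program[pc]
        op = instr[0]
        pc += 1  # fall through to the next statement unless a jump is taken
        if op == "copy":
            env[instr[1]] = value(instr[2])
        elif op == "add":
            env[instr[1]] = value(instr[2]) + value(instr[3])
        elif op == "mul":
            env[instr[1]] = value(instr[2]) * value(instr[3])
        elif op == "if_gt_goto":
            if value(instr[1]) > value(instr[2]):
                pc = instr[3]
        elif op == "goto":
            pc = instr[1]
        elif op == "return":
            return env["sum"]


print(run(PROGRAM, 3))
```

With n = 3 the loop body runs three times and the result is 1 + 4 + 9 = 14.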

Quadruple

Location OP Arg1 Arg2 Result

(1) = Sum 0

(2) = i 1

(3) > i n (9)

(4) * i i T1

(5) + Sum T1 Sum

(6) + i 1 T2

(7) = i T2

(8) JMP (3)

Triple
Location OP Arg1 Arg2

(1) = Sum 0

(2) = i 1

(3) > i n

(4) * i i

(5) + sum (4)

(6) + i 1

(7) = i (6)

(8) JMP (3)

Indirect triples

Statement

(31) (1)

(32) (2)

(33) (3)

(34) (4)

(35) (5)

(36) (6)

(37) (7)

(38) (8)

(39) (9)

(40) (10)

Location OP Arg1 Arg2

(1) = Sum 0

(2) = i 1

(3) > i n
(4) * i i

(5) + sum (4)

(6) + i 1

(7) = i (6)

(8) JMP (3)

Perform the following optimization techniques for the quick sort


a. Dead code elimination
b. Variable elimination
c. Code motion
d. Reduction in strength

Sol:

Three address code for quick sort (2 marks)

common sub expression elimination:


Dead code elimination: (2 marks)

Induction-variable elimination (2marks)

• Two variables are said to be induction variables if a change in one of
them always produces a corresponding change in the other.

Code motion: (2 marks)

Code motion moves code outside the loop. This transformation takes an
expression that yields the same result regardless of the number of times
the loop is executed and places the expression before the loop.

Example Consider the stmt:

while(i<=limit-2)

Code motion : t :=limit-2; while(i<=t)

Reduction in strength (2 marks)

The replacement of an expensive operation by a cheaper one.
Example: the step t2 := 4*i; in B2 is replaced with t2 := t2+4;
This replacement speeds up the object code if addition takes less time
than multiplication.

OR
19 Consider the following program code:
prod=0;
i=1;
do{
prod=prod+a[i]*b[i];
i=i+1;
}while (i<=10);

i) Partition a sequence of three-address statements into basic blocks by finding the leader and
write the rules (6 marks)
ii). Perform the Transformation on Basic Blocks ( 6 marks)

I) Three address code for the given code is- (6 marks)

prod = 0

i=1

T1 = 4 x i

T2 = a[T1]

T3 = 4 x i
T4 = b[T3]

T5 = T2 x T4

T6 = T5 + prod

prod = T6

T7 = i + 1

i = T7

if (i <= 10) goto (3)

Step-01:

We identify the leader statements as-

• prod = 0 is a leader because the first statement is a leader.
• T1 = 4 x i is a leader because the target of a conditional or
unconditional goto is a leader.

Step-02:

The above generated three address code can be partitioned into 2 basic
blocks as-
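The leader rules and the resulting partition can be sketched in Python. This is an illustration under stated assumptions: the statements are numbered from 1, jump targets are written as `goto (n)`, `*` is used in place of `x` for multiplication, and the function names `find_leaders` and `basic_blocks` are hypothetical.

```python
import re

CODE = [
    "prod = 0",                # 1
    "i = 1",                   # 2
    "T1 = 4 * i",              # 3
    "T2 = a[T1]",              # 4
    "T3 = 4 * i",              # 5
    "T4 = b[T3]",              # 6
    "T5 = T2 * T4",            # 7
    "T6 = T5 + prod",          # 8
    "prod = T6",               # 9
    "T7 = i + 1",              # 10
    "i = T7",                  # 11
    "if (i <= 10) goto (3)",   # 12
]


def find_leaders(code):
    """Return the sorted statement numbers that start a basic block."""
    leaders = {1}  # rule 1: the first statement is a leader
    for number, stmt in enumerate(code, start=1):
        match = re.search(r"goto \((\d+)\)", stmt)
        if match:
            leaders.add(int(match.group(1)))  # rule 2: the target of a jump
            if number < len(code):
                leaders.add(number + 1)       # rule 3: the statement after a jump
    return sorted(leaders)


def basic_blocks(code):
    """Each block runs from one leader up to, but not including, the next."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code) + 1]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(len(leaders))]


print(find_leaders(CODE))
for block in basic_blocks(CODE):
    print(block)
```

For this code the leaders are statements 1 and 3, so block B1 holds statements 1 to 2 and block B2 holds statements 3 to 12.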

ii) Transformation on Basic Blocks

There are two types of transformations:

Structure-preserving transformations

Algebraic transformation
SRM Institute of Science and Technology

Course Code & Title: 18CSC304J & COMPILER DESIGN Duration: 2 periods
Year & Sem: III Year /VI Sem Max. Marks: 50

Part A Marks BL CO PO PI Code
1 How many temporary variables are required to express the following
statement in three address code? 1 4 4 3 3.6.1
if( a+b*h>a*b +h)
A=68
a) 3
b) 5
c) 6
d) 4
Ans: d
2 Consider the following translation scheme. 1 3 4 3 3.6.1
S -> ER
R -> *E {print('*');} R | ε
E -> F + E {print('+');} | F
F -> (S) | id {print(id.value);}
Here id is a token that represents an integer and id.value represents the
corresponding integer value. For an input ‘2 * 3 + 4’, this translation scheme
prints?
a) 2 * 3 + 4 b) 2 * + 3 4 c) 2 3 * 4 + d) 2 3 4 + *

Ans: d

3 How many basic blocks are there in the following code snippet? 1 3 4 3 3.6.1
x=8;
y=9;
z=0;
w=1;
L1:
if (x>6) {
if(y>5) {
z=x+y;
z=37;
w=x+y;
x=0;
goto L1;
}}

a) 3
b) 4
c) 5
d) 6

Ans: B

4 What does the following code print? Assume that the execution begins at
Procedure Z. 1 3 4 3 3.6.1

procedure A
print x

procedure B
int x = 6
call A

procedure C
int x = 2
print x
call B

procedure Z
int x = 9;
call A
call B
call C
call A
a) 9 6 2 6 9
b) 9 2 6 9 9
c) 9 6 6 2 9
d) 9 6 6 6 9

Ans: a
5 Which of the following has lower instruction costs? 1 2 4 1 1.7.1
a) ADD R1, #5
SUB 4(R1), *1(R0)
b) MOV R0, #78
ADD *R0, *R1
c) MOV b, a
ADD c, a
d) All have equal costs

Ans: b

6 Which is not an NP-complete problem? 1 2 5 1 1.6.1
a) Allocation of registers
b) Order of evaluation
c) Instruction selection
d) Both a and b

Ans: c
7 Which optimization technique is used to reduce multiple jumps? 1 1 5 2 2.7.1
A. Latter optimization technique
B. Peephole optimization technique
C. Local optimization technique
D. Code optimization technique
Ans: B
8 Contiguous memory allocation is possible only in [Marks: 1, BL: 3, CO: 5, PO: 1, PI Code: 1.7.1]

a) Heap
b) Heap and stack
c) Static and stack
d) Static and heap
Ans: c
9 Consider the following three address code. Identify the CORRECT collection of
optimizations that can be performed. [Marks: 1, BL: 2, CO: 5, PO: 1, PI Code: 1.6.1]
m=3
j=n
v=2*n
limit = integer n / 2
L1: j = j - 1
t4 = 4 * j
t5 = a[t4]
if t5 > limit - v goto L1
A. Code Motion, Constant Folding, Induction Variable Elimination, Reduction
in Strength
B. Copy Propagation, Code Motion, Dead Code Elimination, Reduction in
Strength
C. Constant Folding, Copy Propagation, Dead Code Elimination, Reduction in
Strength
D. Code Motion, Constant Folding, Copy Propagation, Induction Variable
Elimination

Ans: A

10 Consider the following statements [Marks: 1, BL: 2, CO: 5, PO: 1, PI Code: 1.6.1]

S1: Static allocation bindings do not change at runtime
S2: Heap allocation allocates and de-allocates storage at run time
Which of the following statements is/are true?
a) S1 is true and S2 is false b) S2 is true and S1 is false
c) Both S1 and S2 are true d) Both S1 and S2 are false
Ans: C

Part B
11 State various methods of implementing three address statements. [Marks: 4, BL: 2, CO: 4, PO: 1, PI Code: 1.6.1]
Three Address Code (2)
Three address code is a kind of intermediate code that is simple to generate and
to convert to machine code. Each instruction has at most three
addresses and one operator. The three address code also makes explicit
the sequence in which operations are performed by the compiler.
Pointers for Three Address Code
• Three-address code is an intermediate code that is
used by optimising compilers.
• In three-address code, the given expression is broken down into
multiple instructions. These instructions translate to assembly
language with ease.
• Each three address code instruction has at most three
operands. It combines a binary operator with an assignment.
There are three representations of three address code, namely
1. Quadruples
2. Triples
3. Indirect Triples
Explanation with Example (2)
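As an illustration of the three representations, here is a minimal Python sketch for the statement a = b * -c + b * -c. The statement, the tuple layouts and the ("ref", i) notation for "result of triple i" are assumptions chosen for this example.

```python
# Quadruples: (operator, arg1, arg2, result); results are named temporaries.
quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def to_triples(quads):
    """Drop the result field; refer to earlier results by their triple index."""
    position = {}                       # temporary name -> index of its triple

    def ref(arg):
        return ("ref", position[arg]) if arg in position else arg

    triples = []
    for op, arg1, arg2, result in quads:
        if op == "=":
            triples.append((op, result, ref(arg1)))
        else:
            triples.append((op, ref(arg1), ref(arg2)))
            position[result] = len(triples) - 1
    return triples


def to_indirect_triples(triples):
    """Statement list of pointers into the triple table, in execution order."""
    return list(range(len(triples))), triples


triples = to_triples(quads)
statements, table = to_indirect_triples(triples)
for i, triple in enumerate(triples):
    print(f"({i})", triple)
print("statement list:", statements)
```

With indirect triples, an optimizer can reorder statements by permuting the statement list only; the triples themselves, and the indices they refer to, stay unchanged.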

12 Translate the conditional statement if a < b then 1 else 0 into three address
code [Marks: 4, BL: 3, CO: 4, PO: 1, PI Code: 1.6.1]

Three Address Code for the given expression is-

(1) If (A < B) goto (4)
(2) T1 = 0
(3) goto (5)
(4) T1 = 1
(5)

13 Illustrate Peephole optimization with suitable Examples [Marks: 4, BL: 3, CO: 4, PO: 1, PI Code: 1.6.1]

Peephole Optimization Techniques
A. Redundant load and store elimination: A load or store that only
repeats the effect of the instruction just before it is removed.
B. Constant folding: Expressions whose operands are all constants are
evaluated at compile time.
C. Strength Reduction: The operators that consume higher execution
time are replaced by the operators consuming less execution time.
D. Combine operations: Several operations are replaced by a single
equivalent operation.
E. Dead code Elimination: Code that can never be
executed is removed, which improves processing time and reduces
the number of instructions.
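A few of these rules can be sketched as a single pass over a small window of instructions. The Python below is only an illustration: the (opcode, source, destination) tuple format and the specific rules chosen are assumptions, and it presumes that no instruction in the sequence is a jump target.

```python
def peephole(code):
    """Apply a few peephole rules to (opcode, source, destination) tuples."""
    out = []
    for ins in code:
        op, src, dst = ins
        # Redundant load/store: "MOV R0, a" followed by "MOV a, R0".
        if out and op == "MOV" and out[-1] == ("MOV", dst, src):
            continue
        # Algebraic simplification: x + 0 and x * 1 leave x unchanged.
        if (op, src) in {("ADD", "#0"), ("MUL", "#1")}:
            continue
        # Strength reduction: x * 2 becomes x + x.
        if (op, src) == ("MUL", "#2"):
            ins = ("ADD", dst, dst)
        out.append(ins)
    return out


before = [
    ("MOV", "R0", "a"),    # store R0 into a
    ("MOV", "a", "R0"),    # reload a into R0: redundant
    ("MUL", "#2", "R0"),   # R0 = R0 * 2
    ("ADD", "#0", "R0"),   # R0 = R0 + 0: no effect
    ("MOV", "R0", "b"),
]
for ins in peephole(before):
    print(ins)
```

The five instructions shrink to three: the store to a, ADD R0, R0, and the store to b.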

14a. Develop a DAG and optimal target code for the expression. [Marks: 4, BL: 3, CO: 5, PO: 2, PI Code: 1.6.1]
x = ((p + q) / (q - r)) - (p + q) * (q - r) + s


b.

15 Illustrate annotated parse tree with synthesized and inherited attributes for
expression 3*5 for the given grammar. [Marks: 4, BL: 3, CO: 5, PO: 1, PI Code: 1.6.1]
E -> TR
T -> FS
F -> n
S -> *T|ε
R -> ε

(2)

(2)
Part B
( 2*12=20)
16 Write quadruples, triples and indirect triples for the expression: -(a*b)+(c+d)-
(a+b+c+d) and explain the sequences of code generation algorithm. [Marks: 12, BL: 2, CO: 4, PO: 2, PI Code: 1.5.1]
a. Write quadruples, triples and indirect triples for the expression:
-(a*b)+(c+d)-(a+b+c+d) (9)
Sol:
First of all, this statement is converted into Three Address Code as−

t1 = a ∗ b
t2 = −t1
t3 = c + d
t4 = t2 + t3
t5 = a + b
t6 = t5 + c
t7 = t6 + d
t8 = t4 − t7

Quadruple

Location Operator arg 1 arg 2 Result
(0) ∗ a b t1
(1) − t1 t2
(2) + c d t3
(3) + t2 t3 t4
(4) + a b t5
(5) + t5 c t6
(6) + t6 d t7
(7) − t4 t7 t8

Triple

Location Operator arg 1 arg 2
(0) ∗ a b
(1) − (0)
(2) + c d
(3) + (1) (2)
(4) + a b
(5) + (4) c
(6) + (5) d
(7) − (3) (6)

Indirect Triple

The triples are the same as above; a separate statement list holds pointers to
them in execution order: (0), (1), (2), (3), (4), (5), (6), (7).

The sequences of code generation algorithm. (3)

The algorithm takes a sequence of three-address statements as input. For
each three address statement of the form x := y op z it performs the following
actions:

1. Invoke a function getreg to find the location L where the result
of the computation y op z should be stored.
2. Consult the address descriptor for y to determine y', the current location
of y. If the value of y is currently both in memory and in a register, prefer
the register for y'. If the value of y is not already in L, generate the
instruction MOV y', L to place a copy of y in L.
3. Generate the instruction OP z', L where z' is the
current location of z. If z is in both, prefer a register to a
memory location. Update the address descriptor of x to indicate
that x is in location L. If L is a register, update its descriptor to show
that it holds x, and remove x from all other register descriptors.
4. If the current values of y or z have no next uses, are not live on exit
from the block, and are in registers, alter the register descriptors to
indicate that after execution of x := y op z those registers will no
longer contain y or z.
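The four steps can be sketched in Python as follows. This is a simplified illustration, not a full implementation: spilling is not modelled (the sketch raises an error when registers run out), next-use information is approximated by scanning the rest of the block, and the statement format (x, y, op, z) is an assumption.

```python
def generate(block, live_on_exit, registers=("R0", "R1", "R2", "R3")):
    """Generate code for statements (x, y, op, z), each meaning x := y op z."""
    code = []
    reg_of = {}     # address descriptor (register part): variable -> register
    holds = {}      # register descriptor: register -> variable

    def used_later(var, index):
        later = any(var in (a, b) for _, a, _, b in block[index + 1:])
        return later or var in live_on_exit

    def getreg(y, index):
        # Reuse y's register when y is not needed after this statement.
        if y in reg_of and not used_later(y, index):
            return reg_of[y]
        for reg in registers:
            if reg not in holds:
                return reg
        raise RuntimeError("out of registers: spilling is not modelled here")

    for i, (x, y, op, z) in enumerate(block):
        L = getreg(y, i)                              # step 1
        y_loc = reg_of.get(y, y)                      # step 2: prefer a register
        if y_loc != L:
            code.append(f"MOV {y_loc}, {L}")
        z_loc = reg_of.get(z, z)                      # step 3
        code.append(f"{op} {z_loc}, {L}")
        previous = reg_of.pop(x, None)                # x now lives only in L
        if previous is not None and previous != L:
            holds.pop(previous, None)
        old = holds.get(L)
        if old is not None and old != x:
            reg_of.pop(old, None)
        holds[L] = x
        reg_of[x] = L
        for var in (y, z):                            # step 4: free dead operands
            reg = reg_of.get(var)
            if reg is not None and var != x and not used_later(var, i):
                del reg_of[var]
                holds.pop(reg, None)

    for var, reg in reg_of.items():                   # store live results at block end
        if var in live_on_exit:
            code.append(f"MOV {reg}, {var}")
    return code


# d := (a - b) + (a - c) + (a - c)
block = [("t", "a", "SUB", "b"), ("u", "a", "SUB", "c"),
         ("v", "t", "ADD", "u"), ("d", "v", "ADD", "u")]
for line in generate(block, live_on_exit={"d"}):
    print(line)
```

For this block the sketch emits MOV a, R0; SUB b, R0; MOV a, R1; SUB c, R1; ADD R1, R0; ADD R1, R0; MOV R0, d.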

OR
17 (a) What is syntax directed translation? How is it different from translation
schemes? Explain with an example. [Marks: 6, BL: 1, CO: 4, PO: 2, PI Code: 1.6.1]
Syntax directed translation:
A technique of compiler implementation in which the translation of the source code is
driven entirely by the parser is known as syntax-directed translation. The
parser primarily uses a Context-Free Grammar to check the input sequence
and deliver output for the compiler's next stage.
It is a kind of notation in which each production of a Context-Free Grammar is
associated with a set of semantic rules or actions, and each grammar symbol is
associated with a set of attributes. Thus, the grammar and the set of semantic
actions combine to make syntax-directed definitions. The translation
may be the generation of intermediate code or object code, or the addition of
information about the types of constructs to the symbol table.

Semantic Actions − A semantic action is executed whenever the parser
recognizes the input string generated by the context-free grammar.

For Example, A → BC {Semantic Action}

A semantic action is written in curly braces attached to a production.

In a Top-Down Parser, the semantic action is taken when A is
expanded to derive BC, which further derives the string w.
In a Bottom-Up Parser, the semantic action is executed when BC is reduced to
A.

Semantic Actions can perform −

• Computation of the values of attributes

S → S(1) + S(2) {S.VAL = S(1).VAL + S(2).VAL}

Here S.VAL is computed as the sum of S(1).VAL and S(2).VAL.

• Printing of Error Messages

Example − A → BC {error( );}

Whenever A is expanded to BC, an error function is called to print
an error message.
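The rule S.VAL = S(1).VAL + S(2).VAL defines a synthesized attribute, so it can be evaluated bottom-up over the parse tree. A minimal Python sketch follows; the leaf production S → num {S.VAL = num.lexval} and the Node class are assumptions added for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Parse tree node for S -> S(1) + S(2), or a leaf for the assumed S -> num."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    lexval: int = 0
    val: Optional[int] = None       # the synthesized attribute S.VAL


def evaluate(node):
    """Compute S.VAL for every node, children before parents."""
    if node.left is None:
        node.val = node.lexval                                   # S -> num
    else:
        node.val = evaluate(node.left) + evaluate(node.right)    # S -> S(1) + S(2)
    return node.val


# Parse tree for 2 + (3 + 4)
tree = Node(Node(lexval=2), Node(Node(lexval=3), Node(lexval=4)))
print(evaluate(tree))   # 9
```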

Syntax-directed translation is beneficial because it allows the
compiler designer to define the generation of intermediate code directly in
terms of the syntactic structure of the source language. The attributes of a
grammar are divided into two subsets known as synthesized and inherited attributes.

Attributes are associated with the grammar symbols that label the
parse tree nodes. In other words, attributes attach information to a
language construct through the grammar symbols representing
that construct. An attribute can hold anything reasonable:
a string, a number, a type, a memory location, a code fragment, etc.

For example, the attributes of an identifier can include its name, scope, type,
number of parameters, types of parameters, return
type, etc. The value of an attribute at a parse tree node is defined by a
semantic rule associated with the production applied at that node.

A TRANSLATION SCHEME is a context-free grammar in which semantic rules
are embedded within the right sides of the productions. So a translation
scheme is like a syntax-directed definition, except that the order of evaluation
of the semantic rules is explicitly shown.
(b) Express the semantic rules for productions of Boolean expressions. Write three-
address code for [Marks: 6, BL: 3, CO: 4, PO: 2, PI Code: 1.6.1]
if ( x < 100 || x > 200 && x != y )
x = 0;
Ans:

(3)

if ( x < 100 || x > 200 && x != y ) x = 0;

if x < 100 goto L2
goto L3
L3: if x > 200 goto L4
goto L1
L4: if x != y goto L2
goto L1
L2: x = 0
L1:
(3)
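The jumping code above follows from passing true and false labels down to each sub-expression. The Python sketch below is one way to generate it; the tuple encoding of the Boolean expression and the order in which labels are allocated are assumptions chosen so that the label names match the answer.

```python
from itertools import count


def gen_if(cond, body):
    """Three-address code for 'if (cond) body' using short-circuit jumps."""
    numbers = count(1)
    code = []
    pending = []                      # labels waiting to be attached to the next line

    def new_label():
        return f"L{next(numbers)}"

    def emit(text):
        prefix = "".join(f"{label}: " for label in pending)
        pending.clear()
        code.append(prefix + text)

    def gen_bool(b, true, false):
        kind = b[0]
        if kind == "rel":             # ("rel", "x < 100")
            emit(f"if {b[1]} goto {true}")
            emit(f"goto {false}")
        elif kind == "or":            # B1.true = B.true, B1.false = new label
            middle = new_label()
            gen_bool(b[1], true, middle)
            pending.append(middle)
            gen_bool(b[2], true, false)
        elif kind == "and":           # B1.true = new label, B1.false = B.false
            middle = new_label()
            gen_bool(b[1], middle, false)
            pending.append(middle)
            gen_bool(b[2], true, false)
        else:
            raise ValueError(f"unknown node kind {kind!r}")

    s_next = new_label()              # L1: code after the if statement
    b_true = new_label()              # L2: start of the body
    gen_bool(cond, b_true, s_next)
    pending.append(b_true)
    emit(body)
    code.append(f"{s_next}:")
    return code


cond = ("or", ("rel", "x < 100"),
        ("and", ("rel", "x > 200"), ("rel", "x != y")))
for line in gen_if(cond, "x = 0"):
    print(line)
```

Since && binds tighter than ||, the expression is parsed as x < 100 || (x > 200 && x != y), which is why the first test jumps straight to the body when it succeeds.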

OR
18 (a) Explain the sequence of stack allocation process for a function call. [Marks: 6, BL: 2, CO: 5, PO: 1, PI Code: 1.6.1]
Stack Allocation
• Stack allocation is based on the idea of a control stack
• Storage is organized as a stack, and activation records are pushed
and popped as activations begin and end respectively
• Storage for the locals in each call of a procedure is contained in the
activation record for that call
• Thus locals are bound to fresh storage in each activation, because a
new activation record is pushed onto the stack when a call is made
• The values of locals are deleted when the activation ends, because
the storage for locals disappears when the activation is popped
• Suppose that register top marks the top of the stack
• At runtime an activation record can be pushed and popped by
incrementing and decrementing top by the size of the record
Calling Sequences
• Procedure calls are implemented by generating calling sequences in
the target code
• A call sequence allocates an activation record and enters
information into its fields
• A return sequence restores the state of the machine so that the
calling procedure can continue execution
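The push and pop of activation records by moving top can be sketched as a toy model in Python. The class and method names are hypothetical, and a record here holds only the locals; a real record also carries links, saved machine status and so on.

```python
class ControlStack:
    """Toy control stack: 'top' moves by the size of each activation record."""

    def __init__(self, size=64):
        self.cells = [None] * size   # storage shared by all activation records
        self.top = 0                 # index of the first free cell
        self.records = []            # (procedure, base address, offsets of locals)

    def call(self, proc, local_names):
        """Call sequence: allocate a record for proc and return its base address."""
        size = len(local_names)
        if self.top + size > len(self.cells):
            raise MemoryError("control stack overflow")
        base = self.top
        offsets = {name: i for i, name in enumerate(local_names)}
        self.records.append((proc, base, offsets))
        self.top += size             # push: increment top by the record size
        return base

    def ret(self):
        """Return sequence: pop the record so the caller can continue."""
        proc, base, offsets = self.records.pop()
        self.top -= len(offsets)     # pop: decrement top by the record size
        return proc

    def address_of(self, name):
        """Address of a local of the procedure that is currently executing."""
        proc, base, offsets = self.records[-1]
        return base + offsets[name]


stack = ControlStack()
stack.call("main", ["x", "y"])
stack.call("swap", ["a", "b", "tmp"])
print(stack.address_of("tmp"))   # 4
stack.ret()
print(stack.top)                 # 2
```

A second call to swap after the return gets base address 2 again, which shows why the values of locals do not survive the end of an activation.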

(b) What is an Activation Record? Explain how it is relevant to the intermediate
code generation phase with respect to procedure declarations [Marks: 6, BL: 2, CO: 5, PO: 1, PI Code: 1.6.1]
Activation records: (3)
• Procedure calls and returns are usually managed by a run time stack called
the control stack.
• Each live activation has an activation record on the control stack, with the
root of the activation tree at the bottom; the latest activation has its record
at the top of the stack.
• The contents of the activation record vary with the language being
implemented. The fields of an activation record are:
Temporaries | Local Data | Machine Status | Control Link | Access Link | Actual Parameters | Return Value
• Temporary values such as those arising from the evaluation of expressions.
• Local data belonging to the procedure whose activation record this is.
• A saved machine status, with information about the state of the machine
just before the call to the procedure.
• An access link may be needed to locate data needed by the called procedure
but found elsewhere.
• A control link pointing to the activation record of the caller.
• Space for the return value of the called function, if any. Not all
called procedures return a value, and if one does, we may prefer to place that
value in a register for efficiency.
• The actual parameters used by the calling procedure. These are often not placed
in the activation record but rather in registers, when possible, for greater
efficiency.
Intermediate Code for Procedures (3)
Let f(a1, a2, a3, a4) be a call to a function f with four parameters
a1, a2, a3, a4.
Three address code for the procedure call f(a1, a2, a3, a4):
param a1
param a2
param a3
param a4
call f, 4

The general form is 'call f, n', where f is the name of the
procedure and n is the number of parameters (here n = 4).
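Emitting this sequence for any call is mechanical. A minimal Python sketch, with emit_call as a hypothetical helper name:

```python
def emit_call(proc, args):
    """Three-address code for a call: one 'param' per argument, then 'call proc, n'."""
    code = [f"param {arg}" for arg in args]
    code.append(f"call {proc}, {len(args)}")
    return code


for line in emit_call("f", ["a1", "a2", "a3", "a4"]):
    print(line)
```

The count n in the call instruction tells the code generator how many of the preceding param instructions belong to this call, which matters when calls are nested.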
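Example program to understand a function definition and a function call:
void swap(int a, int b); // declaration of the called function

int main(void)
{
int x = 1, y = 2;
swap(x, y); // function call
return 0;
}

void swap(int a, int b) // called function
{
// set of statements
}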

OR
19 Perform all possible optimization on the given code and explain the same. [Marks: 14, BL: 3, CO: 5, PO: 5, PI Code: 1.6.1]
t0=2
t1=a
t2=12
t3=t1+t2
t4=m[t3]
t5=t0*t4
t6=-16
t7=r+t6
t8=m[t7]
t9=m[t8]
t10=t9-t5
t11=4
t12=t10+t11
m[t12]=t10
Ans: 3 marks for each.

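Copy and constant propagation
The values of t0, t1, t2, t6 and t11 are substituted into their uses. Note that
t1 is a copy of the variable a, not of the constant 2, so t3 becomes a+12.
t3=a+12
t4=m[t3]
t5=2*t4
t7=r+(-16)
t8=m[t7]
t9=m[t8]
t10=t9-t5
t12=t10+4
m[t12]=t10
Dead code elimination
After propagation the assignments t0=2, t1=a, t2=12, t6=-16 and t11=4 are no
longer used, so they are removed (already omitted in the listing above).
Constant folding
No operation has two constant operands, so t3=a+12 cannot be folded. The only
simplification is algebraic: r+(-16) becomes r-16.
t3=a+12
t4=m[t3]
t5=2*t4
t7=r-16
t8=m[t7]
t9=m[t8]
t10=t9-t5
t12=t10+4
m[t12]=t10
Reduction in strength
The multiplication by 2 is replaced by an addition.
t3=a+12
t4=m[t3]
t5=t4+t4
t7=r-16
t8=m[t7]
t9=m[t8]
t10=t9-t5
t12=t10+4
m[t12]=t10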
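The propagation, folding and strength-reduction steps can be combined into one pass over the block. The Python sketch below is an illustration under stated assumptions: the tuple encoding of the instructions is invented for this example, each temporary is assigned once, a and r are not reassigned within the block, and the copied temporaries are not needed after the block. Since t1 copies the variable a rather than a constant, the sketch keeps t3 as a+12.

```python
import operator

FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def is_const(value):
    try:
        int(value)
        return True
    except (TypeError, ValueError):
        return False


def optimize(block):
    """Copy/constant propagation, folding and strength reduction in one pass."""
    env = {}                                  # temporary -> constant or variable it copies
    out = []

    def sub(value):
        return env.get(value, value)

    for ins in block:
        kind = ins[0]
        if kind == "copy":                    # ("copy", dst, src)
            _, dst, src = ins
            env[dst] = sub(src)               # the copy itself becomes dead code
        elif kind == "bin":                   # ("bin", dst, op, a, b)
            _, dst, op, a, b = ins
            a, b = sub(a), sub(b)
            if is_const(a) and is_const(b) and op in FOLD:
                env[dst] = str(FOLD[op](int(a), int(b)))      # constant folding
            elif op == "*" and a == "2":
                out.append(f"{dst}={b}+{b}")                  # strength reduction
            elif op == "*" and b == "2":
                out.append(f"{dst}={a}+{a}")
            elif op == "+" and is_const(b) and int(b) < 0:
                out.append(f"{dst}={a}-{-int(b)}")            # x + (-c) becomes x - c
            else:
                out.append(f"{dst}={a}{op}{b}")
        elif kind == "load":                  # ("load", dst, addr): dst = m[addr]
            out.append(f"{ins[1]}=m[{sub(ins[2])}]")
        elif kind == "store":                 # ("store", addr, src): m[addr] = src
            out.append(f"m[{sub(ins[1])}]={sub(ins[2])}")
        else:
            raise ValueError(f"unknown instruction kind {kind!r}")
    return out


block = [
    ("copy", "t0", "2"), ("copy", "t1", "a"), ("copy", "t2", "12"),
    ("bin", "t3", "+", "t1", "t2"), ("load", "t4", "t3"),
    ("bin", "t5", "*", "t0", "t4"), ("copy", "t6", "-16"),
    ("bin", "t7", "+", "r", "t6"), ("load", "t8", "t7"), ("load", "t9", "t8"),
    ("bin", "t10", "-", "t9", "t5"), ("copy", "t11", "4"),
    ("bin", "t12", "+", "t10", "t11"), ("store", "t12", "t10"),
]
for line in optimize(block):
    print(line)
```

The fourteen original instructions reduce to nine, ending with m[t12]=t10.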
