CS-603(C) Compiler Design - MST-1 Solution


Chameli Devi Group of Institutions

Department of Computer Science & Engineering


CS-603(C) Compiler Design
Question Bank for MST-I
1. Describe each phase of compiler and compare interpreter with compiler.
Ans. A Compiler is a translator from one language, the input or source language, to another language, the output or target language. Often, but not always, the target language is an assembler language or the machine language for a computer processor.
Phases of a compiler: A compiler operates in phases. A phase is a logically interrelated operation that takes the source program in one representation and produces output in another representation. The compilation process is partitioned into a number of sub-processes called ‘phases’. The phases of a compiler are shown below.
Lexical Analysis:-
The lexical analyzer (scanner) reads the source program one character at a time, carving the source program into a sequence of character units called tokens.
Syntax Analysis:-
The second stage of translation is called syntax analysis or parsing. In this phase expressions, statements, declarations, etc. are identified by using the results of lexical analysis.

Semantic Analysis
There is more to a front end than simply syntax. The compiler needs semantic information, e.g., the types
(integer, real, pointer to array of integers, etc) of the objects involved.
Intermediate Code Generation:-
An intermediate representation of the final machine language code is produced. This phase bridges the
analysis and synthesis phases of translation.
Code Optimization:-
This is an optional phase that improves the intermediate code so that the output runs faster and takes less space.
Code Generation:-
The last phase of translation is code generation, which produces the target machine code. A number of optimizations to reduce the length of the machine language program are carried out during this phase.
Symbol-Table Management
The symbol table stores information about program variables that will be used across phases. Typically, this
includes type information and storage location.
Error Handlers
It is invoked when an error in the source program is detected, so that the error can be reported to the user and recovery can be attempted.
Compare interpreter with compiler
Compiler
A Compiler is a translator from one language, the input or source language, to another language, the output or
target language. Often, but not always, the target language is an assembler language or the machine language
for a computer processor.
Executing a program written in an HLL programming language basically has two parts: the source program must first be compiled (translated) into an object program; then the resulting object program is loaded into memory and executed.
Interpreter
An interpreter is a program that appears to execute a source program as if it were machine language. Languages such as BASIC, SNOBOL, and LISP can be translated using interpreters. JAVA also uses an interpreter.
2. Demonstrate the outcome of each phase for the example below:
float sum, old_sum, rate;
sum=old_sum+rate*60;
Ans.
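The worked figure for this answer is not reproduced here. As a rough sketch of what the first phase (lexical analysis) produces for this statement, a toy scanner might carve it into tokens like this (the token class names and the helper are illustrative assumptions, not the course's exact classification):

```python
import re

# Toy scanner: illustrative token classes (identifier, number, operator).
SPEC = [("id", r"[A-Za-z_]\w*"), ("num", r"\d+"), ("op", r"[=+\-*/;,]")]
PATTERN = "|".join(f"(?P<{name}>{regex})" for name, regex in SPEC)

def tokenize(source):
    """Lexical analysis: carve the source into (token-type, lexeme) pairs."""
    return [(m.lastgroup, m.group()) for m in re.finditer(PATTERN, source)]

tokens = tokenize("sum = old_sum + rate * 60;")
# → [('id', 'sum'), ('op', '='), ('id', 'old_sum'), ('op', '+'),
#    ('id', 'rate'), ('op', '*'), ('num', '60'), ('op', ';')]

# The later phases would then build the parse tree for the assignment,
# type-check it (semantic analysis), and emit three-address intermediate
# code along the lines of:
#   t1 = rate * 60
#   t2 = old_sum + t1
#   sum = t2
```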

3. Describe T-diagram and also explain single pass, multi pass and cross compiler.
Ans. The rules for T-diagrams are very simple. A compiler written in some language “C” (which could be anything from machine code on up) that translates programs in language A to language B looks like this:

Now suppose you have a machine that can directly run HP machine code, and a compiler from ML to HP machine code, and you want to get an ML compiler running on a different machine code P. You can start by writing an ML-to-P compiler in ML, and compiling that with the ML-to-HP compiler to get an ML-to-P compiler that runs on HP:

One-pass Compiler
• A one-pass compiler traverses the program only once. It passes only once through the parts of each compilation unit and translates each part into its final machine code.
• In a one-pass compiler, as each source line is processed, it is scanned and its tokens are extracted.
• Then the syntax of each line is analyzed and the tree structure is built. After the semantic part, the code is generated.
Multi-pass Compiler
• A multi-pass compiler processes the source code of a program several times.
• In the first pass, the compiler reads the source program, scans it, extracts the tokens and stores the result in an output file.
• In the second pass, the compiler reads the output file produced by the first pass, builds the syntactic tree and performs the syntactical analysis. The output of this phase is a file that contains the syntactic tree.
Cross Compiler
A compiler that runs on one computer but produces object code for a different type of computer. Cross
compilers are used to generate software that can run on computers with a new architecture or on special-
purpose devices that cannot host their own compilers.

4. Define the need of Bootstrapping in Compiler with an example.


Ans. Bootstrapping is a technique that is widely used in compiler development. It has four main uses:
• It enables new programming languages and compilers to be developed starting from existing ones.
• It enables new features to be added to a programming language and its compiler.
• It also allows new optimizations to be added to compilers.
• It allows languages and compilers to be transferred between processors with different instruction sets.
A compiler is characterized by three languages:
• Source Language
• Target Language
• Implementation Language

Figure: T-Diagram

Notation:

(S, T, I) represents a compiler for Source S, Target T, implemented in I. The T-diagram shown above is also used to depict the same compiler.

To create a new language, L, for machine A:

1. Create (S, A, A), a compiler for a subset, S, of the desired language, L, written in language A, which runs on machine A. (Language A may be assembly language.)

2. Create (L, A, S), a compiler for language L written in a subset of L.

3. Compile (L, A, S) using (S, A, A) to obtain (L, A, A), a compiler for language L, which runs on machine A and produces code for machine A.

Figure Bootstrapping of Compiler

The process illustrated by the T-diagrams is called bootstrapping and can be summarized by the equation: (L, A, S) + (S, A, A) = (L, A, A).
To produce a compiler for a different machine B:

1. Convert (L, A, S) into (L, B, S) (by hand, if necessary). Recall that language S is a subset of language L.

2. Compile (L, B, S) with (L, A, A) to produce (L, B, A), a cross-compiler for L which runs on machine A and produces code for machine B.

3. Compile (L, B, S) with the cross-compiler (L, B, A) to produce (L, B, B), a compiler for language L which runs on machine B.

Figure Porting of Compiler

Need of Bootstrapping

• It is a non-trivial test of the language being compiled, and as such is a form of dogfooding.

• Compiler developers and the bug-reporting part of the community only need to know the language being compiled.

• Compiler development can be done in the higher-level language being compiled.

5. Explain Lexical Analyzer with its function and also describe the process of error handling.
Ans. Lexical analysis is the first phase of a compiler. It takes the modified source code from language preprocessors, written in the form of sentences. The lexical analyzer breaks these sentences into a series of tokens, removing any white space and comments in the source code.

If the lexical analyzer finds a token invalid, it generates an error. The lexical analyzer works closely with the
syntax analyzer. It reads character streams from the source code, checks for legal tokens, and passes the data
to the syntax analyzer when it demands.
The main functions of the lexical analyzer are:
1. Tokenization, i.e., dividing the program into valid tokens.
2. Removing white space characters.
3. Removing comments.
4. Helping to generate error messages by providing the row number and column number of each token.
Error Handling in Compiler Design
The tasks of the error handling process are to detect each error, report it to the user, and then apply some recovery strategy to handle the error. This whole process should not noticeably slow down the compilation of the program.
Types or Sources of Error – There are two types of error: run-time and compile-time error:

1. A run-time error is an error which takes place during the execution of a program, and usually happens because of adverse system parameters or invalid input data. The lack of sufficient memory to run an application, a memory conflict with another program, and logical errors are examples of this. Logic errors occur when executed code does not produce the expected result; they are best handled by meticulous program debugging.

2. Compile-time errors arise at compile time, before execution of the program. A syntax error or a missing file reference that prevents the program from successfully compiling is an example of this.

Classification of Compile-time error –

1. Lexical: misspellings of identifiers, keywords or operators.

2. Syntactical: a missing semicolon or unbalanced parentheses.

3. Semantical: an incompatible value assignment or a type mismatch between operator and operand.

4. Logical: unreachable code, an infinite loop.

6. Discuss the structure and functions of the symbol table.


Ans. The Symbol Table is an important data structure created and maintained by the compiler in order to keep track of the semantics of variables, i.e. it stores information about the scope and binding of names, and about instances of various entities such as variable and function names, classes, objects, etc.

• It is built in the lexical and syntax analysis phases.

• The information is collected by the analysis phases of the compiler and is used by the synthesis phases to generate code.

• It is used by the compiler to achieve compile-time efficiency.

• It is used by the various phases of the compiler as follows:

1. Lexical Analysis: Creates new entries in the table, for example entries for tokens.

2. Syntax Analysis: Adds information regarding attribute type, scope, dimension, line of
reference, use, etc in the table.

3. Semantic Analysis: Uses available information in the table to check for semantics i.e. to verify
that expressions and assignments are semantically correct(type checking) and update it
accordingly.

4. Intermediate Code Generation: Refers to the symbol table to know how much run-time storage is allocated and of what type; the table also helps in adding temporary variable information.
5. Code Optimization: Uses information present in symbol table for machine dependent
optimization.

6. Target Code generation: Generates code by using address information of identifier present in
the table.

Structure of Symbol table in Compiler:-

Sr. No    Token    Data Type    Init?
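The structure above can be sketched as a small map from names to attribute records (a minimal illustration; the class and method names are assumptions, not a standard interface):

```python
class SymbolTable:
    """Maps each identifier to its attributes (type, initialized flag, ...)."""

    def __init__(self):
        self._entries = {}

    def insert(self, name, data_type, initialized=False):
        # Called by the lexical/syntax analysis phases to create entries.
        self._entries[name] = {"type": data_type, "init": initialized}

    def lookup(self, name):
        # Used by semantic analysis and code generation; None if undeclared.
        return self._entries.get(name)

table = SymbolTable()
table.insert("sum", "float")
table.insert("rate", "float", initialized=True)
```

A real compiler would also record scope and storage location per entry, and typically use a hash table or a stack of tables for nested scopes.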

7. Explain Top down parsing and discuss the problems associated with it.
Ans. Top-Down Parsing

Parsing is classified into two categories, i.e. Top-Down Parsing and Bottom-Up Parsing. Top-Down Parsing is based on Left Most Derivation whereas Bottom-Up Parsing is dependent on Reverse Right Most Derivation. The process of constructing the parse tree which starts from the root and goes down to the leaves is Top-Down Parsing.
• A top-down parser is constructed from a grammar which is free from ambiguity and left recursion.
• A top-down parser uses leftmost derivation to construct the parse tree.
• It requires a grammar that is free from common prefixes, i.e. the grammar must be left-factored.

Classification of Top-Down Parsing –


1. With Backtracking: Brute Force Technique or Recursive Descent Parsing
2. Without Backtracking: Predictive Parsing or Non-Recursive Parsing or LL(1) Parsing or Table-Driven Parsing

In top-down parsing we face many problems.

Let’s discuss these problems and how to remove them:
• Backtracking
• Left Recursion
• Left Factoring
• Ambiguity

Backtracking.
If one derivation of a production fails, the syntax analyzer restarts the process using different rules of the same production. This technique may process the input string more than once to determine the right production.
• Example:
• S → xPz
• P → yw | y
• Input string: xyz

Left recursion.
A grammar is left-recursive if it has any non-terminal ‘A’ whose derivation contains ‘A’ itself as the left-most symbol.
• In case of left recursion, a top-down parser can enter an infinite loop.
• Example:
• A → Aα | β
• is rewritten as
• A → βA’
• A’ → αA’ | ε
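The rewrite A → Aα | β ⇒ A → βA’, A’ → αA’ | ε can be mechanized as follows (a simplified sketch that assumes immediate left recursion only; the function name and grammar encoding are illustrative):

```python
def eliminate_left_recursion(head, alternatives):
    """Rewrite A -> A a | b  as  A -> b A'  and  A' -> a A' | eps.

    `alternatives` is a list of right-hand sides, each a list of symbols.
    Handles immediate left recursion only.
    """
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    base = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}          # nothing to rewrite
    new = head + "'"
    return {head: [b + [new] for b in base],
            new: [r + [new] for r in recursive] + [["ε"]]}

rules = eliminate_left_recursion("A", [["A", "α"], ["β"]])
# → {"A": [["β", "A'"]], "A'": [["α", "A'"], ["ε"]]}
```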

Left Factoring.
If more than one grammar production rule has a common prefix string, then the top-down parser cannot decide which of the productions it should take to parse the string in hand.
Rule
A → α A'
A' → β1 | β2
Example
A → α β1 | α β2 | ··· | α βm | ɣ
Solution
A → α A' | ɣ
A' → β1 | β2 | ··· | βm
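One simple way to mechanize a single step of this transformation is to group the alternatives by their first symbol (a simplified sketch; full left factoring extracts the longest common prefix and repeats until no two alternatives share one):

```python
from collections import defaultdict

def left_factor(head, alternatives):
    """One step of left factoring: group alternatives sharing a first symbol.

    A -> a b1 | a b2 | g   becomes   A -> a A' | g,  with  A' -> b1 | b2.
    """
    groups = defaultdict(list)
    for alt in alternatives:
        groups[alt[0] if alt else "ε"].append(alt)
    rules = {head: []}
    for first_sym, group in groups.items():
        if len(group) == 1:
            rules[head].append(group[0])          # no common prefix here
        else:
            new = head + "'"
            rules[head].append([first_sym, new])  # A -> a A'
            rules[new] = [alt[1:] or ["ε"] for alt in group]
    return rules

rules = left_factor("A", [["α", "β1"], ["α", "β2"], ["ɣ"]])
# → {"A": [["α", "A'"], ["ɣ"]], "A'": [["β1"], ["β2"]]}
```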
8. Define Shift Reduce and Operator Precedence parsing with Example.
Ans. Shift reduce parsing
• Shift-reduce parsing is a process of reducing a string to the start symbol of a grammar.
• Shift-reduce parsing uses a stack to hold the grammar symbols and an input tape to hold the string.
• Shift-reduce parsing performs two actions, shift and reduce; that is why it is known as shift-reduce parsing.
• In a shift action, the current symbol of the input string is pushed onto the stack.
• In each reduction, symbols are replaced by a non-terminal: the symbols are the right side of a production and the non-terminal is the left side of that production.
Grammar:
1. S → S+S    
2. S → S-S    
3. S → (S)  
4. S → a  
Input string:
a1-(a2+a3)  
Parsing table:
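The parsing-table figure is not reproduced here, but the shift-reduce moves themselves can be replayed with a small driver (plain a is used in place of a1, a2, a3; the action sequence below is the standard hand-written trace for this input):

```python
def run_shift_reduce(tokens, actions):
    """Apply a scripted sequence of shift/reduce moves; return the stack."""
    stack, rest = [], list(tokens)
    for act in actions:
        if act == "shift":
            stack.append(rest.pop(0))
        else:                                  # act = ("reduce", rhs, lhs)
            _, rhs, lhs = act
            assert stack[-len(rhs):] == rhs    # the handle tops the stack
            del stack[-len(rhs):]
            stack.append(lhs)
    return stack

# Grammar: S -> S+S | S-S | (S) | a        Input: a - ( a + a )
trace = ["shift", ("reduce", ["a"], "S"),              # a        => S
         "shift", "shift",                             # push -, (
         "shift", ("reduce", ["a"], "S"),              # a        => S
         "shift", "shift", ("reduce", ["a"], "S"),     # push +; a => S
         ("reduce", ["S", "+", "S"], "S"),             # S+S      => S
         "shift", ("reduce", ["(", "S", ")"], "S"),    # (S)      => S
         ("reduce", ["S", "-", "S"], "S")]             # S-S      => S
stack = run_shift_reduce(list("a-(a+a)"), trace)
# → ["S"]  (the input is reduced to the start symbol: accept)
```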
Operator precedence parsing
Operator precedence parsing is a kind of shift-reduce parsing method. It is applied to a small class of operator grammars.
A grammar is said to be an operator precedence grammar if it has two properties:
• No R.H.S. of any production has a ε.
• No two non-terminals are adjacent.
Operator precedence relations can only be established between the terminals of the grammar; non-terminals are ignored.
There are the three operator precedence relations:
a ⋗ b means that terminal "a" has the higher precedence than terminal "b".
a ⋖ b means that terminal "a" has the lower precedence than terminal "b".
a ≐ b means that the terminal "a" and "b" both have same precedence.
Parsing Action
• Add the $ symbol at both ends of the given input string.
• Scan the input string from left to right until a ⋗ is encountered.
• Scan towards the left over all the equal-precedence relations until the first leftmost ⋖ is encountered.
• Everything between the leftmost ⋖ and the rightmost ⋗ is a handle.
• $ on $ means parsing is successful.
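The scan-right-to-⋗, scan-left-to-⋖ steps above can be sketched as follows (the precedence entries shown are illustrative relations for id and +, not a full table):

```python
def find_handle(symbols, prec):
    """Locate the handle between the leftmost < and the first > found.

    `symbols` is the $-delimited list of terminals; `prec` maps pairs of
    adjacent terminals to "<", "=", or ">".
    """
    for i in range(len(symbols) - 1):
        if prec.get((symbols[i], symbols[i + 1])) == ">":
            j = i
            # scan left over "=" relations until the matching "<"
            while j > 0 and prec.get((symbols[j - 1], symbols[j])) != "<":
                j -= 1
            return symbols[j:i + 1]
    return None                       # no ">" found yet: shift more input

# Illustrative relations for the string  $ id + id $
prec = {("$", "id"): "<", ("id", "+"): ">", ("+", "id"): "<",
        ("id", "$"): ">", ("$", "+"): "<", ("+", "$"): ">"}
handle = find_handle(["$", "id", "+", "id", "$"], prec)
# → ["id"]   (the leftmost id is reduced first)
```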
Precedence table:

9. Construct LL(1) Parsing table for the grammar below:


E → E+T | T
T → T*F | F
F → (E) | id
Ans. After eliminating left-recursion the grammar is
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id
First( ) :
FIRST(E) = { ( , id}
FIRST(E’) ={+ , ε }
FIRST(T) = { ( , id}
FIRST(T’) = {*, ε }
FIRST(F) = { ( , id }
Follow( ):
FOLLOW(E) = { $, ) }
FOLLOW(E’) = { $, ) }
FOLLOW(T) = { +, $, ) }
FOLLOW(T’) = { +, $, ) }
FOLLOW(F) = {+, * , $ , ) }
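The sets above can be checked mechanically with the usual fixed-point computation (a compact sketch; terminals are taken to be any symbols that do not appear as keys of the grammar dictionary):

```python
def compute_first(grammar):
    """FIRST sets by fixed-point iteration; "ε" marks a nullable symbol."""
    first = {nt: set() for nt in grammar}

    def first_of(seq):
        out = set()
        for sym in seq:
            syms = first[sym] if sym in grammar else {sym}
            out |= syms - {"ε"}
            if "ε" not in syms:
                return out
        out.add("ε")                     # every symbol in seq was nullable
        return out

    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                new = first_of(prod) - first[nt]
                if new:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start):
    """FOLLOW sets by fixed-point iteration; "$" marks end of input."""
    follow = {nt: set() for nt in grammar}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in grammar:
                        continue
                    add, nullable = set(), True
                    for t in prod[i + 1:]:
                        syms = first[t] if t in grammar else {t}
                        add |= syms - {"ε"}
                        if "ε" not in syms:
                            nullable = False
                            break
                    if nullable:
                        add |= follow[nt]     # rule 3: FOLLOW(A) ⊆ FOLLOW(B)
                    new = add - follow[sym]
                    if new:
                        follow[sym] |= new
                        changed = True
    return follow

GRAMMAR = {"E":  [["T", "E'"]],
           "E'": [["+", "T", "E'"], []],
           "T":  [["F", "T'"]],
           "T'": [["*", "F", "T'"], []],
           "F":  [["(", "E", ")"], ["id"]]}
first = compute_first(GRAMMAR)
follow = compute_follow(GRAMMAR, first, "E")
```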

10. Construct LR(0) Parsing table for the grammar below:


S → AA
A → aA | b
Ans. S → AA
A → aA | b
Add the augment production S` → •S to the I0 state, then add all productions starting with "S" and then with "A", because "•" is followed by a non-terminal. So, the I0 state becomes:
I0= S` → •S
       S → •AA
       A → •aA
       A → •b

I1= Go to (I0, S) = closure (S` → S•) = S` → S•


Here, the Production is reduced so close the State
I1= S` → S•
I2= Go to (I0, A) = closure (S → A•A)
Add all productions starting with A in to I2 State because "•" is followed by the non-terminal. So, the I2 State
becomes

I2 =S→A•A
       A → •aA
       A → •b
Go to (I2,a) = Closure (A → a•A) = (same as I3)
Go to (I2, b) = Closure (A → b•) = (same as I4)

I3= Go to (I0, a) = Closure (A → a•A)

Add all productions starting with "A" into I3 because "•" is followed by the non-terminal. So, the I3 state becomes:
I3= A → a•A
       A → •aA
       A → •b
Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)

I4= Go to (I0, b) = closure (A → b•) = A → b•

I5= Go to (I2, A) = Closure (S → AA•) = S → AA•

I6= Go to (I3, A) = Closure (A → aA•) = A → aA•

LR(0) Table
• If a state goes to some other state on a terminal, it corresponds to a shift move.
• If a state goes to some other state on a variable (non-terminal), it corresponds to a goto move.
• If a state contains a final item (with "•" at the right end), write the reduce action in the entire row.
Explanation:
• I0 on S is going to I1, so write it as 1.
• I0 on A is going to I2, so write it as 2.
• I2 on A is going to I5, so write it as 5.
• I3 on A is going to I6, so write it as 6.
• I0, I2 and I3 on a are going to I3, so write it as S3, which means shift 3.
• I0, I2 and I3 on b are going to I4, so write it as S4, which means shift 4.
• I4, I5 and I6 all contain final items because they contain "•" at the rightmost end, so write the reduce action using the production number.
Productions are numbered as follows:
1. S → AA    ... (1)
2. A → aA    ... (2)
3. A → b     ... (3)
• I1 contains the final item S` → S•, so action {I1, $} = Accept.
• I4 contains the final item A → b•, which corresponds to production number 3, so write r3 in the entire row.
• I5 contains the final item S → AA•, which corresponds to production number 1, so write r1 in the entire row.
• I6 contains the final item A → aA•, which corresponds to production number 2, so write r2 in the entire row.
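The item-set construction above can be checked with a small closure/goto sketch (items are encoded as (head, body, dot-position) triples; the encoding is an illustrative choice, not a fixed convention):

```python
# Augmented grammar:  S' -> S,  S -> AA,  A -> aA | b
GRAMMAR = {"S'": [("S",)], "S": [("A", "A")], "A": [("a", "A"), ("b",)]}

def closure(items):
    """LR(0) closure: keep adding B -> •w for non-terminals after the dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot in list(items):
            if dot < len(body) and body[dot] in GRAMMAR:
                for prod in GRAMMAR[body[dot]]:
                    item = (body[dot], prod, 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return frozenset(items)

def goto(items, symbol):
    """Advance the dot over `symbol`, then close the resulting kernel."""
    kernel = {(h, b, d + 1) for h, b, d in items
              if d < len(b) and b[d] == symbol}
    return closure(kernel)

I0 = closure({("S'", ("S",), 0)})
# I0 = { S' -> •S,  S -> •AA,  A -> •aA,  A -> •b }
I2 = goto(I0, "A")     # { S -> A•A, A -> •aA, A -> •b }
I4 = goto(I0, "b")     # { A -> b• } — a reduce state
```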

11. Construct SLR(1) Parsing table for the grammar below:


S→E
E→E+T|T
T→T*F|F
F → id
Ans.

Explanation:
First (E) = First (E + T) ∪ First (T)
First (T) = First (T * F) ∪ First (F)
First (F) = {id}
First (T) = {id}
First (E) = {id}
Follow (E) = First (+T) ∪ {$} = {+, $}
Follow (T) = First (*F) ∪ Follow (E)
               = {*, +, $}
Follow (F) = {*, +, $}

• I1 contains the final item S → E• and follow (S) = {$}, so action {I1, $} = Accept.

• I2 contains the final item E → T• and follow (E) = {+, $}, so action {I2, +} = R2 and action {I2, $} = R2.

• I3 contains the final item T → F• and follow (T) = {+, *, $}, so action {I3, +} = R4, action {I3, *} = R4 and action {I3, $} = R4.

• I4 contains the final item F → id• and follow (F) = {+, *, $}, so action {I4, +} = R5, action {I4, *} = R5 and action {I4, $} = R5.

• I7 contains the final item E → E + T• and follow (E) = {+, $}, so action {I7, +} = R1 and action {I7, $} = R1.

• I8 contains the final item T → T * F• and follow (T) = {+, *, $}, so action {I8, +} = R3, action {I8, *} = R3 and action {I8, $} = R3.

12. Construct LALR(1) Parsing table for the grammar below:


S → AA
A → aA
A→b
Ans. LALR (1) Parsing:
LALR refers to look-ahead LR. To construct the LALR (1) parsing table, we use the canonical collection of LR (1) items.
In LALR (1) parsing, the LR (1) items which have the same productions but different lookaheads are combined to form a single set of items.
LALR (1) parsing is the same as CLR (1) parsing; the only difference is in the parsing table.
Example
LALR (1) Grammar
1. S → AA
2. A → aA
3. A → b
Add the augment production: S` → S
I0 State:
Add the augment production to the I0 State and compute the closure:

I0 = Closure (S` → •S, $)


Add all productions starting with S in to I0 State because "•" is followed by the non-terminal. So, the I0 State
becomes

I0 = S` → •S, $
        S → •AA, $
Add all productions starting with A in modified I0 State because "•" is followed by the non-terminal. So, the I0
State becomes.

I0= S` → •S, $
       S → •AA, $
       A → •aA, a/b
       A → •b, a/b

I1= Go to (I0, S) = closure (S` → S•, $) = S` → S•, $

I2= Go to (I0, A) = closure ( S → A•A, $ )

LALR (1) Parsing table:

13. Construct First and Follow table for the grammar below:
S → A*B
A → aB | C
B → bC+
C → id
Ans. The construction of a predictive parser is aided by two functions associated with a grammar G:
1. FIRST
2. FOLLOW
Rules for first( ):
1. If X is terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X is non-terminal and X → aα is a production then add a to FIRST(X).
4. If X is a non-terminal and X → Y1 Y2 … Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi), and ε is in all of FIRST(Y1), …, FIRST(Yi-1); that is, Y1 … Yi-1 ⇒* ε. If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to FIRST(X).
Rules for follow( ):
1. If S is a start symbol, then FOLLOW(S) contains $.
2. If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in follow(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in
FOLLOW(A) is in FOLLOW(B).
Example:
Non-Terminal Symbol    First     Follow
S                      a, id     $
A                      a, id     *
B                      b         $, *
C                      id        *, +
14. Classify and compare various bottom-up parsing methods.
Ans. Bottom-up parsing starts from the leaf nodes of a tree and works in an upward direction till it reaches the root node. Here, we start from a sentence and then apply production rules in reverse in order to reach the start symbol. The image given below depicts the bottom-up parsers available.

Shift-Reduce Parsing
Shift-reduce parsing uses two unique steps for bottom-up parsing. These steps are known as shift-step and
reduce-step.
• Shift step: The shift step refers to the advancement of the input pointer to the next input symbol, which is called the shifted symbol. This symbol is pushed onto the stack. The shifted symbol is treated as a single node of the parse tree.
• Reduce step: When the parser finds a complete grammar rule (RHS) and replaces it with its (LHS), it is known as a reduce-step. This occurs when the top of the stack contains a handle. To reduce, a POP function is performed on the stack, which pops off the handle and replaces it with the LHS non-terminal symbol.
LR Parser
The LR parser is a non-recursive, shift-reduce, bottom-up parser. It uses a wide class of context-free grammar
which makes it the most efficient syntax analysis technique. LR parsers are also known as LR(k) parsers, where
L stands for left-to-right scanning of the input stream; R stands for the construction of right-most derivation in
reverse, and k denotes the number of lookahead symbols to make decisions.
There are three widely used algorithms available for constructing an LR parser:
• SLR(1) – Simple LR Parser:
o Works on the smallest class of grammars
o Few states, hence a very small table
o Simple and fast construction
• LR(1) – LR Parser:
o Works on the complete set of LR(1) grammars
o Generates a large table and a large number of states
o Slow construction
• LALR(1) – Look-Ahead LR Parser:
o Works on an intermediate size of grammar
o The number of states is the same as in SLR(1)
15. Compare Lexical Analyzer LEX tool with Yet Another Compiler Compiler (YACC) tool.
Ans. LEX
• Lex is a program that generates a lexical analyzer. It is used with the YACC parser generator.
• The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
• Lex reads the input specification and produces the lexical analyzer as C source code.
The function of Lex is as follows:
• Firstly, the lexical analyzer specification is written as a program lex.l in the Lex language. Then the Lex compiler runs the lex.l program and produces a C program lex.yy.c.
• Next, the C compiler compiles the lex.yy.c program and produces an object program a.out.
• a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.

Lex file format


A Lex program is separated into three sections by %% delimiters. The format of a Lex source file is as follows:
{ definitions }
%%
{ rules }
%%
{ user subroutines }
Definitions include declarations of constants, variables and regular definitions.
YACC
YACC stands for Yet Another Compiler Compiler.
YACC provides a tool to produce a parser for a given grammar.
YACC is a program designed to compile a LALR (1) grammar.
It is used to produce the source code of the syntactic analyzer of the language produced by LALR (1)
grammar.
The input of YACC is the rule or grammar and the output is a C program.
These are some points about YACC:
Input: a CFG (file.y)
Output: a parser y.tab.c (yacc)
The output file "file.output" contains the parsing tables.
The file "file.tab.h" contains declarations.
The parser is invoked by calling yyparse().
The parser expects to use a function called yylex() to get tokens.
