Compiler - Design - CS-603 (C) - MST-1 Solution - 1580200474 - 1580279576
Semantic Analysis
There is more to a front end than simply syntax. The compiler needs semantic information, e.g., the types
(integer, real, pointer to array of integers, etc.) of the objects involved.
Intermediate Code Generation:-
An intermediate representation of the final machine language code is produced. This phase bridges the
analysis and synthesis phases of translation.
Code Optimization :-
This is an optional phase designed to improve the intermediate code so that the output runs faster and takes
less space.
Code Generation:-
The last phase of translation is code generation. A number of optimizations to reduce the length of machine
language program are carried out during this phase.
Symbol-Table Management
The symbol table stores information about program variables that will be used across phases. Typically, this
includes type information and storage location.
Error Handlers
It is invoked when a flaw in the source program is detected. The output of the lexical analyzer is a stream of
tokens, which is passed to the next phase, the syntax analyzer or parser.
Compare interpreter with compiler
Compiler
A Compiler is a translator from one language, the input or source language, to another language, the output or
target language. Often, but not always, the target language is an assembler language or the machine language
for a computer processor.
Executing a program written in a HLL programming language basically has two parts. The source program must
first be compiled (translated) into an object program. The resulting object program is then loaded into memory
and executed.
Interpreter
An interpreter is a program that appears to execute a source program as if it were machine language.
Languages such as BASIC, SNOBOL, and LISP can be translated using interpreters. Java
also uses an interpreter.
2. Demonstrate the outcome of each phase for the example below:
float sum, old_sum, rate;
sum=old_sum+rate*60;
Ans
.
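The worked answer is not reproduced above, but the outcome of the first phases can be sketched in a few lines of Python. The token classes and regular expressions below are illustrative assumptions for this example, not part of the original answer:

```python
import re

# Hypothetical token classes for the statement "sum = old_sum + rate * 60 ;"
TOKEN_SPEC = [
    ("ID",  r"[A-Za-z_]\w*"),   # identifiers: sum, old_sum, rate
    ("NUM", r"\d+"),            # the integer literal 60
    ("OP",  r"[=+*;,]"),        # operators and punctuation
    ("WS",  r"\s+"),            # white space, discarded by the lexer
]

def tokenize(src):
    pattern = "|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC)
    return [(m.lastgroup, m.group())
            for m in re.finditer(pattern, src)
            if m.lastgroup != "WS"]

print(tokenize("sum = old_sum + rate * 60 ;"))
# Lexical analysis yields the token stream
# [('ID', 'sum'), ('OP', '='), ('ID', 'old_sum'), ('OP', '+'),
#  ('ID', 'rate'), ('OP', '*'), ('NUM', '60'), ('OP', ';')]
#
# Intermediate code generation would then emit three-address code such as:
#   t1 = inttofloat(60)
#   t2 = rate * t1
#   t3 = old_sum + t2
#   sum = t3
```

The `inttofloat` conversion reflects the semantic-analysis phase, which detects that the integer literal 60 must be widened to float before the multiplication.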
3. Describe T-diagram and also explain single pass, multi pass and cross compiler.
Ans The rules for T-diagrams are very simple. A compiler written in some language “C” (which could be anything
. from machine code on up) that translates programs in language A to language B looks like this:
Now suppose you have a machine that can directly run HP machine code, and a compiler from ML to HP
machine code, and you want to get an ML compiler running on a different machine code P. You can start by
writing an ML-to-P compiler in ML, and compile that to get an ML-to-P compiler in HP:
One-pass Compiler
• A one-pass compiler traverses the program only once. It passes only once through the parts of each
compilation unit and translates each part into its final machine code.
• In a one-pass compiler, as each source line is processed, it is scanned and the tokens are extracted.
• Then the syntax of each line is analyzed and the tree structure is built. After the semantic part, the
code is generated.
Multi-pass Compiler
• A multi-pass compiler processes the source code of a program several times.
• In the first pass, the compiler reads the source program, scans it, extracts the tokens and stores the result
in an output file.
• In the second pass, the compiler reads the output file produced by the first pass, builds the syntactic tree
and performs the syntactical analysis. The output of this phase is a file that contains the syntactic tree.
Cross-compiler
A compiler that runs on one computer but produces object code for a different type of computer. Cross
compilers are used to generate software that can run on computers with a new architecture or on special-
purpose devices that cannot host their own compilers.
Figure: T-Diagram
Notation:
The process illustrated by the T-diagrams is called bootstrapping and can be summarized by the equation:
1. Convert into (by hand, if necessary). Recall that language S is a subset of language L.
2. Compile to produce , a cross-compiler for L which runs on machine A and produces code for
machine B.
3. Compile with the cross-compiler to produce , a compiler for language L which runs on
machine B. (Alternatively, compiling using gives , a compiler for language L which runs on machine A and
produces code for machine A.)
Need of Bootstrapping
• It is a non-trivial test of the language being compiled and, as such, is a form of dogfooding.
• Compiler developers and the bug-reporting part of the community only need to know the language being
compiled.
• Compiler development can be done in the higher-level language being compiled.
5. Explain Lexical Analyzer with its function and also describe the process of error handling.
Ans Lexical analysis is the first phase of a compiler. It takes the modified source code from language preprocessors,
. written in the form of sentences. The lexical analyzer breaks this syntax into a series of tokens, removing any
white space and comments in the source code.
If the lexical analyzer finds a token invalid, it generates an error. The lexical analyzer works closely with the
syntax analyzer. It reads character streams from the source code, checks for legal tokens, and passes the data
to the syntax analyzer when it demands them.
1. Tokenization, i.e., dividing the program into valid tokens.
2. Removing white space characters.
3. Removing comments.
4. Helping to generate error messages by providing the row number and column number of an error.
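Point 4 can be illustrated with a small sketch. The token classes here are assumptions for the example; the point is that the lexer tracks row and column as it scans, so an invalid character can be reported precisely:

```python
import re

def tokenize(source):
    """A toy lexer that discards white space and reports errors with row/column."""
    spec = [("ID", r"[A-Za-z_]\w*"), ("NUM", r"\d+"),
            ("OP", r"[=+\-*/;]"), ("WS", r"[ \t]+"), ("NL", r"\n")]
    master = re.compile("|".join(f"(?P<{n}>{p})" for n, p in spec))
    tokens, pos, row, col = [], 0, 1, 1
    while pos < len(source):
        m = master.match(source, pos)
        if not m:
            # Error handling: report the exact position of the flaw
            raise SyntaxError(
                f"invalid character {source[pos]!r} at row {row}, column {col}")
        if m.lastgroup == "NL":
            row, col = row + 1, 1            # a newline starts a new row
        else:
            if m.lastgroup != "WS":          # white space is discarded
                tokens.append((m.lastgroup, m.group(), row, col))
            col += len(m.group())
        pos = m.end()
    return tokens

print(tokenize("x = 1;\ny"))
```

Calling `tokenize("a @ b")` raises a `SyntaxError` naming row 1, column 3, which is exactly the kind of message the syntax analyzer and user rely on.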
Error Handling in Compiler Design
The tasks of the error-handling process are to detect each error, report it to the user, and then apply some
recovery strategy to handle it. During this whole process, the processing time of the program should not slow
down noticeably.
Types or Sources of Error – There are two broad types of error, run-time and compile-time errors:
1. A run-time error is an error which takes place during the execution of a program, and usually happens
because of adverse system parameters or invalid input data. The lack of sufficient memory to run an
application, a memory conflict with another program, and logic errors are examples of this. Logic
errors occur when executed code does not produce the expected result; they are best handled
by meticulous program debugging.
2. Compile-time errors arise at compile time, before execution of the program. A syntax error or a missing
file reference that prevents the program from compiling successfully is an example of this.
3. Semantic errors: incompatible value assignment or type mismatch between operator and operand.
The information in the symbol table is collected by the analysis phases of the compiler and is used by the
synthesis phases to generate code.
1. Lexical Analysis: Creates new entries in the table, for example entries for tokens.
2. Syntax Analysis: Adds information regarding attribute type, scope, dimension, line of
reference, use, etc. in the table.
3. Semantic Analysis: Uses available information in the table to check for semantics i.e. to verify
that expressions and assignments are semantically correct(type checking) and update it
accordingly.
4. Intermediate Code generation: Refers to the symbol table to know how much run-time storage is
allocated and of what type; the table also helps in adding temporary variable information.
5. Code Optimization: Uses information present in symbol table for machine dependent
optimization.
6. Target Code generation: Generates code by using address information of identifier present in
the table.
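The phase-by-phase use of the table can be sketched as a toy example. The entry fields (type, scope, size) are illustrative assumptions, not prescribed by the text:

```python
# A shared symbol table: lexical analysis inserts names,
# later phases annotate and consult the same entries.
symbol_table = {}

def insert(name):
    """Lexical analysis: create an entry for each identifier token."""
    symbol_table.setdefault(name, {})

def annotate(name, **attrs):
    """Syntax/semantic analysis: add type, scope, size, etc."""
    symbol_table[name].update(attrs)

insert("sum")
insert("rate")
annotate("sum", type="float", scope="global", size=4)
annotate("rate", type="float", scope="global", size=4)

# Semantic analysis uses the recorded types for type checking,
# e.g. verifying that an assignment sum = rate * 60 is well-typed:
assert symbol_table["sum"]["type"] == symbol_table["rate"]["type"]
print(symbol_table)
```

Code generation would then read the same entries (e.g. `size` and storage location) to emit addresses for each identifier.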
7. Explain Top down parsing and discuss the problems associated with it.
Ans Top-Down Parsing
.
Parsing is classified into two categories, i.e. Top-Down Parsing and Bottom-Up Parsing. Top-Down Parsing is
based on the leftmost derivation, whereas Bottom-Up Parsing traces the rightmost derivation in reverse.
Top-Down Parsing is the process of constructing the parse tree starting from the root and going down to the
leaves.
A top-down parser is constructed from a grammar which is free from ambiguity and left recursion.
Top-down parsers use the leftmost derivation to construct a parse tree.
They also require a grammar which is free from left factoring.
Left recursion.
A grammar becomes left-recursive if it has any non-terminal ‘A’ whose derivation contains ‘A’ itself as the left-
most symbol.
• In case of a left-recursive grammar, the parser can enter an infinite loop.
• Example:
• A → Aα | β
• is rewritten as
• A → βA’
• A’ → αA’ | ε
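The rewrite above can be expressed as a short sketch. The tuple-based grammar encoding (productions as tuples of symbols, `()` for ε) is an assumption of this example:

```python
def eliminate_left_recursion(nt, productions):
    """Rewrite A -> A alpha | beta  as  A -> beta A',  A' -> alpha A' | epsilon.
    Productions are tuples of symbols; () stands for epsilon."""
    recursive = [p[1:] for p in productions if p and p[0] == nt]   # the alphas
    rest      = [p for p in productions if not p or p[0] != nt]    # the betas
    if not recursive:
        return {nt: productions}                 # nothing to eliminate
    new = nt + "'"
    return {
        nt:  [beta + (new,) for beta in rest],                # A  -> beta A'
        new: [alpha + (new,) for alpha in recursive] + [()],  # A' -> alpha A' | eps
    }

print(eliminate_left_recursion("A", [("A", "a"), ("b",)]))
```

Applied to A → Aa | b this yields A → bA' and A' → aA' | ε, matching the rule in the text.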
Left Factoring.
If more than one production rule of a grammar has a common prefix string, then the top-down parser cannot
choose which of the productions it should take to parse the string in hand.
Example
A → αβ1 | αβ2 | ··· | αβm | γ
Solution (left-factored rule)
A → αA'
A' → β1 | β2 | ··· | βm
together with A → γ for the alternative without the common prefix.
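A minimal sketch of the transformation, again with tuple-encoded productions; using a single primed name for the new non-terminal is a simplifying assumption of this example:

```python
def common_prefix(alts):
    """Longest common prefix of a list of symbol tuples."""
    prefix = []
    for column in zip(*alts):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return tuple(prefix)

def left_factor(nt, productions):
    """Factor A -> a b1 | a b2 | g  into  A -> a A' | g,  A' -> b1 | b2."""
    groups = {}
    for p in productions:
        groups.setdefault(p[:1], []).append(p)       # group by first symbol
    rules = {nt: []}
    for first, alts in groups.items():
        if len(alts) == 1:
            rules[nt].extend(alts)                   # nothing to factor
            continue
        alpha = common_prefix(alts)
        new = nt + "'"
        rules[nt].append(alpha + (new,))             # A  -> alpha A'
        rules[new] = [p[len(alpha):] for p in alts]  # A' -> the remainders
    return rules

print(left_factor("A", [("a", "b1"), ("a", "b2"), ("g",)]))
```

For A → ab1 | ab2 | g this produces A → aA' | g and A' → b1 | b2, mirroring the rule above.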
8. Define Shift Reduce and Operator Precedence parsing with Example.
Ans Shift-reduce parsing
. Shift-reduce parsing is a process of reducing a string to the start symbol of a grammar.
Shift-reduce parsing uses a stack to hold grammar symbols and an input tape to hold the string.
Shift-reduce parsing performs two actions, shift and reduce, which is why it is known as shift-reduce
parsing.
In a shift action, the current symbol in the input string is pushed onto the stack.
In a reduce action, the symbols matching the right-hand side of a production are replaced by the
non-terminal on its left-hand side.
Grammar:
1. S → S+S
2. S → S-S
3. S → (S)
4. S → a
Input string:
a1-(a2+a3)
Parsing table:
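Since the full parse trace is not reproduced above, here is a sketch for this grammar, treating a1, a2, a3 as the token a. The greedy reduce-whenever-possible strategy is a simplification that happens to work for this input; it is not a general parsing strategy:

```python
def shift_reduce(tokens):
    """Greedy shift-reduce sketch for S -> S+S | S-S | (S) | a."""
    stack, actions = [], []

    def reduce_all():
        while True:
            if stack[-1:] == ["a"]:                       # S -> a
                stack[-1] = "S"
                actions.append("reduce S -> a")
            elif stack[-3:] == ["(", "S", ")"]:           # S -> (S)
                del stack[-3:]; stack.append("S")
                actions.append("reduce S -> (S)")
            elif len(stack) >= 3 and stack[-1] == "S" \
                    and stack[-2] in "+-" and stack[-3] == "S":
                op = stack[-2]                            # S -> S+S | S-S
                del stack[-3:]; stack.append("S")
                actions.append(f"reduce S -> S{op}S")
            else:
                return

    for tok in tokens:
        stack.append(tok)                                 # shift
        actions.append(f"shift {tok}")
        reduce_all()                                      # then reduce if possible
    return stack, actions

stack, actions = shift_reduce(list("a-(a+a)"))
print(stack)    # a successful parse leaves only the start symbol on the stack
```

Printing `actions` reproduces the shift/reduce column of a parse trace for the input a1-(a2+a3).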
Operator precedence parsing
Operator precedence parsing is a kind of shift-reduce parsing method. It is applied to a small class of operator
grammars.
A grammar is said to be an operator precedence grammar if it has two properties:
No R.H.S. of any production has a ε.
No two non-terminals are adjacent.
Operator precedence relations can only be established between the terminals of the grammar; non-terminals are ignored.
There are the three operator precedence relations:
a ⋗ b means that terminal "a" has higher precedence than terminal "b".
a ⋖ b means that terminal "a" has lower precedence than terminal "b".
a ≐ b means that terminals "a" and "b" have the same precedence.
Parsing Action
Add the $ symbol at both ends of the given input string.
Now scan the input string from left to right until a ⋗ is encountered.
Scan towards the left over all equal-precedence relations until the first (leftmost) ⋖ is encountered.
Everything between the leftmost ⋖ and the rightmost ⋗ is a handle.
$ on $ means parsing is successful.
Precedence table:
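The precedence table itself is not reproduced above; as an illustration, here is an assumed table for the grammar E → E+E | E*E | id and a sketch of the handle-finding procedure just described:

```python
# Assumed precedence relations for E -> E+E | E*E | id
# ('<' = yields precedence / ⋖, '>' = takes precedence / ⋗); $ marks both ends.
PREC = {
    ("id", "+"): ">", ("id", "*"): ">", ("id", "$"): ">",
    ("+", "id"): "<", ("+", "+"): ">", ("+", "*"): "<", ("+", "$"): ">",
    ("*", "id"): "<", ("*", "+"): ">", ("*", "*"): ">", ("*", "$"): ">",
    ("$", "id"): "<", ("$", "+"): "<", ("$", "*"): "<",
}

def parse(tokens):
    stack, handles, i = ["$"], [], 0
    tokens = tokens + ["$"]
    while True:
        top = next(s for s in reversed(stack) if s != "E")  # topmost terminal
        a = tokens[i]
        if top == "$" and a == "$":
            return handles                        # $ on $: parsing successful
        if PREC[(top, a)] in ("<", "="):
            stack.append(a); i += 1               # shift
        else:                                     # '>': a handle ends here
            handle = []
            while True:
                sym = stack.pop()
                handle.append(sym)
                if sym != "E":
                    below = next(s for s in reversed(stack) if s != "E")
                    if PREC[(below, sym)] == "<":
                        break                     # leftmost ⋖ found
            if stack[-1] == "E":                  # include the left operand
                handle.append(stack.pop())
            handles.append("".join(reversed(handle)))
            stack.append("E")                     # reduce the handle to E

print(parse(["id", "+", "id", "*", "id"]))
```

For id + id * id the handles are reduced in the order id, id, id, E*E, E+E, i.e. the multiplication is reduced before the addition, exactly as the table dictates.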
I0 = S` → •S
S → •AA
A → •aA
A → •b
I2 = S → A•A
A → •aA
A → •b
Go to (I2, a) = Closure (A → a•A) = (same as I3)
Go to (I2, b) = Closure (A → b•) = (same as I4)
I3 = A → a•A, plus the productions starting with A added by the closure:
A → •aA
A → •b
Go to (I3, a) = Closure (A → a•A) = (same as I3)
Go to (I3, b) = Closure (A → b•) = (same as I4)
LR(0) Table
If a state goes to some other state on a terminal, then it corresponds to a shift move.
If a state goes to some other state on a variable, then it corresponds to a goto move.
If a state contains a final item, then write the reduce action in the entire row.
Explanation:
I0 on S is going to I1, so write it as 1.
I0 on A is going to I2, so write it as 2.
I2 on A is going to I5, so write it as 5.
I3 on A is going to I6, so write it as 6.
I0, I2 and I3 on a are going to I3, so write it as S3, which means shift 3.
I0, I2 and I3 on b are going to I4, so write it as S4, which means shift 4.
I4, I5 and I6 all contain a final item, because they contain • at the right-most end. So write the
reduce action with the production number.
Productions are numbered as follows:
1. S → AA ... (1)
2. A → aA ... (2)
3. A → b ... (3)
I1 contains the final item which drives (S` → S•), so action {I1, $} = Accept.
I4 contains the final item which drives A → b• and that production corresponds to the production
number 3 so write it as r3 in the entire row.
I5 contains the final item which drives S → AA• and that production corresponds to the production
number 1 so write it as r1 in the entire row.
I6 contains the final item which drives A → aA• and that production corresponds to the production
number 2 so write it as r2 in the entire row.
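The item-set construction above can be checked mechanically. A closure/goto sketch for this grammar (items encoded as (head, body, dot-position) tuples, an assumption of this example):

```python
GRAMMAR = {
    "S'": [("S",)],
    "S":  [("A", "A")],
    "A":  [("a", "A"), ("b",)],
}

def closure(items):
    """Add A -> .gamma for every non-terminal A right after a dot."""
    items = set(items)
    changed = True
    while changed:
        changed = False
        for head, body, dot in list(items):
            if dot < len(body) and body[dot] in GRAMMAR:
                for prod in GRAMMAR[body[dot]]:
                    item = (body[dot], prod, 0)
                    if item not in items:
                        items.add(item)
                        changed = True
    return frozenset(items)

def goto(items, symbol):
    """Advance the dot over `symbol`, then take the closure."""
    return closure({(h, b, d + 1) for h, b, d in items
                    if d < len(b) and b[d] == symbol})

I0 = closure({("S'", ("S",), 0)})
I2 = goto(I0, "A")      # S -> A.A plus the closure over A
I3 = goto(I0, "a")      # A -> a.A plus the closure over A
print(len(I0), len(I2), len(I3))
```

Running this confirms the table's structure, in particular that Go to (I3, a) loops back to I3 and that Go to (I2, b) and Go to (I0, b) are the same state I4.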
Explanation:
First (E) = First (E + T) ∪ First (T)
First (T) = First (T * F) ∪ First (F)
First (F) = {id}
First (T) = {id}
First (E) = {id}
Follow (E) = First (+T) ∪ {$} = {+, $}
Follow (T) = First (*F) ∪ Follow (E)
= {*, +, $}
Follow (F) = Follow (T) = {*, +, $}
I1 contains the final item which drives S → E• and follow (S) = {$}, so action {I1, $} = Accept
I2 contains the final item which drives E → T• and follow (E) = {+, $}, so action {I2, +} = R2, action {I2, $}
= R2
I3 contains the final item which drives T → F• and follow (T) = {+, *, $}, so action {I3, +} = R4, action {I3,
*} = R4, action {I3, $} = R4
I4 contains the final item which drives F → id• and follow (F) = {+, *, $}, so action {I4, +} = R5, action
{I4, *} = R5, action {I4, $} = R5
I7 contains the final item which drives E → E + T• and follow (E) = {+, $}, so action {I7, +} = R1, action
{I7, $} = R1
I8 contains the final item which drives T → T * F• and follow (T) = {+, *, $}, so action {I8, +} = R3, action
{I8, *} = R3, action {I8, $} = R3.
I0 = S` → •S, $
S → •AA, $
Add all productions starting with A to the modified I0 state, because "•" is followed by the non-terminal A. So, the I0
state becomes:
I0= S` → •S, $
S → •AA, $
A → •aA, a/b
A → •b, a/b
13. Construct First and Follow table for the grammar below:
S → A*B
A → aB/C
B → bC+
C->id
Ans The construction of a predictive parser is aided by two functions associated with a grammar G :
. 1. FIRST
2. FOLLOW
Rules for first( ):
1. If X is terminal, then FIRST(X) is {X}.
2. If X → ε is a production, then add ε to FIRST(X).
3. If X is non-terminal and X → aα is a production then add a to FIRST(X).
4. If X is non-terminal and X → Y1 Y2 … Yk is a production, then place a in FIRST(X) if for some i, a is in FIRST(Yi),
and ε is in all of FIRST(Y1), …, FIRST(Yi-1); that is, Y1 … Yi-1 ⇒* ε. If ε is in FIRST(Yj) for all j = 1, 2, …, k, then add ε to
FIRST(X).
Rules for follow( ):
1. If S is a start symbol, then FOLLOW(S) contains $.
2. If there is a production A → αBβ, then everything in FIRST(β) except ε is placed in follow(B).
3. If there is a production A → αB, or a production A → αBβ where FIRST(β) contains ε, then everything in
FOLLOW(A) is in FOLLOW(B).
Example:
Non Terminal Symbol First Follow
S a, id $
A a, id *
B b $, *
C id *, +
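The table can be verified mechanically. A sketch for this grammar (no production derives ε, which simplifies both computations; the tuple encoding is an assumption of this example):

```python
GRAMMAR = {
    "S": [("A", "*", "B")],
    "A": [("a", "B"), ("C",)],
    "B": [("b", "C", "+")],
    "C": [("id",)],
}

def first_of(sym, first):
    return first[sym] if sym in GRAMMAR else {sym}

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                # no epsilon-productions, so FIRST of a production
                # is just FIRST of its first symbol
                add = first_of(prod[0], first)
                if not add <= first[nt]:
                    first[nt] |= add
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow["S"].add("$")                     # rule 1: $ follows the start symbol
    changed = True
    while changed:
        changed = False
        for nt, prods in GRAMMAR.items():
            for prod in prods:
                for i, sym in enumerate(prod):
                    if sym not in GRAMMAR:
                        continue
                    # rule 2: FIRST of what follows; rule 3: FOLLOW of the head
                    add = (first_of(prod[i + 1], first)
                           if i + 1 < len(prod) else follow[nt])
                    if not add <= follow[sym]:
                        follow[sym] |= add
                        changed = True
    return follow

first = compute_first()
follow = compute_follow(first)
print(first["S"], follow["B"])
```

Note that FIRST(S) = FIRST(A) = {a, id}, since A → C and C → id contribute id alongside a.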
14. Classify and compare various bottom-up parsing methods.
Ans Bottom-up parsing starts from the leaf nodes of a tree and works in upward direction till it reaches the root
. node. Here, we start from a sentence and then apply production rules in reverse manner in order to reach the
start symbol. The image given below depicts the bottom-up parsers available.
Shift-Reduce Parsing
Shift-reduce parsing uses two unique steps for bottom-up parsing. These steps are known as shift-step and
reduce-step.
Shift step: The shift step refers to the advancement of the input pointer to the next input symbol,
which is called the shifted symbol. This symbol is pushed onto the stack. The shifted symbol is treated
as a single node of the parse tree.
Reduce step : When the parser finds a complete grammar rule (RHS) and replaces it with its (LHS), it is
known as a reduce-step. This occurs when the top of the stack contains a handle. To reduce, a POP
function is performed on the stack, which pops off the handle and replaces it with the LHS non-terminal
symbol.
LR Parser
The LR parser is a non-recursive, shift-reduce, bottom-up parser. It handles a wide class of context-free grammars,
which makes it a very efficient syntax analysis technique. LR parsers are also known as LR(k) parsers, where
L stands for left-to-right scanning of the input stream, R stands for the construction of the rightmost derivation in
reverse, and k denotes the number of lookahead symbols used to make decisions.
There are three widely used algorithms available for constructing an LR parser:
SLR(1) – Simple LR Parser:
o Works on the smallest class of grammars
o Few states, hence a very small table
o Simple and fast construction
LR(1) – LR Parser:
o Works on complete set of LR(1) Grammar
o Generates large table and large number of states
o Slow construction
LALR(1) – Look-Ahead LR Parser:
o Works on an intermediate size of grammar
o The number of states is the same as in SLR(1)
15. Compare Lexical Analyzer LEX tool with Yet Another Compiler Compiler (YACC) tool.
Ans LEX
. Lex is a program that generates a lexical analyzer. It is used with the YACC parser generator: Lex produces the
scanner, while YACC (Yet Another Compiler Compiler) generates a parser from a grammar specification.
The lexical analyzer is a program that transforms an input stream into a sequence of tokens.
Lex reads the input specification and produces C source code that implements the lexical analyzer.
The function of Lex is as follows:
First, the programmer writes a specification lex.l in the Lex language. The Lex compiler then processes the lex.l
program and produces a C program lex.yy.c.
Finally, the C compiler compiles the lex.yy.c program and produces an object program a.out.
a.out is the lexical analyzer that transforms an input stream into a sequence of tokens.