
Q1.

Construct NFA equivalent to regular expression r= (a + b)* ab


Answer:
Using Thompson's construction, an NFA is built for each subexpression and the pieces are then combined.

NFA for ab: 0 -a-> 1 -b-> 2

NFA for a + b: a new start state with ε-moves into one branch for a and one branch for b, whose accepting states ε-move to a common final state.

NFA for (a + b)*: the a + b machine is wrapped with a new start state and a new final state connected by ε-moves, with an ε-move from the machine's final state back to its start state to allow repetition.

Combining these and simplifying gives an NFA for (a + b)*ab with start state 0 and accepting state 2:

0 -a-> 0    0 -b-> 0    0 -a-> 1    1 -b-> 2

Q2. Give general format for LEX program


Answer: The general format for LEX Program is:
(Definition section)
%%
(Rules section)
%%
(User Subroutines Section)
Q3. Show that the grammar S -> 0S1| SS |ϵ is ambiguous

Answer: The grammar S -> 0S1 | SS | ϵ is ambiguous because some strings have more than one parse tree (equivalently, more than one leftmost derivation). For example, the string 01 has the two leftmost derivations:

Derivation 1: S ⇒ 0S1 ⇒ 01            (using S -> 0S1, then S -> ϵ)

Derivation 2: S ⇒ SS ⇒ 0S1S ⇒ 01S ⇒ 01 (using S -> SS, then S -> 0S1, S -> ϵ, S -> ϵ)

Since 01 has two distinct parse trees, the grammar is ambiguous.

Ammar Usman Sabo COM 413 Assignment H21CS018


Q4. Draw a block diagram of a compiler phases and their functions
Answer: The phases of the compiler and their functions are presented below.

COMPILER PHASES AND THEIR FUNCTIONS


Phase I: Lexical Analysis

The first phase of the compiler splits the source code into lexemes, which are
individual code fragments that represent specific patterns in the code. The
lexemes are then tokenized in preparation for syntax and semantic analysis.

Phase II: Syntax Analysis

The next phase is syntax analysis, also called parsing. The compiler verifies that
the code's syntax is correct, based on the rules of the source language. During
this step, the compiler typically creates abstract syntax trees that represent the
logical structure of specific code elements.

Phase III: Semantic Analysis


In semantic analysis, the compiler verifies the validity of the code's logic. This
step goes beyond syntax analysis by validating the code's accuracy. For example,
the semantic analysis might check whether variables have been assigned the right
types or have been properly declared.
Phase IV: Intermediate Code Generation
After the code passes through all three analysis phases, the compiler generates an
intermediate representation (IR) of the source code. The IR code makes it easier
to translate the source code into a different format. However, it must accurately
represent the source code in every respect, without omitting any functionality.
Phase V: Code Optimization
The compiler optimizes the IR code in preparation for the final code generation.
The type and extent of optimization depends on the compiler. Some compilers
let users configure the degree of optimization.
Phase VI: Code Generation

In this phase, the compiler generates the final output code, using the optimized
IR code.

Q5. What is the difference between compiler and interpreter

Answer: The main difference between a compiler and an interpreter is in how


they execute code. A compiler translates the entire source code into machine code
or bytecode before execution, while an interpreter executes the source code
directly, line by line. The table below summarizes the differences between a
compiler and an interpreter.

Compiler                                           Interpreter
Scans the entire program first and translates      Scans the program line by line and translates
it into machine code                               it into machine code
Shows all errors and warnings at the same time     Shows one error at a time
Errors are reported after scanning the whole       Errors are reported after scanning each line
program
Debugging is slower                                Debugging is faster
Execution time is less                             Execution time is more
Used by languages such as C, C++, etc.             Used by languages such as Java, Python, etc.



Q6. What are the advantages of (a) a compiler over an interpreter (b) an interpreter
over a compiler?

Answer: The advantages of a compiler over an interpreter, and of an interpreter over a
compiler, are given below:

(a) Compiler over an interpreter:

- Faster execution: since the code is translated into machine code or bytecode
  before execution, the compiled code can be executed faster than interpreted code.
- Better optimization: compilers can perform more advanced optimizations on the
  code, resulting in faster and more efficient code.
- Portability: compiled code can be run on any machine that supports the target
  architecture, without requiring the compiler to be present.

(b) Interpreter over a compiler:

- Easier debugging: since the code is executed line by line, it is easier to debug
  and test the code.
- Dynamic typing: interpreted languages often support dynamic typing, which allows
  for more flexible and expressive code.
- Faster development: interpreted languages often have a shorter development cycle,
  since changes to the code can be tested immediately without requiring a
  compilation step.

Q7. Write the quadruple, triple, indirect triple for the expression -(a*b) + (c+d)-
(a+b+c+d)

Answer: The three-address code for -(a*b) + (c+d) - (a+b+c+d) is:

t1 = a * b
t2 = uminus t1
t3 = c + d
t4 = t2 + t3
t5 = a + b
t6 = t5 + c
t7 = t6 + d
t8 = t4 - t7

Quadruples:

        op        arg1    arg2    result
(0)     *         a       b       t1
(1)     uminus    t1              t2
(2)     +         c       d       t3
(3)     +         t2      t3      t4
(4)     +         a       b       t5
(5)     +         t5      c       t6
(6)     +         t6      d       t7
(7)     -         t4      t7      t8

Triples:

        op        arg1    arg2
(0)     *         a       b
(1)     uminus    (0)
(2)     +         c       d
(3)     +         (1)     (2)
(4)     +         a       b
(5)     +         (4)     c
(6)     +         (5)     d
(7)     -         (3)     (6)

Indirect triples: the triples above are stored as before, and a separate list of
pointers gives the order of execution:

(35) -> (0)   (36) -> (1)   (37) -> (2)   (38) -> (3)
(39) -> (4)   (40) -> (5)   (41) -> (6)   (42) -> (7)



Q8. Write an algorithm for constructing the dependency graph for a given parse tree

Answer:
for each node n in the parse tree do
    for each attribute a of the grammar symbol at n do
        construct a node in the dependency graph for a;
for each node n in the parse tree do
    for each semantic rule b = f(c1, c2, ..., ck) associated with the production used at n do
        for i = 1 to k do
            construct an edge from the node for ci to the node for b;

Q9. Discuss about error recovery strategies in predictive parsing

Answer: There are four common error-recovery strategies that can be implemented in the
parser to deal with errors in the code. Panic Mode, Statement Mode, Error Productions and
Global Corrections all are error recovery strategies in predictive parsing.

The error recovery strategy depends on the specific error that is encountered. For example,
panic-mode error recovery is a good choice for recovering from syntax errors, while phrase-
level error recovery is a good choice for recovering from semantic errors.

Beyond these error recovery strategies, there are a number of other techniques that can be used to improve
the robustness of predictive parsers. For example, the parser can be designed to generate
more informative error messages. The parser can also be designed to collect more
information about the input during the parsing process. This information can be used to
improve the accuracy of error recovery.

Q10. Explain the role of semantic preserving transformations and dominator in code
optimization

Answer: Semantic preserving transformations are transformations that change the form of a
program without changing its meaning. These transformations can be used to improve the
efficiency of a program by making it smaller, faster, or easier to understand.



Some common semantic preserving transformations include:

i. Common subexpression elimination (CSE): CSE eliminates duplicate computations


by storing the result of a computation the first time it is performed and using the
stored value instead of recomputing the expression.
ii. Copy propagation (CP): CP propagates the value of a variable throughout a program
by replacing all occurrences of the variable with its value.
iii. Dead code elimination (DCE): DCE removes code that is never executed.
iv. Constant folding (CF): CF evaluates constant expressions at compile time and
replaces them with their values.

A dominator plays a complementary role in code optimization. In a flow graph, a node d
dominates a node n if every path from the entry node to n passes through d. Dominator
information is used to identify loops: an edge n -> d whose head d dominates its tail n is a
back edge, and it defines a natural loop with header d. These natural loops are the regions
on which loop optimizations such as code motion are applied.
Q11. Explain with suitable example various sources of loop optimization.

Answer: Loop optimization is an important technique used by compilers to improve the


performance of code that contains loops. Here are some sources of loop optimization:

i. Loop unrolling: This technique involves duplicating the loop body multiple times
to reduce the overhead of loop control instructions. For example, if a loop iterates
10 times, unrolling it by a factor of 2 would result in two copies of the loop body,
each iterating 5 times.
ii. Loop fusion: This technique involves combining multiple loops that operate on the
same data into a single loop. This can reduce the overhead of loop control
instructions and improve cache locality.
iii. Loop interchange: This technique involves changing the order of nested loops to
improve cache locality. For example, if a loop iterates over rows of a matrix and
another loop iterates over columns, interchanging the loops can improve cache
performance.
iv. Loop-invariant code motion: This technique involves moving code that does not
depend on the loop index outside of the loop. This can reduce the number of
instructions executed inside the loop and improve performance.
v. Strength reduction: This technique involves replacing expensive operations inside
the loop with cheaper operations. For example, replacing multiplication with
addition or bit shifting can improve performance.
vi. Loop tiling: This technique involves dividing a large loop into smaller loops that
operate on blocks of data. This can improve cache performance and reduce the
overhead of loop control instructions.



Q12. List the properties of LR Parser

Answer: The LR parser is a type of bottom-up parser that uses a wide class of context-free
grammar, making it the most efficient syntax analysis technique. Here are some properties of
the LR parser:

i. Efficiency: The LR parser is the most efficient syntax analysis technique, as it can
handle a wide class of context-free grammar.
ii. Non-recursive: The LR parser is a non-recursive parser, which means that it does
not use recursion to parse the input.
iii. Shift-reduce: The LR parser uses a shift-reduce technique to parse the input. In the
shift step, the input pointer is advanced to the next input symbol, which is pushed
onto the stack. In the reduce step, the parser finds a complete grammar rule and
replaces it with the left-hand side non-terminal symbol.
iv. Bottom-up: The LR parser is a bottom-up parser, which means that it starts from
the leaf nodes of a tree and works in an upward direction until it reaches the root
node.
v. LR(k): The LR parser is also known as LR(k) parser, where L stands for left-to-
right scanning of the input stream, R stands for the construction of right-most
derivation in reverse, and k denotes the number of lookahead symbols to make
decisions.
vi. Three algorithms: There are three widely used algorithms available for constructing
an LR parser: SLR(1), LR(1), and LALR(1). The SLR(1) algorithm works on the
smallest class of grammar, while the LR(1) algorithm works on the complete set of
LR(1) grammar. The LALR(1) algorithm works on an intermediate size of grammar.

By using an LR parser, compilers can efficiently parse the input and generate the parse tree,
which is used for further analysis and optimization.

 An LR parser can be constructed to recognize most programming-language constructs for
which a CFG can be written.
 The class of grammars that can be parsed by an LR parser is a superset of the class of
grammars that can be parsed using predictive parsers.
 An LR parser works using a non-backtracking shift-reduce technique.
Q13. Write the need of semantic analysis



Answer: Semantic analysis is the task of ensuring that the declarations and statements of a
program are semantically correct, i.e., that their meaning is clear and consistent with the
way in which control structures and data types are supposed to be used.

Q14. What advantages are there to a language-processing system in which the compiler
produces assembly language rather than machine language?

Answer: There are several advantages to using a language-processing system that produces
assembly language rather than machine language. Assembly language is a more portable,
maintainable, reusable, and optimizable form of code than raw machine code, and it is easier
to generate and to debug because instruction encoding and address resolution are left to the
assembler. Additionally, assembly language is a good way to learn about how computers work.

Q15. A compiler that translate high-level language into another high-level language is called
source-to-source translator, explain this process.

A source-to-source translator (S2S translator), also known as a transcompiler or transpiler, is a


type of compiler that translates a program written in one high-level language into another high-
level language. The goal of an S2S translator is to produce a program that is semantically
equivalent to the original program, but that is written in a different language.

The process of source-to-source translation typically involves the following steps:

Lexical analysis: The input program is broken down into a sequence of tokens, which are basic
units of the language such as keywords, identifiers, and operators.

Syntactic analysis: The sequence of tokens is parsed to create a syntax tree, which is a
representation of the structure of the program.

Semantic analysis: The syntax tree is analyzed to check for semantic errors, such as undeclared
variables and type mismatches.

Intermediate code generation: An intermediate representation (IR) of the program is generated.


The IR is a language-independent representation of the program that can be easily translated
into another language.

Code generation: The IR is translated into the target language. A source-to-source translator
converts between programming languages that operate at approximately the same level
of abstraction, while a traditional compiler translates from a higher level programming
language to a lower level programming language. For example, a source-to-source translator



may perform a translation of a program from Python to JavaScript, while a traditional compiler
translates from a language like C to assembler or Java to bytecode.

Q16. What advantages are there to using C as a target language for a compiler?

C is an excellent target language since it is:

• low level
• easy to generate
• can be written in an architecture-independent manner
• highly available
• has good optimizers
Q17. Describe some of the tasks that an assembler needs to perform
Among the tasks that the assembler needs to perform are:
• It takes an assembly language program as its source code.
• It allows direct hardware access, for example for device drivers.
• It produces machine code.
• It provides error information for the assembly language programmer.
• It provides machine code information (such as a listing) for the assembly language programmer.
• It assigns memory for instructions and data.
Q18. Define preprocessor. What are the functions of pre-processor?
The Preprocessor is not a part of the compiler, but is a separate step in the compilation
process. In simple terms, a preprocessor is just a text substitution tool and it instructs the
compiler to do required pre-processing before the actual compilation. We'll refer to the
Preprocessor as CPP.
All preprocessor commands begin with a hash symbol (#). It must be the first nonblank
character and, for readability, a preprocessor directive should begin in the first column.
Important preprocessor directives include #include, #define, #undef, #ifdef, #ifndef, and #endif.
What are the functions of pre-processor?

Preprocessor directives in C programming language are used to define and replace tokens in
the text and also used to insert the contents of other files into the source file.
When we try to compile a program, preprocessor commands are executed first and then the
program gets compiled.
Q19. Discuss about the syntax error handling

Syntax Error Handling


Syntax Error



Syntax or Syntactic errors are the errors that arise during syntax analysis. These errors
can be the incorrect usage of semicolons, extra braces, or missing braces.
In C or Java, syntactic errors could be a case statement without enclosing the switch.
Syntactic Error Detection
We can detect syntax errors through the use of precise parsing methods. Parsing
methods such as LL and LR can detect errors as soon as they occur. They possess
the viable-prefix property, according to which an error is reported as soon as a prefix
of the input has been seen that cannot be extended to form a string of the language.
Equivalently, an error is triggered upon encountering a sequence of tokens from the
lexical analyzer that cannot be parsed any further according to the language grammar.
However, the error handler has to achieve the following goals during parsing.
 Report the presence of errors clearly and precisely; the report must at least
indicate the location of the error within the source program.
 Impose low overhead on the processing of error-free programs.
 Recover from an error as quickly as possible, so that subsequent errors can be detected.
Q20. Differentiate between shift-reduce and Operator Precedence parsers

A shift-reduce parser is a type of bottom-up parser. It generates the parse tree from the
leaves to the root. In a shift-reduce parser, the input string is reduced to the start symbol.
This reduction traces a rightmost derivation (which runs from the start symbol to the input
string) in reverse.
An operator precedence parser is a bottom-up parser that interprets an operator grammar.
This parser is only used for operator grammars. Ambiguous grammars are not allowed in any
parser except operator precedence parser.
Q21. What are the benefits of intermediate code generation.

 It is machine independent: the same intermediate code can be targeted to different
platforms.

 It makes code optimization easier: a machine-independent code optimizer can be applied
to the intermediate code.
 It enables efficient code generation.
 From an existing front end, a new compiler for a given back end can be generated.
 Syntax-directed translation implements intermediate code generation, so by augmenting
the parser it can be folded into the parsing.



Q22. What are the various attribute of symbol table?
1. Name
2. Data type
3. Size of the data type
4. Dimension
5. Line of declaration
6. Line of usage
7. Address
Q23. Mention the issues to be considered while applying the techniques for optimization.

The code optimization in the synthesis phase is a program transformation technique, which
tries to improve the intermediate code by making it consume fewer resources (i.e. CPU,
Memory) so that faster-running machine code will result. Compiler optimizing process
should meet the following objectives:

 The optimization must be correct; it must not, in any way, change the meaning of the
program.
 Optimization should increase the speed and performance of the program.
 The compilation time must be kept reasonable.
 The optimization process should not delay the overall compiling process.

Q24. Discuss in detail the role of dead code elimination and strength reduction during code
optimization of a compiler.

Dead code elimination removes unneeded instructions from the program.

Dead code is a section in the source code of a program, which is executed but whose result is
never used in any other computation.

Eliminating dead code avoids wasting computation time and memory.

Strength reduction often eliminates all uses of an induction variable, except for an end-of-
loop test. In that case, the compiler may be able to rewrite the end-of-loop test to use another
induction variable found in the loop. If the compiler can remove this last use, it can eliminate
the original induction variable as dead code.



Q25. Briefly explain about lexical error

Lexical Error

This type of error is detected during the lexical analysis phase.

A lexical error is a sequence of characters that does not match the pattern of any token.
Lexical-phase errors are found during the lexical analysis of the program, not at execution time.

A lexical-phase error can be:

o A spelling error.
o Exceeding the length limit of an identifier or a numeric constant.
o The appearance of illegal characters.
o A missing character that should be present.
o A character replaced by an incorrect character.
o Transposition of two characters.

Q26. Write regular expressions for the set of words having a,e,i,o,u appearing in that order,
although not necessarily consecutively.

[^aeiou]*a[^aeiou]*e[^aeiou]*i[^aeiou]*o[^aeiou]*u[^aeiou]*

Q27. Define lexeme, token and pattern. Identify the lexemes that make up the tokens in the
following program segment. Indicate corresponding token and pattern. void swap(int i, int
j) { int t; t=i; i=j; j=t; }

Token: A token is a category of lexical units, i.e. a valid sequence of characters recognized
by the lexical analyzer and represented by a lexeme. In a programming language,
 Keywords
 Constants
 Identifiers
 Numbers
 Operators and
 Punctuation symbols
are possible tokens to be identified.
Lexeme: A lexeme is a sequence of characters in the source that matches the pattern for a token
and is identified by the lexical analyzer as an instance of the token.
Pattern: Pattern describe the rule that must be matched by sequence of characters (Lexeme)
to form a token, it can be defined by regular expressions or grammar rules. In the case of a
token, the pattern is just the sequence of characters that form the keyword.
The lexemes that make up the tokens in the following program segment are:

Lexeme   Token                 Pattern

void     keyword               void
swap     identifier            [a-zA-Z_][a-zA-Z0-9_]*
(        left parenthesis      (
int      keyword               int
i        identifier            [a-zA-Z_][a-zA-Z0-9_]*
,        comma                 ,
int      keyword               int
j        identifier            [a-zA-Z_][a-zA-Z0-9_]*
)        right parenthesis     )
{        left curly brace      {
int      keyword               int
t        identifier            [a-zA-Z_][a-zA-Z0-9_]*
;        semicolon             ;
t        identifier            [a-zA-Z_][a-zA-Z0-9_]*
=        assignment operator   =
i        identifier            [a-zA-Z_][a-zA-Z0-9_]*
;        semicolon             ;
i        identifier            [a-zA-Z_][a-zA-Z0-9_]*
=        assignment operator   =
j        identifier            [a-zA-Z_][a-zA-Z0-9_]*
;        semicolon             ;
j        identifier            [a-zA-Z_][a-zA-Z0-9_]*
=        assignment operator   =
t        identifier            [a-zA-Z_][a-zA-Z0-9_]*
;        semicolon             ;
}        right curly brace     }

Q28. Compare bottom up approach of parsing with all top down approach

Top Down Parsing

Top-down parsing is a technique that starts from the top of the parse tree (the root),
moves downwards, and evaluates the rules of the grammar.

Bottom Up Parsing

Bottom-up parsing is a technique that starts from the lowest level of the parse tree
(the leaves), moves upwards, and evaluates the rules of the grammar.

Following are some of the important differences between Top Down Parsing and Bottom Up
Parsing.



Top Down Parsing                               Bottom Up Parsing
Starts evaluating the parse tree from the      Starts evaluating the parse tree from the
top and moves downwards for parsing            lowest level of the tree and moves upwards
other nodes.                                   for parsing the nodes.
Attempts to find the leftmost derivation       Attempts to reduce the input string to the
for a given string.                            start symbol of the grammar.
Uses leftmost derivation.                      Uses rightmost derivation in reverse.
Searches for a production rule to be used      Searches for a production rule to be used
to construct a string.                         to reduce a string toward the start symbol
                                               of the grammar.

Q29. Construct the syntax tree and draw the DAG for the expression (a*b) + (c-d) * (a*b) + b

Answer: In the syntax tree, every operator gets its own node, so the common subexpression
a*b appears twice. In the DAG, common subexpressions are shared: there is a single node for
a*b, referenced both by the first + and by the * node for (c-d) * (a*b), and the leaf b is
shared between the a*b node and the final + b.

Q30. What is peephole optimization? Explain its characteristics

Ammar Usman Sabo COM 413 Assignment H21CS018


Peephole optimization is a technique for locally improving the target code. It is done
by examining a sliding window of target instructions (the peephole) and replacing the
instruction sequence within the peephole by a shorter or faster sequence wherever possible.
Characteristic peephole optimizations include:
 Redundant instruction elimination
 Elimination of unreachable code
 Algebraic simplifications
 Use of machine idioms
Q31. I have run the program on Dev C++. The program illustrates a simple lexical analyzer,
or tokenizer: it takes a string as input and tokenizes it into keywords, identifiers,
numbers, and operators. The parse function serves as the main logic of the program; it
takes a string as input and iterates through each character, grouping characters into
lexemes and classifying them.

