Compiler Construction Solved by Noman Tariq
Q#2
a) S → 0S1 | 01
This grammar generates strings of the form 0^n 1^n (n ≥ 1): a block of 0's followed by an equal-length block of 1's. Each application of S → 0S1 adds one matching pair, and the derivation ends with S → 01, giving strings like 0011, 000111, 00001111, and so on.
This grammar is not ambiguous, because each string it generates has exactly one parse tree.
b) S → +SS | -SS | a
This grammar generates prefix expressions over 'a' combined with the binary operators '+' and '-'. It is not ambiguous: because the operator always precedes its two operands, every string has exactly one parse tree.
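Because the operator comes before its operands (prefix notation), the next input symbol alone decides which production to use. A minimal recursive parser (my own sketch; the function names are invented, not from the notes) makes this deterministic choice visible:

```python
def parse_s(s, i=0):
    """Parse one S starting at index i; return (tree, next_index)."""
    if i >= len(s):
        raise ValueError("unexpected end of input")
    if s[i] == 'a':                       # S -> a
        return 'a', i + 1
    if s[i] in '+-':                      # S -> +SS | -SS
        left, j = parse_s(s, i + 1)       # first operand
        right, k = parse_s(s, j)          # second operand
        return (s[i], left, right), k
    raise ValueError(f"unexpected symbol {s[i]!r}")

def parse(s):
    tree, end = parse_s(s)
    if end != len(s):
        raise ValueError("trailing input")
    return tree
```

For example, parse("+a-aa") yields ('+', 'a', ('-', 'a', 'a')); at no point does the parser face a choice between productions.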
c) S->aSbS | bSaS | ∈
This grammar generates strings with a pattern of alternating sequences of 'a's and 'b's, repeated
twice, and nested within each other.
This grammar is ambiguous because If we want to generate the string “abab” from the above
grammar. We can observe that the given string can be derived using two parse trees
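The ambiguity can be checked mechanically by counting parse trees. The sketch below (my own illustration, not part of the notes) counts how many parse trees the grammar assigns to a string; any count greater than 1 demonstrates ambiguity:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count_parses(s):
    """Number of parse trees for s under S -> aSbS | bSaS | epsilon."""
    total = 1 if s == "" else 0             # S -> epsilon
    if s and s[0] == 'a':                   # S -> a S b S
        for i in range(1, len(s)):
            if s[i] == 'b':
                total += count_parses(s[1:i]) * count_parses(s[i+1:])
    if s and s[0] == 'b':                   # S -> b S a S
        for i in range(1, len(s)):
            if s[i] == 'a':
                total += count_parses(s[1:i]) * count_parses(s[i+1:])
    return total
```

Running count_parses("abab") gives 2, matching the two parse trees described above.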
Q#3
a)
b)
c)
Q#4
a)
A language-processing system offers several advantages:
Efficiency: Compilers can optimize code during the translation process, which leads to more
efficient execution of the program by reducing memory usage and increasing speed.
Error Detection: Language-processing systems can perform thorough checks for errors before the
execution phase. This helps in identifying and resolving issues such as syntax and semantic errors
early in the development process.
Security: Compilers can enhance security by incorporating checks and balances that prevent
unsafe operations, such as buffer overflows and other vulnerabilities.
Portability: With a compiler, the source code of a program can be written in a high-level, human-
readable form, then compiled into the machine language of multiple platforms without changing
the high-level source code, enhancing the portability of applications.
A compiler that produces assembly language rather than machine language is sometimes called an assembly-level compiler (not to be confused with an assembler, which performs the final translation step). Such a compiler translates high-level language code into assembly language, which a separate assembler then converts into machine code. This approach allows more control over hardware-specific optimizations and is typically used in systems programming and performance-critical applications.
B)
Q#2
Define interpreter
An interpreter is a language translator that executes a high-level language program one statement at a time, translating each statement into machine actions as it goes. Interpreted programs generally run more slowly than compiled ones, because the interpreter scans and translates only one statement of the program at a time; the source code is converted into machine actions during the execution of the program rather than beforehand.
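The line-at-a-time behaviour can be sketched as a toy interpreter. This is my own illustration (the tiny language of assignments and print statements is invented, not from the notes):

```python
def interpret(source):
    """Execute a program of 'name = expr' and 'print expr' lines, one at a time."""
    env, output = {}, []
    for line in source.strip().splitlines():   # translate-and-run line by line
        line = line.strip()
        if line.startswith("print "):
            output.append(eval(line[6:], {}, env))
        else:
            name, expr = line.split("=", 1)
            env[name.strip()] = eval(expr, {}, env)
    return output
```

For example, interpret("x = 2\ny = x + 3\nprint x * y") executes three statements in order and returns [10]; nothing is translated until its line is reached.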
What are the different phases in the compilation process? Write the names of the phases only.
Lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization, and code generation (supported throughout by symbol-table management and error handling).
Q#3
Suppose the following programming statement, written in the C++ programming language. Convert it into tokens and build its abstract syntax tree (AST): x = x + 3 * 5
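A sketch of the answer (the token names and tuple representation are my own choices): split the statement into tokens with a simple scanner, then build the AST by hand following the usual precedence, where '*' binds tighter than '+' and '=' is applied last:

```python
import re

# Named groups classify each lexeme as an identifier, number, or operator.
TOKEN_RE = re.compile(r"(?P<ID>[A-Za-z_]\w*)|(?P<NUM>\d+)|(?P<OP>[=+*])")

def tokenize(src):
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(src)]

tokens = tokenize("x=x+3*5")
# [('ID','x'), ('OP','='), ('ID','x'), ('OP','+'), ('NUM','3'), ('OP','*'), ('NUM','5')]

# AST as nested tuples: '=' at the root, '+' below it, '*' deepest.
ast = ('=', 'x', ('+', 'x', ('*', '3', '5')))
```

The AST mirrors the tree one would draw by hand: the assignment is the root, its right child is the '+' node, and the '*' node sits below '+' because multiplication is evaluated first.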
Q#4
Discuss Directed Acyclic graph (DAG) in details, what are the benefits
that are obtained while using DAG in compilation process. Give one
example to support your concept.
Directed Acyclic Graph:
A Directed Acyclic Graph (DAG) is used to represent the structure of a basic block, to visualize the flow of values between its statements, and to support optimization within the block. The DAG is constructed from the three-address code produced during intermediate code generation, and optimizations such as common-subexpression elimination are then applied to it.
• The graph’s leaves each have a unique identifier, which can be variable names or constants.
• The interior nodes of the graph are labelled with an operator symbol.
• In addition, nodes may carry a list of identifiers naming the variables that hold the computed
value.
• Directed Acyclic Graphs have their own definitions for transitive closure and transitive
reduction.
Example
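The notes leave the example blank; here is a small illustrative sketch of my own for the basic block `a = b * c; d = b * c + e`. Before creating a node, the builder looks up the (operator, left, right) triple, so the repeated `b * c` is stored only once — this sharing is exactly what enables common-subexpression elimination:

```python
class DAG:
    def __init__(self):
        self.nodes = []          # each node: (op_or_name, left_id, right_id)
        self.index = {}          # (op, left, right) -> node id

    def node(self, op, left=None, right=None):
        key = (op, left, right)
        if key not in self.index:            # reuse an existing node if possible
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

dag = DAG()
b, c, e = dag.node('b'), dag.node('c'), dag.node('e')
t1 = dag.node('*', b, c)        # a = b * c
t2 = dag.node('*', b, c)        # same triple: no new node is created
d  = dag.node('+', t2, e)       # d = b * c + e, reusing the shared node
```

A plain expression tree would need two separate `*` nodes; the DAG has only five nodes in total, and t1 and t2 are the same node, so `b * c` is computed once.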
Same question, but from a different year's paper.
Ambiguous Grammar
An ambiguous grammar, in programming and linguistics, is a grammar that can generate more than one distinct parse tree for the same sentence or string of symbols. This ambiguity means that there are multiple ways to interpret the structure of the sentence without altering the sequence of words or symbols.
Top-down parser and Bottom-up parser
Top-Down Parsing
Top-down parsing is a technique that starts from the top of the parse tree (the start symbol) and moves downwards, applying the rules of the grammar as it goes. In other words, it looks at the highest level of the tree first and then works down toward the leaves.
A top-down parser tries to find the leftmost derivation of the input. Since it uses the leftmost derivation, at each step the leftmost nonterminal determines which production rule is applied next to construct the string.
Bottom-up parser
Bottom-up parsing is a technique that starts from the lowest level of the parse tree (the leaves) and moves upwards, applying the rules of the grammar in reverse. The bottom-up technique therefore attempts to reduce the input string to the start symbol of the grammar.
In bottom-up parsing, the tree is built from the leaf nodes (the bottom) toward the start symbol at the root; it works in a bottom-up manner, hence the name.
Bottom-up parsing uses the rightmost derivation, traced in reverse: the main decision at each step is which production rule to use to reduce the current string, until the start symbol of the parse tree is obtained.
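The reduce-to-start idea can be sketched with a toy shift-reduce parser. The grammar here (E -> E + n | n) is my own example, not from the notes: tokens are shifted onto a stack, and whenever the stack top matches a production body it is reduced, until only the start symbol remains:

```python
def shift_reduce(tokens):
    """Accept iff tokens reduce to the start symbol E under E -> E + n | n."""
    stack = []
    for tok in tokens:
        stack.append(tok)                      # shift
        while True:
            if stack[-3:] == ['E', '+', 'n']:  # reduce by E -> E + n
                stack[-3:] = ['E']
            elif stack[-1:] == ['n']:          # reduce by E -> n
                stack[-1:] = ['E']
            else:
                break                          # no handle on top: shift again
    return stack == ['E']                      # accepted iff fully reduced
```

Tracing "n+n+n": n reduces to E, then E + n reduces to E, and again, which is exactly the rightmost derivation E => E + n => E + n + n => n + n + n read in reverse.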
Q#2
Q#3
Q#1
Q#2
Q#3
Identifier Information:
The symbol table stores information about identifiers (such as variable names, function names, and class names) used in the program. Each entry typically contains the identifier's name, type, and scope, and sometimes additional information such as its memory address.
Scope Management:
It helps manage the scope of identifiers. Different scopes might have identifiers with the same
name but different attributes. The symbol table ensures that the correct identifier is accessed
according to its scope.
Type Checking:
During semantic analysis, the symbol table is used to check for type consistency in expressions
and assignments. It ensures that operations are performed on compatible data types.
Code Generation:
During the code generation phase, the symbol table provides necessary information about memory
locations of variables and the linkage of functions, which is essential for generating correct and
efficient machine code.
Hash Tables:
This is one of the most common implementations due to its efficient average time complexity for insertions, deletions, and lookups, which is O(1) on average. Hash tables handle collisions using techniques like chaining or open addressing.
Binary Search Trees (BST):
A BST can also be used, particularly when a sorted order of elements is beneficial, such as in lexical scopes. The average time complexity for operations in a balanced BST is O(log n).
Linked Lists:
In simpler implementations or when the number of identifiers is small, linked lists might be used.
They are particularly useful when implementing scopes as a stack of linked lists, where each block
or function scope pushes a new linked list onto the stack.
Trie (Prefix Tree):
Tries are beneficial for fast retrieval of strings and can be particularly useful for autocomplete
features in development environments. They store identifiers in a way that common prefixes are
only stored once, which can be memory efficient.
Stacks:
For managing scopes, especially in block-structured languages like C or Java, stacks can be used
where each block pushes its scope onto the stack and pops it off when exiting the scope.
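The stack-of-scopes idea can be sketched as follows. This is an illustration of mine combining two of the structures above (a stack of hash tables, one dict per scope); lookup searches from the innermost scope outwards, so an inner declaration shadows an outer one of the same name:

```python
class SymbolTable:
    def __init__(self):
        self.scopes = [{}]                 # start with the global scope

    def enter_scope(self):
        self.scopes.append({})             # push a new table on block entry

    def exit_scope(self):
        self.scopes.pop()                  # discard the table on block exit

    def declare(self, name, info):
        self.scopes[-1][name] = info       # declare in the current scope

    def lookup(self, name):
        for scope in reversed(self.scopes):  # innermost scope wins
            if name in scope:
                return scope[name]
        return None                          # undeclared identifier

syms = SymbolTable()
syms.declare('x', 'int')
syms.enter_scope()
syms.declare('x', 'float')                 # shadows the outer x
inner = syms.lookup('x')                   # resolves to 'float'
syms.exit_scope()
outer = syms.lookup('x')                   # outer x is visible again: 'int'
```

Popping the dict on scope exit is what guarantees that identifiers from a finished block can no longer be accessed.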
Top-Down Parsing
Top-down parsing is a technique that starts from the top of the parse tree (the start symbol) and moves downwards, applying the rules of the grammar as it goes. In other words, it looks at the highest level of the tree first and then works down toward the leaves.
A top-down parser tries to find the leftmost derivation of the input. Since it uses the leftmost derivation, at each step the leftmost nonterminal determines which production rule is applied next to construct the string.
• E → E + T | T
• T → T * F | F
• F → (E) | id
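Note that this grammar is left-recursive (E → E + T), which a top-down parser cannot use directly. A standard fix is to rewrite it as E → T E', E' → + T E' | ε (and similarly for T) and then write one function per nonterminal. A minimal recursive-descent sketch of mine, under that rewriting:

```python
def parse(tokens):
    """Recursive-descent recognizer for E -> T ('+' T)*, T -> F ('*' F)*, F -> '(' E ')' | 'id'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, got {peek()!r}")
        pos += 1

    def E():
        T()
        while peek() == '+':   # E' -> + T E'
            eat('+'); T()

    def T():
        F()
        while peek() == '*':   # T' -> * F T'
            eat('*'); F()

    def F():                   # F -> ( E ) | id
        if peek() == '(':
            eat('('); E(); eat(')')
        else:
            eat('id')

    E()
    return pos == len(tokens)  # accepted iff all input was consumed
```

For the token list ['id', '+', 'id', '*', 'id'] the parser returns True, building (implicitly) the leftmost derivation top-down.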
Example
Use the LL(1) question above.
i) S -> SX | SSh | XS | a
Lexical Analysis:
This initial phase involves scanning the source code character by character to convert it into
meaningful lexemes, which are then categorized into tokens (such as keywords, operators,
identifiers, literals). The lexical analyzer (or scanner) removes whitespace, comments, and other
non-essential characters, simplifying the input for the next phases. Errors like invalid symbols or
characters are also detected and reported in this stage.
Semantic Analysis:
Following syntax analysis, the semantic analyzer works with the parse tree to check for semantic
correctness of the program. This involves ensuring that operations are performed on compatible
types, variables are declared before use, and function calls match definitions in terms of number
and type of arguments. The output is an enhanced version of the parse tree, known as an abstract
syntax tree (AST), which captures the logical flow and the meaning of the program without
unnecessary syntactic details.
Optimization:
The optimization phase attempts to improve the intermediate code so that it runs faster and uses
fewer resources. Optimizations can occur at various levels and may involve removing redundant
code, minimizing memory access, exploiting hardware architecture, and more. This phase is critical
for performance-critical applications but must preserve the original semantics of the program.
Code Generation:
The final phase of the compiler is where the optimized intermediate code is translated into the
machine code specific to the target processor architecture. This involves choosing appropriate
machine instructions, allocating registers, managing memory access, and handling system calls
and interrupts. The output is either an executable or object code ready to be executed by the
hardware.
Q#4
Q#5
MCQs (Paper 1)
The graph that shows basic blocks and their successor relationship is
called
a) DAG b) Flow Chart c) Control Graph d) Hamilton graph
tree in which each node represents an operator and children of the node
represent the operands.
a) Abstract syntax b) Parse c) Concrete d) none of these
LR stands for
a) Left to right b) left to right reductions c) right to left d) none of these
Intermediate Representation (IR) stores the value of its operand in
a) stack b) queue c) registers d) none of these
Output of parser is
a) Tokens b) Parse tree c) Object code d) Intermediate code
GeeksforGeeks
GitHub
ChatGPT-4
Gemini Pro
• Mr.noman.tariq@outlook.com
• 03700204207(WhatsApp only)