Module-2 1
The parser receives a stream of tokens from the lexical analyzer and verifies that the string can
be generated by the grammar for the source language by constructing a parse tree.
The term parsing comes from the Latin pars orationis, which means part of speech.
SYNTAX ANALYSIS
[Figure: Interaction between lexical analyzer and parser. The scanner (lexical analyzer) passes tokens to the parser (syntax analyzer).]
CONTEXT FREE GRAMMAR (CFG)
A context free grammar is a grammar whose productions are of the form
A → α
where A is a non terminal and α is a string of terminals and non terminals (α can be
empty also).
A formal grammar is "context free" if its production rules can be applied regardless
of the context of a nonterminal.
No matter which symbols surround it, the single nonterminal on the left hand side
can always be replaced by the right hand side.
A CFG consists of four components (N, T, P, S):
Terminals
basic symbols from which strings are formed
tokens
Non terminals
nonterminals define sets of strings that help define the language generated by the
grammar
Production
Start Symbol
Grammar for simple arithmetic expression
DERIVATION
• A derivation is a sequence of production rule applications that produces the input
string from the start symbol.
• Beginning with the start symbol, each step replaces a non terminal by the body of one of
its productions.
• Types:
• Left Most Derivation - In left most derivation, the left most non terminal is replaced in each step
• Right Most Derivation - In right most derivation, the right most non terminal is replaced in each
step
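The two derivation orders can be sketched in a few lines of Python. The grammar used here (E → E + E | E * E | id) is a hypothetical example chosen for illustration, and the helper name `derive` is likewise an assumption. The same three productions are applied in both cases; only the choice of which nonterminal to replace differs.

```python
# Hypothetical grammar for illustration only: E -> E + E | E * E | id
NONTERMINALS = {"E"}

def derive(start, bodies, leftmost=True):
    """Apply each production body in turn to the leftmost (or rightmost)
    nonterminal and return every sentential form along the way."""
    form = [start]
    forms = [" ".join(form)]
    for body in bodies:
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = body          # replace the chosen nonterminal by the body
        forms.append(" ".join(form))
    return forms

steps = [["E", "+", "E"], ["id"], ["id"]]
leftmost = derive("E", steps, leftmost=True)
rightmost = derive("E", steps, leftmost=False)
print(" => ".join(leftmost))    # E => E + E => id + E => id + id
print(" => ".join(rightmost))   # E => E + E => E + id => id + id
```

Both derivations end in the same sentence; they differ only in the intermediate sentential forms.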
Consider the grammar
PARSE TREE
A parse tree (also called a derivation tree) is a hierarchical structure that represents the
derivation of an input string from the grammar.
The leaves of the parse tree are labeled by non-terminals or terminals. Read
from left to right, they constitute a sentential form, called the yield or frontier of
the tree.
Parsing is the process of determining whether a string of tokens can be
generated by a grammar.
There are two approaches: top-down parsing and bottom-up parsing.
Top Down Parsing - In top down parsing, parse tree is constructed from top (root) to the
bottom (leaves).
TDP approaches:
Predictive Parser
RECURSIVE DESCENT PARSING
IMPLEMENTATION
// advance() moves nextsymbol to the next input symbol
Procedure S()
{ if nextsymbol = ‘c’
  { advance();
    A();
    if nextsymbol = ‘d’
    { advance();
      return success;
    }
  }
  error;
}
Procedure A()
{ if nextsymbol = ‘a’
  { advance();
    if nextsymbol = ‘b’
      advance();
    return;
  }
  error;
}
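A runnable version of these procedures might look like the following sketch. It assumes the procedures correspond to the grammar S → cAd, A → ab | a, which the pseudocode suggests but the text does not state. The names `Parser`, `match`, and `accepts` are illustrative choices. The sketch does not backtrack: A consumes a 'b' whenever one is available.

```python
class ParseError(Exception):
    """Raised when the input does not match the assumed grammar."""

class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def next_symbol(self):
        # "$" stands for end of input
        return self.text[self.pos] if self.pos < len(self.text) else "$"

    def match(self, expected):
        # Consume one input symbol if it is the expected terminal
        if self.next_symbol() != expected:
            raise ParseError(f"expected {expected!r} at position {self.pos}")
        self.pos += 1

    def S(self):
        # Assumed production: S -> c A d
        self.match("c")
        self.A()
        self.match("d")

    def A(self):
        # Assumed productions: A -> a b | a
        self.match("a")
        if self.next_symbol() == "b":
            self.match("b")

def accepts(text):
    """Return True if the whole input is derived from S."""
    parser = Parser(text)
    try:
        parser.S()
    except ParseError:
        return False
    return parser.pos == len(text)

print(accepts("cad"), accepts("cabd"), accepts("cd"))  # True True False
```

Each nonterminal becomes one method, and each terminal in a production body becomes one `match` call, so the code mirrors the grammar directly.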
Recursive descent is the most general form of top-down parsing; it may require backtracking.
A left-recursive grammar can cause a recursive-descent parser to go into an infinite loop. That is, when
we try to expand A, we may find ourselves again trying to expand A without having consumed any
input.
Backtracking recursive-descent parsers are not very common, as programming language constructs can usually be parsed
without using backtracking.
Stack:
initialized with $, to indicate bottom of stack.
Parsing table:
Two-dimensional array M[A, a], where A is a nonterminal and a is a terminal or the symbol $
Moves made by predictive parser for the input id+id*id
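The stack and table mechanics can be sketched as below. The parsing table is an assumption: it is the one for the common textbook expression grammar (E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id), which may differ from the grammar used in the original slides. The function name `predictive_parse` is illustrative.

```python
# Assumed LL(1) table M[A, a] for the textbook expression grammar.
# An empty list stands for an ε-production.
TABLE = {
    ("E", "id"): ["T", "E'"], ("E", "("): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", ")"): [], ("E'", "$"): [],
    ("T", "id"): ["F", "T'"], ("T", "("): ["F", "T'"],
    ("T'", "+"): [], ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [], ("T'", "$"): [],
    ("F", "id"): ["id"], ("F", "("): ["(", "E", ")"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def predictive_parse(tokens, start="E"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]      # end marker on the input
    stack = ["$", start]               # $ marks the bottom of the stack
    pos = 0
    output = []
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            body = TABLE.get((top, look))
            if body is None:
                raise SyntaxError(f"no entry M[{top}, {look}]")
            output.append(f"{top} -> {' '.join(body) or 'ε'}")
            stack.extend(reversed(body))   # leftmost body symbol ends on top
        elif top == look:
            pos += 1                       # terminal (or $) matched
        else:
            raise SyntaxError(f"expected {top}, found {look}")
    return output

moves = predictive_parse(["id", "+", "id", "*", "id"])
for production in moves:
    print(production)
```

For the input id+id*id this prints eleven productions, starting with E -> T E' and ending with E' -> ε, which together form a leftmost derivation of the input.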
Uses 2 functions:
FIRST()
FOLLOW()
These functions allow us to fill the entries of the
predictive parsing table
RULES TO COMPUTE FIRST SET
RULES TO COMPUTE FOLLOW SET
Calculate FIRST and FOLLOW for the given
grammar
S → aBDh
B → cC
C → bC / ε
D → EF
E → g / ε
F → f / ε
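One way to check an answer to this exercise is to compute the sets mechanically. The sketch below applies the usual fixed-point formulation of the FIRST and FOLLOW rules to the grammar above. The function names and the dictionary encoding of the grammar are illustrative assumptions, and "$" is used as the end marker.

```python
EPS = "ε"
# The exercise grammar; each body is a list of symbols.
GRAMMAR = {
    "S": [["a", "B", "D", "h"]],
    "B": [["c", "C"]],
    "C": [["b", "C"], [EPS]],
    "D": [["E", "F"]],
    "E": [["g"], [EPS]],
    "F": [["f"], [EPS]],
}

def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                    # every symbol can vanish
    return result

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:                     # repeat until nothing new is added
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

def compute_follow(first, start="S"):
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add("$")
    changed = True
    while changed:
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue       # FOLLOW is defined for nonterminals only
                    rest = first_of_string(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:    # what follows sym can vanish
                        new |= follow[head]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first()
FOLLOW = compute_follow(FIRST)
for nt in GRAMMAR:
    print(nt, "FIRST =", sorted(FIRST[nt]), "FOLLOW =", sorted(FOLLOW[nt]))
```

For this grammar the sketch yields FIRST(D) = {g, f, ε}, FOLLOW(B) = FOLLOW(C) = {g, f, h}, and FOLLOW(E) = {f, h}.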
A context-free grammar G whose parsing table has no multiple entries is said to be LL(1).
LL(1) grammars are the class of grammars from which predictive parsers can be constructed.
The first L stands for scanning the input from left to right, the second L for producing a
leftmost derivation, and the 1 for using one input symbol of lookahead at each step to make
parsing action decisions.
Not LL(1)
Grammar
The goal of predictive parsing is to construct a top-down parser that
never backtracks. To do so, we must transform a grammar in two ways:
Eliminate Left Recursion
Perform Left factoring
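The first of these transformations can be sketched for the immediate case, where A → A α | β is rewritten as A → β A', A' → α A' | ε. The function name and the example production E → E + T | T are illustrative assumptions. The sketch handles neither indirect left recursion nor ε among the non-recursive alternatives, and it does not cover left factoring.

```python
def eliminate_immediate_left_recursion(head, bodies):
    """Rewrite A -> A α | β as A -> β A', A' -> α A' | ε.
    Bodies are lists of symbols; assumes no ε among the β alternatives."""
    recursive = [b[1:] for b in bodies if b and b[0] == head]    # the α parts
    others = [b for b in bodies if not b or b[0] != head]        # the β parts
    if not recursive:
        return {head: bodies}          # nothing to do
    new_head = head + "'"
    return {
        head: [b + [new_head] for b in others],
        new_head: [a + [new_head] for a in recursive] + [["ε"]],
    }

# Hypothetical example: E -> E + T | T
result = eliminate_immediate_left_recursion("E", [["E", "+", "T"], ["T"]])
print(result)   # {'E': [['T', "E'"]], "E'": [['+', 'T', "E'"], ['ε']]}
```

After the rewrite, expanding E always consumes input derived from T before E' is considered, so a top-down parser no longer re-enters E without making progress.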
An ambiguous sentence has two or more possible meanings within a single sentence or sequence
of words. This can confuse the reader and make the meaning of the sentence unclear.
AMBIGUOUS GRAMMAR
An ambiguous grammar is one that produces more
than one leftmost or more than one rightmost
derivation for the same sentence.
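Ambiguity can be observed by counting parse trees, since each parse tree corresponds to exactly one leftmost derivation. The sketch below assumes the hypothetical grammar E → E + E | E * E | id, a common example of an ambiguous grammar, and counts the distinct parse trees for a token sequence. The function name `count_parse_trees` is illustrative.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees for tokens under the assumed grammar
    E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from E
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the operator at the root of the tree
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

trees = count_parse_trees(["id", "+", "id", "*", "id"])
print(trees)   # 2
```

A count above one means the sentence has more than one leftmost derivation, so the grammar is ambiguous. Here the two trees correspond to the groupings (id + id) * id and id + (id * id).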