Syntax-Directed Translation

 The parser uses a context-free grammar (CFG) to validate the input string and produce
output for the next phase of the compiler. The output can be either a parse tree or an
abstract syntax tree.

 To interleave (mix) semantic analysis with the syntax analysis phase of the
compiler, we use Syntax-Directed Translation (SDT).

 SDT augments the grammar with rules that facilitate semantic analysis. It
involves passing information bottom-up and/or top-down through the parse tree in the
form of attributes attached to the nodes.

 Syntax-directed translation rules use:
   lexical values of nodes,
   constants, and
   attributes associated with the non-terminals in their definitions.

Syntax-Directed Translation
 SDT uses a context-free grammar to specify the syntactic
structure of the language.

 It associates a set of attributes with the terminals and non-terminals.

 It associates with each production a set of semantic
rules for computing values for the attributes.

 The attributes contain the translated form of the input
after the computations are completed.
Synthesized and Inherited
 An attribute is said to be:
   synthesized if its value at a parse-tree node is
determined from the attribute values at the
children of the node;
   inherited if its value at a parse-tree node is
determined from the attribute values at the parent and/or
siblings of the node (through the semantic rule of the
parent's production).
Attribute Grammar
 An attribute grammar is a special form of CFG in which additional information (attributes) is
attached to one or more of its non-terminals in order to provide context-sensitive information.

 Each attribute has a well-defined domain of values, such as integer, float, character, and string.

 An attribute grammar is a medium for providing semantics to the CFG, and it can help specify both
the syntax and the semantics of a programming language.

 An attribute grammar (when viewed as a parse tree) can pass values or information among the nodes
of the tree.

 Strictly, an attribute grammar is a syntax-directed definition whose semantic rules cannot have side effects.

 In practice, it is often useful to allow side effects (printing a value, updating a global variable, ...)
in the semantic rules of a syntax-directed definition.

Example Attribute Grammar
 Example: E → E + T { E.value = E.value + T.value }

 The right part of the CFG contains the semantic rules that specify how the
grammar should be interpreted.

 Here, the values of non-terminals E and T are added together and the result is
copied to the non-terminal E.
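
As a rough illustration of the rule E.value = E.value + T.value, the sketch below attaches a `value` attribute to parse-tree nodes and applies the rule at the parent. The `Node` class, the function name `apply_plus_rule`, and the operand values 9 and 5 are assumptions made for this example only.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    symbol: str                                   # grammar symbol, e.g. "E", "T", "+"
    children: List["Node"] = field(default_factory=list)
    value: Optional[int] = None                   # the attribute E.value / T.value


def apply_plus_rule(parent: Node) -> None:
    """Semantic rule for E -> E1 + T: E.value = E1.value + T.value."""
    e1, _plus, t = parent.children
    parent.value = e1.value + t.value


# The children already carry their values (assumed here: 9 and 5).
left = Node("E", value=9)
right = Node("T", value=5)
parent = Node("E", children=[left, Node("+"), right])
apply_plus_rule(parent)
print(parent.value)
```

Because the parent's value depends only on its children, `value` behaves as a synthesized attribute in this sketch.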

 Each production rule of the CFG has a semantic rule.

 [Figure: productions with their semantic rules; a callout marks the concatenation operator.]

 Note: the semantic rules for expr use synthesized attributes, which obtain their values
from other rules.

 The way an arithmetic expression is written is known as a notation.

 An arithmetic expression can be written in three different but
equivalent notations, i.e., without changing the essence or
output of the expression.

 These notations are:
   Infix Notation
   Prefix (Polish) Notation
   Postfix (Reverse-Polish) Notation
 Infix Notation: we usually write expressions in infix notation,
e.g. a - b + c, where operators are placed in between operands.

 Prefix Notation: the operator is prefixed to the operands,
i.e. the operator is written ahead of the operands.
For example, +ab is equivalent to the infix notation a + b. Prefix
notation is also known as Polish Notation.

 Postfix Notation: this notation style is known as Reverse Polish Notation.
The operator is postfixed to the operands,
i.e., the operator is written after the operands.
For example, ab+ is equivalent to the infix notation a + b.
Postfix notation for an expression E
 If E is a variable or constant, then the postfix notation for
E is E itself (E.t ≡ E).

 If E is an expression of the form E1 op E2, where op is a
binary operator, E1' is the postfix of E1, and E2' is the postfix of E2,
then E1' E2' op is the postfix for E1 op E2.

 If E is (E1) and E1' is the postfix of E1, then E1' is the postfix for E.

Examples:
9 - 5 + 2 ⇒ 9 5 - 2 +
9 - (5 + 2) ⇒ 9 5 2 + -
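
The three rules above can be sketched as a small recursive function. The tree encoding used here (a string for a variable or constant, a tuple `(op, left, right)` for a binary expression) is an assumption for illustration; parentheses do not appear in it because they only determine the shape of the tree.

```python
from typing import Tuple, Union

# Hypothetical encoding: a leaf is a str, an operation is (op, left, right).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]


def postfix(e: Expr) -> str:
    """Return the postfix notation of expression e."""
    if isinstance(e, str):
        # Rule 1: a variable or constant is its own postfix form.
        return e
    op, e1, e2 = e
    # Rule 2: postfix(E1 op E2) = postfix(E1) postfix(E2) op.
    return f"{postfix(e1)} {postfix(e2)} {op}"


# 9 - 5 + 2 groups as (9 - 5) + 2; 9 - (5 + 2) groups the sum first.
print(postfix(("+", ("-", "9", "5"), "2")))
print(postfix(("-", "9", ("+", "5", "2"))))
```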
Postfix Evaluation Algorithm
 The algorithm for evaluating an expression in postfix notation is:
   Step 1 − scan the expression from left to right
   Step 2 − if the symbol is an operand, push it onto the stack
   Step 3 − if it is an operator, pop its operands from the stack and perform the operation
   Step 4 − push the result of step 3 back onto the stack
   Step 5 − continue scanning until the whole expression is consumed
   Step 6 − pop the stack to obtain the final result
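
A minimal sketch of this stack-based evaluation follows. It assumes space-separated tokens, numeric operands, and only the four binary operators shown; the function name `evaluate_postfix` is chosen for this example.

```python
import operator

# Binary operators supported by this sketch.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def evaluate_postfix(expression: str) -> float:
    """Evaluate a space-separated postfix expression using a stack."""
    stack = []
    for token in expression.split():          # scan left to right
        if token in OPS:
            right = stack.pop()               # the top of the stack is the right operand
            left = stack.pop()
            stack.append(OPS[token](left, right))
        else:
            stack.append(float(token))        # operand: push it
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack.pop()                        # the single remaining value is the result


print(evaluate_postfix("9 5 - 2 +"))
print(evaluate_postfix("9 5 2 + -"))
```

Popping the right operand first matters for non-commutative operators such as - and /.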

Annotated Parse Tree
 A parse tree that shows the values of the attributes at each
node for a given input string is called an annotated or decorated
parse tree.

 Features:
   High-level specification
   Hides implementation details
   Explicit order of evaluation is not specified

Example Annotated Parse Tree
 Examples:
(9 - 5) + 2 → 9 5 - 2 +
9 - (5 + 2) → 9 5 2 + -

Depth-First Traversals
 A depth-first traversal (DFT) is a method for exploring the nodes of a tree. In a DFT, you go as
deep as possible down one path before backing up and trying a different one.

 You explore one path, hit a dead end, and go back and try a different one.

 In this traversal the deepest node on a path is reached first, and the traversal then backtracks
to its parent node if no unvisited sibling of that node exists.

 procedure visit(n : node);
 begin
     for each child m of n, from left to right do
         visit(m);
     evaluate semantic rules at node n
 end
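
The visit procedure can be sketched in Python as shown below. The `Node` class, the postfix-string attribute `t`, the rule function `concat_rule`, and the `order` list that records when each node is finished are all assumptions made for this illustration; the tree corresponds to the expression 9 - 5 + 2.

```python
class Node:
    def __init__(self, label, children=(), rule=None, t=None):
        self.label = label
        self.children = list(children)
        self.rule = rule      # semantic rule: a callable that sets this node's attribute
        self.t = t            # attribute holding the postfix string


order = []                    # labels in the order their rules are evaluated


def visit(n):
    # Visit every child from left to right first ...
    for m in n.children:
        visit(m)
    # ... then evaluate the semantic rule at node n itself.
    if n.rule is not None:
        n.rule(n)
    order.append(n.label)


def concat_rule(n):
    """Rule for expr -> expr1 op term: expr.t = expr1.t term.t op."""
    left, op, right = n.children
    n.t = f"{left.t} {right.t} {op.label}"


inner = Node("expr1", [Node("9", t="9"), Node("-"), Node("5", t="5")], rule=concat_rule)
root = Node("expr", [inner, Node("+"), Node("2", t="2")], rule=concat_rule)
visit(root)
print(root.t)
print(order)
```

Since a node's rule runs only after all of its children have been visited, the children's attributes are available whenever a synthesized attribute is computed.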


Depth-First Traversals (Example)
 [Figure: binary tree with root 1; node 1 has children 2 and 3, and node 2 has children 4 and 5.]

 The depth-first traversals of this tree are:
   In-order (Left, Root, Right): 4 2 5 1 3
   Pre-order (Root, Left, Right): 1 2 4 5 3
   Post-order (Left, Right, Root): 4 5 2 3 1
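
The three orders can be reproduced with a short sketch. It assumes the tree has root 1 with children 2 and 3, and that node 2 has children 4 and 5, which is the shape consistent with the three sequences listed above. The class and function names are chosen for this example.

```python
class TreeNode:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(n):
    """Left subtree, then root, then right subtree."""
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)


def preorder(n):
    """Root, then left subtree, then right subtree."""
    return [] if n is None else [n.key] + preorder(n.left) + preorder(n.right)


def postorder(n):
    """Left subtree, then right subtree, then root."""
    return [] if n is None else postorder(n.left) + postorder(n.right) + [n.key]


# Assumed tree shape: 1 -> (2, 3), 2 -> (4, 5).
root = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
print(inorder(root), preorder(root), postorder(root))
```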

 Parsing is the process of determining whether a string of tokens can be
generated by a grammar.

 The parser is the phase of the compiler that takes a token string
as input and, with the help of the existing grammar, converts it into the
corresponding Intermediate Representation (IR).

 The parser is also known as the Syntax Analyzer.

 Parsers are mainly classified into two categories:
   Top-down Parser
   Bottom-up Parser

 The top-down parser generates a parse tree for the given input
string with the help of grammar productions by expanding the non-terminals,
i.e. it starts from the start symbol and ends on the terminals. It uses leftmost derivation.

 Derivation of a token string occurs in a top-down fashion.

 It is constructed from a grammar that is free from ambiguity and left recursion.

 Top-down parsers are classified into 2 types:
recursive descent parsers and non-recursive descent parsers.

A recursive descent parser
 This technique uses a procedure for every terminal and
non-terminal entity.

 It reads the input from left to right and constructs the parse
tree from the top down. It may require backtracking.

 As the technique works recursively, it is called recursive
descent parsing (here, with backtracking).

 If one derivation of a production fails, the syntax analyzer
restarts the process using a different alternative of the same production.
A recursive descent parser
 It uses leftmost derivation to construct a parse tree.

 For example, consider the input abbcde and the grammar:

S -> aABe
A -> Abc | b
B -> d

 Leftmost derivation: S ⇒ aABe ⇒ aAbcBe ⇒ abbcBe ⇒ abbcde

Non-recursive descent parser
 Also known as a predictive parser, a parser without backtracking, or a
dynamic parser. It uses a parsing table to generate the parse tree
instead of backtracking.

 Predictive parsing is a special form of recursive descent parsing
in which one lookahead token is used to unambiguously determine the
parse operations, so no backtracking is required.

 By contrast, a backtracking parser guesses a production, sees if it matches,
and if not, backtracks and tries another.

Predictive Parsing
 The parser operates by attempting to match tokens in the input stream.

 The grammar and input below are used to motivate the code for the algorithm.

 It is a kind of top-down parsing that, while expanding a non-terminal, predicts the production
whose derived first terminal symbol equals the next input symbol, so no backtracking is needed.

For example, suppose the input stream is a + b. Each call to match() advances the lookahead
to the next token:

lookahead == a
match()
lookahead == +
match()
lookahead == b

Predictive parsing
procedure match(t : token);
begin
    if lookahead = t then
        lookahead := nexttoken()
    else error()
end;

procedure type();
begin
    if lookahead in { 'integer', 'char', 'num' } then
        simple()
    else if lookahead = '^' then begin
        match('^'); match(id)
    end
    else if lookahead = 'array' then begin
        match('array'); match('['); simple();
        match(']'); match('of'); type()
    end
    else error()
end;

procedure simple();
begin
    if lookahead = 'integer' then
        match('integer')
    else if lookahead = 'char' then
        match('char')
    else if lookahead = 'num' then begin
        match('num');
        match('dotdot');
        match('num')
    end
    else error()
end;
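
A runnable Python counterpart of the match, type, and simple procedures might look like the sketch below. It assumes the input is already a list of token names such as 'array', '[', 'num', 'dotdot', and 'id'. The class `PredictiveParser`, the exception `ParseError`, the end marker '$', and the helper `parses` are hypothetical names introduced for this example.

```python
class ParseError(Exception):
    pass


class PredictiveParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]    # "$" marks end of input (assumed convention)
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        # Consume the lookahead if it is the expected token, otherwise fail.
        if self.lookahead == t:
            self.pos += 1
        else:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")

    def type_(self):
        # The lookahead alone selects the production, so no backtracking is needed.
        if self.lookahead in ("integer", "char", "num"):
            self.simple()
        elif self.lookahead == "^":
            self.match("^")
            self.match("id")
        elif self.lookahead == "array":
            self.match("array")
            self.match("[")
            self.simple()
            self.match("]")
            self.match("of")
            self.type_()
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")

    def simple(self):
        if self.lookahead == "integer":
            self.match("integer")
        elif self.lookahead == "char":
            self.match("char")
        elif self.lookahead == "num":
            self.match("num")
            self.match("dotdot")
            self.match("num")
        else:
            raise ParseError(f"unexpected token {self.lookahead!r}")


def parses(tokens):
    """Return True if the whole token list is derivable from type."""
    parser = PredictiveParser(tokens)
    try:
        parser.type_()
        parser.match("$")
        return True
    except ParseError:
        return False


print(parses(["array", "[", "num", "dotdot", "num", "]", "of", "integer"]))
```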
Advantages of Top-Down Parsing
 Advantages
   Top-down parsing is very simple.
   It is very easy to identify the action decision of the top-down parser.

 Disadvantages
   Top-down parsing is unable to handle left recursion
present in the grammar.
   Some recursive descent parsers may require backtracking.

Problem with Top Down Parsing

 Left Recursion in CFG May Cause Parser to Loop


 Backtracking

 Left factoring

 Ambiguity

Left Recursion
 A production of a grammar is said to have left recursion if the leftmost variable of its RHS is
the same as the variable of its LHS.

 A grammar is left recursive if it contains a nonterminal A such that A ⇒+ Aα, where α is any
string of grammar symbols.
   Grammar {S → Sα | c} is left recursive because S ⇒ Sα.
   Grammar {S → Aα, A → Sb | c} is also left recursive because S ⇒ Aα ⇒ Sbα.

 If a grammar is left recursive, you cannot build a predictive top-down parser for it.
   If a parser is trying to match S and the production is S → Sα, it has no idea how many times
the production must be applied.
   Given a left recursive grammar, it is always possible to find another grammar that
generates the same language and is not left recursive.
   The resulting grammar might or might not be suitable for recursive descent parsing (RDP).

Left Recursion
 When a production for nonterminal A starts with a self-reference, a
predictive parser loops forever:

A → Aα | β

 We can eliminate left-recursive productions by systematically rewriting the
grammar using right-recursive productions:

A → βA'
A' → αA' | ε

Exercise: Remove the left recursion in the following grammar:
expr → expr + term | expr - term | term
Solution:
expr → term rest
rest → + term rest | - term rest | ε
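
The rewriting of A → Aα | β into A → βA' and A' → αA' | ε can be sketched as a function. This version handles only immediate left recursion on a single nonterminal. It represents each production as a list of symbols, with the empty list standing for ε. The function name and the primed naming of the new nonterminal are assumptions for this example (the exercise above calls that nonterminal rest).

```python
from typing import Dict, List

Production = List[str]


def eliminate_immediate_left_recursion(
    nonterminal: str, productions: List[Production]
) -> Dict[str, List[Production]]:
    """Rewrite A -> A a1 | ... | b1 | ... as A -> b A', A' -> a A' | epsilon."""
    # The alpha parts: what follows A in each left-recursive alternative.
    alphas = [p[1:] for p in productions if p and p[0] == nonterminal]
    # The beta parts: alternatives that do not start with A.
    betas = [p for p in productions if not p or p[0] != nonterminal]
    if not alphas:
        return {nonterminal: productions}     # nothing to rewrite
    new = nonterminal + "'"
    return {
        nonterminal: [beta + [new] for beta in betas],
        new: [alpha + [new] for alpha in alphas] + [[]],   # [] is epsilon
    }


grammar = eliminate_immediate_left_recursion(
    "expr", [["expr", "+", "term"], ["expr", "-", "term"], ["term"]]
)
for lhs, alternatives in grammar.items():
    print(lhs, "->", " | ".join(" ".join(p) or "ε" for p in alternatives))
```

Indirect left recursion, such as S → Aα with A → Sb, is not handled by this sketch.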
Right Recursion
 A production of a grammar is said to have right recursion if the
rightmost variable of its RHS is the same as the variable of its LHS.

 A grammar containing a production with right recursion is called a
Right Recursive Grammar.

 Example: S → aS | ε

 Right recursion does not create any problem for top-down parsers.

 Therefore, there is no need to eliminate right recursion from the grammar.


Left Factoring
 If more than one production rule of a grammar has a common prefix
string, then the top-down parser cannot choose which of
the productions it should take to parse the string in hand.

 Example: suppose a top-down parser encounters a production like

A → αβ | αγ | …

 Then it cannot determine which production to follow to parse the
string, as both productions start with the same terminal (or non-terminal).

 To remove this confusion, we use a technique called left factoring.

Left Factoring
 Left factoring transforms the grammar to make it useful for top-down parsers.
In this technique, we make one production for each common prefix, and the
rest of the derivation is added by new productions.

 The above productions can be written as

A → αA'
A' → β | γ | …

 Now the parser has only one production per prefix, which makes it easier to
take decisions.

A concrete example:
<stmt> → IF <boolean> THEN <stmt> | IF <boolean> THEN <stmt> ELSE <stmt>
is transformed into
<stmt> → IF <boolean> THEN <stmt> S'
S' → ELSE <stmt> | ε
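
One pass of left factoring can be sketched as below. Productions are lists of symbols and the empty list stands for ε. The function names and the primed naming of new nonterminals are assumptions for this example (the text above names the new nonterminal S'). Alternatives are grouped by their first symbol, and each group with more than one member is factored on its longest common prefix. The new nonterminal may itself need another pass.

```python
from collections import defaultdict
from typing import Dict, List

Production = List[str]


def common_prefix(alternatives: List[Production]) -> Production:
    """Longest sequence of symbols shared at the start of every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor(nonterminal: str, productions: List[Production]) -> Dict[str, List[Production]]:
    groups = defaultdict(list)
    for p in productions:
        groups[p[0] if p else None].append(p)  # group by first symbol
    result = {nonterminal: []}
    primes = 0
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[nonterminal].extend(group)  # nothing to factor here
            continue
        prefix = common_prefix(group)
        primes += 1
        new = nonterminal + "'" * primes       # hypothetical naming scheme
        result[nonterminal].append(prefix + [new])
        result[new] = [p[len(prefix):] for p in group]
    return result


factored = left_factor(
    "stmt",
    [["IF", "boolean", "THEN", "stmt"],
     ["IF", "boolean", "THEN", "stmt", "ELSE", "stmt"]],
)
print(factored)
```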

Bottom-Up parsing

 Bottom-up parsing works in just the reverse way of
top-down parsing. It traces the rightmost
derivation of the input in reverse until it reaches the start symbol.

 It starts from the terminals (the input string) and ends on the start
symbol. It uses the reverse of the rightmost derivation.

Bottom-Up parsing
 Example input string: a + b * c

 Production rules:
   S → E
   E → E + T
   E → E * T
   E → T
   T → id
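
To make the idea concrete, here is a deliberately naive shift-reduce sketch for this grammar. It is not an LR parser: it shifts one token and then greedily reduces whenever the top of the stack matches a right-hand side, trying longer right-hand sides first, which happens to be sufficient for this small grammar. The token list assumes the lexer has already turned a, b, and c into id. The function name and the trace format are assumptions for this example.

```python
# Productions of the example grammar, except the start rule S -> E.
GRAMMAR = [
    ("E", ("E", "+", "T")),
    ("E", ("E", "*", "T")),
    ("E", ("T",)),
    ("T", ("id",)),
]


def shift_reduce(tokens):
    """Return (accepted, trace) for a list of tokens such as ['id', '+', 'id']."""
    stack = []
    trace = []

    def reduce_all():
        changed = True
        while changed:
            changed = False
            # Try longer right-hand sides first so E + T wins over plain T.
            for lhs, rhs in sorted(GRAMMAR, key=lambda rule: -len(rule[1])):
                if tuple(stack[-len(rhs):]) == rhs:
                    del stack[-len(rhs):]
                    stack.append(lhs)
                    trace.append(f"reduce {lhs} -> {' '.join(rhs)}")
                    changed = True
                    break

    for token in tokens:
        stack.append(token)                   # shift
        trace.append(f"shift {token}")
        reduce_all()

    if stack == ["E"]:                        # apply S -> E only at end of input
        stack[:] = ["S"]
        trace.append("reduce S -> E")
    return stack == ["S"], trace


accepted, steps = shift_reduce(["id", "+", "id", "*", "id"])
print(accepted)
for step in steps:
    print(step)
```

Read from last to first, the reductions in the trace form a rightmost derivation of the input from S.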


