Lecture 4

Chapter 3
Describing Syntax
Analysis
Program execution
1-2
Introduction
• Syntax: the form or structure of the
expressions, statements, and program
units
• Semantics: the meaning of the expressions,
statements, and program units
• Syntax and semantics provide a language’s
definition
– Users of a language definition
• Other language designers
• Implementers
• Programmers (the users of the language)
1-3
Chomsky hierarchy
According to the Chomsky hierarchy, grammars are
divided into four types:
• Type 0 is known as unrestricted grammar.
• Type 1 is known as context-sensitive grammar.
• Type 2 is known as context-free grammar.
• Type 3 is known as regular grammar.
1-4
Chomsky hierarchy
1-5
BNF and Context-Free Grammars
• Context-Free Grammars
– Developed by Noam Chomsky in the mid-1950s
– Language generators, meant to describe the
syntax of natural languages
– Define a class of languages called context-free
languages
• Backus-Naur Form (1959)
– Invented by John Backus to describe the syntax
of Algol 58
– BNF is equivalent to context-free grammars
1-6
BNF Fundamentals (continued)
• Nonterminals are often enclosed in angle brackets
– Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>
• Grammar: a finite non-empty set of rules
• A start symbol is a special element of the
nonterminals of a grammar
1-7
Describing Lists
• Syntactic lists are described using
recursion
<ident_list> → ident
| ident, <ident_list>
• A derivation is a repeated application of
rules, starting with the start symbol and
ending with a sentence (all terminal
symbols)
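The recursive list rule can be mirrored by a recursive function. This is a minimal sketch, not part of the original slides: the name `is_ident_list`, the token-list input, and the use of `str.isidentifier` to decide what counts as an ident are all assumptions made for illustration.

```python
def is_ident_list(tokens):
    """Return True if tokens match <ident_list> -> ident | ident , <ident_list>."""
    # Both alternatives start with a single identifier.
    if not tokens or not tokens[0].isidentifier():
        return False
    # First alternative: a lone identifier.
    if len(tokens) == 1:
        return True
    # Second alternative: identifier, comma, then a shorter <ident_list>.
    return tokens[1] == "," and is_ident_list(tokens[2:])


print(is_ident_list(["a", ",", "b", ",", "c"]))  # a well-formed list
print(is_ident_list(["a", ","]))                 # a trailing comma is rejected
```

Each recursive call handles one `ident,` prefix, which is exactly how the second alternative of the rule consumes the list.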
1-8
An Example Grammar
<program> → <stmts>
<stmts> → <stmt> | <stmt> ; <stmts>
<stmt> → <var> = <expr>
<var> → a | b | c | d
<expr> → <term> + <term> | <term> - <term>
<term> → <var> | const
<program> => <stmts> => <stmt>
=> <var> = <expr>
=> a = <expr>
=> a = <term> + <term>
=> a = <var> + <term>
=> a = b + <term>
=> a = b + const
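The derivation above always rewrites the leftmost nonterminal. The sketch below replays it mechanically. The dictionary encoding of the grammar, the name `leftmost_derivation`, and the list of alternative indices are assumptions chosen for illustration, not something the slides prescribe.

```python
GRAMMAR = {
    "<program>": [["<stmts>"]],
    "<stmts>": [["<stmt>"], ["<stmt>", ";", "<stmts>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(start, choices):
    """Apply one rule alternative per step to the leftmost nonterminal."""
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Find the leftmost symbol that is still a nonterminal.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        # Replace it with the chosen right-hand side.
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


# Alternative indices chosen to reproduce the derivation of "a = b + const".
steps = leftmost_derivation("<program>", [0, 0, 0, 0, 0, 0, 1, 1])
print(" =>\n".join(steps))
```

The last sentential form contains only terminals, so it is a sentence of the language.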
1-9
Recursive Grammars
1) S -> SaS
   S -> b
The language (set of strings) generated by the above
grammar is {b, bab, babab, …}, which is infinite.
2) S -> Aa
   A -> Ab | c
The language generated by the above grammar is {ca,
cba, cbba, …}, which is infinite.
Note: A recursive context-free grammar that contains no
useless rules necessarily produces an infinite language.
1-10
Non-Recursive Grammars
S -> Aa
A -> b | c
The language generated by the above grammar is {ba,
ca}, which is finite.
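The difference between the recursive and non-recursive grammars can be observed by listing every sentence up to a length bound. This is a rough sketch under stated assumptions: uppercase letters are nonterminals, no rule derives the empty string, and the helper name `sentences` is hypothetical.

```python
def sentences(rules, start, max_len):
    """Collect every sentence of length <= max_len derivable from start.

    Assumes uppercase letters are nonterminals and that no rule derives the
    empty string, so sentential forms never shrink and long ones can be pruned.
    """
    found, seen, pending = set(), {start}, [start]
    while pending:
        form = pending.pop()
        # Locate the leftmost nonterminal, if any.
        i = next((k for k, ch in enumerate(form) if ch.isupper()), None)
        if i is None:
            found.add(form)  # only terminals left: this is a sentence
            continue
        for rhs in rules[form[i]]:
            new = form[:i] + rhs + form[i + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                pending.append(new)
    return sorted(found, key=lambda s: (len(s), s))


recursive_1 = {"S": ["SaS", "b"]}
recursive_2 = {"S": ["Aa"], "A": ["Ab", "c"]}
non_recursive = {"S": ["Aa"], "A": ["b", "c"]}

print(sentences(recursive_1, "S", 5))     # keeps growing as the bound grows
print(sentences(recursive_2, "S", 4))     # keeps growing as the bound grows
print(sentences(non_recursive, "S", 10))  # unchanged for any larger bound
```

Raising `max_len` adds new sentences for the two recursive grammars, while the non-recursive grammar yields the same two strings at every bound.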
Types of Recursive Grammars
Based on the nature of the recursion, a recursive CFG
can be further divided into the following:
• Left Recursive Grammar (having left recursion)
• Right Recursive Grammar (having right recursion)
• General Recursive Grammar (having general recursion)
1-11
Parse Tree
• A hierarchical representation of a derivation
<program>
  <stmts>
    <stmt>
      <var>
        a
      =
      <expr>
        <term>
          <var>
            b
        +
        <term>
          const
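One possible way to hold such a tree in a program is as nested (label, children) pairs. The representation and the helper names `node`, `leaves`, and `show` are assumptions for illustration. Reading the leaves from left to right gives back the derived sentence.

```python
# A parse tree node is (label, children); a leaf has an empty child list.
def node(label, *children):
    return (label, list(children))


tree = node("<program>",
    node("<stmts>",
        node("<stmt>",
            node("<var>", node("a")),
            node("="),
            node("<expr>",
                node("<term>", node("<var>", node("b"))),
                node("+"),
                node("<term>", node("const"))))))


def leaves(t):
    """Return the leaf labels from left to right (the yield of the tree)."""
    label, children = t
    if not children:
        return [label]
    return [leaf for child in children for leaf in leaves(child)]


def show(t, depth=0):
    """Print one node per line, indented by its depth in the tree."""
    label, children = t
    print("  " * depth + label)
    for child in children:
        show(child, depth + 1)


show(tree)
print(" ".join(leaves(tree)))
```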
1-12
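An Ambiguous Expression Grammar
<expr> → <expr> <op> <expr> | const
<op> → / | -
Two parse trees for const - const / const:
Tree 1, grouped as (const - const) / const
<expr>
  <expr>
    <expr>
      const
    <op>
      -
    <expr>
      const
  <op>
    /
  <expr>
    const
Tree 2, grouped as const - (const / const)
<expr>
  <expr>
    const
  <op>
    -
  <expr>
    <expr>
      const
    <op>
      /
    <expr>
      const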
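Ambiguity means that one sentence has more than one parse tree. The sketch below counts the parse trees of a token sequence under this grammar by trying every operator as the root. The function name `count_trees` and the tuple-of-tokens input are assumptions made for illustration.

```python
from functools import lru_cache

OPS = {"/", "-"}


@lru_cache(maxsize=None)
def count_trees(tokens):
    """Count parse trees of a token tuple under <expr> -> <expr> <op> <expr> | const."""
    if tokens == ("const",):
        return 1
    total = 0
    # Try every operator as the one applied at the root of the tree.
    for i, tok in enumerate(tokens):
        if tok in OPS:
            total += count_trees(tokens[:i]) * count_trees(tokens[i + 1:])
    return total


print(count_trees(("const", "-", "const", "/", "const")))  # 2: the sentence is ambiguous
```

Any count above 1 shows that the grammar is ambiguous for that sentence.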
1-13
An Unambiguous Expression Grammar
• If we use the parse tree to indicate
precedence levels of the operators, we
cannot have ambiguity
<expr> → <expr> - <term> | <term>
<term> → <term> / const | const
Parse tree for const - const / const:
<expr>
  <expr>
    <term>
      const
  -
  <term>
    <term>
      const
    /
    const
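To see the single grouping this grammar allows, the two rules can be turned into two small functions. This is only a sketch: the left recursion is run as a loop (a common workaround, since a direct recursive call would never terminate), digit strings stand in for const, and there is no error handling. The names `parse_expr` and `parse_term` are hypothetical.

```python
def parse_expr(tokens):
    """<expr> -> <expr> - <term> | <term>, with the left recursion run as a loop."""
    left, rest = parse_term(tokens)
    while rest and rest[0] == "-":
        right, rest = parse_term(rest[1:])
        left = f"({left} - {right})"
    return left, rest


def parse_term(tokens):
    """<term> -> <term> / const | const, again with the left recursion as a loop."""
    left, rest = tokens[0], tokens[1:]
    while rest and rest[0] == "/":
        left = f"({left} / {rest[1]})"
        rest = rest[2:]
    return left, rest


grouping, leftover = parse_expr(["8", "-", "4", "/", "2"])
print(grouping)  # (8 - (4 / 2))
```

Because `/` is handled one level below `-`, the division is always grouped first, matching the lower position of `/` in the parse tree.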
1-14
Unambiguous Grammar for Selector
• if-then-else grammar
<if_stmt> -> if (<logic_expr>) <stmt>
| if (<logic_expr>) <stmt> else <stmt>
Ambiguous!
• An unambiguous grammar for if-then-else
<stmt> -> <matched> | <unmatched>
<matched> -> if (<logic_expr>) <matched> else <matched>
| a non-if statement
<unmatched> -> if (<logic_expr>) <stmt>
| if (<logic_expr>) <matched> else <unmatched>
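The intent of the matched/unmatched split is that each else pairs with the nearest if that does not yet have one. The sketch below is not a direct transcription of the grammar. It is a greedy recursive-descent shortcut that produces the same pairing, under an assumed token format where "if" is followed by a one-token condition and any other token is a non-if statement. The name `parse_stmt` is hypothetical.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement; return (tree, next position).

    Token format is an assumption: "if" is followed by a one-token condition,
    and any other token is a non-if statement.
    """
    if tokens[pos] != "if":
        return tokens[pos], pos + 1
    cond = tokens[pos + 1]
    then_part, pos = parse_stmt(tokens, pos + 2)
    # Taking an "else" as soon as one is available pairs it with the nearest if.
    if pos < len(tokens) and tokens[pos] == "else":
        else_part, pos = parse_stmt(tokens, pos + 1)
        return ("if", cond, then_part, "else", else_part), pos
    return ("if", cond, then_part), pos


tree, _ = parse_stmt(["if", "c1", "if", "c2", "s1", "else", "s2"])
print(tree)  # ('if', 'c1', ('if', 'c2', 's1', 'else', 's2'))
```

The else ends up inside the inner if, so the outer if is the unmatched one.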
1-15
Removal of Ambiguity in Grammar
First grammar:
S -> aSbS | bSaS | ε
Second grammar:
S -> AB
A -> Aa | a
B -> b
1-16
We can remove ambiguity on the basis of the following two
properties:
1. Precedence –
If different operators are used, we consider the precedence of
the operators. The three important characteristics are:
• The level at which a production appears denotes the priority
of the operator it uses.
• Productions at higher levels have operators with lower
priority. In the parse tree, the nodes at the top levels, close
to the root node, contain the lower-priority operators.
• Productions at lower levels have operators with higher
priority. In the parse tree, the nodes at the lower levels,
close to the leaf nodes, contain the higher-priority operators.
1-17
2. Associativity –
If operators of the same precedence appear in a production, we
have to consider associativity.
• If the associativity is left to right, we introduce left
recursion in the production. The parse tree will also be left
recursive and grow on the left side.
+, -, *, / are left associative operators.
• If the associativity is right to left, we introduce right
recursion in the productions. The parse tree will also be right
recursive and grow on the right side.
^ is a right associative operator.
1-18
Example 1 – Consider the ambiguous grammar
E -> E-E | id
The language generated by the grammar contains {id, id-id, id-id-id, …}.
Suppose we want to derive the string id-id-id. To see which grouping is
intended, let every id have the value 3. The result should be:
3-3-3 = (3-3)-3 = -3
Since the operators have the same priority, we need to consider
associativity, which is left to right.
1-19
Parse Tree
1-20
So, to make the above grammar unambiguous, simply make the
grammar left recursive by replacing the rightmost non-terminal
E on the right side of the production with another
variable, say P.
E -> E - P | P
P -> id
1-21
Another Operator
Similarly, the unambiguous grammar for the
expression 2^3^2 will be:
E -> P ^ E | P // Right recursive, as ^ is right associative.
P -> id
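The effect of left versus right recursion on the result can be checked with two small evaluators. This is an illustrative sketch: the function names are hypothetical, and the operands are passed as plain lists of numbers with the operators left implicit.

```python
def eval_left_minus(values):
    """E -> E - P | P: fold from the left, as left recursion groups."""
    result = values[0]
    for v in values[1:]:
        result = result - v
    return result


def eval_right_power(values):
    """E -> P ^ E | P: evaluate the right operand first, as right recursion groups."""
    if len(values) == 1:
        return values[0]
    return values[0] ** eval_right_power(values[1:])


print(eval_left_minus([3, 3, 3]))   # -3, i.e. (3 - 3) - 3
print(eval_right_power([2, 3, 2]))  # 512, i.e. 2 ^ (3 ^ 2)
```

Grouping the other way would give 3 - (3 - 3) = 3 and (2 ^ 3) ^ 2 = 64, which is why the direction of recursion in the grammar matters.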
1-22
Task
Consider the grammar shown below, which has two
different operators:
E -> E + E | E * E | id
Clearly, the above grammar is ambiguous, as we can
draw two parse trees for the string “id+id*id”.
Consider the expression:
3+2*5 // “*” has higher priority than “+”
The correct answer is: (3+(2*5)) = 13
1-23
Extended BNF
• Optional parts are placed in brackets [ ]
<proc_call> → ident [(<expr_list>)]
• Alternative parts of RHSs are placed
inside parentheses and separated via
vertical bars
<term> → <term> (+|-) const
• Repetitions (0 or more) are placed inside
braces { }
<ident> → letter {letter|digit}
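The braces in the last rule correspond naturally to a loop. A minimal sketch, assuming that "letter" and "digit" mean whatever Python's `str.isalpha` and `str.isdigit` accept; the name `is_ident` is hypothetical.

```python
def is_ident(text):
    """Check text against <ident> -> letter {letter|digit}."""
    # The rule requires exactly one leading letter.
    if not text or not text[0].isalpha():
        return False
    # The braces allow zero or more further letters or digits.
    i = 1
    while i < len(text) and (text[i].isalpha() or text[i].isdigit()):
        i += 1
    # The whole string must be consumed for a match.
    return i == len(text)


print(is_ident("x1"), is_ident("1x"), is_ident("a_b"))  # True False False
```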
1-24
BNF and EBNF
• BNF
<expr> → <expr> + <term>
| <expr> - <term>
| <term>
<term> → <term> * <factor>
| <term> / <factor>
| <factor>
• EBNF
<expr> → <term> {(+ | -) <term>}
<term> → <factor> {(* | /) <factor>}
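The EBNF form maps almost line for line onto a recursive-descent evaluator, with each { } repetition becoming a while loop. The sketch below is one possible reading. <factor> is not defined in the rules above, so a numeric literal is assumed, and the function names simply follow the nonterminals.

```python
import operator

ADDITIVE = {"+": operator.add, "-": operator.sub}
MULTIPLICATIVE = {"*": operator.mul, "/": operator.truediv}


def expr(tokens, pos=0):
    """<expr> -> <term> {(+ | -) <term>}"""
    value, pos = term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ADDITIVE:  # the { ... } repetition
        op = ADDITIVE[tokens[pos]]
        right, pos = term(tokens, pos + 1)
        value = op(value, right)
    return value, pos


def term(tokens, pos):
    """<term> -> <factor> {(* | /) <factor>}"""
    value, pos = factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in MULTIPLICATIVE:
        op = MULTIPLICATIVE[tokens[pos]]
        right, pos = factor(tokens, pos + 1)
        value = op(value, right)
    return value, pos


def factor(tokens, pos):
    """<factor> is not defined in the text; a numeric literal is assumed here."""
    return float(tokens[pos]), pos + 1


value, _ = expr(["3", "+", "2", "*", "5"])
print(value)  # 13.0
```

Accumulating into `value` inside each loop keeps the operators left associative, which is the same grouping the left-recursive BNF rules describe.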
1-25
