Context-Free Grammar: Lecturer: Ahmed Hadi Al-Taee

Context-Free Grammar

Lecturer: Ahmed Hadi Al-Taee
Department of Software

Objectives
• Context-free grammars (CFGs) are used to specify the valid syntax of programming languages, which is critical for compiling.

• We will study CFGs and their properties.

[Figure: Part of the CFG for Pascal]

A formal definition of CFGs
• A CFG consists of
   • A set of terminals T
   • A set of non-terminals N
   • A start symbol S (one of the non-terminals)
   • A set of productions:

X → Y1 Y2 … Yn
where X ∈ N and each Yi ∈ T ∪ N ∪ {ε}
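
The four components can be written down directly as a data structure. The following is a minimal Python sketch, not part of the original slides: the class name CFG, its field names, and the method is_well_formed are illustrative assumptions, and the empty tuple is used here to encode an ε right-hand side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    """Hypothetical container for the four components of a CFG."""
    terminals: frozenset       # T
    non_terminals: frozenset   # N
    start: str                 # S, one of the non-terminals
    productions: tuple         # pairs (X, (Y1, ..., Yn)); () encodes ε

    def is_well_formed(self) -> bool:
        # The start symbol must be a non-terminal.
        if self.start not in self.non_terminals:
            return False
        # T and N must not share symbols.
        if self.terminals & self.non_terminals:
            return False
        symbols = self.terminals | self.non_terminals
        # Every production needs X in N and every Yi in T ∪ N.
        return all(
            lhs in self.non_terminals and all(y in symbols for y in rhs)
            for lhs, rhs in self.productions
        )


# Illustrative grammar: S → aSb | ε
G = CFG(
    terminals=frozenset({"a", "b"}),
    non_terminals=frozenset({"S"}),
    start="S",
    productions=(("S", ("a", "S", "b")), ("S", ())),
)
```

Calling G.is_well_formed() checks the side conditions of the definition (X ∈ N, each Yi ∈ T ∪ N) for the grammar S → aSb | ε.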

Example
• Grammar:

S → aSb
S → ε
• Derivation of sentence ab:
S ⇒ aSb ⇒ ab

Rules applied: S → aSb, then S → ε

Notation
a, b, c    lower case: terminals
A, B, S    upper case: non-terminals
ε          often reserved for the empty string
Strings in the generated language consist of terminals only.
→     Used to describe rules.
⇒     Used for derivations.
⇒*    Derives in zero or more steps.

• Grammar:
S → aSb
S → ε

• Derivation of sentence aabb:

S ⇒ aSb ⇒ aaSbb ⇒ aabb

Rules applied: S → aSb twice, then S → ε

• Other derivations:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb

S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb
  ⇒ aaaaSbbbb ⇒ aaaabbbb

• Language of the grammar
S → aSb
S → ε

L = { a^n b^n : n ≥ 0 }
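
The pattern behind these derivations (apply S → aSb n times, then S → ε once) can be spelled out mechanically. This is a small Python sketch under that reading; the function name derive_anbn is an illustrative assumption, not something from the slides.

```python
def derive_anbn(n):
    """Return the sentential forms in the derivation of a^n b^n."""
    forms = ["S"]
    for k in range(1, n + 1):
        # Apply S → aSb: one more 'a' on the left, one more 'b' on the right.
        forms.append("a" * k + "S" + "b" * k)
    # Apply S → ε to remove the remaining non-terminal.
    forms.append("a" * n + "b" * n)
    return forms


print(" => ".join(f or "ε" for f in derive_anbn(2)))
```

For n = 2 the list of forms is S, aSb, aaSbb, aabb, matching the derivation of aabb shown earlier.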

Example:
L = { a^n cc b^n : n ≥ 0 }
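
No rules are listed for this language here. One grammar that generates it is S → aSb | cc; this grammar is an assumption added for illustration, not taken from the slides. The Python sketch below mirrors those two rules as a recursive membership test (the name in_language is hypothetical).

```python
def in_language(s):
    """Membership test for { a^n cc b^n : n >= 0 }, mirroring S → aSb | cc."""
    if s == "cc":                      # base rule S → cc
        return True
    if s.startswith("a") and s.endswith("b") and len(s) >= 2:
        return in_language(s[1:-1])    # peel one application of S → aSb
    return False


for candidate in ["cc", "aaccbb", "acb", "accbb"]:
    print(candidate, in_language(candidate))
```

Each recursive call undoes one application of S → aSb, so the recursion depth equals n.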

Example: L = { w w^R : w ∈ {a, b}* }
Rules: Start symbol S
S → aSa
S → bSb
S → ε
A derivation:
S ⇒ bSb ⇒ bbSbb ⇒ bbaSabb ⇒ bbaabb
Shorthand:
S ⇒* bbaabb
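
The derivation above follows w one symbol at a time: each symbol of w picks either S → aSa or S → bSb, and S → ε finishes. A Python sketch of that idea follows; the function name derive_wwr is an illustrative assumption.

```python
def derive_wwr(w):
    """Return the sentential forms in the derivation of w followed by its reverse."""
    forms = ["S"]
    left = ""
    for ch in w:
        if ch not in ("a", "b"):
            raise ValueError("w must be a string over {a, b}")
        left += ch
        # Apply S → aSa or S → bSb, depending on ch.
        forms.append(left + "S" + left[::-1])
    # Apply S → ε.
    forms.append(left + left[::-1])
    return forms


print(" => ".join(derive_wwr("bba")))
```

With w = bba this reproduces the derivation of bbaabb step by step.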

Example, revisited
• Note: a more compact way to write the previous grammar:
E → intlit | E - E | E / E | ( E )

Or
E → intlit
  | E - E
  | E / E
  | ( E )

Notational Conventions
• In these lecture notes
   • Non-terminals are written in upper-case
   • Terminals are written in lower-case
   • The start symbol is the left-hand side of the first production (unless specified otherwise)

The Language of a CFG
The language defined by a CFG is the set of strings that can be derived from the start symbol of the grammar.

Derivation: read productions as rules:

X → Y1 … Yn
means X can be replaced by Y1 … Yn

Derivation: key idea
1. Begin with a string consisting of the start symbol "S"
2. Replace any non-terminal X in the string by the right-hand side of some production X → Y1 … Yn
3. Repeat (2) until there are no non-terminals in the string
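
The three steps can be sketched as a short Python loop. This is an illustration under stated assumptions, not the lecture's own algorithm: the grammar is passed as a dictionary from each non-terminal to its list of right-hand sides, choices in step 2 are made at random, and max_steps is a safety bound added here because a random derivation need not terminate.

```python
import random


def derive(productions, start, rng, max_steps=50):
    """Return the list of sentential forms, or None if max_steps is reached."""
    form = [start]                      # step 1: begin with the start symbol
    history = [form[:]]
    for _ in range(max_steps):
        positions = [i for i, sym in enumerate(form) if sym in productions]
        if not positions:               # step 3: no non-terminals remain
            return history
        i = rng.choice(positions)       # step 2: pick any non-terminal ...
        rhs = rng.choice(productions[form[i]])   # ... and one of its productions
        form[i:i + 1] = rhs             # replace it by the right-hand side
        history.append(form[:])
    return None


grammar = {"S": [["a", "S", "b"], []]}   # S → aSb | ε
steps = derive(grammar, "S", random.Random(0))
if steps is not None:
    print(" => ".join("".join(f) or "ε" for f in steps))
```

Because step 2 allows any non-terminal and any matching production, different random seeds give different derivations, all ending in a string of terminals only.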

Derivation: an example
CFG:
E → id
E → E + E
E → E * E
E → (E)

Derivation:
E ⇒ E + E
  ⇒ E * E + E
  ⇒ id * E + E
  ⇒ id * id + E
  ⇒ id * id + id

String id * id + id is in the language defined by the grammar.

Terminals
• Terminals are so called because there are no rules for replacing them

• Once generated, terminals are permanent

• Therefore, terminals are the tokens of the language

The Language of a CFG
Let G be a context-free grammar with start symbol S. Then the language of G is:

{ a1 … an | S ⇒* a1 … an and every ai is a terminal }
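
For a small grammar this set can be explored by brute force. The Python sketch below is a bounded search, not a general algorithm: it expands only the leftmost non-terminal (every derivable string has a leftmost derivation), drops forms with more than max_len terminals, and also drops forms longer than max_len + slack symbols. The names language_up_to and slack are assumptions made here, and for grammars whose non-terminals can vanish through ε-productions the slack bound makes the result a lower approximation in general.

```python
from collections import deque


def language_up_to(productions, start, max_len, slack=2):
    """Collect derivable terminal strings with at most max_len terminals."""
    seen = {(start,)}
    queue = deque([(start,)])
    words = set()
    while queue:
        form = queue.popleft()
        # Index of the leftmost non-terminal, or None if all symbols are terminals.
        idx = next((i for i, s in enumerate(form) if s in productions), None)
        if idx is None:
            words.add("".join(form))
            continue
        for rhs in productions[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            n_terminals = sum(1 for s in new if s not in productions)
            if (n_terminals <= max_len and len(new) <= max_len + slack
                    and new not in seen):
                seen.add(new)
                queue.append(new)
    return words


grammar = {"S": [("a", "S", "b"), ()]}   # S → aSb | ε
print(sorted(language_up_to(grammar, "S", 4), key=len))
```

For S → aSb | ε and max_len = 4 the search finds the empty string, ab, and aabb.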

Example
Strings of balanced parentheses: { (^i )^i | i ≥ 0 }

The grammar:
S → (S)
S → ε
which is the same as
S → (S) | ε

Another Example
A simple arithmetic expression grammar:
E → E + E | E * E | (E) | id
Some strings in the language of this grammar:
id            id + id
(id)          id * id
(id) * id     id * (id)

Derivations and Parse Trees
A derivation is a sequence of productions
S ⇒ … ⇒ … ⇒ …
A derivation can be drawn as a tree
• The start symbol is the tree's root
• For a production X → Y1 … Yn, add children Y1 … Yn to node X
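
The two tree-building rules can be sketched in Python. This is an illustration under assumptions: the derivation is given as a list of productions in leftmost order, the names Node, build_tree, and tree_yield are hypothetical, and an ε right-hand side is drawn as a single ε leaf that contributes nothing to the yield.

```python
EPS = "ε"


class Node:
    def __init__(self, symbol):
        self.symbol = symbol
        self.children = []


def build_tree(start, steps, non_terminals):
    """Build a parse tree from productions (X, rhs) listed in leftmost order."""
    root = Node(start)                 # the start symbol is the root
    stack = [root]                     # unexpanded non-terminals, leftmost on top
    for lhs, rhs in steps:
        node = stack.pop()
        if node.symbol != lhs:
            raise ValueError("steps are not a leftmost derivation")
        # For X → Y1 ... Yn, add children Y1 ... Yn to node X.
        node.children = [Node(y) for y in rhs] or [Node(EPS)]
        stack.extend(c for c in reversed(node.children)
                     if c.symbol in non_terminals)
    return root


def tree_yield(node):
    """Read the leaves from left to right, skipping ε leaves."""
    if not node.children:
        return [] if node.symbol == EPS else [node.symbol]
    return [s for child in node.children for s in tree_yield(child)]


# Leftmost derivation of id * id + id with E → E+E | E*E | (E) | id
steps = [("E", ["E", "+", "E"]), ("E", ["E", "*", "E"]),
         ("E", ["id"]), ("E", ["id"]), ("E", ["id"])]
root = build_tree("E", steps, {"E"})
print(" ".join(tree_yield(root)))
```

The stack keeps the leftmost unexpanded non-terminal on top, which is why the productions must be supplied in leftmost-derivation order.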

Derivation Example
• Grammar
E → E + E | E * E | (E) | id
• String id * id + id

Derivation Example (continued)
E
⇒ E + E
⇒ E * E + E
⇒ id * E + E
⇒ id * id + E
⇒ id * id + id

Parse tree:

          E
        / | \
       E  +  E
     / | \   |
    E  *  E  id
    |     |
    id    id

Notes on Derivations
• A syntax tree or parse tree has
   • Terminals at the leaves
   • Non-terminals at the interior nodes

• An in-order traversal of the leaves yields the original input string
• As in the preceding example, we usually show a left-most derivation, that is, we replace the left-most non-terminal remaining at each step

Derivation Trees
S → A | AB
A → ε | a | Ab | AA
B → b | bc | Bc | bB

w = aabb

[Figure: derivation trees for w = aabb]

Other derivation trees for this string? Infinitely many others are possible.

S → AB     A → aaA | ε     B → Bb | ε

The derivation tree grows by one production per step:

S ⇒ AB
S ⇒ AB ⇒ aaAB
S ⇒ AB ⇒ aaAB ⇒ aaABb
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb
S ⇒ AB ⇒ aaAB ⇒ aaABb ⇒ aaBb ⇒ aab

Derivation tree:

            S
          /   \
         A     B
       / | \   | \
      a  a  A  B  b
            |  |
            ε  ε

yield: aaεεb = aab

Leftmost, Rightmost Derivations

Definition. A left-most derivation of a sentential form is one in which rules transforming the left-most nonterminal are always applied.

Definition. A right-most derivation of a sentential form is one in which rules transforming the right-most nonterminal are always applied.

Leftmost & Rightmost Derivations
S → A | AB
A → ε | a | Ab | AA
B → b | bc | Bc | bB

Sample derivations:
S ⇒ AB ⇒ AAB ⇒ aAB ⇒ aaB ⇒ aabB ⇒ aabb
S ⇒ AB ⇒ AbB ⇒ Abb ⇒ AAbb ⇒ Aabb ⇒ aabb

Derivation tree (the same for both derivations):

        S
      /   \
     A     B
    / \   / \
   A   A b   B
   |   |     |
   a   a     b

These two derivations are special.
The 1st derivation is leftmost: it always picks the leftmost variable.
The 2nd derivation is rightmost: it always picks the rightmost variable.

Leftmost Derivation and Rightmost Derivation
E → E O E
E → (E)
E → id
O → + | - | * | /

Leftmost derivation:       Rightmost derivation:
E                          E
⇒ E O E                    ⇒ E O E
⇒ (E) O E                  ⇒ E O id
⇒ (E O E) O E              ⇒ E * id
⇒ (id O E) O E             ⇒ (E) * id
⇒ (id + E) O E             ⇒ (E O E) * id
⇒ (id + id) O E            ⇒ (E O id) * id
⇒ (id + id) * E            ⇒ (E + id) * id
⇒ (id + id) * id           ⇒ (id + id) * id

Derivation Order
1. S → AB     2. A → aaA     4. B → Bb
              3. A → ε       5. B → ε

Leftmost derivation (rule numbers in parentheses):
S ⇒(1) AB ⇒(2) aaAB ⇒(3) aaB ⇒(4) aaBb ⇒(5) aab

Rightmost derivation (rule numbers in parentheses):
S ⇒(1) AB ⇒(4) ABb ⇒(5) Ab ⇒(2) aaAb ⇒(3) aab
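
Both rule orders can be replayed mechanically. The Python sketch below assumes the five numbered rules above, writes sentential forms as plain strings with upper-case letters as non-terminals, and uses a hypothetical function name apply_in_order; it raises an error if a rule does not match the non-terminal chosen by the strategy.

```python
# The five numbered rules; "" encodes ε.
RULES = {1: ("S", "AB"), 2: ("A", "aaA"), 3: ("A", ""),
         4: ("B", "Bb"), 5: ("B", "")}


def apply_in_order(rule_numbers, leftmost=True, start="S"):
    """Apply rules to the leftmost (or rightmost) non-terminal at each step."""
    form = start
    forms = [form]
    for k in rule_numbers:
        lhs, rhs = RULES[k]
        positions = [i for i, c in enumerate(form) if c.isupper()]
        if not positions:
            raise ValueError("no non-terminal left to rewrite")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != lhs:
            raise ValueError(f"rule {k} does not apply to {form[i]}")
        form = form[:i] + rhs + form[i + 1:]
        forms.append(form)
    return forms


print(" => ".join(apply_in_order([1, 2, 3, 4, 5], leftmost=True)))
print(" => ".join(apply_in_order([1, 4, 5, 2, 3], leftmost=False)))
```

The two calls reproduce the leftmost and rightmost derivations of aab; swapping the rule orders between the two strategies raises an error, which shows that the order of rule application is what distinguishes them.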

Grammar Types
• Grammars are classified into four types: type 0, type 1, type 2, and type 3.
• A type 0 grammar has no restrictions on its productions.
• A type 1 grammar can have productions of the form w1 → w2, where w1 = lAr and w2 = lwr, where A is a nonterminal symbol, l and r are strings of zero or more terminal or nonterminal symbols, and w is a nonempty string of terminal or nonterminal symbols. It can also have the production S → ε as long as S does not appear on the right-hand side of any other production.

Grammar Types (Cont.)
• A type 2 grammar can have productions only of the form w1 → w2, where w1 is a single symbol that is not a terminal symbol.
• A type 3 grammar can have productions only of the form w1 → w2 with w1 = A and either w2 = aB or w2 = a, where A and B are nonterminal symbols and a is a terminal symbol, or with w1 = S and w2 = ε.
• The grammars form a hierarchy; that is, every type 3 grammar is a type 2 grammar, every type 2 grammar is a type 1 grammar, and every type 1 grammar is a type 0 grammar.
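
The type 2 and type 3 conditions are simple enough to check mechanically. The Python sketch below assumes productions are given as (w1, w2) string pairs, that upper-case letters are nonterminals and every other character is a terminal, and that "" encodes ε; the function names is_type2 and is_type3 are hypothetical. The type 1 condition is left out of this sketch.

```python
def is_type2(productions):
    """Every left-hand side is a single nonterminal symbol."""
    return all(len(w1) == 1 and w1.isupper() for w1, _ in productions)


def is_type3(productions, start="S"):
    """Every production is A → aB, A → a, or S → ε."""
    for w1, w2 in productions:
        if not (len(w1) == 1 and w1.isupper()):
            return False
        if w2 == "":
            if w1 != start:                      # only S → ε is allowed
                return False
        elif len(w2) == 1:
            if w2.isupper():                     # w2 = a must be a terminal
                return False
        elif len(w2) == 2:
            if w2[0].isupper() or not w2[1].isupper():   # w2 = aB
                return False
        else:
            return False
    return True


palindromes = [("S", "aSa"), ("S", "bSb"), ("S", "")]   # the w w^R grammar
regular = [("S", "aA"), ("A", "bA"), ("A", "b")]       # illustrative grammar
print(is_type2(palindromes), is_type3(palindromes))
print(is_type2(regular), is_type3(regular))
```

Under these definitions the w w^R grammar S → aSa | bSb | ε passes the type 2 check but fails the type 3 check, because aSa is neither of the form aB nor a single terminal.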

Examples
Grammar                                        Type
S → C, C → 0CAB, S → ε, BA → AB,               Type 1
0A → 01, 1A → 11, 1B → 12, and 2B → 22

S → AB, A → Ca, B → Ba, B → Cb,                Type 2
B → b, C → cb, C → b

S → aSa, S → bSb, S → ε                        Type 2 (not type 3: the right-hand sides aSa and bSb are not of the form aB or a)

