Chapter4 - New Context Free Grammar (CFG) (1)

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 65

Chapter 4

Context-Free Grammars

•Context-Free Grammars and Languages


•Regular Grammars
outcome
• understand/ describe/ differentiate
basic Context free grammar (CFG),
Regular grammar (RG)
• Generate string from CFG/RG in
sentencial form and derivation tree
• Write CFG/RG for a language
• Convert regular expression (RE) to CFG/
RG and vice versa
Suriati2022 Theory of Computer Science 2
Formal Definition of
Context-Free Grammars (CFG)
A CFG can be formally defined by a quadruple
of (V, , P, S) where:
– V is a finite set of variables (non-terminal)
–  (the alphabet) is a finite set of terminal
symbols , where V   = 
– P is a finite set of rules (production rules) written
as: A   for A  V,   (v  )*.
– S is the start symbol, S  V

3
Formal Definition of CFG
• We can give a formal description to a
particular CFG by specifying each of its
four components, for example,
G = ({S, A}, {a, b}, P, S) where P
consists of three rules:
S → aSb
S→A
A→
Suriati2022 Theory of Computer Science 4
Context-Free Grammars
• Terminal symbols – elements of the
alphabet
• Variables or non-terminals – additional
symbols used in production rules
• Variable S (start symbol) initiates the
process of generating acceptable
strings.

5
Terminal or Variable ?
• S → (S) | S + S | S × S | A
•A→a|b|c

• The terminal symbols are { (, ), +,


×, a, b, c}
• The variable symbols are S and A

Theory of Computer Science 6


Context-Free Grammars
• A rule is an element of the set
V  (V  )*.
• An A rule:
[A, w] or A  w
• A null rule or lambda rule:
A
7
Definition
Let CFG G = (V, , P, S) and v  (V  )*.
The set of strings derivable from v is defined
recursively as follows:
1. Basis: v is derivable from v
2. Recursive step:
if u = xAy is derivable from v,
and A  w  P,
then xwy is derivable from v
3. Closure: Precisely those strings constructed from
v by finitely many applications of the recursive
step are derivable from v
8
Context-Free Grammars
• Grammars are used to generate
strings of a language.
• An A rule can be applied to the
variable A whenever and wherever
it occurs.
• No limitation on applicability of a
rule – it is context free
9
Context-Free Grammars
• CFG have no restrictions on the right-hand side of
production rules. All of the following are valid CFG
production rules:
S  aSbS Sc
Sλ S  xyz
S  ABcDEF S  MZK
• Notice that there can be a λ, one or more terminals,
one or more non-terminals and any combination of
terminals and non-terminals on the right-hand side
of a production rule.
Theory of Computer Science 10
Generating Strings with CFG
• Generate a string by applying rules
– Start with the initial symbol
– Repeat:
• Pick any non-terminal in the string
• Replace that non-terminal with the right-hand side of
some rule that has that non-terminal as a left-hand
side
• Repeat until all elements in the string are terminals
• E.g. : P: S  uAv
Aw
We can derived string uwv as: S uAv uwv

11
Generating Strings with CFG
P: S  aS S  Bb
B  cB B
• Generating a string:
S replace S with aS
aS replace S with Bb
aBb replace B with cB
acBb replace B with 
acb Final String
Theory of Computer Science 12
Generating Strings with CFG
P: S  aS S  Bb
B  cB B
• Generating a string:
S replace S with aS
aS replace S with aS
aaS replace S with Bb
aaBb replace B with cB
aacBb replace B with cB
aaccBb replace B with 
aaccb Final String
Theory of Computer Science 13
Generating Strings with CFG

P: S  aS S  Bb
B  cB B
• Regular expression equivalent to
this CFG: a c b
• Shortest string: S Bb b.
Theory of Computer Science 14
Derivation
• A derivation is a listing of how a string is generated –
showing what the string looks like after every
replacement.
S  AB S AB
A  aA |  aAB
B  bB |  aAbB
abB
abbB
abb

Theory of Computer Science 15


Derivation
• A terminal string is in the language of
the grammar if it can be derived from
the start symbol S using the rules of
the grammar:
• Example: S  AASB | AAB
Aa
B  bbb
16
Derivation
Derivation Rule Applied
S AASB S  AASB
AAAASBB S  AASB
AAAAAASBBB S  AASB
AAAAAAAABBBB S  AAB
aaaaaaaaBBBB Aa
aaaaaaaabbbbbbbbbbbb B  bbb

17
Derivation
• Let G be the grammar :
S  aS | aA
A  bA | 
• The derivation of aabb is as shown:
S  aS
 aaA
 aabA
 aabbA
 aabb  aabb
18
Example
• Let G be the grammar :
S  aSb | ab
• The derivation of aaaabbbb is as shown:
S  aSb
 aaSbb
 aaaSbbb
 aaaabbbb
 Set notation equivalent to this CFG is
L = {anbn | n > 0}
19
CFG – Generating strings
• E.g. : Formal description of G = ({S}, {a, b}, P, S)
P: S  aS | bS | 
• The following strings can be derived:
S 
S aS a a
S bS b b
S aS aaS aa aa
S aS abS ab ab
S bS baS ba ba
S aS abS abbS abb abb

20
CFG – Generating strings
• E.g. : G = ({S}, {a,b}, P, S)
P: S  aS | bS | 
• The language above can also be
defined using regular expression:
L(G) = (a + b)*

21
CFG – Generating strings
• E.g. : G = ({S, A}, {a, b}, P, S)
P: S  AA
A  AAA | bA | Ab | a
• The following strings can be derived:
S AA
S aA rule [A  a] Shortest
S aAAA rule [A  AAA] string ?
S abAAA rule [A  bA]
S abaAA rule [A  a]
S abaAbA rule [A  Ab]
S abaabA rule [A  a]
S abaaba rule [A  a]
22
CFG – Generating strings
G = (V, , P, S), V = {S, A},  = {a, b},
P: S  AA
A  AAA | bA | Ab | a
• Four distinct derivations of ababaa in G:
(a) S AA (b) S AA (c) S AA (d) S AA
aA AAAA Aa aA
aAAA aAAA AAAa aAAA
abAAA abAAA AAbAa aAAa
abaAA abaAA AAbaa abAAa
ababAA ababAA AbAbaa abAbAa
ababaA ababaA Ababaa ababAa
ababaa ababaa ababaa ababaa
(a) & (b) leftmost, (c) rightmost, (d) arbitrary
23
Context-Free Grammars
S aS aSa aba
• Sentencial forms are the strings derivable
from start symbol of the grammar.
• Sentences are forms that contain only
terminal symbols.
• A set of strings over  is context-free
language if there is a context-free grammar
that generates it.
24
exercise
• G = ({S, A}, {a, b}, P, S)
P: S  AA
A  AAA | bA | Ab | a
• Show the derivation of baba (in
sentencial form)

Theory of Computer Science 25


Derivation Tree, DT
G = ({S, A}, {a, b}, P, S)
P: S  AA
A  AAA | bA | Ab | a

The derivation tree for


S AA aA abA abAb abab
26
Derivation Tree
S aS aSa aba S Sa aSa aba

27
Derivation Tree

• A CFG G is ambiguous if there exist more


than 1 DT for n, where n  L(G).
• Example: G = ({S}, {a, b}, P, S)
P: S  aS | Sa | b
The string aba can be derived as:
S aS aSa aba or S Sa aSa aba
28
exercise
Let G be the grammar:
S → SAB | λ
A → aA | a
B → bB | λ
1. Give a leftmost derivation of aabaaabb. Build the
derivation tree and sentencial form.
2. Give another derivation tree of aabaaabb which is
different from that of (a).

Theory of Computer Science 29


How to convert RE to CFG
General rules:
Type of Regular Context Free
operation expression (RE) Grammar (CFG)
union a+b Sa|b
Concatenation ab S AB
Aa
Bb
Kleene star a* S  aS | 
Kleene plus S  aS | a
Kleene star on (a + b)* S aS | bS | 
union
Theory of Computer Science 30
How to convert RE to CFG
Regular expression Context Free
(RE) Grammar (CFG)
ab* S  AB
Aa
B  bB | λ
a*b S  AB
A  aA | λ
Bb
(ab)* S abS | λ
a*b* S  AB
A  aA | λ
B  bB | λ 31
Example CFG
Length 2 even length
(a+b)(a+b) ((a+b)(a+b))*
S  AA S  BS | λ
Aa|b B  AA
Aa|b
n>=0 n>=1
S  aSb | λ S  aSb | ab

Theory of Computer Science 32


exercise
• ab(a+b)*
Write the CFG for this RE

Theory of Computer Science 33


example
• Generate CFG for language- at least
length 2
• RE – (a+b)(a+b)(a+b)*
• S  AAB
• A a|b
• B aB|bB|λ
Theory of Computer Science 34
Example CFG
• A string w is a palindrome if w = wR
• The set of palindrome over {a, b}
can be derived using rules:
S  aSa | bSb
Sa|b|

35
Design example
L = {0n1n | n  0}
These strings have recursive structure:
000000111111
0000011111
00001111
000111
0011
S  0S1| λ
01
λ
Design examples
Allowed strings
L = {0n1n0m1m | n  0, m  0} 010011
00110011
These strings have two parts: 000111
L = L1L2
L1 = {0n1n | n  0}
L2 = {0m1m | m  0}
S  AA
rules for L1: S1  0S11| λ A  0A1 | λ
L2 is the same as L1
Design examples Allowed strings
011001
L = {0n1m0m1n | n  0, m  0} 0011
1100
00110011

outer part: 0n1n S  0S1|A


inner part: 1m0m A  1A0 | λ
Example CFG

• L(G) =
{a b a |
n m n n > 0, m > 0}
• Let G be the grammar given by
the production
S  aSa | aBa
B  bB | b
39
Example of CFG

• G = ({S, A, B, C}, {a, b}, P, S)


P: S  aSbb | abb
• L(G) = {a b |n  1}
n 2n

Theory of Computer Science 40


Example
• Let L(G) =
{a b c d |
n m m 2n n≥0, m>0}
• Then the production rules for this
grammar is:
S  aSdd | A
A  bAc | bc

41
Example
• Consider the grammar:
S  abScB | 
B  bB | b

• The language of this grammar consists of:


{(ab) (cb ) |
n m n n n ≥ 0, mn > 0}

42
exercise
For each of the following context free grammar, use set
notation to define the language generated by the
grammar.
– S → aaSB | λ (aa)mbn
B → bB | b
– S → aSbb | A ancmb2n
A → cA | c
– S → abSdc | A (ab)n(cd)m(ba)m(dc)n
A → cdAba | λ
Theory of Computer Science 43
Regular Grammars
• Regular grammars are an important
subclass of context-free grammars that
play an important role in the lexical
analysis and parsing of programming
languages.
• Regular grammars is a subset of CFG.
• Regular grammars are obtained by
placing restrictions on the form of the
right hand side of the rules.
44
Definition
• A regular grammar is a CFG in which each rule
has one of the following form:
1. A  a
2. A  aB
3. A  λ
where A, B  V, and a  , A, B is a non-
terminal and a can be one or more terminals
45
Regular Grammars
• There is at most ONE variable in a
sentential form – the rightmost
symbol in the string.
• Each rule application adds ONE
terminal to the derived string.
Derivation is terminated by rules:
A  a OR Aλ
46
Regular Grammar
• The language L = a+b*
can also be defined as:
L = ( V, , P, S)
where: V = {S, A, B}
 = {a, b}
P: S  aS | aB
B  bB | 
47
Non-Regular Language
• The language L = a+b*
can also be defined as:
L = ( V, , P, S)
where: V = {S, A, B}
 = {a, b}
P: S  AB non-regular
A  aA | a grammar
B  bB | 
48
Context Free Grammar (CFG) vs
Regular Grammar (RG) vs non Regular
Grammar (nRG),

CFG
nRG
RG

Theory of Computer Science 49


Example
• Consider the grammar:
G: S  abSA | 
A  Aa | 
• The equivalent regular grammar:
Gr : S  aB | 
B  bS | bA
A  aA | 
50
CFG : RG vs nRG
Length 2 – (a+b)(a+b)
nRG RG
S  AA S aA|bA
Aa|b Aa|b
Even length ((a+b)(a+b))*
nRG RG
S  BS | λ S aA|bA|λ
B  AA A  aS | bS
Aa|b

Theory of Computer Science 51


How to convert RE to RG
Two ways : Direct conversion and from RE->NFA->RG
Direct conversion
General rules:
Type Regular Regular Grammar
expression (RE) (RG)
union a+b Sa|b
Concatenation ab S aA
Ab
Kleene star a* S  aS | 
Kleene plus S  aS | a
Kleene star on (a + b)* S aS | bS | 
union Theory of Computer Science 52
How to convert RE to RG
Direct conversion
• Start variable, S should produce/ correspond to the first
character in the RE.
• represents the other characters with a new variable or a
terminal and variable.
• Each loop state turns into a recursive definition on a variable
Example:
a b*
Step 1 a b* S  aA
A
Step 2 A = b* A  bA
Step 3 If b* =  Aλ
So, RG for this RE : S aA; AbA | λ
Theory of Computer Science 53
How to convert RE to RG
Regular expression Regular Grammar
(RE) (RG)
ab* S  aA
A  bA | λ
a*b S  aS | A
Ab
(ab)* S aA | λ
A bS
a*b* S  aS | A
A  bA | λ
Theory of Computer Science 54
How to convert RE to RG
example
RE : a*b b b*
Step 1 a* b b b* S  aS

Step 2 a* b bb* S  bA
A
If a* = , sbA
Step 3 A = b b* A  bB
B
Step 4 B = b* BbB
Step 5 If b* = , B= B 55
exercise
• What is the RG of the following RE ?
• ab*a
S → aA;A → bA;A → a

• ab*ab
S → aA;A → bA;A → aB;B → b

Theory of Computer Science 56


How to convert RE to RG
RE NFARG
Construct RG from NFA (Non-determinism Finite Automata)
1. Given RE : a(a + b)*
2. Draw NFA for this RE as shown below
a, b
a
q0 q1

3. Change the states in NFA into the grammar variables


(nonterminal) such as State q0 as S, q1 as A. Alphabets
(inputs) used in NFA are {a, b} is considered as terminals of
the grammar. The transitions become productions.
Theory of Computer Science 57
How to convert RE to RG
-Continue- a, b
a
S A

4. Now, we can construct the production of the


grammar by moment of levels (S, A which represent
states) using input/terminals. Begin the process from
the start state, S.
5. At state S, if the input is ‘a’, move to state A, so
SaA. At state A, if the input is ‘b’, go to state A
itself, so A bA. Also, if the input is ‘a’, go to state A
itself, so AaA.
Theory of Computer Science 58
How to convert RE to RG
Continue a, b
a
S A

6. For every final state must have a lambda move.


State A is a final state, so A
7. Gather all the production rules. The regular
grammar for this regular expression is
S  aA
A bA | aA |

Theory of Computer Science 59


How to convert RE to RG
• if there is no transition at the final state, you can
eliminate the variable for the final state.
Example
RE : (a+b)*b
NFA a, b
b
S A

• SaS ; SbS ; SbA ; A 


• SbA ; A  are the same as Sb
• So, RG for this RE is
SaS|bS|b Theory of Computer Science 60
Exercises
• What is the regular grammar of this regular
expression ?
• a(a + b)*b S  aA
A  aA | bA | b
• aa*bb* S  aA
A  aA | bB
B  bB | 
• a*bba* S  aS | bA
A  bB
B  aB | 
61
Non-RG to RG
• Since every RG defines a RL and every RL must have a
RE, it would be easier to go ahead and figure out
what the RE is and convert it to the RG.
• Eg. : CFG that is not regular: S  XYX
X  aX | bX | 
Y  aa | bb
• The equivalent RE for it is: (a + b)*(aa + bb)(a + b)*
• Convert it to RG: S  aS | bS | X
X  aaY | bbY
Y  aY | bY | 

Sept2011 Theory of Computer Science 62


Example

• Grammar for even-length strings over


{a, b}:
S  aE | bE | 
E  aS | bS

63
Example of RG

• G = ({S, A, B, C}, {a, b}, P, S)


P: S  aS | aB
B  bC
C  aC | a
• L(G) = {a ba |m, n  1}
m n

Theory of Computer Science 64


exercise
• Generate RG for
• L(G) = {a b |m, n  1}
m n

• SaS|aB
• BbB|b

Theory of Computer Science 65

You might also like