CFLProperties
Reading: Chapter 7
Introduction to Automata Theory, Languages, and Computation
JOHN E. HOPCROFT, RAJEEV MOTWANI, JEFFREY D. ULLMAN
3rd Edition
Topics
1) Simplifying CFGs
2) Normal forms
How to “simplify” CFGs?
Three ways to simplify/clean a CFG
1. Eliminate useless symbols (clean)
2. Eliminate ε-productions, A → ε (simplify)
3. Eliminate unit productions, A → B (simplify)
Eliminating useless symbols
Grammar cleanup
Eliminating useless symbols
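A symbol X is reachable if there exists a derivation:
S ⇒* αXβ
A symbol X is generating if there exists a derivation:
X ⇒* w, for some terminal string w
A symbol is useful only if it is both reachable and generating; otherwise it is useless.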
Algorithm to detect useless symbols
1. First, eliminate all symbols that are not generating
2. Next, eliminate all symbols that are not reachable
Example: Useless symbols
S → AB | a
A → b
1. A, S are generating
2. B is not generating (and therefore B is useless)
3. Eliminating B (i.e., remove all productions that involve B) leaves:
S → a
A → b
4. Now, A is not reachable and therefore is useless
5. Simplified G:
S → a
What would happen if you reverse the order, i.e., test reachability before generating?
It will fail to remove: A → b
Algorithm to find all generating symbols (X ⇒* w)
Given: G=(V,T,P,S)
Basis:
Every symbol in T is obviously generating.
Induction:
Suppose there is a production A → α, where every symbol of α is generating (this includes α = ε).
Then, A is also generating.
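The basis and induction above amount to a fixed-point iteration. A minimal Python sketch follows, assuming a grammar is represented as a set of terminals plus a list of (head, body) pairs with each body a tuple of symbols; this representation and the name `generating_symbols` are illustrative choices, not part of the slides.

```python
def generating_symbols(terminals, productions):
    """Return all generating symbols, found by fixed-point iteration."""
    generating = set(terminals)  # basis: every terminal is generating
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # induction: the head is generating if every body symbol is
            # (an empty body, i.e. an epsilon-production, qualifies trivially)
            if head not in generating and all(s in generating for s in body):
                generating.add(head)
                changed = True
    return generating


# Grammar: S -> AB | a,  A -> b   (B has no productions)
terminals = {"a", "b"}
productions = [("S", ("A", "B")), ("S", ("a",)), ("A", ("b",))]
gen = generating_symbols(terminals, productions)
print(sorted(gen))
```

On this grammar B never enters the set, so every production that mentions B can be dropped.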
Algorithm to find all reachable symbols (S ⇒* αXβ)
Given: G=(V,T,P,S)
Basis:
S is obviously reachable (from itself)
Induction:
Suppose there is a production A → α1 α2 … αk, where A is reachable
Then, all symbols on the right-hand side, {α1, α2, …, αk}, are also reachable.
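The reachability rule can be sketched as a worklist search from the start symbol. The sketch below assumes productions are given as (head, body) pairs with tuple bodies; the function name `reachable_symbols` is hypothetical. It also illustrates why the order of the two cleanup passes matters.

```python
def reachable_symbols(start, productions):
    """Return all symbols reachable from the start symbol."""
    reachable = {start}  # basis: S is reachable from itself
    worklist = [start]
    while worklist:
        symbol = worklist.pop()
        for head, body in productions:
            if head == symbol:
                # induction: every symbol in the body of a reachable head
                for s in body:
                    if s not in reachable:
                        reachable.add(s)
                        worklist.append(s)
    return reachable


# Original grammar: S -> AB | a,  A -> b
original = [("S", ("A", "B")), ("S", ("a",)), ("A", ("b",))]
before = reachable_symbols("S", original)

# Same grammar after the non-generating symbol B has been removed
pruned = [("S", ("a",)), ("A", ("b",))]
after = reachable_symbols("S", pruned)
print(sorted(before), sorted(after))
```

Testing reachability first would keep A (it is reachable through S → AB); only after B is removed does A become unreachable.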
Eliminating ε-productions
A → ε
What’s the point of removing ε-productions?
Algorithm to detect all nullable variables
Basis:
If A → ε is a production in G, then A is nullable
(note: A can still have other productions)
Induction:
If there is a production B → C1C2…Ck, where every Ci is nullable, then B is also nullable
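As a rough illustration, the nullable set can be computed with the same fixed-point style. The grammar below (S → ABc, A → ε | a, B → AA) is a made-up example, and the (head, body) tuple representation is an assumption of this sketch.

```python
def nullable_variables(productions):
    """Return the set of nullable variables."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            # basis: an empty body is an epsilon-production
            # induction: a body made only of nullable variables
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


# Hypothetical grammar: S -> ABc,  A -> epsilon | a,  B -> AA
productions = [
    ("S", ("A", "B", "c")),
    ("A", ()),
    ("A", ("a",)),
    ("B", ("A", "A")),
]
nullable = nullable_variables(productions)
print(sorted(nullable))
```

S is not nullable here because its only body contains the terminal c, which can never be in the nullable set.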
Eliminating ε-productions
Given: G=(V,T,P,S)
Algorithm:
1. Detect all nullable variables in G
2. Then construct G1=(V,T,P1,S) as follows:
i. For each production of the form A → X1X2…Xk, where k≥1, suppose m out of the k Xi’s are nullable symbols
ii. Then G1 will have 2^m versions of this production, i.e., all combinations where each nullable Xi is either present or absent (if m = k, omit the version in which every Xi is absent, since that would be A → ε)
iii. Alternatively, if a production is of the form A → ε, then remove it
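The construction of P1 can be sketched in Python as follows, applied to the grammar S → AB, A → aAA | ε, B → bBB | ε. The function names and the (head, body) tuple representation are assumptions of this sketch; duplicates among the 2^m versions collapse because the result is a set.

```python
from itertools import product


def nullable_variables(productions):
    """Fixed-point computation of the nullable variables."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, body in productions:
            if head not in nullable and all(s in nullable for s in body):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(productions):
    """Build P1: every keep/drop combination of nullable body symbols."""
    nullable = nullable_variables(productions)
    result = set()
    for head, body in productions:
        # a nullable symbol may be present or absent; others must stay
        choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
        for combo in product(*choices):
            new_body = tuple(sym for part in combo for sym in part)
            if new_body:  # never emit an epsilon-production
                result.add((head, new_body))
    return result


productions = [
    ("S", ("A", "B")),
    ("A", ("a", "A", "A")), ("A", ()),
    ("B", ("b", "B", "B")), ("B", ()),
]
g1 = eliminate_epsilon(productions)
for head, body in sorted(g1):
    print(head, "->", "".join(body))
```

The resulting grammar generates L − {ε}; the empty string itself is lost by design.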
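Example: Eliminating ε-productions
Let L be the language represented by the following CFG G:
i. S → AB
ii. A → aAA | ε
iii. B → bBB | ε
Goal: To construct G1, the simplified grammar for L − {ε}
Nullable symbols: {A, B, S} (S is nullable because both symbols in the body of S → AB are nullable)
G1 (after eliminating ε-productions):
i. S → AB | A | B
ii. A → aAA | aA | a
iii. B → bBB | bB | b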
Eliminating unit productions A → B
A unit production is one of the form A → B, where both A and B are variables
E.g.,
1. E → T | E+T
2. T → F | T*F
3. F → I | (E)
4. I → a | b | Ia | Ib | I0 | I1
How to eliminate unit productions?
Replace E → T with E → F | T*F
Then, upon recursive application wherever there is a unit production:
E → F | T*F | E+T (substituting for T)
E → I | (E) | T*F | E+T (substituting for F)
E → a | b | Ia | Ib | I0 | I1 | (E) | T*F | E+T (substituting for I)
Now, E has no unit productions
Similarly, eliminate the remaining unit productions
The Unit Pair Algorithm: to remove unit productions
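Suppose A → B1 → B2 → … → Bn → α
Action: Replace all intermediate productions to produce α directly
i.e., A → α; B1 → α; … Bn → α;
Definition: (A,B) is a unit pair if A ⇒* B using only unit productions.
Basis: Every pair (A,A) is a unit pair; and if A → B is a unit production, then (A,B) is a unit pair.
Induction: If (A,B) and (B,C) are unit pairs, then (A,C) is also a unit pair.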
The Unit Pair Algorithm: to remove unit productions
Input: G=(V,T,P,S)
Goal: to build G1=(V,T,P1,S) devoid of unit productions
Algorithm:
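1. Find all unit pairs in G
2. For each unit pair (A,B) in G, add to P1 the production A → α for every non-unit production B → α in P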
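A minimal Python sketch of the unit pair approach, run on the expression grammar E → T | E+T, T → F | T*F, F → I | (E), I → a | b | Ia | Ib | I0 | I1. The function names and the single-character symbol encoding are assumptions made for illustration.

```python
def is_unit(body, variables):
    """A unit body is exactly one variable."""
    return len(body) == 1 and body[0] in variables


def unit_pairs(variables, productions):
    """Return all pairs (A, B) such that A derives B via unit productions."""
    pairs = {(v, v) for v in variables}  # basis: (A, A) is a unit pair
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for head, body in productions:
                # induction: (A, B) plus a unit production B -> C gives (A, C)
                if head == b and is_unit(body, variables):
                    if (a, body[0]) not in pairs:
                        pairs.add((a, body[0]))
                        changed = True
    return pairs


def eliminate_unit(variables, productions):
    """For each unit pair (A, B), copy B's non-unit productions to A."""
    result = set()
    for a, b in unit_pairs(variables, productions):
        for head, body in productions:
            if head == b and not is_unit(body, variables):
                result.add((a, body))
    return result


variables = {"E", "T", "F", "I"}
grammar = {
    "E": ["T", "E+T"],
    "T": ["F", "T*F"],
    "F": ["I", "(E)"],
    "I": ["a", "b", "Ia", "Ib", "I0", "I1"],
}
# each character of a body string is one grammar symbol
productions = [(h, tuple(b)) for h, bodies in grammar.items() for b in bodies]
pairs = unit_pairs(variables, productions)
g1 = eliminate_unit(variables, productions)
e_bodies = {"".join(body) for head, body in g1 if head == "E"}
print(sorted(e_bodies))
```

The bodies collected for E match the result of the recursive substitution: a, b, Ia, Ib, I0, I1, (E), T*F and E+T.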
Example: eliminating unit productions
Algorithm:
Step 1) eliminate ε-productions
Step 2) eliminate unit productions
Step 3) eliminate useless symbols
Again, the order is important! Why?
Normal Forms
Why normal forms?
If all productions of the grammar could be expressed in the same
form(s), then:
Chomsky Normal Form (CNF)
Let G be a CFG for some L − {ε}
Definition:
G is said to be in Chomsky Normal Form if all its productions are in one of the following two forms:
i. A → BC where A, B, C are variables, or
ii. A → a where a is a terminal
G has no useless symbols
G has no unit productions
G has no ε-productions
CNF checklist
Is this grammar in CNF?
G1:
1. E → E+T | T*F | (E) | a | b | Ia | Ib | I0 | I1
2. T → T*F | (E) | a | b | Ia | Ib | I0 | I1
3. F → (E) | a | b | Ia | Ib | I0 | I1
4. I → a | b | Ia | Ib | I0 | I1
Checklist:
• G has no ε-productions
• G has no unit productions
• G has no useless symbols
• But…
• the normal form for productions is violated
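To make the last checklist item concrete, here is a small Python sketch that lists the productions violating the two CNF forms. It is run on a subset of G1 (the E+T production and the productions for I); the function name `cnf_violations` and the single-character symbol encoding are assumptions of this sketch.

```python
def cnf_violations(variables, terminals, productions):
    """Return productions that are neither A -> BC nor A -> a."""
    bad = []
    for head, body in productions:
        two_vars = len(body) == 2 and all(s in variables for s in body)
        one_term = len(body) == 1 and body[0] in terminals
        if not (two_vars or one_term):
            bad.append((head, body))
    return bad


variables = {"E", "T", "F", "I"}
terminals = {"a", "b", "0", "1", "+", "*", "(", ")"}
# each character of a body string is one grammar symbol
subset = {"E": ["E+T"], "I": ["a", "b", "Ia", "Ib", "I0", "I1"]}
productions = [(h, tuple(b)) for h, bodies in subset.items() for b in bodies]
bad = cnf_violations(variables, terminals, productions)
for head, body in bad:
    print(head, "->", "".join(body))
```

Only I → a and I → b pass; bodies such as Ia mix a variable with a terminal, and E+T is too long.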
Replace each production of the form A → B1B2…Bk (k ≥ 3) with the chain:
A → B1C1, C1 → B2C2, …, Ck-3 → Bk-2Ck-2, Ck-2 → Bk-1Bk
Example #1
G:
S → AS | BABC
A → A1 | 0A1 | 01
B → 0B | 0
C → 1C | 1
G in CNF:
X0 → 0
X1 → 1
S → AS | BY1
Y1 → AY2
Y2 → BC
A → AX1 | X0Y3 | X0X1
Y3 → AX1
B → X0B | 0
C → X1C | 1
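The conversion used in Example #1 can be sketched in Python: terminals inside bodies of length two or more get stand-in variables, then bodies longer than two are broken into a chain of binary productions. The sketch assumes the input grammar already has no ε-productions, unit productions, or useless symbols, and that the generated names (X followed by the terminal, Y followed by a counter) do not clash with existing variables; both the names and the function `to_cnf` are illustrative.

```python
def to_cnf(terminals, productions):
    """Convert a cleaned-up grammar to CNF (sketch; see assumptions above)."""
    result = []
    stand_in = {}  # terminal -> stand-in variable, e.g. "0" -> "X0"
    counter = 0    # numbering for the fresh chain variables Y1, Y2, ...
    for head, body in productions:
        if len(body) >= 2:
            # Step 1: replace terminals in long bodies by stand-in variables
            new_body = []
            for s in body:
                if s in terminals:
                    if s not in stand_in:
                        stand_in[s] = "X" + s
                        result.append((stand_in[s], (s,)))
                    new_body.append(stand_in[s])
                else:
                    new_body.append(s)
            body = tuple(new_body)
        # Step 2: break bodies longer than two symbols into a binary chain
        while len(body) > 2:
            counter += 1
            fresh = "Y" + str(counter)
            result.append((head, (body[0], fresh)))
            head, body = fresh, body[1:]
        result.append((head, body))
    return result


terminals = {"0", "1"}
grammar = [
    ("S", "AS"), ("S", "BABC"),
    ("A", "A1"), ("A", "0A1"), ("A", "01"),
    ("B", "0B"), ("B", "0"),
    ("C", "1C"), ("C", "1"),
]
# each character of a body string is one grammar symbol
productions = [(head, tuple(body)) for head, body in grammar]
cnf = to_cnf(terminals, productions)
for head, body in cnf:
    print(head, "->", " ".join(body))
```

Processing the productions in the order listed yields the same variable names as the example (Y1 and Y2 for BABC, Y3 for 0A1).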
Example #2
GNF (Greibach Normal Form)
A CFG (Context-Free Grammar) is in Greibach Normal Form (GNF) if every one of its production rules satisfies one of the following conditions:
A non-terminal generates a single terminal. For example, B → b.
The start symbol generates ε. For example, S → ε.
A non-terminal generates a terminal followed by any number of non-terminals. For example, A → aBC…N.
Example
G1 = {S → aAB | bBA,
A → aA | a,
B → bB | b}
G2 = {S → aAB | aBA,
A → aA | ε,
B → bB | ε}
The production rules of Grammar G1 satisfy the above rules specified for GNF.
Therefore, G1 is in GNF.
However, the production rules of G2 do not satisfy the rules specified for GNF: A → ε and B → ε have ε as their body, but only the start symbol can generate ε. So the grammar G2 is not in GNF.
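The three GNF conditions can be checked mechanically. The following Python sketch tests G1 and G2 from the example; the function name `is_gnf` and the representation of ε as an empty tuple are assumptions of this sketch.

```python
def is_gnf(start, variables, terminals, productions):
    """Check that every production has one of the three GNF forms."""
    for head, body in productions:
        if body == ():
            # epsilon is allowed only for the start symbol
            if head != start:
                return False
            continue
        # body must be one terminal followed by zero or more non-terminals
        if body[0] not in terminals:
            return False
        if any(s not in variables for s in body[1:]):
            return False
    return True


variables = {"S", "A", "B"}
terminals = {"a", "b"}

# G1: S -> aAB | bBA,  A -> aA | a,  B -> bB | b
g1 = [("S", tuple("aAB")), ("S", tuple("bBA")),
      ("A", tuple("aA")), ("A", ("a",)),
      ("B", tuple("bB")), ("B", ("b",))]

# G2: S -> aAB | aBA,  A -> aA | epsilon,  B -> bB | epsilon
g2 = [("S", tuple("aAB")), ("S", tuple("aBA")),
      ("A", tuple("aA")), ("A", ()),
      ("B", tuple("bB")), ("B", ())]

g1_ok = is_gnf("S", variables, terminals, g1)
g2_ok = is_gnf("S", variables, terminals, g2)
print(g1_ok, g2_ok)
```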
Convert CFG to Greibach Normal Form(GNF)
Step 1
Convert the Grammar into Chomsky Normal Form(CNF): If the
given Grammar is not in CNF, convert it.
Step 2
If the grammar contains left recursion, remove it.
Step 3
Convert the remaining production rules of the grammar into GNF form.
Exercise
Since the given grammar is already in CNF, and there is no
left recursion, we can bypass the first two steps and proceed
directly to the third step.
Eliminate Left Recursion
Rule 1: Try to make the grammar of the form A → Aα | β
Then convert to:
A → βA'
A' → ε | αA'
Example: E → E + T | T [this is left-recursive]
Here α = + T and β = T
So, E → TE'
E' → ε | +TE'
Eliminate Left Recursion
Rule 2: Try to make the grammar of the form A → Aα1 | Aα2 | … | Aαm | β1 | β2 | … | βk
Then convert to:
A → β1A' | β2A' | … | βkA'
A' → ε | α1A' | α2A' | … | αmA'
Example: E → E + T | E*T | T [this is left-recursive]
Here α1 = + T, α2 = * T and β = T
So, E → TE'
E' → ε | +TE' | *TE'
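Both rules follow the standard transformation for immediate left recursion, which introduces a fresh variable A' so that A → βA' and A' → ε | αA'. A minimal Python sketch follows, assuming every α is non-empty and that the fresh name (the head followed by a prime) is not already in use; the function name is hypothetical and ε is represented as an empty tuple.

```python
def eliminate_left_recursion(head, bodies):
    """Remove immediate left recursion from the productions of one variable."""
    # A -> A alpha : keep alpha (the part after the leading A)
    alphas = [b[1:] for b in bodies if b and b[0] == head]
    # A -> beta : bodies that do not start with A
    betas = [b for b in bodies if not b or b[0] != head]
    if not alphas:
        return {head: list(bodies)}  # nothing to do
    fresh = head + "'"               # assumed not to clash with other names
    return {
        head: [beta + (fresh,) for beta in betas],
        fresh: [()] + [alpha + (fresh,) for alpha in alphas],
    }


# E -> E + T | E * T | T
bodies = [("E", "+", "T"), ("E", "*", "T"), ("T",)]
result = eliminate_left_recursion("E", bodies)
for var, var_bodies in result.items():
    shown = [" ".join(b) if b else "ε" for b in var_bodies]
    print(var, "->", " | ".join(shown))
```

This handles only immediate left recursion within a single variable; indirect left recursion through other variables needs the variables to be ordered and substituted first.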