Download as pdf or txt
Download as pdf or txt
You are on page 1of 83

Introduction to

Theoretical Computer Science

Motivation Finite Automata

• Automata = abstract computing devices


Finite Automata are used as a model for
• Turing studied Turing Machines (= com-
puters) before there were any real comput-
• Software for designing digital cicuits
ers

• We will also look at simpler devices than • Lexical analyzer of a compiler


Turing machines (Finite State Automata,
Pushdown Automata, . . . ), and specifica- • Searching for keywords in a file or on the
tion means, such as grammars and regular web.
expressions.

• NP-hardness = what cannot be efficiently • Software for verifying finite state systems,
computed. such as communication protocols.

• Undecidability = what cannot be computed


at all.

1 2

Structural Representations

These are alternative ways of specifying a ma-


chine
• Example: Finite Automaton modelling an
on/off switch Grammars: A rule like E ⇒ E + E specifies an
arithmetic expression
Push
• Lineup ⇒ P erson.Lineup
Start
Lineup ⇒ P erson
off on

says that a lineup is a single person, or a person


Push in front of a lineup.

Regular Expressions: Denote structure of data,


• Example: Finite Automaton recognizing the e.g.
string then
’[A-Z][a-z]*[][A-Z][A-Z]’
Start t h e n
t th the then matches Ithaca NY

does not match Palo Alto CA

Question: What expression would match


Palo Alto CA
3 4
Length of String: Number of positions for
Central Concepts
symbols in the string.

Alphabet: Finite, nonempty set of symbols


|w| denotes the length of string w

Example: Σ = {0, 1} binary alphabet


|0110| = 4, || = 0

Example: Σ = {a, b, c, . . . , z} the set of all lower


Powers of an Alphabet: Σk = the set of
case letters
strings of length k with symbols from Σ

Example: The set of all ASCII characters


Example: Σ = {0, 1}

Strings: Finite sequence of symbols from an


Σ1 = {0, 1}
alphabet Σ, e.g. 0011001

Σ2 = {00, 01, 10, 11}


Empty String: The string with zero occur-
rences of symbols from Σ
Σ0 = {}

• The empty string is denoted 


Question: How many strings are there in Σ3

5 6

The set of all strings over Σ is denoted Σ∗

Σ∗ = Σ0 ∪ Σ1 ∪ Σ2 ∪ · · ·

Also: Languages:

Σ+ = Σ1 ∪ Σ2 ∪ Σ3 ∪ · · · If Σ is an alphabet, and L ⊆ Σ∗
then L is a language
Σ∗ = Σ+ ∪ {}
Examples of languages:
Concatenation: If x and y are strings, then
xy is the string obtained by placing a copy of
y immediately after a copy of x • The set of legal English words

x = a1a2 . . . ai • The set of legal C programs


y = b1 b 2 . . . b j
• The set of strings consisting of n 0’s fol-
xy = a1a2 . . . aib1b2 . . . bj lowed by n 1’s
{, 01, 0011, 000111, . . .}
Example: x = 01101, y = 110, xy = 01101110

Note: For any string x


x = x = x
7 8
Problem: Is a given string w a member of a
• The set of strings with equal number of 0’s language L?
and 1’s
{, 01, 10, 0011, 0101, 1001, . . .} Example: Is a binary number prime = is it a
meber in LP

• LP = the set of binary numbers whose Is 11101 ∈ LP ? What computational resources


value is prime are needed to answer the question.
{10, 11, 101, 111, 1011, . . .}
Usually we think of problems not as a yes/no
decision, but as something that transforms an
• The empty language ∅ input into an output.

Example: Parse a C-program = check if the


• The language {} consisting of the empty program is correct, and if it is, produce a parse
string tree.

Let LX be the set of all valid programs in prog


Note: ∅ 6= {} lang X. If we can show that determining mem-
bership in LX is hard, then parsing programs
Note2: The underlying alphabet Σ is always written in X cannot be easier.
finite
Question: Why?
9 10

Finite Automata Informally


e-commerce
Protocol for e-commerce using e-money
The protocol for each participant:
Allowed events:
Start pay redeem transfer
a b d f
1. The customer can pay the store (=send
the money-file to the store) ship ship ship
(a) Store
c e g
2. The customer can cancel the money (like redeem transfer
putting a stop on a check)

3. The store can ship the goods to the cus- 2


cancel
tomer pay cancel

1 3 4
4. The store can redeem the money (=cash redeem transfer
the check)
Start Start

5. The bank can transfer the money to the


(b) Customer (c) Bank
store

11 12
Completed protocols:

cancel pay,cancel pay,cancel pay,cancel


The entire system as an Automaton:
Start
a b d f Start
pay redeem transfer a b c d e f g
ship ship ship P P P P P P
(a) Store P S S S
redeem transfer 1
c e g
R
C C C C C C C
pay,cancel pay,cancel pay,cancel R
P S S S
pay, ship 2
P P P P P P
2
ship. redeem, transfer, R P,C P,C P,C
pay,redeem, pay,redeem,
pay, cancel P S R S S
cancel cancel, ship cancel, ship 3
pay, C P,C P,C P,C T
1 3 4
ship redeem transfer R T
P S S S
4 R
Start Start
C P,C P,C P,C P,C P,C P,C

(b) Customer (c) Bank

13 14

Deterministic Finite Automata Example: An automaton A that accepts

A DFA is a quintuple L = {x01y : x, y ∈ {0, 1}∗}

The automaton A = ({q0, q1, q2}, {0, 1}, δ, q0, {q1})


A = (Q, Σ, δ, q0, F ) as a transition table:

δ 0 1
• Q is a finite set of states → q0 q2 q0
?q1 q1 q1
• Σ is a finite alphabet (=input symbols) q2 q2 q1

• δ is a transition function (q, a) 7→ p


The automaton A as a transition diagram:

• q0 ∈ Q is the start state 1 0


Start 0 1
q0 q2 q1 0, 1
• F ⊆ Q is a set of final states

15 16
An FA accepts a string w = a1a2 . . . an if there
is a path in the transition diagram that
• The transition function δ can be extended
to δ̂ that operates on states and strings (as
1. Begins at a start state opposed to states and symbols)

Basis: δ̂(q, ) = q
2. Ends at an accepting state

Induction: δ̂(q, xa) = δ(δ̂(q, x), a)


3. Has sequence of labels a1a2 . . . an
• Now, fomally, the language accepted by A
is
Example: The FA
L(A) = {w : δ̂(q0, w) ∈ F }
1 0
Start 0 1
q0 q2 q1 0, 1 • The languages accepted by FA:s are called
regular languages

accepts e.g. the string 1100101

17 18

Example: DFA accepting all and only strings


with an even number of 0’s and an even num-
Example
ber of 1’s

Marble-rolling toy from p. 53 of textbook


1
Start q0 q1
1
0 A B
0

0 x1 x3
0
q2 1 q3

1
x2

Tabular representation of the Automaton

δ 0 1
? → q0 q2 q1
q1 q3 q0 C D
q2 q0 q3
q3 q1 q2

19 20
A state is represented as sequence of three bits Nondeterministic Finite Automata
followed by r or a (previous input rejected or
accepted) A NFA can be in several states at once, or,
viewded another way, it can “guess” which
For instance, 010a, means state to go to next
left, right, left, accepted
Example: An automaton that accepts all and
Tabular representation of DFA for the toy only strings ending in 01.

0, 1
A B
Start q0 0 q1 1 q2
→ 000r 100r 011r
?000a 100r 011r
?001a 101r 000a
010r 110r 001a Here is what happens when the NFA processes
?010a 110r 001a the input 00101
011r 111r 010a q0 q0 q0 q0 q0 q0
100r 010r 111r
?100a 010r 111r
q1 q1 q1
101r 011r 100a
(stuck)
?101a 011r 100a
110r 000a 101a q2 q2
?110a 000a 101a (stuck)
111r 001a 110a 0 0 1 0 1
21 22

Formally, a NFA is a quintuple

Example: The NFA from the previous slide is


A = (Q, Σ, δ, q0, F )

• Q is a finite set of states ({q0, q1, q2}, {0, 1}, δ, q0, {q2})

• Σ is a finite alphabet where δ is the transition function

• δ is a transition function from Q × Σ to the δ 0 1


powerset of Q → q0 {q0, q1} {q0}
q1 ∅ {q2}
?q2 ∅ ∅
• q0 ∈ Q is the start state

• F ⊆ Q is a set of final states

23 24
Let’s prove formally that the NFA
Extended transition function δ̂.
0, 1
Basis: δ̂(q, ) = {q} Start 0 1
q0 q1 q2

Induction:
[ accepts the language {x01 : x ∈ Σ∗}. We’ll do
δ̂(q, xa) = δ(p, a)
p∈δ̂(q,x)
a mutual induction on the three statements
below

Example: Let’s compute δ̂(q0, 00101) on the 0. w ∈ Σ∗ ⇒ q0 ∈ δ̂(q0, w)


blackboard
1. q1 ∈ δ̂(q0, w) ⇔ w = x0
• Now, fomally, the language accepted by A is

L(A) = {w : δ̂(q0, w) ∩ F 6= ∅} 2. q2 ∈ δ̂(q0, w) ⇔ w = x01

25 26

Equivalence of DFA and NFA

• NFA’s are usually easier to “program” in.

• Surprisingly, for any NFA N there is a DFA D,


such that L(D) = L(N ), and vice versa.

Basis: If |w| = 0 then w = . Then statement


• This involves the subset construction, an im-
(0) follows from def. For (1) and (2) both
portant example how an automaton B can be
sides are false for 
generically constructed from another automa-
ton A.
Induction: Assume w = xa, where a ∈ {0, 1},
|x| = n and statements (0)–(2) hold for x. We • Given an NFA
will show on the blackboard in class that the
statements hold for xa. N = (QN , Σ, δN , q0, FN )
we will construct a DFA

D = (QD , Σ, δD , {q0}, FD )
such that
L(D) = L(N )
.
27 28
The details of the subset construction:

Let’s construct δD from the NFA on slide 26


• QD = {S : S ⊆ QN }.
0 1
Note: |QD | = 2|QN |, although most states in ∅ ∅ ∅
QD are likely to be garbage. → {q0} {q0, q1} {q0}
{q1} ∅ {q2}
• FD = {S ⊆ QN : S ∩ FN 6= ∅} ?{q2} ∅ ∅
{q0, q1} {q0, q1} {q0, q2}
?{q0, q2} {q0, q1} {q0}
• For every S ⊆ QN and a ∈ Σ, ?{q1, q2} ∅ {q2}
[ ?{q0, q1, q2} {q0, q1} {q0, q2}
δD (S, a) = δN (p, a)
p∈S

29 30

We can often avoid the exponential blow-up


by constructing the transition table for D only
for accessible states S as follows:
Note: The states of D correspond to subsets
of states of N , but we could have denoted the Basis: S = {q0} is accessible in D
states of D by, say, A − F just as well.
Induction: If state S is accessible, so are the
0 1 S
states in a∈Σ δD (S, a).
A A A
→B E B
Example: The “subset” DFA with accessible
C A D
?D A A states only.
E E F
1 0
?F E B
?G A D Start
0 1
?H E F {q0} {q0, q1} {q0, q2}

31 32
Induction:
def
δ̂D ({q0}, xa) = δD (δ̂D ({q0}, x), a)
Theorem 2.11: Let D be the “subset” DFA
of an NFA N . Then L(D) = L(N ). i.h.
= δD (δ̂N (q0, x), a)
Proof: First we show on an induction on |w|
cst [
that = δN (p, a)
p∈δ̂N (q0 ,x)
δ̂D ({q0}, w) = δ̂N (q0, w)

def
= δ̂N (q0, xa)
Basis: w = . The claim follows from def.

Now (why?) it follows that L(D) = L(N ).

33 34

Exponential Blow-Up

Theorem 2.12: A language L is accepted by There is an NFA N with n + 1 states that has
some DFA if and only if L is accepted by some no equivalent DFA with fewer than 2n states
NFA. 0, 1

1 0, 1 0, 1 0, 1 0, 1
Proof: The “if” part is Theorem 2.11. q0 q1 q2 qn
Start

For the “only if” part we note that any DFA


can be converted to an equivalent NFA by mod- L(N ) = {x1c2c3 · · · cn : x ∈ {0, 1}∗, ci ∈ {0, 1}}
ifying the δD to δN by the rule
Suppose an equivalent DFA D with fewer than
2n states exists.
• If δD (q, a) = p, then δN (q, a) = {p}.
D must remember the last n symbols it has
By induction on |w| it will be shown in the read. There are 2n bitsequences a1a2 . . . an.
tutorial that if δ̂D (q0, w) = p, then δ̂N (q0, w) = Since D has fewer that 2n states
{p}.
∃ q, a1a2 . . . an, b1b2 . . . bn :

The claim of the theorem follows. a1a2 . . . an 6= b1b2 . . . bn

δ̂D (q0, a1a2 . . . an) = δ̂D (q0, b1b2 . . . bn) = q


35 36
Since a1a2 . . . an 6= b1b2 . . . bn they must differ FA’s with Epsilon-Transitions
in at least one position.
An -NFA accepting decimal numbers consist-
Case 1: ing of:

1a2 . . . an
0b2 . . . bn 1. An optional + or - sign

Then q has to be both an accepting and a 2. A string of digits


nonaccepting state.
3. a decimal point
Case 2:
4. another string of digits

a1 . . . ai−11ai+1 . . . an
b1 . . . bi−10bi+1 . . . bn One of the strings (2) are (4) are optional

Now δ̂D (q0, a1 . . . ai−11ai+1 . . . an0i−1) = 0,1,...,9 0,1,...,9


δ̂D (q0, b1 . . . bi−10bi+1 . . . bn0i−1) Start
q0 ε,+,- q1 q2 q3 ε q5
. 0,1,...,9
and δ̂D (q0, a1 · · · ai−11ai+1 · · · an0i−1) ∈ FD
0,1,...,9 .
δ̂D (q0, b1 · · · bi−10bi+1 · · · bn0i−1) ∈
/ FD q4

37 38

An -NFA is a quintuple (Q, Σ, δ, q0, F ) where δ


is a function from Q × Σ ∪ {} to the powerset
ECLOSE
of Q.

We close a state by adding all states reachable


Example: The -NFA from the previous slide
by a sequence  · · · 

Inductive definition of ECLOSE(q)


E = ({q0, q1, . . . , q5}, {., +, −, 0, 1, . . . , 9} δ, q0, {q5})
Basis:

where the transition table for δ is


q ∈ ECLOSE(q)
 +,- . 0, . . . , 9
→ q0 {q1} {q1} ∅ ∅ Induction:
q1 ∅ ∅ {q2} {q1, q4}
q2 ∅ ∅ ∅ {q3} p ∈ ECLOSE(q) and r ∈ δ(p, ) ⇒
q3 {q5} ∅ ∅ {q3}
r ∈ ECLOSE(q)
q4 ∅ ∅ {q3} ∅
?q5 ∅ ∅ ∅ ∅

39 40
• Inductive definition of δ̂ for -NFA’s

Example of -closure Basis:


ε ε
2 3 6
ε

1 b δ̂(q, ) = ECLOSE(q)

ε
4 5 7 Induction:
a ε

For instance,
[
δ̂(q, xa) = ECLOSE(p)
p∈δ(δ̂(q,x),a)

ECLOSE(1) = {1, 2, 3, 4, 6}
Let’s compute on the blackboard in class
δ̂(q0, 5.6) for the NFA on slide 38

41 42

Given an -NFA Example: -NFA E

E = (QE , Σ, δE , q0, FE ) 0,1,...,9 0,1,...,9


Start
we will construct a DFA q0 ε,+,- q1 q2 q3 ε q5
. 0,1,...,9
D = (QD , Σ, δD , qD , FD )
0,1,...,9 .
such that
q4
L(D) = L(E)

Details of the construction: DFA D corresponding to E

0,1,...,9 0,1,...,9
• QD = {S : S ⊆ QE and S = ECLOSE(S)}
+,-
{q0, q } {q } {q , q } {q2, q3, q5}
• qD = ECLOSE(q0) 1 1 0,1,...,9 1 4 .

. 0,1,...,9
• FD = {S : S ∈ QD and S ∩ FE 6= ∅} .
Start {q2} {q , q5}
3
• δD (S, a) = 0,1,...,9
[
{ECLOSE(p) : p ∈ δ(t, a) for some t ∈ S}
0,1,...,9
43 44
Induction:
[
Theorem 2.22: A language L is accepted by δ̂E (q0, xa) = ECLOSE(p)
some -NFA E if and only if L is accepted by p∈δE (δ̂E (q0 ,x),a)
some DFA.
[
= ECLOSE(p)
Proof: We use D constructed as above and p∈δD (δ̂D (qD ,x),a)
show by induction that δ̂D (q0, w) = δ̂E (qD , w)
[
= ECLOSE(p)
p∈δ̂D (qD ,xa)
Basis: δ̂E (q0, ) = ECLOSE(q0) = qD = δ̂(qD , )
= δ̂D (qD , xa)

45 46

Operations on languages
Regular expressions
Union:
A FA (NFA or DFA) is a “blueprint” for con-
tructing a machine recognizing a regular lan-
L ∪ M = {w : w ∈ L or w ∈ M }
guage.

Concatenation:
A regular expression is a “user-friendly,” declar-
ative way of describing a regular language.
L.M = {w : w = xy, x ∈ L, y ∈ M }

Example: 01∗ + 10∗


Powers:

Regular expressions are used in e.g.


L0 = {}, L1 = L, Lk+1 = L.Lk

1. UNIX grep command Kleene Closure:


[
L∗ = Li
2. UNIX Lex (Lexical analyzer generator) and
i=0
Flex (Fast Lex) tools.
Question: What are ∅0, ∅i, and ∅∗
47 48
Building regex’s
Example: Regex for
Inductive definition of regex’s:
L = {w ∈ {0, 1}∗ : 0 and 1 alternate in w}
Basis:  is a regex and ∅ is a regex.
L() = {}, and L(∅) = ∅.

If a ∈ Σ, then a is a regex. (01)∗ + (10)∗ + 0(10)∗ + 1(01)∗


L(a) = {a}.
or, equivalently,
Induction:
( + 1)(01)∗( + 0)
If E is a regex’s, then (E) is a regex.
L((E)) = L(E).
Order of precedence for operators:
If E and F are regex’s, then E + F is a regex.
L(E + F ) = L(E) ∪ L(F ). 1. Star
2. Dot
If E and F are regex’s, then E.F is a regex.
L(E.F ) = L(E).L(F ). 3. Plus

If E is a regex’s, then E ? is a regex. Example: 01∗ + 1 is grouped (0(1)∗) + 1


L(E ?) = (L(E))∗.
49 50

Equivalence of FA’s and regex’s


Theorem 3.4: For every DFA A = (Q, Σ, δ, q0, F )
We have already shown that DFA’s, NFA’s,
there is a regex R, s.t. L(R) = L(A).
and -NFA’s all are equivalent.

Proof: Let the states of A be {1, 2, . . . , n},


ε-NFA NFA
with 1 being the start state.

(k)
• Let Rij be a regex describing the set of
RE DFA
labels of all paths in A from state i to state
j going through intermediate states {1, . . . , k}
only.
To show FA’s equivalent to regex’s we need to
establish that i
j

k
1. For every DFA A we can find (construct,
in this case) a regex R, s.t. L(R) = L(A).

2. For every regex R there is a -NFA A, s.t.


L(A) = L(R).

51 52
(k)
Rij will be defined inductively. Note that
 
M Induction:
L R1j (n) = L(A)
j∈F

(k)
Rij
Basis: k = 0, i.e. no intermediate states.
=
• Case 1: i 6= j
(k−1)
Rij

M +
(0)
Rij = a  
(k−1) (k−1) ∗ (k−1)
{a∈Σ:δ(i,a)=j} Rik Rkk Rkj

• Case 2: i = j
i k k k k j

 
M In R (k-1) In R (k-1)
(0) 
a
ik Zero or more strings in R (k-1) kj
Rii =  + kk
{a∈Σ:δ(i,a)=i}

53 54

Example: Let’s find R for A, where


L(A) = {x0y : x ∈ {1}∗ and y ∈ {0, 1}∗}
We will need the following simplification rules:
1
Start 0 0,1 • ( + R)∗ = R∗
1 2

• R + RS ∗ = RS ∗

(0)
R11 +1 • ∅R = R∅ = ∅ (Annihilation)
(0)
R12 0
• ∅ + R = R + ∅ = R (Identity)
(0)
R21 ∅
(0)
R22 +0+1

55 56
Simplified
(0)
R11 +1 (1)
R11 1∗
(0)
R12 0 (1)
R12 1∗ 0
(0)
R21 ∅ (1)
R21 ∅
(0)
R22 +0+1 (1)
R22 +0+1

 
(1) (0) (0) (0) ∗ (0)  
Rij = Rij + Ri1 R11 R1j (2) (1) (1) (1) ∗ (1)
Rij = Rij + Ri2 R22 R2j

By direct substitution Simplified


By direct substitution
(1)
R11  + 1 + ( + 1)( + 1)∗( + 1) 1∗ (2)
R11 1∗ + 1∗0( + 0 + 1)∗∅
(1)
R12 0 + ( + 1)( + 1)∗0 1∗0 (2)
R12 1∗0 + 1∗0( + 0 + 1)∗( + 0 + 1)
(1)
R21 ∅ + ∅( + 1)∗( + 1) ∅ (2)
R21 ∅ + ( + 0 + 1)( + 0 + 1)∗∅
(1)
R22  + 0 + 1 + ∅( + 1)∗0 +0+1 (2)
R22  + 0 + 1 + ( + 0 + 1)( + 0 + 1)∗( + 0 + 1)

57 58

By direct substitution
(2) Observations
R11 1∗ + 1∗0( + 0 + 1)∗∅
(2)
R12 1∗0 + 1∗0( + 0 + 1)∗( + 0 + 1)
(k)
(2) There are n3 expressions Rij
R21 ∅ + ( + 0 + 1)( + 0 + 1)∗∅
(2)
R22  + 0 + 1 + ( + 0 + 1)( + 0 + 1)∗( + 0 + 1) Each inductive step grows the expression 4-fold
Simplified
(n)
(2) Rij could have size 4n
R11 1∗
(2)
R12 1∗0(0 + 1)∗ (k) (k−1)
For all {i, j} ⊆ {1, . . . , n}, Rij uses Rkk
(2)
R21 ∅ so we have to write n2 times the regex Rkk
(k−1)
(2)
R22 (0 + 1)∗
We need a more efficient approach:
The final regex for A is the state elimination technique
(2)
R12 = 1∗0(0 + 1)∗

59 60
The state elimination technique

Let’s label the edges with regex’s instead of Now, let’s eliminate state s.
symbols
R 11 + Q 1 S* P1
R 1m q1 p1

R 1m + Q 1 S* Pm
R 11

q1 p1

Q1
S P1

R k1 + Q k S* P1
s
qk pm
Pm R km + Q k S* Pm
Qk

qk pm
R km For each accepting state q eliminate from the
original automaton all states exept q0 and q.

R k1

61 62

For each q ∈ F we’ll be left with an Aq that


looks like

R U
S
Start Example: A, where L(A) = {W : w = x1b, or w =
x1bc, x ∈ {0, 1}∗, {b, c} ⊆ {0, 1}}
T
0,1
that corresponds to the regex Eq = (R+SU ∗T )∗SU ∗ Start 1 0,1 0,1
A B C D
or with Aq looking like

R We turn this into an automaton with regex


Start labels

0+1

Start 1 0+1 0+1


corresponding to the regex Eq = R∗ A B C D

• The final expression is


M
Eq
q∈F

63 64
0+1 From
Start 1 0+1 0+1
0+1
A B C D
Start 1( 0 + 1) 0+1
A C D
Let’s eliminate state B

0+1 we can eliminate D to obtain AC


Start 1( 0 + 1) 0+1
A C D 0+1

Start 1( 0 + 1)
A C
Then we eliminate state C and obtain AD

0+1 with regex (0 + 1)∗1(0 + 1)


Start 1( 0 + 1) ( 0 + 1)
A D • The final expression is the sum of the previ-
ous two regex’s:

with regex (0 + 1)∗1(0 + 1)(0 + 1) (0 + 1)∗1(0 + 1)(0 + 1) + (0 + 1)∗1(0 + 1)

65 66

From regex’s to -NFA’s Induction: Automata for R + S, RS, and R∗

Theorem 3.7: For every regex R we can con- ε R ε


struct and -NFA A, s.t. L(A) = L(R).

ε ε
Proof: By structural induction: S

(a)
Basis: Automata for , ∅, and a.

ε ε
R S

(a)
(b)

ε
(b) ε ε
R

ε
(c) (c)

67 68
Example: We convert (0 + 1)∗1(0 + 1)

ε 0 ε Algebraic Laws for languages

ε ε
1 • L ∪ M = M ∪ L.
(a)

ε
0
Union is commutative.
ε ε
ε ε

ε ε ε • (L ∪ M ) ∪ N = L ∪ (M ∪ N ).
1

(b) Union is associative.


ε
ε 0 ε • (LM )N = L(M N ).
Start ε ε
ε
ε ε
ε 1 Concatenation is associative

ε 0 ε Note: Concatenation is not commutative, i.e.,


1 ε
there are L and M such that LM 6= M L.
ε ε
1
(c)

69 70

• L(M ∪ N ) = LM ∪ LN .

• ∅ ∪ L = L ∪ ∅ = L. Concatenation is left distributive over union.

∅ is identity for union. • (M ∪ N )L = M L ∪ N L.

• {}L = L{} = L. Concatenation is right distributive over union.

{} is left and right identity for concatenation. • L ∪ L = L.

• ∅L = L∅ = ∅. Union is idempotent.

∅ is left and right annihilator for concatenation. • ∅∗ = {}, {}∗ = {}.

• L+ = LL∗ = L∗L, L∗ = L+ ∪ {}

71 72
Algebraic Laws for regex’s

• (L∗)∗ = L∗. Closure is idempotent Evidently e.g. L((0 + 1)1) = L(01 + 11)

Proof: Also e.g. L((00 + 101)11) = L(0011 + 10111).


∞ ∞ !i
[ [
w ∈ (L∗)∗ ⇐⇒ w ∈ Lj
More generally
i=0 j=0
L((E + F )G) = L(EG + F G)
⇐⇒ ∃k, m ∈ N : w ∈ (Lm)k
for any regex’s E, F , and G.
⇐⇒ ∃p ∈ N : w ∈ Lp

[ • How do we verify that a general identity like
⇐⇒ w ∈ Li above is true?
i=0

⇐⇒ w ∈ L∗  1. Prove it by hand.

2. Let the computer prove it.

73 74

In Chapter 4 we will learn how to test auto-


matically if E = F , for any concrete regex’s
E and F . Answer: Yes, as long as the identities use only
plus, dot, and star.
We want to test general identities, such as
E + F = F + E, for any regex’s E and F . Let’s denote a generalized regex, such as (E + F )E
by
Method:
E(E, F )

1. “Freeze” E to a1, and F to a2


Now we can for instance make the substitution
S = {E/0, F /11} to obtain
2. Test automatically if the frozen identity is
true, e.g. if L(a1 + a2) = L(a2 + a1) S (E(E, F )) = (0 + 11)0

Question: Does this always work?

75 76
For example: Suppose the alphabet is {1, 2}.
Theorem 3.13: Fix a “freezing” substitution Let E(E1, E2) be (E1 + E2)E1, and let E1 be 1,
♠ = {E1/a1, E2/a2, . . . , Em/am}. and E2 be 2. Then

Let E(E1, E2, . . . , Em) be a generalized regex. w ∈ L(E(E1, E2)) = L((E1 + E2)E1) =
Then for any regex’s E1, E2, . . . , Em,
({1} ∪ {2}){1} = {11, 21}
w ∈ L(E(E1, E2, . . . , Em)) if and only if
if and only if there are strings wi ∈ L(Ei), s.t.
∃w1 ∈ L(E1) = {1}, ∃w2 ∈ L(E2) = {2} : w = wj1 wj2
w = wj1 wj2 · · · wjk and
and
aj1 aj2 ∈ L(E(a1, a2))) = L((a1+a2)a1) = {a1a1, a2a1}
aj1 aj2 · · · ajk ∈ L(E(a1, a2, . . . , am)) if and only if
j1 = j2 = 1, or j1 = 1, and j2 = 2

77 78

Induction:

Case 1: E = F + G.

Then ♠(E) = ♠(F) + ♠(G), and


L(♠(E)) = L(♠(F)) ∪ L(♠(G))
Proof of Theorem 3.13: We do a structural
induction of E. Let E and and F be regex’s. Then w ∈ L(E + F )
if and only if w ∈ L(E) or w ∈ L(F ), if and only
Basis: If E = , the frozen expression is also . if a1 ∈ L(♠(F)) or a2 ∈ L(♠(G)), if and only if
a1 ∈ ♠(E), or a2 ∈ ♠(E).
If E = ∅, the frozen expression is also ∅.
Case 2: E = F.G.

If E = a, the frozen expression is also a. Now Then ♠(E) = ♠(F).♠(G), and


w ∈ L(E) if and only if there is u ∈ L(a), s.t. L(♠(E)) = L(♠(F)).L(♠(G))
w = u and u is in the language of the frozen
expression, i.e. u ∈ {a}. Let E and and F be regex’s. Then w ∈ L(E.F )
if and only if w = w1w2, w1 ∈ L(E) and w2 ∈ L(F ),
and a1a2 ∈ L(♠(F)).L(♠(G)) = ♠(E)

Case 3: E = F∗.

Prove this case at home.


79 80
Theorem 3.14: E(E1, . . . , Em) = F(E1, . . . , Em) ⇔
L(♠(E)) = L(♠(F))

Proof:

Examples: (Only if direction) E(E1, . . . , Em) = F(E1, . . . , Em)


means that L(E(E1, . . . , Em)) = L(F(E1, . . . , Em))
for any concrete regex’s E1, . . . , Em. In partic-
To prove (L + M)∗ = (L∗M∗)∗ it is enough to
ular then L(♠(E)) = L(♠(F))
determine if (a1 + a2)∗ is equivalent to (a∗1a∗2)∗

(If direction) Let E1, . . . , Em be concrete regex’s.


To verify L∗ = L∗L∗ test if a∗1 is equivalent to
Suppose L(♠(E)) = L(♠(F)). Then by Theo-
a∗1a∗1.
rem 3.13,

Question: Does L + ML = (L + M)L hold? w ∈ L(E(E1, . . . Em)) ⇔

∃wi ∈ L(Ei), w = wj1 · · · wjm , aj1 · · · ajm ∈ L(♠(E)) ⇔

∃wi ∈ L(Ei), w = wj1 · · · wjm , aj1 · · · ajm ∈ L(♠(F)) ⇔

w ∈ L(F(E1, . . . Em))

81 82

Theorem 3.14: E(E1, . . . , Em) = F(E1, . . . , Em) ⇔


L(♠(E)) = L(♠(F))

Proof:

Examples: (Only if direction) E(E1, . . . , Em) = F(E1, . . . , Em)


means that L(E(E1, . . . , Em)) = L(F(E1, . . . , Em))
for any concrete regex’s E1, . . . , Em. In partic-
To prove (L + M)∗ = (L∗M∗)∗ it is enough to
ular then L(♠(E)) = L(♠(F))
determine if (a1 + a2)∗ is equivalent to (a∗1a∗2)∗

(If direction) Let E1, . . . , Em be concrete regex’s.


To verify L∗ = L∗L∗ test if a∗1 is equivalent to
Suppose L(♠(E)) = L(♠(F)). Then by Theo-
a∗1a∗1.
rem 3.13,

Question: Does L + ML = (L + M)L hold? w ∈ L(E(E1, . . . Em)) ⇔

∃wi ∈ L(Ei), w = wj1 · · · wjm , aj1 · · · ajm ∈ L(♠(E)) ⇔

∃wi ∈ L(Ei), w = wj1 · · · wjm , aj1 · · · ajm ∈ L(♠(F)) ⇔

w ∈ L(F(E1, . . . Em))

83 84
The Pumping Lemma Informally
Properties of Regular Languages
Suppose L01 = {0n1n : n ≥ 1} were regular.
• Pumping Lemma. Every regular language
satisfies the pumping lemma. If somebody Then it would be recognized by some DFA A,
presents you with fake regular language, use with, say, k states.
the pumping lemma to show a contradiction.
Let A read 0k . On the way it will travel as
• Closure properties. Building automata from follows:
components through operations, e.g. given L
and M we can build an automaton for L ∩ M .  p0
0 p1
• Decision properties. Computational analysis 00 p2
of automata, e.g. are two automata equiva- ... ...
lent. 0k pk

• Minimization techniques. We can save money ⇒ ∃i < j : pi = pj Call this state q.


since we can build smaller machines.
⇒ δ̂(p0, 0i) = δ̂(p0, 0j ) = q

85 86

Theorem 4.1.
Now you can fool A:
The Pumping Lemma for Regular Languages.

If δ̂(q, 1i) ∈ F the machine will foolishly ac-


Let L be regular.
cept 0j 1i.

Then ∃n, ∀w ∈ L : |w| ≥ n ⇒ w = xyz such that


If δ̂(q, 1i) ∈
/ F the machine will foolishly re-
i
ject 0 1 . i
1. y 6= 

Therefore L01 cannot be regular. 2. |xy| ≤ n

• Let’s generalize the above reasoning. 3. ∀k ≥ 0, xy k z ∈ L

87 88
Now w = xyz, where

1. x = a1a2 . . . ai
Proof: Suppose L is regular

The L is recognized by some DFA A with, say, 2. y = ai+1ai+2 . . . aj


n states.

3. z = aj+1aj+2 . . . am
Let w = a1a2 . . . am ∈ L, m > n.
y=
ai+1 . . . aj
Let pi = δ̂(q0, a1a2 . . . ai).
x= z=
Start a1 . . . ai aj+1 . . . am
⇒ ∃ i < j : pi = pj p0 pi

Evidently xy k z ∈ L, for any k ≥ 0. Q.E.D.

89 90

Suppose Lpr = {1p : p is prime } were regular.

Then Lpr = L(A), for some DFA A with, say


Example: Let Leq be the language of strings
n, states.
with equal number of zero’s and one’s.
Choose a prime p ≥ n + 2.
Suppose Leq is regular. Then Leq = L(A), for
some DFA A with, say, n states, and w =
p
0n1n ∈ L(A). z }| {
w = 111 · 1} 1111
| {z· · }· ·| ·{z | {z· · · 11}
x y z
By the pumping lemma w = xyz, |xy| ≤ n, |y|=m
y 6=  and xy k z ∈ L(A)
Now xy p−mz ∈ L(A)

|xy p−mz| = |xz| + (p − m)|y| =


w = 000 . 0} 0111
| {z. . }. .| .{z | {z. . . 11} p − m + (p − m)m = (1 + m)(p − m)
x y z which is not prime unless one of the factors
is 1.
In particular, xz ∈ L(A), but xz has fewer 0’s
• y 6=  ⇒ 1 + m > 1
than 1’s. ⇒ L(A) 6= Leq . • m = |y| ≤ |xy| ≤ n, p ≥ n + 2
⇒ p − m ≥ n + 2 − n = 2.
⇒ L(A) 6= Lpr .
91 92
Closure Properties of Regular Languages

Let L and M be regular languages. Then the


following languages are all regular: Theorem 4.4. For any regular L and M , L∪M
is regular.
• Union: L ∪ M
Proof. Let L = L(E) and M = L(F ). Then
• Intersection: L ∩ M
L(E + F ) = L ∪ M by definition.
• Complement: N

• Difference: L \ M Theorem 4.5. If L is a regular language over


Σ, then so is L = Σ∗ \ L.
• Reversal: LR = {wR : w ∈ L}

• Closure: L∗. Proof. Let L be recognized by a DFA

• Concatenation: L.M A = (Q, Σ, δ, q0, F ).

• Homomorphism: Let B = (Q, Σ, δ, q0, Q \ F ). Now L(B) = L.


h(L) = {h(w) : w ∈ L, h is a homom. }

• Inverse homomorphism:
h−1(L) = {w ∈ Σ : h(w) ∈ L, h : Σ → ∆ is a homom.}
93 94

Example:

Let L be recognized by the DFA below

1 0

Start Theorem 4.8. If L and M are regular, then


0 1
{q0} {q0, q1} {q0, q2} so is L ∩ M .
0
Proof. By DeMorgan’s law L ∩ M = L ∪ M .
1 We already that regular languages are closed
under complement and union.
Then L is recognized by

1 0
We shall shall also give a nice direct proof, the
Start Cartesian construction from the e-commerce
0 1
{q0} {q0, q1} {q0, q2} example.

Question: What are the regex’s for L and L


95 96
Theorem 4.8. If L and M are regular, then If AL goes from state p to state s on reading a,
so in L ∩ M . and AM goes from state q to state t on reading
a, then AL∩M will go from state (p, q) to state
Proof. Let L be the language of (s, t) on reading a.
AL = (QL, Σ, δL, qL, FL)
Input a
and M be the language of

AM = (QM , Σ, δM , qM , FM )

AL
We assume w.l.o.g. that both automata are
deterministic.
Start AND Accept

We shall construct an automaton that simu-


lates AL and AM in parallel, and accepts if and
AM
only if both AL and AM accept.

97 98

Example: (c) = (a) × (b)

Formally 1

Start 0 0,1
AL∩M = (QL × QM , Σ, δL∩M , (qL, qM ), FL × FM ), p q

where (a)
0
δL∩M ((p, q), a) = (δL(p, a), δM (q, a)) 0,1
Start 1
r s

It will be shown in the tutorial by and induction (b)

on |w| that 1
  Start 1
δ̂L∩M ((qL, qM ), w) = δ̂L(qL, w), δ̂M (qM , w) pr ps

0 0
The claim then follows.
0,1
qr qs
Question: Why? 1
0
(c)

99 100
Theorem 4.11. If L is a regular language,
then so is LR .

Proof 1: Let L be recognized by an FA A.


Theorem 4.10. If L and M are regular lan- Turn A into an FA for LR , by
guages, then so in L \ M .
1. Reversing all arcs.
Proof. Observe that L \ M = L ∩ M . We
already know that regular languages are closed 2. Make the old start state the new sole ac-
under complement and intersection. cepting state.

3. Create a new start state p0, with δ(p0, ) = F


(the old accepting states).

101 102

Theorem 4.11. If L is a regular language,


then so is LR .

Homomorphisms
Proof 2: Let L be described by a regex E.
We shall construct a regex E R , such that
L(E R ) = (L(E))R . A homomorphism on Σ is a function h : Σ → Θ∗,
where Σ and Θ are alphabets.
We proceed by a structural induction on E.
Let w = a1a2 · · · an ∈ Σ∗. Then
Basis: If E is , ∅, or a, then E R = E.
h(w) = h(a1)h(a2) · · · h(an)
Induction: and
1. E = F + G. Then E R = F R + GR h(L) = {h(w) : w ∈ L}

2. E = F.G. Then E R = GR .F R
Example: Let h : {0, 1} → {a, b}∗ be defined by
3. E = F ∗. Then E R = (F R )∗ h(0) = ab, and h(1) = . Now h(0011) = abab.

We will show by structural induction on E on Example: h(L(10∗1)) = L((ab)∗).


blackboard in class that

L(E R ) = (L(E))R
103 104
Theorem 4.14: h(L) is regular, whenever L
is. Inverse Homomorphism

Proof:
Let h : Σ → Θ∗ be a homom. Let L ⊆ Θ∗, and
Let L = L(E) for a regex E. We claim that define
L(h(E)) = h(L).
h−1(L) = {w ∈ Σ∗ : h(w) ∈ L}
Basis: If E is  or ∅. Then h(E) = E, and
L(h(E)) = L(E) = h(L(E)).

If E is a, then L(E) = {a}, L(h(E)) = L(h(a)) = L h h(L)


{h(a)} = h(L(E)).

Induction:
(a)

Case 1: L = E + F . Now L(h(E + F )) =


L(h(E)+h(F )) = L(h(E))∪L(h(F )) = h(L(E))∪
h(L(F )) = h(L(E) ∪ L(F )) = h(L(E + F )).

Case 2: L = E.F . Now L(h(E.F )) = L(h(E)).L(h(F )) h-1 (L) h L

= h(L(E)).h(L(F )) = h(L(E).L(F ))

Case 3: L = E ∗. Now L(h(E ∗)) = L(h(E)∗) = (b)


L(h(E))∗ = h(L(E))∗ = h(L(E ∗)).
105 106

Example: Let h : {a, b} → {0, 1}∗ be defined by


h(a) = 01, and h(b) = 10. If L = L((00 + 1)∗), Theorem 4.16: Let h : Σ → Θ∗ be a homom.,
then h−1(L) = L((ba)∗). and L ⊆ Θ∗ regular. Then h−1(L) is regular.

Claim: h(w) ∈ L if and only if w = (ba)n


Proof: Let L be the language of A = (Q, Θ, δ, q0, F ).
We define B = (Q, Σ, γ, q0, F ), where
Proof: Let w = (ba)n. Then h(w) = (1001)n ∈
L. γ(q, a) = δ̂(q, h(a))

/ L((ba)∗). There
Let h(w) ∈ L, and suppose w ∈ It will be shown by induction on |w| in the tu-
are four cases to consider. torial that γ̂(q0, w) = δ̂(q0, h(w))

1. w begins with a. Then h(w) begins with Input a


/ L((00 + 1)∗).
01 and ∈

2. w ends in b. Then h(w) ends in 10 and h


/ L((00 + 1)∗).

Input
Start h(a) to A
3. w = xaay. Then h(w) = z0101v and ∈
/
L((00 + 1)∗). Accept/reject
A
4. w = xbby. Then h(w) = z1010v and ∈
/
L((00 + 1)∗).

107 108
Decision Properties
From NFA’s to DFA’s

We consider the following: Suppose the -NFA has n states.

1. Converting among representations for reg- To compute ECLOSE(p) we follow at most n2


ular languages. arcs.

The DFA has 2n states, for each state S and


2. Is L = ∅? each a ∈ Σ we compute δD (S, a) in n3 steps.
Grand total is O(n32n) steps.
3. Is w ∈ L?
If we compute δ for reachable states only, we
need to compute δD (S, a) only s times, where s
4. Do two descriptions define the same lan- is the number of reachable states. Grand total
guage? is O(n3s) steps.

109 110

From DFA to NFA Testing emptiness

All we need to do is to put set brackets around L(A) 6= ∅ for FA A if and only if a final state
the states. Total O(n) steps. is reachable from the start state in A. Total
O(n2) steps.
From FA to regex
Alternatively, we can inspect a regex E and tell
We need to compute n3 entries of size up to if L(E) = ∅. We use the following method:
4n. Total is O(n34n).
E = F + G. Now L(E) is empty if and only if
The FA is allowed to be a NFA. If we first both L(F ) and L(G) are empty.
wanted to convert the NFA to a DFA, the total
time would be doubly exponential E = F.G. Now L(E) is empty if and only if
either L(F ) or L(G) is empty.
From regex to FA’s We can build an expres-
E = F ∗. Now L(E) is never empty, since  ∈
sion tree for the regex in n steps.
L(E).
We can construct the automaton in n steps.
E = . Now L(E) is not empty.
Eliminating -transitions takes O(n3) steps.
E = a. Now L(E) is not empty.
If you want a DFA, you might need an expo-
nential number of steps. E = ∅. Now L(E) is empty.
111 112
Example 4.17

Let A = (Q, Σ, δ, q0, F ) be a DFA and L(A) = M .


Testing membership
Let L ⊆ M be those words in w ∈ L(A) for
which A visits every state in Q at least once
To test w ∈ L(A) for DFA A, simulate A on w.
when accepting w.
If |w| = n, this takes O(n) steps.
We shall use closure properties of regular lan-
If A is an NFA and has s states, simulating A guages to prove that L is regular.
on w takes O(ns2) steps.
Plan of the proof:
If A is an -NFA and has s states, simulating
A on w takes O(ns3) steps.
M The language of automaton A

Inverse homomorphism

L1 Strings of M with state transitions embedded

If L = L(E), for regex E of length s, we first Intersection with a regular language

L2
convert E to an -NFA with 2s states. Then we
Add condition that first state is the start state

Difference with a regular language


simulate w on this machine, in O(ns3) steps. L3 Add condition that adjacent states are equal

Difference with regular languages

L4 Add condition that all states appear on the path

Homomorphism

L Delete state components, leaving the symbols

113 114

• M L1

Define T = {[paq] : p, q ∈ Q, a ∈ Σ, δ(p, a) = q} For example h−1(101) =


n
[p1p][p0q][p1p],
Let h : T ∗ → Σ∗ be the homom. defined by
[p1p][p0q][q1q],
h([paq]) = a [p1p][q0q][p1p],
[p1p][q0q][q1q],
Let L1 = h−1(M ). Since M is regular, so is L1 [q1q][p0q][p1p],
[q1q][p0q][q1q],
Example: Suppose A is given by [q1q][q0q][p1p],
o
δ 0 1
→p q p [q1q][q0q][q1q]
?q q q

Then T = {[p0q], [p1p], [q0q], [q1q]}.

115 116
• L2 L3

Define
M
E2 = [paq][rbs]
• L1 L2 [paq]∈T ∗ , [rbs]∈T ∗ , q6=r

Define Let
M
E1 = [q0ap] L3 = L2 \ T ∗.L(E2).T ∗
a∈Σ, δ(q0 ,a)=p

Now L3 is regular and consists of those strings


Let
  [q0a1p1][p1a2p2] . . . [pn−1anpn]
L2 = L1 ∩ L(E1).T ∗
in T ∗ such that

Now L2 is regular and consists of those strings a1a2 . . . an ∈ L(A)


in L1 starting with [qo . . .][. . .
δ(qo, a1) = p1

δ(pi, ai+1) = pi+1, i ∈ {1, 2, . . . , n − 1}

pn ∈ F
117 118

• L3 L4

Define • L4 L
M
Eq = [ras]
We only need to get rid of the state compo-
[ras]∈T ∗ , r6=q, s6=q
nents in the words of L4.

Now L3 \ L(Eq∗) consists of those strings in L3


We can do this by letting
that “visit” state q at least once.
L = h(L4)
Let
 M 
L4 = L3 \ L Eq∗ Now L =
q∈Q
{w : δ̂(q0, w) ∈ F, ∀q ∈ Q ∃xq , y(w = xq y ∧ δ̂(q0, xq ) = q)}

Now L4 is regular and consists of those strings


in L3 that “visit” all states q ∈ Q at least once.

119 120
Equivalence and Minimization of Automata
Example:

Let A = (Q, Σ, δ, q0, F ) be a DFA, and {p, q} ⊆ Q. 1


0
We define
Start 0 1 0
A B C D
p ≡ q ⇔ ∀w ∈ Σ∗ : δ̂(p, w) ∈ F iff δ̂(q, w) ∈ F 1
1 0

0 1
• If p ≡ q we say that p and q are equivalent 1 1 0
E F G H
1 0
• If p 6≡ q we say that p and q are distinguish-
able 0

IOW (in other words) p and q are distinguish- δ̂(C, ) ∈ F, δ̂(G, ) ∈


/ F ⇒ C 6≡ G
able iff
δ̂(A, 01) = C ∈ F, δ̂(G, 01) = E ∈
/ F ⇒ A 6≡ G
∃w : δ̂(p, w) ∈ F and δ̂(q, w) ∈
/ F, or vice versa

121 122

What about A and E?


We can compute distinguishable pairs with the
1
0 following inductive table filling algorithm:
Start 0 1 0
A B C D
Basis: If p ∈ F and q 6∈ F , then p 6≡ q.
1 0 1

0 1
Induction: If ∃a ∈ Σ : δ(p, a) 6≡ δ(q, a),
E
1
F
1
G
0
H
then p 6≡ q.
1 0

Example:
0
Applying the table filling algo to DFA A:
δ̂(A, ) = A ∈
/ F, δ̂(E, ) = E ∈
/F
B x
δ̂(A, 1) = F = δ̂(E, 1) C x x
D x x x
Therefore δ̂(A, 1x) = δ̂(E, 1x) = δ̂(F, x) E x x x
F x x x x
δ̂(A, 00) = G = δ̂(E, 00)
G x x x x x x

δ̂(A, 01) = C = δ̂(E, 01) H x x x x x x

A B C D E F G
Conclusion: A ≡ E.
123 124
Theorem 4.20: If p and q are not distin-
guished by the TF-algo, then p ≡ q.

Proof: Suppose to the contrary that that there


Consider states r = δ(p, a1) and s = δ(q, a1).
is a bad pair {p, q}, s.t.
Now {r, s} cannot be a bad pair since {r, s}
would be indentified by a string shorter than w.
1. ∃w : δ̂(p, w) ∈ F, δ̂(q, w) ∈
/ F , or vice versa. Therefore, the TF-algo must have discovered
that r and s are distinguishable.

2. The TF-algo does not distinguish between But then the TF-algo would distinguish p from
p and q. q in the inductive part.

Let w = a1a2 · · · an be the shortest string that Thus there are no bad pairs and the theorem
identifies a bad pair {p, q}. is true.

Now w 6=  since otherwise the TF-algo would


in the basis distinguish p from q. Thus n ≥ 1.

125 126

Example:
Testing Equivalence of Regular Languages 0 1

Start 1
A B
Let L and M be reg langs (each given in some 0
0
form).
Start 0
C D

To test if L = M 1 0
1

1. Convert both L and M to DFA’s. 1

We can “see” that both DFA accept


2. Imagine the DFA that is the union of the L( + (0 + 1)∗0). The result of the TF-algo is
two DFA’s (never mind there are two start
states) B x
C x
D x
3. If TF-algo says that the two start states
E x x x
are distinguishable, then L 6= M , otherwise
A B C D
L = M.

Therefore the two automata are equivalent.


127 128
Minimization of DFA’s Theorem 4.23: If p ≡ q and q ≡ r, then p ≡ r.

We can use the TF-algo to minimize a DFA Proof: Suppose to the contrary that p 6≡ r.
by merging all equivalent states. IOW, replace Then ∃w such that δ̂(p, w) ∈ F and δ̂(r, w) 6∈ F ,
each state p by p/≡ . or vice versa.

Example: The DFA on slide 119 has equiva- OTH, δ̂(q, w) is either accpeting or not.
lence classes {{A, E}, {B, H}, {C}, {D, F }, {G}}.
Case 1: δ̂(q, w) is accepting. Then q 6≡ r.
The “union” DFA on slide 125 has equivalence
classes {{A, C, D}, {B, E}}. Case 1: δ̂(q, w) is not accepting. Then p 6≡ q.

Note: In order for p/≡ to be an equivalence The vice versa case is proved symmetrically
class, the relation ≡ has to be an equivalence
relation (reflexive, symmetric, and transitive). Therefore it must be that p ≡ r.

129 130

Example: We can minimize

0 1

Start 0 1 0
A B C D

To minimize a DFA A = (Q, Σ, δ, q0, F ) con- 1 0 1

struct a DFA B = (Q/≡ , Σ, γ, q0/≡ , F/≡ ), where 0 1


1 1 0
E F G H

γ(p/≡ , a) = δ(p, a)/≡ 1 0

0
In order for B to be well defined we have to
show that
to obtain
If p ≡ q then δ(p, a) ≡ δ(q, a)
1
If δ(p, a) 6≡ δ(q, a), then the TF-algo would con-
0
clude p 6≡ q, so B is indeed well defined. Note G
1
D,F
1
also that F/≡ contains all and only the accept-
Start
ing states of A. A,E 0 0

1
B,H C

1
0

131 132
NOTE: We cannot apply the TF-algo to NFA’s. Why the Minimized DFA Can’t Be Beaten

For example, to minimize Let B be the minimized DFA obtained by ap-


plying the TF-algo to DFA A.
0,1

Start 0 We already know that L(A) = L(B).


A B

What if there existed a DFA C, with


1 0
L(C) = L(B) and fewer states than B?

C
Then run the TF-algo on B “union” C.

we simply remove state C. Since L(B) = L(C) we have q0B ≡ q0C .

However, A 6≡ C. Also, δ(q0B , a) ≡ δ(q0C , a), for any a.

133 134

Context-Free Grammars and Languages

• We have seen that many languages cannot


Claim: For each state p in B there is at least be regular. Thus we need to consider larger
one state q in C, s.t. p ≡ q. classes of langs.

Proof of claim: There are no inaccessible states, • Contex-Free Languages (CFL’s) played a cen-
so p = δ̂(q0B , a1a2 · · · ak ), for some string a1a2 · · · ak . tral role natural languages since the 1950’s,
Now q = δ̂(q0C , a1a2 · · · ak ), and p ≡ q. and in compilers since the 1960’s.

Since C has fewer states than B, there must be • Context-Free Grammars (CFG’s) are the ba-
two states r and s of B such that r ≡ t ≡ s, for sis of BNF-syntax.
some state t of C. But then r ≡ s (why?)
which is a contradiction, since B was con- • Today CFL’s are increasingly important for
structed by the TF-algo. XML and their DTD’s.

We’ll look at: CFG’s, the languages they gen-


erate, parse trees, pushdown automata, and
closure properties of CFL’s.
135 136
Informal example of CFG’s
CFG’s is a formal mechanism for definitions
Consider Lpal = {w ∈ Σ∗ : w = wR } such as the one for Lpal .

For example otto ∈ Lpal , madamimadam ∈ Lpal .


1. P → 
In Finnish language e.g. saippuakauppias ∈ Lpal 2. P → 0
(“soap-merchant”) 3. P → 1
4. P → 0P 0
Let Σ = {0, 1} and suppose Lpal were regular.
5. P → 1P 1
Let n be given by the pumping lemma. Then
0n10n ∈ Lpal . In reading 0n the FA must make
0 and 1 are terminals
a loop. Omit the loop; contradiction.

Let’s define Lpal inductively: P is a variable (or nonterminal, or syntactic


category)
Basis: , 0, and 1 are palindromes.
P is in this grammar also the start symbol.
Induction: If w is a palindrome, so are 0w0
and 1w1.
1–5 are productions (or rules)
Circumscription: Nothing else is a palindrome.
137 138

Formal definition of CFG’s

Example: Gpal = ({P }, {0, 1}, A, P ), where A =


A context-free grammar is a quadruple
{P → , P → 0, P → 1, P → 0P 0, P → 1P 1}.
G = (V, T, P, S)
Sometimes we group productions with the same
where head, e.g. A = {P → |0|1|0P 0|1P 1}.

V is a finite set of variables. Example: Regular expressions over {0, 1} can


be defined by the grammar
T is a finite set of terminals. Gregex = ({E}, {0, 1}, A, E)
where A =
P is a finite set of productions of the form
A → α, where A is a variable and α ∈ (V ∪ T )∗ {E → 0, E → 1, E → E.E, E → E+E, E → E ?, E → (E)}

S is a designated variable called the start symbol.

139 140
Example: (simple) expressions in a typical prog
lang. Operators are + and *, and arguments Derivations using grammars
are identfiers, i.e. strings in
L((a + b)(a + b + 0 + 1)∗) • Recursive inference, using productions from
body to head
The expressions are defined by the grammar
G = ({E, I}, T, P, E) • Derivations, using productions from head to
where T = {+, ∗, (, ), a, b, 0, 1} and P is the fol- body.
lowing set of productions:
Example of recursive inference:
1. E →I
2. E →E+E String Lang Prod String(s) used
3. E →E∗E (i) a I 5 -
4. E → (E) (ii) b I 6 -
(iii) b0 I 9 (ii)
5. I →a (iv) b00 I 9 (iii)
6. I →b (v) a E 1 (i)
7. I → Ia (vi) b00 E 1 (iv)
8. I → Ib (vii) a + b00 E 2 (v), (vi)
(viii) (a + b00) E 4 (vii)
9. I → I0
(ix) a ∗ (a + b00) E 3 (v), (viii)
10. I → I1

141 142

Example: Derivation of a ∗ (a + b00) from E in


• Derivations the grammar of slide 138:

Let G = (V, T, P, S) be a CFG, A ∈ V ,


{α, β} ⊂ (V ∪ T )∗, and A → γ ∈ P .
E ⇒ E ∗ E ⇒ I ∗ E ⇒ a ∗ E ⇒ a ∗ (E) ⇒

Then we write a∗(E+E) ⇒ a∗(I+E) ⇒ a∗(a+E) ⇒ a∗(a+I) ⇒


αAβ ⇒ αγβ a ∗ (a + I0) ⇒ a ∗ (a + I00) ⇒ a ∗ (a + b00)
G

or, if G is understood

αAβ ⇒ αγβ
Note: At each step we might have several rules
and say that αAβ derives αγβ. to choose from, e.g.
I ∗ E ⇒ a ∗ E ⇒ a ∗ (E), versus

We define ⇒ to be the reflexive and transitive I ∗ E ⇒ I ∗ (E) ⇒ a ∗ (E).
closure of ⇒, IOW:
Note: Not all choices lead to successful deriva-

Basis: Let α ∈ (V ∪ T )∗. Then α ⇒ α. tions of a particular string, for instance

∗ ∗ E ⇒E+E
Induction: If α ⇒ β, and β ⇒ γ, then α ⇒ γ.
won’t lead to a derivation of a ∗ (a + b00).
143 144
Leftmost and Rightmost Derivations The Language of a Grammar

Leftmost derivation ⇒ Always replace the left- If G(V, T, P, S) is a CFG, then the language of
lm
most variable by one of its rule-bodies. G is

L(G) = {w ∈ T ∗ : S ⇒ w}
Rightmost derivation ⇒ Always replace the G
rm
rightmost variable by one of its rule-bodies. i.e. the set of strings over T ∗ derivable from
the start symbol.
Leftmost: The derivation on the previous slide.
If G is a CFG, we call L(G) a context-free lan-
Rightmost: guage.

E ⇒E∗E ⇒
rm rm
Example: L(Gpal ) is a context-free language.
E∗(E) ⇒ E∗(E+E) ⇒ E∗(E+I) ⇒ E∗(E+I0)
rm rm rm
Theorem 5.7:
⇒ E ∗ (E + I00) ⇒ E ∗ (E + b00) ⇒ E ∗ (I + b00)
rm rm rm
L(Gpal ) = {w ∈ {0, 1}∗ : w = wR }
⇒ E ∗ (a + b00) ⇒ I ∗ (a + b00) ⇒ a ∗ (a + b00)
rm rm rm

Proof: (⊇-direction.) Suppose w = wR . We


∗ show by induction on |w| that w ∈ L(Gpal )
We can conclude that E ⇒ a ∗ (a + b00)
rm

145 146

(⊆-direction.) We assume that w ∈ L(Gpal )


and must show that w = wR .

Basis: |w| = 0, or |w| = 1. Then w is , 0, Since w ∈ L(Gpal ), we have P ⇒ w.
or 1. Since P → , P → 0, and P → 1 are ∗

productions, we conclude that P ⇒ w in all We do an induction of the length of ⇒.
G
base cases. ∗
Basis: The derivation P ⇒ w is done in one
step.
Induction: Suppose |w| ≥ 2. Since w = wR ,
we have w = 0x0, or w = 1x1, and x = xR . Then w must be , 0, or 1, all palindromes.

∗ Induction: Let n ≥ 1, and suppose the deriva-


If w = 0x0 we know from the IH that P ⇒ x.
Then tion takes n + 1 steps. Then we must have
∗ ∗
P ⇒ 0P 0 ⇒ 0x0 = w w = 0x0 ⇐ 0P 0 ⇐ P

Thus w ∈ L(Gpal ). or

w = 1x1 ⇐ 1P 1 ⇐ P
The case for w = 1x1 is similar. where the second derivation is done in n steps.

By the IH x is a palindrome, and the inductive


proof is complete.
147 148
Example: Take G from slide 138. Then E ∗ (I + E)
Sentential Forms is a sentential form since

Let G = (V, T, P, S) be a CFG, and α ∈ (V ∪T )∗. E ⇒ E ∗E ⇒ E ∗(E) ⇒ E ∗(E +E) ⇒ E ∗(I +E)
If This derivation is neither leftmost, nor right-
∗ most
S⇒α
we say that α is a sentential form.
Example: a ∗ E is a left-sentential form, since

If S ⇒ α we say that α is a left-sentential form, E ⇒E∗E ⇒I ∗E ⇒a∗E


lm lm lm
lm
and if S ⇒ α we say that α is a right-sentential
rm
form
Example: E ∗(E +E) is a right-sentential form,
since
Note: L(G) consists of those sentential forms
that are in T ∗. E ⇒ E ∗ E ⇒ E ∗ (E) ⇒ E ∗ (E + E)
rm rm rm

149 150

Parse Trees

• If w ∈ L(G), for some CFG, then w has a Constructing Parse Trees


parse tree, which tells us the (syntactic) struc-
ture of w Let G = (V, T, P, S) be a CFG. A tree is a parse
tree for G if:
• w could be a program, a SQL-query, an XML-
document, etc. 1. Each interior node is labelled by a variable
in V .
• Parse trees are an alternative representation
to derivations and recursive inferences. 2. Each leaf is labelled by a symbol in V ∪ T ∪ {}.
Any -labelled leaf is the only child of its
• There can be several parse trees for the same parent.
string
3. If an interior node is lablelled A, and its
• Ideally there should be only one parse tree children (from left to right) labelled
(the “true” structure) for each string, i.e. the X1, X2, . . . , Xk ,
language should be unambiguous.
then A → X1X2 . . . Xk ∈ P .
• Unfortunately, we cannot always remove the
ambiguity.
151 152
Example: In the grammar
Example: In the grammar
1. P → 
1. E → I 2. P → 0
2. E → E + E 3. P → 1
3. E → E ∗ E 4. P → 0P 0
4. E → (E) 5. P → 1P 1
·
·
·
the following is a parse tree:
the following is a parse tree:
P
E
0 P 0
E + E
1 P 1
I
ε

This parse tree shows the derivation E ⇒ I +E

It shows the derivation of P ⇒ 0110.
153 154

Example: Below is an important parse tree

The Yield of a Parse Tree E * E

The yield of a parse tree is the string of leaves I ( E )


from left to right.
a E + E
Important are those parse trees where:
I I
1. The yield is a terminal string.
a I 0
2. The root is labelled by the start symbol
I 0

We shall see the the set of yields of these


b
important parse trees is the language of the
grammar.
The yield is a ∗ (a + b00).

Compare the parse tree with the derivation on


slide 141.
155 156
Let G = (V, T, P, S) be a CFG, and A ∈ V .
We are going to show that the following are
equivalent: From Inferences to Trees

1. We can determine by recursive inference


that w is in the language of A Theorem 5.12: Let G = (V, T, P, S) be a
CFG, and suppose we can show w to be in

2. A ⇒ w the language of a variable A. Then there is a

3. A ⇒ w, and A ⇒ w
∗ parse tree for G with root A and yield w.
lm rm

Proof: We do an induction of the length of


4. There is a parse tree of G with root A and
yield w. the inference.

Basis: One step. Then we must have used a


To prove the equivalences, we use the following
plan. production A → w. The desired parse tree is
then
Parse
Leftmost tree
derivation A

Rightmost
Derivation derivation Recursive w
inference

157 158

From trees to derivations


Induction: w is inferred in n + 1 steps. Sup-
pose the last step was based on a production
We’ll show how to construct a leftmost deriva-
A → X1X2 · · · Xk , tion from a parse tree.
where Xi ∈ V ∪ T . We break w up as
Example: In the grammar of slide 6 there clearly
w1w2 · · · wk , is a derivation
where wi = Xi, when Xi ∈ T , and when Xi ∈ V, E ⇒ I ⇒ Ib ⇒ ab.
then wi was previously inferred being in Xi, in
Then, for any α and β there is a derivation
at most n steps.
αEβ ⇒ αIβ ⇒ αIbβ ⇒ αabβ.
By the IH there are parse trees i with root Xi
and yield wi. Then the following is a parse tree For example, suppose we have a derivation
for G with root A and yield w:
E ⇒ E + E ⇒ E + (E).
A The we can choose α = E + ( and β =) and
continue the derivation as
X1 X2 ... Xk
E + (E) ⇒ E + (I) ⇒ E + (Ib) ⇒ E + (ab).

w1 w2 ... wk
This is why CFG’s are called context-free.
159 160
Theorem 5.14: Let G = (V, T, P, S) be a Induction: Height is n + 1. The tree must
CFG, and suppose there is a parse tree with look like

root labelled A and yield w. Then A ⇒ w in G.
lm A

Proof: We do an induction on the height of ...


X1 X2 Xk
the parse tree.

Basis: Height is 1. The tree must look like w1 w2 ... wk

A
Then w = w1w2 · · · wk , where

w 1. If Xi ∈ T , then wi = Xi.


2. If Xi ∈ V , then Xi ⇒ wi in G by the IH.
Consequently A → w ∈ P , and A ⇒ w. lm
lm

161 162

∗ (Case 2:) Xi ∈ V . By the IH there is a deriva-


Now we construct A ⇒ w by an (inner) induc-
lm
tion by showing that tion Xi ⇒ α1 ⇒ α2 ⇒ · · · ⇒ wi. By the contex-
lm lm lm lm
free property of derivations we can proceed

∀i : A ⇒ w1w2 · · · wiXi+1Xi+2 · · · Xk . with
lm

Basis: Let i = 0. We already know that
A⇒
lm

A ⇒ X1Xi+2 · · · Xk .
lm
w1w2 · · · wi−1XiXi+1 · · · Xk ⇒
lm
Induction: Make the IH that
∗ w1w2 · · · wi−1α1Xi+1 · · · Xk ⇒
A ⇒ w1w2 · · · wi−1XiXi+1 · · · Xk . lm
lm

w1w2 · · · wi−1α2Xi+1 · · · Xk ⇒
lm
(Case 1:) Xi ∈ T . Do nothing, since Xi = wi
gives us ···

A ⇒ w1w2 · · · wiXi+1 · · · Xk .
lm w1w2 · · · wi−1wiXi+1 · · · Xk

163 164
Example: Let’s construct the leftmost deriva-
For the derivation corresponding to the whole
tion for the tree
E
tree we start with E ⇒ E ∗ E and expand the
lm
E * E first E with the first derivation and the second
I ( E ) E with the second derivation:
a E + E

I I
E⇒
lm

a I 0 E∗E ⇒
lm
I 0 I ∗E ⇒
lm
b
a∗E ⇒
lm
Suppose we have inductively constructed the
a ∗ (E) ⇒
leftmost derivation lm
a ∗ (E + E) ⇒
E⇒I⇒a lm
lm lm
a ∗ (I + E) ⇒
lm

corresponding to the leftmost subtree, and the a ∗ (a + E) ⇒


lm
leftmost derivation a ∗ (a + I) ⇒
lm
E ⇒ (E) ⇒ (E + E) ⇒ (I + E) ⇒ (a + E) ⇒ a ∗ (a + I0) ⇒
lm lm lm lm lm lm

(a + I) ⇒ (a + I0) ⇒ (a + I00) ⇒ (a + b00) a ∗ (a + I00) ⇒


lm lm lm lm
a ∗ (a + b00)
corresponding to the righmost subtree.
165 166

From Derivations to Recursive Inferences


Observation: Suppose that A ⇒ X1X2 · · · Xk ⇒ w.

Then w = w1w2 · · · wk , where Xi ⇒ wi
Theorem 5.18: Let G = (V, T, P, S) be a
∗ ∗
The factor wi can be extracted from A ⇒ w by CFG. Suppose A ⇒ w, and that w is a string
G
looking at the expansion of Xi only. of terminals. Then we can infer that w is in
the language of variable A.
Example: E ⇒ a ∗ b + a, and
E ⇒ |{z} ∗ |{z}
E |{z} + |{z}
E |{z} E Proof: We do an induction on the length of
X1 X2 X3 X4 X5 ∗
the derivation A ⇒ w.
G

We have Basis: One step. If A ⇒ w there must be a


G
E ⇒E∗E ⇒E∗E+E ⇒I ∗E+E ⇒I ∗I +E ⇒ production A → w in P . The we can infer that
I ∗I +I ⇒a∗I +I ⇒a∗b+I ⇒a∗b+a w is in the language of A.

By looking at the expansion of X3 = E only,


we can extract
E ⇒ I ⇒ b.
167 168
Ambiguity in Grammars and Languages

In the grammar

Induction: Suppose A ⇒ w in n + 1 steps. 1. E →I
G
Write the derivation as 2. E →E+E
∗ 3. E →E∗E
A ⇒ X1X2 · · · Xk ⇒ w
G G 4. E → (E)
The as noted on the previous slide we can ···

break w as w1w2 · · · wk where Xi ⇒ wi. Fur- the sentential form E + E ∗ E has two deriva-
G
∗ tions:
thermore, Xi ⇒ wi can use at most n steps.
G
E ⇒E+E ⇒E+E∗E
Now we have a production A → X1X2 · · · Xk , and
E ⇒E∗E ⇒E+E∗E
and we know by the IH that we can infer wi to
be in the language of Xi. This gives us two parse trees:
E E
Therefore we can infer w1w2 · · · wk to be in the
language of A. E + E E * E

E * E E + E

(a) (b)
169 170

The mere existence of several derivations is not Definition: Let G = (V, T, P, S) be a CFG. We
dangerous, it is the existence of several parse say that G is ambiguous is there is a string in
trees that ruins a grammar. T ∗ that has more than one parse tree.

Example: In the same grammar If every string in L(G) has at most one parse
5. I → a tree, G is said to be unambiguous.

6. I → b
Example: The terminal string a + a ∗ a has two
7. I → Ia
parse trees:
8. I → Ib
9. I → I0 E E

10. I → I1
E + E E * E
the string a + b has several derivations, e.g.

E ⇒E+E ⇒I +E ⇒a+E ⇒a+I ⇒a+b I E * E E + E I

and
a I I I I a
E ⇒E+E ⇒E+I ⇒I +I ⇒I +b⇒a+b
a a a a
However, their parse trees are the same, and
the structure of a + b is unambiguous. (a) (b)

171 172
Removing Ambiguity From Grammars Solution: We introduce more variables, each
representing expressions of same “binding strength.”
Good news: Sometimes we can remove ambi-
guity “by hand”
1. A factor is an expresson that cannot be
Bad news: There is no algorithm to do it broken apart by an adjacent * or +. Our
factors are
More bad news: Some CFL’s have only am-
biguous CFG’s (a) Identifiers

We are studying the grammar (b) A parenthesized expression.


E → I | E + E | E ∗ E | (E)
I → a | b | Ia | Ib | I0 | I1 2. A term is an expresson that cannot be bro-
ken by +. For instance a ∗ b can be broken
There are two problems: by a1∗ or ∗a1. It cannot be broken by +,
since e.g. a1 + a ∗ b is (by precedence rules)
1. There is no precedence between * and + same as a1 + (a ∗ b), and a ∗ b + a1 is same
as (a ∗ b) + a1.

2. There is no grouping of sequences of op-


erators, e.g. is E + E + E meant to be 3. The rest are expressions, i.e. they can be
E + (E + E) or (E + E) + E. broken apart with * or +.

173 174

We’ll let F stand for factors, T for terms, and E


Why is the new grammar unambiguous?
for expressions. Consider the following gram-
mar:
Intuitive explanation:
1. I → a | b | Ia | Ib | I0 | I1
2. F → I | (E) • A factor is either an identifier or (E), for
3. T →F |T ∗F some expression E.
4. E →T |E+T
• The only parse tree for a sequence
Now the only parse tree for a + a ∗ a will be
f1 ∗ f2 ∗ · · · ∗ fn−1 ∗ fn
E
of factors is the one that gives f1 ∗ f2 ∗ · · · ∗ fn−1
as a term and fn as a factor, as in the parse
E + T
tree on the next slide.

T T * F
• An expression is a sequence
F F I t1 + t2 + · · · + tn−1 + tn
of terms ti. It can only be parsed with
I I a
t1 + t2 + · · · + tn−1 as an expression and tn as
a term.
a a
175 176
Leftmost derivations and Ambiguity

The two parse trees for a + a ∗ a


E E

T E + E E * E

T * F I E * E E + E I

T * F a I I I I a
.
. .
T a a a a

(a) (b)
T * F
give rise to two derivations:
F E ⇒E+E ⇒I +E ⇒a+E ⇒a+E∗E
lm lm lm lm
⇒a+I ∗E ⇒a+a∗E ⇒a+a∗I ⇒a+a∗a
lm lm lm lm

and
E ⇒ E ∗E ⇒ E +E ∗E ⇒ I +E ∗E ⇒ a+E ∗E
lm lm lm lm
⇒a+I ∗E ⇒a+a∗E ⇒a+a∗I ⇒a+a∗a
lm lm lm lm

177 178

In General:
Sketch of Proof: (Only If.) If the two parse
• One parse tree, but many derivations trees differ, they have a node a which dif-
ferent productions, say A → X1X2 · · · Xk and
• Many leftmost derivation implies many parse B → Y1Y2 · · · Ym. The corresponding leftmost
trees. derivations will use derivations based on these
two different productions and will thus be dis-
• Many rightmost derivation implies many parse tinct.
trees.
(If.) Let’s look at how we construct a parse
Theorem 5.29: For any CFG G, a terminal tree from a leftmost derivation. It should now
string w has two distinct parse trees if and only be clear that two distinct derivations gives rise
if w has two distinct leftmost derivations from to two different parse trees.
the start symbol.

179 180
Inherent Ambiguity
Let’s look at parsing the string aabbccdd.
A CFL L is inherently ambiguous if all gram-
mars for L are ambiguous.
S S

Example: Consider L = A B C

{anbncmdm : n ≥ 1, m ≥ 1}∪{anbmcmdn : n ≥ 1, m ≥ 1}.


a A b c B d a C d

A grammar for L is a b c d a D d

S → AB | C
b D c
A → aAb | ab
B → cBd | cd b c
C → aCd | aDd
(a) (b)
D → bDc | bc

181 182

From this we see that there are two leftmost derivations:

S ⇒lm AB ⇒lm aAbB ⇒lm aabbB ⇒lm aabbcBd ⇒lm aabbccdd

and

S ⇒lm C ⇒lm aCd ⇒lm aaDdd ⇒lm aabDcdd ⇒lm aabbccdd

It can be shown that every grammar for L behaves like the one above. The language L is inherently ambiguous.

Pushdown Automata

A pushdown automaton (PDA) is essentially an ε-NFA with a stack.

On a transition the PDA:

1. Consumes an input symbol.

2. Goes to a new state (or stays in the old).

3. Replaces the top of the stack by any string (does nothing, pops the stack, or pushes a string onto the stack)

[Figure: a finite control reads the input and the top of a stack, and produces accept/reject.]

183 184
Example: Let’s consider

Lwwr = {wwR : w ∈ {0, 1}∗},

with “grammar” P → 0P 0, P → 1P 1, P → ε.

A PDA for Lwwr has three states, and operates as follows:

1. Guess that you are reading w. Stay in state 0, and push the input symbol onto the stack.

2. Guess that you’re in the middle of wwR. Go spontaneously to state 1.

3. You’re now reading the head of wR. Compare it to the top of the stack. If they match, pop the stack, and remain in state 1. If they don’t match, go to sleep.

4. If the stack is empty, go to state 2 and accept.

The PDA for Lwwr as a transition diagram:

[States q0, q1, q2; start state q0, final state q2. Loops at q0: 0, Z0/0Z0; 1, Z0/1Z0; 0, 0/00; 0, 1/01; 1, 0/10; 1, 1/11. From q0 to q1: ε, Z0/Z0; ε, 0/0; ε, 1/1. Loops at q1: 0, 0/ε; 1, 1/ε. From q1 to q2: ε, Z0/Z0.]

185 186

PDA formally

A PDA is a seven-tuple:

P = (Q, Σ, Γ, δ, q0, Z0, F ),

where

• Q is a finite set of states,

• Σ is a finite input alphabet,

• Γ is a finite stack alphabet,

• δ : Q × (Σ ∪ {ε}) × Γ → 2^(Q×Γ∗) is the transition function,

• q0 is the start state,

• Z0 ∈ Γ is the start symbol for the stack, and

• F ⊆ Q is the set of accepting states.

Example: The PDA

[The transition diagram for Lwwr from the previous slide.]

is actually the seven-tuple

P = ({q0, q1, q2}, {0, 1}, {0, 1, Z0}, δ, q0, Z0, {q2}),

where δ is given by the following table (set brackets missing):

        0, Z0    1, Z0    0, 0    0, 1    1, 0    1, 1    ε, Z0    ε, 0   ε, 1
→ q0   q0, 0Z0  q0, 1Z0  q0, 00  q0, 01  q0, 10  q0, 11  q1, Z0   q1, 0  q1, 1
  q1                     q1, ε                   q1, ε   q2, Z0
 ?q2

187 188
Instantaneous Descriptions

A PDA goes from configuration to configuration when consuming input.

To reason about PDA computation, we use instantaneous descriptions of the PDA. An ID is a triple

(q, w, γ)

where q is the state, w the remaining input, and γ the stack contents.

Let P = (Q, Σ, Γ, δ, q0, Z0, F ) be a PDA. Then ∀w ∈ Σ∗, β ∈ Γ∗ :

(p, α) ∈ δ(q, a, X) ⇒ (q, aw, Xβ) ⊢ (p, w, αβ).

We define ⊢* to be the reflexive-transitive closure of ⊢.

Example: On input 1111 the PDA for Lwwr has the following computation sequences:
189 190

[Computation tree from (q0, 1111, Z0): in q0 the PDA can keep pushing, giving (q0, 111, 1Z0), (q0, 11, 11Z0), (q0, 1, 111Z0), (q0, ε, 1111Z0); from each such ID it can also jump spontaneously to q1, e.g. (q1, 1111, Z0) or (q1, 11, 11Z0); the accepting branch is (q0, 1111, Z0) ⊢ (q0, 111, 1Z0) ⊢ (q0, 11, 11Z0) ⊢ (q1, 11, 11Z0) ⊢ (q1, 1, 1Z0) ⊢ (q1, ε, Z0) ⊢ (q2, ε, Z0).]

The following properties hold:

1. If an ID sequence is a legal computation for a PDA, then so is the sequence obtained by adding an additional string at the end of component number two.

2. If an ID sequence is a legal computation for a PDA, then so is the sequence obtained by adding an additional string at the bottom of component number three.

3. If an ID sequence is a legal computation for a PDA, and some tail of the input is not consumed, then removing this tail from all ID’s results in a legal computation sequence.

191 192
Theorem 6.5: ∀w ∈ Σ∗, γ ∈ Γ∗ :

(q, x, α) ⊢* (p, y, β) ⇒ (q, xw, αγ) ⊢* (p, yw, βγ).

Proof: Induction on the length of the sequence to the left.

Note: If γ = ε we have property 1, and if w = ε we have property 2.

Note 2: The converse of the theorem is false.

For property 3 we have

Theorem 6.6:

(q, xw, α) ⊢* (p, yw, β) ⇒ (q, x, α) ⊢* (p, y, β).

Acceptance by final state

Let P = (Q, Σ, Γ, δ, q0, Z0, F ) be a PDA. The language accepted by P by final state is

L(P ) = {w : (q0, w, Z0) ⊢* (q, ε, α), q ∈ F }.

Example: The PDA on slide 183 accepts exactly Lwwr.

Let P be the machine. We prove that L(P ) = Lwwr.

(⊇-direction.) Let x ∈ Lwwr. Then x = wwR, and the following is a legal computation sequence:

(q0, wwR, Z0) ⊢* (q0, wR, wRZ0) ⊢ (q1, wR, wRZ0) ⊢* (q1, ε, Z0) ⊢ (q2, ε, Z0).

193 194

(⊆-direction.)

Observe that the only way the PDA can enter q2 is if it is in state q1 with an empty stack.

Thus it is sufficient to show that if (q0, x, Z0) ⊢* (q1, ε, Z0) then x = wwR, for some word w.

We’ll show by induction on |x| that

(q0, x, α) ⊢* (q1, ε, α) ⇒ x = wwR.

Basis: If x = ε then x is a palindrome.

Induction: Suppose x = a1a2 . . . an, where n > 0, and the IH holds for shorter strings.

There are two moves for the PDA from ID (q0, x, α):

Move 1: The spontaneous (q0, x, α) ⊢ (q1, x, α). Now (q1, x, α) ⊢* (q1, ε, β) implies |β| < |α| (in q1 the PDA only pops), which implies β ≠ α.

Move 2: Loop and push (q0, a1a2 . . . an, α) ⊢ (q0, a2 . . . an, a1α).

In this case there is a sequence

(q0, a1a2 . . . an, α) ⊢ (q0, a2 . . . an, a1α) ⊢ . . . ⊢ (q1, an, a1α) ⊢ (q1, ε, α).

Thus a1 = an and

(q0, a2 . . . an, a1α) ⊢* (q1, an, a1α).

By Theorem 6.6 we can remove an. Therefore

(q0, a2 . . . an−1, a1α) ⊢* (q1, ε, a1α).

Then, by the IH a2 . . . an−1 = yyR. Then x = a1yyRan is a palindrome.

195 196
Acceptance by Empty Stack

Let P = (Q, Σ, Γ, δ, q0, Z0, F ) be a PDA. The language accepted by P by empty stack is

N (P ) = {w : (q0, w, Z0) ⊢* (q, ε, ε)}.

Note: q can be any state.

Question: How to modify the palindrome-PDA to accept by empty stack?

From Empty Stack to Final State

Theorem 6.9: If L = N (PN ) for some PDA PN = (Q, Σ, Γ, δN , q0, Z0), then ∃ PDA PF , such that L = L(PF ).

Proof: Let

PF = (Q ∪ {p0, pf }, Σ, Γ ∪ {X0}, δF , p0, X0, {pf })

where δF (p0, ε, X0) = {(q0, Z0X0)}, and for all q ∈ Q, a ∈ Σ ∪ {ε}, Y ∈ Γ : δF (q, a, Y ) = δN (q, a, Y ), and in addition (pf , ε) ∈ δF (q, ε, X0).

[Diagram: a new start state p0 with transition ε, X0/Z0X0 into the start state q0 of PN ; from every state of PN an ε, X0/ε transition leads to the new final state pf .]

197 198
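The construction of Theorem 6.9 is mechanical, so it can be stated as code. Below is a minimal sketch in Python; the representation (transition function as a dict of sets, pushed strings as tuples of stack symbols, "" for ε, and the fresh names p0, pf, X0) is our own choice for illustration, not the handout’s.

    def empty_stack_to_final_state(PN):
        p0, pf, X0 = "p0", "pf", "X0"            # assumed-fresh names
        delta = {k: set(v) for k, v in PN["delta"].items()}
        # New start move: push Z0 on top of the new bottom marker X0.
        delta[(p0, "", X0)] = {(PN["q0"], (PN["Z0"], X0))}
        # Seeing X0 in any old state means PN's own stack is empty: accept.
        for q in PN["Q"]:
            delta.setdefault((q, "", X0), set()).add((pf, ()))
        return {"Q": PN["Q"] | {p0, pf}, "Sigma": PN["Sigma"],
                "Gamma": PN["Gamma"] | {X0}, "delta": delta,
                "q0": p0, "Z0": X0, "F": {pf}}

The marker X0 is never pushed again, so PF reaches pf exactly when PN would have emptied its stack.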

We have to show that L(PF ) = N (PN ).

(⊇-direction.) Let w ∈ N (PN ). Then

(q0, w, Z0) ⊢*N (q, ε, ε),

for some q. From Theorem 6.5 we get

(q0, w, Z0X0) ⊢*N (q, ε, X0).

Since δN ⊂ δF we have

(q0, w, Z0X0) ⊢*F (q, ε, X0).

We conclude that

(p0, w, X0) ⊢F (q0, w, Z0X0) ⊢*F (q, ε, X0) ⊢F (pf , ε, ε).

(⊆-direction.) By inspecting the diagram.

Let’s design PN for catching errors in strings meant to be in the if-else-grammar G

S → ε | SS | iS | iSe.

Here e.g. {ieie, iie, iiee} ⊆ L(G), and e.g. {ei, ieeii} ∩ L(G) = ∅.

The diagram for PN is

[A single state q with two loops: i, Z/ZZ and e, Z/ε.]

Formally,

PN = ({q}, {i, e}, {Z}, δN , q, Z),

where δN (q, i, Z) = {(q, ZZ)}, and δN (q, e, Z) = {(q, ε)}.

199 200
From PN we can construct

PF = ({p, q, r}, {i, e}, {Z, X0}, δF , p, X0, {r}),

where

δF (p, ε, X0) = {(q, ZX0)},
δF (q, i, Z) = δN (q, i, Z) = {(q, ZZ)},
δF (q, e, Z) = δN (q, e, Z) = {(q, ε)}, and
δF (q, ε, X0) = {(r, ε)}

The diagram for PF is

[Start state p with ε, X0/ZX0 into q; loops i, Z/ZZ and e, Z/ε at q; ε, X0/ε from q to the final state r.]

From Final State to Empty Stack

Theorem 6.11: Let L = L(PF ), for some PDA PF = (Q, Σ, Γ, δF , q0, Z0, F ). Then ∃ PDA PN , such that L = N (PN ).

Proof: Let

PN = (Q ∪ {p0, p}, Σ, Γ ∪ {X0}, δN , p0, X0)

where δN (p0, ε, X0) = {(q0, Z0X0)}, δN (p, ε, Y ) = {(p, ε)}, for Y ∈ Γ ∪ {X0}, and for all q ∈ Q, a ∈ Σ ∪ {ε}, Y ∈ Γ : δN (q, a, Y ) = δF (q, a, Y ), and in addition ∀q ∈ F , and Y ∈ Γ ∪ {X0} : (p, ε) ∈ δN (q, ε, Y ).

[Diagram: a new start state p0 with ε, X0/Z0X0 into the start state q0 of PF ; from every final state of PF an ε, any/ε transition leads to the new state p, which empties the rest of the stack with ε, any/ε loops.]

201 202

We have to show that N (PN ) = L(PF ).

(⊆-direction.) By inspecting the diagram.

(⊇-direction.) Let w ∈ L(PF ). Then

(q0, w, Z0) ⊢*F (q, ε, α),

for some q ∈ F, α ∈ Γ∗. Since δF ⊂ δN , and Theorem 6.5 says that X0 can be slid under the stack, we get

(q0, w, Z0X0) ⊢*N (q, ε, αX0).

Then PN can compute:

(p0, w, X0) ⊢N (q0, w, Z0X0) ⊢*N (q, ε, αX0) ⊢*N (p, ε, ε).

Equivalence of PDA’s and CFG’s

A language is

generated by a CFG

if and only if it is

accepted by a PDA by empty stack

if and only if it is

accepted by a PDA by final state

[Diagram: Grammar ↔ PDA by empty stack ↔ PDA by final state.]

We already know how to go between null stack and final state.
203 204
From CFG’s to PDA’s

Given G, we construct a PDA that simulates ⇒*lm.

We write left-sentential forms as

xAα

where A is the leftmost variable in the form. For instance, in the form (a + E), we have x = (a+, A = E, and α = ) (the tail).

Let xAα ⇒lm xβα. This corresponds to the PDA first having consumed x and having Aα on the stack, and then on ε it pops A and pushes β.

More formally, let y, s.t. w = xy. Then the PDA goes non-deterministically from configuration (q, y, Aα) to configuration (q, y, βα).

At (q, y, βα) the PDA behaves as before, unless there are terminals in the prefix of β. In that case, the PDA pops them, provided it can consume matching input.

If all guesses are right, the PDA ends up with empty stack and input.

Formally, let G = (V, T, Q, S) be a CFG. Define PG as

({q}, T, V ∪ T, δ, q, S),

where

δ(q, ε, A) = {(q, β) : A → β ∈ Q},

for A ∈ V , and

δ(q, a, a) = {(q, ε)},

for a ∈ T .

Example: On blackboard in class.
205 206
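A minimal sketch of the PG construction in Python (our own representation, not the handout’s): a grammar is a dict mapping each variable to a list of bodies, each body a tuple of symbols; "" stands for ε and pushed strings are tuples.

    def cfg_to_pda(grammar, start, terminals):
        q = "q"                                  # the single state
        delta = {}
        # Expansion moves: on eps with variable A on top, replace A by a body.
        for A, bodies in grammar.items():
            delta[(q, "", A)] = {(q, tuple(body)) for body in bodies}
        # Matching moves: pop a terminal against the same input symbol.
        for a in terminals:
            delta[(q, a, a)] = {(q, ())}
        return {"Q": {q}, "Sigma": set(terminals),
                "Gamma": set(grammar) | set(terminals),
                "delta": delta, "q0": q, "Z0": start, "F": set()}

For the expression grammar one would call, e.g., cfg_to_pda({"E": [("I",), ("E","+","E"), ("E","*","E"), ("(","E",")")], "I": [("a",), ("b",)]}, "E", {"a","b","+","*","(",")"}) and accept by empty stack.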

Theorem 6.13: N (PG) = L(G).

Proof:

(⊇-direction.) Let w ∈ L(G). Then

S = γ1 ⇒lm γ2 ⇒lm · · · ⇒lm γn = w

Let γi = xiαi. We show by induction on i that if

S ⇒*lm γi,

then

(q, w, S) ⊢* (q, yi, αi),

where w = xiyi.

Basis: For i = 1, γ1 = S. Thus x1 = ε, and y1 = w. Clearly (q, w, S) ⊢* (q, w, S).

Induction: IH is (q, w, S) ⊢* (q, yi, αi). We have to show that

(q, yi, αi) ⊢* (q, yi+1, αi+1)

Now αi begins with a variable A, and the derivation step has the form

xiAχ ⇒lm xi+1βχ

(that is, γi = xiAχ and γi+1 = xi+1βχ).

By IH Aχ is on the stack, and yi is unconsumed. From the construction of PG it follows that we can make the move

(q, yi, Aχ) ⊢ (q, yi, βχ).

If β has a prefix of terminals, we can pop them with matching terminals in a prefix of yi, ending up in configuration (q, yi+1, αi+1), where αi+1 is the tail of the sentential form γi+1 = xi+1αi+1.

Finally, since γn = w, we have αn = ε, and yn = ε, and thus (q, w, S) ⊢* (q, ε, ε), i.e. w ∈ N (PG)
207 208
(⊆-direction.) We shall show by an induction on the length of ⊢*, that

(♣) If (q, x, A) ⊢* (q, ε, ε), then A ⇒* x.

Basis: Length 1. Then it must be that A → ε is in G, and we have (q, ε) ∈ δ(q, ε, A). Thus A ⇒ ε.

Induction: Length is n > 1, and the IH holds for lengths < n.

Since A is a variable, we must have

(q, x, A) ⊢ (q, x, Y1Y2 · · · Yk) ⊢ · · · ⊢ (q, ε, ε)

where A → Y1Y2 · · · Yk is in G.

We can now write x as x1x2 · · · xk, according to the figure below, where Y1 = B, Y2 = a, and Y3 = C.

[Figure: the stack height over time while x is consumed; xi is the part of the input consumed while Yi is (net) popped, illustrated for the stack string BaC and x = x1x2x3.]

209 210

Now we can conclude that

(q, xixi+1 · · · xk, Yi) ⊢* (q, xi+1 · · · xk, ε)

takes less than n steps, for all i ∈ {1, . . . , k}. If Yi is a variable we have by the IH and Theorem 6.6 that

Yi ⇒* xi

If Yi is a terminal, we have |xi| = 1, and Yi = xi. Thus Yi ⇒* xi by the reflexivity of ⇒*.

The claim of the theorem now follows by choosing A = S, and x = w. Suppose w ∈ N (P ). Then (q, w, S) ⊢* (q, ε, ε), and by (♣), we have S ⇒* w, meaning w ∈ L(G).

From PDA’s to CFG’s

Let’s look at how a PDA can consume x = x1x2 . . . xk and empty the stack.

[Figure: the stack Y1Y2 · · · Yk is popped one symbol at a time; the PDA is in state p0 before popping Y1, in state pi after the net pop of Yi, and consumes xi while Yi is popped.]

We shall define a grammar with variables of the form [pi−1Yipi] representing going from pi−1 to pi with net effect of popping Yi.
211 212
Formally, let P = (Q, Σ, Γ, δ, q0, Z0) be a PDA. Define G = (V, Σ, R, S), where

V = {[pXq] : {p, q} ⊆ Q, X ∈ Γ} ∪ {S}

R = {S → [q0Z0p] : p ∈ Q} ∪
    {[qXrk] → a[rY1r1] · · · [rk−1Ykrk] :
        a ∈ Σ ∪ {ε},
        {r1, . . . , rk} ⊆ Q,
        (r, Y1Y2 · · · Yk) ∈ δ(q, a, X)}

Example: Let’s convert

[The one-state if-else PDA: loops i, Z/ZZ and e, Z/ε at state q.]

PN = ({q}, {i, e}, {Z}, δN , q, Z),

where δN (q, i, Z) = {(q, ZZ)}, and δN (q, e, Z) = {(q, ε)}, to a grammar

G = (V, {i, e}, R, S),

where V = {[qZq], S}, and

R = {S → [qZq], [qZq] → i[qZq][qZq], [qZq] → e}.

If we replace [qZq] by A we get the productions S → A and A → iAA | e.

213 214
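The triple construction can also be written down directly. A minimal sketch in Python (representation is our own: variables are triples (p, X, q), pushed strings are tuples, "" is ε):

    from itertools import product

    def pda_to_cfg(P):
        prods = []
        for p in P["Q"]:                              # S -> [q0 Z0 p]
            prods.append(("S", [(P["q0"], P["Z0"], p)]))
        for (q, a, X), moves in P["delta"].items():
            for (r, Y) in moves:                      # Y is the tuple Y1..Yk
                k = len(Y)
                if k == 0:                            # pure pop: [q X r] -> a
                    prods.append(((q, X, r), [a] if a else []))
                    continue
                for rs in product(P["Q"], repeat=k):  # guess r1..rk
                    states = (r,) + rs
                    body = [a] if a else []
                    for i in range(k):
                        body.append((states[i], Y[i], states[i + 1]))
                    prods.append(((q, X, rs[-1]), body))
        return prods

Applied to PN above, with delta = {("q","i","Z"): {("q",("Z","Z"))}, ("q","e","Z"): {("q",())}}, this yields exactly S → [qZq], [qZq] → i[qZq][qZq] and [qZq] → e.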

Example: Let P = ({p, q}, {0, 1}, {X, Z0}, δ, q, Z0), where δ is given by

1. δ(q, 1, Z0) = {(q, XZ0)}

2. δ(q, 1, X) = {(q, XX)}

3. δ(q, 0, X) = {(p, X)}

4. δ(q, ε, X) = {(q, ε)}

5. δ(p, 1, X) = {(p, ε)}

6. δ(p, 0, Z0) = {(q, Z0)}

to a CFG.

We get G = (V, {0, 1}, R, S), where

V = {[qXq], [qXp], [pXq], [pXp], [qZ0q], [qZ0p], [pZ0q], [pZ0p], S}

and the productions in R are

S → [qZ0q] | [qZ0p]

From rule (1):

[qZ0q] → 1[qXq][qZ0q]
[qZ0q] → 1[qXp][pZ0q]
[qZ0p] → 1[qXq][qZ0p]
[qZ0p] → 1[qXp][pZ0p]

From rule (2):

[qXq] → 1[qXq][qXq]
[qXq] → 1[qXp][pXq]
[qXp] → 1[qXq][qXp]
[qXp] → 1[qXp][pXp]
215 216
From rule (3):

[qXq] → 0[pXq]
[qXp] → 0[pXp]

From rule (4):

[qXq] → ε

From rule (5):

[pXp] → 1

From rule (6):

[pZ0q] → 0[qZ0q]
[pZ0p] → 0[qZ0p]

Theorem 6.14: Let G be constructed from a PDA P as above. Then L(G) = N (P )

Proof:

(⊇-direction.) We shall show by an induction on the length of the sequence ⊢* that

(♠) If (q, w, X) ⊢* (p, ε, ε) then [qXp] ⇒* w.

Basis: Length 1. Then w is an a or ε, and (p, ε) ∈ δ(q, w, X). By the construction of G we have [qXp] → w and thus [qXp] ⇒ w.

217 218

Induction: Length is n > 1, and ♠ holds for lengths < n. We must have

(q, w, X) ⊢ (r0, x, Y1Y2 · · · Yk) ⊢ · · · ⊢ (p, ε, ε),

where w = ax or w = x. It follows that (r0, Y1Y2 · · · Yk) ∈ δ(q, a, X). Then we have a production

[qXrk] → a[r0Y1r1] · · · [rk−1Ykrk],

for all {r1, . . . , rk} ⊆ Q.

We may now choose ri to be the state in the sequence ⊢* when Yi is popped. Let x = w1w2 · · · wk, where wi is consumed while Yi is popped. Then

(ri−1, wi, Yi) ⊢* (ri, ε, ε).

By the IH we get

[ri−1Yiri] ⇒* wi

We then get the following derivation sequence:

[qXrk] ⇒ a[r0Y1r1] · · · [rk−1Ykrk] ⇒*
aw1[r1Y2r2][r2Y3r3] · · · [rk−1Ykrk] ⇒*
aw1w2[r2Y3r3] · · · [rk−1Ykrk] ⇒*
· · ·
aw1w2 · · · wk = w

219 220
(⊆-direction.) We shall show by an induction on the length of the derivation ⇒* that

(♥) If [qXp] ⇒* w then (q, w, X) ⊢* (p, ε, ε)

Basis: One step. Then we have a production [qXp] → w. From the construction of G it follows that (p, ε) ∈ δ(q, a, X), where w = a. But then (q, w, X) ⊢ (p, ε, ε).

Induction: Length of ⇒* is n > 1, and ♥ holds for lengths < n. Then we must have

[qXrk] ⇒ a[r0Y1r1][r1Y2r2] · · · [rk−1Ykrk] ⇒* w

We can break w into aw1w2 · · · wk such that [ri−1Yiri] ⇒* wi. From the IH we get

(ri−1, wi, Yi) ⊢* (ri, ε, ε)

From Theorem 6.5 we get

(ri−1, wiwi+1 · · · wk, YiYi+1 · · · Yk) ⊢* (ri, wi+1 · · · wk, Yi+1 · · · Yk)

Since this holds for all i ∈ {1, . . . , k}, we get

(q, aw1w2 · · · wk, X) ⊢
(r0, w1w2 · · · wk, Y1Y2 · · · Yk) ⊢*
(r1, w2 · · · wk, Y2 · · · Yk) ⊢*
(r2, w3 · · · wk, Y3 · · · Yk) ⊢*
· · ·
(p, ε, ε).

221 222

Deterministic PDA’s

A PDA P = (Q, Σ, Γ, δ, q0, Z0, F ) is deterministic iff

1. δ(q, a, X) is always empty or a singleton.

2. If δ(q, a, X) is nonempty, then δ(q, ε, X) must be empty.

Example: Let us define

Lwcwr = {wcwR : w ∈ {0, 1}∗}

Then Lwcwr is recognized by the following DPDA

[The Lwwr PDA made deterministic: the same pushing loops at q0 and popping loops at q1, but the move from q0 to q1 now reads the marker c (c, Z0/Z0; c, 0/0; c, 1/1), and q1 goes to q2 on ε, Z0/Z0.]

We’ll show that Regular ⊂ L(DPDA) ⊂ CFL

Theorem 6.17: If L is regular, then L = L(P ) for some DPDA P .

Proof: Since L is regular there is a DFA A s.t. L = L(A). Let

A = (Q, Σ, δA, q0, F )

We define the DPDA

P = (Q, Σ, {Z0}, δP , q0, Z0, F ),

where

δP (q, a, Z0) = {(δA(q, a), Z0)},

for all q ∈ Q, and a ∈ Σ.

An easy induction (do it!) on |w| gives

(q0, w, Z0) ⊢* (p, ε, Z0) ⇔ δ̂A(q0, w) = p

The theorem then follows (why?)
223 224
• We have seen that Regular ⊆ L(DPDA).

• Lwcwr ∈ L(DPDA) \ Regular

• Are there languages in CFL \ L(DPDA)?
  Yes, for example Lwwr.

• What about DPDA’s and Ambiguous Grammars?
  Lwwr has the unambiguous grammar S → 0S0 | 1S1 | ε but is not in L(DPDA).

For the converse we have

Theorem 6.20: If L = N (P ) for some DPDA P , then L has an unambiguous CFG.

Proof: By inspecting the proof of Theorem 6.14 we see that if the construction is applied to a DPDA the result is a CFG with unique leftmost derivations.

What about DPDA’s that accept by null stack? They can recognize only CFL’s with the prefix property.

A language L has the prefix property if there are no two distinct strings in L, such that one is a prefix of the other.

Example: Lwcwr has the prefix property.

Example: {0}∗ does not have the prefix property.

Theorem 6.19: L is N (P ) for some DPDA P if and only if L has the prefix property and L is L(P ′) for some DPDA P ′.

Proof: Homework
225 226

Theorem 6.20 can actually be strengthened as follows:

Theorem 6.21: If L = L(P ) for some DPDA P , then L has an unambiguous CFG.

Proof: Let $ be a symbol outside the alphabet of L, and let L′ = L$.

It is easy to see that L′ has the prefix property. By Theorem 6.19 we have L′ = N (P ′) for some DPDA P ′.

By Theorem 6.20, N (P ′) can be generated by an unambiguous CFG G′.

Modify G′ into G, s.t. L(G) = L, by adding the production

$ → ε

Since G′ has unique leftmost derivations, G also has unique leftmost derivations, since the only new thing we’re doing is adding derivations

w$ ⇒lm w

to the end.

Properties of CFL’s

• Simplification of CFG’s. This makes life easier, since we can claim that if a language is CF, then it has a grammar of a special form.

• Pumping Lemma for CFL’s. Similar to the regular case.

• Closure properties. Some, but not all, of the closure properties of regular languages carry over to CFL’s.

• Decision properties. We can test for membership and emptiness, but for instance, equivalence of CFL’s is undecidable.
227 228
Chomsky Normal Form

We want to show that every CFL (without ε) is generated by a CFG where all productions are of the form

A → BC, or A → a

where A, B, and C are variables, and a is a terminal. This is called CNF, and to get there we have to

1. Eliminate useless symbols, those that do not appear in any derivation S ⇒* w, for start symbol S and terminal string w.

2. Eliminate ε-productions, that is, productions of the form A → ε.

3. Eliminate unit productions, that is, productions of the form A → B, where A and B are variables.

Eliminating Useless Symbols

• A symbol X is useful for a grammar G = (V, T, P, S), if there is a derivation

S ⇒*G αXβ ⇒*G w

for a terminal string w. Symbols that are not useful are called useless.

• A symbol X is generating if X ⇒*G w, for some w ∈ T ∗

• A symbol X is reachable if S ⇒*G αXβ, for some {α, β} ⊆ (V ∪ T )∗

It turns out that if we eliminate non-generating symbols first, and then non-reachable ones, we will be left with only useful symbols.

229 230

Example: Let G be

S → AB | a, A → b

S and A are generating, B is not. If we eliminate B we have to eliminate S → AB, leaving the grammar

S → a, A → b

Now only S is reachable. Eliminating A and b leaves us with

S → a

with language {a}.

On the other hand, if we eliminate non-reachable symbols first, we find that all symbols are reachable. From

S → AB | a, A → b

we then eliminate B as non-generating, and are left with

S → a, A → b

that still contains useless symbols.

Theorem 7.2: Let G = (V, T, P, S) be a CFG such that L(G) ≠ ∅. Let G1 = (V1, T1, P1, S) be the grammar obtained by

1. Eliminating all nongenerating symbols and the productions they occur in. Let the new grammar be G2 = (V2, T2, P2, S).

2. Eliminating from G2 all nonreachable symbols and the productions they occur in.

Then G1 has no useless symbols, and L(G1) = L(G).
231 232
Proof: We first prove that G1 has no useless symbols:

Let X remain in V1 ∪ T1. Thus X ⇒* w in G1, for some w ∈ T ∗. Moreover, every symbol used in this derivation is also generating. Thus X ⇒* w in G2 also.

Since X was not eliminated in step 2, there are α and β, such that S ⇒* αXβ in G2. Furthermore, every symbol used in this derivation is also reachable, so S ⇒* αXβ in G1.

Now every symbol in αXβ is reachable and in V2 ∪ T2 ⊇ V1 ∪ T1, so each of them is generating in G2.

The terminal derivation αXβ ⇒* xwy in G2 involves only symbols that are reachable from S, because they are reached by symbols in αXβ. Thus the terminal derivation is also a derivation of G1, i.e.,

S ⇒* αXβ ⇒* xwy

in G1.

We then show that L(G1) = L(G).

Since P1 ⊆ P , we have L(G1) ⊆ L(G).

Then, let w ∈ L(G). Thus S ⇒*G w. Each symbol in this derivation is evidently both reachable and generating, so this is also a derivation of G1.

Thus w ∈ L(G1).
233 234

We have to give algorithms to compute the generating and reachable symbols of G = (V, T, P, S).

The generating symbols g(G) are computed by the following closure algorithm:

Basis: g(G) == T

Induction: If every symbol of α is in g(G) and X → α ∈ P , then g(G) == g(G) ∪ {X}.

Example: Let G be S → AB | a, A → b

Then first g(G) == {a, b}.

Since S → a we put S in g(G), and because A → b we add A also, and that’s it.

Theorem 7.4: At saturation, g(G) contains all and only the generating symbols of G.

Proof:

We’ll show in class, by an induction on the stage at which a symbol X is added to g(G), that X is indeed generating.

Then, suppose that X is generating. Thus X ⇒*G w, for some w ∈ T ∗. We prove by induction on this derivation that X ∈ g(G).

Basis: Zero steps. Then X is a terminal, and it is added in the basis of the closure algorithm.

Induction: The derivation takes n > 0 steps. Let the first production used be X → α. Then

X ⇒ α ⇒* w

and α ⇒* w in less than n steps, so every symbol of α is in g(G) by the IH. From the inductive part of the algorithm it follows that X ∈ g(G).
235 236
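The closure algorithm for g(G) is a few lines of code. A minimal sketch in Python (our own representation: productions as (head, body-tuple) pairs):

    def generating_symbols(productions, terminals):
        g = set(terminals)                   # basis: terminals generate themselves
        changed = True
        while changed:                       # iterate to saturation
            changed = False
            for head, body in productions:
                if head not in g and all(s in g for s in body):
                    g.add(head)              # inductive step: X -> alpha, alpha in g*
                    changed = True
        return g

For S → AB | a, A → b:
generating_symbols([("S", ("A","B")), ("S", ("a",)), ("A", ("b",))], {"a","b"}) returns {"a", "b", "S", "A"}, matching the example above.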
The set of reachable symbols r(G) of G = (V, T, P, S) is computed by the following closure algorithm:

Basis: r(G) == {S}.

Induction: If variable A ∈ r(G) and A → α ∈ P then add all symbols in α to r(G)

Example: Let G be S → AB | a, A → b

Then first r(G) == {S}.

Based on the first production we add {A, B, a} to r(G).

Based on the second production we add {b} to r(G), and that’s it.

Theorem 7.6: At saturation, r(G) contains all and only the reachable symbols of G.

Proof: Homework.

Eliminating ε-Productions

We shall prove that if L is CF, then L \ {ε} has a grammar without ε-productions.

Variable A is said to be nullable if A ⇒* ε.

Let A be nullable. We’ll then replace a rule like

A → BAD

with

A → BAD, A → BD

and delete any rules with body ε.

We’ll compute n(G), the set of nullable symbols of a grammar G = (V, T, P, S), as follows:

Basis: n(G) == {A : A → ε ∈ P }

Induction: If {C1, C2, . . . , Ck} ⊆ n(G) and A → C1C2 · · · Ck ∈ P , then n(G) == n(G) ∪ {A}.
237 238

Theorem 7.7: At saturation, n(G) contains all and only the nullable symbols of G.

Proof: Easy induction in both directions.

Once we know the nullable symbols, we can transform G into G1 as follows:

• For each A → X1X2 · · · Xk ∈ P with m ≤ k nullable symbols, replace it by 2^m rules, one with each sublist of the nullable symbols absent.

Exception: If m = k we don’t delete all m nullable symbols.

• Delete all rules of the form A → ε.

Example: Let G be

S → AB, A → aAA | ε, B → bBB | ε

Now n(G) = {A, B, S}. The first rule will become

S → AB | A | B

the second

A → aAA | aA | aA | a

the third

B → bBB | bB | bB | b

We then delete rules with ε-bodies, and end up with grammar G1 :

S → AB | A | B, A → aAA | aA | a, B → bBB | bB | b

239 240
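Both steps (computing n(G), then generating the 2^m variants) are easy to code. A minimal sketch in Python, with the grammar as a dict from variable to a set of body tuples (() is the ε-body); the representation is our own:

    from itertools import product

    def nullable(grammar):
        n = {A for A, bodies in grammar.items() if () in bodies}
        changed = True
        while changed:
            changed = False
            for A, bodies in grammar.items():
                if A not in n and any(body and all(s in n for s in body)
                                      for body in bodies):
                    n.add(A)
                    changed = True
        return n

    def eliminate_eps(grammar):
        n = nullable(grammar)
        new = {A: set() for A in grammar}
        for A, bodies in grammar.items():
            for body in bodies:
                # Each nullable symbol may be kept or dropped: 2^m variants.
                choices = [((s,), ()) if s in n else ((s,),) for s in body]
                for pick in product(*choices):
                    variant = tuple(x for part in pick for x in part)
                    if variant:                      # delete eps-bodies
                        new[A].add(variant)
        return new

Running eliminate_eps on {"S": {("A","B")}, "A": {("a","A","A"), ()}, "B": {("b","B","B"), ()}} yields the G1 of the example (the duplicate aA collapses because bodies are kept in a set).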
Theorem 7.9: L(G1) = L(G) \ {ε}.

Proof: We’ll prove the stronger statement:

(]) A ⇒* w in G1 if and only if w ≠ ε and A ⇒* w in G.

⊆-direction: Suppose A ⇒* w in G1. Then clearly w ≠ ε (Why?). We’ll show by an induction on the length of the derivation that A ⇒* w in G also.

Basis: One step. Then there exists A → w in G1. From the construction of G1 it follows that there exists A → α in G, where α is w plus some nullable variables interspersed. Then

A ⇒ α ⇒* w

in G.

Induction: Derivation takes n > 1 steps. Then

A ⇒ X1X2 · · · Xk ⇒* w in G1

and the first derivation step is based on a production

A → Y1Y2 · · · Ym

where m ≥ k, some Yi’s are Xj ’s and the others are nullable symbols of G.

Furthermore, w = w1w2 · · · wk, and Xi ⇒* wi in G1 in less than n steps. By the IH we have Xi ⇒* wi in G. Now we get

A ⇒G Y1Y2 · · · Ym ⇒*G X1X2 · · · Xk ⇒*G w1w2 · · · wk = w
241 242


⊇-direction: Let A ⇒*G w, and w ≠ ε. We’ll show by induction on the length of the derivation that A ⇒* w in G1.

Basis: Length is one. Then A → w is in G, and since w ≠ ε the rule is in G1 also.

Induction: Derivation takes n > 1 steps. Then it looks like

A ⇒G Y1Y2 · · · Ym ⇒*G w

Now w = w1w2 · · · wm, and Yi ⇒*G wi in less than n steps.

Let X1X2 · · · Xk be those Yj ’s in order, such that wj ≠ ε. Then A → X1X2 · · · Xk is a rule in G1.

Now X1X2 · · · Xk ⇒*G w (Why?)

Each Yj ⇒*G wj in less than n steps, so by the IH we have that if wj ≠ ε then Yj ⇒* wj in G1. Thus

A ⇒ X1X2 · · · Xk ⇒* w in G1

The claim of the theorem now follows from statement (]) by choosing A = S.

243 244
Eliminating Unit Productions

A → B

is a unit production, whenever A and B are variables.

Unit productions can be eliminated.

Let’s look at the grammar

I → a | b | Ia | Ib | I0 | I1
F → I | (E)
T → F | T ∗ F
E → T | E + T

It has unit productions E → T , T → F , and F → I.

We’ll expand rule E → T and get rules

E → F, E → T ∗ F

We then expand E → F and get

E → I | (E) | T ∗ F

Finally we expand E → I and get

E → a | b | Ia | Ib | I0 | I1 | (E) | T ∗ F

The expansion method works as long as there are no cycles in the rules, as e.g. in

A → B, B → C, C → A

The following method based on unit pairs will work for all grammars.

245 246


(A, B) is a unit pair if A ⇒* B using unit productions only.

Note: In A → BC, C → ε we have A ⇒* B, but not using unit productions only.

To compute u(G), the set of all unit pairs of G = (V, T, P, S), we use the following closure algorithm:

Basis: u(G) == {(A, A) : A ∈ V }

Induction: If (A, B) ∈ u(G) and B → C ∈ P , where C is a variable, then add (A, C) to u(G).

Theorem: At saturation, u(G) contains all and only the unit pairs of G.

Proof: Easy.

Given G = (V, T, P, S) we can construct G1 = (V, T, P1, S) that doesn’t have unit productions, and such that L(G1) = L(G), by setting

P1 = {A → α : α ∉ V, B → α ∈ P, (A, B) ∈ u(G)}

Example: From the grammar above we get

Pair      Productions
(E, E)    E → E + T
(E, T )   E → T ∗ F
(E, F )   E → (E)
(E, I)    E → a | b | Ia | Ib | I0 | I1
(T, T )   T → T ∗ F
(T, F )   T → (E)
(T, I)    T → a | b | Ia | Ib | I0 | I1
(F, F )   F → (E)
(F, I)    F → a | b | Ia | Ib | I0 | I1
(I, I)    I → a | b | Ia | Ib | I0 | I1

The resulting grammar is equivalent to the original one (proof omitted).
247 248
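A minimal sketch of the unit-pair method in Python (grammar as a dict from variable to a set of body tuples; the representation is our own):

    def unit_pairs(grammar):
        pairs = {(A, A) for A in grammar}            # basis
        changed = True
        while changed:
            changed = False
            for (A, B) in list(pairs):
                for body in grammar.get(B, ()):
                    # B -> C with C a variable extends (A, B) to (A, C).
                    if len(body) == 1 and body[0] in grammar \
                            and (A, body[0]) not in pairs:
                        pairs.add((A, body[0]))
                        changed = True
        return pairs

    def eliminate_units(grammar):
        new = {A: set() for A in grammar}
        for (A, B) in unit_pairs(grammar):
            for body in grammar[B]:
                if not (len(body) == 1 and body[0] in grammar):
                    new[A].add(body)                 # keep only non-unit bodies
        return new

This is exactly the table above: for every unit pair (A, B), A receives all non-unit bodies of B.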
Summary

To “clean up” a grammar we can

1. Eliminate ε-productions

2. Eliminate unit productions

3. Eliminate useless symbols

in this order.

Chomsky Normal Form, CNF

We shall show that every nonempty CFL without ε has a grammar G without useless symbols, and such that every production is of the form

• A → BC, where {A, B, C} ⊆ V , or

• A → a, where A ∈ V , and a ∈ T .

To achieve this, start with any grammar for the CFL, and

1. “Clean up” the grammar.

2. Arrange that all bodies of length 2 or more consist of only variables.

3. Break bodies of length 3 or more into a cascade of two-variable-bodied productions.

249 250

• For step 2, for every terminal a that appears in a body of length ≥ 2, create a new variable, say A, and replace a by A in all bodies. Then add a new rule A → a.

• For step 3, for each rule of the form

A → B1B2 · · · Bk,

k ≥ 3, introduce new variables C1, C2, . . . , Ck−2, and replace the rule with

A → B1C1
C1 → B2C2
· · ·
Ck−3 → Bk−2Ck−2
Ck−2 → Bk−1Bk

Illustration of the effect of step 3

[Figure: (a) the cascade tree A → B1C1, C1 → B2C2, . . . , Ck−2 → Bk−1Bk; (b) the original flat tree A → B1B2 . . . Bk.]

251 252
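Steps 2 and 3 can be sketched directly in Python (grammar as a dict from variable to a set of body tuples, units and ε-bodies assumed already eliminated; the fresh-name scheme V_a, C1, C2, ... is our own choice):

    def to_cnf(grammar):
        variables = set(grammar)
        new = {A: set() for A in grammar}
        term_var, counter = {}, [0]

        def var_for(a):                      # step 2 helper: adds V_a -> a
            if a not in term_var:
                term_var[a] = V = "V_" + a
                new[V] = {(a,)}
            return term_var[a]

        for A, bodies in grammar.items():
            for body in bodies:
                if len(body) >= 2:           # step 2: variables only in long bodies
                    body = tuple(s if s in variables else var_for(s)
                                 for s in body)
                head = A
                while len(body) > 2:         # step 3: cascade C1, C2, ...
                    counter[0] += 1
                    C = "C%d" % counter[0]
                    new.setdefault(head, set()).add((body[0], C))
                    new[C] = set()
                    head, body = C, body[1:]
                new.setdefault(head, set()).add(body)
        return new

On the expression grammar of the next slide this produces rules of the same shape as the hand conversion (names of the helper variables differ).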
Example of CNF conversion

Let’s start with the grammar (step 1 already done):

E → E + T | T ∗ F | (E) | a | b | Ia | Ib | I0 | I1
T → T ∗ F | (E) | a | b | Ia | Ib | I0 | I1
F → (E) | a | b | Ia | Ib | I0 | I1
I → a | b | Ia | Ib | I0 | I1

For step 2, we need the rules

A → a, B → b, Z → 0, O → 1
P → +, M → ∗, L → (, R → )

and by replacing we get the grammar

E → EP T | T M F | LER | a | b | IA | IB | IZ | IO
T → T M F | LER | a | b | IA | IB | IZ | IO
F → LER | a | b | IA | IB | IZ | IO
I → a | b | IA | IB | IZ | IO
A → a, B → b, Z → 0, O → 1
P → +, M → ∗, L → (, R → )

For step 3, we replace

E → EP T by E → EC1, C1 → P T

E → T M F, T → T M F by
E → T C2, T → T C2, C2 → M F

E → LER, T → LER, F → LER by
E → LC3, T → LC3, F → LC3, C3 → ER

The final CNF grammar is

E → EC1 | T C2 | LC3 | a | b | IA | IB | IZ | IO
T → T C2 | LC3 | a | b | IA | IB | IZ | IO
F → LC3 | a | b | IA | IB | IZ | IO
I → a | b | IA | IB | IZ | IO
C1 → P T, C2 → M F, C3 → ER
A → a, B → b, Z → 0, O → 1
P → +, M → ∗, L → (, R → )
253 254

The size of parse trees

Theorem: Suppose we have a parse tree according to a CFG G in CNF, and let w be the yield of the tree. If the longest path (no. of edges) in the tree is n, then |w| ≤ 2^(n−1).

Proof: Induction on n.

Basis: n = 1. Then the tree consists of a root and a leaf, and the production must be of the form S → a. Thus |w| = |a| = 1 = 2^0 = 2^(n−1).

Induction: Let the longest path be n. Then the root must use a production of the form S → AB. No path in the subtrees rooted at A and B can be longer than n − 1. Thus the IH applies, and S ⇒ AB ⇒* w = uv, where A ⇒* u and B ⇒* v. By the IH we have |u| ≤ 2^(n−2) and |v| ≤ 2^(n−2). Consequently |w| = |u| + |v| ≤ 2^(n−2) + 2^(n−2) = 2^(n−1).

The Pumping Lemma for CFL’s

Theorem: Let L be a CFL. Then there exists a constant n such that for any z ∈ L, if |z| ≥ n, then z can be written as uvwxy, where

1. |vwx| ≤ n.

2. vx ≠ ε

3. uv^i wx^i y ∈ L, for all i ≥ 0.

[Figure: a parse tree for z with a repeated variable Ai = Aj on a long path; the yield is partitioned as u v w x y, where w is the yield of the lower Aj and vwx the yield of the upper Ai.]
255 256
Proof:

Let G be a CFG in CNF, such that L(G) = L \ {ε}, and let m be the number of variables in G.

Choose n = 2^m. Let w be the yield of a parse tree where the longest path is at most m. By the previous theorem |w| ≤ 2^(m−1) = n/2.

Since |z| ≥ n the parse tree for z must have a path of length k ≥ m + 1.

[Figure: a longest path A0, A1, A2, . . . , Ak, ending in a leaf a.]

Since G has only m variables, at least one variable has to be repeated. Suppose Ai = Aj , where k − m ≤ i < j ≤ k (choose Ai as close to the bottom as possible).

[Figure: the same tree with the repetition Ai = Aj marked; the yield is partitioned as u v w x y.]
257 258

Then we can pump the tree in (a) as uv^0wx^0y (tree (b)) or uv^2wx^2y (tree (c)), and in general as uv^iwx^iy, i ≥ 0.

Since the longest path in the subtree rooted at Ai is at most m + 1, the previous theorem gives us |vwx| ≤ 2^m = n.

[Figures: (a) the original tree with the repeated variable A = Ai = Aj and yield u v w x y; (b) the tree with the upper A’s subtree replaced by the lower A’s, yield u w y; (c) the tree with the lower A replaced by a copy of the upper A’s subtree, yield u v v w x x y.]

Closure Properties of CFL’s

Consider a mapping

s : Σ → 2^(∆∗)

where Σ and ∆ are finite alphabets. Let w ∈ Σ∗, where w = a1a2 . . . an, and define

s(w) = s(a1).s(a2). · · · .s(an)

and, for L ⊆ Σ∗,

s(L) = ∪w∈L s(w).

Such a mapping s is called a substitution.

259 267
Example: Σ = {0, 1}, ∆ = {a, b},

s(0) = {a^n b^n : n ≥ 1}, s(1) = {aa, bb}.

Let w = 01. Then s(w) = s(0).s(1) =

{a^n b^n aa : n ≥ 1} ∪ {a^n b^(n+2) : n ≥ 1}.

Let L = {0}∗. Then s(L) = (s(0))∗ =

{a^(n1) b^(n1) a^(n2) b^(n2) · · · a^(nk) b^(nk) : k ≥ 0, ni ≥ 1}.

Theorem 7.23: Let L be a CFL over Σ, and s a substitution, such that s(a) is a CFL, ∀a ∈ Σ. Then s(L) is a CFL.

Proof: We start with grammars

G = (V, Σ, P, S)

for L, and

Ga = (Va, Ta, Pa, Sa)

for each s(a). We then construct

G′ = (V ′, T ′, P ′, S)

where

V ′ = (∪a∈Σ Va) ∪ V

T ′ = ∪a∈Σ Ta

P ′ = ∪a∈Σ Pa plus the productions of P with each a in a body replaced with symbol Sa.

268 269

Now we have to show that

• L(G′) = s(L).

Let w ∈ s(L). Then ∃x = a1a2 . . . an in L, and ∃xi ∈ s(ai), such that w = x1x2 . . . xn.

A derivation tree in G′ will look like

[Figure: S at the root deriving Sa1 Sa2 · · · San , with each Sai deriving xi.]

Thus we can generate Sa1 Sa2 . . . San in G′ and from there we generate x1x2 . . . xn = w. Thus w ∈ L(G′).

Then let w ∈ L(G′). Then the parse tree for w must again look like the figure above.

Now delete the dangling subtrees. Then we have the yield

Sa1 Sa2 . . . San

where a1a2 . . . an ∈ L(G). Now w is also equal to s(a1a2 . . . an), which is in s(L).
270 271
Applications of the Substitution Theorem

Theorem 7.24: The CFL’s are closed under (i) : union, (ii) : concatenation, (iii) : Kleene closure ∗ and positive closure +, and (iv) : homomorphism.

Proof: (i) : Let L1 and L2 be CFL’s, let L = {1, 2}, and s(1) = L1, s(2) = L2. Then L1 ∪ L2 = s(L).

(ii) : Here we choose L = {12} and s as before. Then L1.L2 = s(L).

(iii) : Suppose L1 is CF. Let L = {1}∗, s(1) = L1. Now L1∗ = s(L). Similar proof for +.

(iv) : Let L1 be a CFL over Σ, and h a homomorphism on Σ. Define s by

a ↦ {h(a)}

Then h(L1) = s(L1).

Theorem: If L is CF, then so is L^R.

Proof: Suppose L is generated by G = (V, T, P, S). Construct G^R = (V, T, P^R, S), where

P^R = {A → α^R : A → α ∈ P }

Show at home, by inductions on the lengths of the derivations in G (for one direction) and in G^R (for the other direction), that (L(G))^R = L(G^R).
272 273

CFL’s are not closed under ∩

Let L1 = {0^n 1^n 2^i : n ≥ 1, i ≥ 1}. Then L1 is CF with grammar

S → AB
A → 0A1 | 01
B → 2B | 2

Also, L2 = {0^i 1^n 2^n : n ≥ 1, i ≥ 1} is CF with grammar

S → AB
A → 0A | 0
B → 1B2 | 12

However, L1 ∩ L2 = {0^n 1^n 2^n : n ≥ 1}, which is not CF, as we have proved using the pumping lemma for CFL’s.

CF ∩ Regular = CF

Theorem 7.27: If L is CF, and R regular, then L ∩ R is CF.

Proof: Let L be accepted by PDA

P = (QP , Σ, Γ, δP , qP , Z0, FP )

by final state, and let R be accepted by DFA

A = (QA, Σ, δA, qA, FA)

We’ll construct a PDA for L ∩ R according to the picture

[Figure: the input drives the DFA state and the PDA state in parallel; the stack is the PDA’s; the machine accepts when both components accept.]

274 275
Formally, define

P ′ = (QP × QA, Σ, Γ, δ, (qP , qA), Z0, FP × FA)

where

δ((q, p), a, X) = {((r, δ̂A(p, a)), γ) : (r, γ) ∈ δP (q, a, X)}

Prove at home, by an induction on ⊢, both for P and for P ′, that

(qP , w, Z0) ⊢*P (q, ε, γ) and δ̂A(qA, w) ∈ FA

if and only if

((qP , qA), w, Z0) ⊢* ((q, δ̂A(qA, w)), ε, γ)

The claim then follows (Why?)

Theorem 7.29: Let L, L1, L2 be CFL’s and R regular. Then

1. L \ R is CF

2. L̄ is not necessarily CF

3. L1 \ L2 is not necessarily CF

Proof:

1. R̄ is regular, L ∩ R̄ is CF, and L ∩ R̄ = L \ R.

2. If L̄ always were CF, it would follow that

L1 ∩ L2 = the complement of (L̄1 ∪ L̄2)

always would be CF.

3. Note that Σ∗ is CF, so if L1 \ L2 were always CF, then so would be Σ∗ \ L2 = L̄2.

276 277

Inverse homomorphism

Let h : Σ → Θ∗ be a homomorphism. Let L ⊆ Θ∗, and define

h^(−1)(L) = {w ∈ Σ∗ : h(w) ∈ L}

Now we have

Theorem 7.30: Let L be a CFL, and h a homomorphism. Then h^(−1)(L) is a CFL.

Proof: The plan of the proof is

[Figure: the new PDA reads an input symbol a, writes h(a) into a buffer, and lets the PDA for L consume the buffer symbol by symbol; the state and stack are shared.]

Let L be accepted by PDA

P = (Q, Θ, Γ, δ, q0, Z0, F )

We construct a new PDA

P ′ = (Q′, Σ, Γ, δ′, (q0, ε), Z0, F × {ε})

where

• Q′ = {(q, x) : q ∈ Q, x ∈ suffix(h(a)), a ∈ Σ}

• δ′((q, ε), a, X) = {((q, h(a)), X) : ε ≠ a ∈ Σ, q ∈ Q, X ∈ Γ}

• δ′((q, bx), ε, X) = {((p, x), γ) : (p, γ) ∈ δ(q, b, X), b ∈ Θ ∪ {ε}, q ∈ Q, X ∈ Γ}

Show at home by suitable inductions that

(q0, h(w), Z0) ⊢* (p, ε, γ) in P if and only if ((q0, ε), w, Z0) ⊢* ((p, ε), ε, γ) in P ′.
278 279
Decision Properties of CFL’s

We’ll look at the following:

• Complexity of converting among CFG’s and PDA’s

• Converting a CFG to CNF

• Testing L(G) ≠ ∅, for a given G

• Testing w ∈ L(G), for a given w and fixed G.

• Preview of undecidable CFL problems

Converting between CFG’s and PDA’s

• Input size is n.

• n is the total size of the input CFG or PDA.

The following work in time O(n):

1. Converting a CFG to a PDA (slide 203)

2. Converting a “final state” PDA to a “null stack” PDA (slide 199)

3. Converting a “null stack” PDA to a “final state” PDA (slide 195)

280 281

Avoidable exponential blow-up

For converting a PDA to a CFG we have (slide 210):

At most n^3 variables of the form [pXq].

If (r, Y1Y2 · · · Yk) ∈ δ(q, a, X), we’ll have O(n^n) rules of the form

[qXrk] → a[rY1r1] · · · [rk−1Ykrk]

• By introducing k − 2 new states we can modify the PDA to push at most one symbol per transition. Illustration on blackboard in class.

• Now, k will be ≤ 2 for all rules.

• Total length of all transitions is still O(n).

• Now, each transition generates at most n^2 productions.

• Total size (and time to calculate) of the grammar is therefore O(n^3).

282 283
Converting into CNF

Good news:

1. Computing r(G) and g(G) and eliminating useless symbols takes time O(n). This will be shown shortly (slides 229, 232, 234).

2. Size of u(G) and the resulting grammar with productions P1 is O(n^2) (slides 244, 245).

3. Arranging that bodies consist of only variables is O(n) (slide 248).

4. Breaking of bodies is O(n) (slide 248).

Bad news:

• Eliminating the nullable symbols can make the new grammar have size O(2^n) (slide 236).

The bad news is avoidable:

Break bodies first, before eliminating nullable symbols.

• Conversion into CNF is then O(n^2).

284 285

Testing emptiness of CFL’s

L(G) is non-empty if the start symbol S is generating.

A naive implementation of g(G) takes time O(n^2).

g(G) can be computed in time O(n) as follows:

[Figure: an array indexed by variables recording “Generating?” (e.g. A: ?, B: yes); for each production, a count of the body positions not yet known to be generating; and for each symbol, a linked list of the production positions where it occurs.]

Creation and initialization of the array is O(n).

Creation and initialization of the links and counts is O(n).

When a count goes to zero, we have to

1. Find the head variable A, check if it already is “yes” in the array, and if not, queue it. This is O(1) per production, total O(n).

2. Follow the links for A, and decrease the counters. Takes time O(n).

Total time is O(n).

286 287
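A minimal sketch of this count-based linear-time test in Python (the representation, productions as (head, body-tuple) pairs, is our own):

    from collections import deque

    def is_empty(productions, terminals, start):
        count = []            # per production: body positions not yet generating
        occurs = {}           # variable -> indices of productions using it
        queue = deque()
        for i, (head, body) in enumerate(productions):
            c = 0
            for s in body:
                if s not in terminals:       # terminals generate from the start
                    c += 1
                    occurs.setdefault(s, []).append(i)
            count.append(c)
            if c == 0:
                queue.append(head)
        generating = set()
        while queue:
            A = queue.popleft()
            if A in generating:
                continue
            generating.add(A)
            for i in occurs.get(A, []):      # each link is followed once
                count[i] -= 1
                if count[i] == 0:
                    queue.append(productions[i][0])
        return start not in generating

is_empty([("S", ("A","B")), ("S", ("a",)), ("A", ("b",))], {"a","b"}, "S") returns False, since S is generating via S → a. Every production position is touched a constant number of times, giving O(n).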
CYK-algo for membership testing

w ∈ L(G)?

Inefficient way:

Suppose G is CNF, the test string is w, with |w| = n. Since the parse tree is binary, there are 2n − 1 internal nodes.

Generate all binary parse trees of G with 2n − 1 internal nodes.

Check if any parse tree generates w.

The CYK way: The grammar G is fixed. Input is w = a1a2 · · · an.

We construct a triangular table, where Xij contains all variables A, such that

A ⇒*G ai ai+1 · · · aj

                X15
            X14     X25
        X13     X24     X35
    X12     X23     X34     X45
X11     X22     X33     X44     X55
 a1      a2      a3      a4      a5

288 289

To fill the table we work row-by-row, upwards. The first row is computed in the basis, the subsequent ones in the induction.

Basis: Xii == {A : A → ai is in G}

Induction: We wish to compute Xij , which is in row j − i + 1.

A ∈ Xij if A ⇒* ai ai+1 · · · aj , i.e., if for some k < j and some production A → BC we have

B ⇒* ai ai+1 · · · ak, and C ⇒* ak+1 ak+2 · · · aj ,

that is, if B ∈ Xik and C ∈ Xk+1,j .

Example: G has productions

S → AB | BC
A → BA | a
B → CC | b
C → AB | a

The table for w = baaba:

                  {S,A,C}
              −       {S,A,C}
          −       {B}     {B}
    {S,A}     {B}     {S,C}   {S,A}
  {B}    {A,C}   {A,C}    {B}    {A,C}
   b       a       a       b       a

290 291
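A minimal sketch of CYK in Python (CNF grammar given as two lists, unit rules A → a and binary rules A → BC; the representation is our own):

    def cyk(word, unit_rules, binary_rules, start):
        n = len(word)
        # X[i][j] holds the variables deriving word[i..j] (0-based, inclusive).
        X = [[set() for _ in range(n)] for _ in range(n)]
        for i, a in enumerate(word):                     # basis: row 1
            X[i][i] = {A for (A, b) in unit_rules if b == a}
        for length in range(2, n + 1):                   # induction: longer spans
            for i in range(n - length + 1):
                j = i + length - 1
                for k in range(i, j):                    # split point
                    for (A, B, C) in binary_rules:
                        if B in X[i][k] and C in X[k + 1][j]:
                            X[i][j].add(A)
        return start in X[0][n - 1]

With the grammar above,

    units  = [("A","a"), ("B","b"), ("C","a")]
    binary = [("S","A","B"), ("S","B","C"), ("A","B","A"),
              ("B","C","C"), ("C","A","B")]
    cyk("baaba", units, binary, "S")

returns True, matching the table (S ∈ X15).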
To compute Xij we need to compare at most n pairs of previously computed sets:

(Xii, Xi+1,j ), (Xi,i+1, Xi+2,j ), . . . , (Xi,j−1, Xjj )

as suggested below

[Figure: the entry Xij is computed by pairing the entries going up the column below it with the entries going down the diagonal to its right.]

For w = a1 · · · an, there are O(n^2) entries Xij to compute.

For each Xij we need to compare at most n pairs (Xik, Xk+1,j ).

Total work is O(n^3).

Preview of undecidable CFL problems

The following are undecidable:

1. Is a given CFG G ambiguous?

2. Is a given CFL inherently ambiguous?

3. Is the intersection of two CFL’s empty?

4. Are two CFL’s the same?

5. Is a given CFL universal (equal to Σ∗)?


292 293

Problems that computers cannot solve

Evidently, it is important to know that pro-


grams do what they are supposed to, IOW,
we would like to make sure that programs are
correct.

∞ It is easy to see that the program

main()
{
printf(‘‘hello, world\n’’);
}

indeed prints hello, world.

294 295
What about the program in Fig. 8.2 in the textbook?

It will print hello, world, for input n if and only if the equation

x^n + y^n = z^n

has a solution where x, y, and z are positive integers.

We now know that it will print hello, world, for input n = 2, and loop forever for inputs n > 2.

It took humanity 300+ years to prove this.

Can we hope to have a program that proves the correctness of programs?

The hypothetical “Hello, world” tester H

Suppose the following program H exists.

[Figure: H takes a program P and an input I, and answers yes if P prints hello, world on input I, and no otherwise.]

Modify the no print statement of H to hello, world. We get program H1.

[Figure: H1 takes P and I, and either answers yes or prints hello, world instead of no.]

296 297

Modify H1 so that it takes only P as input; it stores P and uses it both as P and as I. We get program H2.

[Figure: H2 takes P alone, and answers yes or prints hello, world.]

Give H2 as input to H2.

[Figure: H2 applied to itself.]

• If H2 prints yes it should have printed hello, world.

• If H2 prints hello, world it should have printed yes.

• Thus H2 cannot exist.

• Consequently H cannot exist either.

The Turing Machine (1936)

[Figure: a finite control with a head over an infinite tape · · · B B X1 X2 · · · Xi · · · Xn B B · · ·]

A TM makes a move depending on its state, and the symbol under the tape head.

In a move, a TM will

1. Change state

2. Write a tape symbol in the cell scanned

3. Move the tape head one cell left or right
298 299
A (deterministic) Turing Machine is a 7-tuple

M = (Q, Σ, Γ, δ, q0, B, F ),

where

• Q is a finite set of states,

• Σ is a finite set of input symbols,

• Γ is a finite set of tape symbols, Γ ⊃ Σ,

• δ is a transition function from Q × Γ to Q × Γ × {L, R},

• q0 is the start state,

• B ∈ Γ \ Σ is the blank symbol, and

• F ⊆ Q is the set of final states.

Instantaneous Description

A TM changes configuration after each move. We use Instantaneous Descriptions, ID’s, to describe configurations.

An ID is a string of the form

X1X2 · · · Xi−1qXiXi+1 · · · Xn

where

1. q is the state of the TM.

2. X1X2 · · · Xn is the non-blank portion of the tape.

3. The tape head is scanning the ith symbol.

300 301

The moves and the language of a TM

We’ll use ⊢M to indicate a move by M from one configuration to another.

• Suppose δ(q, Xi) = (p, Y, L). Then

X1X2 · · · Xi−1qXiXi+1 · · · Xn ⊢M X1X2 · · · Xi−2 pXi−1Y Xi+1 · · · Xn

• If δ(q, Xi) = (p, Y, R), we have

X1X2 · · · Xi−1qXiXi+1 · · · Xn ⊢M X1X2 · · · Xi−1Y pXi+1 · · · Xn

We denote the reflexive-transitive closure of ⊢M with ⊢*M.

• A TM M = (Q, Σ, Γ, δ, q0, B, F ) accepts the language

L(M ) = {w ∈ Σ∗ : q0w ⊢*M αpβ, p ∈ F, α, β ∈ Γ∗}

A TM for {0^n 1^n : n ≥ 1}

M = ({q0, q1, q2, q3, q4}, {0, 1}, {0, 1, X, Y, B}, δ, q0, B, {q4})

where δ is given by the following table

          0           1           X           Y           B
→ q0   (q1, X, R)      −           −       (q3, Y, R)      −
  q1   (q1, 0, R)  (q2, Y, L)      −       (q1, Y, R)      −
  q2   (q2, 0, L)      −       (q0, X, R)  (q2, Y, L)      −
  q3       −           −           −       (q3, Y, R)  (q4, B, R)
 ?q4

We can represent M by the following transition diagram

[Diagram: edges q0 →(0/X) q1, q1 →(1/Y ) q2, q2 →(X/X) q0, q0 →(Y/Y ) q3, q3 →(B/B) q4; loops 0/0 and Y/Y at q1 and q2, and Y/Y at q3.]

302 303
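The move relation ⊢M is easy to execute. A minimal sketch of a TM simulator in Python (our own representation: delta maps (state, symbol) to (state, symbol, "L" or "R"), the tape is a dict, and a step bound guards against machines that run forever):

    def run_tm(delta, q0, blank, finals, word, max_steps=10_000):
        tape = {i: c for i, c in enumerate(word)}
        q, head = q0, 0
        for _ in range(max_steps):
            if q in finals:
                return True
            key = (q, tape.get(head, blank))
            if key not in delta:
                return False                 # halts without accepting
            q, sym, d = delta[key]
            tape[head] = sym
            head += 1 if d == "R" else -1
        return False

    # The 0^n 1^n machine from the table above:
    delta = {("q0","0"): ("q1","X","R"), ("q0","Y"): ("q3","Y","R"),
             ("q1","0"): ("q1","0","R"), ("q1","1"): ("q2","Y","L"),
             ("q1","Y"): ("q1","Y","R"), ("q2","0"): ("q2","0","L"),
             ("q2","X"): ("q0","X","R"), ("q2","Y"): ("q2","Y","L"),
             ("q3","Y"): ("q3","Y","R"), ("q3","B"): ("q4","B","R")}

run_tm(delta, "q0", "B", {"q4"}, "0011") returns True; on "0010" it returns False.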
A TM with “output”

The following TM computes m −· n = max(m − n, 0):

          0           1           B
→ q0   (q1, B, R)  (q5, B, R)      −
  q1   (q1, 0, R)  (q2, 1, R)      −
  q2   (q3, 1, L)  (q2, 1, R)  (q4, B, L)
  q3   (q3, 0, L)  (q3, 1, L)  (q0, B, R)
  q4   (q4, 0, L)  (q4, B, L)  (q6, 0, R)
  q5   (q5, B, R)  (q5, B, R)  (q6, B, R)
 ?q6

The transition diagram is

[Diagram: edges q0 →(0/B) q1, q1 →(1/1) q2, q2 →(0/1) q3, q3 →(B/B) q0, q2 →(B/B) q4, q4 →(B/0) q6, q0 →(1/B) q5, q5 →(B/B) q6; loops 0/0 at q1; 1/1 at q2; 0/0, 1/1 at q3; 0/0, 1/B at q4; 0/B, 1/B at q5.]

Programming Techniques for TM’s

Although TM’s seem very simple, they are as powerful as any computer. Lots of “features” can be simulated with a “standard” machine.

• Storage in State

A TM M that “remembers” the first symbol:

M = (Q, {0, 1}, {0, 1, B}, δ, [q0, B], B, {[q1, B]})

where Q = {q0, q1} × {0, 1, B}.

δ              0                 1                 B
→ [q0, B]  ([q1, 0], 0, R)   ([q1, 1], 1, R)        −
  [q1, 0]        −           ([q1, 0], 1, R)   ([q1, B], B, R)
  [q1, 1]  ([q1, 1], 0, R)        −           ([q1, B], B, R)
 ?[q1, B]

L(M ) = L(01∗ + 10∗).

304 305

• Multiple Tracks

A TM for {wcw : w ∈ {0, 1}∗}:

[Figure: the finite control stores the state q together with a scratch symbol; the tape is divided into tracks, track 1 holding a marker X and track 2 the data.]

M = (Q, Σ, Γ, δ, [q1, B], [B, B], {[q9, B]})

where

Q = {q1, q2, . . . , q9} × {0, 1, B}

Σ = {[B, 0], [B, 1], [B, c]}

Γ = {B, ∗} × {0, 1, c, B}

• Subroutines

The TM computes 0^m 1 0^n 1 ↦ 0^(m·n):

[Diagram: an outer machine (states q0, q6, . . . , q12) that, for each 0 of the first group, calls a “Copy” subroutine; the Copy TM (states q1, . . . , q5) appends a copy of the second group of 0’s to the end of the tape, marking each copied 0 with an X and restoring the X’s to 0’s when done.]

306 307
Variations of the basic TM

• Multitape TM’s. Input on the first tape.

[Figure: one finite control with independent heads on several tapes.]

In one move the TM

1. Remains in the same state or enters a new state.

2. For each tape, writes a new symbol in the current cell.

3. Independently moves each head left or right.

Theorem: Every language accepted by a multitape TM M is RE.

Proof idea: Simulate M by a multitrack TM N .

1. 2k tracks for simulating k tapes. Even tracks = tape contents. Odd tracks = head positions.

2. N has to visit the k head markers to simulate a move of M . Store the number of heads visited in the state.

3. For each head, N does what M does on the corresponding tape.

4. N changes state according to M .

308 309

• Nondeterministic TM’s

Theorem: For every nondeterministic TM MN there is a deterministic TM MD such that L(MN ) = L(MD ).

Proof idea: Suppose that at each state MN has at most k choices for each symbol.

[Figure: MD keeps a queue of ID’s, ID1 ∗ ID2 ∗ ID3 ∗ ID4 ∗ · · ·, on one tape, and a scratch tape.]

1. The choices of MN on (q, a) are numbered 1, . . . , k, so MD can enumerate them.

2. For each ID in the queue, copy it to the scratch tape. For each of the k choices, create a new ID at the end of the queue.

3. Unmark the current ID and go to the next ID.

Undecidability

We want to prove undecidable Lu, which is the language of pairs (M, w) such that:

1. M is a TM (encoded in binary) with input alphabet {0, 1}.

2. w ∈ {0, 1}∗.

3. M accepts w.

310 311
The landscape of languages (problems)

[Figure: the classes of languages: Recursive; RE but not recursive (containing Lu); Not RE (containing Ld).]

1. The recursive languages L: There is a TM M that always halts and such that L(M ) = L. In other words, there is an algorithm that on input w answers “yes” if w ∈ L, and answers “no” if w ∉ L.

2. The recursively enumerable languages L: There is a TM M that halts and accepts if w ∈ L, and might run forever otherwise.

3. The non-RE languages L: There is no TM whatsoever for L.

Encoding TM’s

• Enumerate strings: w ∈ {0, 1}∗ is treated as the binary integer 1w. The ith string is denoted wi.

ε = w1, 0 = w2, 1 = w3, 00 = w4, 01 = w5, . . .

To encode M = (Q, {0, 1}, Γ, δ, q1, B, F ):

- Assume Q = {q1, q2, . . . , qr }, F = {q2}.

- Assume Γ = {X1, X2, . . . , Xs}, where X1 = 0, X2 = 1, and X3 = B.

- Encode L as D1 and R as D2.

- Encode δ(qi, Xj ) = (qk, Xl, Dm) as

0^i 1 0^j 1 0^k 1 0^l 1 0^m

- Encode the entire TM as

C1 11 C2 11 · · · 11 Cn−1 11 Cn

where each Ci is the code of one transition.
312 313
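The encoding is a one-liner per transition. A minimal sketch in Python (states, tape symbols and directions are passed as their indices; the representation is our own):

    def encode_transition(i, j, k, l, m):
        # delta(q_i, X_j) = (q_k, X_l, D_m)  |->  0^i 1 0^j 1 0^k 1 0^l 1 0^m
        return "1".join("0" * x for x in (i, j, k, l, m))

    def encode_tm(transitions):
        return "11".join(encode_transition(*t) for t in transitions)

For the machine of the next slide, the transitions are (1,2,3,1,2), (3,1,1,2,2), (3,2,2,1,2), (3,3,3,2,1), and encode_transition reproduces exactly the code strings C1, . . . , C4 shown there.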

Example:

M = ({q1, q2, q3}, {0, 1}, {0, 1, B}, δ, q1, B, {q2}),

where

          0           1           B
→ q1       −       (q3, 0, R)      −
 ?q2       −           −           −
  q3   (q1, 1, R)  (q2, 0, R)  (q3, 1, L)

The codes for the transitions C1, C2, C3, C4:

0100100010100
0001010100100
00010010010100
0001000100010010

The code for the entire M :

01001000101001100010101001001100010010010100110001000100010010

The diagonalization language Ld

The ith TM Mi: The TM whose code is wi.

Write down a matrix where

(i, j) = 1 if wj ∈ L(Mi), and 0 otherwise.

[Matrix: rows i index TM’s, columns j index strings; e.g. row 1: 0 1 1 0 . . . , row 2: 1 1 0 0 . . . , row 3: 0 0 1 1 . . . , row 4: 0 1 0 1 . . . ; Ld is defined by flipping the diagonal.]

Define Ld = {wi : wi ∉ L(Mi)}

If there is a TM for Ld it must be Mi for some i.

If wi ∈ Ld then wi ∈ L(Mi) and thus wi ∉ Ld.
If wi ∉ Ld then wi ∉ L(Mi) and thus wi ∈ Ld.
314 315
Theorem: If L is recursive, so is L̄.

Proof:

[Figure: run the always-halting TM M for L on w, and swap its accept and reject answers.]

Theorem: If L is RE and L̄ also is RE, then L is recursive.

Proof:

[Figure: run M1 for L and M2 for L̄ in parallel on w; accept if M1 accepts, reject if M2 accepts.]

The universal language Lu

Lu = {(enc(M ), enc(w)) : w ∈ L(M )}

TM U where L(U ) = Lu :

[Figure: a multitape TM U with the input enc(M ) enc(w) on tape 1, the tape of M on tape 2, the state of M on tape 3, and a scratch tape.]

316 317

Operation of U :

1. If enc(M ) is not legitimate, halt and reject.

2. Write enc(w) on tape 2. Use the blank of U for 1000.

3. Write the start state 0 on tape 3. Place the head of tape 2 on the first simulated cell.

4. Search tape 1 for 0^i 1 0^j 1 0^k 1 0^l 1 0^m, where

(a) 0^i is the state on tape 3

(b) 0^j is the tape symbol of M that begins under the head on tape 2

5. Make the move:

(a) Change tape 3 to 0^k

(b) Replace 0^j on tape 2 by 0^l

(c) Move head 2 left (if m = 1) or right (if m = 2) to the next 1

(d) If no matching 0^i 1 0^j 1 · · · is found on tape 1, then halt and reject

(e) If M enters its accepting state, then accept and halt
der the head on 2

318 319
Theorem: Lu is RE but not recursive.

Proof: Lu is RE since L(U ) = Lu.

Suppose Lu were recursive. Then L̄u is also recursive. Let M be an always halting TM with L(M ) = L̄u.

We modify M to M ′, such that L(M ′) = Ld :

[Figure: M ′ copies its input w to w111w and feeds it to the hypothetical algorithm M for L̄u; accept maps to accept, reject to reject.]

• wi ∈ L(M ′) ⇒ wi111wi ∈ L̄u ⇒ wi ∉ L(Mi) ⇒ wi ∈ Ld

• wi ∉ L(M ′) ⇒ wi111wi ∈ Lu ⇒ wi ∈ L(Mi) ⇒ wi ∉ Ld

But Ld has no TM, a contradiction.

Reductions for proving lower bounds

Find an algorithm that reduces a known hard problem P1 to P2.

[Figure: an instance of P1 is transformed into an instance of P2; yes maps to yes, no to no.]

Theorem: If there is a reduction from P1 to P2, then

1. If P1 is undecidable, then so is P2.

2. If P1 is non-RE, then so is P2.

Proof: by contradiction. If there were an algorithm for P2, you could also solve P1 by first reducing P1 to P2 and then running the algorithm for P2.
320 321

A non-recursive and a non-RE language

Le = {enc(M ) : L(M ) = ∅}
Lne = {enc(M ) : L(M ) ≠ ∅}

Theorem: Lne is recursively enumerable.

Proof: Non-deterministic TM for Lne :

[Figure: on input enc(Mi), guess a string w, and run the universal TM U on (Mi, w); accept if U accepts.]

Theorem: Lne = {enc(M ) : L(M ) ≠ ∅} is not recursive.

Proof: by contradiction. Suppose there were an algorithm for Lne. Transform an instance (M, w) of Lu into a TM M ′ such that

w ∈ L(M ) ⇔ L(M ′) ≠ ∅

[Figure: M ′ ignores its own input x, simulates M on the hard-wired w, and accepts x iff M accepts w.]

We have reduced Lu to Lne.

Run the algorithm for Lne to see if L(M ′) ≠ ∅.

Since Lu is not recursive, Lne cannot be recursive either.
322 323
Theorem: Le = {enc(M ) : L(M ) = ∅} is not RE.

Proof: If Le were RE then Lne would be recursive, since Le = L̄ne and both Lne and its complement would then be RE.

Other undecidable properties of TM’s

1. Lf in = {enc(M ) : L(M ) is finite}

2. Lreg = {enc(M ) : L(M ) is regular}

3. Lcf l = {enc(M ) : L(M ) is a CFL}

These follow from Rice’s Theorem.

Properties of the RE languages

Every nontrivial property of the RE languages is undecidable.

Property of RE languages (example): “the language is CF”

Formally: A nontrivial property is a nonempty strict subset of all RE languages.

Let P be a nontrivial property of the RE languages.

LP = {enc(M ) : L(M ) ∈ P}.

Rice’s Theorem: LP is not recursive.

324 325

Proof of Rice’s Theorem:

Suppose first ∅ ∉ P.

Let L ∈ P and ML be a TM such that L(ML) = L.

Transform an instance (M, w) of Lu into a TM M ′ such that

L(M ′) = L if w ∈ L(M ), and L(M ′) = ∅ if w ∉ L(M )

[Figure: M ′ first simulates M on w; if M accepts, M ′ starts ML on its own input x and accepts iff ML accepts.]

We have reduced Lu to LP .

Suppose there is an algorithm for LP . Run the algorithm to see whether enc(M ′) ∈ LP , i.e., whether L(M ′) ∈ P. Since Lu is not recursive, LP cannot be recursive either.

Suppose then that ∅ ∈ P.

Consider P̄: the set of RE languages that do not have property P. Based on the above, LP̄ is undecidable.

Since every TM accepts an RE language, we have

LP̄ = L̄P

If LP were decidable then LP̄ would also be decidable, a contradiction.
sive either.
326 327
Post’s Correspondence Problem

PCP is a problem about strings that is undecidable (but RE).

Let A = w1, w2, . . . , wk and B = x1, x2, . . . , xk, where wi, xi ∈ Σ∗ for some alphabet Σ.

The instance (A, B) has a solution if there exist indices i1, i2, . . . , im, such that

wi1wi2 · · · wim = xi1xi2 · · · xim

Example:

List A        List B
i    wi       xi
1    1        111
2    10111    10
3    10       0

Solution: i1 = 2, i2 = 1, i3 = 1, i4 = 3 gives

w2w1w1w3 = x2x1x1x3 = 101111110

Another solution: 2, 1, 1, 3, 2, 1, 1, 3

Example:

List A      List B
i    wi     xi
1    10     101
2    011    11
3    101    011

This PCP instance has no solution.

Suppose i1, i2, . . . , im were a solution:

If i1 = 2 we cannot match w2 = 011 against x2 = 11.

If i1 = 3 we cannot match w3 = 101 against x3 = 011.

Therefore i1 = 1, and a partial solution is w1 = 10 versus x1 = 101.


328 329

If i2 = 1 we cannot match w1w1 = 1010 against x1x1 = 101101.

If i2 = 2 we cannot match w1w2 = 10011 against x1x2 = 10111.

Only i2 = 3 is possible, giving w1w3 = 10101 versus x1x3 = 101011.

Now we are back to “square one:”

Only i3 = 3 is possible, giving w1w3w3 = 10101101 versus x1x3x3 = 101011011.

Only i4 = 3 is possible, giving w1w3w3w3 = 10101101101 versus x1x3x3x3 = 101011011011.

Conclusion: The first list can never catch up with the second.

The Modified PCP:

Let A = w1, w2, . . . , wk and B = x1, x2, . . . , xk, where wi, xi ∈ Σ∗ for some alphabet Σ.

The modified PCP (A, B) has a solution if there exist indices i1, i2, . . . , im, such that

w1wi1wi2 · · · wim = x1xi1xi2 · · · xim

Example:

List A        List B
i    wi       xi
1    1        111
2    10111    10
3    10       0

Any solution would have to begin with w1 = 1 against x1 = 111.

If i2 = 2 we cannot match w1w2 = 110111 against x1x2 = 11110.

If i2 = 3 we cannot match w1w3 = 110 against x1x3 = 1110.

If i2 = 1 we have to match w1w1 = 11 against x1x1 = 111111.

We are back to “square one.”
330 331
We reduce MPCP to PCP.

Let the MPCP instance be A = w1, w2, . . . , wk , B = x1, x2, . . . , xk.

We construct a PCP instance A′ = y0, y1, . . . , yk+1, B′ = z0, z1, . . . , zk+1 as follows:

y0 = ∗y1 and z0 = z1

If wi = a1a2 . . . al then yi = a1 ∗ a2 ∗ . . . al∗

If xi = b1b2 . . . bp then zi = ∗b1 ∗ b2 . . . ∗ bp

yk+1 = $ and zk+1 = ∗$

Now (A, B) has a solution iff (A′, B′) has one.

Example:

Let the MPCP be

List A        List B
i    wi       xi
1    1        111
2    10111    10
3    10       0

Then we construct the PCP

List A             List B
i    yi            zi
0    ∗1∗           ∗1∗1∗1
1    1∗            ∗1∗1∗1
2    1∗0∗1∗1∗1∗    ∗1∗0
3    1∗0∗          ∗0
4    $             ∗$

332 333
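The MPCP-to-PCP construction is short enough to code directly. A minimal sketch in Python (lists of plain strings; the representation is our own):

    def mpcp_to_pcp(A, B):
        def y(w):                  # a trailing * after every symbol
            return "".join(c + "*" for c in w)
        def z(x):                  # a leading * before every symbol
            return "".join("*" + c for c in x)
        A2 = ["*" + y(A[0])] + [y(w) for w in A] + ["$"]
        B2 = [z(B[0])] + [z(x) for x in B] + ["*$"]
        return A2, B2

mpcp_to_pcp(["1", "10111", "10"], ["111", "10", "0"]) returns (['*1*', '1*', '1*0*1*1*1*', '1*0*', '$'], ['*1*1*1', '*1*1*1', '*1*0', '*0', '*$']), i.e., exactly the table above.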

PCP is undecidable

Given an instance (M, w) of Lu we construct an instance (A, B) of MPCP such that w ∈ L(M ) iff (A, B) has a solution.

[Figure: an Lu instance is fed to an MPCP-constructing algorithm, whose output is fed to the MPCP-to-PCP algorithm.]

Partial solutions will consist of strings of the form

#α1#α2#α3 · · ·

where α1 is the initial configuration of M on w, and αi ⊢ αi+1.

The string from list B will always be one ID ahead of A, until M accepts w and A can catch up.

Let M = (Q, Σ, Γ, δ, q0, B, F ) be the TM. WLOG assume that M never prints a blank, and never moves the head left of the initial position.

1. The initial pair:

List A    List B
#         #q0w#

2. For each X ∈ Γ:

List A    List B
X         X
#         #

3. ∀q ∈ Q \ F, ∀p ∈ Q, ∀X, Y, Z ∈ Γ:

List A    List B
qX        Y p       if δ(q, X) = (p, Y, R)
ZqX       pZY       if δ(q, X) = (p, Y, L), Z ∈ Γ
q#        Y p#      if δ(q, B) = (p, Y, R)
Zq#       pZY #     if δ(q, B) = (p, Y, L), Z ∈ Γ
334 335
4. ∀q ∈ F, ∀X, Y ∈ Γ:

List A    List B
XqY       q
Xq        q
qY        q

5. Final pair:

List A    List B
q##       #

Example: Lu instance: (M, 01),

M = ({q1, q2, q3}, {0, 1}, {0, 1, B}, δ, q1, B, {q3})

          0           1           B
→ q1   (q2, 1, R)  (q2, 0, L)  (q2, 1, L)
  q2   (q3, 0, L)  (q1, 0, R)  (q2, 0, R)
 ?q3       −           −           −

The corresponding MPCP is:

Rule    List A    List B    Source
(1)     #         #q101#
(2)     0         0
        1         1
        #         #
(3)     q10       1q2       from δ(q1, 0) = (q2, 1, R)
        0q11      q200      from δ(q1, 1) = (q2, 0, L)
        1q11      q210      from δ(q1, 1) = (q2, 0, L)
        0q1#      q201#     from δ(q1, B) = (q2, 1, L)
        1q1#      q211#     from δ(q1, B) = (q2, 1, L)
        0q20      q300      from δ(q2, 0) = (q3, 0, L)
        1q20      q310      from δ(q2, 0) = (q3, 0, L)
        q21       0q1       from δ(q2, 1) = (q1, 0, R)
        q2#       0q2#      from δ(q2, B) = (q2, 0, R)
(4)     0q30      q3
        0q31      q3
        1q30      q3
        1q31      q3
        0q3       q3
        1q3       q3
        q30       q3
        q31       q3
(5)     q3##      #

336 337
