ATC - Mod2 - RegularLanguageProperties (Autosaved)
◼ Types of Grammar:
1. Type 0 grammar(Phrase Structured Grammar)
2. Type 1 grammar(Context Sensitive Grammar)
3. Type 2 grammar(Context Free Grammar)
4. Type 3 grammar(Regular Grammar)
Phrase Structured Grammar
◼ A Phrase Structured Grammar/ Unrestricted Grammar / Type 0 Grammar G
is a quadruple (V, Σ, R, S), where:
◼ V is the rule alphabet, which contains non terminals and terminals.
◼ Σ (the set of terminals) is a subset of V,
◼ R (the set of rules) is a finite set of rules of the form α→β,
◼ S (the start symbol) is a nonterminal.
◼ Here, all rules in R must be of the form:
◼ α→β,
◼ where α ∈ V+ and β ∈ V* (V already contains both nonterminals and terminals)
◼ Eg: S → aAb | ε
aA→bAA
bA→a
◼ Language: Recursively Enumerable Language
◼ Machine: Turing Machine
◼ Most powerful grammar type.
Context Sensitive Grammar
◼ A Context Sensitive Grammar / Type 1 Grammar G is a quadruple (V, Σ, R, S), where:
◼ V is the rule alphabet, which contains non terminals and terminals,
◼ Σ (the set of terminals) is a subset of V,
◼ R (the set of rules) is a finite set of rules of the form α→β,
◼ S (the start symbol) is a nonterminal.
◼ In a Context Sensitive Grammar, every rule α→β satisfies:
◼ There is a restriction on the length of β: it must be at least as
long as α, i.e. |β| ≥ |α|
◼ α, β ∈ V+, i.e. ε cannot appear as the LHS or RHS of any rule. It is an
ε-free grammar.
◼ Machine: Linear Bounded Automata
◼ Language: Context Sensitive Language
◼ Eg: S→aAb
aA→bAA
bA→aa
Context Free Grammar
◼ A Context Free Grammar/ Type 2 Grammar G is a quadruple (V, Σ, R, S),
where:
◼ V is the rule alphabet, which contains non terminals and terminals,
◼ Σ (the set of terminals) is a subset of V,
◼ R (the set of rules) is a finite set of rules of the form A→α,
◼ S (the start symbol) is a nonterminal.
◼ In a context free grammar, all rules in R must be of the form:
◼ A→ α, Where
◼ A is a single nonterminal and α ∈ V*
◼ Machine: Push Down Automata
◼ Language: Context Free Language
◼ Eg:- S→aB/bA/ε
A→aA/b
B→bB/a/ ε
Regular Grammar
◼ A regular grammar G is a quadruple (V, Σ, R, S), where:
◼ V is the rule alphabet, which contains nonterminals (symbols that are used in
the grammar but that do not appear in strings in the language) and terminals
(symbols that can appear in strings generated by G),
◼ Σ (the set of terminals) is a subset of V,
◼ R (the set of rules) is a finite set of rules of the form α→β,
◼ S (the start symbol) is a nonterminal.
◼ In a regular grammar, all rules in R must:
◼ have a left-hand side that is a single nonterminal, and
◼ have a right-hand side that is ε or a single terminal or a single terminal followed by
a single nonterminal.
◼ So S → a, S → ε, and T → aS are legal rules in a regular grammar.
◼ Machine: Finite Automata
◼ Language: Regular language
Finite State Machine to
Regular Grammar
◼ Procedure:
◼ V (Nonterminals): states of the DFSM
◼ Σ (Terminals): alphabet of the DFSM
◼ S = q0, i.e., the start state of the DFSM is the start
symbol of the grammar
◼ Rules:
◼ If δ(qi,a)=qj then introduce the rule: qi→aqj
◼ If q ∈ F, i.e., if q is a final state of the FSM, then
introduce the rule: q→ε
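The procedure above can be sketched in Python. This is only an illustration: the function name `dfsm_to_grammar` and the dictionary encoding of δ are assumptions made here, and state names are assumed to be single characters so that a rule's right-hand side can be written as a plain string.

```python
def dfsm_to_grammar(delta, start, finals):
    """Return (start_symbol, rules); rules is a list of (lhs, rhs) pairs.

    rhs is "" for an ε-rule, otherwise a terminal followed by a state name.
    The encoding (dict keyed by (state, symbol)) is an assumption of this sketch.
    """
    rules = []
    for (q, a), r in sorted(delta.items()):
        rules.append((q, a + r))      # δ(q, a) = r  gives the rule  q → a r
    for q in sorted(finals):
        rules.append((q, ""))         # q ∈ F  gives the rule  q → ε
    return start, rules


# DFSM for "at least one a": δ(S,a)=A, δ(A,a)=A, and A is final.
start, rules = dfsm_to_grammar(
    delta={("S", "a"): "A", ("A", "a"): "A"},
    start="S",
    finals={"A"},
)
for lhs, rhs in rules:
    print(f"{lhs} → {rhs or 'ε'}")
```

Each transition contributes exactly one rule and each final state contributes one ε-rule, so the grammar has |δ| + |F| rules.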
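Obtain a grammar to generate
strings of any number of a’s
[Diagram: a single state S, both start and final, with a self-loop labeled a]
◼ Transitions and rules:
◼ S is a final state: S→ε
◼ δ(S,a)=S: S→aS
So, the grammar is:
S→aS / ε
or, written as separate rules:
S→aS
S→ε
Language generated is: L = {a^n : n ≥ 0}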
Obtain a grammar to generate
strings of at least one a.
[Diagram: start state S goes to final state A on a; A has a self-loop labeled a]
◼ Transitions and rules:
◼ A is a final state: A→ε
◼ δ(S,a)=A: S→aA
◼ δ(A,a)=A: A→aA
So, the grammar is:
S→aA
A→aA / ε
Language generated is: L = {a^n : n ≥ 1}
Obtain a grammar to generate
strings of any number of a’s and b’s.
[Diagram: a single state S, both start and final, with a self-loop labeled a,b]
◼ Transitions and rules:
◼ S is a final state: S→ε
◼ δ(S,a)=S: S→aS
◼ δ(S,b)=S: S→bS
So, the grammar is:
S→aS / bS / ε
Language generated is: L = {(a+b)^n : n ≥ 0}
Obtain a grammar to generate
strings of at least two a’s
[Diagram: start state S goes to A on a; A goes to final state B on a; B has a self-loop labeled a]
◼ Transitions and rules:
◼ B is a final state: B→ε
◼ δ(S,a)=A: S→aA
◼ δ(A,a)=B: A→aB
◼ δ(B,a)=B: B→aB
So, the grammar is:
S→aA
A→aB
B→aB / ε
Obtain a grammar to generate
strings of a’s whose length is a multiple of three
[Diagram: states S, A, and B in a cycle, S to A to B and back to S, each transition labeled a]
◼ S→aA / ε
◼ A→aB
◼ B→aS
◼ Transitions and rules:
◼ δ(S,a)=A: S→aA
◼ δ(A,a)=S: A→aS
◼ S is a final state: S→ε
G = (V, Σ, R, S)
V = {S, A, a}
Σ = {a}
S is the start symbol
R = {S→aA / ε,
A→aS}
◼ Obtain a grammar to accept the
language L = {w : |w| mod 3 > 0, w ∈ {a}*}
[Diagram: DFSM with states S, A, and B joined by transitions labeled a]
Show a regular grammar for each of the
following languages:
1. {w ∈ {a, b}* : w contains an even number of a’s and an odd number of b’s}.
grammartofsm(G: regular grammar) =
1. Create in M a separate state for each nonterminal in V.
2. Make the state corresponding to S the start state.
3. If there are any rules in R of the form X → w, for some w ∈ Σ, then create an
additional state labeled #.
4. For each rule of the form X → w Y, add a transition from X to Y labeled w.
5. For each rule of the form X → w, add a transition from X to # labeled w.
6. For each rule of the form X → ε, mark state X as accepting.
7. Mark state # as accepting.
8. If M is incomplete (i.e., there are some (state, input) pairs for which no
transition is defined), M requires a dead state. Add a new state D. For every (q,
i) pair for which no transition has already been defined, create a transition from
q to D labeled i. For every i in Σ, create a transition from D to D labeled i.
S→aB
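Steps 1 to 7 of grammartofsm can be sketched in Python as follows. The encoding is an assumption of this sketch: rules are (left-hand side, right-hand side) pairs with single-character symbols, and the resulting machine is kept nondeterministic, so the dead state of step 8 is left out because a missing transition simply yields no successor state. The example grammar (strings over {a, b} that end in a) is hypothetical and chosen only to exercise the # state.

```python
def grammar_to_fsm(rules, start):
    """rules: list of (X, rhs), where rhs is "", "w", or "wY".

    Returns (delta, start, accepting); delta maps (state, symbol) to a set of states.
    """
    delta, accepting = {}, set()
    for lhs, rhs in rules:
        if rhs == "":                   # X → ε: mark X as accepting
            accepting.add(lhs)
        elif len(rhs) == 1:             # X → w: transition from X to the extra state #
            delta.setdefault((lhs, rhs), set()).add("#")
            accepting.add("#")
        else:                           # X → wY: transition from X to Y labeled w
            delta.setdefault((lhs, rhs[0]), set()).add(rhs[1])
    return delta, start, accepting


def accepts(fsm, w):
    """Simulate the (possibly nondeterministic) machine on w."""
    delta, start, accepting = fsm
    current = {start}
    for ch in w:
        current = set().union(*(delta.get((q, ch), set()) for q in current))
    return bool(current & accepting)


# Hypothetical grammar: S → aS | bS | a
fsm = grammar_to_fsm([("S", "aS"), ("S", "bS"), ("S", "a")], start="S")
print(accepts(fsm, "ba"), accepts(fsm, "ab"))
```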
◼ (a+b) => a, b
◼ (a+b)* => ε, a, b, ab, ba, aab, bba, aba, bab,
aaaa, bbbb, ...
◼ (a.b)* => ε, ab, abab, ababab, ...
◼ Ba,bbba,aaaa,bbbb
Example 7.2 Strings that End with aaaa
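Example 7.2 (contd.)
Applying grammartofsm to this grammar, we get:
[Diagram: S has a self-loop on a and b; then S → B → C → D → #, each on a; # is accepting]
δ(S,a)=S
δ(S,b)=S
δ(S,a)=B /* Generate the first a of the pattern. */
δ(B,a)=C /* Generate the second a of the pattern. */
δ(C,a)=D /* Generate the third a of the pattern. */
δ(D,a)=# /* Generate the last a of the pattern and move to the accepting state #. */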
Example 7.3 The Missing
Letter Language
◼ Let Σ = {a, b, c}. LMissing = {w : there is a symbol a_i ∈ Σ not appearing in w}
◼ The job of S is to generate some string in LMissing. It does that by choosing a
first character of the string and then choosing which other character will be
missing.
◼ The job of A is to generate all strings that do not contain any a’s.
◼ The job of B is to generate all strings that do not contain any b’s. And the job
of C is to generate all strings that do not contain any c’s.
S → ε
S → aB    δ(S,a)=B
S → aC    δ(S,a)=C
S → bA    δ(S,b)=A
S → bC    δ(S,b)=C
S → cA    δ(S,c)=A
S → cB    δ(S,c)=B
A → bA
A → cA
A → ε
B → aB
B → cB
B → ε
C → aC
C → bC
C → ε
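[Diagram: FSM for LMissing. From S, transitions on b,c lead to A, on a,c to B, and on a,b to C. A loops on b,c; B loops on a,c; C loops on a,b.]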
[Diagram: FSM with states S, T, W, and #; transitions labeled a and b]
Regular Expression:
a(b ∪ aa)*a
L = {w ∈ {a, b}* : every a in w is immediately followed by at least one b}.
b) {w ∈ {a, b}* : w does not end in aa}.
◼ Regular Grammar
S → aA | bB | ε
A → aC | bB | ε
B → aA | bB | ε
C → aC | bB
[Diagram: the corresponding FSM with states S, A, B, and C]
Properties of Regular
Languages
Reading: Chapter 4
Topics
1) How to prove whether a given
language is regular or not?
1) Pumping Lemma Theorem
2) Some examples to prove language is not
regular
2) Closure properties of regular
languages
Some languages are not
regular
When is a language regular?
If we are able to construct one of the
following: DFSM or NFSM or ε-NFSM or regular
expression
When is it not?
If we can show that no FSM can be built for a
language
How to prove languages are
not regular?
What if we cannot come up with any FSM?
A) Could it be a language that is not regular?
B) Or is it that we tried the wrong approaches?
The Pumping Theorem for
Regular Languages
◼ We show that each of the last three conditions must then hold:
1. |xy| ≤ k : M must not only traverse a loop eventually when reading w, it must
do so for the first time by at least the time it has read k characters. It can read
k-1 characters without revisiting any states. But the kth character must, if no
earlier character already has, take M to a state it has visited before. Whatever
character does that is the last in one pass through some loop.
2. y ≠ ε: since M is deterministic, there are no loops that can be traversed by ε.
3. ∀ q ≥ 0 (xy^qz ∈ L): y can be pumped out once (which is what happens if q = 0)
or pumped in any number of times (which happens if q is greater than 1), and the
resulting string must be in L since it will be accepted by M.
◼ It is possible that we could chop y out more than once and still generate a
string in L, but without knowing how much longer w is than k, we don’t know
any more than that it can be chopped out once.
Example 8.8 A^nB^n is not
Regular
◼ Let L be A^nB^n = {a^n b^n : n ≥ 0}. We can use the Pumping Theorem to show that L is
not regular.
◼ If it were, then there would exist some k such that any string w in L, where |w| ≥ k, must
satisfy the conditions of the theorem. We show one string w that does not.
◼ Let w = a^k b^k. Since |w| = 2k, w is long enough and it is in L, so it must satisfy the
conditions of the Pumping Theorem. So there must exist x, y, and z, such that w =
xyz, |xy| ≤ k, y ≠ ε, and ∀ q ≥ 0 (xy^qz ∈ L).
◼ But we show that no such x, y, and z exist. Since we must guarantee that |xy| ≤ k, y
must occur within the first k characters and so y = a^p for some p.
◼ Since we must guarantee that y ≠ ε, p must be greater than 0. Let q = 2. (In other
words, we pump in one extra copy of y.) The resulting string is a^(k+p) b^k.
◼ The last condition of the Pumping Theorem states that this string must be in L, but
it is not since it has more a’s than b’s.
◼ Thus there exists at least one long string in L that fails to satisfy the conditions of
the Pumping Theorem.
◼ So L = A^nB^n is not regular.
L = {a^n b^n}
◼ w = a…a b…b (n a’s followed by n b’s), so |w| = 2n
◼ w = xyz
◼ |xy| ≤ k ≤ |w|; take k = n and, for illustration, xy = a^n and z = b^n
◼ Assume y = a^p,
then x = a^(n-p)
w = a^(n-p) a^p b^n = a^n b^n
◼ So, let n = 4 (w = a^4 b^4) and p = 1; then we get w = a^3 a^1 b^4
◼ According to the pumping lemma, if L is regular then xy^qz ∈ L
for all q ≥ 0.
◼ Now let q = 2:
◼ xy^2z = a^3 a^(1·2) b^4 = a^5 b^4
In general, xy^qz = a^(n-p+pq) b^n, which is not in L for any q ≠ 1
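The argument can be checked mechanically for one small k. The sketch below is an illustration, not a proof: it fixes k = 4, tries every split w = xyz with |xy| ≤ k and y ≠ ε, and tests only q = 0..3. The helper names `in_anbn` and `survives_pumping` are made up for this example.

```python
def in_anbn(s):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n


def survives_pumping(w, k, in_lang, max_q=3):
    """True if some split w = xyz with |xy| <= k and y != ε keeps
    x y^q z in the language for every q in 0..max_q."""
    limit = min(k, len(w))
    for i in range(limit):                   # i = |x|
        for j in range(i + 1, limit + 1):    # j = |xy|, so y = w[i:j] is nonempty
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_lang(x + y * q + z) for q in range(max_q + 1)):
                return True
    return False


k = 4
w = "a" * k + "b" * k
print(survives_pumping(w, k, in_anbn))   # no split of a^4 b^4 survives
```

A result of False means every admissible split already fails for some q ≤ 3, which matches the slide's conclusion for this k. A result of True would not prove regularity, since only finitely many q are tested.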
The Even Palindrome
Language is Not Regular
◼ Let L be PalEven = {ww^R : w ∈ {a, b}*}. PalEven is the language of even-length
palindromes of a’s and b’s.
◼ We can use the Pumping Theorem to show that PalEven is not regular.
◼ If it were, then there would exist some k such that any string w in L, where |w| ≥ k, must
satisfy the conditions of the theorem. We show one string w that does not.
◼ We will choose w so that we only have to consider one case for where y could fall.
◼ Let w = a^k b^k b^k a^k.
◼ This has the form uu^R with u = a^k b^k and u^R = b^k a^k.
◼ Since |w| = 4k and w is in L, w must satisfy the conditions of the Pumping Theorem.
So there must exist x, y, and z, such that w = xyz, |xy| ≤ k, y ≠ ε, and ∀ q ≥ 0 (xy^qz ∈ L).
Since |xy| ≤ k, y must occur within the first k characters and so y = a^p for some p.
◼ Since y ≠ ε, p must be greater than 0. Let q = 2. The resulting string is a^(k+p) b^k b^k a^k. If p
is odd, then this string is not in PalEven because all strings in PalEven have even
length. If p is even then it is at least 2, so the first half of the string has more a’s than
the second half does, so it is not in PalEven.
◼ So L = PalEven is not regular.
◼ w = a…a b…b b…b a…a (four blocks of k symbols each)
◼ w = xyz and |w| = 4k
i.e. xy = a^k and z = b^k b^k a^k
◼ Assume y = a^p,
then x = a^(k-p)
w = a^(k-p) a^p b^k b^k a^k = a^k b^k b^k a^k
◼ So, let k = 4 and p = 2; then we get
w = a^2 a^2 b^4 b^4 a^4
◼ According to the pumping lemma, if L is regular then
xy^qz ∈ L for all q ≥ 0.
◼ Now let q = 3:
◼ xy^3z = a^2 a^(2·3) b^4 b^4 a^4 = a^8 b^4 b^4 a^4
In general, xy^qz = a^(k-p+pq) b^k b^k a^k, which is not in L for any q ≠ 1
Example 8.12 The Language with
More a’s Than b’s is Not Regular
◼ Let L = {a^n b^m : n > m}. We can use the Pumping Theorem to show that L is not regular. If
it were, then there would exist some k such that any string w in L, where |w| ≥ k, must satisfy
the conditions of the theorem. We show one string w that does not.
1. Let w = a^(k+1) b^k. Since |w| = 2k+1 and w is in L, w must satisfy the conditions of the
Pumping Theorem.
2. So there must exist x, y, and z, such that w = xyz, |xy| ≤ k, y ≠ ε, and ∀ q ≥ 0 (xy^qz ∈ L). Since
|xy| ≤ k, y must occur within the first k characters and so y = a^p for some p.
3. Since y ≠ ε, p must be greater than 0. There are already more a’s than b’s, as required by
the definition of L.
4. If we pump in, there will be even more a’s and the resulting string will still be in L. But
we can set q to 0 (and so pump out).
5. The resulting string is then a^(k+1-p) b^k. Since p > 0, k+1-p ≤ k, so the resulting string no
longer has more a’s than b’s and so is not in L.
6. There exists at least one long string in L that fails to satisfy the conditions of the Pumping
Theorem.
7. So L is not regular.
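Steps 4 and 5 can be illustrated numerically. The following sketch fixes k = 5 (an arbitrary choice for the demonstration) and, for every possible y = a^p with 1 ≤ p ≤ k, compares pumping in (q = 2) with pumping out (q = 0). The function name `in_more_as` is an assumption of this example.

```python
def in_more_as(s):
    """Membership test for {a^n b^m : n > m}."""
    n = len(s) - len(s.lstrip("a"))      # length of the leading block of a's
    rest = s[n:]
    return set(rest) <= {"b"} and n > len(rest)


k = 5
w = "a" * (k + 1) + "b" * k
results = []
for p in range(1, k + 1):                         # y = a^p, with 1 <= p <= k
    pumped_in = "a" * (k + 1 + p) + "b" * k       # q = 2: still more a's than b's
    pumped_out = "a" * (k + 1 - p) + "b" * k      # q = 0: at most k a's remain
    results.append((p, in_more_as(pumped_in), in_more_as(pumped_out)))

for p, stays_in, pumped_out_ok in results:
    print(p, stays_in, pumped_out_ok)
```

For every p the second column is True and the third is False, which is why the proof pumps out rather than in.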
The Pumping Lemma for
Regular Languages
What is it?
The Pumping Lemma is a property
of all regular languages.
How is it used?
As a technique to show
that a given language is not regular
Pumping Lemma for Regular
Languages
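Let L be a regular language. Then there exists some k ≥ 1 such that every string w ∈ L with |w| ≥ k can be written as w = xyz, where:
1. |xy| ≤ k,
2. y ≠ ε, and
3. ∀ q ≥ 0 (xy^qz ∈ L).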
Closure properties of Regular
Languages
Closure properties for Regular
Languages (RL)
◼ Closure property (this is different from Kleene closure):
◼ If regular languages are combined using an
operator, then the resulting language is also regular
◼ Regular languages are closed under:
◼ Union, intersection, complement
◼ Difference
◼ Reversal
◼ Kleene closure
◼ Concatenation
◼ Homomorphism
◼ Inverse homomorphism
Now, let’s prove all of this!
RLs are closed under union
◼ If L and M are two RLs, then L ∪ M is also an RL.
[Diagram: FSMs for L and M, each with start state q0, intermediate states qi, and final states qF2, …, qFk]
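DFSM construction for L ∩ M
[Diagram: product construction. The DFSM for L has states q0, qi, qj, … and final states qF1, qF2, …; the DFSM for M has states p0, pi, pj, … and final states pF1, pF2, …. On symbol a, qi goes to qj and pi goes to pj. The DFSM for L ∩ M has pair states (q0,p0), (qi,pi), (qj,pj), … and final states such as (qF1,pF1).]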
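The pair-state construction can be sketched in Python. The tuple encoding (states, delta, start, finals) and the two example machines (an even number of a's, and strings ending in b) are assumptions made for this illustration; both transition functions are assumed to be total.

```python
from itertools import product


def intersect(d1, d2, alphabet):
    """Product DFSM: runs both machines in lockstep on the same input."""
    s1, t1, q0, f1 = d1
    s2, t2, p0, f2 = d2
    states = set(product(s1, s2))
    delta = {((q, p), a): (t1[(q, a)], t2[(p, a)])
             for (q, p) in states for a in alphabet}
    finals = {(q, p) for (q, p) in states if q in f1 and p in f2}
    return states, delta, (q0, p0), finals


def run(dfsm, w):
    _, delta, state, finals = dfsm
    for ch in w:
        state = delta[(state, ch)]
    return state in finals


# L: even number of a's (E = even so far, O = odd so far)
even_a = ({"E", "O"},
          {("E", "a"): "O", ("E", "b"): "E", ("O", "a"): "E", ("O", "b"): "O"},
          "E", {"E"})
# M: strings that end in b (Y = last symbol was b, N = otherwise)
ends_b = ({"N", "Y"},
          {("N", "a"): "N", ("N", "b"): "Y", ("Y", "a"): "N", ("Y", "b"): "Y"},
          "N", {"Y"})

both = intersect(even_a, ends_b, "ab")
print(run(both, "aab"), run(both, "ab"), run(both, "aa"))
```

A pair state is final only when both components are final, which is exactly the (qF1, pF1) state in the diagram.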
RLs are closed under set
difference
◼ We observe:
◼ L − M = L ∩ M̄, where M̄ is the complement of M
◼ RLs are closed under complementation and closed under intersection, so they are closed under set difference
RLs are closed under reversal
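Reversal of a string w is denoted by w^R
◼ E.g., w=00111, w^R=11100
Reversal of a language:
◼ L^R = the language obtained by
reversing all strings in L
Construction, starting from the DFSM for L (states q0, qi, qj, … and final states qF1, qF2, …, qFk):
◼ Reverse the direction of every transition
◼ Add a new start state q’0, with ε-transitions to each old final state
◼ Make the old start state the only new final state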
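The reversal construction can be sketched as follows, assuming a machine encoded as a dictionary from (state, symbol) to a set of states, with the empty string "" standing for an ε-move. The fresh state name "q0'" and the example machine (strings over {0, 1} that start with 0) are assumptions of this sketch.

```python
def reverse_fsm(delta, start, finals):
    """Return (delta_r, new_start, new_finals) for the reversed language."""
    new_start = "q0'"                               # assumed not to clash with existing states
    delta_r = {(new_start, ""): set(finals)}        # ε-moves to every old final state
    for (q, a), targets in delta.items():
        for r in targets:
            delta_r.setdefault((r, a), set()).add(q)    # flip q -a-> r into r -a-> q
    return delta_r, new_start, {start}              # old start state is the only final state


def eps_closure(delta, states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def accepts(fsm, w):
    delta, start, finals = fsm
    current = eps_closure(delta, {start})
    for ch in w:
        moved = set().union(*(delta.get((q, ch), set()) for q in current))
        current = eps_closure(delta, moved)
    return bool(current & finals)


# L: strings that start with 0. Its reversal: strings that end with 0.
L = ({("A", "0"): {"B"}, ("B", "0"): {"B"}, ("B", "1"): {"B"}}, "A", {"B"})
LR = reverse_fsm(*L)
print(accepts(L, "011"), accepts(LR, "110"), accepts(LR, "011"))
```

The reversed machine is generally nondeterministic, which is fine for the closure proof because every NFSM has an equivalent DFSM.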
Homomorphisms
◼ Substitute each symbol in ∑ (main alphabet)
by a corresponding string in T (another
alphabet)
◼ h: ∑ → T*
◼ Example:
◼ Let ∑={0,1} and T={a,b}
◼ Let a homomorphic function h on ∑ be:
◼ h(0)=ab, h(1)=ε
◼ If w=10110, then h(w) = ε·ab·ε·ε·ab = abab
◼ In general,
◼ h(w) = h(a1) h(a2)… h(an)
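As a small illustration, the symbol-by-symbol definition translates directly into Python. The dictionary encoding of h and the function name `apply_h` are assumptions of this sketch; the empty string plays the role of ε.

```python
def apply_h(h, w):
    """Apply a homomorphism h (a dict from symbols to strings) to the string w."""
    return "".join(h[symbol] for symbol in w)


h = {"0": "ab", "1": ""}          # h(0) = ab, h(1) = ε
print(apply_h(h, "10110"))        # abab
```

Because h is applied one symbol at a time, h(uv) = h(u)h(v) holds for all strings u and v.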
Given a DFSM for L, how to convert it into an FSM for h(L)?
[Diagram: DFSM for L with final states up to qFk]
- Build a new FSM that simulates h(a) for every transition on symbol a in
the above DFSM
- The resulting FSM may or may not be a DFSM, but will be an FSM for h(L)
Inverse homomorphism
◼ Let h: ∑ → T*
◼ Let M be a language over alphabet T
◼ h^-1(M) is the set of strings in ∑*
whose homomorphic translation
results in strings of M
Given a DFSM for M, how to convert it into an FSM for h^-1(M)?
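One standard construction keeps the states, start state, and final states of the DFSM for M, and defines the new transition on a symbol a of ∑ as the state the old machine reaches after reading the whole string h(a). The sketch below assumes a total transition function and uses a hypothetical example: M = (abab)* over T = {a, b}, with h(0) = ab and h(1) = ε, so that h^-1(M) is the set of strings with an even number of 0s.

```python
def inverse_hom_dfsm(dfsm, h):
    """dfsm = (states, delta, start, finals) over T; h maps each symbol of ∑ to a string over T."""
    states, delta, start, finals = dfsm

    def run_from(q, s):
        for ch in s:
            q = delta[(q, ch)]
        return q

    # New transition on a: follow the old machine along h(a).
    new_delta = {(q, a): run_from(q, h[a]) for q in states for a in h}
    return states, new_delta, start, finals


def accepts(dfsm, w):
    _, delta, q, finals = dfsm
    for ch in w:
        q = delta[(q, ch)]
    return q in finals


# DFSM for M = (abab)*, with "D" as the dead state.
cycle = {(0, "a"): 1, (1, "b"): 2, (2, "a"): 3, (3, "b"): 0}
states = {0, 1, 2, 3, "D"}
delta = {(q, c): cycle.get((q, c), "D") for q in states for c in "ab"}
M = (states, delta, 0, {0})

h = {"0": "ab", "1": ""}
M_inv = inverse_hom_dfsm(M, h)
print(accepts(M_inv, "1001"), accepts(M_inv, "10"))
```

The result is again a DFSM with the same number of states, which shows that h^-1(M) is regular whenever M is.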