
CST 301 FORMAL LANGUAGES AND AUTOMATA THEORY

Materials for Final Revision


DETERMINISTIC FINITE AUTOMATA
Problems
Q1) Construct a DFA to accept a string containing a zero followed by a one.

Q2) Construct a DFA to accept a string containing two consecutive zeroes followed by
two consecutive ones

Q3) Construct a DFA to accept a string containing even number of zeroes and any
number of ones

Q) Construct a DFA to accept all strings which do not contain three consecutive zeroes

Downloaded from Ktunotes.in


Q) Construct a DFA to accept all strings containing even number of zeroes and even
number of ones

Q) Construct a DFA to accept all strings which satisfy #(x) mod 5 = 2

Q) Construct a DFA to accept all strings (0+1)* with an equal number of 0's & 1's
such that each prefix has at most one more zero than ones and at most one more
one than zeroes
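
One possible answer to the last question can be sketched as a transition table and simulated in Python. The state names (eq, zero_ahead, one_ahead, dead) are assumptions made for this illustration: the machine only needs to remember whether the prefix read so far is balanced, has one extra 0, or has one extra 1, and it moves to a trap state as soon as the difference would reach two.

```python
# Hypothetical state names; "dead" is a trap state for prefixes that break the rule.
DELTA = {
    ("eq", "0"): "zero_ahead",
    ("eq", "1"): "one_ahead",
    ("zero_ahead", "0"): "dead",
    ("zero_ahead", "1"): "eq",
    ("one_ahead", "0"): "eq",
    ("one_ahead", "1"): "dead",
    ("dead", "0"): "dead",
    ("dead", "1"): "dead",
}
START = "eq"
FINAL = {"eq"}


def accepts(w):
    """Run the DFA on the binary string w and report whether it ends in a final state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]
    return state in FINAL
```

With this table, accepts("0110") holds, while accepts("0011") does not, because the prefix 00 already has two more zeroes than ones.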



Language accepted by the class of DFA

NON-DETERMINISTIC FINITE AUTOMATA


Problems
Q) Construct an NFA to accept all strings terminating in 01

Q) Construct an NFA to accept those strings containing three consecutive zeroes

Q) Construct an NFA for L = {x | x ∈ Σ* and the number of a's in x is divisible by 2 but not divisible by 3}

So a string is accepted by DFA D if, and only if, it is accepted by NFA N.
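
The equivalence stated above is usually shown with the subset construction. The following is a minimal sketch, assuming a small NFA for "strings terminating in 01" with hypothetical state names p, q, r; each DFA state is the set of NFA states reachable on the input read so far.

```python
from itertools import chain

# Assumed NFA for strings ending in 01 (state names are illustrative only).
NFA_DELTA = {
    ("p", "0"): {"p", "q"},
    ("p", "1"): {"p"},
    ("q", "1"): {"r"},
}
NFA_START, NFA_FINAL, SIGMA = "p", {"r"}, "01"


def move(states, a):
    """All NFA states reachable from any state in `states` on symbol a."""
    return frozenset(chain.from_iterable(NFA_DELTA.get((q, a), ()) for q in states))


def nfa_accepts(w):
    current = frozenset({NFA_START})
    for a in w:
        current = move(current, a)
    return bool(current & NFA_FINAL)


def subset_construction():
    """Build the reachable part of the DFA whose states are sets of NFA states."""
    start = frozenset({NFA_START})
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        subset = todo.pop()
        for a in SIGMA:
            target = move(subset, a)
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    finals = {s for s in seen if s & NFA_FINAL}
    return start, dfa_delta, finals


def dfa_accepts(w):
    start, delta, finals = subset_construction()
    state = start
    for a in w:
        state = delta[(state, a)]
    return state in finals
```

For this NFA only three subsets are reachable, {p}, {p, q} and {p, r}, so the resulting DFA has three states.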

Applications of Finite Automata (FA):

Finite automata and finite state machines have several applications. Some are
given below:

• Finite automata are highly useful in designing lexical analyzers.
• Finite automata are useful in designing text editors.
• Finite automata are highly useful in designing spell checkers.
• Finite automata are useful in sequential circuit design (transducers).

Examples of languages that finite automata cannot recognize:

1. The set of strings over {a, b} consisting of an equal number of a's and b's.
2. The set of strings over '(' and ')' that have "balanced" parentheses.



Regular Expression
Definition
The class of regular expressions (re) over an alphabet ∑ is defined recursively as follows.

1. The symbols Φ, ε and each 'a' ∈ ∑ are regular expressions over ∑.

2. If r1 and r2 are regular expressions, then so are
(r1 + r2) [+ union]
r1 • r2 [• concatenation]
r1* [* Kleene closure: zero or more occurrences of r1]

3. Rule 1 gives the primitive regular expressions. A string is a regular
expression if, and only if, it can be derived from the primitive regular
expressions by a finite number of applications of the rules in 2.

Theorem - Equivalence of Regular Expressions and Non-deterministic Finite Automata

The following diagram shows the relation between finite automata and regular expressions.

[Diagram relating: Regular Expression, NFA with ε-moves, NFA without ε-moves, Deterministic Finite Automata]

Prove that for every regular expression there is an equivalent NFA with ε-transitions.

Let r be a regular expression. Then there exists an NFA with ε-transitions that accepts L(r).

Proof:
We show by induction on the number of operators in the regular expression r that there is an
NFA M with ε-transitions, having one final state and no transitions out of this final state,
such that L(M) = L(r).

Basis (zero operators): the expression r must be ε, Φ or a for some a in ∑. The NFAs given
below satisfy these conditions.



[Figure: NFAs for the basis cases a) r = ε, b) r = Φ, c) r = a, each with start state q0]

Induction
Assume that the theorem is true for regular expressions with fewer than i operators, i ≥ 1. Let r
have i operators. There are three cases depending on the form of r.
Case 1 :
r = r1 + r2
Here both r1 and r2 must have fewer than i operators. Thus there are NFAs

M1 = (Q1, ∑1, δ1, q1, {f1}) and M2 = (Q2, ∑2, δ2, q2, {f2}) with L(M1) = L(r1) and L(M2) = L(r2).

Since we may rename the states of an NFA at will, we may assume Q1 and Q2 are disjoint.

Construct M = (Q1 U Q2 U {q0, f0}, ∑1 U ∑2, δ, q0, {f0})

where δ is defined by
i) δ(q0, ε) = {q1, q2}
ii) δ(q, a) = δ1(q, a) for q in Q1 – {f1} and a in ∑1 U {ε}
iii) δ(q, a) = δ2(q, a) for q in Q2 – {f2} and a in ∑2 U {ε}
iv) δ(f1, ε) = δ(f2, ε) = {f0}
Recall by the inductive hypothesis that there are no transitions out of f1 or f2 in M1 or M2. Thus
all the moves of M1 and M2 are present in M.

The construction of M is depicted as follows



Any path in the transition diagram of M from q0 to f0 must begin by going to either q1
or q2 on ε. If the path goes to q1, it may follow any path in M1 to f1 and then go to f0 on ε.
Similarly, paths that begin by going to q2 may follow any path in M2 to f2 and then go to f0 on ε.
Case 2

r = r1 r2. Let M1 and M2 be as in case 1 and construct M = (Q1 U Q2, ∑1 U ∑2, δ, q1, {f2})

where δ is given by

i) δ(q, a) = δ1(q, a) for q in Q1 – {f1} and a in ∑1 U {ε}

ii) δ(f1, ε) = {q2}

iii) δ(q, a) = δ2(q, a) for q in Q2 and a in ∑2 U {ε}


The construction of M is

Case 3

r = r1*. Let M1 = (Q1, ∑1, δ1, q1, {f1}) and L(M1) = L(r1). Construct

M = (Q1 U {q0, f0}, ∑1, δ, q0, {f0})

where δ is given by
i) δ(q0, ε) = δ(f1, ε) = {q1, f0}
ii) δ(q, a) = δ1(q, a) for q in Q1 – {f1} and a in ∑1 U {ε}

Construction of M is
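
The three inductive cases can be sketched in Python as follows. This is only an illustration under assumptions: regular expressions are written as nested tuples such as ("union", r1, r2), states are integers, None stands for ε, and the basis case for ε uses two states joined by an ε-move instead of a single state. The names Builder, eps_closure and matches are hypothetical.

```python
from collections import defaultdict
from itertools import count


class Builder:
    """Builds an ε-NFA with a single final state, following the three cases of the proof."""

    def __init__(self):
        self.delta = defaultdict(set)  # (state, symbol or None for ε) -> set of states
        self.fresh = count()

    def build(self, r):
        kind = r[0]
        if kind in ("eps", "empty", "sym"):      # basis: zero operators
            q0, f = next(self.fresh), next(self.fresh)
            if kind == "eps":
                self.delta[(q0, None)].add(f)
            elif kind == "sym":
                self.delta[(q0, r[1])].add(f)
            return q0, f                          # "empty" gets no transition at all
        if kind == "union":                       # case 1: r = r1 + r2
            q1, f1 = self.build(r[1])
            q2, f2 = self.build(r[2])
            q0, f0 = next(self.fresh), next(self.fresh)
            self.delta[(q0, None)] |= {q1, q2}
            self.delta[(f1, None)].add(f0)
            self.delta[(f2, None)].add(f0)
            return q0, f0
        if kind == "cat":                         # case 2: r = r1 r2
            q1, f1 = self.build(r[1])
            q2, f2 = self.build(r[2])
            self.delta[(f1, None)].add(q2)
            return q1, f2
        if kind == "star":                        # case 3: r = r1*
            q1, f1 = self.build(r[1])
            q0, f0 = next(self.fresh), next(self.fresh)
            self.delta[(q0, None)] |= {q1, f0}
            self.delta[(f1, None)] |= {q1, f0}
            return q0, f0
        raise ValueError(f"unknown operator: {kind}")


def eps_closure(delta, states):
    """All states reachable from `states` using ε-moves only."""
    stack, closure = list(states), set(states)
    while stack:
        q = stack.pop()
        for p in delta.get((q, None), ()):
            if p not in closure:
                closure.add(p)
                stack.append(p)
    return closure


def matches(r, w):
    """Build the ε-NFA for r and simulate it on the string w."""
    builder = Builder()
    start, final = builder.build(r)
    current = eps_closure(builder.delta, {start})
    for a in w:
        step = set()
        for q in current:
            step |= builder.delta.get((q, a), set())
        current = eps_closure(builder.delta, step)
    return final in current
```

For example, (0+1)*01 can be written as ("cat", ("star", ("union", ("sym", "0"), ("sym", "1"))), ("cat", ("sym", "0"), ("sym", "1"))) and then tested with matches.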



Regular Grammar

Regular grammars are another way to describe regular languages. A grammar is made up of
terminals, variables and production rules, and is defined by

G = (V, T, S, P) where

V - variables or nonterminals
T - terminals
S - start symbol
P - productions (rules)

All productions are of the form A→xB or A→x

where A, B ∈ V and x ∈ T∗
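
A membership test for such a grammar can be sketched as a search over pairs (current variable, position in the input). The grammar below (S → aS | bA, A → bA | ε, generating a*b+) and the function name derives are assumptions made for illustration; each production A→xB is stored as the pair (x, B) and A→x as (x, None).

```python
# Assumed example grammar: S -> aS | bA,  A -> bA | ε
GRAMMAR = {
    "S": [("a", "S"), ("b", "A")],
    "A": [("b", "A"), ("", None)],
}


def derives(grammar, start, w):
    """True if the right-linear grammar can derive w from the start symbol."""
    seen, todo = set(), [(start, 0)]
    while todo:
        var, pos = todo.pop()
        if (var, pos) in seen:
            continue
        seen.add((var, pos))
        for x, nxt in grammar[var]:
            if not w.startswith(x, pos):
                continue                      # the terminal string x does not fit here
            end = pos + len(x)
            if nxt is None:
                if end == len(w):
                    return True               # production A -> x finished the string
            else:
                todo.append((nxt, end))       # production A -> xB, continue from B
    return False
```

Because there are only finitely many (variable, position) pairs, the search always terminates, which mirrors the fact that a right-linear grammar behaves like a finite automaton.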

Myhill-Nerode Theorem (Used in minimization of DFA)

The Myhill–Nerode theorem states that a given language L is regular if and only if the
equivalence relation RL has a finite number of equivalence classes, and moreover that
the number of states in the smallest deterministic finite automaton (DFA) recognizing
L is equal to the number of equivalence classes in RL. Here x RL y means that for every
string z, xz is in L exactly when yz is in L. In particular, this implies that
there is a unique minimal DFA (up to renaming of states) for each regular language.
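
The number of equivalence classes can be computed from any DFA for L by partition refinement. The sketch below is one common way to do this (Moore-style refinement), not necessarily the method intended by the syllabus; it assumes a complete DFA whose states are all reachable, and the example machine with states A, B, C, D is hypothetical.

```python
def minimal_state_count(states, sigma, delta, finals):
    """Number of states of the minimal DFA, found by refining final / non-final blocks."""
    block = {q: int(q in finals) for q in states}
    while True:
        ids = {}
        refined = {}
        for q in states:
            # Two states stay together only if they agree on their own block
            # and on the block reached for every input symbol.
            signature = (block[q],) + tuple(block[delta[(q, a)]] for a in sigma)
            refined[q] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return len(ids)                   # no block was split: partition is stable
        block = refined


# Assumed DFA for strings ending in 01, with a redundant state D that behaves like A.
STATES = ["A", "B", "C", "D"]
DELTA = {
    ("A", "0"): "B", ("A", "1"): "D",
    ("B", "0"): "B", ("B", "1"): "C",
    ("C", "0"): "B", ("C", "1"): "A",
    ("D", "0"): "B", ("D", "1"): "A",
}
```

Here A and D are never distinguished by any input, so the minimal machine needs one state fewer than the four listed.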

Closure properties



Closure properties of regular languages (Proof not required as per
syllabus) [see table: regular languages are closed under all the listed properties]
In automata theory, there are different closure properties for regular languages.
They are as follows −

• Union
• Intersection
• Concatenation
• Kleene closure
• Complement
Let us see them one by one with an example.

Union
If L1 and L2 are two regular languages, their union L1 U L2 will also be regular.

Example
L1 = {a^n | n > 0} and L2 = {b^n | n > 0}



L3 = L1 U L2 is also regular.

Intersection
If L1 and L2 are two regular languages, their intersection L1 ∩ L2 will also be
regular.
Example
L1 = {a^m b^n | n > 0 and m > 0} and
L2 = {a^m b^n U b^n a^m | n > 0 and m > 0}
L3 = L1 ∩ L2 = {a^m b^n | n > 0 and m > 0} is also regular.
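
Closure under union and intersection is usually justified by the product construction: run two DFAs side by side and combine their verdicts. The sketch below simulates the pair of states directly instead of building the product table; the two example machines (even number of a's, and strings ending in b) and all names are assumptions chosen for illustration.

```python
# Assumed DFA M1: even number of a's over {a, b}.
M1 = (
    {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"},
    "e",
    {"e"},
)
# Assumed DFA M2: strings ending in b over {a, b}.
M2 = (
    {("n", "a"): "n", ("n", "b"): "y", ("y", "a"): "n", ("y", "b"): "y"},
    "n",
    {"y"},
)


def product_accepts(m1, m2, w, combine):
    """Simulate the product DFA on w; `combine` decides which pairs of states are final."""
    (d1, s1, f1), (d2, s2, f2) = m1, m2
    p, q = s1, s2
    for a in w:
        p, q = d1[(p, a)], d2[(q, a)]         # one step of the product machine
    return combine(p in f1, q in f2)


def in_intersection(w):
    return product_accepts(M1, M2, w, lambda x, y: x and y)


def in_union(w):
    return product_accepts(M1, M2, w, lambda x, y: x or y)
```

The only difference between the union and the intersection machine is the choice of final pairs, which is what the combine argument expresses.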

Concatenation
If L1 and L2 are two regular languages, their concatenation L1.L2 will also be
regular.
Example
L1 = {a^n | n > 0} and L2 = {b^n | n > 0}
L3 = L1.L2 = {a^m b^n | m > 0 and n > 0} is also regular.

Kleene Closure
If L1 is a regular language, its Kleene closure L1* will also be regular.
Example
L1 = (a U b )
L1* = (a U b)*

Complement
If L(G) is a regular language, its complement L'(G) will also be regular. Complement
of a language can be found by subtracting strings which are in L(G) from all possible
strings.
Example (over the alphabet {a})
L(G) = {a^n | n > 3}
L'(G) = {a^n | n <= 3}
Note − Two regular expressions are equivalent, if languages generated by them are
the same. For example, (a+b*)* and (a+b)* generate the same language. Every string
which is generated by (a+b*)* is also generated by (a+b)* and vice versa.
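
The equivalence claimed in the note can be spot-checked by brute force. The sketch below uses Python's re module, where union is written | instead of +, and compares the two expressions on every string up to a chosen length; agreement on short strings is evidence, not a proof, and the function name same_language_up_to is an assumption.

```python
import re
from itertools import product


def same_language_up_to(p1, p2, alphabet, max_len):
    """Compare two regular expressions on all strings over `alphabet` of length <= max_len."""
    r1, r2 = re.compile(p1), re.compile(p2)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if bool(r1.fullmatch(w)) != bool(r2.fullmatch(w)):
                return False                  # w is in exactly one of the two languages
    return True


# (a+b*)* and (a+b)* in Python syntax
LEFT = r"(?:a|b*)*"
RIGHT = r"(?:a|b)*"
```

Calling same_language_up_to(LEFT, RIGHT, "ab", 6) checks all 127 strings of length at most 6 over {a, b}.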



Properties of Context Free Languages

Union : If L1 and L2 are two context free languages, their union L1 ∪ L2 will
also be context free. For example,
L1 = { a^n b^n c^m | m >= 0 and n >= 0 } and L2 = { a^n b^m c^m | n >= 0 and m >= 0 }
L3 = L1 ∪ L2 = { a^n b^n c^m ∪ a^n b^m c^m | n >= 0, m >= 0 } is also context free.
L1 says the number of a's should be equal to the number of b's and L2 says the number
of b's should be equal to the number of c's. Their union requires either of the two
conditions to be true, so it is also a context free language.
So CFLs are closed under union.

Concatenation : If L1 and L2 are two context free languages, their
concatenation L1.L2 will also be context free. For example,
L1 = { a^n b^n | n >= 0 } and L2 = { c^m d^m | m >= 0 }
L3 = L1.L2 = { a^n b^n c^m d^m | m >= 0 and n >= 0 } is also context free.
L1 says the number of a's should be equal to the number of b's and L2 says the number
of c's should be equal to the number of d's. Their concatenation says that first the
number of a's should be equal to the number of b's, and then the number of c's should
be equal to the number of d's. So, we can create a PDA which will first push for
a's, pop for b's, push for c's and then pop for d's. So it can be accepted by a
pushdown automaton, hence it is context free.
So CFLs are closed under concatenation.
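
The push and pop behaviour described above can be sketched with an explicit stack. This is a simplified, deterministic illustration of the idea rather than a formal PDA definition; the function name accepts and the phase counter are assumptions.

```python
def accepts(w):
    """Recognize a^n b^n c^m d^m: push for a's, pop for b's, push for c's, pop for d's."""
    stack, phase = [], 0                      # phase is the index in "abcd" of the last symbol read
    for ch in w:
        k = "abcd".find(ch)
        if k < phase:
            return False                      # unknown symbol (-1) or symbol out of order
        if k >= 2 and phase < 2 and stack:
            return False                      # some a was never matched by a b
        phase = k
        if ch in "ac":
            stack.append(ch)                  # push for a's and c's
        else:
            expected = "a" if ch == "b" else "c"
            if not stack or stack[-1] != expected:
                return False                  # nothing left to match this b or d
            stack.pop()                       # pop for b's and d's
    return not stack                          # every pushed symbol must have been matched
```

The stack is empty between the two halves, which is why the same stack can be reused for the c's and d's.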

Kleene Closure : If L1 is context free, its Kleene closure L1* will also be
context free. For example,
L1 = { a^n b^n | n >= 0 }
L1* = { a^n b^n | n >= 0 }* is also context free.
Note : So CFLs are closed under Kleene closure.

Intersection and complementation : If L1 and L2 are two context free
languages, their intersection L1 ∩ L2 need not be context free. For example,
L1 = { a^n b^n c^m | n >= 0 and m >= 0 } and L2 = { a^m b^n c^n | n >= 0 and m >= 0 }
L3 = L1 ∩ L2 = { a^n b^n c^n | n >= 0 } is not context free.
L1 says the number of a's should be equal to the number of b's and L2 says the number
of b's should be equal to the number of c's. Their intersection says both
conditions need to be true, but a pushdown automaton can compare only two counts.
So it cannot be accepted by a pushdown automaton, hence it is not context free.
Similarly, the complement of a context free language L1, which is ∑* – L1,
need not be context free.
Note : So CFLs are not closed under intersection and complementation.
Context Sensitive Grammar-Properties

• Any context-free language is context sensitive.

• Not every context-sensitive language is context-free.

• Context sensitive languages are closed under

o Union, Intersection, Complement, Concatenation, Kleene closure, Reversal.

Closure Properties of Recursive Languages


• Union: If L1 and L2 are two recursive languages, their union
L1 ∪ L2 will also be recursive, because a TM that halts on every input for L1 and a TM
that halts on every input for L2 can be run one after the other to decide L1 ∪ L2.
• Concatenation: If L1 and L2 are two recursive languages, their
concatenation L1.L2 will also be recursive. For example:
L1 = {a^n b^n c^n | n >= 0}
L2 = {d^m e^m f^m | m >= 0}
L3 = L1.L2
= {a^n b^n c^n d^m e^m f^m | m >= 0 and n >= 0} is also recursive.
L1 says n a's followed by n b's followed by n c's. L2
says m d's followed by m e's followed by m f's.
Their concatenation first matches the numbers of a's, b's and c's and then
matches the numbers of d's, e's and f's. So it can be decided by a TM.
• Kleene Closure: If L1 is recursive, its Kleene closure L1* will also be
recursive. For example:
L1 = {a^n b^n c^n | n >= 0}
L1* = {a^n b^n c^n | n >= 0}* is also recursive.
• Intersection and complement: If L1 and L2 are two recursive
languages, their intersection L1 ∩ L2 will also be recursive. For
example:
L1 = {a^n b^n c^n d^m | n >= 0 and m >= 0}
L2 = {a^m b^n c^n d^n | n >= 0 and m >= 0}
L3 = L1 ∩ L2
= {a^n b^n c^n d^n | n >= 0} will be recursive.

L1 says n a's followed by n b's followed by n c's and then
any number of d's. L2 says any number of a's followed by n b's followed by n
c's followed by n d's. Their intersection says n a's followed by
n b's followed by n c's followed by n d's. So it can be decided
by a Turing machine, hence it is recursive.

Similarly, the complement of a recursive language L1, which is ∑* - L1, will also be
recursive.
Note: As opposed to REC languages, RE languages are not closed under
complementation, which means the complement of an RE language need not be RE.
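
These closure arguments can be illustrated by treating a decider as a Python function that always returns True or False. The sketch below is an analogy under that assumption, not a Turing machine construction; equal_blocks and the combinator names are hypothetical.

```python
def equal_blocks(symbols):
    """Decider for { s1^n s2^n ... sk^n | n >= 0 }, e.g. symbols = 'abc' gives a^n b^n c^n."""
    def decide(w):
        n = len(w) // len(symbols)
        return w == "".join(s * n for s in symbols)
    return decide


def union(d1, d2):
    return lambda w: d1(w) or d2(w)           # both deciders halt, so this halts


def intersection(d1, d2):
    return lambda w: d1(w) and d2(w)


def complement(d):
    return lambda w: not d(w)                 # flip the answer of a decider that always halts


def concatenation(d1, d2):
    # Try every split point of w; there are only len(w) + 1 of them.
    return lambda w: any(d1(w[:i]) and d2(w[i:]) for i in range(len(w) + 1))


L1 = equal_blocks("abc")                      # a^n b^n c^n
L2 = equal_blocks("def")                      # d^m e^m f^m
L3 = concatenation(L1, L2)
```

The complement combinator relies on the decider always returning an answer, which is the property that fails for languages that are only recursively enumerable.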



Pumping Lemma for Regular Languages
Let A be a regular language. Then there exists an integer p ≥ 1, called the pumping
length, such that the following holds: Every string s in A, with |s| ≥ p, can
be written as s = xyz, such that
1. y ≠ ε (i.e., |y| ≥ 1),
2. |xy| ≤ p, and
3. for all i ≥ 0, x y^i z ∈ A.
In words, the pumping lemma states that by replacing the portion y in s
by zero or more copies of it, the resulting string is still in the language A.
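
The lemma is mostly used in the contrapositive, to show that a language is not regular. The sketch below tries every split s = xyz allowed by conditions 1 and 2 and checks condition 3 for a few values of i; the language {a^n b^n}, the bound max_i and the function names are assumptions made for this illustration, and checking finitely many i can refute a split but cannot prove a language regular.

```python
def in_anbn(w):
    """Membership test for { a^n b^n | n >= 0 }."""
    n = len(w) // 2
    return w == "a" * n + "b" * n


def pumpable(s, p, member, max_i=3):
    """True if some split s = xyz with |xy| <= p and |y| >= 1 keeps x y^i z in the
    language for every i in 0..max_i."""
    for xy_len in range(1, p + 1):
        for x_len in range(xy_len):
            x, y, z = s[:x_len], s[x_len:xy_len], s[xy_len:]
            if all(member(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False
```

For s = a^p b^p every allowed y lies inside the a's, so pumping changes the number of a's but not the number of b's and no split survives; this is the standard argument that {a^n b^n} is not regular.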

The Applications of these Automata are given as follows:


1. Finite Automata (FA) –
• For designing the lexical analysis phase of a compiler.
• For recognizing patterns using regular expressions.
• For designing combinational and sequential circuits using
Mealy and Moore machines.
• Used in text editors.
• For the implementation of spell checkers.
2. Push Down Automata (PDA) –
• For designing the parsing phase of a compiler (Syntax Analysis).
• For implementation of stack applications.
• For evaluating the arithmetic expressions.
• For solving the Tower of Hanoi Problem.
3. Linear Bounded Automata (LBA) –
• For implementation of genetic programming.
• For constructing syntactic parse trees for semantic analysis of the
compiler.
4. Turing Machine (TM) –
• For solving any recursively enumerable problem.
• For understanding complexity theory.
• For implementation of neural networks.
• For implementation of Robotics Applications.
• For implementation of artificial intelligence.

Expressive Power of various Automata:

The Expressive Power of any machine can be determined from the class or
set of Languages accepted by that particular type of Machine. Here is the
increasing sequence of expressive power of machines :

FA < DPDA < PDA < LBA < TM



Pumping Lemma for Context Free Languages
The term Pumping Lemma is made up of two words: pumping refers to generating many
strings by repeating (pumping) parts of an input string, and a lemma is an intermediate
theorem used in a proof.
The pumping lemma is a method to prove that certain languages are not context free.
The set of all context free languages is identical to the set of languages accepted by
pushdown automata.
Theorem:
If L is a context free language, then there is a constant n depending only on L such that,
if w ∈ L and |w| >= n, then w may be divided into five pieces w = uvxyz, satisfying the
following conditions.
• For all i >= 0, u v^i x y^i z ∈ L.
• |vy| >= 1
• |vxy| <= n
How to Apply Pumping Lemma?
We can apply the pumping lemma to prove that a given language is
not context free. The steps needed to prove that a given language is not context free are
given below:
Step 1: Assume L is a context free language; we will derive a contradiction. Let n be the
constant given by the pumping lemma.
Step 2: Now choose a string w ∈ L where |w| >= n. By the pumping lemma, we can write
w = uvxyz with |vy| >= 1 and |vxy| <= n.
Step 3: For every such decomposition, find a suitable i such that u v^i x y^i z ∉ L. This
contradicts our assumption, and it is proved that the given language is not context free.

Example 1:
Let L = { a^n b^n c^n | n >= 0 }. By using the pumping lemma, show that L is not a context free
language.
Solution:

Step 1: Assume L is a context free language; we will derive a contradiction. Let n be the
constant given by the pumping lemma.
Step 2: Let w = a^n b^n c^n, so |w| = 3n >= n. By the pumping lemma we can write w = uvxyz
with |vy| >= 1 and |vxy| <= n.
Step 3: We consider two cases.
• Case 1: v and y each contain only one type of alphabet symbol, for example both
contain only a's.

Here, uvxyz = a^n b^n c^n.

Let i = 2.
Then u v^2 x y^2 z has more a's pumped into the string, but the number of b's
remains the same, so the string is not in L. The same argument applies whenever v and y
each consist of a single symbol, because at most two of the three counts can grow.
• Case 2: v or y contains more than one type of alphabet symbol.
In every string of L the symbols appear in the order a's, then b's, then c's. The
two-letter substrings of different symbols that occur in a^n b^n c^n are ab and bc, but never
ba, ca, ac or cb.

Suppose v or y is a combination of a and b, or of b and c.

Then u v^2 x y^2 z may contain an equal number of the three alphabet symbols, but not in the
correct order. For example, if v = ab then v^2 = abab, which contains the substring ba.
Here not all b's follow the a's, or not all c's follow the b's. Hence the string cannot be a
member of L and a contradiction occurs. Both cases result in contradictions, so L is
not a context free language.
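
For small n the case analysis can be cross-checked by brute force: enumerate every decomposition w = uvxyz with |vy| >= 1 and |vxy| <= n and test whether any of them survives pumping. The sketch below does this for a few values of i only; the function names and the bound max_i are assumptions, and the check supports the proof for small n without replacing it.

```python
def in_anbncn(w):
    """Membership test for { a^n b^n c^n | n >= 0 }."""
    n = len(w) // 3
    return w == "a" * n + "b" * n + "c" * n


def some_split_survives(w, n, member, max_i=2):
    """True if some w = uvxyz with |vy| >= 1 and |vxy| <= n stays in the language
    for every i in 0..max_i."""
    for start in range(len(w) + 1):                            # vxy = w[start:end]
        for end in range(start + 1, min(start + n, len(w)) + 1):
            for v_end in range(start, end + 1):
                for y_start in range(v_end, end + 1):
                    u, v, x = w[:start], w[start:v_end], w[v_end:y_start]
                    y, z = w[y_start:end], w[end:]
                    if not v and not y:
                        continue                               # |vy| >= 1 is required
                    if all(member(u + v * i + x + y * i + z) for i in range(max_i + 1)):
                        return True
    return False
```

As a sanity check of the search itself, the context free language {a^n b^n} does have a surviving split: for w = aabb the choice v = a, x = ε, y = b works.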

Problem
Find out whether the language L = {x^n y^n z^n | n ≥ 1} is context free or not.
Solution
Assume L is context free. Then L must satisfy the pumping lemma.
Write 0, 1 and 2 for the symbols x, y and z, so that the pieces of the decomposition can be
named u, v, w, x, y without confusion.
First, let n be the constant of the pumping lemma. Then take the string s = 0^n 1^n 2^n.
Break s into uvwxy, where
|vwx| ≤ n and vx ≠ ε.
Hence vwx cannot involve both 0s and 2s, since the last 0 and the first 2 are at least
(n+1) positions apart. There are two cases −
Case 1 − vwx has no 2s. Then vx has only 0s and 1s. Then uwy, which would have
to be in L, has n 2s, but fewer than n 0s or fewer than n 1s.
Case 2 − vwx has no 0s. Then vx has only 1s and 2s, so uwy has n 0s, but fewer than
n 1s or fewer than n 2s.
In both cases a contradiction occurs.
Hence, L is not a context-free language.

Example 2:
Let L = { a^n b^n a^n | n > 0 }. By using the pumping lemma, show that L is not a context free language.
Solution:
Step 1: Assume L is a context free language; we will derive a contradiction. Let n be the
constant given by the pumping lemma.
Step 2: Let w = a^n b^n a^n, so |w| = 3n >= n. By the pumping lemma we can write w = uvxyz
with |vy| >= 1 and |vxy| <= n.
Step 3: We consider two cases:
Case 1: v and y each contain only one type of symbol, for example v = a^k and y = b^k, or
v = b^k and y = a^k, with k >= 1.
Let i = 2.
Then u v^2 x y^2 z = a^(n+k) b^(n+k) a^n or a^n b^(n+k) a^(n+k), which is not in L.
Case 2: v or y contains both a's and b's. All words of the form a^n b^n a^n have exactly one
occurrence of the substring ab and exactly one occurrence of ba, no matter what n is.
Let i = 2.
Then u v^2 x y^2 z will have more than one substring ab or more than one substring ba, so it
cannot be of the form a^n b^n a^n.
Hence, u v^2 x y^2 z ∉ L.
There is a contradiction in both cases, so L is not a context free language.



Chomsky Classification of Languages

The Chomsky hierarchy is a containment hierarchy of classes of formal grammars.


This hierarchy of grammars was described by Noam Chomsky in 1956. The Chomsky
hierarchy consists of the following levels:

Type 0 (Unrestricted grammars)
Type 1 (Context sensitive grammars)
Type 2 (Context free grammars)
Type 3 (Regular grammars)
This classification is based on the format of productions. Following diagram
shows the original Chomsky hierarchy.

Type 0 Unrestricted grammars (Recursively Enumerable Languages)


Unrestricted grammar is defined as G = (V, T, S, P)
where V = finite set of nonterminals
T = finite set of terminals
S = starting nonterminal, S ∈ V
P = the set of productions, which are of the form
α → β where α, β are members of (V U T)* and are arbitrary strings of the
grammar with α ≠ ε (empty string).
Unrestricted grammars do not put restrictions on the production rules.
The recognizer for type 0 languages is the Turing machine.
Languages generated by these grammars are called recursively enumerable
languages.
Eg: aBcd → acd
abAbcd → abABbcd
Ac → A
C → ε
Type 1 grammars (Context sensitive grammars)
A grammar is said to be context sensitive if all productions are of the form α → β
where α, β are members of (V U T)* and |α| <= |β|.
A production of the form ψAΦ → ψαΦ is called a type 1 production if α ≠ ε, i.e. in a type 1
production erasing of A is not permitted. In the above production A is a variable, ψ is
called the left context, Φ is the right context and α is the replacement string.
Examples
1. a A bcD → a bcD bcD is a type 1 production. Here a and bcD are the left and right
context respectively. A is replaced by bcD ≠ ε.
The recognizer for this type of grammar is the linear bounded automaton. It is a
non-deterministic Turing machine whose tape is restricted to the portion holding the input.
The language generated by this type of grammar is a context sensitive language.
The production S → ε is also allowed in a type 1 grammar, but in this case S
does not appear on the right hand side of any production.
Type 2 grammars (Context free grammars)
Every production is of the form A → γ with A ∈ V, i.e. a nonterminal, and γ ∈ (V U T)*,
i.e. a string of terminals and nonterminals. Here the LHS has no left context or right context,
i.e. for every production the LHS should be a single nonterminal.
Examples
1. S → Aa
2. A → a
3. B → abC
4. A → ε
The recognizer for this type of grammar is the pushdown automaton.



Application of CFL
Useful for scanning, parsing.
Context-free languages are the theoretical basis for the syntax of most
programming languages.
Type 3 grammars (Regular grammars)
A type 3 grammar restricts its rules to a single nonterminal on the left-hand side
and a right-hand side consisting of a single terminal, possibly followed (or preceded,
but not both in the same grammar) by a single nonterminal. A production of the form
A → a or A → aB, where A, B ∈ V and a ∈ T, is called type 3. The rule S → ε is also
allowed here if S does not appear on the right side of any rule.

These languages are exactly the languages that can be decided by a finite state
automaton. Additionally, this family of formal languages can be obtained by regular
expressions. Regular languages are commonly used to define search patterns and the
lexical structure of programming languages.
There are 2 types of regular grammars:
Right linear - productions are of the form A → aB / a
Left linear - productions are of the form A → Ba / a
The recognizer for regular languages is the finite automaton.
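
The production formats of the four types can be turned into a small classifier. The sketch below is a simplification under stated assumptions: nonterminals are single uppercase letters, terminals are single lowercase letters, ε is the empty string, and the special allowance for S → ε in type 1 and type 3 grammars is ignored. The function name grammar_type is hypothetical.

```python
def grammar_type(productions):
    """Return the most restrictive Chomsky type (3, 2, 1 or 0) whose production
    format is satisfied by every (lhs, rhs) pair."""

    def single_nonterminal(lhs):
        return len(lhs) == 1 and lhs.isupper()

    def right_linear(lhs, rhs):               # A -> a  or  A -> aB
        return single_nonterminal(lhs) and (
            (len(rhs) == 1 and rhs.islower())
            or (len(rhs) == 2 and rhs[0].islower() and rhs[1].isupper())
        )

    def left_linear(lhs, rhs):                # A -> a  or  A -> Ba
        return single_nonterminal(lhs) and (
            (len(rhs) == 1 and rhs.islower())
            or (len(rhs) == 2 and rhs[0].isupper() and rhs[1].islower())
        )

    # Right and left linear rules must not be mixed in the same grammar.
    if all(right_linear(l, r) for l, r in productions) or all(
        left_linear(l, r) for l, r in productions
    ):
        return 3
    if all(single_nonterminal(l) for l, _ in productions):
        return 2
    if all(0 < len(l) <= len(r) for l, r in productions):
        return 1                              # |α| <= |β| for every production
    if all(len(l) > 0 for l, _ in productions):
        return 0
    raise ValueError("the left-hand side of a production must not be empty")
```

Applied to the examples in this section, the type 0 productions (such as aBcd → acd) are classified as 0 because the right-hand side is shorter than the left, and the type 2 examples are classified as 2 because B → abC is neither right nor left linear.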
