
Regular Languages

Sequential Machine Theory



Prof. K. J. Hintz
Department of Electrical and Computer Engineering
Lecture 3
Comments, additions and modifications
by Marek Perkowski
Languages
Informal Languages
English
Body language
Bureaucratic conventions and procedures
Formal Languages
1) Rule-based
2) Elements are decidable
3) No deeper understanding required
Formal Language
All the Rules of the Language Are Explicitly
Stated in Terms of the Allowed Strings of
Symbols, e.g.,
Programming languages, e.g., C, Lisp, Ada
Military communications
Digital network protocols
Alphabets
Alphabet: a finite set of symbols, aka I, Σ
Roman: { a, b, c, ... , z }
Binary: { 0, 1 }
Greek: { α, β, γ, δ, ... }
Cyrillic: { а, б, в, г, д, ... }
String
String, word: a finite ordered sequence of
symbols from the alphabet, usually written
with no intervening punctuation
x_1 = the
x_2 = 010110
x_3, x_4 = words over non-Roman alphabets such as Greek or Cyrillic
String
Reverse of String
The sequence of symbols written backwards
$(x_1)^R = \text{"eht"}$
Reverse of Concatenation
Strings themselves must be reversed
$(x \cdot y)^R = y^R \cdot x^R$
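The reversal rule for concatenation can be checked on concrete strings. The sketch below is only an illustration: `reverse` is a hypothetical helper name, and Python's `+` on strings is assumed to play the role of the concatenation operator.

```python
def reverse(s: str) -> str:
    """Return s with its symbols written backwards (hypothetical helper)."""
    return s[::-1]

x = "the"
y = "010110"

# Reverse of a single string.
print(reverse(x))

# Reverse of a concatenation: each string is reversed and the order is swapped.
print(reverse(x + y) == reverse(y) + reverse(x))
```

Note that reversing each string without swapping the order, `reverse(x) + reverse(y)`, generally gives a different string.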
String
Length or Size of String
The number of symbols
$|x_1| = 3 \qquad |x_2| = 6$
$|x_3| = 4 \qquad |x_4| = 9$
$|x_3 \cdot x_4| = 13$
Strings
Null String, Empty String, ε, Λ
A string of length or size zero
The symbols ε or Λ, meant to denote the null
string, are not allowed to be symbols of the
alphabet
Substring
A String, v, Is a Substring of a String, w, iff
There Are Strings x and y Such That
w = x v y
x is called the prefix
y is called the suffix
x and/or y could be Λ
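The definition w = x v y can be turned into a small search for every prefix/suffix pair. This is a minimal sketch, assuming Python strings as words; `decompositions` is a hypothetical name, not part of the lecture.

```python
def decompositions(w: str, v: str) -> list[tuple[str, str]]:
    """All pairs (x, y) with w == x + v + y (hypothetical helper).

    An empty result means v is not a substring of w.
    """
    return [
        (w[:i], w[i + len(v):])
        for i in range(len(w) - len(v) + 1)
        if w[i:i + len(v)] == v
    ]

# "ana" occurs twice in "banana", so there are two (prefix, suffix) pairs.
print(decompositions("banana", "ana"))
```

When the returned prefix or suffix is the empty Python string `""`, it corresponds to the null string in the definition above.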
Kleene Closure
Set of All Strings, Σ*, I*
Order IS important
Not the same as P(Σ), the powerset of the
alphabet, since order is NOT important in the
powerset
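The difference between the set of all strings and the powerset can be seen by enumerating both for a small alphabet. The sketch below is illustrative only: the function names are hypothetical, and the set of all strings is cut off at a maximum length because it is infinite.

```python
from itertools import combinations, product

def strings_up_to(alphabet, max_len):
    """All strings over `alphabet` of length 0..max_len, shortest first."""
    symbols = sorted(alphabet)
    return ["".join(t) for n in range(max_len + 1)
            for t in product(symbols, repeat=n)]

def powerset(alphabet):
    """All subsets of `alphabet`; order inside a subset does not matter."""
    symbols = sorted(alphabet)
    return [set(c) for r in range(len(symbols) + 1)
            for c in combinations(symbols, r)]

binary = {"0", "1"}
print(strings_up_to(binary, 2))  # "01" and "10" are different strings
print(powerset(binary))          # {"0", "1"} appears only once
```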
Concatenation Operator
If x, y ∈ I*, then the concatenation of x and
y is written as
z = x · y
e.g., if
x = Red | x | = 3
y = skins | y | = 5
z = x · y = Redskins | z | = 8
Concatenation Operator
Concatenation of Any String With the Null
String Results in the Original String
x · ε = ε · x = x
x · Λ = Λ · x = x
Concatenation is Associative
x = abc y = def z = ghi
( x · y ) · z = x · ( y · z )
abcdefghi = abcdefghi
Language
Language, L: Any Subset of the Set of All
Strings of an Alphabet

L ⊆ I*
L ⊆ Σ*
[Figure: the set I* containing the languages L_1 and L_2]
Classes of Languages
Enumerated Languages
Defined by a List of All Words in the Language
L_e = { quidditch, nimbus 2000, ... }
not very interesting
Rule-based
Defined by Properties or a Set of Rules
L_r = { w ∈ I* : w has the property P }
Rule Based Languages
A Test to Determine Whether a String Is a
Member of a Language
A Means of Constructing Strings That Are
in the Language
Must be able to construct ALL strings in the
language
Must be able to construct ONLY strings in the
language
Rule-Based Language Example
Let I = { a, b }
A Language That Consists of All Two
Letter Strings
L = { aa, ab, ba, bb }
Λ is not an element of the language
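The two ingredients of a rule-based language, a membership test and a way of constructing exactly the strings of the language, can be sketched for the two-letter example. The names `is_member` and `construct` are hypothetical helpers chosen for this illustration.

```python
from itertools import product

ALPHABET = ("a", "b")  # the alphabet I of the example

def is_member(w: str) -> bool:
    """Membership test: w is a two-letter string over the alphabet."""
    return len(w) == 2 and all(c in ALPHABET for c in w)

def construct() -> list[str]:
    """Constructor: builds all strings of the language, and only those."""
    return ["".join(t) for t in product(ALPHABET, repeat=2)]

words = construct()
print(words)
# Every constructed string passes the test; the null string does not.
print(all(is_member(w) for w in words), is_member(""))
```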
Empty Language
Null Language, Empty Language, ∅: The
Language With No Words in It
Not the same as Λ
∅ can be made into a language with words

A language consisting only of Λ is still a
language
L = ∅ ∪ { Λ }
Kleene Star
If L ⊆ I* is a language, then
L* Is the Set of All Strings Obtained by
Concatenating Zero or More Strings of L.
Concatenation of Zero Strings Is Λ
Concatenation of One String Is the String
Itself
L^+ = L* - { Λ }, provided Λ is not itself in L
Kleene Closure Example
L = { 0, 1 }
L* = { Λ,
0, 00, 000, ... (all of 0*),
1, 11, 111, ... (all of 1*),
01, 001, 0001, ... (all of 0*1),
... }

Kleene Closure Examples
L = { ab, f }
L* = { Λ,
ab, abf, fab, ffab, ffabf, ... }
∅* = { Λ }
if L = { Λ }
then L* = { Λ }
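The Kleene star of a language can be explored by repeatedly concatenating words of L, starting from the null string. The sketch below is an illustration under stated assumptions: languages are Python sets of strings, `""` stands for the null string, and `kleene_star` (a hypothetical name) stops at a length bound because L* is usually infinite.

```python
def kleene_star(language, max_len):
    """Strings of L* whose length is at most max_len."""
    result = {""}      # concatenation of zero strings is the null string
    frontier = {""}    # strings found in the previous round
    while frontier:
        # Append one more word of L to each newly found string.
        frontier = {u + w for u in frontier for w in language
                    if len(u + w) <= max_len} - result
        result |= frontier
    return result

print(sorted(kleene_star({"ab", "f"}, 3), key=lambda s: (len(s), s)))
print(kleene_star(set(), 3))   # star of the empty language
print(kleene_star({""}, 3))    # star of the language holding only the null string
```

The loop terminates because each round can only add strings that are new and no longer than `max_len`.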

Kleene Closure Examples
Let I = { a, b }
L = Language ( ( ab )* )
{ Λ, ab, abab, ababab, ... }
which is not the same as
L = Language ( a* b* )
{ Λ, a, b, ab, aab, abb, ... }
The language of all strings of a's and b's in
which the a's, if any, come before the b's
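The difference between ( ab )* and a* b* can be tried out with Python's `re` module, whose `*` operator and parentheses match the notation used here. This is a small sketch; `in_language` is a hypothetical helper, and `fullmatch` is used so that the whole string must belong to the language.

```python
import re

AB_STAR = re.compile(r"(ab)*")        # Language( (ab)* )
A_STAR_B_STAR = re.compile(r"a*b*")   # Language( a* b* )

def in_language(pattern: re.Pattern, w: str) -> bool:
    """True if the whole string w matches the pattern."""
    return pattern.fullmatch(w) is not None

for w in ["", "ab", "abab", "aab", "abb", "ba"]:
    print(repr(w), in_language(AB_STAR, w), in_language(A_STAR_B_STAR, w))
```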

Recursive Language
Definition
Variation of Rule-Based
Three-step Process
1. Specify some basic elements of the set
2. Specify the rules for forming new elements
from old elements of the set
3. Specify that elements not in 1 or 2 above are
NOT elements of the set
Recursive Example
Two Equivalent Recursive Definitions of
Rational Numbers
Rational #1: we define the set Rat_1 of
rational numbers
1. Rat_1 = { ..., -3, -2, -1, 1, 2, 3, ... }
2. if p, q ∈ Rat_1, then p/q ∈ Rat_1
3. the only rational numbers are those generated by 1
and 2 above.
Recursive Example
Rational #2: we generate the set of rational numbers
with different rules, but this is the same set.
1. Rat_2 = { -1, +1 }
2. if p, q ∈ Rat_2, p, q ≠ 0, then (p+q)/p ∈ Rat_2
3. the only rational numbers are those generated by 1 and 2
above.
e.g., taking p = 1,
$$\frac{1+1}{1} = 2, \qquad \frac{2+1}{1} = 3, \qquad \ldots, \qquad \frac{(n-1)+1}{1} = n$$
generates all positive integers; negative integers are generated similarly. Now we can
generate any rational number
Example. To create 2/3 take p = 3, then take p + q = 2,
thus q = -1, which is a negative integer, OK
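The second recursive definition can be run for a few rounds to watch the set grow. The sketch below is an illustration, assuming exact arithmetic with Python's `fractions.Fraction`; `rat2` is a hypothetical name, and the number of rounds is kept small because the set grows quickly.

```python
from fractions import Fraction

def rat2(rounds: int) -> set:
    """Apply rule 2, (p + q) / p, `rounds` times starting from { -1, +1 }."""
    numbers = {Fraction(-1), Fraction(1)}
    for _ in range(rounds):
        # Rule 2 is applied only to nonzero p and q, as in the definition.
        numbers |= {(p + q) / p for p in numbers for q in numbers
                    if p != 0 and q != 0}
    return numbers

# 3 appears after two rounds (p = 1, q = 2), and 2/3 after three (p = 3, q = -1).
print(Fraction(3) in rat2(2), Fraction(2, 3) in rat2(3))
```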
Interest in Recursive Definitions
Recursive definitions allow us to prove some
Statements About What Is Computable.
Recursive definitions lead to Proof by
Induction
Principle of Mathematical
Induction*
Let A Be a Set of Natural Numbers Such That
0 ∈ A, and
for each natural number, n,
{ 0, 1, ..., n } ⊆ A
implies (n + 1) ∈ A
then A = N
* Lewis & Papadimitriou, pg. 24
Mathematical Induction
In practice, mathematical induction is used
to prove assertions of the form

For all natural numbers, n,
property P is true
Mathematical Induction Practice
To prove that the set
A = { n : P is true of n } equals N, use three steps

1. Basis Step: show that 0 ∈ A,
i.e., P is true of n = 0
2. Induction Hypothesis: assume that for some
arbitrary, but fixed n ≥ 0, P holds for each
natural number 0, 1, ... , n
Mathematical Induction Practice
3. Induction Step: use the induction hypothesis
(that P is true of 0, 1, ... , n) to show that P is true of
(n + 1)

By the Induction Principle, Then A=N and
Hence, P Is True of Every Natural Number.
Induction Example*
* Lewis & Papadimitriou, pg. 25



Show that for any $n \ge 0$,
$$1 + 2 + \cdots + n = \frac{n^2 + n}{2}$$
1. Basis Step
$$0 = \frac{0^2 + 0}{2} = 0, \qquad \text{true for } n = 0$$
Induction Example
2. Induction Hypothesis
Assume that for some $n \ge 0$,
$$1 + 2 + \cdots + m = \frac{m^2 + m}{2} \qquad \text{when } m \le n$$
We simply assume that the rule is true for every m up to and including n
Induction Example
3. Induction Step
$$\begin{aligned}
1 + 2 + \cdots + n + (n + 1) &= (1 + 2 + \cdots + n) + (n + 1) \\
&= \frac{n^2 + n}{2} + (n + 1) \\
&= \frac{n^2 + n + 2n + 2}{2}
\end{aligned}$$
where $1 + 2 + \cdots + n$ is replaced by $\frac{n^2 + n}{2}$ from the
induction hypothesis
Induction Example
$$1 + 2 + \cdots + n + (n + 1) = \frac{n^2 + 2n + 1 + n + 1}{2} = \frac{(n + 1)^2 + (n + 1)}{2}$$
which shows that the formula is true for $n + 1$ whenever it is true
for $n$; since it is true for $n = 0$, it is true for every $n \ge 0$
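The identity 1 + 2 + ... + n = (n^2 + n)/2 proved in this example can also be spot-checked numerically. The sketch below is only a sanity check over finitely many values, not a substitute for the induction proof; the function names are hypothetical.

```python
def closed_form(n: int) -> int:
    """The right-hand side of the identity, (n^2 + n) / 2."""
    return (n * n + n) // 2

def formula_holds_up_to(limit: int) -> bool:
    """Compare the running sum 0 + 1 + ... + n with the closed form."""
    running_sum = 0
    for n in range(limit + 1):
        running_sum += n           # mirrors the step from n - 1 to n
        if running_sum != closed_form(n):
            return False
    return True

print(formula_holds_up_to(1000))
```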
Another Induction Example
Define set EVEN as
1. 0 is in EVEN
2. if x ∈ EVEN then so is x + 2
3. The only elements of EVEN are those
produced by 1 & 2 above.
Prove by induction that all elements of
EVEN end in either 0, 2, 4, 6, or 8.
Induction Example (cont)
Proof
1. Basis Step
0 ∈ EVEN by definition, therefore the property is
true of the zeroth step since 0 ∈ { 0, 2, 4, 6, 8 }
2. Induction Hypothesis
Assume that the last digit of
(m+2) e { 0, 2, 4, 6, 8 } for 0 < m < n
Induction Example
3. Induction Step
Step n      Element of EVEN
0           2
1           4
2           6
3           8
4           10
...         ...
n           2n + 2
n + 1       (2n + 2) + 2 = 2(n + 1) + 2
By step 2, 2n + 2 ends in one of { 0, 2, 4, 6, 8 }.
Adding 2 to each possible last digit gives
0+2=2, 2+2=4, 4+2=6, 6+2=8, 8+2=10 (last digit 0),
all of which are in { 0, 2, 4, 6, 8 }.
Thus if for n it ends with 0, 2, 4, 6, or 8 then for
n+1 it also ends with 0, 2, 4, 6, or 8
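The recursive definition of EVEN can be executed directly to generate its first elements and inspect their last digits. This is a finite check that illustrates the claim, not a proof; `even_elements` is a hypothetical name.

```python
def even_elements(steps: int) -> list[int]:
    """Start from 0 (rule 1) and apply x -> x + 2 (rule 2) `steps` times."""
    elements = [0]
    for _ in range(steps):
        elements.append(elements[-1] + 2)
    return elements

LAST_DIGITS = {0, 2, 4, 6, 8}
generated = even_elements(50)
# Every generated element ends in one of the allowed digits.
print(all(x % 10 in LAST_DIGITS for x in generated))
```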
Regular Expressions
Shorthand Notation for Concisely
Expressing Languages
Defined Recursively
Lead to a Definition of Regular Languages
Provide Finite Representation of Possibly
Infinite Languages
Lead to Lexical Analyzers
Regular Expressions Notation

Expression      Language
a               { a }
a ∪ b           { a, b }
a^+             { a }^+
a*              { a }*
with operator precedence, from highest to lowest, being
Kleene Star (*), Concatenation (·), Set Union (∪ or +)
Regular Expressions Over I
∅ and Λ are regular expressions
a is a regular expression for each a ∈ I
If r and s are regular expressions, then so
are r ∪ s, r · s, and r*
No other sequences of symbols are regular
expressions
Regular Expressions Alternative
1. L( Λ ) = { Λ }
L( a ) = { a }
If p and q are regular expressions, then
2. L( pq ) = L( p ) L( q )
3. L( p ∪ q ) = L( p ) ∪ L( q )
4. L( p* ) = L( p )*
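Rules 1 to 4 read almost like a recursive program that maps an expression to its language. The sketch below makes that concrete under several assumptions that are not in the lecture: expressions are written as nested Python tuples, `""` stands for the null string, and every language is truncated at a maximum length so that the starred case stays finite.

```python
def lang(r, max_len):
    """Strings of L(r) of length <= max_len, for a tuple-encoded expression r."""
    kind = r[0]
    if kind == "null":                      # rule 1: L(null string)
        return {""}
    if kind == "sym":                       # rule 1: L(a) = { a }
        return {r[1]} if max_len >= 1 else set()
    if kind == "cat":                       # rule 2: L(pq) = L(p) L(q)
        return {u + v for u in lang(r[1], max_len)
                for v in lang(r[2], max_len) if len(u + v) <= max_len}
    if kind == "union":                     # rule 3: union of the two languages
        return lang(r[1], max_len) | lang(r[2], max_len)
    if kind == "star":                      # rule 4: L(p*) = L(p)*
        base = lang(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")

# (a union b)* a, encoded as nested tuples.
expr = ("cat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a"))
print(sorted(lang(expr, 2)))
```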
Regular expressions
Regular Expressions Example
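What is L( ( a ∪ b )* a ) ?
$$\begin{aligned}
L((a \cup b)^* a) &= L((a \cup b)^*)\, L(a) && \text{by 2} \\
&= L((a \cup b)^*)\, \{a\} && \text{by 1} \\
&= L(a \cup b)^*\, \{a\} && \text{by 4} \\
&= (L(a) \cup L(b))^*\, \{a\} && \text{by 3} \\
&= (\{a\} \cup \{b\})^*\, \{a\} && \text{by 1, 1} \\
&= \{a, b\}^*\, \{a\} && \text{by definition} \\
&= \{\, w \in \{a, b\}^* : w \text{ ends with } a \,\}
\end{aligned}$$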
Observe that we do everything by completely formal transformations
between expressions representing languages and sets (languages)
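The result of the derivation, that the language of ( a ∪ b )* a is the set of strings over { a, b } ending with a, can be cross-checked by brute force. The sketch below assumes Python's `re` syntax, where union is written `|`; the function name is hypothetical, and the check only covers strings up to a chosen length.

```python
import re
from itertools import product

PATTERN = re.compile(r"(a|b)*a")   # (a union b)* a in Python regex syntax

def agrees_up_to(max_len: int) -> bool:
    """Compare regex membership with 'ends with a' for all short strings."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if (PATTERN.fullmatch(w) is not None) != w.endswith("a"):
                return False
    return True

print(agrees_up_to(8))
```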
Regular Expressions
Boolean OR Distributes over Concatenation
L = language( ( a + bc ) c* b )
= language( a c* b + b c c* b )
which is the language of all strings beginning
with a, ending with b, and having none or more
c's in the middle, and,
all strings beginning and ending with b and
having at least one c in the middle
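The distributed and undistributed forms, ( a + bc ) c* b and a c* b + b c c* b, can be compared on all short strings. This is an empirical check up to a length bound, not a proof of the identity; the names are hypothetical and the `+` of the slides is written `|` in Python regex syntax.

```python
import re
from itertools import product

LEFT = re.compile(r"(a|bc)c*b")      # ( a + bc ) c* b
RIGHT = re.compile(r"ac*b|bcc*b")    # a c* b + b c c* b

def agree_up_to(max_len: int) -> bool:
    """True if both expressions accept the same strings of length <= max_len."""
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            w = "".join(letters)
            if bool(LEFT.fullmatch(w)) != bool(RIGHT.fullmatch(w)):
                return False
    return True

print(agree_up_to(6))
```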
Regular Expressions
The Boolean OR Operator Can Distribute
When It Is Inside a Kleene Starred
Expression, but Only in Certain Ways
( ) ( )
( )( )( )
( )
( )
L = +
= + + +
= +
= +
language a bc b
a bc a bc a bc b
a b bc b
ab bcb
*
* *
*

Be very careful and do not invent or
guess identities for transformations,
use only those that were proven and
given here
Regular Expressions
Useful String
( a + b )* = the set of all strings of a's and b's of
any length
L = Language ( ( a + b )* )
{ Λ, a, b, ba, ab, abab, abaab, abbaab,
babba, bbb, ... }
Regular Languages
If L ⊆ I* is finite, then L is regular.
If L_1 and L_2 are regular, so are
L_3 = L_1 ∪ L_2
L_4 = L_1 · L_2 = { x_1 · x_2 | x_1 ∈ L_1, x_2 ∈ L_2 }
If L is regular, then so is L*, where * is the
Kleene Star
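For finite languages, the union and concatenation of two languages can be computed directly from their definitions. The sketch below represents languages as Python sets of strings; `union` and `concat` are hypothetical helper names used only for illustration.

```python
def union(l1: set, l2: set) -> set:
    """All words that are in l1 or in l2."""
    return l1 | l2

def concat(l1: set, l2: set) -> set:
    """{ x1 x2 : x1 in l1, x2 in l2 }, built pair by pair."""
    return {x1 + x2 for x1 in l1 for x2 in l2}

L1 = {"a", "ab"}
L2 = {"b", ""}          # "" stands for the null string
print(sorted(union(L1, L2)))
print(sorted(concat(L1, L2)))
```

Different pairs can produce the same word (here "a" + "b" and "ab" + ""), so the concatenation can have fewer words than the number of pairs.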
Regular Languages
If L Is a Finite Language, Then L Can Be
Defined by a Regular Expression.
The Converse Is Not True. That Is, Not All
Regular Expressions Represent Finite
Languages.
L = Language( ( a + b )* ) Is Infinite Yet
Regular
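One way to see why every finite language has a regular expression is to take the union of its words. The sketch below builds such an expression in Python's `re` syntax; the helper names are hypothetical, `re.escape` protects characters that are special in Python regexes, and the never-matching pattern `(?!)` is an assumption used here to stand for the empty language.

```python
import re

def finite_language_regex(words: set) -> str:
    """A Python regex whose full matches are exactly the given words."""
    if not words:
        return r"(?!)"   # empty language: a pattern that never matches
    return "|".join(re.escape(w) for w in sorted(words))

def accepts(words: set, w: str) -> bool:
    """True if w is in the language described by the generated regex."""
    return re.fullmatch(finite_language_regex(words), w) is not None

two_letter = {"aa", "ab", "ba", "bb"}
print(finite_language_regex(two_letter))
print(accepts(two_letter, "ab"), accepts(two_letter, "abb"))
```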
Typical Homework
Typical homework in this area may include the following:
1. Converting an arbitrary regular expression to a graph and then
converting this graph to an NDFA.
2. Creating a deterministic or non-deterministic stack-based
automaton for a language such as deterministic or
non-deterministic palindromes, or a language similar to
{ a^n b^n | n = 0, 1, 2, ... }
3. Example: Find a deterministic state machine for the following
language: an even number of zeros after an odd number of ones, or an odd
number of zeros that follow an even number of ones
4. You should be able to convert among a regular expression,
non-deterministic and deterministic automata for this expression, and
a corresponding regular grammar in any direction; for instance,
you may start from a regular grammar and write a regular
expression, or start from a non-deterministic automaton and
write the set of rules for the grammar of the language that this
automaton accepts.
