Regular Expressions
Deepak D'Souza
Department of Computer Science and Automation
Indian Institute of Science, Bangalore.
18 August 2014
Regular Expressions | Kleene's Theorem | Equation-based alternate construction
Outline
1. Regular Expressions
2. Kleene's Theorem
3. Equation-based alternate construction
Examples of Regular Expressions
Expressions built from $a$, $b$, $\epsilon$, using operators $+$, $\cdot$, and ${}^*$.
$(a^* + b^*) \cdot c$
  Strings of only $a$'s or only $b$'s, followed by a $c$.
$(a + b)^* abb (a + b)^*$
$(a + b)^* b (a + b)(a + b)$
  3rd last letter is a $b$.
$(b^* a b^* a)^*$
$(\epsilon + 0 + 1 + \cdots + 111)$
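The two examples that come with a description can be checked mechanically. The sketch below is an illustration, not part of the original slides. It assumes the textbook operators map to Python's `re` syntax as `+` to `|` and `·` to juxtaposition. The helper names `words` and `language` are made up for this example.

```python
import re
from itertools import product

# Textbook notation translated to Python `re` syntax (assumed mapping):
# '+' becomes '|', the concatenation dot is dropped, '*' stays '*'.
ONLY_AS_OR_BS_THEN_C = re.compile(r"(a*|b*)c")
THIRD_LAST_IS_B = re.compile(r"(a|b)*b(a|b)(a|b)")

def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def language(pattern, alphabet, max_len):
    """Strings of length <= max_len that the pattern matches in full."""
    return {w for w in words(alphabet, max_len) if pattern.fullmatch(w)}

if __name__ == "__main__":
    # Words of length <= 3 denoted by (a* + b*) . c
    print(sorted(language(ONLY_AS_OR_BS_THEN_C, "abc", 3), key=lambda w: (len(w), w)))
    # Every match of the second pattern has a 'b' in the third-last position.
    print(all(w[-3] == "b" for w in language(THIRD_LAST_IS_B, "ab", 6)))
```

Using `fullmatch` rather than `search` matters here: a regular expression in the slides denotes a set of whole strings, not a substring test.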
Formal definitions
Syntax of regular expressions over an alphabet $A$:
$$r ::= \emptyset \mid a \mid r + r \mid r \cdot r \mid r^*$$
where $a \in A$.
Semantics: associate a language $L(r) \subseteq A^*$ with regexp $r$.
$L(\emptyset) = \{\}$
$L(a) = \{a\}$
$L(r + r') = L(r) \cup L(r')$
$L(r \cdot r') = L(r) \cdot L(r')$
$L(r^*) = L(r)^*$.
Question: Do we need $\epsilon$ in syntax?
No. $\epsilon \equiv \emptyset^*$.
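The semantics can be turned into a small evaluator. The sketch below is an illustration under assumptions: the class names `Empty`, `Sym`, `Alt`, `Cat`, `Star` and the function `lang` are invented here, and each language is truncated to words of length at most `n` so that the sets stay finite.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Empty:          # the regexp for the empty set
    pass

@dataclass(frozen=True)
class Sym:            # a single letter a
    letter: str

@dataclass(frozen=True)
class Alt:            # r + r'
    left: object
    right: object

@dataclass(frozen=True)
class Cat:            # r . r'
    left: object
    right: object

@dataclass(frozen=True)
class Star:           # r*
    inner: object

def concat(xs, ys, n):
    """Concatenation of two languages, keeping words of length <= n."""
    return {x + y for x in xs for y in ys if len(x) + len(y) <= n}

def lang(r, n):
    """L(r) restricted to words of length at most n."""
    if isinstance(r, Empty):
        return set()
    if isinstance(r, Sym):
        return {r.letter} if n >= 1 else set()
    if isinstance(r, Alt):
        return lang(r.left, n) | lang(r.right, n)
    if isinstance(r, Cat):
        return concat(lang(r.left, n), lang(r.right, n), n)
    if isinstance(r, Star):
        base = lang(r.inner, n)
        result = {""}            # zero repetitions always give the empty word
        frontier = {""}
        while frontier:          # add one more repetition until nothing new appears
            frontier = concat(frontier, base, n) - result
            result |= frontier
        return result
    raise TypeError(f"not a regexp: {r!r}")
```

For instance, `lang(Star(Empty()), 5)` evaluates to the set containing only the empty word, which is the reason $\epsilon$ is not needed in the syntax.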
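Example: Semantics of regexp
Parse tree of $(a^* + b^*) \cdot c$, with the language denoted at each node:
$a$: $\{a\}$
$b$: $\{b\}$
$c$: $\{c\}$
$a^*$: $\{\epsilon, a, aa, \ldots\}$
$b^*$: $\{\epsilon, b, bb, \ldots\}$
$a^* + b^*$: $\{\epsilon, a, b, aa, bb, \ldots\}$
$(a^* + b^*) \cdot c$: $\{c, ac, bc, aac, bbc, \ldots\}$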
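DFA → RE: Kleene's construction
For a DFA $\mathcal{A}$ with states $Q$, start state $s$, and final states $F$, define $L_{pq} = \{w \in A^* \mid \hat{\delta}(p, w) = q\}$.
Then $L(\mathcal{A}) = \bigcup_{f \in F} L_{sf}$.
For $X \subseteq Q$, define
$$L^X_{pq} = \{w \in A^* \mid \hat{\delta}(p, w) = q \text{ via a path that stays in } X \text{ except for first and last states}\}.$$
[Figure: a path from $p$ to $q$ whose intermediate states all lie in $X$.]
Then $L(\mathcal{A}) = \bigcup_{f \in F} L^Q_{sf}$.
[Figure: a path from $p$ to $q$ that passes through a state $r$, with all other intermediate states in $X$.]
Advantage:
$$L^{X \cup \{r\}}_{pq} = L^X_{pq} + L^X_{pr} \cdot (L^X_{rr})^* \cdot L^X_{rq}.$$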
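One way to see why this identity holds is the usual path-decomposition argument, sketched here with symbols $u$, $v_i$, $z$, and $k$ that are introduced only for this illustration. A path from $p$ to $q$ whose intermediate states lie in $X \cup \{r\}$ either never visits $r$ in between, or can be cut at each intermediate visit to $r$:

$$w \in L^{X \cup \{r\}}_{pq} \iff w \in L^X_{pq} \ \text{ or } \ w = u\, v_1 \cdots v_k\, z \ \text{ with } u \in L^X_{pr},\ v_i \in L^X_{rr},\ z \in L^X_{rq},\ k \ge 0.$$

The first case is the term $L^X_{pq}$. In the second case the pieces $v_1 \cdots v_k$ range over $(L^X_{rr})^*$, which gives the term $L^X_{pr} \cdot (L^X_{rr})^* \cdot L^X_{rq}$.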
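DFA → RE: Kleene's construction (2)
Method:
Begin with $L^Q_{sf}$ for each $f \in F$.
Simplify by using terms with strictly smaller $X$'s:
$$L^{X \cup \{r\}}_{pq} = L^X_{pq} + L^X_{pr} \cdot (L^X_{rr})^* \cdot L^X_{rq}.$$
For base terms, observe that
$$L^{\emptyset}_{pq} = \begin{cases} \{a \mid \delta(p, a) = q\} & \text{if } p \neq q \\ \{a \mid \delta(p, a) = q\} \cup \{\epsilon\} & \text{if } p = q. \end{cases}$$
Exercise: convert NFA/DFAs below to REs:
[Figure: two automata. The first has states $s$ and $f$, with edges labelled $b$ and $a, b$. The second has states $e$ and $o$, with edges labelled $a$ and $b$.]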
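The method can be sketched as a short recursive program. This is an illustration under assumptions, not the lecture's own code. The transition table `DELTA` is a guess at the $e$/$o$ automaton: `b` loops on both states, `a` switches between them, with `e` as start state and `o` as final state. Each term is a pair holding the expression as text and its language truncated to words of length at most `N`, so the result can be checked without a regex engine.

```python
N = 6  # languages are compared only on words of length <= N

def cat_words(xs, ys):
    """Concatenation of two languages, truncated to length N."""
    return frozenset(x + y for x in xs for y in ys if len(x) + len(y) <= N)

def star_words(xs):
    """Kleene star of a language, truncated to length N."""
    result = frontier = frozenset([""])
    while frontier:
        frontier = cat_words(frontier, xs) - result
        result |= frontier
    return result

# A term is a pair (text, words): the regexp as text and its language up to N.
EMPTY = ("∅", frozenset())
EPS = ("ε", frozenset([""]))

def alt(r1, r2):
    if r1 == EMPTY:
        return r2
    if r2 == EMPTY:
        return r1
    return (f"({r1[0]} + {r2[0]})", r1[1] | r2[1])

def cat(r1, r2):
    if EMPTY in (r1, r2):
        return EMPTY
    if r1 == EPS:
        return r2
    if r2 == EPS:
        return r1
    return (r1[0] + r2[0], cat_words(r1[1], r2[1]))

def star(r):
    if r in (EMPTY, EPS):
        return EPS
    return (f"({r[0]})*", star_words(r[1]))

def kleene(delta, alphabet, p, q, X):
    """Term for L^X_{pq}; X is a tuple of states, removed one at a time."""
    if not X:
        # Base term: single letters from p to q, plus the empty word when p == q.
        term = EPS if p == q else EMPTY
        for a in alphabet:
            if delta[(p, a)] == q:
                term = alt(term, (a, frozenset([a])))
        return term
    r, Y = X[0], X[1:]
    through_r = cat(kleene(delta, alphabet, p, r, Y),
                    cat(star(kleene(delta, alphabet, r, r, Y)),
                        kleene(delta, alphabet, r, q, Y)))
    return alt(kleene(delta, alphabet, p, q, Y), through_r)

# Assumed transition table for the e/o automaton (hypothetical reading of the figure).
DELTA = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}

if __name__ == "__main__":
    text, words = kleene(DELTA, "ab", "e", "o", ("e", "o"))
    print(text)
    print(all(w.count("a") % 2 == 1 for w in words))
```

The expression printed this way is correct but unsimplified. The recursion makes four calls per eliminated state, so the expression can grow roughly by a factor of four with each state.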
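DFA → RE using system of equations
Aim: to construct a regexp for
$$L_q = \{w \in A^* \mid \hat{\delta}(q, w) \in F\}.$$
Note that $L(\mathcal{A}) = L_s$.
Example:
[Figure: automaton with states $e$ and $o$ and edges labelled $a$ and $b$.]
Set up equations to capture the $L_q$'s:
$$x_e = b \cdot x_e + a \cdot x_o$$
$$x_o = a \cdot x_e + b \cdot x_o + \epsilon.$$
Solution is a RE for each $x$, such that languages denoted by LHS and RHS REs coincide.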
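That the languages $L_e$ and $L_o$ satisfy these two equations can be checked on short words. The sketch below is an illustration under assumptions: the transition table `DELTA` and final-state set `FINAL` are read off the equations ($o$ is final because its equation contains $\epsilon$), and the helper names `run`, `L`, `prefix`, and `equations_hold` are invented here.

```python
from itertools import product

ALPHABET = "ab"
# Transitions read off the equations: x_e = b x_e + a x_o, x_o = a x_e + b x_o + eps.
DELTA = {("e", "a"): "o", ("e", "b"): "e", ("o", "a"): "e", ("o", "b"): "o"}
FINAL = {"o"}

def run(q, w):
    """Extended transition function: the state reached from q on word w."""
    for a in w:
        q = DELTA[(q, a)]
    return q

def L(q, n):
    """L_q restricted to words of length at most n."""
    return {"".join(w)
            for k in range(n + 1)
            for w in product(ALPHABET, repeat=k)
            if run(q, w) in FINAL}

def prefix(a, lang, n):
    """The language a . lang, keeping words of length at most n."""
    return {a + w for w in lang if len(w) + 1 <= n}

def equations_hold(n):
    """Check both equations with x_e = L_e and x_o = L_o, on words of length <= n."""
    x_e, x_o = L("e", n), L("o", n)
    rhs_e = prefix("b", x_e, n) | prefix("a", x_o, n)
    rhs_o = prefix("a", x_e, n) | prefix("b", x_o, n) | {""}
    return x_e == rhs_e and x_o == rhs_o

if __name__ == "__main__":
    print(equations_hold(6))
```

A bounded check like this is evidence rather than proof, but it mirrors the reasoning: a word accepted from $q$ is either empty (when $q$ is final) or a letter followed by a word accepted from the next state.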
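Solutions to a system of equations
The $L_q$'s are a solution to the system of equations.
In general there could be many solutions to equations.
Consider $x = A^* x$.
The example system in matrix form:
$$\begin{pmatrix} x_e \\ x_o \end{pmatrix} = \begin{pmatrix} b & a \\ a & b \end{pmatrix} \begin{pmatrix} x_e \\ x_o \end{pmatrix} + \begin{pmatrix} \emptyset \\ \epsilon \end{pmatrix}$$
Star of a $2 \times 2$ matrix:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^* = \begin{pmatrix} (a + bd^*c)^* & (a + bd^*c)^* bd^* \\ (d + ca^*b)^* ca^* & (d + ca^*b)^* \end{pmatrix}$$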
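Both points on this slide can be tried out on short words. The sketch below is an illustration under assumptions. It takes $A = \{a, b\}$. It also assumes the matrix-star formula is meant to be applied to the $e$/$o$ system with constant vector $(\emptyset, \epsilon)$, so that $x_e$ and $x_o$ are the right-hand column of the star. The function names are invented here, and expressions are written in Python `re` syntax.

```python
import re
from itertools import product

def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def solves_x_eq_astar_x(x, n):
    """Check x = A* x on words of length <= n, for A = {a, b}.

    `x` must already contain only words of length <= n.
    """
    a_star = set(words("ab", n))
    return x == {u + v for u in a_star for v in x if len(u) + len(v) <= n}

def star_12(a, b, c, d):
    """Top-right entry of the matrix star, (a + b d* c)* b d*, in `re` syntax."""
    return f"(?:{a}|{b}(?:{d})*{c})*{b}(?:{d})*"

def star_22(a, b, c, d):
    """Bottom-right entry of the matrix star, (d + c a* b)*, in `re` syntax."""
    return f"(?:{d}|{c}(?:{a})*{b})*"

# Coefficient matrix of the e/o system, row by row: (b a; a b).
M = ("b", "a", "a", "b")
X_E = re.compile(star_12(*M))   # candidate solution for x_e
X_O = re.compile(star_22(*M))   # candidate solution for x_o

if __name__ == "__main__":
    # Two different solutions of x = A* x: the empty language and A* itself.
    print(solves_x_eq_astar_x(set(), 4), solves_x_eq_astar_x(set(words("ab", 4)), 4))
    # The matrix-star entries agree with "odd number of a's" and "even number of a's".
    print(all(bool(X_E.fullmatch(w)) == (w.count("a") % 2 == 1) for w in words("ab", 8)))
    print(all(bool(X_O.fullmatch(w)) == (w.count("a") % 2 == 0) for w in words("ab", 8)))
```

Not every language solves $x = A^* x$: for example, `solves_x_eq_astar_x({"a"}, 4)` is false, since the right-hand side also contains `ba`.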