Introduction To Theory of Computation

Course Objectives
1. To introduce the concepts and methods that underlie the formal (mathematical) study
of computing machines
– What is a computing machine?
– How can we characterise and classify computing machines?
2. To present some of the basic results concerning the capabilities and limits of
computing machines
– Are there limits in principle to what can be computed?

o Every program computes a function from its input (a string of bits) to its
output (a string of bits). Since a string of bits may be viewed as a binary
number, every program may be viewed as computing a function from N to N.
But is every function from N to N computable?
o Are all programming languages and computing machines equal (in principle)?
Or are some more equal than others?
– Are there limits in practice to what can be computed?

o Are there computable problems which, no matter
· how clever an algorithm we devise,
· how efficient the language we write it in, or
· how ‘next generation’ the hardware,
will still not finish on inputs of small size before the heat death of the
universe?
o How do we identify these problems?
3. To extend basic mathematical skills and to develop further logical and analytical skills
directly related to Computer Science.
4. To provide a theoretical foundation for other Computer Science courses
Definition of Theory of Computation
Theory of computation is a branch of mathematics and computer science that deals with
whether and how efficiently problems can be solved on a model of computation using an
algorithm.
The purpose of the Theory of Computation is to develop formal mathematical models of
computation that reflect real-world computers.
Prepared by Zanamwe N
Theory of computation is concerned with asking the following fundamental questions:

1. What are the limits of computation?
2. Are there problems which cannot be computed?
3. How do we model computation?
The first two questions can only be addressed after the last question is addressed, that is, how
to represent computation in forms that admit rigorous analysis and not merely execution.
Branches of theory of computation

Theory of computation is divided into three branches:
1. Complexity theory: the objective is to classify problems as easy ones and hard
ones.
2. Computability theory: problems are classified as those that are solvable and
those that are not. Computability theory introduces several of the concepts used in
complexity theory.
3. Automata theory: deals with the definitions and properties of different types of
“computation models”. Examples of such models are:
• Finite Automata. These are used in text processing, compilers, and hardware design.
• Context-Free Grammars. These are used to define programming languages and in
Artificial Intelligence.
• Turing Machines. These form a simple abstract model of a “real” computer, such as
our PC at home.
Central Question in Automata Theory: Do these models have the same power, or can
one model solve more problems than the other?
Computer theory applications

Theory of computation provides us with many good applications.
• Finite state machines are used in string searching algorithms, compiler design, control
unit design in computer architecture, and many other modelling applications.
• Context-free grammars and their restricted forms are the basis of compilers and
parsing.
• NP-completeness theory helps us distinguish the tractable from the intractable.
We do not focus on these applications. They are deep enough to require a separate course, and
hopefully you have already seen some of them.
Relevance of theory to practice

1. Theory gives us a new viewpoint on computers, which are complex machines. Further,
theory shows us another, elegant side of computation. This course can heighten our
aesthetic sense and help us build more beautiful systems.
2. Theory provides conceptual tools that are used by practitioners in computer engineering.
For example, when designing a new programming language for a specialised application
you need an understanding of grammars. Further, an understanding of finite automata and
regular expressions is useful with string searching and pattern matching.

3. This course helps you to learn problem solving skills. Theory teaches you how to think,
prove, argue, solve problems, express yourself clearly and precisely, and abstract.
How do we model computation?
Computation is modelled using languages and machines.
Major areas of focus

1. Automata theory
2. Pushdown automata theory
3. Turing theory
Automata theory
Is concerned with the definitions and properties of mathematical models of computation.
Unlike other models (SE, DB, DAA), computation models deal with all computers that exist,
will exist and that can ever be dreamed of.
Note that computational models may be accurate in some ways and not in others.
One model, called the finite automaton, is used in text processing, compilers, and hardware
design.
Another model, called the context-free grammar, is used in programming languages and
artificial intelligence.
Languages
The fact that our study is sometimes called the theory of formal languages makes it imperative to
study languages. The word formal means that all the rules for the language are explicitly
stated in terms of what strings of symbols can occur. The other reason why we study
languages is that languages are used to model computation. It has already been indicated that
TOC asks the question: how do we model computation? The answer given was that
computation is modelled using languages and machines.
A language is defined as a game of symbols with formal rules. Natural languages like English
are made up of letters, words, sentences, paragraphs etc. Similarly, with computer languages,
certain character strings are recognisable as words (END, DO, WHILE etc), certain strings of
words are recognisable as commands and certain sets of commands become a program that
can be translated into machine commands.
Terminology
1. Alphabet (Γ or Σ) (gamma and sigma)

Is a finite set of fundamental units out of which we build structures (Cohen, 1991), or any
finite set of symbols. Example: Σ = {a, b, c, d, ..., z} or Γ = {0, 1}
2. Symbols

Are members of an alphabet, usually denoted by small letters, although digits can also
be symbols.
3. Words
Are strings containing the symbols in some alphabet. Two words are considered the same
if all their characters are the same and in the same order.
4. String over an alphabet- is a finite sequence of symbols from that alphabet written
adjacent to one another and not separated by commas. For example if Σ = {a,b,c,d} then
aadc, cdabb, and adcd are strings over Σ.
5. Empty string or null string
Is a string of length zero and is denoted by (Λ or ε). For clarity, the symbol Λ or ε is not
allowed to be part of the alphabet for any language.
6. String length (|x| or length(x))
Refers to the number of symbols in a string. For any word x in any language, if length(x)
= 0 then x = Λ
7. Language
Is a certain specified set of strings of characters from an alphabet. Is denoted by L.
8. Empty language (Φ)
Is a language that has no words or strings.
Points of thought
• Is there a difference between Φ and Λ (the language without words and the word without
symbols)?
• Is L + {Λ} = L? (+ is the union of sets operation)
• Is L + Φ = L?
Answers
1. There is a subtle but important difference between the word that has no letters and the
language that has no words. It is false that Λ is a word in the language Φ, since this
language has no words at all.
2. If a language L does not contain Λ, then L + {Λ} is not the same as L.
3. L + Φ = L, since no new words were added.
Cohen (1991) posits that anyone who thinks that Λ is not confusing has missed
something. It is already a problem and it gets worse later.
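The three answers can be checked mechanically. The sketch below is only an illustration and assumes a modelling choice that is not in the notes: a language is a Python set of strings, Λ is the empty string "", and Φ is the empty set. The language L is a hypothetical example.

```python
# Assumed model: a language is a set of strings, the null string is "".
EMPTY_LANGUAGE = set()      # Φ: a language with no words
NULL_STRING = ""            # Λ: a word with no symbols

def union(l1, l2):
    """The '+' of the notes: set union of two languages."""
    return l1 | l2

L = {"a", "ab"}             # hypothetical language that does not contain Λ

print(NULL_STRING in EMPTY_LANGUAGE)     # False: Φ has no words at all
print(union(L, {NULL_STRING}) == L)      # False: Λ is a new word
print(union(L, EMPTY_LANGUAGE) == L)     # True: no new words were added
```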
Defining Languages
There are two types of language defining rules:
1. Can be used to test whether a word is valid
2. Used to construct all the words in the language by some clear procedures
Concatenation operation
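Is used to join two or more strings: a concatenation is the string obtained by appending one
string to the end of another. The operation extends to languages. For example, if L1 = {good}
and L2 = {one}, then the concatenation L1L2 = {goodone}. This is not L1 + L2, which denotes
the union {good, one}.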
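As a minimal sketch, concatenation of languages can be written as a set comprehension. It assumes languages are modelled as Python sets of strings; the function name concat is chosen here for illustration only.

```python
def concat(l1, l2):
    """Concatenation of two languages: each word of l1 followed by each word of l2."""
    return {x + y for x in l1 for y in l2}

L1 = {"good"}
L2 = {"one"}

print(concat(L1, L2))        # {'goodone'}
print(L1 | L2)               # the union, a different language with two words
```

Concatenating with the empty set gives the empty set, and concatenating with {""} leaves a language unchanged, which matches the roles of Φ and Λ.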

Reverse function
If c is a word in some language L, then reverse(c) is the same string of letters spelled
backward, called the reverse of c, even if this backward string is not a word in L.
Example: reverse(eert) = tree
Palindrome
Assume a new language Palindrome is defined over the alphabet Σ = {m, n}. Then Palindrome
= {Λ, and all strings y such that reverse(y) = y}, so the words in Palindrome are: {Λ, m, n,
mm, nn, mmm, mnm, nmn, nnn, ...}
Note that if you concatenate two words in Palindrome, the obtained word is only sometimes in
Palindrome: m followed by mm gives mmm, which is in Palindrome, but mnm followed by m
gives mnmm, which is not.
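The reverse function and the membership test for Palindrome are short enough to try out directly. This is a sketch with strings as Python str values; the helper names are chosen here.

```python
def reverse(word):
    """The same string of letters spelled backward."""
    return word[::-1]

def is_palindrome(word):
    """Membership test for Palindrome: reverse(y) = y (true for the null string too)."""
    return reverse(word) == word

print(reverse("eert"))               # tree
print(is_palindrome("m" + "mm"))     # True:  mmm
print(is_palindrome("mnm" + "m"))    # False: mnmm
```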
Valid words
If a word is contained in a given language it is valid otherwise it is invalid
Question:
Given the following languages:
L1 = {p^x q^y r^(x+y)}, where x and y range over all the natural numbers 0, 1, 2, ... and p^x
denotes the string containing x successive copies of the symbol p.
L2 = {p^x q^y r^(x-y)}, where x and y range over all the natural numbers 0, 1, 2, ..., p^x
denotes the string containing x successive copies of the symbol p, and x > y.
For each language, list 5 valid and 5 invalid words.
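One way to check candidate answers is a small membership tester. The sketch below is an illustration, not part of the notes: it splits a word into its block of p's, q's and r's with Python's re module and then compares the counts. The function names are hypothetical, and the condition x > y for L2 is taken exactly as stated above.

```python
import re

def block_counts(word):
    """Return (x, y, z) if word has the shape p^x q^y r^z, otherwise None."""
    m = re.fullmatch(r"(p*)(q*)(r*)", word)
    if m is None:
        return None
    return tuple(len(group) for group in m.groups())

def in_L1(word):
    counts = block_counts(word)
    return counts is not None and counts[2] == counts[0] + counts[1]

def in_L2(word):
    counts = block_counts(word)
    if counts is None:
        return False
    x, y, z = counts
    return x > y and z == x - y

print(in_L1("pqrr"), in_L1("pqr"))    # True False
print(in_L2("ppqr"), in_L2("pqr"))    # True False
```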
Kleene Closure of an alphabet

Denoted by Σ*. This notation is sometimes known as the Kleene star.
Is defined as the language in which any string of letters from Σ is a word, even the null string.
Example: if Σ = {b, c} then Σ* = {Λ, b, c, bb, bc, cb, cc, bbb, bbc, ...}
The Kleene star is an operation that makes an infinite language of strings of letters out of an
alphabet. The term infinite language means infinitely many words, each of finite length.
Lexicographic ordering
Means that strings are arranged in size order (words of shortest length first) and words of
the same length are put in alphabetical order. For example, if Σ = {0, 1} then Σ* = {Λ, 0, 1, 00,
01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, ...}
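Since Σ* is infinite, a program can only enumerate it lazily. The following sketch yields the words of Σ* shortest first and alphabetically within each length, using itertools from the Python standard library. The generator name kleene_star is chosen here for illustration.

```python
from itertools import count, islice, product

def kleene_star(alphabet):
    """Yield the words of Σ* shortest first, alphabetically within each length."""
    letters = sorted(alphabet)
    for n in count(0):                           # lengths 0, 1, 2, ...
        for combo in product(letters, repeat=n):
            yield "".join(combo)                 # n = 0 yields the null string ""

first_words = list(islice(kleene_star({"0", "1"}), 15))
print(first_words)
```

Taking the first 15 words covers every word of length 0 to 3 over {0, 1}, since 1 + 2 + 4 + 8 = 15.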
Closure of a set of words

The use of the Kleene star can be generalised to sets of words not just sets of alphabet letters.
If S is a set of words then S* is a set of all finite strings formed by concatenating words from
S, where any word can be used as often as we like and where the null string is also included.
Example 1:
if S = {00, 1} then

S* = { Λ plus any word composed of factors of 00 and 1 } = { Λ plus all strings of
0’s and 1’s in which the 0’s occur in even clumps } = { Λ, 1, 00, 11, 001, 100, 111, 0000, 0011,
1001, 1100, 1111, 00001, 00100, 00111, 10000, 10011, 11001, 11100, 11111, ... }
The string 0010001 is invalid since it has a clump of 0’s of length 3.
Example 2:
if S = {x, xy} then
S* = { Λ plus any word composed of factors of x and xy } = { Λ plus all strings of x’s
and y’s except those that start with y and those that contain a double y }
= { Λ, x, xx, xy, xxx, xxy, xyx, xxxx, xxxy, xxyx, xyxx, xyxy, xxxxx, xxxxy, xxxyx, xxyxx,
xxyxy, xyxxx, xyxxy, xyxyx, ... }
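Lists like the two above are easy to get wrong by hand, so it can help to generate them. The sketch below computes every word of S* up to a chosen length by repeatedly appending words of S. It assumes languages are Python sets of strings, and the name closure_up_to and the length bound are introduced here for illustration only.

```python
def closure_up_to(words, max_len):
    """All words of S* whose length is at most max_len."""
    result = {""}                 # the null string is always in S*
    frontier = {""}
    while frontier:
        next_frontier = set()
        for prefix in frontier:
            for w in words:
                candidate = prefix + w
                if len(candidate) <= max_len and candidate not in result:
                    result.add(candidate)
                    next_frontier.add(candidate)
        frontier = next_frontier
    return result

def in_size_order(language):
    """Shortest words first, alphabetical within each length."""
    return sorted(language, key=lambda w: (len(w), w))

print(in_size_order(closure_up_to({"00", "1"}, 3)))
print(in_size_order(closure_up_to({"x", "xy"}, 3)))
print("0010001" in closure_up_to({"00", "1"}, 7))   # False: a clump of three 0's
```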
Proving the existence of a word in the closure

This is done by showing how the word can be written as a concatenation of words from the base
set S. Using the last example, prove the existence of xxyxxxyx in S*.
Solution: factor the string as follows: (x) (xy) (x) (x) (xy) (x). These six factors are all in the set S, so
their concatenation is in S*. This factoring is unique; sometimes a factoring is not.
For example, if S = {aa, aaa} then S* = { Λ plus all strings of more than one a } or {a^n for n
= 0, 2, 3, 4, 5, ...} or { Λ, aa, aaa, aaaa, aaaaa, aaaaaa, ... }. Prove whether aaaaaaa is in S*.
The factors are:
(aa) (aaa) (aa) or (aaa) (aa) (aa) or (aa) (aa) (aaa).
Using example 1, prove whether or not the string 1000011001110001 is in the closure of S.
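Finding a factoring by hand amounts to a search, which the following sketch automates. It returns every way of writing a word as a concatenation of words from the base set, so an empty result means the word is not in S*. The function name factorings is hypothetical, and the simple recursion is meant for short words only.

```python
def factorings(word, base):
    """Every way to write word as a concatenation of words from base."""
    if word == "":
        return [[]]                       # the null string needs no factors
    found = []
    for factor in sorted(base):
        if factor and word.startswith(factor):
            for rest in factorings(word[len(factor):], base):
                found.append([factor] + rest)
    return found

print(factorings("xxyxxxyx", {"x", "xy"}))      # exactly one factoring
print(factorings("aaaaaaa", {"aa", "aaa"}))     # three factorings
print(factorings("a", {"aa", "aaa"}))           # []: a is not in S*
```

The same function can be applied to the exercise string with base {"00", "1"}.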
Proof by constructive algorithm
Is a way of proving that something exists by showing how to create it.
Given that S = {aa, aaa}, prove that S* contains all a^n for n ≠ 1.
We proceed as follows:
1. Assume that there are some powers of a that we could not produce by concatenating
factors of (aa) and (aaa). Since we can produce a^4, a^5 and a^6, the strings that we
cannot produce must be large.
2. Determine the smallest power of a (> 1) that we cannot form out of factors of (aa) and
(aaa). Assume here that we start making a list of how to construct the various powers
of a. On this list we state how to form a^2, a^3, a^4, a^5, a^6 and so on. Assume that we
work our way successfully up to a^(n-1) but then we cannot figure out how to form a^n.
3. Establish how a^(n-2) was formed and then concatenate another factor of aa in front of
it; this gives a^n. So no smallest unformable power exists, and every a^n with n ≠ 1 is
in S*.
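Step 3 is itself an algorithm: to build a^n, build a^(n-2) and put aa in front. A minimal sketch of that construction follows; the function name build is chosen here, and the base cases a^2 = (aa) and a^3 = (aaa) are the two words of S.

```python
def build(n):
    """Return factors from {aa, aaa} whose concatenation is a^n, for n != 1."""
    if n == 0:
        return []                      # the null string: no factors needed
    if n == 1:
        raise ValueError("a^1 cannot be formed from aa and aaa")
    if n == 2:
        return ["aa"]
    if n == 3:
        return ["aaa"]
    return ["aa"] + build(n - 2)       # step 3: aa in front of a^(n-2)

print(build(7))                        # ['aa', 'aa', 'aaa']
print("".join(build(7)) == "a" * 7)    # True
```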

If Σ = {} then Σ* = {Λ}. This is not the same as saying that if S = {Λ} then S* = {Λ}, which is
also true but for a different reason, namely that Λ = ΛΛ.
Note: Λ is an element of L* for all languages L.
Sometimes the notation + instead of * is used to modify the concept of closure to refer
only to the concatenation of some (not zero) strings from a set S. If Σ = {a} then Σ+ = {a, aa,
aaa, aaaa, ...}
For any language S that does not contain Λ, S* = S+ + {Λ}.
|Λ| = 0
Theorem 1
For any set S of strings, S* = S**

Illustration: if S = {xx, yyy} then S* is the set of strings where the x’s occur in even clumps and
the y’s occur in groups of 3, 6, 9, ... Some strings in S* are xxyyyxxxx, yyy and yyyxx. If we
concatenate these three elements of S* we get one big word in S** which is also in S*:
xxyyyxxxxyyyyyyxx = (xx) (yyy) (xx) (xx) (yyy) (yyy) (xx)
This is analogous to saying that if computers are made up of circuits and circuits are
made up of logic gates then computers are made up of logic gates.
Proof
Every word in S** is made up of factors from S*. Every factor from S* is made up of factors
from S. Therefore, every word in S** is also a word in S*. This can be expressed as S** ⊂ S*. In
general, for any set A we know that A ⊂ A*, since in A* we can choose as a word any one
factor from A. So if we take A to be our set S*, we have S* ⊂ S**.
Together the two inclusions prove that S** = S*.
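The theorem can also be spot-checked by computer for a particular S, although only up to a length bound, since S* is infinite. The sketch below is not a proof. It compares the words of S* and S** of length at most N, where the helper closure_up_to and the bound N = 9 are assumptions made for this illustration.

```python
def closure_up_to(words, max_len):
    """All words of the closure of `words` whose length is at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        # Extend the newest words by one more factor, keeping only unseen results.
        frontier = {p + w for p in frontier for w in words
                    if len(p + w) <= max_len} - result
        result |= frontier
    return result

S = {"xx", "yyy"}
N = 9
star = closure_up_to(S, N)            # words of S* up to length N
star_star = closure_up_to(star, N)    # words of S** up to length N

print(star == star_star)              # True: no new words appear
```

The bounded comparison is sound because a word of S** of length at most N can only use factors from S* of length at most N.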
Ways of Defining Languages
There are several ways of defining languages notably:
1. Recursive definition of languages
2. Regular expressions
3. Finite automata
4. Transition Graph
5. Other

Recursive definition
Is a method of defining sets and has three steps:
1. Specify some base objects in the set
2. Give rules for constructing more objects in the set from the ones we already know
3. Declare that no objects except those constructed in this way are allowed in the set.
Examples:
1. Recursive definition of a set of positive even numbers
Rule 1: 2 is in EVEN
Rule 2: if x and y are both in EVEN then so is x+y
2. Recursive definition of positive integers
Rule 1: 1 is in INTEGERS
Rule 2: if x is in INTEGERS so is x+1
3. Recursive definition of integers
Rule 1: 1 is in INTEGERS
Rule 2: if both x and y are in INTEGERS, then so are x+y and x-y
4. Recursive definition of factorial

Rule 1: 0! = 1
Rule 2: n!= n*(n-1)!
5. Recursive definition of polynomial
A polynomial is a finite sum of terms, each of which is of the form: a real number times
a power of x (that may be x^0 = 1).
Rule 1: Any number is in POLYNOMIAL
Rule 2: the variable x is in POLYNOMIAL
Rule 3: if p and q are in POLYNOMIAL then so are p+q, p-q, pq and (p)
*Show that 2x^2 + 3x - 10 is in POLYNOMIAL
By rule 1, 2 is in POLYNOMIAL
By rule 2, x is in POLYNOMIAL
By rule 3, (2)(x) is in POLYNOMIAL; call it 2x
By rule 3, (2x)(x) is in POLYNOMIAL; call it 2x^2
By rule 1, 3 is in POLYNOMIAL
By rule 3, (3)(x) is in POLYNOMIAL; call it 3x
By rule 3, 2x^2 + 3x is in POLYNOMIAL
By rule 1, -10 is in POLYNOMIAL
By rule 3, 2x^2 + 3x + (-10) = 2x^2 + 3x - 10 is in POLYNOMIAL
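Recursive definitions translate almost word for word into code. The sketch below covers two of the examples above: factorial follows its two rules directly, and EVEN is built by starting from the base object 2 and applying Rule 2 until nothing new appears. The upper limit is an assumption added here so that the construction terminates, and the function names are chosen for illustration.

```python
def factorial(n):
    """Rule 1: 0! = 1.  Rule 2: n! = n * (n-1)!."""
    if n == 0:
        return 1
    return n * factorial(n - 1)

def even_up_to(limit):
    """Members of EVEN not exceeding limit, built only from Rule 1 and Rule 2."""
    if limit < 2:
        return []
    members = {2}                                  # Rule 1: 2 is in EVEN
    while True:
        # Rule 2: if x and y are both in EVEN then so is x + y
        new = {x + y for x in members for y in members if x + y <= limit}
        if new <= members:                         # nothing new can be built
            return sorted(members)
        members |= new

print(factorial(5))        # 120
print(even_up_to(10))      # [2, 4, 6, 8, 10]
```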
REGULAR EXPRESSIONS
• Cohen (2001) defines REs as language defining symbols whereas Sipser (1996) defines
them as expressions describing languages.
• Languages defined by REs are referred to as Regular Languages
• REs are limited in capacity because there are some languages that cannot be defined by
REs
• An RL is one that can be defined by an RE
• The value of a RE is a language.
Formal Definition of a Regular Expression
• Symbols that appear in REs include the letters of the alphabet Σ, the symbol for the null string
Λ, the symbol for the empty language Φ, parentheses, the star operator and the plus sign.
• The set of regular expressions is defined as follows:
o Rule 1: every letter of the alphabet Σ can be made into a regular expression by
writing it in bold face; Λ itself is a RE and so is Φ.
o Rule 2: if r1 and r2 are REs then so are:
· (r1)
· r1r2 -- concatenation (∘) -- r1r2 = {w : w = xy, x ∈ r1, y ∈ r2}
Example: If A = {good, bad} and B = {boy, girl} we get
A ∘ B = {goodboy, goodgirl, badboy, badgirl}.
· r1 + r2 -- union (∪) -- r1 + r2 = {w : w ∈ r1 or w ∈ r2}
Example: If A = {good, bad} and B = {boy, girl} we get
A + B = {good, bad, boy, girl}.
· r2* -- Kleene star -- r2* = {w : w = x1x2···xk, k ≥ 0, xi ∈ r2}
o Rule 3: nothing else is a RE
Note that:
• r + Λ = Λ + r, which is not always equal to r
• rΛ = Λr = r
• r^+ = rr*, where r^+ denotes one or more concatenated copies of r
• r + Φ = r
• rΦ = Φr = Φ
“but what is far less clear is exactly what Φ* should mean. We shall avoid this philosophical
crisis by never using this symbolism and avoiding those who do” (Cohen, 2001). Sipser
(1996), however, indicates that Φ* = {ε}, which sounds logical.
Moving from RE notation to set notation
We use the L operator, whose rules are as follows:
• L(a) = {a}
• L(a+b) = L(a) ∪ L(b) = {a, b}
• L(ab) = L(a)L(b) = {ab}
• L(r*) = (L(r))*
• Language of Λ = L(Λ) = {Λ}
Examples:
1.

2.
is a language that contains plus sequences of on in which the

precede the if there are .
3. – all strings ending in and the shortest string is
4. ) – all strings over and
5. - only eliminates from
6. L((a+b)((a+b)(a+b))*) - all strings with an odd number of symbols
7. - all strings with an even number of symbols
8. strings of and with an a
preceding a
9. all strings over and with precisely one
10. – strings with exactly 2
11. )) – strings over and with 3 symbols,
this can be expressed as which is just short hand not a RE.
12. - accepts all strings over a and b
with first and last symbols different
13. Give the RE for a language over a and b that accepts all strings with
the first and last symbols different or second from last and second
from first different.
14. L(a*b*)- all strings over a and b in which all a’s (if any) precede all
b’s (also if any)
15. L((a*b*)*) - all strings over a and b
16. L(a(a+b)*a + b(a+b)*b + a + b) - all non-empty strings over a and b that start and
end with the same symbol
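Claims of the form "this RE describes exactly these strings" can be tested by brute force on short strings. The sketch below uses Python's re module, which is an assumption of this illustration and not the notation of the notes: Python writes the union + as |, and re.fullmatch asks whether the whole string matches. The helper names and the length bound of 6 are chosen here. A passing check is evidence, not a proof.

```python
import re
from itertools import product

def words(alphabet, max_len):
    """All strings over the alphabet of length 0 to max_len."""
    for n in range(max_len + 1):
        for combo in product(alphabet, repeat=n):
            yield "".join(combo)

def agrees(pattern, predicate, max_len=6):
    """True if pattern matches exactly the short words over {a, b} satisfying predicate."""
    return all((re.fullmatch(pattern, w) is not None) == predicate(w)
               for w in words("ab", max_len))

# a*b*: all a's (if any) precede all b's (if any), so "ba" never occurs
print(agrees(r"a*b*", lambda w: "ba" not in w))
# (a*b*)*: all strings over a and b
print(agrees(r"(a*b*)*", lambda w: True))
# a(a+b)*a + b(a+b)*b + a + b: non-empty, same first and last symbol
print(agrees(r"a(a|b)*a|b(a|b)*b|a|b", lambda w: w != "" and w[0] == w[-1]))
```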
Note that:
• L(a*Φ) = Φ - concatenating the empty language to any language yields
the empty language.
• r + Φ = r - adding the empty language to any other language will not change it.
• rΛ = r - concatenating the null string to any string will not change it.
But
• r + Λ may not equal r: for example, if r = b then L(r) = {b} but L(r + Λ) = {b, Λ}
• rΦ may not equal r: if r = b, then L(r) = {b} but L(rΦ) = Φ (the two are equal only when r = Φ)
Notice that the use of the plus sign is far from the normal meaning of addition in the algebraic
sense. With plus read as union or as choice, the following all make sense:
• b* = b* + b*
• b* = b* + b* + b*
• b* = b* + bbb
Also note that in algebra ac = ca, but in formal languages ac ≠ ca, and also that (ab)* ≠ a*b*.
Languages associated with regular Expressions
Below are rules that define the language associated with any regular expression:
1. Rule 1: the language associated with a regular expression that is just a single
letter is that one-letter word alone, and the language associated with Λ is just {Λ}, a
one-word language.
2. Rule 2: if r1 is a regular expression associated with the language L1 and r2 is a
regular expression associated with the language L2, then:
a. The regular expression (r1)(r2) is associated with the product L1L2, that is, the
language L1 times L2:
i. Language((r1)(r2)) = L1L2
b. The regular expression r1 + r2 is associated with the language formed by the
union of the sets L1 and L2:
i. Language(r1 + r2) = L1 + L2
c. The language associated with the regular expression (r1)* is L1*, the Kleene
closure of the set L1 as a set of words:
i. Language(r1*) = L1*
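These rules amount to a recursive procedure for computing the language of an expression. The sketch below follows them case by case. It assumes a representation that is not in the notes: a regular expression is a nested tuple such as ("concat", r1, r2), and, because the languages may be infinite, only words up to a length bound are produced. All names are hypothetical.

```python
def language(r, max_len):
    """Words of length <= max_len in the language associated with r."""
    kind = r[0]
    if kind == "letter":                       # Rule 1: a single letter
        return {r[1]} if len(r[1]) <= max_len else set()
    if kind == "null":                         # Rule 1: Λ gives {Λ}
        return {""}
    if kind == "empty":                        # Φ, the empty language
        return set()
    if kind == "union":                        # Rule 2b: L1 + L2
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "concat":                       # Rule 2a: L1L2
        l1, l2 = language(r[1], max_len), language(r[2], max_len)
        return {x + y for x in l1 for y in l2 if len(x + y) <= max_len}
    if kind == "star":                         # Rule 2c: L1*
        base = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:
            frontier = {p + w for p in frontier for w in base
                        if len(p + w) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind}")

a, b = ("letter", "a"), ("letter", "b")
a_star_b_star = ("concat", ("star", a), ("star", b))     # a*b*
ab_star = ("star", ("concat", a, b))                     # (ab)*

print(sorted(language(a_star_b_star, 2)))   # ['', 'a', 'aa', 'ab', 'b', 'bb']
print(sorted(language(ab_star, 2)))         # ['', 'ab']
```

In this model the star of the empty language comes out as {""}, which is the reading attributed to Sipser (1996).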
The relationship between REs and RLs leaves open 2 questions:
• Is there an algorithm for determining whether different REs describe the same
language?
• Is it true that every language can be described by a regular expression?
Finite languages are Regular
Theorem: if L is a finite language (a language with only finitely many words), then L can be
defined by a regular expression. In other words all finite languages are regular.
Proof: To make one RE that defines the language L, convert all words in L into bold face
type and insert plus signs between them. For example, the RE that defines the language
L= {ab, bb, abb, bbb} is ab + bb + abb + bbb and also if
L = {a, aa, aaa, aaaa} the algorithm above gives the RE a + aa + aaa + aaaa
Here we need only to show that at least one RE exists.
This trick only works for finite languages because with infinite languages the RE will be
infinitely long which is forbidden.
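The proof is an algorithm, so it can be written down directly. In the sketch below, finite_language_to_re joins the words with plus signs as the proof describes, and matches_language checks a candidate word against the same union written in Python's regex syntax, where | plays the role of +. The function names, the ordering of the words, and the treatment of the empty language are choices made for this illustration.

```python
import re

def finite_language_to_re(words):
    """Write a finite language as word1 + word2 + ... in the notation of the notes."""
    if not words:
        return "Φ"                                # the empty language has no words
    ordered = sorted(words, key=lambda w: (len(w), w))
    return " + ".join(w if w else "Λ" for w in ordered)

def matches_language(words, candidate):
    """Test a candidate against the same union, using Python's '|' for '+'."""
    if not words:
        return False
    pattern = "|".join(re.escape(w) for w in sorted(words))
    return re.fullmatch(pattern, candidate) is not None

L = {"ab", "bb", "abb", "bbb"}
print(finite_language_to_re(L))       # ab + bb + abb + bbb
print(matches_language(L, "abb"))     # True
print(matches_language(L, "ba"))      # False
```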