
COL352 Homework 2

Release date: March 6, 2021 Deadline: March 13, 2021: 23:00

Read the instructions carefully.

This homework is primarily about proving that certain languages L are not regular. For this, we have the
Pumping Lemma and the Myhill-Nerode Theorem at our disposal. Recall that the Pumping Lemma merely
gives a sufficient condition for non-regularity. In some cases, using closure properties might give a much
cleaner proof: assume that L is regular, then argue that some other language L′ must also be regular, then
apply the Pumping Lemma to show that L′ is, in fact, not regular. Remember that you can use any claim proven
in class and in the previous quizzes and homeworks without reproducing its proof.
1. Prove that the language {x | x is the binary representation of $3^{n^2}$ for some n ∈ N} is not regular.

Solution: We assume that binary representations have no leading zeroes. We first prove three
claims that are needed later in this solution. We use the notation bin(x) to denote the binary
representation of x and the notation |x| to denote the number of digits of x in a given base (the base
will be clear from the context).

Claim 1: $|\mathrm{bin}(3^n)| \ge n + 1$ for n ∈ N.

Proof: We prove this by induction. For n = 1, |bin(3)| = 2, so the claim holds. Assume it holds for
some k, that is, $|\mathrm{bin}(3^k)| \ge k + 1$. Doubling a number appends one binary digit, so
$|\mathrm{bin}(3^{k+1})| \ge |\mathrm{bin}(2 \cdot 3^k)| = |\mathrm{bin}(3^k)| + 1 \ge k + 2$, which proves the claim.

Claim 2: If $|\mathrm{bin}(x)| \le n$, then $x \le 2^n$.

Proof: This follows from the base-2 representation itself. Note that $|\mathrm{bin}(2^n)| = n + 1$,
so any number with at most n binary digits is smaller than $2^n$. Thus $x \le 2^n$ and we are done.

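Claim 3: If $|\mathrm{bin}(x)| \ge n$, then $x \ge 2^{n-1}$.

Proof: As |bin(x)| ≥ n and we do not allow leading zeroes, the most significant bit of x is set and
sits at position n or higher (counting positions from 1 at the least significant end). That bit alone
contributes at least $2^{n-1}$, so $x \ge 2^{n-1}$ and we are done.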
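The three claims are easy to sanity-check numerically. The following is a minimal Python sketch and not part of the original solution; the helper names `bin_len` and `check_claims` are made up for illustration, and Claims 2 and 3 are only tested with n equal to the exact digit count of x, which is the tightest case.

```python
def bin_len(x: int) -> int:
    """Number of binary digits of x >= 1 without leading zeroes, i.e. |bin(x)|."""
    return x.bit_length()


def check_claims(limit: int = 200) -> bool:
    """Check Claims 1-3 for all exponents / values from 1 up to limit - 1."""
    # Claim 1: |bin(3^n)| >= n + 1.
    claim1 = all(bin_len(3 ** n) >= n + 1 for n in range(1, limit))
    # Claim 2 with n = |bin(x)|: x <= 2^n.
    claim2 = all(x <= 2 ** bin_len(x) for x in range(1, limit))
    # Claim 3 with n = |bin(x)|: x >= 2^(n - 1).
    claim3 = all(x >= 2 ** (bin_len(x) - 1) for x in range(1, limit))
    return claim1 and claim2 and claim3
```

A finite check like this is of course no substitute for the proofs; it only guards against off-by-one slips in the statements.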

Now we come to the proof of the given statement. We use the pumping lemma. Assume, to the
contrary, that the language L is regular. Let p be the pumping length given by the pumping lemma.
Let $a = 3^{(2p)^2} = 3^{4p^2}$ and s = bin(a). Then s can be split into xyz satisfying the conditions of the
pumping lemma. Note that $|s| \ge 4p^2 + 1$ by Claim 1. Reading x, y and z also as the numbers they
represent in binary, we have

$$a = z + 2^{|z|} y + 2^{|y|+|z|} x.$$

By the pumping lemma, xz (pumping down) also belongs to L, so xz = bin(b) for some $b = 3^{k^2}$. Note that

$$b = z + 2^{|z|} x.$$

Subtracting b from a, we get

$$a - b = 2^{|z|} \left( y + x\,(2^{|y|} - 1) \right).$$

By the conditions of the pumping lemma, |xy| ≤ p, so |x| ≤ p and |y| ≤ p. Consider the value a − b.
Both a and b are powers of 3 with perfect-square exponents, and a > b because bin(b) is shorter than
bin(a). Since 3 is the only prime factor of both and a is the greater power, b is a factor of a, so b
divides a − b. Thus $b \mid 2^{|z|} (y + x(2^{|y|} - 1))$. As b is a power of 3, it is coprime to $2^{|z|}$,
so $b \mid (y + x(2^{|y|} - 1))$.

Now let us establish some bounds. By Claim 2, $x \le 2^p$ and similarly $y \le 2^p$. Thus
$y + x(2^{|y|} - 1) \le 2^p + 2^p (2^p - 1) = 2^{2p}$. Next we bound b. Note that
$|xz| \ge 4p^2 - p + 1 \ge 2p + 2$: the first inequality follows from $|xyz| \ge 4p^2 + 1$ and |y| ≤ p, and the
second from $4p^2 \ge 3p + 1$ for p ≥ 1. As bin(b) = xz, Claim 3 together with $|xz| \ge 2p + 2$ gives $b \ge 2^{2p+1}$.

Since a − b > 0, the number $y + x(2^{|y|} - 1)$ is positive, and as b divides it we must have
$b \le y + x(2^{|y|} - 1)$. But then $2^{2p+1} \le b \le y + x(2^{|y|} - 1) \le 2^{2p}$, which is a contradiction.
Thus L is not regular.
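To make the pumping-down step concrete, here is a small Python sketch (an illustration only, with made-up helper names) that takes $a = 3^{(2p)^2}$ for a given p, enumerates every split s = xyz with |xy| ≤ p and |y| ≥ 1, checks the identity $a - b = 2^{|z|}(y + x(2^{|y|} - 1))$, and confirms that xz is never the leading-zero-free binary representation of a number of the form $3^{k^2}$.

```python
from math import isqrt


def is_square_power_of_3(v: int) -> bool:
    """True if v == 3 ** (k * k) for some integer k >= 0."""
    if v < 1:
        return False
    exponent = 0
    while v % 3 == 0:
        v //= 3
        exponent += 1
    return v == 1 and isqrt(exponent) ** 2 == exponent


def check_pump_down(p: int) -> bool:
    """Check every admissible split of bin(3^((2p)^2)) for pumping length p."""
    a = 3 ** ((2 * p) ** 2)
    s = bin(a)[2:]  # binary digits without the '0b' prefix
    for i in range(p):                 # i = |x|
        for j in range(i + 1, p + 1):  # j = |xy| <= p, so |y| = j - i >= 1
            x, y, z = s[:i], s[i:j], s[j:]
            xv = int(x, 2) if x else 0  # the empty prefix represents 0
            yv, zv = int(y, 2), int(z, 2)
            b = xv * 2 ** len(z) + zv   # value of the pumped-down string xz
            # Identity used in the proof.
            if a - b != 2 ** len(z) * (yv + xv * (2 ** len(y) - 1)):
                return False
            # xz is in the language only if it has no leading zero
            # and its value is 3^(k^2).
            if not (x + z).startswith("0") and is_square_power_of_3(b):
                return False
    return True
```

For example, `check_pump_down(2)` works with a = 3^16 and examines the three splits allowed by |xy| ≤ 2.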

Note: We can also use this idea to prove that L = {x | x is the binary representation of $3^n$ for some n ∈
N} is not regular by choosing an appropriate string s. In the proof above, we used
$s = \mathrm{bin}(3^{4p^2})$ because it let us establish the needed bounds and reach a contradiction; likewise,
we would have to choose a suitable s for the stronger statement as well.
2. Prove that the language $\{0^m 1^n \mid m \neq n\} \subseteq \{0, 1\}^*$ is not regular.
As a challenge, construct a clean proof using the pumping lemma only. However, no extra credit will be
given for this.

Solution: Note that the intersection of two regular languages is a regular language. So if L1 ∩ L2 =
L3 and both L1 and L2 are regular, then L3 must be a regular language. Taking the contrapositive,
if L3 is not regular, then at least one of L1 or L2 is not regular. We will use this property
for this question.
Let $L = \{0^m 1^n \mid m \neq n\} \subseteq \{0, 1\}^*$ and $L_1 = 0(0)^* 1(1)^*$.
Claim: L1 is a regular language.
Proof: Consider the following DFA for it.

Figure 1: DFA for recognising L1

Hence, as the above DFA recognises L1 , L1 is regular.

Now, consider $L_2 = \overline{L} \cap L_1 = \{0^n 1^n \mid n \ge 1\}$, where $\overline{L}$ is the complement of L.

Claim: L2 is not a regular language.

Proof: We prove the claim by the pumping lemma. Suppose L2 were regular with pumping length p, and
choose $s = 0^p 1^p$. Consider any split s = xyz where |y| ≥ 1 and |xy| ≤ p. Then $x = 0^a$, $y = 0^b$ and
$z = 0^c 1^p$, where a + b + c = p and b ≥ 1. It suffices to exhibit an i such that $x y^i z$ is not in L2.
For i = 0, we get $xz = 0^{a+c} 1^p$ with a + c < p, so xz is not in L2. So L2 is not a regular language.

Hence, as L2 is not regular, at least one of $\overline{L}$ or L1 is not regular. However, L1 is regular, so
$\overline{L}$ must be non-regular. Since regular languages are closed under complement, this means that L
is not regular either.
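The key set identity in this argument, that intersecting the complement of L with L1 leaves exactly the strings $0^n 1^n$ with n ≥ 1, can be checked by brute force on short strings. The sketch below is only an illustration under that reading of the solution; all function names are hypothetical.

```python
import re
from itertools import product


def in_L(w: str) -> bool:
    """w = 0^m 1^n with m != n."""
    m = re.fullmatch(r"(0*)(1*)", w)
    return m is not None and len(m.group(1)) != len(m.group(2))


def in_L1(w: str) -> bool:
    """w matches 0(0)*1(1)*."""
    return re.fullmatch(r"00*11*", w) is not None


def in_L2(w: str) -> bool:
    """w lies in the complement of L intersected with L1."""
    return (not in_L(w)) and in_L1(w)


def members_of_L2(max_len: int) -> list:
    """All strings of length <= max_len in L2, shortest first."""
    return ["".join(t)
            for n in range(max_len + 1)
            for t in product("01", repeat=n)
            if in_L2("".join(t))]


def pump_down_escapes(p: int) -> bool:
    """Every split 0^p 1^p = xyz with |xy| <= p, |y| >= 1 has xz outside L2."""
    s = "0" * p + "1" * p
    return all(not in_L2(s[:i] + s[j:])
               for i in range(p) for j in range(i + 1, p + 1))
```

Here `members_of_L2(6)` lists only strings of the shape $0^n 1^n$, and `pump_down_escapes(p)` mirrors the i = 0 case of the pumping argument.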

3. Construct the minimal DFA D = (Q, {0, 1}, δ, q0 , A) that recognizes the language

{x ∈ {0, 1}∗ | x is the binary representation of a number coprime with 6}.

Prove its minimality by giving a string $z_{q,q'}$ for each pair of distinct states q, q′ ∈ Q such that exactly
one of $\hat{\delta}(q, z_{q,q'})$ and $\hat{\delta}(q', z_{q,q'})$ is in A. (Proof of correctness of your automaton is not required.)

Solution:
Figure 2: Initial DFA for the language of binary strings representing numbers coprime to 6, where the number
in each state is the remainder modulo 6 of the strings that end in that state.

Now applying the DFA minimisation algorithm discussed in class, we see that the initial partition of the
states is {0,2,3,4}, {1,5}. Applying the refinement step iteratively, we get the following partition after
each iteration:
{0,2,3,4}, {1,5}
{0,2,3}, {4}, {1}, {5}
{0,3}, {2}, {4}, {1}, {5}
{0,3}, {2}, {4}, {1}, {5} (the algorithm terminates as the partition no longer changes).
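The refinement can be reproduced mechanically. The following Python sketch assumes the natural remainder automaton, in which reading bit b in state r leads to state (2r + b) mod 6 and the accepting states are 1 and 5; the function names are made up, and this is one common way to phrase partition refinement, not necessarily the exact variant presented in class.

```python
def delta(r: int, bit: int) -> int:
    """Appending a bit to a number with remainder r gives remainder (2r + bit) mod 6."""
    return (2 * r + bit) % 6


def refine(accepting=frozenset({1, 5}), states=range(6)):
    """Return the list of partitions produced until refinement stabilises."""
    # Start with two blocks: non-accepting (0) and accepting (1).
    block = {q: int(q in accepting) for q in states}
    history = []
    while True:
        blocks = sorted(sorted(q for q in states if block[q] == b)
                        for b in set(block.values()))
        history.append(blocks)
        # Two states stay together only if they are in the same block
        # and their successors on 0 and on 1 are in the same blocks.
        signature = {q: (block[q], block[delta(q, 0)], block[delta(q, 1)])
                     for q in states}
        ids = {}
        new_block = {q: ids.setdefault(signature[q], len(ids)) for q in states}
        if len(set(new_block.values())) == len(set(block.values())):
            return history
        block = new_block
```

The blocks are printed in sorted order, so they may appear in a different order than in the listing above, but the partitions themselves are the same.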

Figure 3: Final DFA recognising said language, with 1 and 5 as accepting states.

Now, let us find strings such that for each pair of states (q, q′) exactly one of $\hat{\delta}(q, z_{q,q'})$ and
$\hat{\delta}(q', z_{q,q'})$ is in A. (This can be verified from the minimised DFA.)
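{0,3} and {1} → "1"
{0,3} and {2} → "11"
{0,3} and {4} → "1"
{0,3} and {5} → "11"
{1} and {2} → ε
{1} and {4} → ε
{1} and {5} → "1"
{2} and {4} → "1"
{2} and {5} → ε
{4} and {5} → ε
Here ε is the empty string, for which $\hat{\delta}(q, ε) = q$. Note that ε cannot separate {2} and {4},
since neither state is accepting; the string "1" leads from 2 to 5 (accepting) and from 4 to 3 (non-accepting).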
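A table of distinguishing strings like this one is easy to verify by simulation. The sketch below assumes the remainder automaton with transition (2r + b) mod 6 and accepting states {1, 5}, uses 0 as the representative of the merged state {0,3}, and writes ε as the empty Python string; the names `run`, `separates` and `WITNESS` are hypothetical.

```python
from itertools import combinations

ACCEPTING = {1, 5}


def run(q: int, w: str) -> int:
    """Extended transition function: state reached from q after reading w."""
    for ch in w:
        q = (2 * q + int(ch)) % 6
    return q


def separates(q: int, r: int, w: str) -> bool:
    """True if exactly one of the two runs ends in an accepting state."""
    return (run(q, w) in ACCEPTING) != (run(r, w) in ACCEPTING)


# Candidate distinguishing string for each pair of minimal-DFA states.
WITNESS = {
    (0, 1): "1", (0, 2): "11", (0, 4): "1", (0, 5): "11",
    (1, 2): "", (1, 4): "", (1, 5): "1",
    (2, 4): "1", (2, 5): "", (4, 5): "",
}


def all_pairs_separated() -> bool:
    representatives = [0, 1, 2, 4, 5]
    return all(separates(q, r, WITNESS[(q, r)])
               for q, r in combinations(representatives, 2))
```

In this model `separates(2, 4, "")` is false, because both 2 and 4 are non-accepting, which is why the pair {2}, {4} needs a non-empty witness.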

4. Let Lk ⊆ {0, 1}∗ be the language defined as Lk = {x | |x| ≥ k and the EXOR of the last k bits of x is 1}.
Prove that any DFA that recognizes Lk has at least $2^k$ states. (By the way, observe that Lk is recognized
by an NFA with O(k) states.)

Solution: We prove this by contradiction. Suppose a DFA recognising $L_k$ has fewer than $2^k$ states.
Consider all binary strings of length k, of which there are $2^k$. By the pigeonhole principle,
there exist two distinct strings that end up in the same state of the DFA. Let them be
$s_1 = a_1 a_2 \dots a_k$ and $s_2 = b_1 b_2 \dots b_k$. Since both have length k and reach the same state, either
both are accepted or both are rejected, so the XOR of the bits of $s_1$ equals the XOR of the bits of $s_2$.
Let l be the length of their common prefix, that is $a_1 = b_1, a_2 = b_2, \dots, a_l = b_l$ and
$a_{l+1} \neq b_{l+1}$, and consider the string $s = a_1 \dots a_l 1$. Let $s_1' = s_1 s$ and $s_2' = s_2 s$. Assume without
loss of generality that $a_{l+1} = 1$ and $b_{l+1} = 0$ (the proof for the reverse case is symmetric). The
last k characters of $s_1'$ are $a_{l+2} \dots a_k a_1 \dots a_l 1$, which is a rearrangement of the characters of
$s_1$, so the XOR of the last k characters remains unchanged. The last k characters of $s_2'$, however, are
obtained from the characters of $s_2$ by removing the 0 at position l + 1 and adding a 1, which inverts
the XOR. Thus exactly one of $s_1'$ and $s_2'$ is accepted, so they cannot end in the same state. But
since $s_1$ and $s_2$ end in the same state, $s_1' = s_1 s$ and $s_2' = s_2 s$ must end in the same state as well,
which is a contradiction. Thus our assumption is incorrect, and any DFA recognising $L_k$ has at least
$2^k$ states.
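The distinguishing suffix used in this argument can be tested exhaustively for small k. The following Python sketch is an illustration with hypothetical names: for two distinct strings of length k it returns the empty string when they already differ in membership, and otherwise the common prefix followed by a 1, and then checks that this suffix really separates every pair.

```python
from itertools import combinations, product


def in_Lk(w: str, k: int) -> bool:
    """|w| >= k and the XOR of the last k bits of w is 1."""
    return len(w) >= k and w[-k:].count("1") % 2 == 1


def witness(s1: str, s2: str, k: int) -> str:
    """Suffix that separates two distinct strings of length k."""
    if in_Lk(s1, k) != in_Lk(s2, k):
        return ""  # already separated by the empty suffix
    # l = length of the common prefix (index of the first difference).
    l = next(i for i in range(k) if s1[i] != s2[i])
    return s1[:l] + "1"


def pairwise_distinguishable(k: int) -> bool:
    """All 2^k strings of length k are pairwise separated by their witness."""
    strings = ["".join(t) for t in product("01", repeat=k)]
    for s1, s2 in combinations(strings, 2):
        s = witness(s1, s2, k)
        if in_Lk(s1 + s, k) == in_Lk(s2 + s, k):
            return False
    return True
```

Pairwise distinguishability of all $2^k$ strings is exactly what forces any DFA for the language to have at least $2^k$ states.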

5. We all know that the set of strings over the alphabet {a, b} containing an equal number of occurrences
of ab and ba is regular. However, what if the alphabet is {a, b, c}? Prove that the language

{x ∈ {a, b, c}∗ | x contains an equal number of occurrences of ab and ba}

is not regular. Here are some hints.


1. Take help of the regular expression (abc ∪ bac)∗ .
2. Use closure under inverse homomorphisms from Homework 1.

Solution: Here we again use the argument from question 2: if L1 ∩ L2 = L3 and L3 is not regular,
then at least one of L1 or L2 is not regular. Consider the language L1 = {abc, bac}∗ and let
L = {x ∈ {a, b, c}∗ | x contains an equal number of occurrences of ab and ba}.

Consider the language L2 = L1 ∩ L = {x ∈ {abc, bac}∗ | x contains an equal number of blocks abc and
bac}. This description is valid because, within a string of L1, every occurrence of ab lies inside a block abc
and every occurrence of ba inside a block bac; a block boundary only produces ca or cb. Now consider the
homomorphism h : {0, 1}∗ → {a, b, c}∗ where h(0) = abc and h(1) = bac.
Let L3 = {x ∈ {0, 1}∗ | x has an equal number of zeros and ones} = $h^{-1}(L_2)$.
Claim: L3 is not a regular language.
Proof: We use the pumping lemma. Suppose L3 were regular with pumping length p, and choose
$s = 0^p 1^p$. Consider any split s = xyz where |y| ≥ 1 and |xy| ≤ p. Then $x = 0^a$, $y = 0^b$ and $z = 0^c 1^p$,
where a + b + c = p and b ≥ 1. It suffices to exhibit an i such that $x y^i z$ is not in L3. For
i = 0, the string $xz = 0^{a+c} 1^p$ has fewer zeros than ones, so it is not in L3. So L3 is not regular.

As regular languages are closed under inverse homomorphism, if L2 were regular then $L_3 = h^{-1}(L_2)$
would be regular as well; since L3 is not regular, L2 is not regular. As L2 = L1 ∩ L is not regular,
at least one of L1 and L must be non-regular. However, L1 is regular, which means that L is not regular.
Hence proved.
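The reduction can be sanity-checked by brute force. The sketch below assumes the homomorphism is taken from {0, 1}∗ to {a, b, c}∗ with h(0) = abc and h(1) = bac, so that the inverse-homomorphism closure applies, and verifies on short inputs that h(w) has equally many occurrences of ab and ba exactly when w has equally many zeros and ones. The function names are hypothetical.

```python
from itertools import product


def count_occurrences(x: str, pattern: str) -> int:
    """Number of (possibly overlapping) occurrences of pattern in x."""
    return sum(1 for i in range(len(x) - len(pattern) + 1)
               if x.startswith(pattern, i))


def in_L(x: str) -> bool:
    """x has equally many occurrences of 'ab' and 'ba'."""
    return count_occurrences(x, "ab") == count_occurrences(x, "ba")


def h(w: str) -> str:
    """Assumed homomorphism: 0 -> abc, 1 -> bac."""
    return "".join({"0": "abc", "1": "bac"}[c] for c in w)


def preimage_is_balanced(max_len: int) -> bool:
    """For all w with |w| <= max_len: h(w) in L iff #0(w) == #1(w)."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            w = "".join(t)
            if in_L(h(w)) != (w.count("0") == w.count("1")):
                return False
    return True
```

Since every h(w) already lies in {abc, bac}∗, the preimage of L under h coincides with the preimage of L1 ∩ L, which is why the check can use L directly.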

6. Prove that for any infinite regular language L, there exist two infinite regular languages L1 , L2 such that
L = L1 ∪ L2 and L1 ∩ L2 = ∅. Here are some hints.
1. Let D be any DFA and q be any one of its states. Prove informally that the language

Lq = {x | x ∈ L(D) and the run of D on x visits q an odd number of times}

is regular.
2. Recall the proof of the pumping lemma.

Solution: As the language L is infinite, it contains strings of arbitrarily large length. As
the hint suggests, we need ideas from the proof of the pumping lemma. When a regular language is
finite, the pumping length is larger than the length of the longest string in L, so no string
can be pumped; otherwise pumping would produce infinitely many strings, contradicting that the
language is finite. For an infinite regular language, whatever the pumping length, there are
strings long enough to be pumped, and pumping them yields infinitely many strings. We use this
idea to construct two sets L1 and L2 satisfying the conditions given in the question.

Let D be a DFA for L with n states, and take any string s ∈ L having length greater than n.
By the pigeonhole principle, some state is visited more than once in the run of D on s; let q be
the first state to be repeated. Using ideas from the proof of the pumping lemma, write s = xyz with
|y| ≥ 1 and δ∗(q0, x) = δ∗(q0, xy) = q, where x is the prefix of s after which D reaches q for the
first time and xy is the prefix after which it reaches q for the second time. This gives us a way to
partition L into L1 and L2 based on the number of times a run visits q. Let L1 be the language
of all strings of L whose run on D visits q an odd number of times and, similarly, let L2 be the
strings of L whose run visits q an even number of times. Our claim is that L1 and L2 are
both infinite regular languages and satisfy L1 ∪ L2 = L and L1 ∩ L2 = ∅.

We first prove the last two conditions. For any string a ∈ L, its run visits q either an odd or an
even number of times (zero visits counts as even), and no run can do both. Thus the union and
intersection conditions hold.

Next we show that both languages are infinite. All strings $x y^i z$ with i ≥ 0 are in L, and they are
pairwise distinct because |y| ≥ 1. Since D does not visit q strictly between the first and the second
visit, each extra copy of y adds exactly one visit to q, so the parity of the number of visits
alternates with i. Hence infinitely many of these strings lie in L1 and infinitely many in L2.

Finally we prove that L1 and L2 are regular. We need to maintain the parity of the number of times
the run visits the state q. This can be done by extending the set of states Q to Q × {0, 1}, so that
every transition that ends in state q flips the parity bit; the start state carries bit 1 if q0 = q and
bit 0 otherwise. For L1 the accepting states are A × {1}, while for L2 the accepting states are
A × {0}, where A is the set of accepting states of D. Thus L1 and L2 are regular as well, and so
we are done.
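The Q × {0, 1} construction can be illustrated on a toy automaton. The Python sketch below uses a hypothetical two-state DFA over {0, 1} that accepts the strings ending in 1, tracks the parity of visits to its accepting state, and checks that every accepted string falls into exactly one of the two parts. The example DFA, the convention that starting in q counts as a visit, and all names are assumptions made for this illustration.

```python
from itertools import product

# Hypothetical example DFA over {0, 1}: accepts the strings that end in 1.
DELTA = {("e", "0"): "e", ("e", "1"): "o",
         ("o", "0"): "e", ("o", "1"): "o"}
START, ACCEPTING = "e", {"o"}
Q_TRACKED = "o"  # the state q whose visits are counted


def run_product(w: str):
    """Run the product automaton on Q x {0, 1}; return (state, parity of visits to q)."""
    state = START
    parity = 1 if START == Q_TRACKED else 0  # assumption: starting in q is a visit
    for ch in w:
        state = DELTA[(state, ch)]
        if state == Q_TRACKED:
            parity ^= 1  # every transition that ends in q flips the bit
    return state, parity


def in_L1(w: str) -> bool:
    state, parity = run_product(w)
    return state in ACCEPTING and parity == 1  # accepting set A x {1}


def in_L2(w: str) -> bool:
    state, parity = run_product(w)
    return state in ACCEPTING and parity == 0  # accepting set A x {0}


def partitions_language(max_len: int) -> bool:
    """Accepted strings are in exactly one of L1, L2; rejected strings in neither."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            w = "".join(t)
            expected = 1 if w.endswith("1") else 0
            if [in_L1(w), in_L2(w)].count(True) != expected:
                return False
    return True
```

In this toy example L1 consists of the strings ending in 1 with an odd number of 1s and L2 of those with an even number, so both parts are visibly infinite.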
