Sandeep Notes - Ps
Sandeep Sen
Department of Computer Science and Engineering, IIT Delhi, New Delhi 110016, India.
E-mail: ssen@cse.iitd.ernet.in
Contents
1 Preliminaries
1.1 Relations and Functions
1.2 Counting and comparing infinite sets
1.3 Principle of Induction
2 Basic Counting
2.1 Permutation and Combinations
2.2 Distribution problems
3 Introduction to Graphs
3.1 Representation of graphs
3.2 Reachability in graphs
3.2.1 Tours and cycles
3.2.2 Connectivity
3.2.3 k-connectivity
3.3 Some special classes of graphs
3.4 Problem Set
4 Counting techniques
4.1 The pigeon hole principle
4.2 Principle of Inclusion and Exclusion
4.3 The probabilistic method
4.4 Problem Set
4.5 Some basics of probability theory
5 Recurrences and generating functions
5.1 An iterative method - summation
5.2 Linear recurrence equations
5.2.1 Homogeneous equations
5.2.2 Inhomogeneous equations
5.3 Generating functions
5.3.1 Binomial theorem
5.4 Exponential generating functions
5.5 Recurrences with two variables
5.6 Probability generating functions
5.6.1 Probabilistic inequalities
5.7 Problem Set
6 Modular Arithmetic
6.1 Divisibility
6.2 Congruences
6.3 Problem Set
7 Application of probability: Information and Coding theory
7.1 Quantifying information
7.2 Codes
7.2.1 Huffman code
7.3 Relative information
8 Sorting and Searching
8.1 Skip Lists - an alternative to balanced BST
8.1.1 Review of Skip-lists
8.1.2 Analysis
8.2 Treaps: Randomized Search Trees
8.3 Lower bounds for searching and sorting
9 Universal Hashing
9.1 Notations
9.2 Collision
9.3 Universal Hash Functions
9.4 Example of a Universal Hash function
Abstract
The following pages contain a preliminary version of the lectures and problems on Discrete Structures. The notes are likely to contain errors, in particular typographic ones. I will endeavour to update this every week.
Chapter 1
Preliminaries
A set is a collection of objects. The objects of a set are called members or elements. Two sets are equal iff they have the same members. Usually we do not count repeated elements more than once; when we do, the collections are called multisets. Sets may contain a finite or infinite number of elements. A set that does not have any element is called empty and is denoted by $\emptyset$. Some common set identities are
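Idempotency: $A \cup A = A$, $A \cap A = A$
Commutativity: $A \cup B = B \cup A$, $A \cap B = B \cap A$
Associativity: $A \cup (B \cup C) = (A \cup B) \cup C$, $A \cap (B \cap C) = (A \cap B) \cap C$
Distributivity: $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$, $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
Absorption: $A \cup (A \cap B) = A$, $A \cap (A \cup B) = A$
De Morgan's Laws: $\overline{A \cup B} = \overline{A} \cap \overline{B}$, $\overline{A \cap B} = \overline{A} \cup \overline{B}$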
The power set of a set $A$ is the collection of all distinct subsets of $A$ (including $\emptyset$) and is denoted by $2^A$. A partition of $A$ is a collection of subsets $A_1, A_2, \ldots$ such that $\bigcup_i A_i = A$ and $A_i \cap A_j = \emptyset$ for all $i \neq j$.
Definition 1.1.1 A relation $R \subseteq A \times A$ is reflexive if for all $a \in A$, $(a, a) \in R$. A relation is symmetric if $(b, a) \in R$ whenever $(a, b) \in R$. A relation is antisymmetric if, for distinct $a, b$, $(b, a) \in R$ implies $(a, b) \notin R$. A relation is transitive if $(a, c) \in R$ whenever $(a, b) \in R$ and $(b, c) \in R$.
Definition 1.1.2 A binary relation that is reflexive, symmetric and transitive is called an equivalence relation.
A binary relation that is reflexive, antisymmetric and transitive is called a partial order.
A partial order is a total order if for every pair of distinct elements $a, b$, either $(a, b)$ or $(b, a)$ belongs to the partial order.
We often use the notation $a \sim b$ to denote that $a, b$ are related under the equivalence relation $\sim$. For $a \in S$, the set of elements $[a] = \{x \in S \mid x \sim a\}$ is called the equivalence class of $a$.
Theorem 1.1.3 The equivalence classes of an equivalence relation on a set $S$ constitute a partition of $S$.
Proof: Since $a \sim a$, we have $a \in [a]$, so every element of $S$ lies in some class. If $[a]$ and $[b]$ are two distinct equivalence classes where $b \notin [a]$, we must show that $[a] \cap [b] = \emptyset$. Suppose $c \in [a] \cap [b]$; then $a \sim c$ and $c \sim b$ (using symmetry), and therefore from transitivity $a \sim b$. This implies that $b \in [a]$, which is a contradiction. $\Box$
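The grouping argument in Theorem 1.1.3 can be sketched in code. The following is a minimal illustration, not part of the original notes; the function name `equivalence_classes` and the example relation (congruence modulo 3 on the integers 0 to 9) are assumptions chosen for illustration.

```python
def equivalence_classes(elements, related):
    """Group elements into the classes of the equivalence relation `related`."""
    classes = []
    for x in elements:
        for cls in classes:
            # By transitivity it suffices to compare with one representative.
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])  # x starts a new class
    return classes

# Hypothetical example: a ~ b iff a - b is divisible by 3.
classes = equivalence_classes(range(10), lambda a, b: (a - b) % 3 == 0)
print(classes)  # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Every element lands in exactly one list, which is the partition property the theorem asserts.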
A function $f$ is defined from a set of objects called the domain to another set called the range or co-domain. Intuitively, $f$ associates with each element of the domain a unique element of the range. We often write $f : A \to B$, and use $f(a)$ to denote the element (of the range) to which $a \in A$ is mapped by $f$. Sometimes $f(a)$ is called the image of $a$ (under $f$), and $a$ an inverse image of $f(a)$. The definition of a function also extends naturally to $k$-ary functions, i.e., $f$ has $k$ arguments. Another view is to think of $A$ as a set of ordered $k$-tuples.
Definition 1.1.4 A function $f : A \to B$ is onto if each element of $B$ is an image of at least one element of $A$. $f$ is one-to-one if for any two distinct $a, a'$, $f(a) \neq f(a')$. A function $f$ is a bijection if it is one-to-one and onto.
Bijections are especially useful for counting problems. For example, if we can find a bijection between (finite) sets $A$ and $B$, then the number of elements in $A$ equals that in $B$. One-to-one functions are even more useful for comparing the number of elements in infinite sets.
1.2 Counting and comparing infinite sets
The motivating question for this topic is "Are there more real numbers than rationals?" Both $R$ (the set of real numbers) and $Q$ (the set of rationals) are infinite sets, so how can we distinguish between the sizes of these sets? Similarly, we may want to compare the set of integers with the rationals.
Definition 1.2.1 Two sets $A$ and $B$ are called cardinally equivalent iff there is a bijective function $f : A \to B$; this will be denoted by $\#(A) = \#(B)$.
Example 1.2.2: Let $A$ be a finite non-empty set. Then there exists a unique integer $n$ such that $A$ is cardinally equivalent to $\{1, 2, \ldots, n\}$. We then say that $A$ has $n$ elements.
Example 1.2.3: Let $E$ be the set of even positive integers. Then $\#(E) = \#(Z^+)$, where $Z^+$ is the set of all positive integers, using the function $f : E \to Z^+$ where $f(n) = n/2$. This function is bijective, so intuitively the number of positive integers is the same as the number of even positive integers.
Definition 1.2.4 A set $S$ is countably infinite iff $\#(S) = \#(Z^+)$. A set $S$ is countable iff $S$ is finite or countably infinite.
Theorem 1.2.5 Every subset of a countable set is countable. A countable union of countable sets is countable.
Proof: For the first part, renumber the integers whose images are in the subset (i.e. take the corresponding subsequence of $1, 2, \ldots$). For the second part, simply construct a sequence that traverses the subsequences "diagonally." $\Box$
Lemma 1.2.6 The set of reals, $R$, is uncountable.
Definition 1.2.7 If there exists a surjective (onto) function $f : A \to B$, then $\#(A) \geq \#(B)$. Equivalently there is an injective (1-1) mapping $g : B \to A$.
If $\#(A) \leq \#(B)$ and $\#(A) \neq \#(B)$, then $\#(A) < \#(B)$.
Example 1.2.8: If $A \subseteq B$, then $\#(A) \leq \#(B)$. Consider the restriction of the identity map to $A$ (it is a one-to-one map from $A$ into $B$).
Theorem 1.2.9 If $S$ is any set then $\#(S) < \#(2^S)$, i.e. there is no bijection between a set and its power set.
1.3 Principle of Induction
One of the most useful proof techniques in discrete structures is the principle of induction. There are two well known (equivalent) formulations of this. To distinguish between these we will give them different names.
Principle of Mathematical Induction
Let $P(i)$ denote a predicate that is defined for an integer $i$. If $P(0)$ is true and for all $i \geq 0$, $P(i+1)$ is true whenever $P(i)$ is true, then $P(i)$ is true for all integers $i \geq 0$.
Principle of Complete Induction
Let $P(i)$ denote a predicate that is defined for an integer $i$. If $P(0)$ is true and for all $i \geq 0$, $P(i+1)$ is true whenever $P(j)$ is true for all $j \leq i$, then $P(i)$ is true for all integers $i \geq 0$.
Remark Although these are equivalent, we will find the second form easier to apply in most situations.
Problem Set
1. Let $S = \{(x, y) \mid x, y \text{ are reals}\}$. If $(a, b)$ and $(c, d)$ belong to $S$, define $(a, b) R (c, d)$ if $a^2 + b^2 = c^2 + d^2$. Prove that $R$ is an equivalence relation.
2. Let $S$ be the set of real numbers. If $a, b \in S$, define $a \sim b$ if $a - b$ is an integer. Show that $\sim$ is an equivalence relation.
3. Let $S$ be a set of integers. If $a, b \in S$, let $aRb$ if $a \cdot b \geq 0$. Is $R$ an equivalence relation on $S$? How about the relation $R'$ where $aR'b$ if $a + b$ is even?
4. Give examples of relations that are
reflexive and symmetric but not transitive
reflexive and transitive but not symmetric
symmetric and transitive but not reflexive
5. Show that for every positive integer $n$, $2^{2n} - 1$ is divisible by 3.
6. Show that for every positive integer $n$ and every real number $\theta$, $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$.
7. Fundamental Theorem of Arithmetic: Every integer greater than 1 is a prime or a product of primes, and the product is unique up to the order of the factors. Prove the existence part using induction.
8. Show that the set $Q \times Q$ is countable.
Chapter 2
Basic Counting
number of events of either type is $m + i$.
Proof: By induction on $i$. The base case is clearly true, i.e. when there are no events of type 2. Suppose it is true for $k$ events of type 2, namely there are $m + k$ events. When there are $k + 1$ events of type 2, there are two distinct possibilities: either the $(k+1)$st event occurs or it doesn't. Therefore, by invoking the induction hypothesis, the total number of possibilities is $m + k + 1$. $\Box$
Similarly one can prove the Multiplication Principle as well as the following generalizations given in exercises.
Definition 2.1.4 [Permutation and Combination] A permutation of $n$ distinct objects is an arrangement or ordering of the $n$ objects. An $r$-permutation of $n$ distinct objects is an arrangement using $r$ out of the $n$ objects. An $r$-combination of $n$ distinct objects is an unordered selection (subset) of size $r$.
We will denote the number of $r$-permutations and $r$-combinations of $n$ objects by $P(n, r)$ and $C(n, r)$ respectively.
From the multiplication principle, we obtain
$P(n, 2) = n(n-1)$, $P(n, 3) = n(n-1)(n-2)$
$P(n, n) = n(n-1)(n-2) \cdots 1 = n!$ ($n$ factorial)
$P(n, r) = n(n-1) \cdots (n-r+1) = \frac{n!}{(n-r)!}$
To obtain a formula for $r$-combinations, we will make use of an indirect technique. For every distinct $r$-subset, there are $P(r, r)$ distinct arrangements. Let us number the distinct $r$-permutations in some order, say $\pi_1, \pi_2, \ldots, \pi_t$ where $t = P(n, r)$. We can group them in a way such that each group corresponds to a distinct $r$-subset. From the previous observation, each group has $P(r, r)$ members. Therefore
$C(n, r) = \frac{P(n, r)}{P(r, r)} = \frac{n!}{(n-r)!\, r!}$
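The formulas for $P(n, r)$ and $C(n, r)$ can be checked against direct enumeration. This is a small sanity check rather than a proof; the values $n = 5$, $r = 3$ are arbitrary choices for illustration.

```python
from itertools import combinations, permutations
from math import comb, factorial, perm

n, r = 5, 3
items = range(n)

# Count by brute-force enumeration.
num_perms = sum(1 for _ in permutations(items, r))
num_combs = sum(1 for _ in combinations(items, r))

print(num_perms, perm(n, r), factorial(n) // factorial(n - r))  # 60 60 60
print(num_combs, comb(n, r), perm(n, r) // perm(r, r))          # 10 10 10
```

The last expression mirrors the grouping argument: the $P(n, r)$ arrangements fall into groups of $P(r, r)$, one group per $r$-subset.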
Caveat We could have invoked the multiplication principle, but we have to be careful about setting up the events appropriately. One of the most common pitfalls of counting problems is the temptation of applying the formulae carelessly. The formulae are usually simple but one must be careful about their applicability in a specific situation.
5. How many different rectangles can be drawn on an $8 \times 8$ chess board (rectangles can have lengths 1 through 8 and two rectangles are different if they contain a different subset of squares)?
6. What is the probability that a 4-digit telephone number has one or more repeated digits?
7. There are six French books, eight Russian books, and five different Spanish books. How many ways are there to arrange the books in a row with all books of the same language consecutively arranged?
8. How many ways are there to assign 10 students to 10 out of 20 sections?
9. A man has $n$ friends and invites a different subset of four of them to his house every night for a year (365 nights). How large must $n$ be?
10. What is the probability that the difference between the largest and the smallest numbers is $k$ in a subset of four different numbers chosen from 1 to 20 ($3 \leq k \leq 19$)?
11. How many points of intersection are formed by the chords of an $n$-gon (a regular polygon with $n$ sides) assuming that no three chords meet at a common point? How many line segments are formed by the intersections? Note that if a chord has $k$ intersection points then it has $k + 1$ segments.
12. How many integer solutions are there to the equation $x_1 + x_2 + x_3 + x_4 = 12$ with $x_i \geq 0$? How many solutions are there with $x_i \geq 1$?
13. In how many ways can you distribute 20 distinct flags onto 12 distinct flagpoles if, in arranging the flags on the poles, the order from the ground up makes a difference?
14. In how many ways can you distribute $r$ identical balls into $n$ distinct boxes with the first $m$ boxes collectively containing at least $s$ balls?
15. Eleven scientists are working on a secret project. They wish to lock up the documents in a cabinet such that the cabinet can be opened if and only if six or more scientists are present. What is the smallest number of locks required? What is the smallest number of keys that each scientist must carry?
16. In how many ways can three numbers be selected from the numbers $1, 2, \ldots, 300$ such that their sum is divisible by 3?
17. Show that $(k!)!$ is divisible by $(k!)^{(k-1)!}$.
18. A binary string is a sequence of 0's and 1's. How many binary strings of length $n$ contain an even number of 0's? If strings are over the alphabet $\{0, 1, 2\}$, then show that the number of strings where 0 appears an even number of times is $(3^n + 1)/2$.
19. A boolean function can be represented using a tabular form where all the $n$-digit binary numbers are listed along with the function values. How many boolean functions are possible?
A self-dual boolean function is a table which remains unchanged if all the 0's and 1's are swapped. How many self-dual boolean functions are there?
A symmetric boolean function is one that remains unchanged for any permutation of the $n$ input columns. How many symmetric boolean functions are there?
20. A system consists of four identical particles. The total energy in the system is $4E_0$ where $E_0$ is a positive constant. Each of the particles can have an energy level equal to $kE_0$ ($k = 0, 1, \ldots, 4$). A particle with energy $kE_0$ can occupy one of the $k^2 + 1$ distinct energy states at that energy level. How many different configurations (in terms of energy states occupied by the particles) can the system have?
Chapter 3
Introduction to Graphs
A graph $G = (V, E)$ consists of a finite set $V$ of vertices and a set $E$ of edges which are ordered pairs of vertices. Schematically, we represent graphs using a set of points that denote vertices, and edges by an arc joining the two defining vertices with an arrow indicating the ordering of the vertices. An undirected graph doesn't have directions associated with an edge. If we think of the edges as roads connecting vertices, then in the undirected case we can traverse an edge in either direction, whereas the directed graph is like one-way streets. Unless stated otherwise, a graph will be used to imply the undirected version.
There are several generalizations of the basic definition. If the set of edges forms a multiset, i.e., some edges have multiple instances, then it is a multigraph. One way to represent a multigraph is to label the edges with an integer denoting the number of occurrences of the edge. This may be regarded as a weighted graph, where each edge has an associated (integral) weight. In some cases, we will allow weights to be arbitrary real numbers.
A more complicated structure is a hypergraph, where the edges correspond to arbitrary subsets of vertices (and not necessarily pairs of vertices). The choice of a certain class of graphs depends on the application.
Graphs can be used to model very complex problems and some of the most intuitive examples are problems related to communication networks. A flow chart can be thought of as a graph where the nodes represent instructions and the edges indicate the flow of control.
m + n (Why?).
Another representation is using matrices of dimensions $n \times n$. If $A_G$ is the matrix corresponding to graph $G = (V, E)$, then $A_{i,j} = 1$ if $(i, j) \in E$ and 0 otherwise. Here we are assuming that the vertex set is $\{1, 2, \ldots, n\}$. The size of this representation is $n^2$ irrespective of the number of edges.
The motivation for having a good representation of graphs is to use computer programs for solving graph problems. The above two representations can be easily converted into appropriate data-structures.
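As a sketch of how the representations might become data structures: the matrix form follows the definition above, while the list form assumes that the other representation referred to is the usual adjacency list of size about $m + n$. The function names and the example graph are hypothetical.

```python
def adjacency_matrix(n, edges):
    """n x n 0/1 matrix for an undirected graph on vertices 1..n."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i - 1][j - 1] = 1
        A[j - 1][i - 1] = 1   # undirected: store both directions
    return A

def adjacency_lists(n, edges):
    """Map each vertex 1..n to the list of its neighbours."""
    adj = {v: [] for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    return adj

edges = [(1, 2), (2, 3), (1, 3), (3, 4)]   # hypothetical example graph
A = adjacency_matrix(4, edges)
adj = adjacency_lists(4, edges)
print(A)    # [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
print(adj)  # {1: [2, 3], 2: [1, 3], 3: [2, 1, 4], 4: [3]}
```

The matrix always takes $n^2$ entries, whereas the lists grow with the number of edges, which matters for sparse graphs.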
connected component, i.e. $C$ is maximal. If $C$ includes all vertices in the graph, then the underlying graph is connected.
Remark Note that for directed graphs, a path from $u$ to $v$ is not the same as a path from $v$ to $u$.
There are several algorithms for verifying if a given graph is connected, the most notable being Depth First Search and Breadth First Search. Among other consequences of these search techniques, they produce a Spanning Forest, which is a special kind of sub-graph.
Definition 3.2.3 A subgraph $S = (W, F)$ of a graph $G = (V, E)$ is a graph such that $W \subseteq V$ and $F \subseteq E$. A subgraph is a tree if it is connected and removal of any one edge disconnects some pair of vertices, i.e. it is a minimal connected graph. A set of disjoint trees is called a forest.
Lemma 3.2.4 The number of edges in a tree, $m$, is related to the number of vertices $n$ by the formula $m = n - 1$.
Corollary 3.2.5 If there are $k$ trees in a forest with $m$ edges and $n$ vertices then $m = n - k$.
Lemma 3.2.6 In a tree, there is a unique path between every pair of vertices.
Remark This is equivalent to saying that there are no cycles in a tree.
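A breadth-first search that records the edge along which each vertex is first discovered yields a spanning forest, and the count in Corollary 3.2.5 can be checked on it. This is a minimal sketch; the function name `spanning_forest` and the example graph are assumptions made for illustration.

```python
from collections import deque

def spanning_forest(n, edges):
    """Return (tree_edges, k): a BFS spanning forest and the number of components."""
    adj = {v: [] for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, tree_edges, k = set(), [], 0
    for root in range(1, n + 1):
        if root in seen:
            continue
        k += 1                       # a new connected component starts here
        seen.add(root)
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:    # first discovery: (u, v) is a tree edge
                    seen.add(v)
                    tree_edges.append((u, v))
                    queue.append(v)
    return tree_edges, k

edges = [(1, 2), (2, 3), (1, 3), (4, 5)]   # hypothetical graph; vertex 6 is isolated
forest, k = spanning_forest(6, edges)
print(forest, k)   # [(1, 2), (1, 3), (4, 5)] 3
```

Here $n = 6$, $k = 3$ and the forest has $m = 3 = n - k$ edges; the graph is connected exactly when $k = 1$.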
Remark The same holds true for edge-disjoint paths and edge-connectivity. The minimum number of vertices (edges) that must be removed to disconnect a graph is called the vertex (edge) connectivity of the graph and is usually denoted by $\kappa$ ($\lambda$).
Definition 3.3.6 A colouring of a graph assigns colours to vertices such that no two adjacent vertices have the same colour. The minimal number of colours required for a graph $G$ is called the chromatic number and is usually denoted by $\chi(G)$. An edge-colouring of a graph is a colouring of the edges such that no two edges that are incident on the same vertex get the same colour.
Clearly bipartite graphs are two-colourable. One of the classic colouring theorems concerns planar graphs.
Theorem 3.3.7 (four-colour theorem) Every planar graph is 4-colourable.
There are many natural problems that can be modelled as graph colouring.
Example 3.3.8: In a school each teacher has to teach a certain number of classes and each class must be taught by a certain number of teachers. The obvious constraints about scheduling the classes are that a teacher cannot teach two classes simultaneously and a class cannot be taught by two teachers simultaneously. We are interested in scheduling the classes in a way that takes the minimum number of hours (the duration of a lecture). It is not difficult to see that a valid schedule corresponds to colouring the edges of the graph whose vertices are the teachers and the classes, with an edge for every lecture a teacher must give to a class; each colour is one hour. So the answer to this problem is the minimum number of colours required.
The following is an important result on edge-colouring.
Theorem 3.3.9 (Vizing's Theorem) If the maximum degree of a graph is $d$, then we need $d$ or $d + 1$ colours to colour the edges.
Chapter 4
Counting techniques
The basic methods of counting using permutations and combinations are sometimes not adequate or are too complex to apply in many situations. Many techniques have been developed for specific problems and have grown from ad hoc to fairly general principles. We discuss three such techniques in this chapter, namely the pigeon-hole principle, the principle of inclusion and exclusion, and one of the most successful in recent years, called the probabilistic method.
or both. Roughly speaking, in a sequence of length $n$ there is an increasing or a decreasing subsequence of length $\lceil \sqrt{n} \rceil$.
Proof For each number $n_i$ of the sequence, let us label it with $(x_i, y_i)$, where $x_i$ is the length of the longest increasing subsequence beginning at $n_i$ and $y_i$ is the length of the longest decreasing subsequence ending at $n_i$. If there is no increasing subsequence of length $r$ and no decreasing subsequence of length $s$, then $1 \leq x_i \leq r - 1$ and $1 \leq y_i \leq s - 1$. Since there are more than $(r-1)(s-1)$ numbers, some pair must be repeated, say $x_i = x_j$ and $y_i = y_j$ for $j > i$. If $n_i < n_j$ then $x_i > x_j$, else $y_j > y_i$. Either case contradicts the assumption that the two labels are equal.
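The labelling used in the proof can be computed directly. The sketch below assumes a sequence of distinct numbers; the function name `labels` and the sample sequence are hypothetical.

```python
def labels(seq):
    """Return the pairs (x_i, y_i) used in the proof, for distinct numbers."""
    n = len(seq)
    x = [1] * n   # longest increasing subsequence beginning at position i
    y = [1] * n   # longest decreasing subsequence ending at position i
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            if seq[i] < seq[j]:
                x[i] = max(x[i], 1 + x[j])
    for i in range(n):
        for j in range(i):
            if seq[j] > seq[i]:
                y[i] = max(y[i], 1 + y[j])
    return list(zip(x, y))

seq = [3, 1, 4, 5, 9, 2, 6, 8, 7, 10]    # hypothetical sequence of distinct numbers
pairs = labels(seq)
print(len(set(pairs)) == len(seq))        # True: no label is repeated
```

Since the labels are distinct, they cannot all fit in a grid with fewer cells than there are numbers, which is the pigeon-hole step of the argument.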
less than $p_k \, C(n, k)$. (Note that the probability of the union of events is no more than the sum of the probabilities of the individual events.) If this probability is less than 1, it implies that among the sample space of all possible colourings of $K_n$, there exists some colouring in which none of the $C(n, k)$ cliques is monochromatic, i.e. $R(k, k) > n$.
Our next example is an important problem in graph theory. A dominating set $U$ of an undirected graph $G = (V, E)$ is a subset $U \subseteq V$ such that every vertex is either in $U$ or has a neighbour in $U$. The problem of computing a minimum cardinality dominating set is very hard (algorithmically intractable). But we can prove some interesting bounds using the probabilistic method.
Example 4.3.2: If the minimum degree of a graph is $\delta$, then there is a dominating set of size at most $n(1 + \log(\delta + 1))/(\delta + 1)$.
Pick every vertex independently with probability $p = \log(\delta + 1)/(\delta + 1)$ and let $X$ denote this sample. Let $Y$ be the set of vertices in $V$ that are not in $X$ and do not have a neighbour in $X$. The probability $q$ that a vertex $v$ belongs to $Y$ is the probability that neither $v$ nor any of its neighbours were picked in the sample, which is at most
$(1 - \log(\delta + 1)/(\delta + 1))^{\delta + 1} \leq e^{-\log(\delta + 1)} = 1/(\delta + 1)$
So the expected size of $Y$ is $nq$, the expected size of $X$ is $np$, and using the linearity of expectation $E[|X| + |Y|] = n(p + q)$, which works out to be at most $n(1 + \log(\delta + 1))/(\delta + 1)$. This means that there is some choice of $X$ for which there is a dominating set $(X \cup Y)$ of the required size. In fact we can claim something stronger: by choosing the vertices randomly, the probability that the dominating set exceeds twice the stated bound is less than half (Markov's inequality).
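The sampling argument of Example 4.3.2 translates almost line by line into code. The following is a hedged sketch: it assumes the logarithm is the natural logarithm, and the function name `random_dominating_set`, the example cycle graph and the fixed random seed are choices made only for illustration.

```python
import math
import random

def random_dominating_set(adj, rng):
    """adj maps each vertex to the set of its neighbours (undirected graph)."""
    delta = min(len(nbrs) for nbrs in adj.values())           # minimum degree
    p = math.log(delta + 1) / (delta + 1)                      # natural log assumed
    X = {v for v in adj if rng.random() < p}                   # random sample
    Y = {v for v in adj if v not in X and not (adj[v] & X)}    # vertices X fails to dominate
    return X | Y

# Hypothetical example: a cycle on 12 vertices (minimum degree 2).
n = 12
adj = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}
D = random_dominating_set(adj, random.Random(0))
dominated = all(v in D or adj[v] & D for v in adj)
print(dominated)  # True
```

The set returned is always dominating, whatever the random choices; the probabilistic argument only concerns its expected size.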
2. How many 1-1 functions exist from $\{1, 2, \ldots, m\}$ to $\{1, 2, \ldots, n\}$ (for $n \geq m$)?
For $n \leq m$, show that the number of onto functions is given by
$n^m - C(n, 1)(n-1)^m + C(n, 2)(n-2)^m - \cdots + (-1)^{n-1} C(n, n-1) 1^m$
3. There are 10 pairs of shoes in a closet. In how many ways can eight shoes be chosen such that no pair is chosen? Exactly one pair is chosen?
4. Given $n + 1$ different positive integers $\leq 2n$, show that there exists a pair that adds up to $2n + 1$.
5. Prove that in any $n + 1$ integers there will be a pair which differs by a multiple of $n$. Using this or otherwise show that there exists some subset of $n$ arbitrary positive integers whose summation is a multiple of $n$.
6. Given an equilateral triangle $T$ (with unit side length), show that it is not possible to cover $T$ with three circles each of diameter less than $\frac{1}{\sqrt{3}}$.
12. A chocolate company is offering a prize for anyone who can collect pictures of $n$ different cricketers, where each wrap has one picture. Assuming that each chocolate can have any of the pictures with equal probability, what is the expected number of chocolates one must buy to get all the $n$ different pictures?
13. In a temple, thirty persons give their shoes to the caretaker who hands back the shoes at random. What is the expected number of persons who get back their own shoes?
14. Imagine that you are lost in a new city where you come across a crossroad. Only one of the roads leads you to your destination in 1 hour. The others bring you back to the same point after 2, 3 and 4 hours respectively. Assuming that you choose each of the roads with equal probability, what is the expected time to arrive at your destination?
15. Gabbar Singh problem Given that there are three consecutive blanks and three consecutive loaded chambers in a pistol, and you start firing the pistol from a random chamber, calculate the following probabilities. (i) The first shot is a blank. (ii) The second shot is also a blank given that the first shot was a blank. (iii) The third shot is a blank given that the first two were blanks.
16. A gambler uses the following strategy. The first time he bets Rs. 100; if he wins, he quits. Otherwise, he bets Rs. 200 and quits regardless of the result. What is the probability that he goes back a winner, assuming that he has probability 1/2 of winning each of the bets?
What is the generalization of the above strategy?
17. Three prisoners are informed by the jailor that one of them will be acquitted, without divulging the identity. One of the prisoners requests the jailor to divulge the identity of one of the other prisoners who won't be acquitted. The jailor reasons that since at least one of the remaining two will not be acquitted, he may reveal the identity. However this makes this prisoner very happy. Can you explain this?
18. Show that $R(s, g) \geq (s-1)(g-1) + 1$ using explicit construction, i.e. describe a colouring on $K_{(s-1)(g-1)}$.
19. Verify that $R(k, k) > 2^{k/2}$ using the probabilistic method. Note that this is a much superior bound compared to the previous problem.
20. Let $W(k)$ be the least $n$ such that if the set $\{1, 2, \ldots, n\}$ is two-coloured, there exists a monochromatic arithmetic progression of $k$ terms. Show that $W(k) > 2^{k/2}$ using the probabilistic method.
Definition 4.5.2 A random variable (r.v.) $X$ is a real-valued function over the sample space, $X : \Omega \to R$. A discrete random variable is a random variable whose range is a finite or countably infinite subset of $R$.
The distribution function $F_X : R \to [0, 1]$ for a random variable $X$ is defined as $F_X(x) = \Pr[X \leq x]$. The probability density function of a discrete r.v. $X$, $f_X$, is given by $f_X(x) = \Pr[X = x]$.
The expectation of a r.v. $X$ is denoted by $E[X] = \sum_x x \Pr[X = x]$.
A very useful property of expectation, called the linearity property, can be stated as follows.
Lemma 4.5.3 If $X$ and $Y$ are random variables, then
$E[X + Y] = E[X] + E[Y]$
Remark Note that $X$ and $Y$ do not have to be independent!
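The remark can be illustrated with two variables that are as dependent as possible. In the sketch below, which is an illustration and not part of the notes, $X$ is the face of a fair die and $Y = X^2$; exact arithmetic with fractions avoids rounding issues.

```python
from fractions import Fraction

p = Fraction(1, 6)                  # fair die: each face has probability 1/6
faces = range(1, 7)

e_x = sum(x * p for x in faces)                # E[X]
e_y = sum(x * x * p for x in faces)            # E[Y] with Y = X^2, fully dependent on X
e_sum = sum((x + x * x) * p for x in faces)    # E[X + Y]

print(e_x, e_y, e_sum)        # 7/2 91/6 56/3
print(e_sum == e_x + e_y)     # True
```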
Definition 4.5.4 The conditional probability of $E_1$ given $E_2$ is denoted by $\Pr[E_1 \mid E_2]$ and is given by
$\Pr[E_1 \mid E_2] = \frac{\Pr[E_1 \cap E_2]}{\Pr[E_2]}$
assuming $\Pr[E_2] > 0$.
Definition 4.5.5 A collection of events $\{E_i \mid i \in I\}$ is independent if for all subsets $S \subseteq I$
$\Pr[\bigcap_{i \in S} E_i] = \prod_{i \in S} \Pr[E_i]$
Remark $E_1$ and $E_2$ are independent if $\Pr[E_1 \mid E_2] = \Pr[E_1]$.
The conditional probability of a random variable $X$ with respect to another random variable $Y$, denoted by $\Pr[X = x \mid Y = y]$, is defined as in the previous definition with the events $E_1, E_2$ taken as $X = x$ and $Y = y$ respectively. The conditional expectation is defined as
$E[X \mid Y = y] = \sum_x x \cdot \Pr[X = x \mid Y = y]$
The theorem of total expectation, which can be proved easily, states that
$E[X] = \sum_y \Pr[Y = y] \cdot E[X \mid Y = y]$
Chapter 5
Recurrences and generating functions
Given a sequence $a_1, a_2, \ldots, a_n$ (i.e. a function with the domain as integers), a compact way of representing it is an equation in terms of itself, a recurrence relation. One of the most common examples is the Fibonacci sequence specified as $a_n = a_{n-1} + a_{n-2}$ for $n \geq 2$ and $a_0 = 0$, $a_1 = 1$. The values $a_0, a_1$ are known as the boundary conditions. Given this and the recurrence, we can compute the sequence step by step, or better still we can write a computer program. Sometimes, we would like to find the general term of the sequence. Very often, the running time of an algorithm is expressed as a recurrence and we would like to know the explicit function for the running time to make any predictions and comparisons. A typical recurrence arising from a divide-and-conquer algorithm is
$a_{2n} = 2a_n + cn$
which has a solution $a_n \leq 2cn\lceil \log_2 n \rceil$. In the context of algorithm analysis, we are often satisfied with an upper bound. However, to the extent possible, it is desirable to obtain an exact expression.
Unfortunately, there is no general method for solving all recurrence relations. In this chapter, we discuss solutions to some important classes of recurrence equations. In the second part we discuss an important technique based on generating functions, which are also important in their own right.
Example 5.1.1: The number of moves required to solve the Tower of Hanoi problem with $n$ discs can be written as
$a_n = 2a_{n-1} + 1$
By substituting for $a_{n-1}$ this becomes
$a_n = 2^2 a_{n-2} + 2 + 1$
By expanding this till $a_1$, we obtain
$a_n = 2^{n-1} a_1 + 2^{n-2} + \cdots + 1$
This gives $a_n = 2^n - 1$ by using the formula for geometric series and $a_1 = 1$.
Example 5.1.2: For the recurrence
$a_{2n} = 2a_n + cn$
we can use the same technique to show that $a_{2n} = \sum_{i=0}^{\log_2 n} 2^i \cdot c \cdot n/2^i + 2n a_1$.
Remark We made an assumption that $n$ is a power of 2. In the general case, this may present some technical complication but the nature of the answer remains unchanged.
Consider the recurrence
$T(n) = 2T(\lfloor n/2 \rfloor) + n$
Suppose $T(x) \leq cx \log_2 x$ for some constant $c > 0$ for all $x < n$. Then $T(n) \leq 2c\lfloor n/2 \rfloor \log_2 \lfloor n/2 \rfloor + n$. Then $T(n) \leq cn \log_2(n/2) + n \leq cn \log_2 n - cn + n \leq cn \log_2 n$ for $c \geq 1$.
A very frequent recurrence equation that comes up in the context of divide-and-conquer algorithms (like mergesort) has the form
$T(n) = aT(n/b) + f(n)$, where $a, b$ are constants and $f(n)$ is a positive monotonic function.
Theorem 5.1.3 For the following different cases, the above recurrence has the following solutions
If $f(n) = O(n^{\log_b a - \epsilon})$ for some constant $\epsilon > 0$, then $T(n)$ is $\Theta(n^{\log_b a})$.
If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some constant $\epsilon > 0$, and if $af(n/b) \leq cf(n)$ for some constant $c < 1$, then $T(n)$ is $\Theta(f(n))$.
Example 5.1.4: What is the maximum number of regions induced by $n$ lines in the plane? If we let $L_n$ represent the number of regions, then we can write the following recurrence
$L_n \leq L_{n-1} + n$, $L_0 = 1$
Again by the method of summation, we can arrive at the answer $L_n = \frac{n(n+1)}{2} + 1$.
Example 5.1.5: Let us try to solve the recurrence for Fibonacci, namely
$F_n = F_{n-1} + F_{n-2}$, $F_0 = 0$, $F_1 = 1$
If we try to expand this in the way that we have done previously, it becomes unwieldy very quickly. Instead we "guess" the following solution
$F_n = \frac{1}{\sqrt{5}} \left( \phi^n - \bar{\phi}^n \right)$
where $\phi = \frac{1 + \sqrt{5}}{2}$ and $\bar{\phi} = \frac{1 - \sqrt{5}}{2}$. The above solution can be verified by induction. Of course it is far from clear how one can magically guess the right solution. We shall address this later in the chapter.
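Before attempting the induction, the guessed formula can be checked numerically against the recurrence. This is a sketch using floating point, so it is only trustworthy for moderate $n$; the function names are hypothetical.

```python
from math import sqrt

def fib(n):
    """F_n computed step by step from the recurrence."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

phi = (1 + sqrt(5)) / 2
phi_bar = (1 - sqrt(5)) / 2

def closed_form(n):
    # Floating point, so round to the nearest integer; reliable only for moderate n.
    return round((phi ** n - phi_bar ** n) / sqrt(5))

print(all(closed_form(n) == fib(n) for n in range(40)))  # True
```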
This observation (of unique solution) makes it somewhat easier for us to guess some solution and verify.
Let us guess a solution of the form $a_r = A\alpha^r$ where $A$ is some constant. This may be justified from the solution of Example 5.1. By substituting this in the homogeneous linear recurrence and simplification, we obtain the following equation
$$c_0\alpha^k + c_1\alpha^{k-1} + \ldots + c_k = 0$$
This is called the characteristic equation of the recurrence relation and this degree k equation has k roots, say $\alpha_1, \alpha_2, \ldots, \alpha_k$. If these are all distinct then the following is a solution to the recurrence
$$a_r^{(h)} = A_1\alpha_1^r + A_2\alpha_2^r + \ldots + A_k\alpha_k^r$$
which is also called the homogeneous solution to the linear recurrence. The values of $A_1, A_2, \ldots, A_k$ can be determined from the k boundary conditions (by solving k simultaneous equations).
When the roots are not unique, i.e. some roots have multiplicity, then for a root $\alpha$ of multiplicity m, $\alpha^n, n\alpha^n, n^2\alpha^n, \ldots, n^{m-1}\alpha^n$ are the associated solutions. This follows from the fact that if $\alpha$ is a multiple root of the characteristic equation, then it is also a root of the derivative of the equation.
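As a worked illustration of both situations, the sketch below uses two recurrences chosen by us (they do not appear in the notes): $a_r = 5a_{r-1} - 6a_{r-2}$ with distinct roots 2 and 3, and $a_r = 4a_{r-1} - 4a_{r-2}$ with the double root 2. The constants in the closed forms were obtained by hand from the stated boundary conditions.

```python
def iterate(c1, c2, a0, a1, n):
    """a_n for the recurrence a_r = c1*a_{r-1} + c2*a_{r-2}."""
    a, b = a0, a1
    for _ in range(n):
        a, b = b, c1 * b + c2 * a
    return a

def distinct_roots(n):
    # alpha^2 - 5*alpha + 6 = 0 has roots 2 and 3.
    # a_0 = 0, a_1 = 1 give A1 + A2 = 0 and 2*A1 + 3*A2 = 1, so A1 = -1, A2 = 1.
    return 3 ** n - 2 ** n

def repeated_root(n):
    # alpha^2 - 4*alpha + 4 = 0 has the double root 2, so a_n = (A + B*n) * 2^n.
    # a_0 = 1, a_1 = 4 give A = 1 and B = 1.
    return (1 + n) * 2 ** n

for n in range(6):
    print(n, iterate(5, -6, 0, 1, n), distinct_roots(n),
          iterate(4, -4, 1, 4, n), repeated_root(n))
```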
5.2.2 Inhomogeneous equations
If $f(n) \ne 0$, then there is no general methodology. Solutions are known for some particular cases, known as particular solutions. Let $a_n^{(h)}$ be the solution obtained by ignoring $f(n)$ and let $a_n^{(p)}$ be a particular solution; then it can be verified that $a_n = a_n^{(h)} + a_n^{(p)}$ is a solution to the non-homogeneous recurrence.
The following is a table of some particular solutions

    f(n)                              particular solution
    d, a constant                     B
    dn                                B_1 n + B_0
    dn^2                              B_2 n^2 + B_1 n + B_0
    e d^n, where e, d are constants   B d^n

Here $B, B_0, B_1, B_2$ are constants to be determined by substituting the particular solution into the recurrence. When $f(n) = f_1(n) + f_2(n)$ is a sum of the above functions then we solve the equation for $f_1(n)$ and $f_2(n)$ separately and then add them in the end to obtain a particular solution for $f(n)$.
5.3 Generating functions
An alternative representation for a sequence $a_1, a_2, \ldots, a_i$ is a polynomial function $a_1 x + a_2 x^2 + \ldots + a_i x^i$. Polynomials are very useful objects in mathematics, in particular as "placeholders." For example if we know that two polynomials are equal (i.e. they evaluate to the same value for all x), then all the corresponding coefficients must be equal. This follows from the well known property that a degree d polynomial has no more than d distinct roots (unless it is the zero polynomial). The issue of convergence is not important at this stage but will be relevant when we use the method of differentiation.
Example 5.3.1 : Consider the problem of changing a Rs 100 note using notes of the following denominations - 50, 20, 10, 5 and 1. Suppose we have an infinite supply of each denomination; then we can represent each of these using the following polynomials, where the coefficient corresponding to $x^i$ is non-zero if we can obtain the sum i using the given denomination.
$$P_1(x) = x^0 + x^1 + x^2 + \ldots$$
$$P_5(x) = x^0 + x^5 + x^{10} + x^{15} + \ldots$$
$$P_{10}(x) = x^0 + x^{10} + x^{20} + x^{30} + \ldots$$
$$P_{20}(x) = x^0 + x^{20} + x^{40} + x^{60} + \ldots$$
$$P_{50}(x) = x^0 + x^{50} + x^{100} + x^{150} + \ldots$$
For example, we cannot have 51 to 99 using Rs 50, so all those coefficients are zero.
By multiplying these polynomials we obtain
$$P(x) = E_0 + E_1 x + E_2 x^2 + \ldots + E_{100} x^{100} + \ldots + E_i x^i + \ldots$$
where $E_i$ is the number of ways the terms of the polynomials can combine such that the sum of the exponents is i. Convince yourself that this is precisely what we are looking for. However we must still obtain a formula for $E_{100}$ or more generally $E_i$, which is the number of ways of changing a sum of i.
Note that for the polynomials $P_1, P_5, \ldots, P_{50}$, the following holds
$$P_k(x)\,(1 - x^k) = 1 \quad \text{for } k = 1, 5, \ldots, 50$$
giving
$$P(x) = \frac{1}{(1-x)(1-x^5)(1-x^{10})(1-x^{20})(1-x^{50})}$$
We can now use the observations that $\frac{1}{1-x} = 1 + x + x^2 + x^3 + \ldots$ and $\frac{1}{(1-x)(1-x^5)} = \ldots$
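The coefficient $E_{100}$ can also be extracted by multiplying the truncated series directly. The sketch below is one way to do that, under our own naming (`count_ways`); it relies only on the identity $P_k(x)(1 - x^k) = 1$ and discards all terms above the target degree.

```python
def count_ways(target, denominations):
    """Coefficient of x^target in the product of 1/(1 - x^k) over the denominations."""
    coeffs = [1] + [0] * target        # the constant polynomial 1
    for k in denominations:
        # Multiplying by P_k(x) = 1 + x^k + x^(2k) + ... can be done in place:
        # if new = old * P_k then new * (1 - x^k) = old, i.e.
        # new[i] = old[i] + new[i - k].
        for i in range(k, target + 1):
            coeffs[i] += coeffs[i - k]
    return coeffs[target]

print(count_ways(100, [1, 5, 10, 20, 50]))
```

For a small sanity check, `count_ways(10, [1, 5, 10, 20, 50])` counts the four ways 10, 5+5, 5+1+1+1+1+1 and ten 1s.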
Let us try the method of generating functions on the Fibonacci sequence.
Example 5.3.2 : Let the generating function be $G(z) = F_0 + F_1 z + F_2 z^2 + \ldots + F_n z^n + \ldots$ where $F_i$ is the i-th Fibonacci number. Then $G(z) - zG(z) - z^2 G(z)$ can be written as the infinite sequence
$$F_0 + (F_1 - F_0)z + (F_2 - F_1 - F_0)z^2 + \ldots + (F_{i+2} - F_{i+1} - F_i)z^{i+2} + \ldots = z$$
for $F_0 = 0, F_1 = 1$. Therefore $G(z) = \frac{z}{1 - z - z^2}$. This can be worked out to be
$$G(z) = \frac{1}{\sqrt{5}}\left(\frac{1}{1 - \phi z} - \frac{1}{1 - \bar{\phi} z}\right)$$
where $\bar{\phi} = 1 - \phi = \frac{1}{2}\left(1 - \sqrt{5}\right)$.
5.3.1 Binomial theorem
The use of generating functions necessitates computation of the coefficients of power series of the form $(1 + x)^{\alpha}$ for $|x| < 1$ and any $\alpha$. For that the following result is very useful - the coefficient of $x^k$ is given by
$$C(\alpha, k) = \frac{\alpha(\alpha - 1) \cdots (\alpha - k + 1)}{k(k-1) \cdots 1}$$
This can be seen from an application of Taylor's series. Let $f(x) = (1 + x)^{\alpha}$. Then from Taylor's theorem, expanding around 0 for some z,
$$f(z) = f(0) + z f'(0) + z^2\frac{f''(0)}{2!} + \ldots + z^k\frac{f^{(k)}(0)}{k!} + \ldots$$
$$= 1 + \alpha z + z^2\frac{\alpha(\alpha - 1)}{2!} + \ldots + C(\alpha, k) z^k + \ldots$$
Therefore $(1 + z)^{\alpha} = \sum_{i=0}^{\infty} C(\alpha, i) z^i$ which is known as the binomial theorem.
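The generalized coefficient is easy to compute exactly for rational exponents. The following sketch (function names `gen_binom` and `series` are ours) evaluates $C(\alpha, k)$ with exact fractions and compares a truncated series with a direct evaluation of $(1+z)^{\alpha}$ for a small $|z| < 1$.

```python
from fractions import Fraction

def gen_binom(alpha, k):
    """C(alpha, k) = alpha (alpha-1) ... (alpha-k+1) / k! for rational alpha."""
    result = Fraction(1)
    for j in range(k):
        result *= Fraction(alpha) - j
        result /= j + 1
    return result

def series(alpha, z, terms=60):
    """Partial sum of sum_i C(alpha, i) z^i, evaluated in floating point."""
    return sum(float(gen_binom(alpha, i)) * z ** i for i in range(terms))

print(gen_binom(Fraction(1, 2), 2))              # coefficient of z^2 in sqrt(1 + z)
print(series(Fraction(1, 2), 0.2), 1.2 ** 0.5)   # the two values should agree closely
```

For a non-negative integer $\alpha$ the coefficients vanish for $k > \alpha$ and the familiar finite binomial expansion is recovered.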
rapidly growing function like n!. For example, if we consider the generating function for the number of permutations of n identical objects
$$G(z) = 1 + \frac{p_1}{1!} z + \frac{p_2}{2!} z^2 + \ldots + \frac{p_i}{i!} z^i + \ldots$$
where $p_i = P(i, i)$ is the number of permutations of i identical objects, i.e. $p_i = 1$. Then $G(z) = e^z$. The number of permutations of r objects when selected out of (an infinite supply of) n kinds of objects is given by the exponential generating function (EGF)
$$\left(1 + \frac{p_1}{1!} x + \frac{p_2}{2!} x^2 + \ldots\right)^n = e^{nx} = \sum_{r=0}^{\infty} \frac{n^r}{r!} x^r$$
Example 5.4.1 : Let $D_n$ denote the number of derangements of n objects. Then it can be shown that $D_n = (n-1)(D_{n-1} + D_{n-2})$. This can be rewritten as $D_n - nD_{n-1} = -(D_{n-1} - (n-1)D_{n-2})$. Iterating this, we obtain $D_n - nD_{n-1} = (-1)^{n-2}(D_2 - 2D_1)$.
Using $D_2 = 1, D_1 = 0$, we obtain
$$D_n - nD_{n-1} = (-1)^{n-2} = (-1)^n.$$
Multiplying both sides by $\frac{x^n}{n!}$, and summing from n = 2 to $\infty$, we obtain
$$\sum_{n=2}^{\infty} \frac{D_n}{n!} x^n - \sum_{n=2}^{\infty} \frac{nD_{n-1}}{n!} x^n = \sum_{n=2}^{\infty} \frac{(-1)^n}{n!} x^n$$
If we let D(x) represent the exponential generating function for derangements, after simplification, we get
$$D(x) - D_1 x - D_0 - x(D(x) - D_0) = e^{-x} - (1 - x)$$
or $D(x) = \frac{e^{-x}}{1 - x}$ (using $D_0 = 1$).
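The generating function can be checked against the recurrence: the coefficient of $x^n$ in $e^{-x}/(1-x)$ is $\sum_{k \le n} (-1)^k / k!$, so $D_n$ should equal $n!$ times that sum. The sketch below is an illustrative check with our own function names, taking $D_0 = 1$.

```python
from fractions import Fraction
from math import factorial

def derangements(n_max):
    """[D_0, ..., D_{n_max}] from D_n = (n-1)(D_{n-1} + D_{n-2}), D_0 = 1, D_1 = 0."""
    D = [1, 0]
    for n in range(2, n_max + 1):
        D.append((n - 1) * (D[n - 1] + D[n - 2]))
    return D[: n_max + 1]

def egf_coefficient(n):
    """n! times the coefficient of x^n in e^{-x} / (1 - x)."""
    return factorial(n) * sum(Fraction((-1) ** k, factorial(k)) for k in range(n + 1))

D = derangements(8)
print(D)
print([int(egf_coefficient(n)) for n in range(9)])
```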
where $C_{i,j}$ are constants. We will use the technique of generating functions to extend the one variable method. Let
$$A_0(x) = a_{0,0} + a_{0,1} x + \ldots + a_{0,r} x^r + \ldots$$
$$A_1(x) = a_{1,0} + a_{1,1} x + \ldots + a_{1,r} x^r + \ldots$$
$$A_n(x) = a_{n,0} + a_{n,1} x + \ldots + a_{n,r} x^r + \ldots$$
Then we can define a generating function with $A_0(x), A_1(x), A_2(x), \ldots$ as the sequence - the new indeterminate can be chosen as y.
$$A_y(x) = A_0(x) + A_1(x) y + A_2(x) y^2 + \ldots + A_n(x) y^n + \ldots$$
For the above example, we have
$$F_n(x) = C(n, 0) + C(n, 1) x + C(n, 2) x^2 + \ldots + C(n, r) x^r + \ldots$$
$$\sum_{r=0}^{\infty} C(n, r) x^r = \sum_{r=1}^{\infty} C(n-1, r-1) x^r + \sum_{r=0}^{\infty} C(n-1, r) x^r$$
$$F_n(x) - C(n, 0) = x F_{n-1}(x) + F_{n-1}(x) - C(n-1, 0)$$
$$F_n(x) = (1 + x) F_{n-1}(x)$$
or $F_n(x) = (1 + x)^n C(0, 0) = (1 + x)^n$ as expected.
This is also known as the z-transform of X and it is easily seen that $G_X(1) = 1 = \sum_i p_i$. The convergence of the PGF is an important issue for some calculations involving differentiation of the PGF. For example,
$$E[X] = \left.\frac{dG_X(z)}{dz}\right|_{z=1}$$
The notion of expectation of a random variable can be extended to a function f(X) of the random variable X in the following way
$$E[f(X)] = \sum_i p_i f(i), \qquad p_i = \Pr[X = i]$$
Therefore, the PGF of X is the same as $E[z^X]$. A particularly useful quantity for a number of probabilistic calculations is the Moment Generating Function (MGF) defined as
$$M_X(\theta) = E[e^{X\theta}]$$
Since
$$e^{X\theta} = 1 + X\theta + \frac{X^2\theta^2}{2!} + \ldots + \frac{X^k\theta^k}{k!} + \ldots$$
$$M_X(\theta) = 1 + E[X]\theta + \ldots + \frac{E[X^k]\theta^k}{k!} + \ldots$$
from which $E[X^k]$, also known as higher moments, can be calculated. There is also a very useful theorem known for independent random variables $Y_1, Y_2, \ldots, Y_t$. If $Y = Y_1 + Y_2 + \ldots + Y_t$, then
$$M_Y(\theta) = M_{Y_1}(\theta) \cdot M_{Y_2}(\theta) \cdots M_{Y_t}(\theta)$$
i.e., the MGF of the sum of independent random variables is the product of the individual MGF's.
5.6.1 Probabilistic inequalities
In many applications, especially in the analysis of randomized algorithms, we want to guarantee correctness or running time. Suppose we have a bound on the expectation. Then the following inequality known as Markov's inequality can be used (for a non-negative random variable X).
Markov's inequality
$$\Pr[X \ge kE[X]] \le \frac{1}{k} \qquad (5.6.1)$$
Unfortunately there is no symmetric result.
If we have knowledge of the second moment, then the following gives a stronger result
Chebychev's inequality
$$\Pr[(X - E[X])^2 \ge t] \le \frac{\sigma^2}{t} \qquad (5.6.2)$$
where $\sigma^2$ is the variance, i.e. $E[X^2] - E^2[X]$.
With knowledge of higher moments, we have the following inequality. If $X = \sum_i^n x_i$ is the sum of n mutually independent random variables where $x_i$ is uniformly distributed in {-1, +1}, then for any $\delta > 0$ and $\lambda > 0$,
Chernoff bounds
$$\Pr[X \ge \delta] \le e^{-\lambda\delta} E[e^{\lambda X}] \qquad (5.6.3)$$
If we choose $\lambda = \delta/n$, the RHS becomes at most $e^{-\delta^2/2n}$.
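For the sum of n independent uniform ±1 variables the tail probability can be computed exactly, which makes it easy to see how loose each bound is. The sketch below is illustrative only; the function names and the sample values n = 20, δ = 10 are our own choices, and the Chernoff bound is taken in the form $e^{-\delta^2/2n}$.

```python
from math import comb, exp

def exact_tail(n, delta):
    """Pr[X >= delta] for X = sum of n independent uniform +/-1 variables."""
    # With h variables equal to +1, X = 2h - n.
    favourable = sum(comb(n, h) for h in range(n + 1) if 2 * h - n >= delta)
    return favourable / 2 ** n

def chernoff_bound(n, delta):
    return exp(-delta ** 2 / (2 * n))

def chebyshev_bound(n, delta):
    # E[X] = 0 and Var[X] = n, so Pr[X^2 >= delta^2] <= n / delta^2.
    return n / delta ** 2

n, delta = 20, 10
print(exact_tail(n, delta), chernoff_bound(n, delta), chebyshev_bound(n, delta))
```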
looking at the degree 1 vertices attached to all possible nodes - $v_1$ to $v_{m-1}$ - and by the addition principle we can add them up. Notice that the degree sequences are different in each case. If more than one vertex of degree 1 is connected to $v_j$, note that it suffices to consider any one of them, since there is only one way that it can be connected. Summing over all instances of $d_i$ we obtain
$$\sum_{d_i - 1 \ge 1} \frac{(m-3)!}{(d_1 - 1)!\,(d_2 - 1)! \cdots (d_i - 2)! \cdots (d_j - 1)! \cdots}$$
Multiplying numerator and denominator by $d_i - 1$, each term becomes
$$\frac{(m-3)!\,(d_i - 1)}{(d_1 - 1)!\,(d_2 - 1)! \cdots (d_i - 1)! \cdots}$$
Summing over all degree sequences
$$\sum_{d_i - 1 \ge 1} \frac{(n-3)!\,(d_i - 1)}{(d_1 - 1)!\,(d_2 - 1)! \cdots (d_i - 1)! \cdots (d_j - 1)! \cdots}$$
where $b_{n,p}$ is the number of binary trees with n nodes and internal path length p.
For example (by brute force calculation)
$$B(w, z) = 1 + z + 2wz^2 + (w^2 + 4w^3)z^3 + \ldots$$
Clearly B(1, z) = generating function for the number of (oriented) trees with n nodes.
$$b_{n,p} = \sum_{k+l=n-1,\; r+s+n-1=p} b_{k,r}\, b_{l,s}$$
It follows that $zB^2(w, wz) = B(w, z) - 1$.
By taking the partial derivative wrt w we obtain
$$2zB(w, wz)\left(B_w(w, wz) + zB_z(w, wz)\right) = B_w(w, z)$$
Let H(z) be the generating function for the total internal path length with n nodes, then
$$H(z) = B_w(1, z) = \sum_n h_n z^n.$$
Moreover $H(z) = 2zB(z)\left(H(z) + zB'(z)\right)$. Using the formula for B(z) (Catalan numbers),
$$H(z) = \frac{1}{1 - 4z} - \frac{1}{z}\left(\frac{1 - z}{\sqrt{1 - 4z}} - 1\right)$$
giving
$$h_n = 4^n - \frac{3n + 1}{n + 1}\, C(2n, n)$$
The average value of total internal path length is $h_n/b_n$ and the average value of the path length of a node is $h_n/(n b_n)$. The asymptotic value of this is $\sqrt{\pi n} - 3 + O(1)$.
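The closed form for $h_n$ can be cross-checked against the convolution recurrence for $b_{n,p}$ by direct enumeration. This is a small sketch under our own naming; it tabulates $b_{n,p}$ for small n and compares $\sum_p p\, b_{n,p}$ with the formula.

```python
from math import comb

def path_length_table(n_max):
    """b[n][p] = number of binary trees with n nodes and internal path length p."""
    b = [{0: 1}]                                  # only the empty tree has 0 nodes
    for n in range(1, n_max + 1):
        row = {}
        for k in range(n):                        # k nodes in the left subtree
            l = n - 1 - k                         # l nodes in the right subtree
            for r, left in b[k].items():
                for s, right in b[l].items():
                    p = r + s + n - 1             # each non-root node sinks one level
                    row[p] = row.get(p, 0) + left * right
        b.append(row)
    return b

def total_path_length(n):
    """h_n = 4^n - (3n+1)/(n+1) * C(2n, n); the division is exact."""
    return 4 ** n - (3 * n + 1) * comb(2 * n, n) // (n + 1)

b = path_length_table(6)
print(b[3])                                       # terms of the z^3 coefficient
print([sum(p * c for p, c in row.items()) for row in b])
print([total_path_length(n) for n in range(7)])
```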
Chapter 6
Modular Arithmetic
In this chapter, we will discuss some useful properties of numbers when calculations are done modulo n, where n > 0. In the context of computer science, n is usually a power of 2 since representation is binary.
6.1 Divisibility
Definition 6.1.1 An integer b is divisible by an integer a ($a \ne 0$), if there is an integer x such that b = ax. This will be denoted by $a \mid b$.
We begin by formalising some elementary observations about integer division.
Theorem 6.1.2 1. $a \mid b$ implies $a \mid bc$ for any integer c.
2. $a \mid b$ and $b \mid c$ implies $a \mid c$.
3. $a \mid b$ and $a \mid c$ implies $a \mid bx + cy$.
4. if $m \ne 0$ then $a \mid b \iff ma \mid mb$.
Theorem 6.1.3 (Division Algorithm) Given integers a and b with a > 0, there exist unique integers q and r such that b = qa + r, $0 \le r < a$.
Definition 6.1.4 The gcd of two numbers a and b is the largest among the common divisors of a and b. If this is 1 then a, b are relatively prime.
The following properties of gcd(a, b) are known
Theorem 6.1.5 1. If c is a common divisor of a, b, then $c \mid \gcd(a, b)$.
2. $\gcd(a, b) = \min\{ax + by\}$ where x, y are integers such that ax + by > 0.
3. $m \cdot \gcd(a, b) = \gcd(ma, mb)$ for m > 0.
4. If $\gcd(a, m) = \gcd(b, m) = 1$ then $\gcd(ab, m) = 1$.
5. If $c \mid ab$ and $\gcd(b, c) = 1$ then $c \mid a$.
6. $\gcd(a, b) = \gcd(a, b - qa)$ for any q.
The beginning of number theory goes back to Euclid's algorithm that exploited some of the properties of divisibility to compute the gcd of two integers. From property 6, it follows that to find the gcd of a and b we can find the gcd of a and b - qa (repeatedly). If $a \mid b$, then clearly a is the gcd, so that can be used as a terminating case. Computing q can be done using the division algorithm, which is how Euclid's algorithm works. In addition, it also computes numbers x and y such that gcd = ax + by. For this, we maintain an invariant that $a x_i + b y_i = r_i$ where $r_i$ is the remainder in the i-th iteration, with initial values $x_0 = 1$ and $y_0 = 0$. And this is what is known as Extended Euclid's algorithm. The correctness of the algorithm follows from induction.
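The invariant $a x_i + b y_i = r_i$ translates directly into code. The following is a minimal iterative sketch for non-negative inputs; the function name and the choice to carry two consecutive remainder triples are ours, not taken from the notes.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y, for integers a, b >= 0."""
    # Invariants: a*x0 + b*y0 == r0 and a*x1 + b*y1 == r1.
    r0, x0, y0 = a, 1, 0
    r1, x1, y1 = b, 0, 1
    while r1 != 0:
        q = r0 // r1                      # quotient from the division algorithm
        r0, r1 = r1, r0 - q * r1          # property 6: gcd is unchanged
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return r0, x0, y0

g, x, y = extended_gcd(240, 46)
print(g, x, y, 240 * x + 46 * y)
```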
Prime numbers (with no divisors other than 1 and the number itself) are extremely important in number theory.
Theorem 6.1.6 (Fundamental Theorem of Arithmetic)
Every integer greater than 1 can be expressed as a product of primes and this factorization is unique except for the order of the prime factors.
Proof: We know that if $p \mid q_1 q_2$ where p is prime then either $p \mid q_1$ or $p \mid q_2$ or both. □
The fact that the number of primes is infinite was given in an elegant proof of Euclid. Extending his argument it can be shown that there are arbitrarily large gaps between two consecutive primes. The prime number theorem says that among the first n integers there are very nearly $\frac{n}{\ln n}$ prime numbers.
6.2 Congruences
Definition 6.2.1 If an integer m, not zero, divides the difference a - b, we say that a is congruent to b modulo m and is denoted by $a \equiv b \pmod{m}$.
(Since $m \mid (a - b)$ is equivalent to $-m \mid (a - b)$, we will always assume that m > 0.)
The following properties follow from the definition.
Theorem 6.2.2 1. $a \equiv b \pmod{m}$ is the same as $a - b \equiv 0 \pmod{m}$.
2. $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$ implies $a \equiv c \pmod{m}$. (transitive - in fact $\equiv \pmod{m}$ is an equivalence relation).
3. If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$ then $ax + cy \equiv bx + dy \pmod{m}$.
4. If $a \equiv b \pmod{m}$ and $c \equiv d \pmod{m}$, then $ac \equiv bd \pmod{m}$.
5. If $a \equiv b \pmod{m}$ and $d \mid m$, d > 0, then $a \equiv b \pmod{d}$.
The degree of a polynomial (with integral coefficients) modulo m is the highest power of x for which the coefficient is non-zero modulo m. For $f(x) = a_0 x^n + a_1 x^{n-1} + \ldots + a_n$, if $f(u) \equiv 0 \pmod{m}$ then we say that u is a solution of the congruence $f(x) \equiv 0 \pmod{m}$. It is known that
Theorem 6.2.3 If $a \equiv b \pmod{m}$, then $f(a) \equiv f(b) \pmod{m}$.
An important problem is the solution of congruences and in particular the linear (degree 1) congruence. Any such congruence has the form
$$ax \equiv b \pmod{m}$$
For the special case that $\gcd(a, m) = 1$, we have a solution $x_1 = a^{\phi(m) - 1} b$, where $\phi(m)$ is the totient function (defined by Euler). It is the number of positive integers less than m that are relatively prime to m (if m is prime then $\phi(m) = m - 1$). This follows from the following theorem of Euler.
Theorem 6.2.4 If $\gcd(a, m) = 1$, then $a^{\phi(m)} \equiv 1 \pmod{m}$.
Another way of viewing the solution is to multiply both sides by a number $a^{-1}$ such that $a \cdot a^{-1} \equiv 1 \pmod{m}$. We have the following equivalent of cancellation laws
Theorem 6.2.5 1. If $ax \equiv ay \pmod{m}$ and $\gcd(a, m) = 1$ then $x \equiv y \pmod{m}$.
2. $ax \equiv ay \pmod{m}$ iff $x \equiv y \pmod{\frac{m}{\gcd(a, m)}}$. (generalization)
The remaining solutions (when $\gcd(a, m) = 1$) are of the form $x_1 + jm$ for any integer j. In other words there is a unique solution modulo m. For the other case (when a and m are not relatively prime), the solutions are described by the following theorem.
Theorem 6.2.6 Let $g = \gcd(a, m)$. Then $ax \equiv b \pmod{m}$ has no solutions if g does not divide b. If $g \mid b$, it has g solutions $x \equiv (b/g)x_0 + t(m/g)$, $t = 0, 1, \ldots, g - 1$, where $x_0$ is any solution of $(a/g)x \equiv 1 \pmod{m/g}$.
Algorithmically, in both cases, we can use the (extended) Euclid's algorithm to compute $x_1$ or $x_0$.
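The case analysis of the theorem can be sketched as follows. This is an illustrative implementation with a name of our choosing; instead of spelling out the extended Euclid steps it uses Python's built-in `pow(x, -1, n)` (available from Python 3.8) to obtain the inverse $x_0$.

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """All x in [0, m) with a*x ≡ b (mod m), listed for t = 0, 1, ..., g-1."""
    g = gcd(a, m)
    if b % g != 0:
        return []                                   # g does not divide b
    a_red, m_red = a // g, m // g
    # x0 solves (a/g) * x0 ≡ 1 (mod m/g); modulo 1 every residue is 0.
    x0 = pow(a_red, -1, m_red) if m_red > 1 else 0
    base = (b // g) * x0 % m_red
    return [base + t * m_red for t in range(g)]

print(solve_linear_congruence(6, 4, 10))   # g = 2 divides 4: two solutions
print(solve_linear_congruence(6, 3, 10))   # g = 2 does not divide 3: none
```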
An alternate method is to solve a set of simultaneous congruences by factorising $m = \prod_{i=1}^{k} p_i^{e_i} = \prod_{i=1}^{k} m_i$ where $m_i = p_i^{e_i}$. Since the $m_i$ are relatively prime in pairs, it can be shown that solving the congruence $ax \equiv b \pmod{m}$ is the same as solving the congruences $ax \equiv b \pmod{m_i}$ simultaneously for all i. Suppose the individual congruences have solutions
$$a x_i \equiv b \pmod{m_i}$$
Then these can be combined using a result called the Chinese Remaindering Theorem.
Theorem 6.2.7 The common solution is given by
$$x_0 = \sum_{j=1}^{k} \frac{m}{m_j}\, b_j x_j$$
where each $b_j$ is chosen so that $\frac{m}{m_j} b_j \equiv 1 \pmod{m_j}$.
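The combination formula can be sketched in a few lines. In the code below (names are ours) `residues[j]` plays the role of $x_j$, and the multiplier $b_j$ is assumed to be the inverse of $m/m_j$ modulo $m_j$, which is what makes each term vanish modulo every other $m_i$; the inverse is obtained with the built-in `pow(x, -1, n)` of Python 3.8+.

```python
from math import prod

def crt(residues, moduli):
    """x in [0, m) with x ≡ residues[j] (mod moduli[j]); moduli pairwise coprime."""
    m = prod(moduli)
    total = 0
    for x_j, m_j in zip(residues, moduli):
        big = m // m_j                   # m / m_j, divisible by every other modulus
        b_j = pow(big, -1, m_j)          # (m / m_j) * b_j ≡ 1 (mod m_j)
        total += big * b_j * x_j
    return total % m

print(crt([2, 3, 2], [3, 5, 7]))
```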
Part II
Applications
Chapter 7
Application of probability: Information and Coding theory
useful quantity is the expected amount of uncertainty called entropy. For example, a coin that has a probability p of heads and q (= 1 - p) of tails has entropy
$$p \log(1/p) + q \log(1/q)$$
and it can be verified that this is maximised when p = q = 1/2 and equals 0 when the outcome is certain (p = 0 or p = 1).
For a general probability space $\Omega = (E_1, E_2, \ldots, E_n)$, the entropy is defined as
$$H(\Omega) = \sum_{i=1}^{n} p_i \log(1/p_i), \qquad p_i = \Pr(E_i)$$
The entropy of a discrete random variable X is similarly defined
$$H(X) = \sum_i \Pr(X = i) \log(1/\Pr(X = i))$$
One can verify easily that entropy is maximised when all events are equally likely (maximum uncertainty).
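The definition is short enough to evaluate directly. The sketch below assumes logarithms to base 2 (so entropy is measured in bits) and the usual convention that outcomes of probability 0 contribute nothing; both are our assumptions, since the passage does not fix the base.

```python
from math import log2

def entropy(probabilities):
    """Sum of p * log2(1/p) over the outcomes, skipping outcomes with p = 0."""
    return sum(p * log2(1 / p) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))     # fair coin
print(entropy([0.9, 0.1]))     # biased coin, less uncertainty
print(entropy([1.0, 0.0]))     # certain outcome
print(entropy([0.25] * 4))     # four equally likely events
```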
7.2 Codes
Given a finite set of symbols X, a coding of this is a function from X to a finite set of strings over a (usually) small alphabet $\Sigma$. For example X = {1, 2, 3, 4} can be mapped to {0, 10, 110, 111} over the alphabet $\Sigma$ = {0, 1}. The length of a code is the number of symbols in the string. A sequence of symbols from X is encoded as the concatenation of the codes for the sequence of the symbols. For example 132 will be encoded as 011010. The set of all strings over a finite alphabet S will be denoted by $S^+$.
A code is uniquely decipherable if the mapping $X^+ \rightarrow \Sigma^+$ is a 1-1 function, i.e. every string from $X^+$ is mapped to a unique string in $\Sigma^+$. In other words, these strings can be decoded unambiguously.
A code is called a prefix code, if no codeword is a prefix of another codeword. Clearly a prefix code is uniquely decipherable. The above example is a prefix code.
The following result characterizes the uniquely decipherable codes very elegantly.
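The example code can be exercised with a few lines of Python. This is an illustrative sketch (the names `CODE`, `encode` and `decode` are ours); the decoder relies on the prefix property, so the first codeword that matches the bits read so far is the only possible one.

```python
CODE = {"1": "0", "2": "10", "3": "110", "4": "111"}

def encode(symbols, code=CODE):
    """Concatenate the codewords of the symbols."""
    return "".join(code[s] for s in symbols)

def decode(bits, code=CODE):
    """Greedy left-to-right decoding, valid for prefix codes."""
    inverse = {word: sym for sym, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:           # a complete codeword has been read
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(out)

print(encode("132"))
print(decode("011010"))
```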
Theorem 7.2.1 (Kraft-McMillan) For any uniquely decipherable code over {0, 1}, the lengths of the codewords must satisfy
$$\sum_i 2^{-l_i} \le 1$$
where $l_i$ is the length of the i-th codeword. Moreover, if a given set of codeword lengths satisfies the above inequality, then there exists a prefix code with these codeword lengths.
We will only prove the first part for prefix codes. We can represent the prefix code using a trie where each leaf node corresponds to a codeword. If the maximum length of a codeword is h, then it follows that the number of leaves of a complete binary tree of depth h is at least the total number of leaf nodes of the subtrees rooted at each of the nodes corresponding to the codewords. In particular,
$$\sum_i 2^{h - l_i} \le 2^h$$
and dividing both sides by $2^h$ gives the inequality.
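Both directions of the theorem can be illustrated for prefix codes. The sketch below uses our own function names; `prefix_code_from_lengths` is one standard way (assigning codewords in order of increasing length) to realise the "moreover" part, shown as an assumption-laden illustration rather than the construction used in the notes.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Sum of 2^(-l) over the codeword lengths, computed exactly."""
    return sum(Fraction(1, 2 ** l) for l in lengths)

def is_prefix_code(words):
    """True if no codeword is a prefix of a different codeword."""
    return not any(u != v and v.startswith(u) for u in words for v in words)

def prefix_code_from_lengths(lengths):
    """Build binary codewords with the given positive lengths, if Kraft's sum allows."""
    if kraft_sum(lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    words, value, prev = [], 0, 0
    for l in sorted(lengths):
        value <<= l - prev               # extend the counter to the new length
        words.append(format(value, "b").zfill(l))
        value += 1                       # next unused codeword of this length
        prev = l
    return words

print(kraft_sum([1, 2, 3, 3]))
print(prefix_code_from_lengths([3, 1, 3, 2]))
```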
It follows from the above definition that if X, Y are independent, i.e., $\Pr(X = x_i, Y = y_j) = \Pr(X = x_i) \Pr(Y = y_j)$, then the reader can verify easily that
$$H(X, Y) = H(X) + H(Y)$$
The marginal entropy of X is the entropy of the distribution $\Pr(X \mid Y = y_j)$, i.e.,
$$H(X \mid Y = y_j) = \sum_i \Pr(X = x_i \mid Y = y_j) \log \frac{1}{\Pr(X = x_i \mid Y = y_j)}$$
Proof:
$$H(f(X)) = -\sum_a \Pr[f(X) = a] \log \Pr[f(X) = a]$$
$$H(X) = -\sum_x \Pr[X = x] \log \Pr[X = x] = -\sum_a \sum_{x : f(x) = a} \Pr[X = x] \log \Pr[X = x]$$
Now compare term by term:
$\Pr[f(X) = a] \log \Pr[f(X) = a]$ versus $\sum_{x : f(x) = a} \Pr[X = x] \log \Pr[X = x]$.
Using the concavity of the function $\log x$:
$$\sum_{x : f(x) = a} \frac{\Pr[X = x]}{\Pr[f(X) = a]} \log \Pr[X = x] \le \log\left(\sum_{x : f(x) = a} \frac{\Pr[X = x]}{\Pr[f(X) = a]} \Pr[X = x]\right)$$
$$\le \log\left(\sum_{x : f(x) = a} \Pr[X = x]\right) = \log \Pr[f(X) = a] \quad \text{since } \frac{\Pr[X = x]}{\Pr[f(X) = a]} \le 1.$$
By multiplying each inequality by $\Pr[f(X) = a]$, negating both sides and then summing over all a, we obtain the desired result. □
Footnote 1: A function f(x) is convex if for all $x_1, x_2$ and $0 \le \lambda \le 1$, $f(\lambda x_1 + (1 - \lambda) x_2) \le \lambda f(x_1) + (1 - \lambda) f(x_2)$.
Example 7.3.5 : [fake coin problem] Suppose we are given n coins and one of them is counterfeit. We are told that the counterfeit has a different weight and we are allowed to use a simple balance. How many weighings are necessary?
For instance if it is 3 coins, then we can identify the counterfeit using at most 2 weighings. In the beginning, the entropy is $H(1/n, 1/n, \ldots, 1/n) = \log n$. Each weighing can have 3 outcomes - right > left, right < left, or right equals left. Hence we can get at most $\log 3$ bits of information from a weighing. This implies that at least $\frac{\log n}{\log 3}$ weighings are necessary.
We leave it as an exercise about how close we can come to this figure to actually identify the fake coin.
Our next example is the well known result of sorting.
Example 7.3.6 : Suppose we want to sort n elements, i.e., permute them into a non-decreasing order. If all permutations are equally likely, then the information content is $\log n!$. In a comparison, the number of outcomes is two, implying that the entropy is at most $\log 2 = 1$. If the minimum number of comparisons is m, then $H(f(X_1, X_2, \ldots, X_m)) \ge \log(n!)$ where $X_i$ are random variables corresponding to each comparison. From theorem 7.3.4, $H(X_1) + H(X_2) + \ldots + H(X_m) \ge \log(n!)$, which implies $m = \Omega(n \log n)$.
Chapter 8
Sorting and Searching
The search begins from the topmost level $L_k$ where $T_k$ can be determined in constant time. If $l_k = E$ or $r_k = E$ then the search is successful, else we recursively search among the elements $[l_k, r_k] \cap L_0$. Here $[l_k, r_k]$ denotes the closed interval bound by $l_k$ and $r_k$. This is done by searching the elements of $L_{k-1}$ which are bounded by $l_k$ and $r_k$. Since both $l_k, r_k \in L_{k-1}$, the descent from level k to k - 1 is easily achieved in O(1) time. In general, at any level i we determine the tuple $T_i$ by walking through a portion of the list $L_i$. If $l_i$ or $r_i$ equals E then we are done, else we repeat this procedure by descending to level i - 1.
In other words, we refine the search progressively until we find an element in S equal to E or we terminate when we have determined $(l_0, r_0)$. This procedure can also be viewed as searching in a tree that has variable degree (not necessarily two as in a binary tree).
Of course, to be able to analyze this algorithm, one has to specify how the lists $L_i$ are constructed and how they are dynamically maintained under deletions and additions. Very roughly, the idea is to have elements in the i-th level point to approximately $2^i$ nodes ahead (in S) so that the number of levels is approximately O(log n). The time spent at each level i depends on $[l_{i+1}, r_{i+1}] \cap L_i$ and hence the objective is to keep this small. To achieve these conditions on-line, Pugh [?] uses the following elegant method. The nodes from the bottom-most layer (level 0) are chosen with probability p (for the purpose of our discussion we shall assume p = 0.5) to be in the first level. Subsequently at any level i, the nodes of level i are chosen to be in level i + 1 independently with probability p, and at any level we maintain a simple linked list where the elements are in sorted order. If p = 0.5, then it is not difficult to verify that for a list of size n, the expected number of elements in level i is approximately $n/2^i$ and they are spaced about $2^i$ elements apart. The expected number of levels is clearly O(log n) (when we have just a trivial length list) and the expected space requirement is O(n).
To insert an element, we first locate its position using the search strategy described previously. Note that a byproduct of the search algorithm are all the $T_i$'s. At level 0, we choose it with probability p to be in level $L_1$. If it is selected, we insert it in the proper position (which can be trivially done from the knowledge of $T_1$), update the pointers and repeat this process from the present level. Deletion is very similar and it can be readily verified that deletion and insertion have the same asymptotic run time as the search operation. So we shall focus on this operation.
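The search and insertion procedures can be sketched compactly. The code below is a minimal illustration under several assumptions of ours: class and method names are invented, a sentinel head node stands in for the left end of every list, only the left endpoint $l_i$ of each tuple $T_i$ is recorded (the right endpoint is its successor), and the search always descends to level 0 instead of stopping early when E is met at a higher level. Deletion is omitted.

```python
import random

class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height        # next[i] is the successor in list L_i

class SkipList:
    def __init__(self, p=0.5, seed=None):
        self.p = p
        self.rng = random.Random(seed)
        self.head = Node(None, 1)          # sentinel preceding every key

    def _predecessors(self, key):
        """For each level i, the rightmost node of L_i whose key is < key."""
        preds = [self.head] * len(self.head.next)
        node = self.head
        for level in reversed(range(len(self.head.next))):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]    # walk along L_i
            preds[level] = node            # then descend one level
        return preds

    def search(self, key):
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        preds = self._predecessors(key)
        if preds[0].next[0] is not None and preds[0].next[0].key == key:
            return                         # already present
        height = 1
        while self.rng.random() < self.p:  # promote with probability p per level
            height += 1
        while len(self.head.next) < height:
            self.head.next.append(None)    # open a new, empty top level
            preds.append(self.head)
        node = Node(key, height)
        for level in range(height):        # splice into L_0 .. L_{height-1}
            node.next[level] = preds[level].next[level]
            preds[level].next[level] = node

sl = SkipList(seed=1)
for k in [5, 1, 9, 3, 7]:
    sl.insert(k)
print(sl.search(7), sl.search(4))
```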
8.1.2 Analysis
To analyze the run-time of the search procedure, we look at it backwards, i.e., retrace the path from level 0. The search time is clearly the length of the path (number of links) traversed over all the levels. So one can count the number of links one traverses before climbing up a level. In other words the expected search time can be expressed in the following recurrence (from [?])
$$C(k) = (1 - p)(1 + C(k)) + p(1 + C(k - 1))$$
where C(k) is the expected cost for climbing k levels. From the boundary condition C(0) = 0, one readily obtains C(k) = k/p. For k = O(log n), this is O(log n). The recurrence captures the crux of the method in the following manner. At any node of a given level, we climb up if this node has been chosen to be in the next level or else we add one to the cost of the present level. The probability of this event (climbing up a level) is p, which we consider to be a success event. Now the entire search procedure can be viewed in the following alternate manner. We are tossing a coin which turns up heads with probability p - how many times should we toss to come up with O(log n) heads? Each head corresponds to the event of climbing up one level in the data structure and the total number of tosses is the cost of the search algorithm. We are done when we have climbed up O(log n) levels (there is some technicality about the number of levels being O(log n) but that will be addressed later). The number of heads obtained by tossing a coin N times is given by a Binomial random variable X with parameters N and p. Using Chernoff bounds from Theorem ??, for N = 15 log n and p = 0.5, $\Pr[X \le 1.5 \log n] \le 1/n^2$ (using $\epsilon = 9/10$ in equation 1). Using appropriate constants, we can get rapidly decreasing probabilities of the form $\Pr[X \le c \log n] \le 1/n^{\alpha}$ for $c, \alpha > 0$, where $\alpha$ increases with c. These constants can be fine tuned although we shall not bother with such an exercise here.
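The solution C(k) = k/p can be confirmed by solving the recurrence one level at a time with exact arithmetic. The sketch below is illustrative; the function name is ours, and the update line is just the recurrence rearranged as $p\,C(j) = 1 + p\,C(j-1)$.

```python
from fractions import Fraction

def climb_cost(k, p):
    """Expected cost C(k) of climbing k levels, from C(0) = 0 and the recurrence."""
    cost = Fraction(0)
    for _ in range(k):
        # C(j) = (1-p)(1 + C(j)) + p(1 + C(j-1))  rearranges to
        # p * C(j) = 1 + p * C(j-1).
        cost = (1 + p * cost) / p
    return cost

for k in range(5):
    print(k, climb_cost(k, Fraction(1, 2)))
```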
We thus state the following lemma.
Lemma 8.1.1 The probability that the access time for a fixed element in a skip-list data structure of length n exceeds c log n steps is less than $O(1/n^2)$ for an appropriate constant c > 1.
Proof We compute the probability of obtaining fewer than k (the number of levels in the data-structure) heads when we toss a fair coin (p = 1/2) c log n times for some fixed constant c > 1. That is, we compute the probability that our search procedure exceeds c log n steps. Recall that each head is equivalent to climbing up one level and we are done when we have climbed k levels. To bound the number of levels, it is easy to see that the probability that any element of S appears in level i is at most $1/2^i$, i.e. it has turned up i consecutive heads. So the probability that any fixed element appears in level 3 log n is at most $1/n^3$. The probability that k > 3 log n is the probability that at least one element of S appears in $L_{3 \log n}$. This is clearly at most n times the probability that any fixed element survives and hence the probability of k exceeding 3 log n is less than $1/n^2$.
Given that $k \le 3 \log n$ we choose a value of c, say $c_0$ (to be plugged into equation 1 of Chernoff bounds) such that the probability of obtaining fewer than 3 log n heads in $c_0 \log n$ tosses is less than $1/n^2$. The search algorithm for a fixed key exceeds $c_0 \log n$ steps if one of the above events fails; either the number of levels exceeds 3 log n or we get fewer than 3 log n heads from $c_0 \log n$ tosses. This is clearly at most the summation of the failure probabilities of the individual events, which is $O(1/n^2)$. □
Theorem 8.1.2 The probability that the access time for any arbitrary element in a skip-list exceeds O(log n) is less than $1/n^{\alpha}$ for any fixed $\alpha > 0$.
Proof: A list of n elements induces n + 1 intervals. From the previous lemma, the probability P that the search time for a fixed element exceeds c log n is less than $1/n^2$. Note that all elements in a fixed interval $[l_0, r_0]$ follow the same path in the data-structure. It follows that the probability of the access time exceeding O(log n) for some interval is at most n + 1 times P. As mentioned before, the constants can be chosen appropriately to achieve this. □
It is possible to obtain even tighter bounds on the space requirement for a skip list of n elements. From Pugh [?] it is known that the expected space is O(n). Moreover it is clear that it does not exceed O(n log n) with probability $1 - 1/n^2$ (no element survives more than O(log n) levels with this probability from the previous lemma). One can obtain a much stronger bound by viewing the entire skip list structure as a stochastic experiment: each node corresponds to a Bernoulli trial that turns up heads (similar to obtaining the query bound). Each element is replicated till the trial turns up tails. Since there are n elements, the number of nodes corresponds to the number of Bernoulli trials required to obtain n tails. This is a negative binomial distribution and one can use Chernoff bounds (Theorem ??) directly to obtain the following result.
Theorem 8.1.3 For any constant $\alpha > 0$, the probability of the space exceeding $2n + \alpha n$ is less than $\exp(-\Omega(\alpha^2 n))$.
Chapter 9
Universal Hashing
9.1 Notations
Universe : U
Set of elements : S, also |S| = n
Hash locations : {0, 1, ..., m - 1}, usually $n \ge m$
9.2 Collision
A collision occurs if $x, y \in U$ are mapped to the same location by a hash function h.
$$\delta_h(x, y) = \begin{cases} 1 & : h(x) = h(y),\; x \ne y \\ 0 & : \text{otherwise} \end{cases}$$
$$\delta_h(x, S) = \sum_{y \in S} \delta_h(x, y)$$
Hash by chaining: The more the collisions the worse the performance. Look at a sequence $O_1(x_1), O_2(x_2), \ldots, O_n(x_n)$ where $O_i \in$ {Insert, Delete, Search} and $x_i \in U$.
Let us make the following assumptions
1. $|h^{-1}(i)| = |h^{-1}(i')|$ where $i, i' \in \{0, 1, \ldots, m - 1\}$
2. In the sequence, $x_i$ can be any element of U with equal probability.
Claim: Total expected cost = $O((1 + \beta)n)$ where $\beta = \frac{n}{m}$ (load factor).
Proof: Expected cost of the (k + 1)th operation = expected number of elements in the location $\le 1 + k\left(\frac{1}{m}\right)$, assuming all the previous operations were Insert.
So the total expected cost is $\le \sum_{k=1}^{n} \left(1 + \frac{k}{m}\right) = n + \frac{n(n+1)}{2m} \approx \left(1 + \frac{\beta}{2}\right)n$. This is worst case over operations but not over elements. □
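$$\frac{1}{|H|}\sum_{h \in H}\Big(1 + \sum_{y \in S}\delta_h(x, y)\Big)$$
$$= 1 + \frac{1}{|H|}\sum_{y}\sum_{h}\delta_h(x, y)$$
$$\le 1 + \frac{1}{|H|}\sum_{y}\frac{|H|}{m}$$
$$= 1 + \frac{n}{m}$$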