Design & Analysis of Algorithms
Preface
These lecture notes are based on similar lectures I gave in 2001 at HKUST, Hong
Kong, in 2000 at the University of Waterloo, Canada, together with Prof. Naomi
Nishimura, in 1998 at the University of Trier, Germany, and in 1987–1999 at
the University of Saarbrücken, Germany, together with Prof. Kurt Mehlhorn
and Dr. Jop Sibeyn. I also use some material from David Mount's CMSC 451
lecture notes.
Parts of the notes are rather short (and probably not completely free of
minor errors), so it might be a good idea to read the corresponding chapters
in the textbook CLRS as well.
I use the following abbreviations for textbooks:
• Mnt: Mount; Lecture Notes of CMSC 451 (NTU 520B), Fall 2003, Uni-
versity of Maryland, http://www.cs.umd.edu/~mount/451/.
Prof Rudolf Fleischer
Fudan University, Shanghai
Shanghai, October 9, 2004
Contents

1 Introduction
  1.1 Objectives
  1.2 Literature
  1.3 Algorithms
  1.4 Course Syllabus
  1.5 Pseudocode
  1.6 Running Times
  1.7 Asymptotic Analysis
  1.8 Analysis of Pseudocode

Chapter 1

Introduction

1.1 Objectives

2. Lecture

Slide: Course information

In this course you learn
• Good algorithms for important problems (at least a few examples).
• Correctness proofs.
• Analysis of efficiency.
• Lower bounds.
1.2 Literature
The following books are recommended for this course.
[3] R. Sedgewick. Algorithms. Addison-Wesley, 1988.

[4] R.E. Tarjan. Data Structures and Network Algorithms. CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, Pennsylvania, 1983.

Combinatorics

[1] R.L. Graham, D.E. Knuth, and O. Patashnik. Concrete Mathematics — A Foundation for Computer Science. Addison-Wesley, Reading, MA, 1989.

Complexity Theory

[1] M.R. Garey and D.S. Johnson. Computers and Intractability — A Guide to the Theory of NP-Completeness. Freeman, 1979.

[2] E. Kinber and C. Smith. Theory of Computing: A Gentle Introduction. Prentice Hall, 2001.

[3] H.R. Lewis and C.H. Papadimitriou. Elements of the Theory of Computation. Prentice Hall, 2nd edition, 1998.

[4] M. Sipser. Introduction to the Theory of Computation. PWS Publishers, 1997.

C++ and LEDA

[1] M.A. Ellis and B. Stroustrup. The Annotated C++ Reference Manual. Addison-Wesley, Reading, MA, 1990.

[2] B. Stroustrup. The C++ Programming Language. AT&T Bell Telephone Laboratories, Inc., Murray Hill, NJ, 2nd edition, 1993.

[4] K. Mehlhorn, S. Näher, and C. Uhrig. The LEDA User Manual — Version 3.6. 66123 Saarbrücken, 1998.

[5] T. Standish. Data Structures in Java. Addison-Wesley, Reading, MA, 1998.

Why is this course important? The most important courses you will ever hear are Data Structures and Efficient Algorithms. All other disciplines of computer science use data structures and algorithms, and analyze them (not always, unfortunately). It is a theoretical course, but you will learn how to tackle real world problems:

• apply known basic algorithms. In practice, there is often no time to implement sophisticated optimal algorithms, therefore sophisticated algorithm libraries (like LEDA, for example) can be very useful (and save you a lot of implementation time);

• if you do not know a solution, search the literature;

• if that fails, work on your own: try the paradigms you learned in this course;

• apply the methods you learned to analyze the algorithms you found or developed;

• check whether lower bounds are known.

Remember:

• "first think, then code" saves time (30-50% of a project's time should be devoted to design);

• thorough analysis and lower bounds help you defend your approach to critics (boss, customer).

• Paradigms:

  – reuse of known solutions (always the easiest way to solve a problem);
  – divide-and-conquer;
  – greedy;
  – dynamic programming;
  – graph exploration;
  – branch-and-bound.

• Techniques for analysis:

  – asymptotic notation;
  – worst case/average case/amortized analysis;
  – recurrences;
  – etc.

1.3 Algorithms

Definition 1.3.1. An algorithm is a set of rules (well-defined, finite) used for calculation or problem-solving (for example with a computer). A particular input to an algorithm is called a problem instance. ⊓⊔

The word algorithm goes back to the 9th century Persian mathematician al-Khowârizmî, and arithmetic goes back to the ancient Greek: arithmos = number.

Computers are good at following rules, like adding up the numbers 1, 2, . . . , n, but they are not good at finding short-cuts, like computing n(n+1)/2 instead of the sum.
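The n(n+1)/2 short-cut from Section 1.3 is easy to check against the rule-following computation; a small sketch (plain Python, the function names are mine):

```python
def sum_by_rule(n):
    # Follow the rule: add up 1, 2, ..., n one by one (n - 1 additions).
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def sum_by_shortcut(n):
    # Gauss's short-cut: n(n+1)/2, a constant number of operations.
    return n * (n + 1) // 2

# Both agree, but the second takes O(1) operations instead of O(n).
assert sum_by_rule(100) == sum_by_shortcut(100) == 5050
```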
The web page (../Resources) also contains some info on pseudocode.

• direct/indirect memory access;

• basic operations like memory access, +, −, ·, /, etc. cost 1. This is called the unit cost measure; in the logarithmic cost measure, the cost of an operation grows linearly in the number of bits of the operands;

Remark 1.6.3.

• If each operation takes a few cycles on a real processor, then the unit cost measure describes the performance of an algorithm sufficiently well.

• The unit cost measure is independent of processor speed.

• It allows for a processor independent comparison of algorithms (but we cannot, and do not want to, predict running times).

• We can focus on certain "important" operations which dominate the running time (this makes the analysis easier):

  – Sorting: only count comparisons and data movements.
  – Chess player ranking: only count matches.
  – Recursive programs: only count recursive calls. ⊓⊔

Absolute running times are not a good measure of efficiency: which algorithm is the fastest? Modern computers are fast, so running time differences for small values of n are irrelevant. We are only interested in the asymptotic behaviour for large values of n. And since we already ignored the constant factor induced by the processor speed (all operations cost 1), we can just as well ignore constant factors in the running time expressions (at least if they are sufficiently small; a factor of $10^{1000}$ had better not be hidden in the asymptotic notation). We are more interested in whether the running time grows linearly, polynomially, or exponentially in the problem size. Formally, we use asymptotic notation to express running times.

Definition 1.7.2. Let $g : \mathbb{N} \to \mathbb{R}_0^+$ (we can cheat on that later). Then we define

$$O(g) = \{ f : \mathbb{N} \to \mathbb{R}_0^+ \mid \exists c > 0 \; \exists n_0 \in \mathbb{N} \; \forall n \ge n_0 : 0 \le f(n) \le c \cdot g(n) \}$$

(i.e., g(n) is an asymptotic upper bound on f(n), up to a constant factor),

$$o(g) = \{ f : \mathbb{N} \to \mathbb{R}_0^+ \mid \forall c > 0 \; \exists n_0 \in \mathbb{N} \; \forall n \ge n_0 : 0 \le f(n) < c \cdot g(n) \}$$

(i.e., g(n) is much larger than f(n)). ⊓⊔
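The quantifiers in Definition 1.7.2 can be made concrete with a small numeric sanity check (my own toy example, not from the notes): f(n) = 3n + 5 is in O(n), witnessed by c = 4 and n0 = 5, since 3n + 5 ≤ 4n exactly when n ≥ 5; but f is not in o(n), because already for c = 1 no n0 exists.

```python
def f(n):
    return 3 * n + 5   # toy example; f is in O(n) but not in o(n)

def g(n):
    return n

# Witnesses for f in O(g): c = 4 and n0 = 5 work,
# since 3n + 5 <= 4n exactly when n >= 5.
c, n0 = 4, 5
assert all(0 <= f(n) <= c * g(n) for n in range(n0, 10000))

# f is NOT in o(g): for c = 1 the strict inequality f(n) < g(n)
# fails for every n, so no n0 can exist.
assert all(f(n) >= 1 * g(n) for n in range(1, 10000))
```

Of course a finite check like this only illustrates the definition; the inequality 3n + 5 ≤ 4n for n ≥ 5 is what proves it.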
In the same manner we can define asymptotic lower bounds.

$$\Omega(g) = \{ f : \mathbb{N} \to \mathbb{R}_0^+ \mid \exists c > 0 \; \exists n_0 \in \mathbb{N} \; \forall n \ge n_0 : c \cdot g(n) \le f(n) \}$$

(i.e., g(n) is an asymptotic lower bound on f(n), up to a constant factor),

$$\omega(g) = \{ f : \mathbb{N} \to \mathbb{R}_0^+ \mid \forall c > 0 \; \exists n_0 \in \mathbb{N} \; \forall n \ge n_0 : c \cdot g(n) < f(n) \}$$

(i.e., g(n) is much smaller than f(n)). ⊓⊔

We finally need a notation for the case that a function g is an exact asymptotic bound for the function f.

Definition 1.7.9. Let $g : \mathbb{N} \to \mathbb{R}_0^+$. Then we define

$$\Theta(g) = O(g) \cap \Omega(g) = \{ f : \mathbb{N} \to \mathbb{R}_0^+ \mid \exists c_1, c_2 > 0 \; \exists n_0 \in \mathbb{N} \; \forall n \ge n_0 : c_1 \cdot g(n) \le f(n) \le c_2 \cdot g(n) \}$$

⊓⊔

Lemma 1.7.10. If $\lim_{n \to \infty} \frac{f(n)}{g(n)}$ exists and $0 < \lim_{n \to \infty} \frac{f(n)}{g(n)} < \infty$, then $f(n) \in \Theta(g(n))$.

Example 1.7.11. $\lim_{n \to \infty} \frac{\frac{1}{2}n^2 - 3n + 5}{n^2} = \frac{1}{2}$, thus $\frac{1}{2}n^2 - 3n + 5 \in \Theta(n^2)$. ⊓⊔

We can use l'Hôpital's Rule: If f and g are differentiable, $\lim_{n \to \infty} f(n) = \lim_{n \to \infty} g(n) = \infty$, and $\lim_{n \to \infty} \frac{f'(n)}{g'(n)}$ exists, then $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \frac{f'(n)}{g'(n)}$. In our case, $f(n) = \log n$ and $g(n) = \sqrt{n}$, so we have

$$\lim_{n \to \infty} \frac{\log n}{\sqrt{n}} = \frac{1}{\ln 2} \cdot \lim_{n \to \infty} \frac{\ln n}{\sqrt{n}} = \frac{1}{\ln 2} \cdot \lim_{n \to \infty} \frac{1/n}{\frac{1}{2} n^{-1/2}} = \frac{1}{\ln 2} \cdot \lim_{n \to \infty} \frac{2}{\sqrt{n}} = 0 .$$

Thus, $\log n \in o(\sqrt{n})$ by Lemma 1.7.4. ⊓⊔

$$T_1(n) = \Theta(n \log n) , \quad T_2(n) = \Theta(n^2) , \quad T_3(n) = \Theta(\sqrt{n}) .$$

Thus, the third algorithm is asymptotically fastest, and the second is asymptotically slowest.

Remark 1.7.14. Always express running times in the simplest possible form: write $O(n^2)$, not $O(10n^2 + 3n)$.

However, "simplest possible" can depend on the context. Assume we want to compare two running times $T_1$ and $T_2$. If $T_1$ and $T_2$ are sufficiently different, it is enough to determine $T_1$ and $T_2$ roughly: $\Theta(n^3)$ is slower than $\Theta(n)$. But if $T_1$ and $T_2$ are similar then constant factors and lower order terms matter: $100n^2 + O(n)$ is slower than $10n^2 + O(n)$, but faster than $100n^2 + \Theta(n^{3/2})$. ⊓⊔
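Lemma 1.7.10 also suggests a quick numerical sanity check: evaluate f(n)/g(n) for a few large n and see whether the ratios settle near a positive constant (suggesting Θ) or drift towards 0 (suggesting o). This only suggests a guess, it proves nothing; a sketch reusing Example 1.7.11 and the l'Hôpital example (plain Python, my own helper name):

```python
import math

def ratio(f, g, ns):
    # Sample the quotient f(n)/g(n) at increasing values of n.
    return [f(n) / g(n) for n in ns]

ns = [10**3, 10**4, 10**5, 10**6]

# Example 1.7.11: (n^2/2 - 3n + 5) / n^2 settles near 1/2, suggesting Theta(n^2).
r1 = ratio(lambda n: 0.5 * n * n - 3 * n + 5, lambda n: n * n, ns)

# log n / sqrt(n) drifts towards 0, suggesting log n in o(sqrt(n)).
r2 = ratio(lambda n: math.log2(n), lambda n: math.sqrt(n), ns)

assert abs(r1[-1] - 0.5) < 1e-4
assert r2[-1] < 0.05
```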
$$f = O(g) \iff g = \Omega(f) , \qquad f = o(g) \iff g = \omega(f) . \quad ⊓⊔$$

Example 1.7.16. For $0 < \alpha < \beta$, $0 < a < b$, and $1 < A < B$ we have the following hierarchy of faster growing functions:

$$\log^* n < \log\log n < \log^\alpha n < \log^\beta n < n^a < n^a \log^\alpha n < n^a \log^\beta n < n^b < A^n < A^n n^a < A^n n^b < B^n .$$

⊓⊔

Example 1.7.17. Let's remember the running times we measured in Example 1.6.2. What is the asymptotic growth rate of the two algorithms?

The running time of $A_1$ approximately doubles when we double the input size, so we have $T_1(2n) = 2T_1(n)$, which implies $T_1(n) = \Theta(n)$.

The running time of $A_2$ approximately quadruples when we double the input size, so we have $T_2(2n) = 4T_2(n)$, which implies $T_2(n) = \Theta(n^2)$.

We note that reading asymptotic growth rates off a table of running times can be misleading. Lower order terms like $\log n$, $\log\log n$, or $\log^* n$ can often not be clearly recognized. ⊓⊔

2. $T(A(i)) = \Theta(i)$

   Otherwise, we must sum over the iterations of the body cost:

   $$T(n) = \sum_{i=1}^{n} \Theta(i) = \Theta\left(\sum_{i=1}^{n} i\right) \overset{(A.1.2)}{=} \Theta\left(\frac{n(n+1)}{2}\right) = \Theta(n^2) . \quad ⊓⊔$$

   In Appendix A you find a powerful method for finding closed formulas of sums (but that is not the topic of this course).

3. $T(A(i)) = \Theta(\frac{n}{2^i})$

   If it becomes difficult, a rough count can at least give an upper bound:

   $$T(A(i)) \le \Theta(n) \;\Rightarrow\; T(n) \le n \cdot \Theta(n) = \Theta(n^2) .$$

   Note that we write $T(n) \le \Theta(n^2)$, which means $\Theta(n^2)$ is not the exact asymptotic growth rate of $T(n)$, just an upper bound, i.e., we only know $T(n) = O(n^2)$. The exact growth rate in this case is

   $$T(n) = \Theta\left(\sum_{i=1}^{n} \frac{n}{2^i}\right) \overset{(A.1.4)}{=} \Theta\left(n \cdot \left(1 - \frac{1}{2^n}\right)\right) = \Theta(n) .$$

4. $T(A(i)) = \Theta(\log i)$

   Since $\log i \le \log n$, we easily see $T(n) = O(n \log n)$. On the other hand,

   $$T(n) = \log 1 + \cdots + \log \frac{n}{2} + \cdots + \log n \ge \frac{n}{2} \cdot \log \frac{n}{2} = \frac{n}{2} (\log n - 1) = \Omega(n \log n) .$$

   Thus $T(n) = \Theta(n \log n)$. ⊓⊔
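The three sums in cases 2-4 can be cross-checked numerically; the exact totals grow like n², n, and n log n respectively (a toy Python sketch, the variable names are mine):

```python
import math

n = 2**10

# Case 2: body cost Theta(i)  ->  total = n(n+1)/2 = Theta(n^2).
assert sum(i for i in range(1, n + 1)) == n * (n + 1) // 2

# Case 3: body cost Theta(n / 2^i)  ->  total = n(1 - 1/2^n) = Theta(n),
# since the geometric series is bounded by n.
total3 = sum(n / 2**i for i in range(1, n + 1))
assert abs(total3 - n) < 1e-6

# Case 4: body cost Theta(log i)  ->  total = Theta(n log n);
# the lower-bound argument keeps only the upper half of the terms.
total4 = sum(math.log2(i) for i in range(1, n + 1))
assert total4 >= (n / 2) * (math.log2(n) - 1)   # Omega(n log n)
assert total4 <= n * math.log2(n)               # O(n log n)
```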
Just adding the running times of A and B would give a correct upper bound, but probably not a tight one. Even taking the maximum may overestimate the running time (if the maximal running time in A only happens for some x > y, for example), but that is usually the best we can do. We then get an upper bound on the worst case running time, which is defined as the maximum of all running times over all inputs of the same size (if we sort, we consider all possible inputs of n numbers and then take the maximum running time). Analogously, the best case running time is the minimal running time over all inputs of the same size. If we know a probability distribution on the inputs, we can also determine the average running time, which is the weighted average of all running times for all inputs of the same size. If the algorithm uses randomization, we usually determine the expected running time (or more correctly: the worst case average running time), which is the weighted average of the running times of all possible computations for any particular instance. Finally, we sometimes determine the amortized running time, which is the average running time of a single operation in a long sequence of operations. This will be the topic of the next chapter.
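As a tiny illustration of these notions (my own example, not from the notes): linear search in a list of n distinct keys needs 1 comparison in the best case, n in the worst case, and, if every key position is equally likely, (n+1)/2 on average.

```python
def linear_search_cost(a, x):
    # Number of comparisons until x is found (assuming x occurs in a).
    for cost, v in enumerate(a, start=1):
        if v == x:
            return cost
    return len(a)  # not found: we compared against every element

n = 100
a = list(range(n))
costs = [linear_search_cost(a, x) for x in a]   # one input per key position

assert min(costs) == 1                  # best case
assert max(costs) == n                  # worst case
assert sum(costs) / n == (n + 1) / 2    # average under uniform distribution
```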