

Chapter 4
Advanced Algorithm Design and Analysis Techniques

Yonas Y.
Algorithm Analysis and Design
School of Electrical and Computer Engineering

1 Dynamic Programming
Rod cutting problem
Elements of dynamic programming
Longest common subsequence
Optimal binary search trees

2 Greedy Algorithms
An activity-selection problem
Elements of the greedy strategy
Huffman Codes

3 Amortized Analysis
Aggregate analysis


Dynamic Programming

Not a specific algorithm, but a technique (like divide-and-conquer).

Developed back in the day when "programming" meant "tabular method" (like linear programming). It does not really refer to computer programming.

Dynamic programming applies when the subproblems overlap; that is, when subproblems share subsubproblems.

Used for optimization problems:
Find a solution with the optimal value.
Minimization or maximization.


When developing a dynamic-programming algorithm, we follow a sequence of four steps:

1 Characterize the structure of an optimal solution.

2 Recursively define the value of an optimal solution.

3 Compute the value of an optimal solution in a bottom-up fashion.

4 Construct an optimal solution from computed information.


Rod cutting problem

Rod cutting
Sterling Enterprises buys long steel rods and cuts them into shorter
rods, which it then sells.
Each cut is free.
The management of Sterling Enterprises wants to know the
best way to cut up the rods.
We assume that we know, for i = 1, 2, . . . , the price pi in birr
that Sterling Enterprises charges for a rod of length i inches.
Rod lengths are always an integral number of inches.

length i 1 2 3 4 5 6 7 8 9 10
price pi 1 5 8 9 10 17 17 20 24 30

Table: 4.1 A sample price table for rods. Each rod of length i inches
earns the company pi birrs of revenue.

Rod cutting problem

The rod-cutting problem is the following.

Given a rod of length n inches and a table of prices pi for i = 1, 2, . . . , determine the maximum revenue rn obtainable by cutting up the rod and selling the pieces.

Consider the case when n = 4.

Figure 4.1 shows all the ways to cut up a rod of 4 inches in length, including the way with no cuts at all.

We see that cutting a 4-inch rod into two 2-inch pieces produces revenue p2 + p2 = 5 + 5 = 10, which is optimal.


Rod cutting problem

Figure: 4.1 The 8 possible ways of cutting up a rod of length 4. Above each piece is the price of that piece.

We can cut up a rod of length n in 2^(n−1) different ways, since we have an independent option of cutting, or not cutting, at distance i inches from the left end, for i = 1, 2, . . . , n−1.


Rod cutting problem

If an optimal solution cuts the rod into k pieces, for some


1 ≤ k ≤ n, then an optimal decomposition

n = i1 + i2 + . . . + ik

of the rod into pieces of lengths i1 , i2 , . . . , ik provides maximum


corresponding revenue

rn = pi1 + pi2 + . . . + pik .

Generally, we can frame the values rn for n ≥ 1 in terms of optimal


revenues from shorter rods:

rn = max (pn , r1 + rn−1 , r2 + rn−2 , . . . , rn−1 + r1 ).


Rod cutting problem

Note that to solve the original problem of size n, we solve smaller


problems of the same type, but of smaller sizes.

Once we make the first cut, we may consider the two pieces
as independent instances of the rod-cutting problem.

The overall optimal solution incorporates optimal solutions to


the two related subproblems, maximizing revenue from each of
those two pieces.

We say that the rod-cutting problem exhibits optimal


substructure: optimal solutions to a problem incorporate
optimal solutions to related subproblems, which we may solve
independently.


Rod cutting problem

Recursive top-down implementation

There is a related, but slightly simpler, way to arrange a recursive structure for the rod-cutting problem:

the rod is cut into a first piece of length i cut off the left-hand end, and then a right-hand remainder of length n − i.

Only the remainder, and not the first piece, may be further divided.

rn = max_{1≤i≤n} (pi + rn−i).


Rod cutting problem

The following procedure implements the computation implicit


above in a straightforward, top-down, recursive manner.

Algorithm 1 CUT-ROD(p, n)
1: if n == 0 then
2: return 0
3: end if
4: q = −∞
5: for i = 1 to n do
6: q = max (q, p[i] + CUT-ROD(p, n - i))
7: end for
8: return q
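A minimal Python sketch of this naive recursive procedure (assuming p is a 1-indexed list of prices, with p[0] unused):

def cut_rod(p, n):
    # Naive top-down rod cutting: p[i] is the price of a rod of length i.
    if n == 0:
        return 0
    q = float("-inf")
    for i in range(1, n + 1):
        q = max(q, p[i] + cut_rod(p, n - i))
    return q

# With the sample price table: p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30]
# cut_rod(p, 4) returns 10, matching the optimal two 2-inch pieces above.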


Rod cutting problem

Figure 4.2 illustrates what happens for n = 4.


CUT-ROD(p, n) calls CUT-ROD(p, n - i) for i = 1, 2,. . . , n.
CUT-ROD(p, n) calls CUT-ROD(p, j) for each j = 0, 1,. . . , n
- 1.

Figure: 4.2 The recursion tree showing recursive calls resulting from a call
CUT-ROD(p, n) for n = 4.

Rod cutting problem

To analyze the running time of CUT-ROD,

Let T(n) denote the total number of calls made to CUT-ROD.

The count includes the initial call at its root.

Thus, T(0) = 1 and

T(n) = 1 + Σ_{j=0}^{n−1} T(j),

which gives T(n) = 2^n, an exponential running time.


Rod cutting problem

Using dynamic programming for optimal rod cutting

Since the recursive solution is inefficient =⇒ we arrange for


each subproblem to be solved only once, saving its solution.

Dynamic programming thus uses additional memory to save


computation time =⇒ time-memory trade-off.

A dynamic-programming approach runs in polynomial time


when the number of distinct subproblems involved is
polynomial in the input size.

We can solve each such subproblem in polynomial time.


Rod cutting problem

There are usually two equivalent ways to implement a


dynamic-programming approach.

1 The first approach is top-down with memoization.


In this approach, we write the procedure recursively in a natural
manner, but modified to save the result of each subproblem.

2 The second approach is the bottom-up method.


This approach typically depends on some natural notion of the
”size” of a subproblem, such that solving any particular
subproblem depends only on solving ”smaller” subproblems.

These two approaches yield algorithms with the same asymptotic


running time.


Rod cutting problem

Here is the pseudocode for the top-down CUT-ROD procedure, with memoization added:

Algorithm 2 MEMOIZED-CUT-ROD(p, n)
1: let r[0 .. n] be a new array
2: for i = 0 to n do
3: r[i] = −∞
4: end for
5: return MEMOIZED-CUT-ROD-AUX(p, n, r)


Rod cutting problem

Algorithm 3 MEMOIZED-CUT-ROD-AUX(p, n, r)
1: if r[n] ≥ 0 then
2: return r[n]
3: end if
4: if n == 0 then
5: q = 0
6: else
7: q = −∞
8: for i = 1 to n do
9: q = max (q, p[i] + MEMOIZED-CUT-ROD-AUX(p, n−i, r))
10: end for
11: end if
12: r[n] = q
13: return q
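A Python sketch of the memoized top-down approach (function names mirror the pseudocode but are otherwise illustrative):

def memoized_cut_rod(p, n):
    r = [float("-inf")] * (n + 1)   # r[k] caches the best revenue for length k
    return memoized_cut_rod_aux(p, n, r)

def memoized_cut_rod_aux(p, n, r):
    if r[n] >= 0:                   # subproblem already solved: reuse the answer
        return r[n]
    if n == 0:
        q = 0
    else:
        q = float("-inf")
        for i in range(1, n + 1):
            q = max(q, p[i] + memoized_cut_rod_aux(p, n - i, r))
    r[n] = q                        # save the result before returning
    return q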


Rod cutting problem

The bottom-up version is even simpler:

Algorithm 4 BOTTOM-UP-CUT-ROD(p, n)
1: let r[0 .. n] be a new array
2: r[0] = 0
3: for j = 1 to n do
4: q = −∞
5: for i = 1 to j do
6: q = max (q, p[i] + r[j-i])
7: end for
8: r[j] = q
9: end for
10: return r[n]

The running time of both of the above procedures is Θ(n^2).



Rod cutting problem

Subproblem graphs

The subproblem graph for the problem embodies the set of subproblems involved and how the subproblems depend on one another.

Figure 4.3 shows the subproblem graph for the rod-cutting problem
with n = 4.

It is a directed graph, containing one vertex for each distinct


subproblem.


Rod cutting problem

Figure: 4.3 The subproblem graph for the rod-cutting problem with n = 4.

The size of the subproblem graph G = (V, E) can help us


determine the running time of the dynamic programming
algorithm.

Since we solve each subproblem just once, the running time is the sum of the times needed to solve each subproblem.

Rod cutting problem

Reconstructing a solution

Our dynamic-programming solutions to the rod-cutting problem


return the value of an optimal solution.

They do not return an actual solution: a list of piece sizes.

We can record not only the optimal value computed for each
subproblem, but also a choice that led to the optimal value.


Rod cutting problem

Algorithm 5 EXTENDED-BOTTOM-UP-CUT-ROD(p, n)
1: let r[0 .. n] and s[0 .. n] be new arrays
2: r[0] = 0
3: for j = 1 to n do
4: q = −∞
5: for i = 1 to j do
6: if q < p[i] + r[j-i] then
7: q = p[i] + r[j-i]
8: s[j] = i
9: end if
10: end for
11: r[j] = q
12: end for
13: return r and s


Rod cutting problem

In our rod-cutting example, the call


EXTENDED-BOTTOM-UP-CUT-ROD(p, 10) would return the
following arrays:

i 0 1 2 3 4 5 6 7 8 9 10
r[i] 0 1 5 8 10 13 17 18 22 25 30
s[i] 0 1 2 3 2 2 6 1 2 3 10
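A Python sketch of the extended bottom-up procedure, together with a small helper that prints an optimal list of piece sizes (the helper name print_cut_rod_solution is illustrative):

def extended_bottom_up_cut_rod(p, n):
    r = [0] * (n + 1)
    s = [0] * (n + 1)               # s[j] = size of the first piece in an optimal cut of length j
    for j in range(1, n + 1):
        q = float("-inf")
        for i in range(1, j + 1):
            if q < p[i] + r[j - i]:
                q = p[i] + r[j - i]
                s[j] = i
        r[j] = q
    return r, s

def print_cut_rod_solution(p, n):
    r, s = extended_bottom_up_cut_rod(p, n)
    while n > 0:
        print(s[n])                 # cut off the recorded first piece
        n = n - s[n]

# With p = [0, 1, 5, 8, 9, 10, 17, 17, 20, 24, 30], the call
# extended_bottom_up_cut_rod(p, 10) reproduces the r and s rows shown above.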


Elements of dynamic programming

Elements of dynamic programming

When should we look for a dynamic-programming solution to a


problem?

The two key ingredients:

Optimal substructure

Overlapping subproblems


Elements of dynamic programming

Optimal substructure

A problem exhibits optimal substructure if an optimal solution to


the problem contains within it optimal solutions to subproblems.

Show that a solution to a problem consists of making a choice, which leaves one or more subproblems to solve.

Suppose that you are given this last choice that leads to an
optimal solution.

Given this choice, determine which subproblems arise and how


to characterize the resulting space of subproblems.


Elements of dynamic programming

Show that the solutions to the subproblems used within the


optimal solution must themselves be optimal.

Suppose that one of the subproblem solutions is not optimal.


Cut it out.
Paste in an optimal solution.
Get a better solution to the original problem.

This cut-and-paste argument establishes optimal substructure.


Elements of dynamic programming

How to characterize the space of subproblems?

Keep the space as simple as possible.

Expand it as necessary.


Elements of dynamic programming

Optimal substructure varies across problem domains:

1 How many subproblems are used in an optimal solution.

2 How many choices in determining which subproblem(s) to use.

Informally, running time depends on (# of subproblems overall) × (# of choices).

Rod-cutting problem: Θ(n) subproblems overall, and at most n choices to examine for each ⇒ O(n^2) running time.


Elements of dynamic programming

Dynamic programming uses optimal substructure bottom up.

Find optimal solutions to subproblems.

Choose which to use in optimal solution to the problem.

Greedy algorithms work top down:

First make a choice that looks best, then solve the resulting
subproblem.


Elements of dynamic programming

Subtleties

Be careful not to assume that optimal substructure applies when it


does not.

Consider the following two problems in which we are given a


directed graph G = (V, E) and vertices u, v ∈ V.

V is a set of vertices.

E is a set of edges.


Elements of dynamic programming

Let's find a path (sequence of connected edges) from vertex u to vertex v.

Shortest path: find a path u ⇝ v with the fewest edges. Such a path must be simple (no cycles).

Longest simple path: find a simple path u ⇝ v with the most edges. Without the simplicity requirement, we could repeatedly traverse a cycle to make an arbitrarily long path.


Elements of dynamic programming

Suppose p is a shortest path u ⇝ v.

Let w be any vertex on p.
Let p1 be the portion of p from u to w.
Then p1 is a shortest path u ⇝ w.

The same argument applies to p2, the portion of p from w to v.

Therefore, we can find a shortest path u ⇝ v by considering all intermediate vertices w, then finding shortest paths u ⇝ w and w ⇝ v (optimal substructure).

Elements of dynamic programming

Does longest path have optimal substructure?

Consider q → r → t = longest path q ⇝ t. Are its subpaths longest paths?

NO!

Subpath q ⇝ r is q → r.
Longest simple path q ⇝ r is q → s → t → r.
Subpath r ⇝ t is r → t.
Longest simple path r ⇝ t is r → q → s → t.

Elements of dynamic programming

Not only isn’t there optimal substructure, but we can’t even


assemble a legal solution from solutions to subproblems.

Combine the longest simple paths: q → s → t → r → q → s → t, which is not even a simple path.

In fact, this problem is NP-complete (so it probably has no optimal


substructure to find).


Elements of dynamic programming

Overlapping subproblems

These occur when a recursive algorithm revisits the same problem


over and over.

Good divide-and-conquer algorithms usually generate a brand new


problem at each stage of recursion.

Example: Merge sort


Elements of dynamic programming

Alternative example: memoization

”Store, don’t recompute”.

Make a table indexed by subproblem.

When solving a subproblem:

Lookup in table.
If answer is there, use it.
Else, compute answer, then store it.

In dynamic programming, we go one step further. We


determine in what order we’d want to access the table, and fill
it in that way.


Longest common subsequence

Longest common subsequence

Problem: Given 2 sequences, X = ⟨x1, ..., xm⟩ and Y = ⟨y1, ..., yn⟩.

Find a subsequence common to both whose length is longest.

A subsequence doesn’t have to be consecutive, but it has to


be in order.

Biological applications =⇒ To compare the DNA of two (or


more) different organisms.


Longest common subsequence

Example: For X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩:

The sequence ⟨B, C, A⟩ is a common subsequence (CS) of X and Y.

The sequences ⟨B, C, B, A⟩ and ⟨B, D, A, B⟩ are longest common subsequences (LCS) of X and Y.



Longest common subsequence

Brute-force approach

For every subsequence of X, check whether it’s a subsequence of Y.

Time: Θ(n · 2^m).

2^m subsequences of X to check.

Each subsequence takes Θ(n) time to check: scan Y for first


letter, from there scan for second, and so on.


Longest common subsequence

Optimal substructure
Notation: ith prefix
Xi = prefix ⟨x1, ..., xi⟩
Yi = prefix ⟨y1, ..., yi⟩

For example, if X = ⟨A, B, C, B, D, A, B⟩, then X4 = ⟨A, B, C, B⟩ and X0 is the empty sequence.

Theorem:

Let Z = ⟨z1, ..., zk⟩ be any LCS of X and Y.

1 If xm = yn, then zk = xm = yn and Zk−1 is an LCS of Xm−1 and Yn−1.
2 If xm ≠ yn, then zk ≠ xm ⇒ Z is an LCS of Xm−1 and Y.
3 If xm ≠ yn, then zk ≠ yn ⇒ Z is an LCS of X and Yn−1.

Longest common subsequence

A recursive solution

The LCS problem involves establishing a recurrence for the value of an optimal solution.

Define c[i, j] to be the length of an LCS of the sequences Xi and Yj.

c[i, j] = 0                              if i = 0 or j = 0,
c[i, j] = c[i−1, j−1] + 1                if i, j > 0 and xi = yj,
c[i, j] = max(c[i, j−1], c[i−1, j])      if i, j > 0 and xi ≠ yj.


Longest common subsequence

Compute length of optimal solution

Procedure LCS-LENGTH takes two sequences X = hx1 , ..., xm i and Y


= hy1 , ..., yn i as inputs.

The LCS problem has only Θ(mn) distinct subproblems.

Store the c[i, j] values in a table c[0..m, 0..n], and


computes the entries in row-major order.


Longest common subsequence

Algorithm 6 LCS-LENGTH(X, Y)
1: m = X.length
2: n = Y.length
3: let b[1..m, 1..n] and c[0..m, 0..n] be new tables
4: for i = 1 to m do
5: c[i, 0] = 0
6: end for
7: for j = 1 to n do
8: c[0, j] = 0
9: end for
10: for i = 1 to m do
11: for j = 1 to n do
12: if xi == yj then
13: c[i, j] = c[i-1, j-1] + 1
14: b[i, j] = ” ↖ ”
15: else if c[i-1, j] ≥ c[i, j-1] then
16: c[i, j] = c[i-1, j]
17: b[i, j] = ” ↑ ”
18: else c[i, j] = c[i, j-1]
19: b[i, j] = ” ← ”
20: end if
21: end for
22: end for
23: return c and b

Longest common subsequence

Figure: 4.4 The c and b tables computed by LCS-LENGTH on the sequences X = ⟨A, B, C, B, D, A, B⟩ and Y = ⟨B, D, C, A, B, A⟩.

The running time of the procedure is Θ(mn), since each table entry takes Θ(1) time to compute.

Longest common subsequence

Constructing an LCS
The b table returned by LCS-LENGTH enables us to quickly
construct an LCS of X = ⟨x1, ..., xm⟩ and Y = ⟨y1, ..., yn⟩.

Algorithm 7 PRINT-LCS(b, X, i, j)
1: if i == 0 or j == 0 then
2: return
3: end if
4: if b[i, j] == ” ↖ ” then
5: PRINT-LCS(b, X, i-1, j-1)
6: print xi
7: else if b[i, j] == ” ↑ ” then
8: PRINT-LCS(b, X, i-1, j)
9: else
10: PRINT-LCS(b, X, i, j-1)
11: end if
The initial call is PRINT-LCS(b, X, m, n). The procedure takes time O(m + n),
since it decrements at least one of i and j in each recursive call.
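A compact Python sketch of the same computation; instead of keeping a separate b table, this version reconstructs one LCS by walking the c table directly:

def lcs(X, Y):
    m, n = len(X), len(Y)
    # c[i][j] = length of an LCS of X[:i] and Y[:j]
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if X[i - 1] == Y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Walk back from c[m][n] to recover one LCS.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if X[i - 1] == Y[j - 1]:
            out.append(X[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))

# lcs("ABCBDAB", "BDCABA") returns "BCBA", one LCS of length 4.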

Optimal binary search trees

Optimal binary search trees

Suppose that we are designing a program to translate text from


English to Amharic.

We could perform these lookup operations by building a


binary search tree with n English words as keys and their
Amharic equivalents as satellite data.

Because we will search the tree for each individual word in the
text, we want the total time spent searching to be as low as
possible.


Optimal binary search trees

We want words that occur frequently in the text to be placed


nearer the root.

Moreover, some words in the text might have no Amharic


translation, and such words would not appear in the binary
search tree at all.

Minimize the number of nodes visited in all searches, given


that we know how often each word occurs?


Optimal binary search trees

Given a sequence K = ⟨k1, k2, ..., kn⟩ of n distinct keys, sorted (k1 < k2 < · · · < kn).

Want to build a binary search tree from the keys.

For each key ki, we have a probability pi that a search is for ki.

For each dummy key di, we have a probability qi that a search will correspond to di.

Each key ki is an internal node, and each dummy key di is a leaf.

Want a binary search tree (BST) with minimum expected search cost.

Actual cost = # of items examined.



Optimal binary search trees

Figure: 4.5 Two binary search trees for a set of n = 5 keys.


Optimal binary search trees

The probability table

i     0     1     2     3     4     5
pi    -     0.15  0.10  0.05  0.10  0.20
qi    0.05  0.10  0.05  0.05  0.05  0.10

Every search is either successful (finding some key ki) or unsuccessful (finding some dummy key di), and so we have

Σ_{i=1}^{n} pi + Σ_{i=0}^{n} qi = 1.


Optimal binary search trees

Let us assume that the actual cost of a search equals the number
of nodes examined, i.e., the depth of the node found by the search
in T, plus 1.

Then the expected cost of a search in T is:

E[search cost in T] = Σ_{i=1}^{n} (depthT(ki) + 1) · pi + Σ_{i=0}^{n} (depthT(di) + 1) · qi
                    = 1 + Σ_{i=1}^{n} depthT(ki) · pi + Σ_{i=0}^{n} depthT(di) · qi


Optimal binary search trees

From figure 4.5, we can calculate the expected search cost node by
node:

For the first binary search tree


node depth probability contribution
k1 1 0.15 0.30
k2 0 0.10 0.10
k3 2 0.05 0.15
k4 1 0.10 0.20
k5 2 0.20 0.60
d0 2 0.05 0.15
d1 2 0.10 0.30
d2 3 0.05 0.20
d3 3 0.05 0.20
d4 3 0.05 0.20
d5 3 0.10 0.40
Total 2.80
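A quick Python check of this arithmetic, using the depths and probabilities from the table above:

# (depth, probability) pairs for k1..k5 and d0..d5 in the first tree of Figure 4.5
nodes = [(1, 0.15), (0, 0.10), (2, 0.05), (1, 0.10), (2, 0.20),
         (2, 0.05), (2, 0.10), (3, 0.05), (3, 0.05), (3, 0.05), (3, 0.10)]
expected_cost = sum((depth + 1) * p for depth, p in nodes)
# expected_cost == 2.80 (up to floating-point rounding)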

Optimal binary search trees

Following a similar procedure the expected cost of the second


figure ⇒ 2.75.

Observations:

Optimal BST might not have smallest height.

Optimal BST might not have highest-probability key at root.

The lowest expected cost of any binary search tree with k5 at


the root is 2.85.


Optimal binary search trees

Use optimal substructure (Recursive procedure) to construct an


optimal solution to the problem from optimal solutions to
subproblems:

Given keys ki , ..., kj and dummy keys di−1 , ..., dj . (the


problem).

One of them, kr , where i ≤ r ≤ j, must be the root.

Left subtree of kr contains keys ki , ..., kr−1 (and dummy keys


di−1 , ..., dr−1 ).

Right subtree of kr contains kr+1 , ..., kj (and dummy keys


dr , ..., dj ).


Optimal binary search trees

If we examine all candidate roots kr, for i ≤ r ≤ j, and

we determine all optimal BSTs containing ki, ..., kr−1 and containing kr+1, ..., kj,

then we are guaranteed to find an optimal BST for ki, ..., kj.

Computing the optimal solution with dynamic programming requires storing intermediate solutions, which are then used to solve the bigger subproblems.
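The slides stop short of pseudocode here, but a minimal Python sketch of the resulting dynamic program (following the standard CLRS formulation; e[i][j] is the expected cost of an optimal BST on keys ki..kj and w[i][j] is the total probability mass of that subtree, names which are assumptions of this sketch rather than notation from the text) might look like:

def optimal_bst(p, q, n):
    # p[1..n]: key probabilities, q[0..n]: dummy-key probabilities (p[0] unused).
    # Returns the minimum expected search cost e[1][n].
    e = [[0.0] * (n + 1) for _ in range(n + 2)]
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 2):
        e[i][i - 1] = q[i - 1]          # empty subtree: only dummy key d_{i-1}
        w[i][i - 1] = q[i - 1]
    for length in range(1, n + 1):      # solve subtrees in order of increasing size
        for i in range(1, n - length + 2):
            j = i + length - 1
            w[i][j] = w[i][j - 1] + p[j] + q[j]
            e[i][j] = float("inf")
            for r in range(i, j + 1):   # try every key kr as the root
                cost = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if cost < e[i][j]:
                    e[i][j] = cost
    return e[1][n]

# With the probability table from this section:
# p = [0, 0.15, 0.10, 0.05, 0.10, 0.20]; q = [0.05, 0.10, 0.05, 0.05, 0.05, 0.10]
# optimal_bst(p, q, 5) returns 2.75 (up to floating-point rounding), the cost quoted above.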


Greedy Algorithms

Similar to dynamic programming.

Used for optimization problems.

Idea: When we have a choice to make, make the one that


looks best right now.
Make a locally optimal choice in hope of getting a globally
optimal solution.

Greedy algorithms don’t always yield an optimal solution.


But sometimes they do.


An activity-selection problem

An activity-selection problem

Suppose we have a set S = {a1, a2, ..., an} of n activities that require exclusive use of a common resource.

For example, scheduling the use of a classroom.

ai needs resource during period [si , fi ), which is a half-open


interval, where si = start time and fi = finish time.

Goal: Select the largest possible set of nonoverlapping (mutually


compatible) activities.

We assume that the activities are sorted in monotonically


increasing order of finish time.


An activity-selection problem

Note: Could have many other objectives:

Schedule room for longest time.

Maximize income from rental fees.

For example, consider the following set S of activities for the


activity selection problem:

i 1 2 3 4 5 6 7 8 9
si 1 2 4 1 5 8 9 11 13
fi 3 5 7 8 9 10 11 14 16


An activity-selection problem

Maximum-size mutually compatible set: {a1 , a3 , a6 , a8 }.


Not unique: also {a2 , a5 , a7 , a9 }.


An activity-selection problem

We shall solve this problem in several steps.

In a dynamic-programming solution, we consider several choices when determining which subproblems to use in an optimal solution.

We then observe that we need consider only one choice, the greedy choice, and that when we make the greedy choice, only one subproblem remains.


An activity-selection problem

The optimal substructure

Sij = {ak ∈ S : fi ≤ sk < fk ≤ sj }

activities that start after ai finishes and finish before aj starts.

Activities in Sij are compatible with

all activities that finish by fi , and


all activities that start no earlier than sj .

By including ak in an optimal solution, we are left with two subproblems:
the activities in Sik and the activities in Skj.


An activity-selection problem

Suppose Aij is a maximum-size set of mutually compatible activities in Sij that includes ak. Let Aik = Aij ∩ Sik and Akj = Aij ∩ Skj.

Aik contains the activities in Aij that finish before ak starts, and Akj contains the activities in Aij that start after ak finishes.

Thus,
Aij = Aik ∪ {ak } ∪ Akj

and so the maximum-size set Aij of mutually compatible activities


in Sij consists of |Aij | = |Aik | + |Akj | + 1 activities.

Thus, we might solve the activity-selection problem by


dynamic programming.


An activity-selection problem

If we denote the size of an optimal solution for the set Sij by


c[i, j], then we would have the recurrence
c[i, j] = c[i, k] + c[k, j] + 1.

If we did not know that an optimal solution for the set Sij
includes activity ak , we would have to examine all activities in
Sij to find which one to choose, so that

c[i, j] = 0                                          if Sij = ∅,
c[i, j] = max_{ak ∈ Sij} {c[i, k] + c[k, j] + 1}     if Sij ≠ ∅.


An activity-selection problem

Making the greedy choice

What if we could choose an activity to add to our optimal solution


without having to first solve all the subproblems?

That could save us from having to consider all the choices


inherent in recurrence.

In fact, for the activity-selection problem, we need consider


only one choice: the greedy choice.

Now we can solve the problem Sij top down,

Choose am ∈ Sij with earliest finish time: the greedy choice.

Then solve Smj .


An activity-selection problem

What are the subproblems?

Original problem is S0,n+1 , suppose our first choice is am1 .

Then next subproblem is Sm1 ,n+1 , suppose next choice is am2 .

Next subproblem is Sm2 ,n+1 , and so on.

Each subproblem is Smi ,n+1 , i.e., the last activities to finish. And
the subproblems chosen have finish times that increase.

Therefore, we can consider each activity just once, in


monotonically increasing order of finish time.


An activity-selection problem

A recursive greedy algorithm

The procedure RECURSIVE-ACTIVITY-SELECTOR takes:

The start and finish times of the activities, represented as


arrays s and f,

Index k that defines the subproblem Sk to be solved and,

Size n of the original problem.

It returns a maximum-size set of mutually compatible


activities in Sk .

Assumes activities are already sorted by monotonically increasing


finish time. (If not, then sort in O(n lg n) time.)


An activity-selection problem

The initial call, which solves the entire problem, is RECURSIVE-ACTIVITY-SELECTOR(s, f, 0, n).

Algorithm 8 RECURSIVE-ACTIVITY-SELECTOR(s, f, k, n)
1: m = k + 1
2: while m ≤ n and s[m] < f[k] do    ▷ find the first activity in Sk to finish
3: m = m + 1
4: end while
5: if m ≤ n then
6: return {am} ∪ RECURSIVE-ACTIVITY-SELECTOR(s, f, m, n)
7: else return ∅
8: end if

Assuming that the activities have already been sorted by finish


times, the running time of the call
RECURSIVE-ACTIVITY-SELECTOR(s, f, k, n) is Θ(n).

An activity-selection problem

Example: Consider the following set S of activities:

i 1 2 3 4 5 6 7 8 9 10 11
si 1 3 0 5 3 5 6 8 8 2 12
fi 4 5 6 7 9 9 10 11 12 14 16


An activity-selection problem

Figure: 4.6 The operation of RECURSIVE-ACTIVITY-SELECTOR on the 11 activities given above.

An activity-selection problem

An iterative greedy algorithm

Algorithm 9 GREEDY-ACTIVITY-SELECTOR(s, f)
1: n = s.length
2: A = {a1 }
3: k=1
4: for m = 2 to n do
5: if s[m] ≥ f[k] then
6: A = A ∪ {am }
7: k=m
8: end if
9: end for
10: return A

Like the recursive version, GREEDY-ACTIVITY-SELECTOR


schedules a set of n activities in Θ(n) time.
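A Python sketch of the iterative selector (0-based indices here, unlike the pseudocode; the activities are assumed to be already sorted by finish time):

def greedy_activity_selector(s, f):
    # s[i], f[i]: start and finish times, with f nondecreasing.
    n = len(s)
    A = [0]                      # always take the first activity to finish
    k = 0                        # index of the most recently added activity
    for m in range(1, n):
        if s[m] >= f[k]:         # a_m starts after a_k finishes, so it is compatible
            A.append(m)
            k = m
    return A

# For the 11-activity example above:
# s = [1, 3, 0, 5, 3, 5, 6, 8, 8, 2, 12]
# f = [4, 5, 6, 7, 9, 9, 10, 11, 12, 14, 16]
# greedy_activity_selector(s, f) returns [0, 3, 7, 10], i.e. activities a1, a4, a8, a11.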

Elements of the greedy strategy

Elements of the greedy strategy

Generally, we design greedy algorithms according to the following


sequence of steps:

1 Cast the optimization problem as one in which we make a


choice and are left with one subproblem to solve.

2 Prove that there is always an optimal solution to the original


problem that makes the greedy choice, so that the greedy
choice is always safe.

3 Demonstrate optimal substructure.


Elements of the greedy strategy

How can we tell whether a greedy algorithm will solve a particular


optimization problem?

Greedy-choice property

Optimal substructure


Elements of the greedy strategy

Greedy-choice property

A globally optimal solution can be arrived at by making a locally


optimal (greedy) choice.

Greedy:

Make a choice at each step.


Make the choice before solving the subproblems.
Solve top-down.

Optimal substructure
Just show that optimal solution to subproblem and greedy choice
⇒ optimal solution to problem.

Elements of the greedy strategy

Greedy vs. dynamic programming

0-1 knapsack problem:

n items.
Item i is worth vi birr, weighs wi pounds.

Find a most valuable subset of items with total weight ≤ W.

Have to either take an item or not take it; we cannot take part of it.

Fractional knapsack problem:

Like the 0-1 knapsack problem, but we can take a fraction of an item.


Elements of the greedy strategy

Both have optimal substructure.

The fractional knapsack problem has the greedy-choice property.

To solve the fractional problem, rank items by


value/weight: vi /wi .

Sorting the items by value per pound, the greedy algorithm


runs in O(n lg n) time.
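A Python sketch of this greedy strategy for the fractional problem (function name and example numbers are illustrative, not from the text):

def fractional_knapsack(values, weights, W):
    # Take items in decreasing value-per-pound order, splitting the last one if needed.
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    for v, w in items:
        if W <= 0:
            break
        take = min(w, W)         # whole item, or whatever capacity remains
        total += v * (take / w)
        W -= take
    return total

# fractional_knapsack([60, 100, 120], [10, 20, 30], 50) returns 240.0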

Greedy doesn’t work for the 0-1 knapsack problem. Might get
empty space, which lowers the average value per pound of the
items taken.


Elements of the greedy strategy

Figure: 4.7 An example showing that the greedy strategy does not work
for the 0-1 knapsack problem.


Huffman Codes

Huffman Codes
Huffman codes compress data very effectively.

Savings of 20% to 90% are typical, depending on the characteristics of the data being compressed.
Based on how often each character occurs (i.e., its frequency).

Example: Suppose we have a 100,000-character data file.

                           a    b    c    d    e    f
Frequency (in thousands)   45   13   12   16   9    5
Fixed-length codeword      000  001  010  011  100  101
Variable-length codeword   0    101  100  111  1101 1100

Table: 4.2 A Character-coding problem.



Huffman Codes

Represent via a binary character code ⇒ unique binary string


(codeword).

1 Fixed-length code:
3 bits to represent 6 characters =⇒ 300,000 bits to code
the entire file.

2 Variable-length code:
Frequent characters =⇒ short codewords.
From Table 4.2 the code requires
(45·1 + 13·3 + 12·3 + 16·3 + 9·4 + 5·4)·1000 = 224,000 bits
to represent the file, a savings of approximately 25%.


Huffman Codes

Prefix Codes

Prefix code: no codeword is also a prefix of some other codeword. A prefix code can always achieve the optimal data compression among any character code.

Encoding a binary character code: Concatenate the code


words.

For example: abc =⇒ 0 · 101 · 100 = 0101100.

They simplify decoding.


Decoding Procedure:
Identify the initial codeword,
Translate it back to the original character, and
Repeat the decoding process on the remainder of the encoded
file.


Huffman Codes

A binary tree whose leaves are the given characters provides one
representation.

Binary codeword for a character =⇒ simple path from the


root to that character.

0 means ”go to the left child” and 1 means ”go to the right
child.”


Huffman Codes

Figure: 4.8 Trees corresponding to the coding schemes in Table 4.2.


An optimal code for a file is always represented by a full
binary tree, in which every nonleaf node has two children.
The number of bits required to encode a file is thus

B(T) = Σ_{c∈C} c.freq · dT(c),

which we define as the cost of the tree T.

Huffman Codes

Constructing a Huffman code

Greedy algorithm that constructs an optimal prefix code called a


Huffman code.

Assume C is a set of n characters and that each character c ∈ C is an object with an attribute c.freq giving its frequency.

Builds the tree T corresponding to the optimal code in a


bottom-up manner.

Performs a sequence of |C| - 1 ”merging” operations to


create the final tree.


Huffman Codes

Algorithm 10 HUFFMAN(C)
1: n = |C|
2: Q = C
3: for i = 1 to n-1 do
4: allocate a new node z
5: z.left = x = EXTRACT-MIN(Q)
6: z.right = y = EXTRACT-MIN(Q)
7: z.freq = x.freq + y.freq
8: INSERT(Q, z)
9: end for
10: return EXTRACT-MIN(Q)    ▷ return the root of the tree
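A minimal Python sketch of the same greedy construction, using the standard heapq module as the min-priority queue (the tuple-based tree representation is just for illustration):

import heapq
import itertools

def huffman(freqs):
    # freqs: dict mapping character -> frequency.
    # Returns a dict mapping character -> codeword (a string of '0'/'1').
    tie = itertools.count()              # tie-breaker so trees are never compared
    heap = [(f, next(tie), ch) for ch, f in freqs.items()]
    heapq.heapify(heap)
    for _ in range(len(freqs) - 1):      # |C| - 1 merging operations
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    _, _, root = heap[0]
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")  # 0 means go to the left child
            walk(node[1], prefix + "1")  # 1 means go to the right child
        else:
            codes[node] = prefix or "0"
    walk(root, "")
    return codes

# huffman({"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}) yields an optimal
# prefix code; the code lengths match Table 4.2 (a: 1 bit, b, c, d: 3 bits, e, f: 4 bits).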


Huffman Codes

Figure: 4.9 The steps of Huffman's algorithm for the frequencies given in Table 4.2.

Huffman Codes

To analyze the running time of Huffman’s algorithm:

Assume that Q is implemented as a binary min-heap.

For a set C of n characters, we can initialize Q in line 2 in O(n) time using the BUILD-MIN-HEAP procedure.

The for loop in lines 3-8 executes exactly n - 1 times, and


since each heap operation requires time O(lg n), the loop
contributes O(n lg n) to the running time.

Thus, the total running time of HUFFMAN on a set of n characters


is O(n lg n).


Amortized Analysis

Analyze a sequence of operations on a data structure.

Goal: Show that although some individual operations may be


expensive, on average the cost per operation is small.

Average in this context does not mean that we’re averaging over a
distribution of inputs.

No probability is involved.
We’re talking about average cost in the worst case.


Aggregate analysis

Aggregate analysis

In aggregate analysis, we show that for all n, a sequence of n


operations takes worst-case time T(n) in total.

In the worst case, the average cost, or amortized cost, per


operation is therefore T(n)/n.


Aggregate analysis

Stack operations

PUSH(S, x): O(1) each ⇒ O(n) for any sequence of n


operations.

POP(S): O(1) each ⇒ O(n) for any sequence of n


operations.

Now let’s add the stack operation MULTIPOP(S, k), which


removes the k top objects of stack S.

Algorithm 11 MULTIPOP(S, k)
1: while not STACK-EMPTY(S) and k > 0 do
2: POP(S)
3: k = k-1
4: end while
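A small Python sketch of these stack operations (a Python list serves as the stack; class and method names are illustrative):

class Stack:
    def __init__(self):
        self.items = []

    def push(self, x):                  # O(1)
        self.items.append(x)

    def pop(self):                      # O(1)
        return self.items.pop()

    def multipop(self, k):
        # Pops min(s, k) objects, one POP at a time.
        while self.items and k > 0:
            self.pop()
            k -= 1

# Any sequence of n PUSH, POP, and MULTIPOP operations on an initially empty stack
# takes O(n) time in total, since each pushed object can be popped at most once.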


Aggregate analysis

Figure: 4.10 The action of MULTIPOP on a stack S. The top 4 objects are popped by MULTIPOP(S, 4), whose result is shown in (b). The next operation is MULTIPOP(S, 7), shown in (c).

Running time of MULTIPOP:

Linear in # of POP operations.
# of iterations of the while loop is min(s, k), where s = # of objects on the stack.
Therefore, total cost = min(s, k).

Aggregate analysis

Considering a sequence of n PUSH, POP, MULTIPOP operations:

Worst-case cost of MULTIPOP is O(n), since the stack size is


at most n.

Have n operations.

Therefore, the worst-case cost of the sequence is O(n^2).

Although this analysis is correct, the bound obtained by considering the worst-case cost of each operation individually is not tight.


Aggregate analysis

Observation

Each object can be popped only once per time that it’s
pushed.
Have ≤ n PUSHes ⇒ ≤ n POPs, including those in
MULTIPOP.
Therefore, total cost = O(n).
Average over the n operations ⇒ O(1) per operation on
average.

Again, notice no probability.

Showed worst-case O(n) cost for sequence.


Therefore, O(1) per operation on average.
This technique is called aggregate analysis.

Aggregate analysis

Incrementing a binary counter

Consider the problem of implementing a k-bit binary counter


A[0..k - 1] of bits, where A[0] is the least significant bit and
A[k - 1] is the most significant bit.

Counts upward from 0.

Value of the counter is Σ_{i=0}^{k−1} A[i] · 2^i.

Initially, the counter value is 0, so A[0..k − 1] = 0.


Aggregate analysis

Algorithm 12 INCREMENT(A)
1: i=0
2: while i < A.length and A[i]==1 do
3: A[i]=0
4: i = i+1
5: end while
6: if i < A.length then
7: A[i]=1
8: end if
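A direct Python sketch of INCREMENT on a list of bits, with A[0] as the least significant bit:

def increment(A):
    # Flips bits of A in place; A[i] in {0, 1}.
    i = 0
    while i < len(A) and A[i] == 1:     # the carry ripples over the trailing 1s
        A[i] = 0
        i += 1
    if i < len(A):
        A[i] = 1                        # the first 0 becomes 1 (unless the counter overflowed)

# A = [0] * 8
# for _ in range(16):
#     increment(A)
# A is now [0, 0, 0, 0, 1, 0, 0, 0], i.e. the value 16, as in Figure 4.11.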


Aggregate analysis

Figure: 4.11 An 8-bit binary counter as its value goes from 0 to 16 by a sequence of 16 INCREMENT operations.

Cost of INCREMENT = Θ(# of bits flipped).
Analysis: Each call could flip k bits, so n INCREMENTs take O(nk) time.

Aggregate analysis

Observation
Not every bit flips every time.
bit     flips how often    times in n INCREMENTs
0       every time         n
1       1/2 the time       ⌊n/2⌋
2       1/4 the time       ⌊n/4⌋
...
i       1/2^i the time     ⌊n/2^i⌋
...
i ≥ k   never              0

Therefore, total # of flips = Σ_{i=0}^{k−1} ⌊n/2^i⌋ < n · Σ_{i=0}^{∞} 1/2^i = 2n.

As a result, n INCREMENTs cost O(n).
Average cost per operation = O(1).

Aggregate analysis

End of Chapter 4

Questions?
