
Dynamic Programming

C Patvardhan
Professor, Electrical Engineering
Dayalbagh Educational Institute, Agra
Algorithm types
„ Algorithm types we will consider include:
„ Simple recursive algorithms
„ Backtracking algorithms
„ Divide and conquer algorithms
„ Dynamic programming algorithms
„ Greedy algorithms
„ Branch and bound algorithms
„ Brute force algorithms
„ Randomized algorithms
Counting coins
„ To find the minimum number of US coins to make any amount,
the greedy method always works
„ At each step, just choose the largest coin that does not overshoot the
desired amount: 31¢ = 25¢ + 5¢ + 1¢
„ The greedy method would not work if we did not have 5¢ coins
„ For 31 cents, the greedy method gives seven coins (25+1+1+1+1+1+1),
but we can do it with four (10+10+10+1)
„ The greedy method also would not work if we had a 21¢ coin
„ For 63 cents, the greedy method gives six coins (25+25+10+1+1+1), but
we can do it with three (21+21+21)
„ How can we find the minimum number of coins for any given
coin set?
Coin set for examples

„ For the following examples, we will assume coins in the
following denominations: 1¢ 5¢ 10¢ 21¢ 25¢
„ We’ll use 63¢ as our goal

„ This example is taken from:


Data Structures & Problem Solving using Java by Mark Allen Weiss
A simple solution

„ We always need a 1¢ coin, otherwise no solution exists for
making one cent
„ To make K cents:
„ If there is a K-cent coin, then that one coin is the minimum
„ Otherwise, for each value i < K,
„ Find the minimum number of coins needed to make i cents
„ Find the minimum number of coins needed to make K − i cents
„ Choose the i that minimizes this sum

„ This algorithm can be viewed as divide-and-conquer, or as brute
force
„ This solution is very recursive
„ It requires exponential work
„ It is infeasible to solve for 63¢
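To make this concrete, here is a minimal Python sketch of the recursion just described (the name min_coins and the coins argument are our own illustration, not from the original slides):

def min_coins(k, coins):
    # Fewest coins to make k cents; brute-force recursion (exponential time).
    if k == 0:
        return 0
    if k in coins:                  # a k-cent coin exists: one coin is the minimum
        return 1
    # Otherwise try every split k = i + (k - i); assumes a 1-cent coin exists.
    return min(min_coins(i, coins) + min_coins(k - i, coins)
               for i in range(1, k))

In principle min_coins(63, [1, 5, 10, 21, 25]) returns 3, but the recursion tree is so large that the call is infeasible in practice, exactly as noted above.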
Another solution
„ We can reduce the problem recursively by choosing the
first coin, and solving for the amount that is left
„ For 63¢:
„ One 1¢ coin plus the best solution for 62¢
„ One 5¢ coin plus the best solution for 58¢
„ One 10¢ coin plus the best solution for 53¢
„ One 21¢ coin plus the best solution for 42¢
„ One 25¢ coin plus the best solution for 38¢
„ Choose the best solution from among the 5 given above
„ Instead of solving 62 recursive problems, we solve 5
„ This is still a very expensive algorithm
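A sketch of this improved recursion in Python (again, names are our own): instead of splitting into arbitrary pairs, choose the first coin and recurse on the remainder, so the branching factor drops to the number of denominations:

def min_coins_first(k, coins):
    # Fewest coins to make k cents: pick the first coin, recurse on the rest.
    if k == 0:
        return 0
    # Assumes a 1-cent coin, so some c <= k always exists.
    return min(1 + min_coins_first(k - c, coins) for c in coins if c <= k)

This is still exponential (base 5 for the coin set above), but far cheaper than the pairwise version.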
A dynamic programming solution
„ Idea: Solve first for one cent, then two cents, then three cents,
etc., up to the desired amount
„ Save each answer in an array!
„ For each new amount N, compute all the possible pairs of
previous answers which sum to N
„ For example, to find the solution for 13¢,
„ First, solve for all of 1¢, 2¢, 3¢, ..., 12¢
„ Next, choose the best solution among:
„ Solution for 1¢ + solution for 12¢
„ Solution for 2¢ + solution for 11¢
„ Solution for 3¢ + solution for 10¢
„ Solution for 4¢ + solution for 9¢
„ Solution for 5¢ + solution for 8¢
„ Solution for 6¢ + solution for 7¢
Example
„ Suppose coins are 1¢, 3¢, and 4¢
„ There’s only one way to make 1¢ (one coin)
„ To make 2¢, try 1¢+1¢ (one coin + one coin = 2 coins)
„ To make 3¢, just use the 3¢ coin (one coin)
„ To make 4¢, just use the 4¢ coin (one coin)
„ To make 5¢, try
„ 1¢ + 4¢ (1 coin + 1 coin = 2 coins)
„ 2¢ + 3¢ (2 coins + 1 coin = 3 coins)
„ The first solution is better, so best solution is 2 coins
„ To make 6¢, try
„ 1¢ + 5¢ (1 coin + 2 coins = 3 coins)
„ 2¢ + 4¢ (2 coins + 1 coin = 3 coins)
„ 3¢ + 3¢ (1 coin + 1 coin = 2 coins) – best solution
„ Etc.
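A bottom-up sketch in Python (our own illustration; it assumes a 1¢ coin so every amount can be made). This variant tries each coin as the last coin used, which is what gives the O(N*K) running time claimed on the next slide:

def min_coins_dp(amount, coins):
    # best[n] = fewest coins making n cents, computed for n = 1 .. amount.
    INF = float("inf")
    best = [0] + [INF] * amount
    for n in range(1, amount + 1):
        for c in coins:             # try each coin as the last coin used
            if c <= n:
                best[n] = min(best[n], best[n - c] + 1)
    return best[amount]

With coins 1¢, 3¢, 4¢ this reproduces the example above (best[5] = 2, best[6] = 2), and min_coins_dp(63, [1, 5, 10, 21, 25]) returns 3.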
How good is the algorithm?
„ The first algorithm is recursive, with a branching factor
of up to 62
„ Possibly the average branching factor is somewhere around
half of that (31)
„ The algorithm takes exponential time, with a large base
„ The second algorithm is much better—it has a
branching factor of 5
„ This is exponential time, with base 5
„ The dynamic programming algorithm is O(N*K), where
N is the desired amount and K is the number of different
kinds of coins
Comparison with divide-and-conquer
„ Divide-and-conquer algorithms split a problem into separate
subproblems, solve the subproblems, and combine the results for
a solution to the original problem
„ Example: Quicksort
„ Example: Mergesort
„ Example: Binary search
„ Divide-and-conquer algorithms can be thought of as top-down
algorithms
„ In contrast, a dynamic programming algorithm proceeds by
solving small problems, then combining them to find the solution
to larger problems
„ Dynamic programming can be thought of as bottom-up
Example 2: Binomial Coefficients
„ (x + y)^2 = x^2 + 2xy + y^2, coefficients are 1,2,1
„ (x + y)^3 = x^3 + 3x^2y + 3xy^2 + y^3, coefficients are 1,3,3,1
„ (x + y)^4 = x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4,
coefficients are 1,4,6,4,1
„ (x + y)^5 = x^5 + 5x^4y + 10x^3y^2 + 10x^2y^3 + 5xy^4 + y^5,
coefficients are 1,5,10,10,5,1
„ The n+1 coefficients of (x + y)^n can be computed according to
the formula c(n, i) = n! / (i! * (n − i)!)
for each of i = 0..n
„ The repeated computation of all the factorials gets to be expensive
„ We can use dynamic programming to save the factorials as we go
Solution by dynamic programming
„ n c(n,0) c(n,1) c(n,2) c(n,3) c(n,4) c(n,5) c(n,6)
„ 0 1
„ 1 1 1
„ 2 1 2 1
„ 3 1 3 3 1
„ 4 1 4 6 4 1
„ 5 1 5 10 10 5 1
„ 6 1 6 15 20 15 6 1
„ Each row depends only on the preceding row
„ Only linear space and quadratic time are needed
„ This construction is known as Pascal’s Triangle
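A sketch of this row-by-row computation in Python (names ours), keeping only the preceding row, hence linear space and quadratic time:

def binomial(n, k):
    # c(n, k) via Pascal's Triangle: build each row from the previous one.
    row = [1]                       # row for n = 0
    for _ in range(n):
        # New row: 1, sums of adjacent pairs from the old row, 1.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
    return row[k]

For example, binomial(6, 2) returns 15, matching the table above.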
The principle of optimality, I
„ Dynamic programming is a technique for finding an
optimal solution
„ The principle of optimality applies if the optimal
solution to a problem always contains optimal solutions
to all subproblems
„ Example: Consider the problem of making N¢ with the
fewest number of coins
„ Either there is an N¢ coin, or
„ The set of coins making up an optimal solution for N¢ can be
divided into two nonempty subsets, n1¢ and n2¢
„ If either subset, n1¢ or n2¢, can be made with fewer coins, then clearly
N¢ can be made with fewer coins, hence the solution was not optimal
The principle of optimality, II
„ The principle of optimality holds if
„ Every optimal solution to a problem contains...
„ ...optimal solutions to all subproblems
„ The principle of optimality does not say
„ If you have optimal solutions to all subproblems...
„ ...then you can combine them to get an optimal solution
„ Example: In US coinage,
„ The optimal solution to 7¢ is 5¢ + 1¢ + 1¢, and
„ The optimal solution to 6¢ is 5¢ + 1¢, but
„ The optimal solution to 13¢ is not 5¢ + 1¢ + 1¢ + 5¢ + 1¢
„ But there is some way of dividing up 13¢ into subsets with
optimal solutions (say, 11¢ + 2¢) that will give an optimal
solution for 13¢
„ Hence, the principle of optimality holds for this problem
Longest simple path
„ Consider the following graph:

[Figure: a small weighted graph on nodes A, B, C, and D]

„ The longest simple path (path not containing a cycle) from A
to D is A B C D
„ However, the subpath A B is not the longest simple path
from A to B (A C B is longer)
„ The principle of optimality is not satisfied for this problem
„ Hence, the longest simple path problem cannot be solved by
a dynamic programming approach
The 0-1 knapsack problem
„ A thief breaks into a house, carrying a knapsack...
„ He can carry up to 25 pounds of loot
„ He has to choose which of N items to steal
„ Each item has some weight and some value
„ “0-1” because each item is stolen (1) or not stolen (0)
„ He has to select the items to steal in order to maximize the value of his
loot, but cannot exceed 25 pounds
„ A greedy algorithm does not find an optimal solution
„ A dynamic programming algorithm works well
„ This is similar to, but not identical to, the coins problem
„ In the coins problem, we had to make an exact amount of change
„ In the 0-1 knapsack problem, we can’t exceed the weight limit, but the
optimal solution may be less than the weight limit
„ The dynamic programming solution is similar to that of the coins problem
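As a hedged sketch (the item weights and values below are made up for illustration, not from the slides), the standard one-dimensional DP for the 0-1 knapsack looks like this:

def knapsack(items, capacity):
    # items: list of (weight, value) pairs; returns the best value within capacity.
    best = [0] * (capacity + 1)     # best[w] = max value with weight budget w
    for weight, value in items:
        # Sweep weights downward so each item is taken at most once (the "0-1" part).
        for w in range(capacity, weight - 1, -1):
            best[w] = max(best[w], best[w - weight] + value)
    return best[capacity]

For instance, knapsack([(10, 60), (20, 100), (24, 120)], 25) returns 120, whereas a greedy choice by value density would take the 10-pound item first and end up with only 60.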
Comments
„ Dynamic programming relies on working “from the bottom up”
and saving the results of solving simpler problems
„ These solutions to simpler problems are then used to compute the solution
to more complex problems
„ Dynamic programming solutions can often be quite complex and
tricky
„ Dynamic programming is used for optimization problems,
especially ones that would otherwise take exponential time
„ Only problems that satisfy the principle of optimality are suitable for
dynamic programming solutions
„ Since exponential time is unacceptable for all but the smallest
problems, dynamic programming is sometimes essential
Longest Common Subsequence
Š Problem: Given 2 sequences, X = 〈x1,...,xm〉 and
Y = 〈y1,...,yn〉, find a common subsequence whose
length is maximum.

  springtime      ncaa tournament      basketball
  printing        north carolina       krzyzewski

Subsequence need not be consecutive, but must be in order.
Other sequence questions
Š Edit distance: Given 2 sequences, X = 〈x1,...,xm〉
and Y = 〈y1,...,yn〉, what is the minimum number of
deletions, insertions, and changes that you must do
to change one to another?
Š Protein sequence alignment: Given a score matrix
on amino acid pairs, s(a,b) for a,b ∈ {Λ} ∪ A,
and 2 amino acid sequences, X = 〈x1,...,xm〉 ∈ A^m
and Y = 〈y1,...,yn〉 ∈ A^n, find the alignment with
lowest score…
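As a sketch of the classic DP for the edit-distance problem (the Levenshtein recurrence; our own illustration, not from the slides):

def edit_distance(x, y):
    # Minimum deletions, insertions, and changes turning x into y.
    m, n = len(x), len(y)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                 # delete all of x[:i]
    for j in range(n + 1):
        d[0][j] = j                 # insert all of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # change (free on a match)
    return d[m][n]

For example, edit_distance("kitten", "sitting") returns 3.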
More problems
Optimal BST: Given sequence K = k1 < k2 <··· < kn
of n sorted keys, with a search probability pi for
each key ki, build a binary search tree (BST) with
minimum expected search cost.
Matrix chain multiplication: Given a sequence of
matrices A1 A2 … An, with Ai of dimension mi×ni,
insert parentheses to minimize the total number of
scalar multiplications.
Minimum convex decomposition of a polygon,
Hydrogen placement in protein structures, …
Dynamic Programming
Š Dynamic Programming is an algorithm design technique for
optimization problems: often minimizing or maximizing.
Š Like divide and conquer, DP solves problems by combining
solutions to subproblems.
solutions to subproblems.
Š Unlike divide and conquer, subproblems are not independent.
» Subproblems may share subsubproblems,
» However, solution to one subproblem may not affect the solutions to other
subproblems of the same problem. (More on this later.)
Š DP reduces computation by
» Solving subproblems in a bottom-up fashion.
» Storing solution to a subproblem the first time it is solved.
» Looking up the solution when subproblem is encountered again.
Š Key: determine structure of optimal solutions
Steps in Dynamic Programming
1. Characterize structure of an optimal solution.
2. Define value of optimal solution recursively.
3. Compute optimal solution values either top-down
with caching or bottom-up in a table.
4. Construct an optimal solution from computed
values.
We’ll study these with the help of examples.
Longest Common Subsequence
Š Problem: Given 2 sequences, X = 〈x1,...,xm〉 and
Y = 〈y1,...,yn〉, find a common subsequence whose
length is maximum.

  springtime      ncaa tournament      basketball
  printing        north carolina       snoeyink

Subsequence need not be consecutive, but must be in order.
Naïve Algorithm
Š For every subsequence of X, check whether it’s a
subsequence of Y.
Š Time: Θ(n·2^m).
» 2^m subsequences of X to check.
» Each subsequence takes Θ(n) time to check:
scan Y for first letter, for second, and so on.
Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y,
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Notation:
prefix Xi = 〈x1,...,xi〉 is the first i letters of X.
This says what any longest common subsequence must look like;
do you believe it?
Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y,
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Proof: (case 1: xm = yn)

Any sequence Z’ that does not end in xm = yn can be made longer by adding
xm = yn to the end. Therefore,
(1) longest common subsequence (LCS) Z must end in xm = yn,
(2) Zk-1 is a common subsequence of Xm-1 and Yn-1, and
(3) there is no longer CS of Xm-1 and Yn-1, or Z would not be an LCS.
Optimal Substructure
Theorem
Let Z = 〈z1, . . . , zk〉 be any LCS of X and Y.
1. If xm = yn, then zk = xm = yn and Zk-1 is an LCS of Xm-1 and Yn-1.
2. If xm ≠ yn, then either zk ≠ xm and Z is an LCS of Xm-1 and Y,
3. or zk ≠ yn and Z is an LCS of X and Yn-1.

Proof: (case 2: xm ≠ yn, and zk ≠ xm)

Since Z does not end in xm,
(1) Z is a common subsequence of Xm-1 and Y, and
(2) there is no longer CS of Xm-1 and Y, or Z would not be an LCS.
Recursive Solution
Š Define c[i, j] = length of LCS of Xi and Yj .
Š We want c[m, n].

           0                            if i = 0 or j = 0
c[i, j] =  c[i−1, j−1] + 1              if i, j > 0 and xi = yj
           max(c[i−1, j], c[i, j−1])    if i, j > 0 and xi ≠ yj

This gives a recursive algorithm and solves the problem.

But does it solve it well?
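Implemented directly (a sketch, names ours), the recurrence is exponential, because the same (i, j) subproblems are recomputed repeatedly, as the recursion tree on the next slide shows:

def lcs_naive(x, y, i, j):
    # Length of an LCS of x[:i] and y[:j], straight from the recurrence.
    if i == 0 or j == 0:
        return 0
    if x[i - 1] == y[j - 1]:
        return lcs_naive(x, y, i - 1, j - 1) + 1
    return max(lcs_naive(x, y, i - 1, j), lcs_naive(x, y, i, j - 1))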
Recursive Solution

c[springtime, printing]
├─ c[springtim, printing]
│    ├─ c[springti, printing]
│    └─ c[springtim, printin]
└─ c[springtime, printin]
     ├─ c[springtim, printin]      ← same subproblem as above
     └─ c[springtime, printi]
(next level: [springt, printing], [springti, printin], [springtim, printi], [springtime, print], ...)
Recursive Solution

• Keep track of c[α, β] in a table of nm entries
(rows indexed by the letters of “springtime”, columns by the
letters of “printing”), filled either:
• top/down
• bottom/up
Computing the length of an LCS
LCS-LENGTH (X, Y)
1.  m ← length[X]
2.  n ← length[Y]
3.  for i ← 1 to m
4.      do c[i, 0] ← 0
5.  for j ← 0 to n
6.      do c[0, j ] ← 0
7.  for i ← 1 to m
8.      do for j ← 1 to n
9.          do if xi = yj
10.             then c[i, j ] ← c[i−1, j−1] + 1
11.                  b[i, j ] ← “↖”
12.             else if c[i−1, j ] ≥ c[i, j−1]
13.                  then c[i, j ] ← c[i−1, j ]
14.                       b[i, j ] ← “↑”
15.                  else c[i, j ] ← c[i, j−1]
16.                       b[i, j ] ← “←”
17. return c and b

b[i, j ] points to the table entry whose subproblem we used in solving the
LCS of Xi and Yj. c[m, n] contains the length of an LCS of X and Y.

Time: O(mn)
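The same bottom-up table computation as LCS-LENGTH, sketched in Python with 0-indexed strings (so x[i−1] plays the role of xi):

def lcs_length(x, y):
    # Fill the c table bottom-up; c[m][n] is the LCS length. O(mn) time.
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
            else:
                c[i][j] = c[i][j - 1]
    return c

lcs_length("springtime", "printing")[-1][-1] gives the LCS length of the running example.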
Constructing an LCS
PRINT-LCS (b, X, i, j)
1. if i = 0 or j = 0
2.    then return
3. if b[i, j ] = “↖”
4.    then PRINT-LCS(b, X, i−1, j−1)
5.         print xi
6. elseif b[i, j ] = “↑”
7.    then PRINT-LCS(b, X, i−1, j)
8. else PRINT-LCS(b, X, i, j−1)

• Initial call is PRINT-LCS(b, X, m, n).
• When b[i, j ] = “↖”, we have extended the LCS by one character. So
the LCS consists of the entries with “↖” in them.
• Time: O(m+n)
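A Python sketch of the reconstruction, using the lcs_length sketch above. This variant re-derives the arrow directions from the c table instead of storing b, a common space-saving alternative; each walk still takes O(m+n):

def recover_lcs(c, x, y):
    # Walk the filled table backwards, mirroring PRINT-LCS.
    i, j, out = len(x), len(y), []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:            # the diagonal-arrow case: in the LCS
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:    # mirrors b[i, j] = "up"
            i -= 1
        else:                               # mirrors b[i, j] = "left"
            j -= 1
    return "".join(reversed(out))

For example, recover_lcs(lcs_length("springtime", "printing"), "springtime", "printing") returns one LCS of the running example.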
Steps in Dynamic Programming
1. Characterize structure of an optimal solution.
2. Define value of optimal solution recursively.
3. Compute optimal solution values either top-down
with caching or bottom-up in a table.
4. Construct an optimal solution from computed
values.
We’ll study these with the help of examples.
Optimal Binary Search Trees
Š Problem
» Given sequence K = k1 < k2 <··· < kn of n sorted keys,
with a search probability pi for each key ki.
» Want to build a binary search tree (BST)
with minimum expected search cost.
» Actual cost = # of items examined.
» For key ki, cost = depthT(ki) + 1, where depthT(ki) = depth of ki
in BST T.
Expected Search Cost
E[search cost in T ]
  = Σ_{i=1..n} (depthT(ki) + 1) · pi
  = Σ_{i=1..n} depthT(ki) · pi + Σ_{i=1..n} pi
  = 1 + Σ_{i=1..n} depthT(ki) · pi        (15.16)
(since the probabilities sum to 1)
Example
Š Consider 5 keys with these search probabilities:
p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.
[Tree: k2 at the root, with children k1 and k4; k4 has children k3 and k5]

  i    depthT(ki)    depthT(ki) · pi
  1        1              0.25
  2        0              0
  3        2              0.1
  4        1              0.2
  5        2              0.6
           total          1.15
Therefore, E[search cost] = 2.15.
Example
Š p1 = 0.25, p2 = 0.2, p3 = 0.05, p4 = 0.2, p5 = 0.3.

[Tree: k2 at the root, with children k1 and k5; k4 is the left child of k5,
and k3 is the left child of k4]

  i    depthT(ki)    depthT(ki) · pi
  1        1              0.25
  2        0              0
  3        3              0.15
  4        2              0.4
  5        1              0.3
           total          1.10

Therefore, E[search cost] = 2.10.
This tree turns out to be optimal for this set of keys.
Example
Š Observations:
» Optimal BST may not have smallest height.
» Optimal BST may not have highest-probability key at
root.
Š Build by exhaustive checking?
» Construct each n-node BST.
» For each, assign keys and compute expected search cost.
» But there are Ω(4^n / n^(3/2)) different BSTs with n nodes.
Optimal Substructure
Š Any subtree of a BST contains keys in a contiguous range
ki, ..., kj for some 1 ≤ i ≤ j ≤ n.

[Figure: tree T containing a subtree T′]

Š If T is an optimal BST and
T contains subtree T′ with keys ki, ..., kj,
then T′ must be an optimal BST for keys ki, ..., kj.
Š Proof: Cut and paste.
Optimal Substructure
Š One of the keys in ki, …,kj, say kr, where i ≤ r ≤ j,
must be the root of an optimal subtree for these keys.
Š Left subtree of kr contains ki, ..., kr−1.
Š Right subtree of kr contains kr+1, ..., kj.

[Figure: kr at the root; the left subtree holds ki .. kr−1, the right
subtree holds kr+1 .. kj]

Š To find an optimal BST:
» Examine all candidate roots kr, for i ≤ r ≤ j
» Determine all optimal BSTs containing ki, ..., kr−1 and
containing kr+1, ..., kj
Recursive Solution
Š Find optimal BST for ki,...,kj, where i ≥ 1, j ≤ n, j ≥ i−1.
When j = i−1, the tree is empty.
Š Define e[i, j ] = expected search cost of optimal BST for ki,...,kj.

Š If j = i−1, then e[i, j ] = 0.


Š If j ≥ i,
» Select a root kr, for some i ≤ r ≤ j .
» Recursively make optimal BSTs
• for ki,..,kr−1 as the left subtree, and
• for kr+1,..,kj as the right subtree.
Recursive Solution
Š When the OPT subtree becomes a subtree of a node:
» Depth of every node in OPT subtree goes up by 1.
» Expected search cost increases by w(i, j) = Σ_{l=i..j} pl, from (15.16).

Š If kr is the root of an optimal BST for ki, .., kj :
» e[i, j ] = pr + (e[i, r−1] + w(i, r−1)) + (e[r+1, j ] + w(r+1, j))
           = e[i, r−1] + e[r+1, j ] + w(i, j)
  (because w(i, j) = w(i, r−1) + pr + w(r+1, j))
Š But, we don’t know kr. Hence, try every candidate root:
  e[i, j ] = 0                                                    if j = i−1
  e[i, j ] = min over i ≤ r ≤ j of { e[i, r−1] + e[r+1, j ] + w(i, j) }   if i ≤ j
Computing an Optimal Solution
For each subproblem (i,j), store:
Š expected search cost in a table e[1..n+1, 0..n]
» Will use only entries e[i, j ], where j ≥ i−1.
Š root[i, j ] = root of subtree with keys ki,..,kj, for 1 ≤ i ≤ j ≤ n.
Š w[1..n+1, 0..n] = sum of probabilities
» w[i, i−1] = 0 for 1 ≤ i ≤ n.
» w[i, j ] = w[i, j-1] + pj for 1 ≤ i ≤ j ≤ n.
Pseudo-code
OPTIMAL-BST(p, q, n)
1.  for i ← 1 to n + 1
2.      do e[i, i−1] ← 0
3.         w[i, i−1] ← 0
4.  for l ← 1 to n                      ▹ consider all trees with l keys
5.      do for i ← 1 to n−l+1           ▹ fix the first key
6.          do j ← i + l−1              ▹ fix the last key
7.             e[i, j ] ← ∞
8.             w[i, j ] ← w[i, j−1] + pj
9.             for r ← i to j           ▹ determine the root of the optimal (sub)tree
10.                do t ← e[i, r−1] + e[r+1, j ] + w[i, j ]
11.                   if t < e[i, j ]
12.                      then e[i, j ] ← t
13.                           root[i, j ] ← r
14. return e and root

Time: O(n^3)
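The same algorithm sketched in Python (our own rendering: 1-indexed tables as in the pseudocode, p passed as a 0-indexed list, and the pseudocode's unused q parameter omitted):

import math

def optimal_bst(p):
    # e[i][j] = expected cost of an optimal BST for keys ki..kj. O(n^3) time.
    n = len(p)
    e = [[0.0] * (n + 1) for _ in range(n + 2)]     # e[i][i-1] = 0 by default
    w = [[0.0] * (n + 1) for _ in range(n + 2)]
    root = [[0] * (n + 1) for _ in range(n + 2)]
    for l in range(1, n + 1):                        # trees with l keys
        for i in range(1, n - l + 2):
            j = i + l - 1
            e[i][j] = math.inf
            w[i][j] = w[i][j - 1] + p[j - 1]         # p[j-1] plays the role of pj
            for r in range(i, j + 1):                # try each key as the root
                t = e[i][r - 1] + e[r + 1][j] + w[i][j]
                if t < e[i][j]:
                    e[i][j] = t
                    root[i][j] = r
    return e[1][n], root

optimal_bst([0.25, 0.2, 0.05, 0.2, 0.3])[0] returns approximately 2.10, matching the optimal tree in the earlier example.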
Elements of Dynamic Programming
Š Optimal substructure
Š Overlapping subproblems
Optimal Substructure
Š Show that a solution to a problem consists of making a
choice, which leaves one or more subproblems to solve.
Š Suppose that you are given this last choice that leads to an
optimal solution.
Š Given this choice, determine which subproblems arise and
how to characterize the resulting space of subproblems.
Š Show that the solutions to the subproblems used within
the optimal solution must themselves be optimal. Usually
use cut-and-paste.
Š Need to ensure that a wide enough range of choices and
subproblems are considered.
Optimal Substructure
Š Optimal substructure varies across problem domains:
» 1. How many subproblems are used in an optimal solution.
» 2. How many choices in determining which subproblem(s) to
use.
Š Informally, running time depends on (# of subproblems
overall) × (# of choices).
Š How many subproblems and choices do the examples
considered contain?
Š Dynamic programming uses optimal substructure bottom
up.
» First find optimal solutions to subproblems.
» Then choose which to use in optimal solution to the problem.
Optimal Substucture
Š Does optimal substructure apply to all optimization
problems? No.
Š Applies to determining the shortest path but NOT the
longest simple path of an unweighted directed graph.
Š Why?
» Shortest path has independent subproblems.
» Solution to one subproblem does not affect solution to another
subproblem of the same problem.
» Subproblems are not independent in longest simple path.
• Solution to one subproblem affects the solutions to other subproblems.
» Example: the longest-simple-path graph seen earlier.
Overlapping Subproblems
Š The space of subproblems must be “small”.
Š The total number of distinct subproblems is a polynomial
in the input size.
» A recursive algorithm is exponential because it solves the same
problems repeatedly.
» If divide-and-conquer is applicable, then each problem solved
will be brand new.