Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 25

Dynamic Programming

In this handout
• A shortest path example
• Deterministic Dynamic Programming
• Inventory example
• Resource allocation example
Dynamic Programming

• Dynamic programming is a widely-used mathematical


technique for solving problems that can be divided
into stages and where decisions are required in each
stage.

• The goal of dynamic programming is to find a


combination of decisions that optimizes a certain
amount associated with a system.
Dynamic Programming

• Dynamic programming solves optimization


problems by combining solutions to sub-
problems.

• “Programming” refers to a tabular method


with a series of choices, not “coding”.
Dynamic Programming

• A set of choices must be made to arrive at an


optimal solution.

• The key is to store the solutions of sub-


problems to be reused in the future.
Dynamic Programming

• Recall the divide-and-conquer approach


– Partition the problem into independent sub-
problems.
– Solve the sub-problems recursively.
– Combine solutions of sub-problems.

• This contrasts with the dynamic programming


approach.
A Sequence of 4 Steps
• A dynamic programming approach consists of a
sequence of 4 steps
1. Characterize the structure of an optimal solution.

2. Recursively define the value of an optimal solution.

3. Compute the value of an optimal solution in a


bottom-up fashion.

4. Construct an optimal solution from computed


information.
A typical example: Shortest Path
• Ben plans to drive from NY to LA
• Has friends in several cities
• After 1 day’s driving can reach Columbus, Nashville, or
Louisville
• After 2 days of driving can reach Kansas City, Omaha,
or Dallas
• After 3 days of driving can reach Denver or San Antonio
• After 4 days of driving can reach Los Angeles
• The actual mileages between cities are given in the
figure (next slide)
• Where should Ben spend each night of the trip to
minimize the number of miles traveled?
Shortest Path: network figure
Columbus
680 Kansas City
2 5 610
790
790
1050
550 Denver
8 1030

580 540
New York 900 Nashville 760 Omaha Los
1 3 6 Angeles
10

Stage 1 660 940


1390
Stage 5
San
Antonio
770 9
510 790
700
Stage 4
Louisville
270
830 Dallas
4 7

Stage 2 Stage 3
Shortest Path problem: Solution
• The problem is solved recursively by working
backward in the network.
• Let cij be the mileage between cities i and j
• Let ft(i) be the length of the shortest path from city i
to LA (city i is in stage t).

– Stage 4 computations are obvious:


Denver
• f4(8) = 1030 Los
Angeles
8 1030
10
• f4(9) = 1390
1390 Los
Angeles
San 10
Antonio
9
Stage 3 computations
– Work backward one stage (to stage 3 cities) and find the shortest
path to LA from each stage 3 city.
– To determine f3(5), note that the shortest path from city 5 to LA
must be one of the following:
• Path 1: Go from city 5 to city 8 and then take the shortest
path from city 8 to city 10.
• Path 2: Go from city 5 to city 9 and then take the shortest
path from city 9 to city 10.
c 58  f 4 (8)  610 1030  1640 *
f 3 (5)  min
c 59  f 4 (9)  790 1390  2180
Kansas City
Kansas City
5 610 5

790

shortest path Denver
1030
8
Los Angeles
10
1390
Los San
Angeles Antonio
10 9
Stage 3 computations

c 68  f 4 (8)  540 1030  1570 *


Similarly, f 3 (6)  min
c 69  f 4 (9)  940 1390  2330
Denver
8 1030

540
 Omaha Los
6 Angeles
10

c 78  f 4 (8)  790 1030  1820


f 3 (7)  min
c 79  f 4 (9)  270 1390  1660 *
Los Angeles
10

1390
San Antonio
9

Dallas 270
7
Stage 2 computations
– Work backward one stage (to stage 2 cities) and find the shortest path to LA from each stage 2 city.

c 25  f 3 (5)  680 1640  2320 *



f 2 (2)  minc 26  f 3 (6)  790 1570  2360

c 27  f 3 (7)  1050 1660  2710

Columbus
680 Kansas City
2 5 610


Denver
8 1030

Los
Angeles
10
Stage 2 computations
– Work backward one stage (to stage 2 cities) and find the shortest path to LA from each stage 2 city.

c 35  f 3 (5)  580 1640  2220 *



f 2 (3)  minc 36  f 3 (6)  760 1570  2330

c 37  f 3 (7)  660 1660  2320

Kansas City
5 610

Denver
580 8 1030

Nashville Los
3 Angeles
10
Stage 2 computations
– Work backward one stage (to stage 2 cities) and find the shortest path to LA from each stage 2 city.

c 45  f 3 (5)  510 1640  2150 *



f 2 (4)  min c 46  f 3 (6)  700 1570  2270

 c 47  f 3 (7)  830 1660  2490
Kansas City
5 610

Denver
8 1030

Los
Angeles
10
510

Louisville
4
Stage 1 computations
– Now we can find f1(1), and the shortest path from NY to LA.
Columbus 680 Kansas City
2 5 610

550 Denver
8 1030

New York c12  f 2 (2)  550  2320  2870 * Los


1  Angeles

Stage 1 f1 (1)  minc13  f 2 (3)  900  2220  3120 10


c14  f 2 (4)  770  2150  2920
• Checking back our calculations, the shortest path is
1 – 2 – 5 – 8 – 10
• that is,
NY – Columbus – Kansas City – Denver – LA
• with total mileage 2870.
General characteristics of Dynamic
Programming
• The problem structure is divided into stages.
• Each stage has a number of states associated with it.
• Making decisions at one stage transforms one state of
the current stage into a state in the next stage.
• Given the current state, the optimal decision for each
of the remaining states does not depend on the previous
states or decisions. This is known as the principle of
optimality for dynamic programming.
• The principle of optimality allows to solve the problem
stage by stage recursively.
Division into stages
 The problem is divided into smaller sub-problems each of them
represented by a stage.
 The stages are defined in many different ways depending on the
context of the problem.
 If the problem is about long-time development of a system then
the stages naturally correspond to time periods.
 If the goal of the problem is to move some objects from one
location to another on a map then partitioning the map into
several geographical regions might be the natural division into
stages.
 Generally, if an accomplishment of a certain task can be considered
as a multi-step process then each stage can be defined as a step
in the process.
States

– Each stage has a number of states associated with


it. Depending what decisions are made in one
stage, the system might end up in different states in
the next stage.
– If a geographical region corresponds to a stage then
the states associated with it could be some particular
locations (cities, warehouses, etc.) in that region.
– In other situations a state might correspond to
amounts of certain resources which are essential for
optimizing the system.
Decisions

– Making decisions at one stage transforms one state of


the current stage into a state in the next stage.
– In a geographical example, it could be a decision to go
from one city to another.
– In resource allocation problems, it might be a decision to
create or spend a certain amount of a resource.
– For example, in the shortest path problem three different
decisions are possible to make at the state corresponding
to Columbus; these decisions correspond to the three
arrows going from Columbus to the three states (cities) of
the next stage: Kansas City, Omaha, and Dallas.
Optimal Policy and Principle of Optimality

– The goal of the solution procedure is to find an optimal


policy for the overall problem, i.e., an optimal policy
decision at each stage for each of the possible states.
– Given the current state, the optimal decision for each of
the remaining states does not depend on the previous
states or decisions. This is known as the principle of
optimality for dynamic programming.
– For example, in the geographical setting the principle
works as follows: the optimal route from a current city to
the final destination does not depend on the way we got to
the city.
– A system can be formulated as a dynamic programming
problem only if the principle of optimality holds for it.
Recursive solution to the problem
– The principle of optimality allows to solve the problem
stage by stage recursively.
– The solution procedure first finds the optimal policy for
the last stage. The solution for the last stage is normally
trivial.
– Then a recursive relationship is established which
identifies the optimal policy for stage t, given that stage
t+1 has already been solved.
– When the recursive relationship is used, the solution
procedure starts at the end and moves backward stage
by stage until it finds the optimal policy starting at the
initial stage.
Solving Inventory Problems by DP
Main characteristics:
1. Time is broken up into periods. The demands for all
periods are known in advance.
2. At the beginning of each period, the firm must determine
how many units should be produced.
3. Production and storage capacities are limited.
4. Each period’s demand must be met on time from
inventory or current production.
5. During any period in which production takes place, a
fixed cost of production as well as a variable per-unit cost
is incurred.
6. The firm’s goal is to minimize the total cost of meeting on
time the demands.
Inventory Problems: Example
• Producing airplanes
• 3 production periods
• No inventory at the beginning
• Can produce at most 3 airplanes in each period
• Can keep at most 2 airplanes in inventory
• Set-up cost for each period is 10
Period 1 2 3

Demand 1 2 1

Unit cost 3 5 4

Determine a production schedule to minimize the


total cost (the DP solution on the board).
Resource Allocation Problems
• Limited resources must be allocated to different
activities

• Each activity has a benefit value which is


variable and depends on the amount of the
resource assigned to the activity

• The goal is to determine how to allocate the


resources to the activities such that the total
benefit is maximized
Resource Allocation Problems: Example
• A college student has 6 days remaining before final exams
begin in his 4 courses
• He wants allocate the study time as effectively as possible
• Needs at least 1 day for each course and wants to
concentrate on just one course each day. So 1, 2, or 3 days
should be allocated to each course
• He estimates that the alternative allocations for each course
would yield the number of grade points shown in the following
table:
Courses
Days 1 2 3 4
1 3 4 3 4
2 5 5 6 7
3 7 6 7 9

How many days should be allocated to each course?


(The DP solution on the board).

You might also like