Dynamic Programming 2
University of Hohenheim
Dynamic Programming
We denote by $c_t$ the control variables that can be chosen by the agent and are typically flows (e.g., consumption).
We denote by $x_t$ the state variables, which are typically stocks and summarize the decision maker's situation (e.g., assets).
The choices of $c_t$ affect $x_{t+1}$, and the initial value of $x_t$ is denoted by $x_0$ and given exogenously.
The utility function is again given by
$$U_0 = \sum_{t=0}^{T} \beta^t u(c_t).$$
The dynamic constraint has the form
$$x_{t+1} = f(x_t, c_t).$$
Dynamic Programming
We can apply the same principle to the value function of this problem.
Maximum attainable utility is
$$V(x_s) = \max_{\{c_t\}_{t=s}^{T}} U_s = \max_{\{c_t\}_{t=s}^{T}} \sum_{t=s}^{T} \beta^{t-s} u(c_t).$$
Suppose that the decision maker has already solved the problem that starts at time $t+1$ for a given state variable $x_{t+1}$.
Then the maximum attainable utility at time $t$ can be broken into instantaneous utility and the maximum attainable utility afterwards, given the choice that leads to $x_{t+1}$.
We therefore have
$$V(x_t) = \max_{c_t}\; u(c_t) + \beta V(x_{t+1}) \quad \text{s.t.} \quad x_{t+1} = f(x_t, c_t).$$
This equation is called the Bellman Equation.
Dynamic Programming
The idea that given the decision at t, the subsequent decision should be optimal
starting at t + 1 is Bellman’s Principle of Optimality.
Solving the Bellman equation for all t delivers the optimal sequence of control
variables.
If T is finite, the sequence of Bellman equations can be solved recursively (solve for the last period, then for the second-to-last, and so on).
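This recursive procedure can be sketched numerically. The following is my own illustration (not from the text): backward induction for a simple "cake-eating" special case with $u(c) = \log c$ and the transition $x_{t+1} = x_t - c_t$; all parameter values are assumptions.

```python
import numpy as np

# Illustration (not from the text): backward induction for a
# finite-horizon cake-eating problem with u(c) = log(c) and the
# transition x_{t+1} = x_t - c_t.  All parameter values are assumed.
T, beta = 5, 0.95
grid = np.linspace(0.01, 1.0, 200)      # grid for the state x_t

V = np.zeros((T + 2, grid.size))        # V[T+1] = 0: no utility after T
policy = np.zeros((T + 1, grid.size))

for t in range(T, -1, -1):              # solve the last period first
    for i, x in enumerate(grid):
        c = grid[grid <= x]             # feasible consumption levels
        values = np.log(c) + beta * np.interp(x - c, grid, V[t + 1])
        j = np.argmax(values)
        V[t, i] = values[j]
        policy[t, i] = c[j]

# In the last period it is optimal to eat the whole remaining cake.
print(np.allclose(policy[T], grid))
```

As expected, the last-period policy consumes the entire remaining state, and each earlier period trades off current utility against the already-computed continuation value.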
However, this can be tedious, and for T = ∞ it does not work.
Fortunately, there are better methods that allow one to gain more insight.
We focus on T = ∞ from now on.
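For $T = \infty$, a standard numerical approach is value-function iteration: apply the Bellman operator repeatedly until $V$ converges. Below is a minimal sketch, again under my own assumed log utility and cake-eating transition rather than anything from the text; for this special case the optimal policy has the known closed form $c = (1-\beta)x$, which the iteration recovers approximately.

```python
import numpy as np

# Sketch of value-function iteration for the infinite-horizon Bellman
# equation V(x) = max_c { log(c) + beta V(x - c) }.  Grid, tolerance,
# and parameters are illustrative assumptions, not from the text.
beta = 0.9
grid = np.linspace(1e-3, 1.0, 300)
V = np.zeros(grid.size)

for _ in range(2000):                   # iterate the Bellman operator
    V_new = np.empty_like(V)
    for i, x in enumerate(grid):
        c = grid[grid <= x]             # feasible consumption
        V_new[i] = np.max(np.log(c) + beta * np.interp(x - c, grid, V))
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

# Closed form for this problem: optimal consumption is c = (1 - beta) x.
values = np.log(grid) + beta * np.interp(1.0 - grid, grid, V)
c_star = grid[np.argmax(values)]
print(abs(c_star - (1 - beta)) < 0.02)
```

Convergence is guaranteed because the Bellman operator is a contraction with modulus $\beta < 1$.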
Dynamic Programming
Step 3: Use the FOCs w.r.t. $c_t$ and $c_{t+1}$ to express $V'(x_{t+1})$ and $V'(x_{t+2})$:
$$u'(c_t) + \beta V'(x_{t+1}) \frac{\partial f(x_t, c_t)}{\partial c_t} = 0 \quad\Leftrightarrow\quad V'(x_{t+1}) = -\frac{u'(c_t)}{\beta\, \frac{\partial f(x_t, c_t)}{\partial c_t}}.$$
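Steps 1 and 2 of this derivation are not shown in this excerpt. As a sketch of where Step 3 leads (my reconstruction, assuming $u$ depends only on $c_t$): the envelope theorem applied to the Bellman equation gives
$$V'(x_{t+1}) = \beta V'(x_{t+2})\,\frac{\partial f(x_{t+1}, c_{t+1})}{\partial x_{t+1}},$$
and substituting the two FOC expressions yields the Euler equation
$$\frac{u'(c_t)}{\frac{\partial f(x_t, c_t)}{\partial c_t}} = \beta\, u'(c_{t+1})\, \frac{\frac{\partial f(x_{t+1}, c_{t+1})}{\partial x_{t+1}}}{\frac{\partial f(x_{t+1}, c_{t+1})}{\partial c_{t+1}}}.$$
For a budget constraint of the form $x_{t+1} = (1+r)x_t + w - c_t$, this reduces to the familiar $u'(c_t) = \beta (1+r)\, u'(c_{t+1})$.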
Dynamic Programming
The household "sums up" the utility that is generated by consuming $c(t)$ at each instant between $0$ and $T$:
$$U = \int_0^T u[c(t)]\, dt.$$
This follows from the analogy between $\int$ and $\sum$.
In discrete time we had $U = \sum_{i=0}^{T} u(c_i)$.
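The analogy can be made concrete with a numerical check of my own (not from the text): a Riemann sum with step $\Delta t$ approaches the integral as $\Delta t \to 0$. Here $u(c) = \log c$ and $c(t) = e^t$ are assumed, so the exact value is $T^2/2$.

```python
import numpy as np

# Illustration of the sum <-> integral analogy: a left Riemann sum
# with step T/n approaches U = \int_0^T u(c(t)) dt as n grows.
# With u(c) = log(c) and c(t) = exp(t), the exact value is T^2 / 2.
T = 2.0
exact = T**2 / 2                        # \int_0^2 t dt = 2

for n in (10, 100, 10000):
    t = np.linspace(0.0, T, n, endpoint=False)
    riemann = np.sum(np.log(np.exp(t))) * (T / n)
    print(n, abs(riemann - exact))
```

The printed error shrinks roughly in proportion to the step size, as expected for a left Riemann sum.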
In the following, we suppress the time index t whenever this does not impair the
clarity of the exposition.
We often assume an infinite horizon over which individuals optimize such that $T \to \infty$. This simplification could be justified by a dynastic perspective with intergenerational altruism. Then we have
$$U = \int_0^\infty u(c)\, dt.$$
Recall that households discount the future at rate $\rho$, i.e., they are impatient, such that the discount factor is $e^{-\rho t}$ and we have
$$U = \int_0^\infty u(c)\, e^{-\rho t}\, dt.$$
Moreover, a household has wealth a.
The household gets asset income r · a (interest paid on assets).
It earns labor income w (on normalized labor input L ≡ 1).
In such a setting, the assets of the household evolve over time according to the
following differential equation:
ȧ = w + ra − c.
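The flow budget constraint is a linear ODE, so it can be checked by simulation. The sketch below (constant $w$, $r$, $c$ are my own assumed values) integrates $\dot a = w + ra - c$ with a forward Euler step and compares the result with the closed-form solution of this linear ODE.

```python
import numpy as np

# Sketch (constant w, r, c are assumptions): the flow budget
# constraint a' = w + r a - c, integrated with a forward Euler step
# and compared against the closed-form solution of this linear ODE.
r, w, c, a0 = 0.03, 1.0, 1.2, 2.0
T, n = 10.0, 100_000
dt = T / n

a = a0
for _ in range(n):
    a += (w + r * a - c) * dt           # Euler step of a' = w + ra - c

k = (w - c) / r                         # closed form: a(t) = (a0 + k) e^{rt} - k
exact = (a0 + k) * np.exp(r * T) - k
print(abs(a - exact) < 1e-3)
```

With $c > w$, assets are run down faster than interest accrues, which the simulated path reflects.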
This is called the flow budget constraint.
If the time horizon ends at T, the household must not die in debt, i.e., a(T) ≥ 0 has to hold.
In case of an infinite planning horizon, the equivalent formulation is
$$\lim_{t\to\infty} a(t)\, e^{-\bar r t} \ge 0,$$
where $\bar r = \frac{1}{t} \int_0^t r(\tau)\, d\tau$ is the average interest rate.
Interpretation: Debt can only grow at a rate lower than r̄ .
Putting it differently: You are not allowed to borrow and pay back the borrowed
money with even higher debt.
Putting it yet another way: The present value of assets must be non-negative.
This is called the No-Ponzi-Game condition.
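A Ponzi strategy can be simulated directly. The construction below is my own (not from the text): a household permanently consumes more than its income and rolls the debt over, so that $a(t)e^{-rt}$ converges to a negative constant, violating the condition.

```python
import numpy as np

# A Ponzi-style strategy: permanently consume more than income
# and roll the debt over.  The present value a(t) e^{-rt} then tends
# to a negative constant, violating lim_{t->inf} a(t) e^{-rt} >= 0.
r, w = 0.05, 1.0
c = w + 0.05                            # consume more than income forever
dt, n = 0.01, 20_000                    # simulate up to t = 200

a = 0.0
for _ in range(n):
    a += (w + r * a - c) * dt           # the gap is financed by new debt

pv = a * np.exp(-r * n * dt)            # present value of terminal assets
print(pv < 0)                           # the condition is violated
```

The debt itself grows without bound, but the violation shows up cleanly in present-value terms.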
If a(T ) > 0, households could still increase utility by consuming more in the last
period of their lives such that the original solution would not have been optimal.
Therefore, only a(T ) = 0 is consistent with an optimal behavior in the last period.
Optimality in the case of an infinite planning horizon requires leaving no assets asymptotically:
$$\lim_{t\to\infty} a(t)\, e^{-\bar r t} = 0.$$
This is called the transversality condition.
It encompasses the No-Ponzi-Game condition.
Now we have all the ingredients to solve the optimization problem by means of
Pontryagin's Maximum Principle (Pontryagin et al. 1962).
Let
x be the state variables (assets, velocity, etc.),
c be the control variables (consumption, thrust, etc.),
f (t, x , c) be the objective function (discounted utility, etc.),
g(t, x , c) be the dynamic constraint (flow budget constraint, etc.)
Then, we need to determine the optimal control $c$ that solves the following maximization problem:
$$\max_{c(t)} \int_0^T f(t, x, c)\, dt \quad \text{s.t.} \quad \dot x = g(t, x, c), \quad x(0) = x_0,$$
and a suitably defined transversality condition.
Remark: T → ∞ is also allowed.
Remarks
The variable λ is called the costate variable:
It can be interpreted as the shadow price of the state variable,
analogous to a Lagrange multiplier but now depending on time,
such that λ is a function of t (one restriction per instant).
In the case of $T \to \infty$, the transversality condition becomes
$$\lim_{t\to\infty} x(t)\lambda(t) = 0.$$
Hamiltonian:
$$H = u(c)\, e^{-\rho t} + \lambda[w + ra - c].$$
Necessary first order conditions:
$$\frac{\partial H}{\partial c} = 0 \;\Rightarrow\; u'(c)\, e^{-\rho t} - \lambda = 0,$$
$$\frac{\partial H}{\partial \lambda} = \dot a \;\Rightarrow\; w + ra - c = \dot a,$$
$$\frac{\partial H}{\partial a} = -\dot\lambda \;\Rightarrow\; \lambda r = -\dot\lambda,$$
and $\lim_{t\to\infty} a(t)\lambda(t) = 0$.
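A standard consequence of these conditions (the derivation is not shown in this excerpt) is the consumption Euler equation: differentiating the first condition with respect to time and substituting $\dot\lambda = -r\lambda$ gives
$$u''(c)\,\dot c\, e^{-\rho t} - \rho\, u'(c)\, e^{-\rho t} = -r\, u'(c)\, e^{-\rho t} \quad\Rightarrow\quad -\frac{u''(c)\, c}{u'(c)} \cdot \frac{\dot c}{c} = r - \rho.$$
For CRRA utility $u(c) = c^{1-\theta}/(1-\theta)$ this is the Keynes-Ramsey rule $\dot c / c = (r - \rho)/\theta$.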
Consider a non-renewable resource (coal, crude oil, natural gas, etc.) with a fixed global stock, of which z(t) remains at time t.
Denote the amount of the resource that is extracted
at each point in time by q(t).
The market price of the resource is given by p(t).
The market interest rate is given by r .
We assume that the cost of extraction is negligible.
Price-taking firms aim to maximize the total profit stream.
Each unit that is extracted today reduces the future resource use,
and we have an intertemporal optimization problem.
Hamiltonian:
$$H = p \cdot q \cdot e^{-rt} + \lambda(-q).$$
Necessary first order conditions:
$$\frac{\partial H}{\partial q} = 0 \;\Rightarrow\; p\, e^{-rt} = \lambda, \qquad (1)$$
$$\frac{\partial H}{\partial \lambda} = \dot z \;\Rightarrow\; -q = \dot z, \qquad (2)$$
$$\frac{\partial H}{\partial z} = -\dot\lambda \;\Rightarrow\; 0 = -\dot\lambda. \qquad (3)$$
Since λ is the shadow value of leaving the resource in the ground, Equation (1)
implies that this has to be equal to the present value of the market price of the
resource.
Equation (3) implies that the present value of the resource in the ground has to be
constant.
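Taken together, (1) and (3) imply that the price must grow at the interest rate, $p(t) = p(0)e^{rt}$, which is the Hotelling rule. The sketch below checks this numerically; $p_0$ and $r$ are my own assumed values.

```python
import numpy as np

# Numerical check of what (1) and (3) together imply: the present
# value lambda = p(t) e^{-rt} is constant only if the price grows at
# the interest rate, p(t) = p(0) e^{rt} (the Hotelling rule).
r, p0 = 0.04, 10.0
t = np.linspace(0.0, 50.0, 6)
p = p0 * np.exp(r * t)                  # Hotelling price path

lam = p * np.exp(-r * t)                # present value of the price
print(np.allclose(lam, p0))            # constant along the entire path
```

Any price path growing faster or slower than $r$ would make the discounted price drift, so firms would shift all extraction to the most profitable instant.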