Linear Quadratic Dynamic Programming

Sang Seok Lee

Department of Economics, Bilkent University

Spring Semester 2017


This Lecture Is Based on

- Ljungqvist and Sargent (2004)
- Hansen and Sargent (2008)
- A nice alternative to my lecture slides is Martin Ellison's lecture slides at http://users.ox.ac.uk/~exet2581/recursive/lqg_mat.pdf

Matrix Calculus

- Let z, x, and a be n × 1 vectors and y be an m × 1 vector; let A be an n × n matrix and B be an m × n matrix
- Some rules (a numerical check of the second one follows below):
  - ∂(a'x)/∂x = a
  - ∂(x'Ax)/∂x = (A + A')x; if A is symmetric, it is 2Ax
  - ∂²(x'Ax)/∂x∂x' = A + A'; if A is symmetric, it is 2A
  - ∂(x'Ax)/∂A = xx'
  - ∂(y'Bz)/∂y = Bz; ∂(y'B)/∂y = B
  - ∂(y'Bz)/∂z = B'y; ∂(Bz)/∂z' = B
  - ∂(y'Bz)/∂B = yz'
- See http://en.wikipedia.org/wiki/Matrix_calculus for a quick (albeit clumsy) reference
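
As a sanity check, the rule ∂(x'Ax)/∂x = (A + A')x can be verified numerically. A minimal Python sketch, with a made-up matrix and vector:

  # Finite-difference check of d(x'Ax)/dx = (A + A')x
  import numpy as np

  rng = np.random.default_rng(0)
  n = 4
  A = rng.standard_normal((n, n))   # generic, non-symmetric n x n matrix
  x = rng.standard_normal(n)

  f = lambda v: v @ A @ v           # the quadratic form x'Ax
  grad_rule = (A + A.T) @ x         # the matrix-calculus rule

  eps = 1e-6
  grad_fd = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)
                      for e in np.eye(n)])
  print(np.max(np.abs(grad_rule - grad_fd)))   # agrees up to ~1e-9
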
What Is Linear Quadratic Dynamic Programming?

- Dynamic programming in which
  - The return function is quadratic
  - The transition function is linear
  - The resulting optimal policy function is linear
- Let us call it LQDP from now on
- Implication: it is possible to solve the Bellman equation using linear algebra
- LQDP is very popular in macroeconomics
  - Kydland and Prescott (1982), which was discussed during the RBC lecture, is an example


Optimal Linear Regulator Problem

- We will solve the following optimal linear regulator problems:
  - Deterministic and undiscounted
  - Deterministic and discounted
  - Stochastic and discounted
Deterministic and Undiscounted

max_{{u_t}_{t=0}^∞} −∑_{t=0}^∞ (x_t'Rx_t + u_t'Qu_t)

s.t. x_{t+1} = Ax_t + Bu_t and a given x_0, where

- x_t is an n × 1 vector of state variables and u_t is a k × 1 vector of control variables
- R is a positive semidefinite symmetric matrix and Q is a positive definite symmetric matrix
- A is an n × n matrix and B is an n × k matrix
- We conjecture that the value function is quadratic:

  V(x) = −x'Px

  where P is a positive semidefinite symmetric matrix


Deterministic and Undiscounted

- The associated Bellman equation is

  −x'Px = max_u { −x'Rx − u'Qu − (Ax + Bu)'P(Ax + Bu) }

  where the transition law is substituted into tomorrow's state. Expanding,

  −x'Px = max_u { −x'Rx − u'Qu − x'A'PAx − x'A'PBu − u'B'PAx − u'B'PBu }

- Solving the problem on the right-hand side gives the following FOC (remember the symmetry of P):

  −2Qu − B'PAx − B'PAx − 2B'PBu = 0

  (Q + B'PB)u = −B'PAx

  u = −Fx, where F = (Q + B'PB)^{-1}B'PA
Deterministic and Undiscounted
−x'Px = max_u { −x'Rx − u'Qu − x'A'PAx − x'A'PBu − u'B'PAx − u'B'PBu }

u = −Fx, where F = (Q + B'PB)^{-1}B'PA

- Substituting u into the above,

  −x'Px = −x'Rx − (Fx)'Q(Fx) − x'A'PAx + x'A'PB(Fx) + (Fx)'B'PAx − (Fx)'B'PB(Fx)

  −x'Px = −x'Rx − x'F'QFx − x'A'PAx + x'A'PBFx + x'F'B'PAx − x'F'B'PBFx

- Note that x'A'PBFx = x'F'B'PAx because they are scalars
- Rearranging,

  −x'Px = −x'(R + F'QF + A'PA − 2A'PBF + F'B'PBF)x

- This implies

  P = R + F'QF + A'PA − 2A'PBF + F'B'PBF

  P = R + A'PA + F'(Q + B'PB)F − 2A'PBF
Deterministic and Undiscounted

P = R + A'PA + F'(Q + B'PB)F − 2A'PBF

- Substituting F = (Q + B'PB)^{-1}B'PA into it,

  P = R + A'PA + A'PB(Q + B'PB)^{-1}(Q + B'PB)(Q + B'PB)^{-1}B'PA
      − 2A'PB(Q + B'PB)^{-1}B'PA

  P = R + A'PA − A'PB(Q + B'PB)^{-1}B'PA   (∗)

- (∗) is called the algebraic matrix Riccati equation
- It has a positive semidefinite solution under certain conditions
- This can be solved numerically: one can iterate on

  P_{j+1} = R + A'P_j A − A'P_j B(Q + B'P_j B)^{-1}B'P_j A

  from P_0 = 0 until convergence (see the sketch below)
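
A minimal Python sketch of this iteration, assuming NumPy; the matrices A, B, R, Q below are made up purely for illustration:

  import numpy as np

  def solve_riccati(A, B, R, Q, tol=1e-10, max_iter=10_000):
      """Iterate P_{j+1} = R + A'P_j A - A'P_j B (Q + B'P_j B)^{-1} B'P_j A."""
      P = np.zeros_like(R)
      for _ in range(max_iter):
          P_new = R + A.T @ P @ A \
              - A.T @ P @ B @ np.linalg.solve(Q + B.T @ P @ B, B.T @ P @ A)
          if np.max(np.abs(P_new - P)) < tol:
              return P_new
          P = P_new
      raise RuntimeError("Riccati iteration did not converge")

  A = np.array([[0.9, 0.1], [0.0, 0.8]])   # stable example system
  B = np.array([[0.0], [1.0]])
  R = np.eye(2)                            # state cost
  Q = np.array([[0.5]])                    # control cost

  P = solve_riccati(A, B, R, Q)
  F = np.linalg.solve(Q + B.T @ P @ B, B.T @ P @ A)   # policy u = -Fx
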


Deterministic and Discounted

max_{{u_t}_{t=0}^∞} −∑_{t=0}^∞ β^t (x_t'Rx_t + u_t'Qu_t),   0 < β < 1

s.t. x_{t+1} = Ax_t + Bu_t and a given x_0

- The introduction of the discount factor β modifies the Riccati equation:

  P = R + βA'PA − β²A'PB(Q + βB'PB)^{-1}B'PA

- This can be solved by iterating on

  P_{j+1} = R + βA'P_j A − β²A'P_j B(Q + βB'P_j B)^{-1}B'P_j A

  from P_0 = 0 until convergence (a sketch follows below)

- The associated optimal policy function is

  u_t = −Fx_t, where F = β(Q + βB'PB)^{-1}B'PA

- The value function is still V(x) = −x'Px, where P is obtained from solving the Riccati equation above
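
A sketch of the discounted iteration, reusing the illustrative matrices from the previous sketch and an assumed β = 0.95:

  import numpy as np

  def solve_riccati_discounted(A, B, R, Q, beta, tol=1e-10, max_iter=10_000):
      """Iterate on the discounted Riccati equation from P_0 = 0."""
      P = np.zeros_like(R)
      for _ in range(max_iter):
          P_new = R + beta * A.T @ P @ A \
              - beta**2 * A.T @ P @ B @ np.linalg.solve(
                  Q + beta * B.T @ P @ B, B.T @ P @ A)
          if np.max(np.abs(P_new - P)) < tol:
              return P_new
          P = P_new
      raise RuntimeError("Riccati iteration did not converge")

  beta = 0.95                              # assumed discount factor
  A = np.array([[0.9, 0.1], [0.0, 0.8]])
  B = np.array([[0.0], [1.0]])
  R, Q = np.eye(2), np.array([[0.5]])

  P = solve_riccati_discounted(A, B, R, Q, beta)
  F = beta * np.linalg.solve(Q + beta * B.T @ P @ B, B.T @ P @ A)  # u_t = -F x_t
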
Computer Codes

- Matlab's Control System Toolbox can be used to solve the Riccati equation on the previous slide (http://de.mathworks.com/help/control/ref/dare.html); a SciPy alternative is sketched below
- Tom Sargent provides a Matlab code (https://ideas.repec.org/c/dge/qmrbcd/21.html)
- Martin Ellison also provides a Matlab code (http://users.ox.ac.uk/~exet2581/recursive/riccati.m)
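
For Python users, SciPy's solve_discrete_are offers an alternative; a sketch, with two caveats: SciPy's q and r arguments play the roles of this lecture's R (state cost) and Q (control cost), and the discount factor can be absorbed by scaling A and B by √β:

  import numpy as np
  from scipy.linalg import solve_discrete_are

  beta = 0.95
  A = np.array([[0.9, 0.1], [0.0, 0.8]])
  B = np.array([[0.0], [1.0]])
  R, Q = np.eye(2), np.array([[0.5]])

  # Scaling by sqrt(beta) turns the undiscounted DARE into the discounted one
  P = solve_discrete_are(np.sqrt(beta) * A, np.sqrt(beta) * B, R, Q)
  F = beta * np.linalg.solve(Q + beta * B.T @ P @ B, B.T @ P @ A)
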
Stochastic and Discounted

max_{{u_t}_{t=0}^∞} E_0 { −∑_{t=0}^∞ β^t (x_t'Rx_t + u_t'Qu_t) },   0 < β < 1

s.t. x_{t+1} = Ax_t + Bu_t + Cε_{t+1} and a given x_0, where

- ε_{t+1} is an n × 1 vector of shocks that is iid-normal with a zero vector as the mean and an identity matrix as the variance:

  E{ε_{t+1}} = 0 and E{ε_{t+1}ε_{t+1}'} = I

- The value function takes the form

  v(x) = −x'Px − d

  where P is a positive semidefinite matrix obtained from solving the Riccati equation from the deterministic and discounted case, and d = β(1 − β)^{-1} trace(PCC')

- The optimal policy function is (a computational sketch follows below)

  u_t = −Fx_t, where F = β(Q + βB'PB)^{-1}B'PA
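
A computational sketch of the stochastic case, continuing the illustrative matrices from above with a made-up C; note that P and F are computed exactly as in the deterministic discounted case:

  import numpy as np
  from scipy.linalg import solve_discrete_are

  beta = 0.95
  A = np.array([[0.9, 0.1], [0.0, 0.8]])
  B = np.array([[0.0], [1.0]])
  R, Q = np.eye(2), np.array([[0.5]])
  C = np.array([[0.1, 0.0], [0.0, 0.2]])   # n x n shock loading (illustrative)

  P = solve_discrete_are(np.sqrt(beta) * A, np.sqrt(beta) * B, R, Q)
  F = beta * np.linalg.solve(Q + beta * B.T @ P @ B, B.T @ P @ A)
  d = beta / (1 - beta) * np.trace(P @ C @ C.T)   # the new constant term

  # Simulate x_{t+1} = (A - BF) x_t + C eps_{t+1} under the optimal policy
  rng = np.random.default_rng(0)
  x = np.array([1.0, 0.0])
  for t in range(100):
      x = (A - B @ F) @ x + C @ rng.standard_normal(2)
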
Stochastic and Discounted

- P and F for the deterministic and the stochastic case are identical
- This is called Certainty Equivalence
  - We discussed this in the previous lecture
- Let us prove the claims above

Stochastic and Discounted
- The Bellman equation is

  −x'Px − d = max_u [ −x'Rx − u'Qu − βE{(Ax + Bu + Cε)'P(Ax + Bu + Cε)} − βd ]

  where E{·} is taken with respect to ε_{t+1}

- Expanding,

  −x'Px − d = max_u [ −x'Rx − u'Qu − βE{x'A'PAx + x'A'PBu + x'A'PCε + u'B'PAx + u'B'PBu
                      + u'B'PCε + ε'C'PAx + ε'C'PBu + ε'C'PCε} − βd ]

- Taking the expectations (E{ε} = 0 eliminates the terms linear in ε),

  −x'Px − d = max_u [ −x'Rx − u'Qu − βx'A'PAx − βx'A'PBu − βu'B'PAx − βu'B'PBu − βE{ε'C'PCε} − βd ]

- The FOC of the right-hand side is

  −2Qu − βB'PAx − βB'PAx − 2βB'PBu = 0

  u = −β(Q + βB'PB)^{-1}B'PAx
Stochastic and Discounted
−x'Px − d = max_u [ −x'Rx − u'Qu − βx'A'PAx − βx'A'PBu − βu'B'PAx − βu'B'PBu − βE{ε'C'PCε} − βd ]

u = −Fx, where F = β(Q + βB'PB)^{-1}B'PA

- Using the well-known result about the expected value of a quadratic form,

  E{ε'C'PCε} = trace(C'PC)

- Because the trace is invariant under cyclic permutations,

  trace(C'PC) = trace(PCC')

- So E{ε'C'PCε} = trace(PCC') (a Monte Carlo check follows below)
- Substituting u and E{ε'C'PCε} into the above,

  −x'Px − d = −x'Rx − x'F'QFx − βx'A'PAx + βx'A'PBFx + βx'F'B'PAx − βx'F'B'PBFx
              − β trace(PCC') − βd

  −x'Px − d = −x'(R + F'QF + βA'PA − βA'PBF − βF'B'PA + βF'B'PBF)x − β(trace(PCC') + d)
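
A small Monte Carlo check of this identity, with a made-up positive semidefinite P and an arbitrary C:

  import numpy as np

  rng = np.random.default_rng(0)
  n = 3
  M = rng.standard_normal((n, n))
  P = M @ M.T                        # symmetric positive semidefinite
  C = rng.standard_normal((n, n))    # arbitrary n x n matrix

  draws = rng.standard_normal((200_000, n))               # eps ~ N(0, I)
  quad = np.sum((draws @ (C.T @ P @ C)) * draws, axis=1)  # eps'C'PC eps per draw
  print(quad.mean(), np.trace(P @ C @ C.T))               # should nearly agree
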
Stochastic and Discounted
−x'Px − d = −x'(R + F'QF + βA'PA − βA'PBF − βF'B'PA + βF'B'PBF)x − β(trace(PCC') + d)

F = β(Q + βB'PB)^{-1}B'PA

- Noting that x'A'PBFx = x'F'B'PAx, let us figure out P first:

  P = R + F'(Q + βB'PB)F + βA'PA − 2βA'PBF

  P = R + β²A'PB(Q + βB'PB)^{-1}(Q + βB'PB)(Q + βB'PB)^{-1}B'PA + βA'PA
      − 2β²A'PB(Q + βB'PB)^{-1}B'PA

  P = R + βA'PA − β²A'PB(Q + βB'PB)^{-1}B'PA

  which is identical to the one from the deterministic and discounted case, as claimed above

- Now, let us figure out d:

  d = β(trace(PCC') + d)

  d = β(1 − β)^{-1} trace(PCC')
Certainty Equivalence

- Even though d depends on CC', which is the covariance matrix of Cε_{t+1}, F in the optimal policy function does not depend on it
- This implies that this framework is not suitable for studying the effect of risk, such as precautionary savings
- This certainty equivalence property is due to the quadratic objective function, the linear transition function, and E_t{ε_{t+1}} = 0
- It breaks down without these assumptions
Closing Remark

- I recommend that you take a look at the Lagrangian formulation of the optimal linear regulator problem in Ljungqvist and Sargent (2004)
- Chow (1997) is an excellent textbook which shows deep connections between the Bellman and the Lagrangian approaches
- I also recommend Hansen and Sargent's (2008) textbook for a proper treatment of this topic and beyond
