B = yz
I
linear algebra
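$$\max_{\{u_t\}_{t=0}^{\infty}} \; -\sum_{t=0}^{\infty} \left\{ x_t' R x_t + u_t' Q u_t \right\}$$
With the guess $V(x) = -x'Px$, the Bellman equation is
$$-x'Px = \max_{u} \left\{ -x'Rx - u'Qu - (Ax + Bu)'P(Ax + Bu) \right\}$$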
where the transition law $x_{t+1} = Ax_t + Bu_t$ has been substituted for tomorrow's state. Expanding,
$$-x'Px = \max_{u} \left\{ -x'Rx - u'Qu - x'A'PAx - x'A'PBu - u'B'PAx - u'B'PBu \right\}$$
- Solving the problem on the right-hand side gives the following FOC (remember the symmetry of $P$):
$$-2Qu - B'PAx - B'PAx - 2B'PBu = 0$$
$$(Q + B'PB)u = -B'PAx$$
$$u = -Fx$$
$$F = (Q + B'PB)^{-1}B'PA$$
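As a numerical illustration of the first-order condition, the following minimal sketch computes $F = (Q + B'PB)^{-1}B'PA$ for a given symmetric $P$ and checks that $u = -Fx$ sets the FOC to zero and beats nearby controls. The matrices `A`, `B`, `R`, `Q`, `P` and the helper names `policy_matrix` and `bellman_rhs` are hypothetical choices made for this example, not taken from the slides; `P` here is only a guess, not yet the solution of the Riccati equation.

```python
import numpy as np

def policy_matrix(A, B, Q, P):
    """Return F = (Q + B'PB)^{-1} B'PA, so that the maximizer is u = -F x."""
    return np.linalg.solve(Q + B.T @ P @ B, B.T @ P @ A)

def bellman_rhs(x, u, A, B, R, Q, P):
    """Objective inside the max: -x'Rx - u'Qu - (Ax + Bu)'P(Ax + Bu)."""
    x_next = A @ x + B @ u
    return -(x @ R @ x) - (u @ Q @ u) - (x_next @ P @ x_next)

# Illustrative (assumed) matrices; P is an arbitrary symmetric positive definite guess.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
R = np.eye(2)
Q = np.array([[1.0]])
P = np.array([[2.0, 0.5], [0.5, 1.5]])

F = policy_matrix(A, B, Q, P)
x = np.array([1.0, -2.0])
u_star = -F @ x

# FOC residual: -2Qu - 2B'PAx - 2B'PBu, which should vanish at u = -Fx.
foc = -2 * Q @ u_star - 2 * B.T @ P @ A @ x - 2 * B.T @ P @ B @ u_star

# Compare the objective at u_star with the objective at perturbed controls.
best = bellman_rhs(x, u_star, A, B, R, Q, P)
worse = [bellman_rhs(x, u_star + du, A, B, R, Q, P)
         for du in (np.array([0.1]), np.array([-0.1]))]
print("FOC residual:", foc)
print("objective at u*:", best, "at perturbed u:", worse)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a common numerical choice; it gives the same $F$ up to rounding.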
Deterministic and Undiscounted
Substituting $u = -Fx$ into the Bellman equation,
$$-x'Px = -x'Rx - (-Fx)'Q(-Fx) - x'A'PAx - x'A'PB(-Fx) - (-Fx)'B'PAx - (-Fx)'B'PB(-Fx)$$
$$-x'Px = -x'Rx - x'F'QFx - x'A'PAx + x'A'PBFx + x'F'B'PAx - x'F'B'PBFx$$
- Note that $x'A'PBFx = x'F'B'PAx$: both are scalars, and each is the transpose of the other because $P$ is symmetric
- Rearranging,
$$-x'Px = -x'(R + F'QF + A'PA - 2A'PBF + F'B'PBF)x$$
- This implies
$$P = R + F'QF + A'PA - 2A'PBF + F'B'PBF$$
$$P = R + A'PA + F'(Q + B'PB)F - 2A'PBF$$
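Substituting $F = (Q + B'PB)^{-1}B'PA$, so that $F'(Q + B'PB)F = A'PB(Q + B'PB)^{-1}B'PA$ and $2A'PBF = 2A'PB(Q + B'PB)^{-1}B'PA$, gives
$$P = R + A'PA - A'PB(Q + B'PB)^{-1}B'PA \qquad (\ast)$$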
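One simple way to solve the equation $P = R + A'PA - A'PB(Q + B'PB)^{-1}B'PA$ numerically is to iterate on it from $P = 0$ until the update stops changing. The sketch below does this under assumed illustrative matrices; the function name `solve_riccati`, the tolerance, and the starting point are choices made for this example, and convergence is not guaranteed for arbitrary inputs.

```python
import numpy as np

def solve_riccati(A, B, R, Q, tol=1e-10, max_iter=10_000):
    """Iterate P <- R + A'PA - A'PB (Q + B'PB)^{-1} B'PA, starting from P = 0."""
    P = np.zeros_like(A, dtype=float)
    for _ in range(max_iter):
        BtPA = B.T @ P @ A
        # BtPA.T equals A'PB because every iterate P is symmetric.
        P_next = R + A.T @ P @ A - BtPA.T @ np.linalg.solve(Q + B.T @ P @ B, BtPA)
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    raise RuntimeError("Riccati iteration did not converge")

# Illustrative (assumed) matrices, not taken from the slides.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
R = np.eye(2)
Q = np.array([[1.0]])

P = solve_riccati(A, B, R, Q)
F = np.linalg.solve(Q + B.T @ P @ B, B.T @ P @ A)

# Residual of the Riccati equation at the computed P (should be close to zero).
residual = R + A.T @ P @ A - A.T @ P @ B @ F - P
print("P =\n", P)
print("F =", F)
print("max |residual| =", np.max(np.abs(residual)))
```

The closed-loop matrix `A - B @ F` can be inspected afterwards; its eigenvalues lying inside the unit circle indicates that the policy $u = -Fx$ stabilizes the state in this example.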
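Deterministic and Discounted
$$\max_{\{u_t\}_{t=0}^{\infty}} \; -\sum_{t=0}^{\infty} \beta^t \left\{ x_t' R x_t + u_t' Q u_t \right\}, \qquad 0 < \beta < 1$$
$$u_t = -Fx_t$$
$$F = \beta(Q + \beta B'PB)^{-1}B'PA$$
$$P = R + \beta A'PA - \beta^2 A'PB(Q + \beta B'PB)^{-1}B'PA$$
- The value function is still $V(x) = -x'Px$, where $P$ is obtained by solving the Riccati equation above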
Computer Codes
(http://de.mathworks.com/help/control/ref/dare.html)
(https://ideas.repec.org/c/dge/qmrbcd/21.html)
(http://users.ox.ac.uk/~exet2581/recursive/riccati.m)
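For readers working in Python rather than MATLAB, a comparable route is `scipy.linalg.solve_discrete_are`. This is a swapped-in tool, not one of the codes linked above. The sketch assumes the discounted Riccati equation $P = R + \beta A'PA - \beta^2 A'PB(Q + \beta B'PB)^{-1}B'PA$ and uses the observation that it coincides with the undiscounted equation for the rescaled matrices $\sqrt{\beta}A$ and $\sqrt{\beta}B$. The numerical matrices and the value of `beta` are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_discrete_are

beta = 0.95  # assumed discount factor for this example

# Illustrative (assumed) matrices, not taken from the slides.
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
R = np.eye(2)          # state weight in the slides' notation
Q = np.array([[1.0]])  # control weight in the slides' notation

# SciPy's argument order is (a, b, q, r), where q weights the state and r the control,
# so the slides' R goes in the third slot and the slides' Q in the fourth.
P = solve_discrete_are(np.sqrt(beta) * A, np.sqrt(beta) * B, R, Q)

# Policy matrix for the discounted problem: F = beta (Q + beta B'PB)^{-1} B'PA.
F = beta * np.linalg.solve(Q + beta * B.T @ P @ B, B.T @ P @ A)

# Residual of the discounted Riccati equation; beta * A'PB @ F equals the beta^2 term.
residual = R + beta * A.T @ P @ A - beta * A.T @ P @ B @ F - P
print("P =\n", P)
print("F =", F)
print("max |residual| =", np.max(np.abs(residual)))
```

Note the naming clash: SciPy calls the state weight `q` and the control weight `r`, which is the reverse of the notation used here.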
Stochastic and Discounted
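$$\max_{\{u_t\}_{t=0}^{\infty}} \; -E_0 \sum_{t=0}^{\infty} \beta^t \left\{ x_t' R x_t + u_t' Q u_t \right\}, \qquad 0 < \beta < 1$$
subject to $x_{t+1} = Ax_t + Bu_t + C\varepsilon_{t+1}$.
- The value function is $V(x) = -x'Px - d$, where $P$ solves a Riccati equation identical to the one in the deterministic and discounted case
$$-x'Px - d = \max_{u} \left\{ -x'Rx - u'Qu - \beta E\left[ (Ax + Bu + C\varepsilon)'P(Ax + Bu + C\varepsilon) \right] - \beta d \right\}$$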
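$$\begin{aligned} -x'Px - d = \max_{u} \big[ & -x'Rx - u'Qu - \beta E\{ x'A'PAx + x'A'PBu + x'A'PC\varepsilon \\ & + u'B'PAx + u'B'PBu + u'B'PC\varepsilon + \varepsilon'C'PAx + \varepsilon'C'PBu + \varepsilon'C'PC\varepsilon \} - \beta d \big] \end{aligned}$$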
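- Taking expectations, the terms linear in $\varepsilon$ drop out because $\varepsilon$ has mean zero:
$$\begin{aligned} -x'Px - d = \max_{u} \big[ & -x'Rx - u'Qu - \beta x'A'PAx - \beta x'A'PBu \\ & - \beta u'B'PAx - \beta u'B'PBu - \beta E\{\varepsilon'C'PC\varepsilon\} - \beta d \big] \end{aligned}$$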
- The FOC of the right-hand side is
$$-2Qu - \beta B'PAx - \beta B'PAx - 2\beta B'PBu = 0$$
$$u = -\beta(Q + \beta B'PB)^{-1}B'PAx$$
$$u = -Fx$$
$$F = \beta(Q + \beta B'PB)^{-1}B'PA$$
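- Using the well-known result about the expected value of a quadratic form (here with $E\{\varepsilon\varepsilon'\} = I$),
$$E\{\varepsilon'C'PC\varepsilon\} = \mathrm{trace}(C'PC)$$
- Because the trace is invariant under cyclic permutations,
$$\mathrm{trace}(C'PC) = \mathrm{trace}(PCC')$$
- So, $E\{\varepsilon'C'PC\varepsilon\} = \mathrm{trace}(PCC')$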
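The two trace identities can be checked exactly on a small example. The sketch below assumes a shock whose components are independent and equal to $\pm 1$ with equal probability, which has mean zero and identity covariance, so the expectation can be computed by enumerating all sign vectors instead of simulating. The matrices `C` and `P` are illustrative assumptions.

```python
import itertools
import numpy as np

# Illustrative (assumed) matrices; P is symmetric.
C = np.array([[0.5, 0.0], [0.2, 0.3]])
P = np.array([[2.0, 0.5], [0.5, 1.5]])
M = C.T @ P @ C

# Enumerate every sign vector in {-1, +1}^k with equal weight:
# this distribution has E[eps] = 0 and E[eps eps'] = I exactly.
k = C.shape[1]
draws = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
expected_quadratic = np.mean([eps @ M @ eps for eps in draws])

trace_CPC = np.trace(C.T @ P @ C)
trace_PCC = np.trace(P @ C @ C.T)

print("E[eps' C'PC eps] =", expected_quadratic)
print("trace(C'PC)      =", trace_CPC)
print("trace(PCC')      =", trace_PCC)
```

The same identity holds for any shock distribution with mean zero and identity covariance (for instance a standard normal); the two-point distribution is used here only because it makes the expectation exact.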
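- Substituting $u = -Fx$ and $E\{\varepsilon'C'PC\varepsilon\}$ into the above,
$$\begin{aligned} -x'Px - d = {} & -x'Rx - x'F'QFx - \beta x'A'PAx + \beta x'A'PBFx \\ & + \beta x'F'B'PAx - \beta x'F'B'PBFx - \beta\,\mathrm{trace}(PCC') - \beta d \end{aligned}$$
$$-x'Px - d = -x'(R + F'QF + \beta A'PA - \beta A'PBF - \beta F'B'PA + \beta F'B'PBF)x - \beta(\mathrm{trace}(PCC') + d)$$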
- Noting that $A'PBF = F'B'PA$, let us figure out $P$ first:
$$P = R + F'(Q + \beta B'PB)F + \beta A'PA - 2\beta A'PBF$$
$$\begin{aligned} P = {} & R + \beta^2 A'PB(Q + \beta B'PB)^{-1}(Q + \beta B'PB)(Q + \beta B'PB)^{-1}B'PA \\ & + \beta A'PA - 2\beta^2 A'PB(Q + \beta B'PB)^{-1}B'PA \end{aligned}$$
$$P = R + \beta A'PA - \beta^2 A'PB(Q + \beta B'PB)^{-1}B'PA$$
which is identical to the one from the deterministic and discounted case, as claimed above
- Now, let us figure out $d$:
$$d = \beta(\mathrm{trace}(PCC') + d)$$
$$d = \frac{\beta}{1 - \beta}\,\mathrm{trace}(PCC')$$
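The constant $d$ is the fixed point of the scalar map $d \mapsto \beta(\mathrm{trace}(PCC') + d)$, which is a contraction for $0 < \beta < 1$. The following sketch compares the closed form $\frac{\beta}{1-\beta}\mathrm{trace}(PCC')$ with the result of simply iterating that map. The values of `beta`, `P`, and `C` are illustrative assumptions; in particular `P` is an arbitrary symmetric matrix here rather than a solved Riccati matrix.

```python
import numpy as np

beta = 0.95  # assumed discount factor

# Illustrative (assumed) matrices.
P = np.array([[2.0, 0.5], [0.5, 1.5]])
C = np.array([[0.5, 0.0], [0.2, 0.3]])

tr = np.trace(P @ C @ C.T)

# Closed form: d = beta / (1 - beta) * trace(PCC').
d_closed = beta / (1 - beta) * tr

# Fixed-point iteration on d = beta * (trace(PCC') + d), starting from d = 0.
d = 0.0
for _ in range(2000):
    d = beta * (tr + d)

print("trace(PCC') =", tr)
print("closed form d =", d_closed)
print("iterated d    =", d)
```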
Certainty Equivalence
- Even though $d$ depends on $CC'$, which is the covariance matrix of the shock term $C\varepsilon_{t+1}$, $F$ in the optimal policy function does not depend on it
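Certainty equivalence can be seen directly in a computation: $C$ enters only the constant $d$, never $P$ or $F$. The sketch below solves the discounted problem by iterating on $P = R + \beta A'PA - \beta^2 A'PB(Q + \beta B'PB)^{-1}B'PA$ for two different shock loadings. The function name `solve_lq`, the matrices, and `beta` are assumptions made for this example, and plain iteration is just one possible solution method.

```python
import numpy as np

def solve_lq(A, B, C, R, Q, beta, tol=1e-10, max_iter=10_000):
    """Return (P, F, d) for the stochastic discounted problem via Riccati iteration."""
    P = np.zeros_like(A, dtype=float)
    for _ in range(max_iter):
        G = np.linalg.solve(Q + beta * B.T @ P @ B, B.T @ P @ A)
        P_next = R + beta * A.T @ P @ A - beta**2 * A.T @ P @ B @ G
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    else:
        raise RuntimeError("Riccati iteration did not converge")
    # F = beta (Q + beta B'PB)^{-1} B'PA: C does not appear anywhere in P or F.
    F = beta * np.linalg.solve(Q + beta * B.T @ P @ B, B.T @ P @ A)
    # d = beta / (1 - beta) * trace(PCC'): the only place where C matters.
    d = beta / (1 - beta) * np.trace(P @ C @ C.T)
    return P, F, d

# Illustrative (assumed) matrices.
beta = 0.95
A = np.array([[1.0, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
R = np.eye(2)
Q = np.array([[1.0]])
C_small = np.array([[0.1, 0.0], [0.0, 0.1]])
C_large = 3.0 * C_small  # three times the shock loading, nine times the covariance

P_small, F_small, d_small = solve_lq(A, B, C_small, R, Q, beta)
P_large, F_large, d_large = solve_lq(A, B, C_large, R, Q, beta)

print("F with small shocks:", F_small)
print("F with large shocks:", F_large)
print("d with small shocks:", d_small, " d with large shocks:", d_large)
```

In this example the policy matrix is the same in both runs, while $d$ scales with the covariance $CC'$: more uncertainty lowers the value of the problem but does not change the decision rule.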