
Victor M.

Chan Ortiz Optimal Control


CINVESTAV - GDL Homework 4 April 7, 2020

1. Find the control input u(t) sequence that minimizes the cost functional:

J = −y(tf )

Subject to the state constraints:

ẏ(t) = y(t)u(t) − y(t) − u²(t)

for an initial condition y(0) = 2 and final time tf = 5. Give the control, the state
response, and the co-state.
Answer:
First, we identify the elements of the problem:

h(y(tf ), tf ) = −y(tf )
g(y(t), u(t), t) = 0
a(y(t), u(t), t) = y(t)u(t) − y(t) − u²(t)

The Hamiltonian is given by:

H = g + Pᵀa = P(t)[y(t)u(t) − y(t) − u²(t)]

Setting the necessary conditions for the optimal control:

ẏ(t) = a(y(t), u(t), t)


Ṗ = −HyT
Hu = 0

Substituting:

Hu = 0 → P(t)[y(t) − 2u(t)] = 0 → y(t) − 2u(t) = 0 (assuming P(t) ≠ 0)


u*(t) = (1/2) y(t)
Replacing u∗ (t) in ẏ(t):
ẏ(t) = y(t)[(1/2)y(t)] − y(t) − [(1/2)y(t)]²

ẏ(t) = (1/2)y²(t) − y(t) − (1/4)y²(t)

ẏ(t) = (1/4)y²(t) − y(t)
Replacing u*(t) in Ṗ(t):

Ṗ(t) = −P(t)[u(t) − 1] → Ṗ(t) = −P(t)[(1/2)y(t) − 1]


Solving the differential equations. First, for ẏ(t):

ẏ(t) = (1/4)[y²(t) − 4y(t)]

y(t) = 4 / (1 + C1 e^t)
Now to find C1 , we use the conditions given by the problem:
2 = 4 / (1 + C1 e⁰)

C1 = 1

Then:

y(t) = 4 / (1 + e^t)
Replacing y(t) in u*(t):

u*(t) = 2 / (1 + e^t)
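As a quick numerical check (a sketch assuming NumPy/SciPy; not part of the original solution), the closed-form state can be compared against a direct integration of ẏ = y²/4 − y:

```python
import numpy as np
from scipy.integrate import solve_ivp

# State equation after substituting u*(t) = y(t)/2
def ydot(t, y):
    return 0.25 * y**2 - y

sol = solve_ivp(ydot, (0.0, 5.0), [2.0], rtol=1e-10, atol=1e-12,
                dense_output=True)

t = np.linspace(0.0, 5.0, 50)
y_closed = 4.0 / (1.0 + np.exp(t))   # closed-form solution found above
err = np.max(np.abs(sol.sol(t)[0] - y_closed))
print(err)
```

The maximum deviation is at the level of the integrator tolerances, which supports the closed-form solution.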
Now the co-state:
 
Ṗ(t) = −P(t)[2/(1 + e^t) − 1]

∫ from P(0) to P(t) of dP/P = ∫ from 0 to t of [1 − 2/(1 + e^τ)] dτ

ln(P(t)) − ln(P(0)) = [τ − 2(τ − ln(1 + e^τ))] evaluated from 0 to t

ln(P(t)/P(0)) = [2 ln(1 + e^τ) − τ] evaluated from 0 to t

ln(P(t)/P(0)) = 2 ln(1 + e^t) − 2 ln(2) − t

ln(P(t)/P(0)) = ln[((1 + e^t)/2)²] − t

P(t)/P(0) = ((1 + e^t)/2)² e^(−t)

P(t) = P(0) ((1 + e^t)/2)² e^(−t)
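The co-state solution can be checked the same way (a sketch assuming SciPy; the ratio P(t)/P(0) does not depend on the choice of P(0)):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Co-state equation with y(t) = 4/(1 + e^t) already substituted
def Pdot(t, P):
    return -P * (2.0 / (1.0 + np.exp(t)) - 1.0)

P0 = 1.0   # arbitrary; only the ratio P(t)/P(0) is being tested
sol = solve_ivp(Pdot, (0.0, 5.0), [P0], rtol=1e-10, atol=1e-12,
                dense_output=True)

t = np.linspace(0.0, 5.0, 50)
ratio_closed = ((1.0 + np.exp(t)) / 2.0) ** 2 * np.exp(-t)
err = np.max(np.abs(sol.sol(t)[0] / P0 - ratio_closed))
print(err)
```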


The boundary condition is given by Kirk, equation 5.1-20:

(∂h/∂y)(y*(tf)) − p*(tf) = 0

Since h = −y(tf), we have ∂h/∂y = −1, so the co-state must satisfy P(tf) = −1:

P(0) ((1 + e^{tf})/2)² e^(−tf) = −1

P(0) = −4 e^{tf} / (1 + e^{tf})²

We know that tf = 5:

P(0) = −4 e⁵ / (1 + e⁵)²

P(0) ≈ −0.0266

2. It is known that the stochastic version of the HJB equation is written as:
 
−Jt* = min_u { g(x, u, t) + Jx* a(x, u, t) + (1/2) trace[ Jxx* G(x, u, t) W Gᵀ(x, u, t) ] }    (1)
for the optimal control with cost function
J = E[ m(x(tf)) + ∫ from t to tf of g(x(τ), u(τ), τ) dτ | x(t) = given ]    (2)

and the dynamics


ẋ(t) = a(x, u, t) + G(x, t)w    (3)

where E[w(t)wᵀ(τ)] = W δ(t − τ).

(a) In case the dynamics is LTI and the objective is quadratic, as in:
a(x, u, t) = Ax + Bu,   G(x, t) ≡ G    (4)
g(x, u, t) = xᵀQx + uᵀRu,   m(x(tf)) = xᵀ(tf) M x(tf)    (5)
Show that the optimal control law and the optimal cost have the form of:
u*(t) = −R⁻¹Bᵀ P(t) x(t)    (6)
J* = xᵀP(t)x + e(t)    (7)
where P(t) and e(t) are determined by:
−Ṗ = AᵀP + P A + Q − P B R⁻¹ Bᵀ P,    P(tf) = M
−ė = trace(P G W Gᵀ),    e(tf) = 0


(b) Show that the optimal cost can be approximated as the following, under the
assumption that tf is sufficiently large and P converges to its steady state value
Pss :
J ∗ = xT Pss x + (tf − t)trace(Pss GW GT ) (8)

(c) Show that:


lim_{t→∞} E[xᵀQx + uᵀRu] = trace(Pss G W Gᵀ)    (9)

when u is determined by the optimal control law.

Answer a):
In order to show the conditions given in section a):
J* = xᵀP(t)x + e(t) → Jt* = xᵀṖx + ė
Jx* = 2xᵀP → Jxx* = 2P
The performance measure has the form:

J = h(x(tf), tf) + ∫ from t0 to tf of g(x(τ), u(τ), τ) dτ

The Hamiltonian is given by:


H(x, t) = g(x, u, t) + Jx∗ a(x, u, t)
H(x, t) = xT Qx + uT Ru + Jx∗ (Ax + Bu)
So:
∂H/∂u = 0

2uᵀR + Jx* B = 0
Finding u*:

u*ᵀ = −(1/2) Jx* B R⁻¹
u* = −(1/2) R⁻¹ Bᵀ Jx*ᵀ
u* = −(1/2) R⁻¹ Bᵀ (2xᵀP)ᵀ
u* = −R⁻¹ Bᵀ P x
The optimal control has the same form as that given in part a). We now show the second
part. Using equation (1) and replacing g(x, u, t), a(x, u, t), G(x, u, t), Jx* and Jxx*:

−Jt* = min_u { (xᵀQx + uᵀRu) + (2xᵀP)(Ax + Bu) + (1/2) trace[ (2P) G W Gᵀ ] }


Substituting u* into the last equation:

−Jt* = xᵀQx + (−R⁻¹BᵀPx)ᵀ R (−R⁻¹BᵀPx) + (2xᵀP)(Ax + B(−R⁻¹BᵀPx)) + trace(P G W Gᵀ)

−Jt* = xᵀQx + xᵀP B R⁻¹ Bᵀ P x + 2xᵀP A x − 2xᵀP B R⁻¹ Bᵀ P x + trace(P G W Gᵀ)

−Jt* = xᵀQx + 2xᵀP A x − xᵀP B R⁻¹ Bᵀ P x + trace(P G W Gᵀ)

−Jt* = xᵀ(Q + 2P A − P B R⁻¹ Bᵀ P)x + trace(P G W Gᵀ)

Since xᵀ(2P A)x = xᵀ(P A + AᵀP)x:

−Jt* = xᵀ(Q + P A + AᵀP − P B R⁻¹ Bᵀ P)x + trace(P G W Gᵀ)

Substituting Jt* = xᵀṖx + ė:

−(xᵀṖx + ė) = xᵀ(Q + P A + AᵀP − P B R⁻¹ Bᵀ P)x + trace(P G W Gᵀ)

Matching similar terms:

−Ṗ = Q + P A + AᵀP − P B R⁻¹ Bᵀ P
−ė = trace(P G W Gᵀ)


Answer b):
In order to show this section, we know that with tf sufficiently large:

J* = xᵀPss x + (tf − t) trace(Pss G W Gᵀ)

From this expression:

Jx* = 2xᵀPss
Jxx* = 2Pss
Jt* = −trace(Pss G W Gᵀ)
Replacing these in equation (1):

trace(Pss G W Gᵀ) = min_u { (xᵀQx + uᵀRu) + (2xᵀPss)(Ax + Bu) + (1/2) trace[ (2Pss) G W Gᵀ ] }
Replacing u*:

trace(Pss G W Gᵀ) = xᵀQx + (−R⁻¹BᵀPss x)ᵀ R (−R⁻¹BᵀPss x) + (2xᵀPss)(Ax + B(−R⁻¹BᵀPss x)) + trace(Pss G W Gᵀ)

trace(Pss G W Gᵀ) = xᵀQx + xᵀPss B R⁻¹ Bᵀ Pss x + 2xᵀPss A x − 2xᵀPss B R⁻¹ Bᵀ Pss x + trace(Pss G W Gᵀ)

trace(Pss G W Gᵀ) = xᵀQx + 2xᵀPss A x − xᵀPss B R⁻¹ Bᵀ Pss x + trace(Pss G W Gᵀ)

trace(Pss G W Gᵀ) = xᵀ(Q + 2Pss A − Pss B R⁻¹ Bᵀ Pss)x + trace(Pss G W Gᵀ)

trace(Pss G W Gᵀ) = xᵀ(Q + Pss A + AᵀPss − Pss B R⁻¹ Bᵀ Pss)x + trace(Pss G W Gᵀ)


From section a) we know that:

Q + Pss A + AT Pss − Pss BR−1 B T Pss = −Ṗ

We know that Ṗ = 0 when P is in steady state, so:

Q + Pss A + AT Pss − Pss BR−1 B T Pss = 0

So:
trace(Pss GW GT ) = trace(Pss GW GT )
This section has been shown.
Answer c):
To show that:

lim_{t→∞} E[xᵀQx + uᵀRu] = trace(Pss G W Gᵀ)

Replacing u* obtained in a):

lim_{t→∞} E[xᵀQx + (R⁻¹BᵀPx)ᵀ R (R⁻¹BᵀPx)] = lim_{t→∞} E[xᵀQx + xᵀP B R⁻¹ Bᵀ P x]

lim_{t→∞} E[xᵀ(Q + P B R⁻¹ Bᵀ P)x] = lim_{t→∞} E[ trace[(Q + P B R⁻¹ Bᵀ P) x xᵀ] ]

Since the trace and the expectation are both linear, they can be interchanged; with X(t) = E[x(t)xᵀ(t)]:

lim_{t→∞} trace[(Q + P B R⁻¹ Bᵀ P) X(t)] = trace[(Q + Pss B R⁻¹ Bᵀ Pss) Xss]

where Xss satisfies the steady-state Lyapunov equation of the closed loop A − B R⁻¹ Bᵀ Pss:

0 = (A − B R⁻¹ Bᵀ Pss) Xss + Xss (A − B R⁻¹ Bᵀ Pss)ᵀ + G W Gᵀ

Combining this with the steady-state Riccati equation gives:

trace[(Q + Pss B R⁻¹ Bᵀ Pss) Xss] = trace(Pss G W Gᵀ)
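This identity can be verified numerically. The sketch below (assuming SciPy; the matrices are borrowed from problem 3 below as a convenient test case) solves the control ARE for Pss, the closed-loop Lyapunov equation for Xss, and compares the two traces:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

# Test-case data (taken from problem 3 for convenience)
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
G = np.array([[0.0], [1.0]])
Q = np.diag([3.0, 3.0])
R = np.array([[1.0]])
W = np.array([[4.0]])

Pss = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ Pss)      # K = R^{-1} B' Pss
Acl = A - B @ K

# Steady-state covariance: Acl Xss + Xss Acl' + G W G' = 0
Xss = solve_continuous_lyapunov(Acl, -G @ W @ G.T)

lhs = np.trace((Q + K.T @ R @ K) @ Xss)
rhs = np.trace(Pss @ G @ W @ G.T)
print(lhs, rhs)   # the two traces coincide
```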

3. Given the plant dynamics:

ẋ1 (t) = x2 (t)


ẋ2 (t) = x1 (t) + u(t) + w(t)
y(t) = x2 (t) + v(t)

and cost function:


J = E[ (1/2) ∫ from 0 to tf of (3x₁²(t) + 3x₂²(t) + u²(t)) dt ]

where w(t) ∼ N (0, 4) and v(t) ∼ N (0, 0.5) are Gaussian, white noises and tf = 15.

(a) Numerically integrate the Riccati equations for LQR and the LQE to find the
time-varying regulator and estimator gains.


(b) The full stochastic linear optimal output feedback problem involves using u(t) =
−K x̂(t). For this control policy, the compensator is of the form:

ẋc = Ac xc(t) + Bc y(t)    (1)


u(t) = −Cc xc (t) (2)

For this example, write down the dynamic equations for the combined plant and
compensator system [x; xc], simulate the full closed-loop system in MATLAB using
the gains found in part (a) and the initial conditions x0 = [10; −10] and x̂0 = [0; 0].
(c) Show that the eigenvalues of the closed loop dynamics are equal to those of the
steady-state regulator and estimator. Compare the performance of the compensator
in part (b) to the performance you obtain when you use these steady-state values
of the regulator and estimator gains.
(d) Find the steady-state mean-squared values of the regulator and estimator. (x1 (t),
x2 (t), x̂1 (t) and x̂2 (t))

Answer a):
First we are going to identify the elements of the problem:

A = [0 1; 1 0]   Bu = [0; 1]   Bw = [0; 1]   Cy = [0 1]   Cz = [1 0; 0 1]

Rvv = 0.5   Rww = 4   Ruu = 1   Rxx = [3 0; 0 3]

The regulator and estimator gains are obtained from an LQG design, whose
equations are as follows:
Regulator:

−Ṗ(t) = AᵀP(t) + P(t)A + Rxx − P(t) Bu Ruu⁻¹ Buᵀ P(t)
K(t) = Ruu⁻¹ Buᵀ P(t)

Estimator:

Q̇(t) = A Q(t) + Q(t)Aᵀ + Bw Rww Bwᵀ − Q(t) Cyᵀ Rvv⁻¹ Cy Q(t)
L(t) = Q(t) Cyᵀ Rvv⁻¹

Using Simulink, we find the values of K and L as shown in the following figures.


Figure 1: LQR - K Gain.

Figure 2: LQE - L Gain.
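The regulator gain can also be reproduced without Simulink by integrating the Riccati equation backward from P(tf) = 0 (the cost has no terminal state term). A Python/SciPy sketch, using the substitution s = tf − t to integrate forward:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Problem data
A   = np.array([[0.0, 1.0], [1.0, 0.0]])
Bu  = np.array([[0.0], [1.0]])
Rxx = np.diag([3.0, 3.0])
Ruu = np.array([[1.0]])
tf  = 15.0

# -Pdot = A'P + P A + Rxx - P Bu Ruu^{-1} Bu' P, integrated backward from
# P(tf) = 0.  With s = tf - t this becomes a forward ODE in s.
def riccati_rhs(s, p):
    P = p.reshape(2, 2)
    dP = A.T @ P + P @ A + Rxx - P @ Bu @ np.linalg.solve(Ruu, Bu.T @ P)
    return dP.ravel()

sol = solve_ivp(riccati_rhs, (0.0, tf), np.zeros(4), rtol=1e-9, atol=1e-11)
P0 = sol.y[:, -1].reshape(2, 2)        # P at t = 0
K0 = np.linalg.solve(Ruu, Bu.T @ P0)   # regulator gain K(0)
print(K0)
```

The estimator covariance Q(t) integrates forward in time in the same way. Since tf = 15 is long compared with the closed-loop time constants, K(0) has already converged to the steady-state gain [3, 3].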

Answer b):
For the closed-loop system:
ẋc = Ac xc + Bc y
u = −Cc xc
With:
Ac = A − Bu K(t) − L(t)Cy
Bc = L(t)
Cc = K(t)


In the same way we have the original system:

ẋ = Ax + Bu u + Bw w
y = Cy x + v

Relating both systems:

ẋ = Ax − Bu Cc xc + Bw w
ẋc = Ac xc + Bc Cy x + Bc v

We can arrange the combined system as follows:

[ẋ; ẋc] = [A  −Bu Cc;  Bc Cy  Ac] [x; xc] + [Bw  0;  0  Bc] [w; v]

y = [Cy  0] [x; xc] + v

Using a Simulink simulation we can see the closed-loop performance:

Figure 3: Closed-Loop System Performance.

Answer c):
To test this section we need to find the steady state values of Pss and Qss . We know
that these values can be calculated in the following way:
AᵀPss + Pss A + Czᵀ Rzz Cz − Pss Bu Ruu⁻¹ Buᵀ Pss = 0
A Qss + Qss Aᵀ + Bw Rww Bwᵀ − Qss Cyᵀ Rvv⁻¹ Cy Qss = 0

With:

Kss = Ruu⁻¹ Buᵀ Pss   Lss = Qss Cyᵀ Rvv⁻¹


Using MATLAB to find the Pss and Qss matrices, with the condition that P and Q are
positive definite, the following values were found:

Pss = [6 3; 3 3]   Qss = [√3 1; 1 √3]

With these values, the following K and L values were found:

Kss = [3 3]   Lss = [2; 2√3]

With this we determine that the gain values found with Simulink and the steady-state
values are the same. Now the closed-loop system matrix and its eigenvalues are:

ACL = [0  1  0  0;  1  0  −3  −3;  0  2  0  −1;  0  2√3  −2  −(3 + 2√3)]

λ1 = −3.1463   λ2 = −2   λ3 = −1   λ4 = −0.3178
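These steady-state values and the closed-loop spectrum can be cross-checked with SciPy's ARE solver (a sketch, independent of the Simulink model):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A   = np.array([[0.0, 1.0], [1.0, 0.0]])
Bu  = np.array([[0.0], [1.0]])
Bw  = np.array([[0.0], [1.0]])
Cy  = np.array([[0.0, 1.0]])
Rxx = np.diag([3.0, 3.0]); Ruu = np.array([[1.0]])
Rww = np.array([[4.0]]);   Rvv = np.array([[0.5]])

# Control ARE -> Pss, Kss; filter ARE (the dual problem) -> Qss, Lss
Pss = solve_continuous_are(A, Bu, Rxx, Ruu)
Kss = np.linalg.solve(Ruu, Bu.T @ Pss)
Qss = solve_continuous_are(A.T, Cy.T, Bw @ Rww @ Bw.T, Rvv)
Lss = Qss @ Cy.T @ np.linalg.inv(Rvv)

# Combined plant + compensator matrix with the steady-state gains
Ac  = A - Bu @ Kss - Lss @ Cy
Acl = np.block([[A, -Bu @ Kss], [Lss @ Cy, Ac]])
print(Kss)                                   # [3, 3]
print(Lss.ravel())                           # [2, 2*sqrt(3)]
print(np.sort(np.linalg.eigvals(Acl).real))  # union of LQR and LQE spectra
```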

By the separation principle, ACL is similar to the block-triangular matrix:

ĀCL = [A − Bu K   Bu K;  0   A − L Cy]

So we can calculate the estimator and regulator eigenvalues separately:

LQR:

A − Bu K = [0 1; −2 −3]

With:

λ1 = −1   λ2 = −2

LQE:

A − L Cy = [0 −1; 1 −2√3]

With:

λ1 = −0.3178   λ2 = −3.1463

They have the same eigenvalues, therefore the compensator with steady-state gains gives the same closed-loop behavior.
Answer d):


In order to find the steady-state mean-squared values, we write the regulated system in the form:

ẋ(t) = [A − Bu K(t)]x(t) + Bw w(t)

The mean-squared value X(t) = E[x(t)xᵀ(t)] propagates as:

Ẋ(t) = [A − Bu K(t)]X(t) + X(t)[A − Bu K(t)]ᵀ + Bw Rww Bwᵀ

Now for the estimated state, whose dynamics are

d/dt x̂(t) = [A − L(t)Cy]x̂(t) + Bu u(t) + L(t)y(t)

the corresponding mean-squared value X̂(t) satisfies:

d/dt X̂(t) = [A − L(t)Cy]X̂(t) + X̂(t)[A − L(t)Cy]ᵀ + Bw Rww Bwᵀ + L(t) Rvv Lᵀ(t)

In steady state:

0 = [A − Bu K]Xss + Xss[A − Bu K]ᵀ + Bw Rww Bwᵀ
0 = [A − L Cy]X̂ss + X̂ss[A − L Cy]ᵀ + Bw Rww Bwᵀ + L Rvv Lᵀ

The mean-squared values of Xss and X̂ss are:

Xss = [1/3  0;  0  2/3]   X̂ss = [√3  1;  1  √3]
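A sketch of the two steady-state Lyapunov solves (assuming SciPy, and using the steady-state gains from part (c)):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A   = np.array([[0.0, 1.0], [1.0, 0.0]])
Bu  = np.array([[0.0], [1.0]])
Bw  = np.array([[0.0], [1.0]])
Cy  = np.array([[0.0, 1.0]])
K   = np.array([[3.0, 3.0]])                   # steady-state regulator gain
L   = np.array([[2.0], [2.0 * np.sqrt(3.0)]])  # steady-state estimator gain
Rww = np.array([[4.0]]); Rvv = np.array([[0.5]])

# solve_continuous_lyapunov(M, Q) solves M X + X M' = Q
Xss  = solve_continuous_lyapunov(A - Bu @ K, -Bw @ Rww @ Bw.T)
Xhss = solve_continuous_lyapunov(A - L @ Cy,
                                 -(Bw @ Rww @ Bw.T + L @ Rvv @ L.T))
print(Xss)    # diag(1/3, 2/3)
print(Xhss)   # [[sqrt(3), 1], [1, sqrt(3)]]
```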
4. Given a system with a state representation:
ẋ(t) = Ax(t) + Bu u(t) (1)
y(t) = Cy x(t) (2)
We have investigated various types of observer-based controllers that can be written
as:
ż(t) = Az(t) + Bu u(t) + Hr(t)    (3)
u(t) = −Gz(t)    (4)
Where r(t) is the residual signal [r(t) = y(t) − Cy z(t)] and the state vectors x and
z have the same size. Assume that we have selected H and G so that the closed-loop
system is stable. We can modify the original compensator in ż(t) to obtain a very
general form of the compensator:
ż(t) = Az(t) + Bu u(t) + Hr(t) original comp dynamics (5)
ẇ(t) = F w(t) + Qr(t) other stable dynamics (6)
u(t) = −Gz(t) + P w(t) (7)
so that the residual is filtered by another set of stable dynamics and then added to
the control signal. The dimension (k) of w is arbitrary, but the eigenvalues of F are
assumed to have negative real parts.


(a) Show that a ”separation” type principle still holds for the closed loop eigenvalues.
(b) Show that under the assumptions given, this new compensator still leads to a
stable closed loop system.

Answer a):
We can write the closed-loop equations as follows:

ẋ = Ax − Bu Gz + Bu P w
ż = HCy x + (A − Bu G − HCy)z + Bu P w
ẇ = QCy x − QCy z + F w

In state-space:

[ẋ; ż; ẇ] = [A  −Bu G  Bu P;  HCy  A − Bu G − HCy  Bu P;  QCy  −QCy  F] [x; z; w]

With the help of MATLAB we find that the eigenvalues of this matrix are exactly those of F, A − Bu G, and A − HCy.

To see why, change coordinates to (x, e, w), with the error e = x − z. The closed-loop equations become block-triangular:

ė = (A − HCy)e
ẇ = F w + QCy e
ẋ = (A − Bu G)x + Bu G e + Bu P w

Since the spectrum of a block-triangular matrix is the union of the spectra of its diagonal blocks:

λ(ACL) = λ(A − Bu G) ∪ λ(A − HCy) ∪ λ(F)


This shows that the eigenvalues of ACL are exactly the eigenvalues of F, A − Bu G, and A − HCy, so a "separation"-type principle still holds.
Answer b):
The problem states that H and G were selected so that the original closed-loop system is stable, i.e., A − Bu G and A − HCy have eigenvalues with negative real parts. The eigenvalues of F are also assumed to have negative real parts. By part (a), every eigenvalue of ACL then has a negative real part, so the new compensator still leads to a stable closed-loop system.
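The separation result can be illustrated numerically. The matrices below are hypothetical examples (G and H stabilizing, F stable, and the Q and P gains arbitrary, renamed Qm and Pm in code to avoid clashing with the Riccati symbols); the check is that the closed-loop spectrum equals the union of the three separate spectra:

```python
import numpy as np

# Hypothetical example data, chosen only for illustration
A  = np.array([[0.0, 1.0], [1.0, 0.0]])
Bu = np.array([[0.0], [1.0]])
Cy = np.array([[0.0, 1.0]])
G  = np.array([[3.0, 3.0]])                 # stabilizes A - Bu G
H  = np.array([[2.0], [2.0 * np.sqrt(3.0)]])  # stabilizes A - H Cy
F  = np.diag([-1.5, -4.0])                  # extra stable dynamics
Qm = np.array([[1.0], [2.0]])               # residual-to-w gain (Q)
Pm = np.array([[0.5, 0.3]])                 # w-to-control gain (P)

Acl = np.block([
    [A,       -Bu @ G,              Bu @ Pm],
    [H @ Cy,   A - Bu @ G - H @ Cy, Bu @ Pm],
    [Qm @ Cy, -Qm @ Cy,             F],
])

eig_cl  = np.sort(np.linalg.eigvals(Acl).real)
eig_sep = np.sort(np.concatenate([
    np.linalg.eigvals(A - Bu @ G),
    np.linalg.eigvals(A - H @ Cy),
    np.linalg.eigvals(F),
]).real)
print(eig_cl)
print(eig_sep)   # the two lists coincide
```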

5. For the following system, design an H∞ controller that minimizes the cost function:

∥ [ Ws S ;  Wu Gc S ] ∥∞

Where:

Ws(s) = (s/M + ωB) / (s + ωB A)
with M = 2, ωB = 5, and A = 0.01. As discussed in class, the weight Wu should be
adjusted so that we meet the sensitivity specification.
You should then design an H2-optimal (LQG) controller that yields a similar settling
time of the step response and directly compare the graphs listed below for the H2 and
H∞ controllers.
One way to do this for a SISO system is to set Bu = Bw, C2 = Cy, choose Rww = 0.1 and
Rzz = 1. Then set Ruu = Rvv = ρ and use ρ << 1 as the basic design parameter. You
can then iterate on the choice of Rww and Rzz as needed.
Use this procedure to design a controller (follow the code given in the notes) for the
following system (ω1 = 1, ω2 = 5, ω3 = 10).
G1(s) = 1 / [(−s/ω1 + 1)(s/ω2 + 1)(s/ω3 + 1)]
For each case, draw the following plots:

- Sensitivity and inverse of the Ws weight, to show that you meet the specification.
- The Nyquist/Nichols plot, to show that you have the correct number of encirclements.
- A Bode plot that contains Gi(s), Gc(s) and L = Gi(s)Gc(s). Clearly label each curve and identify the gain/phase crossover points.
- The root locus obtained by freezing the controller dynamics and then using a controller αGc(s), where α ∈ [0, 2] (normally α = 1), to investigate the robustness of your design to gain errors.
- The step response of the closed-loop system.

The reason for drawing these plots is to develop some insight into what a typical
H∞ controller "looks like". So, use your results to comment on how the H∞ controller
modifies the plant dynamics to achieve the desired performance.


Answer:
First, we identify the information:

M =2 ωB = 5 A = 0.01
ω1 = 1 ω2 = 5 ω3 = 10

So:

Ws(s) = (s/2 + 5) / (s + 0.05)

G1(s) = 1 / [(−s + 1)(s/5 + 1)(s/10 + 1)]
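Before designing the controller, it is worth confirming the sensitivity bound implied by Ws: 1/|Ws(jω)| rises from A = 0.01 at low frequency to M = 2 at high frequency. A small Python sketch:

```python
import numpy as np

M, wB, Aw = 2.0, 5.0, 0.01   # Aw is the weight constant A from the problem

def Ws(s):
    # Ws(s) = (s/M + wB) / (s + wB * Aw)
    return (s / M + wB) / (s + wB * Aw)

# 1/|Ws| is the sensitivity bound: ~Aw at low frequency, ~M at high frequency
w = np.array([1e-6, 1e6])
print(1.0 / np.abs(Ws(1j * w)))   # approximately [0.01, 2.0]
```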

To meet the specification, we must make sure that:

γmin < 1

We start with Wu = 1, but this gives γmin > 1, so we decrease Wu until we reach
Wu = 1 × 10⁻⁹. With this:

γmin = 0.5012

The plot of the sensitivity and of the inverse of Ws shows that the specification is met:

Figure 4: Visualization of the weighted sensitivity tests.

Now we are going to see the Nyquist plot:


Figure 5: G(s) Nyquist Plot.

Figure 6: G(s)Gc (s) Nyquist Plot.


The plant G(s) is open-loop unstable (it has one right-half-plane pole), but the loop G(s)Gc(s) encircles the critical point once, as the Nyquist criterion requires. This means that the closed-loop system with the control law Gc(s) is stable.
Now using Bode Diagram:

Figure 7: G(s), Gc (s), G(s)Gc (s) Bode Diagram.

The gain margin and phase margin of G(s) are:

GM of G(s) = 0.0697
PM of G(s) = −90°

With this, the system G(s) is unstable.

Now, the gain margin and phase margin of G(s)Gc(s) are:

GM of G(s)Gc(s) = 0.0796
PM of G(s)Gc(s) = 81.2832°

With this, the controller makes the closed-loop system stable and robust.
Now checking the root locus of the controller dynamics:


Figure 8: Gc (s) Root Locus Diagram.

Figure 9: αGc (s), (α ∈ {0, 2}) Root Locus Diagram.


Figure 10: αGc (s), (α = 1) Root Locus Diagram.

Now, we are going to see the step response.

Figure 11: Closed-Loop Step Response.
