Trajectory Generation


2013 American Control Conference (ACC)

Washington, DC, USA, June 17-19, 2013

Optimal Trajectory Generation for Nonlinear Systems Based on Double Generating Functions

Zhiwei Hao1, Kenji Fujimoto2 and Yoshikazu Hayakawa3

1 Z. Hao is with the Department of Mechanical Science and Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464-8603, Japan hao@haya.nuem.nagoya-u.ac.jp
2 K. Fujimoto is with the Department of Aerospace Engineering, Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501, Japan k.fujimoto@ieee.org
3 Y. Hayakawa is with the Department of Mechanical Science and Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, Aichi 464-8603, Japan hayakawa@nuem.nagoya-u.ac.jp

978-1-4799-0178-4/$31.00 ©2013 AACC

Abstract: A method based on the generating function for finite time optimal control problems was proposed recently. A single generating function can generate a family of optimal inputs which are functions of the state for different boundary conditions. Therefore, a family of optimal trajectories for different boundary conditions can be obtained by numerical integration along the system dynamic equation. This paper proposes a method to compute a family of optimal trajectories for a nonlinear optimal control problem on a finite time interval by using a pair of generating functions. The proposed method removes the online numerical integration required in the method using a single generating function and thus reduces the online computational effort. It is useful for online nonlinear trajectory generation problems such as model predictive control. A numerical example illustrates the effectiveness of the proposed method.

I. INTRODUCTION

The optimal control problem is a classical mathematical problem. Since it is widely used in engineering, many important results have been obtained in the past decades. The Hamilton-Jacobi equation (HJE) plays a fundamental role in the analysis and control of nonlinear systems. For an infinite time nonlinear optimal control problem, the solution to the Hamilton-Jacobi-Bellman equation (HJBE) can be approximated by the Taylor series in a neighborhood of the origin [1]. The Taylor series converges to the true solution under suitable conditions [2]. An approximate stabilizing solution of the HJE can be obtained by using symplectic geometry and the Hamiltonian perturbation technique for feedback optimal control problems [3].

A finite time optimal control problem reduces to a two-point boundary-value problem (TPBVP) for ordinary differential equations (ODEs) with respect to a Hamiltonian system [4]. There are many methods to solve it, e.g. the Riccati method [5], the shooting method [6] and its extension [7]. The basic principle of the shooting method is computing the trajectories repeatedly until the one satisfying the boundary values is obtained. Therefore, if we change the desired boundary values, we need to solve the TPBVP again using the shooting method. As with the shooting method, we have to solve the nonlinear optimal control problems for a set of initial and terminal state values separately when we use those methods. Therefore, calculating optimal trajectories for a set of different boundary conditions imposes a heavy computational burden, particularly for an online optimal trajectory generation problem.

For a nonlinear finite time optimal control problem, the solution of the HJBE is given as a series by solving a sequence of ordinary differential equations (ODEs) when the terminal value of the state is zero [8]. Recently, a method to generate optimal trajectories based on the generating functions, which are the solutions of HJEs, was proposed for nonlinear finite time optimal control problems with the fixed boundary condition [9], [10]. The generating function for the HJE can be found by taking the infimum of the optimal cost function, and it can be effectively solved by the Galerkin spectral method with Chebyshev polynomials [11]. A recursive algorithm based on the result of [9] was proposed to solve HJEs for the generating functions for a nonlinear system [12]. Once a generating function is found, it gives a family of optimal inputs for different boundary conditions by the canonical transformation. Then a family of optimal trajectories can be obtained by numerical integration along the system dynamic equation, whereas the conventional method for an optimal control problem obtains only one trajectory each time the HJE for a value function is solved.

The authors proposed the double generating functions method for a linear finite time optimal control problem [13], which uses a pair of generating functions to compute the optimal trajectory. In this paper, this method is extended to the nonlinear case. As in the linear case, it can calculate a family of optimal trajectories for different boundary conditions. It gives the optimal trajectories and inputs as functions of the boundary values of the state and the initial and terminal time. In addition, the generating function can be calculated off-line. Therefore, the optimal trajectories and inputs can be obtained directly without the numerical integration along the system equation required in the single generating function case. As a result, the proposed method reduces the online computational effort for different boundary conditions and different time intervals compared to the conventional methods as well as the single generating function method.

This paper is organized as follows. Section 2 introduces the preliminaries of nonlinear optimal control problems and the generating function method. Section 3 discusses how to generate optimal trajectories for nonlinear optimal control problems for different boundary conditions by double generating functions. This section is the main part of this paper. A numerical example illustrates the usefulness and
effectiveness of the proposed method in Section 4. Section 5 concludes the paper.

II. PRELIMINARIES

This section introduces some preliminaries of nonlinear optimal control problems and generating functions.

A. Nonlinear optimal control problem

This subsection states the nonlinear optimal control problem to be solved and shows how it is reduced to a TPBVP for ODEs as in [14]. Consider a nonlinear system

$$\dot{x} = f(x) + g(x)u, \tag{1}$$

with the given boundary condition

$$x(t_0) = x_0, \quad x(t_f) = x_{t_f}, \tag{2}$$

where $x(t) \in \Omega \subset \mathbb{R}^n$ is the state, $\Omega$ is a neighborhood of the origin in $\mathbb{R}^n$, $f : \Omega \to \mathbb{R}^n$, $g : \Omega \to \mathbb{R}^{n \times m}$, $u \in \mathbb{R}^m$ is the input, and the symbols $x_0 \in \Omega$ and $x_{t_f} \in \Omega$ denote the initial value and the terminal value of the state respectively. Define a cost function

$$J(u, x_0) = \int_{t_0}^{t_f} \left( q(x) + \frac{1}{2} u^T r(x) u \right) dt, \tag{3}$$

where $q(x) \geq 0$ ($\forall x$) is a continuous real matrix function of dimension $n \times n$ on $\Omega$, which satisfies conditions $q(0) = 0$ and $\partial q(x)/\partial x|_{x=0} = 0$, and $r(x) > 0$ ($\forall x$) is a constant matrix of dimension $m \times m$. The objective is to find an optimal control input $u^*$ minimizing the cost function $J$ as

$$u^* = \arg\min_u J(u, x_0) \tag{4}$$

subject to the boundary condition in (2). Let us introduce a column vector $\lambda$ to represent the costate. According to Pontryagin's minimum principle, the necessary conditions for minimization of the performance index (3) are

$$\dot{x} = H_\lambda(x, \lambda), \quad \dot{\lambda} = -H_x(x, \lambda), \tag{5}$$

$$u^* = -r(x)^{-1} g(x)^T \lambda, \tag{6}$$

where $H_{(\cdot)}$ denotes the partial derivative $\partial H/\partial(\cdot)$, and $H$ is the Hamiltonian function defined as follows:

$$H(x, \lambda) = q(x) + \lambda^T f(x) - \frac{1}{2} \lambda^T g(x) r(x)^{-1} g(x)^T \lambda, \tag{7}$$

and the boundary value is given as in (2).

Therefore, the optimal control problem (1)-(3) is equivalent to the TPBVP for ODEs (5). Since it is difficult to solve the TPBVP, we will use the generating function technique to obtain an approximation of its solution.

B. The generating functions [9]

This subsection explains that a TPBVP for an optimal control problem is characterized by the HJEs for the generating functions. Given the Hamiltonian system (5), we can treat the state-costate $(x(t), \lambda(t))$ as a canonical transformation from the initial state-costate $(x_0, \lambda_0)$ to the current state-costate $(x, \lambda)$. Thus, there exist forward generating functions $S_{2f}(x, \lambda_0, t)$ and $S_{3f}(\lambda, x_0, t)$ for the following canonical transformations:

$$\lambda = \frac{\partial S_{2f}(x, \lambda_0, t)}{\partial x}, \quad x_0 = \frac{\partial S_{2f}(x, \lambda_0, t)}{\partial \lambda_0}, \tag{8}$$

$$x = -\frac{\partial S_{3f}(\lambda, x_0, t)}{\partial \lambda}, \quad \lambda_0 = -\frac{\partial S_{3f}(\lambda, x_0, t)}{\partial x_0}. \tag{9}$$

Moreover, the generating functions satisfy their own corresponding HJEs as follows:

$$\frac{\partial S_{2f}(x, \lambda_0, t)}{\partial t} + H\!\left(x, \frac{\partial S_{2f}}{\partial x}\right) = 0, \tag{10}$$

$$\frac{\partial S_{3f}(\lambda, x_0, t)}{\partial t} + H\!\left(-\frac{\partial S_{3f}}{\partial \lambda}, \lambda\right) = 0. \tag{11}$$

Similarly, we can also treat the state-costate $(x(t), \lambda(t))$ as a canonical transformation from the terminal state-costate $(x_{t_f}, \lambda_{t_f})$ to the current state-costate $(x, \lambda)$. The associated canonical transformations for the backward generating functions $S_{2b}(x, \lambda_{t_f}, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ and their own HJEs are as follows:

$$\lambda = \frac{\partial S_{2b}(x, \lambda_{t_f}, t)}{\partial x}, \quad x_{t_f} = \frac{\partial S_{2b}(x, \lambda_{t_f}, t)}{\partial \lambda_{t_f}}, \tag{12}$$

$$x = -\frac{\partial S_{3b}(\lambda, x_{t_f}, t)}{\partial \lambda}, \quad \lambda_{t_f} = -\frac{\partial S_{3b}(\lambda, x_{t_f}, t)}{\partial x_{t_f}}, \tag{13}$$

$$\frac{\partial S_{2b}(x, \lambda_{t_f}, t)}{\partial t} + H\!\left(x, \frac{\partial S_{2b}}{\partial x}\right) = 0, \tag{14}$$

$$\frac{\partial S_{3b}(\lambda, x_{t_f}, t)}{\partial t} + H\!\left(-\frac{\partial S_{3b}}{\partial \lambda}, \lambda\right) = 0. \tag{15}$$

Due to (8)-(9) and (12)-(13), the initial values of $S_{2f}(x, \lambda_0, t)$ and $S_{3f}(\lambda, x_0, t)$ and the terminal values of $S_{2b}(x, \lambda_{t_f}, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ are as follows respectively.

$$S_{2f}(x, \lambda_0, t_0) = \lambda_0^T x_0, \quad S_{2b}(x, \lambda_{t_f}, t_f) = \lambda_{t_f}^T x_{t_f}, \tag{16}$$

$$S_{3f}(\lambda, x_0, t_0) = -\lambda_0^T x_0, \quad S_{3b}(\lambda, x_{t_f}, t_f) = -\lambda_{t_f}^T x_{t_f}. \tag{17}$$

Although there exist other generating functions, for example $S_{1f}(x, x_0, t_0, t)$ and $S_{4f}(\lambda, \lambda_0, t_0, t)$, they are not well defined at the initial time or the terminal time [9].

III. OPTIMAL TRAJECTORIES GENERATED BY DOUBLE GENERATING FUNCTIONS

This section provides the main result of the paper. A method based on a pair of generating functions to generate optimal trajectories for nonlinear optimal control problems is proposed. It can generate a family of optimal trajectories for different boundary conditions. If we use a single generating function, numerical integration along the system dynamics is needed to obtain the optimal trajectories. However, the method using double generating functions gives the optimal trajectories and inputs as functions of the boundary values (of the state and/or costate), the initial time and the terminal time. Therefore, we do not need the numerical integration of the system dynamic equation required in the single generating function case.
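To make the reduction of Section II concrete, the following is a minimal sketch of the canonical equations (5) and the input law (6) for a scalar instance. It assumes the system and running cost of the paper's numerical example, $\dot{x} = 10x + x^2 + 0.1x^3 + u$ with running cost $(x^2 + u^2)/2$, so that $g = 1$, $r = 1$ and $H(x, \lambda) = x^2/2 + \lambda f(x) - \lambda^2/2$. The function names, the fixed-step RK4 integrator, the short horizon of 0.2 and the guessed initial costate `lam0` are illustrative assumptions, not part of the paper. The sketch only integrates (5) forward from a guess and checks that $H$ stays constant along the flow; it does not solve the TPBVP.

```python
def f(x):
    """Drift term of the example system: f(x) = 10x + x^2 + 0.1x^3 (g = 1)."""
    return 10.0 * x + x * x + 0.1 * x * x * x


def df(x):
    """Derivative f'(x), needed for H_x."""
    return 10.0 + 2.0 * x + 0.3 * x * x


def hamiltonian(x, lam):
    """H(x, lam) for running cost (x^2 + u^2)/2 after substituting u* = -lam."""
    return 0.5 * x * x + lam * f(x) - 0.5 * lam * lam


def canonical_rhs(x, lam):
    """Right-hand side of (5): (H_lam, -H_x)."""
    return f(x) - lam, -(x + lam * df(x))


def flow(x, lam, t_end, steps):
    """Classical fixed-step RK4 integration of (5) over [0, t_end]."""
    h = t_end / steps
    for _ in range(steps):
        k1x, k1l = canonical_rhs(x, lam)
        k2x, k2l = canonical_rhs(x + 0.5 * h * k1x, lam + 0.5 * h * k1l)
        k3x, k3l = canonical_rhs(x + 0.5 * h * k2x, lam + 0.5 * h * k2l)
        k4x, k4l = canonical_rhs(x + h * k3x, lam + h * k3l)
        x += h * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        lam += h * (k1l + 2.0 * k2l + 2.0 * k3l + k4l) / 6.0
    return x, lam


x0, lam0 = -0.4, -0.05      # lam0 is an arbitrary guess, not the optimal costate
x1, lam1 = flow(x0, lam0, 0.2, 2000)
u1 = -lam1                  # input law (6) with r = 1, g = 1
drift = abs(hamiltonian(x1, lam1) - hamiltonian(x0, lam0))
print(x1, lam1, u1, drift)
```

Reaching a prescribed terminal state requires adjusting `lam0` and integrating again for every new pair of boundary values; this repeated online work is what the generating-function approach aims to move off-line.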
A. Main result

This subsection gives the optimal trajectory and input generated by the pair $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ for a nonlinear optimal control problem. They are functions of the boundary condition of the state.

Theorem 1 Suppose that the functions $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$, $t \in \mathbb{R}$, are the solutions to (11) and (15) with boundary conditions in (17) respectively, and suppose that $\lambda = \lambda^*(t, x_0, x_{t_f})$ is the solution of

$$\frac{\partial \left( S_{3f}(\lambda, x_0, t) - S_{3b}(\lambda, x_{t_f}, t) \right)}{\partial \lambda} = 0. \tag{18}$$

Then for the nonlinear optimal control problem (1)-(3) with $t_0$ and $t_f$ satisfying $t_0 < t_f$, the optimal trajectory $x^*(t, x_0, x_{t_f})$ and input $u^*(t, x_0, x_{t_f})$, $t \in [t_0, t_f]$, can be obtained by $x^*(t, x_0, x_{t_f}) = -\partial S_{3f}(\lambda^*, x_0, t)/\partial \lambda^*$ (or $x^*(t, x_0, x_{t_f}) = -\partial S_{3b}(\lambda^*, x_{t_f}, t)/\partial \lambda^*$) and $u^*(t, x_0, x_{t_f}) = -r^{-1}(x^*) g(x^*)^T \lambda^*(t, x_0, x_{t_f})$ respectively, such that $x^*(t, x_0, x_{t_f})$ and $u^*(t, x_0, x_{t_f})$ minimize the cost function in (3). Here $x_0 = x(t_0)$ and $x_{t_f} = x(t_f)$ are the given initial and terminal values of the state respectively.

Proof: For the nonlinear optimal control problem (1)-(3), the optimal trajectory as a function of the costate can be calculated by the first equation in (9) or the first equation in (13). Eliminating $x$ from these two equations gives (18), and $\lambda^*(t, x_0, x_{t_f})$ can then be obtained by solving (18). Therefore, the optimal trajectory $x^*(t, x_0, x_{t_f})$ ($t \in [t_0, t_f]$) can be obtained by the canonical transformation $x^*(t, x_0, x_{t_f}) = -\partial S_{3f}(\lambda^*, x_0, t)/\partial \lambda^*$ (or $x^*(t, x_0, x_{t_f}) = -\partial S_{3b}(\lambda^*, x_{t_f}, t)/\partial \lambda^*$) and the optimal input $u^*(t, x_0, x_{t_f})$ ($t \in [t_0, t_f]$) can be obtained by using (6). This completes the proof.

Since the solution to (18) is $\lambda^*(t, x_0, x_{t_f}) = 0$ for $x_0 = x_{t_f} = 0$ and $t = 0$, the nonsingularity of the Hessian matrix of $S_{3f}(\lambda, x_0, t) - S_{3b}(\lambda, x_{t_f}, t)$ with respect to $\lambda$ ensures the existence of $\lambda^*(t, x_0, x_{t_f})$ in a neighborhood of the origin. Therefore, the sufficient condition for the existence of the solution to (18) used in Theorem 1 is that the Hessian matrix of $S_{3f}(\lambda, x_0, t) - S_{3b}(\lambda, x_{t_f}, t)$ with respect to $\lambda$ is nonsingular. Theorem 1 tells us that the optimal trajectory and input for a nonlinear optimal control problem can be calculated directly from the two generating functions. However, it does not say how to calculate them. The following theorem gives the optimal trajectory and input for a nonlinear optimal control problem based on the Taylor series.

Theorem 2 Suppose that the matrix functions $Z_{3f}(t)$, $Y_{3f}(t)$, $Z_{3b}(t)$, and $Y_{3b}(t) \in \mathbb{R}^{n \times n}$, $t \in [0, T]$, satisfy

$$\dot{Z}_{3f}(t) = Z_{3f}^T(t) A^T + A Z_{3f}(t) - Z_{3f}^T(t) Q Z_{3f}(t) + B R B^T, \tag{19}$$

$$\dot{Y}_{3f}(t) = (A - Z_{3f}^T(t) Q) Y_{3f}(t), \tag{20}$$

$$\dot{Z}_{3b}(t) = -Z_{3b}^T(t) A^T - A Z_{3b}(t) + Z_{3b}^T(t) Q Z_{3b}(t) - B R B^T, \tag{21}$$

$$\dot{Y}_{3b}(t) = (-A + Z_{3b}^T(t) Q) Y_{3b}(t), \tag{22}$$

with the initial values

$$Z_{3f}(0) = 0, \quad Y_{3f}(0) = -I, \tag{23}$$

$$Z_{3b}(0) = 0, \quad Y_{3b}(0) = -I, \tag{24}$$

respectively, where $I$ denotes the identity matrix of dimension $n \times n$ and

$$A = \frac{df}{dx}(0), \quad B = g(0), \quad Q = \left. \frac{\partial^2 q(x)}{\partial x^2} \right|_{x=0}, \quad R = r(0)^{-1}. \tag{25}$$

Then, for the nonlinear optimal control problem (1)-(3) with $t_0$ and $t_f$ satisfying $t_0 < t_f$ and $t_f - t_0 \leq T$, the optimal trajectory $x^*(t, x_0, x_{t_f}, t_0, t_f)$ ($t \in [t_0, t_f]$) and input $u^*(t, x_0, x_{t_f}, t_0, t_f)$ ($t \in [t_0, t_f]$) are given as follows.

$$\begin{bmatrix} x^*(t, x_0, x_{t_f}, t_0, t_f) \\ u^*(t, x_0, x_{t_f}, t_0, t_f) \end{bmatrix} = \begin{bmatrix} -Y_{3f}(t - t_0) x_0 - Z_{3f}(t - t_0) \left( Z_{3b}(t_f - t) - Z_{3f}(t - t_0) \right)^{-1} \left[ Y_{3f}(t - t_0) x_0 - Y_{3b}(t_f - t) x_{t_f} + K_{3f3bx}(t, x_0, x_{t_f}, t_0, t_f) \right] \\ -r(x^*)^{-1} g(x^*)^T \left( Z_{3b}(t_f - t) - Z_{3f}(t - t_0) \right)^{-1} \left[ Y_{3f}(t - t_0) x_0 - Y_{3b}(t_f - t) x_{t_f} + K_{3f3bu}(t, x_0, x_{t_f}, t_0, t_f) \right] \end{bmatrix}. \tag{26}$$

Here $K_{3f3bx}(t, x_0, x_{t_f}, t_0, t_f)$ and $K_{3f3bu}(t, x_0, x_{t_f}, t_0, t_f)$ are functions of $t$, $x_0$, $x_{t_f}$, $t_0$, and $t_f$ whose terms are of order higher than one with respect to $x_0$ and $x_{t_f}$.

Proof: The Taylor series of the generating functions $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ can be obtained by solving HJEs (11) and (15) as follows respectively [12], [9].

$$S_{3f}(\lambda, x_0, t) = \frac{1}{2} \lambda^T Z_{3f}(t) \lambda + \lambda^T Y_{3f}(t) x_0 + \frac{1}{2} x_0^T W_{3f}(t) x_0 + S_{3f}^{[3]}(\lambda, x_0, t), \tag{27}$$

$$S_{3b}(\lambda, x_{t_f}, t) = \frac{1}{2} \lambda^T Z_{3b}(t) \lambda + \lambda^T Y_{3b}(t) x_{t_f} + \frac{1}{2} x_{t_f}^T W_{3b}(t) x_{t_f} + S_{3b}^{[3]}(\lambda, x_{t_f}, t), \tag{28}$$

where $Z_{3f}(t)$, $Y_{3f}(t)$, $W_{3f}(t)$, $Z_{3b}(t)$, $Y_{3b}(t)$, and $W_{3b}(t) \in \mathbb{R}^{n \times n}$ are coefficient matrices, $S_{3f}^{[3]}(\lambda, x_0, t)$ is a function of $\lambda$, $x_0$, and $t$ whose terms are of order higher than two with respect to $\lambda$ and $x_0$, and $S_{3b}^{[3]}(\lambda, x_{t_f}, t)$ is a function of $\lambda$, $x_{t_f}$, and $t$ whose terms are of order higher than two with respect to $\lambda$ and $x_{t_f}$.

Substituting (27) and (28) into HJEs (11) and (15) respectively, and comparing the coefficient matrices, we obtain (19), (20), and

$$\dot{W}_{3f}(t) = -Y_{3f}^T(t) Q Y_{3f}(t), \tag{29}$$

$$\dot{Z}_{3b}(t) = Z_{3b}^T(t) A^T + A Z_{3b}(t) - Z_{3b}^T(t) Q Z_{3b}(t) + B R B^T, \tag{30}$$

$$\dot{Y}_{3b}(t) = (A - Z_{3b}^T(t) Q) Y_{3b}(t), \tag{31}$$

$$\dot{W}_{3b}(t) = -Y_{3b}^T(t) Q Y_{3b}(t). \tag{32}$$

Here, (19) and (30) are differential matrix Riccati equations. Let $\bar{t} = t - t_0$ for (27) and rewrite $\bar{t}$ as $t$, then (23) can be derived from the first equation in (17). Similarly, let $\hat{t} = t_f - t$ for (28) and rewrite $\hat{t}$ as $t$, then (30) and (31) become (21) and (22) respectively, and (24) can be derived from the second
equation in (17). Substituting (27) and (28) into the canonical transformations (9) and (13) respectively, we obtain the following expressions of $x(t)$.

$$x(t, t_0, \lambda, x_0) = -\frac{\partial S_{3f}(\lambda, x_0, t)}{\partial \lambda} = -Z_{3f}(t - t_0) \lambda(t) - Y_{3f}(t - t_0) x_0 + \text{h.o.t.}, \tag{33}$$

$$x(t, t_f, \lambda, x_{t_f}) = -\frac{\partial S_{3b}(\lambda, x_{t_f}, t)}{\partial \lambda} = -Z_{3b}(t_f - t) \lambda(t) - Y_{3b}(t_f - t) x_{t_f} + \text{h.o.t.} \tag{34}$$

Solving (33) and (34) for $\lambda(t, x_0, x_{t_f}, t_0, t_f)$ by eliminating $x(t)$, we obtain the expressions of the optimal state and input as in (26) using (33) (or (34)) and $u^*(t, x_0, x_{t_f}, t_0, t_f) = -r^{-1}(x^*) g(x^*)^T \lambda(t, x_0, x_{t_f}, t_0, t_f)$ in (6) respectively. This completes the proof.

Theorem 2 gives the optimal trajectory and input as functions of the boundary values of the state and the initial and terminal time. This makes it convenient to compute the optimal trajectories and inputs for different boundary conditions and different time intervals directly. No numerical integration is needed online, since the generating functions can be calculated off-line.

B. Extension

In fact, any pair of generating functions among $S_{2f}(x, \lambda_0, t)$, $S_{2b}(x, \lambda_{t_f}, t)$, $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ can give the optimal trajectories for a nonlinear optimal control problem. Therefore, the other pairs of generating functions, e.g., the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3f}(\lambda, x_0, t)$, can also generate the optimal trajectories and inputs for nonlinear optimal control problems. This subsection discusses the optimal trajectories generated by the other pairs of generating functions.

Similarly to $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$, the Taylor series of $S_{2f}(x, \lambda_0, t)$ and $S_{2b}(x, \lambda_{t_f}, t)$ can be obtained by solving HJEs (10) and (14) as follows respectively.

$$S_{2f}(x, \lambda_0, t) = \frac{1}{2} x^T Z_{2f}(t) x + x^T Y_{2f}(t) \lambda_0 + \frac{1}{2} \lambda_0^T W_{2f}(t) \lambda_0 + S_{2f}^{[3]}(x, \lambda_0, t), \tag{35}$$

$$S_{2b}(x, \lambda_{t_f}, t) = \frac{1}{2} x^T Z_{2b}(t) x + x^T Y_{2b}(t) \lambda_{t_f} + \frac{1}{2} \lambda_{t_f}^T W_{2b}(t) \lambda_{t_f} + S_{2b}^{[3]}(x, \lambda_{t_f}, t), \tag{36}$$

where $Z_{2f}(t)$, $Y_{2f}(t)$, $W_{2f}(t)$, $Z_{2b}(t)$, $Y_{2b}(t)$, and $W_{2b}(t) \in \mathbb{R}^{n \times n}$ are coefficient matrices, $S_{2f}^{[3]}(x, \lambda_0, t)$ is a function of $x$, $\lambda_0$, and $t$ whose terms are of order higher than two with respect to $x$ and $\lambda_0$, and $S_{2b}^{[3]}(x, \lambda_{t_f}, t)$ is a function of $x$, $\lambda_{t_f}$, and $t$ whose terms are of order higher than two with respect to $x$ and $\lambda_{t_f}$. Let $\bar{t} = t - t_0$ for (35) and rewrite $\bar{t}$ as $t$; similarly, let $\hat{t} = t_f - t$ for (36) and rewrite $\hat{t}$ as $t$. Then, substituting (35) and (36) into HJEs (10) and (14) respectively and comparing the coefficient matrices, the following equations hold.

$$\dot{Z}_{2f}(t) = -Z_{2f}^T(t) A - A^T Z_{2f}(t) + Z_{2f}^T(t) B R B^T Z_{2f}(t) - Q, \tag{37}$$

$$\dot{Y}_{2f}(t) = (-A^T + Z_{2f}^T(t) B R B^T) Y_{2f}(t), \tag{38}$$

$$\dot{Z}_{2b}(t) = Z_{2b}^T(t) A + A^T Z_{2b}(t) - Z_{2b}^T(t) B R B^T Z_{2b}(t) + Q, \tag{39}$$

$$\dot{Y}_{2b}(t) = (A^T - Z_{2b}^T(t) B R B^T) Y_{2b}(t). \tag{40}$$

Here $A$, $B$, $Q$, and $R$ are defined in (25), and (37) and (39) are differential matrix Riccati equations. The initial values for $Z_{2f}(t)$, $Y_{2f}(t)$, $Z_{2b}(t)$, and $Y_{2b}(t)$ can be derived from (16) as follows respectively.

$$Z_{2f}(0) = 0, \quad Y_{2f}(0) = I, \tag{41}$$

$$Z_{2b}(0) = 0, \quad Y_{2b}(0) = I, \tag{42}$$

where $I$ denotes the identity matrix of dimension $n \times n$. Substituting (35) and (36) into the canonical transformations (8) and (12) respectively, we obtain the following expressions of $\lambda(t)$.

$$\lambda(t, t_0, x, \lambda_0) = \frac{\partial S_{2f}(x, \lambda_0, t)}{\partial x} = Z_{2f}(t - t_0) x(t) + Y_{2f}(t - t_0) \lambda_0 + \text{h.o.t.}, \tag{43}$$

$$\lambda(t, t_f, x, \lambda_{t_f}) = \frac{\partial S_{2b}(x, \lambda_{t_f}, t)}{\partial x} = Z_{2b}(t_f - t) x(t) + Y_{2b}(t_f - t) \lambda_{t_f} + \text{h.o.t.} \tag{44}$$

Now, let us discuss the optimal trajectories and inputs generated by the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3f}(\lambda, x_0, t)$ and the pair $S_{2b}(x, \lambda_{t_f}, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$. By solving (33) and (43) for $x(t)$ and $\lambda(t)$, we get the optimal trajectory $x^*(t)$ and input $u^*(t)$ generated by the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3f}(\lambda, x_0, t)$ as follows.

$$\begin{bmatrix} x^*(t, x_0, \lambda_0, t_0, t_f) \\ u^*(t, x_0, \lambda_0, t_0, t_f) \end{bmatrix} = \begin{bmatrix} -(I + Z_{3f}(t - t_0) Z_{2f}(t - t_0))^{-1} \left[ (Y_{3f}(t - t_0) x_0 + Z_{3f}(t - t_0) Y_{2f}(t - t_0) \lambda_0) + \text{h.o.t.} \right] \\ -r(x^*)^{-1} g(x^*)^T (I + Z_{2f}(t - t_0) Z_{3f}(t - t_0))^{-1} \left[ (Y_{2f}(t - t_0) \lambda_0 - Z_{2f}(t - t_0) Y_{3f}(t - t_0) x_0) + \text{h.o.t.} \right] \end{bmatrix}. \tag{45}$$

Similarly, by solving (34) and (44) for $x(t)$ and $\lambda(t)$, we get the optimal trajectory $x^*(t)$ and input $u^*(t)$ generated by the pair $S_{2b}(x, \lambda_{t_f}, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ as follows.

$$\begin{bmatrix} x^*(t, x_{t_f}, \lambda_{t_f}, t_0, t_f) \\ u^*(t, x_{t_f}, \lambda_{t_f}, t_0, t_f) \end{bmatrix} = \begin{bmatrix} -(I + Z_{3b}(t_f - t) Z_{2b}(t_f - t))^{-1} \left[ (Y_{3b}(t_f - t) x_{t_f} + Z_{3b}(t_f - t) Y_{2b}(t_f - t) \lambda_{t_f}) + \text{h.o.t.} \right] \\ -r(x^*)^{-1} g(x^*)^T (I + Z_{2b}(t_f - t) Z_{3b}(t_f - t))^{-1} \left[ (Y_{2b}(t_f - t) \lambda_{t_f} - Z_{2b}(t_f - t) Y_{3b}(t_f - t) x_{t_f}) + \text{h.o.t.} \right] \end{bmatrix}. \tag{46}$$

The following lemma shows that (45) and (46) cause numerical instability as the time interval increases.

To state the lemma, we need to define some notation. For any square matrix $A \in \mathbb{R}^{n \times n}$, $\mathcal{L}^+(A)$ denotes the $A$-invariant subspace of $\mathbb{R}^n$ spanned by the eigenvectors of $A$ corresponding to the eigenvalues $\lambda$ of $A$ such that $\mathrm{Re}(\lambda) > 0$. For any matrix $U$, $N(U)$ denotes its null space. For any constant matrix pair $(C, A)$ where $C \in \mathbb{R}^{p \times n}$ and $A \in \mathbb{R}^{n \times n}$, $N_O(C, A)$ and $N_O^+(C, A)$ are defined as follows respectively:

$$N_O(C, A) := \bigcap_{i=0}^{n-1} N(C A^i), \tag{47}$$
$$N_O^+(C, A) := N_O(C, A) \cap \mathcal{L}^+(A). \tag{48}$$

Lemma 1 [13] For differential matrix Riccati equations (19) and (37), if both $(A^T, Q)$ and $(-A, BR)$ are stabilizable and

$$N_O^+(R B^T, A^T) = \{0\}, \quad N_O^+(Q, -A) = \{0\} \tag{49}$$

hold, then

$$\lim_{t \to +\infty} (I + Z_{3f}(t) Z_{2f}(t)) = 0. \tag{50}$$

For differential matrix Riccati equations (21) and (39), if both $(-A^T, Q)$ and $(A, BR)$ are stabilizable and

$$N_O^+(R B^T, -A^T) = \{0\}, \quad N_O^+(Q, A) = \{0\} \tag{51}$$

hold, then

$$\lim_{t \to +\infty} (I + Z_{3b}(t) Z_{2b}(t)) = 0. \tag{52}$$

Because of (50) and (52) in Lemma 1, the matrices inverted in (45) and (46) approach singularity, so the optimal trajectories given by (45) and (46) become numerically unstable as the time interval increases. This means that using a pair of generating functions with the same time direction causes numerical instability as the time interval increases. Therefore, (45) and (46) are not appropriate for generating optimal trajectories for nonlinear optimal control problems.

The following theorem gives the optimal trajectory and input generated by the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$, the pair $S_{2b}(x, \lambda_{t_f}, t)$ and $S_{3f}(\lambda, x_0, t)$, and the pair $S_{2f}(x, \lambda_0, t)$ and $S_{2b}(x, \lambda_{t_f}, t)$ for a nonlinear optimal control problem respectively.

Theorem 3 For the nonlinear optimal control problem (1)-(3) with $t_0$ and $t_f$ satisfying $t_0 < t_f$ and $t_f - t_0 \leq T$, the optimal trajectory $x^*(t)$ ($t \in [t_0, t_f]$) and input $u^*(t)$ ($t \in [t_0, t_f]$) are given as follows.

$$\begin{bmatrix} x^*(t, \lambda_0, x_{t_f}, t_0, t_f) \\ u^*(t, \lambda_0, x_{t_f}, t_0, t_f) \end{bmatrix} = \begin{bmatrix} -(I + Z_{3b}(t_f - t) Z_{2f}(t - t_0))^{-1} \left[ (Y_{3b}(t_f - t) x_{t_f} + Z_{3b}(t_f - t) Y_{2f}(t - t_0) \lambda_0) + \text{h.o.t.} \right] \\ -r(x^*)^{-1} g(x^*)^T (I + Z_{2f}(t - t_0) Z_{3b}(t_f - t))^{-1} \left[ (Y_{2f}(t - t_0) \lambda_0 - Z_{2f}(t - t_0) Y_{3b}(t_f - t) x_{t_f}) + \text{h.o.t.} \right] \end{bmatrix}, \tag{53}$$

or

$$\begin{bmatrix} x^*(t, x_0, \lambda_{t_f}, t_0, t_f) \\ u^*(t, x_0, \lambda_{t_f}, t_0, t_f) \end{bmatrix} = \begin{bmatrix} -(I + Z_{3f}(t - t_0) Z_{2b}(t_f - t))^{-1} \left[ (Y_{3f}(t - t_0) x_0 + Z_{3f}(t - t_0) Y_{2b}(t_f - t) \lambda_{t_f}) + \text{h.o.t.} \right] \\ -r(x^*)^{-1} g(x^*)^T (I + Z_{2b}(t_f - t) Z_{3f}(t - t_0))^{-1} \left[ (Y_{2b}(t_f - t) \lambda_{t_f} - Z_{2b}(t_f - t) Y_{3f}(t - t_0) x_0) + \text{h.o.t.} \right] \end{bmatrix}, \tag{54}$$

or

$$\begin{bmatrix} x^*(t, \lambda_0, \lambda_{t_f}, t_0, t_f) \\ u^*(t, \lambda_0, \lambda_{t_f}, t_0, t_f) \end{bmatrix} = \begin{bmatrix} (Z_{2b}(t_f - t) - Z_{2f}(t - t_0))^{-1} \left[ Y_{2f}(t - t_0) \lambda_0 - Y_{2b}(t_f - t) \lambda_{t_f} + \text{h.o.t.} \right] \\ -r(x^*)^{-1} g(x^*)^T \left[ Y_{2f}(t - t_0) \lambda_0 + Z_{2f}(t - t_0) (Z_{2b}(t_f - t) - Z_{2f}(t - t_0))^{-1} (Y_{2f}(t - t_0) \lambda_0 - Y_{2b}(t_f - t) \lambda_{t_f}) + \text{h.o.t.} \right] \end{bmatrix}. \tag{55}$$

Here, $Z_{3f}(t)$, $Y_{3f}(t)$, $Z_{3b}(t)$, $Y_{3b}(t)$, $Z_{2f}(t)$, $Y_{2f}(t)$, $Z_{2b}(t)$, and $Y_{2b}(t)$ are the solutions to (19), (20), (21), (22), (37), (38), (39), and (40) respectively, $\lambda_0$ is the solution to

$$x_0 = \left. \frac{\partial S_{2f}(x, \lambda_0, t)}{\partial \lambda_0} \right|_{t = t_0} \quad \text{or} \quad x = -\left. \frac{\partial S_{3b}(\lambda, x_{t_f}, t)}{\partial \lambda} \right|_{t = t_f}, \tag{56}$$

and $\lambda_{t_f}$ is the solution to

$$x_{t_f} = \left. \frac{\partial S_{2b}(x, \lambda_{t_f}, t)}{\partial \lambda_{t_f}} \right|_{t = t_0} \quad \text{or} \quad x = -\left. \frac{\partial S_{3f}(\lambda, x_0, t)}{\partial \lambda} \right|_{t = t_f}. \tag{57}$$

Proof: The proof is similar to the proof of Theorem 1. By solving (34) and (43), (33) and (44), and (43) and (44) for $x(t)$ and $\lambda(t)$, we get the expressions of the optimal state and input as in (53), (54), and (55) using $u^*(t) = -r^{-1}(x^*) g(x^*)^T \lambda(t)$ respectively. This completes the proof.

From (53)-(55), we can see that $\lambda_0$ and/or $\lambda_{t_f}$ are necessary to generate optimal trajectories and inputs. Therefore, we need to solve (56) numerically for $\lambda_0$ and/or solve (57) numerically for $\lambda_{t_f}$, which is not necessary for (26) in Theorem 2. Therefore, (26) is more convenient for generating optimal trajectories for different boundary conditions than any of (53)-(55).

IV. NUMERICAL EXAMPLE

We solve the following one dimensional nonlinear optimal control problem using the method proposed in the previous section.

Example 1 Consider the following nonlinear optimal control problem. The dynamic equation is

$$\dot{x} = 10x + x^2 + 0.1x^3 + u, \tag{58}$$

and the cost function is defined as

$$J = \frac{1}{2} \int_0^1 (x^2 + u^2) \, dt. \tag{59}$$

The purpose is to find an optimal control input $u^*$ minimizing the cost function in (59) subject to the following boundary condition.

$$x(0) = -0.4, \quad x(1) = 0.4. \tag{60}$$

To explain the following figures, some notation is introduced. The generating functions are expanded up to the order $N$. The symbol $(\cdot)_{\mathrm{bvp4c}}$ means that $(\cdot)$ is obtained by using the MATLAB command "bvp4c". In what follows, the reference optimal trajectory means the one obtained by using the MATLAB command "bvp4c".

For different pairs of generating functions, the optimal states and inputs are depicted in Figs. 1-3 for the boundary condition in (60). We use the pair $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ for Fig. 1, the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ for Fig. 2, and the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3f}(\lambda, x_0, t)$ for Fig. 3. Fig. 1(a) and Fig. 2(a) show that the trajectories and inputs generated by the pair $S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ and the pair $S_{2f}(x, \lambda_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$ are almost the same as the
optimal trajectory. In contrast, Fig. 3(a) shows that the trajectory generated by the pair $S_{3f}(\lambda, x_0, t)$ and $S_{2f}(x, \lambda_0, t)$ diverges from the optimal trajectory. This means that the trajectory generated by this pair is not the optimal trajectory. The reason is explained in Subsection B of Section 3.

[Figure: (a) The state, state trajectory x versus Time (s); (b) The input, input u versus Time (s); curves for N=2, N=3, N=4 and bvp4c.]
Fig. 1. The optimal trajectory and input generated by $S_{3f}$ and $S_{3b}$ for $x_0 = -0.4$, $x_{t_f} = 0.4$

[Figure: (a) The state, state trajectory x versus Time (s); (b) The input, input u versus Time (s); curves for N=2, N=3, N=4 and bvp4c.]
Fig. 2. The optimal trajectory and input generated by $S_{2f}$ and $S_{3b}$ for $x_0 = -0.4$, $x_{t_f} = 0.4$

[Figure: (a) The state, state trajectory x versus Time (s); (b) The input, input u versus Time (s); curves for N=2 and bvp4c.]
Fig. 3. The optimal trajectory and input generated by $S_{2f}$ and $S_{3f}$ for $x_0 = -0.4$, $x_{t_f} = 0.4$

To illustrate the advantage of the proposed method, we use a pair of generating functions ($S_{3f}(\lambda, x_0, t)$ and $S_{3b}(\lambda, x_{t_f}, t)$) to generate optimal trajectories for different terminal state values and different time intervals as shown in Fig. 4. We can see in Fig. 4(a) that all of the generated optimal trajectories reach the designed boundary values at the designed terminal time. This example illustrates the effectiveness and the advantage of the proposed method.

[Figure: (a) The state, state trajectory x versus Time (s); (b) The input, input u versus Time (s); curves for ($x_0 = -0.1$, $x_{t_f} = 0.1$), ($x_0 = -0.2$, $x_{t_f} = 0.2$) and ($x_0 = -0.4$, $x_{t_f} = 0.4$).]
Fig. 4. The case using $S_{3f}$ and $S_{3b}$ to generate optimal trajectory for different boundary conditions and different time intervals

V. CONCLUSIONS

In this paper, for a nonlinear finite time optimal control problem, a method using a pair of generating functions is proposed to generate optimal trajectories for different boundary conditions and different time intervals. The optimal trajectories and optimal inputs are given as functions of the boundary values of the state and the initial and terminal time. In addition, the generating function can be calculated off-line. Therefore, it is easy to generate optimal trajectories for different boundary conditions and different time intervals directly without any numerical integration of the system dynamic equation, and the online computational effort for different boundary conditions decreases accordingly. The method is expected to be useful for online nonlinear trajectory generation problems such as model predictive control.

REFERENCES

[1] E. G. Al'brecht, "On the optimal stabilization of nonlinear systems," PMM-J. Appl. Math. Mech., vol. 25, pp. 1254–1266, 1961.
[2] D. L. Lukes, "Optimal regulation of nonlinear dynamical systems," SIAM J. Control and Optimization, vol. 7, no. 1, pp. 75–100, 1969.
[3] N. Sakamoto and A. J. van der Schaft, "Analytical approximation methods for the stabilizing solution of the Hamilton-Jacobi equation," IEEE Trans. Autom. Contr., vol. 53, no. 10, pp. 2335–2350, 2008.
[4] A. Locatelli, Optimal Control: An Introduction, 1st ed. Basel-Boston-Berlin: Birkhauser Verlag, 2001, pp. 147–164.
[5] U. M. Ascher, R. M. M. Mattheij, and R. D. Russell, Numerical solution of boundary value problems for ordinary differential equations. Tokyo: Prentice-Hall, 1988, pp. 164–180.
[6] H. B. Keller, Numerical Methods for Two-Point Boundary-Value Problems. Waltham, Massachusetts: Blaisdell, 1968, pp. 39–68.
[7] E. Cristiani and P. Martinon, "Initialization of the shooting method via the Hamilton-Jacobi-Bellman approach," Journal of Optimization Theory and Applications, vol. 146, no. 2, pp. 321–346, 2010.
[8] A. P. Willemstein, "Optimal regulation of nonlinear dynamical systems on a finite interval," SIAM J. Control and Optimization, vol. 15, no. 6, pp. 1050–1069, 1977.
[9] V. Guibout and D. J. Scheeres, "Solving relative two point boundary value problems: Spacecraft formation flight transfers application," J. Guid. Control. Dyn., vol. 27, no. 4, pp. 693–704, 2004.
[10] C. Park and D. J. Scheeres, "Determination of optimal feedback terminal controllers for general boundary conditions using generating functions," Automatica, vol. 42, no. 5, pp. 869–875, 2006.
[11] M. Bando and H. Yamakawa, "A new optimal orbit control for two-point boundary-value problem using generating functions," Advances in the Astronautical Sciences, vol. 134, pp. 245–260, 2009.
[12] Z. Hao and K. Fujimoto, "Approximate solutions to the Hamilton-Jacobi equations for generating functions with a quadratic cost function with respect to the input," in Proceedings of the 4th IFAC Workshop on Lagrangian and Hamiltonian Methods for Nonlinear Control, 2012, pp. 194–199.
[13] Z. Hao, K. Fujimoto, and Y. Hayakawa, "Optimal trajectory generation for linear systems based on double generating functions," to appear in Proceedings of the 51st IEEE Conference on Decision and Control, 2012.
[14] H. P. Geering, Optimal Control with Engineering Applications, 1st ed. Berlin: Springer, Apr. 2007, pp. 24–35.
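As a supplementary illustration of Theorem 2, the following is a minimal sketch that keeps only the first-order terms of (26), i.e. it drops $K_{3f3bx}$ and $K_{3f3bu}$, for the linearization of Example 1 at the origin. It assumes $A = 10$, $B = 1$, $R = 1$ and $Q = 1$ (the value of $Q$ is chosen here to match the running cost $(x^2 + u^2)/2$ and is an assumption). The fixed-step RK4 integrator, the lookup tables `FWD` and `BWD`, and all function names are hypothetical choices made for this sketch. For the nonlinear system (58) the result is only a first-order approximation; for the linearized system it can be compared with the closed-form solution of $\ddot{x} = (A^2 + 1)x$.

```python
import math

A, B, Q, R = 10.0, 1.0, 1.0, 1.0    # assumed linearization data (see lead-in)


def rhs_forward(z, y):
    """Scalar versions of (19) and (20)."""
    return 2.0 * A * z - Q * z * z + B * R * B, (A - Q * z) * y


def rhs_backward(z, y):
    """Scalar versions of (21) and (22)."""
    return -2.0 * A * z + Q * z * z - B * R * B, (-A + Q * z) * y


def integrate(rhs, t_end, steps):
    """RK4 from the initial values (23)/(24): Z(0) = 0, Y(0) = -1."""
    h = t_end / steps
    z, y = 0.0, -1.0
    table = [(z, y)]
    for _ in range(steps):
        k1 = rhs(z, y)
        k2 = rhs(z + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(z + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(z + h * k3[0], y + h * k3[1])
        z += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        table.append((z, y))
    return table


T, STEPS = 1.0, 2000
FWD = integrate(rhs_forward, T, STEPS)     # off-line: Z3f, Y3f on a grid
BWD = integrate(rhs_backward, T, STEPS)    # off-line: Z3b, Y3b on a grid


def trajectory_point(k, x0, xtf):
    """First-order part of (26) at t_k = k*T/STEPS, with t0 = 0 and tf = T."""
    z3f, y3f = FWD[k]                # evaluated at t - t0
    z3b, y3b = BWD[STEPS - k]        # evaluated at tf - t
    lam = (y3f * x0 - y3b * xtf) / (z3b - z3f)
    x = -y3f * x0 - z3f * lam
    u = -R * B * lam
    return x, u


# On-line use: new boundary values reuse the same tables, no re-integration.
for x0, xtf in ((-0.4, 0.4), (-0.2, 0.2)):
    print(x0, xtf, trajectory_point(STEPS // 2, x0, xtf))
```

The tables are computed once; each call to `trajectory_point` then needs only a few arithmetic operations, which mirrors the off-line/on-line split described in the paper. A finer time resolution or other horizons $t_f - t_0 \leq T$ would need interpolation of the tables, which is left out of this sketch.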
