Model-Following Neuro-Adaptive Control Design For Non-Square, Non-Affine Nonlinear Systems
$$\dot{X}_d = f(X_d, U_d) \qquad (1)$$
where $X_d \in \mathbb{R}^n$ is the desired state vector and $U_d \in \mathbb{R}^m$ $(m \le n)$ is the nominal control vector of a nominal system. It is assumed that the order of the system $n$ is known and a satisfactory nominal controller $U_d$ has been designed using some standard method (e.g. dynamic inversion, optimal control theory, Lyapunov theory and so on) such that this controller meets some desired performance goal. However, (1) may not truly represent the actual plant because (i) there may be neglected algebraic terms in this mathematical model (this study is restricted to this class of unmodelled dynamics) and (ii) the numerical values of the parameters may not perfectly represent the actual plant, and this error results in unknown functions in the model. As a consequence, the actual plant is assumed to have the following structure
$$\dot{X} = f(X, U) + d(X) \qquad (2)$$
where $X \in \mathbb{R}^n$ is the state of the actual plant and $U \in \mathbb{R}^m$ is the modified controller. The unknown algebraic function $d(X) \in \mathbb{R}^n$ arises because of the two reasons mentioned above. Note that the two functions $f(X_d, U_d)$ and $f(X, U)$ may or may not have the same algebraic expressions.
However, $f(X, U)$ contains the known part of the dynamics of (2). The task here is to design a modified controller $U$ online in such a way that the states of the actual plant track the respective states of the nominal model. In other words, the goal is to ensure that $X \to X_d$ as $t \to \infty$, which ensures that the actual system performs like the nominal system. As a means to achieve this, the aim is first to capture the unknown function $d(X)$, which is accomplished through a neural network approximation $\hat{d}(X)$. A necessary intermediate step towards this end is the definition of an approximate system as follows
$$\dot{X}_a = f(X, U) + \hat{d}(X) + (X - X_a), \quad X_a(0) = X(0) \qquad (3)$$
Through this artifice, one can ensure that $X \to X_a \to X_d$ as $t \to \infty$. Obviously, this introduces two tasks: (i) ensuring $X \to X_a$ as $t \to \infty$ and (ii) ensuring $X_a \to X_d$ as $t \to \infty$. The reason for choosing an approximate system of the form in (3) is to facilitate meaningful bounds on the errors and weights.
2.2 Control solution (ensuring $X_a \to X_d$)
In this loop, it is assumed that a neural network approximation $\hat{d}(X)$ of the unknown function is available. The goal in this step is to drive $X_a \to X_d$ as $t \to \infty$, which is achieved by enforcing the following first-order asymptotically stable error dynamics
$$(\dot{X}_a - \dot{X}_d) + K(X_a - X_d) = 0 \qquad (4)$$
where a positive definite gain matrix $K$ is chosen. A relatively easy way of choosing the gain matrix is to have
$$K = \mathrm{diag}(1/\tau_1, \ldots, 1/\tau_n) \qquad (5)$$
where $\tau_i$, $i = 1, \ldots, n$, can be interpreted as the desired time constant for the $i$th channel of the error dynamics in (4).
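As an illustration of how the gain choice in (5) shapes the error decay, the short sketch below (not from the paper; the time constants and integration scheme are illustrative) integrates the error dynamics channel by channel:

```python
import numpy as np

# Sketch (illustrative values): with K = diag(1/tau_i), each channel of the
# error e = X_a - X_d obeys e_dot = -e/tau_i, so channel i decays with time
# constant tau_i.
tau = np.array([0.5, 2.0])          # hypothetical desired time constants
K = np.diag(1.0 / tau)              # gain matrix of eq. (5)

e = np.array([1.0, 1.0])            # initial tracking error X_a - X_d
dt = 1e-3
for _ in range(int(0.5 / dt)):      # integrate for t = 0.5 s (forward Euler)
    e = e + dt * (-K @ e)

# After t = 0.5 s the first channel (tau = 0.5) has decayed to about
# exp(-1) = 0.37, while the slower channel has only reached exp(-0.25) = 0.78.
print(e)
```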
Substitution of (1) and (3) into (4) leads to
$$f(X, U) + \hat{d}(X) + (X - X_a) - f(X_d, U_d) + K(X_a - X_d) = 0 \qquad (6)$$
Solving for $f(X, U)$ from (6)
$$f(X, U) = b(X, X_a, X_d, U_d) \qquad (7)$$
IET Control Theory Appl., Vol. 1, No. 6, November 2007 1651
where
$$b(X, X_a, X_d, U_d) \triangleq \{f(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}(X)\} \qquad (8)$$
The next step is to solve for the control $U$ from (7). A few different cases and issues need to be considered in this context, which are discussed next.
Case 1. If the following conditions are satisfied:
The system is square, i.e. $m = n$
The system dynamics is affine in the control variable, i.e. $f(X, U)$ can be written as
$$f(X, U) = f_1(X) + [g_1(X)]U \qquad (9)$$
$[g_1(X(t))]_{n \times n}$ is non-singular $\forall t$
From (7)–(9), $U$ can be obtained in a straightforward manner as
$$U = [g_1(X)]^{-1}\{b(X, X_a, X_d, U_d) - f_1(X)\} \qquad (10)$$
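A minimal numerical sketch of the Case 1 solution (10); the drift $f_1$, the control effectiveness $g_1$ and the vector $b$ below are made-up stand-ins, not a plant from the paper:

```python
import numpy as np

# Illustrative sketch of eq. (10) for a square, control-affine system.
def f1(X):                      # hypothetical known drift term f_1(X)
    return np.array([X[1], -np.sin(X[0])])

def g1(X):                      # hypothetical [g_1(X)], assumed non-singular
    return np.array([[1.0, 0.0], [0.0, 2.0 + X[0]**2]])

def control(X, b):              # U = [g_1(X)]^{-1} { b(...) - f_1(X) }
    return np.linalg.solve(g1(X), b - f1(X))

X = np.array([0.1, 0.2])
b = np.array([0.0, 0.0])        # b(X, X_a, X_d, U_d) of eq. (8), given as a vector
U = control(X, b)
# Check: plugging U back in reproduces b, i.e. f_1(X) + g_1(X) U = b
print(f1(X) + g1(X) @ U)
```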
Case 2. The question is what if the system is control-affine but non-square? Two cases may arise, i.e. either $m > n$ or $m < n$. If $m > n$, a technique that can be made use of is linear programming. Linear programming is the process of optimising a linear objective function subject to a finite number of linear equality and inequality constraints [23]. Control allocation problems in the face of redundant controllers have been dealt with successfully using linear programming in aerospace applications, as shown in [24]. However, if $m < n$, which is usually the case in many engineering applications, a novel method of introducing extra variables to augment the control vector to make it square is proposed. This technique leads to a square problem that facilitates a solution. From this solution, the components of the augmented control vector that represent the actual controller can be extracted. This idea is elaborated in the following paragraphs.
When $m < n$, the number of equations is more than the number of control variables and (6) cannot be solved for $U$. To find a solution, a slack-variable vector $U_s$ is introduced first. Next, an $n \times (n - m)$ matrix $C(X)$ is designed and $C(X)U_s$ is added to the right-hand side of the approximate system (3) to get
$$\dot{X}_a = [f(X, U) + C(X)U_s] + \hat{d}_a(X, U_s) + (X - X_a), \quad X_a(0) = X(0) \qquad (11)$$
The following quantities are defined
$$V \triangleq [U^T \;\; U_s^T]^T \qquad (12)$$
$$f_a(X, V) \triangleq [f(X, U) + C(X)U_s] \qquad (13)$$
$$\hat{d}_a(X, U_s) \triangleq [d(X) - C(X)U_s] \qquad (14)$$
Using the definitions in (12)–(14), (11) can be expressed as
$$\dot{X}_a = f_a(X, V) + \hat{d}_a(X, U_s) + (X - X_a), \quad X_a(0) = X(0) \qquad (15)$$
Note that (15) defines a square system in $X$ and $V$ and therefore it is feasible to get a solution for $V$. The first $m$ elements of $V$ represent $U$. As a part of this process, the control designer needs to obtain $\hat{d}_a(X, U_s)$.
A neural network is used in this study for this purpose. $\hat{d}_{a_i}(X, U_s)$ can be obtained as the output of a neural network represented by $\hat{W}_i^T \Phi_i(X, U_s)$. Here $\hat{W}$ and $\Phi$ are the weight vector and basis function vector of a neural network, respectively. $[X^T, U_s^T]^T$ is the input vector to the neural network. The subscript $i$ stands for each state of the plant model, i.e. each state equation has a separate neural network associated with it.
Similar to the expression in (4), the error dynamic equation for a control-affine but non-square system can be written as
$$f_1(X) + [g_1(X) \;\; C(X)]\begin{bmatrix} U \\ U_s \end{bmatrix} + \hat{d}_a(X, U_s) + (X - X_a) - f(X_d, U_d) + K(X_a - X_d) = 0 \qquad (16)$$
$$f_1(X) + [g_1(X) \;\; C(X)]V + \hat{d}_a(X, U_s) + (X - X_a) - f(X_d, U_d) + K(X_a - X_d) = 0 \qquad (17)$$
This leads to the solution
$$V = [G(X)]^{-1} b_s(X, X_a, X_d, U_d, U_s) \qquad (18)$$
where
$$[G(X)] \triangleq [g_1(X) \;\; C(X)] \qquad (19)$$
$$b_s(X, X_a, X_d, U_d, U_s) \triangleq \{f(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}_a(X, U_s) - f_1(X)\} \qquad (20)$$
$$\hat{d}_{a_i}(X, U_s) = \hat{W}_i^T \Phi_i(X, U_s), \quad i = 1, \ldots, n \qquad (21)$$
Note that the function $C(X)$ should be chosen carefully such that the square matrix $[G(X)]$ does not become singular. Choosing such a function $C(X)$, however, is problem dependent and care should be taken while choosing it.

It has to be noted that this formulation results in a fixed-point problem in the control solution, because the control vector $V$ contains the vector $U_s$, and the control solution equation (18) also contains $U_s$ on the right-hand side: in (18), $U_s$ is an input to the neural network that approximates the uncertain function. The solution for $V$ is obtained numerically as $V_{k+1} = G^{-1}(H - \hat{d}_a(X, V_k))$, $k = 0, 1, 2, \ldots$, where $k$ is the iteration number and $H \triangleq \{f(X_d, U_d) - K(X_a - X_d) - (X - X_a) - f_1(X)\}$. The validity of this solution has been proved using the contraction mapping theorem (see Section 7). The proof, containing the conditions required for the existence of a unique control solution, is given for the most general case, i.e. the non-square, non-affine case.
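The fixed-point iteration above can be sketched as follows; the matrix $G$, the vector $H$ and the mock network output below are illustrative stand-ins, and convergence presumes the contraction condition established in the paper's Section 7:

```python
import numpy as np

# Sketch of the fixed-point iteration V_{k+1} = G^{-1}(H - d_a_hat(X, V_k)).
# All numbers here are made up for demonstration.
G = np.array([[2.0, 0.5], [0.0, 3.0]])
H = np.array([1.0, -1.0])

def d_a_hat(V):                     # mock network output, mildly dependent on V
    return 0.1 * np.tanh(V)

V = np.zeros(2)
for k in range(50):                 # iterate until the update is negligible
    V_next = np.linalg.solve(G, H - d_a_hat(V))
    if np.linalg.norm(V_next - V) < 1e-10:
        break
    V = V_next

# At the fixed point, G V + d_a_hat(V) = H holds.
print(G @ V + d_a_hat(V))
```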
Case 3. The system dynamics is square ($m = n$), but not control-affine. In such a situation, the following three options are available.
1. The form of the equation may be such that it may still facilitate a closed-form solution for the control variable.
2. Another option is the use of a numerical technique such as the standard Newton–Raphson technique [25]. With the availability of fast computational algorithms and high-speed processors, fast online numerical solution of algebraic equations is not considered to be an infeasible task. For example, the viability of the Newton–Raphson technique for online applications is discussed in [26, 27], where the authors have used the technique for complex real-life problems. Note that a good initial guess solution can be provided at any time step $k$ as
$$(U_{\text{guess}})_k = \begin{cases} U_d, & k = 1 \\ U_{k-1}, & k = 2, 3, \ldots \end{cases} \qquad (22)$$
3. Following the idea in [28, 29], a novel method is introduced to deal with a class of control non-affine smooth nonlinear systems of the form $\dot{X} = f(X, U)$, where $f$ is a smooth mapping and $f(0, 0) = 0$. If the unforced dynamic equation $\dot{X} = f(X, 0) \triangleq f_0(X)$ of a system in this class is Lyapunov stable, the system equation can be represented as
$$\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i(R_i(X, U)U) \qquad (23)$$
as shown in [28, 29]. In the above-mentioned representation
$$f_0(X) \triangleq f(X, 0), \quad g_0(X) \triangleq \frac{\partial f}{\partial U}(X, 0) = [g_{0_1}(X) \; \cdots \; g_{0_m}(X)] \in \mathbb{R}^{n \times m} \qquad (24)$$
and $R_i(X, U): \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^{n \times m}$ is a smooth mapping for $1 \le i \le m$. The actual plant equation $\dot{X} = f(X, U) + d(X)$ for this class of nonlinear non-affine systems can be expressed as
$$\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X) \qquad (25)$$
The approximate plant equation now becomes
$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}(X, U) + (X - X_a) \qquad (26)$$
In this case, the online neural network output $\hat{d}(X, U)$ captures the uncertainty $\sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X)$. Now the control solution can be obtained from the error dynamic equation (4) between the approximate state and the desired state as
$$U = [g_0(X)]^{-1}\{b(X, X_a, X_d, U_d, U) - f_0(X)\} \qquad (27)$$
where
$$b(X, X_a, X_d, U_d, U) \triangleq \{f(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}(X, U)\} \qquad (28)$$
Note that $[g_0(X)]$ is assumed to be non-singular $\forall t$. Here again it can be seen that (27) constitutes a fixed-point problem. The control solution is obtained numerically using $U_{k+1} = [g_0(X)]^{-1}[b(X, X_a, X_d, U_d, U_k) - f_0(X)]$, $k = 0, 1, 2, \ldots$, where $k$ is the iteration number. Such a solution is shown to be valid by proving that the mapping in (27) is a contraction mapping. The proofs and conditions that lead to the validity of the solution are given in Section 7 (Appendix).
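The warm-started Newton–Raphson idea of option 2 and eq. (22) can be sketched on a scalar non-affine equation; the equation, tolerances and right-hand-side values below are illustrative, not taken from the paper:

```python
import math

# Sketch of eq. (22): at each time step the previous control solves a nearby
# equation, so the last solution is a good Newton-Raphson initial guess.
def solve_newton(b, u_guess, tol=1e-10):
    """Solve u + 0.5*sin(u) = b (an illustrative non-affine-in-u equation)."""
    u = u_guess
    for _ in range(20):
        r = u + 0.5 * math.sin(u) - b        # residual of the algebraic equation
        if abs(r) < tol:
            break
        u -= r / (1.0 + 0.5 * math.cos(u))   # Newton step
    return u

u_prev = 0.0                                  # stands in for U_d at k = 1
for b in [0.1, 0.12, 0.14]:                   # slowly varying right-hand side
    u_prev = solve_newton(b, u_prev)          # warm start from the last solution
print(u_prev)
```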
Case 4. If the system is both non-square and non-affine in control, the approximate plant equation takes the form
$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + (X - X_a) + C(X)U_s \qquad (29)$$
which reduces to
$$\dot{X}_a = f_0(X) + [g_0(X) \;\; C(X)]\begin{bmatrix} U \\ U_s \end{bmatrix} + \hat{d}_a(X, U, U_s) + (X - X_a) \qquad (30)$$
with
$$\hat{d}_a(X, U, U_s) \triangleq d(X) - C(X)U_s + \sum_{i=1}^{m} u_i(R_i(X, U)U) \qquad (31)$$
Define $V \triangleq [U^T \;\; U_s^T]^T$ and $[G(X)] \triangleq [g_0(X) \;\; C(X)]$.
The error dynamic equation can be expressed as
$$f_0(X) + GV + \hat{d}_a(X, U, U_s) + (X - X_a) - f(X_d, U_d) + K(X_a - X_d) = 0 \qquad (32)$$
The control can be solved as
$$V = [G(X)]^{-1} b_s(X, X_a, X_d, U_d, U, U_s) \qquad (33)$$
$$b_s(X, X_a, X_d, U_d, U, U_s) \triangleq \{f(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}_a(X, U, U_s) - f_0(X)\} \qquad (34)$$
Only the first $m$ elements of $V$ are needed for the implementation of the control on the actual plant. $[G(X)]$ is assumed to be non-singular $\forall t$. The control solution is obtained numerically using $V_{k+1} = G^{-1}(H - \hat{d}_a(X, V_k))$, $k = 0, 1, 2, \ldots$, where $k$ is the iteration number and $H \triangleq [f(X_d, U_d) - K(X_a - X_d) - (X - X_a) - f_0(X)]$. This solution is shown to be valid by proving that the mapping in (33) is a contraction mapping. The detailed proof is provided in Section 7 (Appendix).
2.3 Capturing the unknown function and neural network training (ensuring $X \to X_a$)

In this section, the process of realising the uncertainties in the actual plant equations (which is crucial for controller synthesis) is discussed in detail. The Stone–Weierstrass theorem from classical real analysis can be used to show that certain network architectures possess the universal approximation capability. Such networks typically have the desirable properties that larger networks produce less error than smaller networks and that a very wide class of functions can be modelled by them. This makes the authors believe that neural networks are more efficient in approximating complex functions when there are a large number of neurons in the hidden layer.
2.3.1 Selection of neural network structure: An important idea used in this work is to separate all the channels in the system equations. Thus, there are $n$ independent neural networks to approximate the uncertainties in each of the $n$ channels, which facilitates easier mathematical analysis. Define $d(X) \triangleq [d_1(X) \; \cdots \; d_n(X)]^T$, where $d_i(X)$, $i = 1, \ldots, n$, is the $i$th component of $d(X)$, i.e. the uncertainty in the $i$th state equation. Since each element of $d(X)$ is represented by a separate neural network, each network output can be expressed as $\hat{W}_i^T \Phi_i(X)$. It should be noted here that the neural network input vector may contain the states, the control vector and the slack-variable vector. Separation of channels has been carried out in this work to keep the uncertainties in each system equation distinct. During system operation, the magnitudes of the uncertain terms in the system equations may be of different orders. In such a case, having one network approximate the uncertainties of the whole system may affect the convergence of that single network. In order to prevent this from happening, all channels were separated.
Trigonometric basis neural networks [30, 31] were used in this study for approximating each of the unknown functions $d_i(X)$. The online uncertainty-approximating neural network can be represented by a linearly parameterised feedforward structure. Radial basis functions (RBFs) can be used in these structures because these functions are universal approximators. However, RBFs are very poor at interpolating between their design centres, and in such cases a large number of basis functions are needed. Researchers typically use basis functions constructed from functions that they think richly represent the nature of the unknown terms being approximated; there is no standard procedure for choosing basis functions for a given application. A Fourier series has the ability to approximate any nonlinear function quite well, and such a choice also makes the design application independent. Trigonometric basis neural networks are therefore used in this study, as the authors believe that the sine and cosine functions and their combinations have the capability to represent many nonlinear functions well. In order to form the vector of basis functions, the input data is first pre-processed. In the numerical experiments carried out, vectors $C_i$, $i = 1, \ldots, n$, with the structure $C_i = [1 \;\; \sin(x_i) \;\; \cos(x_i)]^T$ were created.
The vector of basis functions was generated as
$$\Phi = \mathrm{kron}(C_n, \ldots, \mathrm{kron}(C_3, \mathrm{kron}(C_1, C_2)) \ldots) \qquad (35)$$
$\mathrm{kron}(\cdot, \cdot)$ represents the Kronecker product and is defined in [30] as
$$\mathrm{kron}(Y, Z) = [y_1 z_1 \;\; y_1 z_2 \;\; \cdots \;\; y_n z_m]^T \qquad (36)$$
where $Y \in \mathbb{R}^n$ and $Z \in \mathbb{R}^m$. The dimension of the neural network weight vector is the same as the dimension of $\Phi$. The neural network outputs for each of the different cases considered in this study are tabulated in Table 1.
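A small sketch of the basis construction in (35)–(36), using NumPy's built-in Kronecker product (two channels only, for brevity; the state values are illustrative):

```python
import numpy as np

# Each C_i holds [1 sin(x_i) cos(x_i)]^T, and the full basis is the
# Kronecker product of the channel vectors, giving all cross-products
# of the channel-wise terms.
def C(x):
    return np.array([1.0, np.sin(x), np.cos(x)])

x1, x2 = 0.3, -0.7
Phi = np.kron(C(x1), C(x2))       # 3 * 3 = 9 basis functions

print(Phi.shape)                  # the weight vector W_i has this dimension
# e.g. Phi[1] = 1*sin(x2) and Phi[3] = sin(x1)*1 both appear in the basis
```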
2.3.2 Training of neural networks: The technique for updating the weights of the neural networks (i.e. training the networks) for accurate representations of the unknown functions $d_i(X)$, $i = 1, \ldots, n$, is discussed here. Define
$$e_{a_i} \triangleq (x_i - x_{a_i}) \qquad (37)$$
From (23), the equations for the $i$th channel can be decomposed as
$$\dot{x}_i = f_i(X, U) + d_i(X) \qquad (38)$$
$$\dot{x}_{a_i} = f_i(X, U) + \hat{d}_{a_i}(X, U, U_s) + e_{a_i} \qquad (39)$$
Subtracting (39) from (38) and using the definition in (37) gives
$$\dot{e}_{a_i} = d_i(X) - \hat{d}_{a_i}(X, U, U_s) - e_{a_i} \qquad (40)$$
From the universal function approximation property of neural networks [31], it can be stated that there exists an ideal neural network with an optimum weight vector $W_i$ and basis function vector $\Phi_i(X)$ that approximates $d_i(X)$ to an accuracy of $\epsilon_i$, that is
$$d_i(X) = W_i^T \Phi_i(X) + \epsilon_i \qquad (41)$$
Let the actual weight of the network used to approximate the uncertainties be $\hat{W}_i$. The approximated function can be written as
$$\hat{d}_{a_i}(X, U, U_s) = \hat{W}_i^T \Phi_i(X, U, U_s) \qquad (42)$$
Substituting (41)–(42) in (40) leads to
$$\dot{e}_{a_i} = \tilde{W}_i^T \Phi_i(X, U, U_s) + \epsilon_i - e_{a_i} \qquad (43)$$
where $\tilde{W}_i \triangleq (W_i - \hat{W}_i)$ is the error between the ideal weight and the actual weight of the neural network. Note that $\dot{\tilde{W}}_i = -\dot{\hat{W}}_i$ since $W_i$ is constant. An important point to be noted here is that the aim of each neural network is to capture the resulting function in each state equation, not parameter estimation or system identification. The magnitudes of the uncertainties/nonlinearities in the state equations are then used to make the plant track the desired reference trajectory.
Theorem: The stable adaptive weight update rule proposed as
$$\dot{\hat{W}}_i = \gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) - \gamma_{l_i} \sigma_i \hat{W}_i \qquad (44)$$
will ensure bounds on the error signal $e_{a_i}$ and the adaptive weights $\hat{W}_i$ of the online networks. $\gamma_{l_i}$ is the learning rate of the $i$th online network and $\sigma_i$ is a sigma-modification factor used to ensure a bound on the network weights.
Proof: Choose a Lyapunov function for each state equation as
$$v_i = \frac{1}{2} e_{a_i}^2 + \frac{1}{2} \tilde{W}_i^T \gamma_{l_i}^{-1} \tilde{W}_i \qquad (45)$$
Taking the derivative of the Lyapunov function
$$\dot{v}_i = e_{a_i} \dot{e}_{a_i} + \tilde{W}_i^T \gamma_{l_i}^{-1} \dot{\tilde{W}}_i \qquad (46)$$
On substituting the expression for $\dot{e}_{a_i}$ in (46)
$$\dot{v}_i = e_{a_i}(\tilde{W}_i^T \Phi_i(X, U, U_s) + \epsilon_i - e_{a_i}) + \tilde{W}_i^T \gamma_{l_i}^{-1} \dot{\tilde{W}}_i \qquad (47)$$
If the proposed weight update rule $\dot{\hat{W}}_i = \gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) - \gamma_{l_i} \sigma_i \hat{W}_i$ is used, the error dynamics of the difference between the optimal weight vector that represents the uncertainty and the weight vector used in the online networks becomes
$$\dot{\tilde{W}}_i = -\gamma_{l_i}(e_{a_i} \Phi_i(X, U, U_s) - \sigma_i \hat{W}_i) \qquad (48)$$
Table 1: Uncertainties and neural network outputs for different system types

No. | System type | Uncertainty | Neural network output
1 | Square, affine | $d(X)$ | $\hat{d}(X)$
2 | Non-square, affine | $d(X) - C(X)U_s$ | $\hat{d}_a(X, U_s)$
3 | Square, non-affine | $d(X) + \sum_{i=1}^{m} u_i(R_i(X, U)U)$ | $\hat{d}(X, U)$
4 | Non-square, non-affine | $d(X) - C(X)U_s + \sum_{i=1}^{m} u_i(R_i(X, U)U)$ | $\hat{d}_a(X, U, U_s)$
$$\dot{v}_i = e_{a_i}(\tilde{W}_i^T \Phi_i + \epsilon_i - e_{a_i}) + \tilde{W}_i^T \gamma_{l_i}^{-1}(-\gamma_{l_i}(e_{a_i} \Phi_i - \sigma_i \hat{W}_i)) = e_{a_i} \epsilon_i - e_{a_i}^2 + \sigma_i \tilde{W}_i^T \hat{W}_i \qquad (49)$$
However
$$\tilde{W}_i^T \hat{W}_i = \frac{1}{2}(2 \tilde{W}_i^T \hat{W}_i) = \frac{1}{2}(2 \tilde{W}_i^T (W_i - \tilde{W}_i)) = \frac{1}{2}(2 \tilde{W}_i^T W_i - 2 \tilde{W}_i^T \tilde{W}_i) \qquad (50)$$
The first term in (50) can be expanded as follows, using the fact that $\tilde{W}_i^T \hat{W}_i = \hat{W}_i^T \tilde{W}_i$ (a scalar):
$$2 \tilde{W}_i^T W_i = \tilde{W}_i^T W_i + \tilde{W}_i^T W_i = \tilde{W}_i^T (\hat{W}_i + \tilde{W}_i) + (W_i - \hat{W}_i)^T W_i$$
$$= \tilde{W}_i^T \hat{W}_i + \tilde{W}_i^T \tilde{W}_i + W_i^T W_i - \hat{W}_i^T (\tilde{W}_i + \hat{W}_i)$$
$$= \tilde{W}_i^T \tilde{W}_i + W_i^T W_i - \hat{W}_i^T \hat{W}_i \qquad (51)$$
Equation (50) can now be expressed as
$$\tilde{W}_i^T \hat{W}_i = \frac{1}{2}((\tilde{W}_i^T \tilde{W}_i) + (W_i^T W_i) - (\hat{W}_i^T \hat{W}_i) - 2(\tilde{W}_i^T \tilde{W}_i)) \qquad (52)$$
Equation (52) can be expressed as
$$\tilde{W}_i^T \hat{W}_i = \frac{1}{2}(-(\hat{W}_i^T \hat{W}_i) - (\tilde{W}_i^T \tilde{W}_i) + (W_i^T W_i)) = \frac{1}{2}(-\|\tilde{W}_i\|^2 - \|\hat{W}_i\|^2 + \|W_i\|^2) \qquad (53)$$
Therefore the last term in (49) can be written in terms of the inequality
$$\sigma_i \tilde{W}_i^T \hat{W}_i \le -\frac{1}{2}\sigma_i \|\tilde{W}_i\|^2 - \frac{1}{2}\sigma_i \|\hat{W}_i\|^2 + \frac{1}{2}\sigma_i \|W_i\|^2 \qquad (54)$$
The equation for $\dot{v}_i$ becomes
$$\dot{v}_i \le e_{a_i}\epsilon_i - e_{a_i}^2 - \frac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \frac{1}{2}\sigma_i\|\hat{W}_i\|^2 + \frac{1}{2}\sigma_i\|W_i\|^2$$
$$\le \frac{e_{a_i}^2}{2} + \frac{\epsilon_i^2}{2} - e_{a_i}^2 - \frac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \frac{1}{2}\sigma_i\|\hat{W}_i\|^2 + \frac{1}{2}\sigma_i\|W_i\|^2$$
$$= -\frac{e_{a_i}^2}{2} + \left(\frac{\epsilon_i^2}{2} + \frac{1}{2}\sigma_i\|W_i\|^2\right) - \frac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \frac{1}{2}\sigma_i\|\hat{W}_i\|^2 \qquad (55)$$
Define
$$\beta_i \triangleq \left(\frac{\epsilon_i^2}{2} + \frac{1}{2}\sigma_i\|W_i\|^2\right) \qquad (56)$$
For $\dot{v}_i < 0$,
$$\frac{e_{a_i}^2}{2} > \beta_i \qquad (57)$$
or
$$|e_{a_i}| > \sqrt{2\beta_i} \qquad (58)$$
Thus, it can be seen that selecting a sufficiently small $\sigma_i$ and choosing a sufficiently good set of basis functions (which reduces the approximation error $\epsilon_i$) will help in keeping the error bound small. The error bound for the proposed weight update scheme is $\sqrt{2\beta_i}$.

The following steps prove that the weight update rule is stable and all the signals in the weight update rule are bounded.
It can be seen from (48) that $\dot{\tilde{W}}_i = -\gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i \hat{W}_i$. Equation (48) can be expanded as
$$\dot{\tilde{W}}_i = -\gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i (W_i - \tilde{W}_i) = -\gamma_{l_i} \sigma_i \tilde{W}_i + \gamma_{l_i}(\sigma_i W_i - e_{a_i} \Phi_i(X, U, U_s)) \qquad (59)$$
Define $X_c \triangleq \tilde{W}_i$, $A \triangleq -\gamma_{l_i}\sigma_i$, $B \triangleq \gamma_{l_i}$ and $U_c \triangleq \sigma_i W_i - e_{a_i}\Phi_i(X, U, U_s)$. Equation (59) can be expressed as a linear differential equation of the form
$$\dot{X}_c = AX_c + BU_c \qquad (60)$$
For the above-mentioned linear time-invariant system with a negative definite $A$, the solution can be written as
$$X_c(t) = e^{A(t - t_0)} X_c(t_0) + \int_{t_0}^{t} e^{A(t - \tau)} B U_c(\tau)\, d\tau \qquad (61)$$
On using the bound $\|e^{A(t - t_0)}\| \le k e^{-\lambda(t - t_0)}$, the bound on the solution to (60) can be expressed as [32]
$$\|X_c(t)\| \le k e^{-\lambda(t - t_0)} \|X_c(t_0)\| + \int_{t_0}^{t} k e^{-\lambda(t - \tau)} \|B\| \|U_c(\tau)\|\, d\tau \le k e^{-\lambda(t - t_0)} \|X_c(t_0)\| + \frac{k\|B\|}{\lambda} \sup_{t_0 \le \tau \le t} \|U_c(\tau)\| \qquad (62)$$
Such a system is input-to-state stable [32]. This proves that $X_c \triangleq \tilde{W}_i$ is bounded for all bounded inputs. Since the input to the system in (62) is bounded, $\tilde{W}_i$ is bounded, which proves that $\hat{W}_i$ is bounded as well. This completes the proof. □
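The stabilising role of the sigma-modification term in (44) can be illustrated with a toy forward-Euler integration; the error signal, basis vector and all constants below are made up for demonstration and are not the paper's simulation:

```python
import numpy as np

# Sketch of the sigma-modified update of eq. (44),
#   W_hat_dot = gamma*e_a*Phi - gamma*sigma*W_hat.
# The leakage term -gamma*sigma*W_hat keeps the weights from drifting
# even under a persistent excitation/error signal.
gamma, sigma, dt = 10.0, 1e-2, 1e-3
W_hat = np.zeros(3)
for step in range(20000):                          # 20 s of simulated time
    t = step * dt
    Phi = np.array([1.0, np.sin(t), np.cos(t)])    # basis evaluated on the input
    e_a = 0.05 * np.sin(t)                         # stand-in tracking error
    W_hat += dt * (gamma * e_a * Phi - gamma * sigma * W_hat)

# The weight norm settles to a bounded value instead of growing without bound.
print(np.linalg.norm(W_hat))
```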
3 Simulation studies
In this section, two motivating examples that demonstrate the ideas of Section 2 are presented. The examples show that the methodology discussed in this paper can indeed be used to design controllers for complex nonlinear systems.
3.1 Van der Pol problem

As the first exercise, the Van der Pol system [33] was selected. The motivations for selecting it were: (i) it is a vector problem, (ii) it is a non-square problem ($m = 1$, $n = 2$), (iii) the homogeneous system has an unstable equilibrium at the origin and (iv) the system exhibits limit cycle behaviour. These properties make it a challenging problem for state regulation. The desired system dynamics for this problem is given by
$$\dot{x}_{1_d} = x_{2_d}$$
$$\dot{x}_{2_d} = a(1 - x_{1_d}^2)x_{2_d} - x_{1_d} + (1 + x_{1_d}^2 + x_{2_d}^2)u_d \qquad (63)$$
where $x_{1_d}$ represents position and $x_{2_d}$ represents velocity.
The goal was to drive $X_d \triangleq [x_{1_d}, x_{2_d}]^T \to 0$ as $t \to \infty$.
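A quick sketch confirming property (iii) above: the unforced dynamics pushes a small initial state away from the origin toward the limit cycle. The value $a = 0.5$ and the integration scheme are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Unforced Van der Pol dynamics (u = 0): small perturbations from the origin
# grow, which is what makes plain state regulation challenging here.
a, dt = 0.5, 1e-3                          # 'a' is an assumed illustrative value
x = np.array([0.05, 0.0])                  # small perturbation from the origin
for _ in range(int(20.0 / dt)):            # integrate 20 s with forward Euler
    x1, x2 = x
    x = x + dt * np.array([x2, a * (1 - x1**2) * x2 - x1])

print(np.linalg.norm(x))                   # the state has grown, not decayed
```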
Formulating a regulator problem, the desired state and control trajectories were obtained using a new method known as single network adaptive critic (SNAC) with the quadratic cost function
$$J = \frac{1}{2}\int_0^{\infty} (X^T Q X + r u_d^2)\, dt \qquad (64)$$
where $Q = \mathrm{diag}(I_2)$ and $r = 1$. Details of the SNAC technique can be obtained from [34]. In the SNAC synthesis for this problem, the critic neural network was made up of two sub-networks, each having a 2-6-1 structure.
The plant dynamics was assumed to be
$$\dot{x}_1 = x_2$$
$$\dot{x}_2 = a(1 - x_1^2)x_2 - x_1 + d(X) + (1 + x_1^2 + x_2^2)u \qquad (65)$$
where $d(X) = 2\cos(x_2)$ was the assumed unmodelled dynamics. Following the discussions in Section 2, the approximate system can be expressed as
$$\dot{x}_{1_a} = x_2 + (x_1 - x_{1_a})$$
$$\dot{x}_{2_a} = a(1 - x_1^2)x_2 - x_1 + \hat{d}(X) + (x_2 - x_{2_a}) + (1 + x_1^2 + x_2^2)u \qquad (66)$$
Since the problem is in a non-square form, the technique mentioned in Section 2 was used and $C = [-10 \;\; 10]^T$ was selected. The approximate system was expressed as
$$\dot{x}_{1_a} = x_2 + (x_1 - x_{1_a}) - 10u_s + \hat{d}_{a_1}(X, U_s)$$
$$\dot{x}_{2_a} = a(1 - x_1^2)x_2 - x_1 + 10u_s + \hat{d}_{a_2}(X, U_s) + (x_2 - x_{2_a}) + (1 + x_1^2 + x_2^2)u \qquad (67)$$
with $\hat{d}_{a_1}(X, U_s)$ expected to approximate the uncertainty $10u_s$ introduced in the first-state equation as a result of the approximate system being made square, and $\hat{d}_{a_2}(X, U_s)$ expected to approximate the uncertainty created by the algebraic sum of $d(X)$ and $-10u_s$, which are the uncertain terms in the second-state equation of the approximate system. The augmented controller in (18) was expressed as
$$V = \begin{bmatrix} 0 & -10 \\ 1 + x_1^2 + x_2^2 & 10 \end{bmatrix}^{-1} \left( -K \begin{bmatrix} x_{1_a} - x_{1_d} \\ x_{2_a} - x_{2_d} \end{bmatrix} - \begin{bmatrix} \Delta f_1 \\ \Delta f_2 \end{bmatrix} - \begin{bmatrix} x_1 - x_{1_a} \\ x_2 - x_{2_a} \end{bmatrix} \right) \qquad (68)$$
where
$$\begin{bmatrix} \Delta f_1 \\ \Delta f_2 \end{bmatrix} \triangleq \begin{bmatrix} (x_2 - x_{2_d}) + \hat{d}_{a_1}(X, U_s) \\ a(1 - x_1^2)x_2 - x_1 + \hat{d}_{a_2}(X, U_s) - [a(1 - x_{1_d}^2)x_{2_d} - x_{1_d} + (1 + x_{1_d}^2 + x_{2_d}^2)u_d] \end{bmatrix} \qquad (69)$$
The gain matrix was selected as $K = \mathrm{diag}(1, 1)$. After solving for $V$, the control variable $u$ was extracted as the first element of $V$. For the first iteration in the control solution process, $u_s = 0$ was used. With this, the explicit expression for $u$ is given by
$$u = \frac{1}{1 + x_1^2 + x_2^2}\left[ -k_1(x_1 - x_{1_d}) - (k_2 + 1)(x_2 - x_{2_d}) - \Delta f_2 - (x_1 - x_{1_a}) - (x_2 - x_{2_a}) \right] \qquad (70)$$
The basis function vectors for the two neural networks were selected as
$$C_1 = [1 \;\; \sin(x_1) \;\; \cos(x_1)]^T, \quad C_2 = [1 \;\; \sin(x_2) \;\; \cos(x_2)]^T, \quad C_3 = [1 \;\; \sin(u_s) \;\; \cos(u_s)]^T$$
$$\Phi_1(X) = \Phi_2(X) = \mathrm{kron}(C_1, \mathrm{kron}(C_2, C_3)) \qquad (71)$$
and the neural network learning parameters were selected as $\gamma_{l_1} = \gamma_{l_2} = 10$ and $\sigma_1 = \sigma_2 = 1 \times 10^{-6}$.
In Fig. 1, the resulting state trajectories for the nominal system with the nominal controller, the actual system with the nominal controller and the actual system with the adaptive controller are given. First, it is clear from the plot that the nominal control does the intended job (of driving the states to zero) for the nominal plant. Next, it can be observed that if the same nominal controller is applied to the actual plant (with the unmodelled dynamics $d(X) = 2\cos x_2$), $x_1$ cannot reach the origin. However, if the adaptive controller is applied, the resulting controller drives the states to the origin by forcing them to follow the states of the nominal system.
Fig. 1 State trajectories against time: (a) state $x_1$ against time; (b) state $x_2$ against time
Fig. 2 illustrates the control trajectories and the neural network approximations of the uncertainties in the two state equations. In Fig. 2a, a comparison between the histories of the nominal control and the adaptive control is presented. Fig. 2b shows the output of the first neural network, $\hat{d}_1(X)$, tracking the uncertainty in the first-state equation due to $-C_1 u_s$. From Fig. 2c it can be seen how well the neural network approximates the unknown function $(d(X) - C_2 u_s)$, which is critical in deriving the appropriate adaptive controller.
3.2 Double inverted pendulum problem

The next problem considered is a double inverted pendulum [19, 35]. The interesting aspects of this problem are: (i) the equations of motion consist of four states, (ii) it is a non-square problem ($n = 4$, $m = 2$) and, more importantly, (iii) it is non-affine in the control variable. In this problem, both parameter variation and unmodelled dynamics are considered simultaneously. These characteristics make this problem sufficiently challenging to demonstrate that the proposed technique works for complex problems. The nominal system dynamics for this problem is given by [35]
$$\dot{x}_{1_d} = x_{2_d}$$
$$\dot{x}_{2_d} = a_1 \sin(x_{1_d}) + b_1 + j_1 \tanh(u_{1_d}) + s_1 \sin(x_{4_d})$$
$$\dot{x}_{3_d} = x_{4_d}$$
$$\dot{x}_{4_d} = a_2 \sin(x_{3_d}) + b_2 + j_2 \tanh(u_{2_d}) + s_2 \sin(x_{2_d}) \qquad (72)$$
where for $i = 1, 2$ the parameters are defined as
$$a_i \triangleq \frac{m_i g r}{J_i} - \frac{k r^2}{4 J_i}, \quad b_i \triangleq \frac{k r}{2 J_i}(l - b), \quad j_i \triangleq \frac{u_{i_{\max}}}{J_i}, \quad s_i \triangleq \frac{k r^2}{4 J_i} \qquad (73)$$
In (72)–(73), $x_{1_d}$ and $x_{2_d}$ denote the desired position and velocity of mass 1, respectively. Similarly, $x_{3_d}$ and $x_{4_d}$ denote the desired position and velocity of mass 2, respectively. Note that the control variables $u_{i_d}$ (torques applied by the servomotors) enter the system dynamics in a non-affine fashion. The system parameters and their values are listed in Table 2.
The objectives of the nominal controllers were to make $x_{1_d}$ and $x_{3_d}$ track a reference signal $R = \sin(2\pi t/T)$, with $T = 10$. Since $x_{2_d}$ and $x_{4_d}$ are derivatives of $x_{1_d}$ and $x_{3_d}$, respectively, $x_{2_d}$ and $x_{4_d}$ must track the reference signal $\dot{R} = (2\pi/T)\cos(2\pi t/T)$. The nominal controller was designed using the dynamic inversion technique [36]. A second-order error dynamic equation $[(\ddot{X}_d - \ddot{R}) + K_d(\dot{X}_d - \dot{R}) + K_p(X_d - R) = 0]$ was made use of in the controller design, as the objective was tracking. The gain matrices used were $K_d = K_p = I_2$.
In this problem, parametric uncertainties $\Delta a_1$ and $\Delta a_2$ were added to the parameters $a_1$ and $a_2$, respectively. Functions $\tilde{f}_1(X)$ and $\tilde{f}_2(X)$ were added as unmodelled dynamic terms. The true plant equations were then of the following form
$$\dot{x}_1 = x_2$$
$$\dot{x}_2 = (a_1 + \Delta a_1)\sin(x_1) + b_1 + j_1 \tanh(u_1) + s_1 \sin(x_4) + \tilde{f}_1(X)$$
$$\dot{x}_3 = x_4$$
$$\dot{x}_4 = (a_2 + \Delta a_2)\sin(x_3) + b_2 + j_2 \tanh(u_2) + s_2 \sin(x_2) + \tilde{f}_2(X) \qquad (74)$$
To test the robustness of the proposed method, $\Delta a_1$ and $\Delta a_2$ were selected to be 20% of their corresponding nominal values. Similarly, $\tilde{f}_1(X)$ and $\tilde{f}_2(X)$ were assumed to be exponential functions of the form $K_{m_1} e^{\alpha_1 x_1}$ and $K_{m_2} e^{\alpha_2 x_3}$, respectively, with positive values for $\alpha_1$ and $\alpha_2$. The parameters $K_{m_1} = K_{m_2} = 0.1$ and $\alpha_1 = \alpha_2 = 0.01$ were chosen. In this case, the goal for the neural networks was to learn $d(X) \triangleq [0 \;\; d_2(X) \;\; 0 \;\; d_4(X)]^T$, where $d_2(X) = \Delta a_1 \sin(x_1) + K_{m_1} e^{\alpha_1 x_1}$ and $d_4(X) = \Delta a_2 \sin(x_3) + K_{m_2} e^{\alpha_2 x_3}$.
It can be seen that the system dynamics is non-affine in the control variable, and it is also a non-square problem where the number of control variables is less than the number of states.

Fig. 2 Control and uncertainty approximation trajectories against time: (a) control trajectory against time; (b) network approximation of the uncertainty in the first-state equation; (c) network approximation of the uncertainty in the second-state equation

Applying the transformations given in
Table 2: System parameter values

System parameter | Value | Units
End mass of pendulum 1 ($m_1$) | 2 | kg
End mass of pendulum 2 ($m_2$) | 2.5 | kg
Moment of inertia ($J_1$) | 0.5 | kg m$^2$
Moment of inertia ($J_2$) | 0.625 | kg m$^2$
Spring constant of connecting spring ($k$) | 100 | N/m
Pendulum height ($r$) | 0.5 | m
Natural length of spring ($l$) | 0.5 | m
Gravitational acceleration ($g$) | 9.81 | m/s$^2$
Distance between pendulum hinges ($b$) | 0.4 | m
Maximum torque input ($u_{1_{\max}}$) | 20 | Nm
Maximum torque input ($u_{2_{\max}}$) | 20 | Nm
(23), $f_0(X)$ and $[g_0(X)]$ were defined as follows
$$f_0(X) \triangleq \begin{bmatrix} x_2 \\ a_1 \sin(x_1) + b_1 + s_1 \sin(x_4) \\ x_4 \\ a_2 \sin(x_3) + b_2 + s_2 \sin(x_2) \end{bmatrix}, \quad [g_0(X)] \triangleq \begin{bmatrix} 0 & 0 \\ j_1 & 0 \\ 0 & 0 \\ 0 & j_2 \end{bmatrix} \qquad (75)$$
The actual plant equations were expressed as
$$\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X) \qquad (76)$$
Since $[g_0(X)]$ is not a square matrix,
$$C(X) = \begin{bmatrix} -10 & 10 & 0 & 0 \\ 10 & 10 & 10 & 10 \end{bmatrix}^T$$
was chosen and a square problem was formulated. The approximate system equation was
$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}(X, U) + (X - X_a) \qquad (77)$$
where $\hat{d}(X, U)$ represents $\sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X)$. To make (77) a square system, $CU_s$ is added to (77), which is rewritten as
$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + (X - X_a) + CU_s \qquad (78)$$
where $\hat{d}_a(X, U, U_s)$ is the output of the function-approximating neural networks and represents $\sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X) - CU_s$. Note that $U_{s(2 \times 1)}$ is the slack variable used to create a square control effectiveness matrix to help solve for the real control variable $U$. The gain matrix for the linear error dynamic equation was selected as $K = \mathrm{diag}(1/\tau_1, 1/\tau_2, 1/\tau_3, 1/\tau_4)$ with $\tau_1 = \tau_2 = \tau_3 = \tau_4 = 0.2$. The control solution vector was obtained as $V = [g_0(X) \;\; C]^{-1}(-[f_0(X) + \hat{d}_a(X, U, U_s) + (X - X_a) - \dot{X}_d + K(X_a - X_d)])$. The numerical solution was obtained using $V_{k+1} = [g_0(X) \;\; C]^{-1}(-[f_0(X) + \hat{d}_a(X, V_k) + (X - X_a) - \dot{X}_d + K(X_a - X_d)])$. After solving for $V$, the first two elements, which make up $U$, were extracted from $V$.
The basis function vectors were selected in the following manner
$$C_1 = [1 \;\; \sin(x_1) \;\; \cos(x_1)]^T, \quad C_2 = [1 \;\; \sin(x_2) \;\; \cos(x_2)]^T$$
$$C_3 = [1 \;\; \sin(u_1)]^T, \quad C_4 = [1 \;\; \sin(u_2)]^T, \quad C_5 = [1 \;\; \sin(u_{s_1})]^T, \quad C_6 = [1 \;\; \sin(u_{s_2})]^T$$
$$\Phi_1(X) = \Phi_2(X) = \mathrm{kron}(C_1, \mathrm{kron}(C_2, \mathrm{kron}(C_3, \mathrm{kron}(C_4, \mathrm{kron}(C_5, C_6))))) \qquad (79)$$

Fig. 3 State trajectories against time: (a) state $x_1$ (position of pendulum one) against time; (b) state $x_2$ (velocity of pendulum one) against time; (c) state $x_3$ (position of pendulum two) against time; (d) state $x_4$ (velocity of pendulum two) against time

Fig. 4 Control trajectories against time: (a) control $u_1$ against time; (b) control $u_2$ against time

$$C_1 = [1 \;\; \sin(x_3) \;\; \cos(x_3)]^T, \quad C_2 = [1 \;\; \sin(x_4) \;\; \cos(x_4)]^T$$
$$C_3 = [1 \;\; \sin(u_1)]^T, \quad C_4 = [1 \;\; \sin(u_2)]^T, \quad C_5 = [1 \;\; \sin(u_{s_1})]^T, \quad C_6 = [1 \;\; \sin(u_{s_2})]^T$$
$$\Phi_3(X) = \Phi_4(X) = \mathrm{kron}(C_1, \mathrm{kron}(C_2, \mathrm{kron}(C_3, \mathrm{kron}(C_4, \mathrm{kron}(C_5, C_6))))) \qquad (80)$$
For this problem, the neural network training parameters selected were $\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = 1 \times 10^{-6}$ and $\gamma_{l_1} = \gamma_{l_2} = \gamma_{l_3} = \gamma_{l_4} = 20$. For the first iteration of the control solution scheme, $V = [u_{1_d} \;\; u_{2_d} \;\; 0 \;\; 0]^T$ was used.
Numerical results from this problem, obtained by simulating the system dynamics with the fourth-order Runge–Kutta method [25] with step size $\Delta t = 0.01$, are presented in Figs. 3–5. State trajectories are given in Fig. 3. It can be seen that the nominal controller is inadequate to achieve satisfactory tracking. However, with adaptive tuning, the resulting modified controller does a better job of forcing the state variables to track the reference signals.
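As a side note, the fourth-order Runge–Kutta step used for the simulation has the standard form below; `f(t, x)` here is a generic state-derivative function, not the paper's specific dynamics.

```python
import numpy as np

# Standard fourth-order Runge-Kutta integration step with step size dt.
def rk4_step(f, t, x, dt):
    k1 = f(t, x)
    k2 = f(t + 0.5 * dt, x + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, x + 0.5 * dt * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
```

Applying this step repeatedly with $\Delta t = 0.01$ propagates the closed-loop state between control updates.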
The nominal and modified controller trajectories are plotted in Fig. 4. These plots indicate that the online adaptation comes up with a significantly different control history, which is the key to achieving the controller goal.
An important component of the control design procedure is proper approximation of the unknown functions as neural network outputs $\hat{d}_{a_i}(X, U, U_s)$. These unknown functions and the neural network outputs (approximations) are plotted in Fig. 5. It can be seen how efficiently and accurately the neural networks learn the unknown functions.
4 Conclusions
Dynamic systems and processes are difficult to model accurately, and/or their parameters may change with time. It is essential that these unmodelled terms or parameter changes are captured and used to adapt the controller for better performance. A model-following adaptive controller using neural networks has been developed in this paper for a fairly general class of nonlinear systems which may be non-square and non-affine in the control variable. The nonlinear system for which the method is applicable is assumed to be of known order, but it may contain matched unmodelled dynamics and/or parameter uncertainties. Simulation results have been shown for two challenging problems. The potential of this technique has been demonstrated by applying it to non-square systems (one of which is non-affine in control as well). Another distinct characteristic of the adaptation procedure presented in this paper is that it is independent of the technique used to design the nominal controller, and hence it can be used in conjunction with any known control design technique. This powerful technique can be applied in practical situations with relative ease.
5 Acknowledgment
This research was supported by NSF-USA grants 0201076
and 0324428.
Fig. 5 Neural network approximations against time
a Network approximation of uncertainty in the first state equation
b Network approximation of uncertainty in the second state equation
c Network approximation of uncertainty in the third state equation
d Network approximation of uncertainty in the fourth state equation
6 References
1 McCulloch, W.S., and Pitts, W.: 'A logical calculus of the ideas immanent in nervous activity', Bull. Math. Biophys., 1943, 9, pp. 127–147
2 Miller, W.T., Sutton, R., and Werbos, P.J. (Eds.): 'Neural networks for control' (MIT Press, 1990)
3 Hunt, K.J., Zbikowski, R., Sbarbaro, D., and Gawthrop, P.J.: 'Neural networks for control systems – a survey', Automatica, 1992, 28, (6), pp. 1083–1112
4 Barto, A.G., Sutton, R.S., and Anderson, C.W.: 'Neuron-like adaptive elements that can solve difficult control problems', IEEE Trans. Syst. Man Cybern., 1983, SMC-13, (5), pp. 834–846
5 Narendra, K.S., and Parthasarathy, K.: 'Identification and control of dynamical systems using neural networks', IEEE Trans. Neural Netw., 1990, 1, (1), pp. 4–27
6 Chen, L., and Narendra, K.S.: 'Nonlinear adaptive control using neural networks and multiple models'. Proc. American Control Conf., 2000
7 Sanner, R.M., and Slotine, J.J.E.: 'Gaussian networks for direct adaptive control', IEEE Trans. Neural Netw., 1992, 3, (6), pp. 837–863
8 Lewis, F.L., Yesildirek, A., and Liu, K.: 'Multilayer neural net robot controller with guaranteed tracking performance', IEEE Trans. Neural Netw., 1996, 7, (2), pp. 388–399
9 Khalil, H.K.: 'Nonlinear systems' (Prentice-Hall Inc., NJ, 1996, 2nd edn.)
10 Aloliwi, B., and Khalil, H.K.: 'Adaptive output feedback regulation of a class of nonlinear systems: convergence and robustness', IEEE Trans. Autom. Control, 1997, 42, (12), pp. 1714–1716
11 Seshagiri, S., and Khalil, H.K.: 'Output feedback control of nonlinear systems using RBF neural networks', IEEE Trans. Neural Netw., 2000, 11, (1), pp. 69–79
12 Enns, D., Bugajski, D., Hendrick, R., and Stein, G.: 'Dynamic inversion: an evolving methodology for flight control design', Int. J. Control, 1994, 59, (1), pp. 71–91
13 Lane, S.H., and Stengel, R.F.: 'Flight control using non-linear inverse dynamics', Automatica, 1988, 24, (4), pp. 471–483
14 Ngo, A.D., Reigelsperger, W.C., and Banda, S.S.: 'Multivariable control law design for a tailless airplane'. Proc. AIAA Conf. on Guidance, Navigation and Control, AIAA-96-3866, 1996
15 Slotine, J.-J.E., and Li, W.: 'Applied nonlinear control' (Prentice Hall, 1991)
16 Kim, B.S., and Calise, A.J.: 'Nonlinear flight control using neural networks', AIAA J. Guidance, Control, Dynamics, 1997, 20, (1), pp. 26–33
17 Leitner, J., Calise, A., and Prasad, J.V.R.: 'Analysis of adaptive neural networks for helicopter flight controls', AIAA J. Guidance, Control, Dynamics, 1997, 20, (5), pp. 972–979
18 McFarland, M.B., Rysdyk, R.T., and Calise, A.J.: 'Robust adaptive control using single-hidden-layer feed-forward neural networks'. Proc. American Control Conf., 1999, pp. 4178–4182
19 Hovakimyan, N., Nardi, F., Calise, A.J., and Lee, H.: 'Adaptive output feedback control of a class of nonlinear systems using neural networks', Int. J. Control, 2001, 74, (12), pp. 1161–1169
20 Hovakimyan, N., Nardi, F., Nakwan, K., and Calise, A.J.: 'Adaptive output feedback control of uncertain systems using single hidden layer neural networks', IEEE Trans. Neural Netw., 2002, 13, (6), pp. 1420–1431
21 Calise, A.J., Lee, S., and Sharma, M.: 'Development of a reconfigurable flight control law for the X-36 tailless fighter aircraft'. Proc. AIAA Conf. on Guidance, Navigation and Control, Denver, CO, 2000
22 Balakrishnan, S.N., and Huang, Z.: 'Robust adaptive critic based neurocontrollers for helicopter with unmodeled uncertainties'. Proc. 2001 AIAA Conf. on Guidance, Navigation and Control, 2001
23 Karloff, H.: 'Linear programming' (Birkhauser Boston, 1991)
24 Paradiso, J.A.: 'A highly adaptable method of managing jets and aerosurfaces for control of aerospace vehicles', J. Guidance, Control, Dynamics, 1991, 14, (1), pp. 44–50
25 Gupta, S.K.: 'Numerical methods for engineers' (Wiley Eastern Ltd, 1995)
26 Soloway, D., and Haley, P.: 'Aircraft reconfiguration using generalized predictive control'. Proc. American Control Conf., Arlington, VA, USA, 2001, pp. 2924–2929
27 Soloway, D., and Haley, P.: 'Neural generalized predictive control: a Newton–Raphson implementation'. Proc. IEEE CCA/ISIC/CACSD, 1996
28 Lin, W.: 'Stabilization of non-affine nonlinear systems via smooth state feedback'. Proc. 33rd Conf. on Decision and Control, Lake Buena Vista, FL, December 1994
29 Lin, W.: 'Feedback stabilization of general nonlinear control systems: a passive system approach', Syst. Control Lett., 1995, 25, pp. 41–52
30 Ham, F.M., and Kostanic, I.: 'Principles of neurocomputing for science and engineering' (McGraw Hill, Inc., 2001)
31 Hassoun, M.H.: 'Fundamentals of artificial neural networks' (MIT Press, Cambridge, MA, 1995)
32 Khalil, H.K.: 'Nonlinear systems' (Prentice-Hall Inc., NJ, 2002, 3rd edn.)
33 Yesildirek, A.: 'Nonlinear systems control using neural networks'. Ph.D. thesis, University of Texas, Arlington, 1994
34 Padhi, R., Unnikrishnan, N., and Balakrishnan, S.N.: 'Optimal control synthesis of a class of nonlinear systems using single network adaptive critics'. Proc. American Control Conf., 2004
35 Spooner, J.T., and Passino, K.M.: 'Decentralized adaptive control of nonlinear systems using radial basis neural networks', IEEE Trans. Autom. Control, 1999, 44, (11), pp. 2050–2057
36 Padhi, R., and Balakrishnan, S.N.: 'Implementation of pilot commands in aircraft control: a new dynamic inversion approach'. Proc. AIAA Guidance, Navigation, and Control Conf., Austin, TX, USA, 2003
7 Appendix
The most general form of the approximate system dynamics will be considered in this proof. A resolution to the fixed-point problem that arises because of the particular structure of the control solution equation is discussed in this section. The approximate plant model can be represented by
$$\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + (X - X_a) + C(X)U_s \qquad (81)$$
Substitution of (81) into the stable error dynamic equation $(\dot{X}_a - \dot{X}_d) + K(X_a - X_d) = 0$ leads to
$$\left(f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + (X - X_a) + C(X)U_s - \dot{X}_d\right) + K(X_a - X_d) = 0 \qquad (82)$$
Equation (82) can be rewritten as
$$[g_0(X) \quad C(X)]\,V = H - \hat{d}_a(X, U, U_s) \qquad (83)$$
where $H \triangleq -\left((f_0(X) + (X - X_a) - \dot{X}_d) + K(X_a - X_d)\right)$. Define $G \triangleq [g_0(X) \quad C(X)]$. From (83), the control vector $V$ can be solved for as
$$V = G^{-1}\left(H - \hat{d}_a(X, V)\right) \qquad (84)$$
Equation (84) represents a fixed-point problem to be solved at each instant. Assuming the state vector $X$ to be fixed, let the mapping in (84) be represented by $T$. Let $S$ be a closed subset of a Banach space $\mathcal{X}$ and let $T$ be a mapping that maps $S$ into $S$. The contraction mapping theorem is as follows [32].
Suppose that $\|T(x) - T(y)\| \le \rho\|x - y\|$, $\forall\, x, y \in S$, $0 \le \rho < 1$; then there exists a unique vector $x^{\ast} \in S$ satisfying $x^{\ast} = T(x^{\ast})$
x
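The successive-substitution scheme implied by (84) can be demonstrated on a simple contraction. The map `T` below is an illustrative example with contraction constant 0.5, not the control mapping itself; the geometric convergence it exhibits is what the contraction mapping theorem guarantees.

```python
import numpy as np

# Successive substitution x_{k+1} = T(x_k): converges geometrically to the
# unique fixed point whenever T is a contraction (rho < 1).

def fixed_point(T, x0, tol=1e-12, max_iter=500):
    x = x0
    for _ in range(max_iter):
        x_new = T(x)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x_new

# Example contraction on R^2: 0.5*sin has Jacobian norm bounded by 0.5 < 1
T = lambda x: 0.5 * np.sin(x) + np.array([1.0, -0.5])
```

Running `fixed_point(T, np.zeros(2))` returns a vector `x_star` with `x_star == T(x_star)` to numerical precision, regardless of the starting guess, mirroring how (84) is resolved at each control instant.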