
802 IEEE TRANSACTIONS ON AUTOMATIC CONTROL, VOL. 36, NO. 7, JULY 1991

The Åström-Wittenmark Self-Tuning Regulator Revisited and ELS-Based Adaptive Trackers

Lei Guo, Member, IEEE, and Han-Fu Chen

Manuscript received June 8, 1990; revised December 27, 1990. Paper recommended by Past Associate Editor, P. A. Ioannou. This work was supported by the National Natural Science Foundation of China.
The authors are with the Institute of Systems Science, Academia Sinica, Beijing 100080, P.R. China.
IEEE Log Number 9100498.
0018-9286/91/0700-0802$01.00 © 1991 IEEE

Abstract-Although considerable progress has been made in stochastic adaptive control, the convergence of the original self-tuning regulator proposed by Åström and Wittenmark in 1973 has remained an open problem. Since it is attractive in theory and important in applications, we return to this problem and give a rigorous proof of its stability and optimality. Related problems, such as the convergence of the extended-least-squares (ELS)-based adaptive tracker, are also considered in this paper.

I. A LONGSTANDING OPEN PROBLEM

Let us consider the following ARMAX model:

$$A(z) y_n = B(z) u_{n-1} + C(z) w_n, \quad n \ge 0, \tag{1.1}$$

$$A(z) = I + A_1 z + \cdots + A_p z^p, \quad p \ge 1,$$
$$B(z) = B_1 + B_2 z + \cdots + B_q z^{q-1}, \quad q \ge 1,$$
$$C(z) = I + C_1 z + \cdots + C_r z^r, \quad r \ge 0$$

where $y_n$, $u_n$, and $w_n$ are the m-dimensional system output, input, and random disturbance, respectively, $y_n = 0$, $u_n = 0$, $w_n = 0$ for $n < 0$, and $A(z)$, $B(z)$, and $C(z)$ are polynomials in the backward-shift operator $z$ with unknown matrix coefficients $A_i$, $B_j$, and $C_k$ and with known upper bounds $p$, $q$, and $r$ for the orders.

Let us denote

$$\theta = [-A_1 \;\cdots\; -A_p \;\; B_1 \;\cdots\; B_q \;\; C_1 \;\cdots\; C_r]^{\tau}.$$

The most commonly used method for estimating $\theta$ is the following extended least-squares (ELS) algorithm:

$$\theta_{n+1} = \theta_n + a_n P_n \varphi_n (y_{n+1} - \theta_n^{\tau} \varphi_n)^{\tau}, \quad n \ge 0, \tag{1.2}$$
$$P_{n+1} = P_n - a_n P_n \varphi_n \varphi_n^{\tau} P_n, \quad a_n = (1 + \varphi_n^{\tau} P_n \varphi_n)^{-1}, \tag{1.3}$$
$$\varphi_n = [y_n^{\tau} \;\cdots\; y_{n-p+1}^{\tau} \;\; u_n^{\tau} \;\cdots\; u_{n-q+1}^{\tau} \;\; \hat{w}_n^{\tau} \;\cdots\; \hat{w}_{n-r+1}^{\tau}]^{\tau}, \tag{1.4}$$
$$\hat{w}_n = y_n - \theta_n^{\tau} \varphi_{n-1}, \quad n \ge 0; \qquad \hat{w}_n = 0, \quad n < 0 \tag{1.5}$$

with arbitrary initial values $\theta_0$ and $P_0 > 0$.

The assumptions made on system (1.1) are as follows:

A1: $\{w_n, \mathcal{F}_n\}$ is a martingale difference sequence satisfying the following conditions:

$$\sup_{n \ge 0} E\left[\|w_{n+1}\|^{\beta} \mid \mathcal{F}_n\right] < \infty, \quad \text{a.s.}, \quad \beta > 2 \tag{1.6}$$

and

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} w_i w_i^{\tau} = R > 0, \quad \text{a.s.} \tag{1.7}$$

A2: $C^{-1}(e^{i\lambda}) + C^{-\tau}(e^{-i\lambda}) - I > 0, \quad \forall \lambda \in [0, 2\pi]$.

A3: $\det B(z) \ne 0, \quad \forall z: |z| \le 1$.

We recall that A2 is known as the strictly-positive-real condition, which is automatically satisfied if $C(z) = I$, and A3 is known as the minimum phase condition.

Let us formulate the basic problem discussed in the paper.

Basic problem: Let $\{y_n^*\}$ be a given almost surely (a.s.) bounded reference signal and let $y_{n+1}^*$ be $\mathcal{F}_n$-measurable. Under conditions A1-A3 it is required to design an adaptive control $u_n$ purely based on the ELS algorithm (1.2)-(1.5) in order that

1) the closed-loop system is globally stable, i.e.,

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \left(\|y_i\|^2 + \|u_i\|^2\right) < \infty, \quad \text{a.s.} \tag{1.8}$$

2) the tracking error is minimized

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} (y_i - y_i^*)(y_i - y_i^*)^{\tau} = R, \quad \text{a.s.} \tag{1.9}$$

3) the estimate $\theta_n$ given by (1.2)-(1.5) is strongly consistent.

An early reference on the basic problem is the self-tuning regulator proposed by Åström and Wittenmark [1] in 1973. The goal of the adaptive control in [1] is to minimize the output variance or the tracking error with $y_n^* \equiv 0$ for a single-input and single-output system with $C(z) = 1$, and with $B_1$ known and different from zero.

Since its appearance, the self-tuning regulator has had great success in applications and naturally has drawn much attention from control theorists in an attempt to establish its convergence. The first significant progress in this direction was made by Goodwin, Ramadge, and Caines [2]. They have shown (1.8) and (1.9) for an adaptive tracker that is not based on ELS but on the stochastic approximation (SA)
GUO AND CHEN: ASTROM-WITTENMARK SELF-TUNING REGULATOR AND ELS-BASED TRACKERS

algorithm, a modification of (1.2)-(1.5). Later, Becker, Kumar, and Wei [3] have shown that in general the SA estimate $\theta_n$ is not strongly consistent in the adaptive scheme of [2] unless the reference signal $\{y_n^*\}$ is sufficiently rich in some sense (see [4, p. 569] and [5]). To get strong consistency of $\theta_n$ when $\{y_n^*\}$ is an arbitrary but bounded signal, in [6] and [7] by invoking a "continuously disturbed controller," the strong consistency of the parameter estimate and the global stability of the closed-loop system have been achieved simultaneously under Conditions A1-A3 and an identifiability condition. However, in these papers the estimate is carried out not by the ELS algorithm and the tracking is no longer optimal but suboptimal. In order that the external excitation does not worsen the tracking accuracy, a diminishing excitation technique is applied in [4], where requirements 1)-3) of our basic problem are met; however, again they are established for the SA algorithm.

As pointed out by Sin and Goodwin [8], "it seems that in practically all applications of stochastic adaptive control, a least squares iteration is used," since "it generally has much superior rates of convergence compared with stochastic approximation." Consequently, research on the ELS-based adaptive tracker is continuing. With modifications of the Åström-Wittenmark (A-W) self-tuning regulator, Lai and Wei [9] have obtained a sharp convergence rate for the tracking error. They assume that the open-loop system is stable, the random noise is uniformly bounded, and that a certain identifiability condition is satisfied. Under similar assumptions (but without requiring boundedness of the noise), the present authors get the convergence rates for both tracking and parameter estimation in [10]. Again, the algorithm used there is a modified version of the A-W self-tuning regulator. The main restriction of [9] and [10] is that there the stability and optimality are established only for open-loop stable systems. Certainly, if the parallel algorithm introduced in [11] is used, then the assumption of open-loop stability can easily be removed [12]. The idea is that besides the ELS algorithm, a parallel SA algorithm, aiming at slowing down the growth rate of the system input and output, is used for a finite period of operation. A similar idea is used also in [13]. However, such parallel algorithms are complicated, and the heavy computation burden may prevent them from being applied in practice. Moreover, in [9], [10], [12], and [13], stability of the closed-loop system is established via consistency of parameter estimates, for which identifiability conditions are required. Such conditions are not necessary for achieving stability and optimality of adaptive tracking systems (see, e.g., [2]).

Recently, assuming independence and Gaussianity of $\{w_n\}$ with $C(z) = I$, and having noticed the connection between the least-squares estimate and the conditional expectation for an unknown parameter, Kumar [14] has shown that the least-squares based adaptive tracker converges outside an exceptional set of Lebesgue measure zero in the parameter space of $\theta$. In this approach, it is necessary to preclude $\theta$ corresponding to the system under consideration from being in the exceptional set, in which convergence is not guaranteed for almost all sampling points. Moreover, the exceptional set may vary with various selections of initial values for the estimation algorithm. So, in his conclusions, Kumar [14] points out, "it would be of considerable interest to remove at least the Gaussianity restriction on the disturbance $w_n$. The whiteness restriction cannot be removed so easily. . . It would also be of considerable interest to show that the exceptional set of Theorem 1 is really an empty set."

Thus, in summary, as noted by Kumar [14], even "the convergence of the original self-tuning regulator of Åström and Wittenmark which uses a recursive least-squares parameter estimator followed by a minimum variance certainty equivalent control law, has been an open question for more than fifteen years." In this paper we will rigorously prove the convergence of the original A-W self-tuning regulator, and at the same time give a solution to the basic problem stated previously. Some preliminary results on this topic are presented in [15].

II. CONVERGENCE OF A-W SELF-TUNING TRACKER

The A-W self-tuning tracker discussed in this section is an extension of the self-tuning regulator introduced by Åström-Wittenmark [1] based on the least-squares estimation for single-input and single-output systems with $B_1$ known. The extension consists of the reference signal $\{y_n^*\}$, which is an arbitrary bounded sequence not necessarily equal to zero, and the system may be multidimensional.

Definition: Let $B_1$ be known and nondegenerate, $\{y_n^*\}$ be an a.s. bounded sequence of m-dimensional vectors with $y_{n+1}^*$ being $\mathcal{F}_n$-measurable, and let $u_n$ be defined from

$$B_1 u_n + \theta_n^{\tau} \varphi_n = y_{n+1}^*, \quad n \ge 0 \tag{2.1}$$

where $\theta_n$ is the ELS estimate for

$$[-A_1 \;\cdots\; -A_p \;\; B_2 \;\cdots\; B_q \;\; C_1 \;\cdots\; C_r]^{\tau} \tag{2.2}$$

calculated according to the following ELS recursion:

$$\theta_{n+1} = \theta_n + a_n P_n \varphi_n (y_{n+1} - B_1 u_n - \theta_n^{\tau} \varphi_n)^{\tau}, \quad n \ge 0 \tag{2.3}$$
$$P_{n+1} = P_n - a_n P_n \varphi_n \varphi_n^{\tau} P_n, \quad a_n = (1 + \varphi_n^{\tau} P_n \varphi_n)^{-1}, \tag{2.4}$$
$$\varphi_n = [y_n^{\tau} \;\cdots\; y_{n-p+1}^{\tau} \;\; u_{n-1}^{\tau} \;\cdots\; u_{n-q+1}^{\tau} \;\; \hat{w}_n^{\tau} \;\cdots\; \hat{w}_{n-r+1}^{\tau}]^{\tau} \tag{2.5}$$
$$\hat{w}_n = y_n - B_1 u_{n-1} - \theta_n^{\tau} \varphi_{n-1}, \quad n \ge 0; \qquad \hat{w}_n = 0, \quad n < 0 \tag{2.6}$$

with arbitrary initial values $\theta_0$ and $P_0 > 0$.

The adaptive control system (1.1), (2.1)-(2.6) is called the A-W self-tuning tracker.

We note that the proposed recursion (2.3)-(2.6) is the same as that used in Åström-Wittenmark [1] in the white noise case. However, in the general colored-noise case the widely used a posteriori errors, which are slightly different from the a priori errors appearing in Åström-Wittenmark [1], are applied here. The following theorem shows the stability and optimality of the A-W self-tuning tracker (2.1)-(2.6).
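As an illustration of the recursion (2.1)-(2.6), the following Python sketch simulates a scalar A-W self-tuning tracker. It is a minimal sketch under assumptions that are not taken from the paper: m = 1 and p = q = r = 1 (so the regressor (2.5) reduces to $[y_n, \hat{w}_n]^{\tau}$), the numeric values of `a`, `b1`, `c`, unit-variance Gaussian noise, a square-wave reference, and the hypothetical function name `simulate_aw_tracker`. The chosen plant is open-loop unstable, satisfies A2 because $|c| < 1$, and satisfies A3 because $B(z) = b_1 \ne 0$.

```python
import random


def simulate_aw_tracker(n_steps=5000, a=-1.2, b1=1.0, c=0.5, seed=0):
    """Scalar A-W tracker for y[n+1] = -a*y[n] + b1*u[n] + c*w[n] + w[n+1].

    Returns the average squared tracking error and the final estimate of
    theta = [-a, c]. All numeric defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    theta = [0.0, 0.0]                # arbitrary initial estimate theta_0
    P = [[1.0, 0.0], [0.0, 1.0]]      # P_0 > 0
    y = w = w_hat = 0.0
    sum_sq_err = 0.0
    for n in range(n_steps):
        # Bounded reference y*_{n+1}, known at time n (assumed square wave).
        y_ref = 1.0 if (n // 50) % 2 == 0 else -1.0
        phi = [y, w_hat]              # regressor (2.5) for p = q = r = 1
        pred = theta[0] * phi[0] + theta[1] * phi[1]
        u = (y_ref - pred) / b1       # control law (2.1)
        w_next = rng.gauss(0.0, 1.0)
        y_next = -a * y + b1 * u + c * w + w_next   # plant (1.1)
        # Gain and covariance update (2.3)-(2.4); P stays symmetric.
        p_phi = [P[0][0] * phi[0] + P[0][1] * phi[1],
                 P[1][0] * phi[0] + P[1][1] * phi[1]]
        a_n = 1.0 / (1.0 + phi[0] * p_phi[0] + phi[1] * p_phi[1])
        innovation = y_next - b1 * u - pred
        theta = [theta[k] + a_n * p_phi[k] * innovation for k in range(2)]
        P = [[P[i][j] - a_n * p_phi[i] * p_phi[j] for j in range(2)]
             for i in range(2)]
        # A posteriori residual (2.6), computed with the updated estimate.
        w_hat = y_next - b1 * u - (theta[0] * phi[0] + theta[1] * phi[1])
        sum_sq_err += (y_next - y_ref) ** 2
        y, w = y_next, w_next
    return sum_sq_err / n_steps, theta


if __name__ == "__main__":
    mse, theta = simulate_aw_tracker()
    print("average squared tracking error:", mse)
    print("final estimate of [-a, c]:", theta)
```

Under these assumptions the averaged squared tracking error would be expected to settle near the noise variance (1 here) over long runs, in the spirit of (1.9); the exact value depends on the seed and the run length, and the sketch makes no claim about consistency of the parameter estimate.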
Theorem 1: If Conditions A1-A3 are satisfied, then the A-W self-tuning tracker (1.1), (2.1)-(2.6) is stable and optimal in the sense that (1.8) and (1.9) hold. Furthermore, let $\{d_n\}$ be a nondecreasing positive sequence satisfying

$$\sup_{n \ge 0} \frac{d_{n+1}}{d_n} < \infty, \qquad \|w_n\|^2 = O(d_n), \quad \text{a.s.} \tag{2.7}$$

Then the following convergence rate holds:

$$\sum_{i=1}^{n} \|y_i - y_i^* - w_i\|^2 = O(n^{\varepsilon} d_n), \quad \text{a.s.}, \quad \forall \varepsilon > 0. \tag{2.8}$$

The proof is given in Section IV.

Remark: We note at once that the sequence $\{d_n\}$ defined in Theorem 1, in fact, can be taken as

$$d_n = n^{\delta}, \quad \delta \in \left(\tfrac{2}{\beta}, 1\right) \tag{2.9}$$

where $\beta$ is given by (1.6). To see this, by (1.6) and the Markov inequality we have

$$\sum_{n=1}^{\infty} P\left(\|w_{n+1}\|^2 \ge n^{\delta} \mid \mathcal{F}_n\right) \le \sum_{n=1}^{\infty} \frac{E\left[\|w_{n+1}\|^{\beta} \mid \mathcal{F}_n\right]}{n^{\beta\delta/2}} < \infty, \quad \text{a.s.}$$

and by the conditional Borel-Cantelli lemma (e.g., [16, p. 55]) we know that

$$\|w_{n+1}\|^2 = O(n^{\delta}), \quad \text{a.s.}, \quad \forall \delta \in \left(\tfrac{2}{\beta}, 1\right). \tag{2.10}$$

Hence (2.9) is true. Moreover, if there are further assumptions on the noise sequence $\{w_n\}$, the convergence rate in (2.8) can be improved. For example, if $\{w_n\}$ is a Gaussian white noise sequence, then again by the Borel-Cantelli lemma and the Gaussian density function it is easily shown that $d_n$ can be taken as $d_n = \log n$ (see also [17]); if $\{w_n\}$ is a bounded sequence, then $d_n = 1$.

III. CONVERGENCE OF ELS-BASED ADAPTIVE TRACKER (WITH $B_1$ UNKNOWN)

In the last section, we have claimed the stability and optimality of an adaptive tracker when the leading matrix coefficient in $B(z)$ is known. Here, we shall no longer impose the availability of $B_1$ and will use (1.2)-(1.5) to estimate the whole $\theta$ including $B_1$.

We first give a solution to our basic problem but with the consistency of the parameter estimate ignored.

Let us write the estimate $\theta_n$ given by (1.2)-(1.5) in the block form

$$\theta_n = [-A_{1n} \;\cdots\; -A_{pn} \;\; B_{1n} \;\cdots\; B_{qn} \;\; C_{1n} \;\cdots\; C_{rn}]^{\tau} \tag{3.1}$$

and define

$$r_n = e + \sum_{i=0}^{n} \|\varphi_i\|^2, \quad n \ge 0. \tag{3.2}$$

The certainty equivalence principle suggests defining the adaptive control from

$$\theta_n^{\tau} \varphi_n = y_{n+1}^*, \quad n \ge 1 \tag{3.3}$$

or

$$u_n = B_{1n}^{-1}\{y_{n+1}^* + (B_{1n} u_n - \theta_n^{\tau} \varphi_n)\}, \quad \text{if } \det[B_{1n}] \ne 0. \tag{3.4}$$

The first problem arising here is that $u_n$ may not be well-defined because the set $\{\det[B_{1n}] = 0\}$ may have a positive probability unless some sort of continuity assumption is imposed on the distribution of $w_n$ (see [4] and [18] for related discussions). However, we do not intend to make such a restriction on the distributions of $w_n$; instead, we will slightly modify $B_{1n}$ when we define $u_n$, so that it is kept from being zero or being too small.

As a matter of fact, we are replacing "$B_{1n}^{-1}$" in (3.4) by any $\mathcal{F}_n$-measurable $\hat{B}_{1n}^{-1}$ that satisfies the following conditions (3.5) and (3.6):

when defining the adaptive control, where $r_n$ is given by (3.2).

We note at once that 1) $\hat{B}_{1n}$ is asymptotically equivalent to $B_{1n}$, since as will be shown later $r_n \to \infty$ as $n \to \infty$; 2) for parameter estimation the ELS algorithm is not modified.

For single-input and single-output systems (m = 1), it is immediately verified that $\hat{B}_{1n}$ given by the following simple modification (3.7), (3.8) from $B_{1n}$ satisfies (3.5) and (3.6):

For the multidimensional case, one way of defining $\hat{B}_{1n}$, which is an analog of (3.7), is as follows. Let the singular value decomposition of $B_{1n}$ be (see, e.g., [19, p. 318])

$$B_{1n} = V_n \begin{bmatrix} \Sigma_n & 0 \\ 0 & 0 \end{bmatrix} U_n^{\tau} \tag{3.9}$$

where $U_n$ and $V_n$ are orthogonal matrices and $\Sigma_n$ is a positive definite diagonal matrix. The following choice corresponds to (3.7) and satisfies (3.5) and (3.6)

$$\hat{B}_{1n} = \begin{cases} B_{1n}, & \ldots \\[4pt] B_{1n} + \dfrac{V_n U_n^{\tau}}{(\log r_{n-1})^{1/2}}, & \text{otherwise.} \end{cases} \tag{3.10}$$

In accordance with (3.4) we define $u_n$ by

$$u_n = \hat{B}_{1n}^{-1}\{y_{n+1}^* + (B_{1n} u_n - \theta_n^{\tau} \varphi_n)\}. \tag{3.11}$$

Definition: The adaptive control system (1.1)-(1.5) and (3.11) with (3.5) and (3.6) satisfied for an a.s. bounded $\{y_n^*\}$ with $y_{n+1}^*$ being $\mathcal{F}_n$-measurable is called the ELS-based adaptive tracker.

Theorem 2: Under conditions A1-A3 the ELS-based adaptive tracker is stable and optimal in the sense that (1.8) and (1.9) hold. Moreover

$$\|y_n\|^2 + \|u_n\|^2 = O(n^{\varepsilon} d_n), \quad \text{a.s.}, \quad \forall \varepsilon > 0 \tag{3.12}$$

where $d_n$ is defined in Theorem 1.

The proof is given in Section IV.

Theorems 1 and 2 have established the convergence of ELS-based adaptive trackers without paying attention to the consistency of the estimates. We now give a solution to the basic problem by using the diminishing excitation technique developed in [4] and [20].

We first define the excitation source. Let $\{\varepsilon_i\}$ be a sequence of m-dimensional i.i.d. random vectors independent of $\{w_i, y_i^*\}$ with

$$E\varepsilon_i = 0, \qquad E\varepsilon_i \varepsilon_i^{\tau} = I, \qquad \|\varepsilon_i\| \le \sigma$$

where $\sigma$ is a constant.

Replacing (3.11), we define a vector $u_n^0$ as

$$u_n^0 = \hat{B}_{1n}^{-1}\{y_{n+1}^* + (B_{1n} u_n - \theta_n^{\tau} \varphi_n)\} \tag{3.13}$$

and the diminishingly excited input $u_n$ as

$$u_n = u_n^0 + v_n \tag{3.14}$$

where

$$v_n = \frac{\varepsilon_n}{r_{n-1}^{\varepsilon/2}}, \quad \varepsilon \in \left(0, \frac{1}{2(t+1)}\right), \quad t = \max(p, q, r) + mp - 1. \tag{3.15}$$

Theorem 3: Assume that conditions A1-A3 hold, $A(z)$, $B(z)$, and $C(z)$ have no common left factor, and $[A_p \; B_q \; C_r]$ is of full-row rank. Then the adaptive tracker consisting of (1.1)-(1.5), (3.10), (3.13), and (3.14) solves the basic problem. To be precise, (1.8) and (1.9) are fulfilled and

$$\ldots, \quad \text{a.s.} \tag{3.16}$$

and

$$\sum_{i=1}^{n} \|y_i - y_i^* - w_i\|^2 = O(n^{1-\varepsilon}) + O(d_n), \quad \text{a.s.} \tag{3.17}$$

where $\varepsilon$ is given by (3.15) and $d_n$ is defined in Theorem 1.

Remark: Theorems 1-3 remain valid if the boundedness of $\{y_n^*\}$ is replaced by

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \|y_i^*\|^2 < \infty \quad \text{and} \quad \|y_n^*\| = O(n^{b}), \quad \text{a.s.}, \quad \forall b > 0.$$

IV. PROOF OF THE THEOREMS

We start with lemmas.

Lemma 1: For $\{\theta_n\}$ generated by either the algorithm (1.2)-(1.5) or the algorithm (2.3)-(2.6), if conditions A1 and A2 hold, and $u_n$ is $\mathcal{F}_n$-measurable, then

1) $$\|\tilde{\theta}_{n+1}\|^2 = O\left(\frac{\log r_n}{\lambda_{\min}(n)}\right), \quad \text{a.s.} \tag{4.1}$$

2) $$\sum_{i=1}^{n+1} \|\hat{w}_i - w_i\|^2 = O(\log r_n), \quad \text{a.s.} \tag{4.2}$$

3) $$\sum_{i=1}^{n} \frac{\|\varphi_i^{\tau} \tilde{\theta}_i\|^2}{1 + \varphi_i^{\tau} P_i \varphi_i} = O(\log r_n), \quad \text{a.s.} \tag{4.3}$$

where $\tilde{\theta}_n = \theta - \theta_n$ and $\lambda_{\min}(n)$ is the minimum eigenvalue of

$$P_{n+1}^{-1} = \sum_{i=0}^{n} \varphi_i \varphi_i^{\tau} + P_0^{-1}. \tag{4.4}$$

Except for (4.3), whose proof is given in the Appendix, this lemma is not new. We note that Lai and Wei [21] were the first to use the condition $\log r_n / \lambda_{\min}(n) \to 0$ in guaranteeing strong consistency of the LS estimates.

Remark: The use of the same notation for algorithms (1.2)-(1.5) and (2.3)-(2.6) should not cause confusion. For example, $r_n$ is defined by (3.2) with $\varphi_i$ given by (1.4) for the algorithm (1.2)-(1.5), but when the algorithm (2.3)-(2.6) is under consideration, $\varphi_i$ in (3.2) should be understood as that given by (2.5).

The following Lemma 2 is crucial in establishing our results. Its proof benefits from some ideas in [22].

Lemma 2: Under Conditions A1-A3, for the ELS-based adaptive tracker (1.1)-(1.5) and (3.11) with $\hat{B}_{1n}$ satisfying (3.5) and (3.6), and for the A-W self-tuning tracker (1.1), (2.1)-(2.6), the sequence $\{\varphi_n\}$ has the following estimate:

$$\|\varphi_n\|^2 = O(r_n^{\varepsilon} d_n), \quad \text{a.s.}, \quad \forall \varepsilon > 0 \tag{4.5}$$

where $d_n$ is defined in Theorem 1.

Before proceeding to prove Lemma 2, we first show that given (4.5) it is a rather easy task to conclude Theorems 1 and 2.

Proof of Theorems 1 and 2: By (1.1), (1.7), and (3.2), it is easy to show that (see, for example, [17, Eq. (33), p. 1097])

$$\liminf_{n \to \infty} \frac{r_n}{n} > 0, \quad \text{a.s.} \tag{4.6}$$

So, by (2.9)

$$d_n = O(r_n^{\delta}), \quad \text{a.s.}, \quad \forall \delta \in \left(\tfrac{2}{\beta}, 1\right). \tag{4.7}$$

From Lemma 1-3) and (4.5) it follows that (4.12)-(4.14) that r, = o(r,) + O(n)a s . . From this, it
follows that
rn = o(n), a.s. (4.15)
Hence it is easy to see that (1.8) holds, while (3.12) follows
from Lemma 2 and (4.15).
By (3.6), (4.2), (4.9), and (4.15), it is seen that
1 "
lim -
n-rm n
IIg;$ok - ABikuk
k=l
+
e7(ppk"- (Pk)1l2 = 0 a.s.
= O(1og r,,) + O(rid, log r , ) , VE >0
for the case of Theorem 2 or with the term A&kuk removed
hence by the arbitrariness of e in the case of Theorem 1. From this, (4.11), and (1.7), it is
easy to see that (1.9) is true. This completes the proof of
2
i=O
\I(p$fill2 = O(r;d,,), a.s. VE > 0. (4.8) Theorem 2, while for Theorem 1 it remains to prove (2.8).
By (4.2) and (4.8), it is clear that
n
Consequently, by the arbitrariness of 6 and e in (4.7) and Ilg;'Pk + e T ( ( P E - 'Pk)1I2 = O(n'd,), VE >0
(4.8) k= 1

which in conjunction with (4.11) (with A&kuk removed)


implies (2.8). The proof is completed.
Proof of Lemma 2: We here prove the lemma for
(l.l), (2.1)-(2.6) (i.e., for the case of Theorem 1) and put
Set the proof for (1.1)-(1.5) and (3.11) (i.e., for the case of
': =
P [Y,' -..Y,'-p+l u;u;-1 ... u;-*+1 Theorem 2) in the Appendix.
We first show that there are constants c > 0 and X E (0, 1)
w,' * - w,'-~+1] (4.10) such that
where s = n - 1 for Theorem 1 and s = n for Theorem 2. II Yn+ 1 II 5 cun6n~n+ En (4.16)
In the case of Theorem 2, by (3.11) we have
where
= 87'Pk + e7('PE - 'Pk) + w k + l
Yk+l
-- CY, =
IIg;'P, II
, 6, = tr ( P , - P,,+l) (4.17)
= e;'Pk - A B 1 k U k + Y:+1 1 + VnPnPo,
+ e7('P! - ' P k ) + Wk+l,
Ak1k = b1k - B1k (4.11)
L, = cn

i=O
X"-iIIy,)12 (4.18)

whjle in the case of Theorem 1, (4.11) holds with the term and { tn}is a nondecreasing positive sequence satisfying
AB,,uk removed. Here, we should keep in mind that the
definitions of 8, e,, p,, in Theorem 1 are different from those
4, = o(d, log r, + log2 r,) . (4.19)
in Theorem 2. Noticing (4.2) and (4.9) from (4.11) we see Let p,"be defined by (4.10) with s = n - 1. Then in view
of (2.2), it follows from (1.1) that
2
k = O IIYk+1112 = o(r:)+ '( 5 k=O l l u k 1 l 2 )
Y , + ~= e 7 d + Blun +~,+l. (4.20)
+ O(log r,) + o(n) So taking account of (2.1), we have
= o ( r , ) + o(n). (4.12) Yn+l = BlU, + e7(P, + W n + l + q ( P : - $0,)
From this and condition A3, it is evident that -- Y *~ + I - e i ' ~ ,+ e 7 ~ +, W n + , + e 7 ( ~ -: (0,)
= 6,"Pn +Y:+1 + W n + l + ~'('PR - 'P,). (4.21)

= o ( r , ) + o(n). (4.13) By (2.7), (4.2), and the boundedness of { $} from (4.21),


it follows that
By using Lemma 1-2) and (1.7), we have
n n IIYn+1112 5 211J;pnI12+ O(log rn) + O(dn)
C II + i l l 2
i= 1
52 C
i= 1
[ II+i - Will2 + II wiII'] = 'an['+ ' P A P n + l ' P n + (Pi(Pn - ~ n + l ) ' ~ n ]
= O(1og r,) + O(n), as. (4.14) + O(d, + log r,)
Consequently, by the definition of r,,, we see from 52a,[2 + 6,11~p,))~] + O ( d , + log r,) (4.22)

where for the last inequality we have used the fact that which implies by the arbitrariness of E
cPipn+lcPn 5 1 .
From condition A3, it follows that there exists a constant Ln+1 = O(r;dn) and II Y~+III'= O(r;dn)
A E ( 0 , l ) such that a.s. ve > 0. (4.30)

tIUn-1112= o(Ln) + t
o(i = O A n - i l i w i i l z )
Thus by noting (4.23) and (4.30), we know that (4.5) holds
in the case of Theorem 1.
Lemma 3: Assume that conditions A1 and A2 hold, A( z),
= o(L,) + O ( d , ) . (4.23)
B ( z ) , and C ( z ) have no common left factor and
Combining (4.23) with (4.2) yields [A,, B,, C,] is of full-row rank, and that the output of
system (1.1) under control (3.14) and (3.15) has growth rate
D- 1 a- 1 r- 1
1 "
- IIyil12= 0(1) as. (4.31)
n i=o
where U," is any c-measurable vector (not necessarily
+ o(i2=O
An-i( )I - W,-ill' + (I wn-il12) defined by (3.13)) with
1 "
o(L,) + o(d, + log r,). (4.24) - IIuPll'
n i=o
= 0(1) a.s. (4.32)

Substituting (4.24) into (4.22) and noticing a,S, =


where {c} is a family of nondecreasing a-algebras such
O(a,)= O(1og r,) (a consequence of (4.3)) we immediately
derive (4.16).
that c is a sub a-algebra of and E, independent of c.
Then 8, given by (1.2)-(1.5) has the convergence rate
From (4.18) and (4.16) it follows that
indicated in (3.16) and
n+l
p,-1 2 conl-z(~+lq (4.33)
Ln+, = An+1-iIIYi()2 = IIYn+1112 + ALn
i=O
for some constant c, > 0, where E is defined in (3.15).
I( A + ca,S,)L, + E,, (4.25) The proof is given in the Appendix.
and hence Proof of Theorem 3: Without loss of generality as-
n n n sume that
L,,] I n ( A + c a j S j ) L o+
j=O i=O j = i + l
( A + cajSj)Ei % = a{ wi, U:+,, ti,i In }
n n and
- An+' n (1 + A-'cajSj)L, + A"-'
j=O i=O e = a{wi, y:+,, i 5 n}.

* fi
j=i+ 1
(1 + A-lajSj)Ei (4.26)
Set
1 ,
Z + 1 =~,*+1+ P/2Blnen.
2 1.
where as usual II",,,(.) 'n- 1
We now proceed to analyze the product in (4.26).
Combining this with (3.13) and (3.14) we have
We note that Sj ,40 because
W
J-*W
m
U, = ~;;{J:+I + ( B l n u n - expn)). (4.34)
CAj=j = O
j=O
(trPj-trPj+,)strPo<oo. (4.27) Clearly, J,*+, is %-measurable. Next, by (3.10) and
Lemma 1-11, we know that 11 B,, 11 = O({log r,- ,}'''),
Consequently, for any E > 0 by (4.3) there exists io such hence { J,*} in a.s. bounded. Thus { J:} may serve as a new
that reference signal that satisfies requirements in Theorem 2, and
n so (1.1)-(lS), (3.10), and (4.34) form an ELS-based adap-
X - ' c c aisj Ie log r,, Vn z i 1 io. (4.28) tive tracker. Consequently, by Theorem 2, (1.8) and (1.9)
j=i
hold. Hence, (4.31) and (4.32) are satisfied because { (1 u,II}
Using this and the inequality 1 + x Ie x , x 2 0 one is bounded. Moreover, U," defined by (3.13) is q-measura-
ble and e is independent of E, by definition. Then Lemma
readily obtains
3 is applicable and (3.16) follows from Lemma 1-1) and
(4.33). It remains to show (3.17).
By Lemma 2 and (4.33), it follows that
I exp { e log r,} = r;, vn z i 2 io. (4.29) p;p,p,, = o(r;'d,), a s . VSE(O, 1 - ~ ( t I ) ) .+
Putting this into (4.26) and taking account of (4.19) yields (4.35)

L,,+~
= o(r;(d,log r,, + log2 r , ) ) a.s. V E > 0 Set Si = x J = l a j So
, = 0. Then by Lemma 1-3), Si =

O(1og ri) = O(1og i), so by (4.35) we have analyzed in completely the same way. Note that conclusions
n 1) and 2) are known results, see, for example, (9), (29), and
aip:Pipi (31) in reference [20] (note that r,+, of [20] equals ( r , -
i= 1 e + 1) of the present paper). Similar results may also be
found in [9]. Hence, we need only to prove conclusion 3) for
the algorithm (1.2)-(1.5).
Set
-
i= 1
wk+l = wk+l + e'((P,O - ' P k ) . (A4
It is easy to see that y k + = 8 'pk +zk+ 1. Substituting this
into (1.2) we have
e",+, = ( I - akPkpkp;)gk - akPkpkzi+l' (A'2)
From (1.3), it is clear that

= O ( dn) 9 v q o , 1 - E(t + 1)). pk+lpil = I- a k p k p k p ; , PF+!lPk =I + pkp;Pk.

Consequatly by Lemma 1-3) again, we get From this and (A.2) it follows that
n n

tr [ gi + 1 pi+!1 + 1e;, 1
Note that by (3.16), B 1 , -, B , , and B, is nondegenerate,
n-oo '[ ' i 1 6 k - P i + ! l a k P k ' P k z ; + l ]
we see from (3.10) that AB.,, = o for all sufficiently large n.
Hence, by (3.16), (4.11) (with yF+l replaced by J,*+,), = trg;Pi18"k - a k ( I ( p ; g k 1 I 2 - 2ak(O;8"ki?k+l
(4.2) and (3.6) we see that
n + akp;Pkpk11zk+1112. (A.3)
IIyk+l -yk*+l - wk+ll12
k= 1 We now proceed to estimate the last two terms on the
n right-hand side of (A.3). For this we need the following fact
= IIg;(ok + O'((P; - 'Pk) + ABlkuk (see, e.g., [20, Eq. (21)]: for any Martingale difference
k= 1
sequence { w,, %} satisfying (1.6) and any adapted matrix
sequence { M,, %}

= O(d,) + O(1og n) + O(nl-')


= O ( d , ) + O(nl-'), as.,
where E is defined in (3.15). Hence, the desired result (3.17) By this, A.1, conclusion 2) and the
holds.
V. CONCLUDING
REMARKS
2xy 5 6x2 + 6 - ' y 2 , x 2 0,y 2 0 , 6 >0
In this paper, we have proved the stability and optimality it is not difficult to see that
of the A-W self-tuning regulator and an ELS-based adaptive n

tracker. However, several problems are still left open. We do


not know if the ELS-based adaptive tracker (1.1)-( 1.5) and
(3.4) is still stable and optimal if no modification is made on
B , , when det B , , # 0, a.s.. In (3.7) or (3.10), we have
modified B l n .However, we conjecture that in Theorem 2 not
only the modification may happen at most for a finite number
of steps, but also B;,B,, is asymptotically bounded from
below. n

APPENDIX + 6-l (pT(p;


- pk)112, 0<6<;
k= 1
In this section, we give the proofs for Lemmas 1 and 3 and n
complete the proof for Lemma 2 in the Theorem 2 case.
Proof of Lemma I : We need only to consider the ELS
I 26
k= 1
a,)(p;gk11' + o(1og r , ) , o < 6 < ;.
algorithm (1.2)-(1.5), since the algorithm (2.2)-(2.6) can be (A4

For the last term of (A.3), we need the following result (see from (4.11) that

From this, (A.l) and conclusion 2), we get 5 3 4 1 + 4 P k + l ( P k + 'P;(p, - P k + l ) ( P k )

+ O(d, + log r,) (A.13)


r,),
= O(log a.s. (A4 where for the last equality we have used the estimate ak =
Finally, summing up both sides of (A.3) from 1 to n, and O(1og r k ) ,a consequence of (4.3).
using (AS) and (A.6) we see that We now proceed to estimate 11 U, 11 '. By condition A3, we
n know from (1.1) that there exists X E (0, 1) such that
(1 - 26)
k= 1
a,Ilv;e",11~Itr[e";p;'g1] + O(iog r,)
which yields conclusion 3) because 1 - 26 > 0. This com-
pletes the proof of Lemma 1. (A. 14)
Proof of Lemma 2 in the Theorem 2 case: Similar to
Note that ( B , u , - 07pk)is free of ukr it is easy to see
(4.16), we first show that there exist constants c > 0 and
from (A.14) that
X E (0,l) such that

where L, is defined as in (4.18), f , is defined as k

1
r , - l ) 2 + a,&,,+
i=O
f, = (a,s,log -(A4
log rn - 1 = O(L,) + O(1og r k ) + O ( d , )
with CY,,, 6, defined as in (4.17), and where { Ek} is a where for the last relationship Lemma 1-2) is invoked and L,
nondecreasing positive sequence satisfying is defined by (4.18). Substituting this into (A.12), we have

By (3.11) we have + O(d , + log r , ) . (A. 15)

Yk*+l = ABlkUk + e;p, (A. 10) Consequently, similar to the treatment used above, by (A. 14)
and (A. 15) we derive
and from this
Il'Pk1l2 = IIUk1I2 + [Il'pkl12 - II~k1l21
B,u, = e7p, - Yk*+1 + Yk*+l + (Q,- 07P,)
- 5 12 II B; II 1I h k II + O(L,)
- OkVk - A B l n U k
- T
+ ~ k * ++~ (B,u, - (A.II)
+ O (d , + log r k ). (A.16)

possibly random 6 > 0 such that


n
6 a j 5 €(log r , ) , a s . Vn (A.20)
j=O

and by (4.27) there exists a (random) integer io such that


4 c 11' a
~(h) JC
= l , 6 j S ~ ,a . s . , V i l i o . (A.21)

Thus, by the inequalities 1 + xy S (1 + x)(l + y), x L


0,y 2 0 and 1 + x2 5 e 2 x ,x 2 0, we get for all n 1 i 2 io
+ O ( d k log r k + log' rk)

I c("k6k)2 11 p k 11 '
+ o((yk6k + IIAilk112)Lk
+ O(dklog rk + log' r,). (A.17)
To complete the proof of (A.7) we need to derive an upper
bound for 11 pk 11 in terms of L,.
By (3.5)and Lemma 1-l), it is easy to see from(3.11) that
4- 1
IIYk-i11' + II'k-il12
i= 1
k
Ir; exp {(log r , ) ~ =
) r:', (A.22)
+ O(log T,-~) (A.18)
as.
i=O and by the inequality 1 + x I ex, x 2 0, we have for
n z i r i ,
n

J=I

I O( r ; ) , ' a.s. (A.23)


Since r, 00 and X < 1, we know that io can be taken
+

+
large enough such that supjz io { 1 (CA- '/log rj)} < 2 - X
and hence for all n > i 1 io

1 ( 2 - A),-;.
fi
j=i+l (1 -k
(A.24)

Finally, from the definition of fi and (A.22)-(A.24) it


follows that for any E > 0
n
n
j=i+ 1
(1 + cX-'fj)
+ O(iOg5 + rk dk log4 r k )
where c is some constant. Hence (A.7) is verified. j=r+l
Corresponding to (4.25) and (4.26), we have from (A.7)
Ln+, I ( A + cfn)Ln + t n * fi
j=i+ 1
(1 + c h - 1 " j 6 j )
and
IO( r 3 2 - A] ), > i z io.
a.s. vn
L,+1 5 .+I[ fi (1 + X-'cfj)]Lo
j=O
Substituting this into (A. 19) and noting that 2 X - X2 < 1, we

+ 5 Anpi[ fi (1 + X-lcfi) 4;.


get
i=o j=i+l 1 +
We now estimate the product ~ I ; = ~ + ~ (xI- ' c f j ) .
(A.19)
L,+l IO(r,3'[2h - PI") +0 r ; ' 5 (2X -
i=O
X')n-iti
For any E > 0, by Lemma 1-3), there exists a small, = O(r:'[log' r, + dnlog4 r,,]), a.s. VE > 0.

By the arbitrariness of E , this implies L,+l = O(rid,,), sional vectors p(i),i = 1, * * - , X, for some A, such that
V E > 0, which in turn implies the desired result (4.5) since
by (A.14)

11 y k 11 + 11 ‘k 11 ’+ 11 k‘ 11 ’ and
5 L, + W + O(&)
k + d r- 1 x
Y(i)rzi = z)
p(i)7zic(
+ 211@k - Wk1I2 + 211 %(I2 i=O i=O
= o(r i d k ) + O ( d , ) + O(l0g r,) = o(r i d k ) . which imply
Proof of Lemma 3: Comparing conditions in this theo- pW = 0, i = O , . . . , A. (A.27)
rem with those in [20, Theorem 3 ( P > 2, 6 = O)], we find
that n‘’ is replaced by r:iZl in the denominator of U, and since [A,, B,, C,] is of full-row rank.
the full-rank requirement for A, is weakened to that of Clearly, (A.27) leads to a contradictory result di)= 0,
p ( j ) = o , y ( k ) = o , j = l , . . . , p - 1, j = o , . - . , q - 1,
[ A , , B , , cr1.
From the proof of [20, Theorem 31, we see that only the -
following two properties of $\{v_n\}$ are used: 1) $\{v_n, \mathcal{F}_n\}$ is a Martingale difference sequence (for [20, (50)-(52)]) and 2)

$$\frac{1}{n^{1-\rho}} \sum_{i=1}^{n} v_i v_i^{\tau} \ge c_1 I, \quad c_1 > 0, \quad \text{a.s.} \tag{A.25}$$

for sufficiently large $n$.

The property (A.25) is a consequence of (41) in [20], and is used in deriving (56) and (59) of [20]. Since without loss of generality we may assume that $\varepsilon_n$ is $\mathcal{F}_n$-measurable and is independent of $\mathcal{F}_{n-1}$, $\{v_n, \mathcal{F}_n\}$ is obviously a Martingale difference sequence because $r_{n-1}$ is $\mathcal{F}_{n-1}$-measurable.

Further, by (4.31) and (4.32) and the boundedness of $\{v_n\}$ we have

Hence by Lemma 1-2), (1.7), and (A.26), it is easy to see that for some $c > 0$, $r_n \le cn$, a.s. $\forall n \ge 0$. Thus, by [20, Eq. (41)] we have for sufficiently large $n$

$$\frac{1}{n^{1-\rho}} \sum_{i=1}^{n} v_i v_i^{\tau} \ge \frac{1}{c^{\rho} n} \sum_{i=1}^{n} \varepsilon_i \varepsilon_i^{\tau} \ge \frac{1}{2c^{\rho}} I$$

which verifies (A.25).

We now justify the relaxation of replacing the rank condition on $A_p$ by that of $[A_p, B_q, C_r]$. Under the assumptions converse to [20, Eq. (46)], following the proof there, we find a unit vector

such that

$$\sum_{i=0}^{p-1} \alpha^{(i)\tau} z^{i} [\operatorname{adj} A(z)] B(z) = -\sum_{i=0}^{q-1} \beta^{(i)\tau} z^{i} [\det A(z)] I$$

$$\sum_{i=0}^{p-1} \alpha^{(i)\tau} z^{i} [\operatorname{adj} A(z)] C(z) = -\sum_{i=0}^{r-1} \gamma^{(i)\tau} z^{i} [\det A(z)] I.$$

From this and using the fact that $A(z)$, $B(z)$, and $C(z)$ have no common left factor we find that there are $m$-dimensional

$k = 0, \ldots, r - 1$. Thus [20, Eq. (46)] is true, and all results of [20, Theorem 3] remain valid. Hence (3.16) follows, while (4.33) can be seen from [20, Eq. (44)]. The proof is completed.

ACKNOWLEDGMENT

The authors are very grateful to the reviewers and to Prof. P. Ioannou for their helpful comments on the paper.

REFERENCES

[1] K. J. Åström and B. Wittenmark, “On self-tuning regulators,” Automatica, vol. 9, pp. 185-199, 1973.
[2] G. C. Goodwin, P. J. Ramadge, and P. E. Caines, “Discrete-time stochastic adaptive control,” SIAM J. Contr. Optimiz., vol. 19, pp. 829-853, 1981.
[3] A. Becker, P. R. Kumar, and C. Z. Wei, “Adaptive control with the stochastic approximation algorithm: Geometry and convergence,” IEEE Trans. Automat. Contr., vol. AC-30, no. 4, pp. 330-338, 1985.
[4] H. F. Chen and L. Guo, “Asymptotically optimal adaptive control with consistent parameter estimates,” SIAM J. Contr. Optimiz., vol. 25, no. 3, pp. 558-575, 1987.
[5] P. R. Kumar and L. Praly, “Self-tuning trackers,” SIAM J. Contr. Optimiz., vol. 25, no. 4, pp. 1053-1071, 1987.
[6] P. E. Caines and S. Lafortune, “Adaptive control with recursive identification for stochastic linear systems,” IEEE Trans. Automat. Contr., vol. AC-29, pp. 312-321, 1984.
[7] H. F. Chen, “Recursive system identification and adaptive control by use of the modified least squares algorithm,” SIAM J. Contr. Optimiz., vol. 22, no. 5, pp. 758-776, 1984.
[8] K. S. Sin and G. C. Goodwin, “Stochastic adaptive control using a modified least squares algorithm,” Automatica, vol. 18, pp. 315-321, 1982.
[9] T. L. Lai and C. Z. Wei, “Extended least squares and their application to adaptive control and prediction in linear systems,” IEEE Trans. Automat. Contr., vol. AC-31, pp. 898-906, 1986.
[10] L. Guo and H. F. Chen, “Convergence rate of ELS based adaptive tracker,” Syst. Sci. Math. Sci., vol. 1, no. 2, pp. 131-138, 1988.
[11] L. Guo, “Identification and adaptive control for dynamic systems,” Ph.D. dissertation, Inst. Syst. Sci., Academia Sinica, Beijing, China, Oct. 1986.
[12] H. F. Chen and J. F. Zhang, “Convergence rate in stochastic adaptive tracking,” Int. J. Contr., vol. 49, pp. 1915-1935, 1989.
[13] T. L. Lai and Z. Ying, “Parallel recursive algorithms in asymptotically efficient adaptive control of linear stochastic systems,” Dep. Statistics, Stanford University, Stanford, CA, Tech. Rep. 11, 1989.
[14] P. R. Kumar, “Convergence of adaptive control schemes using least-squares parameter estimates,” IEEE Trans. Automat. Contr., vol. AC-35, pp. 416-423, 1990.
[15] L. Guo and H. F. Chen, “Convergence and optimality of self-tuning regulators,” Science, China, submitted for publication.
[16] W. F. Stout, Almost Sure Convergence. New York: Academic, 1974.
[17] L. Guo and D. Huang, “Least squares identification for ARMAX models without the positive real condition,” IEEE Trans. Automat. Contr., vol. AC-34, no. 10, pp. 1094-1098, 1989.


[18] S. Meyn and P. E. Caines, “The zero divisor problem of multivariable stochastic adaptive control,” Syst. Contr. Lett., vol. 6, no. 4, pp. 235-238, 1985.
[19] G. W. Stewart, Introduction to Matrix Computations. New York: Academic, 1973.
[20] H. F. Chen and L. Guo, “Convergence rate of least squares identification and adaptive control for stochastic systems,” Int. J. Contr., vol. 44, pp. 1459-1476, 1986.
[21] T. L. Lai and C. Z. Wei, “Least-squares estimation in stochastic regression models with application to identification and control of dynamic systems,” Ann. Statist., vol. 10, pp. 154-166, 1982.
[22] L. Guo, “On adaptive stabilization of time-varying stochastic systems,” SIAM J. Contr. Optimiz., vol. 28, no. 6, pp. 1432-1451, 1990.

Lei Guo (M’88) was born in China in 1961. He received the B.S. degree in mathematics from Shandong University in 1982, and the M.S. and Ph.D. degrees in control theory from the Institute of Systems Science, Chinese Academy of Sciences, Beijing, People’s Republic of China, in 1984 and 1987, respectively.
From June 1987 to June 1989, he was with the Department of Systems Engineering, Australian National University, Canberra. Presently, he is an Associate Professor with the Institute of Systems Science, Chinese Academy of Sciences, Beijing. His research interests are in stochastic systems, adaptive control, estimation and approximation, time series analysis, and statistics of random processes.
Dr. Guo is an Associate Editor of the SIAM Journal on Control and Optimization.

Han-Fu Chen received the diploma in mathematics from the University of Leningrad, Leningrad, U.S.S.R., in 1961.
From 1961 to 1980 he was with the Institute of Mathematics, Chinese Academy of Sciences, Beijing, China. Since 1980 he has been with the Institute of Systems Science, Academia Sinica, where he is currently a Professor in the Laboratory of Control Theory and Applications. He has authored and/or coauthored over 90 papers and five books in the areas of stochastic control, adaptive control, identification, and stochastic approximation.
Prof. Chen serves as Vice-chairman of the International Federation of Automatic Control Theory Committee. He is the Editor of Systems Science and Mathematical Sciences and is a member of the Editorial Boards of a number of international and Chinese journals. His recent book, Identification and Stochastic Adaptive Control (Cambridge, MA: Birkhäuser) coauthored with Prof. L. Guo, will be published in 1991.
