A survey of recursive identification algorithms (1986)
by K.J. Hunt*, BSc, AMIEE
The paper gives an introduction to and comparison of the four recursive identification algorithms most commonly used in self-tuning control. The paper demonstrates that the algorithms are very similar, and a simple simulation example is used to compare the algorithms and to give some insight into the suitability of each method for a particular problem. Choice of algorithm is seen to depend on four factors: model complexity, noise/signal ratio, convergence rate, and computational expense.

Introduction

The emergence in the last decade of adaptive control and signal processing methods has been accompanied by a renewed interest in the techniques of recursive system identification. Since the main emphasis in this paper is on recursive methods for real-time implementation in self-tuning control systems, attention will be given exclusively to the problem of estimating the parameters of the so-called ARMAX model (described in Section 1), which is the model most often used in self-tuning controller design. An introduction to the theoretical and practical aspects of identification is given by Norton (1986).

The newcomer to the field of system identification is confronted with an apparently limitless number of different algorithms. The purpose of this paper is to give a brief introduction to and comparison of the algorithms most commonly used in self-tuning control; only the most important properties of the algorithms are presented. Four methods will be described. These are the methods most often used in the literature on self-tuning control, and the relationship between them is shown in Fig 1.

1 The model structure

This paper is concerned with the identification of a dynamical system which can be described by the linear difference equation:

y(t) + a₁y(t−1) + ... + aₙy(t−n) = b₁u(t−1) + ... + bₘu(t−m) + v(t)   ...(1)

where {u(t)} and {y(t)} are the input and output sequences, respectively. The term v(t) represents some disturbance acting on the output. Using polynomials in the delay operator q⁻¹, Eqn (1) can be rewritten as:

A(q⁻¹)y(t) = B(q⁻¹)u(t) + v(t)   ...(2)

Two cases, depending on the characteristics of v(t), will be considered:

(i) v(t) has unspecified character. The model Eqn (2) can then be rewritten in vector form as:

y(t) = θᵀφ(t) + v(t)   ...(3)

Eqn (8) can be rewritten in vector form:

y(t) = θᵀΦ(t) + e(t)   ...(9)

The least-squares criterion V_N(θ) is formed from the sum of squared equation errors, and this is minimised with respect to θ. The criterion V_N(θ) is quadratic in θ and can therefore be minimised analytically. Minimisation gives the estimate of θ as:

θ̂(N) = [Σ_{t=1}^{N} φ(t)φᵀ(t)]⁻¹ Σ_{t=1}^{N} φ(t)y(t)   ...(17)
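As a numerical illustration of the batch estimate Eqn (17), the following sketch identifies a first-order ARX system. The system (a₁ = −0.8, b₁ = 0.5), the PRBS-like input and the noise-free simulation are illustrative assumptions, not the paper's example:

```python
# Batch least-squares for the ARX model y(t) = -a1*y(t-1) + b1*u(t-1) + v(t).
# Regressor phi(t) = [-y(t-1), u(t-1)], parameter vector theta = [a1, b1].
# Hypothetical system: a1 = -0.8, b1 = 0.5; v(t) = 0 for clarity.
import random

def simulate(N, a1=-0.8, b1=0.5, seed=0):
    rng = random.Random(seed)
    y, u = [0.0], []
    for _ in range(N):
        u.append(rng.choice([-1.0, 1.0]))          # PRBS-like input
        y.append(-a1 * y[-1] + b1 * u[-1])
    return y, u

def batch_ls(y, u):
    # Solve the 2x2 normal equations of Eqn (17):
    # [sum phi phi^T] theta = sum phi y
    s11 = s12 = s22 = r1 = r2 = 0.0
    for t in range(1, len(y)):
        p1, p2 = -y[t - 1], u[t - 1]               # phi(t)
        s11 += p1 * p1; s12 += p1 * p2; s22 += p2 * p2
        r1 += p1 * y[t]; r2 += p2 * y[t]
    det = s11 * s22 - s12 * s12
    a1 = (s22 * r1 - s12 * r2) / det               # Cramer's rule
    b1 = (s11 * r2 - s12 * r1) / det
    return a1, b1

y, u = simulate(200)
a1_hat, b1_hat = batch_ls(y, u)
```

With noise-free data the normal equations recover the parameters exactly; with a disturbance v(t) that is correlated with φ(t) the estimate would be biased, which is the motivation for the instrumental-variable modification discussed later in the paper.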
For the model Eqn (3):

y(t) = θᵀφ(t) + v(t)   ...(20)

It has been stated that when v(t) and φ(t) are correlated (such as when v(t) is not white noise) then the estimates of θ will be biased. In general, this will be the case. The RIV method proceeds by replacing φ(t) in Eqn (17) by a vector z(t) (called the instrumental variables vector) such that:

(i) z(t) and v(t) are uncorrelated;
(ii) z(t) is as strongly correlated with φ(t) as possible,

ensuring, however, that (i) still holds. The estimate Eqn (17) now becomes:

θ̂(N) = [Σ_{t=1}^{N} z(t)φᵀ(t)]⁻¹ Σ_{t=1}^{N} z(t)y(t)   ...(21)

(Another approach would be to replace φᵀ(t) in Eqn (21) by zᵀ(t); the recursive version of this algorithm is called the symmetric IV method.)

The recursive instrumental variables (RIV) method can now be written down [cf Eqns (18)]:

θ̂(t) = θ̂(t−1) + L(t)[y(t) − φᵀ(t)θ̂(t−1)]
L(t) = P(t−1)z(t) / [1 + φᵀ(t)P(t−1)z(t)]
P(t) = P(t−1) − P(t−1)z(t)φᵀ(t)P(t−1) / [1 + φᵀ(t)P(t−1)z(t)]   ...(22)

Eqns (24) and (25), together with Eqns (22), define one version of the RIV algorithm. The method gives unbiased estimates of the parameters of model Eqn (20) even when v(t) is not white (compare with the RLS). The RIV method has a larger computational expense than RLS.

A number of variants of the RIV algorithm exist. A detailed treatment of the variants of the RIV method can be found in Ljung and Soderstrom (1983) and Young (1984).

4 Recursive maximum likelihood (RML)

Consider the ARMAX model Eqn (8):

A(q⁻¹)y(t) = B(q⁻¹)u(t) + C(q⁻¹)e(t)   ...(26)

The RML method minimises a criterion formed from the difference between predicted and measured values of the output (this approach leads to the family of methods known as prediction error methods). Define the prediction error:

ε(t, θ) = y(t) − ŷ(t/θ)   ...(28)

and a criterion function:

J_N(θ) = ½ Σ_{t=1}^{N} ε²(t, θ)   ...(29)

It can be shown [see, for example, Goodwin and Sin (1984)] that the optimal prediction of the output of model (26) is given by:

C(q⁻¹)ŷ(t/θ) = [C(q⁻¹) − A(q⁻¹)]y(t) + B(q⁻¹)u(t)   ...(30)

Inspection of (30) shows that ŷ(t/θ) is a non-linear function of θ. This means that the criterion J_N(θ) cannot be minimised analytically, and non-recursive numerical methods have to be used. To obtain a recursive prediction error algorithm, some approximations have to be made (details are in Ljung and Soderstrom, 1983). In the regression vector Eqn (11) the unobserved components are approximated by the residuals [cf Eqn (9)]:

ε(t) = y(t) − θ̂ᵀ(t)Φ(t)   ...(31)

and the regression vector now becomes:

Φᵀ(t) = [−y(t−1) ... −y(t−n)  u(t−1) ... u(t−m)  ε(t−1) ... ε(t−c)]   ...(32)

The RML algorithm can now be written down [cf Eqns (18)]:

θ̂(t) = θ̂(t−1) + L(t)[y(t) − Φᵀ(t)θ̂(t−1)]
L(t) = P(t−1)ψ(t) / [1 + ψᵀ(t)P(t−1)ψ(t)]
P(t) = P(t−1) − P(t−1)ψ(t)ψᵀ(t)P(t−1) / [1 + ψᵀ(t)P(t−1)ψ(t)]   ...(33)

where the filtered data vector ψ(t) is obtained by filtering Φ(t) through the estimated C polynomial:

Ĉ(q⁻¹)ψ(t) = Φ(t)   ...(34)

The RML algorithm [Eqns (33) and (34)] bears a very close resemblance to the basic RLS algorithm [Eqns (18)]. In this case the data vector is extended by the components ε(t−1), ..., and the filtered data vector ψ(t) [Eqn (34)] is used in the algorithm.

5 Extended least-squares (ELS)

The ELS method applies to the ARMAX model Eqn (8). The approach taken is to attempt to cast the ARMAX model in the form of a linear regression Eqn (3) and then to apply the least-squares method directly. The unobserved components of the data vector Φ(t) are approximated by the residuals:

ε(t) = y(t) − θ̂ᵀ(t)Φ(t)   ...(39)

The recursive algorithm then becomes [cf Eqns (18)]:

θ̂(t) = θ̂(t−1) + L(t)[y(t) − Φᵀ(t)θ̂(t−1)]
L(t) = P(t−1)Φ(t) / [1 + Φᵀ(t)P(t−1)Φ(t)]
P(t) = P(t−1) − P(t−1)Φ(t)Φᵀ(t)P(t−1) / [1 + Φᵀ(t)P(t−1)Φ(t)]   ...(41)
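The structure of the recursion Eqns (41) (and, with φ in place of Φ, the basic RLS form Eqns (18)) can be sketched as follows. The first-order system, the horizon and the initialisation P(0) = 100 I are illustrative assumptions, and the disturbance is omitted so that only the update structure is shown:

```python
# One recursive least-squares update of the form of Eqns (41)/(18), for a
# two-parameter regressor. ELS would simply extend phi(t) and theta with
# past residuals; the update equations are unchanged.
import random

def rls_step(theta, P, phi, y):
    # L(t) = P(t-1) phi / (1 + phi^T P(t-1) phi)
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = 1.0 + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    L = [Pphi[0] / denom, Pphi[1] / denom]
    # theta(t) = theta(t-1) + L(t) [y(t) - phi^T theta(t-1)]
    err = y - (theta[0] * phi[0] + theta[1] * phi[1])
    theta = [theta[0] + L[0] * err, theta[1] + L[1] * err]
    # P(t) = P(t-1) - P phi phi^T P / (1 + phi^T P phi)
    P = [[P[i][j] - Pphi[i] * Pphi[j] / denom for j in range(2)]
         for i in range(2)]
    return theta, P

# Usage: identify the assumed system y(t) = 0.8 y(t-1) + 0.5 u(t-1),
# i.e. a1 = -0.8, b1 = 0.5, from noise-free data.
rng = random.Random(1)
theta, P = [0.0, 0.0], [[100.0, 0.0], [0.0, 100.0]]
y_prev = 0.0
for t in range(200):
    u = rng.choice([-1.0, 1.0])
    y = 0.8 * y_prev + 0.5 * u
    theta, P = rls_step(theta, P, [-y_prev, u], y)
    y_prev = y
```

The covariance update costs only a rank-one correction per step, which is what makes the whole family suitable for real-time use.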
Comparison of Eqns (18) and (41) shows that the RLS and ELS methods are computationally identical, except that ELS is complemented by Eqn (39). The data and parameter vectors are extended due to the inclusion of the disturbance term. The procedure of casting the model structure chosen (in this case ARMAX) in the form of a linear regression is the basis of a class of algorithms known as pseudo-linear regressions.

When the ELS algorithm Eqn (41) is compared to the RML algorithm Eqns (33) and (34), it is apparent that the only difference is that the vector Φ(t) is replaced by the filtered vector ψ(t) in the equations for L(t) and P(t). The vector ψ(t) is obtained by filtering Φ(t) through the estimated C polynomial [Eqn (34)]. This explains why the ELS method is also known as the approximate maximum likelihood (AML) method. The filtering operation represents the increased computation required for the RML algorithm when compared with the ELS algorithm.

6 Time-varying systems

In the previous sections the algorithms presented were derived under the assumption that the system parameters are constant. For time-varying systems the algorithms must be equipped with some means of tracking these variations. Two types of time variations will be considered: time-varying parameters and time-varying noise variance. Only the RLS algorithm will be considered since modification of the other algorithms is identical.

6.1 Tracking time-varying parameters

The least-squares criterion [Eqn (16)] is:

V_N(θ) = Σ_{t=1}^{N} [y(t) − θᵀφ(t)]²   ...(42)

It follows from Eqn (42) that the same weight is put on each measurement, old or new. To track time-varying parameters, old measurements should be discounted. This can be done by introducing a forgetting factor λ < 1 into the criterion:

V_N(θ) = Σ_{t=1}^{N} λ^(N−t) [y(t) − θᵀφ(t)]²   ...(43)

Minimisation of this criterion gives the RLS algorithm with forgetting factor:

θ̂(t) = θ̂(t−1) + L(t)[y(t) − φᵀ(t)θ̂(t−1)]
L(t) = P(t−1)φ(t) / [λ + φᵀ(t)P(t−1)φ(t)]
P(t) = (1/λ)[P(t−1) − P(t−1)φ(t)φᵀ(t)P(t−1) / (λ + φᵀ(t)P(t−1)φ(t))]   ...(45)

The choice of λ is a compromise: a small value gives fast tracking but high noise sensitivity, and the opposite is true for λ close to one. This is clearly illustrated in the following example.

Example 6.1  Consider the system model:

y(t) + a₁(t)y(t−1) = 0.2u(t−1) + e(t)

where u(t) and y(t) are the input and output, and e(t) is white noise with unit variance. The system is identified using Eqns (45), where the input is a pseudo random binary sequence (PRBS) with amplitude ±1.0. At time t = 100 the parameter a₁(t) changes from −0.8 to −0.4. Three values of forgetting factor are used: λ = 0.95, λ = 0.99, λ = 0.995. The results are shown in Fig 2, where the trade-off in choice of λ is apparent.

In the case of constant parameters, the measured data at the beginning of the identification may be poor due to the choice of initial estimates. It is then desirable to forget data in this phase (λ < 1) and then to let λ → 1 as the effect of initial conditions diminishes. A common way to do this is to let λ(t) grow exponentially with t to one according to:

λ(t) = λ₀λ(t−1) + (1 − λ₀)   ...(46)

The effect of the forgetting factor λ in Eqns (45) is clearly that the covariance matrix P(t), and hence the gain L(t), is kept from going to zero. The algorithm will therefore remain alert to parameter changes. The major problem with the forgetting factor method is also apparent from Eqn (45): if the data vector φ(t) does not contain much information or, in the extreme case, φ(t) = 0, then the matrix P(t) becomes:

P(t) = (1/λ)P(t−1)   ...(47)

so that P(t) grows exponentially. This is known as the 'wind-up phenomenon' or 'estimator wind-up'.
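The tracking experiment of Example 6.1 can be sketched with Eqns (45). For a simple, checkable illustration the noise e(t) is omitted here (the paper's example uses unit-variance noise), so only the tracking behaviour of the forgetting factor is visible; λ = 0.95 and the parameter jump at t = 100 follow the example:

```python
# Forgetting-factor RLS (Eqns (45)) tracking the jump of a1 in
# y(t) + a1(t) y(t-1) = 0.2 u(t-1); noise omitted for this sketch.
import random

lam = 0.95
theta = [0.0, 0.0]                      # estimates of [a1, b1]
P = [[100.0, 0.0], [0.0, 100.0]]        # assumed initial covariance
rng = random.Random(2)
y_prev = 0.0
for t in range(200):
    a1 = -0.8 if t < 100 else -0.4      # parameter jump at t = 100
    u = rng.choice([-1.0, 1.0])         # PRBS-like input
    y = -a1 * y_prev + 0.2 * u
    phi = [-y_prev, u]
    # Eqns (45): gain, parameter update, covariance with forgetting
    Pphi = [P[0][0] * phi[0] + P[0][1] * phi[1],
            P[1][0] * phi[0] + P[1][1] * phi[1]]
    denom = lam + phi[0] * Pphi[0] + phi[1] * Pphi[1]
    err = y - (theta[0] * phi[0] + theta[1] * phi[1])
    theta = [theta[i] + Pphi[i] * err / denom for i in range(2)]
    P = [[(P[i][j] - Pphi[i] * Pphi[j] / denom) / lam for j in range(2)]
         for i in range(2)]
    y_prev = y
```

Because λ^100 is small, the weight of the pre-jump data decays quickly and the estimate of a₁ settles near the new value −0.4 well before the end of the run.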
Other techniques are available for tracking parameter changes. These are briefly mentioned:

(i) Periodic resetting of the covariance matrix.
(ii) Adding a constant term to the covariance matrix.
(iii) Using a variable forgetting factor, where λ < 1 when parameter changes are detected, and λ = 1 otherwise.

Fig 2  Tracking a time-varying parameter

6.2 Time-varying noise variance

Consider the model used for the LS method [Eqn (13)]:

y(t) = θᵀφ(t) + v(t)   ...(48)

and the criterion:

V_N(θ) = (1/N) Σ_{t=1}^{N} [y(t) − θᵀφ(t)]²   ...(49)

If the variance of the disturbance term v(t) is varying, a weight inversely proportional to the variance should be put on each measurement. Assume that v(t) is white noise with variance σ²(t). To account for this variation the criterion Eqn (49) becomes:

V_N(θ) = (1/N) Σ_{t=1}^{N} (1/σ²(t)) [y(t) − θᵀφ(t)]²   ...(50)

A natural interpretation of this approach is that more uncertain measurements have less significance in the criterion Eqn (50).

When this method is combined with the forgetting factor approach of the previous section, the weighted RLS algorithm for time-varying systems becomes [cf Eqns (18) and (45)]:

θ̂(t) = θ̂(t−1) + L(t)[y(t) − φᵀ(t)θ̂(t−1)]
L(t) = P(t−1)φ(t) / [λ(t)σ²(t) + φᵀ(t)P(t−1)φ(t)]
P(t) = (1/λ(t))[P(t−1) − P(t−1)φ(t)φᵀ(t)P(t−1) / (λ(t)σ²(t) + φᵀ(t)P(t−1)φ(t))]   ...(51)

The four algorithms are now compared by means of a simple open-loop experiment. Consider the system:

y(t) − 1.5y(t−1) + 0.7y(t−2) = u(t−1) + 0.5u(t−2) + e(t) − e(t−1) + 0.2e(t−2)

where e(t) is white noise with a variance σ² of either 0.25 or 1.0. The parameters of this model will be identified in each case using each algorithm (in the case of RLS and RIV the noise dynamics are not estimated). The input is a PRBS of amplitude ±1. The initial estimates in each case are taken to be zero, and a forgetting factor according to Eqn (46) is used, with λ(0) = 0.95, λ₀ = 0.99. The estimated parameters at sample times t = 20, t = 200 and t = 2000 are shown in Table 1 for the case σ² = 0.25, and in Table 2 for the case σ² = 1.0. Based on these results, the performance of each algorithm may be summarised as follows:

(i) RLS: Fastest convergence in both cases. Good estimates when noise/signal ratio is small (σ² = 0.25). Estimates are poor (biased) when noise/signal ratio is large (σ² = 1.0). Does not estimate noise dynamics. Small computational effort.

(ii) RIV: Convergence somewhat slower than RLS, but quality of estimates is better, especially when n/s ratio is large. Does not estimate noise dynamics. Larger computational effort than RLS.

(iii) RML: Very high accuracy estimates as identification time becomes large. Slow initial convergence. Accurate estimation of noise dynamics, even when n/s ratio is large. Largest computational effort.

(iv) ELS: Accurate estimates (less accurate than RML). Faster convergence than RML, slower than RLS. Good estimates of noise polynomial even when n/s ratio is large. Less computational effort than RML; greater effort than RLS, RIV.
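The exponential growth of λ(t) towards one [Eqn (46)], with λ(0) = 0.95 and λ₀ = 0.99 as used in the experiment above, can be sketched as:

```python
# Eqn (46): lambda(t) = lambda0 * lambda(t-1) + (1 - lambda0).
# Since lambda(t) - 1 = lambda0**t * (lambda(0) - 1), the forgetting
# factor approaches one exponentially, discounting only early data.
lam0 = 0.99        # lambda_0 from the experiment
lam = 0.95         # lambda(0) from the experiment
for t in range(1, 501):
    lam = lam0 * lam + (1.0 - lam0)
```

Early in the run λ(t) is well below one, so poor initial estimates are forgotten; after a few hundred samples λ(t) is effectively one and the algorithm behaves like ordinary RLS.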
TABLE 1: σ² = 0.25

TABLE 2: σ² = 1.0

8 Conclusions

Four of the main algorithms used for real-time system identification (in, for example, adaptive control) have been reviewed, and their main features illustrated by performing a simple open-loop experiment.

The most important factors to be considered when deciding which method to use are:

(1) Is the noise polynomial to be identified?
(2) Noise/signal ratio.
(3) Convergence rate of particular method.
(4) Computational expense.

Guidelines have been given to help decide which method is most suitable when the above factors have been considered.

The forgetting factor method of tracking time-varying systems has been presented, and the trade-off between tracking rate and noise sensitivity has been illustrated by a simple example.

For each method, only the basic algorithms have been presented. In a practical implementation, however, special precautions have to be taken to ensure that the algorithm is numerically robust, and to avoid estimator wind-up. In adaptive control, identification will in general be performed in closed-loop, and identifiability properties of the system must be considered.

An important condition which must be satisfied for the algorithms to work properly is that the input to the system must be sufficiently rich ('persistently exciting') so that all modes of the system are excited. In open-loop experiments (as in this paper) it is easy to ensure that this condition holds. In adaptive control, however, the input will in general be generated by feedback and there is no guarantee of persistent excitation. The algorithms described in this paper can still be applied in closed-loop provided that certain 'identifiability' conditions hold. In this paper these conditions are assumed to be satisfied. A detailed treatment of the identifiability problem in closed-loop is given in Gustavsson et al (1977).

References

Astrom, K. J. and Eykhoff, P. 1971. 'System identification: A survey', Automatica, 7, 123-62.
Eykhoff, P. 1974. System identification: Parameter and state estimation, Holden-Day, San Francisco.
Eykhoff, P. (ed). 1981. Trends and progress in system identification, Pergamon Press, Oxford.
Goodwin, G. C. and Payne, R. L. 1977. Dynamic system identification: Experiment design and data analysis, Academic Press, New York.
Goodwin, G. C. and Sin, K. S. 1984. Adaptive filtering, prediction and control, Prentice-Hall, New Jersey.
Gustavsson, I., Ljung, L. and Soderstrom, T. 1977. 'Identification of processes in closed-loop: identifiability and accuracy aspects', Automatica, 13, 59-75.
Isermann, R. (ed). 1981a. System identification (Tutorials presented at the 5th IFAC Symposium on Identification and System Parameter Estimation, Darmstadt), Pergamon Press, Oxford.
Isermann, R. 1981b. Digital control systems, Springer, Berlin.
Ljung, L. and Soderstrom, T. 1983. Theory and practice of recursive identification, MIT Press, London.
Norton, J. P. 1986. An introduction to identification, Academic Press, London.
Young, P. C. 1984. Recursive estimation and time-series analysis, Springer, Berlin.