
Automatica, Vol. 30, No. 1, pp. 75-93, 1994          0005-1098/94 $6.00+0.00
Printed in Great Britain.                            © 1993 Pergamon Press Ltd

N4SID*: Subspace Algorithms for the Identification of Combined Deterministic-Stochastic Systems†

PETER VAN OVERSCHEE‡ and BART DE MOOR§

The asymptotic consistency of two new subspace algorithms that identify combined deterministic-stochastic state space models from given input-output data is analyzed.

Key Words—System identification; Kalman filters; difference equations; QR and singular value decomposition; multivariable systems; state space methods.

Abstract—Recently a great deal of attention has been given to numerical algorithms for subspace state space system identification (N4SID). In this paper, we derive two new N4SID algorithms to identify mixed deterministic-stochastic systems. Both algorithms determine state sequences through the projection of input and output data. These state sequences are shown to be outputs of non-steady state Kalman filter banks. From these it is easy to determine the state space system matrices. The N4SID algorithms are always convergent (non-iterative) and numerically stable since they only make use of QR and Singular Value Decompositions. Both N4SID algorithms are similar, but the second one trades off accuracy for simplicity. These new algorithms are compared with existing subspace algorithms in theory and in practice.

1. INTRODUCTION

The greater part of the systems identification literature is concerned with computing polynomial models, which are however known to typically give rise to numerically ill-conditioned mathematical problems, especially for Multi-Input Multi-Output systems. Numerical algorithms for subspace state space system identification (N4SID*) are then viewed as the better alternatives. This is especially true for high-order multivariable systems, for which it is not trivial to find a useful parametrization among all possible parametrizations. This parametrization is needed to start up the classical identification algorithms (see e.g. Ljung, 1987), which means that a priori knowledge of the order and of the observability (or controllability) indices is required.

With N4SID algorithms, most of this a priori parametrization can be avoided. Only the order of the system is needed, and it can be determined through inspection of the dominant singular values of a matrix that is calculated during the identification. The state space matrices are not calculated in their canonical forms (with a minimal number of parameters), but as full state space matrices in a certain, almost optimally conditioned basis (this basis is uniquely determined, so that there is no problem of identifiability). This implies that the observability (or controllability) indices do not have to be known in advance.

Another major advantage is that N4SID algorithms are non-iterative, with no non-linear optimization part involved. This is why they do not suffer from the typical disadvantages of iterative algorithms, e.g. no guaranteed convergence, local minima of the objective criterion and sensitivity to initial estimates.

For classical identification, an extra parametrization of the initial state is needed when estimating a state space system from data measured on a plant with a non-zero initial condition. A final advantage of the N4SID algorithms is that there is no difference between zero and non-zero initial states.

Most commonly known subspace methods are realization algorithms of e.g. Kung (1978), where a discrete-time state space model is computed from a block Hankel matrix with Markov parameters. It is unfortunate that the theory here relies on Markov parameters as a starting point, something rather difficult to measure or compute in practice (e.g. think of unstable systems).

An alternative direct identification scheme for purely deterministic systems is described by, e.g. Moonen et al. (1989) and Moonen and Ramos (1991), where a state space model is computed directly from a block Hankel matrix constructed from the input-output data. In a first step, a state vector sequence is computed as an interface between a 'past' and a 'future'. Once the state vector sequence is known, the system matrices are computed from a set of linear equations.

[Figure 1: flow diagram. Left-hand path (N4SID): input-output data u_k, y_k → Kalman states → (least squares) → system matrices. Right-hand path (classical identification): input-output data u_k, y_k → system matrices → (Kalman filter) → Kalman states.]
FIG. 1. The left-hand side shows the N4SID approach: first the (Kalman) states, then the system matrices. The right-hand side is the classical approach: first the system, and then an estimate of the states.

---
* Numerical algorithms for Subspace State Space System IDentification. Read as a Californian license plate: enforce it.
† Received 6 March 1992; revised 2 September 1992; revised 15 January 1993; received in final form 19 March 1993. The original version of this paper was presented at the 12th IFAC World Congress, which was held in Sydney, Australia, during 19-23 July 1993. The Published Proceedings of this IFAC Meeting may be ordered from: Pergamon Press Limited, Headington Hill Hall, Oxford OX3 0BW, U.K. This paper was recommended for publication in revised form by the Guest Editors. Corresponding author: Peter Van Overschee. Fax +32-16-221855; E-mail vanoverschee@esat.kuleuven.ac.be.
‡ ESAT, Department of Electrical Engineering, Katholieke Universiteit Leuven, Kardinaal Mercierlaan 94, 3001 Heverlee, Belgium. Research Assistant of the Belgian National Fund for Scientific Research.
§ Research Associate of the Belgian National Fund for Scientific Research.
Similar data-driven identification schemes for purely stochastic identification are well known (see e.g. Arun and Kung (1990) and the references therein). Less well known is that these algorithms can compute extremely biased results. This problem was studied and solved by Van Overschee and De Moor (1991a, b).

The problem addressed in this paper is that of identifying a general state space model for combined deterministic-stochastic systems directly from the input-output data. Some papers in the past have already treated this problem, but from a different viewpoint. In Larimore (1990) for instance, the problem is treated from a purely statistical point of view. There is no proof of correctness (in the sense of the algorithms being asymptotically unbiased) whatsoever. In De Moor et al. (1991) and Verhaegen (1991) the problem is split up into two subproblems: deterministic identification followed by a stochastic realization of the residuals. In Moonen et al. (1992) the problem is solved for doubly infinite block Hankel matrices, which implies practical computational problems.

In this paper, we will derive two N4SID algorithms that determine the deterministic and stochastic system at the same time. The connection with classical system theory (Kalman filter) will be used to prove the exactness (unbiasedness for an infinite number of measurements) of Algorithm 1, or the degree of approximation (calculation of the bias for an infinite number of measurements) of Algorithm 2.

The approach adopted here is similar to the identification schemes of Moonen et al. (1989) for the purely deterministic case and Van Overschee and De Moor (1991a, b) for the stochastic case. First a state sequence is determined from the projection of input-output data. This projection retains all the information (deterministic and stochastic) in the past that is useful to predict the future. Then, the state space matrices are determined from this state sequence. Figure 1 shows how these N4SID algorithms differ from the classical identification schemes.

The connection of the two new N4SID algorithms with the existing algorithms described above will also be indicated.

This paper is organized as follows: the problem description and the mathematical tools can be found in Section 2. In Section 3 the main projection is defined. Section 4 introduces a closed form formula for the non-steady state Kalman filter estimation problem. This result is related to the results of Section 3 to find the interpretation of the main projection as a sequence of outputs of a non-steady state Kalman filter bank. Section 5 introduces a first N4SID algorithm that identifies the system matrices exactly. In Section 6 accuracy is traded off for simplicity in a second approximate N4SID algorithm. Section 7 shows how these N4SID algorithms can be implemented in a numerically reliable way, using the QR and the Singular Value Decomposition (SVD). Section 8 investigates the connection with other existing algorithms. Finally Section 9 will treat some comparative examples. The conclusions can be found in Section 10.

2. PRELIMINARIES

In this section, we describe the linear time invariant system we want to identify. We also introduce the input and output block Hankel matrices, the past and future horizons, as well as the input-output equations.

2.1. System description
Consider the following combined deterministic-stochastic model to be identified:

x_{k+1} = A x_k + B u_k + w_k,   (1)
y_k = C x_k + D u_k + v_k,   (2)

with*

E[ ( w_p ; v_p ) ( w_q'  v_q' ) ] = ( Q^s  S^s ; (S^s)'  R^s ) δ_pq ≥ 0,   (3)

and A, Q^s ∈ ℝ^{n×n}, B ∈ ℝ^{n×m}, C ∈ ℝ^{l×n}, D ∈ ℝ^{l×m}, S^s ∈ ℝ^{n×l} and R^s ∈ ℝ^{l×l}. The input vectors u_k ∈ ℝ^{m×1} and output vectors y_k ∈ ℝ^{l×1} are measured. v_k ∈ ℝ^{l×1} and w_k ∈ ℝ^{n×1} on the other hand are unmeasurable, Gaussian distributed, zero mean, white noise vector sequences. {A, C} is assumed to be observable, while {A, (B  (Q^s)^{1/2})} is assumed to be controllable.

This system (1)-(2) is split up into a deterministic and a stochastic subsystem, by splitting up the state (x_k) and output (y_k) into a deterministic (·^d) and stochastic (·^s) component: x_k = x_k^d + x_k^s, y_k = y_k^d + y_k^s. The deterministic state (x^d) and output (y^d) follow from the deterministic subsystem, which describes the influence of the deterministic input (u_k) on the deterministic output:

x_{k+1}^d = A x_k^d + B u_k,   (4)
y_k^d = C x_k^d + D u_k.   (5)

The controllable modes of {A, B} can be either stable or unstable. The stochastic state (x^s) and output (y^s) follow from the stochastic subsystem, which describes the influence of the noise sequences (w_k and v_k) on the stochastic output:

x_{k+1}^s = A x_k^s + w_k,   (6)
y_k^s = C x_k^s + v_k.   (7)

The controllable modes of {A, (Q^s)^{1/2}} are assumed to be stable.

The deterministic inputs (u_k) and states (x_k^d) and the stochastic states (x_k^s) and outputs (y_k^s) are assumed to be quasi-stationary (as defined in Ljung, 1987). Note that even though the deterministic subsystem can have unstable modes, the excitation (u_k) has to be chosen in such a way that the deterministic states and output are finite for all time. Also note that since the systems {A, B} and {A, (Q^s)^{1/2}} are not assumed to be controllable, the deterministic and stochastic subsystems may have common as well as completely decoupled input-output dynamics.

The main problem of this paper can now be stated: given input and output measurements u_1, ..., u_N and y_1, ..., y_N (N → ∞), and the fact that these two sequences are generated by an unknown combined deterministic-stochastic model of the form described above, find A, B, C, D, Q^s, R^s, S^s (up to within a similarity transformation).

In the next two sections, we will define some more useful properties and notations for the deterministic and the stochastic subsystem.

2.1.1. The deterministic subsystem. Associated with the deterministic subsystem (4)-(5), we define the following matrices:
• The extended (i > n) observability matrix Γ_i (where the subscript i denotes the number of block rows):

Γ_i = ( C ; CA ; CA² ; ... ; CA^{i-1} ).

• The reversed extended controllability matrix Δ_i^d (where the subscript i denotes the number of block columns):

Δ_i^d = ( A^{i-1}B  A^{i-2}B  ...  AB  B ).

• The lower block triangular Toeplitz matrix H_i^d:

H_i^d = ( D  0  0  ...  0 ; CB  D  0  ...  0 ; CAB  CB  D  ...  0 ; ... ; CA^{i-2}B  CA^{i-3}B  ...  ...  D ).

2.1.2. The stochastic subsystem. For the stochastic subsystem (6)-(7) we define

P^s ≝ E[x_k^s (x_k^s)'],
G ≝ E[x_{k+1}^s (y_k^s)'],
Λ_0 ≝ E[y_k^s (y_k^s)'].

With equations (3), (6) and (7) and through stability of the controllable modes of the system {A, (Q^s)^{1/2}}, we easily find that the following equations are satisfied:

P^s = A P^s A' + Q^s,
G = A P^s C' + S^s,   (8)
Λ_0 = C P^s C' + R^s.

---
* E denotes the expected value operator and δ_pq the Kronecker delta.

This set of equations describes the set of all possible stochastic realizations that have the same second order statistics as a given stochastic sequence y_k^s. We call them the positive real equations. More details can be found in Faure (1976).

It is also easy to derive that

Λ_i ≝ E[y_{k+i}^s (y_k^s)'] = { C A^{i-1} G  (i > 0) ; Λ_0  (i = 0) ; G' (A')^{-i-1} C'  (i < 0) }.

Associated with the stochastic subsystem, we define the following matrices:
• The reversed extended stochastic controllability matrix Δ_i^s:

Δ_i^s ≝ ( A^{i-1}G  A^{i-2}G  ...  AG  G ).

• The block Toeplitz covariance matrix L_i^s:

L_i^s ≝ ( Λ_0  Λ_{-1}  ...  Λ_{1-i} ; Λ_1  Λ_0  ...  Λ_{2-i} ; ... ; Λ_{i-1}  Λ_{i-2}  ...  Λ_0 ).

• The block Toeplitz cross covariance matrix H_i^s:

H_i^s ≝ ( Λ_i  Λ_{i-1}  ...  Λ_1 ; Λ_{i+1}  Λ_i  ...  Λ_2 ; ... ; Λ_{2i-1}  Λ_{2i-2}  ...  Λ_i ) = Γ_i Δ_i^s.

2.2. Block Hankel matrices and input-output equations
Input and output block Hankel matrices are defined as

U_{0|i-1} ≝ ( u_0  u_1  u_2  ...  u_{j-1} ; u_1  u_2  u_3  ...  u_j ; ... ; u_{i-1}  u_i  u_{i+1}  ...  u_{i+j-2} ),

Y_{0|i-1} ≝ ( y_0  y_1  y_2  ...  y_{j-1} ; y_1  y_2  y_3  ...  y_j ; ... ; y_{i-1}  y_i  y_{i+1}  ...  y_{i+j-2} ),

where we presume that j → ∞ throughout the paper. The subscripts of U and Y denote the subscripts of the first and last element of the first column. The block Hankel matrices formed with the outputs y_k^s of the stochastic subsystem are defined as Y_{0|i-1}^s in the same way.

Somewhat loosely we denote the 'past' inputs with U_{0|i-1} or U_{0|i} and the 'future' inputs with U_{i|2i-1} or U_{i+1|2i-1}. A similar notation applies for the past and future outputs. This notational convention is useful when explaining concepts.

The deterministic and stochastic state matrices are defined as

X_i^d ≝ ( x_i^d  x_{i+1}^d  x_{i+2}^d  ...  x_{i+j-1}^d ),
X_i^s ≝ ( x_i^s  x_{i+1}^s  x_{i+2}^s  ...  x_{i+j-1}^s ).

For the deterministic subsystem we define

lim_{j→∞} (1/j) ( U_{0|i-1} ; U_{i|2i-1} ; X_0^d ) ( (U_{0|i-1})'  (U_{i|2i-1})'  (X_0^d)' ) = ( R_{11}  R_{12}  S_1' ; R_{21}  R_{22}  S_2' ; S_1  S_2  P^d ) ≝ ( R  S' ; S  P^d ),

where we use the assumption that the limit exists (quasi-stationarity of u_k and x_k^d).

For the stochastic subsystem we find that, due to stationarity of y_k^s, the following equalities hold true:

lim_{j→∞} (1/j) ( Y_{0|i-1}^s ; Y_{i|2i-1}^s ) ( (Y_{0|i-1}^s)'  (Y_{i|2i-1}^s)' ) = ( L_i^s  (H_i^s)' ; H_i^s  L_i^s ).   (9)

The matrix input-output equations are defined in the following Theorem (De Moor, 1988):

Theorem 1.

Y_{0|i-1} = Γ_i X_0^d + H_i^d U_{0|i-1} + Y_{0|i-1}^s,   (10)
Y_{i|2i-1} = Γ_i X_i^d + H_i^d U_{i|2i-1} + Y_{i|2i-1}^s,   (11)
X_i^d = A^i X_0^d + Δ_i^d U_{0|i-1}.   (12)

The theorem is easy to prove by recursive substitution into the state space equations.

3. THE MAIN PROJECTION

In this section, we introduce the projection of the future outputs onto the past and future inputs and the past outputs. The results can be described as a function of the system matrices and the input-output block Hankel matrices.

We define the matrices Z_i and Z_{i+1} as

Z_i = Y_{i|2i-1} / ( U_{0|2i-1} ; Y_{0|i-1} ),   (13)
Z_{i+1} = Y_{i+1|2i-1} / ( U_{0|2i-1} ; Y_{0|i} ),   (14)

where A/B ≝ A B' (B B')^{-1} B. The row space of A/B is equal to the projection of the row space of A onto the row space of B.

Formula (13) corresponds to the optimal prediction of Y_{i|2i-1} given U_{0|2i-1} and Y_{0|i-1}, in the sense that ||Y_{i|2i-1} − Z_i||²_F is minimized constrained to row space Z_i ⊂ row space ( U_{0|2i-1} ; Y_{0|i-1} ).
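The block Hankel construction and the row-space projection A/B can be sketched as follows — a minimal illustration on toy random data (all names and data are assumptions for illustration, not part of the paper); the projection is computed through a least squares solve rather than the explicit inverse in (13):

```python
import numpy as np

def block_hankel(w, first, last, j):
    # w: (N, m) array whose k-th row is w_k; returns W_{first|last} with j columns.
    return np.vstack([np.column_stack([w[k + c] for c in range(j)])
                      for k in range(first, last + 1)])

def row_project(A, B):
    # A/B = A B' (B B')^{-1} B: project the row space of A onto the row
    # space of B (least squares avoids forming (B B')^{-1} explicitly).
    X, *_ = np.linalg.lstsq(B.T, A.T, rcond=None)
    return (B.T @ X).T

# Z_i = Y_{i|2i-1} / (U_{0|2i-1}; Y_{0|i-1}) as in (13), on toy data.
rng = np.random.default_rng(0)
i, j = 2, 50
u = rng.standard_normal((2 * i + j, 1))   # m = 1 input
y = rng.standard_normal((2 * i + j, 1))   # l = 1 output
U = block_hankel(u, 0, 2 * i - 1, j)      # U_{0|2i-1}
Yp = block_hankel(y, 0, i - 1, j)         # past outputs Y_{0|i-1}
Yf = block_hankel(y, i, 2 * i - 1, j)     # future outputs Y_{i|2i-1}
Zi = row_project(Yf, np.vstack([U, Yp]))
```

The residual Y_{i|2i-1} − Z_i is then orthogonal to the projected-onto row space, which is exactly the minimization property stated for (13).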

So, intuitively, the kth row of Z_i would correspond to a k step ahead prediction of the output. This intuition will become clearer in Section 4.

These projections (Z_i and Z_{i+1}) are useful in determining the combined system, since (as we will show in Theorem 2) the linear combinations to be made of the input-output block Hankel matrices to generate the matrices Z_i and Z_{i+1} are functions of the system matrices (A, B, C, D, Q^s, S^s, R^s). Moreover, the system matrices can be retrieved from these linear combinations, as will be explained in Section 5.

It is tedious though straightforward to prove the following theorem, which delivers formulas for the linear combinations to be made of the rows of the input-output block Hankel matrices to generate the matrices Z_i and Z_{i+1}.

Theorem 2. Main projection.
• If the deterministic input u_k and state x_k^d are uncorrelated with the stochastic output y_k^s:

lim_{j→∞} (1/j) Y_•^s U_•' = 0,  lim_{j→∞} (1/j) Y_•^s (X_•^d)' = 0,

where the subscript • denotes past or future,
• if the input is 'persistently exciting of order 2i' (Ljung, 1987): rank U_{0|2i-1} = 2mi, and
• if the stochastic subsystem is not identically zero (the purely deterministic case will be treated in Section 6.1.3),
then (for j → ∞)

Z_i = Γ_i X̂_i + H_i^d U_{i|2i-1},   (15)
Z_{i+1} = Γ_{i-1} X̂_{i+1} + H_{i-1}^d U_{i+1|2i-1},   (16)

with

X̂_i = ( A^i − Q_iΓ_i | Δ_i^d − Q_iH_i^d | Q_i ) ( S R^{-1} U_{0|2i-1} ; U_{0|i-1} ; Y_{0|i-1} ),   (17)
X̂_{i+1} = ( A^{i+1} − Q_{i+1}Γ_{i+1} | Δ_{i+1}^d − Q_{i+1}H_{i+1}^d | Q_{i+1} ) ( S R^{-1} U_{0|2i-1} ; U_{0|i} ; Y_{0|i} ),   (18)

where Q_i = X̄_i W_i^{-1} and

X̄_i = Δ_i^d ( P^d − S R^{-1} S' ) Γ_i' + Δ_i^s,   (19)
W_i = Γ_i ( P^d − S R^{-1} S' ) Γ_i' + L_i^s.   (20)

A proof can be found in Appendix A. In the next section, we give an interpretation of these projections.

4. A BANK OF KALMAN FILTERS

In this section, we show how the sequences X̂_i and X̂_{i+1} can be interpreted in terms of states of a bank of j non-steady state Kalman filters, applied in parallel to the data. This interpretation will lead to a formula that will prove to be extremely useful when determining the system matrices from the data.

As stated before, it may come as no surprise that there is a connection between the states X̂_i defined by the projection Z_i and some optimal prediction of the outputs Y_{i|2i-1}.

To establish this connection, we need one more theorem, which states how the non-steady state Kalman filter state estimate x̂_k can be written as a linear combination of u_0, ..., u_{k-1}, y_0, ..., y_{k-1} and the initial state estimate x̂_0.

Theorem 3. Kalman filter. Given x̂_0, P̄_0, u_0, ..., u_{k-1}, y_0, ..., y_{k-1} and all the system matrices (A, B, C, D, Q^s, S^s, R^s), then the non-steady state Kalman filter state x̂_k defined by the following recursive formulas:

x̂_k = A x̂_{k-1} + B u_{k-1} + K_{k-1} ( y_{k-1} − C x̂_{k-1} − D u_{k-1} ),   (21)
K_{k-1} = ( A P̄_{k-1} C' + G ) ( Λ_0 + C P̄_{k-1} C' )^{-1},   (22)
P̄_k = A P̄_{k-1} A' − ( A P̄_{k-1} C' + G ) ( Λ_0 + C P̄_{k-1} C' )^{-1} ( A P̄_{k-1} C' + G )',   (23)

can be written as

x̂_k = ( A^k − Q_kΓ_k | Δ_k^d − Q_kH_k^d | Q_k ) ( x̂_0 ; u_0 ; ... ; u_{k-1} ; y_0 ; ... ; y_{k-1} ),   (24)

where

Q_k = X̄_k W_k^{-1},   (25)
X̄_k = Δ_k^d P̄_0 Γ_k' + Δ_k^s,   (26)
W_k = Γ_k P̄_0 Γ_k' + L_k^s.   (27)

The proof of this theorem and some details concerning the special form of the Kalman filter equations (21)-(23) can be found in Van Overschee and De Moor (1992). Let us just indicate that the error covariance matrix P_k ≝ E[(x_k − x̂_k)(x_k − x̂_k)'] is given by P^s + P̄_k, with P^s the state covariance matrix from the Lyapunov equation (8).

Note that the limiting solution (k → ∞) of (23) is −P_*, where P_* is the state covariance matrix of the forward innovation model (Faure, 1976). Hence the limiting error covariance matrix is P_∞ = P^s − P_*, which is the smallest state error covariance matrix we can obtain (in the sense of nonnegative definiteness).

Also note that the expressions for X̄_k and W_k (26)-(27) are equal to the expressions for X̄_i and W_i (19)-(20) with P^d − SR^{-1}S' substituted by P̄_0.

If we now combine the results of Theorems 2 and 3, we find an interpretation of the sequences X̂_i and X̂_{i+1} in terms of states of a bank of non-steady state Kalman filters, applied in parallel to the data. More specifically, compare formulas (17), (18) and (24):

(1) The j columns of X̂_i are equal to the outputs of a bank of j non-steady state Kalman filters in parallel. The (p+1)th column of X̂_i, for instance, is equal to the non-steady state Kalman filter state x̂_{i+p} of the Kalman filter (21)-(23) with initial error covariance matrix at starting time p

P̄_p + P^s = P^d − SR^{-1}S' + P^s

and initial state estimate

x̂_p = SR^{-1} ( u_p ; ... ; u_{p+2i-1} ).

Notice that P̄_p is independent of the column index, so it is denoted with P̄_0. In this way, all the columns can be interpreted as Kalman filter states. The initial states of the j filters together can be written as

X̂_0 = SR^{-1} U_{0|2i-1}.

All this is clarified in Fig. 2.

The expressions for P̄_0 and X̂_0 can be interpreted (somewhat loosely) as follows: if we had no information at all about the initial state, then the initial state estimate would be x̂_0 = 0 and the initial error covariance would be equal to the expected variance of the state: P̄_0 = E[x_0 x_0'] = P^d + P^s. Now, since the inputs are possibly correlated, we can derive information about X_0^d out of the inputs U_{0|2i-1}. This is done by projecting the (unknown) exact initial state sequence X_0^d + X_0^s onto the row space of the inputs U_{0|2i-1}:

X̂_0 = ( X_0^d + X_0^s ) / U_{0|2i-1} = SR^{-1} U_{0|2i-1}.

This extra information on the initial state of the Kalman filter also implies that the error covariance matrix reduces from P^d + P^s to

P̄_0 = P^d + P^s − SR^{-1}S'.

These are exactly the same expressions for X̂_0 and P̄_0 as we found above.

It can also be seen that when the inputs are uncorrelated (white noise), the projection of X_0^d + X_0^s onto the inputs U_{0|2i-1} is zero, which implies that there is no information about the initial state x̂_0 contained in the inputs U_{0|2i-1}.

The state sequence X̂_{i+1} has a similar interpretation. The pth column of X̂_{i+1} is equal to the non-steady state Kalman filter state estimate of the same (in the sense of the same initial conditions) non-steady state Kalman filter bank as discussed

[Figure 2: diagram showing that column p of X̂_i is produced by a non-steady state Kalman filter run over the data window u_p, ..., u_{i+p-1} and y_p, ..., y_{i+p-1}, for p = 0, ..., j−1.]
FIG. 2. Interpretation of the sequence X̂_i as a sequence of non-steady state Kalman filter state estimates based upon i measurements of u_k and y_k.

above, but now the filter has iterated one step beyond the estimate of the pth column of X̂_i. This is valid for all columns p = 1, ..., j.

(2) We define the residuals ϱ_i of the projection as

ϱ_i ≝ Y_{i|2i-1} − Z_i = Y_{i|2i-1} − Γ_i X̂_i − H_i^d U_{i|2i-1}.   (28)

Since Z_i is the result of the projection of Y_{i|2i-1} on the row space of U_{0|2i-1} and Y_{0|i-1}, the residuals of this projection will always satisfy: ϱ_i U_{0|2i-1}' = 0, ϱ_i Y_{0|i-1}' = 0 and ϱ_i Z_i' = 0. Also, since X̂_i can be written as a linear combination of U_{0|2i-1} and Y_{0|i-1} (see formula (17)), we find: ϱ_i X̂_i' = 0.

(3) Since the corresponding columns of X̂_i and X̂_{i+1} are state estimates of the same (in the sense of the same initial conditions) non-steady state Kalman filter at two consecutive time instants, we can write (see formula (21))

X̂_{i+1} = A X̂_i + B U_{i|i} + K_i ( Y_{i|i} − C X̂_i − D U_{i|i} ).   (29)

It is also trivial that

Y_{i|i} = C X̂_i + D U_{i|i} + ( Y_{i|i} − C X̂_i − D U_{i|i} ).   (30)

If we inspect the formula for ϱ_i a little bit closer (28), we see that its first block row is equal to Y_{i|i} − C X̂_i − D U_{i|i}. And since we know that the row space of ϱ_i (and thus also the first l rows of ϱ_i) is perpendicular to U_{0|2i-1}, Y_{0|i-1} and X̂_i, we find (together with (29) and (30))

X̂_{i+1} = A X̂_i + B U_{i|i} + ( U_{0|2i-1} ; Y_{0|i-1} ; X̂_i )^⊥,   (31)
Y_{i|i} = C X̂_i + D U_{i|i} + ( U_{0|2i-1} ; Y_{0|i-1} ; X̂_i )^⊥,   (32)

where (·)^⊥ indicates a matrix whose row space is perpendicular to the row space of (·). These formulas will prove to be extremely useful in the next section, where we determine the system matrices from Z_i and Z_{i+1}.

This summarizes the whole interpretation as a bank of non-steady state Kalman filters.

5. IDENTIFICATION SCHEME

In this section, we derive an N4SID algorithm to identify exactly (unbiased for j → ∞) the deterministic subsystem, directly from the given inputs u_k and outputs y_k. The stochastic subsystem can be determined in an approximate sense.

5.1. The projections
First, the projections Z_i and Z_{i+1} (13)-(14) have to be calculated. In Section 7 we will describe a numerically stable way to do this.

For convenience, we rewrite these projections as follows:

Z_i = ( L_i^1 | L_i^2 | L_i^3 ) ( U_{0|i-1} ; U_{i|2i-1} ; Y_{0|i-1} ),   (33)

with the three blocks of dimensions li × mi, li × mi and li × li, and

Z_{i+1} = ( L_{i+1}^1 | L_{i+1}^2 | L_{i+1}^3 ) ( U_{0|i} ; U_{i+1|2i-1} ; Y_{0|i} ),   (34)

with the three blocks of dimensions l(i−1) × m(i+1), l(i−1) × m(i−1) and l(i−1) × l(i+1), and, from (15)-(18),

L_i^1 = Γ_i ( [A^i − Q_iΓ_i] S(R^{-1})_{1|mi} + Δ_i^d − Q_iH_i^d ),   (35)
L_i^2 = H_i^d + Γ_i [A^i − Q_iΓ_i] S(R^{-1})_{mi+1|2mi},   (36)
L_i^3 = Γ_i Q_i,   (37)

with (R^{-1})_{1|mi} denoting the submatrix of R^{-1} from column 1 to column mi. The expressions for L_{i+1}^1, L_{i+1}^2 and L_{i+1}^3 are similar, but with shifted indices.

5.2. Determination of Γ_i and n
An important observation is that the column space of the matrices L_i^1 and L_i^3 coincides with the column space of Γ_i. This implies that Γ_i and the order of the system n can be determined from the column space of one of these matrices. The basis for this column space actually determines the basis for the states of the final (identified) state space description.

Let us mention two other possible matrices that have the same column space as Γ_i:

L_i^1 (L_i^1)' + L_i^3 (L_i^3)',   (38)

and

( L_i^1 | L_i^3 ) ( U_{0|i-1} ; Y_{0|i-1} ).   (39)

It should be mentioned that, for i → ∞, the first one (38) will lead to a deterministic subsystem that is balanced (see also Moonen and Ramos, 1991), while the second one (39) leads to a deterministic subsystem that is frequency weighted (with the input spectrum) balanced (Enns, 1984), together with a stochastic subsystem of which the forward innovation model is balanced in a deterministic sense.

We will not expand any more on this, but keep this for future work.

We can now determine Γ_i, Γ_{i-1} and the order n as follows: let T be any rank deficient matrix of which the column space coincides with that of Γ_i.
• Calculate the Singular Value Decomposition

T = U_1 Σ_1 V_1',

where Σ_1 contains the non-zero singular values.
• Since T is of rank n, the number of singular values different from zero will be equal to the order of the system.
• The column spaces of Γ_i and U_1Σ_1^{1/2} coincide.* So, Γ_i can be put equal to U_1Σ_1^{1/2}.
• With Γ_{i-1} defined as Γ_i without the last l rows (l is the number of outputs), we also obtain Γ_{i-1}.

In the following, we will take T equal to the expression in formula (39), but one can replace this with any other matrix of which the column space coincides with that of Γ_i.

5.3. Determination of the system matrices
We now assume that Γ_i, Γ_{i-1} and n are determined as described in the previous section, and are thus known. From (15) and (16) it follows that:†

X̂_i = Γ_i† ( Z_i − H_i^d U_{i|2i-1} ),   (40)
X̂_{i+1} = Γ_{i-1}† ( Z_{i+1} − H_{i-1}^d U_{i+1|2i-1} ).   (41)

In these formulas, the only unknowns on the right-hand side are the matrices H_i^d and H_{i-1}^d. From (31) and (32) we also know that

( X̂_{i+1} ; Y_{i|i} ) = ( A ; C ) X̂_i + ( B ; D ) U_{i|i} + ( U_{0|2i-1} ; Y_{0|i-1} ; X̂_i )^⊥.   (42)

If we now substitute the expressions for X̂_i and X̂_{i+1} (40)-(41) into this formula, we get

( Γ_{i-1}†Z_{i+1} ; Y_{i|i} ) = ( A ; C ) Γ_i†Z_i  [Term 1]  + ( 𝒦_{12} ; 𝒦_{22} ) U_{i|2i-1}  [Term 2]  + ( U_{0|2i-1} ; Y_{0|i-1} ; X̂_i )^⊥  [Term 3],   (43)

where we define

( 𝒦_{12} ; 𝒦_{22} ) ≝ ( B  Γ_{i-1}†H_{i-1}^d ; D  0 ) − ( A ; C ) Γ_i†H_i^d.   (44)

Observe that the matrices B and D appear linearly in the matrices 𝒦_{12} and 𝒦_{22}.

Let Π be a matrix whose row space coincides with that of

( Γ_i†Z_i ; U_{i|2i-1} ),

then (from (43), since Term 3 is perpendicular to Π)

( Γ_{i-1}†Z_{i+1} ; Y_{i|i} ) / Π = ( A ; C ) Γ_i†Z_i + ( 𝒦_{12} ; 𝒦_{22} ) U_{i|2i-1}.

Obviously, this is a set of linear equations in the unknowns A, C, 𝒦_{12} and 𝒦_{22}.

Another point of view is that one could solve the least squares problem

min_{A, C, 𝒦_{12}, 𝒦_{22}} || ( Γ_{i-1}†Z_{i+1} ; Y_{i|i} ) − ( A ; C ) Γ_i†Z_i − ( 𝒦_{12} ; 𝒦_{22} ) U_{i|2i-1} ||²_F.   (45)

Either way, from (43) we find (term by term):

Term 1. A and C exactly.

Term 2. 𝒦_{12} and 𝒦_{22}, from which B and D can be unraveled by solving a set of linear equations, analogous to the one described in De Moor (1988). Note that in (44), B and D appear linearly. Hence, if A, C, Γ_i, Γ_{i-1}, 𝒦_{12} and 𝒦_{22} are known, solving for B and D is equivalent to solving a set of linear equations.

Term 3. The residuals of the least squares solution (43) can be written as

ϱ = ( W_{i|i} ; V_{i|i} ),

where W_{i|i} and V_{i|i} are block Hankel matrices‡ with as entries w_k and v_k: the process and measurement noise. This is clearly indicated by equation (42). The system matrices Q^s, S^s and R^s are determined approximately from ϱ as

( Q^s  S^s ; (S^s)'  R^s ) ≈ (1/j) ϱ ϱ'.

---
* The factor Σ_1^{1/2} is introduced for symmetry reasons.
† (·)† denotes the Moore-Penrose pseudo-inverse.
‡ The block Hankel matrices of the residual ϱ have only one block row. The notation is introduced to be consistent with previous notations.
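The linear-algebra core of a least squares step of this kind, together with the residual covariance estimate, can be sketched on a toy, noise-free deterministic example — a simplified stand-in for (45) that assumes the state sequences are already available (all names and data below are illustrative assumptions, not the paper's algorithm):

```python
import numpy as np

def system_matrices_from_states(Xi, Xip1, Ui, Yi):
    # Least squares fit of [Xip1; Yi] ~ [A B; C D] [Xi; Ui]; the residuals
    # then give a covariance estimate (1/j) P P' as in Term 3 above.
    n = Xi.shape[0]
    lhs = np.vstack([Xip1, Yi])
    rhs = np.vstack([Xi, Ui])
    Theta = lhs @ np.linalg.pinv(rhs)
    resid = lhs - Theta @ rhs
    cov = (resid @ resid.T) / Xi.shape[1]
    A, B = Theta[:n, :n], Theta[:n, n:]
    C, D = Theta[n:, :n], Theta[n:, n:]
    return A, B, C, D, cov

# Noise-free toy data: the estimates reproduce (A0, B0, C0, D0) exactly.
rng = np.random.default_rng(2)
A0 = np.array([[0.8, 0.1], [0.0, 0.6]]); B0 = np.array([[1.0], [0.5]])
C0 = np.array([[1.0, 1.0]]); D0 = np.array([[0.2]])
j = 30
u = rng.standard_normal((1, j))
x = np.zeros((2, j + 1))
for k in range(j):
    x[:, k + 1] = A0 @ x[:, k] + (B0 @ u[:, k:k+1]).ravel()
y = C0 @ x[:, :j] + D0 @ u
A, B, C, D, cov = system_matrices_from_states(x[:, :j], x[:, 1:], u, y)
```

In Algorithm 1 itself, the states are obtained from the projections (40)-(41) rather than assumed known; this sketch only isolates the linear least squares step.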

The approximation is due to the fact that the bank of Kalman filters for finite i is not in steady state (see for instance the Riccati difference equation (23)). As i grows larger, the approximation error grows smaller. For infinite i the stochastic subsystem is determined exactly (unbiased). More details on this purely stochastic aspect can be found in Van Overschee and De Moor (1991a, b).

5.4. N4SID Algorithm 1
In this section, we summarize the first N4SID algorithm step by step, not yet paying attention to the fine numerical details, which will be treated in Section 7.

(1) Determine the projections

Z_i = Y_{i|2i-1} / ( U_{0|i-1} ; U_{i|2i-1} ; Y_{0|i-1} ) = ( L_i^1 | L_i^2 | L_i^3 ) ( U_{0|i-1} ; U_{i|2i-1} ; Y_{0|i-1} ),

with the blocks L_i^1, L_i^2, L_i^3 of dimensions li × mi, li × mi and li × li, and

Z_{i+1} = Y_{i+1|2i-1} / ( U_{0|i} ; U_{i+1|2i-1} ; Y_{0|i} ).

(2) Determine the Singular Value Decomposition

( L_i^1 | L_i^3 ) ( U_{0|i-1} ; Y_{0|i-1} ) = U_1 Σ_1 V_1'.

The order is equal to the number of non-zero singular values. Put

Γ_i = U_1 Σ_1^{1/2},  Γ_{i-1} = Γ_i without its last l rows.

(3) Determine the least squares solution (P_1 and P_2 are the residuals)

( Γ_{i-1}†Z_{i+1} ; Y_{i|i} ) = ( 𝒦_{11}  𝒦_{12} ; 𝒦_{21}  𝒦_{22} ) ( Γ_i†Z_i ; U_{i|2i-1} ) + ( P_1 ; P_2 ),

where Γ_i†Z_i has n rows, U_{i|2i-1} has mi rows, and all matrices have j columns.

(4) The system matrices are determined as follows:
(i) A ← 𝒦_{11}, C ← 𝒦_{21}.
(ii) B, D follow from A, C and 𝒦_{12}, 𝒦_{22} through a set of linear equations.
(iii) ( Q^s  S^s ; (S^s)'  R^s ) ← (1/j) ( P_1P_1'  P_1P_2' ; P_2P_1'  P_2P_2' ).

The deterministic subsystem will be identified exactly (as j → ∞, independent of i). The approximation of the stochastic subsystem is still dependent on i, and converges as i → ∞.

6. A SIMPLE APPROXIMATE SOLUTION

In this section, we introduce an N4SID algorithm that is very similar to the 'exact' (as j → ∞) algorithm of the previous section (Algorithm 1). The algorithm we present now finds a good approximation to the state X̂_i, and to the system matrices, without having to go through the complicated Step 4 for the determination of B and D. This results in a simple and elegant algorithm with a slightly lower computational complexity as compared to Algorithm 1. Another advantage of this simplified N4SID algorithm is that it is very closely related to existing algorithms (Larimore, 1990). This means that the analysis of this simplified algorithm can also be applied to the other algorithms, and can thus contribute to a better understanding of the mechanism of these algorithms. A disadvantage is that the results are not exact (unbiased) for finite i (except for special cases), but an estimate for the bias on the solutions can be calculated.

If we could determine the state sequences X̂_i and X̂_{i+1} directly from the data, the matrices A, B, C and D could be found as the least squares solution of (31)-(32).

Unfortunately, it is not possible to separate the effect of the input H_i^dU_{i|2i-1} from the effect of the state Γ_iX̂_i in formula (15) by just dropping the term with the linear combinations of U_{i|2i-1} (L_i^2) in the expression for Z_i. We would automatically drop a part of the initial state if we did this (see for instance (36)). So, it is not possible to obtain an explicit expression for Γ_iX̂_i and Γ_{i-1}X̂_{i+1} without knowledge of H_i^d, which would require knowledge of the system matrices.

It is however easy to find a good approximation of the state sequences, directly from the data. If we use this approximation in (31) and (32), we obtain a second very elegant and simple N4SID algorithm that calculates approximations of the system matrices.

6.1. The approximate states
In this section, we derive an approximate expression for the states X̂_i and X̂_{i+1}. This approximation can be calculated directly from input-output data.
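Step (4)(iii) of Algorithm 1 above estimates the noise covariances from the least squares residuals; a minimal sketch (all matrix names and the toy residuals are illustrative assumptions):

```python
import numpy as np

def noise_covariances(P1, P2):
    # Step (4)(iii): (Q^s S^s; (S^s)' R^s) <- (1/j) (P1; P2)(P1; P2)'.
    j = P1.shape[1]
    P = np.vstack([P1, P2])
    Sig = (P @ P.T) / j
    n = P1.shape[0]
    return Sig[:n, :n], Sig[:n, n:], Sig[n:, n:]   # Q^s, S^s, R^s

# Toy residuals: n = 2 state rows, l = 1 output row, j = 1000 columns.
rng = np.random.default_rng(3)
Qs, Ss, Rs = noise_covariances(rng.standard_normal((2, 1000)),
                               rng.standard_normal((1, 1000)))
```

By construction the estimate is a scaled Gram matrix, so the assembled covariance block matrix is automatically symmetric and positive semi-definite.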

approximation can be calculated directly from input-output data.
The approximate state sequences are calculated by dropping the linear combinations of U_{i|2i-1} out of Z_i, and the linear combinations of U_{i+1|2i-1} out of Z_{i+1}. In this way, we obtain Kalman filter states of a different Kalman filter compared to the one that produces X̃_i (in a sense of different initial conditions). We call the resulting matrices Γ_i X̂_i and Γ_{i-1} X̂_{i+1}:

    Γ_i X̂_i = Z_i − L_2^i U_{i|2i-1}    (46)

and

    Γ_{i-1} X̂_{i+1} = Z_{i+1} − L_2^{i+1} U_{i+1|2i-1}.    (47)

From (33) and (34) we derive

    X̂_i = ([A^i − Q_iΓ_i] S (R^{-1})_{1|mi} + Δ_i − Q_iH_i) (U_{0|i-1} ; Y_{0|i-1}),    (48)
    X̂_{i+1} = ([A^{i+1} − Q_{i+1}Γ_{i+1}] S (R^{-1})_{1|m(i+1)} + Δ_{i+1} − Q_{i+1}H_{i+1}) (U_{0|i} ; Y_{0|i}).

The matrices Γ_i X̂_i and Γ_{i-1} X̂_{i+1} can be obtained directly from the data without any knowledge of the system matrices, and can be interpreted as oblique projections as described in Van Overschee and De Moor (1992).
The state sequence X̂_i is generated by a bank of non-steady state Kalman filters, with P_0 = P^d − SR^{-1}S' + P^s and X̂_0 = S(R^{-1})_{1|mi} U_{0|i-1}. The state sequence X̂_{i+1}, on the other hand, is generated by a bank of non-steady state Kalman filters, with P_0 = P^d − SR^{-1}S' + P^s and X̂_0 = S(R^{-1})_{1|m(i+1)} U_{0|i}. So clearly, both sequences do not belong to the same bank of Kalman filters, and the useful formulas (31)-(32) are not valid for these two sequences.
Still, we will see below that X̂_i and X̂_{i+1} are very close to X̃_i and X̃_{i+1}, and that X̂_i = X̃_i, X̂_{i+1} = X̃_{i+1} if at least one of the following conditions is satisfied:
• i → ∞.
• The deterministic input u_k of the combined deterministic-stochastic system is white noise.
• The system is purely deterministic.
For these three special cases, we will analyze the difference between X̃_i (17) and X̂_i (48), defined as δX_i:

    δX_i ≝ X̃_i − X̂_i = [A^i − Q_iΓ_i] S (R^{-1})_{mi+1|2mi} U_{i|2i-1}.    (49)

6.1.1. i → ∞. Define the term between square brackets in (49) as Φ_i ≝ A^i − Q_iΓ_i. Now, it is easy to prove (Van Overschee and De Moor, 1992) that

    Φ_i = ∏_{k=0}^{i-1} (A − K_kC).    (50)

Since the non-steady state Kalman filter converges to a stable closed-loop system (Anderson and Moore, 1979), we find that Φ_i grows smaller when i grows larger. It is also clear that lim_{i→∞} Φ_i = 0. In the limit, equation (49) thus becomes (with S(R^{-1})_{mi+1|2mi} U_{i|2i-1} finite)

    lim_{i→∞} δX_i = lim_{i→∞} Φ_i S (R^{-1})_{mi+1|2mi} U_{i|2i-1} = 0.

The same holds for δX_{i+1}. So, we can conclude that, for i → ∞, there is no difference between the state sequences X̂_i and X̃_i. Actually, when i → ∞, the non-steady state Kalman filter bank converges to a steady state Kalman filter bank, i.e. the Kalman filters converge (the Riccati difference equation (23) converges to an algebraic Riccati equation). To obtain the effect of i → ∞ on a computer, in theory we would need a test like

    ||δX_i|| / √(n·j) ≤ ε_mach ||X̂_i|| / √(n·j),

where ε_mach is the machine precision. This is a condition that can only be checked after the identification is done. Appendix B indicates how this quantity can be calculated.
In practice, it turns out that i does not have to be that large. The identification results are already very good for reasonably small i (≈10). This will become apparent in the examples of Section 9.
6.1.2. u_k white noise. With the deterministic input u_k white noise, we find S = 0 and R = I_{2mi}, with I_{2mi} the 2mi × 2mi identity matrix. So, for a white noise input, we find for any i (see (49)): δX_i = 0.
6.1.3. Purely deterministic system. From Moonen et al. (1989), we know that generically for deterministic systems we have (if there is no 'rank-cancellation', see De Moor (1988))

    rank (U_{0|i-1} ; Y_{0|i-1}) = mi + n and rank (U_{i|2i-1} ; Y_{i|2i-1}) = mi + n,    (51)

which implies that

    rank (U_{0|2i-1} ; Y_{0|i-1}) = 2mi + n.

This rank deficiency implies that for purely deterministic systems, the proof of Theorem 2

breaks down (see Appendix A: the required inverse cannot be calculated). The following theorem is an alternative for Theorem 2 for purely deterministic systems:

Theorem 4. Purely deterministic systems. If the input is persistently exciting and the stochastic subsystem is zero, we have

    Z_i = Y_{i|2i-1} / (U_{0|2i-1} ; Y_{0|i-1}) = Γ_i X̃_i = Γ_i X̂_i.

A proof can be found in Van Overschee and De Moor (1992). This implies that for deterministic systems, δX_i is also equal to zero.

6.2. The algorithm
We know from (31) and (32) that the least squares solution 𝒦 of

    min_𝒦 || (X̃_{i+1} ; Y_{i|i}) − 𝒦 (X̃_i ; U_{i|i}) ||²

is equal to

    𝒦 = (A  B ; C  D).

We also know that from the residuals of this least squares problem, we can approximately calculate the stochastic subsystem (exactly if i → ∞). Unfortunately, it is impossible to calculate the states X̃_i and X̃_{i+1} directly from the data without any knowledge of the system matrices. In the previous section, we have seen that for some special cases X̂_i = X̃_i. In the general case, we have X̂_i ≈ X̃_i if i is reasonably large. This is because (Section 6.1.1)

    δX_i = ∏_{k=0}^{i-1} (A − K_kC) S (R^{-1})_{mi+1|2mi} U_{i|2i-1}.

Consequently, for the least squares solution 𝒦̂ of

    min_𝒦̂ || (X̂_{i+1} ; Y_{i|i}) − 𝒦̂ (X̂_i ; U_{i|i}) ||²    (52)

we have 𝒦̂ ≈ 𝒦. Contrary to X̃_i, X̂_i can be calculated directly from the data, without any knowledge of the system matrices. The solution is exact (𝒦̂ = 𝒦) for the cases of Section 6.1.1 (i → ∞), Section 6.1.2 (input is white noise) and Section 6.1.3 (purely deterministic), since then X̂_i = X̃_i. In Appendix B, an approximate expression for (𝒦̂ − 𝒦) is derived. This is the bias on the solution.

6.3. N4SID algorithm 2
In this section, we summarize the second N4SID algorithm step by step, not yet paying attention to the fine numerical details, which will be treated in Section 7.
(1) Determine the projections

    Z_i = Y_{i|2i-1} / (U_{0|i-1} ; U_{i|2i-1} ; Y_{0|i-1}) = (L_1^i  L_2^i  L_3^i) (U_{0|i-1} ; U_{i|2i-1} ; Y_{0|i-1})

and

    Z_{i+1} = Y_{i+1|2i-1} / (U_{0|i} ; U_{i+1|2i-1} ; Y_{0|i}) = (L_1^{i+1}  L_2^{i+1}  L_3^{i+1}) (U_{0|i} ; U_{i+1|2i-1} ; Y_{0|i}).

(2) Determine the Singular Value Decomposition of (L_1^i  L_3^i)(U_{0|i-1} ; Y_{0|i-1}). The order is equal to the number of non-zero singular values, Γ_i = U_1 Σ_1^{1/2}, and Γ_{i-1} = Γ_i with the last l rows deleted.
(3) Determine the states X̂_i and X̂_{i+1}:

    X̂_i = Γ_i^† (L_1^i  L_3^i) (U_{0|i-1} ; Y_{0|i-1}) and X̂_{i+1} = Γ_{i-1}^† (L_1^{i+1}  L_3^{i+1}) (U_{0|i} ; Y_{0|i}).

(4) Determine the least squares solution

    (X̂_{i+1} ; Y_{i|i}) = (𝒦̂_11  𝒦̂_12 ; 𝒦̂_21  𝒦̂_22) (X̂_i ; U_{i|i}) + (ρ_1 ; ρ_2).

(5) The system matrices are (approximately) determined as follows:

    (i)  (A  B ; C  D) ≈ (𝒦̂_11  𝒦̂_12 ; 𝒦̂_21  𝒦̂_22),
    (ii) (Q^s  S^s ; (S^s)'  R^s) ≈ (1/j) (ρ_1ρ_1'  ρ_1ρ_2' ; ρ_2ρ_1'  ρ_2ρ_2'),

where in the general case the approximation of the deterministic and stochastic subsystem depends on i (even when j → ∞) and converges as i, j → ∞.
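The flavour of these five steps can be sketched in a few lines of NumPy. The fragment below is an illustrative sketch only: the first-order test system, the tiny measurement noise, the rank threshold and all variable names are assumptions, and the projections are computed with ordinary least squares instead of the RQ factorization of Section 7. A white noise input is used, which is the case of Section 6.1.2 where X̂_i = X̃_i.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed first-order SISO test system (not from the paper)
A, B, C, D = 0.9, 0.5, 1.0, 0.2
N = 2000
u = rng.standard_normal(N)          # white noise input (Section 6.1.2 case)
x = 0.0
y = np.empty(N)
for k in range(N):
    y[k] = C * x + D * u[k] + 1e-3 * rng.standard_normal()
    x = A * x + B * u[k]

i = 4                    # number of block rows
j = N - 2 * i + 1        # number of block columns

def hankel(w, first, last):
    # block Hankel matrix with rows w_first ... w_last, j columns
    return np.vstack([w[r:r + j] for r in range(first, last + 1)])

Up, Uf = hankel(u, 0, i - 1), hankel(u, i, 2 * i - 1)
Yp, Yf = hankel(y, 0, i - 1), hankel(y, i, 2 * i - 1)
Up1, Uf1 = hankel(u, 0, i), hankel(u, i + 1, 2 * i - 1)
Yp1, Yf1 = hankel(y, 0, i), hankel(y, i + 1, 2 * i - 1)

def split_projection(Yfut, blocks):
    # Step 1: regress the future outputs on the stacked rows; return the L blocks
    W = np.vstack(blocks)
    L = np.linalg.lstsq(W.T, Yfut.T, rcond=None)[0].T
    return np.hsplit(L, np.cumsum([b.shape[0] for b in blocks])[:-1])

L1, L2, L3 = split_projection(Yf, [Up, Uf, Yp])
M1, M2, M3 = split_projection(Yf1, [Up1, Uf1, Yp1])

# Step 2: SVD of (L1 L3)(Up; Yp) gives the order and Gamma_i
O = L1 @ Up + L3 @ Yp
U_, s, Vt = np.linalg.svd(O, full_matrices=False)
n = int(np.sum(s > 1e-2 * s[0]))     # order from the dominant singular values
Gam = U_[:, :n] * np.sqrt(s[:n])
Gam_m = Gam[:-1, :]                  # Gamma_{i-1}: last l (= 1) rows deleted

# Step 3: approximate state sequences
Xi = np.linalg.pinv(Gam) @ O
Xi1 = np.linalg.pinv(Gam_m) @ (M1 @ Up1 + M3 @ Yp1)

# Step 4: one least squares problem for all system matrices
lhs = np.vstack([Xi1, Yf[:1]])       # (X_{i+1}; Y_{i|i})
rhs = np.vstack([Xi, Uf[:1]])        # (X_i;     U_{i|i})
K = lhs @ np.linalg.pinv(rhs)
A_hat = K[:n, :n]                    # Step 5: A is the top-left block

print("order:", n, "pole:", np.real(np.linalg.eigvals(A_hat)))
```

For this white-noise-input case the recovered pole is close to the true value 0.9, up to the state space basis chosen by the SVD (the eigenvalues of A are basis invariant).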

7. A NUMERICALLY STABLE AND EFFICIENT IMPLEMENTATION
In this section, we describe how N4SID Algorithms 1 and 2 can be implemented in a numerically stable and efficient way. We make extensive use of the QR decomposition and the SVD.
In Sections 5.4 and 6.3 we described, step by step, two N4SID algorithms that determine the system matrices from given input-output data. In this section, we will show how these two algorithms can be implemented in a numerically stable and efficient way. Steps 1 through 4 are common to both algorithms.
(1) Construct the block Hankel matrix*

    ℋ = (U_{0|2i-1} ; Y_{0|2i-1}) / √j ∈ R^{2(m+l)i × j}.

(2) Calculate the R factor of the RQ factorization of ℋ:

    ℋ = R Q',  R ∈ R^{2(m+l)i × 2(m+l)i},

with Q'Q = I and R lower triangular. It is important to note that in the final calculations only the R factor is needed, which lowers the computational complexity significantly. Partition this factorization in the following way:

    (U_{0|i-1} ; U_{i|i} ; U_{i+1|2i-1} ; Y_{0|i-1} ; Y_{i|i} ; Y_{i+1|2i-1}) / √j =

        (R_11   0     0     0     0     0   ;
         R_21   R_22  0     0     0     0   ;
         R_31   R_32  R_33  0     0     0   ;
         R_41   R_42  R_43  R_44  0     0   ;
         R_51   R_52  R_53  R_54  R_55  0   ;
         R_61   R_62  R_63  R_64  R_65  R_66) (Q_1' ; Q_2' ; Q_3' ; Q_4' ; Q_5' ; Q_6'),

with block rows and columns of dimensions mi, m, m(i−1), li, l and l(i−1), where we use the shorthand notation R_{4:6,1:3} for the submatrix of R consisting of block rows 4-6 and block columns 1-3.
(3) Calculate the projections (see also the note at the end of this section):

    Z_i = R_{5:6,1:4} R_{1:4,1:4}^{-1} (U_{0|2i-1} ; Y_{0|i-1}) = (L_1^i  L_2^i  L_3^i) (U_{0|i-1} ; U_{i|2i-1} ; Y_{0|i-1})

and

    Z_{i+1} = R_{6:6,1:5} R_{1:5,1:5}^{-1} (U_{0|2i-1} ; Y_{0|i}) = (L_1^{i+1}  L_2^{i+1}  L_3^{i+1}) (U_{0|i} ; U_{i+1|2i-1} ; Y_{0|i}).

(4) Determine Γ_i and n through the SVD of Γ_i X̂_i:

    (L_1^i  0  L_3^i) R_{1:4,1:4} Q_{1:4}' = (U_1  U_2) (Σ_1  0 ; 0  0) (V_1' ; V_2') Q_{1:4}'.

Note that this SVD can be calculated through the SVD of (L_1^i  0  L_3^i) R_{1:4,1:4}, since Q_{1:4} is an orthonormal matrix. The rank is determined from the dominant singular values of this decomposition (Σ_1), and Γ_i can be chosen as

    Γ_i = U_1 Σ_1^{1/2},  Γ_{i-1} = Γ_i with the underbar meaning deletion of the last l rows (l is the number of outputs).

The two algorithms described in Sections 5 and 6 now differ in Steps 5 and 6.

N4SID Algorithm 1.
(5) We find for the left-hand side and right-hand side of equation (43), written as a function of the original Q matrix,

    (Γ_{i-1}^† Z_{i+1} ; Y_{i|i}) = (Γ_{i-1}^† R_{6:6,1:5} ; R_{5:5,1:5}) Q_{1:5}'

and

    (Γ_i^† Z_i ; U_{i|2i-1}) = (Γ_i^† R_{5:6,1:4} ; R_{2:3,1:4}) Q_{1:4}'.

(6) Now the least squares problem (45) can be rewritten as

    min_𝒦 || (Γ_{i-1}^† R_{6:6,1:4} ; R_{5:5,1:4}) − 𝒦 (Γ_i^† R_{5:6,1:4} ; R_{2:3,1:4}) ||_F²

* The scalar 1/√j is used to conform with the definition of E_j.

and solved in a least squares sense for 𝒦. Note that the Q matrix has completely disappeared from these final formulas. The first n columns of 𝒦 are A and C stacked on top of each other. The next columns determine B and D as described in Section 5. The residuals of this solution determine the stochastic system as described in Section 5.

N4SID Algorithm 2.
(5) We have

    (X̂_i ; U_{i|i}) = (Σ_1^{-1/2} U_1' (L_1^i  0  L_3^i) R_{1:4,1:4} ; R_{2:2,1:4}) Q_{1:4}'

and

    (X̂_{i+1} ; Y_{i|i}) = (Γ_{i-1}^† (L_1^{i+1}  0  L_3^{i+1}) R_{1:5,1:5} ; R_{5:5,1:5}) Q_{1:5}'.

(6) Now the least squares problem (52) can be rewritten as

    min_𝒦̂ || (Γ_{i-1}^† (L_1^{i+1}  0  L_3^{i+1}) R_{1:5,1:4} ; R_{5:5,1:4}) − 𝒦̂ (Σ_1^{-1/2} U_1' (L_1^i  0  L_3^i) R_{1:4,1:4} ; R_{2:2,1:4}) ||_F².

Once again, the Q matrix has completely disappeared from these final formulas. From 𝒦̂ and the residuals we can find the system matrices as described in Section 6.

Note.
• The projection of Step 3 is written as a function of U and Y, and not as a function of Q, because we need to be able to drop the linear combinations (L_2^i) of the rows of U_{i|2i-1} in Z_i to find the sequence Γ_i X̂_i, and the linear combinations (L_2^{i+1}) of the rows of U_{i+1|2i-1} in Z_{i+1} to find the sequence Γ_{i-1} X̂_{i+1} (see formulas (46)-(47)).
• The possible rank deficiency of the row space we are projecting on (in Step 3) has to be taken into account (see Section 6.1.3). This occurs when R_{1:4,1:4} and R_{1:5,1:5} are rank deficient. It can be checked by inspection of the singular values of R_{1:4,1:4}. If one of them turns out to be zero (purely deterministic case), a basis Π for the space we are projecting on has to be selected. This basis should contain the row vectors of U_{i|2i-1} explicitly. This makes it easy to drop the linear combinations of U_{i|2i-1} (L_2^i) in Z_i to obtain the sequence Γ_i X̂_i. With the singular value decomposition

    (U_{0|i-1} ; Y_{0|i-1}) = (R_{1:1,1:4} ; R_{4:4,1:4}) Q_{1:4}' = (U_1  U_2) (Σ_1  0 ; 0  0) (V_1' ; V_2') Q_{1:4}',

we find as an appropriate basis

    Π = (V_1' ; R_{2:3,1:4}) Q_{1:4}'.

The rest of the algorithm is very similar to the general case. We should note that in the generic real life case, the problem of rank deficiency is a purely academic one.

8. CONNECTION WITH EXISTING ALGORITHMS
8.1. Instrumental variable method
This method was described by De Moor et al. (1991) and Verhaegen (1991). The basic idea will be shortly repeated here. We start from the input-output equation (11): Y_{i|2i-1} = Γ_i X_i^d + H_i^d U_{i|2i-1} + Y_{i|2i-1}^s. Projection of this equation on the row space perpendicular to that of U_{i|2i-1} gives

    Y_{i|2i-1} / U_{i|2i-1}^⊥ = Γ_i X_i^d / U_{i|2i-1}^⊥ + Y_{i|2i-1}^s.    (53)

Note that Y_{i|2i-1}^s / U_{i|2i-1}^⊥ = Y_{i|2i-1}^s, since Y_{i|2i-1}^s / U_{i|2i-1} = 0 as j → ∞. Projection of (53) on U_{0|i-1} gives

    [Y_{i|2i-1} / U_{i|2i-1}^⊥] / U_{0|i-1} = Γ_i [X_i^d / U_{i|2i-1}^⊥] / U_{0|i-1},

since Y_{i|2i-1}^s / U_{0|i-1} = 0. It can be seen from this equation that the column space of [Y_{i|2i-1}/U_{i|2i-1}^⊥]/U_{0|i-1} is equal to that of Γ_i (only the deterministic controllable modes are observed). So, from the column space of Γ_i, A and C can be derived. B and D are then found from the fact that

    (Γ_i^⊥)' [Y_{i|2i-1} / U_{i|2i-1}] U_{i|2i-1}^† = (Γ_i^⊥)' H_i^d,

which leads to a set of linear equations in B and D. Finally, the stochastic subsystem is identified from the difference between the measured output and a simulation of the identified deterministic subsystem with the given input u_k. This difference is almost equal to the output of the stochastic subsystem (y_k^s). The stochastic subsystem can then be identified with, for instance, one of the algorithms described by Arun and Kung (1990) and Van Overschee and De Moor (1991a, b).
A clear disadvantage of this algorithm is the fact that the deterministic and stochastic subsystems are identified separately. This involves more calculations. Two different A matrices will also be identified (one for the deterministic and one for the stochastic system), even though a lot of the dynamics could be shared between the two subsystems.
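The perpendicular projection step (53) can be illustrated on a small simulated example. This is a hedged sketch: the first-order deterministic system, the rank tolerance and all names are assumptions, and the projection is written out with a plain pseudo-inverse rather than a numerically careful implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
# Assumed first-order deterministic SISO system, for illustration only
A, B, C, D = 0.8, 1.0, 1.0, 0.0
N = 1500
u = rng.standard_normal(N)
x = 0.0
y = np.empty(N)
for k in range(N):
    y[k] = C * x + D * u[k]
    x = A * x + B * u[k]

i = 3
j = N - 2 * i + 1
H = lambda w, a, b: np.vstack([w[r:r + j] for r in range(a, b + 1)])
Up, Uf = H(u, 0, i - 1), H(u, i, 2 * i - 1)
Yf = H(y, i, 2 * i - 1)

# Project Yf on the row space perpendicular to Uf (removes the H^d Uf term)
YfP = Yf - (Yf @ Uf.T) @ np.linalg.pinv(Uf @ Uf.T) @ Uf
# Then correlate with the past inputs (the instrumental variable step)
G = YfP @ Up.T

# The column space of G is that of the observability matrix: its rank is n
_, s, _ = np.linalg.svd(G)
n = int(np.sum(s > 1e-6 * s[0]))
print(n)
```

For this noiseless first-order system, G is numerically of rank one, so the singular values of G reveal the order directly.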

8.2. Intersection algorithms
In the literature, a couple of algorithms, which we call 'intersection algorithms', have been published by Moonen et al. (1989, 1992). It is interesting to note the analogy between these algorithms and Algorithm 2 described in Section 6. We show that the row space of X̂_i can also be found as the intersection between the row spaces of two Hankel matrices ℋ_1 and ℋ_2. Define 𝒮 = row space ℋ_1 ∩ row space ℋ_2 with

    ℋ_1 = (U_{0|i-1} ; Y_{0|i-1}) and ℋ_2 = (U_{i|2i-1} ; Y_{i|2i-1}).

Facts.
• The subspace 𝒮 is n-dimensional.
Proof. We have rank ℋ_1 = π_1, with π_1 = (m + l)i in the generic combined deterministic-stochastic case. From equation (15), we find that rank ℋ_2 = mi + n. Finally, if the input is persistently exciting, rank (ℋ_1 ; ℋ_2) = π_1 + mi. With Grassmann's Theorem, we find: dim [ℋ_1 ∩ ℋ_2] = rank ℋ_1 + rank ℋ_2 − rank (ℋ_1 ; ℋ_2) = (π_1) + (mi + n) − (π_1 + mi) = n.
• The intersection subspace 𝒮 is equal to the row space of X̂_i.
Proof.
— X̂_i lies in the row space of ℋ_1, since it is written as a linear combination of U_{0|i-1} and Y_{0|i-1} (see formula (48)).
— X̂_i lies also in the row space of ℋ_2, as we know from (46) that Γ_i X̂_i = Z_i − L_2^i U_{i|2i-1}.
So, X̂_i lies in the intersection of the row spaces of ℋ_1 and ℋ_2. Since both the intersection subspace 𝒮 and the row space of X̂_i are n-dimensional, they must coincide.
This derivation indicates strong similarities between N4SID Algorithm 2 and the existing intersection algorithms mentioned above. When j → ∞ both algorithms will calculate the same solution (if the same state space basis is used). In practice however (finite j), they calculate slightly different models. In the references given above, only purely deterministic systems and the asymptotic case i → ∞ are treated. The interpretation of Kalman filter states is totally absent, as is the effect of finite i. The similarities indicate an alternative way to calculate the states X̂_i and X̂_{i+1} as intersections (see Moonen et al., 1992 for example).

8.3. Interpretation of general projection methods
Larimore (1990) shows that, viewed from a statistical point of view, the state (memory) M_i is the solution to

    p[ Y_{i|2i-1} | (U_{i|2i-1} ; U_{0|i-1} ; Y_{0|i-1}) ] = p[ Y_{i|2i-1} | (U_{i|2i-1} ; M_i) ],

where p[A | B] is the probability distribution of A given B. For Gaussian processes, these distributions are characterized by the first two moments, and we can replace p[A | B] by the expectation operator E[A | B]. This expectation operator can in its turn be replaced by the projection operator (Papoulis, 1984). We then get

    Y_{i|2i-1} / (U_{i|2i-1} ; U_{0|i-1} ; Y_{0|i-1}) = Y_{i|2i-1} / (U_{i|2i-1} ; M_i).    (54)

By substitution it is easy to verify that all of the following three alternatives: M_i = X̃_i, M_i = X̂_i and M_i = X̃_i / U_{i|2i-1}^⊥ satisfy equation (54). Actually, every matrix M_i satisfying row space M_i = row space [Z_i − 𝒢 U_{i|2i-1}] and rank M_i = n (with 𝒢 ∈ R^{li×mi}) will also satisfy equation (54) and give rise to an n-dimensional memory. This ambiguity is due to the fact that the exact influence of the input variable U_{i|2i-1} on the output is not known. In the framework of linear theory, the most natural choice for M_i is the one for which

    Y_{i|2i-1} / (U_{0|2i-1} ; Y_{0|i-1}) = Γ_i M_i + 𝒢 U_{i|2i-1}

is satisfied (𝒢 = H_i^d). This is because this choice corresponds to the linear state space equations

    M_{i+1} = A M_i + B U_{i|i},
    Y_{i|i} / (U_{0|2i-1} ; Y_{0|i-1}) = C M_i + D U_{i|i}.

This leads to the choice M_i = X̂_i.

9. EXAMPLES
9.1. A simple example
This simple simulation example illustrates the concepts explained in this paper. We consider the single input single output single state system in forward innovation form (Faure, 1976)

    x_{k+1} = 0.9490 x_k + 1.8805 u_k − 0.1502 e_k

and

    y_k = 0.8725 x_k − 2.0895 u_k + 2.5894 e_k,

with e_k a unit energy (σ = 1), zero mean, Gaussian distributed stochastic sequence.
Effect of i.
We first investigate the effect of the number of block rows (i) of the block Hankel matrices. The input u_k to the system is the same for all experiments and is equal to a filtered unit energy (σ = 1), zero mean, Gaussian distributed stochastic sequence added to a Gaussian white noise sequence (σ = 0.1). The filter is a second order Butterworth filter, with cut-off frequency equal to 0.025 (sampling time 1). The number of data used for identification is fixed at 1000. One hundred different disturbance sequences e_k were generated. For each of these sequences and for each i in the interval [2, 15], two models were identified using N4SID Algorithms 1 (Section 5) and 2 (Section 6). Also for each of these models the bias was calculated using the expression in Appendix B. Then the mean of all these quantities over the 100 different disturbance sequences was calculated (Monte-Carlo experiment).
Figure 3 shows the results as a function of i for both N4SID algorithms. The results for Algorithm 1 are represented with a dotted line (:) and circles (O). The results for Algorithm 2 are represented with a dotted-dashed (-.) line and the stars (*). The exact values are indicated with a dashed line.
Figure 3a shows the eigenvalue of the system as a function of i. Clearly the estimates are accurate, and there is hardly any difference between the two algorithms. Figure 3b shows the deterministic zero of the system (A − BD^{-1}C) as a function of i. The difference between the two N4SID algorithms is clearly visible. Algorithm 1 estimates a zero that is close to the exact deterministic zero, independently of i. Algorithm 2 on the other hand is clearly biased for small i. Figure 3c shows this bias (dashed-dotted line) and the estimated bias

FIG. 3. (a) Eigenvalue of A as a function of i (N4SID Algorithm 1 (*) and 2 (O)). (b) Deterministic zeros as a function of i (N4SID Algorithm 1 (*) and 2 (O)). (c) Calculated (*) and estimated (O) bias on the deterministic zero as a function of i (N4SID Algorithm 2). (d) Calculated (*) and estimated (O) bias on D as a function of i (N4SID Algorithm 2). (e) Stochastic zero as a function of i (N4SID Algorithm 1 (*) and 2 (O)). (f) Standard deviation of A as a function of j (N4SID Algorithm 1).
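The 1/√j behaviour of panel (f) is generic for consistent estimators, and can be reproduced with a minimal Monte-Carlo sketch. As a stand-in for the full N4SID computation we use a simple AR(1) least squares pole estimate; the pole value, the sample sizes and all names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
a_true = 0.9490

def estimate(j):
    # simulate j steps of y_{k+1} = a y_k + e_k and estimate a by least squares
    e = rng.standard_normal(j + 1)
    y = np.zeros(j + 1)
    for k in range(j):
        y[k + 1] = a_true * y[k] + e[k]
    return np.dot(y[1:], y[:-1]) / np.dot(y[:-1], y[:-1])

stds = []
for j in (200, 800, 3200):
    est = [estimate(j) for _ in range(100)]   # 100 Monte-Carlo runs, as in Fig. 3f
    stds.append(np.std(est))

# each 4x increase in j should roughly halve the standard deviation
print(stds)
```

On a log-log plot the three standard deviations fall on a line of slope −1/2, which is the behaviour of the dashed reference line in panel (f).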

(dotted line) using the expressions of Appendix B as a function of i. As expected, the bias grows smaller as i grows larger. Figure 3d shows the calculated and estimated bias on the identified D matrix. This matrix seems to be a lot more sensitive to i than A.
Finally, Fig. 3e shows the estimated stochastic zero as a function of i (the eigenvalues of A − KC, with K the steady state Kalman gain). The convergence is very slow (the exact zero is 0.9996). More details can be found in Van Overschee and De Moor (1991a, b). For this example, we can conclude that both algorithms do a good job of estimating A and C. B and D are estimated accurately with N4SID Algorithm 1, but not with Algorithm 2 (for small i). The bias can be calculated though. The accuracy of the stochastic subsystem is strongly dependent on i for both algorithms.
Effect of j.
Throughout the paper, we assumed that j → ∞. The effect of finite j is shown in Fig. 3f. This figure shows the standard deviation (stars) of the estimate of A as a function of j (i = 5). For each j, 100 Monte-Carlo experiments were done. The standard deviation is proportional to 1/√j (dashed line). So, the accuracy of the results depends on j as 1/√j.

9.2. A glass oven
The glass oven has three inputs (two burners and one ventilator) and six outputs (temperature). The data have been pre-processed: detrending, peak shaving, delay estimates, normalization (Backx, 1987).
Using 700 data points, five different models are identified (the 300 following points are used as validation data):
(1) A state space model with N4SID Algorithm 1 of Section 5.
(2) A state space model with N4SID Algorithm 2 of Section 6.
(3) A state space model with the algorithm of Section 8.1.
(4) An ARX model (Ljung, 1991) followed by a balanced truncation.
(5) A prediction error model (Ljung, 1991).
The results are summarized in Table 1.

FIG. 4. Order decision based on the dominant singular values.

The first row indicates the chosen order. Figure 4 shows the singular values that led to the system order five for N4SID Algorithms 1 and 2. The gap in the singular values is clearly visible. For the ARX model, the parameters 'na/nb' indicate that 'na' was a 6 × 6 matrix with as entries 4, and 'nb' a 6 × 3 matrix with as entries 4 (see also Ljung, 1991). In this way, the resulting state space model was of order 33. It was reduced to a fourth order model using balanced truncation of the deterministic subsystem. This is basically the same philosophy as for the N4SID algorithms, but the last ones do not calculate the intermediate model explicitly. For the prediction error model, we used the identified model of Algorithm 1 in the controllability canonical form (controllability indices: 2, 2, 1; number of parameters: 93) as an initial value (the initial values obtained from 'canform' or 'canstart' of Ljung (1991) did not converge). We restricted the number of iterations to three (no improvement with more iterations).
The third row shows the number of floating point operations. Algorithm 2 needs about 5% less computation (simpler algorithm). The

TABLE 1. COMPARISON OF FIVE IDENTIFICATION ALGORITHMS

Algorithm        1            2            3            4            5
Order            5            5            3/3          4/4          5
i                5            5            5            --           --
Flops            14,355,148   13,590,364   24,056,676   41,744,504   330,529,255
Pred. determ.    47.41        47.95        48.56        45.72        47.34
Pred. Kalman     10.76        10.76        14.17        28.98        10.87
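The per-cent prediction errors in the last two rows of the table can be reproduced with a small helper. This is our reconstruction of the error measure defined in the text (the exact normalisation in the paper may differ), applied here to synthetic arrays since the glass oven data is not available.

```python
import numpy as np

def prediction_error(Y_meas, Y_sim):
    """Per-cent RMS prediction error averaged over output channels.

    Y_meas and Y_sim are N_v x l arrays (N_v validation samples, l outputs).
    This mirrors the error measure of Section 9.2 as far as it can be
    reconstructed from the text.
    """
    rms = np.sqrt(np.mean((Y_meas - Y_sim) ** 2, axis=0))  # per-channel RMS
    return 100.0 * np.mean(rms)                            # average, in per cent

# toy check with two output channels and N_v = 300, as in the example
rng = np.random.default_rng(4)
Y = rng.standard_normal((300, 2))
err_zero = prediction_error(Y, Y)          # identical signals: error is 0
err_noisy = prediction_error(Y, Y + 0.1)   # constant offset of 0.1: error is 10
print(err_zero, err_noisy)
```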

calculation of the ARX model takes three times as much computation. The calculation of the prediction error model (only three iterations) takes about 25 times as much computation! It should be mentioned that none of the algorithms was optimized in the number of operations (nor ours, nor the ones in the Matlab toolbox), but these figures still give an idea of the order of magnitude of the number of computations.
The fourth row is the error (in per cent) between the measured validation output and the simulated output using only the deterministic subsystem. The fifth row shows the error between the measured output and the Kalman filter one step ahead prediction. With [y_k]_i the ith validation output channel and [ŷ_k]_i the ith simulated output channel, the error is defined as (N_v = 300)

    error = (1/l) Σ_{i=1}^{l} 100 √( (1/N_v) Σ_{k=1}^{N_v} ([y_k]_i − [ŷ_k]_i)² ) %.

The best deterministic prediction is obtained with the ARX model, which is logical since it came about by balanced truncation of the deterministic subsystem. The other models perform a little worse (but all four about the same).
The Kalman filter error is the smallest though for the first two N4SID algorithms. The projection retains the deterministic and stochastic information in the past that is useful to predict the future. This implies that the resulting models will perform best when used with a Kalman filter. The prediction error model does not improve this Kalman filter prediction error at all. There is even a slight decline, which is probably due to numerical errors, since during the iteration procedure a warning for a badly conditioned matrix was given. This means that, even though there is no optimization involved, the results obtained with the N4SID algorithms for this industrial MIMO process are close to optimal.
For this example, the N4SID algorithms thus calculate a state space model without any a priori fixed parametrization, and this in a fast and numerically reliable way.
Finally, we should mention that, as we found out in discussion with Lennart Ljung, the prediction error method can be made to work better with some extra manipulations. The main problem with the prediction error method is that the outputs are pairwise collinear. This results in a very hard, ill conditioned optimization problem, for every possible parametrization. This problem can be solved as follows: first estimate an initial model. Now, in the optimization step, only the parameters of the C and D matrix corresponding to three outputs can vary (outputs 1, 4 and 6 for this example). The optimization is better conditioned now, and the resulting model has a similar performance as the subspace models for this example. The number of floating point operations stays extremely high though, and this method is rather complicated and requires quite some insight from the identifiers.

10. CONCLUSIONS
In this paper two new N4SID algorithms to identify combined deterministic-stochastic linear systems have been derived. The first one calculates unbiased results, the second one is a simpler biased approximation. The connection between these N4SID algorithms and the classical Kalman filter estimation theory has been indicated. This allows interpretations and proofs of correctness of the algorithms.
In future work, we will discuss other connections with linear system theory: model reduction properties, frequency interpretations and connections with robust control.
The open problems that remain to be solved are the connection with maximum likelihood algorithms (Ljung, 1987) and the problem of finding good asymptotic statistical error estimates.

Acknowledgments--We would like to thank Lennart Ljung and Mats Viberg for their help and useful comments.

REFERENCES
Anderson, B. D. O. and J. B. Moore (1979). Optimal Filtering. Prentice-Hall, Englewood Cliffs, NJ.
Arun, K. S. and S. Y. Kung (1990). Balanced approximation of stochastic systems. SIAM J. Matrix Analysis and Applications, 11, 42-68.
Backx, T. (1987). Identification of an industrial process: a Markov parameter approach. Doctoral Dissertation, Technical University Eindhoven, The Netherlands.
De Moor, B. (1988). Mathematical concepts and techniques for modeling of static and dynamic systems. Doctoral Dissertation, Department of Electrical Engineering, Kath. Universiteit Leuven, Belgium.
De Moor, B., P. Van Overschee and J. Suykens (1991). Subspace algorithms for system identification and stochastic realization. Proceedings MTNS, Kobe, Japan, part 1, 589-594.
Enns, D. (1984). Model reduction for control system design. Ph.D. dissertation, Dep. Aeronaut. Astronaut., Stanford University, Stanford, CA.
Faure, P. (1976). Stochastic realization algorithms. In R. Mehra and D. Lainiotis (Eds), System Identification: Advances and Case Studies. Academic Press, NY.
Kailath, T. (1980). Linear Systems. Prentice-Hall, Englewood Cliffs, NJ.
Kung, S. Y. (1978). A new identification and model reduction algorithm via singular value decomposition. Proc. of the 12th Asilomar Conference on Circuits, Systems and Computers, Pacific Grove, 705-714.

Larimore, W. E. (1990). Canonical variate analysis in and


identification, filtering, and adaptive control. Proceedings
of the IEEE 29th Conf. on Decision and Control, Hawaii.
596-6(M.
Ljung, L. (1987). System Identification: Theory for the User.
Prentice-Hall Information and System Sciences Series, = lim I(F,A'X~,+ F,A~U, + n ; ' v r + Y})
Englewood Cliffs, NJ.
Ljung, L. (1991). System Identification Toolbox. The x ((x~)'E + Up(H,)
' " ' + (Yv)' ' )
Mathworks, Inc.
= FiAipa~ + r,A;St(H~,) '
Moonen, M., B. De Moor, L. Vandenberghe and J.
Vandewallc (1989). On- and off-line identification of linear + FA:'s~E+ r,A;'R,,(HI)'
state space models. International Journal of Control, 49, 219-232.

Moonen, M. and J. Ramos (1991). A subspace algorithm for balanced state space system identification. ESAT/SISTA report 1991-17, Kath. Universiteit Leuven, Dept. E.E., Belgium. Accepted for publication in IEEE Transactions on Automatic Control (to appear 1993).

Moonen, M., P. Van Overschee and B. De Moor (1992). A subspace algorithm for combined stochastic-deterministic state space identification. ESAT/SISTA report 1992-10, Kath. Universiteit Leuven, Dept. E.E., Belgium.

Papoulis, A. (1984). Probability, Random Variables, and Stochastic Processes. McGraw-Hill, New York.

Van Overschee, P. and B. De Moor (1991a). Subspace algorithms for the stochastic identification problem. 30th IEEE Conference on Decision and Control, Brighton, U.K., 1321-1326.

Van Overschee, P. and B. De Moor (1991b). Subspace algorithms for the stochastic identification problem. ESAT/SISTA report 1991-26, Kath. Universiteit Leuven, Dept. E.E., Belgium. Automatica, 29, 649-660.

Van Overschee, P. and B. De Moor (1992). Two subspace algorithms for the identification of combined deterministic-stochastic systems. ESAT/SISTA report 1992-34, Kath. Universiteit Leuven, Dept. E.E., Belgium.

Verhaegen, M. (1991). A novel non-iterative MIMO state space model identification technique. Preprints 9th IFAC/IFORS Symposium on Identification and System Parameter Estimation, Budapest, Hungary, 1453-1458.

APPENDIX A. PROOF OF THE PROJECTION THEOREM

Proof of formula (33).

Note. Due to the complexity of the formulas, we use the subscripts p (past) and f (future) throughout this appendix as follows: for $U$, $Y^d$ and $Y^s$ these subscripts denote respectively the subscripts $0\,|\,i-1$ and $i\,|\,2i-1$; for $X^d$ and $X^s$ they denote the subscripts $0$ and $i$. We also make use of (9)-(12) without explicitly mentioning it.

Define

$$\mathcal{S} := \lim_{j\to\infty}\frac{1}{j}\,Y_f\,\big(U_p' \quad U_f' \quad Y_p'\big).$$

We have

$$\lim_{j\to\infty}\frac{1}{j}\,Y_f U_p'
= \lim_{j\to\infty}\frac{1}{j}\,\big(\Gamma_i A^i X_p^d + \Gamma_i \Delta_i^d U_p + H_i^d U_f + Y_f^s\big)\,U_p'
= \Gamma_i A^i S_1 + \Gamma_i \Delta_i^d R_{11} + H_i^d R_{12}',$$

$$\lim_{j\to\infty}\frac{1}{j}\,Y_f U_f'
= \lim_{j\to\infty}\frac{1}{j}\,\big(\Gamma_i A^i X_p^d + \Gamma_i \Delta_i^d U_p + H_i^d U_f + Y_f^s\big)\,U_f'
= \Gamma_i A^i S_2 + \Gamma_i \Delta_i^d R_{12} + H_i^d R_{22},$$

and

$$\lim_{j\to\infty}\frac{1}{j}\,Y_f Y_p'
= \lim_{j\to\infty}\frac{1}{j}\,\big(\Gamma_i A^i X_p^d + \Gamma_i \Delta_i^d U_p + H_i^d U_f + Y_f^s\big)\big((X_p^d)'\Gamma_i' + U_p'(H_i^d)' + (Y_p^s)'\big)$$
$$= \Gamma_i A^i P^d \Gamma_i' + \Gamma_i \Delta_i^d R_{11}(H_i^d)'
+ \Gamma_i A^i S_1 (H_i^d)' + \Gamma_i \Delta_i^d S_1' \Gamma_i'
+ H_i^d R_{12}'(H_i^d)' + H_i^d S_2' \Gamma_i' + \Gamma_i \Delta_i^c.$$

Thus, we find (the blocks having dimensions $li \times mi$, $li \times mi$ and $li \times li$)

$$\mathcal{S} = \Big(\;\Gamma_i A^i S + \Gamma_i \Delta_i^d\,(R_{11}\ \ R_{12}) + H_i^d\,(R_{12}'\ \ R_{22})
\;\;\Big|\;\;
\Gamma_i A^i P^d \Gamma_i' + \Gamma_i A^i S_1(H_i^d)' + \Gamma_i \Delta_i^d R_{11}(H_i^d)' + \Gamma_i \Delta_i^d S_1'\Gamma_i' + H_i^d R_{12}'(H_i^d)' + H_i^d S_2'\Gamma_i' + \Gamma_i \Delta_i^c\;\Big). \tag{A.1}$$

The second part we need to express in terms of the system matrices is

$$\Phi := \lim_{j\to\infty}\frac{1}{j}\begin{pmatrix} U_p \\ U_f \\ Y_p \end{pmatrix}\big(U_p' \quad U_f' \quad Y_p'\big),$$

with

$$\lim_{j\to\infty}\frac{1}{j}\,Y_p\,(U_p'\ \ U_f')
= \lim_{j\to\infty}\frac{1}{j}\,\big(\Gamma_i X_p^d + H_i^d U_p + Y_p^s\big)(U_p'\ \ U_f')
= \Gamma_i S + H_i^d\,(R_{11}\ \ R_{12}),$$

$$\Phi_{22} := \lim_{j\to\infty}\frac{1}{j}\,Y_p Y_p'
= \lim_{j\to\infty}\frac{1}{j}\,\big(\Gamma_i X_p^d + H_i^d U_p + Y_p^s\big)\big((X_p^d)'\Gamma_i' + U_p'(H_i^d)' + (Y_p^s)'\big)
= \Gamma_i P^d \Gamma_i' + \Gamma_i S_1 (H_i^d)' + H_i^d S_1' \Gamma_i' + H_i^d R_{11}(H_i^d)' + L_i^s,$$

and $\Phi_{11} = R$.

Note that $\Phi$ is guaranteed to be of full rank $(2m+l)i$ due to the persistently exciting input and the non-zero stochastic subsystem. To compute $\Phi^{-1}$, we use the formula for the inverse of a block matrix (Kailath, 1980):

$$\begin{pmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{pmatrix}^{-1}
= \begin{pmatrix}
\Phi_{11}^{-1} + \Phi_{11}^{-1}\Phi_{12}\,\Phi_c^{-1}\,\Phi_{21}\Phi_{11}^{-1} & -\Phi_{11}^{-1}\Phi_{12}\,\Phi_c^{-1} \\
-\Phi_c^{-1}\,\Phi_{21}\Phi_{11}^{-1} & \Phi_c^{-1}
\end{pmatrix}, \tag{A.2}$$

with $\Phi_c = \Phi_{22} - \Phi_{21}\Phi_{11}^{-1}\Phi_{12}$.

So, in our case, this becomes (note that $\Phi_{21}\Phi_{11}^{-1} = H_i^d\,(I\ \ 0) + \Gamma_i S R^{-1}$)

$$\Phi^{-1} = \begin{pmatrix}
R^{-1} + \big(H_i^d(I\ \ 0) + \Gamma_i S R^{-1}\big)'\,\Phi_c^{-1}\,\big(H_i^d(I\ \ 0) + \Gamma_i S R^{-1}\big) & -\big(H_i^d(I\ \ 0) + \Gamma_i S R^{-1}\big)'\,\Phi_c^{-1} \\
-\Phi_c^{-1}\,\big(H_i^d(I\ \ 0) + \Gamma_i S R^{-1}\big) & \Phi_c^{-1}
\end{pmatrix}, \tag{A.3}$$
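The block-matrix inversion formula (A.2) is the standard Schur-complement identity. The following sketch (NumPy, with hypothetical partition sizes; not part of the original derivation) checks the blockwise formula numerically against direct inversion:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random symmetric positive definite test matrix, partitioned as
# Phi = [[Phi11, Phi12], [Phi21, Phi22]] with Phi11 (p x p), Phi22 (q x q).
p, q = 4, 3
M = rng.standard_normal((p + q, p + q))
Phi = M @ M.T + (p + q) * np.eye(p + q)  # well conditioned and invertible

Phi11, Phi12 = Phi[:p, :p], Phi[:p, p:]
Phi21, Phi22 = Phi[p:, :p], Phi[p:, p:]

# Schur complement of Phi11 in Phi (called Phi_c in (A.2)).
Phi11_inv = np.linalg.inv(Phi11)
Phi_c = Phi22 - Phi21 @ Phi11_inv @ Phi12
Phi_c_inv = np.linalg.inv(Phi_c)

# Block-matrix inversion formula (Kailath, 1980):
top_left = Phi11_inv + Phi11_inv @ Phi12 @ Phi_c_inv @ Phi21 @ Phi11_inv
top_right = -Phi11_inv @ Phi12 @ Phi_c_inv
bot_left = -Phi_c_inv @ Phi21 @ Phi11_inv
bot_right = Phi_c_inv
Phi_inv_blocks = np.block([[top_left, top_right], [bot_left, bot_right]])

# The blockwise inverse matches the direct inverse.
assert np.allclose(Phi_inv_blocks, np.linalg.inv(Phi))
```

Computing the inverse blockwise is exactly what is done in (A.3): only $R^{-1}$ and $\Phi_c^{-1}$ are needed, and $\Phi_c$ inherits the structure $\Gamma_i(P^d - SR^{-1}S')\Gamma_i' + L_i^s$.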

with

$$\Phi_c = \Gamma_i P^d \Gamma_i' + L_i^s + \Gamma_i S_1(H_i^d)' + H_i^d S_1'\Gamma_i' + H_i^d R_{11}(H_i^d)'
- \big(\Gamma_i S + H_i^d(R_{11}\ \ R_{12})\big)\,R^{-1}\,\big(\Gamma_i S + H_i^d(R_{11}\ \ R_{12})\big)'
= \Gamma_i\big(P^d - S R^{-1} S'\big)\Gamma_i' + L_i^s.$$

From formulas (13) and (33) and the definitions of $\mathcal{S}$ and $\Phi$, we find (with $j \to \infty$)

$$(L_1\ \ L_2\ \ L_3) = \frac{1}{j}\,Y_f\,\big(U_p'\ \ U_f'\ \ Y_p'\big)\left[\frac{1}{j}\begin{pmatrix}U_p\\U_f\\Y_p\end{pmatrix}\big(U_p'\ \ U_f'\ \ Y_p'\big)\right]^{-1} = \mathcal{S}\,\Phi^{-1}.$$

Using (A.1) and (A.3), we thus find

$$(L_1\ \ L_2) = \Gamma_i A^i S R^{-1} + \Gamma_i\Delta_i^d\,(I\ \ 0) + H_i^d\,(0\ \ I)$$
$$+ \big[\Gamma_i A^i S_1(H_i^d)' + \Gamma_i A^i S R^{-1}S'\Gamma_i' + \Gamma_i\Delta_i^d R_{11}(H_i^d)' + \Gamma_i\Delta_i^d S_1'\Gamma_i' + H_i^d R_{12}'(H_i^d)' + H_i^d S_2'\Gamma_i'$$
$$\qquad - \Gamma_i A^i P^d\Gamma_i' - \Gamma_i\Delta_i^c - \Gamma_i\Delta_i^d R_{11}(H_i^d)' - \Gamma_i A^i S_1(H_i^d)' - \Gamma_i\Delta_i^d S_1'\Gamma_i' - H_i^d R_{12}'(H_i^d)' - H_i^d S_2'\Gamma_i'\big]\,\Phi_c^{-1}\big(H_i^d(I\ \ 0) + \Gamma_i S R^{-1}\big)$$
$$= \big(\Gamma_i\Delta_i^d\ \ \ H_i^d\big) + \Gamma_i A^i S R^{-1} + \Gamma_i\big(A^i(S R^{-1}S' - P^d)\Gamma_i' - \Delta_i^c\big)\,\Phi_c^{-1}\big(H_i^d(I\ \ 0) + \Gamma_i S R^{-1}\big)$$
$$= \big(\Gamma_i[\Delta_i^d - Q_i H_i^d]\ \ \ H_i^d\big) + \Gamma_i\big[A^i - Q_i\Gamma_i\big]\,S R^{-1},$$

which is exactly the same as the first part of (33).

Once again, using (A.1) and (A.3), we find for $L_3$

$$L_3 = \big[-\Gamma_i A^i S_1(H_i^d)' - \Gamma_i\Delta_i^d R_{11}(H_i^d)' - H_i^d R_{12}'(H_i^d)' - \Gamma_i A^i S R^{-1}S'\Gamma_i' - \Gamma_i\Delta_i^d(I\ \ 0)S'\Gamma_i' - H_i^d(0\ \ I)S'\Gamma_i'$$
$$\qquad + \Gamma_i A^i P^d\Gamma_i' + \Gamma_i\Delta_i^c + \Gamma_i\Delta_i^d R_{11}(H_i^d)' + \Gamma_i A^i S_1(H_i^d)' + \Gamma_i\Delta_i^d S_1'\Gamma_i' + H_i^d R_{12}'(H_i^d)' + H_i^d S_2'\Gamma_i'\big]\,\Phi_c^{-1}$$
$$= \Gamma_i\big(A^i(P^d - S R^{-1}S')\Gamma_i' + \Delta_i^c\big)\,\Phi_c^{-1}$$
$$= \Gamma_i Q_i,$$

which is exactly the same as the second part of (33). The proof of formula (34) is completely analogous. □

APPENDIX B. CALCULATION OF THE BIAS

For the calculation of the bias, the error on the states $\hat X_i$ and $\hat X_{i+1}$ has to be calculated. From (49) and (50) we have

$$\delta\hat X_i = \prod_{k=0}^{i-1}(A - K_k C)\;S\,(R^{-1})_{[\,:\,,\;mi+1:2mi]}\;U_{i|2i-1}.$$

This quantity can only be calculated after the system is identified. In the following it is thus assumed that the system matrices $A$, $B$, $C$, $D$, $Q^s$, $S^s$, $R^s$ are known. ($R$ and $Q$ are the submatrices of the RQ decomposition as defined in Section 7.)

• $S R^{-1}$ can be calculated as

$$S R^{-1} = X_p^d\,U_{0|2i-1}'\,\big(U_{0|2i-1}U_{0|2i-1}'\big)^{-1}
= \Gamma_i^{\dagger}\big[Y_{0|i-1}^d - H_i^d U_{0|i-1}\big]U_{0|2i-1}'\big(U_{0|2i-1}U_{0|2i-1}'\big)^{-1}$$
$$= \Gamma_i^{\dagger}\big[Y_{0|i-1}^d\,U_{0|2i-1}' - H_i^d U_{0|i-1}U_{0|2i-1}'\big]\big(U_{0|2i-1}U_{0|2i-1}'\big)^{-1}
= \Gamma_i^{\dagger}\big[R_{4:4,1:3} - H_i^d R_{1:1,1:3}\big]\,R_{1:3,1:3}^{-1}.$$

• For the calculation of the Kalman filter gains $K_k$, we also need the initial covariance estimate

$$\hat P = P^d + P^s - S R^{-1} S'.$$

We have approximately

$$\lim_{j\to\infty}\frac{1}{j}\,\tilde X_i \tilde X_i' = P^d + P^s,$$

and from the description of the algorithm (see Section 7) this limit can be computed directly from the data. $S R^{-1} S'$ can be easily found as

$$S R^{-1} S' = S R^{-1}\,\Big(\frac{1}{j}\,R_{1:3,1:3}R_{1:3,1:3}'\Big)\,(S R^{-1})'
= \frac{1}{j}\,\Gamma_i^{\dagger}\big[R_{4:4,1:3} - H_i^d R_{1:1,1:3}\big]\big[R_{4:4,1:3} - H_i^d R_{1:1,1:3}\big]'\,(\Gamma_i^{\dagger})'.$$

Care should be taken when subtracting both quantities to obtain $\hat P$: it is possible that, due to the approximations, $\hat P$ becomes negative definite. If this is the case, it should be put equal to zero.

• Now the Kalman gains $K_k$ can be calculated using formulas (22)-(23).

• The errors $\delta\hat X_i$ and $\delta\hat X_{i+1}$ can be easily calculated as

$$\delta\hat X_i = \prod_{k=0}^{i-1}(A - K_k C)\;S\,(R^{-1})_{[\,:\,,\;mi+1:2mi]}\;R_{2:3,1:3}\,Q_{1:3}$$

and

$$\delta\hat X_{i+1} = \prod_{k=0}^{i}(A - K_k C)\;S\,(R^{-1})_{[\,:\,,\;m(i+1)+1:2mi]}\;R_{3:3,1:3}\,Q_{1:3}.$$

The least squares solution of

$$\min_{A,B,C,D}\left\|\begin{pmatrix}\hat X_{i+1} + \delta\hat X_{i+1}\\ Y_{i|i}\end{pmatrix} - \begin{pmatrix}A & B\\ C & D\end{pmatrix}\begin{pmatrix}\hat X_i + \delta\hat X_i\\ U_{i|i}\end{pmatrix}\right\|_F^2$$

will thus be a better approximation of the combined system matrix $\begin{pmatrix}A & B\\ C & D\end{pmatrix}$ than the original least squares solution. The bias can then be calculated as the difference between the two solutions.
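Formulas (22)-(23) produce the non-steady-state Kalman gains $K_k$ from the initial covariance estimate $\hat P$. As a minimal sketch (textbook one-step-predictor Riccati recursion, not necessarily the paper's exact parameterization; the toy system matrices are hypothetical), together with a matrix version of the "set a negative definite $\hat P$ to zero" safeguard:

```python
import numpy as np

def kalman_gains(A, C, Q, S, R, P0, i):
    """Non-steady-state Kalman gains K_0, ..., K_{i-1} of the one-step
    predictor, started from the initial state covariance estimate P0.
    Textbook Riccati recursion standing in for formulas (22)-(23);
    Q, S, R are the process/cross/measurement noise covariances."""
    P = P0
    gains = []
    for _ in range(i):
        innov_cov = C @ P @ C.T + R                    # innovation covariance
        K = (A @ P @ C.T + S) @ np.linalg.inv(innov_cov)
        P = A @ P @ A.T + Q - K @ innov_cov @ K.T      # Riccati update
        gains.append(K)
    return gains, P

def clip_psd(P):
    """Project a symmetric matrix onto the positive semidefinite cone by
    zeroing its negative eigenvalues -- a matrix analogue of 'if P-hat
    becomes negative definite, put it equal to zero'."""
    w, V = np.linalg.eigh((P + P.T) / 2)
    return V @ np.diag(np.clip(w, 0.0, None)) @ V.T

# Toy second-order system (all numbers hypothetical).
A = np.array([[0.8, 0.1], [0.0, 0.7]])
C = np.array([[1.0, 0.0]])
Q = 0.1 * np.eye(2)
S = np.zeros((2, 1))
R = np.array([[1.0]])
# clip_psd maps a negative definite initial estimate to the zero matrix.
gains, P_i = kalman_gains(A, C, Q, S, R, P0=clip_psd(-np.eye(2)), i=5)
```

The recursion keeps the iterated covariance symmetric and positive semidefinite whenever the joint noise covariance is, which is why the clipping is only needed for the data-driven initial estimate $\hat P$.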
