Professional Documents
Culture Documents
Part 5 - Prediction Error Methods
Part 5 - Prediction Error Methods
ARX models
Model structures
General PEM
Part V
Prediction error methods
ARX models
Model structures
General PEM
Table of contents
Model structures
We stay in the single-output, single-input case for all but the last section.
ARX models
Model structures
General PEM
Classification
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
Polynomial representation
Backward shift operator q 1 :
q 1 z(k ) = z(k 1)
where z(k ) is any discrete-time signal.
Then:
y(k ) + a1 y(k 1) + a2 y (k 2) + . . . + ana y(k na)
= (1 + a1 q 1 + a2 q 2 + . . . + ana q na )y(k) =: A(q 1 )y(k)
and:
b1 u(k 1) + b2 u(k 2) + . . . + bnb u(k nb)
= (b1 q 1 + b2 q 2 + . . . + bnb q nb )u(k ) =: B(q 1 )u(k)
ARX models
Model structures
General PEM
1
[B(q 1 )u(k) + e(k )]
A(q 1 )
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
Identification problem
Consider now that we are given a vector of data u(k ), y (k ),
k = 1, . . . , N, and we have to find the model parameters .
Then for any k :
y (k ) = > (k ) + (k )
where (k ) is now interpreted as an equation error.
Objective: minimize the mean squared error:
V () =
N
1 X
(k)2
N
k=1
Remark: When k na, nb, zero- and negative-time values for u and
y are needed to construct . They can be taken 0 (assuming the
system is in zero initial conditions).
ARX models
Model structures
General PEM
Linear system
y(1) = y (1) y (na) u(1) u(nb)
y(2) = y (0) y (1 na) u(0) u(1 nb)
y(N) = y (N 1)
Matrix form:
y (1)
y(1)
y (2) y (0)
.. =
..
.
.
y(N)
..
.
y (N 1)
y (na)
y(1 na)
..
.
u(1)
u(0)
..
.
y (N na) u(N 1)
Y =
with notations Y RN and RN(na+nb) .
u(N nb)
u(nb)
u(1 nb)
u(N nb)
ARX models
Model structures
General PEM
ARX solution
1
2
PN
k =1
b = (> )1 > Y
PN
Since the new V () = N1 k=1 (k )2 is proportional to the criterion
above, the same solution also minimizes V ().
However, the form above is impractical in system identification, since
the number of data points N can be very large. Better form:
> =
N
X
(k )> (k ),
> Y =
k =1
"
b =
N
X
k=1
N
X
(k )y (k )
k=1
#1 "
(k)> (k )
N
X
k =1
#
(k )y (k )
ARX models
Model structures
General PEM
Remaining issue: the sum of N terms can grow very large, leading to
numerical problems: (matrix of very large numbers)1 vector of very
large numbers.
Solution: Normalize element values by diving them by N. In
equations, N simplifies so it has no effect on the analytical
development, but in practice it keeps the numbers reasonable.
"
b =
#1 "
#
N
N
1 X
1 X
>
(k ) (k )
(k )y (k)
N
N
k =1
k =1
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
Main result
Assumptions
1
2
3
Theorem
ARX identification is consistent: the estimated parameters b tend to
the true parameters 0 as N .
ARX models
Model structures
General PEM
Discussion of assumptions
1
i
(k)y(k)
PN
>
>
k=1 (k ) (k ) E (k ) (k) .
N
As N ,
2
1
N
k=1
k=1
E (k)> (k ) is nonsingular if the data is sufficiently
informative (e.g., u(k ) should not be a simple feedback from
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
Experimental data
Consider we are given the following, separate, identification and
validation data sets.
plot(id); and plot(val);
ARX models
Model structures
General PEM
Identification data.
Array containing the orders of A and B and the delay nk.
ARX models
Model structures
General PEM
Model validation
Assuming the system is second-order with no time delay, we take
na = 2, nb = 1, nk = 1. Validation: compare(model, val);
ARX models
Model structures
General PEM
Structure selection
Better idea: try many different structures and choose the best one.
Na = 1:15;
Nb = 1:15;
Nk = 1:5;
NN = struc(Na, Nb, Nk);
V = arxstruc(id, val, NN);
struc generates all combinations of orders in Na, Nb, Nk.
arxstruc identifies for each combination an ARX model (on the
data in 1st argument), simulates it (on the data in the 2nd
argument), and returns all the MSEs on the first row of V (see
help arxstruc for the format of V).
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
Motivation
Before deriving the general PEM, we will introduce the general class
of models to which it can be applied.
ARX models
Model structures
General PEM
B(q 1 )
C(q 1 )
u(k) +
e(k )
1
F (q )
D(q 1 )
ARX models
Model structures
General PEM
A(q 1 )y (k ) =
B(q 1 )
C(q 1 )
u(k)
+
e(k )
F (q 1 )
D(q 1 )
Very general form, all other linear forms are special cases of this. Not
for practical use, but to describe algorithms in a generic way.
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
nb
X
bj u(k j) + e(k )
j=1
M1
X
h(j)u(k j) + e(k )
j=0
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
Overall relationship
ARX models
Model structures
General PEM
Output error
Other model forms are possible that are not special cases of ARMAX,
e.g. Output Error, OE:
y(k) =
B(q 1 )
u(k) + e(k )
F (q 1 )
ARX models
Model structures
General PEM
Table of contents
1
Model structures
ARX models
Model structures
General PEM
2
3
ARX models
Model structures
General PEM
Remarks:
ARX predictor yb(k) is chosen to achieve the error (k ) = e(k ).
We will aim to achieve the same error in the general PEM,
intuitively because we cannot do better.
For the mathematically inclined: Such a predictor is optimal in the sense
of minimizing the variance of (k).
ARX models
Model structures
General PEM
To make things easier to follow, we first derive PEM for a 1st order
ARMAX.
Looking at slide ARMAX: explicit form, we write 1st order ARMAX in a
slightly different way:
y (k ) = ay(k 1) + bu(k 1) + ce(k 1) + e(k )
ARX models
Model structures
General PEM
(5.1)
(5.2)
ARX models
Model structures
General PEM
Final recursion:
yb(k) = c yb(k 1) + (c a)y(k 1) + bu(k 1)
Requires initialization at yb(0); this initial value is usually taken 0.
This initialization needs some technical conditions to work properly, we do not
go into them here.
ARX models
Model structures
General PEM
Similar recursion:
(k ) = y(k) yb(k )
(k 1) = y(k 1) yb(k 1)
(k ) + c(k 1) = y(k) + cy(k 1) (yb(k) + c yb(k 1))
= y(k) + cy(k 1) ((c a)y (k 1) + bu(k 1))
= y(k) + ay(k 1) bu(k 1)
(k ) = c(k 1) + y (k ) + ay(k 1) bu(k 1)
Requires initialization of (0), usually taken 0.
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
Table of contents
1
Model structures
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
Prediction error
ARX models
Model structures
General PEM
Predictor
To achieve error e(k ), the predictor must be:
yb(k ) = y (k ) e(k) = Gu(k) + (H 1)e(k )
= Gu(k ) + (H 1)H 1 (y(k) Gu(k))
= Gu(k ) + (1 H 1 )(y(k) Gu(k))
= Gu(k ) + (1 H 1 )y (k ) Gu(k ) + H 1 Gu(k))
= (1 H 1 )y (k ) + H 1 Gu(k )
=: L1 y(k) + L2 u(k)
where we skipped q 1 to make the equations readable.
Remarks:
New notations L1 (q 1 ) = 1 H 1 (q 1 ),
L2 (q 1 ) = H 1 (q 1 )G(q 1 ).
In order to have a causal predictor (which only depends on past
values of the output and input), we need G(0) = 0 and H(0) = 1,
leading to L1 (0) = L2 (0) = 0.
ARX models
Model structures
General PEM
Once the predictors and errors are available, the parameter is found
PN
by minimizing criterion V () = N1 k=1 2 (k ).
Again, linear regression will not work in general. Computational
methods to solve the minimization problem will be introduced in the
next section.
ARX models
Model structures
General PEM
B
1
u(k ) + e(k )
A
A
We have:
L1 = 1 H 1 = 1 A, L2 = H 1 G = B
yb(k ) = L1 y (k ) + L2 y (k ) = (1 A)y (k ) + Bu(k ) = (k )
(k ) = H 1 (y (k ) Gu(k )) = Ay(k) Bu(k)
which can be easily seen to be equivalent with the ARX formulation.
ARX models
Model structures
General PEM
Homework! Plug in the polynomials for 1st order ARMAX and verify
that you obtain the same result as before.
ARX models
Model structures
General PEM
Table of contents
1
Model structures
ARX models
Model structures
General PEM
Experimental data
Consider again the experimental data on which ARX was applied.
plot(id); and plot(val);
ARX models
Model structures
General PEM
Identification data.
Array containing the orders of A, B, C and the delay nk.
ARX models
Model structures
General PEM
In contrast to ARX, results are good. Flexible noise model pays off.
ARX models
Model structures
General PEM
Identifying an OE model
B(q 1 )
u(k) + e(k)
F (q 1 )
ARX models
Model structures
General PEM
Identification data.
Array containing the orders of B, F , and the delay nk.
y (k ) =
B(q 1 )
u(k nk) + e(k ), where:
F (q 1 )
ARX models
Model structures
General PEM
OE model validation
Considering the system is second-order with no time delay, we take
nb = 2, nf = 2, nk = 1. Validation: compare(val, mOE);
ARX models
Model structures
General PEM
Table of contents
1
Model structures
ARX models
Model structures
General PEM
V1
2
dV
=
.. ,
d
.
V
n
2V
2
2 V1
2 1
d 2V
=
d2
..
.
2V
n 1
2V
1 2
2V
22
..
.
2V
n 2
...
...
..
.
...
2V
1 n
2V
2 n
..
2V
n2
ARX models
Model structures
General PEM
Assumptions
Assumptions (simplified)
1
2
3
4
d 2V
d 2
is nonsingular.
ARX models
Model structures
General PEM
Discussion of assumptions
Assumption 2 is informal, formal name is persistently exciting
input (it comes later).
Recall that G and H have the parameters as coefficients, we
denote them by G(q 1 ; ) and G(q 1 ; ) to make the
dependence explicit. Assumption 3 ensures we can differentiate
G and H w.r.t. .
dH
Note that dG
d , d are vectors of transfer functions!
PN
Recall V () = N1 k=1 2 (k), the MSE. Assumption 4 ensures V
is not flat around minima, so the minima are uniquely
identifiable.
ARX models
Model structures
General PEM
Guarantee
Theorem 1
Define the limit V () = limN V (). Given Assumptions 14, the
identification solution b = arg min V () converges to a minimum
point of V () as N .
ARX models
Model structures
General PEM
Further assumptions
Assumptions (simplified)
5
ARX models
Model structures
General PEM
Additional guarantee
Theorem 2
Under Assumptions 1-6, b converges to a true parameter vector 0 as
N .
Remark: Also a consistency guarantee. Theorem 1 guaranteed a
minimum-error solution, whereas Theorem 2 additionally says this
solution corresponds to the true system, if the system satisfies the
model structure.
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
Optimization problem
N
1 X
(k)2
N
k=1
So far we took this solution for granted and investigated its properties.
b in general
While in ARX linear regression could be applied to find ,
this does not work. Main implementation question:
How to solve the optimization problem?
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
f (` )
df (` )
d
Remarks:
Notation
df (` )
d
df
d
at point ` .
ARX models
Model structures
General PEM
dV
d
dV (` )
d
d 2 V (` )
d 2
d 2V
d 2
d 2 V (` )
d2
1
dV (` )
d
Remark:
The stepsize helps with the convergence of the method, e.g.
because in identification V is noisy.
an
ARX models
Model structures
General PEM
Stopping criterion
ARX models
Model structures
General PEM
V () =
N
1 X
(k)2
N
k=1
k=1
where:
d(k)
d
is an n n matrix.
d 2 (k)
d 2
the Hessian of (k ).
ARX models
Model structures
General PEM
Gauss-Newton
Ignore the second term in the Hessian of V and just use the first term:
H=
>
N
2 X d(k ) d(k)
N
d
d
k=1
dV (` )
d
Motivation:
Quadratic form of H gives better algorithm behavior.
Simpler computation.
ARX models
Model structures
General PEM
d(k)
d
>
(k ) (k )
= [ (k)
a , b , c ] . Differentiating second equation:
(k)
(k 1)
= c
+ y (k 1)
a
a
(k)
(k 1)
= c
u(k 1)
b
b
(k)
(k 1)
= c
(k 1)
c
c
(k) (k )
So, (k)
a , b , c are dynamical signals that can be computed
using the recursions above, starting e.g. from 0 initial values.
ARX models
Model structures
General PEM
)
dV d V
Plug d(k
d into equations for d , d 2
Apply Newton (or Gauss-Newton) update formula to find `+1 .
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
MIMO system
So far we considered y (k ) R, u(k) R,
Single-Input, Single-Output (SISO) systems:
y (k ) = G(q 1 )u(k) + H(q 1 )e(k )
Many systems are Multiple-Input, Multiple-Output (MIMO).
E.g., aircraft. Inputs: throttle, aileron, elevator, rudder.
Outputs: airspeed, roll, pitch, yaw.
ARX models
Model structures
General PEM
y1 (k)
u1 (k )
e1 (k )
y2 (k)
u2 (k )
e2 (k )
.. = G(q 1 ) .. + H(q 1 ) ..
.
.
.
yny (k)
unu (k )
eny (k)
ARX models
Model structures
General PEM
MIMO ARX
ARX models
Model structures
General PEM
Take na = 1, nb = 2, ny = 2, nu = 3. Then:
A(q 1 )y (k ) = B(q 1 )u(k ) + e(k )
A(q 1 ) = I + A1 q 1
11
a1 a112 1
= I + 21
q
a1 a122
B(q 1 ) = B1 q 1 + B2 q 2
11
11
b
b112 b113 1
b2
= 121
q
+
b1 b122 b123
b221
b212
b222
b213 2
q
b223
ARX models
Model structures
General PEM
u (k)
b213 2 1
e1 (k )
u2 (k) +
q
e2 (k )
b223
u3 (k)
Explicit relationship:
y1 (k ) + a111 y1 (k 1) + a112 y2 (k 1)
= b111 u1 (k 1) + b112 u2 (k 1) + b113 u3 (k 1)
+ b211 u1 (k 2) + b212 u2 (k 2) + b213 u3 (k 2) + e1 (k )
y2 (k ) + a121 y1 (k 1) + a122 y2 (k 1)
= b121 u1 (k 1) + b122 u2 (k 1) + b123 u3 (k 1)
+ b221 u1 (k 2) + b222 u2 (k 2) + b223 u3 (k 2) + e2 (k )
ARX models
Model structures
General PEM
ARX models
Model structures
General PEM
PEM criterion
N
1 X
(k)2
N
k=1
ARX models
Model structures
General PEM
Table of contents
Model structures
ARX models
Model structures
General PEM
Experimental data
Consider a continuous stirred-tank reactor:
ARX models
Model structures
General PEM
Experimental data
ARX models
Model structures
General PEM
a (q ) a12 (q 1 ) . . . a1ny (q 1 )
a21 (q 1 ) a22 (q 1 ) . . . a2ny (q 1 )
A(q 1 ) =
...
any1 (q 1 ) any2 (q 1 ) . . . anyny (q 1 )
(
1 if i = j
ij
aij (q 1 ) =
+ a1ij q 1 + . . . + ana
q naij
ij
0 otherwise
11 1
b (q ) b12 (q 1 ) . . . b1nu (q 1 )
b21 (q 1 ) b22 (q 1 ) . . . b2nu (q 1 )
B=
...
ny 1 1
ny2 1
nynu
1
b (q ) b (q ) . . . b
(q )
ij
bij (q 1 ) = b1ij q nkij + . . . + bnb
q nkij nbij +1
ij
ARX models
Model structures
General PEM
Identification data.
Matrices with orders of polynomials in A, B, and delays nk:
na11 . . . na1ny
Na = . . .
nany1 . . . nanyny
nb11 . . . nb1nu
Nb = . . .
nbny1 . . . nbnynu
nk11 . . . nk1nu
Nk = . . .
nkny1 . . . nknynu
ARX models
Model structures
General PEM
yb(k) =g(x(k ); )
(k) =y(k) yb(k ) = y(k ) g(x(k ); )
PN
The criterion is V () = N1 k=1 2 (k ), the MSE. Training should return
a parameter vector b minimizing V ().
So NARX identification is in fact:
A nonlinear prediction error method.
And also a nonlinear regression problem.
(predicted outputs at negative and zero time can also be taken 0.)
Example: Simulation of an aircrafts response to emergency pilot
inputs, that may be dangerous to apply to the real system.