
System Identification

Control Engineering B.Sc., 3rd year


Technical University of Cluj-Napoca
Romania

Lecturer: Lucian Busoniu

Part IX
Model validation and structure selection

Recall: Importance of validation

Model validation is a crucial step: the model must be good enough
for our purposes.
If validation is unsuccessful, previous steps in the workflow must be
redone, e.g., the model structure can be changed.


Motivation
So far, we validated and selected models mostly in a heuristic way, by
examining plots using common sense.

Next, some mathematically well-founded tests will be given.


However, common sense remains indispensable: mathematical tests
work under assumptions that may not always be satisfied.


Focus: Prediction error methods

This lecture focuses on single-output models obtained by prediction
error methods.
Some of the tests can be extended to other settings.


Table of contents

Model validation with correlation tests
  Basic correlation tests
  Matlab example
  Global correlation tests
Structure selection
  Testing for overparametrization
Overparametrization


Whiteness: Intuition

Recall the general PEM model structure:

y(k) = G(q^-1)u(k) + H(q^-1)e(k)

where e(k) is assumed to be zero-mean white noise.
PEM are derived so that the prediction error ε(k) = y(k) − ŷ(k) = e(k).
If the system satisfies the model structure (so the assumption holds),
and moreover the model is accurate, then ε(k) is also zero-mean
white noise.

Whiteness hypothesis
(W) The prediction errors ε(k) are zero-mean white noise.


Independence of past inputs: Intuition

y(k) = G(q^-1)u(k) + v(k)

If the model is accurate, it entirely explains the influence of inputs
u(k) on current and future outputs y(k + τ). Therefore, the errors
ε(k + τ) = y(k + τ) − ŷ(k + τ) are only influenced by the
disturbances v, and are independent of inputs u(k).

Independence hypothesis 1
(I1) The prediction errors ε(k + τ) are independent of inputs u(k) for
τ ≥ 0 (i.e., of past and current inputs).


Independence of all inputs: Intuition


y(k) = G(q^-1)u(k) + v(k)

If the experiment is closed-loop, u(k) depends on past outputs
y(k − j), and this will lead to a correlation of ε(k − j) with u(k) (note
that the independence of ε(k + τ), τ ≥ 0, and u(k) is unaffected). If the
experiment is open-loop, then ε(k − j) is also independent of u(k).

Independence hypothesis 2
(I2) The prediction errors ε(k + τ) are independent of u(k) for any τ
(i.e., of all inputs).


All hypotheses

(W) The prediction errors ε(k) are zero-mean white noise.
(I1) The prediction errors ε(k + τ) are independent of u(k) for τ ≥ 0
(i.e., of past and current inputs).
(I2) The prediction errors ε(k + τ) are independent of u(k) for any τ
(i.e., of all inputs).

A good model should satisfy (W) and (I1), and if there is no feedback,
also (I2).
We will develop tests that allow us to either accept or reject these
hypotheses for a given model, and therefore validate or reject the
model.


Whiteness: Correlations

Recall the correlation function (equal to the covariance in the
zero-mean case):

r_ε(τ) = E{ε(k + τ)ε(k)}

If ε(k) is zero-mean white noise:
The correlation function is zero, r_ε(τ) = 0, for any nonzero τ.
At zero, r_ε(0) is the variance σ² of the white noise.


Whiteness: Correlations from data

Estimating correlations from data:

r̂_ε(τ) = (1/N) Σ_{k=1}^{N−τ} ε(k + τ)ε(k)

Normalization by the (estimated) variance:

x(τ) = r̂_ε(τ) / r̂_ε(0)

Then, x(τ) should be small for nonzero τ.


Whiteness test at τ

For any Gaussian distribution N(μ, σ²), the probability of drawing a
sample in the interval [μ − 1.96σ, μ + 1.96σ] is found (from the
Gaussian formula) to be 0.95.
It can be shown that for large N, x(τ) is distributed according to the
Gaussian N(0, 1/N), and therefore that:

P(|x(τ)| ≤ 1.96/√N) = 0.95


Whiteness test at τ (continued)

Whiteness test at τ
If |x(τ)| ≤ 1.96/√N, then the whiteness hypothesis r_ε(τ) = 0 is
accepted.
Otherwise, the whiteness hypothesis is rejected.

Intuition: If the test succeeds, then the hypothesis is likely to be true.
(Mathematically, the probability of rejecting the hypothesis when it is
in fact true is 0.05.)
In practice, we require the hypothesis to hold at all τ ≠ 0 supported
by the data.
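As an illustration, the per-lag test can be sketched in plain Python; the function name and the simulated residuals below are hypothetical, not part of the lecture's Matlab workflow:

```python
import math
import random

def whiteness_test(eps, max_lag, conf=1.96):
    """Per-lag whiteness test: accept lag tau if |x(tau)| <= conf/sqrt(N),
    where x(tau) = r_hat(tau) / r_hat(0)."""
    N = len(eps)

    def r_hat(tau):
        # r_hat(tau) = (1/N) * sum_{k=1}^{N-tau} eps(k+tau) * eps(k)
        return sum(eps[k + tau] * eps[k] for k in range(N - tau)) / N

    r0 = r_hat(0)
    threshold = conf / math.sqrt(N)
    return {tau: abs(r_hat(tau) / r0) <= threshold
            for tau in range(1, max_lag + 1)}

# Truly white residuals should pass at (almost) all lags; each lag still
# fails with probability about 0.05 even when the hypothesis is true.
random.seed(0)
eps = [random.gauss(0.0, 1.0) for _ in range(2000)]
accepted = whiteness_test(eps, max_lag=10)
```

Note that occasional rejections at isolated lags are expected by construction of the 0.95 confidence level.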


Independence: Correlations

To verify the independence of ε from u, use the cross-correlation
function:

r_εu(τ) = E{ε(k + τ)u(k)}

1. If (I1) is true, then r_εu(τ) = 0 for τ ≥ 0.
2. If (I2) is true, then r_εu(τ) = 0 for any τ.

Estimation from data and normalization:

r̂_εu(τ) = (1/N) Σ_{k=1}^{N−τ} ε(k + τ)u(k),   if τ ≥ 0
r̂_εu(τ) = (1/N) Σ_{k=1−τ}^{N} ε(k + τ)u(k),   if τ < 0

x(τ) = r̂_εu(τ) / √(r̂_ε(0) r̂_u(0))


Independence test at τ

Like before, for large N, x(τ) is distributed according to the Gaussian
N(0, 1/N), and therefore:

P(|x(τ)| ≤ 1.96/√N) = 0.95

Independence test at τ
If |x(τ)| ≤ 1.96/√N, then the independence hypothesis r_εu(τ) = 0 is
accepted.
Otherwise, the independence hypothesis is rejected.

In practice, we require the hypothesis to hold at all τ ≥ 0 supported
by the data. If the model is accurate, then checking it at τ < 0 verifies
the presence of feedback.
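The cross-correlation test admits the same kind of sketch; again the function name and the independently generated signals are illustrative assumptions:

```python
import math
import random

def independence_test(eps, u, max_lag, conf=1.96):
    """Cross-correlation test: accept lag tau if |x(tau)| <= conf/sqrt(N),
    where x(tau) = r_hat_eu(tau) / sqrt(r_hat_e(0) * r_hat_u(0))."""
    N = len(eps)

    def r_hat_eu(tau):
        # Sum limits depend on the sign of tau so all indices stay in range
        if tau >= 0:
            return sum(eps[k + tau] * u[k] for k in range(N - tau)) / N
        return sum(eps[k + tau] * u[k] for k in range(-tau, N)) / N

    r_e0 = sum(e * e for e in eps) / N
    r_u0 = sum(x * x for x in u) / N
    threshold = conf / math.sqrt(N)
    return {tau: abs(r_hat_eu(tau)) / math.sqrt(r_e0 * r_u0) <= threshold
            for tau in range(-max_lag, max_lag + 1)}

# Residuals generated independently of the input pass at most lags,
# including tau < 0 (no feedback in this synthetic example).
random.seed(1)
u = [random.gauss(0.0, 1.0) for _ in range(2000)]
eps = [random.gauss(0.0, 1.0) for _ in range(2000)]
accepted = independence_test(eps, u, max_lag=5)
```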



Matlab example: Experimental data


The real system is in output-error form:
y(k) =

B(q 1 )
u(k) + e(k )
F (q 1 )

and has order n = 3.


plot(id); and plot(val);


Matlab: ARX model


First, we try an ARX model:
mARX = arx(id, [3, 3, 1]); resid(mARX, id);

The model fails the whiteness test (W). This is because the system is
not within the model class, leading to an inaccurate model. The
model is rejected.


Matlab: ARX model (continued)

Simulating on the validation data confirms the fact that the model is
poor.


Matlab: OE model
mOE = oe(id, [3, 3, 1]); resid(mOE, id);

The OE model passes all the tests, as expected, because the OE
model class contains the real system. The model is therefore
validated.
Important note: The Matlab functions use 0.99 probability instead of
0.95.


Matlab: OE model (continued)

Simulating the model on the validation data confirms the model has
good quality.

Global correlation tests

Chi-squared distribution

Let x be a vector of m random variables x_i, so that each x_i has a
Gaussian distribution N(0, 1).
Define χ²(m) as the distribution of the sum Σ_{i=1}^{m} x_i² = xᵀx.
This is called the chi-squared distribution (Matlab: chi2pdf).

Image credit: Wikipedia


Whiteness test

Define r = [r̂_ε(1), ..., r̂_ε(m)]ᵀ, and choose a small α, e.g., α = 0.05.
Define χ²_α(m) to be the value such that a random variable z ≥ 0
distributed according to χ²(m) has probability 1 − α of being at most
χ²_α(m):

P(z ≤ χ²_α(m)) = 1 − α

(Compare: P(|x(τ)| ≤ 1.96/√N) = 0.95 for the single-τ case.)

Whiteness test
If N · rᵀr / r̂_ε(0)² ≤ χ²_α(m), then the hypothesis that the sequence
ε(k) is white noise is accepted.
Otherwise, the hypothesis is rejected.

(Compare: |x(τ)| ≤ 1.96/√N.)
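A sketch of the global test in Python; to avoid a statistics library, the chi-squared critical value for m = 5 and α = 0.05 is hardcoded from a standard table (an assumption of this sketch), and the colored residual sequence is synthetic:

```python
import random

# chi^2 critical value for alpha = 0.05 and m = 5 degrees of freedom,
# taken from a standard chi-squared table.
CHI2_CRIT_M5 = 11.07

def global_whiteness_test(eps, m=5, chi2_crit=CHI2_CRIT_M5):
    """Accept whiteness if N * (r^T r) / r_hat(0)^2 <= chi2_alpha(m),
    with r = [r_hat(1), ..., r_hat(m)]^T."""
    N = len(eps)

    def r_hat(tau):
        return sum(eps[k + tau] * eps[k] for k in range(N - tau)) / N

    stat = N * sum(r_hat(tau) ** 2 for tau in range(1, m + 1)) / r_hat(0) ** 2
    return stat <= chi2_crit

# A strongly autocorrelated (MA(1)-style) sequence gives a huge test
# statistic and is rejected as white noise.
random.seed(2)
w = [random.gauss(0.0, 1.0) for _ in range(2000)]
colored = [w[k] + w[k - 1] for k in range(1, 2000)]
```

Unlike the per-lag version, a single scalar statistic summarizes all m lags at once.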
N


Independence test

The test can be extended in a similar way; the formalism is
complicated and will not be given here.

Structure selection

Consider that we are given several model structures M1, M2, ..., Mℓ.
Example: ARX structures of increasing order.
How to choose among them?
First idea: choose the Mi leading to the smallest mean squared error:

V(θ̂) = (1/N) Σ_{k=1}^{N} ε(k)²

This does not take into account the complexity of the model, so it may
lead to overfitting. We explore other options (without going into their
derivation).


Akaike's information criterion

W_AIC = N log V(θ̂) + 2p,  or equivalently:  log V(θ̂) + 2p/N

where N is the number of data points and p the number of
parameters (e.g., na + nb in ARX).
Choice: Model with the smallest W_AIC.
Intuition: The term 2p penalizes the complexity of the model (number
of parameters). Taking the logarithm of the MSE allows us to better take
into account small values of the MSE.
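The selection rule is a one-liner once the MSEs are available; the MSE values below are hypothetical, chosen only to show the complexity penalty at work (p = 2n for an ARX model with na = nb = n):

```python
import math

def aic(V, N, p):
    """Akaike's information criterion: W_AIC = N * log(V) + 2 * p."""
    return N * math.log(V) + 2 * p

# Hypothetical MSE values for ARX models of increasing order n:
# order 3 fits much better than order 2, while order 5 brings no
# improvement, only more parameters to penalize.
mse = {2: 0.080, 3: 0.020, 5: 0.020}
N = 200
best = min(mse, key=lambda n: aic(mse[n], N, p=2 * n))
# best is 3: the extra parameters of order 5 are not paid for by a lower MSE
```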


Final prediction error

W_FPE = V(θ̂) (1 + p/N) / (1 − p/N)

Choice: Model with the smallest W_FPE.
Intuition: When N is large:

V(θ̂) (1 + p/N) / (1 − p/N) ≈ V(θ̂) (1 + 2p/N)

The term 2p/N penalizes model complexity.
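The large-N approximation is easy to check numerically; the values of V, N, and p below are illustrative:

```python
def fpe(V, N, p):
    """Final prediction error: W_FPE = V * (1 + p/N) / (1 - p/N)."""
    return V * (1 + p / N) / (1 - p / N)

# For large N the criterion behaves like V * (1 + 2p/N), so the
# complexity penalty is again roughly 2p/N, as in AIC.
V, N, p = 0.02, 1000, 6
exact = fpe(V, N, p)
approx = V * (1 + 2 * p / N)
```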


F-test

Take just two model classes M1, M2 such that M1 ⊂ M2 (e.g., they
are both ARX models and M2 has larger na and nb).
Choice: M1 is chosen over M2 (M2 does not offer a significant
improvement) if:

N (V1(θ̂1) − V2(θ̂2)) / V2(θ̂2) ≤ χ²_α(p2 − p1)

χ²_α is defined as before using the chi-squared distribution, as the
value such that a random variable z ≥ 0 distributed according to
χ²(p2 − p1) has probability 1 − α of being at most χ²_α:

P(z ≤ χ²_α) = 1 − α

Subscripts 1 and 2 refer to the respective model classes. E.g., p2 is
the number of parameters of M2, and is larger than p1.
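A sketch of the decision rule; the critical value for p2 − p1 = 4 and α = 0.05 is hardcoded from a standard table, and the MSE values are hypothetical:

```python
# chi^2 critical value for alpha = 0.05 and p2 - p1 = 4 degrees of
# freedom, taken from a standard chi-squared table.
CHI2_CRIT_DOF4 = 9.49

def prefer_simpler(V1, V2, N, chi2_crit=CHI2_CRIT_DOF4):
    """Choose M1 over M2 when M2's improvement is not significant:
    N * (V1 - V2) / V2 <= chi2_alpha(p2 - p1)."""
    return N * (V1 - V2) / V2 <= chi2_crit

# M2 barely improves on M1 -> keep the simpler M1:
keep_m1 = prefer_simpler(V1=0.0201, V2=0.0200, N=1000)
# M2 improves a lot -> the extra parameters are justified:
keep_m1_bad = prefer_simpler(V1=0.0800, V2=0.0200, N=1000)
```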

Matlab example

An OE system with n = 2.

Matlab: selstruc with AIC

Recall arxstruc:
Na = 1:15; Nb = 1:15; Nk = 1:5;
NN = struc(Na, Nb, Nk); V = arxstruc(id, val, NN);
struc generates all combinations of orders in Na, Nb, Nk.
arxstruc identifies for each combination an ARX model on the
data id, simulates it on the data val, and returns information
about the MSEs, model orders etc. in V.


Matlab: selstruc with AIC (continued)


To choose the structure with the Akaikes information criterion:
N = selstruc(V, aic);
For our data, N= [8, 8, 1].
Alternatively, graphical selection also allows using AIC:
N = selstruc(V, plot);

Matlab: Results

Remarks

AIC, FPE also work if the system is not in the model class.
Matlab offers functions aic, fpe that compute these criteria for a
list of models.

Testing for overparametrization

Motivation

Consider that the real system obeys the ARMAX structure:

A0(q^-1)y(k) = B0(q^-1)u(k) + C0(q^-1)e(k)

where subscript 0 indicates quantities related to the real system.
This is equivalent to the model:

M(q^-1)A0(q^-1)y(k) = M(q^-1)B0(q^-1)u(k) + M(q^-1)C0(q^-1)e(k)

with M(q^-1) some polynomial of order nm.
So, using ARMAX identification with na = na0 + nm, nb = nb0 + nm,
nc = nc0 + nm will produce an accurate model. This model is,
however, too complicated (overparametrized), and will have some
(nearly) common factors M(q^-1) in all polynomials.
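The effect of the common factor can be illustrated with plain polynomial multiplication; the coefficients of A0, B0, and M below are hypothetical:

```python
def polymul(a, b):
    """Multiply polynomials in q^-1 given as coefficient lists [c0, c1, ...]."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Hypothetical true ARMAX polynomials and a common factor M(q^-1):
A0 = [1.0, -0.5]   # A0(q^-1) = 1 - 0.5 q^-1
B0 = [0.0, 1.0]    # B0(q^-1) = q^-1
M = [1.0, 0.3]     # M(q^-1) = 1 + 0.3 q^-1

# The overparametrized model multiplies every polynomial by M,
# raising all orders by nm = 1 while describing the same system:
A = polymul(M, A0)
B = polymul(M, B0)
```

Since A and B share the factor M, the identified transfer function B/A has a (near-)canceling pole-zero pair at the root of M.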


Pole-zero cancellations

This type of situation can be identified by checking whether some
poles and zeros of the model cancel each other.
We exemplify using the Matlab function pzmap, which shows the poles
and zeros of G in the generic model:

y(k) = G(q^-1)u(k) + v(k)


Matlab: overparameterized OE model


On the same data as for correlation tests (recall system has order
n = 3):
mOE = oe(id, [5, 5, 1]);
Looking at the validation data, the model is accurate:


Matlab: testing for pole-zero cancellations

pzmap(mOE, 'sd', 1.96);
The arguments 'sd', nsd specify the confidence region as a number of
standard deviations (nsd taken here as 1.96 to get 0.95 confidence).

Two pairs of poles and zeros have overlapping confidence regions, so
they are likely canceling each other. This indicates the order should
be 3 (the true system order).
