Nonlinear Systems Identification


Mario Milanese, Carlo Novara

Dipartimento di Automatica e Informatica


Politecnico di Torino
The nonlinear system ID problem
• Data are generated by the nonlinear system $f_o$:

$$y^{t+1} = f_o(w^t), \qquad w^t = [y^t \,\cdots\, y^{t-n_y} \;\; u^t \,\cdots\, u^{t-n_u}]^T$$

$u^t$: known variables

• The system $f_o$ is unknown, but a finite number of noise-corrupted measurements of $y^t$, $w^t$ are available:

$$\tilde{y}^{t+1} = f_o(\tilde{w}^t) + d^t, \qquad t = 1, \ldots, N$$

$d^t$ accounts for errors in the data $\tilde{y}^t$, $\tilde{w}^t$
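As a concrete illustration, here is a minimal Python sketch of such a data-generating experiment. The quadratic $f_o$, the regressor orders $n_y = n_u = 1$, the noise level, and the horizon $N$ are all hypothetical choices for the example, not taken from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200                              # number of measurements (assumed)

def f_o(w):
    # hypothetical "true" system, w = [y^t, y^{t-1}, u^t, u^{t-1}]
    return 0.5 * w[0] - 0.2 * w[0] * w[1] + 0.3 * w[2] + 0.1 * w[3]

u = rng.uniform(-1.0, 1.0, N + 2)    # known input signal u^t
y = np.zeros(N + 2)
for t in range(1, N + 1):
    w = np.array([y[t], y[t - 1], u[t], u[t - 1]])
    d = 0.01 * rng.standard_normal() # d^t: measurement error
    y[t + 1] = f_o(w) + d            # noisy measurement of y^{t+1}
```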

• Identification problem: find an estimate $\hat{f} \cong f_o$


The nonlinear system ID problem

• Related problems:
  – for a given estimate $\hat{f} \cong f_o$, evaluate the identification error $f_o - \hat{f}$
  – find an estimate $\hat{f} \cong f_o$ "minimizing" the identification error

• The identification error cannot be exactly evaluated, since $f_o$ and $d^t$ are not known

• Prior assumptions on $f_o$ and $d^t$ are needed to derive finite bounds on the identification error
Parametric approach

• Typical assumptions in the literature:
  – on the system: $f_o \in \Psi(\theta) = \left\{ f(w,\theta) = \sum_{i=1}^{r} \alpha_i \sigma_i(w, \beta_i) \right\}$
  – on the noise: iid stochastic

• Functional form of $f_o$:
  – derived from physical laws
  – $\sigma_i$: "basis" functions (polynomial, sigmoid, ...)

• Parameters $\theta$ are estimated by means of the Prediction Error (PE) method
Parametric approach

• Predictor: $\hat{y}^{t+1} = f(w^t, \theta) = \sum_{i=1}^{r} \alpha_i \sigma_i(w^t, \beta_i)$

• Given $N$ noise-corrupted measurements of $y^t$, $w^t$:

$$\begin{aligned} y^2 &= f(w^1, \theta) + \varepsilon^2 \\ y^3 &= f(w^2, \theta) + \varepsilon^3 \\ &\;\;\vdots \\ y^{N+1} &= f(w^N, \theta) + \varepsilon^{N+1} \end{aligned} \qquad\Longrightarrow\qquad Y = F(\theta) + D_\varepsilon$$

where $Y$ is the measured output, $D_\varepsilon$ the vector of prediction errors, and $F(\theta)$ a known function of $\theta$.
Parametric approach

• Given the measurement equation:

$$Y = F(\theta) + D_\varepsilon$$

it is possible to estimate $\theta$ by means of the Prediction Error (PE) method:

$$\hat{\theta}_{LS} = \arg\min_\theta V_N(\theta), \qquad V_N(\theta) = \frac{1}{N} D_\varepsilon^T D_\varepsilon = \frac{1}{N} \left[Y - F(\theta)\right]^T \left[Y - F(\theta)\right]$$

• Problem: $V_N(\theta)$ is in general non-convex
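A minimal sketch of how this estimate can be computed numerically, assuming a tunable sigmoidal basis with $r$ units; since $V_N$ is non-convex, a local optimizer is restarted from several random initial points. The model structure, optimizer, and restart count are illustrative choices, not prescribed by the slides:

```python
import numpy as np
from scipy.optimize import minimize

def f(W, theta, r=5):
    # tunable sigmoidal basis: f(w, theta) = sum_i alpha_i * tanh(beta_i . w)
    n = W.shape[-1]
    alpha, beta = theta[:r], theta[r:].reshape(r, n)
    return np.tanh(W @ beta.T) @ alpha

def V_N(theta, W, Y, r=5):
    # prediction-error cost: (1/N) [Y - F(theta)]^T [Y - F(theta)]
    e = Y - f(W, theta, r)
    return e @ e / len(Y)

def pe_estimate(W, Y, r=5, restarts=10, seed=0):
    # W: (N, n) regressor matrix, Y: (N,) measured outputs
    # V_N is non-convex, so restart a local optimizer from random points
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(restarts):
        theta0 = 0.1 * rng.standard_normal(r + r * W.shape[1])
        res = minimize(V_N, theta0, args=(W, Y, r), method="BFGS")
        if best is None or res.fun < best.fun:
            best = res
    return best.x
```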


Parametric approach

• If possible, physical laws are used to obtain the parametric representation of $f(w, \theta)$.

• When the physical laws are not well known or too complex, black-box parameterizations are used:
  – fixed basis parameterization: polynomial, trigonometric, etc.
  – tunable basis parameterization: neural networks
Fixed basis functions

$$f(w,\theta) = \sum_{i=1}^{r} \alpha_i \sigma_i(w), \qquad \theta = [\alpha_1 \cdots \alpha_r]^T$$

$\sigma_i(w)$: basis functions

• Problem: can $\sigma_i$'s be found such that

$$f(w,\theta) \xrightarrow{\;r \to \infty\;} f_o(w) \;?$$
Fixed basis functions

• For continuous $f_o$, bounded $W \subset \mathbb{R}^n$ and $\sigma_i$ polynomial of degree $i$ (Weierstrass):

$$\lim_{r \to \infty} \; \sup_{w \in W} \left| f_o(w) - f(w,\theta) \right| = 0$$

⇒ Polynomial models
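A quick numerical illustration of this limit on a hypothetical scalar target; a least-squares polynomial fit is used here, which only approximates the best uniform approximant, so the decay is indicative rather than exact:

```python
import numpy as np

# sup-norm error of polynomial fits of increasing degree r to a
# continuous target on a bounded set, illustrating the Weierstrass limit
f_o = lambda w: np.exp(-w) * np.sin(3.0 * w)   # hypothetical continuous f_o
W = np.linspace(0.0, 2.0, 400)                 # bounded regressor set

for r in (2, 4, 8, 12):
    coef = np.polynomial.polynomial.polyfit(W, f_o(W), r)
    fit = np.polynomial.polynomial.polyval(W, coef)
    print(f"r = {r:2d}: sup error = {np.max(np.abs(f_o(W) - fit)):.2e}")
```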
Fixed basis functions
$$f(w,\theta) = \sum_{i=1}^{r} \alpha_i \sigma_i(w), \qquad \theta = [\alpha_1 \cdots \alpha_r]^T$$

• NARX models: PE estimation of $\theta$ is a linear problem:

$$Y = L\theta + D_\varepsilon, \qquad L = \begin{bmatrix} \sigma_1(w^1) & \cdots & \sigma_r(w^1) \\ \vdots & \ddots & \vdots \\ \sigma_1(w^N) & \cdots & \sigma_r(w^N) \end{bmatrix}, \qquad Y = \begin{bmatrix} y^2 \\ \vdots \\ y^{N+1} \end{bmatrix}$$

• Least squares solution: $\hat{\theta}_{LS} = (L^T L)^{-1} L^T Y$
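A minimal sketch of this linear estimation step, assuming the stacked regressors and a user-chosen list of basis functions are given; the quadratic basis below is only an example:

```python
import numpy as np

def narx_ls(Y, W, basis):
    # W: (N, n) stack of regressors w^1..w^N; Y: (N,) outputs y^2..y^{N+1}
    # basis: list of functions sigma_i mapping a regressor (n,) to a scalar
    L = np.column_stack([[s(w) for w in W] for s in basis])
    # lstsq is numerically preferable to forming (L^T L)^{-1} L^T Y explicitly
    theta, *_ = np.linalg.lstsq(L, Y, rcond=None)
    return theta

# example: quadratic basis for a 2-dimensional regressor w = [y^t, u^t]
basis = [lambda w: 1.0,
         lambda w: w[0],
         lambda w: w[1],
         lambda w: w[0] * w[1],
         lambda w: w[0] ** 2]
```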
Tunable basis functions
$$f(w,\theta) = \sum_{i=1}^{r} \alpha_i \sigma_i(w, \beta_i), \qquad \theta = [\alpha_1 \cdots \alpha_r \;\; \beta_1^1 \cdots \beta_r^q]^T, \quad \beta_i \in \mathbb{R}^q$$

• One of the most common tunable parameterizations is the one-hidden-layer sigmoidal neural network.
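A minimal sketch of such a network written in the tunable-basis form above; the unit count $r$, regressor dimension $n$, and random parameter values are purely illustrative:

```python
import numpy as np

def sigmoidal_net(w, alpha, B, b):
    # one-hidden-layer sigmoidal network:
    #   f(w, theta) = sum_{i=1}^r alpha_i * sigma(B[i] . w + b[i])
    # where theta collects the output weights alpha and the tunable
    # basis parameters beta_i = (B[i], b[i]) of each sigmoid unit
    sigma = lambda z: 1.0 / (1.0 + np.exp(-z))   # logistic sigmoid
    return sigma(w @ B.T + b) @ alpha

# r = 10 units on an n = 4 regressor; shapes here are illustrative
rng = np.random.default_rng(0)
r, n = 10, 4
alpha = rng.standard_normal(r)
B, b = rng.standard_normal((r, n)), rng.standard_normal(r)
y_hat = sigmoidal_net(rng.standard_normal(n), alpha, B, b)
```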
Parametric models
• Model structure choice:
  – basis functions
  – number of basis functions
  – number of regressors

The complexity of these choices may be exponential in $n$.

• Problem: curse of dimensionality

The number of parameters needed to obtain "accurate" models may grow exponentially with the dimension $n$ of the regressor space. This is more relevant in the case of fixed basis functions, as the count below shows.
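The growth is easy to quantify for the fixed polynomial basis: the number of monomials of total degree at most $d$ in $n$ variables is $\binom{n+d}{d}$. A few values, computed with the snippet below (which assumes only this standard count):

```python
from math import comb

# number of monomial basis functions of total degree <= d in n variables
for n in (2, 5, 10, 20):
    print(f"n = {n:2d}:", {d: comb(n + d, d) for d in (2, 3, 4)})
# n = 20 already needs comb(24, 4) = 10626 parameters at degree 4
```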


Tunable basis functions

• Under suitable regularity conditions on the function to approximate, the number of parameters required to obtain "accurate" models grows linearly with $n$.

• Estimation of $\theta$ requires solving a non-convex minimization problem (even for NARX models).

⇒ Trapping in local minima


Nonlinear regression systems

• Consider a nonlinear system in regression form:

$$y^{t+1} = f(w^t) + d^{t+1}$$

where:
  – $w^t$: regressor. It defines the system structure:

$$w^t = [y^t \; y^{t-1} \,\ldots\, u^t \; u^{t-1} \,\ldots]^T \;\Leftrightarrow\; \text{NARX}$$
$$w^t = [f(w^{t-1}) \; f(w^{t-2}) \,\ldots\, u^t \; u^{t-1} \,\ldots]^T \;\Leftrightarrow\; \text{NOE}$$
$$w^t = [y^t \; y^{t-1} \,\ldots\, u^t \; u^{t-1} \,\ldots\, d^t \; d^{t-1} \,\ldots]^T \;\Leftrightarrow\; \text{NARMAX}$$

  – $u$: input signal
  – $d$: noise acting on the system
Nonlinear regression systems

• The predictor of system $f$ is defined as:

$$\hat{y}^{t+1} = f(w^t)$$

where:

$$w^t = [y^t \; y^{t-1} \,\ldots\, u^t \; u^{t-1} \,\ldots]^T \;\Leftrightarrow\; \text{NARX}$$
$$w^t = [\hat{y}^t \; \hat{y}^{t-1} \,\ldots\, u^t \; u^{t-1} \,\ldots]^T \;\Leftrightarrow\; \text{NOE}$$
$$w^t = [y^t \; y^{t-1} \,\ldots\, u^t \; u^{t-1} \,\ldots\, \varepsilon^t \; \varepsilon^{t-1} \,\ldots]^T \;\Leftrightarrow\; \text{NARMAX}$$

$\varepsilon^t = y^t - \hat{y}^t$: prediction error
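A minimal sketch contrasting the NARX and NOE predictors for some fitted model $f$, with the regressor truncated to two lags of each signal for readability; the lag count and signal handling are illustrative assumptions:

```python
import numpy as np

def predict_narx(f, y, u):
    # one-step-ahead NARX predictor: the regressor uses measured outputs
    return np.array([f(np.array([y[t], y[t - 1], u[t], u[t - 1]]))
                     for t in range(1, len(y) - 1)])

def predict_noe(f, y_init, u):
    # NOE (simulation) predictor: the regressor feeds back the model's
    # own past predictions instead of measured outputs
    y_hat = [y_init[0], y_init[1]]               # initial conditions
    for t in range(1, len(u) - 1):
        w = np.array([y_hat[t], y_hat[t - 1], u[t], u[t - 1]])
        y_hat.append(f(w))
    return np.array(y_hat)
```

Because the NOE predictor feeds its own outputs back into the regressor, any model mismatch can accumulate over the horizon, whereas the NARX predictor is reset by the measured outputs at every step.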
