Kalman and Particle Filtering



thierry.chonavel@imt-atlantique.fr

FC IMT

Outline

1. About filtering
2. State space models
3. The Kalman filter equations
4. Extended Kalman filter
5. Particle filtering
What is a filter and what is it for?

A filter convolves an input signal with an impulse response, which in the frequency domain is a product:

$$y_t = h_t * x_t \;\rightarrow\; Y(f) = H(f)X(f)$$

What can it be used for?
- Decompose a signal. Examples: split a signal into subbands (hi-fi loudspeakers), select a band of interest (FM station selection).
- Detection. Example: the matched filter.
- Approximate a signal from another one. Example: linear regression, denoising by means of a Wiener filter.
- Data compression. Example: compression by linear prediction.
- ...
- Here: estimation in state space models.
State space models: description

Limitation of Wiener filters: a stationarity assumption, or complexity that grows with the horizon size in the non-stationary case.

State space models:
- State equation: x_{t+1} = f_{t+1}(x_t, v_{t+1})
- Observation equation: y_t = h_t(x_t, n_t)
- Model:

$$\begin{cases} x_{t+1} = f_{t+1}(x_t, v_{t+1}), \\ y_t = h_t(x_t, n_t) \end{cases}$$
State space models: example

Trajectory of a ship.
The kinematic equations yield the state equations: position, speed and acceleration are given by the vectors (x_1)_t, (x_2)_t, (x_3)_t, with d(x_1)_t/dt = (x_2)_t and d(x_2)_t/dt = (x_3)_t.
Observation: a radar supplies a noisy version of position and speed.
Continuous-time model → discrete-time model:
- sampling interval: ∆
- (x_1)_t = (x_1)_{t-1} + ∆(x_2)_{t-1}
- (x_2)_t = (x_2)_{t-1} + ∆(x_3)_{t-1}
- (x_3)_t = ρ(x_3)_{t-1} + v_t (ρ and σ_v² are related to the kinematic properties of the ship)
State space model: example

In matrix form, the state and observation equations read

$$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}_{t+1} = \begin{pmatrix} I & \Delta I & 0 \\ 0 & I & \Delta I \\ 0 & 0 & \rho I \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}_t + \begin{pmatrix} 0 \\ 0 \\ I \end{pmatrix} v_{t+1}$$

$$y_t = \begin{pmatrix} I & 0 & 0 \\ 0 & I & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}_t + n_t$$
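
As an illustration, this discrete-time model is easy to simulate. The sketch below assumes scalar position, speed and acceleration components (so I = 1) and uses illustrative values for ∆, ρ and the noise levels, none of which are fixed by the slides:

```python
import numpy as np

# Simulation of the discretized ship model with scalar components (I = 1).
# delta, rho and the noise standard deviations are illustrative choices.
delta, rho = 1.0, 0.9
sigma_v, sigma_n = 1.0, 0.5

F = np.array([[1.0, delta, 0.0],
              [0.0, 1.0,  delta],
              [0.0, 0.0,  rho]])        # state transition matrix
G = np.array([0.0, 0.0, 1.0])           # noise enters the acceleration only
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])         # radar observes position and speed

rng = np.random.default_rng(0)
x = np.zeros(3)
states, observations = [], []
for t in range(100):
    x = F @ x + G * rng.normal(0.0, sigma_v)        # state equation
    y = H @ x + rng.normal(0.0, sigma_n, size=2)    # observation equation
    states.append(x.copy())
    observations.append(y)
```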
Linear prediction in state space models (I)

Let us consider a linear state space model

$$\begin{cases} x_{t+1} = F_{t+1} x_t + G_{t+1} v_{t+1}, \\ y_t = H_t x_t + U_t n_t \end{cases}$$

with zero-mean random variables; x_1, (v_2, n_1), ..., (v_{t+1}, n_t), ... are assumed uncorrelated.

One can consider the problem of estimating x_t from
- y_1, ..., y_{t-k} → prediction (k > 0)
- y_1, ..., y_t → filtering
- y_1, ..., y_t, ..., y_{t+k} → smoothing
Linear prediction in state space models (II)

One can solve prediction, filtering or smoothing by considering linear estimators.

For filtering, the complexity of minimizing

$$\min_{A_1^{(t)},\dots,A_t^{(t)}} E\Big[\big\| x_t - \sum_{k=1}^{t} A_k^{(t)} y_k \big\|^2\Big]$$

increases with t (in O(t³)):

$$[A_1^{(t)},\dots,A_t^{(t)}] = E[x_t Y_t^T] \times (E[Y_t Y_t^T])^{-1}, \quad \text{with } Y_t^T = (y_1^T,\dots,y_t^T).$$

Solution: an iterative implementation of the calculation of the inverse of E[Y_t Y_t^T]. This can be achieved by Kalman filtering.
Kalman filter equations

$$\begin{cases} x_{t+1} = F_{t+1} x_t + G_{t+1} v_{t+1}, \\ y_t = H_t x_t + U_t n_t \end{cases}$$

Notation:

$$\hat{x}_{t|\tau} = \widehat{x_t|y_{1:\tau}} = \arg\min_{\sum_{k=1}^{\tau} A_k^{(\tau)} y_k} E\Big[\big\| x_t - \sum_{k=1}^{\tau} A_k^{(\tau)} y_k \big\|^2\Big].$$

We assume that x̂_{1|0} = 0, E[v_t] = 0, E[n_t] = 0, cov(v_t) = Q_t and cov(n_t) = R_t.

Solution:

$$\hat{x}_{t|t-1} = F_t \hat{x}_{t-1|t-1}, \qquad \hat{x}_{t|t} = \hat{x}_{t|t-1} + K_t (y_t - H_t \hat{x}_{t|t-1}). \tag{1}$$

It remains to calculate K_t (the Kalman gain).
Kalman equations

$$\begin{aligned}
\hat{x}_{t|t-1} &= F_t \hat{x}_{t-1|t-1} \\
P_{t|t-1} &= F_t P_{t-1|t-1} F_t^T + G_t Q_t G_t^T \\
K_t &= P_{t|t-1} H_t^T \,[H_t P_{t|t-1} H_t^T + U_t R_t U_t^T]^{-1} \\
\hat{x}_{t|t} &= \hat{x}_{t|t-1} + K_t \,[y_t - H_t \hat{x}_{t|t-1}] \\
P_{t|t} &= P_{t|t-1} - K_t H_t P_{t|t-1} = P_{t|t-1} - K_t (H_t P_{t|t-1} H_t^T + U_t R_t U_t^T) K_t^T
\end{aligned}$$

where P_{t|t-1} = cov(x_t - x̂_{t|t-1}) and P_{t|t} = cov(x_t - x̂_{t|t}).
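
These recursions translate directly into code. Below is a minimal sketch of one predict/update step; the function and argument names are chosen here for illustration, and the model matrices are assumed already built:

```python
import numpy as np

def kalman_step(x_post, P_post, y, F, G, H, U, Q, R):
    """One Kalman predict/update step for
    x_{t+1} = F x_t + G v_{t+1},  y_t = H x_t + U n_t."""
    # Prediction
    x_pred = F @ x_post
    P_pred = F @ P_post @ F.T + G @ Q @ G.T
    # Gain: K_t = P_{t|t-1} H^T (H P_{t|t-1} H^T + U R U^T)^{-1}
    S = H @ P_pred @ H.T + U @ R @ U.T
    K = P_pred @ H.T @ np.linalg.inv(S)
    # Update
    x_post = x_pred + K @ (y - H @ x_pred)
    P_post = P_pred - K @ H @ P_pred
    return x_post, P_post
```

Iterating this step over y_1, y_2, ... yields x̂_{t|t} at a constant cost per observation, instead of the O(t³) batch solution of the previous slide.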
Non-linear state space model

$$\begin{cases} x_{t+1} = f_{t+1}(x_t, v_{t+1}), \\ y_t = h_t(x_t, n_t) \end{cases}$$

The state is estimated from a local linear approximation of the state space model.
Extended Kalman filter

The model is linearized around the latest available estimates:

$$\begin{cases} x_{t+1} \approx f_{t+1}(\hat{x}_{t|t}, 0) + F_{t+1}(x_t - \hat{x}_{t|t}) + G_{t+1} v_{t+1}, \\ y_t \approx h_t(\hat{x}_{t|t-1}, 0) + H_t (x_t - \hat{x}_{t|t-1}) + U_t n_t, \end{cases}$$

where

$$F_{t+1} = \frac{\partial f_{t+1}}{\partial x}(\hat{x}_{t|t}, 0), \quad G_{t+1} = \frac{\partial f_{t+1}}{\partial v}(\hat{x}_{t|t}, 0), \quad H_t = \frac{\partial h_t}{\partial x}(\hat{x}_{t|t-1}, 0), \quad U_t = \frac{\partial h_t}{\partial n}(\hat{x}_{t|t-1}, 0).$$
Extended Kalman filter: solution

$$\begin{aligned}
\hat{x}_{t|t-1} &= f_t(\hat{x}_{t-1|t-1}, 0) \\
P_{t|t-1} &= F_t P_{t-1|t-1} F_t^T + G_t Q_t G_t^T \\
K_t &= P_{t|t-1} H_t^T \,[H_t P_{t|t-1} H_t^T + U_t R_t U_t^T]^{-1} \\
\hat{x}_{t|t} &= \hat{x}_{t|t-1} + K_t \,[y_t - h_t(\hat{x}_{t|t-1}, 0)] \\
P_{t|t} &= P_{t|t-1} - K_t H_t P_{t|t-1} = P_{t|t-1} - K_t \,[H_t P_{t|t-1} H_t^T + U_t R_t U_t^T] K_t^T
\end{aligned}$$
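
A minimal sketch of one EKF step; the model functions f, h and their Jacobians are supplied by the caller, and the names below are chosen for this illustration:

```python
import numpy as np

def ekf_step(x_post, P_post, y, f, h, jac_fx, jac_fv, jac_hx, jac_hn, Q, R):
    """One EKF predict/update step for x_{t+1} = f(x_t, v_{t+1}),
    y_t = h(x_t, n_t); the jac_* callables return the Jacobians of f and h
    with respect to x, v and n, evaluated at the current estimate."""
    # Prediction: propagate the estimate through the non-linear model
    F = jac_fx(x_post)
    G = jac_fv(x_post)
    x_pred = f(x_post, np.zeros(Q.shape[0]))
    P_pred = F @ P_post @ F.T + G @ Q @ G.T
    # Update: linearize the observation around the predicted state
    H = jac_hx(x_pred)
    U = jac_hn(x_pred)
    S = H @ P_pred @ H.T + U @ R @ U.T
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_post = x_pred + K @ (y - h(x_pred, np.zeros(R.shape[0])))
    P_post = P_pred - K @ H @ P_pred
    return x_post, P_post
```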

Problem: the linearization of the model yields a biased estimate x̂_{t|t} and an erroneous covariance.
Remark: there are techniques, such as the Unscented Kalman Filter (UKF), based on Gaussian approximations of non-linear transforms of Gaussian random variables, that address this problem.
Particle filters: introduction

Sequential observations y_1, y_2, ... + a state space model.
Problem: estimate unknown quantities (input, state, parameters, ...).
Linear case: the Kalman filter yields the linear regression of x_t on y_{1:t} (= E[x_t|y_{1:t}] in the Gaussian case).
Non-linear case: EKF → linearization problems; UKF → Gaussian approximation.
Non-linear and/or non-Gaussian case:
- sums of Gaussians (IMM) → approximation
- grid techniques → complexity
- particle filter → simple and well suited to parallel implementation.
Remark: particle filter = sequential Monte Carlo = bootstrap filter = condensation filter.
Model

State space model:

$$\begin{cases} x_{t+1} = f_{t+1}(x_t, v_{t+1}), \\ y_t = h_t(x_t, n_t). \end{cases} \tag{2}$$

- x = (x_t)_{t∈N}: unobserved, Markovian
- y = (y_t)_{t∈N*}: observations
- p(x_0), p(x_{t+1}|x_t) and p(y_t|x_t) are known.

Problem: calculate quantities related to p(x_{0:t}|y_{1:t}) or p(x_t|y_{1:t}), such as E_{p(x_{0:t}|y_{1:t})}[k_t(x_{0:t})] or max_{x_{t+k}} p(x_{t+k}|y_{1:t}).
Recurrences: there exist recurrence relationships for p(x_{t+k}|y_{1:t}): k > 0 → prediction, k = 0 → filtering, k < 0 → smoothing.
Filtering equations

One-step prediction:

$$p(x_{t+1}, y_{1:t}) = \int p(x_t, x_{t+1}, y_{1:t})\, dx_t = \int p(x_{t+1}|x_t)\, p(x_t, y_{1:t})\, dx_t$$
$$p(x_{t+1}|y_{1:t}) = \frac{p(x_{t+1}, y_{1:t})}{p(y_{1:t})} = \frac{p(x_{t+1}, y_{1:t})}{\int p(x_{t+1}, y_{1:t})\, dx_{t+1}}. \tag{3}$$

Filtering:

$$p(x_t, y_{1:t}) = p(y_t|x_t, y_{1:t-1})\, p(x_t, y_{1:t-1}) = p(y_t|x_t)\, p(x_t, y_{1:t-1})$$
$$p(x_t|y_{1:t}) = \frac{p(x_t, y_{1:t})}{\int p(x_t, y_{1:t})\, dx_t}. \tag{4}$$

Iterating these relations alternates prediction ↔ filtering.
Difficulty: heavy calculation of the integrals and of the normalizations p(y_{1:t}).
Monte Carlo sampling

Principle: sample N independent realizations {x_{0:t}^{(i)}; i = 1, ..., N}, named particles, from p(x_{0:t}|y_{1:t}):

$$d\hat{P}(x_{0:t}) = N^{-1} \sum_{i=1}^{N} \delta_{x_{0:t}^{(i)}}(x_{0:t}).$$

Then

$$I(f_t) = E_{p(x_{0:t}|y_{1:t})}[f_t(x_{0:t})] = \int f_t(x_{0:t})\, p(x_{0:t}|y_{1:t})\, dx_{0:t} \approx N^{-1} \sum_{i=1}^{N} f_t(x_{0:t}^{(i)}) = \hat{I}_N(f_t). \tag{5}$$

If σ²_Î = I(f_t²) − I²(f_t) < ∞,

$$N^{1/2}\, [\hat{I}_N(f_t) - I(f_t)] \to \mathcal{N}(0, \sigma^2_{\hat{I}}).$$

The convergence speed is independent of the dimension of x!
Problem: how to sample from p(x_{0:t}|y_{1:t})?
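
A toy illustration of (5) in a case where direct sampling from the target is possible (a standard normal, chosen only for this example, with f(x) = x², so I(f) = 1):

```python
import numpy as np

# Monte Carlo estimate of I(f) = E[f(X)] with X ~ N(0, 1) and f(x) = x^2;
# the exact value is 1, and the error shrinks like 1/sqrt(N).
rng = np.random.default_rng(0)
for N in (100, 10_000, 1_000_000):
    x = rng.standard_normal(N)
    print(N, np.mean(x**2))
```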
Importance sampling

Let π(x_{0:t}|y_{1:t}) be a proposal distribution with supp[p(x_{0:t}|y_{1:t})] ⊂ supp[π(x_{0:t}|y_{1:t})], and write

$$w(x_{0:t}) = \frac{p(x_{0:t}|y_{1:t})}{\pi(x_{0:t}|y_{1:t})}.$$

Then,

$$I(f_t) = E_{p(x_{0:t}|y_{1:t})}[f(x_{0:t})] = E_{\pi(x_{0:t}|y_{1:t})}[f(x_{0:t})\, w(x_{0:t})].$$

We can approximate I(f_t) by

$$\hat{I}_N(f) = \frac{\sum_{i=1}^{N} f(x_{0:t}^{(i)})\, w(x_{0:t}^{(i)})}{\sum_{i=1}^{N} w(x_{0:t}^{(i)})} = \sum_{i=1}^{N} f(x_{0:t}^{(i)})\, \tilde{w}(x_{0:t}^{(i)}),$$

with x_{0:t}^{(i)} ~ π(x_{0:t}|y_{1:t}), i = 1, ..., N.
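
A toy illustration with a Gaussian target and a wider Gaussian proposal (both chosen only for this example), estimating E_p[X²] = 1 by self-normalized importance sampling:

```python
import numpy as np

# Target p = N(0, 1), proposal pi = N(0, 2^2).
rng = np.random.default_rng(0)
N = 100_000
x = rng.normal(0.0, 2.0, size=N)                   # x^(i) ~ pi

def normal_pdf(x, sigma):
    return np.exp(-0.5 * (x / sigma)**2) / (sigma * np.sqrt(2 * np.pi))

w = normal_pdf(x, 1.0) / normal_pdf(x, 2.0)        # importance weights
w_tilde = w / w.sum()                              # normalized weights
print(np.sum(w_tilde * x**2))                      # estimates E_p[X^2] = 1
```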
Sequential importance sampling

Objective: for fixed t, do not resample the complete trajectories (x_{0:t}^{(i)})_{i=1,...,N}, but only (x_t^{(i)})_{i=1,...,N}.
Solution: set π(x_{0:t-1}|y_{1:t}) = π(x_{0:t-1}|y_{1:t-1}). Then,

$$\pi(x_{0:t}|y_{1:t}) = \pi(x_t|x_{0:t-1}, y_{1:t})\, \pi(x_{0:t-1}|y_{1:t-1}) \tag{6}$$

and

$$\tilde{w}_t^{(i)} \propto \frac{p(x_{0:t}^{(i)}, y_{1:t})}{\pi(x_{0:t}^{(i)}, y_{1:t})} \propto \frac{p(y_t|x_t^{(i)})\, p(x_t^{(i)}|x_{t-1}^{(i)})}{\pi(x_t^{(i)}|x_{0:t-1}^{(i)}, y_{1:t})}\, \tilde{w}_{t-1}^{(i)}. \tag{7}$$

Important particular case (bootstrap filter):

$$\pi(x_t|x_{0:t-1}^{(i)}, y_{1:t}) = p(x_t|x_{t-1}^{(i)}) \;\Rightarrow\; \tilde{w}_t^{(i)} \propto p(y_t|x_t^{(i)})\, \tilde{w}_{t-1}^{(i)}.$$
Resampling

Degeneracy problem: as t increases, all the w̃_t^{(i)} but one go to zero!
Solution: resample (a minimal code sketch follows below).
- (1) Degeneracy test: N_eff = (Σ_i (w̃_t^{(i)})²)^{-1} ∈ [1, N]; test whether N_eff < N_0.
- (2) Resampling: if N_eff < N_0, x_{0:t}^{(i)} → x̃_{0:t}^{(i)}, with P(x̃_{0:t}^{(i)} = x_{0:t}^{(l)}) = w̃_t^{(l)}.
- (3) Replace w̃_t^{(i)} by N^{-1}.
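
A minimal sketch of this test-and-resample step with multinomial resampling; the threshold N_0 = N/2 is a common but illustrative choice:

```python
import numpy as np

def maybe_resample(particles, w_tilde, rng, n0=None):
    """Multinomial resampling triggered by the effective sample size."""
    N = len(w_tilde)
    n0 = N / 2 if n0 is None else n0            # illustrative threshold N_0
    n_eff = 1.0 / np.sum(w_tilde**2)            # (1) degeneracy test
    if n_eff < n0:
        idx = rng.choice(N, size=N, p=w_tilde)  # (2) draw indices ~ weights
        particles = particles[idx]
        w_tilde = np.full(N, 1.0 / N)           # (3) reset weights to 1/N
    return particles, w_tilde
```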
Summary (bootstrap)

Initialization: sample x_0^{(i)} ~ p(x_0), i = 1, ..., N.
For t = 1, 2, ...:
- (1) sample x_t^{(i)} ~ p(x_t|x_{t-1}^{(i)}), i = 1, ..., N
- (2) define x_{0:t}^{(i)} = (x_{0:t-1}^{(i)}, x_t^{(i)}), i = 1, ..., N
- (3) calculate the weights w_t^{(i)} = p(y_t|x_t^{(i)}) w̃_{t-1}^{(i)}, then w̃_t^{(i)} = w_t^{(i)} (Σ_i w_t^{(i)})^{-1}, i = 1, ..., N
- (4) resample: (x_t^{(i)})_{i=1,...,N} → (x̃_t^{(i)})_{i=1,...,N}
- (5) go to (1).

A code sketch of this loop is given below.
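
A minimal sketch of the loop for a generic scalar model; sample_x0, sample_transition and lik are hypothetical callables standing for p(x_0), p(x_t|x_{t-1}) and p(y_t|x_t), and resampling is done at every step for simplicity (rather than only when N_eff < N_0):

```python
import numpy as np

def bootstrap_filter(ys, sample_x0, sample_transition, lik, N, rng):
    """Generic bootstrap particle filter returning posterior-mean estimates.
    sample_x0(N): draws N particles from p(x_0);
    sample_transition(x, t): draws x_t ~ p(x_t | x_{t-1}) particle-wise;
    lik(y, x): evaluates p(y_t | x_t) for each particle."""
    x = sample_x0(N)                          # initialization
    w = np.full(N, 1.0 / N)
    estimates = []
    for t, y in enumerate(ys, start=1):
        x = sample_transition(x, t)           # (1) propagate particles
        w = w * lik(y, x)                     # (3) reweight by the likelihood
        w = w / w.sum()
        estimates.append(np.sum(w * x))       # approximates E[x_t | y_{1:t}]
        idx = rng.choice(N, size=N, p=w)      # (4) resample
        x, w = x[idx], np.full(N, 1.0 / N)
    return np.array(estimates)
```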
Improved particle filtering

Bootstrap proceeds from

$$\hat{p}(x_{t-1}|x_{t-2}, y_{0:t-1}) = \sum_{i=1}^{N} \tilde{w}_{t-1}^{(i)}\, \delta_{x_{t-1}^{(i)}}$$

by sampling x_t + weighting + resampling x_t, to get

$$\hat{p}(x_t|x_{t-1}, y_{0:t}) = \sum_{i=1}^{N} \tilde{w}_t^{(i)}\, \delta_{x_t^{(i)}}.$$

Bootstrap belongs to the family of sampling importance resampling (SIR) filters. SIR filters differ by their sampling functions (optimally p(x_t|x_{t-1}, y_{0:t}); p(x_t|x_{t-1}) for the bootstrap).
Problem: the sampled particles are often poorly coherent with the observations y_t.
Solution: resample the particles x_{t-1}^{(i)} according to y_t before sampling x_t → auxiliary particle filters (APF).
Derivation of the APF (I)

$$\begin{aligned}
p(x_t|y_{0:t}) &= \frac{\int p(x_t, x_{t-1}, y_{0:t})\, dx_{t-1}}{p(y_t|y_{0:t-1})\, p(y_{0:t-1})} \\
&= \frac{p(y_t|x_t)}{p(y_t|y_{0:t-1})} \int p(x_t|x_{t-1})\, p(x_{t-1}|y_{0:t-1})\, dx_{t-1} \\
&\approx \sum_{i=1}^{N} \tilde{w}_{t-1}^{(i)}\, \frac{p(y_t|x_t)\, p(x_t|x_{t-1}^{(i)})}{p(y_t|y_{0:t-1})} \\
&\approx \sum_{i=1}^{N} \frac{\tilde{w}_{t-1}^{(i)}\, p(y_t|x_{t-1}^{(i)})}{\sum_{j=1}^{N} \tilde{w}_{t-1}^{(j)}\, p(y_t|x_{t-1}^{(j)})}\; p(x_t|x_{t-1}^{(i)}, y_t)
\end{aligned} \tag{8}$$

since p(y_t|x_t) p(x_t|x_{t-1}^{(i)}) = p(x_t, y_t|x_{t-1}^{(i)}) = p(y_t|x_{t-1}^{(i)}) p(x_t|x_{t-1}^{(i)}, y_t) and p(y_t|y_{0:t-1}) = ∫ p(y_t|x_{t-1}) p(x_{t-1}|y_{0:t-1}) dx_{t-1} ≈ Σ_{j=1}^{N} w̃_{t-1}^{(j)} p(y_t|x_{t-1}^{(j)}).
Derivation of the APF (II)

From p̂(x_{t-1}|x_{t-2}, y_{0:t-1}) = Σ_{i=1}^{N} w̃_{t-1}^{(i)} δ_{x_{t-1}^{(i)}}:
- weighting: w̃_t^{(i)} ∝ p(y_t|x_{t-1}^{(i)}) w̃_{t-1}^{(i)}
- resampling: x̃_{t-1}^{(i)} ~ Σ_{l=1}^{N} w̃_t^{(l)} δ_{x_{t-1}^{(l)}}
- sampling: x_t^{(i)} ~ p(x_t|x̃_{t-1}^{(i)}, y_t) (or something simpler, e.g. p(x_t|x̃_{t-1}^{(i)}))

→ p̂(x_t|x_{t-1}, y_{0:t}) = Σ_{i=1}^{N} w̃_t^{(i)} δ_{x_t^{(i)}}, the weights being normalized by Σ_{j=1}^{N} w̃_{t-1}^{(j)} p(y_t|x_{t-1}^{(j)}),

since to sample from Σ_i a_i p_i(x): (i) sample an index j from Σ_i a_i δ_i, (ii) sample x from p_j(x).
Note that y_t is involved in the resampling of x_{t-1}^{(i)}.
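
A minimal sketch of one APF step, assuming p(y_t|x_{t-1}) can be evaluated or approximated (for instance by p(y_t|μ_t^{(i)}) with μ_t^{(i)} a point prediction of x_t); lik_pred and sample_post are hypothetical callables. In this fully adapted case the updated weights come out uniform:

```python
import numpy as np

def apf_step(x_prev, w_prev, y, lik_pred, sample_post, rng):
    """One auxiliary particle filter step.
    lik_pred(y, x_prev): p(y_t | x_{t-1}) (or an approximation of it);
    sample_post(x_prev, y): draws x_t ~ p(x_t | x_{t-1}, y_t)."""
    lam = w_prev * lik_pred(y, x_prev)      # first-stage (auxiliary) weights
    lam = lam / lam.sum()
    N = len(x_prev)
    idx = rng.choice(N, size=N, p=lam)      # resample x_{t-1} based on y_t
    x_new = sample_post(x_prev[idx], y)     # propagate the selected particles
    w_new = np.full(N, 1.0 / N)
    return x_new, w_new
```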
Example

Non-linear model:

$$\begin{cases} x_{t+1} = 0.5\, x_t + 25\, x_t (1 + x_t^2)^{-1} + 8 \cos(1.2\, t) + v_t, \\ y_t = 0.05\, x_t^2 + n_t, \end{cases} \tag{9}$$

with v_t ~ N(0, 10), n_t ~ N(0, 1), t = 1:100, and N = 100 particles.
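
Plugging model (9) into the bootstrap_filter sketch given after the bootstrap summary (the initial law p(x_0) is not specified in the slides; N(0, 10) is an illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 100, 100

def drift(x, t):
    return 0.5 * x + 25 * x / (1 + x**2) + 8 * np.cos(1.2 * t)

# Simulate model (9)
x_true, y = np.zeros(T + 1), np.zeros(T + 1)
for t in range(1, T + 1):
    x_true[t] = drift(x_true[t - 1], t) + rng.normal(0, np.sqrt(10))
    y[t] = 0.05 * x_true[t]**2 + rng.normal(0, 1)

# Run the bootstrap filter on y_1, ..., y_T
x_hat = bootstrap_filter(
    ys=y[1:],
    sample_x0=lambda n: rng.normal(0, np.sqrt(10), size=n),
    sample_transition=lambda x, t: drift(x, t)
                                   + rng.normal(0, np.sqrt(10), size=x.size),
    lik=lambda yt, x: np.exp(-0.5 * (yt - 0.05 * x**2)**2),  # n_t ~ N(0, 1)
    N=N, rng=rng)
```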
Example

[Figure: 1) observation (black) and state (red); 2) particle spread (green) and estimated state (red).]
Smoothing

$$\begin{aligned}
p(x_{0:t}|y_{1:t}) &= p(x_t|y_{1:t}) \prod_{k=0}^{t-1} p(x_k|x_{k+1:t}, y_{1:t}) \\
&= p(x_t|y_{1:t}) \prod_{k=0}^{t-1} p(x_k|x_{k+1}, y_{1:k}) \tag{10}
\end{aligned}$$

$$p(x_k|x_{k+1}, y_{1:k}) \propto p(x_{k+1}|x_k)\, p(x_k|y_{1:k}).$$

As P̂(dx_u|y_{1:u}) = Σ_{i=1}^{N} w_u^{(i)} δ_{x_u^{(i)}}(dx_u) for u = 1, ..., t−1, p(x_u|x_{u+1}, y_{1:u}) is approximated by

$$\hat{P}(dx_u|x_{u+1}, y_{1:u}) = \sum_{i=1}^{N} \frac{p(x_{u+1}|x_u^{(i)})\, w_u^{(i)}}{\sum_{j=1}^{N} p(x_{u+1}|x_u^{(j)})\, w_u^{(j)}}\; \delta_{x_u^{(i)}}(dx_u) = \sum_{i=1}^{N} \rho_u^{(i)}\, \delta_{x_u^{(i)}}(dx_u). \tag{11}$$
Smoothing: summary

1. Initialization: set x̃_t^{(i)} = x_t^{(i)}, i = 1, ..., N.
2. Iterations: for u = t−1, t−2, ..., k:
   - calculate ρ_u^{(i)} ∝ p(x̃_{u+1}^{(i)}|x_u^{(i)}) w_u^{(i)}, i = 1, ..., N
   - draw x̃_u^{(i)} ~ Σ_{i=1}^{N} ρ_u^{(i)} δ_{x_u^{(i)}}(dx_u)
3. Set p̂(dx_k|x̃_{k+1}^{(i)}, y_{1:t}) = Σ_{i=1}^{N} ρ_k^{(i)} δ_{x̃_k^{(i)}}(dx_k).

A code sketch of this backward pass is given below.
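
A minimal sketch of the backward pass for a scalar state, assuming the filtering particles and weights were stored at every step; trans_pdf is a hypothetical helper evaluating p(x_{u+1}|x_u):

```python
import numpy as np

def backward_sample_path(particles, weights, trans_pdf, rng):
    """Draw one smoothed trajectory from stored filtering particles.
    particles[u], weights[u]: arrays of x_u^(i) and w_u^(i), u = 0, ..., t;
    trans_pdf(x_next, x, u): evaluates p(x_{u+1} | x_u) particle-wise."""
    t = len(particles) - 1
    # initialization: draw the final state from the filtering distribution
    path = [rng.choice(particles[t], p=weights[t])]
    for u in range(t - 1, -1, -1):            # backward iterations
        rho = weights[u] * trans_pdf(path[-1], particles[u], u)
        rho = rho / rho.sum()                 # rho_u^(i) of equation (11)
        path.append(rng.choice(particles[u], p=rho))
    return np.array(path[::-1])               # reordered as x_0, ..., x_t
```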
Example

Example of the previous section, with v_t ~ N(0, 10) and n_t ~ N(0, 1).

[Figure: 1) observation (green) and state (red); 2) particle spread (green) and estimated state (red).]
