A Novel Robust Student T-Based Kalman Filter

A Novel Robust Student’s t-Based Kalman Filter
Correspondence
A novel robust Student’s t-based Kalman filter is proposed by
using the variational Bayesian approach, which provides a Gaussian
approximation to the posterior distribution. Simulation results for
a manoeuvring target tracking example illustrate that the proposed
filter has smaller root mean square error and bias than existing filters.
I. INTRODUCTION
The Kalman filter has been successfully applied in nu-
merous practical applications, such as target tracking, navi-
gation, positioning, control and signal processing due to its
optimality, ease of implementation, and low computational
complexity. The Kalman filter is optimal in terms of min-
imum mean square error (MMSE) for a linear state-space
model with Gaussian process and measurement noises [1],
[2]. However, in some engineering applications, such as
tracking agile targets with measurement outliers from unre-
liable sensors, heavy-tailed nonGuassian process and mea-
surement noises are present [3]–[5]. The performance of
the conventional Kalman filter may break down in such en-
gineering applications with heavy-tailed nonGaussian pro-
cess and measurement noises. A large number of robust
filters have been proposed to solve the filtering problem of
linear systems with heavy-tailed measurement noises, such
as the Student’s t mixture filter [6], outlier-robust Kalman
filter [7], and Student’s t- and variational Bayesian (VB)-
based robust Kalman filters [8]–[11]. However, these ro-
bust filters may fail in the case of nonGaussian heavy-tailed
process noise since they all assume Gaussian process noise
[3], [12].
To solve effectively and robustly the filtering prob-
lem of linear systems with both heavy-tailed process
and measurement noises, a Huber-based Kalman filter
(HKF) has been proposed by minimizing a combined l1
and l2 norm [13]–[17]. The residual is bounded by us-
Manuscript received May 5, 2016; revised September 22, 2016; released

for publication October 10, 2016. Date of publication January 11, 2017;
date of current version June 7, 2017.
DOI. No. 10.1109/TAES.2017.2651684
Refereeing of this contribution was handled by F. Govaers.
This work was supported in part by the National Natural Science Foun-
dation of China under Grant 61371173 and 61633008 and in part by
the Natural Science Foundation of Heilongjiang Province under Grant
F2016008, and in part by the Engineering and Physical Sciences Research
Council of the U.K. under Grant EP/K014307/1.
Authors’ addresses: Y. Huang, Y. Zhang, N. Li, Department of Automa-
tion, Harbin Engineering University, 145 Nantong St., Nangang District,
Harbin, 150001 China; Z. Wu, School of Electrical Engineering and
Automation, Harbin Institute of Technology, Harbin, 150080, China; J.
Chambers, School of Electrical and Electronic Engineering, Newcastle
University, Newcastle upon Tyne, NE1 7RU, UK. (Corresponding author:
Y. G. Zhang.)
0018-9251
C 2017 IEEE
IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 53, NO. 3 JUNE 2017 1545
ing a Huber function, and the HKF can be deemed as a II. PROBLEM FORMULATION
generalized maximum likelihood estimator [13]–[17]. A. Review of the Classical Kalman Filter
However, the influence function of the HKF doesn’t
redescend, which results in limited estimation accu- Consider the following discrete-time linear stochastic
racy. To cope with heavy-tailed nonGaussian noise in- system as shown by the state-space model
duced by large outliers, many maximum correntropy
xk = Fk−1 xk−1 + wk−1 (process equation) (1)
criterion based Kalman filters (MCCKFs) have been
proposed by maximizing the correntropy of the pre- zk = Hk xk + vk (measurement equation) (2)
dicted error and residual, and can be deemed as max- where k is the discrete time index, xk ∈ Rn is the state
imum a posterior estimators [18]–[22]. However, there vector, zk ∈ Rm is the measurement vector, Fk ∈ Rn×n is
is lack of theoretical basis to develop the estimation the state transition matrix, Hk ∈ Rm×n is the measurement
error covariance matrix, which degrades the estimation matrix, wk ∈ Rn is the heavy-tailed process noise vector
accuracy. with zero mean and nominal covariance matrix Qk , and
A reasonable approach to improve the estimation ac- vk ∈ Rm is the heavy-tailed measurement noise vector with
curacy is better modeling the heavy-tailed nonGaussian zero mean and nominal covariance matrix Rk . Both Qk
process and measurement noises. A linear Student’s t- and Rk are generally not accurate due to the existence of
based filter has been derived by assuming that both the outliers. The initial state vector x0 is assumed to have a
process and measurement noises are Student’s t distribu- Gaussian distribution with mean vector x̂0|0 and covariance
tions and the posterior probability density function (PDF) matrix P0|0 , i.e.,
is approximated as a Student’s t distribution to obtain a
closed form solution for the filtering problem [3], [4]. How- p(x0 ) = N(x0 ; x̂0|0 , P0|0 ) (3)
ever, this Student’s t filter (STF) suffers from the following
where N(x; μ, ) denotes the Gaussian PDF with mean
problems.
vector μ and covariance matrix . Moreover, x0 , wk , and
1) The derivation of the STF is based on an assumption vk are assumed to be mutually uncorrelated in this paper.
that the Student’s t PDFs of process and measurement The Kalman filter is the most common method to infer
noises have the same degrees of freedom (dof) pa- the state vector xk given the state-space model and measure-
rameter, which is seldom met in practical application ments until time k. The recursive Kalman filter consists of
because the process and measurement noises have dif- a time update and a measurement update, which are given
ferent heavy-tailed characteristics, as shown in [3]. as follows:
2) The growth of the dof parameter must be prevented Time update:
to preserve the heavy-tailed properties and closed Stu-
dent’s t-distributed form of the posterior PDF, which x̂k|k−1 = Fk−1 x̂k−1|k−1 (4)
increases the bias of the state estimation, as discussed Pk|k−1 = Fk−1 Pk−1|k−1 FTk−1 + Qk−1 (5)
in [4].
Measurement update:
3) The Student’s t approximation to the posterior PDF
may be unreasonable, as will be illustrated in Kk = Pk|k−1 HTk (Hk Pk|k−1 HTk + Rk )−1 (6)
Section II-C. x̂k|k = x̂k|k−1 + Kk (zk − Hk x̂k|k−1 ) (7)
In this paper, a new robust Student’s t-based Kalman Pk|k = Pk|k−1 − Kk Hk Pk|k−1 (8)
filter is proposed to improve the estimation accuracy of the
Student’s t-based filter with both heavy-tailed process and where (·)T denotes the transpose operation, x̂k|k−1 and Pk|k−1
measurement noises. First, to better model the linear sys- are the predicted state vector and corresponding predicted
tems with heavy-tailed process and measurement noises, error covariance matrix, respectively, x̂k|k and Pk|k are the
the one-step predicted PDF and likelihood PDF are ap- state estimation vector and corresponding estimation error
proximated as Student’s t distributions with different dof covariance matrix, respectively, and Kk is the Kalman gain.
parameters, and the PDF of the unknown scale matrix of The Kalman filter is an MMSE estimator for a linear
the one-step predicted PDF is modeled as an inverse Wishart state-space model with Gaussian process and measurement
distribution. Second, motivated by the fact that the posterior noises. However, it is suboptimal for heavy-tailed nonGaus-
PDF approaches to Gaussian in the case of moderate con- sian process and measurement noises since the required
taminated process and measurement noises, the posterior Gaussian assumptions are violated. Moreover, the Kalman
PDF is approximated as Gaussian, and an unknown scale filter may break down because a single outlier from any of
matrix and auxiliary random variables are inferred based the process and measurement noises can result in the fil-
on the hierarchical Gaussian state-space model and a VB ter’s bias exceeding target bounds [17]. The STF discussed
approach. Finally, the proposed robust Kalman filter and in the next section is intended to solve this problem.
existing robust filters are applied to the problem of tracking
an agile target that is observed in clutter. Simulation results B. Review of STF
show the proposed filter has smaller root mean square error The key idea of the STF is to approximate the posterior
(RMSE) and bias than existing robust filters. PDF by a Student’s t PDF. To that end, the jointly predicted
1546 IEEE TRANSACTIONS ON AEROSPACE AND ELECTRONIC SYSTEMS VOL. 53, NO. 3 JUNE 2017
PDF p(xk , zk |z1:k−1 ) of state and measurement are assumed
to be a Student’s t PDF as follows [4]:
p(xk , zk |z1:k−1 ) =

xk x̂k|k−1 Pk|k−1 Pk|k−1 HTk
St ; , , η (9)
zk Hk x̂k|k−1 Hk Pk|k−1 Sk
where x̂k|k−1 and Pk|k−1 are defined in (4)–(5), and the scale
matrix Sk = Hk Pk|k−1 HTk + Rk , and η is the dof parameter
of the jointly predicted PDF p(xk , zk |z1:k−1 ). According to
the Bayesian rule and using (9), the posterior PDF p(xk |z1:k )
can be updated as a Student’s t distribution, i.e., [3]

p(xk |z1:k ) = St(xk ; x̂k|k , Pk|k , η ) (10) Fig. 1. True posterior distribution, Gaussian approximation, and
Student’s t approximation of the 1-D numerical example.
where η is the dof parameter of the posterior PDF
p(xk |z1:k ), and the updated parameters are given by [3]

2) In the STF, a moment matching approach is employed
η =η+m (11) to preserve the heavy-tailed properties and closed Stu-

x̂k|k = x̂k|k−1 + Pk|k−1 HTk S−1
k (zk − Hk x̂k|k−1 ) (12) dent’s t-distributed form of the posterior PDF, as shown
in (15). However, only the first two moments of the true
η+ 2k
Pk|k = Pk|k−1 − Pk|k−1 HTk S−1
k Hk Pk|k−1 (13) posterior PDF are captured, and the higher order mo-
η+m ments are lost. Thus, the STF may suffer from a signifi-

cant bias since the moment matching approach imposes
k = (zk − Hk x̂k|k−1 )T S−1
k (zk − Hk x̂k|k−1 ) (14) a further bias on the state estimation [4].
where x̂k|k−1 and Pk|k−1 are formulated in (4) and (5). 3) The Gaussian approximation to the posterior PDF is
It is seen from (11) that the dof parameter of the poste- more reasonable than the Student’s t approximation with
rior PDF increases by m from time k − 1 to time k. Thus, a fixed dof parameter in [3] for the case of moderate
the Student’s t approximation to the posterior PDF lacks contaminated process and measurement noises. In or-
strict closeness to the target distribution. Furthermore, the der to illustrate this problem, a one-dimensional (1-D)
STF will degrade to a Kalman filter after a few steps with numerical example is shown. In this example, the state
the increase of dof parameter, and the heavy-tailed proper- transition parameter Fk = 0.5, measurement parameter
ties will be lost. To retain the heavy-tailed properties and Hk = 1, and outlier corrupted process and measurement
closed Student’s t-distributed form of the posterior PDF, noises are generated according to [12]
the moment matching approach has been used to obtain
N(0, 1) w.p. 0.95
an approximate posterior PDF St(xk ; x̂k|k , Pk|k , η) with dof wk ∼ (16)
parameter η, where the mean vector x̂k|k and scale matrix N(0, 100) w.p. 0.05

Pk|k are obtained by matching the first two moments, i.e., N(0, 1) w.p. 0.90
[3], [4] vk ∼ (17)
N(0, 100) w.p. 0.10

η η
x̂k|k = x̂k|k Pk|k = Pk|k . (15) where w.p. denotes “with probability”. Definitions (16),
η−2 η −2
(17) imply that wk and vk are most frequently drawn
The STF exhibits good robustness to heavy-tailed pro- from a Gaussian distribution with variances 1 and 5%
cess and measurement noises, and has almost identical com- of process noise values and 10% of measurement noise
putational complexity to the Kalman filter. values are generated from Gaussian distributions with
variances 100. Process and measurement noises, which
C. Motivation of This Work are generated according to (16) and (17), have heavier
Although the STF can effectively resist the influence tails. In this example, the true posterior distribution is
of heavy-tailed process and measurement noises, it suffers approximated by using the particle filter with 10000 par-
from the following drawbacks. ticles [23]. The Gaussian approximation and Student’s
t approximation to the posterior PDF are obtained from
1) In the derivation of the STF, it is necessary to assume the true posterior distribution with identical first two
that the Student’s t PDFs of process and measurement moments. As suggested in [3], the dof parameter of the
noises have the same dof parameter. However, this as- Student’s t approximation is set as η = 3. Fig. 1 shows
sumption is seldom met in practical application because the true posterior distribution, the Gaussian approxi-
the process and measurement noises normally have dif- mation and the Student’s t approximation at time step
ferent heavy-tailed characteristics, which will degrade k = 100. Fig. 2 shows the Kullback–Leibler divergence
the approximation accuracy to the posterior PDF [3]. (KLD) between the true posterior PDF and Gaussian
CORRESPONDENCE 1547
to the Bayesian theorem and using (1), p(xk |z1:k−1 ) can be
computed as

p(xk |z1:k−1 ) = p(xk |xk−1 )p(xk−1 |z1:k−1 )dxk−1

= pw (xk − Fk−1 xk−1 )p(xk−1 |z1:k−1 )dxk−1

(21)
where pw (·) is the PDF of the process noise.
Considering that the process noise vector wk has zero
mean and nominal covariance matrix Qk and using (21),
the mean vector μk and nominal covariance matrix k|k−1
can be computed as
Fig. 2. KLD between the true posterior PDF and Gaussian and
Student’s t approximations of the 1-D numerical example. μk = Fk−1 x̂k−1|k−1 = x̂k|k−1 (22)
k|k−1 = Fk−1 Pk−1|k−1 Fk−1 + Qk−1 = Pk|k−1 . (23)
T
and Student’s t approximations from time step k = 1 to
k = 100. We can see from Figs. 1 and 2 that the Gaussian One may choose the nominal covariance matrix Pk|k−1
approximation can match the true posterior PDF better as the scale matrix k of the one-step predicted PDF
than the Student’s t approximation. Thus, the STF may p(xk |z1:k−1 ). However, Qk−1 is not accurate and the large
exhibit unacceptable performance in the case of moder- process uncertainty induced by heavy-tailed process noise
ate contaminated process and measurement noises if the will be introduced into the scale matrix k , and accurate
dof parameter is poorly chosen. k will be unknown, which will degrade the estimation
performance.
The problem discussed earlier represents the main mo-
To solve this problem, the unknown scale matrix k
tivation of this work. In this paper, to solve these problems,
will be adaptively estimated based on the VB approach.
a new robust Student’s t-based Kalman filter will be de-
To this end, a conjugate prior distribution needs to be
rived, where the one-step predicted PDF and the likelihood
first selected for the unknown scale matrix k since the
PDF are modeled as Student’s t PDFs with different dof
conjugacy can guarantee that the posterior distribution is
parameters, and the posterior PDF is updated as a Gaussian
of the same functional form as the prior distribution. In
distribution.
Bayesian statistics, the inverse Wishart distribution is usu-
III. ROBUST STUDENT’S t-BASED KALMAN FILTER ally used as the conjugate prior for the covariance ma-
trix of a Gaussian distribution with a known mean [24].
A. Student’s t-Based Novel Hierarchical Gaussian State- The inverse Wishart PDF of a symmetric positive definite
Space Model random matrix B of dimension d × d is formulated as
||ρ/2 |B|−(ρ+d+1)/2 exp{−0.5tr(B−1 )}
In order to derive an approximate closed form solution IW(B; ρ, ) = 2dρ/2 d (ρ/2)
, where ρ is
of the posterior PDF for a linear state-space model with the dof parameter, and is the inverse scale matrix that is a
heavy-tailed process and measurement noises, the one-step symmetric positive definite matrix of dimension d × d, and
predicted PDF p(xk |z1:k−1 ) and the likelihood PDF p(zk |xk ) | · | and tr(·) denote the determinant and trace operations, re-
need to be modeled. First, the heavy-tailed measurement spectively, and d (·) is the d-variate gamma function [24].
noise is modeled as a Student’s t distribution as follows: If B ∼ IW(B; ρ, ), then E[B−1 ] = (ρ − d − 1) −1 when
p(vk ) = St(vk ; 0, Rk , ν) (18) ρ > d + 1 [24]. Since the scale matrix of a Student’s t distri-
bution is proportional to the covariance matrix of a Gaussian
where St(vk ; 0, Rk , ν) denotes the Student’s t PDF of the distribution in the hierarchical Gaussian form, as shown in
measurement noise with mean vector 0, scale matrix Rk (30), an inverse Wishart distribution is chosen as a prior
(the nominal covariance matrix of measurement noise), and distribution for the unknown scale matrix k as follows:
dof parameter ν. Using (2) and (18), the likelihood PDF
p(zk |xk ) can be formulated as p(k ) = IW(k ; uk , Uk ) (24)
p(zk |xk ) = St(zk ; Hk xk , Rk , ν). (19) where IW(k ; uk , Uk ) denotes the inverse Wishart PDF of
k with dof parameter uk and inverse scale matrix Uk .
Second, to model the heavy-tailed process noise, the To capture the prior information of k , the mean value of
one-step predicted PDF p(xk |z1:k−1 ) is approximated as a p(k ) is set as the nominal covariance matrix Pk|k−1 , i.e.,
Student’s t distribution as follows:
Uk
p(xk |z1:k−1 ) = St(xk ; μk , k , ω) (20) = Pk|k−1 . (25)
uk − n − 1
where St(xk ; x̂k|k−1 , k , ω) denotes the Student’s t PDF of Let
the state vector xk given measurements z1:k−1 with mean
vector μk , scale matrix k , and dof parameter ω. According uk = n + τ + 1 (26)
where τ ≥ 0 is a tuning parameter. Using (26) in (25) PDFs can be obtained by minimizing the KLD between the
results in approximate posterior PDF q(xk )q(ξk )q(k )q(λk ) and true
posterior PDF p(xk , ξk , k , λk |z1:k ), i.e., [25], [26]
Uk = τ Pk|k−1 . (27)
{q(xk ), q(ξk ), q(k ), q(λk )} = arg min KLD(q(xk )q(ξk )
A new Student’s t-based state-space model thereby con-
sists of (19), (20), (24), and (26), (27). However, the closed × q(k )q(λk )||p(xk , ξk , k , λk |z1:k )) (33)
solution of the posterior PDF is unavailable because the q(x)
where KLD(q(x)||p(x)) q(x) log p(x) dx is the KLD.
Student’s t PDF is not strictly closed. To solve this prob- The optimal solution for (33) satisfies the following
lem, the Student’s t-based state-space model needs to be equation [12]:
transformed into a hierarchical Gaussian state-space model
by introducing two auxiliary random variables. log q(φ) = E(−φ) [log p(, z1:k )] + cφ (34)
Since the Student’s t PDF can be viewed as an infi- {xk , ξk , k , λk } (35)
nite mixture of Gaussian PDFs, the one-step predicted PDF
p(xk |z1:k−1 ) and the likelihood PDF p(zk |xk ) can be rewrit- where φ is an arbitrary element of , and (−φ) is the set
ten as follows [12]: of all elements in except for φ, and E[·] denotes the sta-

ω ω tistical expectation operation, and cφ denotes the constant
p(xk |z1:k−1 ) = N(xk ; x̂k|k−1 , k /ξk )G ξk ; , dξk with respect to variable φ. Since the variational parameters
2 2 of q(xk ), q(ξk ), q(k ), and q(λk ) are coupled, we need to
(28) utilize fixed-point iterations to solve (34), where only one

ν ν factor in (32) is updated while keeping other factors fixed
p(zk |xk ) = N(zk ; Hk xk , Rk /λk )G λk ; , dλk
2 2 [12], [25], [26].
(29) 1) Computations of Approximate Posterior PDFs Using
the conditional independence properties of the hierarchical
where G(·; α, β) denotes the Gamma PDF with shape pa- Gaussian state-space model in (22)–(24), (26), (27) and
rameter α and rate parameter β, and ξk and λk are auxiliary (30), (31), the joint PDF p(, z1:k ) can be factored as
random variables. According to (28) and (29), the one-step
predicted PDF p(xk |z1:k−1 ) and the likelihood PDF p(zk |xk ) p(, z1:k ) = p(zk |xk , λk )p(xk |ξk , z1:k−1 )p(ξk )p(k )p(λk ).
can be rewritten in the following hierarchical Gaussian (36)
forms Substituting (24) and (30), (31) in (36) results in
p(xk |ξk , z1:k−1 ) = N(xk ; x̂k|k−1 , k /ξk ); p(, z1:k ) = N(zk ; Hk xk , Rk /λk )N(xk ; x̂k|k−1 , k /ξk )
ω ω ω ω ν ν
p(ξk ) = G ξk ; , (30) × G ξk ; , IW(k ; uk , Uk )G λk ; , . (37)
2 2 2 2 2 2
p(zk |xk , λk ) = N(zk ; Hk xk , Rk /λk ); Using (37), log p(, z1:k ) can be formulated as
ν ν
p(λk ) = G λk ; , (31)
2 2 m+ν ν
log p(, z1:k ) = − 1 log λk − λk
where p(k ) is given by (24), (26), and (27). 2 2

Equations (22)–(24), (26), (27) and (30), (31) consti- λk T −1 n+ω
− (zk − Hk xk ) Rk (zk − Hk xk ) + −1
tute a Student’s t-based hierarchical Gaussian state-space 2 2
model. The problem of state estimation for a linear state ω ξk
space model with heavy-tailed process and measurement × log ξk − ξk − (xk − x̂k|k−1 )T k−1 (xk − x̂k|k−1 )
2 2
noises is transformed into the problem of state estimation 1 1
for a Student’s t-based hierarchical Gaussian state-space − tr(Uk k−1 ) − (n + uk + 2) log |k |. (38)
2 2
model.
Let φ = ξk and using (38) in (34), we obtain
B. New Robust Student’s t-Based Kalman Filter
n+ω
To estimate the state xk of a hierarchical Gaussian log q (i+1)
(ξk ) = − 1 log ξk
2
state-space model formulated in (22)–(24), (26), (27) and
−1
(30), (31), we need to compute the joint posterior PDF − 0.5 ω + tr(D(i) k E (i)
[ k ]) ξk + cξ (39)
p(xk , ξk , k , λk |z1:k ). For the hierarchical Gaussian state-
space model, there is not an analytical solution for this where q (i+1) (·) is the approximation of PDF q(·) at the
posterior PDF. Thus, to obtain an approximate solution, (i + 1)th iteration, and E(i) [ρ] is the expectation of variable
the VB approach is used to look for a free form factored ρ at the ith iteration, and tr(·) denotes the trace operation
approximate PDF for p(xk , ξk , k , λk |z1:k ), i.e., [25], [26] and D(i)
k is given by
p(xk , ξk , k , λk |z1:k ) ≈ q(xk )q(ξk )q(k )q(λk ) (32) D(i)

k = E [(xk − x̂k|k−1 )(xk − x̂k|k−1 ) ].
(i) T
(40)
where q(·) is the approximate posterior PDF. According Employing (39), q (i+1) (ξk ) can be updated as a Gamma
to the standard VB approach, these approximate posterior PDF with shape parameter αki+1 and rate parameter
CORRESPONDENCE 1549
βki+1 , i.e., where the modified measurement noise covariance ma-
trix R̃(i+1)
k and modified predicted error covariance matrix
q (i+1) (ξk ) = G ξk ; αki+1 , βki+1 (41) P̃k|k−1 are given by
where shape parameter αki+1 and rate parameter βki+1 are −1
given by Rk E(i+1) [k−1 ]
R̃(i+1)
k = (i+1) P̃(i+1)
k|k−1 = . (56)
αki+1 = 0.5(n + ω) (42) E [λk ] E(i+1) [ξk ]

−1
βki+1 = 0.5 ω + tr D(i)
k E(i)
[ k ] . (43) Using (54)–(56) in (53), we obtain
Let φ = λk and using (38) in (34), we obtain
1
m+ν q (i+1) (xk ) = p (i+1) (zk |xk )p (i+1) (xk |z1:k−1 ) (57)
log q (i+1) (λk ) = − 1 log λk ck(i+1)
2

(i) −1
− 0.5 ν + tr Ek Rk λ k + cλ (44) where the normalizing constant ck(i+1) is given by

where E(i)
k is given by
ck(i+1) = p (i+1) (zk |xk )p(i+1) (xk |z1:k−1 )dxk . (58)
E(i)
k = E [(zk − Hk xk )(zk − Hk xk ) ].
(i) T
(45)
Using (44), q (i+1) (λk ) can be updated as a Gamma PDF Employing (54)–(58), q (i+1) (xk ) can be updated as a
with shape parameter γki+1 and rate parameter δki+1 , i.e., Gaussian PDF with mean vector x̂k|k(i+1)
and covariance ma-
(i+1)
q (i+1) (λk ) = G λk ; γki+1 , δki+1 (46) trix Pk|k , i.e.,
where shape parameter γki+1 and rate parameter δki+1 are

given by q (i+1) (xk ) = N xk ; x̂(i+1)
k|k , P(i+1)
k|k (59)
γki+1 = 0.5(m + ν) (47)

where the mean vector x̂(i+1) (i+1)
and covariance matrix Pk|k
−1 k|k
δki+1 = 0.5 ν + tr E(i)
k Rk . (48) are given by
Let φ = k and using (38) in (34), we obtain
x̂(i+1)
k|k = x̂k|k−1 + K(i+1)
k (zk − Hk x̂k|k−1 ) (60)
1
log q (i+1) (k ) = − (n + uk + 2) log |k | P(i+1) = P̃(i+1) − K(i+1) Hk P̃(i+1) (61)
2 k|k k|k−1 k k|k−1
1
−1

− tr Uk + E (i+1)
[ξk ]D(i) + c . (49)
2 k k
where K(i+1)
k denotes the modified Kalman gain given by
Exploiting (49), q (i+1) (k ) can be updated as an inverse −1
Wishart PDF with dof parameter û(i+1) k and inverse scale Kk(i+1) = P̃k|k−1
(i+1) T (i+1) T
Hk Hk P̃k|k−1 Hk + R̃(i+1) . (62)
k
matrix Û(i+1)
k , i.e.,
q (i+1) (k ) = IW(k ; û(i+1) , Û(i+1) ) (50) After fixed point iteration N, the approximate posterior
k k
PDFs q(xk ), q(ξk ), q(k ), and q(λk ) can be updated as
where the dof parameter û(i+1) and inverse scale matrix
(i+1)
k
Ûk are given by (N)
q(xk ) ≈ q (N) (xk ) = N xk ; x̂k|k , P(N)
k|k
û(i+1)
k = uk + 1 (51) = N(xk ; x̂k|k , Pk|k ) (63)

Û(i+1) = Uk + E (i+1)
[ξk ]D(i)
k . (52) q(ξk ) ≈ q (ξk ) = G ξk ; αkN , βkN
(N)
(64)
k

Let φ = xk and using (38) in (34), we obtain q(k ) ≈ q (N) (k ) = IW k ; û(N) (N)
k , Ûk|k (65)

log q (xk ) = −0.5E
(i+1)
[λk ](zk − Hk xk ) R−1
(i+1)
k
T
q(λk ) ≈ q (N) (λk ) = G λk ; γkN , δkN . (66)
× (zk − Hk xk ) − 0.5E(i+1) [ξk ](xk − x̂k|k−1 )T E(i+1)
× [k−1 ](xk − x̂k|k−1 ) + cx . (53) 2) Computation of Expectations Using (41), (46), and
(50), we can compute the required expectations E(i+1) [ξk ],
Define the modified one-step predicted PDF p(i+1) E(i+1) [λk ], and E(i+1) [k−1 ] as follows:
(xk |z1:k−1 ) and the modified likelihood PDF p(i+1) (zk |xk )
as follows: αki+1 γki+1
E(i+1) [ξk ] = E(i+1) [k−1 ]
E(i+1) [λk ] =
p (i+1) (xk |z1:k−1 ) = N xk ; x̂k|k−1 , P̃(i+1)
k|k−1 (54) βki+1 δki+1
−1
p (i+1) (zk |xk ) = N zk ; Hk xk , R̃(i+1)
k (55) = û(i+1)
k − n − 1 Û(i+1)
k . (67)

Algorithm 1: One time step of the proposed robust E(i+1)
k = E(i+1) (zk − Hk xk )(zk − Hk xk )T
Kalman filter for linear state-space model with heavy-
tailed process and measurement noises. = E(i+1)
zk − Hk x̂(i+1) (i+1)
k|k + Hk x̂k|k − Hk xk
Inputs: x̂k−1|k−1 , Pk−1|k−1 , Fk−1 , Hk , zk , Qk−1 , Rk , T
(i+1) (i+1)
m, n, ω, ν, τ , N × zk − Hk x̂k|k + Hk x̂k|k − Hk xk
Time update: T
1. x̂k|k−1 = Fk−1 x̂k−1|k−1 = zk − Hk x̂(i+1) zk − Hk x̂(i+1)
k|k k|k
2. Pk|k−1 = Fk−1 Pk−1|k−1 FTk−1 + Qk−1 T
Measurement update: + Hk E(i+1) (i+1)
xk − x̂k|k (i+1)
xk − x̂k|k HTk
3. Initialization: uk = n + τ + 1, Uk = τ Pk|k−1 ,
x̂(0) (0) −1 T
k|k = x̂k|k−1 , Pk|k = Pk|k−1 , E [k ] =
(0)
= zk − Hk x̂(i+1) zk − Hk x̂(i+1) (i+1) T
+ Hk Pk|k Hk .
(uk − n − 1)U−1 k
k|k k|k
for i = 0 : N − 1 (69)
Update q (i+1) (ξk ) = G(ξk ; αki+1 , βki+1 ) given q (i) (xk )
The proposed robust Kalman filter consists of (4),
and q (i) (k ):
(5), (26), (27), (40)–(43), (45)–(48), (50)–(52), (56), and
4. D(i) (i) (i) (i)
k = Pk|k + (x̂k|k − x̂k|k−1 )(x̂k|k − x̂k|k−1 )
T
(59)–(69), and the implementation pseudocode for one
5. αk = 0.5(n + ω), βk = 0.5 ω +
i+1 i+1
time step of the proposed robust Kalman filter is shown in
−1
(i+1)
tr(D(i) (i)
k E [k ]) , E [ξk ] = αki+1 /βki+1 Algorithm 1.
Update q (i+1)
(λk ) = G(λk ; γki+1 , δki+1 ) given q (i) (xk ): Finally, we discuss the effect of the tuning parameter τ
6. Ek = (zk − Hk x̂(i)
(i) (i) T
) + Hk P(i) upon the proposed robust Kalman filter. Substituting (67) in
k|k )(zk − Hk x̂k|k
T
k|k Hk
(57), the modified predicted error covariance matrix P̃k|k−1
7. γk = 0.5(m + ν), δk = 0.5 ν +
i+1 i+1
−1
(i+1) can be reformulated as
tr(E(i) k Rk ) , E [λk ] = γki+1 /δki+1 −1 −1
Update q (i+1) (k ) = IW(k ; û(i+1) k , Û(i+1)
k ) given û(i+1) − n − 1 Û (i+1)
k k
q (i) (xk ) and q (i+1) (ξk ): P̃(i+1)
k|k−1 =
8. û(i+1)
k = uk + 1, Û(i+1) k = Uk + E(i+1) [ξk ]D(i) k , E(i+1) [ξk ]
E(i+1) [k−1 ] = (û(i+1) k − n − 1)( Û (i+1) −1
k ) Û(i+1)
Update q (i+1) (xk ) = N(xk ; x̂(i+1) (i+1) = k
. (70)
k|k , Pk|k ) given (û(i+1) − n − 1)E(i+1) [ξk ]
k
q (i+1) (ξk ), q (i+1) (λk ) and q (i+1) (k ):
9. R̃(i+1) = Rk /E(i+1) [λk ], Using (26), (27) and (51), (52) in (70) results in
k
−1 (i+1)
P̃k|k−1 = E(i+1) [k−1 ]
(i+1)
/E [ξk ] τ Pk|k−1 /E(i+1) [ξk ] + D(i)
(i+1) −1 P̃(i+1)
k|k−1 =
k
. (71)
10. K(i+1)k = P̃(i+1) T
k|k−1 k H (H P̃(i+1) T
k k|k−1 k + R̃k
H ) τ +1
(i+1) (i+1)
11. x̂k|k = x̂k|k−1 + Kk (zk − Hk x̂k|k−1 ) It is seen from (71) that τ can be deemed as a harmonic
12. P(i+1)k|k = P̃(i+1)
k|k−1 − Kk
(i+1)
Hk P̃(i+1)
k|k−1 weight to balance the efficacy of Pk|k−1 /E(i+1) [ξk ] and D(i)
k .
end for On the one hand, if τ is too large, the substantial prior
(N)
13. x̂k|k = x̂k|k , Pk|k = P(N) k|k uncertainties induced by the heavy-tailed process noise are
Outputs: x̂k|k and Pk|k introduced into the measurement update, which degrades
the performance of the proposed filter. On the other hand,
if τ is too small, a large quantity of information about the
process model is lost, which also degrades the performance
Exploiting (59), the required expectations D(i+1)
k and of the proposed filter. In this paper, the tuning parameter is
E(i+1)
k are computed as chosen to lie within the range τ ∈ [2, 6], and the proposed
filter with τ ∈ [2, 6] has essentially consistent estimation
D(i+1)
k = E(i+1) (xk − x̂k|k−1 )(xk − x̂k|k−1 )T performance, as shown in the later simulation.

= E(i+1) xk − x̂(i+1) (i+1)
k|k + x̂k|k − x̂k|k−1 IV. SIMULATION
T In this simulation, the superior performance of the pro-
× xk − x̂(i+1)
k|k + x̂(i+1)
k|k − x̂k|k−1 posed robust Kalman filter as compared with existing filters
T is illustrated in the problem of tracking an agile target that is
= E(i+1) (i+1)
xk − x̂k|k (i+1)
xk − x̂k|k observed in clutter. The target moves according to a constant
velocity model in 2-D space and its position is observed.
T The linear state space can be formulated as [3]
+ x̂(i+1)
k|k − x̂k|k−1 x̂(i+1)
k|k − x̂k|k−1
T xk = Fxk−1 + wk−1 (72)
= P(i+1) (i+1)
k|k + x̂k|k − x̂k|k−1 x̂(i+1)
k|k − x̂k|k−1 (68) zk = Hxk + vk (73)
CORRESPONDENCE 1551
where the state xk = [xk yk ẋk ẏk ]; xk , yk , ẋk , and ẏk denote
the cartesian coordinates and corresponding velocities; F
and H denote the state transition matrix and observation
matrix, respectively, which are given by

I2 tI2
F= H = I2 0 (74)
0 I2
where the parameter t = 1s is the sampling interval and

I2 is the 2-D identity matrix. Outlier corrupted process and
measurement noises are generated according to [12] Fig. 3. True and estimated trajectories of the target.

N(0, Q) w.p. 0.95
wk ∼ (75)
N(0, 100Q) w.p. 0.05

N(0, R) w.p. 0.90
vk ∼ (76)
N(0, 100R) w.p. 0.10
where Q and R are, respectively, nominal process and mea-

surement noise covariance matrices
⎡ ⎤ Fig. 4. RMSEs of the position from existing filters and the proposed
t 3 t 2 filters.
⎢ 3 I2 2
I2 ⎥
Q=⎢
⎣ t 2
⎥q
⎦ R = rI2 (77)
respectively defined as follows:
I2 tI2
2
RMSEpos =
"
where q = 1 and r = 100 m2 . Equations (75), (76) imply 1 #M s 2 2
xk − x̂ks + yks − ŷks (78)
that wk and vk are most frequently drawn from a Gaussian M s=1
distribution with covariance matrix Q or R and 5% of pro- ARMSEpos =

"
cess noise values and 10% of measurement noise values 1 #T #M s 2 2
are generated from Gaussian distributions with severely xk − x̂ks + yks − ŷks (79)
MT k=1 s=1
increased covariance matrices. Process and measurement AAVBpos =
noises, which are generated according to (75), (76), have $ $ $ $
1 # $$ 1 # s $ 1# T $ # $
heavier tails.
T M
$ $ 1
M
$
$ xk − x̂ks $ + $ yks − ŷks $(80)
In this simulation, the Kalman filter with true covari- T k=1 $ M s=1 $ T
k=1
$M
s=1
$
ance matrices (KFTCM), the HKF [16], the MCCKF [22],
the STF [3], the proposed filter with fixed k = Pk|k−1 where (xks , yks ) and (x̂ks , ŷks ) are the true and estimated po-
(the proposed filter-fixed), and the proposed filter with es- sitions at the sth Monte Carlo run, and M = 1000 and
timated k (the proposed filter-) are tested. The tuning T = 100 s are respectively the total number of Monte
parameter and the number of iterations of the HKF are cho- Carlo runs and the simulation time. Similar to the RMSE,
sen as γ = 1.345 [16] and N = 10, the kernel size of the ARMSE, and AAVB in position, we can also write formula
MCCKF is chosen as σ = 15, and the dof parameter of the for the RMSE, ARMSE, and AAVB in velocity.
STF is chosen as ν = 3 [3]. In the proposed filter-fixed and The true and estimated trajectories obtained from the
the proposed filter-, the dof parameters, tuning parameter existing filters and the proposed filter in a single Monte
and the number of iterations are set as: ω = ν = 5, τ = 5, Carlo run are shown in Fig. 3. The RMSEs and AAVBs
N = 10. The proposed filters and existing filters are coded of position and velocity from the existing filters and the
with MATLAB and the simulations are run on a computer proposed filter are, respectively, shown in Figs. 4 and 5
with Intel Core i7-3770 CPU at 3.40 GHz. and Table I. The implementation times of the proposed fil-
To compare the performance of existing filters and the ters and existing filters in single-step run when N = 10 are
proposed filter, the RMSEs, the averaged RMSEs (ARM- shown in Table II. It is seen from Fig. 3 that the estimated
SEs) and the averaged absolute value of biases (AAVBs) trajectories from the proposed filter are closer to the
of position and velocity are chosen as performance met- true trajectory as compared with existing filters, particu-
rics. The RMSE, ARMSE, and AAVB of position are larly around coordinates (1500, 1200), which is caused by
Fig. 5. RMSEs of the velocity from existing filters and the proposed Fig. 7. ARMSEs of the velocity of the target when N = 1, 2, . . . , 20.
filters.
TABLE I
AAVBs of the Proposed Filters and Existing Filters
Filters AAVBpos (m) AAVBvel (m/s)
KFTCM 0.948 0.303

HKF 0.728 0.272
MCCKF 0.742 0.277
STF 0.918 0.349
The proposed filter-fixed 0.607 0.257
The proposed filter- 0.561 0.243
Fig. 8. RMSEs of the position from the proposed filters when

TABLE II τ = 2, 3, 4, 5, 6.
Implementation Times of the Proposed
Filters and Existing Filters in Single
Step Run When N = 10
Filters Time (s)
KFTCM 2.7 × 10−5

HKF 5.1 × 10−4
MCCKF 1.0 × 10−4
STF 3.9 × 10−5
The proposed filter-fixed 6.5 × 10−4
The proposed filter- 7.4 × 10−4
Fig. 9. RMSEs of the velocity from the proposed filters when

τ = 2, 3, 4, 5, 6.
seen from Figs. 6 and 7 that the proposed filter-fixed and the
proposed filter-, respectively, outperform existing filters
for N ≥ 7 and N ≥ 3. The proposed filter- has higher es-
timation accuracy than the proposed filter-fixed for N ≥ 2.
Figs. 8 and 9 show the RMSEs of position and velocity
from the proposed filters with different tuning parameters
τ = 2, 3, 4, 5, 6. We can see from Figs. 8 and 9 that the
Fig. 6. ARMSEs of the position of the target when N = 1, 2, . . . , 20. proposed filter- with tuning parameters τ = 2, 3, 4, 5, 6
has almost identical estimation accuracy, however, it
the process and measurement outliers. We can see from outperforms the proposed filter-fixed in terms of RMSEs
Figs. 4 and 5, Tables I and II that the proposed filter of position and velocity.
has smaller RMSEs and AAVBs of position and velocity
but slightly higher computational complexity than existing V. CONCLUSION
filters. Moreover, as expected, the proposed filter- has In this paper, a new robust Student’s t-based Kalman
higher estimation accuracy and smaller bias than the pro- filter was proposed for linear systems with heavy-tailed
posed filter-fixed. process and measurement noises, which provided a Gaus-
Figs. 6 and 7 show the ARMSEs of position and ve- sian approximation to the posterior PDF. The one-step pre-
locity from the existing filters and the proposed filters with dicted PDF and likelihood PDF were modeled as Stu-
different numbers of iterations N = 1, 2, . . . , 20. It can be dent’s t distributions with different dof parameters, and
CORRESPONDENCE 1553
the PDF of the unknown scale matrix was modeled as an IEEE Signal Process. Lett., vol. 22, no. 11, pp. 2450–2454,
inverse Wishart distribution. A Student’s t-based hierarchi- Dec. 2015.
cal Gaussian state-space model is presented by introduc- [11] Y. L. Huang, Y. G. Zhang, N. Li, and J. Chambers
A robust Gaussian approximate filter for nonlinear systems with
ing auxiliary random variables, based on which approxi- heavy-tailed measurement noises
mate posterior PDFs of state, unknown scale matrix and in Proc. 2016 IEEE Int. Conf. Acoust., Speech Signal Process.,
auxiliary random variables can be obtained by perform- Mar. 2016, pp. 4209–4213.
ing structured VB inference. Simulation results illustrated [12] Y. L. Huang, Y. G. Zhang, N. Li, and J. Chambers
that the proposed robust Kalman filter outperforms existing A robust Gaussian approximate fixed-interval smoother for
nonlinear systems with heavy-tailed process and measurement
state-of-the-art filters in a manoeuvring target tracking ex- noises
ample with moderate contaminated process and measure- IEEE Signal Process. Lett., vol. 23, no. 4, pp. 468–472, Apr.
ment noises. 2016.
[13] B. Kovacevic, Z. Durovic, and S. Glavaski
YULONG HUANG On robust Kalman filtering
YONGGANG ZHANG, Senior Member, IEEE Int. J. Control, vol. 56, no. 3, pp. 547C–562, 1992.
NING LI [14] G. Durgaprasad and S. S. Thakur
Harbin Engineering University, Harbin, China Robust dynamic state estimation of power systems based on
M-estimation and realistic modeling of system dynamics
ZHEMIN WU IEEE Trans. Power Syst., vol. 13, no. 4, pp. 1331–1336, Nov.
Harbin Institute of Technology, Harbin, China 1998.
JONATHON A. CHAMBERS, Fellow, IEEE [15] C. D. Karlgaard
Newcastle University, Newcastle upon Tyne NE1 Robust rendezvous navigation in elliptical orbit
7RU, U.K. J. Guid., Control, Dyn., vol. 29, no. 2, pp. 495C–499, 2006.
[16] C. D. Karlgaard and H. Schaub
Huber-based divided difference filtering
REFERENCES J. Guid., Control, Dyn., vol. 30, no. 3, pp. 885–C891, 2007.
[17] M. A. Gandhi and L. Mili
[1] B. Jia and M. Xin Robust Kalman filter based on a generalized maximum-
Sparse-grid quadrature H∞ filter for discrete-time systems with likelihood-type estimator
uncertain noise statistics IEEE Trans. Signal Process., vol. 58, no. 5, pp. 2509–2520,
IEEE Trans. Aerosp. Electron. Syst., vol. 49, no. 3, pp. 1626– May 2010.
1636, Jul. 2013. [18] G. T. Cinar and J. C. Principe
[2] M.-J. Yu Hidden state estimation using the correntropy filter with fixed
INS/GPS integration system using adaptive filter for estimating point update and adaptive kernel size
measurement noise variance in Proc. IEEE World Congr. Comput. Intell., Brisbane, Aus-
IEEE Trans. Aerosp. Electron. Syst., vol. 48, no. 2, pp. 1786– tralia, 2012, pp. 1–6.
1792, Apr. 2012. [19] B. Chen and J. C. Principe
[3] M. Roth, E. Özkan, and F. Gustafsson Maximum correntropy estimation is a smoothed MAP estima-
A Student’s filter for heavy-tailed process and measurement tion
noise IEEE Signal Process. Lett., vol. 19, no. 8, pp. 491–494, Aug.
in Proc. 2013 IEEE Int. Conf. Acoust., Speech Signal Process., 2012.
May 2013, pp. 5770–5774. [20] B. Chen, J. Wang, H. Zhao, N. Zheng, and J. C. Principe
[4] Y. L. Huang, Y. G. Zhang, N. Li, and J. Chambers Convergence of a fixed-point algorithm under maximum cor-
Robust Student’s t based nonlinear filter and smoother rentropy criterion
IEEE Trans. Aero. Electron. Syst., vol. 52, no. 5, pp. 2586–2596, IEEE Signal Process. Lett., vol. 22, no. 10, pp. 1723–1727, Oct.
Oct. 2016. 2015.
[5] Y. L. Huang, Y. G. Zhang, N. Li, S. M. Naqvi, and J. Chambers [21] Y. D. Wang, W. Zheng, S. M. Sun, and L. Li
A robust Student’s t based cubature filter Robust information filter based on maximum correntropy cri-
in Proc. 19th Int. Conf. Inf. Fusion, Jul. 2016, pp. 9–16. terion
[6] J. Loxam and T. Drummond J. Guid., Control, Dyn., vol. 39, pp. 1126–1131, 2016.
Student mixture filter for robust, realtime visual tracking [22] R. Izanloo, S. A. Fakoorian, H. S. Yazdi, and D. Simon
in Proc. 10th Eur. Conf. Comput. Vis., Part III, 2008, pp. 372– Kalman filtering based on the maximum correntropy criterion
385. in the presence of non-Gaussian noise
[7] G. Agamennoni, J. I. Nieto, and E. M. Nebot in Proc. 50th Annu. Conf. Inf. Sci. Syst., 2016, pp. 500–505.
Approximate inference in state-space models with heavy-tailed [23] S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp
noise A tutorial on particle filters for on-Line non-linear/non-
IEEE Trans. Signal Process., vol. 60, no. 10, pp. 5024–5037, Gaussian Bayesian tracking
Oct. 2012. IEEE Trans. Signal Process., vol. 50, no. 2, pp. 174–189, Feb.
[8] J. Ting, E. Theodorou, and S. Schaal 2002.
Learning an outlier-robust Kalman filter [24] A. O’Hagan and J. J. Forster
in Proc. 18th Eur. Conf. Mach. Learn., Warsaw, Poland, 2007, Kendall’s Advanced Theory of Statistics: Bayesian Inference.
pp. 748–756. London, U.K.: Arnold, 2004.
[9] H. Zhu, H. Leung, and Z. He [25] C. M. Bishop
A variational Bayesian approach to robust sensor fusion based Pattern Recognition and Machine Learning. New York, NY,
on Student-t distribution USA: Springer, 2007.
Inf. Sci., vol. 221, no. 2013, pp. 201–214, Sep. 2012. [26] D. Tzikas, A. Likas, and N. Galatsanos
[10] H. Nurminen, T. Ardeshiri, R. Piché, and F. Gustafsson The variational approximation for Bayesian inference
Robust inference for state-space models with skewed measure- IEEE Signal Process. Mag., vol. 25, no. 6, pp. 131–146, Nov.
ment noise 2008.
View publication stats

A Novel Robust Student T-Based Kalman Filter

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

A Novel Robust Student T-Based Kalman Filter

Uploaded by

Copyright:

Available Formats

A Novel Robust Student’s t-Based Kalman Filter

Manuscript received May 5, 2016; revised September 22, 2016; released

p(xk |z1:k−1 ) = p(xk |xk−1 )p(xk−1 |z1:k−1 )dxk−1

= pw (xk − Fk−1 xk−1 )p(xk−1 |z1:k−1 )dxk−1

p(xk , ξk , k , λk |z1:k ) ≈ q(xk )q(ξk )q(k )q(λk ) (32) D(i)

where shape parameter γki+1 and rate parameter δki+1 are

γki+1 = 0.5(m + ν) (47)

where the parameter t = 1s is the sampling interval and

where Q and R are, respectively, nominal process and mea-

distribution with covariance matrix Q or R and 5% of pro- ARMSEpos =

Filters AAVBpos (m) AAVBvel (m/s)

KFTCM 0.948 0.303

Fig. 8. RMSEs of the position from the proposed filters when

Filters Time (s)

KFTCM 2.7 × 10−5

Fig. 9. RMSEs of the velocity from the proposed filters when

View publication stats

You might also like