
Event based sampling in non-linear filtering

Mauricio G. Cea, Graham C. Goodwin
School of Electrical Engineering and Computer Science, University of Newcastle, Australia
Article history:
Received 16 August 2011
Received in revised form 17 November 2011
Accepted 21 November 2011
Available online 28 February 2012

Keywords:
Event-based sampling
Sampling systems
Vector quantization
Non-linear filters
Non-linear systems
Abstract

Most of the existing approaches to estimation and control are based on the premise that regular sampling is used. However, in some applications, there exists strong motivation to use event rather than time based sampling. For example, in sensor networks, it is often desirable to send data only when something interesting happens. This paper explores some of the issues involved in event based sampling in the context of non-linear filtering. Several examples are presented to illustrate the ideas.

© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
Most current implementations of digital control and estimation use regular sampling with fixed period T, see e.g. Middleton and Goodwin (1990), Feuer and Goodwin (1996), Åström and Wittenmark (1990) and Hristu-Varsakelis and Levine (2005).
However, there is often strong practical motivation to change
this paradigm to one in which one only takes samples when
something interesting happens. This changes the focus to, so-
called, event based sampling. In this paper, we consider that a
measurement is sent only when the measured variable crosses a
given threshold. Thus the sampling is not regular. The latter
strategy has many advantages including conserving valuable
communication resources in the context of networked control
or sensor networks.
There is a growing literature on event based sampling. An early seminal paper was that of Åström and Bernhardsson (2002). Other related publications include Årzén (1999), Anta and Tabuada (2009, 2008), Byrnes and Isidori (1989), Otanez, Moyne, and Tilbury (2002), Tabuada (2007), Le and McCann (2007), McCann and Le (2008),
Pawlowski et al. (2009), and Xu and Cao (2011). As pointed out in
Anta and Tabuada (2010), event based sampling and control are
particularly attractive for non-linear systems since the nature of the
system response can be operating point dependent and this may
mean that different sampling strategies are desirable at different
operating points.
The current paper examines some of the issues related to event based sampling for non-linear filtering. An event based non-linear filter is developed. It is also shown how such a filter can be implemented using approximate non-linear filtering algorithms including particle filtering (Chen, 2003; Handschin & Mayne, 1969; Schön, 2006) and minimum distortion filters (Cea, Goodwin, & Feuer, 2010; Goodwin, Feuer, & Müller, 2010).
One issue that needs careful consideration in the context of event based filtering is that of the anti-aliasing filter. It is argued here that an alternative viewpoint needs to be adopted for the design of this filter.
The layout of the remainder of this paper is as follows: Section 2 reviews continuous time stochastic models. Section 3 describes basic sampling strategies. Section 4 describes the core ideas behind regular and event based sampling. Section 5 describes sampled data models. Section 6 reviews the traditional discrete non-linear filter. Section 7 details modifications that are required in the discrete non-linear filter to incorporate event based sampling. Section 8 briefly describes approximate discrete non-linear filters. Section 9 presents a realistic example. Section 10 draws conclusions.
2. A continuous time non-linear model
Most physical systems evolve in continuous time and are hence described by ordinary differential equations. A stochastic version of such equations takes the following conceptual form:

dx/dt = f_c(x) + g_c(x) dω/dt    (1)
doi:10.1016/j.conengprac.2011.11.008
This paper is built upon the plenary presentation: Graham C. Goodwin, "Temporal and Spatial Quantization in Nonlinear Filtering", AdConIP, Hangzhou, China, 2011.
Corresponding author.
E-mail addresses: Mauricio.Cea@uon.edu.au, mauricio.cea.g@gmail.com (M.G. Cea), Graham.Goodwin@newcastle.edu.au (G.C. Goodwin).
Control Engineering Practice 20 (2012) 963–971
dz/dt = h_c(x) + dν/dt    (2)

where x ∈ R^n is the state vector and dz/dt ∈ R^m is the measured output vector. In (1) and (2), dω/dt and dν/dt represent independent continuous time white noise processes of intensity Q_c and R_c respectively. An important observation is that continuous time white noise does not exist in any meaningful sense. (For example, if one calculates the auto-covariance of such a process, then it takes the form Q_c δ(t), where δ is the Dirac delta function.) To
overcome this difficulty, it is often more insightful to use a spectral density description for the noise. The spectral density is the Fourier transform of the autocorrelation, i.e.

Spectral density of dω/dt = ∫_{−∞}^{∞} Q_c δ(t) e^{−jωt} dt = Q_c    (3)
Thus Q_c is the spectral density of the process {dω/dt}. White noise has constant spectral density over an infinite bandwidth. This observation allows one to supplement the notion of white noise by the notion of broad band noise, which has constant spectrum over a wide (but not infinite) bandwidth. Indeed, it turns out that whiteness of the process and measurement noise is largely irrelevant to the operation of an optimal filter. What is actually needed is that the spectrum be substantially constant in key regions. This issue is discussed in detail in Goodwin, Agüero, Salgado, and Yuz (2009). These ideas expose a difficulty with the common practice of using variances to describe noise in the discrete time case. For example, say that the noise is broadband (but non-white) having spectral density Q covering a bandwidth of W; then the associated variance V is equal to the area under the spectrum, i.e. V = WQ. If one uses spectral density to describe the noise, then no difficulties will be encountered since the noise intensity has been correctly captured. However, say that the Nyquist frequency, 1/(2Δ), is greater than the noise bandwidth. Then, if one uses variance to describe the associated filter, the variance must be artificially scaled to V′ = V/(WΔ) to match the spectral densities. If this is not done then the associated filter will perform badly due to underestimation of the noise intensity.
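The rescaling above can be illustrated numerically. The values of Q, W and Δ below are illustrative assumptions, not numbers from this paper:

```python
# Hypothetical broadband noise: flat spectral density Q over bandwidth W.
Q = 2.0        # spectral density (assumed)
W = 5.0        # noise bandwidth (assumed)
V = W * Q      # variance = area under the (flat) spectrum

# Fast sampling: the Nyquist frequency 1/(2*Delta) = 50 exceeds W = 5.
Delta = 0.01   # sample period (assumed)

# If the filter is parameterized by variance, the variance must be rescaled
# so the implied discrete spectral density (variance * Delta) matches Q:
V_scaled = V / (W * Delta)

# Check: the rescaled variance correctly captures the noise intensity.
assert abs(V_scaled * Delta - Q) < 1e-12
print(V, V_scaled)
```

Note that V_scaled is much larger than V here, consistent with the observation that the unscaled variance underestimates the noise intensity at fast sampling.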
A related problem is that variance does not indicate the difficulty of an estimation problem. For example, consider the case of very fast sampling. Then 1/Δ will be large. In this case, a small noise intensity (i.e. small spectral density) could be associated with a large noise variance. Yet, most of this noise power will lie at frequencies above the bandwidth of the system. Intuitively, this part of the noise will not affect the filter performance. Again, it is only the spectral density in relevant parts of the spectrum that affects filter performance.
The above difficulties are overcome if one works with spectral density rather than variance. Moreover, this aligns the continuous and discrete cases, since spectral density (or equivalently incremental variance) is exclusively used in the continuous case.
In view of the above discussion, Eqs. (1) and (2) are more appropriately expressed in incremental form as:

dx = f_c(x) dt + g_c(x) dω    (4)

dz = h_c(x) dt + dν    (5)

where the processes ω and ν correspond to Brownian motion processes having incremental covariance Q_c dt and R_c dt respectively. Also, as discussed above, Q_c and R_c can equivalently be thought of as spectral densities for dω/dt and dν/dt respectively.
The linear equivalents of Eqs. (4) and (5) are

dx = A_c x dt + dω    (6)

dz = C_c x dt + dν    (7)

where x ∈ R^n, z ∈ R^m, A_c ∈ R^{n×n}, C_c ∈ R^{m×n}, dω ∈ R^n and dν ∈ R^m are the state, measured output, system matrices, process noise and measurement noise respectively. The initial state satisfies E{x_0} = x̂_0 and E{(x_0 − x̂_0)(x_0 − x̂_0)^T} = P̂_0. In the linear case, ω and ν are assumed to be stationary vector Wiener processes with incremental covariance Q_c dt and R_c dt respectively. The matrices Q_c and P̂_0 are assumed to be symmetric and positive semidefinite, and R_c is assumed to be symmetric and positive definite.
3. Choice of sampling strategy

Consider first the case of regular sampling with fixed period Δ. (This is sometimes called Riemann sampling (Åström & Bernhardsson, 2002). Here the focus is on the independent time variable.)
In Section 2, dz/dt was defined as the continuous time output (see Eqs. (2), (5) and (7)). The next step is to develop the form of the model when samples are taken. However, this begs the question: samples of what? Two possible options are explored below for the sampled output.
3.1. Direct sampling of dz/dt

At first glance, it seems plausible that one could directly sample the continuous process dz/dt. However, this choice is actually an infeasible option since the samples of the associated noise, dν/dt, would have infinite variance!
3.2. Sampling after passing through an anti-aliasing filter

An appropriate remedy to the difficulty described in Section 3.1 is to pass dz/dt through an anti-aliasing filter prior to sampling. A common choice for such a filter is to simply average dz/dt over the sample period. Actually, some form of averaging is inherent in all low pass filters that are typically used as anti-aliasing filters. In the case of averaging, the sampled output satisfies:

y_k = (1/Δ) ∫_{kΔ}^{(k+1)Δ} dz    (8)

y_k = (1/Δ) {z((k+1)Δ) − z(kΔ)}    (9)

To obtain a notation for the sampled data case which resembles the continuous case, the (discrete) increment in z is defined via

dz^+ = z((k+1)Δ) − z(kΔ)    (10)

where the superscript + denotes the next sampled value. In this case, Eq. (9) can be rewritten as

y_k = (1/Δ) dz^+    (11)
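The averaging filter of Eqs. (8)–(11) can be sketched as follows. The constant output level h_c(x) = 1 and the spectral density R_c are illustrative assumptions:

```python
import numpy as np

# Averaging anti-aliasing filter of Eqs. (8)-(11).
rng = np.random.default_rng(0)
Delta = 0.01                    # sample period (assumed)
n_fine = 100                    # fine integration steps per sample period
dt = Delta / n_fine
h, R_c = 1.0, 0.1               # constant output level and noise spectral density (assumed)

# Fine-grid increments dz = h_c(x) dt + dnu, with dnu ~ N(0, R_c dt)
dz = h * dt + np.sqrt(R_c * dt) * rng.standard_normal(1000 * n_fine)

# Eq. (8): y_k = (1/Delta) * (integral of dz over one sample period)
y = dz.reshape(-1, n_fine).sum(axis=1) / Delta

# Sample mean ~ h; sample variance ~ R_c / Delta (the discrete noise variance)
print(y.mean(), y.var())
```

The observed sample variance close to R_c/Δ is exactly the discrete measurement noise variance discussed in Section 2.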
4. Event based sampling

Next consider the case of event based sampling. (This is sometimes called Lebesgue sampling (Åström & Bernhardsson, 2002). Here the focus is on the dependent variable.)
Let {q_{ij}} be a set of quantization levels for the jth output. These quantization levels could, for example, be evenly spaced so that

q_{i+1,j} − q_{i,j} = L_j ∈ R for j = 1, …, n    (12)

In event based sampling, the measured output is transmitted only when a quantization level has been crossed. Moreover, provided no bits are lost and provided a starting signal level is known, then only 1 bit/sample needs to be sent to indicate that the signal has moved to the next interval above (+1) or the next interval below (−1). The difference between Riemann and Lebesgue sampling is illustrated in Fig. 1.
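The one-bit transmission rule can be sketched as follows. The level spacing and test signal are illustrative, and the `round`-based level indexing is one possible implementation choice, not the paper's:

```python
def lebesgue_sample(signal, L, q0=0.0):
    """Emit (index, +1/-1) events when the signal crosses evenly spaced
    quantization levels of spacing L, measured relative to q0."""
    events = []
    level = round((signal[0] - q0) / L)   # index of the current quantization interval
    for k, s in enumerate(signal[1:], start=1):
        new_level = round((s - q0) / L)
        while new_level > level:
            level += 1
            events.append((k, +1))        # moved to the next interval above
        while new_level < level:
            level -= 1
            events.append((k, -1))        # moved to the next interval below
    return events

# A slow ramp of height 1 with spacing L = 0.25 produces four upward events;
# between crossings, nothing is transmitted.
ramp = [i / 100 for i in range(101)]
ev = lebesgue_sample(ramp, L=0.25)
print(ev)
```

Note that only the ±1 event stream (plus the known starting level) is needed to reconstruct the quantized trajectory at the receiver.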
Next consider the design of the anti-aliasing filter. Here, a little more care is needed than in the case of Riemann sampling. Specifically, it is required that interesting events should trigger sampling. This raises the need to trade off noise immunity against sensitivity to change. To illustrate, say that one uses the averaging filter given in (8) and (9). Then a sudden change in output may be masked by the effect of averaging an (almost) constant signal over a long period of time. Hence it is desirable to place a lower limit on the bandwidth of the anti-aliasing filter so as to achieve a compromise between sensitivity and noise averaging. This trade-off does not arise in Riemann sampling since there is no need to detect changes. In the case of the averaging filter, the trade-off can be achieved by simply resetting the averager when the sample period goes beyond some pre-determined upper limit, say, Δ_max.
There also exists a close connection between the choice of the anti-aliasing filter bandwidth and the quantization thresholds used in the event based sampler. The reason is that one needs to ensure that measurement noise does not cause frequent triggering of the event based sampling even if the signal component is substantially constant.
Simple design guidelines can be developed as follows. Say that the measurement noise is broadband with spectral density R and that an anti-aliasing filter with reset period Δ_max is used. Then the corresponding discrete measurement noise will have variance of approximately R/Δ_max. Assume that the quantization level spacing is L and say that spurious triggering of the event based sampler should be avoided with high probability. This can be achieved by requiring that there is only a small probability that the discrete measurement noise has magnitude greater than L/2. To achieve this one might require

2σ ≤ L/2    (13)

where σ is the discrete noise standard deviation, i.e. σ = √(R/Δ_max). Eq. (13) is equivalent to

L ≥ 4√(R/Δ_max)    (14)

This equation links the anti-aliasing filter bandwidth, 1/Δ_max, the noise spectral density R and the quantization level spacing L to achieve a low probability that the noise will be greater than L/2. In practice, it is desirable to choose Δ_max as small as possible subject to satisfying (14), since large values of Δ_max compromise one's ability to detect changes in the signal component.
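Eq. (14) translates directly into a one-line guideline; the values of R and Δ_max below are assumptions for illustration:

```python
import math

def min_level_spacing(R, Delta_max):
    """Eq. (14): smallest quantization spacing L such that the discrete
    measurement noise (variance ~ R/Delta_max) stays below L/2 with
    high probability (2-sigma rule of Eq. (13))."""
    return 4.0 * math.sqrt(R / Delta_max)

# Assumed numbers: noise spectral density R = 0.01, reset period 0.1 s.
L = min_level_spacing(R=0.01, Delta_max=0.1)
print(L)
```

For a desired L one can equally solve (14) for the largest admissible reset period, Δ_max ≤ 16 R / L².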
5. Discrete time models

Here the model update period, which is denoted by Δ, is not necessarily the same as the sampling period, denoted T. Note that typically T > Δ, especially when event based sampling is utilized. For simplicity, the anti-aliasing filter is fixed as an averaging filter having period Δ. However, the extension to other anti-aliasing filters is straightforward.
5.1. The conventional discrete model for linear systems

First consider the linear case of (6) and (7). This case will reveal several modelling issues which apply, mutatis mutandis, to the non-linear case. An exact discrete time model describing the samples can be readily shown to be

x^+ = A_d x + ω    (15)

y = C_d x + ν    (16)
where the system matrices take the following specific values:

A_d = e^{A_c Δ} = I + A_c Δ + A_c² Δ²/2 + ⋯    (17)

C_d = (1/Δ) C_c A_c^{−1} (e^{A_c Δ} − I) = C_c (I + (1/2!) A_c Δ + (1/3!) A_c² Δ² + ⋯)    (18)
The corresponding process and output noise processes have zero mean and covariance:

Σ_d = E{ [ω_k; ν_k] [ω_k; ν_k]^T } = [ Q_d, S_d ; S_d^T, R_d ]    (19)
where the covariance matrix is given by

Σ_d = D ( ∫_0^Δ e^{Āt} [ Q_c, 0 ; 0, R_c ] e^{Ā^T t} dt ) D    (20)
and where

Ā = [ A_c, 0 ; C_c, 0 ]  ⇒  e^{Āt} = [ e^{A_c t}, 0 ; C_c ∫_0^t e^{A_c s} ds, I ]    (21)

D = [ I, 0 ; 0, (1/Δ) I ]    (22)
Even though the above sampled system is an exact description for every finite Δ, the model is a source of conceptual and numerical problems when the sampling period decreases to zero. For example, it is readily seen that, as Δ → 0:

A_d → I    (23)

Σ_d → [ 0, 0 ; 0, ∞ ]    (24)

These results show that the discrete-time model (15) and (16) will be the source of difficulties as the sampling interval becomes small: the A_d matrix becomes the identity matrix, and the noise covariance matrix Σ_d tends to the uninformative values given in (24). These difficulties can be readily resolved by appropriate scaling of the model equations. This is shown in the next subsection.
Fig. 1. Riemann vs Lebesgue sampling (regular time sampling vs regular spatial sampling).
5.2. Incremental form of the sampled data linear model

Here, an alternative formulation of the discrete-time model, which has the same structure as the continuous-time model, is presented. The key tool used is to introduce appropriate scaling so that the limit Δ → 0 is meaningful. The alternative model provides conceptual advantages and superior numerical behavior at fast sampling rates, see Goodwin, Middleton, and Poor (1992), Feuer and Goodwin (1996), and Middleton and Goodwin (1990).
The problems illustrated in (23) and (24) suggest that the traditional approach to describing discrete-time models is not appropriate when fast sampling rates are employed. The remedy is to scale the equations to produce an equivalent incremental model¹ expressed as follows:
dx^+ ≜ x_{k+1} − x_k = A_i x_k Δ + ω̄_k    (25)

dz^+ ≜ z_{k+1} − z_k = Δ y_k = C_i x_k Δ + ν̄_k    (26)

where it is readily seen using (17) and (18) that

A_i = (A_d − I)/Δ = A_c + (1/2) A_c² Δ + ⋯    (27)

C_i = C_d = C_c + (1/2!) C_c A_c Δ + ⋯    (28)
The initial state satisfies E{x_0} = x̂_0 and E{(x_0 − x̂_0)(x_0 − x̂_0)^T} = P̂_0. The new process noise sequence is ω̄_k = ω_k, having covariance E{ω̄_k ω̄_k^T} = Q_d. For consistency with the continuous case, the noise covariance is expressed in incremental form (or equivalently using spectral density) by scaling by the sample period. Thus let

Q_d = Q_i Δ = ( Q_c + (Δ/2)(A_c Q_c + Q_c A_c^T) + ⋯ ) Δ    (29)

where Q_i can be interpreted as either incremental covariance or discrete noise spectral density.
For the system output equation, it is clear that, when an integrating anti-aliasing filter is used, the expression obtained for the output corresponds to increments of the variable z, i.e.

ȳ_k ≜ Δ y_k = ∫_{kΔ}^{(k+1)Δ} dz = z_{k+1} − z_k    (30)

The measurement noise sequence is now ν̄_k = Δ ν_k, having incremental covariance expressed as E{ν̄_k ν̄_k^T} = R_i Δ, where

R_i Δ ≜ Δ² R_d = ( R_c + (Δ²/3) C_c Q_c C_c^T + ⋯ ) Δ    (31)

The cross-covariance E{ω̄_k ν̄_k^T} is S_i Δ ≜ S_d Δ.
Finally, if Δ is small, then the incremental model matrices can be approximated by retaining the first term in the expansions (27), (28), (29) and (31) respectively, i.e.

A_i ≈ A_c, C_i ≈ C_c, Q_i ≈ Q_c, R_i ≈ R_c, S_i ≈ 0    (32)

Thus at fast sampling the incremental matrices are approximately the same as the underlying continuous time matrices. Note that the approximations given in (32) are equivalent to using Euler integration to obtain the incremental model. Also note that the use of Euler integration gives an approximation, whereas use of incremental models can give an exact description.
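The contrast between A_d and A_i at fast sampling can be checked numerically. The system matrix A_c below is an arbitrary stable example, not one from this paper:

```python
import numpy as np
from scipy.linalg import expm

A_c = np.array([[0.0, 1.0], [-2.0, -3.0]])   # assumed continuous system matrix
Delta = 1e-3                                  # fast model update period

A_d = expm(A_c * Delta)                       # exact discrete matrix, Eq. (17)
A_i = (A_d - np.eye(2)) / Delta               # incremental matrix, Eq. (27)

# As Delta -> 0: A_d collapses toward the (uninformative) identity, Eq. (23),
# while A_i stays close to the underlying continuous matrix A_c, Eq. (32).
print(np.linalg.norm(A_d - np.eye(2)))
print(np.linalg.norm(A_i - A_c))
```

The residual A_i − A_c is of order Δ, matching the leading correction term (1/2) A_c² Δ in (27).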
5.3. Incremental form of the sampled data non-linear model

Next, consider the non-linear case under the assumption that Δ, the model update period, is sufficiently small so that Euler integration gives a discrete model of sufficient accuracy (in practice this may require some experimentation to find a suitable value for Δ). Also note that Δ is the model update period, which is not necessarily equal to the sampling period T. The discrete model in incremental form can then be written as

dx^+ ≜ x_{k+1} − x_k = f_i(x_k) Δ + ω̄_k    (33)

dz^+ ≜ z_{k+1} − z_k = Δ y_k = h_i(x_k) Δ + ν̄_k    (34)
where

E{ω̄_k ω̄_k^T} = Q_i(x_k) Δ    (35)

E{ν̄_k ν̄_k^T} = R_i Δ    (36)

Also, if one uses Euler integration, the functions f_i, h_i, Q_i, R_i can be directly linked to the corresponding continuous functions as follows:

f_i(x) → f_c(x)    (37)

h_i(x) → h_c(x)    (38)

Q_i → Q_c    (39)

R_i → R_c    (40)
6. Review of the traditional discrete non-linear filter

The traditional discrete non-linear filter can now be directly formulated. The changes necessary to deal with event based sampling are dealt with later. Thus, consider a discrete time stochastic non-linear model of the form (33) and (34).
The problem of interest is to compute p_{x_k}(x_k | Y_k) (the conditional distribution of the state at time k given observations of y up to and including time k, i.e. Y_k = {y_0, …, y_k}).
A recursive set of equations is presented below that yields the solution to the above problem (see also Jazwinski, 1970).
One proceeds sequentially by first assuming that p_{x_0}(x_0 | Y_{−1}) is known. For example, this distribution might be Gaussian with mean x̂_0 and covariance P̂_0.
Next assume that p_{x_k}(x_k | Y_k) is known. Then the following state update formula holds:

p_{x_{k+1}}(x_{k+1} | Y_k) = ∫ p_{x_k}(x_k | Y_k) p_{x_{k+1}}(x_{k+1} | x_k) dx_k    (41)
The impact of adding an observation, i.e. y_{k+1}, is described by

p_{x_{k+1}}(x_{k+1} | Y_{k+1}) = p_{x_{k+1}}(x_{k+1} | Y_k, y_{k+1})
  = p_{x_{k+1}}(x_{k+1} | Y_k) p_{y_{k+1}}(y_{k+1} | x_{k+1}) / ∫ p_{x_{k+1}}(x_{k+1} | Y_k) p_{y_{k+1}}(y_{k+1} | x_{k+1}) dx_{k+1}    (42)

Eqs. (41) and (42) are often referred to as the Chapman–Kolmogorov equation and Bayes' rule respectively.
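The recursion (41)–(42) can be exercised on a fixed grid for a scalar state. The model below (f_i(x) = −x, linear output h_i(x) = x) and all numbers are illustrative choices, not the paper's example:

```python
import numpy as np

# Point-mass sketch of Eqs. (41)-(42) for a scalar state on a fixed grid,
# assuming the incremental model of Section 5.3 with Gaussian noises.
grid = np.linspace(-5, 5, 201)
dx = grid[1] - grid[0]
Delta, Q_i, R_i = 0.1, 1.0, 1.0          # assumed update period and spectral densities

def gauss(u, var):
    return np.exp(-0.5 * u**2 / var) / np.sqrt(2 * np.pi * var)

# Prior p(x_0 | Y_{-1}): Gaussian with mean 1, variance 1 (assumed).
p = gauss(grid - 1.0, 1.0)
p /= p.sum() * dx

# Time update (Chapman-Kolmogorov, Eq. (41)) with transition kernel
# p(x_{k+1} | x_k) = N(x_k + f_i(x_k) Delta, Q_i Delta), f_i(x) = -x.
K = gauss(grid[:, None] - grid[None, :] * (1 - Delta), Q_i * Delta)
p_pred = (K @ p) * dx

# Measurement update (Bayes' rule, Eq. (42)) with observation y = 0.5:
# likelihood of dz+ = y * Delta given x is N(h_i(x) Delta, R_i Delta), h_i(x) = x.
lik = gauss(0.5 * Delta - grid * Delta, R_i * Delta)
p_post = p_pred * lik
p_post /= p_post.sum() * dx              # normalization = denominator of (42)

print(np.sum(grid * p_post) * dx)        # posterior mean
```

The same grid-based structure reappears (with adaptive grids) in the minimum distortion filter of Section 8.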
7. Modifications to deal with down-sampling

As argued in the Introduction, it may be highly inefficient to sample very quickly. Thus some form of down-sampling, or event based sampling, may be beneficial. Say that one begins with a sample period Δ. Then one can down-sample in several ways. Two alternatives are discussed below.

7.1. Regular down-sampling

Say that it is desired to change the sample period by a fixed factor, e.g. from Δ to mΔ, where Δ is assumed very small relative

¹ Sometimes called a delta operator model in the literature (Middleton & Goodwin, 1990).
to the natural dynamics of the system. There are some subtle issues that need to be considered.
The non-linear filter is now updated only at period mΔ. Assume that the original anti-aliasing filter is reset every Δ seconds, not every mΔ seconds. The correct strategy is now definitely not to simply take every mth sample and throw the rest away! Clearly this would lead to a highly suboptimal filter since most of the data would have been discarded. On the contrary, if one decides to increase the sampling period from Δ to mΔ, then a new anti-aliasing filter relevant to the new sample period mΔ is desirable. For example, say that one uses the usual averaging filter; then a new observation sequence²

dz′_l = Σ_{k=1}^{m} dz^+_{m(l−1)+k}    (43)

can be digitally constructed before using the discrete non-linear filter.
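Eq. (43) amounts to summing blocks of m fast-rate increments; a minimal sketch:

```python
import numpy as np

def downsample_increments(dz, m):
    """Digital anti-aliasing for down-sampling, Eq. (43): sum m consecutive
    fast-rate increments dz+ to form one slow-rate increment dz'."""
    n = (len(dz) // m) * m                    # drop any incomplete final block
    return np.asarray(dz[:n]).reshape(-1, m).sum(axis=1)

# Increments of z(t) = t on a fast grid (Delta = 1e-3): summing m = 10 of
# them reproduces the increment over the longer period mΔ exactly.
dz_fast = np.full(100, 0.001)
dz_slow = downsample_increments(dz_fast, m=10)
print(dz_slow)
```

Because the sum of increments is the increment of the sum, no information in the fast-rate data is discarded, in contrast to naive decimation.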
If implemented properly, the step of down-sampling can lead to major computational improvements without significant degradation in performance. Indeed, the example below shows that, in this illustrative case, one can down-sample by several orders of magnitude, with a corresponding reduction of several orders of magnitude in the computational effort, without significantly changing the computed conditional probability.
7.1.1. Example
Consider the following simple discrete non-linear system:

x_{t+1} = a x_t + ω′_t    (44)

y_t = x_t² + ν′_t    (45)

where a = 0.999, E{(ω′_t)²} = 10^{−2}, E{(ν′_t)²} = 10^{4} and Δ = 10^{−3}. The magnitudes of E{(ω′_t)²} and E{(ν′_t)²} may seem counterintuitive, but these scalings are a consequence of the ideas described earlier in Section 5.1. The system (44) and (45) is actually more intuitive when expressed in the equivalent incremental form:

dx^+ = f_i(x_k) Δ + ω_k    (46)

dz^+ = h_i(x_k) Δ + ν_k    (47)

where f_i(x_k) = −x_k, h_i(x_k) = x_k², and ω_k, ν_k both have incremental covariance of 10Δ.
It seems heuristically clear that the sample period of 10^{−3} may lead to wasted computational effort. Thus down-sampling is introduced using the strategy explained in (43). Figs. 2, 3 and 4 show the evolution of p_{x_k}(x_k | Y_k) for Δ = 10^{−3} and for the down-sampled versions with mΔ = 10^{−2} and mΔ = 10^{−1} respectively. Inspection of the plots indicates that there is no noticeable deterioration in the computed posterior probability. However, at mΔ = 10^{−1}, the total computational load has been reduced by two orders of magnitude relative to the use of Δ = 10^{−3}! Note that the introduction of the new anti-aliasing filter in (43) is crucial in achieving these results.
7.2. Event based sampling

At first glance it may seem that the extension to event based sampling is immediate, i.e. all one needs to do is run the state update (41) at period Δ (chosen sufficiently small so that Euler integration gives an adequate approximation) and then use the observation update (42) when one decides that a sufficiently interesting change in the output has occurred. Certainly the observations are only needed when a threshold has been crossed. However, it is not true that there is zero relevant information between threshold crossings. On the contrary, there is a valuable piece of information, namely that the output has not crossed a threshold. Hence, estimates can continue to be updated between
Fig. 2. Time evolution of the probability density function at fast sampling, Δ = 0.001.
Fig. 3. Time evolution of the probability density function, Δ = 0.01.
Fig. 4. Time evolution of the probability density function, Δ = 0.1.
² Note that this step of using a new digital anti-aliasing filter is very helpful and does not appear to be widely appreciated.
threshold crossings provided an appropriate change is made to the observation update formula. Specifically, consider the situation illustrated in Fig. 5 where, at the kth time instant, it is known that

y_k ∈ [Q_a, Q_b] ≜ Q_k    (48)
The observation update (42) in the non-linear filter can now be modified to the following, which explicitly utilizes (48):

p(x_{k+1} | Y_k, y_{k+1} ∈ Q_k)
  = [ ∫_{Q_k} p(x_{k+1} | Y_k) p(y_{k+1} | x_{k+1}) dy_{k+1} ] / [ ∫ ∫_{Q_k} p(x_{k+1} | Y_k) p(y_{k+1} | x_{k+1}) dy_{k+1} dx_{k+1} ]    (49)
Note that if one simply chooses not to update the states, then the state estimation uncertainty will grow due to the drift term inherent in the state update (33). Use of (49) avoids this problem. Actually, this is different from the common strategy used in much of the existing event based sampling literature, where updates are usually restricted to cases when a threshold is crossed (Anta & Tabuada, 2009, 2008; Årzén, 1999; Byrnes & Isidori, 1989; Le & McCann, 2007; McCann & Le, 2008; Otanez et al., 2002; Pawlowski et al., 2009; Tabuada, 2007; Xu & Cao, 2011). Some authors, e.g. Sijs and Lazar (2009) and Marck and Sijs (2010), have noted that, for the case of linear filtering, it is desirable to continue to update based on the known fact that the output lies within the quantization thresholds. This is the idea captured in (49) for the case of non-linear filtering. Of course, in practice, the integrals in (49) will need to be approximated. The approximation issue is discussed below via particle filtering and vector quantization methods.
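A scalar sketch of the event based update (49) follows, replacing the point likelihood by the probability that the output lies in Q_k. Gaussian measurement noise, the output map h(x) = x, and all numbers are assumptions for illustration:

```python
import numpy as np
from math import erf, sqrt

def interval_likelihood(h_x, Q_a, Q_b, sigma2):
    """P(Q_a <= y <= Q_b | x) for y = h(x) + e, e ~ N(0, sigma2):
    the inner integral over Q_k in Eq. (49), evaluated in closed form."""
    s = sqrt(2 * sigma2)
    return 0.5 * (erf((Q_b - h_x) / s) - erf((Q_a - h_x) / s))

grid = np.linspace(-3, 3, 301)
dgrid = grid[1] - grid[0]

# Predicted density p(x_{k+1} | Y_k): standard Gaussian (assumed).
p_pred = np.exp(-0.5 * grid**2)
p_pred /= p_pred.sum() * dgrid

# Between crossings it is known only that y_{k+1} lies in Q_k = [0.5, 1.0].
lik = np.array([interval_likelihood(x, 0.5, 1.0, sigma2=0.25) for x in grid])
p_post = p_pred * lik
p_post /= p_post.sum() * dgrid           # denominator of Eq. (49)

print(np.sum(grid * p_post) * dgrid)     # posterior mean
```

Even without a new measured value, the posterior mean moves toward the known interval, which is precisely the extra information exploited by (49).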
8. Spatial quantization

Next consider the issue of spatial quantization. As is clear from (41) and (42), the conditional probability for the states is a function in a high dimensional space. Also, evolution of this function requires the evaluation of high dimensional integrals, as is evident from the right hand sides of (41) and (42). Such integrals cannot be computed in practice without some form of discretization of the spatial coordinates. Two strategies are described below to achieve spatial quantization, namely particle filtering and minimum distortion filtering. The former strategy has its strengths in that the number of particles is independent of dimension, but it requires a large number of points to accurately describe the problem. The latter strategy uses a small number of points for low dimensional problems, but its computational cost increases in higher dimensions due to the need for extra grid points.
8.1. Particle filtering

This technique achieves spatial quantization by drawing a set of random samples from the disturbance distribution. Thus, a discrete approximation to the posterior distribution is generated which is based on a set of randomly chosen points. The approximation converges, in probability, with order 1/√N, where N is the number of chosen samples (Crisan & Doucet, 2002). The main disadvantage of this strategy is that a large number of points may be needed. Also, these points need, in principle, to be related to the distribution of interest, and the method suffers from degeneracy of the particles. Also, the number of points will grow exponentially with time unless some form of reduction is used. Thus, there are many fixes needed to get this type of algorithm to work in practice. Such fixes include the use of proposal distributions, resampling methods, etc. For details the reader is referred to Chen (2003).
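A minimal bootstrap particle filter step (propagate, weight, resample) for a model of the form (33) and (34) can be sketched as follows. The functions f_i(x) = −x and h_i(x) = x², the observed increment, and all numbers are illustrative, and none of the practical fixes mentioned above are included:

```python
import numpy as np

rng = np.random.default_rng(1)
N, Delta, Q_i, R_i = 2000, 0.1, 1.0, 1.0       # assumed sizes and spectral densities

# Particles approximating p(x_k | Y_k): Gaussian cloud around 2.0 (assumed).
x = rng.normal(2.0, 0.5, N)

# Time update, Eq. (33): x_{k+1} = x_k + f_i(x_k) Delta + noise, f_i(x) = -x.
x = x - x * Delta + rng.normal(0, np.sqrt(Q_i * Delta), N)

# Measurement update, Eq. (34): weight by p(dz+ | x) with h_i(x) = x**2.
dz_obs = (1.8**2) * Delta                       # a hypothetical observed increment
w = np.exp(-0.5 * (dz_obs - x**2 * Delta)**2 / (R_i * Delta))
w /= w.sum()

# Resample (multinomial) to obtain equally weighted particles.
x = rng.choice(x, size=N, p=w)
print(x.mean())                                 # posterior mean estimate
```

For the event based update (49), the Gaussian weight above would simply be replaced by the interval probability that dz+ lies in the known quantization cell.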
8.2. Minimum distortion filtering (MDF)

This is a new class of algorithm. It was first described in Goodwin et al. (2010). The MDF algorithm belongs to the class of deterministic gridding methods. There exist many algorithms within this framework. Some of them use fixed grid methods, where the choice of the grid is based on a priori information regarding the problem and is never updated. Another method is presented in Bucy and Senne (1971), where a gridding method based on the mean plus an ellipsoid determined by the covariance of the probability density function is used. Other approaches include adaptive uniform grid methods, e.g. Bergman (1998), where a uniform grid with adaptive resolution is used; the grid is also relocated depending on the likelihood of the current grid points. By contrast, the MDF algorithm is a method where the grid is non-uniform and is adapted at each sampling instant. The adaptation step depends on vector quantization of the current estimate of the probability density function. This technique provides the algorithm with the capacity to relocate grid points where they are most needed. The non-uniform characteristic allows for a tailored location of the grid, without wasting points in unimportant regions, e.g. between the modes of a multimodal distribution. A summary of the algorithm is presented below.
The key idea underlying this class of algorithm is to utilize vector quantization to generate, on-line, a finite approximation to the a-posteriori distribution of the states.
Say that one begins with a discrete approximation to the distribution of x_0 on N_x grid points. Also assume that one has a finite approximation to the distribution of the process noise on N_w grid points. These approximations can be generated off-line. Then, utilizing the discretized version of Eq. (41), one obtains a finite approximation to p(x_1) on N_x N_w grid points. Then, one uses the discrete equivalent of (42) to obtain a finite approximation to p(x_1 | y_1) on N_x N_w points. Finally, one uses vector quantization ideas to re-approximate p(x_1 | y_1) back to N_x points. (How this crucial last step is performed will be described in detail below.) Then, one returns to the beginning to obtain a discrete approximation to p(x_2 | y_1) on N_x N_w points, and so on. The algorithm is summarized in Table 1.
The key step in the MDF algorithm is the vector quantization step (step 5 in Table 1). Details of this step are given below.
Fig. 5. Inter-sample illustration.
Table 1
MDF algorithm.

Step  Description
1     Initialization: Quantize p(x_0) to N_x points {(x_i, p_i); i = 1, …, N_x}. Quantize p(ω) to N_w points {(w_j, q_j); j = 1, …, N_w}
2     Begin with p(x_k | Y_k) represented by {(x_i, p_i); i = 1, …, N_x}
3     Approximate p(x_{k+1} | Y_k) via (41) on N_x N_w points
4     Evaluate p(x_{k+1} | Y_{k+1}) on N_x N_w points via (42)
5     Quantize back to N_x points
6     Go to 2
Assume one has a discrete representation of some distribution p(x), where x ∈ R^n, quantized to a very large (but finite) set of points. The goal is to quantize p(x) to a smaller finite set of points {(x_i, p_i); i = 1, …, N}. The first step in vector quantization is to define a measure to quantify the distortion of a given discrete representation. This measure is then optimized to find the optimal representation which minimizes the cost. In summary, one seeks a finite set W_x = {x_1, …, x_N} and an associated collection of sets S = {S_1, …, S_N} such that ∪_{i=1}^{N} S_i = R^n and S_i ∩ S_j = ∅, i ≠ j. The quantities W_x, S_x are chosen by minimizing a cost function of the form:

J(W_x, S_x) = Σ_{i=1}^{N} E{(x − x_i)^T W (x − x_i) | x ∈ S_i}    (50)
where W = diag(W_1, …, W_N). Other choices of the distance measure can also be used, e.g. Manhattan, L_1, Jaccard, etc.; see Tan, Steinbach, and Kumar (2005).
If x_1, …, x_N (the set of grid points) are given, then the optimal choice of the sets S_i is the, so-called, Voronoi cells (Gersho & Gray, 1992; Graf & Luschgy, 2000):

S_i = {x : (x − x_i)^T W (x − x_i) ≤ (x − x_j)^T W (x − x_j), ∀ j ≠ i}    (51)

Similarly, if the sets S_1, …, S_N are given, then the optimal choice for x_i is the centroid of the set S_i, i.e.

x_i = E{x | x ∈ S_i}    (52)
Many algorithms exist for minimizing functions of the form
(50) to produce a discrete approximation. One class of algorithm
(known as the k-means algorithm or Lloyds algorithm, Gersho &
Gray, 1992; Graf & Lushgy, 2000; Lloyd, 1982) iterates between
the two conditions (51) and (52).
Thus Lloyd's algorithm begins with an initial set of grid points W_x = {x_i; i = 1, ..., N_x}. One then calculates the Voronoi cells S_x of W_x using (51). Next, one computes the centroids of the Voronoi cells S_x via (52). One then returns to the calculation of the associated Voronoi cells, and so on. Lloyd's algorithm iterates these steps until the distortion measure (50) reaches a local minimum, or until the change in the distortion measure falls below a given threshold, i.e.

\frac{\vert J(W_x^{k+1}, S_x^{k+1}) - J(W_x^{k}, S_x^{k}) \vert}{J(W_x^{k}, S_x^{k})} \le \epsilon    (53)

where W_x^k and S_x^k are the codebook and Voronoi cells at iteration k, respectively.
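As an illustration, the iteration between (51), (52) and the stopping test (53) can be sketched in Python as follows. This is a minimal sketch using Euclidean distortion (i.e. W = I in (50)); the random initialization of the codebook and the default tolerances are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def lloyd_quantize(points, probs, n_codes, eps=1e-4, max_iter=100, seed=0):
    """Quantize a discrete distribution {points, probs} down to n_codes
    representatives by iterating the Voronoi step (51) and the centroid
    step (52), stopping via the relative-change test (53)."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook with a random subset of the support (an assumption).
    codes = points[rng.choice(len(points), n_codes, replace=False)]
    prev_cost = np.inf
    for _ in range(max_iter):
        # (51): assign each support point to its nearest code (Voronoi cell), W = I.
        d2 = ((points[:, None, :] - codes[None, :, :]) ** 2).sum(axis=2)
        cell = d2.argmin(axis=1)
        cost = (probs * d2[np.arange(len(points)), cell]).sum()
        # (53): stop when the relative drop in distortion falls below eps.
        if np.isfinite(prev_cost) and abs(prev_cost - cost) <= eps * prev_cost:
            break
        prev_cost = cost
        # (52): move each code to the probability-weighted centroid of its cell.
        for i in range(n_codes):
            mask = cell == i
            if mask.any():
                w = probs[mask]
                codes[i] = (w[:, None] * points[mask]).sum(axis=0) / w.sum()
    # Aggregate the probability mass captured by each code.
    return codes, np.bincount(cell, weights=probs, minlength=n_codes)
```

Applied to a fine grid carrying two well-separated clusters of mass, the routine returns one code near each cluster centroid together with the mass captured by its Voronoi cell.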
In order to obtain satisfactory results with the MDF algorithm, various practical steps are necessary. These include the use of fast sampling, scaling, and clustering; see Goodwin and Cea (2011).
9. Example

Consider the practical problem of radar tracking using range and bearing measurements. Both particle filtering and MDF methods are used below for the spatial quantization step. Also, event based sampling is used and compared with regular sampling.
[Fig. 6 (plots): first moments of x_1 and x_2 versus sample number (0–200) for the true, MDF and PF filters.]
Fig. 6. First moment estimation using MDF, particle and true filters.
[Fig. 7 (plots): second central moments of x_1 and x_2 versus sample number (0–200) for the true, MDF and PF filters.]
Fig. 7. Second central moment estimation using MDF, particle and true filters.
Table 2
Root mean square error.

Algorithm   Mean x_1   Mean x_2   Variance x_1   Variance x_2
MDF         4.043      4.5689     29.109         27.585
PF          6.644      6.870      38.023         46.6718
[Fig. 8 (plots): range, and bearing in radians, versus sample number (0–200) for the true, Lebesgue and Riemann filters.]
Fig. 8. Range and bearing trajectory.
Consider the following two state model:

x_1(k+1) = x_1(k) + \Delta v_1(k) + \omega_1(k)    (54)

x_2(k+1) = x_2(k) + \Delta v_2(k) + \omega_2(k)    (55)

where \Delta = 0.1 is the sampling period and x = [x_1  x_2]^T \in R^2 is the state vector. The input v = [v_1  v_2]^T \in R^2 corresponds to the speed of the object in Cartesian coordinates, and \omega = [\omega_1  \omega_2]^T \in R^2 is process noise (say wind gusts or unmeasured speed variations) with covariance:

Q_d = \begin{bmatrix} 100 & 0 \\ 0 & 100 \end{bmatrix} \Delta    (56)
The range and bearing measurements are given by the following equations (Floudas, Polychronopoulos, & Amditis, 2005):

y_1(k) = \sqrt{x_1(k)^2 + x_2(k)^2} + \nu_1(k)    (57)

y_2(k) = \arctan\left(\frac{x_1(k)}{x_2(k)}\right) + \nu_2(k)    (58)
The measurement vector is thus y = [y_1  y_2]^T \in R^2, and the measurement noise \nu = [\nu_1  \nu_2]^T \in R^2 is taken to have covariance:

R_d = \begin{bmatrix} 0.6 & 0 \\ 0 & 0.06 \end{bmatrix} \frac{1}{\Delta}    (59)
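For concreteness, the state recursion (54)–(55) together with the range/bearing measurements (57)–(58) can be simulated as follows. This is a minimal sketch; the constant speed input v used here is an illustrative assumption, not a value taken from the paper.

```python
import numpy as np

def simulate(n_steps, delta=0.1, seed=0):
    """Simulate the state recursion (54)-(55) and the range/bearing
    measurements (57)-(58) with noise covariances (56) and (59)."""
    rng = np.random.default_rng(seed)
    Qd = np.diag([100.0, 100.0]) * delta   # process noise covariance, eq. (56)
    Rd = np.diag([0.6, 0.06]) / delta      # measurement noise covariance, eq. (59)
    x = np.array([35.0, 23.0])             # initial state (the paper's prior mean)
    v = np.array([5.0, 2.0])               # assumed constant speed input (illustrative)
    xs, ys = [], []
    for _ in range(n_steps):
        # (54)-(55): x(k+1) = x(k) + Delta * v(k) + omega(k)
        x = x + delta * v + rng.multivariate_normal(np.zeros(2), Qd)
        # (57): range; (58): bearing, arctan(x1/x2) written as arctan2(x1, x2)
        y = np.array([np.hypot(x[0], x[1]), np.arctan2(x[0], x[1])])
        y = y + rng.multivariate_normal(np.zeros(2), Rd)
        xs.append(x.copy())
        ys.append(y)
    return np.array(xs), np.array(ys)
```

Running `simulate(200)` produces a 200-sample state trajectory and the corresponding noisy range/bearing record of the kind plotted in Fig. 8.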
The MDF tuning parameters are taken to be N_x = 49, N_w = 9, \zeta = 10^{-20} and \epsilon = 10%. For the particle filter, 1000 particles were used. This yields approximately equal computational load per sample for the MDF and particle methods. Both filters used the same initial condition for the state, i.e. a Gaussian distribution with \hat{x}_0 = [35  23]^T and covariance \hat{P}_0 = \begin{bmatrix} 100 & 0 \\ 0 & 100 \end{bmatrix}.
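A bootstrap particle filter of the kind used for comparison can be sketched as follows, assuming the model (54)–(59). The Gaussian likelihood and plain multinomial resampling shown here are generic choices and may differ in detail from the implementation actually used in the paper.

```python
import numpy as np

def bootstrap_pf_step(particles, y, v, delta, Qd, Rd, rng):
    """One predict/weight/resample cycle of a bootstrap particle filter
    for the model (54)-(59); particles has shape (N, 2)."""
    n = len(particles)
    # Predict every particle through the state equations (54)-(55).
    particles = particles + delta * v + rng.multivariate_normal(np.zeros(2), Qd, size=n)
    # Predicted range (57) and bearing (58) for every particle.
    pred = np.column_stack([np.hypot(particles[:, 0], particles[:, 1]),
                            np.arctan2(particles[:, 0], particles[:, 1])])
    # Gaussian likelihood of the measurement under noise covariance Rd.
    e = y - pred
    w = np.exp(-0.5 * np.sum((e @ np.linalg.inv(Rd)) * e, axis=1))
    w /= w.sum()
    # Multinomial resampling (a generic choice).
    return particles[rng.choice(n, size=n, p=w)]
```

With 1000 particles initialized from the Gaussian prior above, repeated calls to this step track the moving target, and the particle mean serves as the state estimate compared against MDF in Table 2.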
Figs. 6 and 7 show the mean and variance of the state estimate. As can be seen, the MDF and particle filters give similar results. Moreover, these results are almost identical to the true estimates. The latter were computed (for comparison purposes only) using a very fine gridding of the state space.

Table 2 shows the root mean square error for the mean and variance estimates using the MDF and PF algorithms. These results show that the performance of the MDF algorithm is better than that obtained by PF methods.
Next, regular (Riemann) sampling and event based (Lebesgue) sampling are compared. For the former, 8 bits were utilized to represent each sample and one sample was taken per second. For the latter, the quantization thresholds were set at 50 and 0.9 for range and bearing respectively. Fig. 8 compares the reconstructed range and bearing for the two filters. Fig. 9 shows the sampling times for range (upper two traces) and bearing (lower two traces).

It can be seen from Fig. 8 that the estimates produced by Lebesgue sampling are very close to those produced by Riemann sampling. This occurs despite the obvious difference in sampling rates shown in Fig. 9. Indeed, the Riemann sampling strategy uses 8 bits/sample and 1 sample/s, i.e. a data rate of 8 bits/s. On the other hand, the Lebesgue sampling strategy uses only 1 bit/sample (up or down) at an average of 0.2 samples/s. The latter corresponds to an average data rate of 0.2 bits/s, which is 40 times less than that used in the case of regular sampling.
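The event (Lebesgue) sampling rule described above, in which only a 1-bit up/down symbol is transmitted when the measured variable crosses a threshold, can be sketched as follows. This is a minimal illustration; the receiver-side reconstruction that steps the level by one threshold per event is an assumed convention, not a detail specified in the paper.

```python
def lebesgue_events(signal, threshold):
    """Emit (sample index, direction) pairs: an event fires when the signal
    drifts by at least `threshold` from the last reconstructed level, and
    only the 1-bit direction (+1 up, -1 down) needs to be transmitted."""
    events = []
    level = signal[0]  # both transmitter and receiver agree on the initial level
    for k in range(1, len(signal)):
        if abs(signal[k] - level) >= threshold:
            direction = 1 if signal[k] > level else -1
            events.append((k, direction))
            # The receiver steps its reconstructed level by one threshold per event.
            level += direction * threshold
    return events
```

For the range channel with threshold 50, a slowly varying signal then generates far fewer transmissions than one sample per second, which is the mechanism behind the 40:1 data-rate saving quoted above.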
10. Conclusions

This paper has described the use of event based sampling in the context of non-linear filtering. Special issues regarding the choice of anti-aliasing filter have been addressed. Also, a realistic example has been presented showing that the required data rate can be reduced by more than an order of magnitude (40:1 for the given example) whilst retaining essentially the same estimation accuracy.
References

Anta, A., & Tabuada, P. (2008). Self-triggered stabilization of homogeneous control systems. In: American control conference (pp. 4129–4134). IEEE.
Anta, A., & Tabuada, P. (2009). On the benefits of relaxing the periodicity assumption for networked control systems over CAN. In: 30th IEEE real-time systems symposium (pp. 3–12). IEEE.
Anta, A., & Tabuada, P. (2010). To sample or not to sample: Self-triggered control for nonlinear systems. IEEE Transactions on Automatic Control, 55(9), 2030–2042.
Årzén, K. (1999). A simple event-based PID controller. In: Proceedings of the 14th IFAC world congress, Vol. 18.
Åström, K., & Bernhardsson, B. (2002). Comparison of Riemann and Lebesgue sampling for first order stochastic systems. In: Proceedings of the 41st IEEE conference on decision and control, Vol. 2.
Åström, K. J., & Wittenmark, B. (1990). Computer controlled systems: Theory and design (2nd ed.). Englewood Cliffs, NJ: Prentice Hall.
Bergman, N. (1998). An interpolating wavelet filter for terrain navigation. In: Proceedings of the conference on multisource–multisensor information fusion (pp. 251–258).
Bucy, R., & Senne, K. (1971). Digital synthesis of non-linear filters. Automatica, 7(3), 287–298.
Byrnes, C., & Isidori, A. (1989). New results and examples in nonlinear feedback stabilization. Systems & Control Letters, 12(5), 437–442.
Cea, M., Goodwin, G., & Feuer, A. (2010). A discrete nonlinear filter for fast sampled problems based on vector quantization. In: American control conference (ACC), July (pp. 1399–1403).
[Fig. 9 (plot): sampling instants versus discrete time index k (0–200) for the Lebesgue and Riemann schemes, shown for both range and bearing.]
Fig. 9. Sampling instants.
Chen, Z. (2003). Bayesian filtering: From Kalman filters to particle filters, and beyond. Available at: http://users.isr.ist.utl.pt/~jpg/tfc0607/chen_bayesian.pdf.
Crisan, D., & Doucet, A. (2002). A survey of convergence results on particle filtering methods for practitioners. IEEE Transactions on Signal Processing, 50(3), 736–746.
Feuer, A., & Goodwin, G. (1996). Sampling in digital signal processing and control. Boston, MA: Birkhäuser.
Floudas, N., Polychronopoulos, A., & Amditis, A. (2005). A survey of filtering techniques for vehicle tracking by radar equipped automotive platforms. In: 8th international conference on information fusion, July 2005 (Vol. 2, p. 8).
Gersho, A., & Gray, R. M. (1992). Vector quantization and signal compression. Springer International Series in Engineering and Computer Science.
Goodwin, G., Agüero, J., Salgado, M., & Yuz, J. I. (2009). Variance or spectral density in sampled data filtering? In: 4th international conference on optimization and control with applications (OCA2009), 6–11 June, Harbin, China.
Goodwin, G., & Cea, M. G. (2011). Temporal and spatial quantization in nonlinear filtering. In: 4th international symposium on advanced control of industrial processes, 23–27 May.
Goodwin, G. C., Feuer, A., & Müller, C. (2010). Sequential Bayesian filtering via minimum distortion filtering. In: Three decades of progress in control sciences (1st ed.). Springer.
Goodwin, G. C., Middleton, R. H., & Poor, H. V. (1992). High-speed digital signal processing and control. Proceedings of the IEEE, 80(2), 240–259.
Graf, S., & Luschgy, H. (2000). Foundations of quantization for probability distributions. Lecture notes in mathematics, Vol. 1730. Springer.
Handschin, J., & Mayne, D. (1969). Monte Carlo techniques to estimate the conditional expectation in multi-stage non-linear filtering. International Journal of Control, 9(5), 547–559.
Hristu-Varsakelis, D., & Levine, W. (2005). Handbook of networked and embedded control systems. Birkhäuser.
Jazwinski, A. (1970). Stochastic processes and filtering theory. San Diego, CA: Academic Press.
Le, A., & McCann, R. (2007). Event-based measurement updating Kalman filter in network control systems. In: 2007 IEEE region 5 technical conference (pp. 138–141).
Lloyd, S. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, IT-28, 127–135.
Marck, J. W., & Sijs, J. (2010). Relevant sampling applied to event-based state estimation. In: Proceedings of the 4th international conference on sensor technologies and applications (SENSORCOMM) (pp. 618–624).
McCann, R., & Le, A. T. (2008). Lebesgue sampling with a Kalman filter in wireless sensors for smart appliance networks. In: Conference record, IAS annual meeting. IEEE Industry Applications Society.
Middleton, R., & Goodwin, G. C. (1990). Digital control and estimation: A unified approach. Englewood Cliffs, NJ: Prentice Hall.
Otanez, P., Moyne, J., & Tilbury, D. (2002). Using deadbands to reduce communication in networked control systems. In: American control conference, Vol. 4.
Pawlowski, A., Guzmán, J. L., Rodríguez, F., Berenguel, M., Sánchez, J., & Dormido, S. (2009). The influence of event-based sampling techniques on data transmission and control performance. In: ETFA IEEE conference on emerging technologies and factory automation.
Schön, T. B. (2006). Estimation of nonlinear dynamic systems: Theory and applications. Ph.D. Thesis, Linköping Studies in Science and Technology. http://www.control.isy.liu.se/research/reports/Ph.D.Thesis/PhD998.pdf.
Sijs, J., & Lazar, M. (2009). On event based state estimation. Lecture notes in computer science, Vol. 5469.
Tabuada, P. (2007). Event-triggered real-time scheduling of stabilizing control tasks. IEEE Transactions on Automatic Control, 52(9), 1680–1685.
Tan, P.-N., Steinbach, M., & Kumar, V. (2005). Introduction to data mining. Addison Wesley.
Xu, Y., & Cao, X. (2011). Lebesgue-sampling-based optimal control problems with time aggregation. IEEE Transactions on Automatic Control, 56(5), 1097–1109.