Unscented Kalman Filter for Dummies - Robotics Stack Exchange
I need some help here because I can't figure out how the Unscented Kalman Filter works. I've searched for
examples, but all of them are too hard to understand.
Could someone please explain how it works step by step with a trivial example, like position estimation,
sensor fusion, or something else?
kalman-filter
asked Feb 23, 2016 at 22:43 by Thiago Bezerra; edited Jun 5, 2020 at 13:10 by heretoinfinity
I'm going to give you a high-level overview without going into much math. The purpose here is to give
you a somewhat intuitive understanding of what is going on, and hopefully this will help the more
mathematical resources make more sense. I'm mostly going to focus on the unscented transform, and
how it relates to the UKF.
Random variables
Alright, the first thing you need to understand is that modern (mobile) robotics is probabilistic; that is, we
represent things we aren't sure about by random variables. I've written about this topic on my personal
website, and I strongly suggest you check it out for a review. I'm going to briefly go over it here.
Take the (one-dimensional) position of a robot for example. Usually we use sensors (e.g., wheel encoders,
laser scanners, GPS, etc.) to estimate the position, but all of these sensors are noisy. In other words, if our
GPS tells us we are at x = 10.2 m, there is some uncertainty associated with that measurement. The
most common way we model this uncertainty is using Gaussian (also known as Normal) distributions. The
probability density function (don't worry about what this means for now) of a Gaussian distribution
looks like this:
[Figure: bell-shaped curve of a Gaussian probability density function]
On the x axis are different values for your random variable. For example, this could be the one
dimensional position of the robot. The mean position is denoted by μ, and the standard deviation is
denoted by σ , which is a sort of "plus/minus" applied to the mean. So instead of saying "the robot's
position is x = 9.84 m", we say something along the lines of "the mean of our estimate of the robot's
position is x = 9.84 m with a standard deviation of 0.35 m".
On the y axis is the relative likelihood of that value occurring. For example, note that x = μ has a relative
likelihood of about 0.4, and x = μ − 2σ has a relative likelihood of about 0.05. And let's pretend that
μ = 9.84 m and σ = 0.35 m like the above example. That means μ − 2σ = 9.84 − 2(0.35) = 9.14 m.
What the y-axis values are telling you is that the likelihood that the robot is at 9.84 m is about eight times higher
than at 9.14 m, because the ratio of their likelihoods is 0.4/0.05 = 8.
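As a quick check on that ratio, here is a minimal Python sketch that evaluates the Gaussian density directly. The helper name `gaussian_pdf` is just for illustration. Note that the values 0.4 and 0.05 read off the plot correspond to σ = 1; with σ = 0.35 both densities are scaled by 1/σ, but the ratio is unaffected.

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 9.84, 0.35

p_at_mean = gaussian_pdf(mu, mu, sigma)               # density at x = mu
p_two_sigma = gaussian_pdf(mu - 2.0 * sigma, mu, sigma)  # density at x = mu - 2*sigma

# The ratio depends only on how many standard deviations apart the points are:
# exp(0) / exp(-2) = exp(2), which is about 7.39.
ratio = p_at_mean / p_two_sigma
print(f"ratio = {ratio:.2f}")
```

The exact ratio is e² ≈ 7.39; the figure of 8 above comes from rounding the two values read off the plot.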
The takeaway point from this section is that the things we are often most interested in (e.g., position,
velocity, orientation) are random variables, most often represented by Gaussian distributions.
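Gaussian random variables have a convenient property when they are passed through linear functions. If we
have a function like
y = 12x − 7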
and x is a Gaussian random variable, then y is also a Gaussian random variable (but with a different
mean and standard deviation). On the other hand, this property does not hold for nonlinear functions,
such as
$$y = 3\sin(x) - 2x^2.$$
Here, passing the Gaussian random variable x through the function results in a non-Gaussian distribution
for y. In other words, the shape of the PDF for y would not look like the above plot. But what do I mean by
"passing a Gaussian random variable through a function"? In other words, what do I put in for x in the
above equation? As we saw in the above plot, x has many possible values.
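To make the linear case concrete, take the same μ = 9.84 m and σ = 0.35 m used earlier. A linear function shifts and scales the mean, and scales the standard deviation by the magnitude of the slope, so for y = 12x − 7:

$$\mu_y = 12\mu - 7 = 12(9.84) - 7 = 111.08, \qquad \sigma_y = |12|\,\sigma = 12(0.35) = 4.2.$$

There is no such closed-form shortcut for the nonlinear function, which is why it has to be evaluated at specific values of x.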
As before, let's say x has mean μ = 9.84 m and standard deviation σ = 0.35 m. We are interested in
approximating y as a Gaussian random variable. One way to do this would be calculating a whole bunch
of ys for different xs. Let's calculate y for x = μ, x = μ + 0.1σ, x = μ − 0.1σ, x = μ + 0.2σ, etc. I've
tabulated some results below.
x            y = 3 sin(x) − 2x²
μ            −194.5
μ + 0.1σ     −195.9
μ − 0.1σ     −193.0
μ + 0.2σ     −197.3
μ − 0.2σ     −191.6
⋮            ⋮
μ + 10σ      −354.5
μ − 10σ      −80.3
Although the values of y do not form a Gaussian distribution, we create one by taking the mean of all the
y values to get μy (in this case μy = −201.8), and we calculate the standard deviation the usual way:
$$\sigma_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \mu_y)^2} = 81.2.$$
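The brute-force procedure above can be sketched in a few lines of Python. This assumes the grid implied by the table: steps of 0.1σ from μ − 10σ to μ + 10σ, with every point weighted equally. The variable names are illustrative.

```python
import math
import statistics

def f(x):
    """The nonlinear function from the example: y = 3 sin(x) - 2 x^2."""
    return 3.0 * math.sin(x) - 2.0 * x ** 2

mu, sigma = 9.84, 0.35

# Evenly spaced grid: x = mu + k * 0.1 * sigma for k = -100, ..., 100 (201 points).
xs = [mu + 0.1 * k * sigma for k in range(-100, 101)]
ys = [f(x) for x in xs]

mu_y = statistics.mean(ys)      # plain average of the y values
sigma_y = statistics.stdev(ys)  # sample standard deviation (divides by n - 1)

print(f"mu_y = {mu_y:.1f}, sigma_y = {sigma_y:.1f}")
```

Under these assumptions the result should land close to the values quoted above. Even for this one-dimensional case it takes 201 function evaluations.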
So voila! We have come up with a way to pass Gaussian random variables through nonlinear functions.
Problem solved, right? Well, not necessarily. It took a lot of computations to come up with our solution,
so solving problems this way when our data is streaming in many times per second may not be possible.
And how did we choose how spread out our values of x were? And why did we stop at μ ± 10σ ?
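The unscented transformation addresses these questions by strategically picking a small number of points
from the input distribution. In fact, for the problem I described, the unscented transformation requires you to pass exactly three
points (called sigma points) through the nonlinear function: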
x = μ, x = μ + ασ, x = μ − ασ,
where α depends on the dimensionality of the random variable (one in this case) and other scaling
factors. In other words, you pass the mean and one point on each side of the mean through the
nonlinear function. Then you calculate the mean and standard deviation of the result to approximate y as
a random variable.
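Here is a minimal one-dimensional sketch of that idea in Python. The answer does not specify the scaling factor or how the three outputs are averaged, so this sketch assumes the classic unscented-transform weights with a tuning parameter `kappa = 2`; the function name `unscented_transform_1d` is hypothetical. The offset `sqrt(n + kappa) * sigma` plays the role of the scaled offset on each side of the mean.

```python
import math

def unscented_transform_1d(f, mu, sigma, kappa=2.0):
    """Approximate the mean and std of y = f(x) for x ~ N(mu, sigma^2).

    Assumption: classic sigma-point weights with tuning parameter kappa.
    """
    n = 1  # dimensionality of the random variable
    spread = math.sqrt(n + kappa) * sigma

    # Three sigma points: the mean and one point on each side of it.
    points = [mu, mu + spread, mu - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]

    ys = [f(x) for x in points]

    # Weighted mean and weighted variance of the transformed points.
    mu_y = sum(w * y for w, y in zip(weights, ys))
    var_y = sum(w * (y - mu_y) ** 2 for w, y in zip(weights, ys))
    return mu_y, math.sqrt(var_y)

def f(x):
    """The nonlinear function from the example: y = 3 sin(x) - 2 x^2."""
    return 3.0 * math.sin(x) - 2.0 * x ** 2

mu_y, sigma_y = unscented_transform_1d(f, 9.84, 0.35)
print(f"mu_y = {mu_y:.2f}, sigma_y = {sigma_y:.2f}")
```

With these assumptions the mean comes out at roughly −195 and the standard deviation at roughly 15. That differs from the brute-force numbers (−201.8 and 81.2) because the brute-force grid gave equal weight to every point out to μ ± 10σ, whereas the sigma points stay within a couple of standard deviations of the mean. For a linear function such as y = 12x − 7 the sketch returns the mean 12μ − 7 and standard deviation 12σ, up to floating-point rounding.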
In the prediction step of a Kalman filter, we have a motion model that looks something like
$$x_{k+1} = f(x_k, u_k)$$
where $u_k$ is the input, and both $x_k$ and $u_k$ are (Gaussian) random variables. Now in a regular Kalman
filter, $f(x_k, u_k)$ is a linear function, which results in $x_{k+1}$ being a Gaussian random variable. However, it
is often the case that $f(x_k, u_k)$ is nonlinear. So what do we do? We calculate sigma points, pass them
through the motion model, and then calculate the mean and variance of the result. This gives us our
approximate estimate of $x_{k+1}$ as a Gaussian random variable.
The correction step mostly works the same way. This time you have a measurement model that looks
something like
$$z_k = h(x_k)$$
where $h$ may be nonlinear. So how do we calculate our predicted measurement z? You guessed
it, we use an unscented transformation again. I won't go into details on how you update the state from
here (it's outside of the scope of my "intuitive" description), but it's relatively straightforward.
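For readers who do want to see the update written out, here is a minimal scalar sketch of one predict-and-correct cycle in Python. It goes beyond what the description above covers, so treat it as an illustration under stated assumptions rather than a reference implementation: the state is one-dimensional, the input `u` is treated as a known value, process and measurement noise are additive with variances `q` and `r`, and the sigma points use the classic weights with `kappa = 2`. All function and parameter names are hypothetical.

```python
import math

def sigma_points(mean, var, kappa=2.0):
    """Three sigma points and weights for a scalar Gaussian (assumed kappa)."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    return points, weights

def ukf_step(mean, var, u, z, f, h, q, r):
    """One scalar UKF cycle: predict with f(x, u), then correct with measurement z."""
    # Prediction: pass sigma points through the motion model.
    pts, wts = sigma_points(mean, var)
    fx = [f(x, u) for x in pts]
    mean_pred = sum(w * x for w, x in zip(wts, fx))
    var_pred = sum(w * (x - mean_pred) ** 2 for w, x in zip(wts, fx)) + q

    # Correction: redraw sigma points and pass them through the measurement model.
    pts, wts = sigma_points(mean_pred, var_pred)
    hz = [h(x) for x in pts]
    z_pred = sum(w * zi for w, zi in zip(wts, hz))

    # Innovation variance and state-measurement cross covariance.
    s = sum(w * (zi - z_pred) ** 2 for w, zi in zip(wts, hz)) + r
    p_xz = sum(w * (x - mean_pred) * (zi - z_pred)
               for w, x, zi in zip(wts, pts, hz))

    gain = p_xz / s  # Kalman gain
    mean_new = mean_pred + gain * (z - z_pred)
    var_new = var_pred - gain * s * gain
    return mean_new, var_new

# Usage with linear models, where the result matches an ordinary Kalman filter.
m, v = ukf_step(mean=0.0, var=1.0, u=1.0, z=2.0,
                f=lambda x, u: x + u, h=lambda x: x, q=0.5, r=0.5)
print(m, v)
```

With the linear models in the usage example, the predicted state has mean 1 and variance 1.5, the gain is 0.75, and the corrected estimate has mean 1.75 and variance 0.375, which is what a regular Kalman filter gives. Swapping in nonlinear `f` and `h` requires no change to `ukf_step`.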
Summary
1. The state (what we are estimating) and the measurements come from "real-life" sensors, which all
have noise. We model this uncertainty by representing them as Gaussian random variables.
2. Passing Gaussian random variables through linear functions results in other Gaussian random
variables. Passing them through nonlinear functions does not.
3. One method of approximating the result of passing a Gaussian random variable through a nonlinear
function as a Gaussian random variable is by sampling the input variable at a bunch of different
points, passing those through the nonlinear function, and then considering the mean and standard
deviation of the resulting output as a Gaussian distribution.
4. The unscented transformation strategically picks points from the distribution of the input variable that
keep the most information about the distribution and passes them through the nonlinear function,
then calculates the mean and standard deviation of the output.
5. The unscented Kalman filter uses the unscented transformation to pass Gaussian random variables
through the motion and measurement models.
answered Feb 24, 2016 at 2:21 by kamek
Very helpful intuition. I wish to have further treatment on how the estimated Gaussian distribution by the unscented
transformation can lead to the estimation of the state update. (i.e. interpret Kalman filter in light of transformation
to the Gaussian distribution.) – Yu Shen Jul 27, 2017 at 13:24