M3 Kalman Filter Equations
Girish Chowdhary
Associate Professor,
Director, Distributed Autonomous Systems Lab
University of Illinois at Urbana-Champaign
www.daslab.illinois.edu
p(A) denotes the probability that the event A is true.
Axioms of probability
(Discrete) random variable (RV): a variable that can take one of many values
∑_{x∈X} p(x) = 1 (something happens)
e.g. p(even die roll) = 3/6
For RV X able to take any value x_i and RV Y able to take any value y_i:
Question: what is the joint probability that I will get "snake eyes", i.e. x_i = 1, y_i = 1?
p(A) = ∑_b p(A, B = b) = ∑_b p(A|B = b) p(B = b)
Sum rule
p(X) = ∑_Y p(X, Y) = ∑_y p(X|Y = y) p(Y = y)
Product rule
p(X , Y ) = p(Y |X )p(X )
Bayes Theorem
p(X|Y) = p(X, Y)/p(Y) = p(Y|X) p(X)/p(Y)
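As a quick numeric illustration of the sum rule and Bayes theorem, here is a minimal Python sketch; the obstacle-detection scenario and all of the numbers in it are hypothetical, chosen only for illustration.

```python
# Hypothetical prior and likelihoods for a binary "obstacle detected" sensor.
p_obstacle = 0.1                      # p(X): prior probability an obstacle is present
p_detect_given_obstacle = 0.9         # p(Y|X): sensor fires when an obstacle is present
p_detect_given_clear = 0.2            # p(Y|not X): false-alarm rate

# Sum rule: p(Y) = sum over X of p(Y|X) p(X)
p_detect = (p_detect_given_obstacle * p_obstacle
            + p_detect_given_clear * (1.0 - p_obstacle))

# Bayes theorem: p(X|Y) = p(Y|X) p(X) / p(Y)
p_obstacle_given_detect = p_detect_given_obstacle * p_obstacle / p_detect
print(p_obstacle_given_detect)        # about 0.33
```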
p(x) ≥ 0
∫_{−∞}^{∞} p(x) dx = 1
Useful identity: Variance
Var[f] = E[f(x)²] − E[f(x)]²
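A short Python check of this identity on sampled data; the choice of distribution and of f(x) = sin(x) is arbitrary and only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100_000)    # samples of x
f = np.sin(x)                                       # an arbitrary f(x)

var_direct   = np.mean((f - np.mean(f))**2)         # Var[f] by definition
var_identity = np.mean(f**2) - np.mean(f)**2        # E[f^2] - E[f]^2

print(var_direct, var_identity)   # the two agree up to floating-point/sampling error
```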
Figure: multivariate normal distribution (image source: https://upload.wikimedia.org/wikipedia/commons/8/8e/MultivariateNormal.png)
Goals
State
The current state, the control inputs, and the system model completely define the future response.
State Variables: the variables that make up the state of the dynamic system. If n variables (x1, x2, . . . , xn) are required to completely describe the behavior of the system, then these n variables form a set of state variables.
Assume that a system has n state variables, m inputs, and r outputs. Then the general form of the state space model for the system is
ẋ1(t) = f1(x1, x2, . . . , xn; u1, u2, . . . , um; t)
ẋ2(t) = f2(x1, x2, . . . , xn; u1, u2, . . . , um; t)
..
.
ẋn(t) = fn(x1, x2, . . . , xn; u1, u2, . . . , um; t)
These first-order differential equations in the state variables are called state differential equations or state space equations.
They can be nonlinear. The outputs are
y1 (t) = g1 (x1 , x2 , . . . , xn ; u1 , u2 , . . . , um ; t)
y2 (t) = g2 (x1 , x2 , . . . , xn ; u1 , u2 , . . . , um ; t)
..
.
yr (t) = gr (x1 , x2 , . . . , xn ; u1 , u2 , . . . , um ; t)
• In compact vector form, we can write the state differential equations and
outputs as
ẋ(t) = f (x, u, t)
y (t) = g (x, u, t)
where x(t) is the state vector, u(t) is the input vector, and y (t) is the
output vector.
If the system dynamics is linear and time-invariant, then the state space model takes the form
ẋ(t) = Ax(t) + Bu(t)
y(t) = Cx(t) + Du(t)
State models are not unique: One can choose many different sets of
state variables to describe the system in state space form
Commonly used variables include:
variables associated with energy storage elements;
sets including one variable and its successive derivatives;
variables that give a canonical (standard) form for the state space
representation
[Figure: mass-spring-damper system with mass m, spring constant k, damper b, displacement x, and applied force u(t)]
• The state space model will depend on the set of state variables that are
chosen.
• The variables associated with energy storage elements are the force in the
spring Fk (t) and the velocity of the mass v (t).
Elemental equations
x1 = Fk
x2 = v
and assume the measured output is the displacement of the mass. Then the
state variable equations and output equation are
ẋ1 = x2    (11)
ẋ2 = −(k/m) x1 − (b/m) x2 + (1/m) u    (12)
y = x1    (13)
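A minimal simulation sketch of this model in Python using forward-Euler integration; the numerical values of m, k, b, the time step, and the initial condition are hypothetical. Here the state x1 is treated as the mass displacement, which is the reading consistent with equations (11)-(13).

```python
import numpy as np

# Hypothetical parameters for the mass-spring-damper example
m, k, b = 1.0, 4.0, 0.5
dt, T = 0.001, 10.0

A = np.array([[0.0, 1.0],
              [-k / m, -b / m]])        # from eqs. (11)-(12)
B = np.array([0.0, 1.0 / m])
C = np.array([1.0, 0.0])                # output y = x1, eq. (13)

x = np.array([1.0, 0.0])                # initial condition x0
ys = []
for _ in range(int(T / dt)):
    u = 0.0                             # unforced response; substitute any input signal
    x = x + dt * (A @ x + B * u)        # forward-Euler step of x_dot = Ax + Bu
    ys.append(C @ x)
```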
• The solution of the LTI state equation is
x(t) = e^{At} x0 + ∫_0^t e^{A(t−τ)} B u(τ) dτ
Notice that the solution consists of two parts: the first part is due to the initial condition x0, and the second part is due to the external input u(t).
Φ(t) := e^{At}    (20)
• For the scalar case we have
e^{at} = 1 + at + (at)²/2! + (at)³/3! + · · ·    (22)
• For the matrix case we have
e^{At} = I + At + (At)²/2! + (At)³/3! + · · ·    (23)
where I denotes the n × n identity matrix.
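The series in (23) can be checked numerically; below is a small Python sketch comparing a truncated series against SciPy's matrix exponential. The matrix A and the time t are placeholder values.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-4.0, -0.5]])            # placeholder system matrix
t = 0.1

# Truncated power series from eq. (23): e^{At} ~ I + At + (At)^2/2! + ...
Phi_series = np.eye(2)
term = np.eye(2)
for n in range(1, 20):
    term = term @ (A * t) / n           # builds (At)^n / n!
    Phi_series += term

Phi_exact = expm(A * t)                 # library matrix exponential for comparison
print(np.max(np.abs(Phi_series - Phi_exact)))   # essentially zero
```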
ẋ = Ax + Bu
Consider the full state feedback case: C = I.
Then the state feedback controller always has the form
u = −Kx
and the closed-loop dynamics become
ẋ = (A − BK)x
x_1 = Φ x_0    (27)
x_2 = Φ x_1 = Φ² x_0    (28)
⋮    (29)
x_n = Φ x_{n−1} = Φⁿ x_0    (30)
In this class we will learn about Kalman Filters. The learning outcomes are:
Understand the Origin, need, and applications of Kalman Filters (KF)
Learn about applications in robotics
Get familiar with foundations of necessary probability theory
Get familiar with the terminology, including process noise, measurement
noise, predictive covariance, Gaussian white noise etc.
Mathematically formulate the Kalman filtering problem
Understand the KF algorithm and how to tune the filter
Recognize when advanced filtering techniques are necessary and when
KF assumptions are violated
Figure: Noisy data (noisy obs) and mean m(t).
Figure: Estimate of the mean m̄(t) with 2Σ predictive covariance.
Why is the Kalman Filter so popular?
It works!
Well... it works well enough for many real-world applications.
Why?
Sensor noise does tend to be Gaussian in the limit of data received
(central limit theorem)
Most systems behave linearly in local regions
Kalman filter utilizes feedback, which makes it robust to uncertainties
It's convenient to implement in an online manner to process streaming data
Amenable to real-time implementation for many problems
The Kalman filter finds many applications across pretty much all important scientific disciplines.
An early application was trajectory estimation for the Apollo spacecraft.
Since then, it has been applied to many dynamical system state estimation problems, including cellphones, GPS, weather monitoring, precision agriculture, digital cameras, and ag-sensors.
The Kalman filter will return a mean (average) of the quantity being estimated and provide a predictive variance on its estimate, given that:
The process and measurement noise variances are known
The dynamical system model is known
The Kalman filter is guaranteed to be the optimal unbiased estimator for the following case:
The noise in the sensors is Gaussian
The dynamical system is linear
A sufficient number of states of the dynamical system is measured (the system is observable)
If these assumptions are violated, the Kalman filter will be sub-optimal.
ẋ = Ax + B ω(t)    (32)
y = Hx + v(t)    (33)
x_{k+1} = Φ_k x_k + Γ_k ω_k    (34)
y_k = H_k x_k + v_k    (35)
x ∈ R^{n×1}: state vector
ω ∈ R^{n×1}: (additive) process noise
y ∈ R^{l×1}: sensor measurements
v ∈ R^{l×1}: (additive) measurement noise
Φ_k ∈ R^{n×n}: discretized state transition matrix
Γ_k ∈ R^{n×m}: discretized input matrix
H_k ∈ R^{l×n}: output matrix
So we have
ω_k = ∫_{t_k}^{t_k+Δt} e^{A(t_{k+1}−λ)} B(λ) ω(λ) dλ    (37)
Hence
x̂_k^- = Φ_k x̂_{k−1}^+    (46)
P_{k+1}^- = Φ_k P_k^+ Φ_k^T + Q_k    (47)
Remember that we are propagating the mean x̂_k. Now we use the observer equation to set up the correction:
x̂_{k+1}^+ = Φ_k x_k + L_k (y_k − H_k x̂_k^-)    (48)
So now
x̂_k^+ = x̂_k^- + L_k δy_k^-    (49)
Innovation term
δy_k^- = y_k − ŷ_k^- = y_k − H_k x̂_k^-
Now
x̂_k^- = Φ_k x̂_{k−1}^+
P_{k+1}^- = Φ_k P_k^+ Φ_k^T + Q_k
Posterior covariance
P_k^+ = (I − L_k H_k) P_k^- (I − L_k H_k)^T + L_k R_k L_k^T    (60)
What does optimal mean here? What if we choose the gain to minimize the error variance!
The error variance is contained in the diagonal of P_k^+.
Optimal gain L_k
To compute the optimal gain L_k we minimize trace(P_k^+) with respect to L_k. To do this, solve ∂trace(P_k^+)/∂L_k = 0 for L_k.
Multiply the terms out and drop the subscript k for convenience:
∂trace(P_k^+)/∂L_k = −2(HP^-)^T + 2L(HP^-H^T + R)    (63)
Hence
Kalman Gain
L = P^- H^T (HP^- H^T + R)^{-1}    (65)
Initialize x_0 ∼ N(0, P_0)
Prediction step
x_k^- = Φ_k x_{k−1}
P_k^- = Φ_k P_{k−1} Φ_k^T + Q_k
Correction step
e_k = y_k − H_k x_k^-
S_k = H_k P_k^- H_k^T + R_k
L_k = P_k^- H_k^T S_k^{-1}
x_k^+ = x_k^- + L_k e_k
P_k^+ = P_k^- − L_k S_k L_k^T
Prediction step
x_k^- = Φ x_{k−1}
P_k^- = Φ P_{k−1} Φ^T + Q
Correction step
L_k = P_k^- H^T (H P_k^- H^T + R)^{-1}    (66)
S_k = H P_k^- H^T + R    (67)
x_k = x_k^- + L_k (y_k − H x_k^-)    (68)
P_k = P_k^- − L_k S_k L_k^T    (69)
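The algorithm above maps almost line-for-line to code. Below is a minimal NumPy sketch of one predict/correct cycle (equations 66-69); the function name and the matrices in the usage snippet are illustrative placeholders, not from the slides.

```python
import numpy as np

def kf_step(x, P, y, Phi, Q, H, R):
    """One discrete Kalman filter cycle: prediction (Phi, Q) then correction (H, R)."""
    # Prediction step
    x_minus = Phi @ x
    P_minus = Phi @ P @ Phi.T + Q
    # Correction step
    S = H @ P_minus @ H.T + R                 # innovation covariance, eq. (67)
    L = P_minus @ H.T @ np.linalg.inv(S)      # Kalman gain, eq. (66)
    x_plus = x_minus + L @ (y - H @ x_minus)  # state update, eq. (68)
    P_plus = P_minus - L @ S @ L.T            # covariance update, eq. (69)
    return x_plus, P_plus

# Illustrative usage: a scalar random-walk state measured directly.
Phi = np.array([[1.0]]); Q = np.array([[0.01]])
H = np.array([[1.0]]);   R = np.array([[0.25]])
x, P = np.array([0.0]), np.array([[1.0]])
for y in [0.9, 1.1, 1.05, 0.95]:              # fake measurements
    x, P = kf_step(x, P, np.array([y]), Phi, Q, H, R)
```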
KF algorithm
Prediction step: Initialize x̂^-(t) = x̂(t = k), P(t) = P_k(t = k)
x̂˙^- = A x̂
Ṗ^- = A P + P A^T + Q
Correction step
P_k^- = P(t = k)    (70)
L_k = P_k^- C^T (C P_k^- C^T + R)^{-1}    (71)
x_k = x_k^- + L_k (y_k − C x_k^-)    (72)
P_k = [I − L_k C] P^-(k)    (73)
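A sketch of how this continuous-discrete form might be implemented: Euler-integrate the mean and covariance between measurements, then apply the discrete correction (71)-(73) when a measurement arrives. The function names and the step-size handling are assumptions for illustration.

```python
import numpy as np

def propagate(x, P, A, Q, dt, n_steps):
    """Euler integration of x_hat_dot = A x_hat and P_dot = A P + P A^T + Q between measurements."""
    for _ in range(n_steps):
        x = x + dt * (A @ x)
        P = P + dt * (A @ P + P @ A.T + Q)
    return x, P

def correct(x_minus, P_minus, y, C, R):
    """Discrete correction at a measurement instant, eqs. (71)-(73)."""
    L = P_minus @ C.T @ np.linalg.inv(C @ P_minus @ C.T + R)   # eq. (71)
    x = x_minus + L @ (y - C @ x_minus)                        # eq. (72)
    P = (np.eye(P_minus.shape[0]) - L @ C) @ P_minus           # eq. (73)
    return x, P
```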
[Figure: simulation results; output y vs. time in unit time (0 to 10), two panels]
ẋ = Ax + Bu + ζ
with B = [1 0]^T. The input is sinusoidal, and the system is discretized to work with the framework outlined in class.
ẋ = f (x, u) (74)
y = h(x) (75)
Correction step
y(k) = h(x)    (76)
H_k = ∂h(x)/∂x |_{x = x̂_{k−1}}    (77)
L_k = P_k^- H^T (H P_k^- H^T + R)^{-1}    (78)
S_k = H P_k^- H^T + R    (79)
x_k = x_k^- + L_k (y_k − H x_k^-)    (80)
P_k = P_k^- − L_k S_k L_k^T    (81)
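A minimal sketch of this EKF correction step in Python; h and H_jacobian are user-supplied callables (the measurement model and its Jacobian from eq. 77), and here the predicted state is used as the linearization point.

```python
import numpy as np

def ekf_correct(x_minus, P_minus, y, h, H_jacobian, R):
    """EKF correction, eqs. (76)-(81): linearize h(.) at the current estimate, then apply the KF update."""
    H = H_jacobian(x_minus)                   # eq. (77)
    S = H @ P_minus @ H.T + R                 # eq. (79)
    L = P_minus @ H.T @ np.linalg.inv(S)      # eq. (78)
    x_plus = x_minus + L @ (y - h(x_minus))   # eq. (80)
    P_plus = P_minus - L @ S @ L.T            # eq. (81)
    return x_plus, P_plus
```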
p̂˙(t) = v̂(t),    (85)
v̂˙(t) = â(t)    (86)
What should b̂˙(t) be? Remember that E[ḃ(t)] = 0 because ω_b is Gaussian white noise, so we have our third equation
b̂˙(t) = 0    (87)
Now we will consider designing a navigation system for a ground robot that
has GPS and encoders. We will use encoders to mechanize the (extended)
Kalman filter and use GPS to correct it.
Kinematic model of the Dubins car robot from Module 2, assuming no slip and no lateral velocity in the body frame:
ẋ = u sin ψ (93)
ẏ = u cos ψ (94)
ψ̇ = ω (95)
Let v^b = [u, v]^T. The velocity vector in the tangent frame is
v^t = R_{b→t} v^b
where
R_{b→t} = [ cos ψ   −sin ψ
            sin ψ    cos ψ ]
u = (1/2)(U_L + U_R)    (96)
ω = (1/L)(U_L − U_R)    (97)
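A small Python sketch combining the wheel-speed map (96)-(97) with an Euler step of the kinematic model (93)-(95); the wheelbase L, the wheel speeds, and the time step are placeholder values, and the sin/cos convention follows the slides.

```python
import numpy as np

def wheel_speeds_to_body(UL, UR, L):
    """Left/right wheel speeds -> forward speed u and turn rate omega, eqs. (96)-(97)."""
    u = 0.5 * (UL + UR)
    omega = (UL - UR) / L
    return u, omega

def propagate_pose(x, y, psi, u, omega, dt):
    """One Euler step of the no-slip kinematic model, eqs. (93)-(95)."""
    x_next = x + dt * u * np.sin(psi)
    y_next = y + dt * u * np.cos(psi)
    psi_next = psi + dt * omega
    return x_next, y_next, psi_next

# Placeholder usage: constant wheel speeds for one second at 100 Hz.
u, omega = wheel_speeds_to_body(UL=1.0, UR=0.8, L=0.5)
pose = (0.0, 0.0, 0.0)
for _ in range(100):
    pose = propagate_pose(*pose, u, omega, dt=0.01)
```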
Used in aircraft that do not have GPS to hold attitude; does not work in the presence of strong acceleration.
Uses inertial sensors only, plus magnetometers:
3-axis accelerometers, gyroscopes, and magnetometers
Idea: integrate gyro outputs to predict attitude and then correct with the gravity direction sensed from the accelerometers.
States to estimate:
p: x, y , z position in NED frame (which we are going to assume is
inertial)
v : U, V , W velocities in NED frame
q: Attitude quaternion (note the discussion on the error quaternion in a later slide)
bω : Bias of the rate gyro
ba : Bias of the accelerometer
Using the following measurements:
ã: 3 axis accelerometers
ω̃: 3 axis gyro
m̃ : 3 axis magnetometer (we won’t really use this)
p̃: position from GPS
ṽ : velocity from GPS
• Acceleration and angular rate measurements
ã = a − ba (108)
ω̃ = ω − bω (109)
Let the quaternion denoting the rotation from body to inertial frame be q = [q_1, q_2, q_3, q_4], where q_1 is the scalar part denoting the rotation and [q_2, q_3, q_4] is the axis of rotation.
R_{b→I} = [ q_1² + q_2² − q_3² − q_4²    2(q_2q_3 + q_1q_4)            2(q_2q_4 − q_1q_3)
            2(q_2q_3 − q_1q_4)           q_1² − q_2² + q_3² − q_4²     2(q_3q_4 + q_1q_2)
            2(q_2q_4 + q_1q_3)           2(q_3q_4 − q_1q_2)            q_1² − q_2² − q_3² + q_4² ]    (110)
And,
R_{I→b} = R_{b→I}^T    (111)
•Don’t forget to normalize the quaternion in your code every time-step, you
q
do this by setting q ← norm(q)
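A Python sketch of the quaternion utilities implied by (110), (111), and the normalization note; scalar-first ordering q = [q1, q2, q3, q4] is assumed, as in the slides.

```python
import numpy as np

def normalize_quaternion(q):
    """q <- q / ||q||, applied every time step to keep q a unit quaternion."""
    return q / np.linalg.norm(q)

def quat_to_rotation(q):
    """Body-to-inertial rotation matrix from q = [q1, q2, q3, q4] (scalar first), eq. (110)."""
    q1, q2, q3, q4 = q
    return np.array([
        [q1**2 + q2**2 - q3**2 - q4**2, 2*(q2*q3 + q1*q4),             2*(q2*q4 - q1*q3)],
        [2*(q2*q3 - q1*q4),             q1**2 - q2**2 + q3**2 - q4**2, 2*(q3*q4 + q1*q2)],
        [2*(q2*q4 + q1*q3),             2*(q3*q4 - q1*q2),             q1**2 - q2**2 - q3**2 + q4**2],
    ])

# R_{I->b} is just the transpose, eq. (111):
# R_I_to_b = quat_to_rotation(q).T
```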
F_pv = ∂(^I p̂˙)/∂(^I v̂) = I_{3×3}    (112)
F_vq = ∂(^I v̂˙)/∂(^I q̂) = (∂R_{b→I}/∂q) a^b    (113)
F_vba = ∂(^I v̂˙)/∂(^I b̂_a) = −R_{b→I}    (114)
F_qq = ∂(^I q̂˙)/∂(^I q̂) = −(1/2) Ω(ω)    (115)
F_qbω = ∂(^I q̂˙)/∂b̂_ω    (116)
∂(^I v̂˙)/∂(^I q̂) = (∂R_{b→I}/∂q) a    (117)
∂(^I v̂˙)/∂(^I q̂) =
[ 2(q_1a_x − q_4a_y + q_3a_z)    2(q_2a_x + q_3a_y + q_4a_z)    2(−q_3a_x + q_2a_y + q_1a_z)    2(−q_4a_x − q_1a_y + q_2a_z)
  2(q_4a_x + q_1a_y − q_2a_z)    2(q_3a_x − q_2a_y − q_1a_z)    2(q_2a_x + q_3a_y + q_4a_z)     2(q_1a_x − q_4a_y + q_3a_z)
  2(−q_3a_x + q_2a_y + q_1a_z)   2(q_4a_x + q_1a_y − q_2a_z)    2(−q_1a_x + q_4a_y − q_3a_z)    2(q_2a_x + q_3a_y + q_4a_z) ]    (118)
∂(^I q̂˙)/∂b̂_ω = −0.5 ∂(Ω(ω)q)/∂b_ω    (119)
= (1/2) [  q_2    q_3    q_4
          −q_1    q_4   −q_3
          −q_4   −q_1    q_2
           q_3   −q_2   −q_1 ]    (120)
•Don’t forget to discretize A. If you want to use continuous A, i.e. 121 then
use Ṗ = AP + PAT + Q
H_xq = [ −r_GPS(1)2q_1    −r_GPS(1)2q_2    r_GPS(1)2q_3     r_GPS(1)2q_4
         −r_GPS(1)2q_4    −r_GPS(1)2q_3    −r_GPS(1)2q_2    −r_GPS(1)2q_1
         r_GPS(1)2q_3     −r_GPS(1)2q_4    r_GPS(1)2q_1     −r_GPS(1)2q_2 ]    (124)

H_vq = [ r_GPS(1)2q_3 Q + r_GPS(1)2q_4 R     r_GPS(1)2q_4 Q − r_GPS(1)2q_3 R     r_GPS(1)2q_1 Q − r_GPS(1)2q_2 R     r_GPS(1)2q_2 Q + r_GPS(1)2q_1 R
         −r_GPS(1)2q_2 Q − r_GPS(1)2q_1 R    r_GPS(1)2q_2 R − r_GPS(1)2q_1 Q     r_GPS(1)2q_4 Q − r_GPS(1)2q_3 R     r_GPS(1)2q_3 Q + r_GPS(1)2q_4 R
         r_GPS(1)2q_1 Q − r_GPS(1)2q_2 R     −r_GPS(1)2q_2 Q − r_GPS(1)2q_1 R    −r_GPS(1)2q_3 Q − r_GPS(1)2q_4 R    r_GPS(1)2q_4 Q − r_GPS(1)2q_3 R ]    (125)

H = [ I(3×3)   Z(3×3)   H_xq   Z(3×6)
      Z(3×3)   I(3×3)   H_vq   Z(3×6) ]    (126)
δq ◦ q̂ = q    (127)
In practice, this error quaternion takes on very small values, so it is OK to set ŝ = 0. If we do this, we can simplify some calculations. See the JFR paper by Chowdhary et al.
ẋ = f (x, θ, u) (128)
θ̇ = 0 (129)
y = h(x, θ)    (130)
When you do the correction step, make sure you do it every second. The prediction should happen as fast as the inertial sensors provide data; assume 100 Hz.
Observability condition
The pair (C , A) is observable if and only if the matrix O has full column rank
O = [ C
      CA
      CA²
      ⋮
      CA^{n−1} ]    (135)
Consider the HW system from Q3; the linearized matrices are given in (133) and the measurement matrix in (134).
Let us use MATLAB to check the observability condition; try it for the following cases (see the sketch below):
θ = 45°, x, y measured
θ = 45°, x, y, v measured
θ = 0°, x, y measured
θ = 90°, x, y measured
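The slides suggest checking this in MATLAB (for example with obsv and rank); an equivalent NumPy sketch is given below. The linearized A and the measurement matrix C from (133)-(134) are not reproduced here, so they enter as placeholders to be filled in for each case.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^{n-1} as in eq. (135)."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)

def is_observable(A, C):
    """The pair (C, A) is observable iff O has full column rank."""
    O = observability_matrix(A, C)
    return np.linalg.matrix_rank(O) == A.shape[0]

# Plug in the linearized A and measurement matrix C from (133)-(134)
# for each case (theta = 45, 0, 90 degrees; x, y or x, y, v measured).
```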