Ch 13.4: Linear Dynamical Systems
Sargur N. Srihari
srihari@cedar.buffalo.edu
Machine Learning Course:
http://www.cedar.buffalo.edu/~srihari/CSE574/index.html
Machine Learning Srihari
[Figure: state-space graphical model with latent variables z_n and observed variables x_n]
Motivation
• A simple problem that arises in practical settings
• Measure the value of an unknown quantity z using a noisy sensor that returns a value x
• x is z plus zero-mean Gaussian noise
• Given a single measurement, the best guess for z is to assume that z = x
• We can improve the estimate of z by taking many measurements and averaging them, since the random noise terms tend to cancel each other
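This averaging effect can be checked numerically. A minimal sketch (the values of z_true and sigma are illustrative assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

z_true = 5.0   # hypothetical unknown quantity z
sigma = 1.0    # assumed std. dev. of the zero-mean sensor noise

# Single noisy measurement: the best guess for z is x itself
x_single = z_true + rng.normal(0.0, sigma)

# Many measurements: averaging cancels the zero-mean noise terms
x_many = z_true + rng.normal(0.0, sigma, size=10_000)
estimate = x_many.mean()

# The error of the mean shrinks on the order of sigma / sqrt(N)
print(abs(estimate - z_true))
```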
Corresponding factorization:

p(X, Z | θ) = p(z_1 | π) [ ∏_{n=2}^{N} p(z_n | z_{n-1}, A) ] [ ∏_{m=1}^{N} p(x_m | z_m, φ) ]

where X = {x_1, .., x_N}, Z = {z_1, .., z_N}, θ = {π, A, φ}
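The factorization can be turned directly into code: the log of the joint is the sum of one initial term, the transition terms, and the emission terms. A sketch assuming Gaussian transition/emission densities (as in the LDS formulation below); the function names are my own:

```python
import numpy as np

def gauss_logpdf(x, mean, cov):
    """Log-density of a multivariate Gaussian N(x | mean, cov)."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(cov, diff))

def lds_log_joint(X, Z, mu0, V0, A, Gamma, C, Sigma):
    """log p(X, Z | theta) = log p(z1) + sum_n log p(zn | zn-1) + sum_n log p(xn | zn)."""
    lp = gauss_logpdf(Z[0], mu0, V0)            # initial term p(z1)
    for n in range(1, len(Z)):
        lp += gauss_logpdf(Z[n], A @ Z[n - 1], Gamma)   # transition terms
    for n in range(len(X)):
        lp += gauss_logpdf(X[n], C @ Z[n], Sigma)       # emission terms
    return lp
```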
Kalman Filter
• Since the model is a tree-structured graph, inference can be solved using the sum-product algorithm
• The forward recursions, analogous to the alpha messages of the HMM, are known as the Kalman filter equations
• The backward messages are known as the Kalman smoother equations
LDS Formulation
• The transition and emission probabilities are written in the general form:
  p(z_n | z_{n-1}) = N(z_n | A z_{n-1}, Γ)
  p(x_n | z_n) = N(x_n | C z_n, Σ)
• The initial latent variable has the form:
  p(z_1) = N(z_1 | µ_0, V_0)
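These three Gaussians fully specify the generative model, so sampling a sequence is a direct transcription of them. A minimal sketch (the function name and parameter layout are my own):

```python
import numpy as np

def sample_lds(N, mu0, V0, A, Gamma, C, Sigma, rng=None):
    """Draw a latent trajectory z_1..z_N and observations x_1..x_N from an LDS."""
    rng = np.random.default_rng(rng)
    z = rng.multivariate_normal(mu0, V0)            # p(z1) = N(z1 | mu0, V0)
    Z, X = [], []
    for _ in range(N):
        x = rng.multivariate_normal(C @ z, Sigma)   # p(xn | zn) = N(xn | C zn, Sigma)
        Z.append(z)
        X.append(x)
        z = rng.multivariate_normal(A @ z, Gamma)   # p(zn | zn-1) = N(zn | A zn-1, Gamma)
    return np.array(Z), np.array(X)
```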
Inference in LDS
• Finding marginal distributions for the latent variables conditioned on the observation sequence
• For given parameter settings, we wish to predict the next latent state z_n and the next observation x_n conditioned on the observed data x_1, .., x_{n-1}
• This is done efficiently using the sum-product algorithm, which here yields the Kalman filter and smoother equations
• The joint distribution over all latent and observed variables is simply a Gaussian
• The inference problem can therefore be solved using the marginals and conditionals of the multivariate Gaussian
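The forward (filtering) recursion alternates a prediction through the dynamics with a correction by the new observation. A sketch of the standard Kalman filter equations under the Gaussian parameterization above (the function name is my own):

```python
import numpy as np

def kalman_filter(X, mu0, V0, A, Gamma, C, Sigma):
    """Forward (alpha-like) recursion: filtered means/covariances of p(zn | x1..xn)."""
    means, covs = [], []
    mu = V = None
    for n, x in enumerate(X):
        if n == 0:
            mu_pred, P = mu0, V0           # initial belief p(z1)
        else:
            mu_pred = A @ mu               # predict: push belief through dynamics
            P = A @ V @ A.T + Gamma        # predicted covariance grows by Gamma
        S = C @ P @ C.T + Sigma            # innovation covariance
        K = P @ C.T @ np.linalg.inv(S)     # Kalman gain
        mu = mu_pred + K @ (x - C @ mu_pred)   # correct with observation x_n
        V = P - K @ C @ P                  # corrected covariance shrinks
        means.append(mu)
        covs.append(V)
    return np.array(means), np.array(covs)
```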
Uncertainty Reduction
• The Kalman filter is a process of making successive predictions and then correcting those predictions in light of new observations
• Before observing x_n: p(z_{n-1} | x_1, .., x_{n-1}) incorporates the data up to step n-1; the predicted density p(z_n | x_1, .., x_{n-1}) is broadened by diffusion arising from the non-zero variance of the transition probability p(z_n | z_{n-1})
• After observing x_n: the emission contribution p(x_n | z_n) (not a density with respect to z_n, and not normalized to one) is combined with the prediction to give the corrected density p(z_n | x_1, .., x_n)
[Figure: densities over z before and after observing x_n, and the estimated path of the latent state]
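The predict/correct cycle is easiest to see in one dimension: the predicted variance is larger than the previous filtered variance (diffusion), and the corrected variance is smaller than the predicted one. A scalar sketch with illustrative assumed values:

```python
# 1-D predict/correct step; a, gamma, c, sigma and the state values are illustrative
a, gamma = 1.0, 0.5       # transition: p(zn | zn-1) = N(a*z, gamma)
c, sigma = 1.0, 0.3       # emission:   p(xn | zn)   = N(c*z, sigma)

mu_prev, v_prev = 0.0, 0.2    # filtered p(z_{n-1} | x_1..x_{n-1})

# Prediction: variance grows due to the transition noise (diffusion)
mu_pred = a * mu_prev
v_pred = a * a * v_prev + gamma          # 0.7 > 0.2

# Correction after observing x_n: variance shrinks
x_n = 0.4
k = v_pred * c / (c * c * v_pred + sigma)    # scalar Kalman gain
mu_new = mu_pred + k * (x_n - c * mu_pred)
v_new = (1 - k * c) * v_pred                 # 0.21 < 0.7
```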
Learning in LDS
• Determination of the parameters θ = {A, Γ, C, Σ, µ_0, V_0} using the EM algorithm
• Denote the estimated parameter values at some particular cycle of the algorithm as θ^old
• For these parameter values we run the inference algorithm to determine the posterior distribution of the latent variables, p(Z | X, θ^old)
• In the M step, the function Q(θ, θ^old) is maximized with respect to the components of θ
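To make the M step concrete, here is a sketch of one of its closed-form updates: the new transition matrix A is obtained from the E-step posterior expectations E[z_n z_{n-1}^T] and E[z_{n-1} z_{n-1}^T]. This particular update form is standard for linear-Gaussian models, but the function name and argument layout are my own:

```python
import numpy as np

def m_step_A(Ezz_lag, Ezz_prev):
    """
    M-step update for the transition matrix A from E-step statistics:
      Ezz_lag  : list of E[z_n z_{n-1}^T]     for n = 2..N
      Ezz_prev : list of E[z_{n-1} z_{n-1}^T] for n = 2..N
    A_new = (sum_n E[z_n z_{n-1}^T]) (sum_n E[z_{n-1} z_{n-1}^T])^{-1}
    """
    S_lag = sum(Ezz_lag)     # accumulate cross-time statistics
    S_prev = sum(Ezz_prev)   # accumulate previous-step statistics
    return S_lag @ np.linalg.inv(S_prev)
```

If the posterior were concentrated on a trajectory generated exactly by z_n = A z_{n-1}, this update would recover A exactly, which is a useful sanity check.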
• Extensions of HMMs:
http://www.cedar.buffalo.edu/~srihari/CSE574/Chap11/Ch11.3-HMMExtensions.pdf