
Sequential Monte Carlo Methods

Shashidhar

School of Marine, NWPU

Overview

We are faced with many problems involving large, sequentially evolving datasets: tracking, computer vision, speech and audio, robotics, and so on.

We wish to form models and algorithms for Bayesian sequential updating of probability distributions as data evolve.

Here we consider the Sequential Monte Carlo (SMC), or "particle filtering", methodology.

In many applications it is required to estimate the "state" of a system from noisy, convolved, or nonlinearly distorted observations. Since data arrive sequentially in many applications, it is desirable to estimate the state online, in order to avoid storing huge datasets and to make inferences and decisions in real time. Typical applications from the engineering perspective include:

Tracking for radar and sonar applications
Real-time enhancement of speech and audio signals
Sequence and channel estimation in digital communication channels
Medical monitoring of patient EEG/ECG signals
Image sequence tracking

Contents

Bayes' Theorem
Monte Carlo methods
Sampling techniques
Markov Chain Monte Carlo
Importance sampling
State-space systems
Sequential Importance Sampling (SIS)
Sequential Importance Resampling (SIR)

Bayesian Inference
Belief Before + Data → Belief After

Example: a hunter sees a cat from afar (prior belief); going nearer, he learns more (data); he decides it is a tiger (posterior belief).

Bayesian Signal Processing (BSP): estimation of the probability distribution of a random signal in order to perform statistical inferences. Observation: Y. Quantity of interest: X.

Pr(X|Y) = Pr(Y|X) Pr(X) / Pr(Y)

where Pr(X|Y) is the posterior distribution, Pr(Y|X) the likelihood, Pr(X) the prior distribution, and Pr(Y) the evidence (normalizing factor). Dropping the evidence,

Pr(X|Y) ∝ Pr(Y|X) Pr(X)

The prior distribution of X is the belief before: Pr(X). Given data Y, the model supplies the likelihood of X: Pr(Y|X). The posterior distribution of X given Y is the belief after: Pr(X|Y).

[Figure: estimated distributions, showing the prior Pr(X) and the posterior Pr(X|Y) plotted against the random parameter X.]

Why Monte Carlo?

Grid-based numerical methods become computationally complex as the number of grid points grows.

The Monte Carlo method is efficient at picking random samples from regions of high concentration of probability.

In signal processing we are often interested in statistical measures of a random signal or parameter, expressed in terms of moments:

E[f(X)] = ∫ f(x) Pr(x) dx

Instead of direct numerical integration, we use Monte Carlo integration as an alternative. MC integration draws random samples from the prior distribution and forms sample averages to approximate the posterior distribution.

Empirical distribution:

P̂r(x) = (1/N) Σ_{i=1}^{N} δ(x − x^(i))

which is a probability mass distribution with weights 1/N at the random samples x^(i). Substituting the empirical distribution into the integral gives

Ê[f(X)] = ∫ f(x) P̂r(x) dx = (1/N) Σ_{i=1}^{N} f(x^(i))

Here Ê[f(X)] is said to be the Monte Carlo estimate of E[f(X)].



Example: take Pr(x) = Gamma(4,1), generate some random samples, plot the histogram, and form a basic approximation to the PDF.
[Figure: histogram approximations to the Gamma(4,1) PDF for N = 200, 500, 1000, 5000, and 10000 samples; the approximation improves as N increases.]
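The Gamma(4,1) example can be sketched in a few lines of Python. This is an illustrative reconstruction (the slides give no code), exploiting the fact that a Gamma(4,1) draw is the sum of four Exponential(1) draws, and comparing Monte Carlo moment estimates against the exact values (mean 4, variance 4):

```python
import random

# Gamma(4,1) via the sum of four Exponential(1) draws (valid for integer shape).
def gamma41_sample(rng):
    return sum(rng.expovariate(1.0) for _ in range(4))

rng = random.Random(0)
N = 10000
samples = [gamma41_sample(rng) for _ in range(N)]

# Monte Carlo moment estimates; exact values for Gamma(4,1): mean = 4, var = 4.
mean = sum(samples) / N
var = sum((x - mean) ** 2 for x in samples) / N
print(mean, var)
```

Plotting a histogram of `samples` reproduces the panels described above: as N grows, the histogram converges to the Gamma(4,1) PDF.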

Integrals in Probabilistic Inference

Normalization:
Pr(X|Y) = Pr(Y|X) Pr(X) / ∫ Pr(Y|X) Pr(X) dX

Marginalization:
Pr(X|Y) = ∫ Pr(X, Z|Y) dZ

Expectation:
E[f(X)|Y] = ∫ f(X) Pr(X|Y) dX

These "nasty integrals" are generally intractable in closed form.

Monte Carlo Integration


Suppose we want to compute I = ∫ f(x) Pr(x|y) dx, but we cannot directly sample from Pr(x|y).

1) Simulate samples x^(i), i = 1,…,N, from an approximation of Pr(x|y):

P̂r(x|y) = (1/N) Σ_{i=1}^{N} δ(x − x^(i))

2) Replace the nasty integral with a simple sum:

Î ≈ (1/N) Σ_{i=1}^{N} f(x^(i))

Monte Carlo Integration, formally

The idea of Monte Carlo simulation is to draw an i.i.d. set of samples {x^(i)}_{i=1}^{N} from a target density Pr(x) defined on a high-dimensional space X. These N samples can be used to approximate the target distribution with the following empirical point-mass function (think of it as a histogram):

P̂r(x) = (1/N) Σ_{i=1}^{N} δ(x − x^(i))

where δ(x − x^(i)) denotes the delta-Dirac mass located at x^(i).
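The "replace the integral with a sum" step can be wrapped in a small helper. This is a hedged sketch (not from the slides) that assumes we can sample from the target directly; `mc_expectation` is an illustrative name:

```python
import random

def mc_expectation(f, sampler, n):
    # Draw x^(i) from the target and average f over the samples:
    # the simple sum that replaces the nasty integral.
    return sum(f(sampler()) for _ in range(n)) / n

rng = random.Random(1)
# Example: X ~ N(0,1), for which E[X^2] = 1 exactly.
est = mc_expectation(lambda x: x * x, lambda: rng.gauss(0.0, 1.0), 20000)
print(est)
```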

Summary of MC

The MC method is a powerful means of generating random samples used in estimating conditional and marginal probability distributions.

Relative to grid-based methods, the efficiency of the MC method increases as the problem dimensionality increases.

Sampling Techniques

Uniform sampling
Rejection sampling
Metropolis sampling
Metropolis-Hastings sampling
Random-walk Metropolis-Hastings sampling
Importance sampling
Gibbs sampling
Slice sampling

Rejection Sampling
Given a sampling (proposal) PDF q(x), a target PDF Pr(x), and a constant M chosen so that M·q(x) ≥ Pr(x) for all x:

for i = 1 to N
    Generate a trial sample: x̂ ~ q(x)
    Generate a uniform sample: u ~ U(0,1)
    ACCEPT the sample, x^(i) = x̂, if u ≤ Pr(x̂) / (M·q(x̂))
    Otherwise, REJECT the sample and generate the next trial sample
end

Trial samples falling under the target PDF Pr(x) are accepted; those falling in the gap between M·q(x) and Pr(x) are rejected.
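A minimal rejection-sampling sketch (illustrative, not from the slides), for the toy target p(x) = 2x on [0, 1] with a Uniform(0,1) proposal and envelope constant M = 2:

```python
import random

# Toy target p(x) = 2x on [0, 1]; proposal q = Uniform(0,1); envelope M = 2,
# so M*q(x) >= p(x) everywhere and the accept test is u <= p(x) / (M q(x)).
def rejection_sample(rng, n):
    out = []
    while len(out) < n:
        x = rng.random()          # trial sample from q
        u = rng.random()          # uniform sample for the accept test
        if u <= (2.0 * x) / 2.0:  # p(x) / (M q(x)) = x here
            out.append(x)         # ACCEPT
        # otherwise REJECT and draw the next trial sample
    return out

rng = random.Random(2)
samples = rejection_sample(rng, 5000)
mean = sum(samples) / len(samples)
print(mean)  # the exact mean of p(x) = 2x is 2/3
```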

Markov Chain Monte Carlo (MCMC)



MCMC is basically Monte Carlo integration where the random samples are produced using a Markov chain.

A Markov chain is a discrete random process with the property that the conditional distribution of the present sample, given all of the past samples, depends only on the previous sample:

Pr(x(i) | x(i−1), …, x(0)) = Pr(x(i) | x(i−1))

The most powerful and efficient MCMC methods are Metropolis-Hastings sampling and Gibbs sampling. Markov chain simulation is essentially a general technique based on generating samples from a proposal distribution and then correcting (ACCEPTING or REJECTING) those samples to approximate a target posterior distribution.

Metropolis Sampling

Initialize: x(0) ~ p(x(0))
Generate a candidate sample x* from the proposal: q(x* | x(i−1))
Calculate the acceptance probability:
α(x(i−1), x*) = min{ p(x*) / p(x(i−1)), 1 }
ACCEPT the candidate sample with probability α(x(i−1), x*): if p(x*) > p(x(i−1)), i.e. Prob{NEW_STATE} > Prob{OLD_STATE}, the candidate is always accepted; on rejection, set x(i) = x(i−1).

Disadvantage: the proposal distribution must be symmetric.

Metropolis Hastings Sampling

The Metropolis-Hastings (M-H) technique defines a Markov chain in which a new sample x(i) is generated from the previous sample x(i−1) by first drawing a candidate x* from a proposal distribution q(x* | x(i−1)), and then deciding whether this candidate should be accepted and retained or rejected and discarded. If accepted, x* replaces x(i−1) as the new state; otherwise the old sample x(i−1) is kept.

Metropolis-Hastings Sampling Algorithm

The acceptance probability corrects the ratio prob(NEW_STATE)/prob(OLD_STATE) by the proposal densities,

α = min{ 1, [p(x*) q(x(i−1) | x*)] / [p(x(i−1)) q(x* | x(i−1))] }

so M-H can take care of asymmetric proposal distributions.
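The M-H loop can be sketched as a random-walk sampler (illustrative code, not from the slides). With a symmetric Gaussian proposal the acceptance ratio reduces to p(x*)/p(x(i−1)); here the target is a standard normal, specified by its log-density:

```python
import math
import random

def rw_metropolis_hastings(log_target, x0, step, n, rng):
    # Symmetric Gaussian random-walk proposal: q(x*|x) = q(x|x*), so the
    # acceptance ratio reduces to p(x*) / p(x_prev).
    chain = [x0]
    x, lp = x0, log_target(x0)
    for _ in range(n - 1):
        cand = x + rng.gauss(0.0, step)   # candidate sample
        lp_cand = log_target(cand)
        # accept with probability min(1, p(cand)/p(x)), tested on the log scale
        if math.log(rng.random() + 1e-300) < lp_cand - lp:
            x, lp = cand, lp_cand
        chain.append(x)                   # on rejection, the old sample is kept
    return chain

rng = random.Random(3)
# Target: standard normal, log-density up to an additive constant.
chain = rw_metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 1.0, 50000, rng)
burned = chain[5000:]                     # discard burn-in
mean = sum(burned) / len(burned)
print(mean)
```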

Importance Sampling
One way to mitigate the inability to sample directly from the target (posterior) distribution is the concept of importance sampling: a method to compute expectations with respect to one distribution using random samples drawn from another.

I = ∫ f(x) Pr(x) dx = ∫ f(x) [Pr(x) / q(x)] q(x) dx

Here Pr(x) is the target distribution and q(x) is the importance sampling (proposal) distribution. The integral above can be estimated by:

1) Drawing N samples from q: x^(i) ~ q(x), with importance weights W(x^(i)) = Pr(x^(i)) / q(x^(i))
2) Computing the sample mean:

Î = (1/N) Σ_{i=1}^{N} f(x^(i)) W(x^(i))
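A small importance-sampling sketch (illustrative, not from the slides): estimating E_p[X²] = 1 for a standard-normal target p using samples from a wider N(0, 2) proposal q, with weights W = p/q:

```python
import math
import random

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

rng = random.Random(4)
N = 20000
total = 0.0
for _ in range(N):
    x = rng.gauss(0.0, 2.0)  # draw from the proposal q = N(0, 2)
    w = normal_pdf(x, 0.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # weight W = p/q
    total += w * x * x       # f(x) = x^2
est = total / N              # (1/N) * sum of f(x^(i)) W(x^(i))
print(est)
```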

Sample-based PDF Representation



Regions of high density carry many particles with large weights.

Discrete approximation of the posterior distribution using importance sampling:

Pr(x|y) ≈ Σ_{i=1}^{N} w^(i) δ(x − x^(i))

Sequential Importance Sampling

The importance weights are updated recursively: each new weight multiplies the previous weight by the likelihood and the prior (transition) density of the new sample, divided by the proposal density it was drawn from:

w_k^(i) ∝ w_{k−1}^(i) · Pr(y_k | x_k^(i)) Pr(x_k^(i) | x_{k−1}^(i)) / q(x_k^(i) | x_{k−1}^(i), y_k)

The State-Space System

State transition equation:

x_k = f_{k−1}(x_{k−1}, u_{k−1}, w_{k−1})

where x_k and x_{k−1} are the current and previous states (e.g. velocity, altitude, acceleration), f_{k−1}(·,·,·) is a known evolution function (possibly nonlinear), w_{k−1} is the state noise (usually non-Gaussian), and u_{k−1} is a known input.

Measurement equation:

y_k = h_k(x_k, u_k, v_k)

where y_k is the current measurement, x_k the current state, h_k(·,·,·) a known measurement function (possibly nonlinear), u_k a known input, and v_k the measurement noise (usually non-Gaussian).
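To make the equations concrete, here is a hedged simulation sketch of one popular nonlinear model from the particle-filtering literature; the specific functions f and h are illustrative choices, not taken from the slides:

```python
import math
import random

# Simulate x_k = f(x_{k-1}) + w_{k-1},  y_k = h(x_k) + v_k, with the
# illustrative choices f(x) = 0.5x + 25x/(1+x^2) and h(x) = x^2 / 20.
def simulate(T, rng):
    xs, ys = [], []
    x = 0.0
    for _ in range(T):
        x = 0.5 * x + 25.0 * x / (1.0 + x * x) + rng.gauss(0.0, math.sqrt(10.0))
        y = x * x / 20.0 + rng.gauss(0.0, 1.0)   # noisy nonlinear measurement
        xs.append(x)
        ys.append(y)
    return xs, ys

rng = random.Random(5)
states, observations = simulate(50, rng)
print(len(states), len(observations))
```

Because h(x) = x²/20 loses the sign of x, the posterior over x_k is bimodal, which is exactly the situation where Kalman-type filters fail and particle filters shine.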

Need for Particle Filter



Kalman filters (KF), extended Kalman filters (EKF), and unscented Kalman filters (UKF) can only deal with (approximately) linear, unimodal distributions: they characterize a Gaussian posterior by its conditional mean and covariance, and handle nonlinearity by linearizing it to some degree. Particle filters, in contrast, can characterize multimodal distributions and handle nonlinear state estimation.

Particle Filters

Particle filtering is a sequential Monte Carlo method employing sequential estimation of the relevant probability distributions using importance sampling, with particles acting as point masses and weights as probability masses.

Visualization of SIS
[Figure: the weighted particle set {x_{k−1}^(i), w_{k−1}^(i)} is propagated and reweighted to {x_k^(i), w_k^(i)}.]

Degeneracy Problem

One of the major problems with importance sampling is the degeneracy of particles: after a few iterations the variance of the importance weights increases, so most particles end up with negligible weight and weight degradation becomes impossible to avoid.

Resampling

Eliminate particles with small importance weights and concentrate on (replicate) particles with large weights.

Each time step cycles through: start from an unweighted particle set {x_{k−1}^(i), 1/N}; compute importance weights to obtain {x_{k−1}^(i), w_{k−1}^(i)}; resample back to an unweighted set; then move (predict) the particles forward to obtain {x_k^(i), 1/N}.

Sampling Importance Resampling
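The predict-weight-resample cycle can be sketched as a minimal bootstrap (SIR) particle filter. The linear-Gaussian model used here is an illustrative choice (not from the slides) so the filter's output can be sanity-checked against the simulated true states:

```python
import math
import random

def sir_filter(ys, n_particles, rng):
    # Bootstrap filter for x_k = 0.9 x_{k-1} + w_k, y_k = x_k + v_k, w,v ~ N(0,1).
    parts = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in ys:
        # predict: move particles through the transition prior
        parts = [0.9 * x + rng.gauss(0.0, 1.0) for x in parts]
        # weight: likelihood of y under y_k ~ N(x_k, 1), then normalize
        ws = [math.exp(-0.5 * (y - x) ** 2) for x in parts]
        s = sum(ws)
        ws = [w / s for w in ws]
        # weighted posterior-mean estimate before resampling
        estimates.append(sum(w * x for w, x in zip(ws, parts)))
        # resample: multinomial draw proportional to the weights
        parts = rng.choices(parts, weights=ws, k=n_particles)
    return estimates

rng = random.Random(6)
# simulate a true track and noisy observations from the same model
xs, ys = [], []
x = 0.0
for _ in range(100):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    xs.append(x)
    ys.append(x + rng.gauss(0.0, 1.0))

est = sir_filter(ys, 500, rng)
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(est, xs)) / len(xs))
print(rmse)
```

Resampling after every step, as here, is the simplest scheme; in practice one often resamples only when the effective sample size drops below a threshold, to limit the sample impoverishment that resampling itself introduces.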

Comparison of KF, EKF, UKF and PF


For nonlinear or non-Gaussian systems, accuracy increases in the order KF < EKF < UKF < PF. For linear, Gaussian systems, KF, EKF, UKF, and PF achieve comparable accuracy (the KF is optimal). In both cases, computational complexity increases in the order KF < EKF < UKF < PF.

Thank You
One must learn by doing the thing; for though you think you know it, you have no certainty until you try. (Sophocles, Trachiniae)
