Markov Chain Monte Carlo: Prof. David Page
Recap from earlier slides: for a Bayes net with all CPT entries nonzero,
P(y) = ∏_i P(y_i | y_j : Y_j ∈ Parents(Y_i)) > 0
P(S) = Σ_{y ∈ S} P(y) > 0
P(S_1 | S_2) = P(S_1, S_2) / P(S_2)
Gibbs Sampler Markov Chain
We assume we have already chosen to sample variable Y_i, and write u_i for the values of all variables other than Y_i:
T((u_i, y_i) → (u_i, y_i')) = P(y_i' | u_i)
If we want to incorporate the probability of
randomly uniformly choosing a variable to sample,
simply multiply all transition probabilities by 1/n
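The transition above can be sketched concretely. Below is a minimal Gibbs sampler for a hypothetical two-variable network A → B (the CPT numbers are made up): each step picks one variable uniformly at random and resamples it from its conditional given the rest, and the empirical state frequencies approach the true joint.

```python
import random

# Hypothetical two-variable Bayes net A -> B with all-nonzero CPTs.
P_A = {0: 0.4, 1: 0.6}                      # P(A)
P_B_given_A = {0: {0: 0.7, 1: 0.3},         # P(B | A)
               1: {0: 0.2, 1: 0.8}}

def joint(a, b):
    return P_A[a] * P_B_given_A[a][b]

def gibbs_step(state):
    """Pick a variable uniformly, then resample it from its conditional
    given the rest -- the transition T((u_i, y_i) -> (u_i, y_i'))."""
    a, b = state
    if random.random() < 0.5:               # resample A given B
        w0, w1 = joint(0, b), joint(1, b)   # P(A | B=b) up to a constant
        a = 0 if random.random() < w0 / (w0 + w1) else 1
    else:                                   # resample B given A
        b = 1 if random.random() < P_B_given_A[a][1] else 0
    return (a, b)

def run(n_steps=200_000, seed=0):
    random.seed(seed)
    state, counts = (0, 0), {}
    for _ in range(n_steps):
        state = gibbs_step(state)
        counts[state] = counts.get(state, 0) + 1
    return {s: c / n_steps for s, c in counts.items()}

est = run()
# Empirical frequencies approach the true joint, e.g. P(A=1, B=1) = 0.48.
```
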
Gibbs Sampler Markov Chain is
Regular
Path from y to y' with Nonzero Probability:
Let n be the number of variables in the Bayes net.
For step i = 1 to n:
Set variable Y_i to y_i' and leave the other variables the same.
That is, go from (y_1', y_2', ..., y_{i-1}', y_i, y_{i+1}, ..., y_n) to (y_1', y_2', ..., y_{i-1}', y_i', y_{i+1}, ..., y_n).
The probability of this step is P(y_i' | y_1', y_2', ..., y_{i-1}', y_{i+1}, ..., y_n), which is nonzero.
So all steps, and thus the path, have nonzero probability.
The self-loop T(y → y) has probability P(y_i | u_i) > 0.
How p Changes with Time in a
Markov Chain
p_{t+1}(y') = Σ_y p_t(y) T(y → y')
A distribution p_t is stationary if p_t = p_{t+1}, that is, for all y, p_t(y) = p_{t+1}(y).
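The update above can be iterated directly. A small sketch on a hypothetical 3-state chain: repeatedly apply p_{t+1}(y') = Σ_y p_t(y) T(y → y') until p stops changing, which yields a stationary distribution.

```python
# Hypothetical 3-state chain; T[y][y2] is the transition probability y -> y2.
T = [[0.5, 0.25, 0.25],
     [0.2, 0.6,  0.2 ],
     [0.3, 0.3,  0.4 ]]

def step(p):
    """One application of p_{t+1}(y') = sum_y p_t(y) T(y -> y')."""
    return [sum(p[y] * T[y][y2] for y in range(3)) for y2 in range(3)]

p = [1.0, 0.0, 0.0]          # arbitrary initial distribution p_0
for t in range(1000):
    p_next = step(p)
    if max(abs(a - b) for a, b in zip(p, p_next)) < 1e-12:
        break                # p_t == p_{t+1}: stationary (numerically)
    p = p_next
# p now satisfies step(p) == p up to rounding error.
```
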
Detailed Balance
A Markov chain satisfies detailed balance if there exists a distribution p such that for all states y, y':
p(y)T(y → y') = p(y')T(y' → y)
If a regular Markov chain satisfies detailed balance with distribution p, then for any initial distribution p_0, p_t converges to p.
Detailed balance (with regularity) implies convergence to a unique stationary distribution.
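The definition can be checked numerically. The sketch below builds a chain that satisfies detailed balance with a hypothetical 3-state target p (using the Metropolis construction that appears later in these slides) and verifies both the pairwise balance condition and the stationarity it implies.

```python
# Hypothetical target distribution on 3 states.
p = [0.2, 0.3, 0.5]
n = len(p)

# Metropolis construction: propose uniformly among the other states,
# accept with min(1, p(y')/p(y)); leftover mass becomes a self-loop.
T = [[0.0] * n for _ in range(n)]
for y in range(n):
    for y2 in range(n):
        if y2 != y:
            T[y][y2] = (1.0 / (n - 1)) * min(1.0, p[y2] / p[y])
    T[y][y] = 1.0 - sum(T[y])

# Detailed balance holds pairwise: p(y) T(y->y') = p(y') T(y'->y) ...
for y in range(n):
    for y2 in range(n):
        assert abs(p[y] * T[y][y2] - p[y2] * T[y2][y]) < 1e-12

# ... and therefore p is stationary: sum_y p(y) T(y->y') = p(y').
for y2 in range(n):
    assert abs(sum(p[y] * T[y][y2] for y in range(n)) - p[y2]) < 1e-12
```
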
Examples of Markov Chains (arrows
denote nonzero-probability transitions)
[Figure: two example chains with transition probabilities such as 1/6, 1/3, and 2/3 on the arcs. Left: regular and satisfies detailed balance (with appropriate p and T), so it converges to its stationary distribution. Right: satisfies detailed balance with p on the nodes and T on the arcs, but does not converge because it is not regular.]
Gibbs Sampler Satisfies Detailed Balance
Claim: A Gibbs sampler Markov chain defined by a Bayesian network with all CPT entries nonzero satisfies detailed balance with probability distribution p(y) = P(y) for all states y.
Proof: First we will show that P(y)T(y → y') = P(y')T(y' → y). Then we will show that no other probability distribution p satisfies p(y)T(y → y') = p(y')T(y' → y).
Gibbs Sampler Satisfies Detailed Balance, Part 1
P(y)T(y → y') = P(y_i, u_i) P(y_i' | u_i)      (Gibbs Sampler Def.)
             = P(y_i | u_i) P(u_i) P(y_i' | u_i)   (Chain Rule)
             = P(y_i', u_i) P(y_i | u_i)       (Reverse Chain Rule)
             = P(y')T(y' → y)                  (Gibbs Sampler Def.)
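Part 1 can be verified numerically on a hypothetical two-variable network A → B (made-up CPTs): for states y, y' that differ only in the resampled variable, the two sides of the balance equation agree exactly.

```python
# Hypothetical net A -> B; check P(y) T(y->y') = P(y') T(y'->y) when
# y and y' differ only in B (the variable being resampled).
P_A = {0: 0.4, 1: 0.6}
P_B_given_A = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}

def joint(a, b):
    return P_A[a] * P_B_given_A[a][b]

# Resampling B with A = u fixed: T((u, b) -> (u, b')) = P(b' | u).
for u in (0, 1):
    for b in (0, 1):
        for b2 in (0, 1):
            forward  = joint(u, b)  * P_B_given_A[u][b2]  # P(y) T(y->y')
            backward = joint(u, b2) * P_B_given_A[u][b]   # P(y') T(y'->y)
            assert abs(forward - backward) < 1e-12
```
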
Gibbs Sampler Satisfies Detailed
Balance, Part 2
Since all CPT entries are nonzero, the Markov chain is regular.
Suppose there exists a probability distribution p not equal to P such that p(y)T(y → y') = p(y')T(y' → y). Without loss of generality, there exists some state y such that p(y) > P(y). So, for every neighbor y' of y, that is, every y' such that T(y → y') is nonzero,
p(y')T(y' → y) = p(y)T(y → y') > P(y)T(y → y') = P(y')T(y' → y)
So p(y') > P(y').
Gibbs Sampler Satisfies Detailed
Balance, Part 3
We can inductively see that p(y') > P(y') for every state y' reachable from y by a path of nonzero probability. Since the Markov chain is regular, every state is reachable, so p(y') > P(y') for all states y'. But the sum over all states y' of p(y') is 1, and the sum over all states y' of P(y') is 1. This is a contradiction, so we can conclude that P is the unique probability distribution p satisfying p(y)T(y → y') = p(y')T(y' → y).
Using Other Samplers
The Gibbs sampler only changes one random
variable at a time
Slow convergence
High-probability states may not be reached because reaching them requires going through low-probability states.
Metropolis Sampler
Propose a transition with probability T_Q(y → y')
Accept with probability A = min(1, P(y')/P(y))
If for all y, y', T_Q(y → y') = T_Q(y' → y), then the resulting Markov chain satisfies detailed balance.
Metropolis-Hastings Sampler
Propose a transition with probability T_Q(y → y')
Accept with probability
A = min(1, P(y')T_Q(y' → y) / (P(y)T_Q(y → y')))
Detailed balance satisfied
Acceptance probability often easy to compute even though sampling according to P is difficult
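A minimal Metropolis-Hastings sketch on a hypothetical 3-state target P with an asymmetric proposal T_Q (all numbers made up): the empirical state frequencies approach P even though the proposal alone would not sample from it.

```python
import random

P = [0.1, 0.3, 0.6]          # hypothetical target distribution
TQ = [[0.2, 0.5, 0.3],       # TQ[y][y2] = proposal probability y -> y2
      [0.4, 0.2, 0.4],       # (asymmetric, so the Hastings ratio matters)
      [0.3, 0.5, 0.2]]

def mh_step(y, rng):
    y2 = rng.choices(range(3), weights=TQ[y])[0]          # propose y'
    a = min(1.0, P[y2] * TQ[y2][y] / (P[y] * TQ[y][y2]))  # Hastings ratio
    return y2 if rng.random() < a else y                  # accept/reject

rng = random.Random(0)
y, counts = 0, [0, 0, 0]
for _ in range(200_000):
    y = mh_step(y, rng)
    counts[y] += 1
freqs = [c / 200_000 for c in counts]
# freqs approaches P = [0.1, 0.3, 0.6].
```
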
Gibbs Sampler as Instance of
Metropolis-Hastings
Proposal distribution: T_Q((u_i, y_i) → (u_i, y_i')) = P(y_i' | u_i)
Acceptance probability:
A = min(1, P(u_i, y_i') T_Q((u_i, y_i') → (u_i, y_i)) / (P(u_i, y_i) T_Q((u_i, y_i) → (u_i, y_i'))))
  = min(1, P(y_i' | u_i) P(u_i) P(y_i | u_i) / (P(y_i | u_i) P(u_i) P(y_i' | u_i)))
  = min(1, 1) = 1
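The cancellation above can be checked with made-up numbers: with the Gibbs proposal, numerator and denominator of the Hastings ratio contain the same factors, so A = 1 and every proposal is accepted.

```python
# Hypothetical probabilities for one Gibbs move that flips Y_i.
P_u = 0.3                      # P(u_i): the other variables' assignment
P_y_given_u = 0.25             # P(y_i  | u_i)
P_y2_given_u = 0.75            # P(y_i' | u_i), the proposed new value

forward  = (P_y_given_u * P_u) * P_y2_given_u   # P(y)  * T_Q(y -> y')
backward = (P_y2_given_u * P_u) * P_y_given_u   # P(y') * T_Q(y' -> y)
A = min(1.0, backward / forward)
assert A == 1.0                 # the Gibbs proposal is always accepted
```
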