Probability
General situation:
• Observed variables (evidence): Agent knows certain
things about the state of the world (e.g., sensor readings or
symptoms)
• Unobserved variables: Agent needs to reason about
other aspects (e.g. where an object is or what disease is
present)
• Model: Agent knows something about how the known
variables relate to the unknown variables
Probabilistic reasoning gives us a framework for managing
our beliefs and knowledge
Random Variables
A random variable is some aspect of the world
about which we (may) have uncertainty
• R = Is it raining?
• T = Is it hot or cold?
• D = How long will it take to drive to work?
• L = Where is the ghost?
We denote random variables with capital letters
Domains:
• R ∈ {true, false} (often written as {+r, −r})
• T ∈ {hot, cold}
• D ∈ [0, ∞)
• L ∈ {(0, 0), (0, 1), . . . }
Probability Distributions
Associate a probability with each value
Unobserved random variables have distributions

Temperature, P(T):
T     P
hot   0.5
cold  0.5

Weather, P(W):
W       P
sun     0.6
rain    0.1
fog     0.3
meteor  0.0

A distribution is a TABLE of probabilities of values
A probability (lower case value) is a single number
e.g.: P(W = rain) = 0.1
Must have: $\forall x \; P(X = x) \ge 0$ and $\sum_x P(X = x) = 1$
Shorthand notation:
P(hot) = P(T = hot),
P(cold) = P(T = cold),
P(rain) = P(W = rain),
...
OK if all domain entries are unique
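As a minimal sketch of the two requirements above, a distribution can be stored as a lookup table and checked directly. The dictionary representation and the helper name `is_valid_distribution` are assumptions made for illustration, not part of the slides.

```python
import math

# Each distribution is a table mapping a value in the domain to its probability.
P_T = {"hot": 0.5, "cold": 0.5}
P_W = {"sun": 0.6, "rain": 0.1, "fog": 0.3, "meteor": 0.0}


def is_valid_distribution(table):
    """Check that every entry is non-negative and that the entries sum to 1."""
    non_negative = all(p >= 0 for p in table.values())
    # isclose tolerates floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(table.values()), 1.0)
    return non_negative and sums_to_one


print(P_W["rain"])                 # a single probability, P(W = rain)
print(is_valid_distribution(P_T))  # both tables should pass the check
print(is_valid_distribution(P_W))
```

A table whose entries sum to something other than 1, such as `{"a": 0.7, "b": 0.7}`, fails the check.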
Joint Distributions
A joint distribution over a set of random variables $X_1, X_2, \ldots, X_n$ specifies a real number for each assignment (or outcome):
$P(X_1 = x_1, X_2 = x_2, \ldots, X_n = x_n) = P(x_1, x_2, \ldots, x_n)$
Must obey:
$P(x_1, x_2, \ldots, x_n) \ge 0$
$\sum_{(x_1, x_2, \ldots, x_n)} P(x_1, x_2, \ldots, x_n) = 1$

P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

Size of distribution if n variables with domain sizes d?
→ $d^n$
• Impractical to write out for large distributions
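To make the table size concrete, here is a small sketch that stores the joint P(T, W) keyed by outcome tuples and counts the entries a full table needs. The names `domains`, `P_TW` and `num_entries` are illustrative assumptions.

```python
import math
from itertools import product

# Domains of the two variables, in the order (T, W).
domains = {"T": ["hot", "cold"], "W": ["sun", "rain"]}

# The joint distribution assigns one number to every full assignment (outcome).
P_TW = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def num_entries(domains):
    """Number of rows in a full joint table: the product of the domain sizes."""
    return math.prod(len(values) for values in domains.values())


# Every combination of values is an outcome and needs its own row.
all_outcomes = list(product(*domains.values()))

print(num_entries(domains))  # 2 * 2 rows here
print(2 ** 30)               # rows needed for 30 binary variables (d = 2, n = 30)
```

With equal domain sizes d the product reduces to d^n, which is why the table quickly becomes impractical to write out.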
Probabilistic Models
A probabilistic model is a joint distribution over a set of random variables
• (Random) variables with domains
• Assignments are called outcomes
• Joint distributions: say whether assignments (outcomes) are likely
• Normalized: sum to 1.0
• Ideally: only certain variables directly interact

P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3
Events
A set E of outcomes
$P(E) = \sum_{(x_1 \ldots x_n) \in E} P(x_1 \ldots x_n)$

P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

From a joint distribution, we can calculate the probability of any event
• Probability that it’s hot AND sunny?
• Probability that it’s hot?
• Probability that it’s hot OR sunny?
Typically, the events we care about are partial assignments, like P(T = hot)
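The three questions above can be answered by summing the joint entries that belong to each event. The sketch below assumes an event is given as a predicate over (t, w); the helper name `prob_event` is hypothetical.

```python
P_TW = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def prob_event(joint, in_event):
    """P(E): add up the probabilities of all outcomes that lie in the event E."""
    return sum(p for outcome, p in joint.items() if in_event(*outcome))


p_hot_and_sun = prob_event(P_TW, lambda t, w: t == "hot" and w == "sun")
p_hot = prob_event(P_TW, lambda t, w: t == "hot")  # a partial assignment
p_hot_or_sun = prob_event(P_TW, lambda t, w: t == "hot" or w == "sun")

print(p_hot_and_sun, p_hot, p_hot_or_sun)  # approximately 0.4, 0.5 and 0.7
```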
Quiz: Events
P(X, Y):
X     Y     P
+x    +y    0.2
+x    −y    0.3
−x    +y    0.4
−x    −y    0.1

P(+x, +y)?
→ 0.2
P(+x)?
P(+x, +y) + P(+x, −y)
→ 0.5
P(−y OR +x)?
P(+x, −y) + P(−x, −y) + P(+x, +y)
→ 0.6
Marginal Distributions
Marginal distributions are sub-tables which eliminate variables
Marginalization (summing out): Combine collapsed rows by adding

P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

P(T) = ?   $P(t) = \sum_s P(t, s)$
T     P
hot   0.5
cold  0.5

P(W) = ?   $P(s) = \sum_t P(t, s)$
W     P
sun   0.6
rain  0.4

In general: $P(X_1 = x_1) = \sum_{x_2} P(X_1 = x_1, X_2 = x_2)$
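Summing out can be sketched as a loop that collapses rows sharing the same value of the variable being kept. The function name `marginal` and its positional `keep_index` argument are assumptions chosen for this illustration.

```python
from collections import defaultdict

P_TW = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def marginal(joint, keep_index):
    """Sum out every variable except the one at position keep_index."""
    result = defaultdict(float)
    for outcome, p in joint.items():
        # Rows that agree on the kept variable collapse into one row by adding.
        result[outcome[keep_index]] += p
    return dict(result)


P_T = marginal(P_TW, 0)  # hot: 0.4 + 0.1, cold: 0.2 + 0.3
P_W = marginal(P_TW, 1)  # sun: 0.4 + 0.2, rain: 0.1 + 0.3
print(P_T)
print(P_W)
```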
Quiz: Marginal Distributions
P(X, Y):
X     Y     P
+x    +y    0.2
+x    −y    0.3
−x    +y    0.4
−x    −y    0.1

$P(x) = \sum_y P(x, y)$
P(X):
X     P
+x    0.5
−x    0.5

$P(y) = \sum_x P(x, y)$
P(Y):
Y     P
+y    0.6
−y    0.4
Conditional Probabilities
A simple relation between joint and conditional probabilities
$P(a \mid b) = \frac{P(a, b)}{P(b)}$

P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

$P(W = s \mid T = c) = \frac{P(W = s, T = c)}{P(T = c)} = \frac{0.2}{0.5} = 0.4$
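The definition P(a|b) = P(a, b) / P(b) can be checked against the joint table with a short sketch. Events are passed as predicates over (t, w); the function name `conditional_probability` is an assumption for illustration.

```python
P_TW = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def conditional_probability(joint, is_a, is_b):
    """P(a | b) = P(a, b) / P(b), with both events given as predicates."""
    p_b = sum(p for outcome, p in joint.items() if is_b(*outcome))
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    p_a_and_b = sum(
        p for outcome, p in joint.items() if is_a(*outcome) and is_b(*outcome)
    )
    return p_a_and_b / p_b


# P(W = sun | T = cold) = 0.2 / (0.2 + 0.3)
p_sun_given_cold = conditional_probability(
    P_TW, lambda t, w: w == "sun", lambda t, w: t == "cold"
)
print(p_sun_given_cold)  # approximately 0.4
```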
Quiz: Conditional Probabilities
P(X, Y):
X     Y     P
+x    +y    0.2
+x    −y    0.3
−x    +y    0.4
−x    −y    0.1

P(+x | +y)
→ $\frac{P(+x, +y)}{P(+y)}$
→ 1/3
P(−x | +y)
→ $\frac{P(-x, +y)}{P(-x, +y) + P(+x, +y)}$
→ 2/3
P(−y | +x)
→ $\frac{P(+x, -y)}{P(+x, -y) + P(+x, +y)}$
→ 3/5
Conditional Distributions
Probability distributions over some variables given fixed values of others

Joint Distribution
P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

Conditional Distributions
P(W | T = hot):
W     P
sun   0.8
rain  0.2

P(W | T = cold):
W     P
sun   0.4
rain  0.6
Normalization Trick
P(T, W):
T     W     P
hot   sun   0.4
hot   rain  0.1
cold  sun   0.2
cold  rain  0.3

$P(W = s \mid T = c) = \frac{P(W = s, T = c)}{P(T = c)} = \frac{P(W = s, T = c)}{P(W = s, T = c) + P(W = r, T = c)} = \frac{0.2}{0.2 + 0.3} = 0.4$

$P(W = r \mid T = c) = \frac{P(W = r, T = c)}{P(T = c)} = \frac{P(W = r, T = c)}{P(W = s, T = c) + P(W = r, T = c)} = \frac{0.3}{0.2 + 0.3} = 0.6$

P(W | T = cold):
W     P
sun   0.4
rain  0.6
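Both derivations share the same denominator, so the whole conditional table can be produced in two steps: select the joint entries that match the evidence, then divide by their sum. The sketch below assumes the evidence is always a value of T; the name `condition_on_temperature` is hypothetical.

```python
P_TW = {
    ("hot", "sun"): 0.4,
    ("hot", "rain"): 0.1,
    ("cold", "sun"): 0.2,
    ("cold", "rain"): 0.3,
}


def condition_on_temperature(joint, t_value):
    """Return P(W | T = t_value) by selecting matching rows and normalizing."""
    # Step 1: keep only the joint entries consistent with the evidence.
    selected = {w: p for (t, w), p in joint.items() if t == t_value}
    # Step 2: their sum is P(T = t_value); dividing by it makes them sum to 1.
    z = sum(selected.values())
    return {w: p / z for w, p in selected.items()}


P_W_given_cold = condition_on_temperature(P_TW, "cold")
P_W_given_hot = condition_on_temperature(P_TW, "hot")
print(P_W_given_cold)  # sun about 0.4, rain about 0.6
print(P_W_given_hot)   # sun about 0.8, rain about 0.2
```

The marginal P(T = cold) is never computed separately; it appears as the normalizer of the selected rows.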
Quiz: Normalization Trick
Find P (X|Y = −y)
To Normalize
(Dictionary) To bring or restore to a normal condition
• All entries sum to ONE
Procedure:
• Step 1: Compute Z = sum over all entries
• Step 2: Divide every entry by Z

Example 1 (Z = 0.5)
W     P    →  Normalized P
sun   0.2     0.4
rain  0.3     0.6

Example 2 (Z = 50)
T     W     P   →  Normalized P
hot   sun   20     0.4
hot   rain  5      0.1
cold  sun   10     0.2
cold  rain  15     0.3
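The two-step procedure translates almost line for line into code. This is a minimal sketch; the function name `normalize` and the dictionary representation of the tables are assumptions.

```python
def normalize(table):
    """Rescale the entries of a table so that they sum to one."""
    # Step 1: compute Z, the sum over all entries.
    z = sum(table.values())
    if z == 0:
        raise ValueError("cannot normalize a table whose entries sum to zero")
    # Step 2: divide every entry by Z.
    return {key: value / z for key, value in table.items()}


example_1 = {"sun": 0.2, "rain": 0.3}  # Z = 0.5
example_2 = {                           # Z = 50
    ("hot", "sun"): 20,
    ("hot", "rain"): 5,
    ("cold", "sun"): 10,
    ("cold", "rain"): 15,
}

print(normalize(example_1))
print(normalize(example_2))
```

The entries do not need to be probabilities to begin with; raw counts, as in Example 2, work the same way.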
Probabilistic Inference
Compute a desired probability from other
known probabilities (e.g. conditional from
joint)
We generally compute conditional
probabilities
• P (on time|no accidents) = 0.90
• These represent the agent’s beliefs given the
evidence
Probabilities change with new evidence:
• P (on time|no accidents, 5 a.m.) = 0.95
• P (on time|no accidents, 5 a.m., raining) = 0.80
• Observing new evidence causes beliefs to be
updated
Inference by Enumeration
General case ($X_1, X_2, \ldots, X_n$)
• Evidence variables: $E_1 \ldots E_k = e_1 \ldots e_k$
• Query* variable: Q
• Hidden variables: $H_1 \ldots H_r$
We want: $P(Q \mid e_1 \ldots e_k)$
* Works fine with multiple query variables, too
$P(Q, e_1 \ldots e_k) = \sum_{h_1 \ldots h_r} P(\underbrace{Q, h_1 \ldots h_r, e_1 \ldots e_k}_{X_1, X_2, \ldots, X_n})$
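The sum above gives the joint of the query and the evidence, not yet the conditional that was asked for. Assuming the usual final step of normalizing over the values q of the query variable (the slide text for this step did not survive extraction), the conditional follows from the definition of conditional probability:

```latex
\[
P(Q \mid e_1 \ldots e_k) = \frac{P(Q, e_1 \ldots e_k)}{P(e_1 \ldots e_k)},
\qquad
P(e_1 \ldots e_k) = \sum_{q} P(q, e_1 \ldots e_k)
\]
```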
Example: Inference by Enumeration
P(S, T, W):
S       T     W     P
summer  hot   sun   0.30
summer  hot   rain  0.05
summer  cold  sun   0.10
summer  cold  rain  0.05
winter  hot   sun   0.10
winter  hot   rain  0.05
winter  cold  sun   0.15
winter  cold  rain  0.20

P(W)
P(W = sun) = 0.65
P(W = rain) = 0.35
P(W | winter)
P(W = sun | winter) = 0.50
P(W = rain | winter) = 0.50
P(W | winter, hot)
P(W = sun | winter, hot) = 0.666...
P(W = rain | winter, hot) = 0.333...
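The three queries can be reproduced with a short sketch of inference by enumeration: select the rows consistent with the evidence, sum out the hidden variables, and normalize. The function name `enumerate_query` and its argument layout are assumptions made for this example.

```python
VARIABLES = ("S", "T", "W")
JOINT = {
    ("summer", "hot", "sun"): 0.30,
    ("summer", "hot", "rain"): 0.05,
    ("summer", "cold", "sun"): 0.10,
    ("summer", "cold", "rain"): 0.05,
    ("winter", "hot", "sun"): 0.10,
    ("winter", "hot", "rain"): 0.05,
    ("winter", "cold", "sun"): 0.15,
    ("winter", "cold", "rain"): 0.20,
}


def enumerate_query(joint, variables, query, evidence):
    """Return P(query | evidence) as a table over the query variable's values."""
    q = variables.index(query)
    unnormalized = {}
    for outcome, p in joint.items():
        # Step 1: skip entries that disagree with the evidence.
        consistent = all(
            outcome[variables.index(var)] == val for var, val in evidence.items()
        )
        if not consistent:
            continue
        # Step 2: adding rows with the same query value sums out hidden variables.
        unnormalized[outcome[q]] = unnormalized.get(outcome[q], 0.0) + p
    # Step 3: normalize so the result sums to 1.
    z = sum(unnormalized.values())
    if z == 0:
        raise ValueError("evidence has probability zero under the joint")
    return {value: p / z for value, p in unnormalized.items()}


print(enumerate_query(JOINT, VARIABLES, "W", {}))
print(enumerate_query(JOINT, VARIABLES, "W", {"S": "winter"}))
print(enumerate_query(JOINT, VARIABLES, "W", {"S": "winter", "T": "hot"}))
```

Every call scans the full joint table, which is the cost discussed on the next slide.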
Problems with Inference by Enumeration
Obvious problems:
• Worst-case time complexity: $O(d^n)$
• Space complexity $O(d^n)$ to store the joint distribution
Availability of the joint distributions and evidence
The Product Rule
Sometimes we have conditional distributions, but want the joint distribution
$P(x \mid y) = \frac{P(x, y)}{P(y)}$
$P(x, y) = P(y) \, P(x \mid y)$
Example: P(W) × P(D | W) = P(D, W)

P(W):
W     P
sun   0.8
rain  0.2

P(D | W):
D     W     P
wet   sun   0.1
dry   sun   0.9
wet   rain  0.7
dry   rain  0.3

P(D, W):
D     W     P
wet   sun   0.08
dry   sun   0.72
wet   rain  0.14
dry   rain  0.06
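The joint table in this example is obtained row by row as P(d, w) = P(w) P(d | w). A minimal sketch, with the table names `P_W`, `P_D_given_W` and `P_DW` chosen for illustration:

```python
P_W = {"sun": 0.8, "rain": 0.2}

# Conditional table P(D | W), keyed by (d, w).
P_D_given_W = {
    ("wet", "sun"): 0.1,
    ("dry", "sun"): 0.9,
    ("wet", "rain"): 0.7,
    ("dry", "rain"): 0.3,
}

# Product rule: P(d, w) = P(w) * P(d | w) for every row of the conditional table.
P_DW = {(d, w): P_W[w] * p for (d, w), p in P_D_given_W.items()}

for (d, w), p in P_DW.items():
    print(d, w, round(p, 2))
```

The resulting entries sum to 1 because each column of P(D | W) sums to 1 and P(W) sums to 1.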
The Chain Rule
More generally, we can always write any joint distribution as an incremental
product of conditional distributions
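The formulas for this slide did not survive extraction. As a reconstruction from the standard statement of the chain rule (an assumption about what the slide showed), applying the product rule repeatedly gives, for three variables and in general:

```latex
\begin{align*}
P(x_1, x_2, x_3) &= P(x_1)\, P(x_2 \mid x_1)\, P(x_3 \mid x_1, x_2) \\
P(x_1, x_2, \ldots, x_n) &= \prod_{i=1}^{n} P(x_i \mid x_1, \ldots, x_{i-1})
\end{align*}
```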
Bayes’ Rule
Two ways to factor a joint distribution over two variables:
$P(x, y) = P(x \mid y) P(y) = P(y \mid x) P(x)$
Dividing, we get:
$P(x \mid y) = \frac{P(y \mid x)}{P(y)} P(x)$
$P(\text{cause} \mid \text{effect}) = \frac{P(\text{effect} \mid \text{cause}) P(\text{cause})}{P(\text{effect})}$
Example:
• M: meningitis, S: stiff neck
P(+m) = 0.0001
P(+s | +m) = 0.8
P(+s | −m) = 0.01
$P(+m \mid +s) = \frac{P(+s \mid +m) P(+m)}{P(+s)} = \frac{P(+s \mid +m) P(+m)}{P(+s \mid +m) P(+m) + P(+s \mid -m) P(-m)} \approx 0.00794$
• Note: posterior probability of meningitis still very small
• Note: you should still get stiff necks checked out! Why?
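The meningitis numbers can be checked with a few lines of arithmetic. The variable names below are illustrative; the only step beyond the given values is P(−m) = 1 − P(+m).

```python
p_m = 0.0001            # prior P(+m)
p_s_given_m = 0.8       # P(+s | +m)
p_s_given_not_m = 0.01  # P(+s | -m)

# Total probability of a stiff neck: sum over both values of M.
p_s = p_s_given_m * p_m + p_s_given_not_m * (1 - p_m)

# Bayes' rule: P(+m | +s) = P(+s | +m) P(+m) / P(+s)
p_m_given_s = p_s_given_m * p_m / p_s

print(round(p_m_given_s, 5))
```

The posterior is roughly 80 times the prior, yet still below 1%, because stiff necks are far more often caused by something other than meningitis.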
Quiz: Bayes’ Rule
Given:

P(W):
W     P
sun   0.8
rain  0.2

P(D | W):
D     W     P
wet   sun   0.1
dry   sun   0.9
wet   rain  0.7
dry   rain  0.3

What is P(W | dry)?
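One way to work the quiz, as a sketch using only the two tables given: apply Bayes' rule with P(dry) expanded over both weather values.

```latex
\begin{align*}
P(\text{sun} \mid \text{dry})
  &= \frac{P(\text{dry} \mid \text{sun})\, P(\text{sun})}
          {P(\text{dry} \mid \text{sun})\, P(\text{sun}) + P(\text{dry} \mid \text{rain})\, P(\text{rain})}
   = \frac{0.9 \times 0.8}{0.9 \times 0.8 + 0.3 \times 0.2}
   = \frac{0.72}{0.78} \approx 0.923 \\
P(\text{rain} \mid \text{dry})
  &= \frac{0.3 \times 0.2}{0.78}
   = \frac{0.06}{0.78} \approx 0.077
\end{align*}
```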
Ghostbusters (Revisited)