
Bayesian Belief Networks

1
Introduction
Suppose you are trying to determine
if a patient has inhalational
anthrax. You observe the
following symptoms:
• The patient has a cough
• The patient has a fever
• The patient has difficulty
breathing

2
Introduction
You would like to determine how
likely the patient is infected with
inhalational anthrax given that the
patient has a cough, a fever, and
difficulty breathing

We are not 100% certain that the
patient has anthrax because of these
symptoms. We are dealing with
uncertainty!

3
Introduction
Now suppose you order an x-ray
and observe that the patient has a
wide mediastinum.
Your belief that the patient is
infected with inhalational anthrax is
now much higher.

4
Introduction
• In the previous slides, what you observed
affected your belief that the patient is
infected with anthrax
• This is called reasoning with uncertainty
• Wouldn’t it be nice if we had some
methodology for reasoning with
uncertainty? Why, in fact, we do…

5
Bayesian Networks
[Graph: HasAnthrax is the parent of HasCough, HasFever,
HasDifficultyBreathing, and HasWideMediastinum]

• In the opinion of many AI researchers, Bayesian
networks are the most significant contribution
in AI in the last 10 years
• They are used in many applications, e.g. spam
filtering, speech recognition, robotics, diagnostic
systems, and even syndromic surveillance
6
Outline
1. Introduction
2. Probability Primer
3. Bayesian networks

7
Probability Primer: Random Variables
• A random variable is the basic element of
probability
• Refers to an event with some degree of
uncertainty as to the outcome of the
event
• For example, the random variable A could
be the event of getting a heads on a coin
flip

8
Boolean Random Variables
• We will start with the simplest type of random
variables – Boolean ones
• Take the values true or false
• Think of the event as occurring or not
occurring
• Examples (Let A be a Boolean random
variable):
A = Getting heads on a coin flip
A = It will rain today
A = The Cubs win the World Series in 2007
9
Probabilities
We will write P(A = true) to mean the probability that A = true.
What is probability? It is the relative frequency with which an
outcome would be obtained if the process were repeated a large
number of times under similar conditions.

[Figure: the red area is P(A = true), the blue area is P(A = false);
the red and blue areas sum to 1]

10
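This frequency view is easy to demonstrate by simulation. Below is a minimal Python sketch (the 0.5 coin bias is an illustrative assumption) that estimates P(A = true) as a long-run relative frequency:

    import random

    def estimate_probability(trials=100_000, bias=0.5):
        """Estimate P(A = true) by relative frequency over repeated coin flips."""
        # Count how often the event A ("heads") occurs across repeated trials.
        successes = sum(1 for _ in range(trials) if random.random() < bias)
        return successes / trials

    print(estimate_probability())  # converges toward 0.5 as trials grows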
Conditional Probability
• P(A = true | B = true) = Out of all the outcomes in which B
is true, how many also have A equal to true
• Read this as: “Probability of A conditioned on B” or
“Probability of A given B”

H = “Have a headache”
F = “Coming down with Flu”

P(H = true) = 1/10
P(F = true) = 1/40
P(H = true | F = true) = 1/2

“Headaches are rare and flu is rarer, but if
you’re coming down with flu there’s a
50-50 chance you’ll have a headache.”

11
The Joint Probability Distribution
• We will write P(A = true, B = true) to mean
“the probability of A = true and B = true”
• Notice that:

P(H = true | F = true)
  = (Area of "H and F" region) / (Area of "F" region)
  = P(H = true, F = true) / P(F = true)

12
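As a quick numeric check in Python, using the headache/flu numbers from the previous slide (the joint value 1/80 follows from P(H|F) * P(F) = (1/2) * (1/40)):

    p_f = 1 / 40          # P(F = true)
    p_h_and_f = 1 / 80    # P(H = true, F = true) = P(H|F) * P(F)

    # Conditional probability as a ratio of joint to marginal.
    p_h_given_f = p_h_and_f / p_f
    print(p_h_given_f)    # 0.5, matching the "50-50 chance" on the slide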
The Joint Probability Distribution
• Joint probabilities can be between any number of
variables, e.g. P(A = true, B = true, C = true)
• For each combination of variables, we need to
say how probable that combination is
• The probabilities of these combinations need to
sum to 1

A     B     C     P(A,B,C)
false false false 0.1
false false true  0.2
false true  false 0.05
false true  true  0.05
true  false false 0.3
true  false true  0.1
true  true  false 0.05
true  true  true  0.15
                  (sums to 1)
13
The Joint Probability Distribution
• Once you have the joint probability distribution,
you can calculate any probability involving
A, B, and C
• Note: you may need to use marginalization and
Bayes' rule (neither of which is discussed in
these slides)

A     B     C     P(A,B,C)
false false false 0.1
false false true  0.2
false true  false 0.05
false true  true  0.05
true  false false 0.3
true  false true  0.1
true  true  false 0.05
true  true  true  0.15

Examples of things you can compute:
• P(A=true) = sum of P(A,B,C) in rows with A=true
• P(A=true, B=true | C=true) =
  P(A=true, B=true, C=true) / P(C=true)
14
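Both of these computations are mechanical once the joint table is stored. Here is a small Python sketch over the table above (the dictionary layout is my own):

    # Joint distribution P(A, B, C) from the table, keyed by (A, B, C).
    joint = {
        (False, False, False): 0.1,  (False, False, True): 0.2,
        (False, True,  False): 0.05, (False, True,  True): 0.05,
        (True,  False, False): 0.3,  (True,  False, True): 0.1,
        (True,  True,  False): 0.05, (True,  True,  True): 0.15,
    }

    # P(A = true): sum the rows where A is true (marginalization).
    p_a = sum(p for (a, b, c), p in joint.items() if a)

    # P(A = true, B = true | C = true) = P(A, B, C all true) / P(C = true)
    p_c = sum(p for (a, b, c), p in joint.items() if c)
    p_ab_given_c = joint[(True, True, True)] / p_c

    print(p_a)           # 0.6
    print(p_ab_given_c)  # 0.15 / 0.5 = 0.3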
The Problem with the Joint Distribution
• Lots of entries in the table to fill up!
• For k Boolean random variables, you need a
table of size 2^k
• How do we use fewer numbers? We need the
concept of independence

A     B     C     P(A,B,C)
false false false 0.1
false false true  0.2
false true  false 0.05
false true  true  0.05
true  false false 0.3
true  false true  0.1
true  true  false 0.05
true  true  true  0.15
15
Independence
Variables A and B are independent if any of
the following hold:
• P(A,B) = P(A) P(B)
• P(A | B) = P(A)
• P(B | A) = P(B)
This says that knowing the outcome of
A does not tell me anything new
about the outcome of B.

16
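A small numeric illustration of these equivalent conditions, using a made-up joint distribution constructed so that independence holds exactly:

    # An illustrative joint over two independent events A and B
    # (numbers are invented so that independence holds exactly).
    p = {(True, True): 0.12, (True, False): 0.18,
         (False, True): 0.28, (False, False): 0.42}

    p_a = p[(True, True)] + p[(True, False)]          # P(A=true) = 0.3
    p_b = p[(True, True)] + p[(False, True)]          # P(B=true) = 0.4
    print(abs(p[(True, True)] - p_a * p_b) < 1e-12)   # P(A,B) == P(A)P(B) -> True

    p_a_given_b = p[(True, True)] / p_b               # P(A|B) = 0.12 / 0.4 = 0.3
    print(abs(p_a_given_b - p_a) < 1e-12)             # P(A|B) == P(A) -> True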
Independence
How is independence useful?
• Suppose you have n coin flips and you want to
calculate the joint distribution P(C1, …, Cn)
• If the coin flips are not independent, you need 2^n
values in the table
• If the coin flips are independent, then

P(C1, ..., Cn) = ∏_{i=1}^{n} P(Ci)

Each P(Ci) table has 2 entries, and there are n of
them, for a total of 2n values
17
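A minimal sketch of this saving in Python (the coin biases are invented for illustration):

    # Per-coin probabilities: with independence, each coin needs only P(Ci = true).
    p_true = [0.5, 0.6, 0.7]  # n = 3 coins -> 2n = 6 stored values (true and false)

    def joint_prob(outcomes):
        """P(C1 = o1, ..., Cn = on) as a product of per-coin probabilities."""
        prob = 1.0
        for p, outcome in zip(p_true, outcomes):
            prob *= p if outcome else (1 - p)
        return prob

    print(joint_prob([True, False, True]))  # 0.5 * 0.4 * 0.7 = 0.14
    # Without independence we would need 2**3 = 8 joint-table entries instead.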
Conditional Independence
Variables A and B are conditionally
independent given C if any of the following
hold:
• P(A, B | C) = P(A | C) P(B | C)
• P(A | B, C) = P(A | C)
• P(B | A, C) = P(B | C)
Knowing C tells me everything about B. I don’t gain
anything by knowing A (either because A doesn’t
influence B or because knowing C provides all the
information knowing A would give).

18
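Here is a quick numerical check of the definition in Python, using a joint distribution deliberately constructed so that A and B are conditionally independent given C (all probabilities are invented for illustration):

    from itertools import product

    p_c = {True: 0.5, False: 0.5}
    p_a_given_c = {True: 0.8, False: 0.2}
    p_b_given_c = {True: 0.3, False: 0.6}

    # Build the joint so that A and B are independent given C.
    joint = {}
    for a, b, c in product([True, False], repeat=3):
        pa = p_a_given_c[c] if a else 1 - p_a_given_c[c]
        pb = p_b_given_c[c] if b else 1 - p_b_given_c[c]
        joint[(a, b, c)] = pa * pb * p_c[c]

    # Check the definition: P(A=true | B=true, C=true) == P(A=true | C=true)
    p_bc = sum(p for (a, b, c), p in joint.items() if b and c)
    print(joint[(True, True, True)] / p_bc)  # 0.8
    print(p_a_given_c[True])                 # 0.8 -- equal, as required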
Outline
1. Introduction
2. Probability Primer
3. Bayesian networks

19
Bayesian Networks
1. Encode the conditional independence
relationships between the variables
(attributes) in the graph structure
2. Are a compact representation of the joint
probability distribution over the variables
3. Allow the representation of dependencies
among subsets of variables
4. Provide a graphical model of causal
relationships between attributes
20
A Bayesian Network
A Bayesian network is made up of:
1. A Directed Acyclic Graph

[Graph: A → B, B → C, B → D]

2. A set of conditional probability tables, one for each node in the graph:

A     P(A)
false 0.6
true  0.4

A     B     P(B|A)
false false 0.01
false true  0.99
true  false 0.7
true  true  0.3

B     D     P(D|B)
false false 0.02
false true  0.98
true  false 0.05
true  true  0.95

B     C     P(C|B)
false false 0.4
false true  0.6
true  false 0.9
true  true  0.1
A Directed Acyclic Graph

[Graph: A → B, B → C, B → D]

Each node in the graph is a random variable.

A node X is a parent of another node Y if there is an
arrow from node X to node Y, e.g. A is a parent of B.

Informally, an arrow from node X to node Y means
X has a direct influence on Y.

21
A Set of Tables for Each Node

Each node Xi has a conditional probability
distribution P(Xi | Parents(Xi)) that quantifies
the effect of the parents on the node.

The parameters are the probabilities in these
conditional probability tables (CPTs).

[Graph: A → B, B → C, B → D]

A     P(A)
false 0.6
true  0.4

A     B     P(B|A)
false false 0.01
false true  0.99
true  false 0.7
true  true  0.3

B     C     P(C|B)
false false 0.4
false true  0.6
true  false 0.9
true  true  0.1

B     D     P(D|B)
false false 0.02
false true  0.98
true  false 0.05
true  true  0.95
A Set of Tables for Each Node

Conditional Probability Distribution for C given B:

B     C     P(C|B)
false false 0.4
false true  0.6
true  false 0.9
true  true  0.1

For a given combination of values of the parents (B
in this example), the entries for P(C=true | B) and
P(C=false | B) must add up to 1,
e.g. P(C=true | B=false) + P(C=false | B=false) = 1

If you have a Boolean variable with k Boolean parents, this table
has 2^(k+1) probabilities (but only 2^k need to be stored)
24
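This constraint is easy to validate mechanically. A small sketch using the P(C|B) table above (the helper function is my own):

    # CPT for P(C | B) from the slide, keyed by (parent value B, child value C).
    cpt_c = {
        (False, False): 0.4, (False, True): 0.6,
        (True,  False): 0.9, (True,  True): 0.1,
    }

    def cpt_rows_normalized(cpt, parent_values=(False, True)):
        """Check P(C=true | B=b) + P(C=false | B=b) == 1 for each parent value b."""
        return all(abs(cpt[(b, True)] + cpt[(b, False)] - 1.0) < 1e-9
                   for b in parent_values)

    print(cpt_rows_normalized(cpt_c))  # True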
Conditional Independence
Given its parents (P1, P2), a node (X) is
conditionally independent of its non-descendants
(ND1, ND2).

[Graph: parents P1 and P2 point to X; ND1 and ND2
are non-descendants; C1 and C2 are children of X]
25
The Joint Probability Distribution
We can compute the joint probability
distribution over all the variables X1, …, Xn in
the Bayesian net using the formula:

P(X1 = x1, ..., Xn = xn) = ∏_{i=1}^{n} P(Xi = xi | Parents(Xi))

where Parents(Xi) means the values of the parents of the node Xi
with respect to the graph.

26
Using a Bayesian Network Example
Using the network in the example, suppose you want to
calculate:

P(A = true, B = true, C = true, D = true)
  = P(A = true) * P(B = true | A = true) *
    P(C = true | B = true) * P(D = true | B = true)
  = (0.4) * (0.3) * (0.1) * (0.95)
  = 0.0114

The factorization comes from the graph structure; the
numbers come from the conditional probability tables.

[Graph: A → B, B → C, B → D]

27
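The same calculation is straightforward in code. Below is a sketch using the CPTs given earlier (the dictionary encoding is my own); the joint factors exactly as the formula on slide 26 prescribes:

    # CPTs from the example network (A -> B, B -> C, B -> D).
    p_a = {True: 0.4, False: 0.6}
    p_b_given_a = {(True, True): 0.3,  (True, False): 0.7,
                   (False, True): 0.99, (False, False): 0.01}
    p_c_given_b = {(True, True): 0.1,  (True, False): 0.9,
                   (False, True): 0.6,  (False, False): 0.4}
    p_d_given_b = {(True, True): 0.95, (True, False): 0.05,
                   (False, True): 0.98, (False, False): 0.02}

    def joint(a, b, c, d):
        """P(A=a, B=b, C=c, D=d) via the chain-rule factorization of the net."""
        return (p_a[a] * p_b_given_a[(a, b)] *
                p_c_given_b[(b, c)] * p_d_given_b[(b, d)])

    print(joint(True, True, True, True))  # 0.4 * 0.3 * 0.1 * 0.95 = 0.0114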
Example
[Figure-only slide]
Inference
• Using a Bayesian network to compute
probabilities is called inference
• In general, inference involves queries of the form:
P(X | E)
E = The evidence variable(s)
X = The query variable(s)

30
Inference
[Graph: HasAnthrax is the parent of HasCough, HasFever,
HasDifficultyBreathing, and HasWideMediastinum]

• An example of a query would be:


P( HasAnthrax = true | HasFever = true, HasCough = true)
• Note: Even though HasDifficultyBreathing and
HasWideMediastinum are in the Bayesian network, they are
not given values in the query (i.e. they do not appear either as
query variables or evidence variables)
• They are treated as unobserved variables
31
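The CPTs for the anthrax network are not given in these slides, so as an illustration here is inference by enumeration on the small A, B, C, D example network instead; it reuses the joint() function from the sketch above, and the unobserved variables are simply summed out:

    from itertools import product

    # Reuses the CPT dictionaries and joint() from the previous sketch.
    def prob_query(d_val, a_val):
        """P(D = d_val | A = a_val), enumerating the unobserved B and C."""
        numer = sum(joint(a_val, b, c, d_val)
                    for b, c in product([True, False], repeat=2))
        denom = sum(joint(a_val, b, c, d)
                    for b, c, d in product([True, False], repeat=3))
        return numer / denom

    print(prob_query(True, True))  # 0.3*0.95 + 0.7*0.98 = 0.971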
The Bad News
• Exact inference is feasible in small to
medium-sized networks
• Exact inference in large networks takes a
very long time
• We resort to approximate inference
techniques, which are much faster and give
pretty good results

32
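One of the simplest approximate techniques is rejection sampling: draw samples from the network in topological order and keep only those consistent with the evidence. A minimal sketch for the example network (reusing the CPT dictionaries from the earlier sketch; illustrative only, not how production systems do it):

    import random

    def sample_network():
        """Draw one sample (a, b, c, d), sampling each node given its parents."""
        a = random.random() < p_a[True]
        b = random.random() < p_b_given_a[(a, True)]
        c = random.random() < p_c_given_b[(b, True)]
        d = random.random() < p_d_given_b[(b, True)]
        return a, b, c, d

    def rejection_sample(n=100_000):
        """Approximate P(D = true | A = true) by discarding samples with A = false."""
        kept = [d for a, _, _, d in (sample_network() for _ in range(n)) if a]
        return sum(kept) / len(kept)

    print(rejection_sample())  # approaches 0.971 as n grows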
