Statistical Reasoning: Dr. Anjali Diwan
► Medical diagnosis
► Facts: symptoms, lab test results, and other observed
findings (called manifestations)
► KB: causal associations between diseases and
manifestations
► Reasoning: one or more diseases whose presence would
causally explain the occurrence of the given
manifestations
► Many other reasoning processes (e.g., word-sense
disambiguation in natural language processing, image
understanding, criminal investigation) can also be
seen as abductive reasoning
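The diagnostic pattern above (diseases causally explaining manifestations) can be sketched as a tiny abduction procedure: find the smallest set of hypotheses whose known effects cover all observations. The causal KB and observations below are illustrative placeholders, not from any real medical dataset.

```python
# Minimal sketch of abductive diagnosis: find the smallest set of diseases
# whose known manifestations cover every observed finding.
# All diseases and symptoms below are illustrative examples.
from itertools import combinations

causes = {                      # KB: disease -> manifestations it can explain
    "flu":     {"fever", "cough", "fatigue"},
    "allergy": {"sneezing", "cough"},
    "anemia":  {"fatigue", "pallor"},
}

def abduce(kb, observed):
    """Return the smallest set of hypotheses that covers every observation."""
    diseases = list(kb)
    for size in range(1, len(diseases) + 1):
        for combo in combinations(diseases, size):
            covered = set().union(*(kb[d] for d in combo))
            if observed <= covered:          # every observation explained
                return set(combo)
    return None

print(abduce(causes, {"fever", "cough", "fatigue"}))  # {'flu'}
```

Preferring the smallest covering set is one simple parsimony criterion; probabilistic approaches later in these slides rank explanations by posterior probability instead.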
Comparing abduction, deduction, and induction
Deduction:
► Major premise: A ⇒ B (All balls in the box are black)
► Minor premise: A (These balls are from the box)
► Conclusion: B (These balls are black)
Sources of uncertainty
► Uncertain inputs
► Missing data
► Noisy data
► Uncertain knowledge
► Multiple causes lead to multiple effects
► Incomplete enumeration of conditions or effects
► Incomplete knowledge of causality in the domain
► Probabilistic/stochastic effects
► Uncertain outputs
► Abduction and induction are inherently uncertain
► Default reasoning, even in deductive fashion, is
uncertain
► Incomplete deductive inference may be uncertain
► Probabilistic reasoning only gives probabilistic
results (summarizes uncertainty from various sources)
Bayesian reasoning
► Probability theory
► Bayesian inference
► Use probability theory and information about independence
► Reason diagnostically (from evidence (effects) to conclusions (causes)) or causally
(from causes to effects)
► Bayesian networks
► Compact representation of probability distribution over a set of propositional
random variables
► Take advantage of independence relationships
Other uncertainty representations
► Default reasoning
► Nonmonotonic logic: Allow the retraction of default beliefs if they prove
to be false
► Rule-based methods
► Certainty factors (Mycin): propagate simple models of belief through
causal or diagnostic rules
► Evidential reasoning
► Dempster-Shafer theory: Bel(P) is a measure of the evidence for P; Bel(¬P)
is a measure of the evidence against P; together they define a belief
interval (lower and upper bounds on confidence)
► Fuzzy reasoning
► Fuzzy sets: How well does an object satisfy a vague property?
► Fuzzy logic: “How true” is a logical statement?
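The Dempster-Shafer belief interval described above can be sketched numerically: mass is assigned to subsets of a binary frame, Bel(P) sums the mass committed entirely to P, and the plausibility 1 − Bel(¬P) gives the interval's upper bound. The mass values below are illustrative.

```python
# Sketch of a Dempster-Shafer belief interval over a binary frame {P, notP}.
# Mass assigned to the whole frame represents uncommitted evidence.
mass = {
    frozenset({"P"}):         0.5,   # evidence directly for P
    frozenset({"notP"}):      0.2,   # evidence directly against P
    frozenset({"P", "notP"}): 0.3,   # uncommitted evidence
}

def bel(hypothesis):
    """Belief: total mass of subsets lying entirely inside the hypothesis."""
    return sum(m for s, m in mass.items() if s <= hypothesis)

bel_p = bel(frozenset({"P"}))         # lower bound of confidence in P
pl_p = 1 - bel(frozenset({"notP"}))   # upper bound (plausibility of P)
print(f"belief interval for P: [{bel_p}, {pl_p}]")  # [0.5, 0.8]
```

The gap between Bel(P) and Pl(P) is exactly the uncommitted mass, which a single point probability cannot express.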
Decision making with uncertainty
► Rational behavior:
► For each possible action, identify the possible
outcomes
► Compute the probability of each outcome
► Compute the utility of each outcome
► Compute the probability-weighted (expected) utility
over possible outcomes for each action
► Select the action with the highest expected utility
(principle of Maximum Expected Utility)
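The Maximum Expected Utility steps above can be sketched directly: weight each outcome's utility by its probability, sum per action, and pick the maximum. The probabilities and utilities below are illustrative placeholders.

```python
# Sketch of the Maximum Expected Utility principle: for each action,
# compute the probability-weighted utility over its outcomes, then
# select the action with the highest expected utility.
actions = {
    # action: list of (probability, utility) pairs over its outcomes
    "take_umbrella":  [(0.3, 60), (0.7, 70)],    # rain / no rain
    "leave_umbrella": [(0.3, 0),  (0.7, 100)],   # rain / no rain
}

def expected_utility(outcomes):
    return sum(p * u for p, u in outcomes)

best = max(actions, key=lambda a: expected_utility(actions[a]))
for a, outs in actions.items():
    print(a, expected_utility(outs))   # 67 vs. 70
print("MEU action:", best)             # leave_umbrella
```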
Probabilistic reasoning
[Figure: Venn diagram of events a and b, with their overlap a ∧ b]
Probabilistic reasoning:
• Bayesian Statistics
► A Bayesian network can be used for building models from data and expert
opinion, and it consists of two parts:
• Directed acyclic graph
• Table of conditional probabilities
► The generalized form of a Bayesian network that represents and solves decision
problems under uncertain knowledge is known as an influence diagram.
► Note: It is used to represent conditional dependencies.
► A Bayesian network graph is made up of nodes and arcs (directed links),
where:
• Each node corresponds to a random variable, which can be continuous or
discrete.
• Arcs (directed arrows) represent the causal relationships or conditional
probabilities between random variables. These directed links connect pairs of
nodes in the graph; a link means that one node directly influences the other,
and if there is no directed link between two nodes, neither directly
influences the other.
• Note: The Bayesian network graph contains no cycles. Hence, it is known
as a directed acyclic graph (DAG).
• The Bayesian network has two main components:
1. Causal component (the graph structure)
2. Actual numbers (the conditional probabilities)
• Each node Xi in the Bayesian network has a conditional probability
distribution P(Xi | Parents(Xi)), which quantifies the effect of the parents
on that node.
• A Bayesian network is based on the joint probability distribution and
conditional probability.
• Example: Harry installed a new burglar alarm at his home to detect burglary.
The alarm reliably detects a burglary, but it also responds to minor
earthquakes. Harry has two neighbors, David and Sophia, who have taken
responsibility for informing Harry at work when they hear the alarm. David
always calls Harry when he hears the alarm, but he sometimes confuses it with
the phone ringing and calls then too. Sophia, on the other hand, likes to
listen to loud music, so she sometimes fails to hear the alarm. Here we would
like to compute the probability of the burglar alarm sounding.
• Problem: Calculate the probability that the alarm has sounded, but neither
a burglary nor an earthquake has occurred, and both David and Sophia called
Harry.
Note: the events occurring in this network are:
• Burglary (B)
• Earthquake (E)
• Alarm (A)
• David calls (D)
• Sophia calls (S)
• From the formula of the joint distribution, we can write the problem
statement in the form of a probability distribution:
P(S, D, A, ¬B, ¬E) = P(S|A) · P(D|A) · P(A|¬B ∧ ¬E) · P(¬B) · P(¬E)
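The factored joint above can be evaluated directly once conditional probability tables are fixed. The slides do not give concrete numbers, so the values below are illustrative placeholders.

```python
# Evaluating the factored joint for the alarm network:
# P(S, D, A, ¬B, ¬E) = P(S|A) P(D|A) P(A|¬B,¬E) P(¬B) P(¬E).
# All probability values are illustrative placeholders.
p_s_given_a     = 0.75    # Sophia calls when the alarm sounds
p_d_given_a     = 0.91    # David calls when the alarm sounds
p_a_given_nb_ne = 0.001   # alarm sounds with no burglary and no earthquake
p_not_b         = 0.999   # no burglary
p_not_e         = 0.998   # no earthquake

p = p_s_given_a * p_d_given_a * p_a_given_nb_ne * p_not_b * p_not_e
print(f"P(S, D, A, ¬B, ¬E) = {p:.8f}")
```

Note how the network lets us answer the query from five local numbers instead of a full joint table over 2^5 = 32 entries.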
Bayes’s rule:
P(H | E) = P(E | H) P(H) / P(E)
Bayesian inference
► In the setting of diagnostic/evidential reasoning, we know the prior
probability of each hypothesis, P(Hi), and the conditional probability of
the evidence given each hypothesis, P(Ej | Hi)
► Want to compute the posterior probability P(Hi | Ej)
► Bayes’ theorem (formula 1): P(Hi | Ej) = P(Ej | Hi) P(Hi) / P(Ej)
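Bayes' theorem as stated above can be applied to a small diagnostic calculation, expanding P(E) by total probability over the hypothesis and its negation. The disease/test numbers below are an illustrative example, not taken from the slides.

```python
# Diagnostic reasoning with Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E),
# with P(E) computed by total probability. Numbers are illustrative.
p_h       = 0.01    # prior: patient has the disease
p_e_h     = 0.95    # test positive given disease (sensitivity)
p_e_not_h = 0.05    # test positive given no disease (false-positive rate)

p_e = p_e_h * p_h + p_e_not_h * (1 - p_h)    # P(E) by total probability
posterior = p_e_h * p_h / p_e                # P(H|E)
print(f"P(H|E) = {posterior:.4f}")
```

Even with a sensitive test, the low prior keeps the posterior modest here, which is exactly the effect Bayes' theorem is designed to capture.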
Simple Bayesian diagnostic reasoning
► Knowledge base:
► Evidence / manifestations: E1, …, Em
► Hypotheses / disorders: H1, …, Hn
► Ej and Hi are binary; hypotheses are mutually exclusive (non-overlapping) and exhaustive
(cover all possible cases)
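Under the assumptions just listed (mutually exclusive, exhaustive hypotheses and conditionally independent evidence), the posterior over hypotheses factors as P(Hi | E1, …, Em) = α P(Hi) ∏j P(Ej | Hi). A sketch with illustrative priors and likelihoods:

```python
# Simple (naive) Bayesian diagnosis: hypotheses are mutually exclusive and
# exhaustive, evidence items are conditionally independent given each
# hypothesis, so the posterior is a normalized product. Numbers illustrative.
priors = {"H1": 0.6, "H2": 0.3, "H3": 0.1}
likelihoods = {            # P(E_j | H_i) for the two observed evidence items
    "H1": [0.2, 0.1],
    "H2": [0.5, 0.4],
    "H3": [0.9, 0.8],
}

unnormalized = {h: priors[h] * likelihoods[h][0] * likelihoods[h][1]
                for h in priors}
alpha = 1 / sum(unnormalized.values())      # normalization constant
posterior = {h: alpha * v for h, v in unnormalized.items()}
print(posterior)    # H3 wins despite its small prior
```

Because the hypotheses are exhaustive, normalizing by α avoids computing P(E1, …, Em) explicitly.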
Bayesian diagnostic reasoning II
Limitations of simple Bayesian inference
► Cannot easily handle multi-fault situation, nor cases where intermediate
(hidden) causes exist:
► Disease D causes syndrome S, which causes correlated manifestations M1 and M2
► Consider a composite hypothesis H1 ∧ H2, where H1 and H2 are independent.
What is the relative posterior?
► P(H1 ∧ H2 | E1, …, Em) = α P(E1, …, Em | H1 ∧ H2) P(H1 ∧ H2)
= α P(E1, …, Em | H1 ∧ H2) P(H1) P(H2)
= α ∏j=1…m P(Ej | H1 ∧ H2) P(H1) P(H2)
► How do we compute P(Ej | H1 ∧ H2)?
Limitations of simple Bayesian inference II
► Assume H1 and H2 are independent, given E1, …, Em?
► P(H1 ∧ H2 | E1, …, Em) = P(H1 | E1, …, Em) P(H2 | E1, …, Em)
► This is a very unreasonable assumption
► Earthquake and Burglar are independent, but not given Alarm:
► P(burglar | alarm, earthquake) << P(burglar | alarm)
► Another limitation is that simple application of Bayes’s rule doesn’t
allow us to handle causal chaining:
► A: this year’s weather; B: cotton production; C: next year’s cotton
price
► A influences C indirectly: A→ B → C
► P(C | B, A) = P(C | B)
► Need a richer representation to model interacting hypotheses,
conditional independence, and causal chaining
► Next time: conditional independence and Bayesian networks!
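The causal-chaining property P(C | B, A) = P(C | B) for a chain A → B → C can be checked numerically: build a joint that factors along the chain, then compare the two conditionals. The CPT values below are illustrative.

```python
# Numeric check of the chain property: if A -> B -> C, then
# P(C | B, A) = P(C | B). CPT values are illustrative.
from itertools import product

p_a   = {True: 0.3, False: 0.7}                                            # P(A)
p_b_a = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}    # P(B|A)
p_c_b = {True: {True: 0.6, False: 0.4}, False: {True: 0.25, False: 0.75}}  # P(C|B)

# Joint distribution factored along the chain: P(a,b,c) = P(a) P(b|a) P(c|b)
joint = {(a, b, c): p_a[a] * p_b_a[a][b] * p_c_b[b][c]
         for a, b, c in product([True, False], repeat=3)}

def cond(c, given):
    """P(C=c | variables fixed in `given`), given maps 'A'/'B' to values."""
    match = lambda k: all(k["ABC".index(v)] == val for v, val in given.items())
    num = sum(p for k, p in joint.items() if match(k) and k[2] == c)
    den = sum(p for k, p in joint.items() if match(k))
    return num / den

print(cond(True, {"B": True, "A": True}))   # P(C | B, A)
print(cond(True, {"B": True}))              # P(C | B) -- same value
```

Once B is known, A adds no further information about C; Bayesian networks exploit exactly this kind of conditional independence.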
What is a rule-based system in AI?
► A rule-based system is a system that applies human-made rules to store, sort
and manipulate data. In doing so, it mimics human intelligence.
► Rule-based systems require a set of facts or a source of data, and a set of
rules for manipulating that data. These rules are sometimes referred to as ‘IF
statements’, as they tend to follow the pattern ‘IF X happens THEN do Y’.
► Widely used in artificial intelligence, rule-based expert systems are not
only responsible for modeling intelligent behavior in machines and building
expert systems that outperform human experts, but also help in many other
applications.
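The IF-THEN pattern described above can be sketched as a minimal forward-chaining engine: rules fire against a working set of facts until nothing new can be derived. The facts and rules below are illustrative.

```python
# Minimal forward-chaining rule-based system: each rule is
# (IF these facts all hold, THEN add this fact). Rules are applied
# repeatedly until no rule adds anything new. Facts are illustrative.
rules = [
    ({"rain"}, "wet_ground"),
    ({"wet_ground", "freezing"}, "icy_road"),
    ({"icy_road"}, "drive_slowly"),
]

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)      # fire the rule
                changed = True
    return facts

print(forward_chain({"rain", "freezing"}, rules))
```

Note the chaining: the second rule can only fire after the first adds "wet_ground", which is why the loop repeats until a full pass changes nothing.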
SNSCT 16BM031 AI
Semantic network
Frames
Conceptual dependency
Scripts
WHAT IS SEMANTIC NET?