
UNIT III: REASONING

From Classical Logic to Uncertainty

• Logical reasoning is the process of deriving previously
  unknown facts based on the known ones.
• However, it is not always possible to have access to the
  entire set of facts to do the reasoning.
• What we lose under uncertainty:
  - Deduction (reasoning in stages)
  - Incrementality
Rule-based Systems:
A Semantic Model for Uncertainty

• Rule-based systems:
  - A semantic model for handling uncertainty.
  - Another name for expert systems.
  - Organize the knowledge in terms of “if-then” rules,
    each associated with a numeric certainty measure.
  - Suffer from most of the problems of classical logic
    under uncertainty.
Reasoning Under Uncertainty

• Most tasks requiring intelligent behavior have some
  degree of uncertainty associated with them.
• The uncertainty that occurs in knowledge-based systems
  may be caused by problems with the data. For example:
  - Data might be missing or unavailable.
  - Data might be present but unreliable or ambiguous due to
    measurement errors.
  - The representation of the data may be imprecise or
    inconsistent.
  - Data may just be the user’s best guess.
  - Data may be based on defaults, and the defaults may have
    exceptions.
Reasoning Under Uncertainty: contd.

• The uncertainty may also be caused by the represented
  knowledge itself, since it might
  - represent best guesses of the experts, based on plausible
    or statistical associations they have observed, or
  - not be appropriate in all situations (e.g., it may have
    indeterminate applicability).

• Given these numerous sources of error, most
  knowledge-based systems require the incorporation
  of some form of uncertainty management.
Reasoning Under Uncertainty:
Implementation Issues

• When implementing an uncertainty scheme, we must be
  concerned with three issues:
  - How to represent uncertain data?
  - How to combine two or more pieces of uncertain data?
  - How to draw inferences using uncertain data?

• Implementation method: using probabilities.


Probability Theory:
Basic Terminology

• Random variables describe the “features” of the
  world whose status may or may not be known.

• Propositions are bi-modal (i.e., Boolean) random
  variables.
  - We will assume a finite set of propositional
    symbols, denoted A, B, …
  - Sometimes we will use English sentences to
    denote propositions, e.g., white(milk).
Probability Theory:
Basic Terminology

• Without loss of generality, we will assume that the
  world can be described by a finite set of propositions.
• An event in the world is an occurrence that is modeled
  by assigning truth values to one or more propositions,
  e.g., “the outcomes of rolling two dice are the same”.
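As a quick illustrative sketch (my addition, not part of the original slides), this event can be modeled by enumerating the 36 equally likely outcomes of rolling two dice:

```python
from itertools import product

# Enumerate all 36 equally likely outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

# The event "the outcomes of rolling two dice are the same".
same = [(a, b) for (a, b) in outcomes if a == b]

p_same = len(same) / len(outcomes)
print(p_same)  # 6/36 ≈ 0.1667
```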
Probability Theory

• Probabilities are used as a way of representing the
  structure of knowledge and of reasoning over that
  knowledge.
• Notation: P(A) is the probability of the proposition A
  being true in the knowledge base (or, alternatively, the
  probability that the event A occurs in the world).
• The axioms of probability theory:
  - 0 ≤ P(A) ≤ 1
  - P(certain event) = 1
  - P(A or B) = P(A) + P(B) - P(A and B)
Compound probabilities

• Events that do not affect each other in any way are called independent
  events. For two independent events A and B,
  P(A ∩ B) = P(A) P(B)
• More generally, the events E1, E2, ..., En in a sample space S are
  independent if
  P(Ei1 ∩ ... ∩ Eik) = P(Ei1) ... P(Eik)
  for every subset {i1, ..., ik} ⊆ {1, ..., n}, 1 ≤ k ≤ n.
• If events A and B are mutually exclusive, then

  P(A ∪ B) = P(A) + P(B)

• If events A and B are not mutually exclusive, then

  P(A ∪ B) = P(A) + P(B) - P(A ∩ B)

  This is also called the addition law.
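Both laws can be checked numerically on a small sample space. The dice example below is an illustration I have added, not part of the slides:

```python
from itertools import product

# Sample space: all 36 outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def p(event):
    """Probability of an event, given as a predicate over outcomes."""
    return sum(1 for w in omega if event(w)) / len(omega)

A = lambda w: w[0] == 6  # first die shows 6
B = lambda w: w[1] == 6  # second die shows 6

# Independence: P(A and B) = P(A) P(B), here 1/36 = 1/6 * 1/6.
p_ab = p(lambda w: A(w) and B(w))
print(p_ab, p(A) * p(B))

# Addition law: P(A or B) = P(A) + P(B) - P(A and B), here 11/36.
p_a_or_b = p(lambda w: A(w) or B(w))
print(p_a_or_b, p(A) + p(B) - p_ab)
```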
Conditional Probabilities

• A conditional probability, P(A | B), describes the belief in the
  event A under the assumption that another event B is known
  with absolute certainty.
  Formal definition:
  P(A | B) = P(A ∩ B) / P(B),  P(B) ≠ 0
• The multiplicative law of probability for two events is then
  defined as

  P(A ∩ B) = P(A | B) P(B)

  which is equivalent to
  P(A ∩ B) = P(B | A) P(A)
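As a sketch (again on the assumed two-dice sample space, my own example), the definition and the multiplicative law can be verified directly:

```python
from itertools import product

# Sample space: all 36 outcomes of rolling two fair dice.
omega = list(product(range(1, 7), repeat=2))

def p(event):
    """Probability of an event, given as a predicate over outcomes."""
    return sum(1 for w in omega if event(w)) / len(omega)

def p_cond(a, b):
    """P(A | B) = P(A and B) / P(B), defined only when P(B) != 0."""
    return p(lambda w: a(w) and b(w)) / p(b)

A = lambda w: w[0] + w[1] == 8  # the two dice sum to 8
B = lambda w: w[0] == 6         # the first die shows 6

# Of the 6 outcomes where the first die shows 6, only (6, 2) sums to 8.
print(p_cond(A, B))  # 1/6 ≈ 0.1667

# Multiplicative law: P(A and B) = P(A | B) P(B)
print(p(lambda w: A(w) and B(w)), p_cond(A, B) * p(B))
```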
Bayes’ Theory

• The basis of Bayesian theory is conditional probability.
• Bayesian theory sees a conditional probability as a way to
  describe the structure and organization of human knowledge.
• In this view, A | B stands for the event A in the
  “context” of the event B,
  e.g., the symptom A in the context of a disease B.
Evidential Reasoning:
The Inversion Rule

• Reasoning about “hypotheses” and the “evidence” that does
  or does not support those hypotheses is the main venue for
  Bayesian inference.
  P(H | e): given that I have observed the evidence e, the
  probability that my hypothesis H is true.

  P(H | e) = P(e | H) P(H) / P(e),

  where P(e | H) is the probability that the evidence e
  will actually be observed in the world if the
  hypothesis H is true.
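A minimal sketch of the inversion rule in the disease/symptom setting mentioned earlier. All of the numbers below (prior, sensitivity, false-positive rate) are made up for illustration only:

```python
# H = "patient has the disease", e = "test is positive".
p_h = 0.01              # prior P(H): assumed 1% of patients have the disease
p_e_given_h = 0.95      # P(e | H): assumed test sensitivity
p_e_given_not_h = 0.05  # P(e | not H): assumed false-positive rate

# Total probability of observing the evidence e:
p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)

# Inversion rule: P(H | e) = P(e | H) P(H) / P(e)
p_h_given_e = p_e_given_h * p_h / p_e
print(round(p_h_given_e, 3))  # ≈ 0.161
```

Note how the posterior stays low despite the high sensitivity: the prior P(H) = 0.01 dominates, which is exactly the kind of effect the inversion rule makes explicit.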
Bayes’ Theorem: Derivation

• Note that conditional probability is defined as

  P(H | E) = P(H ∩ E) / P(E),  for P(E) ≠ 0,

  i.e., the conditional probability of H given E.
Bayes’ Theorem: Derivation

• From the definition of conditional probability:

  P(H | E) = P(H ∩ E) / P(E)

  Furthermore, we have

  P(E | H) = P(E ∩ H) / P(H)

  So,
  P(E | H) P(H) = P(H | E) P(E) = P(H ∩ E)
  Thus
  P(H | E) = P(E | H) P(H) / P(E)
Probability vs
Conditional Probability

• An example
  The table below shows hypothetical probabilities of a disk
  crash using a Brand X drive within one year.

  Hypothetical probabilities of a disk crash:

                      Brand X   Brand X’   Total of rows
  Crash C               0.6       0.1          0.7
  No crash C’           0.2       0.1          0.3
  Total of columns      0.8       0.2          1.0

  The same table, in terms of events:

                      X            X’            Total of rows
  C                   P(C ∩ X)     P(C ∩ X’)     P(C)
  C’                  P(C’ ∩ X)    P(C’ ∩ X’)    P(C’)
  Total of columns    P(X)         P(X’)         1
Probability vs
Conditional Probability

• Probability interpretation of two sets
  Using the above tables, the probabilities of all events
  can be calculated. Some probabilities are:

  (1) P(C) = 0.7
  (2) P(C’) = 0.3
  (3) P(X) = 0.8
  (4) P(X’) = 0.2
  (5) P(C ∩ X) = 0.6
      (the probability of a crash and using Brand X)
Probability vs
Conditional Probability

(6) The probability of a crash, given that Brand X is used, is

    P(C | X) = P(C ∩ X) / P(X) = 0.6 / 0.8 = 0.75

(7) The probability of a crash, given that Brand X is not used, is

    P(C | X’) = P(C ∩ X’) / P(X’) = 0.1 / 0.2 = 0.50
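Quantities (1)-(7) can be recomputed mechanically from the joint table, which makes the marginal/conditional distinction concrete. This sketch is my addition, using the slide's numbers:

```python
# Joint probabilities P(event, brand) from the disk-crash table above.
joint = {
    ("crash", "X"): 0.6, ("crash", "X'"): 0.1,
    ("no_crash", "X"): 0.2, ("no_crash", "X'"): 0.1,
}

# Marginals: sum the joint probabilities over the other variable.
p_crash = joint[("crash", "X")] + joint[("crash", "X'")]        # ≈ 0.7
p_x = joint[("crash", "X")] + joint[("no_crash", "X")]          # ≈ 0.8
p_x_prime = joint[("crash", "X'")] + joint[("no_crash", "X'")]  # ≈ 0.2

# Conditionals (6) and (7): P(C | X) and P(C | X').
p_crash_given_x = joint[("crash", "X")] / p_x              # ≈ 0.75
p_crash_given_x_prime = joint[("crash", "X'")] / p_x_prime  # ≈ 0.50
print(p_crash_given_x, p_crash_given_x_prime)
```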
Probability vs
Conditional Probability

• Probabilities (5) and (6) may appear to have similar meanings when
  you read their descriptions. However, (5) is simply the intersection
  of two events, while (6) is a conditional probability.
• The meaning of (5) is the following:
  IF a disk drive is picked randomly, THEN 0.6 of the time it will be
  Brand X and have crashed.
• In other words, we are just picking samples from the population of disk
  drives. Some of those drives are Brand X and have crashed (0.6), some
  are not Brand X and have crashed (0.1), some are Brand X and have not
  crashed (0.2), and some are not Brand X and have not crashed (0.1).
• In contrast, the meaning of the conditional probability (6) is very
  different:
  IF a Brand X disk drive is picked, THEN 0.75 of the time it will have
  crashed.
