
Name : Suman Kundu

Sem : 7th

Dept : CSE
Topic : Bayesian Classifier
Bayesian Classifier
• They are statistical classifiers.
• Based primarily on Bayes' Theorem.
• In Bayesian terms, every tuple X is called evidence.
• Let H be some hypothesis, such as that X belongs to a specified class C.
• For classification problems, we want to determine P(H|X), the probability that the hypothesis H holds given X.
• Simply put, we are looking for the probability that X belongs to class C given that we know the attribute description of X, i.e. we are computing P(C|X).
• After computing P(Ci|X) for all classes Ci, i = 1..n, we simply assign X to the class with the highest value of P(Ci|X).
Bayes’ Theorem
P(C|X) = P(X, C)/P(X)
       = P(X|C)P(C)/P(X)

• We need to find the class Ci which maximizes P(Ci|X). This is the class of X.
• Since P(X) is constant across all classes, we can reduce the problem as follows:

maximize P(C|X) ∝ P(X|C)P(C)

• However, computing P(X|C) is incredibly complex for large datasets involving a large number of attributes or dimensions.
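To make the decision rule concrete, here is a minimal Python sketch of the resulting maximum a posteriori (MAP) rule; the prior and likelihood values are made-up numbers for illustration only, not taken from any dataset in these slides.

```python
# Minimal sketch of the MAP decision rule: assign X to the class
# that maximizes P(X|C)P(C). All numbers here are hypothetical.

priors = {"yes": 0.6, "no": 0.4}           # hypothetical P(C)
likelihoods = {"yes": 0.05, "no": 0.02}    # hypothetical P(X|C)

# P(X) is the same for every class, so it can be dropped:
# argmax_C P(C|X) = argmax_C P(X|C) P(C)
scores = {c: likelihoods[c] * priors[c] for c in priors}
predicted = max(scores, key=scores.get)
print(scores, "->", predicted)             # -> yes
```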
The Independence Assumption and the Naïve Bayes Classifier
Class Conditional Independence
For simplicity, it can be assumed that the effect of an attribute value in X on a given class C is independent of the values of the other attributes. With this assumption,

P(X|C) = P(x1|C) P(x2|C) ... P(xn|C)

• The Bayesian classifier that makes use of the class conditional independence assumption is called the Naive Bayes Classifier.
• These classifiers are extremely simple and effective in a number of situations.
• However, they do fail in situations where class conditional independence cannot be assumed.
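As a concrete sketch of the assumption at work, the following estimates P(C) and each P(xk|C) by simple counting, then scores a new tuple as the product of per-attribute conditionals; the toy dataset and attribute names are entirely hypothetical, not the AllElectronics data used next.

```python
# Minimal Naive Bayes sketch: estimate P(C) and each P(xk|C) by counting,
# then score a new tuple as the product of the per-attribute conditionals.
# The toy dataset below is made up purely for illustration.
from collections import Counter, defaultdict

data = [  # (attribute dict, class label)
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
]

class_counts = Counter(label for _, label in data)
attr_counts = defaultdict(Counter)   # (class, attr) -> Counter of values
for attrs, label in data:
    for attr, value in attrs.items():
        attr_counts[(label, attr)][value] += 1

def score(x, label):
    """Unnormalized P(C|X): P(C) times the product of P(xk|C)."""
    s = class_counts[label] / len(data)
    for attr, value in x.items():
        s *= attr_counts[(label, attr)][value] / class_counts[label]
    return s

x = {"outlook": "sunny", "windy": "no"}
print(max(class_counts, key=lambda c: score(x, c)))   # -> play
```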
All Electronics Customer Database

[Table: the 14 training tuples of the customer database, with attributes age, income, student and credit_rating and a yes/no class label; 9 tuples belong to class yes and 5 to class no.]

Let X = (age = youth, income = medium, student = yes, credit = fair)
Forming the Naive Bayes Equations
Let X = (age = youth, income = medium, student = yes, credit = fair)

P(Cyes|X) ∝ P(X|Cyes) P(Cyes)
          = P(age = youth|Cyes) P(income = medium|Cyes) P(student = yes|Cyes) P(credit = fair|Cyes) P(Cyes)

P(Cno|X) ∝ P(X|Cno) P(Cno)
         = P(age = youth|Cno) P(income = medium|Cno) P(student = yes|Cno) P(credit = fair|Cno) P(Cno)
Computing the Conditional Probabilities

P(age = youth|Cyes) = 2/9    P(income = medium|Cyes) = 4/9
P(student = yes|Cyes) = 6/9  P(credit = fair|Cyes) = 6/9

P(Cyes) = 9/14
Computing the Conditional Probabilities

P(age = youth|Cno) = 3/5     P(income = medium|Cno) = 2/5
P(student = yes|Cno) = 1/5   P(credit = fair|Cno) = 2/5

P(Cno) = 5/14
Assigning the Class
Let X = (age = youth, income = medium, student = yes, credit = fair)

P(Cyes|X) ∝ P(X|Cyes) P(Cyes)
          = P(age = youth|Cyes) P(income = medium|Cyes) P(student = yes|Cyes) P(credit = fair|Cyes) P(Cyes)
          = (2/9)(4/9)(6/9)(6/9)(9/14) ≈ 0.0282

P(Cno|X) ∝ P(X|Cno) P(Cno)
         = P(age = youth|Cno) P(income = medium|Cno) P(student = yes|Cno) P(credit = fair|Cno) P(Cno)
         = (3/5)(2/5)(1/5)(2/5)(5/14) ≈ 0.0069

Since P(Cyes|X) > P(Cno|X), X is assigned to class YES.
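The arithmetic above can be checked mechanically; this sketch plugs in exactly the probabilities estimated on the previous slides, so nothing beyond the slide's own numbers is assumed.

```python
# Naive Bayes scores for X = (age=youth, income=medium, student=yes, credit=fair),
# using the conditional probabilities estimated from the 14-tuple training set.
from fractions import Fraction as F

cond = {
    "yes": {"age=youth": F(2, 9), "income=medium": F(4, 9),
            "student=yes": F(6, 9), "credit=fair": F(6, 9)},
    "no":  {"age=youth": F(3, 5), "income=medium": F(2, 5),
            "student=yes": F(1, 5), "credit=fair": F(2, 5)},
}
prior = {"yes": F(9, 14), "no": F(5, 14)}

x = ["age=youth", "income=medium", "student=yes", "credit=fair"]

score = {}
for c in prior:
    s = prior[c]
    for attr in x:
        s *= cond[c][attr]          # class conditional independence
    score[c] = s

for c, s in score.items():
    print(c, float(s))              # yes ~0.0282, no ~0.0069
print("predicted:", max(score, key=score.get))   # -> yes
```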


Removing the Independence Assumption
• In a number of real world applications, a subset of the attributes will be dependent on each other, which leads the Naive Bayes classifier to give inferior results.
• However, the original version of the Bayes equation can still be used to compute the probabilities:

P(C|X) = P(X, C)/P(X)

• P(A1, A2, ..., An) can be computed using the Chain Rule:

P(A1, A2, ..., An) = P(A1|A2, A3, ..., An) P(A2|A3, ..., An) ... P(An-1|An) P(An)

• The main issue that arises here is computing P(X, C), which can easily swell up to a large number of terms for a moderately sized dataset.
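Here is a short sketch of the chain rule on three binary attributes; every probability value below is hypothetical. Note that a full joint table over n binary attributes needs on the order of 2^n entries, which is exactly the blow-up described above.

```python
# Chain rule sketch for three binary attributes: the joint factors as
# P(A1, A2, A3) = P(A1|A2, A3) P(A2|A3) P(A3).
# All probability values below are made-up, for illustration only.

p_a3 = 0.30                                   # P(A3 = T)
p_a2_given_a3 = {True: 0.50, False: 0.10}     # P(A2 = T | A3)
p_a1_given_a2_a3 = {                          # P(A1 = T | A2, A3)
    (True, True): 0.90, (True, False): 0.60,
    (False, True): 0.40, (False, False): 0.05,
}

# Joint probability of (A1=T, A2=T, A3=T) via the chain rule:
joint = p_a1_given_a2_a3[(True, True)] * p_a2_given_a3[True] * p_a3
print(joint)   # 0.9 * 0.5 * 0.3 = 0.135
```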
Bayesian Belief Networks
• Bayesian Belief Networks (BBN) are probabilistic graphical models used to
represent a set of attributes and their dependencies using a Directed Acyclic
Graph (DAG).

[Figure: a directed acyclic graph with four nodes. Exposure to Toxic Chemicals and Smoking both point to Cancer, which in turn points to Lung Tumour.]
Conditional Independence
• A node in a BBN is said to be conditionally independent of its non-descendants given its parents.
• Lung Tumour is conditionally independent of Exposure to Toxic Chemicals and Smoking, given Cancer.
• Exposure to Toxic Chemicals and Smoking are independent of each other when Cancer is not observed.
Prediction in Bayesian Belief Networks
• We wish to find the probability of Cancer given that the person has a Lung Tumour and a history of Smoking, i.e. X = (LT = T, S = T).
• In the normal computation of the Bayesian probability, we will encounter the term P(LT, C, S, ET), which requires 2^4 = 16 combinations to compute completely.
• The presence of conditional independence simplifies the computation to the following:

P(LT, C, S, ET) = P(LT|C) P(C|ET, S) P(ET) P(S)

P(LT, C, S) = Σ_ET P(LT, C, S, ET)
            = P(LT|C) P(S) Σ_ET P(C|ET, S) P(ET)
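Here is a sketch of this simplified computation. The network structure (Exposure to Toxic Chemicals and Smoking point to Cancer, Cancer points to Lung Tumour) follows the slides, but all conditional probability table values below are hypothetical.

```python
# Sketch of the simplified BBN computation for the Cancer network.
# CPT values below are hypothetical; the structure follows the slides:
# ET -> Cancer <- S, and Cancer -> LT.

p_et = {True: 0.10, False: 0.90}              # P(ET): exposure to toxic chemicals
p_s = {True: 0.30, False: 0.70}               # P(S): smoking
p_c_given_et_s = {                            # P(Cancer = T | ET, S)
    (True, True): 0.80, (True, False): 0.50,
    (False, True): 0.40, (False, False): 0.05,
}
p_lt_given_c = {True: 0.70, False: 0.02}      # P(LT = T | Cancer)

def joint_lt_c_s(c, lt=True, s=True):
    """P(LT, C, S) = P(LT|C) P(S) * sum over ET of P(C|ET, S) P(ET)."""
    p_lt = p_lt_given_c[c] if lt else 1 - p_lt_given_c[c]
    marg = sum(
        (p_c_given_et_s[(et, s)] if c else 1 - p_c_given_et_s[(et, s)]) * p_et[et]
        for et in (True, False)
    )
    return p_lt * p_s[s] * marg

# P(Cancer = T | LT = T, S = T), normalizing over both values of Cancer:
num = joint_lt_c_s(True)
den = num + joint_lt_c_s(False)
print(num / den)   # ~0.965 with these hypothetical CPTs
```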
The Most Famous Application
• The Microsoft Office Assistant, nicknamed "Clippy", was a prominent feature in MS Office '97-'03.
• It was implemented partly using Bayesian Belief Networks.
Thank You
