
Name : Suman Kundu

Sem : 7th

Dept : CSE
Topic : Bayesian Classifier
Bayesian Classifier
• They are statistical classifiers.
• Based primarily on Bayes' Theorem.
• In Bayesian terms, every tuple X is called evidence.
• Let H be some hypothesis, such as that X belongs to a specified class C.
• For classification problems, we want to determine P(H|X), the probability that the hypothesis H holds given X.
• Simply put, we are looking for the probability that X belongs to class C given that we know the attribute description of X, i.e. we are computing P(C|X).
• After computing P(Ci|X) for all classes Ci, i = 1..n, we simply assign X to the class with the highest value of P(Ci|X).
Bayes’ Theorem
P(C|X) = P(X, C)/P(X)
       = P(X|C)P(C)/P(X)

• We need to find the class Ci which maximizes P(Ci|X). This is the class of X.
• Since P(X) is constant across all classes, we can reduce the problem as follows:

maximize P(C|X) ∝ P(X|C)P(C)

• However, computing P(X|C) is incredibly complex for large datasets involving a large number of attributes or dimensions.
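To make the decision rule concrete, here is a minimal Python sketch of the resulting maximum a posteriori (MAP) rule; the prior and likelihood values are made-up numbers for illustration only, not taken from any dataset in these slides.

```python
# Minimal sketch of the MAP decision rule: assign X to the class
# that maximizes P(X|C)P(C). All numbers here are hypothetical.

priors = {"yes": 0.6, "no": 0.4}           # hypothetical P(C)
likelihoods = {"yes": 0.05, "no": 0.02}    # hypothetical P(X|C)

# P(X) is the same for every class, so it can be dropped:
# argmax_C P(C|X) = argmax_C P(X|C) P(C)
scores = {c: likelihoods[c] * priors[c] for c in priors}
predicted = max(scores, key=scores.get)
print(scores, "->", predicted)             # -> yes
```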
The Independence Assumption and the Naïve Bayes Classifier
Class Conditional Independence
For simplicity, it can be assumed that the effect of an attribute value in X on a given class C is independent of the values of the other attributes. With this assumption,

P(X|C) = P(x1|C) P(x2|C) ... P(xn|C)

• The Bayesian classifier that makes use of the class conditional independence assumption is called the Naive Bayes Classifier.
• These classifiers are extremely simple and effective in a number of situations.
• However, they do fail in situations where class conditional independence cannot be assumed.
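As a concrete sketch of the assumption at work, the following estimates P(C) and each P(xk|C) by simple counting, then scores a new tuple as the product of per-attribute conditionals; the toy dataset and attribute names are entirely hypothetical, not the AllElectronics data used next.

```python
# Minimal Naive Bayes sketch: estimate P(C) and each P(xk|C) by counting,
# then score a new tuple as the product of the per-attribute conditionals.
# The toy dataset below is made up purely for illustration.
from collections import Counter, defaultdict

data = [  # (attribute dict, class label)
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
]

class_counts = Counter(label for _, label in data)
attr_counts = defaultdict(Counter)   # (class, attr) -> Counter of values
for attrs, label in data:
    for attr, value in attrs.items():
        attr_counts[(label, attr)][value] += 1

def score(x, label):
    """Unnormalized P(C|X): P(C) times the product of P(xk|C)."""
    s = class_counts[label] / len(data)
    for attr, value in x.items():
        s *= attr_counts[(label, attr)][value] / class_counts[label]
    return s

x = {"outlook": "sunny", "windy": "no"}
print(max(class_counts, key=lambda c: score(x, c)))   # -> play
```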
All Electronics Customer Database

[Table: the 14 training tuples of the customer database, with attributes age, income, student and credit_rating and a yes/no class label; 9 tuples belong to class yes and 5 to class no.]

Let X = (age = youth, income = medium, student = yes, credit = fair)
Forming the Naive Bayes Equations
Let X = (age = youth, income = medium, student = yes, credit = fair)

P(Cyes|X) ∝ P(X|Cyes) P(Cyes)
          = P(age = youth|Cyes) P(income = medium|Cyes) P(student = yes|Cyes) P(credit = fair|Cyes) P(Cyes)

P(Cno|X) ∝ P(X|Cno) P(Cno)
         = P(age = youth|Cno) P(income = medium|Cno) P(student = yes|Cno) P(credit = fair|Cno) P(Cno)
Computing the Conditional Probabilities

P(age = youth|Cyes) = 2/9    P(income = medium|Cyes) = 4/9
P(student = yes|Cyes) = 6/9  P(credit = fair|Cyes) = 6/9

P(Cyes) = 9/14
Computing the Conditional Probabilities

P(age = youth|Cno) = 3/5     P(income = medium|Cno) = 2/5
P(student = yes|Cno) = 1/5   P(credit = fair|Cno) = 2/5

P(Cno) = 5/14
Assigning the Class
Let X = (age = youth, income = medium, student = yes, credit = fair)

P(Cyes|X) ∝ P(X|Cyes) P(Cyes)
          = P(age = youth|Cyes) P(income = medium|Cyes) P(student = yes|Cyes) P(credit = fair|Cyes) P(Cyes)
          = (2/9)(4/9)(6/9)(6/9)(9/14) ≈ 0.0282

P(Cno|X) ∝ P(X|Cno) P(Cno)
         = P(age = youth|Cno) P(income = medium|Cno) P(student = yes|Cno) P(credit = fair|Cno) P(Cno)
         = (3/5)(2/5)(1/5)(2/5)(5/14) ≈ 0.0069

Since P(Cyes|X) > P(Cno|X), X is assigned to class YES.
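The arithmetic above can be checked mechanically; this sketch plugs in exactly the probabilities estimated on the previous slides, so nothing beyond the slide's own numbers is assumed.

```python
# Naive Bayes scores for X = (age=youth, income=medium, student=yes, credit=fair),
# using the conditional probabilities estimated from the 14-tuple training set.
from fractions import Fraction as F

cond = {
    "yes": {"age=youth": F(2, 9), "income=medium": F(4, 9),
            "student=yes": F(6, 9), "credit=fair": F(6, 9)},
    "no":  {"age=youth": F(3, 5), "income=medium": F(2, 5),
            "student=yes": F(1, 5), "credit=fair": F(2, 5)},
}
prior = {"yes": F(9, 14), "no": F(5, 14)}

x = ["age=youth", "income=medium", "student=yes", "credit=fair"]

score = {}
for c in prior:
    s = prior[c]
    for attr in x:
        s *= cond[c][attr]          # class conditional independence
    score[c] = s

for c, s in score.items():
    print(c, float(s))              # yes ~0.0282, no ~0.0069
print("predicted:", max(score, key=score.get))   # -> yes
```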


Removing the Independence Assumption
• In a number of real world applications, a subset of the attributes will be dependent on each other, which leads the Naive Bayes classifier to give inferior results.
• However, the original version of the Bayes equation can still be used to compute the probabilities:

P(C|X) = P(X, C)/P(X)

• P(A1, A2, ..., An) can be computed using the Chain Rule:

P(A1, A2, ..., An) = P(A1|A2, A3, ..., An) P(A2|A3, ..., An) ... P(An-1|An) P(An)

• The main issue that arises here is computing P(X, C), which can easily swell up to a large number of terms for a moderately sized dataset.
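Here is a short sketch of the chain rule on three binary attributes; every probability value below is hypothetical. Note that a full joint table over n binary attributes needs on the order of 2^n entries, which is exactly the blow-up described above.

```python
# Chain rule sketch for three binary attributes: the joint factors as
# P(A1, A2, A3) = P(A1|A2, A3) P(A2|A3) P(A3).
# All probability values below are made-up, for illustration only.

p_a3 = 0.30                                   # P(A3 = T)
p_a2_given_a3 = {True: 0.50, False: 0.10}     # P(A2 = T | A3)
p_a1_given_a2_a3 = {                          # P(A1 = T | A2, A3)
    (True, True): 0.90, (True, False): 0.60,
    (False, True): 0.40, (False, False): 0.05,
}

# Joint probability of (A1=T, A2=T, A3=T) via the chain rule:
joint = p_a1_given_a2_a3[(True, True)] * p_a2_given_a3[True] * p_a3
print(joint)   # 0.9 * 0.5 * 0.3 = 0.135
```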
Bayesian Belief Networks
• Bayesian Belief Networks (BBN) are probabilistic graphical models used to
represent a set of attributes and their dependencies using a Directed Acyclic
Graph (DAG).

[Figure: a directed acyclic graph with four nodes. Exposure to Toxic Chemicals and Smoking both point to Cancer, which in turn points to Lung Tumour.]
Conditional Independence
• A node in a BBN is said to be conditionally independent of its non-descendants given its parents.
• Lung Tumour is conditionally independent of Exposure to Toxic Chemicals and Smoking, given Cancer.
• Exposure to Toxic Chemicals and Smoking are independent of each other when Cancer is not observed.
Prediction in Bayesian Belief Networks
• We wish to find the probability of Cancer given that the person has a Lung Tumour and a history of Smoking, i.e. X = (LT = T, S = T).
• In the normal computation of the Bayesian probability, we will encounter the term P(LT, C, S, ET), which requires 2^4 = 16 combinations to compute completely.
• The presence of conditional independence simplifies the computation to the following:

P(LT, C, S, ET) = P(LT|C) P(C|ET, S) P(ET) P(S)

P(LT, C, S) = Σ_ET P(LT, C, S, ET)
            = P(LT|C) P(S) Σ_ET P(C|ET, S) P(ET)
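Here is a sketch of this simplified computation. The network structure (Exposure to Toxic Chemicals and Smoking point to Cancer, Cancer points to Lung Tumour) follows the slides, but all conditional probability table values below are hypothetical.

```python
# Sketch of the simplified BBN computation for the Cancer network.
# CPT values below are hypothetical; the structure follows the slides:
# ET -> Cancer <- S, and Cancer -> LT.

p_et = {True: 0.10, False: 0.90}              # P(ET): exposure to toxic chemicals
p_s = {True: 0.30, False: 0.70}               # P(S): smoking
p_c_given_et_s = {                            # P(Cancer = T | ET, S)
    (True, True): 0.80, (True, False): 0.50,
    (False, True): 0.40, (False, False): 0.05,
}
p_lt_given_c = {True: 0.70, False: 0.02}      # P(LT = T | Cancer)

def joint_lt_c_s(c, lt=True, s=True):
    """P(LT, C, S) = P(LT|C) P(S) * sum over ET of P(C|ET, S) P(ET)."""
    p_lt = p_lt_given_c[c] if lt else 1 - p_lt_given_c[c]
    marg = sum(
        (p_c_given_et_s[(et, s)] if c else 1 - p_c_given_et_s[(et, s)]) * p_et[et]
        for et in (True, False)
    )
    return p_lt * p_s[s] * marg

# P(Cancer = T | LT = T, S = T), normalizing over both values of Cancer:
num = joint_lt_c_s(True)
den = num + joint_lt_c_s(False)
print(num / den)   # ~0.965 with these hypothetical CPTs
```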
The Most Famous Application
• The Microsoft Office Assistant, nicknamed "Clippy", was a prominent feature in MS Office '97-'03.
• It was implemented partly using Bayesian Belief Networks.
Thank You
