Lecture - 4 Classification (Naive Bayes)
Probability Theory : Naïve Bayes
In both kNN and decision trees, we asked the classifier to make hard decisions: a single, definite answer to the question of class membership.
Giving the best guess about the class, together with a probability for that guess, is often more useful.
Probability theory forms the basis for many machine learning algorithms.
Works with:
Nominal values
Conditional Probability:
P(gray | bucket B) = P(gray and bucket B) / P(bucket B)
Example: seven stones are divided between two buckets. Bucket A holds four stones, two of them gray; bucket B holds three stones, one of them gray.
What is the probability of drawing a gray stone, given that the unknown stone comes from bucket B? By counting stones directly:
P(gray | bucket B) = 1/3
P(gray | bucket A) = 2/4
To get the same answer from the definition above, we need:
P(gray and bucket B) = 1/7 (one gray stone in bucket B, out of seven stones in total)
P(bucket B) = 3/7 (three of the seven stones are in bucket B)
Putting these together:
P(gray | bucket B) = P(gray and bucket B) / P(bucket B) = (1/7) / (3/7) = 1/3
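The same calculation in Python, as a quick check (a minimal sketch; the counts come from the bucket example above):

from fractions import Fraction

# Counts from the bucket example: 7 stones in total, bucket B holds 3 of them,
# and exactly 1 stone is both gray and in bucket B.
p_gray_and_B = Fraction(1, 7)  # P(gray and bucket B)
p_B = Fraction(3, 7)           # P(bucket B)

# P(gray | bucket B) = P(gray and bucket B) / P(bucket B)
print(p_gray_and_B / p_B)      # 1/3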
Classifying with Conditional Probabilities
Bayesian decision theory tells us to compare two probabilities:
If P1(x, y) > P2(x, y), then the class is 1.
If P1(x, y) < P2(x, y), then the class is 2.
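Here P1 and P2 are the posterior probabilities of the two classes, which Bayes' rule lets us compute from quantities we can estimate from data:
P(c | x) = P(x | c) × P(c) / P(x)
Since P(x) is the same for every class, comparing classes only requires comparing the numerators P(x | c) × P(c), which is exactly what the worked examples below do.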
Uses of Naïve Bayes Classification
A typical use, illustrated in Example Two below, is document classification: deciding which category a piece of text belongs to based on the words it contains.
Example One
Example Two
Classify the following document vectors as Sports (S) or Informatics (I) using a Naive Bayes classifier:
b1 = (1, 0, 0, 1, 1, 1, 0, 1) = S or I?
b2 = (0, 1, 1, 0, 1, 0, 1, 0) = S or I?
We can estimate the prior probabilities from the training data (6 of the 11 training documents are Sports and 5 are Informatics):
P(S) = 6/11
P(I) = 5/11
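The per-word likelihood estimates implied by the products used below are the following (for an absent word, the factor is P(w_t = 0 | class) = 1 − P(w_t = 1 | class)):

word t:           1    2    3    4    5    6    7    8
P(w_t = 1 | S):  1/2  1/6  1/3  1/2  1/2  2/3  2/3  2/3
P(w_t = 1 | I):  1/5  3/5  3/5  1/5  1/5  1/5  3/5  1/5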
We now compute the posterior probabilities of the two test vectors and hence classify them.
b1 = (1, 0, 0, 1, 1, 1, 0, 1)
P(S | b1) ∝ P(b1 | S) × P(S), where P(b1 | S) is the product of the per-word probabilities:
= (1/2 × 5/6 × 2/3 × 1/2 × 1/2 × 2/3 × 1/3 × 2/3) × (6/11)
= 5/891 ≈ 5.6 × 10⁻³
P(I | b1) ∝ P(b1 | I) × P(I)
= (1/5 × 2/5 × 2/5 × 1/5 × 1/5 × 1/5 × 2/5 × 1/5) × (5/11)
= 8/859375 ≈ 9.3 × 10⁻⁶
Since P(S | b1) > P(I | b1), we classify this document as S.
Similarly for the second test vector:
b2 = (0, 1, 1, 0, 1, 0, 1, 0)
P(S | b2) ∝ P(b2 | S) × P(S)
= (1/2 × 1/6 × 1/3 × 1/2 × 1/2 × 1/3 × 2/3 × 1/3) × (6/11)
= 12/42768 ≈ 2.8 × 10⁻⁴
P(I | b2) ∝ P(b2 | I) × P(I)
= (4/5 × 3/5 × 3/5 × 4/5 × 1/5 × 4/5 × 3/5 × 4/5) × (5/11)
= 34560/4296875 ≈ 8.0 × 10⁻³
Since P(I | b2) > P(S | b2), we classify this document as I.
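As a check, a small Python sketch that reproduces these numbers, using the per-word probabilities tabulated above (a minimal sketch; it hardcodes the estimates rather than learning them from the training data):

from fractions import Fraction as F

# P(w_t = 1 | class), as tabulated above; priors P(S) = 6/11, P(I) = 5/11.
P_WORD = {
    "S": [F(1,2), F(1,6), F(1,3), F(1,2), F(1,2), F(2,3), F(2,3), F(2,3)],
    "I": [F(1,5), F(3,5), F(3,5), F(1,5), F(1,5), F(1,5), F(3,5), F(1,5)],
}
PRIOR = {"S": F(6, 11), "I": F(5, 11)}

def numerator(doc, c):
    # Naive Bayes numerator: P(c) * prod_t P(w_t = doc_t | c).
    s = PRIOR[c]
    for present, p in zip(doc, P_WORD[c]):
        s *= p if present else 1 - p
    return s

for doc in [(1, 0, 0, 1, 1, 1, 0, 1), (0, 1, 1, 0, 1, 0, 1, 0)]:
    scores = {c: numerator(doc, c) for c in PRIOR}
    label = max(scores, key=scores.get)
    print(doc, "->", label, {c: float(v) for c, v in scores.items()})
# b1 -> S (5/891 vs 8/859375); b2 -> I (12/42768 vs 34560/4296875)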
Naïve Bayes: Syntax
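A minimal from-scratch sketch of the whole classifier in Python, assuming binary word-occurrence vectors as in Example Two (the function names are illustrative, not from any particular library):

from fractions import Fraction as F

def train(docs, labels):
    """Estimate P(class) and P(w_t = 1 | class) from binary word vectors.
    Assumes every class appears at least once in the training data."""
    n = len(docs)
    prior, p_word = {}, {}
    for c in set(labels):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        prior[c] = F(len(class_docs), n)
        # Per word: fraction of class-c documents that contain it.
        p_word[c] = [F(sum(col), len(class_docs)) for col in zip(*class_docs)]
    return prior, p_word

def classify(doc, prior, p_word):
    # Choose the class with the largest numerator P(c) * prod_t P(w_t | c).
    def numerator(c):
        s = prior[c]
        for present, p in zip(doc, p_word[c]):
            s *= p if present else 1 - p
        return s
    return max(prior, key=numerator)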
Summary
Using probabilities can sometimes be more effective than using hard rules for classification.
Bayesian probability and Bayes' rule give us a way to estimate unknown probabilities from known values.
You can reduce the need for a lot of data by assuming conditional independence among the features in your data.
The assumption we make is that the probability of one word doesn’t depend on any other words in the document.
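Written out, this independence assumption is what turns an unmanageable joint probability into a product of per-word terms:
P(w_1, w_2, ..., w_n | c) = P(w_1 | c) × P(w_2 | c) × ... × P(w_n | c)
This is exactly the product form used in Example Two.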
Underflow (a product of many small probabilities rounding to zero) is one problem that can be addressed by using the logarithm of probabilities in your calculations.
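A brief Python sketch of the log trick (sums of logs replace products; because log is monotonically increasing, the class comparison is unchanged):

import math

probs = [0.001] * 300          # many small factors: the product underflows
product = 1.0
for p in probs:
    product *= p
print(product)                 # 0.0 -- underflowed to zero

log_score = sum(math.log(p) for p in probs)
print(log_score)               # about -2072.3, safely representable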
Question & Answer
Thank You !!!
Assignment Three
Predict the outcome for the following instance: x' = (Outlook = Sunny, Temperature = Cool, Humidity = High, Wind = Strong)