ML-Unit I - Naive Bayes


Machine Learning

Dr. Sunil Saumya


IIIT Dharwad
Naive Bayes
Bayes Theorem:

P(A | B) = P(B | A) * P(A) / P(B)

where:
● A and B are called events.
● P(A | B) is the probability of event A, given the event B is true (has occurred).
Event B is also termed as evidence.
● P(A) is the prior probability of A (i.e. the probability of the event
before the evidence is seen).
● P(B | A) is the probability of B given event A, i.e. the probability of
the evidence B given that event A has occurred.
Naive Bayes classification

● Let's take one-dimensional data to understand how Bayes theorem works.
● Check whether the statement "a student will fail if his efforts are
poor" is correct or not.
● Here, x = Poor, so find y = ?

Effort (x)   Result (y)
Poor         Fail
Average      Pass
Average      Pass
Good         Pass
Good         Pass
Poor         Fail
Poor         Fail
Poor         Pass
Poor         Fail
Average      Pass
Average      Fail
Naive Bayes classification

● For the given problem, the Bayes classifier model will be:

P(y | x) = P(x | y) * P(y) / P(x)

which for our query is:

P(Fail | Poor) = P(Poor | Fail) * P(Fail) / P(Poor)
Naive Bayes classification

P(Fail | Poor) = P(Poor | Fail) * P(Fail) / P(Poor)

P(Poor | Fail) = Number of students who failed with poor efforts /
Number of students who failed = 4/5 = 0.8

P(Fail) = Number of students who failed / Total students = 5/11 = 0.45

P(Poor) = Number of students with poor efforts / Total students = 5/11 = 0.45
Naive Bayes classification

P(Fail | Poor) = P(Poor | Fail) * P(Fail) / P(Poor)
P(Fail | Poor) = (0.8 * 0.45) / 0.45 = 0.8

P(Pass | Poor) = P(Poor | Pass) * P(Pass) / P(Poor)
P(Pass | Poor) = (1/6 * 6/11) / (5/11) = 0.2

Since P(Fail | Poor) > P(Pass | Poor), for a new student whose effort is
poor, the predicted result is Fail.
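The calculation above can be reproduced directly from the table. Below is a minimal Python sketch; the `data` list encodes the 11 rows of the table, and the `posterior` helper is an illustrative name, not from the slides.

```python
# Training data from the table: (effort, result) for 11 students.
data = [("Poor", "Fail"), ("Average", "Pass"), ("Average", "Pass"),
        ("Good", "Pass"), ("Good", "Pass"), ("Poor", "Fail"),
        ("Poor", "Fail"), ("Poor", "Pass"), ("Poor", "Fail"),
        ("Average", "Pass"), ("Average", "Fail")]

def posterior(effort, result):
    """Bayes theorem from counts: P(result | effort)."""
    n = len(data)
    n_result = sum(1 for x, y in data if y == result)   # e.g. 5 students failed
    n_effort = sum(1 for x, y in data if x == effort)   # e.g. 5 students with poor effort
    n_both = sum(1 for x, y in data if (x, y) == (effort, result))
    # P(effort | result) * P(result) / P(effort)
    return (n_both / n_result) * (n_result / n) / (n_effort / n)

print(round(posterior("Poor", "Fail"), 3))  # 0.8
print(round(posterior("Poor", "Pass"), 3))  # 0.2
```

Since 0.8 > 0.2, the classifier predicts Fail for a poor-effort student, matching the hand calculation.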
Naive Bayes Exercise
Consider the following training dataset, and classify a Red Domestic SUV.


SMS spam classification: Dataset
● Problem: how to do text classification using Naive Bayes?
○ We can't feed text directly to our classifier.
○ We extract features from the text and then feed them in as input.

Sentence                                         Label
Send your mobile number                          Ham (0)
Send your account number and mobile number       Spam (1)
Your mobile number selected as a winner          Spam (1)
Send your mobile                                 ??


SMS spam classification: Feature extraction

● We will extract TF-IDF (Term Frequency - Inverse Document Frequency)
features from the text.

Step 1: Prepare the vocabulary of unique words from the dataset.

Vocabulary: "Send", "your", "mobile", "number", "account", "and",
"selected", "as", "winner"

Let's calculate the term frequency of each unique word in the vocabulary.
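Step 1 can be sketched in Python as follows (illustrative code, not from the slides; note the slides' vocabulary drops the stop word "a" from the third sentence).

```python
# Build the vocabulary of unique words, in first-seen order.
sentences = ["Send your mobile number",
             "Send your account number and mobile number",
             "Your mobile number selected as a winner"]

vocab = []
for s in sentences:
    for w in s.lower().split():
        if w != "a" and w not in vocab:   # "a" is excluded, as on the slides
            vocab.append(w)

print(vocab)
# ['send', 'your', 'mobile', 'number', 'account', 'and', 'selected', 'as', 'winner']
```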


SMS spam classification: Feature extraction

Step 2: Calculate the TF-IDF features.

TF-IDF(x) = TF(x) * IDF(x), where x is a word in the vocabulary,
TF(x) = (count of x in the sentence) / (number of words in the sentence),
and IDF(x) = log(N / number of sentences containing x), with N = 3
sentences and log taken base 10.
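A minimal sketch of this computation (illustrative helper names; it uses base-10 logs, as the slides' value log(3/2) = 0.176 implies, and drops "a" from S3 so that sentence has 6 words, matching the slides' denominators):

```python
import math

# The three training sentences, lowercased, with "a" removed from S3.
docs = ["send your mobile number",
        "send your account number and mobile number",
        "your mobile number selected as winner"]

def tf_idf(word, doc):
    words = doc.split()
    tf = words.count(word) / len(words)                    # term frequency
    df = sum(1 for d in docs if word in d.split())         # document frequency
    idf = math.log10(len(docs) / df)                       # inverse document frequency
    return tf * idf

print(round(tf_idf("send", docs[0]), 3))  # 0.044
print(round(tf_idf("your", docs[0]), 3))  # 0.0  (IDF of "your" is log(3/3) = 0)
```

Exact values can differ slightly from the slides, which round TF before multiplying (e.g. "send" in S2 gives 0.025 here versus 0.024 on the slide).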
SMS spam classification: Feature extraction

Step 2: Calculate the TF-IDF feature for "send":
IDF(send) = log(3/2) = 0.176

TF(send in S1) = 1/4 = 0.25,  TF-IDF(send in S1) = 0.25 * 0.176 = 0.044
TF(send in S2) = 1/7 = 0.14,  TF-IDF(send in S2) = 0.14 * 0.176 = 0.024
TF(send in S3) = 0/6 = 0,     TF-IDF(send in S3) = 0 * 0.176 = 0

      Send   your  mobile number account and   selected as    winner
S1    0.044
S2    0.024
S3    0
SMS spam classification: Feature extraction

Step 2: TF-IDF for "your":
IDF(your) = log(3/3) = 0

TF(your in S1) = 1/4 = 0.25,  TF-IDF(your in S1) = 0.25 * 0 = 0
TF(your in S2) = 1/7 = 0.14,  TF-IDF(your in S2) = 0.14 * 0 = 0
TF(your in S3) = 1/6 = 0.16,  TF-IDF(your in S3) = 0.16 * 0 = 0

      Send   your  mobile number account and   selected as    winner
S1    0.044  0
S2    0.024  0
S3    0      0
SMS spam classification: Feature extraction

Step 2: TF-IDF for "mobile":
IDF(mobile) = log(3/3) = 0

TF(mobile in S1) = 1/4 = 0.25,  TF-IDF(mobile in S1) = 0.25 * 0 = 0
TF(mobile in S2) = 1/7 = 0.14,  TF-IDF(mobile in S2) = 0.14 * 0 = 0
TF(mobile in S3) = 1/6 = 0.16,  TF-IDF(mobile in S3) = 0.16 * 0 = 0

      Send   your  mobile number account and   selected as    winner
S1    0.044  0     0
S2    0.024  0     0
S3    0      0     0
SMS spam classification: Feature extraction

Step 2: TF-IDF for "number" and "account":
IDF(number) = log(3/3) = 0, so TF-IDF(number) = 0 in all three sentences.
IDF(account) = log(3/1) = 0.477

TF(account in S1) = 0/4 = 0,  TF-IDF(account in S1) = 0 * 0.477 = 0
TF(account in S2) = 0.16,     TF-IDF(account in S2) = 0.16 * 0.477 = 0.076
TF(account in S3) = 0/6 = 0,  TF-IDF(account in S3) = 0 * 0.477 = 0

      Send   your  mobile number account and   selected as    winner
S1    0.044  0     0      0      0
S2    0.024  0     0      0      0.076
S3    0      0     0      0      0
SMS spam classification: Feature extraction

Step 2: TF-IDF for "and", "selected", and "as":
IDF(and) = IDF(selected) = IDF(as) = log(3/1) = 0.477
(each of these words appears in exactly one sentence)

TF-IDF(and in S2) = 0.16 * 0.477 = 0.076
TF(selected in S3) = 1/6 = 0.16,  TF-IDF(selected in S3) = 0.16 * 0.477 = 0.076
TF(as in S3) = 1/6 = 0.16,        TF-IDF(as in S3) = 0.16 * 0.477 = 0.076
All other entries for these words are 0.
SMS spam classification: Feature extraction

Step 2: TF-IDF for "winner":
IDF(winner) = log(3/1) = 0.477

TF(winner in S1) = 0/4 = 0,     TF-IDF(winner in S1) = 0 * 0.477 = 0
TF(winner in S2) = 0/7 = 0,     TF-IDF(winner in S2) = 0 * 0.477 = 0
TF(winner in S3) = 1/6 = 0.16,  TF-IDF(winner in S3) = 0.16 * 0.477 = 0.076

      Send   your  mobile number account and   selected as    winner
S1    0.044  0     0      0      0       0     0        0     0
S2    0.024  0     0      0      0.076   0.076 0        0     0
S3    0      0     0      0      0       0     0.076    0.076 0.076


SMS spam classification: Feature extraction

The complete TF-IDF feature table, with class labels:

      Send   your  mobile number account and   selected as    winner  Class
S1    0.044  0     0      0      0       0     0        0     0       Ham
S2    0.024  0     0      0      0.076   0.076 0        0     0       Spam
S3    0      0     0      0      0       0     0.076    0.076 0.076   Spam


SMS spam classification: Classification

Step 3: Classification using Naive Bayes.

Documents with Ham outcome:
P(Ham) = 1/3 = 0.33

P(wk | Ham) = (nk + 1) / (n + |vocabulary|)
where n is the sum of all TF-IDF values in the Ham documents and nk is
the sum of the TF-IDF values of word k in the Ham documents (Laplace
smoothing, with |vocabulary| = 9).

P(send | Ham) = (0.044 + 1) / (0.044 + 9) = 0.115

SMS spam classification: Classification

Step 3: Classification using NB (Ham likelihoods).

Documents with Ham outcome: P(Ham) = 1/3 = 0.33
P(wk | Ham) = (nk + 1) / (n + |vocabulary|), with n = 0.044

P(send | Ham)     = (0.044 + 1) / (0.044 + 9) = 0.115
P(your | Ham)     = (0 + 1) / (0.044 + 9) = 0.111
P(mobile | Ham)   = (0 + 1) / (0.044 + 9) = 0.111
P(number | Ham)   = (0 + 1) / (0.044 + 9) = 0.111
P(account | Ham)  = (0 + 1) / (0.044 + 9) = 0.111
P(and | Ham)      = (0 + 1) / (0.044 + 9) = 0.111
P(selected | Ham) = (0 + 1) / (0.044 + 9) = 0.111
P(as | Ham)       = (0 + 1) / (0.044 + 9) = 0.111
P(winner | Ham)   = (0 + 1) / (0.044 + 9) = 0.111


SMS spam classification: Classification

Step 3: Classification using NB (Spam likelihoods).

Documents with Spam outcome: P(Spam) = 2/3 = 0.66
P(wk | Spam) = (nk + 1) / (n + |vocabulary|), with n = 0.404

P(send | Spam)     = (0.024 + 1) / (0.404 + 9) = 0.108
P(your | Spam)     = (0 + 1) / (0.404 + 9) = 0.106
P(mobile | Spam)   = (0 + 1) / (0.404 + 9) = 0.106
P(number | Spam)   = (0 + 1) / (0.404 + 9) = 0.106
P(account | Spam)  = (0.076 + 1) / (0.404 + 9) = 0.114
P(and | Spam)      = (0.076 + 1) / (0.404 + 9) = 0.114
P(selected | Spam) = (0.076 + 1) / (0.404 + 9) = 0.114
P(as | Spam)       = (0.076 + 1) / (0.404 + 9) = 0.114
P(winner | Spam)   = (0.076 + 1) / (0.404 + 9) = 0.114
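The smoothed likelihoods follow one pattern, so they can be computed with a single helper. A minimal sketch (the dictionaries below just list the non-zero TF-IDF values read off the feature table; the names are illustrative):

```python
# Laplace-smoothed likelihood, as on the slides:
# P(w | class) = (nk + 1) / (n + |V|), with |V| = 9 vocabulary words,
# nk = summed TF-IDF of w in the class, n = total TF-IDF mass of the class.
VOCAB_SIZE = 9

# Non-zero TF-IDF sums per class (S1 is Ham; S2 and S3 are Spam).
ham_tfidf = {"send": 0.044}
spam_tfidf = {"send": 0.024, "account": 0.076, "and": 0.076,
              "selected": 0.076, "as": 0.076, "winner": 0.076}

def likelihood(word, class_tfidf):
    n = sum(class_tfidf.values())          # 0.044 for Ham, 0.404 for Spam
    nk = class_tfidf.get(word, 0.0)
    return (nk + 1) / (n + VOCAB_SIZE)

print(round(likelihood("send", ham_tfidf), 3))      # 0.115
print(round(likelihood("account", spam_tfidf), 3))  # 0.114
```

The `+1` in the numerator and the `+VOCAB_SIZE` in the denominator keep any likelihood from being exactly zero, so an unseen word cannot zero out the whole product.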


SMS spam classification

● Test data: "Send your mobile"

P(Ham | send your mobile) ∝ P(Ham) * P(send | Ham) * P(your | Ham) * P(mobile | Ham)
The denominator P(send your mobile) is ignored because it is common to both classes.
P(Ham | send your mobile) ∝ 0.33 * 0.115 * 0.111 * 0.111 ≈ 0.00047

P(Spam | send your mobile) ∝ P(Spam) * P(send | Spam) * P(your | Spam) * P(mobile | Spam)
P(Spam | send your mobile) ∝ 0.66 * 0.108 * 0.106 * 0.106 ≈ 0.00080

Clearly,
P(Spam | send your mobile) ≈ 0.00080 > P(Ham | send your mobile) ≈ 0.00047,
so the test message is classified as Spam.
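Putting the pieces together, the final comparison can be sketched as follows. The priors and per-word likelihoods are the values computed above; the `score` helper is an illustrative name, not from the slides.

```python
# Class priors and the smoothed likelihoods for the test words.
priors = {"Ham": 1 / 3, "Spam": 2 / 3}
likelihoods = {
    "Ham":  {"send": 0.115, "your": 0.111, "mobile": 0.111},
    "Spam": {"send": 0.108, "your": 0.106, "mobile": 0.106},
}

def score(cls, words):
    # Posterior up to the shared denominator P(message), which is
    # dropped because it is common to both classes.
    s = priors[cls]
    for w in words:
        s *= likelihoods[cls][w]
    return s

msg = ["send", "your", "mobile"]
best = max(priors, key=lambda c: score(c, msg))
print(best)  # Spam
```

Multiplying many small probabilities underflows on longer messages; a common refinement is to sum log-probabilities instead of multiplying raw ones.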
