Naive Bayes Part 1
Naïve Bayes
Probability Problem
Solution
Application
• Assigns each observation to the most likely class given its feature
values
• Assign a test observation with features x0 to the class j for which
Pr(Y = j | X = x0) is largest
• The Bayes classifier achieves the lowest possible test error rate,
called the Bayes error rate
• Since the conditional distribution of Y given X is unknown for real
data, computing the Bayes classifier is impossible
• Many methods attempt to estimate this conditional distribution
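In symbols, the rule in the bullets above can be written as follows. The notation is an assumption added here for illustration: x_0 stands for the feature values of the test observation and j ranges over the classes.

```latex
% Bayes classifier: pick the class with the largest conditional probability
\hat{y}(x_0) = \arg\max_{j} \Pr(Y = j \mid X = x_0)

% Bayes error rate: the expected error of that rule, averaged over X
1 - \mathbb{E}\left[ \max_{j} \Pr(Y = j \mid X) \right]
```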
Bayes Decision Boundary
In general, we try to approximate this
[Figure: Bayes decision boundary, X1 on the horizontal axis and X2 on the vertical axis]
Naïve Bayes Decision Boundary
Naïve Bayes is a classification rule [...] features.
[Figure: Naïve Bayes decision boundary, X1 on the horizontal axis and X2 on the vertical axis]
Uncannily similar!
Confusion Matrix:
        Yes   No
Yes      66    4
No        7   23
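As a rough illustration, the confusion matrix above can be reduced to an overall accuracy. The sketch assumes only that the diagonal cells (Yes/Yes and No/No) are the correctly classified records, which holds whichever axis carries the actual classes. The variable names are hypothetical.

```python
# Confusion matrix from the slide; rows and columns are both ordered Yes, No.
confusion = [
    [66, 4],   # Yes row
    [7, 23],   # No row
]

total = sum(sum(row) for row in confusion)
# Diagonal cells are the records where both labels agree.
correct = confusion[0][0] + confusion[1][1]

accuracy = correct / total
error_rate = 1 - accuracy

print(f"records: {total}, accuracy: {accuracy:.2f}, error rate: {error_rate:.2f}")
```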
Example: Personal Loan Offer
Personal Loan Data Description
• The data has information about the customers’ relationship with the
bank, as well as some demographic information

ID                   Customer ID
Age                  Customer's age in completed years
Experience           Number of years of professional experience
Income               Annual income of the customer ($000)
ZIPCode              Home address ZIP code
Family               Family size of the customer
CCAvg                Average spending on credit cards per month ($000)
Education            Education level. 1: Undergrad; 2: Graduate; 3: Advanced/Professional
Mortgage             Value of house mortgage, if any ($000)
Personal Loan        Did this customer accept the personal loan offered in the last campaign?
Securities Account   Does the customer have a securities account with the bank?
CD Account           Does the customer have a certificate of deposit (CD) account with the bank?
Online               Does the customer use internet banking facilities?
CreditCard           Does the customer use a credit card issued by UniversalBank?
The Exact Bayesian classifier
Example: Exact Bayesian classifier
• Let’s simplify our motivating example
• Assume we only have two predictors:
• CreditCard (0/1) and Online (0/1)
• We are trying to classify records as “will accept” / “will not
accept”
• (Y = personal loan acceptance (0/1))
• How would you classify a record for customers with
CreditCard=1, Online=1?
Count of Personal Loan
CreditCard   Personal Loan   Online=0   Online=1   Grand Total
0            0                    794       1123          1917
0            1                     70        130           200
0 Total                           864       1253          2117
1            0                    323        481           804
1            1                     31         48            79
1 Total                           354        529           883
Grand Total                      1218       1782          3000
Example: Exact Bayesian classifier
1. First find all records that have the same predictor values in the
training set
Count of Personal Loan
CreditCard   Personal Loan   Online=0   Online=1   Grand Total
0            0                    794       1123          1917
0            1                     70        130           200
0 Total                           864       1253          2117
1            0                    323        481           804
1            1                     31         48            79
1 Total                           354        529           883
Grand Total                      1218       1782          3000
2. Determine which class is the most prevalent among the records you
found in step 1: 0 (481 of the 529 records with CreditCard=1, Online=1)
3. Assign the most prevalent class to the new record: 0
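The three steps above can be sketched in a few lines. This is a minimal illustration built directly on the counts in the pivot table, not a general implementation. The names COUNTS and exact_bayes_class are hypothetical.

```python
# Counts from the pivot table: (CreditCard, Online) -> {Personal Loan class: records}
COUNTS = {
    (0, 0): {0: 794, 1: 70},
    (0, 1): {0: 1123, 1: 130},
    (1, 0): {0: 323, 1: 31},
    (1, 1): {0: 481, 1: 48},
}

def exact_bayes_class(counts, predictors):
    """Return the most prevalent class among records matching the predictors."""
    matching = counts[predictors]            # step 1: records with the same predictor values
    return max(matching, key=matching.get)   # steps 2 and 3: most prevalent class

new_record = (1, 1)  # CreditCard=1, Online=1
predicted = exact_bayes_class(COUNTS, new_record)
print(f"matching records: {sum(COUNTS[new_record].values())}, predicted class: {predicted}")
```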
Assigning Probabilities
• It may be desirable to tweak the method so that it answers the
question: What is an estimated probability of belonging to the class
of interest?
• Allows analysis of misclassification costs, ROC curves etc. to
identify the appropriate model

Count of Personal Loan
CreditCard   Personal Loan   Online=0   Online=1   Grand Total
0            0                    794       1123          1917
0            1                     70        130           200
0 Total                           864       1253          2117
1            0                    323        481           804
1            1                     31         48            79
1 Total                           354        529           883
Grand Total                      1218       1782          3000
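P(PL=1 | CC=1, O=1) = 48/529 = 0.0907
The denominator is the number of records with CC=1 and O=1. The
numerator is the number of records with CC=1, O=1 and PL=1. Dividing
both counts by the 3000 records turns them into proportions without
changing the ratio.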
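The same ratio can be computed for every predictor combination, as in the sketch below. The cutoff of 0.10 is purely an assumption for illustration (the slides do not give one). It shows how an estimated probability, unlike a bare class label, lets the classification threshold be adjusted when misclassification costs differ.

```python
# Counts from the pivot table: (CreditCard, Online) -> {Personal Loan class: records}
COUNTS = {
    (0, 0): {0: 794, 1: 70},
    (0, 1): {0: 1123, 1: 130},
    (1, 0): {0: 323, 1: 31},
    (1, 1): {0: 481, 1: 48},
}

# Estimated P(PL=1 | CreditCard, Online): acceptors divided by all matching records.
probs = {
    combo: by_class[1] / (by_class[0] + by_class[1])
    for combo, by_class in COUNTS.items()
}

CUTOFF = 0.10  # hypothetical threshold, not from the slides
flagged = [combo for combo, p in probs.items() if p >= CUTOFF]

for combo, p in probs.items():
    print(f"CC={combo[0]}, Online={combo[1]}: P(PL=1) = {p:.4f}")
print("combinations classified as acceptors at the assumed cutoff:", flagged)
```

With the default majority rule every combination is classified as 0, so the cutoff is what makes the probability estimates useful.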
Practical difficulty with Exact Bayes
Example with (only) 3 predictors:
CC, Online, CD account
Count of Personal Loan
                             CD Account = 0                CD Account = 1
CreditCard   Personal Loan   Online=0   Online=1   Total   Online=0   Online=1   Total   Grand Total
0            0                    794       1116    1910          0          7       7          1917
0            1                     68        103     171          2         27      29           200
0 Total                           862       1219    2081          2         34      36          2117
1            0                    320        402     722          3         79      82           804
1            1                     28          0      28          3         48      51            79
1 Total                           348        402     750          6        127     133           883
Grand Total                      1210       1621    2831          8        161     169          3000
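One way to read the table above is to count how many training records match each predictor combination. The sketch below does that. It treats the two cells with no count in the table as zero, an assumption that agrees with the row and column totals. The names COUNTS3, cell_sizes and ties are hypothetical.

```python
# (CreditCard, CDAccount, Online) -> {Personal Loan class: records}
COUNTS3 = {
    (0, 0, 0): {0: 794, 1: 68},
    (0, 0, 1): {0: 1116, 1: 103},
    (0, 1, 0): {0: 0, 1: 2},      # empty cell in the table taken as 0
    (0, 1, 1): {0: 7, 1: 27},
    (1, 0, 0): {0: 320, 1: 28},
    (1, 0, 1): {0: 402, 1: 0},    # empty cell in the table taken as 0
    (1, 1, 0): {0: 3, 1: 3},
    (1, 1, 1): {0: 79, 1: 48},
}

# Number of training records available for each exact match.
cell_sizes = {combo: sum(by_class.values()) for combo, by_class in COUNTS3.items()}
smallest = min(cell_sizes, key=cell_sizes.get)

# Combinations where the exact method cannot pick a majority class.
ties = [combo for combo, by_class in COUNTS3.items() if by_class[0] == by_class[1]]

print("smallest cell:", smallest, "with", cell_sizes[smallest], "records")
print("tied cells:", ties)

# With p binary predictors there are 2**p combinations to fill from the same data.
for p in (2, 3, 10, 20):
    print(f"{p} binary predictors -> {2 ** p} combinations")
```

Even with three predictors and 3000 records, some combinations rest on a handful of records or end in a tie, and the number of combinations doubles with every added binary predictor.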
Solution: The Naïve Bayes Method
The Naïve Bayes Algorithm
• Goal: To classify a new record with values X1=x1,…,Xp=xp as one of k classes
1. For class 1, find the individual probabilities that each predictor value in
the record to be classified (x1, . . . , xp) occurs in class 1
• In other words: Calculate P(Xi=xi|Y=1), the proportion of the Y=1
cases that have Xi = xi
2. Multiply these probabilities times each other, then times the proportion
of records belonging to class 1
• If p1 is the proportion of records belonging to class 1
• Then multiply P(X1=x1|Y=1)*P(X2=x2|Y=1) *… * P(Xp=xp|Y=1)*p1
• Let's call this product PC1
3. Repeat steps 1 and 2 for all the classes
4. Estimate a probability for class i by taking the value calculated in step 2
for class i and dividing it by the sum of such values for all classes
• P(new record belongs to class 1) = PC1/∑i=1..k PCi
5. Assign the record to the class with the highest probability value for this
set of predictor values.
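The five steps above can be sketched on the two-predictor loan counts (CreditCard, Online). This is a minimal illustration assuming categorical predictors and counts stored as a dictionary. The names COUNTS and naive_bayes_probs are hypothetical, and no smoothing for zero counts is applied.

```python
from collections import defaultdict

# (CreditCard, Online, PersonalLoan) -> number of records, from the pivot table
COUNTS = {
    (0, 0, 0): 794, (0, 1, 0): 1123,
    (0, 0, 1): 70,  (0, 1, 1): 130,
    (1, 0, 0): 323, (1, 1, 0): 481,
    (1, 0, 1): 31,  (1, 1, 1): 48,
}

def naive_bayes_probs(counts, x):
    """Return {class: estimated probability} for predictor values x."""
    n = sum(counts.values())
    class_totals = defaultdict(int)
    for key, c in counts.items():
        class_totals[key[-1]] += c  # last key element is the class label

    scores = {}
    for y, n_y in class_totals.items():  # step 3: repeat for every class
        score = n_y / n                  # proportion of records in class y
        for i, x_i in enumerate(x):
            # step 1: P(Xi = xi | Y = y), share of class-y records with Xi = xi
            match = sum(c for key, c in counts.items()
                        if key[-1] == y and key[i] == x_i)
            score *= match / n_y         # step 2: multiply into the product
        scores[y] = score

    total = sum(scores.values())
    return {y: s / total for y, s in scores.items()}  # step 4: normalize

probs = naive_bayes_probs(COUNTS, (1, 1))  # CreditCard=1, Online=1
predicted = max(probs, key=probs.get)      # step 5: highest probability
print({y: round(p, 4) for y, p in probs.items()}, "->", predicted)
```

For this record the naive estimate of P(PL=1) comes out close to, but not equal to, the exact value 48/529, because the method multiplies per-predictor proportions instead of counting exact matches.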