Lab6 - Naive Bayes Classification

Naive Bayesian classification

Naive Bayes is a classification technique based on Bayes' Theorem with an assumption of independence among predictors. In simple terms, a Naive Bayes classifier assumes that, within a class, the presence of a particular feature is unrelated to the presence of any other feature.

Use the Naive Bayes equation, P(Ci/X) = P(X/Ci) * P(Ci) / P(X), to calculate the posterior probability for each class Ci. Because P(X) is the same for every class, it is enough to compare P(X/Ci) * P(Ci). The class with the highest posterior probability is the outcome of the prediction.

The e1071 package contains a function named naiveBayes(), which performs Naive Bayes classification.

Theory applied to the previous example, for the tuple X = (age = youth, income = medium, student = yes, credit rating = fair):

P(C1) = P(buys_computer = yes) = 9/14 =0.643
P(C2) = P(buys_computer = no) = 5/14= 0.357
P(age=youth /buys_computer = yes) = 2/9 =0.222
P(age=youth /buys_computer = no) = 3/5 =0.600
P(income=medium /buys_computer = yes) = 4/9 =0.444
P(income=medium /buys_computer = no) = 2/5 =0.400
P(student=yes /buys_computer = yes) = 6/9 =0.667
P(student=yes/buys_computer = no) = 1/5 =0.200
P(credit rating=fair /buys_computer = yes) = 6/9 =0.667
P(credit rating=fair /buys_computer = no) = 2/5 =0.400
P(X/buys_computer = yes) = P(age=youth /buys_computer = yes) * P(income=medium /buys_computer = yes) * P(student=yes /buys_computer = yes) * P(credit rating=fair /buys_computer = yes)
= 0.222 * 0.444 * 0.667 * 0.667 = 0.044
P(X/buys_computer = no) = 0.600 * 0.400 * 0.200 * 0.400 = 0.019
Find the class Ci that maximizes P(X/Ci) * P(Ci):
=> P(X/buys_computer = yes) * P(buys_computer = yes) = 0.044 * 0.643 = 0.028
=> P(X/buys_computer = no) * P(buys_computer = no) = 0.019 * 0.357 = 0.007
Prediction: buys_computer = yes for tuple X
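
The hand calculation above can be reproduced in a few lines of base R. This is only a sketch: the variable names (prior, likelihood, score, posterior) are chosen here for illustration, and the fractions are copied from the worked example rather than estimated from data.

```r
# Priors and class-conditional probabilities copied from the worked example.
prior <- c(yes = 9 / 14, no = 5 / 14)
likelihood <- rbind(
  age_youth     = c(yes = 2 / 9, no = 3 / 5),
  income_medium = c(yes = 4 / 9, no = 2 / 5),
  student_yes   = c(yes = 6 / 9, no = 1 / 5),
  credit_fair   = c(yes = 6 / 9, no = 2 / 5)
)

# Naive assumption: multiply the per-feature likelihoods within each class.
p_x_given_class <- apply(likelihood, 2, prod)

# Unnormalised posterior score, P(X/Ci) * P(Ci).
score <- p_x_given_class * prior

# Dividing by the sum gives posterior probabilities that add up to 1.
posterior <- score / sum(score)

print(round(p_x_given_class, 3))
print(round(score, 3))
print(round(posterior, 3))

# The predicted class is the one with the largest score.
print(names(which.max(score)))
```

The rounded scores should agree with the 0.028 and 0.007 obtained by hand, and normalising them shows how strongly the "yes" class is favoured.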

Install package e1071

> install.packages("e1071")
Loading the package
> library(e1071)
To create the student dataset
> age <- c("youth","youth","middleaged","senior","senior","senior","middleaged","youth","youth","senior","youth","middleaged","middleaged","senior")
> income <- c("high","high","high","medium","low","low","low","medium","low","medium","medium","medium","high","medium")
> student <- c("no","no","no","no","yes","yes","yes","no","yes","yes","yes","no","yes","no")
> credit_rating <- c("fair","excellent","fair","fair","fair","excellent","excellent","fair","fair","fair","excellent","excellent","fair","excellent")
> buys_computer <- c("no","no","yes","yes","yes","no","yes","no","yes","yes","yes","yes","yes","no")
> studnew<-data.frame(age,income,student,credit_rating,buys_computer)

> studnew
          age income student credit_rating buys_computer
1       youth   high      no          fair            no
2       youth   high      no     excellent            no
3  middleaged   high      no          fair           yes
4      senior medium      no          fair           yes
5      senior    low     yes          fair           yes
6      senior    low     yes     excellent            no
7  middleaged    low     yes     excellent           yes
8       youth medium      no          fair            no
9       youth    low     yes          fair           yes
10     senior medium     yes          fair           yes
11      youth medium     yes     excellent           yes
12 middleaged medium      no     excellent           yes
13 middleaged   high     yes          fair           yes
14     senior medium      no     excellent            no

Create a data frame with the new values to be predicted

> df_new <- data.frame(age = "youth", income = "medium", student = "yes", credit_rating = "fair")

A model is trained using the naiveBayes() function. The model is then used to predict whether a person buys a computer.

> model <- naiveBayes(studnew$buys_computer ~ studnew$age + studnew$income + studnew$student + studnew$credit_rating, data = studnew)
> model

Naive Bayes Classifier for Discrete Predictors

Call:
naiveBayes.default(x = X, y = Y, laplace = laplace)

A-priori probabilities:
Y
       no       yes
0.3571429 0.6428571

Conditional probabilities:
     studnew$age
Y     middleaged    senior     youth
  no   0.0000000 0.4000000 0.6000000
  yes  0.4444444 0.3333333 0.2222222

     studnew$income
Y          high       low    medium
  no  0.4000000 0.2000000 0.4000000
  yes 0.2222222 0.3333333 0.4444444

     studnew$student
Y            no       yes
  no  0.8000000 0.2000000
  yes 0.3333333 0.6666667

     studnew$credit_rating
Y     excellent      fair
  no  0.6000000 0.4000000
  yes 0.3333333 0.6666667

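> predict(model, df_new)
[1] yes
Levels: no yes
> prediction <- predict(model, df_new, type = "raw")
> prediction
            no       yes
[1,] 0.3571429 0.6428571

Note that these raw values are identical to the a-priori probabilities, so the values in df_new did not influence this result. The hand calculation gives 0.028 and 0.007, which normalise to about 0.80 for yes and 0.20 for no. A likely cause is that the formula names the predictors studnew$age, studnew$income, and so on, which do not match the column names age, income, ... in df_new.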
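
As a cross-check on the predicted probabilities, the posterior for df_new can also be computed directly from the training data in base R, without e1071. The sketch below is an illustration only: nb_posterior() is a hypothetical helper written for this lab, not a function of any package, and it assumes categorical predictors stored as character columns and applies no Laplace smoothing.

```r
# Training data, re-entered from the table above so the sketch is self-contained.
studnew <- data.frame(
  age = c("youth", "youth", "middleaged", "senior", "senior", "senior", "middleaged",
          "youth", "youth", "senior", "youth", "middleaged", "middleaged", "senior"),
  income = c("high", "high", "high", "medium", "low", "low", "low",
             "medium", "low", "medium", "medium", "medium", "high", "medium"),
  student = c("no", "no", "no", "no", "yes", "yes", "yes",
              "no", "yes", "yes", "yes", "no", "yes", "no"),
  credit_rating = c("fair", "excellent", "fair", "fair", "fair", "excellent", "excellent",
                    "fair", "fair", "fair", "excellent", "excellent", "fair", "excellent"),
  buys_computer = c("no", "no", "yes", "yes", "yes", "no", "yes",
                    "no", "yes", "yes", "yes", "yes", "yes", "no"),
  stringsAsFactors = FALSE
)
df_new <- data.frame(age = "youth", income = "medium", student = "yes",
                     credit_rating = "fair", stringsAsFactors = FALSE)

# Hypothetical helper (not part of e1071): naive Bayes posterior for one new row.
nb_posterior <- function(train, target, newrow) {
  classes <- sort(unique(train[[target]]))
  score <- sapply(classes, function(cl) {
    rows <- train[train[[target]] == cl, , drop = FALSE]
    prior <- nrow(rows) / nrow(train)
    # Relative frequency of each new value within the class, P(feature / class).
    lik <- sapply(names(newrow), function(col) {
      mean(as.character(rows[[col]]) == as.character(newrow[[col]]))
    })
    prior * prod(lik)
  })
  # Normalise so that the class probabilities sum to 1.
  score / sum(score)
}

posterior <- nb_posterior(studnew, "buys_computer", df_new)
print(round(posterior, 4))
```

For this tuple the result should favour "yes" with a probability of roughly 0.80, in line with the hand calculation of 0.028 against 0.007.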

> table(studnew$age)

middleaged     senior      youth
         4          5          5
> table(studnew$income)

  high    low medium
     4      4      6
> table(studnew$age, studnew$buys_computer)

             no yes
  middleaged  0   4
  senior      2   3
  youth       3   2
> table(studnew$income, studnew$buys_computer)

         no yes
  high    2   2
  low     1   3
  medium  2   4
> table(studnew$student, studnew$buys_computer)

      no yes
  no   4   3
  yes  1   6
> table(studnew$credit_rating, studnew$buys_computer)

            no yes
  excellent  3   3
  fair       2   6
> table(studnew$buys_computer)

no yes
 5   9
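
The counts above are the raw material for the conditional probability tables: dividing each class column by its total gives P(feature value / class). A minimal base R sketch for the age predictor follows; the object names counts and cond_prob are chosen here for illustration.

```r
# Age and class label, re-entered from the data set so the sketch is self-contained.
age <- c("youth", "youth", "middleaged", "senior", "senior", "senior", "middleaged",
         "youth", "youth", "senior", "youth", "middleaged", "middleaged", "senior")
buys_computer <- c("no", "no", "yes", "yes", "yes", "no", "yes",
                   "no", "yes", "yes", "yes", "yes", "yes", "no")

# Cross-tabulate: rows are age groups, columns are classes.
counts <- table(age, buys_computer)

# margin = 2 divides every column (class) by its column total,
# so each column holds P(age / buys_computer).
cond_prob <- prop.table(counts, margin = 2)

# Transpose so that classes are rows, the layout used in the model's tables.
print(round(t(cond_prob), 7))
```

The same two calls, table() followed by prop.table(), can be repeated for income, student and credit_rating.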

> model$tables$age
     age
Y     middleaged    senior     youth
  no   0.0000000 0.4000000 0.6000000
  yes  0.4444444 0.3333333 0.2222222
> model$tables$income
     income
Y          high       low    medium
  no  0.4000000 0.2000000 0.4000000
  yes 0.2222222 0.3333333 0.4444444

> model$tables[[1]]
     age
Y     middleaged    senior     youth
  no   0.0000000 0.4000000 0.6000000
  yes  0.4444444 0.3333333 0.2222222
> model$tables[[2]]
     income
Y          high       low    medium
  no  0.4000000 0.2000000 0.4000000
  yes 0.2222222 0.3333333 0.4444444
