
Faculty of Information Technology
Can Tho University

Thanh Nghi
dtnghi@cit.ctu.edu.vn
Can Tho
02-12-2008

Bayesian learning methods
Bayesian classification

Contents
  Introduction to Bayesian classification
  The naive Bayes learning algorithm
  Conclusions and future directions

Bayesian classification
  a class of learning algorithms
  based on Bayes' theorem
  Bayesian networks and naive Bayes
  the resulting models can be interpreted
  handles classification, clustering, and related problems
  applied successfully to data analysis, text classification, spam filtering, etc.

Successful data mining (DM) techniques in real-world applications (2004)
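
The naive Bayes algorithm
  "naive" because it assumes that
    all attributes (variables) are equally important
    the attributes (variables) are statistically independent given the class
  remarks
    the independence assumption almost never holds exactly
    yet in practice naive Bayes gives fairly good results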

Weather data: decide whether to play (yes/no) from the attributes Outlook, Temp, Humidity, Windy

Outlook   Temp  Humidity  Windy  Play
Sunny     Hot   High      False  No
Sunny     Hot   High      True   No
Overcast  Hot   High      False  Yes
Rainy     Mild  High      False  Yes
Rainy     Cool  Normal    False  Yes
Rainy     Cool  Normal    True   No
Overcast  Cool  Normal    True   Yes
Sunny     Mild  High      False  No
Sunny     Cool  Normal    False  Yes
Rainy     Mild  Normal    False  Yes
Sunny     Mild  Normal    True   Yes
Overcast  Mild  High      True   Yes
Overcast  Hot   Normal    False  Yes
Rainy     Mild  High      True   No

Attribute    Value     Yes  No   Fraction of Yes  Fraction of No
Outlook      Sunny     2    3    2/9              3/5
Outlook      Overcast  4    0    4/9              0/5
Outlook      Rainy     3    2    3/9              2/5
Temperature  Hot       2    2    2/9              2/5
Temperature  Mild      4    2    4/9              2/5
Temperature  Cool      3    1    3/9              1/5
Humidity     High      3    4    3/9              4/5
Humidity     Normal    6    1    6/9              1/5
Windy        False     6    2    6/9              2/5
Windy        True      3    3    3/9              3/5

Play: Yes 9 (9/14), No 5 (5/14)
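
The frequency counts for the weather data can be reproduced mechanically. The following is a minimal sketch in Python, not part of the original slides; the names `DATA`, `count_table` and `conditional` are illustrative assumptions, and Windy is stored as the strings "True"/"False" for simplicity.

```python
from collections import Counter, defaultdict
from fractions import Fraction

ATTRIBUTES = ["Outlook", "Temp", "Humidity", "Windy"]

# The 14 weather records as (Outlook, Temp, Humidity, Windy, Play).
DATA = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]


def count_table(rows):
    """Count records per class and per (attribute value, class) pair."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)
    for row in rows:
        cls = row[-1]
        # zip stops after the four attributes, so Play itself is not counted here.
        for attr, value in zip(ATTRIBUTES, row):
            value_counts[attr][(value, cls)] += 1
    return class_counts, value_counts


def conditional(value_counts, class_counts, attr, value, cls):
    """Relative frequency of attr = value among the records of class cls."""
    return Fraction(value_counts[attr][(value, cls)], class_counts[cls])


class_counts, value_counts = count_table(DATA)
print(class_counts["Yes"], class_counts["No"])                              # 9 5
print(conditional(value_counts, class_counts, "Outlook", "Sunny", "Yes"))   # 2/9
```

`Fraction` keeps the relative frequencies exact; note that it prints them in lowest terms, so 3/9 is shown as 1/3.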

Outlook  Temp.  Humidity  Windy  Play
Sunny    Cool   High      True   ?

Decision (play = yes/no):
Likelihood(yes) = 2/9 × 3/9 × 3/9 × 3/9 × 9/14 = 0.0053
Likelihood(no)  = 3/5 × 1/5 × 4/5 × 3/5 × 5/14 = 0.0206
Probabilities:
P(yes) = 0.0053 / (0.0053 + 0.0206) = 0.205
P(no)  = 0.0206 / (0.0053 + 0.0206) = 0.795
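
The arithmetic for this new day can be checked with a few lines of Python. This is only a sketch: the dictionary layout and variable names are assumptions made for illustration, while the fractions are the ones used in the likelihoods above.

```python
from fractions import Fraction as F
from math import prod

# Factors for the new day (Outlook=Sunny, Temp=Cool, Humidity=High, Windy=True),
# in that order, followed by the class prior.
factors = {
    "yes": [F(2, 9), F(3, 9), F(3, 9), F(3, 9), F(9, 14)],
    "no": [F(3, 5), F(1, 5), F(4, 5), F(3, 5), F(5, 14)],
}

# Unnormalized likelihood of each class: the product of its factors.
likelihood = {cls: prod(fs) for cls, fs in factors.items()}

# Normalizing by the sum turns the likelihoods into probabilities.
total = sum(likelihood.values())
posterior = {cls: lk / total for cls, lk in likelihood.items()}

for cls in ("yes", "no"):
    print(cls, round(float(likelihood[cls]), 4), round(float(posterior[cls]), 3))
# yes 0.0053 0.205
# no 0.0206 0.795
```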

Bayes' rule
  Probability of event H given evidence E:

  $$\Pr[H \mid E] = \frac{\Pr[E \mid H]\,\Pr[H]}{\Pr[E]}$$

  A priori probability of H: Pr[H]
    probability of the event before evidence is seen
  A posteriori probability of H: Pr[H | E]
    probability of the event after evidence is seen
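
Bayes' rule
  Classification learning: a data instance arrives and must be assigned a class
    Evidence E = the instance (its attribute values)
    Event H = the class value of the instance
  Naive assumption: the evidence splits into parts E_1, ..., E_n (one per attribute) that are independent given the class:

  $$\Pr[H \mid E] = \frac{\Pr[E_1 \mid H]\,\Pr[E_2 \mid H] \cdots \Pr[E_n \mid H]\,\Pr[H]}{\Pr[E]}$$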
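
A generic version of this factorized rule can be sketched as follows. The function name and the dictionary-based inputs are assumptions for illustration; Pr[E] is not estimated separately but obtained by summing the numerators over all classes, which is what makes the resulting probabilities add up to 1.

```python
from math import prod


def naive_bayes_posterior(priors, conditionals):
    """Return Pr[H = h | E] for every class h.

    priors[h]       is Pr[H = h]
    conditionals[h] is the list [Pr[E_1 | h], ..., Pr[E_n | h]]
    """
    # Numerator of the naive Bayes rule for each class.
    joint = {h: prod(conditionals[h]) * priors[h] for h in priors}
    # Pr[E] as the sum of the numerators over all classes.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("every class has probability 0 for this evidence")
    return {h: value / evidence for h, value in joint.items()}


# Hypothetical two-class example with two pieces of evidence.
example = naive_bayes_posterior(
    priors={"a": 0.5, "b": 0.5},
    conditionals={"a": [0.2, 0.5], "b": [0.4, 0.5]},
)
print(example)
```

In the hypothetical example the numerators are 0.05 and 0.10, so class "a" receives one third of the probability mass and class "b" two thirds.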
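
Bayes' rule
  Evidence E:

  Outlook  Temp.  Humidity  Windy  Play
  Sunny    Cool   High      True   ?

  Probability of class yes:

  $$
  \begin{aligned}
  \Pr[\text{yes} \mid E] &= \frac{\Pr[\text{Outlook} = \text{Sunny} \mid \text{yes}] \times \Pr[\text{Temperature} = \text{Cool} \mid \text{yes}] \times \Pr[\text{Humidity} = \text{High} \mid \text{yes}] \times \Pr[\text{Windy} = \text{True} \mid \text{yes}] \times \Pr[\text{yes}]}{\Pr[E]} \\
  &= \frac{\frac{2}{9} \times \frac{3}{9} \times \frac{3}{9} \times \frac{3}{9} \times \frac{9}{14}}{\Pr[E]}
  \end{aligned}
  $$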

Probability = 0
  What if an attribute value does not occur with every class value?
  (e.g. suppose Humidity = High never occurred with class yes; in the weather counts, Outlook = Overcast never occurs with class no: 0/5)
    Probability will be zero!
      $$\Pr[\text{Humidity} = \text{High} \mid \text{yes}] = 0$$
    A posteriori probability will also be zero!
      $$\Pr[\text{yes} \mid E] = 0$$
  Remedy: use the Laplace estimator
    probabilities then never take the value 0
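
Laplace estimator
  Example: attribute Outlook for class yes, with a constant μ added to the denominator and split equally over the three values

  $$\underbrace{\frac{2 + \mu/3}{9 + \mu}}_{\text{Sunny}} \qquad \underbrace{\frac{4 + \mu/3}{9 + \mu}}_{\text{Overcast}} \qquad \underbrace{\frac{3 + \mu/3}{9 + \mu}}_{\text{Rainy}}$$

  The weights need not be equal, but they must sum to 1 (p_1 + p_2 + p_3 = 1)

  $$\underbrace{\frac{2 + \mu p_1}{9 + \mu}}_{\text{Sunny}} \qquad \underbrace{\frac{4 + \mu p_2}{9 + \mu}}_{\text{Overcast}} \qquad \underbrace{\frac{3 + \mu p_3}{9 + \mu}}_{\text{Rainy}}$$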
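
The smoothed estimates can be tried out with a short sketch. The function name `smoothed` and the choice μ = 3 are assumptions made for illustration only; the counts 2, 4, 3 out of 9 are the Outlook counts for class yes, and 0 out of 5 is the Overcast count for class no from the weather data.

```python
from fractions import Fraction


def smoothed(count, class_total, mu, weight):
    """Laplace-style estimate (count + mu * weight) / (class_total + mu)."""
    return (count + mu * weight) / (class_total + mu)


counts_yes = {"Sunny": 2, "Overcast": 4, "Rainy": 3}
mu = 3                        # assumed value of the constant, for illustration
equal = Fraction(1, 3)        # equal weights over the three Outlook values

probs = {value: smoothed(c, 9, mu, equal) for value, c in counts_yes.items()}
print(probs["Sunny"], probs["Overcast"], probs["Rainy"])  # 1/4 5/12 1/3
print(sum(probs.values()))                                # 1

# A count of zero (Outlook = Overcast for class no) no longer gives probability 0.
print(smoothed(0, 5, mu, equal))                          # 1/8
```

Because the weights sum to 1, the smoothed estimates for one attribute and one class still sum to 1, whatever value of μ is chosen.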

Noisy (missing) attribute values
  Training: skip the noisy or missing values when counting
  Classification: leave the affected attributes out of the calculation
  Example (the Outlook value is unknown):

  Outlook  Temp.  Humidity  Windy  Play
  ?        Cool   High      True   ?

  Likelihood(yes) = 3/9 × 3/9 × 3/9 × 9/14 = 0.0238
  Likelihood(no)  = 1/5 × 4/5 × 3/5 × 5/14 = 0.0343
  P(yes) = 0.0238 / (0.0238 + 0.0343) = 41%
  P(no)  = 0.0343 / (0.0238 + 0.0343) = 59%
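
Leaving an unknown attribute out of the product can be sketched as below. The names `COND`, `PRIOR` and `classify`, and the use of `None` to mark an unknown value, are assumptions for illustration; only the four attribute values needed for this example are tabulated.

```python
from fractions import Fraction as F
from math import prod

# Pr[attribute = value | class] for the values used in this example.
COND = {
    "yes": {("Outlook", "Sunny"): F(2, 9), ("Temp", "Cool"): F(3, 9),
            ("Humidity", "High"): F(3, 9), ("Windy", "True"): F(3, 9)},
    "no": {("Outlook", "Sunny"): F(3, 5), ("Temp", "Cool"): F(1, 5),
           ("Humidity", "High"): F(4, 5), ("Windy", "True"): F(3, 5)},
}
PRIOR = {"yes": F(9, 14), "no": F(5, 14)}


def classify(instance):
    """Return (likelihoods, probabilities); attributes set to None are skipped."""
    known = [(attr, value) for attr, value in instance.items() if value is not None]
    likelihood = {
        cls: prod(COND[cls][pair] for pair in known) * PRIOR[cls] for cls in PRIOR
    }
    total = sum(likelihood.values())
    return likelihood, {cls: lk / total for cls, lk in likelihood.items()}


likelihood, posterior = classify(
    {"Outlook": None, "Temp": "Cool", "Humidity": "High", "Windy": "True"}
)
for cls in ("yes", "no"):
    print(cls, round(float(likelihood[cls]), 4), round(float(posterior[cls]) * 100))
# yes 0.0238 41
# no 0.0343 59
```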

Continuous data
  Assume the attributes follow a Gaussian (normal) distribution
  The probability density function is computed as follows
    mean μ

    $$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$

    standard deviation σ (the square root of the variance below)

    $$\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \mu)^2$$

    probability density function f(x)

    $$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

  Karl Gauss, 1777-1855
  great German mathematician
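
Estimating the mean and standard deviation from a sample takes only a few lines. In this sketch the function name `mean_and_std` and the five readings are hypothetical, chosen so the arithmetic is easy to follow; the variance uses the 1/(n - 1) form.

```python
import math
import statistics


def mean_and_std(values):
    """Return (mu, sigma), dividing the squared deviations by n - 1."""
    n = len(values)
    mu = sum(values) / n
    variance = sum((x - mu) ** 2 for x in values) / (n - 1)
    return mu, math.sqrt(variance)


# Hypothetical temperature readings (not taken from the slides).
readings = [64.0, 68.0, 70.0, 72.0, 76.0]
mu, sigma = mean_and_std(readings)
print(mu, round(sigma, 3))  # 70.0 4.472

# The standard library agrees: statistics.stdev also divides by n - 1.
print(math.isclose(sigma, statistics.stdev(readings)))  # True
```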

Continuous data
  Example (class yes: μ = 73, σ = 6.2):

  $$f(\text{temperature} = 66 \mid \text{yes}) = \frac{1}{\sqrt{2\pi}\cdot 6.2}\, e^{-\frac{(66 - 73)^2}{2 \cdot 6.2^2}} = 0.0340$$
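
The density value in this example can be verified numerically. A minimal sketch, where `gaussian_pdf` is an illustrative name and μ = 73, σ = 6.2 are the values appearing in the formula:

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


density = gaussian_pdf(66, mu=73, sigma=6.2)
print(round(density, 4))  # 0.034
```

Note that this is a density, not a probability: it is used as a factor in the likelihood in the same position where a relative frequency would appear for a nominal attribute.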

Continuous data
  Classification

  Outlook  Temp.  Humidity  Windy  Play
  Sunny    66     90        true   ?

  Likelihood(yes) = 2/9 × 0.0340 × 0.0221 × 3/9 × 9/14 = 0.000036
  Likelihood(no)  = 3/5 × 0.0279 × 0.0380 × 3/5 × 5/14 = 0.000136
  P(yes) = 0.000036 / (0.000036 + 0.000136) = 20.9%
  P(no)  = 0.000136 / (0.000036 + 0.000136) = 79.1%
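
Mixing relative frequencies (nominal attributes) with density values (numeric attributes) in one product can be sketched as follows. The variable names are illustrative. One assumption should be noted: the temperature density for class no is taken as 0.0279, since that is the value for which the product rounds to the stated 0.000136.

```python
from math import prod

# Factors in the order Outlook, Temp., Humidity, Windy, class prior.
# Temp. and Humidity contribute Gaussian density values instead of frequencies.
factors = {
    "yes": [2 / 9, 0.0340, 0.0221, 3 / 9, 9 / 14],
    "no": [3 / 5, 0.0279, 0.0380, 3 / 5, 5 / 14],  # 0.0279 is an assumption, see above
}

likelihood = {cls: prod(fs) for cls, fs in factors.items()}
total = sum(likelihood.values())
posterior = {cls: lk / total for cls, lk in likelihood.items()}

for cls in ("yes", "no"):
    print(cls, round(likelihood[cls], 6), round(posterior[cls] * 100))
```

Normalizing the unrounded likelihoods gives roughly 21% for yes and 79% for no; the figures 20.9% and 79.1% result from normalizing the likelihoods after rounding them to 0.000036 and 0.000136.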
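
Conclusion
  Naive Bayes
    works well in practice even though it relies on the assumption that the attributes are statistically independent
    classification does not require accurate probability estimates: it is enough that the correct class receives the highest probability
    easy to implement, fast to train, results easy to understand
    used in text classification, spam filtering, etc.
    however, when the data contain many redundant attributes, naive Bayes loses its effectiveness
    continuous data may not follow a normal distribution (=> kernel density estimators)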

Future directions
  Naive Bayes
    select a subset of the original attributes
    use only the selected attributes to learn the classifier
  Bayesian networks: model the relationships between attributes
  Information retrieval (ranking)