Hocmay Chapter11
Machine Learning
Contents
Introduction
Concept learning
Decision tree
Faculty of Information Technology, Ho Chi Minh City University of Technology (Bach Khoa). Lecture notes for the course: Artificial Intelligence
Concept learning
Concept learning:
A concept is defined over a set of many elements (instances).
Each instance is labeled as belonging to the concept (positive) or not (negative).
Example:
(Input) Training examples: a set of animals together with their attributes.
Concept learning (cont.)
Example: table of training examples 1 to 4.
Concept learning (cont.)
Hypothesis:
If X is an instance and X satisfies all the constraints of hypothesis h, then h classifies X as positive (h(X) = 1).
Example: the hypothesis that Aldo enjoys his sport on cold days with high humidity is written:
<?, Cold, High, ?, ?, ?>
Most general hypothesis:
<?, ?, ?, ?, ?, ?>
Most specific hypothesis:
<∅, ∅, ∅, ∅, ∅, ∅>
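A minimal Python sketch of how such a hypothesis classifies an instance. It assumes hypotheses and instances are stored as tuples, with "?" accepting any value and None standing in for ∅; the names `classify`, `ANY` and the sample instances are illustrative and do not come from the lecture.

```python
ANY = "?"      # constraint that accepts every value
EMPTY = None   # assumed encoding of the empty constraint (accepts nothing)

def classify(h, x):
    """Return 1 if instance x satisfies every constraint of h, else 0."""
    return int(all(c == ANY or c == v for c, v in zip(h, x)))

h = (ANY, "Cold", "High", ANY, ANY, ANY)
cold_day = ("Sunny", "Cold", "High", "Strong", "Warm", "Same")   # illustrative
warm_day = ("Sunny", "Warm", "High", "Strong", "Warm", "Same")

most_general = (ANY,) * 6
most_specific = (EMPTY,) * 6

print(classify(h, cold_day), classify(h, warm_day))
```

The most general hypothesis accepts every instance and the most specific one accepts none, which is why they sit at the two ends of the hypothesis space.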
Concept learning (cont.)
Notation:
Set of instances:
The set from which the concept is extracted. Symbol: X.
In the example above, the set of instances is the set of days, each day described by 6 attributes.
Concept learning (cont.)
Concept learning as search:
Search the space of possible hypotheses and return the hypothesis that best fits the training examples.
Number of possible hypotheses:
Each hypothesis has N attributes.
Each attribute has M possible values, plus two more: ? and ∅.
All hypotheses that contain ∅ are equivalent: each one denotes the empty set.
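Concept learning (cont.)
Concept learning as search:
Number of possible hypotheses: $1 + (M+1)^N$.
Ordering of hypotheses:
Example: $h_2 \geq_g h_1$ for the following h1 and h2, because every instance that satisfies h1 also satisfies h2:
h1 = <Sunny, ?, ?, Strong, ?, ?>
h2 = <Sunny, ?, ?, ?, ?, ?>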
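The count and the ordering can be checked with a short Python sketch. It assumes the tuple encoding with "?" for any value and None for ∅; the function names and the sample sizes M = 2, N = 3 are illustrative only.

```python
def count_hypotheses(m, n):
    """1 + (M+1)^N: one hypothesis for 'accepts nothing' plus every
    combination of M values or '?' over N attributes."""
    return 1 + (m + 1) ** n

def more_general_or_equal(h2, h1):
    """True if h2 >=_g h1, i.e. every instance accepted by h1 is accepted by h2."""
    if any(c is None for c in h1):      # h1 accepts nothing, so any h2 covers it
        return True
    return all(c2 == "?" or c2 == c1 for c2, c1 in zip(h2, h1))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")

print(more_general_or_equal(h2, h1), more_general_or_equal(h1, h2))
print(count_hypotheses(2, 3))
```

The relation is only a partial order: two hypotheses that constrain different attributes are not comparable in either direction.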
Concept learning (cont.)
FIND-S algorithm:
1. h = the most specific hypothesis in H.
2. For each x in the training set with c(x) = 1:
   For each constraint a_i in h:
     IF a_i is satisfied by x THEN do nothing.
     ELSE replace a_i with the next more general constraint that is satisfied by x.
3. Output h.
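The three steps can be sketched in Python as follows. This is a sketch under assumptions, not the lecture's implementation: hypotheses are tuples with "?" for any value and None for ∅, and the four training examples are assumed (they follow the standard EnjoySport data set, whose second instance matches the positive instance quoted in the FIND-S example).

```python
def find_s(examples):
    """examples: list of (instance_tuple, label), label 1 for positive."""
    n = len(examples[0][0])
    h = [None] * n                     # step 1: most specific hypothesis
    for x, label in examples:          # step 2: positive examples only
        if label != 1:
            continue                   # FIND-S ignores negative examples
        for i, value in enumerate(x):
            if h[i] is None:           # empty constraint -> the observed value
                h[i] = value
            elif h[i] != "?" and h[i] != value:
                h[i] = "?"             # conflicting values -> any value
    return tuple(h)                    # step 3: output h

# Assumed training data (EnjoySport style).
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), 1),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), 1),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), 0),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), 1),
]

result = find_s(EXAMPLES)
print(result)
```

For a conjunctive hypothesis the "next more general constraint" is either the observed value (when the constraint was empty) or "?", which is why two branches are enough.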
Concept learning (cont.)
FIND-S algorithm (example):
Instance 2 (positive):
<Sunny, Warm, High, Strong, Warm, Same>
Concept learning (cont.)
FIND-S algorithm (example):
If the training examples are correct, FIND-S returns the most specific hypothesis in H that is consistent with the positive instances.
Open questions:
What if several hypotheses are consistent with the data?
What if the training set contains errors?
Does the algorithm converge?
Concept learning (cont.)
Candidate-Elimination (CE) algorithm:
Concept learning (cont.)
List-Then-Eliminate (LTE):
1. VersionSpace ← a list containing every hypothesis in H.
2. For each <x, c(x)>: remove from VersionSpace any h with h(x) ≠ c(x).
3. Output: the list of hypotheses h in VersionSpace.
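A brute-force Python sketch of List-Then-Eliminate, practical only for small hypothesis spaces. The attribute domains and the four training examples are assumptions (EnjoySport style); None stands for the single hypothesis that accepts nothing.

```python
from itertools import product

def classify(h, x):
    """h is a tuple of constraints, or None for the hypothesis accepting nothing."""
    return int(h is not None and all(c == "?" or c == v for c, v in zip(h, x)))

def list_then_eliminate(domains, examples):
    # Step 1: every hypothesis in H.
    version_space = [None] + list(product(*[values + ["?"] for values in domains]))
    # Step 2: remove hypotheses that disagree with a training example.
    for x, label in examples:
        version_space = [h for h in version_space if classify(h, x) == label]
    # Step 3: what remains is the version space.
    return version_space

# Assumed attribute domains and training data.
DOMAINS = [["Sunny", "Cloudy", "Rainy"], ["Warm", "Cold"], ["Normal", "High"],
           ["Strong", "Weak"], ["Warm", "Cool"], ["Same", "Change"]]
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), 1),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), 1),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), 0),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), 1),
]

vs = list_then_eliminate(DOMAINS, EXAMPLES)
for h in vs:
    print(h)
```

Because the whole of H is enumerated up front, the cost grows as $(M+1)^N$, which motivates the compact boundary representation.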
Compact representation of the version space (VS):
General boundary G:
The set of maximally general members of H consistent with D.
$G = \{ g \in H \mid \mathrm{Consistent}(g, D) \wedge \neg(\exists g' \in H)[(g' >_g g) \wedge \mathrm{Consistent}(g', D)] \}$
Specific boundary S:
The set of maximally specific members of H consistent with D.
$S = \{ s \in H \mid \mathrm{Consistent}(s, D) \wedge \neg(\exists s' \in H)[(s >_g s') \wedge \mathrm{Consistent}(s', D)] \}$
Concept learning (cont.)
Compact representation of the version space:
Version space representation theorem:
$VS_{H,D} = \{ h \in H \mid (\exists s \in S)(\exists g \in G)(g \geq_g h \geq_g s) \}$
Candidate-Elimination:
G = the set of most general hypotheses in H.
S = the set of most specific hypotheses in H.
For each d in the training set:
IF d is positive:
  Remove from G any hypothesis inconsistent with d.
  For each s in S that is inconsistent with d:
    Remove s from S.
    Add to S all minimal generalizations h of s such that:
      h is consistent with d, and
      some member of G is more general than h.
    Remove from S any hypothesis that is more general than another hypothesis in S.
Concept learning (cont.)
Candidate-Elimination:
IF d is negative:
  Remove from S any hypothesis inconsistent with d.
  For each g in G that is inconsistent with d:
    Remove g from G.
    Add to G all minimal specializations h of g such that:
      h is consistent with d, and
      some member of S is more specific than h.
    Remove from G any hypothesis that is less general than another hypothesis in G.
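The positive and negative cases of Candidate-Elimination can be put together in a compact Python sketch for conjunctive hypotheses. It is a sketch under assumptions: tuples with "?" for any value and None for ∅, helper names of my own choosing, and assumed attribute domains and training examples (EnjoySport style).

```python
def covers(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(h2, h1):
    # A None constraint in h1 means h1 accepts nothing, so any h2 covers it.
    return any(c is None for c in h1) or all(
        c2 == "?" or c2 == c1 for c2, c1 in zip(h2, h1))

def min_generalization(s, x):
    # Smallest change to s that makes it accept the positive instance x.
    return tuple(v if c is None else (c if c == v else "?") for c, v in zip(s, x))

def min_specializations(g, x, domains):
    # Replace one "?" by a value that differs from x, so that x is rejected.
    return [g[:i] + (v,) + g[i + 1:]
            for i, c in enumerate(g) if c == "?"
            for v in domains[i] if v != x[i]]

def candidate_elimination(examples, domains):
    S = {(None,) * len(domains)}
    G = {("?",) * len(domains)}
    for x, label in examples:
        if label == 1:
            G = {g for g in G if covers(g, x)}
            new_S = set()
            for s in S:
                h = s if covers(s, x) else min_generalization(s, x)
                if any(more_general_or_equal(g, h) for g in G):
                    new_S.add(h)
            S = {s for s in new_S if not any(
                s != t and more_general_or_equal(s, t) for t in new_S)}
        else:
            S = {s for s in S if not covers(s, x)}
            new_G = set()
            for g in G:
                candidates = [g] if not covers(g, x) else min_specializations(g, x, domains)
                new_G.update(h for h in candidates
                             if any(more_general_or_equal(h, s) for s in S))
            G = {g for g in new_G if not any(
                g != t and more_general_or_equal(t, g) for t in new_G)}
    return S, G

# Assumed attribute domains and training data.
DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cool"), ("Same", "Change")]
EXAMPLES = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), 1),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), 1),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), 0),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), 1),
]

S, G = candidate_elimination(EXAMPLES, DOMAINS)
print(S)
print(G)
```

Only the two boundary sets are stored; every other member of the version space lies between some s in S and some g in G.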
Concept learning (cont.)
Candidate-Elimination (example):
Initialization:
S0: {<∅, ∅, ∅, ∅, ∅, ∅>}
Concept learning (cont.)
Candidate-Elimination (example):
Version space:
S4: {<Sunny, Warm, ?, Strong, ?, ?>}
<Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>
Decision tree learning
Introduction
A method for approximating discrete-valued functions.
The learned function is represented as a decision tree, which can also be written as if-then rules.
It belongs to the class of inductive inference algorithms.
Decision tree representation
Each node tests an attribute of the instance.
Each branch from node X corresponds to one possible value of the attribute tested at X.
Classifying an instance with the tree:
Start at the root node, test the attribute associated with this node, and move down the branch whose value matches the instance's value for that attribute. Repeat on the subtree.
A sample decision tree follows.
Decision tree learning (cont.)
Outlook
  Sunny: Humidity
    High: NO
    Normal: YES
  Overcast: YES
  Rain: Wind
    Strong: NO
    Weak: YES
A decision tree representing the concept PlayTennis. The tree classifies whether a given Sunday morning is suitable for playing tennis, based on that morning's attributes <Outlook, Humidity, Wind, ..>.
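The classification procedure on this tree can be sketched in Python. The nested-dictionary encoding and the names `TREE` and `classify` are assumptions made for illustration; the sample mornings are hypothetical.

```python
# Inner nodes: {attribute: {value: subtree}}; leaves: the class label.
TREE = {"Outlook": {
    "Sunny": {"Humidity": {"High": "NO", "Normal": "YES"}},
    "Overcast": "YES",
    "Rain": {"Wind": {"Strong": "NO", "Weak": "YES"}},
}}

def classify(tree, instance):
    """Walk from the root to a leaf, following the branch for each tested attribute."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))             # attribute tested at this node
        tree = tree[attribute][instance[attribute]]
    return tree

morning = {"Outlook": "Sunny", "Humidity": "High", "Wind": "Weak"}   # hypothetical
print(classify(TREE, morning))
```

Attributes that are not on the path taken, such as Wind for a sunny morning, are never consulted.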
Decision tree learning (cont.)
With the tree above:
A morning with attributes:
Decision tree:
Decision tree learning (cont.)
Problems suited to decision trees:
Instances are represented as <attribute, value> pairs.
The target function has discrete output values. In the example above the function has two values, YES | NO.
The problem calls for a disjunctive description (a disjunction of conjunctions).
The training set may contain errors, both in the classification of instances and in the attribute values assigned.
The training set may contain instances with some attribute values missing.
Application areas:
Classifying patients by disease.
Classifying equipment malfunctions by cause.
Finance.
Decision tree learning (cont.)
Basic learning algorithm: ID3.
Idea:
Build the tree from the root down to the leaves by answering: which attribute is the best one to test at the root?
One subtree is created for each branch, that is, for each value of this attribute.
The process repeats on the subtrees.
ENTROPY:
Measures the homogeneity of the training set.
Training set: S
$\mathrm{Entropy}(S) = -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-}$
Decision tree learning (cont.)
Basic learning algorithm: ID3.
ENTROPY:
$p_{+}$: the proportion of positive instances among all instances.
$p_{-}$: the proportion of negative instances among all instances.
Example:
Total number of instances: 14, of which:
9 instances are positive (they belong to the target concept).
5 instances are negative (they do not belong to the target concept).
In short: [9+,5-]
$\mathrm{Entropy}([9+,5-]) = -(9/14)\log_2(9/14) - (5/14)\log_2(5/14) = 0.940$
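The arithmetic of this example can be reproduced with a few lines of Python. The function name `entropy` and its two-count signature are illustrative choices.

```python
from math import log2

def entropy(pos, neg):
    """Entropy of a set with `pos` positive and `neg` negative instances."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count:                      # an empty class contributes nothing
            p = count / total
            result -= p * log2(p)
    return result

print(round(entropy(9, 5), 3))  # 0.94
```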
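Decision tree learning (cont.)
Basic learning algorithm: ID3.
ENTROPY:
Notes:
$0 \log_2 0 = 0$
When $p_{+} = 0$ or $p_{-} = 0$, Entropy = 0.
Entropy = 1 when $p_{+} = p_{-}$.
For two classes, $0 \leq \mathrm{Entropy} \leq 1$.
With c classes: $\mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2 p_i$
c = 2 is the case above.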
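A short Python sketch of the general c-class formula, taking a list of class counts (the list-based signature is an assumption for illustration). It also shows that with c equally frequent classes the entropy reaches $\log_2 c$, so the upper bound of 1 is specific to c = 2.

```python
from math import log2

def entropy(counts):
    """Entropy of a set given the number of instances in each class."""
    total = sum(counts)
    # Classes with zero instances are skipped, following 0 * log2(0) = 0.
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

two_class = entropy([9, 5])
balanced = entropy([7, 7])
pure = entropy([14, 0])
four_class = entropy([1, 1, 1, 1])
print(two_class, balanced, pure, four_class)
```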
Decision tree learning (cont.)
Basic learning algorithm: ID3.
Information gain:
Measures the expected reduction in entropy.
Formula: $\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v)$
S: the training set.
A: an attribute.
Values(A): the set of possible values of A.
$S_v$: the subset of S in which attribute A has value v.
$|S_v|/|S|$: the fraction of instances whose attribute A has value v.
Example:
See the table of training instances that follows.
Decision tree learning (cont.)
Day  Outlook   Temperature  Humidity  Wind    PlayTennis
1    Sunny     Hot          High      Weak    No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Weak    Yes
4    Rain      Mild         High      Weak    Yes
5    Rain      Cool         Normal    Weak    Yes
6    Rain      Cool         Normal    Strong  No
7    Overcast  Cool         Normal    Strong  Yes
8    Sunny     Mild         High      Weak    No
9    Sunny     Cool         Normal    Weak    Yes
10   Rain      Mild         Normal    Weak    Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   Overcast  Hot          Normal    Weak    Yes
14   Rain      Mild         High      Strong  No
Decision tree learning (cont.)
Basic learning algorithm: ID3.
Information gain:
Information gain is computed next for two attributes: Humidity and Wind.
Decision tree learning (cont.)
S: [9+,5-], E = 0.94, split on Humidity
S: [9+,5-], E = 0.94, split on Wind
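The two gains can be worked out from class counts alone, as in the Python sketch below. The per-value counts are tallied from the 14-row training table (Humidity: High [3+,4-], Normal [6+,1-]; Wind: Weak [6+,2-], Strong [3+,3-]); the function names and the count-based signature are illustrative.

```python
from math import log2

def entropy(pos, neg):
    total = pos + neg
    return sum(-(c / total) * log2(c / total) for c in (pos, neg) if c)

def gain(parent, children):
    """parent: (pos, neg) for S; children: one (pos, neg) pair per attribute value."""
    total = sum(parent)
    remainder = sum((p + n) / total * entropy(p, n) for p, n in children)
    return entropy(*parent) - remainder

humidity = gain((9, 5), [(3, 4), (6, 1)])   # High, Normal
wind = gain((9, 5), [(6, 2), (3, 3)])       # Weak, Strong
print(round(humidity, 3), round(wind, 3))
```

Humidity yields the larger gain of the two, so it separates the classes better than Wind on the full set S.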
Decision tree learning (cont.)
The algorithm step by step:
Problem:
The data table given above.
Target concept: PlayTennis.
Step 1: tree for S
Gain(S, Outlook) = 0.246
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Gain(S, Temperature) = 0.029
Outlook is the best classifying attribute at this step, so Outlook becomes the root node. The tree is as follows:
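Step 1 can be reproduced directly from the training table with the Python sketch below. The data rows are copied from the table; the names `DATA`, `entropy`, `gain` and `best` are illustrative. Computed values may differ from the slide in the third decimal place because of rounding.

```python
from collections import Counter
from math import log2

COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind", "PlayTennis")
ROWS = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
""".strip().splitlines()
DATA = [dict(zip(COLUMNS, row.split())) for row in ROWS]

def entropy(rows, target="PlayTennis"):
    counts = Counter(r[target] for r in rows)
    return sum(-(c / len(rows)) * log2(c / len(rows)) for c in counts.values())

def gain(rows, attribute):
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder

gains = {a: gain(DATA, a) for a in COLUMNS[:-1]}
best = max(gains, key=gains.get)
for attribute, value in gains.items():
    print(attribute, round(value, 3))
print("root:", best)
```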
Decision tree learning (cont.)
[D1,D2,..,D14] [9+,5-]
Outlook
  Sunny: [D1,D2,D8,D9,D11] [2+,3-]
  Overcast: [D3,D7,D12,D13] [4+,0-]
  Rain: [D4,D5,D6,D10,D14] [3+,2-]
Decision tree learning (cont.)
The algorithm step by step:
Step 2: tree for $S_{Sunny}$
$S_{Sunny}$ = {D1,D2,D8,D9,D11}
$\mathrm{Gain}(S_{Sunny}, Humidity) = 0.97 - (3/5) \cdot 0.0 - (2/5) \cdot 0.0 = 0.97$
$\mathrm{Gain}(S_{Sunny}, Wind) = 0.97 - (2/5) \cdot 1.0 - (3/5) \cdot 0.918 = 0.019$
$\mathrm{Gain}(S_{Sunny}, Temperature) = 0.97 - (2/5) \cdot 0.0 - (2/5) \cdot 1.0 - (1/5) \cdot 0.0 = 0.57$
Humidity is the best classifying attribute at this step, so Humidity becomes the root of the subtree for $S_{Sunny}$. The tree is as follows:
Decision tree learning (cont.)
[D1,D2,..,D14] [9+,5-]
Outlook
  Sunny: [D1,D2,D8,D9,D11] [2+,3-], test Humidity
    High: [D1,D2,D8] [0+,3-], NO
    Normal: [D9,D11] [2+,0-], YES
  Overcast: [D3,D7,D12,D13] [4+,0-], YES
  Rain: [D4,D5,D6,D10,D14] [3+,2-]
Decision tree learning (cont.)
The algorithm step by step:
Step 3: tree for $S_{Rain}$
$S_{Rain}$ = {D4,D5,D6,D10,D14}
Handled in the same way as above.
Result:
The final tree is the same as the first tree shown in the decision tree learning section.
Stopping condition:
Every leaf node falls into one of two cases:
1. All attributes already appear on the path from the root to the leaf.
2. The leaf has entropy = 0: either all of its instances are positive or all are negative.
Decision tree learning (cont.)
ID3 algorithm: ID3(Examples, Target_Attribute, Attributes)
Create a root node for the tree.
IF all instances are positive THEN return the single-node tree labeled +.
IF all instances are negative THEN return the single-node tree labeled -.
IF Attributes is empty THEN return the single-node tree labeled with the most common value of Target_Attribute among the instances.
ELSE:
BEGIN
  A ← the attribute in Attributes that best classifies the instances.
  The attribute for root is A (root ← A).
  For each value Vi of A:
    Add a new branch below root, corresponding to A = Vi.
    Examples_Vi = the subset of Examples with A = Vi.
    IF Examples_Vi is empty:
      THEN below this new branch add a leaf node labeled with the most common value of Target_Attribute in Examples.
      ELSE below this new branch add the subtree returned by the call ID3(Examples_Vi, Target_Attribute, Attributes - {A}).
END
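The pseudocode can be turned into a runnable Python sketch as follows. It is one possible rendering under assumptions: trees are nested dictionaries, "best" means highest information gain, and the rows are the 14 PlayTennis training instances; the names `id3`, `DATA` and `DOMAINS` are illustrative.

```python
from collections import Counter
from math import log2

COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind", "PlayTennis")
ROWS = """
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rain Mild High Weak Yes
Rain Cool Normal Weak Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rain Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rain Mild High Strong No
""".strip().splitlines()
DATA = [dict(zip(COLUMNS, row.split())) for row in ROWS]
ATTRIBUTES = list(COLUMNS[:-1])
DOMAINS = {a: sorted({r[a] for r in DATA}) for a in ATTRIBUTES}

def entropy(rows, target):
    counts = Counter(r[target] for r in rows)
    return sum(-(c / len(rows)) * log2(c / len(rows)) for c in counts.values())

def gain(rows, attribute, target):
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, target, attributes, domains):
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:                 # all instances share one label
        return labels[0]
    majority = Counter(labels).most_common(1)[0][0]
    if not attributes:                        # no attribute left to test
        return majority
    best = max(attributes, key=lambda a: gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in domains[best]:               # one branch per value of A
        subset = [r for r in rows if r[best] == value]
        branches[value] = (id3(subset, target, remaining, domains)
                           if subset else majority)
    return {best: branches}

tree = id3(DATA, "PlayTennis", ATTRIBUTES, DOMAINS)
print(tree)
```

On this data the sketch returns a tree with Outlook at the root, Humidity under Sunny and Wind under Rain, matching the tree built step by step in the preceding slides.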
Exercise 1
Given the following training data for the concept "Fake liquor":
No.  Stamp  Color   Smell   Taste       Fake liquor
1    No     Clear   None    Spicy       Yes
2    Yes    Clear   None    Spicy       No
3    No     Cloudy  None    Spicy       Yes
4    Yes    Clear   Strong  Astringent  No
5    No     Clear   None    Sour        Yes
Exercise 2
Given the following training data for the concept "Sinusitis":
No.: 1 2 3 4 5 6 7
Breath: Normal, Normal, Foul-smelling, Normal, Foul-smelling, Normal, Foul-smelling
Nasal cavity: Normal, Normal, Normal
Sinusitis: No, Yes, No, Yes, Yes, No, Yes
Frequent headache: No, Yes, Yes, Yes, Yes, No, No
Review
Problem types to review:
Knowledge representation:
Given a sentence or passage, represent it in the required scheme.
Given a scheme, write the corresponding sentence or passage.
Convert a scheme into an equivalent one in another scheme.
Games:
Given a problem (???).
Given an algorithm: memory and time complexity.
Uncertain knowledge:
Given a passage, build the Bayesian network.
Given a Bayesian network, compute probabilities: 3 kinds of inference on Bayesian networks.
Given a passage:
Represent the events and variables.
Use Bayes' rule to diagnose an event.
Compute the certainty factor.
Review (cont.)
Problem types to review:
Planning:
Given a problem.
Machine learning (concept learning and decision trees):
Given a data table.
Review (cont.)
Exam: