ML Nhapmon


Artificial Intelligence (Trí Tuệ Nhân Tạo)

Nguyễn Nhật Quang

quangnn-fit@mail.hut.edu.vn
School of Information and Communication Technology, Hanoi University of Science and Technology
Academic year 2010-2011

Course contents:

- Introduction to Artificial Intelligence
- Agents
- Problem solving: search, constraint satisfaction
- Logic and inference
- Knowledge representation
- Representing uncertain knowledge
- Machine learning
  - Introduction to machine learning
  - Naive Bayes classification
  - Nearest-neighbor learning
- Planning

Introduction to Machine Learning

Definitions of machine learning:

- A process by which a system improves its performance [Simon, 1983]
- A process by which a computer program improves its performance on a task through experience [Mitchell, 1997]
- Programming computers to optimize a performance criterion using example data or past experience [Alpaydin, 2004]

Formulating a machine learning problem [Mitchell, 1997]:

Machine learning = improving performance on a task through experience

- A task T
- With respect to a performance measure P
- Through (using) experience E

Examples of machine learning problems (1)

Filtering Web pages according to a user's interests

- T: Predict (filter) which Web pages a given user would like to read
- P: The percentage of Web pages that are predicted correctly
- E: A set of Web pages the user has marked as interesting, and a set of Web pages the user has marked as not interesting

Examples of machine learning problems (2)

Classifying Web pages by topic

- T: Classify Web pages into a set of predefined topics
- P: The percentage of Web pages that are classified correctly
- E: A set of Web pages, each labeled with a topic

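Examples of machine learning problems (3)

Handwriting recognition

- T: Recognize and classify the words in images of handwriting
- P: The percentage of words that are recognized and classified correctly
- E: A set of handwriting images, each labeled with the identity of the word it shows

[Figure: a handwritten phrase ("we do in the right way"), with the question "Which word?" for each word image]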

Examples of machine learning problems (4)

A self-driving robot

- T: A robot (equipped with observation cameras) drives autonomously on a highway
- P: The average distance the robot can drive autonomously before an error (accident) occurs
- E: A set of examples recorded while observing a human driving on a highway, where each example consists of a sequence of images and the corresponding steering commands

[Figure: camera input mapped to a steering command ("Which steering command?"): go straight, move left, move right, slow down, speed up]

The machine learning process

[Figure: the dataset is split into a training set (used to train the system), a validation set (used to optimize the system's parameters), and a test set (used to test the learned system)]
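The three-way split described on this slide can be sketched as follows. This is a minimal illustration, not the course's procedure: the function name `split_dataset`, the 60/20/20 proportions, and the fixed random seed are all assumptions made for the example.

```python
import random

def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the examples and split them into training, validation, and test sets.

    The fractions and the seed are illustrative assumptions.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]                    # used to train the system
    val = shuffled[n_train:n_train + n_val]       # used to tune parameters
    test = shuffled[n_train + n_val:]             # used only for the final evaluation
    return train, val, test

train, val, test = split_dataset(range(10))
```

Keeping the test set out of both training and parameter tuning is what makes its accuracy an estimate of performance on future examples.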

Supervised vs. unsupervised learning

Supervised learning

- Each training example has 2 parts: a description (representation) of the example, and the class label (or desired output value) of the example
- Classification problem:
  D_train = {(<representation_of_x>, <class_label_of_x>)}
- Prediction/regression problem:
  D_train = {(<representation_of_x>, <output_value_of_x>)}

Unsupervised learning

- Each training example contains only a description (representation) of the example, with no information about its class label or desired output value
- Clustering problem:
  D_train = {(<representation_of_x>)}

The machine learning problem: main components (1)

Selecting the training examples (training/learning examples)

- Is the training feedback contained directly in the training examples, or is it provided indirectly (e.g., from the operating environment)?
- Are the training examples supervised or unsupervised?
- The training examples must be compatible with (representative of) the examples the system will process in the future (future test examples)

Determining the target function (hypothesis, concept) to be learned

- F: X → {0,1}
- F: X → {a set of class labels}
- F: X → R+ (the positive real numbers)

The machine learning problem: main components (2)

Choosing a representation for the target function to be learned

- A polynomial function
- A set of rules
- A decision tree
- An artificial neural network

Choosing a learning algorithm that can learn (approximate) the target function

- Regression-based learning
- Rule induction
- Decision tree learning (ID3 or C4.5)
- Back-propagation learning

Issues in machine learning (1)

Learning algorithms

- Which learning algorithms can learn (approximate) a given target function?
- Under what conditions will a chosen learning algorithm converge (asymptotically) to the target function?
- For a specific problem domain and a specific representation of the examples (objects), which learning algorithm performs best?

Issues in machine learning (2)

Training examples

- How many training examples are enough?
- How does the size of the training set affect the accuracy of the learned target function?
- How do erroneous (noisy) examples and/or examples with missing attribute values affect the accuracy?

Issues in machine learning (3)

The learning process

- What is the best strategy for choosing the order in which the training examples are used?
- How do these selection strategies change the complexity of the learning problem?
- How can problem-specific knowledge (beyond the training examples) contribute to the learning process?

Issues in machine learning (4)

Learning capability and its limits

- Which target function should the system learn?
  - Representing the target function: representational power (e.g., linear vs. non-linear functions) vs. the complexity of the algorithm and the learning process
- What are the (theoretical) limits on the learning capability of machine learning algorithms?
- How well can the system generalize from the training examples?
  - The aim is to avoid over-fitting (high accuracy on the training set, but low accuracy on the test set)
- Can the system automatically change (adapt) its internal representation (structure)?
  - The aim is to improve the system's ability to represent and learn the target function

The over-fitting problem (1)

A learned target function (hypothesis) h is said to over-fit a training set if there exists another target function h' such that:

- h' fits the training set worse than h (achieves lower accuracy on it), but
- h' achieves higher accuracy than h on the entire dataset (including the examples used after training)

Over-fitting typically has the following causes:

- Errors (noise) in the training set (introduced while collecting or building the dataset)
- Too few training examples, so that they do not represent the whole set (distribution) of examples of the learning problem

The over-fitting problem (2)

Let D be the set of all examples, and D_train the set of training examples.

Let $\mathrm{Err}_{D}(h)$ be the error of hypothesis h on D, and $\mathrm{Err}_{D_{train}}(h)$ the error of hypothesis h on D_train.

Hypothesis h over-fits the training set D_train if there exists another hypothesis h' such that:

$$\mathrm{Err}_{D_{train}}(h) < \mathrm{Err}_{D_{train}}(h') \quad \text{and} \quad \mathrm{Err}_{D}(h) > \mathrm{Err}_{D}(h')$$
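The definition above can be checked mechanically on a toy problem. The sketch below is only an illustration under stated assumptions: the data, the noisy label, and the two hypotheses (`memorizer` and `threshold`) are all hypothetical, and the error is measured as the fraction of misclassified examples.

```python
def error_rate(h, examples):
    """Fraction of (x, label) pairs that hypothesis h misclassifies."""
    return sum(h(x) != y for x, y in examples) / len(examples)

def overfits(h, h_alt, d_train, d_all):
    """True if h_alt shows that h over-fits d_train, per the definition above."""
    return (error_rate(h, d_train) < error_rate(h_alt, d_train)
            and error_rate(h, d_all) > error_rate(h_alt, d_all))

# Hypothetical data: the true concept is "x >= 5", but the training
# label of x = 3 is noisy (it should be 0).
d_train = [(1, 0), (3, 1), (6, 1), (8, 1)]
d_all = d_train + [(x, int(x >= 5)) for x in (0, 2, 4, 5, 7, 9)]

TRAIN_LABELS = dict(d_train)

def memorizer(x):
    """Reproduces the training labels exactly and answers 0 everywhere else."""
    return TRAIN_LABELS.get(x, 0)

def threshold(x):
    """A simpler hypothesis that ignores the noisy training example."""
    return int(x >= 5)

result = overfits(memorizer, threshold, d_train, d_all)
```

Here `memorizer` has zero error on the training set but a larger error than `threshold` on the full set, so `result` is `True`.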

The over-fitting problem (3)

Among the learned hypotheses (target functions), which one generalizes best from the training examples?

Note: the goal of machine learning is to achieve high accuracy when predicting future examples, not the training examples.

Which target function f(x) achieves the highest accuracy on future examples?

Occam's razor: prefer the simplest target function that fits (not necessarily perfectly) the training examples

- It generalizes better
- It is easier to explain and interpret
- It has lower computational complexity

The over-fitting problem: example

Continuing to grow a decision tree during learning reduces its accuracy on the test set even though its accuracy on the training set keeps increasing [Mitchell, 1997].

Naive Bayes classification

- A family of supervised, probability-based classification methods
- Based on a probabilistic model (function)
- Classification is based on the probabilities of the possible hypotheses
- One of the machine learning methods most commonly used in practical problems
- Based on Bayes' theorem

Bayes' theorem

$$P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}$$

- P(h): the prior probability that hypothesis (classification) h is true
- P(D): the prior probability that the dataset D is observed
- P(D|h): the probability of observing the dataset D, given that hypothesis h is true
- P(h|D): the probability that hypothesis h is true, given that the dataset D is observed

Bayes' theorem: example (1)

Consider the following dataset [Mitchell, 1997]:

Day   Outlook    Temperature   Humidity   Wind     Play Tennis
D1    Sunny      Hot           High       Weak     No
D2    Sunny      Hot           High       Strong   No
D3    Overcast   Hot           High       Weak     Yes
D4    Rain       Mild          High       Weak     Yes
D5    Rain       Cool          Normal     Weak     Yes
D6    Rain       Cool          Normal     Strong   No
D7    Overcast   Cool          Normal     Strong   Yes
D8    Sunny      Mild          High       Weak     No
D9    Sunny      Cool          Normal     Weak     Yes
D10   Rain       Mild          Normal     Weak     Yes
D11   Sunny      Mild          Normal     Strong   Yes
D12   Overcast   Mild          High       Strong   Yes

Bayes' theorem: example (2)

- Example set D: the days on which Outlook is Sunny and Wind is Strong
- Hypothesis (classification) h: he plays tennis
- Prior probability P(h): the probability that he plays tennis (regardless of the Outlook and Wind attributes)
- Prior probability P(D): the probability of a day on which Outlook is Sunny and Wind is Strong
- P(D|h): the probability of a day on which Outlook is Sunny and Wind is Strong, given that he plays tennis
- P(h|D): the probability that he plays tennis, given that Outlook is Sunny and Wind is Strong

The Naive Bayes classification method is based on this conditional (posterior) probability.

Maximum a posteriori (MAP)

Given a set H of possible hypotheses (classifications), the learning system looks for the most probable hypothesis h (∈ H) given the observed data D.

This hypothesis h is called the maximum a posteriori (MAP) hypothesis:

$$h_{MAP} = \arg\max_{h \in H} P(h \mid D)$$

$$h_{MAP} = \arg\max_{h \in H} \frac{P(D \mid h)\,P(h)}{P(D)} \quad \text{(by Bayes' theorem)}$$

$$h_{MAP} = \arg\max_{h \in H} P(D \mid h)\,P(h) \quad \text{(P(D) is the same for all hypotheses h)}$$

MAP: example

The set H contains 2 (possible) hypotheses:

- h1: he plays tennis
- h2: he does not play tennis

Compute the 2 conditional probabilities P(h1|D) and P(h2|D).

The most probable hypothesis is h_MAP = h1 if P(h1|D) ≥ P(h2|D); otherwise h_MAP = h2.

Because P(D) = P(D,h1) + P(D,h2) is the same for both hypotheses h1 and h2, the quantity P(D) can be ignored.

It is therefore enough to compute the 2 expressions P(D|h1).P(h1) and P(D|h2).P(h2) and decide accordingly:

- If P(D|h1).P(h1) ≥ P(D|h2).P(h2), conclude that he plays tennis
- Otherwise, conclude that he does not play tennis

Maximum likelihood estimation (MLE)

The MAP method: given a set H of possible hypotheses, find a hypothesis that maximizes P(D|h).P(h).

The assumption in maximum likelihood estimation (MLE): all hypotheses have the same prior probability:

$$P(h_i) = P(h_j), \quad \forall h_i, h_j \in H$$

The MLE method finds the hypothesis that maximizes P(D|h), where P(D|h) is called the likelihood of the data D given h.

The maximum likelihood hypothesis:

$$h_{MLE} = \arg\max_{h \in H} P(D \mid h)$$

MLE: example

The set H contains 2 possible hypotheses:

- h1: he plays tennis
- h2: he does not play tennis

D: the set of data (days) on which Outlook is Sunny and Wind is Strong

Compute the 2 likelihood values of the data D for the 2 hypotheses, P(D|h1) and P(D|h2):

- P(Outlook=Sunny, Wind=Strong | h1) = 1/8
- P(Outlook=Sunny, Wind=Strong | h2) = 1/4

The MLE hypothesis is h_MLE = h1 if P(D|h1) ≥ P(D|h2); otherwise h_MLE = h2.

Because P(Outlook=Sunny, Wind=Strong | h1) < P(Outlook=Sunny, Wind=Strong | h2), the system concludes: he will not play tennis.
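The likelihoods in the MLE example, and the MAP scores from the earlier MAP example, can be recomputed from the 12-day PlayTennis table given earlier in these notes. The sketch below is only an illustration: it copies the Outlook, Wind, and Play Tennis columns of that table, and the helper names `prior` and `likelihood` are assumptions of this example.

```python
from fractions import Fraction

# (Outlook, Wind, PlayTennis) for days D1..D12, copied from the PlayTennis table.
days = [
    ("Sunny", "Weak", "No"), ("Sunny", "Strong", "No"),
    ("Overcast", "Weak", "Yes"), ("Rain", "Weak", "Yes"),
    ("Rain", "Weak", "Yes"), ("Rain", "Strong", "No"),
    ("Overcast", "Strong", "Yes"), ("Sunny", "Weak", "No"),
    ("Sunny", "Weak", "Yes"), ("Rain", "Weak", "Yes"),
    ("Sunny", "Strong", "Yes"), ("Overcast", "Strong", "Yes"),
]

def prior(label):
    """P(h): fraction of all days with PlayTennis = label."""
    return Fraction(sum(play == label for _, _, play in days), len(days))

def likelihood(label):
    """P(Outlook=Sunny, Wind=Strong | PlayTennis=label)."""
    in_class = [(o, w) for o, w, play in days if play == label]
    return Fraction(in_class.count(("Sunny", "Strong")), len(in_class))

# MLE compares the likelihoods only; MAP also multiplies by the priors.
h_mle = max(["Yes", "No"], key=likelihood)
map_scores = {h: likelihood(h) * prior(h) for h in ("Yes", "No")}
```

On this table the likelihoods are 1/8 for "Yes" and 1/4 for "No", so MLE picks "No". The two MAP scores, however, tie at 1/12, which shows how taking the priors into account can change the comparison that MLE makes.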

Naive Bayes classification (1)

Representing the classification problem

- A training set D_train, in which each training example x is represented as an n-dimensional vector: (x1, x2, ..., xn)
- A predefined set of class labels: C = {c1, c2, ..., cm}
- Given a (new) example z, which class should z be assigned to?

Goal: determine the most probable (most suitable) class for z

$$c_{MAP} = \arg\max_{c_i \in C} P(c_i \mid z)$$

$$c_{MAP} = \arg\max_{c_i \in C} P(c_i \mid z_1, z_2, \ldots, z_n)$$

$$c_{MAP} = \arg\max_{c_i \in C} \frac{P(z_1, z_2, \ldots, z_n \mid c_i)\,P(c_i)}{P(z_1, z_2, \ldots, z_n)} \quad \text{(by Bayes' theorem)}$$

Naive Bayes classification (2)

To find the most probable class for z:

$$c_{MAP} = \arg\max_{c_i \in C} P(z_1, z_2, \ldots, z_n \mid c_i)\,P(c_i) \quad \text{(P(z_1, z_2, ..., z_n) is the same for all classes)}$$

The assumption in Naive Bayes classification: the attributes are conditionally independent given the class

$$P(z_1, z_2, \ldots, z_n \mid c_i) = \prod_{j=1}^{n} P(z_j \mid c_i)$$

Naive Bayes classification finds the most probable class for z:

$$c_{NB} = \arg\max_{c_i \in C} P(c_i) \prod_{j=1}^{n} P(z_j \mid c_i)$$

Naive Bayes classification: the algorithm

Training phase, using a training set

- For each possible class (each class label) c_i ∈ C:
  - Compute the prior probability P(c_i)
  - For each attribute value x_j, compute the probability of that attribute value given the class c_i: P(x_j | c_i)

Classification phase, for a new example z

- For each class c_i ∈ C, compute the value of the expression (where x_j is the value of attribute j in z):

$$P(c_i) \prod_{j=1}^{n} P(x_j \mid c_i)$$

- Assign z to the most probable class c*:

$$c^{*} = \arg\max_{c_i \in C} P(c_i) \prod_{j=1}^{n} P(x_j \mid c_i)$$

Naive Bayes classification: example (1)

Will a young student with a medium income and a fair credit rating buy a computer?

Rec. ID   Age      Income   Student   Credit_Rating   Buy_Computer
1         Young    High     No        Fair            No
2         Young    High     No        Excellent       No
3         Medium   High     No        Fair            Yes
4         Old      Medium   No        Fair            Yes
5         Old      Low      Yes       Fair            Yes
6         Old      Low      Yes       Excellent       No
7         Medium   Low      Yes       Excellent       Yes
8         Young    Medium   No        Fair            No
9         Young    Low      Yes       Fair            Yes
10        Old      Medium   Yes       Fair            Yes
11        Young    Medium   Yes       Excellent       Yes
12        Medium   Medium   No        Excellent       Yes
13        Medium   High     Yes       Fair            Yes
14        Old      Medium   No        Excellent       No

Source: http://www.cs.sunysb.edu/~cse634/lecture_notes/07classification.pdf

Naive Bayes classification: example (2)

Representing the classification problem

- z = (Age=Young, Income=Medium, Student=Yes, Credit_Rating=Fair)
- There are 2 possible classes: c1 (buys a computer) and c2 (does not buy a computer)

Compute the prior probability of each class

- P(c1) = 9/14
- P(c2) = 5/14

Compute the probability of each attribute value given each class

- P(Age=Young|c1) = 2/9; P(Age=Young|c2) = 3/5
- P(Income=Medium|c1) = 4/9; P(Income=Medium|c2) = 2/5
- P(Student=Yes|c1) = 6/9; P(Student=Yes|c2) = 1/5
- P(Credit_Rating=Fair|c1) = 6/9; P(Credit_Rating=Fair|c2) = 2/5

Naive Bayes classification: example (3)

Compute the likelihood of example z for each class

- For class c1:
  P(z|c1) = P(Age=Young|c1).P(Income=Medium|c1).P(Student=Yes|c1).P(Credit_Rating=Fair|c1) = (2/9).(4/9).(6/9).(6/9) ≈ 0.044
- For class c2:
  P(z|c2) = P(Age=Young|c2).P(Income=Medium|c2).P(Student=Yes|c2).P(Credit_Rating=Fair|c2) = (3/5).(2/5).(1/5).(2/5) ≈ 0.019

Determine the most probable class

- For class c1: P(c1).P(z|c1) = (9/14).(0.044) ≈ 0.028
- For class c2: P(c2).P(z|c2) = (5/14).(0.019) ≈ 0.007

Conclusion: he (z) will buy a computer.
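The worked example can be reproduced with a few lines of code. The sketch below is a minimal illustration, not a reference implementation: it copies the 14 records of the Buy_Computer table, estimates every probability by a plain relative frequency (no smoothing), and the function names `train` and `score` are assumptions of this example.

```python
from collections import Counter, defaultdict

# (Age, Income, Student, Credit_Rating, Buy_Computer) for records 1..14.
records = [
    ("Young", "High", "No", "Fair", "No"),
    ("Young", "High", "No", "Excellent", "No"),
    ("Medium", "High", "No", "Fair", "Yes"),
    ("Old", "Medium", "No", "Fair", "Yes"),
    ("Old", "Low", "Yes", "Fair", "Yes"),
    ("Old", "Low", "Yes", "Excellent", "No"),
    ("Medium", "Low", "Yes", "Excellent", "Yes"),
    ("Young", "Medium", "No", "Fair", "No"),
    ("Young", "Low", "Yes", "Fair", "Yes"),
    ("Old", "Medium", "Yes", "Fair", "Yes"),
    ("Young", "Medium", "Yes", "Excellent", "Yes"),
    ("Medium", "Medium", "No", "Excellent", "Yes"),
    ("Medium", "High", "Yes", "Fair", "Yes"),
    ("Old", "Medium", "No", "Excellent", "No"),
]

def train(records):
    """Training phase: count classes and (attribute index, value) pairs per class."""
    class_counts = Counter(r[-1] for r in records)
    value_counts = defaultdict(Counter)
    for *attrs, label in records:
        for j, value in enumerate(attrs):
            value_counts[label][(j, value)] += 1
    return class_counts, value_counts

def score(z, label, class_counts, value_counts):
    """P(c) * prod_j P(z_j | c), with probabilities as relative frequencies."""
    n_c = class_counts[label]
    result = n_c / sum(class_counts.values())
    for j, value in enumerate(z):
        result *= value_counts[label][(j, value)] / n_c
    return result

class_counts, value_counts = train(records)
z = ("Young", "Medium", "Yes", "Fair")
scores = {c: score(z, c, class_counts, value_counts) for c in class_counts}
prediction = max(scores, key=scores.get)
```

The two scores come out at roughly 0.028 for "Yes" and 0.007 for "No", matching the hand calculation, so `prediction` is "Yes".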

Naive Bayes classification: issues (1)

If no training example of class c_i has the attribute value x_j, then P(x_j|c_i) = 0, and therefore:

$$P(c_i) \prod_{j=1}^{n} P(x_j \mid c_i) = 0$$

Solution: use a Bayesian estimate of P(x_j|c_i)

$$P(x_j \mid c_i) = \frac{n(c_i, x_j) + m\,p}{n(c_i) + m}$$

- n(c_i): the number of training examples of class c_i
- n(c_i, x_j): the number of training examples of class c_i that have the attribute value x_j
- p: a prior estimate of the probability P(x_j|c_i)
  - The uniform estimate: p = 1/k, for an attribute f_j with k possible values
- m: a weight that supplements the n(c_i) actually observed examples with m additional virtual examples distributed according to the estimate p
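A short sketch of this estimate follows. The zero count is taken from the Buy_Computer table (none of the 5 records of class c2 has Age=Medium, and Age has 3 possible values); the function name `m_estimate` and the choice m = 3 are assumptions made for the illustration.

```python
def m_estimate(n_c_x, n_c, p, m):
    """Smoothed estimate of P(x_j | c_i): (n(c_i, x_j) + m*p) / (n(c_i) + m)."""
    return (n_c_x + m * p) / (n_c + m)

# P(Age=Medium | c2) in the Buy_Computer table: 0 of the 5 records of c2.
unsmoothed = 0 / 5                                    # would zero out the whole product
smoothed = m_estimate(n_c_x=0, n_c=5, p=1 / 3, m=3)   # uniform prior over 3 Age values
```

With these numbers the smoothed estimate is (0 + 1) / (5 + 3) = 0.125, so a single unseen attribute value no longer forces the class score to zero.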

Naive Bayes classification: issues (2)

The limited numerical precision of computer arithmetic

- P(x_j|c_i) < 1 for every attribute value x_j and class c_i
- So when the number of attribute values is very large:

$$\lim_{n \to \infty} \prod_{j=1}^{n} P(x_j \mid c_i) = 0$$

Solution: use the logarithm of the probabilities

$$c_{NB} = \arg\max_{c_i \in C} \log\left[ P(c_i) \prod_{j=1}^{n} P(x_j \mid c_i) \right]$$

$$c_{NB} = \arg\max_{c_i \in C} \left[ \log P(c_i) + \sum_{j=1}^{n} \log P(x_j \mid c_i) \right]$$
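The underflow described above is easy to reproduce. In this sketch the 400 conditional probabilities of 0.1 are hypothetical values chosen only to trigger the effect in double precision, and `log_score` is an illustrative helper name.

```python
import math

# Hypothetical P(x_j | c_i) values, all below 1.
probs = [0.1] * 400

product = math.prod(probs)                     # underflows to exactly 0.0
log_sum = sum(math.log(p) for p in probs)      # stays a finite negative number

def log_score(prior, cond_probs):
    """log P(c_i) + sum_j log P(x_j | c_i), the log-space class score."""
    return math.log(prior) + sum(math.log(p) for p in cond_probs)

# Classes can still be ranked even though both plain products would be 0.0.
better = log_score(0.5, probs) > log_score(0.5, [0.05] * 400)
```

Because the logarithm is monotonically increasing, the class that maximizes the log score is the same class that would maximize the original product.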

Text classification with Naive Bayes (1)

Representing the text classification problem

- A training set D_train, in which each training example is a representation of a document together with a class label: D = {(d_k, c_i)}
- A predefined set of class labels: C = {c_i}

Training phase

- From the documents in D_train, extract the set of keywords (terms): T = {t_j}
- Let D_c_i (⊆ D_train) be the set of documents in D_train with class label c_i
- For each class c_i:
  - Compute the prior probability of class c_i:

$$P(c_i) = \frac{|D\_c_i|}{|D|}$$

  - For each keyword t_j, compute the probability that t_j occurs given class c_i:

$$P(t_j \mid c_i) = \frac{\left( \sum_{d_k \in D\_c_i} n(d_k, t_j) \right) + 1}{\left( \sum_{d_k \in D\_c_i} \sum_{t_m \in T} n(d_k, t_m) \right) + |T|}$$

(n(d_k, t_j): the number of occurrences of keyword t_j in document d_k)

Text classification with Naive Bayes (2)

Classification phase: classifying a new document d

- From document d, extract the set T_d of keywords t_j that are defined in the set T (T_d ⊆ T)
- Assumption: the probability that keyword t_j occurs given class c_i is independent of the position of the keyword in the document:

$$P(t_j \text{ at position } k \mid c_i) = P(t_j \text{ at position } m \mid c_i), \quad \forall k, m$$

- For each class c_i, compute the likelihood of document d for c_i:

$$P(c_i) \prod_{t_j \in T\_d} P(t_j \mid c_i)$$

- Assign document d to the class c*:

$$c^{*} = \arg\max_{c_i \in C} P(c_i) \prod_{t_j \in T\_d} P(t_j \mid c_i)$$
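The training and classification phases for text can be sketched as below. Everything concrete here is hypothetical: the three-document corpus, the class names, and the function names `train` and `classify`. The sketch uses the add-one estimate of P(t_j|c_i) given above and sums logarithms instead of multiplying probabilities, as suggested for the precision issue.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (tokenized document, class label).
d_train = [
    (["goal", "match", "team"], "sports"),
    (["team", "win", "match"], "sports"),
    (["election", "vote", "party"], "politics"),
]
vocab = sorted({t for doc, _ in d_train for t in doc})   # the keyword set T

def train(d_train, vocab):
    """Estimate P(c_i) and the add-one smoothed P(t_j | c_i)."""
    priors, cond = {}, {}
    for c in {label for _, label in d_train}:
        docs = [doc for doc, label in d_train if label == c]
        priors[c] = len(docs) / len(d_train)
        counts = Counter(t for doc in docs for t in doc)   # sum over d_k of n(d_k, t_j)
        total = sum(counts.values())
        cond[c] = {t: (counts[t] + 1) / (total + len(vocab)) for t in vocab}
    return priors, cond

def classify(doc, priors, cond, vocab):
    """Return the class maximizing log P(c_i) + sum of log P(t_j | c_i) over T_d."""
    t_d = set(doc) & set(vocab)        # keywords of d that are defined in T
    def log_score(c):
        return math.log(priors[c]) + sum(math.log(cond[c][t]) for t in t_d)
    return max(priors, key=log_score)

priors, cond = train(d_train, vocab)
label = classify(["team", "goal", "referee"], priors, cond, vocab)
```

The word "referee" is not in T, so it is ignored; the remaining keywords "team" and "goal" make "sports" the higher-scoring class.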

Nearest-neighbor learning

Other names for nearest-neighbor learning:

- Instance-based learning
- Lazy learning
- Memory-based learning

The idea of nearest-neighbor learning

- Given a set of training examples:
  - (Simply) store the training examples
  - There is no need to build an explicit, general model (description) of the target function to be learned
- Given an example to be classified or predicted:
  - Examine the relationship between that example and the training examples in order to assign a value of the target function (a class label, or a real value)

Nearest-neighbor learning: problem representation

Representing the input of the problem

- Each example x is represented as an n-dimensional vector in the vector space X ⊆ R^n
- x = (x1, x2, ..., xn), where each x_i (∈ R) is a real number

We consider 2 kinds of learning problems

- Classification
  - Learn a discrete-valued target function
  - The output of the system is one of a predefined set of discrete values (one of the class labels)
- Prediction/regression
  - Learn a continuous-valued target function
  - The output of the system is a real value

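Nearest-neighbor classification: example

- Using 1 nearest neighbor: assign z to class c2
- Using 3 nearest neighbors: assign z to class c1
- Using 5 nearest neighbors: assign z to class c1

[Figure: training examples of class c1 and class c2 around the example z to be classified]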

The k-NN classification algorithm

Each training example x is represented by 2 components:

- The description of the example: x = (x1, x2, ..., xn), where x_i ∈ R
- The class label: c (∈ C, where C is the predefined set of class labels)

Training phase

- Simply store the training examples in the training set: D = {x}

Classification phase: classifying a (new) example z

- For each training example x ∈ D, compute the distance between x and z
- Determine the set NB(z) of the nearest neighbors of z: the k training examples in D that are closest to z according to a distance function d
- Assign z to the majority class among the classes of the training examples in NB(z)
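The classification phase can be sketched in a few lines. This is an illustration under assumptions: the 2-D points are hypothetical, the Euclidean distance (`math.dist`) is just one possible choice of d, and `knn_classify` is an illustrative name.

```python
import math
from collections import Counter

def knn_classify(d_train, z, k):
    """Return the majority class among the k training examples nearest to z.

    d_train is a list of (vector, class_label) pairs.
    """
    neighbors = sorted(d_train, key=lambda ex: math.dist(ex[0], z))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Hypothetical 2-D training examples.
d_train = [((1.0, 1.0), "c2"), ((2.0, 2.0), "c1"), ((2.0, 3.0), "c1"),
           ((3.0, 2.0), "c1"), ((6.0, 6.0), "c2")]
z = (1.2, 1.2)

labels_by_k = {k: knn_classify(d_train, z, k) for k in (1, 3, 5)}
```

With these points the single nearest neighbor belongs to c2, while the majority of the 3 and of the 5 nearest neighbors belongs to c1, the same pattern as in the figure example.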

The k-NN prediction algorithm

Each training example x is represented by 2 components:

- The description of the example: x = (x1, x2, ..., xn), where x_i ∈ R
- The desired output value: y_x ∈ R (a real number)

Training phase

- Simply store the training examples in the training set D

Prediction phase: predicting the output value for an example z

- For each training example x ∈ D, compute the distance between x and z
- Determine the set NB(z) of the nearest neighbors of z: the k training examples in D that are closest to z according to a distance function d
- Predict the output value for z:

$$y_z = \frac{1}{k} \sum_{x \in NB(z)} y_x$$
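For regression the only change is the last step: the outputs of the k nearest neighbors are averaged. The data and the name `knn_predict` in this sketch are hypothetical, and the Euclidean distance is again an assumed choice of d.

```python
import math

def knn_predict(d_train, z, k):
    """Average the output values y_x of the k training examples nearest to z."""
    neighbors = sorted(d_train, key=lambda ex: math.dist(ex[0], z))[:k]
    return sum(y for _, y in neighbors) / k

# Hypothetical 1-D training examples (vector, output value).
examples = [((1.0,), 2.0), ((2.0,), 4.0), ((3.0,), 6.0), ((10.0,), 20.0)]
y_z = knn_predict(examples, (2.2,), k=3)
```

The three nearest neighbors of 2.2 are the examples at 2.0, 3.0, and 1.0, so the prediction is (4 + 6 + 2) / 3 = 4.0.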

One neighbor or several?

Classifying (or predicting) from only the single nearest neighbor (the training example closest to the example being classified or predicted) is often inaccurate:

- If that training example is an atypical example (an outlier), very different from the other examples
- If that training example has a wrong class label, because of an error made while collecting (building) the dataset

Usually the k (> 1) nearest training examples are considered, and the example is assigned to the majority class among those k nearest examples.

k is usually chosen to be an odd number, to avoid ties in classification.

- For example: k = 3, 5, 7, ...

Distance functions (1)

The distance function d

- Plays a very important role in nearest-neighbor learning
- Is usually fixed in advance and does not change during learning and classification/prediction

Choosing the distance function d

- Geometric distance functions: for problems whose input attributes are real-valued (x_i ∈ R)
- The Hamming distance function: for problems whose input attributes are binary (x_i ∈ {0,1})
- The cosine similarity function: for text classification problems (x_i is the TF/IDF weight of the i-th keyword)

Distance functions (2)

Geometric distance functions

- Manhattan:

$$d(x, z) = \sum_{i=1}^{n} |x_i - z_i|$$

- Euclid:

$$d(x, z) = \sqrt{\sum_{i=1}^{n} (x_i - z_i)^2}$$

- Minkowski (p-norm):

$$d(x, z) = \left( \sum_{i=1}^{n} |x_i - z_i|^p \right)^{1/p}$$

- Chebyshev:

$$d(x, z) = \lim_{p \to \infty} \left( \sum_{i=1}^{n} |x_i - z_i|^p \right)^{1/p} = \max_{i} |x_i - z_i|$$
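The four geometric distances are closely related, as a small sketch makes visible. The function names and the two sample points are assumptions of this example; Manhattan and Euclid are written as the p = 1 and p = 2 cases of Minkowski.

```python
def minkowski(x, z, p):
    """p-norm distance: (sum_i |x_i - z_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(x, z)) ** (1 / p)

def manhattan(x, z):
    return minkowski(x, z, 1)

def euclid(x, z):
    return minkowski(x, z, 2)

def chebyshev(x, z):
    """Limit of the Minkowski distance as p grows: max_i |x_i - z_i|."""
    return max(abs(a - b) for a, b in zip(x, z))

x, z = (0.0, 0.0), (3.0, 4.0)
distances = (manhattan(x, z), euclid(x, z), chebyshev(x, z))
large_p = minkowski(x, z, 50)    # already very close to the Chebyshev distance
```

For these points the Manhattan, Euclid, and Chebyshev distances are 7, 5, and 4, and the Minkowski distance with p = 50 is already within a tiny margin of 4.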

Distance functions (3)

The Hamming distance function

- For input attributes that are binary
- Example: x = (0,1,0,1,1)

$$d(x, z) = \sum_{i=1}^{n} \mathrm{Difference}(x_i, z_i)$$

$$\mathrm{Difference}(a, b) = \begin{cases} 1, & \text{if } a \neq b \\ 0, & \text{if } a = b \end{cases}$$

The cosine similarity function

- For inputs that are vectors of (TF/IDF) weights of keywords

$$d(x, z) = \frac{x \cdot z}{\|x\|\,\|z\|} = \frac{\sum_{i=1}^{n} x_i z_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\,\sqrt{\sum_{i=1}^{n} z_i^2}}$$
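Both functions are short to write down. In this sketch x is the binary vector from the slide, while the second vector z and the function names are hypothetical. Note that the cosine function is a similarity, so larger values mean closer vectors, the opposite of a distance.

```python
import math

def hamming(x, z):
    """Number of positions at which the two binary vectors differ."""
    return sum(a != b for a, b in zip(x, z))

def cosine_similarity(x, z):
    """x.z / (||x|| ||z||); 1.0 means the vectors point in the same direction."""
    dot = sum(a * b for a, b in zip(x, z))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_z = math.sqrt(sum(b * b for b in z))
    return dot / (norm_x * norm_z)

x = (0, 1, 0, 1, 1)     # the example vector from the slide
z = (1, 1, 0, 0, 1)     # hypothetical second vector

d_hamming = hamming(x, z)
similarity = cosine_similarity(x, z)
```

The two vectors differ in the first and fourth positions, so the Hamming distance is 2; their cosine similarity is 2/3.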

Normalizing attribute ranges

The Euclid distance function:

$$d(x, z) = \sqrt{\sum_{i=1}^{n} (x_i - z_i)^2}$$

Suppose each example is represented by 3 attributes: Age, Income (per month), and Height (in meters)

- x = (Age=20, Income=12000, Height=1.68)
- z = (Age=40, Income=1300, Height=1.75)

The distance between x and z

- d(x,z) = [(20-40)^2 + (12000-1300)^2 + (1.68-1.75)^2]^(1/2)
- This distance is determined almost entirely by the difference between the 2 examples on the Income attribute
- Reason: the Income attribute has a much larger range of values than the other attributes

The attribute ranges therefore need to be normalized (brought to the same range of values)

- The range [0,1] is commonly used
- For each attribute i: x_i = x_i / max_value_of_attribute_i
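The effect of normalization on the example above can be checked numerically. The vectors x and z come from the slide; the per-attribute maximum values (100, 20000, 2.0) are hypothetical, since the source does not state them, and `normalize` is an illustrative name.

```python
import math

def normalize(v, max_values):
    """Divide each attribute by the maximum value of that attribute."""
    return tuple(a / m for a, m in zip(v, max_values))

x = (20, 12000, 1.68)    # (Age, Income, Height) from the example
z = (40, 1300, 1.75)

raw = math.dist(x, z)    # dominated by the Income difference of 10700

max_values = (100, 20000, 2.0)    # hypothetical maxima for Age, Income, Height
scaled = math.dist(normalize(x, max_values), normalize(z, max_values))
```

The raw distance is about 10700.02, barely different from the Income difference alone. After scaling, the Age difference (0.2) is no longer negligible next to the Income difference (0.535).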

Attribute weights

The Euclid distance function:

$$d(x, z) = \sqrt{\sum_{i=1}^{n} (x_i - z_i)^2}$$

- All attributes have the same influence on the distance value

Different attributes may (and should) have different degrees of influence on the distance value.

Attribute weights therefore need to be built into the distance function, where w_i is the weight of attribute i:

$$d(x, z) = \sqrt{\sum_{i=1}^{n} w_i (x_i - z_i)^2}$$

How can the attribute weights be determined?

- From problem-specific knowledge (e.g., specified by experts in the domain of the problem)
- By an optimization process over the weight values (e.g., using a training set to learn an optimal set of weights)

Distances of the neighbors (1)

Consider the set NB(z) of the k training examples nearest to the example z being classified or predicted

- Each of these examples (nearest neighbors) is at a different distance from z
- Should these neighbors all have the same influence on the classification/prediction for z? No!

[Figure: the test instance z and its nearest neighbors]

The influence (contribution) of each nearest neighbor should be set according to its distance from z

- Closer neighbors get a higher influence

Distances of the neighbors (2)

Let v be a function that determines a weight from a distance

- Given the distance d(x,z) between x and z
- v(x,z) is inversely related to d(x,z)

For classification:

$$c(z) = \arg\max_{c_j \in C} \sum_{x \in NB(z)} v(x, z) \cdot \mathrm{Identical}(c_j, c(x))$$

$$\mathrm{Identical}(a, b) = \begin{cases} 1, & \text{if } a = b \\ 0, & \text{if } a \neq b \end{cases}$$

For prediction (regression):

$$f(z) = \frac{\sum_{x \in NB(z)} v(x, z) \cdot f(x)}{\sum_{x \in NB(z)} v(x, z)}$$

Choosing a distance-based weighting function (with a constant $\alpha$):

$$v(x, z) = \frac{1}{\alpha + d(x, z)} \qquad v(x, z) = \frac{1}{\alpha + [d(x, z)]^2} \qquad v(x, z) = e^{-d(x, z)^2}$$
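Distance weighting can change the outcome of a plain majority vote, as the sketch below shows. It is an illustration under assumptions: the weighting function v = 1 / (alpha + d^2) with alpha = 1.0 is only one possible choice, the 1-D data are hypothetical, and all function names are illustrative.

```python
import math
from collections import defaultdict

def weight(d, alpha=1.0):
    """Assumed weighting v = 1 / (alpha + d^2): closer neighbors weigh more."""
    return 1.0 / (alpha + d * d)

def nearest(d_train, z, k):
    return sorted(d_train, key=lambda ex: math.dist(ex[0], z))[:k]

def weighted_classify(d_train, z, k):
    """Class with the largest total weight among the k nearest neighbors."""
    votes = defaultdict(float)
    for x, label in nearest(d_train, z, k):
        votes[label] += weight(math.dist(x, z))
    return max(votes, key=votes.get)

def weighted_predict(d_train, z, k):
    """Weighted average of the neighbors' output values."""
    nb = nearest(d_train, z, k)
    weights = [weight(math.dist(x, z)) for x, _ in nb]
    return sum(w * y for w, (_, y) in zip(weights, nb)) / sum(weights)

# Hypothetical 1-D data: one close neighbor of c1, two farther neighbors of c2.
labeled = [((0.0,), "c1"), ((3.0,), "c2"), ((4.0,), "c2")]
valued = [((0.0,), 0.0), ((3.0,), 30.0), ((4.0,), 40.0)]
z = (1.0,)

c_z = weighted_classify(labeled, z, k=3)
f_z = weighted_predict(valued, z, k=3)
```

An unweighted 3-NN vote would choose c2 (two votes against one). With the weights 0.5, 0.2, and 0.1, the single close neighbor wins and `c_z` is "c1"; the weighted prediction `f_z` is (0.2 * 30 + 0.1 * 40) / 0.8 = 12.5.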

Nearest-neighbor learning: when to use it?

- The examples are represented as vectors in a real-valued space (R^n)
- The number of input attributes (the dimensionality of the space) is not large
- The training set is fairly large (many training examples)

Advantages

- No training step is needed (the system simply stores the training examples)
- Works well for problems with a fairly large number of classes
  - There is no need to learn n separate classifiers for n classes
- k-NN learning (k >> 1) can cope with noisy data
  - Classification/prediction is based on the k nearest neighbors

Disadvantages

- A suitable distance function must be chosen
- Computational cost (in time and memory) at classification/prediction time
- May classify/predict incorrectly because of irrelevant attributes

References

- E. Alpaydin. Introduction to Machine Learning. The MIT Press, 2004.
- T. M. Mitchell. Machine Learning. McGraw-Hill, 1997.
- H. A. Simon. Why Should Machines Learn? In R. S. Michalski, J. Carbonell, and T. M. Mitchell (Eds.): Machine Learning: An Artificial Intelligence Approach, chapter 2, pp. 25-38. Morgan Kaufmann, 1983.