Download as pdf or txt
Download as pdf or txt
You are on page 1of 54

Khoa Cng Ngh Thng Tin Trng i Hc Cn Th

My hc vct h tr Support vector machines


Thanh Ngh dtnghi@cit.ctu.edu.vn

Cn Th 02-12-2008

Ni dung
Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM Kt lun v hng pht trin Demo chng trnh (20 30 pht)

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM Kt lun v hng pht trin

Support vector machines?

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

lp cc gii thut hc

tm siu phng trong khng gian N-dim phn loi d liu SVM + hm kernel = m hnh gii thut SVM = li gii ca bi ton quy hoch ton phng ti u ton cc c nhiu gii thut gii SVM c th m rng gii cc vn ca hi quy, gom nhm, etc. c ng dng thnh cng : nhn dng, phn tch d liu, phn loi gien, k t, etc.
4

K thut DM thnh cng trong ng dng thc (2004)

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM Kt lun v hng pht trin

Support vector machines?

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

Support vector machines?


-1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

+1

Support vector machines?


-1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

+1
p1 p2 p3

Support vector machines?


xT.w b = -1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

-1

+1

10

Support vector machines?


xT.w b = -1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

-1

xT.w b = +1

+1

11

Support vector machines?


xT.w b = -1 xT.w b = 0 xT.w b = +1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

-1

+1

12

Support vector machines?


xT.w b = -1 xT.w b = 0 xT.w b = +1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

-1

+1

l = 2/||w||

13

Support vector machines?


xT.w b = -1 xT.w b = 0 xT.w b = +1

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

-1
zi zj

+1

l = 2/||w||

14

Phn loi vi SVM

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

cc i ha l + cc tiu ha li

(1)

gii (1): w, b phn loi x: sign(x.w b)


15

Phn loi vi SVM

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

bi ton i ngu ca (1):

(2)

gii (2): i

nhng xi tng ng vi i > 0 l vct h tr (SV) tnh b da trn tp SV

phn loi x:
16

SVM phi tuyn

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

khng gian input: phi tuyn

khng gian trung gian: tuyn tnh

chuyn khng gian input => khng gian trung gian:

2 chiu [x1, x2] khng gian input => 5 chiu [x1, x2, x1x2, x12, x22] khng gian trung gian (feature space)
17

SVM phi tuyn

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

khng gian input: phi tuyn

khng gian trung gian: tuyn tnh

chuyn khng gian input => khng gian trung gian:

php chuyn i kh xc nh, s chiu trong khng gian trung gian rt ln. hm kernel (Hilbert Schmidt space): lm vic trong khng gian input nhng ngm nh trong khng gian trung gian

18

SVM phi tuyn s dng kernel

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

SVM + kernel = m hnh


thay dot product bi hm kernel, K(u, v) tuyn tnh: K(u, v) = u.v a thc bc d: K(u, v) = (u.v + c)d Radial Basis Function: K(u, v) = exp(-2||u v||2/2)

v d, hm kernel a thc bc 5

d liu c s chiu l 250 trong khng gian input tng ng vi 1010 chiu trong khng gian trung gian

19

SVM cho vn nhiu lp (> 2)

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

xy dng trc tip m hnh cho nhiu lp

xy dng bi ton ti u cho k lp mi m hnh phn tch 1 lp t cc lp khc k lp : k m hnh mi m hnh phn tch 2 lp k lp : k*(k-1)/2 m hnh phn tch 2 nhm, mi nhm c th bao gm nhiu lp xc nh cch phn tch nhm sao cho c li nht

1-tt c (1 vs all)

1-1 (1 vs 1)

phng php khc


20

SVM cho vn nhiu lp (> 2)

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

21

Gii thut SVM


(Bennett & Campell, 2000) (Cristianini & Shawe-Taylor, 2000)

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

SVM = gii ca bi ton quy hoch ton phng gii quadratic program

s dng trnh ti u ha sn c: phng php Quasi-Newton, gradient phn r vn (decomposition hoc chunking) phn r vn thnh nhng vn con kch thc 2 nh SMO (Platt, 1998)

din gii SVM bng phng php khc


lp cc gii thut ca Mangasarian & sinh vin ca ng hoc a v gii bi ton quy hoch tuyn tnh hay h phng trnh tuyn tnh
22

Gii thut SVM cho khi d liu khng l

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

active SVM

chn tp d liu con t tp d liu ban u hc trn tp d liu con


w.x b = 0 w.x b = 0

-1 +1

Active subset selection

-1 +1

23

Gii thut SVM cho khi d liu khng l

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

hc song song SVM


chia thnh nhng khi d liu nh phn trn cc my tnh con hc c lp v song song tp hp nhng m hnh con sinh ra m hnh tng th

Program

Program

Program

Data

Data single program multiple data

Data

24

Gii thut SVM cho khi d liu khng l

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

hc tin ha SVM (incremental learning)


chia thnh nhng khi d liu nh load ln lt tng khi cp nht m hnh

25

Gii thut SVM cho khi d liu khng l

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

hc trn d liu symbolic


chuyn d liu v dng trnh by mc cao hn c th l clusters, interval, ... hc trn d liu tru tng ny thay i cch tnh hm kernel

26

Gii thut SVM cho khi d liu khng l

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

Boosting SVM

ly mu con t tp d liu ban u hc trn tp d liu con ny phn loi ton b d liu nhng d liu b phn loi sai s c cp nht trng lng tng ly mu da trn trng lng gn cho d liu tip tc hc, etc. qu trnh lp k ln cc m hnh con s bnh chn khi phn loi d liu mi n

27

Hi quy vi SVM
w.x b = 0

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

w.x b = -

w.x b =

tm siu phng (Input hoc Hilbert space)

i qua tt c cc im vi lch chun l epsilon

28

Mt lp vi SVM

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

tm siu cu (Input hoc Hilbert space)


tm o bn knh nh nht r cha hu ht cc im d liu


29

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM Kt lun v hng pht trin

30

ng dng ca SVM

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

tham kho, (Guyon, 1999)


http://www.clopinet.com/isabelle/Projects/SVM/applist.html nhn dng: ting ni, nh, ch vit tay (hn mng nron) phn loi vn bn, khai m d liu vn bn phn tch d liu theo thi gian phn tch d liu gien, nhn dng bnh, cng ngh bo ch thuc phn tch d liu marketing etc.

31

SVM nhn dng s vit tay

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

32

SVM nhn dng s vit tay

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

33

SVM nhn s vit tay

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

34

SVM phn loi text (reuters)

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

35

SVM phn loi gien

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

36

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM Kt lun v hng pht trin

37

Kt lun

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

SVM + kernel methods


phng php hc mi cung cp nhiu cng c nn tng l thuyt hc thng k ti u ton cc, m hnh cht lng cao, chu ng c nhiu thnh cng trong nhiu ng dng

hn ch

kh dch kt qu phc tp vn cao x l d liu kiu s tham s u vo

38

Hng pht trin

Gii thiu v SVM Gii thut hc ca SVM ng dng ca SVM kt lun v hng pht trin

hng pht trin


multi-class clustering x l d liu ln d liu khng phi kiu s d liu khng cn bng xy dng hm nhn dch kt qu tm kim thng tin (ranking)

39

DEMO chng trnh

40

LibSVM
nh dng d liu label att-i:val-i . att-n:val-n v d

1 1:-0.555556 2:0.25 3:-0.864407 4:-0.916667 1 1:-0.666667 2:-0.166667 3:-0.864407 4:-0.916667 1 1:-0.777778 3:-0.898305 4:-0.916667 2 1:0.111111 2:-0.583333 3:0.322034 4:0.166667 2 1:-1.32455e-07 2:-0.333333 3:0.254237 4:-0.0833333 3 1:0.222222 2:-0.166667 3:0.525424 4:0.416667 3 1:0.888889 2:0.5 3:0.932203 4:0.75 3 1:0.888889 2:-0.5 3:1 4:0.833333

41

LibSVM

ty chn
-s svm_type (default 0) : 0 (SVC), 1 (nu-SVC), 2 (one-class), 3 (epsilon-SVR), 4 (nu-SVR) -t kernel_type (default 2) : 0 (lin), 1 (poly), 2 (RBF), 3 (sigmoid) -d degree (default 3) : polynomial kernel -g gamma (default 1/#attr) : RBF kernel -c cost (default 1) : C-SVC, epsilon-SVR, nu-SVR -p epsilon (default 0.1) : epsilon-SVR -v num_fold : kim tra cho

42

Test (2d)

43

Test (2d)

44

Test (2d)

45

Tp d liu

UCI (Asuncion & Newman, 2007)


*Spambase, 4601 ind., 57 att., 2 classes (spam, non) *Image Segmentation, 2310 ind., 19 att., 7 classes Landsat Satellite, 6435 ind. (4435 tp hc, 2000 tp kim tra), 36 att., 6 classes Reuters-21578, 10789 ind. (7770 tp hc, 3019 tp kim tra), 29406 att., 2 classes (earn, rest)

Bio-medicales (Jinyan & Huiqing, 2002)


ALL-AML Leukemia, 72 ind. (38 tp hc, 34 tp kim tra), 7129 att., 2 classes (ALL, AML) Lung Cancer, 181 ind. (32 tp hc, 149 tp kim tra), 12533 attr., 2 classes (cancer, normal) 46

k-fold

10-fold
it 1 : it 2 :
test train train train train train train train train train

train test train train train train train train train train

it 10 :

train train train train train train train train train test

47

Demo

Spambase (spam=1, non-spam=2)


nghi thc kim tra : 10-fold linear kernel lnh : svm-train -t 0 -c 10 -v 10 spambase.scale

kt qu

48

Demo

Image Segmentation
nghi thc kim tra : 10-fold RBF kernel lnh : svm-train -t 2 -g 0.0002 -c 10000 -v 10 segment.data

kt qu

49

Demo

Landsat Satellite
tp hc sat.trn, tp kim tra sat.tst RBF kernel lnh : svm-train -t 2 -g 0.001 -c 100000 sat.train sat.rbf

kt qu

50

Demo

Reuters-21578 (earn=1, rest=2)


tp hc r.trn, tp kim tra r.tst linear kernel lnh : svm-train -t 0 -c 1000 r.trn r.lin

kt qu

51

Demo

ALL-AML Leukemia (ALL=1, AML=2)


tp hc allaml.trn, tp kim tra allaml.tst linear kernel lnh : svm-train -t 0 -c 1000000 allaml.trn allaml.lin

kt qu

52

Demo

Lung Cancer (cancer=1, normal=2)


tp hc lung.trn, tp kim tra lung.tst linear kernel lnh : svm-train -t 0 -c 1000000 lung.trn lung.lin

kt qu

53

You might also like