SVM2
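A linear function that separates two classes has the form:

$$y(\mathbf{x}) = \mathbf{w}^T \phi(\mathbf{x}) + b \qquad (1)$$

where $\mathbf{w}$ is the weight vector, i.e. the normal vector of the separating hyperplane, and $T$ denotes the transpose; $b$ is the bias; $\phi(\mathbf{x})$ is the feature vector, with $\phi$ the function that maps the input space to the feature space.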
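Equation (1) can be evaluated in a few lines. The following is a minimal sketch, assuming the identity map for $\phi$; the names `identity_features` and `decision_function` and the numeric values are illustrative and do not come from the text.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def identity_features(x: Vector) -> list:
    """Hypothetical feature map phi: here simply the identity."""
    return list(x)


def decision_function(w: Vector, b: float, x: Vector,
                      phi: Callable[[Vector], Vector] = identity_features) -> float:
    """Evaluate y(x) = w^T phi(x) + b, as in equation (1)."""
    return sum(wi * fi for wi, fi in zip(w, phi(x))) + b


# Illustrative values (assumptions, not taken from the text).
w, b = [2.0, -1.0], 0.5
y = decision_function(w, b, [1.0, 3.0])
label = 1 if y >= 0 else -1  # the predicted class is the sign of y(x)
```

With these values $y(\mathbf{x}) = 2 \cdot 1 - 1 \cdot 3 + 0.5 = -0.5$, so the point falls on the negative side of the hyperplane.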
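The input data set consists of $N$ input vectors $\{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_N\}$ with corresponding labels $\{t_1, \dots, t_N\}$, where $t_n \in \{-1, +1\}$. A note on terminology: "data point" and "sample" both mean an input vector $\mathbf{x}_i$; in a two-dimensional space the separator is a straight line, while in a higher-dimensional space it is called a hyperplane. Assume the data set is perfectly linearly separable in the feature space (every sample can be classified correctly). Then there exist values of the parameters $\mathbf{w}$ and $b$ in (1) that satisfy $y(\mathbf{x}_n) > 0$ for points with $t_n = +1$ and $y(\mathbf{x}_n) < 0$ for points with $t_n = -1$, so that $t_n y(\mathbf{x}_n) > 0$ for every training point.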
SVM approaches this problem through the concept of the margin. The margin is defined as the smallest distance from the separator to any data point, that is, the distance from the separator to the closest points.
In SVM, the best separator is the one with the largest margin. Many separators exist, rotated in different directions, and we choose the one whose margin is largest.
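The perpendicular distance from a point $\mathbf{x}$ to the hyperplane $y(\mathbf{x}) = 0$ is $|y(\mathbf{x})| / \|\mathbf{w}\|$. Because we are considering the case where every data point is classified correctly, $t_n y(\mathbf{x}_n) > 0$ for all $n$. The distance from a point $\mathbf{x}_n$ to the separating surface can therefore be written as:

$$\frac{t_n y(\mathbf{x}_n)}{\|\mathbf{w}\|} = \frac{t_n \left( \mathbf{w}^T \phi(\mathbf{x}_n) + b \right)}{\|\mathbf{w}\|} \qquad (2)$$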
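The distance in (2) and the resulting margin can be computed directly. This is a small sketch assuming $\phi$ is the identity map; the function name `margin` and the four data points are made up for illustration.

```python
import math


def margin(w, b, X, t):
    """Smallest value of t_n * y(x_n) / ||w|| over the data set (phi = identity)."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    distances = [
        tn * (sum(wi * xi for wi, xi in zip(w, xn)) + b) / norm_w
        for xn, tn in zip(X, t)
    ]
    return min(distances)


# Illustrative, linearly separable toy data (an assumption of this sketch).
X = [[2.0, 2.0], [3.0, 0.0], [-1.0, -1.0], [0.0, -3.0]]
t = [1, 1, -1, -1]
m = margin([1.0, 1.0], 0.0, X, t)
```

For the separator $\mathbf{w} = (1, 1)$, $b = 0$, the closest point is $(-1, -1)$, at distance $2/\sqrt{2} = \sqrt{2}$.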
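The margin is the perpendicular distance to the closest data point $\mathbf{x}_n$ in the data set, and we want the values of $\mathbf{w}$ and $b$ that maximize this distance. The problem to solve is therefore:

$$\arg\max_{\mathbf{w}, b} \left\{ \frac{1}{\|\mathbf{w}\|} \min_n \left[ t_n \left( \mathbf{w}^T \phi(\mathbf{x}_n) + b \right) \right] \right\} \qquad (3)$$

The factor $1/\|\mathbf{w}\|$ can be taken outside the minimization over $n$ because $\mathbf{w}$ does not depend on $n$. Solving this problem directly would be very complicated, so we convert it into an equivalent problem that is easier to solve. Rescaling $\mathbf{w} \to \kappa \mathbf{w}$ and $b \to \kappa b$ leaves the distance from every data point to the hyperplane unchanged, so we are free to choose the scale at which the point closest to the hyperplane satisfies:

$$t_n \left( \mathbf{w}^T \phi(\mathbf{x}_n) + b \right) = 1 \qquad (4)$$

This transformation does not change the nature of the problem. Every data point then satisfies the constraint

$$t_n \left( \mathbf{w}^T \phi(\mathbf{x}_n) + b \right) \ge 1, \quad n = 1, \dots, N \qquad (5)$$

and, since maximizing $1/\|\mathbf{w}\|$ is equivalent to minimizing $\|\mathbf{w}\|^2$, the problem becomes:

$$\arg\min_{\mathbf{w}, b} \frac{1}{2} \|\mathbf{w}\|^2 \qquad (6)$$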
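The rescaling step can be checked numerically. The sketch below, assuming $\phi$ is the identity and using a hypothetical helper `canonical_form`, rescales a separating $(\mathbf{w}, b)$ so that the closest point satisfies $t_n y(\mathbf{x}_n) = 1$; the margin is then $1/\|\mathbf{w}\|$.

```python
import math


def canonical_form(w, b, X, t):
    """Rescale (w, b) by kappa so that the closest point has t_n * y(x_n) = 1."""
    closest = min(
        tn * (sum(wi * xi for wi, xi in zip(w, xn)) + b)
        for xn, tn in zip(X, t)
    )
    if closest <= 0:
        raise ValueError("(w, b) does not separate the data")
    kappa = 1.0 / closest
    return [kappa * wi for wi in w], kappa * b


# Illustrative toy data (an assumption of this sketch).
X = [[2.0, 2.0], [3.0, 0.0], [-1.0, -1.0], [0.0, -3.0]]
t = [1, 1, -1, -1]
w_c, b_c = canonical_form([1.0, 1.0], 0.0, X, t)
margin_after = 1.0 / math.sqrt(sum(wi * wi for wi in w_c))
```

Here $\kappa = 0.5$, so $\mathbf{w}$ becomes $(0.5, 0.5)$ and the margin $1/\|\mathbf{w}\| = \sqrt{2}$ is the same distance to the closest point as before rescaling.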
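The factor $1/2$ in front of $\|\mathbf{w}\|^2$ is included to simplify the derivatives taken later. Lagrange multiplier theory: the problem of maximizing a function $f(\mathbf{x})$ subject to the condition $g(\mathbf{x}) \ge 0$ is rewritten as the optimization of the Lagrangian function $L(\mathbf{x}, \lambda) = f(\mathbf{x}) + \lambda g(\mathbf{x})$, with $\lambda \ge 0$.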
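To solve the problem above, we rewrite it using the Lagrangian function:

$$L(\mathbf{w}, b, \mathbf{a}) = \frac{1}{2} \|\mathbf{w}\|^2 - \sum_{n=1}^{N} a_n \left\{ t_n \left( \mathbf{w}^T \phi(\mathbf{x}_n) + b \right) - 1 \right\} \qquad (7)$$

where $\mathbf{a} = (a_1, \dots, a_N)^T$ is the vector of Lagrange multipliers, with $a_n \ge 0$.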
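Note the minus sign ($-$) in the Lagrangian: it appears because we minimize with respect to $\mathbf{w}$ and $b$ and maximize with respect to $\mathbf{a}$. Setting the derivatives of $L(\mathbf{w}, b, \mathbf{a})$ with respect to $\mathbf{w}$ and $b$ to zero gives:

$$\mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) \qquad (8)$$

$$0 = \sum_{n=1}^{N} a_n t_n \qquad (9)$$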
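For reference, the differentiation step behind (8) and (9) can be written out explicitly. This is a short worked derivation, assuming the Lagrangian (7) has the form $\frac{1}{2}\|\mathbf{w}\|^2 - \sum_n a_n \{ t_n (\mathbf{w}^T \phi(\mathbf{x}_n) + b) - 1 \}$:

```latex
\begin{aligned}
\frac{\partial L}{\partial \mathbf{w}}
  &= \mathbf{w} - \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) = 0
  &&\Rightarrow\quad \mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n), \\
\frac{\partial L}{\partial b}
  &= -\sum_{n=1}^{N} a_n t_n = 0
  &&\Rightarrow\quad \sum_{n=1}^{N} a_n t_n = 0.
\end{aligned}
```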
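Eliminating $\mathbf{w}$ and $b$ from $L(\mathbf{w}, b, \mathbf{a})$ by substituting (8) and (9) leads to the following optimization problem: maximize

$$\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \qquad (10)$$

subject to the constraints:

$$a_n \ge 0, \quad n = 1, \dots, N \qquad (11)$$

$$\sum_{n=1}^{N} a_n t_n = 0 \qquad (12)$$

Here the kernel function is $k(\mathbf{x}, \mathbf{x}') = \phi(\mathbf{x})^T \phi(\mathbf{x}')$.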
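To make the dual problem concrete, the sketch below evaluates the objective (10) and checks the constraints (11) and (12) on a two-point toy problem. It assumes a linear kernel (identity feature map); the function names and the data are illustrative only.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z, i.e. phi is the identity map (an assumption of this sketch)."""
    return sum(xi * zi for xi, zi in zip(x, z))


def dual_objective(a, X, t, kernel=linear_kernel):
    """Evaluate the dual objective of equation (10)."""
    n = len(a)
    quad = sum(
        a[i] * a[j] * t[i] * t[j] * kernel(X[i], X[j])
        for i in range(n) for j in range(n)
    )
    return sum(a) - 0.5 * quad


def is_feasible(a, t, tol=1e-9):
    """Check constraints (11) a_n >= 0 and (12) sum_n a_n t_n = 0."""
    return (all(ai >= -tol for ai in a)
            and abs(sum(ai * ti for ai, ti in zip(a, t))) <= tol)


# Two one-dimensional points, one per class (illustrative data).
X = [[1.0], [-1.0]]
t = [1, -1]
best = dual_objective([0.5, 0.5], X, t)
other = dual_objective([0.4, 0.4], X, t)
```

For this toy problem, constraint (12) forces $a_1 = a_2 = a$, and (10) reduces to $2a - 2a^2$, which is maximized at $a = 0.5$; the code confirms that a nearby feasible point gives a smaller value.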
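We set this problem aside for the moment; the technique for solving (10) subject to (11) and (12) is discussed later. To classify a new data point with the trained model, we compute the sign of $y(\mathbf{x})$ from (1), with $\mathbf{w}$ replaced using (8):

$$y(\mathbf{x}) = \sum_{n=1}^{N} a_n t_n k(\mathbf{x}, \mathbf{x}_n) + b \qquad (13)$$

The solution satisfies the following KKT conditions:

$$a_n \ge 0 \qquad (14)$$

$$t_n y(\mathbf{x}_n) - 1 \ge 0 \qquad (15)$$

$$a_n \left\{ t_n y(\mathbf{x}_n) - 1 \right\} = 0 \qquad (16)$$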
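Thus, for every data point, either $a_n = 0$ or $t_n y(\mathbf{x}_n) = 1$. Data points with $a_n = 0$ do not appear in (13) and therefore contribute nothing to the prediction for a new data point. The remaining data points ($a_n > 0$) are called support vectors; they satisfy $t_n y(\mathbf{x}_n) = 1$, so they are the points that lie on the margin of the hyperplane in the feature space.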
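The role of the support vectors in (13) can be illustrated with a short sketch. It assumes a linear kernel and uses multipliers that solve the dual for this particular three-point toy set; the names `predict` and `support` are hypothetical.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z (identity feature map, an assumption of this sketch)."""
    return sum(xi * zi for xi, zi in zip(x, z))


def predict(x, X, t, a, b, kernel=linear_kernel):
    """y(x) from equation (13); points with a_n = 0 are skipped."""
    return sum(an * tn * kernel(x, xn)
               for xn, tn, an in zip(X, t, a) if an > 0) + b


# Toy data: the third point lies far from the boundary, so its multiplier is 0.
X = [[1.0], [-1.0], [3.0]]
t = [1, -1, 1]
a = [0.5, 0.5, 0.0]
b = 0.0

support = [n for n, an in enumerate(a) if an > 1e-12]  # indices of support vectors
y_new = predict([2.0], X, t, a, b)
# KKT condition (16): a_n * (t_n * y(x_n) - 1) should vanish for every n.
kkt_residuals = [an * (tn * predict(xn, X, t, a, b) - 1.0)
                 for xn, tn, an in zip(X, t, a)]
```

Only the first two points are support vectors, and removing the third point from the training set would leave every prediction unchanged.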
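The support vectors are what matter in SVM training: the classification of a new data point depends only on them. Suppose we have solved (10) and found the multipliers $\mathbf{a}$; we now need to determine the parameter $b$ from the support vectors $\mathbf{x}_n$, which satisfy $t_n y(\mathbf{x}_n) = 1$. Substituting (13) gives:

$$t_n \left( \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) + b \right) = 1 \qquad (17)$$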
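Equation (17) can be solved for $b$ at each support vector. The sketch below does this with a linear kernel and also reports the mean of the per-vector values; the helper name `bias_from_support_vectors` and the toy data are assumptions for illustration.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z (identity feature map, an assumption of this sketch)."""
    return sum(xi * zi for xi, zi in zip(x, z))


def bias_from_support_vectors(X, t, a, kernel=linear_kernel, tol=1e-12):
    """Solve (17) for b at every support vector and return (values, mean)."""
    S = [n for n, an in enumerate(a) if an > tol]
    per_vector = []
    for n in S:
        s = sum(a[m] * t[m] * kernel(X[n], X[m]) for m in S)
        # Multiplying (17) by t_n and using t_n**2 == 1 gives b = t_n - s.
        per_vector.append(t[n] - s)
    return per_vector, sum(per_vector) / len(per_vector)


# Two one-dimensional points; a = (0.5, 0.5) solves the dual for this toy set.
X = [[3.0], [1.0]]
t = [1, -1]
a = [0.5, 0.5]
values, b = bias_from_support_vectors(X, t, a)
```

Both support vectors give $b = -2$, which places the decision boundary $y(x) = x - 2 = 0$ midway between the two points.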
Here $S$ is the set of support vectors. Although substituting a single support vector $\mathbf{x}_n$ is enough to find $b$, for numerical stability we compute $b$ as an average over all support vectors. First multiply (17) by $t_n$ (noting that $t_n^2 = 1$); the value of $b$ is then:

$$b = \frac{1}{N_S} \sum_{n \in S} \left( t_n - \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) \right) \qquad (18)$$

where $N_S$ is the total number of support vectors.

So far, to keep the presentation simple, we have assumed that the data points are perfectly separable in the feature space $\phi(\mathbf{x})$. Perfect separation, however, can lead to poor generalization: in practice some samples may be mislabeled during data collection, and insisting on perfect separation makes the model overfit.
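To counter overfitting, we allow a few points to be misclassified. To do this we introduce slack variables $\xi_n \ge 0$, one for each data point: $\xi_n = 0$ for points that lie on the margin or on the correct side of it, and $\xi_n = |t_n - y(\mathbf{x}_n)|$ for the remaining points. Points that lie on the separating boundary ($y(\mathbf{x}_n) = 0$) therefore have $\xi_n = 1$, and misclassified points have $\xi_n > 1$. The classification constraint is relaxed to:

$$t_n y(\mathbf{x}_n) \ge 1 - \xi_n, \quad n = 1, \dots, N \qquad (20)$$

Our goal is now to maximize the margin while softly penalizing the points that fall on the wrong side of it. The quantity to minimize becomes:

$$C \sum_{n=1}^{N} \xi_n + \frac{1}{2} \|\mathbf{w}\|^2 \qquad (21)$$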
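The slack variables and the objective (21) can be computed for a fixed separator as follows. This is a sketch assuming $\phi$ is the identity; it uses the fact that, for a point violating the margin, $|t_n - y(\mathbf{x}_n)| = 1 - t_n y(\mathbf{x}_n)$ because $t_n = \pm 1$. Function names and data are illustrative.

```python
def slack_variables(w, b, X, t):
    """xi_n = max(0, 1 - t_n * y(x_n)): zero on or outside the correct margin."""
    return [
        max(0.0, 1.0 - tn * (sum(wi * xi for wi, xi in zip(w, xn)) + b))
        for xn, tn in zip(X, t)
    ]


def soft_margin_objective(w, b, X, t, C):
    """Evaluate C * sum_n xi_n + 0.5 * ||w||^2, as in equation (21)."""
    xi = slack_variables(w, b, X, t)
    return C * sum(xi) + 0.5 * sum(wi * wi for wi in w)


# One-dimensional toy data with the separator y(x) = x (illustrative values).
X = [[2.0], [0.5], [0.0], [-0.5], [-2.0]]
t = [1, 1, 1, 1, -1]
xi = slack_variables([1.0], 0.0, X, t)
objective = soft_margin_objective([1.0], 0.0, X, t, C=1.0)
```

The slacks come out as 0 (outside the margin), 0.5 (inside the margin but correct), 1 (on the boundary), 1.5 (misclassified) and 0, matching the cases described above.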
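Here $C > 0$ controls how much weight is placed on the slack variables $\xi_n$ relative to the margin. We now need to minimize (21) subject to the constraints (20) and $\xi_n \ge 0$. The corresponding Lagrangian is:

$$L(\mathbf{w}, b, \boldsymbol{\xi}, \mathbf{a}, \boldsymbol{\mu}) = \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{n=1}^{N} \xi_n - \sum_{n=1}^{N} a_n \left\{ t_n y(\mathbf{x}_n) - 1 + \xi_n \right\} - \sum_{n=1}^{N} \mu_n \xi_n \qquad (22)$$

where $\{a_n \ge 0\}$ and $\{\mu_n \ge 0\}$ are the Lagrange multipliers.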
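The KKT conditions that must hold are:

$$a_n \ge 0 \qquad (23)$$

$$t_n y(\mathbf{x}_n) - 1 + \xi_n \ge 0 \qquad (24)$$

$$a_n \left( t_n y(\mathbf{x}_n) - 1 + \xi_n \right) = 0 \qquad (25)$$

$$\mu_n \ge 0 \qquad (26)$$

$$\xi_n \ge 0 \qquad (27)$$

$$\mu_n \xi_n = 0 \qquad (28)$$

for $n = 1, \dots, N$. Setting the derivatives of (22) with respect to $\mathbf{w}$, $b$ and $\{\xi_n\}$ to zero gives:

$$\frac{\partial L}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{n=1}^{N} a_n t_n \phi(\mathbf{x}_n) \qquad (29)$$

$$\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{n=1}^{N} a_n t_n = 0 \qquad (30)$$

$$\frac{\partial L}{\partial \xi_n} = 0 \;\Rightarrow\; a_n = C - \mu_n \qquad (31)$$

Substituting (29), (30) and (31) into (22) gives:

$$\tilde{L}(\mathbf{a}) = \sum_{n=1}^{N} a_n - \frac{1}{2} \sum_{n=1}^{N} \sum_{m=1}^{N} a_n a_m t_n t_m k(\mathbf{x}_n, \mathbf{x}_m) \qquad (32)$$

From (23), (26) and (31) we obtain $a_n \le C$.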
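The optimization problem is identical to the perfectly separable case; only the constraints differ:

$$0 \le a_n \le C \qquad (33)$$

$$\sum_{n=1}^{N} a_n t_n = 0 \qquad (34)$$

Substituting (29) into (1) shows that the prediction for a new data point has the same form as (13). As before, the points with $a_n = 0$ contribute nothing to the prediction. The remaining points, with $a_n > 0$, are the support vectors, and by (25) they satisfy:

$$t_n y(\mathbf{x}_n) = 1 - \xi_n \qquad (35)$$

If $a_n < C$, then (31) gives $\mu_n > 0$, and from (28) it follows that $\xi_n = 0$, so these are the points that lie on the margin.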
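Points with $a_n = C$ are either correctly classified points lying between the margin and the separating boundary, if $\xi_n \le 1$, or misclassified points, if $\xi_n > 1$. To determine the parameter $b$ in (1) we use the support vectors with $0 < a_n < C$, which have $\xi_n = 0$ and therefore $t_n y(\mathbf{x}_n) = 1$:

$$t_n \left( \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) + b \right) = 1 \qquad (36)$$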
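The cases for $a_n$ in the soft-margin solution ($a_n = 0$, $0 < a_n < C$, $a_n = C$) can be summarized in a small helper. This is only an illustrative sketch: the function `categorize`, the tolerance, and the multiplier and slack values are made up rather than obtained from training.

```python
def categorize(a, xi, C, tol=1e-9):
    """Label each training point by its multiplier a_n and slack xi_n."""
    labels = []
    for an, xin in zip(a, xi):
        if an <= tol:
            labels.append("not a support vector")
        elif an < C - tol:
            labels.append("support vector on the margin")
        elif xin <= 1.0:
            labels.append("support vector inside the margin")
        else:
            labels.append("misclassified support vector")
    return labels


# Hypothetical multipliers and slacks for four training points, with C = 1.
labels = categorize(a=[0.0, 0.3, 1.0, 1.0], xi=[0.0, 0.0, 0.4, 1.7], C=1.0)
# Only the points with 0 < a_n < C are used when solving (36) for b.
margin_indices = [n for n, lab in enumerate(labels)
                  if lab == "support vector on the margin"]
```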
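Once again, for numerical stability we compute $b$ as an average:

$$b = \frac{1}{N_M} \sum_{n \in M} \left( t_n - \sum_{m \in S} a_m t_m k(\mathbf{x}_n, \mathbf{x}_m) \right) \qquad (37)$$

where $M$ is the set of points with $0 < a_n < C$ and $N_M$ is the number of such points. To solve (10) and (32) we use the Sequential Minimal Optimization (SMO) algorithm introduced by Platt in 1999.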
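To give a feel for how an SMO-style solver works, here is a heavily simplified sketch. It is not Platt's full algorithm: it omits his working-set heuristics and error cache, picks the second multiplier by the largest error difference, and updates $b$ incrementally instead of averaging as in (37). The name `simplified_smo`, its parameters and the toy data are assumptions.

```python
def linear_kernel(x, z):
    """k(x, z) = x^T z (identity feature map, an assumption of this sketch)."""
    return sum(xi * zi for xi, zi in zip(x, z))


def simplified_smo(X, t, C, kernel=linear_kernel, tol=1e-3,
                   max_passes=5, max_sweeps=1000):
    """Optimize pairs (a_i, a_j) analytically while keeping (33) and (34) satisfied."""
    n = len(X)
    K = [[kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    a, b = [0.0] * n, 0.0

    def error(i):
        # E_i = y(x_i) - t_i, with y(x_i) evaluated as in (13).
        return sum(a[m] * t[m] * K[i][m] for m in range(n)) + b - t[i]

    passes = 0
    for _ in range(max_sweeps):
        if passes >= max_passes:
            break
        changed = 0
        for i in range(n):
            E_i = error(i)
            violates = ((t[i] * E_i < -tol and a[i] < C)
                        or (t[i] * E_i > tol and a[i] > 0))
            if not violates:
                continue
            # Partner index: the point with the largest |E_i - E_j|.
            j = max((m for m in range(n) if m != i),
                    key=lambda m: abs(E_i - error(m)))
            E_j = error(j)
            a_i_old, a_j_old = a[i], a[j]
            # Box for a_j that keeps 0 <= a <= C and sum_n a_n t_n = 0.
            if t[i] != t[j]:
                low, high = max(0.0, a[j] - a[i]), min(C, C + a[j] - a[i])
            else:
                low, high = max(0.0, a[i] + a[j] - C), min(C, a[i] + a[j])
            eta = 2.0 * K[i][j] - K[i][i] - K[j][j]
            if low == high or eta >= 0:
                continue
            a_j_new = min(high, max(low, a_j_old - t[j] * (E_i - E_j) / eta))
            if abs(a_j_new - a_j_old) < 1e-8:
                continue
            a[j] = a_j_new
            a[i] = a_i_old + t[i] * t[j] * (a_j_old - a_j_new)
            d_i, d_j = a[i] - a_i_old, a[j] - a_j_old
            b1 = b - E_i - t[i] * d_i * K[i][i] - t[j] * d_j * K[i][j]
            b2 = b - E_j - t[i] * d_i * K[i][j] - t[j] * d_j * K[j][j]
            if 0 < a[i] < C:
                b = b1
            elif 0 < a[j] < C:
                b = b2
            else:
                b = (b1 + b2) / 2.0
            changed += 1
        passes = passes + 1 if changed == 0 else 0
    return a, b


# Two one-dimensional points, one per class (illustrative data).
a, b = simplified_smo([[3.0], [1.0]], [1, -1], C=10.0)
```

On this toy problem a single pair update reaches $a = (0.5, 0.5)$ and $b = -2$, the same values obtained by solving the dual by hand. For real data sets a library implementation is the safer choice.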
2. MultiClass SVMs:
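We now consider classification with more than two classes, $K > 2$. A $K$-class classifier can be built by combining several two-class separators, but this leads to some difficulties (Duda and Hart, 1973). In the one-versus-the-rest approach, $K - 1$ binary classifiers are used to build the $K$-class classifier. In the one-versus-one approach, $K(K-1)/2$ binary classifiers are used. Both approaches produce ambiguous regions in the classification (as shown in the figure). This problem can be avoided by building the $K$-class classifier from $K$ linear functions of the form:

$$y_k(\mathbf{x}) = \mathbf{w}_k^T \mathbf{x} + w_{k0}$$

A point $\mathbf{x}$ is assigned to class $C_k$ when $y_k(\mathbf{x}) > y_j(\mathbf{x})$ for all $j \ne k$.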
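The decision rule based on $K$ linear functions amounts to taking the class with the largest score. A minimal sketch, with made-up weights for $K = 3$ classes in two dimensions and a hypothetical function name `classify_multiclass`:

```python
def classify_multiclass(x, W, w0):
    """Return (k, scores) where k maximizes y_k(x) = w_k^T x + w_k0."""
    scores = [sum(wi * xi for wi, xi in zip(wk, x)) + bk
              for wk, bk in zip(W, w0)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Illustrative weights for three classes (not taken from the text).
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
w0 = [0.0, 0.0, 0.0]
k1, scores1 = classify_multiclass([2.0, 1.0], W, w0)
k2, _ = classify_multiclass([-1.0, -2.0], W, w0)

# Number of binary classifiers needed by one-versus-one for K classes.
K = 4
n_pairwise = K * (K - 1) // 2
```

Because every point receives exactly one maximal score (ties aside), this rule leaves no ambiguous region of the kind produced by combining binary separators.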
Another approach, proposed by Wu (2004), is a probability estimation method for multi-class classification.
Implementing an SVM is fairly complex, so it is advisable to use existing libraries such as LibSVM or SVMLight. The algorithm has two stages, training and classification:
1. Training:
   Input:
   - The feature vectors of the documents in the training set (an M x N matrix, where M is the number of feature vectors in the training set and N is the number of features per vector).
   - The label/class of each feature vector in the training set.
   - The parameters of the SVM model: C and the parameter of the kernel function (the Gaussian kernel is commonly used).
   Output: the SVM model (the support vectors, the Lagrange multipliers a, and the parameter b).
2. Classification:
   Input:
   - The feature vector of the document to classify.
   - The SVM model.
   Output: the label/class of the document.
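The classification stage can be sketched without any library once a model is available. The example below assumes a Gaussian kernel of the form $\exp(-\gamma \|\mathbf{x} - \mathbf{z}\|^2)$ and a hypothetical `SVMModel` container; the model values are hand-picked for illustration, not the output of a real training run or of LibSVM.

```python
import math
from dataclasses import dataclass


@dataclass
class SVMModel:
    """Hypothetical container for the training output described above."""
    support_vectors: list  # feature vectors x_n with a_n > 0
    labels: list           # their labels t_n
    multipliers: list      # their Lagrange multipliers a_n
    b: float               # bias
    gamma: float           # Gaussian kernel parameter (name is an assumption)


def gaussian_kernel(x, z, gamma):
    """k(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((xi - zi) ** 2 for xi, zi in zip(x, z)))


def classify(model, x):
    """Return the sign of y(x) computed from the support vectors, as in (13)."""
    y = sum(a * t * gaussian_kernel(x, sv, model.gamma)
            for sv, t, a in zip(model.support_vectors,
                                model.labels, model.multipliers)) + model.b
    return 1 if y >= 0 else -1


# Hand-built model with two support vectors (illustrative values only).
model = SVMModel(support_vectors=[[0.0, 0.0], [2.0, 2.0]],
                 labels=[1, -1], multipliers=[1.0, 1.0], b=0.0, gamma=0.5)
near_first = classify(model, [0.1, 0.2])
near_second = classify(model, [1.9, 2.1])
```

A point close to the first support vector is assigned label +1 and a point close to the second is assigned label -1, since the Gaussian kernel gives the nearer support vector the larger weight.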