Detecting Pornographic Images - Tam


DETECTING PORNOGRAPHIC IMAGES

Abstract

This paper describes a system for determining whether an image is "pornographic". The system locates the skin pixels in an image and groups them into skin blobs using colour, texture and edge information. Features are then extracted from the skin blobs and used as the input to an MLP neural network classifier. The accuracy of the classifier, and the importance of the individual features in the classification process, are investigated through a series of experiments and discussed.

1. INTRODUCTION

Advances over the past decade in digital imaging and data storage technology, together with widespread Internet access, have greatly increased the flow of information between people. Although many benefits stem from this sharing of information, less desirable content, such as pornography, is also freely available. Systems are therefore needed that accurately identify and block such content without manual intervention, so that computing resources are used appropriately.

Many commercial systems are designed to block access to pornographic material. Most of them, such as NetNanny, CyberSitter, CyberPatrol and ChildWebGuardian, block websites by checking the IP address / URL and the content of the web pages. This approach is very effective at blocking well-known pornographic sites and the pages that link to them, but it has difficulty blocking pages that contain galleries of pornographic images and have no links to other objectionable pages or content.

There is a strong correlation between the percentage of skin in an image and its pornographic content. Using the percentage of skin as the only means of deciding whether an image is pornographic is not entirely accurate, yet this approach has been implemented in commercial systems such as ScreenShield, SnitchSystem Reconv Enologic Netfilter Home. These systems are effective as monitoring software: suspect images and the associated information, such as the user and the time of the event, are stored for later review by an inspector. In other words, these systems are designed to deter access to pornographic content rather than to block it.

The remainder of the paper is organised as follows. Section 2 covers related work on the detection of pornographic images. Section 3 details the skin blob detection and feature extraction algorithms used in the system. Section 4 describes the construction of the optimal classifier and evaluates its effectiveness, and that of the individual features, in terms of False Acceptation Rate (FAR) and False Rejection Rate (FRR). Section 5 concludes the paper and outlines current work.

2. RELATED WORK ON DETECTING PORNOGRAPHIC IMAGES

Given the strong correlation between the percentage of skin and the pornographic content of an image, the first step in every pornographic image classification system is skin detection. However, there are several potential obstacles to detecting skin accurately:

(a) Poor image quality, for example very low contrast.
(b) The presence of non-human objects whose colours are close to that of human skin.
(c) Saturation of skin regions caused by skin reflections and lighting problems.

To overcome these obstacles, a number of competing algorithms based on different colour spaces, such as RGB, HSV, CIE L*u*v* and Log-opponent, have been proposed to identify skin-coloured pixels [1, 2, 3, 4, 5, 6]. However, it has been shown that the approach carries over to all colour spaces, so choosing the best colour space to optimise the accuracy of the skin detector is no longer an issue. Meanwhile, these algorithms simply define a static set of pixel colour values as skin colour.

3. SYSTEM OVERVIEW

Figure 1: The four modules of the system

In the proposed system, thirteen features are extracted from the skin regions detected in an image and used to train an MLP neural network, in order to obtain an optimal, general classifier that decides whether or not pornographic content is present. Some of the extracted features are similar to those used in related work [1, 2, 3]. The system itself is divided into four parts: the Size and Palette Analysis, Skin Detection, Feature Extraction and Decision Classifier Modules, as illustrated in Figure 1.

The Size and Palette Analysis Module acts as a fast, basic filter. If the image size is below a given threshold, for example 32x32 pixels, the image is automatically labelled non-pornographic because it is probably an icon or a similar graphic. Likewise, an image that contains only a limited number of colours (<100) is labelled non-pornographic because it is unlikely to be a photograph.

The Skin Detection Module uses colour, texture and edge information to determine the probable class of each pixel. Connected pixels of the same class are then grouped together to form blobs (regions) of skin and non-skin. This module is described in detail in Section 3.1.

In the Feature Extraction Module, thirteen features are extracted from the detected skin blobs. The features, and the way they are derived, are described in Section 3.2.
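The quick pre-filter of the Size and Palette Analysis Module could be sketched roughly as follows. This is only an illustration under stated assumptions: the image is taken to be a NumPy array of shape (height, width, channels), "below 32x32" is read as either side being shorter than 32 pixels, and the function name `passes_size_and_palette` is hypothetical rather than taken from the paper.

```python
import numpy as np

MIN_SIDE = 32       # example threshold from the text (32x32 pixels)
MIN_COLOURS = 100   # images with fewer distinct colours are not treated as photographs


def passes_size_and_palette(image):
    """Return True if the image should be passed on to skin detection.

    Assumption: `image` is an array of shape (height, width, channels).
    Images that fail either test are labelled non-pornographic.
    """
    height, width = image.shape[:2]
    if height < MIN_SIDE or width < MIN_SIDE:
        return False  # too small, probably an icon
    # Count distinct colours by treating every pixel as one row.
    pixels = image.reshape(-1, image.shape[-1])
    n_colours = np.unique(pixels, axis=0).shape[0]
    return n_colours >= MIN_COLOURS
```

Counting exact distinct colours is the simplest reading of "a limited number of colours"; a real system might quantise the palette first.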

The optimal classifier is used in the Decision Classifier Module to decide whether or not the image is pornographic.

3.1 Skin Blob Detection

The six steps used to detect skin blobs are shown in Figure 2. First, the contrast of the image is enhanced by histogram equalisation [8]. This significantly improves the accuracy of the subsequent steps, especially when the image is washed out or overexposed.
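As an illustration of this first step, histogram equalisation of a single 8-bit channel can be written in a few lines of NumPy. This is a minimal sketch of the textbook technique, not the authors' code: the paper does not say whether equalisation is applied to a luminance channel or to each colour channel, so the function below (with the hypothetical name `equalise_histogram`) simply handles one `uint8` channel.

```python
import numpy as np


def equalise_histogram(gray):
    """Histogram-equalise one 8-bit channel (2-D uint8 array)."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]       # CDF value at the darkest level present
    total = gray.size
    if total == cdf_min:            # constant image: nothing to stretch
        return gray.copy()
    # Map each grey level so that the cumulative distribution becomes linear.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]
```

For a low-contrast input whose values occupy only a narrow band, the output spans the full 0 to 255 range, which is the effect the text relies on for washed-out images.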

Figure 2: The six steps of skin blob detection

Second, a pixel is labelled as a skin pixel if its RGB value falls within a colour range held in a skin colour table. The colour table was derived by manually selecting skin-coloured pixels in a set of test images.

Third, the texture around each pixel is measured in order to reject pixels that are skin-coloured but have a high texture amplitude (skin regions are usually smooth and therefore have a low texture amplitude). The co-occurrence matrix technique is used because it gives the best trade-off between accuracy and computation time [8]. Figure 3 illustrates how effective the texture measure is at obtaining accurate skin pixels for pornographic and non-pornographic images.
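To make the texture step more concrete, the sketch below builds a grey-level co-occurrence matrix for a small patch and derives a contrast value from it. Several things here are assumptions rather than details from the paper: the choice of the contrast statistic, the horizontal one-pixel offset, the quantisation to 8 grey levels, and the function name `glcm_contrast`.

```python
import numpy as np


def glcm_contrast(patch, levels=8):
    """Contrast of the co-occurrence matrix of a 2-D uint8 patch.

    Pairs each pixel with its right-hand neighbour. Smooth patches give a
    value near 0; strongly textured patches give large values.
    """
    # Quantise 0..255 into `levels` grey levels to keep the matrix small.
    q = (patch.astype(np.int64) * levels) // 256
    left = q[:, :-1].ravel()
    right = q[:, 1:].ravel()
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (left, right), 1.0)   # count co-occurring level pairs
    total = glcm.sum()
    if total == 0:                        # patch is only one pixel wide
        return 0.0
    glcm /= total                         # turn counts into probabilities
    i, j = np.indices((levels, levels))
    return float((glcm * (i - j) ** 2).sum())
```

A skin-coloured pixel would then be rejected when the value for its neighbourhood exceeds some threshold; the paper does not state that threshold, so it would have to be tuned on labelled data.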

Fourth, the "probable skin" pixels that survive the texture test are grown to include neighbouring pixels with similar colour and texture properties. This step brings the following into the skin regions:

(a) Skin-coloured pixels that were mislabelled by the texture test because they lie close to edges in the image.
(b) Non-skin-coloured pixels from valid human skin regions that were mislabelled because of saturation.

Fifth, the skin regions are segmented by applying Canny edge detection and labelling the pixels that lie on edges as non-skin. This ensures that adjoining skin regions, for example an arm lying across the top of a torso, remain distinct when the skin blobs are built.

Finally, sixth, the skin blobs are formed by grouping connected skin pixels together. Small skin blobs, and likewise small non-skin blobs, are removed by relabelling their pixels as non-skin and skin respectively.

3.2 Feature Extraction

If skin blobs are detected in the image, the thirteen features shown in Table 1 are extracted. If no skin blobs are found, the image is classified as non-pornographic.

The first feature is simply the percentage of pixels in the image that are labelled as skin by the skin blob detection module. The second feature is the percentage of colours in the image that are classified as "skin" colours.
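The blob grouping of step six and the first feature can be sketched as follows. This is an illustrative version only: it assumes a boolean skin mask as input, uses 4-connectivity (the paper does not state which connectivity is used), and takes a hypothetical `min_size` parameter for what counts as a "small" blob. Only the removal of small skin blobs is shown; filling small non-skin blobs would be the same operation applied to the inverted mask.

```python
from collections import deque

import numpy as np


def label_blobs(mask):
    """Label 4-connected regions of True pixels; returns (labels, count)."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=np.int32)
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y, x] and labels[y, x] == 0:
                count += 1
                labels[y, x] = count
                queue = deque([(y, x)])
                while queue:  # breadth-first flood fill of one blob
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = count
                            queue.append((ny, nx))
    return labels, count


def remove_small_blobs(mask, min_size):
    """Relabel skin blobs smaller than min_size pixels as non-skin."""
    labels, count = label_blobs(mask)
    sizes = np.bincount(labels.ravel(), minlength=count + 1)
    keep = sizes >= min_size
    keep[0] = False               # label 0 is the non-skin background
    return keep[labels]


def skin_percentage(mask):
    """Feature 1: percentage of image pixels labelled as skin."""
    return 100.0 * mask.sum() / mask.size
```

The number of blobs returned by `label_blobs` on the cleaned mask would also give the blob count that is used as a later feature.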

Table 1: The 13 features extracted from an image

The third feature is simply the number of skin blobs in the image. Images with a large number of skin blobs are likely to be group scenes and are therefore probably not pornographic.

The fourth feature is an approximation of the geometric filter of Forsyth and Fleck, which indicates the presence of human structures [2]. To determine the number of limb-like objects in the image, a very fast algorithm is applied to the skin blobs to expose their basic skeleton. Potential limbs are defined as the long, straight parts of the skeleton whose length-to-width proportions are similar to those of human limbs, and these are then counted. Examples of limb-like objects found using this technique are shown in Figure 4.

Features 5-13 relate to the size and shape of the largest skin blob. If a nude person is present, it is assumed that the largest skin blob corresponds to that person. Features 5 and 6 are the ratios of the width and height of the blob to the width and height of the image. Features 7-13 describe the shape of the largest blob by means of the seven Hu invariant moments. The Hu moments are computed using the outline of the largest skin blob [9].
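The size and shape features of the largest blob can be illustrated with a short sketch. It computes the two size ratios and, for brevity, only the first two of the seven Hu invariants using the standard definitions; the remaining five follow the same pattern with third-order moments. The function name `largest_blob_features` is hypothetical, the blob is assumed to be a non-empty boolean mask the same size as the image, and the moments are taken over whatever pixels are set in that mask (pass an outline mask to follow the paper's description).

```python
import numpy as np


def largest_blob_features(blob):
    """Return (width_ratio, height_ratio, hu1, hu2) for a boolean mask.

    Assumption: `blob` has the same shape as the image and contains at
    least one True pixel.
    """
    ys, xs = np.nonzero(blob)
    img_h, img_w = blob.shape
    width_ratio = (xs.max() - xs.min() + 1) / img_w
    height_ratio = (ys.max() - ys.min() + 1) / img_h

    # Central coordinates make the moments translation invariant.
    x = xs - xs.mean()
    y = ys - ys.mean()
    m00 = float(xs.size)

    def eta(p, q):
        # Normalised central moment: dividing by m00^(1 + (p+q)/2)
        # makes it scale invariant.
        return float((x ** p * y ** q).sum()) / m00 ** (1 + (p + q) / 2)

    hu1 = eta(2, 0) + eta(0, 2)
    hu2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return width_ratio, height_ratio, hu1, hu2
```

Because the Hu values are built from normalised central moments, moving the same blob elsewhere in the image leaves them unchanged, while the two ratio features capture how much of the frame the blob occupies.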

Figure 4: Limb-like objects, shown as black lines.

4. EXPERIMENTS

To investigate how effective the extracted features are at determining whether or not an image is pornographic, three sets of experiments were carried out. The aim of the first set was to build an optimal MLP neural network classifier [10]; specifically, to find the optimal number of hidden layers and the number of nodes in each hidden layer. The aim of the second set was to examine the effect of semi-pornographic images on the accuracy of the optimal classifier. The aim of the third set was to investigate the importance of the individual features in determining the pornographic content of an image, by building a Maximum Entropy Tree (MET) [10].

Four manually labelled image data sets were constructed. Each data set consists of 800 pornographic images and 800 non-pornographic images. To allow a detailed analysis, and to compensate for subjective differences of opinion about whether some images are pornographic, the images in the data sets were divided into sub-classes, as shown in Figure 5. Sample images from the six lowest-level sub-classes are shown in Figure 6.

Figure 6: Images from the sub-classes: (a) Wrapped-Exposed, (b) Half-Exposed, (c) Full-Exposed3X, (d) Full-Exposed5X, (e) Human, (f) Non-Human. Note that some image regions have been blacked out.

4.1 Experimental Method

For the first set of experiments, whose purpose is to find the optimal classifier, the neural network was trained using different numbers of hidden layers (1-2) and different numbers of nodes in each hidden layer (1-20). This range of layers and nodes has been shown to be sufficient to obtain a general classifier that fits the training samples [10]. One of the data sets was used as the training set and another as the validation set. The images in both data sets were spread evenly across the lowest level of sub-classes; for example, the 800 non-pornographic images comprised 400 human and 400 non-human images. For each combination of layers and nodes the training algorithm (100,000 iterations) was applied. At the end of training the classifier was evaluated and the equal error rate (EER) was found; the lower the EER, the better the generalisation of the classifier.

Once the best combination had been identified (1 layer and 10 nodes), the validation data set was used together with the training data set to determine the optimal number of training iterations (previously set to 100,000 iterations). This was done to prevent the training algorithm from overfitting the training data at the cost of reduced generalisation accuracy on other, unseen images. After each iteration the training error (the cumulative sum of the magnitude of the output value minus the expected value) was computed for both the training data set and the validation data set, see Figure 7. It can be seen that the error on the training data set (dashed line) decreases as the number of training iterations increases, whereas the error on the validation data set (solid curve) begins to rise once the number of iterations reaches 1000.

For the second set of experiments, whose aim is to test the accuracy of the optimal classifier in the presence of semi-pornographic images, two test data sets were constructed (the images in the test data sets do not appear in the training and validation data sets used in the first experiment). The first test data set has 800 pornographic and 800 non-pornographic images, drawn in equal numbers from the respective sub-classes. The second test data set is identical to the first except that the 200 images in the "Light-Porn" sub-class are replaced with 200 new images from the Heavy-Porn sub-class.
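The choice of the number of training iterations described above, where training stops at the point where the validation error is lowest, can be sketched as follows. The error measure follows the definition given in the text (the summed magnitude of output minus expected value); the function names and the idea of recording one validation error per iteration in a list are illustrative assumptions.

```python
import numpy as np


def cumulative_error(outputs, targets):
    """Sum of |output - expected| over a data set, as defined in the text."""
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return float(np.abs(outputs - targets).sum())


def best_iteration(validation_errors):
    """1-based iteration at which the validation error is smallest.

    Assumption: validation_errors[k] is the validation-set error measured
    after iteration k + 1. Training beyond the returned iteration lowers
    the training error but not the validation error.
    """
    return int(np.argmin(validation_errors)) + 1
```

In practice the weights from the selected iteration would be kept, or training would simply be rerun with that many iterations.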

Figure 7: Change in training error with increasing iterations for the training and validation data sets.

In the third set of experiments, designed to measure the contribution of the different features to the detection of pornography, an optimal MET was built using the training and validation data sets together, with an approach similar to that of experiment 1. The training error was defined as the sum of FAR and FRR, and its minimum occurred with 21 nodes in the MET. The accuracy of the MET was measured using the first (balanced) test data set, as constructed for the second set of experiments.

4.2 Results and Discussion

When applied to an image, the classifier returns a value between zero (not nude) and one (nude); if the value is above a "nudity" threshold, the image is deemed pornographic. This is because, in the supervised training of the neural network, pornographic images were assigned the value one and non-pornographic images the value zero. The accuracy of the optimal classifier is measured by the FAR and FRR values as a function of the nudity threshold.
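The relationship between the nudity threshold, FAR, FRR and EER can be illustrated with a small sketch. The paper does not spell out its definitions, so the code assumes that FAR is the fraction of non-pornographic images flagged as pornographic and FRR is the fraction of pornographic images that are not flagged; the function names and the sweep over 101 evenly spaced thresholds are also assumptions. Both classes are assumed to be present in the data.

```python
import numpy as np


def far_frr(scores, labels, threshold):
    """FAR and FRR at one nudity threshold.

    scores: classifier outputs in [0, 1]; labels: True for pornographic.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    flagged = scores > threshold
    far = float(np.mean(flagged[~labels]))   # non-porn flagged as porn
    frr = float(np.mean(~flagged[labels]))   # porn passed as non-porn
    return far, frr


def equal_error_rate(scores, labels, thresholds=None):
    """Error rate at the threshold where FAR and FRR are closest."""
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 101)
    best_gap, best_rate = None, None
    for t in thresholds:
        far, frr = far_frr(scores, labels, t)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2.0
    return best_rate
```

Under these definitions, lowering the threshold flags more images, which reduces FRR at the expense of FAR.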

Figure 8 shows the FAR and FRR results for the two data sets used in experiment 2. In Figure 8(a), where equal numbers of images from each sub-class are present in the data set, an EER of 18% is achieved at a nudity threshold of 0.5. Lowering the nudity threshold reduces the FRR very quickly. This is a useful result for pornography monitoring software, where mistaking a pornographic image for a non-pornographic one is the more serious detection error.

Figure 8(b), where the "light" pornographic images are replaced by "heavy" pornographic images in the data set, shows a significant improvement in detection accuracy, with an EER of 11%. This suggests that the images judged to be only lightly pornographic are the ones that the optimal classifier is most likely to mislabel. In other words, the extracted features appear to measure the degree to which an image is pornographic.


Figure 8: FAR and FRR results for (a) data set 1 and (b) data set 2 in experiment 2.

Figure 9 shows the first five levels of the thirteen-level optimal MET. On inspection, the blob height ratio (Feature 4) is the most important feature in determining whether or not an image is pornographic. Next comes the skin percentage (Feature 0), followed by the Hu moments (Features 6, 7, 8, 9 and 12). Applying the optimal MET to the test data set gives FAR and FRR values of 15% and 19.5% respectively. Note that 90% of the FAR is due to the blob height ratio node at level 0, and 70% of the FRR is due to the skin percentage nodes at level 1.

Given that the blob height ratio feature correlates strongly with the skin percentage feature (established by inspecting the data sets), the results suggest that the amount of skin colour is the best indicator of whether or not an image contains pornographic content. The next most important set of features relates to the shape of the skin blobs. The features relating to the number of blobs and to limb-like objects were found to be relatively unimportant; indeed, there is very little correlation between these features and the pornographic content of an image.


Figure 9: The first five levels of the thirteen-level optimal MET (P means pornographic image and NP means non-pornographic).

5. CONCLUSION AND CURRENT WORK

This paper has described a system for detecting pornographic images that uses multiple features derived from skin blobs as input to an MLP neural network classifier. Although the classifier's EERs of 18% and 11% in the two experiments are not especially impressive, it must be remembered that the non-pornographic images in the data sets were not trivial cases; in other words, all of them contained large areas of skin-like pixels. From the results and discussion presented in Section 4, it is clear that using multiple features to decide whether or not an image is pornographic increases the accuracy of the decision. Because there is no standard test database, it is not possible to compare this system with those of other groups working in the same area.

Current work aims to improve the feature extraction process, including (i) multi-scale edge detection and snakes to extract the skin regions more accurately, and (ii) the identification of additional features, such as the number of curvature points in the skin regions, which may correspond to breasts.

6. REFERENCES

[1] M. J. Jones and J. M. Rehg, Statistical color models with applications to skin detection, Technical Report CRL 98/11, Compaq Cambridge Research Laboratory, 1998, pp. 1-23.
[2] D. A. Forsyth and M. M. Fleck, Automatic detection of human nudes, International Journal of Computer Vision, vol. 32, no. 1, pp. 63-77, 1999.
[3] J. Z. Wang, System for screening objectionable images, Computer Communications, vol. 21, no. 15, pp. 1355-1360, 1998.
[4] A. Albiol et al., A simple and efficient face detection algorithm for video database applications, in Proceedings of IEEE International Conference on Image Processing, 2000, pp. 239-242.
[5] K. Sobottka and I. Pitas, Face localization and facial feature extraction based on shape and color information, in Proceedings of IEEE International Conference on Image Processing, 1996, pp. 236-241.
[6] M. H. Yang and N. Ahuja, Detecting human faces in color images, in Proceedings of IEEE International Conference on Image Processing, 1998, pp. 127-130.
[7] A. Albiol, L. Torres, and E. J. Delp, Optimum color spaces for skin detection, in Proceedings of IEEE International Conference on Image Processing, 2001, pp. 681-684.
[8] R. C. Gonzalez and R. E. Woods, Digital Image Processing, Second Edition, Prentice-Hall, New Jersey, 2002.
[9] M. K. Hu, Visual pattern recognition by moment invariants, IRE Transactions on Information Theory, vol. 8, pp. 179-187, 1962.
[10] T. M. Mitchell, Machine Learning, McGraw-Hill, New York, 1997.
