Base Paper
1 INTRODUCTION
value known to a client of the data. Let $v_i(t)$ denote the value of the ith data item at the data source at time t; and let the value the data item known to the client be $u_i(t)$. Then the data incoherency at the client is given by $|v_i(t) - u_i(t)|$. For a data item which needs to be refreshed at an incoherency bound C, a data refresh message is sent to the client as soon as data incoherency exceeds C, i.e., $|v_i(t) - u_i(t)| > C$.
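The refresh rule above can be illustrated with a minimal sketch. It assumes a discrete trace of source values and a client that starts with the first value; the function name `refresh_points` and the sample trace are illustrative assumptions, not part of the paper.

```python
def refresh_points(source_values, bound):
    """Return the time indices at which a refresh is pushed to the client.

    A refresh is sent as soon as |v(t) - u(t)| > bound, where u(t) is the
    value most recently pushed to the client.
    """
    if not source_values:
        return []
    pushed = [0]                      # assumption: client starts with the first value
    client_value = source_values[0]
    for t, v in enumerate(source_values[1:], start=1):
        if abs(v - client_value) > bound:
            client_value = v          # push the new value to the client
            pushed.append(t)
    return pushed


# Hypothetical trace, for illustration only.
trace = [10.0, 10.2, 10.6, 10.7, 11.3, 11.2]
pushes = refresh_points(trace, bound=0.5)
```

With this trace, small movements around the last pushed value generate no message; only the changes that carry the client's copy more than 0.5 away from the source trigger a push.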
Network of data aggregators: Data refresh from data sources to clients can be done using push or pull based mechanisms. In a push based mechanism, data sources send update messages to clients on their own, whereas in a pull based mechanism data sources send messages to the client only when the client makes a request. We assume the push based mechanism for data transfer between data sources and clients. For scalable handling of push based data dissemination, networks of data aggregators are proposed in the literature [...]. In such a network of data aggregators, data refreshes occur from data sources to the clients through one or more data aggregators.
In this paper we assume that each data aggregator maintains its configured incoherency bounds for various data items. From a data dissemination capability point of view, each data aggregator (DA) is characterized by a set of ($d_i$, $c_i$) pairs, where $d_i$ is a data item which the DA can disseminate at an incoherency bound $c_i$. The configured incoherency bound of a data item at a data aggregator can be maintained using any of the following methods: (a) The data source refreshes the data value of the DA whenever the DA's incoherency bound is about to get violated. This method has scalability problems. (b) Data aggregators with tighter incoherency bound help the DA to maintain its incoherency bound in a scalable manner, as explained in [...].
Rajeev Gupta is with IBM Research, New Delhi. E-mail: grajeev@in.ibm.com
Krithi Ramamritham is with Indian Institute of Technology, Mumbai. E-mail: krithi@cse.iitb.ac.in
Manuscript received May [...].
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
Example [...]: In a network of data aggregators managing data items d[...] to d[...], various aggregators can be characterized as:
a[...]: {(d[...], [...]), (d[...], [...])}
a[...]: {(d[...], [...]), (d[...], [...]), (d[...], [...])}
Aggregator a[...] can serve values of d[...] with an incoherency bound greater than or equal to [...], whereas a[...] can disseminate the same data item at a looser incoherency bound of [...] or more. In such a network of aggregators of multiple data items, all the nodes can be considered as peers, since a node $a_i$ can help another node $a_k$ to maintain the incoherency bound of the data item d[...] (the incoherency bound of d[...] at $a_i$ is tighter than that at $a_k$) but the node $a_i$ gets values of another data item d[...] from $a_k$.
than one data item.
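A minimal sketch of the (data item, incoherency bound) characterization of aggregators follows. The aggregator names, item names, and bound values are hypothetical placeholders chosen for illustration; they are not the values of the paper's example.

```python
# Hypothetical capabilities: aggregator -> {data item: configured incoherency bound}.
# All names and numbers below are illustrative assumptions.
aggregators = {
    "a1": {"x": 0.5, "z": 0.2},
    "a2": {"x": 1.0, "y": 0.1},
}


def can_serve(capabilities, item, requested_bound):
    """A DA can serve `item` at any bound at least as loose as its configured one."""
    return item in capabilities and requested_bound >= capabilities[item]


def candidate_aggregators(aggregators, item, requested_bound):
    """Names of all DAs able to disseminate `item` at `requested_bound`."""
    return sorted(
        name for name, caps in aggregators.items()
        if can_serve(caps, item, requested_bound)
    )
```

For instance, under these assumed values, `candidate_aggregators(aggregators, "x", 0.7)` returns only the aggregator with the tighter configured bound, while a request at bound 1.0 can be served by either.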
Secondly, if a single DA can disseminate all three data items required to answer the client query, the DA can construct a composite data item $d_q$ corresponding to the client query over the three data items and disseminate the result to the client so that the query incoherency bound is not violated. It is obvious that if we get the query result from a single DA, the number of refreshes will be minimum (as data item updates may cancel out each other, thereby maintaining the query results within the incoherency bound). As different data aggregators disseminate different subsets of data items, no data aggregator may have all the data items required to execute the client query, which is indeed the case in Example [...]. Further, even if an aggregator can refresh all the data items, it may not be able to satisfy the query coherency requirements. In such cases the query has to be executed with data from multiple aggregators.
A third option is to divide the query into a number of sub-queries and get their values from individual DAs. In that case, the client query result is obtained by combining the results of multiple sub-queries. For the DAs given in Example [...], the query Q can be divided in two alternative ways:
Plan [...]: Result of the sub-query over d[...] and d[...] is served by a[...], whereas the value of d[...] is served by a[...].
Plan [...]: Value of d[...] is served by a[...], whereas the result of the sub-query over d[...] and d[...] is served by a[...].
In both the plans, combining the sub-query values at the client gives the query result. But selecting the optimal plan among various options is not trivial. Intuitively, we should be selecting the plan with the lesser number of sub-queries. But that is not guaranteed to be the plan with the least number of messages. Further, we should select the sub-queries such that updates to various data items appearing in a sub-query have more chances of canceling each other, as that will reduce the need for refresh to the client. In the above example, if updates to d[...] and d[...] are such that when d[...] increases, d[...] decreases, and vice versa, then selecting plan [...] may be beneficial. We give a method to select the query plan based on these observations.
While solving the above problem, we ensure that each data item for a client query is disseminated by one and only one data aggregator. Although a query can be divided in such a way that a single data item is served by multiple DAs (e.g., a query over three data items d[...] is divided into two sub-queries of two data items each), in doing so the same data item is processed at multiple aggregators, increasing the unnecessary processing load. Further, in case of paid data subscriptions, it is not prudent to get the same data item from multiple sources. By dividing the client query into disjoint sub-queries we ensure that a data item update is processed only once for each query.
Sub-query incoherency bounds are required to be derived using the query incoherency bounds such that, besides satisfying the client coherency requirements, the chosen DA (where the sub-query is to be executed) is capable of satisfying the allocated sub-query incoherency bound. For example, in plan [...], the incoherency bound allocated to the sub-query over d[...] and d[...] should be greater than
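The two requirements stated above (every sub-query must be servable by its DA, and sub-queries must be disjoint while covering the whole query) can be checked with a short sketch. The aggregator contents, item names, and plans below are hypothetical and only mirror the shape of the discussion.

```python
def is_valid_plan(query_items, plan, aggregators):
    """Check a candidate plan.

    plan:        dict mapping aggregator name -> set of data items in its sub-query
    aggregators: dict mapping aggregator name -> set of data items it disseminates
    """
    covered = set()
    for name, sub_query in plan.items():
        if not sub_query <= aggregators[name]:   # the DA must disseminate every item
            return False
        if covered & sub_query:                  # sub-queries must be disjoint
            return False
        covered |= sub_query
    return covered == set(query_items)           # the query must be fully covered


# Hypothetical aggregators and plans, for illustration only.
das = {"a1": {"x", "z"}, "a2": {"x", "y", "u"}}
query = {"x", "y", "z"}
plan_one = {"a1": {"x", "z"}, "a2": {"y"}}
plan_two = {"a1": {"z"}, "a2": {"x", "y"}}
overlapping = {"a1": {"x", "z"}, "a2": {"x", "y"}}
```

Both `plan_one` and `plan_two` pass the check, whereas `overlapping` is rejected because the item `x` would be processed at two aggregators.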
GUPTA ET AL.: QUERY PLANNING FOR CONTINUOUS QUERIES IN DYNAMIC DATA DISSEMINATION NETWORKS
Symbol    Description
ak
Dk
dkj
tkj
q         Client query.
Cq
nq
dqi
vqi(t)
wqi
Vq(t)
qk        Sub-query of q to be executed at ak.
Cqk
Rq
Figure 2. Number of pushes vs. data sumdiff: (a) C=0.001, (b) C=0.01, (c) C=0.1
with an incoherency bound C, the data dissemination cost is proportional to $R_s/C^2$. In the next section, we use this data dissemination cost model for developing the cost model for additive aggregation queries.

3 COST MODEL FOR ADDITIVE AGGREGATION QUERIES
Consider an additive query over two data items P and Q with weights $w_p$ and $w_q$, respectively; and we want to estimate its dissemination cost. If data items are disseminated separately, the query sumdiff will be:

$$R_{data} = w_p R_p + w_q R_q = w_p \sum_i |p_i - p_{i-1}| + w_q \sum_i |q_i - q_{i-1}|$$

Instead, if the aggregator uses the information that the client is interested in a query over P and Q (rather than their individual values), it creates and pushes a composite data item ($w_p p + w_q q$); then the query sumdiff will be:

$$R_{query} = \sum_i |w_p (p_i - p_{i-1}) + w_q (q_i - q_{i-1})|$$

$R_{query}$ is clearly less than or equal to $R_{data}$. Thus we need to estimate the sumdiff of an aggregation query (i.e., $R_{query}$) given the sumdiff values of individual data items (i.e., $R_p$ and $R_q$). Only data aggregators are in a position to calculate $R_{query}$, as different data items may be disseminated from different sources. We develop the query cost model in two stages.

3.1 Modeling Correlation between Data Dynamics
From Equations [...] and [...] we can see that if two data items are correlated such that as the value of one data item increases, that of the other data item also increases, then $R_{query}$ will be closer to $R_{data}$. On the other hand, if the data items are inversely correlated then $R_{query}$ will be less compared to $R_{data}$. Thus, intuitively, we can represent the relationship between $R_{query}$ and sumdiff values of the individual data items using a correlation measure associated with the pair of data items. Specifically, if $\rho$ is the correlation measure then $R_{query}$ can be written as:

$$R_{query}^2 \propto (w_p^2 R_p^2 + w_q^2 R_q^2 + 2 \rho\, w_p R_p w_q R_q)$$

The correlation measure is defined such that $-1 \le \rho \le +1$. So, $R_{query}$ will always be less than $|w_p R_p + w_q R_q|$ (as explained earlier) and always be more than $|w_p R_p - w_q R_q|$. The above relation can be better understood from its similarity with the standard deviation of the sum of two random variables [...]. For data items P and Q, $\rho$ can be calculated as:

$$\rho = \frac{\sum_i (p_i - p_{i-1})(q_i - q_{i-1})}{\sqrt{\sum_i (p_i - p_{i-1})^2 \, \sum_i (q_i - q_{i-1})^2}}$$

In Appendix B, we discuss a method for efficiently calculating $\rho$.

3.2 Query based Normalization
Suppose we want to compare the cost of two queries: a SUM query involving two data items and an AVG query involving the same set of data items. Let the query incoherency bound for the SUM and the AVG queries be C[...] ([...]C) and C[...] ([...]C), respectively. From Equation [...], sumdiff of the SUM query will be double that of the AVG query. Hence the query evaluation cost (as per $R/C^2$) of the SUM query will be half that of the AVG query. But intuitively, disseminating the AVG of two data items at a given incoherency bound should require the same number of refresh messages as their SUM with double the incoherency bound. Thus there is a need to normalize query costs. From a query execution cost point of view, a query with weights $w_i$ and incoherency bound C is the same as a query with weights [...]$w_i$ and incoherency bound [...]C. So, while normalizing, we need to ensure that both query weights and incoherency bounds are multiplied by the same factor. Normalized query sumdiff is given by:

$$R_{query}^2 = \frac{w_p^2 R_p^2 + w_q^2 R_q^2 + 2 \rho\, w_p R_p w_q R_q}{w_p^2 + w_q^2 + 2 \rho\, w_p w_q}$$

i.e., the value of the normalizing factor for $R_{query}$ should be $1 / \sqrt{w_p^2 + w_q^2 + 2 \rho\, w_p w_q}$. The value of the incoherency bound has to be adjusted by the same factor. Normalization ensures that queries with arbitrary values of weights can be compared for execution cost estimates. Equation [...] can be extended to get the query sumdiff for any general weighted aggregation query given by Equation [...] as:

$$R_q^2 = \frac{\sum_{i=1}^{n_q} w_{qi}^2 R_i^2 + \sum_{i=1}^{n_q} \sum_{j=1, j \ne i}^{n_q} \rho_{ij}\, w_{qi} w_{qj} R_i R_j}{\sum_{i=1}^{n_q} w_{qi}^2 + \sum_{i=1}^{n_q} \sum_{j=1, j \ne i}^{n_q} \rho_{ij}\, w_{qi} w_{qj}}$$

3.3 Validating the Query Cost Model
To validate the query cost model we performed simulations by constructing [...] weighted aggregation queries using the stock data, with each query consisting of [...] data items with data weights uniformly distributed between [...] and [...]. For each query, the number of refreshes was counted for various incoherency bounds such that their normalized values (using the normalization factor as in Equation [...]) are between [...] and [...]. Figure 3(a) shows that the number of messages is proportional to the normalized query sumdiff (as calculated using Equation [...]) if their normalized incoherency bounds are the same. In this case, the PPMCC value is found to be [...]. Similarly, Figure 3(b) shows the dependence of the number of refreshes on C to illustrate that the relationship that holds between them for a single data item also holds for a query with multiple data items. We use this query cost model for query planning, which is presented next.
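The quantities used in the cost model can be sketched in a few lines. This is an illustrative sketch only: the helper names are assumptions, traces are plain lists of values, and the estimate takes the proportionality constant of the correlation-based relation to be 1.

```python
import math


def diffs(trace):
    """Successive differences of a trace of values."""
    return [b - a for a, b in zip(trace, trace[1:])]


def sumdiff(trace):
    """Sum of absolute successive differences of a data item."""
    return sum(abs(d) for d in diffs(trace))


def correlation(p, q):
    """Correlation measure between the dynamics of two data items."""
    dp, dq = diffs(p), diffs(q)
    num = sum(x * y for x, y in zip(dp, dq))
    den = math.sqrt(sum(x * x for x in dp) * sum(y * y for y in dq))
    return num / den if den else 0.0


def query_sumdiff_exact(p, q, wp, wq):
    """Sumdiff of the composite data item wp*p + wq*q."""
    return sum(abs(wp * x + wq * y) for x, y in zip(diffs(p), diffs(q)))


def query_sumdiff_estimate(rp, rq, wp, wq, rho):
    """Estimate from individual sumdiffs (assumed proportionality constant 1)."""
    return math.sqrt((wp * rp) ** 2 + (wq * rq) ** 2 + 2 * rho * wp * rp * wq * rq)


# Hypothetical traces: q moves against p, q2 moves with p.
p = [1, 2, 3, 4]
q = [4, 3, 2, 1]
q2 = [2, 4, 6, 8]
```

For the inversely correlated pair (`p`, `q`) with unit weights, the updates cancel and the composite item never changes, so both the exact value and the estimate are 0 even though the items disseminated separately would have a sumdiff of 6. For the positively correlated pair (`p`, `q2`), the estimate coincides with the separate-dissemination value.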
4
Figure 3: Query cost validation with varying (a) Sumdiff (b) Incoherency bound
For executing an incoherency bounded con...
$$Z_q = \sum_{k=1}^{N} \frac{R_{qk}}{C_{qk}^2}$$
queries, using Equation [...] we can calculate sumdiff values of all the sub-queries. Thus we need to minimize $Z_q$ (given by Equation [...]) subject to Constraint [...] (query incoherency bound is satisfied) and Constraint [...] (sub-query incoherency bound is satisfied). We can get a closed-form expression by solving Equation [...] with Equation [...] using the Lagrange multiplier scheme (see Appendix D). In that scheme we minimize

$$\sum_{k=1}^{N} \frac{R_{qk}}{C_{qk}^2} + \lambda \Big( \sum_{k=1}^{N} C_{qk} - C_q \Big)$$

for a constant $\lambda$ to get values of the $C_{qk}$'s as:

$$C_{qk} = \frac{C_q R_{qk}^{1/3}}{\sum_{j=1}^{N} R_{qj}^{1/3}}$$

$$Z_q = \frac{1}{C_q^2} \Big( \sum_{k=1}^{N} R_{qk}^{1/3} \Big)^3$$

where |q| is the number of data items in the query and max|$D_k$| is the maximum number of data items disseminated by any DA. For each sub-query m ∈ $M_q$, its sumdiff $R_m$ can be calculated using Equation [...]. Different criteria can be used to select a sub-query in each iteration of various greedy heuristics. All data items covered by the selected sub-query are removed from all the remaining sub-queries in $M_q$ before performing the next iteration. It should be noted that sub-queries for DAs can be null.
Now we describe two criteria for the greedy heuristics: min-cost, where the estimate of query execution cost is minimized; and max-gain, where the estimated gain due to executing the query using sub-queries is maximized.
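As a hedged illustration of the Lagrange-multiplier result, the sketch below allocates a query incoherency bound among sub-queries in proportion to the cube roots of their sumdiffs and evaluates the cost $\sum_k R_{qk}/C_{qk}^2$. It assumes all sub-query sumdiffs are positive and ignores the per-DA feasibility constraint; the function names and the numbers are illustrative assumptions.

```python
def allocate_bounds(sub_query_sumdiffs, query_bound):
    """Split query_bound so that each share is proportional to R_k^(1/3).

    Assumes every sumdiff is strictly positive.
    """
    cube_roots = [r ** (1.0 / 3.0) for r in sub_query_sumdiffs]
    total = sum(cube_roots)
    return [query_bound * c / total for c in cube_roots]


def plan_cost(sub_query_sumdiffs, bounds):
    """Estimated number of refreshes: sum over sub-queries of R_k / C_k^2."""
    return sum(r / (c * c) for r, c in zip(sub_query_sumdiffs, bounds))


# Hypothetical sub-query sumdiffs and query bound.
sumdiffs = [8.0, 1.0]
bounds = allocate_bounds(sumdiffs, query_bound=3.0)
optimal_cost = plan_cost(sumdiffs, bounds)
equal_split_cost = plan_cost(sumdiffs, [1.5, 1.5])
```

Here the more dynamic sub-query receives the looser bound (2 versus 1), giving a cost of 3, which matches $(\sum_k R_{qk}^{1/3})^3 / C_q^2 = 27/9$ and is lower than the cost of 4 obtained by splitting the bound equally.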
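result ← ∅
while $M_q$ ≠ ∅
    choose a sub-query $m_i$ ∈ $M_q$ with the selection criterion
    result ← result ∪ {$m_i$}; $M_q$ ← $M_q$ − {$m_i$}
    for each data item d ∈ $m_i$
        for each $m_j$ ∈ $M_q$
            $m_j$ ← $m_j$ − {d}
            if $m_j$ = ∅, $M_q$ ← $M_q$ − {$m_j$}
            else calculate sumdiff for modified $m_j$
return result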
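The greedy loop above can be sketched in runnable form as follows. This is a sketch under stated assumptions rather than the authors' implementation: sub-queries are sets of item names keyed by aggregator, the sumdiff estimator and the scoring criterion are passed in as functions, and the example uses a simplified per-item cost score and an additive sumdiff purely for illustration.

```python
def greedy_sub_queries(maximal_sub_queries, sumdiff_of, score):
    """Greedy selection of disjoint sub-queries.

    maximal_sub_queries: dict aggregator -> set of query items it can serve
    sumdiff_of(items):   estimated sumdiff of a sub-query over `items`
    score(items, r):     criterion value for a sub-query; lower is better
    """
    # Copy the sets so the caller's data is not modified; drop null sub-queries.
    remaining = {k: set(v) for k, v in maximal_sub_queries.items() if v}
    result = {}
    while remaining:
        best = min(
            remaining,
            key=lambda k: score(remaining[k], sumdiff_of(remaining[k])),
        )
        chosen = remaining.pop(best)
        result[best] = chosen
        # Remove covered items from the remaining sub-queries; drop emptied ones.
        for k in list(remaining):
            remaining[k] -= chosen
            if not remaining[k]:
                del remaining[k]
    return result


# Hypothetical inputs: per-item sumdiffs and an additive (uncorrelated) estimate.
item_sumdiff = {"x": 4.0, "y": 1.0, "z": 2.0}
maximal = {"a1": {"x", "z"}, "a2": {"x", "y"}}
plan = greedy_sub_queries(
    maximal,
    sumdiff_of=lambda items: sum(item_sumdiff[i] for i in items),
    score=lambda items, r: r / len(items),   # assumed cost-per-item criterion
)
```

A gain-based criterion can be plugged in through the same `score` argument by returning the negated gain, since the loop always picks the lowest score.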
allocate the query incoherency bound among them using any of the convex optimization techniques, as discussed in Section [...]. But this method of first deriving sub-queries and then allocating the incoherency bounds has a problem, which is described next.
plan[...] as the optimal plan. But from the specification of aggregators a[...] and a[...] in Example [...], we see that it is not possible for plan[...] to satisfy the client specified incoherency bound, as the tightest incoherency bound that can be satisfied by the selected aggregators ($T_{plan[...]}$ = [...]) is greater than the query incoherency bound. Thus, although there exists a plan ($T_{plan[...]}$ = [...]) which can satisfy the client query incoherency bound while minimizing the query execution cost, the above method cannot ensure that such a plan will be selected. What we need is a compromise between query satisfiability and performance. Instead of selecting the sub-queries without considering the data incoherency bounds for the selected data aggregators, we select sub-queries using

$$R_m^{1/3} + \alpha \frac{T_m}{C_q R_m^{1/3}}$$

as the expanded objective function. The second term ensures that while selecting the optimal plan we prefer data aggregators having tighter data incoherency bounds (lower values of $T_m$), thus higher chances of satisfying the query. The tuning parameter $\alpha$ can be used to balance the objectives of minimizing query execution cost (through sub-query selection) and meeting the query coherency requirements. We use $T_m / (C_q R_m^{1/3})$ in the second term as, according to Equation [...], optimal incoherency bound allocation is likely to be done proportional to $C_q R_m^{1/3}$. In Section [...], we measure the effects of the tuning parameter $\alpha$ on the query satisfiability.
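The effect of the second term can be seen in a small sketch. It assumes the expanded objective has the form of a cube-root cost term plus a tuning-parameter-weighted penalty $T_m / (C_q R_m^{1/3})$; the parameter name `alpha` and all numbers are illustrative assumptions.

```python
def expanded_objective(r_m, t_m, c_q, alpha):
    """Cost term plus a penalty that grows with the tightest bound t_m of the DA.

    r_m:   estimated sumdiff of the sub-query (assumed positive)
    t_m:   tightest incoherency bound at which the DA can serve the sub-query
    c_q:   client query incoherency bound
    alpha: tuning parameter trading execution cost against satisfiability
    """
    root = r_m ** (1.0 / 3.0)
    return root + alpha * t_m / (c_q * root)


# Two hypothetical DAs offering the same sub-query with different tightest bounds.
loose_da = expanded_objective(r_m=8.0, t_m=4.0, c_q=2.0, alpha=0.5)
tight_da = expanded_objective(r_m=8.0, t_m=1.0, c_q=2.0, alpha=0.5)
cost_only = expanded_objective(r_m=8.0, t_m=4.0, c_q=2.0, alpha=0.0)
```

With `alpha` set to 0 the criterion reduces to the pure cost term, while a positive `alpha` makes the aggregator with the tighter bound score lower, so a minimizing heuristic prefers it.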
$$G_m = \sum_i w_i R_i - \sqrt{\sum_i w_i^2 R_i^2 + \sum_i \sum_{j \ne i} \rho_{ij}\, w_i w_j R_i R_j}$$
where $R_i$ is the sumdiff of the data item $d_i$. This algorithm can be implemented by using a maximization criterion on $G_m$ and $|m|$ to get the set of sub-queries and corresponding DAs. Then we use the convex optimization method outlined in Section [...] to allocate incoherency bounds among sub-queries. To tackle the query satisfiability issue, the query gain (Equation [...]) is modified to:

$$G'_m = G_m - \alpha \frac{\sum_i w_i T_i}{C_q R_m^{1/3}}$$
To summarize, for a given client query and a network of data aggregators, first we get the maximal sub-queries for all data aggregators. We use heuristics described in this section to derive sub-queries. In these heuristics, extended objective functions are used to have the desired level of query satisfiability. Then the technique explained in Section [...] is used to allocate the query incoherency bound among the derived sub-queries.
5 PERFORMANCE EVALUATION
For performance evaluation we simulated a network of data aggregators of [...] stock data items over [...] aggregator nodes such that each aggregator can disseminate combinations of [...] to [...] data items. Data items were assigned to different aggregators using a zipf distribution (skew = [...]), assuming that some popular data items will be disseminated by more DAs. Data incoherency bounds for various aggregator data items were chosen uniformly between [...] and [...]. We created [...] portfolio queries such that each query has [...] to [...] randomly selected data items (using a zipf distribution with the same default skew) with weights varying between [...] and [...]. These queries were executed with incoherency bounds between [...] and [...] (i.e., [...] of the query value). Although here we present results for stock traces (man-made data), similar results were obtained for sensor traces (natural data) as well [...]. In the first set of experiments, we kept data incoherency bounds at the data aggregators very low so that query satisfiability can be ensured while keeping the default value of $\alpha$ as [...].
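A setup of this kind can be approximated with a short sketch that assigns data items to an aggregator with Zipf-skewed popularity. The item count, the number of items per aggregator, the skew value, and the seed below are illustrative assumptions, not the parameters of the reported experiments.

```python
import random


def zipf_weights(n, skew):
    """Unnormalized Zipf weights: the item of rank r gets weight 1 / r**skew."""
    return [1.0 / (rank ** skew) for rank in range(1, n + 1)]


def assign_items(num_items, items_per_da, skew, rng):
    """Pick a set of distinct item indices for one aggregator.

    Popular (low-index) items are drawn more often. Requires
    items_per_da <= num_items.
    """
    weights = zipf_weights(num_items, skew)
    chosen = set()
    while len(chosen) < items_per_da:
        chosen.add(rng.choices(range(num_items), weights=weights, k=1)[0])
    return chosen


rng = random.Random(0)                  # fixed seed for repeatability
da_items = assign_items(num_items=20, items_per_da=5, skew=1.0, rng=rng)
```

Repeating `assign_items` once per aggregator yields a network in which a few popular items are disseminated by many DAs, which is the property the sub-query based algorithms rely on.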
gain. This algorithm is described in Section [...].
Figure [...] shows the average number of refreshes required for query incoherency bounds of [...]. The naïve algorithm requires more than five times the number of messages compared to the min-cost and max-gain algorithms. For an incoherency bound of [...], each query, on average, requires [...] messages if it is executed just by optimizing the incoherency bound (optc), compared to [...] when we select the query plan using the max-gain algorithm. The gains of our algorithms increase further as the number of data items disseminated by data aggregators increases: naïve requires more than [...] times the messages when each data aggregator disseminates [...] data items. This happens as, with more data items per DA, sub-query based algorithms result in larger sub-queries and we select sub-queries intelligently.
In the above experiment, for creating queries, we selected the query data items with the same zipf distribution (skew = [...]) as we used for selecting data items to be served by DAs. But if we reduce the skew (i.e., having queries with less popular data items), we found that the performance of sub-query based algorithms suffers. This happens because, for better performance, sub-query based algorithms depend on query data items being disseminated by the same DAs. For queries with less popular data items, the probability of this happening is less; hence the inferior performance.
Further, although the optimization problem is similar to covering a set of data items (query) using its subsets (sub-queries), for which the greedy min-cost algorithm is considered to be most efficient [...], we see that the max-gain algorithm requires [...] less messages compared to the min-cost approach. Reasons for the max-gain algorithm performing better than other algorithms are explored in the next set of experiments.

5.2 Effects of Algorithmic Parameters
This set of experiments was performed to get an insight into various characteristics of our sub-query selection method which lead it to perform better compared to other options. We consider effects of three parameters on the query performance: data dynamics, correlation between data dynamics, and the query satisfiability parameter $\alpha$.

5.2.1 Effect of data dynamics
In this set of experiments we wanted to see whether there is any definite relationship between data dynamics and the size of the sub-query in which that data item appears. In this experiment with [...] data items, [...] DAs were simulated such that each DA can disseminate a different set of [...] data items. Then [...] queries were created, each with [...] randomly chosen data items. In the optimal query plan, each query will be executed with two sub-queries: one consisting of [...] data items and another with a single data item (a plan with three one-item sub-queries will be trivially inefficient). As the query has only [...] data items, only [...] such query plans are possible. We simulated all these options to get the best query plan. For these optimal query plans, Figure [...](a) shows the variation of average sub-query size in which a particular data item appears versus the sumdiff value of the data item. We can see that if a data item is more dynamic, in the optimal plan it is more likely to be part of a larger sub-query. This is an important observation, as it indicates that for efficient query evaluation, more dynamic data items should be part of a larger sub-query. This phenomenon can be explained by the fact that executing a query as a combination of sub-queries will always be more efficient compared to getting the data items independently. By combining more dynamic data items we are likely to gain more. For comparison, we also show the curve for the sub-query selection based on the max-gain algorithm. It can be seen that by using the max-gain algorithm, we achieve our objective of including more dynamic data items as part of larger sub-queries. In comparison, for the min-cost algorithm, the most dynamic data item is more likely to be disseminated as a single item query. This happens because the sumdiff value of a more dynamic data item will be high; thus in each iteration of the greedy algorithm (Figure [...]) there is less chance of selecting a sub-query with a more dynamic data item. Thus, it is very likely that the most dynamic data item will be disseminated as a single item sub-query, resulting in bad performance of the client query. For the max-gain and min-cost algorithms, similar results were obtained for larger query sizes as well, as shown in Figure [...](b). For generating the results of Figure [...](b), we simulated [...] data aggregators each disseminating [...] data items while each query had [...] data items.

5.2.2 Effect of correlation between data dynamics
To measure the effects of correlation between data dynamics (as measured using the correlation measure $\rho$) on the query performance, we compared the query performance with the case when all the data items are assumed to be
5.3 Overheads of Query Planning
Now we report the time overheads for various query planning operations. We measured these costs by varying the number of data items being disseminated by the network between [...] and [...]. These experiments were done on a Windows XP machine with a [...] GHz Intel Core Duo CPU and [...] GB RAM. For various sumdiff based algorithms, we need to maintain the sumdiff values of various data items (proportional to the number of data items being disseminated) and the correlation measure for each pair of data items (proportional to the square of the number of data items), in addition to the query dependent planning cost. For a trace size of [...] for each data item, both the cost of maintaining sumdiff per data item and the cost of maintaining the correlation measure for each pair of data items were found to be in the range of [...] microseconds. Query planning cost (time required to derive sub-queries and their associated incoherency bounds) for the naïve and optc algorithms was found to be approximately [...] microsecond per query, whereas the same for the random, minCost and maxGain algorithms was found to be [...], [...] and [...] milliseconds. The higher cost of query planning for the sumdiff based algorithms is justified by the savings we achieve in terms of number of messages for the whole duration of the continuous query. The query planning cost of random and mincost is higher as they require more iterations of the algorithm in Figure [...] (i.e., more sub-queries) compared to the maxGain algorithm.

In these queries, even if values of one or more data items change (changing their individual incoherencies), it is possible that query incoherency remains unchanged. Thus for a given MAX query it is possible to have an individual
$$R_q = \sum_{i=1}^{n_q} R_{d_{qi}} \Big( \prod_{j=1, j \ne i}^{n_q} p(x_i > x_j) \Big) \le \max( R_{d_{qi}} \mid 1 \le i \le n_q )$$

where $R_{d_{qi}}$ is the sumdiff of the ith data item of the query q.
7 RELATED WORK
We divide the related work on scalable answering of aggregation queries over a network of data aggregators into two interrelated topics.
Answering Incoherency Bounded Aggregation Queries: Various mechanisms for efficiently answering incoherency bounded aggregation queries over continuously changing data items are proposed in the literature [...]. Our work distinguishes itself by employing sub-query based query evaluation to minimize the number of
items and executing them at specifically chosen data aggregators.
o Deciding the query plan using a sumdiff based mechanism, specifically by maximizing sub-query gains.
o Executing queries such that more dynamic data items are part of a larger sub-query.
We showed that the max-gain algorithm is very close to the optimal algorithm in selecting sub-queries based on data dynamics (Figure [...]). The query satisfiability parameter ($\alpha$) is employed for a trade-off between query satisfiability and query performance. For any value of the query satisfiability parameter, there is always a non-zero probability that a query will not get satisfied by the network of data aggregators. Usually data aggregators disseminating the same data item form a hierarchical network. In that case, even if a data aggregator cannot satisfy its assigned query, it can again apply the principles outlined in this paper to send a sub-query of the assigned query to its parents, which can disseminate the data item at a tighter incoherency bound. That will lead to poorer performance, outlining the trade-off between query satisfiability and performance. Developing efficient strategies for multiple invocations of our algorithm, considering the hierarchy of data aggregators, is an area for future research.
Another area for future research is changing a query plan as data dynamics change. We are calculating data sumdiff in a dynamic manner. If data sumdiff changes beyond a certain limit, the chosen query plan may not remain efficient. As a simple scheme, limits on changes to data sumdiff can be found for which the selected query plan remains optimal. Our work can also be used for extending the work proposed in [...] for construction and maintenance of a network of data aggregators so that end-to-end (sources to client) fidelity can be maximized. Our query cost model can also be used for other purposes such as load balancing various aggregators, multi-query execution, routing sensor data, etc. Using the cost model for these applications and developing the cost model for more complex queries is the third area of our future work.
REFERENCES
[1] A. Davis, J. Parikh and W. Weihl. Edge Computing: Extending Enterprise Applications to the Edge of the Internet. WWW.
[2] D. VanderMeer, A. Datta, K. Dutta, H. Thomas and K. Ramamritham. Proxy-Based Acceleration of Dynamically Generated Content on the World Wide Web. ACM Transactions on Database Systems (TODS), Vol. [...], June [...].
[3] J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman and B. Weihl. Globally Distributed Content Delivery. IEEE Internet Computing, Sept. [...].
[4] S. Rangarajan, S. Mukerjee and P. Rodriguez. User Specific Request Redirection in a Content Delivery Network. [...]th Intl. Workshop on Web Content Caching and Distribution (IWCW).
[5] S. Shah, K. Ramamritham and P. Shenoy. Maintaining Coherency of Dynamic Data in Cooperating Repositories. VLDB.
[6] T. H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein. Introduction to Algorithms. MIT Press and McGraw-Hill.
[7] Y. Zhou, B. Chin Ooi and Kian-Lee Tan. Disseminating Streaming Data in a Dynamic Environment: An Adaptive and Cost Based Approach. The VLDB Journal, Issue [...], pg. [...].
[8] Query cost model validation for sensor data. www.cse.iitb.ac.in/~grajeev/sumdiff/RaviVijay_BTP06.pdf.
[9] R. Gupta, A. Puri and K. Ramamritham. Executing Incoherency Bounded Continuous Queries at Web Data Aggregators. WWW.
[10] A. Papoulis. Probability, Random Variables and Stochastic Processes. McGraw-Hill.
[11] C. Olston, J. Jiang and J. Widom. Adaptive Filter for Continuous Queries over Distributed Data Streams. SIGMOD.
[12] S. Shah, K. Ramamritham and C. Ravishankar. Client Assignment in Content Dissemination Networks for Dynamic Data. VLDB.
[13] NEFSC Scientific Computer System. http://sole.wh.whoi.edu/~jmanning/cruise/serve.cgi
[14] S. Madden, M. J. Franklin, J. Hellerstein and W. Hong. TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks. Proc. of [...]th Symposium on Operating Systems Design and Implementation.
[15] D. S. Johnson and M. R. Garey. Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco, CA: Freeman.
[16] S. Zhu and C. Ravishankar. Stochastic Consistency and Scalable Pull-Based Caching for Erratic Data Sources. VLDB.
[17] D. Chu, A. Deshpande, J. Hellerstein and W. Hong. Approximate Data Collection in Sensor Networks using Probabilistic Models. ICDE.
[18] A. Deshpande, C. Guestrin, S. R. Madden, J. M. Hellerstein and W. Hong. Model-Driven Data Acquisition in Sensor Networks. VLDB.
[19] Pearson Product moment correlation coefficient. http://www.nyx.net/~tmacfarl/STAT_TUT/correlat.ssi
[20] Antonios Deligiannakis, Yannis Kotidis and Nick Roussopoulos. Processing Approximate Aggregate Queries in Wireless Sensor Networks. Information Systems, vol. [...], Issue [...], pg. [...].
[21] G. Cormode and M. Garofalakis. Sketching Streams through the Net: Distributed Approximate Query Tracking. VLDB.
[22] S. Agrawal, K. Ramamritham and S. Shah. Construction of a Temporal Coherency Preserving Dynamic Data Dissemination Network. RTSS.
[23] Brian Babcock and Chris Olston. Distributed Top-K Monitoring. SIGMOD.
[24] Adam Silberstein, Kamesh Munagala and Jun Yang. Energy Efficient Monitoring of Extreme Values in Sensor Networks. SIGMOD.
[25] N. Jain, D. Kit, P. Mahajan, P. Yalagandula, M. Dahlin and Y. Zhang. STAR: Self-Tuning Aggregation for Scalable Monitoring. VLDB.
[26] R. Gupta and K. Ramamritham. Optimized Query Planning of Continuous Aggregation Queries in Dynamic Data Dissemination Networks. WWW.
[27] S. Kashyap, J. Ramamritham, R. Rastogi and P. Shukla. Efficient Constraint Monitoring using Adaptive Thresholds. ICDE.
[28] D. S. Hochbaum. Approximation algorithms for the set covering and vertex cover problems. SIAM Journal on Computing, vol. [...].
[29] P. Edara, A. Limaye and K. Ramamritham. Asynchronous In-network Prediction: Efficient Aggregation in Sensor Networks. ACM Transactions on Sensor Networks, Volume [...], Number [...], August [...].
Rajeev Gupta received his BTech in Electronics Engineering from the Indian
Institute of Technology (IIT) Kharagpur, India. He is currently pursuing
his PhD in Computer Science at IIT Mumbai, India. He has been working
as a researcher at IBM Research, New Delhi, India, for the last 10 years.
Krithi Ramamritham received the PhD in Computer Science from
University of Utah and then joined the University of Massachusetts.
He is currently at IIT Bombay as Professor in the Department of
Computer Science. He is a fellow of IEEE and a fellow of ACM. He
has served on numerous program committees of conferences and
workshops. His editorial board contributions include IEEE Transactions, the Real Time Systems Journal, and the VLDB Journal.