
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, MANUSCRIPT ID

Query Planning for Continuous Aggregation Queries over a Network of Data Aggregators

Rajeev Gupta and Krithi Ramamritham, Fellow, IEEE
Abstract: Continuous queries are used to monitor changes to time-varying data and to provide results useful for online decision making. Typically, a user desires to obtain the value of some aggregation function over distributed data items, for example, to know the value of a portfolio for a client, or the AVG of temperatures sensed by a set of sensors. In these queries, a client specifies a coherency requirement as part of the query. We present a low-cost, scalable technique to answer continuous aggregation queries using a network of aggregators of dynamic data items. In such a network of data aggregators, each data aggregator serves a set of data items at specific coherencies. Just as various fragments of a dynamic web page are served by one or more nodes of a content distribution network, our technique involves decomposing a client query into sub-queries and executing the sub-queries on judiciously chosen data aggregators with their individual sub-query incoherency bounds. We provide a technique for getting the optimal set of sub-queries with their incoherency bounds, which satisfies the client query's coherency requirement with the least number of refresh messages sent from aggregators to the client. For estimating the number of refresh messages, we build a query cost model which can be used to estimate the number of messages required to satisfy the client-specified incoherency bound. Performance results using real-world traces show that our cost-based query planning leads to queries being executed using less than one third the number of messages required by existing schemes.

Index Terms: Algorithms, Continuous queries, Distributed query processing, Data dissemination, Coherency, Performance.

1 INTRODUCTION

Applications such as auctions, personal portfolio valuations for financial decisions, sensor-based monitoring, route planning based on traffic information, etc., make extensive use of dynamic data. For such applications, data from one or more independent data sources may be aggregated to determine if some action is warranted. Given the increasing number of such applications that make use of highly dynamic data, there is significant interest in systems that can efficiently deliver the relevant updates automatically. As an example, consider a user who wants to track a portfolio of stocks in different (brokerage) accounts. Stock data values from possibly different sources are required to be aggregated to satisfy the user's requirement. These aggregation queries are long-running queries, as data is continuously changing and the user is interested in notifications when certain conditions hold. Thus, responses to these queries are refreshed continuously. In these continuous query applications, users are likely to tolerate some inaccuracy in the results. That is, the exact data values at the corresponding data sources need not be reported as long as the query results satisfy user-specified accuracy requirements. For instance, a portfolio tracker may be happy with an accuracy of ….

Data incoherency: Data accuracy can be specified in terms of the incoherency of a data item, defined as the absolute difference between the value of the data item at the data source and the value known to a client of the data. Let vi(t) denote the value of the ith data item at the data source at time t, and let the value of the data item known to the client be ui(t). Then the data incoherency at the client is given by |vi(t) - ui(t)|. For a data item which needs to be refreshed at an incoherency bound C, a data refresh message is sent to the client as soon as the data incoherency exceeds C, i.e., |vi(t) - ui(t)| > C.

Network of data aggregators: Data refresh from data sources to clients can be done using push or pull based mechanisms. In a push based mechanism, data sources send update messages to clients on their own, whereas in a pull based mechanism data sources send messages to the client only when the client makes a request. We assume the push based mechanism for data transfer between data sources and clients. For scalable handling of push based data dissemination, networks of data aggregators are proposed in the literature [ ]. In such a network of data aggregators, data refreshes occur from data sources to the clients through one or more data aggregators.

In this paper we assume that each data aggregator maintains its configured incoherency bounds for various data items. From a data dissemination capability point of view, each data aggregator (DA) is characterized by a set of (di, ci) pairs, where di is a data item which the DA can disseminate at an incoherency bound ci. The configured incoherency bound of a data item at a data aggregator can be maintained using any of the following methods: (a) The data source refreshes the data value of the DA whenever the DA's incoherency bound is about to get violated. This method has scalability problems. (b) Data aggregator(s) with a tighter incoherency bound help the DA to maintain its incoherency bound in a scalable manner, as explained in [ ].


Rajeev Gupta is with IBM Research, New Delhi. E-mail: grajeev@in.ibm.com.
Krithi Ramamritham is with Indian Institute of Technology, Mumbai. E-mail: krithi@cse.iitb.ac.in.
Manuscript received May ….


Digital Object Identifier 10.1109/TKDE.2011.12

1041-4347/11/$26.00 2011 IEEE


Example 1: In a network of data aggregators managing a set of data items, various aggregators can be characterized as:
a1: {(d…, …), (d…, …)}
a2: {(d…, …), (d…, …), (d…, …)}
Aggregator a1 can serve values of one of its data items with an incoherency bound greater than or equal to its configured value, whereas a2 can disseminate the same data item only at a looser incoherency bound. In such a network of aggregators of multiple data items, all the nodes can be considered as peers, since a node ai can help another node ak to maintain the incoherency bound of one data item (the incoherency bound of that data item at ai is tighter than that at ak), while the node ai gets values of another data item from ak.

1.1 Aggregate Queries and their Execution

In this paper, we present a method for executing continuous multi-data aggregation queries, using a network of data aggregators, with the objective of minimizing the number of refreshes from data aggregators to the client. First, we give two motivating scenarios where there are various options for executing a multi-data aggregation query and one must select a particular option to minimize the number of messages.

Scenario 1: Consider a client query Q = … d1 + … d2 + … d3, where d1, d2, d3 are different stocks in a portfolio, with a required incoherency bound of …. We want to execute this query over the data aggregators given in Example 1, minimizing the number of refreshes.

Scenario 2: In a sensor network, consider an AVG query over a target set of sensors (say d1, d2, and d3) injected at a query node. In-network aggregation is used for energy-efficient propagation of aggregates [ ]. For constructing an aggregation tree connecting the target sensors and the query node, each node can select a path to the query node based on a certain preference factor. We want to select the in-network aggregation path such that the aggregation query gets executed with the minimum number of messages.

In both cases, a limited number of options are available for executing the aggregation query. In this paper we will use Scenario 1 as the running example, but the results obtained and conclusions drawn are applicable to both scenarios. Specifically, we answer the question: Given a client query, posed over a hypothetical database of multiple data sources, what sub-queries should be posed at various data aggregators so that the number of refreshes from these aggregators to the client can be minimized? We use additive aggregation queries to develop our approach in detail and, towards the end of the paper, describe how max/min queries can be handled.

For answering the multi-data aggregation query in Scenario 1, there are three options for the client to get the query results. Firstly, the client may get the data items d1, d2, and d3 separately. The query incoherency bound can be divided among data items in various ways, ensuring that the query incoherency is below the incoherency bound. In this paper, we show that getting data items independently is a costly option. This strategy ignores the fact that the client is interested only in the aggregated value of the data items and that various aggregators can disseminate more than one data item.

Secondly, if a single DA can disseminate all three data items required to answer the client query, the DA can construct a composite data item corresponding to the client query (dq = … d1 + … d2 + … d3) and disseminate the result to the client so that the query incoherency bound is not violated. It is obvious that if we get the query result from a single DA, the number of refreshes will be minimum (as data item updates may cancel each other out, thereby maintaining the query results within the incoherency bound). As different data aggregators disseminate different subsets of data items, no data aggregator may have all the data items required to execute the client query, which is indeed the case in Example 1. Further, even if an aggregator can refresh all the data items, it may not be able to satisfy the query coherency requirements. In such cases the query has to be executed with data from multiple aggregators.

A third option is to divide the query into a number of sub-queries and get their values from individual DAs. In that case, the client query result is obtained by combining the results of multiple sub-queries. For the DAs given in Example 1, the query Q can be divided in two alternative ways:
Plan 1: Result of sub-query (… d… + … d…) is served by a…, whereas value of d… is served by a….
Plan 2: Value of d… is served by a…, whereas result of sub-query (… d… + … d…) is served by a….

In both plans, combining the sub-query values at the client gives the query result. But selecting the optimal plan among various options is not trivial. Intuitively, we should select the plan with the smaller number of sub-queries. But that is not guaranteed to be the plan with the least number of messages. Further, we should select the sub-queries such that updates to the various data items appearing in a sub-query have more chances of canceling each other, as that will reduce the need for refreshes to the client. In the above example, if updates to two of the data items are such that when one increases the other decreases, and vice versa, then selecting the plan that places both in the same sub-query may be beneficial. We give a method to select the query plan based on these observations.

While solving the above problem, we ensure that each data item for a client query is disseminated by one and only one data aggregator. A query can be divided in such a way that a single data item is served by multiple DAs (e.g., a query over three data items is divided into two sub-queries that share one data item); but in doing so the same data item is processed at multiple aggregators, increasing the unnecessary processing load (further, in the case of paid data subscriptions it is not prudent to get the same data item from multiple sources). By dividing the client query into disjoint sub-queries, we ensure that a data item update is processed only once for each query.

Sub-query incoherency bounds are required to be derived using the query incoherency bounds such that, besides satisfying the client coherency requirements, the chosen DA (where the sub-query is to be executed) is capable of satisfying the allocated sub-query incoherency bound. For example, in Plan 1, the incoherency bound allocated to the two-item sub-query should be greater than


…, as that is the tightest incoherency bound which the aggregator serving it can satisfy for that sub-query. We show that the number of refreshes also depends on the division of the query incoherency bound among sub-query incoherency bounds. A similar result was reported for data incoherency bounds in [ ]. Next, we present the problem statement formally and our contributions.

1.2 Problem Statement and Contributions

The value of a continuous weighted additive aggregation query, at time t, can be calculated as:

$$V_q(t) = \sum_{i=1}^{n_q} \left( v_{qi}(t) \times w_{qi} \right) \qquad (1)$$

Vq is the value of a client query q involving nq data items, with the weight of the ith data item being wqi, 1 ≤ i ≤ nq. Such a query encompasses the SQL aggregation operators SUM and AVG besides general weighted aggregation queries such as portfolio queries, involving aggregation of stock prices weighted with the number of shares of each stock in the portfolio. Suppose the result for the query given by Equation (1) needs to be continuously provided to a user at the query incoherency bound Cq. Then, the dissemination network has to ensure that:

$$\left| \sum_{i=1}^{n_q} \left( v_{qi}(t) - u_{qi}(t) \right) \times w_{qi} \right| \le C_q \qquad (2)$$

Whenever data values at sources change such that the query incoherency bound is violated, the updated value should be refreshed to the client. If the network of aggregators can ensure that the ith data item has incoherency bound Cqi, then the following condition ensures that the query incoherency bound Cq is satisfied:

$$\sum_{i=1}^{n_q} \left( C_{qi} \times w_{qi} \right) \le C_q \qquad (3)$$

The client-specified query incoherency bound needs to be translated into incoherency bounds for individual data items or sub-queries such that Equation (2) is satisfied. It should be noted that Equation (3) is a sufficient condition for satisfying the query incoherency bound, but not a necessary one. This way of translating the query incoherency bound into the sub-query incoherency bounds is required if data is transferred between various nodes using only a push based mechanism.

We need a method for (a) optimally dividing a client query into sub-queries and (b) assigning incoherency bounds to them such that (c) the derived sub-queries can be executed at chosen DAs and (d) the total query execution cost, in terms of the number of refreshes to the client, is minimized.

We prove that the problem of choosing sub-queries while minimizing query execution cost is an NP-hard problem. We give efficient algorithms to choose the set of sub-queries and their corresponding incoherency bounds for a given client query. In contrast, all related work in this area [ ] proposes getting individual data items from the aggregators which, as we show in this paper, leads to a large number of refreshes. For solving the above problem of optimally dividing the client query into sub-queries, we first need a method to estimate the query execution cost for various alternative options. A method for estimating the query execution cost is another important contribution of this paper. As we divide the client query into sub-queries such that each sub-query gets executed at a different aggregator node, the query execution cost (i.e., number of refreshes) is the sum of the execution costs of its constituent sub-queries. We model the sub-query execution cost as a function of the dissemination costs of the individual data items involved. The data dissemination cost is dependent on data dynamics and the incoherency bound associated with the data. We model the data dynamics using a data dynamics model, and the effect of the incoherency bound using an incoherency bound model. These two models are combined to get the estimate of the data dissemination cost.

To empirically evaluate our approach we use real-world data from the sensor and stock market domains [ ]. The sensor network data used were temperature and wind sensor data from Georges Bank Cruises Albatross Shipboard [ ]. Stock traces of … stocks were obtained by periodically polling http://finance.yahoo.com. We collected … samples for each data item, with a period of … seconds. Appendix A gives statistical properties of some of these stock traces. In this paper we present results using stock data only, but similar results were obtained for sensor data as well [ ]. Our simulation studies show that for continuous aggregation queries:
• Our method of dividing a query into sub-queries and executing them at individual DAs requires less than one third of the number of refreshes required in the existing schemes.
• For reducing the number of refreshes, more dynamic data items should be part of a sub-query involving a larger number of data items.

Our method of executing queries over a network of data aggregators is practical since it can be implemented using a mechanism similar to URL-rewriting [ ] in content distribution networks (CDNs). Just like in a CDN, the client sends its query to the central site. For getting appropriate aggregators (edge nodes) to answer the client query (web page), the central site has to first determine which data aggregators have the data items required for the client query. If the client query cannot be answered by a single data aggregator, the query is divided into sub-queries (fragments) and each sub-query is assigned to a single data aggregator. In the case of a CDN, a web page's division into fragments is a page design issue, whereas, for continuous aggregation queries, this issue has to be handled on a per-query basis by considering the data dissemination capabilities of data aggregators, as explained in Example 1.

We would like to differentiate the current work from that of designing a network of data aggregators for a specific set of client queries. We propose a method to answer a client query using a given network of data aggregators; if, instead, the client queries are fixed, one can use the client queries to optimally construct a network of data aggregators as in [ ]. Our aim of minimizing the number of messages between aggregators and the client complements the works of [ ]. Together they can be used to minimize the total number of messages between data sources and clients.
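The query value of Equation (1), the query incoherency of Equation (2), and the sufficient condition of Equation (3) can be illustrated with a minimal Python sketch. The function names and the sample numbers are hypothetical and chosen only for illustration; weights are assumed to be non-negative, as in a portfolio of share counts.

```python
from typing import Sequence


def query_value(values: Sequence[float], weights: Sequence[float]) -> float:
    """Equation (1): weighted additive aggregate of the data item values."""
    return sum(v * w for v, w in zip(values, weights))


def query_incoherency(source_values: Sequence[float],
                      client_values: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Left-hand side of Equation (2): |sum((v - u) * w)|."""
    return abs(sum((v - u) * w
                   for v, u, w in zip(source_values, client_values, weights)))


def bounds_are_sufficient(item_bounds: Sequence[float],
                          weights: Sequence[float],
                          query_bound: float) -> bool:
    """Equation (3): sum(C_qi * w_qi) <= C_q (assumes non-negative weights)."""
    return sum(c * w for c, w in zip(item_bounds, weights)) <= query_bound


if __name__ == "__main__":
    weights = [2, 3]            # hypothetical weights
    item_bounds = [0.5, 1.0]    # hypothetical per-item incoherency bounds
    print(query_value([10.0, 20.0], weights))
    print(bounds_are_sufficient(item_bounds, weights, query_bound=5.0))
```

Because updates to different data items can partially cancel inside the absolute value of Equation (2), per-item bounds that fail Equation (3) may still keep the query within its bound, which is why the condition is sufficient but not necessary.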

1.3 Outline of the Paper

The cost model for data dissemination is developed in Section 2. In Section 3, we present the query cost model for additive aggregation queries. It uses the data dissemination model and a measure for capturing the correlation between data dynamics. Optimal query planning for additive queries is presented in Section 4. Results of performance evaluations of the algorithms described in Section 4 are presented in Section 5. Section 6 discusses optimal query planning for MAX queries. Most conclusions drawn for this class of queries are similar to those for additive aggregation queries. Related work is presented in Section 7. Discussion about various aspects of our work, conclusions, and future work are presented in Section 8. Table 1 gives a summary of the various symbols used in the paper and their descriptions.

Table 1. Important symbols and their meaning

Symbol    Description
A         Set of aggregators in the network.
N         Number of data aggregators (DAs).
D         Set of data items disseminated by the network.
C         Incoherency bounds of data items.
ak        kth data aggregator, 1 ≤ k ≤ N.
Dk        Set of data items disseminated by the kth DA.
dkj       jth data item disseminated by the kth DA.
tkj       Incoherency bound which ak can ensure for dkj.
q         Client query.
Cq        Incoherency bound for q.
nq        Number of data items in q.
dqi       ith data item of the query q.
vqi(t)    Value of the ith data item of the query q at time t.
wqi       Weight of the data item dqi for the query q.
Vq(t)     Value of the query q at time t.
qk        Sub-query of q to be executed at ak.
Cqk       Incoherency bound of qk.
Rq        Sumdiff of the query q.
ρ         Correlation measure between data items.
…         Query satisfiability parameter.

2 DATA DISSEMINATION COST MODEL

In this section we present the model to estimate the number of refreshes required to disseminate a data item while maintaining a certain incoherency bound. There are two primary factors affecting the number of messages that are needed to maintain the coherency requirement: (a) the coherency requirement itself and (b) the dynamics of the data.

2.1 Incoherency Bound Model

Consider a data item which needs to be disseminated at an incoherency bound C, i.e., a new value of the data item will be pushed if the value deviates by more than C from the last pushed value. Thus, the number of dissemination messages will be proportional to the probability of |v(t) - u(t)| being greater than C, for data value v(t) at the source/aggregator and u(t) at the client, at time t. A data item can be modeled as a discrete time random process [ ] where each step is correlated with its previous step. In push based dissemination, a data source can follow one of the following schemes:
(a) The data source pushes the data value whenever it differs from the last pushed value by an amount more than C.
(b) The client estimates the data value based on server-specified parameters [ ]. The source pushes the new data value whenever it differs from the (client) estimated value by an amount more than C.

In both these cases, the value at the source can be modeled as a random process whose average is the value known at the client. In case (b), the client and the server estimate the data value as the mean of the modeled random process, whereas in case (a) the deviation from the last pushed value can be modeled as a zero-mean process. Using Chebyshev's inequality [ ]:

$$P\left( |v(t) - u(t)| > C \right) \propto \frac{1}{C^2} \qquad (4)$$

Thus, we hypothesize that the number of data refresh messages is inversely proportional to the square of the incoherency bound. A similar result was reported in [ ], where data dynamics were modeled as random walks.

Figure 1. Number of pushes vs. incoherency bounds

Validating the analytical model: To corroborate the above analytical result, we simulated data sources by reading values from the sensor and stock data traces, described in Section 1.2, at periodic instances. For these experiments, each data value at the first tick is sent to the client. Data sources maintain the last sent value for each client. A source reads a new value from the trace and sends the value to its clients if and only if not sending it would violate the client's incoherency bound C. For each data item the incoherency bound was varied, and the refresh messages needed to ensure that incoherency bound were counted. Figure 1 shows the curves for the number of push messages for four representative share price data items as their corresponding incoherency bounds (and hence 1/C^2) are varied. Besides validating the analytical model, these results provide one important insight into the dissemination mechanism. As the incoherency bound decreases, the number of messages increases as per the analytical model, but there is a saturation effect for very low values of the incoherency bound (i.e., the right part of the curve). This is due to the fact that the data items have a limited number of discrete changes in value. For example, if the sensitivity of a temperature sensor is one degree, then the number of dissemination messages will not increase even if the incoherency bound is decreased below 1°.
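The push scheme (a) and the validation experiment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulator: the function name and the integer trace are hypothetical, and the initial delivery at the first tick is not counted as a refresh.

```python
from typing import Sequence


def count_refreshes(trace: Sequence[float], bound: float) -> int:
    """Count push messages needed to keep the client within `bound`.

    The first value is assumed to be delivered at the first tick and is
    not counted. A later value is pushed only when it deviates from the
    last pushed value by more than the incoherency bound.
    """
    if not trace:
        return 0
    last_pushed = trace[0]
    refreshes = 0
    for value in trace[1:]:
        if abs(value - last_pushed) > bound:
            last_pushed = value
            refreshes += 1
    return refreshes


if __name__ == "__main__":
    # Hypothetical trace with a resolution of one unit, like a sensor
    # whose sensitivity is one degree.
    trace = [100, 101, 103, 102, 106, 106, 110]
    for bound in (100, 2, 0.5, 0):
        print(bound, count_refreshes(trace, bound))
```

On this trace, tightening the bound from 2 to 0.5 increases the message count, but tightening it further to 0 does not, which mirrors the saturation effect reported for very low incoherency bounds.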



2.2 Data Dynamics Model

We considered two possible options to model data dynamics. As a first option, the data dynamics can be quantified based on the standard deviation of the data item values [ ]. We take an example to show why standard deviation is not a good measure of data dynamics in our case. Suppose the data values in consecutive instances for a data item d1 are {…}, whereas for another data item d2 the values are {…}. Suppose both data items are disseminated with an incoherency bound of …. It can be seen that the number of messages required for maintaining the incoherency bound will be … and … for data items d1 and d2 respectively, whereas both data items have the same standard deviation (…). Thus we need a measure which captures data changes along with their temporal properties. This motivates us to examine the second measure.

As a second option we considered the Fast Fourier Transform (FFT), which is used in the digital signal processing domain to characterize a digital signal. FFT captures the number of changes in the data value, the amount of the changes, and their timings. Thus FFT can be used to model data dynamics, but it has a problem. To estimate the number of refreshes required to disseminate a data item, we need a function over the FFT coefficients which returns a scalar value. The number of FFT coefficients can be as high as the number of changes in the data value. Among the FFT coefficients, the 0th order coefficient identifies the average value of the data item, whereas higher order coefficients represent transient changes in the value of the data item. We hypothesize that the cost of data dissemination for a data item can be approximated by a function of the 1st FFT coefficient. Specifically, the cost of data dissemination for a data item will be proportional to the data sumdiff, defined as:

$$R_s = \sum_{i} |s_i - s_{i-1}| \qquad (5)$$

where si and si-1 are the sampled values of a data item S at the ith and (i-1)th time instances (i.e., consecutive ticks). In practice, the sumdiff value for a data item can be calculated at the data source by taking a running average of the difference between data values for consecutive ticks. For our experiments, we calculated the sumdiff values using an exponential window moving average, with each window having … samples and giving …% weight to the most recent window.

Validating the hypothesis: We did simulations with different stocks being disseminated with incoherency bound values of 0.001, 0.01, and 0.1. This range is … to … times the average standard deviation of the share price values. The number of refresh messages is plotted against data sumdiff in Figure 2. The linear relationship appears to exist for all incoherency bound values. To quantify the measure of linearity we used the Pearson product moment correlation coefficient (PPMCC) [ ], a widely used measure of association that measures the degree of linearity between two variables. It is calculated by summing up the products of the deviations of the data item values from their mean. PPMCC varies between -1 and 1, with higher (absolute) values signifying that the data points can be considered linear with more confidence. For the three values of incoherency bounds 0.001, 0.01, and 0.1, the PPMCC values were …, …, and … respectively; i.e., the average deviation from linearity was in the range of …% for low values of C and …% for high values of C. Thus we can conclude that for lower values of the incoherency bound, a linear relationship between data sumdiff and the number of refresh messages can be assumed with more confidence. The larger error for larger values of C can be explained as follows.

Figure 2. Number of pushes vs. data sumdiff: (a) C=0.001, (b) C=0.01, (c) C=0.1

As per the hypothesis, a larger value of data sumdiff should result in more refreshes. But that may not be true when either (1) there are low amplitude changes in the consecutive data values (smaller than the incoherency bound), which increase the data sumdiff value without requiring the dissemination of messages, or (2) there are spikes much higher than the incoherency bound, leading to a more than proportional increase in the data sumdiff. The first case will be more prevalent for high values of the incoherency bound, whereas the second case will be more pronounced for low values of the incoherency bound. Thus the linear relationship between the data sumdiff and the number of refreshes has more error for very low values and very high values of incoherency bounds. As low amplitude perturbations are more prevalent than high amplitude spikes in most data items, the linear relationship is more accurate for lower values of incoherency bounds.

2.3 Combining Data Dissemination Models

The number of refresh messages is proportional to the data sumdiff Rs and inversely proportional to the square of the incoherency bound (C^2). Further, we can see that we need not disseminate any message when either the data value is not changing (Rs = 0) or the incoherency bound is unlimited (1/C^2 = 0). Thus, for a given data item S, disseminated


with an incoherency bound C, the data dissemination cost is proportional to Rs/C^2. In the next section, we use this data dissemination cost model for developing the cost model for additive aggregation queries.

3 COST MODEL FOR ADDITIVE AGGREGATION QUERIES

Consider an additive query over two data items P and Q with weights wp and wq, respectively, and suppose we want to estimate its dissemination cost. If the data items are disseminated separately, the query sumdiff will be:

$$R_{data} = w_p R_p + w_q R_q = w_p \sum_i |p_i - p_{i-1}| + w_q \sum_i |q_i - q_{i-1}| \qquad (6)$$

Instead, if the aggregator uses the information that the client is interested in a query over P and Q (rather than their individual values), and it creates and pushes a composite data item (wp p + wq q), then the query sumdiff will be:

$$R_{query} = \sum_i \left| w_p (p_i - p_{i-1}) + w_q (q_i - q_{i-1}) \right| \qquad (7)$$

Rquery is clearly less than or equal to Rdata. Thus we need to estimate the sumdiff of an aggregation query (i.e., Rquery) given the sumdiff values of the individual data items (i.e., Rp and Rq). Only data aggregators are in a position to calculate Rquery, as different data items may be disseminated from different sources. We develop the query cost model in two stages.

3.1 Modeling Correlation between Data Dynamics

From Equations (6) and (7) we can see that if two data items are correlated such that as the value of one data item increases, that of the other data item also increases, then Rquery will be closer to Rdata. On the other hand, if the data items are inversely correlated, then Rquery will be smaller than Rdata. Thus, intuitively, we can represent the relationship between Rquery and the sumdiff values of the individual data items using a correlation measure associated with the pair of data items. Specifically, if ρ is the correlation measure, then Rquery can be written as:

$$R_{query}^2 \propto \left( w_p^2 R_p^2 + w_q^2 R_q^2 + 2 \rho\, w_p R_p w_q R_q \right) \qquad (8)$$

The correlation measure ρ is defined such that -1 ≤ ρ ≤ +1. So, Rquery will always be at most |wp Rp + wq Rq| (as explained earlier) and at least |wp Rp - wq Rq|. The above relation can be better understood from its similarity with the standard deviation of the sum of two random variables [ ]. For data items P and Q, ρ can be calculated as:

$$\rho = \frac{\sum_i (p_i - p_{i-1})(q_i - q_{i-1})}{\sqrt{\sum_i (p_i - p_{i-1})^2 \, \sum_i (q_i - q_{i-1})^2}} \qquad (9)$$

In Appendix B, we discuss a method for efficiently calculating ρ.

3.2 Query based Normalization

Suppose we want to compare the cost of two queries: a SUM query involving two data items and an AVG query involving the same set of data items. Let the query incoherency bounds for the SUM and the AVG queries be C1 (= 2C) and C2 (= C), respectively. From Equation (7), the sumdiff of the SUM query will be double that of the AVG query. Hence the query evaluation cost (as per R/C^2) of the SUM query will be half that of the AVG query. But intuitively, disseminating the AVG of two data items at a given incoherency bound should require the same number of refresh messages as their SUM with double the incoherency bound. Thus there is a need to normalize query costs. From a query execution cost point of view, a query with weights wi and incoherency bound C is the same as a query whose weights and incoherency bound are both scaled by a common factor. So while normalizing, we need to ensure that both the query weights and the incoherency bounds are multiplied by the same factor. The normalized query sumdiff is given by:

$$R_{query}^2 = \frac{w_p^2 R_p^2 + w_q^2 R_q^2 + 2 \rho\, w_p R_p w_q R_q}{w_p^2 + w_q^2 + 2 \rho\, w_p w_q} \qquad (10)$$

i.e., the value of the normalizing factor for Rquery should be $1 / \sqrt{w_p^2 + w_q^2 + 2 \rho\, w_p w_q}$. The value of the incoherency bound has to be adjusted by the same factor. Normalization ensures that queries with arbitrary values of weights can be compared for execution cost estimates. Equation (10) can be extended to get the query sumdiff for any general weighted aggregation query given by Equation (1) as:

$$R_q^2 = \frac{\sum_{i=1}^{n_q} w_{qi}^2 R_i^2 + \sum_{i=1}^{n_q} \sum_{j=1, j \ne i}^{n_q} \rho_{ij}\, w_{qi} w_{qj} R_i R_j}{\sum_{i=1}^{n_q} w_{qi}^2 + \sum_{i=1}^{n_q} \sum_{j=1, j \ne i}^{n_q} \rho_{ij}\, w_{qi} w_{qj}} \qquad (11)$$

3.3 Validating the Query Cost Model

To validate the query cost model we performed simulations by constructing … weighted aggregation queries using the stock data, with each query consisting of … data items with data weights uniformly distributed between … and …. For each query, the number of refreshes was counted for various incoherency bounds such that their normalized values (using the normalization factor as in Equation (10)) are between … and …. Figure 3(a) shows that the number of messages is proportional to the normalized query sumdiff (as calculated using Equation (11)) if the normalized incoherency bounds are the same. In this case, the PPMCC value is found to be …. Similarly, Figure 3(b) shows the dependence of the number of refreshes on 1/C^2, to illustrate that the relationship that holds between them for a single data item also holds for a query with multiple data items. We use this query cost model for query planning, which is presented next.
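The quantities used in Sections 2 and 3 can be computed from sampled traces with a short Python sketch: the data sumdiff of Equation (5), the Rs/C^2 dissemination cost, the correlation measure of Equation (9), and the normalized two-item query sumdiff of Equation (10). The function names and the toy traces are hypothetical, and the exponential-window averaging used in the experiments is not reproduced here.

```python
import math
from typing import List, Sequence


def deltas(samples: Sequence[float]) -> List[float]:
    """Differences between consecutive ticks: s_i - s_(i-1)."""
    return [b - a for a, b in zip(samples, samples[1:])]


def sumdiff(samples: Sequence[float]) -> float:
    """Equation (5): sum of absolute consecutive differences."""
    return sum(abs(d) for d in deltas(samples))


def dissemination_cost(samples: Sequence[float], bound: float) -> float:
    """Relative cost R_s / C^2 (proportionality factor omitted)."""
    return sumdiff(samples) / bound ** 2


def correlation(p: Sequence[float], q: Sequence[float]) -> float:
    """Equation (9): correlation of the tick-to-tick changes of P and Q."""
    dp, dq = deltas(p), deltas(q)
    denom = math.sqrt(sum(x * x for x in dp) * sum(y * y for y in dq))
    if denom == 0:
        return 0.0  # assumption: a constant trace is treated as uncorrelated
    return sum(x * y for x, y in zip(dp, dq)) / denom


def normalized_query_sumdiff(w_p: float, r_p: float,
                             w_q: float, r_q: float, rho: float) -> float:
    """Equation (10): normalized sumdiff of the query w_p*P + w_q*Q."""
    denom = w_p ** 2 + w_q ** 2 + 2 * rho * w_p * w_q
    if denom <= 0:
        raise ValueError("normalizing factor undefined for these weights")
    numer = (w_p * r_p) ** 2 + (w_q * r_q) ** 2 + 2 * rho * w_p * r_p * w_q * r_q
    return math.sqrt(numer / denom)


if __name__ == "__main__":
    p, q = [1, 2, 3, 4], [2, 4, 6, 8]   # hypothetical traces
    rho = correlation(p, q)
    print(normalized_query_sumdiff(1.0, sumdiff(p), 1.0, sumdiff(q), rho))  # SUM
    print(normalized_query_sumdiff(0.5, sumdiff(p), 0.5, sumdiff(q), rho))  # AVG
```

With this normalization, the SUM and AVG queries over the same two traces yield the same normalized sumdiff, which is the property Section 3.2 asks for.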

Figure 3. Query cost validation with varying (a) sumdiff, (b) incoherency bound

4 QUERY PLANNING FOR WEIGHTED ADDITIVE AGGREGATION QUERIES

For executing an incoherency-bounded continuous query, a query plan is required. The query planning problem can be stated as:

Inputs:
(1) A network of data aggregators in the form of a relation f(A, D, C) specifying the N data aggregators ak ∈ A (1 ≤ k ≤ N), the set Dk ⊆ D of data items disseminated by the data aggregator ak, and the incoherency bound tkj which the aggregator ak can ensure for each data item dkj ∈ Dk.
(2) Client query q and its incoherency bound Cq. An additive aggregation query q can be represented as Σ wqi dqi, where wqi is the weight of the data item dqi for 1 ≤ i ≤ nq.

Outputs:
(1) qk for 1 ≤ k ≤ N, i.e., a sub-query for each data aggregator ak.
(2) Cqk for 1 ≤ k ≤ N, i.e., incoherency bounds for all the sub-queries.

Thus, to get a query plan we need to perform the following tasks:
1. Determining sub-queries: For the client query q, get the sub-queries qk for each data aggregator.
2. Dividing incoherency bound: Divide the query incoherency bound Cq among the sub-queries to get the Cqk values.

Following is the outline of our approach for solving this constraint optimization problem, as detailed in the rest of this section. We first prove that determining sub-queries while minimizing Zq, as given by Equation (12), is NP-hard. We then show that if the set of sub-queries (qk) is already given, the sub-query incoherency bounds (Cqk) can be optimally determined to minimize Zq. As optimally dividing the query into sub-queries is NP-hard and there is no known approximation algorithm, we present two heuristics for determining sub-queries while satisfying as many constraints as possible (two of the constraints, to be precise). Then we present a variation of the two heuristics for ensuring that the sub-query incoherency bound is satisfied (a further constraint). In particular, to get a solution of the query planning problem, the heuristics are used for determining sub-queries. Then, using the set of sub-queries, the bound-division method is used for dividing the incoherency bound.
)RU RSWLPDO TXHU\ SODQQLQJ DERYH WDVNV DUH WR EH SHU
IRUPHGZLWKWKHIROORZLQJREMHFWLYHDQGFRQVWUDLQWV
2SWLPL]DWLRQ REMHFWLYH 1XPEHU RI UHIUHVK PHVVDJHV LV
PLQLPL]HG,Q6HFWLRQ  ZH KDYH SURYHG WKDW IRUD VXE
TXHU\ TN WKH HVWLPDWHG QXPEHU RI UHIUHVK PHVVDJHV LV
2 
JLYHQ E\ 5TN Cqk
 ZKHUH 5TN LV WKH VXPGLII RI WKH VXE
TXHU\TN&TNLVWKHLQFRKHUHQF\ERXQGDVVLJQHGWRLWDQG
 WKH SURSRUWLRQDOLW\ IDFWRU LV WKH VDPH IRU DOO VXE
TXHULHVRIDJLYHQ TXHU\ T7KXV WRWDO QXPEHU RI UHIUHVK
PHVVDJHVLVHVWLPDWHGDV
N

Zq =

Rqk

2
k =1C qk

 

+HQFH =T QHHGV WR EH PLQLPL]HG IRU PLQLPL]LQJ WKH


QXPEHURIUHIUHVKHV
&RQVWUDLQW TN LV H[HFXWDEOH DW DN (DFK '$ KDV WKH GDWD
LWHPV UHTXLUHG WR H[HFXWH WKH VXETXHU\ DOORFDWHG WR LW
LHIRUHDFKGDWDLWHP dqkiUHTXLUHGIRUWKHVXETXHU\TN
d qki 'N
&RQVWUDLQW 4XHU\ LQFRKHUHQF\ ERXQG LV VDWLVILHG 4XHU\
LQFRKHUHQF\VKRXOGEHOHVVWKDQRUHTXDOWRWKHTXHU\LQ
FRKHUHQF\ERXQG)RUDGGLWLYHDJJUHJDWLRQTXHULHVYDOXH
RIWKHFOLHQWTXHU\LVWKHVXPRIVXETXHU\YDOXHV$VGLI
IHUHQWVXETXHULHV DUH GLVVHPLQDWHG E\ GLIIHUHQW GDWD DJ
JUHJDWRUVZHQHHGWRHQVXUHWKDWVXPRIVXETXHU\LQFR
KHUHQFLHV LV OHVV WKDQ RU HTXDO WR WKH TXHU\ LQFRKHUHQF\
ERXQG7KXV
Cqk Cq   
&RQVWUDLQW 6XETXHU\ LQFRKHUHQF\ ERXQG LV VDWLVILHG 'DWD
LQFRKHUHQF\ ERXQGV DW DN tkj IRU d kj 'N  VKRXOG EH VXFK
WKDWWKHVXETXHU\LQFRKHUHQF\ERXQG&TNFDQEHVDWLVILHG
DWWKDW'$7KHWLJKWHVWLQFRKHUHQF\ERXQG7TNZKLFKWKH
GDWDDJJUHJDWRU DN FDQ VDWLVI\ IRU WKH JLYHQ VXETXHU\ TN
FDQEHFDOFXODWHG DV Tqk = ( wqi tqj d qi d kj )  )RU VDWLV
n qk

I\LQJWKLVFRQVWUDLQWZHHQVXUH Cqk Tqk 
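The objective and the three constraints stated above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names and the dictionary-based representation of sub-queries and aggregators are assumptions, and the common proportionality factor of the cost model is omitted.

```python
def estimated_refreshes(sumdiffs, bounds):
    """Z_q = sum_k R_qk / C_qk**2, omitting the common proportionality factor."""
    return sum(r / (c * c) for r, c in zip(sumdiffs, bounds))


def tightest_bound(subquery, aggregator_bounds):
    """T_qk = sum of w_qi * t_kj over the items of the sub-query.

    subquery:          dict mapping data item -> weight w_qi
    aggregator_bounds: dict mapping data item -> data incoherency bound t_kj
    """
    return sum(w * aggregator_bounds[item] for item, w in subquery.items())


def plan_is_feasible(plan, query_bound):
    """Check Constraint1, Constraint2 and Constraint3 for a candidate plan.

    plan: list of (subquery, aggregator_bounds, c_qk) triples (hypothetical layout).
    """
    # Constraint2: sub-query bounds must not add up to more than C_q.
    if sum(c for _, _, c in plan) > query_bound + 1e-12:
        return False
    for subquery, aggregator_bounds, c_qk in plan:
        # Constraint1: the aggregator must disseminate every item it is asked for.
        if any(item not in aggregator_bounds for item in subquery):
            return False
        # Constraint3: the assigned bound must be no tighter than T_qk.
        if c_qk < tightest_bound(subquery, aggregator_bounds):
            return False
    return True
```

A plan that passes `plan_is_feasible` can then be compared with other feasible plans through `estimated_refreshes`.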

4.1 Finding Optimal Query Plan is NP-hard


For proving that the problem is NP-hard, we use a reduction from the 3-dimensional matching (3DM) problem […].

3DM Problem: Given three sets X, Y and Z, each with n elements, and a set $M \subseteq X \times Y \times Z$, does there exist a subset $M' \subseteq M$ such that every element of the sets X, Y and Z occurs in $M'$ once and only once? (The cardinality of $M'$ will be n if it does exist.)

We use a decision version of the query planning problem to reduce the 3DM problem. To solve the 3DM problem, we reduce it to a SUM query of 3n items and incoherency bound n (as given in Appendix C). In the appendix, we prove that the best query plan, having a query cost of n, will consist of n sub-queries, each with 3 data items and a sub-query incoherency bound of …. If we can find such an optimal plan, the three data items from each chosen data aggregator form a triplet for the set $M'$, while ensuring that each and every element of the sets X, Y and Z occurs once and only once in $M'$. Conversely, there can be cases when:
1. The sum query cannot be satisfied, as no combination of data aggregators can disseminate all the query data items. In this case we can easily see that there cannot be any $M' \subseteq M$ such that all elements of X, Y and Z are in $M'$.
2. The cost of the selected optimal plan is more than n. That implies that the plan has at least one data aggregator disseminating a sub-query with fewer than 3 data items (see Appendix C). Since we get $M'$ using all the elements of the selected data aggregators (as described in the appendix), some elements are repeated in $M'$.

In both of these cases, 3DM is answered in the negative. Thus, using 3DM, we have proved that the optimal planning problem is NP-hard. For the purpose of the next subsection (4.2), we assume that we have already determined sub-queries while satisfying Constraint1, and we show that incoherency bound division can be performed optimally while satisfying Constraint2 and Constraint3.
4.2 Optimal Allocation of Query Incoherency Bound among Sub-queries

If we know the division of the client query into sub-queries, we can calculate the sumdiff values of all the sub-queries using the sub-query sumdiff equation. Thus we need to minimize $Z_q$ subject to Constraint2 (query incoherency bound is satisfied) and Constraint3 (sub-query incoherency bound is satisfied). We can get a closed-form expression by minimizing $Z_q$ subject to Constraint2 alone, using the Lagrange multiplier scheme (see Appendix D). In that scheme we minimize $\sum_{k=1}^{N} (R_{qk}/C_{qk}^2) + \lambda (\sum_{k=1}^{N} C_{qk} - C_q)$ for a constant $\lambda$ to get the values of $C_{qk}$ as:

$$C_{qk} = C_q \, \frac{R_{qk}^{1/3}}{\sum_{k=1}^{N} R_{qk}^{1/3}}$$

i.e., without Constraint3, sub-query incoherency bounds should be allocated in proportion to $R_{qk}^{1/3}$. In Section 4.3, we use this expression to develop heuristics for optimally dividing the client query into sub-queries. If we also consider Constraint3, then we can model the problem of minimizing $Z_q$ while satisfying Constraint2 and Constraint3 as a non-linear convex optimization problem. The non-linear convex optimization problem can be solved using various convex optimization techniques available in the literature, such as the gradient descent method, the barrier method, etc. We used the gradient descent method (fmincon function in MATLAB) to solve this non-linear optimization problem to get the values of individual sub-query incoherency bounds for a given set of sub-queries. In the next subsection, we describe two greedy heuristics to determine sub-queries while using the formulations developed in this section.

    result ← ∅
    while Mq ≠ ∅
        choose a sub-query mi ∈ Mq with criterion ψ
        result ← result ∪ mi; Mq ← Mq − {mi}
        for each data item d ∈ mi
            for each mj ∈ Mq
                mj ← mj − {d}
                if mj = ∅, Mq ← Mq − {mj}
                else calculate sumdiff for modified mj
    return result

Figure 4: Greedy algorithm for query plan selection

4.3 Greedy Heuristics for Deriving the Sub-queries

Figure 4 gives the outline of the greedy algorithm for deriving sub-queries. First, we get a set of maximal sub-queries ($M_q$) corresponding to all the data aggregators in the network. The maximal sub-query for a data aggregator is defined as the largest part of the query which can be disseminated by the DA (i.e., the maximal sub-query has all the query data items which the DA can disseminate). For example, consider a client query that is a weighted sum of three data items. For the data aggregators a1 and a2 given in the earlier example, the maximal sub-query m1 for a1 and the maximal sub-query m2 for a2 each consist of two of the query data items. For the given client query q and the relation consisting of data aggregators, data items, and data incoherency bounds (f(A, D, C)), maximal sub-queries can be obtained for each data aggregator by forming a sub-query involving all data items in the intersection of the query data items and those being disseminated by the DA. This operation can be performed in $O(|q| \cdot \max|D_k|)$, where $|q|$ is the number of data items in the query and $\max|D_k|$ is the maximum number of data items disseminated by any DA. For each sub-query $m \in M_q$, its sumdiff $R_m$ can be calculated using the sub-query sumdiff equation. Different criteria (ψ) can be used to select a sub-query in each iteration of various greedy heuristics. All data items covered by the selected sub-query are removed from all the remaining sub-queries in $M_q$ before performing the next iteration. It should be noted that sub-queries for DAs can be null.

Now we describe two criteria (ψ) for the greedy heuristics: 1) min-cost: the estimate of query execution cost is minimized, and 2) max-gain: the estimated gain due to executing the query using sub-queries is maximized.

4.3.1 Minimum Cost Heuristic

As we need to minimize the query cost, a sub-query with minimum cost per data item can be chosen in each iteration of the algorithm given by Figure 4, i.e., criterion ψ = minimize($R_m / (C_m^2 \, |m|)$). But from the closed-form allocation of Section 4.2, we can see that the sub-query incoherency bounds should be allocated in proportion to $R_k^{1/3}$. Using the expressions for $Z_q$ and $C_{qk}$, we get:

$$Z_q^{1/3} = \frac{1}{C_q^{2/3}} \sum_{k=1}^{N} R_{qk}^{1/3}$$

From this equation it is clear that for minimizing the query execution cost we should select the set of sub-queries so that $\sum_k R_{qk}^{1/3}$ is minimized. We can do that by using criterion ψ = minimize($R_m^{1/3} / |m|$) in the greedy algorithm. Once we get the optimal set of sub-queries, we can use the expression for $Z_q$ and Constraint3 ($C_{qk} \ge T_{qk}$) to optimally allocate the query incoherency bound among them using any of the convex optimization techniques, as discussed in Section 4.2. But this method of first deriving sub-queries and then allocating the incoherency bounds has a problem, which is described next.

4.3.2 Satisfiability of sub-query incoherency bound

In the solution described in the previous section, we select the set of sub-queries (and corresponding DAs) and then allocate the incoherency bound among them using a convex optimization technique. But the problem of incoherency bound allocation among the chosen DAs may not have any feasible solution. There may be situations where the given network of data aggregators is able to satisfy the query coherency requirements, but once the set of sub-queries is selected, the incoherency bound allocation is not possible. Such a situation can be illustrated with the help of the network of data aggregators consisting of two DAs, a1 and a2, as given in the earlier example. Consider a client query Q over three data items with an incoherency bound of …. As discussed earlier, there are (at least) two possible query plans to answer this query. As suggested in the previous sub-section, we select sub-queries having minimum $R_{qk}^{1/3}$; thus, based on data dynamics, it is possible that we select one of these plans as the optimal plan. But from the specification of the aggregators a1 and a2 in the example, we see that it is not possible for that plan to satisfy the client specified incoherency bound, as the tightest incoherency bound that can be satisfied by the selected aggregators ($T_{plan}$ = …) is greater than the query incoherency bound (…). Thus, although there exists another plan ($T_{plan}$ = …) which can satisfy the client query incoherency bound while minimizing the query execution cost, the above method cannot ensure that such a plan will be selected. What we need is a compromise between query satisfiability and performance. Instead of selecting the sub-queries without considering the data incoherency bounds for the selected data aggregators, we select sub-queries using

$$R_m^{1/3} + \alpha \, \frac{T_m}{C_q R_m^{1/3}}$$

as the expanded objective function. The second term ensures that, while selecting the optimal plan, we prefer data aggregators having tighter data incoherency bounds (lower values of $T_m$), and thus higher chances of satisfying the query. The tuning parameter $\alpha$ can be used to balance the objectives of minimizing query execution cost through sub-query selection and meeting the query coherency requirements. We use $T_m / (C_q R_m^{1/3})$ in the second term as, according to the closed-form allocation of Section 4.2, the optimal incoherency bound allocation is likely to be done proportional to $C_q R_m^{1/3}$. In Section 5.2.3, we measure the effects of the tuning parameter $\alpha$ on the query satisfiability.
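The greedy selection of Figure 4 with the expanded min-cost objective can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: data items are assumed independent (correlation measure 0) so that the sub-query sumdiff reduces to a root of summed squares, the expanded objective is assumed to be divided by the sub-query size as in the basic criterion, all sumdiffs are assumed positive, and all names and the dictionary layout are hypothetical.

```python
import math


def subquery_sumdiff(items, weights, sumdiffs):
    # ASSUMPTION: independent data items, so cross terms vanish.
    return math.sqrt(sum((weights[d] * sumdiffs[d]) ** 2 for d in items))


def min_cost_score(items, agg_bounds, weights, sumdiffs, c_q, alpha):
    """Expanded objective R_m^(1/3) + alpha*T_m/(C_q*R_m^(1/3)), per data item."""
    r13 = subquery_sumdiff(items, weights, sumdiffs) ** (1.0 / 3.0)
    t_m = sum(weights[d] * agg_bounds[d] for d in items)  # tightest bound T_m
    return (r13 + alpha * t_m / (c_q * r13)) / len(items)


def greedy_plan(weights, sumdiffs, aggregators, c_q, alpha=0.0):
    """Return {aggregator: set of query items} following the loop of Figure 4.

    weights:     dict item -> weight in the client query
    sumdiffs:    dict item -> sumdiff of the data item
    aggregators: dict aggregator -> {item: data incoherency bound}
    """
    # Maximal sub-queries: query items that each aggregator can disseminate.
    candidates = {a: set(weights) & set(b) for a, b in aggregators.items()}
    candidates = {a: s for a, s in candidates.items() if s}
    plan = {}
    while candidates:
        best = min(candidates, key=lambda a: min_cost_score(
            candidates[a], aggregators[a], weights, sumdiffs, c_q, alpha))
        chosen = candidates.pop(best)
        plan[best] = chosen
        # Remove covered items from the remaining sub-queries; drop empty ones.
        candidates = {a: s - chosen for a, s in candidates.items() if s - chosen}
    return plan
```

With `alpha=0.0` the sketch reduces to the basic min-cost criterion; a larger `alpha` steers the selection toward aggregators with tighter data incoherency bounds.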

4.3.3 Maximum Gain Heuristic


Now we present an algorithm which, instead of minimizing the estimated query execution cost, maximizes the estimated gains of executing the client query using sub-queries. In this algorithm, for each sub-query, we calculate the relative gain of executing it by finding the sumdiff difference between the cases when each data item is obtained separately and when all the data items are aggregated as a single sub-query (i.e., the maximal sub-query). Thus, the relative gain for a sub-query $\sum_i w_i d_i$ can be written as:

$$G_m = \sum_i w_i R_i - \sqrt{\sum_i w_i^2 R_i^2 + \sum_i \sum_{j \ne i} \rho_{ij}\, w_i w_j R_i R_j}$$

where $R_i$ is the sumdiff of the data item $d_i$. This algorithm can be implemented by using criterion ψ = maximize($G_m / |m|$) to get the set of sub-queries and corresponding DAs. Then we use the convex optimization method outlined in Section 4.2 to allocate incoherency bounds among sub-queries. To tackle the query satisfiability issue, the query gain is modified to:

$$G'_m = G_m - \alpha \, \frac{\sum_i w_i T_i}{C_q R_m^{1/3}}$$

where $T_i$ is the tightest incoherency bound that can be satisfied for the data item $d_i$ and $R_m$ is the sub-query sumdiff. The reasons for selecting this particular extended objective function are the same as the ones outlined for the min-cost heuristic.
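The relative gain can be illustrated with the following Python sketch. It is a hedged illustration: the function names and the representation of pairwise correlation measures as a dictionary keyed by `frozenset` pairs are assumptions, missing pairs are treated as uncorrelated, and the sub-query sumdiff is assumed to be positive when the extended gain is computed.

```python
import math


def subquery_sumdiff(weights, sumdiffs, rho):
    """Sumdiff of the aggregated sub-query sum_i w_i d_i.

    rho maps frozenset({i, j}) -> correlation measure; missing pairs count as 0.
    """
    items = list(weights)
    total = sum((weights[i] * sumdiffs[i]) ** 2 for i in items)
    for i in items:
        for j in items:
            if i != j:  # ordered pairs, so each unordered pair is counted twice
                total += (rho.get(frozenset((i, j)), 0.0)
                          * weights[i] * weights[j] * sumdiffs[i] * sumdiffs[j])
    return math.sqrt(max(total, 0.0))


def relative_gain(weights, sumdiffs, rho):
    """G_m: cost of fetching items separately minus cost of the aggregate."""
    separate = sum(weights[i] * sumdiffs[i] for i in weights)
    return separate - subquery_sumdiff(weights, sumdiffs, rho)


def extended_gain(weights, sumdiffs, rho, tightest, c_q, alpha):
    """G'_m: gain penalised by loose data incoherency bounds T_i."""
    r_m = subquery_sumdiff(weights, sumdiffs, rho)
    penalty = alpha * sum(weights[i] * tightest[i] for i in weights)
    return relative_gain(weights, sumdiffs, rho) - penalty / (c_q * r_m ** (1.0 / 3.0))
```

For two items with weighted sumdiffs 3 and 4, the gain is 2 when they are uncorrelated, 0 when they are perfectly correlated, and 6 when they are perfectly anti-correlated, which matches the intuition that aggregation helps most when changes cancel.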

To summarize, for a given client query and a network of data aggregators, first we get the maximal sub-queries for all data aggregators. We use the heuristics described in this section to derive sub-queries. In these heuristics, extended objective functions are used to have the desired level of query satisfiability. Then the technique explained in Section 4.2 is used to allocate the query incoherency bound among the derived sub-queries.
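The closed-form allocation of Section 4.2 can be sketched in Python as follows. The sketch covers only the case without Constraint3 (bounds proportional to the cube root of the sub-query sumdiff); enforcing $C_{qk} \ge T_{qk}$ would require a constrained solver, which is not shown. Function names are hypothetical.

```python
def allocate_bounds(sumdiffs, c_q):
    """C_qk = C_q * R_qk^(1/3) / sum_k R_qk^(1/3) (Constraint3 ignored)."""
    roots = [r ** (1.0 / 3.0) for r in sumdiffs]
    total = sum(roots)
    return [c_q * x / total for x in roots]


def estimated_refreshes(sumdiffs, bounds):
    """Z_q = sum_k R_qk / C_qk**2, omitting the common proportionality factor."""
    return sum(r / (c * c) for r, c in zip(sumdiffs, bounds))
```

For sub-query sumdiffs 1 and 8 and a query bound of 3, the allocation is 1 and 2, giving an estimated cost of 3, compared with 4 when the bound is split equally.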

5 PERFORMANCE EVALUATION
For performance evaluation, we simulated a network of data aggregators of … stock data items over … aggregator nodes such that each aggregator can disseminate combinations of … to … data items. Data items were assigned to different aggregators using a zipf distribution (skew = …), assuming that some popular data items will be disseminated by more DAs. Data incoherency bounds for various aggregator data items were chosen uniformly between … and …. We created … portfolio queries such that each query has … to … randomly selected data items (using a zipf distribution with the same default skew) with weights varying between … and …. These queries were executed with incoherency bounds between … and … (i.e., … of the query value). Although here we present results for stock traces (man-made data), similar results were obtained for sensor traces (natural data) as well […]. In the first set of experiments, we kept data incoherency bounds at the data aggregators very low so that query satisfiability can be ensured, while keeping the default value of $\alpha$ as ….

5.1 Comparison of Algorithms


For comparison with our algorithms presented in the previous section, we consider various other query plan options. Each query can be executed by disseminating individual data items or by getting sub-query values from DAs. The set of sub-queries can be selected using sumdiff based approaches or any other random selection. The sub-query (or data) incoherency bound can either be pre-decided or optimally allocated. Various combinations of these dimensions are covered in the following algorithms:
1. No sub-query, equal incoherency bound (naïve): In this algorithm, the client query is executed with each data item being disseminated to the client independent of other data items in the query. The incoherency bound is divided equally among the data items. This algorithm acts as a baseline algorithm.
2. No sub-query, optimal incoherency bound (optc): In this algorithm also, data items are disseminated independently, but the incoherency bound is divided among data items using the closed-form allocation of Section 4.2 so that the total number of refreshes can be minimized.
3. Random sub-query selection (random): In this case, sub-queries are obtained by randomly selecting a DA in each iteration of the greedy algorithm (Figure 4). This algorithm is designed to see how random selection works in comparison to the sumdiff based algorithms.
4. Sub-query selection while minimizing sumdiff (min-cost): This algorithm is described in Section 4.3.1.
5. Sub-query selection while maximizing gain (max-gain): This algorithm is described in Section 4.3.3.

Figure 5 shows the average number of refreshes required for query incoherency bounds of …. The naïve algorithm requires more than five times the number of messages compared to the min-cost and max-gain algorithms. For an incoherency bound of …, each query on average requires … messages if it is executed just by optimizing the incoherency bound (optc), compared to … when we select the query plan using the max-gain algorithm. The gains of our algorithms increase further as the number of data items disseminated by data aggregators increases (naïve requires more than … times the messages when each data aggregator disseminates … data items). This happens because, with more data items per DA, sub-query based algorithms result in larger sub-queries and we select sub-queries intelligently.

In the above experiment, for creating queries we selected the query data items with the same zipf distribution (skew = …) as we used for selecting data items to be served by DAs. But if we reduce the skew (i.e., having queries with less popular data items), we found that the performance of sub-query based algorithms suffers. This happens because, for better performance, sub-query based algorithms depend on query data items being disseminated by the same DAs. For queries with less popular data items, the probability of this happening is lower, hence the inferior performance.

Further, although the optimization problem is similar to covering a set of data items (query) using its subsets (sub-queries), for which the greedy min-cost algorithm is considered to be most efficient […], we see that the max-gain algorithm requires …% fewer messages compared to the min-cost approach. The reasons for the max-gain algorithm performing better than other algorithms are explored in the next set of experiments.

Figure 5: Performance evaluation of algorithms

5.2 Effects of Algorithmic Parameters

This set of experiments was performed to get an insight into various characteristics of our sub-query selection method which lead it to perform better compared to other options. We consider the effects of three parameters on the query performance: data dynamics, correlation between data dynamics, and the query satisfiability parameter $\alpha$.

5.2.1 Effect of data dynamics

In this set of experiments, we wanted to see whether there is any definite relationship between data dynamics and the size of the sub-query in which that data item appears. In this experiment, with … data items, … DAs were simulated such that each DA can disseminate a different set of … data items. Then … queries were created, each with 3 randomly chosen data items. In the optimal query plan, each query will be executed with two sub-queries, one consisting of 2 data items and another with a single data item (a plan with three one-item sub-queries will be trivially inefficient). As the query has only 3 data items, only 3 such query plans are possible. We simulated all these options to get the best query plan. For these optimal query plans, Figure 6(a) shows the variation of the average sub-query size in which a particular data item appears versus the sumdiff value of the data item. We can see that if a data item is more dynamic, in the optimal plan it is more likely to be part of a larger sub-query. This is an important observation, as it indicates that for efficient query evaluation more dynamic data items should be part of a larger sub-query. This phenomenon can be explained by the fact that executing a query as a combination of sub-queries will always be more efficient compared to getting the data items independently. By combining more dynamic data items we are likely to gain more. For comparison, we also show the curve for the sub-query selection based on the max-gain algorithm. It can be seen that by using the max-gain algorithm we achieve our objective of including more dynamic data items as part of larger sub-queries. In comparison, for the min-cost algorithm the most dynamic data item is more likely to be disseminated as a single-item query. This happens because the sumdiff value of a more dynamic data item will be high; thus, in each iteration of the greedy algorithm (Figure 4), there is less chance of selecting a sub-query with a more dynamic data item. Thus, it is very likely that the most dynamic data item will be disseminated as a single-item sub-query, resulting in bad performance of the client query. For the max-gain and min-cost algorithms, similar results were obtained for larger query sizes as well, as shown in Figure 6(b). For generating the results of Figure 6(b), we simulated … data aggregators, each disseminating … data items, while each query had 5 data items.

Figure 6: Effect of data sumdiff on sub-query size: (a) Query size=3, (b) Query size=5

5.2.2 Effect of correlation between data dynamics

To measure the effects of correlation between data dynamics (as measured using the correlation measure $\rho$) on the query performance, we compared the query performance with the case when all the data items are assumed to be independent (i.e., $\rho = 0$). For performing these experiments, we constructed synthetic data traces (each with sumdiff = …) so that the values of $\rho$ for various data item pairs were distributed uniformly between … and …. Then … DAs were simulated so that each DA can disseminate … data items. … queries were generated, each with … data items. In this case, each query will get executed with … sub-queries of … data items each. The combination of sub-queries will be decided based on the correlation between data items (sumdiff values of all the data items were the same). We found that by considering the correlation measure, the number of refreshes reduces by approximately …%. This result indicates that for optimal query planning, data dynamics and incoherency bound allocation may be more important factors than the correlation measure.

5.2.3 Effect of query satisfiability parameter


To simulate the situation where the selected aggregators may not be able to satisfy the query incoherency bounds, we modified the simulation setup described above to set the minimum data incoherency bounds which DAs can satisfy to be between … and …. The value of $\alpha$ was varied between … and …. The case $\alpha = 0$ corresponds to the algorithm without dealing with query satisfiability. Figure 7 shows the number of unanswerable queries as the value of $\alpha$ is varied. As shown in the figure, as the value of $\alpha$ is increased, the percentage of unsatisfied queries decreases for various values of query incoherency bounds. Due to the changed data incoherency bounds of DAs, we found that …% of queries cannot be satisfied even by the data aggregators with the tightest data incoherency bounds. At the query incoherency bound of …, …% of queries cannot be satisfied by the optimally selected data aggregators, but as we increase the value of $\alpha$ to …, only …% of queries are unanswered.

The value of $\alpha$ can be chosen to balance the performance and satisfiability of queries. For example, a network of data aggregators may aim at query satisfiability of …% for a given distribution of query incoherency bounds. If at any time query satisfiability is below the target, the value of $\alpha$ can be increased, whereas in case of over-achieving the target, the value of $\alpha$ can be decreased to improve the query performance.
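The adjustment of the tuning parameter described above can be sketched as a simple feedback rule. This is only one possible reading of the text: the fixed step size, the lower limit of zero, and the function name are assumptions made for illustration, and the text does not prescribe a particular update rule.

```python
def adjust_alpha(alpha, satisfied, total, target, step=0.1, alpha_min=0.0):
    """Nudge alpha toward a target fraction of satisfiable queries.

    satisfied / total is the observed query satisfiability; `step` and
    `alpha_min` are hypothetical knobs, not values from the paper.
    """
    if total == 0:
        return alpha  # nothing observed yet, leave alpha unchanged
    rate = satisfied / total
    if rate < target:
        return alpha + step  # favour aggregators with tighter bounds
    if rate > target:
        return max(alpha_min, alpha - step)  # favour lower execution cost
    return alpha
```

Such a rule would be applied periodically, using the satisfiability observed over the most recent batch of queries.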

Figure 7: Effect of $\alpha$ on query satisfiability

5.3 Overheads of Query Planning

Now we report the time overheads for various query planning operations. We measured these costs by varying the number of data items being disseminated by the network between … and …. These experiments were done on a Windows XP machine with a … GHz Intel Core Duo CPU and … GB RAM. For the various sumdiff based algorithms, we need to maintain the sumdiff values of various data items (proportional to the number of data items being disseminated) and the correlation measure for each pair of data items (proportional to the square of the number of data items), in addition to the query dependent planning cost. For a trace size of … for each data item, both the cost of maintaining the sumdiff per data item and the cost of maintaining the correlation measure for each pair of data items were found to be in the range of … microseconds. The query planning cost (time required to derive sub-queries and their associated incoherency bounds) for the naïve and optc algorithms was found to be approximately … microsecond per query, whereas the same for the random, min-cost and max-gain algorithms was found to be …, … and … milliseconds. The higher cost of query planning for the sumdiff based algorithms is justified by the savings we achieve in terms of the number of messages for the whole duration of the continuous query. The query planning cost of random and min-cost is higher as they require more iterations of the algorithm in Figure 4 (i.e., more sub-queries) compared to the max-gain algorithm.

6 QUERY PLANNING FOR MAX QUERIES

In this section, we briefly describe the optimal query planning for MAX queries. MIN queries can be handled in a similar manner. A MAX query, where a client wants the maximum of a specified set of data item values, can be written as:

$$V_q(t) = \max(v_{qi}(t),\ 1 \le i \le n_q)$$

For MAX queries, the relationship between the query incoherency bound and the required data incoherency bounds is discussed in the literature […]. According to one such formulation, if the network of aggregators can ensure that the i-th data item has incoherency bound $C_i$, then the following condition ensures that the query incoherency bound $C_q$ is satisfied:

$$C_i \le C_q,\ \forall i,\ 1 \le i \le n_q$$

In these queries, even if the values of one or more data items change (changing their individual incoherencies), it is possible that the query incoherency remains unchanged. Thus, for a given MAX query, it is possible to have an individual data (or sub-query) incoherency bound which is more than the query incoherency bound. But such an incoherency bound will depend on instantaneous values of data items, thus changing very dynamically. In this paper, we do not consider such data value dependent incoherency bounds.

Figure 8: Performance of MAX queries: (a) Comparison of algorithms, (b) Effect of data dynamics order on performance

6.1 Query Cost Model

Let us consider a query Q = MAX(A, B), which is used for disseminating the max of data items A and B from a data aggregator. Let the sumdiff values of A and B be $R_a$ and $R_b$, respectively. For a MAX query, the query result is the maximum of the data item values. Thus, the query dynamics is decided as per the dynamics of the data item with the maximum value. Hence, the query sumdiff is nothing but a weighted average of data sumdiffs, weighted by the fraction of time when the particular data item is maximum:

$$R_q = \sum_{i=1}^{n_q} R_i \left( \prod_{j=1,\, j \ne i}^{n_q} p(x_i > x_j) \right) \le \max(R_i \mid 1 \le i \le n_q)$$

where $p(x_i > x_j)$ is the probability that the value of the i-th data item is more than the value of the j-th data item. We have the values of data item sumdiffs, but for getting the probabilities we need to have exact values of data items. As a query plan dependent on individual data values (instead of data dynamics) will be too volatile, as a first approximation we use the upper bound of the above expression as the query sumdiff. The approximation used is the maximum of the sumdiffs of the data items involved. Now we consider the optimized execution of MAX queries using the above mentioned query cost model.

6.2 Optimized Execution

To execute the MAX query using a network of data aggregators, we assign sub-queries to different DAs. Each sub-query is a MAX query over a subset of the query data items. For optimal planning, we need to minimize the sum of sub-query execution costs. As we assign the same incoherency bound to all the sub-queries (equal to the query incoherency bound, as per the condition $C_i \le C_q$ above), we just need to minimize the sum of sub-query sumdiff values.

6.2.1 Optimal query planning problem is NP-hard

The optimal query planning problem for MAX queries is NP-hard. This can be proved by mapping the set cover problem to this optimal query planning problem.

Set cover problem: Given a universe U and a family S of subsets of U, a cover is a subset ($Cover \subseteq S$) of sets whose union is U. In the set covering optimization problem, the task is to find a set covering which uses the fewest sets.

We can map the set cover problem to our query planning problem. The MAX query corresponding to the set cover problem will be the max of all the items in the universe U at an incoherency bound of …. For each set $s \in S$, we assume the existence of a DA disseminating all the elements of s at an incoherency bound of …. Further, let all data items have a sumdiff value of …. From the query cost model we can see that the cost of any sub-query will be 1. Thus, the cost of the client query, which is the sum of the costs of its sub-queries, will be the same as the number of subsets required to get the set cover. It is easy to see that if we can solve the query planning problem optimally, we can also solve the set cover problem optimally. Thus, we now give greedy heuristics for the sub-query selection problem.

6.2.2 Greedy Heuristics

We use the greedy algorithm given in Figure 4 for solving the query planning problem with a different set of sub-query selection criteria (ψ), like the ones described in Sections 4.3.1 and 4.3.3. In the min-cost heuristic, we select the sub-query having the minimum sub-query sumdiff per data item. For the MAX query, the sub-query sumdiff is nothing but the sumdiff of the most dynamic data item in the sub-query. Thus, for the max-gain heuristic, the gain of each sub-query is calculated as:

$$G = \sum_{d_{qi}} R_{d_{iq}} - \max(R_{d_{iq}})$$

where $R_{d_{iq}}$ is the sumdiff of the i-th data item of the query q.

6.2.3 Simulation results

Figure 8(a) shows simulation results for MAX queries for the various algorithms outlined in Section 5.1. We have not used the optc algorithm here, as all data items have to be served at the query incoherency bound without any optimization in the incoherency bound allocation. The naïve algorithm requires more than … times the messages compared to the other, efficient sub-query based algorithms. Other results are qualitatively similar to what we obtained for the additive queries, with one difference. For both types of queries, the max-gain algorithm works better than the min-cost algorithm, but unlike in additive queries, in the case of MAX queries the performance of the min-cost algorithm is closer to that of the random algorithm than to that of the max-gain algorithm. This is a surprising result, considering that min-cost is the most natural candidate for the set cover problem, with an approximation guarantee of $\log n_q$ […]. For MAX queries, the sub-query cost depends on the most dynamic data item. Thus, we modified the greedy algorithms by considering the data items in descending order of sumdiffs. For example, in the max-gain algorithm, we first calculate the gains of the sub-queries covering the data item having the maximum sumdiff. We select the one with the maximum gain. We repeat the step for the next most dynamic data item, and so on. Figure 8(b) shows that with this modified greedy approach, the performance of the min-cost and max-gain algorithms is almost the same. This can be explained as follows: with the data item ordering enforced, in the min-cost heuristic also we ensure that the most dynamic data item is part of a lower cost sub-query, leading to a better query plan.
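The MAX-query heuristics of Section 6.2.2, together with the ordered variant described in the simulation results, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the sub-query sumdiff uses the upper-bound approximation (the maximum of the item sumdiffs), ties are broken by dictionary order, and the function names and data layout are hypothetical.

```python
def max_subquery_sumdiff(items, sumdiffs):
    """Approximate sumdiff of a MAX sub-query: that of its most dynamic item."""
    return max(sumdiffs[d] for d in items)


def max_query_gain(items, sumdiffs):
    """G = sum of item sumdiffs minus the maximum item sumdiff."""
    return sum(sumdiffs[d] for d in items) - max_subquery_sumdiff(items, sumdiffs)


def ordered_greedy_plan(query_items, sumdiffs, aggregators):
    """Modified max-gain greedy: visit items in descending order of sumdiff.

    aggregators: dict aggregator -> set of data items it can disseminate.
    Returns {aggregator: set of query items assigned to it}.
    """
    remaining = set(query_items)
    plan = {}
    for d in sorted(query_items, key=lambda x: -sumdiffs[x]):
        if d not in remaining:
            continue  # already covered by an earlier sub-query
        # Candidate sub-queries that cover d, restricted to uncovered items.
        options = {a: remaining & set(items)
                   for a, items in aggregators.items() if d in items}
        if not options:
            raise ValueError(f"no aggregator disseminates {d}")
        best = max(options, key=lambda a: max_query_gain(options[a], sumdiffs))
        plan[best] = options[best]
        remaining -= options[best]
    return plan
```

The estimated cost of the resulting plan is the sum of `max_subquery_sumdiff` over its sub-queries, since all sub-queries share the query incoherency bound.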

7 RELATED WORK
:H GLYLGH WKHUHODWHG ZRUN RQ VFDODEOH DQVZHULQJRI DJ
JUHJDWLRQTXHULHVRYHUDQHWZRUNRIGDWDDJJUHJDWRUVLQWR
WZRLQWHUUHODWHGWRSLFV
$QVZHULQJ,QFRKHUHQF\%RXQGHG$JJUHJDWLRQ4XHULHV
9DULRXV PHFKDQLVPV IRU HIILFLHQWO\ DQVZHULQJ LQFR
KHUHQF\ ERXQGHG DJJUHJDWLRQ TXHULHV RYHU FRQWLQXRXVO\
FKDQJLQJGDWDLWHPVDUHSURSRVHGLQWKHOLWHUDWXUH>
  @ 2XU ZRUN GLVWLQJXLVKHV LWVHOI E\ HPSOR\LQJ
VXETXHU\EDVHGTXHU\HYDOXDWLRQWRPLQLPL]HQXPEHURI


This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.

GUPTA ET AL.: QUERY PLANNING FOR CONTINUOUS QUERIES IN DYNAMIC DATA DISSEMINATION NETWORKS

UHIUHVKHV 3XOO EDVHG GDWD GLVVHPLQDWLRQ WHFKQLTXHV


ZKHUH FOLHQWV RU GDWD DJJUHJDWRUV SXOO GDWD LWHPV VXFK
WKDW TXHU\UHTXLUHPHQWV DUH PHWDUH GHVFULEHG LQ >@
)RU PLQLPL]LQJ WKH QXPEHU RI SXOOV ERWK SUHGLFW GDWD
YDOXHV DQG SXOO LQVWDQFHV ,Q FRPSDULVRQ ZH XVH SXVK
EDVHG PHFKDQLVP WR UHIUHVK VXETXHU\ YDOXHV DW WKH FOL
HQW ,Q >@ DXWKRUV SURSRVH SXVK EDVHG VFKHPH XVLQJ
GDWDILOWHUVDWWKHVRXUFHV$FFRUGLQJWRWKDWZRUNIRUDQ
DJJUHJDWLRQTXHU\WKHQXPEHURIUHIUHVKPHVVDJHVFDQEH
PLQLPL]HG E\ SHUIRUPLQJ LQFRKHUHQF\ ERXQG DOORFDWLRQ
WR LQGLYLGXDO GDWD LWHPV VXFK WKDW WKH QXPEHU RI PHV
VDJHVIURPGLIIHUHQWGDWDVRXUFHVLVWKHVDPH,QVWHDGZH
H[HFXWH PRUH G\QDPLF GDWD LWHPV DV SDUW RI ODUJHU VXE
TXHULHV ZKLOH RSWLPDOO\ DVVLJQLQJ LQFRKHUHQF\ ERXQGV
:KLOH WKLV PLJKW OHDG WR GLIIHUHQW PHVVDJLQJ RYHUKHDGV
IRUGLIIHUHQW'$VDVRSSRVHGWRZKDWLVSURSRVHGLQ>@
LWGRHVUHVXOWLQPLQLPL]LQJWKHWRWDOQXPEHURIPHVVDJHV
VHQW E\ '$V  /LNH XV DXWKRUV RI >@ DOVR DVVXPH WKDW
GLVVHPLQDWLRQ WUHH IURP VHQVRU QRGHV GDWD VRXUFHV  WR
URRW FOLHQWV  DOUHDG\ H[LVWV DQG WKH\ DOVR LQVWDOO HUURU
ILOWHUVRQSDUWLDODJJUHJDWHV VLPLODUWRLQFRKHUHQF\ERXQG
DVVLJQHGWRVXETXHULHV %XWLQRXUZRUNHDFK GDWD DJ
JUHJDWRUFDQRQO\GLVVHPLQDWHGDWDDWVRPHSUHVSHFLILHG
LQFRKHUHQF\ ERXQG GHSHQGLQJ RQ LWV FDSDELOLW\ ZKHUHDV
VXFKDFRQVWUDLQWGRHVQRWH[LVWIRU>@)XUWKHUZHDOVR
JLYHDPHWKRGWRVHOHFWSDUWLDODJJUHJDWHV VXETXHULHV WR
EHXVHGIRUDQVZHULQJWKHTXHU\
In [], the authors propose cost-based methods to create an in-network aggregation tree consisting of the query node (where an aggregation query is invoked), which is the root of the aggregation tree, and the sensors. The authors of [] propose combinations of the number of hops and the remaining energy to select a particular path from the various options available between any two nodes. Mapping their problem to the optimal query planning discussed in this paper, each communicating node can work as a data source as well as a data aggregator. Each node can select sub-queries based on their sumdiff values, using the principles outlined in this paper, to minimize the number of message transfers in the network.
In [], the authors use data histograms to optimally assign local thresholds at monitoring sites for threshold monitoring at a central site. Maintaining a histogram is a tedious task with more space and time overhead compared to the sumdiff-based mechanism. The authors of [] also use Chebyshev's inequality to show that the expected communication cost is inversely proportional to the square of the error budget. But compared to our work, they assume that the number of refresh messages is proportional to the data variance. As we explained earlier in the paper, our sumdiff measure takes the order of data value changes into account, which variance does not. Spatial and temporal correlations between sensor data are used to reduce data refreshes in []. We also consider correlation (in terms of a correlation measure) between data items, but we use it for dividing the client query into sub-queries. A method of assigning clients' data queries to aggregators in a content distribution network is given in []. They do for individual data items what we do for queries consisting of multiple data items.
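The point that variance ignores the order of value changes can be checked with a small sketch. It assumes, for illustration, that sumdiff is the sum of absolute differences between consecutive data values; the precise definition is the one given earlier in the paper, and the two series below are hypothetical.

```python
from statistics import pvariance


def sumdiff(values):
    """Sum of absolute differences between consecutive values.

    Assumption: this is an illustrative reading of the sumdiff measure,
    not a verbatim copy of the paper's definition.
    """
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


# Two hypothetical series holding the same values in a different order.
smooth = [1, 2, 3, 4, 5]
jumpy = [1, 5, 2, 4, 3]

same_variance = pvariance(smooth) == pvariance(jumpy)  # order does not matter
smooth_dynamics = sumdiff(smooth)
jumpy_dynamics = sumdiff(jumpy)
```

Both series have identical variance, yet the second one changes far more between consecutive samples, so an order-aware measure separates them while variance cannot.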

Construction and Maintenance of Network of Data Aggregators: The authors of [] describe the construction and maintenance of a hierarchical network of data aggregators for providing scalability and fidelity in disseminating dynamic data items to a large number of clients. In these works, fidelity is defined as the fraction of time during which the client coherency requirements are met. Each data aggregator is given client requirements in the form of data items and their respective incoherency bounds. Instead, we use such networks for efficiently answering clients' aggregation queries. One can use client queries to optimally construct a network of data aggregators; on the other hand, one can also use a given network of data aggregators to efficiently answer client queries. The authors of [] deal with the first part, whereas we have studied the second. Changes in data dynamics may lead to reorganization of the network of data aggregators, which in turn may necessitate changes in query plans. It is a chicken-and-egg problem. Reorganization of the aggregator tree should be a longer-term phenomenon (i.e., each incoming query should not lead to tree reorganization), whereas a query plan can change more often depending on data dynamics.
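The definition of fidelity as the fraction of time during which the coherency requirement is met can be sketched as follows. The sketch assumes equally spaced samples of the source value and of the value seen by the client; the function name `fidelity` and the sample data are hypothetical.

```python
def fidelity(source_values, client_values, bound):
    """Fraction of time steps at which |source - client| <= bound.

    Assumption: both sequences are sampled at the same equally spaced
    instants, so a fraction of samples approximates a fraction of time.
    """
    if len(source_values) != len(client_values) or not source_values:
        raise ValueError("need two non-empty sequences of equal length")
    met = sum(
        1 for s, c in zip(source_values, client_values) if abs(s - c) <= bound
    )
    return met / len(source_values)


# Hypothetical trace: the client lags behind the source at the third step.
source = [10.0, 10.2, 10.9, 11.5, 11.6]
client = [10.0, 10.0, 10.0, 11.5, 11.5]
observed = fidelity(source, client, 0.5)
```

Here the requirement is violated at one of five sampling instants, so the observed fidelity is four fifths.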
Instead of optimizing the fidelity of data items at data aggregators, as proposed in [], one can use our work to optimize fidelity all the way up to client queries. Fidelity of a data item can be approximately calculated as the number of dissemination messages multiplied by the total delay in message transmission. The authors of [] assume that each client's data requirements are fulfilled by a single data aggregator. But in that case data aggregators may need to disseminate a large number of data items, which leads to processing a large number of refresh messages and hence to an increase in delay. Thus, each client getting all its data items from a single data aggregator (using a single sub-query) is optimal from the number-of-messages point of view but not necessarily from the query fidelity point of view. Using our work, one can model the expected number of messages for the client query. Thus, our work can complement the work of [] for end-to-end (sources to client) fidelity optimization.

8 DISCUSSION & CONCLUSION


This paper presents a cost-based approach to minimize the number of refreshes required to execute an incoherency bounded continuous query. We assume the existence of a network of data aggregators, where each DA is capable of disseminating a set of data items at their pre-specified incoherency bounds. We developed an important measure of data dynamics in the form of sumdiff which, as discussed earlier in the paper, is a more appropriate measure than the widely used standard deviation based measures. For optimal query execution we divide the query into sub-queries and evaluate each sub-query at a judiciously chosen data aggregator. Performance results show that with our method the query can be executed using less than one third of the messages required by existing schemes. We showed that the following features of the query planning algorithms improve performance:
o Dividing the query into sub-queries (rather than data items) and executing them at specifically chosen data aggregators.
o Deciding the query plan using a sumdiff-based mechanism, specifically by maximizing sub-query gains.
o Executing queries such that more dynamic data items are part of a larger sub-query.
We showed that the max-gain algorithm is very close to the optimal algorithm in selecting sub-queries based on data dynamics (see the corresponding performance figure). The query satisfiability parameter is employed for the trade-off between query satisfiability and query performance. For any value of the query satisfiability parameter, there is always a non-zero probability that a query will not be satisfied by the network of data aggregators. Usually, data aggregators disseminating the same data item form a hierarchical network. In that case, even if a data aggregator cannot satisfy its assigned query, it can again apply the principles outlined in this paper to send a sub-query of the assigned query to its parents (which can disseminate the data item at a tighter incoherency bound). That will lead to poorer performance, which illustrates the trade-off between query satisfiability and performance. Developing efficient strategies for multiple invocations of our algorithm, considering the hierarchy of data aggregators, is an area for future research.
Another area for future research is changing a query plan as data dynamics change. We calculate data sumdiff in a dynamic manner. If data sumdiff changes beyond a certain limit, the chosen query plan may not remain efficient. As a simple scheme, limits on changes to data sumdiff can be found within which the selected query plan remains optimal. Our work can also be used for extending the work proposed in [] for construction and maintenance of a network of data aggregators so that end-to-end (sources to client) fidelity can be maximized. Our query cost model can also be used for other purposes such as load balancing across aggregators, multi-query execution, routing sensor data, etc. Using the cost model for these applications and developing the cost model for more complex queries is the third area of our future work.

REFERENCES
A. Davis, J. Parikh, and W. Weihl, Edge Computing: Extending Enterprise Applications to the Edge of the Internet, WWW.
D. VanderMeer, A. Datta, K. Dutta, H. Thomas, and K. Ramamritham, Proxy-Based Acceleration of Dynamically Generated Content on the World Wide Web, ACM Transactions on Database Systems (TODS), June.
J. Dilley, B. Maggs, J. Parikh, H. Prokop, R. Sitaraman, and B. Weihl, Globally Distributed Content Delivery, IEEE Internet Computing, Sept.
S. Rangarajan, S. Mukerjee, and P. Rodriguez, User Specific Request Redirection in a Content Delivery Network, Intl. Workshop on Web Content Caching and Distribution (IWCW).
S. Shah, K. Ramamritham, and P. Shenoy, Maintaining Coherency of Dynamic Data in Cooperating Repositories, VLDB.
T. H. Cormen, C. E. Leiserson, R. L. Rivest, and C. Stein, Introduction to Algorithms, MIT Press and McGraw-Hill.
Y. Zhou, B. Chin Ooi, and Kian-Lee Tan, Disseminating Streaming Data in a Dynamic Environment: An Adaptive and Cost Based Approach, The VLDB Journal.
Query cost model validation for sensor data. www.cse.iitb.ac.in/~grajeev/sumdiff/RaviVijay_BTP06.pdf.
R. Gupta, A. Puri, and K. Ramamritham, Executing Incoherency Bounded Continuous Queries at Web Data Aggregators, WWW.
A. Papoulis, Probability, Random Variables and Stochastic Processes, McGraw-Hill.
C. Olston, J. Jiang, and J. Widom, Adaptive Filter for Continuous Queries over Distributed Data Streams, SIGMOD.
S. Shah, K. Ramamritham, and C. Ravishankar, Client Assignment in Content Dissemination Networks for Dynamic Data, VLDB.
NEFSC Scientific Computer System. http://sole.wh.whoi.edu/~jmanning/cruise/serve.cgi
S. Madden, M. J. Franklin, J. Hellerstein, and W. Hong, TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks, Proc. of Symposium on Operating Systems Design and Implementation.
D. S. Johnson and M. R. Garey, Computers and Intractability: A Guide to the Theory of NP-Completeness, San Francisco, CA: Freeman.
S. Zhu and C. Ravishankar, Stochastic Consistency and Scalable Pull-Based Caching for Erratic Data Sources, VLDB.
D. Chu, A. Deshpande, J. Hellerstein, and W. Hong, Approximate Data Collection in Sensor Networks using Probabilistic Models, ICDE.
A. Deshpande, C. Guestrin, S. R. Madden, J. M. Hellerstein, and W. Hong, Model-Driven Data Acquisition in Sensor Networks, VLDB.
Pearson Product moment correlation coefficient. http://www.nyx.net/~tmacfarl/STAT_TUT/correlat.ssi
A. Deligiannakis, Y. Kotidis, and N. Roussopoulos, Processing Approximate Aggregate Queries in Wireless Sensor Networks, Information Systems.
G. Cormode and M. Garofalakis, Sketching Streams through the Net: Distributed Approximate Query Tracking, VLDB.
S. Agrawal, K. Ramamritham, and S. Shah, Construction of a Temporal Coherency Preserving Dynamic Data Dissemination Network, RTSS.
B. Babcock and C. Olston, Distributed Top-K Monitoring, SIGMOD.
A. Silberstein, K. Munagala, and J. Yang, Energy Efficient Monitoring of Extreme Values in Sensor Networks, SIGMOD.
N. Jain, D. Kit, P. Mahajan, P. Yalagandula, M. Dahlin, and Y. Zhang, STAR: Self-Tuning Aggregation for Scalable Monitoring, VLDB.
R. Gupta and K. Ramamritham, Optimized Query Planning of Continuous Aggregation Queries in Dynamic Data Dissemination Networks, WWW.
S. Kashyap, J. Ramamritham, R. Rastogi, and P. Shukla, Efficient Constraint Monitoring using Adaptive Thresholds, ICDE.
D. S. Hochbaum, Approximation algorithms for the set covering and vertex cover problems, SIAM Journal on Computing.
P. Edara, A. Limaye, and K. Ramamritham, Asynchronous In-network Prediction: Efficient Aggregation in Sensor Networks, ACM Transactions on Sensor Networks, August.


Rajeev Gupta received his BTech in Electronics Engineering from the Indian Institute of Technology (IIT) Kharagpur, India. He is currently pursuing his PhD in Computer Science at IIT Mumbai, India. He has been working as a Researcher at IBM Research, New Delhi, India, for the last 10 years.

Krithi Ramamritham received his PhD in Computer Science from the University of Utah and then joined the University of Massachusetts. He is currently a Professor in the Department of Computer Science at IIT Bombay. He is a fellow of the IEEE and a fellow of the ACM. He has served on numerous program committees of conferences and workshops. His editorial board contributions include IEEE Transactions, the Real Time Systems Journal, and the VLDB Journal.
