KisungLee RDF Cloud

You might also like

Download as rtf, pdf, or txt
Download as rtf, pdf, or txt
You are on page 1of 18

Efficient and Customizable Data Partitioning Framework for Distributed Big RDF Data Processing in the Cloud

Kisung Lee, Ling Liu, Yuzhe Tang, Qi Zhang, Yang Zhou


DiSL, College of Computing, Georgia Institute of Technology, tlanta, !S {"isung#lee, lingliu}$cc#gatech#e%u, {yztang, &zhang'(, yzhou)*}$gatech#e%u
AbstractBig data business can leverage and benefit from the Clouds, the most o timized, shared, automated, and virtualized com uting infrastructures! "ne of the im ortant challenges in rocessing big data in the Clouds is how to effectivel# artition the big data to ensure efficient distributed rocessing of the data! $n this a er we resent a %calable and #et customizable data P&rtitioning framework, called %P&, for distributed rocessing of big RDF gra h data! 'e choose big RDF datasets as our focus of the investigation for two reasons! First, the (inking " en Data cloud has ut forwards a good number of big RDF datasets with tens of billions of tri les and hundreds of millions of links! %econd, such huge RDF gra hs can easil# overwhelm an# single server due to the limited memor# and CP) ca acit# and e*ceed the rocessing ca acit# of man# conventional data rocessing software s#stems! "ur data artitioning framework has two uni+ue features! First, we introduce a suite of verte*, centric data artitioning building blocks to allow efficient and #et customizable artitioning of large heterogeneous RDF gra h data! B# efficient, we mean that the %P& data artitions can su ort fast rocessing of big data of different sizes and com, le*it#! B# customizable, we mean that the %P& artitions are ada tive to different +uer# t# es! %econd, we ro ose a selection of scalable techni+ues to distribute the building block artitions across a cluster of com ute nodes in a manner that minimizes inter, node communication cost b# localizing most of the +ueries on distributed artitions! 'e evaluate our data artitioning framework and algorithms through e*tensive e* eriments using both benchmark and real datasets! "ur e* erimental results show that the %P& data artitioning framework is not onl# efficient for artitioning and distributing big RDF datasets of diverse sizes and structures but also effective for rocessing big data +ueries of different t# es and com le*it#!

I. I +T,-D!CTI-+
Clou% computing infrastructures are .i%ely recognize% as an attracti/e computing platform for efficient 0ig %ata pro1 cessing 0ecause it minimizes the upfront o.nership cost for the large1scale computing infrastructure %eman%e% 0y 0ig %ata analytics# 2ith Lin"ing -pen Data community pro3ect an% 2orl% 2i%e 2e0 Consortium 425C6 a%/ocating ,D7 4,esource Description 7rame.or"6 8*9 as a stan%ar% %ata mo%el for 2e0 resources, .e ha/e .itnesse% a stea%y gro.th of 0oth 0ig ,D7 %atasets an% large an% gro.ing num0er of %omains an% applications capturing their %ata in ,D7 an% performing 0ig %ata analytics o/er 0ig ,D7 %atasets# 7or e:ample, more than ;< 0illion ,D7 triples are pu0lishe% as of =arch <(>< on Lin"e% Data 8><9 an% a0out *#?* 0illion triples are pro/i%e% 0y the Data1go/ 2i"i 8@9 as of 7e0ruary <(>5# ,ecently the !K 4!nite% King%om6 go/ernment is pu0lishing ,D7 a0out its legislation 8)9 .ith a SA ,QL 4a stan%ar% &uery language for ,D76 &uery interface for its %ata sources 8;9# Ba%oop =ap,e%uce programming mo%el an% Ba%oop Dis1 tri0ute% 7ile System 4BD7S6 are one of the most popular %istri0ute% computing technologies for %istri0uting 0ig %ata processing across a large cluster of compute no%es in the

Clou%# Bo.e/er, processing the huge ,D7 %ata using Ba%oop =ap,e%uce an% BD7S poses a num0er of ne. technical challenges# 7irst, .hen /ie.ing a 0ig ,D7 %ataset as an ,D7 graph, it typically consists of millions of /ertices 4su03ects or o03ects of ,D7 triples6 connecte% 0y millions or 0illions of e%ges 4pre%icates of ,D7 triples6# Thus triples are correlate% an% connecte% in many %ifferent .ays# ,an%om partitioning of 0ig %ata into chun"s through either horizontal 40y triples6 or /ertical partitioning 40y su03ect, o03ect or pre%icate6 is no longer a /ia0le solution 0ecause %ata partitions generate% 0y such a simple partitioning metho% ten% to ha/e high correlation .ith one another# Thus, most of the ,D7 &ueries nee% to 0e processe% through multiple roun%s of %ata shipping across partitions hoste% in multiple compute no%es in the Clou%# Secon%, BD7S 4an% its attache% storage systems6 is e:cellent for managing 0ig ta0le li"e %ata .here ro. o03ects are in%epen%ent an% thus 0ig %ata can 0e simply %i/i%e% into e&ual1size% chun"s .hich can 0e store% in a %istri0ute% manner an% processe% in parallel efficiently an% relia0ly# Bo.e/er, BD7S is not optimize% for processing 0ig ,D7 %atasets of high correlation# Therefore, e/en simple retrie/al &ueries can 0e &uite inefficient to run on BD7S# Thir% 0ut not the least, Ba%oop =ap,e%uce programming mo%el is optimize% for 0atch1oriente% processing 3o0s o/er 0ig %ata rather than real1time re&uest an% respon% types of 3o0s# Thus, .ithout correlation preser/ing %ata

partitioning, Ba%oop =ap,e%uce alone is neither a%e&uate for han%ling ,D7 &ueries nor suita0le for structure10ase% reasoning on ,D7 graphs# 2ith these challenges in min%, in this paper, .e present a Scala0le an% yet customiza0le %ata A rtitioning frame.or", calle% SA , for %istri0ute% processing of 0ig ,D7 graph %ata# -ur %ata partitioning frame.or" has t.o uni&ue features# 7irst, .e intro%uce a suite of /erte:1centric %ata partitioning 0uil%ing 0loc"s, calle% e:ten%e% /erte: 0loc"s, to allo. efficient an% yet customiza0le %ata partitioning of large heterogeneous ,D7 graphs 0y preser/ing the 0asic /erte: structure# Cy efficient, .e mean that the SA %ata partitions can support fast processing of 0ig %ata of %ifferent sizes an% comple:ity# Cy customiza0le, .e mean that one partitioning techni&ue may not fit all# Thus the SA partitions are 0y %esign a%apti/e to %ifferent %ata processing %eman%s in terms of structural correlations# Secon%, .e propose a selection of scala0le par1 allel processing techni&ues to %istri0ute the structure% /erte: 0loc"10ase% partitions across a cluster of compute no%es in a manner that minimizes inter1no%e communication cost 0y localizing most of the &ueries to in%epen%ent partitions an% 0y ma:imizing intra1no%e processing# Cy partitioning an% %istri0uting 0ig ,D7 %ata using structure% /erte: 0loc"s, .e can consi%era0ly re%uce the inter1no%e communication cost of comple: &uery processing 0ecause most ,D7 &ueries can

http

type

Proceedings

1-108 Journal 1

Journal

star

Journal

SELECT ?yr #

!E"E

bi-vertex

% Pr 1 o oc l 1 u & e

$nProc1

% ''' o Jo typ e 1 l ur Proc 1 u n ? & e al (o 1 ur

na l

? ( o u r n a l r d ) * t y p e J o u r n a l +
type

bloc k

? ( o u r n a l t i t l e , J o u r n a l 1 , 1-.0 $ n / 1 $nPro -0 title c1 P 0 r o 1 c * e e di n g s 1-.0 45 3 ' c a Per g' r o http t son $n r i m 333 Pro 1 ti c pl c 1 cl cr e l e e x ea 1
to r

+ ?(ournal issued ?yr 2 nP c 333 'rticle

?yr Jo urn al 1

ae r 1 & r e es a t o o n r

SEL EC T 6$S T$7 CT ? type pers on ? ? na& na& ep e e ? ! nr E" as E# o & ? en inpr oc rd)* type $npr ocee ding s ? inpr oc crea tor ? pers on + ? pers on na& e? na& e+ t i t l e n o t e

P e r s o n

Pn

aP e& e e

r r s s o o n 1 n

$n Pr oc1 inve rte x

3 3 3

Pers on/ 4 1 P 1 0 e a r 8s s tr o ac n t

$n type Pr oc ee di ng s

? ? inp per roc son


typ e Per son 2

Person outv er te x bl o c

1 k Prsc'rticle o
r

< (a)(b) (c) Differ =

D:ample D:ample ,D7


" 6 > St or ag e Sy st e &

0 e e / a l u a t e % l o c a ll y o n a p a rt iti o n s e r / e r . it h o u t r e & u ir i n g % a t

6 a t a 7 o d e 9 P ue 3 ry u $n te r) ac e + : + E ;e + cu to < r =

cu st o mi za 0l e for pa rtit io ni ng an % %i str i0 uti ng 0i g , D 7 %a ta se ts of %i /e rs e siz es an % str uc tur es , an % eff ec ti/ e for pr oc es si ng

real1 time ,D7 &ueri es of %iffer

r e % i c a a n s t e , a s D 0 n 7 e a t , s % . D a S e 7 l t A e a a - n g 0 s r e e t i a l t r t p e c i s h % o p n l s . e s e u i % i s 0 t g s 6 3 h e t . e s s i c s t t u c o h a 0 o f n 3 n 4 t % e n s h c e u e o t c 0 0 s t 3 p 3 i e r e a n c e c n g t % t % , i # a p c o r a n 0 p e t 3 a % e , e i i D c r c r 7 t o a e s f t p % s e r a a u , e t s 0 o s a 3 0 e s / e 3 n e e c e ent t t r t types c 1 c t a t an% i a i n compl 6 n n c % e:ity# g e t 0 s o r a e 0 i a 3 A. R p r % n e DF l an e e e % c l p t d s a i p # SP A R Q L

4 o r s o 1 c a l l e %

t i o n s h i p

c t e %

D a c h e % g e i s % i r e c t e % , e m a n a t i n g f r o m i t s s u 0 3 e c t / e r t e : t o i t s o 0

3ect /erte: # 7ig# >4a6 sho.s an e:am ple ,D7 graph 0ase% on the struct ure of SA<C

ench 4SA ,QL Aerf orma nce Cenc hmar "6 8<(9#

SA ,QL is a SQL1 li"e stan% ar% &uery langu age for ,D7 recom men% e% 0y 25C# =ost SA , QL &uerie s consis t of multip le triple patter ns, .hich are simila r to ,D7 triples e:cep t that in each triple

patter n, the su03ec t, pre%ic ate an% o03ect may 0e a /aria0l e# 2e categ orize SA , QL &uerie s into t.o typesE star an% compl e:, 0ase% on the 3oin chara cteristi cs of triple patter ns# Star &uerie s consis t of su03ec t1 su03ec t 3oins an% each 3oin /aria0l e is the su03ec t of all the triple patter ns in/ol/ e%# 2e refer to the remai ning &uerie s as compl

ex &ueri es# 7ig# >406 sho. s t.o e:a mple SA ,QL &uer y grap hs# The first is a star &uer y re&u estin g the year of pu0li catio n of Four nal > an% the seco n% is a com ple: &uer y re&u estin g the nam es of all pers ons .ho are an auth or of at least one pu0li catio n of inpr oce e%in

g type# SA ,QL &uery proce ssing can 0e /ie. e% as fin%in g matc hing su0gr aphs in the ,D7 grap h .her e ,D7 terms from those su0gr aphs may 0e su0st itute% for the &uery /aria 0les#

frame.or" SA # The first prototype of SA is implemente% on top of Ba%oop =ap,e%uce an% BD7S# The Ba%oop SA frame.or" consists of one coordinator an% a set of worker no%es 4G=s6 in the SA cluster# The coor%inator ser/es as the +ame+o%e of BD7S an% the Fo0Trac"er of Ba%oop =ap,e%uce an% each .or"er ser/es as the Data+o%e of BD7S

7ig# <E SA rc hite ctur e

B. S
y st e m A rc hi te ct ur e 7ig # < s"etc hes the archit ectur e of our ,D7 %ata partiti oning

an% the Tas"Trac"er of Ba%oop =ap,e%uce# To efficiently store the generate% partitions, .e utilize an ,D71specific storage system on each .or"er no%e# The core componen ts of our ,D7 %ata partitionin g frame1 .or" are the partition 0loc" generator an% %istri0utor# The generator uses a /erte:1centric approach to construct partition 0loc"s such that all triples of each partition 0loc" are resi%ing in the same .or"er no%e# 7or a 0ig ,D7 graph .ith a huge num0er of /ertices, .e nee% to carefully %istri0ute all gener1 ate% partition 0loc"s across a cluster of .or"er no%es using a partition %istri0ution mechanism#

2 e p r o / i % e a n e ff i c i e n t % i s t ri 0 u t e % i m p l e m e n t a ti o n o f o u r % a t a p a r ti ti o n i n g s y s

t e m o n t o p o f B a % o o p = a p , e % u c e a n % B D 7 S #

t h i s s e c ti o n . e % e s c ri 0 e t h e S A

% a t a p a r ti ti III. o n i n g f r a m e . o r " , f o c u s i n g o n t h e I t

. o c o r e c o m p o n e n t s E c o n s tr u c ti n g t h e p a rt it i o n 0 l o c " s a n % % i s tr i 0 u ti n g t h e m a c r

oss multi ple .or" er no%e s# In or%e r to pro/i %e partit ionin g mo% els that are cust omiz a0le to %iffer ent proc essi ng nee% s, .e %e/i se thre e type s of partit ion 0loc "s 0ase % on the /erte : struc ture an% the /erte : acce ss patte rn# The goal of cons tructi ng partit ion 0loc

"s is to assi gn all triple s of each partit ion 0loc " to the sam e .or" er no%e in or%e r to supp ort effici ent &uer y proc essi ng# In ter m s of eff ici en t &u er y pr oc es sin g, th er e ar e t. o %if fer en t type s of proc essi ngE intra -VM proc

e s s i n g

n t h a t l o c a l l y

a a n % & s u e i e a n r r t y c e h r Q i n V c g M a n t p h r 0 e o e 1 s f u c u 0 e l g s l r s y a i p n e h g : s # e c m C u a y t t e c i % h n i t i n r n g a 1 p t G a h = r e a p l t r l r o e i c l p e l s o e s n i p n e a g a t , c t h e . r e G n =s m e 0 o a y f

a Q t , o r . i s t i h m o p u l t y a s n e y n % c s o o Q r % t i o n a a t l i l o n . o f r r " o e m r o n n o e % e G s = 4 t G o = s a 6 n , o t . h i e t r h # o u T t h e u s c i o n o g r % B i a n %

o o p , a n % t h e n m e r g e s t h e p a r t i a l r e s u l t s r e c e i / e % f r o m a l l G = s t

o gen erat e the final resul ts of Q# Cy inter 1G= proc essi ng, .e mea n that a &uer y Q as a .hol e cann ot 0e e:ec ute% on any G=, an% it nee %s to 0e %eco mpo se% into

a set of su0& uerie s such that each su0& uery can 0e e/al uate % 0y intra1 G= proc essi ng# Thus , the proc essi ng of Q re&ui res multi ple roun %s of coor %inat ion an% %ata trans fer acro ss the

cluster of .or"ers using Ba%oop# In contrast to intra1G= processing, inter1G= communication cost can 0e e:tremely high, especially .hen the num0er of su0&ueries is not small an% the size of interme%iate results to 0e transferre% across the net.or" of .or"er no%es is large#

Proc1 title 45 http $nProc 1

$nProceedings 1*66 1-.0


creator

1*66 Journal1

$nProc 1

'rticle

A. o
ns tru cti n! "x te nd ed Ve rte x #l oc ks 2 e c e n t e r o u r % a t a p a r t i t i o n i n g o n a / e r t e :

4'r tic le 1 a8st ract 1 1 0

title note

Cy /erte:1centric, it means that .e construct a /erte: 0loc" for each /erte:# Cy e:ten%ing a /erte: 0loc" to an e:ten%e% /erte: 0loc" .e can assign more triples .hich are close to efficient &uery processing# for

7ig# 5E <1hop e:ten%e% 0i1/erte: 0loc" of HAerson>I

{$out|$out V% 4$% $out6

" }# Similarly, in,verte* in block


out

%efine% as V #

in "

in "

J 4V$ % "$ % K
V in in J

Cefore .e formally %efine the concept of /erte: 0loc" an% the concept of e:ten%e% /erte: 0loc", .e first %efine some 0asic concepts of ,D7 graphs# De&inition >' 4,D7 Graph6 n ,D7 graph is a %irecte%,
$

in {$ }" in {$ |$ V% 4 $ % $ 6 } # Thus .e %efine

6 such % that l $ $ in
in

i n

the verte* block of $ as the com0ination of 0oth out1 /erte: 0loc" an% in1/erte: 0loc", namely the 0i1/erte: 0loc", an% is (i formally (i (i %efine% as V #

2 r e y r e / f e e r r t t e o : / e 0 r t l e o : c $ " a s t h e a n c h o r / e r t e : o f t h e / e r t e : 0 l o c " c e n t e r e % a t h a s a n a n c h o r / e r t e : # 2 e r e f e r t o t h o s e

i n a / e r t e : 0 l o c " , . h i c h a r e n o t t h e a n c h o r / e r t e : , a s t h e ( o r d

J 4V$ % "$ % K"(i % l"(i 6 such that


V
( i
$ $

la0ele% multigraph, %enote% as ) J 4V% "% K"

(i (i

out in

% l" 6 .here V

( i
$

J {$} {$ |$

V% 4$% $

6 "$

% $6

"$

or 4$ }#

is a set of an% " is a multiset of/ertices %irecte% e%ges 4i#e#, or%ere% pairs of /ertices6# %irecte% e%ge 4u% $ " %enotes a6 triple in the ,D7 mo%el from su03ect u to o03ect $# K" is a set of a/aila0le

la0els 4i#e#, pre%icates6 for e%ges an% l" is

J {4$% o6.e |4$% %efine o6 "} # Similarly, a set ofo(*ect e%ges/erte: 4triples6 .hose is $ as the in,tri les of a /erte: $ V , %enote% 0y $ J {4s% $6|4s% $6 "}#les 2eof %efine bi, tri a /erte: $ V as the union of its out,
tri les an% in,tri les, %enote%

%efine .hose a set of e%ges 4triples6 su(*ect /erte: is $ $ as the out,tri les out of /erte: , %enote% 0y "

a map from an e%ge to its la0el 4" K" 6# In ,D7 %atasets, multiple triples may ha/e the same su03ect an% o03ect# Thus " is a multiset instea% of a set# 2e consi%er that t.o triples in an ,D7 %ataset are correlate% if they share the same su03ect, o03ect or pre%icate# 7or simplicity, only /erte:10ase% correlation is consi%ere% in this paper# Thus .e only consi%er three types of correlate% triples as follo.sE De&inition <' 4Different types of correlate% Let J4 V% "% K" % 7or ltriples6 6 0e " an ) ,D7 graph#
$

"

each .e /erte: $ V , in

/ e r $ t # i c D e / s e

e l r u s / e e / r e t r i t c e e : s 0 l o o f c " t s h t i o s r e / f e e r r t t e o : a ll 0 t l h o r c e " e # t I y n p e t s h o e f / r e e r s t t e : o 0 f l o t c h " e s a p n a % p u e s r e 0 . i1 e / e . r i t l e

: 0 l o c " t o r e f e r t o t h e c o m 0 i n a ti o n o f i n 1 / e rt e : 0 l o c " a n % o u t1 / e rt e : 0 l o c " #

>4 : c6 sh 0 o l .s o thr c ee " %if fer i en n /e t rte h : e 0l oc "s s of a a m /e e rte : p A a e r r t s i o t n i > o # n , C y . e l o c c a a n t i e n f g f i a c l i l e n t t r l i y p l p e r s o c o e f s s a a / l 7 e l i r g t & # e u

eries .hich re&ue st only triple s %irect ly conn ecte% to a /erte : 0eca use those &ueri es can 0e proce sse% using intra1 G= proce ssing# 7or e:am ple, to proce ss the star &uery in 7ig# >406, .e can run the &uery on each G= .itho ut any coor% inatio n .ith other G=s 0eca use it is guara ntee% that all triple

s conn ecte% to a /erte : .hich can 0e su0sti tute% for the /aria 0le L 3our nal are locate % in the same G=# D/ en tho ugh the /ert e: 0lo c"s are 0en efic ial for star &ue ries , .hich are com mon in practi ce, there are some ,D7 %atas ets in .hich chain &ueri es or more compl e: &ueri es may re&ue

st 0y " J" " # $ $ $ +o. .e %efine the concept of $ertex (lock, the 0asic 0uil%ing 0loc" for graph partitioning# /erte: 0loc" can 0e represente% 0y a /erte: ID an% its connecte% triples# n intuiti/e .ay to construct the /erte: 0loc" of a /erte: $ is to inclu%e all connecte% triples 4i#e#, 0i1 triples6 of $ regar%less of their %irection# Bo.e/er, for some ,D7 %atasets, all their &ueries may re&uest only triples in one %irection from a /erte:# 7or e:ample, to e/aluate the first SA ,QL &uery in 7ig# >406, the &uery processor nee%s only outtriples of a /erte: .hich may 0e su0stitute% for /aria0le L 3ournal# Since one partitioning techni&ue cannot fit all, .e nee% to pro/i%e customiza0le options that can 0e efficiently use% for %ifferent ,D7 %atasets an% SA ,QL &uery types# In a%%ition to &uery efficiency, partition 0loc"s shoul% also minimize the triple re%un%ancy 0y consi%ering only necessary triples an% re%ucing the %is" space of generate% partitions# This moti/ates us to intro%uce three %ifferent .ays of %efining the concept of /erte: 0loc"s 0ase% on the e%ge %irection of triples# De&inition 5'"% 4Gerte: Let ) J 4V% K" %"ut, l"0loc"6 6 0e an ,D7 graph#

(i

out in

verte* block of a /erte: $ V is a of $ su0graph of ) .hich consists

an% its out1triples, %enote% 0y

o u t V #

o u u t o t

more triples that are 0eyon% those %irectly connecte% triples# Consi%er the secon% &uery in 7ig# >406# Clearly there is no guarantee that any triple connecting from Lperson to Lname is locate% in the same partition .here those connecte% triples of /erte: L inproc are locate%# Thus inter1G= processing may 0e re&uire%# This moti/ates us to intro%uce the concept of k1 hop e:ten% e% /erte: 0loc" 4k >6# Gi/en a /erte: $ V in an ,D7 graph ), k1hop extended $ertex (lock of $ inclu%es not only %irectly connecte% triples 0ut also all near0y triples .hich are in%irectly connecte% to the /erte: .ithin k1hop ra%ius in )# Thus, a /erte: 0loc" of $ is the same as the +-hop e:ten%e% /erte: 0loc" of $# Similar to the /erte: 0loc", .e %efine three %ifferent .ays of %efining the e:ten%e% /erte: 0loc" 0ase% on the e%ge %irection of triplesE k-hop extended out-$ertex (lock, k-hop extended in-$ertex (lock an% k-hop extended (i-$ertex (lock# Similarly, gi/en a k1hop e:ten%e% /erte: 0loc" anchore% at $, .e refer to those /ertices that are not the anchor /erte: of the e:ten%e% /erte: 0loc" as the 0or%er /ertices# 7or e:ample, 7ig# 5 sho.s the <1hop e:ten%e% 0i1/erte: 0loc" of /erte: Aerson># Due to the space limitation, .e omit the formal %efinitions of the three types of the k1hop extended $ertex (lock in this paper# out 2e 0elo. %iscuss 0riefly ho. to set the system1 %efine%

J 4V $ % " $ % $ K"out % l"out 6 such that V$ J {$} $

parameter k# 7or a gi/en ,D7 %ataset, .e can %efine k 0ase% on the common types of &ueries .e .ish to pro/i%e fast e/aluation# If the ra%ius of most of such &ueries is 5 hops or less, .e can set k J 5# The higher k /alue is, the higher %egree of triple replication each %ata partition .ill ha/e an% thus more &ueries can 0e e/aluate% using intra1G= processing# In fact, .e fin% that the most fre&uent &ueries ha/e > hop or < hops as their ra%ius# -nce the system1%efine% k /alue is set, in or%er to %etermine if a gi/en &uery can 0e e/aluate% using intra1G= processing or it has to pay inter1G= processing cost, all .e nee% to %o is to chec" .hether e/ery /erte: in the correspon%ing &uery graph can 0e co/ere% 0y any k1hop e:ten%e% /erte: 0loc"# Concretely, if there is any /erte: in the &uery graph .hose k1 hop e:ten%e% /erte: 0loc" inclu%es all triple patterns 4e%ges6 of the &uery graph, then .e can say that the &uery can 0e e/aluate% using intra1G= processing un%er the current k1hop %ata partitioning scheme# In the first prototype of SA system, .e implement our ,D7 %ata partitioning schemes using Ba%oop =ap,e%uce an% BD7S# 7irst, users can use the SA configuration to set some system1%efine% parameters, inclu%ing the num0er of .or"er no%es use% for partitioning, say n, the type of e:ten%e% /erte: 0loc"s 4out1/erte:, in1/erte: or 0i1/erte:6 an% the k /alue of e:ten%e% /erte: 0loc"s# -nce the selection is entere%, the SA %ata partitioner .ill launch a set of Ba%oop 3o0s to partition the triples of the gi/en %ataset into n partitions 0ase% on the %irection an% the k /alue of the e:ten%e% /erte: 0loc"s# The first Ba%oop 3o0 rea%s the input ,D7 %ataset store% in BD7S chun"s# 7or each chun", it e:amines triples one 0y one an% groups triples 0y t.o partitioning parametersE 4i6 the type of e:ten%e% /erte: 0loc"s 4out1/erte:, in1/erte: or 0i1 /erte:6, .hich %efines the %irection of the k1hop e:tension on the input ,D7 graphM 4ii6 the k /alue of k1hop e:ten%e% /erte: 0loc"s# If the parameters are set to generate >1hop e:ten%e% (i-$ertex 0loc"s, each triple is assigne% to t.o e:ten%e% /erte: 0loc"s 4or one 0loc" if the su03ect an% o03ect of the triple is the same entity6# This can 0e /ie.e% as in%e:ing this triple 0y 0oth its su03ect an% o03ect# Bo.e/er, if the parameters are set to generate >1hop e:ten%e% out-$ertex or in-$ertex 0loc"s, each triple .ill 0e assigne% to one e:ten%e% /erte: 0loc" 0ase% on its su03ect or o03ect respecti/ely# 2e 0elo. %escri0e ho. the SA %ata partitioner ta"es the input ,D7 %ata an% uses a cluster of compute no%es to partition it into n partitions 0ase% on the settings of <1hop e:ten%e% out-$ertex 0loc"s# In our first prototype, the SA %ata partitioner is implemente% in t.o phasesE Ahase > generates /erte: 0loc"s or >1hop e:ten%e% /erte: 0loc"s, an% Ahase < generates k1hop e:ten%e% /erte: 0loc"s for k , ># If k J >, only Ahase > is nee%e%# In the follo.ing %iscussion, .e assume that the ,D7 %ataset to 0e partitione% has 0een loa%e% into BD7S# The first Ba%oop 3o0 is initialize% .ith a set of map tas"s in .hich each map tas" rea%s a fi:e%1size %ata chun" from BD7S 4the chun" size is initialize% as a part of the BD7S configuration6# 7or each map tas", it e:amines triples in a chun" one 0y one an% returns a "ey1/alue pair .here the "ey is the su03ect /erte: an% the /alue is the remaining part 4pre%icate an% o03ect6 of the triple# The local com0ine function of this Ba%oop 3o0 groups all "ey1 /alue pairs that ha/e the same "ey into a local su03ect10ase%

triple group for each su03ect# The re%uce function rea%s the outputs of all maps to further merge all local su03ect10ase% groups to form a /erte: 0loc" .ith the su03ect as the anchor /erte:# Dach /erte: group .ill ha/e one anchor /erte: an% 0e store% in BD7S as a .hole# !pon the completion of this first Ba%oop 3o0, .e o0tain the full set of >1hop e:ten%e% out$ertex 0loc"s for the ,D7 %ataset# In the ne:t Ba%oop 3o0, .e e:amine each triple against the set of >1hop e:ten%e% out1/erte: 0loc"s, o0taine% in the pre/ious phase, to %etermine .hich e:ten%e% /erte: 0loc"s shoul% inclu%e this triple in or%er to meet the k1hop guarantee# 2e call this phase the controlle% triple replication# Concretely, .e perform a full scan of the ,D7 %ataset an% e:amine in .hich >1hop e:ten%e% out1/erte: 0loc"s this triple shoul% 0e a%%e% in t.o steps# 7irst, for each triple, one "ey1/alue pair is generate% .ith the su03ect of the triple as the "ey an% the remaining part 4pre%icate an% o03ect6 of the triple as the /alue# Then .e group all triples .ith the same su03ect into a local su03ect10ase% triple group# In the secon% step, for each >1hop e:ten%e% out1/erte: 0loc" an% each su03ect .ith its local triple group, .e test if the su03ect /erte: matches the e:ten%e% out1 /erte: 0loc"# Cy matching, .e mean that the su03ect /erte: is the 0or%er /erte: of the e:ten%e% out1/erte: 0loc"# If a match is foun%, then .e create a "ey1/alue pair .ith the uni&ue ID of the e:ten%e% out1/erte: 0loc" as the "ey an% the su03ect /erte: of the gi/en local su03ect10ase% triple group as the /alue# This ensures that all out1triples of this su03ect /erte: are a%%e% to the matching e:ten%e% out1 /erte: 0loc"# If there is no matching e:ten%e% out1/erte: 0loc" for the su03ect /erte:, the triple group of the su03ect /erte: is %iscar%e%# This process is repeate% until all the su03ect10ase% local triple groups are e:amine% against the set of e:ten%e% out1/erte: 0loc"s# 7inally, .e get a set of <1 hop e:ten%e% out1/erte: 0loc"s 0y the re%uce function, .hich merges all "ey1/alue pairs .ith the same "ey together such that the su03ect10ase% triple groups .ith the same e:ten%e% out1/erte: 0loc" ID 4"ey6 are integrate% into the e:ten%e% out1/erte: 0loc" to o0tain the <1hop e:ten%e% out1/erte: 0loc" an% store it as a BD7S file# 7or k , <, .e .ill repeat the a0o/e process until the k- hop e:ten%e% /erte: 0loc"s are constructe%# In our e:periment section .e report the performance of the SA partitioner in terms of 0oth the time comple:ity an% %ata characteristics of generate% partitions to illustrate .hy partitions generate% 0y the SA partitioner is much more efficient for ,D7 graph &ueries than ran%om partitioning 0y BD7S chun"s#

B. Distri(utin! "xtended Vertex #locks


fter .e construct the e:ten%e% /erte: 0loc" for each /erte:, .e nee% to %istri0ute all the e:ten%e% /erte: 0loc"s across the cluster of .or"er no%es# Since there are usually a huge num0er of /ertices in ,D7 %ata, .e nee% to carefully assign each e:ten%e% /erte: 0loc" to a .or"er no%e# 2e consi%er three o03ecti/es for %istri0ution of e:ten%e% /erte: 0loc"sE 4i6 0alance% partitions in terms of the storage cost, 4ii6 re%uce% replication an% 4iii6 the time re&uire% to carry out the %istri0ution of e:ten%e% /erte: 0loc"s to .or"er no%es# 7irst, 0alance% partitions are essential for efficient &uery processing 0ecause one 0ig partition in the im0alance% par1 titions may 0e a 0ottlenec" for &uery processing an% so .ill increase the o/erall &uery processing time# Secon%, 0y

the %efinition of the e:ten%e% /erte: 0loc", a triple may 0e replicate% in se/eral e:ten%e% /erte: 0loc"s# If /erte: 0loc"s ha/ing many common triples can 0e assigne% to the same partition, the storage cost can 0e re%uce% consi%era0ly 0ecause %uplicate triples are eliminate% on each .or"er no%e# 2hen many triples are replicate% on multiple partitions, the enlarge% partition size .ill increase the &uery processing cost on each .or"er no%e# Therefore, re%ucing the num0er of replicate% triples is also an important factor critical to efficient &uery processing# Thir%, .hen selecting the %istri0ution mechanism, .e nee% to consi%er the processing time for the %istri0ution of e:ten%e% /erte: 0loc"s# This criterion is also important for ,D7 %atasets that re&uire more fre&uent up%ates# In the first prototype of SA partitioner, .e implement t.o alternati/e techni&ues for %istri0ution of e:ten%e% /erte: 0loc"s to the set of .or"er no%esE hashing10ase% an% graph partitioning10ase%# The hashing10ase% techni&ue can achie/e the 0alance% partitions an% ta"e less time to complete the %istri0ution# It assigns the e:ten%e% /erte: 0loc" of /erte: $ to a partition using the hash /alue of $# 2e can also achie/e the secon% goal re%ucing replication 0y %e/eloping a smart hash function 0ase% on the semantics of the gi/en %ataset# 7or e:ample, if .e "no. that /ertices ha/ing the same !,I prefi: are closely connecte% in the ,D7 graph, then .e can use the !,I prefi: to get the hash /alue, .hich .ill assign those /ertices to the same partition# The graph partitioning10ase% %istri0ution techni&ue utilizes the partitioning result of the minimum cut /erte: partitioning algorithm 8>;9# The minimum cut /erte: partitioning algorithm %i/i%es the /ertices of a graph into n smaller partitions 0y minimizing the num0er of e%ges 0et.een %ifferent partitions# 2e generate a /erte:1to1.or"er1no%e map 0ase% on the result of the minimum cut /erte: partitioning algorithm# Then .e assign the e:ten%e% /erte: 0loc" of each /erte: to the .or"er no%e using the /erte:1to1no%e map# Since it is li"ely that close /ertices are inclu%e% in the same partition an% thus place% on the same .or"er no%e, this techni&ue focuses on re%ucing replication 0y assigning close /ertices, .hich are more li"ely to ha/e common triples in their e:ten%e% /erte: 0loc"s, to the same partition#

"ey /alues of a Ba%oop 3o0#

IV. D NAD,I=D+TS
In this section, .e report the e:perimental e/aluation of our ,D7 %ata partitioning frame.or"# -ur e:periments are performe% using not only %ifferent 0enchmar" %atasets 0ut also /arious real %atasets# 2e first intro%uce 0asic character1 istics of %atasets use% for our e/aluation# 2e categorize the e:perimental results into t.o setsE 4>6 2e sho. the effects of %ifferent types of /erte: 0loc"s an% the %istri0ution techni&ues in terms of the 0alance% partitions, replicate% triples an% %ata partitioning an% %istri0ution time# 4<6 2e con%uct the e:periments on &uery processing latency#

A. Setup
2e use a cluster of <> machines 4one is the coor%inator6 on Dmula0 8<>9E each has >< GC , =, one <#? GBz *?10it &ua% core Neon D;;5( processor an% t.o <;(GC @<(( rpm S T %is"s# The net.or" 0an%.i%th is a0out ?( =COs# 2hen .e measure the &uery processing time, .e perform fi/e col% runs un%er the same setting an% sho. the &astest time to remo/e any possi0le 0ias pose% 0y -S an%Oor net.or" acti/ity# To store each generate% partition on each machine, .e use ,D715N 8>)9 /ersion (#5#; .hich is a ,D7 storage system# 2e also use Ba%oop /ersion (#<(#<(5 running on Fa/a >#*#( to run /arious partitioning algorithms an% 3oin the interme%iate results generate% 0y su0&ueries# In a%%ition to our ,D7 %ata partitioning frame.or", .e ha/e implemente% a ran%om partitioning for comparison, .hich ran%omly assigns each triple to a partition# ,an%om partitioning is often use% as the %efault %ata partitioner in Ba%oop =ap,e%uce an% BD7S# To sho. the 0enefits of %ifferent 0uil%ing 0loc"s of our partitioning frame.or", .e e:periment .ith fi/e %ifferent e:ten%e% /erte: 0loc"sE >1hop e:ten%e% out1/erte: 0loc" 4+out6, >1hop e:ten%e% in1/erte: 0loc" 4+-in6, >1hop e:ten%e% 0i1 /erte: 0loc" 4+-(i6, <1hop e:ten%e% out1/erte: 0loc" 4 -- out6 an% <1hop e:ten%e% in1/erte: 0loc" 4 --in6# lso hashing1 0ase% an% graph partitioning10ase% %istri0ution techni&ues are compare% in terms of 0alance% partitions, triple replication an% the time to complete the %istri0ution# 2e use the graph parti1 tioner =DTIS 8?9 /ersion ;#(#< .ith its %efault configuration to implement the graph partitioning10ase% %istri0ution#

C. Distri(uted Query Processin!


Gi/en that intra1G= processing an% inter1G= processing are t.o types of %istri0ute% &uery processing in our frame1 .or", for a SA ,QL &uery Q, assuming that the partitions are generate% 0ase% on the k1hop e:ten%e% /erte: 0loc", .e construct a k1hop e:ten%e% /erte: 0loc" for each /erte: in the &uery graph of Q# If any k1hop e:ten%e% /erte: 0loc" inclu%es all e%ges in the &uery graph of Q, .e can process Q using intra1G= processing 0ecause it is guarantee% that all triples .hich are re&uire% to process Q are locate% in the same partition# If there is no such k1hop e:ten%e% /erte: 0loc", .e nee% to process Q using inter1G= processing 0y splitting Q into smaller su0&ueries# In this paper, .e split the &uery .ith an aim of minimizing the num0er of su0&ueries# 2e use Ba%oop =ap,e%uce an% BD7S to 3oin the interme%iate results generate% 0y the su0&ueries# Concretely, each su0&uery is e:ecute% on each G= an% then the interme%iate results of the su0&uery are store% in BD7S# To 3oin the interme%iate results of all su0&ueries, .e use the 3oin /aria0le /alues as

B. Datasets
2e use t.o %ifferent 0enchmar" generators an% three %iffer1 ent real %atasets for our e:periments# To generate 0enchmar" %atasets, .e utilize t.o famous ,D7 0enchmar"sE L!C= 4Lehigh !ni/ersity Cenchmar"6 8>>9 ha/ing a uni/ersity %o1 main an% SA<Cench 8<(9 ha/ing a DCLA %omain# 2e

generate t.o %atasets ha/ing %ifferent sizes from each 0enchmar"E

4i6 L!C=;(( 4*@= triples6 an% L!C=<((( 4<*@= triples6 co/ering ;(( an% <((( uni/ersities respecti/ely 4ii6 SA<C1 >((= an% SA<C1<((= ha/ing a0out >(( an% <(( million triples respecti/ely# 7or real %atasets, .e use 4iii6 DCLA 4;@= triples6 8>9 containing 0i0liographic %escriptions in computer science, 4i/6 7ree0ase 4>(>= triples6 8<9 .hich is a large "no.le%ge 0ase, an% 4/6 DCpe%ia 4<))= triples6 859 .hich is structure% information from 2i"ipe%ia# 2e ha/e %elete% any %uplicate triples using one Ba%oop 3o0# 7ig# ? sho.s some 0asic metrics of the se/en ,D7 %atasets# The %atasets generate% from each 0enchmar" ha/e almost the

'%g+ @inco&ing edges 5


1?

100000

7u&8er o) predicates AlogB

100000000 10000000 1000000

63LP >ree8ase 68pedia

100000000

63LP >ree8ase 68pedia

@su8(ects AlogB
100000 10000 1000 100 10 1

(b)(c) /g# +um0 (a)


Pi n c o m in g e % g e s

'%g+ @outgoing edges 14


18 1/ 10 8 ? 4 / 0

? . 4 0 / 1 0 10000 1000 100 10 1

10000000 1000000

@su8(ects AlogB
100000 10000 1000 100 10 1

er of pr e%i cat es

(d)

@ -ute% o u gtg oi en g e d %i g es st Al o rigB 0 (e) ut Ine% io n g e % is tr i 0 u ti o n

1 1 10 E C 100 0 1 E C 1 1 E C / 1 E C 0 1 E C 4 1 E C . @outgoing 1 E C ? 1 E C 5

7 u &

' '%g Type+ s perPre Su8(dica ect tes per Typ e AlogB

0 0 0 0 1 0 0 1 0 0

@ s u
1 0 0 1 0 0

1 0 0 0 0 1 0 0 0 1 0 0 1 0 1

1 0 0 1 0 0 1 0 1

L D 3 = . L 0 D 0

1 1
@ s u

/ 0 0 &
1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 1

0 0

1 0 0 0 0 0 0 0 0

LD3 =.0 0 LD3 =/0 00 SP/ 3110 0& SP/ 31/0 0&

(g) (h) (f)/g# /g#

+um Ty A pe re (i) @out goin g s %i -ut edg es pe c e Alog B r at % Su e (j) g 03e s Ine%g e ct p e er % %is T i tri y s 0u p t tio e r n i 0 u t i o n 7ig# ?E Casi c Char acter istics of Se/e n ,D7 %ata sets

1 1EC 10 0 1EC / @outg 1EC oing 4 edges 1EC AlogB ? 1EC 8

same charact eristics regar%l ess of their %ata size, similar to those reporte % in 8'9# -n the other han%, the real %ataset s ha/e totally %ifferen t charact eristics# 7or

e:ampl e, .hile DCLA has <) pre%icat es, DCpe%i a has a0out ;),((( pre%icat es# To measur e the metrics 0ase% on types 47ig# ?4f6, ?4g6, ?4h66, .e ha/e counte%

those triples ha/ing rd&.type as their pre%icat e# The %istri0ut ion of the num0er of outgoin gOincom ing e%ges in the three C. real %ataset s 47ig# ?4%6 an% ?4e66 ha/e /ery similar po.er la.1li"e %istri0ut ions# It is interesti ng to note that, e/en though a huge num0er of o03ects ha/e less than >(( incomin g e%ges, there are still many o03ects ha/ing a signific ant num0er of incomin g e%ges 4e#g# more than >((,((

( e%ges6 in real %ataset s# Cenchm ar" %ataset s ha/e totally %ifferent %istri0u1 tions as sho.n in 7ig# ?4i6 an% ?436# ) e n e r a t i n ! P a r t i t i o n s 7ig# ; sho.s the partition %istri0uti on in terms of the num0er of triples for %ifferent 0uil%ing 0loc"s using the hashing1 0ase% an% graph partitioni ng1 0ase% %istri0uti on

t e c h n i & u e s # B e r e a ft e r, . e o m it t h e r e s u lt s o n L ! C = ; ( ( a n % S A < C 1 > ( ( = . h i c h a r e a l

m o s t t h e s a m e . it h t h o s e o n L ! C = < ( ( ( a n % S A < C 1 < ( ( = r e s p e c ti / e l y # T h e r e s u lt s s h

o. that .e can get alm ost perf ectl y 0al anc e% part itio ns if .e e:t en% onl y out goi ng tripl es .h en .e con stru ct the e:t en% e% /ert e: 0lo c"s an% use the has hin g1 0as e% %ist ri0u tion tec hni &ue # The se res ults are rela te% to the

out goi ng e%g e %ist ri0u tion in 7ig# ?4%6 an% ?4i6# Sin ce mo st su03 ects ha/ e less tha n >(( out goi ng tripl es an% ther e is no su03 ect ha/i ng a hug e nu m0 er of out goi ng tripl es, .e can gen erat e 0al anc e% part ition s 0y %ist ri0u ting tripl

e s 0 a s e % o n s u 0 3 e c t s # 7 o r e : a m p l e , t h e m a : i m u m n u m 0 e r o f o u t g o i n g tr i p l e s o n L

! C = < ( ( ( is 3 u s t > ? a n % ' ' # ' ' Q o f s u 0 3 e c t s o n 7 r e e 0 a s e h a / e l e s s t h a n < ( ( o u t g o i

n g tri pl e s# T h e g e n e r at e % p a rti ti o n s u si n g th e g r a p h p a rti ti o ni n g 1 0 a s e % %i st ri 0 ut io n te c h ni & u e a r

e al s o . el l 0 al a n c e % . h e n . e e :t e n % o nl y o ut g oi n g e % g e s 0 e c a u s e = D TI S h a s g e n er at e % al m o st u

n i f o r m

n c e %

t h / a e n r t t e h : o s p e a r u t s i i t n i g o n t s h , e e / e n t h o u g h h a s h i n g 1 0 a s e %

t h e t y e c a h r n e i & a u e l # i B t o t . l e/ e er, th l e e re s su s lts cl 0 ea a rly l sh a o

. th at it is ha r% to ge ne rat e 0ala nce% partit ions if .e inclu %e inco ming triple s for our e:te n%e% /erte : 0loc "s, rega r%les s of the %istri 0utio n tech1 ni&u e# Sinc e there are man y o03e cts ha/i ng a huge num 0er of inco ming triple s as sho .n in 7ig# ?4e6 an% ?436,

partit ions ha/in g such high in1 %egr ee /ertic es may 0e muc h 0igg er than the other partit ions# 7or e:a mple , each of t.o o03ec t /ertic es on L!C =<( (( has a0ou t >* millio n inco ming triple s .hic h is

a0out *Q of all triples# Ta0le I sho.s the norma lize% stan%a r% %e/iati on 4i#e#, the stan%a r% %e/iati on %i/i%e % 0y the mean6 in terms of the num0 er of triples on gener ate% partitio ns to compa re the 0alanc e 0et.e en using the hashin g1 0ase% an% graph partitio ning1 0ase% %istri0 ution techni &ues#
T C L D I E C a l a

0loc"s using the hashin o f g1 0ase% p an% a graph r partitio t i ning1 t 0ase% i %istri0 o ution n techni s &ues# 4 Since + the o hashin r m g1 a 0ase% l %istri0 i ution z techni e % &ue is incorp s orate% t into a n our % imple a mentat r ion % using % Ba%oo e p for / efficie i nt a proces t i sing, o .e %o n not 6 Dataset separa L!C=<((( 4hashing6 te the L!C=<((( 4graph6 %istri0 SA<C1<((= 4hashing6 SA<C1<((= 4graph6 ution DCLA 4hashing6 DCLA 4graph6 time 7ree0ase 4hashing6 an% 7ree0ase 4graph6 the DCpe%ia 4hashing6 DCpe%ia 4graph6 e:ten% e% Ta0l /erte: e II 0loc" sho.s constr the uction e:ten% time# e% T /erte: 0loc" C L constr D uction time I I for E %iffere D nt : t 0uil%in e g n
n c e

s un # = . D e I TI t S f co i i ns r s i% s er t i a0 n ly n t /a e e rie e r s % e fo s r t t %if o i fe n re c g nt o %a n t ta / o se ts# Dataset e L!C=<((( 4hashing6 r n Si L!C=<((( 4graph6 t o nc SA<C1<((= 4hashing6 t e SA<C1<((= 4graph6 DCLA 4hashing6 t e ea DCLA 4graph6 ch 7ree0ase 4hashing6 h 7ree0ase 4graph6 e t lin DCpe%ia 4hashing6 h e DCpe%ia 4graph6 , a in 7or D t a the 7 = graph t D partitio % h TI ning1 a e S 0ase% t in %istri0u a f pu tion s o t techni& e r fil ue, on t m e the s a re other t pr han%, i es there n c en is an t o ts a%%itio o n a nal / /e proces = e rt sing D r e: time T s an to run I i % =DTIS S o its as n co sho.n i nn 0elo.# n t ec Concre p i te tely, to u m % run the t e /e graph rti partitio f t ce ning i o s, using l . =DTIS e r e

% e % / e rt e : 0 l o c " c o n s tr u c ti o n ti m e 4 s e c 6

ne e% to gr ou p all co nn ec te % /e rti ce s for ea ch /e rte : in th e , D 7 gr ap h# 2 e ha /e im pl e m en te % thi s co n/ er si on st ep us in g B a% oo p for eff ici en t pr oc

e s s i n g # u r i m p l e m e n t a t i o n c o n s i s t s o f f o u r B a % o o p = a p , e % u c e 3 o 0 s

E 4 > 6 a s s i g n a n u n i & u e i n t e g e r I D t o e a c h / e r t e : , 4 < 6 c o n / e r t e a c

h s u 0 3 e c t t o i t s I D , 4 5 6 c o n / e r t e a c h o 0 3 e c t t o i t s I D , a n % ? 6 g r o u p a l l c o n n e c t e % / e r t i c e s o f

100 -0

@Tripl es A&illio nB
80 50 ?0 .0 40 00 /0 10 0

rando& 1-in /-out

1-out 1-8i /-in

-0

rando&

@Triples A&illionB
80 50 ?0 .0 40 00

1-in 1-8i /-in /-out

1-out

18

rando&

@Triples A&illionB
1? 14 1/ 10 8 ? 4 / 0

1-in 1-8i /-in /-out

1-out

18

rando&

@Triples A&illionB
1? 14 1/ 10 8 ? 4 / 0

1-in 1-8i /-in /-out

1-out

-0

@Triples A&illionB
80 50 ?0 .0 40 00

rando& 1-out 1-in 1-8i /-in /-out

1 / 0 4 . ? 5 8 - 10 11 1/ 10 14 1. 1? 15 18 1- /0

/0 10 0 1 / 0 4 . ? 5 8 - 10 11 1/ 10 14 1. 1? 15 18 1- /0

Partition

1 / 0 4 . ? 5 8 - 10 11 1/ 10 14 1. 1? 15 18 1- /0

1 / 0 4 . ? 5 8 - 10 11 1/ 10 14 1. 1? 15 18 1- /0

/0 10 0 1 / 0 4 . ? 5 8 - 10 11 1/ 10 14 1. 1? 15 18 1- /0

Partition

Partition

Partition

Partition

(a) (b) (c) (d)

L!C SA< DCLA 7ree0a = C 4h se < 1 as 4h ( < hin as ( ( g6 hin ( ( g6 4 = h a 4 s h hi a n s g h 6 i n g 6


1 0 0 0

(e)
DCpe %i a 4 h a s hi n g 6

@ T r i p l e s

rando& in -0 1-8i /-out /-in @Triples


A&illionB
80

1-out 18 rand o& 1-out


@Triples A&illionB @Triples A&illionB
1?

rand o& 1out

18

A . ? 5 8 & - 10 11 1/ i 10 14 1. 1? 15 18 1- /0 l l Partition i o n B

1 / 0 4

1in 18i /ou t /in


50 ?0 .0 40 00 /0 10

(f)

1in 18i /ou t /in


14 1/ 10 8 ? 4 / 0 1 / 0 4 . ? 5 8 10 11 1/ 10 14 1. 1? 15 18 1/0

1in 18i /ou t /in


1? 14 1/ 10 8 ? 4 / 0 1 0 . 5 11 10 1. 15 1-

ran do & 1out

-0

ra nd o& 1ou t

@Triples A&illionB
80

1in 18i
50

8 0 5 0 ? 0 . 0 4 0 0 0 / 0 1 0 0

L!C = < ( ( ( Parti 4 tion g r (g) a SA<C1 <( p ( h = 6 4g ra ph 6


0 1 0 . 5 11 10 1. 15 1/ 4 ? 8 10 1/ 14 1? 18 /0

/out /in
?0 .0 40 00 /0 10 1 ? 11 1? 0 / 5 1/ 15 0 8 10 18 4 14 1. 10 1. /0

/ 4 ? 8 10 1/ 14 1? 18 /0

Partit ion

Partition

Partiti on

(h)
DCL A 4 g r a p h 6 7ig # ;E Aa rtiti on Dis tri0 uti on

(i)

436 7ree DC 0 pe a %ia s 4gr e ap h6 4 g r a p h 6

% rs e/en though it r has the u smallest n num0er n of i triples# It n too" g a0out 5; t hours to i con/ert m the e 7ree0as e to a o =DTIS f input = file# 2e e D 0elie/e a T that this c I is h S 0ecause # there / T are e h much r e more t /ertices e c .hich : o are # n connect T / e% to a a e huge 0 r num0er l s of e i /ertices o on I n 7ree0as I e an% I t DCLA s i as h m sho.n o e in 7ig# . ?4e6# s o Since t f most h D e:isting e C graph L partitioni c A ng tools o re&uire n i similar / s input e formats, r a this s 0 result i o in%icate o u s that n t the @ graph ti # partitioni m ; ng1 e 0ase% h %istri0uti a o on n u techni&u

e may 0e infeasi0l e for some ,D7 %atasets #


T CLD IIIE Graph partitionin g time 4sec6
Dataset L!C=<((( SA<C1<((= DCLA 7ree0ase DCpe%ia

7ig# * sho.s, for %ifferent 0uil%ing 0loc"s, the ratio of the total num0er of triples of all generate % partition s to that of the original ,D7 %ata# large ratio means that many triples are replicate % across the cluster of machine s through the %ata partitioni ng# Aartition s using the >1 hop e:ten%e % out1 /erte: or in1/erte: 0loc"s ha/e >

as the ratio 0ecause each triple is assigne% to only one partition 0ase% on its su03ect or o03ect respecti/ ely# s .e mentione % a0o/e, the results sho. that the graph partitionin g10ase% %istri0utio n techni&ue can re%uce the num0er of replicate% triples consi%era D. 0ly, compare% to the hashing1 0ase% techni&ue # This is 0ecause the graph partitioner generates compone nts in .hich close /ertices, .hich ha/e high pro0a0ilit y to share some triples in their e:ten%e% /erte: 0loc"s, are assigne% to the same compone

nt# Dspecially , there are only a fe. replicate% triples for the <1hop e:ten%e% out1 /erte: 0loc"s# Bo.e/er, for the other e:ten%e% /erte: 0loc"s, the re%uction is not so significant if .e consi%er the o/erhea% of the graph partitionin g10ase% %istri0utio n techni&ue # Q u e r y P r o c e s s i n ! To e/aluate &uery processin g performan ce on the generate% partitions, .e choose %ifferent &ueries from one

0enchmar " %ataset 4L!C=<( ((6 an% one real %ataset 4DCpe%ia 6 as sho.n in Ta0le IG# Since L!C= pro/i%es a set of 0enchmar " &ueries, .e select three among them# +ote that Q>> an% Q>5 are slightly change% 0ecause our focus is not ,D7 reasoning # 2e create three &ueries for DCpe%ia 0ecause there is no represent ati/e 0enchmar " &ueries# 7ig# @ sho.s the &uery proces sing time of the &ueries for

su03ect an% +- 0et.een shoul% in# The the ()B- .//E SDLDCT 0e results hashing1 Ly /su0-rganization-f locate% clearly 0ase% ()B- ./0E SDLDCT L: /un%ergra%uateDegree7rom in the sho. an% the ()B- ./1E SDLDCT L: 2BD,D same that graph DB edia ./E using partitionin /Carac" -0amapartition DB edia .2E SDLDCT Lperson 2BD,D for inter1 g10ase% DB edia .0E SDLDCT Lsuc< Lsuc> Lperson 2BD,D intra1 G= %istri0utio Lsuc< /successor G= processi n %ifferent process ng is techni&ue 0uil%ing ing# -n slo.er s is not 0loc"s the than significan an% other using t for most %ifferent han%, intra1 cases %istri0uti for G= e/en on DCpe%i processi though techni& a Q> ng they ha/e ues# ll .hich is 0ecaus %ifferent 0uil%ing a e of the 0alance 0loc"s re$erse commu an% ensure star nication replicatio intra1 &uery cost n le/els G= .here across as sho.n processi all triple the in Ta0le I ng for patterns cluster an% 7ig# L!C= share of *# 2e Q>? the machin thin" that an% same es to this is DCpe%i o03ect, 3oin 0ecause a Q< random interme% each 0ecaus , +-out iate partition e they an% -results is are out generate efficiently simple nee%s % from store% retrie/al inter1 su0&ueri an% &ueries G= es# in%e:e% ha/ing process 7urther in the only ing# 7or more, ,D71 one L!C= each specific triple Q>> Ba%oop storage pattern# an% 3o0 system, 7or DCpe%i usually thus such L!C= a Q5 has le/el of Q>5, .here a0out %ifference random, the >( on the +-in an% o03ect secon%s 0alance --in of one of an% nee%s triple initializa replicatio inter1 pattern tion n %oes G= is lin"e% o/erhea not ha/e processi to the % a great ng su03ect regar%le effect on using of the ss of its &uery Ba%oop other input processin 0ecaus triple %ata g# The e Q>5 is pattern, size# In only a star inter1 terms of nota0le &uery G= the %ifference an% so process intra1 is .hen all ing is G= .e use triples re&uire processi --out for sharing % for ng, the partitionin the random %ifferen same , +-out ce
T CLD IGE Queries

g 0ecause the graph partitionin g10ase% techni&ue re%uces the num0er of replicate% triples consi%era 0ly# Dspecially , on DCpe%ia .here -out using the hashing1 0ase% %istri0utio n techni&ue generates many replicate% triples, -out using the graph partitionin g1 0ase% %istri0utio n techni&ue is three times faster than that using the hashing1 0ase% techni&ue for DCpe%ia Q5# In summary, .e conclu%e that the most important factor of ,D7 %ata partitionin g is to ensure the intra1 G= processin g

4 "eplication "atio

!ashing-8ased

3 Eraph partitioning8ased
/ 1 0

0 "eplic ation "atio /+. / 1+. 1 0+. 0

!ashing-8ased Eraph partitioning-8ased

0 "eplic ation "atio /+. / 1+. 1 0+. 0

!ashing-8ased Eraph partitioning-8ased

/+. "eplic ation "atio / 1+.

!ashing-8ased Eraph partitioning-8ased

. "eplication "atio

4 Eraph partitioning-8ased
!ashing-8ased

0 / 1-out 1-in 1-8i /-out /-in

1-out

1-in

1-8i

/-out

/-in

1-out

1-in

1-8i

/-out

/-in

1 0+. 0

(b)

SA<C1<((=

(c) DCLA
7ig# *E Triple ,eplication

(d)

1 0 1-out 1-in 1-8i /-out /-in

7ree0ase

(a)
r a n i n 9 u o e u

DCpe%ia

1/-in

out

1-in

1-8i

/-out

(a)
(((

L!C=<

1 2 r y P r o c e s s i n g T i & e i n S e c 1 0 0

1 2 -

r a n i n o u

1 0

Coth are special cases of our SA partition er# 2e argue that our SA frame. or" present s a system atic approac h to graph partition ing an% is more general an% easily configur a0le .ith customi za0le perform ance guarant ee# GI# C-+CL!SI
-+
90

91 910 914 914 911 911 0 Aha shi ngB Agr ap hB Aha shi ngB Agr ap hB Aha shi ngB Agr ap hB

(a)
L!

91 9/ 90 91 9/ Ah as hi ng B Agr ap hB Ah as hi ng B Agr ap hB Ah as hi ng B Agr ap hB

2e ha/e presente % SA , an efficient an% customiz a0le ,D7 (b) %ata DCpe partitioni ng mo%el for %istri0ute %

p r

o c

essing of

7ig# @E Query Arocessing Time 4sec6

for all 4or most6 gi/en SA ,QL &ueries 0ecause inter1G= processing using Ba%oop is usually much slo.er than intra1 G= processing# If all 4or most6 &ueries can 0e co/ere% 0y only outgoing e:tension, the graph partitioning10ase% %istri0ution techni&ue shoul% 0e preferre% 0ecause it can re%uce the num0er of replicate% triples consi%era0ly# Bo.e/er, for all the other cases, the hashing10ase% %istri0ution techni&ue is a 0etter %istri0ution metho% if .e consi%er the huge o/erhea% of the graph partitioning10ase% techni&ue# lso, it shoul% 0e note% that running e:isting graph partitioner can 0e infeasi0le for some ,D7 %atasets %ue to their comple: structure#

0ig ,D7 graph %ata in the Clou%# The main contri0ution is to %e/elop a selection of scala0le techni&ues to %istri0ute the partition 0loc"s across a cluster of compute no%es in a manner that minimizes inter1no%e communication cost 0y localizing most of the &ueries on %istri0ute% partitions# 2e sho. that the SA approach is not only efficient for partitioning an% %istri0uting 0ig ,D7 %atasets of %i/erse sizes an% structures 0ut also effecti/e for processing ,D7 &ueries of %ifferent types an% comple:ity# &cknowledgmentE This .or" is partially sponsore% 0y grants from +S7 CISD +etSD program, SaTC program, an% Intel ISTC on Clou% Computing# , D7D,D+CDS
8>9 H 0out 7acete%DCLA,I http.00d(lp'l1s'de0d(lp22'php# 8<9 HCTC <(>< Dataset,I http.00km'ai&('kit'edu0pro*ects0(tc--3+-0# 859 HDCpe%ia 5#) Do.nloa%s,I http.00wiki'd(pedia'or!0Downloads14# 8?9 H=DTIS,I http.00www'cs'umn'edu05metis# 8;9 H-pening up Go/ernment,I http.00data'!o$'uk0spar6l# 8*9 H,esource Description 7rame.or" 4,D76,I http.00www'w1'or!0RDF0# 8@9 HThe Data1go/ 2i"i,I http.00data-!o$'tw'rpi'edu0wiki# 8)9 H!K Legislation,I http.00www'le!islation'!o$'uk0de$eloper0&ormats0rd&# 8'9 S# Duan, Kementsietsi%is, Srini/as, an% !%rea, H pples an% orangesE a comparison of r%f 0enchmar"s an% real r%f %atasets,I in S7)M8D, <(>># 8>(9 C# 7ran"e an% et al#, HDistri0ute% Semantic 2e0 Data =anagement in BCase an% =ySQL Cluster,I in 7""" L89D, <(>># 8>>9 Y# Guo, Z# Aan, an% F# Beflin, HL!C=E 0enchmar" for -2L "no.le%ge 0ase systems,I :e( Semant', /ol# 5, no# <15, -ct# <((;# 8><9 T# Beath, C# Cizer, an% F# Ben%ler, Linked Data, >st e%#, <(>># 8>59 C# Ben%ric"son an% ,# Lelan%, H multile/el algorithm for partitioning graphs,I in Supercomputin!, >'';# 8>?9 F# Buang, D# F# 0a%i, an% K# ,en, HScala0le SA ,QL Querying of Large ,D7 Graphs,I AGLDC, ?4<>6, pp# >><5R>>5?, ugust <(>># 8>;9 G# Karypis an% G# Kumar, H 7ast an% Bigh Quality =ultile/el Scheme for Aartitioning Irregular Graphs,I S7AM ;' Sci' omput', >'')# 8>*9 # Kyrola, G# Clelloch, an% C# Guestrin, HGraphchiE large1scale graph computation on 3ust a pc,I in 8SD7# Cer"eley, C , !S E !SD+IN ssociation, <(><, pp# 5>R?*# 8>@9 G# =ale.icz, =# B# ustern, # F# Ci", F# C# Dehnert, I# Born, +# Leiser, an% G# Cza3"o.s"i, HAregelE a system for large1scale graph processing,I in S7)M8D# +e. Yor", +Y, !S E C=, <(>(, pp# >5;R>?*# 8>)9 T# +eumann an% G# 2ei"um, HThe ,D715N engine for scala0le man1 agement of ,D7 %ata,I <he VLD# ;ournal, /ol# >', no# >, 7e0# <(>(# 8>'9 K# ,ohloff an% ,# D# Schantz, HClause1iteration .ith =ap,e%uce to scala0ly &uery %atagraphs in the SB ,D graph1store,I in D7D , <(>># 8<(9 =# Schmi%t, T# Bornung, G# Lausen, an% C# Ain"el, HSA<CenchE SA ,QL Aerformance Cenchmar",I in 7 D", <(('# 8<>9 C# 2hite an% et al#, H n integrate% e:perimental en/ironment for %istri0ute% systems an% net.or"s,I in 8SD7, <((<#

V. , DL

TDD

2-,K

Graph partitioning has 0een e:tensi/ely stu%ie% in se/eral communities for se/eral %eca%es 8>59, 8>;9# ,ecently a num0er of techni&ues ha/e 0een propose% to process ,D7 &ueries on a cluster of machines# =ost of them utilize e:isting %istri0ute% file systems such as BD7S to store an% %istri0ute ,D7 %ata# SB ,D 8>'9 %irectly stores ,D7 triples in BD7S as flat te:t files an% runs one Ba%oop 3o0 for each clause 4triple pattern6 of a SA ,QL &uery# 8>(9 uses t.o BCase ta0les to store ,D7 triples an%, gi/en a &uery, iterati/ely e/aluates each triple pattern of the &uery on the t.o ta0les to generate final results# Cecause they use general %istri0ute% file systems .hich are not optimize% for ,D7 %ata, their &uery processing is much less efficient than the state1of1the1art ,D7 storage an% in%e:ing technology 8>)9# 8>?9 generates partitions using e:isting graph partitioning algorithms an% stores them on ,D71 5N# s .e reporte% in Sec# IG, running an e:isting graph partitioner for large ,D7 %ata has a huge amount of o/erhea% an% is not feasi0le for some real ,D7 %atasets# In a%%ition, t.o recent papers 8>@9, 8>*9 ha/e propose% t.o %ifferent approaches to general processing of 0ig graph %ata# 8>@9 promotes to store a /erte: an% all its outgoing e%ges together an% implemente% a %istri0ute% system to sho. the parallel processing opportunity of their approach# 8>*9 proposes to store a /erte: an% all its incoming e%ges together an% sho.s that this can spee% up some graph processing on a single ser/er, such as Aage,an"# In comparison, 8>@9 presents a system 0uilt using out1/erte: 0loc" for 0oth su03ect an% o03ect /ertices, .hereas 8>*9 presents a single ser/er solution using the in1/erte: 0loc" for 0oth su03ect an% o03ect /ertices#

You might also like