
Continuous Reverse Nearest Neighbor Search

Lien-Fa Lin*, Chao-Chun Chen


Department of Computer Science and Information Engineering, National Cheng-Kung University, Tainan, Taiwan, R.O.C.
Department of Information Communication, Southern Taiwan University of Technology, Tainan, Taiwan, R.O.C.

lienfa@cc.kyu.edu.tw, chencc@mail.stut.edu.tw

ABSTRACT
Query services based on the location of objects are called Location-Based Services (LBSs), and Reverse Nearest Neighbor (RNN) queries are one important class of them. RNN queries have diverse applications, such as decision support, marketing, document databases, and bioinformatics. Past studies of RNN queries, however, focused on immobile inquirers and did not consider the continuous demand for RNN results while the inquirer is moving. In a wireless network environment, users are often on the move, and issuing a query while moving is natural behavior, so the availability of such a service becomes very important; we refer to this type of query as a Continuous Reverse Nearest Neighbor (CRNN) query. Because the inquirer's location changes over time, an RNN query returns different results at different locations, and executing an RNN search at every point of time during the continuous query period is prohibitively expensive. In this work, an efficient algorithm is designed that returns the exact results of a CRNN query in a single execution. In addition, extensive experiments were conducted to verify the proposed method, and the results show a significant improvement in efficiency.

Keywords: Location Based Services, Location-Dependent Query, Continuous Query, Reverse Nearest Neighbor Query, Continuous Reverse Nearest Neighbor Query

1 INTRODUCTION

As wireless network communications and mobile device technology develop vigorously and positioning technology gradually matures, LBSs are becoming a key development in both industry and academia [2, 5, 13, 21, 26, 27]. According to the report "IT Roadmap to a Geospatial Future" [6], LBSs will embrace pervasive computing and transform mass advertising media, marketing, and many other facets of society in the upcoming decade. Although LBSs already exist in the traditional computing environment (such as Yahoo! Local), their greatest development potential lies in mobile computing, which provides freedom of mobility and access to information anywhere.

LBSs are expected to become an indispensable application in mobile networks, as the required technology has matured and 3G wireless communication infrastructure is being deployed everywhere. The queries that serve LBSs are referred to as Location-Dependent Queries (LDQs), whose applications include the Range Query, the Nearest Neighbor (NN) query, the K-Nearest Neighbor (KNN) query, and the Reverse Nearest Neighbor (RNN) query.

There are plenty of studies on NN [14, 22, 26], KNN [4, 9, 14, 23, 25], CNN [3, 12, 17, 20], and CKNN [17, 20] queries, and issues pertaining to the Reverse Nearest Neighbor (RNN) query [10, 11, 16, 18, 19, 22, 24] have been receiving attention in recent years. Given a collection of objects S and a query object q, an RNN query finds the objects in S whose nearest neighbor is q. Practical examples of RNN queries are provided in [10]. If a bank is planning to open a new branch and its clients prefer the nearest possible branch, then the new branch should be established at a location that is closer than any other bank to as many clients as possible. Taxi cabs selecting passengers is another good example. If a taxi cab uses wireless devices to locate customers, then RNN queries will be far more advantageous than NN queries from the aspect of competition.



Figure 1 illustrates that Customer c is the nearest neighbor of Taxi a, but that does not necessarily mean Taxi a can capture Customer c, because Taxi b is even closer to Customer c. On the contrary, the best option for Taxi a is Customer d, because Taxi a is the nearest neighbor of Customer d. That is, d is the RNN of a, and a may reach d faster than any other taxi. This becomes a CRNN query when the query object, the taxi, changes its location over time. Mobile users keep moving in a wireless environment, which is why the continuous query is an important issue in such an environment.

Figure 1: Example of RNN query.

To the best of our knowledge, no prior work has addressed this issue. Because an inquirer changes location constantly over time, the changes of location cause RNN queries to return different results. For a CRNN query, executing an RNN search at every point of time during the continuous query period is prohibitively expensive: the larger the number of query objects and the shorter the time segments, the longer the calculation time. In addition, due to the continuous nature of time, defining an appropriate interval between RNN searches is itself a concern. If the interval is too short, many RNN searches must be executed to complete the query; if an RNN search is repeated over a longer interval to reduce the number of executions, the query result for the whole period loses accuracy due to insufficient sampling.

In this paper, an efficient algorithm is designed to replace executing an RNN search at every point of time; a single execution of the CRNN query is all it takes to partition the query period the user is interested in into segments that share the same answer and to find the RNN of each of these segments. In addition, an index is used to filter out unnecessary objects, which reduces the search space and improves CRNN search efficiency. The experiment results suggest that using the index is about 20 times more efficient than not using it when the number of objects is 1000.

This study makes three major contributions:
• It pioneers continuous query processing, as opposed to static query processing, for RNN queries.
• A CRNN search algorithm is proposed; a single execution returns all CRNN results.
• The proposed method extends an index that was previously only applicable to finding the RNN of a single query point so that it supports CRNN queries, further improving CRNN search efficiency.

The structure of the remaining sections is as follows. Related work on RNN search is introduced in Section 2. The concerned problem and the assumptions made are described in Section 3. The proposed CRNN search algorithm is introduced in Section 4. The experiment environment and evaluation parameters are described in Section 5. Finally, conclusions and future research directions are provided in Section 6.

2 RELATED WORK

An RNN search finds the objects for which the query q is the NN. Related studies of RNN search are introduced and summarized in this section under two themes:
• Index methods that support RNN search. The number of objects can be very large; if one must first compute the distance from query q to every object in order to identify the RNNs of q, efficiency may be unacceptably low due to the overwhelming computation cost. To accelerate processing, most studies adopt index methods; the major index methods are introduced in this section.
• RNN searches of different types. RNN searches in different scenarios are described and categorized according to whether query q and the objects are static or moving.

2.1 Index Methods for RNN Query

An RNN search finds the objects for which the query q is the NN, so it is necessary to compute the distance between query q and each object, that is, the distance from the coordinates of query q to the coordinates of the object. For a given q, not every object is its RNN, and objects that cannot be RNNs may be left out of consideration to reduce the number of objects that must be examined and thereby accelerate the RNN search. Many studies have been dedicated to designing effective index structures for object coordinates; the most famous are the R-Tree proposed by [8] and the Rdnn-Tree proposed by [22].



These two index methods are described below.

2.1.1 R-Tree

The R-Tree is an index structure developed early on for spatial databases and was used by [10] to accelerate RNN search processing. All objects are grouped and placed on leaf nodes according to the closeness of their coordinates; that is, objects with similar coordinates are put into the same group. Each group of objects is then enclosed in the smallest possible rectangle, called a Minimum Bounding Rectangle (MBR). MBRs are in turn grouped into clusters, which are enclosed in larger MBRs, until all objects are contained in one MBR. An internal node of an R-Tree stores the MBR that contains all the nodes underneath it, and the root of the R-Tree contains all objects. The size and range of an MBR are defined by its lower-left coordinate (Ml, Md) and its upper-right coordinate (Mr, Mu). Figure 2 is an example of an R-Tree. There are twelve objects, a to l: (a, b, c, d) belong to MBR b1, and (e, f, g) belong to MBR b2. MBRs b1 and b2 belong to MBR B1, and MBR R contains all objects.

Figure 2: Example of R-tree indexing
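To make the MBR grouping concrete, the following sketch (Python, with hypothetical two-dimensional coordinates; an illustration only, not the paper's code) computes the MBR of a group of objects and tests containment, mirroring how b1, b2, B1, and R nest in Figure 2.

from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class MBR:
    ml: float  # lower-left x  (Ml)
    md: float  # lower-left y  (Md)
    mr: float  # upper-right x (Mr)
    mu: float  # upper-right y (Mu)

    @staticmethod
    def of(points: List[Point]) -> "MBR":
        # Smallest axis-aligned rectangle enclosing all points of the group.
        xs, ys = [p[0] for p in points], [p[1] for p in points]
        return MBR(min(xs), min(ys), max(xs), max(ys))

    def contains(self, other: "MBR") -> bool:
        # True if 'other' lies entirely inside this MBR.
        return (self.ml <= other.ml and self.md <= other.md and
                self.mr >= other.mr and self.mu >= other.mu)

# Hypothetical coordinates standing in for the groups of Figure 2.
b1 = MBR.of([(0.10, 0.10), (0.20, 0.30), (0.15, 0.25), (0.30, 0.20)])  # objects a-d
b2 = MBR.of([(0.40, 0.60), (0.45, 0.70), (0.50, 0.65)])                # objects e-g
B1 = MBR.of([(b1.ml, b1.md), (b1.mr, b1.mu), (b2.ml, b2.md), (b2.mr, b2.mu)])
assert B1.contains(b1) and B1.contains(b2)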
2.1.2 Rdnn-Tree

The Rdnn-tree (R-tree containing Distance of Nearest Neighbors) [22] improves on the method of [10]. The author proposes a single index structure, the Rdnn-tree, that answers NN queries and RNN queries at the same time. The Rdnn-tree differs from the standard R-tree structure in that it stores extra information about the nearest neighbor of the points in each node. A leaf node of the Rdnn-tree stores entries of the form (ptid, dnn), as shown in Figure 3, where ptid identifies a data object (a point in d-dimensional space) and dnn is the distance from that object to its NN. A non-leaf node stores entries of the form (ptr, Rect, MaxDnn), where ptr points to the address of a child node, Rect is the MBR of all child nodes subordinate to this node, and MaxDnn is the maximum dnn over all objects in the subtrees subordinate to this node. Thus the distance from any object contained in these subtrees to its NN does not exceed MaxDnn.

Figure 3: Data structure of the Rdnn-tree
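The node layout described above can be summarized with the following sketch (Python dataclasses; an illustration of the entry formats, not the authors' implementation, with 2-D coordinates assumed): leaf entries carry (ptid, dnn) and internal entries carry (ptr, Rect, MaxDnn), where MaxDnn is the maximum dnn over a node's subtree.

from dataclasses import dataclass
from typing import List, Tuple, Union

@dataclass
class LeafEntry:
    ptid: int    # identifier of the data object
    x: float     # object coordinates (assumed 2-D here)
    y: float
    dnn: float   # distance from this object to its nearest neighbor

@dataclass
class InternalEntry:
    child: "Node"                       # ptr: reference to the child node
    rect: Tuple[float, float, float, float]  # Rect: subtree MBR (ml, md, mr, mu)
    max_dnn: float                      # MaxDnn: maximum dnn of all objects below

@dataclass
class Node:
    entries: List[Union[LeafEntry, InternalEntry]]
    is_leaf: bool

def max_dnn(node: Node) -> float:
    # MaxDnn of a node: no object below it is farther than this from its NN.
    if node.is_leaf:
        return max(e.dnn for e in node.entries)
    return max(e.max_dnn for e in node.entries)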



2.2 Categories of RNN Queries

Depending on the static or moving status of query q and the query objects, related studies can be summarized into four categories:
1. If query q and the query objects are both static, the category is called static query vs. static objects.
2. If query q is moving and the query objects are static, the category is called moving query vs. static objects.
3. If query q is static and the query objects are moving, the category is called static query vs. moving objects.
4. If both query q and the query objects are moving, the category is called moving query vs. moving objects.

2.2.1 Static query vs. static objects

The scenario in which both query q and the query objects are static is discussed first, because an immobile query and immobile objects are easier to process than the other scenarios. The method proposed in [10] is introduced here. For a static database, the author adopts a special R-tree, called the RNN-tree, for answering RNN queries. For a static database that requires frequent updates, the author proposes a combined use of an NN-tree and an RNN-tree: the NN of every object is stored in the RNN-tree, while the NN-tree stores the objects themselves and their respective NN collections. The author draws a circle around every object, with the distance from the object to its NN as the radius, and then examines every circle that contains query q to find the answers of the RNN query, as sketched below. Such a method, however, is very inefficient for a dynamic database, because the structures of the NN-tree and RNN-tree must be changed whenever the database is updated. In [22], the method of [10] is therefore improved: the author proposes a single index structure, the Rdnn-tree, for answering NN queries and RNN queries at the same time. It differs from the normal R-tree in that it separately stores the NN information of every object (i.e., the distance to its nearest neighbor), and the NN of every object must be calculated in advance.
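The circle-based test of [10] can be sketched as follows (Python, with brute-force NN distances for illustration only, not the paper's implementation; whether the boundary case uses < or <= is a modeling choice): an object belongs to the RNN result of q exactly when q falls inside the circle centered at the object with radius equal to that object's NN distance.

import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def dist(p: Point, q: Point) -> float:
    return math.hypot(p[0] - q[0], p[1] - q[1])

def nn_distances(objects: Dict[str, Point]) -> Dict[str, float]:
    # dnn(o): distance from each object to its nearest other object (brute force).
    return {a: min(dist(pa, pb) for b, pb in objects.items() if b != a)
            for a, pa in objects.items()}

def rnn(q: Point, objects: Dict[str, Point]) -> List[str]:
    dnn = nn_distances(objects)
    # o is an RNN of q iff q lies inside o's circle, i.e. d(q, o) <= dnn(o).
    return [o for o, p in objects.items() if dist(q, p) <= dnn[o]]

# Tiny example with hypothetical coordinates.
objs = {"a": (1.0, 1.0), "b": (2.0, 1.0), "c": (5.0, 5.0)}
print(rnn((1.4, 1.0), objs))   # prints ['a', 'b']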
2.2.2 Static query vs. moving objects

The studies mentioned above primarily assume a monochromatic situation in which all objects, including query q and the query objects, are of the same type. In [18], the researchers address this type of problem in a bichromatic situation, where objects are divided into two different types: one is the inquirer, and the other is the query object. NN and range query techniques are used in that paper to handle RNN issues.

2.2.3 Moving query

This subsection discusses the situation in which the inquirer is no longer static but changes location over time, while the query objects can be either static or moving. That is, two categories of query are involved: a moving query with static objects and a moving query with moving objects. Because the inquirer is moving, these two categories of query return different results for the identical RNN search at different points of time. This type of problem is obviously more complicated than the problems previously discussed. To the best of our knowledge, no related study has discussed these two categories. In this study, solutions for a moving query with static objects are pursued.

3 Problem Formulation

A CRNN query concerns a period of continuous time in which adjacent points of time may have the identical RNN. That is, a period of time may have the same RNN until the query q moves beyond this period. Please refer to Figure 4. When a user executes CRNN query q, the time segment of the continuous query is [Ps, Pe], and the query objects are {a, b, c, d}. If a point of time P1 can be identified such that any point of time in the segment from Ps to P1, or [Ps, P1], has the same RNN result, then a one-time execution of the RNN search is all that is needed for the segment [Ps, P1]. If further points of time P2, P3, and P4 are also identified, such that any point of time in [P1, P2] has the identical RNN result, any point of time in [P2, P3] has the identical result, and any point of time in [P3, P4] has the identical result, then the entire CRNN query needs only a one-time execution of the RNN search for each time segment.

To process a CRNN query, it is therefore not necessary to execute an RNN search at every point of time and return the RNN of every point of time to query q. Instead, i segments of time, segment1, segment2, ..., segmenti, each of which has the same result, are first identified within the entire CRNN query period. The RNN result of each segment, RNN1, RNN2, ..., is calculated separately, and the result is returned to the inquirer in the format (q, [segment1]) = {RNN1 result}, ..., (q, [segmenti]) = {RNNi result}.

Figure 4: Example of a CRNN query

Based on the description above, the CRNN query, the problem this study concerns, may be stated as follows:
Given:
A collection of static objects S = {O1, O2, ..., On}
A query point q, its current position (q.x, q.y), and its moving velocity (v.x, v.y)
A continuous query time [Ps, Pe], where Ps represents the point at which the query begins and Pe represents the point at which the query ends
Find:
The points of time {P1, P2, ..., Pi} within [Ps, Pe] such that the RNNs of q remain constant between any two adjacent points,
Such that:
RNN(q, [Ps, P1]) = {RNN1}, RNN(q, [P1, P2]) = {RNN2}, ..., RNN(q, [Pi, Pe]) = {RNNi}, where [Pi, Pe] ⊆ [Ps, Pe] and {RNN1}, ..., {RNNi} ⊆ {O1, O2, ..., On}.
Under the assumptions:
1. The moving direction of query q is fixed.
2. All query objects are static.
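A minimal sketch of the inputs and of the expected answer format is given below (Python; the names are illustrative only and are not taken from the paper's implementation).

from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]

@dataclass
class CRNNQuery:
    objects: Dict[str, Point]   # S = {O1, ..., On}, all static
    position: Point             # (q.x, q.y) at the start of the query
    velocity: Point             # (v.x, v.y), fixed direction by assumption
    t_start: float              # time of Ps
    t_end: float                # time of Pe

# The answer is a list of time segments, each with a constant RNN set:
# [(Ps, P1, {RNN1}), (P1, P2, {RNN2}), ..., (Pi, Pe, {RNNi})]
Answer = List[Tuple[float, float, Set[str]]]

def position_at(q: CRNNQuery, t: float) -> Point:
    # Linear motion: the query traces the line (qline) from Ps to Pe.
    return (q.position[0] + q.velocity[0] * (t - q.t_start),
            q.position[1] + q.velocity[1] * (t - q.t_start))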
As described above, two adjacent points of time may share the same RNN; a segment of time has only one RNN result until query q moves into another segment of time that has a different RNN. The CRNN search algorithm proposed in this study uses exactly this concept. First, the points of time at which the RNN result changes within the query time are identified. These points divide the query time into several segments with different RNN results, and the RNN result is then identified for each of the segments. The detailed algorithm of the CRNN search is explained in the next section.

4 CRNN Search Algorithm

The detailed procedure of the CRNN search algorithm is introduced in this section. The algorithm is divided into two steps.



Step 1: Finding the segment points of CRNNq. The points of time that produce different RNN results are identified. Based on these points of time, the CRNN query is divided into several time segments, each of which requires one execution of the RNN search. The RNN result at any point of time within one segment remains constant, and different segments have different RNN results.

Step 2: Calculating the RNN result of each segment. The RNN results are calculated separately for each of the segments obtained in the previous step.

The entire procedure for processing a CRNN search is illustrated in Figure 5. Given the query objects and the continuous query (the query path), processing is divided into the two steps of finding the segment points of CRNNq and calculating the RNN result of each segment; each step is described below.

Figure 5: Flow chart of CRNN query processing

4.1 Finding segment points of CRNN

What a CRNN query pursues is a period of continuous time; the distance the query moves between adjacent points of time is very short, which may result in the same RNN answer. That is, the entire period of the continuous query can be divided into several segments such that the RNN results within each segment are the same. If a set of points of time shares the same RNN result, it is not necessary to execute an RNN search at each of them; a one-time calculation is enough. Therefore, a CRNN query does not require executing an RNN search at every point of time. Instead, the points of time that share the same RNN result are grouped into time segments, and a one-time RNN search is executed for each segment. The RNN of query q is the collection of objects whose NN is query q. If the distance from each object to its NN is known in advance, then the objects whose distance to query q is shorter than their distance to their respective NN are the RNNs of query q.

As illustrated in Figure 6, if the NN of object a is b, and a circle is drawn with a as the center and the distance ab as the radius, then the distance from query q to a must be shorter than the distance from a to its NN, object b, as long as query q falls within this circle. Therefore, during the period of time in which query q remains within this circle, the RNN result of q must include a; it stops including a only when query q moves out of the circle. Because the moving direction of query q is assumed to be fixed, the CRNN query forms a query line (qline) from its beginning to its end. The point at which the CRNN query begins to leave the circle is the intersection S of the circle and the query line formed by the CRNN query. Before intersection S, the result of the RNN query must include object a; beyond intersection S, the result of the RNN query no longer includes object a, so the RNN results differ. Such an intersection is referred to as a segment point.

This explains why an intersection of the query line with a circle whose radius is the NN distance is a point of time at which the RNN query produces different results. Drawing, for every object, the circle centered at the object with the distance to its NN as the radius, all intersections of these circles with the query line cut the CRNN query into several time segments that have different RNN results.

Figure 6: Finding a segment point of the CRNN search

Figure 7 illustrates the time segmentation process described above. For objects a, b, and c, their respective NNs are identified first: NN(a) = b, NN(b) = a, and NN(c) = b. Next, each object is used as the center of a circle, with the distance to its respective NN as the radius, to draw the circles of a, b, and c. Then the start point Ps, the end point Pe, and the intersections of the circles with the qline, P1, P2, P3, and P4, are sorted according to time, and every two adjacent points define a time segment. The entire CRNN query is cut into five time segments, [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Every segment has a unique RNN query result.
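The segment points can be computed directly from the geometry above. The sketch below (Python; a straightforward rendering of the idea, not the authors' code) intersects the qline, parameterized from Ps to Pe, with each object's NN circle and returns the sorted parameters that split the query into time segments.

import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]

def circle_line_params(ps: Point, pe: Point, center: Point, r: float) -> List[float]:
    # Solve |ps + t*(pe - ps) - center|^2 = r^2 for t in [0, 1].
    dx, dy = pe[0] - ps[0], pe[1] - ps[1]
    fx, fy = ps[0] - center[0], ps[1] - center[1]
    a = dx * dx + dy * dy
    b = 2 * (fx * dx + fy * dy)
    c = fx * fx + fy * fy - r * r
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return []                      # the qline never crosses this circle
    sq = math.sqrt(disc)
    return [t for t in ((-b - sq) / (2 * a), (-b + sq) / (2 * a)) if 0 <= t <= 1]

def segment_points(ps: Point, pe: Point,
                   objects: Dict[str, Point], dnn: Dict[str, float]) -> List[float]:
    # Each object's circle (center = object, radius = its NN distance) may cut qline.
    ts = {0.0, 1.0}                    # always keep Ps and Pe
    for name, center in objects.items():
        ts.update(circle_line_params(ps, pe, center, dnn[name]))
    return sorted(ts)                  # consecutive values bound the time segments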



Figure 7: Segmenting of the CRNN query

4.2 Calculating RNN result of each segment

In the previous section, the intersections of the qline with the circles whose radii are the distances between the objects and their respective NNs were identified, and with these intersections the CRNN query was cut into several time segments. The next step is to find the RNNs of each time segment. Because each circle is drawn with the distance from a query object to its NN as the radius and is labeled with that object's name, if a segment falls within a certain set of circles, then the RNN result of this time segment of the CRNN query is the collection of objects represented by those circles. This is illustrated in Figure 8. First, the intersections of the qline representing the CRNN query with the circles of the objects are sorted by time; every two adjacent points define a time segment, and there are five segments, [Ps, P1], [P1, P2], [P2, P3], [P3, P4], and [P4, Pe]. Segment [Ps, P1] is contained only by circle a; therefore RNN(q, [Ps, P1]) = {a}. Next, segment [P1, P2] is examined; this segment is contained by circle a and circle b, therefore RNN(q, [P1, P2]) = {a, b}. Repeating this process yields RNN(q, [P2, P3]) = {a, b, c}, RNN(q, [P3, P4]) = {b, c}, and RNN(q, [P4, Pe]) = {c}.
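Once the segment points are known, the RNN set of each segment can be read off by testing which circles contain that segment; since the RNN set is constant inside a segment, checking the segment's midpoint suffices. A sketch follows (Python, reusing the objects/dnn dictionaries and the segment_points output from the previous sketch; illustrative only).

import math

def segment_rnns(ps, pe, objects, dnn, seg_points):
    # For every pair of adjacent segment points, report the objects whose
    # circles contain the segment (checked at its midpoint).
    def at(t):
        return (ps[0] + t * (pe[0] - ps[0]), ps[1] + t * (pe[1] - ps[1]))
    results = []
    for t0, t1 in zip(seg_points, seg_points[1:]):
        mid = at((t0 + t1) / 2.0)
        rnn_set = {o for o, c in objects.items()
                   if math.hypot(mid[0] - c[0], mid[1] - c[1]) <= dnn[o]}
        results.append((t0, t1, rnn_set))
    return results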
Figure 8: Calculating the RNN result of each segment

4.3 CRNN Algorithm with Index

Not every object can be an answer in the processing of a CRNN query. To improve query efficiency, it is preferable to filter out the objects that cannot be answers in advance, which greatly reduces the search space of the CRNN query, the amount of data the query must handle, and consequently the computation cost. This process, which further improves CRNN query efficiency dramatically, is referred to as the pruning process. Figure 9 illustrates the flow chart of CRNN query processing with the pruning process added.

Figure 9: Flow chart of CRNN query processing with index

Steps 2 and 3 are identical to Steps 1 and 2 of the CRNN search algorithm, which have been described in the previous sections and are not reiterated here. For Step 1, the pruning process, an index structure based on the Rdnn-tree is designed to execute the pruning effectively. The three steps of the CRNN query with index are illustrated in Figure 9. The pruning process is described below. For every internal node of the Rdnn-tree, the distance from query q to the node's rectangle is computed, denoted D(q, Rect). If D(q, Rect) of a node is larger than the MaxDnn of the node, then none of the objects beneath it needs to be considered, because the distance from query q to the Rect node is smaller than or equal to the distances from query q to all the objects underneath it. When the distance from query q to the Rect node is longer than MaxDnn, query q cannot be closer to any object underneath the Rect node than that object's own NN, so no object underneath can be an RNN result of query q. On the contrary, if D(q, Rect) is smaller than or equal to the MaxDnn of the node, then some objects underneath the Rect node may be closer to query q than to their respective NNs; that is, some objects underneath may be RNN results of query q.



The examination then continues along such a branch all the way down to the leaf node. All entries underneath such a leaf node are recorded as candidate objects for the RNN query result. The collection of these candidate objects is referred to as RNNCanSet: the possible results of the RNN query must lie within this collection, and the objects outside RNNCanSet cannot possibly be RNN query results. Only the objects inside RNNCanSet need to be considered when finding the segment points of CRNNq in the CRNN search algorithm. This greatly reduces the number of objects that need to be handled and enhances the efficiency of the CRNN search algorithm.
Figure 3 explains the pruning process. It begins with the root node R. Because D(q, R) ≤ MaxDnn of R, the child nodes B1 and B2 must be examined. Because D(q, B1) ≥ MaxDnn of MBR B1, all child nodes underneath B1 can be pruned. Next, D(q, B2) ≤ MaxDnn of B2, so the child nodes b3 and b4 of B2 must be examined. D(q, b3) is smaller than or equal to the MaxDnn of b3, which is a leaf node; therefore, h and i are placed inside RNNCanSet. Next, b4 is examined; D(q, b4) is larger than or equal to the MaxDnn of b4, so b4 can be pruned. The entire pruning process then ends.

However, the CRNN query to be processed is not an RNN query at a single query point, so the pruning process of [22] cannot be used directly. To ensure that no possible RNN result is discarded, the pruning criterion is changed from the condition that D(q, Rect), the distance from the query point to Rect, must be longer than MaxDnn, to the condition MinD(qline, Rect) > MaxDnn, where MinD(qline, Rect) is the minimum distance from the qline to the Rect node. The reason the shortest distance is used is that if the minimum distance from the entire qline to the Rect node is larger than MaxDnn, then the distance from any point of time on the qline to the Rect node must be longer than MaxDnn. Therefore, none of the objects underneath the Rect node can be an RNN for the qline, and the node can safely be pruned. Details of the pruning algorithm are given in Algorithm 1:

Algorithm 1: Pruning Algorithm.
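Algorithm 1 itself is given as a figure in the paper; the following sketch (Python; an illustration of the stated rule, not the authors' pseudocode) shows the core test, MinD(qline, Rect) > MaxDnn, and how a traversal could collect RNNCanSet. The node layout follows the earlier Rdnn-tree sketch, and MinD is approximated by a ternary search over the segment (the point-to-rectangle distance is convex along the segment).

import math

def point_rect_dist(px, py, rect):
    # rect = (ml, md, mr, mu); distance from a point to an axis-aligned rectangle.
    ml, md, mr, mu = rect
    dx = max(ml - px, 0.0, px - mr)
    dy = max(md - py, 0.0, py - mu)
    return math.hypot(dx, dy)

def min_dist_qline_rect(ps, pe, rect, iters=64):
    # MinD(qline, Rect): minimum distance from the query segment to the rectangle.
    def d(t):
        return point_rect_dist(ps[0] + t * (pe[0] - ps[0]),
                               ps[1] + t * (pe[1] - ps[1]), rect)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if d(m1) <= d(m2):
            hi = m2
        else:
            lo = m1
    return d((lo + hi) / 2)

def prune(node, ps, pe, can_set):
    # Collect RNNCanSet: descend only where MinD(qline, Rect) <= MaxDnn.
    if node.is_leaf:
        can_set.extend(e.ptid for e in node.entries)
        return
    for e in node.entries:                      # e: InternalEntry(child, rect, max_dnn)
        if min_dist_qline_rect(ps, pe, e.rect) <= e.max_dnn:
            prune(e.child, ps, pe, can_set)     # cannot be pruned; examine subtree
        # else: every object below is farther from qline than its own NN, so prune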


5 Performance Study

To evaluate the improvement in CRNN query efficiency made by the method proposed in this study, several experiments are designed. This section describes the experiment environment, the experimental parameters and settings, and a comparison of the experiment results.

5.1 Experiment Settings

The coordinates of the objects are dispersed over a [0, 1] × [0, 1] plane. Because the distribution density of the objects may influence efficiency, it is taken into consideration in the experiment.


Three different types of distribution are used to generate the objects' coordinates: Uniform distribution, Gaussian distribution, and Zipf distribution. In the Uniform distribution, the objects are evenly distributed over the plane, as shown in Figure 10(a). In the Gaussian distribution, most of the objects concentrate at the center of the plane, as shown in Figure 10(b). In the Zipf distribution, most of the objects are located toward the extreme left and extreme bottom of the plane; the skew factor is set to 0.8 in the experiment, as shown in Figure 10(c). In addition, 30 queries are generated randomly within the [0.4, 0.6] × [0.4, 0.6] region by referring to [15], and each component of the velocity vector of a query falls within [-0.01, 0.01]. Because this experiment is concerned with the influence of different object distributions on efficiency, the queries are generated as close to the center of the plane as possible. After executing the 30 queries, the average cost of executing one CRNN search is used to determine which method is more favorable. As for the implementation of the Rdnn-tree in the CRNN search algorithm, the R*-tree code of GiST [7] is used and extended into an Rdnn-tree to meet the requirements of this experiment.
Figure 10: Data sets of the experiment evaluation (Uniform, Gaussian, and Zipf distributions of object coordinates over the [0, 1] × [0, 1] plane).

In addition to the object distributions described above, the influence of the length of the query time (qline) and of the number of objects on efficiency is also considered. The three data sets, Uniform, Gaussian, and Zipf, are considered for the object distribution. The length of the query time (qline) varies from query length 1 to query length 10. The number of objects varies from 1K to 100K. The parameters and settings used in the experiment are listed in Table 1.

Table 1: Parameter settings of the experiment

Parameter     Description              Settings
distribution  Data distribution        Uniform, Gaussian, Zipf
interval      Time interval of query   1, 2, 5, 8, 10
object-no     Number of data objects   1, 10, 30, 50, 100 (K)
5.2 Compared Algorithms and Performance Metrics

The most intuitive method for finding the RNN is to look for the NN of every object. If the number of objects is n, the time complexity of this step is O(n²). Next, one determines which objects' NNs are the query point; if an object's NN is the query point, that object is an RNN of the query point. The time complexity required for this intuitive RNN algorithm is O(n³). However, the most intuitive method for finding the CRNN result would execute the RNN algorithm at every point of continuous time, and the required number of executions cannot even be enumerated. Therefore, the CRNN query time must be segmented before the total execution time required for the CRNN query can be calculated. The more finely the time is segmented, the more executions of the RNN search are required; if the time is segmented into m segments, the time complexity becomes O(m × n³), and if the time is not segmented finely enough, the RNN result may be erroneous. This makes it an inefficient CRNN search algorithm, and it is not compared in this experiment. The efficiency of two methods is compared in this experiment: one uses the Rdnn-tree as the index, and the other uses no index. To evaluate the two methods, the time required for one CRNN search execution is compared; this measure is referred to as the total cost in this study.
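For reference, the intuitive baseline described above can be sketched as follows (Python; a brute-force illustration under the sampling assumption, not the implementation compared in the experiments): each object's NN is found by pairwise comparison, objects closer to the query point than to their own NN form the RNN result, and the CRNN baseline re-runs this at m sampled points on the qline.

import math

def brute_force_rnn(q, objects):
    # objects: dict name -> (x, y). Find each object's NN among the others,
    # then keep the objects that are at least as close to q as to their own NN.
    def d(p1, p2):
        return math.hypot(p1[0] - p2[0], p1[1] - p2[1])
    result = []
    for name, p in objects.items():
        nn_dist = min(d(p, p2) for other, p2 in objects.items() if other != name)
        if d(p, q) <= nn_dist:
            result.append(name)
    return result

def sampled_crnn(ps, pe, objects, m):
    # Naive CRNN baseline: repeat the RNN search at m+1 sampled points on qline.
    out = []
    for i in range(m + 1):
        t = i / m
        q = (ps[0] + t * (pe[0] - ps[0]), ps[1] + t * (pe[1] - ps[1]))
        out.append((t, brute_force_rnn(q, objects)))
    return out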



5.4 Performance Results and Discussion

Based on changes of the parameters (distribution, interval, and object-no), different types of experiments have been conducted. The results are summarized by object-no and by query interval in the following subsections.

5.4.1 The effect of the object-no parameter

First, the query interval is fixed at 5, and the influence of the object-no parameter, i.e., the number of objects, on efficiency under the different types of object distribution is discussed. The experiment result is shown in Figure 11; the X axis represents the number of objects, and the Y axis represents the total cost, the time required for executing one CRNN search. The total cost increases as the number of objects increases. In addition, when the number of objects is 1K, the CRNN search that uses the Rdnn-tree takes about 15 seconds, while the CRNN search without the Rdnn-tree index takes about 300 seconds; the indexed version is about 20 times faster. It is obvious that pruning unnecessary objects by adopting the Rdnn-tree as the index to reduce the CRNN search space provides much higher efficiency than not adopting the Rdnn-tree. The comparison of the influence of the object distribution on efficiency clearly suggests that the Zipf distribution offers the best efficiency, followed by the Uniform distribution, while the Gaussian distribution offers the worst. This result can be explained as follows: because the Zipf-distributed data are located at the far left and bottom of the plane, most of the data are pruned, and the number of objects that remain in RNNCanSet is very small, yielding the lowest total cost. In contrast, the Gaussian-distributed data concentrate at the center of the plane and very little data can be pruned, so many objects remain, RNNCanSet becomes large, and the total cost is the highest.
higher efficiency than not adopting Rdnn-tree. The
Figure 11: Influence of changing object-no under the different data distributions (total time in seconds, log scale, versus object-no from 1K to 100K, for CRNN search with and without the index).

5.4.2 The effect of the query interval parameter

This subsection focuses on the influence of the length of the query interval on each method under the different types of object distribution. The results of the experiment are shown in Figure 12. Generally speaking, when the query interval is lengthened, the distance MinD(qline, Rect) decreases, which reduces the pruning efficiency. In addition, as the interval increases, the number of time segments produced by the CRNN query increases; consequently, the number of RNN searches, one per segment, increases, and the total cost of the CRNN query increases as well.
Figure 12: Effect of the query interval parameter under the different data distributions (total time in seconds, log scale, versus query interval from 1 to 10, for CRNN search with and without the index).

6 Conclusions and Future Works

An efficient CRNN search algorithm is proposed in this study. The algorithm requires only one execution to find the RNN results of an entire continuous RNN search. The diversified experiments in this study also demonstrate the efficiency of the proposed method. As wireless communication and mobile device technology mature, more and more users access information from wireless information systems through mobile devices. To process requests from a growing number of mobile users, data dissemination through broadcast is an effective solution for scalability. The future goal of this study is to extend the CRNN search problem to the wireless broadcasting environment.
7 References

[1] Singh, A., Ferhatosmanoglu, H., and Tosun, A.S. (2003) High dimensional reverse nearest neighbor queries. Proceedings of the International Conference on Information and Knowledge Management (CIKM'03), New Orleans, LA, USA, pp. 91–98.
[2] Barbara, D. (1999) Mobile computing and databases: a survey. IEEE Transactions on Knowledge and Data Engineering, 11, 108–117.
[3] Benetis, R., Jensen, C.S., Karciauskas, G., and Saltenis, S. (2002) Nearest neighbor and reverse nearest neighbor queries for moving objects. International Database Engineering and Applications Symposium, Canada, July 17–19, pp. 44–53.
[4] Chaudhuri, S. and Gravano, L. (1999) Evaluating top-k selection queries. Proceedings of the 25th International Conference on Very Large Data Bases, pp. 397–410.
[5] Civilis, A., Jensen, C.S., and Pakalnis, S. (2005) Techniques for efficient road-network-based tracking of moving objects. IEEE Transactions on Knowledge and Data Engineering, 17, 698–712.
[6] Computer Science and Telecommunications Board (2003) IT Roadmap to a Geospatial Future. The National Academies Press.
[7] http://gist.cs.berkeley.edu/.
[8] Guttman, A. (1984) R-trees: A dynamic index structure for spatial searching. Proceedings of the 1984 ACM SIGMOD International Conference on Management of Data, pp. 47–57.
[9] Hjaltason, G.R. and Samet, H. (1999) Distance browsing in spatial databases. ACM Transactions on Database Systems (TODS), 24, 265–318.
[10] Korn, F. and Muthukrishnan, S. (2000) Influence sets based on reverse nearest neighbor queries. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, Texas, USA, May 16–18, pp. 201–212.
[11] Korn, F., Muthukrishnan, S., and Srivastava, D. (2002) Reverse nearest neighbor aggregates over data streams. Proceedings of the International Conference on Very Large Data Bases (VLDB'02), Hong Kong, China, August, pp. 91–98.
[12] Korn, F., Sidiropoulos, N., Faloutsos, C., Siegel, E., and Protopapas, Z. (1996) Fast nearest neighbor search in medical image databases. Proceedings of the 22nd International Conference on Very Large Data Bases (VLDB'96), pp. 215–226.
[13] Lee, D.L., Lee, W.-C., Xu, J., and Zheng, B. (2002) Data management in location-dependent services. IEEE Pervasive Computing, 1, 65–72.
[14] Roussopoulos, N., Kelley, S., and Vincent, F. (1995) Nearest neighbor queries. Proceedings of the ACM SIGMOD International Conference on Management of Data, Illinois, USA, June, pp. 71–79.
[15] Ross, S. (2000) Introduction to Probability and Statistics for Engineers and Scientists.
[16] Stanoi, I., Agrawal, D., and Abbadi, A.E. (2000) Reverse nearest neighbor queries for dynamic databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, pp. 44–53.
[17] Song, Z. and Roussopoulos, N. (2001) K-nearest neighbor search for moving query point. Proceedings of the 7th International Symposium on Advances in Spatial and Temporal Databases, LNCS 2121, Redondo Beach, CA, USA, July 12–15, pp. 79–96.
[18] Stanoi, I., Riedewald, M., Agrawal, D., and Abbadi, A.E. (2001) Discovery of influence sets in frequently updated databases. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 99–108.
[19] Tao, Y., Papadias, D., and Lian, X. (2004) Reverse kNN search in arbitrary dimensionality. Proceedings of the 30th International Conference on Very Large Data Bases, Toronto, Canada, August 29–September 3, pp. 279–290.
[20] Tao, Y., Papadias, D., and Shen, Q. (2002) Continuous nearest neighbor search. Proceedings of the International Conference on Very Large Data Bases, Hong Kong, China, August 20–23, pp. 279–290.
[21] Xu, J., Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Energy efficient index for querying location-dependent data in mobile environments. Proceedings of the 19th IEEE International Conference on Data Engineering (ICDE'03), Bangalore, India, March, pp. 239–250.
[22] Yang, C. and Lin, K.-I. (2001) An index structure for efficient reverse nearest neighbor queries. Proceedings of the 17th International Conference on Data Engineering, pp. 485–492.
[23] Yiu, M.L., Papadias, D., Mamoulis, N., and Tao, Y. (2005) Reverse nearest neighbors in large graphs. Proceedings of the 21st IEEE International Conference on Data Engineering (ICDE), Tokyo, Japan, April 5–8, pp. 186–187.
[24] Yu, C., Ooi, B.C., Tan, K.-L., and Jagadish, H.V. (2001) Indexing the distance: An efficient method to kNN processing. Proceedings of the 27th VLDB Conference, Roma, Italy, pp. 421–430.
[25] Zheng, B., Lee, W.-C., and Lee, D.L. (2003) Search k nearest neighbors on air. Proceedings of the 4th International Conference on Mobile Data Management, Melbourne, Australia, January, pp. 181–195.
[26] Zheng, B., Xu, J., Lee, W.-C., and Lee, D.L. (2004) Energy conserving air indexes for nearest neighbor search. Proceedings of the 9th International Conference on Extending Database Technology (EDBT'04), Heraklion, Crete, Greece, March, pp. 48–66.
[27] Zhang, J., Zhu, M., Papadias, D., Tao, Y., and Lee, D.L. (2003) Location-based spatial queries. Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, San Diego, California, USA, June 9–12, pp. 443–454.
