
Lab Session Unit III: Divide and Conquer

Spatial Search with Quadtrees

Carlos Cotta

Departamento de Lenguajes y Ciencias de la Computación


Universidad de Málaga

http://www.lcc.uma.es/~ccottap

Comput Eng, Softw Eng, Comput Sci & Math – 2021-2022

C. Cotta Divide and Conquer Lab 1 / 14


Index


1 Lab Session Unit III: Divide and Conquer


Problem Statement
Quadtrees


Problem Context
We have a map with the location of some points of interest (POIs).
Given this information, we are interested in answering questions such as:
- Which POI is the nearest to my current location?
- Which POIs are within a certain distance of my current location?
To answer these questions we need a spatial database.


Spatial Databases

A spatial database is a database optimized for storing objects defined in a geometric space, and for providing tools to query and analyze such data.
Spatial databases are very important in Geographical Information Systems (GIS).
Different algorithms and data structures can be used to arrange spatial data for efficient search: R-trees, k-d trees, and others. We will consider a very simple approach: quadtrees.


What is a Quadtree?

A quadtree is a tree in which each internal node has up to four children. In the slide's image, yellow nodes are internal nodes, green nodes are leaves, and dashed nodes represent empty subtrees.


Why Quadtrees?
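The quaternary structure lends itself to representing a partition of the plane into four quadrants.

[Figure: a square region S split into quadrants A (top-left), B (top-right), C (bottom-left) and D (bottom-right), next to a tree whose root S has children A, B, C and D.]

We will refer to the four children as top-left, top-right, bottom-left and bottom-right.
Each quadrant can be successively divided into four smaller regions as we go down the tree.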
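
The partition into quadrants can be sketched in a few lines of Python. This is only an illustration: the `Rect` type, the `quadrants` helper and the convention that y grows upwards are assumptions made here, not part of the lab statement.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle with corners (x0, y0) and (x1, y1).

    Assumption: y grows upwards, so the "top" quadrants have the larger y.
    """
    x0: float
    y0: float
    x1: float
    y1: float


def quadrants(r: Rect) -> tuple[Rect, Rect, Rect, Rect]:
    """Return the top-left, top-right, bottom-left and bottom-right quadrants of r."""
    xm = (r.x0 + r.x1) / 2  # vertical dividing line
    ym = (r.y0 + r.y1) / 2  # horizontal dividing line
    return (
        Rect(r.x0, ym, xm, r.y1),  # top-left
        Rect(xm, ym, r.x1, r.y1),  # top-right
        Rect(r.x0, r.y0, xm, ym),  # bottom-left
        Rect(xm, r.y0, r.x1, ym),  # bottom-right
    )


# Going down two levels: the top-right quadrant of the top-left quadrant.
unit = Rect(0.0, 0.0, 1.0, 1.0)
top_left = quadrants(unit)[0]
print(top_left)                # Rect(x0=0.0, y0=0.5, x1=0.5, y1=1.0)
print(quadrants(top_left)[1])  # Rect(x0=0.25, y0=0.75, x1=0.5, y1=1.0)
```

Each call halves both sides of the rectangle, so a region at depth k has 1/4^k of the area of the root region.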


Using Quadtrees to Store Spatial Data

Let P = {p_1, ..., p_n} be a collection of points, and let R be a rectangle enclosing all of them. We will build a quadtree Q. Each node of Q represents some P′ ⊆ P.

1. The root of Q represents P. Each node in the 1st level represents the points of P in one of the quadrants of R. Nodes in the 2nd level represent points in a subquadrant, and so on.
2. If a subquadrant has no points, the corresponding subtree is empty.
3. If a subquadrant has exactly 1 point, it is a leaf of the quadtree.


Constructing a Quadtree

Each node of the quadtree will contain a pair ⟨r, p⟩, where r is a rectangle that encloses all the points r(P) ⊆ P represented by the subtree, and p ∈ r(P) is one of them (any p will do; its usefulness will be shown later).
Build quadtree

func build-quadtree(↓r: Rectangle, ↓P: Set⟨Point⟩): Quadtree⟨Rectangle, Point⟩
begin
    if P = ∅ then return empty-quadtree()
    else if P = {p} then return quadtree(⟨r, p⟩)
    else
        p ← pick-point(P)
        r_1, ..., r_4; P_1, ..., P_4 ← split(r, P)
        for i ← 1 to 4 do q_i ← build-quadtree(r_i, P_i) endfor
        return quadtree(⟨r, p⟩, q_1, q_2, q_3, q_4)
    endif
end
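
The pseudocode above could be rendered in Python roughly as follows. This is a sketch under several assumptions that the slides do not state: rectangles are `(x0, y0, x1, y1)` tuples with y growing upwards, points lying on a dividing line go to the right or top quadrant, `pick-point` simply takes the first point, and all points are distinct (with duplicated points the recursion would never terminate). The names `Node`, `split` and `leaves` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

Point = tuple[float, float]
Rect = tuple[float, float, float, float]  # (x0, y0, x1, y1), y grows upwards


@dataclass
class Node:
    rect: Rect    # r: rectangle enclosing every point of the subtree
    point: Point  # p: one of the points of the subtree
    children: tuple[Optional[Node], ...] = ()  # () for a leaf; else TL, TR, BL, BR


def split(r: Rect, pts: list[Point]) -> tuple[list[Rect], list[list[Point]]]:
    """Divide r into four quadrants and distribute pts among them in one pass."""
    x0, y0, x1, y1 = r
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    rects = [(x0, ym, xm, y1), (xm, ym, x1, y1), (x0, y0, xm, ym), (xm, y0, x1, ym)]
    parts: list[list[Point]] = [[], [], [], []]
    for x, y in pts:
        # Assumption: points on a dividing line go to the right / top quadrant.
        index = (0 if y >= ym else 2) + (0 if x < xm else 1)
        parts[index].append((x, y))
    return rects, parts


def build_quadtree(r: Rect, pts: list[Point]) -> Optional[Node]:
    """Build the quadtree of pts inside r; None stands for the empty quadtree."""
    if not pts:
        return None
    if len(pts) == 1:
        return Node(r, pts[0])
    rects, parts = split(r, pts)
    children = tuple(build_quadtree(ri, pi) for ri, pi in zip(rects, parts))
    return Node(r, pts[0], children)  # pick-point: any point will do


def leaves(q: Optional[Node]) -> list[Point]:
    """Collect the points stored in the leaves (one leaf per input point)."""
    if q is None:
        return []
    if not q.children:
        return [q.point]
    return [p for child in q.children for p in leaves(child)]


pts = [(1, 1), (3, 1), (1, 3), (3, 3), (3.5, 3.5)]
root = build_quadtree((0, 0, 4, 4), pts)
print(len(leaves(root)))  # 5
```

In this example the points (3, 3) and (3.5, 3.5) share the top-right quadrant for two levels before they are separated, which shows how the depth depends on how close the points are to each other.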


Complexity of the Construction Procedure

Splitting the collection of points involves checking every point in order to determine the quadrant to which it should be assigned. If |P| = n, then this is Θ(n).
The best case takes place when the partition is perfectly balanced (n/4 points in each quadrant). Thus,

T(n) = 4T(n/4) + Θ(n)

The best-case time complexity is therefore Θ(n log n).
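
The recurrence can be unrolled as follows. This is a sketch assuming that n is a power of 4 and writing the Θ(n) splitting cost as cn for some constant c > 0; both are simplifications introduced here.

```latex
\begin{align*}
T(n) &= 4\,T(n/4) + cn \\
     &= 16\,T(n/16) + 2cn \\
     &= \dots = 4^{k}\,T(n/4^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_4 n \qquad (k = \log_4 n) \\
     &= \Theta(n \log n).
\end{align*}
```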


This could degenerate into Θ(n²) in the worst case: for instance, if every split separates just one point from the rest, then T(n) = T(n − 1) + Θ(n).

Research goal (I) for this project: What is the average complexity?


Querying for Points within a Region

To obtain the collection of points within a certain region C, we proceed recursively. We start at the root, which covers a rectangular area R.
Let r be the area covered by the current node. If r ∩ C ≠ ∅ (i.e., the rectangle r and the region C overlap), then:
1. If |r(P)| = 1 (i.e., there is a single point p in the quadrant), we check whether p ∈ C. If so, we return {p}; otherwise, we return ∅.
2. If |r(P)| > 1, we call recursively on r_1, ..., r_4 (the children of the current node), and return the union of the point sets obtained in each recursive call.
If r and C do not overlap, or the subtree is empty, we return ∅.
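
A minimal Python sketch of this query is given next, assuming that the region C is a disc (all points within a given radius of a center), which matches the "within a certain distance" question. The tuple-based node layout `(rect, point, children)`, the compact `build` helper and `dist_to_rect` are illustrative assumptions, not the lab's reference implementation.

```python
import math


def build(r, pts):
    """Compact builder: node = (rect, point, children); None = empty subtree.

    Assumes distinct points and rectangles given as (x0, y0, x1, y1).
    """
    if not pts:
        return None
    if len(pts) == 1:
        return (r, pts[0], ())
    x0, y0, x1, y1 = r
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    quads = [(x0, ym, xm, y1), (xm, ym, x1, y1), (x0, y0, xm, ym), (xm, y0, x1, ym)]
    parts = [[], [], [], []]
    for x, y in pts:
        parts[(0 if y >= ym else 2) + (0 if x < xm else 1)].append((x, y))
    return (r, pts[0], tuple(build(q, p) for q, p in zip(quads, parts)))


def dist_to_rect(r, c):
    """Euclidean distance from point c to rectangle r (0 if c lies inside r)."""
    x0, y0, x1, y1 = r
    dx = max(x0 - c[0], 0.0, c[0] - x1)
    dy = max(y0 - c[1], 0.0, c[1] - y1)
    return math.hypot(dx, dy)


def query(node, center, radius):
    """Return the set of stored points within `radius` of `center`."""
    if node is None:
        return set()
    r, p, children = node
    if dist_to_rect(r, center) > radius:  # r and C do not overlap
        return set()
    if not children:                      # leaf: a single point to check
        return {p} if math.dist(p, center) <= radius else set()
    found = set()
    for child in children:                # union over the four quadrants
        found |= query(child, center, radius)
    return found


pts = [(1, 1), (3, 1), (1, 3), (3, 3), (3.5, 3.5)]
root = build((0, 0, 4, 4), pts)
print(sorted(query(root, (3, 3), 1.0)))  # [(3, 3), (3.5, 3.5)]
```

The overlap test between a rectangle and a disc reduces to comparing the distance from the center to the rectangle with the radius; other region shapes would need their own overlap test.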


Complexity of the Query Procedure

The time complexity of the query will depend on the area C and the number of points within it.
Let's assume as a best case that the area C is small enough to contain only a number of points bounded by a constant. Such a small area will overlap with just one quadrant at each level until it gets very close to the leaves. If the partition is also balanced, then

T(n) = T(n/4) + Θ(1)

The best-case time complexity is therefore Θ(log n).
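
For completeness, the recurrence unrolls as shown below, again assuming that n is a power of 4 and writing the Θ(1) cost per node as a constant c > 0 (simplifications introduced here).

```latex
\begin{align*}
T(n) &= T(n/4) + c \\
     &= T(n/16) + 2c \\
     &= \dots = T(n/4^{k}) + k\,c \\
     &= T(1) + c\log_4 n \qquad (k = \log_4 n) \\
     &= \Theta(\log n).
\end{align*}
```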

Research goal (II) for this project: Empirically verify this claim.


Finding the Nearest Point

Let q be the point whose nearest neighbor ν(q) ∈ P we are looking for. Let d be the (unknown) minimum distance between q and ν(q). We initialize d = ∞ and proceed recursively from the root.

Let ⟨r, p⟩ be the current node:
1. If distance(r, q) > d, we can ignore this quadrant.
2. Otherwise, let d′ = distance(p, q).
   1. If d′ < d, update d ← d′ and keep p as the current candidate.
   2. Call recursively on each of the children of the current node.
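
The search can be sketched in Python as follows. As before, the node layout `(rect, point, children)`, the `build` and `dist_to_rect` helpers, and the choice of passing the current best as a `(d, p)` pair are assumptions made for this illustration; distinct points are assumed.

```python
import math


def build(r, pts):
    """Compact builder: node = (rect, point, children); None = empty subtree."""
    if not pts:
        return None
    if len(pts) == 1:
        return (r, pts[0], ())
    x0, y0, x1, y1 = r
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
    quads = [(x0, ym, xm, y1), (xm, ym, x1, y1), (x0, y0, xm, ym), (xm, y0, x1, ym)]
    parts = [[], [], [], []]
    for x, y in pts:
        parts[(0 if y >= ym else 2) + (0 if x < xm else 1)].append((x, y))
    return (r, pts[0], tuple(build(q, p) for q, p in zip(quads, parts)))


def dist_to_rect(r, q):
    """Euclidean distance from point q to rectangle r (0 if q lies inside r)."""
    x0, y0, x1, y1 = r
    dx = max(x0 - q[0], 0.0, q[0] - x1)
    dy = max(y0 - q[1], 0.0, q[1] - y1)
    return math.hypot(dx, dy)


def nearest(node, q, best=(math.inf, None)):
    """Return (d, p): the smallest distance found so far and the point achieving it."""
    if node is None:
        return best
    r, p, children = node
    if dist_to_rect(r, q) > best[0]:  # the whole quadrant is too far away
        return best
    d_new = math.dist(p, q)
    if d_new < best[0]:               # p improves on the current candidate
        best = (d_new, p)
    for child in children:            # a leaf has no children, so this is skipped
        best = nearest(child, q, best)
    return best


pts = [(1, 1), (3, 1), (1, 3), (3, 3), (3.5, 3.5)]
root = build((0, 0, 4, 4), pts)
d, p = nearest(root, (2.9, 1.2))
print(p)  # (3, 1)
```

The point p stored in each internal node gives an early finite value for d, so pruning can start before any leaf is reached. Here the children are visited in a fixed order; visiting them by increasing distance to q is one possible refinement.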


Complexity of the Search Procedure

The time complexity will depend on the location of point q and the points in the collection P. The recursive procedure may be inefficient if we are unlucky with the order in which quadrants are explored.
The best case is when we can screen out all quadrants but one in each call. Then,

T(n) = T(n/4) + Θ(1)

The best-case time complexity is therefore Θ(log n).

Research goal (III) for this project: What is the average complexity?


Image Credits

Vector spatial data: image by SydneyF.
https://community.alteryx.com/t5/Data-Science/Vector-and-Raster-A-Tale-of-Two-Spatial-Data-Types/ba-p/336141
