Download as ppt, pdf, or txt
Download as ppt, pdf, or txt
You are on page 1of 44

R-Trees

Extension of B+-trees.
Collection of d-dimensional rectangles. A point in d-dimensions is a trivial rectangle.

Non-rectangular Data
Non-rectangular data may be represented by minimum bounding rectangles (MBRs).

Operations
Insert Delete Find all rectangles that intersect a query rectangle. Good for large rectangle collections stored on disk.

R-TreesStructure
Data nodes (leaves) contain rectangles. Index nodes (non-leaves) contain MBRs for data in subtrees. MBR for rectangles or MBRs in a non-root node is stored in parent node.

R-TreesStructure
R-tree of order M.
Each node other than the root has between m <= ceil(M/2) and M rectangles/MBRs.
Assume m = ceil(M/2) henceforth.

Typically, m = ceil(M/2). Root has between 2 and M rectangles/MBRs. Each index node has as many MBRs as children. All data nodes are at the same level.

Example
R-tree of order 4.
Each node may have up to 4 rectangles/MBRs.

Example
Possible partitioning of our example data into 12 leaves.

Example
Possible R-tree of order 4 with 12 leaves.
mn op

a bc d

ef

g h i

j k l

Leaves are data nodes that contain 4 input rectangles each.

a-p are MBRs

mn o p

Example
Possible corresponding grouping.
a m
a b cd e f g h i jk l

mn o p

Example
Possible corresponding grouping.
a m e
a b cd e f g h i jk l

mn o p

Example
Possible corresponding grouping.
a m e
a b cd e f g h i jk l

g h d i o p

Query
Report all rectangles that intersect a given rectangle.

Query

Start at root and find all MBRs that overlap query. Search corresponding subtrees recursively.

mn o p

Query
a b cd e f g h i jk l

n m

mn o p

Query
Search m.
a b cd e f g h i jk l

n a m

a
b

a
x
d

Insert
Similar to insertion into B+-tree but may insert into any leaf; leaf splits in case capacity exceeded.
Which leaf to insert into? How to split a node?

InsertLeaf Selection
Follow a path from root to leaf. At each node move into subtree whose MBR area increases least with addition of new rectangle.
n m

InsertLeaf Selection
Insert into m.

InsertLeaf Selection
Insert into n.
n

InsertLeaf Selection
Insert into o.

InsertLeaf Selection
Insert into p.

InsertSplit A Node
Split set of M+1 rectangles/MBRs into 2 sets A and B.
A and B each have at least m rectangles/MBRs. Sum of areas of MBRs of A and B is minimum.

M = 8, m = 4

InsertSplit A Node
Split set of M+1 rectangles/MBRs into 2 sets A and B.
A and B each have at least m rectangles/MBRs. Sum of areas of MBRs of A and B is minimum.

M = 8, m = 4

InsertSplit A Node
Split set of M+1 rectangles/MBRs into 2 sets A and B.
A and B each have at least m rectangles/MBRs. Sum of areas of MBRs of A and B is minimum.

M = 8, m = 4

InsertSplit A Node
Exhaustive search for best A and B.
Compute area(MBR(A)) + area(MBR(B)) for each possible A. Notefor each A, the B is unique. Select partition that minimizes this sum.

When |A| = m = ceil(M/2), number of choices for A is


(M+1)! m!(M+1-m)! Impractical for large M.

InsertSplit A Node
Grow A and B using a clustering strategy.
Start with a seed rectangle a for A and b for B. Grow A and B one rectangle at a time. Stop when the M+1 rectangles have been partitioned into A and B.

InsertSplit A Node
Quadratic Methodseed selection.
Let S be the set of M+1 rectangles to be partitioned. Find a and b in S that maximize
area(MBR(a,b)) area(a) area(b)
M = 8, m = 4

InsertSplit A Node
Quadratic Methodseed selection.
Let S be the set of M+1 rectangles to be partitioned. Find a and b in S that maximize
area(MBR(a,b)) area(a) area(b)
M = 8, m = 4

InsertSplit A Node
Quadratic Methodassign remaining rectangles/MBRs.
Find an unassigned rectangle c that maximizes
|area(MBR(A,c)) area(MBR(A)) - (area(MBR(B,c)) area(MBR(B)))| M = 8, m = 4

InsertSplit A Node
Quadratic Methodassign remaining rectangles/MBRs.
Find an unassigned rectangle c that maximizes
|area(MBR(A,c)) area(MBR(A)) - (area(MBR(B,c)) area(MBR(B)))| M = 8, m = 4

InsertSplit A Node
Quadratic Methodassign remaining rectangles/MBRs.
Assign c to partition whose area increases least.

M = 8, m = 4

InsertSplit A Node
Quadratic Methodassign remaining rectangles/MBRs.
Continue assigning in this way until all remaining rectangles must necessarily be assigned to one of the two partitions for that partition to have m rectangles.

M = 8, m = 4

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

Separation in xdimension

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

Rectangles with max x-separation

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

Divide by x-width to normalize

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

Separation in ydimension

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

Rectangles with max y-separation

InsertSplit A Node
Linear Methodseed selection.
Choose a and b to have maximum normalized separation.

M = 8, m = 4

Divide by y-width to normalize

InsertSplit A Node
Linear Methodassign remainder.
Assign remaining rectangles in random order. Rectangle is assigned to partition whose MBR area increases least. Stop when all remaining rectangles must be assigned to one of the partitions so that the partition has its minimum required m rectangles. M = 8, m = 4

Delete
If leaf doesnt become deficient, simply readjust MBRs in path from root. If leaf becomes deficient, get from nearest sibling (if possible) and readjust MBRs. Combine with sibling as in B+ tree. Could instead do a more global reorganization to get better R-tree.

Variants
R*-tree
Leaf selection and node overflows in insertion handled differently.

Hilbert R-tree

Related Structures
R+-tree
Index nodes have non-overlapping rectangles. A data object may be represented in several data nodes. No upper bound on size of a data node. No bounds (lower/upper) on degree of an index node.

Related Structures
Cell tree
Combines BSP and R+-tree concepts. Index nodes have non-overlapping convex polyhedrons. No lower/upper bound on size of a data node. Lower bound (but not upper) on degree of an index node.

You might also like