Lecture 09 Hash Index - Without Answers
Long Cheng
Assistant Professor
c.long@ntu.edu.sg
(2) key → h(key) → Ptr (e.g., index id in the directory) → directory → record
• Much smaller storage than the disk file
• Easier to update (suitable for dynamic case)
DATABASE SYSTEM PRINCIPLES: Lecture 09: Hash Index 5
Within a Bucket
Do we keep records sorted (wrt a key)?
Yes, if
• CPU time critical
• Inserts/Deletes not too frequent
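The trade-off above can be sketched in Python. This is only an illustration, assuming a bucket is an in-memory list of keys; the class `SortedBucket` and its methods are hypothetical names, not part of the lecture. Keeping the keys sorted makes a lookup a binary search, while every insert has to shift elements to preserve the order.

```python
import bisect


class SortedBucket:
    """Hypothetical bucket that keeps its keys in sorted order."""

    def __init__(self):
        self.keys = []

    def insert(self, key):
        # O(n) shift to keep the order: acceptable when inserts are infrequent.
        bisect.insort(self.keys, key)

    def contains(self, key):
        # O(log n) binary search: the benefit when CPU time is critical.
        pos = bisect.bisect_left(self.keys, key)
        return pos < len(self.keys) and self.keys[pos] == key


bucket = SortedBucket()
for k in ["c", "a", "b"]:
    bucket.insert(k)
print(bucket.keys)  # ['a', 'b', 'c']
```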
INSERT:
h(a) = 1
h(b) = 2
h(c) = 1
h(d) = 0
[Figure: pointer to a table of 4 buckets, numbered 0 to 3]
Inserts and Deletes
INSERT:
h(a) = 1
h(b) = 2
h(c) = 1
h(d) = 0
h(e) = 1
After inserting a, b, c, d:
Bucket 0: d
Bucket 1: a, c
Bucket 2: b
Bucket 3: (empty)
Inserting e (h(e) = 1): Overflow
Bucket 0: d
Bucket 1: a, c → overflow block: e
Bucket 2: b
Bucket 3: (empty)
2 I/Os for accessing e
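The insert example can be replayed with a short Python sketch. It assumes that each block holds two records (as the figure suggests) and it hard-codes the hash values from the slide instead of computing them; `insert`, `lookup_ios` and `BLOCK_CAPACITY` are illustrative names only. The I/O count covers block reads within the bucket chain and ignores any cost of reaching the bucket.

```python
BLOCK_CAPACITY = 2  # assumption: each block holds two records, as in the figure

# Hash values taken from the slide; a real h would compute these.
H = {"a": 1, "b": 2, "c": 1, "d": 0, "e": 1}

# Each bucket is a chain of blocks: the primary block, then overflow blocks.
buckets = [[[]] for _ in range(4)]


def insert(key):
    chain = buckets[H[key]]
    for block in chain:
        if len(block) < BLOCK_CAPACITY:
            block.append(key)
            return
    chain.append([key])  # all blocks full: allocate an overflow block


def lookup_ios(key):
    """Return the number of blocks read to find key, or None if absent."""
    for ios, block in enumerate(buckets[H[key]], start=1):
        if key in block:
            return ios
    return None


for k in "abcde":
    insert(k)

print(buckets[1])       # [['a', 'c'], ['e']]
print(lookup_ios("e"))  # 2
```

Keys in the primary block cost one block read, while `e` costs two because its overflow block is reached only after the primary block.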
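Delete: e, f, c
Before:
Bucket 0: a
Bucket 1: b, c → overflow block: d
Bucket 2: e
Bucket 3: f, g
After deleting e and f: bucket 2 is empty; move "g" up in bucket 3.
After deleting c: move d from the overflow block into bucket 1.
Result:
Bucket 0: a
Bucket 1: b, d
Bucket 2: (empty)
Bucket 3: g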
Many I/Os
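The delete example (delete e, f, c, moving records up to fill the freed slots) can be sketched as follows. This is a simplified illustration, not the lecture's procedure: it assumes two records per block, takes the bucket number as an argument because the slide gives no hash values for this example, and repacks the whole chain after each delete. The name `delete` and the dictionary layout are hypothetical.

```python
BLOCK_CAPACITY = 2  # assumption: two records per block, as drawn on the slide

# Bucket contents before the deletions; each bucket is a chain of blocks.
buckets = {
    0: [["a"]],
    1: [["b", "c"], ["d"]],  # "d" sits in an overflow block
    2: [["e"]],
    3: [["f", "g"]],
}


def delete(bucket_no, key):
    chain = buckets[bucket_no]
    # Remove the key, then repack survivors so earlier blocks fill first.
    survivors = [k for block in chain for k in block if k != key]
    repacked = [survivors[i:i + BLOCK_CAPACITY]
                for i in range(0, len(survivors), BLOCK_CAPACITY)]
    buckets[bucket_no] = repacked or [[]]  # always keep the primary block


for bucket_no, key in [(2, "e"), (3, "f"), (1, "c")]:
    delete(bucket_no, key)

print(buckets)  # {0: [['a']], 1: [['b', 'd']], 2: [[]], 3: [['g']]}
```

Repacking frees the overflow block of bucket 1, at the price of touching every block in the chain.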
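Initially
• i = 1
• 2 buckets
Directory (i = 1):
0 → block [0001] (uses 1 bit)
1 → block [1001, 1100] (uses 1 bit)
Insert 1010
Cannot use 1 bit for hashing these three keys (1001, 1100, 1010), and thus increase to 2 bits.
Block [1001, 1100] splits into [1001, 1010] (uses 2 bits) and [1100] (uses 2 bits).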
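After inserting 1010
Directory (i = 2):
00 → block [0001] (uses 1 bit)
01 → block [0001] (same block as 00)
10 → block [1001, 1010] (uses 2 bits)
11 → block [1100] (uses 2 bits)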
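Insert: 0111
[Figure: directory with i = 2; blocks [1001, 1010] (uses 2 bits) and [1100] (uses 2 bits) are unchanged.]
Insert: 1001
[Figure: block [1001, 1010] (uses 2 bits) splits into [1001, 1001] (uses 3 bits) and [1010] (uses 3 bits). Block [1100] still uses 2 bits; directory entries 110 and 111 both point to it.]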
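The insert sequence on these slides can be replayed with a minimal Python sketch of extensible hashing on leading bits. Several things here are assumptions rather than the lecture's own code: the class names `Bucket` and `ExtensibleHash`, a block capacity of two keys, the key order (as read from the figures), and modelling an unavoidable overflow by simply letting a block's list grow past capacity.

```python
KEY_BITS = 4        # keys on the slides are 4-bit strings
BLOCK_CAPACITY = 2  # assumption: two keys per block, as drawn


class Bucket:
    def __init__(self, depth):
        self.depth = depth  # number of leading bits this block distinguishes
        self.keys = []


class ExtensibleHash:
    def __init__(self):
        self.i = 1                               # bits used by the directory
        self.directory = [Bucket(1), Bucket(1)]  # entries for prefixes 0 and 1

    def _index(self, key):
        return int(key[:self.i], 2)  # use the first i bits of the key

    def insert(self, key):
        bucket = self.directory[self._index(key)]
        while len(bucket.keys) >= BLOCK_CAPACITY:
            if bucket.depth == KEY_BITS:
                break  # cannot split further: let the block overflow
            self._split(bucket)
            bucket = self.directory[self._index(key)]
        bucket.keys.append(key)

    def _split(self, bucket):
        if bucket.depth == self.i:
            # Double the directory: each old entry is referenced twice.
            self.directory = [b for b in self.directory for _ in range(2)]
            self.i += 1
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        old_keys, bucket.keys = bucket.keys, []
        # Entries whose newly examined bit is 1 now point to the sibling.
        shift = self.i - bucket.depth
        for j, b in enumerate(self.directory):
            if b is bucket and (j >> shift) & 1:
                self.directory[j] = sibling
        for k in old_keys:
            self.directory[self._index(k)].keys.append(k)


eh = ExtensibleHash()
for key in ["0001", "1001", "1100", "1010", "0111", "1001"]:
    eh.insert(key)

print(eh.i)  # 3
for j, b in enumerate(eh.directory):
    print(format(j, f"0{eh.i}b"), b.keys, b.depth)
```

In this run, inserting 1010 doubles the directory from i = 1 to i = 2, and the second 1001 doubles it again to i = 3; entries 110 and 111 share the block holding 1100.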
Deletion in extensible hashing, two options:
• No merging of blocks
• Merge blocks and cut directory if possible (Reverse insert procedure)
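Overflow blocks: many records with duplicate keys
[Figure: the block for key 1100 is full of records with key 1100. Further records with key 1100 cannot be separated by using more bits, so they go to overflow blocks chained to that block.]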
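Why duplicate keys force an overflow block can be checked with a small Python sketch. The helper `distinguishing_bits` is a hypothetical name introduced here; it looks for the smallest number of leading bits that puts a set of keys into more than one group, which is what a split needs in order to make room.

```python
def distinguishing_bits(keys):
    """Smallest number of leading bits that separates keys into more than
    one group, or None if no prefix length can separate them."""
    for i in range(1, len(keys[0]) + 1):
        if len({k[:i] for k in keys}) > 1:
            return i
    return None


print(distinguishing_bits(["1001", "1010", "1100"]))  # 2
print(distinguishing_bits(["1100", "1100", "1100"]))  # None
```

For identical keys no prefix length helps, so once the block is full the extra records can only go to an overflow block.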
Question:
Can you think of another scenario where an
overflow block is not avoidable?
Summary of Indexes
Conventional Index, B+ Tree Index, Hash Index
Next lecture:
Lecture 10: Multiple Key Index