Professional Documents
Culture Documents
2018 Day19 Hashing
2018 Day19 Hashing
Malay Bhattacharyya
Assistant Professor
1 Basics
2 Hash Collision
What is hashing?
Definition (Hashing)
Hashing is the process of indexing and retrieving data items in a
data structure to provide faster way (preferably O(1)) of finding
the element using the hash function.
Hash function
Hash function
Hash function
Hash function
Definition (Preimage)
For a hash value k ∗ = h(k), k is defined as the preimage of k ∗ .
h(k1 ) = h(k2 ).
For example, the keys 121 and 1234321 will have hash collision
with respect to the hash function h(k) = k%11.
Strategy 1: Resolution
Closed addressing: Store all the elements with hash collisions
in an auxiliary data structure (e.g., linked list, BST, etc.)
outside the hash table.
Open addressing: Store all the elements with hash collisions by
strategically moving them from preferred to the other positions
in the hash table itself.
Strategy 1: Resolution
Closed addressing: Store all the elements with hash collisions
in an auxiliary data structure (e.g., linked list, BST, etc.)
outside the hash table.
Open addressing: Store all the elements with hash collisions by
strategically moving them from preferred to the other positions
in the hash table itself.
Strategy 2: Avoidance
Perfect hashing: Ensure that collisions do not happen and if
happen relocate the other elements.
Strategy 1: Resolution
Closed addressing: Store all the elements with hash collisions
in an auxiliary data structure (e.g., linked list, BST, etc.)
outside the hash table.
Open addressing: Store all the elements with hash collisions by
strategically moving them from preferred to the other positions
in the hash table itself.
Strategy 2: Avoidance
Perfect hashing: Ensure that collisions do not happen and if
happen relocate the other elements.
Closed addressing
k 0 1 2 3 4 5 6 7
42 →
94 25 8
11 24 9
16
68
23
22
y 10
11
37 21 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
42
94 → 25 8
11 24 9
16 23 INSERTION 10
68 22 11
37 21 12
42
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
42
94 25 8
11 → 24 9
16 23 CHAINING 10
68 22 11
37 21 12
42
20 19 18 17 16 15 14 13
94
k 0 1 2 3 4 5 6 7
42
94 25 8
11 24 9
16 → 23 INSERTION 10
68 22 11 11
37 21 12
42
20 19 18 17 16 15 14 13
94
k 0 1 2 3 4 5 6 7
42
94 25 8
11 24 9
16 23 CHAINING 10
68 → 22 11 11
37 21 12
42
20 19 18 17 16 15 14 13
94
16
k 0 1 2 3 4 5 6 7
42
94 25 8
11 24 9
16 23 CHAINING 10
68 22 11 11
37 → 21 12
42
20 19 18 17 16 15 14 13
94
16
68
k 0 1 2 3 4 5 6 7
42
94 25 8
11 24 9
16 23 CHAINING 10
68 22 11 11 37
37 21 12
42
20 19 18 17 16 15 14 13
94
16
68
#define HASH_UNROLL 10
typedef struct node{
unsigned hash_val[HASH_UNROLL];
DATA data[HASH_UNROLL]; // Array of elements at a node
struct node *next;
}HNODE;
k 0 1 2 3 4 5 6 7
28 →
61 25 8
99 24 9
35
9
23
22
y 10
11
55 21 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 → 25 8
99 24 9
35 23 INSERTION 10
9 22 11
55 21 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 → 24 61 9
35 23 INSERTION 10
9 22 11
55 21 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 → 23 INSERTION 10
9 22 11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 → 23 COLLISION 10
9 22 11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 23 PROBING 35 10
9→ 22 11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 23 COLLISION 35 10
9→ 22 11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 23 PROBING, COLLISION 35 10
9→ 22 11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 23 PROBING 35 10
9 22 9 11
55 → 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28 55
61 25 8
99 24 61 9
35 23 INSERTION 35 10
9 22 9 11
55 21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
80 28? 55
25 8
24 61 9
23 LOOKUP, MOVE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
80 28 55?
25 8
24 61 9
23 LOOKUP, MOVE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
80 28 55 ?
25 8
24 61 9
23 LOOKUP, NOWHERE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
35 28 55
25 8
24 61? 9
23 LOOKUP, MOVE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
35 28 55
25 8
24 61 9
23 LOOKUP, FOUND 35? 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
99 28 55
25 8
24 61 9
23 LOOKUP, FOUND 35 10
22 9 11
21 99? 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
61 × 28 55
25 8
24 61? 9
23 LOOKUP, DELETE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
28 × 28? 55
25 8
24 9
23 LOOKUP, DELETE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
55 × 55?
25 8
24 9
23 LOOKUP, DELETE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
35 ×
25 8
24 ? 9
23 LOOKUP, NOWHERE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
s 0 1 2 3 4 5 6 7
35 × $ $
25 8
24 $? 9
23 LOOKUP, MOVE 35 10
22 9 11
21 99 12
20 19 18 17 16 15 14 13
Note: It has the only disadvantage that as soon as the hash table
fills up the performance degrades.
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 61 9
35 →
9
23
22
y 10
11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 35 9
35 23 RELOCATION 61 10
9→ 22 11
55 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28
61 25 8
99 24 9 9
35 23 RELOCATION 35 10
9 22 61 11
55 → 21 99 12
20 19 18 17 16 15 14 13
k 0 1 2 3 4 5 6 7
28 28 55
61 25 8
99 24 9 9
35 23 INSERTION 35 10
9 22 61 11
55 21 99 12
20 19 18 17 16 15 14 13
Cuckoo hashing
Cuckoo hashing
k 0 0
10 → 1 1
1 2 2
92
4
3
4 ↓ 3
4
2 5 5
6 6
7 7
k 0 0
10 1 1
1→ 2 10 2
92 3 3
INSERTION
4 4 4
2 5 5
6 6
7 7
k 0 0
10 1 1 1
1 2 10 2
92 → 3 3
INSERTION
4 4 4
2 5 5
6 6
7 7
k 0 0
10 1 1 1
1 2 10 2
92 3 3
INSERTION
4→ 4 92 4
2 5 5
6 6
7 7
k 0 0
10 1 1 1
1 2 10 2
92 3 3
2ND CHOICE
4→ 4 92 4
2 5 5
6 6
7 7
k 0 0
10 1 1 1
1 2 10 92 2
92 3 3
RELOCATION
4 4 4 4
2→ 5 5
6 6
7 7
k 0 0
10 1 1 1
1 2 10 92 2
92 3 3
2ND CHOICE
4 4 4 4
2→ 5 5
6 6
7 7
k 0 0
10 1 1 10 1
1 2 2 92 2
92 3 3
RELOCATION
4 4 4 4
2 5 5
6 6
7 7
k 0 0
92 × 1 1 10 1
2 2 92 2
3 3
LOOKUP, MOVE
4 4? 4
5 5
6 6
7 7
k 0 0
92 × 1 1 10 1
2 2 92? 2
3 3
LOOKUP, DELETE
4 4 4
5 5
6 6
7 7
Cryptography
Cryptography
Requirement Description
Variable input size h can be applied to a block of data of any size
Fixed output size h produces a fixed-length output
h(k) is relatively easy to compute for any
Efficiency given k, making both hardware and software
implementations practical
Preimage resistant For any given hash value k ∗ , it is computationally
(one-way property) infeasible to find k such that k ∗ = h(k)
Second preimage For any given key k1 , it is computationally
resistant (weak infeasible to find k2 6= k1 with h(k1 ) = h(k2 )
collision resistant)
Collision resistant It is computationally infeasible to find any pair
(strong collision (k1 , k2 ) such that h(k1 ) = h(k2 )
resistant)
Pseudorandomness Output of k meets standard tests for
pseudorandomness
Cryptography
Dimensionality reduction
Problems – Day 19
Problems – Day 19