CH 6 Searching Algorithms and Hashing


6. Searching Algorithms and Hashing


Data Structure and Algorithms

Dr. Udaya Raj Dhungana


Assist. Professor
Pokhara University, Nepal
Guest Faculty
Hochschule Darmstadt University of Applied Sciences, Germany
E-mail: udaya@pu.edu.np and udayas.epost@gmail.com

Unit 6: Searching Algorithms and Hashing (5 hrs)
1. Sequential Search
2. Binary Search
3. Hashing
3.1. Hash Function
3.2. Hash Table
3.3. Hashing as a Data Structure and a Search Technique
4. Collision in Hash Table
5. Collision Resolution Techniques
5.1. Open Hashing: Separate Chaining
5.2. Closed Hashing: Linear Probing, Quadratic Probing and Double Hashing
6. Load Factor and Rehashing

Dr. Udaya R. Dhungana, Pokhara University, Nepal. udaya@pu.edu.np 2


Searching
• Searching is the algorithmic process of finding a particular item, or the location of an item, in a collection of items.
• A search typically returns either True or False, indicating whether the item is present.
• A search sometimes returns the location of the found data item.
• A search is said to be successful or unsuccessful depending on whether the element being searched for is found or not.

[Figure: a collection of records keyed by number: 1892 Ram, 1893 Sujan, 1894 Shyam, 1895 Abhi, 1896 Arpan, 1897 Paru, 1898 Milan]


Search in Unsorted List
1. Praveen Yadav 13. Keshab Thapa 24. Sangat Sharma 36. Prashanna Sapkota
2. Sudip Kandel 14. Irosh Panday 25. Akash Babu Baral 37. Pratikshya Aryal
3. Kiran Chhantyal 15. Chandra Paudel 26. Madhav Sharma Regmi 38. Sudip KC
4. Sambat Bhujel 16. Aishwarya Baniya 39. Pujan Paudel
5. Ankita Bhusal 17. Madan Adhikari 27. Sandeep Adhikari 40. Shikhar Bahik
6. Gaurav Adhikari 18. Kapil Raj Baral 28. Nischal Gautam 41. Sisam Rimal
7. Anu Sapkota Ashish 19. Ramesh Khatri 29. Sandesh Sigdel 42. Roshan B.K.
8. Samip Poudel 20. Sandesh Baral 30. Pooja Dhakal 43. Prabesh Baral
9. Himal Midun 21. Lily Gautam 31. Pradeep Adhikari 44. Sujan Tiwari
10. Bikash Thapa 22. Tilak Bdr Karki 32. Saurav Dhakal 45. Adheesh Tiwari
11. Prabhat Pandit 23. Sagar Gautam 33. Raghubir Prasad Tharu 46. Toya Narayan Kafle
12. Ramesh Khatri 35. Ujwal Paudel 47. Sudip Poudel

1. You are given an unsorted list of students' names.
2. Find the name Adheesh Tiwari in the list.
3. How much time (or how many comparisons) does it take to locate the name?

Do the same for the sorted list.

Search in Sorted List
A: Adheesh Tiwari, Aishwarya Baniya, Akash Babu Baral, Ankita Bhusal, Anu Sapkota Ashish
B, C, G, H, I: Bikash Thapa, Chandra Paudel, Gaurav Adhikari, Himal Midun, Irosh Panday
K, L, M, N: Kapil Raj Baral, Keshab Thapa, Kiran Chhantyal, Lily Gautam, Madan Adhikari, Madhav Sharma Regmi, Nischal Gautam
P: Pooja Dhakal, Prabesh Baral, Prabhat Pandit, Pradeep Adhikari, Prashanna Sapkota, Pratikshya Aryal, Praveen Yadav, Pujan Paudel
R: Raghubir Prasad Tharu, Ramesh Khatri, Roshan B.K.
S: Sagar Gautam, Sambat Bhujel, Samip Poudel, Sandeep Adhikari, Sandesh Baral, Sandesh Sigdel, Sangat Sharma, Saurav Dhakal, Shikhar Bahik, Sisam Rimal, Sudip Kandel, Sudip KC, Sudip Poudel, Sujan Tiwari
T, U: Tilak Bdr Karki, Toya Narayan Kafle, Ujwal Paudel

If the data are sorted, searching becomes much easier and more efficient.

Sequential Search
• Sequential search is also called linear search.
• It compares the item to be searched (the key) with every element in the array/list, one at a time, starting from the first element.
• It keeps comparing until the item (with a matching key) is found or all the elements have been examined without success.
• If the item is found, it returns the index (or location) of the item; otherwise it returns -1.
• Sequential search can be applied to a sorted or an unsorted list.


Sequential Search
[Figure: sequential search for the item 39 in an array; the search is successful and the returned location is 4.]


Sequential Search- Algorithm
Linear Search (arr[], item, n)
// arr is the name of the array, and item is the element being searched for.
// n is the total number of elements in the array (valid indices are 0 to n-1).
Step 1: Set i to 0
Step 2: If i >= n then go to step 7
Step 3: If arr[i] = item then go to step 6
Step 4: Set i to i + 1
Step 5: Go to step 2
Step 6: Print "item found at index i" and go to step 8
Step 7: Print "item not found"
Step 8: Exit


Sequential Search- Implementation in C
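
The original code listing for this slide is not preserved here. The following is a minimal sketch of the sequential search steps in C, not the author's listing. The function name linear_search is an assumption; it returns the index of the first match, or -1 when the item is absent.

```c
/* Sequential (linear) search: a sketch, with a hypothetical function name.
   arr holds n elements at indices 0..n-1.
   Returns the index of the first element equal to item, or -1 if absent. */
int linear_search(const int arr[], int n, int item)
{
    for (int i = 0; i < n; i++) {
        if (arr[i] == item) {
            return i;   /* successful search: report the location */
        }
    }
    return -1;          /* every element examined without success */
}
```

For example, with a hypothetical array {10, 50, 30, 70, 39, 60}, calling linear_search(a, 6, 39) returns 4, matching the "returned location = 4" outcome described for the item 39.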


• The linear search algorithm is easy to implement and is efficient in two scenarios:
• When the list contains only a few elements
• When searching for a single element in an unordered array
• Time Complexity:
• Best-case: O(1)
• Average-case: O(n)
• Worst-case: O(n)


Binary Search
• Binary search is a fast and efficient search algorithm.
• It can be applied only to sorted data.
• The binary search algorithm works on the principle of divide and conquer.
• It improves the search by repeatedly dividing the array/list in half until it either finds the item or the array/list is narrowed down to a single element that does not match the searched item.
• In every iteration (or comparison), the search range is narrowed down to half of the remaining elements.
• Its worst-case time complexity is O(log n).


How does binary search work?
• Binary search looks for the item to be searched by comparing it with the middle item in the array.
• If a match occurs, then the index of the middle item is returned.
• If the item to be searched is less than the middle item, then it is searched for in the sub-array to the left of the middle item.
• Otherwise, it is searched for in the sub-array to the right of the middle item.
• This process continues on the sub-array until the search is successful or the size of the sub-array reduces to zero.


Binary Search
[Figure: illustration of binary search. Source: https://brilliant.org/wiki/binary-search/]

How does binary search work?
• Example: Consider a sorted array of 10 elements (indices 0 to 9), and assume that we need to find the location of the value 31 using binary search.

Step 1: Determine the middle of the array using this formula:
mid = low + (high - low) / 2
= 0 + (9 - 0) / 2 = 4.5, truncated to 4 (integer division)
So, index 4 is the mid of the array and 27 is the middle element.

Step 2: Compare the value 31 with the value at location 4.
Here, 31 > 27. So, the item to be searched, 31, must be in the half to the right of the middle element 27.

Step 3: Find the new low and the new mid:
low = mid + 1 = 4 + 1 = 5
mid = low + (high - low) / 2 = 5 + (9 - 5) / 2 = 7
So, index 7 is the new mid and 35 is the new middle element.

Step 4: Compare the value 31 with the value at location 7.
Here, 31 < 35. So, the item to be searched, 31, must be in the half to the left of the middle element 35.

Step 5: Find the new high and the new mid:
high = mid - 1 = 7 - 1 = 6
mid = low + (high - low) / 2 = 5 + (6 - 5) / 2 = 5.5, truncated to 5
So, index 5 is the new mid and 31 is the new middle element.

Step 6: Compare the value 31 with the value at location 5.
Here, 31 = 31. So, the item to be searched, 31, is found at index 5.
Return the location 5, where the searched element is located.


Binary Search Implementation
• There are two forms of binary search implementation:
• Iterative Method
• Space complexity = O(1)
• Recursive Method
• Space complexity = O(log2 n) = O(log n)
• Although the recursive version is easier to implement, the iterative approach is more efficient.


Binary Search Implementation
• Iterative Method:
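
The original listing for this slide is not preserved here. What follows is a minimal sketch of the iterative method in C, using the mid = low + (high - low) / 2 formula from the worked example. The function name binary_search_iterative is an assumption, and the array is assumed to be sorted in ascending order.

```c
/* Iterative binary search: a sketch, with a hypothetical function name.
   arr must be sorted in ascending order and holds n elements.
   Returns the index of item, or -1 if it is not present. */
int binary_search_iterative(const int arr[], int n, int item)
{
    int low = 0;
    int high = n - 1;

    while (low <= high) {
        /* Written this way so the midpoint cannot overflow for large indices. */
        int mid = low + (high - low) / 2;

        if (arr[mid] == item) {
            return mid;            /* match at the middle element */
        } else if (arr[mid] < item) {
            low = mid + 1;         /* continue in the right half */
        } else {
            high = mid - 1;        /* continue in the left half */
        }
    }
    return -1;                     /* sub-array shrank to size zero */
}
```

Only a fixed number of local variables are used, which is why the space complexity of this form is O(1).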



Binary Search Implementation
• Recursive Method:
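
The original listing for this slide is not preserved here. The following is a minimal sketch of the recursive method in C. The function name binary_search_recursive and its (low, high) parameters are assumptions; the caller passes low = 0 and high = n - 1 for an array of n sorted elements.

```c
/* Recursive binary search: a sketch, with a hypothetical function name.
   Searches arr[low..high] (inclusive), which must be sorted ascending.
   Returns the index of item, or -1 if it is not present. */
int binary_search_recursive(const int arr[], int low, int high, int item)
{
    if (low > high) {
        return -1;                 /* empty sub-array: unsuccessful search */
    }

    int mid = low + (high - low) / 2;

    if (arr[mid] == item) {
        return mid;
    }
    if (item < arr[mid]) {
        return binary_search_recursive(arr, low, mid - 1, item);  /* left half */
    }
    return binary_search_recursive(arr, mid + 1, high, item);     /* right half */
}
```

Each call halves the range, so at most about log2 n calls are active at once, which is where the O(log n) space cost of the recursive form comes from.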



Relation to Binary Search Tree



Binary Search: Analysis
• Space Complexity:
• Iterative Method: O(1)
• Recursive Method: O(log2 n)
• Time Complexity:
• Worst-case = O(log2 n)
• Average-case = O(log2 n)
• Best-case = O(1)


Analysis: Linear/Binary Search
• Let us consider an array containing 10 items. Suppose we want to search for element 31. See how the linear and binary searches progress:
[Figure: step-by-step progress of linear search and binary search on the same array]
Linear search found the element in 6 comparisons (i.e. in O(n)).
Binary search found the element in 3 comparisons (i.e. in O(log n)).
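
The comparison counts above can be reproduced with a small sketch in C. The array contents below are an assumption: the source figure is not preserved, so a hypothetical sorted array is used that agrees with the values mentioned in the worked example (27 at index 4, 31 at index 5, 35 at index 7). The function names are also hypothetical. For binary search, one "comparison" here means one middle element examined.

```c
/* Counts how many elements linear search examines before stopping. */
int linear_comparisons(const int arr[], int n, int item)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        count++;                       /* one comparison against arr[i] */
        if (arr[i] == item) {
            break;
        }
    }
    return count;
}

/* Counts how many middle elements binary search examines before stopping.
   arr must be sorted in ascending order. */
int binary_comparisons(const int arr[], int n, int item)
{
    int low = 0, high = n - 1, count = 0;
    while (low <= high) {
        int mid = low + (high - low) / 2;
        count++;                       /* one middle element examined */
        if (arr[mid] == item) {
            break;
        } else if (arr[mid] < item) {
            low = mid + 1;
        } else {
            high = mid - 1;
        }
    }
    return count;
}
```

With the assumed array {10, 14, 19, 26, 27, 31, 33, 35, 42, 44} and item 31, linear_comparisons gives 6 and binary_comparisons gives 3.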


Analysis: Linear/Binary Search
• Let us consider an array containing 10 items. Suppose we want to search for element 31, which is stored at index 5.
1. Linear Search: O(n)
2. Binary Search: O(log n)
3. For a very big array, it takes a lot of time to find an item.
4. If we know in advance that the item 31 is at index 5, the lookup is independent of the array size and we can find it in O(1).
5. Can we calculate the index number from the item value? The answer is YES.

Can we do better?

• Now, we will attempt to go one step further by building a data structure that can be searched in O(1) time.
• This concept is referred to as hashing.

• Also see: https://www.youtube.com/watch?v=KyUTuwz_b7Q


Hashing
• Suppose a list of integer values: 26, 17, 77, 53, 30, 43, 62. Let us define a new way of storing and searching these values in an integer array.
• Solution: let us define a function f(x) that gives the index at which the integer value x is stored:

f(x) = x mod array_size

• Here the array has 7 slots, so f(x) = x mod 7. To store each value:
f(26) = 26 mod 7 = 5. The value 26 is stored at index 5.
f(17) = 17 mod 7 = 3. The value 17 is stored at index 3.
f(77) = 77 mod 7 = 0. The value 77 is stored at index 0.
f(53) = 53 mod 7 = 4. The value 53 is stored at index 4.
f(30) = 30 mod 7 = 2. The value 30 is stored at index 2.
f(43) = 43 mod 7 = 1. The value 43 is stored at index 1.
f(62) = 62 mod 7 = 6. The value 62 is stored at index 6.

Index: 0 1 2 3 4 5 6
Item: 77 43 30 17 53 26 62

• To search for an item value x = 53, calculate the index as f(53) = 53 mod 7 = 4. The value 53 is found at index 4.
• To search for an item value x = 30, calculate the index as f(30) = 30 mod 7 = 2. The value 30 is found at index 2.
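
The store and search steps for the mod-7 scheme can be sketched in C as follows. This is an illustrative sketch only: the names TABLE_SIZE, EMPTY, hash, table_init, table_store and table_find are assumptions, keys are assumed to be non-negative, and collisions are deliberately not handled (a later key would overwrite an earlier one in the same slot).

```c
#define TABLE_SIZE 7      /* array size used in the example */
#define EMPTY (-1)        /* hypothetical marker for an unused slot */

/* f(x) = x mod array_size; assumes x >= 0. */
int hash(int x)
{
    return x % TABLE_SIZE;
}

/* Marks every slot as unused. */
void table_init(int table[TABLE_SIZE])
{
    for (int i = 0; i < TABLE_SIZE; i++) {
        table[i] = EMPTY;
    }
}

/* Stores x at index f(x). No collision handling in this sketch. */
void table_store(int table[TABLE_SIZE], int x)
{
    table[hash(x)] = x;
}

/* Returns the index where x is stored, or -1 if slot f(x) does not hold x. */
int table_find(const int table[TABLE_SIZE], int x)
{
    int index = hash(x);
    return (table[index] == x) ? index : -1;
}
```

Storing 26, 17, 77, 53, 30, 43, 62 in that order fills the array as 77 43 30 17 53 26 62, and table_find for 53 and 30 returns 4 and 2, each after a single array access.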


Hashing

Value (Key): 30  ->  Hash Function: f(30) = 30 mod 7  ->  Index Value: 2


Hashing
Key (item value, x)   Hash function h(x) = x mod 7   Hash value (index value)
26                    26 mod 7                       5
17                    17 mod 7                       3
77                    77 mod 7                       0
53                    53 mod 7                       4
30                    30 mod 7                       2
43                    43 mod 7                       1
62                    62 mod 7                       6

Hash Table
Index   Item
0       77
1       43
2       30
3       17
4       53
5       26
6       62

Hash System
An item value is a search key that is provided to the hash function h(x) to determine the slot number in the hash table.
A hash value is a slot number (index) in the hash table.
An item is inserted or searched for at the calculated hash value.


Hash System
In some cases, the record value itself is used as the search key, x.
In other cases, a record has an attribute, such as an ID, which can be used as the search key to determine the slot number in the hash table.
Example: In the records of PU students, the Exam_Roll_no of a student can be used as the search key.


Hash System
[Figure: a search key x is passed to the hash function h(x) = x mod 7, which produces the hash value used as the index into the hash table. The keys 26, 17, 77, 53, 30, 43, 62 map to the indices 5, 3, 0, 4, 2, 1, 6 respectively.]


Hashing
• Hashing is a method for storing and retrieving records from a database.
• It lets you insert, delete, and search for records based on a search key value.
• When properly implemented, these operations can be performed in constant time.
• In fact, a properly tuned hash system typically looks at only one or two records for each search, insert, or delete operation. Hence, only O(1) time is required for a search, insert, or delete operation.
• This is far better than the O(log n) average cost required to do a binary search on a sorted array of n records (or in a binary search tree).
• However, even though hashing is based on a very simple idea, it is surprisingly difficult to implement properly.


Hash Table
• A hash system stores records in an array called a hash table (HT).
• Hashing works by performing a computation on a search key x in a way that is intended to identify the position h(x) in the hash table that contains the record.
• A position or location in the hash table is known as a slot.
• The function that does this calculation is called the hash function, and will be denoted by the letter h.

Hash Table
Index: 0 1 2 3 4 5 6
Item: 77 43 30 17 53 26 62

Hash Table
• A hash table is a data structure that allows very fast retrieval of data, no matter how much data there is.
• That is why hash tables are widely used in various applications, such as database indexing, caching, etc.
• Since hashing schemes place records in the table in whatever order satisfies the needs of the address calculation, records are not ordered by value.

Hash Table (in order of insertion)
Index: 5 3 0 4 2 1 6
Item: 26 17 77 53 30 43 62

Hash Function
• The mapping between the search key of a record and the slot in the hash table where the record is stored is called the hash function.
• A record with key x is stored in slot h(x) of the hash table.
• To find the record with key value x, slot h(x) of the hash table is examined.

Value (Search Key): 30  ->  Hash Function: f(30) = 30 mod 7  ->  Hash Value: 2


Hash Function
• The hash function takes any item (or search key) and returns an integer (the hash value) in the range of slot numbers of a table of size m, that is, between 0 and m-1.
• Once the hash values have been computed, we can insert each item into the hash table at the designated position (slot number in the array).


Hash Function
• When we want to search for an item, we simply use the hash function to compute the slot number for the item and then check the hash table to see if it is present.
• This searching operation is O(1), since a constant amount of time is required to compute the hash value and then index the hash table at that location.
• If everything is where it should be, we have found a constant-time search algorithm.


Hash Function
Simple hash functions for integers and for strings:
• The string hash function sums the ASCII values of the letters in the string, and the result is modulo-divided by the size of the array, m.
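
The original listings for this slide are not preserved here. The following is a minimal sketch in C of the two kinds of hash function described: a modulo hash for integers and a sum-of-character-codes hash for strings. The names hash_int and hash_string are assumptions, and m (the table size) is assumed to be greater than zero.

```c
/* Integer hash: key modulo the table size m (m > 0 assumed). */
unsigned int hash_int(unsigned int key, unsigned int m)
{
    return key % m;
}

/* String hash: sum the character codes of the string, then take the
   sum modulo the table size m (m > 0 assumed). */
unsigned int hash_string(const char *s, unsigned int m)
{
    unsigned int sum = 0;
    for (; *s != '\0'; s++) {
        sum += (unsigned char)*s;   /* code of the current character */
    }
    return sum % m;
}
```

Because the string version only adds character codes, any two strings made of the same letters in a different order, such as "ab" and "ba", receive the same hash value. That makes it simple to follow but prone to collisions.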


Collision
• You have probably noticed that hashing works only if the hash function maps each search key (item) to a unique slot in the hash table.
• What if this is not always the case? Suppose the integer values 26, 17, 77 are stored in the hash table (at indices 5, 3 and 0). Now you need to store the value 59.
• To store the item value x = 59, calculate the index as
f(59) = 59 mod 7 = 3
The value 59 must be stored at index 3, which is already occupied by 17.

Index: 0 1 2 3 4 5 6
Item: 77 - - 17 - 26 -


Collision

If the hash function generates the same hash value (the same slot in the hash table) for two or more search keys (items), this is referred to as a collision.


Collision
• In most applications, there are many more values in the key range
than there are slots in the hash table.
• For a more realistic example, suppose the key can take any value in
the range 0 to 65,535 (i.e., the key is a two-byte unsigned integer),
and that we expect to store approximately 1000 records at any given
time.
• It is impractical in this situation to use a hash table with 65,536
slots, because then the vast majority of the slots would be left
empty.
• Instead, we must find/design a hash function that allows us to store
the records in a much smaller table.



Collision
• Because the key range is larger than the size of the table, at least
some of the slots must be mapped to from multiple key values.
• Given a hash function h and two keys k1 and k2,
• if h(k1)=β=h(k2) where β is a slot in the table, then we say that k1
and k2 have a collision at slot β under hash function h.
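
As a small illustration of this definition, the sketch below checks in C whether two distinct keys collide under a modulo hash. The function names slot_of and collide are assumptions, the hash h(k) = k mod m is the one used in the earlier examples, and keys are assumed to be non-negative with m > 0.

```c
/* h(k) = k mod m: the slot assigned to key k in a table of m slots. */
int slot_of(int key, int m)
{
    return key % m;
}

/* Returns 1 if two distinct keys k1 and k2 hash to the same slot, else 0. */
int collide(int k1, int k2, int m)
{
    return k1 != k2 && slot_of(k1, m) == slot_of(k2, m);
}
```

For instance, with m = 7 the keys 17 and 59 both map to slot 3, so collide(17, 59, 7) is 1, while 17 and 26 map to slots 3 and 5 and do not collide.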



Collision Resolution
• When two items hash to the same slot, we must have a systematic method for placing the second item in the hash table. This process is called collision resolution.
• If the hash function is perfect, collisions will never occur. However, since this is often not possible, collision resolution becomes a very important part of hashing.
• In other words, collision resolution means finding an alternate location for the hashed key.


Collision Resolution Policy
• Based on whether collisions are stored outside the hash table (open
hashing), or whether collisions result in storing one of the records at
another slot in the table (closed hashing), they are categorized as:
1. Open addressing (Closed Hashing)
a) Linear probing
b) Plus 3 probing
c) Quadratic probing
d) Double hashing
2. Closed Addressing (Open Hashing)
a) Chaining



Linear Probing
• When a collision occurs in a slot of the hash table, we try to find another open slot to store the item that caused the collision.
• A simple way to do this is to start at the original hash value position and then move sequentially through the slots until we encounter the first slot that is empty.
• Note that we may need to go back to the first slot (circularly) to cover the entire hash table.
• This collision resolution process is referred to as open addressing, in that it tries to find the next open slot or address in the hash table.
• By systematically visiting each slot one at a time, we are performing an open addressing technique called linear probing.
• In short: when an insertion encounters a collision, we move forward in the table until a vacant slot is found.

Linear Probing- Example
• Suppose the integer values 26, 17, 77 are already stored in the hash table (h(x) = x mod 7). Now you need to store the values 59, 25, 8, 47.
• If a collision occurs, resolve it using the linear probing policy.

Index: 0 1 2 3 4 5 6
Item: 77 - - 17 - 26 -

• To store x = 59: f(59) = 59 mod 7 = 3. Slot 3 is occupied (17): collision. Slot 4 is empty, so 59 is stored at index 4.
• To store x = 25: f(25) = 25 mod 7 = 4. Slot 4 is occupied (59): collision. Slot 5 is already occupied (26). Slot 6 is empty, so 25 is stored at index 6.
• To store x = 8: f(8) = 8 mod 7 = 1. Slot 1 is empty, so 8 is stored at index 1.
• To store x = 47: f(47) = 47 mod 7 = 5. Slot 5 is occupied (26): collision. Slots 6 (25), 0 (77) and 1 (8) are already occupied. Slot 2 is empty, so 47 is stored at index 2.

Index: 0 1 2 3 4 5 6
Item: 77 8 47 17 59 26 25
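
The insertion steps of this example can be sketched in C as follows. This is an illustrative sketch, not the author's code: the names TABLE_SIZE, EMPTY and lp_insert are assumptions, keys are assumed to be non-negative, and deletions are not considered.

```c
#define TABLE_SIZE 7      /* table size used in the example */
#define EMPTY (-1)        /* hypothetical marker for a vacant slot */

/* Inserts key using linear probing with h(x) = x mod TABLE_SIZE.
   Starts at the home slot and moves forward one slot at a time,
   wrapping around to slot 0 when the end of the table is reached.
   Returns the slot where the key was stored, or -1 if the table is full. */
int lp_insert(int table[TABLE_SIZE], int key)
{
    int home = key % TABLE_SIZE;

    for (int step = 0; step < TABLE_SIZE; step++) {
        int slot = (home + step) % TABLE_SIZE;   /* circular probe sequence */
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return slot;
        }
    }
    return -1;   /* every slot was examined and none was vacant */
}
```

Starting from a table where every slot is EMPTY, inserting 26, 17, 77, 59, 25, 8, 47 in that order places them at slots 5, 3, 0, 4, 6, 1, 2, which reproduces the final table 77 8 47 17 59 26 25.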


Linear Probing- Example
• Once we have built a hash table using linear probing, it is essential that we use the same method to search for items.
• Assume we want to look up the item 47. When we compute the hash value, we get 5. Slot 5 holds 26, so the item does not match.
• So we try slot 6 (holds 25), then slot 0 (holds 77), then slot 1 (holds 8). None of these matches 47, so the probing continues.

Index: 0 1 2 3 4 5 6
Item: 77 8 47 17 59 26 25


Linear Probing- Example
• Once we have built a hash table using linear probing, it is essential
that we use the same probing method to search for items.
• Assume we want to look up the item 47. When we compute the hash
value, we get 5. Slot 5 holds 26, not 47, so we try slot 6, then slot 0,
then slot 1. At slot 2 we find 47 and return True.
47 → Item matched!

Index: 0   1   2   3   4   5   6
Item:  77  8   47  17  59  26  25
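The insert and look-up steps walked through above can be sketched in Python. This is a minimal sketch, not the course's reference implementation: the function names `insert` and `search` are illustrative, the table is a plain list of size 7 with `None` for an empty slot, and deletions are assumed never to happen (so an empty slot safely ends a search).

```python
def insert(table, key):
    """Insert key using linear probing; return the slot that was used."""
    m = len(table)
    start = key % m                      # hash function f(x) = x mod m
    for step in range(m):
        slot = (start + step) % m        # wrap around past the last slot
        if table[slot] is None:          # first empty slot wins
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


def search(table, key):
    """Return the slot holding key, or None if key is not in the table."""
    m = len(table)
    start = key % m
    for step in range(m):
        slot = (start + step) % m
        if table[slot] is None:          # empty slot: key was never stored
            return None
        if table[slot] == key:
            return slot
    return None                          # every slot examined, no match


table = [None] * 7
for item in (26, 17, 77, 59, 25, 8, 47):
    insert(table, item)

print(table)               # [77, 8, 47, 17, 59, 26, 25]
print(search(table, 47))   # 2
```

The search follows exactly the same probe sequence as the insertion (5, 6, 0, 1, 2 for the item 47), which is why the two operations must share the same probing policy.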


Clustering
• A disadvantage of linear probing: clustering.
• If many collisions occur at the same hash value, a number of
surrounding slots will be filled by the linear probing resolution.
This leads to the problem of clustering.
• This has an impact on other items that are being inserted.
• E.g. when we try to add the item 25 (hash value 4, with f(x) = x mod 7
as before), the cluster of values hashing to 3 has to be skipped before
an open slot is finally found at 1.
25 → Collision!

Index: 0   1   2   3   4   5   6
Item:  38  -   -   17  59  24  31


Clustering
• As the table approaches its capacity, these clusters tend to
merge.
• This makes insertion slow, because linear probing has to step
through the whole cluster to find a vacant slot.
• E.g. try to add the new item value 73.

What is the cost to add 73?

Index: 0   1   2   3   4   5   6
Item:  38  25  -   17  59  24  31
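One way to explore the question above is to count the slots examined during an insertion. The sketch below assumes the table has size 7 and uses f(x) = x mod 7, as in the earlier linear probing example; the helper name `probes_to_insert` is hypothetical.

```python
def probes_to_insert(table, key):
    """Count the slots examined before key reaches an empty slot."""
    m = len(table)
    start = key % m                      # assumed hash function f(x) = x mod m
    for step in range(m):
        if table[(start + step) % m] is None:
            return step + 1              # occupied slots skipped + the empty one
    return None                          # table is full


# Table contents taken from the slide; None marks the only empty slot.
table = [38, 25, None, 17, 59, 24, 31]
print(probes_to_insert(table, 73))       # 7
```

Since 73 mod 7 = 3, the probe visits slots 3, 4, 5, 6, 0 and 1 (all occupied) before reaching the empty slot 2, so a single insertion touches every slot in the table.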


Quadratic Probing
• Linear probing has the problem of clustering.
• A variation of linear probing is quadratic probing.
• It reduces the clustering caused by linear probing (primary
clustering), although keys with the same hash value still follow the
same probe sequence.
• The probe function p for the hashed value k at the i-th collision is
given by the quadratic function:
$p(k, i) = c_1 i^2 + c_2 i + c_3$
where $c_1$, $c_2$ and $c_3$ are some constants.
• A simple variation of p(k, i) is:
$p(k, i) = i^2$
where $c_1 = 1$, $c_2 = 0$ and $c_3 = 0$.


Quadratic Probing
• Using quadratic probing:
• Calculate the hash value with the hash function h(k).
• If there is no collision at h(k), insert the item in the hashed
slot h(k).
• If a collision occurs, then for the i-th collision determine the
slot in the hash table using the quadratic probe function
$(h(k) + i^2) \bmod m$
where m = size of the hash table.
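The steps above can be sketched as a short Python function. This is an illustrative sketch under stated assumptions: the hash function is taken to be h(k) = k mod m, the name `quadratic_insert` is hypothetical, and `None` marks an empty slot.

```python
def quadratic_insert(table, key):
    """Insert key with quadratic probing; return the slot that was used."""
    m = len(table)
    home = key % m                       # assumed hash function h(k) = k mod m
    for i in range(m):                   # i = 0 is the home slot itself
        slot = (home + i * i) % m        # (h(k) + i^2) mod m
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("no empty slot found within m probes")


table = [None] * 10
for item in (11, 21, 61, 74, 91):
    quadratic_insert(table, item)

print(table)
# [91, 11, 21, None, 74, 61, None, None, None, None]
```

Note that, unlike linear probing, a quadratic probe sequence does not necessarily visit every slot, so the function gives up after m probes instead of assuming an empty slot will always be reached.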


Quadratic Probing- Example
• Let the items: 11, 21, 61, 74, 91
• If no collision, h(k) = k mod 10
• Otherwise, for the i-th collision, $h_i(k) = [h(k) + i^2] \bmod 10$

Index: 0   1   2   3   4   5   6   7   8   9
Item:  -   -   -   -   -   -   -   -   -   -


Quadratic Probing- Example
• Let the items: 11, 21, 61, 74, 91
• If no collision, h(k) = k mod 10; otherwise, for the i-th
collision, $h_i(k) = [h(k) + i^2] \bmod 10$
• To store the item value k = 11, calculate the index as
h(11) = 11 mod 10 = 1
11 → Empty slot! Store 11 in slot 1.

Index: 0   1   2   3   4   5   6   7   8   9
Item:  -   11  -   -   -   -   -   -   -   -

Quadratic Probing- Example
• Let the items: 11, 21, 61, 74, 91
• If no collision, h(k) = k mod 10; otherwise, for the i-th
collision, $h_i(k) = [h(k) + i^2] \bmod 10$
• To store the item value k = 21, calculate the index as
h(21) = 21 mod 10 = 1
21 → 1st collision occurred! (slot 1 holds 11)
$h_1(21) = [h(21) + 1^2] \bmod 10 = [1 + 1^2] \bmod 10 = 2$
21 → Empty slot! Store 21 in slot 2.

Index: 0   1   2   3   4   5   6   7   8   9
Item:  -   11  21  -   -   -   -   -   -   -

Quadratic Probing- Example
• Let the items: 11, 21, 61, 74, 91
• If no collision, h(k) = k mod 10; otherwise, for the i-th
collision, $h_i(k) = [h(k) + i^2] \bmod 10$
• To store the item value k = 61, calculate the index as
h(61) = 61 mod 10 = 1
61 → 1st collision occurred! (slot 1 holds 11)
$h_1(61) = [h(61) + 1^2] \bmod 10 = [1 + 1^2] \bmod 10 = 2$
61 → 2nd collision occurred! (slot 2 holds 21)
$h_2(61) = [h(61) + 2^2] \bmod 10 = [1 + 2^2] \bmod 10 = 5$
61 → Empty slot! Store 61 in slot 5.

Index: 0   1   2   3   4   5   6   7   8   9
Item:  -   11  21  -   -   61  -   -   -   -

Quadratic Probing- Example
• Let the items: 11, 21, 61, 74, 91
• If no collision, h(k) = k mod 10; otherwise, for the i-th
collision, $h_i(k) = [h(k) + i^2] \bmod 10$
• To store the item value k = 74, calculate the index as
h(74) = 74 mod 10 = 4
74 → Empty slot! Store 74 in slot 4.

Index: 0   1   2   3   4   5   6   7   8   9
Item:  -   11  21  -   74  61  -   -   -   -

Quadratic Probing- Example
• Let the items: 11, 21, 61, 74, 91
• If no collision, h(k) = k mod 10; otherwise, for the i-th
collision, $h_i(k) = [h(k) + i^2] \bmod 10$
• To store the item value k = 91, calculate the index as
h(91) = 91 mod 10 = 1
91 → 1st collision occurred! (slot 1 holds 11)
$h_1(91) = [h(91) + 1^2] \bmod 10 = [1 + 1^2] \bmod 10 = 2$
91 → 2nd collision occurred! (slot 2 holds 21)
$h_2(91) = [h(91) + 2^2] \bmod 10 = [1 + 2^2] \bmod 10 = 5$
91 → 3rd collision occurred! (slot 5 holds 61)
$h_3(91) = [h(91) + 3^2] \bmod 10 = [1 + 3^2] \bmod 10 = 0$
91 → Empty slot! Store 91 in slot 0.

Index: 0   1   2   3   4   5   6   7   8   9
Item:  91  11  21  -   74  61  -   -   -   -



Quadratic Probing- Example
• Observe the resulting hash table. The items that had collisions
(21, 61 and 91) are not stored next to one another this time.

Quadratic probing:
Index: 0   1   2   3   4   5   6   7   8   9
Item:  91  11  21  -   74  61  -   -   -   -

• If it were done with simple linear probing, the result would be a
cluster of items stored in adjacent slots of the hash table:

Simple linear probing:
Index: 0   1   2   3   4   5   6   7   8   9
Item:  -   11  21  61  74  91  -   -   -   -



Class Discussion
• What is the advantage of quadratic probing over linear
probing?

• Ans: It avoids the long runs of adjacent occupied slots (primary
clustering) that linear probing builds up.

• How does clustering affect the efficiency of searching?

• Ans: It forces a long linear scan for items whose keys collided,
and it makes further collisions more likely.


Assignment
• Plus 3 Probing
• Double Hashing


Separate Chaining
• It is also called open hashing, since colliding items are stored
outside the table.
• The simplest form of open hashing defines each slot in the hash table
to be the head of a linked list.
• All records that hash to a particular slot are placed on that slot’s
linked list.
• The following figure illustrates a hash table where each slot points
to a linked list that holds the records associated with that slot.
• The hash function used is the simple mod function.


Separate Chaining
• Suppose data items: 26, 17, 77, 59, 25, 47, 8, 52, 41.
• If a collision occurs, resolve it using the separate chaining policy.
• For each item x, calculate the slot and append x to that slot's list:
f(26) = 26 mod 7 = 5
f(17) = 17 mod 7 = 3
f(77) = 77 mod 7 = 0
f(59) = 59 mod 7 = 3
f(25) = 25 mod 7 = 4
f(47) = 47 mod 7 = 5
f(8) = 8 mod 7 = 1
f(52) = 52 mod 7 = 3
f(41) = 41 mod 7 = 6

Hash Table (X marks the end of a linked list):
0 → 77 → X
1 → 8 → X
2 → X
3 → 17 → 59 → 52 → X
4 → 25 → X
5 → 26 → 47 → X
6 → 41 → X


Separate Chaining
• When we want to search for an item, we use the hash function
to generate the slot where it should reside.
• Then, we search the linked list associated with the calculated
index using linear search.
• Example: to search for x = 52, calculate f(52) = 52 mod 7 = 3 and
walk the list in slot 3:
17 → Item does not match!
59 → Item does not match!
52 → Item is matched!

Hash Table (X marks the end of a linked list):
0 → 77 → X
1 → 8 → X
2 → X
3 → 17 → 59 → 52 → X
4 → 25 → X
5 → 26 → 47 → X
6 → 41 → X
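The chaining example above can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: the class name `ChainedHashTable` is hypothetical, and each chain is represented by a Python list instead of a hand-built linked list, which keeps the example short while preserving the same insert and search behaviour.

```python
class ChainedHashTable:
    """Hash table that resolves collisions with separate chaining."""

    def __init__(self, size=7):
        # One (initially empty) chain per slot; a list stands in for a linked list.
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        return key % len(self.slots)     # simple mod hash function

    def insert(self, key):
        self.slots[self._index(key)].append(key)   # add to the end of the chain

    def search(self, key):
        # Linear search restricted to the one chain the key can be in.
        return key in self.slots[self._index(key)]


ht = ChainedHashTable(7)
for item in (26, 17, 77, 59, 25, 47, 8, 52, 41):
    ht.insert(item)

print(ht.slots[3])     # [17, 59, 52]
print(ht.search(52))   # True
```

Searching for 52 only examines the three records in slot 3, never the other six items in the table.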


Separate Chaining
• Records within a slot’s list can be ordered in several ways:
• by insertion order,
• by key value order, or
• by frequency-of-access order.
• Ordering the list by key value provides an advantage in the case of
an unsuccessful search, because we know to stop searching the list
once we encounter a key that is greater than the one being
searched for.
• If records on the list are unordered, then an unsuccessful search
will need to visit every record on the list.
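The benefit of key-ordered lists for unsuccessful searches can be illustrated with a small sketch. The helper names `insert_sorted` and `search_sorted` are hypothetical, a Python list stands in for the linked list, and `bisect.insort` from the standard library is used to keep the chain in key order.

```python
import bisect


def insert_sorted(chain, key):
    """Insert key so that the chain stays in ascending key order."""
    bisect.insort(chain, key)


def search_sorted(chain, key):
    """Return (found, records_visited) for a key-ordered chain."""
    visited = 0
    for record in chain:
        visited += 1
        if record == key:
            return True, visited
        if record > key:                 # passed the spot where key would be
            return False, visited
    return False, visited


# Chain for slot 3 of the earlier example (17, 59 and 52 all hash to 3).
chain = []
for key in (59, 17, 52):
    insert_sorted(chain, key)

print(chain)                       # [17, 52, 59]
print(search_sorted(chain, 24))    # (False, 2)
```

The key 24 also hashes to slot 3 (24 mod 7 = 3) but is absent; the search stops at 52 after visiting two records, whereas an unordered chain would require visiting all three.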



Separate Chaining
• In the case where a list is empty or has only one record, a search
requires only one access to the list. Thus, when records are spread
evenly over the slots, the average cost for hashing should be O(1).
• However, if clustering causes many records to hash to only a few of
the slots, then the cost to access a record will be much higher
because many elements on the linked list must be searched.


Separate Chaining
• Open hashing is most appropriate when the hash table is kept in
main memory, with the lists implemented by a standard in-memory
linked list.
• Storing an open hash table on disk in an efficient way is difficult,
because members of a given linked list might be stored on
different disk blocks. This would result in multiple disk accesses
when searching for a particular key value, which defeats the
purpose of using hashing.





Properties of Hash Function
• Minimize collisions
• Uniform distribution of hash values
• Easy to calculate
• Resolve any collisions



Hashing in Summary
• It is used to index large amounts of data.
• The address of each key is calculated using the key itself.
• Collisions are resolved with open hashing (separate chaining) or
closed hashing (linear probing, quadratic probing, double hashing).
• Hashing is widely used in database indexing, compilers, caching,
password authentication etc.
• Insertion, deletion, searching and retrieval take constant time on
average, provided collisions are kept low.


Load Factor
• In general, a load factor measures how efficiently a resource, such
as a system or machine, is used over time.
• It conveys the relationship between the resource's utilization and
its carrying capacity.
• In a hash system, the load factor (denoted by α) is the ratio of the
number of items in the hash table to the size of the hash table:

$\alpha = \frac{\text{number of items in hash table}}{\text{size of hash table}}$


Load Factor
• Example: In the array below of size 7, the number of items stored in
the array (hash table) is 3. The load factor of this array is calculated
as

$\alpha = \frac{\text{number of items in hash table}}{\text{size of hash table}} = \frac{3}{7} \approx 0.429$

Items in the table: 43, 30, 26


Load Factor
• By monitoring the load factor, we can gauge how much storage is used
and whether the hash table needs to be expanded or reorganized.
• High load factor: leads to more collisions, which reduces the
efficiency of the hash table (but gives higher memory utilization).
• Low load factor: leads to fewer collisions, which improves the
efficiency of the hash table (but wastes more space).
• The performance of the hash table deteriorates as the load factor α
grows. Therefore, a hash table is resized or rehashed when the
load factor α approaches 1.
• Acceptable values of the load factor α range from around 0.6 to 0.75.


Rehashing
• As you keep inserting items into the hash table, the value of the
load factor α increases.
• The increase in load factor indicates a decrease in the number of
empty slots in the hash table. This causes more collisions in the hash
table, which degrades the performance of the hash system.
• To keep collisions below a certain level, an upper limit of the
load factor (say αT, the threshold value of α) is defined in advance in
the hash system. It is normally not greater than 0.75.


Rehashing
• Every time you insert an item into a hash table, you calculate the
value of α to check whether it exceeds the predefined limit αT.
• When you find α > αT:
• You create a new array (new hash table) of bigger size (normally
double the size).
• For each item already stored in the old hash table, you calculate
the new hash value of the item and store it in the new hash table.
Repeat this until all the items in the old hash table are moved to
the new hash table. Delete the old hash table.
• In the new hash table, store new items as long as α ≤ αT.
• This overall process is called rehashing.


Rehashing
• Suppose a hash table, HT, of size 4. It currently contains two elements: 47 (hashed
into index 3) and 29 (hashed into index 1).
• The hash function is f(key) = key mod size
• Suppose the threshold load factor of our hashing system is 0.70
• Every time you insert an item (key) into the hash table, you have to check
whether the current load factor is greater than 0.70 or not.
• The current load factor is $\alpha = \frac{2}{4} = 0.5$, which is less than the threshold load factor
0.70. So, you can insert another item into the hash table.

Index: 0   1   2   3
Item:  -   29  -   47


Rehashing
• Now, suppose you want to insert item 36 into the hash table.
• f(36) = 36 mod 4 = 0 (insert 36 into index 0)
• Calculate the current load factor: $\alpha = \frac{3}{4} = 0.75$, which is greater than the
threshold load factor (αT) 0.70.
• Insertion of item 36 into the hash table violated the condition α ≤ αT.
• Now, rehashing must be done.

Index: 0   1   2   3
Item:  36  29  -   47


Rehashing
• Create a new hash table of doubled size (i.e. size = 8)
• So, the new hash function will be
• f(key) = key mod size, with size = 8
• Now, calculate the new hash values for all the items stored in the old hash
table and store them in the new hash table with their new hash values:
• f(36) = 36 mod 8 = 4
• f(29) = 29 mod 8 = 5
• f(47) = 47 mod 8 = 7
• The load factor is now $\alpha = \frac{3}{8} = 0.375$. In the new hash table, store new items as long as α ≤ αT.

Index: 0   1   2   3   4   5   6   7
Item:  -   -   -   -   36  29  -   47

Fig: New hash table and its items after rehashing is completed.

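The rehashing example above can be reproduced with a short Python sketch. It rests on a few assumptions that are not in the slides: the class name `ProbingHashTable` and its method names are hypothetical, and collisions are resolved with linear probing (the example itself has no collisions, so any policy would give the same table).

```python
class ProbingHashTable:
    """Hash table that doubles in size when the load factor exceeds a threshold."""

    def __init__(self, size=4, threshold=0.70):
        self.slots = [None] * size
        self.count = 0
        self.threshold = threshold       # alpha_T in the slides

    def load_factor(self):
        return self.count / len(self.slots)

    def _place(self, key):
        m = len(self.slots)
        slot = key % m                   # f(key) = key mod size
        while self.slots[slot] is not None:
            slot = (slot + 1) % m        # assumed policy: linear probing
        self.slots[slot] = key

    def insert(self, key):
        self._place(key)
        self.count += 1
        if self.load_factor() > self.threshold:
            self._rehash()

    def _rehash(self):
        old = self.slots
        self.slots = [None] * (2 * len(old))   # new table of double size
        for key in old:
            if key is not None:
                self._place(key)               # re-hash with the new size


ht = ProbingHashTable(size=4, threshold=0.70)
for key in (47, 29, 36):
    ht.insert(key)

print(ht.slots)          # [None, None, None, None, 36, 29, None, 47]
print(ht.load_factor())  # 0.375
```

Inserting 36 pushes the load factor to 0.75, which exceeds 0.70, so the table grows from 4 to 8 slots and the three items move to indices 4, 5 and 7.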
Most Students' Confusion!
• What is a search key in hashing?
• An item can have multiple attributes. For example, a record of a
student can contain attributes:
• Student_ID
• Name
• Address
• …….
• If you are going to store, retrieve, delete or search such student
records, you can use Student_ID as a search key for hashing to find
the hash value (slot number in the hash table).
• Alternatively, you can also use the name of the student as a search key.
• In some cases, if the item is itself an integer value or a string value, we
can take the item’s value as the search key for hashing.

Most Students' Confusion!
• Figure: the name of a person is used as the search key; the other
attributes of the person are stored along with it.
• A search key can be any attribute that can be manipulated by a hash
function to find a slot number in a hash table.


THANK YOU

End of the Chapter
