
Bhagwan Arihant Institute of Technology

Data Structure: Hashing
Guided By: Abhay Hadiya
Prepared By:
Ghadiyali Rutvika [ 2307020703011 ]
Patel Dhruvi V. [ 2307020703033 ]
Rakholiya Honey [ 2307020703043 ]
Vekariya Bhavyanshu [ 2307020703057 ]
Introduction

• Hashing is a technique that is used to uniquely identify a specific object from a group of similar objects.

• An everyday example of hashing: in universities, each student is assigned a unique roll number that can be used to retrieve information about them.
Continue…

• Assume that you have an object and you want to assign a key to it to make searching easy.

• To store key/value pairs, you can use a simple array-like data structure in which integer keys serve directly as indexes. When the keys are large and cannot be used directly as indexes, use hashing.
Continue…

• In hashing, large keys are converted into small keys by using hash functions.

• The values are then stored in a data structure called a hash table.

• The idea of hashing is to distribute entries (key/value pairs) uniformly across an array.

• Each element is accessed through its key: the hash function uses the key to compute an index that suggests where the entry can be found or inserted, so an element can be reached in O(1) time on average.
Continue…

• Hashing is implemented in two steps:

1. An element is converted into an integer by using a hash function. This integer can be used as an index to store the original element in the hash table.

2. The element is stored in the hash table, where it can be quickly retrieved using the hashed key.
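
The two steps above can be sketched in Python as follows. This is only an illustration, not part of the original slides: the names `hash_function`, `index_for`, and `TABLE_SIZE`, the table size of 10, and the character-sum hash are all assumptions chosen for brevity.

```python
TABLE_SIZE = 10  # assumed table size, chosen only for illustration


def hash_function(key: str) -> int:
    """Step 1a: convert the key into an integer (sum of character codes)."""
    return sum(ord(ch) for ch in key)


def index_for(key: str) -> int:
    """Step 1b: reduce that integer to a valid index into the table."""
    return hash_function(key) % TABLE_SIZE


# Step 2: store the element at the computed index, then retrieve it
# by hashing the same key again.
table = [None] * TABLE_SIZE
table[index_for("abc")] = ("abc", "first value")

stored = table[index_for("abc")]
```

Here "abc" hashes to 97 + 98 + 99 = 294, so it is stored at index 294 % 10 = 4. The sketch ignores collisions, which are covered later in the deck.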
Hash Function

• A hash function is any function that maps data of arbitrary size to data of a fixed size, which can then be used as an index into a hash table.

• The values returned by a hash function are called hash values, hash codes, hash sums, or simply hashes.
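
As a hedged illustration of mapping arbitrary-size input to a fixed-size output, the sketch below uses a polynomial hash for strings. The function name `polynomial_hash`, the base 31, and the table size 101 are assumptions made for this example; the slides do not prescribe a particular hash function.

```python
def polynomial_hash(key: str, table_size: int, base: int = 31) -> int:
    """Map a string of any length to an integer in [0, table_size)."""
    value = 0
    for ch in key:
        # Horner's rule; the modulo keeps the running value in range.
        value = (value * base + ord(ch)) % table_size
    return value


# Keys of very different lengths all map into the same fixed range.
codes = [polynomial_hash(word, 101) for word in ("a", "hash", "a much longer key")]
```

Whatever the length of the key, the returned hash value always falls between 0 and `table_size - 1`, which is what makes it usable as an array index.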
Hash Table
• A hash table is a data structure that is used to store key/value pairs. It uses a hash function to compute an index into an array in which an element will be inserted or searched.

• With a good hash function, hashing works well. Under reasonable assumptions, the average time required to search for an element in a hash table is O(1).
Collision Resolution Techniques
• If x1 and x2 are two different keys, it is possible that
h(x1) = h(x2). This is called a collision. Collision
resolution is the most important issue in hash table
implementations.

• Choosing a hash function that minimizes the number of collisions and also hashes uniformly is another critical issue.

• Common collision resolution techniques are:

1. Separate chaining (open hashing)
2. Linear probing (open addressing or closed hashing)
3. Quadratic probing
4. Double hashing
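
Before looking at each technique, a collision can be made concrete with a small sketch. The hash used here (`char_sum_hash`, a sum of character codes modulo the table size) and the table size of 7 are assumptions picked because they collide easily; they are not taken from the slides.

```python
def char_sum_hash(key: str, table_size: int) -> int:
    """Deliberately weak hash: the order of characters is ignored."""
    return sum(ord(ch) for ch in key) % table_size


x1, x2 = "ab", "ba"

# Two different keys, same hash value: h(x1) == h(x2) is a collision.
collide = x1 != x2 and char_sum_hash(x1, 7) == char_sum_hash(x2, 7)
```

Both keys sum to 195, so both hash to 195 % 7 = 6. Any two keys made of the same characters collide under this hash, which is one reason a hash function that spreads keys uniformly matters.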
Separate Chaining (Open Hashing)
• Separate chaining is one of the most commonly used collision resolution techniques.

• It is usually implemented using linked lists. In separate chaining, each element of the hash table is a linked list.

• To store an element in the hash table, you must insert it into a specific linked list.

• If there is any collision (i.e. two different elements have the same hash value), then store both elements in the same linked list.
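
A minimal sketch of separate chaining follows. Note the substitutions: each chain is a Python list rather than a linked list, keys are assumed to be integers, and the hash is simply `key % size`. The class name `ChainedHashTable` and its methods are hypothetical names for this example.

```python
class ChainedHashTable:
    """Hash table where each slot holds a chain (a Python list) of entries."""

    def __init__(self, size: int = 8) -> None:
        self.size = size
        self.buckets = [[] for _ in range(size)]

    def _index(self, key: int) -> int:
        # Assumed hash function: integer key modulo the table size.
        return key % self.size

    def insert(self, key: int, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # key already present: overwrite
                return
        bucket.append((key, value))  # new key, possibly a collision: extend the chain

    def search(self, key: int):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return None  # key not found


table = ChainedHashTable(size=4)
table.insert(1, "one")
table.insert(5, "five")  # 1 % 4 == 5 % 4, so both entries share bucket 1
```

After these inserts, bucket 1 holds both entries, and `table.search(5)` walks that chain until it finds key 5.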
Linear Probing

• In open addressing, all entry records are stored in the array itself instead of in linked lists.

• When a new entry has to be inserted, the hash index of the hashed value is computed and then the array is examined (starting with the hashed index).

• If the slot at the hashed index is unoccupied, the entry record is inserted there; otherwise, the search proceeds in some probe sequence until it finds an unoccupied slot.
Continue…
• Linear probing is when the interval between successive probes is fixed (usually to 1). Let's assume that the hashed index for a particular entry is index. The slots examined by linear probing, in order, will be:

1. index % hashTableSize
2. (index + 1) % hashTableSize
3. (index + 2) % hashTableSize
4. (index + 3) % hashTableSize

and so on, until an unoccupied slot is found.
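
The probing sequence above can be sketched as a small insert routine. This is an illustrative sketch under stated assumptions: keys are integers, the hash is `key % size`, empty slots are marked with `None`, and the function name `linear_probe_insert` is made up for the example.

```python
def linear_probe_insert(table: list, key: int) -> int:
    """Insert key using linear probing and return the slot that was used."""
    size = len(table)
    index = key % size  # hashed index (assumed hash: key modulo table size)
    for i in range(size):
        probe = (index + i) % size  # offsets 0, 1, 2, ... from the hashed index
        if table[probe] is None:
            table[probe] = key
            return probe
    raise RuntimeError("hash table is full")


table = [None] * 7
slots = [linear_probe_insert(table, k) for k in (10, 17, 24, 3)]
```

All four keys hash to index 3, so they end up in the consecutive slots 3, 4, 5, and 6. Each probe offset is measured from the original hashed index, with wrap-around handled by the modulo.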
Quadratic Probing

• Quadratic probing is similar to linear probing; the only difference is the interval between successive probes or entry slots.

• Here, when the slot at the hashed index for an entry record is already occupied, you must keep probing until you find an unoccupied slot. The interval between slots is computed by adding the successive values of an arbitrary polynomial to the original hashed index.
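
A hedged sketch of quadratic probing is shown below. The slides leave the polynomial arbitrary; this example assumes the simplest common choice, an offset of i * i on the i-th probe. Integer keys, the hash `key % size`, and the name `quadratic_probe_insert` are likewise assumptions.

```python
def quadratic_probe_insert(table: list, key: int) -> int:
    """Insert key using quadratic probing and return the slot that was used."""
    size = len(table)
    index = key % size  # original hashed index (assumed hash: key modulo size)
    for i in range(size):
        probe = (index + i * i) % size  # offsets 0, 1, 4, 9, ... from the hashed index
        if table[probe] is None:
            table[probe] = key
            return probe
    # Quadratic probing does not always visit every slot, so this can
    # happen even when the table still has free slots.
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 7
slots = [quadratic_probe_insert(table, k) for k in (10, 17, 24)]
```

All three keys hash to index 3. The first takes slot 3, the second moves 1 step to slot 4, and the third moves 4 steps from the original index, wrapping around to slot 0.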
Double Hashing

• Double hashing is similar to linear probing; the only difference is the interval between successive probes, which is computed using a second hash function.

• Suppose the hashed index for an entry record, index, is computed by one hash function and the slot at that index is already occupied. An unoccupied slot must then be found through a specific probing sequence, where indexH is the value computed for the same entry by the second hash function. The probing sequence will be:

(index + 1 * indexH) % hashTableSize
(index + 2 * indexH) % hashTableSize

and so on.
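
The double hashing sequence can be sketched as follows. The two hash functions are assumptions for this example: `index = key % size` for the first, and `index_h = 1 + key % (size - 1)` for the second, chosen so that the step is never zero. The name `double_hash_insert` is hypothetical, and the table size is assumed to be prime so that every step size can reach every slot.

```python
def double_hash_insert(table: list, key: int) -> int:
    """Insert key using double hashing and return the slot that was used."""
    size = len(table)
    index = key % size                # first hash: the starting slot
    index_h = 1 + key % (size - 1)    # second hash: the probe interval, never 0
    for i in range(size):
        probe = (index + i * index_h) % size
        if table[probe] is None:
            table[probe] = key
            return probe
    raise RuntimeError("no free slot found along the probe sequence")


table = [None] * 7
slots = [double_hash_insert(table, k) for k in (10, 17, 24)]
```

All three keys start at index 3, but their intervals differ: key 17 has an interval of 6 and lands in slot 2, while key 24 has an interval of 1 and lands in slot 4. Keys that collide on the first hash therefore tend to follow different probe sequences.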
Real-world Applications

• Hashing is widely used in various applications such as databases, cryptography, caches, and compilers for efficient data storage and retrieval.
Advantages of Hashing

• Hashing can be used to compare two files for equality: comparing their calculated hash values shows immediately whether the files differ, without opening them.

• Hashing helps verify the integrity of files transferred by a file backup program, by comparing the hash values of the original file and its copy.

• Hash values can reveal that a file has changed even when the file size stays the same, as with an encrypted file designed so that its size does not change.
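
The file comparison described above can be sketched with Python's standard `hashlib` module. The choice of SHA-256, the helper names `file_digest` and `write_temp`, and the sample file contents are assumptions for this example; the slides do not name a specific algorithm or tool.

```python
import hashlib
import os
import tempfile


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_temp(data: bytes) -> str:
    """Write data to a new temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


original = write_temp(b"backup data v1")
copy = write_temp(b"backup data v1")
changed = write_temp(b"backup data v2")  # same size, different content

same = file_digest(original) == file_digest(copy)
differs = file_digest(original) != file_digest(changed)

for path in (original, copy, changed):
    os.remove(path)
```

The copy produces the same digest as the original, while the changed file produces a different one even though all three files have the same size. The digests show that the files differ but not where they differ.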
Conclusion
Thus, hashing is essential for efficient data management and security. It uses algorithms to convert diverse data into a fixed-size value, facilitating swift data retrieval and helping to ensure data integrity.
Thank You!
