
CHAPTER-04 SEARCHING AND SORTING

1. SEQUENTIAL SEARCH
2. BINARY SEARCH
3. BREADTH FIRST SEARCH
4. DEPTH FIRST SEARCH
5. INSERTION SORT
6. SELECTION SORT
7. DIVIDE AND CONQUER SORT: MERGE SORT
8. INTRODUCTION TO HASHING
SEARCHING:
Searching is an operation that finds the position of a given element or
value in a list.
Two standard searching techniques used with data structures are listed
below:
1.LINEAR SEARCH
2.BINARY SEARCH

1.LINEAR SEARCH
It is also called sequential search.
In linear search, we look for an element or value in a given array by
traversing the array from the start until the desired element
or value is found.
When the element is matched successfully, the search returns the index
of the element in the array; otherwise it returns -1.

Example: int A[] = {10, 8, 2, 7, 3, 4, 9, 1, 6, 5}

The value to be searched is VAL = 7.

Return 3, as 7 is found at index 3 (the fourth position, counting indices from 0).

Complexity of Linear Search Algorithm


Linear search executes in O(n) time where
n is the number of elements in the array.
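The linear search described above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation; the function name `linear_search` is an assumption introduced here, and the index it returns is 0-based.

```python
def linear_search(arr, val):
    """Return the 0-based index of the first occurrence of val, or -1."""
    for i, item in enumerate(arr):
        if item == val:      # match found: report where
            return i
    return -1                # traversed the whole array without a match


A = [10, 8, 2, 7, 3, 4, 9, 1, 6, 5]
print(linear_search(A, 7))    # 0-based index of the value 7
print(linear_search(A, 42))   # value that is not in the array
```

In the worst case the loop visits every element once, which is where the O(n) bound comes from.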
2.BINARY SEARCH

Binary search works only on a sorted list/array. We follow these steps:

1. We start by comparing the element to be searched with the element in the
middle of the list/array.
2. If we get a match, we return the index of the middle element.
3. If we do not get a match, we check whether the element to be searched is
less than or greater than the middle element.
4. If the element/number to be searched is greater in value than the middle
number, then we pick the elements on the right side of the middle element
and start again from step 1.
5. If the element/number to be searched is less in value than the middle
number, then we pick the elements on the left side of the middle element and
start again from step 1.
Consider an array A[] that is declared and initialized as
int A[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
and the value to be searched is VAL = 9.
1. BEG = 0, END = 10, MID = (0 + 10)/2 = 5
Now, VAL = 9 and A[MID] = A[5] = 5.
A[5] is less than VAL; therefore, we now search for the value in the
second half of the array.
2. Now, BEG = MID + 1 = 6, END = 10, MID = (6 + 10)/2 = 16/2 = 8
VAL = 9 and A[MID] = A[8] = 8.
A[8] is less than VAL; therefore, we now search for the value in the
second half of the segment.
3. Now, BEG = MID + 1 = 9, END = 10, MID = (9 + 10)/2 = 9 (integer division)
Now, VAL = 9 and A[MID] = A[9] = 9, so the search ends and returns index 9.
In general:
(a) If VAL < A[MID], then VAL will be present in the left segment of the array.
So, the value of END will be changed as
END = MID – 1.
(b) If VAL > A[MID], then VAL will be present in the right segment of the
array. So, the value of BEG will be changed as BEG = MID + 1.

The running time of the binary search algorithm is O(log2 n), because
each comparison halves the segment that still has to be searched.
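A short Python sketch of the BEG/END/MID procedure may help here. It assumes the input list is already sorted in ascending order; the function name `binary_search` and the return value -1 for a missing element are assumptions introduced for this illustration.

```python
def binary_search(arr, val):
    """Return the index of val in the sorted list arr, or -1 if absent."""
    beg, end = 0, len(arr) - 1
    while beg <= end:
        mid = (beg + end) // 2        # integer division, as in the example
        if arr[mid] == val:
            return mid
        if val < arr[mid]:
            end = mid - 1             # continue in the left segment
        else:
            beg = mid + 1             # continue in the right segment
    return -1                         # segment became empty: not found


A = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(binary_search(A, 9))
```

For VAL = 9 the loop examines MID = 5, then 8, then 9, the same sequence as in the worked example.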
Sorting
Sorting means arranging the elements of an array so that they are placed in some
relevant order, which may be either ascending or descending.
A sorting algorithm is an algorithm that puts the elements of a list in a
certain order, which can be numerical order, lexicographical order, or any
user-defined order.
There are two types of sorting:
1. Internal sorting, which deals with sorting data stored in the computer's
memory.
2. External sorting, which deals with sorting data stored in files.
The following sorting algorithms are covered:
1.Insertion sort
2.Selection Sort
3.Merge Sort
1.Insertion sort
The idea is to divide the array into two subsets: a sorted subset and an unsorted subset.
1. Initially, the sorted subset consists of only the first element, at index 0.
2. The remaining elements of the array are considered the unsorted subset.
3. Insertion sort consists of N − 1 passes.
4. For pass p = 1 through N − 1, insertion sort ensures that the elements in positions 0
through p are in sorted order.
Algorithm
INSERTION-SORT (ARR, N)
Step 1: Repeat Steps 2 to 5 for K = 1 to N – 1
Step 2: SET TEMP = ARR[K]
Step 3: SET J = K - 1
Step 4: Repeat while J >= 0 AND TEMP <= ARR[J]
SET ARR[J + 1] = ARR[J]
SET J = J - 1
[END OF INNER LOOP]
Step 5: SET ARR[J + 1] = TEMP
[END OF LOOP]
Step 6: EXIT
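The INSERTION-SORT steps can be sketched in Python as shown below. This is an illustrative translation, not the only possible one; the function name `insertion_sort` is an assumption, and the `j >= 0` guard is included so that the inner loop stops at the front of the array.

```python
def insertion_sort(arr):
    """Sort arr in place in ascending order and return it."""
    for k in range(1, len(arr)):          # passes K = 1 .. N-1
        temp = arr[k]                     # element to insert
        j = k - 1
        # Shift larger (or equal) elements one slot to the right.
        while j >= 0 and temp <= arr[j]:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = temp                 # drop TEMP into the gap
    return arr


print(insertion_sort([39, 9, 45, 63, 18, 81, 108, 54, 72, 36]))
```

On an already sorted list the inner loop never runs, which matches the linear best case; on a reversed list every pass shifts all previously sorted elements.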
Time Complexity
Best case: when the array is already sorted. It takes linear running time, O(n).

Worst case: when the array is sorted in reverse order. It takes quadratic
running time, O(n^2).

Average case: in pass K, the element being inserted is compared with about
half of the already sorted elements, roughly (K–1)/2 comparisons. Thus, the
average case also has a quadratic running time.

WORST CASE ANALYSIS - O(N^2)
BEST CASE ANALYSIS - O(N)
AVERAGE CASE ANALYSIS - O(N^2)
Selection Sort
Selection sort is based on the idea of “selecting” the elements in sorted order.
1. Search the array for the smallest element and move it to the front.
2. Then find the next smallest element and put it in the next position, and so
on until the whole array is sorted.
Algorithm
SELECTION SORT(ARR, N)
Step 1: Repeat Steps 2 and 3 for K = 0 to N-2
Step 2: CALL SMALLEST(ARR, K, N, POS)
Step 3: SWAP ARR[K] with ARR[POS]
[END OF LOOP]
Step 4: EXIT

SMALLEST (ARR, K, N, POS)
Step 1: [INITIALIZE] SET SMALL = ARR[K]
Step 2: [INITIALIZE] SET POS = K
Step 3: Repeat for J = K+1 to N-1
IF SMALL > ARR[J]
SET SMALL = ARR[J]
SET POS = J
[END OF IF]
[END OF LOOP]
Step 4: RETURN POS

Time Complexity:
The algorithm makes N − 1 passes, and each pass needs to find the smallest
remaining element. Finding the smallest element is O(N), and doing an O(N)
operation about N times results in an efficiency of O(N^2).
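As a rough sketch, the two procedures SELECTION SORT and SMALLEST could look like this in Python. The names `smallest` and `selection_sort` are assumptions made for this illustration, and 0-based indexing is used throughout.

```python
def smallest(arr, k):
    """Return the position of the smallest element in arr[k:]."""
    pos = k
    for j in range(k + 1, len(arr)):
        if arr[j] < arr[pos]:     # found a smaller element
            pos = j
    return pos


def selection_sort(arr):
    """Sort arr in place in ascending order and return it."""
    for k in range(len(arr) - 1):
        pos = smallest(arr, k)                 # smallest of the unsorted part
        arr[k], arr[pos] = arr[pos], arr[k]    # move it to position k
    return arr


print(selection_sort([39, 9, 81, 45, 90, 27, 72, 18]))
```

The scan in `smallest` always runs to the end of the array, so the number of comparisons is the same for sorted and unsorted input.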
Merge Sort
Merge sort is a recursive algorithm that involves splitting and
merging the array.
Algorithm
The algorithm works as follows:
1. Divide the array in half.
2. Recursively sort both halves.
3. Merge the halves back together.

Merge sort keeps dividing the array into (nearly) equal halves until the
subarrays can no longer be divided. A subarray of 1 element
is considered sorted.
It then repeatedly merges subarrays to produce new sorted subarrays until
only 1 subarray remains. This is the sorted array.
The basic steps of a merge sort algorithm are as
follows:
1. If the array is of length 0 or 1, then it is already sorted.
2. Otherwise, divide the unsorted array into two sub-arrays of
about half the size.
3. Use the merge sort algorithm recursively to sort each sub-array.
4. Merge the two sub-arrays to form a single sorted list.


Algorithm
MERGE_SORT(ARR, BEG, END)
Step 1: IF BEG < END
SET MID = (BEG + END)/2
CALL MERGE_SORT (ARR, BEG, MID)
CALL MERGE_SORT (ARR, MID + 1,END)
CALL MERGE (ARR, BEG, MID, END)
[END OF IF]
Step 2: END
MERGE (ARR, BEG, MID, END)
Step 1: [INITIALIZE] SET I = BEG, J = MID + 1, INDEX =BEG
Step 2: Repeat while (I <= MID) AND (J<=END)
IF ARR[I] < ARR[J]
SET TEMP[INDEX] = ARR[I]
SET I = I + 1
ELSE
SET TEMP[INDEX] = ARR[J]
SET J = J + 1
[END OF IF]
SET INDEX = INDEX + 1
[END OF LOOP]
Step 3: [Copy the remaining elements of right sub-array, if any]
IF I > MID
Repeat while J <= END
SET TEMP[INDEX] = ARR[J]
SET INDEX = INDEX + 1, SET J = J + 1
[END OF LOOP]
[Copy the remaining elements of left sub-array, if any]
ELSE
Repeat while I <= MID
SET TEMP[INDEX] = ARR[I]
SET INDEX = INDEX + 1, SET I = I + 1
[END OF LOOP]
[END OF IF]
Step 4: [Copy the contents of TEMP back to ARR]
SET K = BEG
Step 5: Repeat while K < INDEX
SET ARR[K] = TEMP[K]
SET K = K + 1
[END OF LOOP]
Step 6: END
Time complexity: O(n log n)
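The MERGE_SORT and MERGE procedures can be sketched in Python as follows. This is a minimal version under a few stated assumptions: the function names are introduced here, TEMP is a local list covering only the segment BEG..END (so it is copied back starting at BEG), and `<=` is used in the comparison so that equal elements keep their original order.

```python
def merge(arr, beg, mid, end):
    """Merge the sorted runs arr[beg..mid] and arr[mid+1..end] in place."""
    temp = []
    i, j = beg, mid + 1
    while i <= mid and j <= end:
        if arr[i] <= arr[j]:
            temp.append(arr[i])
            i += 1
        else:
            temp.append(arr[j])
            j += 1
    temp.extend(arr[i:mid + 1])      # leftovers of the left run, if any
    temp.extend(arr[j:end + 1])      # leftovers of the right run, if any
    arr[beg:end + 1] = temp          # copy back, starting at BEG


def merge_sort(arr, beg=0, end=None):
    """Sort arr[beg..end] in place (whole list by default) and return arr."""
    if end is None:
        end = len(arr) - 1
    if beg < end:
        mid = (beg + end) // 2
        merge_sort(arr, beg, mid)
        merge_sort(arr, mid + 1, end)
        merge(arr, beg, mid, end)
    return arr


print(merge_sort([39, 9, 81, 45, 90, 27, 72, 18]))
```

The array is halved about log n times and each level of merging touches all n elements, which gives the O(n log n) running time.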
Hashing
• There are several searching techniques, such as linear search, binary search,
and search trees.
• In these techniques, the time taken to search for a particular element depends
on the total number of elements.
• The main drawback of these techniques is that as the number of elements
increases, the time taken to perform the search also increases.
• This becomes problematic when the total number of elements becomes too
large.
• Hashing
• Hashing is a well-known technique for searching for a particular element
among several elements.
• It minimizes the number of comparisons while performing the search.
• With a good hash function and few collisions, the time taken to perform the
search does not depend on the total number of elements.
• In that case it completes the search in constant time on average, O(1).
Hashing Mechanism-

In hashing,
An array data structure called a hash table is used to store the data items.
Data items are inserted into the hash table based on their hash key value.

Hash Key Value-

The hash key value is a special value that serves as an index for a data item.
It indicates where the data item should be stored in the hash table.
The hash key value is generated using a hash function.
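Example:
Assume a table has 8 slots (m=8). Using the division method, h(k) = k mod 8,
insert the elements 36, 18, 72, 43, and 6 into the hash table in that order.
36 mod 8 = 4, 18 mod 8 = 2, 72 mod 8 = 0, 43 mod 8 = 3, and 6 mod 8 = 6, so the
keys go to slots 4, 2, 0, 3, and 6 with no collisions.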
Hash Function-
Hash function is a function that maps any big number or string
to a small integer value.
Hash function takes the data item as an input and returns a small
integer value as an output.
The small integer value is called the hash value.
The hash value of the data item is then used as the index for storing it
in the hash table.
Types of Hash Functions-
There are various types of hash functions available such as-
1.Division Method
2.Multiplication Method
3.Mid Square Hash Function
4.Folding Hash Function
1.Division Method
The hash function divides the key k by M and then uses the remainder obtained.
Formula:
h(k) = k mod M
Here,
k is the key value, and
M is the size of the hash table.
It is best to choose M as a prime number, since that tends to distribute the keys more uniformly.
The hash function depends on the remainder of a division.
Example:
k = 12345
M = 95
h(12345) = 12345 mod 95
= 90
k = 1276
M = 11
h(1276) = 1276 mod 11
= 0
Pros:
• This method works reasonably well for most values of M.
• The division method is very fast since it requires only a single division operation.
Cons:
• This method can lead to poor performance since consecutive keys map to consecutive hash values in the hash
table.
• Extra care should be taken in choosing the value of M.
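The division method is a one-line computation, sketched below in Python. The function name `division_hash` is an assumption introduced for this illustration; the inputs are the two worked examples of this section and the earlier 8-slot example.

```python
def division_hash(k, m):
    """Division method: the remainder of the key k divided by table size m."""
    return k % m


print(division_hash(12345, 95))
print(division_hash(1276, 11))

# Slots for the keys of the 8-slot example, in insertion order.
slots = [division_hash(k, 8) for k in (36, 18, 72, 43, 6)]
print(slots)
```

Because consecutive keys land in consecutive slots, clustered keys stay clustered in the table, which is the drawback noted in the list above.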
2.Multiplication Method
The steps involved in the multiplication method are as follows:
Step 1: Choose a constant A such that 0 < A < 1.
Step 2: Multiply the key k by A.
Step 3: Extract the fractional part of kA.
Step 4: Multiply the result of Step 3 by the size of the hash table (m) and take the floor.
Hence, the hash function can be given as:
h(k) = floor (m (kA mod 1))
Ex: Given a hash table of size 100, map the key 12345 to an appropriate
location in the hash table.
Solution: We will use A = 0.357840, m = 100, and k = 12345
h(12345) = floor[ 100 (12345*0.357840 mod 1)]
= floor[ 100 (4417.5348 mod 1) ]
= floor[ 100 (0.5348) ]
= floor[ 53.48 ]
= 53
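The four steps of the multiplication method can be sketched in Python as below. The function name `multiplication_hash` is an assumption introduced here, and ordinary floating-point arithmetic is used, so results very close to a whole number could be affected by rounding.

```python
import math


def multiplication_hash(k, a, m):
    """Multiplication method: floor(m * fractional part of k*a), 0 < a < 1."""
    fractional = (k * a) % 1          # Step 3: fractional part of kA
    return math.floor(m * fractional) # Step 4: scale by table size, take floor


print(multiplication_hash(12345, 0.357840, 100))
```

Unlike the division method, the table size m does not have to be chosen carefully here; the choice of A matters more.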
3.Mid Square Hash Function
It involves two steps to compute the hash value-
• Square the value of the key k, i.e. k^2
• Extract the middle r digits as the hash value.
Formula:
h(k) = middle r digits of (k x k)
Here,
k is the key value.
The value of r can be decided based on the size of the table.
Example:
Suppose the hash table has 100 memory locations (0 to 99). So r = 2 because two
digits are required to map the key to a memory location.
k = 60
k x k = 60 x 60
= 3600
h(60) = 60 (the middle two digits of 3600)
The hash value obtained is 60
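A possible Python sketch of the mid-square method is shown below. The function name `mid_square_hash` is an assumption, and so is the convention for picking the "middle" digits: when the square cannot be split symmetrically, this sketch leans one digit to the left, and short squares are padded with leading zeros.

```python
def mid_square_hash(k, r):
    """Mid-square method: the middle r decimal digits of k squared."""
    squared = str(k * k).zfill(r)         # pad so at least r digits exist
    start = (len(squared) - r) // 2       # assumed convention for "middle"
    return int(squared[start:start + r])


print(mid_square_hash(60, 2))   # 60 * 60 = 3600
```

Other texts pick the middle digits slightly differently; any fixed convention works as long as it is applied to every key.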
4.Digit Folding Method
This method involves two steps:
1. Divide the key value k into a number of parts k1, k2, k3, …, kn, where each part
has the same number of digits except the last part, which can have fewer digits
than the other parts.
2. Add the individual parts. The hash value is obtained by ignoring the last carry, if
any.
Formula:
k = k1, k2, k3, k4, ….., kn
s = k1 + k2 + k3 + k4 + …. + kn
h(k) = s
Here,
s is obtained by adding the parts of the key k
Example:
k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(K) = 51
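The folding steps can be sketched in Python as follows. The function name `folding_hash` and the parameter `part_size` are assumptions introduced here, and "ignoring the last carry" is interpreted as keeping only the lowest `part_size` digits of the sum.

```python
def folding_hash(k, part_size=2):
    """Digit folding: split k into digit groups, add them, drop the carry."""
    digits = str(k)
    # Groups of part_size digits from the left; the last may be shorter.
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    s = sum(parts)
    return s % (10 ** part_size)      # ignore any carry beyond part_size digits


print(folding_hash(12345))   # parts 12, 34, 5
print(folding_hash(9999))    # parts 99, 99: the sum 198 carries into a third digit
```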
COLLISIONS
A collision occurs when the hash function maps two different keys
to the same location.
When this happens, a method called a collision resolution technique
is applied to resolve it.
The simplest approach to resolving a collision is linear probing.
Linear Probing:
• In this technique, if a value is already stored at
the location generated by h(k), a collision has
occurred, and we do a sequential search to
find an empty location.
• The idea is to place the value in the next
available position.
• The array or hash table is treated as circular:
when the last slot is reached and no empty
location has been found, the search proceeds
to the first location of the array.
• The probe function below calculates the next location. If the location is empty,
the value is stored there; otherwise the next location is tried.
• The following function is used to resolve collisions:
• h(k, i) = [h(k) + i] mod m
• Where
• m = size of the hash table,
• h(k) = (k mod m),
• i = the probe number, which varies from 0 to m–1.
• Therefore, for a given key k, the first location is generated by [h(k) + 0] mod m,
since i = 0 the first time.
• If the location is free, the value is stored at this location. If the value is
stored on this first attempt, the probe count is 1.
• If the location is not free, the second probe generates the address of the location
given by [h(k) + 1] mod m.
• Similarly, if the generated location is occupied, then subsequent probes generate
the addresses [h(k) + 2] mod m, [h(k) + 3] mod m, [h(k) + 4] mod m, [h(k) + 5] mod
m, and so on, until a free location is found.
• The number of probes is the count of locations examined to find a free location
for each value stored in the hash table.
Example:
Insert the following sequence of keys in the hash table
{9, 7, 11, 13, 12, 8}
Use linear probing technique for collision resolution
h(k, i) = [h(k) + i] mod m
h(k) = 2k + 5
m=10
Solution:
Step 01:
First Draw an empty hash table of Size 10.
The possible range of hash values will be [0, 9].
Step 02:
Insert the given keys one by one in the hash table.
First Key to be inserted in the hash table = 9.
h(k) = 2k + 5
h(9) = 2*9 + 5 = 23
h(k, i) = [h(k) + i] mod m
h(9, 0) = [23 + 0] mod 10 = 3
So, key 9 will be inserted at index 3 of the hash table
Step 03:
Next Key to be inserted in the hash table = 7.
h(k) = 2k + 5
h(7) = 2*7 + 5 = 19
h(k, i) = [h(k) + i] mod m
h(7, 0) = [19 + 0] mod 10 = 9
So, key 7 will be inserted at index 9 of the hash table
Step 04:
Next Key to be inserted in the hash table = 11.
h(k) = 2k + 5
h(11) = 2*11 + 5 = 27
h(k, i) = [h(k) + i] mod m
h(11, 0) = [27 + 0] mod 10 = 7
So, key 11 will be inserted at index 7 of the hash table
Step 05:
Next Key to be inserted in the hash table = 13.
h(k) = 2k + 5
h(13) = 2*13 + 5 = 31
h(k, i) = [h(k) + i] mod m
h(13, 0) = [31 + 0] mod 10 = 1
So, key 13 will be inserted at index 1 of the hash table
Step 06:
Next key to be inserted in the hash table = 12.
h(k) = 2k + 5
h(12) = 2*12 + 5 = 29
h(k, i) = [h(k) + i] mod m
h(12, 0) = [29 + 0] mod 10 = 9
Here a collision has occurred because index 9 is already filled (by key 7).
Now we increase i by 1.
h(12, 1) = [29 + 1] mod 10 = 0
Index 0 is vacant, so key 12 will be inserted at index 0 of the hash table.
Step 07:
Next key to be inserted in the hash table = 8.
h(k) = 2k + 5
h(8) = 2*8 + 5 = 21
h(k, i) = [h(k) + i] mod m
h(8, 0) = [21 + 0] mod 10 = 1
Here a collision has occurred because index 1 is already filled (by key 13).
Now we increase i by 1, so i becomes 1.
h(8, 1) = [21 + 1] mod 10 = 2
Index 2 is vacant, so key 8 will be inserted at index 2.
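The whole insertion sequence can be replayed with a short Python sketch of linear probing. The function name `linear_probe_insert` and the use of `None` to mark an empty slot are assumptions made for this illustration; the hash function h(k) = 2k + 5 and the table size m = 10 are taken from the example.

```python
def linear_probe_insert(table, key, h):
    """Insert key using h(k, i) = (h(k) + i) mod m; return (index, probes)."""
    m = len(table)
    for i in range(m):                    # probe number i = 0 .. m-1
        idx = (h(key) + i) % m
        if table[idx] is None:            # free slot found
            table[idx] = key
            return idx, i + 1
    raise RuntimeError("hash table is full")


table = [None] * 10                       # empty table of size m = 10


def h(k):
    return 2 * k + 5                      # hash function from the example


for key in [9, 7, 11, 13, 12, 8]:
    idx, probes = linear_probe_insert(table, key, h)
    print(f"key {key} -> index {idx} after {probes} probe(s)")
print(table)
```

Keys 12 and 8 each need two probes because their first location is already occupied; the other keys are stored on the first probe.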
