
DATA STRUCTURES

SWATI JAGTAP
Syllabus
Unit II Searching and Sorting Algorithms (06 Hrs)

Algorithms: Analysis of iterative and recursive algorithms, space & time complexity, asymptotic notation: Big-O, Theta, and Omega notations.
Searching methods: Linear, Binary, Hashing
Sorting methods: Bubble and Quick Sort.

Mapping of Course Outcomes for Unit II


CO2: Implement sorting and searching algorithms and calculate their
complexity.
Sorting
 To arrange a set of items in sequence.
 It is estimated that 25–50% of all computing power is used for sorting activities.
 Possible reasons:
 Many applications require sorting;
 Many applications perform sorting when they don't have to;
 Many applications use inefficient sorting algorithms.
Sorting
 Basic problem
 Order elements in an array or vector.
 Use
 Need to know relationships between data elements (e.g., top N students in a class).
 Searching can be made more efficient (e.g., binary search; see the sketch below this list).
 Implementation
 Simple implementation, relatively slow: O(n²)
 Complex implementation, more efficient: O(n log n)
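To see why sorted data makes searching cheaper, here is a minimal binary search sketch (an illustration, assuming an ascending int array; not one of the unit's listings). Each step halves the remaining range, so a sorted array can be searched in O(log n) compares instead of the O(n) of a linear scan.

#include <stdio.h>

/* Returns the index of key in ascending-sorted arr[0..n-1], or -1. */
int binarySearch(const int arr[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  /* avoids (low+high) overflow */
        if (arr[mid] == key)
            return mid;
        else if (arr[mid] < key)
            low = mid + 1;   /* key can only be in the right half */
        else
            high = mid - 1;  /* key can only be in the left half */
    }
    return -1;  /* not found */
}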

Sorting Terminology

 What is in-place sorting?
An in-place sorting algorithm uses constant extra space to produce its output (it modifies the given array only); it sorts the list purely by rearranging the order of the elements within the list. For example, Insertion Sort and Selection Sort are in-place sorting algorithms, as they do not use any additional space for sorting the list, while typical implementations of Merge Sort and Counting Sort are not in-place.
Sorting Terminology

 What are internal and external sorting?
When all the data to be sorted cannot be placed in memory at one time, the sorting is called external sorting. External sorting is used for massive amounts of data; Merge Sort and its variations are typically used for external sorting, with external storage such as a hard disk holding the data.
When all data fits in memory, the sorting is called internal sorting.
Internal sorting
 Data is sorted entirely in main memory.
 Advantages:
 High speed
 Random access to all data members.
 Disadvantage:
 Data resides in main memory, which is volatile: if power goes off, the data is lost, so it must ultimately be stored in secondary memory.
External sorting
 Data is stored on secondary devices (external memory).
 Required when there is a huge amount of data.
 When sorting is done on data lying on secondary devices, it is called external sorting.
 Advantage:
 No data is lost on power-off.
 Disadvantages:
 More complex
 Less efficient, because comparison of data involves secondary devices.
Stability in sorting algorithms

 Stability mainly matters when we have key-value pairs with duplicate keys possible (like people's names as keys and their details as values) and we wish to sort these objects by keys.
Stability in sorting algorithms

 What is it?
A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the input array to be sorted.
Stability in sorting algorithms
 A sorting algorithm is said to be stable if, after sorting, identical elements appear in the same sequence as in the original list.

Fig a) Unsorted List
Name    Subject     Marks
Amit    Chemistry   74
Mohan   Physics     65
Sohan   Chemistry   70
Mohan   Chemistry   68
Sohan   Physics     75

Fig b) Sorted List (Stable)
Name    Subject     Marks
Mohan   Physics     65
Mohan   Chemistry   68
Sohan   Chemistry   70
Amit    Chemistry   74
Sohan   Physics     75

Fig c) Sorted List (Unstable)
Name    Subject     Marks
Amit    Chemistry   74
Mohan   Chemistry   68
Mohan   Physics     65
Sohan   Chemistry   70
Sohan   Physics     75

(In Fig c, the two records for Mohan appear in the opposite order from the unsorted list — that is what makes the sort unstable.)
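The contrast in the tables above can be reproduced in code. A small C sketch (the struct, data values, and tags here are illustrative, not from the slides): selection sort's long-distance swap can carry a record past another record with an equal key, which is exactly what "unstable" means.

#include <stdio.h>

/* A record with a sort key and a tag that exposes the
   original order of equal keys. */
struct Rec { int key; char tag; };

/* Plain selection sort on records, ordered by key only. */
void selectionSortRec(struct Rec a[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min_idx = i;
        for (int j = i + 1; j < n; j++)
            if (a[j].key < a[min_idx].key)
                min_idx = j;
        struct Rec t = a[min_idx];  /* long-distance swap: this can */
        a[min_idx] = a[i];          /* move one "4" past the other, */
        a[i] = t;                   /* breaking their input order   */
    }
}

int main(void)
{
    /* 4a precedes 4b in the input */
    struct Rec a[] = { {4,'a'}, {3,'x'}, {4,'b'}, {1,'y'} };
    int n = sizeof(a) / sizeof(a[0]);
    selectionSortRec(a, n);
    for (int i = 0; i < n; i++)
        printf("%d%c ", a[i].key, a[i].tag);
    printf("\n");   /* prints: 1y 3x 4b 4a -- 4b now precedes 4a */
    return 0;
}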
Guess!!! Which is Stable?

Time Complexities of all Sorting Algorithms

 Efficiency of an algorithm depends on two parameters:
 1. Time complexity
 2. Space complexity
 Time Complexity: Time complexity is measured as the number of times a particular instruction set is executed, rather than the total time taken, because the total time also depends on external factors such as the compiler used, the processor's speed, etc. (a tiny counting example follows this list).
 Space Complexity: Space complexity is the total memory space required by the program for its execution.
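A tiny illustration of counting executions rather than seconds (a hypothetical helper function, just to make the definition concrete):

/* Counting executed instructions rather than wall-clock time:
   the loop body runs exactly n times, so the time complexity
   is O(n) regardless of compiler or processor speed. */
int sumArray(int arr[], int n)
{
    int sum = 0;                  /* executes 1 time   */
    for (int i = 0; i < n; i++)   /* test runs n+1 times */
        sum += arr[i];            /* executes n times  */
    return sum;
}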

Time Complexities of all Sorting Algorithms
Sorting Methods

 Bubble
 Insertion
 Selection
 Merge
 Quick Sort
Sorting

 Rearrange a[0], a[1], …, a[n-1] into ascending order. When done:
a[0] <= a[1] <= … <= a[n-1]
 Example: 8, 6, 9, 4, 3 => 3, 4, 6, 8, 9
General Sorting

 Assumptions
 Data is in a linear data structure.
 A comparator for elements is available.
 A swap routine (or shift) is available.
 No knowledge about the data values.
General Sorting

Swap:
temp = a;
a = b;
b = temp;

Passes:
While sorting the elements into some specific order, the elements are rearranged many times. The phases in which the elements move to acquire their proper positions are called passes.

If there are n elements in an array, then (n-1) passes are needed for sorting.
Bubble Sort
 Bubble Sort is the simplest sorting algorithm; it works by repeatedly swapping adjacent elements if they are in the wrong order.
 https://www.youtube.com/watch?v=18OO361--1E
 The bubble sort gets its name because elements tend to move up into the correct order like bubbles rising to the surface.
Bubble Sort

 Slowest, yet most popular.
 It is called bubble sort because elements appear to "bubble" up toward their final positions as adjacent values are compared.
 Takes multiple passes over the array.
 Swaps adjacent elements when values are out of order.
 Invariant: each pass guarantees that the largest remaining element is placed in its correct (last unsorted) position.
Bubble Sort: Pass 1
 Start – unsorted
 Compare, swap (0, 1)
 Compare, swap (1, 2)
 Compare, no swap
 Compare, no swap
 Compare, swap (4, 5)
 99 in position
Bubble Sort: Pass 2
 Swap (0, 1)
 No swap
 No swap
 Swap (3, 4)
 21 in position
Bubble Sort: Passes 3–5
Pass 3:
 No swap
 No swap
 Swap (2, 3)
 12 in position
Pass 4:
 No swap
 Swap (1, 2)
 8 in position
Pass 5:
 Swap (1, 2)
 Done
(With the unoptimized algorithm, the remaining passes still execute even once no further change is needed.)
#include <stdio.h>

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

// A function to implement bubble sort
void bubbleSort(int arr[], int n)
{
    int i, j;
    for (i = 0; i < n-1; i++)
        // Last i elements are already in place
        for (j = 0; j < n-i-1; j++)
            if (arr[j] > arr[j+1])
                swap(&arr[j], &arr[j+1]);
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {64, 34, 25, 12, 22, 11, 90};
    int n = sizeof(arr)/sizeof(arr[0]);
    bubbleSort(arr, n);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}
BubbleSort – Algorithm Complexity

 Time-consuming operations: compares, swaps.

 #Compares
 One for loop nested inside another: (n-1) + (n-2) + (n-3) + ... + 3 + 2 + 1
 Sum = n(n-1)/2, i.e. O(n²)

 #Swaps
 Inside a conditional -> the number of swaps is data dependent!
 Best case: 0 swaps, i.e. O(1) (see the early-exit sketch below)
 Worst case: (n-1) + (n-2) + (n-3) + ... + 1, i.e. O(n²)

 Space
 The size of the array itself: an in-place algorithm.
 Space Complexity: O(1)
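The best-case figures suggest a common refinement, sketched here (a variant, not the listing shown earlier): track whether a pass swapped anything and stop as soon as a full pass makes no swap. On already-sorted input this brings the running time down to O(n).

// Bubble sort with an early-exit flag: if a whole pass makes no
// swap, the array is already sorted and the remaining passes
// can be skipped, giving an O(n) best case on sorted input.
void bubbleSortOptimized(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int swapped = 0;            /* did this pass swap anything? */
        for (int j = 0; j < n - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                swapped = 1;
            }
        }
        if (!swapped)               /* no swaps: already sorted */
            break;
    }
}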
Sorting Methods

 Bubble
 Insertion
 Selection
 Merge
 Quick Sort
Insertion Sort

 Insertion sort is a simple sorting algorithm that works the way you sort playing cards in your hands. The array is virtually split into a sorted and an unsorted part. Values from the unsorted part are picked and placed at the correct position in the sorted part.
 https://www.youtube.com/watch?v=uMqVuEEWJv4
Insertion Sort

 Algorithm
To sort an array of size n in ascending order:
1: Iterate from arr[1] to arr[n-1] over the array.
2: Compare the current element (key) to its predecessor.
3: If the key element is smaller than its predecessor, compare it to the elements before it. Move each greater element one position up to make space for the key.
Insertion Sort: Example
 12, 11, 13, 5, 6
 Let us loop from i = 0 (first element of the array) to 4 (last element of the array):
 i = 0. No element lies to the left of 12, so move to the next index.
 i = 1. Since 11 is smaller than 12, move 12 and insert 11 before 12:
11, 12, 13, 5, 6
 i = 2. 13 remains in its position, as all elements in A[0..i-1] are smaller than 13:
11, 12, 13, 5, 6
 i = 3. 5 moves to the beginning, and all other elements from 11 to 13 move one position ahead of their current position:
5, 11, 12, 13, 6
 i = 4. 6 moves to the position after 5, and the elements from 11 to 13 move one position ahead of their current position:
5, 6, 11, 12, 13

#include <stdio.h>

/* Function to sort an array using insertion sort */
void insertionSort(int arr[], int n)
{
    int i, key, j;
    for (i = 1; i < n; i++) {
        key = arr[i];
        j = i - 1;

        /* Move elements of arr[0..i-1] that are
           greater than key to one position ahead
           of their current position */
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j = j - 1;
        }
        arr[j + 1] = key;
    }
}

// A utility function to print an array of size n
void printArray(int arr[], int n)
{
    int i;
    for (i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

/* Driver program to test insertion sort */
int main()
{
    int arr[] = { 12, 11, 13, 5, 6 };
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);
    printArray(arr, n);
    return 0;
}

Trace for arr = 12, 11, 13, 5, 6 (indices 0–4):
i = 1: key = 11, j = 0; arr[0] = 12 > 11, so arr[1] = 12 and j becomes -1; then arr[0] = 11 → 11, 12, 13, 5, 6
i = 2: key = 13, j = 1; arr[1] = 12 is not > 13, so nothing shifts → 11, 12, 13, 5, 6
Insertion Sort

 Time Complexity: O(n²)
 Auxiliary Space: O(1)
 Boundary Cases: Insertion sort takes maximum time when elements are sorted in reverse order, and minimum time (order of n) when elements are already sorted.
 Algorithmic Paradigm: Incremental approach
 Sorting In Place: Yes
 Stable: Yes
 Online: Yes
 Uses: Insertion sort is used when the number of elements is small. It can also be useful when the input array is almost sorted and only a few elements are misplaced in a big array. (A common variant is sketched below.)
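One such variant, sketched under the assumption that comparisons are the expensive operation: "binary insertion sort" locates each key's position in the sorted prefix with binary search, cutting compares to O(log n) per element (the shifts remain, so the worst case stays O(n²)). This is a sketch of the standard technique, not one of the unit's listings.

void binaryInsertionSort(int arr[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = arr[i];

        /* Binary-search the sorted prefix arr[0..i-1] for the
           first position whose element is greater than key */
        int low = 0, high = i;
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (arr[mid] <= key)   /* <= keeps the sort stable */
                low = mid + 1;
            else
                high = mid;
        }

        /* Shift arr[low..i-1] right by one and insert the key */
        for (int j = i; j > low; j--)
            arr[j] = arr[j - 1];
        arr[low] = key;
    }
}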
Sorting Methods

 Bubble
 Insertion
 Selection
 Merge
 Quick Sort
Selection Sort

 https://www.youtube.com/watch?v=R_f3PJtRqUQ
 Selection sort is a simple sorting algorithm. It is an in-place, comparison-based algorithm in which the list is divided into two parts:
 the sorted part at the left end and the unsorted part at the right end.
 Initially, the sorted part is empty and the unsorted part is the entire list.
 The smallest element is selected from the unsorted part and swapped with the leftmost unsorted element, and that element becomes part of the sorted part. This process continues, moving the unsorted-part boundary one element to the right each time.
 This algorithm is not suitable for large data sets, as its average and worst case complexities are O(n²), where n is the number of items.
Selection Sort

Algorithm
Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted
Selection Sort: Example
#include <stdio.h>

void swap(int *xp, int *yp)
{
    int temp = *xp;
    *xp = *yp;
    *yp = temp;
}

void selectionSort(int arr[], int n)
{
    int i, j, min_idx;

    // One by one move boundary of unsorted subarray
    for (i = 0; i < n-1; i++)
    {
        // Find the minimum element in unsorted array
        min_idx = i;
        for (j = i+1; j < n; j++)
            if (arr[j] < arr[min_idx])
                min_idx = j;

        // Swap the found minimum element with the first element
        swap(&arr[min_idx], &arr[i]);
    }
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {64, 25, 12, 22, 11};
    int n = sizeof(arr)/sizeof(arr[0]);
    selectionSort(arr, n);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}

Output: Sorted array: 11 12 22 25 64
Selection Sort

 Time Complexity: O(n²), as there are two nested loops.
 Auxiliary Space: O(1)
The good thing about selection sort is that it never makes more than O(n) swaps, so it can be useful when a memory write is a costly operation.
 The default implementation is not stable; however, it can be made stable (see the sketch below).
 In Place: Yes, it does not require extra space.
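A hedged sketch of one way to make it stable: instead of swapping the minimum into place (the long-distance swap is what can throw an equal element behind its twin), shift the intervening block one position right and insert the minimum, insertion-sort style.

void stableSelectionSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        // Find the minimum element in the unsorted part, as before
        int min_idx = i;
        for (int j = i + 1; j < n; j++)
            if (arr[j] < arr[min_idx])
                min_idx = j;

        // Instead of swapping, shift arr[i..min_idx-1] right by one
        // and drop the minimum into slot i; equal elements never
        // jump over each other, so the sort is stable.
        int min_val = arr[min_idx];
        while (min_idx > i) {
            arr[min_idx] = arr[min_idx - 1];
            min_idx--;
        }
        arr[i] = min_val;
    }
}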
Sorting Methods

 Bubble
 Insertion
 Selection
 Merge
 Quick Sort
Merge Sort
https://www.youtube.com/watch?v=4VqmGXwpLqc&t=24s
➢ Merge Sort is a Divide and Conquer algorithm. It divides the input array into two halves, calls itself for the two halves, and then merges the two sorted halves.
➢ Merge sort (also commonly spelled mergesort) is an efficient, general-purpose, comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the order of equal elements is the same in the input and output. Merge sort is a divide and conquer algorithm that was invented by John von Neumann in 1945.
➢ With a worst-case time complexity of O(n log n), it is one of the most respected algorithms.
#include <stdio.h>
#include <stdlib.h>

/* Merges two subarrays of arr[].
   First subarray is arr[l..m]
   Second subarray is arr[m+1..r] */
void merge(int arr[], int l, int m, int r)
{
    int i, j, k;
    int n1 = m - l + 1;
    int n2 = r - m;

    /* create temp arrays */
    int L[n1], R[n2];

    /* Copy data to temp arrays L[] and R[] */
    for (i = 0; i < n1; i++)
        L[i] = arr[l + i];
    for (j = 0; j < n2; j++)
        R[j] = arr[m + 1 + j];

    /* Merge the temp arrays back into arr[l..r] */
    i = 0; // Initial index of first subarray
    j = 0; // Initial index of second subarray
    k = l; // Initial index of merged subarray
    while (i < n1 && j < n2) {
        if (L[i] <= R[j]) {
            arr[k] = L[i];
            i++;
        }
        else {
            arr[k] = R[j];
            j++;
        }
        k++;
    }

    /* Copy the remaining elements of L[], if there are any */
    while (i < n1) {
        arr[k] = L[i];
        i++;
        k++;
    }

    /* Copy the remaining elements of R[], if there are any */
    while (j < n2) {
        arr[k] = R[j];
        j++;
        k++;
    }
}

/* l is the left index and r is the right index of the
   sub-array of arr to be sorted */
void mergeSort(int arr[], int l, int r)
{
    if (l < r) {
        // Same as (l+r)/2, but avoids overflow for large l and r
        int m = l + (r - l) / 2;

        // Sort first and second halves
        mergeSort(arr, l, m);
        mergeSort(arr, m + 1, r);

        merge(arr, l, m, r);
    }
}

/* UTILITY FUNCTIONS */
/* Function to print an array */
void printArray(int A[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        printf("%d ", A[i]);
    printf("\n");
}

/* Driver program to test above functions */
int main()
{
    int arr[] = { 9, 7, 3, 6, 2 };
    int arr_size = sizeof(arr) / sizeof(arr[0]);

    printf("Given array is \n");
    printArray(arr, arr_size);

    mergeSort(arr, 0, arr_size - 1);

    printf("\nSorted array is \n");
    printArray(arr, arr_size);
    return 0;
}

Trace for arr = 9, 7, 3, 6, 2 (indices 0–4): the first call mergeSort(arr, 0, 4) computes m = (0+4)/2 = 2 and recurses on arr[0..2]; mergeSort(arr, 0, 2) computes m = (0+2)/2 = 1, and so on until single-element sub-arrays remain, which are then merged back together.

https://www.youtube.com/watch?v=cAv-4ltj1go
Algorithm Analysis
 Bubble, Selection, and Insertion sort time complexity: O(n²)
 Merge sort time complexity: O(n log n)
Applications of Merge Sort

 Merge Sort is useful for sorting linked lists in O(n log n) time. The linked-list case differs mainly because of the difference in memory allocation between arrays and linked lists: unlike array elements, linked-list nodes may not be adjacent in memory, and in a linked list we can insert items in the middle in O(1) extra space and O(1) time. Therefore the merge operation of merge sort can be implemented without extra space for linked lists.
➢ Inversion Count Problem
1. The inversion count for an array indicates how far (or close) the array is from being sorted. If the array is already sorted, the inversion count is 0; if the array is sorted in reverse order, the inversion count is the maximum. (A small sketch of the definition follows this list.)
➢ Used in External Sorting
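A minimal sketch of the definition in point 1 (brute force, O(n²); the function name is illustrative): every pair (i, j) with i < j and arr[i] > arr[j] counts as one inversion. Merge sort can count the same quantity in O(n log n) by adding, during each merge, the number of elements still remaining in the left half whenever an element of the right half is copied first.

/* Counts pairs (i, j) with i < j and arr[i] > arr[j].
   A sorted array returns 0; a reverse-sorted array
   returns the maximum, n(n-1)/2. */
int countInversions(int arr[], int n)
{
    int count = 0;
    for (int i = 0; i < n - 1; i++)
        for (int j = i + 1; j < n; j++)
            if (arr[i] > arr[j])
                count++;
    return count;
}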
Algorithm Analysis

 Time Complexity: Merge Sort is a recursive algorithm, and its time complexity can be expressed as the following recurrence relation:
T(n) = 2T(n/2) + θ(n)
 The above recurrence can be solved using either the recurrence tree method or the Master method. It falls in case II of the Master method, and the solution of the recurrence is θ(n log n). The time complexity of Merge Sort is O(n log n) in all 3 cases (worst, average, and best), as merge sort always divides the array into two halves and takes linear time to merge the two halves.
Auxiliary Space: O(n)
Algorithmic Paradigm: Divide and Conquer
Sorting In Place: No, in a typical implementation
Stable: Yes
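A quick way to see the θ(n log n) bound is to unroll the recurrence, writing cn for the constant hidden in the θ(n) merge cost:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = 8T(n/8) + 3cn
     ...
     = 2^k T(n/2^k) + k·cn

With k = log₂ n (so that n/2^k = 1), this becomes n·T(1) + cn·log₂ n = θ(n log n): there are log₂ n levels of halving, and each level does θ(n) total merging work.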
Sorting Methods

 Bubble
 Insertion
 Selection
 Merge
 Quick Sort
Quick sort
 Like Merge Sort, QuickSort is a Divide and Conquer algorithm. It picks an element as a pivot and partitions the given array around the picked pivot. There are many different versions of quickSort that pick the pivot in different ways:
1. Always pick the first element as pivot.
2. Always pick the last element as pivot (implemented below).
3. Pick a random element as pivot.
4. Pick the median as pivot.
 The key process in quickSort is partition(). The target of partition is: given an array and an element x of the array as pivot, put x at its correct position in the sorted array, put all smaller elements (smaller than x) before x, and put all greater elements (greater than x) after x. All this should be done in linear time.
 https://www.youtube.com/watch?v=PgBzjlCcFvc
How QuickSort Works?
The array is split around the pivot, and the sub-parts are again divided into smaller sub-parts until each sub-part consists of a single element. At this point, the array is already sorted.
#include <stdio.h>

// A utility function to swap two elements
void swap(int* a, int* b)
{
    int t = *a;
    *a = *b;
    *b = t;
}

/* This function takes the last element as pivot, places
   the pivot element at its correct position in the sorted
   array, and places all smaller elements (smaller than pivot)
   to the left of the pivot and all greater elements to the
   right of the pivot */
int partition (int arr[], int low, int high)
{
    int pivot = arr[high]; // pivot
    int i = (low - 1);     // Index of smaller element

    for (int j = low; j <= high - 1; j++)
    {
        // If current element is smaller than the pivot
        if (arr[j] < pivot)
        {
            i++; // increment index of smaller element
            swap(&arr[i], &arr[j]);
        }
    }
    swap(&arr[i + 1], &arr[high]);
    return (i + 1);
}

/* The main function that implements QuickSort
   arr[] --> Array to be sorted,
   low   --> Starting index,
   high  --> Ending index */
void quickSort(int arr[], int low, int high)
{
    if (low < high)
    {
        /* pi is the partitioning index; arr[pi] is now
           at the right place */
        int pi = partition(arr, low, high);

        // Separately sort elements before
        // partition and after partition
        quickSort(arr, low, pi - 1);
        quickSort(arr, pi + 1, high);
    }
}

/* Function to print an array */
void printArray(int arr[], int size)
{
    int i;
    for (i = 0; i < size; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

// Driver program to test above functions
int main()
{
    int arr[] = {10, 7, 8, 9, 1, 5};
    int n = sizeof(arr)/sizeof(arr[0]);
    quickSort(arr, 0, n-1);
    printf("Sorted array: \n");
    printArray(arr, n);
    return 0;
}
Analysis of Quick Sort
The time taken by QuickSort in general can be written as follows:
T(n) = T(k) + T(n-k-1) + θ(n)
 The first two terms are for the two recursive calls, and the last term is for the partition process; k is the number of elements smaller than the pivot.
The time taken by QuickSort depends upon the input array and the partition strategy. Following are the main cases.
 Worst Case: The worst case occurs when the partition process always picks the greatest or smallest element as pivot. With the above partition strategy, where the last element is always picked as pivot, the worst case occurs when the array is already sorted in increasing or decreasing order.
 Although the worst-case time complexity of QuickSort is O(n²), which is worse than many other sorting algorithms such as Merge Sort and Heap Sort, QuickSort is faster in practice, because its inner loop can be efficiently implemented on most architectures and for most real-world data.
 QuickSort can be implemented in different ways by changing the choice of pivot, so that the worst case rarely occurs for a given type of data (one common approach is sketched below). However, merge sort is generally considered better when data is huge and stored in external storage.
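A hedged sketch of one such pivot choice: pick the pivot at random, so that no fixed input pattern (such as an already-sorted array) reliably triggers the worst case. This helper reuses swap() and partition() from the listing above and needs #include <stdlib.h> for rand().

// Swap a randomly chosen element into the last slot, then reuse
// the last-element partition() shown earlier. With a random pivot
// the expected running time is O(n log n) for every input.
int randomizedPartition(int arr[], int low, int high)
{
    int r = low + rand() % (high - low + 1);  /* random index in [low, high] */
    swap(&arr[r], &arr[high]);
    return partition(arr, low, high);
}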
Time Complexities of all Sorting Algorithms
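For reference, the standard complexities of the methods covered in this unit (the O(n) best case for bubble sort assumes the early-exit variant sketched earlier; quick sort's space is the recursion stack):

Algorithm        Best         Average      Worst        Space
Bubble Sort      O(n)         O(n²)        O(n²)        O(1)
Insertion Sort   O(n)         O(n²)        O(n²)        O(1)
Selection Sort   O(n²)        O(n²)        O(n²)        O(1)
Merge Sort       O(n log n)   O(n log n)   O(n log n)   O(n)
Quick Sort       O(n log n)   O(n log n)   O(n²)        O(log n)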
