DS-notes full
Abstract Data type (ADT) is a type (or class) for objects whose
behaviour is defined by a set of values and a set of operations. The
definition of ADT only mentions what operations are to be performed but
not how these operations will be implemented. It does not specify how
data will be organized in memory and what algorithms will be used for
implementing the operations. It is called “abstract” because it gives an
implementation-independent view.
The process of providing only the essentials and hiding the details is
known as abstraction.
The above figure shows the ADT model. The ADT model has two kinds of functions, i.e., public functions and private functions. The ADT model also contains the data structures that we are using in a program.
In this model, first encapsulation is performed, i.e., all the data and its operations are wrapped in a single unit, the ADT. Then abstraction is performed, i.e., only the operations that can be performed on the data structure are shown to the user, while the underlying data structures remain hidden.
So a user only needs to know what a data type can do, but not how it
is implemented. Think of an ADT as a black box which hides the inner
structure and design of the data type.
The commonly used asymptotic notations for describing the running
time complexity of an algorithm are given below:

Big-O Notation (O):
If f(n) and g(n) are two functions defined for positive integers, then f(n) = O(g(n)) if there exist constants c and n0 such that f(n) ≤ c.g(n) for all n ≥ n0.
This implies that f(n) does not grow faster than g(n), or g(n) is an upper
bound on the function f(n). In this case, we are bounding the growth
rate of the function from above, which captures the worst-case time complexity
of a function, i.e., how badly an algorithm can perform.

Omega Notation (Ω):
If f(n) and g(n) are two functions defined for positive integers, then f(n) = Ω(g(n)) if there exist constants c and n0 such that f(n) ≥ c.g(n) for all n ≥ n0, i.e., g(n) is a lower bound on f(n). This captures the best-case time complexity of an algorithm.

Theta Notation (θ):
Let f(n) and g(n) be functions of n, where n is the number of steps required to
execute the program. Then

f(n) = θ(g(n))

means the function is bounded by two limits, i.e., an upper and a lower limit,
and f(n) comes in between. The condition f(n) = θ(g(n)) will be true if and
only if there exist constants c1 and c2 such that c1.g(n) is less than or equal to f(n) and c2.g(n) is greater than or
equal to f(n) for sufficiently large n. The graphical representation of theta notation is given
below:
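To make the definition concrete, consider f(n) = 3n² + 2n and g(n) = n². With c1 = 3 and c2 = 4, the condition c1.g(n) ≤ f(n) ≤ c2.g(n) holds for all n ≥ 2, so f(n) = θ(n²). A small C sketch (the function names are illustrative, not part of any standard API) checks these bounds numerically:

```c
/* f(n) = 3n^2 + 2n and g(n) = n^2: we claim f(n) = Theta(g(n)) */
long f(long n) { return 3 * n * n + 2 * n; }
long g(long n) { return n * n; }

/* Check c1*g(n) <= f(n) <= c2*g(n) with c1 = 3, c2 = 4 */
int theta_holds(long n) {
    return 3 * g(n) <= f(n) && f(n) <= 4 * g(n);
}
```

For n = 1 the upper bound fails (f(1) = 5 > 4), which is why the definition only requires the inequalities for sufficiently large n.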
Linked List
o Linked List can be defined as a collection of objects called nodes that
are randomly stored in memory.
o A node contains two fields, i.e., the data stored at that particular address
and a pointer which contains the address of the next node in
memory.
o The last node of the list contains a NULL pointer.
Why use linked list over array?
Till now, we were using the array data structure to organize a group of
elements that are to be stored individually in memory. However,
an array has several limitations which must be known in
order to decide which data structure will be used throughout the
program.
A linked list is a data structure which can overcome many of the limitations of
an array. Using a linked list is useful because:
1. It allocates the memory dynamically. All the nodes of linked list are
non-contiguously stored in the memory and linked together with the
help of pointers.
2. Sizing is no longer a problem since we do not need to define the size
at the time of declaration. The list grows as per the program's demand
and is limited only by the available memory space.
1. Singly Linked List: The nodes only point to the address of the next
node in the list.
2. Doubly Linked List: The nodes point to the addresses of both
previous and next nodes.
3. Circular Linked List: The last node in the list will point to the first
node in the list.
4. Circular Doubly Linked List: A circular doubly linked list is defined
as a circular linked list in which each node has two links connecting it
to the previous node and the next node.
A singly linked list is a linear data structure in which the elements are
not stored in contiguous memory locations and each element is connected
only to its next element using a pointer.
The basic operations in the linked lists are insertion, deletion, searching,
display, and deleting an element at a given key. These operations are
performed on Singly Linked Lists as given below –
1. Creating a Node:
Code:
struct node
{
    int data;
    struct node *next;
};
Inserting Nodes:
To insert an element or a node into a linked list, the following three things
need to be done:
Allocating a node
Assigning data to the info field of the node
Adjusting the pointers so that the new node is linked into the list
☀ Inserting a node at the beginning of the singly linked list:
Algorithm:
let *head be the pointer to first node in the current list
1. Create a new node using malloc function
NewNode=(NodeType*)malloc(sizeof(NodeType));
2. Assign data to the info field of new node
NewNode->info=newItem;
3. Set next of new node to head
NewNode->next=head;
4. Set the head pointer to the new node
head=NewNode;
5. End
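The steps above can be sketched as one C function. This is a minimal, illustrative version: it uses `struct node` with `info`/`next` fields matching the algorithm, and passes the head in and returns it rather than using a global pointer.

```c
#include <stdlib.h>

struct node {
    int info;
    struct node *next;
};

/* Insert newItem at the beginning; returns the new head of the list */
struct node *insert_beginning(struct node *head, int newItem) {
    /* step 1: create a new node using malloc */
    struct node *NewNode = malloc(sizeof(struct node));
    if (NewNode == NULL)
        return head;              /* allocation failed: list unchanged */
    NewNode->info = newItem;      /* step 2: assign data to the info field */
    NewNode->next = head;         /* step 3: set next of new node to head */
    return NewNode;               /* step 4: new node becomes the head */
}
```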
Inserting a node at the end of the singly linked list:
Algorithm:
let *head be the pointer to first node in the current list
1. Create a new node using malloc function
NewNode=(NodeType*)malloc(sizeof(NodeType));
2. Assign data to the info field of new node
NewNode->info=newItem;
3. Set next of new node to NULL
NewNode->next=NULL;
4. if (head ==NULL)then
Set head =NewNode.and exit.
5. Set temp=head;
6. while(temp->next!=NULL)
temp=temp->next; //increment temp
7. Set temp->next=NewNode;
8. End
☀ Inserting a node at the specified position of the singly linked
list:
Algorithm:
let *head be the pointer to first node in the current list
1. Create a new node using malloc function
NewNode=(NodeType*)malloc(sizeof(NodeType));
2. Assign data to the info field of new node
NewNode->info=newItem;
3. Enter the position at which you want to insert the new node. Let this position be pos.
4. Set temp=head;
5. if (head ==NULL)then
printf(“void insertion”); and exit(1).
6. for(i=1; i<pos-1; i++)
temp=temp->next;
7. Set NewNode->next=temp->next;
Set temp->next=NewNode;
8. End
Deleting Nodes:
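As a minimal sketch of deletion, the function below removes the first node of a singly linked list (the struct fields match the display routine in these notes; the function name is illustrative). Deleting at other positions follows the same pattern of re-linking and then freeing.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Delete the first node; returns the new head (NULL if the list is empty) */
struct node *delete_beginning(struct node *head) {
    if (head == NULL)
        return NULL;              /* nothing to delete */
    struct node *ptr = head;      /* remember the old first node */
    head = head->next;            /* second node becomes the head */
    free(ptr);                    /* release the removed node */
    return head;
}
```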
Displaying the List:
void display()
{
    struct node *ptr;
    ptr = head;
    if(ptr == NULL)
    {
        printf("Nothing to print");
    }
    else
    {
        printf("\nprinting values . . . . .\n");
        while(ptr != NULL)
        {
            printf("\n%d", ptr->data);
            ptr = ptr->next;
        }
    }
}
NOTE: For program code you can refer 1st program in the lab
syllabus
Doubly linked list is a complex type of linked list in which a node contains
a pointer to the previous as well as the next node in the sequence.
Therefore, in a doubly linked list, a node consists of three parts: node
data, pointer to the next node in sequence (next pointer) , pointer to the
previous node (previous pointer). A sample node in a doubly linked list is
shown in the figure.
1. Insertion at beginning: Adding the node into the linked list at the beginning.
2. Insertion at end: Adding the node into the linked list at the end.
3. Insertion after specified node: Adding the node into the linked list after the specified node.
4. Deletion at beginning: Removing the node from the beginning of the list.
5. Deletion at the end: Removing the node from the end of the list.
6. Deletion of the node having given data: Removing the node which is present just after the node containing the given data.
Node Creation:
struct node
{
struct node *prev;
int data;
struct node *next;
};
struct node *head;
Begin:
    alloc(head)
    If (head == NULL) then
        write ("unable to allocate memory")
    Else
        read(data)
        head.data ← data
        head.prev ← NULL
        head.next ← NULL
        last ← head
    End else
End
Algorithm
1. START
2. Create a new node with three variables: prev, data, next.
3. Store the new data in the data variable
4. If the list is empty, make the new node as head.
5. Otherwise, link the address of the existing first node to the next
variable of the new node, and assign null to the prev variable.
6. Point the head to the new node.
7. END
The head pointer points to the first node of the doubly linked list, and the
previous pointer of the first node points to Null. To insert a node at the
beginning of the Linked List, the head pointer should point to the new first
node, and the next pointer of the new first node must point to the
previous first node.
Algorithm:
Step 1: IF ptr = NULL
    Write OVERFLOW
    Go to Step 9
[END OF IF]

The corresponding C function first allocates the node and then links it in:

void insertion_beginning()
{
    struct node *ptr;
    int item;
    ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW\n");
    }
    else
    {
        printf("\nEnter item value: ");
        scanf("%d", &item);
        if(head == NULL)
        {
            ptr->next = NULL;
            ptr->prev = NULL;
            ptr->data = item;
            head = ptr;
        }
        else
        {
            ptr->data = item;
            ptr->prev = NULL;
            ptr->next = head;
            head->prev = ptr;
            head = ptr;
        }
    }
}
2. The previous pointer of the new node should point to the old last
node.
3. The next pointer of the old last node should point to the new last
node.
Also, there may be a case where the DLL is initially empty. In that case,
the newly created node will become both the first and the last node of the
doubly linked list.
The code for the insertion of a new node as the last node in Java is given
below:
// To insert a node at the end of a Doubly Linked List
public void insertAtLast(int data) {
// Creating a new node with the given data
Node newNode = new Node(data);
/*
If DLL is empty then this node will be both the first as
well as the last node
*/
if (head == null) {
    newNode.prev = null;
    newNode.next = null;
    head = newNode;
    return;
}
/*
If DLL is not empty, then traverse till the end of the DLL.
Make the next pointer of the original last node point to the
new last node, and the previous pointer of the new last node
point to the original last node
*/
Node last = head;
while (last.next != null)
    last = last.next;
last.next = newNode;
newNode.prev = last;
newNode.next = null;
}
2. The previous node’s next pointer should be linked to the new node,
and the new node’s previous pointer should be linked to the previous
node.
The code for insertion of a new node after a given previous node in Java
is given below.
public void insertAfter(Node prevNode, int data) {
// if the previous node is null
if (prevNode == null) {
System.out.println("The given previous node cannot be null");
return;
}
// Creating the new node with the given data
Node newNode = new Node(data);
/*
The next pointer of this node should point to
the next of prevNode
*/
newNode.next = prevNode.next;
// The next pointer of prevNode should point to newNode
prevNode.next = newNode;
/*
The previous pointer of newNode should point to the
prevNode
*/
newNode.prev = prevNode;
// If newNode is not the last node, fix the previous pointer of its successor
if (newNode.next != null)
    newNode.next.prev = newNode;
}
1. ptr = head;
2. head = head → next;
Now make the prev of this new head node point to NULL. This will be done
by using the following statement.
3. head → prev = NULL;
Finally, free the old head node:
4. free(ptr);
Algorithm
Step 1: IF HEAD = NULL
    WRITE UNDERFLOW
    GOTO STEP 6
[END OF IF]
void beginning_delete()
{
struct node *ptr;
if(head == NULL)
{
printf("\n UNDERFLOW\n");
}
else if(head->next == NULL)
{
    free(head);
    head = NULL;
    printf("\nNode Deleted\n");
}
else
{
ptr = head;
head = head -> next;
head -> prev = NULL;
free(ptr);
printf("\nNode Deleted\n");
}
}
void display()
{
    struct node *ptr;
    printf("\n printing values...\n");
    ptr = head;
    while(ptr != NULL)
    {
        printf("%d\n", ptr->data);
        ptr = ptr->next;
    }
}
In a circular Singly linked list, the last node of the list contains a pointer
to the first node of the list. We can have circular singly linked list as well
as circular doubly linked list.
We traverse a circular singly linked list until we reach the same node
where we started. The circular singly linked list has no beginning and no
end. There is no NULL value present in the next part of any of the
nodes.
Insertion
1. Insertion at beginning: Adding a node into the circular singly linked list at the beginning.
2. Insertion at the end: Adding a node into the circular singly linked list at the end.

Deletion and related operations
1. Deletion at beginning: Removing the node from the circular singly linked list at the beginning.
2. Deletion at the end: Removing the node from the circular singly linked list at the end.
3. Searching: Compare each element of the node with the given item and return the location at which the item is present in the list, otherwise return null.
4. Traversing: Visiting each element of the list at least once in order to perform some specific operation.
There are two scenarios in which a node can be inserted into a circular singly
linked list at the beginning: either the node is inserted into an empty list, or
the node is inserted into an already populated list.
Firstly, allocate the memory space for the new node by using the malloc
method of C language.
In the first scenario, the condition head == NULL will be true. Since, the
list in which, we are inserting the node is a circular singly linked list,
therefore the only node of the list (which is just inserted into the list) will
point to itself only. We also need to make the head pointer point to this
node. This will be done by using the following statements.
if(head == NULL)
{
head = ptr;
ptr -> next = head;
}
In the second scenario, the condition head == NULL will become false
which means that the list contains at least one node. In this case, we
need to traverse the list in order to reach the last node of the list. This
will be done by using the following statement.
temp = head;
while(temp->next != head)
temp = temp->next;
At the end of the loop, the pointer temp points to the last node of
the list. Since, in a circular singly linked list, the last node
contains a pointer to the first node, we need to link both the last node and
the new node to the head: the new node ptr which is being inserted will become the
new head node, so its next pointer must point to the existing head, and the
next pointer of temp must point to ptr.

First, the next pointer of ptr is made to point to the existing head node of the list.
1. ptr->next = head;
Then the next pointer of the last node temp is made to point to the new node.
2. temp->next = ptr;
Now, make the new node ptr the new head node of the circular singly
linked list.
3. head = ptr;
In this way, the node ptr has been inserted into the circular singly linked
list at the beginning.
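Combining both scenarios, the whole insertion can be sketched as one C function. The statements above use a global head; in this minimal sketch the head is passed in and returned instead, so the function is self-contained (names are illustrative).

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Insert item at the beginning of a circular singly linked list;
   returns the (possibly new) head */
struct node *insert_beginning(struct node *head, int item) {
    struct node *ptr = malloc(sizeof(struct node));
    if (ptr == NULL)
        return head;                  /* OVERFLOW: list unchanged */
    ptr->data = item;
    if (head == NULL) {               /* scenario 1: empty list */
        head = ptr;
        ptr->next = head;             /* the only node points to itself */
    } else {                          /* scenario 2: non-empty list */
        struct node *temp = head;
        while (temp->next != head)    /* traverse to the last node */
            temp = temp->next;
        ptr->next = head;             /* new node points to the old head */
        temp->next = ptr;             /* last node points to the new head */
        head = ptr;
    }
    return head;
}
```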
There are three scenarios of deleting a node from circular singly linked list
at beginning.
If the list is empty then the condition head == NULL will become true, in
this case, we just need to print underflow on the screen and make exit.
if(head == NULL)
{
printf("\nUNDERFLOW");
return;
}
If the list contains single node then, the condition head → next ==
head will become true. In this case, we need to delete the entire list and
make the head pointer free. This will be done by using the following
statements.
if(head->next == head)
{
head = NULL;
free(head);
}
If the list contains more than one node then, in that case, we need to
traverse the list by using the pointer ptr to reach the last node of the list.
This will be done by using the following statements.
ptr = head;
while(ptr -> next != head)
ptr = ptr -> next;
At the end of the loop, the pointer ptr points to the last node of the list.
Since the last node of the list points to the head node, and the head node is
about to be deleted, the last node must now be changed to point to the node
after the head.
1. ptr->next = head->next;
Now, free the head pointer by using the free() method in C language.
1. free(head);
Make the node pointed by the next of the last node, the new head of the
list.
1. head = ptr->next;
In this way, the node will be deleted from the circular singly linked list
from the beginning.
Putting the three scenarios together, the complete function for deleting a
node at the beginning of a circular singly linked list is:

void deletion_beginning()
{
    struct node *ptr;
    if(head == NULL)
    {
        printf("\nUNDERFLOW");
    }
    else if(head->next == head)
    {
        free(head);
        head = NULL;
        printf("\nNode Deleted\n");
    }
    else
    {
        ptr = head;
        while(ptr->next != head)
            ptr = ptr->next;
        ptr->next = head->next;
        free(head);
        head = ptr->next;
        printf("\nNode Deleted\n");
    }
}
Display the nodes in circular singly linked list
void display()
{
    struct node *ptr;
    ptr = head;
    if(head == NULL)
    {
        printf("\nnothing to print");
    }
    else
    {
        printf("\n printing values ... \n");
        while(ptr->next != head)
        {
            printf("%d\n", ptr->data);
            ptr = ptr->next;
        }
        printf("%d\n", ptr->data);   /* print the last node */
    }
}
UNIT II STACK & QUEUE
Stack is a linear data structure that follows a particular order in which the
operations are performed. The order may be LIFO(Last In First Out) or
FILO(First In Last Out). LIFO implies that the element that is inserted last,
comes out first and FILO implies that the element that is inserted first,
comes out last.
Working of Stack
Stack works on the LIFO pattern. As we can observe in the below figure
there are five memory blocks in the stack; therefore, the size of the stack
is 5.
Suppose we want to store the elements in a stack and let's assume that
stack is empty. We have taken the stack of size 5 as shown below in
which we are pushing the elements one by one until the stack becomes
full.
Applications of Stack
Expression conversion: A stack can be used to convert one form of an expression to another, e.g.:
Infix to prefix
Infix to postfix
Prefix to infix
Prefix to postfix
Postfix to infix
Memory management: The stack also manages memory. Memory for local
variables is assigned in contiguous blocks, known as stack memory, because
all the variables of a function call are assigned on the call stack.
#include <stdio.h>

#define MAX_SIZE 100

char stack[MAX_SIZE];
int top = -1;

void push(char c) {
    if (top < MAX_SIZE - 1)
        stack[++top] = c;
}

char pop() {
    if (top == -1) {
        printf("Empty stack!\n");
        return ' ';
    }
    char data = stack[top];
    top--;
    return data;
}

/* An expression is balanced if every '(' has a matching ')' */
int isBalanced(char *text) {
    top = -1;
    for (int i = 0; text[i] != '\0'; i++) {
        if (text[i] == '(')
            push(text[i]);
        else if (text[i] == ')') {
            if (top == -1)
                return 0;
            pop();
        }
    }
    return top == -1;
}

int main() {
    char text[MAX_SIZE];
    printf("Input an expression in parentheses: ");
    scanf("%s", text);
    if (isBalanced(text)) {
        printf("The expression is balanced.\n");
    } else {
        printf("The expression is not balanced.\n");
    }
    return 0;
}
Here, we will use the stack data structure for the conversion of infix
expression to postfix expression.
The infix expression to be converted is K + L - M * N + (O ^ P) * W / U / V * T + Q. Reading it symbol by symbol:

Symbol | Stack | Postfix Expression
K      |       | K
+      | +     | K
L      | +     | KL
-      | -     | KL+
M      | -     | KL+M
*      | -*    | KL+M
N      | -*    | KL+MN
+      | +     | KL+MN*-
(      | +(    | KL+MN*-
O      | +(    | KL+MN*-O
^      | +(^   | KL+MN*-O
P      | +(^   | KL+MN*-OP
)      | +     | KL+MN*-OP^
*      | +*    | KL+MN*-OP^
W      | +*    | KL+MN*-OP^W
/      | +/    | KL+MN*-OP^W*
U      | +/    | KL+MN*-OP^W*U
/      | +/    | KL+MN*-OP^W*U/
V      | +/    | KL+MN*-OP^W*U/V
*      | +*    | KL+MN*-OP^W*U/V/
T      | +*    | KL+MN*-OP^W*U/V/T
+      | +     | KL+MN*-OP^W*U/V/T*+
Q      | +     | KL+MN*-OP^W*U/V/T*+Q
(end)  |       | KL+MN*-OP^W*U/V/T*+Q+

Postfix notation: KL+MN*-OP^W*U/V/T*+Q+
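The conversion walked through above can be sketched in C with a character stack. This is a minimal, illustrative version: operands are single characters, and '^' is handled with simple precedence popping (sufficient for this example, though a full converter would treat '^' as right-associative).

```c
#include <ctype.h>
#include <string.h>

static char stack[100];
static int top = -1;

/* Higher number = higher precedence */
static int prec(char op) {
    switch (op) {
    case '^': return 3;
    case '*': case '/': return 2;
    case '+': case '-': return 1;
    default:  return 0;      /* '(' has lowest precedence on the stack */
    }
}

/* Convert an infix string of single-character operands to postfix */
void infix_to_postfix(const char *infix, char *out) {
    int k = 0;
    top = -1;
    for (int i = 0; infix[i] != '\0'; i++) {
        char c = infix[i];
        if (isalnum((unsigned char)c)) {
            out[k++] = c;                      /* operands go straight out */
        } else if (c == '(') {
            stack[++top] = c;
        } else if (c == ')') {
            while (top >= 0 && stack[top] != '(')
                out[k++] = stack[top--];       /* pop until '(' */
            top--;                             /* discard the '(' */
        } else {                               /* operator */
            while (top >= 0 && prec(stack[top]) >= prec(c))
                out[k++] = stack[top--];       /* pop higher/equal precedence */
            stack[++top] = c;
        }
    }
    while (top >= 0)
        out[k++] = stack[top--];               /* flush remaining operators */
    out[k] = '\0';
}
```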
For example, people waiting in line for a rail ticket form a queue.
Applications of Queue
Since a queue performs actions on a first in, first out basis, which is a fair
ordering of actions, it has many applications, discussed below.
1. Queues are widely used as waiting lists for a single shared resource
like printer, disk, CPU.
2. Queues are used in asynchronous transfer of data (where data is
not being transferred at the same rate between two processes) for
eg. pipes, file IO, sockets.
3. Queues are used as buffers in most of the applications like MP3
media player, CD player, etc.
4. Queue are used to maintain the play list in media players in order to
add and remove the songs from the play-list.
5. Queues are used in operating systems for handling interrupts.
In the linked queue, there are two pointers maintained in the memory i.e.
front pointer and rear pointer. The front pointer contains the address of
the starting element of the queue while the rear pointer contains the
address of the last element of the queue.
Insertion and deletions are performed at rear and front end respectively.
If front and rear both are NULL, it indicates that the queue is empty.
#include<stdio.h>
#include<stdlib.h>
struct node
{
    int data;
    struct node *next;
};
struct node *front;
struct node *rear;
void insert();
void delete();
void display();
void main()
{
    int choice = 0;
    while(choice != 4)
    {
        printf("\n*************************Main Menu*****************************\n");
        printf("\n=================================================================\n");
        printf("\n1.Insert an element\n2.Delete an element\n3.Display the queue\n4.Exit\n");
        printf("\nEnter your choice ?");
        scanf("%d", &choice);
        switch(choice)
        {
            case 1:
                insert();
                break;
            case 2:
                delete();
                break;
            case 3:
                display();
                break;
            case 4:
                exit(0);
                break;
            default:
                printf("\nEnter valid choice??\n");
        }
    }
}
void insert()
{
    struct node *ptr;
    int item;
    ptr = (struct node *) malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW\n");
        return;
    }
    printf("\nEnter value?\n");
    scanf("%d", &item);
    ptr->data = item;
    ptr->next = NULL;
    if(front == NULL)
    {
        front = ptr;
        rear = ptr;
    }
    else
    {
        rear->next = ptr;
        rear = ptr;
    }
}
There are four different types of queue that are listed as follows -
Priority Queue
Insertion in priority queue takes place based on the arrival, while deletion
in the priority queue occurs based on the priority. Priority queue is mainly
used to implement the CPU scheduling algorithms.
There are two types of priority queue that are discussed as follows -
The priority queue can be implemented in four ways that include arrays,
linked list, heap data structure and binary search tree. The heap data
structure is the most efficient way of implementing the priority queue.
What is Heap?
o Max heap: The max heap is a heap in which the value of the
parent node is greater than the value of its child nodes.
o Min heap: The min heap is a heap in which the value of the parent
node is less than the value of its child nodes.
Both heaps are binary heaps, as each node has at most two child nodes.
The deque stands for Double Ended Queue. Deque is a linear data
structure where the insertion and deletion operations are performed from
both ends. We can say that deque is a generalized version of the queue.
Types of deque
There are two types of deque: the input restricted deque, in which insertion
is allowed at only one end, and the output restricted deque, in which
deletion is allowed at only one end.
Circular Queue
There was one limitation in the array implementation of a queue: if
the rear reaches the end position of the queue, there might
be vacant spaces left at the beginning which cannot be
utilized. So, to overcome such limitations, the concept of the
circular queue was introduced.
A circular queue is similar to a linear queue as it is also based on
the FIFO (First In First Out) principle except that the last position is
connected to the first position in a circular queue that forms a
circle. It is also known as a Ring Buffer.
As we can see in the above image, the rear is at the last position of
the queue and the front is pointing somewhere other than the
0th position. In the above array, there are only two elements and the
other three positions are empty. Since the rear is at the last position,
if we try to insert an element it will show that there are no empty
spaces in the queue. One solution to avoid such wastage of memory
space is to shift the elements to the left and adjust the front and rear
ends accordingly, but this is not a practical approach because
shifting all the elements consumes a lot of time. The efficient way to
avoid wasting memory is to use the circular queue data structure.
Dequeue Operation
Step 1: IF FRONT = -1
    Write "UNDERFLOW"
    Goto Step 4
[END OF IF]
Step 2: SET VAL = QUEUE[FRONT]
Step 3: IF FRONT = REAR
    SET FRONT = REAR = -1
ELSE IF FRONT = MAX - 1
    SET FRONT = 0
ELSE
    SET FRONT = FRONT + 1
[END OF IF]
Step 4: EXIT
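The dequeue steps above, together with the matching enqueue, can be sketched in C using an array and modular arithmetic for the wrap-around (MAX and the function names are illustrative):

```c
#define MAX 5

int queue[MAX];
int front = -1, rear = -1;

/* Returns 0 on overflow, 1 on success */
int enqueue(int item) {
    if ((rear + 1) % MAX == front)
        return 0;                     /* OVERFLOW: queue is full */
    if (front == -1)
        front = 0;                    /* first element in an empty queue */
    rear = (rear + 1) % MAX;          /* wrap around past the last slot */
    queue[rear] = item;
    return 1;
}

/* Returns -1 on underflow, otherwise the dequeued item */
int dequeue(void) {
    if (front == -1)
        return -1;                    /* Step 1: UNDERFLOW */
    int val = queue[front];           /* Step 2: VAL = QUEUE[FRONT] */
    if (front == rear)
        front = rear = -1;            /* Step 3: queue becomes empty */
    else
        front = (front + 1) % MAX;    /* Step 3: advance with wrap-around */
    return val;
}
```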
1. List ADT
View of list
2. Stack ADT
View of stack
3. Queue ADT
View of Queue
The queue abstract data type (ADT) follows the basic design of the
stack abstract data type.
Each node contains a void pointer to the data and the link pointer to
the next element in the queue. The program’s responsibility is to
allocate memory for storing the data.
enqueue() – Insert an element at the end of the queue.
dequeue() – Remove and return the first element of the queue, if the
queue is not empty.
peek() – Return the element of the queue without removing it, if the
queue is not empty.
size() – Return the number of elements in the queue.
isEmpty() – Return true if the queue is empty, otherwise return false.
isFull() – Return true if the queue is full, otherwise return false.
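The queue ADT operations listed above can be sketched in C with a linked list, each node holding a void pointer to caller-allocated data as described. This is a minimal, illustrative implementation (struct and function names are assumptions, not a standard API):

```c
#include <stdlib.h>

/* Each node holds a void pointer to caller-allocated data */
struct qnode {
    void *data;
    struct qnode *next;
};

struct queue {
    struct qnode *front, *rear;
    int count;
};

int isEmpty(struct queue *q) { return q->count == 0; }
int size(struct queue *q)    { return q->count; }

/* enqueue(): insert an element at the end of the queue */
void enqueue(struct queue *q, void *data) {
    struct qnode *n = malloc(sizeof(struct qnode));
    n->data = data;
    n->next = NULL;
    if (q->rear == NULL)
        q->front = q->rear = n;       /* first element */
    else {
        q->rear->next = n;
        q->rear = n;
    }
    q->count++;
}

/* dequeue(): remove and return the first element (NULL if empty) */
void *dequeue(struct queue *q) {
    if (q->front == NULL)
        return NULL;
    struct qnode *n = q->front;
    void *data = n->data;
    q->front = n->next;
    if (q->front == NULL)
        q->rear = NULL;               /* queue became empty */
    free(n);
    q->count--;
    return data;
}

/* peek(): return the first element without removing it */
void *peek(struct queue *q) {
    return q->front ? q->front->data : NULL;
}
```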
UNIT III SORTING & HASHING
Sorting Terminologies:
In-place sorting:
An in-place sorting algorithm uses constant space for producing the output
(modifies the given array only). It sorts the list only by modifying the order of
the elements within the list. For example, Insertion Sort and Selection Sorts
are in-place sorting algorithms as they do not use any additional space for
sorting the list.
Types Of Sorting :
1. Internal Sorting
2. External Sorting
Sort Stability :
1. Stable Sort
2. Unstable Sort
Internal Sorting :
When all the data is placed in the main (internal) memory, the
sorting is called internal sorting.
In internal sorting, the input cannot exceed the size of main memory.
Example: heap sort, bubble sort, selection sort, quick sort, shell sort,
insertion sort.
External Sorting :
When all data that needs to be sorted cannot be placed in memory at a time,
the sorting is called external sorting. External Sorting is used for the
massive amount of data.
Merge Sort and its variations are typically used for external sorting.
Some external storage like hard disks and CDs are used for external
sorting.
Example: Merge sort
What is stable sorting?
A sorting algorithm is stable if elements with equal keys appear in the
sorted output in the same relative order as in the input.
Example: merge sort, insertion sort, bubble sort.
Bubble Sort:
Bubble sort is a sorting algorithm that compares two adjacent elements and
swaps them until they are in the intended order. Just like air bubbles in water
rising to the surface, the largest remaining element of the array moves to the
end in each iteration. Therefore, it is called bubble sort.
Suppose we are trying to sort the elements in ascending order.
1. First Iteration (Compare and Swap)
1. Starting from the first index, compare the first and the second elements.
2. If the first element is greater than the second element, they are swapped.
3. Now, compare the second and the third elements. Swap them if they are not in
order.
4. The above process goes on until the last element.
Best: O(n)
Worst: O(n²)
Average: O(n²)
Stability: Yes
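The iterations described above can be sketched in C. This minimal version adds a swapped flag, which is what gives the O(n) best case on an already-sorted array:

```c
/* Bubble sort: repeatedly compare adjacent pairs and swap them when
   out of order; after pass i the last i elements are in final position */
void bubble_sort(int arr[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int swapped = 0;
        for (int j = 0; j < n - 1 - i; j++) {
            if (arr[j] > arr[j + 1]) {
                int tmp = arr[j];      /* swap adjacent elements */
                arr[j] = arr[j + 1];
                arr[j + 1] = tmp;
                swapped = 1;
            }
        }
        if (!swapped)
            break;   /* no swaps in a full pass: array already sorted */
    }
}
```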
Selection Sort
Algorithm
for i = 1 to n - 1
    /* set current element as minimum */
    min = i
    for j = i + 1 to n
        if list[j] < list[min] then
            min = j
        end if
    end for
    /* swap the minimum with the current element */
    if min != i then
        swap list[i] and list[min]
    end if
end for
end procedure
For the first position in the sorted list, the whole list is scanned sequentially.
The first position where 14 is stored presently, we search the whole list and find
that 10 is the lowest value.
So we replace 14 with 10. After one iteration 10, which happens to be the
minimum value in the list, appears in the first position of the sorted list.
For the second position, where 33 is residing, we start scanning the rest of the
list in a linear manner.
We find that 14 is the second lowest value in the list and it should appear at the
second place. We swap these values.
After two iterations, two least values are positioned at the beginning in a sorted
manner.
The same process is applied to the rest of the items in the array.
Best: O(n²)
Worst: O(n²)
Average: O(n²)
Stability: No
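The algorithm above translates directly into C; a minimal, illustrative implementation:

```c
/* Selection sort: scan the unsorted part for the minimum and
   swap it into the next position of the sorted prefix */
void selection_sort(int list[], int n) {
    for (int i = 0; i < n - 1; i++) {
        int min = i;                      /* set current element as minimum */
        for (int j = i + 1; j < n; j++)
            if (list[j] < list[min])
                min = j;
        if (min != i) {                   /* swap the minimum into place */
            int tmp = list[i];
            list[i] = list[min];
            list[min] = tmp;
        }
    }
}
```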
Quick Sort
Quick sort is a highly efficient sorting algorithm. Like merge sort , this
algorithm is also based on the divide and conquer technique and uses the
comparison method. Quick sort is an ideal solution for a large set of data. The
sorting algorithm first divides the array into two sub-arrays by comparing all
elements with a specified value, called the Pivot value.
The two sub-arrays are divided in a way that one of them holds smaller values
than the pivot value, and the other holds greater values than the pivot value.
There are different ways to implement quick sort:
1. Always pick the last element as a pivot (we'll use this in our quick sort in
C example).
2. Always pick the first element as the pivot.
3. Pick median as the pivot.
4. Pick a random element as the pivot.
Best: O(n log n)
Worst: O(n²)
Average: O(n log n)
Stability: No
The worst case occurs when the chosen pivot element lies at an extreme
end of the sorted array. One sub-array is then always empty and the other
contains n - 1 elements, so quicksort recurses on nearly the whole array
at every step.
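Quick sort with the last element as the pivot (option 1 above) can be sketched in C as follows (a minimal, illustrative implementation):

```c
/* Partition around the last element (the pivot): smaller values
   end up on its left, larger values on its right */
static int partition(int arr[], int low, int high) {
    int pivot = arr[high];
    int i = low - 1;                    /* boundary of the "smaller" region */
    for (int j = low; j < high; j++) {
        if (arr[j] < pivot) {
            i++;
            int tmp = arr[i]; arr[i] = arr[j]; arr[j] = tmp;
        }
    }
    int tmp = arr[i + 1]; arr[i + 1] = arr[high]; arr[high] = tmp;
    return i + 1;                       /* final position of the pivot */
}

void quick_sort(int arr[], int low, int high) {
    if (low < high) {
        int p = partition(arr, low, high);  /* pivot is now in place */
        quick_sort(arr, low, p - 1);        /* sort the smaller side */
        quick_sort(arr, p + 1, high);       /* sort the larger side */
    }
}
```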
Merge Sort
Merge Sort is a Divide and Conquer algorithm. It divides the input array into
two halves, calls itself for the two halves, and then it merges the two sorted
halves. The merge() function is used for merging two halves. The merge(arr,
l, m, r) is a key process that assumes that arr[l..m] and arr[m+1..r] are sorted
and merges the two sorted sub-arrays into one.
Algorithm:
Step 1: Start
Step 2: Declare an array and left, right, mid variable
Step 3: Perform merge function.
mergesort(array, left, right)
    if left >= right
        return
    mid = (left + right) / 2
    mergesort(array, left, mid)
    mergesort(array, mid+1, right)
    merge(array, left, mid, right)
Step 4: Stop
Here, a problem is divided into multiple sub-problems. Each sub-problem is
solved individually. Finally, sub-problems are combined to form the final
solution.
Merge Sort Complexity
Time Complexity
Best: O(n log n)
Worst: O(n log n)
Average: O(n log n)
Stability: Yes
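The algorithm above can be sketched in C. This is a minimal version; merge() assumes arr[l..m] and arr[m+1..r] are already sorted, exactly as described:

```c
#include <string.h>

/* merge(arr, l, m, r): merge sorted arr[l..m] and arr[m+1..r] */
static void merge(int arr[], int l, int m, int r) {
    int tmp[r - l + 1];
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r)           /* take the smaller head each time; */
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];  /* <= keeps it stable */
    while (i <= m) tmp[k++] = arr[i++];    /* copy any leftovers */
    while (j <= r) tmp[k++] = arr[j++];
    memcpy(arr + l, tmp, k * sizeof(int));
}

void merge_sort(int arr[], int l, int r) {
    if (l >= r)
        return;                        /* zero or one element: sorted */
    int m = (l + r) / 2;
    merge_sort(arr, l, m);
    merge_sort(arr, m + 1, r);
    merge(arr, l, m, r);
}
```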
Insertion Sort
Example:
To understand the working of the insertion sort algorithm, let's take an unsorted
array. It will be easier to understand the insertion sort via an example.
Here, 25 is smaller than 31. So, 31 is not at correct position. Now, swap 31 with
25. Along with swapping, insertion sort will also check it with all elements in
the sorted array.
For now, the sorted array has only one element, i.e. 12. So, 25 is greater than
12. Hence, the sorted array remains sorted after swapping.
Now, two elements in the sorted array are 12 and 25. Move forward to the next
elements that are 31 and 8.
Now, the sorted array has three items that are 8, 12 and 25. Move to the next
items that are 31 and 32.
Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.
Best: O(n)
Worst: O(n²)
Average: O(n²)
Stability: Yes
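The walkthrough above can be sketched in C. Instead of repeated pairwise swaps, this minimal version shifts larger elements right and drops the key into place, which is the usual equivalent formulation:

```c
/* Insertion sort: take each element and slide it left
   into its position within the already-sorted prefix */
void insertion_sort(int arr[], int n) {
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];   /* shift larger elements right */
            j--;
        }
        arr[j + 1] = key;          /* insert into the sorted prefix */
    }
}
```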
Heap sort
In heap sort, basically, there are two phases involved in the sorting of elements.
By using the heap sort algorithm, they are as follows -
o The first step includes the creation of a heap by adjusting the elements of
the array.
o After the creation of the heap, repeatedly remove the root element of the
heap by shifting it to the end of the array, and then restore the heap
structure for the remaining elements.
What is Heapify?
It is a process of creating a data structure called a heap from that of a binary tree
using a data structure array.
Now let's see the working of heap sort in detail by using an example. To
understand it more clearly, let's take an unsorted array and try to sort it using
heap sort.
First, we have to construct a heap from the given array and convert it into
max heap.
After converting the given heap into max heap, the array elements are -
Next, we have to delete the root element (89) from the max heap. To
delete this node, we have to swap it with the last node, i.e. (11). After
deleting the root element, we again have to heapify it to convert it into
max heap.
After swapping the array element 89 with 11, and converting the heap
into max-heap, the elements of array are -
In the next step, again, we have to delete the root element (81) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (54). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 81 with 54 and converting the heap into
max-heap, the elements of array are -
In the next step, we have to delete the root element (76) from the max
heap again. To delete this node, we have to swap it with the last node,
i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 76 with 9 and converting the heap into
max-heap, the elements of array are -
In the next step, again we have to delete the root element (54) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (14). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 54 with 14 and converting the heap into
max-heap, the elements of array are -
In the next step, again we have to delete the root element (22) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (11). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 22 with 11 and converting the heap into
max-heap, the elements of array are -
In the next step, again we have to delete the root element (14) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 14 with 9 and converting the heap into
max-heap, the elements of array are -
In the next step, again we have to delete the root element (11) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 11 with 9, the elements of array are -
Now, heap has only one element left. After deleting it, heap will be
empty.
Best: O(n log n)
Worst: O(n log n)
Average: O(n log n)
Stability: No
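The two phases described above, building a max heap and then repeatedly swapping the root to the end and re-heapifying, can be sketched in C (a minimal, illustrative implementation):

```c
/* Sift arr[i] down so the subtree rooted at i is a max heap;
   n is the current heap size */
static void heapify(int arr[], int n, int i) {
    int largest = i;
    int l = 2 * i + 1, r = 2 * i + 2;           /* children of node i */
    if (l < n && arr[l] > arr[largest]) largest = l;
    if (r < n && arr[r] > arr[largest]) largest = r;
    if (largest != i) {
        int tmp = arr[i]; arr[i] = arr[largest]; arr[largest] = tmp;
        heapify(arr, n, largest);               /* continue sifting down */
    }
}

void heap_sort(int arr[], int n) {
    /* Phase 1: build a max heap from the array */
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);
    /* Phase 2: repeatedly move the root (maximum) to the end */
    for (int i = n - 1; i > 0; i--) {
        int tmp = arr[0]; arr[0] = arr[i]; arr[i] = tmp;
        heapify(arr, i, 0);     /* restore the heap on the shrunken prefix */
    }
}
```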
Radix Sort
Radix Sort is a linear sorting algorithm that sorts elements by processing them
digit by digit. It is an efficient sorting algorithm for integers or strings with
fixed-size keys.
Rather than comparing elements directly, Radix Sort distributes the elements
into buckets based on each digit's value. By repeatedly sorting the elements by
their digits, from the least significant to the most significant, Radix
Sort achieves the final sorted order.
The key idea behind Radix Sort is to exploit the concept of place value. It
assumes that sorting numbers digit by digit will eventually result in a fully
sorted list. Radix Sort can be performed using different variations, such as
Least Significant Digit (LSD) Radix Sort or Most Significant Digit (MSD)
Radix Sort.
Algorithm
radixSort(arr)
  max = largest element in the given array
  d = number of digits in the largest element (or, max)
  Now, create 10 buckets (0 - 9)
  for i -> 1 to d
    sort the array elements using counting sort (or any stable sort)
    according to the digits at the ith place
Step 1: Find the largest element in the array, which is 802. It has three digits,
so we will iterate three times, once for each significant place.
Step 2: Sort the elements based on the unit place digits (X=0). We use a stable
sorting technique, such as counting sort, to sort the digits at each significant
place.
Sorting based on the unit place:
Perform counting sort on the array based on the unit place digits.
The sorted array based on the unit place is [170, 90, 802, 2, 24, 45, 75, 66].
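The digit-by-digit passes can be sketched in Python with counting sort as the stable subroutine (a minimal illustration using the example array; the helper name counting_sort_by_digit is our own):

```python
def counting_sort_by_digit(arr, exp):
    """Stable counting sort of arr on the digit at place value exp."""
    output = [0] * len(arr)
    count = [0] * 10
    for num in arr:
        count[(num // exp) % 10] += 1
    for d in range(1, 10):            # prefix sums give final positions
        count[d] += count[d - 1]
    for num in reversed(arr):         # reversed traversal keeps the sort stable
        digit = (num // exp) % 10
        count[digit] -= 1
        output[count[digit]] = num
    return output

def radix_sort(arr):
    if not arr:
        return arr
    exp = 1
    while max(arr) // exp > 0:        # one pass per digit of the maximum
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10
    return arr

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

After the first (unit-place) pass the intermediate array is [170, 90, 802, 2, 24, 45, 75, 66], matching the example above.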
Best O(n+k)
Worst O(n+k)
Average O(n+k)
Stability Yes
Shell Sort:
Shell sort is the generalization of insertion sort, which overcomes the drawbacks
of insertion sort by comparing elements separated by a gap of several positions.
In insertion sort, at a time, elements can be moved ahead by one position only.
To move an element to a far-away position, many movements are required that
increase the algorithm's execution time. But shell sort overcomes this drawback
of insertion sort. It allows the movement and swapping of far-away elements as
well.
Algorithm:
Step 1 − Start
Step 2 − Initialize the value of gap size. Example: h
Step 3 − Divide the list into smaller sub-part. Each must have equal intervals to
h
Step 4 − Sort these sub-lists using insertion sort
Step 5 – Reduce the gap and repeat steps 3 and 4 until the gap becomes 1 and
the list is sorted.
Step 6 – Print a sorted list.
Step 7 – Stop.
This algorithm uses insertion sort on widely spread elements first to sort
them, and then sorts the less widely spaced elements. This spacing is termed
as interval. This interval can be calculated based on Knuth's formula as −
Knuth's Formula
h = h * 3 + 1
where h is the interval with initial value 1
This algorithm is quite efficient for medium-sized data sets. Its average and
worst-case complexity depend on the gap sequence; with Knuth's sequence the
worst case is O(n^(3/2)), where n is the number of items. The worst-case space
complexity is O(1), since shell sort sorts the array in place.
Let us consider the following example to have an idea of how shell sort works.
We take the same array we have used in our previous examples. For our
example and ease of understanding, we take the interval of 4. Make a virtual
sub-list of all values located at the interval of 4 positions. Here these values are
{35, 14}, {33, 19}, {42, 27} and {10, 44}
We compare values in each sub-list and swap them (if necessary) in the original
array. After this step, the new array should look like this −
Then, we take an interval of 2, and this gap generates two sub-lists - {14, 27,
35, 42} and {19, 10, 33, 44}
We compare and swap the values, if required, in the original array. After this
step, the array should look like this −
Finally, we sort the rest of the array using interval of value 1. Shell sort uses
insertion sort to sort the array.
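A minimal Python sketch of these gapped passes (illustrative; for simplicity this version halves the gap each pass rather than using Knuth's sequence, and starts from the example array):

```python
def shell_sort(arr):
    n = len(arr)
    gap = n // 2                     # simple gap sequence: n/2, n/4, ..., 1
    while gap > 0:
        # Gapped insertion sort: each pass sorts the virtual sub-lists of
        # elements that are `gap` positions apart.
        for i in range(gap, n):
            temp = arr[i]
            j = i
            while j >= gap and arr[j - gap] > temp:
                arr[j] = arr[j - gap]
                j -= gap
            arr[j] = temp
        gap //= 2
    return arr

print(shell_sort([35, 33, 42, 10, 14, 19, 27, 44]))
```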
Best O(n log n)
Worst O(n²)
Average O(n log n)
Stability No
Comparison of sorting methods
A sorting algorithm is in-place if the algorithm does not use extra space
for manipulating the input, apart from a small, constant amount of working
space. Or we can say, a sorting algorithm sorts in-place if only a constant
number of elements of the input array are ever stored outside the array.
A sorting algorithm is stable if it does not change the relative order of
elements with the same value.
Hashing:
Dictionaries:
Dictionary is one of the important data structures that is usually used to store
data in the key-value format. Each element present in a dictionary data
structure compulsorily has a key, and some value is associated with that
particular key. In other words, we can also say that the dictionary data structure
is used to store the data in key-value pairs. Other names for the dictionary data
structure are associative array, map, and symbol table, but broadly it is referred
to as dictionary.
Example:
o Add or Insert: In the Add or Insert operation, a new pair of keys and
values is added in the Dictionary or associative array object.
o Replace or reassign: In the Replace or reassign operation, the already
existing value that is associated with a key is changed or modified. In
other words, a new value is mapped to an already existing key.
o Delete or remove: In the Delete or remove operation, the already present
element is unmapped from the Dictionary or associative array object.
o Find or Lookup: In the Find or Lookup operation, the value associated
with a key is searched by passing the key as a search argument.
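In Python, these four operations map directly onto the built-in dict type (a short illustration; the phone_book example and its values are our own):

```python
phone_book = {}                      # an empty dictionary

phone_book["alice"] = 1234           # Add or Insert: a new key-value pair
phone_book["bob"] = 5678
phone_book["alice"] = 4321           # Replace or reassign: new value for an existing key
del phone_book["bob"]                # Delete or remove: unmap the key
number = phone_book.get("alice")     # Find or Lookup: search by key

print(number, "bob" in phone_book)
```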
HashTable Representation
Hash tables are a type of data structure in which the address or the index value
of the data element is generated from a hash function. That makes accessing the
data faster as the index value behaves as a key for the data value. In other words
Hash table stores key-value pairs but the key is generated through a hashing
function.
So the search and insertion function of a data element becomes much faster as
the key values themselves become the index of the array which stores the data.
A hash function is a function that can map a piece of data of any length to a
fixed-length value, called hash.
They are fast to compute: calculating the hash of a piece of data has to be
a fast operation.
They are deterministic: the same string will always produce the same
hash.
They produce fixed-length values: it doesn't matter if your input is one,
ten, or ten thousand bytes, the resulting hash will always be of a fixed,
predetermined length.
Another characteristic that is quite common in hash functions is that they often
are one-way functions: thanks to a voluntary data loss implemented in the
function, you can get a hash from a string but you can't get the original string
from a hash. This is not a mandatory feature for every hash function but
becomes important when they have to be cryptographically secure.
Hashing
Hashing is a searching technique that takes constant time: the time
complexity of hashing is O(1). Till now, we have read two techniques for
searching, i.e., linear search and binary search. The worst-case time complexity
is O(n) in linear search and O(log n) in binary search. In both searching
techniques, the search time depends upon the number of elements, but we want
a technique that takes constant time. So, the hashing technique was introduced
to provide a constant-time search.
In Hashing technique, the hash table and hash function are used. Using the hash
function, we can calculate the address at which the value can be stored.
The main idea behind hashing is to create (key, value) pairs. If the key is
given, then the algorithm computes the index at which the value would be
stored. It can be written as: index = h(key). Some commonly used hash
functions are:
o Division method
o Folding method
o Mid square method
h(ki) = ki % m;
For example, if the key value is 6 and the size of the hash table is 10. When we
apply the hash function to key 6 then the index would be:
h(6) = 6%10 = 6
When two different keys hash to the same index, a problem occurs between
the two values, known as a collision. In the above example, the value is
stored at index 6. If the key value is 26, then the index would be:
h(26) = 26 % 10 = 6
Therefore, two values are stored at the same index, i.e., 6, and this leads to the
collision problem. To resolve these collisions, we have some techniques known
as collision resolution techniques.
Open Hashing
The first collision resolution or handling technique, "Open Hashing", is
popularly known as Separate Chaining. In this technique, each slot of the
hash table points to a linked list known as a chain. It is one of the techniques
most used by programmers to handle collisions. When a number of elements
are hashed into the index of a single slot, they are inserted into a singly-
linked list. This singly-linked list is the linked list which we refer to as a chain
in the Open Hashing technique.
All key-value pairs mapping to the same index will be stored in the linked list of
that index.
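A minimal sketch of separate chaining in Python, assuming a division-method hash and Python lists standing in for the singly-linked chains (the class and method names are our own):

```python
class ChainedHashTable:
    """Open hashing (separate chaining): each slot holds a chain of
    (key, value) pairs; here the chain is a Python list for brevity."""
    def __init__(self, size=10):
        self.size = size
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        return key % self.size          # division-method hash function

    def put(self, key, value):
        chain = self.slots[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                # key already present: replace value
                chain[i] = (key, value)
                return
        chain.append((key, value))      # new key (or collision): append to chain

    def get(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None

table = ChainedHashTable(10)
table.put(6, "A")
table.put(26, "B")                      # 26 % 10 == 6: same chain as key 6
print(table.get(6), table.get(26))
```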
Disadvantages:
The cache performance of chaining is not good as keys are stored using a
linked list. Open addressing provides better cache performance as
everything is stored in the same table.
Wastage of Space (Some Parts of the hash table are never used)
If the chain becomes long, then search time can become O(n) in the worst
case
Uses extra space for links
Closed Hashing:
The second collision resolution technique, Closed Hashing, also known as
Open Addressing, is another way of dealing with collisions. In Open
Addressing, the hash table alone stores all of its elements. The size of the table
should always be greater than or equal to the total number of keys at all times
(we can also increase the size of the table by copying over the old data
whenever it is needed). This mechanism is referred to as Closed Hashing. The
process of searching for the next available slot in the table is called probing.
Several techniques to perform Implementation of Closed Hashing:
1. Linear Probing: In linear probing, the hash table is examined sequentially,
starting from the initial position given by the hash. If the slot that is
obtained after the calculation is already occupied, then we look for the next
one. The function that is responsible for performing rehashing is " key
= rehash(n+1) % table-size ". The interval between two probes or positions is
generally 1.
Let us see linear probing for a slot index hash(a), which is computed using a
hash function. Linear probing has the best cache performance among the
probing techniques.
1. Insert( k ): Keep probing until an empty slot is found. Place the key " k
" in the first empty slot you find.
2. Search( k ): Probe each slot until the slot's key equals k or until an
empty slot is found.
3. Delete( k ): Deletion is interesting. If we just remove a key and then
perform a search operation, the search may fail. Therefore, slots of
deleted keys are marked specially as "deleted" rather than emptied.
Let us consider a simple hash function as "key mod 7" and a sequence of keys
as 50, 700, 76, 85, 92, 73, 101,
which means hash(key) = key % S, where S = size of the table = 7, indexed
from 0 to 6. We can define the hash function as per our choice when creating a
hash table, although it is often fixed internally with a pre-defined formula.
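The insertion of this key sequence can be reproduced with a short linear-probing sketch in Python (illustrative; the helper name linear_probe_insert is our own, and deletion marking is omitted):

```python
def linear_probe_insert(table, key):
    """Insert key into an open-addressed table using linear probing.
    Assumes the table has at least one free (None) slot."""
    size = len(table)
    i = key % size                   # hash(key) = key % S
    while table[i] is not None:      # on collision, probe the next slot
        i = (i + 1) % size
    table[i] = key

table = [None] * 7
for key in [50, 700, 76, 85, 92, 73, 101]:
    linear_probe_insert(table, key)
print(table)
```

The final table places 700 at slot 0, 50 at 1, 85 at 2, 92 at 3, 73 at 4, 101 at 5 and 76 at 6.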
Applications of linear probing:
Linear probing is a collision handling technique used in hashing, where the
algorithm looks for the next available slot in the hash table to store the
collided key. Some of the applications of linear probing include:
Symbol tables: Linear probing is commonly used in symbol tables, which
are used in compilers and interpreters to store variables and their associated
values. Since symbol tables can grow dynamically, linear probing can be
used to handle collisions and ensure that variables are stored efficiently.
Caching: Linear probing can be used in caching systems to store frequently
accessed data in memory. When a cache miss occurs, the data can be
loaded into the cache using linear probing, and when a collision occurs, the
next available slot in the cache can be used to store the data.
Databases: Linear probing can be used in databases to store records and
their associated keys. When a collision occurs, linear probing can be used
to find the next available slot to store the record.
Compiler design: Linear probing can be used in compiler design to
implement symbol tables, error recovery mechanisms, and syntax analysis.
Spell checking: Linear probing can be used in spell-checking software to
store the dictionary of words and their associated frequency counts. When a
collision occurs, linear probing can be used to store the word in the next
available slot.
Overall, linear probing is a simple and efficient method for handling collisions
in hash tables, and it can be used in a variety of applications that require
efficient storage and retrieval of data.
Challenges in Linear Probing :
Primary Clustering: One of the problems with linear probing is primary
clustering: many consecutive elements form groups, and it starts taking
longer to find a free slot or to search for an element.
Secondary Clustering: Secondary clustering is less severe; two records
only have the same collision chain (probe sequence) if their initial position
is the same.
2. Quadratic Probing: In quadratic probing, the interval between probes
grows quadratically (1, 4, 9, ...). It avoids primary clustering but still
suffers from secondary clustering: two keys follow the same probe
sequence whenever they hash to the same initial position.
3. Double hashing
The intervals that lie between probes are computed by another hash function.
Double hashing is a technique that reduces clustering in an optimized way. In
this technique, the increments for the probing sequence are computed by using
another hash function. We use another hash function hash2(x) and look for the
i*hash2(x) slot in the ith iteration.
let hash(x) be the slot index computed using hash function.
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S
If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S
If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S
Example: Insert the keys 27, 43, 692, 72 into the Hash Table of size 7. where
first hash-function is h1(k) = k mod 7 and second hash-function is h2(k) = 1 +
(k mod 5)
Step 1: Insert 27
27 % 7 = 6, location 6 is empty so insert 27 into 6 slot.
Step 2: Insert 43
43 % 7 = 1, location 1 is empty so insert 43 into 1 slot.
Step 3: Insert 692
692 % 7 = 6, but location 6 is already occupied by 27, so a collision
occurs. Using double hashing:
hnew = [h1(692) + 1 * h2(692)] % 7 = [6 + (1 + 692 % 5)] % 7 = [6 + 3] % 7 = 2
Location 2 is empty, so insert 692 into slot 2.
Step 4: Insert 72
72 % 7 = 2, but location 2 is already being occupied and this is a
collision.
So we need to resolve this collision using double hashing.
hnew = [h1(72) + i * h2(72)] % 7
     = [2 + 1 * (1 + 72 % 5)] % 7
     = [2 + 3] % 7
     = 5,
Now, as 5 is an empty slot,
so we can insert 72 into 5th slot.
Insert key 72 in the hash table
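The whole example can be reproduced with a short double-hashing sketch in Python (illustrative; it hard-codes the example's h1 and h2 functions):

```python
def double_hash_insert(table, key):
    """Insert key using double hashing with the example's functions:
    h1(k) = k % 7 and h2(k) = 1 + (k % 5)."""
    size = len(table)
    h1 = key % size
    h2 = 1 + (key % 5)
    i = 0
    while table[(h1 + i * h2) % size] is not None:
        i += 1                       # next probe is h1 + i*h2 (mod size)
    table[(h1 + i * h2) % size] = key

table = [None] * 7
for key in [27, 43, 692, 72]:
    double_hash_insert(table, key)
print(table)
```

The final table holds 43 at slot 1, 692 at slot 2, 72 at slot 5 and 27 at slot 6, matching the steps above.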
Dynamic Hashing:
It is a hashing technique that enables users to look up a dynamic data set. This
means the data set can be modified by adding data to it or removing data from
it on demand, hence the name 'Dynamic' hashing. Thus, the resulting data
buckets keep increasing or decreasing depending on the number of records.
Here are some prominent differences by which Static Hashing is different than
Dynamic Hashing –
Solved examples:
1) Insert the keys 43, 62, 123, 165 and 152 into an initially empty hash table of
length 10 using the hash function h(k) = k mod 10 and linear probing.
43 % 10 = 3 so 43 will go to bucket 3
62 % 10 = 2 so 62 will go to bucket 2
123 % 10 = 3 so 123 will try to go to bucket 3, but 43 is already there, so a
collision happens, and hence using linear probing it will go to the next
available bucket, i.e., bucket 4.
165 % 10 = 5 so 165 will go to bucket 5.
152 % 10 = 2, but buckets 2, 3, 4 and 5 are already occupied, so using linear
probing 152 will go to bucket 6.
so after inserting all keys our hash table will look like
Hash Table
bucket no. key
0
1
2 62
3 43
4 123
5 165
6 152
7
8
9
2) The keys 12, 18, 13, 2, 3, 23, 5 and 15 are inserted into an initially empty
hash table of length 10 using open addressing with hash function
h(k) = k mod 10 and linear probing. What is the resultant hash table?
Answer: 12 → 2; 18 → 8; 13 → 3; 2 collides (slots 2, 3 occupied) → 4;
3 collides (slots 3, 4 occupied) → 5; 23 collides (slots 3, 4, 5 occupied) → 6;
5 collides (slots 5, 6 occupied) → 7; 15 collides (slots 5, 6, 7, 8 occupied) → 9.
Resultant hash table: 2 → 12, 3 → 13, 4 → 2, 5 → 3, 6 → 23, 7 → 5,
8 → 18, 9 → 15, with slots 0 and 1 empty.
UNIT IV TREE
Introduction
Terminology
Degree of a Node
Forest
Applications of trees
A binary tree is a tree data structure in which each parent node can have
at most two children. Each node of a binary tree consists of three
items:
data item
address of left child
address of right child
struct node
{
    int data;
    struct node *left;
    struct node *right;
};
o The height of the tree is defined as the longest path from the root
node to the leaf node. The minimum number of nodes possible at
height h is equal to h+1.
o If the number of nodes is minimum, then the height of the tree
would be maximum. Conversely, if the number of nodes is
maximum, then the height of the tree would be minimum.
The full binary tree is also known as a strict binary tree. The tree
can only be considered as the full binary tree if each node must
contain either 0 or 2 children. The full binary tree can also be
defined as the tree in which each node must contain 2 children
except the leaf nodes.
n = 2h - 1
n + 1 = 2h
h = (n + 1)/2
The complete binary tree is a tree in which all the levels are completely
filled except possibly the last level. In the last level, all the nodes must be as
far left as possible. In a complete binary tree, the nodes should be added from
the left.
A tree is a perfect binary tree if all the internal nodes have 2 children, and
all the leaf nodes are at the same level.
Note: All the perfect binary trees are complete binary trees as well as
full binary trees, but vice versa is not true, i.e., not all complete binary
trees and full binary trees are perfect binary trees.
Degenerate or Pathological Tree
AVL Tree
AVL tree is a self-balancing Binary Search Tree (BST) where the
difference between heights of left and right subtrees for any node cannot
be more than one.
Red-Black Tree
A red-black tree is a kind of self-balancing binary search tree where each
node has an extra bit, and that bit is often interpreted as the color (red
or black). These colors are used to ensure that the tree remains
balanced during insertions and deletions.
The term 'tree traversal' means traversing or visiting each node of a tree.
There is a single way to traverse the linear data structure such as linked
list, queue, and stack. Whereas, there are multiple ways to traverse a
tree that are listed as follows -
Preorder traversal
Inorder traversal
Postorder traversal
Level order traversal
Preorder traversal
This technique follows the 'root left right' policy. It means that, first root
node is visited after that the left subtree is traversed recursively, and
finally, right subtree is recursively traversed. As the root node is
traversed before (or pre) the left and right subtree, it is called preorder
traversal.
Postorder traversal
This technique follows the 'left-right root' policy. It means that the first
left subtree of the root node is traversed, after that recursively traverses
the right subtree, and finally, the root node is traversed. As the root node
is traversed after (or post) the left and right subtree, it is called postorder
traversal.
Algorithm
Inorder traversal
This technique follows the 'left root right' policy. It means that first left
subtree is visited after that root node is traversed, and finally, the right
subtree is traversed. As the root node is traversed between the left and
right subtree, it is named inorder traversal.
Algorithm
Example1:
Example-2:
Pre Order: 1 2 4 8 12 5 9 3 6 7 10 11
Post Order: 12 8 4 9 5 2 6 10 11 7 3 1
In Order: 8 12 4 2 9 5 1 6 3 10 7 11
Example-3
Preorder traversal: 27 14 10 19 35 31 42
Inorder traversal: 10 14 19 27 31 35 42
Post order traversal: 10 19 14 31 42 35 27
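The three traversals can be checked against Example-3 with a short Python sketch (illustrative; the Node class is our own):

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def preorder(node):                  # root, left, right
    if node is None:
        return []
    return [node.data] + preorder(node.left) + preorder(node.right)

def inorder(node):                   # left, root, right
    if node is None:
        return []
    return inorder(node.left) + [node.data] + inorder(node.right)

def postorder(node):                 # left, right, root
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.data]

# The tree from Example-3: 27 at the root, 14 and 35 as its children.
root = Node(27, Node(14, Node(10), Node(19)), Node(35, Node(31), Node(42)))
print(preorder(root))    # [27, 14, 10, 19, 35, 31, 42]
print(inorder(root))     # [10, 14, 19, 27, 31, 35, 42]
print(postorder(root))   # [10, 19, 14, 31, 42, 35, 27]
```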
In the above figure, we can observe that the root node is 40, and all the
nodes of the left subtree are smaller than the root node, and all the nodes
of the right subtree are greater than the root node.
The properties that separate a binary search tree from a regular binary
tree are:
All nodes of left subtree are less than the root node
All nodes of right subtree are more than the root node
Both subtrees of each node are also BSTs i.e. they have the above
two properties
Now, let's see the creation of binary search tree using an example.
Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50
o First, we have to insert 45 into the tree as the root of the tree.
o Then, read the next element; if it is smaller than the root node,
insert it as the root of the left subtree, and move to the next
element.
o Otherwise, if the element is larger than the root node, then insert it
as the root of the right subtree.
Now, let's see the process of creating the Binary search tree using the
given data element. The process of creating the BST is shown below -
Step 2 - Insert 15.
As 15 is smaller than 45, so insert it as the root node of the left subtree.
Step 3 - Insert 79.
As 79 is greater than 45, so insert it as the root node of the right subtree.
55 is larger than 45 and smaller than 79, so it will be inserted as the left
subtree of 79.
50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as
a left subtree of 55.
Now, the creation of binary search tree is completed. After that, let's
move towards the operations that can be performed on Binary search
tree.
We can perform insert, delete and search operations on the binary search
tree.
Now, let's see the process of inserting a node into BST using an example.
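The creation and search steps above can be reproduced with a minimal Python BST sketch (illustrative; class and function names are our own), using the same data elements 45, 15, 79, 90, 10, 55, 12, 20, 50:

```python
class BSTNode:
    def __init__(self, data):
        self.data, self.left, self.right = data, None, None

def insert(root, data):
    """Standard BST insertion: smaller keys go left, larger keys go right."""
    if root is None:
        return BSTNode(data)
    if data < root.data:
        root.left = insert(root.left, data)
    elif data > root.data:
        root.right = insert(root.right, data)
    return root

def search(root, data):
    if root is None:
        return False
    if data == root.data:
        return True
    return search(root.left, data) if data < root.data else search(root.right, data)

root = None
for x in [45, 15, 79, 90, 10, 55, 12, 20, 50]:
    root = insert(root, x)
print(search(root, 55), search(root, 99))
```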
Deletion in Binary Search tree
In a binary search tree, we must delete a node from the tree by keeping
in mind that the property of BST is not violated. To delete a node from
BST, there are three possible situations occur -
We can see the process to delete a leaf node from BST in the below
image. In the below image, suppose we have to delete node 90; as the node
to be deleted is a leaf node, it will be replaced with NULL, and the
allocated space will be freed.
When the node to be deleted has only one child
In this case, we have to replace the target node with its child, and then
delete the child node. It means that after replacing the target node with
its child node, the child node will now contain the value to be deleted. So,
we simply have to replace the child node with NULL and free up the
allocated space.
We can see the process of deleting a node with one child from BST in the
below image. In the below image, suppose we have to delete the node
79, as the node to be deleted has only one child, so it will be replaced
with its child 55.
So, the replaced node 79 will now be a leaf node that can be easily
deleted.
This case of deleting a node in BST is a bit complex among other two
cases. In such a case, the steps to be followed are listed as follows -
We can see the process of deleting a node with two children from BST in
the below image. In the below image, suppose we have to delete node 45
that is the root node, as the node to be deleted has two children, so it will
be replaced with its inorder successor. Now, node 45 will be at the leaf of
the tree so that it can be deleted easily.
Algorithm to search an element in Binary search tree
Time Complexity
Operations    Best case    Average case    Worst case
Search        O(log n)     O(log n)        O(n)
Insertion     O(log n)     O(log n)        O(n)
Deletion      O(log n)     O(log n)        O(n)
AVL Trees
AVL tree got its name after its inventors Georgy Adelson-Velsky and
Evgenii Landis.
AVL Tree can be defined as height balanced binary search tree in which
each node is associated with a balance factor which is calculated by
subtracting the height of its right sub-tree from that of its left sub-tree.
If balance factor of any node is 1, it means that the left sub-tree is one
level higher than the right sub-tree.
If balance factor of any node is 0, it means that the left sub-tree and right
sub-tree contain equal height.
If balance factor of any node is -1, it means that the left sub-tree is one
level lower than the right sub-tree.
An AVL tree is given in the following figure. We can see that the balance
factor associated with each node is between -1 and +1. Therefore, it is
an example of an AVL tree.
Complexity
AVL Rotations
Where node A is the node whose balance Factor is other than -1, 0, 1.
The first two rotations, LL and RR, are single rotations, and the next two
rotations, LR and RL, are double rotations. For a tree to be unbalanced, its
minimum height must be at least 2. Let us understand each rotation.
1. RR Rotation
When a BST becomes unbalanced because a node is inserted into the right
subtree of the right subtree of A, we perform RR rotation. RR
rotation is an anticlockwise rotation, which is applied on the edge below a
node having balance factor -2.
2. LL Rotation
When a BST becomes unbalanced because a node is inserted into the left
subtree of the left subtree of C, we perform LL rotation. LL
rotation is a clockwise rotation, which is applied on the edge below a node
having balance factor 2.
3. LR Rotation
Double rotations are a bit tougher than the single rotations explained
above. LR rotation = RR rotation + LL rotation, i.e., first RR
rotation is performed on a subtree and then LL rotation is performed on the full
tree. By full tree we mean the first node, on the path from the inserted node,
whose balance factor is other than -1, 0, or 1.
State Action
A node B has been inserted into the right subtree of A the left subtree
of C, because of which C has become an unbalanced node having
balance factor 2. This case is L R rotation where: Inserted node is in
the right subtree of left subtree of C
4. RL Rotation
As already discussed, double rotations are a bit tougher than the single
rotations explained above. RL rotation = LL rotation +
RR rotation, i.e., first LL rotation is performed on a subtree and then RR
rotation is performed on the full tree. By full tree we mean the first node, on
the path from the inserted node, whose balance factor is other than -1, 0, or 1.
State Action
A node B has been inserted into the left subtree of C the right subtree
of A, because of which A has become an unbalanced node having
balance factor - 2. This case is RL rotation where: Inserted node is in
the left subtree of right subtree of A
As RL rotation = LL rotation + RR rotation, hence, LL (clockwise) on
subtree rooted at C is performed first. By doing RR rotation,
node C has become the right subtree of B.
The basic operations performed on the AVL Tree structures include all the
operations performed on a binary search tree, since the AVL Tree at its
core is actually just a binary search tree holding all its properties.
Therefore, basic operations performed on an AVL Tree are
− Insertion and Deletion.
Insertion
The data is inserted into the AVL Tree by following the Binary Search Tree
property of insertion, i.e. the left subtree must contain elements less than
the root value and right subtree must contain all the greater elements.
However, in AVL Trees, after the insertion of each element, the balance
factor of the tree is checked; if it does not exceed 1, the tree is left as it
is. But if the balance factor exceeds 1, a balancing algorithm is applied to
readjust the tree such that balance factor becomes less than or equal to 1
again.
Algorithm
Step 3 − If the tree is empty, the new node created will become the root
node of the AVL Tree.
Step 4 − If the tree is not empty, we perform the Binary Search Tree
insertion operation and check the balancing factor of the node in the tree.
START
if node == null then:
    return new node
if key < node.key then:
    node.left = insert (node.left, key)
else if (key > node.key) then:
    node.right = insert (node.right, key)
else
    return node
node.height = 1 + max (height (node.left), height (node.right))
balance = getBalance (node)
if balance > 1 and key < node.left.key then:
    return rightRotate (node)
if balance < -1 and key > node.right.key then:
    return leftRotate (node)
if balance > 1 and key > node.left.key then:
    node.left = leftRotate (node.left)
    return rightRotate (node)
if balance < -1 and key < node.right.key then:
    node.right = rightRotate (node.right)
    return leftRotate (node)
return node
END
Insertion Example
Starting with the first element 1, we create a node and measure the
balance, i.e., 0.
Since both the binary search property and the balance factor are satisfied,
we insert another element into the tree.
The balance factor for the two nodes is calculated and found to be -1
(height of the left subtree is 0 and height of the right subtree is 1). Since
its magnitude does not exceed 1, we add another element to the tree.
Now, after adding the third element, the balance factor exceeds 1 and
becomes 2. Therefore, rotations are applied. In this case, the RR rotation
is applied since the imbalance occurs at two right nodes.
The tree is rearranged as −
Similarly, the next elements are inserted and rearranged using these
rotations. After rearrangement, we achieve the tree as −
Deletion
Deletion Example
Using the same tree given above, let us perform deletion in three
scenarios −
However, element 6 is not a leaf node and has one child node attached to
it. In this case, we replace node 6 with its child node: node 5.
The balance of the tree becomes 1, and since it does not exceed 1 the
tree is left as it is. If we delete the element 5 further, we would have to
apply the left rotations; either LL or LR since the imbalance occurs at both
1-2-4 and 3-2-4.
The balance of the tree still remains 1, therefore we leave the tree as it is
without performing any rotations.
color
key
leftChild
rightChild
parent (except root node)
Red-Black tree's node structure would be:
struct t_red_black_node {
    enum { red, black } colour;
    void *item;
    struct t_red_black_node *left,
                            *right,
                            *parent;
};
1. Recolor
2. Rotation
3. Rotation followed by Recolor
The insertion operation in Red Black tree is performed using the following steps...
Undirected Graph:
Directed Graph:
Path
A path can be defined as the sequence of nodes that are followed in order
to reach some terminal node V from the initial node U.
Closed Path
A path will be called as closed path if the initial node is same as terminal
node. A path will be closed path if V0=VN.
Simple Path
If all the nodes of the path are distinct, then such a path P is called a
simple path. If, in addition, V0 = VN, it is called a closed simple path.
Cycle
Connected Graph
A connected graph is the one in which some path exists between every
two vertices (u, v) in V. There are no isolated nodes in connected graph.
Complete Graph
A complete graph is the one in which every node is connected with all
other nodes. A complete graph contains n(n-1)/2 edges, where n is the
number of nodes in the graph.
Weighted Graph
Loop
An edge whose both end points are the same vertex is called a loop.
Adjacent Nodes
If two nodes u and v are connected via an edge e, then the nodes u and v
are called as neighbours or adjacent nodes.
The degree of a node is the number of edges that are connected with that
node. A node with degree 0 is called an isolated node.
Graph representation
There are two ways to store Graphs into the computer's memory:
Ex-1:
In the above diagram, the full way of traversing is shown using arrows.
Step 1: Create a Queue with the same size as the total number of
vertices in the graph.
Step 2: Choose 12 as your beginning point for the traversal. Visit 12 and
add it to the Queue.
Step 3: Insert all the adjacent vertices of 12 that are in front of the
Queue but have not been visited into the Queue. So far, we have 5, 23,
and 3.
Step 4: Delete the vertex in front of the Queue when there are no new
vertices to visit from that vertex. We now remove 12 from the list.
Step 6: When the queue is empty, generate the final spanning tree by
eliminating unnecessary graph edges.
Since the queue is empty, we have completed the Breadth First Traversal
of the graph.
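The queue-based steps above can be sketched as BFS in Python. The adjacency list below is a hypothetical guess at the example graph (only the edges from vertex 12 to 5, 23 and 3 are stated in the notes):

```python
from collections import deque

def bfs(graph, start):
    """Breadth-first traversal using a queue; returns the visit order."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()          # delete the vertex in front of the queue
        order.append(v)
        for w in graph[v]:           # enqueue unvisited adjacent vertices
            if w not in visited:
                visited.add(w)
                queue.append(w)
    return order

# Hypothetical adjacency list loosely modeled on the example's vertices.
graph = {12: [5, 23, 3], 5: [12, 23], 23: [12, 5, 3], 3: [12, 23]}
print(bfs(graph, 12))
```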
1. First, create a stack with the total number of vertices in the graph.
2. Now, choose any vertex as the starting point of traversal, and push
that vertex into the stack.
3. After that, push a non-visited vertex (adjacent to the vertex on the
top of the stack) to the top of the stack.
4. Repeat step 3 until no unvisited vertex is left adjacent to the vertex
on the stack's top.
5. If no such vertex is left, go back and pop a vertex from the stack.
6. Repeat steps 3, 4, and 5 until the stack is empty.
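The stack-based procedure can be sketched as DFS in Python (illustrative; the small graph below is our own example, not one from the notes):

```python
def dfs(graph, start):
    """Iterative depth-first traversal using an explicit stack."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        v = stack[-1]                # look at the vertex on the stack's top
        if v not in visited:
            visited.add(v)
            order.append(v)
        # Push one non-visited adjacent vertex, if any.
        for w in graph[v]:
            if w not in visited:
                stack.append(w)
                break
        else:
            stack.pop()              # no vertex left to visit: go back (pop)
    return order

graph = {12: [5, 23, 3], 5: [12, 23], 23: [12, 5, 3], 3: [12, 23]}
print(dfs(graph, 12))
```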
Ex-2:
Difference between BFS and DFS
Applications of BFS:
Un-weighted Graphs
BFS algorithm can easily create the shortest path and a minimum
spanning tree to visit all the vertices of the graph in the shortest time
possible with high accuracy.
P2P Networks
Web Crawlers
Search engines or web crawlers can easily build multiple levels of indexes
by employing BFS. BFS implementation starts from the source, which is
the web page, and then it visits all the links from that source.
Network Broadcasting
A broadcasted packet is guided by the BFS algorithm to find and reach all
the nodes it has the address for.
Applications of DFS:
Weighted Graph
Path Finding
Topological Sorting
Topological sorting is used on directed acyclic graphs: it produces a linear
ordering of the vertices such that for every directed edge u → v, vertex u
comes before v in the ordering.
Topological Sort:
Note: Topological Sorting for a graph is not possible if the graph is not
a DAG.
Example:
Input: Graph :
Output: 5 4 2 3 1 0
Explanation: The first vertex in topological sorting is always a vertex
with an in-degree of 0 (a vertex with no incoming edges). A topological
sorting of the following graph is “5 4 2 3 1 0”. There can be more than
one topological sorting for a graph. Another topological sorting of the
following graph is “4 5 2 3 1 0”.
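A DFS-based topological sort can reproduce the ordering above. The figure itself is not reproduced here, so the edges below (5→2, 5→0, 4→0, 4→1, 2→3, 3→1) are assumed from the stated output:

```python
def topological_sort(graph, n):
    """DFS-based topological sort: append each vertex only after all the
    vertices it points to are finished, then reverse the finish order."""
    visited = [False] * n
    order = []

    def dfs(u):
        visited[u] = True
        for v in graph.get(u, []):
            if not visited[v]:
                dfs(v)
        order.append(u)          # u is finished: all its successors are done

    for u in range(n):
        if not visited[u]:
            dfs(u)
    return order[::-1]

# Assumed example graph: edges 5->2, 5->0, 4->0, 4->1, 2->3, 3->1
graph = {5: [2, 0], 4: [0, 1], 2: [3], 3: [1]}
print(topological_sort(graph, 6))  # → [5, 4, 2, 3, 1, 0]
```

As the explanation notes, more than one valid ordering can exist; this implementation happens to produce "5 4 2 3 1 0".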
Ex:
Answer:
Choose vertex B because it has no incoming edge, and delete it along
with its outgoing edges.
Repeating the above step, remove the vertices one by one and append
each to a list; the resulting list is a topological ordering.
For example, the below graph has two strongly connected components,
{1,2,3,4} and {5,6,7}, since there is a path from each vertex to every
other vertex in the same strongly connected component.
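Strongly connected components can be found with two passes of DFS (Kosaraju's algorithm). Since the figure is not reproduced, the edges below are assumed so that the components come out as {1,2,3,4} and {5,6,7}:

```python
from collections import defaultdict

def strongly_connected_components(graph, vertices):
    """Kosaraju's algorithm: record DFS finish order on G, then run DFS
    on the transpose graph; each second-pass tree is one SCC."""
    visited = set()
    finish = []

    def dfs1(u):
        visited.add(u)
        for v in graph.get(u, []):
            if v not in visited:
                dfs1(v)
        finish.append(u)

    for u in vertices:
        if u not in visited:
            dfs1(u)

    # Build the transpose graph (every edge reversed).
    transpose = defaultdict(list)
    for u in graph:
        for v in graph[u]:
            transpose[v].append(u)

    visited.clear()
    components = []

    def dfs2(u, comp):
        visited.add(u)
        comp.append(u)
        for v in transpose.get(u, []):
            if v not in visited:
                dfs2(v, comp)

    for u in reversed(finish):    # process in decreasing finish time
        if u not in visited:
            comp = []
            dfs2(u, comp)
            components.append(comp)
    return components

# Assumed edges giving components {1,2,3,4} and {5,6,7}:
graph = {1: [2], 2: [3], 3: [4], 4: [1, 5], 5: [6], 6: [7], 7: [5]}
print(strongly_connected_components(graph, range(1, 8)))
```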
Shortest Path
Dijkstra's Algorithm is a Graph algorithm that finds the shortest
path from a source vertex to all other vertices in the Graph (single source
shortest path). It is a type of Greedy Algorithm that only works on
Weighted Graphs having positive weights. The time complexity of
Dijkstra's Algorithm is O(V²) with the help of the adjacency matrix
representation of the graph. This time complexity can be reduced to O((V
+ E) log V) with the help of an adjacency list representation of the
graph, where V is the number of vertices and E is the number of edges in
the graph.
In this algorithm, each vertex will have two properties defined for it:
Visited property:-
o This property represents whether the vertex has been visited
or not.
o We are using this property so that we don't revisit a vertex.
o A vertex is marked visited only after the shortest path to it
has been found.
Path property:-
o This property stores the value of the current minimum path
to the vertex. Current minimum path means the shortest
way in which we have reached this vertex till now.
o This property is updated whenever any neighbour of the
vertex is visited.
o The path property is important as it will store the final
answer for each vertex.
For example, in a telephone network, Dijkstra's algorithm can find the
least-cost path between two exchanges.
Dijkstra’s Algorithm will generate the shortest path from Node 0 to all
other Nodes in the graph.
Ex:
Step 1: Start from Node 0 and mark it as visited (in the image below, a
visited Node is marked red).
Step 2: Check the adjacent Nodes. We now have two choices (either
choose Node 1 with distance 2, or choose Node 2 with distance 6) and
we pick the Node with the minimum distance. In this step Node 1 is the
minimum-distance adjacent Node, so mark it as visited and add up the
distance.
Distance: Node 0 -> Node 1 = 2
Step 3: Then move forward and check the adjacent Node, which is Node
3; mark it as visited and add up the distance. Now the distance will
be:
Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7
Step 4: Again we have two choices for adjacent Nodes (either Node 4
with distance 10, or Node 5 with distance 15), so we choose the Node
with the minimum distance. In this step Node 4 is the minimum-distance
adjacent Node, so mark it as visited and add up the distance.
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17
Step 5: Again, move forward and check the adjacent Node, which is
Node 6; mark it as visited and add up the distance. Now the distance
will be:
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 = 19
So, the shortest distance from the source vertex to Node 6 is 19, which
is the optimal one.
Pseudocode
function Dijkstra(Graph, source):
    for each vertex v in Graph:
        distance[v] = infinity
    distance[source] = 0
    G = the set of all nodes of the Graph
    while G is non-empty:
        Q = node in G with the least distance[]
        mark Q visited and remove it from G
        for each neighbour N of Q:
            alt_dist = distance[Q] + length(Q, N)
            if alt_dist < distance[N]:
                distance[N] := alt_dist
    return distance[ ]
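The pseudocode can be implemented with a min-heap, which gives the O((V + E) log V) complexity mentioned above. The graph below uses the edge weights from the worked example (0-1: 2, 0-2: 6, 1-3: 5, 3-4: 10, 3-5: 15, 4-6: 2); the graph is assumed to be undirected:

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths using a min-heap."""
    distance = {v: float('inf') for v in graph}   # path property
    distance[source] = 0
    visited = set()                               # visited property
    heap = [(0, source)]
    while heap:
        dist, u = heapq.heappop(heap)
        if u in visited:
            continue
        visited.add(u)            # the shortest path to u is now final
        for v, weight in graph[u]:
            alt = dist + weight
            if alt < distance[v]:
                distance[v] = alt                 # update neighbour's path property
                heapq.heappush(heap, (alt, v))
    return distance

# Edge weights taken from the worked example above (undirected):
graph = {
    0: [(1, 2), (2, 6)],
    1: [(0, 2), (3, 5)],
    2: [(0, 6)],
    3: [(1, 5), (4, 10), (5, 15)],
    4: [(3, 10), (6, 2)],
    5: [(3, 15)],
    6: [(4, 2)],
}
print(dijkstra(graph, 0)[6])  # → 19
```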
Advantages:
Disadvantages:
Spanning tree
The minimum spanning tree from the above spanning trees is:
The minimum spanning tree from a graph is found using the following
algorithms:
1. Prim's Algorithm
2. Kruskal's Algorithm
Prim's Algorithm
It falls under a class of algorithms called greedy algorithms that find the
local optimum in the hopes of finding a global optimum.
We start from one vertex and keep adding edges with the lowest weight
until we reach our goal.
The pseudocode for prim's algorithm shows how we create two sets of
vertices U and V-U. U contains the list of vertices that have been visited
and V-U the list of vertices that haven't. One by one, we move vertices
from set V-U to set U by connecting the least weight edge.
T = ∅;
U = { 1 };
while (U ≠ V)
    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V-U;
    T = T ∪ {(u, v)}
    U = U ∪ {v}
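The pseudocode can be sketched in Python with a min-heap holding the edges that leave the visited set U. The weighted graph below is a hypothetical example:

```python
import heapq

def prim(graph, start):
    """Prim's algorithm: grow the tree one least-weight edge at a time."""
    visited = {start}                    # the set U
    edges = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(edges)
    mst, total = [], 0
    while edges and len(visited) < len(graph):
        w, u, v = heapq.heappop(edges)   # least-weight edge leaving U
        if v in visited:
            continue
        visited.add(v)                   # move v from V-U into U
        mst.append((u, v, w))
        total += w
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(edges, (wx, v, x))
    return mst, total

# Hypothetical weighted graph (adjacency list of (neighbour, weight) pairs):
graph = {
    'A': [('B', 2), ('C', 3)],
    'B': [('A', 2), ('C', 1), ('D', 4)],
    'C': [('A', 3), ('B', 1), ('D', 5)],
    'D': [('B', 4), ('C', 5)],
}
mst, total = prim(graph, 'A')
print(total)  # → 7
```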
Example of Prim's algorithm
Example: Generate minimum cost spanning tree for the following graph
using Prim's algorithm. ( Homework)
Kruskal's Algorithm
It finds the subset of the edges that forms a tree including every vertex
and has the minimum sum of weights among all the trees that can be
formed from the graph.
Step 1: Sort all the edges in increasing order of their weight.
Step 2: Pick the smallest edge.
Step 3: Check whether the picked edge forms a cycle with the spanning
tree formed so far.
Step 4: If it doesn't form a cycle, then include that edge in the MST.
Otherwise, discard it.
A = ∅
for each vertex v ∈ G.V:
    MAKE-SET(v)
for each edge (u, v) ∈ G.E, ordered by increasing weight(u, v):
    if FIND-SET(u) ≠ FIND-SET(v):
        A = A ∪ {(u, v)}
        UNION(u, v)
return A
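The MAKE-SET / FIND-SET / UNION operations above can be sketched with a simple union-find structure. The weighted graph used here is a hypothetical example:

```python
def kruskal(vertices, edges):
    """Kruskal's algorithm with union-find cycle detection."""
    parent = {v: v for v in vertices}     # MAKE-SET(v) for every vertex

    def find(v):                          # FIND-SET with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst, total = [], 0
    for w, u, v in sorted(edges):         # edges in increasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                      # adding (u, v) does not form a cycle
            parent[ru] = rv               # UNION(u, v)
            mst.append((u, v, w))
            total += w
    return mst, total

# Hypothetical weighted graph, listed as (weight, u, v) edges:
edges = [(2, 'A', 'B'), (3, 'A', 'C'), (1, 'B', 'C'), (4, 'B', 'D'), (5, 'C', 'D')]
mst, total = kruskal('ABCD', edges)
print(total)  # → 7
```

Sorting the edges dominates the running time, giving O(E log E) overall.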
Home work: Find the Minimum Spanning Tree of the following graph using Kruskal's
algorithm.
Spanning Tree Applications
Computer Network Routing Protocol
Cluster Analysis
Civil Network Planning
Minimum Spanning tree Applications
To find paths in the map
To design networks like telecommunication networks, water supply
networks, and electrical grids.