
UNIT I LIST ADT

Introduction: Abstract Data Type (ADT)

Abstract Data type (ADT) is a type (or class) for objects whose
behaviour is defined by a set of values and a set of operations. The
definition of ADT only mentions what operations are to be performed but
not how these operations will be implemented. It does not specify how
data will be organized in memory and what algorithms will be used for
implementing the operations. It is called “abstract” because it gives an
implementation-independent view.
The process of providing only the essentials and hiding the details is
known as abstraction.

The ADT model has two kinds of functions: public functions and private functions. The ADT model also contains the data structures used in the program. In this model, encapsulation is performed first: all the data is wrapped into a single unit, the ADT. Then abstraction is performed: only the operations that can be performed on the data structure are shown, while the underlying data structures stay hidden.

So a user only needs to know what a data type can do, but not how it is implemented. Think of an ADT as a black box which hides the inner structure and design of the data type.
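As an illustration, an ADT can be expressed in C as an opaque type: the header exposes only the operations, while the layout of the data stays hidden in the implementation file. (A minimal sketch; the names used here are illustrative, not from the text.)

/* stack.h -- the public view: what a stack can do, not how it is stored */
typedef struct Stack Stack;              /* opaque type: members are hidden */

Stack *stack_create(int capacity);       /* build an empty stack */
void   stack_push(Stack *s, int value);  /* add an element on top */
int    stack_pop(Stack *s);              /* remove and return the top element */
int    stack_is_empty(const Stack *s);   /* 1 if empty, 0 otherwise */
void   stack_destroy(Stack *s);          /* release all memory */

Client code compiled against this header cannot touch the internal fields, so the implementation can switch, say, from an array to a linked list without changing any caller.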

Introduction to data structures

A data structure is a particular way of organising data in a computer so that it can be used effectively. The idea is to reduce the space and time complexities of different tasks.
Need Of Data Structure:
Need Of Data Structure:
The structure of the data and the design of the algorithm depend on each other. Data presentation must be easy to understand so that the developer, as well as the user, can implement the operations efficiently. Data structures provide an easy way of organising, retrieving, managing, and storing data.

Here is a list of the needs for data structures:

• Data structure modification is easy.
• It requires less time.
• It saves storage/memory space.
• Data representation is easy.
• It gives easy access to large databases.

Classification/Types of Data Structures:


1. Linear Data Structure
2. Non-Linear Data Structure.

Figure-1: Classifications of Data Structures

Primitive Data Structures


1. Primitive Data Structures are the data structures consisting of
the numbers and the characters that come built into programs.
2. These data structures can be manipulated or operated on directly by
machine-level instructions.
3. Basic data types like Integer, Float, Character,
and Boolean come under the Primitive Data Structures.
4. These data types are also called Simple data types, as they
contain values that can't be divided further.

Non-Primitive Data Structures


1. Non-Primitive Data Structures are those data structures derived
from Primitive Data Structures.
2. These data structures can't be manipulated or operated on directly by
machine-level instructions.
3. The focus of these data structures is on forming a set of data
elements that is either homogeneous (same data type)
or heterogeneous (different data types).
4. Based on the structure and arrangement of data, we can divide
these data structures into two sub-categories -
a. Linear Data Structures
b. Non-Linear Data Structures

Linear Data Structures

A data structure that preserves a linear connection among its data elements is known as a Linear Data Structure. The data is arranged linearly, where every element except the first and the last has both a successor and a predecessor. However, this is not necessarily true of the memory layout, as the arrangement there may not be sequential.

Based on memory allocation, the Linear Data Structures are further classified into two types:

1. Static Data Structures: Data structures having a fixed size are known as Static Data Structures. The memory for these data structures is allocated at compile time, and their size cannot be changed after compilation; however, the data stored in them can be altered.
The Array is the best example of a Static Data Structure: it has a fixed size, but its data can be modified later.
2. Dynamic Data Structures: The data structures having a dynamic
size are known as Dynamic Data Structures. The memory of these
data structures is allocated at the run time, and their size varies
during the run time of the code. Moreover, the user can change the
size as well as the data elements stored in these data structures at
the run time of the code.
Linked Lists, Stacks, and Queues are common examples of
Dynamic Data Structures.
Algorithm Efficiency

Some algorithms perform better than others. We always prefer to select an efficient algorithm, hence metrics for assessing algorithm efficiency would be useful.

The complexity of an algorithm is a function that describes the algorithm's efficiency in terms of the amount of data it must process. There are usually natural units for the domain and range of this function. There are two basic complexity metrics of the efficiency of an algorithm:

• Time complexity is a function that describes how long an algorithm takes in terms of the quantity of input it receives.

• Space complexity is a function that describes how much memory (space) an algorithm requires in terms of the quantity of input to the method.
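For instance, a linear search touches each element once in the worst case, so its running time grows in proportion to the input size while its extra memory stays constant (a small illustrative sketch):

/* Time complexity O(n): up to n comparisons.
   Space complexity O(1): only the loop counter is extra memory, regardless of n. */
int linear_search(const int a[], int n, int key)
{
    for (int i = 0; i < n; i++)
        if (a[i] == key)
            return i;     /* found: return its index */
    return -1;            /* not found */
}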

Algorithmic notation- Analyzing programs

Asymptotic notation is a way to describe the running time or space complexity of an algorithm based on the input size. It is commonly used in complexity analysis to describe how an algorithm performs as the size of the input grows.

The commonly used asymptotic notations for calculating the running time complexity of an algorithm are given below:

o Big Oh Notation (O)
o Omega Notation (Ω)
o Theta Notation (θ)

Big Oh Notation (O)

o Big O notation is an asymptotic notation that measures the performance of an algorithm by simply providing the order of growth of the function.
o This notation provides an upper bound on a function, which ensures that the function never grows faster than this upper bound.

It is the formal way to express the upper boundary of an algorithm's running time. It measures the worst case of time complexity, i.e., the longest amount of time the algorithm takes to complete its operation.

For example:

If f(n) and g(n) are two functions defined for positive integers, then f(n) = O(g(n)) (read "f(n) is big oh of g(n)", or "f(n) is on the order of g(n)") if there exist constants c > 0 and n₀ such that:

f(n) ≤ c·g(n) for all n ≥ n₀

This implies that f(n) does not grow faster than g(n), or that g(n) is an upper bound on the function f(n). In this case, we are calculating the growth rate of the function, which eventually gives the worst-case time complexity of a function, i.e., how badly an algorithm can perform.
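For example, take f(n) = 3n + 2. Choosing c = 4 and n₀ = 2 gives 3n + 2 ≤ 4n for all n ≥ 2, so f(n) = O(n).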

Omega Notation (Ω)

o It basically describes the best-case scenario, which is the opposite of big O notation.
o It is the formal way to represent the lower bound of an algorithm's running time. It measures the best amount of time an algorithm can possibly take to complete, i.e., the best-case time complexity.
o It determines the fastest time in which an algorithm can run.

If we require that an algorithm take at least a certain amount of time, without giving an upper bound, we use big-Ω notation (the Greek letter "omega"). It is used to bound the growth of the running time for large input sizes.

If f(n) and g(n) are two functions defined for positive integers, then f(n) = Ω(g(n)) (read "f(n) is Omega of g(n)") if there exist constants c > 0 and n₀ such that:

f(n) ≥ c·g(n) for all n ≥ n₀

Here g(n) is a lower bound on f(n), so this notation describes the fastest possible running time. In practice, however, we are usually less interested in the fastest running time than in the worst-case scenario, because for large inputs we want to know the worst time the algorithm can take before making further decisions.

Theta Notation (θ)

o The theta notation mainly describes the average-case scenario.
o It represents the realistic time complexity of an algorithm. An algorithm does not always perform at its worst or at its best; in real-world problems, algorithms mostly fluctuate between the worst case and the best case, and this gives us the average case of the algorithm.
o Big theta is mainly used when the worst-case and best-case values are the same.
o It is the formal way to express both the upper bound and the lower bound of an algorithm's running time.

Let's understand the big theta notation mathematically:

Let f(n) and g(n) be functions of n, where n is the number of steps required to execute the program. Then f(n) = θ(g(n)) if and only if there exist constants c1, c2 > 0 and n₀ such that:

c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n₀

Here the function is bounded by two limits, an upper and a lower limit, and f(n) lies in between: the condition holds if and only if c1·g(n) is less than or equal to f(n) and c2·g(n) is greater than or equal to f(n).
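For example, for f(n) = 3n + 2, choosing c1 = 3, c2 = 4, and n₀ = 2 gives 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2, so f(n) = θ(n).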

Linked List
o A Linked List can be defined as a collection of objects called nodes that are randomly stored in the memory.
o A node contains two fields, i.e., the data stored at that particular address and a pointer which contains the address of the next node in the memory.
o The last node of the list contains a pointer to NULL.
Why use linked list over array?

So far, we have used the array data structure to organize a group of elements that are to be stored individually in memory. However, an array has several advantages and disadvantages which must be known in order to decide which data structure will be used throughout the program.

An array has the following limitations:

1. The size of the array must be known in advance before using it in the program.
2. Increasing the size of the array is a time-consuming process. It is almost impossible to expand the size of the array at run time.
3. All the elements in the array need to be stored contiguously in memory. Inserting an element into the array requires shifting all the elements that follow it.

A linked list is a data structure which can overcome all the limitations of an array. Using a linked list is useful because:

1. It allocates memory dynamically. All the nodes of a linked list are stored non-contiguously in memory and linked together with the help of pointers.
2. Sizing is no longer a problem since we do not need to define the size at the time of declaration. The list grows as per the program's demand, limited only by the available memory space.

Difference between array and linked list
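The key differences in brief:

Basis              | Array                                    | Linked list
Memory layout      | Elements are stored contiguously.        | Nodes may be scattered anywhere in memory, connected by pointers.
Size               | Fixed; must be known in advance.         | Grows and shrinks at run time.
Access             | Direct access by index in O(1).          | Sequential access only; reaching a node takes O(n).
Insertion/Deletion | Requires shifting of elements.           | Only a few pointers are adjusted.
Extra memory       | None per element.                        | One pointer (two in a doubly linked list) per node.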


Types of Linked List

1. Singly Linked List: The nodes only point to the address of the next
node in the list.
2. Doubly Linked List: The nodes point to the addresses of both
previous and next nodes.
3. Circular Linked List: The last node in the list will point to the first
node in the list.
4. Circular Doubly Linked List: A circular doubly linked list is defined
as a circular linked list in which each node has two links connecting it
to the previous node and the next node.

Singly Linked List

A singly linked list is a linear data structure in which the elements are
not stored in contiguous memory locations and each element is connected
only to its next element using a pointer.

Applications of Linked Lists:


• Linked Lists are used to implement stacks and queues.
• They are used for the various representations of trees and graphs.
• They are used in dynamic memory allocation (linked list of free blocks).
• They are used for representing sparse matrices.
• They are used for the manipulation of polynomials.
• They are also used for performing arithmetic operations on long integers.
• They are used for finding paths in networks.
• In operating systems, they can be used in memory management, process scheduling, and file systems.
• Linked lists can be used to improve the performance of algorithms that need to frequently insert or delete items from large collections of data.
• They are used for implementing algorithms such as the LRU cache, which uses a linked list to keep track of the most recently used items in a cache.
Applications of Linked Lists in the real world:
• The list of songs in a music player is linked so that each song points to the previous and next songs.
• In a web browser, the previous and next web page URLs are linked through the back and forward buttons.
• In an image viewer, the previous and next images are linked through the previous and next buttons.
• Switching between applications with "Alt+Tab" on Windows and "Cmd+Tab" on a Mac requires the functionality of a circular linked list.
• In mobile phones, newly entered contact details are placed at the correct alphabetical position; a linked list makes it easy to insert a contact at the right place.
• The modifications we make in documents are created as nodes in a doubly linked list. The undo option (Ctrl+Z) walks back through these nodes; this is done with the functionality of a linked list.

Operations on Singly Linked List

Basic Operations in the Linked Lists

The basic operations in the linked lists are insertion, deletion, searching,
display, and deleting an element at a given key. These operations are
performed on Singly Linked Lists as given below –

• Creating a node − To create a new node.
• Insertion − Adds an element at the beginning of the list.
• Deletion − Deletes an element at the beginning of the list.
• Display − Displays the complete list.
• Search − Searches for an element using the given key.
• Delete − Deletes an element using the given key.

1. Creating a Node:

Algorithm to create a node in a linked list:

• Step 01: Start
• Step 02: Define a new user-defined data type called "node" with the help of a structure.
  o Step 03: Declare a variable named "data" of int/char/float/double... type.
  o Step 04: Declare a pointer named "link" of node type.
• Step 05: Declare a pointer named "start" of node type and initialize it with NULL.
• Step 06: Define a function "CreateNode()" whose return type is a pointer of node type.
  o Step 07: Declare a pointer "n" of node type.
  o Step 08: Assign the return value of the malloc function to "n".
• Step 09: Return n.
• Step 10: Stop

Code:

struct node
{
    int data;
    struct node *link;
};

struct node *head, *ptr;

ptr = (struct node *)malloc(sizeof(struct node));

Inserting Nodes:
To insert an element or a node into a linked list, the following three things need to be done:

1. Allocate a node.
2. Assign the data to the info field of the node.
3. Adjust the pointers so that the new node is linked in.

☀ Inserting a node at the beginning of the singly linked list:

Algorithm:
let *head be the pointer to first node in the current list
1. Create a new node using malloc function
NewNode=(NodeType*)malloc(sizeof(NodeType));
2. Assign data to the info field of new node
NewNode->info=newItem;
3. Set next of new node to head
NewNode->next=head;
4. Set the head pointer to the new node
head=NewNode;
5. End
Inserting a node at the end of the singly linked list:
Algorithm:
let *head be the pointer to first node in the current list
1. Create a new node using malloc function
NewNode=(NodeType*)malloc(sizeof(NodeType));
2. Assign data to the info field of new node
NewNode->info=newItem;
3. Set next of new node to NULL
NewNode->next=NULL;
4. if (head ==NULL) then
Set head=NewNode and exit.
5. Set temp=head;
6 while(temp->next!=NULL)
temp=temp->next; //increment temp
7. Set temp->next=NewNode;
8. End
☀ Inserting a node at the specified position of the singly linked
list:
Algorithm:
let *head be the pointer to first node in the current list
1. Create a new node using malloc function
NewNode=(NodeType*)malloc(sizeof(NodeType));
2. Assign data to the info field of new node
NewNode->info=newItem;
3. Enter the position at which you want to insert the new node. Let this position be pos.
4. Set temp=head;
5. if (head ==NULL) then
printf("void insertion"); and exit(1).
6. for(i=1; i<pos-1; i++)
temp=temp->next;
7. Set NewNode->next=temp->next;
Set temp->next=NewNode;
8. End
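A minimal C sketch of the first of these algorithms (insertion at the beginning); it assumes a NodeType with an info field and a next pointer, as used in the steps above:

#include <stdlib.h>

typedef struct NodeType {
    int info;
    struct NodeType *next;
} NodeType;

/* Insert newItem at the front; returns the new head of the list. */
NodeType *insert_beginning(NodeType *head, int newItem)
{
    NodeType *NewNode = (NodeType *)malloc(sizeof(NodeType));
    if (NewNode == NULL)
        return head;             /* allocation failed, list unchanged */
    NewNode->info = newItem;     /* step 2: fill the info field */
    NewNode->next = head;        /* step 3: link new node to old first node */
    return NewNode;              /* step 4: new node becomes the head */
}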

Deleting Nodes:

☀ Deleting the first node of the singly linked list:


Algorithm:
let *head be the pointer to first node in the current list
1. If(head==NULL) then
print “Void deletion” and exit
2. Store the address of first node in a temporary variable temp.
temp=head;
3. Set head to next of head.
head=head->next;
4. Free the memory reserved by temp variable.
free(temp);
5. End
☀ Deleting the last node of the singly linked list:
Algorithm:
let *head be the pointer to first node in the current list
1. If(head==NULL) then //if list is empty
print “Void deletion” and exit
2. else if(head->next==NULL) then //if list has only one node
Set temp=head;
print deleted item as,
printf(“%d” ,head->info);
head=NULL;
free(temp);
3. else
set temp=head;
while(temp->next->next!=NULL)
set temp=temp->next;
End of while
free(temp->next);
Set temp->next=NULL;
4. End
☀ Deleting the node at the specified position of the singly linked
list:
Algorithm:
let *head be the pointer to first node in the current list
1. Read the position of the node to be deleted; let it be pos.
2. if (head==NULL) then
print "void deletion" and exit
3. Set temp=head;
declare a pointer of node type, let it be *p
4. for(i=1; i<pos-1; i++)
temp=temp->next;
5. print deleted item as temp->next->info
6. Set p=temp->next;
7. Set temp->next=temp->next->next;
8. free(p);
9. End
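A C sketch of this last algorithm (deletion at a given position), assuming the same NodeType as in the insertion sketch and a 1-based position pos:

#include <stdio.h>
#include <stdlib.h>

/* Delete the node at position pos (1-based); returns the new head. */
NodeType *delete_at(NodeType *head, int pos)
{
    if (head == NULL) {
        printf("void deletion\n");     /* empty list */
        return NULL;
    }
    NodeType *temp = head, *p;
    if (pos == 1) {                    /* deleting the first node */
        head = head->next;
        free(temp);
        return head;
    }
    for (int i = 1; i < pos - 1 && temp->next != NULL; i++)
        temp = temp->next;             /* stop at the node before pos */
    if (temp->next == NULL) {
        printf("invalid position\n");
        return head;
    }
    p = temp->next;                    /* node to be removed */
    temp->next = p->next;              /* bypass it */
    free(p);
    return head;
}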

Display nodes on Singly Linked List

void display()
{
struct node *ptr;
ptr = head;
if(ptr == NULL)
{
printf("Nothing to print");
}
else
{
printf("\nprinting values . . . . .\n");
while (ptr!=NULL)
{
printf("\n%d",ptr->data);
ptr = ptr -> next;
}
}
}

NOTE: For program code you can refer 1st program in the lab
syllabus

Doubly linked list

A doubly linked list is a complex type of linked list in which a node contains a pointer to the previous as well as the next node in the sequence. Therefore, in a doubly linked list, a node consists of three parts: the node data, a pointer to the next node in sequence (next pointer), and a pointer to the previous node (previous pointer). A sample node in a doubly linked list is shown in the figure.

A doubly linked list containing three nodes having numbers from 1 to 3 in their data part is shown in the following image.

Operations on doubly linked list

The operations on a doubly linked list are described in the following table.

SN | Operation | Description
1 | Insertion at beginning | Adding a node to the linked list at the beginning.
2 | Insertion at end | Adding a node to the linked list at the end.
3 | Insertion after specified node | Adding a node to the linked list after the specified node.
4 | Deletion at beginning | Removing a node from the beginning of the list.
5 | Deletion at the end | Removing a node from the end of the list.
6 | Deletion of the node having given data | Removing the node which is present just after the node containing the given data.
7 | Searching | Comparing each node's data with the item to be searched; return the location of the item in the list if found, else return null.
8 | Traversing | Visiting each node of the list at least once in order to perform some specific operation like searching, sorting, display, etc.

Node Creation:

struct node
{
struct node *prev;
int data;
struct node *next;
};
struct node *head;

Algorithm to create Doubly Linked list

Begin:

alloc (head)

If (head == NULL) then

write ('Unable to allocate memory')


End if

Else then

read (data)

head.data ← data;

head.prev ← NULL;

head.next ← NULL;

last ← head;

write ('List created successfully')

End else

End

Insertion on a Doubly Linked List

Pushing a node onto a doubly linked list is similar to pushing a node onto a singly linked list, but extra work is required to handle the pointer to the previous node.

We can insert elements at 3 different positions of a doubly linked list:

1. Insertion at the beginning
2. Insertion in between nodes
3. Insertion at the end

Insertion at the Beginning

In this operation, we create a new node with three compartments, one


containing the data, the others containing the address of its previous and
next nodes in the list. This new node is inserted at the beginning of the
list.

Algorithm
1. START
2. Create a new node with three variables: prev, data, next.
3. Store the new data in the data variable
4. If the list is empty, make the new node as head.
5. Otherwise, link the address of the existing first node to the next
variable of the new node, and assign null to the prev variable.
6. Point the head to the new node.
7. END

1. Insertion at the Beginning

The head pointer points to the first node of the doubly linked list, and the
previous pointer of the first node points to Null. To insert a node at the
beginning of the Linked List, the head pointer should point to the new first
node, and the next pointer of the new first node must point to the
previous first node.

Algorithm :

o Step 1: IF ptr = NULL

Write OVERFLOW
Go to Step 9
[END OF IF]

o Step 2: SET NEW_NODE = ptr


o Step 3: SET ptr = ptr -> NEXT
o Step 4: SET NEW_NODE -> DATA = VAL
o Step 5: SET NEW_NODE -> PREV = NULL
o Step 6: SET NEW_NODE -> NEXT = HEAD
o Step 7: SET HEAD -> PREV = NEW_NODE
o Step 8: SET HEAD = NEW_NODE
o Step 9: EXIT
void insertbeginning(int item)
{
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    if(ptr == NULL)
    {
        printf("\nOVERFLOW");
    }
    else
    {
        if(head==NULL)
        {
            ptr->next = NULL;
            ptr->prev = NULL;
            ptr->data = item;
            head = ptr;
        }
        else
        {
            ptr->data = item;
            ptr->prev = NULL;
            ptr->next = head;
            head->prev = ptr;
            head = ptr;
        }
    }
}

2. Insertion at the End of Doubly Linked Lists


The next pointer of the last node of the DLL points to Null. To insert a new
node at the last position of the DLL, three important points need to be
taken into consideration:
1. The next pointer of the new node should point to the Null.

2. The previous pointer of the new node should point to the old last
node.

3. The next pointer of the old last node should point to the new last
node.
Also, there may be a case where the DLL is initially empty. In that case,
the newly created node will become both the first and the last node of the
doubly linked list.
The code for the insertion of a new node as the last node in Java is given
below:
// To insert a node at the end of a Doubly Linked List
public void insertAtLast(int data) {
// Creating a new node with the given data
Node newNode = new Node(data);

// A temporary pointer that will be used for traversing DLL


Node temp = head;

// Making the next of newNode as Null


newNode.next = null;

/*
If DLL is empty then this node will be both the first as
well as the last node
*/
if (head == null) {
newNode.prev = null;
head = newNode;
return;
}

/*
If DLL is not empty, then traverse till the end of DLL. Make
the next pointer of the original last node point to the new
last node and the previous of the last node to the original
last node
*/

    while (temp.next != null) {
        temp = temp.next;
    } // Now temp points to the original last node

    // Link the old last node to the new last node
    temp.next = newNode;
    // Link the new last node back to the old last node
    newNode.prev = temp;
}
3. Insertion after a given Node
There may be cases wherein a record or data is to be inserted after a
given record or data. If the data is stored in a Linked List, the insertion
after a given node operation in Linked List is used for such cases.
For inserting a new node after a given node, the following points need to
be under consideration:
1. The records next to the previous node should now be linked to the
new node to be inserted.

2. The previous node’s next pointer should be linked to the new node,
and the new node’s previous pointer should be linked to the previous
node.

The code for insertion of a new node after a given previous node in Java
is given below.
public void insertAfter(Node prevNode, int data) {
// if the previous node is null
if (prevNode == null) {
System.out.println("The given previous node cannot be null");
return;
}

// Create a new node with the given data


Node newNode = new Node(data);

/*
The next pointer of this node should point to
the next of prevNode
*/
newNode.next = prevNode.next;

// The next pointer of prevNode should point to the newNode


prevNode.next = newNode;

/*
The previous pointer of newNode should point to the
prevNode
*/
newNode.prev = prevNode;

// Change previous of newNode's next node


if (newNode.next != null)
newNode.next.prev = newNode;
}

Deletion from a Doubly Linked Lists


Deletion at beginning:

Deletion in a doubly linked list at the beginning is the simplest operation. We just need to copy the head pointer to a pointer ptr and shift the head pointer to its next.

1. ptr = head;
2. head = head → next;

Now make the prev of this new head node point to NULL. This is done with the following statement:

1. head → prev = NULL;

Now free the pointer ptr by using the free function.

1. free(ptr);
Algorithm

o STEP 1: IF HEAD = NULL

WRITE UNDERFLOW
GOTO STEP 6

o STEP 2: SET PTR = HEAD


o STEP 3: SET HEAD = HEAD → NEXT
o STEP 4: SET HEAD → PREV = NULL
o STEP 5: FREE PTR
o STEP 6: EXIT

void beginning_delete()
{
    struct node *ptr;
    if(head == NULL)
    {
        printf("\n UNDERFLOW\n");
    }
    else if(head->next == NULL)
    {
        ptr = head;      /* save the only node before losing it */
        head = NULL;
        free(ptr);
        printf("\nNode Deleted\n");
    }
    else
    {
        ptr = head;
        head = head -> next;
        head -> prev = NULL;
        free(ptr);
        printf("\nNode Deleted\n");
    }
}

Display nodes in doubly linked list

void display()
{
struct node *ptr;
printf("\n printing values...\n");
ptr = head;
while(ptr != NULL)
{
printf("%d\n",ptr->data);
ptr=ptr->next;
}
}

Circular Singly Linked List

In a circular singly linked list, the last node of the list contains a pointer to the first node of the list. We can have a circular singly linked list as well as a circular doubly linked list.

We traverse a circular singly linked list until we reach the same node where we started. The circular singly linked list has no beginning and no ending. There is no NULL value present in the next part of any of the nodes.

The following image shows a circular singly linked list.


Advantages of a Circular linked list

• The entire list can be traversed from any node.
• Circular lists are the required data structure when we want a list to be accessed in a circle or loop.
• Despite being a singly circular linked list, we can easily traverse to its previous node, which is not possible in a singly linked list.

Disadvantages of a Circular linked list

• Circular lists are more complex than singly linked lists.
• Reversing a circular list is more complex than reversing a singly or doubly linked list.
• If not traversed carefully, we could end up in an infinite loop.
• Like singly and doubly linked lists, circular linked lists also don't support direct access to elements.

Applications/Uses of Circular linked lists in real life

• Circular lists are used in applications where the entire list is accessed one-by-one in a loop. Example: operating systems may use one to switch between various running applications in a circular loop.
• They are also used by the operating system to share time among different users, generally using a Round-Robin time-sharing mechanism.
• Multiplayer games use a circular list to swap between players in a loop.

Operations on Circular Singly linked list:

Insertion

SN | Operation | Description
1 | Insertion at beginning | Adding a node into the circular singly linked list at the beginning.
2 | Insertion at the end | Adding a node into the circular singly linked list at the end.

Deletion & Traversing

SN | Operation | Description
1 | Deletion at beginning | Removing the node from the circular singly linked list at the beginning.
2 | Deletion at the end | Removing the node from the circular singly linked list at the end.
3 | Searching | Compare each element of the node with the given item and return the location at which the item is present in the list, otherwise return null.
4 | Traversing | Visiting each element of the list at least once in order to perform some specific operation.

Insertion into circular singly linked list at beginning

There are two scenarios in which a node can be inserted into a circular singly linked list at the beginning: either the node will be inserted into an empty list, or the node is to be inserted into an already filled list.

Firstly, allocate the memory space for the new node by using the malloc function of the C language.

struct node *ptr = (struct node *)malloc(sizeof(struct node));

In the first scenario, the condition head == NULL will be true. Since, the
list in which, we are inserting the node is a circular singly linked list,
therefore the only node of the list (which is just inserted into the list) will
point to itself only. We also need to make the head pointer point to this
node. This will be done by using the following statements.

if(head == NULL)
{
head = ptr;
ptr -> next = head;
}

In the second scenario, the condition head == NULL will become false
which means that the list contains at least one node. In this case, we
need to traverse the list in order to reach the last node of the list. This
will be done by using the following statement.

temp = head;
while(temp->next != head)
temp = temp->next;
At the end of the loop, the pointer temp points to the last node of the list. Since, in a circular singly linked list, the last node contains a pointer to the first node, we need to make the next pointer of the last node point to the new head. The new node being inserted will become the new head node, so the next pointer of temp will point to the new node ptr.

This will be done by using the following statements.

1. temp -> next = ptr;

the next pointer of temp will point to the existing head node of the list.

1. ptr->next = head;

Now, make the new node ptr, the new head node of the circular singly
linked list.

1. head = ptr;

in this way, the node ptr has been inserted into the circular singly linked
list at beginning.

Algorithm

o Step 1: IF PTR = NULL

Write OVERFLOW
Go to Step 11
[END OF IF]

o Step 2: SET NEW_NODE = PTR


o Step 3: SET PTR = PTR -> NEXT
o Step 4: SET NEW_NODE -> DATA = VAL
o Step 5: SET TEMP = HEAD
o Step 6: Repeat Step 8 while TEMP -> NEXT != HEAD
o Step 7: SET TEMP = TEMP -> NEXT

[END OF LOOP]

o Step 8: SET NEW_NODE -> NEXT = HEAD


o Step 9: SET TEMP → NEXT = NEW_NODE
o Step 10: SET HEAD = NEW_NODE
o Step 11: EXIT
void beg_insert(int item)
{
    struct node *ptr = (struct node *)malloc(sizeof(struct node));
    struct node *temp;
    if(ptr == NULL)
    {
        printf("\nOVERFLOW");
    }
    else
    {
        ptr -> data = item;
        if(head == NULL)
        {
            head = ptr;
            ptr -> next = head;
        }
        else
        {
            temp = head;
            while(temp->next != head)
                temp = temp->next;
            ptr->next = head;
            temp -> next = ptr;
            head = ptr;
        }
        printf("\nNode Inserted\n");
    }
}

Deletion in circular singly linked list at beginning

In order to delete a node in a circular singly linked list, we need to make a few pointer adjustments.

There are three scenarios of deleting a node from circular singly linked list
at beginning.

Scenario 1: (The list is Empty)

If the list is empty then the condition head == NULL will become true, in
this case, we just need to print underflow on the screen and make exit.

if(head == NULL)
{
printf("\nUNDERFLOW");
return;
}

Scenario 2: (The list contains a single node)

If the list contains a single node, then the condition head → next == head will become true. In this case, we need to delete that node and make the head pointer NULL. This is done with the following statements.

if(head->next == head)
{
    ptr = head;     /* save the node before losing the reference */
    head = NULL;
    free(ptr);
}

Scenario 3: (The list contains more than one node)

If the list contains more than one node then, in that case, we need to
traverse the list by using the pointer ptr to reach the last node of the list.
This will be done by using the following statements.

ptr = head;
while(ptr -> next != head)
ptr = ptr -> next;
At the end of the loop, the pointer ptr points to the last node of the list. Since the last node of the list points to the head node, this link must be changed: the last node will now point to the node after the head.

1. ptr->next = head->next;

Now, free the head pointer by using the free() method in C language.

1. free(head);

Make the node pointed by the next of the last node, the new head of the
list.

1. head = ptr->next;

In this way, the node will be deleted from the circular singly linked list
from the beginning.

Algorithm

o Step 1: IF HEAD = NULL

Write UNDERFLOW
Go to Step 8
[END OF IF]

o Step 2: SET PTR = HEAD


o Step 3: Repeat Step 4 while PTR → NEXT != HEAD
o Step 4: SET PTR = PTR → next

[END OF LOOP]

o Step 5: SET PTR → NEXT = HEAD → NEXT


o Step 6: FREE HEAD
o Step 7: SET HEAD = PTR → NEXT
o Step 8: EXIT
void beg_delete()
{
    struct node *ptr;
    if(head == NULL)
    {
        printf("\nUNDERFLOW\n");
    }
    else if(head->next == head)
    {
        ptr = head;     /* single node: save it before resetting head */
        head = NULL;
        free(ptr);
        printf("\nNode Deleted\n");
    }
    else
    {
        ptr = head;
        while(ptr -> next != head)
            ptr = ptr -> next;
        ptr->next = head->next;
        free(head);
        head = ptr->next;
        printf("\nNode Deleted\n");
    }
}
Display the nodes in circular singly linked list

void display()
{
    struct node *ptr;
    ptr = head;
    if(head == NULL)
    {
        printf("\nnothing to print");
    }
    else
    {
        printf("\n printing values ... \n");
        while(ptr -> next != head)
        {
            printf("%d\n", ptr -> data);
            ptr = ptr -> next;
        }
        printf("%d\n", ptr -> data);
    }
}
UNIT II STACK & QUEUE

Stacks: Definition- representations :

Stack is a linear data structure that follows a particular order in which the
operations are performed. The order may be LIFO(Last In First Out) or
FILO(First In Last Out). LIFO implies that the element that is inserted last,
comes out first and FILO implies that the element that is inserted first,
comes out last.

Some key points related to stack

o It is called a stack because it behaves like a real-world stack, e.g., a pile of books.
o A Stack is an abstract data type with a pre-defined capacity, which means that it can store elements only up to a limited size.
o It is a data structure that follows some order to insert and delete the elements, and that order can be LIFO or FILO.

Working of Stack

Stack works on the LIFO pattern. Suppose a stack has five memory blocks; then the size of the stack is 5.

Suppose we want to store elements in the stack, and the stack is initially empty: we push the elements one by one until the stack becomes full, at which point no more elements can be pushed.
Applications of Stack

o Balancing of symbols: Stack is used for balancing symbols.
o String reversal: Stack is also used for reversing a string.
o Recursion: Recursion means that a function calls itself. To maintain the previous states, the compiler creates a system stack in which all the previous records of the function calls are maintained.
o DFS (Depth First Search): This search is implemented on a graph, and its implementation uses the stack data structure.
o Backtracking: Suppose we have to find a path through a maze. If, while moving along a particular path, we realize that we have taken a wrong turn, then in order to return to the beginning of the path and try a new one, we have to use the stack data structure.
o Expression conversion: Stack can also be used for expression conversion. This is one of the most important applications of stack. The list of expression conversions is given below:

• Infix to prefix
• Infix to postfix
• Prefix to infix
• Prefix to postfix
• Postfix to infix
Memory management: The stack manages the memory. The memory is
assigned in the contiguous memory blocks. The memory is known as
stack memory as all the variables are assigned in a function call stack
memory.

UNDO/REDO: It can also be used for performing UNDO/REDO operations.

Standard Stack Operations

The following are some common operations implemented on the stack:

o push(): When we insert an element into a stack, the operation is known as a push. If the stack is full, an overflow condition occurs.
o pop(): When we delete an element from the stack, the operation is known as a pop. If the stack is empty, meaning no element exists in the stack, this state is known as an underflow state.
o isEmpty(): It determines whether the stack is empty or not.
o isFull(): It determines whether the stack is full or not.
o peek(): It returns the element at the top of the stack without removing it.
o count(): It returns the total number of elements available in a stack.
o change(): It changes the element at the given position.
o display(): It prints all the elements available in the stack.

NOTE : for code you can refer lab program-2
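For quick reference, a compact array-based sketch of push, pop, and peek in C (illustrative only; the lab program remains the full version):

#include <stdio.h>
#define MAX 5

int stack[MAX];
int top = -1;                 /* -1 means the stack is empty */

void push(int x) {
    if (top == MAX - 1) { printf("Overflow\n"); return; }
    stack[++top] = x;         /* advance top, then store */
}

int pop(void) {
    if (top == -1) { printf("Underflow\n"); return -1; }
    return stack[top--];      /* return top, then shrink */
}

int peek(void) {
    if (top == -1) { printf("Stack empty\n"); return -1; }
    return stack[top];        /* top element, not removed */
}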

Check for Balanced Brackets in an expression


Given an expression string exp, write a program to examine whether the
pairs and the orders of “{“, “}”, “(“, “)”, “[“, “]” are correct in the given
expression.

Follow the steps mentioned below to implement the idea:

• Declare a character stack (say temp).
• Now traverse the string exp.
  o If the current character is an opening bracket ( '(' or '{' or '[' ), push it onto the stack.
  o If the current character is a closing bracket ( ')' or '}' or ']' ), pop from the stack; if the popped character is the matching opening bracket, all is fine.
  o Else the brackets are Not Balanced.
• After the complete traversal, if some opening brackets are left in the stack, then the expression is Not Balanced, else Balanced.

Write a C program that checks whether a string of parentheses is balanced or not using a stack.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SIZE 100


char stack[MAX_SIZE];
int top = -1;

void push(char data) {


if (top == MAX_SIZE - 1) {
printf("Overflow stack!\n");
return;
}
top++;
stack[top] = data;
}

char pop() {
if (top == -1) {
printf("Empty stack!\n");
return ' ';
}
char data = stack[top];
top--;
return data;
}

int is_matching_pair(char char1, char char2) {


if (char1 == '(' && char2 == ')') {
return 1;
} else if (char1 == '[' && char2 == ']') {
return 1;
} else if (char1 == '{' && char2 == '}') {
return 1;
} else {
return 0;
}
}
int isBalanced(char* text) {
int i;
for (i = 0; i < strlen(text); i++) {
if (text[i] == '(' || text[i] == '[' || text[i] == '{') {
push(text[i]);
} else if (text[i] == ')' || text[i] == ']' || text[i] == '}') {
if (top == -1) {
return 0;
} else if (!is_matching_pair(pop(), text[i])) {
return 0;
}
}
}
if (top == -1) {
return 1;
} else {
return 0;
}
}

int main() {
char text[MAX_SIZE];
printf("Input an expression in parentheses: ");
scanf("%s", text);
if (isBalanced(text)) {
printf("The expression is balanced.\n");
} else {
printf("The expression is not balanced.\n");
}
return 0;
}

Conversion of infix expression to postfix expression

Here, we will use the stack data structure for the conversion of infix
expression to postfix expression.

Rules for the conversion from infix to postfix expression

1. Print the operand as they arrive.


2. If the stack is empty or contains a left parenthesis on top, push the
incoming operator on to the stack.
3. If the incoming symbol is '(', push it on to the stack.
4. If the incoming symbol is ')', pop the stack and print the operators
until the left parenthesis is found.
5. If the incoming symbol has higher precedence than the top of the
stack, push it on the stack.
6. If the incoming symbol has lower precedence than the top of the
stack, pop and print the top of the stack. Then test the incoming
operator against the new top of the stack.
7. If the incoming operator has the same precedence with the top of
the stack then use the associativity rules. If the associativity is from
left to right then pop and print the top of the stack then push the
incoming operator. If the associativity is from right to left then push
the incoming operator.
8. At the end of the expression, pop and print all the operators of the
stack.

Let's understand through an example.

Infix expression: K + L - M*N + (O^P) * W/U/V * T + Q

Input | Stack (top on right) | Postfix Expression
K     |                      | K
+     | +                    | K
L     | +                    | K L
-     | -                    | K L +
M     | -                    | K L + M
*     | - *                  | K L + M
N     | - *                  | K L + M N
+     | +                    | K L + M N * -
(     | + (                  | K L + M N * -
O     | + (                  | K L + M N * - O
^     | + ( ^                | K L + M N * - O
P     | + ( ^                | K L + M N * - O P
)     | +                    | K L + M N * - O P ^
*     | + *                  | K L + M N * - O P ^
W     | + *                  | K L + M N * - O P ^ W
/     | + /                  | K L + M N * - O P ^ W *
U     | + /                  | K L + M N * - O P ^ W * U
/     | + /                  | K L + M N * - O P ^ W * U /
V     | + /                  | K L + M N * - O P ^ W * U / V
*     | + *                  | K L + M N * - O P ^ W * U / V /
T     | + *                  | K L + M N * - O P ^ W * U / V / T
+     | +                    | K L + M N * - O P ^ W * U / V / T * +
Q     | +                    | K L + M N * - O P ^ W * U / V / T * + Q
(end) |                      | K L + M N * - O P ^ W * U / V / T * + Q +

The final postfix expression of the given infix expression is KL+MN*-OP^W*U/V/T*+Q+.

NOTE: for code you can refer lab program 3.a

Evaluating a postfix expression

Postfix notation:

The general mathematical way of representing algebraic expressions is to write the operator between the operands, e.g., a + b. Such a representation is called the Infix representation. If we write the operator after the operands, e.g., a b +, it is called the Postfix representation. It is also called "Reverse Polish notation".
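For example, the postfix expression 2 3 1 * + 9 - (infix: 2 + 3 * 1 - 9) is evaluated with a stack as follows:

1. Push 2, push 3, push 1. Stack: 2 3 1
2. Read * : pop 1 and 3, push 3 * 1 = 3. Stack: 2 3
3. Read + : pop 3 and 2, push 2 + 3 = 5. Stack: 5
4. Push 9. Stack: 5 9
5. Read - : pop 9 and 5, push 5 - 9 = -4. Stack: -4

The single value left on the stack, -4, is the result.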

NOTE: for code you can refer lab program 3.b


Queue

1. A queue can be defined as an ordered list which enables insert operations to be performed at one end, called REAR, and delete operations to be performed at the other end, called FRONT.

2. A queue is referred to as a First In First Out list.

3. For example, people waiting in line for a rail ticket form a queue.

Applications of Queue

A queue performs actions on a first in first out basis, which is quite fair for the ordering of actions. There are various applications of queues, discussed below.

1. Queues are widely used as waiting lists for a single shared resource like a printer, disk, or CPU.
2. Queues are used in the asynchronous transfer of data (where data is not transferred at the same rate between two processes), e.g., pipes, file IO, sockets.
3. Queues are used as buffers in most applications like MP3 media players, CD players, etc.
4. Queues are used to maintain the playlist in media players, in order to add and remove songs from the playlist.
5. Queues are used in operating systems for handling interrupts.

Array and linked list representations – operations

For the array code, you can refer to lab program 4.
Linked List implementation of Queue

In the linked queue, there are two pointers maintained in the memory i.e.
front pointer and rear pointer. The front pointer contains the address of
the starting element of the queue while the rear pointer contains the
address of the last element of the queue.

Insertion and deletions are performed at rear and front end respectively.
If front and rear both are NULL, it indicates that the queue is empty.

The linked representation of queue is shown in the following figure

#include<stdio.h>
#include<stdlib.h>
struct node
{
int data;
struct node *next;
};
struct node *front;
struct node *rear;
void insert();
void delete();
void display();
int main ()
{
    int choice = 0;    /* initialize so the loop condition is defined */
    while(choice != 4)
    {
        printf("\n*************************Main Menu*****************************\n");
        printf("\n=================================================================\n");
        printf("\n1.insert an element\n2.Delete an element\n3.Display the queue\n4.Exit\n");
        printf("\nEnter your choice ?");
        scanf("%d",&choice);
switch(choice)
{
case 1:
insert();
break;
case 2:
delete();
break;
case 3:
display();
break;
case 4:
exit(0);
break;
default:
printf("\nEnter valid choice??\n");
}
}
}
void insert()
{
struct node *ptr;
int item;

ptr = (struct node *) malloc (sizeof(struct node));


if(ptr == NULL)
{
printf("\nOVERFLOW\n");
return;
}
else
{
printf("\nEnter value?\n");
scanf("%d",&item);
ptr -> data = item;
if(front == NULL)
{
front = ptr;
rear = ptr;
front -> next = NULL;
rear -> next = NULL;
}
else
{
rear -> next = ptr;
rear = ptr;
rear->next = NULL;
}
}
}
void delete ()
{
struct node *ptr;
if(front == NULL)
{
printf("\nUNDERFLOW\n");
return;
}
else
{
    ptr = front;
    front = front -> next;
    if(front == NULL)    /* queue became empty, reset rear too */
        rear = NULL;
    free(ptr);
}
}
void display()
{
struct node *ptr;
ptr = front;
if(front == NULL)
{
printf("\nEmpty queue\n");
}
else
    {
        printf("\nprinting values .....\n");
        while(ptr != NULL)
        {
            printf("\n%d\n",ptr -> data);
            ptr = ptr -> next;
        }
    }
}
Types of Queue

There are four different types of queue that are listed as follows -

o Simple Queue or Linear Queue


o Circular Queue
o Priority Queue
o Double Ended Queue (or Deque)

Priority Queue

It is a special type of queue in which the elements are arranged based on priority: every element has a priority associated with it. If some elements occur with the same priority, they are arranged according to the FIFO principle.

Insertion in a priority queue takes place based on arrival, while deletion occurs based on priority. The priority queue is mainly used to implement CPU scheduling algorithms.

There are two types of priority queue that are discussed as follows -

o Ascending priority queue - In an ascending priority queue, elements can be inserted in arbitrary order, but only the smallest can be deleted first. Suppose an array with elements 7, 5, and 3 is inserted in that order; then the order of deleting the elements is 3, 5, 7.
o Descending priority queue - In a descending priority queue, elements can be inserted in arbitrary order, but only the largest element can be deleted first. Suppose an array with elements 7, 3, and 5 is inserted in that order; then the order of deleting the elements is 7, 5, 3.
Characteristics of a Priority queue

A priority queue is an extension of a queue that has the following characteristics:

o Every element in a priority queue has some priority associated with it.
o An element with higher priority will be deleted before an element of lower priority.
o If two elements in a priority queue have the same priority, they are arranged using the FIFO principle.

Implementation of Priority Queue

The priority queue can be implemented in four ways that include arrays,
linked list, heap data structure and binary search tree. The heap data
structure is the most efficient way of implementing the priority queue.

What is Heap?

A heap is a tree-based data structure that forms a complete binary tree and satisfies the heap property: if A is a parent node of B, then A is ordered with respect to B for all nodes A and B in the heap. This means that the value of a parent node is either always greater than or equal to the value of its child nodes, or always less than or equal to it. Therefore, there are two types of heaps:

o Max heap: The max heap is a heap in which the value of the parent node is greater than the value of the child nodes.
o Min heap: The min heap is a heap in which the value of the parent node is less than the value of the child nodes.

Both heaps are binary heaps, as each node has at most two child nodes.
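For example, the array [9, 5, 6, 2, 3], read level by level, forms a max heap: the root 9 is greater than its children 5 and 6, and 5 is greater than its children 2 and 3. When a binary heap is stored in an array, the children of the node at index i sit at indices 2i + 1 and 2i + 2.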

Deque (or double-ended queue)

The deque stands for Double Ended Queue. Deque is a linear data
structure where the insertion and deletion operations are performed from
both ends. We can say that deque is a generalized version of the queue.

Though insertion and deletion in a deque can be performed at both ends, it does not follow the FIFO rule. The representation of a deque is given as follows -

Types of deque

There are two types of deque -

o Input restricted queue


o Output restricted queue

Input restricted Queue

In an input restricted queue, the insertion operation can be performed at only one end, while deletion can be performed from both ends.

Output restricted Queue

In an output restricted queue, the deletion operation can be performed at only one end, while insertion can be performed from both ends.

Circular Queue

There was one limitation in the array implementation of a Queue: if the rear reaches the end position of the Queue, there might be vacant spaces left at the beginning which cannot be utilized. To overcome this limitation, the concept of the circular queue was introduced.

A circular queue is similar to a linear queue, as it is also based on the FIFO (First In First Out) principle, except that the last position is connected to the first position, forming a circle. It is also known as a Ring Buffer.

Suppose the rear is at the last position of the Queue and the front is pointing somewhere other than the 0th position, with some positions in between empty. Since the rear is at the last position, an attempted insertion will report that there are no empty spaces in the Queue. One solution to avoid such wastage of memory space is to shift all the elements to the left and adjust the front and rear ends accordingly, but this is not a practically good approach because shifting all the elements consumes a lot of time. The efficient approach to avoid the wastage of memory is to use the circular queue data structure.

Applications of Circular Queue

The circular Queue can be used in the following scenarios:

o Memory management: The circular queue provides memory


management. As we have already seen that in linear queue, the
memory is not managed very efficiently. But in case of a circular
queue, the memory is managed efficiently by placing the elements
in a location which is unused.
o CPU Scheduling: The operating system also uses the circular
queue to insert the processes and then execute them.
o Traffic system: In a computer-controlled traffic system, the traffic light is one of the best examples of the circular queue. Each light in the signal gets ON one by one, after a fixed interval of time: the red light is ON for one minute, then the yellow light for one minute, and then the green light. After the green light, the red light gets ON again.

Algorithm to insert an element in a circular queue

Step 1: IF (REAR+1)%MAX = FRONT


Write " OVERFLOW "
Goto step 4
[End OF IF]

Step 2: IF FRONT = -1 and REAR = -1


SET FRONT = REAR = 0
ELSE IF REAR = MAX - 1 and FRONT ! = 0
SET REAR = 0
ELSE
SET REAR = (REAR + 1) % MAX
[END OF IF]

Step 3: SET QUEUE[REAR] = VAL


Step 4: EXIT

Dequeue Operation

The steps of dequeue operation are given below:

o First, we check whether the Queue is empty or not. If the queue is


empty, we cannot perform the dequeue operation.
o When an element is deleted, the value of front gets incremented by 1.
o If there is only one element left which is to be deleted, then the
front and rear are reset to -1.

Algorithm to delete an element from the circular queue

Step 1: IF FRONT = -1
Write " UNDERFLOW "
Goto Step 4
[END of IF]

Step 2: SET VAL = QUEUE[FRONT]

Step 3: IF FRONT = REAR


SET FRONT = REAR = -1
ELSE
IF FRONT = MAX -1
SET FRONT = 0
ELSE
SET FRONT = FRONT + 1
[END of IF]
[END OF IF]

Step 4: EXIT
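A C sketch of both operations on an array of size MAX, following the algorithms above (the modulo form of the wrap-around is used; FRONT and REAR start at -1):

#include <stdio.h>
#define MAX 5

int queue[MAX];
int front = -1, rear = -1;

void enqueue(int val) {
    if ((rear + 1) % MAX == front) { printf("OVERFLOW\n"); return; }
    if (front == -1)                   /* first element: initialize both ends */
        front = rear = 0;
    else
        rear = (rear + 1) % MAX;       /* wrap around past the last slot */
    queue[rear] = val;
}

int dequeue(void) {
    if (front == -1) { printf("UNDERFLOW\n"); return -1; }
    int val = queue[front];
    if (front == rear)                 /* last element removed: reset queue */
        front = rear = -1;
    else
        front = (front + 1) % MAX;     /* wrap around */
    return val;
}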

Difference between stack and queue
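In brief:

• A stack is LIFO (Last In First Out); a queue is FIFO (First In First Out).
• A stack inserts and deletes at the same end (the top); a queue inserts at the REAR and deletes at the FRONT.
• The basic stack operations are push and pop; the basic queue operations are enqueue and dequeue.
• Stacks are used for recursion, expression conversion, and backtracking; queues are used for scheduling, buffering, and sharing a single resource.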


Define three ADTs namely List ADT, Stack ADT, Queue ADT.

1. List ADT

View of a list:

• The data is generally stored in key sequence in a list which has a head structure consisting of a count, pointers, and the address of the compare function needed to compare the data in the list.
• The data node contains a pointer to a data structure and a self-referential pointer which points to the next node in the list.
• The List ADT functions are given below:
• get() – Return an element from the list at any given position.
• insert() – Insert an element at any position in the list.
• remove() – Remove the first occurrence of any element from a non-empty list.
• removeAt() – Remove the element at a specified location from a non-empty list.
• replace() – Replace an element at any position with another element.
• size() – Return the number of elements in the list.
• isEmpty() – Return true if the list is empty, otherwise return false.
• isFull() – Return true if the list is full, otherwise return false.
2. Stack ADT

View of stack:

• In the Stack ADT implementation, instead of the data being stored in each node, a pointer to the data is stored.
• The program allocates memory for the data, and the address is passed to the stack ADT.
• The head node and the data nodes are encapsulated in the ADT. The calling function can only see the pointer to the stack.
• The stack head structure also contains a pointer to the top and a count of the number of entries currently in the stack.
• push() – Insert an element at one end of the stack called the top.
• pop() – Remove and return the element at the top of the stack, if it is not empty.
• peek() – Return the element at the top of the stack without removing it, if the stack is not empty.
• size() – Return the number of elements in the stack.
• isEmpty() – Return true if the stack is empty, otherwise return false.
• isFull() – Return true if the stack is full, otherwise return false.

3. Queue ADT

View of Queue:

• The queue abstract data type (ADT) follows the basic design of the stack abstract data type.
• Each node contains a void pointer to the data and a link pointer to the next element in the queue. It is the program's responsibility to allocate memory for storing the data.
• enqueue() – Insert an element at the end of the queue.
• dequeue() – Remove and return the first element of the queue, if the queue is not empty.
• peek() – Return the first element of the queue without removing it, if the queue is not empty.
• size() – Return the number of elements in the queue.
• isEmpty() – Return true if the queue is empty, otherwise return false.
• isFull() – Return true if the queue is full, otherwise return false.
UNIT III SORTING & HASHING

Sorting techniques: A Sorting Algorithm is used to rearrange a given array or list of elements according to a comparison operator on the elements. The comparison operator is used to decide the new order of the elements in the respective data structure.

Types of Sorting Techniques

There are various sorting algorithms used in data structures. They can be broadly classified into the following two types:
1. Comparison-based: We compare the elements in a comparison-based sorting algorithm.
   Ex: Bubble sort, Selection sort, Quick sort
2. Non-comparison-based: We do not compare the elements in a non-comparison-based sorting algorithm.
   Ex: Radix sort, Counting sort.

Sorting Terminologies:
In-place sorting:

An in-place sorting algorithm uses constant space for producing the output
(modifies the given array only). It sorts the list only by modifying the order of
the elements within the list. For example, Insertion Sort and Selection Sorts
are in-place sorting algorithms as they do not use any additional space for
sorting the list.

Types Of Sorting :
1. Internal Sorting
2. External Sorting

Sort Stability :
1. Stable Sort
2. Unstable Sort

Internal Sorting:
• When all the data is placed in main memory (internal memory), the sorting is called internal sorting.
• In internal sorting, the problem cannot take input beyond the memory size.
• Example: heap sort, bubble sort, selection sort, quick sort, shell sort, insertion sort.

External Sorting:
• When all the data that needs to be sorted cannot be placed in memory at one time, the sorting is called external sorting. External sorting is used for massive amounts of data.
• Merge Sort and its variations are typically used for external sorting.
• External storage like hard disks and CDs is used for external sorting.
• Example: merge sort.

What is stable sorting?
• When two equal elements appear in the same order in the sorted data, without changing their relative position, the sort is called stable.
• Example: merge sort, insertion sort, bubble sort.

What is unstable sorting?
• When two equal elements may appear in a different order in the sorted data, the sort is called unstable.
• Example: quick sort, heap sort, shell sort.

Bubble Sort:

Bubble sort is a sorting algorithm that compares two adjacent elements and swaps them until they are in the intended order. Just like air bubbles in water rise up to the surface, the largest element of the array moves to the end in each iteration. Therefore, it is called bubble sort.

Suppose we are trying to sort the elements in ascending order.

1. First Iteration (Compare and Swap)
1. Starting from the first index, compare the first and the second elements.
2. If the first element is greater than the second element, they are swapped.
3. Now, compare the second and the third elements. Swap them if they are not in order.
4. The above process goes on until the last element.

void bubbleSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++)
        for (int j = 0; j < n - i - 1; j++)
            if (arr[j] > arr[j + 1])
            {
                /* swap arr[j] and arr[j+1] */
                int temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
            }
}

Bubble Sort Complexity

Time Complexity
Best       O(n)
Worst      O(n^2)
Average    O(n^2)
Space Complexity    O(1)
Stability           Yes

Selection Sort

Selection sort is a simple, in-place sorting algorithm that works by
repeatedly selecting the smallest (or largest) element from the unsorted
portion of the list and moving it to the end of the sorted portion. This
process is repeated for the remaining unsorted portion until the entire list
is sorted.

Algorithm

Step 1 − Set MIN to location 0
Step 2 − Search the minimum element in the list
Step 3 − Swap with value at location MIN
Step 4 − Increment MIN to point to next element
Step 5 − Repeat until list is sorted
Pseudocode

procedure selection sort
   list : array of items
   n    : size of list

   for i = 1 to n - 1
      /* set current element as minimum */
      min = i

      /* check the element to be minimum */
      for j = i+1 to n
         if list[j] < list[min] then
            min = j
         end if
      end for

      /* swap the minimum element with the current element */
      if min != i then
         swap list[min] and list[i]
      end if
   end for

end procedure
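A direct C translation of this pseudocode (0-indexed, so the outer loop runs from 0 to n-2):

/* Selection sort: repeatedly move the smallest remaining element
   to the front of the unsorted portion. */
void selectionSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        int min = i;                    /* assume current element is minimum */
        for (int j = i + 1; j < n; j++) /* scan the unsorted portion */
            if (arr[j] < arr[min])
                min = j;
        if (min != i)                   /* swap the minimum into position i */
        {
            int temp = arr[i];
            arr[i] = arr[min];
            arr[min] = temp;
        }
    }
}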

How Selection Sort Works?

Consider the following depicted array as an example.

For the first position in the sorted list, the whole list is scanned
sequentially. The first position currently holds 14; we search the whole list
and find that 10 is the lowest value.

So we swap 14 with 10. After one iteration 10, which happens to be the
minimum value in the list, appears in the first position of the sorted list.
For the second position, where 33 is residing, we start scanning the rest of the
list in a linear manner.

We find that 14 is the second lowest value in the list and it should appear at the
second place. We swap these values.

After two iterations, two least values are positioned at the beginning in a sorted
manner.

The same process is applied to the rest of the items in the array.

Selection Sort Complexity

Time Complexity
Best       O(n^2)
Worst      O(n^2)
Average    O(n^2)
Space Complexity    O(1)
Stability           No
Quick Sort

Quick sort is a highly efficient sorting algorithm. Like merge sort, this
algorithm is also based on the divide and conquer technique and uses the
comparison method. Quick sort is an ideal solution for a large set of data. The
sorting algorithm first divides the array into two sub-arrays by comparing all
elements with a specified value, called the Pivot value.

The two sub-arrays are divided in a way that one of them holds smaller values
than the pivot value, and the other holds greater values than the pivot value.
There are different ways to implement quick sort:

1. Always pick the last element as a pivot (we'll use this in our quick sort in
C example).
2. Always pick the first element as the pivot.
3. Pick median as the pivot.
4. Pick a random element as the pivot.

1. An array is divided into sub-arrays by selecting a pivot element (an
element selected from the array). While dividing the array, the pivot element
should be positioned in such a way that elements less than the pivot are kept
on the left side and elements greater than the pivot are on the right side of
the pivot.
2. The left and right subarrays are also divided using the same approach.
This process continues until each subarray contains a single element.
3. At this point, elements are already sorted. Finally, elements are combined
to form a sorted array.
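A compact C sketch of this scheme, picking the last element as the pivot (option 1 above); the partition step here follows the common Lomuto scheme:

/* Partition arr[low..high] around the last element as pivot;
   return the pivot's final index. */
int partition(int arr[], int low, int high)
{
    int pivot = arr[high], i = low - 1;
    for (int j = low; j < high; j++)
        if (arr[j] < pivot)            /* smaller elements go to the left side */
        {
            i++;
            int t = arr[i]; arr[i] = arr[j]; arr[j] = t;
        }
    int t = arr[i + 1]; arr[i + 1] = arr[high]; arr[high] = t;
    return i + 1;
}

void quickSort(int arr[], int low, int high)
{
    if (low < high)
    {
        int p = partition(arr, low, high);
        quickSort(arr, low, p - 1);    /* sort the smaller-than-pivot side */
        quickSort(arr, p + 1, high);   /* sort the greater-than-pivot side */
    }
}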
Pictorial presentation - Quick Sort algorithm :
Quicksort Complexity

Time Complexity
Best       O(n log n)
Worst      O(n^2)
Average    O(n log n)
Space Complexity    O(log n)
Stability           No

Worst Case Complexity [Big-O]: O(n2)


It occurs when the pivot element picked is either the greatest or the smallest
element.

This condition leads to the case in which the pivot element lies at an extreme
end of the sorted array. One sub-array is always empty and the other sub-array
contains n - 1 elements. Thus, quicksort is called only on this sub-array.

NOTE : For program refer lab program 5

Merge Sort

Merge Sort is a Divide and Conquer algorithm. It divides the input array into
two halves, calls itself for the two halves, and then it merges the two sorted
halves. The merge() function is used for merging two halves. The merge(arr,
l, m, r) is a key process that assumes that arr[l..m] and arr[m+1..r] are sorted
and merges the two sorted sub-arrays into one.

Algorithm:
Step 1: Start
Step 2: Declare an array and left, right, mid variables
Step 3: Perform the merge sort function.
mergesort(array, left, right)
    if left >= right
        return
    mid = (left + right) / 2
    mergesort(array, left, mid)
    mergesort(array, mid+1, right)
    merge(array, left, mid, right)
Step 4: Stop
Here, a problem is divided into multiple sub-problems. Each sub-problem is
solved individually. Finally, sub-problems are combined to form the final
solution.
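A minimal C sketch of the algorithm above (the temporary buffer tmp is our own scratch array; using <= in the comparison keeps the sort stable, matching the table below):

#include <string.h>

/* Merge the sorted halves arr[l..m] and arr[m+1..r] into one sorted run. */
void merge(int arr[], int l, int m, int r)
{
    int tmp[r - l + 1];                 /* scratch space for the merged run */
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r)
        tmp[k++] = (arr[i] <= arr[j]) ? arr[i++] : arr[j++];
    while (i <= m) tmp[k++] = arr[i++]; /* copy any leftovers */
    while (j <= r) tmp[k++] = arr[j++];
    memcpy(&arr[l], tmp, k * sizeof(int));
}

void mergeSort(int arr[], int l, int r)
{
    if (l >= r) return;                 /* zero or one element: already sorted */
    int m = (l + r) / 2;
    mergeSort(arr, l, m);
    mergeSort(arr, m + 1, r);
    merge(arr, l, m, r);
}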
Merge Sort Complexity

Time Complexity
Best       O(n log n)
Worst      O(n log n)
Average    O(n log n)
Space Complexity    O(n)
Stability           Yes
Insertion Sort

Insertion sort is an algorithm used to sort a collection of elements in
ascending or descending order. The basic idea behind the algorithm is to
divide the list into two parts: a sorted part and an unsorted part.
Initially, the sorted part contains only the first element of the list, while the
rest of the list is in the unsorted part. The algorithm then iterates through each
element in the unsorted part, picking one at a time, and inserts it into its
correct position in the sorted part.
To do this, the algorithm compares the current element with each element in
the sorted part, starting from the rightmost element. It continues to move to the
left until it finds an element that is smaller (if sorting in ascending order) or
larger (if sorting in descending order) than the current element.
Once the correct position has been found, the algorithm shifts all the elements
to the right of that position to make room for the current element, and then
inserts the current element into its correct position.
This process continues until all the elements in the unsorted part have been
inserted into their correct positions in the sorted part, resulting in a fully sorted
list.
One of the advantages of insertion sort is that it is an in-place sorting
algorithm, which means that it does not require any additional storage space
other than the original list. Additionally, it has a time complexity of O(n^2),
which makes it suitable for small datasets, but not for large ones.
Overall, insertion sort is a simple, yet effective sorting algorithm that can be
used for small datasets or as a part of more complex algorithms.
Insertion Sort Algorithm
To sort an array of size N in ascending order:
 Iterate from arr[1] to arr[N-1] over the array.
 Compare the current element (key) to its predecessor.
 If the key element is smaller than its predecessor, compare it to the
elements before it. Move the greater elements one position up to make space
for the key element.
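A short C sketch of this shifting process:

/* Insertion sort: grow a sorted prefix by inserting each new element
   into its correct position among the already-sorted elements. */
void insertionSort(int arr[], int n)
{
    for (int i = 1; i < n; i++)
    {
        int key = arr[i], j = i - 1;
        while (j >= 0 && arr[j] > key) /* shift larger elements one place right */
        {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;              /* drop the key into the gap */
    }
}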

Example:

To understand the working of the insertion sort algorithm, let's take an unsorted
array. It will be easier to understand the insertion sort via an example.

Let the elements of array are -

Initially, the first two elements are compared in insertion sort.


Here, 31 is greater than 12. That means both elements are already in ascending
order. So, for now, 12 is stored in a sorted sub-array.

Now, move to the next two elements and compare them.

Here, 25 is smaller than 31. So, 31 is not at correct position. Now, swap 31 with
25. Along with swapping, insertion sort will also check it with all elements in
the sorted array.

For now, the sorted array has only one element, i.e. 12. So, 25 is greater than
12. Hence, the sorted array remains sorted after swapping.

Now, two elements in the sorted array are 12 and 25. Move forward to the next
elements that are 31 and 8.

Both 31 and 8 are not sorted. So, swap them.

After swapping, elements 25 and 8 are unsorted.

So, swap them.


Now, elements 12 and 8 are unsorted.

So, swap them too.

Now, the sorted array has three items that are 8, 12 and 25. Move to the next
items that are 31 and 32.

Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.

Move to the next elements that are 32 and 17.

17 is smaller than 32. So, swap them.

Swapping makes 31 and 17 unsorted. So, swap them too.

Now, swapping makes 25 and 17 unsorted. So, perform swapping again.

Now, the array is completely sorted.


Insertion Sort Complexity

Time Complexity
Best       O(n)
Worst      O(n^2)
Average    O(n^2)
Space Complexity    O(1)
Stability           Yes

Heap sort

Heap sort is a comparison-based sorting technique based on the Binary Heap
data structure. It is similar to selection sort, where we first find the
minimum element and place the minimum element at the beginning. We repeat the
same process for the remaining elements.

Heap Sort Algorithm


To solve the problem follow the below idea:
First convert the array into a max-heap using heapify, then one by one swap
the root node of the max-heap with the last node in the heap, shrink the
heap, and heapify the root again. Repeat this process while the size of the
heap is greater than 1.
 Build a max-heap from the given input array.
 Repeat the following steps until the heap contains only one element:
 Swap the root element of the heap (which is the largest element)
with the last element of the heap.
 Remove the last element from the heap (it is now in its correct,
final position).
 Heapify the remaining elements of the heap.
 After the loop finishes, the array is sorted in ascending order.

In heap sort, basically, there are two phases involved in the sorting of elements.
By using the heap sort algorithm, they are as follows -

o The first step includes the creation of a heap by adjusting the elements of
the array.
o After the creation of heap, now remove the root element of the heap
repeatedly by shifting it to the end of the array, and then store the heap
structure with the remaining elements.
What is Heapify?

Heapify is the process of rearranging the elements of an array (viewed as a
complete binary tree) so that they satisfy the heap property.
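Putting the two phases together, a minimal C sketch of heapify and heap sort (0-indexed array, so the children of index i sit at 2i+1 and 2i+2):

/* Sift the element at index i down until the subtree rooted at i
   satisfies the max-heap property (n = current heap size). */
void heapify(int arr[], int n, int i)
{
    int largest = i, l = 2 * i + 1, r = 2 * i + 2;
    if (l < n && arr[l] > arr[largest]) largest = l;
    if (r < n && arr[r] > arr[largest]) largest = r;
    if (largest != i)
    {
        int t = arr[i]; arr[i] = arr[largest]; arr[largest] = t;
        heapify(arr, n, largest);      /* continue sifting down */
    }
}

void heapSort(int arr[], int n)
{
    /* Phase 1: build a max-heap from the unsorted array. */
    for (int i = n / 2 - 1; i >= 0; i--)
        heapify(arr, n, i);
    /* Phase 2: repeatedly move the root (maximum) to the end and re-heapify. */
    for (int i = n - 1; i > 0; i--)
    {
        int t = arr[0]; arr[0] = arr[i]; arr[i] = t;
        heapify(arr, i, 0);
    }
}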

Now let's see the working of heap sort in detail by using an example. To
understand it more clearly, let's take an unsorted array and try to sort it using
heap sort.

First, we have to construct a heap from the given array and convert it into
max heap.

After converting the given heap into max heap, the array elements are -

Next, we have to delete the root element (89) from the max heap. To
delete this node, we have to swap it with the last node, i.e. (11). After
deleting the root element, we again have to heapify it to convert it into
max heap.
After swapping the array element 89 with 11, and converting the heap
into max-heap, the elements of array are -

In the next step, again, we have to delete the root element (81) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (54). After deleting the root element, we again have to heapify it to
convert it into max heap.

After swapping the array element 81 with 54 and converting the heap into
max-heap, the elements of array are -

In the next step, we have to delete the root element (76) from the max
heap again. To delete this node, we have to swap it with the last node,
i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.

After swapping the array element 76 with 9 and converting the heap into
max-heap, the elements of array are -

In the next step, again we have to delete the root element (54) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (14). After deleting the root element, we again have to heapify it to
convert it into max heap.
After swapping the array element 54 with 14 and converting the heap into
max-heap, the elements of array are -

In the next step, again we have to delete the root element (22) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (11). After deleting the root element, we again have to heapify it to
convert it into max heap.

After swapping the array element 22 with 11 and converting the heap into
max-heap, the elements of array are -

In the next step, again we have to delete the root element (14) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.

After swapping the array element 14 with 9 and converting the heap into
max-heap, the elements of array are -
In the next step, again we have to delete the root element (11) from the
max heap. To delete this node, we have to swap it with the last node,
i.e. (9). After deleting the root element, we again have to heapify it to
convert it into max heap.

After swapping the array element 11 with 9, the elements of array are -

Now, heap has only one element left. After deleting it, heap will be
empty.

After completion of sorting, the array elements are -

Now, the array is completely sorted.

Heap Sort Complexity

Time Complexity
Best       O(n log n)
Worst      O(n log n)
Average    O(n log n)
Space Complexity    O(1)
Stability           No
Radix Sort

Radix Sort is a linear sorting algorithm that sorts elements by processing them
digit by digit. It is an efficient sorting algorithm for integers or strings with
fixed-size keys.
Rather than comparing elements directly, Radix Sort distributes the elements
into buckets based on each digit's value. By repeatedly sorting the elements by
their significant digits, from the least significant to the most significant,
Radix Sort achieves the final sorted order.

The key idea behind Radix Sort is to exploit the concept of place value. It
assumes that sorting numbers digit by digit will eventually result in a fully
sorted list. Radix Sort can be performed using different variations, such as
Least Significant Digit (LSD) Radix Sort or Most Significant Digit (MSD)
Radix Sort.

Algorithm
radixSort(arr)
    max = largest element in the given array
    d = number of digits in the largest element (or, max)
    Now, create 10 buckets, numbered 0 - 9
    for i -> 1 to d
        sort the array elements using counting sort (or any stable sort)
        according to the digits at the ith place
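A minimal C sketch of LSD radix sort for non-negative integers, using one stable counting-sort pass per digit (countingPass is our own helper name):

/* One pass of counting sort on the digit at the given place value
   (exp = 1 for units, 10 for tens, ...); stable, as radix sort requires. */
void countingPass(int arr[], int n, int exp)
{
    int output[n], count[10] = {0};
    for (int i = 0; i < n; i++) count[(arr[i] / exp) % 10]++;
    for (int d = 1; d < 10; d++) count[d] += count[d - 1]; /* prefix sums */
    for (int i = n - 1; i >= 0; i--)       /* back-to-front keeps it stable */
        output[--count[(arr[i] / exp) % 10]] = arr[i];
    for (int i = 0; i < n; i++) arr[i] = output[i];
}

/* LSD radix sort: sort by units, then tens, then hundreds, ... */
void radixSort(int arr[], int n)
{
    int max = arr[0];
    for (int i = 1; i < n; i++)
        if (arr[i] > max) max = arr[i];
    for (int exp = 1; max / exp > 0; exp *= 10)
        countingPass(arr, n, exp);
}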

How does Radix Sort Algorithm work?


To perform radix sort on the array [170, 45, 75, 90, 802, 24, 2, 66], we follow
these steps:


Step 1: Find the largest element in the array, which is 802. It has three digits,
so we will iterate three times, once for each significant place.
Step 2: Sort the elements based on the unit place digits (X=0). We use a stable
sorting technique, such as counting sort, to sort the digits at each significant
place.
Sorting based on the unit place:
 Perform counting sort on the array based on the unit place digits.
 The sorted array based on the unit place is [170, 90, 802, 2, 24, 45, 75, 66].


Step 3: Sort the elements based on the tens place digits.


Sorting based on the tens place:
 Perform counting sort on the array based on the tens place digits.
 The sorted array based on the tens place is [802, 2, 24, 45, 66, 170, 75, 90].


Step 4: Sort the elements based on the hundreds place digits.


Sorting based on the hundreds place:
 Perform counting sort on the array based on the hundreds place digits.
 The sorted array based on the hundreds place is [2, 24, 45, 66, 75, 90, 170,
802].

Step 5: The array is now sorted in ascending order.


The final sorted array using radix sort is [2, 24, 45, 66, 75, 90, 170, 802].


Radix Sort Complexity

Time Complexity
Best       O(n+k)
Worst      O(n+k)
Average    O(n+k)
Space Complexity    O(max)
Stability           Yes
Shell Sort:

Shell sort is the generalization of insertion sort, which overcomes the drawbacks
of insertion sort by comparing elements separated by a gap of several positions.

In insertion sort, at a time, elements can be moved ahead by one position only.
To move an element to a far-away position, many movements are required that
increase the algorithm's execution time. But shell sort overcomes this drawback
of insertion sort. It allows the movement and swapping of far-away elements as
well.

Algorithm:
Step 1 − Start
Step 2 − Initialize the gap size, h
Step 3 − Divide the list into smaller sub-lists, each formed by elements an
interval of h apart
Step 4 − Sort these sub-lists using insertion sort
Step 5 − Reduce the gap and repeat from Step 3 until the list is sorted
Step 6 − Print the sorted list
Step 7 − Stop
This algorithm uses insertion sort on widely spaced elements first to sort
them, and then sorts the less widely spaced elements. This spacing is termed
the interval. This interval is calculated based on Knuth's formula as −

Knuth's Formula

h = h * 3 + 1
where: h is the interval with initial value 1

This algorithm is quite efficient for medium-sized data sets. Its average and
worst-case complexity depend on the gap sequence used; with Knuth's sequence
the worst case is about O(n^(3/2)), where n is the number of items, and the
auxiliary space complexity is O(1).
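A short C sketch of shell sort using Knuth's gap sequence described above:

/* Shell sort with Knuth's gap sequence h = 3h + 1 (1, 4, 13, 40, ...). */
void shellSort(int arr[], int n)
{
    int h = 1;
    while (h < n / 3) h = 3 * h + 1;   /* largest Knuth gap below n */
    for (; h >= 1; h /= 3)             /* shrink the gap each round */
        for (int i = h; i < n; i++)    /* gapped insertion sort */
        {
            int key = arr[i], j = i;
            while (j >= h && arr[j - h] > key)
            {
                arr[j] = arr[j - h];   /* shift far-away elements in one move */
                j -= h;
            }
            arr[j] = key;
        }
}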

How Shell Sort Works?

Let us consider the following example to have an idea of how shell sort works.
We take the same array we have used in our previous examples. For our
example and ease of understanding, we take the interval of 4. Make a virtual
sub-list of all values located at the interval of 4 positions. Here these values are
{35, 14}, {33, 19}, {42, 27} and {10, 44}
We compare values in each sub-list and swap them (if necessary) in the original
array. After this step, the new array should look like this −

Then, we take an interval of 2, and this gap generates two sub-lists - {14, 27, 35,
42}, {19, 10, 33, 44}

We compare and swap the values, if required, in the original array. After this
step, the array should look like this −

Finally, we sort the rest of the array using interval of value 1. Shell sort uses
insertion sort to sort the array.

Following is the step-by-step depiction −


Shell Sort Complexity

Time Complexity
Best       O(n log n)
Worst      O(n^2)
Average    O(n log n)
Space Complexity    O(1)
Stability           No
Comparison of sorting methods

Analysis of sorting techniques :

 When the array is almost sorted, insertion sort can be preferred.
 When the order of input is not known, merge sort is preferred as it has a
worst-case time complexity of n log n and it is stable as well.
 When the array is already sorted, insertion and bubble sort give complexity
of n, but quick sort gives complexity of n^2.

Comparison of time and space complexities

In-place and Stable sorting Algorithms

 A sorting algorithm is In-place if the algorithm does not use extra space
for manipulating the input but may require a small though nonconstant
extra space for its operation. Or we can say, a sorting algorithm sorts in-
place if only a constant number of elements of the input array are ever
stored outside the array.
 A sorting algorithm is stable if it does not change the order of elements
with the same value.
Hashing:

Dictionaries:

Dictionary is one of the important data structures that is usually used to
store data in the key-value format. Every element present in a dictionary must
have a key, and some value is associated with that particular key. In other
words, we can also say that the dictionary data structure is used to store the
data in key-value pairs. Other names for the dictionary data structure are
associative array, map, and symbol table, but broadly it is referred to as
dictionary.

Example:

For example, the results of a classroom test could be represented as a
dictionary with pupils' names as keys and their scores as the values:

results = {'Detra'   : 17,
           'Nova'    : 84,
           'Charlie' : 22,
           'Henry'   : 75,
           'Roxanne' : 92,
           'Elsa'    : 29}
The various operations that are performed on a Dictionary or associative array
are:

o Add or Insert: In the Add or Insert operation, a new pair of keys and
values is added in the Dictionary or associative array object.
o Replace or reassign: In the Replace or reassign operation, the already
existing value that is associated with a key is changed or modified. In
other words, a new value is mapped to an already existing key.
o Delete or remove: In the Delete or remove operation, the already present
element is unmapped from the Dictionary or associative array object.
o Find or Lookup: In the Find or Lookup operation, the value associated
with a key is searched by passing the key as a search argument.

HashTable Representation

Hash tables are a type of data structure in which the address or the index
value of the data element is generated from a hash function. That makes
accessing the data faster, as the index value behaves as a key for the data
value. In other words, a hash table stores key-value pairs, but the key is
generated through a hashing function.

So the search and insertion function of a data element becomes much faster as
the key values themselves become the index of the array which stores the data.

A hash function is a function that can map a piece of data of any length to a
fixed-length value, called hash.

Hash functions have three major characteristics:

 They are fast to compute: calculating the hash of a piece of data has to
be a fast operation.
 They are deterministic: the same string will always produce the same
hash.
 They produce fixed-length values: it doesn't matter if your input is one,
ten, or ten thousand bytes, the resulting hash will always be of a fixed,
predetermined length.

Another characteristic that is quite common in hash functions is that they
often are one-way functions: thanks to a voluntary data loss implemented in
the function, you can get a hash from a string but you can't get the original
string from a hash. This is not a mandatory feature for every hash function
but becomes important when they have to be cryptographically secure.
Hashing

Hashing is a searching technique that takes constant time on average: the
time complexity of hashing is O(1). So far, we have seen two searching
techniques, i.e., linear search and binary search. The worst-case time
complexity is O(n) in linear search, and O(log n) in binary search. In both
searching techniques, the searching time depends upon the number of elements,
but we want a technique that takes constant time. Hashing provides exactly
that.

In Hashing technique, the hash table and hash function are used. Using the hash
function, we can calculate the address at which the value can be stored.

The main idea behind hashing is to create the (key/value) pairs. If the key
is given, then the algorithm computes the index at which the value would be
stored. It can be written as:

index = hash(key)

There are three ways of calculating the hash function:

o Division method
o Folding method
o Mid square method

In the division method, the hash function can be defined as:

h(ki) = ki % m;

where m is the size of the hash table.

For example, if the key value is 6 and the size of the hash table is 10. When we
apply the hash function to key 6 then the index would be:

h(6) = 6%10 = 6

The index is 6 at which the value is stored.
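As a one-line C sketch of the division method (names are illustrative):

/* Division-method hash: map an integer key into table slots 0..m-1. */
int hash(int key, int m) { return key % m; }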


Collision

When two different keys hash to the same index, the problem that occurs
between the two values is known as a collision. In the above example, the
value is stored at index 6. If the key value is 26, then the index would be:

h(26) = 26%10 = 6

Therefore, two values are stored at the same index, i.e., 6, and this leads to
the collision problem. To resolve these collisions, we have some techniques
known as collision resolution techniques.

The following are the collision techniques:

o Open Hashing: It is also known as closed addressing.


o Closed Hashing: It is also known as open addressing.

Open Hashing

The first collision resolution or handling technique, " Open Hashing ", is
popularly known as Separate Chaining. In this technique, each slot of the
hash table points to a linked list known as a chain. It is one of the
techniques most used by programmers to handle collisions. When a number of
elements are hashed into the index of a single slot, they are inserted into a
singly-linked list. This singly-linked list is the linked list which we refer
to as a chain in the Open Hashing technique.

All key-value pairs mapping to the same index will be stored in the linked list of
that index.
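A minimal C sketch of separate chaining (structure and function names are illustrative; keys only, no values, to keep it short):

#include <stdlib.h>

#define TABLE_SIZE 7

/* One chain node: a key plus a link to the next key in the same bucket. */
struct chainNode {
    int key;
    struct chainNode *next;
};

struct chainNode *table[TABLE_SIZE];   /* each slot heads a linked list */

/* Insert a key at the head of its bucket's chain: O(1). */
void insertChain(int key)
{
    struct chainNode *node = malloc(sizeof *node);
    node->key = key;
    node->next = table[key % TABLE_SIZE];
    table[key % TABLE_SIZE] = node;
}

/* Search walks the chain of the key's bucket. */
int searchChain(int key)
{
    for (struct chainNode *p = table[key % TABLE_SIZE]; p; p = p->next)
        if (p->key == key) return 1;
    return 0;
}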

The benefits of chaining


 Through chaining, insertion in a hash table always occurs in O(1) since
linked lists allow insertion in constant time.
 Theoretically, a chained hash table can grow infinitely as long as there is
enough space.
 A hash table which uses chaining will never need to be resized.

Example: Let us consider a simple hash function as “key mod 7” and a
sequence of keys as 50, 700, 76, 85, 92, 73, 101.
Advantages:
 Simple to implement.
 Hash table never fills up, we can always add more elements to the chain.
 Less sensitive to the hash function or load factors.
 It is mostly used when it is unknown how many and how frequently keys
may be inserted or deleted.

Disadvantages:
 The cache performance of chaining is not good as keys are stored using a
linked list. Open addressing provides better cache performance as
everything is stored in the same table.
 Wastage of Space (Some Parts of the hash table are never used)
 If the chain becomes long, then search time can become O(n) in the worst
case
 Uses extra space for links

Closed Hashing:
The second collision resolution technique, Closed Hashing, is another way of
dealing with collisions, serving the same purpose as the Separate Chaining
process. In Open Addressing, the hash table alone stores all of its elements.
The size of the table should always be greater than or equal to the total
number of keys at all times (we can also increase the size of the table by
copying over the existing data whenever needed). This mechanism is referred
to as Closed Hashing. The process of searching through the table for the next
available slot is called probing.
Several techniques perform the implementation of Closed Hashing:

1. Linear Probing: In linear probing, the hash table is examined
sequentially, starting from the position given by the hash. If the slot
obtained after the calculation is already occupied, then we look for the next
one. The rehashing function is " key = rehash(n+1) % table-size ", i.e., the
space between two probe positions is 1.

Let us see linear probing for a slot index " hash(a) ", which is computed
using a hash function. It is one of the techniques with the best cache
performance.

Functions Used in Closed Hashing:

1. Insert( k ): Keep probing until an empty slot is found, then place the
key " k " in the first empty slot found.
2. Search( k ): Probe each slot until either a slot containing the key k
or an empty slot is found.
3. Delete( k ): Deletion is the interesting case. If we simply empty a
key's slot, a later search may fail by stopping early at the emptied
slot. Instead, the slots of deleted keys are marked as "deleted".

Let us consider a simple hash function as “key mod 7” and a sequence of keys
as 50, 700, 76, 85, 92, 73, 101,

which means hash(key) = key % S, here S = size of the table = 7, indexed from
0 to 6. We can define the hash function as per our choice when creating a
hash table, although it is then fixed with that pre-defined formula.
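A minimal C sketch of insertion with linear probing under these assumptions (a table of size 7, with -1 marking an empty slot):

#define SIZE 7
#define EMPTY -1

int table[SIZE] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

/* Insert with linear probing: try hash(key), then the next slot,
   wrapping around, until an empty slot is found.
   (Assumes the table is not already full.) */
void insertLinear(int key)
{
    int i = key % SIZE;
    while (table[i] != EMPTY)      /* collision: step to the next slot */
        i = (i + 1) % SIZE;
    table[i] = key;
}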
Applications of linear probing:
Linear probing is a collision handling technique used in hashing, where the
algorithm looks for the next available slot in the hash table to store the
collided key. Some of the applications of linear probing include:
 Symbol tables: Linear probing is commonly used in symbol tables, which
are used in compilers and interpreters to store variables and their associated
values. Since symbol tables can grow dynamically, linear probing can be
used to handle collisions and ensure that variables are stored efficiently.
 Caching: Linear probing can be used in caching systems to store frequently
accessed data in memory. When a cache miss occurs, the data can be
loaded into the cache using linear probing, and when a collision occurs, the
next available slot in the cache can be used to store the data.
 Databases: Linear probing can be used in databases to store records and
their associated keys. When a collision occurs, linear probing can be used
to find the next available slot to store the record.
 Compiler design: Linear probing can be used in compiler design to
implement symbol tables, error recovery mechanisms, and syntax analysis.
 Spell checking: Linear probing can be used in spell-checking software to
store the dictionary of words and their associated frequency counts. When a
collision occurs, linear probing can be used to store the word in the next
available slot.
Overall, linear probing is a simple and efficient method for handling collisions
in hash tables, and it can be used in a variety of applications that require
efficient storage and retrieval of data.
Challenges in Linear Probing :
 Primary Clustering: One of the problems with linear probing is primary
clustering: many consecutive elements form groups, and it then starts taking
more time to find a free slot or to search for an element.
 Secondary Clustering: Secondary clustering is less severe; two records
share the same collision chain (probe sequence) only if their initial
position is the same.

Clustering is the main problem in linear probing. If clustering can be
reduced within this mechanism, then it can be considered one of the best
collision resolution techniques.

2. Quadratic Probing: In quadratic probing, the interval between probed
positions grows compared to linear probing, since the probe sequence is
quadratic. The primary clustering that occurs with linear probing can be
greatly reduced by using the quadratic probing technique. When the current
iteration is " i ", the slot at offset i^2 from the original hash position is
considered as the candidate position for that key. Other positions are
checked only when the position we are trying is already occupied. It is an
effective method for a hash table with closed hashing, with average cache
performance and only a milder form of clustering.

Difficulties faced with Quadratic Probing:

It suffers from secondary clustering: two keys have the same probe sequence
whenever they hash to the same initial position.

3. Double hashing

In this resolution technique, a second hash function is used, created
especially for the double hashing mechanism. In this technique, the
clustering that forms between the keys is handled efficiently and is further
reduced. The increment between probed key positions is computed by this
second function: the second hash value is multiplied by the probe number
" i ", added to the first hash value, and then the modulo operation is
performed.

Difficulties in Double hashing:

Compared to other techniques, double hashing has poorer cache performance but
does not have any clustering issues. The time required for the entire process
is higher as there are two hash functions to be computed, and this is what
causes the poorer cache performance. Other than this, there is no problem
with double hashing.

The intervals that lie between probes are computed by another hash function.
Double hashing is a technique that reduces clustering in an optimized way. In
this technique, the increments for the probing sequence are computed by using
another hash function. We use another hash function hash2(x) and look for the
i*hash2(x) slot in the ith iteration.

let hash(x) be the slot index computed using the hash function
If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S
If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S
If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S
Example: Insert the keys 27, 43, 692, 72 into the Hash Table of size 7. where
first hash-function is h1(k) = k mod 7 and second hash-function is h2(k) = 1 +
(k mod 5)
 Step 1: Insert 27
 27 % 7 = 6, location 6 is empty so insert 27 into 6 slot.

Insert key 27 in the hash table

 Step 2: Insert 43
 43 % 7 = 1, location 1 is empty so insert 43 into 1 slot.

Insert key 43 in the hash table

 Step 3: Insert 692


 692 % 7 = 6, but location 6 is already being occupied and this is a
collision
 So we need to resolve this collision using double hashing.
hnew = [h1(692) + i * h2(692)] % 7
     = [6 + 1 * (1 + 692 % 5)] % 7
     = 9 % 7
     = 2
Now, as slot 2 is empty, we can insert 692 into slot 2.

Insert key 692 in the hash table

 Step 4: Insert 72
 72 % 7 = 2, but location 2 is already being occupied and this is a
collision.
 So we need to resolve this collision using double hashing.
hnew = [h1(72) + i * h2(72)] % 7
     = [2 + 1 * (1 + 72 % 5)] % 7
     = 5 % 7
     = 5
Now, as slot 5 is empty, we can insert 72 into slot 5.
Insert key 72 in the hash table
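The probe sequence from this example can be written as a small C sketch (h1 and h2 as defined above; the function names are ours):

#define S 7

/* The two hash functions from the worked example. */
int h1(int k) { return k % S; }
int h2(int k) { return 1 + (k % 5); }

/* Probe sequence: (h1(k) + i*h2(k)) % S for i = 0, 1, 2, ... */
int probe(int k, int i) { return (h1(k) + i * h2(k)) % S; }

For instance, probe(692, 1) evaluates to 2, matching Step 3 above.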

Static and Dynamic Hashing

What is Static Hashing?

It is a hashing technique that enables users to look up a fixed data set.
Meaning, the data in the directory is not changing; it is "Static" or fixed.
In this hashing technique, the resulting number of data buckets in memory
remains constant.

What is Dynamic Hashing?

It is a hashing technique that enables users to look up a dynamic data set.
That is, the data set is modified by adding data to it or removing data from
it, on demand, hence the name 'Dynamic' hashing. Thus, the resulting number
of data buckets keeps increasing or decreasing depending on the number of
records.

In this hashing technique, the resulting number of data buckets in memory is
ever-changing.
Differences between Static and Dynamic Hashing

Here are some prominent differences by which Static Hashing is different than
Dynamic Hashing –

Key Factor            Static Hashing                     Dynamic Hashing

Form of Data          Fixed-size, non-changing data.     Variable-size, changing data.

Result                The resulting data bucket is of    The resulting data bucket is of
                      fixed length.                      variable length.

Challenge of Bucket   Bucket overflow can arise often,   Bucket overflow can occur very
Overflow              depending upon memory size.        late or doesn't occur at all.

Complexity            Simple                             Complex

Solved examples:

1)

Using the hash function Key % 10 and linear probing as the collision
resolution technique:

43 % 10 = 3 so 43 will go to bucket 3

165 % 10 = 5 so 165 will go to bucket 5

62 % 10 = 2 so 62 will go to bucket 2
123 % 10 = 3 so 123 will try to go to bucket 3 but 43 is already there so
collision happens and hence using linear probing it will go to next available
bucket so, it goes to bucket 4.

152 % 10 = 2 so 152 will try to go to bucket 2, but 62 is already at bucket 2,
so a collision happens; using linear probing it goes to the next available
bucket, which is bucket 6 (buckets 3, 4 and 5 are already occupied).

so after inserting all keys our hash table will look like

Hash Table
bucket no. key
0
1
2 62
3 43
4 123
5 165
6 152
7
8
9

so key 152 is at bucket 6.

2) The keys 12, 18, 13, 2, 3, 23, 5 and 15 are inserted into an initially empty
hash table of length 10 using open addressing with hash function

h(k) = k mod 10 and linear probing. What is the resultant hash table?
UNIT IV TREE

Introduction

A tree is a nonlinear hierarchical data structure that consists of nodes
connected by edges.

key points of the Tree data structure.

o A tree data structure is defined as a collection of objects or entities
known as nodes that are linked together to represent or simulate
hierarchy.
o A tree data structure is a non-linear data structure because it does
not store in a sequential manner. It is a hierarchical structure as
elements in a Tree are arranged in multiple levels.
o In the Tree data structure, the topmost node is known as a root
node. Each node contains some data, and data can be of any type.
In the above tree structure, the node contains the name of the
employee, so the type of data would be a string.
o Each node contains some data and the link or reference of other
nodes that can be called children.

Terminology

Following are the important terms with respect to tree.

 Path − Path refers to the sequence of nodes along the edges of a tree.
 Root − The node at the top of the tree is called root. There is only
one root per tree and one path from the root node to any node.
 Parent − Any node except the root node has one edge upward to a
node called parent.
 Child − The node below a given node connected by its edge
downward is called its child node.
 Leaf − The node which does not have any child node is called the
leaf node.
 Subtree − Subtree represents the descendants of a node.
 Visiting − Visiting refers to checking the value of a node when
control is on the node.
 Traversing − Traversing means passing through nodes in a specific
order.
 Levels − Level of a node represents the generation of a node. If
the root node is at level 0, then its next child node is at level 1, its
grandchild is at level 2, and so on.
 Keys − Key represents a value of a node based on which a search
operation is to be carried out for a node.
 Sibling: The nodes that have the same parent are known as
siblings.
 Internal nodes: A node that has at least one child node is known as
an internal node.

Degree of a Node

The degree of a node is the total number of branches of that node.

Forest

A collection of disjoint trees is called a forest.

Ancestor node:- An ancestor of a node is any predecessor node on a path from
the root to that node. The root node doesn't have any ancestors.

Descendant: Any node reachable by moving downward from a given node (its
children, their children, and so on) is known as a descendant of that node.

Applications of trees

The following are the applications of trees:

o Storing naturally hierarchical data: Trees are used to store data in a
hierarchical structure. For example, the file system: the file system is
stored on the disc drive, and files and folders are naturally hierarchical
data stored in the form of trees.
o Organize data: It is used to organize data for efficient insertion,
deletion and searching. For example, a balanced binary search tree has
O(log N) time for searching an element.
o Trie: It is a special kind of tree that is used to store the dictionary.
It is a fast and efficient way for dynamic spell checking.
o Heap: It is also a tree data structure implemented using arrays. It
is used to implement priority queues.
o B-Tree and B+Tree: B-Tree and B+Tree are the tree data
structures used to implement indexing in databases.
o Routing table: The tree data structure is also used to store the
data in routing tables in the routers.

Types of Tree data structures:


 Binary tree: In a binary tree, each node can have a maximum of two
children linked to it. Some common types of binary trees include full
binary trees, complete binary trees, balanced binary trees, and
degenerate or pathological binary trees.
 Ternary Tree: A Ternary Tree is a tree data structure in which each
node has at most three child nodes, usually distinguished as “left”,
“mid” and “right”.
 N-ary Tree or Generic Tree: Generic trees are a collection of nodes
where each node is a data structure that consists of records and a list
of references to its children(duplicate references are not allowed).
Unlike the linked list, each node stores the address of multiple nodes.

Binary Tree representation:

A binary tree is a tree data structure in which each parent node can have
at most two children. Each node of a binary tree consists of three
items:

 data item

 address of left child

 address of right child


Binary Tree Representation

A node of a binary tree is represented by a structure containing a data part
and two pointers to other structures of the same type.

struct node
{
    int data;
    struct node *left;
    struct node *right;
};

Properties of Binary Tree

o At each level i, the maximum number of nodes is 2^i.

o The height of the tree is defined as the longest path from the root
node to the leaf node. The minimum number of nodes possible at
height h is equal to h+1.
o If the number of nodes is minimum, then the height of the tree
would be maximum. Conversely, if the number of nodes is
maximum, then the height of the tree would be minimum.

Types of Binary tree:

o Full/ proper/ strict Binary tree


o Complete Binary tree
o Perfect Binary tree
o Degenerate Binary tree
o Balanced Binary tree
Full/ proper/ strict Binary tree

The full binary tree is also known as a strict binary tree. The tree
can only be considered as the full binary tree if each node must
contain either 0 or 2 children. The full binary tree can also be
defined as the tree in which each node must contain 2 children
except the leaf nodes.

Let's look at the simple example of the Full Binary tree.

Properties of Full Binary Tree

o The number of leaf nodes is equal to the number of internal nodes plus 1.
o The maximum number of nodes is the same as the maximum number of nodes in a
binary tree, i.e., 2^(h+1) - 1.
o The minimum number of nodes in the full binary tree is 2*h - 1.
o The minimum height of the full binary tree is log2(n+1) - 1.
o The maximum height of the full binary tree can be computed as:

n = 2*h - 1
n + 1 = 2*h
h = (n + 1)/2

Complete Binary Tree

The complete binary tree is a tree in which all the nodes are completely
filled except the last level. In the last level, all the nodes must be as left
as possible. In a complete binary tree, the nodes should be added from
the left.

Let's create a complete binary tree.


Properties of Complete Binary Tree

o The maximum number of nodes in a complete binary tree is 2^(h+1) - 1.
o The minimum number of nodes in a complete binary tree is 2^h.
o The minimum height of a complete binary tree is log2(n+1) - 1.
o The maximum height of a complete binary tree is log2(n).

Perfect Binary Tree

A tree is a perfect binary tree if all the internal nodes have 2 children, and
all the leaf nodes are at the same level.

Note: All the perfect binary trees are the complete binary trees as well as
the full binary tree, but vice versa is not true, i.e., all complete binary
trees and full binary trees are the perfect binary trees.
Degenerate or Pathological Tree

A degenerate or pathological tree is a tree in which each parent node has
only a single child, either left or right.

Degenerate Binary Tree

5. Skewed Binary Tree

A skewed binary tree is a pathological/degenerate tree in which the tree is
either dominated by the left nodes or the right nodes. Thus, there are two
types of skewed binary tree: left-skewed binary tree and right-skewed binary
tree.

Skewed Binary Tree

6. Balanced Binary Tree

It is a type of binary tree in which the difference between the height of the
left and the right subtree for each node is either 0 or 1.
Figure: Balanced Binary Tree

AVL Tree
AVL tree is a self-balancing Binary Search Tree (BST) where the
difference between heights of left and right subtrees for any node cannot
be more than one.

Red-Black Tree
A red-black tree is a kind of self-balancing binary search tree where each
node has an extra bit, and that bit is often interpreted as the color (red
or black). These colors are used to ensure that the tree remains
balanced during insertions and deletions.

Binary Tree Traversals

The term 'tree traversal' means traversing or visiting each node of a tree.
There is a single way to traverse the linear data structure such as linked
list, queue, and stack. Whereas, there are multiple ways to traverse a
tree that are listed as follows -

 Preorder traversal
 Inorder traversal
 Postorder traversal
 Level order traversal

Preorder traversal

This technique follows the 'root left right' policy. It means that, first root
node is visited after that the left subtree is traversed recursively, and
finally, right subtree is recursively traversed. As the root node is
traversed before (or pre) the left and right subtree, it is called preorder
traversal.

So, in a preorder traversal, each node is visited before both of its subtrees.

Algorithm

Until all nodes of the tree are visited:

1. Step 1 - Visit the root node.
2. Step 2 - Traverse the left subtree recursively.
3. Step 3 - Traverse the right subtree recursively.

Postorder traversal

This technique follows the 'left-right root' policy. It means that the first
left subtree of the root node is traversed, after that recursively traverses
the right subtree, and finally, the root node is traversed. As the root node
is traversed after (or post) the left and right subtree, it is called postorder
traversal.

So, in a postorder traversal, each node is visited after both of its subtrees.

Algorithm

Until all nodes of the tree are visited:

1. Step 1 - Traverse the left subtree recursively.
2. Step 2 - Traverse the right subtree recursively.
3. Step 3 - Visit the root node.

Inorder traversal

This technique follows the 'left root right' policy. It means that first left
subtree is visited after that root node is traversed, and finally, the right
subtree is traversed. As the root node is traversed between the left and
right subtree, it is named inorder traversal.

So, in the inorder traversal, each node is visited in between its subtrees.

Algorithm

Until all nodes of the tree are visited:

1. Step 1 - Traverse the left subtree recursively.
2. Step 2 - Visit the root node.
3. Step 3 - Traverse the right subtree recursively.
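All three traversals can be written as short recursive C functions over the struct node defined earlier (a sketch; the nodes are assumed to hold int data):

#include <stdio.h>

void preorder(struct node *root)
{
    if (root == NULL) return;
    printf("%d ", root->data);    /* root */
    preorder(root->left);         /* left */
    preorder(root->right);        /* right */
}

void inorder(struct node *root)
{
    if (root == NULL) return;
    inorder(root->left);          /* left */
    printf("%d ", root->data);    /* root */
    inorder(root->right);         /* right */
}

void postorder(struct node *root)
{
    if (root == NULL) return;
    postorder(root->left);        /* left */
    postorder(root->right);       /* right */
    printf("%d ", root->data);    /* root */
}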

Level Order Traversal is defined as a technique to traverse a tree such that
all nodes present in the same level are traversed completely before
traversing the next level.
Complexity of Tree traversal techniques

The time complexity of the tree traversal techniques discussed above is O(n),
where 'n' is the size of the binary tree.

The space complexity of these techniques is O(1) if we do not consider the
stack space used by the recursive function calls. Otherwise, the space
complexity is O(h), where 'h' is the tree's height.

Example1:

Level order can be: 1 2 3 4 5 6 7 (from the root, level by level, left to
right)

Example-2:
Pre Order: 1 2 4 8 12 5 9 3 6 7 10 11
Post Order: 12 8 4 9 5 2 6 10 11 7 3 1
In Order: 8 12 4 2 9 5 1 6 3 10 7 11

Example-3

Preorder traversal: 27 14 10 19 35 31 42
Inorder traversal: 10 14 19 27 31 35 42
Post order traversal: 10 19 14 31 42 35 27

Binary Search Tree (BST)

A binary search tree follows some order to arrange the elements. In a binary
search tree, the value of the left node must be smaller than the parent node,
and the value of the right node must be greater than the parent node. This
rule is applied recursively to the left and right subtrees of the root.

Let's understand the concept of Binary search tree with an example.

In the above figure, we can observe that the root node is 40, and all the
nodes of the left subtree are smaller than the root node, and all the nodes
of the right subtree are greater than the root node.
The properties that separate a binary search tree from a regular binary
tree is
 All nodes of left subtree are less than the root node
 All nodes of right subtree are more than the root node
 Both subtrees of each node are also BSTs i.e. they have the above
two properties

Advantages of Binary search tree

o Searching an element in the binary search tree is easy, as we always have a
hint about which subtree contains the desired element.
o As compared to array and linked lists, insertion and deletion
operations are faster in BST.

Example of creating a binary search tree

Now, let's see the creation of binary search tree using an example.

Suppose the data elements are - 45, 15, 79, 90, 10, 55, 12, 20, 50

o First, we have to insert 45 into the tree as the root of the tree.
o Then, read the next element; if it is smaller than the root node,
insert it as the root of the left subtree, and move to the next
element.
o Otherwise, if the element is larger than the root node, then insert it
as the root of the right subtree.

Now, let's see the process of creating the Binary search tree using the
given data element. The process of creating the BST is shown below -

Step 1 - Insert 45.

Step 2 - Insert 15.

As 15 is smaller than 45, so insert it as the root node of the left subtree.
Step 3 - Insert 79.

As 79 is greater than 45, so insert it as the root node of the right subtree.

Step 4 - Insert 90.

90 is greater than 45 and 79, so it will be inserted as the right subtree of
79.
Step 5 - Insert 10.

10 is smaller than 45 and 15, so it will be inserted as a left subtree of 15.

Step 6 - Insert 55.

55 is larger than 45 and smaller than 79, so it will be inserted as the left
subtree of 79.

Step 7 - Insert 12.

12 is smaller than 45 and 15 but greater than 10, so it will be inserted as
the right subtree of 10.
Step 8 - Insert 20.

20 is smaller than 45 but greater than 15, so it will be inserted as the
right subtree of 15.

Step 9 - Insert 50.

50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as
a left subtree of 55.
Now, the creation of binary search tree is completed. After that, let's
move towards the operations that can be performed on Binary search
tree.

We can perform insert, delete and search operations on the binary search
tree.

Insertion in Binary Search tree

A new key in a BST is always inserted at a leaf. To insert an element in a
BST, we start searching from the root node; if the key to be inserted is less
than the root node, then search for an empty location in the left subtree.
Else, search for an empty location in the right subtree and insert the data.
Insertion in a BST is similar to searching, as we always have to maintain the
rule that the left subtree is smaller than the root, and the right subtree is
larger than the root.

A minimal recursive sketch of this rule is shown below.
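This C sketch reuses the struct node defined earlier (the function name insert is ours):

#include <stdlib.h>

/* Insert a key into the BST rooted at 'root'; return the (possibly new) root. */
struct node *insert(struct node *root, int key)
{
    if (root == NULL)                  /* empty spot found: place the new leaf */
    {
        struct node *n = malloc(sizeof *n);
        n->data = key;
        n->left = n->right = NULL;
        return n;
    }
    if (key < root->data)
        root->left = insert(root->left, key);    /* smaller keys go left */
    else
        root->right = insert(root->right, key);  /* larger keys go right */
    return root;
}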
Deletion in Binary Search tree

In a binary search tree, we must delete a node from the tree while keeping in
mind that the property of the BST is not violated. To delete a node from a
BST, three possible situations occur -

o The node to be deleted is the leaf node, or,


o The node to be deleted has only one child, and,
o The node to be deleted has two children

We will understand the situations listed above in detail.

When the node to be deleted is the leaf node

It is the simplest case of deleting a node in a BST. Here, we have to replace
the leaf node with NULL and simply free the allocated space.

We can see the process to delete a leaf node from BST in the below
image. In below image, suppose we have to delete node 90, as the node
to be deleted is a leaf node, so it will be replaced with NULL, and the
allocated space will free.
When the node to be deleted has only one child

In this case, we have to replace the target node with its child, and then
delete the child node. It means that after replacing the target node with
its child node, the child node will now contain the value to be deleted. So,
we simply have to replace the child node with NULL and free up the
allocated space.

We can see the process of deleting a node with one child from BST in the
below image. In the below image, suppose we have to delete the node
79, as the node to be deleted has only one child, so it will be replaced
with its child 55.

So, the replaced node 79 will now be a leaf node that can be easily
deleted.

When the node to be deleted has two children

This case of deleting a node in BST is a bit complex among other two
cases. In such a case, the steps to be followed are listed as follows -

o First, find the inorder successor of the node to be deleted.
o After that, replace that node with its inorder successor until the target
node is placed at a leaf of the tree.
o And at last, replace the node with NULL and free up the allocated space.
The inorder successor is required when the right child of the node is not
empty. We can obtain the inorder successor by finding the minimum
element in the right child of the node.

We can see the process of deleting a node with two children from BST in
the below image. In the below image, suppose we have to delete node 45
that is the root node, as the node to be deleted has two children, so it will
be replaced with its inorder successor. Now, node 45 will be at the leaf of
the tree so that it can be deleted easily.

Searching in Binary search tree

Searching means to find or locate a specific element or node in a data


structure. In Binary search tree, searching a node is easy because
elements in BST are stored in a specific order. The steps of searching a
node in Binary Search tree are listed as follows -

1. First, compare the element to be searched with the root element of the
tree.
2. If root is matched with the target element, then return the node's
location.
3. If it is not matched, then check whether the item is less than the
root element, if it is smaller than the root element, then move to
the left subtree.
4. If it is larger than the root element, then move to the right subtree.
5. Repeat the above procedure recursively until the match is found.
6. If the element is not found or not present in the tree, then return
NULL.

Now, let's understand searching in a binary search tree using an example. We
take the binary search tree formed above. Suppose we have to find node 20 in
the below tree.
Step1:

Step2:

Step3:
Algorithm to search an element in Binary search tree

Search (root, item)
Step 1 - if (item = root → data) or (root = NULL)
            return root
         else if (item < root → data)
            return Search(root → left, item)
         else
            return Search(root → right, item)
         END if
Step 2 - END

Time Complexity

Operations   Best case    Average case   Worst case

Insertion    O(log n)     O(log n)       O(n)
Deletion     O(log n)     O(log n)       O(n)
Search       O(log n)     O(log n)       O(n)

AVL Trees

AVL tree is a self-balancing binary search tree in which each node maintains
extra information called a balance factor whose value is either -1, 0 or +1.

AVL tree got its name after its inventor Georgy Adelson-Velsky and
Landis.

AVL Tree can be defined as height balanced binary search tree in which
each node is associated with a balance factor which is calculated by
subtracting the height of its right sub-tree from that of its left sub-tree.

A tree is said to be balanced if the balance factor of each node is between
-1 and 1; otherwise, the tree is unbalanced and needs to be balanced.
Balance Factor (k) = height (left(k)) - height (right(k))

If balance factor of any node is 1, it means that the left sub-tree is one
level higher than the right sub-tree.

If balance factor of any node is 0, it means that the left sub-tree and right
sub-tree contain equal height.

If balance factor of any node is -1, it means that the left sub-tree is one
level lower than the right sub-tree.

An AVL tree is given in the following figure. We can see that, balance
factor associated with each node is in between -1 and +1. therefore, it is
an example of AVL tree.

Complexity

Algorithm   Average case   Worst case

Space       O(n)           O(n)
Search      O(log n)       O(log n)
Insert      O(log n)       O(log n)
Delete      O(log n)       O(log n)

AVL Rotations

We perform rotation in an AVL tree only in case the balance factor is other
than -1, 0, and 1. There are basically four types of rotations, which are as
follows:

1. LL rotation: Inserted node is in the left subtree of the left subtree of A
2. RR rotation: Inserted node is in the right subtree of the right subtree of A
3. LR rotation: Inserted node is in the right subtree of the left subtree of A
4. RL rotation: Inserted node is in the left subtree of the right subtree of A

Where node A is the node whose balance Factor is other than -1, 0, 1.

The first two rotations LL and RR are single rotations and the next two
rotations LR and RL are double rotations. For a tree to be unbalanced,
minimum height must be at least 2, Let us understand each rotation

1. RR Rotation

When a BST becomes unbalanced because a node is inserted into the right
subtree of the right subtree of A, we perform an RR rotation. RR rotation
is an anticlockwise rotation, which is applied on the edge below a node
having balance factor -2.

In the above example, node A has balance factor -2 because node C is
inserted in the right subtree of A's right subtree. We perform the RR
rotation on the edge below A.

2. LL Rotation

When a BST becomes unbalanced because a node is inserted into the left
subtree of the left subtree of C, we perform an LL rotation. LL rotation
is a clockwise rotation, which is applied on the edge below a node having
balance factor 2.

In the above example, node C has balance factor 2 because node A is
inserted in the left subtree of C's left subtree. We perform the LL
rotation on the edge below C. A C sketch of both single rotations
follows.
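Both single rotations can be written as small pointer rearrangements in
C. This minimal sketch uses the same cached-height AVLNode layout assumed
earlier; rightRotate is the clockwise rotation used for the LL case, and
leftRotate is the anticlockwise rotation used for the RR case.

#include <stddef.h>

struct AVLNode { int key; int height; struct AVLNode *left, *right; };

int height(struct AVLNode *n) { return (n == NULL) ? 0 : n->height; }
static int max(int a, int b) { return a > b ? a : b; }

/* LL case: clockwise (right) rotation around the unbalanced node y. */
struct AVLNode *rightRotate(struct AVLNode *y) {
    struct AVLNode *x = y->left;
    struct AVLNode *T2 = x->right;
    x->right = y;                 /* y becomes x's right child        */
    y->left = T2;                 /* x's old right subtree moves to y */
    y->height = 1 + max(height(y->left), height(y->right));
    x->height = 1 + max(height(x->left), height(x->right));
    return x;                     /* x is the new subtree root        */
}

/* RR case: anticlockwise (left) rotation, the mirror image. */
struct AVLNode *leftRotate(struct AVLNode *x) {
    struct AVLNode *y = x->right;
    struct AVLNode *T2 = y->left;
    y->left = x;
    x->right = T2;
    x->height = 1 + max(height(x->left), height(x->right));
    y->height = 1 + max(height(y->left), height(y->right));
    return y;
}

The double rotations described next are then just compositions of these
two helpers.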

3. LR Rotation

Double rotations are a bit more complex than the single rotations
explained above. LR rotation = RR rotation + LL rotation; i.e., an RR
rotation is first performed on a subtree and then an LL rotation is
performed on the full tree. By full tree, we mean the first node on the
path from the inserted node whose balance factor is other than -1, 0,
or 1.

Let us understand each step clearly:

1. A node B has been inserted into the right subtree of A, the left
subtree of C, because of which C has become an unbalanced node having
balance factor 2. This is the LR case: the inserted node is in the right
subtree of the left subtree of C.

2. As LR rotation = RR + LL rotation, an RR (anticlockwise) rotation on
the subtree rooted at A is performed first. By doing the RR rotation,
node A becomes the left subtree of B.

3. After performing the RR rotation, node C is still unbalanced, i.e.,
it has balance factor 2, as the inserted subtree is now in the left of
the left of C.

4. Now we perform the LL (clockwise) rotation on the full tree, i.e., on
node C. Node C now becomes the right subtree of node B, and A is the
left subtree of B.

5. The balance factor of each node is now either -1, 0, or 1; i.e., the
BST is balanced now.

4. RL Rotation

As already discussed, double rotations are a bit more complex than the
single rotations explained above. RL rotation = LL rotation + RR
rotation; i.e., an LL rotation is first performed on a subtree and then
an RR rotation is performed on the full tree. By full tree, we mean the
first node on the path from the inserted node whose balance factor is
other than -1, 0, or 1.

1. A node B has been inserted into the left subtree of C, the right
subtree of A, because of which A has become an unbalanced node having
balance factor -2. This is the RL case: the inserted node is in the left
subtree of the right subtree of A.

2. As RL rotation = LL rotation + RR rotation, an LL (clockwise) rotation
on the subtree rooted at C is performed first. By doing the LL rotation,
node C becomes the right subtree of B.

3. After performing the LL rotation, node A is still unbalanced, i.e.,
it has balance factor -2, because of the right subtree of the right
subtree of node A.

4. Now we perform an RR (anticlockwise) rotation on the full tree, i.e.,
on node A. Node C has now become the right subtree of node B, and node A
has become the left subtree of B.

5. The balance factor of each node is now either -1, 0, or 1; i.e., the
BST is balanced.

The basic operations performed on AVL Tree structures include all the
operations performed on a binary search tree, since the AVL Tree at its
core is actually just a binary search tree holding all its properties.
Therefore, the basic operations performed on an AVL Tree are insertion
and deletion.

Insertion

Data is inserted into the AVL Tree by following the binary search tree
insertion property, i.e. the left subtree must contain elements less than
the root value and the right subtree must contain all the greater
elements. However, in AVL Trees, after the insertion of each element, the
balance factor of the tree is checked; if its magnitude does not exceed
1, the tree is left as it is. But if it does exceed 1, a balancing
algorithm is applied to readjust the tree such that the magnitude of the
balance factor becomes less than or equal to 1 again.
Algorithm

The following steps are involved in performing the insertion operation on
an AVL Tree −

Step 1 − Create a node

Step 2 − Check if the tree is empty

Step 3 − If the tree is empty, the new node created will become the root
node of the AVL Tree.

Step 4 − If the tree is not empty, we perform the Binary Search Tree
insertion operation and check the balance factor of the nodes in the
tree.

Step 5 − If the balance factor exceeds ±1, we apply suitable rotations on
the said node and resume the insertion from Step 4.

START
if node == null then:
    return new node
if key < node.key then:
    node.left = insert (node.left, key)
else if key > node.key then:
    node.right = insert (node.right, key)
else
    return node
node.height = 1 + max (height (node.left), height (node.right))
balance = getBalance (node)
// Left-Left case: single right rotation
if balance > 1 and key < node.left.key then:
    return rightRotate (node)
// Right-Right case: single left rotation
if balance < -1 and key > node.right.key then:
    return leftRotate (node)
// Left-Right case: left rotation on the left child, then right rotation
if balance > 1 and key > node.left.key then:
    node.left = leftRotate (node.left)
    return rightRotate (node)
// Right-Left case: right rotation on the right child, then left rotation
if balance < -1 and key < node.right.key then:
    node.right = rightRotate (node.right)
    return leftRotate (node)
return node
END

Insertion Example

Let us understand the insertion operation by constructing an example AVL
tree with the integers 1 to 7.

Starting with the first element 1, we create a node and measure the
balance factor, which is 0. Since both the binary search property and the
balance factor are satisfied, we insert another element into the tree.

After the second insertion, the balance factor of the root is -1 (the
height of the left subtree is 0 and the height of the right subtree is
1). Since its magnitude does not exceed 1, we add another element to the
tree.

Now, after adding the third element, the magnitude of the balance factor
exceeds 1 (it becomes -2). Therefore, rotations are applied. In this
case, the RR rotation is applied since the imbalance is caused by two
right children.
The tree is rearranged as −

Similarly, the next elements are inserted and rearranged using these
rotations. After rearrangement, we achieve the tree as −
Deletion

Deletion in AVL Trees takes place in three different scenarios −

 Scenario 1 (Deletion of a leaf node) − If the node to be deleted
is a leaf node, then it is deleted without any replacement, as it does
not disturb the binary search tree property. However, the balance
factor may get disturbed, so rotations are applied to restore it.
 Scenario 2 (Deletion of a node with one child) − If the node to
be deleted has one child, replace the value in that node with the
value in its child node. Then delete the child node. If the balance
factor is disturbed, rotations are applied.
 Scenario 3 (Deletion of a node with two child nodes) − If the
node to be deleted has two child nodes, find the inorder successor
of that node and replace its value with the inorder successor value.
Then try to delete the inorder successor node. If the balance factor
exceeds 1 after deletion, apply balance algorithms.
START
if root == null then: return root
if key < root.key then:
    root.left = deleteNode (root.left, key)
else if key > root.key then:
    root.right = deleteNode (root.right, key)
else:
    if root.left == null or root.right == null then:
        // node with zero or one child
        temp = (root.left != null) ? root.left : root.right
        if temp == null then:
            // no child: remove the node itself
            temp = root
            root = null
        else:
            // one child: replace the node with its child
            root = temp
    else:
        // node with two children: use the inorder successor
        temp = minValueNode (root.right)
        root.key = temp.key
        root.right = deleteNode (root.right, temp.key)
if root == null then:
    return root
root.height = max (height (root.left), height (root.right)) + 1
balance = getBalance (root)
// Left-Left case
if balance > 1 and getBalance (root.left) >= 0 then:
    return rightRotate (root)
// Left-Right case
if balance > 1 and getBalance (root.left) < 0 then:
    root.left = leftRotate (root.left)
    return rightRotate (root)
// Right-Right case
if balance < -1 and getBalance (root.right) <= 0 then:
    return leftRotate (root)
// Right-Left case
if balance < -1 and getBalance (root.right) > 0 then:
    root.right = rightRotate (root.right)
    return leftRotate (root)
return root
END

Deletion Example

Using the same tree given above, let us perform deletion in three
scenarios −

 Deleting element 7 from the tree above −

Since element 7 is a leaf, we simply remove it without disturbing any
other node in the tree.

 Deleting element 6 from the output tree achieved −

However, element 6 is not a leaf node and has one child node attached to
it. In this case, we replace node 6 with its child node: node 5.
The balance of the tree becomes 1, and since it does not exceed 1 the
tree is left as it is. If we further delete element 5, we would have to
apply rotations, either LL or LR, since the imbalance occurs along both
1-2-4 and 3-2-4.

The balance factor is disturbed after deleting element 5; therefore, we
apply an LL rotation (we could also apply the LR rotation here).

Once the LL rotation is applied on the path 1-2-4, node 3 remains where
it was, as it was supposed to be the right child of node 2 (a position
now occupied by node 4). Hence, node 3 is added to the right subtree of
node 2 and becomes the left child of node 4.
 Deleting element 2 from the remaining tree −

As mentioned in scenario 3, this node has two children. Therefore, we
find its inorder successor, which is a leaf node (say, 3), and replace
its value with that of the inorder successor.

The balance of the tree still remains 1; therefore, we leave the tree as
it is without performing any rotations.

Red Black Trees

A Red-Black tree is a self-balancing binary search tree in which each
node contains an extra bit denoting the color of the node, either red or
black.

A red-black tree satisfies the following properties:

1. Red/Black Property: Every node is colored, either red or black.
2. Root Property: The root is black.
3. Leaf Property: Every leaf (NIL) is black.
4. Red Property: If a red node has children, then the children are
always black.
5. Depth Property: For each node, any simple path from this node to any
of its descendant leaves has the same black-depth (the number of black
nodes).

Properties of Red Black Tree


 Property #1: Red - Black Tree must be a Binary Search Tree.
 Property #2: The ROOT node must be colored BLACK.
 Property #3: The children of Red colored node must be colored BLACK. (There
should not be two consecutive RED nodes).
 Property #4: In all the paths of the tree, there should be same number of BLACK
colored nodes.
 Property #5: Every new node must be inserted with RED color.
 Property #6: Every leaf (i.e. NULL node) must be colored BLACK.

An example of a red-black tree is:

Figure: Red Black Tree

Each node has the following attributes:

 color
 key
 leftChild
 rightChild
 parent (except root node)
Red-Black tree's node structure would be:
struct t_red_black_node {
    enum { red, black } colour;
    void *item;
    struct t_red_black_node *left,
                            *right,
                            *parent;
};

How does the red-black tree maintain the property of self-balancing?

The red-black coloring is meant for balancing the tree. The limitations
put on the node colors ensure that any simple path from the root to a
leaf is not more than twice as long as any other such path. This helps in
maintaining the self-balancing property of the red-black tree.

Why Red-Black Trees?

Most of the BST operations (e.g., search, max, min, insert, delete, etc.)
take O(h) time, where h is the height of the BST. The cost of these
operations may become O(n) for a skewed binary tree. If we make sure
that the height of the tree remains O(log n) after every insertion and
deletion, then we can guarantee an upper bound of O(log n) for all these
operations. The height of a Red-Black tree is always O(log n), where n is
the number of nodes in the tree.
Comparison with AVL Tree:
The AVL trees are more balanced compared to Red-Black Trees, but they
may cause more rotations during insertion and deletion. So if your
application involves frequent insertions and deletions, then Red-Black
trees should be preferred. And if the insertions and deletions are less
frequent and search is a more frequent operation, then AVL tree should
be preferred over the Red-Black Tree.

Can every AVL tree be a Red-Black tree?

Yes, every AVL tree can be made a Red-Black tree by coloring each node
either red or black appropriately. But not every Red-Black tree is an
AVL tree, because the AVL tree is strictly height-balanced while the
Red-Black tree is not completely height-balanced.

Red-Black Tree Applications


1. To implement finite maps

2. To implement Java packages: java.util.TreeMap and java.util.TreeSet


3. To implement Standard Template Libraries (STL) in C++: multiset, map,
multimap
4. In Linux Kernel
Insertion into RED BLACK Tree
In a Red-Black Tree, every new node must be inserted with the color RED. The insertion
operation in a Red-Black Tree is similar to the insertion operation in a Binary Search Tree,
except that the node is inserted with a color property. After every insertion operation, we
need to check all the properties of the Red-Black Tree. If all the properties are satisfied,
we move on to the next operation; otherwise, we perform one of the following operations to
make it a Red-Black Tree again.

 1. Recolor
 2. Rotation
 3. Rotation followed by Recolor

The insertion operation in Red Black tree is performed using the following steps...

 Step 1 - Check whether the tree is empty.
 Step 2 - If the tree is empty, then insert the newNode as the root node with color Black
and exit from the operation.
 Step 3 - If the tree is not empty, then insert the newNode as a leaf node with color Red.
 Step 4 - If the parent of newNode is Black, then exit from the operation.
 Step 5 - If the parent of newNode is Red, then check the color of the sibling of
newNode's parent (the uncle of newNode).
 Step 6 - If it is colored Black or NULL, then make a suitable Rotation and Recolor.
 Step 7 - If it is colored Red, then perform Recolor. Repeat the same until the tree
becomes a Red-Black Tree.
Deletion Operation in Red Black Tree
The deletion operation in a Red-Black Tree is similar to the deletion operation in a BST. But
after every deletion operation, we need to check the Red-Black Tree properties. If any of the
properties are violated, then suitable operations (Recolor, Rotation, or Rotation followed by
Recolor) are performed to make it a Red-Black Tree again.
UNIT V GRAPH

A Graph is a non-linear data structure consisting of vertices and edges.
The vertices are sometimes also referred to as nodes, and the edges are
lines or arcs that connect any two nodes in the graph. More formally, a
Graph is composed of a set of vertices (V) and a set of edges (E). The
graph is denoted by G(V, E).

Undirected Graph:

A graph can be directed or undirected. In an undirected graph, edges are
not associated with directions.

Directed Graph:

In a directed graph, edges form an ordered pair. An edge represents a
specific path from some vertex A to another vertex B. Node A is called
the initial node, while node B is called the terminal node.
Graph Terminology

Path

A path can be defined as the sequence of nodes that are followed in order
to reach some terminal node V from the initial node U.

Closed Path

A path is called a closed path if the initial node is the same as the
terminal node, i.e., if V0 = VN.

Simple Path

If all the nodes of the path are distinct, the path is called a simple
path; if they are distinct with the exception V0 = VN, the path P is
called a closed simple path.

Cycle

A cycle can be defined as a path that has no repeated edges or vertices
except the first and last vertices.

Connected Graph

A connected graph is one in which some path exists between every two
vertices (u, v) in V. There are no isolated nodes in a connected graph.

Complete Graph

A complete graph is one in which every node is connected with all other
nodes. A complete graph contains n(n-1)/2 edges, where n is the number
of nodes in the graph.

Weighted Graph

In a weighted graph, each edge is assigned some data such as a length or
weight. The weight of an edge e can be given as w(e), which must be a
positive value indicating the cost of traversing the edge.

Digraph

A digraph is a directed graph in which each edge of the graph is
associated with some direction, and traversal can be done only in the
specified direction.

Loop

An edge whose two endpoints are the same node is called a loop.

Adjacent Nodes

If two nodes u and v are connected via an edge e, then the nodes u and v
are called neighbours or adjacent nodes.

Degree of the Node

The degree of a node is the number of edges that are connected with that
node. A node with degree 0 is called an isolated node.

Basic properties of a graph include:

1. Vertices (nodes): The points where edges meet in a graph are known
as vertices or nodes. A vertex can represent a physical object,
concept, or abstract entity.
2. Edges: The connections between vertices are known as edges. They
can be undirected (bidirectional) or directed (unidirectional).
3. Weight: A weight can be assigned to an edge, representing the cost
or distance between two vertices. A weighted graph is a graph where
the edges have weights.
4. Degree: The degree of a vertex is the number of edges that connect
to it. In a directed graph, the in-degree of a vertex is the number of
edges that point to it, and the out-degree is the number of edges that
start from it.
5. Path: A path is a sequence of vertices that are connected by edges. A
simple path does not contain any repeated vertices or edges.
6. Cycle: A cycle is a path that starts and ends at the same vertex. A
simple cycle does not contain any repeated vertices or edges.
7. Connectedness: A graph is said to be connected if there is a path
between any two vertices. A disconnected graph is a graph that is not
connected.

Graph representation

There are two ways to store a graph in the computer's memory:

o Sequential representation (or adjacency matrix representation)
o Linked list representation (or adjacency list representation)

Sequential representation

In sequential representation, an adjacency matrix is used to represent
the mapping between the vertices and edges of the graph. We can use an
adjacency matrix to represent an undirected graph, directed graph,
weighted directed graph, or weighted undirected graph.

If adj[i][j] = w, it means that an edge exists from vertex i to vertex j
with weight w.

Ex: adjacency matrix representation of an undirected graph

Adjacency matrix for a directed graph

In a directed graph, an edge represents a specific path from one vertex
to another vertex. Suppose a path exists from vertex A to another vertex
B; it means that node A is the initial node, while node B is the terminal
node.

Adjacency matrix for a weighted directed graph

It is similar to the adjacency matrix representation of a directed graph,
except that instead of using '1' for the existence of a path, here we use
the weight associated with the edge. The weights on the graph edges are
represented as the entries of the adjacency matrix. A small C sketch of
an adjacency matrix follows.
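A minimal C sketch of an adjacency matrix. The vertex count and the edges
used here are assumptions for illustration, not the graphs shown in the
figures above.

#include <stdio.h>

#define V 4   /* number of vertices (assumed for this sketch) */

int main(void) {
    int adj[V][V] = {0};             /* 0 means "no edge" */

    /* Undirected, unweighted edge (u, v): set both directions. */
    adj[0][1] = adj[1][0] = 1;
    adj[0][2] = adj[2][0] = 1;
    adj[1][3] = adj[3][1] = 1;

    /* A weighted directed edge would store the weight instead of 1,
       and only in one direction, e.g. adj[2][3] = 7; */

    for (int i = 0; i < V; i++) {    /* print the matrix row by row */
        for (int j = 0; j < V; j++)
            printf("%d ", adj[i][j]);
        printf("\n");
    }
    return 0;
}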

Linked list representation

An adjacency list is used in the linked representation to store the graph
in the computer's memory. It is efficient in terms of storage, as we only
have to store the values for the edges that actually exist.

Let's see the adjacency list representation of an undirected graph; a C
sketch follows.
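A minimal C sketch of an adjacency list built from singly linked nodes.
The struct layout, the addEdge helper, and the edges are illustrative
assumptions.

#include <stdio.h>
#include <stdlib.h>

struct AdjNode {                 /* one entry in a vertex's list */
    int dest;
    struct AdjNode *next;
};

#define V 4
struct AdjNode *adj[V];          /* adj[u] points to the head of u's list */

void addEdge(int u, int v) {     /* undirected: add the edge both ways */
    struct AdjNode *a = malloc(sizeof *a);
    a->dest = v; a->next = adj[u]; adj[u] = a;
    struct AdjNode *b = malloc(sizeof *b);
    b->dest = u; b->next = adj[v]; adj[v] = b;
}

int main(void) {
    addEdge(0, 1); addEdge(0, 2); addEdge(1, 3);
    for (int u = 0; u < V; u++) {
        printf("%d:", u);
        for (struct AdjNode *p = adj[u]; p != NULL; p = p->next)
            printf(" -> %d", p->dest);
        printf("\n");
    }
    return 0;
}

Note that only the edges that actually exist consume memory, which is the
storage advantage mentioned above.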


Graph Traversal techniques

We can traverse a graph in two ways :

1. BFS ( Breadth First Search )

2. DFS ( Depth First Search )

BFS Graph Traversal :

Breadth-first search (BFS) traversal is a technique for visiting all the
nodes of a given graph. The algorithm selects a starting node, visits all
of its adjacent nodes, and only then moves on to the next set of
vertices, repeating the process level by level. This algorithm uses a
queue as an additional data structure to store nodes for further
processing. The queue needs to hold at most the total number of vertices
in the graph.

We use the following steps to implement BFS traversal (a minimal C sketch
follows the list)...

 Step 1 - Define a Queue of size equal to the total number of vertices
in the graph.
 Step 2 - Select any vertex as the starting point for traversal. Visit
that vertex and insert it into the Queue.
 Step 3 - Visit all the non-visited adjacent vertices of the vertex
which is at the front of the Queue and insert them into the Queue.
 Step 4 - When there is no new vertex to be visited from the vertex
which is at the front of the Queue, then delete that vertex.
 Step 5 - Repeat steps 3 and 4 until the queue becomes empty.
 Step 6 - When the queue becomes empty, then produce the final spanning
tree by removing unused edges from the graph.
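The sketch below implements these steps in C over an adjacency matrix
(the full program is lab program 8). The example graph is an assumption
reconstructed from the description of Ex-2 below.

#include <stdio.h>

#define V 5

void bfs(int adj[V][V], int start) {
    int visited[V] = {0};
    int queue[V], front = 0, rear = 0;   /* queue sized to V (Step 1) */

    visited[start] = 1;
    queue[rear++] = start;               /* visit and enqueue start (Step 2) */

    while (front < rear) {               /* until the queue empties (Step 5) */
        int u = queue[front++];          /* dequeue the front vertex (Step 4) */
        printf("%d ", u);
        for (int w = 0; w < V; w++)      /* enqueue unvisited neighbours (Step 3) */
            if (adj[u][w] && !visited[w]) {
                visited[w] = 1;
                queue[rear++] = w;
            }
    }
    printf("\n");
}

int main(void) {
    int adj[V][V] = {                    /* assumed edges: 0-1, 0-2, 0-3, 1-2, 2-4 */
        {0, 1, 1, 1, 0},
        {1, 0, 1, 0, 0},
        {1, 1, 0, 0, 1},
        {1, 0, 0, 0, 0},
        {0, 0, 1, 0, 0},
    };
    bfs(adj, 0);                         /* prints: 0 1 2 3 4 */
    return 0;
}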

Ex-1:

In the above diagram, the full way of traversing is shown using arrows.
 Step 1: Create a Queue with the same size as the total number of
vertices in the graph.

 Step 2: Choose 12 as your beginning point for the traversal. Visit 12 and
add it to the Queue.

 Step 3: Visit all the non-visited adjacent vertices of 12, the vertex
at the front of the Queue, and insert them into the Queue. So far, we
have 5, 23, and 3.

 Step 4: Delete the vertex in front of the Queue when there are no new
vertices to visit from that vertex. We now remove 12 from the list.

 Step 5: Continue steps 3 and 4 until the queue is empty.

 Step 6: When the queue is empty, generate the final spanning tree by
eliminating unnecessary graph edges.

Ex-2: We use an undirected graph with 5 vertices.

We start from vertex 0. The BFS algorithm begins by putting it in the
Visited list and putting all its adjacent vertices in the queue.
Next, we visit the element at the front of the queue, i.e. 1, and go to
its adjacent nodes. Since 0 has already been visited, we visit 2 instead.

Vertex 2 has an unvisited adjacent vertex in 4, so we add that to the
back of the queue and visit 3, which is at the front of the queue.
Only 4 remains in the queue since the only adjacent node of 3 i.e. 0 is
already visited. We visit it.

Since the queue is empty, we have completed the Breadth First Traversal
of the graph.

Applications of BFS algorithm

The applications of the breadth-first search algorithm are given as follows -

o BFS can be used to find the neighboring locations from a given
source location.
o In a peer-to-peer network, BFS algorithm can be used as a traversal
method to find all the neighboring nodes. Most torrent clients, such
as BitTorrent, uTorrent, etc. employ this process to find "seeds" and
"peers" in the network.
o BFS can be used in web crawlers to create web page indexes. It is
one of the main algorithms that can be used to index web pages. It
starts traversing from the source page and follows the links
associated with the page. Here, every web page is considered as a
node in the graph.
o BFS is used to determine the shortest path and minimum spanning
tree.
o BFS is also used in Cheney's algorithm for copying garbage
collection.
o It can be used in the Ford-Fulkerson method to compute the maximum
flow in a flow network.
Complexity of BFS algorithm

The time complexity of BFS depends upon the data structure used to
represent the graph. The time complexity of the BFS algorithm is O(V+E),
since in the worst case BFS explores every node and edge. In a graph, the
number of vertices is O(V), whereas the number of edges is O(E).

For code: refer lab program 8

DFS (Depth First Search) algorithm

Depth First Search or depth-first traversal is a recursive algorithm for
visiting all the vertices of a graph or tree data structure. Traversal
means visiting all the nodes of a graph.

Because of its recursive nature, a stack data structure can be used to
implement the DFS algorithm.

The step by step process to implement the DFS traversal is given as
follows (a minimal C sketch follows the list) -

1. First, create a stack with the total number of vertices in the graph.
2. Now, choose any vertex as the starting point of traversal, and push
that vertex onto the stack.
3. After that, push a non-visited vertex (adjacent to the vertex on the
top of the stack) onto the top of the stack.
4. Now, repeat step 3 until no vertices are left to visit from the
vertex on the stack's top.
5. If no vertex is left, go back and pop a vertex from the stack.
6. Repeat steps 3, 4, and 5 until the stack is empty.
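A minimal recursive C sketch of DFS, in which the call stack plays the
role of the explicit stack described above. The adjacency matrix and the
edges in main are illustrative assumptions.

#include <stdio.h>

#define V 5

int adj[V][V];                   /* adjacency matrix */
int visited[V];

void dfs(int u) {
    visited[u] = 1;              /* mark, then display */
    printf("%d ", u);
    for (int w = 0; w < V; w++)
        if (adj[u][w] && !visited[w])
            dfs(w);              /* go deeper before trying siblings */
}

int main(void) {
    adj[0][1] = adj[1][0] = 1;   /* assumed undirected edges */
    adj[0][2] = adj[2][0] = 1;
    adj[1][3] = adj[3][1] = 1;
    adj[2][4] = adj[4][2] = 1;
    dfs(0);                      /* prints: 0 1 3 2 4 */
    printf("\n");
    return 0;
}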

Applications of DFS algorithm

The applications of using the DFS algorithm are given as follows -

o DFS algorithm can be used to implement topological sorting.
o It can be used to find the paths between two vertices.
o It can also be used to detect cycles in the graph.
o DFS algorithm is also used for solving puzzles with only one solution.
o DFS is used to determine if a graph is bipartite or not.
Ex-1:

As in the example given above, the DFS algorithm traverses from S to A to
D to G to E to B first, then to F and lastly to C. It employs the
following rules.

 Rule 1 − Visit the adjacent unvisited vertex. Mark it as visited.
Display it. Push it onto a stack.
 Rule 2 − If no adjacent vertex is found, pop up a vertex from the
stack. (It will pop up all the vertices from the stack, which do not
have adjacent vertices.)
 Rule 3 − Repeat Rule 1 and Rule 2 until the stack is empty.
As C does not have any unvisited adjacent node so we keep popping the
stack until we find a node that has an unvisited adjacent node. In this
case, there's none and we keep popping until the stack is empty.

Ex-2:
Difference between BFS and DFS

Some practical applications of BFS and DFS

Applications of BFS:

Un-weighted Graphs

In an unweighted graph, the BFS algorithm can easily compute shortest
paths and a spanning tree that visits all the vertices of the graph,
since every edge has the same cost.

P2P Networks

BFS can be implemented to locate all the nearest or neighboring nodes in
a peer to peer network. This helps find the required data faster.

Web Crawlers

Search engines or web crawlers can easily build multiple levels of indexes
by employing BFS. BFS implementation starts from the source, which is
the web page, and then it visits all the links from that source.

Network Broadcasting

A broadcasted packet is guided by the BFS algorithm to find and reach all
the nodes it has the address for.
Applications of DFS:

Weighted Graph

In a weighted graph, a DFS traversal produces a spanning tree; dedicated
algorithms such as Dijkstra's or Prim's are used when a shortest path
tree or minimum spanning tree is required.

Detecting a Cycle in a Graph

A graph has a cycle if we find a back edge during DFS. Therefore, we
should run DFS on the graph and check for back edges.

Path Finding

We can specialize the DFS algorithm to search for a path between two
vertices.

Topological Sorting

It is primarily used for scheduling jobs from the given dependencies
among a group of jobs. In computer science, it is used in instruction
scheduling, data serialization, logic synthesis, and determining the
order of compilation tasks.

Searching Strongly Connected Components of a Graph

DFS is used to find the strongly connected components of a graph, i.e.,
subgraphs in which there is a path from each and every vertex to every
other vertex of the subgraph.

Solving Puzzles with Only One Solution

The DFS algorithm can be easily adapted to search all solutions to a maze
by including nodes on the existing path in the visited set.

Topological Sort:

Topological sorting for a Directed Acyclic Graph (DAG) is a linear
ordering of vertices such that for every directed edge u-v,
vertex u comes before v in the ordering.

Note: Topological sorting for a graph is not possible if the graph is not
a DAG.
Example:
Input: Graph :

Output: 5 4 2 3 1 0
Explanation: The first vertex in topological sorting is always a vertex
with an in-degree of 0 (a vertex with no incoming edges). A topological
sorting of the following graph is “5 4 2 3 1 0”. There can be more than
one topological sorting for a graph. Another topological sorting of the
following graph is “4 5 2 3 1 0”.

Applications of Topological Sorting:


 Topological Sorting is mainly used for scheduling jobs from the given
dependencies among jobs.
 In computer science, applications of this type arise in:
 Instruction scheduling
 Ordering of formula cell evaluation when recomputing
formula values in spreadsheets
 Logic synthesis
 Determining the order of compilation tasks to perform in
make files
 Data serialization
 Resolving symbol dependencies in linkers
Illustration Topological Sorting Algorithm: using DFS method

Topological Sorting using Source removal algorithm

The Source Removal Algorithm is a decrease-and-conquer way to
topologically sort a directed graph. A source is a vertex with no
incoming edges.
The following steps are followed in this algorithm (a C sketch follows
the list):
1. From the given graph, find a vertex with no incoming edges. Delete it
along with all the edges outgoing from it. If there is more than one
such vertex, break the tie arbitrarily.
2. Note the vertices that are deleted, in order.
3. All these recorded vertices give a topologically sorted list.
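A minimal C sketch of source removal using in-degree counts; the matrix
size and the function name topoSort are assumptions of this sketch, and
it can be called with any V×V adjacency matrix of a directed graph.

#include <stdio.h>

#define V 6

void topoSort(int adj[V][V]) {
    int indeg[V] = {0}, removed[V] = {0};

    for (int u = 0; u < V; u++)          /* count incoming edges */
        for (int w = 0; w < V; w++)
            if (adj[u][w]) indeg[w]++;

    for (int count = 0; count < V; count++) {
        int src = -1;
        for (int u = 0; u < V; u++)      /* step 1: find a source */
            if (!removed[u] && indeg[u] == 0) { src = u; break; }
        if (src == -1) {                 /* no source left: a cycle exists */
            printf("not a DAG\n");
            return;
        }
        printf("%d ", src);              /* step 2: record the deleted vertex */
        removed[src] = 1;
        for (int w = 0; w < V; w++)      /* delete its outgoing edges */
            if (adj[src][w]) indeg[w]--;
    }
    printf("\n");                        /* step 3: the printed order is the sort */
}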

Ex:
Answer:
Choose vertex B because it has no incoming edge; delete it along with its
adjacent edges.
According to the above steps, remove the nodes one by one and record them
in a list.

Strongly Connected Components


A strongly connected component is a component of a directed graph that
has a path from every vertex to every other vertex in that component.
The concept applies only to directed graphs.

For example, the below graph has two strongly connected components,
{1,2,3,4} and {5,6,7}, since there is a path from each vertex to every
other vertex in the same strongly connected component.
Shortest Path
Dijkstra's Algorithm is a graph algorithm that finds the shortest path
from a source vertex to all other vertices in the graph (single source
shortest path). It is a type of greedy algorithm that only works on
weighted graphs having non-negative weights. The time complexity of
Dijkstra's Algorithm is O(V²) with the adjacency matrix representation of
the graph. This time complexity can be reduced to O((V + E) log V) with
an adjacency list representation and a priority queue, where V is the
number of vertices and E is the number of edges in the graph.

Can Dijkstra’s Algorithm work on both Directed and Undirected graphs?

Yes, Dijkstra’s algorithm can work on both directed and undirected
graphs, as this algorithm is designed to work on any type of graph as
long as it has non-negative edge weights and is connected.

In this algorithm each vertex will have two properties defined for it-

 Visited property:-
o This property represents whether the vertex has been visited
or not.
o We are using this property so that we don't revisit a vertex.
o A vertex is marked visited only after the shortest path to it
has been found.
 Path property:-
o This property stores the value of the current minimum path
to the vertex. Current minimum path means the shortest
way in which we have reached this vertex till now.
o This property is updated whenever any neighbour of the
vertex is visited.
o The path property is important as it will store the final
answer for each vertex.

Algorithm for Dijkstra’s Algorithm:


1. Mark the source node with a current distance of 0 and the rest with
infinity.
2. Set the non-visited node with the smallest current distance as the
current node.
3. For each neighbor N of the current node: add the current node's
distance to the weight of the edge connecting it to N. If this sum is
smaller than the current distance of N, set it as the new current
distance of N.
4. Mark the current node as visited.
5. Go to step 2 if any nodes remain unvisited.
Dijkstra's Algorithm Applications

 To find the shortest path

 In social networking applications

 In a telephone network

 To find the locations in the map

How does Dijkstra’s Algorithm work?

Dijkstra’s Algorithm will generate the shortest path from Node 0 to all
other Nodes in the graph.

Ex:

Initially we have the following setup:

 The distance from the source node to itself is 0. In this example the
source node is 0.
 The distance from the source node to all other nodes is unknown, so we
mark all of them as infinity.

Example: 0 -> 0, 1 -> ∞, 2 -> ∞, 3 -> ∞, 4 -> ∞, 5 -> ∞, 6 -> ∞.


 We’ll also have an array of unvisited elements that keeps track of
unvisited or unmarked nodes.
 The algorithm completes when all the nodes have been marked as visited
and their distances added to the path.
Unvisited Nodes:- 0 1 2 3 4 5 6.

Step 1: Start from Node 0 and mark it as visited; in the image below, the
visited node is marked red.
Step 2: Check the adjacent nodes. Now we have two choices (either choose
Node 1 with distance 2 or choose Node 2 with distance 6), and we pick the
node with the minimum distance. In this step, Node 1 is the
minimum-distance adjacent node, so we mark it as visited and add up the
distance.
Distance: Node 0 -> Node 1 = 2

Step 3: Then move forward and check the adjacent node, which is Node 3;
mark it as visited and add up the distance. Now the distance will be:
Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7
Step 4: Again we have two choices for adjacent nodes (either Node 4 with
distance 10 or Node 5 with distance 15), so we choose the node with the
minimum distance. In this step, Node 4 is the minimum-distance adjacent
node, so we mark it as visited and add up the distance.
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17

Step 5: Again, move forward and check the adjacent node, which is Node 6;
mark it as visited and add up the distance. Now the distance will be:
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2
= 19
So, the shortest distance from the source vertex to Node 6 is 19, which
is the optimal one.

Pseudocode

function Dijkstra(Graph, source):
    for each vertex v in Graph:
        distance[v] = infinity
    distance[source] = 0
    G = the set of all nodes of the Graph
    while G is non-empty:
        Q = node in G with the least distance[]
        mark Q visited and remove it from G
        for each neighbor N of Q:
            alt_dist = distance[Q] + dist_between(Q, N)
            if alt_dist < distance[N]:
                distance[N] = alt_dist
    return distance[]

Note: for program refer lab program-9
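Since the complete program lives in lab program 9, the following is only
a minimal O(V²) C sketch of the pseudocode above, using an adjacency
matrix in which a weight of 0 means "no edge" (an assumption of this
sketch).

#include <stdio.h>
#include <limits.h>

#define V 5
#define INF INT_MAX

void dijkstra(int graph[V][V], int src) {
    int dist[V], visited[V] = {0};
    for (int i = 0; i < V; i++) dist[i] = INF;
    dist[src] = 0;

    for (int count = 0; count < V; count++) {
        int u = -1;
        for (int i = 0; i < V; i++)      /* pick the unvisited node with */
            if (!visited[i] && (u == -1 || dist[i] < dist[u]))
                u = i;                   /* the least current distance   */
        if (dist[u] == INF) break;       /* remaining nodes unreachable  */
        visited[u] = 1;                  /* mark Q visited               */

        for (int n = 0; n < V; n++)      /* relax every neighbor N of u  */
            if (graph[u][n] && !visited[n] && dist[u] + graph[u][n] < dist[n])
                dist[n] = dist[u] + graph[u][n];
    }
    for (int i = 0; i < V; i++)
        printf("distance[%d] = %d\n", i, dist[i]);
}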


Time and Space Complexity of Dijkstra's Algorithm

o The time complexity of Dijkstra's Algorithm is O(E log V) when a
binary-heap priority queue is used, where E is the number of edges and V
is the number of vertices.
o The space complexity of Dijkstra's Algorithm is O(V), where V is
the number of vertices.

Advantages and Disadvantages of Dijkstra's Algorithm

Advantages:

1. One primary advantage of using Dijkstra's Algorithm is that it has
an almost linear time complexity on sparse graphs.
2. We can use this algorithm to calculate the shortest path from a
single vertex to all other vertices, or from a single source vertex to a
single destination vertex by stopping the algorithm once we get the
shortest distance for the destination vertex.
3. It works on both directed and undirected weighted graphs, as long as
all the edge weights are non-negative.

Disadvantages:

1. Dijkstra's Algorithm performs a blind (uninformed) exploration, which
can consume a lot of time during the process.
2. This algorithm cannot handle negative edge weights.
3. On graphs that contain negative edge weights, it may fail to compute
the correct shortest path.
4. It also requires bookkeeping to keep a record of the vertices that
have been visited.

Some practical applications of Dijkstra's Algorithm:

 Digital Mapping Services in Google Maps


 Social Networking Applications:
 Telephone Network
 Flight Program
 IP routing to find Open Shortest Path First
 Robotic Path
 Designate the File Server
Home work: consider the below graph as the input, with node A as the
source. Apply the Dijkstra's Algorithm to find shortest path from A.

Minimum Spanning Tree

Spanning tree

A spanning tree is a sub-graph of an undirected connected graph which
includes all the vertices of the graph with the minimum possible number
of edges. If a vertex is missed, then it is not a spanning tree.

Example of a Spanning Tree


Some of the possible spanning trees that can be created from the above
graph are:
The spanning tree must satisfy the following conditions:

o The number of vertices in the spanning tree is the same as the number
of vertices in the original graph.
V` = V
o The number of edges in the spanning tree is equal to the number of
vertices minus 1.
E` = |V| - 1
o The spanning tree should not contain any cycle.
o The spanning tree should not be disconnected.

Minimum Spanning Tree

A minimum spanning tree is a spanning tree in which the sum of the
weights of the edges is as small as possible.

The minimum spanning tree from the above spanning trees is:

The minimum spanning tree from a graph is found using the following
algorithms:

1. Prim's Algorithm
2. Kruskal's Algorithm

Prim's Algorithm

Prim's algorithm is a minimum spanning tree algorithm that takes a graph
as input and finds the subset of the edges of that graph which
 forms a tree that includes every vertex
 has the minimum sum of weights among all the trees that can be formed
from the graph

How Prim's algorithm works

It falls under a class of algorithms called greedy algorithms that find the
local optimum in the hopes of finding a global optimum.
We start from one vertex and keep adding edges with the lowest weight
until we reach our goal.

Steps for finding the MST using Prim's Algorithm:

1. Create an MST set that keeps track of vertices already included in
the MST.
2. Assign key values to all vertices in the input graph. Initialize all
key values as INFINITE (∞). Assign a key value of 0 to the first vertex
so that it is picked first.
3. While the MST set doesn't include all vertices:
a. Pick the vertex u which is not in the MST set and has the minimum
key value. Include u in the MST set.
b. Update the key values of all adjacent vertices of u. To update,
iterate through all adjacent vertices. For every adjacent vertex v, if
the weight of edge (u, v) is less than the previous key value of v,
update the key value to the weight of (u, v).

Prim's Algorithm pseudocode

The pseudocode for Prim's algorithm shows how we create two sets of
vertices, U and V-U. U contains the list of vertices that have been
visited, and V-U the list of vertices that haven't. One by one, we move
vertices from set V-U to set U by connecting the least weight edge. A C
sketch follows the pseudocode.

T = ∅;
U = { 1 };
while (U ≠ V)
    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V - U;
    T = T ∪ {(u, v)}
    U = U ∪ {v}
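As noted above, here is a minimal O(V²) C sketch of this pseudocode over
an adjacency matrix. A weight of 0 is taken to mean "no edge", the graph
is assumed to be connected, and the function and array names are
assumptions of the sketch.

#include <stdio.h>
#include <limits.h>

#define V 5

void primMST(int graph[V][V]) {
    int key[V], parent[V], inMST[V] = {0};
    for (int i = 0; i < V; i++) { key[i] = INT_MAX; parent[i] = -1; }
    key[0] = 0;                          /* start growing U from vertex 0 */

    for (int count = 0; count < V; count++) {
        int u = -1;
        for (int i = 0; i < V; i++)      /* vertex of V-U with minimum key */
            if (!inMST[i] && (u == -1 || key[i] < key[u]))
                u = i;
        inMST[u] = 1;                    /* move u into U */

        for (int w = 0; w < V; w++)      /* update keys of u's neighbours */
            if (graph[u][w] && !inMST[w] && graph[u][w] < key[w]) {
                key[w] = graph[u][w];
                parent[w] = u;
            }
    }
    for (int i = 1; i < V; i++)          /* the chosen edges form the MST */
        printf("edge (%d, %d) weight %d\n", parent[i], i, graph[i][parent[i]]);
}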
Example of Prim's algorithm
Example: Generate minimum cost spanning tree for the following graph
using Prim's algorithm. ( Homework)
Kruskal's Algorithm

Kruskal's algorithm is a minimum spanning tree algorithm that takes a
graph as input and finds the subset of the edges of that graph which
 forms a tree that includes every vertex
 has the minimum sum of weights among all the trees that can be formed
from the graph

The steps involved in Kruskal’s algorithm to generate a minimum spanning
tree are:

 Step 1: Sort all edges in increasing order of their edge weights.
 Step 2: Pick the smallest edge.
 Step 3: Check if the new edge creates a cycle or loop in the spanning
tree.
 Step 4: If it doesn’t form a cycle, then include that edge in the MST.
Otherwise, discard it.
 Step 5: Repeat from step 2 until the MST includes |V| - 1 edges.

Example of Kruskal's algorithm


Kruskal Algorithm Pseudocode

KRUSKAL(G):
    A = ∅
    For each vertex v ∈ G.V:
        MAKE-SET(v)
    For each edge (u, v) ∈ G.E ordered by increasing weight(u, v):
        if FIND-SET(u) ≠ FIND-SET(v):
            A = A ∪ {(u, v)}
            UNION(u, v)
    return A

Note: for program refer lab program-10
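Since the complete program is lab program 10, the following is only a
minimal C sketch of the pseudocode above. MAKE-SET, FIND-SET, and UNION
are realized with a simple parent array (path compression included; union
by rank omitted for brevity), and the vertex and edge counts are
assumptions.

#include <stdio.h>
#include <stdlib.h>

#define V 5
#define E 7

struct Edge { int u, v, w; };

int parent[V];

int findSet(int x) {                 /* FIND-SET with path compression */
    if (parent[x] != x)
        parent[x] = findSet(parent[x]);
    return parent[x];
}

int cmpEdges(const void *a, const void *b) {   /* sort edges by weight */
    return ((const struct Edge *)a)->w - ((const struct Edge *)b)->w;
}

void kruskal(struct Edge edges[E]) {
    for (int i = 0; i < V; i++)      /* MAKE-SET(v) for every vertex */
        parent[i] = i;
    qsort(edges, E, sizeof edges[0], cmpEdges);

    for (int i = 0, taken = 0; i < E && taken < V - 1; i++) {
        int ru = findSet(edges[i].u), rv = findSet(edges[i].v);
        if (ru != rv) {              /* different sets: no cycle is formed */
            printf("edge (%d, %d) weight %d\n",
                   edges[i].u, edges[i].v, edges[i].w);
            parent[ru] = rv;         /* UNION(u, v) */
            taken++;
        }
    }
}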

Home work: Find the Minimum Spanning Tree of the following graph using Kruskal's
algorithm.
Spanning Tree Applications
 Computer Network Routing Protocol
 Cluster Analysis
 Civil Network Planning
Minimum Spanning tree Applications
 To find paths in the map
 To design networks like telecommunication networks, water supply
networks, and electrical grids.
