MODULE 2
ARRAYS, LINKED LISTS AND STRINGS
Array in Data Structure
An array is a collection of items stored at contiguous memory locations. The idea is to store
multiple items of the same type together. This makes it easier to calculate the position of each
element by simply adding an offset to a base value, i.e., the memory location of the first element
of the array (generally denoted by the name of the array).
Properties of array
Some of the properties of an array are listed as follows -
Each element in an array is of the same data type and therefore occupies the same amount of memory (for example, 4 bytes for an int on typical platforms).
Elements in the array are stored at contiguous memory locations from which the first element is
stored at the smallest memory location.
Elements of the array can be randomly accessed since we can calculate the address of each element
of the array with the given base address and the size of the data element.
Representation of an array
We can represent an array in various ways in different programming languages. As an
illustration, let's see the declaration of an array in C language -
As per the above illustration, the following points are important –
Index starts with 0.
The array’s length is 10, which means we can store 10 elements.
Each element in the array can be accessed via its index.
Why are arrays required?
Arrays are useful because –
Sorting and searching a value in an array is easier.
Arrays are best to process multiple values quickly and easily.
Arrays are good for storing multiple values in a single variable - In computer programming, most cases require storing a large amount of data of a similar type. To store such an amount of data, we would need to define a large number of variables, and it would be very difficult to remember the names of all the variables while writing the program. Instead of naming all the variables differently, it is better to define an array and store all the elements in it.
Types of Arrays
One-dimensional Array: A simple linear array where elements are stored in a contiguous memory
location and accessed using a single index. It's the most basic form of an array.
Multi-dimensional Array: Arrays with more than one dimension. The most common type is a two-
dimensional array (also known as a matrix), but arrays with three or more dimensions are also possible.
Basic operations
Now, let's discuss the basic operations supported in the array –
Insertion operation
This operation is performed to insert one or more elements into the array. As per the requirements, an
element can be added at the beginning, end, or at any index of the array.
Now, let's see the implementation of inserting an element into the array.
Code:-
#include <stdio.h>
#define MAX_SIZE 100
void insertElement(int array[], int *size, int element, int position) {
if (*size >= MAX_SIZE || position < 0 || position > *size) {
printf("Insertion failed: Invalid position or array is full.\n");
return;
}
for (int i = *size; i > position; i--) /* shift elements right to make room */
array[i] = array[i - 1];
array[position] = element;
(*size)++;
}
int main() {
int array[MAX_SIZE] = {1, 2, 3, 4, 5};
int size = 5;
insertElement(array, &size, 10, 2);
printf("Array after insertion:\n");
for (int i = 0; i < size; i++)
printf("%d ", array[i]);
printf("\n");
return 0;
}
Deletion operation
As the name implies, this operation removes an element from the array and then reorganizes
all of the array elements.
Code:-
#include <stdio.h>
#define MAX_SIZE 100
void deleteElement(int array[], int *size, int position) {
if (position < 0 || position >= *size)
return; /* invalid position: nothing to delete */
for (int i = position; i < *size - 1; i++)
array[i] = array[i + 1]; /* shift elements left over the gap */
(*size)--;
}
int main() {
int array[MAX_SIZE] = {1, 2, 3, 4, 5};
int size = 5;
deleteElement(array, &size, 2); /* removes the element 3 */
for (int i = 0; i < size; i++)
printf("%d ", array[i]);
return 0;
}
Search operation
This operation is performed to search an element in the array based on the value or index.
Code:-
#include <stdio.h>
#define MAX_SIZE 100
int main() {
int array[MAX_SIZE] = {1, 3, 5, 7, 9, 2, 4, 6, 8, 10};
int size = 10, key = 6, position = -1;
for (int i = 0; i < size; i++)
if (array[i] == key) { position = i; break; } /* linear search by value */
if (position != -1)
printf("Element %d found at index %d\n", key, position);
else
printf("Element %d not found\n", key);
return 0;
}
Update operation
This operation is performed to update an existing array element located at the given index.
Code:-
#include <stdio.h>
#define MAX_SIZE 100
void updateElement(int array[], int size, int position, int newValue) {
if (position >= 0 && position < size)
array[position] = newValue;
else
printf("Update failed: Invalid position.\n");
}
int main() {
int array[MAX_SIZE] = {1, 2, 3, 4, 5};
int size = 5;
updateElement(array, size, 2, 10); /* array becomes 1 2 10 4 5 */
for (int i = 0; i < size; i++)
printf("%d ", array[i]);
return 0;
}
2D Array
2D array can be defined as an array of arrays. The 2D array is organized as matrices which can be
represented as the collection of rows and columns.
However, 2D arrays are created to implement a relational database look alike data structure. It provides
ease of holding bulk of data at once which can be passed to any number of functions wherever required.
How to declare 2D Array
The syntax of declaring two dimensional array is very much similar to that of a one dimensional array,
given as follows.
int arr[max_rows][max_columns];
However, it produces a data structure which looks like the following.
Above image shows the two dimensional array, the elements are organized in the form of rows and
columns. First element of the first row is represented by a[0][0] where the number shown in the first index
is the number of that row while the number shown in the second index is the number of the column.
Initializing 2D Arrays
We know that when we declare and initialize a one-dimensional array in C simultaneously, we don't need to specify the size of the array. However, this does not work with 2D arrays; we must define at least the second dimension of the array.
The syntax to declare and initialize the 2D array is given as follows.
int arr[2][2] = {0,1,2,3};
The number of elements that can be present in a 2D array will always be equal to (number of
rows * number of columns).
Example : Storing User’s data into a 2D array and printing it.
C Example :
#include <stdio.h>
int main () {
int arr[3][3],i,j;
for (i=0;i<3;i++) {
for (j=0;j<3;j++) {
printf("Enter a[%d][%d]: ",i,j);
scanf("%d",&arr[i][j]);
}
}
printf("\n printing the elements ....\n");
for(i=0;i<3;i++) {
printf("\n");
for (j=0;j<3;j++) {
printf("%d\t",arr[i][j]);
}
}
}
There are two main techniques of storing 2D array elements into memory
1. Row Major ordering
In row major ordering, all the rows of the 2D array are stored in memory contiguously. Considering the array shown in the above image, its memory allocation according to row major order is shown as follows.
First, the 1st row of the array is stored in memory completely, then the 2nd row of the array is stored completely, and so on till the last row.
2. Column Major ordering
According to column major ordering, all the columns of the 2D array are stored in memory contiguously. The memory allocation of the array shown in the above image is given as follows.
First, the 1st column of the array is stored in memory completely, then the 2nd column of the array is stored completely, and so on till the last column of the array.
Multidimensional Arrays in C
A multi-dimensional array can be termed as an array of arrays that stores homogeneous data in tabular
form. Data in multidimensional arrays is generally stored in row-major order in the memory.
The general form of declaring N-dimensional arrays is shown below.
Syntax:
data_type array_name[size1][size2]....[sizeN];
data_type: Type of data to be stored in the array.
array_name: Name of the array.
size1, size2,…, sizeN: Size of each dimension.
Three-Dimensional Array in C
A Three Dimensional Array or 3D array in C is a collection of two-dimensional arrays. It
can be visualized as multiple 2D arrays stacked on top of each other. It is declared as data_type array_name[x][y][z], where
x: Number of 2D arrays,
y: Number of rows in each 2D array, and
z: Number of columns in each 2D array.
Strings in C
C has no explicit string type; instead, strings are maintained as arrays of characters.
Representing strings in C
stored in arrays of characters
array can be of any length
end of string is indicated by a delimiter, the zero character ‘\0’
String Literals
String literal values are represented by sequences of characters between double quotes (“)
Examples
“” - empty string
“hello”
“a” versus ‘a’
‘a’ is a single character value (stored in 1 byte) as the ASCII value for a
“a” is an array with two characters, the first is a, the second is the character value \0
Declaration of a string
Since we cannot declare a string using a string data type, we instead use an array of type "char" to create a string.
Syntax :
char String_Variable_name [ SIZE ] ;
Examples :
char city[30];
char name[20];
char message[50];
Note, each variable is considered a constant in that the space it is connected to cannot be changed:
str1 = str2; /* not allowable, but we can copy the contents of str2 to str1 (more later) */
Sample Example:-
#include <stdio.h>
int main () {
char greeting[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
printf("Greeting message: %s\n", greeting );
return 0;
}
When the above code is compiled and executed, it produces the following result −
Greeting message: Hello
String functions we will see
strlen : finds the length of a string
strcat : concatenates one string at the end of another
strcmp : compares two strings lexicographically
strcpy : copies one string to another.
C gets() function
The gets() function enables the user to enter some characters followed by the enter key. All the characters entered by the user get stored in a character array, and a null character is appended to the array to make it a string. gets() allows the user to enter space-separated strings, and it returns the string entered by the user. Note that gets() cannot limit the number of characters read, so it is unsafe and was removed from the language in the C11 standard; fgets() is the recommended replacement.
C puts() function
The puts() function is very similar to the printf() function. It is used to print on the console a string which was previously read using gets() or scanf(). It prints an additional newline character after the string, which moves the cursor to the next line on the console. On success, puts() returns a non-negative integer (and EOF on error).
Pattern matching
Pattern matching in strings involves finding the occurrences of a specific pattern (substring) within a
larger string. There are various algorithms and approaches for pattern matching, with the most common
being the brute-force method, Knuth-Morris-Pratt (KMP) algorithm, and Boyer-Moore algorithm. Here's a
simple example of pattern matching in C using the brute-force method:
#include <stdio.h>
#include <string.h>
// Function to perform pattern matching using the brute-force method
void bruteForcePatternMatching(char text[], char pattern[]) {
int textLength = strlen(text);
int patternLength = strlen(pattern);
for (int i = 0; i <= textLength - patternLength; i++) {
int j;
for (j = 0; j < patternLength; j++) {
if (text[i + j] != pattern[j])
break; // Mismatch found, break inner loop
}
if (j == patternLength) {
printf("Pattern found at index %d\n", i);
}
}
}
int main() {
char text[] = "This is a simple example text for pattern matching.";
char pattern[] = "example";
bruteForcePatternMatching(text, pattern);
return 0;
}
Linked List
A linked list is a linear data structure, in which the elements are not stored at contiguous memory
locations. The elements in a linked list are linked using pointers as shown in the below image: In simple
words, a linked list consists of nodes where each node contains a data field and a reference(link) to the
next node in the list.
A Linked List can be defined as a collection of objects called nodes that are randomly stored in memory.
A node contains two fields, i.e., the data stored at that particular address and a pointer which contains the address of the next node in memory.
The last node of the list contains a pointer to null.
In the above figure, the arrows represent the links. The data part of every node contains the marks obtained by the student in a different subject. The last node in the list is identified by the null pointer present in its address part. We can have as many elements as we require in the data part of the list.
Representation of Singly Linked Lists:
A linked list is represented by a pointer to the first node of the linked list. The first node is called the head of the linked list. If the linked list is empty, then the head points to NULL.
Each node in a list consists of at least two parts:
- A Data Item (we can store integers, strings, or any type of data).
- Pointer (or reference) to the next node (connects one node to another), i.e., the address of another node.
In C, we can represent a node using structures. Below is an example of a linked list node with integer
data.
In Java or C#, LinkedList can be represented as a class and a Node as a separate class. The LinkedList
class contains a reference of Node class type.
// A linked list node
struct Node {
int data;
struct Node* next;
};
Operations on Singly Linked List
There are various operations which can be performed on singly linked list. A list of all such operations is
given below.
Node Creation
struct node
{
int data;
struct node *next;
};
struct node *head, *ptr;
ptr = (struct node *)malloc(sizeof(struct node)); /* allocate the size of the node itself, not of a pointer */
Circular Singly Linked List
Circular linked lists are mostly used in task maintenance in operating systems. There are many examples where circular linked lists are used in computer science, including browser surfing, where a record of the pages visited in the past by the user is maintained in the form of a circular linked list and can be accessed again on clicking the previous button.
Memory Representation of circular linked list:
The following image shows the memory representation of a circular linked list containing the marks of a student in 4 subjects. The image gives a glimpse of how the circular list is stored in memory. The start or head of the list points to the element at index 1, containing 13 marks in the data part and 4 in the next part, which means that it is linked with the node stored at the 4th index of the list.
However, because we are considering a circular linked list, the last node of the list contains the address of the first node of the list.
We can also have more than one linked list in memory, with different start pointers pointing to the different start nodes of the lists. The last node is identified by its next part, which contains the address of the start node of the list. We must be able to identify the last node of any linked list so that we can find out the number of iterations to perform while traversing the list.
Traversing in Circular Singly linked list
Traversing in a circular singly linked list can be done through a loop. Initialize a temporary pointer variable ptr to the head pointer and keep moving it until it comes back to head; a do-while loop ensures the last node is also visited.
ptr = head;
do {
printf("%d\n", ptr -> data);
ptr = ptr -> next;
} while (ptr != head);
Insertion into circular singly linked list at the beginning
temp = head;
while(temp->next != head)
temp = temp->next;
ptr->next = head;
temp -> next = ptr;
head = ptr;
Insertion into circular singly linked list at the end
temp = head;
while(temp -> next != head) {
temp = temp -> next;
}
temp -> next = ptr;
ptr -> next = head;
Deletion in circular singly linked list at beginning
ptr = head;
while(ptr -> next != head)
ptr = ptr -> next;
ptr->next = head->next;
free(head);
head = ptr->next;
printf("\nNode Deleted\n");
Deletion in Circular singly linked list at the end
ptr = head;
while(ptr ->next != head) {
preptr=ptr;
ptr = ptr->next;
}
preptr->next = ptr -> next;
free(ptr);
printf("\nNode Deleted\n");
Doubly Linked List:-
Inserting a new node in a doubly linked list is very similar to inserting a new node in a singly linked list. There is a little extra work required to maintain the link of the previous node. A node can be inserted in a Doubly Linked List in four ways:
At the front of the DLL.
In between two nodes:
- After a given node.
- Before a given node.
At the end of the DLL.
Insertion at the Beginning in Doubly Linked List:
To insert a new node at the beginning of the doubly list, we can use the following steps:
Allocate memory for a new node (say new_node) and assign the provided value to its data field.
Set the previous pointer of the new_node to NULL.
If the list is empty:
o Set the next pointer of the new_node to NULL.
o Update the head pointer to point to the new_node.
If the list is not empty:
o Set the next pointer of the new_node to the current head.
o Update the previous pointer of the current head to point to the new_node.
o Update the head pointer to point to the new_node.
Insertion at the End in Doubly Linked List:
The new node is always added after the last node of the given Linked List. This can be done using the following steps:
Create a new node (say new_node).
Put the value in the new node.
Make the next pointer of new_node as null.
If the list is empty, make new_node as the head.
Otherwise, travel to the end of the linked list.
Now make the next pointer of last node point to new_node.
Change the previous pointer of new_node to the last node of the list.
MODULE 3
STACK AND QUEUE
Stack
Stack is a linear data structure that follows a particular order in which the operations are performed. The
order may be LIFO(Last In First Out) or FILO(First In Last Out). LIFO implies that the element that is
inserted last, comes out first and FILO implies that the element that is inserted first, comes out last.
A Stack is a linear data structure that follows the LIFO (Last-In-First-Out) principle. Stack has one end,
whereas the Queue has two ends (front and rear). It contains only one pointer top pointer pointing to the
topmost element of the stack. Whenever an element is added in the stack, it is added on the top of the stack,
and the element can be deleted only from the stack. In other words, a stack can be defined as a container
in which insertion and deletion can be done from the one end known as the top of the stack.
o It is called a stack because it behaves like a real-world stack, e.g., a pile of books.
o A Stack is an abstract data type with a pre-defined capacity, which means that it can store the
elements of a limited size.
o It is a data structure that follows some order to insert and delete the elements, and that order can be
LIFO or FILO.
Working of Stack
Stack works on the LIFO pattern. As we can observe in the below figure there are five memory blocks in
the stack; therefore, the size of the stack is 5.
Suppose we want to store the elements in a stack and let's assume that stack is empty. We have taken the
stack of size 5 as shown below in which we are pushing the elements one by one until the stack becomes
full.
Our stack is now full, since its size is 5. In the above cases, we can observe that the stack gets filled up from the bottom to the top: each new element is placed one position above the previous one.
When we perform the delete operation on the stack, there is only one way for entry and exit as the other end
is closed. It follows the LIFO pattern, which means that the value entered first will be removed last. In the
above case, the value 5 is entered first, so it will be removed only after the deletion of all the other elements.
o push(): When we insert an element in a stack then the operation is known as a push. If the stack is
full then the overflow condition occurs.
o pop(): When we delete an element from the stack, the operation is known as a pop. If the stack is
empty means that no element exists in the stack, this state is known as an underflow state.
o isEmpty(): It determines whether the stack is empty or not.
o isFull(): It determines whether the stack is full or not.
o peek(): It returns the topmost element of the stack without removing it.
o count(): It returns the total number of elements available in a stack.
o change(): It changes the element at the given position.
o display(): It prints all the elements available in the stack.
PUSH operation
Before inserting an element into the stack, we check whether the stack is full.
If we try to insert an element into a full stack, then the overflow condition occurs.
If the stack is not full, the top is incremented by 1, i.e., top=top+1, and the new element is stored at the position pointed to by top.
POP operation
Before deleting the element from the stack, we check whether the stack is empty.
If we try to delete the element from the empty stack, then the underflow condition occurs.
If the stack is not empty, we first access the element which is pointed by the top
Once the pop operation is performed, the top is decremented by 1, i.e., top=top-1.
Basic Operations on Stack
In order to make manipulations in a stack, there are certain operations provided to us.
push() to insert an element into the stack
pop() to remove an element from the stack
top() Returns the top element of the stack.
isEmpty() returns true if stack is empty else false.
size() returns the size of stack.
Push:
Adds an item to the stack. If the stack is full, then it is said to be an Overflow condition.
Algorithm for push:
begin
if stack is full
return
else
increment top
stack[top] assign value
endif
end procedure
Pop:
Removes an item from the stack. The items are popped in the reversed order in which they are pushed. If
the stack is empty, then it is said to be an Underflow condition.
Algorithm for pop:
begin
if stack is empty
return
else
store value of stack[top]
decrement top
return value
endif
end procedure
Top:
Returns the top element of the stack.
Algorithm for Top:
begin
return stack[top]
end procedure
isEmpty:
Returns true if the stack is empty, else false.
Algorithm for isEmpty:
begin
if top < 0
return true
else
return false
end procedure
Applications of Stack
o Balancing of symbols: Stack is used for balancing symbols, for example, checking that every opening bracket in a program or expression has a matching closing bracket.
o String reversal: Stack is also used for reversing a string. For example, we want to reverse a
"javaTpoint" string, so we can achieve this with the help of a stack.
First, we push all the characters of the string in a stack until we reach the null character.
After pushing all the characters, we start taking out the character one by one until we reach the
bottom of the stack.
o UNDO/REDO: It can also be used for performing UNDO/REDO operations. For example, we have
an editor in which we write 'a', then 'b', and then 'c'; therefore, the text written in an editor is abc. So,
there are three states, a, ab, and abc, which are stored in a stack. There would be two stacks in which
one stack shows UNDO state, and the other shows REDO state.
If we want to perform UNDO operation, and want to achieve 'ab' state, then we implement pop
operation.
o Recursion: Recursion means that a function calls itself. To maintain the previous states, the compiler creates a system stack in which all the previous records of the function calls are maintained.
o DFS(Depth First Search): This search is implemented on a Graph, and Graph uses the stack data
structure.
o Backtracking: Suppose we have to create a path to solve a maze problem. If, while moving along a particular path, we realize that we have gone the wrong way, then in order to return to the beginning of the path and create a new path, we have to use the stack data structure.
o Expression conversion: Stack can also be used for expression conversion. This is one of the most
important applications of stack. The list of the expression conversion is given below:
o Infix to prefix
o Infix to postfix
o Prefix to infix
o Prefix to postfix
o Postfix to infix
o Memory management: The stack is also used for memory management. The memory is assigned in contiguous memory blocks, known as stack memory, because all the variables of a function call are assigned there. The memory size assigned to the program is known to the compiler. When a function is called, its variables are assigned in the stack memory, and when the function completes its execution, all the variables assigned on the stack are released.
Implementing Stack using Arrays
#include<stdio.h>
int stack[100],choice,n,top,x,i;
void push(void);
void pop(void);
void display(void);
int main(){
top=-1;
printf("\n Enter the size of STACK[MAX=100]:");
scanf("%d",&n);
printf("\n\t STACK OPERATIONS USING ARRAY");
printf("\n\t--------------------------------");
printf("\n\t 1.PUSH\n\t 2.POP\n\t 3.DISPLAY\n\t 4.EXIT");
do{
printf("\n Enter the Choice:");
scanf("%d",&choice);
switch(choice){
case 1:{
push();
break;
}
case 2:{
pop();
break;
}
case 3:{
display();
break;
}
case 4:{
printf("\n\t EXIT POINT ");
break;
}
default:{
printf ("\n\t Please Enter a Valid Choice(1/2/3/4)");
}
}
}
while(choice!=4);
return 0;
}
void push(){
if(top>=n-1){
printf("\n\tSTACK overflow");
}
else{
printf(" Enter a value to be pushed:");
scanf("%d",&x);
top++;
stack[top]=x;
}
}
void pop(){
if(top<=-1){
printf("\n\t Stack underflow");
}
else{
printf("\n\t The popped element is %d",stack[top]);
top--;
}
}
void display(){
if(top>=0){
printf("\n The elements in STACK \n");
for(i=top; i>=0; i--)
printf("\n%d",stack[i]);
printf("\n Press Next Choice");
}
else{
printf("\n The STACK is empty");
}
}
Characteristics of Queue:
Queue can handle multiple data.
We can access both ends.
They are fast and flexible.
Queue Representation:
Like stacks, Queues can also be represented in an array: In this representation, the Queue is implemented
using the array. Variables used in this case are
Queue: the name of the array storing queue elements.
Front: the index where the first element is stored in the array representing the queue.
Rear: the index where the last element is stored in an array representing the queue.
1. A queue can be defined as an ordered list which enables insert operations to be performed at one end
called REAR and delete operations to be performed at another end called FRONT.
2. Queue is referred to be as First In First Out list.
3. For example, people waiting in line for a rail ticket form a queue.
Applications of Queue
Since a queue performs actions on a first in, first out basis, which is quite fair for the ordering of actions, queues have various applications, discussed below.
1. Queues are widely used as waiting lists for a single shared resource like printer, disk, CPU.
2. Queues are used in asynchronous transfer of data (where data is not being transferred at the same
rate between two processes) for eg. pipes, file IO, sockets.
3. Queues are used as buffers in most of the applications like MP3 media player, CD player, etc.
4. Queue are used to maintain the play list in media players in order to add and remove the songs from
the play-list.
5. Queues are used in operating systems for handling interrupts.
Types of Queue
There are four different types of queue, listed as follows -
Linear Queue
In a Linear Queue, an insertion takes place from one end while the deletion occurs from the other end. The end at which the insertion takes place is known as the rear end, and the end at which the deletion takes place is known as the front end. It strictly follows the FIFO rule.
The major drawback of using a linear Queue is that insertion is done only from the rear end. If the first three
elements are deleted from the Queue, we cannot insert more elements even though the space is available in
a Linear Queue. In this case, the linear Queue shows the overflow condition as the rear is pointing to the last
element of the Queue.
Circular Queue
In Circular Queue, all the nodes are represented as circular. It is similar to the linear Queue except that the
last element of the queue is connected to the first element. It is also known as Ring Buffer, as all the ends
are connected to another end. The representation of circular queue is shown in the below image -
The drawback that occurs in a linear queue is overcome by using the circular queue. If the empty space is
available in a circular queue, the new element can be added in an empty space by simply incrementing the
value of rear. The main advantage of using the circular queue is better memory utilization.
Priority Queue
It is a special type of queue in which the elements are arranged based on the priority. It is a special type of
queue data structure in which every element has a priority associated with it. Suppose some elements occur
with the same priority, they will be arranged according to the FIFO principle. The representation of priority
queue is shown in the below image -
Insertion in priority queue takes place based on the arrival, while deletion in the priority queue occurs based
on the priority. Priority queue is mainly used to implement the CPU scheduling algorithms.
There are two types of priority queue that are discussed as follows -
Ascending priority queue - In ascending priority queue, elements can be inserted in arbitrary order,
but only smallest can be deleted first. Suppose an array with elements 7, 5, and 3 in the same order,
so, insertion can be done with the same sequence, but the order of deleting the elements is 3, 5, 7.
Descending priority queue - In descending priority queue, elements can be inserted in arbitrary
order, but only the largest element can be deleted first. Suppose an array with elements 7, 3, and 5
in the same order, so, insertion can be done with the same sequence, but the order of deleting the
elements is 7, 5, 3.
Deque (Double Ended Queue)
In a Deque or Double Ended Queue, insertion and deletion can be done from both ends of the queue, either from the front or the rear. It means that we can insert and delete elements from both the front and rear ends of the queue. A deque can be used as a palindrome checker: a string is a palindrome if reading it from both ends gives the same result.
Deque can be used both as stack and queue as it allows the insertion and deletion operations on both ends.
A deque can be considered as a stack because a stack follows the LIFO (Last In First Out) principle, in which both insertion and deletion are performed at one end; in a deque, it is likewise possible to perform both insertion and deletion from a single end, in which case the deque does not follow the FIFO principle.
The representation of the deque is shown in the below image -
There are two types of deque that are discussed as follows -
Input restricted deque - As the name implies, in input restricted queue, insertion operation can be
performed at only one end, while deletion can be performed from both ends.
Output restricted deque - As the name implies, in output restricted queue, deletion operation can be
performed at only one end, while insertion can be performed from both ends.
After inserting the first element, the value of front will increase from -1 to 0, and the queue will look something like the following.
Algorithm to insert any element in a queue
Check if the queue is already full by comparing rear to max - 1. If so, then return an overflow error.
If the item is to be inserted as the first element in the list, in that case set the value of front and rear to 0 and
insert the element at the rear end.
Otherwise keep increasing the value of rear and insert each element one by one having rear as the index.
Algorithm
Step 1: IF REAR = MAX - 1
Write OVERFLOW
Go to step 4
[END OF IF]
Step 2: IF FRONT = -1 and REAR = -1
SET FRONT = REAR = 0
ELSE
SET REAR = REAR + 1
[END OF IF]
Step 3: Set QUEUE[REAR] = NUM
Step 4: EXIT
C Function
void insert (int queue[], int max, int *front, int *rear, int item) {
if (*rear == max - 1) { /* front and rear are passed by pointer so the caller sees the updates */
printf("overflow");
}
else {
if(*front == -1 && *rear == -1) {
*front = 0;
*rear = 0;
}
else {
*rear = *rear + 1;
}
queue[*rear]=item;
}
}
Algorithm to delete an element from the queue
If the value of front is -1 or the value of front is greater than rear, write an underflow message and exit.
Otherwise, return the item stored at the front index of the queue and increase the value of front by one at each deletion.
Algorithm
Step 1: IF FRONT = -1 or FRONT > REAR
Write UNDERFLOW
ELSE
SET VAL = QUEUE[FRONT]
SET FRONT = FRONT + 1
[END OF IF]
Step 2: EXIT
C Function
int delete (int queue[], int max, int *front, int *rear) {
int y = -1; /* -1 signals underflow to the caller */
if (*front == -1 || *front > *rear) {
printf("underflow");
}
else {
y = queue[*front];
if(*front == *rear)
*front = *rear = -1; /* the queue is now empty, so reset both pointers */
else
*front = *front + 1;
}
return y;
}
Insertion (linked list implementation of queue)
C Function
void insert(int item) {
struct node *ptr = (struct node *) malloc (sizeof(struct node));
if(ptr == NULL) {
printf("\nOVERFLOW\n");
return;
}
else {
ptr -> data = item;
if(front == NULL) {
front = ptr;
rear = ptr;
front -> next = NULL;
rear -> next = NULL;
}
else {
rear -> next = ptr;
rear = ptr;
rear->next = NULL;
} } }
Deletion
The deletion operation removes the element that was inserted first among all the queue elements. First, we need
to check whether the list is empty. The condition front == NULL becomes true if the list is empty; in
this case, we simply write underflow on the console and exit.
Otherwise, we delete the element pointed to by the pointer front. For this purpose, copy the node
pointed to by front into the pointer ptr, shift the front pointer to its next node, and free
the node pointed to by ptr. This is done by using the following statements.
ptr = front;
front = front -> next;
free(ptr);
The algorithm and C function are given as follows.
Algorithm
- Step 1: IF FRONT = NULL
Write " Underflow "
Go to Step 5
[END OF IF]
- Step 2: SET PTR = FRONT
- Step 3: SET FRONT = FRONT -> NEXT
- Step 4: FREE PTR
- Step 5: END
C Function
void delete (void) {
    struct node *ptr;
    if (front == NULL) {
        printf("\nUNDERFLOW\n");
        return;
    }
    else {
        ptr = front;
        front = front -> next;
        free(ptr);
    }
}
// Driver code
int main() {
    struct Queue* q = createQueue();
    enQueue(q, 10);
    enQueue(q, 20);
    deQueue(q);
    deQueue(q);
    enQueue(q, 30);
    enQueue(q, 40);
    enQueue(q, 50);
    deQueue(q);
    printf("Queue Front : %d \n", ((q->front != NULL) ? (q->front)->key : -1));
    printf("Queue Rear : %d", ((q->rear != NULL) ? (q->rear)->key : -1));
    return 0;
}
Circular Queue
There is one limitation in the array implementation of the queue: if rear reaches the end position of the
array, there may still be vacant spaces at the beginning which cannot be utilized. The concept of the
circular queue was introduced to overcome this limitation.
As we can see in the above image, rear is at the last position of the queue and front is pointing somewhere
other than the 0th position. In the above array there are only two elements and the other three positions are
empty. Yet, because rear is at the last position, an attempt to insert an element will report that there
are no empty spaces in the queue. One way to avoid this wastage of memory space is to shift both elements
to the left and adjust front and rear accordingly. This is not a practical approach, because shifting
all the elements consumes a lot of time. The efficient way to avoid the wastage of memory is to use the
circular queue data structure.
A circular queue is similar to a linear queue as it is also based on the FIFO (First In First Out) principle
except that the last position is connected to the first position in a circular queue that forms a circle. It is also
known as a Ring Buffer.
The following are the operations that can be performed on a circular queue:
Front: It is used to get the front element from the Queue.
Rear: It is used to get the rear element from the Queue.
enQueue(value): This function is used to insert the new value in the Queue. The new element is
always inserted from the rear end.
deQueue(): This function deletes an element from the Queue. The deletion in a Queue always takes
place from the front end.
Enqueue operation
Let's understand the enqueue and dequeue operation through the diagrammatic representation.
Implementation of circular queue using Array
#include <stdio.h>
#define max 6
int queue[max];   // array declaration
int front = -1;
int rear = -1;
// function to insert an element in a circular queue
void enqueue(int element)
{
    if (front == -1 && rear == -1)        // condition to check queue is empty
    {
        front = 0;
        rear = 0;
        queue[rear] = element;
    }
    else if ((rear + 1) % max == front)   // condition to check queue is full
    {
        printf("Queue is overflow..");
    }
    else
    {
        rear = (rear + 1) % max;          // rear is incremented
        queue[rear] = element;            // assigning a value to the queue at the rear position
    }
}
Implementation of circular queue using linked list
As we know, a linked list is a linear data structure whose nodes store two parts, i.e., a data part and an address part,
where the address part contains the address of the next node. Here, a linked list is used to implement the circular
queue; therefore, the linked list follows the properties of the queue. When we implement the circular
queue using a linked list, both the enqueue and dequeue operations take O(1) time.
#include <stdio.h>
#include <stdlib.h>
// Declaration of struct type node
struct node
{
    int data;
    struct node *next;
};
struct node *front = NULL;
struct node *rear = NULL;
// function to insert the element in the Queue
void enqueue(int x)
{
    struct node *newnode;   // declaration of pointer of struct node type
    newnode = (struct node *)malloc(sizeof(struct node));   // allocating memory to the newnode
    newnode->data = x;
    newnode->next = NULL;
    if (rear == NULL)       // checking whether the Queue is empty or not
    {
        front = rear = newnode;
        rear->next = front;
    }
    else
    {
        rear->next = newnode;
        rear = newnode;
        rear->next = front;
    }
}
// function to display all the elements of the Queue
void display()
{
    struct node *temp = front;
    if (front == NULL)
    {
        printf("\nQueue is empty");
    }
    else
    {
        while (temp->next != front)
        {
            printf("%d,", temp->data);
            temp = temp->next;
        }
        printf("%d", temp->data);
    }
}
int main()
{
    enqueue(34);
    enqueue(10);
    enqueue(23);
    display();
    dequeue();
    peek();
    return 0;
}
Types of deque
We can also perform peek operations in the deque along with the operations listed above. Through the peek
operation, we can get the front and rear elements of the deque. So, in addition to the above
operations, the following operations are also supported in a deque -
Get the front item from the deque
Get the rear item from the deque
Check whether the deque is full or not
Check whether the deque is empty or not
Implementation of deque
#include <stdio.h>
#define size 5
int deque[size];
int f = -1, r = -1;
// insert_front function will insert the value from the front
void insert_front(int x)
{
    if ((f == 0 && r == size - 1) || (f == r + 1))   // deque is full
    {
        printf("Overflow");
    }
    else if ((f == -1) && (r == -1))                 // deque is empty
    {
        f = r = 0;
        deque[f] = x;
    }
    else if (f == 0)                                 // wrap around to the end
    {
        f = size - 1;
        deque[f] = x;
    }
    else
    {
        f = f - 1;
        deque[f] = x;
    }
}
// insert_rear function will insert the value from the rear
void insert_rear(int x)
{
    if ((f == 0 && r == size - 1) || (f == r + 1))
        printf("Overflow");
    else
    {
        if ((f == -1) && (r == -1))
            f = r = 0;
        else if (r == size - 1)
            r = 0;              // wrap around to the beginning
        else
            r = r + 1;
        deque[r] = x;
    }
}
// display function prints all the values of the deque
void display()
{
    int i = f;
    printf("\nElements in a deque : ");
    while (i != r)
    {
        printf("%d ", deque[i]);
        i = (i + 1) % size;
    }
    printf("%d", deque[r]);
}
// getfront function retrieves the value at the front end
void getfront()
{
    if (f == -1)
        printf("\nDeque is empty");
    else
        printf("\nThe value of the element at front is %d", deque[f]);
}
// getrear function retrieves the value at the rear end
void getrear()
{
    if (r == -1)
        printf("\nDeque is empty");
    else
        printf("\nThe value of the element at rear is %d", deque[r]);
}
// delete_front function deletes the element from the front end
void delete_front()
{
    if ((f == -1) && (r == -1))
        printf("\nDeque is empty");
    else
    {
        printf("\nThe deleted element is %d", deque[f]);
        if (f == r)             // only one element was left
            f = r = -1;
        else if (f == size - 1) // wrap around to the beginning
            f = 0;
        else
            f = f + 1;
    }
}
// delete_rear function deletes the element from the rear end
void delete_rear()
{
    if ((f == -1) && (r == -1))
        printf("\nDeque is empty");
    else
    {
        printf("\nThe deleted element is %d", deque[r]);
        if (f == r)
            f = r = -1;
        else if (r == 0)        // wrap around to the end
            r = size - 1;
        else
            r = r - 1;
    }
}
int main()
{
insert_front(20);
insert_front(10);
insert_rear(30);
insert_rear(50);
insert_rear(80);
display(); // Calling the display function to retrieve the values of deque
getfront(); // Retrieve the value at front-end
getrear(); // Retrieve the value at rear-end
delete_front();
delete_rear();
display(); // calling display function to retrieve values after deletion
return 0;
}
Priority queue
A priority queue is an abstract data type that behaves similarly to the normal queue except that each element
has some priority, i.e., the element with the highest priority would come first in a priority queue. The priority
of the elements in a priority queue will determine the order in which elements are removed from the priority
queue.
The priority queue supports only comparable elements, which means that the elements are either arranged
in an ascending or descending order.
For example, suppose we have the values 1, 3, 4, 8, 14, 22 inserted in a priority queue, with the ordering
imposed on the values from least to greatest. Then 1 would have the highest
priority while 22 would have the lowest priority.
An element with higher priority will be deleted before an element with lower priority.
If two elements in a priority queue have the same priority, they will be arranged using the FIFO
principle.
Ascending order priority queue: In an ascending order priority queue, a lower priority number is
treated as a higher priority. For example, if we take the numbers from 1 to 5 arranged in
ascending order (1, 2, 3, 4, 5), the smallest number, i.e., 1, is given the highest priority
in the priority queue.
Descending order priority queue: In a descending order priority queue, a higher priority number is
treated as a higher priority. For example, if we take the numbers from 1 to 5 arranged in
descending order (5, 4, 3, 2, 1), the largest number, i.e., 5, is given the highest priority
in the priority queue.
Representation of priority queue
Now, we will see how to represent the priority queue through a one-way list.
We will create the priority queue by using the list given below in which INFO list contains the data
elements, PRN list contains the priority numbers of each data element available in the INFO list, and LINK
basically contains the address of the next node.
The priority queue can be implemented in four ways that include arrays, linked list, heap data structure and
binary search tree. The heap data structure is the most efficient way of implementing the priority queue, so
we will implement the priority queue using a heap data structure in this topic. Now, first we understand the
reason why heap is the most efficient way among all the other data structures.
Analysis of complexities using different implementations
Implementation         add         remove      peek
Linked list            O(n)        O(1)        O(1)
Binary heap            O(log n)    O(log n)    O(1)
Binary search tree     O(log n)    O(log n)    O(1)
What is Heap?
A heap is a tree-based data structure that forms a complete binary tree and satisfies the heap property: if A
is a parent node of B, then A is ordered with respect to B, for all nodes A and B in the heap. This means
that either the value of every parent node is greater than or equal to the values of its child nodes, or the
value of every parent node is less than or equal to the values of its child nodes. Therefore, we can say that
there are two types of heaps:
Max heap: The max heap is a heap in which the value of the parent node is greater than or equal to the
values of the child nodes.
Min heap: The min heap is a heap in which the value of the parent node is less than or equal to the values
of the child nodes.
Both heaps are binary heaps, as each node has at most two child nodes.
The common operations that we can perform on a priority queue are insertion, deletion and peek. Let's see
how we can maintain the heap data structure.
Inserting the element in a priority queue (max heap)
If we insert an element in a priority queue, it will move to the empty slot by looking from top to bottom and
left to right.
If the element is not in a correct place then it is compared with the parent node; if it is found out of order,
elements are swapped. This process continues until the element is placed in a correct position.
Removing the maximum element from the priority queue (max heap)
As we know, in a max heap the maximum element is the root node. When we remove the root node, it
creates an empty slot. The last inserted element is moved into this empty slot. Then, this element is
compared with its child nodes, i.e., the left child and the right child, and swapped with the larger of the two. It keeps
moving down the tree until the heap property is restored.
Program to create the priority queue using the binary max heap.
#include <stdio.h>
int heap[40];
int size = -1;
// returning the index of the parent node of node i
int parent(int i)
{
    return (i - 1) / 2;
}
// function to move the node up the tree in order to restore the heap property
void moveUp(int i)
{
    while (i > 0 && heap[i] > heap[parent(i)])
    {
        // swapping the node with its parent node
        int temp;
        temp = heap[parent(i)];
        heap[parent(i)] = heap[i];
        heap[i] = temp;
        // updating the value of i to the index of its parent
        i = parent(i);
    }
}
// function to move the node down the tree in order to restore the heap property
void moveDown(int k)
{
    int index = k;
    int left = 2 * k + 1;    // left child
    int right = 2 * k + 2;   // right child
    if (left <= size && heap[left] > heap[index])
        index = left;
    if (right <= size && heap[right] > heap[index])
        index = right;
    if (index != k)
    {
        // swapping the node with the larger of its children
        int temp = heap[index];
        heap[index] = heap[k];
        heap[k] = temp;
        moveDown(index);
    }
}
// function to insert the element in the priority queue
void insert(int p)
{
    size = size + 1;
    heap[size] = p;
    // the node stored at the last location is shifted up to its proper position
    moveUp(size);
}
// function to remove the maximum element (the root) from the priority queue
int removeMax()
{
    int max = heap[0];
    // the last inserted element is moved to the root and shifted down
    heap[0] = heap[size];
    size = size - 1;
    moveDown(0);
    return max;
}
// in a max heap, the minimum element is stored at one of the leaf nodes
int get_Min()
{
    int min = heap[size];
    for (int i = size / 2; i <= size; i++)
        if (heap[i] < min)
            min = heap[i];
    return min;
}
int main()
{
    insert(20);
    insert(19);
    insert(21);
    insert(18);
    insert(12);
    insert(17);
    insert(15);
    insert(16);
    insert(14);
    int min = get_Min();
    printf("\nThe element which is having the minimum priority is : %d", min);
    return 0;
}
A tree data structure is a hierarchical structure that is used to represent and organize data in a
way that is easy to navigate and search. It is a collection of nodes connected by edges,
with a hierarchical relationship between the nodes.
The topmost node of the tree is called the root, and the nodes below it are called the child nodes.
Each node can have multiple child nodes, and these child nodes can also have their own child
nodes, forming a recursive structure.
Basic Terminologies In Tree Data Structure:
Parent Node: The node which is a predecessor of a node is called the parent node of that
node. {B} is the parent node of {D, E}.
Child Node: The node which is the immediate successor of a node is called the child node of
that node. Examples: {D, E} are the child nodes of {B}.
Root Node: The topmost node of a tree or the node which does not have any parent node is
called the root node. {A} is the root node of the tree. A non-empty tree must contain exactly one
root node and exactly one path from the root to all other nodes of the tree.
Leaf Node or External Node: The nodes which do not have any child nodes are called leaf
nodes. {K, L, M, N, O, P, G} are the leaf nodes of the tree.
Ancestor of a Node: Any predecessor nodes on the path of the root to that node are called
Ancestors of that node. {A,B} are the ancestor nodes of the node {E}
Descendant: A node x is a descendant of another node y if and only if y is an ancestor of x.
Sibling: Children of the same parent node are called siblings. {D,E} are called siblings.
Level of a node: The count of edges on the path from the root node to that node. The root node
has level 0.
Internal node: A node with at least one child is called Internal Node.
Neighbour of a Node: Parent or child nodes of that node are called neighbors of that node.
struct Node {
    int data;
    struct Node* first_child;
    struct Node* second_child;
    struct Node* third_child;
    /* ... one pointer per possible child, up to the n-th child */
    struct Node* nth_child;
};
Basic Operations Of Tree Data Structure:
Create – create a tree in the data structure.
Insert − Inserts data in a tree.
Search − Searches specific data in a tree to check whether it is present or not.
Traversal: Depth-First-Search Traversal, Breadth-First-Search Traversal
Properties of Tree Data Structure:
o Number of edges: An edge can be defined as the connection between two nodes. If a tree has N
nodes then it will have (N-1) edges. There is only one path from each node to any other node of
the tree.
o Depth of a node: The depth of a node is defined as the length of the path from the root to that
node. Each edge adds 1 unit of length to the path. So, it can also be defined as the number of
edges in the path from the root of the tree to the node.
o Height of a node: The height of a node can be defined as the length of the longest path from the
node to a leaf node of the tree.
o Height of the Tree: The height of a tree is the length of the longest path from the root of the tree
to a leaf node of the tree.
o Degree of a Node: The total count of subtrees attached to that node is called the degree of the
node. The degree of a leaf node must be 0. The degree of a tree is the maximum degree of a node
among all the nodes in the tree.
Applications of Tree Data Structure:
File System: This allows for efficient navigation and organization of files.
Data Compression: Huffman coding is a popular technique for data compression that involves
constructing a binary tree where the leaves represent characters and their frequency of
occurrence. The resulting tree is used to encode the data in a way that minimizes the amount of
storage required.
Compiler Design: In compiler design, a syntax tree is used to represent the structure of a
program.
Database Indexing: B-trees and other tree structures are used in database indexing to efficiently
search for and retrieve data.
Advantages of Tree Data Structure:
Trees offer efficient searching depending on the type of tree, with average search times of
O(log n) for balanced trees like AVL trees.
Trees provide a hierarchical representation of data, making it easy to organize and
navigate large amounts of information.
The recursive nature of trees makes them easy to traverse and manipulate using recursive
algorithms.
Disadvantages of Tree Data Structure:
Trees can become unbalanced, meaning that the height of the tree is skewed towards one side, which can
lead to inefficient search times.
Trees demand more memory space than some other data structures like arrays
and linked lists, especially if the tree is very large.
The implementation and manipulation of trees can be complex and require a good
understanding of the algorithms.
BINARY TREE
A Binary Tree is a non-linear data structure where each node has at most two children. In this section,
we will cover the basics of the binary tree, operations on it, its implementation, and its advantages and
disadvantages, which will help you solve problems based on the binary tree.
Inorder traversal of an expression tree produces the infix version of the given postfix expression
(similarly, postorder traversal gives the postfix expression).
Construction of Expression Tree:
Now For constructing an expression tree we use a stack. We loop through input expression and
do the following for every character.
If a character is an operand, push it onto the stack.
If a character is an operator, pop two values from the stack, make them its children, and push the
current node again.
In the end, the only element of the stack will be the root of an expression tree.
Examples:
Input: A B C*+ D/
Output: A + B * C / D
BINARY SEARCH TREE
A Binary Search Tree is a data structure used in computer science for organizing and storing data in a
sorted manner. A binary search tree follows all the properties of a binary tree; in addition, the left subtree
of a node contains values less than the node and the right subtree contains values greater than the node. This
hierarchical structure allows for efficient searching, insertion, and deletion operations on the data
stored in the tree.
The above tree is a binary search tree and also a height-balanced tree. Suppose we want to find
the value 79 in the above tree. First, we compare 79 with the value of the root node. Since 79
is greater than 35, we move to its right child, i.e., 48. Since 79 is greater than 48, we
move to the right child of 48. The value of the right child of node 48 is 79. The number of hops
required to search for the element 79 is 2.
Similarly, any element can be found with at most 2 jumps because the height of the tree is 2.
So it can be seen that any value in a balanced binary tree can be searched in O(logN) time
where N is the number of nodes in the tree. But if the tree is not height-balanced then in the
worst case, a search operation can take O(N) time.
Applications of Height-Balanced Binary Tree:
Balanced trees are mostly used for in-memory sorts of sets and dictionaries.
Balanced trees are also used extensively in database applications in which insertions and
deletions are fewer but there are frequent lookups for data required.
It is used in applications that require improved searching apart from database applications.
It has applications in storyline games as well.
It is used mainly in corporate sectors where they have to keep the information about the
employees working there and their change in shifts.
Advantages of Height-Balanced Binary Tree:
It improves the worst-case lookup time at the expense of making the typical case roughly one
lookup slower.
As a general rule, a height-balanced tree works better when the request frequencies across
the data set are more evenly spread.
It gives better search time complexity.
Disadvantages of Height-Balanced Binary Tree:
Longer running times for the insert and remove operations.
Must keep balancing info in each node.
To find nodes to balance, must go back up in the tree.
How to check if a given tree is height-balanced:
You can check if a tree is height-balanced using recursion based on the idea that every subtree
of the tree will also be height-balanced. To check if a tree is height-balanced perform the
following operations:
Use recursion and visit the left subtree and right subtree of each node:
Check the height of the left subtree and right subtree.
If the absolute difference between their heights is at most 1 then that node is height-balanced.
Otherwise, that node and the whole tree is not balanced.
AVL TREE
An AVL tree is defined as a self-balancing Binary Search Tree (BST) where the difference
between the heights of the left and right subtrees for any node cannot be more than one.
The difference between the heights of the left subtree and the right subtree for any node is known
as the balance factor of the node.
The AVL tree is named after its inventors, Georgy Adelson-Velsky and Evgenii Landis, who
published it in their 1962 paper “An algorithm for the organization of information”.
Example of AVL Trees:
The above tree is AVL because the differences between the heights of left and right subtrees for every
node are less than or equal to 1.
Operations on an AVL Tree:
1. Insertion
2. Deletion
3. Searching [It is similar to performing a search in BST]
Rotating the subtrees in an AVL Tree:
An AVL tree may rotate in one of the following four ways to keep itself balanced:
Left Rotation:
When a node is added into the right subtree of the right subtree, if the tree gets out of balance, we do a
single left rotation.
Right Rotation:
If a node is added to the left subtree of the left subtree, the AVL tree may get out of balance, we do a
single right rotation.
Left-Right Rotation:
A left-right rotation is a combination in which first left rotation takes place after that right rotation
executes.
Right-Left Rotation:
A right-left rotation is a combination in which first right rotation takes place after that left rotation
executes.
Applications of AVL Tree:
It is used to index huge records in a database and also to efficiently search in that.
For all types of in-memory collections, including sets and dictionaries, AVL Trees are used.
Database applications, where insertions and deletions are less common but frequent data lookups
are necessary
Software that needs optimized search.
It is applied in corporate areas and storyline games.
Advantages of AVL Tree:
AVL trees can self-balance themselves.
It is surely not skewed.
It provides faster lookups than Red-Black Trees
Better searching time complexity compared to other trees like binary tree.
Height cannot exceed O(log N), where N is the total number of nodes in the tree.
Disadvantages of AVL Tree:
It is difficult to implement.
It has high constant factors for some of the operations.
Less used compared to Red-Black trees.
Due to its rather strict balance, AVL trees provide complicated insertion and removal operations
as more rotations are performed.
Take more processing for balancing
B TREE
A B Tree is a specialized m-way tree that is widely used for disk access. A B-Tree of order m
can have at most m-1 keys and m children. One of the main reasons for using a B tree is its capability
to store a large number of keys in a single node, and large key values, while keeping the height of the
tree relatively small.
A B tree of order m contains all the properties of an M way tree. In addition, it contains the
following properties.
Every node in a B-Tree contains at most m children.
Every node in a B-Tree except the root node and the leaf nodes contains at least m/2 children.
The root node must have at least 2 children, unless it is a leaf node.
All leaf nodes must be at the same level.
A B tree of order 4 is shown in the following image.
While performing some operations on B Tree, any property of B Tree may violate such as number of
minimum children a node can have. To maintain the properties of B Tree, the tree may split or join.
Operations
1) Searching :
Searching in a B Tree is similar to searching in a binary search tree. For example, suppose we search
for the item 49 in the following B Tree.
The process will be something like the following:
1) Compare item 49 with the root node 78. Since 49 < 78, move to its left sub-tree.
2) Since 40 < 49 < 56, traverse the right sub-tree of 40.
3) Since 49 > 45, move to the right and compare again.
4) A match is found; return.
Searching in a B tree depends upon the height of the tree. The search algorithm takes O(log n) time
to search any element in a B tree.
2. Inserting
Insertions are done at the leaf node level. The following algorithm needs to be followed in order to
insert an item into a B Tree.
Traverse the B Tree in order to find the appropriate leaf node at which the new key can be inserted.
If the leaf node contains less than m-1 keys, then insert the element in increasing order.
Else, if the leaf node contains m-1 keys, then follow the following steps:
Insert the new element in increasing order of elements.
Split the node into two nodes at the median.
Push the median element up to its parent node.
If the parent node also contains m-1 keys, then split it too by following the same steps.
Example:
Insert the node 8 into the B Tree of order 5 shown in the following image.
The node now contains 5 keys, which is greater than (5 - 1 = 4) keys. Therefore, split the node at the
median, i.e., 8, and push it up to its parent node as shown below.
3. Deletion
Deletion is also performed at the leaf nodes. The node which is to be deleted can either be a leaf
node or an internal node. The following algorithm needs to be followed in order to delete a node from
a B tree.
Locate the leaf node.
If there are more than m/2 keys in the leaf node, then delete the desired key from the node.
If the leaf node doesn't contain m/2 keys, then complete the keys by taking an element from the right
or left sibling.
If the left sibling contains more than m/2 elements, then push its largest element up to its parent and
move the intervening element down to the node where the key is deleted.
If the right sibling contains more than m/2 elements, then push its smallest element up to the parent
and move the intervening element down to the node where the key is deleted.
If neither of the siblings contains more than m/2 elements, then create a new leaf node by joining the two
leaf nodes and the intervening element of the parent node.
If the parent is left with less than m/2 keys, then apply the above process to the parent too.
If the node which is to be deleted is an internal node, then replace the node with its in-order
successor or predecessor. Since the successor or predecessor will always be in a leaf node, the
process will be similar to deleting the node from a leaf node.
Example 1
Delete the node 53 from the B Tree of order 5 shown in the following figure.
Now, 57 is the only element left in the node, while the minimum number of elements that must be
present in a B tree of order 5 is 2. Since it is less than that, and the elements in its left and right sub-trees
are also not sufficient, merge it with the left sibling and the intervening element of the parent, i.e., 49.
The final B tree is shown as follows.
Applications of B-Trees:
It is used in large databases to access data stored on the disk
Searching for data in a data set can be achieved in significantly less time using the B-Tree
With the indexing feature, multilevel indexing can be achieved.
Most of the servers also use the B-tree approach.
B-Trees are used in CAD systems to organize and search geometric data.
B-Trees are also used in other areas such as natural language processing, computer networks, and
cryptography.
Advantages of B-Trees:
B-Trees have a guaranteed time complexity of O(log n) for basic operations like insertion,
deletion, and searching, which makes them suitable for large data sets and real-time applications.
B-Trees are self-balancing.
High-concurrency and high-throughput.
Efficient storage utilization.
Disadvantages of B-Trees:
B-Trees are based on disk-based data structures and can have a high disk usage.
Not the best for all cases.
Slow in comparison to other data structures.
Searching for a key in an m-way search tree is similar to searching in a binary search tree.
To search for 77 in the 5-way search tree shown in the figure, we begin at the root; as 77 > 76 > 44 > 18
(77 is greater than every key in the root), we move to the fourth sub-tree.
In the root node of the fourth sub-tree, 77 < 80, and therefore we move to the first sub-tree of the
node. Since 77 is available in the only node of this sub-tree, we claim that 77 was successfully
searched.
Advantages of Multi-way Trees
Efficient Use of Space:
Multi-way trees can use space more efficiently by storing multiple keys and children in each node,
reducing the overall height of the tree and minimizing the number of nodes.
Reduced Tree Height:
By having more children per node, multi-way trees often have a lower height compared to binary
trees, leading to shorter paths from the root to any leaf. This reduction in height results in faster
search, insertion, and deletion operations.
Improved Disk Access:
Multi-way trees are particularly well-suited for scenarios involving disk access. By storing
multiple keys in each node, they reduce the number of disk accesses required for operations, as
more data can be read or written in a single I/O operation.
Balanced Structure:
Many multi-way trees, like B-trees, maintain a balanced structure, ensuring that all leaf nodes are
at the same level. This balance provides consistent and predictable performance for search,
insertion, and deletion operations.
Support for Range Queries:
Multi-way trees facilitate efficient range queries and sequential access to data, which is beneficial
for applications like databases where such operations are common.
Flexibility in Node Size:
The node size in multi-way trees can be adjusted to optimize performance based on the underlying
hardware, such as matching the node size to the block size of the storage medium.
Disadvantages of Multi-way Trees
Complex Implementation: The implementation of multi-way trees is more complex than simpler
structures like binary search trees. Handling node splits, merges, and balancing operations requires
additional code and careful handling.
Memory Overhead: Each node in a multi-way tree contains multiple keys and pointers, which can
lead to significant memory overhead. This is particularly problematic if the keys or pointers are
large.
Partial Node Utilization: To maintain balance and efficiency, nodes in multi-way trees may not be
fully utilized, leading to wasted space within nodes. This partial utilization can impact overall
memory efficiency.
Insertion and Deletion Complexity: Insertion and deletion operations in multi-way trees can be
complex, involving multiple steps to ensure that the tree remains balanced. These operations can be
more intricate compared to simpler trees.
Not Always Optimal for In-Memory Structures: Multi-way trees are optimized for scenarios
involving disk-based storage. For purely in-memory structures, simpler data structures like hash
tables or binary search trees may offer better performance due to lower overhead.
What is a Graph?
Graph is a non-linear data structure consisting of vertices and edges. The vertices are sometimes also
referred to as nodes and the edges are lines or arcs that connect any two nodes in the graph. More
formally a Graph is composed of a set of vertices( V ) and a set of edges( E ). The graph is denoted
by G(V, E).
Components of a Graph:
Vertices: Vertices are the fundamental units of the graph. Sometimes, vertices are also known as
vertex or nodes. Every node/vertex can be labeled or unlabelled.
Edges: Edges are drawn or used to connect two nodes of the graph. It can be ordered pair of
nodes in a directed graph. Edges can connect any two nodes in any possible way. There are no
rules. Sometimes, edges are also known as arcs. Every edge can be labelled/unlabelled.
Representation of Graphs:
There are two ways to store a graph:
Adjacency Matrix
Adjacency List
Breadth First Search (BFS)
BFS traverses the graph level by level using a queue, as follows:
Initialization: Enqueue the starting node into a queue and mark it as visited.
Exploration: While the queue is not empty:
Dequeue a node from the queue and visit it (e.g., print its value).
For each unvisited neighbor of the dequeued node:
Enqueue the neighbor into the queue.
Mark the neighbor as visited.
Termination: Repeat step 2 until the queue is empty.
How Does the BFS Algorithm Work?
Starting from the root, all the nodes at a particular level are visited first and then the nodes of the
next level are traversed till all the nodes are visited.
To do this a queue is used. All the adjacent unvisited nodes of the current level are pushed into
the queue and the nodes of the current level are marked visited and popped from the queue.
Illustration:
Let us understand the working of the algorithm with the help of the following example.
Step 1: Initially the queue and visited arrays are empty.
Step 2: Push node 0 into the queue and mark it visited.
Step 3: Remove node 0 from the front of the queue and push its unvisited neighbours into the
queue.
Step 4: Remove node 1 from the front of the queue and push its unvisited neighbours into the
queue.
Step 5: Remove node 2 from the front of the queue and push its unvisited neighbours into the
queue.
Step 6: Remove node 3 from the front of the queue. Since every neighbour of node 3 has already
been visited, move on to the next node at the front of the queue.
Step 7: Remove node 4 from the front of the queue. Since every neighbour of node 4 has already
been visited, move on to the next node at the front of the queue.
The queue is now empty, so the traversal terminates.
Applications of BFS in Graphs:
BFS has various applications in graph theory and computer science, including:
Shortest Path Finding: BFS can be used to find the shortest path between two nodes in an
unweighted graph. By keeping track of the parent of each node during the traversal, the shortest
path can be reconstructed.
Cycle Detection: BFS can be used to detect cycles in a graph. If, during the traversal, an
already-visited node is reached again through a different edge, the graph contains a cycle.
Connected Components: BFS can be used to identify connected components in a graph. Each
connected component is a set of nodes that can be reached from each other.
Topological Sorting: BFS can be used to perform topological sorting on a directed acyclic graph
(DAG). Topological sorting arranges the nodes in a linear order such that for any edge (u, v), u
appears before v in the order.
Level Order Traversal of Binary Trees: BFS can be used to perform a level order traversal of
a binary tree. This traversal visits all nodes at the same level before moving to the next level.
Network Routing: BFS can be used to find the shortest path between two nodes in a network,
making it useful for routing data packets in network protocols.
Advantages of BFS:
1. If a solution exists, BFS is guaranteed to find it.
2. BFS never gets trapped in a blind alley, i.e., it does not commit to a single unproductive branch.
3. If there is more than one solution, BFS finds the one requiring the fewest steps.
Disadvantages Of BFS:
1. Memory constraints: BFS must store all the nodes of the present level in order to generate the next level.
2. If the solution is far from the source, BFS consumes a lot of time.
DFS- DEPTH FIRST TRAVERSAL OF A TREE
Depth First Traversal (or DFS) for a graph is similar to Depth First Traversal of a tree. The only catch
here is, that, unlike trees, graphs may contain cycles (a node may be visited twice). To avoid processing
a node more than once, use a boolean visited array. A graph can have more than one DFS traversal.
How does DFS work?
Depth-first search is an algorithm for traversing or searching tree or graph data structures. The algorithm
starts at the root node (selecting some arbitrary node as the root node in the case of a graph) and
explores as far as possible along each branch before backtracking.
Let us understand the working of Depth First Search with the help of the following illustration:
Depth-first search
Advantages Of DFS:
1. The memory requirement is linear with respect to the number of nodes.
2. It typically has lower space overhead than BFS, since only the current path is stored rather than an entire level.
3. A solution can often be found without exploring much of the search space.
Disadvantages of DFS:
1. It is not guaranteed to find a solution: DFS can descend indefinitely along a deep or infinite branch.
2. If the chosen cut-off depth is smaller than the solution depth, the solution is missed; if it is much larger, time is wasted.
3. The depth required cannot be determined until the search has proceeded.
Applications of DFS:-
1. Finding Connected components.
2. Topological sorting.
3. Finding Bridges of the graph.
Introduction to Dijkstra’s Shortest Path Algorithm
Dijkstra's algorithm finds the shortest path from a given source vertex to every other vertex in a
weighted graph with non-negative edge weights. It maintains a set of visited vertices and a set of
unvisited vertices. It starts at the source
vertex and iteratively selects the unvisited vertex with the smallest tentative distance from the source. It
then visits the neighbors of this vertex and updates their tentative distances if a shorter path is found.
This process continues until the destination vertex is reached, or all reachable vertices have been visited.
Need for Dijkstra’s Algorithm (Purpose and Use-Cases)
The need for Dijkstra’s algorithm arises in many applications where finding the shortest path between
two points is crucial.
For example, It can be used in the routing protocols for computer networks and also used by map
systems to find the shortest path between starting point and the Destination
Can Dijkstra’s Algorithm work on both Directed and Undirected graphs?
Yes, Dijkstra’s algorithm can work on both directed graphs and undirected graphs, as it is
designed to work on any type of graph as long as the edge weights are non-negative. (Vertices
that are not reachable from the source simply retain an infinite distance.)
In a directed graph, each edge has a direction, indicating the direction of travel between the vertices
connected by the edge. In this case, the algorithm follows the direction of the edges when searching for
the shortest path.
In an undirected graph, the edges have no direction, and the algorithm can traverse both forward and
backward along the edges when searching for the shortest path.
Algorithm for Dijkstra’s Algorithm:
Mark the source node with a current distance of 0 and the rest with infinity.
Set the unvisited node with the smallest current distance as the current node.
For each neighbour N of the current node: add the current node's distance to the weight of the edge
connecting the current node and N. If this sum is smaller than the current distance of N, set it as the
new current distance of N.
Mark the current node as visited.
Go to step 2 if any nodes are still unvisited.
How does Dijkstra’s Algorithm work?
Let’s see how Dijkstra’s Algorithm works with an example given below:
Dijkstra’s Algorithm will generate the shortest path from Node 0 to all other Nodes in the graph.
Consider the below graph:
Dijkstra’s Algorithm
The algorithm will generate the shortest path from node 0 to all the other nodes in the graph.
For this graph, we will assume that the weight of the edges represents the distance between two
nodes.
As we can see, we obtain the shortest paths from:
Node 0 to Node 1,
Node 0 to Node 2,
Node 0 to Node 3,
Node 0 to Node 4, and
Node 0 to Node 6.
Initially we have the following setup:
The distance from the source node to itself is 0. In this example the source node is 0.
The distance from the source node to every other node is unknown, so we mark all of them as
infinity.
Example: 0 -> 0, 1 -> ∞, 2 -> ∞, 3 -> ∞, 4 -> ∞, 5 -> ∞, 6 -> ∞.
We’ll also have an array of unvisited elements that keeps track of the unvisited (unmarked)
nodes.
The algorithm completes when all the nodes have been marked visited and their distances
added to the path. Unvisited nodes: 0 1 2 3 4 5 6.
Step 1: Start from Node 0 and mark it as visited (visited nodes are shown in red in the image
below).
Step 2: Check the adjacent nodes. We now have two choices (Node 1 at distance 2 or Node 2 at
distance 6); choose the node with the minimum distance. In this step Node 1 is the minimum-distance
adjacent node, so mark it as visited and add up the distance.
Distance: Node 0 -> Node 1 = 2
Step 3: Then move forward and check the adjacent node, which is Node 3; mark it as visited and
add up the distance. The distance is now:
Distance: Node 0 -> Node 1 -> Node 3 = 2 + 5 = 7
Step 4: Again we have two choices for adjacent nodes (Node 4 at distance 10 or Node 5 at distance
15); choose the node with the minimum distance. In this step Node 4 is the minimum-distance
adjacent node, so mark it as visited and add up the distance.
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 = 2 + 5 + 10 = 17
Step 5: Again, move forward and check the adjacent node, which is Node 6; mark it as visited
and add up the distance. The distance is now:
Distance: Node 0 -> Node 1 -> Node 3 -> Node 4 -> Node 6 = 2 + 5 + 10 + 2 = 19
So, the shortest distance from the source vertex to Node 6 is 19, which is optimal.
MODULE V
LINEAR SEARCH
Linear Search is defined as a sequential search algorithm that starts at one end and goes through each
element of a list until the desired element is found, otherwise the search continues till the end of the data
set. In this article, we will learn about the basics of Linear Search Algorithm, Applications, Advantages,
Disadvantages, etc. to provide a deep understanding of Linear Search.
Step 1: Start from the first element and compare the key with arr[0]. Since they are not equal, move
on and compare the key with the next element, arr[1]. Since they are not equal either, the iterator
moves to the next element as a potential match.
Step 2: Now, when comparing arr[2] with the key, the value matches. So the Linear Search Algorithm
reports success and returns the index of the element where the key was found (here, 2).
BINARY SEARCH
Binary Search works on a sorted array by repeatedly dividing the search space in half.
First Step: Compare the middle element of the search space with the key.
Second Step: If the key matches the value of the mid element, the element is found and the search
stops. Otherwise, if the key is smaller than the mid element, the search continues in the left half; if it
is larger, it continues in the right half.
HASHING
Hashing in Data Structures refers to the process of transforming a given key to another value. It involves
mapping data to a specific index in a hash table using a hash function that enables fast retrieval of
information based on its key. The transformation of a key to the corresponding value is done using a Hash
Function and the value obtained from the hash function is called Hash Code .
Components of Hashing
There are majorly three components of hashing:
Key: A key can be anything (a string or an integer) that is fed as input to the hash function,
the technique that determines the index or location where an item is stored in a data structure.
Hash Function: The hash function receives the input key and returns the index of an element in
an array called a hash table. The index is known as the hash index .
Hash Table: Hash table is a data structure that maps keys to values using a special function called
a hash function. Hash stores the data in an associative manner in an array where each data value
has its own unique index.
How does Hashing work?
Suppose we have a set of strings {“ab”, “cd”, “efg”} and we would like to store it in a table.
Our main objective here is to search or update the values stored in the table quickly in O(1) time and we
are not concerned about the ordering of strings in the table. So the given set of strings can act as a key and
the string itself will act as the value of the string but how to store the value corresponding to the key?
Step 1: We know that hash functions (which is some mathematical formula) are used to calculate
the hash value which acts as the index of the data structure where the value will be stored.
Step 2: So, let’s assign
o “a” = 1,
o “b”=2, .. etc, to all alphabetical characters.
Step 3: Therefore, the numerical value obtained by summing the characters of each string is:
“ab” = 1 + 2 = 3,
“cd” = 3 + 4 = 7 ,
“efg” = 5 + 6 + 7 = 18
Step 4: Now, assume that we have a table of size 7 to store these strings. The hash function that is
used here is the sum of the characters in key mod Table size . We can compute the location of the
string in the array by taking the sum(string) mod 7 .
Step 5: So we will then store
“ab” in 3 mod 7 = 3,
“cd” in 7 mod 7 = 0, and
“efg” in 18 mod 7 = 4.
The above technique enables us to calculate the location of a given string by using a simple hash function
and rapidly find the value that is stored in that location. Therefore the idea of hashing seems like a great
way to store (key, value) pairs of the data in a table.
What is Collision?
Collision in Hashing occurs when two different keys map to the same hash value. Hash collisions can be
intentionally created for many hash algorithms. The probability of a hash collision depends on the size of
the hash value, the distribution of hash values, and the quality of the hash function.
The hashing process generates a small number for a big key, so there is a possibility that two keys could
produce the same value. The situation where a newly inserted key maps to an already occupied slot
must be handled using some collision handling technique.
How to handle Collisions?
There are mainly two methods to handle collision:
Separate Chaining
Open Addressing
1) Separate Chaining
The idea is to make each cell of the hash table point to a linked list of records that have the same hash
function value. Chaining is simple but requires additional memory outside the table.
Example: We have given a hash function and we have to insert some elements in the hash table using a
separate chaining method for collision resolution technique.
Hash function = key % 5,
Elements = 12, 15, 22, 25 and 37.
In this way, the separate chaining method is used as the collision resolution technique.
2) Open Addressing
In open addressing, all elements are stored in the hash table itself. Each table entry contains either a record
or NIL. When searching for an element, we examine the table slots one by one until the desired element is
found or it is clear that the element is not in the table.
2.a) Linear Probing
In linear probing, the hash table is searched sequentially, starting from the original hash
location. If the location we get is already occupied, we check the next location.
Algorithm:
Calculate the hash key, i.e., key = data % size.
Check if hashTable[key] is empty:
store the value directly by hashTable[key] = data.
If the hash index already has some value:
check for the next index using key = (key + 1) % size.
Check if the next index hashTable[key] is available and, if so, store the value there. Otherwise,
try the next index.
Repeat this process until an empty slot is found.
Example: Let us consider a simple hash function as “key mod 5” and a sequence of keys that are
to be inserted are 50, 70, 76, 85, 93.
2.b) Quadratic Probing
Quadratic probing is an open addressing scheme in computer programming for resolving hash collisions in
hash tables. Quadratic probing operates by taking the original hash index and adding successive values of
an arbitrary quadratic polynomial until an open slot is found.
An example sequence using quadratic probing is:
H + 1², H + 2², H + 3², H + 4², … , H + k²
This method is also sometimes called the mid-square method, because in the i-th iteration we probe the
(i²)-th slot, with i = 0, 1, …, n − 1. We always start from the original hash location; if the location is
occupied, we check the subsequent slots.
Example: Let us consider table size = 7, the hash function Hash(x) = x % 7, and the collision resolution
strategy f(i) = i². Insert 22, 30, and 50.
Double hashing
Double hashing is a collision resolution technique used in open addressing schemes for hash tables. It
combines two hash functions to calculate the probe sequence, which helps distribute the keys more
uniformly across the table, reducing clustering and improving the performance of the hash table.
How Double Hashing Works
Primary Hash Function: The first hash function, h1(k), is used to determine the initial
position of the key k in the hash table.
Secondary Hash Function: The second hash function, h2(k), is used to determine the step
size for probing in case of a collision.
Probe Sequence: The probe sequence is calculated as follows:
index_i = (h1(k) + i · h2(k)) mod m
where i is the probe number (starting from 0) and m is the size of the hash table.
Example
Let's consider a simple example:
Hash table size, m = 10
Primary hash function, h1(k) = k mod 10
Secondary hash function, h2(k) = 1 + (k mod 9)
To insert keys into the hash table using double hashing:
Insert key 19:
h1(19) = 19 mod 10 = 9
h2(19) = 1 + (19 mod 9) = 1 + 1 = 2
Initial index: 9 (no collision)
Insert key 28:
h1(28) = 28 mod 10 = 8
h2(28) = 1 + (28 mod 9) = 1 + 1 = 2
Initial index: 8 (no collision)
Insert key 18 (causes collision with key 28):
h1(18) = 18 mod 10 = 8
h2(18) = 1 + (18 mod 9) = 1 + 0 = 1
Initial index: 8 (collision)
Next index: (8 + 1·1) mod 10 = 9 (collision with key 19)
Next index: (8 + 2·1) mod 10 = 0 (no collision, place key 18 here)
By using both the primary and secondary hash functions, double hashing effectively resolves collisions
and distributes keys more uniformly, improving the overall efficiency of the hash table.
Insertion sort
Insertion sort is a simple sorting algorithm that works by building a sorted array one element at a
time. It is considered an “in-place” sorting algorithm, meaning it doesn’t require any additional
memory space beyond the original array.
To achieve insertion sort, follow these steps:
Start with the second element of the array, since the first element is assumed to be
sorted.
Compare the second element with the first element; if the second element is smaller,
swap them.
Move to the third element and compare it with the second element, then the first element and swap
as necessary to put it in the correct position among the first three elements.
Continue this process, comparing each element with the ones before it and swapping as needed to
place it in the correct position among the sorted elements.
Repeat until the entire array is sorted.
Working of Insertion Sort Algorithm:
Consider an array having elements: {23, 1, 10, 5, 2}
First Pass:
o Current element is 23
o The first element in the array is assumed to be sorted.
o The sorted part until 0th index is : [23]
Second Pass:
o Compare 1 with 23 (current element with the sorted part).
o Since 1 is smaller, insert 1 before 23.
o The sorted part until 1st index is: [1, 23]
Third Pass:
o Compare 10 with 1 and 23 (current element with the sorted part).
o Since 10 is greater than 1 and smaller than 23, insert 10 between 1 and 23.
o The sorted part until 2nd index is: [1, 10, 23]
Fourth Pass:
o Compare 5 with 1, 10, and 23 (current element with the sorted part).
o Since 5 is greater than 1 and smaller than 10, insert 5 between 1 and 10.
o The sorted part until 3rd index is: [1, 5, 10, 23]
Fifth Pass:
o Compare 2 with 1, 5, 10, and 23 (current element with the sorted part).
o Since 2 is greater than 1 and smaller than 5 insert 2 between 1 and 5.
o The sorted part until 4th index is: [1, 2, 5, 10, 23]
Final Array:
The sorted array is: [1, 2, 5, 10, 23]
// C program for insertion sort
#include <stdio.h>

void insertionSort(int arr[], int n)
{
    for (int i = 1; i < n; i++) {
        int key = arr[i];
        int j = i - 1;
        // Shift elements of the sorted part that are greater than key
        while (j >= 0 && arr[j] > key) {
            arr[j + 1] = arr[j];
            j--;
        }
        arr[j + 1] = key;  // insert key at its correct position
    }
}

void printArray(int arr[], int n)
{
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    printf("\n");
}

int main()
{
    int arr[] = {23, 1, 10, 5, 2};
    int n = sizeof(arr) / sizeof(arr[0]);
    insertionSort(arr, n);
    printArray(arr, n);
    return 0;
}
Advantages of Insertion Sort:
Simple and easy to implement.
Stable sorting algorithm.
Efficient for small lists and nearly sorted lists.
Space-efficient.
Disadvantages of Insertion Sort:
Inefficient for large lists.
Not as efficient as other sorting algorithms (e.g., merge sort, quick sort) for most cases.
Applications of Insertion Sort:
Insertion sort is commonly used in situations where:
The list is small or nearly sorted.
Simplicity and stability are important.
Selection sort
Selection sort is a simple comparison-based sorting algorithm that works by repeatedly selecting the
smallest (or largest) element from the unsorted portion of the list and moving it to the sorted portion.
The algorithm repeatedly selects the smallest (or largest) element from the unsorted portion of the list and
swaps it with the first element of the unsorted part. This process is repeated for the remaining unsorted
portion until the entire list is sorted.
How does Selection Sort Algorithm work?
Let’s consider the following array as an example: arr[] = {64, 25, 12, 22, 11}
First pass:
For the first position in the sorted array, the whole array is traversed from index 0 to 4 sequentially. The
first position is where 64 is presently stored; after traversing the whole array it is clear that 11 is the
lowest value.
Thus, swap 64 with 11. After one iteration, 11, which happens to be the least value in the array,
appears in the first position of the sorted list.
Second Pass:
For the second position, where 25 is present, again traverse the rest of the array in a sequential manner.
After traversing, we found that 12 is the second lowest value in the array and it should appear at the
second place in the array, thus swap these values.
Third Pass:
Now, for the third place, where 25 is present, again traverse the rest of the array and find the third least
value present in the array.
While traversing, 22 comes out to be the third least value; since it should appear at the third place in the
array, swap 22 with the element present at the third position.
Fourth pass:
Similarly, for fourth position traverse the rest of the array and find the fourth least element in the array
As 25 is the 4th lowest value hence, it will place at the fourth position.
Fifth Pass:
At last, the largest value present in the array automatically gets placed at the last position.
The resulting array is the sorted array.
// C program for implementation of selection sort
#include <stdio.h>

void selectionSort(int arr[], int n)
{
    for (int i = 0; i < n - 1; i++) {
        int min = i;                     // index of the smallest element
        for (int j = i + 1; j < n; j++)  // in the unsorted part arr[i..n-1]
            if (arr[j] < arr[min])
                min = j;
        int t = arr[min];                // swap it to the front of the
        arr[min] = arr[i];               // unsorted part
        arr[i] = t;
    }
}

int main()
{
    int arr[] = {64, 25, 12, 22, 11};
    int n = sizeof(arr) / sizeof(arr[0]);
    selectionSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    return 0;
}
QuickSort
QuickSort is a sorting algorithm based on the Divide and Conquer algorithm that picks an element as a
pivot and partitions the given array around the picked pivot by placing the pivot in its correct position in
the sorted array.
How does QuickSort work?
The key process in quickSort is partition(). The target of partition is to place the pivot (any
element can be chosen as the pivot) at its correct position in the sorted array, putting all smaller
elements to the left of the pivot and all greater elements to the right of the pivot.
Partition is done recursively on each side of the pivot after the pivot is placed in its correct
position and this finally sorts the array.
Partition Algorithm:
The logic is simple, we start from the leftmost element and keep track of the index of smaller (or equal)
elements as i. While traversing, if we find a smaller element, we swap the current element with arr[i].
Otherwise, we ignore the current element.
Let us understand the working of partition and the Quick Sort algorithm with the help of the following
example:
Consider: arr[] = {10, 80, 30, 90, 40}.
Compare 10 with the pivot; as it is less than the pivot, arrange it accordingly.
Compare 30 with the pivot; it is less than the pivot, so arrange it accordingly.
As the partition process is done recursively, it keeps on putting the pivot in its actual position in the
sorted array. Repeatedly putting pivots in their actual position makes the array sorted.
Follow the below images to understand how the recursive implementation of the partition algorithm
helps to sort the array.
Initial partition on the main array:
// C program for QuickSort
#include <stdio.h>

void swap(int *a, int *b) { int t = *a; *a = *b; *b = t; }

// Places the last element (chosen as pivot) at its correct position and
// puts smaller elements to its left, greater ones to its right.
int partition(int arr[], int low, int high)
{
    int pivot = arr[high];
    int i = low - 1;               // index of the last smaller element
    for (int j = low; j < high; j++)
        if (arr[j] < pivot)
            swap(&arr[++i], &arr[j]);
    swap(&arr[i + 1], &arr[high]); // move pivot into place
    return i + 1;
}

void quickSort(int arr[], int low, int high)
{
    if (low < high) {
        int pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);   // sort left of pivot
        quickSort(arr, pi + 1, high);  // sort right of pivot
    }
}

int main()
{
    int arr[] = {10, 80, 30, 90, 40};
    int n = sizeof(arr) / sizeof(arr[0]);
    // Function call
    quickSort(arr, 0, n - 1);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);
    return 0;
}
Merge Sort
Merge Sort is a divide-and-conquer algorithm: it divides the array into halves, recursively sorts each
half, and then merges the sorted halves. Let’s sort the array [38, 27, 43, 10] using Merge Sort.
Let’s look at the working of above example:
Divide:
[38, 27, 43, 10] is divided into [38, 27] and [43, 10].
[38, 27] is divided into [38] and [27].
[43, 10] is divided into [43] and [10].
Conquer:
[38] is already sorted.
[27] is already sorted.
[43] is already sorted.
[10] is already sorted.
Merge:
Merge [38] and [27] to get [27, 38].
Merge [43] and [10] to get [10,43].
Merge [27, 38] and [10,43] to get the final sorted
list [10, 27, 38, 43]
Therefore, the sorted list is [10, 27, 38, 43].
// C program for Merge Sort
#include <stdio.h>
#include <stdlib.h>
// Merges two subarrays of arr[].
// First subarray is arr[l..m]
// Second subarray is arr[m+1..r]
void merge(int arr[], int l, int m, int r)
{
int i, j, k;
int n1 = m - l + 1;
int n2 = r - m;
// Create temp arrays
int L[n1], R[n2];
// Copy data to temp arrays L[] and R[]
for (i = 0; i < n1; i++)
L[i] = arr[l + i];
for (j = 0; j < n2; j++)
R[j] = arr[m + 1 + j];
// Merge the temp arrays back into arr[l..r]
i = 0;
j = 0;
k = l;
while (i < n1 && j < n2) {
if (L[i] <= R[j]) {
arr[k] = L[i];
i++;
}
else {
arr[k] = R[j];
j++;
}
k++;
}
// Copy the remaining elements of L[],
// if there are any
while (i < n1) {
arr[k] = L[i];
i++;
k++;
}
// Copy the remaining elements of R[],
// if there are any
while (j < n2) {
arr[k] = R[j];
j++;
k++;
}
}
// l is for left index and r is right index of the
// sub-array of arr to be sorted
void mergeSort(int arr[], int l, int r)
{
if (l < r) {
int m = l + (r - l) / 2;
// Sort the two halves, then merge them
mergeSort(arr, l, m);
mergeSort(arr, m + 1, r);
merge(arr, l, m, r);
}
}
Exchange sort
Exchange sort is an algorithm used to sort in ascending as well as descending order. It compares
the first element with every other element; if any element is found to be out of order, it swaps them.
Example:
Input: arr[] = {5, 1, 4, 2, 8}
Output: {1, 2, 4, 5, 8}
Explanation: Working of exchange sort:
1st Pass:
Exchange sort starts with the very first element,
comparing it with the other elements to check which one is greater.
( 5 1 4 2 8 ) –> ( 1 5 4 2 8 ).
Here, the algorithm compares the first two elements and swaps since 5 > 1.
No further swaps occur since none of the remaining elements is smaller than 1, so after the 1st pass: (1 5 4 2 8)
2nd Pass:
(1 5 4 2 8 ) –> ( 1 4 5 2 8 ), since 4 < 5
( 1 4 5 2 8 ) –> ( 1 2 5 4 8 ), since 2 < 4
( 1 2 5 4 8 ) No change since in this there is no other element smaller than 2
3rd Pass:
(1 2 5 4 8 ) -> (1 2 4 5 8 ), since 4 < 5
After this pass the array is sorted, so the algorithm exits the loop.
To sort in Ascending order:
procedure ExchangeSort(num: list of sortable items)
n = length(num)
// outer loop
for i = 0 to n - 2 do
// inner loop
for j = i + 1 to n - 1 do
if num[i] > num[j] do
swap(num[i], num[j])
end if
end for
end for
end procedure
Time Complexity: O(N^2)
Auxiliary Space : O(1)
Advantages of using Exchange sort over other sorting methods:
There are some situations where exchange sort may be preferable over other algorithms. For
example, exchange sort may be useful when sorting very small arrays or when sorting data
that is already mostly sorted. In these cases, the overhead of implementing a more complex
algorithm may not be worth the potential performance gains.
Another advantage of the exchange sort is that it is stable, meaning that it preserves the
relative order of equal elements. This can be important in some applications, such as when
sorting records that contain multiple fields.
Properties of a Good Hash Function
Deterministic: A hash function must consistently produce the same output for the same
input.
Fixed Output Size: The output of a hash function should have a fixed size, regardless of the
size of the input.
Efficiency: The hash function should be able to process input quickly.
Uniformity: The hash function should distribute the hash values uniformly across the output
space to avoid clustering.
Pre-image Resistance: It should be computationally infeasible to reverse the hash function,
i.e., to find the original input given a hash value.
Collision Resistance: It should be difficult to find two different inputs that produce the same
hash value.
Avalanche Effect: A small change in the input should produce a significantly different hash
value.
Applications of Hash Functions
Hash Tables: The most common use of hash functions in DSA is in hash tables, which
provide an efficient way to store and retrieve data.
Data Integrity: Hash functions are used to ensure the integrity of data by generating
checksums.
Cryptography: In cryptographic applications, hash functions are used to create secure hash
algorithms like SHA-256.
Data Structures: Hash functions are utilized in various data structures such as Bloom filters
and hash sets.
Types of Hash Functions
There are many hash functions that use numeric or alphanumeric keys. This article focuses on
discussing different hash functions:
1. Division Method.
2. Multiplication Method
3. Mid-Square Method
4. Folding Method
5. Cryptographic Hash Functions
6. Universal Hashing
7. Perfect Hashing
Let’s begin discussing these methods in detail.
1. Division Method
The division method involves dividing the key by a prime number and using the remainder as the hash
value.
h(k)=k mod m
Where k is the key and m is a prime number.
Advantages:
Simple to implement.
Works well when m is a prime number.
Disadvantages:
Poor distribution if m is not chosen wisely.
2. Multiplication Method
In the multiplication method, a constant A (0 < A < 1) is used to multiply the key. The fractional part
of the product is then multiplied by m to get the hash value.
h(k) = ⌊m (kA mod 1)⌋
Where ⌊ ⌋ denotes the floor function.
Advantages:
Less sensitive to the choice of m.
Disadvantages:
More complex than the division method.
3. Mid-Square Method
In the mid-square method, the key is squared, and the middle digits of the result are taken as the hash
value.
Steps:
1. Square the key.
2. Extract the middle digits of the squared value.
Advantages:
Produces a good distribution of hash values.
Disadvantages:
May require more computational effort.
4. Folding Method
The folding method involves dividing the key into equal parts, summing the parts, and then taking the
modulo with respect to m.
Steps:
1. Divide the key into parts.
2. Sum the parts.
3. Take the modulo m of the sum.
Advantages:
Simple and easy to implement.
Disadvantages:
Depends on the choice of partitioning scheme.
5. Cryptographic Hash Functions
Cryptographic hash functions are designed to be secure and are used in cryptography. Examples include
MD5, SHA-1, and SHA-256.
Characteristics:
Pre-image resistance.
Second pre-image resistance.
Collision resistance.
Advantages:
High security.
Disadvantages:
Computationally intensive.
6. Universal Hashing
Universal hashing uses a family of hash functions to minimize the chance of collision for any given set
of inputs.
h(k)=((a⋅k+b)modp)modm
Where a and b are randomly chosen constants, p is a prime number greater than m, and k is the key.
Advantages:
Reduces the probability of collisions.
Disadvantages:
Requires more computation and storage.
7. Perfect Hashing
Perfect hashing aims to create a collision-free hash function for a static set of keys. It guarantees that no
two keys will hash to the same value.
Types:
Minimal Perfect Hashing: Ensures that the range of the hash function is equal to the number
of keys.
Non-minimal Perfect Hashing: The range may be larger than the number of keys.
Advantages:
No collisions.
Disadvantages:
Complex to construct.
File structures
encompass various aspects of how data is physically stored, organized, and accessed on storage media.
This includes organizing records into blocks, different types of file organizations, and methods of indexing
and hashing. Here’s a detailed breakdown of each concept:
1. Storage Media
Hard Disk Drives (HDDs): Common storage medium with spinning disks and read/write heads.
Data is stored in tracks and sectors.
Solid State Drives (SSDs): Uses flash memory with no moving parts, offering faster data access
and durability.
Optical Discs: Includes CDs, DVDs, used for data distribution and archival storage.
Magnetic Tapes: Used for backups and archival storage, sequential access only.
Cloud Storage: Data stored on remote servers accessible via the internet.
2. File Organization
Sequential File Organization: Records are stored one after another, in sequence.
Direct or Hashed File Organization: Uses a hash function to compute the address of the data
record, allowing direct access.
Indexed File Organization: Uses an index to keep track of the records, allowing efficient search
operations.
Organization of Records into Blocks
Blocks: Smallest unit of data transfer between storage and memory. In HDDs, a typical block size
is 512 bytes or 4 KB.
Blocking Factor: The number of records per block.
Record Splitting: Dividing a record across multiple blocks.
Fixed-Length vs. Variable-Length Records: Fixed-length records are easier to manage, while
variable-length records are more space-efficient.
Sequential Files
Characteristics: Data is stored in a linear order, typically based on a key field. Sequential files are
efficient for batch processing but not for random access.
Operations: Efficient for sequential access (reading from start to end) but slow for random access
since it might require scanning the entire file.
1. Indexing
Primary Index: An index on the primary key of a file, which uniquely identifies records.
Because the data file is ordered on the key, a primary index may be dense (an index entry for every search key) or sparse (an entry for only the first key of each block).
Secondary Index: An index on non-primary keys, which might not be unique. Secondary indices
can be sparse or dense, and are used to speed up queries that search for records based on non-
primary key attributes.
Clustered Index: Data records are sorted on the index key, enhancing performance for range
queries.
Non-clustered Index: The index is separate from the data records, typically contains pointers to
the actual data locations.
2. Hashing
Static Hashing: Uses a fixed hash table size. Can suffer from overflow problems handled by
overflow chaining or open addressing.
Dynamic Hashing: Adjusts the hash table size dynamically (e.g., extendible hashing, linear
hashing).
Collision Resolution: Techniques like chaining (using linked lists) or open addressing (probing for
the next available slot).
Index Files
Structure: Contains index entries (key-pointer pairs) that map search keys to data record locations.
Types: Dense index (index entry for every search key) and sparse index (index entry for some
search keys).
Comparisons: Indexing vs. Hashing
Indexing
Advantages:
o Efficient for range queries.
o Can handle ordered data.
o Suitable for both equality and range searches.
Disadvantages:
o Requires additional storage for index structures.
o Index maintenance can be complex with frequent updates.
Hashing
Advantages:
o Direct access to records, providing fast search, insert, and delete operations.
o Efficient for equality searches.
Disadvantages:
o Inefficient for range queries.
o Collision handling adds complexity.
o Hash table size management (static vs. dynamic) can be challenging.
Hierarchical Symbol Tables
A program's nested scopes can be represented in a hierarchical data structure of symbol tables.
The global symbol table contains one global variable and two procedure names. A name declared in the
sum_num table is not visible to sum_id and its child tables.
The hierarchy of symbol tables is maintained by the semantic analyzer. To search for a name in the
symbol table, the following algorithm is used:
o First, the name is searched in the current (innermost) symbol table.
o If the name is found, the search is complete; otherwise, the name is searched in the parent's symbol
table, repeating until
o the name is found or the global symbol table has been searched.