Module 5
GRAPHS
Mrs. JIMSHA K MATHEW
Assistant Professor Department of ISE
Acharya Institute Of Technology
TOPICS
Graphs: Definitions, Terminologies, Matrix and Adjacency List Representation Of
Graphs, Elementary Graph operations, Traversal methods: Breadth First Search
and Depth First Search.
Sorting and Searching: Insertion Sort, Radix sort, Address Calculation Sort.
Hashing: Hash Table organizations, Hashing Functions, Static and Dynamic
Hashing.
Files and Their Organization: Data Hierarchy, File Attributes, Text Files and
Binary Files, Basic File Operations, File Organizations and Indexing
In the above graph, the self loops are (1, 1) and (2, 2).
Note: In a complete graph with ‘n’ vertices there will be n(n-1)/2 edges.
Cycle – A cycle is a simple path in which the first and last vertices are the
same.
Example: In the above graph the path 1 2 4 3 1 represents a cycle.
Note: A graph with at least one cycle is called a cyclic graph, and a graph
with no cycles is called an acyclic graph.
To implement a graph, you need to first represent the given information in the form of a graph.
The most commonly used ways of representing a graph are as follows:
Set Representation
Adjacency Matrix (Array-based implementation)
Adjacency List(Linked list representation)
Set Representation
Two sets are maintained: 1) V, the set of vertices
2) E, the set of edges (E ⊆ V × V)
If the graph is weighted, then E is a collection of ordered 3-tuples:
E ⊆ W × V × V, where W is the set of weights.
Advantages
1. Efficient utilization of memory
2. Easy representation
Disadvantages
1. Parallel edges and self loops cannot be represented
2. Graph manipulation is not possible with sets
Array-based implementation- Adjacency Matrix
It is used to represent which nodes are adjacent to each other, i.e., whether
there is an edge connecting any two nodes of the graph.
     v1 v2 v3 v4
v1    0  1  0  0
v2    0  0  1  0
v3    0  0  0  0
v4    1  0  1  0
Undirected graph representation
Directed graph representation
weighted graph representation
Linked-list implementation - Adjacency List
A list is maintained for each vertex v, containing the vertices that are
adjacent from v (its adjacency list).
In this representation, for each vertex in the graph, we maintain the list of its
neighbors.
It means, every vertex of the graph contains list of its adjacent vertices.
We have an array of vertices which is indexed by the vertex number and for each
vertex v, the corresponding array element points to a singly linked list of neighbors
of v.
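As an illustrative sketch (in Python), both representations can be built for the small directed graph whose adjacency matrix was shown above (edges v1→v2, v2→v3, v4→v1, v4→v3), mapping v1..v4 to indices 0..3:

```python
n = 4                                      # vertices v1..v4 as indices 0..3
edges = [(0, 1), (1, 2), (3, 0), (3, 2)]   # v1->v2, v2->v3, v4->v1, v4->v3

# Adjacency matrix: matrix[u][v] is 1 iff there is an edge u -> v.
matrix = [[0] * n for _ in range(n)]
for u, v in edges:
    matrix[u][v] = 1

# Adjacency list: adj[u] holds the vertices adjacent from u.
adj = [[] for _ in range(n)]
for u, v in edges:
    adj[u].append(v)

print(matrix)  # [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 1, 0]]
print(adj)     # [[1], [2], [], [0, 2]]
```

The matrix uses O(n²) space regardless of how many edges exist, while the list uses space proportional to the number of edges.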
Linked representation( Adjacency List)
Linked representation is another space-saving way of representing a graph.
Traversal Methods:
Algorithm: DFS(v)
1. Push the starting vertex, v into the stack.
2. Repeat until the stack becomes empty:
a. Pop a vertex from the stack.
b. Visit the popped vertex.
c. Push all the unvisited vertices adjacent to the popped
vertex into the stack.
Stack is used in the implementation of the depth first search.
DFS (Contd.)
v1
DFS (Contd.)
Pop a vertex, v1 from the stack
Visit v1
Push all unvisited vertices adjacent to v1 into the stack
Visited:
v1
DFS (Contd.)
Pop a vertex, v1 from the stack
Visit v1
Push all unvisited vertices adjacent to v1 into the stack
v2
v4
Visited:
v1
DFS (Contd.)
v4
Visited:
v1 v2
DFS (Contd.)
v6
v3
v4
Visited:
v1 v2
DFS (Contd.)
v3
v4
Visited:
v1 v2 v6
DFS (Contd.)
v4
Visited:
v1 v2 v6 v3
DFS (Contd.)
v5
v4
Visited:
v1 v2 v6 v3
DFS (Contd.)
v4
Visited:
v1 v2 v6 v3 v5
DFS (Contd.)
Visited:
v1 v2 v6 v3 v5 v4
DFS (Contd.)
Visited:
v1 v2 v6 v3 v5 v4
Simple Algorithm – DFS (FORMAL)
Algorithm DFS
1. Push the starting vertex into the stack STACK
2. While STACK is not empty
   1. Pop a vertex V
   2. If V is not in VISIT_LIST
      1. Visit the vertex V
      2. Store V in VISIT_LIST
      3. Push all vertices adjacent to V onto STACK
   3. EndIf
3. EndWhile
4. Stop
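A minimal Python sketch of the stack-based algorithm above; the example graph is hypothetical (not the one from the slides), with vertices numbered from 0:

```python
def dfs(adj, start):
    """Iterative depth-first search using an explicit stack."""
    visited = []                  # VISIT_LIST
    stack = [start]               # 1. push the starting vertex
    while stack:                  # 2. while STACK is not empty
        v = stack.pop()           #    pop a vertex V
        if v not in visited:      #    if V is not in VISIT_LIST
            visited.append(v)     #    visit V and record it
            # push neighbours (reversed so the lowest-numbered is popped first)
            for w in reversed(adj[v]):
                if w not in visited:
                    stack.append(w)
    return visited

# Hypothetical example: 0 -> 1, 2; 1 -> 3; 2 -> 4
graph = [[1, 2], [3], [4], [], []]
print(dfs(graph, 0))  # [0, 1, 3, 2, 4]
```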
DFS (Contd.)
Although the preceding algorithm provides a simple and convenient method to
traverse a graph, the algorithm will not work correctly if the graph is not connected.
BFS
Algorithm: BFS(v)
1. Visit the starting vertex, v and insert it into a queue.
2. Repeat step 3 until the queue becomes empty.
3. Delete the front vertex from the queue, visit all its
unvisited adjacent vertices, and insert them into the queue.
BFS (Contd.)
v1
BFS (Contd.)
Remove a vertex v1 from the queue
Visit v1
Visit all unvisited vertices adjacent to v1 and insert them in the queue
Visited:
v1
BFS (Contd.)
Visit all unvisited vertices adjacent to v1 and insert them in the queue
v2 v4
v1
BFS (Contd.)
Remove a vertex v2 from the queue
Visit v2
Visit all unvisited vertices adjacent to v2 and insert them in the queue
v2 v4
v1 v2
BFS (Contd.)
Visit all unvisited vertices adjacent to v2 and insert them in the queue
v4 v3 v6
v1 v2
BFS (Contd.)
Remove a vertex v4 from the queue
Visit v4
Visit all unvisited vertices adjacent to v4 and insert them in the queue
v4 v3 v6
v1 v2
BFS (Contd.)
Remove a vertex v3 from the queue
Visit v3
Visit all unvisited vertices adjacent to v3 and insert them in the queue
v3 v6 v5
v1 v2 v4
BFS (Contd.)
Remove a vertex v3 from the queue
Visit v3
Visit all unvisited vertices adjacent to v3 and insert them in the queue
v3 v6 v5
v1 v2 v4 v3
BFS (Contd.)
Remove a vertex v6 from the queue
Visit v6
Visit all unvisited vertices adjacent to v6 and insert them in the queue
v6 v5
v1 v2 v4 v3 v6
BFS (Contd.)
v5
v1 v2 v4 v3 v6 v5
BFS (Contd.)
v1 v2 v4 v3 v6 v5
Simple Algorithm - BFS
Algorithm BFS
1. Enqueue the starting vertex into the queue QUEUE
2. While QUEUE is not empty
   1. Dequeue a vertex V
   2. If V is not in VISIT_LIST
      1. Visit the vertex V
      2. Store V in VISIT_LIST
      3. Enqueue all vertices adjacent to V into QUEUE
   3. EndIf
3. EndWhile
4. Stop
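A minimal Python sketch of the queue-based algorithm above, on a hypothetical example graph with vertices numbered from 0:

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first search using a queue."""
    visited = []                  # VISIT_LIST
    queue = deque([start])        # 1. enqueue the starting vertex
    while queue:                  # 2. while QUEUE is not empty
        v = queue.popleft()       #    dequeue a vertex V
        if v not in visited:      #    if V is not in VISIT_LIST
            visited.append(v)     #    visit V and record it
            for w in adj[v]:      #    enqueue all adjacent vertices of V
                if w not in visited:
                    queue.append(w)
    return visited

# Hypothetical example: 0 -> 1, 2; 1 -> 3; 2 -> 4
graph = [[1, 2], [3], [4], [], []]
print(bfs(graph, 0))  # [0, 1, 2, 3, 4]
```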
1. FIND THE DFS & BFS OF THE BELOW GRAPH
Difference Between
BFS &DFS
BFS finds the shortest path to the destination; DFS goes to the bottom of a subtree, then backtracks.
The full form of BFS is Breadth-First Search; the full form of DFS is Depth-First Search.
BFS uses a queue to keep track of the next location to visit; DFS uses a stack.
BFS traverses according to tree level; DFS traverses according to tree depth.
Computer Science: In computer science, graph is used to represent networks of communication, data
organization, computational devices,Google map etc.
Physics and Chemistry: Graph theory is also used to study molecules in chemistry and physics.
Social Science: Graph theory is also widely used in sociology.
Mathematics: In this, graphs are useful in geometry and certain parts of topology such as knot theory.
Biology: Graph theory is useful in biology and conservation efforts.
In Computer science graphs are used to represent the flow of computation.
Google maps uses graphs for building transportation systems, where intersection of
two(or more) roads are considered to be a vertex and the road connecting two vertices
is considered to be an edge, thus their navigation system is based on the algorithm to
calculate the shortest path between two vertices.
In Facebook, users are considered to be the vertices and if they are friends then there is
an edge running between them. Facebook’s Friend suggestion algorithm uses graph
theory. Facebook is an example of undirected graph.
In World Wide Web, web pages are considered to be the vertices. There is an edge from
a page u to other page v if there is a link of page v on page u. This is an example of
Directed graph. It was the basic idea behind Google Page Ranking Algorithm.
In Operating System, we come across the Resource Allocation Graph where each
process and resources are considered to be vertices. Edges are drawn from resources to
the allocated process, or from requesting process to the requested resource. If this leads
to any formation of a cycle then a deadlock will occur.
Graph Traversals
Traversing a graph is the systematic approach of visiting each node of
the graph and doing the required processing (e.g., printing the node's information).
Insertion sort is a simple sorting algorithm that works similar to the way you sort
playing cards in your hands.
The array is virtually split into a sorted and an unsorted part.
Values from the unsorted part are picked and placed at the correct position in the
sorted part.
Insertion Sort
[Figure: array snapshots showing 12 being inserted into the sorted part 6 10 24 36]
Sorting An Array Using Insertion Sort
Let us take an example of Insertion sort using an array.
Now for each pass, we compare the current element to all its previous
elements. So in the first pass, we start with the second element.
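The pass-by-pass process described above can be sketched in Python:

```python
def insertion_sort(arr):
    """In-place insertion sort: grow a sorted prefix one element per pass."""
    for i in range(1, len(arr)):       # first pass starts at the second element
        key = arr[i]                   # current element to place
        j = i - 1
        while j >= 0 and arr[j] > key: # shift larger sorted elements right
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key               # drop the element into its slot
    return arr

print(insertion_sort([12, 6, 10, 24, 36, 3]))  # [3, 6, 10, 12, 24, 36]
```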
•First, we have to find the largest element (suppose max) from the given
array. Suppose 'x' be the number of digits in max. The 'x' is calculated
because we need to go through the significant places of all elements.
•After that, go through one by one each significant place. Here, we have to
use any stable sorting algorithm to sort the digits of each significant place.
radixSort(arr)
1. max = largest element in the given array
2. d = number of digits in max
3. Create 10 buckets (0 - 9)
4. for i = 1 to d
5.    sort the array elements using counting sort (or any stable sort) according to the digits at the ith place
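A runnable Python sketch of the pseudocode above, using counting sort as the stable per-digit sort (assumes non-negative integers):

```python
def counting_sort_by_digit(arr, place):
    """Stable counting sort on the digit at the given place value."""
    count = [0] * 10
    for x in arr:
        count[(x // place) % 10] += 1
    for d in range(1, 10):                 # prefix sums give end positions
        count[d] += count[d - 1]
    out = [0] * len(arr)
    for x in reversed(arr):                # reversed traversal keeps it stable
        digit = (x // place) % 10
        count[digit] -= 1
        out[count[digit]] = x
    return out

def radix_sort(arr):
    if not arr:
        return arr
    place = 1
    while max(arr) // place > 0:           # one pass per digit of the maximum
        arr = counting_sort_by_digit(arr, place)
        place *= 10
    return arr

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```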
Complexity of the Radix Sort Algorithm
Worst Case: O(d(n+b))
Best Case: O(d(n+b))
Average Case: O(d(n+b))
where d is the number of digits in the largest element and b is the base (10 here); when d is a constant, this is linear, O(n).
ADDRESS CALCULATION SORT
• In this sorting algorithm, a hash function f with the Order Preserving property is used,
which states that if x > y then f(x) > f(y).
This algorithm uses an address table to store the values which is simply a
list of Linked lists.
The Hash function is applied to each value in the array to find its
corresponding address in the address table.
Then the values are inserted at their corresponding addresses in a sorted
manner by comparing them to the values which are already present at that
address.
Examples:
Input : arr = [29, 23, 14, 5, 15, 10, 3, 18, 1]
Output: After inserting all the values in the address table, the address table looks like this:
ADDRESS 0: 1 --> 3
ADDRESS 1: 5
ADDRESS 2:
ADDRESS 3: 10
ADDRESS 4: 14 --> 15
ADDRESS 5: 18
ADDRESS 6:
ADDRESS 7: 23
ADDRESS 8:
ADDRESS 9: 29
ALGORITHM-ADDRESS CALCULATION SORT
Step 1: Read the elements from the array.
Step 2: Create an array of linked lists with indices from 0 to 9.
Step 3: Divide each element in the given array by 10 to get the index of the
linked list where the element needs to be stored, and store the element using
the insert_order function of the linked list.
Example: F(91) = 91/10 = 9
F(28) = 28/10 = 2
Step 4: Copy the elements from the linked lists in order from 0 to 9 back into the array.
Step 5: Display the array elements.
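A Python sketch of the steps above, following the algorithm's hash f(x) = x // 10 (which assumes keys in the range 0–99; the earlier worked example evidently used a different, unspecified hash function). Plain Python lists stand in for the linked lists:

```python
def address_calculation_sort(arr):
    """Order-preserving hash f(x) = x // 10 sends each value to a bucket;
    buckets are kept sorted on insertion, then concatenated in order."""
    table = [[] for _ in range(10)]        # address table: lists 0..9
    for x in arr:
        bucket = table[x // 10]            # e.g. f(91) = 9, f(28) = 2
        i = 0
        while i < len(bucket) and bucket[i] < x:
            i += 1
        bucket.insert(i, x)                # keep the bucket sorted
    result = []
    for bucket in table:                   # copy buckets 0..9 back out
        result.extend(bucket)
    return result

print(address_calculation_sort([29, 23, 14, 5, 15, 10, 3, 18, 1]))
```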
Files and Their Organization:
A file is an external collection of related data treated as a unit.
Files are stored in auxiliary/secondary storage devices.
• Disk
• Tapes
A file is a collection of data records with each record consisting
of one or more fields.
■ Hashing techniques –
map a large population of possible keys into a small address
space.
■ For example, extract digits 1, 3 and 4:
6-digit employee number → 3-digit address
◻ 125870 → 158
◻ 122801 → 128
◻ 121267 → 112
◻ …
◻ 123413 → 134
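The mappings above are consistent with a digit-extraction hash that keeps digits 1, 3 and 4 of the key; a small Python sketch (the function name and parameter are illustrative):

```python
def digit_extraction(key, positions=(1, 3, 4)):
    """Keep the digits at the given 1-based positions of the key."""
    digits = str(key)
    return int("".join(digits[p - 1] for p in positions))

print(digit_extraction(125870))  # 158
print(digit_extraction(122801))  # 128
print(digit_extraction(121267))  # 112
print(digit_extraction(123413))  # 134
```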
■Binary files –
◻ A collection of data stored in the internal format of the
computer.
◻ Contain data that are meaningful
only if they are properly interpreted by a program.
Folding: the key is divided into parts, where all parts except for the last one
have the same length. Add the parts together to obtain the hash address.
Eg: suppose the size of each part is 2.
k: 1522756  → chopping: 01 52 27 56 → pure folding: 01+52+27+56 = 136
k: 5499025  → chopping: 05 49 90 25 → pure folding: 05+49+90+25 = 169
k: 11943936 → chopping: 11 94 39 36 → pure folding: 11+94+39+36 = 180
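A Python sketch of pure (shift) folding with two-digit parts; the leading zero-pad mirrors the chopping shown above (e.g. 1522756 → 01 52 27 56):

```python
def fold_shift(key, part_size=2):
    """Chop the key into part_size-digit pieces and add them together."""
    s = str(key)
    rem = len(s) % part_size
    if rem:                        # left-pad so every part has part_size digits
        s = "0" * (part_size - rem) + s
    parts = [int(s[i:i + part_size]) for i in range(0, len(s), part_size)]
    return sum(parts)

print(fold_shift(1522756))   # 01+52+27+56 = 136
print(fold_shift(5499025))   # 05+49+90+25 = 169
print(fold_shift(11943936))  # 11+94+39+36 = 180
```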
Collision Resolution
• If, when an element is inserted, it hashes to the
same value as an already inserted element, then
we have a collision and need to resolve it.
• There are several methods for dealing with this:
– Separate chaining
– Open addressing
• Linear Probing
• Quadratic Probing
• Double Hashing
In linear probing, collisions are resolved by
sequentially scanning an array until an
empty cell is found.
Suppose there is a hash table of size h, a key value k, and the
calculated address i. If the ith location is free, then k can be
stored in it.
If the location already has some data value
stored in it, then other slots are examined
systematically in the forward direction to find
a free slot.
i.e., the locations i+1, i+2, …, h-1, 0, 1, …, i-1 will be
the sequence of locations probed.
Here the hash table is considered as circular, so
that when the last location is reached, the
search proceeds to the first location in the table;
hence the name closed hashing.
Example:
Insert items with keys 89, 18, 49, 58, 9 into an
empty hash table.
Table size is 10.
Hash function is hash(x) = x mod 10.
Figure 20.4: Linear probing hash table after each insertion
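A Python sketch of the insertions above (keys 89, 18, 49, 58, 9; table size 10; hash(x) = x mod 10). This minimal version assumes the table never fills up:

```python
def linear_probe_insert(table, key):
    """Insert key with hash(x) = x mod len(table), scanning forward
    (circularly) from the home slot until a free cell is found."""
    size = len(table)
    i = key % size
    while table[i] is not None:        # probe i, i+1, ... wrapping around
        i = (i + 1) % size
    table[i] = key

table = [None] * 10
for k in [89, 18, 49, 58, 9]:
    linear_probe_insert(table, k)
print(table)  # [49, 58, 9, None, None, None, None, None, 18, 89]
```

Note how 58 and 9 both wrap past the end of the table before finding a free cell, illustrating the circular probing.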
Problem 1:Apply the hash function h(x) = x mod 7 for linear probing on the data 2341, 4234,
2839, 430, 22, 397, 3920 and show the resulting hash table
A hash table contains 7 buckets and uses linear probing to solve collision. The key
values are integers and the hash function used is key Mod 7. Draw the table that
results after inserting in the given order the following values: 16,8,4,13,29,11,22.
16,8,4,13,29,11,22.
16 MOD 7 = 2
8 MOD 7 = 1
4 MOD 7 = 4
13 MOD 7 = 6
29 MOD 7 = 1 // collision, placed at slot 3
11 MOD 7 = 4 // collision, placed at slot 5
22 MOD 7 = 1 // collision, wraps around to slot 0

Resulting table:
0: 22
1: 8
2: 16
3: 29
4: 4
5: 11
6: 13
(Drawbacks of linear probing)
One of the problems with linear probing is
clustering: many consecutive elements form groups,
and it then starts taking longer to find a free slot or to
search for an element.
As long as table is big enough, a free cell can always
be found, but the time to do so can get quite large.
Worse, even if the table is relatively empty, blocks of
occupied cells start forming.
This effect is known as primary clustering.
Any key that hashes into the cluster will require
several attempts to resolve the collision, and then it
will add to the cluster.
Double Hashing.
Quadratic Probing.
Double Hashing
Here a second hash function H2 is used for resolving collision
The second hash function should be selected in such a way that
the hash addresses generated by the two hash functions are
distinct.
The second hash function generates a value m for the key k,
which is used for the pseudo random number generator as in the
case of random probing.
Here m and h are relatively prime.
Double hashing can be done using :
(hash1(key) + i * hash2(key)) % TABLE_SIZE
Here hash1() and hash2() are hash functions and TABLE_SIZE
is size of hash table.
A popular second hash function is: Hash2(key) = R - (key % R),
where R is a prime number smaller than the size of the table.
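A Python sketch of double hashing with the division method for hash1 and Hash2(key) = R - (key % R); R = 7 and table size 10 are illustrative choices, and the table is assumed never to fill up:

```python
def double_hash_insert(table, key, R=7):
    """Probe sequence (hash1(key) + i*hash2(key)) % TABLE_SIZE with
    hash1(key) = key % TABLE_SIZE and hash2(key) = R - (key % R)."""
    size = len(table)
    h1 = key % size
    h2 = R - (key % R)                 # never zero, so probing always advances
    i = 0
    while table[(h1 + i * h2) % size] is not None:
        i += 1
    table[(h1 + i * h2) % size] = key

table = [None] * 10
for k in [89, 18, 49, 58, 9]:
    double_hash_insert(table, k)
print(table)
```

Unlike linear probing, colliding keys take different-sized steps, which breaks up clusters.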
A hash table contains 10 buckets and uses double hashing to solve collision. The key
values are integers and the hash function used is division method. Draw the table
that results after inserting in the given order the following values:
16,8,24,13,29,14,33.
Quadratic probing
Instead of probing the next locations i+1,
i+2, i+3, … as in the case of linear probing, here
the next locations to be probed are
i+1², i+2², i+3², ….
Mathematically, if M is the size of the hash
table and H(k) is the hash function, then
quadratic probing searches the locations:
(H(k) + i²) MOD M
for i = 1, 2, 3, …
M = table size
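A Python sketch of quadratic probing with the division method as H(k); the keys are illustrative, and this minimal version simply gives up after M probes:

```python
def quadratic_probe_insert(table, key):
    """Probe slots (key + i*i) % M for i = 0, 1, 2, ... until one is free."""
    M = len(table)
    for i in range(M):
        slot = (key + i * i) % M       # home slot, then +1, +4, +9, ...
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found within M probes")

table = [None] * 10
for k in [89, 18, 49, 58, 9]:
    quadratic_probe_insert(table, k)
print(table)  # [49, None, 58, 9, None, None, None, None, 18, 89]
```

Quadratic probing can fail to find a free slot even when the table is not completely full, which is why the sketch bounds the number of probes.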
The idea is to keep a list of all elements that hash
to the same value.
The array elements are pointers to the
first nodes of the lists.
A new item is inserted to the end of the list
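A sketch of separate chaining in Python, with each slot holding an ordinary list as its chain (new items appended at the end, as described above); the class name is illustrative:

```python
class ChainedHashTable:
    """Each slot holds a list acting as the chain of all keys that
    hash to that slot; new keys are appended to the end of the chain."""
    def __init__(self, size=10):
        self.slots = [[] for _ in range(size)]

    def insert(self, key):
        self.slots[key % len(self.slots)].append(key)

    def contains(self, key):
        # only the one chain for this key's slot needs to be searched
        return key in self.slots[key % len(self.slots)]

t = ChainedHashTable()
for k in [89, 18, 49, 58, 9]:
    t.insert(k)
print(t.slots[9])  # [89, 49, 9] -- all three keys hash to slot 9
```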
This is implemented via a linear search for an empty slot, from the point of collision. If the
physical end of the table is reached during the linear search, the search will wrap around to the
beginning of the table and continue from there. If an empty slot is not found before reaching
the point of collision, the table is full.
Instead of using a constant “skip” value, if the first hash value that has resulted in collision is h,
the successive values which are probed are h+1, h+4, h+9, h+16, and so on.
In other words, quadratic probing uses a skip consisting of successive perfect squares.
A popular second hash function is: Hash2(key) = R - ( key % R ), where R is a prime number
that is smaller than the size of the table. But any independent hash function may also be used.