
16IT4202 – DAA UNIT 3 NOTES

Greedy approach: Prim's Algorithm – Kruskal's Algorithm – Dijkstra's Algorithm –
Huffman Trees and Codes
Greedy approach

The greedy approach suggests constructing a solution through a sequence of steps, each
expanding a partially constructed solution obtained so far, until a complete solution to the
problem is reached. On each step—and this is the central point of this technique—the choice
made must be:
• feasible, i.e., it has to satisfy the problem's constraints
• locally optimal, i.e., it has to be the best local choice among all feasible choices available on that step
• irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm

Prim’s Algorithm
A spanning tree of an undirected connected graph is its connected acyclic subgraph (i.e.,
a tree) that contains all the vertices of the graph.

If such a graph has weights assigned to its edges, a minimum spanning tree is its
spanning tree of the smallest weight, where the weight of a tree is defined as the sum of the
weights on all its edges.

The minimum cost spanning tree problem is the problem of finding a minimum spanning tree
for a given weighted connected graph.

Prim's algorithm constructs a minimum spanning tree through a sequence of expanding
subtrees. The initial subtree in such a sequence consists of a single vertex selected arbitrarily
from the set V of the graph's vertices. On each iteration, the algorithm expands the current tree
in the greedy manner by simply attaching to it the nearest vertex not in that tree. The algorithm
stops after all the graph's vertices have been included in the tree being constructed. The total
number of such iterations is n − 1, where n is the number of vertices in the graph.


ALGORITHM Prim(G)
//Prim's algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = <V, E>
//Output: ET, the set of edges composing a minimum spanning tree of G
VT = {v0}    //the set of tree vertices can be initialized with any vertex
ET = ∅
for i = 1 to |V| − 1 do
    find a minimum-weight edge e* = (v*, u*) among all the edges (v, u)
        such that v is in VT and u is in V − VT
    VT = VT ∪ {u*}
    ET = ET ∪ {e*}
return ET

Efficiency of Prim’s Algorithm

If a graph is represented by its weight matrix and the priority queue is implemented as an
unordered array, the algorithm's running time will be in Θ(|V|²).

If a graph is represented by its adjacency lists and the priority queue is implemented
as a min-heap, the running time of the algorithm is in O(|E| log |V|).
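To make the min-heap version concrete, here is a small Python sketch (not part of the original notes). It stores heap entries as (weight, tree-vertex, fringe-vertex) triples and skips stale entries; the six-vertex sample graph is an assumed example.

```python
import heapq

def prim_mst(graph, start):
    """Prim's algorithm with a min-heap priority queue, O(|E| log |V|).
    graph: dict mapping each vertex to a list of (neighbor, weight) pairs."""
    visited = {start}
    # Heap entries are (weight, tree_vertex, fringe_vertex).
    heap = [(w, start, u) for u, w in graph[start]]
    heapq.heapify(heap)
    mst_edges = []
    while heap and len(visited) < len(graph):
        w, v, u = heapq.heappop(heap)
        if u in visited:          # stale entry: u was already attached
            continue
        visited.add(u)            # attach the nearest fringe vertex
        mst_edges.append((v, u, w))
        for x, wx in graph[u]:
            if x not in visited:
                heapq.heappush(heap, (wx, u, x))
    return mst_edges

# Assumed example: a small weighted graph given by adjacency lists.
g = {
    'a': [('b', 3), ('e', 6), ('f', 5)],
    'b': [('a', 3), ('c', 1), ('f', 4)],
    'c': [('b', 1), ('d', 6), ('f', 4)],
    'd': [('c', 6), ('e', 8), ('f', 5)],
    'e': [('a', 6), ('d', 8), ('f', 2)],
    'f': [('a', 5), ('b', 4), ('c', 4), ('d', 5), ('e', 2)],
}
edges = prim_mst(g, 'a')
print(sum(w for _, _, w in edges))  # total MST weight: 15
```

The "stale entry" check plays the role of the Decrease operation: instead of updating a vertex's priority in place, a cheaper copy is pushed and outdated copies are discarded when popped.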


KRUSKAL’S ALGORITHM

Kruskal's algorithm looks at a minimum spanning tree of a weighted connected graph
G = <V, E> as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is
the smallest. The algorithm begins by sorting the graph's edges in nondecreasing order of their
weights. Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on
the list to the current subgraph if such an inclusion does not create a cycle and simply skipping
the edge otherwise.

ALGORITHM Kruskal(G)
//Kruskal's algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = <V, E>
//Output: ET, the set of edges composing a minimum spanning tree of G
sort E in nondecreasing order of the edge weights w(ei1) ≤ . . . ≤ w(ei|E|)
ET = ∅; ecounter = 0    //initialize the set of tree edges and its size
k = 0                   //initialize the number of processed edges
while ecounter < |V| − 1 do
    k = k + 1
    if ET ∪ {eik} is acyclic
        ET = ET ∪ {eik}; ecounter = ecounter + 1
return ET

Efficiency of Kruskal’s Algorithm


With an efficient sorting algorithm, the time efficiency of Kruskal’s algorithm will be in
O(|E| log |E|).
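The pseudocode does not spell out how the acyclicity check is done; a standard choice is a union-find (disjoint-set) structure, so that an edge creates a cycle exactly when both endpoints are already in the same component. A minimal Python sketch with an assumed edge list:

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm with union-find for cycle detection.
    n: number of vertices (labeled 0..n-1); edges: list of (weight, u, v)."""
    parent = list(range(n))

    def find(x):                      # find component root, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):     # scan edges in nondecreasing weight order
        ru, rv = find(u), find(v)
        if ru != rv:                  # endpoints in different components: no cycle
            parent[ru] = rv           # union the two components
            mst.append((u, v, w))
            if len(mst) == n - 1:     # tree complete, stop early
                break
    return mst

# Assumed example: 6 vertices, edges given as (weight, u, v).
edge_list = [(3, 0, 1), (1, 1, 2), (4, 1, 5), (4, 2, 5), (6, 2, 3),
             (5, 3, 5), (8, 3, 4), (6, 0, 4), (2, 4, 5), (5, 0, 5)]
mst = kruskal_mst(6, edge_list)
print(sum(w for _, _, w in mst))  # total MST weight: 15
```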


Dijkstra’s Algorithm

The single-source shortest-paths problem is defined as follows: for a given vertex, called the
source, in a weighted connected graph, find shortest paths to all its other vertices.

Applications
The most widely used applications are transportation planning and packet routing in
communication networks, including the Internet. Multitudes of less obvious applications include
finding shortest paths in social networks, speech recognition, document formatting, robotics,
compilers, and airline crew scheduling. In the world of entertainment, one can mention
pathfinding in video games and finding best solutions to puzzles using their state-space graphs.

The best-known algorithm for the single-source shortest-paths problem is Dijkstra's algorithm.
It is applicable to undirected and directed graphs with nonnegative weights only.
Dijkstra's algorithm finds the shortest paths to a graph's vertices in order of their distance from
a given source. First, it finds the shortest path from the source to a vertex nearest to it, then to
a second nearest, and so on. In general, before its ith iteration commences, the algorithm has
already identified the shortest paths to i − 1 other vertices nearest to the source.

On each iteration, the algorithm moves the nearest fringe vertex u* to the set of tree vertices.
For each remaining fringe vertex u that is connected to u* by an edge of weight w(u*, u) such
that du* + w(u*, u) < du, it updates the labels of u by u* and du* + w(u*, u), respectively.
The algorithm maintains two vertex collections: the set VT of vertices for which a shortest path
has already been found and the priority queue Q of the fringe vertices.


ALGORITHM Dijkstra(G, s)
//Dijkstra's algorithm for single-source shortest paths
//Input: A weighted connected graph G = <V, E> with nonnegative weights and its vertex s
//Output: The length dv of a shortest path from s to v and its penultimate vertex pv
//        for every vertex v in V
Initialize(Q)               //initialize priority queue to empty
for every vertex v in V
    dv = ∞; pv = null
    Insert(Q, v, dv)        //initialize vertex priority in the priority queue
ds = 0; Decrease(Q, s, ds)  //update priority of s with ds
VT = ∅
for i = 0 to |V| − 1 do
    u* = DeleteMin(Q)       //delete the minimum priority element
    VT = VT ∪ {u*}
    for every vertex u in V − VT that is adjacent to u* do
        if du* + w(u*, u) < du
            du = du* + w(u*, u); pu = u*
            Decrease(Q, u, du)

The time efficiency of Dijkstra's algorithm depends on the data structures used for
implementing the priority queue and for representing the input graph. It is in Θ(|V|²) for graphs
represented by their weight matrix with the priority queue implemented as an unordered array.
For graphs represented by their adjacency lists with the priority queue implemented as a
min-heap, it is in O(|E| log |V|).
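A Python sketch of the min-heap version (not from the notes; the five-vertex graph is an assumed example). As in the Prim sketch, stale heap entries stand in for the Decrease operation:

```python
import heapq

def dijkstra(graph, s):
    """Dijkstra's algorithm with a binary min-heap: O(|E| log |V|).
    graph: dict vertex -> list of (neighbor, nonnegative weight)."""
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}      # penultimate vertex on shortest path
    dist[s] = 0
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:                  # stale queue entry, skip
            continue
        for v, w in graph[u]:
            if d + w < dist[v]:          # relax edge (u, v)
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev

# Assumed example graph (undirected, nonnegative weights).
g = {
    'a': [('b', 3), ('d', 7)],
    'b': [('a', 3), ('c', 4), ('d', 2)],
    'c': [('b', 4), ('d', 5), ('e', 6)],
    'd': [('a', 7), ('b', 2), ('c', 5), ('e', 4)],
    'e': [('c', 6), ('d', 4)],
}
dist, prev = dijkstra(g, 'a')
print(dist)  # {'a': 0, 'b': 3, 'c': 7, 'd': 5, 'e': 9}
```

The prev table lets a shortest path be reconstructed by walking back from any vertex to the source.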


Huffman Trees and Codes


• Huffman Coding, also called Huffman Encoding, is a famous greedy algorithm that is used
for the lossless compression of data.
• It uses variable-length encoding, where variable-length codes are assigned to the characters
depending on how frequently they occur in the given text.
• The character which occurs most frequently gets the smallest code and the character which
occurs least frequently gets the largest code.
Prefix Rule
To prevent ambiguities while decoding, Huffman coding implements a rule known as the prefix
rule, which ensures that the code assigned to any character is not a prefix of the code assigned
to any other character.
Huffman’s algorithm
Step-01:
Create a leaf node for each of the given characters, containing its occurrence frequency.
Step-02:
Arrange all the nodes in increasing order of the frequency value contained in the nodes.
Step-03:
Considering the two nodes having minimum frequency, create a new internal node having
frequency equal to the sum of the two nodes' frequencies, and make the first node the left child
and the other node the right child of the newly created node.
Step-04:
Keep repeating Step-02 and Step-03 until all the nodes form a single tree.

A tree constructed by the above algorithm is called a Huffman tree. It defines—in the
manner described above—a Huffman code.
It finds the optimal coding in O(n log n) time.
EXAMPLE
Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence frequencies in
a text made up of these symbols:
symbol     A     B     C     D     _
frequency  0.35  0.1   0.2   0.2   0.15
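The construction above can be sketched in Python for this five-symbol alphabet. The code uses a min-heap of partial trees with a tie-breaking counter (an implementation detail, since heap entries must be comparable); for these frequencies the resulting average code length is 2.25 bits per symbol.

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Build a Huffman tree bottom-up and return the code table.
    freq: dict symbol -> frequency."""
    tie = count()
    # Heap entries: (frequency, tiebreak, tree), where a tree is either
    # a symbol (leaf) or a (left, right) pair (internal node).
    heap = [(f, next(tie), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # two smallest-frequency trees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))
    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):       # internal node: 0 left, 1 right
            walk(tree[0], prefix + '0')
            walk(tree[1], prefix + '1')
        else:
            codes[tree] = prefix or '0'   # single-symbol alphabet edge case
    walk(heap[0][2], '')
    return codes

freq = {'A': 0.35, 'B': 0.1, 'C': 0.2, 'D': 0.2, '_': 0.15}
codes = huffman_codes(freq)
avg = sum(freq[s] * len(codes[s]) for s in freq)
print(avg)  # average code length: 2.25 bits per symbol
```

The exact bit strings depend on how ties are broken, but every optimal prefix code for these frequencies has the same code lengths (2 bits for A, C, D and 3 bits for B and _).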


Dynamic programming: Knapsack Problem and Memory functions – Optimal Binary
Search Trees – Warshall's and Floyd's Algorithms

Knapsack Problem and Memory functions:


Given n items of known weights w1, . . . , wn and values v1, . . . , vn and a knapsack of
capacity W, find the most valuable subset of the items that fit into the knapsack.
To design a dynamic programming algorithm, we need to derive a recurrence relation that
expresses a solution to an instance of the knapsack problem in terms of solutions to its smaller
subinstances.
Let us consider an instance defined by the first i items, 1≤ i ≤ n, with weights w1, . . . , wi,
values v1, . . . , vi , and knapsack capacity j, 1 ≤ j ≤ W.
Let F(i, j) be the value of an optimal solution to this instance, i.e., the value of the most valuable
subset of the first i items that fit into the knapsack of capacity j.
We can divide all the subsets of the first i items that fit the knapsack of capacity j into two
categories: those that do not include the ith item and those that do.

1. Among the subsets that do not include the ith item, the value of an optimal subset is,
F(i − 1, j).
2. Among the subsets that do include the ith item (hence, j – wi ≥ 0), an optimal subset is made
up of this item and an optimal subset of the first i − 1 items that fits into the knapsack of
capacity j − wi . The value of such an optimal subset is vi + F(i − 1, j − wi).

Thus, the value of an optimal solution among all feasible subsets of the first i items is the
maximum of these two values. Of course, if the ith item does not fit into the knapsack, the value
of an optimal subset selected from the first i items is the same as the value of an optimal subset
selected from the first i − 1 items. These observations lead to the following recurrence:
F(i, j) = max{F(i − 1, j), vi + F(i − 1, j − wi)}   if j − wi ≥ 0
F(i, j) = F(i − 1, j)                               if j − wi < 0
It is convenient to define the initial conditions as follows:
F(0, j) = 0 for j ≥ 0 and F(i, 0) = 0 for i ≥ 0.
Our goal is to find F(n, W), the maximal value of a subset of the n given items that fit into the
knapsack of capacity W, and an optimal subset itself.
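The recurrence can be tabulated bottom-up. The Python sketch below fills the table F row by row and then backtracks through it to recover an optimal subset; the data is the W = 5 instance that also appears in the question bank at the end of this unit.

```python
def knapsack(weights, values, W):
    """Bottom-up DP for the 0/1 knapsack, implementing the recurrence
    F(i, j) = max(F(i-1, j), v_i + F(i-1, j - w_i))."""
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]   # row 0 / column 0 are 0's
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if weights[i - 1] <= j:             # item i fits: take max of two cases
                F[i][j] = max(F[i - 1][j],
                              values[i - 1] + F[i - 1][j - weights[i - 1]])
            else:                               # item i does not fit
                F[i][j] = F[i - 1][j]
    # Backtrack to recover an optimal subset of item numbers (1-based).
    subset, j = [], W
    for i in range(n, 0, -1):
        if F[i][j] != F[i - 1][j]:              # item i was taken
            subset.append(i)
            j -= weights[i - 1]
    return F[n][W], sorted(subset)

# Instance: W = 5, items (weight, value) = (2,12), (1,10), (3,20), (2,15).
value, subset = knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5)
print(value, subset)  # 37 [1, 2, 4]
```

The table takes Θ(nW) time and space; the backtracking pass adds only Θ(n) time.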


Memory functions

Dynamic programming deals with problems whose solutions satisfy a recurrence relation with
overlapping subproblems. The direct top-down approach to finding a solution to such a
recurrence leads to an algorithm that solves common subproblems more than once and hence is
very inefficient. The classic dynamic programming approach, on the other hand, works bottom
up: it fills a table with solutions to all smaller subproblems, but each of them is solved only once.
An unsatisfying aspect of this approach is that solutions to some of these smaller subproblems
are often not necessary for getting a solution to the given problem. Since this drawback is not
present in the top-down approach, the goal is to get a method that solves only the subproblems
that are necessary and does so only once. Such a method exists; it is based on using memory
functions.

This method solves a given problem in the top-down manner but, in addition, maintains a table
of the kind that would have been used by a bottom-up dynamic programming algorithm. Initially,
all the table’s entries are initialized with a special “null” symbol to indicate that they have not yet
been calculated. Thereafter, whenever a new value needs to be calculated, the method checks the
corresponding entry in the table first: if this entry is not “null,” it is simply retrieved from the
table; otherwise, it is computed by the recursive call whose result is then recorded in the table.

The following algorithm implements this idea for the knapsack problem. After initializing the
table, the recursive function needs to be called with i = n (the number of items) and j = W (the
knapsack capacity).


ALGORITHM MFKnapsack(i, j)
//Implements the memory function method for the knapsack problem
//Input: A nonnegative integer i indicating the number of the first items being considered
//       and a nonnegative integer j indicating the knapsack capacity
//Output: The value of an optimal feasible subset of the first i items
//Note: Uses as global variables input arrays Weights[1..n], Values[1..n], and table F[0..n, 0..W]
//      whose entries are initialized with −1's except for row 0 and column 0 initialized with 0's
if F[i, j] < 0
    if j < Weights[i]
        value = MFKnapsack(i − 1, j)
    else
        value = max(MFKnapsack(i − 1, j),
                    Values[i] + MFKnapsack(i − 1, j − Weights[i]))
    F[i, j] = value
return F[i, j]
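The pseudocode translates almost line for line into Python. In the sketch below the table and the item arrays are wrapped in an enclosing function rather than kept as globals (a small deviation from the pseudocode's setup); −1 marks entries not yet computed.

```python
def mf_knapsack(weights, values, W):
    """Memory-function (top-down, memoized) knapsack, following the
    MFKnapsack pseudocode above."""
    n = len(weights)
    F = [[-1] * (W + 1) for _ in range(n + 1)]  # -1 means "not yet computed"
    for j in range(W + 1):                      # row 0 holds 0's
        F[0][j] = 0
    for i in range(n + 1):                      # column 0 holds 0's
        F[i][0] = 0

    def mf(i, j):
        if F[i][j] < 0:                         # compute the entry only once
            if j < weights[i - 1]:              # item i does not fit
                F[i][j] = mf(i - 1, j)
            else:
                F[i][j] = max(mf(i - 1, j),
                              values[i - 1] + mf(i - 1, j - weights[i - 1]))
        return F[i][j]

    return mf(n, W)

# Same W = 5 instance as before: the memoized version gives the same value.
print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))  # 37
```

Unlike the bottom-up version, only the table entries actually reached by the recursion are filled in.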


Warshall’s Algorithm

The transitive closure of a directed graph with n vertices can be defined as the n × n boolean
matrix T = {tij}, in which the element in the ith row and the jth column is 1 if there exists a
nontrivial path (i.e., a directed path of positive length) from the ith vertex to the jth vertex;
otherwise, tij is 0.

Warshall's algorithm constructs the transitive closure through a series of n × n boolean
matrices R(0), . . . , R(k−1), R(k), . . . , R(n). Each of these matrices provides certain information
about directed paths in the digraph. Specifically, the element r(k)ij in the ith row and jth column
of matrix R(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is equal to 1 if and only if there exists a
directed path of positive length from the ith vertex to the jth vertex with each intermediate
vertex, if any, numbered not higher than k.

R(1) contains the information about paths that can use the first vertex as intermediate. In
general, each subsequent matrix in the series has one more vertex to use as intermediate for its
paths than its predecessor and hence may, but does not have to, contain more 1's. The last
matrix in the series, R(n), reflects paths that can use all n vertices of the digraph as
intermediate and hence is nothing other than the digraph's transitive closure.
Let r(k)ij, the element in the ith row and jth column of matrix R(k), be equal to 1. This means
that there exists a path from the ith vertex vi to the jth vertex vj with each intermediate vertex
numbered not higher than k: vi, a list of intermediate vertices each numbered not higher than k,
vj.

Rule for changing zeros in Warshall's algorithm

The formula for generating the elements of matrix R(k) from the elements of matrix R(k−1) is

    r(k)ij = r(k−1)ij or (r(k−1)ik and r(k−1)kj)

This formula implies the following rule for generating elements of matrix R(k) from elements of
matrix R(k−1), which is particularly convenient for applying Warshall's algorithm by hand:
If an element rij is 1 in R(k−1), it remains 1 in R(k).


If an element rij is 0 in R(k−1), it has to be changed to 1 in R(k) if and only if the element in its row i and
column k and the element in its column j and row k are both 1’s in R(k−1).
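The rule above can be sketched in Python as a triple loop over k, i, j. The four-vertex digraph used here is an assumed example, not one from the notes.

```python
def warshall(adj):
    """Warshall's algorithm for the transitive closure.
    adj: n x n boolean (0/1) adjacency matrix; returns R(n)."""
    n = len(adj)
    R = [row[:] for row in adj]           # R(0) is the adjacency matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # r_ij stays 1, or becomes 1 if r_ik and r_kj are both 1
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return R

# Assumed digraph on vertices a, b, c, d with edges a->b, b->d, d->a, d->c.
A = [[0, 1, 0, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0],
     [1, 0, 1, 0]]
for row in warshall(A):
    print(row)
# c (row 3) has no outgoing edges, so its row stays all 0's;
# every other vertex lies on the cycle a->b->d->a and reaches all four vertices.
```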


Floyd’s Algorithm for the All-Pairs Shortest-Paths Problem

Given a weighted connected graph (undirected or directed), the all-pairs shortest-paths problem
asks to find the distances—i.e., the lengths of the shortest paths—from each vertex to all other
vertices.

The distance matrix D is an n × n matrix in which the element dij in the ith row and the jth
column indicates the length of the shortest path from the ith vertex to the jth vertex.

Floyd's algorithm computes the distance matrix of a weighted graph with n vertices through a
series of n × n matrices: D(0), . . . , D(k−1), D(k), . . . , D(n). The element d(k)ij in the ith row
and the jth column of matrix D(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is equal to the length of
the shortest path among all paths from the ith vertex to the jth vertex with each intermediate
vertex, if any, numbered not higher than k. Hence, D(0) is simply the weight matrix of the
graph. The last matrix in the series, D(n), contains the lengths of the shortest paths among all
paths that can use all n vertices as intermediate.

The recurrence relation can be stated as

    d(k)ij = min{d(k−1)ij, d(k−1)ik + d(k−1)kj} for k ≥ 1,   d(0)ij = wij
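The recurrence leads to the same triple-loop shape as Warshall's algorithm, with min/+ in place of or/and. A Python sketch with an assumed four-vertex weight matrix (∞ marks a missing edge):

```python
INF = float('inf')

def floyd(W):
    """Floyd's algorithm for all-pairs shortest paths.
    W: weight matrix with INF for missing edges; returns the distance matrix."""
    n = len(W)
    D = [row[:] for row in W]           # D(0) is the weight matrix
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # d_ij(k) = min(d_ij(k-1), d_ik(k-1) + d_kj(k-1))
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D

# Assumed digraph: a->c = 3, b->a = 2, c->b = 7, c->d = 1, d->a = 6.
Wm = [[0,   INF, 3,   INF],
      [2,   0,   INF, INF],
      [INF, 7,   0,   1],
      [6,   INF, INF, 0]]
for row in floyd(Wm):
    print(row)
# rows: [0, 10, 3, 4], [2, 0, 5, 6], [7, 7, 0, 1], [6, 16, 9, 0]
```

Like Warshall's algorithm, this runs in Θ(n³) time and, done in place as above, Θ(n²) space.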


Optimal Binary Search Trees

ALGORITHM OptimalBST(P[1..n])
//Finds an optimal binary search tree by dynamic programming
//Input: An array P[1..n] of search probabilities for a sorted list of n keys
//Output: Average number of comparisons in successful searches in the
//        optimal BST and table R of subtrees' roots in the optimal BST
for i = 1 to n do
    C[i, i − 1] = 0
    C[i, i] = P[i]
    R[i, i] = i
C[n + 1, n] = 0
for d = 1 to n − 1 do               //diagonal count
    for i = 1 to n − d do
        j = i + d
        minval = ∞
        for k = i to j do
            if C[i, k − 1] + C[k + 1, j] < minval
                minval = C[i, k − 1] + C[k + 1, j]; kmin = k
        R[i, j] = kmin
        sum = P[i]
        for s = i + 1 to j do sum = sum + P[s]
        C[i, j] = minval + sum
return C[1, n], R

The algorithm's space efficiency is clearly quadratic; the time efficiency of this
algorithm is cubic, i.e., Θ(n³).
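A Python sketch of the algorithm, keeping the pseudocode's 1-based indexing; the four-key probability instance below is illustrative, not taken from these notes.

```python
def optimal_bst(P):
    """Dynamic programming for an optimal BST over keys 1..n with success
    probabilities P[1..n] (P[0] is unused padding).  Returns the minimal
    average comparison count C[1, n] and the table R of subtree roots."""
    n = len(P) - 1
    C = [[0.0] * (n + 2) for _ in range(n + 2)]
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = P[i]                   # one-key subtree: its only key is the root
        R[i][i] = i
    for d in range(1, n):                # d = j - i, the diagonal count
        for i in range(1, n - d + 1):
            j = i + d
            # try every key k in i..j as the root of the subtree for keys i..j
            minval, kmin = min((C[i][k - 1] + C[k + 1][j], k)
                               for k in range(i, j + 1))
            R[i][j] = kmin
            C[i][j] = minval + sum(P[i:j + 1])
    return C[1][n], R

# Illustrative instance: four keys with probabilities 0.1, 0.2, 0.4, 0.3.
cost, R = optimal_bst([0, 0.1, 0.2, 0.4, 0.3])
print(round(cost, 2), R[1][4])  # 1.7 3  (minimal average cost; key 3 is the root)
```

Given R, the optimal tree itself is recovered recursively: the root of the subtree for keys i..j is R[i, j], with keys i..R[i, j] − 1 on the left and R[i, j] + 1..j on the right.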


Question bank Unit 3

S.No Questions
1. What do you mean by optimal feasible solution?
An optimal feasible solution is one that either minimizes or maximizes the objective function.
2. What is Transitive Closure?
The transitive closure of a directed graph with n vertices can be defined as the n × n boolean
matrix T = {tij}, in which the element in the ith row (1 ≤ i ≤ n) and the jth column (1 ≤ j ≤ n)
is 1 if there exists a nontrivial directed path from the ith vertex to the jth vertex; otherwise,
tij is 0.
3. State the Principle of Optimality.
An optimal solution to any instance of an optimization problem is composed of optimal
solutions to its subinstances.
4. Compare Dynamic Programming and the Greedy Approach.

Greedy Method                                    Dynamic Programming
It is used for obtaining an optimal solution.    It is used for obtaining an optimal solution.
From a set of feasible solutions we pick         There is no set of feasible solutions.
the optimum solution.
The optimum selection is made without            It considers all possible sequences in order
revising previously generated solutions.         to obtain the optimum solution.
There is no guarantee of getting an              It guarantees that the final solution generated
optimal solution.                                is optimal, using the principle of optimality.

5. State the general principle of the greedy algorithm.
The greedy technique constructs a solution to an optimization problem through a sequence of
steps, each expanding a partially constructed solution obtained so far, until a complete
solution to the problem is reached. On each step, the choice made must be feasible, locally
optimal, and irrevocable.
6. Does Prim's algorithm always work correctly on graphs with negative edge weights?
Yes, Prim's algorithm always works correctly on graphs with negative edge weights. In Prim's
algorithm the MST forms a single tree. The safe edge added to the MST is always a least-weight
edge connecting the tree to a vertex not in the tree.
7. How can one use Prim's algorithm to find a spanning tree of a connected graph with no
weights on its edges?
Since Prim's algorithm requires weights on the graph's edges, some weights have to be
assigned so that the algorithm can be employed to find a spanning tree.
8. Does Kruskal's algorithm work correctly on graphs that have negative edge weights?
Yes, Kruskal's algorithm works correctly on graphs that have negative edge weights. In
Kruskal's algorithm the safe edge added to A (a subset of an MST) is always a least-weight
edge in the graph that connects two distinct components. So, if there are negative-weight
edges, they will not affect the evolution of the algorithm.
9. What is the All-Pairs Shortest-Paths problem?
Given a weighted connected graph (undirected or directed), the all-pairs shortest-paths
problem asks to find the distances (the lengths of the shortest paths) from each vertex to all
other vertices.
10. What is a Minimum Cost Spanning Tree?
A minimum spanning tree of a weighted connected graph is its spanning tree of the smallest
weight, where the weight of a tree is defined as the sum of the weights on all its edges.
11. What do you mean by Huffman code?
A Huffman code is an optimal prefix-free variable-length encoding scheme that assigns bit
strings to characters based on their frequencies in a given text.
12. What is an Optimal Binary Search Tree?
An optimal binary search tree is a binary search tree for which the nodes are arranged on
levels such that the tree cost is minimum.
13. What are the drawbacks of dynamic programming?
• Time and space requirements are high, since storage is needed for all levels.
• Optimality should be checked at all levels.
14. What is the Knapsack problem?
Given n items of known weights w1, . . . , wn and values v1, . . . , vn and a knapsack of
capacity W, find the most valuable subset of the items that fit into the knapsack.

15. State and explain the characteristics of the greedy algorithm.
The greedy approach constructs a solution through a sequence of steps, each expanding a
partially constructed solution obtained so far, until a complete solution to the problem is
reached. On each step the choice made must be:
• feasible, i.e., it has to satisfy the problem's constraints
• locally optimal, i.e., it has to be the best local choice among all feasible choices available
  on that step
• irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm

16. What do you mean by the memory function technique?
The memory function technique seeks to combine the strengths of the top-down and bottom-up
approaches to solving problems with overlapping subproblems. It does this by solving, in the
top-down fashion but only once, just the necessary subproblems of a given problem and
recording their solutions in a table.

PART B (Give questions unit wise - 14 Marks)

This can be either with one question for 14 marks or with two questions of 7 + 7 marks.
S.No  Questions (Marks, COs)
1. Apply Kruskal's algorithm to find a minimum spanning tree of the following graphs. (14 marks, CO 3)

2. Apply Prim's algorithm to find a minimum spanning tree of the following graphs. (14 marks, CO 3)
3. Write Dijkstra's algorithm for finding the single-source shortest paths. (14 marks, CO 3)

4. Find an optimal solution to the knapsack instance n = 7, m = 15,
   (p1, p2, . . . , p7) = (10, 5, 15, 7, 6, 18, 3) and (w1, w2, . . . , w7) = (2, 3, 5, 7, 1, 4, 1). (14 marks, CO 3)

5. Construct the optimal binary search tree with the following data: (14 marks, CO 3)
   Keys         A    B    C    D     E
   Probability  0.2  0.1  0.3  0.25  0.15

6. Solve the all-pairs shortest-paths problem for the digraph with the given weight matrix. (14 marks, CO 3)

PART C (Give questions unit wise - 10 Marks)

S.No  Questions (Marks, COs)
1. Find the shortest path from vertex A to the remaining vertices using Dijkstra's
   algorithm. (10 marks, CO 3)
       A  B  C  D  E
   A   0  ∞  ∞  7  ∞
   B   3  0  4  ∞  ∞
   C   ∞  ∞  0  ∞  6
   D   ∞  2  5  0  ∞
   E   ∞  ∞  ∞  4  0

2. Consider the given table. (10 marks, CO 3)
   Character  A  B  C   D   E   F
   Frequency  5  9  12  13  16  45
   i.   Build a Huffman tree from the characters.
   ii.  Print the codes for the characters.
   iii. Does it follow the greedy approach? Justify your answer.

3. Solve the following instance of the knapsack problem, given the knapsack capacity
   W = 5. (10 marks, CO 3)
   Items  Weight  Value
   1      2       12
   2      1       10
   3      3       20
   4      2       15

4. Apply Warshall's algorithm to find the transitive closure of the digraph defined by the
   following adjacency matrix. (10 marks, CO 3)
