
Greedy Algorithm

A greedy algorithm is an approach for solving a problem by selecting the
best option available at the moment, without worrying about the future
consequences of that choice. In other words, it makes locally optimal
choices in the hope of producing a globally optimal result.

This approach is not the best option for every problem; in some cases it
produces a suboptimal result.

A greedy algorithm never goes back to reverse a decision once it is made.
It works in a top-down manner.

The main advantages of this approach are:

1. The algorithm is easy to describe.

2. It can perform better than other algorithms (but not in all cases).

Feasible Solution
A feasible solution is one that satisfies the constraints of the problem.
The feasible solution that best meets the objective is called the optimal
solution.

Greedy Algorithm
1. To begin with, the solution set (containing answers) is empty.

2. At each step, an item is added to the solution set.

3. If the solution set is feasible, the current item is kept.

4. Else, the item is rejected and never considered again.
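
As a concrete illustration, here is a minimal sketch of this scheme in C++ for making change with coins. The denominations {25, 10, 5, 1} are an assumption chosen because the greedy choice happens to be optimal for them; for arbitrary coin systems the greedy approach can fail, as noted above.

#include <iostream>
#include <vector>

// Greedy change-making: at each step, take the largest coin that still
// fits. A chosen coin is kept; a coin that no longer fits is never
// reconsidered, matching steps 1-4 above.
std::vector<int> greedyChange(int amount, const std::vector<int>& coins) {
    std::vector<int> solution;                    // solution set, initially empty
    for (int coin : coins)                        // coins in decreasing order
        while (amount >= coin) {                  // feasible: keep the item
            solution.push_back(coin);
            amount -= coin;
        }
    return solution;
}

int main() {
    std::vector<int> coins = {25, 10, 5, 1};      // must be sorted in decreasing order
    for (int c : greedyChange(63, coins))
        std::cout << c << ' ';                    // prints: 25 25 10 1 1 1
    std::cout << '\n';
}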

What is a 'Greedy algorithm'?

A greedy algorithm, as the name suggests, always makes the choice that
seems to be the best at that moment. This means that it makes a locally
optimal choice in the hope that this choice will lead to a globally optimal
solution.

How do you decide which choice is optimal?

Assume that you have an objective function that needs to be optimized
(either maximized or minimized) at a given point. A greedy algorithm
makes greedy choices at each step to ensure that the objective function is
optimized. The greedy algorithm has only one shot to compute the optimal
solution: it never goes back and reverses a decision.

Ford-Fulkerson Algorithm
The Ford-Fulkerson algorithm is a greedy approach for calculating the
maximum possible flow in a network or a graph.
The term flow network describes a network of vertices and edges
with a source (S) and a sink (T). Each vertex, except S and T, must send
out exactly as much as it receives. S can only send and T can only
receive.
We can visualize the algorithm as a flow of liquid inside a network of
pipes of different capacities. Each pipe has a certain capacity of liquid it
can transfer at an instant. For this algorithm, we are going to find how
much liquid can flow from the source to the sink at an instant using the
network.

[Figure: flow network graph]

What is the max flow problem?


The max flow problem is an optimization problem for determining the
maximum amount of stuff that can flow at a given point in time through a
single source/sink flow network. A flow network is essentially just a directed
graph where the edge weights represent the flow capacity of each edge.
The stuff that flows through these networks could be literally anything.
Maybe it’s traffic driving through a city, water flowing through pipes, or bits
traveling across the internet.

To make this more concrete, let’s look at the following example:

[Figure: a simple flow network with source s and sink t]

This is a pretty simple graph with no back edges where s is the source
vertex and t is the sink. Let’s imagine that s is a water treatment facility
and t is our home, and we’re interested in finding out the amount of water
that can flow through to our literal bathroom sink.

You can kind of eyeball this one and see that although the edges coming
out of the source (vertex s) have a large capacity, we’re bottlenecked by
the edge leading to our home (the sink vertex t) which can only transport 1
unit of water.
Here our flow can clearly be at most the capacity of our smallest edge
leading into t. So can we simply look for the smallest capacity edges and
say definitively that we know our maximum flow? Almost… and we’ll get to
that later with the max-flow min-cut theorem, but first let’s look at a more
difficult example that has multiple edges flowing into the sink.
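
To make the bookkeeping concrete, here is a minimal C++ sketch of the Ford-Fulkerson method using breadth-first search to find augmenting paths (the Edmonds-Karp variant). The adjacency-matrix representation and the small 4-vertex network in main are assumptions for illustration only.

#include <algorithm>
#include <climits>
#include <iostream>
#include <queue>
#include <vector>

// Repeatedly find an augmenting path from s to t in the residual graph
// (via BFS) and push the bottleneck amount of flow along it.
// The capacity matrix is copied and then mutated as the residual graph.
int maxFlow(std::vector<std::vector<int>> cap, int s, int t) {
    int n = cap.size(), flow = 0;
    while (true) {
        std::vector<int> parent(n, -1);           // BFS tree, for path recovery
        parent[s] = s;
        std::queue<int> q;
        q.push(s);
        while (!q.empty() && parent[t] == -1) {
            int u = q.front(); q.pop();
            for (int v = 0; v < n; ++v)
                if (parent[v] == -1 && cap[u][v] > 0) {
                    parent[v] = u;
                    q.push(v);
                }
        }
        if (parent[t] == -1) break;               // no augmenting path left: done
        int bottleneck = INT_MAX;                 // smallest residual capacity on the path
        for (int v = t; v != s; v = parent[v])
            bottleneck = std::min(bottleneck, cap[parent[v]][v]);
        for (int v = t; v != s; v = parent[v]) {
            cap[parent[v]][v] -= bottleneck;      // use up forward capacity
            cap[v][parent[v]] += bottleneck;      // back edge lets later paths undo flow
        }
        flow += bottleneck;
    }
    return flow;
}

int main() {
    // Hypothetical network: vertex 0 is the source s, vertex 3 is the sink t.
    std::vector<std::vector<int>> cap = {
        {0, 10, 10, 0},
        {0,  0,  2, 4},
        {0,  0,  0, 9},
        {0,  0,  0, 0},
    };
    std::cout << maxFlow(cap, 0, 3) << '\n';      // prints: 13
}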

https://www.youtube.com/watch?v=NwenwITjMys
Dijkstra's Algorithm

Dijkstra's Algorithm allows you to calculate the shortest path between one
node (you pick which one) and every other node in the graph. Let's calculate
the shortest path between node C and the other nodes in our graph:

During the algorithm's execution, we'll mark every node with its minimum
distance to node C (our selected node). For node C, this distance is 0. For
the rest of the nodes, as we still don't know the minimum distance, it starts
as infinity (∞):

We'll also have a current node. Initially, we set it to C (our selected node). In
the image, we mark the current node with a red dot.
Now, we check the neighbours of our current node (A, B and D) in no
specific order. Let's begin with B. We add the minimum distance of the
current node (in this case, 0) to the weight of the edge that connects our
current node with B (in this case, 7), and we obtain 0 + 7 = 7. We compare
that value with the minimum distance of B (infinity); the lowest value is the
one that remains as the minimum distance of B (in this case, 7 is less than
infinity):

So far, so good. Now, let's check neighbour A. We add 0 (the minimum
distance of C, our current node) to 1 (the weight of the edge connecting
our current node with A) to obtain 1. We compare that 1 with the
minimum distance of A (infinity), and leave the smallest value:

OK. Repeat the same procedure for D:


Great. We have checked all the neighbours of C. Because of that, we mark
it as visited. Let's represent visited nodes with a green check mark:

We now need to pick a new current node. That node must be the unvisited
node with the smallest minimum distance (so, the node with the smallest
number and no check mark). That's A. Let's mark it with the red dot:
And now we repeat the algorithm. We check the neighbours of our current
node, ignoring the visited nodes. This means we only check B.

For B, we add 1 (the minimum distance of A, our current node) to 3 (the
weight of the edge connecting A and B) to obtain 4. We compare that 4
with the minimum distance of B (7) and leave the smallest value: 4.

Afterwards, we mark A as visited and pick a new current node: D, which is
the non-visited node with the smallest current distance.
We repeat the algorithm again. This time, we check B and E.

For B, we obtain 2 + 5 = 7. We compare that value with B's minimum
distance (4) and leave the smallest value (4). For E, we obtain 2 + 7 = 9,
compare it with the minimum distance of E (infinity) and leave the smallest
one (9).

We mark D as visited and set our current node to B.

Almost there. We only need to check E. 4 + 1 = 5, which is less than E's
minimum distance (9), so we leave the 5. Then, we mark B as visited and
set E as the current node.
E doesn't have any non-visited neighbours, so we don't need to check
anything. We mark it as visited.

As there are no unvisited nodes left, we're done! The minimum distance of
each node now represents the minimum distance from that node to node C
(the node we picked as our initial node)!

Here's a description of the algorithm:

1. Mark your selected initial node with a current distance of 0 and the
rest with infinity.
2. Set the non-visited node with the smallest current distance as the
current node C.
3. For each neighbour N of your current node C: add the current
distance of C to the weight of the edge connecting C and N. If the
result is smaller than the current distance of N, set it as the new
current distance of N.
4. Mark the current node C as visited.
5. If there are non-visited nodes, go to step 2.
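
Here is a minimal C++ sketch of this procedure using a priority queue, run on the graph from the walkthrough above (vertices A-E mapped to indices 0-4, with edges C-A = 1, C-B = 7, C-D = 2, A-B = 3, B-D = 5, B-E = 1, D-E = 7). Instead of explicit visited marks, stale queue entries are skipped, which has the same effect.

#include <functional>
#include <iostream>
#include <queue>
#include <utility>
#include <vector>

// Dijkstra's algorithm: dist[v] is the current minimum distance from the
// source. The priority queue always yields the unvisited node with the
// smallest current distance (step 2 above).
std::vector<int> dijkstra(int src,
        const std::vector<std::vector<std::pair<int,int>>>& adj) {
    const int INF = 1000000000;
    std::vector<int> dist(adj.size(), INF);       // step 1: all infinity...
    dist[src] = 0;                                // ...except the source
    using P = std::pair<int,int>;                 // (distance, vertex)
    std::priority_queue<P, std::vector<P>, std::greater<P>> pq;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;                // stale entry: u was already finalized
        for (auto [v, w] : adj[u])
            if (dist[u] + w < dist[v]) {          // step 3: keep the smaller distance
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
    }
    return dist;
}

int main() {
    std::vector<std::vector<std::pair<int,int>>> adj(5);
    auto addEdge = [&](int u, int v, int w) {     // undirected weighted edge
        adj[u].push_back({v, w});
        adj[v].push_back({u, w});
    };
    addEdge(2, 0, 1);  // C-A
    addEdge(2, 1, 7);  // C-B
    addEdge(2, 3, 2);  // C-D
    addEdge(0, 1, 3);  // A-B
    addEdge(1, 3, 5);  // B-D
    addEdge(1, 4, 1);  // B-E
    addEdge(3, 4, 7);  // D-E
    for (int d : dijkstra(2, adj))
        std::cout << d << ' ';                    // prints: 1 4 0 2 5
    std::cout << '\n';
}

The printed distances match the walkthrough: A = 1, B = 4, C = 0, D = 2, E = 5.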

Prim’s Algorithm-

- Prim’s Algorithm is a famous greedy algorithm.
- It is used for finding the Minimum Spanning Tree (MST) of a given graph.
- To apply Prim’s algorithm, the given graph must be weighted, connected and undirected.

Prim’s Algorithm Implementation-


 
The implementation of Prim’s Algorithm is explained in the following steps-
 

Step-01:

- Randomly choose any vertex.
- The vertex incident to the edge of least weight is usually selected.

Step-02:

- Find all the edges that connect the tree to new vertices.
- Find the least-weight edge among those edges and include it in the existing tree.
- If including that edge creates a cycle, then reject that edge and look for the next least-weight edge.

Step-03:

- Keep repeating Step-02 until all the vertices are included and the Minimum Spanning Tree (MST) is obtained.
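
These steps translate fairly directly into code. Below is a minimal C++ sketch of Prim's Algorithm on an adjacency matrix (0 meaning no edge); the 4-vertex example graph is an assumption for illustration.

#include <climits>
#include <iostream>
#include <vector>

// Prim's algorithm: grow the tree from vertex 0, always adding the
// least-weight edge that connects the tree to a new vertex. Such an
// edge can never create a cycle, since one endpoint is always new.
int primMST(const std::vector<std::vector<int>>& w) {
    int n = w.size(), total = 0;
    std::vector<bool> inTree(n, false);
    std::vector<int> best(n, INT_MAX);   // cheapest edge linking each vertex to the tree
    best[0] = 0;                         // Step-01: start from an arbitrary vertex
    for (int step = 0; step < n; ++step) {
        int u = -1;
        for (int v = 0; v < n; ++v)      // Step-02: least-weight edge to a new vertex
            if (!inTree[v] && (u == -1 || best[v] < best[u]))
                u = v;
        inTree[u] = true;
        total += best[u];
        for (int v = 0; v < n; ++v)      // record cheaper connections through u
            if (!inTree[v] && w[u][v] != 0 && w[u][v] < best[v])
                best[v] = w[u][v];
    }
    return total;                        // Step-03: repeat until all vertices included
}

int main() {
    // Hypothetical weighted, connected, undirected graph.
    std::vector<std::vector<int>> w = {
        {0, 2, 3, 0},
        {2, 0, 1, 4},
        {3, 1, 0, 5},
        {0, 4, 5, 0},
    };
    std::cout << primMST(w) << '\n';     // prints: 7 (edges 0-1, 1-2, 1-3)
}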
 
 

PRACTICE PROBLEMS BASED ON PRIM’S ALGORITHM-


 

Problem-01:
 
Construct the minimum spanning tree (MST) for the given graph using Prim’s
Algorithm-
 

Solution-
 
The steps discussed above are followed to find the minimum cost spanning tree
using Prim’s Algorithm-
 

Step-01 through Step-06: [Figures: the MST is grown by adding one least-weight edge at a time]
 
Since all the vertices have been included in the MST, we stop.
 
Now, Cost of Minimum Spanning Tree
= Sum of all edge weights
= 10 + 25 + 22 + 12 + 16 + 14
= 99 units
 

Problem-02:
 
Using Prim’s Algorithm, find the cost of the minimum spanning tree (MST) of the
given graph-
 
 

Solution-
 
The minimum spanning tree obtained by the application of Prim’s Algorithm on the
given graph is as shown below-
 

 
Now, Cost of Minimum Spanning Tree
= Sum of all edge weights
= 1 + 4 + 2 + 6 + 3 + 10
= 26 units
Huffman Coding
Huffman Coding is a technique of compressing data to reduce its size
without losing any of the details. It was first developed by David Huffman.

Huffman Coding is generally useful to compress the data in which there are
frequently occurring characters.

How does Huffman Coding work?


Suppose the string below is to be sent over a network.

[Figure: initial string]

Each character occupies 8 bits. There are a total of 15 characters in the
above string. Thus, a total of 8 * 15 = 120 bits are required to send this
string.
Using the Huffman Coding technique, we can compress the string to a
smaller size.

Huffman coding first creates a tree using the frequencies of the characters
and then generates a code for each character.

Once the data is encoded, it has to be decoded. Decoding is done using
the same tree.

Huffman Coding prevents any ambiguity in the decoding process using the
concept of prefix codes, i.e., no code is a prefix of any other code. The
tree created above helps in maintaining this property.
Huffman coding is done with the help of the following steps.

1. Calculate the frequency of each character in the string.

[Figure: frequency of each character]

2. Sort the characters in increasing order of frequency. These are
stored in a priority queue Q.

[Figure: characters sorted according to frequency]

3. Make each unique character a leaf node.

4. Create an empty node z. Assign the minimum frequency to the left
child of z and assign the second minimum frequency to the right child
of z. Set the value of z to the sum of these two minimum frequencies.

[Figure: getting the sum of the two least frequencies]

5. Remove these two minimum frequencies from Q and add the sum
into the list of frequencies (* denotes the internal nodes in the figure
above).

6. Insert node z into the tree.

7. Repeat steps 4 to 6 until only one node (the root) remains in Q.

8. For each non-leaf node, assign 0 to the left edge and 1 to the right
edge.
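
Putting these steps together, here is a minimal C++ sketch of the tree construction and code assignment, with a priority queue playing the role of Q. Run on the frequencies of the example (A:5, B:1, C:6, D:3), it reproduces the code lengths in the table below; the exact 0/1 labels can differ if ties are broken differently.

#include <iostream>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// A Huffman tree node: leaves carry a character, internal nodes
// (marked '*') carry the sum of their children's frequencies.
struct Node {
    int freq;
    char ch;
    Node *left, *right;
};

// Comparator so the priority queue pops the smallest frequency first.
struct Cmp {
    bool operator()(const Node* a, const Node* b) const { return a->freq > b->freq; }
};

// Step 8: walk the tree, appending 0 on left edges and 1 on right edges.
void printCodes(const Node* node, const std::string& code) {
    if (!node->left && !node->right) {
        std::cout << node->ch << ": " << code << '\n';
        return;
    }
    printCodes(node->left, code + "0");
    printCodes(node->right, code + "1");
}

int main() {
    std::vector<std::pair<char,int>> freq = {{'A',5},{'B',1},{'C',6},{'D',3}};
    std::priority_queue<Node*, std::vector<Node*>, Cmp> q;
    for (auto [c, f] : freq)                            // step 3: leaf nodes into Q
        q.push(new Node{f, c, nullptr, nullptr});
    while (q.size() > 1) {
        Node* l = q.top(); q.pop();                     // minimum frequency
        Node* r = q.top(); q.pop();                     // second minimum frequency
        q.push(new Node{l->freq + r->freq, '*', l, r}); // steps 4-6: internal node z
    }
    printCodes(q.top(), "");                            // prints: C: 0, B: 100, D: 101, A: 11
}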

For sending the above string over a network, we have to send the tree as
well as the above compressed code. The total size is given by the table
below.

Character   Frequency   Code   Size
A           5           11     5*2 = 10
B           1           100    1*3 = 3
C           6           0      6*1 = 6
D           3           101    3*3 = 9
Total:      4 * 8 = 32 bits (characters)   15 bits (frequencies)   28 bits (codes)

Without encoding, the total size of the string was 120 bits. After encoding,
the size is reduced to 32 + 15 + 28 = 75 bits.
Decoding the code
To decode the code, we take the code and traverse the tree from the root
to find the character.

Suppose 101 is to be decoded; we can traverse from the root as in the
figure below.

[Figure: decoding the code 101]

Dynamic Programming
Dynamic Programming is a technique in computer programming that helps
to efficiently solve a class of problems that have overlapping subproblems
and the optimal substructure property.
Such problems involve repeatedly calculating the value of the same
subproblems to find the optimum solution.

Dynamic Programming Example

Take the case of generating the Fibonacci sequence.

If the sequence is F(1), F(2), F(3), ..., F(50), it follows the rule
F(n) = F(n-1) + F(n-2):

F(50) = F(49) + F(48)

F(49) = F(48) + F(47)

F(48) = F(47) + F(46)

...

Notice how there are overlapping subproblems: we need to calculate F(48)
to calculate both F(50) and F(49). This is exactly the kind of problem
where Dynamic Programming shines.
How Dynamic Programming Works
Dynamic programming works by storing the result of subproblems so that
when their solutions are required, they are at hand and we do not need to
recalculate them.

This technique of storing the values of subproblems is called memoization.
By saving the values in an array, we save time on computations of sub-
problems we have already come across.

Dynamic programming by memoization is a top-down approach to dynamic
programming. By reversing the direction in which the algorithm works, i.e.
by starting from the base case and working towards the solution, we can
also implement dynamic programming in a bottom-up manner.
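
As a short illustration, here is a top-down (memoized) Fibonacci in C++, using the same convention F(0) = F(1) = 1 as the code later in this section; the memo table is the only addition over plain recursion.

#include <iostream>
#include <vector>

// Top-down dynamic programming: plain recursion plus a memo table.
// Each F(i) is computed once; repeated calls return the stored value.
std::vector<long long> memo;

long long fib(int n) {
    if (n < 2) return 1;                  // base cases F(0) = F(1) = 1
    if (memo[n] != 0) return memo[n];     // already solved: reuse the result
    return memo[n] = fib(n - 1) + fib(n - 2);
}

int main() {
    int n = 50;
    memo.assign(n + 1, 0);
    std::cout << fib(n) << '\n';          // O(n) calls instead of exponentially many
}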

Recursion vs Dynamic Programming

Dynamic programming is mostly applied to recursive algorithms. This is not
a coincidence; most optimization problems require recursion, and dynamic
programming is used for optimization.

But not all problems that use recursion can use Dynamic Programming.
Unless there are overlapping subproblems, as in the Fibonacci sequence
problem, recursion can only reach the solution using a divide and conquer
approach.

That is the reason why a recursive algorithm like Merge Sort cannot use
Dynamic Programming, because the subproblems are not overlapping in
any way.
Dynamic Programming and Recursion:

Dynamic programming is basically recursion plus common sense. Recursion allows you to
express the value of a function in terms of other values of that function, while common
sense tells you that if you implement your function in a way that the recursive calls are
done in advance and stored for easy access, it will make your program faster. This is what
we call memoization: storing the results of specific states, which can later be accessed to
solve other sub-problems.

The intuition behind dynamic programming is that we trade space for time, i.e. to say
that instead of calculating all the states taking a lot of time but no space, we take up space
to store the results of all the sub-problems to save time later.

Let's try to understand this by taking an example of Fibonacci numbers.

Fibonacci(n) = 1,                                   if n = 0
Fibonacci(n) = 1,                                   if n = 1
Fibonacci(n) = Fibonacci(n-1) + Fibonacci(n-2),     otherwise

So, the first few numbers in this series will be: 1, 1, 2, 3, 5, 8, 13, 21... and so on!

A code for it using pure recursion:

int fib (int n) {
    // Recomputes the same subproblems over and over: exponential time.
    if (n < 2)
        return 1;
    return fib(n-1) + fib(n-2);
}

Using a bottom-up Dynamic Programming approach:

// fibresult[i] stores Fibonacci(i); each value is computed exactly once.
long long fibresult[100];

void fib (int n) {
    fibresult[0] = 1;
    fibresult[1] = 1;
    for (int i = 2; i < n; i++)
        fibresult[i] = fibresult[i-1] + fibresult[i-2];
}

Are we using a different recurrence relation in the two codes? No. Are we doing anything
different in the two codes? Yes.

In the recursive code, a lot of values are recalculated multiple times. We would do better
to calculate each unique quantity only once. Take a look at the image to understand how
certain values were being recalculated in the recursive approach:
