
Algorithm Design and Complexity

Course 9
Overview
 Minimum Spanning Trees
 Generic Algorithm
 Kruskal’s Algorithm
 Disjoint Sets
 Prim’s Algorithm
 Fibonacci Heaps
Spanning Trees
 G(V, E) undirected, connected and weighted graph

 The weight (cost) function w: E → R


 w(u, v) = the weight of the edge (u, v)

 A spanning tree of G is a connected, undirected and


acyclic graph (a tree) that covers all the vertices of the
graph
 T(V, E’), E’ ⊆ E
 |E’| = |V| - 1
 The weight of a spanning tree = the sum of the weights of
the edges that are part of the tree
 w(T) = Σ w(e), e ∈ E’
Minimum Spanning Trees
 A minimum spanning tree (MST) is a spanning tree whose
total weight is minimized over all the possible spanning
trees that can be built for a given graph

 Optimization problem
 Does it have optimal substructure?
 Are the sub-solutions optimal as well?
 Maybe greedy or dynamic programming

 A graph may have more than a single MST


 We want to find only one of them
 We can also find all of them, but that is more difficult
Unique MST
 If the weights of all the edges in the graph are distinct =>
unique MST

 If there are two edges with the same weight => there may be
more than one MST

 A graph that has the same weight for all the edges => all
the spanning trees have the same cost
Example – Two MSTs
 Two MSTs of the same example graph
 The dotted edges are not part of the MST
 [Figure: the example graph on vertices A–L, shown once with the 1st MST and once with the 2nd MST highlighted]
MST – Applications
 Computer networks
 Road infrastructure
 Other networks

 Clustering in a Euclidean space

 Approximation algorithms for NP-complete problems


 E.g. for TSP
MST – Examples
 Image source: http://hansolav.net/sql/prim_graph.png
MST – Solution
 In order to find the minimum spanning tree T(V, E’), we
need to determine the set of edges E’

 Build an algorithm that builds a set of edges A


 Initially, A is empty
 At each step, we add an edge such that the following loop
invariant is respected:
 A is a subset of a MST
 Therefore, we add only edges that maintain the invariant. These
are called safe edges
 If A is a subset of a MST, an edge (u, v) ∈ E is safe for A if and
only if A ∪ {(u, v)} is also a subset of a MST for G

 Optimal sub-structure!
MST – Generic Algorithm
 Follows directly from the presented solution
 The loop invariant is respected
 However, it does not provide a way to select the safe
edges => the algorithm is not fully specified
 Need to extend it in order to determine how to find the
safe edges

GENERIC-MST(G, w)
  A = ∅
  WHILE (|A| < |V| – 1)
    find an edge (u, v) that is safe for A
    A = A ∪ {(u, v)}
  RETURN A
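
As a rough sketch (my illustration, not part of the original slides), the generic algorithm can be written in Python with the safe-edge selection left as a pluggable function; the graph representation and the find_safe_edge parameter are assumptions made only for this example:

# Sketch of GENERIC-MST in Python (illustrative only). The graph is assumed to be
# given as a list of vertices and a collection of edges; find_safe_edge is a
# placeholder – Kruskal and Prim differ only in how they implement it.

def generic_mst(vertices, edges, find_safe_edge):
    A = set()                                      # invariant: A is a subset of some MST
    while len(A) < len(vertices) - 1:
        u, v = find_safe_edge(vertices, edges, A)  # must return an edge that is safe for A
        A.add((u, v))
    return A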
Finding Safe Edges
 If A = ∅
  The edge with the lowest cost in G is safe for A = ∅

 If A ≠ ∅
  Let S ⊆ V be the set of vertices covered by the edges in A
  V \ S is not empty
  The edge (c, f), c ∈ S, f ∈ V \ S, that has the minimum cost among all
the edges that have one endpoint in S and the other one in V \ S, is safe for A

 But these are greedy choices!


Definitions
 A cut (S, V \ S) of a graph is a partition of vertices into two
disjoint sets
 S
 V\S
 An edge (u, v) ∈ E crosses the cut (S, V \ S) if it has one
endpoint in S and the other one in V \ S
 A cut respects a set of edges A ⊆ E if no edge in A crosses the
cut
 A light edge for a cut is one of the edges that crosses the cut
and has the minimum weight among all the edges that cross the
cut

 A cut always has at least one light edge, but it is not necessarily unique!


Theorem – Finding Safe Edges
 A is a subset of a MST for G
 (S, V \ S) is a cut that respects A
 (u, v) is a light edge for the cut (S, V \ S)
Then
 (u, v) is a safe edge for A

Proof (sketch): Assume there is an MST T that contains A but not (u, v).
Adding (u, v) to T creates a cycle, and this cycle must contain another
edge (x, y) that crosses the cut (S, V \ S). We can build
T’ = T \ {(x, y)} ∪ {(u, v)}, which is still a spanning tree. Since (u, v) is a
light edge, w(u, v) ≤ w(x, y), so w(T’) ≤ w(T) and T’ is also a MST that
contains A ∪ {(u, v)} => (u, v) is safe for A
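
To make the cut / light-edge notions concrete, here is a small Python sketch (my code, not from the slides) that finds a light edge for a given cut (S, V \ S); the edge-list representation is an assumption:

# Sketch: find a light edge for the cut (S, V \ S).
# edges is assumed to be an iterable of (weight, u, v) tuples; S is a set of vertices.

def light_edge(edges, S):
    crossing = [(w, u, v) for (w, u, v) in edges
                if (u in S) != (v in S)]   # edges with exactly one endpoint in S
    if not crossing:
        return None                        # no edge crosses the cut
    w, u, v = min(crossing)                # a minimum-weight crossing edge
    return (u, v)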
Generic MST Revisited
 Initially, A = ∅
 Therefore, the partial MST contains all the vertices in G, but
no edges
 => We have a forest of |V| components, one vertex per component
 At each step, we choose a safe edge that connects any two
components
 Light edge for the cut that has one component in S and the other in
V\S
 The two connected components are merged into a larger
single connected component

 Each component in the partial MST is a tree

 In the end, we shall have a single component => the MST


Property
 Let C = (VC, EC) be a connected component in the partial
MST corresponding to the forest GA = (V, A)
 If (u, v) is a light edge connecting C with some other
component in GA

 Then (u, v) is also a light edge for the cut (VC, V \ VC)

 Therefore (u, v) is safe for A

 Starting point for Kruskal’s algorithm


Kruskal’s Algorithm
 Starts from the Generic MST algorithm
 Sorts the edges of the graph according to their weight
 Initially, A = ∅ and each vertex is in its own connected
component
 Repeatedly merge two components into one by choosing the
light edge between them
 This edge should also be a light edge for the cut between one of the
components and the rest of the graph

 This is true if we consider the edges in increasing order of their weight
  If the endpoints of the current edge are in different components, then it is a
safe edge! Merge the two components
Kruskal – Pseudocode
KRUSKAL(G, w)
  A = ∅
  FOREACH (v ∈ V)
    MAKE-SET(v)
  sort E by increasing order of their weights
  FOREACH ((u, v) ∈ E taken from the sorted list)   // can also check if |A| < |V| – 1
    IF (FIND-SET(u) != FIND-SET(v))
      A = A ∪ {(u, v)}
      UNION(u, v)
  RETURN A
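
A possible Python translation of the pseudocode above (a sketch under my own assumptions about the input format, not code from the course); it inlines a minimal disjoint-set structure, which is discussed in detail in the next section:

# Sketch of Kruskal: vertices is an iterable of vertex ids,
# edges is a list of (weight, u, v) tuples.

def kruskal(vertices, edges):
    parent = {v: v for v in vertices}       # MAKE-SET for every vertex

    def find_set(u):                        # FIND-SET (with simple path compression)
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    A = []
    for w, u, v in sorted(edges):           # edges in increasing order of weight
        ru, rv = find_set(u), find_set(v)
        if ru != rv:                        # endpoints in different components => safe edge
            A.append((u, v, w))
            parent[ru] = rv                 # UNION of the two components
    return A

On a connected graph with |V| vertices, the returned list contains exactly |V| – 1 edges.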

 Complexity: O(m · log m + m · FIND-SET + n · UNION)

  In the worst case, we consider all the edges in the graph, and for
each of them we call FIND-SET twice!
  UNION is always called O(n) times
Kruskal – Example
 Example from the “Proiectarea Algoritmilor 2010” course
 Thanks to Costin Chiru
 The edges, considered in increasing order of weight:
  CE-1, EF-2, AG-2, JK-2, AI-3, GH-4, BC-5, IJ-5, AH-6, KL-7, BG-8, CD-8, IL-8, AB-9
 [Figures, Example (I)–(XV): the example graph on vertices A–L, showing the partial MST
after each edge in the list is considered; an edge is added only if its endpoints are
still in different components]
Comparison Prim – Kruskal
 [Figure: the example graph shown twice, comparing the spanning trees obtained by Prim and by Kruskal]
Disjoint Sets
 http://en.wikipedia.org/wiki/Disjoint-set_data_structure
 We want to partition the vertices of the graph into a number
of separate and non-overlapping sets
 To remember the connected components in the partial MST tree

 Operations:
 MAKE-SET(u): creates a set with a single element u
 FIND-SET(u): finds the set that u is part of (usually returns the
representative element of that set, e.g. an ID of each set)
 UNION(u, v): merges two distinct sets into a single one (need to
move all the elements of a set into the other one, in the end all the
elements in the new set must have the same representative)
Alternatives for Disjoint Sets
 Can be implemented using lists, arrays, forests of trees, or
forests of trees plus heuristics

 Simplest solution: use an array

 set[1..n] = an array holding the representative of each element
of the disjoint sets
Example

 Element:      A B C D E F G H I J K L
 set[element]: 0 1 1 1 1 1 1 1 0 0 0 0

 [Figure: the example graph, whose partial MST currently has two connected
components: {A, I, J, K, L} with representative 0 and {B, C, D, E, F, G, H}
with representative 1]
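
The array representation from the example above could be sketched in Python as follows (my illustration, not the course's code); elements are assumed to be numbered 0..n-1:

# Sketch of array-based disjoint sets: set_id[i] is the representative of element i.

class ArrayDisjointSets:
    def __init__(self, n):
        self.set_id = list(range(n))        # MAKE-SET for all n elements

    def find_set(self, u):
        return self.set_id[u]               # just return the stored representative

    def union(self, u, v):
        ru, rv = self.set_id[u], self.set_id[v]
        if ru == rv:
            return
        for i in range(len(self.set_id)):   # relabel every element of one set
            if self.set_id[i] == ru:
                self.set_id[i] = rv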
Arrays as Disjoint Sets
 Complexity?
 MAKE-SET(u): O(1)
 FIND-SET(u): O(1)
  Just return set[u]
 UNION(u, v): O(n)
  Have to walk through all the elements of the smaller disjoint set and
change their representative to the one of the larger disjoint set!

 Kruskal complexity?
 O(m·log m + m + n²) = O(m·log m + n²)
 Want better!
Forest of Trees as Disjoint Sets
 Use a forest of trees
 One tree for each disjoint set
 The representative of the disjoint set is the root element of
each tree

 Complexity?
 MAKE-SET(u): O(1)
 FIND-SET(u): O(max_height)
  Need to return the root element
  Start from u and walk up to the root
 UNION(u, v): O(max_height)
 Need to append all the elements in one tree to the other tree
 Just make the root of the first tree point to an element in the second
tree (the root of the second tree or even to v)
 But for this we need to find the root of the first tree
Forest of Trees as Disjoint Sets (2)
 But, in the worst case
 When unions are not made very wisely
 max_height of a tree is O(n)
 Therefore, the complexity of the two operations is O(n)

 Need to improve it using heuristics:


 Union by rank
 Path compression
Heuristic 1: Union by Rank
 Union wisely 
 Always attach the smaller (shorter) tree to the root of the taller
one

 This way, we keep the trees somewhat balanced and the


height does not increase a lot after multiple union
operations

 It can be shown that max_height will be O(log n) in this


case
Heuristic 2: Path Compression
 Flatten the tree whenever FIND-SET(u) is called

 How?
 Make all the elements on the path from u up to the root
of the tree point directly to the root
 Thus, when we call FIND-SET for these elements, we can
return the root in (1)
 [Figure: the same tree before and after path compression – after FIND-SET,
all the nodes on the traversed path point directly to the root]
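
Putting the two heuristics together, a compact Python sketch (mine, not from the course) could look like this:

# Sketch of disjoint-set forests with union by rank and path compression.

class DisjointSets:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # MAKE-SET: each element is its own root
        self.rank = {x: 0 for x in elements}     # rank = upper bound on the tree height

    def find_set(self, x):
        if self.parent[x] != x:                  # path compression: hang x directly off the root
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def union(self, x, y):
        rx, ry = self.find_set(x), self.find_set(y)
        if rx == ry:
            return
        if self.rank[rx] < self.rank[ry]:        # union by rank: attach the shorter tree
            rx, ry = ry, rx
        self.parent[ry] = rx
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1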
Forests with Both Heuristics
 When using forests with union-by-rank and path-compression,
the amortized time of any operation on the disjoint-set structure
(FIND-SET, UNION) is:
 O(α(n)) ≈ O(1) even for very large n
  α(n) = Ack⁻¹(n, n), the inverse of the Ackermann function

  Ack(m, n) = 2 ↑^(m−2) (n + 3) – 3 (using Knuth’s up-arrow notation)


 A function that increases very, very quickly
 Therefore α(n) increases very, very slowly

 Kruskal complexity?
 O(m·log m + m·α(n) + n) = O(m·log m + n) = O(m·log n) WHY?
Prim’s Algorithm
 Instead of building the partial MST in different connected
components
 Build the partial MST in a single connected component S
 Always consider the cut (S, V \ S) and choose the light
edge for this cut
 Easier to implement?
 Easier to understand?

 Need a start vertex – it may be any vertex in G


Prim - Pseudocode
PRIM(G, w, s)
  FOREACH (v ∈ V)
    p[v] = NULL; d[v] = INF
  d[s] = 0
  A = ∅
  S = ∅                          // used only to denote the cut
  Q = PRIORITY-QUEUE(V, d)       // build a priority queue indexed by the vertices V
                                 // with priorities in d[u] for each vertex
  WHILE (!Q.EMPTY())
    u = Q.EXTRACT-MIN()          // pick the light edge = safe edge
    S = S ∪ {u}                  // add the current vertex to the other side of the cut
    A = A ∪ {(u, p[u])}          // add the current edge to the partial MST
    FOREACH (v ∈ Adj[u])
      IF (v ∈ Q AND d[v] > w(u, v))   // found a better edge from S to v
        d[v] = w(u, v)           // need to heapify-up the element!
                                 // Q.DECREASE-KEY(v, w(u, v))
        p[v] = u
  RETURN A \ {(s, p[s])}
Prim – Remarks
 Uses a priority queue in order to allow finding the light
edge for the cut (S, V \ S) as efficiently as possible
 The vertices that are in the priority queue are the ones in
V\S
 d[v] contains the minimum weight of an edge that
connects v with any vertex from S (true for each vertex
that is still in the priority queue)
 (p[u], u) is exactly this minimum weight edge!
Prim – Complexity
 Depends on how we implement the priority queue:
  O(n · EXTRACT-MIN + m · DECREASE-KEY)

 If the priority queue is a simple array:


 EXTRACT-MIN: O(n)
 DECREASE-KEY: O(1)
 Prim: O(n² + m) => good for dense graphs

 If the priority queue is a binary heap:


 EXTRACT-MIN: O(logn)
 DECREASE-KEY: O(logn)
 Prim: O(n·log n + m·log n) = O(m·log n) => good for sparse graphs
Prim & Fibonacci Heaps
 Best solution: use Fibonacci heaps
 http://en.wikipedia.org/wiki/Fibonacci_heap
 EXTRACT-MIN: O(logn)
 DECREASE-KEY: O(1)
 Prim: O(n·log n + m) => good for both sparse and dense graphs
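
For a concrete implementation (a sketch with my own assumptions, not the course's code): Python's heapq is a binary heap without DECREASE-KEY, so a common workaround is to push a new entry whenever d[v] improves and to skip stale entries on extraction; this keeps the O(m·log n) bound of the binary-heap variant. The adjacency-list format below is assumed:

import heapq

# Sketch of Prim with a binary heap. adj is assumed to be a dict
# {u: [(weight, v), ...]}; s is the start vertex.

def prim(adj, s):
    d = {v: float('inf') for v in adj}     # d[v] = lightest known edge from S to v
    p = {v: None for v in adj}             # p[v] = the endpoint of that edge inside S
    d[s] = 0
    in_S = set()                           # the vertices already moved into S
    heap = [(0, s)]                        # (key, vertex); stale pairs are skipped below
    A = []                                 # the edges of the partial MST
    while heap:
        du, u = heapq.heappop(heap)        # EXTRACT-MIN
        if u in in_S:
            continue                       # stale entry left by a "lazy" decrease-key
        in_S.add(u)
        if p[u] is not None:
            A.append((p[u], u, du))        # add the light edge (p[u], u) to the MST
        for w, v in adj[u]:
            if v not in in_S and w < d[v]: # found a better edge from S to v
                d[v] = w
                p[v] = u
                heapq.heappush(heap, (w, v))   # push instead of DECREASE-KEY
    return A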
Prim – Example (I)–(XII)
 We start from vertex I
 The priority queue before each EXTRACT-MIN, and the vertex that is extracted:
  Q: A(3), J(5), L(8), B(∞), C(∞), D(∞), E(∞), F(∞), G(∞), H(∞), K(∞) → A
  Q: G(2), J(5), H(6), L(8), B(9), C(∞), D(∞), E(∞), F(∞), K(∞) → G
  Q: H(4), J(5), L(8), B(8), C(∞), D(∞), E(∞), F(∞), K(∞) → H
  Q: J(5), L(8), B(8), C(∞), D(∞), E(∞), F(∞), K(∞) → J
  Q: K(2), L(8), B(8), C(∞), D(∞), E(∞), F(∞) → K
  Q: L(7), B(8), C(∞), D(∞), E(∞), F(∞) → L
  Q: B(8), C(∞), D(∞), E(∞), F(∞) → B
  Q: C(5), D(∞), E(∞), F(∞) → C
  Q: E(1), D(8), F(∞) → E
  Q: F(2), D(8) → F
  Q: D(8) → D
  Q: Ø
 [Figures, Example (I)–(XII): the example graph with the partial MST after each extraction]
References
 CLRS – Chapter 24

 R. Sedgewick, K. Wayne – Algorithms and Data Structures –
Princeton 2007, www.cs.princeton.edu/~rs/AlgsDS07/
(lectures 01UnionFind and 14MST)

 MIT OCW – Introduction to Algorithms – video lecture 16
