Correctness of Kruskal's Algorithm: Operations

Correctness of Kruskal’s algorithm
1. Throughout the execution of Kruskal, (V , F ) remains a spanning forest.

Proof: (V , F ) is a spanning subgraph because the vertex set is V . It
Algorithms and Data Structures: always remains a forest because edges with endpoints in different
connected components never induce a cycle.
Minimum Spanning Trees II
2. Eventually, (V , F ) will be connected and thus a spanning tree.
Proof: Suppose that after the complete execution of the loop, (V , F ) has
a connected component (V1 , F1 ) with V1 6= V . Since G is connected, there
is an edge e ∈ E with exactly one endpoint in V1 . This edge would have
been added to F when being processed in the loop, so this can never
5th Nov, 2010 happen.
3. Throughout the execution of Kruskal, (V , F ) is contained in some MST
of G.
Proof: Similar to the proof of the corresponding statement for Prim’s
algorithm.
ADS: lect 12 – slide 1 – 5th Nov, 2010 ADS: lect 12 – slide 3 – 5th Nov, 2010
Kruskal’s Algorithm Data Structures for Disjoint Sets

A different approach to computing MSTs. I A disjoint set data structure maintains a collection S = {S1 , . . . , Sk }
A forest is a graph whose connected components are trees. of disjoint sets.
I The sets are dynamic, i.e., they may change over time.
Idea
Starting from the spanning forest without any edges, I Each set Si is identified by some representative, which is some
repeatedly add edges of minimum weight until the forest member of that set.
becomes a tree. Operations:
I Make-Set(x): Creates new set whose only member is x. The
Algorithm Kruskal(G, W ) representative is x.
1. F ← ∅ I Union(x, y ): Unites set Sx containing x and set Sy containing y
2. for all e ∈ E in the order of increasing weight do into a new set S and removes Sx and Sy from the collection.
3. if the endpoints of e belong to differ- I Find-Set(x): Returns representative of the set holding x.
ent connected components of (V , F )
then
4. F ← F ∪ {e}
5. return tree with edge set F
Implementation of Kruskal’s Algorithm Amortized Analysis
Want to analyse . . .
Algorithm Kruskal(G, W )
1. F ← 0 Given a sequence of m “Disjoint Set” operations
2. for all vertices v of G do (Make-Set, Union, Find-Set), bound the total running
3. Make-Set(v ) time to perform a sequence of m operations.
4. sort edges of G into non-decreasing order by weight
I Usually parametrize in terms of n (no. of Make-Set operations),
5. for all edges (u, v ) of G in non-decreasing order by weight do
not just m.
6. if Find-Set(u) 6= Find-Set(v ) then
7. F ← F ∪ {(u, v )} I (Of course) the bound we get will depend on the particular data
8. Union(u, v ) structure/implementation we design for this problem.
9. return F
Analysis of Kruskal Linked List Implementation of Disjoint Sets

Linked List Implementation of Disjoint Sets
Let n be the number of vertices and m the number of edges of the input Each element represented by a pointer to a cell:
graph
Each element represented by a pointer to a cell:
I Line 1, line 9: Θ(1)
I Loop in Lines 2–3: Θ(n · TMake-Set (n))
I Line 4: Θ(m lg m)
x
I Loop in Lines 5–8: Θ m · TFind-Set (n) + n · TUnion (n) .
Overall:
Use a linked list for each set.

Θ n TMake-Set (n) + TUnion (n) + m lg m + TFind-Set (n)
I Representative
Use of the
a linked list for set
eachis at the head of the list.
set.
With efficient implementation of disjoint set this amounts to I Each cell has a pointer
Representative of thedirect
set istoat
thethe
representative
head of the (head
list.of the list).
T (n, m) = Θ(m lg m). I Each cell has a pointer direct to the representative (head of the list).
A&DS Lecture 11 8 Mary Cryan
Example
Example Example (cont’d)
Example (cont’d)
Linked list representation of
Linked list representation of
{ a, f }, { b }, { g, c, e }, {d} :
a f
a f
b b
g c e g c e
d
d
The representatives are a, b, g and d.
The representatives are a, b, g and d. U NION (g, b)
Analysis of Linked List Implementation Conventions for Further Analysis

Make-Set: constant time. Express running time in terms of:
Find-Set: constant time. n : the number of Make-Set operations,
Union: Naive implementation of m : the number of Make-Set, Union and Find-Set
operations.
Union(x, y )
Note
appends x’s list onto end of y ’s list. 1. After n − 1 Union operations only one set remains.
Assumption: Representative y of each set has attribute 2. m ≥ n.
last[y ]: a pointer to last cell of y ’s list. (not shown)
Snag: have to update “representative pointer” in each cell
of x’s list to point to the representative (head) of y ’s list.
Cost is:
Θ(length of x’s list).
A nasty example The Forest Implementation of Disjoint-Sets
The Forest Implementation of Disjoint-Sets
Take: n = dm/2e + 1, q = m − n = bm/2c − 1. Each set represented by a rooted tree:
Elements: x1 , x2 , . . . , xn .
Each set represented by a rooted tree:
Operation Number of objects updated
Make-Set(x1 ) 1
Make-Set(x2 ) 1 i g b
. .
. .
Make-Set(xn ) 1
Union(x1 , x2 ) 1 f h e a
Union(x2 , x3 ) 2
Union(x3 , x4 ) 3
. .
. .
c d
Union(xq−1 , xq ) q−1
Total Θ(m2 )
ADS: lect 12 – slide 13 – 5th Nov, 2010 A&DS Lecture 11

ADS:
15
lect 12 – slide 15 – 5th Nov, 2010Mary Cryan
The Weighted-Union Heuristic Basic Operations
IDEA Record length of each list. To execute Make-Set: Constant time.

Find-Set: Follow pointers to root. (Path followed is called the find
Union(x, y ) path.) Cost proportional to height of tree.
append shorter list to longer one (breaking ties arbitrarily). Union: Naive strategy: root of tree of x made to point to that
of y . Cost proportional to height of x’s tree plus the height
Theorem 1 of y ’s tree.
Using the linked-list representation of disjoint sets and the weighted-union
heuristic, a sequence of m Make-Set, Union & Find-Set operations, n of Not faster than linked list implementation.
which are Make-Set operations, takes
O(m + n lg n)
time.
Crucial Proof Idea: Each element can appear at most lg n times in the shorter
list of a Union. (because if x is in the shortest list, the union operation doubles
the size of x’s list).
Improving the Running Time
f b g d
General Strategy
rank [f] = 0 rank [d] = 0
Keep trees low.
a e c
U NION (f, d) rank [g] = 1 rank [d] = 1
Two Heuristics
1. Union-by-Rank: Attach lower tree to the root of b g d
higher tree in Union
2. Path Compression: Update tree during Find-Set a e c f
operations. U NION (f, g) rank [g] = 2
b g
a d e c
f
The Union-by-Rank Heuristic The Height of the Trees

At each node x maintain a variable rank [x], such that if x is the root of Lemma 2
its tree, rank [x] is the height of this tree. The height of a tree (in our datastructure) is at most
lg(size of the tree).

In executing
Union(x, y ) Proof By induction on the number of Union operations it is proved that a tree of
height h has size at least 2h .
make the tree of smaller rank point to the one with larger rank.
I As long as there are no Unions, all trees have height 0 and contain 1 = 20 node.
(Break ties arbitrarily.)
I Suppose Union(x, y ) is executed. Let rx and ry be the roots of the trees of x
and y , resp.
Case 1: rank (rx ) < rank (ry ).
Then rx becomes child of ry , and the height remains unchanged, but the number
of nodes increases.
Case 2: rank (rx ) > rank (ry ). Analogously.
Case 3: rank (rx ) = rank (ry ).
Then height increases by one and the size of the new tree is
size(rx ) + size(ry ) ≥ 2rank (rx ) + 2rank (ry ) = 2rank (rx )+1 ,
which is the height of the new tree.
Running Time with Union-by-Rank Only Example
Example
Make-Set: constant time.
Find-Set: Bounded by rank: O(log n). f
Union: Bounded by rank: O(log n). e
d f
Bottom line (m operations, of which n are Make-Set): c

a b c d e
O(m log n). b
NOT any better than Linked-Lists with Weighted union heuristic. a
F IND -S ET (a)
A&DS Lecture 11 Find-Set(a)

23 Mary Cryan
The Path-Compression Heuristics The Ackermann Function

Idea Ackerman’s Function
When performing Function A : N × N → N defined by
Find-Set(x)
A(1, j) = 2j for j ≥ 1
make each vertex on find path point to root. A(i, 1) = A(i − 1, 2) for i ≥ 2
A(i, j) = A(i − 1, A(i, j − 1)) for i, j ≥ 2
Implementation
Other variants exist—last line of definition is crucial
common point.
Algorithm Find-Set(x)
1. if x 6= π(x) then
2. π(x) ← Find-Set(π(x)) Inverse
3. return π(x) Not true mathematical inverse—grows as slowly as A(i, j)
grows fast:
α(m, n) = min{i ≥ 1 | A(i, bm/nc) > lg n}.
Small table of A(i, j) Reading
[CLRS] Chapter 21 or [CLR] Chapter 22
j =1 j =2 j =3 j =4 [CLRS] Chapter 23 This is Chapter 24 of [CLR].
i =1 21 22 23 24
2
2 22 22
i =2 22 22 22 22 Chapters and Exercises have same labels in both editions of [CLRS]
·2 ·2 ·2 ·2 ·2 ·2
·· ·· ·· ·· ·· ··
2 16 2 2 16 2 2 2 16
22
i =3 2 2 2 2
Historical importance of A(i, j): Showed that the class of primitive recursive
functions does not include all computable functions.
A(i, j) grows faster than any primitive recursive function.
Anaysis of Disjoint Forests with both Heuristics Problems

1. Exercise 23.2-1 of [CLRS] Ex. 24.2-1 of [CLR].
Theorem 3
2. Suppose that all edge weights in a graph G are integers in the
Using both the union-by-rank and path compression heuristic, the
range 1 to |V |. How fast can you make Kruskal’s algorithm run in
worst-case runtime for disjoint forests is
this case?
O(m α(m, n)). What if the edge weights are integers in the range from 1 to C , for
some constant C ?
This is Exercise 23.2-4 of [CLRS] Ex. 24.2-4 of [CLR].
Remark
3. Exercise 21.2-2 of [CLRS]. 22.2-2 of [CLR].
Ackermann’s Function grows really really fast, so the
inverse α grows really really slowly. 4. Exercise 21.3-1 of [CLRS]. 22.3-1 of [CLR].
For all practical purposes α(m, n) ≤ 4.

Correctness of Kruskal's Algorithm: Operations

Uploaded by

Copyright:

Available Formats

You might also like

Correctness of Kruskal's Algorithm: Operations

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Correctness of Kruskal's Algorithm: Operations

Uploaded by

Copyright:

Available Formats

Correctness of Kruskal’s algorithm

1. Throughout the execution of Kruskal, (V , F ) remains a spanning forest.

Kruskal’s Algorithm Data Structures for Disjoint Sets

Analysis of Kruskal Linked List Implementation of Disjoint Sets

A&DS Lecture 11 8 Mary Cryan

Analysis of Linked List Implementation Conventions for Further Analysis

ADS: lect 12 – slide 13 – 5th Nov, 2010 A&DS Lecture 11

The Weighted-Union Heuristic Basic Operations

IDEA Record length of each list. To execute Make-Set: Constant time.

The Union-by-Rank Heuristic The Height of the Trees

lg(size of the tree).

Union: Bounded by rank: O(log n). e

Bottom line (m operations, of which n are Make-Set): c

O(m log n). b

NOT any better than Linked-Lists with Weighted union heuristic. a

A&DS Lecture 11 Find-Set(a)

The Path-Compression Heuristics The Ackermann Function

α(m, n) = min{i ≥ 1 | A(i, bm/nc) > lg n}.

Anaysis of Disjoint Forests with both Heuristics Problems

You might also like