Graph Traversals and Minimum Spanning Trees: 15-211: Fundamental Data Structures and Algorithms Rose Hoberman


Graph Traversals and Minimum Spanning Trees

15-211: Fundamental Data Structures and Algorithms

Rose Hoberman
April 8, 2003
Announcements

Readings:
• Chapter 14

HW5:
• Due in less than one week!
• Monday, April 14, 2003, 11:59pm

Today

• More Graph Terminology (some review)


• Topological sort
• Graph Traversals (BFS and DFS)
• Minimum Spanning Trees
• After Class... Before Recitation

Graph Terminology
Paths and cycles

• A path is a sequence of nodes v_1, v_2, …, v_N such that (v_i, v_{i+1}) ∈ E for 0 < i < N
– The length of the path is N-1.
– Simple path: all v_i are distinct, 0 < i < N
• A cycle is a path such that v_1 = v_N
– An acyclic graph has no cycles

Cycles

[Figure: graph on the airports BOS, DTW, SFO, PIT, JFK, and LAX, illustrating cycles]

More useful definitions

• In a directed graph:
– The indegree of a node v is the number of distinct edges (w,v) ∈ E.
– The outdegree of a node v is the number of distinct edges (v,w) ∈ E.
• A node with indegree 0 is a root.
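
As a minimal sketch of these definitions, the following Python computes indegrees, outdegrees, and roots from a directed edge list. The function names and the three-edge example graph are hypothetical, chosen only for illustration.

```python
from collections import defaultdict

def degrees(edges):
    """Return (indegree, outdegree) dicts for a list of directed edges (v, w)."""
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    for v, w in set(edges):      # set(): count distinct edges only
        outdegree[v] += 1
        indegree[w] += 1
        indegree[v] += 0         # make sure both endpoints appear in both dicts
        outdegree[w] += 0
    return dict(indegree), dict(outdegree)

def roots(edges):
    """Nodes with indegree 0."""
    indegree, _ = degrees(edges)
    return sorted(v for v, d in indegree.items() if d == 0)

# Hypothetical example: a -> b, a -> c, b -> c
example_edges = [("a", "b"), ("a", "c"), ("b", "c")]
indeg, outdeg = degrees(example_edges)
```

Here `a` has indegree 0, so it is the only root of the example graph.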

Trees are graphs

• A dag is a directed acyclic graph.

• A tree is a connected acyclic undirected graph.

• A forest is an acyclic undirected graph (not necessarily connected), i.e., each connected component is a tree.

Example DAG

[Figure: DAG of getting-dressed tasks: Undershorts, Socks, Watch, Pants, Shoes, Shirt, Belt, Tie, Jacket]

A DAG implies an ordering on events.

Example DAG

[Figure: DAG of getting-dressed tasks: Undershorts, Socks, Watch, Pants, Shoes, Shirt, Belt, Tie, Jacket]

In a complex DAG, it can be hard to find a schedule that obeys all the constraints.
Topological Sort

• For a directed acyclic graph G = (V,E)
• A topological sort is an ordering of all of G’s vertices v_1, v_2, …, v_n such that...

Formally: for every edge (v_i, v_k) in E, i < k.

Visually: all arrows are pointing to the right

Topological sort

• There are often many possible topological sorts of a given DAG
• Topological orders for this DAG:
– 1,2,5,4,3,6,7
– 2,1,5,4,7,3,6
– 2,5,1,4,7,3,6
– Etc.

[Figure: DAG on vertices 1 through 7, drawn in rows 1 2 / 3 4 5 / 6 7]

• Each topological order is a feasible schedule.

Topological Sorts for Cyclic Graphs?

Impossible!

[Figure: a cycle on vertices 1, 2, 3]

• If v and w are two vertices on a cycle, there exist paths from v to w and from w to v.
• Any ordering will contradict one of these paths

Topological sort algorithm

• Algorithm
– Assume indegree is stored with each node.
– Repeat until no nodes remain:
• Choose a root and output it.
• Remove the root and all its edges.

• Performance
– O(V^2 + E), if linear search is used to find a root.

Better topological sort
• Algorithm:
– Scan all nodes, pushing roots onto a stack.
– Repeat until stack is empty:
• Pop a root r from the stack and output it.
• For all nodes n such that (r,n) is an edge, decrement n’s
indegree. If 0 then push onto the stack.
• O(V + E), so still O(V^2) in the worst case (a dense graph), but better for sparse graphs.

• Q: Why is this algorithm correct?
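
The stack-based procedure above can be sketched in Python as follows. This is an illustrative sketch, not the course's reference code: the adjacency-dict representation, the function name, and the cycle check at the end are assumptions added here.

```python
def topological_sort(graph):
    """graph: dict mapping every node to a list of its successors.

    Returns one topological order, or raises ValueError if a cycle
    prevents every node from being output.
    """
    # Compute the indegree stored with each node.
    indegree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            indegree[w] += 1

    # Scan all nodes, pushing roots onto a stack.
    stack = [v for v in graph if indegree[v] == 0]
    order = []
    while stack:
        r = stack.pop()              # pop a root and output it
        order.append(r)
        for n in graph[r]:           # "remove" r's outgoing edges
            indegree[n] -= 1
            if indegree[n] == 0:
                stack.append(n)

    if len(order) != len(graph):     # some node never reached indegree 0
        raise ValueError("graph contains a cycle")
    return order

# Hypothetical example DAG
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(topological_sort(dag))
```

Each node is pushed and popped once and each edge is examined once, which is where the O(V + E) bound comes from.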

Correctness

• Clearly any ordering produced by this algorithm is a topological order

But...

• Does every DAG have a topological order, and if so, is this algorithm guaranteed to find one?

Quiz Break
Quiz
• Prove:
– This algorithm never gets stuck, i.e. if there
are unvisited nodes then at least one of
them has an indegree of zero.

• Hint:
– Prove that if at any point there are unseen
vertices but none of them have an indegree
of 0, a cycle must exist, contradicting our
assumption of a DAG.

Proof

• See Weiss page 476.

Graph Traversals

• Both BFS and DFS take O(V + E) time

Use of a stack

• It is very common to use a stack to keep track of:
– nodes to be visited next, or
– nodes that we have already visited.
• Typically, use of a stack leads to a depth-first visit order.
• Depth-first visit order is “aggressive” in the sense that it examines complete paths.
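
A minimal sketch of a stack-driven depth-first visit, assuming the graph is given as a dict of adjacency lists (the representation, function name, and example graph are illustrative assumptions):

```python
def dfs_order(graph, start):
    """Return nodes reachable from start in depth-first visit order."""
    visited = set()
    order = []
    stack = [start]                  # nodes to be visited next
    while stack:
        v = stack.pop()
        if v in visited:
            continue                 # already handled via another path
        visited.add(v)
        order.append(v)
        # Push in reverse so the first-listed neighbor is explored first.
        for w in reversed(graph[v]):
            if w not in visited:
                stack.append(w)
    return order

# Hypothetical example graph
g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs_order(g, "a"))   # ['a', 'b', 'd', 'c']
```

The visit follows the path a, b, d to its end before backing up to c, which is the "aggressive" behavior described above.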

Topological Sort as DFS

• do a DFS of graph G
• as each vertex v is “finished” (all of its children processed), insert it onto the front of a linked list
• return the linked list of vertices

• why is this correct?
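
The DFS-based procedure can be sketched as below. It is a hedged illustration: a `collections.deque` stands in for the linked list, the recursive helper `visit` is a hypothetical name, and the input is assumed to be a DAG (no cycle detection is attempted).

```python
from collections import deque

def dfs_topological_sort(graph):
    """graph: dict mapping every node to a list of its successors (a DAG)."""
    finished = deque()               # plays the role of the linked list
    visited = set()

    def visit(v):
        visited.add(v)
        for w in graph[v]:
            if w not in visited:
                visit(w)
        finished.appendleft(v)       # v is finished: insert at the front

    for v in graph:                  # cover every component of the graph
        if v not in visited:
            visit(v)
    return list(finished)

# Hypothetical example DAG
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs_topological_sort(dag))   # ['a', 'c', 'b', 'd']
```

Recursion keeps the sketch short; for very deep graphs an explicit stack would avoid Python's recursion limit.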

Use of a queue

• It is very common to use a queue to keep track of:
– nodes to be visited next, or
– nodes that we have already visited.
• Typically, use of a queue leads to a breadth-first visit order.
• Breadth-first visit order is “cautious” in the sense that it examines every path of length i before going on to paths of length i+1.
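
A minimal sketch of a queue-driven breadth-first visit, assuming a dict-of-adjacency-lists graph (the function name and example graph are hypothetical). It records each node's distance from the start, which makes the level-by-level order visible.

```python
from collections import deque

def bfs_levels(graph, start):
    """Return a dict mapping each reachable node to its distance from start."""
    dist = {start: 0}                # also serves as the "already visited" set
    queue = deque([start])           # nodes to be visited next
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:        # first time w is reached: shortest path
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

# Hypothetical example graph
g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(bfs_levels(g, "a"))   # {'a': 0, 'b': 1, 'c': 1, 'd': 2}
```

All nodes at distance 1 are dequeued before any node at distance 2, matching the "cautious" description above.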

Graph Searching ???

• Graph as state space (node = state, edge = action)


• For example, game trees, mazes, ...
• BFS and DFS each search the state space for a best
move. If the search is exhaustive they will find the same
solution, but if there is a time limit and the search space is
large...
• DFS explores a few possible moves, looking at the effects
far in the future
• BFS explores many solutions but only sees effects in the
near future (often finds shorter solutions)

Minimum Spanning Trees

Problem: Laying Telephone Wire

[Figure: a central office and the customers to be connected]

Wiring: Naïve Approach

[Figure: naïve wiring from the central office to the customers]

Expensive!

Wiring: Better Approach

[Figure: improved wiring from the central office to the customers]

Minimize the total length of wire connecting the customers

Minimum Spanning Tree (MST)
(see Weiss, Section 24.2.2)

A minimum spanning tree is a subgraph of an undirected weighted graph G, such that

• it is a tree (i.e., it is connected and acyclic)
• it covers all the vertices V
– contains |V| - 1 edges
• the total cost associated with tree edges is the minimum among all possible spanning trees
• not necessarily unique

Applications of MST
• Any time you want to visit all vertices in a graph at
minimum cost (e.g., wire routing on printed circuit boards,
sewer pipe layout, road planning…)

• Internet content distribution
– $$$, also a hot research topic
– Idea: publisher produces web pages, content distribution network replicates web pages to many locations so consumers can access at higher speed
– MST may not be good enough!
• content distribution on minimum cost tree may take a long time!

• Provides a heuristic for traveling salesman problems. When edge weights satisfy the triangle inequality, the optimum traveling salesman tour is at most twice the length of the minimum spanning tree (why??)
How Can We Generate a MST?

[Figure: example weighted undirected graph on vertices a, b, c, d, e with edge weights 2, 4, 4, 5, 5, 5, 6, and 9, shown twice]

Prim’s Algorithm
Initialization
a. Pick a vertex r to be the root
b. Set D(r) = 0, parent(r) = null
c. For all vertices v ∈ V, v ≠ r, set D(v) = ∞
d. Insert all vertices into priority queue P,
using distances as the keys

[Figure: the example graph on vertices a, b, c, d, e]

Priority queue (vertex: D): e: 0, a: ∞, b: ∞, c: ∞, d: ∞

Vertex  Parent
e       -
Prim’s Algorithm
While P is not empty:

1. Select the next vertex u to add to the tree
   u = P.deleteMin()

2. Update the weight of each vertex w adjacent to u which is not in the tree (i.e., w ∈ P)
   If weight(u,w) < D(w),
   a. parent(w) = u
   b. D(w) = weight(u,w)
   c. Update the priority queue to reflect the new distance for w
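
The initialization and loop above can be sketched in Python as follows. This is an illustration under stated assumptions: the priority queue is modeled as a plain set scanned linearly for its minimum, the helper names are hypothetical, and the edge weights are those read off the lecture's example graph (treat them as an assumption if the figure differs).

```python
import math

# Example graph; weights as read from the lecture's figure (an assumption).
EDGES = [("a", "d", 2), ("c", "d", 4), ("d", "e", 4), ("a", "c", 5),
         ("b", "e", 5), ("c", "e", 5), ("b", "d", 6), ("a", "b", 9)]

def build_adjacency(edges):
    """Undirected adjacency map: adj[u][v] = weight(u, v)."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def prim(adj, root):
    """Return (parent, D) describing a minimum spanning tree rooted at root."""
    D = {v: math.inf for v in adj}       # D(v) = infinity for all v != r
    parent = {v: None for v in adj}      # parent(r) = null
    D[root] = 0
    P = set(adj)                         # vertices not yet in the tree
    while P:
        u = min(P, key=lambda v: D[v])   # deleteMin by linear scan
        P.remove(u)
        for w, weight in adj[u].items():
            if w in P and weight < D[w]:
                parent[w] = u
                D[w] = weight            # set-based "queue" needs no reordering
    return parent, D

parent, D = prim(build_adjacency(EDGES), "e")
print(parent)            # parents: d <- e, b <- e, c <- d, a <- d
print(sum(D.values()))   # total tree weight: 15
```

Starting from e, the sketch produces the same parent assignments as the worked trace in the lecture (b and d attach to e, a and c attach to d).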

Prim’s algorithm
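[Figure: the example graph]

Before this step:
  Priority queue (vertex: D): e: 0, d: ∞, b: ∞, c: ∞, a: ∞
  Parent: e: -, b: -, c: -, d: -

After removing e and updating its neighbors:
  Priority queue (vertex: D): d: 4, b: 5, c: 5, a: ∞
  Parent: e: -, b: e, c: e, d: e

The MST initially consists of the vertex e, and we update the distances and parents for its adjacent vertices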

Prim’s algorithm
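[Figure: the example graph]

Before this step:
  Priority queue (vertex: D): d: 4, b: 5, c: 5, a: ∞
  Parent: e: -, b: e, c: e, d: e

After removing d and updating its neighbors:
  Priority queue (vertex: D): a: 2, c: 4, b: 5
  Parent: e: -, b: e, c: d, d: e, a: d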

Prim’s algorithm
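[Figure: the example graph]

Before this step:
  Priority queue (vertex: D): a: 2, c: 4, b: 5
  Parent: e: -, b: e, c: d, d: e, a: d

After removing a and updating its neighbors:
  Priority queue (vertex: D): c: 4, b: 5
  Parent: e: -, b: e, c: d, d: e, a: d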

Prim’s algorithm
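[Figure: the example graph]

Before this step:
  Priority queue (vertex: D): c: 4, b: 5
  Parent: e: -, b: e, c: d, d: e, a: d

After removing c and updating its neighbors:
  Priority queue (vertex: D): b: 5
  Parent: e: -, b: e, c: d, d: e, a: d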

Prim’s algorithm
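[Figure: the example graph]

Before this step:
  Priority queue (vertex: D): b: 5
  Parent: e: -, b: e, c: d, d: e, a: d

After removing b, the priority queue is empty:
  Parent: e: -, b: e, c: d, d: e, a: d

The final minimum spanning tree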

Running time of Prim’s algorithm
(without heaps)
Initialization of priority queue (array): O(|V|)

Update loop: |V| calls
• Choosing vertex with minimum cost edge: O(|V|)
• Updating distance values of unconnected vertices: each edge is considered only once during entire execution, for a total of O(|E|) updates

Overall cost without heaps: O(|E| + |V|^2)

When heaps are used, apply same analysis as for Dijkstra’s algorithm (p.469) (good exercise)
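
For comparison, here is a hedged sketch of a heap-based variant using Python's `heapq`. Note the swapped-in technique: `heapq` has no decrease-key operation, so instead of updating a vertex's key in place this sketch pushes a new entry and skips stale ones when they are popped ("lazy deletion"). The function name and the adjacency map (weights as read from the lecture's example graph) are assumptions.

```python
import heapq

# Example graph as an adjacency map; weights as read from the lecture's figure.
ADJ = {
    "a": {"d": 2, "c": 5, "b": 9},
    "b": {"e": 5, "d": 6, "a": 9},
    "c": {"d": 4, "a": 5, "e": 5},
    "d": {"a": 2, "c": 4, "e": 4, "b": 6},
    "e": {"d": 4, "b": 5, "c": 5},
}

def prim_heap(adj, root):
    """Return (parent, total_weight) of a minimum spanning tree."""
    in_tree = set()
    parent = {}
    total = 0
    heap = [(0, root, None)]             # (edge weight, vertex, tree neighbor)
    while heap:
        weight, u, p = heapq.heappop(heap)
        if u in in_tree:
            continue                     # stale entry: u was added earlier
        in_tree.add(u)
        parent[u] = p
        total += weight
        for w, wt in adj[u].items():
            if w not in in_tree:
                heapq.heappush(heap, (wt, w, u))
    return parent, total

parent, total = prim_heap(ADJ, "e")
print(total)   # 15
```

With lazy deletion the heap can hold up to O(|E|) entries, so this sketch runs in O(|E| log |E|), which is also O(|E| log |V|).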

Prim’s Algorithm Invariant

• At each step, we add the edge (u,v) s.t. the weight of (u,v) is minimum among all edges where u is in the tree and v is not in the tree

• Each step maintains a minimum spanning tree of the vertices that have been included thus far

• When all vertices have been included, we have a MST for the graph!

Correctness of Prim’s
• This algorithm adds n-1 edges without creating a cycle,
so clearly it creates a spanning tree of any connected
graph (you should be able to prove this).

But is this a minimum spanning tree?


Suppose it wasn't.

• There must be a point at which it fails, and in particular there must be a single edge whose insertion first prevented the spanning tree from being a minimum spanning tree.

Correctness of Prim’s

• Let G be a connected, undirected graph
• Let S be the set of edges chosen by Prim’s algorithm before choosing an erroneous edge (x,y)
• Let V' be the vertices incident with edges in S
• Let T be a MST of G containing all edges in S, but not (x,y).

[Figure: the tree built so far, with the edge (x,y)]
Correctness of Prim’s

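[Figure: the path in T from x to y, which includes an edge (v,w), together with the edge (x,y)]

• T is connected, so there must be a path in T from x to y; since edge (x,y) is not in T, this path does not use it.
• Inserting edge (x,y) into T will create a cycle
• Besides (x,y), there is at least one edge on this cycle with exactly one vertex in V'; call this edge (v,w)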
Correctness of Prim’s

• Since Prim’s chose (x,y) over (v,w), w(v,w) >= w(x,y).


• We could form a new spanning tree T’ by swapping (x,y)
for (v,w) in T (prove this is a spanning tree).
• w(T’) is clearly no greater than w(T)
• But that means T’ is a MST
• And yet it contains all the edges in S, and also (x,y)

...Contradiction

Another Approach
• Create a forest of trees from the vertices
• Repeatedly merge trees by adding “safe edges”
until only one tree remains
• A “safe edge” is an edge of minimum weight which
does not create a cycle

[Figure: the example graph]

forest: {a}, {b}, {c}, {d}, {e}

Kruskal’s algorithm
Initialization
a. Create a set for each vertex v ∈ V
b. Initialize the set of “safe edges” A
comprising the MST to the empty set
c. Sort edges by increasing weight

[Figure: the example graph]

F = {a}, {b}, {c}, {d}, {e}
A = ∅
E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}

Kruskal’s algorithm
For each edge (u,v) ∈ E in increasing order of weight, while more than one set remains:
   If u and v belong to different sets U and V
   a. add edge (u,v) to the safe edge set
      A = A ∪ {(u,v)}
   b. merge the sets U and V
      F = F - U - V + (U ∪ V)

Return A

• Running time bounded by sorting (or findMin)
• O(|E|log|E|), or equivalently, O(|E|log|V|) (why???)
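
The loop above can be sketched in Python as follows. The slides do not say how the sets are stored; this sketch assumes a simple union-find (disjoint-set) structure with path halving, and the function name and the edge weights (as read from the lecture's example graph) are likewise assumptions.

```python
def kruskal(vertices, edges):
    """edges: list of (u, v, weight). Returns the list A of chosen edges."""
    leader = {v: v for v in vertices}    # one set per vertex initially

    def find(v):
        """Return the representative of v's set, shortening the path as we go."""
        while leader[v] != v:
            leader[v] = leader[leader[v]]
            v = leader[v]
        return v

    A = []                               # the set of "safe edges"
    for u, v, w in sorted(edges, key=lambda e: e[2]):   # increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                     # u and v are in different sets
            A.append((u, v, w))
            leader[ru] = rv              # merge the two sets
            if len(A) == len(vertices) - 1:
                break                    # only one set remains
    return A

# Weights as read from the lecture's example graph (an assumption).
EDGES = [("a", "d", 2), ("c", "d", 4), ("d", "e", 4), ("a", "c", 5),
         ("b", "e", 5), ("c", "e", 5), ("b", "d", 6), ("a", "b", 9)]
mst = kruskal("abcde", EDGES)
print(mst)                        # [('a','d',2), ('c','d',4), ('d','e',4), ('b','e',5)]
print(sum(w for _, _, w in mst))  # 15
```

Because Python's `sorted` is stable, ties are broken by the order of the input list, so the sketch selects (a,d), (c,d), (d,e), (b,e) in the same order as the edge list E given in the lecture.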

Kruskal’s algorithm
[Figure: the example graph]

E = {(a,d), (c,d), (d,e), (a,c), (b,e), (c,e), (b,d), (a,b)}

Forest                     A
{a}, {b}, {c}, {d}, {e}    ∅
{a,d}, {b}, {c}, {e}       {(a,d)}
{a,d,c}, {b}, {e}          {(a,d), (c,d)}
{a,d,c,e}, {b}             {(a,d), (c,d), (d,e)}
{a,d,c,e,b}                {(a,d), (c,d), (d,e), (b,e)}

Kruskal’s Algorithm Invariant

• After each iteration, every tree in the forest is a MST of the vertices it connects

• Algorithm terminates when all vertices are connected into one tree

Correctness of Kruskal’s
• This algorithm adds n-1 edges without creating a
cycle, so clearly it creates a spanning tree of any
connected graph (you should be able to prove this).

But is this a minimum spanning tree?


Suppose it wasn't.

• There must be a point at which it fails, and in particular there must be a single edge whose insertion first prevented the spanning tree from being a minimum spanning tree.

Correctness of Kruskal’s

[Figure: the edge sets K and T, both containing S, with edge e in K but not in T]

• Let e be this first erroneous edge.
• Let K be the Kruskal spanning tree
• Let S be the set of edges chosen by Kruskal’s algorithm before choosing e
• Let T be a MST containing all edges in S, but not e.

Correctness of Kruskal’s
Lemma: w(e’) >= w(e) for all edges e’ in T - S

Proof (by contradiction):

• Assume there exists some edge e’ in T - S, w(e’) < w(e)
• Kruskal’s must have considered e’ before e
• However, since e’ is not in K (why??), it must have been discarded because it caused a cycle with some of the other edges in S.
• But e’ + S is a subgraph of T, which means it cannot form a cycle ...Contradiction

[Figure: the edge sets K and T, both containing S, with edge e]
Correctness of Kruskal’s

• Inserting edge e into T will create a cycle


• There must be an edge on this cycle which is not in K (why??).
Call this edge e’
• e’ must be in T - S, so (by our lemma) w(e’) >= w(e)
• We could form a new spanning tree T’ by swapping e for e’ in T
(prove this is a spanning tree).
• w(T’) is clearly no greater than w(T)
• But that means T’ is a MST
• And yet it contains all the edges in S, and also e
...Contradiction

Greedy Approach

• Like Dijkstra’s algorithm, both Prim’s and Kruskal’s algorithms are greedy algorithms

• The greedy approach works for the MST problem; however, it does not work for many other problems!

That’s All!
