lecture#5
Hamiltonian Path Approach
• S = { ATG AGG TGC TCC GTC GGT GCA CAG }
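• Graph H has one vertex per 3-mer: ATG AGG TGC TCC GTC GGT GCA CAG
• Hamiltonian path: ATG -> TGC -> GCA -> CAG -> AGG -> GGT -> GTC -> TCC,
which spells ATGCAGGTCC
• The path need not be unique: for S = { ATG TGC GCG CGT GTG TGG GGC GCA }
there are two Hamiltonian paths
• Path 1: ATGCGTGGCA
• Path 2: ATGGCGTGCA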
SBH as Eulerian path
• The Hamiltonian path approach works in principle, but it is impractical
because finding a Hamiltonian path is NP-complete
• Instead, use the Eulerian path approach
• Eulerian path: A path through a graph that traverses
every edge exactly once
• Construct graph with all (k-1)-mers as vertices
• For each k-mer in spectrum, add edge from vertex
representing first k-1 characters to vertex
representing last k-1 characters
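The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `de_bruijn_graph` and the adjacency-list representation are assumptions, and the input is the spectrum S from the Hamiltonian path slide.

```python
from collections import defaultdict

def de_bruijn_graph(spectrum):
    """Map each (k-1)-mer to the list of (k-1)-mers it has an edge to."""
    graph = defaultdict(list)
    for kmer in spectrum:
        prefix, suffix = kmer[:-1], kmer[1:]  # first and last k-1 characters
        graph[prefix].append(suffix)          # one directed edge per k-mer
        graph[suffix]                         # make sure the end vertex exists too
    return dict(graph)

spectrum = ["ATG", "AGG", "TGC", "TCC", "GTC", "GGT", "GCA", "CAG"]
graph = de_bruijn_graph(spectrum)
for vertex in sorted(graph):
    print(vertex, "->", graph[vertex])
```

Every k-mer contributes exactly one edge, so the number of edges equals the size of the spectrum, while the number of vertices depends on how many distinct (k-1)-mers occur.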
SBH as Eulerian path
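• Genome: AAABBBBA
• k = 3
• 3-mers: AAA, AAB, ABB, BBB, BBB, BBA
• Construct graph with all (k-1)-mers as vertices
• k - 1 = 2, so split each 3-mer into its left and right 2-mers
• AAA: AA, AA
• AAB: AA, AB
• ABB: AB, BB
• BBB: BB, BB
• BBB: BB, BB
• BBA: BB, BA
• Start making nodes, one per distinct 2-mer: AA, AB, BB, BA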
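The same steps for AAABBBBA, written as a short Python sketch. The helper names `kmers` and `split_kmer` are hypothetical and chosen only for this illustration.

```python
def kmers(genome, k):
    """All length-k substrings of genome, in order, duplicates kept."""
    return [genome[i:i + k] for i in range(len(genome) - k + 1)]

def split_kmer(kmer):
    """Left and right (k-1)-mers of a k-mer."""
    return kmer[:-1], kmer[1:]

reads = kmers("AAABBBBA", 3)
edges = [split_kmer(read) for read in reads]           # one edge per 3-mer
nodes = sorted({node for edge in edges for node in edge})  # distinct 2-mers

print(reads)
print(edges)
print(nodes)
```

The repeated 3-mer BBB yields two parallel edges from BB to BB, which is why the graph is a multigraph rather than a simple graph.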
Reconstruct Genome
• One edge per k-mer (for each k-mer there is one directed edge)
• One node per distinct (k-1)-mer
• Reconstruction involves a walk through the graph.
• Move from node to node following the direction of the edges.
• A walk that crosses each edge exactly once is called an Eulerian walk.
Reads -> Genome
Different sequences – the same spectrum
• Different sequences may have the same spectrum:
Spectrum(GTATCT, 2) = Spectrum(GTCTAT, 2) = {AT, CT, GT, TA, TC}
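A quick way to check this claim is to compute both spectra as sets. The function name `spectrum` is an assumption for this sketch; it simply collects every length-l substring.

```python
def spectrum(sequence, l):
    """Set of all l-mers occurring in sequence."""
    return {sequence[i:i + l] for i in range(len(sequence) - l + 1)}

first = spectrum("GTATCT", 2)
second = spectrum("GTCTAT", 2)
print(sorted(first))
print(first == second)
```

Because the spectrum is a set, it records which l-mers occur but not their order, which is exactly why two different sequences can share it.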
Eulerian Path Approach
• S = { ATG, TGC, GCG, CGT, GTG, TGG, GGC, GCA }
• Vertices correspond to (k– 1)–mers : { AT, TG, GC, GG, GT, CA, CG }
• Edges correspond to k–mers from S
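[Figure: the graph on vertices AT, TG, GC, CA, GT, CG, GG, drawn twice with a different Eulerian path highlighted in each copy]
• The two Eulerian paths spell ATGGCGTGCA and ATGCGTGGCA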
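To see how a walk through the (k-1)-mer vertices turns back into a sequence, the sketch below spells out two vertex paths. The function name `spell` and the two explicit vertex orders are assumptions worked out from the 3-mers in S; each path uses every 3-mer of S exactly once as an edge.

```python
def spell(path):
    """Sequence spelled by a walk: first vertex, then the last letter of each later vertex."""
    return path[0] + "".join(node[-1] for node in path[1:])

def edges_used(path):
    """k-mers (edges) traversed by a walk over (k-1)-mer vertices."""
    return sorted(a + b[-1] for a, b in zip(path, path[1:]))

path_a = ["AT", "TG", "GG", "GC", "CG", "GT", "TG", "GC", "CA"]
path_b = ["AT", "TG", "GC", "CG", "GT", "TG", "GG", "GC", "CA"]

print(spell(path_a))
print(spell(path_b))
```

Both walks start at AT and end at CA; they differ only in which way they go around the two loops through TG and GC.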
Properties of Eulerian graphs
• It will be easier to consider Eulerian cycles: Eulerian
paths that form a cycle
• Graphs that have an Eulerian cycle are simply called
Eulerian
• Theorem: A connected graph is Eulerian if and only if
each of its vertices is balanced
• A vertex v is balanced if indegree(v) = outdegree(v)
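The balance condition is easy to test on a directed edge list. The sketch below is illustrative only: the names `degrees` and `all_balanced` are assumptions, and the check covers the degree condition alone, not connectivity.

```python
from collections import Counter

def degrees(edges):
    """Return (indegree, outdegree) counters for a list of directed edges (u, v)."""
    indegree, outdegree = Counter(), Counter()
    for u, v in edges:
        outdegree[u] += 1
        indegree[v] += 1
    return indegree, outdegree

def all_balanced(edges):
    """True if indegree(v) == outdegree(v) for every vertex v."""
    indegree, outdegree = degrees(edges)
    vertices = set(indegree) | set(outdegree)
    return all(indegree[v] == outdegree[v] for v in vertices)

cycle_edges = [("A", "B"), ("B", "C"), ("C", "A")]
path_edges = [("A", "B"), ("B", "C")]
print(all_balanced(cycle_edges))
print(all_balanced(path_edges))
```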
Seven Bridges of Königsberg
Eulerian Path -> Eulerian Cycle
• A vertex is balanced if indegree = outdegree
• If a graph has an Eulerian path starting at s and ending at t, then
• All vertices must be balanced, except for s and t, which may have
|indegree(v) – outdegree(v)| = 1
• If s and t are not balanced, add an edge from t to s to balance them
• The graph now has an Eulerian cycle, which can be converted to an Eulerian path
by removing the added edge
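A minimal sketch of the balancing step, assuming a directed edge list built from the spectrum { ATG, TGC, GCG, CGT, GTG, TGG, GGC, GCA }. The function name `path_endpoints` is hypothetical; it returns the start s (one extra outgoing edge) and the end t (one extra incoming edge), or None when every vertex is already balanced.

```python
from collections import Counter

def path_endpoints(edges):
    """Return (s, t) for an Eulerian path, or None if all vertices are balanced."""
    diff = Counter()  # outdegree minus indegree
    for u, v in edges:
        diff[u] += 1
        diff[v] -= 1
    unbalanced = [v for v, d in diff.items() if d != 0]
    if not unbalanced:
        return None
    starts = [v for v in unbalanced if diff[v] == 1]
    ends = [v for v in unbalanced if diff[v] == -1]
    if len(unbalanced) == 2 and len(starts) == 1 and len(ends) == 1:
        return starts[0], ends[0]
    raise ValueError("degrees do not allow an Eulerian path")

spectrum = ["ATG", "TGC", "GCG", "CGT", "GTG", "TGG", "GGC", "GCA"]
edges = [(kmer[:-1], kmer[1:]) for kmer in spectrum]

s, t = path_endpoints(edges)
balanced_edges = edges + [(t, s)]  # extra edge from t back to s
print(s, t, path_endpoints(balanced_edges))
```

After the extra edge is added, every vertex is balanced, so an Eulerian cycle exists; cutting that cycle at the added edge gives a path from s to t.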
Eulerian cycle algorithm
• Start at any vertex v, traverse unused edges until returning to v
• While the cycle is not Eulerian
• Pick a vertex w along the cycle for which there are untraversed outgoing
edges
• Traverse unused edges until ending up back at w
• Join two cycles into one cycle
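The steps above can be sketched with a stack-based formulation of the same cycle-joining idea, usually called Hierholzer's algorithm. This is an illustration under stated assumptions, not the lecture's implementation: the function name `eulerian_cycle` and the small example graph are made up, and the input is assumed to be connected with every vertex balanced.

```python
def eulerian_cycle(graph, start):
    """Return an Eulerian cycle as a list of vertices, beginning and ending at start.

    graph maps each vertex to a list of successors (parallel edges allowed).
    """
    unused = {v: list(successors) for v, successors in graph.items()}
    stack, cycle = [start], []
    while stack:
        v = stack[-1]
        if unused.get(v):
            # Follow an unused edge out of v.
            stack.append(unused[v].pop())
        else:
            # v has no unused edges left: it is finished, so emit it.
            # Backtracking here is what splices the sub-cycles together.
            cycle.append(stack.pop())
    cycle.reverse()
    return cycle

graph = {"A": ["B"], "B": ["C", "D"], "C": ["A"], "D": ["B"]}
cycle = eulerian_cycle(graph, "A")
print(" -> ".join(cycle))
```

Each edge is pushed and popped once, so the running time is linear in the number of edges.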
Joining cycles
Algorithm for Constructing an Eulerian Cycle
Consensus
Layout:
                    GTATCGTAGCTGACTGCGCTGC
                 ATCGTCTCGTAGCTGACTGCGCTGC
            ATCGTATCGAATCGTAG
TGACTGCGCTGCATCGTATCGTATC
Consensus:
TGACTGCGCTGCATCGTATCGTATCGTAGCTGACTGCGCTGC
Hard vs. Easy
• Eulerian path – visit every edge once (easy)
• Hamiltonian path – visit every node once (hard)
• Superstring (easy)
• Shortest common superstring (hard)
• Aligning 2 sequences (easy)
• Aligning k > 2 sequences (hard)
• Shortest path (easy)
• Longest path (hard)
Kruskal’s Minimum Spanning Tree (MST) Algorithm
• Kruskal’s algorithm sorts all edges of the given graph in increasing order of
weight. It then adds edges to the MST one at a time, skipping any edge that
would form a cycle.
• It considers the minimum-weight edge first and the maximum-weight edge last.
It thus makes a locally optimal choice at each step, and for the MST problem
these choices lead to a globally optimal solution.
Greedy Algorithm Examples
• Kruskal’s Algorithm for Minimum Spanning Tree
• Minimum spanning tree: a set of n-1 edges that connects a graph of n vertices
without any cycles and that has minimal total weight
• Kruskal’s algorithm adds the edge that connects two components with the
smallest weight at each step without introducing a cycle
• Proven to give an optimal solution
Greedy Algorithm Examples
Weight  Src  Dest
1       7    6
2       8    2
2       6    5
4       0    1
4       2    5
6       8    6
7       2    3
7       7    8
8       0    7
8       1    2
9       3    4
10      5    4
11      1    7
14      3    5
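A minimal sketch of Kruskal's algorithm with a union-find structure for cycle detection. The function name `kruskal` and the helper `find` are illustrative, not from the lecture. The edge list is an assumption: it is transcribed as (weight, src, dest) for a graph with 9 vertices (0 to 8) and 14 weighted edges.

```python
def kruskal(num_vertices, edges):
    """Return the MST edges of an undirected graph given as (weight, u, v) tuples."""
    parent = list(range(num_vertices))  # each vertex starts in its own component

    def find(x):
        # Follow parent links to the component root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for weight, u, v in sorted(edges):       # cheapest edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                 # different components: no cycle
            parent[root_u] = root_v          # merge the two components
            mst.append((weight, u, v))
    return mst

edges = [
    (1, 7, 6), (2, 8, 2), (2, 6, 5), (4, 0, 1), (4, 2, 5),
    (6, 8, 6), (7, 2, 3), (7, 7, 8), (8, 0, 7), (8, 1, 2),
    (9, 3, 4), (10, 5, 4), (11, 1, 7), (14, 3, 5),
]
mst = kruskal(9, edges)
total_weight = sum(weight for weight, _, _ in mst)
print(mst)
print(total_weight)
```

With n = 9 vertices the tree has n - 1 = 8 edges. Edges such as (6, 8, 6) and (7, 7, 8) are skipped because both endpoints are already in the same component when they are considered.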
References
• Lecture notes of Colin Dewey @ University of Wisconsin-Madison
• Lecture notes of Arne Elofsson @ Stockholm University
• Lecture notes of Yuzhen Ye @ Indiana University