Sequence Comparison Part 3
BIOINFORMATICS
DR. UROOJ AINUDDIN
PAIRWISE SEQUENCE ALIGNMENT
CHAPTER 3
Alignment algorithms
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• Define a scoring scheme.
• 𝑀𝑎𝑡𝑐ℎ = 2
• 𝑀𝑖𝑠𝑚𝑎𝑡𝑐ℎ = – 1
• 𝐺𝑎𝑝 𝑝𝑒𝑛𝑎𝑙𝑡𝑦 = – 2
• $T(0,0) = 0$
• $T(i,j) = \max \begin{cases} T(i-1,j-1) + \text{match or mismatch} \\ T(i-1,j) + \text{gap penalty} \\ T(i,j-1) + \text{gap penalty} \end{cases}$
[Figure: scoring matrix with X across the top (columns j = 0 to 9) and Y down the side (rows i = 0 to 7); only T(0,0) = 0 is filled in. Cells T(i-1,j-1) and T(i,j) are labelled.]
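The recurrence for a single cell can be sketched in a few lines of Python. This is a minimal illustration, not the lecturer's code: the names `substitution` and `cell_score` are made up here, and the border values used in the example assume the usual initialisation in which the first row and column hold cumulative gap penalties.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # scoring scheme from the slide


def substitution(a: str, b: str) -> int:
    """Score for aligning residue a with residue b (a diagonal move)."""
    return MATCH if a == b else MISMATCH


def cell_score(diag: int, up: int, left: int, a: str, b: str) -> int:
    """Apply the recurrence to one cell, given its three neighbouring cells."""
    return max(
        diag + substitution(a, b),  # T(i-1, j-1) + match or mismatch
        up + GAP,                   # T(i-1, j)   + gap penalty
        left + GAP,                 # T(i, j-1)   + gap penalty
    )


# T(1,1) compares the first residues of X and Y ('C' and 'C').
# Assumed border values: T(0,0) = 0, T(0,1) = -2, T(1,0) = -2.
t11 = cell_score(diag=0, up=-2, left=-2, a="C", b="C")
print(t11)  # 2
```

Here the diagonal candidate wins (0 + 2 = 2), while both gap candidates give -4.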
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
[Figure: the same matrix and scoring scheme; cells T(i-1,j) and T(i,j) are labelled.]
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
[Figure: the same matrix and scoring scheme; cells T(i,j-1) and T(i,j) are labelled.]
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• Add the gap penalty when moving in the horizontal or vertical direction.
• Add the match or mismatch score when moving in the diagonal direction.
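Example
[Figure: filled scoring matrix for X (columns j = 0 to 9) and Y (rows i = 0 to 7). Only the last row is recoverable.]
Row i = 7 (G): -14 -10 -6 -2 2 0 4 8 6 4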
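The whole matrix can be filled by applying the recurrence row by row. The sketch below is one possible way to do it; the function name `nw_matrix` is hypothetical, and the first row and column are assumed to hold cumulative gap penalties, which agrees with the value -14 at T(7,0).

```python
def nw_matrix(x: str, y: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Fill the global alignment table with y along rows (i) and x along columns (j)."""
    rows, cols = len(y) + 1, len(x) + 1
    T = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):      # first row: only horizontal moves are possible
        T[0][j] = T[0][j - 1] + gap
    for i in range(1, rows):      # first column: only vertical moves are possible
        T[i][0] = T[i - 1][0] + gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if y[i - 1] == x[j - 1] else mismatch
            T[i][j] = max(
                T[i - 1][j - 1] + s,  # diagonal move
                T[i - 1][j] + gap,    # vertical move
                T[i][j - 1] + gap,    # horizontal move
            )
    return T


T = nw_matrix("CAGGTAGTG", "CTAGTAG")
print(T[-1])  # [-14, -10, -6, -2, 2, 0, 4, 8, 6, 4]
```

The printed last row matches row i = 7 of the slide, and the bottom right cell gives the global alignment score of 4.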
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• Tracing back is done from the
bottom-right cell towards the
top-left cell.
• At each cell, the path moves to
the source cell (diagonal, upper,
or left) that contributed the
highest score to the current cell.
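A traceback that follows one path from the bottom right to the top left might look like the sketch below. The helper names are hypothetical, and the tie-breaking order (diagonal first, then vertical, then horizontal) is an assumption made here; any other order would yield another equally good path. Moves are recorded as D (diagonal), V (vertical), and H (horizontal).

```python
def nw_matrix(x, y, match=2, mismatch=-1, gap=-2):
    """Fill the table with y along rows (i) and x along columns (j)."""
    T = [[0] * (len(x) + 1) for _ in range(len(y) + 1)]
    for j in range(1, len(x) + 1):
        T[0][j] = j * gap
    for i in range(1, len(y) + 1):
        T[i][0] = i * gap
        for j in range(1, len(x) + 1):
            s = match if y[i - 1] == x[j - 1] else mismatch
            T[i][j] = max(T[i - 1][j - 1] + s, T[i - 1][j] + gap, T[i][j - 1] + gap)
    return T


def traceback(T, x, y, match=2, mismatch=-1, gap=-2):
    """Return one optimal path as a string of moves, read from top left to bottom right."""
    i, j = len(y), len(x)
    moves = []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if y[i - 1] == x[j - 1] else mismatch
            if T[i][j] == T[i - 1][j - 1] + s:   # score came from the diagonal cell
                moves.append("D")
                i, j = i - 1, j - 1
                continue
        if i > 0 and T[i][j] == T[i - 1][j] + gap:  # score came from the cell above
            moves.append("V")
            i -= 1
        else:                                        # score came from the cell to the left
            moves.append("H")
            j -= 1
    return "".join(reversed(moves))


x, y = "CAGGTAGTG", "CTAGTAG"
moves = traceback(nw_matrix(x, y), x, y)
print(moves)  # DHHHDDDDVD
```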
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• The score of each alignment is
calculated according to the scoring
scheme set at the beginning.
• The alignment with the highest
score is considered the best global
alignment.
• More than one alignment may achieve
the highest score. All of them are
equally good, and any one of them can
be accepted as the global alignment.
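Scoring and ranking candidate alignments can be sketched as follows. The function name `score_alignment` and the two candidate alignments are chosen here for illustration; gaps are written as "-", and a column containing a gap is charged the gap penalty.

```python
def score_alignment(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Sum the column scores of two gapped sequences of equal length."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for p, q in zip(a, b):
        if p == "-" or q == "-":
            total += gap        # gap in either sequence
        elif p == q:
            total += match
        else:
            total += mismatch
    return total


# Two illustrative candidate alignments of X and Y.
candidates = [
    ("C-AGGTAGTG", "CTAG-TAG--"),
    ("CAGGTAGTG", "--CTAGTAG"),
]
scores = [score_alignment(a, b) for a, b in candidates]
best = max(candidates, key=lambda pair: score_alignment(*pair))
print(scores)  # [4, -8]
print(best)    # ('C-AGGTAGTG', 'CTAG-TAG--')
```

The first candidate has six matches and four gaps (12 - 8 = 4); the second has two gaps, six mismatches, and one match (-4 - 6 + 2 = -8).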
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• For every horizontal movement,
introduce a gap in the vertical sequence,
and apply the gap penalty.
• For every vertical movement, introduce a
gap in the horizontal sequence, and
apply the gap penalty.
• For every diagonal movement, apply
either match or mismatch score.
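These three rules translate directly into a routine that turns a path into an alignment. In this sketch the function name and the move letters (H, V, D) are hypothetical conventions, X is the horizontal sequence, Y is the vertical sequence, and the path is read from the top left cell to the bottom right cell.

```python
def moves_to_alignment(x: str, y: str, moves: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Build the gapped sequences and the score from a path of H, V and D moves."""
    top, bottom, score = [], [], 0
    i = j = 0  # i indexes y (vertical sequence), j indexes x (horizontal sequence)
    for m in moves:
        if m == "H":    # horizontal move: gap in the vertical sequence
            top.append(x[j]); bottom.append("-"); j += 1
            score += gap
        elif m == "V":  # vertical move: gap in the horizontal sequence
            top.append("-"); bottom.append(y[i]); i += 1
            score += gap
        elif m == "D":  # diagonal move: match or mismatch
            top.append(x[j]); bottom.append(y[i])
            score += match if x[j] == y[i] else mismatch
            i += 1; j += 1
        else:
            raise ValueError(f"unknown move {m!r}")
    if i != len(y) or j != len(x):
        raise ValueError("path does not consume both sequences")
    return "".join(top), "".join(bottom), score


aligned_x, aligned_y, score = moves_to_alignment("CAGGTAGTG", "CTAGTAG", "DVDDHDDDHH")
print(aligned_x)  # C-AGGTAGTG
print(aligned_y)  # CTAG-TAG--
print(score)      # 4
```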
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• 𝑀𝑎𝑡𝑐ℎ = 2
• 𝑀𝑖𝑠𝑚𝑎𝑡𝑐ℎ = – 1
• 𝐺𝑎𝑝 𝑝𝑒𝑛𝑎𝑙𝑡𝑦 = – 2
• Alignments #1 to #7
[Figures for alignments #1 to #7 are not recoverable.]
Example
X: CAGGTAGTG 9
Y: CTAGTAG 7
• We get seven possible global
alignments, all of which have the
highest score (i.e., equal to 4).
• We can select any one of these
alignments as the best alignment.
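The count of seven can be checked by following every tied source cell during traceback instead of only one. The sketch below does this recursively; the function name is hypothetical, and the order in which it lists the alignments is arbitrary, so it need not correspond to the numbering #1 to #7 used on the slides.

```python
def all_global_alignments(x: str, y: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return the optimal score and every alignment that achieves it."""
    rows, cols = len(y) + 1, len(x) + 1
    T = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        T[0][j] = j * gap
    for i in range(1, rows):
        T[i][0] = i * gap
        for j in range(1, cols):
            s = match if y[i - 1] == x[j - 1] else mismatch
            T[i][j] = max(T[i - 1][j - 1] + s, T[i - 1][j] + gap, T[i][j - 1] + gap)

    results = []

    def walk(i: int, j: int, top: str, bottom: str) -> None:
        """Extend the alignment leftwards from cell (i, j) along every optimal source."""
        if i == 0 and j == 0:
            results.append((top, bottom))
            return
        if i > 0 and j > 0:
            s = match if y[i - 1] == x[j - 1] else mismatch
            if T[i][j] == T[i - 1][j - 1] + s:            # diagonal source
                walk(i - 1, j - 1, x[j - 1] + top, y[i - 1] + bottom)
        if i > 0 and T[i][j] == T[i - 1][j] + gap:        # vertical source: gap in x
            walk(i - 1, j, "-" + top, y[i - 1] + bottom)
        if j > 0 and T[i][j] == T[i][j - 1] + gap:        # horizontal source: gap in y
            walk(i, j - 1, x[j - 1] + top, "-" + bottom)

    walk(len(y), len(x), "", "")
    return T[-1][-1], results


best_score, alignments = all_global_alignments("CAGGTAGTG", "CTAGTAG")
print(best_score, len(alignments))  # 4 7
for top, bottom in alignments:
    print(top)
    print(bottom)
    print()
```

Recursion depth is at most the sum of the two sequence lengths, which is small for these examples; much longer sequences would call for an iterative traversal.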
Homework
• 𝑀𝑎𝑡𝑐ℎ = 1
• 𝑀𝑖𝑠𝑚𝑎𝑡𝑐ℎ = – 1
• 𝐺𝑎𝑝 𝑝𝑒𝑛𝑎𝑙𝑡𝑦 = – 3
• Using the Needleman-Wunsch algorithm (NWA), globally align each pair of sequences:
X: ACTGTGCGT 9
Y: GACGCGTG 8

A: GACCGTATTCGAGT 14
B: GTCACACATGT 11