G7 Sequence Alignment

Njah Hurbert ICTU20233878
Piabezih McBright ICTU20234411

Leudjou Maxime ICTU20234264
Njifon Eric ICTU20234391
Otang Desmond ICTU20234300
Nyanoh Mark ICTU20233842
LEVEL 1 GROUP4
Dynamic programming
Sequence alignment
Sequence alignment is a key technique in bioinformatics. It compares sequences
by finding the alignment that optimizes a scoring scheme, which accounts for
matches, mismatches, and gaps.
Introduction:
Sequence alignment is essential for understanding evolutionary relationships,
predicting function, and identifying conserved motifs. Dynamic programming
offers a systematic way to perform these alignments.
1. Global Alignment: This type aligns sequences end-to-end, optimizing the
alignment over the entire length of the sequences. It is suitable when the sequences
are of similar length and are expected to be highly similar throughout. An example
of an algorithm used for global alignment is the Needleman-Wunsch algorithm.
2. Local Alignment: This type finds the most similar regions within the sequences,
which is useful for sequences of different lengths or those containing only similar
subsequences. The Smith-Waterman algorithm is commonly used for local
alignment.
Key Algorithms:
1. Needleman-Wunsch Algorithm (Global Alignment): This algorithm aligns two
sequences over their entire length. It builds a matrix in which each cell holds
the best score for aligning the prefixes of the two sequences that end at that
cell, considering matches, mismatches, and gaps.
Example: Suppose we want to align the sequences "GATTACA" and "GCATGCU".
1. Initialization:
- Create a matrix with dimensions based on the sequence lengths plus one for
initial gaps.
- Initialize the first row and column with gap penalties (assuming a gap penalty
of -1).
2. Matrix Filling: Fill in the matrix using a scoring scheme (e.g., match = +1,
mismatch = -1, gap = -1).
3. Traceback: Start from the bottom-right cell and trace back to the top-left to
recover the alignment.
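Example Matrix Filling (first row and column initialized with gap penalties):
      -   G   A   T   T   A   C   A
  -   0  -1  -2  -3  -4  -5  -6  -7
  G  -1
  C  -2
  A  -3
  T  -4
  G  -5
  C  -6
  U  -7
The matrix is filled based on the recurrence relation:
\[ F(i, j) = \max\bigl( F(i-1, j-1) + S(A_i, B_j),\; F(i, j-1) + d,\; F(i-1, j) + d \bigr) \]
where S(A_i, B_j) is the match or mismatch score and d is the gap penalty.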
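The recurrence can be turned into a short matrix-filling routine. The following is a minimal sketch, assuming the scoring scheme given earlier (match = +1, mismatch = -1, gap = -1); the function name nw_matrix and its parameter names are illustrative and do not come from the original report.

```python
def nw_matrix(a, b, match=1, mismatch=-1, gap=-1):
    """Fill a global-alignment score matrix F.

    a indexes the rows and b indexes the columns; F[i][j] is the best
    score for aligning a[:i] with b[:j].
    """
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]

    # First column and first row: aligning a prefix against only gaps.
    for i in range(1, rows):
        F[i][0] = i * gap
    for j in range(1, cols):
        F[0][j] = j * gap

    # Each remaining cell takes the best of diagonal, left, and up moves.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,
                          F[i][j - 1] + gap,
                          F[i - 1][j] + gap)
    return F


F = nw_matrix("GCATGCU", "GATTACA")
for row in F:
    print(" ".join(f"{v:3d}" for v in row))
print(F[-1][-1])  # prints 0, the optimal global score under this scheme
```

The bottom-right cell holds the score of the best end-to-end alignment; the traceback step starts from that cell.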
One optimal alignment (score 0 with match = +1, mismatch = -1, gap = -1) is:
G-ATTACA
| ||  |
GCATG-CU
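The traceback step can be sketched as follows. This is an illustrative implementation under the same assumed scoring scheme (match = +1, mismatch = -1, gap = -1); the function name needleman_wunsch is hypothetical. When several optimal alignments exist, the order in which moves are tested decides which one is returned, so the output may differ from other optimal alignments with the same score.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for a global alignment."""
    n, m = len(a), len(b)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,
                          F[i][j - 1] + gap,
                          F[i - 1][j] + gap)

    # Traceback from the bottom-right cell to the top-left cell.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if F[i][j] == F[i - 1][j - 1] + s:   # diagonal: pair two symbols
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and F[i][j] == F[i - 1][j] + gap:  # up: gap in b
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:                                       # left: gap in a
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


score, top, bottom = needleman_wunsch("GATTACA", "GCATGCU")
print(score)
print(top)
print(bottom)
```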
2. Smith-Waterman Algorithm (Local Alignment): This algorithm performs local
alignment, which identifies the most similar regions within the sequences. Unlike
Needleman-Wunsch, it does not require the alignment to span both sequences from
end to end.
Example:
Align the same sequences "GATTACA" and "GCATGCU".
1. Initialization:
- Initialize a matrix similar to Needleman-Wunsch, but set the first row and
column to zero.
2. Matrix Filling:
- Fill in the matrix using a scoring scheme.
- Unlike Needleman-Wunsch, negative scores are not allowed: any cell whose best
score would be negative is set to zero.
3. Traceback: Start from the cell with the highest score and trace back to the first
zero encountered.
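Example Matrix Filling (first row and column initialized to zero):
      -   G   A   T   T   A   C   A
  -   0   0   0   0   0   0   0   0
  G   0
  C   0
  A   0
  T   0
  G   0
  C   0
  U   0
The recurrence is the same as for Needleman-Wunsch, with zero added as a fourth
option so that no cell has a negative score:
\[ F(i, j) = \max\bigl( 0,\; F(i-1, j-1) + S(A_i, B_j),\; F(i, j-1) + d,\; F(i-1, j) + d \bigr) \]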
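The local-alignment fill differs from the global one only in the zero initialization and the zero floor. Here is a minimal sketch, assuming match = +1, mismatch = -1, and gap = -1; the function name sw_matrix is illustrative, and the function also records where the highest score occurs because the traceback starts there.

```python
def sw_matrix(a, b, match=1, mismatch=-1, gap=-1):
    """Fill a local-alignment score matrix F.

    Returns (F, best, best_pos), where best is the highest cell value and
    best_pos is the first (row, column) at which it occurs.
    """
    F = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]  # borders stay zero
    best, best_pos = 0, (0, 0)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(0,                      # never go negative
                          F[i - 1][j - 1] + s,
                          F[i][j - 1] + gap,
                          F[i - 1][j] + gap)
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)
    return F, best, best_pos


F, best, best_pos = sw_matrix("GCATGCU", "GATTACA")
for row in F:
    print(" ".join(f"{v:2d}" for v in row))
print(best)  # prints 2
```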
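With match = +1, mismatch = -1, and gap = -1, the best local score for these
sequences is 2. One local alignment with that score is:
AT
||
AT
The aligned segments are the highest-scoring pair of subsequences. Here "CA"
aligned with "CA" ties with the same score.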
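A local traceback can be sketched as below: it starts at the highest-scoring cell and stops at the first zero. This is an illustrative version assuming match = +1, mismatch = -1, and gap = -1; the name smith_waterman is hypothetical, and when several cells tie for the highest score this sketch simply uses the first one found.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, segment_a, segment_b) for one best local alignment."""
    F = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i][j] = max(0,
                          F[i - 1][j - 1] + s,
                          F[i][j - 1] + gap,
                          F[i - 1][j] + gap)
            if F[i][j] > best:
                best, best_pos = F[i][j], (i, j)

    # Traceback from the best cell until a zero cell is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and F[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if F[i][j] == F[i - 1][j - 1] + s:      # diagonal: pair two symbols
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif F[i][j] == F[i - 1][j] + gap:      # up: gap in b
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:                                   # left: gap in a
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


score, seg_a, seg_b = smith_waterman("GATTACA", "GCATGCU")
print(score)
print(seg_a)
print(seg_b)
```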
Steps in Dynamic Programming Alignment:
1. Initialization: Create a matrix with dimensions based on the lengths of the
sequences plus one. Initialize the first row and column with gap penalties
(global alignment) or zeros (local alignment).
2. Matrix Filling: Populate the matrix using recurrence relations that consider
match, mismatch, and gap scores.
3. Traceback: Follow the path from the optimal score in the matrix to reconstruct
the alignment.
Scoring Systems:
- Match: A positive score for identical characters.
- Mismatch: A negative score for non-identical characters.
- Gap Penalties: Negative scores for introducing gaps to account for insertions or
deletions.
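To make the scoring system concrete, the sketch below scores an already-aligned pair of strings column by column. It assumes match = +1, mismatch = -1, and gap = -1, with "-" marking a gap; the function name score_alignment is illustrative. The two calls score two candidate alignments of "GATTACA" and "GCATGCU".

```python
def score_alignment(row_a, row_b, match=1, mismatch=-1, gap=-1):
    """Sum the per-column scores of two aligned strings of equal length."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap        # insertion or deletion
        elif x == y:
            total += match      # identical characters
        else:
            total += mismatch   # non-identical characters
    return total


print(score_alignment("G-ATTACA", "GCATG-CU"))  # prints 0
print(score_alignment("GA-TTACA", "G-CATGCU"))  # prints -2
```

The first candidate has four matches, two mismatches, and two gaps; the second has three matches, three mismatches, and two gaps, so it scores lower.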
Applications:
- Homology Detection: Identifying evolutionary relationships.
- Function Prediction: Inferring the function of unknown sequences based on
similarity to known sequences.
- Genomics: Annotating genomes by aligning them with reference sequences.
Challenges:
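- Computational Complexity: For sequences of lengths m and n, the matrix has
(m + 1) x (n + 1) cells, so these dynamic programming algorithms need O(mn) time
and memory, which becomes slow for long sequences.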
- Scoring Scheme Sensitivity: Results can vary greatly with different scoring
parameters.
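Both challenges can be illustrated with one small sketch. When only the global score is needed (not the alignment itself), two rows of the matrix are enough, which reduces memory from O(mn) to O(n) while the running time stays O(mn). The function name nw_score_linear_space and the default scoring values are assumptions made for illustration.

```python
def nw_score_linear_space(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score using only two matrix rows at a time."""
    prev = [j * gap for j in range(len(b) + 1)]        # row 0
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)                # first column of row i
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s,             # diagonal
                          prev[j] + gap,               # up
                          curr[j - 1] + gap)           # left
        prev = curr
    return prev[-1]


# Re-running with different gap penalties shows how the score depends
# on the chosen parameters.
for gap in (-1, -2, -5):
    print(gap, nw_score_linear_space("GATTACA", "GCATGCU", gap=gap))
```

Because earlier rows are discarded, this variant cannot trace back an alignment; it only reports the score.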
Conclusion:
Dynamic programming sequence alignment is a robust tool for comparing
biological sequences, offering alignments that are optimal under the chosen
scoring scheme and that are critical for various bioinformatics applications.
