Sequence Alignment - Final


Short Introduction to DNA & RNA

DNA holds the blueprint for building a cell, including the information that directs the cell’s
production of proteins, carbohydrates and vitamins. The transfer of this information from DNA
through RNA to protein is termed the “Central Dogma”, which is

DNA → RNA → Protein.

Proteins are then used in constructing the cell.

Cells are built largely from proteins and carbohydrates, and DNA encodes the information
that is used to make both the RNA and the proteins.

Put simply, the mechanism above works in two steps. In transcription, the information coded in
DNA is copied into RNA, specifically messenger RNA (mRNA), which carries that information to
where it is used in the cell. Amino acids are then joined to one another in the order given by
the coded information, and a protein is formed. This second step is known as translation.
A DNA molecule contains only four bases (A, T, C, and G), which are repeated thousands of
times. Adenine “A” pairs with Thymine “T”, while Cytosine “C” pairs with Guanine “G”, and all
pairings are held together by hydrogen bonds.
Like DNA, RNA contains four bases, but Thymine is replaced with Uracil “U”, and RNA is
single-stranded.
DNA only stores the coded information for a protein, whereas RNA takes part in making the protein.
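
The base-pairing rule and the DNA-to-RNA relationship described above can be sketched in a few lines of Python. The function names `complement_strand` and `transcribe` and the example sequence are illustrative assumptions, and `transcribe` takes the simplified view that the mRNA reads like the coding strand with T replaced by U.

```python
# Base pairing in DNA: A pairs with T, and C pairs with G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement_strand(dna):
    """Return the base-paired partner of each base, position by position."""
    return "".join(DNA_PAIRS[base] for base in dna.upper())

def transcribe(coding_strand):
    """Simplified transcription: copy the coding strand, writing U for T."""
    return coding_strand.upper().replace("T", "U")

dna = "ATGGCCTAA"                  # hypothetical coding strand
print(complement_strand(dna))      # TACCGGATT
print(transcribe(dna))             # AUGGCCUAA
```

Taking the complement twice returns the original strand, which is a quick check that the pairing table is symmetric.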

The Genetic Code


DNA must code for the 20 different amino acids found in all organisms. The information-carrying
capabilities of DNA reside in the sequence of nitrogenous bases. The genetic code is a sequence of three
bases—a triplet code. Figure 3.11 shows the genetic code as reflected in the mRNA that will be produced
from DNA. Each three-base combination is a codon. More than one codon can specify the same amino acid
because there are 64 possible codons, but only 20 amino acids. This characteristic of the code is referred
to as degeneracy. Note that not all codons code for an amino acid. The base sequences UAA, UAG, and
UGA are all stop signals that indicate where polypeptide synthesis should end. The base sequence AUG
codes for the amino acid methionine and also acts as the start signal.
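
The counts quoted above (64 codons, 20 amino acids, three stop signals) can be checked with a small Python sketch. It builds the standard genetic code from a compact 64-letter string listed in U, C, A, G order, with `*` marking a stop codon. The names `CODON_TABLE` and `translate` and the example mRNA are assumptions made for illustration, and the translation loop is a simplification that starts at the first AUG and reads until the first stop codon.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for UUU, UUC, UUA, UUG, UCU, ... GGG ('*' = stop).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                      # no start signal found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":          # UAA, UAG or UGA ends synthesis
            break
        protein.append(amino_acid)
    return "".join(protein)

stops = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
leucine = [c for c, aa in CODON_TABLE.items() if aa == "L"]

print(len(CODON_TABLE))                # 64
print(stops)                           # ['UAA', 'UAG', 'UGA']
print(len(leucine))                    # 6 (degeneracy: six codons, one amino acid)
print(translate("GGAUGGCUUAAUUU"))     # MA
```
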
SEQUENCE ALIGNMENT
Sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify
regions of similarity that may be a consequence of functional, structural,
or evolutionary relationships between the sequences.
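
As a small illustration of what "regions of similarity" means in practice, the sketch below measures the percent identity of two sequences that have already been aligned, with `-` marking a gap. The function name `percent_identity` and the example strings are hypothetical, and dividing by the full alignment length is only one of several conventions in use.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and of equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

# Four of the six columns are identical (A, C, T and the final A).
print(round(percent_identity("ACGT-A", "AC-TTA"), 1))   # 66.7
```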

Methods of Sequence Alignment


Pairwise Alignment:
Aligns two sequences to find the optimal match.
 Uses algorithms like Needleman-Wunsch (global alignment) and Smith-Waterman (local
alignment).
 Global alignment aligns entire sequences, while local alignment identifies subsequences
with high similarity.
 Pairwise alignment helps us find the similarities and differences between two sequences.
Sequences can differ from each other in three ways.

Global Alignment and Local Alignment


In global alignment, two sequences to be aligned are assumed to be generally similar over their
entire length. Alignment is carried out from beginning to end of both sequences to find the best
possible alignment across the entire length between the two sequences. This method is more
applicable for aligning two closely related sequences of roughly the same length. For divergent
sequences and sequences of variable lengths, this method may not be able to generate optimal
results because it fails to recognize highly similar local regions between the two sequences.
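
Global alignment can be sketched with a minimal Needleman-Wunsch implementation. The scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration, as is the function name; real tools typically use substitution matrices and more elaborate gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for an end-to-end alignment of a and b."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to the top-left corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

print(needleman_wunsch("ACGT", "AGT"))   # (1, 'ACGT', 'A-GT')
```

Because the traceback always runs from one corner of the matrix to the other, every residue of both sequences appears in the output, which is what makes the alignment global.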

Local alignment, on the other hand, does not assume that the two sequences in question have
similarity over the entire length. It only finds local regions with the highest level of similarity
between the two sequences and aligns these regions without regard for the alignment of the rest
of the sequence regions. This approach can be used for aligning more divergent sequences with
the goal of searching for conserved patterns in DNA or protein sequences. The two sequences to
be aligned can be of different lengths. This approach is more appropriate for aligning divergent
biological sequences containing only modules that are similar, which are referred to as domains
or motifs. Figure 3.2 illustrates the differences between global and local pairwise alignment.
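
Local alignment can be sketched with a minimal Smith-Waterman implementation. As before, the scores (match +1, mismatch -1, gap -2), the function name and the example sequences are illustrative assumptions rather than settings taken from any particular tool.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for the best-scoring local region."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero, so a poor region cannot penalise a good one.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Only the shared "ACG" module is reported; the unrelated flanks are ignored.
print(smith_waterman("TTTACGTTT", "GGGACGGGG"))   # (3, 'ACG', 'ACG')
```
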
Multiple Sequence Alignment (MSA):

A natural extension of pairwise alignment is multiple sequence alignment, which
aligns multiple related sequences to achieve optimal matching of the sequences.
Related sequences are identified through database similarity searching.

Key features
 Aligns three or more sequences to identify conserved regions across all of them.
 Common tools include ClustalW, MAFFT (Multiple Alignment using Fast Fourier
Transform), and T-Coffee (Tree-based Consistency Objective Function for Alignment Evaluation).
 MSA helps analyze shared features, evolutionary relationships, and functional domains.
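
Once an MSA has been produced, conserved regions can be located by scanning its columns. The sketch below assumes the sequences are already aligned (equal length, `-` for gaps) and reports the columns in which every sequence carries the same residue. The function name `conserved_columns` and the toy alignment are hypothetical, and the code does not perform the alignment itself.

```python
def conserved_columns(aligned):
    """Return the 0-based indices of columns that are identical in every sequence."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    return [
        i for i, column in enumerate(zip(*aligned))
        if len(set(column)) == 1 and column[0] != "-"
    ]

toy_msa = [            # hypothetical, pre-aligned sequences
    "ACG-TA",
    "ACGGTA",
    "AC-GTT",
]
print(conserved_columns(toy_msa))   # [0, 1, 4]
```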

CLUSTALW
CLUSTALW is a tool for performing MSA that is available online.
It was developed at the European Molecular Biology Laboratory and the European Bioinformatics Institute.
Performs alignment in:

SCOPE
Symbols in the resulting alignments:
