Gene Prediction
Group 4
INTRODUCTION
Computational gene prediction is a prerequisite for detailed functional annotation
of genes and genomes.
Gene prediction as a whole can be defined as the detection of the locations of open
reading frames (ORFs) and, if the genes of interest are of eukaryotic origin, the
delineation of the structures of introns and exons.
Gene prediction is one of the most difficult problems in the field of pattern
recognition. There are two basic problems in gene prediction: prediction of
protein-coding regions and prediction of the functional sites of genes.
ORF (OPEN READING FRAMES)
Open reading frames (ORFs) are defined as spans of DNA sequence between the
start and stop codons.
A long open reading frame is often part of a gene.
An ORF is a sequence that has a length divisible by three and is bounded by stop
codons.
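The ORF definition above can be illustrated with a minimal sketch. It assumes the start codon is ATG and the stop codons are TAA, TAG and TGA, and it scans only the three forward frames; the function name find_orfs and the min_codons parameter are hypothetical names chosen for this example, not part of any particular gene finder.

```python
# Minimal forward-strand ORF scan (illustrative sketch only).
# Assumptions: start codon is ATG; stop codons are TAA, TAG, TGA.
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (start, end, frame) tuples for ORFs in the three forward frames.

    start is the index of the first base of the start codon, end is the index
    just past the stop codon, so end - start is always divisible by three.
    min_codons counts codons from the start codon up to (not including) the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START_CODON:
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return orfs


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGCC"))
```

Raising min_codons keeps only long ORFs, which, as noted above, are the ones most likely to be part of a gene.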
GENE PREDICTION DIFFERS FOR DIFFERENT ORGANISMS
TYPES OF GENE PREDICTION
There are mainly two classes of methods for computational gene prediction. One is
based on sequence similarity searches, while the other is based on gene structure and
signal-based searches, which is also referred to as ab initio gene finding.
PROKARYOTES GENE FEATURES
Prokaryotic DNA is first subjected to conceptual translation in all six possible frames:
three frames forward and three frames reverse.
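The six-frame conceptual translation described above can be sketched as follows. This is a minimal illustration that assumes the standard genetic code and an input containing only A, C, G and T; the function names and the frame labels (+1 to +3, -1 to -3) are hypothetical choices for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict mapping frame label to its conceptual translation."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])  # forward frames
        frames[f"-{offset + 1}"] = translate(rc[offset:])   # reverse frames
    return frames


if __name__ == "__main__":
    for label, protein in six_frame_translation("ATGAAATAG").items():
        print(label, protein)
```

Each of the six translations can then be inspected for long stretches that are not interrupted by a stop ('*').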
GENE SIGNALS
GENE CONTENT
Examines the nonrandomness of nucleotide distribution, for example:
* Codon bias: G or C preferred over A or T at the 3rd nucleotide of a codon in a
coding sequence
* Testcode: repetition of the 3rd nucleotide in a coding sequence
By plotting the repeating patterns of the nucleotides at these positions, coding and
noncoding regions can be differentiated.
Disadvantage:
These methods identify only typical genes and tend to miss atypical genes in which the
rule of codon bias is not strictly followed.
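The third-position bias mentioned above can be illustrated by measuring the G+C fraction separately at each of the three codon positions. This is a minimal sketch of a simple positional statistic, not an implementation of Testcode; the function name gc_by_codon_position and the frame parameter are hypothetical names for this example.

```python
def gc_by_codon_position(seq, frame=0):
    """Return the G+C fraction at codon positions 1, 2 and 3 as a tuple.

    The sequence is read in the given frame (0, 1 or 2) and any trailing
    bases that do not form a complete codon are ignored.
    """
    seq = seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3  # drop an incomplete trailing codon
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this position
        gc = sum(base in "GC" for base in bases)
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


if __name__ == "__main__":
    # Codons ATG, ATC, ATG: only the 3rd position carries G or C.
    print(gc_by_codon_position("ATGATCATG"))
```

A region read in its true coding frame would be expected to show a higher value at the third position than at the other two if the organism follows this bias; as the disadvantage above notes, atypical genes may not show it.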
WHAT IS A MARKOV MODEL
Drawback: If homologs are not available in the database, the method cannot be used;
hence, novel genes in a new species cannot be discovered.