Annotation
Table of Contents
Introduction
Problem Statement
Objective
Literature Review
Methodology
Results
Validation
References
➢ Introduction:
With the development of genome sequencing for many organisms, more and more raw
sequences need to be annotated. Computational gene prediction, which locates
protein-coding regions in a sequence, is one of the essential problems in bioinformatics.
Two classes of methods are used: similarity-based searches and ab initio prediction.
➢ Problem Statement:
Predict the gene in a Homo sapiens sequence and annotate the sequence using
bioinformatics tools, in order to study and identify the gene's components.
➢ Objective:
• Gene prediction
• Locating the gene in genome
• Identifying the organism's sequence to be retrieved
➢ Literature Review:
The gene analyzed in this report encodes a flavoprotein that is essential for nuclear
disassembly in apoptotic cells and is found in the mitochondrial intermembrane space in
healthy cells. Induction of apoptosis results in the translocation of this protein to the
nucleus, where it affects chromosome condensation and fragmentation. In addition, the gene
product induces mitochondria to release the apoptogenic proteins cytochrome c and
caspase-9. Mutations in this gene cause combined oxidative phosphorylation deficiency 6
(COXPD6), a severe mitochondrial encephalomyopathy, as well as Cowchock syndrome, also
known as X-linked recessive Charcot-Marie-Tooth disease-4 (CMTX-4), a disorder resulting
in neuropathy, and axonal and motor-sensory defects with deafness and cognitive
disability. Alternative splicing results in multiple transcript variants. A related
pseudogene has been identified on chromosome 10.
geneid is a program to predict genes in anonymous genomic sequences, designed with a
hierarchical structure (Yan et al., 2020). Its accuracy compares favorably to that of other
existing ab initio gene prediction tools, and it is very efficient in terms of speed and
memory usage. geneid also offers support to integrate predictions from multiple sources
and to reannotate genomic sequences, via external GFF files together with the redefinition
of the gene model. Its output can be customized to different levels of detail, including an
exhaustive listing of potential signals and exons, and several output formats such as GFF
and XML are available (Gene id tool).
➢ Methodology:
geneid predicts genes in three steps. In the first step, splice sites and start and stop
codons are predicted and scored along the sequence using Position Weight Arrays. In the
second step, exons are built from the sites. Exons are scored as the sum of the scores of
the defining sites, plus the log-likelihood ratio of a Markov Model for coding DNA.
Finally, from the set of predicted exons, the gene structure is assembled, maximizing the
sum of the scores of the assembled exons. geneid also offers support to integrate
predictions from multiple sources via external GFF files, and the general gene model can
be redefined.
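The exon scoring and assembly steps described above can be illustrated with a minimal Python sketch. This is not geneid's implementation: the `Exon` class, the `assemble` function and all numeric values are hypothetical toy choices, and the assembly ignores reading-frame compatibility and gene-model rules. It keeps only the two ideas stated in the text, namely that an exon's score is the sum of its defining-site scores plus a coding log-likelihood ratio, and that the gene is the chain of non-overlapping exons with the highest total score.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Exon:
    start: int                          # 1-based inclusive start
    end: int                            # 1-based inclusive end
    site_scores: Tuple[float, float]    # scores of the two defining sites (toy values)
    coding_llr: float                   # coding log-likelihood ratio (toy value)

    @property
    def score(self) -> float:
        # Exon score = sum of defining-site scores + coding log-likelihood ratio.
        return sum(self.site_scores) + self.coding_llr


def assemble(exons: List[Exon]) -> Tuple[float, List[Exon]]:
    """Return the best-scoring chain of non-overlapping exons (simplified)."""
    ordered = sorted(exons, key=lambda e: e.end)
    best: List[Tuple[float, List[Exon]]] = []   # best[i]: best chain ending at ordered[i]
    for i, exon in enumerate(ordered):
        top_score, top_chain = exon.score, [exon]
        for j in range(i):
            prev_score, prev_chain = best[j]
            # Extend an earlier chain only if it ends before this exon starts.
            if ordered[j].end < exon.start and prev_score + exon.score > top_score:
                top_score, top_chain = prev_score + exon.score, prev_chain + [exon]
        best.append((top_score, top_chain))
    if not best:
        return 0.0, []
    return max(best, key=lambda item: item[0])


# Hypothetical candidate exons, not taken from the geneid run in this report.
candidates = [
    Exon(10, 50, (1.0, 0.5), 2.0),
    Exon(40, 90, (0.2, 0.3), 1.0),      # overlaps the first candidate
    Exon(100, 160, (0.5, 0.5), -0.5),
    Exon(120, 200, (1.5, 1.0), 1.5),    # overlaps the third candidate
]

total_score, gene = assemble(candidates)
print(total_score, [(e.start, e.end) for e in gene])
```

With these toy values the chain made of the first and fourth candidates is selected, because overlapping exons cannot be combined and that pair has the highest summed score.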
For this analysis, the parameters were set and the FASTA file of the Homo sapiens
sequence was uploaded. The file was then submitted and the job was run, after which the
results were displayed on screen.
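Before uploading, it can help to confirm that the FASTA file holds the intended region. The sketch below is an optional check, not part of geneid: `read_fasta` and `region_length` are hypothetical helper names, and the header layout (`accession:cSTART-END`, where `c` marks the complement strand) is assumed from the sequence identifier reported in the results. The toy record stands in for the real file.

```python
import io
import re
from typing import TextIO, Tuple


def read_fasta(handle: TextIO) -> Tuple[str, str]:
    """Return (header, sequence) for the first record of a FASTA file."""
    header, chunks = "", []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header:
                break               # stop at the second record
            header = line[1:]
        elif header:
            chunks.append(line.upper())
    return header, "".join(chunks)


def region_length(header: str) -> int:
    """Length implied by an 'accession:START-END' or 'accession:cSTART-END' header."""
    match = re.search(r":c?(\d+)-(\d+)", header)
    if match is None:
        raise ValueError("no coordinate range found in header")
    first, second = int(match.group(1)), int(match.group(2))
    return abs(first - second) + 1


# Length implied by the identifier of the sequence analyzed in this report.
expected = region_length("NC_000023.11:c130165841-130129362")

# Toy record used instead of the real file (assumption for illustration only).
toy = io.StringIO(">toy:c12-5\nACGT\nacgt\n")
header, sequence = read_fasta(toy)
print(expected, len(sequence) == region_length(header))
```

The expected length of 36480 bases agrees with the range [0,36479] that geneid reports for this sequence.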
➢ Results:
The tool gives the following output:
• Starts(+) predicted in sequence NC_000023.11:c130165841-130129362: [0,36479]
• Acceptors(+) predicted in sequence NC_000023.11:c130165841-130129362: [0,36479]
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 1 2 -0.35 + . # ***********************GTTC
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 21 22 2.22 + . # ***GTTCCCCTTCCCCGGCTCTAGCAG
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 24 25 1.01 + . # GTTCCCCTTCCCCGGCTCTAGCAGGCC
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 55 56 -1.98 + . # TCTCTGTCCAATGCCCACCCGGAGCTG
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 90 91 -2.57 + . # AGTCTGCGTAATGTGCGTGTGAAGAGA
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 92 93 -5.08 + . # TCTGCGTAATGTGCGTGTGAAGAGACT
NC_000023.11:c130165841-130129362 geneid_v1.2 Acceptor 145 146 -5.04 + . # TTTGACCCGTCGGTCGTGCGTGAGAGG
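The acceptor records follow a GFF-like column order (sequence, source, feature, start, end, score, strand, frame), with the sequence context after `#`. As a rough illustration of how such records could be ranked, the sketch below parses the seven acceptor lines listed above, each written on a single line, and keeps those with a positive score. The `Site` type, the `parse_site` helper and the cutoff of 0 are assumptions made for this example, not geneid settings.

```python
from typing import List, NamedTuple


class Site(NamedTuple):
    feature: str
    start: int
    end: int
    score: float
    strand: str


def parse_site(line: str) -> Site:
    # Drop the sequence context after '#', then split the GFF-like columns.
    fields = line.split("#", 1)[0].split()
    return Site(fields[2], int(fields[3]), int(fields[4]), float(fields[5]), fields[6])


SEQ = "NC_000023.11:c130165841-130129362"
ACCEPTOR_LINES = [
    f"{SEQ} geneid_v1.2 Acceptor 1 2 -0.35 + . # ***********************GTTC",
    f"{SEQ} geneid_v1.2 Acceptor 21 22 2.22 + . # ***GTTCCCCTTCCCCGGCTCTAGCAG",
    f"{SEQ} geneid_v1.2 Acceptor 24 25 1.01 + . # GTTCCCCTTCCCCGGCTCTAGCAGGCC",
    f"{SEQ} geneid_v1.2 Acceptor 55 56 -1.98 + . # TCTCTGTCCAATGCCCACCCGGAGCTG",
    f"{SEQ} geneid_v1.2 Acceptor 90 91 -2.57 + . # AGTCTGCGTAATGTGCGTGTGAAGAGA",
    f"{SEQ} geneid_v1.2 Acceptor 92 93 -5.08 + . # TCTGCGTAATGTGCGTGTGAAGAGACT",
    f"{SEQ} geneid_v1.2 Acceptor 145 146 -5.04 + . # TTTGACCCGTCGGTCGTGCGTGAGAGG",
]

sites: List[Site] = [parse_site(line) for line in ACCEPTOR_LINES]

# Arbitrary illustrative cutoff: keep acceptors whose score is above zero.
positive = [site for site in sites if site.score > 0]
for site in positive:
    print(site.start, site.end, site.score)
```

Among the acceptors shown, only the sites at positions 21-22 and 24-25 score above zero.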
• Internals(+) predicted in sequence NC_000023.11:c130165841-130129362: [0,36479]
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 2 67 -8.40 + 0
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 2 67 -8.91 + 1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 2 73 -6.94 + 0
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 2 73 -7.36 + 1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 2 78 -5.57 + 0
• Terminals(+) predicted in sequence NC_000023.11:c130165841-130129362: [0,36479]
NC_000023.11:c130165841-130129362 geneid_v1.2 Terminal 2 21 -5.99 + 2
NC_000023.11:c130165841-130129362 geneid_v1.2 Terminal 2 77 -6.18 + 1
88 -4.20 + 1
NC_000023.11:c130165841-130129362 geneid_v1.2 First 186 291 2.78 + 0 NC_000023.11:c130165841-130129362_1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 9239 9381 -1.12 + 2 NC_000023.11:c130165841-130129362_1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 16274 16373 3.34 + 0 NC_000023.11:c130165841-130129362_1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 17966 18090 3.06 + 2 NC_000023.11:c130165841-130129362_1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 18219 18349 0.88 + 0 NC_000023.11:c130165841-130129362_1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 20273 20363 0.74 + 1 NC_000023.11:c130165841-130129362_1
NC_000023.11:c130165841-130129362 geneid_v1.2 Internal 25225 25309 -0.05 + 0 NC_000023.11:c13016584
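The exon records assigned to the predicted gene (one First exon and six Internal exons in the listing above) can be summarized with a short Python sketch. The tuple layout and variable names are choices made for this example, and coordinates are treated as 1-based inclusive, following the usual GFF convention. Since the listing is cut off after the seventh exon, the totals cover only the exons shown and are not a complete gene summary.

```python
from typing import List, Tuple

# (feature, start, end, score, strand, frame), copied from the exon records above.
EXONS: List[Tuple[str, int, int, float, str, int]] = [
    ("First", 186, 291, 2.78, "+", 0),
    ("Internal", 9239, 9381, -1.12, "+", 2),
    ("Internal", 16274, 16373, 3.34, "+", 0),
    ("Internal", 17966, 18090, 3.06, "+", 2),
    ("Internal", 18219, 18349, 0.88, "+", 0),
    ("Internal", 20273, 20363, 0.74, "+", 1),
    ("Internal", 25225, 25309, -0.05, "+", 0),
]

# Exon length with inclusive coordinates: end - start + 1.
exon_lengths = [exon[2] - exon[1] + 1 for exon in EXONS]

# Intron length: bases strictly between one exon's end and the next exon's start.
intron_lengths = [nxt[1] - cur[2] - 1 for cur, nxt in zip(EXONS, EXONS[1:])]

total_exon_length = sum(exon_lengths)
total_score = round(sum(exon[3] for exon in EXONS), 2)

print("exon lengths:  ", exon_lengths)
print("intron lengths:", intron_lengths)
print("total exon length:", total_exon_length, "summed score:", total_score)
```

For the seven exons shown, the exons span 781 bases in total and their scores sum to 9.63.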
➢ Validation:
geneid is a program that predicts genes, exons (first, internal, and terminal), splice
sites, start and stop codons, open reading frames, and other signals along a sequence, and
assigns a score to each prediction.
➢ References:
• Yan, C., Gong, L., Chen, L., Xu, M., Abou-Hamdan, H., Tang, M., Désaubry, L., &
Song, Z. (2020). PHB2 (prohibitin 2) promotes PINK1-PRKN/Parkin-dependent
mitophagy by the PARL-PGAM5-PINK1 axis. Autophagy, 16(3), 419–434.
https://doi.org/10.1080/15548627.2019.1628520
• Gene id tool. https://genome.crg.es/geneid.html