
Promoter Prediction

By:
Anurag Maheshwari (BTB/14/302)
Arpita Gupta (BTB/14/303)
Gurleen Singh Thind (BTB/14/304)
Jagrit Sharma (BTB/14/306)
Promoters
• Gene promoters are DNA sequences located upstream
of gene coding regions.
• They contain multiple cis-acting elements, which are
specific binding sites for transcription factors (TFs).
• They contain a “core promoter” (about 40 bp upstream of the
transcription initiation site), which includes the TATA
box.
• Chromatin folding allows distant cis-acting elements to
come spatially close to the regulatory
complex.
Types of Promoters
A) Constitutive promoters
• Drive roughly constant levels of gene expression in all
tissues, at all times, although no promoter is truly constitutive. E.g. the
CaMV 35S promoter. Highly expressed housekeeping genes are a
good source (ubiquitin, actin, tubulin, EIF genes).
B) Spatiotemporal promoters
• Give more precise control of native genes and transgenes by
restricting gene expression to certain cells, tissues, organs, or
developmental stages. E.g. the seed-specific promoters of the hordein
and glutenin genes.
C) Inducible promoters
• Respond to environmental stimuli (biotic and abiotic
stresses) and external chemical stimuli.
Models for finding Binding Sites
A) Exact String Model
• Searches for an exact sequence in the DNA
sequence.
B) String Mismatches Model
• Searches for an almost exact sequence, tolerating a mismatch in one of
the positions.
C) Degenerate String Model (Consensus Model)
• Searches for a sequence while allowing several alternative bases at
specific positions of the sequence.
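The three models can be compared on a toy sequence. The following is a minimal Python sketch, not the method used by any particular server: the function names, the IUPAC degenerate-base table, and the motif `TATAWA` are illustrative assumptions that do not come from the slides.

```python
import re

# Assumption: degenerate positions are written with standard IUPAC codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def find_exact(seq, motif):
    """Exact string model: start positions where the motif occurs verbatim."""
    n = len(motif)
    return [i for i in range(len(seq) - n + 1) if seq[i:i + n] == motif]

def find_with_mismatches(seq, motif, max_mismatches=1):
    """String mismatches model: tolerate up to max_mismatches differing positions."""
    n = len(motif)
    hits = []
    for i in range(len(seq) - n + 1):
        window = seq[i:i + n]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits

def find_degenerate(seq, consensus):
    """Degenerate string (consensus) model: several bases allowed per position."""
    pattern = "".join(IUPAC[code] for code in consensus)
    # A lookahead is used so that overlapping matches are reported too.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]

seq = "GGTATAAAGGTATATAGG"  # hypothetical toy sequence
print(find_exact(seq, "TATAAA"))            # [2]
print(find_with_mismatches(seq, "TATAAA"))  # [2, 10]
print(find_degenerate(seq, "TATAWA"))       # [2, 10]
```

Positions are 0-based. The exact model misses the second site (`TATATA`), which the other two models recover.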
Promoter Prediction Servers

URL: http://www.cbs.dtu.dk/services/Promoter/
Procedure
• 1. Specify the input sequences
  - The sequences intended for processing can be input in either of the following two ways:
  - Paste a single sequence (just the nucleotides) or a number of sequences in
FASTA format into the upper window of the main server page.
  - Select a FASTA file on your local disk, either by typing the file name into the
lower window or by browsing the disk.
  - The allowed input alphabet is A, C, G, T and X (unknown); all other symbols
are converted to X before processing.

• 2. Select the output format
  - Click on the "Full output" button if you want the input sequences to be
included in the server output. The default output format shows the predictions
only.
• 3. Submit the job
  - Click on the "Submit" button.
  - The output will be ready for review shortly.
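The input alphabet rule from step 1 can be mimicked locally before submitting a sequence. This is a minimal Python sketch of such a pre-check, not the server's own code: the function names are hypothetical, and uppercasing lowercase bases and dropping whitespace are assumptions the slides do not state.

```python
def sanitize_sequence(raw):
    """Replace every symbol outside A, C, G, T, X with X.

    Assumptions: lowercase letters are uppercased first and
    whitespace is dropped rather than converted.
    """
    allowed = set("ACGTX")
    return "".join(
        ch if ch in allowed else "X"
        for ch in raw.upper()
        if not ch.isspace()
    )

def read_fasta(text):
    """Parse FASTA text (or a bare sequence) into (name, sequence) pairs."""
    records = []
    name, chunks = None, []

    def flush():
        # Store the record collected so far, if there is one.
        if name is not None or chunks:
            records.append((name or "unnamed", sanitize_sequence("".join(chunks))))

    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            flush()
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    flush()
    return records

print(read_fasta(">seq1 test\nACGTN\nacg-t\n"))
# [('seq1 test', 'ACGTXACGXT')]
```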
Output method
For each input sequence, the name and length are printed first, followed by a table
with the columns Position, Score and Likelihood:
• 'Position' is a position in the sequence.
• 'Score' is the prediction score for a transcription start site occurring within 100
base pairs upstream from that position.
• 'Likelihood' is a descriptive label associated with that score.

The scores are always positive numbers; they are labelled as follows:
• below 0.5: ignored
• 0.5 - 0.8: Marginal prediction
• 0.8 - 1.0: Medium likely prediction
• above 1.0: Highly likely prediction
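The score thresholds can be expressed as a small lookup function. This is a minimal Python sketch with a hypothetical function name. The slides do not say which label applies exactly at 0.8, so this sketch assumes it falls in the medium band; a score of exactly 1.0 is treated as medium because only scores above 1.0 are listed as highly likely.

```python
def likelihood_label(score):
    """Map a prediction score to its descriptive label.

    Returns None for scores below 0.5, which are ignored.
    Assumption: a score of exactly 0.8 counts as medium.
    """
    if score < 0.5:
        return None
    if score < 0.8:
        return "Marginal prediction"
    if score <= 1.0:
        return "Medium likely prediction"
    return "Highly likely prediction"

for s in (0.3, 0.6, 0.9, 1.063):
    print(s, likelihood_label(s))
```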

The input sequence will be included in the output, preceding the predictions, if
"Full output" has been selected.
EXAMPLE INPUT
• INPUT SEQUENCE:
>gi_209811_gb_J01917_ADRCG Adenovirus type 2, complete genome.
CATCATCATAATATACCTTATTTTGGATTGAAGCCAATATGATAATGAGGGGGTGGAGTTTGTGACGTGGCGCGGGGC
GTGGGAACGGGGCGGGTGACGTAGTAGTGTGGCGGAAGTGTGATGTTGCAAGTGTGGCGGAACACATGTAAGCGC
CGGATGTGGTAAAAGTGACGTTTTTGGTGTGCGCCGGTGTATACGGGAAGTGACAATTTTCGCGCGGTTTTAGGCG
GATGTTGTAGTAAATTTGGGCGTAACCAAGTAATGTTTGGCCATTTTCGCGGGAAAACTGAATAAGAGGAAGTGAAA
TCTGAATAATTCTGTGTTACTCATAGCGCGTAATATTTGTCTAGGGCCGCGGGGCTTTGACCGTTTACGTGGAGACTC
GCCCAGGTGTTTTTCTCAGGTGTTTTCCGCGTTCCGGGTCAAAGTTGGCGTTTTATTATTATAGTCAGCTGACGCGCA
GTGTATTTATACCCGGTGAGTTCCTCAAGAGGCCACTCTTGAGTGCCAGCGAGTAGAGTTTTCTCCTCCGAGCCGCTC
CGACACCGGGACTGAAAATGAGACATATTATCTGCCACGGAGGTGTTATTACCGAAGAAATGGCCGCCAGTCTTTTG
GACCAGCTGATCGAAGAGGTACTGGCTGATAATCTTCCACCTCCTAGCCATTTTGAACCACCTACCCTTCACGAACTG
TATGATTTAGACGTGACGGCCCCCGAAGATCCCAACGAGGAGGCGGTTTCGCAGATTTTTCCCGAGTCTGTAATGTT
GGCGGTGCAGGAAGGGATTGACTTATTCACTTTTCCGCCGGCGCCCGGTTCTCCGGAGCCGCCTCACCTTTCCCGG
CAGCCCGAGCAGCCGGAGCAGAGAGCCTTGGGTCCGGTTTCTATGCCAAACCTTGTGCCGGAGGTGATCGATCTTA
CCTGCCACGAGGCTGGCTTTCCACCCAGTGACGAC
GAGGATGAAGAGGGTGAGGAGTTTGTGTTAGATTATGTGGAGCACCCCGGGCACGGTTGCAGGTCTTGTCATTATC
ACCGGAGGAATACGGGGGACCCAGATATTATGTGTTCGCTTTGCTATATGAGGACCTGTGGCATGTTTGTCTACAGTA
AGTGAAAATTATGGGCAGTCGGTGAT
AGAGTGGTGGGTTTGGTGTGGTAATTTTTTTTTAATTTTTACAGTTTTGTGGTTTAAAGA
EXAMPLE OUTPUT
gi_209811_gb_J01917_ADRCG Adenovirus type 2, complete genome., 1200 nucleotides

Position  Score  Likelihood
600       1.063  Highly likely prediction

NOTE: The input sequence will precede the output if "Full output" has been selected.
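Output of this shape can be read back into a program for further filtering. This is a minimal Python sketch with a hypothetical function name. It assumes each prediction row is a whitespace-separated line of the form "position score label", as in the example, and it skips every other line.

```python
def parse_predictions(output_text):
    """Extract (position, score, likelihood) tuples from the prediction table."""
    predictions = []
    for line in output_text.splitlines():
        # Split into at most three fields so the label keeps its spaces.
        fields = line.split(None, 2)
        if len(fields) != 3 or not fields[0].isdigit():
            continue  # header, title, or blank line
        try:
            score = float(fields[1])
        except ValueError:
            continue
        predictions.append((int(fields[0]), score, fields[2].strip()))
    return predictions

example = """gi_209811_gb_J01917_ADRCG Adenovirus type 2, complete
genome., 1200 nucleotides

Position Score Likelihood

600 1.063 Highly likely prediction
"""
print(parse_predictions(example))
# [(600, 1.063, 'Highly likely prediction')]
```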
