Pattern Matching: Rhys Price Jones Anne R. Haake
Motif
multiple uses of the word
Def: a pattern; typically used to refer to a
short (up to ten bases or residues) repeated
or conserved pattern in nucleic acids or
proteins
Def: a short conserved sequence in a protein;
usually associated with function
in a broader sense, motif is used for all localized
regions of homology, regardless of size
Restriction Sites
Why identify them?
Exact or inexact matches?
Examples:
Restriction sites
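Exact matching of a restriction site can be sketched as a plain substring search. This is a minimal illustration, not a method from the slides: the function name `find_sites` and the example sequence are made up, and the site used is the EcoRI recognition sequence GAATTC.

```python
def find_sites(dna, site):
    """Return the 0-based start of every exact occurrence of site in dna."""
    dna, site = dna.upper(), site.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        # Advance by one base so that overlapping occurrences are also reported.
        start = dna.find(site, start + 1)
    return positions


ECORI = "GAATTC"  # EcoRI recognition sequence
example = "ttGAATTCaaacGAATTCgg"  # hypothetical sequence, for illustration only
print(find_sites(example, ECORI))  # [2, 12]
```

Because a restriction enzyme recognizes one specific sequence, no mismatches are tolerated here; the approximate cases later in the deck need a different approach.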
Splice Sites
Splice donor and splice acceptor sites
are consensus sequences
Consensus: a statistical determination of the
pattern; it approximates the pattern
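One simple way to picture a consensus as a statistical summary is to take the most frequent base in each column of a set of aligned sites. The sketch below assumes the sites are already aligned and of equal length; the function name `consensus` and the five example sites are invented for illustration.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent base in each column of aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("need at least one sequence, all of equal length")
    result = []
    for column in zip(*aligned):
        # most_common(1) gives the (base, count) pair with the highest count.
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


# Hypothetical aligned sites, each differing from the others at one position.
sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATATT"]
print(consensus(sites))  # TATAAT
```

Note that only one of the five example sites equals the consensus exactly, which is why a consensus approximates the pattern rather than describing every real site.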
Splice Sites
Remember that they are consensus sequences
Why are splice sites of interest?
Gene finding
Mutations in the consensus sequences at splice junctions
are common in many inherited disorders
Ex: thalassemias, muscular dystrophy, Tay-Sachs,
neurofibromatosis, Darier's disease
One of the thalassemias: mutation at splice acceptor
YYYNCAG| normal
YYYNCGG| mutant
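The acceptor pattern above uses ambiguity codes (Y = C or T, N = any base), which translate directly into a regular expression. The following is a minimal sketch: the helper `iupac_to_regex` and the two test sequences are assumptions made for illustration, and only the codes used in these slides are included in the table.

```python
import re

# Subset of the standard IUPAC nucleotide codes, enough for these slides.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Compile an IUPAC-coded consensus into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in pattern.upper()))


acceptor = iupac_to_regex("YYYNCAG")

normal = "CTTACAG"  # hypothetical sequence fitting YYYNCAG
mutant = "CTTACGG"  # same sequence with the A changed to G
print(bool(acceptor.fullmatch(normal)))  # True
print(bool(acceptor.fullmatch(mutant)))  # False
```

A single base change in the invariant AG is enough to make the site stop matching the consensus, which mirrors the normal and mutant lines above.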
Promoters
Prokaryotic promoters: Consensus sequences
-35 box: TTGACA
spacer: 17 ± 1 bases
-10 box: TATAAT
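Assuming the prokaryotic consensus is read as a -35 box (TTGACA) and a -10 box (TATAAT) separated by 17 ± 1 bases, a search for exact copies of both boxes with a variable spacer could be sketched as below. The name `find_promoters` and the test sequence are hypothetical, and real promoters rarely match both boxes perfectly, so this is an idealized case.

```python
import re

# -35 box, a spacer of 16 to 18 bases of any kind, then the -10 box.
PROMOTER = re.compile(r"TTGACA[ACGT]{16,18}TATAAT")


def find_promoters(dna):
    """Return (start, end) pairs for non-overlapping consensus promoter hits."""
    return [(m.start(), m.end()) for m in PROMOTER.finditer(dna.upper())]


# Hypothetical sequence: both boxes separated by a 17-base spacer.
example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "CC"
print(find_promoters(example))  # [(2, 31)]
```

The `{16,18}` quantifier is what encodes the ± 1 tolerance in the spacer; a spacer outside that range produces no hit.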
Eukaryotic promoters
TATA box at -25 relative to the transcriptional start site
consensus is 5'-TATAWAW-3' (W = A or T)
Initiator sequence (Inr)
consensus is 5'-YYCARR-3' (Y is C or T; R is G or A)
the +1 nucleotide (start) is usually the A of the Inr sequence
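As a rough sketch of how the Inr consensus might be used, the code below finds every YYCARR match and reports the position of its A as a candidate +1 nucleotide. The function name `candidate_starts` and the test sequence are invented for illustration; a six-base degenerate pattern matches often by chance, so hits are candidates only.

```python
import re

# YYCARR written out: Y = [CT], R = [AG]. The lookahead lets overlapping
# matches be reported, since it consumes no characters itself.
INR = re.compile(r"(?=([CT][CT]CA[AG][AG]))")

A_OFFSET = 3  # the A is the fourth base of YYCARR


def candidate_starts(dna):
    """Return 0-based positions of the A in every Inr consensus match."""
    return [m.start() + A_OFFSET for m in INR.finditer(dna.upper())]


example = "AATCCAGGTT"  # hypothetical sequence containing TCCAGG
print(candidate_starts(example))  # [5]
```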
Approximate matches
Most of the other examples of pattern matching in
bioinformatics
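One of the simplest forms of approximate matching allows a limited number of substitutions between the pattern and each window of the sequence (a Hamming-distance scan). The sketch below assumes substitutions only, with no insertions or deletions; the function name `approx_matches` and the example sequence are illustrative.

```python
def approx_matches(dna, pattern, max_mismatches):
    """Return (position, window, mismatches) for windows close to pattern."""
    dna, pattern = dna.upper(), pattern.upper()
    hits = []
    for i in range(len(dna) - len(pattern) + 1):
        window = dna[i:i + len(pattern)]
        # Count positions where the window and the pattern disagree.
        mismatches = sum(a != b for a, b in zip(window, pattern))
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits


example = "CCTATAATGGTACAATCC"  # hypothetical sequence
print(approx_matches(example, "TATAAT", 1))
# [(2, 'TATAAT', 0), (10, 'TACAAT', 1)]
```

Setting `max_mismatches` to 0 reduces this to exact matching; handling gaps as well would require an alignment method rather than a sliding window.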
Mutations
Point mutations usually arise from a nucleotide
mismatch that becomes fixed during
replication
The mismatch escapes the DNA repair mechanism
Evolutionary Considerations
Over time, mutations tend to be preserved
if they are not deleterious
Functionally important sequences tend to be
conserved
Non-functional or non-coding sequences
diverge at a high rate
Evolutionary Considerations
The tendency of functionally important
sequences to remain relatively unchanged
over time is the basis for sequence analysis
Allows us to draw evolutionary connections among
genes that are related in sequence
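A very simple measure of how related two sequences are is the percentage of positions at which they agree. The sketch below assumes the two sequences are already aligned, of equal length, and contain no gaps; the function name `percent_identity` and the example sequences are made up for illustration.

```python
def percent_identity(a, b):
    """Percentage of aligned positions at which two sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


# Hypothetical pair differing at one of six positions.
print(round(percent_identity("TATAAT", "TACAAT"), 1))  # 83.3
```

Conserved, functionally important regions would be expected to score high by such a measure, while diverged non-functional regions would score lower.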