01-Intro To Sequence


Administrivia

What this course is about


Assumed knowledge and catch-up lecture
Labs
Course website
READ THE COURSE OUTLINE

Introduction to sequence analysis



BINF3010/9010

Topics (next few weeks)



Overview
Storing sequence data
Comparing a sequence with another: dotplots and alignments
Comparing a sequence with many others: similarity searching
Comparing many sequences with many others: multiple sequence alignment and family representations. Molecular phylogeny
Genome project informatics

Sequence analysis

Representation is key to understanding
In sequence analysis, macromolecules are represented as strings

Protein: QTELATKAGVKQQSIQLIEAGVTK
DNA: TATACAAGAAAGTTTGTACT

Nucleotide sequences

DNA: 4 bases: A, G, C, T
RNA: 4 bases: A, G, C, U
Ambiguity codes:

N = A or G or C or T or U (also = X)
S (Strong) = G or C, W (Weak) = A or T/U
R (puRine) = G or A, Y (pYrimidine) = C or T/U
M (aMino) = A or C, K (Keto) = G or T/U
B = not A, D = not C, H = not G, V = not T/U
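The ambiguity codes above can be held as a lookup table. The following is a minimal sketch, not a standard library API: the names `IUPAC` and `matches` are made up for illustration, and U is folded into T (an assumption) so that one table serves both DNA and RNA.

```python
# Ambiguity codes from the list above, each mapped to the bases it allows.
# Assumption: U is treated as T so the same table covers DNA and RNA.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "N": "ACGT", "X": "ACGT",
    "S": "CG", "W": "AT",
    "R": "AG", "Y": "CT",
    "M": "AC", "K": "GT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}


def matches(pattern: str, sequence: str) -> bool:
    """Return True if each symbol in sequence is allowed by the code
    at the same position in pattern (sequences must be equal length)."""
    if len(pattern) != len(sequence):
        return False
    return all(
        set(IUPAC[s]) <= set(IUPAC[p])
        for p, s in zip(pattern.upper(), sequence.upper())
    )


print(matches("GANTC", "GATTC"))  # N accepts any base
print(matches("RY", "CA"))        # C is not a purine
```

Storing each code as a string of allowed bases keeps the table readable; the subset test means an ambiguous symbol in the sequence only matches a pattern code that covers all of its possibilities.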

Nucleotide sequences

The two strands, each written 5' to 3': 5'-GATCCAGA-3' and 5'-TCTGGATC-3'
Sequence: 5'-GATCCAGA-3'
Reverse: 3'-AGACCTAG-5'
Complement: 3'-CTAGGTCT-5'
Reverse-complement: 5'-TCTGGATC-3'
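The three operations on GATCCAGA can be sketched in a few lines. This is an illustrative sketch: the function names are made up, and it handles only unambiguous DNA bases (A, C, G, T), leaving any other character unchanged.

```python
# Translation table for unambiguous DNA bases, upper and lower case.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq: str) -> str:
    """Same strand read backwards (3' to 5')."""
    return seq[::-1]


def complement(seq: str) -> str:
    """Opposite strand, position by position (reads 3' to 5')."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Opposite strand written in the conventional 5' to 3' direction."""
    return complement(seq)[::-1]


seq = "GATCCAGA"
print(reverse(seq))             # AGACCTAG
print(complement(seq))          # CTAGGTCT
print(reverse_complement(seq))  # TCTGGATC
```

The reverse-complement is the one normally stored or searched, because it is the other strand written in the usual 5' to 3' direction.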

Amino acid sequences



20 characters

Small: G (Gly), A (Ala)
Polar: S (Ser), T (Thr)
Hydrophobic: L (Leu), I (Ile), V (Val), M (Met)
Aromatic: F (Phe), Y (Tyr), W (Trp)
Acidic: D (Asp), E (Glu)
Amines: N (Asn), Q (Gln)
Basic: K (Lys), R (Arg), H (His)
Cyclic: P (Pro)
Sulphur-containing: C (Cys)

Sequence written from N terminal to C terminal
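The grouping above can be turned into a lookup from residue to property class. This is a sketch of the slide's grouping only; other classifications exist, and the names `GROUPS`, `RESIDUE_GROUP` and `group_counts` are made up for illustration.

```python
from collections import Counter

# Property groups exactly as listed above (one group per residue).
GROUPS = {
    "small": "GA",
    "polar": "ST",
    "hydrophobic": "LIVM",
    "aromatic": "FYW",
    "acidic": "DE",
    "amine": "NQ",
    "basic": "KRH",
    "cyclic": "P",
    "sulphur-containing": "C",
}

# Invert the table: one-letter code -> group name.
RESIDUE_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def group_counts(protein: str) -> Counter:
    """Count residues per property group, reading N terminal to C terminal.
    Characters outside the 20 standard codes are counted as 'unknown'."""
    return Counter(RESIDUE_GROUP.get(aa, "unknown") for aa in protein.upper())


# Example protein string from the earlier slide.
print(group_counts("QTELATKAGVKQQSIQLIEAGVTK"))
```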


Sequencing project management

Sequence analysis: overview



Sequence entry

Sequence database browsing
Manual sequence entry

Nucleotide sequence file

Nucleotide sequence analysis

Search for protein coding regions (coding / non-coding)
Search databases for similar sequences
Sequence comparison
Search for known motifs
RNA structure prediction
Design further experiments: restriction mapping, PCR planning

Translate into protein

Protein sequence file

Protein sequence analysis

Search databases for similar sequences
Sequence comparison
Search for known motifs
Predict secondary structure
Predict tertiary structure
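The "Translate into protein" step in the overview can be sketched as follows. This is a minimal illustration assuming the standard genetic code and a single forward reading frame; the name `translate` and its `frame` parameter are made up here, and a real search for coding regions would examine all six frames.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
# '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate one forward reading frame (0, 1 or 2).
    Codons containing ambiguity codes become 'X'; a trailing partial
    codon is ignored."""
    seq = dna.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# Example DNA string from the earlier slide.
print(translate("TATACAAGAAAGTTTGTACT"))  # YTRKFV
```

Frames on the opposite strand can be obtained by translating the reverse-complement of the sequence.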

Multiple sequence analysis


Create a multiple sequence alignment

Edit the alignment

Format the alignment for publication

Molecular phylogeny

Protein family analysis
