
Genome Organization

Mardalisa, B.Sc., M.Si


Viral genomes
Viral genomes: ssRNA, dsRNA, ssDNA, or dsDNA; linear or circular

Viruses with RNA genomes:


•Almost all plant viruses and some bacterial and animal viruses
•Genomes are rather small (a few thousand nucleotides)
Viruses with DNA genomes (e.g. lambda = 48,502 bp):
•Often a circular genome.
Replicative form of viral genomes
•all ssRNA viruses produce dsRNA molecules
•many linear DNA molecules become circular
Molecular weight and contour length:
• Duplex length per base pair = 3.4 Å (0.34 nm)
• Molecular weight per base pair = ~660 Da
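The two constants above are enough to estimate the contour length and molecular weight of any duplex genome. A minimal sketch, using the lambda genome size quoted earlier (48,502 bp); the function names are illustrative only:

```python
ANGSTROM_PER_BP = 3.4   # duplex length per base pair (from the slide)
DALTON_PER_BP = 660     # approximate mass of one base pair (from the slide)


def contour_length_um(n_bp: int) -> float:
    """Contour length of a duplex of n_bp base pairs, in micrometres."""
    return n_bp * ANGSTROM_PER_BP * 1e-4  # 1 Å = 1e-4 µm


def molecular_weight_da(n_bp: int) -> int:
    """Approximate molecular weight of a duplex of n_bp base pairs, in daltons."""
    return n_bp * DALTON_PER_BP


lambda_bp = 48_502
print(f"lambda contour length: {contour_length_um(lambda_bp):.2f} µm")
print(f"lambda molecular weight: {molecular_weight_da(lambda_bp):.3g} Da")
```

For lambda this works out to roughly 16.5 µm and about 3.2 x 10^7 Da.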
Procaryotic genomes
• Generally 1 circular chromosome (dsDNA)
• Usually without introns
• Relatively high gene density (~2500 genes per mm of E. coli DNA)
• Contour length of E. coli genome: ~1.6 mm (4.6 x 10^6 bp x 0.34 nm per bp)
• Often indigenous plasmids are present
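A gene density given per millimetre of DNA can be converted into an average spacing in base pairs, which is often easier to picture. A rough sketch, assuming 0.34 nm per base pair; the helper names are made up for this example:

```python
NM_PER_BP = 0.34  # 3.4 Å of duplex length per base pair


def bp_per_mm() -> float:
    """Number of base pairs in 1 mm of duplex DNA (1 mm = 1e6 nm)."""
    return 1e6 / NM_PER_BP


def avg_bp_per_gene(genes_per_mm: float) -> float:
    """Average stretch of DNA per gene, in base pairs."""
    return bp_per_mm() / genes_per_mm


# E. coli density quoted on the slide: ~2500 genes per mm
print(f"1 mm of DNA = {bp_per_mm():,.0f} bp")
print(f"E. coli: one gene per ~{avg_bp_per_gene(2500):,.0f} bp")
```

At ~2500 genes per mm this gives about one gene every 1,200 bp, so little room is left for non-coding DNA.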
Plasmids
[Figure: plasmid map showing the origin of replication (ori), a β-lactamase gene, and an inserted foreign gene]
Extrachromosomal circular DNAs
• Found in bacteria, yeast and other fungi
• Size varies from ~3,000 bp to 100,000 bp
• Replicate autonomously (origin of replication)
• May contain resistance genes
• May be transferred from one bacterium to another
• May be transferred across kingdoms
• Multicopy plasmids (up to ~400 plasmids per cell)
• Low copy plasmids (1–2 copies per cell)
• Plasmids may be incompatible with each other
• Are used as vectors that can carry a foreign gene of interest (e.g. insulin)
Eukaryotic genome
• Moderately repetitive
  • Functional (protein coding, tRNA coding)
  • Unknown function
    • SINEs (short interspersed elements)
      • 200-300 bp
      • 100,000 copies
    • LINEs (long interspersed elements)
      • 1-5 kb
      • 10-10,000 copies
Eukaryotic genome
• Highly repetitive
  • Minisatellites
    • Repeats of 14-500 bp
    • 1-5 kb long
    • Scattered throughout genome
  • Microsatellites
    • Repeats up to 13 bp
    • 100s of kb long, 10^6 copies
    • Around centromere
  • Telomeres
    • Short repeats (6 bp)
    • 250-1,000 at ends of chromosomes
Eucaryotic genomes
• Located on several chromosomes
• Relatively low gene density (50 genes per mm of DNA in humans)
• Contour length of DNA from a single human cell = 2 meters
• Approximately 10^11 cells = total length 2 x 10^11 m (2 x 10^8 km)
• For comparison: distance between sun and earth = 1.5 x 10^8 km
• Human chromosomes vary in length over a 25-fold range
• Cells carry organelle genomes as well
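The total-length comparison is a short piece of unit arithmetic: 2 m per cell times 10^11 cells is 2 x 10^11 m, which is 2 x 10^8 km. A small sketch using only the figures quoted on the slide (the cell count is the slide's round number, not a measured value):

```python
DNA_PER_CELL_M = 2.0   # contour length of DNA in one human cell, in metres
CELLS = 1e11           # cell count used on the slide
SUN_EARTH_KM = 1.5e8   # distance between sun and earth, in km

total_m = DNA_PER_CELL_M * CELLS   # total DNA length in metres
total_km = total_m / 1000          # convert metres to kilometres
ratio = total_km / SUN_EARTH_KM    # how many sun-earth distances

print(f"total DNA length: {total_m:.1e} m = {total_km:.1e} km")
print(f"that is about {ratio:.1f} times the sun-earth distance")
```

With these inputs the DNA of one body would span the sun-earth distance a little more than once.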
Mitochondrial genome (mtDNA)
• Multiple identical circular chromosomes
• Size ~15 kb in animals
• Size ~200 kb to 2,500 kb in plants
• Over 95% of mitochondrial proteins are encoded in the nuclear genome.
• Often A+T rich genomes.
• mtDNA is replicated before or during mitosis
Chloroplast genome (cpDNA)
• Multiple circular molecules
• Size ranges from 120 kb to 160 kb
• Similar to mtDNA
• Many chloroplast proteins are encoded in the nucleus (with a separate signal sequence)
“Cellular” Genomes
[Figure: genomes by type of organism. Viruses: viral genome inside a capsid. Procaryotes: bacterial chromosome and plasmids. Eucaryotes: chromosomes in the nucleus (nuclear genome), mitochondrial genome, chloroplast genome.]
Genome: all of an organism’s genes plus intergenic DNA
Intergenic DNA = DNA between genes
Estimated genome sizes
[Figure: chart of genome size ranges for mammals, plants, fungi, bacteria (>100), mitochondria (~100) and viruses (1024); x-axis on a log scale from 1e1 to 1e12]
Size in nucleotides. Number in ( ) = completely sequenced genomes
Size of genomes

Organism              Genome size (bp)
Epstein-Barr virus    0.172 x 10^6
E. coli               4.6 x 10^6
S. cerevisiae         12.1 x 10^6
C. elegans            95.5 x 10^6
A. thaliana           117 x 10^6
D. melanogaster       180 x 10^6
H. sapiens            3200 x 10^6
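The sizes in this table can be turned into contour lengths with the 3.4 Å per base pair figure given earlier in the deck. A minimal sketch; the dictionary simply restates the table, and the function name is illustrative:

```python
NM_PER_BP = 0.34  # 3.4 Å of duplex length per base pair

GENOME_SIZES_BP = {
    "Epstein-Barr virus": 172_000,
    "E. coli": 4_600_000,
    "S. cerevisiae": 12_100_000,
    "C. elegans": 95_500_000,
    "A. thaliana": 117_000_000,
    "D. melanogaster": 180_000_000,
    "H. sapiens": 3_200_000_000,
}


def contour_length_mm(n_bp: int) -> float:
    """Contour length of n_bp base pairs of duplex DNA, in millimetres."""
    return n_bp * NM_PER_BP * 1e-6  # 1 nm = 1e-6 mm


for organism, size in GENOME_SIZES_BP.items():
    fold = size / GENOME_SIZES_BP["E. coli"]
    print(f"{organism:20s} {contour_length_mm(size):10.3f} mm  ({fold:7.2f} x E. coli)")
```

The human value comes out at about 1.09 m for one (haploid) genome copy, so two copies per cell are in line with the 2 meters quoted for a single human cell.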
Chromosome organization
Eucaryotic chromosome
[Figure: linear chromosome with a telomere at each end and a centromere between the p-arm and the q-arm]

Centromere:
• DNA sequence that serves as an attachment site for proteins during mitosis.
• In yeast these sequences (~130 nt) are very A+T rich.
• In higher eucaryotes centromeres are much longer and contain “satellite DNA”.
Telomeres:
• At the end of chromosomes; help stabilize the chromosome
• In yeast telomeres are ~ 100 bp long (imperfect repeats)
• Repeats are added by a specific telomerase

5’ – (TxGy)n
3’ – (AxCy)n
x and y = 1 - 4; n = 20 to 100 (1500 in mammals)
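The repeat formula can be made concrete by generating both strands for chosen values of x, y and n. A small sketch; `telomere_repeat` is a hypothetical helper, and the example values (x = 2, y = 4, i.e. TTGGGG, the Tetrahymena repeat) are picked only for illustration:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def telomere_repeat(x: int, y: int, n: int) -> tuple[str, str]:
    """Build n copies of the TxGy unit and the paired AxCy strand.

    The first string reads 5'->3'; the second is its complement written
    3'->5', so the two strings line up base by base.
    """
    if not (1 <= x <= 4 and 1 <= y <= 4):
        raise ValueError("x and y must be between 1 and 4")
    top = ("T" * x + "G" * y) * n
    bottom = top.translate(COMPLEMENT)
    return top, bottom


top, bottom = telomere_repeat(2, 4, 3)
print("5'-" + top + "-3'")
print("3'-" + bottom + "-5'")
```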
Gene classification
[Figure: simplified chromosome divided into intergenic regions, coding genes and non-coding genes]
• Coding genes → messenger RNA → proteins (structural proteins, enzymes)
• Non-coding genes → structural RNA (transfer RNA, ribosomal RNA, other RNA)
What is a gene?
• Definitions
1. Classical definition: Portion of DNA that determines a single character (phenotype)
2. One gene – one enzyme (Beadle & Tatum 1941): “Every gene encodes the information for one enzyme”
3. One gene – one protein: “One gene contains information for one protein” (structural proteins included); also stated as one gene – one polypeptide
4. Current definition: A piece of DNA (or in some cases RNA) that contains the primary sequence to produce a functional biological gene product (RNA, protein).
Coding region
Nucleotides (open reading frame) encoding the amino acid sequence of a protein

The molecular definition of a gene includes more than just the coding region
Noncoding regions
• Regulatory regions
  • RNA polymerase binding site
  • Transcription factor binding sites
• Introns
• Polyadenylation [poly(A)] sites
Gene

Molecular definition:
Entire nucleic acid sequence necessary for the synthesis of a functional polypeptide (protein chain) or functional RNA
Anatomy of a gene
• ORF: from start codon (ATG) to stop codon (TGA, TAA, TAG)
• Upstream region with binding sites (e.g. TATA box)
• Poly(A) tail
• Splice sites: introns begin with GT and end with AG
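The ORF rule (ATG up to the first in-frame TGA, TAA or TAG) can be sketched as a simple scan over the three forward reading frames. This is a toy illustration, not a gene finder: it ignores the reverse strand and introns, and `find_orfs` and `min_codons` are names invented for this example.

```python
STOP_CODONS = {"TGA", "TAA", "TAG"}


def find_orfs(dna: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) positions of ORFs on the forward strand.

    start is the index of the A in ATG; end is exclusive and includes
    the stop codon. min_codons counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens the ORF in this frame
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG after this stop
    return sorted(orfs)


seq = "CCATGAAATTTTAGGG"
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```

On the made-up sequence above the scan reports one ORF, ATGAAATTTTAG, at positions 2 to 14.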
Bacterial genes
• Most do not have introns
• Many are organized in operons: contiguous genes, transcribed as a single polycistronic mRNA, that encode proteins with related functions

[Figure: bacterial operon; the polycistronic mRNA encodes several proteins]

What would be the effect of a mutation in the control region (a) compared to a mutation in a structural gene (b)?
Eucaryotic genes

Hemoglobin beta subunit gene

Region      Length
Exon 1      90 bp
Intron A    131 bp
Exon 2      222 bp
Intron B    851 bp
Exon 3      126 bp

Splicing

Introns: intervening sequences within a gene that are not translated into a protein sequence. Collagen has 50 introns.
Exons: sequences within a gene that encode protein sequences
Splicing: removal of introns from the primary RNA transcript (pre-mRNA) to produce the mature mRNA.
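Splicing can be pictured as cutting the transcript at the exon and intron boundaries and joining only the exons. A minimal sketch using the hemoglobin beta segment lengths listed above; the placeholder sequence ("E" for exon bases, "i" for intron bases) and the `splice` helper are assumptions made for illustration, not real sequence data:

```python
SEGMENTS = [
    ("Exon 1", 90),
    ("Intron A", 131),
    ("Exon 2", 222),
    ("Intron B", 851),
    ("Exon 3", 126),
]


def splice(transcript: str, segments: list[tuple[str, int]]) -> str:
    """Join the exon segments of a transcript, dropping the introns."""
    pieces = []
    pos = 0
    for name, length in segments:
        if name.startswith("Exon"):
            pieces.append(transcript[pos:pos + length])
        pos += length  # advance past this segment either way
    return "".join(pieces)


# Placeholder transcript: one letter per base, marking exon vs intron
primary = "".join(
    ("E" if name.startswith("Exon") else "i") * length
    for name, length in SEGMENTS
)
mature = splice(primary, SEGMENTS)

print(f"primary transcript: {len(primary)} nt")
print(f"after splicing:     {len(mature)} nt = {len(mature) // 3} codons")
```

The three exons add up to 438 nt, i.e. 146 codons if the listed exon lengths count coding sequence only (an assumption here).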
Regulatory mechanisms
• ‘Organize expression of genes’ (function calls)
• Promoter region (binding site), usually near coding region
• Binding can block (inhibit) expression
• Computational challenges
  • Identify binding sites
  • Correlate sequence to expression
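The simplest approach to "identify binding sites" is to scan a sequence for a known motif. A rough sketch using a regular expression for a TATA-box-like pattern; the pattern TATA[AT]A[AT] is a commonly quoted consensus used here as an assumption, and real binding-site prediction usually relies on weighted models rather than an exact pattern:

```python
import re

# Lookahead so that overlapping matches are also reported
TATA_BOX = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_sites(dna: str, pattern: re.Pattern = TATA_BOX) -> list[tuple[int, str]]:
    """Return (position, matched sequence) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


promoter = "GGCTATAAAAGGC"  # made-up example sequence
for pos, site in find_sites(promoter):
    print(pos, site)
```

Correlating sequence to expression would need expression measurements on top of such hits, which this sketch does not attempt.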
THANK YOU
