Biotechnology MSc 2017, Lecture 1 (MG)


1. Genomics in general, main methods, and their comparison

2. Application areas of second-generation sequencing

3. Antimicrobial peptides
Genomics is the study of the genomes of organisms.

Milestones:
- Full sequence of the φX174 bacteriophage (5,386 bp), 1977, Frederick Sanger
- First free-living organism to be sequenced: Haemophilus influenzae (1.8 Mb), 1995, Hamilton Smith
- Shotgun technique, 1998, Celera Genomics
- Fruit fly (Drosophila melanogaster), 2000
- Human, 2001 (3.3 Gb)
- Human in good quality, 2007 (less than one error in 10,000 bases and all chromosomes assembled)
- Today: more than 2000 prokaryotic genomes, more than 3500 viruses and around 200 eukaryotic genomes are fully sequenced
Genomics

Functional genomics
Personal genomics

Metagenomics

Pharmacogenomics
Psychogenomics

Nutrigenomics
Nitrogenomics
Hydrogenomics etc.
Functional genomics

Related to transcriptomics and proteomics

Only 1.5 % of the human genome encodes proteins (ca. 20 000 genes)
Pharmacogenomics
Metagenomics
Techniques applied in Genomics

General features:
- High-throughput
- Generate huge amounts of data
- Data evaluation is often the main challenge

1. Microarray techniques

2. Sequencing techniques
1. Microarray techniques

A microarray is an arrayed series of thousands of spots of DNA oligonucleotides, called probes. Probes can be short sections of genes that hybridize to a cDNA sample (called the target) under high-stringency conditions. Probe-target hybridization is usually detected and quantified through fluorophore- or chemiluminescence-labeled targets, which gives the relative abundance of nucleic acid sequences in the target.
Microarray experiment
Weaknesses of microarrays

- arrays require prior knowledge of the sequences
- similar sequences are almost impossible to distinguish on arrays (cross-hybridization, specificity issues)
- array reproducibility fluctuates widely

Ioannidis et al; Nat Genet 41 (2009)


2. Sequencing techniques

1. Traditional sequencing: non-high-throughput


1.1. Maxam-Gilbert method
1.2. Chain termination method (Sanger method): key principle is the use of
dideoxynucleotide triphosphates (ddNTPs) as DNA chain terminators
1.3. Shotgun sequencing (Sanger chemistry)
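
The chain termination principle can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: every position of the newly synthesized strand is assumed to be terminated by a ddNTP in at least one molecule, and the function names `chain_termination_fragments` and `read_gel` are made up for illustration. Sorting the terminated fragments by length, as electrophoresis does, recovers the sequence.

```python
def chain_termination_fragments(new_strand):
    """Return, per ddNTP reaction, the lengths of the terminated fragments."""
    fragments = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # A ddNTP of this base can stop synthesis here, which gives a
        # fragment that is `position` nucleotides long.
        fragments[base].append(position)
    return fragments


def read_gel(fragments):
    """Read the sequence by ordering all fragments from shortest to longest."""
    lanes = [(length, base)
             for base, lengths in fragments.items()
             for length in lengths]
    return "".join(base for _, base in sorted(lanes))


synthesized = "GATTACA"  # hypothetical newly synthesized strand
lanes = chain_termination_fragments(synthesized)
print(lanes)
print(read_gel(lanes))
```

Each fragment length occurs in exactly one of the four reactions, which is why the order of the bands gives the base order.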

[Figures: chain termination method; shotgun sequencing]


2. High-throughput sequencing: NGS (next generation sequencing)

2.1. Second generation sequencing (pyrosequencing, sequencing by ligation), this is the present
454 Life Sciences: pyrosequencing
Illumina: sequencing by synthesis
ABi SOLiD: sequencing by ligation

2.2. Third generation sequencing (nanopore sequencing, single-molecule sequencing), this is the future
Pacific Biosciences
Helicos
Complete Genomics
Etc.
Main characteristics of sequencing generations

Sanger sequencing
Long read, low coverage
Use in: de novo sequencing, validation
Need for library preparation in a host
• Labour- and time-intensive, expensive
• Toxic regions are not represented
• Host genome contaminations
Low throughput
• Strand synthesis and base determination are separated
• Need for electrophoretic step
• High unit cost (cost/bp)

No competition, but complementation

NGS (Next Generation Sequencing)
Short read, huge coverage (especially SOLiD and Illumina)
Use in: resequencing, SNP analysis, RNA-Seq
No need for library preparation in a host
• Immobilized template fragments, PCR methods
• Labour-, time- and cost-effective
High throughput
• Several million sequencing reads per run
• Synthesis and sequencing are not separated
Illumina platform

Illumina HiSeq
454 FLX
ABi SOLiD platform
SOLiD V 1.0: October 2007
SOLiD V 2.0: June 2008
SOLiD V 4: August 2010
SOLiD 5500: July 2011
Comparison of NGS technologies

Illumina
454 FLX Technology

Emulsion PCR workflow:
1. Mix the adapter-carrying DNA library with capture beads (limiting dilution)
2. Add PCR reagents and emulsion oil to create a “water-in-oil” emulsion (micro-reactors)
3. Perform emulsion PCR
4. “Break” the micro-reactors
5. Isolate the DNA-containing beads

• Generation of millions of clonally amplified sequencing templates on each bead
• No cloning and colony picking
Loading the PicoTiter™Plate:
• Load beads into the PicoTiter™Plate (44 μm wells)
• Load enzyme beads
• Centrifuge step

Sequencing by synthesis (pyrosequencing) in the PicoTiterPlate wells:
• Reagent flow: addition of one of the four dNTPs in each step
• Template: ssDNA
• Enzymes needed: DNA polymerase, ATP sulfurylase, luciferase, apyrase
• Photons generated are captured by a camera, and a sequencing image is created
454 FLX Technology: Basecalling
• Count the photons generated for each “flow”
• Base call using signal thresholds
• Delivery of one nucleotide per flow ensures accurate base calling

[Figure: flowgram. Flow order: T, A, C, G. Key sequence: TCAG. Signal levels correspond to 1-mer, 2-mer, 3-mer and 4-mer. The flowgram measures the presence or absence of each nucleotide at any given position.]
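
Threshold-based base calling from a flowgram can be sketched as follows. This is a simplified illustration, not the vendor's algorithm: it assumes signals are already normalized so that 1.0 corresponds to one incorporated base, it uses the T, A, C, G flow order and the TCAG key from the slide, and the signal values and the function name `call_bases` are invented for the example.

```python
FLOW_ORDER = "TACG"  # flow order as shown on the slide
KEY = "TCAG"         # key sequence at the start of every read


def call_bases(flow_signals, flow_order=FLOW_ORDER):
    """Convert normalized light signals (one per flow) into a base sequence."""
    read = []
    for index, signal in enumerate(flow_signals):
        nucleotide = flow_order[index % len(flow_order)]
        # Thresholds at 0.5, 1.5, 2.5, ...: the signal is rounded to the
        # nearest whole number of incorporated bases (0 means "absent").
        count = int(max(signal, 0.0) + 0.5)
        read.append(nucleotide * count)
    return "".join(read)


# Hypothetical normalized signals for three rounds of T, A, C, G flows.
signals = [1.04, 0.03, 0.98, 0.06,
           0.02, 1.01, 0.05, 0.95,
           2.10, 0.04, 0.00, 3.05]

read = call_bases(signals)
print(read)                  # TCAGTTGGG
print(read.startswith(KEY))  # True: the key was read correctly
```

The higher the true count, the smaller the relative gap between neighbouring signal levels, which is why long homopolymers are the weak point of this approach.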
Summary of 454 FLX
• Read length: 400-600 bases
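• Throughput: 750 Mb/slide/run
(average bacterial genome size: 5 Mb, so this gives 150x coverage on a 5 Mb bacterial genome)

• Homopolymer problem
(light intensity is proportional to the number of identical bases incorporated in one flow, so long homopolymer runs are hard to count exactly)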
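
The coverage figures quoted in the platform summaries follow from dividing the throughput per run by the genome size. A minimal sketch of that arithmetic, using the throughput values and the 5 Mb bacterial genome given in these slides (the function name `coverage` is only illustrative):

```python
def coverage(throughput_bases, genome_size_bases):
    """Average coverage = sequenced bases per run / genome size."""
    return throughput_bases / genome_size_bases


GENOME_SIZE = 5_000_000  # 5 Mb bacterial genome, as used in the slides

# Throughput per run in bases, as quoted in the platform summaries.
throughput = {
    "454 FLX": 750_000_000,
    "Illumina": 200_000_000_000,
    "SOLiD": 200_000_000_000,
    "Ion Torrent PGM": 1_000_000_000,
}

for platform, bases in throughput.items():
    print(f"{platform}: {coverage(bases, GENOME_SIZE):,.0f}x")
```

This is an average only: real coverage varies along the genome, so some regions are covered more deeply than others.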
Illumina Technology

Steps 1-6
DNA fragmentation
Adaptor ligation
Template amplification
Cluster generation
Cluster generation is performed on the Illumina cBot. Single DNA fragments are attached to the flow cell by hybridizing to oligos on its surface that are complementary to the ligated adaptors. The DNA molecules are then amplified by so-called bridge amplification, which results in hundreds of millions of unique clusters. Finally, the reverse strands are cleaved and washed away and the sequencing primer is hybridized to the DNA templates.
Steps 7-12

Base determination (sequencing by synthesis, differently labeled nucleotides, laser excitation, fluorescence detection)

Base imaging

Multiple cycles
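
The cycle-by-cycle base determination can be pictured as choosing, for every cluster and every cycle, the nucleotide whose fluorescence channel is brightest. The sketch below is a deliberately naive illustration with invented intensity values. It does not claim to reproduce the instrument's base caller, which in practice also has to correct the raw signals before calling.

```python
BASES = "ACGT"


def call_cycle(intensities):
    """Return the base whose fluorescence channel is brightest in one cycle."""
    return max(BASES, key=lambda base: intensities[base])


def call_read(cycles):
    """Call one base per cycle for a single cluster."""
    return "".join(call_cycle(intensities) for intensities in cycles)


# Hypothetical intensities of one cluster over four imaging cycles.
cluster_cycles = [
    {"A": 0.91, "C": 0.05, "G": 0.03, "T": 0.08},
    {"A": 0.04, "C": 0.02, "G": 0.88, "T": 0.10},
    {"A": 0.06, "C": 0.93, "G": 0.07, "T": 0.02},
    {"A": 0.03, "C": 0.04, "G": 0.05, "T": 0.95},
]

print(call_read(cluster_cycles))  # AGCT
```

Because exactly one labeled nucleotide is incorporated per cycle, a run of identical bases is read over several cycles, which is why there is no homopolymer issue.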
Illumina Technology: Summary

• Read length: 50-200 bases
• Throughput: 200 Gb/flowcell/run
(40,000x coverage on a 5 Mb bacterial genome)
• High accuracy (no homopolymer issue)

[Figure: flow cell]
SOLiD V4 (5500) System

Summary:
• Read length: 50-75 bases
• Throughput: 200 Gb/slide/run
(40,000x coverage on a 5 Mb bacterial genome)
• Highest accuracy (no homopolymer issue, two-base encoding)
SOLiD™ Chemistry
Properties of the Probes
• Probes are octamers: X X n n n z z z (n = degenerate bases, z = universal bases)
• Cleavage site is between the 5th and 6th base
• The fluorescent dye interrogates the bases at the 1st + 2nd position

[Figure: blue probe; colour-code matrix of the 1st base (A, C, G, T) against the 2nd base (A, C, G, T)]
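
Two-base encoding can be sketched in a few lines. The colour matrix itself did not survive in these slides, so the assignment below is an assumption: it follows the commonly published SOLiD scheme, in which the colour of a dinucleotide equals the XOR of the 2-bit codes of its two bases (A=0, C=1, G=2, T=3). The function names are illustrative only.

```python
BASES = "ACGT"
# Assumed colour names for codes 0-3 (commonly published SOLiD scheme).
COLOURS = ["blue", "green", "yellow", "red"]


def colour_of(first, second):
    """Colour code (0-3) of one dinucleotide."""
    return BASES.index(first) ^ BASES.index(second)


def encode(sequence):
    """Translate a base sequence into colour space (overlapping dinucleotides)."""
    return [colour_of(a, b) for a, b in zip(sequence, sequence[1:])]


def decode(first_base, colours):
    """Translate colours back to bases; the first base must be known."""
    sequence = [first_base]
    for colour in colours:
        sequence.append(BASES[BASES.index(sequence[-1]) ^ colour])
    return "".join(sequence)


colours = encode("ATGGC")
print(colours)                        # [3, 1, 0, 3]
print([COLOURS[c] for c in colours])  # ['red', 'green', 'blue', 'red']
print(decode("A", colours))           # ATGGC
```

Since every base is interrogated by two overlapping dinucleotides, a true single-base variant changes two adjacent colours, while an isolated colour change points to a measurement error.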
Multiplexing

Barcoding
• 5 additional bases on the P2 adapter
• Template layout: P1* - Target DNA - P2*, read as a sequencing read (35 bases) plus a BC (barcode) read (5 bases)
• 96 barcodes available

Multiplex Analysis
Libraries → ePCR → Enrichment → Deposition
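
After a multiplexed run, reads are sorted back to their samples by their barcode read. A minimal sketch of that demultiplexing step is given below. The barcode sequences, sample names and the function `demultiplex` are hypothetical and serve only to illustrate the idea of a 5-base barcode read accompanying each sequencing read.

```python
from collections import defaultdict

# Hypothetical 5-base barcodes; a real kit offers 96 of them.
BARCODES = {
    "ACGTA": "sample_1",
    "TGCAT": "sample_2",
}


def demultiplex(read_pairs, barcodes=BARCODES):
    """Group sequencing reads by the sample that their barcode read identifies."""
    bins = defaultdict(list)
    for sequence, barcode_read in read_pairs:
        # Reads with an unknown barcode are kept apart, not discarded silently.
        sample = barcodes.get(barcode_read, "unassigned")
        bins[sample].append(sequence)
    return dict(bins)


# (sequencing read, barcode read) pairs; sequences shortened for readability.
pairs = [
    ("GATTACAGATTACA", "ACGTA"),
    ("CCCGGGTTTAAACC", "TGCAT"),
    ("TTGACCATGGTCAA", "ACGTA"),
    ("AAAACCCCGGGGTT", "NNNNN"),
]

for sample, reads in demultiplex(pairs).items():
    print(sample, len(reads))
```

This sketch accepts only exact barcode matches. Tolerating sequencing errors in the barcode read would need an extra matching step.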
Ion Torrent platform
semiconductor chip technology

2011
Ion Torrent PGM

Ion Torrent PGM specifications:

• Read length: 100-400 bases
• Throughput: 1000 Mb/chip/run (various chips available)
(200x coverage on a 5 Mb bacterial genome)
• 97% accuracy
• Faster (1 day), cheaper
Ion Torrent: How does it work?
semiconductor chip technology

1. When a nucleotide is incorporated into a strand of DNA by a polymerase, a hydrogen ion is released as a byproduct.

2. Each well holds a different DNA template. Beneath the wells is an ion-sensitive layer and beneath that a proprietary ion sensor.

3. If a nucleotide, for example a C, is added to a DNA template and is then incorporated into a strand of DNA, a hydrogen ion will be released. The charge from that ion will change the pH of the solution, which can be detected by a proprietary ion sensor. The PGM - essentially a solid-state pH meter - will call the base, going directly from chemical information to digital information.

4. The PGM sequencer then sequentially floods the chip with one nucleotide after another. If the next nucleotide that floods the chip is not a match, no voltage change will be recorded and no base will be called.

5. If there are two identical bases on the DNA strand, the voltage will be double, and the chip will record two identical bases called. Because this is direct detection - no scanning, no cameras, no light - each nucleotide incorporation is recorded in seconds.
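
The flooding logic described above can be simulated for a single well. This is a sketch under stated assumptions: the flow order T, A, C, G is assumed for illustration, the signal is expressed in relative units (1 = one incorporated base, 2 = double voltage), and the strand is given as the sequence being synthesized, not as its template. The function names are invented.

```python
def simulate_flows(synthesized_strand, flow_order="TACG", max_flows=100):
    """Return (nucleotide, relative voltage) for each flow over one well."""
    signals = []
    position = 0
    flow = 0
    while position < len(synthesized_strand) and flow < max_flows:
        nucleotide = flow_order[flow % len(flow_order)]
        incorporated = 0
        # Every matching base in a row is incorporated during the same flow,
        # so two identical bases give twice the voltage change.
        while (position < len(synthesized_strand)
               and synthesized_strand[position] == nucleotide):
            incorporated += 1
            position += 1
        signals.append((nucleotide, incorporated))  # 0 = no match, no base called
        flow += 1
    return signals


def call_from_signals(signals):
    """Turn the recorded voltage steps back into a base sequence."""
    return "".join(nucleotide * count for nucleotide, count in signals)


signals = simulate_flows("CCAT")
print(signals)
print(call_from_signals(signals))  # CCAT
```

For the strand CCAT, the third flow (C) yields a double signal, and the flows that do not match yield zero, exactly as in steps 4 and 5.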
