Evolution of Sequencing
Workshop in Applied Phylogenetics
March 9, 2014
Bodega Bay Marine Lab
Jonathan A. Eisen
UC Davis Genome Center

Slides for Jonathan Eisen talk at UC Davis Bodega Bay Workshop in Applied Phylogenetics

Review Papers

Mardis ER. Next-generation sequencing platforms. Annu Rev Anal Chem 2013;6:287-303. doi: 10.1146/annurev-anchem-062012-092628.

Review Papers

Next-Generation DNA
Sequencing Methods
Elaine R. Mardis
Departments of Genetics and Molecular Microbiology and Genome Sequencing Center,
Washington University School of Medicine, St. Louis MO 63108; email: emardis@wustl.edu

Annu. Rev. Genomics Hum. Genet. 2008. 9:387-402
First published online as a Review in Advance on
June 24, 2008
The Annual Review of Genomics and Human Genetics
is online at genom.annualreviews.org

Open Access Papers of Interest

http://www.microbialinformaticsj.com/content/2/1/3/
http://www.hindawi.com/journals/bmri/2012/251364/abs/
http://m.cancerpreventionresearch.aacrjournals.org/content/5/7/887.full

Sequencing Technology Timeline

1953: Discovery of DNA structure (Cold Spring Harb. Symp. Quant. Biol. 1953;18:123-31)
1977: Sanger sequencing method by F. Sanger (PNAS, 1977, 74: 5463-5467)
1983: PCR by K. Mullis (Cold Spring Harb. Symp. Quant. Biol. 1986;51 Pt 1:263-73)
Human Genome Project (Nature, 2001, 409: 860-921; Science, 2001, 291: 1304-1351)

Approaching NGS
1993: Development of pyrosequencing (Anal. Biochem., 1993, 208: 171-175; Science, 1998, 281: 363-365)
1998: Single molecule emulsion PCR
1998: Founded Solexa
2000: Founded 454 Life Sciences
2005: 454 GS20 sequencer (first NGS sequencer)
2006: Solexa Genome Analyzer (first short-read NGS sequencer)
2006: Illumina acquires Solexa (Illumina enters the NGS business)
2007: ABI SOLiD (short-read sequencer based upon ligation)
2007: Roche acquires 454 Life Sciences (Roche enters the NGS business)
2008: GS FLX sequencer (NGS with 400-500 bp read length)
2008: NGS human genome sequencing (first human genome sequencing based upon NGS technology)
2010: HiSeq 2000 (200 Gbp per flow cell)
2010 onward: MiSeq, Roche Jr, Ion Torrent, PacBio, Oxford

From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Generation I: Manual Sequencing

Maxam-Gilbert Sequencing

http://www.pnas.org/content/74/2/560.full.pdf

Sanger Sequencing of PhiX174

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC431765/

Sanger Sequencing

Nobel Prize 1980: Berg, Gilbert, Sanger

Generation II: Automated Sanger

Sanger method with labeled dNTPs

Automation of Sanger Part I

The Sanger method is based on the idea that inhibitors can terminate elongation of DNA at specific points
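
The chain-termination idea can be sketched as a toy simulation. This is only an illustration under stated assumptions, not a model of the real chemistry: the function names, the termination probability, the number of molecules, and the example strand are all hypothetical. Each simulated molecule is extended base by base and may stop at any position; sorting the stopped fragments by length and reading the last base of each recovers the strand.

```python
import random

def chain_termination_fragments(strand, n_molecules=2000, p_terminate=0.1, seed=0):
    """Simulate extension products of the newly synthesized strand.

    Each fragment is recorded as (length, terminating base).
    All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    fragments = set()
    for _ in range(n_molecules):
        for position, base in enumerate(strand, start=1):
            # At each position a terminating inhibitor may be incorporated
            # instead of the normal nucleotide, which stops elongation.
            if rng.random() < p_terminate:
                fragments.add((position, base))
                break
    return fragments

def read_sequence(fragments):
    """Order fragments by length (as a gel or capillary would) and read the terminal bases."""
    return "".join(base for _, base in sorted(fragments))

strand = "GATTACAGATTACA"  # hypothetical example strand
fragments = chain_termination_fragments(strand)
print(read_sequence(fragments))
```

With enough molecules, every position is represented by at least one terminated fragment, so the read matches the strand; with too few molecules, long fragments go missing and the read is cut short.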

Automation of Sanger Part II

Automated Sanger Highlights

1991: ESTs by Venter


1995: Haemophilus influenzae genome
1996: Yeast, archaeal genomes
1999: Drosophila genome
2000: Arabidopsis genome
2000: Human genome
2004: Shotgun metagenomics

Generation III: Clusters not Clones

Generation III = NextGen

NextGen Sequencing Outline

Next-generation sequencing platforms: Illumina GAII, Roche 454, ABI SOLiD, Helicos HeliScope

Workflow stages:
Isolation and purification of target DNA
Sample preparation
Library validation
Amplification: emulsion PCR, or cluster generation on solid phase
Sequencing: pyrosequencing; sequencing by ligation; sequencing by synthesis with 3'-blocked reversible terminators; sequencing by synthesis with 3'-unblocked reversible terminators
Imaging: four colour imaging, or single colour imaging
Data analysis

From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

NextGen #1: Roche 454

Pyrosequencing
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Roche 454 Workflow

From http://acb.qfab.org/acb/ws09/presentations/Day1_DMiller.pdf
http://www.slideshare.net/AGRF_Ltd/ngs-technologies-platforms-and-applications

Roche 454 Step 1: Libraries

a. DNA library preparation (4.5 hours)
gDNA is fragmented by nebulization or sonication.
Fragments are end-repaired and ligated to adaptors (A and B) containing universal priming sites.
Fragments are denatured, and AB ssDNA are selected by avidin/biotin purification (ssDNA library).
No cloning; no colony picking.
Figure flow: gDNA, ligation, selection (isolate AB fragments only), sstDNA library.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.

Roche 454 Step 2: Emulsion PCR

b. Emulsion PCR (8 hours)
Anneal sstDNA to an excess of DNA capture beads.
Emulsify beads and PCR reagents in water-in-oil microreactors.
Clonal amplification occurs inside microreactors.
Break microreactors and enrich for DNA-positive beads.
Figure flow: sstDNA library, then bead-amplified sstDNA library.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.
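
One way to see why the template is annealed to an excess of beads is a small occupancy calculation. The sketch below assumes that templates land on beads independently, so the number of templates per bead follows a Poisson distribution; that model, the function name, and the example ratios are assumptions made for illustration and are not taken from the slide.

```python
import math

def bead_occupancy(templates_per_bead):
    """Return (P(empty), P(exactly one template), P(two or more)) under a Poisson model."""
    lam = templates_per_bead
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multi = 1.0 - p_empty - p_single
    return p_empty, p_single, p_multi

for ratio in (0.1, 0.5, 1.0, 2.0):
    p_empty, p_single, p_multi = bead_occupancy(ratio)
    # Fraction of DNA-positive beads that carry a single (clonal) template
    clonal = p_single / (p_single + p_multi)
    print(f"{ratio:>4} templates/bead: empty={p_empty:.2f} "
          f"single={p_single:.2f} mixed={p_multi:.2f} clonal={clonal:.2f}")
```

Under this model, a low template-to-bead ratio leaves most beads empty, which is why an enrichment step for DNA-positive beads follows, but the beads that do carry DNA are mostly clonal.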

Roche 454 Step 3: Pyrosequencing

c. Sequencing (7.5 hours)
Well diameter: average of 44 µm.
400,000 reads obtained in parallel.
A single cloned amplified sstDNA bead is deposited per well.
Figure flow: amplified sstDNA library beads, then quality filtered bases.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.

Roche 454 Step 3: Pyrosequencing

Pyrosequencing
Well diameter: 44 µm
Reads are recorded as flowgrams

Annu. Rev. Genomics Hum. Genet., 2008, 9: 387-402
Nature Reviews Genetics, 2010, 11: 31-46
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Roche 454 Key Issues


Number of repeated nucleotides (homopolymer length) is estimated from the amount of light, which leads to many errors
A fair number of failures in emPCR and other steps
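
The homopolymer issue can be illustrated with a minimal flowgram caller. This is a sketch under assumptions, not the vendor's base-calling algorithm: the cyclic flow order `TACG`, the function name, and the signal values are chosen for illustration. Each flow's light signal is rounded to the nearest whole number of incorporated nucleotides.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order, for illustration only

def call_flowgram(signals, flow_order=FLOW_ORDER):
    """Convert per-flow light signals into bases.

    Each signal is rounded to the nearest whole number of nucleotides
    incorporated during that flow.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        count = max(0, round(signal))
        bases.append(nucleotide * count)
    return "".join(bases)

# Short runs are easy to call: signals sit close to 0, 1 or 2.
clean = [1.02, 0.05, 0.97, 2.04, 0.03, 1.01, 0.00, 0.98]
print(call_flowgram(clean))

# A run of six A's whose signal came out slightly low is called as five.
noisy_homopolymer = [0.0, 5.4]
print(call_flowgram(noisy_homopolymer))
```

Because the same absolute noise matters more as the run gets longer, the difference between five and six identical bases is much harder to resolve than the difference between zero and one, which produces insertion and deletion errors in homopolymers.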


Roche 454 Evolution

http://www.slideshare.net/AGRF_Ltd/ngs-technologies-platforms-and-applications

NextGen #2: Solexa Illumina

Sequencing by synthesis with reversible terminator
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

NextGen #2: Illumina Accessories

Instrumentation: Bioanalyzer 2100, Cluster station, Paired-end module, Genome Analyzer IIx, Linux server
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Illumina Sequencing Workflow

Illumina GAII
Sample preparation and library validation
Cluster station (cluster generation): wash cluster station; clusters amplification; linearization, blocking and primer hybridization
GAIIx & PE (SBS sequencing): read 1; prepare read 2; read 2
Analysis pipeline: pipeline base call; data analysis

From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

... continues for a specific number of cycles, as determined by user-defined instrument settings, which permits discrete read lengths of 25-35 ...
... read and a quality checking pipeline evaluates the Illumina data from each run, removing poor-quality sequences.

Illumina Step 1: Prep & Attach DNA

a. Prepare genomic DNA sample: randomly fragment genomic DNA and ligate adapters to both ends of the fragments.
Attach DNA to surface: bind single-stranded fragments randomly to the inside surface of the flow cell channels.
Figure labels: DNA, DNA fragment, adapters, dense lawn of primers.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.

Step 1: Sample Preparation. The DNA sample of interest is sheared to appropriate size (average ~800 bp) using a compressed air device known as a nebulizer. The ends of the DNA are polished, and two unique adapters are ligated to the fragments. Ligated fragments in the size range of 150-200 bp are isolated via gel extraction and amplified using limited cycles of PCR.

Illumina Step 2: Clusters by Bridge PCR

Bridge amplification: add unlabeled nucleotides and enzyme to initiate solid-phase bridge amplification.
Denature the double-stranded molecules.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.

Figure 2
The Illumina sequencing-by-synthesis approach. Cluster strands created by bridge amplification are primed and all four fluorescently labeled, 3'-OH blocked nucleotides are added to the flow cell with DNA polymerase. The cluster strands are extended by one nucleotide. Following the incorporation step, the unused nucleotides and DNA polymerase molecules are washed away, a scan buffer is added to the flow cell, and the optics system scans each lane of the flow cell by imaging units called tiles. Once imaging is completed, chemicals that effect cleavage of the fluorescent labels and the 3'-OH blocking groups are added to the flow cell, which prepares the cluster strands for another round of fluorescent nucleotide incorporation.

From: http://seqanswers.com/forums/showthread.php?t=21. Steps 2-6: Cluster Generation by Bridge Amplification. In contrast to the 454 and ABI methods, which use a bead-based emulsion PCR to generate "polonies", Illumina utilizes a unique "bridged" amplification reaction that occurs on the surface of the flow cell. The flow cell surface is coated with single stranded oligonucleotides that correspond to the sequences of the adapters ligated during the sample preparation stage. Single-stranded, adapter-ligated fragments are bound to the surface of the flow cell exposed to reagents for polymerase-based extension. Priming occurs as the free/distal end of a ligated fragment "bridges" to a complementary oligo on the surface. Repeated denaturation and extension results in localized amplification of single molecules in millions of unique locations across the flow cell surface. This process occurs in what is referred to as Illumina's "cluster station", an automated flow cell processor.

Clusters

Illumina Step 3: Sequencing by Synthesis

b. First chemistry cycle: determine first base. To initiate the first sequencing cycle, add all four labeled reversible terminators, primers, and DNA polymerase enzyme to the flow cell.
Image of first chemistry cycle: after laser excitation, capture the image of emitted fluorescence from each cluster on the flow cell. Record the identity of the first base for each cluster.
Before initiating the next chemistry cycle: the blocked 3' terminus and the fluorophore from each incorporated base are removed.
Sequence read over multiple chemistry cycles (GCTGA...): repeat cycles of sequencing to determine the sequence of bases in a given fragment a single base at a time.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.

From: http://seqanswers.com/forums/showthread.php?t=21. Steps 7-12: Sequencing by Synthesis. A flow cell containing millions of unique clusters is now loaded into the 1G sequencer for automated cycles of extension and imaging. The first cycle of sequencing consists first of the incorporation of a single fluorescent nucleotide, followed by high resolution imaging of the entire flow cell. These images represent the data collected for the first base. Any signal above background identifies the physical location of a cluster (or polony), and the fluorescent emission identifies which of the four bases was incorporated at that position. This cycle is repeated, one base at a time, generating a series of images each representing a single base extension at a specific cluster. Base calls are derived with an algorithm that identifies the emission color over time. At this time reports of useful Illumina reads range from 26-50 bases.
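
The per-cycle base calling described above can be sketched in a few lines. This is a deliberately naive illustration, not Illumina's base caller: the channel ordering, the function name, and the intensity values are assumptions. For one cluster, each chemistry cycle yields four intensities (one per labeled base), and the brightest channel is taken as the call.

```python
CHANNELS = "ACGT"  # assumed ordering of the four fluorescence channels

def call_cluster(intensities_per_cycle):
    """Call one base per chemistry cycle as the channel with the strongest signal."""
    read = []
    for cycle in intensities_per_cycle:
        brightest = max(range(len(CHANNELS)), key=lambda ch: cycle[ch])
        read.append(CHANNELS[brightest])
    return "".join(read)

# Hypothetical intensities (A, C, G, T) for one cluster over five cycles
cluster = [
    (0.1, 0.2, 9.0, 0.3),
    (0.2, 8.5, 0.4, 0.1),
    (0.3, 0.2, 0.1, 7.9),
    (0.2, 0.3, 8.1, 0.2),
    (9.2, 0.1, 0.3, 0.2),
]
print(call_cluster(cluster))
```

A production caller would also correct for overlap between dye channels and for strands in a cluster that fall out of step with the others, and it would attach a quality value to each call; none of that is modelled here.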

Illumina Step 3: Sequencing by Synthesis

SBS technology
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Illumina Step 3: Cycling

Image of first chemistry cycle: after laser excitation, capture the image of emitted fluorescence from each cluster on the flow cell. Record the identity of the first base for each cluster.
Before initiating the next chemistry cycle: the blocked 3' terminus and the fluorophore from each incorporated base are removed.
Sequence read over multiple chemistry cycles (GCTGA...): repeat cycles of sequencing to determine the sequence of bases in a given fragment a single base at a time.

Figure 2 (continued), from Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.

Illumina Evolution

http://www.slideshare.net/AGRF_Ltd/ngs-technologies-platforms-and-applications

MiSeq Dx

HiSeq X Ten

NextGen #3: ABI SOLiD

Sequencing by ligation
From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

ABI SOLiD Details

Figure labels (panel a): SOLiD substrate, a 1 µm bead carrying the P1 adapter and template sequence on a glass slide; di-base probes 3'TTnnnzzz5', 3'TCnnnzzz5', 3'TGnnnzzz5' and 3'TAnnnzzz5' with a cleavage site, colour-coded by 1st base and 2nd base (A, C, G, T).
Primer round 1, universal seq primer (n): 1. Prime and ligate. 2. Image. 3. Cap unextended strands. 4. Cleave off fluor. 5. Repeat steps 1-4 to extend sequence (ligation cycles 1 to 7 ... n cycles).
Primer round 2, universal seq primer (n-1), 1 base shift: 6. Primer reset (melt off extended sequence, then reset primer). 7. Repeat steps 1-5 with new primer. 8. Repeat reset with n-2, n-3, n-4 primers.

(a) The ligase-mediated sequencing approach of the Applied Biosystems SOLiD sequencer. In a manner similar to Roche/454 emulsion PCR amplification, DNA fragments for SOLiD sequencing are amplified on the surfaces of 1-µm magnetic beads to provide sufficient signal during the sequencing reactions, and are then deposited onto a flow cell slide. Ligase-mediated sequencing begins by annealing a primer to the shared adapter sequences on each amplified fragment, and then DNA ligase is provided along with specific fluorescent-labeled 8mers, whose 4th and 5th bases are encoded by the attached fluorescent group. Each ligation step is followed by fluorescence detection, after which a regeneration step removes bases from the ligated 8mer (including the fluorescent group) and concomitantly prepares the extended primer for another round of ligation. (b) Principles of two-base encoding. Because each fluorescent group on a ligated 8mer identifies a two-base combination, the resulting sequence reads can be screened for base-calling errors versus true polymorphisms versus single base deletions by aligning the individual reads to a known high-quality reference sequence.

From Mardis 2008. Annu. Rev. Genomics Hum. Genet. 9: 387.
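
The two-base encoding principle can be sketched as follows. This is an illustration under assumptions: the numeric colour labels (0 to 3) and the function names are not from the slide, and the scheme is written as an XOR over a 2-bit code for each base, which reproduces the usual grouping of dinucleotides into four colours. Decoding needs one known starting base, taken here from the adapter.

```python
BASES = "ACGT"  # A=0, C=1, G=2, T=3 (2-bit codes, assumed for illustration)

def encode_colors(sequence):
    """Return one colour (0-3) per pair of adjacent bases."""
    idx = [BASES.index(b) for b in sequence]
    return [a ^ b for a, b in zip(idx, idx[1:])]

def decode_colors(first_base, colors):
    """Rebuild the base sequence from a known first base and the colour calls."""
    current = BASES.index(first_base)
    seq = [first_base]
    for color in colors:
        current ^= color
        seq.append(BASES[current])
    return "".join(seq)

def color_mismatches(read_colors, ref_colors):
    """Positions where a read's colours differ from the reference colours."""
    return [i for i, (r, q) in enumerate(zip(ref_colors, read_colors)) if r != q]

reference = "TACGGT"
ref_colors = encode_colors(reference)
print(ref_colors, decode_colors("T", ref_colors))

# A true single-base polymorphism changes two adjacent colours.
snp_colors = encode_colors("TATGGT")
print(color_mismatches(snp_colors, ref_colors))

# A single miscalled colour changes only one colour position.
error_colors = ref_colors[:2] + [0] + ref_colors[3:]
print(color_mismatches(error_colors, ref_colors), decode_colors("T", error_colors))
```

Comparing reads to the reference in colour space is what separates the two cases: an isolated colour mismatch is more likely a measurement error, while two adjacent mismatches consistent with one base change suggest a real polymorphism. Decoding a read that contains a colour error directly into bases corrupts every base downstream of the error.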

ABI SOLiD Evolution

http://www.slideshare.net/AGRF_Ltd/ngs-technologies-platforms-and-applications

Complete Genomics

Figure 3. Schematic of Complete Genomics' DNB array generation and cPAL technology. (A) Design of sequencing fragments, subsequent DNB synthesis, and dimensions of the patterned nanoarray used to localize DNBs illustrate the DNB array formation. (B) Illustration of sequencing with a set of common probes corresponding to 5 bases from the distinct adapter site. Both standard and extended anchor schemes are shown. Reprinted with permission. Copyright XXXX American Association for the Advancement of Science.

From Niedringhaus et al. Analytical Chemistry 83: 4327. 2011.

Comparison in 2008

                       Roche (454)      Illumina           SOLiD
Chemistry              Pyrosequencing   Polymerase-based   Ligation-based
Amplification          Emulsion PCR     Bridge Amp         Emulsion PCR
Paired ends/sep        Yes/3 kb         Yes/200 bp         Yes/3 kb
Mb/run                 100 Mb           1300 Mb            3000 Mb
Time/run               7 h              4 days             5 days
Read length            250 bp           32-40 bp           35 bp
Cost per run (total)   $8439            $8950              $17447
Cost per Mb            $84.39           $5.97              $5.81

From Introduction to Next Generation Sequencing by Stefan Bekiranov prometheus.cshl.org/twiki/pub/Main/CdAtA08/CSHL_nextgen.ppt
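
The "Cost per Mb" row is simply cost per run divided by yield per run, which the short sketch below recomputes from the 2008 table values. The dictionary layout and function name are illustrative. The Roche (454) figure reproduces exactly; the recomputed Illumina and SOLiD values differ somewhat from the table, which suggests the source used slightly different yields or rounding for those columns.

```python
# Values copied from the 2008 comparison table (US dollars, megabases)
runs = {
    "Roche (454)": {"cost_per_run": 8439, "mb_per_run": 100},
    "Illumina": {"cost_per_run": 8950, "mb_per_run": 1300},
    "SOLiD": {"cost_per_run": 17447, "mb_per_run": 3000},
}

def cost_per_mb(cost_per_run, mb_per_run):
    """Cost per megabase = total cost of one run / megabases produced by that run."""
    return cost_per_run / mb_per_run

for platform, run in runs.items():
    print(f"{platform}: ${cost_per_mb(**run):.2f} per Mb")
```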

Comparison in 2012

Bells and Whistles

Multiplexing

From http://www.illumina.com/technology/multiplexing_sequencing_assay.ilmn
http://res.illumina.com/documents/products/datasheets/datasheet_sequencing_multiplex.pdf

Small Amounts of DNA

http://www.epibio.com/docs/default-source/protocols/nextera-dna-sample-prep-kit-(illumina--compatible).pdf?sfvrsn=4

Capture Methods

High throughput sample preparation
RainDance: microdroplet PCR. Reported 84% capture efficiency.
Roche Nimblegen: solid-phase capture with custom-designed oligonucleotide microarray. Reported 65-90% capture efficiency.

From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing
Nature Methods, 2010, 7: 111-118

Capture Methods

High throughput sample preparation
Agilent SureSelect: solution-phase capture with streptavidin-coated magnetic beads. Reported 60-80% capture efficiency.

From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Illumina Paired Ends

Paired-end technology
Paired-end sequencing runs on the GA and uses reagents from the PE module to perform cluster amplification of the reverse strand.

From Slideshare presentation of Cosentino Cristian
http://www.slideshare.net/cosentia/high-throughput-equencing

Moleculo

1. DNA
2. Large fragments
3. Isolate and amplify
4. Sublibrary w/ unique barcodes (CACC, GGAA, TCTC, ACGT, AAGG, GATC, AAAA)
5. Sequence w/ Illumina
6. Assemble seqs w/ same codes
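
The last step, collecting reads that share a barcode before assembling them, can be sketched as below. This is a minimal illustration, not the Moleculo pipeline: it assumes the barcode is the first four bases of each read, and the function name and example reads are hypothetical. The barcodes themselves are the ones shown on the slide.

```python
from collections import defaultdict

def group_reads_by_barcode(reads, barcode_length=4):
    """Split each read into (barcode, insert) and collect inserts per barcode.

    Assumes the barcode sits at the start of the read, which is an
    illustrative simplification.
    """
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        groups[barcode].append(insert)
    return dict(groups)

# Hypothetical short reads tagged with barcodes from the slide
reads = [
    "CACCGATTACAGG",
    "GGAATTTTCCGGA",
    "CACCACAGGTTCA",
    "TCTCGGCATCGAT",
]
for barcode, inserts in group_reads_by_barcode(reads).items():
    print(barcode, inserts)
```

Each barcode group corresponds to one large original fragment, so the short reads in a group can be assembled on their own, which is a much smaller problem than assembling the whole sample at once.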



Generation III+: Faster w/ Clusters

Ion Torrent PGM

Applied Biosystems Ion Torrent PGM

Workflow similar to that for Roche/454 systems. Not surprising, since it was invented by people from 454.

Ion Torrent pH Based Sequencing

Mardis ER. Next-generation sequencing platforms. Annu Rev Anal Chem 2013;6:287-303.

Ion Torrent Evolution

Generation IV: Single Molecule

Single Molecule I: Helicos

3rd Generation Sequencing

Single Molecule II: Pacific Biosciences

Mardis ER. Next-generation sequencing platforms. Annu Rev Anal Chem 2013;6:287-303.

Figure 2. Schematic of PacBio's real-time single molecule sequencing. (A) The side view of a single ZMW nanostructure containing a single DNA polymerase (Φ29) bound to the bottom glass surface. The ZMW and the confocal imaging system allow fluorescence detection only at the bottom surface of each ZMW. (B) Representation of fluorescently labeled nucleotide substrate incorporation on to a sequencing template. The corresponding temporal fluorescence detection with respect to each of the five incorporation steps is shown below. Reprinted with permission from ref 39. Copyright 2009 American Association for the Advancement of Science.

From Niedringhaus et al. Analytical Chemistry 83: 4327. 2011.

Why Finish Genomes?

The Value of Finished Bacterial Genomes
Microbial Genetics Using SMRT Sequencing

Why Are Finished Genomes So Important?
When Sanger sequencing was the only available sequencing technique, it was expensive but not unusual to improve genome drafts until they were good enough to be considered finished. With the availability of short-read sequencing technologies, draft genomes became cheap and easy to produce, and the majority of researchers skipped the more labor- and time-intensive task of finishing genomes, with the realization that critical data may be missing (Figure 3). Finished genomes are crucial for understanding microbes and advancing the field of microbiology (ref. 3) because:
- Functional genomic studies demand a high-quality, complete genome sequence as a starting point
- Comparative genomics is meaningful only in terms of complete genome sequences
- Understanding genome organization provides biological insights
- Microbial forensics requires at least one complete reference genome sequence
- Finished genomes aid in microbial outbreak source identification and phylogenetic analysis
- A complete genome is a permanent scientific resource

Figure 3: History of drafted vs. finished genomes (adapted from ref. 2). Y axis: number of genomes (0 to 12000); X axis: year (1998 to 2012); series: drafted bacterial genomes and finished bacterial genomes.

Why Finish Genomes


JOURNAL OF BACTERIOLOGY, Dec. 2002, p. 6403-6405
0021-9193/02/$04.00+0 DOI: 10.1128/JB.184.23.6403-6405.2002
Copyright © 2002, American Society for Microbiology. All Rights Reserved.

Vol. 184, No. 23

DIALOG
The Value of Complete Microbial Genome Sequencing
(You Get What You Pay For)
Claire M. Fraser,* Jonathan A. Eisen, Karen E. Nelson, Ian T. Paulsen,
and Steven L. Salzberg
The Institute for Genomic Research, Rockville, Maryland 20850

http://jb.asm.org/content/184/23/6403.full


Since the publication of the complete Haemophilus influenzae genome sequence in July 1995 (4), the field of microbiology has been one of the largest beneficiaries of the breakthroughs in genomics and computational biology that made this accomplishment possible. When the 1.8-Mbp H. influenzae project began in 1994, it was not certain that the whole-genome shotgun sequencing strategy would succeed because it had never been attempted on any piece of DNA larger than an average lambda clone (~40 kbp) (9).
During the past 7 years, progress in DNA sequencing technology, the design of new vectors for library construction for use in shotgun sequencing projects, significant improvements in closure and finishing strategies, and more sophisticated and robust methods for gene finding and annotation have dramatically reduced the time required for each stage of a genome

organisms to be sampled because of the cost savings that would come from not taking each project to completion. While this strategy does achieve a cost savings, today it is only approximately 50%, and this comes at a cost in terms of the quality and utility of the finished product.
A complete genome sequence represents a finished product in which the order and accuracy of every base pair have been verified. In contrast, a draft sequence, even one of high coverage, represents a collection of contigs of various sizes, with unknown order and orientation, that contain sequencing errors and possible misassemblies. As stated by Selkov et al. in a 2000 paper on a draft sequence of Thiobacillus ferrooxidans, "It is clear that such sequencing. . .produces more errors than complete genome sequencing. . . . The current error rate is estimated to be 1 per 1,000 to 2,000 base pairs vs. 1 in 10,000 base
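
To put the quoted error rates in perspective, the short Python sketch below works out how many errors they would imply for a genome the size of the 1.8-Mbp H. influenzae project mentioned in the article. It is illustrative arithmetic only: the function name is hypothetical, and because the quotation is cut off on the slide, reading "1 in 10,000" as the rate for complete genome sequencing is an assumption.

```python
# Illustrative arithmetic only; not taken from the article.
GENOME_SIZE_BP = 1_800_000  # 1.8-Mbp H. influenzae genome, as stated in the text


def expected_errors(genome_size_bp, bases_per_error):
    """Expected number of errors if one error occurs every `bases_per_error` bases."""
    return genome_size_bp / bases_per_error


# Draft sequence: 1 error per 1,000 to 2,000 base pairs (quoted range).
draft_best_case = expected_errors(GENOME_SIZE_BP, 2_000)
draft_worst_case = expected_errors(GENOME_SIZE_BP, 1_000)

# Assumed to be the complete-genome rate: 1 error in 10,000 base pairs.
complete_case = expected_errors(GENOME_SIZE_BP, 10_000)

print(f"Draft:    {draft_best_case:.0f} to {draft_worst_case:.0f} expected errors")
print(f"Complete: {complete_case:.0f} expected errors")
```

Under these assumptions a draft of this genome would carry several hundred to well over a thousand errors, five to ten times as many as the complete sequence.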

HGAP Assembly from PacBio

HGAP Assembler for PacBio Data

http://www.pacificbiosciences.com/pdf/microbial_primer.pdf

[Figure panel labels: PacBio assembly, CDC assembly, Sanger validation, Illumina assembly]


processes.

Detecting Modified Bases

The potential benefits of detecting base modification, using SMRT sequencing, include:
Single-base resolution detection of a wide

the presence of a modified base in the DNA template (ref. 3). This is observable as an increased space between fluorescence pulses, which is called the interpulse duration (IPD), as shown in Figure 2.

Figure 2. Principle of detecting modified DNA bases during SMRT sequencing. The presence of the modified base in the DNA template (top), shown here for 6-mA, results in a delayed incorporation of the corresponding T nucleotide, i.e. longer interpulse duration (IPD), compared to a control DNA template lacking the modification (bottom) (ref. 3).
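
The comparison described in the caption (longer IPD on a modified template than on a control template) can be sketched as a per-position ratio test. The Python below is a toy illustration, not PacBio's actual analysis: the function names, the IPD values, and the threshold of 2.0 are all hypothetical.

```python
from statistics import mean


def ipd_ratios(native_ipds, control_ipds):
    """Per-position ratio of mean native IPD to mean control IPD."""
    return [mean(n) / mean(c) for n, c in zip(native_ipds, control_ipds)]


def flag_modified(ratios, threshold=2.0):
    """Indices whose IPD ratio meets a (hypothetical) threshold."""
    return [i for i, r in enumerate(ratios) if r >= threshold]


# Made-up IPD observations (seconds) for four template positions,
# two reads per position. Position 2 mimics a modified base.
native = [[0.20, 0.22], [0.21, 0.19], [1.00, 1.10], [0.18, 0.22]]
control = [[0.20, 0.20], [0.20, 0.20], [0.20, 0.22], [0.20, 0.20]]

ratios = ipd_ratios(native, control)
candidates = flag_modified(ratios)
print(candidates)
```

Only the position with a clearly delayed incorporation stands out; a real analysis would also need enough coverage per position and a statistical test rather than a fixed cutoff.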


www.pacb.com/basemod

http://www.pacificbiosciences.com/pdf/microbial_primer.pdf

Single Molecule III: Oxford Nanopores


From Oxford Nanopores Web Site

This diagram shows a protein nanopore set in an electrically resistant membrane bilayer. An ionic current is passed through the nanopore by setting a voltage across this membrane. If an analyte passes through the pore or near its aperture, this event creates a characteristic disruption in current. By measuring that current it is possible to identify the molecule in question. For example, this system can be used to distinguish the four standard DNA bases G, A, T and C, and also modified bases. It can be used to identify target proteins, small molecules, or to gain rich molecular information, for example to distinguish the enantiomers of ibuprofen or molecular binding dynamics.
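
The idea that each analyte produces a characteristic current disruption can be illustrated with a toy nearest-level classifier. This Python sketch is not Oxford Nanopore's method: the reference current levels, the trace values, and the function names are hypothetical, and real nanopore signals are considerably more complex than one level per base.

```python
# Hypothetical reference current levels (picoamps) for each base.
REFERENCE_LEVELS_PA = {"G": 30.0, "A": 40.0, "T": 50.0, "C": 60.0}


def call_base(current_pa, levels=REFERENCE_LEVELS_PA):
    """Return the base whose reference level is closest to the measurement."""
    return min(levels, key=lambda base: abs(levels[base] - current_pa))


def call_sequence(trace_pa):
    """Call one base per measured current level in the trace."""
    return "".join(call_base(value) for value in trace_pa)


# Made-up trace: one averaged current level per translocation event.
trace = [31.2, 48.7, 59.1, 41.0]
called = call_sequence(trace)
print(called)
```

A modified base could be handled in the same toy scheme by adding a fifth reference level, provided its current disruption is distinguishable from the other four.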


Single Molecule III: Oxford Nanopores

Analytical Chemistry

REVIEW

Figure 6. Biological nanopore scheme employed by Oxford Nanopore. (A) Schematic of αHL protein nanopore mutant depicting the positions of the cyclodextrin (at residue 135) and glutamines (at residue 139). (B) A detailed view of the barrel of the mutant nanopore shows the locations of the arginines (at residue 113) and the cysteines. (C) Exonuclease sequencing: A processive enzyme is attached to the top of the nanopore to cleave single nucleotides from the target DNA strand and pass them through the nanopore. (D) A residual current-vs-time signal trace from an αHL protein nanopore that shows a clear discrimination between single bases (dGMP, dTMP, dAMP, and dCMP). (E) Strand sequencing: ssDNA is threaded through a protein nanopore and individual bases are identified, as the strand remains intact. Panels A, B, and D reprinted with permission from ref 91. Copyright 2009 Nature Publishing Group. Panels C and E reprinted with permission from Oxford Nanopore Technologies (Zoe McDougall).

From Niedringhaus et al. Analytical Chemistry 83: 4327. 2011.

Single Molecule III: Oxford Nanopores
Analytical Chemistry

REVIEW

Figure 5. Nanopore DNA sequencing using electronic measurements and optical readout as detection methods. (A) In electronic nanopore schemes, signal is obtained through ionic current (ref 73), tunneling current (ref 78), and voltage difference (ref 79) measurements. Each method must produce a characteristic signal to differentiate the four DNA bases. Reprinted with permission from ref 83. Copyright 2008 Annual Reviews. (B) In the optical readout nanopore design, each nucleotide is converted to a preset oligonucleotide sequence and hybridized with labeled markers that are detected during translocation of the DNA fragment through the nanopore. Reprinted from ref 82. Copyright 2010 American Chemical Society.

From Niedringhaus et al. Analytical Chemistry 83: 4327. 2011.


would result, in theory, in detectably altered current flow through the pore. Theoretically, nanopores could also be designed to measure tunneling current across the pore as bases, each with a distinct tunneling potential, could be read. The nanopore ap

lipid bilayer, using ionic current blockage method. The author predicted that single nucleotides could be discriminated as long as: (1) each nucleotide produces a unique signal signature; (2) the nanopore possesses proper aperture geometry to accommo

Oxford Nanopores MinION

"It's kind of a cute device," says Jaffe of the MinION, which is roughly the size and shape of a packet of chewing gum. "It has pretty lights and a fan that hums pleasantly," and plugs into a USB drive. But his technical review is mixed.
From http://www.nature.com/news/data-from-pocket-sized-genome-sequencer-unveiled-1.14724
