
G3, 2021, 11(1), jkaa009


DOI: 10.1093/g3journal/jkaa009
Advance Access Publication Date: 27 November 2020
Investigation

Fast and inexpensive whole-genome sequencing library preparation from intact yeast cells
Sibylle C. Vonesch,1,* Shengdi Li,1 Chelsea Szu Tu,1,† Bianca P. Hennig,1,‡ Nikolay Dobrev,2 and Lars M. Steinmetz1,3,4,*
1 Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg 69117, Germany
2 European Molecular Biology Laboratory (EMBL), Protein Expression and Purification Facility, Heidelberg 69117, Germany
3 Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA 94304, USA
4 Department of Genetics, Stanford University School of Medicine, Palo Alto, CA 94304, USA

†Present address: Centre for Genomic Regulation (CRG), Barcelona, 08003, Spain.
‡Present address: Center for Digital Health, Berlin Institute of Health (BIH) and Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, Berlin, Germany, 10117, Berlin, Germany.

*Corresponding authors: Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstraße 1, Heidelberg 69117, Germany. vonesch@embl.de
(S.C.V.); lars.steinmetz@embl.de (L.M.S.)

Abstract
Through the increase in the capacity of sequencing machines massively parallel sequencing of thousands of samples in a single run is now possible. With the improved throughput and resulting drop in the price of sequencing, the cost and time for preparation of sequencing libraries have become the major bottleneck in large-scale experiments. Methods using a hyperactive variant of the Tn5 transposase efficiently generate libraries starting from cDNA or genomic DNA in a few hours and are highly scalable. For genome sequencing, however, the time and effort spent on genomic DNA isolation limit the practicability of sequencing large numbers of samples. Here, we describe a highly scalable method for preparing high-quality whole-genome sequencing libraries directly from Saccharomyces cerevisiae cultures in less than 3 h at 34 cents per sample. We skip the rate-limiting step of genomic DNA extraction by directly tagmenting lysed yeast spheroplasts and add a nucleosome release step prior to enrichment PCR to improve the evenness of genomic coverage. Resulting libraries do not show any GC bias and are comparable in quality to libraries processed from genomic DNA with a commercially available Tn5-based kit. We use our protocol to investigate CRISPR/Cas9 on- and off-target edits and reliably detect edited variants and shared polymorphisms between strains. Our protocol enables rapid preparation of unbiased and high-quality, sequencing-ready indexed libraries for hundreds of yeast strains in a single day at a low price. By adjusting individual steps of our workflow, we expect that our protocol can be adapted to other organisms.

Keywords: yeast; WGS; Tn5; tagmentation

Introduction

Whole-genome sequencing is a powerful tool in genomics research by providing an unbiased and comprehensive view of the genetic alterations present in a cell. Genomic information is instrumental for identifying mutations that underlie observed phenotypes and for pinpointing any collateral damage that can occur as a side product of mutagenesis. As sequencing costs continue to drop, the cost and time for preparation of sequencing libraries have become the major limiting factor for large-scale genome sequencing experiments.

The most rapid and scalable library preparation methods use a hyperactive variant of the Tn5 transposase that fragments double-stranded DNA and ligates synthetic oligonucleotide adapters required for Illumina sequencing in a 5-min reaction (Adey et al. 2010) (Illumina). While the one-step tagmentation reaction greatly simplifies library preparation workflows compared to traditional, multistep methods, and scales to the parallel processing of hundreds of samples, the cost of commercial reagents prevents its use in large-scale projects for most laboratories. We (Hennig et al. 2017) and others (Picelli et al. 2014) have previously described a robust Tn5 transposase purification strategy and accompanying library preparation protocol that allows generating sequencing libraries with comparable quality but at dramatically reduced cost compared to commercial solutions. In addition to a hyperactive Tn5 enzyme variant carrying the previously reported missense mutations E54K (Zhou and Reznikoff 1997; Zhou et al. 1998) and L372P (Weinreich et al. 1994), which increase the DNA-binding efficiency and reduce inhibitory effects on Tn5 activity, respectively, we introduced a second Tn5 construct carrying an additional amino acid substitution (R27S) in the DNA-binding domain (Hennig et al. 2017), which allows adjusting the fragment size distribution based on enzyme concentration during tagmentation.

With Tn5-based adapter insertion using homemade enzymes, genomic DNA isolation becomes the major bottleneck limiting the practicability of sequencing large numbers of genomes.

Received: September 07, 2020. Accepted: October 28, 2020


© The Author(s) 2020. Published by Oxford University Press on behalf of Genetics Society of America.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which
permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Analogous to colony PCR, where lysed bacterial cells or yeast spheroplasts are added to a PCR reaction without prior genomic DNA isolation, we hypothesized that yeast whole-genome sequencing library preparation could be simplified by applying tagmentation directly to cells. A similar strategy incorporating heat-based lysis prior to tagmentation was successfully applied for whole-genome (Adey et al. 2010) and plasmid (Hwang et al. 2019) sequencing of bacterial cells. In contrast to bacteria, the yeast Saccharomyces cerevisiae has a rigid cell wall that prevents efficient heat-based lysis. To facilitate cellular lysis the lytic enzyme Zymolyase (Kitamura and Yamamoto 1972), which actively degrades the cell walls of various yeasts, thereby exposing the fragile spheroplast, is routinely used in genomic DNA extraction (van Burik et al. 1998; Wilkening et al. 2013) and colony PCR (Fuxman Bass et al. 2016) protocols. With a molecular weight of 138 kDa the active Tn5 dimer-adapter complex does not far exceed the size of average polymerase enzymes (90 kDa), such that treatment with Zymolyase should provide sufficient access to genomic DNA to Tn5. Furthermore, a similar method is used for Tn5-based adapter insertion into accessible DNA in yeast ATAC-seq (Schep et al. 2015) protocols.

Here, we report a simplified whole-genome sequencing library preparation protocol. By skipping a conventional genomic DNA isolation step our method enables preparing sequencing-ready libraries directly from overnight yeast cultures in less than 3 h at a cost of 34 cents per sample, while scaling to the parallel processing of hundreds of samples. In place of a lengthy genomic DNA isolation step, we incubate saturated yeast cultures with Zymolyase, followed by heat inactivation to inactivate Zymolyase and lyse spheroplasts, and directly apply tagmentation to the resulting samples. A nucleosome release step prior to enrichment PCR improves the evenness of genomic coverage. Resulting libraries are comparable in quality to libraries processed from extracted genomic DNA with a commercially available Tn5-based kit (Nextera XT, Illumina). We demonstrate the simplicity and unbiased nature of the method by investigating CRISPR/Cas9 on- and off-target editing outcomes in a panel of yeast strains, reliably detecting edited variants, and shared polymorphisms between strains. Direct preparation of sequencing libraries from yeast cultures without DNA isolation enables massively scaled genome sequencing experiments that have so far been hampered by the time and effort spent on genomic DNA isolation. At 34 cents library preparation costs are approximately 5- to 10-fold lower compared to libraries prepared from gDNA extracted using the most simple and scalable kit-based methods. By adjusting steps like initial lysis to specific sample requirements, we anticipate that our protocol can be applied to other microbial cells, mammalian cells, or tissues.

Materials and methods

Yeast strain and media
We used the well-characterized yeast strain YJM789 (Wei et al. 2007), a derivative of a yeast isolated from the lungs of an AIDS patient with pneumonia, in all experiments. YJM789 contains approximately 60,000 single nucleotide polymorphisms (SNPs) with respect to the S. cerevisiae reference genome. For all experiments, we inoculated a YPAD culture directly from glycerol stock and grew it to saturation overnight. For experiments with genomic DNA, we extracted genomic DNA using Master Pure genomic DNA extraction Kit (Epicentre) following the manufacturer’s instructions. We confirmed quality by gel and quantified genomic DNA yield using the Qubit high-sensitivity DNA assay.

Cell lysis and adjustment of cell numbers
For Zymolyase treatment, cells from overnight cultures were pelleted at 1000 × g for 3 min and resuspended in 50 µl of 300 U/ml Zymolyase 100-T (AMSBIO) solution, incubated at 37 °C for 30 min followed by 10 min at 95 °C to inactivate Zymolyase and lyse spheroplasts. About 1.25 µl of the solution was used for tagmentation. Final desired cell number was adjusted prior to Zymolyase treatment by measuring OD600 of a diluted overnight culture and calculating sampling volume assuming 30 Mio cells per 1 ml OD 1 culture (based on value from https://bionumbers.hms.harvard.edu/bionumber.aspx?&id=100986&ver=3). Sampling volume was adjusted such that 1.25 µl of the 50 µl Zymolyase reaction contained the desired number of cells.

Tn5 adapter complex assembly
Tn5 was expressed and purified as previously described (Hennig et al. 2017). Tagmentation adapters were annealed as previously described (Hennig et al. 2017) and thawed on ice. As Tn5 storage buffer (20 mM Tris pH 7.4, 800 mM NaCl, 50% Glycerol) contains relatively high amounts of salts we mixed 2 µl of Tn5 protein (0.5 mg/ml) with 0.5 µl of each annealed adapter (70 µM stock) and 8 µl of 20 mM Tris pH 7.5 for adapter loading. Providing adapters in slight excess to Tn5 favors all Tn5 dimer molecules to be occupied by two adapters, shifting the equilibrium to the fully saturated Tn5-adapter complex. The mixture was incubated at 23 °C for 30–60 min at 300 rpm in a thermoshaker. For its use in tagmentation, we further diluted the complex using dilution buffer (10 mM Tris pH 7.5, 150 mM NaCl) to the desired dilution factor.

Tagmentation with homemade Tn5
Unless indicated otherwise 1.25 µl of lysed cells were mixed with 1.25 µl 1:10 diluted, adapter-loaded Tn5R27S, E54K, L372P, and 2.5 µl tagmentation buffer TB1 [10 mM MgCl2, 25% dimethylformamide (DMF) (v/v), 10 mM Tris-HCl final concentrations, adjusted to pH 7.6 using acetic acid]. Tagmentation buffer without DMF was prepared as a 2× solution and DMF was added fresh immediately before tagmentation for every experiment. Samples were incubated on a preheated thermocycler block for exactly 3 min at 55 °C after which 1.25 µl 0.2% SDS was added immediately for neutralization, and samples were incubated for 5 min at room temperature. For library amplification, we added 1.25 µl each of indexed P5 and P7 primer (10 µM), 6.75 µl of 2× KAPA HiFi Ready mix (KAPA Biosystems), and 0.75 µl of DMSO to 6.25 µl of tagmentation reaction. Salt or Proteinase K (ProK) treated samples were purified using 1.8 volumes of AMPure XP (Beckman Coulter) beads prior to PCR amplification. A gap-filling step followed by 12-cycle enrichment PCR was performed as described previously (Hennig et al. 2017) and samples were purified using 1.8 volumes AMPure XP beads and eluted in 10 µl nuclease-free water. We quantified library yield using the Qubit high-sensitivity DNA kit and evaluated library quality on an Agilent Bioanalyzer (high-sensitivity DNA assay).

Tagmentation with Nextera enzyme
Samples processed from extracted genomic DNA with Nextera XT kit (Illumina) were prepared following kit instructions with the following changes: We used 150 pg DNA as input and scaled down reaction volumes such that we used 1/4 of indicated reagents per reaction. For cell-based protocols with Nextera enzyme, tagmentation, inactivation, and library amplification were performed following kit instructions and using commercial reagents, but only using 1/4 of indicated reagents. Briefly, 1.25 µl

gDNA (100 pg/µl) or zymolyase-treated cells were mixed with 2.5 µl Tagment DNA Buffer (TD) and 1.25 µl Amplicon Tagment Mix (ATM). Samples were incubated on a preheated thermocycler block for exactly 3 min at 55 °C after which 1.25 µl Neutralize Tagment Buffer (NT) was added and samples were incubated for 5 min at room temperature. Salt or ProK treated samples were purified using 1.8 volumes of AMPure XP (Beckman Coulter) beads prior to PCR amplification. For enrichment PCR, we added 3.75 µl Nextera PCR Master Mix (NPM) and 1.25 µl each of Index N and Index S primers, and ran the NXT PCR program described in the kit instructions. Amplified libraries were purified using 1.8 volumes AMPure XP beads and eluted in 10 µl nuclease-free water. We quantified library yield using the Qubit high-sensitivity DNA kit and evaluated library quality on an Agilent Bioanalyzer (high-sensitivity DNA assay).

Nucleosome dissociation
We tested two nucleosome dissociation methods. For salt treatment, 6.25 µl of tagmented library was incubated with 3.75 µl of 3 M NaOAc (final 1.125 M) at room temperature for 1 h. For ProK treatment 6.25 µl of tagmented library was incubated with 1.75 µl 0.4 mg/ml ProK (diluted from 20 mg/ml stock) at 65 °C for 30 min after tagmentation, unless indicated otherwise. When we applied nucleosome dissociation before tagmentation, we combined cells for several samples using 10 µl of the spheroplast solution as input and eluting in the same volume after bead purification.

Combined changes
One hundred microliters of a saturated overnight culture (no OD measured) was pelleted and resuspended in 25 µl 300 U/ml Zymolyase solution and incubated at 37 °C for 30 min, followed by 10 min at 95 °C. We used 1.25 µl of this solution as input for tagmentation, corresponding to approximately 1.2 Mio cells (assuming 30 Mio cells in 1 ml OD 1 culture and saturation at OD 8). Samples were processed with indicated dilutions of Tn5E54K, L372P or Tn5R27S, E54K, L372P using indicated tagmentation buffers (TB1 or TB2) and tagmentation temperatures (37 °C or 55 °C). After tagmentation with the corresponding enzyme and inactivation, samples were incubated with ProK for 30 min at 50 °C followed by 15 min at 65 °C and stored at -20 °C before proceeding with magnetic bead cleanup (1.8×) and enrichment PCR. Final purification was done with 0.8× AMPure XP beads. The final protocol is also described in Supplementary File S1. All samples were pooled and paired-end 150 bp reads were generated on an Illumina NextSeq platform. TB1: 10 mM MgCl2, 25% DMF (v/v), 10 mM Tris-HCl final concentrations, adjusted to pH 7.6 using acetic acid. TB2: 8 mM MgCl2, 20% DMF (v/v), 16 mM Tris-HCl final concentrations, adjusted to pH 7.6 using acetic acid. Tagmentation buffer without DMF was prepared as a 2× solution and DMF was added fresh immediately before tagmentation.

Library preparation for on- and off-target editing analysis
Sixteen randomly picked MAGESTIC (Roy et al. 2018)-edited yeast strains were inoculated from glycerol stocks into 100 µl YPAD in a 96 well plate, and grown overnight at 800 rpm at 30 °C. Entire cultures were pelleted at 1000 × g for 3 min, pellets were resuspended in 25 µl 300 U/ml Zymolyase solution and incubated at 37 °C for 30 min, followed by 10 min at 95 °C. 1.25 µl of the solution was mixed with 2.5 µl TB2 and 1.25 µl of 1:10 diluted Tn5R27S, E54K, L372P, and tagmentation was performed at 55 °C for 3 min, followed by inactivation with 1.25 µl 0.2% SDS for 5 min at room temperature. For nucleosome dissociation, samples were incubated with 1.75 µl 0.4 mg/ml ProK for 30 min at 50 °C followed by 15 min at 65 °C and stored at -20 °C before proceeding with magnetic bead cleanup (1.8×) and enrichment PCR. Final purification was done with 0.8× AMPure XP beads. All samples were pooled and paired-end 150 bp reads were generated on an Illumina NextSeq platform.

Whole-genome sequence analysis
Each whole-genome sequencing dataset was down-sampled to consistent numbers of reads for further comparisons, using bash command “gunzip -c reads.fastq.gz | head -n N”. For comparisons that involved both samples with 150 and 75 bp reads all reads were trimmed to length 75 prior to downstream processing. 3′ Transposon adapter sequences were trimmed off using cutadapt (Martin 2011) and trimmed paired-end reads mapped to the S288c yeast reference genome (sacCer3 R64.2.1) using bwa (Li and Durbin 2009) (v0.7.17). The bam files containing all mapped reads were sorted and PCR duplicates filtered using an embedded version of picard toolkits in gatk v4.0 (McKenna et al. 2010). Base quality score recalibration (BQSR) was performed using “gatk BaseRecalibrator” to detect and correct systematic errors in the original base accuracy scores. Statistics for assessing mapping qualities were generated using samtools (Li et al. 2009) (v1.9) with its “flagstat” option. Insert size distribution was extracted using the command “gatk CollectInsertSizeMetrics”. Variant calling was performed using “gatk HaplotypeCaller” with the haploid option “-ploidy 1”. Low-quality variant calls were annotated and filtered using “gatk VariantFiltration” with expression “QD < 2.0 | FS > 60.0 | MQ < 40.0 | SOR > 3.0 | MQRankSum < -12.5 | ReadPosRankSum < -8.0”.

SNP-calling rate assessment
We used the set of variants called in a sample processed with the commercial Nextera pipeline as gold standard (number of total true SNPs), and determined true and false-positive SNP-calling rates with each protocol variation. For any given sample X, the true positive rate was defined as (number of correctly called SNPs/number of total true SNPs), while the false-positive rate was defined as (number of miscalled SNPs/number of total SNPs in X).

Nucleosome occupancy and GC content bias analysis
We used two datasets of yeast nucleosome mapping (Lee et al. 2007; Schep et al. 2015). For tiling-array data, nucleosome occupancy was assessed by the log ratio of probe signals between mono-nucleosomal DNA and total genomic DNA samples. For ATAC-seq data, we calculated transposon insertion frequency per base using pyatac (Schep et al. 2015) (integrated in nucleoatac). We grouped base positions of the rDNA locus into four categories: 35S rRNA genes, external transcribed spacers, internal transcribed spacers, and nontranscribed regions (NTS), according to published gene annotations of the S288c reference genome (http://sgd-archive.yeastgenome.org/sequence/S288C_reference/genome_releases/S288C_reference_genome_R64-2-1_20150113.tgz).
GC content was calculated for the yeast autosomal genome with a 500 bp sliding window stepped by 250 bp. We took the mean value from overlapped windows to represent GC content bias for single-base position. For correlation tests (coverage vs nucleosome occupancy, coverage vs GC content), we randomly sub-sampled 100,000 positions to reduce computational complexity.
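
The windowed GC calculation described above (500 bp windows stepped by 250 bp, then the mean over overlapping windows for each base) can be sketched in a few lines of Python. This is an illustration only and not the authors' analysis code: the function names `gc_windows` and `per_base_gc` are hypothetical, bases not covered by any full window are returned as `None`, and ambiguous bases such as N simply count as non-GC. The source specifies none of these details.

```python
def gc_windows(seq, window=500, step=250):
    """Return (start, gc_fraction) for every full window along seq."""
    seq = seq.upper()
    windows = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        windows.append((start, gc / window))
    return windows


def per_base_gc(seq, window=500, step=250):
    """Average the GC fraction of all windows overlapping each base."""
    totals = [0.0] * len(seq)
    counts = [0] * len(seq)
    for start, frac in gc_windows(seq, window, step):
        for pos in range(start, start + window):
            totals[pos] += frac
            counts[pos] += 1
    # Assumption: bases not covered by any full window get None.
    return [t / c if c else None for t, c in zip(totals, counts)]
```

With the default window and step, most positions fall in two overlapping windows, so the per-base value is the mean of two window fractions; positions in the first and last 250 bp of a sequence are covered by a single window.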

Off-target variant analysis
For the analysis of 16 strains edited by MAGESTIC, read mapping and variant calling were performed as described above, except: (1) gvcf files were generated using “gatk HaplotypeCaller -ERC GVCF” before genotyping in cohort mode; and (2) additional filters were applied to exclude variants with low read depth (<0.1 × sample average base coverage) and variants with missing genotypes in 2 samples. Mutant allele frequency (MAF) was calculated by counting fraction of nonreference alleles after excluding missing genotypes. For each private mutation (unique variants present in single strain), we scanned for gRNA-like sequences in both strand directions of a (60+X) bp (X equals variant length) window centered at variant position. We measured the Levenshtein edit distance between on-target sequence and its best match in the (60+X) bp window.
Data of in vitro on- and off-target sequence pairs were downloaded from the CIRCLE-seq publication (Tsai et al. 2017) and edit distance between each pair was calculated accordingly.

Data availability
The pETM11-Sumo3-Tn5 plasmids carrying either the Tn5E54K, L372P or Tn5R27S, E54K, L372P construct are available upon request. Supplementary File S1 contains detailed step-by-step instructions of the protocol using homemade Tn5-based and nucleosome dissociation by salt or ProK. Supplementary File S2 contains Supplementary Figures S1–S8. Supplementary Table S1 contains protocol conditions and associated insert sizes. Supplementary Table S2 contains editing outcomes for edited strains. Supplementary Table S3 contains genotype information for edited strains. The sequencing data discussed in this manuscript have been deposited on SRA and are accessible through the accession number SRP282188.
Supplementary material is available at figshare DOI: https://doi.org/10.25387/g3.13187504.

Results

Initial comparison of libraries prepared from genomic DNA vs intact cells
We hypothesized that tagmentation could be applied directly to heat-treated yeast spheroplasts to simplify whole-genome sequencing library preparation by skipping genomic DNA isolation. Figure 1 illustrates the extraction-free library preparation workflow using homemade Tn5R27S, E54K, L372P. Instead of extracting genomic DNA, cells from saturated overnight cultures are exposed to Zymolyase treatment to digest the cell wall, incubated at 95 °C for inactivation of Zymolyase and lysis of spheroplasts, and the Tn5 adapter complex is added directly to the sample (Figure 1). As nucleosomes could provide a barrier for polymerase during library amplification, we compared two methods for nucleosome dissociation: salt or ProK treatment. High salt concentrations promote dissociation by decreasing the attractive force between positively charged histone proteins and negatively charged DNA (Stacks and Schumaker 1979; Yager et al. 1989) while ProK is a broad-spectrum serine protease (Bajorath et al. 1988) used for crosslink release and nucleosome digestion in MNase assays (Yuan et al. 2005). We performed all experiments with the well-characterized yeast strain YJM789 containing approximately 60,000 SNPs relative to the yeast reference genome (Wei et al. 2007). In an initial attempt, we used 240,000 cells corresponding to approximately 3 ng genomic DNA as input for tagmentation. For comparison, we performed extraction-free preparation with the commercial Nextera enzyme, using buffers and reagents from the kit (Nextera XT, Illumina). As a reference standard, we prepared a library from 150 pg extracted YJM789 genomic DNA using reagents from the kit (NXTMP). Paired-end 75-bp reads were generated on an Illumina MiSeq platform and reads mapped to the reference yeast genome (sacCer3 R64.2.1) using bwa (Li and Durbin 2009).
A major concern when skipping genomic DNA isolation is the reduced accessibility of genomic DNA to Tn5 and polymerase, leading to a greater variation in coverage across the genome (coverage bias). To assess the impact on the distribution of genomic coverage, we quantified the number of reads mapping to each position, as well as average genome coverage for each sample. We used the ratio of per-base coverage to average coverage to illustrate coverage bias: the closer this ratio is to 1, the more evenly the base is covered relative to the rest of the genome. As a secondary bias metric, we calculated the fraction of the genome covered at least eightfold, reflecting a common variant calling filter. To measure the impact of coverage variation on variant calling, we determined the fraction of true positive SNPs and the false-positive call rate with increasing sequencing coverage using the extracted sample (NXTMP) as a reference.
With the Nextera enzyme, libraries prepared from cells without nucleosome dissociation (NA) were highly similar to libraries prepared from extracted genomic DNA (Figure 2, Supplementary Figure S1), and a ProK nucleosome dissociation step after tagmentation did not provide an additional benefit. Despite equimolar pooling, we obtained very low read numbers for the sample with salt treatment and could not evaluate it across the entire range, but at lower sequencing coverage salt treatment seemed to negatively affect variant calling performance (Figure 2B). With homemade Tn5R27S, E54K, L372P in contrast, both nucleosome dissociation methods improved coverage (Figure 2, A and C, Supplementary Figure S1A) and SNP-calling rate (Figure 2B, Supplementary Figure S1B) compared to no treatment (NA) when applied after tagmentation. Salt treatment resulted in libraries that were indistinguishable in quality from the extracted sample. Insert size distribution (median/mode) was similar for the extracted sample (NXTMP 115/70) and extraction-free samples with salt treatment, while ProK treatment resulted in larger insert sizes for both Nextera (150/96) and Tn5R27S, E54K, L372P (182/139) (Supplementary Figure S1A, Table S1).
The native Tn5 transposase has an integration site preference (Goryshin et al. 1998; Shevchenko 2002; Reznikoff 2008) and studies have reported a mild GC bias in vitro (Green et al. 2012) and in library preparation applications (Lan et al. 2015). To assess sequence-dependent bias in coverage, we calculated nucleotide composition of the yeast reference genome in 500-bp sliding windows. No correlation with GC content was apparent for either the extracted sample or our extraction-free method using the Nextera kit (Supplementary Figure S1C). The absence of PCR-dependent GC bias, as reported previously (Adey et al. 2010), could be a result of using the GC tolerant KAPA HiFi polymerase (KAPA Biosystems). Salt-treated libraries prepared with the extraction-free protocol and homemade Tn5R27S, E54K, L372P showed a mild GC bias (Supplementary Figure S1C). Overall, these initial experiments showed that high-quality whole-genome sequencing libraries can be generated without a genomic DNA isolation step with both commercial and homemade transposase enzymes.

[Figure 1 workflow diagram]
Cell wall digestion: 100 µl saturated yeast culture; 300 U/ml Zymolyase, 30 min 37°C; 10 min 95°C; yields lysed spheroplasts.
Tagmentation: Tn5-adapter complex dimers, 3 min 55°C; Tn5 inactivation with 0.2% SDS, 5 min RT; yields tagmented gDNA.
Nucleosome release: Proteinase K, 50°C 30 min then 65°C 15 min, OR 1.125 M NaOAc, 23°C 60 min; yields nucleosome-free, tagmented gDNA.
Bead clean: 1.8X AmPure XP; 2X wash with 70% EtOH; elute in 6.25 µl nuclease-free water.
Amplification: gap filling at 72°C for 3 min; 12 cycles amplification with P5-i5 and i7-P7 primers.
Bead clean: 0.8X AmPure XP; 2X wash with 70% EtOH; elute in 10 µl nuclease-free water; yields amplified tagmented gDNA libraries.
Quality control: Qubit hsDNA, Bioanalyser.


Figure 1 A simplified whole-genome sequencing library preparation workflow without genomic DNA isolation. Workflow of cell wall digestion and heat-based lysis, gDNA tagmentation, nucleosome dissociation, and subsequent NGS library preparation for dual index (i5/i7) whole-genome sequencing is depicted. The double-stranded part of the linker oligonucleotide is shown in gray with a yellow dot depicting the phosphorylated 3′ end. The 5′ overhangs serving as templates for the indexed P5 (dark blue) or P7 (orange) adapter primers are shown in blue and red, respectively.

Optimizing protocol parameters with homemade Tn5
Tagmentation tends to generate libraries with shorter insert sizes than methods using mechanical or enzymatic fragmentation (Adey et al. 2010), which can be a limitation for applications requiring longer inserts. Compared to Tn5-based standard library preparation from genomic DNA, library preparation from cells with ProK-mediated nucleosome dissociation consistently produced longer fragments, providing an opportunity to address this need. To identify robust conditions for an efficient, extraction-free workflow with our in-house transposase compatible with longer insert sizes, we extensively varied different reaction steps.
To assess whether providing Tn5 with a nucleosome-free DNA substrate would improve genome coverage, we applied ProK treatment after Zymolyase incubation but prior to tagmentation. Compared with treatment after tagmentation, we observed larger variation in genome coverage and lower performance for other quality metrics (Supplementary Figure S2) for both Nextera and Tn5R27S, E54K, L372P. The overall worse performance could stem from a higher loss of material during bead-based purification after ProK treatment, as intact genomic DNA might not bind and elute as efficiently as smaller, tagmented genomic DNA, but we did not investigate this further. With 65 °C at the higher end of the permissible temperature range for ProK activity, we tested incubation at 50 °C, and 50 °C followed by 15 min at 65 °C. Both variations improved coverage bias (Supplementary Figure S3A) and to a smaller degree variant calling (Supplementary Figure S3, B and C) compared to incubation at 65 °C, but the 50 °C treatment by itself led to reduced insert sizes (Supplementary Figure S3D, Table S1) and reduced genomic coverage at lower sequencing depths (Supplementary Figure S3E). A combined incubation at 50 °C followed by 65 °C resulted in highest library quality while retaining larger insert sizes.
For initial testing, we measured optical density and adjusted cell number of individual cultures, which limits throughput for parallel processing. To evaluate the robustness of the protocol to variable input cell numbers, we processed samples starting from 240,000 to 5 million cells. We adjusted sampling volume from the

[Figure 2 panels, each shown for Nextera and Tn5R27S,E54K,L372P. (A) Density vs log(coverage / mean(coverage)). (B) % true SNPs called vs X genome coverage. (C) % genome of coverage >= 8 vs X genome coverage. Methods: NA, ProK, Salt, NXTMP.]
Figure 2 Comparison of different nucleosome dissociation methods in extraction-free library preparation. Samples starting from 240,000 cells were processed following the workflow depicted in Figure 1. After tagmentation nucleosomes were dissociated using salt (blue) or ProK (green). NA (salmon) indicates the lack of a nucleosome dissociation step. NXTMP (black dashed line) is a library processed from 150 pg extracted genomic DNA with the Nextera XT kit and serves as reference standard. (A) Coverage bias distribution (log scale), with bias calculated as coverage at a base divided by average genome coverage. The salt condition is omitted for Nextera due to insufficient reads for this sample. (B) Fraction of YJM789 SNPs called as a function of sequencing depth. The reference set of true positive SNPs (52,373 SNPs) is derived from variant calling on a sample prepared from extracted genomic DNA with the Nextera XT kit (NXTMP). (C) Fraction of the genome covered at least 8× as a function of sequencing depth.
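
The quantity plotted in panel B follows the SNP-calling rate definitions given in the methods: the true positive rate is the number of correctly called SNPs divided by the number of SNPs in the reference set, and the false-positive rate is the number of miscalled SNPs divided by the number of SNPs called in the sample. A minimal Python sketch under stated assumptions: the function name `snp_calling_rates` is hypothetical, SNPs are represented as (chromosome, position, alternate allele) tuples, and a sample with no calls is given a false-positive rate of 0. None of these choices comes from the source.

```python
def snp_calling_rates(called, truth):
    """Return (true_positive_rate, false_positive_rate) for one sample.

    called: SNPs called in the sample under evaluation
    truth:  reference SNP set (here, calls from the extracted-DNA library)
    """
    called, truth = set(called), set(truth)
    correct = called & truth      # called and present in the reference set
    miscalled = called - truth    # called but absent from the reference set
    tpr = len(correct) / len(truth)
    # Assumption: a sample with no calls has a false-positive rate of 0.
    fpr = len(miscalled) / len(called) if called else 0.0
    return tpr, fpr
```

Note that the false-positive rate as defined here uses the sample's own call count as the denominator, so it describes the share of a sample's calls that are not in the reference set.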

saturated culture such that 1.25 µl of the Zymolyase reaction contained the desired number of cells, to prevent any changes in the volumes of reactions. Libraries generated from 240,000, 500,000, and 1 million cells (corresponding to approximately 3, 6, and 12 ng of input DNA) were highly similar in terms of coverage bias (Figure 3, A and E), variant calling (Figure 3, B and C), and insert size distributions (Figure 3D). The main effect of increasing cell number was an increased library yield after enrichment PCR. Increasing input cell number to 5 million cells (60 ng DNA), in contrast, increased coverage bias and resulted in a lower SNP-calling rate, potentially due to an unfavorable ratio of Tn5 to DNA molecules or insufficient lysis at higher cell densities. The robustness of the protocol toward cell number variation facilitates parallel processing of many samples, as it makes it unnecessary to measure optical density and adjust the cell number of individual samples.

Previous studies have reported that magnesium chloride and DMF concentrations in the tagmentation buffer affect

[Figure 3 image. Panels A to E, all for Tn5R27S,E54K,L372P. Legend (Cell number): 240K, 500K, 1M, 5M, NXTMP. (A) x-axis: log(coverage / mean(coverage)); y-axis: Density. (B) x-axis: X genome coverage; y-axis: % true SNPs called. (C) x-axis: X genome coverage; y-axis: % false-positive calls. (D) x-axis: Insert size (bp); y-axis: Density. (E) x-axis: X genome coverage; y-axis: % genome of coverage >= 8.]
Figure 3 Extraction-free library preparation is robust to variation in input cell number. Samples starting from 240,000 (salmon), 500,000 (green), 1 million (turquoise), or 5 million (purple) cells were processed using homemade Tn5R27S, E54K, L372P and nucleosome dissociation by incubation with ProK after tagmentation. Cell numbers between 240,000 and 1 million yield stable results in terms of library quality. (A) Coverage bias distribution (log scale), with bias calculated as coverage at a base divided by average genome coverage. (B) Fraction of YJM789 SNPs called as a function of sequencing depth. The reference set of true positive SNPs (52,373 SNPs) is derived from variant calling on a sample prepared from extracted genomic DNA with the Nextera XT kit (NXTMP). (C) Fraction of false-positive SNP calls as a function of sequencing depth, with false-positive call rate = number of miscalled SNPs/total number of SNPs. (D) Distribution of fragment insert sizes. Final cleanup used 1.8× AMPure beads. (E) Fraction of the genome covered at least 8× as a function of sequencing depth. NXTMP (black dashed line) is a library processed from 150 pg extracted genomic DNA with the Nextera XT kit and serves as reference.
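
The variant-calling metrics in panels B and C can be sketched in Python as follows. This is an illustration only: the function name and the toy call sets are hypothetical, and taking "total number of SNPs" to mean the number of SNPs called in the sample is an assumption, as the caption does not specify the denominator.

```python
def snp_call_metrics(called, truth):
    """Return (% of true SNPs called, % false-positive calls).

    `called` and `truth` are collections of (chromosome, position) tuples.
    Assumption: the false-positive rate is miscalled SNPs divided by all
    SNPs called in the sample; the source only says "total number of SNPs".
    """
    called, truth = set(called), set(truth)
    true_pos = called & truth      # called and present in the reference set
    false_pos = called - truth     # called but absent from the reference set
    pct_true_called = 100 * len(true_pos) / len(truth)
    fp_rate = 100 * len(false_pos) / len(called) if called else 0.0
    return pct_true_called, fp_rate

# Hypothetical toy data: 4 reference SNPs, 4 calls, 1 of them miscalled.
truth = {("chrI", 100), ("chrI", 200), ("chrII", 50), ("chrII", 75)}
called = {("chrI", 100), ("chrI", 200), ("chrII", 50), ("chrIII", 10)}
pct_true, fp = snp_call_metrics(called, truth)
```

Repeating the calculation on reads downsampled to different depths would give curves of the kind plotted against genome coverage.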

performance (Picelli et al. 2014). As our tagmentation buffer (TB1: 10 mM MgCl2, 25% DMF and 10 mM Tris-HCl, pH 7.6) was optimized for cDNA libraries, we varied buffer composition to improve tagmentation of genomic DNA from lysed spheroplasts. We used 500,000 cells in these tests and omitted the nucleosome dissociation step to evaluate the effect of buffer changes only. Lowering the DMF concentration to 20% led to a slight reduction in coverage bias (Supplementary Figure S4, A and E) and improved variant calling performance (Supplementary Figure S4, B and C) to almost the same level as the extracted sample. Lowering DMF further to 17.5%, 15%, or 10% (only 10% shown in plot) had no clear benefit compared to 20% DMF. In addition to DMF, we altered the concentration of both magnesium chloride and Tris-HCl for an optimized tagmentation buffer TB2 containing 8 mM MgCl2, 20% DMF, and 16 mM Tris-HCl, pH 7.6 (final concentrations).

As both altered ProK and tagmentation conditions led to small but discernible improvements in library quality individually, we

next evaluated them in combination. Taking advantage of the robustness of the protocol to variations in input cell number, we resuspended pellets of 100 µl saturated overnight cultures directly in 25 µl 300 U/ml Zymolyase solution, without measuring the optical density of the culture, and used 1.25 µl of this solution, corresponding to approximately 1.2 million cells, as input for tagmentation. After tagmentation and inactivation, we incubated each sample with ProK for 30 min at 50 °C followed by 15 min at 65 °C, and stored samples at −20 °C before proceeding with magnetic bead cleanup and enrichment PCR. As our two different homemade Tn5 enzyme versions (Tn5E54K, L372P and Tn5R27S, E54K, L372P) had previously shown different characteristics (Hennig et al. 2017), we compared their performance to address whether one would be superior to the other. Compared with tagmentation in buffer TB1, coverage and variant calling were clearly improved with buffer TB2 for both transposase versions, resulting in libraries indistinguishable in quality parameters from the sample prepared from extracted genomic DNA (Figure 4, A–C, Supplementary Figure S5B) while retaining longer insert sizes (Supplementary Figure S5A, Table S1). No correlation with GC content was apparent in extraction-free libraries prepared with either Tn5R27S, E54K, L372P or Tn5E54K, L372P (Figure 4D). To assess whether we could further modulate the insert size distribution specifically toward longer inserts, we evaluated alternative tagmentation parameters. We had previously found that higher enzyme dilutions, which decrease the ratio of Tn5 to DNA molecules, resulted in a progressive shift of the insert size distribution toward larger inserts for libraries made from cDNA (Hennig et al. 2017). We observed a similar effect for libraries prepared with our extraction-free protocol. Using a fivefold higher enzyme dilution, we could shift insert sizes from 158/112 (median/mode) to 211/186 (Supplementary Figure S6, Table S1). Tn5 transposase is typically used at 55 °C in library preparation applications. Lowering tagmentation temperature to 37 °C resulted in larger insert sizes (192/149), indicating reduced activity at this temperature, without affecting library quality (Supplementary Figure S6). In summary, we identified optimized nucleosome dissociation and tagmentation conditions, as well as strategies to generate libraries with larger insert sizes than is common for Tn5-based adapter insertion. We also demonstrated that our protocol is robust to variation in input cell numbers, which facilitates parallel processing of many samples.

Detailed characterization of libraries prepared from genomic DNA vs intact cells

We extensively compared the genomic characteristics of libraries generated with the extraction-free protocol, using Tn5R27S, E54K, L372P processed libraries with TB2 as an example, with standard library preparation from extracted genomic DNA (NXTMP). Coverage of individual positions across the genome was relatively well correlated between standard and extraction-free methods. Positions with high coverage in standard extraction-based library preparation (NXTMP) tended to have high coverage in extraction-free samples, with the notable exception of a population dropping in coverage in the sample without genomic DNA extraction (Supplementary Figure S7A). We observed the same trend for extraction-free preparation using Nextera (with ProK treatment, from Figure 2), indicating the coverage loss was specific to extraction-free protocols and not related to the use of different Tn5 enzymes. We mapped these positions to specific sites in the yeast rDNA locus, which occurs in 100–200 tandem-arrayed 9.1 kb repeats on chromosome XII (Nomura et al. n.d.; Sandmeier et al. 2002), making up almost 10% of the genome. The rDNA locus exists in two distinct chromatin states (Merz et al. 2008), an accessible and a highly condensed state, and only a small fraction of the repeats is expressed in normal growth conditions, which could explain the reduced coverage for these regions. This is consistent with a higher variability in coverage across the 37S rRNA gene of the locus specifically in extraction-free samples (Supplementary Figure S7B). Across the whole genome, we observed no correlation between coverage and nucleosome positions, as evaluated by ATAC-seq insertion frequency (Schep et al. 2015) (Supplementary Figure S7C) or nucleosome occupancy (Lee et al. 2007) (Supplementary Figure S7D), and standard and extraction-free samples again looked very similar. A few outlier positions with distinctly higher coverage in the extraction-free sample, specifically in regions with a nucleosome occupancy score of zero (Supplementary Figure S7D), also mapped to the rDNA locus, which shows a bias toward higher coverage of NTS in extraction-free samples (Supplementary Figure S7B). Taken together, we conclude that our extraction-free protocol using homemade Tn5 enzymes generates whole-genome sequencing libraries that are similar in quality to libraries prepared from extracted genomic DNA using a commercial kit, at significantly reduced cost and improved throughput and with the additional benefit of generating longer inserts.

CRISPR-Cas9 on- and off-target activity profiling in yeast

We applied our method to validate the presence of designed variants and identify any unwanted off-target mutations in a set of 16 yeast strains edited with MAGESTIC, our pooled CRISPR guide-donor method for massively parallel precision editing and genomic barcoding (Roy et al. 2018). Unique DNA barcodes present at the genomic barcode locus tag a designed variant in each strain (Supplementary Table S2). We constructed sequencing libraries of these 16 strains and performed 150-bp paired-end sequencing, generating an average of 20-fold coverage per genome. After read mapping, variant calling, and variant filtering, we detected 131 SNPs and 145 small insertions or deletions across all samples (Figure 5A). Of the 16 strains, 14 received the desired mutations while 2 carried the wild-type allele at the targeted locus. Among the 276 called variants, 121 were background variants (MAF = 1) representing the baseline genetic differences between the S288c-derivative editing base strain and the S288c reference genome, while 107 were private variants, uniquely present in single lineages (Supplementary Table S3).

Unintended mutations caused by Cas9 off-target activity should be private mutations, given the uniqueness of each guide RNA sequence and the dependence on sequence similarity to the target (Doench et al. 2016). To assess the likelihood of each variant to be derived from an off-target cleavage event, we quantified the sequence similarity of the window surrounding each private mutation to the relevant target sequence (20 nt gRNA + 3 nt PAM). As a positive control set, we calculated the edit distances between known on- and off-target sequence pairs captured by CIRCLE-Seq (Tsai et al. 2017) and compared them to the spectrum of edit distances in our dataset. There was a clear difference between the two datasets, with experimentally captured off-target sites from CIRCLE-Seq exhibiting edit distances of 0–6 to the target site, while all but one of our private variants showed edit distances of seven or higher (Figure 5B). Higher edit distances in the CIRCLE-Seq data, corresponding to more mismatches in the off-target relative to the on-target site, were associated with lower cleavage activity (Supplementary Figure S8). This further supports that the observed private mutations were unlikely to be

[Figure 4 image. Panels A to D, each shown for Tn5E54K,L372P and Tn5R27S,E54K,L372P. Legend (tag buffer): TB1, TB2, NXTMP. (A) x-axis: log(coverage / mean(coverage)); y-axis: Density. (B) x-axis: X genome coverage; y-axis: % true SNPs called. (C) x-axis: X genome coverage; y-axis: % genome of coverage >= 8. (D) x-axis: GC content; y-axis: Coverage (fold); color scale: count.]
Figure 4 An extraction-free protocol using homemade enzymes produces libraries of the same quality as a commercial, extraction-based solution. Samples were prepared from 100 µl saturated overnight culture with nucleosome dissociation by ProK treatment at 50 °C followed by 65 °C and tagmentation buffer TB1 (salmon) or TB2 (turquoise), and with homemade Tn5E54K, L372P or Tn5R27S, E54K, L372P enzyme. (A) Coverage bias distribution (log scale), with bias calculated as coverage at a base divided by average genome coverage. (B) Fraction of YJM789 SNPs called as a function of sequencing depth. The reference set of true positive SNPs (52,373 SNPs) is derived from variant calling on a sample prepared from extracted genomic DNA with the Nextera XT kit (NXTMP). (C) Fraction of the genome covered at least 8× as a function of sequencing depth. NXTMP (black dashed line) is a library processed from 150 pg extracted genomic DNA with the Nextera XT kit and serves as reference standard. (D) GC content-related coverage bias in extraction-free libraries. GC content of the S. cerevisiae reference genome was determined for 500-bp sliding windows with 250 bp overlap, and per-base GC content was calculated as the average of overlapping windows.
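
The per-base GC calculation described for panel D can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name is hypothetical, only full-length windows are used, and bases near the sequence end that fall in no full window are reported as None.

```python
def per_base_gc(seq, window=500, step=250):
    """Per-base GC content as the mean GC of all windows covering a base.

    Windows of `window` bp start every `step` bp (500 bp with 250 bp
    overlap by default, as in the caption). Assumption: only full-length
    windows are scored, except that a sequence shorter than one window is
    treated as a single window.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return []
    sums = [0.0] * n
    counts = [0] * n
    for start in range(0, max(n - window, 0) + 1, step):
        chunk = seq[start:start + window]
        gc = (chunk.count("G") + chunk.count("C")) / len(chunk)
        for i in range(start, start + len(chunk)):
            sums[i] += gc
            counts[i] += 1
    # Bases not covered by any window get None instead of a GC value.
    return [s / c if c else None for s, c in zip(sums, counts)]

# Toy example with a 4-bp window and 2-bp step so the result is easy to check.
gc_profile = per_base_gc("GGGGAAAA", window=4, step=2)
```

Pairing each per-base GC value with the sequencing depth at that base gives the kind of two-dimensional count plot shown in panel D.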

[Figure 5 image. (A) Legend: gRNA-targeted variant, Genomic variant, Additional variant alleles, Missing values. Strains (y-axis): 16_AE2_I3, 16_CB6_D11, 16_DA5_B10, 13_CG12_N23, 16_CB12_D23, 16_CB1_D1, 16_CC6_F11, 16_DC9_F18, 16_DB2_D4, 16_DF3_L6, 16_DE7_J14, 16_CC10_F19, 16_CC5_F9, 16_CA4_B7, 16_CB10_D19, 3_DF2_L4. MAF scale: 0% to 100%. (B) x-axis: Edit distance; y-axis: Density. Legend: CIRCLE-Seq captured off-target sites, private mutation sites.]
Figure 5 On- and off-target editing analysis in extraction-free libraries. (A) Overview of identified mutations in each edited strain (indicated on y-axis), at a sequencing depth of 20×. Designed target variants are shaded in blue. Additional genomic variants are indicated in red or orange, with orange sites representing alternative, non-wild-type genotype calls at sites at which an alternative allele has already been detected, likely representing false-positive variant calls. Sites at which no genotype information could be obtained (no coverage) are highlighted in gray. The x-axis depicts a histogram of MAF of each variant. (B) Histogram of edit distances between sequence windows surrounding gRNA-targeted variants and alternative variants, reflecting sequence similarity between sites. Sequence similarity between experimentally captured CIRCLE-Seq off-target variants (salmon) and their respective gRNA-targeted variant indicates expected edit distances if a variant is caused by Cas9-mediated off-target activity. Edit distances obtained for private mutations in each strain in our dataset are depicted in turquoise.
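
The edit-distance comparison in panel B can be illustrated with a short Python sketch. It is a hedged example, not the authors' pipeline: the text does not state which edit-distance definition, window size, or strand handling was used, so Levenshtein distance, a forward-strand scan, the function names, and the sequences below are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]

def min_distance_in_window(window, target):
    """Smallest edit distance between `target` and any same-length
    substring of `window` (forward strand only, an assumption)."""
    k = len(target)
    if len(window) < k:
        return edit_distance(window, target)
    return min(edit_distance(window[i:i + k], target)
               for i in range(len(window) - k + 1))

# Hypothetical 23-nt target: 20-nt protospacer followed by a 3-nt PAM.
target = "GACGTTACCGGATTCAGCTA" + "TGG"
# Hypothetical genomic window containing the target with two substitutions.
window = "TTT" + "TACGTTACCGCATTCAGCTATGG" + "AAA"
distance = min_distance_in_window(window, target)
```

Under these assumptions, a site with a small distance to the target would resemble the CIRCLE-Seq off-target sites, whereas a distance of seven or more would resemble the private mutations reported here. A fuller analysis would also scan the reverse complement of the window.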

caused by Cas9 off-target activity, but rather derived from spontaneous mutation events, accumulated in the cell divisions since editing. In summary, we used our protocol to generate whole-genome sequencing libraries directly from saturated cultures of CRISPR-edited yeast strains, and detected designed on-target, background, and spontaneous mutations in all strains. The fact that the majority of the identified mutations were either common (MAF = 1) or private, even at a moderate sequencing depth of 20×, indicates that our extraction-free method generated high-quality, unbiased whole-genome sequencing libraries with even coverage across the genome while substantially simplifying the library construction process and reducing cost.

Discussion

We describe a simplified whole-genome sequencing library preparation workflow to generate high-quality, sequencing-ready libraries directly from yeast cultures without genomic DNA isolation. Enzymatic digestion of the yeast cell wall followed by brief heat treatment of spheroplasts is sufficient for Tn5 to access genomic DNA, and removal of residual nucleosomes prior to enrichment PCR can improve the evenness of genomic coverage, such that resulting libraries are indistinguishable in quality from libraries prepared from extracted genomic DNA with a commercial kit (Nextera XT). With our method, the time from saturated yeast culture to library is reduced to less than 3 h, enabling massively scaled sequencing projects by eliminating the time- and labor-intensive genomic DNA isolation step while preserving library quality. The robustness of the protocol toward input cell number variation further facilitates parallel processing of many samples, as it makes it unnecessary to measure optical density and adjust sampling volumes of individual cultures. In addition to its simplicity and the benefits in throughput, the protocol is highly affordable, with reagent costs of 34 cents per sample when using our homemade Tn5 enzymes (Hennig et al. 2017). This is an approximately 5- to 10-fold reduction compared to samples prepared from gDNA extracted using the simplest and most scalable kit-based methods. Combining our method with low-coverage

(3×) sequencing allows genotyping of segregant panels or determining targeted genome mutagenesis outcomes for only $1 per genome. We believe that these advances will enable massively scaled genome sequencing experiments that have so far been hampered by the time, effort, and cost spent on genomic DNA isolation.

Depending on the individual needs and resources of a lab, we present several options for extraction-free library preparation. If homemade Tn5 is not available, high-quality libraries can be generated using the Nextera XT kit on Zymolyase- and heat-treated cells, without a nucleosome dissociation step. To save cost, it is sufficient to use only 1/4 of the indicated reagents. With either of our homemade Tn5 enzymes, cost is reduced to 34 cents per sample, and high-quality libraries can be obtained by including a nucleosome dissociation step by salt or ProK. Libraries generated by transposase-mediated adapter insertion (tagmentation) tend to have shorter insert size distributions compared with methods using mechanical or enzymatic fragmentation. Our extraction-free method using ProK-mediated nucleosome dissociation generates longer inserts than extraction-free preparation with no or salt-mediated nucleosome dissociation, and longer inserts than the commercial workflow on genomic DNA, without the need for size selection. We also identify additional tagmentation parameters such as temperature and Tn5 to DNA ratio that allow flexible modulation of insert size for custom applications. Insert size distributions compatible with Illumina systems can be obtained even with relatively high dilutions of the Tn5 enzyme, which further reduces cost of the assay.

With our method, we reliably detected designed on-target and shared background mutations in a panel of edited yeast strains. We could confidently call genotypes at all target sites and detected the majority of mutations, representing differences due to shared genetic background, in all strains. This indicates that our extraction-free method provides high-quality, unbiased genomic information while substantially simplifying the library construction process and reducing cost. While we have only tested this workflow in yeast, it should in principle be transferable to other organisms by adjusting initial lysis conditions to the cell type of choice. Adding a homogenization step prior to lysis could further enable direct library preparation from tissues.

Acknowledgments

We thank the EMBL Protein Expression and Purification facility for production of the in-house Tn5 enzymes, and the EMBL Genomics Core Facility for discussion and technical support. We thank members of the Steinmetz group for discussions and support.

Funding

This work was supported by an Advanced Investigator grant from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (AdG-742804, SystGeneEdit, to L.M.S.). S.C.V. was supported by an Advanced Postdoc Mobility Fellowship from the Swiss National Science Foundation (grant number P300PA_177909).

Conflicts of interest: None declared.

Literature cited

Adey A, Morrison HG, Asan, Xun X, Kitzman JO, et al. 2010. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11:R119.
Bajorath J, Saenger W, Pal GP. 1988. Autolysis and inhibition of proteinase K, a subtilisin-related serine proteinase isolated from the fungus Tritirachium album Limber. Biochim Biophys Acta. 954:176–182.
Doench JG, Fusi N, Sullender M, Hegde M, Vaimberg EW, et al. 2016. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 34:184–191.
Fuxman Bass JI, Reece-Hoyes JS, Walhout AJM. 2016. Zymolyase-treatment and polymerase chain reaction amplification from genomic and plasmid templates from yeast. Cold Spring Harb Protoc. 2016:pdb.prot088971.
Goryshin IY, Miller JA, Kil YV, Lanzov VA, Reznikoff WS. 1998. Tn5/IS50 target recognition. Proc Natl Acad Sci USA. 95:10716–10721.
Green B, Bouchier C, Fairhead C, Craig NL, Cormack BP. 2012. Insertion site preference of Mu, Tn5, and Tn7 transposons. Mobile DNA. 3:3.
Hennig BP, Velten L, Racke I, Tu CS, Thoms M, et al. 2017. Large-scale low-cost NGS library preparation using a robust Tn5 purification and tagmentation protocol. G3 (Bethesda). 8:79–89.
Hwang B, Heo S, Cho N, Seo H, Bang D. 2019. Facilitated large-scale sequence validation platform using Tn5-tagmented cell lysates. ACS Synth Biol. 8:596–600.
Kitamura K, Yamamoto Y. 1972. Purification and properties of an enzyme, zymolyase, which lyses viable yeast cells. Arch Biochem Biophys. 153:403–406.
Lan JH, Yin Y, Reed EF, Moua K, Thomas K, et al. 2015. Impact of three Illumina library construction methods on GC bias and HLA genotype calling. Hum Immunol. 76:166–175.
Lee W, Tillo D, Bray N, Morse RH, Davis RW, et al. 2007. A high-resolution atlas of nucleosome occupancy in yeast. Nat Genet. 39:1235–1244.
Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England). 25:1754–1760.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, 1000 Genome Project Data Processing Subgroup, et al. 2009. The sequence alignment/map format and SAMtools. Bioinformatics (Oxford, England). 25:2078–2079.
Martin M. 2011. Cutadapt removes adapter sequences from high-throughput sequencing reads. Embnet J. 17:10.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, et al. 2010. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20:1297–1303.
Merz K, Hondele M, Goetze H, Gmelch K, Stoeckl U, et al. 2008. Actively transcribed rRNA genes in S. cerevisiae are organized in a specialized chromatin associated with the high-mobility group protein Hmo1 and are largely devoid of histone molecules. Genes Dev. 22:1190–1204.
Nomura M, Nogi Y, Oakes M. n.d. Transcription of rDNA in the yeast Saccharomyces cerevisiae. In: Madame Curie Bioscience Database. Landes Bioscience. https://www.ncbi.nlm.nih.gov/books/NBK6403/.
Picelli S, Björklund AK, Reinius B, Sagasser S, Winberg G, et al. 2014. Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res. 24:2033–2040.
Reznikoff WS. 2008. Transposon Tn5. Annu Rev Genet. 42:269–286.

Roy KR, Smith JD, Vonesch SC, Lin G, Tu CS, et al. 2018. Multiplexed precision genome editing with trackable genomic barcodes in yeast. Nat Biotechnol. 36:512–520.
Sandmeier JJ, French S, Osheim Y, Cheung WL, Gallo CM, et al. 2002. RPD3 is required for the inactivation of yeast ribosomal DNA genes in stationary phase. EMBO J. 21:4959–4968.
Schep AN, Buenrostro JD, Denny SK, Schwartz K, Sherlock G, et al. 2015. Structured nucleosome fingerprints enable high-resolution mapping of chromatin architecture within regulatory regions. Genome Res. 25:1757–1770.
Shevchenko Y. 2002. Systematic sequencing of cDNA clones using the transposon Tn5. Nucleic Acids Res. 30:2469–2477.
Stacks PC, Schumaker VN. 1979. Nucleosome dissociation and transfer in concentrated salt solutions. Nucleic Acids Res. 7:2457–2467.
Tsai SQ, Nguyen NT, Malagon-Lopez J, Topkar VV, Aryee MJ, et al. 2017. CIRCLE-Seq: a highly sensitive in vitro screen for genome-wide CRISPR-Cas9 nuclease off-targets. Nat Methods. 14:607–614.
van Burik J-AH, Schreckhise RW, White TC, Bowden RA, Myerson D. 1998. Comparison of six extraction techniques for isolation of DNA from filamentous fungi. Med Mycol. 36:299–303.
Wei W, McCusker JH, Hyman RW, Jones T, Ning Y, et al. 2007. Genome sequencing and comparative analysis of Saccharomyces cerevisiae strain YJM789. Proc Natl Acad Sci USA. 104:12825–12830.
Weinreich MD, Gasch A, Reznikoff WS. 1994. Evidence that the Cis preference of the Tn5 transposase is caused by nonproductive multimerization. Genes Dev. 8:2363–2374.
Wilkening S, Tekkedil MM, Lin G, Fritsch ES, Wei W, et al. 2013. Genotyping 1000 yeast strains by next-generation sequencing. BMC Genomics. 14:90.
Yager TD, McMurray CT, van Holde KE. 1989. Salt-induced release of DNA from nucleosome core particles. Biochemistry. 28:2271–2281.
Yuan G-C, Liu Y-J, Dion MF, Slack MD, Wu LF, et al. 2005. Genome-scale identification of nucleosome positions in S. cerevisiae. Science. 309:626–630.
Zhou M, Bhasin A, Reznikoff WS. 1998. Molecular genetic analysis of transposase-end DNA sequence recognition: cooperativity of three adjacent base-pairs in specific interaction with a mutant Tn5 transposase. J Mol Biol. 276:913–925.
Zhou M, Reznikoff WS. 1997. Tn5 transposase mutants that alter DNA binding specificity. J Mol Biol. 271:362–373.

Communicating editor: C. Boone
