Chromatin Accessibility
Methods and Protocols
Georgi K. Marinov
William J. Greenleaf
Editors
METHODS IN MOLECULAR BIOLOGY
Series Editor
John M. Walker
School of Life and Medical Sciences
University of Hertfordshire
Hatfield, Hertfordshire, UK
This Humana imprint is published by the registered company Springer Science+Business Media, LLC, part of Springer
Nature.
The registered company address is: 1 New York Plaza, New York, NY 10004, U.S.A.
Preface
The genomic distribution patterns of chromatin accessibility and its dynamics are key
features of the regulation of gene expression and many other aspects of chromatin biology.
The genomes of eukaryotes are usually packaged by nucleosomal particles, which have a
generally strong inhibitory effect on transcription and on the occupancy of DNA by
regulatory proteins. Active cis-regulatory elements (cREs) in the genome are typically
characterized by depleted nucleosomal occupancy and increased chromatin accessibility.
This property has proven highly useful: because accessible DNA can be preferentially
labeled enzymatically or chemically in numerous ways, it enables the identification of
candidate cREs as well as the tracking of their activity across cell types and conditions.
Technological advances in the labeling and readout of accessible DNA have played a
major role in driving forward our understanding of chromatin and regulatory biology over
the last few decades. The last 15 years have seen a particularly dramatic explosion in the
variety and power of approaches for studying chromatin accessibility, driven by two
sequential technological revolutions: first, the development of high-throughput sequencing
in the mid-2000s, and then the advent of single-cell genomics in the 2010s. This book aims
to provide a comprehensive resource covering the existing and state-of-the-art tools in the
field.
We have divided the protocols in the book into several sections, depending on the
aspects of chromatin accessibility that they measure and/or the approaches that they
take. The first section covers bulk-cell methods for profiling chromatin accessibility and
nucleosome positioning that rely on enzymatic cleavage of accessible DNA and produce
information about relative accessibility. The second section is dedicated to methods that
use single-molecule and enzymatic approaches to solve the problem of mapping absolute
occupancy/accessibility. The third section covers the wide array of emerging tools for
mapping DNA accessibility and nucleosome positioning in single cells, as well as a number
of single-cell multiomics methods that simultaneously measure chromatin accessibility and
other features of the cell, such as the transcriptome, the methylome, and protein markers.
More recently, imaging-based methods for visualizing accessible chromatin in its nuclear
context have emerged; these are included in the fourth section. The final section features
computational methods for the processing and analysis of chromatin accessibility datasets.
This book will serve as an extensive and useful reference for researchers studying different
facets of chromatin accessibility in a wide variety of biological contexts.
Contents
Preface
Contributors
Index
Contributors
UDAYAKUMAR S. VISHNU • New England Biolabs Inc., Ipswich, MA, USA; Genome Biology
Division, New England Biolabs, Inc., Ipswich, MA, USA
ZHAONING WANG • Department of Cellular and Molecular Medicine, University of
California San Diego, School of Medicine, La Jolla, CA, USA
MICHAEL WASNEY • Genetics and Genomics Program, University of California, Los Angeles,
CA, USA
MICHAEL ROLAND WOLFF • Department of Physics, Technical University of Munich,
Garching, Germany
SHUANG-YONG XU • New England Biolabs Inc., Ipswich, MA, USA
RAM PRAKASH YADAV • Department of Immunology, Genetics and Pathology, Uppsala
University, Uppsala, Sweden
LIYAN YANG • Program in Systems Biology, University of Massachusetts Medical School,
Worcester, MA, USA
KARSTEN ZENGLER • Department of Pediatrics, University of California, San Diego, La Jolla,
CA, USA; Center for Microbiome Innovation, University of California, San Diego, La
Jolla, CA, USA; Department of Bioengineering, University of California, San Diego, La
Jolla, CA, USA
GUOQIANG ZHANG • New England Biolabs Inc., Ipswich, MA, USA
KEJI ZHAO • Laboratory of Epigenome Biology, Systems Biology Center, Division of
Intramural Research, National Heart, Lung and Blood Institute, National Institutes of
Health, Bethesda, MD, USA
CHENXU ZHU • Ludwig Institute for Cancer Research, La Jolla, CA, USA
Part I
Abstract
Active cis-regulatory elements (cREs) in eukaryotes are characterized by nucleosomal depletion and,
accordingly, higher accessibility. This property has turned out to be immensely useful for identifying
cREs genome-wide and tracking their dynamics across different cellular states, and it is the basis of numerous
methods that take advantage of the preferential enzymatic cleavage/labeling of accessible DNA. ATAC-seq
(Assay for Transposase-Accessible Chromatin using sequencing) has emerged as the most versatile and
adaptable of these methods and has been widely adopted as the standard tool for mapping open chromatin
regions. Here, we discuss the current optimal practices and important considerations for carrying out
ATAC-seq experiments, primarily in the context of mammalian systems.
1 Introduction
Georgi K. Marinov and Zohar Shipony contributed equally to this work.
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_1,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
4 Georgi K. Marinov et al.
DNase remained the main tool for mapping active cREs into
the genomic era, initially coupled to microarray readouts [5–7]
and eventually adapted to a high-throughput sequencing format
[8–10]. In parallel with these developments, as well as more recently,
a wide variety of alternative methods taking advantage of the
preferential enzymatic/chemical cleavage/modification of accessible
DNA were also developed, employing methyltransferases
[11–15], restriction enzymes [16], nicking enzymes [17], small
molecules [18], viral integration [19], and others.
ATAC-seq, which is based on the preferential insertion into
unprotected DNA by a hyperactive mutant version of the Tn5
transposase [20] (Fig. 1), has emerged as the most convenient,
widely adaptable, and straightforward method for
profiling open chromatin. Treatment of chromatin with Tn5 results
in the insertion into accessible DNA of adapters that then enable
the direct amplification of open chromatin fragments. This eliminates
much of the complex series of enzymatic steps that are unavoidable
features of previous methods such as DNase-seq, allows
the protocol to be completed in just a few hours, and dramatically
lowers the input requirements, down to a few tens of thousands of
cells in bulk reactions, while also enabling single-cell (scATAC) assays
[21, 22].
In this chapter, we describe the most important considerations
for carrying out successful ATAC-seq experiments in the context of
the Omni-ATAC protocol, an optimized version of the ATAC-seq
assay that produces high-quality ATAC libraries for most mammalian
cell lines and cell types, as well as for a number of other
eukaryotes.
2 Materials
[Figure 1 schematic labels: cells; nuclei isolation; Tn5 transposase; tagmentation (transposition, library building, sequencing)]
Fig. 1 Outline of the ATAC-seq assay. Nuclei are isolated from cells and chromatin is incubated with an active
Tn5 transposase carrying PCR amplification adapter sequences. Tn5 preferentially inserts into accessible
chromatin, such as that found at active regulatory elements. After transposition, DNA is purified and PCR
amplification is carried out from the primer landing sites deposited by Tn5
0.1% Tween-20
0.01% Digitonin
6. Lysis Wash Buffer (ATAC-RSB-wash)
10 mM Tris-HCl pH 7.4
10 mM NaCl
3 mM MgCl2
0.1% Tween-20
7. 2× TD buffer
20 mM Tris-HCl pH 7.6
10 mM MgCl2
20% Dimethyl Formamide
8. Tn5 transposase (see Note 1)
3 Methods
3.1 Removal of Non-viable Cells (Optional)
The presence of non-viable cells can negatively affect the quality of
final ATAC-seq libraries, as dead cells generate a general background
of dechromatinized DNA, decreasing the enrichment for
open chromatin regions. Two strategies are usually used to address
this problem:
1. If the fraction of dead cells is not too high (i.e., 5–15%), cells
are treated with DNase (200 U/mL) in culture media, usually
for 30 min at 37 °C. Cells are then washed thoroughly with
1× PBS to remove the DNase.
2. If the fraction of dead cells is higher, live cells can be separated
from dead cells using a Ficoll gradient (Sigma Cat# GE17-1440-02),
with the exact conditions varying depending on
the cell type.
3.2 Preparation of Nuclei
Once the quality of the input cells has been ensured, the next step is
to prepare nuclei and transpose them. The empirically determined
optimum input number of cells for a species with a mammalian-sized
genome is 50,000 diploid cells. Scale appropriately according
to expected genome size and ploidy, and also change other parameters,
such as centrifugation speeds, if necessary.
1. Centrifuge 50,000 viable cells at 500 × g for 5 min at 4 °C.
2. Carefully aspirate the supernatant, avoiding the pellet.
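The advice to scale the input by genome size and ploidy can be made concrete with a small calculation. The sketch below assumes that the aim is to keep the total amount of genomic DNA per reaction roughly constant, so that the cell number scales inversely with genome size and ploidy. That proportionality, the reference haploid genome size of 3.1 Gb, and the function name are illustrative assumptions rather than part of the protocol.

```python
# Reference point stated in the protocol: 50,000 diploid cells with a mammalian-sized genome.
REFERENCE_CELLS = 50_000
REFERENCE_HAPLOID_BP = 3.1e9   # assumed haploid size of a mammalian genome
REFERENCE_PLOIDY = 2


def scaled_cell_input(haploid_genome_bp, ploidy=2):
    """Cell number giving roughly the same DNA content as the reference input.

    Assumes input should scale inversely with DNA content per cell
    (haploid genome size x ploidy); this is a heuristic, not a protocol rule.
    """
    if haploid_genome_bp <= 0 or ploidy <= 0:
        raise ValueError("genome size and ploidy must be positive")
    reference_dna = REFERENCE_CELLS * REFERENCE_HAPLOID_BP * REFERENCE_PLOIDY
    return round(reference_dna / (haploid_genome_bp * ploidy))
```

For example, `scaled_cell_input(3.1e9, ploidy=4)` halves the input relative to the diploid reference. The result is best treated as a starting point for empirical optimization.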
3.4 DNA Purification
1. Immediately stop the reaction using 250 μL (i.e., 5×) of PB
buffer (if using MinElute) or DNA Binding Buffer (if using
Zymo; also see Note 12).
2. Purify samples following the kit instructions.
3. Elute with 10 μL of Elution Buffer.
[Figure 2 schematic labels: (a) A, insert, B, with ME segments; P5, i5, SR1; SR2, i7, P7. (b) Adapter and primer sequences:]
Adapter A: 5’-TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3’
ME: 3’-TCTACACATATTCTCTGTC-5’
Adapter B: 5’-GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3’
ME: 3’-TCTACACATATTCTCTGTC-5’
i7 primer: 5’-CAAGCAGAAGACGGCATACGAGAT[i7]GTCTCGTGGGCTCGG-3’
i5 primer: 5’-AATGATACGGCGACCACCGAGATCTACAC[i5]TCGTCGGCAGCGTC-3’
Fig. 2 Structure of an ATAC-seq library. (a) After transposition, an original DNA fragment is flanked by two Tn5
molecules with their adapters. “A” and “B” indicate the two different adapters carried by the Tn5 molecules used
for transposition; these sequences share a common “ME” segment. All three possible configurations (A-A, B-B,
and A-B/B-A) are produced, but only the A-B ones can be subsequently amplified and
sequenced under conventional protocols. The A and B adapters are used as landing sites for the PCR primers that add
the i5 and i7 barcodes and the P5 and P7 sequences needed for Illumina sequencing. (b) Typical sequences of
A and B adapters and of i5 and i7 PCR primers. The [i7] and [i5] sequences are typically 8 bp long and should
be chosen so as to maximize the sequence distance between each pair of indexes
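One simple way to check a set of indexes against the recommendation in the caption is to compute the minimum pairwise distance across the set. The sketch below uses the Hamming distance (number of mismatched positions) as the measure of sequence distance, which is an assumption; the index sequences shown are hypothetical placeholders, not indexes from any kit.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length index sequences differ."""
    if len(a) != len(b):
        raise ValueError("indexes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indexes):
    """Smallest Hamming distance over all pairs in a set of indexes."""
    if len(indexes) < 2:
        raise ValueError("need at least two indexes to compare")
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


# Hypothetical 8-bp indexes, for illustration only
candidate_indexes = ["ACGTACGT", "TGCATGCA", "GATCGATC", "CTAGCTAG"]
worst_case = min_pairwise_distance(candidate_indexes)
```

A larger minimum distance means that more sequencing errors are needed before one index can be mistaken for another during demultiplexing.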
5 cycles of:
98 °C for 10 s
63 °C for 30 s
72 °C for 30 s
Hold at 4 °C
3. Determining additional cycles using qPCR. Use 5 μL of the
pre-amplified reaction in a total qPCR reaction of 15 μL as
follows:
3.76 μL nuclease-free H2O
0.5 μL of Adapter 1
0.5 μL of Adapter 2
0.24 μL 25× SYBR Green (in DMSO)
5 μL NEBNext High-Fidelity 2× PCR Master Mix
5 μL pre-amplified sample
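When several samples are processed together, the shared components of this recipe are usually combined into a master mix. The sketch below scales the per-reaction volumes listed above and checks that they add up to the stated 15 μL. The 10% pipetting overage and the function names are assumptions made for illustration.

```python
# Per-reaction volumes (in microliters) copied from the qPCR recipe above
QPCR_MIX_UL = {
    "nuclease-free H2O": 3.76,
    "Adapter 1": 0.5,
    "Adapter 2": 0.5,
    "25x SYBR Green (in DMSO)": 0.24,
    "NEBNext High-Fidelity 2x PCR Master Mix": 5.0,
}
SAMPLE_UL = 5.0  # pre-amplified sample, added to each reaction separately


def reaction_total_ul():
    """Total volume of one reaction: shared components plus sample."""
    return round(sum(QPCR_MIX_UL.values()) + SAMPLE_UL, 2)


def master_mix(n_reactions, overage=0.1):
    """Volumes of shared components for n reactions plus a pipetting overage.

    The default 10% overage is an assumed convention, not a protocol value.
    """
    if n_reactions < 1:
        raise ValueError("need at least one reaction")
    factor = n_reactions * (1 + overage)
    return {name: round(vol * factor, 2) for name, vol in QPCR_MIX_UL.items()}
```

Each well then receives 10 μL of master mix and 5 μL of its own pre-amplified sample.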
[Figure 3 plot panels: (a) relative fluorescence versus cycle (cycles 1–20), with the labels +6, +8, and +10; (b) log concentration versus Ct, with the fitted line y = -1.0879x + 11.974, R² = 0.9916]
Fig. 3 Determination of additional PCR cycles (post pre-amplification) and library quantification using qPCR. (a)
Determination of additional PCR cycles; qPCR is performed to determine the number of extra cycles to perform
on the pre-amplified ATAC material without reaching saturation. To determine the number of extra cycles, find
the number of cycles needed to reach 1/3 of the maximum relative fluorescence, and then carry out this
number of additional PCR cycles. (b) Quantification of libraries; qPCR quantification is performed on diluted
ATAC-seq libraries (400×) against a serial dilution of PhiX (200–1.56 pM). A standard curve is generated based
on the PhiX dilutions and used to calculate the molarity of the ATAC-seq library
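The rule in panel (a) can be written as a short calculation. The sketch below assumes that the qPCR instrument exports one baseline-corrected relative fluorescence value per cycle and that the number of extra cycles is taken as the first cycle at or above one third of the maximum; both the rounding convention and the example trace are assumptions for illustration.

```python
def additional_cycles(fluorescence, fraction=1 / 3):
    """First cycle (1-based) whose fluorescence reaches fraction x maximum.

    fluorescence[i] is the relative fluorescence after cycle i + 1.
    """
    if not fluorescence:
        raise ValueError("fluorescence trace is empty")
    if max(fluorescence) <= 0:
        raise ValueError("no amplification signal in trace")
    threshold = fraction * max(fluorescence)
    for cycle, value in enumerate(fluorescence, start=1):
        if value >= threshold:
            return cycle


# Hypothetical qPCR trace, for illustration only
trace = [100, 120, 150, 300, 900, 2500, 6000, 12000, 20000, 27000, 30000, 31000]
extra = additional_cycles(trace)
```

In this hypothetical trace the threshold is about 10,333, which is first exceeded at cycle 8, so eight additional cycles would be run on the remaining pre-amplified material.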
3.6 Library Quantification and Evaluation of Library Quality
Before libraries can be sequenced, they need to be properly quantified
and their quality evaluated. There are two components to this
process: first, evaluation of the insert distribution, and second,
quantification.
1. Examination of library size distribution. This step can be carried
out using a variety of instruments that are now available for
this purpose, such as a TapeStation or a BioAnalyzer. In our
practice we prefer to use a TapeStation (with the D1000 or HS
D1000 kits) due to its ease of use, flexibility, and rapid turnaround
time. Typical results are shown in Fig. 4. A successful
Fig. 4 Evaluation of ATAC-seq library size distribution. Shown is the fragment length distribution as evaluated
using a TapeStation instrument and a D1000 TapeStation kit for an ATAC-seq library for the human GM12878
cell line. When a clear nucleosomal signature is observed, as in the example shown here, the library is most
likely of high quality. Note that the nucleosomal signature can in some cases be obscured by the presence of
high levels of mitochondrial contamination or some other source of highly accessible DNA (see Note 13 for
further discussion)
4 μL Phusion HF Buffer
1 μL 25 μM i7 primer
1 μL 25 μM i5 primer
0.4 μL 10 mM dNTP mix
0.5 μL 25× SYBR Green (in DMSO)
0.2 μL NEB Phusion HF
Run the qPCR reaction with the following settings in a
qPCR machine:
98 °C for 30 s
20 cycles of:
98 °C for 10 s
63 °C for 30 s
72 °C for 30 s
Hold at 4 °C
Create a standard curve based on the PhiX dilutions and
use it to estimate the true molarity of the ATAC-seq library.
Commercial kits such as the NEBNext Library Quant Kit for
Illumina or KAPA Library Quantification Kits can also be used
in a similar manner.
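The standard-curve step can be sketched as a linear fit of log concentration against Ct, followed by interpolation of the library Ct and correction for the 400× dilution. In the sketch below, the two-fold spacing of the PhiX series, the use of log2, and all Ct values are assumptions for illustration; any logarithm base works as long as it is used consistently.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def library_molarity_pm(standard_pm, standard_ct, library_ct, dilution=400):
    """Molarity (pM) of the undiluted library from a PhiX standard curve."""
    log_conc = [math.log2(c) for c in standard_pm]
    slope, intercept = fit_line(standard_ct, log_conc)
    diluted_pm = 2 ** (slope * library_ct + intercept)
    return diluted_pm * dilution


# Assumed two-fold PhiX series from 200 pM down to about 1.56 pM
phix_pm = [200 / 2 ** i for i in range(8)]
# Hypothetical Ct values spaced one cycle apart (perfect doubling)
phix_ct = [5.0 + i for i in range(8)]
```

With these hypothetical values, a diluted library with a Ct of 7.0 falls on the 50 pM standard, which corresponds to 20,000 pM (20 nM) in the undiluted library.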
[Figure 5 plot panels: (a) fraction of fragments versus fragment length (0–500 bp); (b) average RPM versus position relative to TSS; (c) genome browser view of hg38 chr8:127,700,000–127,850,000 (100-kb scale bar) showing CASC11, MYC, and PVT1]
Fig. 5 Expected results from a successful ATAC-seq experiment. (a) Shown is the insert length distribution of a
typical sequenced mammalian ATAC-seq library, showing a prominent subnucleosomal peak, as well as a
mononucleosomal and a less pronounced dinucleosomal peak. (b) Aggregate ATAC-seq signal profile around
transcription start sites (TSSs). (c) ATAC-seq profile in a 212-kb neighborhood around the human MYC gene.
The ENCODE Consortium [34] keratinocyte dataset with accession ID ENCSR798IJQ was used for this example
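The insert length distribution in panel (a) is simple to compute once fragment lengths have been extracted from the aligned read pairs. The sketch below takes a plain list of fragment lengths and reports the fraction of fragments at each length, plus the fraction falling into subnucleosomal, mononucleosomal, and dinucleosomal classes. The class boundaries are illustrative assumptions, not values from this chapter, and the example lengths are made up.

```python
from collections import Counter

# Assumed length classes in bp (inclusive bounds); adjust to the organism and data
SIZE_CLASSES = {
    "subnucleosomal": (0, 99),
    "mononucleosomal": (180, 247),
    "dinucleosomal": (315, 473),
}


def length_fractions(fragment_lengths):
    """Fraction of fragments at each observed length (y-axis of the distribution)."""
    counts = Counter(fragment_lengths)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no fragments supplied")
    return {length: n / total for length, n in sorted(counts.items())}


def class_fractions(fragment_lengths):
    """Fraction of fragments falling into each assumed size class."""
    total = len(fragment_lengths)
    if total == 0:
        raise ValueError("no fragments supplied")
    return {
        name: sum(lo <= length <= hi for length in fragment_lengths) / total
        for name, (lo, hi) in SIZE_CLASSES.items()
    }


# Made-up fragment lengths, for illustration only
example_lengths = [50, 60, 60, 200, 210, 350, 120, 80]
```

Fragments that fall between the assumed classes (120 bp in the example) are counted in the total but assigned to no class, so the class fractions need not sum to one.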
4 Expected Results
5 Notes
5’-CAAGCAGAAGACGGCATACGAGAT[i7]GTCTCGTGGGCTCGG-3’
5’-AATGATACGGCGACCACCGAGATCTACAC[i5]TCGTCGGCAGCGTC-3’
Acknowledgements
The authors thank members of the Greenleaf and Kundaje labs for
many helpful discussions. This work was supported by NIH grants
UM1HG009436 and P50HG007735 (to W.J.G.). WJG is a Chan
References
1. Luger K, Mäder AW, Richmond RK et al. (1997) Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 389:251–260
2. Wu C (1980) The 5′ ends of Drosophila heat shock genes in chromatin are hypersensitive to DNase I. Nature 286(5776):854–860
3. Keene MA, Corces V, Lowenhaupt K et al. (1981) DNase I hypersensitive sites in Drosophila chromatin occur at the 5′ ends of regions of transcription. Proc Natl Acad Sci U S A 78:143–146
4. McGhee JD, Wood WI, Dolan M et al. (1981) A 200 base pair region at the 5′ end of the chicken adult β-globin gene is accessible to nuclease digestion. Cell 27:45–55
5. Dorschner MO, Hawrylycz M, Humbert R et al. (2004) High-throughput localization of functional elements by quantitative chromatin profiling. Nat Methods 1:219–225
6. Sabo PJ, Humbert R, Hawrylycz M et al. (2004) Genome-wide identification of DNaseI hypersensitive sites using active chromatin sequence libraries. Proc Natl Acad Sci U S A 101:4537–4542
7. Sabo PJ, Kuehn MS, Thurman R et al. (2006) Genome-scale mapping of DNase I sensitivity in vivo using tiling DNA microarrays. Nat Methods 3:511–518
8. Crawford GE, Holt IE, Whittle J et al. (2006) Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res 16:123–131
9. Boyle AP, Davis S, Shulha HP et al. (2008) High-resolution mapping and characterization of open chromatin across the genome. Cell 132:311–322
10. Thurman RE, Rynes E, Humbert R et al. (2012) The accessible chromatin landscape of the human genome. Nature 489:75–82
11. Kelly TK, Liu Y, Lay FD et al. (2012) Genome-wide mapping of nucleosome positioning and DNA methylation within individual DNA molecules. Genome Res 22:2497–2506
12. Krebs AR, Imanci D, Hoerner L, Gaidatzis D et al. (2017) Genome-wide single-molecule footprinting reveals high RNA polymerase II turnover at paused promoters. Mol Cell 67:411–422.e4
13. Shipony Z, Marinov GK, Swaffer MP et al. (2018) Long-range single-molecule mapping of chromatin accessibility in eukaryotes. bioRxiv 504662
14. Wang Y, Wang A, Liu Z et al. (2019) Single-molecule long-read sequencing reveals the chromatin basis of gene expression. Genome Res 29:1329–1342
15. Aughey GN, Estacio Gomez A, Thomson J et al. (2018) CATaDa reveals global remodelling of chromatin accessibility during stem cell differentiation in vivo. Elife 7:e32341
16. Chereji RV, Eriksson PR, Ocampo J, Clark DJ (2019) DNA accessibility is not the primary determinant of chromatin-mediated gene regulation. bioRxiv 639971
17. Ponnaluri VKC, Zhang G, Estéve PO et al. (2017) NicE-seq: high resolution open chromatin profiling. Genome Biol 18(1):122
18. Umeyama T, Ito T (2017) DMS-seq for in vivo genome-wide mapping of protein-DNA interactions and nucleosome centers. Cell Rep 21:289–300
19. Timms RT, Tchasovnikarova IA, Lehner PJ (2019) Differential viral accessibility (DIVA) identifies alterations in chromatin architecture through large-scale mapping of lentiviral integration sites. Nat Protoc 14:153–170
20. Buenrostro JD, Giresi PG, Zaba LC et al. (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–1218
21. Buenrostro JD, Wu B, Litzenburger UM et al. (2015) Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523:486–490
22. Cusanovich DA, Daza R, Adey A et al. (2015) Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348:910–914
23. Corces MR, Trevino AE, Hamilton EG et al. (2017) An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods 14:959–962
Genome-Wide Mapping of Active Regulatory Elements Using ATAC-seq 19
24. Corces MR, Buenrostro JD, Wu B et al. (2016) Lineage-specific and single-cell chromatin accessibility charts human hematopoiesis and leukemia evolution. Nat Genet 48:1193–1203
25. Picelli S, Björklund AK, Reinius B et al. (2014) Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res 24:2033–2040
26. Lu Z, Hofmeister BT, Vollmers C et al. (2017) Combining ATAC-seq with nuclei sorting for discovery of cis-regulatory regions in plant genomes. Nucleic Acids Res 45:e41
27. Maher KA, Bajic M, Kajala K et al. (2018) Profiling of accessible chromatin regions across multiple plant species and cell types reveals common gene regulatory principles and new control modules. Plant Cell 30:15–36
28. Bajic M, Maher KA, Deal RB (2018) Identification of open chromatin regions in plant genomes using ATAC-seq. Methods Mol Biol 1675:183–201
29. Deal RB, Henikoff S (2010) A simple method for gene expression and chromatin profiling of individual cell types within a tissue. Dev Cell 18:1030–1040
30. Daugherty AC, Yeo RW, Buenrostro JD et al. (2017) Chromatin accessibility dynamics reveal novel functional enhancers in C. elegans. Genome Res 27:2096–2107
31. Schep AN, Buenrostro JD, Denny SK et al. (2015) Structured nucleosome fingerprints enable high-resolution mapping of chromatin architecture within regulatory regions. Genome Res 25:1757–1770
32. Cusanovich DA, Reddington JP, Garfield DA et al. (2018) The cis-regulatory dynamics of embryonic development at single-cell resolution. Nature 555:538–542
33. Cao J, Cusanovich DA, Ramani V et al. (2018) Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361:1380–1385
34. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74
Chapter 2
Abstract
The organization of nucleosomes in eukaryotic chromatin is thought to play a critical role in regulating
the biological function of the chromatin. Because of this potential role, a number of
techniques have been developed that combine chromatin fragmentation around nucleosomes with next-generation
sequencing to map the location of nucleosomes in chromatin. In this section, a procedure that uses
a kit from New England Biolabs (NEBNext Ultra II FS DNA Library Prep Kit) to fragment chromatin in
preparation for next-generation sequencing is described and compared to other available procedures for
mapping nucleosome location.
1 Introduction
It has been known for many years that eukaryotic DNA is found
within the nucleus of a cell organized with histones to form chro-
matin. The basic building block of chromatin is the nucleosome,
which consists of approximately 145 base pairs of DNA wrapped
around a histone octamer core containing two copies each of
histone H2A, H2B, H3, and H4. As shown in Fig. 1 for the
eukaryotic virus Simian Virus 40 (SV40), the nucleosomes typically
appear as “beads on a string” in chromatin.
Figure 1 also shows a short region of DNA, indicated by an
arrow, that appears to lack at least one nucleosome. This region of
“naked” DNA is found in the SV40 regulatory region [1–3] and for
obvious reasons it has been referred to as a “nucleosome-free
region” (NFR). The presence of a specialized chromatin structure,
such as the NFR found in SV40 chromatin, which is characterized
by specific nucleosome location and/or histone modifications,
appears to be a general characteristic of genes that are poised for
transcription or are actively transcribing [4].
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_2,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
22 Barry Milavetz et al.
[Figure plot: reads (y-axis) versus nucleotide number (x-axis)]
Fig. 2 Workflow for mapping nucleosomes using chromatin fragmentation and next-generation sequencing.
Blue circle = nucleosome; blue rectangle = adapter 1; gold rectangle = adapter 2
1.1 Basic Protocol for Preparing Sequencing Libraries from Chromatin Fragmented Using the FS Kit
This protocol describes procedures for fragmenting SV40 chromatin
using the proprietary reagents in the New England Biolabs
NEBNext Ultra II FS DNA Library Prep Kit and for preparing DNA
sequencing libraries from the fragmented chromatin using the New
England Biolabs NEBNext Ultra II DNA Library Prep Kit and E7335S
Multiplex Oligos for Illumina. The protocol includes a procedure
that uses submerged agarose gel electrophoresis to select and purify
the subset of the library members that contain insert fragments of
SV40 DNA of approximately 60–200 base pairs in size.
2 Materials
3 Methods
samples at the same time: the first would be the input chroma-
tin, the second input chromatin with the supplied reaction
buffer, and the third input chromatin, reaction buffer and
enzyme.
3. We also set up a Monarch column for each sample and labeled
the columns and flex tubes to receive the purified DNA follow-
ing column purification of the fragmented chromatin.
4. Each PCR tube in a set then received 13 μL of SV40 chromatin
using the 200 μL Pipetman. The buffer control and enzyme
tubes then received 3.5 μL of the reaction buffer using the
10 μL Pipetman and the enzyme tube received an additional
1 μL of the enzyme mix from the FS kit using the 2 μL Pipet-
man. Finally, using the 10 μL Pipetman set to 10 μL, the liquid
in each tube was mixed by drawing the liquid into the 10 μL tip
followed by forcing the liquid back into the PCR tube.
5. Fragmentation was carried out for varying times and tempera-
tures to optimize the generation of fragments. Typically, frag-
mentation was done for 5–15 min at either 4 °C or 37 °C. The
lower temperature was tested first since at 4 °C nucleosomes
would not be expected to slide appreciably and the results
would be expected to most closely resemble the original orga-
nization of nucleosomes in the chromatin. However, if only
limited fragmentation occurred, we then tested the higher
temperature.
6. Following fragmentation, the reaction in each PCR tube was
stopped by the addition of 100 μL of binding buffer (Monarch
kit) using the 200 μL Pipetman. The binding buffer and sample
in each tube were mixed as above using the 200 μL Pipetman
and then added to the Monarch column. The DNA was purified
according to the protocol and with the reagents supplied in the
Monarch kit. The purified bound DNA was eluted with 13 μL
of nuclease-free water (Ambion) and stored at -20 °C until
used in subsequent steps.
7. The extent of fragmentation in each of the samples was determined
by qPCR. The assay is based on the idea that PCR will
only amplify a particular region of DNA if the DNA is intact. By
comparing the amount of amplification product from the
untreated sample to the amount of product following addition
of reaction buffer alone or of a mixture of reaction buffer and enzyme
mix, it is possible to determine the extent of fragmentation that
occurred in the buffer alone or with the mixture of buffer and
enzymes at each of the fragmentation conditions. In order to
analyze SV40 chromatin, we have a number of sets of primers,
all of which yield amplification products between 200 and
400 base pairs in size. The primer sets recognize various
regions of the SV40 genome including the regulatory region,
Mapping Nucleosome Location Using FS-Seq 27
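The qPCR comparison described in step 7 can be expressed as a difference in cycle threshold (Ct) between the untreated and treated samples. The sketch below assumes equal template input in both reactions and perfect doubling per cycle (amplification efficiency of 2); these assumptions, the function names, and the example Ct values are illustrative and are not taken from the protocol.

```python
def fraction_intact(ct_untreated, ct_treated, efficiency=2.0):
    """Fraction of amplifiable (intact) template remaining after treatment.

    A later Ct in the treated sample means less intact template. The value is
    capped at 1.0 so that measurement noise cannot report more than 100% intact.
    """
    if efficiency <= 1.0:
        raise ValueError("amplification efficiency must exceed 1")
    return min(1.0, efficiency ** (ct_untreated - ct_treated))


def percent_fragmented(ct_untreated, ct_treated, efficiency=2.0):
    """Percentage of template cut within the amplicon during treatment."""
    return 100.0 * (1.0 - fraction_intact(ct_untreated, ct_treated, efficiency))
```

For instance, a shift from a hypothetical Ct of 20 in the untreated sample to 22 after buffer plus enzyme corresponds to 25% of the template remaining intact across that amplicon. Running the same calculation on the buffer-only control separates the effect of the buffer from that of the enzyme mix.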
3.2 Preparation of Sequencing Libraries from FS Fragmented DNA Using the New England Biolabs NEBNext Ultra II DNA Library Prep Kit
1. The fragmented DNA obtained using the FS kit was then used
for the preparation of sequencing libraries using an NEBNext
Ultra II DNA Library Prep Kit designed for sequencing
on an Illumina sequencing platform. All biochemical manipulations
associated with the preparation of libraries with this kit
were performed in a BSL-II hood. An Eppendorf MasterCycler
Personal PCR located in the hood was set to a block temperature
of 4 °C and a lid temperature of 65 °C. With the heated lid
up, sterile thin-walled PCR tubes being used for library preparation
were placed in the 4 °C block of the cycler and cooled for
at least 10 min. For most library preparations, we generated
eight libraries at one time. At the same time that the cycler was
being set up, we added 11 μL of adapter dilution buffer (NEB)
using the 10 μL Pipetman to another PCR tube and placed
this tube in a -20 °C freezer for later use.
10. The dissolved library was added to the Gel DNA recovery
column and purified according to the protocol supplied by
the kit. Following the required washes, the library DNA was
eluted in 25 μL of nuclease-free water. The eluted library was
dried as above in the Savant DNA 120 SpeedVac concentrator.
11. The dried library was resuspended in 5 μL of nuclease-free
water in preparation for PCR amplification with appropriate
primers. Libraries were amplified in a total volume of
160 μL of amplification buffer. The buffer was prepared by
adding 80 μL of 2X SsoAdvanced Universal SYBR Green Supermix
using a 200 μL Pipetman, 1.6 μL of universal primer
(NEB Multiplex Oligos for Illumina) using a 2 μL Pipetman,
1.6 μL of an indexed primer (NEB Multiplex Oligos for Illumina)
using a 2 μL Pipetman, and 80 μL of nuclease-free water
using a 200 μL Pipetman.
12. Following thorough mixing of the amplification buffer, a
10 μL aliquot was transferred to a PCR tube with the 200 μL
Pipetman to be used as a non-DNA control. A total of 2.5 μL
of the library DNA was added to the remaining 150 μL of
amplification buffer using the 10 μL Pipetman and the DNA
was thoroughly mixed in the buffer. A 10 μL aliquot was
removed with the 200 μL Pipetman and placed into a PCR
tube. The remaining 140 μL of amplification buffer was stored
in a freezer at -20 °C until needed. The non-DNA control and
library DNA PCR tubes were then placed in a BioRad CFX
Connect real-time PCR system and amplified using 1 min
cycles of 60 °C, 72 °C, and 95 °C. Following amplification, the peak
for the amplification containing the library DNA was determined
from the cycle threshold data generated, and the remaining
140 μL was divided into four aliquots of approximately
35 μL each using a 200 μL Pipetman and then amplified to the
empirically determined cycle threshold.
13. Following amplification, the amplified library DNA was purified
with AMPure beads. All manipulation of the amplified libraries was
performed in a BSL-II hood. The contents of the
four tubes were combined, and 95 μL of AMPure was added to
the tube using a 200 μL Pipetman; the contents were mixed thoroughly
and then transferred to a flex tube. The combined contents
were incubated at room temperature for 10 min to allow
the library DNA to bind to the AMPure beads.
14. Following the incubation the tube was centrifuged to ensure
that the contents were all at the bottom of the tube, and the
tube was placed into a magnetic stand to separate the magnetic
beads with bound DNA from the DNA-depleted liquid. The
stand was placed on its side while the magnetic beads were
bound so that the bound beads would be located
approximately half way up the tube and not at the bottom. This
was done so that when the liquid was removed there was less
chance that the beads would be dislodged. After a 10-min
incubation to allow the beads to be separated from the liquid,
the stand was placed upright in order to allow the liquid to
collect at the bottom of the tube. The DNA-depleted liquid
was removed using a 200 μL Pipetman set to 200 μL, being
careful not to dislodge any of the magnetic beads.
15. The beads on the side of the tube were washed twice with
400 μL of a wash solution consisting of 80% ethanol and 20%
nuclease-free water, which was prepared right before use using
a 1 mL Pipetman. Following the removal of the second wash,
the beads were air-dried for 10 min in the hood.
16. The tube containing the magnetic beads was removed from the
magnetic rack and 16 μL of nuclease-free water was added
in two 8 μL portions using a 10 μL Pipetman. The beads and water
were thoroughly mixed by vortexing and incubated for
10 min at room temperature. Following the incubation, the
mixture was centrifuged in a Minicentrifuge LabDoctor 12 for
10 s to force the beads and liquid to the bottom of the flex
tube. The tube was then placed back into the magnetic stand
and incubated for an additional 5 min to allow the beads to
bind to the side of the tube near the bottom and separate from
the nuclease-free water containing the eluted library DNA.
17. A 12 μL aliquot of the nuclease-free water containing the
library DNA was very carefully removed from the tube in two
6 μL portions using a 10 μL Pipetman and placed in a new sterile flex
tube. This aliquot was stored in the freezer at -20 °C and
submitted for DNA sequencing if it met our quality
control criteria. The remaining 4 μL of the library was also
stored in the freezer and later analyzed by submerged
agarose gel electrophoresis to determine the size and amount
of nucleosome-sized DNA in the library.
18. The quality of the amplified libraries was determined using
submerged agarose gel electrophoresis. In preparation for
analysis of the library DNA, we prepared running buffer and an
agarose gel. The running buffer was prepared by adding 7 mL
of a 50X stock TAE buffer to a 500 mL bottle and adding
350 mL of distilled purified water. To identify the location of
DNA in the gel, 17.5 μL of ethidium bromide was added to the
running buffer using a 200 μL Pipetman. In a 100 mL bottle,
we added 1.4 g agarose (Sigma-Aldrich) and 50 mL of the
running buffer and heated the mixture in the microwave to
dissolve the agarose. When the agarose was completely
dissolved, we added 2.5 μL of the ethidium bromide solution,
swirled the agarose, and poured it into the gel apparatus.
32 Barry Milavetz et al.
19. When the gel had cooled and hardened, it was covered with
running buffer. The library sample was suspended in 10 μL of
sample buffer using a 10 μL Pipetman and added to a sample
well. In a well adjacent to the library sample, we added a DNA
marker and subjected the samples to submerged electrophoresis
for approximately 1 h and 15 min with the voltage set to
125 volts. The gel was removed and the DNA present in the
gel visualized on a LI-COR Odyssey Fc. A high-quality library
would be expected to show only a fairly tight band at
approximately 250 base pairs, the size of nucleosomal DNA
with added adapters.
20. Libraries that are of sufficient quality are then used for DNA
sequencing.
21. Libraries are sequenced on an Illumina MiSeq using a MiSeq
Reagent Kit v3 (150 cycle) in the sequencing core at the
University of North Dakota. Typically, 20–25 individual
libraries are sequenced at the same time. Because of the small
size of the SV40 genome, this usually results in enough reads
per library to adequately cover the genome.
3.3 Bioinformatic Analyses
1. Following sequencing of the libraries, the data files generated were
analyzed using standard bioinformatics software. First, the
FASTQ files generated by sequencing were subjected to an initial
quality control analysis using FastQC v0.11.2 [11]. Second,
the adapters attached to the ends of the insert DNA during the
preparation of the libraries were removed using Scythe v0.98
[12]. Third, quality trimming was performed using Sickle
v1.33 [13], and reads with a Phred score of less than
30 and reads smaller than 45 base pairs were discarded. Fourth,
the reads corresponding to African green monkey (Chlorocebus
sabaeus 1.1) and human (hg19) sequences were removed
following alignment to their respective genomes. While we
continue to do this, we have found that it has little effect on the
actual virus reads. Fifth, the reads remaining in the FASTQ files
after these treatments were aligned to the SV40
genome (RefSeq ACC: NC_001669.1), cut between
nucleotides 2666 and 2667, using Bowtie2 v2.2.4 [14]. Cutting the
genome was necessary to display the data as a linear map
because the SV40 genome is normally found as a circle. Sixth,
duplicate reads were removed using the Picard Tools (Broad)
Mark Duplicates function. Seventh, BAM files filtered for
specific DNA size ranges were generated from each biological
library replicate using an awk script. Nucleosome-sized
DNA was identified using filtered reads from 100 to 150 base
pairs in size, while potential transcription factor binding sites
were identified using reads filtered between 60 and 99 base
pairs in size.
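The genome cut and the size filtering described above can be sketched in a few lines of Python. This is an illustration only and not the awk script used by the authors: the function names are hypothetical, the input is assumed to be plain-text SAM from paired-end alignments, and the fragment size is assumed to be the absolute value of the TLEN field (column 9).

```python
def linearize_circular(seq: str, cut_after: int) -> str:
    """Open a circular genome between 1-based positions cut_after and cut_after + 1.

    The base at cut_after + 1 becomes the first base of the linear map.
    """
    if not 1 <= cut_after <= len(seq):
        raise ValueError("cut position outside the genome")
    return seq[cut_after:] + seq[:cut_after]


def fragment_length(sam_line: str) -> int:
    """Absolute template length (TLEN, SAM column 9) of one alignment record."""
    return abs(int(sam_line.split("\t")[8]))


def filter_by_size(sam_lines, min_bp: int, max_bp: int):
    """Yield header lines unchanged, plus alignments whose fragment size is in range."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line
        elif min_bp <= fragment_length(line) <= max_bp:
            yield line


# Toy records (hypothetical): header, a 147 bp, an 85 bp, and a 175 bp fragment.
records = [
    "@SQ\tSN:SV40\tLN:5243",
    "r1\t99\tSV40\t100\t42\t75M\t=\t172\t147\t*\t*",
    "r2\t99\tSV40\t300\t42\t75M\t=\t310\t85\t*\t*",
    "r3\t147\tSV40\t500\t42\t75M\t=\t400\t-175\t*\t*",
]

nucleosome_sized = list(filter_by_size(records, 100, 150))  # 100 to 150 bp
tf_sized = list(filter_by_size(records, 60, 99))            # 60 to 99 bp
```

For the SV40 reference, `linearize_circular(sequence, 2666)` would give a linear sequence starting at nucleotide 2667, matching the cut described in the text.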
3.4 Reagents and Solutions
Ethidium bromide stain (0.5 μg/μL): 50 mg ethidium bromide; add water to 100 mL.
TAE stock (50X): 242 g of Tris base dissolved in 750 mL water.
Add 57.1 mL glacial acetic acid and 100 mL EDTA. Adjust final
volume to 1 L. Bring the pH to 8.5.
TAE electrophoresis running buffer: to 20 mL TAE (50X) stock
buffer, add 980 mL water.
Agarose gel sample buffer: 6.0 mL 10% SDS, 2.0 mL of 0.1 M
EDTA, 50 mL glycerol, 1% Coomassie blue, and water to 100 mL.
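The recipes above follow the usual dilution relation C1 × V1 = C2 × V2. The following Python sketch checks the stated volumes; the function names are hypothetical, and the gel percentage is computed from the 1.4 g agarose in 50 mL buffer given in step 18 of the library preparation.

```python
def stock_volume(stock_conc: float, final_conc: float, final_volume: float) -> float:
    """Volume of stock needed (C1*V1 = C2*V2); concentration units must match."""
    return final_conc * final_volume / stock_conc


def final_concentration(stock_conc: float, stock_volume_: float, diluent_volume: float) -> float:
    """Concentration after adding diluent_volume to stock_volume_ of stock."""
    return stock_conc * stock_volume_ / (stock_volume_ + diluent_volume)


def percent_w_v(mass_g: float, volume_ml: float) -> float:
    """Weight/volume percentage (g per 100 mL)."""
    return 100.0 * mass_g / volume_ml


tae_stock_ml = stock_volume(50, 1, 1000)        # 50X stock for 1 L of 1X buffer
bench_buffer_x = final_concentration(50, 7, 350)  # 7 mL of 50X plus 350 mL water
etbr_ug_per_ul = 50 / 100                         # 50 mg in 100 mL = 0.5 mg/mL
gel_percent = percent_w_v(1.4, 50)                # agarose gel strength
```

Under these numbers, 20 mL of 50X stock in a final 1 L gives 1X buffer, the 7 mL plus 350 mL preparation comes out at about 0.98X, and the gel is about 2.8% (w/v).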
4 Notes
Fig. 3 Mapping nucleosomes using FS-Seq, ATAC-Seq, MN-Seq, and ChIP-Seq. The chromatin was obtained
from SV40 virus particles and analyzed by each of the different procedures as previously described [9]. ChIP-
Seq is shown using antibody targeting H3K9me3 and targeting H4K20me1 in nucleosomes
to that used for ATAC-Seq would likely work with FS-Seq for this
purpose as well. This workflow would consist of preparing nuclei
from cells followed by resuspension of the nuclei in reaction buffer
and addition of fragmentation enzymes to allow for fragmentation.
Fragmentation would be assayed by qPCR measurement of the
amount of a target gene or genes that is found in the buffer
following incubation at different temperatures or times. Since
FS-Seq is enzyme-based like ATAC-Seq and the two procedures
yield similar results with viral chromatin, it seems likely that it
would also tend to target open chromatin and might be an alterna-
tive strategy for analyzing open chromatin.
Specific considerations when preparing fragmented chromatin for
FS-Seq.
In working with viral chromatin, we always determine the relative
amount of DNA present in a sample using qPCR. To
ultimately obtain useful sequencing data following FS-Seq, we have
experimentally determined that for a genome the size of SV40
(5243 base pairs), the input chromatin needs a cycle
threshold of less than 20 cycles. This is because
fragmentation by FS typically shifts the cycle threshold
to around 25 cycles. As noted below, as long as the amount of DNA
in the fragmented samples is in this range, useful libraries can be
prepared. With larger genomes, more input chromatin will
probably be needed to obtain sufficient coverage of the genome.
With SV40 used this way, we obtain
anywhere from around 500 to 5000 reads per library sample.
Since there are only about 24 nucleosomes in SV40, this is
sufficient coverage.
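To put these numbers in perspective, the following Python sketch estimates the fold change implied by a cycle threshold shift and the mean coverage implied by a read count. It rests on assumptions not stated in the text: a qPCR efficiency of 100% (a doubling per cycle), reading the Ct shift from 20 to about 25 as a drop in intact qPCR target, and a mean fragment length of 125 bp (the midpoint of the 100 to 150 bp nucleosome-sized range). The function names are hypothetical.

```python
SV40_GENOME_BP = 5243   # genome size given in the text
SV40_NUCLEOSOMES = 24   # approximate nucleosome count given in the text


def fold_change_from_ct_shift(delta_ct: float, efficiency: float = 1.0) -> float:
    """Fold difference in template implied by a Ct shift (efficiency 1.0 = doubling)."""
    return (1.0 + efficiency) ** delta_ct


def mean_coverage(n_reads: int, fragment_bp: float, genome_bp: int) -> float:
    """Average number of fragments covering each base of the genome."""
    return n_reads * fragment_bp / genome_bp


fold = fold_change_from_ct_shift(25 - 20)               # Ct 20 -> Ct 25
low = mean_coverage(500, 125, SV40_GENOME_BP)           # lower end of the read range
high = mean_coverage(5000, 125, SV40_GENOME_BP)         # upper end of the read range
reads_per_nucleosome_low = 500 / SV40_NUCLEOSOMES
```

With these assumptions, the shift of five cycles corresponds to a 32-fold change, and 500 to 5000 reads correspond to roughly 12-fold to 119-fold mean coverage of the SV40 genome.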
Having analyzed a relatively large number of samples, we
have noted that occasionally a sample of disrupted
virus does not appear to fragment very well. At this time we do
not know why this is the case, since we have successfully
fragmented a number of other samples of chromatin from disrupted
virus. We believe that it may be due to inhibitors
remaining with the chromatin. For example, we use a high
concentration of dithiothreitol to disrupt the virus particles, and this may
be the reason for the problem. We have not noted this issue with
other samples that were not prepared in the presence of
dithiothreitol. We are presently investigating whether the dithiothreitol is
responsible for this inhibition and, if so, whether there are alternative
ways to purify the chromatin from disrupted virus to prevent
it. This observation shows the importance of at least
initially using the two controls listed, chromatin alone and chromatin
with buffer, since qPCR of the samples makes it possible to quickly
determine the extent of chromatin fragmentation. We have also
observed that with some samples, but not all we observed,
Acknowledgments
References
1. Scott WA, Wigmore DJ (1978) Sites in simian virus 40 chromatin which are preferentially cleaved by endonucleases. Cell 15(4):1511–1518
2. Varshavsky AJ, Sundin OH, Bohn MJ (1978) SV40 viral minichromosome: preferential exposure of the origin of replication as probed by restriction endonucleases. Nucleic Acids Res 5(10):3469–3477. PubMed PMID: 214758; PMCID: 342688
3. Waldeck W, Fohring B, Chowdhury K, Gruss P, Sauer G (1978) Origin of DNA replication in papovavirus chromatin is recognized by endogenous endonuclease. Proc Natl Acad Sci U S A 75(12):5964–5968. PubMed PMID: 216004; PMCID: 393097
4. Parmar JJ, Padinhateeri R (2020) Nucleosome positioning and chromatin organization. Curr Opin Struct Biol 64:111–118. https://doi.org/10.1016/j.sbi.2020.06.021
5. Weintraub H, Groudine M (1976) Chromosomal subunits in active genes have an altered conformation. Science 193(4256):848–856. https://doi.org/10.1126/science.948749
6. Voong LN, Xi L, Wang JP, Wang X (2017) Genome-wide mapping of the nucleosome landscape by micrococcal nuclease and chemical mapping. Trends Genet 33(8):495–507. https://doi.org/10.1016/j.tig.2017.05.007. PubMed PMID: 28693826; PMCID: PMC5536840
7. Sun Y, Miao N, Sun T (2019) Detect accessible chromatin using ATAC-sequencing, from principle to applications. Hereditas 156:29. https://doi.org/10.1186/s41065-019-0105-9. PubMed PMID: 31427911; PMCID: PMC6696680
8. Park PJ (2009) ChIP-seq: advantages and challenges of a maturing technology. Nat Rev Genet 10(10):669–680. https://doi.org/10.1038/nrg2641. PubMed PMID: 19736561; PMCID: PMC3191340
9. Milavetz B, Haugen J, Rowbotham K (2020) Comparing a new method for mapping nucleosomes in simian virus 40 chromatin to standard procedures. Epigenetics. 1–10. https://doi.org/10.1080/15592294.2020.1814487
10. Balakrishnan L, Milavetz B (2017) Epigenetic analysis of SV40 minichromosomes. Curr Protoc Microbiol 46:14F 3 1–1F 3 26. https://doi.org/10.1002/cpmc.35
11. Andrews S (2010) FastQC: a quality control for high throughput sequence data
12. Buffalo V (2011) Scythe: a Bayesian adapter trimmer
13. Joshi NA, Fass JN (2011) Sickle – a windowed adaptive trimming tool for FASTQ files using quality
14. Langmead B, Trapnell C, Pop M, Salzberg SL (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10(3):R25. https://doi.org/10.1186/gb-2009-10-3-r25. PubMed PMID: 19261174; PMCID: 2690996
15. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Genome Project Data Processing S (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25(16):2078–2079. https://doi.org/10.1093/bioinformatics/btp352. PubMed PMID: 19505943; PMCID: PMC2723002
16. Ramirez F, Ryan DP, Gruning B, Bhardwaj V, Kilpert F, Richter AS, Heyne S, Dundar F, Manke T (2016) deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44(W1):W160–W165. https://doi.org/10.1093/nar/gkw257. PubMed PMID: 27079975; PMCID: PMC4987876
17. Robinson JT, Thorvaldsdottir H, Winckler W, Guttman M, Lander ES, Getz G, Mesirov JP (2011) Integrative genomics viewer. Nat Biotechnol 29(1):24–26. https://doi.org/10.1038/nbt.1754. PubMed PMID: 21221095; PMCID: PMC3346182
18. Milavetz B, Kallestad L, Gefroh A, Adams N, Woods E, Balakrishnan L (2012) Virion-mediated transfer of SV40 epigenetic information. Epigenetics 7(6):528–534. https://doi.org/10.4161/epi.20057. PubMed PMID: 22507897; PMCID: 3398982
19. Kumar MA, Kasti K, Balakrishnan L, Milavetz B (2018) Directed nucleosome sliding in SV40 minichromosomes during the formation of the virus particle exposes DNA sequences required for early transcription. J Virol. https://doi.org/10.1128/JVI.01678-18
20. Kube D, Milavetz B (1996) Differential regulation by SV40 T-antigen binding at site I defines two distinct classes of nucleosome-free promoter. Anat Rec 244(1):28–32. https://doi.org/10.1002/(SICI)1097-0185(199601)244:1<28::AID-AR3>3.0.CO;2-B
Chapter 3
Abstract
Genome-wide accessible chromatin sequencing and identification has enabled deciphering the epigenetic
information encoded in chromatin, revealing accessible promoters, enhancers, nucleosome positioning,
transcription factor occupancy, and other chromosomal protein binding. The starting biological materials
are often fixed using formaldehyde crosslinking. Here, we describe accessible chromatin library preparation
from low numbers of formaldehyde-crosslinked cells using a modified nick translation method, where a
nicking enzyme nicks one strand of DNA and DNA polymerase incorporates biotin-conjugated dATP,
dCTP, and methyl-dCTP. Once the DNA is labeled, it can be isolated for NGS library preparation. We
term this method universal NicE-seq (nicking enzyme-assisted sequencing). We also demonstrate a
single-tube method that enables direct NGS library preparation from low cell numbers without DNA
purification. Furthermore, we demonstrate universal NicE-seq on FFPE tissue section samples.
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_3,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
40 Hang Gyeong Chin et al.
2 Materials
2.1 Harvesting and Crosslinking Cells
1. HCT116 cells are cultured in McCoy's 5A medium (Thermo
Fisher Scientific #16600082) supplemented with 10% Fetal
Bovine Serum (GemCell #100-500).
2. TrypLE (Thermo Fisher Scientific #12605028, store at R/T
before use).
3. 50 mL conical falcon tubes and pipette tips for automatic pipet.
4. Cell culture flasks.
5. Trypan Blue Solution, 0.4% (Thermo Fisher Scientific
#15250061).
6. Hemocytometer and inverted microscope.
7. 1.5 mL Eppendorf tube for cell harvest; 1.5 mL DNA LoBind
tube (Eppendorf AG #022431021).
8. 16% formaldehyde (Thermo Fisher Scientific #28908).
9. 1X PBS (from 10X PBS, Gibco #70011-044).
10. 2.5 M Glycine (Sigma #G7126).
11. End-over-end bench top rotator (VWR #10136-084).
3 Methods
3.1 Harvesting and Crosslinking Cells
1. Take cells from the incubator and visually check their health
under the microscope. Remove the old medium from the flask and
transfer it to a sterile 50 mL conical centrifuge tube, then add
5–10 mL TrypLE to the adherent cells in the flask and
incubate for 5 min at 37 °C.
2. Gently tap to detach the cells and pipet them into the sterile
conical centrifuge tube containing the old medium (the old
medium/serum will inhibit the activity of the trypsin). Spin down
for 5 min at room temperature at 1500 rpm and remove the
supernatant (decant).
3. Wash in 5 mL 1X PBS, spin down (5 min at room temperature,
1500 rpm), remove the supernatant, and resuspend in 5 mL
1X PBS.
4. Count cells: dilute a small amount of cells 1:1 with Trypan
Blue stain (e.g., 100 μL cells and 100 μL stain), which gives a
dilution factor of 2. Pipet into the edge of the counting chamber
of the hemocytometer. Calculate the number of cells
per mL.
5. Calculate the total number of cells. Take 1 × 10^6 cells and
transfer to a 1.5 mL DNA LoBind Eppendorf tube.
6. Add 62.5 μL of 16% formaldehyde and adjust with 1X PBS up
to 1 mL (final concentration 1% formaldehyde). Incubate
cells for 10 min at room temperature on the end-over-end
rotator (see Note 1).
7. Quench the reaction by adding glycine to 125 mM (for a volume
of 1 mL, add 52 μL of 2.5 M stock) and incubate for 5 min at
room temperature on the end-over-end rotator.
8. Wash cells twice by resuspending in 1 mL of 1X PBS and
spinning down at 1500 rpm for 1 min at 4 °C. Remove the
supernatant; cells may be stored at -80 °C for later use. For
immediate use, resuspend cells in 1X PBS (0.5 mL), count the
cells, and make aliquots depending on how many
cells will be needed for downstream work (ideally 400 μL for
one million cells, which will give 250,000 in 0.1 mL). Note:
Cells may be lost during centrifugation steps. Therefore, the above
counting step is crucial for universal NicE-seq. In our
experience, cell loss of up to ~40% may occur during the crosslinking
step relative to a count taken before adding formaldehyde. This is
critical when cell numbers are limited (see Notes 1 and 2).
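The arithmetic behind the counting, fixation, and quenching steps above can be checked with a short Python sketch. The function names and the example counts are hypothetical, and the chamber factor of 10^4 assumes a standard hemocytometer in which one large square holds 0.1 μL.

```python
def cells_per_ml(counts_per_square, dilution_factor=2, chamber_factor=1e4):
    """Cell density from hemocytometer counts (mean count x dilution x chamber factor)."""
    mean_count = sum(counts_per_square) / len(counts_per_square)
    return mean_count * dilution_factor * chamber_factor


def stock_volume_for_final(stock_conc, final_conc, final_volume):
    """Stock volume when the final volume is fixed (C1*V1 = C2*V2)."""
    return final_conc * final_volume / stock_conc


def spike_in_volume(stock_conc, final_conc, start_volume):
    """Stock volume to add on top of start_volume to reach final_conc."""
    return final_conc * start_volume / (stock_conc - final_conc)


density = cells_per_ml([50, 48, 52, 50])                  # hypothetical counts, 1:1 Trypan Blue
formaldehyde_ul = stock_volume_for_final(16, 1, 1000)     # 16% stock, 1% final, 1 mL
glycine_ul = spike_in_volume(2.5, 0.125, 1000)            # 2.5 M stock, 125 mM final, 1 mL
cells_per_100_ul = 1_000_000 / 400 * 100                  # one million cells in 400 uL
```

These values reproduce the volumes in the protocol: 62.5 μL of 16% formaldehyde, about 52.6 μL of 2.5 M glycine (52 μL in the text), and 250,000 cells per 0.1 mL aliquot.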
3.3.1 Quality Control for Accessible Chromatin Labeling (Optional)
1. Denature genomic DNA by heating for 3 min at 95 °C, then
incubate for 3 min in an ice-water bath.
2. Make a serial dilution (1, 0.5, 0.25, 0.125 μg) of genomic
DNA in MQ water on ice. Never exceed a total volume of
5 μL per spot.
3. Prepare a positively charged nylon membrane (Amersham
#RPN119B) by drawing circles with a pencil where the
DNA will be spotted. Spot the dilutions on the membrane and
let dry.
4. Mark the upper left corner with a cut (for orientation).
5. Wet the membrane by dripping MQ water on top of it so that it
is fully hydrated.
6. Crosslink the DNA by UV: put the membrane in a UV
crosslinker.
7. Wash the membrane with 1X PBST. Transfer the membrane to
a square protein gel box (with lid).
8. Block the membrane with 5% skimmed milk in 1X PBST for
30–60 min at room temperature on a shaker.
9. Add HRP-conjugated goat anti-biotin antibody (1/2000
dilution) for 1 h.
10. Detect the biotin signal with LumiGLO reagent.
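The dilution series and antibody dilution in the steps above involve only simple arithmetic, sketched here in Python. The function names are hypothetical, and the 10 mL antibody incubation volume is an assumed example value, not one given in the protocol.

```python
def twofold_series(top_amount: float, n_points: int) -> list:
    """Two-fold serial dilution starting at top_amount."""
    return [top_amount / 2 ** i for i in range(n_points)]


def min_stock_conc(max_amount_ug: float, max_spot_volume_ul: float) -> float:
    """Lowest DNA concentration that still fits the largest amount in one spot."""
    return max_amount_ug / max_spot_volume_ul


def antibody_volume_ul(buffer_volume_ml: float, dilution: int) -> float:
    """Antibody volume for a 1/dilution working solution."""
    return buffer_volume_ml * 1000 / dilution


amounts_ug = twofold_series(1.0, 4)           # 1, 0.5, 0.25, 0.125 ug
needed_conc = min_stock_conc(1.0, 5.0)        # ug/uL, given the 5 uL per spot limit
antibody_ul = antibody_volume_ul(10, 2000)    # assumed 10 mL of blocking buffer
```

To spot 1 μg in no more than 5 μL, the denatured DNA therefore needs to be at 0.2 μg/μL or higher.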
3.5 Universal NicE-seq Library Construction
1. For DNA pull-down, add 1 mL of 2X High Salt Buffer to the
fragmented DNA tube and add 50 μL of streptavidin magnetic
beads. If the quantity of DNA is below 50 ng or the number of
starting cells is below 2,000, the bead amount can be reduced to
15–20 μL.
2. Incubate for 2 h at 4 °C on an end-over-end rotator.
3. Place the tube on a magnetic rack. When the solution is clear,
remove the liquid using a pipette and resuspend the beads for
5 min in 1 mL of 1X High Salt Buffer containing 0.05%
Triton X-100 at room temperature.
4. Repeat the wash step three more times (four washes in total).
5. Wash the beads once with 1 mL of 1X TE buffer by inverting
5 times.
3.6 PCR Cleanup Using AMPure Beads
1. Bring the AMPure bead solution to room temperature before
the cleanup steps (cleanup with cold beads may reduce
the efficiency of DNA recovery). After the PCR
reaction is over, vortex the PCR tubes for 2–3 s and
spin down quickly. Put the PCR reaction tubes on the magnetic rack
for 1 min, transfer the solution (containing the library) to a new
DNA LoBind tube, and add 0.9 volume (45 μL) of AMPure
beads (see Note 5).
2. Incubate for 10–15 min at room temperature.
3. Put the samples on the magnetic rack. When the solution looks clear,
remove it and wash the beads twice with 200 μL of freshly
prepared 80% ethanol by slowly pipetting the ethanol onto the
beads without removing the tube from the rack. Wait
10 s, remove the liquid from the beads, and repeat. After
removing the ethanol for the second time, quickly spin down
the tubes, put them back on the magnetic rack, and remove the
remaining ethanol at the bottom of the tube.
4. Resuspend the beads in 10 μL of 0.1X TE buffer and incubate
for 10 min at room temperature.
5. Put the tube back on the magnetic rack and transfer the
library-containing solution to a new DNA LoBind tube. This is the final
library DNA.
6. Measure the amount of library DNA using the Qubit HS
DNA Kit in dsDNA High Sensitivity mode. If the concentration
is >1 ng/μL, the library yield is acceptable for NGS
(e.g., Illumina NextSeq 500/550).
7. Analyze the library DNA on the Bioanalyzer with a DNA High
Sensitivity chip to check the actual library quantity. (Note: If
the ligated adaptor is still visible on the Bioanalyzer, the library
pool can be re-purified using AMPure beads after the libraries
are combined.)
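Two small calculations sit behind the cleanup and quantification steps above: the bead volume for a given bead-to-sample ratio, and the conversion of a Qubit mass concentration into molarity. The Python sketch below uses hypothetical function names, assumes an average mass of 660 g/mol per base pair of double-stranded DNA, and uses a mean library length of 300 bp purely as an example; the actual length should be read from the Bioanalyzer trace.

```python
def bead_volume_ul(sample_volume_ul: float, ratio: float = 0.9) -> float:
    """Bead volume for a given bead:sample ratio."""
    return sample_volume_ul * ratio


def library_molarity_nm(conc_ng_per_ul: float, mean_length_bp: float) -> float:
    """Convert ng/uL to nM, assuming 660 g/mol per base pair of dsDNA."""
    return conc_ng_per_ul * 1e6 / (660.0 * mean_length_bp)


beads = bead_volume_ul(50)                    # 0.9X of a 50 uL PCR reaction
molarity = library_molarity_nm(1.0, 300)      # 1 ng/uL, assumed 300 bp mean length
```

The 45 μL bead volume in step 1 corresponds to a 50 μL reaction, and a library at the 1 ng/μL threshold with a 300 bp mean length would be roughly 5 nM.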
3.7 Optional Method A: Sonication Free Labeled DNA Enrichment for NicE-seq Library Preparation Using Nicking Enzyme Digestion
1. Transfer 200 ng of genomic DNA to a 1.5 mL DNA LoBind
Eppendorf tube, add 10 μL of 10X NEB Buffer #2 and 1 U of
Nt.CviPII, and adjust with MQ water up to 100 μL.
2. Incubate overnight (O/N) at 37 °C.
3. After O/N digestion, heat-inactivate Nt.CviPII by incubating
the reaction tube at 65 °C for 10 min.
3.8 Optional Method B: One Tube NicE-seq
This method is a sonication-free, one-tube NicE-seq library
preparation from cultured cells and is recommended for low cell
numbers, i.e., <1000 cells. However, starting material of between
250 and 5,000 cells can be used with the one-tube NicE-seq method.
3.8.1 Harvesting and Crosslinking Cells
Harvest and crosslink cells by the same procedure as described in
the Harvesting and Crosslinking Cells section. Care must be
taken to wash cells with 1X PBS before fixation with 1%
formaldehyde, followed by two washes with 1X PBS. Note: Residual
formaldehyde and glycine may inhibit downstream reactions.
3.8.5 End-Repair/dA Tailing
Follow the procedure described in steps 7–12 of Subheading 3.5,
UniNicE-seq library construction.
3.8.6 Adaptor Ligation
Follow the procedure described in steps 13–20 of Subheading 3.5,
UniNicE-seq library construction. If the starting
material is <1000 cells, the adaptor can be diluted up to 1:30.
3.8.7 PCR Amplification and PCR Cleanup by AMPure Beads
Follow the procedures described in Subheading 3.5,
UniNicE-seq library construction, and in Subheading 3.6,
PCR Cleanup Using AMPure Beads. The number of PCR cycles can be
increased up to 13 while avoiding high PCR duplication. For very low
cell numbers (<250 cells), the number of PCR cycles can be increased
up to 20 (see Note 5).
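The low-input adjustments scattered through Subheadings 3.5 and 3.8 can be collected in one lookup, sketched below in Python. The function and key names are hypothetical, the thresholds are only those stated in the text (bead reduction below 2,000 cells, adaptor dilution below 1000 cells, PCR cycle limits of 13 and 20), and the values are upper limits or ranges to be tuned rather than fixed settings.

```python
def low_input_settings(n_cells: int) -> dict:
    """Summarize the protocol's stated adjustments for a given starting cell number."""
    return {
        # Streptavidin bead volume range in uL (50 uL standard, 15-20 uL below 2,000 cells)
        "streptavidin_beads_ul": (15, 20) if n_cells < 2000 else (50, 50),
        # Largest adaptor dilution factor allowed (1 means undiluted)
        "max_adaptor_dilution": 30 if n_cells < 1000 else 1,
        # Upper limit on PCR cycles (20 only for fewer than 250 cells)
        "max_pcr_cycles": 20 if n_cells < 250 else 13,
    }


very_low = low_input_settings(200)
standard = low_input_settings(5000)
```

For example, `low_input_settings(200)` returns the reduced bead range, an adaptor dilution of up to 1:30, and a limit of 20 PCR cycles.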
3.9 Optional Method C: One Tube NicE-seq of the Human FFPE Samples from 5 to 10 μm Tissue Section
Start with a 5–10 μm FFPE section on the slide.
3.9.1 Removal of Paraffin from the Tissue Section Slide
1. Add 500 μL of mineral oil (Sigma-Aldrich #330760) to the
area of tissue on the slide.
2. Incubate for 20–30 min at 52 °C.
3. Remove the mineral oil carefully, transfer the slide to a Coplin
jar or plate containing 100% ethanol, and incubate for 5 min at
R/T.
3.9.2 Accessible Chromatin Labeling and Decrosslinking
1. Circle the tissue area with a liquid-repellent slide marker pen
and carefully apply 200 μL of 1X Accessible
Chromatin labeling mix inside the circle (see Note 6).
2. Incubate for 2 h at 37 °C in a humidified chamber.
3. Transfer the slide to 1X PBS containing 0.5 M EDTA and
incubate for 10 min at R/T.
4. Carefully collect the tissue into a 1.5 mL DNA LoBind
Eppendorf tube using a surgical scalpel and add 200 μL of ATL
buffer (QIAGEN, QIAamp DNA FFPE Tissue Kit #56404).
5. Add 2 μL of RNase A (1 mg/mL) and incubate for 30 min at
37 °C.
6. Add 20 μL of Proteinase K and incubate O/N at 65 °C.
7. Incubate for 2 min at 95 °C to heat-inactivate
Proteinase K.
8. Cool down for 15 min at R/T.
3.9.3 Enrichment of Labeled Chromatin by the Nicking Enzyme Digestion
1. Add 50 μL of 10X NEB Buffer #2 and 2.5 U of Nt.CviPII to the
tube, adjust with MQ water up to 500 μL, and mix well.
2. Incubate O/N at 37 °C.
3. Incubate for 15 min at 65 °C to heat-inactivate
Nt.CviPII.
3.9.5 End-Repair/dA Tailing
Follow the procedure described in steps 7–12 of Subheading 3.5,
UniNicE-seq library construction.
3.9.6 Adaptor Ligation
Follow the procedure described in steps 13–20 of Subheading 3.5,
UniNicE-seq library construction.
3.9.7 PCR Amplification and PCR Cleanup by AMPure Beads
Follow the PCR amplification and PCR cleanup procedures
described in Subheadings 3.5 and 3.6 (see Note 5).
4 Notes
Acknowledgments
References
1. Keene MJ, Corces V, Lowenhaupt K, Elgin SC DNA-binding proteins and nucleosome posi-
(1981) DNase I hypersensitive sites in Dro- tion. Nat Methods 10:1213–1218
sophila chromatin occur at the 5′ ends of 13. Corces M, Trevino A, Hamilton E et al (2017)
regions of transcription. PNAS 78:143–146 An improved ATAC-seq protocol reduces
2. Babiss LE, Bennett A, Friedman JM, Darnell background and enables interrogation of fro-
JE Jr (1986) DNase I-hypersensitive sites in the zen tissues. Nat Methods 14:959–962
5′-flanking region of the rat serum 14. Liu C, Wang M, Wei X, Wu L, Xu J et al (2019)
albumin gene: correlation between chromatin An ATAC-seq atlas of chromatin accessibility in
structure and transcriptional activity. PNAS 83: mouse tissues. Sci Data 6:65
6504–6508 15. Bysani M, Agren R, Davegårdh C, Volkov P,
3. Winter BB, Arnold HH (1987) Tissue-specific Rönn T, Unneberg P, Bacos K, Ling C (2019)
DNase I-hypersensitive Sites and Hypomethy- ATAC-seq reveals alterations in open chroma-
lation in the Chicken CardiacMyosin Light tin in pancreatic islets from subjects with type
Chain Gene (L2-A). JBC 262:13750–13757 2 diabetes. Sci Rep 9:7785
4. Tsompana M, Buck MJ (2014) Chromatin 16. Bentsen M, Goymann P, Schultheis H, Klee K,
accessibility: a window into the genome. Epi- Petrova A, Wiegandt R, Fust A, Preussner J,
genetics Chromatin 7:33 Kuenne C, Braun T, Kim J, Looso M (2020)
5. Crawford GE, Davis S, Scacheri PC et al ATAC-seq footprinting unravels kinetics of
(2006) DNase-chip: a high-resolution method transcription factor binding during zygotic
to identify DNase I hypersensitive sites using genome activation. Nat Commun 11:4267
tiled microarrays. Nat Methods 3:503–509 17. Davie K, Jacobs J, Atkins M, Potier D,
6. Boyle AP, Davis S, Shulha HP, Meltzer P, Mar- Christiaens V, Halder G, Aerts S (2015) Dis-
gulies EH et al (2008) High-resolution covery of transcription factors and regulatory
mapping and characterization of open chroma- regions driving in vivo tumor development by
tin across the genome. Cell 132:311–322 ATAC-seq and FAIRE-seq open chromatin
7. Giresi PG, Kim J, McDaniell RM, Iyer VR, profiling. PLoS Genet 11:e1004994
Lieb JD (2007) FAIRE (Formaldehyde- 18. Buenrostro JD, Wu B, Litzenburger UM,
Assisted Isolation of Regulatory Elements) iso- Ruff D, Gonzales ML, Snyder MP, Chang
lates active regulatory elements from human HY, Greenleaf WJ (2015) Single-cell chroma-
chromatin. Genome Res 17:877–885 tin accessibility reveals principles of regulatory
8. Jin W, Tang Q, Wan M et al (2015) Genome- variation. Nature 523:486–490
wide detection of DNase I hypersensitive sites 19. Corces MR, Granja JM, Shams S, Louie BH,
in single cells and FFPE tissue samples. Nature Seoane JA, Zhou W, Silva TC, Groeneveld C,
528:142–146 Wong CK, Cho SW, Satpathy AT, Mumbach
9. Schones DE, Cui K, Cuddapah S, Roh TY, MR, Hoadley KA, Robertson AG, Sheffield
Barski A, Wang Z et al (2008) Dynamic regu- NC, Felau I, Castro MAA, Berman BP, Staudt
lation of nucleosome positioning in the human LM, Zenklusen JC, Laird PW, Curtis C, Cancer
genome. Cell 132:887–898 Genome Atlas Analysis Network, Greenleaf
10. Kuan PF, Huebert D, Gasch A, Keles S (2009) WJ, Chang HY (2018) The chromatin accessi-
A non-homogeneous hidden-state model on bility landscape of primary human cancers. Sci-
first order differences for automatic detection ence 362:eaav1898
of nucleosome positions. Stat Appl Genet Mol 20. Ponnaluri VKC, Zhang G, Estève PO,
Biol 8:Article29. https://doi.org/10.2202/ Spracklin G, Sian S, Xu SY, Benoukraf T, Prad-
1544-6115.1454. PMC 2861327 han S (2017) NicE-seq: high resolution open
11. Klein DC, Hainer SJ (2019) Genomic methods chromatin profiling. Genome Biol 18:122
in profiling DNA accessibility and factor locali- 21. Chin HG, Sun Z, Vishnu US, Hao P, Cejas P,
zation. Chromosom Res 28:69–85 Spracklin G, Estève PO, Xu SY, Long HW,
12. Buenrostro JD, Giresi PG, Zaba LC, Chang Pradhan S (2020) Universal NicE-seq for
HY, Greenleaf WJ (2013) Transposition of high-resolution accessible chromatin profiling
native chromatin for fast and sensitive epige- for formaldehyde-fixed and FFPE tissues. Clin
nomic profiling of open chromatin, Epigenetics 12:143
Chapter 4
Abstract
Chromatin accessibility has been an immensely powerful metric for identifying and understanding regu-
latory elements in the genome. Many important regulatory elements, such as enhancers and transcriptional
start sites, are characterized by “open” or nucleosome-free regions. Understanding the areas of the genome
that are not considered open chromatin has been more difficult. Protect-seq is a genomics technique that
aims to identify inaccessible chromatin associated with the nuclear periphery. These regions are enriched for
histone modifications associated with transcriptional repression and correlate with loci identified by other
techniques measuring heterochromatin and peripheral localization. Here, we discuss the protocol and best
practices to perform Protect-seq.
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_4,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
[Figure 1 image. Panels A to E; panel C: gel lanes 1–6 with ladder marks at 100 bp, 200 bp, 500 bp, 1 kb, and 3 kb; panel D: Bioanalyzer trace, RFU versus size (bp); panel E: genome browser view of chr8 (63.0–74.0 M) with Protect-seq, DNaseI-seq, and RefSeq Genes tracks.]
Fig. 1 Overview of Protect-seq in HCT116 cells. (a) Cartoon schematic of Protect-seq. Chromatin in gray/black
and nucleases depicted in red scissors. (b) Microscopy images of untreated (non-digested) and treated
(digested) nuclei stained with DAPI. Reproduced from [4] by permission from Oxford University Press. (c) Gel
electrophoresis image of a typical Protect-seq experiment. Lane 1 is a 2-log DNA Ladder (NEB# N3200S). Lane
2 is empty. Lane 3 is undigested genomic DNA. Lane 4 is nuclei digested with DNase only. Lane 5 is nuclei
digested with MNase only. Lane 6 is nuclei digested with both DNaseI and MNase (Protect-seq method). (d)
Bioanalyzer trace of a typical experiment after DNA purification and NGS library preparation. Note: NEB
adapters are around 120 nucleotides. LM lower marker, UM upper marker, both are internal standards. (e)
Genome browser of example region (chr8: 60–75 M) of Protect-seq and DNaseI-seq, representing inaccessi-
ble and accessible chromatin, respectively. Protect-seq signal track (GSE135580) is represented as log2ratio
(treatment/input) and DNaseI-seq signal track is reprocessed from ENCODE (ENCSR000ENM) and represented
as reads per genomic coverage
Measuring Inaccessible Chromatin Genome-Wide Using Protect-seq 55
2 Materials
2.4 General Material and Equipment
1. Eppendorf DNA LoBind tube 1.5 mL (Cat# 022431021).
2. Tabletop centrifuge.
3. Thermal cycler.
4. ThermoMixer.
5. Nuclease-free water.
3 Methods
3.1 Cell Culture and Crosslinking
1. HCT116 cells were cultured in McCoy's 5A medium
supplemented with 10% fetal bovine serum (FBS) at 37 °C and 5% CO2.
Once cells reached 75% confluency, they were trypsinized, washed
twice with 1X phosphate-buffered saline (PBS), and stored at
-80 °C. Note: These conditions will vary depending on the cell type.
3.5 NGS Library NGS libraries are constructed using NEB Next Ultra II DNA
Preparation library kit following the manufacturer’s protocol. We typically use
200 ng for library construction (ranging from 1 ng to 1 mg). NGS
libraries have also been generated using individual enzymes for
end-repair, dA-tailing, and adapter ligation opposed to a commer-
cial kit.
1. Add end-repair and dA-tailing enzymes.
7 μL NEB Next Ultra II End Prep Reaction Buffer (E7647AA)
3 μL NEB Next Ultra II End Prep Enzyme Mix (E7646AA)
50 μL Purified DNA fragments.
2. Incubate for 30 min at RT.
3. Incubate for 30 min at 65 °C.
4. Add Ligation Enzymes.
60 μL End-Repair/dA-tailed DNA fragments
2.5 μL NEB loop adapter
30 μL Ligation Master mix
1 μL Ligation enhancer.
5. Incubate for 1 h (see Note 5).
6. Add 3 μL USER.
(a) Note this step is specific to NEB loop adapters.
7. Incubate for 30 min at 37 °C.
8. Cleanup DNA fragments and remove unligated adapter with
0.9X AMPure.
9. Resuspend in 15 μL low-TE.
10. Setup PCR reaction.
15 μL Resuspended DNA
5 μL Universal F primer
5 μL Index R primer
25 μL Q5 Master mix (2X).
11. PCR amplify library to add index barcodes.
98 °C for 30 s
5 Cycles of:
98 °C for 10 s
65 °C for 75 s
65 °C for 5 min.
Hold at 4 °C.
12. Cleanup PCR product with 0.9X AMPure or equivalent fol-
lowing manufacturer’s instructions.
13. Measure DNA by Qubit.
14. Store DNA library at -20 °C (long-term) or 4 °C (short-
term).
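The reaction set-ups in steps 1, 4, and 10 can be tallied with a short script, which helps when preparing several libraries at once. This is only a sketch: the per-sample volumes are copied from the steps above, while the `master_mix` helper and its 10% pipetting overage are illustrative assumptions, not part of the protocol.

```python
# Per-sample volumes in microliters, copied from steps 1, 4, and 10 above.
END_PREP = {
    "End Prep Reaction Buffer": 7.0,
    "End Prep Enzyme Mix": 3.0,
    "Purified DNA fragments": 50.0,
}
LIGATION = {
    "End-Repair/dA-tailed DNA fragments": 60.0,
    "NEB loop adapter": 2.5,
    "Ligation Master mix": 30.0,
    "Ligation enhancer": 1.0,
}
PCR = {
    "Resuspended DNA": 15.0,
    "Universal F primer": 5.0,
    "Index R primer": 5.0,
    "Q5 Master mix (2X)": 25.0,
}


def total_volume(reaction):
    """Sum of all component volumes of one reaction (microliters)."""
    return sum(reaction.values())


def master_mix(reaction, sample_component, n_samples, overage=0.1):
    """Scale the shared reagents for n_samples reactions.

    The sample-specific component is left out, and a pipetting overage
    (hypothetical default: 10%) is added to every shared reagent.
    """
    factor = n_samples * (1 + overage)
    return {
        name: round(volume * factor, 2)
        for name, volume in reaction.items()
        if name != sample_component
    }


if __name__ == "__main__":
    for label, reaction in (("End prep", END_PREP), ("Ligation", LIGATION), ("PCR", PCR)):
        print(label, total_volume(reaction), "uL per sample")
    print(master_mix(END_PREP, "Purified DNA fragments", n_samples=8))
```

The end-prep total (60 μL) matches the input volume listed for the ligation in step 4, which is a quick consistency check on the set-up. Because the index primer differs between samples, only reagents common to all reactions should be pooled into a master mix.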
3.6 NGS Library Quantification and Quality Control
Before NGS libraries are sequenced, we perform three quality
controls.
1. Examination of library fragment size distribution. Gel electro-
phoresis and/or Bioanalyzer are suitable for approximating
fragment length. On average, our fragments are 50–200 bp as
shown in Fig. 1d.
2. Estimation of effective library concentration for optimal cluster
density. Following the manufacturer’s instructions, we use
NEB Next Library Quant Kit for Illumina, which includes
P5/P7 primers for amplification and DNA standards for quan-
titative PCR (qPCR).
3. Compare control (non-digested) and treatment (digested)
nuclei using DAPI staining. In HCT116 cells, the treatment
(or digested) sample consistently results in DAPI dense foci
around the nuclear periphery, whereas the input will have signal
throughout the nucleus as shown in Fig. 1b from [4].
3.7 Sequencing and Expected Results
The above protocol is designed to generate sequencing libraries for
Illumina platforms. This version of the Protect-seq protocol results in small
DNA fragments (<200 bp) after nuclease digestion. Therefore, we
recommend the use of paired-end 2 × 50 bp or 2 × 75 bp sequenc-
ing kits. After sequencing, paired-end reads are mapped to the
reference genome. We do not apply a MAPQ threshold because
Protect-seq is enriched at transposable and repetitive elements and
enriched evenly across the centromeres. Signal tracks can be repre-
sented as either fold-change or log2 ratio using MACS2 [10] or
deepTools [11] as shown in Fig. 1e. Thus far, Protect-seq is strongly
correlated with constitutive heterochromatin. Therefore, we
60 George Spracklin et al.
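The log2 ratio (treatment/input) signal described in Subheading 3.7 can be illustrated with a minimal sketch. It is not the MACS2 or deepTools implementation: the function name, the depth scaling by total counts, and the pseudocount of 1 are assumptions chosen for illustration, and the inputs are plain lists of per-bin read counts.

```python
import math


def log2_ratio(treatment, control, pseudocount=1.0):
    """Per-bin log2(treatment / input) for two binned count tracks.

    The input track is first scaled to the sequencing depth of the
    treatment track; the pseudocount (an assumed value) avoids
    division by zero in empty bins.
    """
    if len(treatment) != len(control):
        raise ValueError("tracks must have the same number of bins")
    if sum(control) == 0:
        raise ValueError("input track has no reads")
    scale = sum(treatment) / sum(control)
    return [
        math.log2((t + pseudocount) / (c * scale + pseudocount))
        for t, c in zip(treatment, control)
    ]


if __name__ == "__main__":
    # Toy counts: the first bin is enriched in the digested sample.
    print(log2_ratio([30, 10], [10, 10]))
```

Positive values mark bins enriched in the nuclease-protected (treatment) sample relative to input, and negative values mark depleted bins.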
4 Notes
References
1. Klemm SL, Shipony Z, Greenleaf WJ (2019) Chromatin accessibility and the regulatory epigenome. Nat Rev Genet 20:207–220
2. Becker JS, McCarthy RL, Sidoli S et al (2017) Genomic and proteomic resolution of heterochromatin and its restriction of alternate fate genes. Mol Cell 68:1023–1037.e15
3. Sebestyén E, Marullo F, Lucini F et al (2020) SAMMY-seq reveals early alteration of heterochromatin and deregulation of bivalent genes in Hutchinson-Gilford Progeria Syndrome. Nat Commun 11:6274
4. Spracklin G, Pradhan S (2020) Protect-seq: genome-wide profiling of nuclease inaccessible domains reveals physical properties of chromatin. Nucleic Acids Res 48:e16
5. van Schaik T, Vos M, Peric-Hupkes D et al (2020) Cell cycle dynamics of lamina-associated DNA. EMBO Rep 21:e50636
6. Guelen L, Pagie L, Brasset E et al (2008) Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature 453:948–951
7. Schones DE, Cui K, Cuddapah S et al (2008) Dynamic regulation of nucleosome positioning in the human genome. Cell 132:887–898
8. Boyle AP, Davis S, Shulha HP et al (2008) High-resolution mapping and characterization of open chromatin across the genome. Cell 132:311–322
9. Buenrostro JD, Giresi PG, Zaba LC et al (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–1218
10. Zhang Y, Liu T, Meyer CA et al (2008) Model-based analysis of ChIP-Seq (MACS). Genome Biol 9:R137
11. Ramírez F, Ryan DP, Grüning B et al (2016) deepTools2: a next generation web server for deep-sequencing data analysis. Nucleic Acids Res 44:W160–W165
12. Vogel MJ, Guelen L, de Wit E et al (2006) Human heterochromatin proteins form large domains containing KRAB-ZNF genes. Genome Res 16:1493–1504
13. Berezney R, Funk LK, Crane FL (1970) The isolation of nuclear membrane from a large-scale preparation of bovine liver nuclei. Biochim Biophys Acta 203:531–546
14. Ueda K, Matsuura T, Date N, Kawai K (1969) The occurrence of cytochromes in the membranous structures of calf thymus nuclei. Biochem Biophys Res Commun 34:322–327
15. Kay RR, Fraser D, Johnston IR (1972) A method for the rapid isolation of nuclear membranes from rat liver. Characterisation of the membrane preparation and its associated DNA polymerase. Eur J Biochem 30:145–154
16. Berezney R, Coffey DS (1974) Identification of a nuclear protein matrix. Biochem Biophys Res Commun 60:1410–1417
Chapter 5
Abstract
The hyperactive Tn5 transposase in the ATAC-seq method has been widely used to determine the open
DNA regions and understand the overall epigenomic regulation in the chromatin of eukaryotic cells. Here,
we describe POP-seq (Prokaryotic chromatin Openness Profiling sequencing), an adaptation of the ATAC-
seq method, to interrogate changes in the openness of prokaryotic nucleoids.
Key words Nucleoid structure, Tn5 transposase, Nucleoid-associated proteins, Chromatin structure,
H-NS, HiC, Transcription factor binding sites, POP-seq
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_5,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
64 Mahmoud M. Al-Bassam and Karsten Zengler
2 Materials
3 Methods
(f) 72 °C for 15 s
(g) Return to (c) and repeat 20 times.
2. Transfer the 50 μL amplification mix into a qPCR tube and
start the amplification.
3. Watch the progress of the amplification and stop the reaction at
the end of the exponential phase when the curve starts to
plateau by simply pausing the program at 72 °C after the
reading in step “e” and removing the tube from the qPCR
machine (see Note 7). Incubate on ice.
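The protocol relies on watching the amplification curve by eye. For readers who export the per-cycle fluorescence readings, the idea of "end of the exponential phase" can be sketched as below. The heuristic (flag the first cycle whose gain drops below half of the largest gain seen so far) and the `fraction` parameter are assumptions for illustration only; they are not part of the POP-seq protocol and are sensitive to baseline noise.

```python
def plateau_cycle(fluorescence, fraction=0.5):
    """Return the 1-based cycle at which the curve starts to flatten.

    A cycle is flagged when its fluorescence gain over the previous
    cycle falls below `fraction` times the largest gain observed so
    far. Returns None if the curve is still accelerating.
    """
    max_gain = 0.0
    for i in range(1, len(fluorescence)):
        gain = fluorescence[i] - fluorescence[i - 1]
        if gain > max_gain:
            max_gain = gain
        elif max_gain > 0 and gain < fraction * max_gain:
            return i + 1
    return None


if __name__ == "__main__":
    readings = [1, 1, 1, 2, 4, 8, 16, 24, 28, 29]  # toy values
    print(plateau_cycle(readings))
```

In practice the readings would need smoothing or a minimum signal threshold before a rule like this is trusted, so visual inspection as described in step 3 remains the reference.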
3.4 Purification of the Libraries
1. Mix 1.8× AMPure XP beads with PCR product in either 0.2 or
1.5 mL tubes (depending on the size of the magnet available in
your lab). For example, 90 μL of AMPure XP beads + 50 μL of
PCR product. Incubate the mix for at least 5 min at room
temperature.
2. Place on magnet till the solution is completely clear. While on
the magnet, remove the supernatant and add 170 μL of freshly
prepared 80% ethanol. Incubate for at least 30 s.
3. Remove the 80% ethanol and add another 170 μL of 80%
ethanol. Incubate for at least 30 s and remove the supernatant.
Remove as much ethanol as possible, take the tubes off the
magnet, and let the beads dry at room temperature for
3–4 min to remove any traces of ethanol.
4. Finally, resuspend the dried beads with 25 μL of DNase-free
water, mix well, and incubate at room temperature for 2 min.
Place on magnet.
5. Once the solution is clear, take 23 μL of the purified library and
transfer into 1.5 mL tube.
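The bead volume in step 1 follows directly from the bead-to-sample ratio. A minimal sketch of that arithmetic is shown below; the function names are hypothetical, and the only values taken from the protocol are the 1.8× ratio and the 50 μL PCR product.

```python
def bead_volume_ul(sample_volume_ul, ratio):
    """Volume of AMPure XP beads (uL) needed for a bead:sample ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("volume and ratio must be positive")
    return sample_volume_ul * ratio


def bead_ratio(bead_volume, sample_volume_ul):
    """Bead:sample ratio implied by two pipetted volumes."""
    return bead_volume / sample_volume_ul


if __name__ == "__main__":
    # 1.8x clean-up of a 50 uL PCR product, as in step 1.
    print(bead_volume_ul(50, 1.8))
    print(bead_ratio(90, 50))
```

If the PCR volume differs from 50 μL (for example after evaporation or sampling for qPCR), the bead volume should be recomputed from the measured sample volume rather than copied from the example.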
3.5 Measuring the Library Concentration and Checking the Quality of the Libraries
1. Measure the concentration of the library of each sample using
the Qubit high sensitivity DNA kit as described in Subheading
3.1, step 3.
2. Use a TapeStation (Agilent) to determine the average size and
the quality of the libraries. Both the Qubit concentration and
the library size are required to determine the molar concentration
of each library. See Fig. 1 for an example of the POP-seq
library.
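The conversion from mass concentration and average fragment size to molar concentration mentioned in step 2 can be sketched as follows. It assumes the commonly used average mass of 660 g/mol per base pair of double-stranded DNA; the function name and the example values are illustrative, not measurements from this protocol.

```python
AVG_BP_MASS_G_PER_MOL = 660.0  # assumed average mass of one dsDNA base pair


def library_molarity_nm(conc_ng_per_ul, mean_size_bp):
    """Molar concentration (nM) from Qubit ng/uL and mean library size in bp."""
    if conc_ng_per_ul < 0 or mean_size_bp <= 0:
        raise ValueError("concentration must be >= 0 and size must be > 0")
    return conc_ng_per_ul / (AVG_BP_MASS_G_PER_MOL * mean_size_bp) * 1e6


if __name__ == "__main__":
    # Hypothetical library: 10 ng/uL with a mean size of 400 bp.
    print(round(library_molarity_nm(10, 400), 2), "nM")
```

The mean size should be the value reported by the TapeStation for the library region, since the result scales inversely with fragment length.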
3.6 Alignment of the Library Sequencing Reads to the Reference Genome
1. Trim the libraries using specialized trimming software such as
Cutadapt [12].
2. Determine the quality of the libraries by using FastQC.
3. The trimmed fastq files can be used as input for fastq2wig2.pl,
a custom Perl script that outputs ".wig" files with the genome
coverage normalized as counts per million (CPM).
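The CPM normalization named in step 3 can be illustrated with a short sketch. This is not the fastq2wig2.pl script, whose internals are not described here; the function names are hypothetical, the per-base coverage is assumed to be already computed from aligned reads, and the output uses the fixedStep form of the WIG format.

```python
def cpm_normalize(coverage, total_mapped_reads):
    """Scale raw per-base coverage to counts per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    factor = 1_000_000 / total_mapped_reads
    return [depth * factor for depth in coverage]


def to_wig_lines(chrom, cpm_values, start=1):
    """Format CPM values as fixedStep WIG lines (one value per base)."""
    header = f"fixedStep chrom={chrom} start={start} step=1"
    return [header] + [f"{value:.4f}" for value in cpm_values]


if __name__ == "__main__":
    raw = [2, 0, 5]                       # toy per-base read depth
    cpm = cpm_normalize(raw, 2_000_000)   # library with 2 million mapped reads
    print("\n".join(to_wig_lines("chr", cpm)))
```

Normalizing to CPM makes tracks from libraries of different depth comparable, which matters when replicate tracks such as those in Fig. 2 are inspected side by side.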
Determination of the Chromatin Openness in Bacterial Genomes 67
[Figure 1 image: TapeStation electropherogram; y-axis: Sample Intensity [Normalized FU], x-axis: Size [bp].]
Fig. 1 A snapshot from the TapeStation analysis software showing the lower and upper sizes of a typical
POP-seq library. The average library size is automatically calculated by the software
[Figure 2 image: genome browser tracks WT1, WT2, WT3, IHF1, IHF2, IHF3.]
Fig. 2 A snapshot from the IGB software showing a section of the E. coli BW25113 genome. The three tracks
with black signals represent biological replicates of POP-seq experiments in the wild type. The three tracks
with red signals are three biological replicates of congenic ihfB deletion mutants. The experiments are highly
reproducible for both strains. The genes colored in green have significantly higher Tn5 accessibility in the
wild-type strain compared to ihfB (p-value < 0.05). The locus_tags are shown for four genes
4 Notes
References
Abstract
Open or accessible chromatin typifies euchromatic regions and helps define cell type-specific transcription
programs. DNA replication massively disorders chromatin composition and structure, and how accessible
regions are affected by and recover from this disruption has been unclear. Here, we present repli-ATAC-seq,
a protocol to profile accessible chromatin genome-wide on replicated DNA starting from 100,000 cells. In
this method, replicated DNA is labeled with a short 5-ethynyl-2′-deoxyuridine (EdU) pulse in cultured
cells and isolated from a population of tagmented fragments for amplification and next-generation
sequencing. Repli-ATAC-seq provides high-resolution information on chromatin dynamics after DNA
replication and reveals new insights into the interplay between DNA replication, transcription, and the
chromatin landscape.
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_6, © The Author(s) 2023
72 Kathleen R. Stewart-Morgan and Anja Groth
[Figure 1 schematic labels: EdU pulse; spike-in (EdU-labelled Drosophila S2); nuclei isolation; tagmentation; Click-IT + biotin; streptavidin pulldown; next-generation sequencing.]
Fig. 1 Schematic of repli-ATAC-seq protocol. The cell type of interest is pulsed with EdU and harvested. If
using spike-in, freshly harvested, 100% EdU-labeled D. melanogaster S2 cells are mixed with the cells of
interest prior to nuclei isolation and lysis. DNA is digested with Tn5 transposase and EdU+ DNA fragments are
isolated through Click biotinylation and streptavidin conjugation. These fragments are then amplified and
sequenced using next-generation sequencing. (Adapted from Ref. [15])
2 Materials
2.2.1 For Thymidine Chase to Study Chromatin Maturation
Thymidine, 10 mM stock dissolved in deionized H2O and stored in
aliquots at -20 °C (Sigma).
3 Methods
3.1 EdU Labeling
1. Seed 2 × 10^6 mESCs per sample on a 10 cm gelatinized dish in
10 mL of appropriate media and grow them for 24 h at 37 °C,
5% CO2 (see Note 1).
2. After preparing reagents (see Note 2), begin EdU labeling.
3.2 Cell Lysis
1. To prepare, aliquot 995 μL of Buffer A into a 1.5 mL tube and
add 5 μL of 10% Triton X-100 to the aliquot. Invert to mix. For
steps 2 and 3, work in a 4 °C room.
2. Carefully remove media (see Note 8) from labeled cells and add
200 μL of cold 1X Buffer A with Triton-X. Pipet up and down
5 times to resuspend, being careful to avoid creating bubbles.
3. Incubate on ice for 7 min. Lay tube on ice in the cold room, but
avoid burying the tube in the ice (see Note 9).
4. Pellet nuclei by spinning at 1300 × g for 5 min at 4 °C and
carefully remove lysis buffer as in step 2, here using a P200 set
to 198 μL and gel-loading tips to aspirate all supernatant.
5. Add 100 μL of cold 1X lysis buffer and pipet up and down
10 times to resuspend, being careful to avoid creating bubbles.
6. Split sample into two 1.5 mL low-binding tubes, each containing
50 μL lysate (equivalent to 50,000 cells).
7. Vortex samples for 10 s on medium-high strength.
8. Incubate for 15 min on ice at the bench, i.e., at room temperature
(RT) rather than in the cold room (bury tubes in ice).
9. After incubation, vortex tubes for 10 s at medium-high
strength again. Pellet nuclei by spinning at 600 × g for
10 min at 4 °C and carefully remove lysis buffer as in steps
2 and 4, here using a P200 set at 48 μL and gel-loading tips to
aspirate all supernatant.
11. Repeat steps 9 and 10 3 times, waiting 1 min off the magnetic
rack between washes. Perform washes on maximum 4–6 reac-
tions at a time to avoid overdrying the beads.
12. Wash beads as in steps 9 and 10 twice with 200 μL EBT Buffer.
13. Wash beads as in steps 9 and 10 once with 200 μL 10 mM
Tris–HCl pH 7.5.
14. Pellet the beads on a magnetic rack and carefully remove all
supernatant.
15. Resuspend the beads in 10 μL PCR-grade H2O, transfer to a
0.2 mL low-binding tube, and keep on ice. Proceed to Library
Amplification (see Note 14).
3.6 Library Amplification (on Beads)
1. Set up the PCR reaction by adding the following reagents to
bead-bound DNA in 0.2 mL tubes: 1.25 μL 25 μM Primer 1,
1.25 μL 25 μM Primer 2 (see Note 15), 12.5 μL NEB Next
High-Fidelity 2X PCR Master Mix.
2. Vortex to mix and spin down briefly.
3. Amplify libraries using the following conditions: 72 °C, 5 min;
98 °C, 30 s; 12 cycles of: 98 °C, 10 s; 63 °C, 30 s; 72 °C, 30 s;
4 °C hold.
4. Add 25 μL PCR-grade H2O to each library (final volume:
50 μL).
5. To purify libraries, add 80 μL equilibrated AMPure beads to
each sample (1.6:1 bead ratio).
6. Mix thoroughly by vortexing.
7. Incubate the tube(s) at RT for 10 min to bind DNA fragments
to the beads.
8. During incubation, prepare 400 μL of 80% ethanol per sample.
9. During incubation, warm a thermoblock to 37 °C.
10. Place the tube(s) on the magnet to capture the beads. Incubate
until the liquid is clear.
11. Carefully remove and discard supernatant.
12. Keeping the tube(s) on the magnet, add 200 μL of freshly
prepared 80% ethanol.
13. Incubate the tube(s) on the magnet at RT for ≥30 s, turning
the tubes 180° to ensure all beads pass through the ethanol.
14. Carefully remove and discard the ethanol.
15. Repeat steps 12–14 once. Try to remove all residual ethanol
without disturbing the beads, using a P10 pipette if necessary.
16. Dry the beads at RT for 1–2 min. Caution: Avoid overdrying of
the beads, as it may result in dramatic yield loss.
17. Remove the tube(s) from the magnet. Resuspend the beads in
12 μL of Elution Buffer.
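The cycling conditions in step 3 can be written down as data, which makes it easy to check the number of cycles and estimate the programmed hold time before starting a run. The representation below is a sketch: the `PROGRAM` structure and function name are illustrative, and the estimate ignores ramping between temperatures and the final 4 °C hold.

```python
# Each entry: (number of repeats, [(temperature in deg C, seconds), ...]),
# copied from step 3 of Subheading 3.6.
PROGRAM = [
    (1, [(72, 300)]),                       # 72 C, 5 min
    (1, [(98, 30)]),                        # 98 C, 30 s
    (12, [(98, 10), (63, 30), (72, 30)]),   # 12 cycles
]


def programmed_seconds(program):
    """Total programmed hold time in seconds, excluding ramp times."""
    return sum(
        repeats * sum(seconds for _, seconds in steps)
        for repeats, steps in program
    )


if __name__ == "__main__":
    total = programmed_seconds(PROGRAM)
    print(f"{total} s ({total / 60:.1f} min) of programmed holds")
```

The real run time is longer because of ramping, so the figure is a lower bound for scheduling the subsequent bead clean-up.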
Repli-ATAC-seq 81
4 Notes
15. Primer sequences are from ref. [5], except that in repli-ATAC-
seq the final two nucleotides in each primer are joined by a
phosphorothioate bond. Primer 2 is indexed for multiplexing
sequencing lanes.
References
1. Core LJ, Martins AL, Danko CG et al (2014) Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat Genet 46:1311–1320
2. Li W, Notani D, Rosenfeld MG (2016) Enhancers as non-coding RNA transcription units: recent insights and future perspectives. Nat Rev Genet 17:207–223
3. Boyle AP, Davis S, Shulha HP et al (2008) High-resolution mapping and characterization of open chromatin across the genome. Cell 132:311–322
4. Furey TS (2013) ChIP-seq and beyond: new and improved methodologies to detect and characterize protein-DNA interactions. Nat Rev Genet 13:840–852
5. Buenrostro JD, Giresi PG, Zaba LC et al (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213
6. Stewart-Morgan KR, Petryk N, Groth A (2020) Chromatin replication and epigenetic cell memory. Nat Cell Biol 22:361–371
7. Petryk N, Dalby M, Wenger A et al (2018) MCM2 promotes symmetric inheritance of modified histones during DNA replication. Science 361:1389–1392
8. Yu C, Gan H, Serra-Cardona A et al (2018) A mechanism for preventing asymmetric histone segregation onto replicating DNA strands. Science 361:1386–1389
9. Annunziato AT (2013) Assembling chromatin: the long and winding road. Biochim Biophys Acta 1819:196–210
10. Marchal C, Sima J, Gilbert DM (2019) Control of DNA replication timing in the 3D genome. Nat Rev Mol Cell Biol 20:721–737
11. Fennessy RT, Owen-Hughes T (2016) Establishment of a promoter-based chromatin architecture on recently replicated DNA can accommodate variable inter-nucleosome spacing. Nucleic Acids Res 44:7189–7203
12. Vasseur P, Tonazzini S, Ziane R et al (2016) Dynamics of nucleosome positioning maturation following genomic replication. Cell Rep 16:2651–2665
13. Gutierrez MP, MacAlpine HK, MacAlpine DM (2019) Nascent chromatin occupancy profiling reveals locus- and factor-specific chromatin maturation dynamics behind the DNA replication fork. Genome Res 29:1123–1133
14. Ramachandran S, Henikoff S (2016) Transcriptional regulators compete with nucleosomes post-replication. Cell 165:580–592
15. Stewart-Morgan KR, Reverón-Gómez N, Groth A (2019) Transcription restart establishes chromatin accessibility after DNA replication. Mol Cell 75:284–297.e6
16. Luhur A, Klueg KM, Roberts J, Zelhof AC (2019) Thawing, culturing, and cryopreserving Drosophila cell lines. JoVE 146. https://doi.org/10.3791/59459
17. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International
License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution
and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter’s Creative Commons license,
unless indicated otherwise in a credit line to the material. If material is not included in the chapter’s Creative
Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use,
you will need to obtain permission directly from the copyright holder.
Chapter 7
Abstract
Spatial organization of the genome modulates pivotal biological processes. Emerging technologies
have provided novel insights into genome structure and its role in regulating cell activities. To examine the
genome-wide chromatin interactions at accessible chromatin regions, we developed a DNA transposase-
mediated analysis of chromatin looping (Trac-looping) method for simultaneously detecting chromatin
interactions and chromatin accessibility. Here, we describe a detailed protocol of generating Trac-looping
libraries.
Key words Genome structure, Tn5, Trac-looping, Chromatin looping, Chromatin accessibility
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_7,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
86 Shuai Liu et al.
2 Materials
2.1 Reagents
1. Competent cells BL21 Gold (DE3) (Agilent, cat# 230132).
2. pET15b-His6Tnp (Addgene plasmid #79807).
3. Nuclease-free water (Life Technologies, cat# AM9930).
4. 1 M Tris–HCl pH 7.4 (Quality Biological, cat# 351-006-101).
5. 1 M Tris–HCl pH 8.0 (KD medical, cat# RGF-3360).
6. 1× PBS (Corning Incorporated – Life Sciences, cat#
21-040-CV).
7. 0.5 M EDTA (Quality Biological, cat# 351-027-721).
8. Ethyl alcohol (Warner-Graham, cat# 64-17-5). Caution:
Highly flammable.
9. Isopropanol (MG Scientific, cat# 6810008227637). Caution:
Highly flammable.
10. 16% Formaldehyde (w/v), Methanol-free (Thermo Fisher Sci-
entific, cat# 28906 or 28908).
11. Phenol–Chloroform (Amresco, cat# 0883-100ML). Caution:
Use in chemical hood.
12. Ni-NTA agarose bead slurry (Qiagen, cat#1018244).
13. Imidazole (Sigma, cat# I5513-25G).
14. ATP (Sigma, cat# A7699-1G).
15. 100% Glycerol (Invitrogen, cat# 15514-011).
16. 1 mg/mL Tn5 transposase (homemade, described in the
Methods).
17. 5 M Sodium Chloride, Molecular Biology Grade (Promega,
cat# V4221).
Analysis of Chromatin Interaction and Accessibility by Trac-Looping 87
2.2 Buffers
1. Bacteria lysis buffer (50 mM Tris–HCl pH 8.0, 300 mM NaCl,
20 mM Imidazole, 0.1% Triton X-100, 10 μg/mL Pepstatin A
(Calbiochem, cat# 516481), 10 μg/mL Leupeptin Hemisul-
fate (Calbiochem, cat# 108975), 10 μg/mL Chymostatin
(Calbiochem, cat# 230790), 6 μg/mL Antipain Dihydrochlor-
ide (Sigma, cat# A6191), 1 mg/mL lysozyme (Millipore, cat#
4403)).
2. Ni-NTA beads wash buffer (50 mM Tris–HCl pH 8.0, 1 M
NaCl, 20 mM Imidazole, 0.1% Triton® X-100).
3 Methods
3.1 Expression and Purification of Hyperactive Tn5 and Annealing of Adapters
1. Transform competent cells BL21 Gold (DE3) with pET15b-
His6Tnp. Plate the transformed bacteria cells on LB agar plates
containing 100 μg/mL Carbenicillin and incubate the plates at
37 °C overnight.
2. Inoculate 60 mL LB containing Carbenicillin (100 μg/mL)
with one single colony and incubate at 37 °C with shaking at
200 rpm overnight.
3. Dilute 10 mL of the above culture to 0.6 L of the same media
in a 2 L flask and continue to grow until OD600 reaches 0.8 at
37 °C while shaking at 200 rpm. Use two flasks for a total of
1.2 L media.
4. When the OD600 reaches 0.8, transfer the flasks to ice-water
bath and cool them down for 10 min. Add 300 μL
1 M IPTG (MPbio, cat# 114064112) to 0.5 mM (final
3.2 Cell Fixation
1. Harvest 5 × 10^7 cells and resuspend in 50 mL culture medium
containing 10% FBS in a 50 mL tube.
2. Add 3.33 mL of 16% formaldehyde to the cell suspension (1%
final concentration) and mix by inverting the tube gently.
3.3 Assemble the Tn5 Complex and DNA Transposition Reaction
1. Prepare the Tn5 transposase complex in a 1.5 mL tube by
mixing 16 μL 100% glycerol, 4.5 μL 50 μM annealed half
adapter, and 12.5 μL 9 μM annealed bivalent linker. Mix well
before adding Tn5. Add 30 μL 1 mg/mL Tn5 transposase into
the mixture, then mix gently by pipetting up and down
several times. Incubate at room temperature for 20 min to
assemble the transposase complex. Store the complex on ice.
2. Thaw the fixed 5 × 10^7 cells on ice-water. Spin down the cells at
1500 rpm for 3 min in a microfuge and remove supernatant.
Resuspend cell pellet with 50 mL lysis buffer and incubate on
ice for 15 min to permeabilize the cells.
3. Centrifuge at 370 × g for 10 min at 4 °C, then remove
supernatant.
4. Resuspend the cell pellet with 1.8 mL lysis buffer, then add
200 μL 10× Tn5 reaction buffer. Mix well by pipetting up and
down several times, then dispense the cell suspension into
100 μL aliquots in twenty 1.5 mL tubes (2.5 × 10^6 cells/tube).
5. Add 1.6 μL Tn5 complex into 100 μL cell suspension and mix
well (see Notes 2 and 3). Incubate on a thermal mixer at 37 °C
for 2 h with interval shaking (shaking at 800 rpm 30 sec ON,
5 min OFF). Then add 1.5 μL of Tn5 complex again into the
reaction and continue with the shaking/incubation overnight.
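The volumes in steps 1, 4, and 5 can be cross-checked to confirm that one batch of Tn5 complex covers all transposition reactions. The sketch below only restates numbers given in those steps; the variable names are illustrative.

```python
# Tn5 complex components in microliters (step 1).
COMPLEX_UL = {
    "100% glycerol": 16.0,
    "50 uM annealed half adapter": 4.5,
    "9 uM annealed bivalent linker": 12.5,
    "1 mg/mL Tn5 transposase": 30.0,
}
TOTAL_CELLS = 50_000_000        # 5 x 10^7 fixed cells (step 2)
SUSPENSION_UL = 1800 + 200      # lysis buffer + 10x Tn5 reaction buffer (step 4)
ALIQUOT_UL = 100                # volume per tube (step 4)
ADDITIONS_UL = (1.6, 1.5)       # two additions of Tn5 complex per tube (step 5)

n_tubes = SUSPENSION_UL // ALIQUOT_UL
cells_per_tube = TOTAL_CELLS // n_tubes
complex_made_ul = sum(COMPLEX_UL.values())
complex_needed_ul = n_tubes * sum(ADDITIONS_UL)
spare_ul = complex_made_ul - complex_needed_ul

if __name__ == "__main__":
    print(f"{n_tubes} tubes, {cells_per_tube:.1e} cells per tube")
    print(f"complex made: {complex_made_ul} uL, needed: {complex_needed_ul:.1f} uL")
    print(f"spare volume: {spare_ul:.1f} uL")
```

With the stated volumes the batch leaves roughly 1 μL to spare, so the complex should be pipetted carefully or the batch scaled up slightly if losses are expected.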
3.4 Reverse Cross-Linking and Purify Genomic DNA
1. Stop the reaction by adding 5 μL 0.5 M EDTA (25 mM final
concentration). Pool the reaction mixtures from every two
tubes into one tube (216 μL total volume in each of the final
ten tubes).
2. To each tube, add 6 μL 10% SDS (0.3% final concentration) and
5 μL 20 mg/mL Proteinase K (0.5 mg/mL final concentration).
Incubate on a thermal mixer at 55 °C for 2 h with shaking, then
incubate at 65 °C overnight to reverse the cross-linking.
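The final concentrations quoted in steps 1 and 2 can be reproduced from the pipetted volumes. The sketch below assumes each tube holds the 100 μL aliquot plus the two Tn5 complex additions (1.6 μL and 1.5 μL) from Subheading 3.3; the function name is illustrative. The computed values come out slightly below the rounded figures given in the text.

```python
def final_concentration(stock_conc, added_ul, total_ul):
    """Concentration after adding `added_ul` of stock to reach `total_ul`."""
    return stock_conc * added_ul / total_ul


per_tube_ul = 100 + 1.6 + 1.5                 # aliquot + two Tn5 additions
after_edta_ul = per_tube_ul + 5               # + 5 uL 0.5 M EDTA
edta_mM = final_concentration(500, 5, after_edta_ul)   # 0.5 M = 500 mM

pooled_ul = 2 * after_edta_ul                 # two tubes pooled into one
digest_ul = pooled_ul + 6 + 5                 # + SDS + Proteinase K
sds_percent = final_concentration(10, 6, digest_ul)
proteinase_k_mg_ml = final_concentration(20, 5, digest_ul)

if __name__ == "__main__":
    print(f"pooled volume: {pooled_ul:.1f} uL")
    print(f"EDTA: {edta_mM:.1f} mM")
    print(f"SDS: {sds_percent:.2f} %")
    print(f"Proteinase K: {proteinase_k_mg_ml:.2f} mg/mL")
```

The pooled volume of about 216 μL agrees with the figure stated in step 1, which supports the assumed per-tube composition.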
3.5 Repair DNA Gaps Between the Bivalent Adapter and Genomic DNA
1. To each tube, add 10 μL 10× NEBuffer 2.1, 34 μL ddH2O,
2 μL 10 mM dNTPs, and 4 μL T4 DNA Polymerase (3 units/μL).
Mix well and incubate at room temperature for 1 h. Then
add 5 μL 0.5 M EDTA to each tube to stop the reaction.
2. Add 61 μL AMPure XP beads to each tube (volume ratio of
beads to DNA is 0.6). Mix well and incubate at room tempera-
ture for 30 min. DNA fragments over 200 bp are captured and
free linkers are removed at this step.
3. Collect beads on a magnet and remove the supernatant. Wash
beads with 1 mL 70% ethanol twice. Air dry the beads briefly
after final wash.
4. Elute bound DNA from beads with 188 μL 1× CutSmart
buffer.
3.6 DNA Restriction Enzyme Digestion and Enrichment via the Biotinylated Bivalent Adapter
1. Half of the eluted DNA (five tubes) is subjected to NlaIII
digestion, and the other half (five tubes) is subjected to
MluCI digestion. Add 12 μL of the respective restriction enzyme
(10 units/μL) to each tube. Mix well and incubate at 37 °C
for 3 h. Note: There is neither an NlaIII nor an MluCI site in the
adapter. Only the genomic DNA will be cut.
2. Prepare 40 μL Streptavidin C1 beads for each tube (400 μL for
ten tubes). Wash beads twice with 500 μL 1× B/W buffer, then
resuspend beads with 400 μL 2× B/W buffer.
3. Add 40 μL washed Streptavidin C1 beads to each restriction
digestion tube. Incubate at room temperature for 30 min with
gentle rotation.
4. Collect beads on a magnet and remove the supernatant. Wash
beads 5 times with 1 mL 1× B/W buffer plus 0.1% Triton
X-100 by rotating at room temperature for 5 min. At the final
wash step, transfer the beads in wash buffer to a new tube to
3.7 Self-Circularization of Genomic DNA Fragments in a Large Volume
1. Pool the DNA from all of the ten tubes (2 mL in total) into a
50 mL tube. Add 10 mL 2× T7 DNA ligase buffer and 8 mL
ddH2O, and mix well. Add 40 μL T7 DNA ligase (3000 units/μL)
to the tube. Mix well and incubate at room temperature
overnight.
2. Add 20 mL Phenol–Chloroform to the ligation reaction. Vor-
tex vigorously for 30 s, then spin at 4200 rpm in a centrifuge
for 30 min.
3. Transfer the upper phase to 24 × 1.5 mL tubes (about 0.83 mL
each tube). Spin at 12,000 rpm in a microcentrifuge for
10 min. Transfer the upper phase to a new tube. Add 0.2 μL
GlycoBlue (15 mg/mL), 80 μL 3 M Sodium Acetate (pH 5.2),
and 0.8 mL isopropanol to each tube. Mix well and keep on dry
ice for 30 min.
4. Spin at 12,000 rpm for 30 min at 4 °C. Remove the superna-
tant and wash the pellets twice with 70% ethanol. Air dry the
pellets briefly after final wash. Note: Do not overdry the pellets
before adding the sampling buffer in the next step.
3.8 RCA Reaction in a Small Volume
1. Resuspend the pellet from each tube with 11 μL sampling
buffer from TempliPhi Amplification Kit, then split into two
PCR tubes in an eight-tube strip (5.5 μL each tube). Now the
samples are in 6× eight-tube PCR strips.
2. Heat at 97 °C for 3 min to denature DNA on a thermocycler,
then chill on ice.
3.9 Library Indexing and Amplification
1. Pool RCA reactions from eight PCR tubes in one strip into one
1.5 mL tube. Rinse the PCR tubes with 10 μL ddH2O and
pool with RCA reactions (six tubes, 160 μL total for each tube;
only use 20 μL in the next step).
2. Transfer 20 μL from each tube to a new 1.5 mL tube. Add
20 μL AMPure XP beads to each tube. Mix well and incubate at
room temperature for 30 min (see Note 4).
3. Collect beads on a magnet, then wash beads with 500 μL 70%
ethanol twice. Air dry beads briefly after final wash, then add
40 μL EB to elute DNA from beads.
4. Measure DNA concentration of eluates using NanoDrop Spec-
trophotometer. Usually the concentration is 30–40 ng/μL (see
Note 5).
5. Prepare PCR reaction by mixing 0.5 μL purified RCA product
as template (about 20 ng), 25 μL 2× NEB Phusion HF master
mix, 22.5 μL ddH2O, 1 μL 10 μM Illumina_Nextera_-
PE_PCR_primer_F (such as N501), and 1 μL 10 μM Illumi-
na_Nextera_PE_PCR_primer_R (such as N701). Run the
following PCR program: 98 °C 30 sec; 11 cycles of 98 °C
10 sec, 65 °C 30 sec, and 72 °C 8 sec; 72 °C 5 min; 4 °C hold.
6. After gel electrophoresis, excise the gel slices containing DNA
fragments between 220 and 700 bp. Purify DNA using MinE-
lute Gel Extraction Kit.
7. Measure the DNA concentration of the library using Qubit
dsDNA HS Kit following manufacturer’s instructions.
8. Proceed to Illumina HiSeq or NovaSeq Paired-End 50-8-8-50
format sequencing.
9. Map the sequencing reads to the expected genomes and ana-
lyze chromatin accessibility and interaction as described
previously [33].
4 Notes
References
1. Bonev B, Cavalli G (2016) Organization and function of the 3D genome. Nat Rev Genet 17:661–678. https://doi.org/10.1038/nrg.2016.112
2. Zheng H, Xie W (2019) The role of 3D genome organization in development and cell differentiation. Nat Rev Mol Cell Biol 20:535–550. https://doi.org/10.1038/s41580-019-0132-4
3. Stadhouders R, Filion GJ, Graf T (2019) Transcription factors and 3D genome conformation in cell-fate decisions. Nature 569:345–354. https://doi.org/10.1038/s41586-019-1182-7
4. Schoenfelder S, Fraser P (2019) Long-range enhancer-promoter contacts in gene expression control. Nat Rev Genet 20:437–455. https://doi.org/10.1038/s41576-019-0128-0
5. Yu M, Ren B (2017) The three-dimensional organization of mammalian genomes. Annu Rev Cell Dev Biol 33:265–289. https://doi.org/10.1146/annurev-cellbio-100616-060531
6. Kempfer R, Pombo A (2020) Methods for mapping 3D chromosome architecture. Nat Rev Genet 21:207–226. https://doi.org/10.1038/s41576-019-0195-2
7. Schmitt AD, Hu M, Ren B (2016) Genome-wide mapping and analysis of chromosome architecture. Nat Rev Mol Cell Biol 17:743–755. https://doi.org/10.1038/nrm.2016.104
8. Dekker J, Marti-Renom MA, Mirny LA (2013) Exploring the three-dimensional organization of genomes: interpreting chromatin interaction data. Nat Rev Genet 14:390–403. https://doi.org/10.1038/nrg3454
9. McCord RP, Kaplan N, Giorgetti L (2020) Chromosome conformation capture and beyond: toward an integrative view of chromosome structure and function. Mol Cell 77:688–708. https://doi.org/10.1016/j.molcel.2019.12.021
10. Agbleke AA et al (2020) Advances in chromatin and chromosome research: perspectives from multiple fields. Mol Cell 79:881–901. https://doi.org/10.1016/j.molcel.2020.07.003
11. Lieberman-Aiden E et al (2009) Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326:289–293. https://doi.org/10.1126/science.1181369
12. Rao SS et al (2014) A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159:1665–1680. https://doi.org/10.1016/j.cell.2014.11.021
13. Ren G et al (2017) CTCF-mediated enhancer-promoter interaction is a critical regulator of cell-to-cell variation of gene expression. Mol Cell 67:1049–1058 e1046. https://doi.org/10.1016/j.molcel.2017.08.026
14. Ma W et al (2015) Fine-scale chromatin interaction maps reveal the cis-regulatory landscape of human lincRNA genes. Nat Methods 12:71–78. https://doi.org/10.1038/nmeth.3205
15. Hsieh TS et al (2020) Resolving the 3D landscape of transcription-linked mammalian chromatin folding. Mol Cell 78:539–553 e538. https://doi.org/10.1016/j.molcel.2020.03.002
16. Krietenstein N et al (2020) Ultrastructural details of mammalian chromosome architecture. Mol Cell 78:554–565 e557. https://doi.org/10.1016/j.molcel.2020.03.003
17. Mifsud B et al (2015) Mapping long-range promoter contacts in human cells with high-resolution capture Hi-C. Nat Genet 47:598–606. https://doi.org/10.1038/ng.3286
18. Hughes JR et al (2014) Analysis of hundreds of cis-regulatory landscapes at high resolution in a single, high-throughput experiment. Nat Genet 46:205–212. https://doi.org/10.1038/ng.2871
19. Fullwood MJ et al (2009) An oestrogen-receptor-alpha-bound human chromatin interactome. Nature 462:58–64. https://doi.org/10.1038/nature08497
20. Kieffer-Kwon KR et al (2013) Interactome maps of mouse gene regulatory domains reveal basic principles of transcriptional regulation. Cell 155:1507–1520. https://doi.org/10.1016/j.cell.2013.11.039
21. Mumbach MR et al (2016) HiChIP: efficient and sensitive analysis of protein-directed genome architecture. Nat Methods 13:919–922. https://doi.org/10.1038/nmeth.3999
25. Dixon JR et al (2012) Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485:376–380. https://doi.org/10.1038/nature11082
26. Nora EP et al (2012) Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485:381–385. https://doi.org/10.1038/nature11049
27. Sexton T et al (2012) Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148:458–472. https://doi.org/10.1016/j.cell.2012.01.010
28. Sanborn AL et al (2015) Chromatin extrusion explains key features of loop and domain formation in wild-type and engineered genomes. Proc Natl Acad Sci U S A 112:E6456–E6465. https://doi.org/10.1073/pnas.1518552112
29. Fudenberg G et al (2016) Formation of chromosomal domains by loop extrusion. Cell Rep 15:2038–2049. https://doi.org/10.1016/j.celrep.2016.04.085
30. Davidson IF et al (2019) DNA loop extrusion by human cohesin. Science 366:1338–1345. https://doi.org/10.1126/science.aaz3418
31. Ganji M et al (2018) Real-time imaging of DNA loop extrusion by condensin. Science 360:102–105. https://doi.org/10.1126/science.aar7831
32. Vian L et al (2018) The energetics and physiological impact of cohesin extrusion. Cell 173:1165–1178 e1120. https://doi.org/10.
22. Mumbach MR et al (2017) Enhancer connec- 1016/j.cell.2018.03.072
tome in primary human cells identifies target 33. Lai B et al (2018) Trac-looping measures
genes of disease-associated DNA elements. Nat genome structure and chromatin accessibility.
Genet 49:1602–1612. https://doi.org/10. Nat Methods 15:741–747. https://doi.org/
1038/ng.3963 10.1038/s41592-018-0107-y
23. Gibcus JH, Dekker J (2013) The hierarchy of 34. Buenrostro JD, Giresi PG, Zaba LC, Chang
the 3D genome. Mol Cell 49:773–782. HY, Greenleaf WJ (2013) Transposition of
https://doi.org/10.1016/j.molcel.2013. native chromatin for fast and sensitive epige-
02.011 nomic profiling of open chromatin,
24. Wang S et al (2016) Spatial organization of DNA-binding proteins and nucleosome posi-
chromatin domains and compartments in sin- tion. Nat Methods 10:1213–1218. https://
gle chromosomes. Science 353:598–602. doi.org/10.1038/nmeth.2688
https://doi.org/10.1126/science.aaf8084
Part II
Abstract
The bulk of gene expression regulation in most organisms is accomplished through the action of transcrip-
tion factors (TFs) on cis-regulatory elements (CREs). In eukaryotes, these CREs are generally characterized
by nucleosomal depletion and thus higher physical accessibility of DNA. Many methods exploit this
property to map regions of high average accessibility, and thus putative active CREs, in bulk. However,
these techniques do not provide information about coordinated patterns of accessibility along the same
DNA molecule, nor do they map the absolute levels of occupancy/accessibility. SMF (Single-Molecule
Footprinting) fills these gaps by leveraging recombinant DNA cytosine methyltransferases (MTases) to mark
accessible locations on individual DNA molecules. In this chapter, we discuss current methods and
important considerations for performing SMF experiments.
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_8,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
102 Michaela Hinks et al.
2 Materials
2.1 Methylation Buffers and Reagents
Prepare the RSB-Lysis and RSB-Wash buffers immediately before use by adding the necessary detergents; keep on ice:
1. IGEPAL CA-630 detergent (Sigma Cat# 11332465001; supplied as a 10% solution).
2. Tween-20 detergent (Sigma Cat# 11332465001; supplied as a 10% solution; store at 4 °C).
3. Digitonin detergent (Promega Cat# G9441; supplied as a 2% solution in DMSO; store at -20 °C).
4. RSB buffer (master stock)
10 mM Tris-HCl pH 7.4
10 mM NaCl
3 mM MgCl2
5. RSB-Lysis buffer
10 mM Tris-HCl pH 7.4
10 mM NaCl
3 mM MgCl2
0.1% IGEPAL CA-630
0.1% Tween-20
0.01% Digitonin
6. Lysis Wash Buffer (RSB-wash)
10 mM Tris-HCl pH 7.4
10 mM NaCl
3 mM MgCl2
0.1% Tween-20
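The detergent volumes needed to assemble RSB-Lysis buffer from the RSB master stock follow from C1 × V1 = C2 × V2. The following is a minimal sketch, assuming a 1 mL batch (an illustrative volume, not one specified in this protocol) and the stock and final concentrations listed above; the function and variable names are hypothetical.

```python
def stock_volume_ul(stock_pct, final_pct, final_volume_ul):
    """Volume of stock (uL) needed to reach final_pct in final_volume_ul (C1*V1 = C2*V2)."""
    return final_pct * final_volume_ul / stock_pct

FINAL_VOLUME_UL = 1000.0  # assumed batch size, for illustration only

# (stock %, final %) taken from the reagent and buffer lists above
detergents = {
    "IGEPAL CA-630": (10.0, 0.1),
    "Tween-20": (10.0, 0.1),
    "Digitonin": (2.0, 0.01),
}

volumes = {
    name: stock_volume_ul(stock, final, FINAL_VOLUME_UL)
    for name, (stock, final) in detergents.items()
}
# The remainder of the batch is RSB master stock
rsb_volume_ul = FINAL_VOLUME_UL - sum(volumes.values())

for name, volume in volumes.items():
    print(f"{name}: {volume:.1f} uL")
print(f"RSB master stock: {rsb_volume_ul:.1f} uL")
```

For RSB-Wash, the same calculation applies with Tween-20 as the only detergent.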
Single-Molecule Mapping of Chromatin Accessibility Using NOMe-seq/dSMF 103
2.2 Library Building, Sequencing, and Quality Evaluation
1. Monarch Genomic DNA Purification Kit (NEB, Cat# T3010L) or equivalent
2. NEBNext Enzymatic Methyl-seq Kit (EM-seq, NEB, Cat# E7120L) and associated reagents, or EZ DNA Methylation-Gold Kit (Zymo Research, Cat# D5005) or equivalent, depending on the exact type of SMF experiment being performed (see more details below)
3. Optional, required if doing probe-hybridization enrichment of
genomic locations: SureSelectXT Methyl-Seq Library Prepara-
tion kit (Agilent, Cat# G9651A) and associated reagents
4. Optional, required if doing probe-hybridization enrichment of
genomic locations: SureSelectXT Mouse Methyl-Seq target
enrichment panel and associated reagents (Agilent, Cat#
931052) (or equivalent)
5. Optional, required if doing probe-hybridization enrichment of
genomic locations: Dynabeads MyOne Streptavidin T1
(Thermo Fisher Scientific Cat# 65601)
6. Agencourt AMPure XP Kit (Beckman Coulter Genomics Cat#
A63880)
7. 10 M NaOH, molecular biology grade (Sigma Cat# 72068)
8. 100% Ethanol, molecular biology grade (Sigma-Aldrich Cat#
E7023)
9. 1× Low TE Buffer (10 mM Tris-HCl, pH 8.0, 0.1 mM EDTA)
(Thermo Fisher Scientific Cat# 12090015)
3 Methods
The general outline of the dSMF assay is shown in Fig. 1. Nuclei are
first isolated from cells, then chromatin is methylated using a 5mC
methyltransferase, and genomic DNA is purified. Next, base con-
version of unmethylated cytosines into uracils is carried out and
sequencing libraries are prepared. In most cases, a GpC methyl-
transferase is used, e.g., M.CviPI, which methylates cytosines in a
GpC dinucleotide context. This is because the genomes of mam-
mals, plants, and many other species contain endogenous methyla-
tion in CpG context. However, if endogenous CpG methylation is
not present in the samples being analyzed (e.g., yeast, Drosophila,
specially engineered mammalian cells that lack endogenous meth-
ylation [15], and others), an additional CpG methyltransferase can
be used, e.g., M.SssI. This improves the resolution of the assay as
the number of informative positions can be increased by a factor of
two. Historically, the difference between NOMe-seq [7] and dSMF
[8] (dual-enzyme SMF) has been that the latter uses both enzymes.
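To make the notion of informative positions concrete, the sketch below scans a sequence for cytosines in GpC and CpG context. It is an illustration only: it considers the forward strand, the function name is hypothetical, and it flags cytosines in a GCG trinucleotide separately because they sit in both contexts and cannot be assigned to a single enzyme.

```python
def cytosine_contexts(seq):
    """Return {position: context} for forward-strand cytosines in GpC and/or CpG context.

    Context is 'GpC' (preceded by G), 'CpG' (followed by G), or 'GCG' (both,
    i.e., ambiguous between the two contexts). Positions are 0-based.
    """
    seq = seq.upper()
    contexts = {}
    for i, base in enumerate(seq):
        if base != "C":
            continue
        gpc = i > 0 and seq[i - 1] == "G"
        cpg = i + 1 < len(seq) and seq[i + 1] == "G"
        if gpc and cpg:
            contexts[i] = "GCG"
        elif gpc:
            contexts[i] = "GpC"
        elif cpg:
            contexts[i] = "CpG"
    return contexts

example = "AGCATCGTGCGA"
contexts = cytosine_contexts(example)
print(contexts)
```

In samples with endogenous CpG methylation, only positions reported as 'GpC' would be unambiguous accessibility readouts under this scheme.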
There are several ways to create a dSMF sequencing library,
including via hybridization-based probe enrichment of genomic
regions [15], targeted PCR amplification of specific loci, or by
unbiased whole-genome sequencing of methylated DNA. Here
we describe a generalized protocol for creating dSMF libraries
following these approaches using commercially available kits.
We also note that it is possible to carry out SMF on crosslinked
material, but we advise that the exact parameters of any such
protocol be individually optimized depending on the specifics of
the experiment. The protocol described here is for native
chromatin.
3.1 Preparation of Nuclei
The first step of the SMF procedure is to prepare nuclei for methylation. The nuclei isolation procedure described here differs from most previously published SMF protocols and is identical to the Omni-ATAC cell lysis procedure [16], as we have found that it gives optimal and consistent results. It will work well for most mammalian and insect cell lines. Note that tissues and eukaryotic cells with cell walls (e.g., yeast and plant cells) will require different lysis and nuclei isolation procedures:

[Figure 1 labels: DNA extraction; DNA shearing; legend: unmethylated, methylated]
Fig. 1 Outline of the NOMe-seq/dSMF assay. As a first step, nuclei are isolated from cells, and chromatin is incubated with the M.CviPI (GpC) and/or M.SssI (CpG) DNA methyltransferases (M.SssI can usually only be used in biological contexts in which there is no endogenous CpG DNA methylation). DNA is methylated where it is accessible, i.e., where it is not protected by nucleosomes and bound transcription factors. DNA is then purified and fragmented, and chemical or enzymatic conversion is carried out. Three different readout strategies can be applied subsequently: unbiased whole-genome sequencing (left), targeted enrichment using probe-hybridization pulldown, or amplicon sequencing (see the text for more details). After sequencing, single-molecule accessibility maps are generated based on the methylation status of informative positions along DNA
3.3 DNA Purification
Quench the MTase reaction by adding 175 μL of the lysis buffer from the NEB Monarch gDNA extraction kit, along with 3 μL RNase A and 1 μL Proteinase K (supplied with the Monarch gDNA kit). See Note 4.
Purify gDNA following the NEB Monarch gDNA extraction kit instructions.
Following elution, quantify gDNA using Qubit.
3.6 Library Preparation: Amplicon-Targeted SMF
Even greater levels of enrichment and depth of coverage can be obtained by selectively amplifying individual loci. This approach works best together with the EM-seq conversion kit because, as discussed above, it preserves DNA better than bisulfite treatment. Footprinted whole-genome DNA is used as input and carried through the EM-seq procedure up to the final library amplification step. PCR primers specific for a locus (or loci) of interest are then used to make the final targeted library. The challenge with this approach is that PCR primers need to be selected and/or designed so that they work on converted DNA; the specifics of that selection will vary depending on the particulars of the experiment.
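One way to reason about primer design on converted DNA is to convert the reference sequence in silico. The sketch below is an illustration under simplifying assumptions, not the procedure used by the authors: it handles only the top strand, treats conversion as complete, and uses a hypothetical function name and toy sequence.

```python
def convert_top_strand(seq, protect_gpc=False, protect_cpg=False):
    """In silico C->T conversion of the top strand.

    Cytosines in a protected context are treated as methylated and left as C;
    all other cytosines are read as T after conversion and amplification.
    """
    seq = seq.upper()
    out = []
    for i, base in enumerate(seq):
        if base == "C":
            in_gpc = i > 0 and seq[i - 1] == "G"
            in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
            if (protect_gpc and in_gpc) or (protect_cpg and in_cpg):
                out.append("C")
            else:
                out.append("T")
        else:
            out.append(base)
    return "".join(out)

locus = "AGCATCGTCCA"  # toy sequence for illustration
fully_converted = convert_top_strand(locus)
gpc_methylated = convert_top_strand(locus, protect_gpc=True)
print(fully_converted)
print(gpc_methylated)
```

Positions where the two outputs differ are the ones whose sequence depends on methylation status, so a primer that avoids them would be expected to anneal regardless of accessibility.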
3.7 Library Quantification and Evaluation of Library Quality
Before libraries can be sequenced, they need to be properly quantified and their quality evaluated. There are two components to this process: first, evaluation of the insert size distribution, and second, quantification:
1. Examination of library size distribution. This step can be carried out using a variety of instruments that are now available for this purpose, such as a TapeStation or a BioAnalyzer. In our practice, we prefer to use a TapeStation (with the D1000 or HS D1000 kits) due to its ease of use, flexibility, and rapid turnaround time.
2. Quantification of library concentration. For most high-throughput sequencing applications, where fragment size is unimodal, this step can be carried out with a sufficient degree of accuracy using a Qubit fluorometer. Typically, dSMF falls in that category. For libraries with complex fragment distributions, as well as for higher accuracy of quantification, qPCR can be used. Commercial kits, such as the NEBNext Library Quant Kit for Illumina or the KAPA Library Quantification Kits, exist for that purpose, and custom in-house quantification methods can also be used (see the first chapter in this book on ATAC-seq for details).
3.9.1 Adapter Trimming
If working with EM-seq datasets, Trim Galore can be used to trim adapters as follows:
trim_galore --path_to_cutadapt ./cutadapt \
  --clip_R1 9 --clip_R2 9 \
  --three_prime_clip_R1 6 \
  --three_prime_clip_R2 6 \
  --paired EM-seq.read1.fastq.gz \
  EM-seq.read2.fastq.gz

[Figure 2 flow chart: raw reads; adapter trimming; methylation-aware alignment; downstream analysis]
Fig. 2 Outline of NOMe-seq/dSMF computational processing. Raw sequencing reads are first trimmed of adapters (note that it is important to do this properly depending on the type of conversion protocol used for making the libraries). They are then aligned against the target genome or amplicons in a methylation-aware manner. Subsequently, alignments are used to make aggregate methylation tracks (if data is to be used to evaluate bulk accessibility) and single-molecule plots (for actual footprinting)
3.9.2 Read Mapping and Alignment Filtering
We use BWAmeth for alignment of base-converted datasets:

python MethylationPercentageContext.py \
  EM-seq.pUC19.dedup.bam pUC19.fa CG,GC \
  EM-seq.pUC19.dedup.CG-GC-meth-perc
3.9.4 Methylation Calling
The next step is to extract methylation calls, using MethylDackel:
Note the parameters used so that both CpG and GpC contexts are included in the output. However, further filtering, described below, is needed in order to specifically obtain GpC positions.
3.9.5 Bulk Accessibility or Methylation Profile Generation
For the purpose of generating bulk accessibility profiles (this is often useful for genome browser visualization of results), execute the following steps:
gzip EM-seq.bwameth.dedup_CHG.bedGraph
gzip EM-seq.bwameth.dedup_CHH.bedGraph
gzip EM-seq.bwameth.dedup_CpG.bedGraph
python MethylationPercentageSmooth-dSMF.py \
  EM-seq.bwameth.dedup_CHH.bedGraph.gz \
  genome.fa GpC 10 -MethylDackel -minCov 10 > \
  EM-seq.bwameth.dedup_CHH.GpC-only.minCov10.wig
python MethylationPercentageSmooth-dSMF.py \
  EM-seq.bwameth.dedup_CpG.bedGraph.gz \
  genome.fa CpG 10 -MethylDackel -minCov 10 > \
  EM-seq.bwameth.dedup_CpG.CpG-only.minCov10.wig
UCSC-utils/wigToBigWig \
  EM-seq.bwameth.dedup_CHH.GpC-only.minCov10.wig \
  genome.chrom.sizes \
  EM-seq.bwameth.hg38.dedup_CHH.GpC-only.minCov10.bigWig
3.9.6 Metaprofile Evaluation
It is often useful to generate metaplots over a defined set of genomic features, for quality evaluation (e.g., assessing how strong the methylation levels are around active promoters) and for other analysis tasks (e.g., measuring average footprinting by TFs at their occupancy sites).
python BismarckSequenceContextFilter.py \
  EM-seq.bwameth.hg38.dedup_CHH.bedGraph.gz GC \
  genome.fa | gzip > \
  EM-seq.bwameth.hg38.dedup_CHH.GpC-only.bedGraph.gz
python signalAroundPeaks-nano.py \
  annotation.TSS-0bp.bed 0 1 3 1000 10 \
  EM-seq.bwameth.hg38.dedup_CHH.GpC-only.bedGraph.gz \
  EM-seq.bwameth.hg38.dedup_CHH.GpC-only.TSS_profile -bismark.cov
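The idea behind such a metaprofile can be sketched in a few lines of Python. This is not the signalAroundPeaks-nano.py implementation; the function name, the input data structures, and the default radius (1,000 bp) and bin size (10 bp) are assumptions chosen for illustration.

```python
from collections import defaultdict

def metaprofile(sites, tss_list, radius=1000, bin_size=10):
    """Average 1 - methylation in bins around TSSs.

    sites: {(chrom, pos): methylation fraction in [0, 1]}
    tss_list: iterable of (chrom, pos, strand) with strand '+' or '-'
    Returns {bin_start_offset: mean(1 - methylation)} for bins that contain data.
    """
    by_chrom = defaultdict(list)
    for (chrom, pos), meth in sites.items():
        by_chrom[chrom].append((pos, meth))

    sums = defaultdict(float)
    counts = defaultdict(int)
    for chrom, tss, strand in tss_list:
        for pos, meth in by_chrom.get(chrom, []):
            # Orient offsets so that positive values are downstream of the TSS
            offset = pos - tss if strand == "+" else tss - pos
            if -radius <= offset < radius:
                bin_start = (offset // bin_size) * bin_size
                sums[bin_start] += 1.0 - meth
                counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(counts)}

# Toy input: three GpC positions and two TSSs on opposite strands
sites = {("chr1", 1005): 0.8, ("chr1", 995): 0.2, ("chr1", 2003): 0.6}
tss_list = [("chr1", 1000, "+"), ("chr1", 2000, "-")]
profile = metaprofile(sites, tss_list)
print(profile)
```

The nested loop is quadratic and only suitable for small examples; a genome-scale version would index positions per chromosome.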
3.9.7 Generating Single-Molecule Maps
Finally, we illustrate the generation of single-molecule maps. This is done using the dSMF-footprints.py script, which has the heatmap.py custom script as a dependency and offers a wide variety of options for color adjustment, minimal read coverage filtering, and others.
It takes as input a BAM file, a BED file with the windows over which single molecules are to be plotted, the genome sequence, and the sequence context(s) (GC, CG, or both).
In this case, we filter out all alignments that do not cover at least 90% of the input regions and plot the single molecules using the binary Matplotlib colormap, meaning that methylated positions will be shown as light, while protected unmethylated positions will be shown in black.
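The coverage filter and the resulting molecules-by-positions matrix can be illustrated as follows. This is a simplified sketch rather than the dSMF-footprints.py code: reads are represented as plain dictionaries instead of BAM records, and all names are hypothetical.

```python
def single_molecule_matrix(reads, window_start, window_end, positions, min_cover=0.9):
    """Build a molecules x positions matrix of methylation calls.

    reads: list of dicts with 'start', 'end' (0-based, end-exclusive) and
           'calls': {position: 1 if methylated (accessible) else 0}
    positions: sorted informative positions (e.g., GpC cytosines) in the window
    Reads overlapping less than min_cover of the window are discarded.
    Positions without a call are stored as None.
    """
    window_len = window_end - window_start
    matrix = []
    for read in reads:
        overlap = min(read["end"], window_end) - max(read["start"], window_start)
        if overlap / window_len < min_cover:
            continue
        matrix.append([read["calls"].get(p) for p in positions])
    return matrix

reads = [
    {"start": 0, "end": 100, "calls": {10: 1, 50: 0}},   # covers the whole window
    {"start": 40, "end": 100, "calls": {50: 1, 90: 1}},  # covers 60%, filtered out
]
matrix = single_molecule_matrix(reads, 0, 100, [10, 50, 90])
print(matrix)
```

Each row of the matrix corresponds to one molecule and can be passed to a heatmap routine for plotting.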
4 Expected Results
[Figure 3 panels: (a) genome browser view of chr6:122,690,000-122,745,000 (mm10) around Nanog and Slc2a3, with read coverage, CpG methylation, GpC methylation, and ENCODE cCRE tracks; (b, c) metaplots of 1 - methylation versus position relative to the TSS (-1,000 to +1,000 bp) and position relative to the CTCF motif (-500 to +500 bp)]
Fig. 3 Aggregate accessibility analysis of NOMe-seq/dSMF datasets. (a) Read coverage and aggregate
CpG/GpC methylation genome browser tracks show accessibility and/or endogenous methylation levels
around the genome. In this case, reduced representation probe-capture dSMF datasets (obtained from
ArrayExpress accessions E-MTAB-9033 and E-MTAB-9123) are shown, thus the uneven coverage. (b)
Metaplot showing average accessibility levels around TSSs in the mouse genome. (c) Metaplot showing
average footprinting levels at occupied CTCF motifs
5 Notes
[Figure 4 panels: (a) and (b), each with a 50 bp scale bar; the Spic gene is labeled]
Fig. 4 Examples of single-molecule accessibility measurements. Shown are dSMF single-molecule maps
(obtained from ArrayExpress accessions E-MTAB-9033 and E-MTAB-9123 [15]). (a) High levels of occupancy
by the CTCF transcription factor (middle). (b) CTCF (middle) and possible nucleosome (left) footprints
Acknowledgements
References
1. Johnson DS, Mortazavi A, Myers RM, Wold B (2007) Genome-wide mapping of in vivo protein-DNA interactions. Science 316(5830):1497–1502
2. Mikkelsen TS, Ku M, Jaffe DB et al. (2007) Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature 448(7153):553–560
3. Buenrostro JD, Giresi PG, Zaba LC et al. (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–1218
4. Crawford GE, Holt IE, Whittle J et al. (2006) Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res 16:123–131
5. Boyle AP, Davis S, Shulha HP et al. (2008) High-resolution mapping and characterization of open chromatin across the genome. Cell 132:311–322
6. Schones DE, Cui K, Cuddapah S et al. (2008) Dynamic regulation of nucleosome positioning in the human genome. Cell 132(5):887–898
7. Kelly TK, Liu Y, Lay FD et al. (2012) Genome-wide mapping of nucleosome positioning and DNA methylation within individual DNA molecules. Genome Res 22:2497–2506
8. Krebs AR, Imanci D, Hoerner L, Gaidatzis D et al. (2017) Genome-wide single-molecule footprinting reveals high RNA polymerase II turnover at paused promoters. Mol Cell 67:411–422.e4
9. Kuhn RM, Haussler D, Kent WJ (2013) The UCSC genome browser and associated tools. Brief Bioinform 14:144–161
10. Kent WJ, Zweig AS, Barber G et al. (2010) BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26:2204–2207
11. Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina
Abstract
Digestion with restriction enzymes is a classical approach for probing DNA accessibility in chromatin. It allows monitoring of both the cut and the uncut fraction and thereby the determination of accessibility or occupancy (= 1 - accessibility) in absolute terms as the percentage of cut or uncut molecules, respectively, out of all molecules. The protocol presented here takes this classical approach to the genome-wide level. After exhaustive restriction enzyme digestion of chromatin, DNA is purified, sheared, and converted into libraries for high-throughput sequencing. Bioinformatic analysis counts uncut DNA fragments as well as DNA ends generated by restriction enzyme digestion and derives from these the fraction of accessible DNA. This straightforward principle is technically challenging because preparation and sequencing of the libraries lead to biased scoring of DNA fragments. Our protocol includes two orthogonal approaches to correct for this bias, the “corrected cut–uncut” and the “cut–all cut” method, so that accurate measurements of absolute accessibility or occupancy at restriction sites throughout a genome are possible. The protocol is presented for the example of S. cerevisiae chromatin but may be adapted for any other species.
Key words Chromatin, DNA accessibility, Absolute occupancy, Restriction enzyme, High-throughput sequencing
1 Introduction
Authors Elisa Oberbeckmann and Michael Roland Wolff have equally contributed to this chapter.
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_9,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
122 Elisa Oberbeckmann et al.
this position (“peak height” or, more exactly, “area under the
peak”). Occupancy is related to accessibility as the sum of both
amounts to 100%. Many techniques that map DNA binding of a
factor are good at determining the position but are limited regard-
ing occupancy measurements. This is because they are often yield
methods, i.e., they score either the bound (e.g., MNase-seq [1]) or
unbound (e.g., ATAC-seq [2, 3]) subpopulation but not both. This
allows, at best, comparison of occupancies under different conditions
relative to each other (relative occupancy) but not measurement of
occupancy in absolute terms. Absolute occupancy is defined as the
percentage of DNA molecules bound by a factor. The measurement
of absolute occupancy requires monitoring (1) simultaneously the
bound and unbound state, (2) under saturating conditions, and
(3) at sufficiently frozen binding-unbinding-dynamics to avoid a
shift of the bound:unbound ratio during mapping. A prominent
counterexample where mapping of absolute occupancy is not pos-
sible is the genome-wide mapping of nucleosomes, i.e., binding of
histone octamers along DNA, by digestion of chromatin with
micrococcal nuclease (MNase). This technique relies on a carefully
titrated and limited MNase digestion degree that removes most
non-nucleosomal DNA while not yet cleaving the DNA wrapped
around the histone octamers. This protected DNA is detected by
high-throughput sequencing (MNase-seq). While MNase-seq
readily determines nucleosome positions, it does not track the
unbound state and does not work at saturation and therefore
cannot measure absolute nucleosome occupancy.
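The definition of absolute occupancy as the bound fraction of all molecules can be written as a short calculation once both states are counted. The sketch below is an illustration of that definition only, not the ORE-seq analysis code; the function name is hypothetical, and `uncut_weight` is a hypothetical placeholder for a bias correction, not a parameter defined in this chapter.

```python
def absolute_occupancy(cut, uncut, uncut_weight=1.0):
    """Occupancy (fraction) at one site from cut and uncut fragment counts.

    accessibility = cut / (cut + w * uncut); occupancy = 1 - accessibility.
    uncut_weight (w) is a hypothetical stand-in for a bias correction.
    """
    weighted_uncut = uncut_weight * uncut
    total = cut + weighted_uncut
    if total == 0:
        raise ValueError("no fragments observed at this site")
    return weighted_uncut / total

occupancy = absolute_occupancy(cut=30, uncut=70)
accessibility = 1.0 - occupancy
print(f"occupancy = {occupancy:.2f}, accessibility = {accessibility:.2f}")
```

A method that counts only one of the two states cannot form this ratio, which is why it yields relative rather than absolute occupancy.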
Our ORE-seq protocol presented here offers a genome-wide
and complementary technique for the determination of absolute
occupancies. It employs type II restriction endonucleases (restric-
tion enzymes, REs) that cleave DNA as dimers by embracing [4]
defined short DNA sequences (RE sites) with high specificity. This
has the advantages that their cleavage is prevented by most DNA
binders, leads to double strand breaks (DSBs) at predictable sites
and with predictable DNA end properties and can be carried out to
saturation without losing any DNA. Saturated digestion can be
ensured by following a digestion time course or by comparing
digestion with two sufficiently different RE concentrations over
the same digestion time. The combination of both can be used to
ensure that chromatin dynamics are sufficiently frozen, i.e., if two
different RE concentrations yield similar accessibilities at two dif-
ferent time points each. After cleavage of all accessible RE sites on
the level of chromatin, the DNA is purified so that all occluding
DNA binders are removed, and DSBs flanking the RE sites are
quantitatively introduced in a secondary cleavage step. This gener-
ates DNA fragments with all combinations of either one or two
ends generated by the RE (RE end) and/or by the secondary
cleavage. From the viewpoint of one particular RE site, there can
be DNA fragments that span the RE site (uncut fragments) or that
ORE-Seq: Genome-Wide Absolute Occupancy Measurement by Restriction Enzyme. . . 123
Fig. 1 Flow chart for cut–uncut and cut–all cut method. For details see text. “f.” abbreviates “fragments”
2 Materials
2.1 Cells and Buffers for Preparation of S. cerevisiae Chromatin
1. S. cerevisiae strain of interest and media and growth conditions according to your requirements.
2. Cold (0–4 °C) distilled or deionized water (dH2O).
3. Preincubation solution: 0.7 M β-mercaptoethanol, 2.8 mM EDTA pH 8 (see Note 1). Add 2.5 mL 14.3 M β-mercaptoethanol and 278 μL 0.5 M EDTA pH 8 to a 50 mL tube. Add dH2O to yield a final volume of 50 mL. Wipe off any spills from the tube and wrap it with parafilm as this solution smells. Store at -20 °C.
4. 1 M sorbitol: Dissolve 182.2 g sorbitol in 1 L dH2O. Store at -20 °C.
5. Sorbitol + β-mercaptoethanol solution: 1 M sorbitol, 5 mM β-mercaptoethanol. Top up 17.5 μL 14.3 M β-mercaptoethanol with 1 M sorbitol to 50 mL. Prepare freshly.

[Figure 2 plot: calibration curves; y-axis: measured mean absolute occupancy / % (ticks at 25, 50, 75, 100); panel labels include "not corrected cut–uncut" and "cut–uncut"]
Fig. 2 Calibration curves for the indicated REs. Mixtures of given percentages of S. cerevisiae gDNA cut with the indicated REs were prepared as in Subheading 3.8 and analyzed both via the cut–uncut method (uncorrected or corrected by individual factor derived for each RE or by a combined correction factor derived from the combined RE calibration samples, see Subheading 4.1) and via the cut–all cut method. Circles denote the average and error bars denote the standard deviation of all RE sites in the S. cerevisiae genome included in the analysis. Data are deposited at GEO under the accession number GSE189142
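The stock dilutions in the recipes of this subsection can be checked with a one-line formula. The following is a small worked check using only the volumes and concentrations stated in the recipes; the function name is hypothetical.

```python
def final_molar(stock_molar, stock_volume_ml, final_volume_ml):
    """Final concentration (M) after diluting a stock to final_volume_ml."""
    return stock_molar * stock_volume_ml / final_volume_ml

# Preincubation solution: 2.5 mL 14.3 M beta-mercaptoethanol and
# 278 uL 0.5 M EDTA in a final volume of 50 mL
bme_preinc = final_molar(14.3, 2.5, 50.0)
edta_preinc = final_molar(0.5, 0.278, 50.0)

# Sorbitol + beta-mercaptoethanol solution: 17.5 uL 14.3 M
# beta-mercaptoethanol topped up to 50 mL
bme_sorbitol = final_molar(14.3, 0.0175, 50.0)

print(f"Preincubation: {bme_preinc:.3f} M beta-mercaptoethanol, "
      f"{edta_preinc * 1e3:.2f} mM EDTA")
print(f"Sorbitol solution: {bme_sorbitol * 1e3:.3f} mM beta-mercaptoethanol")
```

The results agree with the nominal 0.7 M, 2.8 mM, and 5 mM values after rounding.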
[Figure 3 tracks, region chrII:437,298-451,553: ORE-seq (combined), ORE-seq (BamHI), ORE-seq (HindIII), ORE-seq (AluI), chemical cleavage (GSM2561057)]
Fig. 3 Absolute occupancies as measured for the indicated REs by ORE-seq for an exemplary region of the
S. cerevisiae genome. Data for ORE-seq with BamHI, HindIII, AluI, and their combination are taken from
[6]. Nucleosome mapping data by chemical cleavage [18] is shown for reference
2.2 Cells and Buffers for Preparation of S. cerevisiae and S. pombe Genomic DNA (gDNA)
The S. pombe spike-in gDNA is only necessary if the cut–all cut method is applied.
1. S. cerevisiae and S. pombe wild-type strains (in our case BY4741 and h-972).
2. S. cerevisiae YPD medium: 20 g/L Bacto Peptone (Becton, Dickinson and Company), 10 g/L yeast extract (Biolife), 20 g/L D-glucose, autoclave, and store at RT.
3. S. pombe YES medium: 5 g/L yeast extract (Difco), 30 g/L D-glucose, 0.7 g/L amino acid mix (0.1 g/L each of adenine, leucine, histidine, uracil, lysine, arginine, glutamate), use Millipore treated or equivalent water quality, filter sterilize, and store at RT.
2.3 Buffers and Enzymes for Digestion of Chromatin and S. cerevisiae/S. pombe gDNA with Restriction Enzymes and DNA Purification
1. Suitable restriction enzymes (REs; e.g., NEB) and corresponding RE-buffers, e.g., BamHI-HF and 10× CutSmart-Buffer. Store at -20 °C. Dilute 10× CutSmart-Buffer with double distilled water (ddH2O) to 1× CutSmart-Buffer (50 mM potassium acetate, 20 mM Tris–acetate, 10 mM magnesium acetate, 100 μg/mL BSA, pH 7.9). Store at -20 °C.
2. 10× STOP-Buffer: 4% SDS, 100 mM EDTA, 50 mM Tris–HCl
pH 7.5. Add 2.5 mL 1 M Tris–HCl pH 7.5, 10 mL 0.5 M
EDTA pH 8 and 27.5 mL ddH2O to a 50 mL tube. Mix and
add 10 mL 20% SDS. Store at RT.
3. 20 mg/mL proteinase K (Genaxxon) solution in ddH2O. Store
at -20 °C. Aliquots may be refrozen.
4. 10 mg/mL RNase A (Roche) in ddH2O. Remove DNases by
incubation at 95 °C for 15 min. Store at -20 °C. Aliquots may
be refrozen.
5. 5 M NaClO4 in ddH2O. Store at RT.
6. 100% and 70% ethanol. Store at RT.
7. Isopropanol. Store at RT.
8. Phenol for DNA extraction, equilibrated at pH ~8 (Sigma).
Store 50 mL aliquots at -20 °C.
9. Chloroform/isoamylalcohol (24:1): Under the fume hood,
add 20 mL isoamylalcohol to 480 mL chloroform. Store at
RT under the hood.
10. TE-buffer: 5 mM Tris–HCl pH 8, 1 mM EDTA. Store at RT.
11. 1 M KOAc in ddH2O. Store at RT.
12. 0.2 M EDTA pH 8. Store at RT.
3 Methods
3.3 Optional: Restriction Enzyme Digest of S. pombe gDNA
Digestion of S. pombe gDNA is necessary only for the cut–all cut method to obtain the S. pombe gDNA spike-in.
1. Digest 20 μg gDNA with 100 Units of your chosen RE in 200 μL of the respective 1× RE buffer, e.g., 1× CutSmart Buffer (NEB), for 1.5 h at the temperature appropriate for the RE used, e.g., 37 °C. Do NOT use the same RE as for chromatin digestion.
2. Stop the digest by addition of 50 μL 10× STOP-Buffer and proteinase K to a final concentration of 0.5 μg/μL.
3. Incubate for 45 min at 37 °C. Store gDNA at 4 °C (see Note 12).
3.4 Chromatin Digest with Restriction Enzymes and DNA Purification
1. Per RE, thaw chromatin pellet corresponding to 0.3 g wet cell pellet and keep on ice. Chromatin pellet corresponding to 0.1 g wet cell pellet is used per individual RE-digest or mock sample, e.g., per zero/low/high RE concentration (see Note 13).
2. Resuspend chromatin pellet corresponding to 0.3 g wet cell pellet in 2 mL ice-cold 1× RE-Buffer (e.g., 1× CutSmart or specific 1× RE-buffer) by vortexing. Make sure that the sample does not get too warm and that no clumps remain.
3. Centrifuge for 5–8 min at ~750 × g (2000 rpm Eppendorf 5810R) and 4 °C.
4. Decant the supernatant, resuspend the pellet in 0.6 mL 1× RE-Buffer by vortexing, and split into three 200 μL aliquots in 1.5 mL tubes.
5. Add RE to the desired concentration, e.g., 0, 5, and 20 μL 20 U/μL RE (NEB) (see Note 14). Add RE storage buffer so that each sample receives the same total volume of RE plus RE storage buffer.
6. Incubate for 0.5 h (see Note 15) at the temperature appropriate for the RE used, e.g., 37 °C for BamHI.
7. Stop the reaction with 1/10 volume of 10× STOP-Buffer, vortex, and add 1/20 volume proteinase K (see Note 16).
8. Incubate for 0.5–1 h or up to overnight at 37 °C.
9. Optional: For cut–all cut method, add 12 μL (amounts to ca. 1 μg, i.e., ca. 10% of RE digested chromatin sample’s
3.5 Optional: Second RE Digest for Cut–All Cut Method
1. Take two 40 μL aliquots from the purified DNA sample after step 28 in Subheading 3.4, add 5 μL 10× RE-Buffer (e.g., 10× CutSmart Buffer (NEB)) to each aliquot and vortex.
2. Label one aliquot as “100% digested” and add the same RE as used for chromatin digestion of this sample, e.g., 4 μL 20 U/μL RE. Mix gently.
3. Label the other aliquot as “X% digested” and add the same volume of RE storage buffer as the volume of RE added in step 2.
3.7 Sequencing Library Preparation
1. Use 100–200 ng (see Note 19) DNA from step 12 in Subheading 3.6 for library preparation with the NEBNext Ultra II DNA Library Prep Kit. Adjust volume to 50 μL with 1× TE-buffer.
2. Add 7 μL NEBNext Ultra II End Prep Reaction Buffer and 3 μL NEBNext Ultra II End Prep Enzyme Mix and mix thoroughly by pipetting up and down.
3. Incubate in a thermocycler with the lid heated to at least 75 °C: 30 min at 20 °C, 30 min at 65 °C, then hold at 4 °C.
4. Add 2.5 μL NEBNext Adaptor for Illumina, 30 μL NEBNext Ultra II Ligation Master Mix (mixed by pipetting prior to addition, very viscous, ensure proper mixing), and 1 μL
Table 1
Mixing scheme for calibration curve samples
Percent cut                       0%   10%   30%   50%   70%   90%   100%   Total amount
Uncut gDNA (= mock digest) (μg)   1    0.9   0.7   0.5   0.3   0.1   0      3.5
RE-digested gDNA (μg)             0    0.1   0.3   0.5   0.7   0.9   1      3.5
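The amounts in Table 1 follow from splitting a fixed total of 1 μg per sample according to the desired percentage of cut DNA. The sketch below reproduces the scheme; the function name is hypothetical, and the 1 μg total per sample is read off the table.

```python
def calibration_mix(percent_cut, total_ug=1.0):
    """Return (uncut_ug, digested_ug) for a calibration sample with the given percent cut."""
    if not 0 <= percent_cut <= 100:
        raise ValueError("percent_cut must be between 0 and 100")
    digested = total_ug * percent_cut / 100.0
    return total_ug - digested, digested

percentages = (0, 10, 30, 50, 70, 90, 100)
scheme = {pct: calibration_mix(pct) for pct in percentages}

for pct, (uncut, digested) in scheme.items():
    print(f"{pct:>3}% cut: {uncut:.1f} ug uncut + {digested:.1f} ug RE-digested")

# Totals needed across all seven samples
total_uncut = sum(uncut for uncut, _ in scheme.values())
total_digested = sum(digested for _, digested in scheme.values())
print(f"Total: {total_uncut:.1f} ug uncut, {total_digested:.1f} ug RE-digested")
```

Changing `total_ug` scales the whole scheme if a different amount of DNA per sample is preferred.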
3.9 Bioinformatics Analysis
Overviews of the bioinformatics steps for the cut–all cut and the cut–uncut methods are shown in Figs. 4 and 5, respectively.
1. Map fragments with bowtie2 using the combined S. cerevisiae and S. pombe reference genome (see ‘reference_genome/ScerAndSpomWithMT.fsa’, with chromosomes named as follows: chr01–chr16 for the 16 S. cerevisiae chromosomes and chrI, chrII, chrIII for the three S. pombe chromosomes). Use the alignment parameters ‘-X 500 --no-discordant --no-mixed --no-unal’.
2. Remove multiply mapped reads using ‘samtools view -hf 0x2’.
3. Index the BAM file using ‘samtools index’.
4. Download this repository (https://github.com/gerland-group/ORE-seq_analysis) (see Subheading 4.1).
5. Install R and the packages detailed in ‘restriction_enzyme/RE_Rprofile.R’.
Fig. 4 Flow chart for bioinformatic analysis steps for cut–all cut method
Fig. 5 Flow chart for bioinformatic analysis steps for cut–uncut method
3.9.1 Sample Naming Rules
BAM files within ‘data/bam’ need to follow these naming conventions:
1. The script needs BAM files for both samples (X% cut and 100% cut) with identical file names except for the ending: samples with one RE digest end with ‘_X.bam’, while samples with a second RE digest end with ‘_1.bam’.
2. If there was no second digest and only the cut–uncut analysis is wanted, the ‘_1.bam’ file can be a copy or hard link of the ‘_X.bam’ file; the cut–all cut results should then be ignored.
3. File names must contain the name of the RE present in the sample after an “_” sign, e.g., ‘<Strain>_BamHI-HF_<RE units>_X.bam’, where the information in ‘<Strain>’ and ‘<RE units>’ is not used by the script and ‘<RE units>’ can be omitted.
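The naming rules above can be checked before running the analysis. The following is a minimal sketch, not part of the published script: the function names and the `KNOWN_RES` list are hypothetical (the real script takes RE names from ‘RE_info.txt’), and the matching rule (RE name as a whole underscore-separated token that is not the first token) is an assumption about how rule 3 is applied.

```python
import re
from pathlib import Path

# Hypothetical list of RE names; the real script reads them from 'RE_info.txt'.
KNOWN_RES = ("AluI", "BamHI-HF", "BamHI", "HindIII")


def parse_bam_name(filename, known_res=KNOWN_RES):
    """Return (prefix, re_name, kind) for a valid name, else None.

    kind is "X" (one RE digest) or "1" (with second RE digest).
    """
    m = re.fullmatch(r"(?P<prefix>.+)_(?P<kind>X|1)\.bam", Path(filename).name)
    if m is None:
        return None
    prefix = m.group("prefix")
    # Rule 3: the RE name must follow an "_" sign, so skip the first token.
    for token in prefix.split("_")[1:]:
        if token in known_res:
            return prefix, token, m.group("kind")
    return None


def pair_samples(filenames):
    """Group valid names by prefix and keep only complete _X/_1 pairs (rule 1)."""
    groups = {}
    for name in filenames:
        parsed = parse_bam_name(name)
        if parsed is not None:
            prefix, _, kind = parsed
            groups.setdefault(prefix, {})[kind] = name
    return {p: g for p, g in groups.items() if set(g) == {"X", "1"}}
```

Files that lack a partner are dropped by `pair_samples`, which mirrors rule 2: for a cut–uncut-only analysis a copy or hard link of the ‘_X.bam’ file still has to exist under the ‘_1.bam’ name.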
3.9.2 How to Use Other REs
Add the required information to ‘RE_info.txt’; see ‘RE_info_README.txt’.
3.9.3 How to Use Other Genomes
Genomes other than S. cerevisiae (and S. pombe for the spike-in) are not supported by default, because the code contains several references to the chromosome names. If these are adapted properly, the script should run for other genomes as well.
3.9.4 How to Fit Uncut Correction Factors
Run the section ‘3.1.1 Calc and plot deviation from calibration samples’ of the script, which is skipped by default.
4 Notes
REs (thoroughly tested with AluI, BamHI, and HindIII) and the
S. cerevisiae genome and with an S. pombe spike-in for the cut–all
cut method. For custom applications of other REs and especially
other genomes, the script has to be modified or newly written by a
bioinformatics expert. As we cannot foresee future applications by other users, we describe the underlying rationale and mathematical background in detail below.
In the following, steps marked with * are only needed for the
cut–all cut method, which for example needs a normalization
between the cut and all cut sample using an S. pombe gDNA
spike-in (Fig. 4). Likewise, steps marked with ° are only needed
for the cut–uncut method, like the counts of uncut fragments at a
given cut site (Fig. 5). Note that this description and our script are
an all-in-one solution that calculates the outcome according to
both methods at the same time.
First follow the steps for mapping/indexing and download the
script files as described in Subheading 3.9 and have a look at the
readme file of the repository.
The script then performs the following actions:
4.1.1 Map/Filter Reads
1. Extract paired-end read information: chromosome, start, end, and strand information, with end positions shifted by +1 bp.
2. Remove fragments that are longer than 500 bp.
3. Remove rDNA fragments by excluding the following loci:
S. cerevisiae chr. 12: 451500–495000
S. pombe chr. 3: 0–30000
S. pombe chr. 3: 2430000–2452883
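Steps 2 and 3 amount to a simple per-fragment filter. The sketch below is illustrative only: the function name is hypothetical, the chromosome names follow the reference naming given in Subheading 3.9 (chr12, chrIII), and treating any overlap with a listed locus as grounds for exclusion is an assumption, since the exact exclusion rule of the script is not stated here.

```python
# rDNA loci as listed above: (chromosome, start, end).
RDNA_LOCI = [
    ("chr12", 451500, 495000),    # S. cerevisiae chr. 12
    ("chrIII", 0, 30000),         # S. pombe chr. 3
    ("chrIII", 2430000, 2452883), # S. pombe chr. 3
]
MAX_FRAGMENT_LENGTH = 500  # bp


def keep_fragment(chrom, start, end, loci=RDNA_LOCI, max_len=MAX_FRAGMENT_LENGTH):
    """Return True if the fragment passes the length and rDNA filters.

    `end` is assumed to be already shifted by +1 bp, so the fragment
    length is simply end - start.
    """
    if end - start > max_len:
        return False
    for locus_chrom, lo, hi in loci:
        # Assumption: any overlap with an excluded locus removes the fragment.
        if chrom == locus_chrom and start < hi and end > lo:
            return False
    return True
```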
4.1.2 Count Cut and Uncut° Fragments
4. Count the starting/ending fragments on the plus and minus strands, $c_\tau(x)$, for each genomic position $x$, with $\tau = 1, 2, 3, 4$ denoting starts on plus, starts on minus, ends on plus, and ends on minus strands, respectively. For starting reads, we count the position of the first base pair, and for ending reads, we count the position after the last base pair (i.e., end positions are shifted by +1 bp). We use the notation $c^1_\tau(x)$ and $c^2_\tau(x)$ for the sample without and with second RE digest, respectively. For later modeling, we assume that one single given fragment with an RE-cut or sheared fragment start or end at $x$ will on average yield $p^x_\tau$ counts after PCR and Illumina sequencing.
5. For the cut–uncut method, we need the uncut fragments for fixed genomic positions $x$, i.e., fragments that start before $x - d$ and end after $x + d$ (end positions are shifted by +1 bp) in the sample without second RE digest. The extension by $d$ is needed because not all REs cut both strands at the same position, as explained later. We denote this number of uncut fragments by $u^1_\tau(x, d)$, also using the index $\tau$ as in step 4 to
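The counting in steps 4 and 5 can be sketched as follows. This is an illustration under stated assumptions rather than the script's implementation: function names are hypothetical, fragments are given for a single chromosome as (start, end, strand) with the end already shifted by +1 bp, and "start" is taken to be the leftmost genomic coordinate regardless of strand.

```python
from collections import Counter


def count_ends(fragments):
    """Count fragment starts and ends per position.

    Returns {tau: Counter(position -> count)} with tau = 1 (starts on plus),
    2 (starts on minus), 3 (ends on plus), 4 (ends on minus).
    """
    counts = {tau: Counter() for tau in (1, 2, 3, 4)}
    for start, end, strand in fragments:
        if strand == "+":
            counts[1][start] += 1
            counts[3][end] += 1
        else:
            counts[2][start] += 1
            counts[4][end] += 1
    return counts


def count_uncut(fragments, x, d):
    """Count fragments spanning x: start before x - d and end after x + d."""
    spanning = {"+": 0, "-": 0}
    for start, end, strand in fragments:
        if start < x - d and end > x + d:
            spanning[strand] += 1
    return spanning
```

Looping over all fragments for every position is fine for illustration; a genome-wide run would need an interval-based data structure instead.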
4.1.3 Determine Cut Site Positions with RE Motif
6. Determine the cut site positions, i.e., the positions of the RE recognition motif, on both* genomes, including generation of the actual DNA ends by end polishing, in the following way. We define $x_i$ as the position of the first base pair of the recognition motif of cut site $i$ plus half the length of the recognition motif, which usually has an even length.
HindIII as an example, with ‘|’ denoting the cut in both strands. By the definition above, position $x_i$ is the fourth base of the motif (the C on the + strand):
+ strand: 5′-...A|A G C T T...-3′
- strand: 3′-...T T C G A|A...-5′
In case of a 5′ overhang, the 3′ end is elongated to match the 5′ end during DNA end polishing by a DNA polymerase. Conversely, a 3′ overhang is digested to match the recessed 5′ end during DNA end polishing by a 3′–5′ exonuclease. For such end-polished HindIII ends, we get the following double-stranded fragment ends (position $x_i$ is again the C on the + strand):
+ strand: ending: 5′-...A A G C T-3′ and starting: 5′-A G C T T...-3′
- strand: ending: 3′-...T T C G A-5′ and starting: 3′-T C G A A...-5′
Let $\Delta s$ be the shift length from the pattern center to the cut position of the + strand in upstream direction, which corresponds to half the length of the 5′ overhang of the cleavage product in bp. For HindIII, $\Delta s = +2$; for blunt-end-cutting REs, $\Delta s = 0$; and for an RE with 3′ overhangs, $\Delta s$ is negative.
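The relationship between motif position, $x_i$, and $\Delta s$ can be made concrete with a short sketch. It is not taken from the script: the function name is hypothetical, coordinates are 0-based, only the + strand is scanned (sufficient for palindromic motifs such as those of AluI, BamHI, and HindIII), and the expected polished start $x_i - \Delta s$ and shifted end $x_i + \Delta s$ are derived here from the HindIII example above.

```python
def cut_sites(sequence, motif, delta_s):
    """Return (x_i, expected_start, expected_end) for each motif occurrence.

    x_i            = first motif base + half the motif length
    expected_start = x_i - delta_s  (first base of the downstream fragment)
    expected_end   = x_i + delta_s  (position after the last base of the
                                     upstream fragment, i.e., shifted by +1)
    """
    half = len(motif) // 2
    sites = []
    pos = sequence.find(motif)
    while pos != -1:
        x_i = pos + half
        sites.append((x_i, x_i - delta_s, x_i + delta_s))
        pos = sequence.find(motif, pos + 1)
    return sites


# HindIII (A|AGCTT, 4-nt 5' overhang, delta_s = +2) in a toy sequence
toy = "GGAAGCTTCC"
(x_i, start, end), = cut_sites(toy, "AAGCTT", 2)
upstream_fragment = toy[:end]      # ends with ...AAGCT after fill-in
downstream_fragment = toy[start:]  # starts with AGCTT...
```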
4.1.4 Remove RE Sites with Close Neighbor RE Sites
7. Especially for X% samples derived from RE digestion of chromatin, uncut fragment counts are increased at cut sites with any neighboring cut site within approx. 150 bp. Thus, we ignore RE sites completely if they have a neighbor within 200 bp in either direction. This cutoff may be adjusted depending on the given samples. We denote the sets of remaining sites by $I$ and $J$ for the S. cerevisiae and S. pombe* genomes, respectively.
8. As shown in Fig. 6, we often saw dependencies between the fragment counts $C^i_\tau$ and $A^i_\tau$ (defined below) and the distance to the next neighboring RE site, ranging up to 300–500 bp, e.g., for starting reads and the downstream distance to the next neighbor RE site (Fig. 6b). Thus, we ignore start or end cut counts at and near an RE site (see RE site window approach below) if the next RE site downstream or upstream, respectively, is closer than 300 bp. Note that this value can be further tuned to the experimental conditions, although for our calibration samples shown in Fig. 2, there was hardly any difference between setting this limit to 300 bp or to a more conservative 500 bp. In general, the higher the degree of shearing, i.e., the shorter the average fragment length, the lower this limit can be. See also the legend to Fig. 6.
9. In our protocol here, we modified the original protocol [6] such that different REs are used for digesting S. cerevisiae chromatin and the S. pombe gDNA spike-in, for example BamHI and EcoRI, respectively. In this case, the EcoRI sites in the S. cerevisiae genome are not considered when determining close RE sites. However, since the second RE digest is applied after including the S. pombe gDNA spike-in, the BamHI sites
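The two neighbor rules of steps 7 and 8 can be sketched for the sorted RE sites of one chromosome. This is a hedged illustration: the function name is hypothetical, "within 200 bp" and "closer than 300 bp" are both read as strict inequalities, and the extra handling of a second RE on the spike-in genome (step 9) is not covered.

```python
def classify_sites(sites, drop_within=200, directional_within=300):
    """Apply the close-neighbor rules to sorted RE site positions.

    Returns {position: (use_starts, use_ends)}. Sites with any neighbor
    closer than `drop_within` are dropped entirely. For the remaining
    sites, start counts are ignored if the downstream neighbor is closer
    than `directional_within`, and end counts are ignored if the upstream
    neighbor is closer than `directional_within`.
    """
    kept = {}
    for i, x in enumerate(sites):
        up = x - sites[i - 1] if i > 0 else None
        down = sites[i + 1] - x if i + 1 < len(sites) else None
        if any(dist is not None and dist < drop_within for dist in (up, down)):
            continue
        use_starts = down is None or down >= directional_within
        use_ends = up is None or up >= directional_within
        kept[x] = (use_starts, use_ends)
    return kept
```

Both cutoffs are keyword arguments because the text notes that they may be tuned to the samples and to the degree of shearing.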
Fig. 6 Scoring bias due to close next neighbor RE sites. Exemplary selection from the “cut_counts_vs_nn_-
distance” plots for AluI 50% cut calibration sample (as in Fig. 2) are shown. Our script automatically generates
such plots for the “X%” (a, b) and “100%” (c, d) samples (X or 1 in y-axis label) that show the number of reads
4.1.5 Collect Cut and Uncut° Counts Within Window Near Cut Sites to Correct for Resection
10. Due to endogenous exonucleases that may be present in the chromatin preparations and trim DNA ends after RE cleavage, some fragment ends no longer match the RE cut site positions, even though they were generated by the RE. Thus we
Fig. 6 (continued) (position of the marker along the y-axis) that map to the indicated combinations (y-axis
label) of the plus strand of the chromosome and start (a, b) or end (c, d) (y-axis label) at a given RE site (0 on x-
axis) that have a next neighbor (nn) site for the same RE at a given upstream (a, c) or downstream (b, d)
distance (x-axis label) in bp (position of the marker along the x-axis). Analogous plots are also generated for
minus end reads. We found that the strand identity does not matter, but rather the orientation (upstream
versus downstream) relative to whether a read starts or ends at the RE site. (e, f) Plots for sequencing reads
analogous to those in panels a and b, but for uncut fragments where the given RE site was not cut. The green
lines correspond to the average at a given x-axis position. Our interpretation of the observed curve shapes is
as follows. (a) If the sequencing reads stem from fragments starting with the RE site, then next neighbor RE
sites upstream are irrelevant for scoring efficiency measured via the obtained read number as they are not
contiguous with the sequenced fragment anymore due to the RE cut and therefore do not affect scoring by
adapter ligation and sequencing. (d) The same is true for next neighbor RE sites downstream of an RE site
where a read ends. In contrast, next neighbor RE sites downstream (b) or upstream (c) of an RE site where a
read starts or ends, respectively, may be cut (are indeed cut to 100% in our calibration samples shown here)
and therefore generate DNA fragments with two RE cut ends and a consistent length that may be shorter than
the average length generated by the combination of one RE and one shearing end. Such fragments with two
RE ends are scored more efficiently (= above averages shown in panels a and d) than fragments with one RE
and one sheared end. The paucity in fragments <100 bp reflects the DNA fragment length cutoff of the
AMPure bead purification during this particular sequencing library preparation. Note that fragments of
>500 bp length are excluded from the analysis as Illumina sequencing becomes biased against longer
fragments, which explains that the curves level off to the average level (similar to green line in panels a and d)
beyond 500 bp next neighbor distance. If shearing is more extensive, i.e., if the average fragment length is
much shorter than 500 bp, then the curve will approach the average level at a next neighbor distance close to
the average fragment length as next neighbor sites beyond the average fragment length will not be contiguous
anymore
Note that especially for the “X%” samples generated from chromatin digestion, the x-axis need not reflect the
actual fragment length that gave rise to a certain sequencing read, but denotes a property of the genome
sequence (distance to the next neighbor RE site). Nonetheless, for calibration samples shown here, the x-axis
does mostly reflect actual fragment length as virtually all RE ends stem from the 100% cut S. cerevisiae gDNA
that was mixed with uncut S. cerevisiae gDNA. Fortuitous ends at RE sites due to shearing are negligible (e.g.,
less than 10 counts on y-axis here).
Finally, the bias due to next neighbor RE sites can also be apparent for uncut fragments where there is a potentially cut RE site within approx. 150 bp of the viewpoint RE site (0 on x-axis) in either the upstream (e) or downstream (f) direction. While this is not very pronounced for the samples shown here, as the uncut fragments stem from mock digests in these calibration samples, it may be considerable in chromatin samples and also calls for excluding such next neighbor sites.
The bioinformatics procedure that corrects for the next neighbor site bias by excluding these RE sites is detailed in Subheading 4.1.
4.1.6 Occupancy Estimation by Cut–All Cut Method with Background Correction and Normalization
We seek to estimate the real accessibility $\alpha^i$ at cut site $i$ using the cut counts of the cut and all cut samples, taking into account a bias toward RE versus sheared fragment ends and effective sequencing probabilities. We begin by viewing $C^i_\tau$ and $A^i_\tau$ as random variables with the expectation values
$$E[C^i_\tau] = N_C \, \mu^i \, p^i_\tau \quad \text{with} \quad \mu^i = \alpha^i + (1 - \alpha^i)\, s$$
$$E[A^i_\tau] = N_A \, p^i_\tau$$
two sets $C^i_\tau$ and $A^i_\tau$ within themselves, however, are statistically dependent.
We set $\hat{\alpha}^i_\tau = \mathrm{NA}$ if $A^i_\tau = 0$ or $A^i_\tau = \mathrm{NA}$ (due to a close neighbor in direction of $\tau$).
To obtain $N_A/N_C$, we use the S. pombe gDNA spike-in cut sites, which are completely cut in both samples:
$$\frac{N_A}{N_C} = \frac{\langle A^i_\tau \rangle_{i \in J, \tau}}{\langle C^i_\tau \rangle_{i \in J, \tau}}$$
to obtain one accessibility estimate for each cut site $i$. If $\hat{\alpha}^i_\tau = \mathrm{NA}$, it is ignored during the averaging step.
To obtain the global accessibility, we average over all sites:
$$\hat{\alpha} = \langle \hat{\alpha}^i \rangle_{i \in I}$$
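To make the role of the spike-in normalization tangible, here is a deliberately simplified sketch. It is not the estimator used by the script, whose window-based and per-direction corrections are not reproduced in this text. It only inverts the two expectation values above by moment matching: $E[C]/E[A] = (N_C/N_A)\,\mu$, hence $\mu \approx (C/A)(N_A/N_C)$ and $\alpha \approx (\mu - s)/(1 - s)$. All function names are hypothetical, and the shearing background $s$ is assumed to be known.

```python
def spike_in_ratio(a_spike, c_spike):
    """Estimate N_A / N_C as mean(A) / mean(C) over the spike-in cut sites."""
    return (sum(a_spike) / len(a_spike)) / (sum(c_spike) / len(c_spike))


def accessibility(c, a, na_over_nc, s):
    """Moment-matching estimate of alpha for one site and direction.

    Returns None (NA) if the all-cut count is missing or zero.
    """
    if a is None or a == 0:
        return None
    mu = (c / a) * na_over_nc
    return (mu - s) / (1 - s)


def global_accessibility(estimates):
    """Average the per-site estimates, ignoring NA values."""
    values = [v for v in estimates if v is not None]
    return sum(values) / len(values)
```

Because no clipping is applied, noisy counts can yield estimates slightly below 0 or above 1 in this sketch.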
4.1.7 Occupancy Estimation by Cut–Uncut Method with Background Correction
In the following we only use data from the sample without second RE digest to estimate the accessibility and use the ratio of the cut counts $C^i_\tau$ and the counts of uncut fragments $U^i_\tau$. We choose to only consider different PCR biases and sequencing biases between cut and uncut fragments, giving all cut fragments the sequencing probability $p$ and all uncut fragments the sequencing probability $q$. Summing up cut counts and uncut counts, we set
$$C^i := C^i_1 + C^i_2 + C^i_3 + C^i_4 \quad \text{and} \quad U^i := 2\,(U^i_1 + U^i_2)$$
for sites without any neighbor within 300 bp and
$$C^i := C^i_1 + C^i_2 \ \text{ or } \ C^i := C^i_3 + C^i_4 \quad \text{and} \quad U^i := U^i_1 + U^i_2$$
for sites with one upstream/downstream neighbor within 300 bp, respectively. Then define the ratio of cut and uncut fragments,
$$\hat{\kappa}^i := \frac{C^i}{U^i}$$
If the denominator is 0, we set $\hat{\kappa}^i = \infty$, which will lead to an accessibility of 1.
Similar to the previous subheading, we have $E[C^i] = 4 N_C \, p \, (\alpha^i + (1 - \alpha^i)\, s_1 (w + 1))$ with $s_1$ being the shearing probability per base pair, but now calculated only using the cut sample, i.e., the ratio of all cut counts away from sites and the sum of cut and uncut fragment counts away from cut sites. For $U^i$, we assume that the uncut fragment counts are given by fragments that have not been cut by the RE at $x^i_\tau$ and after that also not been cut by shearing at $x^i_\tau$.
$$\hat{\alpha}^i = \frac{C^i - \sigma (w + 1)\, U^i \gamma}{C^i - \sigma (w + 1)\, U^i \gamma + (1 + \sigma)\, U^i \gamma} = \frac{C^i_{\mathrm{eff}}}{C^i_{\mathrm{eff}} + U^i_{\mathrm{eff}}}$$
with
$$\sigma := \frac{s_1}{1 - s_1} = \frac{1}{\gamma} \, \frac{\langle c^1_\tau(z) \rangle_{z \in Z, \tau}}{\langle u^1_\tau(z) \rangle_{z \in Z, \tau}}$$
being the corrected ratio of all cut counts away from all cut sites and all uncut fragment counts away from all cut sites.
$$C^i_{\mathrm{eff}} = C^i - \sigma (w + 1)\, U^i \gamma \quad \text{and} \quad U^i_{\mathrm{eff}} = (1 + \sigma)\, U^i \gamma$$
are the effective counts of cut and uncut fragments, respectively, both corrected for cuts in the shearing step and different sequencing probabilities of cut and uncut fragments. $C^i_{\mathrm{eff}} + U^i_{\mathrm{eff}}$ gives an “effective coverage” of cut and uncut fragments at the site $i$, and we ignore sites with an effective coverage below 40. This limit may be adapted for different applications. Finally, the genome-wide average accessibility is given by
$$\hat{\alpha} = \langle \hat{\alpha}^i \rangle_{i \in I}$$
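The per-site estimate can be written out directly from the effective counts. The following is a minimal sketch with a hypothetical function name; $\sigma$, $\gamma$, and the window parameter $w$ (defined in Subheading 4.1.5) are assumed to be given, and the default coverage cutoff of 40 is the value quoted in the text.

```python
def cut_uncut_accessibility(c, u, sigma, gamma, w, min_coverage=40):
    """Accessibility estimate C_eff / (C_eff + U_eff) for one cut site.

    c, u  : summed cut and uncut counts C^i and U^i at the site
    sigma : s1 / (1 - s1), the shearing background ratio
    gamma : uncut correction factor
    w     : window parameter of the resection correction
    Returns None if the effective coverage is below `min_coverage`.
    """
    c_eff = c - sigma * (w + 1) * u * gamma
    u_eff = (1 + sigma) * u * gamma
    coverage = c_eff + u_eff
    if coverage < min_coverage:
        return None
    return c_eff / coverage
```

With `u = 0` the effective uncut count vanishes and the function returns 1 (given sufficient coverage), in line with the rule for a zero denominator in the ratio of cut and uncut counts.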
Fit of γ using prepared calibration samples for RE digests:
For each RE (AluI, BamHI, and HindIII) and each calibration sample $s$ with 0%, 10%, 30%, 50%, 70%, 90%, and 100% prepared fraction of uncut DNA molecules, i.e., prepared occupancy $\omega_s = 1 - \alpha_s$, we calculate the measured genome-wide average occupancy $\hat{\omega}_s(\gamma) = 1 - \hat{\alpha}_s(\gamma)$ for varying $\gamma$. We then choose $\gamma$ for each RE such that $\langle (\omega_s - \hat{\omega}_s(\gamma))^2 \rangle_s$ is minimized. Additionally, we did a combined fit over all calibration samples of the three REs to use for REs for which no specific calibration samples were measured.
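The fit described above is a one-dimensional least-squares problem, which a simple grid search can illustrate. This sketch is an assumption-laden stand-in for section ‘3.1.1’ of the script: function names and the input layout are hypothetical, the raw ratio $\langle c \rangle / \langle u \rangle$ away from cut sites is passed per sample so that $\sigma = \text{raw ratio} / \gamma$, and the coverage cutoff is omitted for brevity.

```python
def mean_occupancy(sites, gamma, raw_ratio, w):
    """Genome-wide average occupancy 1 - <alpha_hat> for a given gamma.

    sites     : list of (C, U) count pairs
    raw_ratio : <c> / <u> away from cut sites, so sigma = raw_ratio / gamma
    """
    sigma = raw_ratio / gamma
    alphas = []
    for c, u in sites:
        c_eff = c - sigma * (w + 1) * u * gamma
        u_eff = (1 + sigma) * u * gamma
        if c_eff + u_eff > 0:
            alphas.append(c_eff / (c_eff + u_eff))
    return 1 - sum(alphas) / len(alphas)


def fit_gamma(samples, gammas, w):
    """Pick the gamma minimizing the mean squared occupancy deviation.

    samples : list of (prepared_occupancy, sites, raw_ratio)
    gammas  : candidate gamma values to scan
    """
    def loss(gamma):
        return sum(
            (omega - mean_occupancy(sites, gamma, ratio, w)) ** 2
            for omega, sites, ratio in samples
        ) / len(samples)

    return min(gammas, key=loss)
```

A combined fit over several REs corresponds to passing the calibration samples of all REs in one `samples` list.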
The following table shows the best values for γ:
References
1. Wal M, Pugh BF (2012) Genome-wide mapping of nucleosome positions in yeast using high-resolution MNase ChIP-Seq. Methods Enzymol 513:233–250. https://doi.org/10.1016/b978-0-12-391938-0.00010-0
2. Buenrostro JD, Wu B, Chang HY, Greenleaf WJ (2015) ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr Protoc Mol Biol 109:21.29.21–21.29.29. https://doi.org/10.1002/0471142727.mb2129s109
3. Buenrostro JD, Wu B, Litzenburger UM, Ruff D, Gonzales ML, Snyder MP, Chang HY, Greenleaf WJ (2015) Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523(7561):486–490. https://doi.org/10.1038/nature14590
4. Kim YC, Grable JC, Love R, Greene PJ, Rosenberg JM (1990) Refinement of Eco RI endonuclease crystal structure: a revised protein chain tracing. Science (New York, NY) 249(4974):1307–1309. https://doi.org/10.1126/science.2399465
5. Gregory PD, Barbaric S, Horz W (1999) Restriction nucleases as probes for chromatin structure. Methods Mol Biol (Clifton, NJ) 119:417–425. https://doi.org/10.1385/1-59259-681-9:417
6. Oberbeckmann E, Wolff M, Krietenstein N, Heron M, Ellins JL, Schmid A, Krebs S, Blum H, Gerland U, Korber P (2019) Absolute nucleosome occupancy map for the Saccharomyces cerevisiae genome. Genome Res 29(12):1996–2009. https://doi.org/10.1101/gr.253419.119
7. Ohtsubo Y, Sakai K, Nagata Y, Tsuda M (2019) Properties and efficient scrap-and-build repairing of mechanically sheared 3’ DNA ends. Commun Biol 2:409. https://doi.org/10.1038/s42003-019-0660-7
8. Chereji RV, Eriksson PR, Ocampo J, Prajapati HK, Clark DJ (2019) Accessibility of promoter DNA is not the primary determinant of chromatin-mediated gene regulation. Genome Res 29(12):1985–1995. https://doi.org/10.1101/gr.249326.119
9. Biernacka A, Skrzypczak M, Zhu Y, Pasero P, Rowicka M, Ginalski K (2021) High-resolution, ultrasensitive and quantitative DNA double-strand break labeling in eukaryotic cells using i-BLESS. Nat Protoc 16(2):1034–1061. https://doi.org/10.1038/s41596-020-00448-3
10. Martinez-Campa C, Kent NA, Mellor J (1997) Rapid isolation of yeast plasmids as native chromatin. Nucleic Acids Res 25(9):1872–1873
11. Aris JP, Blobel G (1991) Isolation of yeast nuclei. Methods Enzymol 194:735–749. https://doi.org/10.1016/0076-6879(91)94056-i
12. Kizer KO, Xiao T, Strahl BD (2006) Accelerated nuclei preparation and methods for analysis of histone modifications in yeast. Methods (San Diego, Calif) 40(4):296–302. https://doi.org/10.1016/j.ymeth.2006.06.022
13. Reese JC, Zhang H, Zhang Z (2008) Isolation of highly purified yeast nuclei for nuclease mapping of chromatin structure. Methods Mol Biol (Clifton, NJ) 463:43–53. https://doi.org/10.1007/978-1-59745-406-3_3
14. Zhang Z, Reese JC (2006) Isolation of yeast nuclei and micrococcal nuclease mapping of nucleosome positioning. Methods Mol Biol (Clifton, NJ) 313:245–255. https://doi.org/10.1385/1-59259-958-3:245
15. Kiseleva E, Allen TD, Rutherford SA, Murray S, Morozova K, Gardiner F, Goldberg MW, Drummond SP (2007) A protocol for isolation and visualization of yeast nuclei by scanning electron microscopy (SEM). Nat Protoc 2(8):1943–1953. https://doi.org/10.1038/nprot.2007.251
16. Schmid A, Fascher KD, Horz W (1992) Nucleosome disruption at the yeast PHO5 promoter upon PHO5 induction occurs in the absence of DNA replication. Cell 71(5):853–864
17. Wolff MR (2020) Nucleosome occupancy and dynamics in yeast: genome-wide and promoter-level analyses and modeling. PhD, LMU München, München
18. Chereji RV, Ramachandran S, Bryson TD, Henikoff S (2018) Precise genome-wide mapping of single nucleosomes and linkers in vivo. Genome Biol 19(1):19. https://doi.org/10.1186/s13059-018-1398-0
Part III
Abstract
Simultaneous detection of chromatin accessibility and transcription from the same cells promises to greatly
facilitate the dissection of cell-type-specific gene regulatory programs in complex tissues. Paired-seq enables
joint analysis of open chromatin and nuclear transcriptome from up to a million cells in parallel. It achieves
ultra-high-throughput single-cell multiomics with the use of a combinatorial barcoding strategy involving
sequential ligation of multiplexed DNA barcodes to chromatin DNA fragments and reverse transcription
products, followed by high-throughput DNA sequencing of the resulting DNA libraries and deconvolution
of single-cell multiomic maps based on cell-specific barcodes.
Key words Paired-seq, Single-cell multiomics, Chromatin accessibility, Gene expression, Epigenome
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_10,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
155
156 Chenxu Zhu et al.
[Figure 1 graphic. Labels recoverable from panel b: cellular barcodes (BC #1 to #4), Read1 and Read2 primers, Tn5-Adaptor1, tagmentation, TdT tailing, pre-amplification, linear amplification, ligation, FokI cutting and recognition sites, NotI and SbfI sites, DNA and RNA libraries, fragment sizes 100–500 bp and >1.5 kb.]
Fig. 1 Overview of Paired-seq protocol. (a) Paired-seq protocol can be finished in 2 days, pause points are
indicated. (b) Schematics for library preparation strategy of Paired-seq. Both DNA fragments from Tn5
tagmentation and cDNA were pre-amplified with a TdT-based strategy and then split into two portions. For
DNA library, the 2nd adaptor was added by ligation; for RNA library, the 2nd adaptor was added by Tn5
tagmentation
Single-Cell Co-Assay of Open Chromatin and RNA 157
2 Materials
2.1 Reagents Preparation
1. Tn5 protein, purified according to ref. [26], and Paired-seq primers (Table 1; see Note 2).
2. RT primers (Table 2).
3. Tn5 barcodes (Table 2).
4. Barcode oligos (Tables 3 and 4).
5. Tris–HCl, pH 7.5 (Invitrogen, Cat# 15567027).
6. NaCl (Sigma, Cat# S7653).
7. Glycerol (Sigma, Cat# G5516).
8. DTT (Sigma, Cat# D9779).
9. 200 μL thin-wall PCR tubes (USA Scientific, Cat# 1402-3900).
10. 1.5 mL low-bind tubes (Eppendorf, Cat# 022431021).
11. 15 mL tubes (Corning Costar, Cat# 430790).
12. 96-well low-bind PCR plate (Eppendorf, Cat# 0030129512).
13. Sterile Reagent reservoir (Corning Costar, Cat# 07200127).
14. Thermocycler (Bio-Rad, T100).
Reagents | Stock concentration | Volume | Final concentration
Sucrose (Sigma, Cat# S7903) | 1 M | 0.375 mL | 250 mM
KCl (Sigma, Cat# P9333) | 2 M | 18.8 μL | 25 mM
MgCl2 (Sigma, Cat# 63069) | 1 M | 7.5 μL | 5 mM
Tris–HCl, pH 7.5 (Invitrogen, Cat# 15567027) | 1 M | 15 μL | 10 mM
DTT | 1 M | 1.5 μL | 1 mM
Protease Inhibitor (Sigma, Cat# 04693159001) | 50X | 30 μL | 1X
SUPERase IN (Invitrogen, Cat# AM2696) | 20 U/μL | 37.5 μL | 0.5 U/μL
RNase OUT (Invitrogen, Cat# 10777019) | 40 U/μL | 18.8 μL | 0.5 U/μL
H2O | NA | 996 μL | NA
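The volumes in this table follow from the usual dilution relation (stock volume = final concentration / stock concentration × total volume). The sketch below recomputes them as a consistency check; the names are hypothetical, and the total volume of 1.5 mL is an assumption inferred from the listed volumes, which sum to about 1500 μL.

```python
# (reagent, stock concentration, final concentration); units match within a row.
RECIPE = [
    ("Sucrose", 1000, 250),          # mM
    ("KCl", 2000, 25),               # mM
    ("MgCl2", 1000, 5),              # mM
    ("Tris-HCl pH 7.5", 1000, 10),   # mM
    ("DTT", 1000, 1),                # mM
    ("Protease Inhibitor", 50, 1),   # X
    ("SUPERase IN", 20, 0.5),        # U/uL
    ("RNase OUT", 40, 0.5),          # U/uL
]


def stock_volume_ul(stock, final, total_ul):
    """Volume of stock needed to reach `final` concentration in `total_ul`."""
    return final / stock * total_ul


def recipe_volumes(recipe, total_ul):
    """Return {reagent: volume in uL}, topping up with water to the total."""
    volumes = {
        name: stock_volume_ul(stock, final, total_ul)
        for name, stock, final in recipe
    }
    volumes["H2O"] = total_ul - sum(volumes.values())
    return volumes


volumes = recipe_volumes(RECIPE, total_ul=1500)
```

Scaling the buffer to another batch size only requires changing `total_ul`.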
Table 1
Primer sequences
Linker-R02 CGAATGCTCTGGCCTCTCAAGCACGTGGAT
Blocker-R02 ATCCACGTGCTTGAGAGGCCAGAGCATTCG
Linker-R03 GGTCTGAGTTCGCACCGAAACATCGGCCAC
Quencher-R03 GTGGCCGATGTTTCGGTGCGAACTCAGACC
Anchor-FokI-GH AAGCAGTGGTATCAACGCAGAGTGAAGGATGTGGGGGGGGG*H
P5-FokI ACACTCTTTCCCTACACGACGCTCTTCCGATCT
P5c-NNDC-FokI 5Phos/NNDCAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTG
P5H-FokI ACACTCTTTCCCTACACGACGCTCTTCCGATCTH
P5Hc-NNDC-FokI 5Phos/NNDCDAGATCGGAAGAGCGTCGTGTAGGGAAAGAGTG
PA-F CAGACGTGTGCTCTTCCGATCT
PA-R AAGCAGTGGTATCAACGCAGAGT
N5XX AATGATACGGCGACCACCGAGATCTACACXXXXXXXXTCGTCGGCAGCGTC
P7XX CAAGCAGAAGACGGCATACGAGATXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATC
P5 Universal AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATC*T
* denotes Phosphorothioate Bonds modification
N denotes random bases
X denotes Illumina Index sequences
Table 2
Barcode plate #01
B7 RNA_#07_RE /5Phos/AGGCCAGAGCATTCGTTTACCCTGCAGGTTTTTTTTTTTTTTTTVN
B8 RNA_#08_RE /5Phos/AGGCCAGAGCATTCGTGTTGCCTGCAGGTTTTTTTTTTTTTTTTVN
Table 3
Barcode plate #02
Reagents | Stock concentration | Volume | Final concentration
IGEPAL CA-630 (Sigma, Cat# I8896) | 10% | 20 μL | 0.2%
BSA in DPBS (Sigma, Cat# A1595) | 10% | 0.5 mL | 5%
Protease Inhibitor | 50X | 20 μL | 1X
SUPERase IN | 20 U/μL | 25 μL | 0.5 U/μL
RNase OUT | 40 U/μL | 12.5 μL | 0.5 U/μL
DPBS (Gibco, Cat# 14190136) | 1X | 422.5 μL | NA
Reagents | Stock concentration | Volume | Final concentration
Tris–Ac, pH 7.5 (Sigma, Cat# 93337) | 1 M | 660 μL | 66 mM
KAc (Sigma, Cat# P5708) | 3 M | 440 μL | 132 mM
MgAc2 (Sigma, Cat# M2545) | 1 M | 200 μL | 20 mM
DMF (Millipore, Cat# DX1730) | NA | 3200 μL | 32%
Ultrapure H2O | 1X | 5500 μL | NA
3. Tagmentation Mix.
Reagents | Stock concentration | Volume
5X RT Buffer (with Maxima H minus reverse transcriptase) | 5X | 52.8 μL
PBS | 1X | 52.8 μL
dNTP | 10 mM | 13.2 μL
RNase OUT | 40 U/μL | 1.65 μL
SUPERase IN | 20 U/μL | 3.3 μL
Ultrapure H2O | NA | 61 μL
Reagents | Stock concentration | Volume
T4 DNA Ligase Buffer (NEB, Cat# B0202S) | 10X | 500 μL
BSA (NEB, Cat# B9000S) | 20 mg/mL | 50 μL
NEBuffer 3.1 (NEB, Cat# B7203S) | 10X | 100 μL
Ultrapure H2O | NA | 2250 μL
A4 R04_#04 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAAATCCANGTGGCCGATGTTTCG
A5 R04_#05 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAAATGAGNGTGGCCGATGTTTCG
A6 R04_#06 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAACACTGNGTGGCCGATGTTTCG
A7 R04_#07 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAACGTTTNNGTGGCCGATGTTTCG
A8 R04_#08 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAAGAAGCNNGTGGCCGATGTTTCG
A9 R04_#09 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAAGCCCTNNGTGGCCGATGTTTCG
A10 R04_#10 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAAGCTACNNNGTGGCCGATGTTTCG
A11 R04_#11 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAATCTTGNNNGTGGCCGATGTTTCG
A12 R04_#12 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACAACACNNNGTGGCCGATGTTTCG
B1 R04_#13 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACAGTATGTGGCCGATGTTTCG
B2 R04_#14 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACCAAGTGTGGCCGATGTTTCG
B3 R04_#15 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACCCTAAGTGGCCGATGTTTCG
B4 R04_#16 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACCCTTTNGTGGCCGATGTTTCG
B5 R04_#17 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACCTCTCNGTGGCCGATGTTTCG
B6 R04_#18 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACGATTGNGTGGCCGATGTTTCG
B7 R04_#19 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACGCAGANNGTGGCCGATGTTTCG
B8 R04_#20 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACGTAAANNGTGGCCGATGTTTCG
B9 R04_#21 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACTACCTNNGTGGCCGATGTTTCG
B10 R04_#22 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACTCGGTNNNGTGGCCGATGTTTCG
B11 R04_#23 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACTGTCGNNNGTGGCCGATGTTTCG
B12 R04_#24 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNACTTATGNNNGTGGCCGATGTTTCG
C1 R04_#25 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGAAAGGGTGGCCGATGTTTCG
C2 R04_#26 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGAATCTGTGGCCGATGTTTCG
C3 R04_#27 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGACATAGTGGCCGATGTTTCG
C4 R04_#28 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGAGACCNGTGGCCGATGTTTCG
C5 R04_#29 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGCCCAANGTGGCCGATGTTTCG
C6 R04_#30 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGCTATTNGTGGCCGATGTTTCG
C7 R04_#31 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGGAGGTNNGTGGCCGATGTTTCG
C8 R04_#32 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGGGCTTNNGTGGCCGATGTTTCG
C9 R04_#33 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGGTGTANNGTGGCCGATGTTTCG
C10 R04_#34 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGTGCTCNNNGTGGCCGATGTTTCG
C11 R04_#35 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGTGGGANNNGTGGCCGATGTTTCG
C12 R04_#36 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNAGTTACGNNNGTGGCCGATGTTTCG
D1 R04_#37 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATAAGGGGTGGCCGATGTTTCG
D2 R04_#38 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATCATTCGTGGCCGATGTTTCG
D3 R04_#39 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATGGAACGTGGCCGATGTTTCG
D4 R04_#40 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATGTGCCNGTGGCCGATGTTTCG
D5 R04_#41 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATTCACCNGTGGCCGATGTTTCG
D6 R04_#42 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNATTCGAGNGTGGCCGATGTTTCG
D7 R04_#43 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNCAAGCCTNNGTGGCCGATGTTTCG
Table 4
H1 R04_#85 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNGTATAAGGTGGCCGATGTTTCG
H2 R04_#86 CAGACGTGTGCTCTTCCGATCTNNNNNNNNNNGTCAGACGTGGCCGATGTTTCG
3 Methods
3.1 Reagents Preparation
1. All oligo DNA sequences in this subheading are listed in Tables 1 and 2. To prepare the RT primer mix, mix 12.5 μL of barcoded T15VN primer (RNA_#XX_RE, 100 μM), 12.5 μL of barcoded N6 primer (RNA_#XX_NRE, 100 μM), and 75 μL of ultrapure nuclease-free water in PCR tubes. Vortex to mix and store at -20 °C.
2. To prepare barcoded Tn5 adaptors, mix 10 μL of barcoded Tn5 adaptor (DNA_#XX_RE, 100 μM) and 10 μL of pMENTs (100 μM) in PCR tubes. Using a thermocycler, heat the mix at 95 °C for 5 min and slowly cool down to 20 °C (0.1 °C/s). Store the annealed adaptors at -20 °C or use them immediately for step 3.
3. To prepare barcoded Tn5 complex, add 5 μL of barcoded annealed Tn5 adaptors (from step 2) to 1.5 mL low-bind tubes. Add 35 μL of 0.5 mg/mL unloaded Tn5 protein to each tube and pipette 5 times to mix. Then vortex for 3–5 s and spin down quickly. Incubate at room temperature for 30 min, then transfer to ice and leave for 5 min. Store at -20 °C.
4. To prepare R02 and R03 barcode plates, add 6 μL of R02 or R03 barcoded oligo (BC Plate#02 or BC Plate#03, 100 μM), 5.5 μL of Linker-R02 or Linker-R03 oligo (100 μM), and 38.5 μL of ultrapure nuclease-free water to each well of a low-bind 96-well PCR plate and seal the plate (annealing plate). Heat at 95 °C for 5 min and slowly cool down to 20 °C (0.1 °C/s). Aliquot 10 μL of annealed barcoded oligos from each well of the annealing plate into four low-bind 96-well PCR plates (working plates). Store the working plates at -20 °C.
5. To prepare Adaptor Mix: (a) prepare P5-complex (25 μL 100 μM P5-FokI and 25 μL 100 μM P5c-NNDC-FokI) and P5H-complex (25 μL 100 μM P5H-FokI and 25 μL 100 μM P5Hc-NNDC-FokI) in two different tubes; (b) in a thermocycler, heat the mixtures for 5 min at 95 °C and slowly cool down to 20 °C (-0.1 °C/s); (c) mix 15 μL of P5-complex with 45 μL of P5H-complex on ice and pipette to mix, then add 240 μL of cold ultrapure water (to dilute from 50 to 10 μM) and store at -20 °C.
6. To prepare R02 Blocking Solution, add 264 μL 100 μM Blocker-R02, 250 μL 10X T4 DNA Ligase Buffer, and
3.3 Chromatin Tagmentation
1. Freshly prepare the Tagmentation Mix and keep on ice.
2. Label 12 tubes for tagmentation. Aliquot a total of 1200–2400 k nuclei into the 12 tubes on ice, with 100–200 k nuclei per tube. Different samples or replicates can be multiplexed here, distinguished by their 1st-round barcode (sample barcode) (see Note 5).
3. Spin down the 12 tubes at 1000 × g for 10 min at 4 °C, and carefully discard the supernatant. Keep the samples on ice.
Step no. | Temperature (°C) | Time
1 | 50 | 10 min
2 | 8 | 12 s
  | 15 | 45 s
  | 20 | 45 s
  | 30 | 30 s
  | 42 | 2 min
  | 50 | 5 min; repeat step 2 for additional 2 cycles
3 | 50 | 10 min
4 | 12 | Hold
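For planning purposes, the nominal run time of this program can be added up. The sketch below encodes the table as data (the layout and names are hypothetical) and interprets "repeat step 2 for additional 2 cycles" as three passes through step 2 in total; ramp times between temperatures and the final hold are not included.

```python
# Each step: list of (temperature in deg C, duration in s) and number of passes.
PROGRAM = [
    {"step": 1, "segments": [(50, 600)], "passes": 1},
    {
        "step": 2,
        "segments": [(8, 12), (15, 45), (20, 45), (30, 30), (42, 120), (50, 300)],
        "passes": 3,  # first pass plus 2 additional cycles
    },
    {"step": 3, "segments": [(50, 600)], "passes": 1},
]


def total_seconds(program):
    """Sum the segment durations, multiplied by the number of passes per step."""
    return sum(
        step["passes"] * sum(seconds for _, seconds in step["segments"])
        for step in program
    )


run_time_min = total_seconds(PROGRAM) / 60  # nominal run time without ramping
```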
6. Transfer the 12 tubes to ice. Keep on ice and pool all nuclei into
a 1.5 mL Axygen Maximum Recovery tube, add 4.8 μL 5%
Triton X-100, tap to mix, and quickly spin down.
7. Centrifuge to pellet the nuclei at 1000 × g for 10 min at 4 °C,
and carefully discard the supernatant.
8. Resuspend the nuclei in 1 mL 1X NEBuffer 3.1 and proceed to
Subheading 3.5 immediately.
3.5 Adding DNA 1. Prepare R02 Blocking Solution, R03 Termination Solution,
Barcodes and two tubes of Ligation Mix freshly before the experiment.
2. Prewash two 15 mL Corning tubes by rinsing each tube with
0.5 mL 0.1% BSA in PBS, and discard the liquid (see Note 6).
3. Add the nuclei suspension to the 1st Ligation Mix, add
100 μL T4 DNA Ligase, and gently mix by pipetting up
and down.
4. Transfer the nuclei-Ligation Mix to a reagent reservoir, and
distribute 40 μL of the mixture to each well of the 96-well R02
Barcoding Plate with a multichannel pipette. Seal the plate
with film.
5. Incubate the nuclei–barcode ligation mixture in a Thermo-
Mixer set to 37 °C, 300 rpm for 30 min.
6. Open the seal, add 10 μL of R02 Blocking Solution to each of
the 96 wells with a multichannel pipette, and reseal the plate.
7. Continue incubating the nuclei–barcode ligation mixture in a
ThermoMixer set to 37 °C, 300 rpm for another 30 min.
8. Pool all nuclei in a reagent reservoir, and transfer the mixture
containing the nuclei from the reagent reservoir to a 15 mL
tube (prewashed with 0.1% BSA in PBS in step 2).
9. Wash the reagent reservoir with 1 mL of PBS and combine the wash
with the nuclei mixture.
10. Spin down the nuclei with a swing bucket centrifuge at
1000 × g for 10 min at 4 °C, and carefully discard the superna-
tant (see Note 7).
11. Resuspend the nuclei in 1 mL 1X NEBuffer 3.1.
12. Transfer the nuclei suspension to the 2nd Ligation Mix, add
100 μL T4 DNA Ligase, and gently mix by pipetting up
and down.
13. Transfer the nuclei-Ligation Mix to a reagent reservoir, and
distribute 40 μL of the mixture to each well of the 96-well R03
Barcoding Plate with a multichannel pipette. Seal the plate.
14. Incubate the nuclei–barcode ligation mixture in a Thermo-
Mixer set to 37 °C, 300 rpm for 30 min.
15. Open the seal and add 10 μL of R03 Termination Solution to
each of the 96 wells with a multichannel pipette.
16. Immediately pool all nuclei in a reagent reservoir, and transfer
the mixture containing the nuclei from the reagent reservoir to
a 15 mL tube (prewashed with 0.1% BSA in PBS in step 2).
17. Wash the reagent reservoir with 1 mL of PBS and intermix with
the nuclei mixture.
3.6 Library Pre-amplification
1. Add 1.5 μL of 10X Terminal Transferase Buffer and 0.5 μL of
1 mM dCTP into each sub-library. Close the lid, tap to mix,
and briefly spin down.
2. Incubate at 95 °C for 5 min, then immediately chill on ice and
leave on ice for another 5 min.
3. Add 0.5 μL of Terminal Transferase into each tube. Close the
lid, tap to mix, and briefly spin down.
4. Incubate at 37 °C for 30 min, followed by heat inactivating the
reaction at 65 °C for 10 min.
5. Prepare the Anchor Mix freshly. Add 15 μL Anchor Mix into
each tube. Close the lid, tap to mix, and briefly spin down.
6. Carry out the reaction in a thermocycler with the program
below:
3.7 Library Splitting
1. Divide the purified amplification product into two tubes:
20.5 μL for Tn5 tagmentation-derived DNA library prepara-
tion and 17 μL for RNA-derived library preparation.
2. Steps 2–11 are for DNA library preparation. Add 2.5 μL 10X
CutSmart Buffer, 1 μL SbfI-HF, and 1 μL FokI to each 20.5 μL
aliquot of the purified amplification product. Incubate at 37 °C
for 1 h.
3. Add 31.3 μL SPRI beads (1.25X) to each tube and mix. Incu-
bate at room temperature for 5 min.
4. Place the tubes on a magnetic stand, and let them sit for 5 min
until the liquid becomes clear. Carefully discard the
supernatant.
5. Add 150 μL 80% EtOH into each tube/well, let sit for 30 s, and
discard the supernatant.
6. Repeat step 5 for a total of two washes.
7. Elute the amplification product with 15 μL ultrapure
H2O. The purified product can be stored at -20 °C.
8. Add 2 μL 10X T4 DNA Ligase Buffer, 1.5 μL Adaptor Mix,
and 1.5 μL T4 DNA Ligase to each tube from the previous
step. Close the lid, tap to mix, and briefly spin down.
9. Carry out the ligation reaction in a thermocycler with the
program given below. Place the tubes in the thermocycler
immediately after the temperature reaches 4 °C:
12. Steps 12–18 are for RNA library preparation. Add 2 μL 10X
CutSmart Buffer and 1 μL NotI-HF into the 17 μL amplifi-
cation product. Incubate at 37 °C for 1 h.
13. Add 25 μL SPRI beads (1.25X) to each tube and mix. Incu-
bate at room temperature for 5 min and repeat the wash steps
as described in steps 4–6.
14. Elute with 10 μL ultrapure H2O. The purified product can be
stored at -20 °C.
15. Use 5 μL of the purified product for tagmentation with
Illumina Nextera XT. Add 10 μL Buffer TD and pipette up
and down to mix.
16. Add 5 μL of Amplicon Tagmentation Mix (ATM) to each
tube, pipette 10 times to mix and close the lid, and quickly
spin down.
17. Incubate the mixture in a thermocycler at 55 °C for 5 min,
cool down to 10 °C, and immediately place the tubes
on ice.
18. Add 5 μL Neutralize Tagment Buffer (NT) to each well,
pipette 10 times to mix, close the lid, and incubate at room
temperature for 5 min. Proceed to step 8 of Subheading 3.8.
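The SPRI bead volumes used in this subheading follow from multiplying the digest volume by the stated bead ratio. The short Python sketch below reproduces that arithmetic; the function name `spri_bead_volume` is hypothetical and only serves this illustration.

```python
def spri_bead_volume(sample_volume_ul: float, ratio: float) -> float:
    """Return the SPRI bead volume (in microliters) for a bead-to-sample ratio."""
    if sample_volume_ul <= 0 or ratio <= 0:
        raise ValueError("sample volume and ratio must be positive")
    return sample_volume_ul * ratio


# DNA branch digest: 20.5 uL product + 2.5 uL buffer + 1 uL SbfI-HF + 1 uL FokI
dna_digest_ul = 20.5 + 2.5 + 1 + 1
# RNA branch digest: 17 uL product + 2 uL buffer + 1 uL NotI-HF
rna_digest_ul = 17 + 2 + 1

# 1.25X beads: 31.25 uL for the DNA branch (given as 31.3 uL in step 3)
dna_beads_ul = spri_bead_volume(dna_digest_ul, 1.25)
# 1.25X beads: 25 uL for the RNA branch (step 13)
rna_beads_ul = spri_bead_volume(rna_digest_ul, 1.25)
```

The same helper can be used to rescale the bead volume if a reaction volume is changed, provided the bead ratio is kept as stated in the protocol.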
3.8 Library Amplification
1. Steps 1–7 are for DNA library amplification. Add 2 μL of
Illumina TruSeq i7 index primers, 2 μL Illumina TruSeq i5
index primers, and 25 μL NEBNext 2X HiFi PCR mix. Pipette
to mix, close the lid, and quickly spin down.
2. Carry out the PCR reaction in a thermocycler with the pro-
gram as follows (see Note 10):
Step no.   Temperature (°C)   Time
1          98                 3 min
2          98                 10 s
           63                 30 s
           72                 1 min; repeat step 2 for additional 11–13 cycles
3          72                 5 min
4          12                 Hold
3. Add 42.5 μL SPRI beads (0.85X) to each tube and mix. Incu-
bate at room temperature for 5 min.
4. Place the tubes on a magnetic stand, and let them sit for 5 min
until the liquid becomes clear. Carefully discard the
supernatant.
5. Add 150 μL 80% EtOH into each tube/well, let sit for 30 s, and
discard the supernatant.
6. Repeat step 5 for a total of two washes.
7. Elute the DNA library with 25 μL ultrapure H2O. The purified
library can be stored at -20 °C.
8. Steps 8–11 are for RNA library amplification. Add 6 μL
ultrapure H2O, 2 μL of Illumina TruSeq i7 index primers,
2 μL of Illumina Nextera i5 index primers, and 15 μL
Nextera PCR Mix (NPM) to each tube. Pipette to mix,
close the lid, and briefly spin down.
9. Carry out the PCR reaction in a thermocycler with the pro-
gram as follows (see Note 10):
Step no.   Temperature (°C)   Time
1          72                 3 min
2          95                 30 s
3          95                 10 s
           55                 30 s
           72                 1 min; repeat step 3 for additional 11–13 cycles
4          72                 5 min
5          12                 Hold
3.9 Sequencing and Data Preprocessing
1. DNA and RNA libraries with different combinations of indices
can be multiplexed for sequencing.
2. Paired-seq requires at least 50 cycles for Read1 (insert genomic
sequences), 8 cycles for Index Read1, 8 cycles for Index Read2,
and 100 cycles for Read2 (cellular barcodes) (50 + 8 + 8 + 100).
4 Notes
1. All the safe pause points in the protocol are indicated in Fig. 1a.
2. For 96-well barcode plates, standard desalting purification can
be used. For Index PCR primers, HPLC purification is
required.
3. Native nuclei isolated from snap-frozen or fresh tissue are
preferred. Crosslinking the nuclei reduces the complexity of
Paired-seq DNA libraries.
4. Nuclei preparation, tagmentation, reverse transcription, and
combinatorial DNA barcoding must be carried out in a single
day, which will take ~8 h.
5. The optimal input is 1.2 million nuclei (100,000 × 12 tubes).
Fewer nuclei are acceptable but will result in a lower recovery
rate. We can typically recover 200,000–300,000 nuclei from
1.2 million input nuclei (17–25%) and 30,000–50,000 nuclei from
500,000 input nuclei (41,700 × 12 tubes; 6–10%).
6. Prewash the 15 mL tubes with 0.1% BSA in PBS, which can
reduce nuclei sticking to the tube and increase nuclei
recovery rate.
7. When removing supernatants after spin-down steps, remove as
much liquid as possible. Residual buffers, salts, or oligos from
the previous step may interfere with the downstream reaction
(e.g., EDTA after tagmentation in step 6 of Subheading 3.3, and
adaptor oligos after nuclei barcoding in Subheading 3.5).
8. During purification of nucleic acids from lysis mixture, make
sure to wash out SDS as the residual SDS may inhibit the
subsequent reactions.
9. The optimal number of nuclei in each sub-library is ~3500,
which gives 6–10% potential barcode collision rate. A higher
number of nuclei in each sub-library will result in higher bar-
code collision. Using nuclei sorting instead of dilution to ali-
quot sub-libraries can reduce the potential barcode collision,
but will also reduce the recovery rate.
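The recovery percentages quoted in Note 5 can be checked with a few lines of Python. This is only an arithmetic sketch; the function names are hypothetical, and the input and recovered nuclei numbers are the ones stated in the note.

```python
def recovery_rate(recovered_nuclei: float, input_nuclei: float) -> float:
    """Fraction of input nuclei that are recovered at the end of the protocol."""
    if input_nuclei <= 0:
        raise ValueError("input_nuclei must be positive")
    return recovered_nuclei / input_nuclei


def nuclei_per_tube(input_nuclei: float, n_tubes: int = 12) -> float:
    """Nuclei loaded into each tagmentation tube when the input is split evenly."""
    return input_nuclei / n_tubes


# Values from Note 5: (input nuclei, lowest recovery, highest recovery)
scenarios = [
    (1_200_000, 200_000, 300_000),
    (500_000, 30_000, 50_000),
]

for input_nuclei, low, high in scenarios:
    print(
        f"{input_nuclei:>9,} input nuclei "
        f"({nuclei_per_tube(input_nuclei):,.0f} per tube): "
        f"{recovery_rate(low, input_nuclei):.0%} to "
        f"{recovery_rate(high, input_nuclei):.0%} recovered"
    )
```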
[Fig. 2 image: (a) TapeStation gel image with a size ladder in bp; (b) "Paired-seq DNA Library" and (c) "Paired-seq RNA Library" traces, Normalized Intensity versus fragment size]
Fig. 2 Representative fragment analysis results of Paired-seq libraries. (a) TapeStation analysis results of a
representative Paired-seq library. EL, electronic ladder. (b, c) Fragment size distributions of representative
Paired-seq DNA (b) and RNA (c) libraries
Acknowledgments
We thank QB3 MacroLab for the Tn5 enzyme. This study was
funded by grant nos. 1 U19 MH114831-02, U01MH121282,
and R01AG066018 and the Ludwig Institute for Cancer Research
(to B.R.) and grant no. 1K99HG011483-01 (to C.Z.).
References
1. Lee CK, Shibata Y, Rao B et al (2004) Evidence 10. Lai B, Gao W, Cui K et al (2018) Principles of
for nucleosome depletion at active regulatory nucleosome organization revealed by single-
regions genome-wide. Nat Genet 36(8): cell micrococcal nuclease sequencing. Nature
900–905. https://doi.org/10.1038/ng1400 562(7726):281–285. https://doi.org/10.
2. Thurman RE, Rynes E, Humbert R et al 1038/s41586-018-0567-3
(2012) The accessible chromatin landscape of 11. Cusanovich DA, Daza R, Adey A et al (2015)
the human genome. Nature 489(7414): Multiplex single cell profiling of chromatin
7 5 – 8 2 . h t t p s : // d o i . o r g / 1 0 . 1 0 3 8 / accessibility by combinatorial cellular indexing.
nature11232 Science 348(6237):910–914. https://doi.
3. Yue F, Cheng Y, Breschi A et al (2014) A org/10.1126/science.aab1601
comparative encyclopedia of DNA elements in 12. Buenrostro JD, Wu B, Litzenburger UM et al
the mouse genome. Nature 515(7527): (2015) Single-cell chromatin accessibility
3 5 5 – 3 6 4 . h t t p s : // d o i . o r g / 1 0 . 1 0 3 8 / reveals principles of regulatory variation.
nature13992 Nature 523(7561):486–490. https://doi.
4. Boyle AP, Davis S, Shulha HP et al (2008) org/10.1038/nature14590
High-resolution mapping and characterization 13. Lareau CA, Duarte FM, Chew JG et al (2019)
of open chromatin across the genome. Cell Droplet-based combinatorial indexing for
132(2):311–322. https://doi.org/10.1016/j. massive-scale single-cell chromatin accessibility.
cell.2007.12.014 Nat Biotechnol 37(8):916–924. https://doi.
5. Schones DE, Cui K, Cuddapah S et al (2008) org/10.1038/s41587-019-0147-6
Dynamic regulation of nucleosome positioning 14. Preissl S, Fang R, Huang H et al (2018) Single-
in the human genome. Cell 132(5):887–898. nucleus analysis of accessible chromatin in
https://doi.org/10.1016/j.cell.2008.02.022 developing mouse forebrain reveals cell-type-
6. Giresi PG, Kim J, McDaniell RM et al (2007) specific transcriptional regulation. Nat Neu-
FAIRE (Formaldehyde-Assisted Isolation of rosci 21(3):432–439. https://doi.org/10.
Regulatory Elements) isolates active regulatory 1038/s41593-018-0079-3
elements from human chromatin. Genome Res 15. Kelsey G, Stegle O, Reik W (2017) Single-cell
17(6):877–885. https://doi.org/10.1101/gr. epigenomics: recording the past and predicting
5533506 the future. Science 358(6359):69–75. https://
7. Buenrostro JD, Giresi PG, Zaba LC et al doi.org/10.1126/science.aan6826
(2013) Transposition of native chromatin for 16. Stuart T, Satija R (2019) Integrative single-cell
fast and sensitive epigenomic profiling of open analysis. Nat Rev Genet 20(5):257–272.
chromatin, DNA-binding proteins and nucleo- https://doi.org/10.1038/s41576-019-
some position. Nat Methods 10(12): 0093-7
1213–1218. https://doi.org/10.1038/ 17. Zhu C, Preissl S, Ren B (2020) Single-cell
nmeth.2688 multimodal omics: the power of many. Nat
8. Minnoye L, Marinov GK, Krausgruber T et al Methods 17(1):11–14. https://doi.org/10.
(2021) Chromatin accessibility profiling meth- 1038/s41592-019-0691-5
ods. Nat Rev Methods Prim 1(1):10. https:// 18. Angermueller C, Clark SJ, Lee HJ et al (2016)
doi.org/10.1038/s43586-020-00008-9 Parallel single-cell sequencing links transcrip-
9. Jin W, Tang Q, Wan M et al (2015) Genome- tional and epigenetic heterogeneity. Nat Meth-
wide detection of DNase I hypersensitive sites ods 13(3):229–232. https://doi.org/10.
in single cells and FFPE tissue samples. Nature 1038/nmeth.3728
528(7580):142–146. https://doi.org/10. 19. Zhu C, Zhang Y, Li YE et al (2021) Joint
1038/nature15740 profiling of histone modifications and tran-
scriptome in single cells from mouse brain.
Single-Cell Co-Assay of Open Chromatin and RNA 185
Abstract
The ability to analyze the transcriptomic and epigenomic states of individual single cells has in recent years
transformed our ability to measure and understand biological processes. Recent advancements have focused
on increasing sensitivity and throughput to provide richer and deeper biological insights at the cellular level.
The next frontier is the development of multiomic methods capable of analyzing multiple features from the
same cell, such as the simultaneous measurement of the transcriptome and the chromatin accessibility of
candidate regulatory elements. In this chapter, we discuss and describe SHARE-seq (Simultaneous high-
throughput ATAC, and RNA expression with sequencing) for carrying out simultaneous chromatin
accessibility and transcriptome measurements in single cells, together with the experimental and analytical
considerations for achieving optimal results.
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_11,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
188 Samuel H. Kim et al.
2 Materials
2.1 DNA Oligos and Primers
All oligonucleotides can be obtained through IDT. The exact scale
and purification methods are listed below:
1. Round 1 linker (1 μmol scale, standard desalting):
CCGAGCCCACGAGACTCGGACGATCATGGG
2. Round 2 linker (1 μmol scale, standard desalting):
CAAGTATGCAGCGCGCTCAAGCACGTGGAT
3. Round 3 linker (1 μmol scale, standard desalting):
AGTCGTACGCCGATGCGAAACATCGGCCAC
2.4 Buffers and Reagents
Make all buffers using ultrapure molecular biology-grade ddH2O:
1. 2.5 M Glycine (50 mL)
9.375 g Glycine (powder)
1× PBS up to 50 mL
Filter through a 0.22 μm filter. Store at room temperature.
2. Tissue Dissociation (MACS) buffer
10 mM Tris-HCl pH 8.0
5 mM CaCl2
5 mM EDTA
3 mM MgAc
0.6 mM DTT
cOmplete Protease Inhibitor
Make fresh every time.
3 Methods
[Fig. 1 image: schematic of the SHARE-seq workflow, showing tagmentation with Tn5 transposase, reverse transcription with a biotinylated poly(T) primer, and biotin pulldown separating the supernatant from the beads]
Fig. 1 Outline of the SHARE-seq assay. Nuclei are isolated from cells or tissues and crosslinked. Transposition
is then carried out on chromatin, followed by reverse transcription with a biotinylated RT primer. Three pool–
split rounds of hybridization of barcode oligos are then performed. Hybridized barcodes are then ligated, and
crosslinks are reversed. The ATAC and RNA portions are separated by streptavidin pulldown. The ATAC is
directly amplified, and the RNA is subjected to cDNA amplification, tagmentation, and final library amplification
Simultaneous Single-Cell Profiling of the Transcriptome and Accessible. . . 197
3.1 Determining the Optimal Cell Number
It is important to carefully track the number of cells going into the
SHARE-seq assays and being retained at each key step of the
procedure. Pool–split assays rely on the statistical uniqueness of
barcode combinations through which cells pass, which in turn
means that having too many cells entering the pool–split procedure
will lead to an unacceptably high rate of doublets (two or more cells
with the same barcode). At the same time, some of the reactions
have an efficiency-imposed limit on the number of cells that can
enter them and need to be distributed into parallel reactions for
optimal results. This applies to the initial transposition and reverse
transcription reactions, as well as to the final amplification, where
the existing protocol is optimized for libraries of size 20,000 cells,
which means that after the final pooling cells are split into separate
subpools of that size and processed into individual sublibraries.
Figure 3 shows the theoretical number of detected cells and
doublet rate for different pool–split setups with three rounds,
accounting for a certain level of cell loss during repeated handling.
Based on these calculations and empirical experience, we usually
start the pool–split rounds with 5 × 10⁵ cells for a 96 × 96 × 96
pool–split experiment.
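To make the relationship between cell number and doublet rate concrete, the following Python sketch estimates the expected doublet fraction under a simple Poisson loading model. It is a minimal illustration, not the simulation used to produce Fig. 3: the function names are hypothetical, all cell losses are applied before collisions are computed, and the 50% loss per round is taken from the Fig. 3 legend.

```python
import math


def cells_after_losses(n_start: float, loss_per_round: float, n_rounds: int) -> float:
    """Cells remaining after repeated pool-split rounds with a fixed fractional loss."""
    return n_start * (1.0 - loss_per_round) ** n_rounds


def expected_doublet_fraction(n_cells: float, n_barcodes: int) -> float:
    """Fraction of occupied barcode combinations that hold two or more cells.

    Assumes the number of cells per barcode combination is Poisson distributed
    with mean n_cells / n_barcodes.
    """
    if n_cells <= 0 or n_barcodes <= 0:
        raise ValueError("n_cells and n_barcodes must be positive")
    lam = n_cells / n_barcodes
    p_occupied = -math.expm1(-lam)      # P(at least one cell) = 1 - exp(-lam)
    p_single = lam * math.exp(-lam)     # P(exactly one cell)
    return 1.0 - p_single / p_occupied


n_barcodes = 96 ** 3                                   # 96 x 96 x 96 setup
n_final = cells_after_losses(5e5, loss_per_round=0.5, n_rounds=3)
doublets = expected_doublet_fraction(n_final, n_barcodes)
print(f"{n_final:,.0f} cells retained, expected doublet fraction {doublets:.3f}")
```

Under these assumptions, increasing the number of barcode combinations (e.g., 384-well plates in one or more rounds) lowers the doublet fraction for the same number of retained cells, which is the trend that Fig. 3 illustrates.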
ATAC
P5 Read 1 R1 linker R2 linker R3 linker P7
5’ AATGATACGGCGACCACCGAGATCTACACTAGATCGCTCGTCGTCGGCAGCGTCAGATGTGTATAAGAGACAG...CTGTCTCTTATACA CCGAGCCCACGAGACTCGGACGATCATGGG CAAGTATGCAGCGCGCTCAAGCACGTGGAT AGTCGTACGCCGATGCGAAACATCGGCCAC
ACATATTCTCTGTC...GACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGAGCCTGCTAGTACCCTAGTGCAAGTTCATACGTCGCGCGAGTTCGTGCACCTATAGTGCAATCAGCATGCGGCTACGCTTTGTAGCCGGTGTAGTGCAATAGAGCATACGGCAGAAGACGAAC 5’
Read 2 R1 BC R2 BC R3 BC
RNA
P5 Read 1 UMI R1 linker R2 linker R3 linker P7
5’ AATGATACGGCGACCACCGAGATCTACACTAGATCGCTCGTCGTCGGCAGCGTCAGATGTGTATAAGAGACAG... AAAAAAAAAAAAAAA CCGAGCCCACGAGACTCGGACGATCATGGG CAAGTATGCAGCGCGCTCAAGCACGTGGAT AGTCGTACGCCGATGCGAAACATCGGCCAC
ACATATTCTCTGTC...NVTTTTTTTTTTTTTTTNNNNNNNNNNGACAGAGAATATGTGTAGAGGCTCGGGTGCTCTGAGCCTGCTAGTACCCTAGTGCAAGTTCATACGTCGCGCGAGTTCGTGCACCTATAGTGCAATCAGCATGCGGCTACGCTTTGTAGCCGGTGTAGTGCAATAGAGCATACGGCAGAAGACGAAC 5’
Read 2 R1 BC R2 BC R3 BC
Fig. 2 Structure of final SHARE-seq libraries. ATAC (top) and RNA (bottom). Dots represent the actual library
insert
[Fig. 3 image: doublet fraction (0.00–0.20) versus number of detected cells (10⁴–10⁸, logarithmic scale) for 96 × 96 × 96, 384 × 96 × 96, and 384 × 384 × 384 pool–split setups]
Fig. 3 Combinatorial indexing and SHARE-seq’s throughput. Shown is the number of cells that can be detected
at a given doublet rate; the pool–split process was simulated as a random Poisson loading at a 50% loss of
cells during each pool–split round
3.2 Annealing of Oligo Plates
In this step, barcode-containing oligonucleotides for each round of
split–pool are annealed and distributed into 96-well plates prior to
the actual assay. These plates can be stored at -20 °C indefinitely.
To save time, it is advisable to prepare enough plates in advance to
support multiple experiments. It is critical to thaw these plates to
room temperature prior to use. See Note 4.
1. Dilute Round 1 linker oligos (120 μL at 1 mM concentration)
with 11,880 μL STE buffer.
2. Mix 90 μL diluted Round 1 linker oligo with 10 μL Round
1 oligo (at 100 μM) in the wells of a multiwell plate.
3. Dilute Round 2 linker oligos (120 μL at 1 mM concentration)
with 9480 μL STE buffer.
4. Mix 88 μL diluted Round 2 linker oligo with 12 μL Round
2 oligo (at 100 μM) in the wells of a multiwell plate.
5. Dilute Round 3 linker oligos (144 μL at 1 mM concentration)
with 9360 μL STE buffer.
6. Mix 86 μL diluted Round 3 linker oligo with 14 μL Round
3 oligo (at 100 μM) in the wells of a multiwell plate.
7. Anneal the Round 1, Round 2, and Round 3 plates as follows in
a thermocycler:
2 min at 95 °C
Slow ramp at -1 °C per minute to 20 °C
2 min at 20 °C
Indefinitely at 4 °C
8. Check if there has been significant water evaporation for wells
situated at the corners. If yes, add water to equalize volumes.
9. Aliquot 10 μL of the annealed oligos to new plates. This should
be enough for 9 experiments. Store these plates at -20 °C.
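The dilution and mixing volumes in steps 1–6 imply specific linker and barcode oligo concentrations in each well. The Python sketch below works these out using simple C1V1 = C2V2 arithmetic; the function and variable names are hypothetical, and the volumes and stock concentrations are those given in the steps above.

```python
def diluted_concentration_um(stock_um: float, stock_ul: float, diluent_ul: float) -> float:
    """Concentration (uM) after adding diluent to a stock (C1 * V1 = C2 * V2)."""
    return stock_um * stock_ul / (stock_ul + diluent_ul)


def concentration_in_mix_um(conc_um: float, volume_ul: float, total_ul: float) -> float:
    """Concentration (uM) of one component after mixing into a total volume."""
    return conc_um * volume_ul / total_ul


# (linker stock uL at 1 mM, STE buffer uL, diluted linker uL per well,
#  barcode oligo uL per well at 100 uM)
rounds = {
    "Round 1": (120, 11880, 90, 10),
    "Round 2": (120, 9480, 88, 12),
    "Round 3": (144, 9360, 86, 14),
}

results = {}
for name, (linker_ul, ste_ul, diluted_ul, barcode_ul) in rounds.items():
    diluted = diluted_concentration_um(1000.0, linker_ul, ste_ul)  # 1 mM = 1000 uM
    well_total = diluted_ul + barcode_ul
    results[name] = {
        "diluted_linker_um": diluted,
        "linker_in_well_um": concentration_in_mix_um(diluted, diluted_ul, well_total),
        "barcode_in_well_um": concentration_in_mix_um(100.0, barcode_ul, well_total),
    }

for name, values in results.items():
    print(name, {key: round(value, 2) for key, value in values.items()})
```

For Round 1, for example, the diluted linker is at 10 μM, and each 100-μL well contains the linker at 9 μM and the barcode oligo at 10 μM.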
3.3 Anneal Adapter Oligos
In this step, Tn5 adapters are prepared for both transposition of
chromatin and tagmentation during cDNA library preparation:
1. Dilute the Phosphorylated Read2, Read1, and Blocked ME
Comp oligos to a 100 μM concentration with the IDTE buffer.
2. Prepare the transposition adapter mix in a PCR tube as follows:
6.5 μL 100 μM Phosphorylated Read2 oligo
6.5 μL 100 μM Read1 oligo
13 μL 100 μM Blocked ME Comp oligo
0.26 μL 1 M Tris pH 8.0
0.26 μL 5 M NaCl
3. Prepare the tagmentation adapter mix in a PCR tube as follows:
13 μL 100 μM Read1 oligo
13 μL 100 μM Blocked ME Comp oligo
0.26 μL 1 M Tris pH 8.0
0.26 μL 5 M NaCl
4. Anneal oligos as follows in a thermocycler:
2 min at 85 °C
Slow ramp at -1 °C per minute to 20 °C
2 min at 20 °C
Indefinitely at 4 °C
5. Heat glycerol to 65 °C, and then equilibrate to room
temperature.
6. Mix 25 μL glycerol with 25 μL of annealed oligo.
The annealed adapters can be used immediately or stored at
-20 °C.
3.4 Transposome Assembly
In this step, Tn5 transposomes are assembled together with the
annealed adapter oligos:
1. Assemble Tn5 transposomes by mixing the following
components:
3.5 Tissue Dissociation
Here, we describe an example tissue dissociation protocol that has
worked successfully in our hands for several human embryonic
tissues. However, users should be aware that generally each tissue
requires separate optimization of dissociation conditions, and it is
likely that a different protocol will have to be adapted in most
situations.
1. Set swing bucket centrifuge to 4 °C (Fast Temp) and thaw
1 M DTT.
2. Transfer tissue samples onto dry ice.
3. Prepare MACS buffer (2 mL for each sample) as described
above. Make sure the buffer is cold on ice.
4. Add 10 μL Protector RNase Inhibitor per 1 mL of buffer to the
GentleMACS M-tubes. Add 1 mL of MACS buffer to each
GentleMACS M-tube and chill on ice.
5. Transfer 30–50 mg of tissue into each GentleMACS M-tube
containing 1 mL MACS buffer.
6. Allow the tissue to thaw in buffer. Transition to a cold room.
7. Homogenize using a Protein_01_01 dissociation protocol
on a GentleMACS Tissue Dissociator instrument.
8. Filter the homogenate through a 30 μm CellTrics filter into a
2 mL DNA LoBind tube by pipetting directly onto the top of
the filter and gently tapping to allow flow.
9. Wash the GentleMACS M-tube with 1 mL MACS buffer and
filter the wash again through the 30 μm CellTrics filter.
10. Spin down the homogenate in a swing bucket centrifuge at
500 × g for 5 min at 4 °C (ramp up and down both at 3/9).
11. Remove and discard supernatant.
12. Resuspend in 1 mL PBS-2RI.
13. Count cells/nuclei and proceed with a desired number of
cells/nuclei.
3.6 Fixation of Cells in Culture and of Dissociated Nuclei from Tissue
The next step, if starting with a dissociated tissue, is to fix the
nuclei. This is also the first step if starting with cells in culture.
The procedure used is generally the same, with the difference that
with nuclei the first step is directly the fixation:
3.7 ATAC Reaction
In this step, transposition of the entire sample is performed by
splitting it into 10,000–20,000 cells in 50-μL reactions each in a
96-well plate. The smaller volume and the number of cells per
reaction improve the quality of transposition.
The cell lysis conditions described here are adapted from the
omniATAC bulk ATAC protocol [22] (see Note 7):
1. Prepare PBS-RI by mixing the following:
800 μL PBS
2 μL Enzymatic RI
2. After the last centrifugation, remove supernatant and resuspend
the cells with PBS-RI to 2 × 10⁶ cells/mL.
3. Prepare 2× TB buffer (sufficient for 96 reactions) by mixing the
following:
874.5 μL 0.2 M Tris-acetate
70 μL 5 M Potassium acetate
53 μL 1 M Magnesium acetate
53 μL 10% Tween-20
53 μL 1% Digitonin
848 μL 100% DMF
698.5 μL H2O
4. Prepare 1× TB buffer according to the number of reactions
N to be carried out. N = 1 corresponds to 10⁴ input cells.
25 × N μL 2× TB
16.45 × N μL H2O
0.2 × N μL PIC
0.85 × N μL Enzymatic RI
Total volume: 42.5 × N μL.
5. Aliquot 5 × N μL of the diluted cells to a new tube, e.g., for
1 × 10⁵ cells, N = 10, so aliquot 50 μL cells to a new tube.
6. Add 42.5 × N μL 1× TB to sample.
7. Add 2.5 × N μL of assembled Tn5 to sample. Mix well.
8. Aliquot 50 μL of sample in the wells of a 96- or 384-well plate.
9. Seal the plate and incubate with shaking at 500 rpm for 30 min
at 37 °C.
10. Pool the reactions and spin down at 500 × g.
11. Add 0.5 mL NIB-RI without disturbing the pellet and spin
down again at 500 × g.
12. Resuspend the cells in 60 μL EB.
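Because all transposition volumes scale linearly with the number of reactions N, they are easy to tabulate programmatically. The following Python sketch scales the per-reaction volumes listed in steps 4–7; the dictionary and function names are hypothetical, and units are assumed to be microliters throughout.

```python
# Per-reaction (N = 1) volumes in microliters for the 1x TB buffer, from step 4.
TB_1X_PER_REACTION_UL = {
    "2x TB": 25.0,
    "H2O": 16.45,
    "PIC": 0.2,
    "Enzymatic RI": 0.85,
}
CELLS_PER_REACTION_UL = 5.0   # diluted cells at 2 x 10^6 cells/mL (step 5)
TN5_PER_REACTION_UL = 2.5     # assembled Tn5 (step 7)


def transposition_volumes(n_reactions: int) -> dict:
    """Scale the transposition recipe to n_reactions 50-uL reactions."""
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    tb = {name: vol * n_reactions for name, vol in TB_1X_PER_REACTION_UL.items()}
    tb_total = sum(tb.values())
    cells = CELLS_PER_REACTION_UL * n_reactions
    tn5 = TN5_PER_REACTION_UL * n_reactions
    return {
        "1x TB components": tb,
        "1x TB total": tb_total,
        "cells": cells,
        "Tn5": tn5,
        "reaction total": tb_total + cells + tn5,
    }


volumes = transposition_volumes(10)   # N = 10, i.e., about 10^5 input cells
print(volumes)
```

For N = 10, the totals come to 425 μL of 1× TB, 50 μL of cells, and 25 μL of Tn5, i.e., 500 μL to be split into ten 50-μL reactions.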
3.8 Reverse Transcription
In this step, reverse transcription is performed in situ. The
conditions are optimized for 1 × 10⁵ cells entering each 50-μL reaction:
1. Prepare the reverse transcription (RT) mix (sufficient for 6 reac-
tions) as follows:
70 μL 5× RT buffer
2.19 μL Enzymatics RNase Inhibitor
4.38 μL SUPERase RI
17.5 μL dNTPs
35 μL RT Primer
10.94 μL H2O
105 μL 50% PEG
35 μL Maxima H Minus Reverse Transcriptase (add right
before RT reaction)
Total volume: 280 μL.
2. Add 240 μL RT mix to 60 μL cells in EB.
3. Aliquot 50 μL to 6 PCR wells.
4. Start thawing the oligo plates, while the RT is ongoing.
5. Run the reverse transcription reaction in a thermocycler as
follows:
50 °C for 10 min
3 cycles of:
  8 °C for 12 s
  15 °C for 45 s
  20 °C for 45 s
  30 °C for 30 s
  42 °C for 2 min
  50 °C for 3 min
50 °C for 5 min
6. Pool samples and mix with 500 μL NIB-RI.
7. Spin down at 500 × g.
8. Wash with 1000 μL NIB.
9. Spin down at 500 × g.
10. Resuspend with 1152 μL NIB-RI.
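Since the oligo plates are thawed while the reverse transcription runs (step 4), it can help to know roughly how long the thermocycler program takes. The Python sketch below sums the hold times of the program in step 5; it ignores ramping between temperatures, so the actual run time on a given instrument will be longer. The data structure and names are hypothetical.

```python
# Each block is (number of cycles, [(temperature in C, hold time in seconds), ...]).
RT_PROGRAM = [
    (1, [(50, 10 * 60)]),
    (3, [(8, 12), (15, 45), (20, 45), (30, 30), (42, 2 * 60), (50, 3 * 60)]),
    (1, [(50, 5 * 60)]),
]


def total_hold_seconds(program) -> int:
    """Sum hold times over all blocks, multiplying each block by its cycle count."""
    return sum(
        cycles * sum(seconds for _temp, seconds in steps)
        for cycles, steps in program
    )


hold_seconds = total_hold_seconds(RT_PROGRAM)
print(f"Total hold time: {hold_seconds} s ({hold_seconds / 60:.1f} min)")
```

The hold times add up to 2196 s, or about 37 min, which gives a lower bound on the time available for thawing the plates.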
3.9 Hybridization–Ligation and Pool–Split
In this step, cells/nuclei are iteratively split into individual wells to
dynamically create a combinatorial index statistically unique to each
cell. All handling is performed at room temperature, so make
absolutely sure that oligo plates have been fully thawed before
proceeding.
If different samples are multiplexed in a single run, they can be
individually identified based on the first-round barcodes. If such a
strategy is deployed, each sample needs to be processed through
transposition and reverse transcription separately and then loaded
into specified positions in the first-round plate(s).
1. Prepare 3456 μL hybridization buffer as follows:
2761.9 μL H2O
576 μL 10× T4 ligase buffer
14.4 μL SUPERase RI 20 U/μL
46.08 μL Enzymatics RI 40 U/μL
57.60 μL 10% NP40
2. Mix 1152 μL of sample with 3456 μL hybridization buffer.
Keep the sample at RT.
3. Aliquot 40 μL of mixture to a Round 1 plate.
4. Mix and shake at 300 rpm for 30 min at RT.
5. Prepare 1152 μL Blocking Oligo 1 mix as follows:
253.4 μL 100 μM Round 1 blocking oligo
211.2 μL 10× T4 DNA Ligase buffer
687.4 μL H2O
6. Add 10 μL Blocking Oligo 1 mix to each well.
7. Mix and shake at 300 rpm for 30 min at RT.
8. Pool samples from all wells.
9. Aliquot 50 μL of mixture to a Round 2 plate.
10. Mix and shake at 300 rpm for 30 min at RT.
3.10 Reverse Crosslinking
In this step, cells are reverse crosslinked to release DNA from the
bound proteins so that the ATAC libraries can be amplified. As the
crosslinking is relatively gentle (at 0.1 or 0.2%), a milder reverse
crosslinking condition of 1 h incubation at 55 °C is generally
sufficient.
Further reverse crosslinking optimization might be needed if
the crosslinking protocol has been modified:
1. For each of the N 50-μL sub-libraries, add the following:
50 μL 2× RCB
2 μL Proteinase K
1 μL SUPERase RI
2. Incubate at 55 °C for 1 h.
3. Add 5 μL 100 mM PMSF/IPA.
4. Incubate at room temperature for 10 min.
Note: this is an optional stopping point. The reverse crosslinked
product can be stored at -80 °C for a few days.
3.11 Pulldown
In this step, the cDNA is separated from the transposition products
by pulling down on the biotin that is part of the reverse transcrip-
tion primer. The supernatant constitutes the transposition products
and is processed separately from the cDNA:
1. Prepare 1× B & W-T/RI buffer by mixing the following:
400 × (N + 1) μL 1× B & W-T buffer
4 × (N + 1) μL SUPERase RI
2. Prepare 1× B & W/RI buffer by mixing the following:
100 × (N + 1) μL 1× BW buffer
2 × (N + 1) μL SUPERase RI
3. Prepare 1× STE/RI buffer by mixing the following:
200 × (N + 1) μL 1× STE buffer
(N + 1) μL SUPERase RI
4. In a fresh tube, mix 10 × N μL MyOne C1 Dynabeads with
100 × N μL 1× B & W-T.
5. Separate on a magnetic rack and remove supernatant.
6. Wash twice with 100 × N μL B & W-T without RI.
7. Wash once with 100 × N μL B & W-T/RI.
8. Resuspend beads in 100 × N μL 2× B & W/RI.
3.12 ATAC Library Preparation
In this step, ATAC fragments are purified and amplified into a final
library ready for sequencing:
1. Clean up the ATAC part of the sample using Zymo DNA Clean
and Concentrate. Elute in 11 μL EB buffer, and then elute
again with additional 11 μL EB buffer (a total of 22 μL EB
buffer).
2. Prepare ATAC PCR Master Mix by mixing the following:
225 μL 2× NEBnext Master Mix
9 μL P7 primer 25 μM
27 μL H2O
3. Mix the following:
20 μL sample
29 μL ATAC PCR Master Mix
1 μL of 25 μM Adapter 1 Primer (from the PCR Library
indexing primers plate)
4. Run PCR for 5 cycles as follows:
72 °C for 5 min
98 °C for 30 s
5 cycles of:
  98 °C for 10 s
  65 °C for 30 s
  72 °C for 30 s
5. Determine additional cycles using qPCR. Add 5 μL of the
pre-amplified reaction to 10 μL qPCR Master Mix for a total
qPCR reaction of 15 μL as follows:
5 μL NEBnext Master Mix
0.2 μL 25 μM Adapter 1.1
0.2 μL 25 μM P7
3.13 RNA Library Preparation Step 1. Template Switching
In this step, RNA library generation is initiated by carrying out
template switching on the pulled-down cDNA:
1. Prepare the Template switch mix by mixing the following:
11.25 μL H2O
125 μL 50% PEG 6000
90 μL 5× Maxima RT buffer
90 μL Ficoll PM-400 (20%)
45 μL 10 mM dNTPs
45 μL RNase inhibitor (Lucigen)
11.25 μL 100 μM TSO oligo
22.5 μL Maxima RT RNase H Minus (add last, right before the
reaction)
2. Remove all supernatant. Be careful to avoid drying the beads.
3. Resuspend beads in 50 μL Template switch mix.
4. Incubate samples for 30 min at room temperature with
rotation.
5. Incubate samples for 90 min at 42 °C at 300 rpm. Resuspend
every 30 min by pipetting up and down.
3.14 RNA Library Preparation Step 2. Amplification of cDNA
The next step is to amplify the individual cDNA molecules.
1. Prepare cDNA PCR Mix by mixing the following:
247.5 μL KAPA HiFi 2× mix
7.92 μL 25 μM RNA PCR primer
7.92 μL 25 μM P7 primer
231.7 μL H2O
3.15 RNA Library Preparation Step 3. Tagmentation
The next step is to tagment the amplified cDNA, which will prepare
it for the final library amplification step:
1. Quantify cDNA concentration using Qubit.
2. Dilute cDNA to a concentration of 5 ng/μL for tagmentation.
Note: Expect more than 50 ng cDNA. If the cDNA amount is
low, tagmenting as little as 20 ng cDNA can still work; in this
case, adjust the volumes of H2O and cDNA accordingly.
3.17 Library Quantification and Evaluation of Library Quality
Before libraries can be sequenced, they need to be properly quantified
and subjected to quality evaluation. This is done by, first,
evaluation of the insert distribution and, second, quantification:
1. Examination of library size distribution. This step can be carried
out using several different instruments, such as a TapeStation
or a BioAnalyzer. We prefer to use a TapeStation (with the
D1000 or HS D1000 kits) due to flexibility, ease of use, and
rapid turnaround time.
2. Quantification of library concentration. For most high-
throughput sequencing applications, this step is standardly
carried out using a Qubit fluorometer. While this works well
for libraries with a unimodal fragment-length distribution,
ATAC libraries typically exhibit a multimodal fragment
4 Computational Processing
[Figure 4: flowchart; recoverable labels: barcode assignment, joint scATAC/RNA analyses]
Fig. 4 Outline of the SHARE-seq computational processing procedures. As a first step, cell barcodes are
annotated for all reads in both ATAC and RNA FASTQ files. Subsequently, UMIs are consolidated and assigned
to reads in the RNA set. RNA reads are then aligned against the genome, and gene expression is quantified in
single cells, resulting in a final data matrix that can be analyzed in Seurat (or other scRNA-seq) tools. ATAC
reads are aligned against the genome, filtered (removing mitochondria-mapping reads), and deduplicated
within each barcode. Alignments are then annotated with their cell barcodes and can be used as input for
further analysis in ArchR. Further joint analysis of the ATAC and RNA can be carried out downstream
seq reads and to produce objects that can be used for further analysis
with established tools for scRNA-seq/scATAC-seq processing such
as Seurat and ArchR (e.g., sparse matrices and BAM files). The
outline of the processing is shown in Fig. 4. For both ATAC and
RNA, reads are first assigned their cellular barcodes. RNA reads are
additionally annotated with the sequenced UMIs. RNA reads are
aligned against the genome, a quantification is carried out for each
gene in each cell, and a final sparse matrix is created. For ATAC,
reads are mapped against the genome, then filtered, and dedupli-
cated within each cell, and a final BAM file with cellular barcodes
appended to each alignment is created.
4.1 RNA
1. As a first step in the RNA processing, annotate barcodes for each read pair using the SHARE-seq-barcode-annotate.py script.
python SHARE-seq-barcode-annotate.py
BC1file fieldID pos1 lenBC1 BC2file
fieldID2 pos2 lenBC2 BC3file fieldID3
pos3 lenBC3 [-BCedit N] [-revcompBC]
@[readID]:::[GTTAGCCT+TAGTCTTG+TACCGAGC] 1:N:0:
TGGGGNCACAGAGCCAAACCATATCAGCTG
+
AAAAA#EEEEEAEEEEEEEEEEEEEEEEEE
@[readID]:::[GACGGATT+GATAGAGG+nan] 1:N:0:
ACCAANCTGTGCACAAGCGTGAATCAACCT
+
6AAAA#E/EEEEEEEEEAEEEEEEEEEEEE
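The annotated read IDs shown above can be parsed downstream with a short helper. This is a sketch under stated assumptions: it takes the header layout literally as printed (read ID, `:::`, then a bracketed list of barcodes joined by `+`) and treats `nan` as a barcode round that could not be assigned, which is how the second example read appears to be marked. The function names are hypothetical.

```python
import re

# Read ID, then ":::", then a bracketed "+"-separated barcode list.
HEADER_RE = re.compile(r"^@(?P<read_id>.+?):::\[(?P<barcodes>[^\]]*)\]")


def parse_annotated_header(header):
    """Return (read_id, [barcode, ...]) from an annotated FASTQ header line."""
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"unrecognized header: {header!r}")
    return match.group("read_id"), match.group("barcodes").split("+")


def has_complete_barcode(barcodes):
    """True if no barcode round is marked 'nan' (assumed to mean unassigned)."""
    return all(barcode != "nan" for barcode in barcodes)


if __name__ == "__main__":
    for line in (
        "@[readID]:::[GTTAGCCT+TAGTCTTG+TACCGAGC] 1:N:0:",
        "@[readID]:::[GACGGATT+GATAGAGG+nan] 1:N:0:",
    ):
        read_id, barcodes = parse_annotated_header(line)
        print(read_id, barcodes, has_complete_barcode(barcodes))
```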
gzip RNA.barcodes_annotated.end1.fastq
gzip RNA.barcodes_annotated.end2.fastq
As follows:
python PEFastqToTabDelimited.py
RNA.barcodes_annotated.end1.fastq.gz
RNA.barcodes_annotated.end2.fastq.gz |
python SHARE-seq-RNA-UMI-Add.py 10 read2 |
python PEFastqToTabDelimited-reverse.py -
RNA.barcodes_annotated.RNA_UMI
This step will append the UMI sequence to the cell bar-
codes in the read ID:
@[readID]:::[TGACCACT+GGTCGTGT+TGCTGATA+TTTATGATAG]
CCTCTNGCTCAGCCTATATACCGCCATCTTCAGCAAACCCTGATGAAGGC
+
AAAAA#EEEEEEEEEEEEEEEEEEEEEEEEEEEEEE/EEEEEEEEEEEEE
gzip RNA.barcodes_annotated.RNA_UMI.end1.fastq
gzip RNA.barcodes_annotated.RNA_UMI.end2.fastq
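The effect of the UMI annotation step on a read ID can be illustrated as follows. This is only a sketch of the idea, not the code of SHARE-seq-RNA-UMI-Add.py: it assumes that the arguments `10 read2` mean "take the first 10 bases of Read 2 as the UMI", and the function name `append_umi` is hypothetical.

```python
def append_umi(header, read2_seq, umi_len=10):
    """Append the UMI (assumed: first umi_len bases of Read 2) to the barcode list."""
    umi = read2_seq[:umi_len]
    if len(umi) < umi_len:
        raise ValueError("Read 2 is shorter than the UMI length")
    open_idx = header.index(":::[")
    close_idx = header.index("]", open_idx)
    # Insert "+UMI" just before the bracket that closes the barcode list.
    return header[:close_idx] + "+" + umi + header[close_idx:]


if __name__ == "__main__":
    header = "@[readID]:::[TGACCACT+GGTCGTGT+TGCTGATA]"
    print(append_umi(header, "TTTATGATAGACGTACGTACGT"))
```

Applied to the example above, this yields the same read ID layout as shown in the annotated FASTQ record, with the UMI as the fourth field.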
6. Align the Read 1 FASTQ file against the genome using STAR
as follows (the commands given here use the standard
ENCODE Project Consortium [69] STAR settings):
samtools index
RNA.end1.STAR/Aligned.sortedByCoord.out.bam
python SAMstats.py
RNA.end1.STAR/Aligned.sortedByCoord.out.bam
SAMstats-RNA.end1.STAR.hg38
-bam genome.chrom.sizes samtools
This script, run with these settings, will output the average
read profile over all genes that have only a single annotated
transcript (in order to avoid confounding by the presence of
multiple isoforms) and are ≥1000 bp in length. Use a simple
annotation with few isoforms, such as RefSeq, to get as many
genes meeting these requirements as possible.
12. Calculate UMI counts for each gene and cell barcode combination
using the SHARE-seq_RNA_counts.py script. For faster processing,
run this on each chromosome in parallel, as follows
(shown is chr1):
python SHARE-seq_RNA_counts.py
RNA.end1.STAR/Aligned.sortedByCoord.out.bam
annotation.gtf.chr1 genome.chrom.sizes
RNA.SHARE-seq_RNA_counts.chr1 -UMIedit 1
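The `-UMIedit 1` option suggests that UMIs differing by at most one edit are treated as the same molecule. The internals of SHARE-seq_RNA_counts.py are not shown here, so the following is only a sketch of one common way to do this: a greedy collapse by Hamming distance within a single gene and cell barcode. The function names are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatches between two equal-length strings."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umis, max_dist=1):
    """Greedily collapse UMIs observed for one gene in one cell.

    UMIs are visited from most to least abundant; a UMI within max_dist
    mismatches of an already accepted UMI is merged into it.
    """
    counts = Counter(umis)
    representatives = []
    for umi, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if not any(hamming(umi, rep) <= max_dist for rep in representatives):
            representatives.append(umi)
    return representatives


if __name__ == "__main__":
    observed = ["TTTATGATAG", "TTTATGATAG", "TTTATGATAC", "GGCATCCATA"]
    print(len(collapse_umis(observed)), "distinct molecules")
```

The UMI count for a gene in a cell is then the number of representatives returned.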
python SHARE-seq-RNA-BC-sum-across-files.py
list_of_per_chromosome_outputs
RNA.SHARE-seq_RNA_counts.UMIs_per_cell
15. Create final sparse matrix format files that can be used as input
to Seurat for further analysis with the SHARE-seq-RNA-
UMIs-sum-across-files.py script:
python SHARE-seq-RNA-UMIs-sum-across-files.py
list_of_per_chromosome_outputs
RNA.SHARE-seq_RNA_counts.UMIs_per_cell.min500 0
RNA.SHARE-seq_RNA_counts.UMIs_per_cell.min500.sparse
-sparse
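The logic of filtering barcodes by total UMI count and emitting a sparse matrix can be sketched as below. This is an illustration only: the threshold of 500 is inferred from the `min500` file names, the triplet layout (1-based gene index, 1-based cell index, count) follows the usual Matrix Market convention, and the function name is hypothetical rather than part of the SHARE-seq scripts.

```python
def filter_and_sparsify(counts, min_umis=500):
    """Turn {(gene, barcode): umi_count} into sparse triplets.

    Only barcodes with at least min_umis total UMIs are kept.
    Returns (genes, barcodes, triplets) with 1-based indices in the triplets.
    """
    per_cell = {}
    for (_, barcode), n in counts.items():
        per_cell[barcode] = per_cell.get(barcode, 0) + n
    barcodes = sorted(bc for bc, n in per_cell.items() if n >= min_umis)
    cell_index = {bc: i + 1 for i, bc in enumerate(barcodes)}
    genes = sorted({gene for (gene, bc) in counts if bc in cell_index})
    gene_index = {gene: i + 1 for i, gene in enumerate(genes)}
    triplets = sorted(
        (gene_index[gene], cell_index[bc], n)
        for (gene, bc), n in counts.items()
        if bc in cell_index and n > 0
    )
    return genes, barcodes, triplets


if __name__ == "__main__":
    toy = {("G1", "A"): 400, ("G2", "A"): 200, ("G1", "B"): 10}
    print(filter_and_sparsify(toy))
```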
4.2 ATAC
The first steps of the ATAC processing are analogous to those of the RNA pipeline:
1. First, annotate cellular barcodes:
python PEFastqToTabDelimited.py
ATAC.end1.fastq.gz ATAC.end2.fastq.gz |
python SHARE-seq-barcode-annotate.py
Plate_R1.tsv 2 15 8 Plate_R2.tsv 2 53 8 Plate_R3.tsv 2
91 8 -revcompBC |
python PEFastqToTabDelimited-reverse.py -
ATAC.barcodes_annotated
gzip ATAC.barcodes_annotated.end1.fastq
gzip ATAC.barcodes_annotated.end2.fastq
python PEFastqToTabDelimited.py
ATAC.barcodes_annotated.end1.fastq.gz
ATAC.barcodes_annotated.end2.fastq.gz -trim 30 30 |
bowtie bowtie-indexes/chrM -p 20 -v 2 -a -t --best
--strata -q -X 1000 --sam --12 - |
samtools view -F4 -bT genome.fa - |
samtools sort - ATAC.2x30mers.chrM
python PEFastqToTabDelimited.py
ATAC.barcodes_annotated.end1.fastq.gz
ATAC.barcodes_annotated.end2.fastq.gz
-trim 30 30 | bowtie bowtie-indexes/genome
-p 20 -v 2 -k 2 -m 1 -t --best --strata -q
-X 1000 --sam --12 - | egrep -v chrM |
samtools view -F4 -bT genome.fa - | samtools sort -
ATAC.2x30mers.unique.nochrM
python PEInsertDistFromBAM.py
ATAC.2x30mers.unique.nochrM.bam
genome.chrom.sizes
ATAC.2x30mers.unique.nochrM.InsLen
-uniqueBAM -normalize
wigToBigWig ATAC.2x30mers.unique.nochrM.wig
genome.chrom.sizes
ATAC.2x30mers.unique.nochrM.bigWig
12. Calculate the global TSS enrichment. The TSS enrichment (TSSE)
is the most informative ATAC-seq quality metric. It is based on generating an
average read distribution profile around annotated transcription
start sites for protein-coding genes and then calculating the ratio
between the number of reads in the immediate neighborhood of
the TSS and the number of reads falling in the regions on the
flanks of the TSS peak. The advantage of the TSSE metric is that it
is a measure internal to the dataset and independent of peak calling.
We use a TSS window of ±100 bp and a TSS flank distance of
2000 bp, i.e., TSSE is calculated as follows:
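$$\mathrm{TSSE} = \frac{\left|R \in [\mathrm{TSS} \pm 100]\right|}{\left|R \in [\mathrm{TSS} - 2050,\ \mathrm{TSS} - 1950]\right| + \left|R \in [\mathrm{TSS} + 1950,\ \mathrm{TSS} + 2050]\right|} \tag{2}$$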
First, generate the TSS metaprofile:
python signalAroundCoordinate-BW.py
annotation.TSS-0bp.bed 0 1 3 4000
ATAC.2x30mers.unique.nochrM.bigWig
ATAC.2x30mers.unique.nochrM.TSS_profile -normalize
Note that you need a BED file containing the start positions
and the strands of annotated TSSs in the genome, e.g.,
python ATACTSSscore.py
ATAC.2x30mers.unique.nochrM.TSS_profile
100 2000 >> ATACTSSscore.txt
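Equation (2) can be written out directly on a TSS metaprofile. The sketch below is not the code of ATACTSSscore.py; it assumes a profile given as a mapping from position relative to the TSS (in bp) to signal, and the ±50 bp half-width of the flank windows is read off the intervals in Eq. (2). The function name `tsse` is hypothetical.

```python
def tsse(profile, window=100, flank=2000, flank_half_width=50):
    """Compute TSS enrichment following Eq. (2).

    profile maps position relative to the TSS (bp) to signal at that position.
    """
    center = sum(profile.get(p, 0.0) for p in range(-window, window + 1))
    left = sum(
        profile.get(p, 0.0)
        for p in range(-flank - flank_half_width, -flank + flank_half_width + 1)
    )
    right = sum(
        profile.get(p, 0.0)
        for p in range(flank - flank_half_width, flank + flank_half_width + 1)
    )
    background = left + right
    if background == 0:
        raise ValueError("no signal in the flanking windows")
    return center / background


if __name__ == "__main__":
    # Toy profile: flat background of 1 with a 10-fold peak at TSS +/- 100 bp.
    toy = {p: (10.0 if abs(p) <= 100 else 1.0) for p in range(-2050, 2051)}
    print(f"TSSE = {tsse(toy):.2f}")
```

Note that, as written in Eq. (2), the central window spans 201 positions and the two flanks together span 202, so a completely flat profile gives a TSSE just below 1.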
13. Deduplicate the BAM file. Note that this step is different from
the typical deduplication carried out in most high-throughput
sequencing pipelines, based on tools such as MarkDuplicates in
Picard. Here, we perform deduplication of fragments only
within the same cell barcode, i.e., for two fragments to be
collapsed, they need to have the same coordinates, orientation,
and cell barcode.
python SHARE-seq_ATAC_dedup.py
ATAC.2x30mers.unique.nochrM.bam
genome.chrom.sizes
ATAC.2x30mers.unique.nochrM.BC_dedup.bam
-addBC
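The per-barcode deduplication rule described in step 13 amounts to using the cell barcode as part of the duplicate key. A minimal sketch follows; it operates on plain tuples rather than on a BAM file, and the function name is hypothetical.

```python
def dedup_within_barcode(fragments):
    """Keep the first fragment for each (chrom, start, end, strand, barcode).

    fragments is an iterable of (chrom, start, end, strand, barcode) tuples.
    Fragments with identical coordinates but different barcodes are all kept.
    """
    seen = set()
    kept = []
    for fragment in fragments:
        if fragment not in seen:
            seen.add(fragment)
            kept.append(fragment)
    return kept


if __name__ == "__main__":
    frags = [
        ("chr1", 100, 250, "+", "BC_A"),
        ("chr1", 100, 250, "+", "BC_A"),  # duplicate within the same cell
        ("chr1", 100, 250, "+", "BC_B"),  # same coordinates, different cell
    ]
    print(dedup_within_barcode(frags))
```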
python SHARE-seq_ATAC_stats_per_cell.py
ATAC.2x30mers.unique.nochrM.BC_dedup.bam
genome.chrom.sizes annotation.TSS-0bp.bed 0 1 2000 200
ATAC.2x30mers.unique.nochrM.BC_dedup.per_cell_stats
5 Expected Results
5.1 Sequencing Libraries
Figure 5 shows the typical fragment profiles for ATAC and RNA SHARE-seq libraries. ATAC libraries are expected to show a nucleosomal signature, with prominent subnucleosomal and mononucleosomal peaks, and perhaps a dinucleosomal peak, shifted to the right by the length of the adapters and barcodes added to the original fragments. In contrast, RNA libraries are primarily unimodal in length.
5.2 Species Mixing Experiments
A customary experiment to be carried out when testing, adopting, or developing any new single-cell protocol is the species mixing experiment, in which cells from two different species, usually mouse and human, are mixed together, and the extent of crosstalk/contamination of individual barcodes or of doublet formation (in which two cells are processed together with the same barcode)
Fig. 5 Typical fragment-length profiles of SHARE-seq libraries. (a) BioAnalyzer profile of a SHARE-seq ATAC
library. (b) BioAnalyzer profile of a SHARE-seq RNA library
[Figure 6: panels A (ATAC) and B (RNA); recoverable axis label: mm10 fragments per barcode]
Fig. 6 Typical results from a species mixing SHARE-seq experiment. Human HEK293 and mouse embryonic
fibroblast (MEF) cells were mixed in equal proportions and carried through the SHARE-seq workflow. (a) ATAC
fragments per cell. (b) RNA UMIs per cell
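One simple way to summarize a species mixing experiment is to classify each barcode by the fraction of its fragments (or UMIs) that map to each genome. The sketch below is an illustration only; the 90% purity cutoff and the function name are assumptions, not values taken from the protocol.

```python
def classify_barcode(human_count, mouse_count, purity=0.9, min_total=1):
    """Label a barcode as 'human', 'mouse', 'mixed' or 'empty'.

    purity is a hypothetical cutoff on the fraction of counts from one species.
    """
    total = human_count + mouse_count
    if total < min_total:
        return "empty"
    human_fraction = human_count / total
    if human_fraction >= purity:
        return "human"
    if human_fraction <= 1.0 - purity:
        return "mouse"
    return "mixed"


if __name__ == "__main__":
    for human, mouse in [(950, 50), (50, 950), (500, 500), (0, 0)]:
        print(human, mouse, classify_barcode(human, mouse))
```

Barcodes labeled "mixed" are candidate doublets or heavily contaminated barcodes.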
5.3 ATAC Post-sequencing Quality Evaluation
Figure 7 shows the key ATAC-seq bulk-level metrics. The fragment-length distribution (Fig. 7a) usually shows strong subnucleosomal and nucleosomal peaks as well as a weaker dinucleosomal one. High TSS enrichment is desirable; in this case (Fig. 7b), it is very high (TSSE ≥25). See Note 8 for more details. Figure 7c shows the fraction of mitochondrial reads in the human and mouse cells in the species mixing experiment. Note that the fraction can vary greatly depending on the properties of the cell type (cancer cell lines and highly metabolically active cells tend to have more mitochondria [70]) and not just on the experimental variation (which in this case is completely minimized as the cells were processed together).
[Figure 7: panels A, B, and C; recoverable labels: Fragment length, Number fragments, Position relative to TSS, Average RPM, TSSE = 27.51, Fraction of reads (nuclear, chrM), HEK293 (hg38), MEF (mm10)]
Fig. 7 Basic evaluation of bulk-level ATAC quality and enrichment. (a) Fragment-length distribution. (b) TSS
enrichment. Shown are the same experiments as those featured in Fig. 6. (c) Mitochondrial read fraction for
each species in this experiment
[Figure 8: panels A and B; recoverable axis label: TSS ratio]
Fig. 8 Basic evaluation of scATAC-seq-level quality and enrichment. (a) Number of fragments per cell
barcode vs. TSS enrichment. (b) Cell barcode rank (by fragment counts) vs. fragment counts per cell barcode
Figure 8 shows the key scATAC metrics. One such metric is the
relationship between the number of fragments per cell barcode and
the TSS enrichment within each cell barcode (Fig. 8a). Another is
the curve of the number of fragments per cell barcode plotted
against the rank (by the number of fragments per cell barcode) of
the cell barcodes (Fig. 8b). Ideally, there should be a clear inflection
point between the cell barcodes with high fragment counts and the
cell barcodes with low fragment counts, indicating that a set of
high-quality cells have been captured and preserved intact through
the full pool–split procedure. A flatter, diagonal-like shape of that
curve can be indicative of loss of cell integrity during handling and
is potentially concerning regarding the biological interpretability of
the experiment if the lack of inflection is too extreme.
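The inflection point in the rank plot can be located programmatically. The sketch below uses one simple heuristic, chosen here for illustration and not taken from the SHARE-seq scripts: in log10-log10 space, it picks the rank whose point lies farthest above the straight line joining the first and last points of the curve. The function name is hypothetical.

```python
import math


def knee_rank(counts):
    """Return the 1-based rank at the knee of the barcode rank curve."""
    ordered = sorted(counts, reverse=True)
    n = len(ordered)
    if n < 3:
        return n
    xs = [math.log10(rank) for rank in range(1, n + 1)]
    ys = [math.log10(max(count, 1)) for count in ordered]
    dx = xs[-1] - xs[0]
    dy = ys[-1] - ys[0]
    best_rank, best_score = 1, float("-inf")
    for rank, (x, y) in enumerate(zip(xs, ys), start=1):
        # Proportional to the signed distance above the first-to-last chord.
        score = dx * (y - ys[0]) - dy * (x - xs[0])
        if score > best_score:
            best_rank, best_score = rank, score
    return best_rank


if __name__ == "__main__":
    # Toy data: 100 real cells with 10,000 fragments, 1000 background barcodes.
    toy = [10000] * 100 + [10] * 1000
    print("Estimated number of cells:", knee_rank(toy))
```

On real data the curve is smoother than in this toy example, so the estimate should be checked against the plot rather than used blindly.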
[Figure 9: panels A and B; recoverable labels: Coverage (arbitrary units), Fraction of reads (exonic, intronic, intergenic)]
Fig. 9 Basic evaluation of the bulk-level RNA-seq properties. (a) Read distribution along transcript lengths. (b)
Read distribution relative to the exonic, intronic, and intergenic genomic spaces
5.4 RNA Post-sequencing Quality Evaluation
Figure 9 shows the typical parameters to be evaluated for a bulk-level RNA-seq dataset. One is the distribution of reads along transcripts (Fig. 9a). SHARE-seq is not a strict 3′-tagging assay in the way some scRNA-seq approaches are: although it attaches UMIs to the 3′ end of transcripts, cDNAs are tagmented at random after cDNA amplification; thus the first reads of the RNA part of a SHARE-seq dataset can be some distance away from the 3′ end. Another is the distribution of reads relative to the annotation (Fig. 9b). As is often observed in scRNA-seq datasets, SHARE-seq RNA libraries contain a significant portion of reads originating from introns, presumably from unspliced transcripts present in the nucleus. This is likely due to the fact that the ATAC reaction has to happen first in the workflow, and thus a substantial portion of the cytoplasm is lost and the final libraries are enriched for nuclear material.
Figure 10 shows the key metric for evaluating the success of the
RNA portion of a SHARE-seq experiment. As with ATAC above,
the curve of the number of UMIs per cell barcode plotted against
the rank (by the number of UMIs per cell barcode) of the cell
barcodes should ideally feature a clear inflection point between
the cell barcodes with high UMI counts and the cell barcodes
with low UMI counts (Fig. 10a). There should also be a concor-
dance between the cell barcodes with high ATAC fragment counts
and those with high UMI counts, i.e., the same cells are of high
quality in both modalities, and are thus usable for joint analysis
(Fig. 10b).
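The concordance check between modalities can be reduced to a set intersection of the barcodes that pass a threshold in each. This is a minimal sketch; the thresholds in the example are hypothetical placeholders and the function name is not part of the SHARE-seq scripts.

```python
def cells_passing_both(atac_fragments, rna_umis, min_fragments, min_umis):
    """Return barcodes passing the count threshold in both modalities.

    atac_fragments and rna_umis map cell barcode to fragment/UMI counts.
    """
    atac_ok = {bc for bc, n in atac_fragments.items() if n >= min_fragments}
    rna_ok = {bc for bc, n in rna_umis.items() if n >= min_umis}
    return atac_ok & rna_ok


if __name__ == "__main__":
    atac = {"BC_A": 5000, "BC_B": 40, "BC_C": 3000}
    rna = {"BC_A": 1200, "BC_B": 900, "BC_D": 2000}
    # Hypothetical thresholds; choose them from the rank plots of each modality.
    print(sorted(cells_passing_both(atac, rna, min_fragments=1000, min_umis=500)))
```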
[Figure 10: panels A and B; recoverable labels: Number UMIs, Number_fragments +1, count (color scale)]
Fig. 10 Basic evaluation of SHARE-seq RNA single-cell-level quality and enrichment. (a) Cell barcode rank
(by UMI counts) vs. UMI counts per cell barcode. (b) UMI counts per barcode vs. ATAC fragment counts per
barcode.
[Figure 11: panels A (ATAC) and B (RNA); axes: UMAP_1, UMAP_2]
Fig. 11 Example SHARE-seq output on human embryonic lung samples. (a) ArchR iterative LSI UMAP on the
ATAC-seq dataset. (b) Seurat UMAP on the RNA dataset. Individual ArchR- and Seurat-defined clusters are
colored separately
5.5 Dimensionality Reduction and Cell Type/Cluster Identification
Following initial data processing, clusters and cell types can be identified using standard tools for that purpose such as Seurat [68] and/or ArchR [67]. Figure 11 shows typical output of this kind in UMAP space for both the ATAC and RNA sides of a SHARE-seq experiment from a human embryonic lung tissue sample.
6 Notes
Acknowledgements
The authors thank Sai Ma and Jason Buenrostro for helpful discus-
sion regarding the SHARE-seq protocol. This work was supported
by NIH grants (P50HG007735, RO1 HG008140, U19AI057266
and UM1HG009442 to W.J.G., 1UM1HG009436 to W.J.G. and
A.K., 1DP2OD022870-01 and 1U01HG009431 to A.K., and
HG006827 to C.H.), the Rita Allen Foundation (to W.J.G.), the
Baxter Foundation Faculty Scholar Grant, and the Human Frontiers
Science Program grant RGY006S (to W.J.G.). W.J.G. is a Chan
Zuckerberg Biohub investigator and acknowledges grants 2017-174468
and 2018-182817 from the Chan Zuckerberg Initiative.
S.K. is supported by MSTP training grant T32GM007365 and the
Paul and Daisy Soros Fellowship. Fellowship support was also provided
by the Stanford School of Medicine Dean’s Fellowship (G.K.M.),
by the EMBO Long-Term Fellowship EMBO ALTF 1119-2016,
and by the Human Frontier Science Program Long-Term Fellowship
HFSP LT 000835/2017-L (Z.S.).
References
1. Mortazavi A, Williams BA, McCue K et al. (2008) Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5(7):621–628
2. Nagalakshmi U, Wang Z, Waern K et al. (2008) The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320(5881):1344–1349
3. Sultan M, Schulz MH, Richard H et al. (2008) A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321(5891):956–960
4. Wilhelm BT, Marguerat S, Watt S et al. (2008) Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453(7199):1239–1243
5. Tang F, Barbacioru C, Wang Y et al. (2009) mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6(5):377–382
6. Islam S, Kjällquist U, Moliner A et al. (2011) Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Res 21(7):1160–1167
7. Ramsköld D, Luo S, Wang YC et al. (2012) Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat Biotechnol 30(8):777–782
8. Hashimshony T, Wagner F, Sher N, Yanai I (2012) CEL-seq: single-cell RNA-Seq by multiplexed linear amplification. Cell Rep 2(3):666–673
9. Shalek AK, Satija R, Adiconis X, Gertner RS, Gaublomme JT, Raychowdhury R, Schwartz S, Yosef N, Malboeuf C, Lu D, Trombetta JJ, Gennert D, Gnirke A, Goren A, Hacohen N, Levin JZ, Park H, Regev A (2013) Single-cell transcriptomics reveals bimodality in expression and splicing in immune cells. Nature 498(7453):236–240
10. Jaitin DA, Kenigsberg E, Keren-Shaul H, Elefant N, Paul F, Zaretsky I, Mildner A, Cohen N, Jung S, Tanay A, Amit I (2014) Massively parallel single-cell RNA-seq for marker-free decomposition of tissues into cell types. Science 343(6172):776–779
11. Klein AM, Mazutis L, Akartuna I, Tallapragada N, Veres A, Li V, Peshkin L, Weitz DA, Kirschner MW (2015) Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161(5):1187–1201
12. Macosko EZ, Basu A, Satija R, Nemesh J, Shekhar K, Goldman M, Tirosh I, Bialas AR, Kamitaki N, Martersteck EM, Trombetta JJ, Weitz DA, Sanes JR, Shalek AK, Regev A, McCarroll SA (2015) Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161(5):1202–1214
13. Zheng GX, Terry JM, Belgrader P et al. (2017) Massively parallel digital transcriptional profiling of single cells. Nat Commun 8:14049
14. Han X, Wang R, Zhou Y et al. (2018) Mapping the Mouse Cell Atlas by Microwell-Seq. Cell 172(5):1091–1107.e17
15. Cao J, Packer JS, Ramani V et al. (2017) Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357:661–667
16. Rosenberg AB, Roco CM, Muscat RA et al. (2018) Single-cell profiling of the developing mouse brain and spinal cord with split-pool barcoding. Science 360:176–182
17. McGhee JD, Wood WI, Dolan M et al. (1981) A 200 base pair region at the 5′ end of the chicken adult β-globin gene is accessible to nuclease digestion. Cell 27:45–55
18. Keene MA, Corces V, Lowenhaupt K et al. (1981) DNase I hypersensitive sites in Drosophila chromatin occur at the 5′ ends of regions of transcription. Proc Natl Acad Sci U S A 78:143–146
19. Wu C (1980) The 5′ ends of Drosophila heat shock genes in chromatin are hypersensitive to DNase I. Nature 286(5776):854–860
20. Minnoye L, Marinov GK, Krausgruber T et al. (2021) Chromatin accessibility profiling methods. Nat Rev Meth Primers 1:10
21. Buenrostro JD, Giresi PG, Zaba LC et al. (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–1218
22. Corces MR, Trevino AE, Hamilton EG et al. (2017) An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods 14:959–962
23. Reznikoff WS (2008) Transposon Tn5. Annu Rev Genet 42:269–286
24. Adey A, Morrison HG, Asan et al. (2010) Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol 11(12):R119
25. Buenrostro JD, Wu B, Litzenburger UM et al. (2015) Single-cell chromatin accessibility reveals principles of regulatory variation. Nature 523:486–490
26. Cusanovich DA, Daza R, Adey A et al. (2015) Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348:910–914
27. Cusanovich DA, Reddington JP, Garfield DA et al. (2018) The cis-regulatory dynamics of embryonic development at single-cell resolution. Nature 555:538–542
28. Preissl S, Fang R, Huang H et al. (2018) Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation. Nat Neurosci 21(3):432–439
29. Mezger A, Klemm S, Mann I et al. (2018) High-throughput chromatin accessibility profiling at single-cell resolution. Nat Commun 9(1):3647
30. Satpathy AT, Granja JM, Yost KE et al. (2019) Massively parallel single-cell chromatin landscapes of human immune cell development and intratumoral T cell exhaustion. Nat Biotechnol 37:925–936
31. Lareau CA, Duarte FM, Chew JG et al. (2019) Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nat Biotechnol 37:916–924
32. Macaulay IC, Haerty W, Kumar P et al. (2015) G&T-seq: parallel sequencing of single-cell genomes and transcriptomes. Nat Methods 12(6):519–522
33. Huang AY, Li P, Rodin RE et al. (2020) Parallel RNA and DNA analysis after deep sequencing (PRDD-seq) reveals cell type-specific lineage patterns in human brain. Proc Natl Acad Sci U S A 117(25):13886–13895
34. Zachariadis V, Cheng H, Andrews N, Enge M (2020) A highly scalable method for joint whole-genome sequencing and gene-expression profiling of single cells. Mol Cell 80(3):541–553.e5
35. Yin Y, Jiang Y, Lam KG et al. (2019) High-throughput single-cell sequencing with linear amplification. Mol Cell 76(4):676–690.e10
36. Rodriguez-Meira A, Buck G, Clark SA et al. (2019) Unravelling intratumoral heterogeneity through high-sensitivity single-cell mutational analysis and parallel RNA sequencing. Mol Cell 73(6):1292–1305.e8
37. Hou Y, Guo H, Cao C et al. (2016) Single-cell triple omics sequencing reveals genetic, epigenetic, and transcriptomic heterogeneity in hepatocellular carcinomas. Cell Res 26(3):304–319
38. Hu Y, Huang K, An Q, Du G, Hu G, Xue J, Zhu X, Wang CY, Xue Z, Fan G (2016) Simultaneous profiling of transcriptome and DNA methylome from a single cell. Genome Biol 17:88
39. Angermueller C, Clark SJ, Lee HJ et al. (2016) Parallel single-cell sequencing links transcriptional and epigenetic heterogeneity. Nat Methods 13(3):229–232
40. Pott S (2017) Simultaneous measurement of chromatin accessibility, DNA methylation, and nucleosome phasing in single cells. Elife 6:e23203
41. Peterson VM, Zhang KX, Kumar N et al. (2017) Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol 35(10):936–939
42. Stoeckius M, Hafemeister C, Stephenson W et al. (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14(9):865–868
43. O’Huallachain M, Bava FA et al. (2020) Ultra-high throughput single-cell analysis of proteins and RNAs by split-pool synthesis. Commun Biol 3(1):213
44. Chung H, Parkhurst CN, Magee EM et al. (2021) Simultaneous single cell measurements of intranuclear proteins and gene expression. https://doi.org/10.1101/2021.01.18.427139
45. Katzenelenbogen Y, Sheban F, Yalin A et al. (2020) Coupled scRNA-seq and intracellular protein activity reveal an immunosuppressive role of TREM2 in cancer. Cell 182(4):872–885.e19
46. Guo F, Li L, Li J et al. (2017) Single-cell multi-omics sequencing of mouse early embryos and embryonic stem cells. Cell Res 27(8):967–988
47. Clark SJ, Argelaguet R, Kapourani CA et al. (2018) scNMT-seq enables joint profiling of chromatin accessibility DNA methylation and transcription in single cells. Nat Commun 9(1):781
48. Wang Y, Yuan P, Yan Z et al. (2021) Single-cell multiomics sequencing reveals the functional regulatory landscape of early embryos. Nat Commun 12(1):1247
49. Luo C, Liu H, Xie F et al. (2019) Single nucleus multi-omics links human cortical cell regulatory genome diversity to disease risk variants. bioRxiv 2019.12.11.873398
50. Xiong H, Luo Y, Wang Q et al. (2021) Single-cell joint detection of chromatin occupancy and transcriptome enables higher-dimensional epigenomic reconstructions. Nat Methods 18(6):652–660
51. Zhu C, Zhang Y, Li YE et al. (2021) Joint profiling of histone modifications and transcriptome in single cells from mouse brain. Nat Methods 18(3):283–292
52. Markodimitraki CM, Rang FJ, Rooijers K et al. (2020) Simultaneous quantification of protein-DNA interactions and transcriptomes in single cells with scDam&T-seq. Nat Protoc 15(6):1922–1953
53. Fiskin E, Lareau CA, Eraslan G et al. (2020) Single-cell multimodal profiling of proteins and chromatin accessibility using PHAGE-ATAC. bioRxiv 2020.10.01.322420
54. Mimitou EP, Lareau CA, Chen KY et al. (2021) Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat Biotechnol. https://doi.org/10.1038/s41587-021-00927-2
55. Swanson E, Lord C, Reading J et al. (2021) Simultaneous trimodal single-cell measurement of transcripts, epitopes, and chromatin accessibility using TEA-seq. eLife 10:e63632
56. Kearney CJ, Vervoort SJ, Ramsbottom KM et al. (2021) SUGAR-seq enables simultaneous detection of glycans, epitopes, and the transcriptome in single cells. Sci Adv 7(8):eabe3610
57. Cao J, Cusanovich DA, Ramani V et al. (2018) Joint profiling of chromatin accessibility and gene expression in thousands of single cells. Science 361:1380–1385
58. Zhu C, Yu M, Huang H et al. (2019) An ultra high-throughput method for single-cell joint analysis of open chromatin and transcriptome. Nat Struct Mol Biol 26:1063–1070
59. Xing QR, Farran CAE, Zeng YY et al. (2020) Parallel bimodal single-cell sequencing of transcriptome and chromatin accessibility. Genome Res 30(7):1027–1039
60. Chen S, Lake BB, Zhang K (2019) High-throughput sequencing of the transcriptome and chromatin accessibility in the same cell. Nat Biotechnol 37(12):1452–1457
61. Ma S, Zhang B, LaFave LM et al. (2020) Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183:1103–1116.e20
62. Langmead B, Trapnell C, Pop M et al. (2009) Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25
63. Li H, Handsaker B, Wysoker A et al. (2009) The sequence alignment/map format and SAMtools. Bioinformatics 25:2078–2079
64. Kuhn RM, Haussler D, Kent WJ (2013) The UCSC Genome Browser and associated tools. Brief Bioinform 14:144–161
65. Kent WJ, Zweig AS, Barber G et al. (2010) BigWig and BigBed: enabling browsing of large distributed datasets. Bioinformatics 26:2204–2207
66. Dobin A, Davis CA, Schlesinger F et al. (2013) STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29(1):15–21
67. Granja JM, Corces MR, Pierce SE et al. (2021) ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat Genet 53(3):403–411
68. Hao Y, Hao S, Andersen-Nissen E et al. (2021) Integrated analysis of multimodal single-cell data. Cell 184(13):3573–3587.e29
69. ENCODE Project Consortium (2012) An integrated encyclopedia of DNA elements in the human genome. Nature 489:57–74
70. Marinov GK, Wang YE, Chan DC, Wold BJ (2014) Evidence for site-specific occupancy of the mitochondrial genome by nuclear transcription factors. PLoS ONE 9(1):e84713
71. Picelli S, Björklund AK, Reinius B et al. (2014) Tn5 transposase and tagmentation procedures for massively scaled sequencing projects. Genome Res 24:2033–2040
72. Domcke S, Hill AJ, Daza RM et al. (2020) A human cell atlas of fetal chromatin accessibility. Science 370(6518):eaba7612
73. Corces MR, Granja JM, Shams S et al. (2018) The chromatin accessibility landscape of primary human cancers. Science 362(6413):eaav1898
Chapter 12
Abstract
Single-cell Nucleosome Occupancy and Methylome sequencing (scNOMe-seq) is a multimodal assay that
simultaneously measures endogenous DNA methylation and nucleosome occupancy (i.e., chromatin
accessibility) in single cells. scNOMe-seq combines the activity of a GpC Methyltransferase, an enzyme
which methylates cytosines in GpC dinucleotides, with bisulfite conversion, whereby unmethylated cyto-
sines are converted into thymines. Because GpC Methyltransferase acts only on cytosines present in
non-nucleosomal regions of the genome, the subsequent bisulfite conversion step not only detects the
endogenous DNA methylation, but also reveals the genome-wide pattern of chromatin accessibility.
Implementing this technology at the single-cell level helps to capture how the dynamics governing methylation
and accessibility vary across individual cells and cell types. Here, we provide a scalable plate-based protocol
for preparing scNOMe-seq libraries from single nucleus suspensions.
Key words scNOMe-seq, Single cell, DNA methylation, Nucleosome occupancy, Chromatin accessi-
bility, GpC Methyltransferase, Bisulfite sequencing, Epigenetic modification, Fluorescence-activated
cell sorting
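Because scNOMe-seq reads out two signals from cytosines in different dinucleotide contexts, it can help to see how cytosines are assigned to a context. The sketch below is an illustration, not part of the protocol: it scans one strand only, the function name is hypothetical, and cytosines in a GCG trinucleotide are labeled "ambiguous" since they sit in both a GpC and a CpG context.

```python
def cytosine_contexts(seq):
    """Label every cytosine on the given strand by dinucleotide context.

    Returns a list of (0-based position, label), where label is one of
    'GpC' (accessibility readout), 'CpG' (endogenous methylation readout),
    'ambiguous' (GCG, both contexts) or 'other'.
    """
    seq = seq.upper()
    labels = []
    for i, base in enumerate(seq):
        if base != "C":
            continue
        in_gpc = i > 0 and seq[i - 1] == "G"
        in_cpg = i + 1 < len(seq) and seq[i + 1] == "G"
        if in_gpc and in_cpg:
            label = "ambiguous"
        elif in_gpc:
            label = "GpC"
        elif in_cpg:
            label = "CpG"
        else:
            label = "other"
        labels.append((i, label))
    return labels


if __name__ == "__main__":
    print(cytosine_contexts("AGCATCGTGCG"))
```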
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_12,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
232 Michael Wasney and Sebastian Pott
[Figure 2: panels A and B; recoverable labels: TSNE 1, TSNE 2; clusters: Cardiomyocytes (CM), Hemat., Endothelial cells, Fibroblasts, Pericytes/Sm. mus.; tracks: GmC and mCG aggregate (CpG, GpC); 100 kb scale bar; MYH7]
Fig. 2 Multimodal profiling of the heart captures cell-type-specific epigenetic configurations in the MYH7
locus. scNOMe-seq data from an adult human heart sample comprising 1229 cells. (a) TSNE plot with clusters
corresponding to major cell types (left). (b) Pseudobulk data tracks for the corresponding clusters for both data
modalities capturing chromatin accessibility (GmC, green) and DNA methylation (mCG, blue), respectively
2 Materials
2.1.4 Random-Primed DNA Synthesis
1. Random Priming Master Mix: Prior to denaturing the samples as part of the random-primed DNA synthesis step, mix 922 μL 10X Blue Buffer, 231 μL of 50 U/μL Klenow fragment (Qiagen Beverly P7010HCL), 461 μL of dNTP solution with each nucleotide at a concentration of 10 mM (Thermo Fisher R0191), and 2995 μL of water. Keep on ice.
scNOMe-Seq 235
2.1.5 Inactivation of Free Primers and dNTPs
1. Exo/rSAP Master Mix: Prior to beginning the inactivation step, mix 922 μL of 20 U/μL Exonuclease I and 461 μL of 1 U/μL rSAP (Qiagen Beverly X8010L). Keep on ice.
2.1.6 Sample Clean-Up
1. SPRI Beads: Apportion 280 μL of Sera-Mag SpeedBeads into an Eppendorf tube and place on a magnetic stand. Allow the solution to clear of beads before carefully removing the supernatant. Wash the beads twice with 1 mL TE. Between washes, remove the tube from the magnet and mix by inversion before replacing the tube on the stand and allowing the beads to clear. After the second wash, resuspend beads in 280 μL of TE. Meanwhile, transfer 2.52 g PEG 8000 to a 50 mL conical tube. Add
2.8 mL of 5 M NaCl, 140 μL 1 M Tris–HCl pH 8, and 28 μL of
0.5 M EDTA pH 8. Add 7 to 8 mL of water and vortex the
solution until the PEG 8000 has dissolved. Add the washed
Sera-Mag SpeedBeads and bring the solution up to 14 mL with
water. Store at 4 °C (see Notes 3 and 12).
2. 80% Ethanol: To make 50 mL of 80% ethanol, mix 40 mL
200 proof ethanol and 10 mL water. Vortex before use.
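The final composition of the SPRI bead buffer follows from the amounts above and the 14 mL final volume. The short calculation below is a sketch for checking the recipe; the function names are hypothetical, and the small contribution of the TE carried over with the beads is ignored.

```python
FINAL_VOLUME_ML = 14.0  # final volume of the bead solution, from the recipe


def percent_wv(mass_g, volume_ml):
    """Weight/volume percentage: grams per 100 mL."""
    return mass_g / volume_ml * 100.0


def diluted_molarity(stock_molar, stock_volume_ml, final_volume_ml):
    """Final molarity after diluting a stock into the final volume."""
    return stock_molar * stock_volume_ml / final_volume_ml


peg_percent = percent_wv(2.52, FINAL_VOLUME_ML)                 # PEG 8000
nacl_molar = diluted_molarity(5.0, 2.8, FINAL_VOLUME_ML)        # NaCl
tris_molar = diluted_molarity(1.0, 0.140, FINAL_VOLUME_ML)      # Tris-HCl pH 8
edta_molar = diluted_molarity(0.5, 0.028, FINAL_VOLUME_ML)      # EDTA pH 8

if __name__ == "__main__":
    print(f"PEG 8000: {peg_percent:.1f}% (w/v)")
    print(f"NaCl: {nacl_molar:.2f} M")
    print(f"Tris-HCl: {tris_molar * 1000:.1f} mM")
    print(f"EDTA: {edta_molar * 1000:.1f} mM")
```

The recipe therefore corresponds to approximately 18% (w/v) PEG 8000, 1 M NaCl, 10 mM Tris-HCl, and 1 mM EDTA.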
2.1.7 Adaptase Reaction
1. Adaptase Master Mix: Mix 450.5 μL of Elution Buffer (Qiagen 19086), 212 μL Buffer G1, 212 μL Reagent G2, 132 μL Reagent G3, 53 μL Enzyme G4, and 53 μL Enzyme G5. Pipette to mix and keep on ice.
2.1.8 Library 1. P5L PCR Primer Mix: 1.2 μM P5L primer (working concen-
Amplification tration of 600 nM when combined with P7L primer). Mix
1.2 μL of 100 μM P5L stock with 98.8 μL water. Keep on ice
before use.
2. P7L PCR Primer Mix: 2 μM P7L primer (working concentra-
tion of 1 μM after being combined with P5L primer). Mix 2 μL
of 100 μM P7L primer with 98 μL water. Keep on ice
before use.
3. 2X Kapa Hifi Mix (Roche 07958935001).
2.1.10 Primers and Barcodes
1. HPLC purified random primers (added after bisulfite conversion): H: A, C, or T.
Barcode 1: /5SpC3/ TTCCCTACACGACGCTCTTCCGATCTATCACG (H1:33340033)(H1)(H1)(H1)(H1)(H1)(H1)(H1)(H1).
236 Michael Wasney and Sebastian Pott
P511: AATGATACGGCGACCACCGAGATCTACACGATCTTGCACACTCTTTCCCTACACGACGCTCT.
P512: AATGATACGGCGACCACCGAGATCTACACAGGATAGCACACTCTTTCCCTACACGACGCTCT.
3. P7 primers (added during library amplification).
P701: CAAGCAGAAGACGGCATACGAGATAGGCAATGGTGACTGGAGTTCAGACGTGTGCTCTT.
P702: CAAGCAGAAGACGGCATACGAGATTCACCTAGGTGACTGGAGTTCAGACGTGTGCTCTT.
P703: CAAGCAGAAGACGGCATACGAGATCATACGGAGTGACTGGAGTTCAGACGTGTGCTCTT.
P704: CAAGCAGAAGACGGCATACGAGATGTCATCGTGTGACTGGAGTTCAGACGTGTGCTCTT.
P705: CAAGCAGAAGACGGCATACGAGATTTACCGACGTGACTGGAGTTCAGACGTGTGCTCTT.
P706: CAAGCAGAAGACGGCATACGAGATACCTTCGAGTGACTGGAGTTCAGACGTGTGCTCTT.
P707: CAAGCAGAAGACGGCATACGAGATACGCTTCTGTGACTGGAGTTCAGACGTGTGCTCTT.
P708: CAAGCAGAAGACGGCATACGAGATGAGTAGAGGTGACTGGAGTTCAGACGTGTGCTCTT.
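The P7 primers listed above share constant flanks and differ only in an 8-nt segment between them. The sketch below pulls out that segment for each primer, which can help when filling in a sample sheet. The flank constants are inferred from the listed sequences, and whether a given demultiplexing tool expects the index as written in the primer or as its reverse complement is an assumption to verify against the sequencer and software documentation.

```python
P7_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"     # constant 5' flank (inferred)
P7_SUFFIX = "GTGACTGGAGTTCAGACGTGTGCTCTT"  # constant 3' flank (inferred)

P7_PRIMERS = {
    "P701": "CAAGCAGAAGACGGCATACGAGATAGGCAATGGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P702": "CAAGCAGAAGACGGCATACGAGATTCACCTAGGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P703": "CAAGCAGAAGACGGCATACGAGATCATACGGAGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P704": "CAAGCAGAAGACGGCATACGAGATGTCATCGTGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P705": "CAAGCAGAAGACGGCATACGAGATTTACCGACGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P706": "CAAGCAGAAGACGGCATACGAGATACCTTCGAGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P707": "CAAGCAGAAGACGGCATACGAGATACGCTTCTGTGACTGGAGTTCAGACGTGTGCTCTT",
    "P708": "CAAGCAGAAGACGGCATACGAGATGAGTAGAGGTGACTGGAGTTCAGACGTGTGCTCTT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def extract_index(primer):
    """Return the variable segment between the constant P7 flanks."""
    if not (primer.startswith(P7_PREFIX) and primer.endswith(P7_SUFFIX)):
        raise ValueError("primer does not match the expected P7 layout")
    return primer[len(P7_PREFIX):len(primer) - len(P7_SUFFIX)]

P7_INDICES = {name: extract_index(seq) for name, seq in P7_PRIMERS.items()}

for name, index in P7_INDICES.items():
    print(name, index, reverse_complement(index))
```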
3 Methods
3.1 Assay
3.1.1 Nuclei Isolation and GpC Methyltransferase Treatment
1. Prepare digestion mix on ice. Deliver 2 μL of mix to every well
of two 384-well plates. Plates with digestion mix can be
prepared the day before the experiment and stored at 4 °C
(see Notes 1 and 2).
2. Obtain a suspension of single cells. This protocol was opti-
mized for use with a total of 5–10 million cells. Centrifuge
single cell suspension at 500 × g for 5 min at 4 °C, remove the
supernatant, suspend in 1 mL ice-cold PBS, and centrifuge the
sample again at the same settings. Discard the supernatant and
suspend in 1 mL 1X RSB buffer. Incubate for 10 min at room
temperature.
3. Add 15 μL of 1% NP-40 to the cell suspension (NP-40 con-
centration may need to be adjusted depending on the cell
type). Transfer the cell suspension to a 2 mL dounce tissue
grinder and add 1 mL of 1X RSB. Homogenize cell suspension
using 15 strokes of both pestle A and B (number of strokes may
be adjusted to accommodate the particular cell-/tissue-type
being handled). Transfer lysed cells to a new 1.5 mL Eppendorf
tube and centrifuge at 800 × g for 5 min at 4 °C. Discard the
supernatant and wash with 1 mL 1X RSB. Incubate for 30 s to
1 min at room temperature. Centrifuge at 800 × g for 5 min at 4 °C.
4. Resuspend the nuclei in 1X GpC Methyltransferase buffer such
that there are one million nuclei per 75 μL of buffer. If there are
fewer than one million nuclei, suspend in 75 μL of buffer. Meanwhile,
prepare GpC Methylase Reaction Mix. Add the 75 μL of
nuclei to the reaction mix and incubate at 37 °C for 7.5 min.
Add a boost of 25 μL GpC Methyltransferase and 0.75 μL
32 mM SAM and incubate for another 7.5 min at 37 °C (see
Note 4).
5. Quench the reaction by adding 500 μL 1X PBS and spin at
800 × g for 5 min at 4 °C. Resuspend in 0.5–1 mL of 1X PBS
and add 2 drops of Hoechst per mL of sample (1 drop for
0.5 mL, two drops for 1 mL). Keep the sample on ice for
15 min before commencing with fluorescence-activated cell
sorting (FACS).
[Fig. 3 panels: three scatter plots with axes FSC-A, FSC-H, SSC-A, and BV421-A. Gate 1: 44.2%; Gate 2: 84% (37.1%); Gate 3: 18.7% (6.9%), single nuclei]
Fig. 3 Example of a gating strategy during FACS sorting. Individual nuclei were selected based on size and
DNA content. Percentages provide proportion of events within a particular gate for each scatter plot; the
proportion of total events is indicated in parenthesis
[Fig. 4 panels: Plate 1 and Plate 2, rows A–D and columns 1–4 shown. Legend: Barcodes 1–4 (Plate 1) and Barcodes 5–8 (Plate 2)]
Fig. 4 Loading schema for primers in the two 384-well plates used for random
priming step. Pattern shown for Wells A 1–2 and B 1–2 in plates 1 and
2, respectively, is repeated across the entire plate
3.1.5 Inactivation of Free 1. Prepare Exo/rSAP Master Mix and keep on ice. Add 1.5 μL to
Primers and dNTPs each well of the 384-well reaction plates. Vortex to mix and
quick spin at 2000 × g for 10 s at room temperature (see Notes
1, 8, and 10).
2. Place the plates into a thermocycler and run the following
program:
(a) 37 °C for 30 min
(b) Hold at 4 °C (see Note 11).
3.1.7 Adaptase Reaction 1. Prepare Adaptase Master Mix prior to denaturing the sample
and keep mix on ice.
2. Denature the samples by placing the 96-well plate in a thermo-
cycler and run the following program:
(a) 95 °C for 3 min.
3. Place the plate on ice for 2 min.
4. Add 10.5 μL Adaptase Master Mix to each well of the 96-well
plate. Vortex to mix and quick spin at 2000 × g for 10 s at room
temperature (see Notes 1 and 10).
5. Place the plate in the thermocycler and run the following
program:
(a) 37 °C for 30 min
(b) 95 °C for 2 min
(c) Hold at 4 °C (see Note 11).
3.1.8 Library Amplification
1. Prepare P5L and P7L primer mixes. Add 5 μL of the appropriate
primers to each well of a clean 96-well plate such that
each well has a unique P5L–P7L combination (keep note of
each combination’s location in the plate). Transfer 5 μL of each
P5L–P7L combination to the corresponding well in the
96-well plate containing the pooled samples (see Notes 1 and
10).
2. Add 25 μL 2X Kapa Hifi Mix to each well of the 96-well plate
containing the samples. Vortex to mix and quick spin at
2000 × g for 10 s at room temperature (see Notes 1 and 10).
3. Place the plate in a thermocycler and run the following
program:
(a) 95 °C for 2 min
(b) 98 °C for 30 s
(c) 98 °C for 15 s
(d) 64 °C for 30 s
(e) 72 °C for 2 min
Return to step c 14 times for a total of 15 cycles.
(f) 72 °C for 5 min
(g) Hold at 4 °C (see Notes 11 and 13).
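To plan bench time, the amplification program above can be written down as data and summed. This is a minimal sketch using the temperatures and durations from the program; the structure and names are illustrative, and ramp times and the final 4 °C hold are deliberately ignored.

```python
# Each step is (temperature in degrees C, duration in seconds).
INITIAL_STEPS = [(95, 120), (98, 30)]           # steps a and b
CYCLE_STEPS = [(98, 15), (64, 30), (72, 120)]   # steps c to e
N_CYCLES = 15                                   # step c run 15 times in total
FINAL_STEPS = [(72, 300)]                       # step f

def total_seconds(initial, cycle, n_cycles, final):
    """Sum step durations, repeating the cycled block n_cycles times."""
    return (
        sum(duration for _, duration in initial)
        + n_cycles * sum(duration for _, duration in cycle)
        + sum(duration for _, duration in final)
    )

runtime_s = total_seconds(INITIAL_STEPS, CYCLE_STEPS, N_CYCLES, FINAL_STEPS)
print(f"Programmed time: {runtime_s} s ({runtime_s / 60:.2f} min), "
      "excluding ramping and the 4 degrees C hold")
```

With 15 cycles the programmed time comes to 2925 s, just under 49 min before ramping is added.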
3.1.9 Library Clean-Up 1. Add 40 μL SPRI beads to each well of the 96-well plate. Vortex
the plate briefly and incubate for 5 min at room temperature
and then quick spin at 2000 × g for 10 s at room temperature.
Place the plate on a DynaMag™-96 Side Skirted Magnet and
allow the solution to clear of beads (see Notes 1, 10, and 12).
2. Remove the supernatant and wash beads twice with 150 μL of
freshly made 80% ethanol. After the final wash, remove the
plate from the magnet and allow beads to dry at room temper-
ature. Take care to not overdry beads (see Note 1).
3. Add 25 μL Elution Buffer (Qiagen) and suspend beads by
pipette. Place back on the DynaMag™-96 Side Skirted Magnet
and allow the solution to clear of beads. Combine the superna-
tant from each column into 12 Eppendorf tubes such that there
is one Eppendorf tube per 96-well plate column. Add 160 μL
(0.8×) SPRI beads to each of the 12 Eppendorf tubes. Pipette
to mix and incubate for 5 min at room temperature (see Notes
1 and 10).
4. Place the Eppendorf tubes on a DynaMag™-2 Magnet and
allow the solution to clear of beads. Discard the supernatant
and wash the beads 2 times with 500 μL of 80% ethanol. After
the second wash, remove all ethanol and allow the beads to dry
at room temperature. Take care to not overdry beads (see Note
10).
5. Add 40 μL Elution Buffer (Qiagen) and suspend beads by
pipette. Incubate for 5 min at room temperature. After incuba-
tion, transfer 40 μL of the supernatant to 12 new Eppendorf
tubes (see Note 10).
6. Measure concentration of the libraries using a Qubit Fluorom-
eter and assess the fragment size distribution with an Agilent
2100 Bioanalyzer. Fragment sizes should fall between 300 and
1500 bp (Fig. 5). On the fluorometer, libraries with a concen-
tration of 2–15 ng/μL are to be expected. If concentration and
size distributions are as expected, proceed with sequencing of
the libraries.
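The acceptance ranges in the last step can be checked programmatically, and a Qubit reading can be converted to molarity for pooling. The sketch below assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, which is not stated in the protocol; the example values (10 ng/μL, 600 bp) are hypothetical.

```python
AVG_BP_MASS = 660.0  # g/mol per bp of dsDNA (standard approximation, assumed)

def ng_per_ul_to_nM(conc_ng_per_ul, mean_fragment_bp):
    """Convert a mass concentration to nM for a given mean fragment length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * mean_fragment_bp)

def passes_library_qc(conc_ng_per_ul, smallest_bp, largest_bp):
    """Check the ranges quoted in the protocol: 2-15 ng/uL and 300-1500 bp."""
    conc_ok = 2.0 <= conc_ng_per_ul <= 15.0
    size_ok = smallest_bp >= 300 and largest_bp <= 1500
    return conc_ok and size_ok

example_nM = ng_per_ul_to_nM(10.0, 600)  # hypothetical library
print(f"10 ng/uL at 600 bp is about {example_nM:.1f} nM")
print("QC pass:", passes_library_qc(10.0, 350, 1200))
```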
3.3 Analysis
A full description of the analysis is outside the scope of this protocol,
which describes the steps to generate scNOMe-seq libraries for sequencing.
We provide an example of a processing pipeline for raw scNOMe-seq
data at https://github.com/sebpott/scNOMe_smk.
Fig. 5 Expected size distribution of scNOMe-seq library pools. Bioanalyzer profile shows size distribution of a
representative pool of 64 individual scNOMe-seq libraries after final amplification
4 Notes
Chapter 13
Abstract
While methods such as the Assay for Transposase Accessible Chromatin by sequencing (ATAC-seq) enable a
comprehensive characterization of regulatory DNA, additional measurements are required to characterize
the multifaceted nature of eukaryotic cells. Here, we delineate the ATAC with Select Antigen Profiling by
sequencing (ASAP-seq) protocol, a scalable approach to quantifying proteins via oligo-tagged antibodies
alongside accessible DNA in thousands of single cells. Critically, our method uses a custom bridge oligo
that is compatible with a variety of oligo-conjugated antibodies, allowing other commercial products
to be repurposed. The ASAP-seq method can be completed with straightforward
experimental and computational modifications to existing single-cell ATAC-seq workflows but yields distinct
modalities underlying complex cellular states, including estimation of protein abundance on the cell surface
as well as intracellular and intranuclear factors.
Key words Multimodal, Single-cell, Protein, Accessible chromatin, ATAC, Intracellular, Gene
regulation
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_13,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
250 Eleni P. Mimitou et al.
Fig. 1 Schematic of the experimental assay. ASAP-seq allows whole cell input into the scATAC-seq workflow,
maintaining the connection between nuclear content and cell surface marker information. Cells are stained
with oligo-conjugated antibodies followed by fixation, permeabilization, and Tn5 transposition. Bridge oligos
are spiked in the barcoding mix prior to droplet formation to allow simultaneous barcoding of ATAC fragments
and antibody-derived oligos
Fig. 2 Barcoding scheme of the protein tags using the bridge oligo strategy. Bridge oligo A (BOA) and bridge
oligo B (BOB) function as templates to extend the protein-derived oligos in droplets. While TSB tags (right)
contain UMIs, UBIs (N9V) are introduced to TSA tags via the bridge oligo (left) to allow molecule counting
2 Materials
Table 1
Oligo sequences
2.2 ASAP-Seq 1. 10x Genomics Chromium Next GEM Single Cell ATAC
Library Preparation Library & Gel Bead Kit, 16 or 4 rxns.
2. 10x Genomics Chromium Next GEM Chip H Single Cell Kit,
48 or 16 rxns.
3. 10x Genomics Single Index Kit N, Set A, 96 rxns.
4. 2x Kapa Hifi PCR mastermix.
5. SPRI beads (AMPure XP beads or KAPA Pure beads).
6. Custom oligonucleotides for library prep (see Table 1).
2.3 Quality Control 1. Qubit dsDNA HS Assay Kit (Thermo Fisher Q32851 or
and Sequencing Q33230).
2. Agilent Bioanalyzer High Sensitivity DNA Analysis Kit
(or Tapestation or similar).
3. KAPA Library Quantification Kit for Illumina® Platforms
(KAPA biosystems KK4835).
4. Illumina NovaSeq or NextSeq reagent kits.
2.4 Software and References Needed for Computational Analysis
1. Download cellranger-atac and relevant reference files
(https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/what-is-cell-ranger-atac).
The most up-to-date reference files and versions of the software are available
online. This software will be used to demultiplex and process sequencing
libraries from an Illumina sequencing run (see Note 3).
2. Install an up-to-date version of the Python 3 library either for
the system, the user, or through a conda environment (see
Note 4).
3. Download the kite antibody tag preprocessing toolkit. The
most up-to-date version of the software is available online at
https://github.com/pachterlab/kite. This software is used to
build a reference map of the oligonucleotide barcodes to the
respective antibody clones.
4. Download the kallisto and bustools software binaries. Current
versions of these tools are available at https://github.com/pachterlab/kallisto
and https://github.com/BUStools/bustools, respectively. These utilities are used
to efficiently count reads assigned to each antibody barcode for every cell
while correcting for sequencing errors.
5. Download the ASAP to kite script toolkit available online:
https://github.com/caleblareau/asap_to_kite. This code is
required to convert the ASAP-seq sequencing data into a format
that is compatible with the existing kite | kallisto | bustools
workflows (see Note 5).
6. mgatk package and dependencies (https://github.com/caleblareau/mgatk).
7. 10x scATAC barcode whitelist:
$ wget https://teichlab.github.io/scg_lib_structs/data/737K-cratac-v1.txt
This file is available in the distribution of CellRanger-ATAC but is
more accessible from the indicated GitHub link.
Massively Parallel Profiling of Accessible Chromatin and Proteins with ASAP-Seq 255
3 Methods
3.1 Cell Preparation, Fixation, and Permeabilization
This section outlines the steps required to stain the cells with the
conjugated antibodies, followed by fixation and permeabilization.
The fixation steps are based on the mtscATAC-seq workflow (see the
separate chapter describing mtscATAC-seq in this volume).
Permeabilization can be performed using one of two alternative lysis buffers:
LLL (low loss lysis) or OMNI (based on the OMNI-ATAC protocol
[14]), which is the default lysis buffer in the 10x Genomics scATAC
kit. LLL is the lysis buffer described in the mtscATAC-seq protocol; because
its formulation lacks Tween 20, it retains mtDNA fragments in
the ATAC library that can be used for mtDNA variant tracing. In
benchmarking experiments, LLL and OMNI buffers yielded
comparable ATAC and protein data, so they can be used interchangeably
if mtDNA retention is not desired.
3.1.1 Cell Staining 1. Obtain single cell suspensions (filter if needed) and measure
viability and density. If viability is lower than 80%, proceed with
live cell enrichment and/or use best judgement depending on
sample source/importance/cell numbers.
2. Resuspend 1–2 million cells in 100 μL CITE staining buffer.
3. Add 10 μL Fc Blocking reagent.
4. Incubate for 10 min at 4 °C.
5. While cells are incubated in Fc Block, prepare the antibody
pool (panel or titrated amounts).
6. Add antibody-oligo pool to cells.
7. Incubate for 30 min at 4 °C.
8. Wash cells 3 times with 1 mL CITE staining buffer, spin at
300 × g for 5 min at 4 °C for every wash to harvest cells.
9. Resuspend cells in 450 μL room temperature PBS.
3.1.2 Cell Fixation and 1. Use about 0.5–1 million cells in 450 μL PBS for the fixation
Permeabilization reaction.
2. Add 30 μL 16% formaldehyde (1% final concentration), mix by
pipetting, and incubate at room temperature for 10 min with
occasional inversion.
3. Quench by adding glycine to final concentration 0.125 M.
4. Wash with 1× ice-cold PBS by filling up the tube, invert
5 times.
5. Spin at 400 × g for 5 min at 4 °C.
6. Discard supernatant and repeat wash with 1 mL 1×
ice-cold PBS.
7. Spin at 400 × g for 5 min at 4 °C, discard the supernatant.
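The fixation and quench volumes above follow from simple dilution arithmetic. The sketch below confirms that 30 μL of 16% formaldehyde in 450 μL PBS gives 1%, and solves for the glycine volume needed to reach 0.125 M. The 2.5 M glycine stock is an assumption made for illustration; the protocol does not specify the stock concentration.

```python
def final_percent(stock_percent, added_ul, start_ul):
    """Final % after adding `added_ul` of stock to `start_ul` of sample."""
    return stock_percent * added_ul / (start_ul + added_ul)

def quench_volume_ul(target_molar, stock_molar, current_ul):
    """Volume of stock to add so the mixture reaches `target_molar`.

    Solves stock * v / (current + v) = target for v.
    """
    if stock_molar <= target_molar:
        raise ValueError("stock must be more concentrated than the target")
    return target_molar * current_ul / (stock_molar - target_molar)

formaldehyde_percent = final_percent(16.0, 30.0, 450.0)

GLYCINE_STOCK_M = 2.5  # hypothetical stock concentration, not from the protocol
glycine_ul = quench_volume_ul(0.125, GLYCINE_STOCK_M, 450.0 + 30.0)

print(f"Formaldehyde: {formaldehyde_percent:.2f}% final")
print(f"Glycine: add {glycine_ul:.1f} uL of {GLYCINE_STOCK_M} M stock")
```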
Table 2
Permeabilization buffers
3.2 Transposition For this step, proceed according to 10x Genomics Single Cell
and Barcoding ATAC protocol (CG000168 Rev. D for v1 and CG000209 Rev.
D for v1.1; hereafter, ‘10x Protocol’) with the below modifications:
1. During the barcoding reaction (see step 2.1 of the 10x Proto-
col), spike in 0.5 μL of 1 μM bridge oligo. There is no dead
volume in the reaction, so final volume will be 65.5 μL for v1
and 60.5 μL for v1.1.
2. During GEM incubation (see step 2.5 of the 10x Protocol),
add a 5 min incubation at 40 °C at the beginning of the
protocol (see Note 8). Incubation protocol: 40 °C 5 min; 72 °C 5 min;
98 °C 30 s; 12 cycles of 98 °C 10 s, 59 °C 30 s, and 72 °C 1 min;
hold at 15 °C.
3. During silane bead elution (see step 3.1o of the 10x Protocol),
add 43.5 μL of Elution Solution I and subsequently recover
43 μL. Keep 3 μL aside to use as input (see Note 9) in the tag
library PCR, and with the remaining 40 μL, proceed to SPRI
cleanup as per 10x protocol.
4. During SPRI cleanup (see step 3.2d of the 10x Protocol), save
the supernatant. For the bead bound fraction, proceed as per
10x protocol. For the supernatant fraction, add 32 μL SPRI, let
bind for 5 min. Collect beads on magnet, wash twice with 80%
EtOH, remove the remaining ethanol and elute beads in
42 μL EB (see Note 9). This can be combined with the 3 μL
left aside after the silane purification, as input in the TSA/TSB
indexing reaction:
50 μL 2x KAPA mix
2.5 μL primer P5 10 μM
2.5 μL indexing primer 10 μM (RPxx or D7xx, see Table 1)
3–45 μL input fragments (see Note 9)
100 μL total.
Incubation protocol: 95 °C 3 min; 14–16 cycles of 95 °C 20 s,
60 °C 30 s, and 72 °C 20 s; 72 °C 5 min; hold
at 4 °C.
5. Proceed with indexing the ATAC library as described in
Subheading 4.2 of the 10× protocol. Usually 10 cycles provide
sufficient material to perform library QC and sequencing. If
native nuclei are run in parallel, a noticeable reduction in PCR
yield can be observed with the fixed sample compared to native
nuclei (presumably due to fixation).
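Because the tag indexing reaction above accepts anywhere from 3 to 45 μL of input, the balance of the 100 μL reaction changes with the input volume. The sketch below computes that balance, assuming it is made up with water (the protocol does not say so explicitly); the names are illustrative.

```python
REACTION_TOTAL_UL = 100.0
FIXED_COMPONENTS_UL = {
    "2x KAPA mix": 50.0,
    "primer P5 (10 uM)": 2.5,
    "indexing primer (10 uM)": 2.5,
}

def water_volume_ul(input_ul):
    """Volume needed to bring the reaction to 100 uL for a given input."""
    if not 3.0 <= input_ul <= 45.0:
        raise ValueError("input volume should be between 3 and 45 uL")
    return REACTION_TOTAL_UL - sum(FIXED_COMPONENTS_UL.values()) - input_ul

def primer_final_uM(stock_uM=10.0, added_ul=2.5):
    """Final primer concentration in the 100 uL reaction."""
    return stock_uM * added_ul / REACTION_TOTAL_UL

for input_ul in (3.0, 42.0, 45.0):
    print(f"input {input_ul:>4} uL -> water {water_volume_ul(input_ul):>4} uL")
print(f"each primer at {primer_final_uM():.2f} uM final")
```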
Fig. 3 Representative fragment analyzer traces of the sequencing libraries. ATAC (top) and protein tag (bottom)
libraries of fixed human PBMCs permeabilized with OMNI lysis buffer (a) or LLL lysis buffer (b). Note the
increased abundance of the nucleosome-free region (size <300 bp) in the LLL library that corresponds to the
increased capture of mtDNA fragments (arrow)
3.3 Library QC, Pooling and Sequencing
3.3.1 Library QC
We recommend quantifying all libraries in three sequential steps:
1. Qubit: use 1 μL undiluted library for total nucleic acid mass.
2. Fragment analyzer (see examples in Fig. 3): preferably Agilent
BioA (if not available, Tapestation or PerkinElmer LabChip GX
can be used). Run ~1–3 ng of each library based on the Qubit
read. The fragment analyzer will provide the size distribution
of the library and a more accurate quantification of the
fragment population expected for each library.
3. KAPA qPCR: prepare 4 nM dilutions of each library based on
quantification by BioA and record dilution. Follow the KAPA
manual instructions for quantification of the “clusterable” frag-
ments (fragments containing P5/P7 sequences). This will be
the most accurate concentration read for sequencing purposes
(see Note 10).
Sequencing for ASAP-seq with TotalSeq-A protein detection (spiked into ATAC run)
Read     Length   ATAC               Protein tag
Read 1   50       Genomic fragment   1-10 = UBI
i7       8        Sample index       Sample index
i5       16       Cell barcode       Cell barcode
Read 2   50       Genomic fragment   1-15 = antibody tag
Box 1 (continued)
Sequencing for ASAP-seq with TotalSeq-A hashtag detection (spiked into ATAC run)
Read     Length   ATAC               Protein tag
Read 1   50       Genomic fragment   1-10 = UBI
i7       8        Sample index       Sample index
i5       16       Cell barcode       Cell barcode
Read 2   50       Genomic fragment   1-15 = hashtag
Library structure (both strands shown; each sequence wraps across three lines). Read 1 and the i7 index read (sample index, x8) proceed left to right along the top strand:
5’AATGATACGGCGACCACCGAGATCTACACNNNNNNNNNNNNNNNNTCGTCGGCAGCGTCAGATGTGTATAAGAGACAGTTGCTAGGACCGGCCTTA
AAGCNNNNNNNNNxxxxxxxxxxxxxxxNNNNNNNNNNAGATCGGAAGAGCACACGTCTGAACTCCAGTCACxxxxxxxxATCTCGTATGCCGTCTTC
TGCTTG
3’TTACTATGCCGCTGGTGGCTCTAGATGTGNNNNNNNNNNNNNNNNAGCAGCCGTCGCAGTCTACACATATTCTCTGTCAACGATCCTGGCCGGAAT
TTCGNNNNNNNNNxxxxxxxxxxxxxxxNNNNNNNNNNTCTAGCCTTCTCGTGTGCAGACTTGAGGTCAGTGxxxxxxxxTAGAGCATACGGCAGAAG
ACGAAC
The i5 index read covers the cell barcode (16). Read 2 proceeds right to left and covers the UMI / antibody barcode.
Sequencing for ASAP-seq with TotalSeq-B detection (spiked into ATAC run)
Read     Length   ATAC               Protein tag
Read 1   50       Genomic fragment   (discard)
i7       8        Sample index       Sample index
i5       16       Cell barcode       Cell barcode
Read 2   50       Genomic fragment   1-10 = UMI1, 11-25 = antibody tag, 26-34 = UMI2
Sequencing this library alone will cause problems due to lack of sequence diversity in read 1. We highly recommend spiking this
into the ATAC libraries generated together in the same assay. UMI and tag barcode can be recovered from either read 1 or read
2. asap_to_kite uses read 2 by default. Please note the orientation of the i5 index read will be different depending on the Illumina
chemistry used. Refer to the 10x scATAC manual for guidance.
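The read layouts in Box 1 amount to fixed-position slicing of each read. The sketch below illustrates those coordinates only; it is not the asap_to_kite implementation, it ignores strand orientation and base qualities, and the function names and test reads are hypothetical.

```python
def parse_totalseq_a(read1, read2):
    """TotalSeq-A: read 1 bases 1-10 = UBI, read 2 bases 1-15 = tag."""
    if len(read1) < 10 or len(read2) < 15:
        raise ValueError("reads are too short for the TotalSeq-A layout")
    return {"ubi": read1[:10], "tag": read2[:15]}

def parse_totalseq_b(read2):
    """TotalSeq-B: read 2 bases 1-10 = UMI1, 11-25 = tag, 26-34 = UMI2."""
    if len(read2) < 34:
        raise ValueError("read 2 is too short for the TotalSeq-B layout")
    return {"umi1": read2[0:10], "tag": read2[10:25], "umi2": read2[25:34]}

# Synthetic reads built so that each segment is easy to recognize.
tsb_read2 = "A" * 10 + "C" * 15 + "G" * 9 + "T" * 16
tsb_fields = parse_totalseq_b(tsb_read2)
tsa_fields = parse_totalseq_a("G" * 10 + "T" * 40, "C" * 15 + "T" * 35)

print(tsb_fields)
print(tsa_fields)
```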
3.4 Demultiplex Sequencing Data
This section briefly summarizes the steps needed to demultiplex
sequencing data to generate paired-end sequencing data associated
with all libraries on the flow cell. In the ASAP-seq multimodal
workflow, chromatin accessibility (and optionally mtDNA) is captured
in a single library, whereas each protein modality (hashtags,
protein abundance, different oligonucleotide backgrounds) is
present in a distinct sequencing library.
Table 3
Interpretation of quality control metrics from ASAP-seq protein mapping

Processing step: Step 5, Subheading 3.5. Pseudo-alignment (a)
Metric: Antibody barcode pseudo-alignment rate
Target value: 85%
Debugging workflow:
1. Verify indicated reference matches the experimental input.
2. Verify the correct specification of the antibody library (e.g., TSA or TSB).
3. Run FastQC to look for overrepresented sequences that may correspond to known barcodes.
4. If the FastQC quality-control report from the demultiplexed flow cell indicates poor base quality at bases associated with the antibody barcode (see Fig. 4), consider using a two-mismatch barcode dictionary in step 2 of Subheading 3.5.
Calculation: # pseudo-aligned/# reads × 100%
Example: 8,498,079/9,809,110 × 100% = 86.6%

Processing step: Step 6, Subheading 3.5. Correct
Metric: Bead barcode alignment rate
Target value: 90%
Debugging workflow:
1. If very low (<5%), check to see if the R2 was correctly handled for a reverse-complement.
2. Check corresponding scATAC-seq library for barcode alignment rate.
3. Examine top sequences for contamination or other repetitive sequences.
Calculation: (# whitelist + # corrected)/# pre-corrected records × 100%
Example: (8,049,279 + 114,962)/8,498,079 × 100% = 96.1%

Processing step: Step 9, Subheading 3.5. Text
Metric: UMI saturation rate
Target value: 25–50%
Debugging workflow:
1. This metric is not interpretable if the top two values are not of reasonable quality.
2. If >50%, sequencing is saturated and may represent a low-quality library (unless purposefully sequenced to saturation).
3. If <25%, additional sequencing is recommended.
Calculation: [1 - (#Final UMIs/#pre-sorted records)] × 100%
Example: [1 - (6,120,282/8,164,241)] × 100% = 25.0%

(a) By virtue of the kite reference, every k-mer up to 1 mismatch will be accounted for and then collapsed during the
quantification step. In this sense, though kallisto is a “pseudo-alignment” algorithm, the quantifications are absolute and
effectively a fast dictionary-based quantification
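The three quality control metrics in Table 3 can be recomputed directly from the record counts. The sketch below reproduces the worked examples using the counts quoted in the table and compares each value against its target; the variable names are illustrative, and treating the 85% and 90% targets as minimum thresholds is an assumption.

```python
def percent(numerator, denominator):
    """Ratio expressed as a percentage."""
    return 100.0 * numerator / denominator

# Counts taken from the worked examples in Table 3.
total_reads = 9_809_110
pseudo_aligned = 8_498_079
in_whitelist = 8_049_279
corrected = 114_962
final_umis = 6_120_282

pseudo_alignment_rate = percent(pseudo_aligned, total_reads)
bead_barcode_rate = percent(in_whitelist + corrected, pseudo_aligned)
umi_saturation = 100.0 - percent(final_umis, in_whitelist + corrected)

checks = {
    "pseudo-alignment rate": (pseudo_alignment_rate, pseudo_alignment_rate >= 85.0),
    "bead barcode alignment rate": (bead_barcode_rate, bead_barcode_rate >= 90.0),
    "UMI saturation rate": (umi_saturation, 25.0 <= umi_saturation <= 50.0),
}

for metric, (value, ok) in checks.items():
    print(f"{metric}: {value:.1f}% ({'within target' if ok else 'check library'})")
```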
1. Build the sample sheet csv that specifies the indices for both the
ATAC library and the tag library (see Note 11).
2. Demultiplex sequencing data by running cellranger-atac
mkfastq. For example: $ cellranger-atac mkfastq --id=asap_seq_demux
--run=/path/to/flow_cell --csv=sample_sheet.csv
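The sample sheet from step 1 can be generated with a short script. The sketch below writes the simple three-column layout (Lane, Sample, Index) used by the 10x mkfastq pipelines. The sample names, the index set well SI-NA-A1, and the 8-nt tag index are placeholders, and whether a raw index sequence for the tag library can share a sheet with a 10x index set name should be confirmed against the 10x documentation.

```python
import csv
import io

ATAC_INDEX = "SI-NA-A1"  # placeholder well from the Single Index Kit N, Set A
TAG_INDEX = "ATTACTCG"   # placeholder; enter the index of the primer actually used

rows = [
    {"Lane": "*", "Sample": "ASAP_ATAC_Sample_ID", "Index": ATAC_INDEX},
    {"Lane": "*", "Sample": "ASAP_tag_Sample_ID", "Index": TAG_INDEX},
]

def build_sample_sheet(sheet_rows):
    """Return the sample sheet as CSV text with a Lane,Sample,Index header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["Lane", "Sample", "Index"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(sheet_rows)
    return buffer.getvalue()

sheet = build_sample_sheet(rows)
print(sheet)
```

Writing the returned text to sample_sheet.csv gives a file that can be passed to the --csv argument in step 2.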
Fig. 4 Schematic of library structure and computational preprocessing for ASAP-seq tag libraries. Colors
represent specific technical attributes of the read library. Colored arrows represent the data transformations in
the asap_to_kite.py tool. The resulting fastq files mimic scRNA-seq data and can be used in kallisto | bustools
for single-cell protein abundance estimation. For TotalSeq-B, both UMIs can be used for abundance
estimation, but this requires the execution of “kallisto bus” with custom parameters
Table 4
Linking experimental reagents to downstream bioinformatics workflows
3.5 Process Sequencing Data
This section outlines the steps to take raw sequencing data and
generate counts matrices of features per cell. As a reference, we
include Box 2, which contains a summary of values from a real-world
ASAP-seq library that was processed with the outlined workflow. In
Table 3, we provide context for ideal values associated with various
steps in this pipeline, including ideas for debugging executions that
do not meet quality control standards.
1. Build a mismatch-aware antibody barcode map using kite (see
Note 12): $ python kite/featuremap/featuremap.py FeatureBarcodes.csv --header
2. Build a kallisto index from the mismatch-aware .fasta file produced
by kite: $ kallisto index -i FeaturesMismatch.idx -k 15 FeaturesMismatch.fa
3. For convenience in processing, define a bash variable related to
the specific library/sample to run the subsequent steps:
$ sample="ASAP_tag_Sample_ID"
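To make the idea of a mismatch-aware barcode map concrete, the sketch below enumerates every sequence at Hamming distance 1 from each barcode and maps it back to its feature. This illustrates the concept only and is not the kite implementation; the feature names and 15-nt barcodes are hypothetical, and variants shared by two features are dropped as ambiguous.

```python
def hamming1_variants(barcode, alphabet="ACGT"):
    """All sequences that differ from `barcode` at exactly one position."""
    return [
        barcode[:i] + alt + barcode[i + 1:]
        for i, base in enumerate(barcode)
        for alt in alphabet
        if alt != base
    ]

def build_mismatch_map(features):
    """Map exact barcodes and unambiguous one-mismatch variants to features."""
    lookup = {barcode: name for name, barcode in features.items()}
    ambiguous = set()
    for name, barcode in features.items():
        for variant in hamming1_variants(barcode):
            if variant in lookup and lookup[variant] != name:
                ambiguous.add(variant)
            else:
                lookup[variant] = name
    exact = set(features.values())
    for variant in ambiguous - exact:
        del lookup[variant]  # shared by two features, so not assignable
    return lookup

features = {"tag_A": "AAAAAAAAAAAAAAA", "tag_B": "CCCCCCCCCCCCCCC"}  # hypothetical
mismatch_map = build_mismatch_map(features)
print(len(hamming1_variants(features["tag_A"])), "variants per 15-nt barcode")
print(len(mismatch_map), "entries in the mismatch map")
```

Each 15-nt barcode has 45 one-mismatch neighbors, which is why the mismatch-aware reference is much larger than the original barcode list.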
Box 2. (continued)
Number of hamming dist 1 barcodes = 20,309,952.
Processed 8,498,079 bus records.
In whitelist = 8,049,279.
Corrected = 114,962.
Uncorrected = 333,838.
4 Notes
https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/advanced/references.
4. Some dependency packages are also required to run the work-
flow, depending on the exact use case, and are documented
alongside the complementary tools.
5. The current version of the asap_to_kite toolkit contains custom
python scripts for performing this task of reformatting
sequencing data. Depending on the library input (either Total-
SeqA or TotalSeqB, or a mix), these software will have to be
run with custom parameters. See the GitHub repository for
more details.
6. If mtDNA retention is desired, use LLL lysis buffer.
7. So far we have used about 0.5–1 μg of antibodies during the
intracellular staining.
8. This extra step is not essential when using TSA products, but
increases efficiency in TSB capture.
9. You can use either as input in the tag indexing reaction or
combine when working with large antibody panels to increase
input complexity.
10. If KAPA qPCR is not an available option, use the molarity of
the expected fragments as measured by BioA.
11. An online tool to facilitate building the sample sheet is available:
https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/using/bcl2fastq-direct.
We note that the index used for the tag libraries will not be available from
the tool and must be entered manually.
12. By default, the kite tool produces an off-by-one mismatch
k-mer dictionary. When using the kallisto tool for read
mapping, there is no error tolerance or incorporation of
sequence base qualities. Thus, building a mismatch index for
all possible off-by-one changes is essential to optimize data
yield.
13. The execution of this software can be performed modularly
without information from the antibody tag libraries. Other
single-cell ATAC preprocessing workflows can also be utilized
at this point.
Acknowledgments
References
1. Mimitou EP, Cheng A, Montalbano A et al (2019) Multiplexed detection of proteins, transcriptomes, clonotypes and CRISPR perturbations in single cells. Nat Methods 16:409–412
2. Stoeckius M, Hafemeister C, Stephenson W et al (2017) Simultaneous epitope and transcriptome measurement in single cells. Nat Methods 14:865–868
3. Peterson VM, Zhang KX, Kumar N et al (2017) Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol 35:936–939
4. Triana SH, Vonficht D, Jopp-Saile L et al (2021) Single-cell proteo-genomic reference maps of the hematopoietic system enable the purification and massive profiling of precisely defined cell states. bioRxiv
5. Hao Y, Hao S, Andersen-Nissen E et al (2021) Integrated analysis of multimodal single-cell data. Cell 184:3573–3587.e29
6. Satpathy AT, Granja JM, Yost KE et al (2019) Massively parallel single-cell chromatin landscapes of human immune cell development and intratumoral T cell exhaustion. Nat Biotechnol 37:925–936
7. Lareau CA, Duarte FM, Chew JG et al (2019) Droplet-based combinatorial indexing for massive-scale single-cell chromatin accessibility. Nat Biotechnol 37:916–924
8. Cusanovich DA, Daza R, Adey A et al (2015) Multiplex single cell profiling of chromatin accessibility by combinatorial cellular indexing. Science 348:910–914
9. Ma S, Zhang B, LaFave LM et al (2020) Chromatin potential identified by shared single-cell profiling of RNA and chromatin. Cell 183:1103–1116.e20
10. Mimitou EP, Lareau CA, Chen KY et al (2021) Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat Biotechnol. https://doi.org/10.1038/s41587-021-00927-2
11. Lareau CA, Ludwig LS, Muus C et al (2020) Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling. Nat Biotechnol. https://doi.org/10.1038/s41587-020-0645-6
12. Stoeckius M, Zheng S, Houck-Loomis B et al (2018) Cell hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics. Genome Biol 19:224
13. McGinnis CS, Patterson DM, Winkler J et al (2019) MULTI-seq: sample multiplexing for single-cell RNA sequencing using lipid-tagged indices. Nat Methods 16:619–626
14. Corces MR, Trevino AE, Hamilton EG et al (2017) An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods 14:959–962
Chapter 14
Abstract
Mitochondria are unique organelles of eukaryotic cells that carry their own multicopy, circular
genome. In most mammals, including humans and mice, the size of the chromosome is ~16,000 base pairs
and, unlike nuclear DNA, mitochondrial DNA (mtDNA) is not densely compacted. As a result, mtDNA
is highly accessible to enzymes such as the Tn5 transposase, commonly used for accessible chromatin
profiling of nuclear chromatinized DNA. Here, we describe a method for the concomitant sequencing of
mtDNA and accessible chromatin in thousands of individual cells via the mitochondrial single-cell assay for
transposase accessible chromatin by sequencing (mtscATAC-seq). Our approach extends the utility of
existing scATAC-seq products and protocols as we (1) fix cells
using formaldehyde to retain mitochondria and their mtDNA within the originating cell,
(2) modify lysis conditions to permeabilize cells and mitochondria, and
(3) optimize bioinformatic processing protocols to collectively
increase mitochondrial genome coverage for downstream analysis. Here, we discuss the essentials of the
experimental and computational methodologies to generate and analyze thousands of multiomic profiles of
single cells over the course of a few days, enabling the profiling of accessible chromatin and mtDNA
genotypes to reconstruct clonal relationships and to study mitochondrial genetics and disease.
Key words Single cell multiomics, Accessible chromatin profiling, Mitochondrial DNA, Somatic
mutation, Lineage tracing, Pathogenic mutation, Mitochondrial disease
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_14,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
270 Leif S. Ludwig and Caleb A. Lareau
Fig. 1 Schematic of the mtscATAC-seq reaction. Fixed and permeabilized whole cells are used as input into
the Tn5 transposition reaction wherein both mtDNA and accessible chromatin are transposed within cells
before being input into the 10x Chromium controller microfluidic device. These key differences are colored in
the schematic. The remaining steps (grayscale) closely match the standard 10x scATAC-seq workflow
Concomitant Sequencing of Accessible Chromatin and Mitochondrial Genomes. . . 271
2 Materials
2.2 mtscATAC-Seq Library Preparation
1. 10x Genomics Chromium Next GEM Single Cell ATAC Library & Gel Bead Kit, 16 or 4 rxns (see Note 2).
2. 10x Genomics Chromium Next GEM Chip H Single Cell Kit, 48 or 16 rxns.
3. 10x Genomics Single Index Kit N, Set A, 96 rxns.
2.4 Computational Resources
1. 10x Genomics Cell Ranger ATAC package (see Note 4)
(https://support.10xgenomics.com/single-cell-atac/software/pipelines/latest/what-is-cell-ranger-atac).
2. mgatk package and dependencies (https://github.com/caleblareau/mgatk) (see Note 5).
3 Methods
The protocol described here has been optimized for the use of
hematopoietic cell lines and primary human hematopoietic cells,
including peripheral blood or bone marrow-derived mononuclear
cells that have been obtained via the use of standard approaches
such as Ficoll-based gradient centrifugation. Specific populations of
interest may be enriched for example via flow cytometry-based
sorting. High cell viability (>95%) and a low residual granulocyte/neutrophil
content (<3%) are essential to obtain high-quality mtscATAC-seq data. We have
processed primary hematopoietic cells either fresh or after cryopreservation
using standard practices (e.g., in 90% FBS with 10% DMSO), with no significant
loss of data quality after thawing when processing is combined with sorting to
ensure high viability of the input cell population. For the centrifugation of
cells, we recommend the use of
DNA LoBind microcentrifuge tubes and favor swinging-bucket
centrifuges compared to fixed angle rotors. For mtscATAC-seq
library preparation, we follow the Chromium Next GEM Single
Cell ATAC Reagent kits v1.1 user guide from 10x Genomics
(CG000209 Rev F) and only briefly describe modified steps as
outlined in Subheading 3.2.
3.1 Cell Processing, Fixation, and Lysis
1. Transfer 1 × 10⁵ to 1 × 10⁶ live cells to a 1.5 mL microcentrifuge
tube and spin at 400 × g for 5 min at 4 °C. Discard the
supernatant without disrupting the cell pellet, resuspend and
wash cells in 1–1.5 mL FACS buffer, and spin at 400 × g for
5 min at 4 °C.
2. Discard the supernatant without disrupting the cell pellet,
gently flick the tube to loosen the cell pellet, and carefully
and completely resuspend the cells in 450 μL of room
temperature PBS.
3. Fix cells by adding formaldehyde to a final concentration of 1%
(e.g., by adding 30 μL of 16% formaldehyde), followed by
inversion of the tube for complete mixing. Incubate at room
temperature for 10 min and occasionally invert the tube.
4. Quench the fixation reaction by adding a glycine solution to a
final concentration of 0.125 M and invert the tube for com-
plete mixing. Add 950 μL PBS or FACS buffer, invert the tube
2–3 times, and spin at 400 × g for 5 min at 4 °C. Discard the
supernatant without disrupting the cell pellet, gently flick the
tube to loosen the cell pellet, resuspend cells and repeat the
wash with 1–1.5 mL FACS buffer, and spin at 400 × g for 5 min
at 4 °C.
5. Discard the supernatant without disrupting the cell pellet,
gently flick the tube to loosen the cell pellet, and add
200–300 μL ice-cold lysis buffer. Gently pipette up and down
3 times to completely resuspend the cells. Incubate on ice for
3 min, before adding 1 mL of ice-cold wash buffer and spin at
500 × g for 5 min at 4 °C (see Note 6).
6. Discard the supernatant without disrupting the cell pellet,
gently flick the tube to loosen the pellet, and resuspend cells
in freshly prepared 1x nuclei buffer provided by the 10x Geno-
mics scATAC-seq kit. Aim for a concentration of 2000–7500
cells/μL, as validated by counting an aliquot of cells mixed with
trypan blue using a hemocytometer (e.g., Neubauer
Improved) or a ThermoFisher Countess II or III automated
cell counter (see Note 7). If cell clumps are abundant, the cell
suspension may be filtered using, for example, 40 μm Flowmi
cell strainers. Immediately proceed with the next steps of the
protocol.
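The fixation and quenching volumes in steps 3 and 4 follow from a simple mass balance: adding a volume V of stock at concentration Cs to a starting volume V0 gives a final concentration Cf = Cs·V/(V0 + V), so V = Cf·V0/(Cs − Cf). The sketch below only illustrates that arithmetic. The function name is hypothetical, and the 2.5 M glycine stock is an assumption, because the protocol does not state the stock concentration.

```python
def volume_to_add(stock_conc, final_conc, start_volume_ul):
    """Volume of stock (in uL) to add to start_volume_ul to reach final_conc.

    stock_conc and final_conc must share the same units (e.g., % or M).
    """
    if not 0 < final_conc < stock_conc:
        raise ValueError("final concentration must lie between 0 and the stock concentration")
    return final_conc * start_volume_ul / (stock_conc - final_conc)


# Step 3: 16% formaldehyde added to 450 uL of cells in PBS, 1% final.
formaldehyde_ul = volume_to_add(16.0, 1.0, 450.0)

# Step 4: glycine to 0.125 M in the fixed suspension.
# ASSUMPTION: a 2.5 M glycine stock (not specified in the protocol).
glycine_ul = volume_to_add(2.5, 0.125, 450.0 + formaldehyde_ul)

print(f"formaldehyde: {formaldehyde_ul:.1f} uL, glycine: {glycine_ul:.1f} uL")
```

With these inputs the formaldehyde volume reproduces the 30 μL quoted in step 3; the glycine volume depends entirely on the stock actually used.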
3.2 mtscATAC-Seq Library Preparation
1. Adjust the cell concentration as desired with 1x nuclei buffer,
following the recommendations by 10x Genomics. We typically
aim for a concentration of about 2500 cells/μL and note that
only 5 μL of cell suspension may be used for the tagmentation
reaction.
2. Mix the cells with the transposition mix on ice in a
suitable PCR tube and transpose at 37 °C before
proceeding with GEM generation, barcoding by linear
PCR, and post-GEM-incubation cleanup. Please follow the
detailed instructions of the 10x Genomics user guide for
these steps without modifications (see Note 8).
3. For the library construction step involving the index PCR of
the mtscATAC-seq sample, we typically conduct 1–2 additional
cycles of PCR (see Note 9) before cleaning up the libraries as
described.
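Adjusting the counted suspension to the target concentration in step 1 is a C1·V1 = C2·V2 dilution, and the number of cells entering tagmentation is the final concentration times the 5 μL input. A minimal sketch follows; the function names and the example count of 12,000 cells/μL in 20 μL are hypothetical and only serve to illustrate the calculation.

```python
def dilution_plan(counted_cells_per_ul, target_cells_per_ul, suspension_ul):
    """Return (buffer_to_add_ul, final_volume_ul) so that C1*V1 = C2*V2."""
    if target_cells_per_ul > counted_cells_per_ul:
        raise ValueError("suspension is already below the target concentration")
    final_volume_ul = counted_cells_per_ul * suspension_ul / target_cells_per_ul
    return final_volume_ul - suspension_ul, final_volume_ul


def cells_into_tagmentation(cells_per_ul, input_ul=5.0):
    """Cells carried into the transposition reaction for a given input volume."""
    return cells_per_ul * input_ul


# Hypothetical count: 12,000 cells/uL with 20 uL of suspension remaining.
buffer_ul, final_ul = dilution_plan(12000, 2500, 20.0)
loaded_cells = cells_into_tagmentation(2500)

print(f"add {buffer_ul:.0f} uL nuclei buffer (final {final_ul:.0f} uL); "
      f"{loaded_cells:.0f} cells in 5 uL")
```

Because cells are lost during processing, it may be safer to dilute in two smaller steps and recount than to add the full calculated buffer volume at once.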
3.3 Quality Control and Sequencing
1. The yield of the mtscATAC-seq libraries is determined using a
Qubit dsDNA HS Assay kit following the manufacturer’s
recommendations. We typically use 1 μL of the library and
obtain 5–20 ng/μL, depending on cell type and
cell input.
2. The size distribution of the mtscATAC-seq library is assessed
using an Agilent 2100 Bioanalyzer system and a High Sensitiv-
ity DNA Analysis Kit using 1–10 ng of the library. Typical
bioanalyzer traces of libraries prepared with the original 10x
scATAC-seq and the modified mtscATAC-seq protocols are
shown in Fig. 2 (see Note 10). We typically set region gates at
Fig. 2 Representative bioanalyzer traces of sequencing libraries. (a) Original scATAC-seq and (b) mtscATAC-
seq library of human peripheral blood mononuclear cells. We note the increased abundance of the
nucleosome-free (size <300 bp) region in the mtscATAC-seq library relative to the scATAC-seq library that
corresponds to the significant increase of captured mtDNA fragments
Fig. 3 Insert size distribution of mtscATAC-seq libraries. (a) Representative size distribution of accessible
fragments mapping to a nuclear chromosome and (b) the mitochondrial DNA chromosome. The distinctive
mono- and di-nucleosome peaks from ATAC-seq data appear only in the nuclear genome as mtDNA is not
compacted into nucleosomes. The median fragment length of each distribution is indicated
Fig. 4 Mitochondrial genome coverage. (a) Circular and (b) linear representations of the mitochondrial genome
showing the differences in coverage of mtscATAC-seq (red) and scATAC-seq (blue). The same mtscATAC-seq
data obtained from hematopoietic cell lines [4] is shown in both panels. Notable differences in the coverage
plots are highlighted. Note that the resulting coverage is a function of cell type and sequencing depth. Primary
hematopoietic cells tend to have lower mean mtDNA coverage
Fig. 5 Example of variant calling output from mgatk. Each dot is a distinct mtDNA
mutation. Heteroplasmic, low-quality, and homoplasmic mutations are
separated by these two dimensions (x-axis: heteroplasmy correlation between
strands; y-axis: variance-mean-ratio (VMR) of the allele frequencies for all cells
in the analysis)
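The two axes of Fig. 5 can be illustrated with a small, self-contained calculation. The sketch below is not the mgatk implementation; it assumes that per-cell heteroplasmy has already been estimated separately from forward- and reverse-strand reads, uses the sample variance for the VMR, pools the strands by simple averaging, and runs on invented toy values. All function and variable names are hypothetical.

```python
import math
from statistics import mean, variance


def variance_mean_ratio(allele_freqs):
    """Variance-mean ratio (VMR) of per-cell allele frequencies."""
    m = mean(allele_freqs)
    if m == 0:
        return math.nan
    return variance(allele_freqs) / m  # sample variance (an assumption)


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return math.nan
    return sxy / math.sqrt(sxx * syy)


# Toy data: a clonal heteroplasmic variant present in two of six cells.
clonal_fwd = [0.0, 0.0, 0.52, 0.61, 0.0, 0.0]
clonal_rev = [0.0, 0.0, 0.47, 0.66, 0.0, 0.0]
clonal = [(f + r) / 2 for f, r in zip(clonal_fwd, clonal_rev)]

# Toy data: strand-discordant noise, as expected for a low-quality call.
noise_fwd = [0.02, 0.0, 0.03, 0.0, 0.01, 0.0]
noise_rev = [0.0, 0.03, 0.0, 0.02, 0.0, 0.01]

# Toy data: a homoplasmic variant, identical in every cell.
homoplasmic = [1.0] * 6

for label, fwd, rev, pooled in [
    ("clonal", clonal_fwd, clonal_rev, clonal),
    ("noise", noise_fwd, noise_rev, [(f + r) / 2 for f, r in zip(noise_fwd, noise_rev)]),
]:
    print(label, round(pearson(fwd, rev), 3), round(variance_mean_ratio(pooled), 3))
print("homoplasmic VMR", variance_mean_ratio(homoplasmic))
```

In this toy setting the clonal variant combines a high strand correlation with a high VMR, the noisy call has a low (here negative) strand correlation, and the homoplasmic variant has a VMR of zero, which is the separation the caption describes.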
Fig. 6 Quality control metrics for mtscATAC-seq data. The correlation between
the log10 number of nuclear chromatin accessibility fragments and the mean
mtDNA coverage per cell is shown for peripheral blood mononuclear cells. The
cells with low FRIP and proportionally lower mtDNA abundance represent residual
granulocytes
4 Notes
1. Note that the lysis and wash buffers do not contain Tween 20,
which is used in many scATAC-seq workflows, including the
standard protocol by 10x Genomics. We omit Tween 20 as it
depletes mitochondria and the mtDNA within them [3, 4], thereby
preventing the sequencing and identification of mtDNA variants.
2. For mtscATAC-seq library preparation, we refer the reader to
the detailed instructions of the user guide by 10x Genomics,
which further includes a detailed list of reagents, consumables,
and best practices required to successfully complete the
protocol.
3. For sequencing, we have successfully worked with Illumina
NextSeq and NovaSeq reagent kits and the respective sequencing
platforms. We have typically used kits with 150–200 cycles, as
the longer read lengths yield higher coverage of the
mitochondrial genome for variant calling.
4. The 10x Genomics cellranger-atac software comes as a stable
binary that requires no installation aside from untarring the
requisite files and placing them in a stable directory for
execution.
5. A complete discussion of dependencies and installation instructions
is available online:
https://github.com/caleblareau/mgatk/wiki/Installation.
6. Lysis time may need to be optimized depending on cell type.
Lysis efficacy may be assessed via the quantification of live/
dead cells and should be performed on unfixed cells. Please also
see the 10x Genomics demonstrated protocol Nuclei isolation
for single cell ATAC sequencing (CG000169 Rev. D) and the
troubleshooting section within.
7. To obtain a sufficiently high cell concentration for the tagmen-
tation reaction, we initially resuspend the cells in a small vol-
ume of 1x nuclei buffer to avoid the need to concentrate
further via an additional centrifugation step. The initial volume
is dependent on the starting cell number and for 300,000 cells
we would typically resuspend in 20–30 μL from which to
obtain a first cell count using 5 μL of the cell suspension. Some
cells are typically lost during upstream processing, so it is
advisable to dilute conservatively rather than risk ending up with
too low a cell concentration. For overloading of 10x channels, for
example when pooling multiple cell lines or cells of multiple
donors or when applying hashing-based approaches [12] to
enable downstream computational demultiplexing of the sam-
ple origin, a higher cell concentration will be required.
Acknowledgments
References
1. Nam AS, Chaligne R, Landau DA (2021) Integrating genetic and non-genetic determinants of cancer evolution by single-cell multi-omics. Nat Rev Genet 22:3–18
2. Buenrostro JD, Giresi PG, Zaba LC et al (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–1218
3. Corces MR, Trevino AE, Hamilton EG et al (2017) An improved ATAC-seq protocol reduces background and enables interrogation of frozen tissues. Nat Methods 14:959–962
4. Lareau CA, Ludwig LS, Muus C et al (2021) Massively parallel single-cell mitochondrial DNA genotyping and chromatin profiling. Nat Biotechnol 39:451–461
5. Walker MA, Lareau CA, Ludwig LS et al (2020) Purifying selection against pathogenic mitochondrial DNA in human T cells. N Engl J Med 383:1556–1563
6. Ludwig LS, Lareau CA, Ulirsch JC et al (2019) Lineage tracing in humans enabled by mitochondrial mutations and single-cell genomics. Cell 176:1325–1339.e22
7. Penter L, Gohil SH, Lareau C et al (2021) Longitudinal single-cell dynamics of chromatin accessibility and mitochondrial mutations in chronic lymphocytic leukemia mirror disease history. Cancer Discov 11:3048. https://doi.org/10.1158/2159-8290.CD-21-0276
8. Lareau CA, Ludwig LS, Sankaran VG (2019) Longitudinal assessment of clonal mosaicism in human hematopoiesis via mitochondrial mutation tracking. Blood Adv 3:4161–4165
9. Stuart T, Srivastava A, Madad S et al (2021) Multimodal single-cell chromatin analysis with Signac. Nat Methods (in press)
10. Fang R, Preissl S, Li Y et al (2021) Comprehensive analysis of single cell ATAC-seq data with SnapATAC. Nat Commun 12:1–15
11. Granja JM, Corces MR, Pierce SE et al (2021) ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis. Nat Genet 53:403–411
12. Mimitou EP, Lareau CA, Chen KY et al (2021) Scalable, multimodal profiling of chromatin accessibility, gene expression and protein levels in single cells. Nat Biotechnol 39:1246. https://doi.org/10.1038/s41587-021-00927-2
Part IV
Abstract
Assay of transposase-accessible chromatin with visualization (ATAC-see) is a transposase-mediated imaging
technology that enables direct imaging of the accessible genome in situ and deep sequencing to reveal the
identity of the imaged elements. Here we image the spatial organization of the accessible genome in HT1080
cells with this method.
Key words ATAC-see, Tn5 transposase, Chromatin accessibility, In situ imaging, Epigenetics, 3D
genome organization
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_15,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
286 Yonglong Dang et al.
2 Materials
2.2 Tn5 Transposase Assembly
1. DNA adaptor oligos were synthesized at Integrated DNA Technologies
(IDT) with the following sequences (see Notes 1 and 2):
Tn5MErev: 5′-[phos]CTGTCTCTTATACACATCT-3′;
Tn5ME-A-ATTO590: 5′-/ATTO590/TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-3′;
Tn5ME-B-ATTO590: 5′-/ATTO590/GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-3′.
2. Tn5 transposases were produced according to Picelli et al. [10].
3. 2X dialysis buffer (DB) [9, 10]: To make 10 mL of 2X DB,
mix 1 mL of 1 M HEPES-KOH (pH 7.2), 400 μL of
5 M NaCl, 40 μL of 0.5 M EDTA, 20 μL of 1 M DTT,
20 μL of Triton X-100, 2 mL of glycerol, and 6.52 mL of
sterilized ddH2O. Make aliquots and store at -20 °C.
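A quick way to catch transcription errors in the adaptor oligos of item 1 is to confirm that Tn5MErev is the reverse complement of the 19-nt mosaic end shared by the 3′ ends of Tn5ME-A and Tn5ME-B, so that it can anneal to both. The sketch below performs that check on the sequences as listed, with the 5′ modifications omitted; the helper name is hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of an unmodified DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Sequences from item 1, without the [phos] and /ATTO590/ modifications.
TN5ME_REV = "CTGTCTCTTATACACATCT"
TN5ME_A = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
TN5ME_B = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# The strand that Tn5MErev should pair with.
mosaic_end = reverse_complement(TN5ME_REV)

for name, oligo in (("Tn5ME-A", TN5ME_A), ("Tn5ME-B", TN5ME_B)):
    status = "ends with" if oligo.endswith(mosaic_end) else "does NOT end with"
    print(f"{name} {status} the mosaic end {mosaic_end}")
```

The same check can be rerun on the sequences in an oligo order form before synthesis.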
2.5 Cell Culture
HT1080 cells were cultured in DMEM/F-12, GlutaMAX™ supplement,
with 10% fetal bovine serum [12] and 1% Pen/Strep, using the
SecureSlip cell culture system.
3 Methods
3.3 Slide Preparation and Fixation
First, precleaned glass coverslips were placed in a 6-well cell
culture plate. Then, HT1080 cells were grown on the coverslips
until 80–90% confluent, fixed with 1% formaldehyde
[15] (Sigma-Aldrich) for 10 min at room temperature, and
quenched with 0.125 M glycine for 5 min at room temperature.
3.4 ATAC-see
1. Fixed HT1080 cells on the glass coverslip were permeabilized
with lysis buffer (10 mM Tris–HCl pH 7.4, 10 mM NaCl,
3 mM MgCl2, 0.1% IGEPAL CA-630) for 10 min.
2. Premixed transposase reaction solution (50 μL: 2.5 μL 2 mM
ATTO-Tn5, 25 μL 2X TD buffer, 22.5 μL water) was added
onto the slide, and the cells were incubated at 37 °C
for 30 min.
3. After Tn5 tagmentation, the cells were washed three times with
washing buffer (0.01% SDS, 50 mM EDTA in PBS) at 55 °C
for 15 min each.
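The transposase reaction solution in step 2 can be checked and scaled with a few lines of arithmetic. The sketch below confirms that the three components sum to 50 μL, that the 2X TD buffer ends up at 1X, and that the ATTO-Tn5 stock is diluted 20-fold. The master-mix helper, the four coverslips, and the 10% overage are hypothetical additions for illustration and are not part of the protocol.

```python
# Per-coverslip recipe from step 2 (volumes in uL).
RECIPE_UL = {"ATTO-Tn5": 2.5, "2X TD buffer": 25.0, "water": 22.5}

total_ul = sum(RECIPE_UL.values())
td_final_x = 2.0 * RECIPE_UL["2X TD buffer"] / total_ul   # final TD buffer strength
tn5_dilution = total_ul / RECIPE_UL["ATTO-Tn5"]           # fold dilution of the Tn5 stock


def master_mix(recipe_ul, n_reactions, overage=0.1):
    """Scale a per-reaction recipe; `overage` is an assumed pipetting allowance."""
    scale = n_reactions * (1.0 + overage)
    return {name: vol * scale for name, vol in recipe_ul.items()}


print(f"total {total_ul} uL, TD buffer {td_final_x}X, Tn5 diluted {tn5_dilution:.0f}-fold")
for name, vol in master_mix(RECIPE_UL, n_reactions=4).items():
    print(f"{name}: {vol:.1f} uL")
```

The final transposase concentration is the stock concentration divided by the dilution factor, in whatever units the stock is quoted.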
3.5 Immunostaining After ATAC-see
1. Cells were blocked with antibody dilution reagent for 1 h at
room temperature.
2. Primary antibodies (rabbit anti-LaminB1, ab16048, Abcam,
and mouse anti-mitochondria, ab3298, Abcam) were diluted
1:100 in the antibody dilution reagent and incubated with the
cells overnight at 4 °C.
3. After three 10-min washes with washing buffer (0.05% Tween-20
in PBS), slides were incubated with secondary antibodies
(goat anti-rabbit-ATTO488, 18772-1ML-F, Sigma-Aldrich;
goat anti-mouse-Atto647N, 50185-1ML-F, Sigma-Aldrich)
diluted 1:500 for 45 min at room temperature.
4. Finally, slides were washed with washing buffer three times for
10 min each, mounted using Vectashield with DAPI (H-1200,
Vector Labs), and imaged by confocal microscopy.
4 Notes
Acknowledgments
References
1. Babu A, Verma RS (1987) Chromosome structure: euchromatin and heterochromatin. Int Rev Cytol 108:1–60. https://doi.org/10.1016/s0074-7696(08)61435-7
2. Janssen A, Colmenares SU, Karpen GH (2018) Heterochromatin: guardian of the Genome. Annu Rev Cell Dev Biol 34:265–288. https://doi.org/10.1146/annurev-cellbio-100617-062653
3. Buenrostro JD, Giresi PG, Zaba LC, Chang HY, Greenleaf WJ (2013) Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods 10:1213–1218. https://doi.org/10.1038/nmeth.2688
4. Corces MR et al (2018) The chromatin accessibility landscape of primary human cancers. Science 362. https://doi.org/10.1126/science.aav1898
5. Klemm SL, Shipony Z, Greenleaf WJ (2019) Chromatin accessibility and the regulatory epigenome. Nat Rev Genet 20:207–220. https://doi.org/10.1038/s41576-018-0089-8
6. Bickmore WA, van Steensel B (2013) Genome architecture: domain organization of interphase chromosomes. Cell 152:1270–1284. https://doi.org/10.1016/j.cell.2013.02.001
7. Misteli T (2009) Self-organization in the genome. Proc Natl Acad Sci U S A 106:6885–6886. https://doi.org/10.1073/pnas.0902010106
ATAC-See: A Tn5 Transposase-Mediated Assay for Detection of Chromatin. . . 291
Abstract
A novel genome-wide accessible chromatin visualization, quantitation, and sequencing method is
described, which allows in situ fluorescence visualization and sequencing of the accessible chromatin in
the mammalian cell. The cells are fixed by formaldehyde crosslinking, and processed using a modified nick
translation method, where a nicking enzyme nicks one strand of DNA, and DNA polymerase incorporates
biotin-conjugated dCTP, 5-methyl-dCTP, Fluorescein-12-dATP or Texas Red-5-dATP, dGTP, and dTTP.
This allows accessible chromatin DNA to be labeled for visualization and on bead NGS library preparation.
This technology allows cellular level chromatin accessibility quantification and genomic analysis of the
epigenetic information in the chromatin, particularly accessible promoters, enhancers, nucleosome
positioning, transcription factor occupancy, and other chromosomal protein binding.
Key words Open chromatin, Nicking enzyme, DNA Polymerase I, DNA labeling, Fluorescent dye,
dNTPs, Slides, NEBNext, Biotin, DNA library, Microscopy
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_16,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
294 Pierre-Olivier Estève et al.
[Fig. 1 panels: (a) micrographs with columns DAPI, − Nt.CviPII, and + Nt.CviPII and rows DMSO and Romidepsin; (b) plot of OCI index (y-axis, 0–9) for DMSO and Romidepsin]
Fig. 1 NicE-view of HUT 78 cells treated or not with 1 μM of HDAC inhibitor (Romidepsin) for 6 h at 37 °C. (a)
The visualization of Texas Red-5-dATP (red) with or without Nt.CviPII. Nuclei are shown using DAPI staining
(blue). (b) The open chromatin index (OCI), a quantification of fluorescence incorporation
Fig. 2 Representative IGV tracks of HUT 78 control and Romidepsin treated cells
2 Materials
3 Methods
3.1 Crosslinking Cells on Micro Cover Glass
1. Grow between 5,000 and one million cells on micro cover glass in
2 mL media in a 6-well plate (see Note 1). Remove media at
50–70% confluency and add 937.5 μL 1X PBS per well (see
Note 1). Add 62.5 μL of 16% formaldehyde (1% final) to crosslink
the cells for 10 min at RT on a rocking platform (see Note 2).
2. Quench the reaction by adding glycine to 125 mM (52.5 μL of 2.5 M
stock) and incubate for 5 min at RT on a rocking platform.
3. Wash cells twice with 1X PBS.
3.2 Accessible Chromatin Labeling
1. Add 1 mL of cytosolic buffer to the cells and incubate for
10 min at 4 °C (see Note 3).
2. Wash twice with 1X PBS. Nuclei can be visualized under the
microscope at this point (circular with smooth edges).
3. Add 800 μL of accessible chromatin labeling buffer and incubate
for at least 30 min (can be extended up to 2 h) at 37 °C away
from light (see Note 4).
4. Add 20 μL 0.5 M EDTA and 2 μL RNase A to each well to stop
the reaction. Incubate for 20 min at 37 °C to digest RNA.
5. Remove the supernatant and wash once with 1X PBS at 55 °C
to remove autofluorescence background (see Note 6).
6. Wash twice with 1X PBS at RT.
7. Remove and dry the cover glass for at least 30 min at RT away
from light.
8. Mount the coverslip on microscope slide using ProLong gold
antifade reagent with DAPI. At this point, slides are ready to be
visualized by confocal microscopy (see Note 7).
3.4 NicE-viewSeq DNA Preparation
1. After labeling chromatin on slides, cells can be removed using
1% SDS, 2 mg/mL proteinase K, and 200 mM NaCl at 65 °C
overnight.
2. Genomic DNA can be extracted and purified using the phenol/
chloroform/isoamyl alcohol method or the Monarch Genomic
DNA purification kit.
3. Take 200–500 ng of genomic DNA and sonicate (see Note 8).
Transfer the genomic DNA to a Covaris microtube and add 1X
TE buffer to a final volume of 50 μL. Sonicate using the following
settings to obtain 150 bp fragments. First, insert the tube into
the holder, select “Open”, and select the program
named “Covaris 200 for 50μL” on the computer. Click the
“start” button. The parameters are: Intensity: 5; Duty Cycle: 10%;
Cycles per burst: 200; and Treatment time: 2 min.
NicE-viewSeq: An Integrative Visualization and Genomics Method to Detect. . . 299
3.6 NicE-viewSeq NGS Library Preparation
1. For end-repair, combine the following per sample in sequential
order: 50 μL of fragmented DNA on Streptavidin beads from
the above step; 3 μL of NEBNext Ultra II End Prep Enzyme
Mix; 7 μL of NEBNext Ultra II End Prep Reaction Buffer. Mix
well and incubate at 20 °C for 30 min and then at 65 °C for 30 min
(use a PCR machine).
2. For adaptor ligation, combine the following per sample in
sequential order: 60 μL of End Prep Reaction Mixture; 30 μL
of NEBNext Ultra II Ligation Master Mix; 1 μL of NEBNext
Ligation Enhancer; 1 μL of 1:10 diluted (as recommended by
NEB) NEBNext Adaptor for Illumina. Incubate for 2–16 h
at RT.
3. Add 3 μL USER enzyme and incubate at 37 °C for 15 min.
4. Place the Eppendorf tube on the magnetic rack. When solution
is clear, remove the liquid carefully using a pipet, and wash the
beads for 5 min with 1 mL cold High Salt Buffer containing
0.05% Triton X-100.
5. Repeat the above wash steps 2 times.
6. Wash the beads once with 1 mL 1X TE buffer for 5 min.
7. Resuspend the beads in 19 μL 0.1X TE buffer.
3.7 NicE-viewSeq NGS Library Amplification
1. For PCR amplification, combine the following per sample in
sequential order: 19 μL of Streptavidin beads, 3 μL Index
primer (10 μM), 3 μL Universal primer (10 μM), and 25 μL
NEBNext Ultra II Q5 Master Mix.
2. Set up the amplification in a PCR machine using the following
parameters: initial denaturation, 30 s at 98 °C; then 10 cycles of
denaturation, 10 s at 98 °C; annealing, 30 s at 65 °C; and
extension, 45 s at 65 °C. The final extension is 5 min at 72 °C.
The amplified library may be held at 4 °C.
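The reaction volumes in Subheadings 3.6 and 3.7 can be cross-checked against each other, and the cycling program gives a lower bound on the run time. The sketch below reads the program in step 2 as an initial denaturation, 10 cycles of denaturation, annealing, and extension, and a final extension; that reading, the dictionary layout, and the neglect of ramp times are assumptions made for illustration.

```python
# Volumes in uL, as listed in Subheadings 3.6 and 3.7.
END_PREP = {"DNA on beads": 50, "End Prep Enzyme Mix": 3, "End Prep Reaction Buffer": 7}
LIGATION_INPUT_UL = 60  # "60 uL of End Prep Reaction Mixture"
PCR = {"beads in 0.1X TE": 19, "Index primer": 3, "Universal primer": 3, "Q5 Master Mix": 25}

end_prep_total = sum(END_PREP.values())
pcr_total = sum(PCR.values())
bead_volume_ul = 0.9 * pcr_total  # 0.9X cleanup relative to the PCR volume

# Cycling program in seconds; ramp times are ignored (an assumption).
INITIAL_DENATURATION_S = 30
CYCLE_STEPS_S = {"denaturation": 10, "annealing": 30, "extension": 45}
N_CYCLES = 10
FINAL_EXTENSION_S = 5 * 60

hold_free_runtime_s = (
    INITIAL_DENATURATION_S + N_CYCLES * sum(CYCLE_STEPS_S.values()) + FINAL_EXTENSION_S
)

print(f"end prep total: {end_prep_total} uL (ligation input: {LIGATION_INPUT_UL} uL)")
print(f"PCR total: {pcr_total} uL, 0.9X beads: {bead_volume_ul:.0f} uL")
print(f"programmed time without ramping: {hold_free_runtime_s / 60:.1f} min")
```

The 0.9X bead volume computed here is relative to the full PCR volume; if less supernatant is recovered from the magnetic rack, the bead volume would need to be scaled accordingly.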
3.8 NicE-viewSeq NGS Library Purification Using NEBNext® Sample Purification Beads
1. Place the PCR reaction on a magnetic rack to remove the magnetic
streptavidin-biotinylated-DNA bead complexes. Transfer the
supernatant that contains the PCR products to a new DNA
LoBind tube and add 0.9X volume (45 μL) of NEBNext®
Sample Purification Beads.
2. Incubate for 5 min at RT and quick-spin the tube in a
microcentrifuge.
3. Put samples on magnetic rack to separate beads from the
supernatant. When the solution looks clear, carefully remove
the supernatant. DO NOT DISTURB THE BEADS.
4. Add 200 μL of freshly prepared 80% EtOH. DO NOT
REMOVE THE PCR TUBES FROM THE RACK OR RESUSPEND
THE BEADS. Wait for 30 s, remove the 80% EtOH
from the beads, and repeat once. After removing the superna-
tant for a second time, quickly spin down the tubes and
completely remove the residual EtOH.
5. Air-dry the beads for 5 min while the tube is on the rack with
the lid open. DO NOT OVERDRY THE BEADS, THIS MAY
RESULT IN LOWER RECOVERY OF DNA.
6. Remove the tube from the magnet and resuspend the beads in
20 μL 0.1X TE buffer for 2 min at RT to elute the DNA.
7. Place the tube back on the magnetic rack until the solution is
clear, then transfer the supernatant to a clean PCR tube and store
it at -20 °C.
8. Measure the amount of DNA using the Qubit dsDNA HS protocol.
A successful library preparation should have a DNA
concentration of at least 1 ng/μL.
9. Analyze the DNA on the Bioanalyzer (Agilent DNA 1000
Chip) to assess the library quality (size distribution and
concentration). After Illumina sequencing, read mapping, and peak
analysis, genomic open chromatin regions can be visualized
using the IGV browser (Fig. 2).
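For pooling and loading, the Qubit mass concentration from step 8 is usually converted to molarity using the mean fragment length from the Bioanalyzer trace in step 9. The sketch below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function name and the 300 bp mean fragment length are assumptions chosen only to illustrate the conversion.

```python
def library_molarity_nM(conc_ng_per_ul, mean_fragment_bp, g_per_mol_per_bp=660.0):
    """Convert a dsDNA library concentration from ng/uL to nM.

    nM = (ng/uL) * 1e6 / (g_per_mol_per_bp * mean fragment length in bp)
    """
    if conc_ng_per_ul < 0 or mean_fragment_bp <= 0:
        raise ValueError("concentration must be >= 0 and fragment length > 0")
    return conc_ng_per_ul * 1e6 / (g_per_mol_per_bp * mean_fragment_bp)


# Minimum acceptable yield from step 8, with an ASSUMED mean library size of 300 bp.
minimum_library_nM = library_molarity_nM(1.0, 300)
print(f"1 ng/uL at 300 bp is about {minimum_library_nM:.2f} nM")
```

The mean fragment length should be taken from the actual trace of each library, since it shifts the molarity proportionally.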
4 Notes
Acknowledgments
References
1. Jackson DA (2003) The principles of nuclear fluorescence in situ hybridization techniques.
structure. Chromosom Res 11:387–401 Nat Rev Microbiol 6:339–348
2. Martins RP, Finan JD, Guilak F, Lee DA 8. Volpi EV, Bridger JM (2008) FISH glossary:
(2012) Mechanical regulation of nuclear struc- an overview of the fluorescence in situ hybridi-
ture and function. Annu Rev Biomed Eng 14: zation technique. BioTechniques 45:385–409
431–455 9. Tarnowski BI, Spinale FG, Nicholson JH
3. Nathanailidou P, Taraviras S, Lygerou Z (1991) DAPI as a useful stain for nuclear quan-
(2020) Chromatin and nuclear architecture: titation. Biotech Histochem 66:297–302
shaping DNA replication in 3D. Trends 10. Latt SA, Stetten G, Juergens LA, Willard HF,
Genet 36:967–980 Scher CD (1975) Recent developments in the
4. Klemm SL, Shipony Z, Greenleaf WJ (2019) detection of deoxyribonucleic acid synthesis by
Chromatin accessibility and the regulatory epi- 33258 Hoechst fluorescence. J Histochem
genome. Nat Rev Genet 20:207–220 Cytochem 23:493–505
5. Volpe TA, Kidner C, Hall IM, Teng G, Grewal 11. Latt SA, Stetten G (1976) Spectral studies on
SI, Martienssen RA (2002) Regulation of het- 33258 Hoechst and related bisbenzimidazole
erochromatic silencing and histone H3 lysine-9 dyes useful for fluorescent detection of deoxyr-
methylation by RNAi. Science 297:1833–1837 ibonucleic acid synthesis. J Histochem Cyto-
6. Langer-Safer PR, Levine M, Ward DC (1982) chem 24:24–33
Immunological method for mapping genes on 12. Bucevičius J, Lukinavičius G, Gerasimaitė R
Drosophila polytene chromosomes. Proc Natl (2018) The Use of Hoechst Dyes for DNA
Acad Sci U S A 79:4381–4385 Staining and Beyond. Chemosensors 6:18
7. Amann R, Fuchs BM (2008) Single-cell identi- 13. Chen X, Shen Y, Draper W et al (2016) ATAC-
fication in microbial communities by improved see reveals the accessible genome by
302 Pierre-Olivier Estève et al.
Abstract
ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) has gained wide popularity as a
fast, straightforward, and efficient way of generating genome-wide maps of open chromatin and guiding
identification of active regulatory elements and inference of DNA protein binding locations. Given the
ubiquity of this method, uniform and standardized methods for processing and assessing the quality of
ATAC-seq datasets are needed. Here, we describe the data processing pipeline used by the ENCODE
(Encyclopedia of DNA Elements) consortium to process ATAC-seq data into peak call sets and signal tracks
and to assess the quality of these datasets.
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_17,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
306 Daniel S. Kim
2 Materials
3 Methods
Each step of the pipeline is described with its inputs and outputs. For
this step-by-step guide, we provide the pipeline for paired-end
ATAC-seq with two replicates, utilizing multimapping reads (our
recommended default experiment design and processing).
ATAC-seq Data Processing 307
3.1 Adaptor Detection and Read Trimming
1. Trim adapter sequence from the sequencing reads. Do this for
each FASTQ file (see Notes 1 and 2).
• Inputs: reads in FASTQ format ($FASTQ), adapter
sequence ($ADAPTER), adapter error rate
($ADAPTER_ERR_RATE, default 0.2).
3.2 Read Alignment and Post-alignment Filtering
1. Run Bowtie2 to align reads with up to k multimapping locations
allowed (see Notes 3–5).
• Inputs: reads in FASTQ format ($FASTQ1, $FASTQ2),
Bowtie2 index ($bwt2_idx), number of CPU threads
($nth_bwt2), number of multimapping locations allowed
per read ($multimapping, default 4), output file name
prefix ($prefix).
• Outputs: unfiltered alignments file in BAM format
(${prefix}.bam, in future steps referred to as $RAW_BAM), log file
($log).
• Command:
bowtie2 -k $((multimapping+1)) -X2000 --mm --threads $nth_bwt2 \
  -x $bwt2_idx -1 $FASTQ1 -2 $FASTQ2 2>$log | \
  samtools view -Su /dev/stdin | \
  samtools sort - $prefix
• Command:
zcat $FINAL_BEDPE | awk 'BEGIN{OFS="\t"} {printf "%s\t%s\t%s\tN\t1000\t%s\n%s\t%s\t%s\tN\t1000\t%s\n",$1,$2,$3,$9,$4,$5,$6,$10}' | \
  gzip -nc > $FINAL_TA_FILE
6. Adjust read starts for transposase cut sites for base-pair resolu-
tion alignments (see Note 13).
• Inputs: reads in tagAlign format ($FINAL_TA).
• Outputs: shifted reads in tagAlign format
($FINAL_TA_SHIFTED).
• Command:
zcat $FINAL_TA | awk -F $'\t' 'BEGIN {OFS = FS} { if ($6 == "+") {$2 = $2 + 4} else if ($6 == "-") {$3 = $3 - 5} print $0}' | \
  gzip -nc > $FINAL_TA_SHIFTED
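For readers who prefer to prototype outside the shell, the transposase shift applied by the awk command above can be written as a small Python function: reads on the + strand have their start moved by +4 bp, and reads on the − strand have their end moved by −5 bp. This is an illustrative sketch only, not part of the ENCODE pipeline; the function names and the example coordinates are hypothetical.

```python
import gzip


def shift_tagalign_line(line, plus_shift=4, minus_shift=5):
    """Apply the Tn5 cut-site shift to one tab-separated tagAlign record.

    Columns: chrom, start, end, name, score, strand.
    """
    fields = line.rstrip("\n").split("\t")
    strand = fields[5]
    if strand == "+":
        fields[1] = str(int(fields[1]) + plus_shift)
    elif strand == "-":
        fields[2] = str(int(fields[2]) - minus_shift)
    return "\t".join(fields)


def shift_tagalign_file(in_path, out_path):
    """Stream a gzipped tagAlign file and write the shifted, gzipped copy."""
    with gzip.open(in_path, "rt") as src, gzip.open(out_path, "wt") as dst:
        for line in src:
            if line.strip():
                dst.write(shift_tagalign_line(line) + "\n")


# Hypothetical records illustrating both strands.
plus_read = shift_tagalign_line("chr1\t100\t150\tN\t1000\t+")
minus_read = shift_tagalign_line("chr1\t100\t150\tN\t1000\t-")
print(plus_read)
print(minus_read)
```

As in the awk version, only one coordinate per read changes, so the shifted 5′ end marks the approximate center of the transposase binding event.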
3.4 Identifying Replicate-Consistent Peaks
1. Run the IDR framework to identify replicate-consistent peaks (see
Notes 18–22).
• Inputs: replicate peak files ($REP1_PEAK_FILE,
$REP2_PEAK_FILE), master peak list ($POOLED_PEAK_FILE),
p-value threshold ($IDR_THRESH).
3.5 Generating Signal Tracks
1. Generate fold-change bigWig file with MACS2 (see Note 23).
• Inputs: pileup bedGraph files, generated from MACS2 callpeak
($TREAT_PILEUP, $CONTROL_PILEUP), chromosome
sizes file ($gensz). Note that this produces an
intermediate bedGraph file ($fc_bedgraph,
$fc_bedgraph_srt).
10. Calculate the IDR quality control metrics (see Note 37).
Let Np = number of peaks passing the IDR threshold by
comparing pooled pseudoreplicates and Nt = number of
peaks passing the IDR threshold by comparing true replicates.
Calculate the Rescue Ratio = max(Np, Nt)/min(Np, Nt).
Let N1, N2 = number of peaks passing the IDR threshold for
self-pseudoreplicates for replicate 1 and replicate 2, respectively.
Calculate the Self-consistency Ratio = max(N1, N2)/min(N1, N2).
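Both ratios are the larger peak count divided by the smaller one, so they are always at least 1 and grow as the two sets diverge. The sketch below implements the two definitions directly; the function names and the peak counts in the example are hypothetical, and no pass/fail cutoff is applied because none is stated in this step.

```python
def max_min_ratio(a, b):
    """max(a, b) / min(a, b) for two positive peak counts."""
    if a <= 0 or b <= 0:
        raise ValueError("peak counts must be positive")
    return max(a, b) / min(a, b)


def rescue_ratio(n_pooled_pseudoreps, n_true_reps):
    """Np (pooled pseudoreplicates) versus Nt (true replicates)."""
    return max_min_ratio(n_pooled_pseudoreps, n_true_reps)


def self_consistency_ratio(n_rep1_self, n_rep2_self):
    """N1 versus N2 (self-pseudoreplicates of replicates 1 and 2)."""
    return max_min_ratio(n_rep1_self, n_rep2_self)


# Hypothetical peak counts passing the IDR threshold.
example_rescue = rescue_ratio(60000, 50000)
example_self = self_consistency_ratio(40000, 32000)
print(f"rescue ratio = {example_rescue:.2f}, self-consistency ratio = {example_self:.2f}")
```

Values close to 1 indicate that the pooled and true-replicate comparisons, or the two replicates themselves, yield similar numbers of reproducible peaks.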
4 Notes
References
1. Buenrostro JD, Giresi PG, Zaba LC et al 7. Langmead B, Salzberg SL (2012) Fast gapped-
(2013) Transposition of native chromatin for read alignment with Bowtie 2. Nat Methods 9:
fast and sensitive epigenomic profiling of open 357–359. https://doi.org/10.1038/nmeth.
chromatin, DNA-binding proteins and nucleo- 1923
some position. Nat Methods 10:1213–1218. 8. Li H, Handsaker B, Wysoker A et al (2009)
https://doi.org/10.1038/nmeth.2688 The sequence alignment/map format and
2. Galas DJ, Schmitz A (1978) DNAse footprint- SAMtools. Bioinformatics 25:2078–2079.
ing: a simple method for the detection of https://doi.org/10.1093/bioinformatics/
protein-DNA binding specificity. Nucleic btp352
Acids Res 5:3157–3170. https://doi.org/10. 9. Quinlan AR, Hall IM (2010) BEDTools: a
1093/nar/5.9.3157 flexible suite of utilities for comparing genomic
3. Hesselberth JR, Chen X, Zhang Z et al (2009) features. Bioinformatics 26:841–842. https://
Global mapping of protein-DNA interactions doi.org/10.1093/bioinformatics/btq033
in vivo by digital genomic footprinting. Nat 10. (2020) Picard Toolkit. Broad Institute
Methods 6:283–289. https://doi.org/10. 11. Feng J, Liu T, Qin B et al (2012) Identifying
1038/nmeth.1313 ChIP-seq enrichment using MACS. Nat Pro-
4. Li Z, Schulz MH, Look T et al (2019) Identifi- toc 7:1728–1740. https://doi.org/10.1038/
cation of transcription factor binding sites nprot.2012.101
using ATAC-seq. Genome Biol 20:45. 12. Kharchenko PV, Tolstorukov MY, Park PJ
https://doi.org/10.1186/s13059-019- (2008) Design and analysis of ChIP-seq experi-
1642-2 ments for DNA-binding proteins. Nat Biotech-
5. ENCODE Project Consortium, Moore JE, nol 26:1351–1359. https://doi.org/10.
Purcaro MJ et al (2020) Expanded encyclopae- 1038/nbt.1508
dias of DNA elements in the human and mouse 13. Ramı́rez F, Ryan DP, Grüning B et al (2016)
genomes. Nature 583:699–710. https://doi. deepTools2: a next generation web server for
org/10.1038/s41586-020-2493-4 deep-sequencing data analysis. Nucleic Acids
6. Martin M (2011) Cutadapt removes adapter Res 44:W160–W165. https://doi.org/10.
sequences from high-throughput sequencing 1093/nar/gkw257
reads. EMBnet J 17:10. https://doi.org/10. 14. Amemiya HM, Kundaje A, Boyle AP (2019)
14806/ej.17.1.200 The ENCODE blacklist: identification of prob-
lematic regions of the genome. Sci Rep 9:9354.
ATAC-seq Data Processing 323
Abstract
DNA accessibility has been a powerful tool in locating active regulatory elements in a cell type, but
dissecting the combinatorial logic within these regulatory elements has been a continued challenge in the
field. Deep learning models have been shown to be highly predictive models of regulatory DNA and have
led to new biological insights on regulatory syntax and logic. Here, we provide a framework for deep
learning in genomics that implements best practices and focuses on ease of use, versatility, and compatibility
with existing tools for inference on DNA sequence.
Key words DNA accessibility, ATAC-seq, DNase-seq, Deep learning, Machine learning
1 Introduction
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7_18,
© The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature 2023
2 Materials
3 Methods
3.1 Data Processing and Data Loading
1. Start with a set of genomic intervals, such as a set of accessible
regions for a cell type (see Note 1). These intervals will be
labeled as positives.
2. Collect an informative set of negative genomic intervals (see
Notes 2 and 3). This set includes flanking intervals (the genomic
intervals adjacent to the positive intervals on either side; our
default is to collect three extra bins on each side), random
intervals (intervals anywhere else in the genome that are not
positives), and intervals that are known to be accessible but
are absent from your set of positive intervals (see Note 4).
3. Divide the selected positive and negative intervals into
equal-size bins (see Note 5), using a stride length to generate
fixed-length examples across the selected intervals (see Note 6).
These bins are your genomic examples. The default bin length is
200 bp (i.e., each example is 200 base pairs of genomic
sequence), and our default stride length is 50 bp.
4. Set up labels for your examples. Label positives with 1 and
negatives with 0 (see Note 7).
5. Extend each example interval to your final interval length (see
Note 8). This step adds flanking sequence to each bin to give
more sequence context during training. The default final
length is 1000 bp. At this stage, you should have a set of
genomic intervals that are all 1000 bp in length and are each
associated with a label (1 or 0).
6. Optionally, pre-generate one-hot encodings for your regions
(see Notes 9–11). If you intend to use a standard data loader
for your desired deep learning framework, pre-generated
encodings are needed to provide appropriate inputs for training.
7. Build a data generator appropriate for your desired deep
learning framework. Many frameworks now provide the option
to create your own data loader if needed. If performing a
one-hot encoding on the fly, write a one-hot encoder in your
data generator to ensure the deep learning framework receives
a proper input with the label.
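Steps 3, 5, and 6 above can be sketched as follows. This is a minimal, framework-independent illustration using the stated defaults (200 bp bins, 50 bp stride, 1000 bp final length). The function names are hypothetical. Three behaviours are assumptions of this sketch, not of the protocol: intervals shorter than one bin yield no examples, extended intervals are clamped at coordinate 0, and bases other than A, C, G, and T are encoded as all zeros.

```python
from typing import Iterator, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), 0-based, half-open

# Channel order A, C, G, T is a convention chosen for this sketch.
ONE_HOT = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0],
           "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}


def bin_interval(chrom: str, start: int, end: int,
                 bin_size: int = 200, stride: int = 50) -> Iterator[Interval]:
    """Yield fixed-length bins tiled across [start, end) at the given stride."""
    pos = start
    while pos + bin_size <= end:
        yield (chrom, pos, pos + bin_size)
        pos += stride


def extend_bin(chrom: str, start: int, end: int,
               final_length: int = 1000) -> Interval:
    """Pad a bin with equal flanks so that it spans final_length bp."""
    flank = (final_length - (end - start)) // 2
    new_start = max(0, start - flank)  # clamp at the chromosome start
    return (chrom, new_start, new_start + final_length)


def one_hot_encode(seq: str) -> List[List[int]]:
    """Encode a DNA string as rows of length 4; unknown bases become zeros."""
    return [list(ONE_HOT.get(base, [0, 0, 0, 0])) for base in seq.upper()]


def make_examples(intervals: List[Interval],
                  label: int) -> List[Tuple[Interval, int]]:
    """Bin each interval, extend each bin, and attach the label (1 or 0)."""
    return [(extend_bin(*b), label)
            for iv in intervals for b in bin_interval(*iv)]
```

For example, `make_examples([("chr1", 5000, 5300)], 1)` produces three positive examples of 1000 bp each. A data generator for step 7 would fetch the sequence for each interval from the reference genome and pass it through `one_hot_encode` before handing it to the framework.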
3.2 Train a Model
1. Before training, determine your evaluation setup. We use a
cross-validation strategy based on splitting by chromosome
(see Note 12). For a tenfold cross-validation strategy, split
your chromosomes by size as equally as possible across ten
folds, then use eight folds for training, one fold for validation,
and one fold for testing (see Note 13).
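One way to split chromosomes "as equally as possible" is a greedy assignment: take chromosomes from largest to smallest and place each into the fold with the smallest total size so far. The sketch below illustrates this. The greedy heuristic, the choice of the next fold as the validation fold, and the function names are all assumptions made for illustration. The text above does not prescribe a specific balancing algorithm.

```python
from typing import Dict, List


def split_chromosomes_into_folds(chrom_sizes: Dict[str, int],
                                 n_folds: int = 10) -> List[List[str]]:
    """Greedily assign chromosomes to folds so that fold sizes stay balanced."""
    folds: List[List[str]] = [[] for _ in range(n_folds)]
    totals = [0] * n_folds
    # Largest chromosomes first; ties are broken by name for reproducibility.
    ordered = sorted(chrom_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for chrom, size in ordered:
        smallest = totals.index(min(totals))
        folds[smallest].append(chrom)
        totals[smallest] += size
    return folds


def assign_roles(folds: List[List[str]], test_idx: int) -> Dict[str, List[str]]:
    """Use one fold for testing, the next for validation, the rest for training."""
    n = len(folds)
    valid_idx = (test_idx + 1) % n
    train = [c for i, fold in enumerate(folds)
             if i not in (test_idx, valid_idx) for c in fold]
    return {"train": train,
            "valid": list(folds[valid_idx]),
            "test": list(folds[test_idx])}
```

With ten folds, calling `assign_roles(folds, k)` for k = 0 through 9 gives ten train/validation/test configurations, each with eight training folds. Because whole chromosomes are assigned to folds, no example from a test chromosome is seen during training.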
3.3 Evaluation
1. Evaluate your model using only the held-out test data (see
Notes 19 and 20). Unlike training, where only an informative
set of negative regions is used, use the entirety of the
held-out chromosomes during evaluation. Useful measures
during evaluation of a classification model include the loss,
the area under the precision-recall curve (AUPRC), and the area
under the receiver operating characteristic curve (AUROC)
(see Note 21).
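The two curve-based measures can be computed from binary labels and model scores without any external library, as sketched below. Two choices here are assumptions of this sketch: AUROC is computed with the rank-sum formulation (tied scores receive average ranks), and AUPRC is summarised as average precision, a step-wise estimate of the area under the precision-recall curve. The function names are hypothetical. In practice, a tested library implementation is preferable.

```python
from typing import Sequence


def _check(labels: Sequence[int]) -> int:
    """Return the number of positives; both classes must be present."""
    n_pos = sum(labels)
    if n_pos == 0 or n_pos == len(labels):
        raise ValueError("need at least one positive and one negative example")
    return n_pos


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum statistic."""
    n_pos = _check(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels))  # ascending by score
    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Sum over score thresholds of (increase in recall) x (precision)."""
    n_pos = _check(labels)
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])  # descending
    tp = fp = 0
    ap = 0.0
    i = 0
    while i < len(pairs):
        j = i
        tp_here = 0
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp_here += pairs[j][1]
            j += 1
        tp += tp_here
        fp += (j - i) - tp_here
        ap += (tp_here / n_pos) * (tp / (tp + fp))
        i = j
    return ap
```

Because negatives vastly outnumber positives when entire held-out chromosomes are scored, AUPRC is usually the more informative of the two measures in this setting, since AUROC can remain high even when most top-scoring predictions are false positives.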
4 Notes
References
1. Boyle AP, Davis S, Shulha HP et al (2008) 9. Kelley DR, Snoek J, Rinn J (2016) Basset:
High-resolution mapping and characterization learning the regulatory code of the accessible
of open chromatin across the genome. Cell genome with deep convolutional neural net-
132:311–322. https://doi.org/10.1016/j. works. Genome Res gr.200535.115. https://
cell.2007.12.014 doi.org/10.1101/gr.200535.115
2. Song L, Crawford GE (2010) DNase-seq: a 10. Shrikumar A, Greenside P, Kundaje A (2017)
high-resolution technique for mapping active Learning important features through propa-
gene regulatory elements across the genome gating activation differences.
from mammalian cells. Cold Spring Harb Pro- arXiv:170402685 [cs]
toc 2010:pdb.prot5384. https://doi.org/10. 11. Lundberg SM, Lee S-I (2017) A unified
1101/pdb.prot5384 approach to interpreting model
3. Thurman RE, Rynes E, Humbert R et al predictions. In: Advances in neural information
(2012) The accessible chromatin landscape of processing systems. Curran Associates, Inc.
the human genome. Nature 489:75–82. 12. Greenside P, Shimko T, Fordyce P, Kundaje A
https://doi.org/10.1038/nature11232 (2018) Discovering epistatic feature interac-
4. Roadmap Epigenomics Consortium, tions from neural network models of regu-
Kundaje A, Meuleman W et al (2015) Integra- latory DNA sequences. Bioinformatics 34:
tive analysis of 111 reference human epigen- i629–i637. https://doi.org/10.1093/bioin
omes. Nature 518:317–330. https://doi.org/ formatics/bty575
10.1038/nature14248 13. Avsec Ž, Weilert M, Shrikumar A et al (2021)
5. Buenrostro JD, Giresi PG, Zaba LC et al Base-resolution models of transcription-factor
(2013) Transposition of native chromatin for binding reveal soft motif syntax. Nat Genet 53:
fast and sensitive epigenomic profiling of open 354–366. https://doi.org/10.1038/s41588-
chromatin, DNA-binding proteins and nucleo- 021-00782-6
some position. Nat Methods 10:1213–1218. 14. Shrikumar A, Tian K, Avsec Ž, et al (2020)
https://doi.org/10.1038/nmeth.2688 Technical note on transcription factor Motif
6. Eraslan G, Avsec Ž, Gagneur J, Theis FJ discovery from importance scores
(2019) Deep learning: new computational (TF-MoDISco) version 0.5.6.5.
modelling techniques for genomics. Nature arXiv:181100416 [cs, q-bio, stat]
Reviews Genetics 20:389–403. https://doi. 15. Kingma DP, Ba J (2017) Adam: a method for
org/10.1038/s41576-019-0122-6 stochastic optimization. arXiv:14126980 [cs]
7. Alipanahi B, Delong A, Weirauch MT, Frey BJ 16. Hinton G (2012) Neural networks for machine
(2015) Predicting the sequence specificities of learning, Lecture 6
DNA- and RNA-binding proteins by deep 17. Kim DS, Risca V, Reynolds D et al (2020) The
learning. Nat Biotechnol 33:831–838. dynamic, combinatorial cis-regulatory lexicon
https://doi.org/10.1038/nbt.3300 of epidermal differentiation. bioRxiv
8. Zhou J, Troyanskaya OG (2015) Predicting 2020.10.16.342857. https://doi.org/10.
effects of noncoding variants with deep 1101/2020.10.16.342857
learning-based sequence model. Nat Methods
12:931–934. https://doi.org/10.1038/
nmeth.3547
INDEX
Georgi K. Marinov and William J. Greenleaf (eds.), Chromatin Accessibility: Methods and Protocols,
Methods in Molecular Biology, vol. 2611, https://doi.org/10.1007/978-1-0716-2899-7,
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer
Nature 2023
335
CHROMATIN ACCESSIBILITY: METHODS AND PROTOCOLS
336 Index
Chromatin accessibility ................ v, 13, 71–96, 101–118, Diploid.......................................................................7, 116
155, 156, 187–227, 232, 233, 249–266, Disuccinimidyl glutarate (DSG)..................................... 60
269–281, 285–290, 293–301, 325–333 Dithiothreitol (DTT)................................. 35, 41, 89, 91,
Chromatin immunoprecipitation sequencing 103, 138, 157, 193, 200–202, 253, 286–289, 296
(ChIP-Seq) ............ 24, 33, 34, 60, 101, 317, 320 DNA fragmentation ........................................................ 45
Chromatin interactions.............................................85–96 DNA LoBind tube .................................. 47, 56, 200, 300
Chromatin looping ......................................................... 86 DNA methylation ............... 39, 102, 106, 156, 231–246
Chromium ....................................................253, 270–272 DNA Polymerase I .........................................41, 296, 301
Chromosome sizes ...................................... 310, 311, 314 DNA replication ........................................................72, 73
Circularization................................................................. 94 DNase ......................... 3, 4, 7, 22, 39, 54, 128, 139, 155
cis-regulatory elements (cREs) ........................... 3, 4, 101, DNase-seq .................................................... 4, 39, 40, 53,
108, 155, 188 54, 63, 71, 101, 155, 294
CITE-seq .............................................189, 252, 264, 265 DNse hypersensitivity (DHS)...................................39, 40
Click-IT .............................................................. 74, 78, 83 dNTP ............................................. 6, 13, 41, 88, 93, 165,
CO2 ............................................................................56, 75 166, 192, 203, 208, 234, 235, 241, 296, 301
Combinatorial indexing ......................156, 188, 189, 198 DOGMA-seq................................................................. 189
Confocal .....................................288, 289, 296, 298, 301 Doublets ..............................................197, 198, 220–222
Convolutional neural networks (CNNs) ..................... 330 Dounce ................................................164, 174, 237, 238
Coomassie.......................................................33, 286, 288 Droplet ....................................................... 155, 156, 188,
CoTECH ....................................................................... 189 189, 222, 249, 251, 252
Covaris .............................................................42, 45, 104, Drosophila ............................................... 72, 76, 105, 116
108, 109, 128, 133, 297, 298 Drosophila melanogaster (D. melanogaster) 72–74, 76, 82
CpG ............................................................ 103, 105–107, dsciATAC-seq ................................................................ 189
112, 113, 115, 116, 232, 233 dSMF .................................................................... 101–118
Cross-correlation......................................... 306, 312, 320 dTTP....................................................................... 41, 296
Crosslinked ...........................................17, 44, 48, 60, 65, Dynabeads .............................................88, 103, 192, 206
105, 182, 195, 196, 206, 300
Cross-linking .............................................................92–93 E
Cross-validation.................................................... 327, 330 EB buffer ..........................................................84, 88, 207
CTCF .................................................................... 115–117 EDTA....................................................33, 42, 44, 48, 50,
CuSO4 ............................................................................. 78 55, 65, 75, 86, 89, 92, 93, 103, 125, 127, 128,
Cutadapt ......................................... 64, 66, 104, 306, 307
133, 139, 141, 165, 174, 175, 182, 192–195,
Cut sites ............................. 144, 146–151, 309, 317, 318 235, 287–289, 297, 298
CutSmart .................... 93, 128, 131, 132, 140, 179, 180 EGTA...................................................................... 55, 127
Cytosolic buffer................... 41, 44, 48, 50, 51, 296, 298
EM-seq .........................................................103, 108–114
ENCODE............ 14, 54, 214, 305–307, 309, 310, 315,
D
317–319, 321
dATP ...................................................... 41, 294–296, 298 End repair ................... 28, 46, 49, 51, 58, 109, 141, 299
dCTP .....................................................41, 166, 177, 296 Enhancers ...............................................3, 14, 28, 34, 46,
Deep learning ...............................................319, 325–333 53, 58, 71, 85, 86, 108, 129, 134, 232, 285, 299
DeepTools ........................................ 33, 59, 64, 306, 320 Epigenome .................................................................... 156
Demultiplex ................................ 254, 260–262, 275, 278 Escherichia coli (E. coli) ........................... 65, 68, 286, 288
Desalting............................................................... 182, 190 Ethidium bromide ....................................................31, 33
Desulphonation.................................................... 234, 240 Ethylene glycol bis(succinimidyl succinate) (EGS)....... 60
3D genome.................................................................... 285 5-ethynyl-2′-deoxyuridine (EdU)...............72–76, 82, 83
dGTP ...................................................................... 41, 296 Euchromatin......................................................... 285, 293
4′,6-diamidino-2-phenylindole (DAPI) .................54–59, Exonuclease ......................... 40, 123, 143, 146, 148, 235
252, 287, 289, 294–296, 298 Exo/rSAP .................................................... 235, 241, 245
Digitonin ............4, 6, 16, 102, 192, 201, 202, 252, 256
Dimensionality reduction .................................... 225, 265 F
Dimethyl Formamide (DMF) ................................ 6, 164, FASTA ............................................................67, 262, 280
192, 194, 202, 287 FASTQ...........................32, 66, 212–214, 217, 307, 314
Dinucleosomal...................................................... 221, 222 FASTQC..............................................32, 64, 66, 82, 261
CHROMATIN ACCESSIBILITY: METHODS AND PROTOCOLS
Index 337
Fetal bovine serum (FBS) ........56, 88, 91, 252, 271, 272 High Salt Buffer .................. 42, 45, 46, 49, 50, 297, 299
Fetal Calf Serum.............................................................. 74 HindIII ................................................127, 142, 143, 151
FFPE ..........................................................................49–51 HiSeq ...................................................................... 95, 167
Ficoll gradient ................................................................... 7 Histone ...................................................3, 21, 24, 34, 40,
FITC .............................................................................. 294 63, 72, 121, 122, 156, 293
Fixation ................................48, 68, 91–92, 95, 200–202, Histone modifications...............................................21, 34
251–253, 255–257, 271–273, 279, 289, 296 H3K9me3........................................................... 33, 34, 60
Flowmi ......................................................... 253, 271, 273 H3K27me3 ...............................................................55, 60
Fluorescein .................................................. 294, 296, 298 Hoechst ....................................................... 238, 294, 296
Fluorescence-activated cell sorting (FACS)........... 16, 83, Homoplasmic ....................................................... 276, 281
234, 238, 239, 252, 271–273 HP1.................................................................................. 55
Fluorophores ........................................................ 285, 294 HPLC ..................................................182, 190, 235, 290
FokI......................................................158, 167, 173, 179 HT1080....................................................... 286, 288, 289
Footprints ...........13, 101, 108, 109, 111, 114–117, 305 HU................................................................................... 64
Formaldehyde............................................. 40, 41, 43, 48, Human papillomavirus (HPV) ....................................... 25
51, 55, 57, 60, 64, 65, 68, 86, 91, 139, 155, 252, Hybridization ............................................... 40, 103, 105,
255, 271, 272, 287, 289, 295–297, 300 106, 108–109, 196, 197, 204–206
Formaldehyde-assisted isolation of regulatory elements
sequencing (FAIRE-Seq) ..... 40, 63, 71, 155, 294 I
Fraction of reads in peaks (FRiP)............... 277, 314, 322 i5 .................. 9, 13, 15, 65, 90, 134, 167, 180, 181, 260
Fragment length......................................... 12, 59, 81, 82, i7 ...........................9, 13, 15, 65, 90, 167, 180, 181, 260
108, 144, 146, 210, 221, 223, 245, 314, 320, 321
IDR .................................... 309, 310, 314, 317–319, 322
FS-seq ........................................................................21–37 IGEPAL ...........................................................4, 5, 16, 55,
75, 102, 164, 194, 287, 289, 290
G
Illumina............................................ 9, 12, 13, 15, 23, 24,
GC bias ................................................................. 313, 321 27, 29, 30, 32, 42, 46, 47, 56, 59, 64, 74, 77, 82,
Gelatin ............................................................................. 75 83, 90, 95, 102, 109, 110, 116, 123, 124, 128,
Gene expression ........ v, 63, 85, 156, 212, 270, 285, 293 129, 133, 134, 142, 143, 146, 158, 167, 180,
GentleMACS ............................................... 191, 193, 200 181, 197, 211, 238, 244, 254, 259, 260, 271,
Glycerol.................................................33, 75, 86, 91, 92, 274, 278, 279, 297, 299, 300
116, 140, 157, 193, 195, 199, 200, 210, 287–289 Imidazole ............................................................ 86, 88, 89
Glycine ........................................................ 41, 43, 48, 55, Immuno-staining ................................................. 287, 289
57, 60, 65, 89, 92, 192, 193, 201, 252, 255, 271, Inaccessible ....................................................... 53–60, 108
272, 289, 296, 297 Insulators ......................................................................... 14
Glycogen................................................42, 44, 88, 93, 94 Integrated Genome Browser (IGB).................. 64, 67, 68
GpC ....................................................103, 105–107, 110, Isopropanol ....................... 42, 44, 86, 94, 128, 130, 132
112–116, 232–234, 238, 244 Isopropyl-β-d-1-thiogalactopyranoside
Graphical processing unit (GPU) ................................ 326 (IPTG) ................................................90, 286, 288
H J
H3 .............................................................. 3, 21, 163, 172 Jensen-Shannon distance (JSD) .......................... 313, 321
H4 .............................................................. 3, 21, 163, 172
H2A .............................................................................3, 21 K
Haploid .......................................................................... 116 KAc ................................................................................ 164
H2B .............................................................................3, 21 Kallisto .................................................254, 261–263, 266
HCT116 .................................41, 48, 54–56, 59, 60, 296
KAPA .................................... 13, 19, 166, 167, 181, 192,
HDAC .................................................................. 294, 295 208, 209, 211, 235, 242, 253, 254, 257, 258, 266
HDF5 ............................................................................ 330 Keras .............................................................................. 326
HEK293 ........................................................................ 222
Kipoi .............................................................................. 326
Hemocytometer .............................................................. 41 Kite.......................................................254, 261–264, 266
HEPES............................................................75, 287–289 KOAc .................................................................... 128, 132
Heterochromatin ........................... 59, 60, 285, 293, 294 KOH .............................................................127, 287–289
Heteroplasmic ...................................................... 276, 281
CHROMATIN ACCESSIBILITY: METHODS AND PROTOCOLS
338 Index
L Multimodal................................................. 210, 231, 233,
260, 264–265, 270, 281
Lambda ................................................................. 108, 112 Multiomics................................................v, 156, 189, 211
Lamin B1 ....................................................................... 287
Ligation ................................................28, 40, 46, 49, 51, N
58, 60, 94, 123, 125, 129, 133, 134, 141, 146,
156, 165, 176, 179, 190, 204–206, 294, 299 NaClO4 ................................................................. 128, 132
Lineage tracing.............................................................. 263 NEBNext High-Fidelity ..................................6, 8, 9, 104
Linker.................................................... 22, 23, 86, 91–93, NEB NEXT Ultra II FS DNA library prep Kit .......24–27
96, 158, 173, 190, 198, 226 NEB Ultra II DNA library prep Kit ................ 24, 27–32,
Low loss lysis (LLL)............................255, 256, 258, 266 42, 56, 297
Nextera ..................... 15, 64, 65, 68, 167, 180, 181, 193
M NextGEM ...................................................................... 271
Next generation sequencing (NGS) ...................... 22, 23,
MACS2 ......................................... 59, 306, 309, 310, 317 45, 47, 51, 54, 57–59, 72, 73, 86, 238, 244, 294,
MAPQ ..................................................... 59, 82, 313, 316 295, 299–300
MarkDuplicates ............................................................. 308 NextSeq ...........................................................47, 82, 167,
Matplotlib............................................................. 104, 115 211, 254, 259, 271, 274, 278
Maxima H............................................165, 175, 192, 203 NicE-viewSeq ....................................................... 293–301
M.CviPI ...............................................103, 105–107, 232 Nicking enzyme assisted sequencing
mdCTP ............................................................................ 41 (NicE-seq) ............................................ 39–51, 294
mESCs .......................................................................75, 76 Nicks ..................................................40, 47–50, 294, 300
Metaprofile .................................................. 114, 116, 219 Ni-NTA............................................................... 86, 88, 91
MethylDackel ...............................................105, 112–114 NlaIII .........................................................................88, 93
Methylome ...............................................v, 101, 189, 231 NotI-HF ............................................................... 167, 180
Methyltransferase ...................................4, 102, 105, 106, NovaSeq........................................................ 95, 167, 211,
116, 232–234, 238 238, 244, 254, 259, 271, 274, 278
MgAc2 ............................................................................ 164 NP40..............................41, 55, 204, 205, 252, 256, 271
mgatk ................................. 254, 263, 272, 276, 280, 281 Nt.CviPII....................................41, 47, 49, 50, 295, 296
MgCl2 ....................................................... 5, 6, 41, 56, 57, Nuclear periphery......................................................55, 59
60, 75, 89, 102, 103, 127, 157, 192, 194, 201, Nuclei isolation .................. 7, 16, 17, 73, 106, 157–164,
202, 234, 252, 256, 271, 287, 289, 296 174, 194, 196, 226, 233–234, 238–239, 278
Micrococcal nuclease (MNase).................. 22, 40, 54, 55, Nuclei isolation buffer (NIB) .................... 164, 174, 194,
57, 72, 122, 155 201, 203–206
Microscope ................................................. 41, 43, 44, 48, Nucleoid-associated proteins (NAPs) ......................63, 64
56, 75, 76, 296, 298, 301 Nucleosome............................... 3, 17, 21–37, 39, 40, 53,
Milli-Q ............................................................41, 286, 295 63, 71, 72, 101, 106, 117, 122, 127, 148, 155,
MinElute PCR Purification Kit ..........6, 74, 78, 104, 193 188, 231–246, 258, 274, 279, 281, 305, 319, 321
MiSeq.........................................................................25, 32 Nucleosome depleted regions (NDRs)...........39, 40, 232
Mitochondria..........................................7, 12, 16, 17, 40, Nucleosome-free region (NFR) ............. 21, 22, 258, 321
212, 218, 222, 223, 250, 251, 270, 271, 274, Nucleosome Occupancy and Methylome sequencing
277, 278, 287, 311, 319 (NOMe-seq) ............................................. 101–118
Mitochondrial DNA (mtDNA)................. 255, 258–260, Nucleosome positioning................................72, 305, 321
263, 264, 266, 270, 271, 274–281 NUMT.................................................................. 275, 280
Mitochondrial fraction.................................................. 311
Mitochondriall genome ........................... 16, 40, 82, 217, O
218, 259, 263, 269–281, 319
MluCI ........................................................................88, 93 OD600 ........................................... 65, 90, 129, 139, 288
MNase-seq.................................................. 40, 53, 54, 63, Oligos ................................................... 15, 24, 30, 89–90,
71, 72, 101, 122, 155, 294 129, 157, 173, 182, 190–191, 193–199, 226,
Monarch PCR & DNA Cleanup Kit ............................. 25 236, 251, 252, 286, 288, 290, 297
Mononucleosomal .......................................................... 14 OmniATAC .......................................................... 202, 226
Mouse embryonic fibroblast (MEF) ............................ 222 Open chromatin ....................................... 4, 7, 16, 23, 35,
M.SssI ...........................................................103, 105–107 39, 53, 63, 64, 72, 101, 116, 155–183, 188, 189,
mtscATAC ..................................................................... 255 285, 295, 296, 298, 300, 301, 305, 315, 333
ORE-seq ............................................................... 121–151
CHROMATIN ACCESSIBILITY: METHODS AND PROTOCOLS
Index 339
P Q
P5 ................... 9, 59, 158, 173, 197, 236, 253, 257, 258 Q5 ................................................... 46, 59, 129, 134, 299
P7 ................... 9, 59, 190, 197, 207–210, 237, 253, 258 qPCR ........................................................ 7, 9–13, 26, 35,
Paired-end .................................................. 13, 59, 82, 95, 56, 59, 65, 66, 104, 109, 167, 181, 183, 191,
110, 134, 142, 260, 306, 318, 321 207, 209, 211, 258, 266, 279
Paired-seq .....................................................155–183, 189 Quantification .........................11–13, 59, 109, 167, 183,
Paired-Tag ..................................................................... 189 210–212, 250, 251, 254, 258, 261, 278, 279, 295
Paraffin............................................................................. 49 Qubit .........................................6, 12, 42, 44, 47, 56, 58,
Paraformaldehyde ......................................................... 290 59, 64–66, 68, 73, 74, 81, 88, 89, 95, 104, 107,
PB buffer ......................................................................... 17 109, 128, 131, 133, 167, 179, 183, 191, 193,
PBST ..........................................................................45, 50 209–211, 244, 254, 258, 271, 273, 274, 297, 300
PCR ........ 5, 25, 42, 56, 64, 74, 86, 104, 134, 157, 190, 235, 253, 273, 288, 297, 312
PCR duplicates ........ 82, 112, 141, 280, 316
Peak calling ........ 14, 82, 219, 309, 317, 318
PEG ........ 89, 192, 194, 203, 208, 235
Penicillin/streptomycin ........ 74
Permeabilization ........ 251, 255–257, 279
PHAGE-ATAC ........ 189
Phenol ........ 40, 42, 44–45, 86, 93, 94, 118, 128, 132, 140, 297, 298
Phenol:Chloroform:Isoamyl Alcohol ........ 42, 297
PhiX ........ 11–13
Phosphate-buffered saline (PBS) ........ 7, 41–45, 48, 50, 56, 57, 60, 74–76, 82, 83, 86, 92, 104, 107, 165, 176, 177, 182, 191, 193, 200–202, 234, 238, 252, 255, 271, 272, 287, 289, 290, 296–298, 301
Phusion High-Fidelity DNA Polymerase ........ 9
Picard ........ 32, 220, 306, 308
Picolyl-Azide-PEG4-Biotin ........ 74, 78
Pipeline ........ 124, 211, 217, 220, 244, 254, 262, 263, 305–307, 315, 319, 326, 329
PitStop2 ........ 164, 165
PMSF ........ 91, 192, 206
Position-weight matrix (PWM) ........ 328, 332, 333
Primary antibodies ........ 287, 289
Primers ........ 6, 9, 15, 25, 27, 30, 42, 59, 91, 104, 109, 157, 167, 175, 180–182, 190–191, 207, 210, 235–237, 239–242, 253
Prokaryotic chromatin Openness Profiling sequencing (POP-seq) ........ 64–68
Promoters ........ 3, 53, 71, 85, 86, 108, 114, 232, 285
Protease inhibitor ........ 57, 75, 157, 164, 192, 193
Protect-seq ........ 53–60
Proteinase K ........ 42, 44, 48, 50, 58, 107, 128, 130, 131, 139, 141, 166, 177, 192, 206, 233, 297, 298
Pseudoreplicates ........ 314, 318, 319
pTXB1 ........ 286, 288
pUC19 ........ 108, 112
pyBigWig ........ 105, 195
Python ........ 104, 105, 195, 254, 262, 263, 266, 326, 330

R
R ........ 33, 59, 89, 135, 137, 165
Read counts ........ 312, 319
Read mapping ........ 111–112
repli-ATAC ........ 71–84
Rescue ratio ........ 314
Resection ........ 123, 146–148
Restriction enzyme ........ 4, 86, 93–94, 121–151
Reverse cross-linking ........ 92–93
Reverse transcriptase ........ 156, 165, 175, 192, 203
Reverse transcription (RT) primer ........ 157, 173, 175, 190, 195–197, 203
Rhodamine ........ 294
RNase A ........ 42, 44, 48, 50, 88, 93, 107, 128, 132, 296, 298
RNase OUT ........ 157, 164, 165
RNA-seq ........ 187, 188, 224
Rolling circle amplification (RCA) ........ 86, 94–96

S
Saccharomyces cerevisiae ........ 124–130, 135, 138, 140, 142, 144, 146, 147
S-adenosylmethionine (SAM) ........ 103, 107, 215, 234, 238, 311, 312
SAMstats ........ 215, 218, 220, 306, 311, 312, 319
Samtools ........ 33, 64, 105, 111, 112, 135, 195, 215, 217, 218, 220, 306–308, 312, 315–317, 319
SbfI-HF ........ 167, 179
scATAC ........ 4, 17, 188, 189, 212, 223, 249–251, 254, 255, 260–262, 264, 265, 270, 271, 273–275, 278, 280
scDNase ........ 40
Schizosaccharomyces pombe ........ 124, 125, 127, 128, 130–132, 135, 138–140, 142, 144, 146, 147, 149
sciATAC-seq ........ 189
sci-CAR-seq ........ 189
SciPy ........ 104
sci-RNA-seq ........ 188
scNMT-seq ........ 189
scNOMe-seq ........ 189, 231–246
Scythe ........ 32
Secondary antibodies ........ 287, 289
Seurat ........ 195, 212, 217, 225, 264
SHARE-seq ........ 187–227
Shearing ........ 104, 106, 108, 123–125, 128, 133, 141, 144, 146–151
Signac ........ 264, 276, 281
Simian Virus 40 (SV40) ........ 21, 22, 24–27, 32, 33, 35–37
Single-cell ........ 4, 40, 55, 63, 155, 187, 231, 249, 269, 326
Single-cell RNA-seq (scRNA-seq) ........ 188, 189, 212, 224, 249, 250, 262, 265
Single-end ........ 13, 82, 110, 315–318, 320
Single molecule footprinting (SMF) ........ 101–103, 105, 107–110, 115, 116
Single nucleotide polymorphism (SNP) ........ 332
SMC ........ 64
SnapATAC ........ 276
SNARE-seq ........ 189
Sodium acetate ........ 88, 93, 94
Sodium chloride (NaCl) ........ 5, 6, 41, 42, 55, 65, 75, 86, 88, 89, 91, 94, 102, 103, 157, 177, 192, 194, 195, 199, 201, 202, 234, 252, 256, 271, 286–289, 296–298
Sodium dodecyl sulfate (SDS) ........ 33, 42, 44, 48, 57, 58, 75, 88, 91, 92, 94, 128, 140, 177, 182, 192, 194, 287, 289, 290, 298
Sodium hydroxide (NaOH) ........ 103, 286
Somatic mutation ........ 271
Sonication ........ 24, 40, 45–48, 58, 91, 123, 124, 288
Sonicator ........ 42, 89, 109, 128, 133, 288, 297
Sorbitol ........ 126, 129
S phase ........ 72
Split-pool ........ 188, 198
SPRI beads ........ 166, 167, 177–180, 183, 209, 235, 241, 243, 245, 253
STAR ........ 195, 214–216
Streptavidin ........ 42, 45, 49–51, 73, 74, 79, 84, 88, 93, 94, 103, 109, 195, 196, 297, 299, 300
Sub-library ........ 177, 182, 183, 206
Subnucleosomal ........ 14, 17, 72, 82, 221, 222
Subsampling ........ 312, 313, 315, 320
Sucrose ........ 16, 41, 75, 103, 157, 296
SUPERase IN ........ 157, 164, 165
SYBR Green ........ 6, 9, 13, 24, 27, 30, 193, 208

T
TAE buffer ........ 31
tagAlign ........ 308–314, 317, 318
Tagmentation ........ 65, 73, 75, 156, 164, 165, 174, 175, 179, 180, 182, 196, 199, 209, 210, 273, 287, 289
TapeStation ........ 6, 11, 12, 17, 66, 67, 104, 109, 128, 167, 183, 191, 193, 209, 210, 254, 258
TD buffer ........ 4, 6, 8, 77, 193, 194, 210, 287, 289
TDE1 ........ 77
T4 DNA ligase ........ 165, 173, 176, 179, 192, 204, 205
T7 DNA ligase ........ 88, 89, 94
TEA-seq ........ 189
TE buffer ........ 42, 44–47, 49, 51, 56, 74, 103, 128, 130–135, 198, 199, 206, 252, 271, 287, 288, 297–300
Template switching oligo (TSO) ........ 190, 208
Tensorflow ........ 326
Terminal transferase ........ 166, 177
Texas Red ........ 294–296
TF-MODisco ........ 326, 328
Thermocycler ........ 10, 73, 77, 94, 129, 133, 134, 157, 165, 167, 173, 175, 177–181, 198, 199, 203, 208, 237, 239, 241, 242, 245, 246, 288
Thermomixer ........ 6, 8, 56, 58, 73, 82, 104, 107, 165, 166, 175–177, 191
THPTA ........ 74, 78
Tissue dissociation ........ 193, 200, 226
Tn5 ........ 4–6, 8, 9, 15–17, 40, 63, 65, 68, 73, 86, 89–92, 96, 155–157, 165, 173, 175, 179, 184, 188, 193, 196, 199, 203, 210, 227, 251, 270, 280, 285–290, 294, 305, 317
Topologically associating domains (TADs) ........ 85
TotalSeq ........ 252, 259, 260, 262, 265, 266
Trac-looping ........ 85–96
Transcription factor ........ 3, 32, 37, 71, 72, 106, 116, 117, 232, 270
Transcription factor binding sites (TFBS) ........ 32, 37
Transcription start sites (TSSs) ........ 14, 114–116, 281, 321
Transcriptome ........ 155–184, 187–227, 249
Transposase ........ 4–6, 8, 15, 16, 23, 40, 63, 71, 73, 77, 83, 86, 92, 96, 156, 193, 195, 196, 250, 270, 285–290, 305, 309, 315, 317
Transposome ........ 40, 199–200, 210, 288–289
TrimGalore ........ 82, 104
Trimmomatic ........ 104, 110
Tris-Ac (Tris-acetate) ........ 89, 128, 164, 192, 202
Tris-HCl ........ 5, 6, 41, 42, 55, 65, 74, 75, 80, 86, 88, 89, 91, 102, 103, 128, 134, 157, 193, 194, 201, 202, 234, 235, 252, 256, 271, 287, 289, 296, 297
Triton-X100 ........ 42, 45, 46, 49, 50, 57, 88, 89, 164, 165, 175, 287, 289, 297, 299
TruSeq ........ 167, 180, 181, 253
Trypan Blue ........ 41, 191, 201, 273
TrypLE ........ 41, 43
Trypsin ........ 43, 56, 74–76, 82
TSS enrichment ........ 219, 220, 222, 223, 314, 321, 322
TSS score ........ 220
Tween-20 ........ 4, 6, 16, 26, 42, 74, 75, 88, 102, 192, 194, 201, 202, 252, 255, 256, 278, 287, 301

U
UCSC Genome Browser ........ 104, 195, 219
UCSC tools ........ 306
Unique molecular identifier (UMI) ........ 197, 211, 212, 214, 216, 224, 225, 260, 261
Universal nicking enzyme-assisted sequencing (UniNicE-seq) ........ 49, 51, 294
Uracil ........ 105, 108, 127
USER enzyme ........ 28, 46, 129, 134, 299

V
Visualization ........ 112, 293–301, 306, 319

X
10x Genomics ........ 253, 255–257, 265, 271–273, 278, 279

Y
Yeast ........ 7, 17, 74, 105, 106, 116, 127, 130, 134, 138, 286
Yeast extract ........ 74, 127

Z
Zymo Clean & Concentrate ........ 15, 208
Zymolyase ........ 17, 127, 129, 130, 139