Methods Guide For Cancer Research: For Research Use Only. Not For Use in Diagnostic Procedures


Informatics
Informatics tools enable critical insights. Illumina recommends using the DRAGEN™ pipelines on BaseSpace™ Sequence Hub, a DRAGEN Server, or onboard the NextSeq™ 1000 and NextSeq 2000 Systems for push-button analysis.

Sequencing
Illumina offers a comprehensive portfolio of next-generation sequencing (NGS) instruments that are accessible and scalable for virtually every lab.

Library prep
Illumina library prep kits enable a broad range of methods and applications.

Infinium™ BeadChips
Powerful, high-throughput genotyping and methylation products with off-the-shelf and custom options
For Research Use Only. Not for use in diagnostic procedures. M-GL-00029 v01
Illumina genomic analysis techniques for cancer researchers
Take your research to the next level
The introduction of NGS has transformed the way scientists study biological systems. With clear benefits, such as
reduced time and cost compared to legacy technologies and the capacity to scale from small studies to population-
level throughput, NGS opens the door to a broad range of research capabilities.
Deep sequencing provides the sensitivity to detect low-frequency molecular events, uncovering the somatic variants
behind the tumor. These studies can be instrumental to understanding how changes at the DNA, RNA, protein, or
cellular level contribute to tumor initiation, growth, and metastasis. Single-cell techniques allow researchers to go
beyond bulk measurements to understand how different cells within the tumor microenvironment promote or inhibit
cancer progression.
NGS has already expanded our knowledge of cancer as a disease of the genome. More recent techniques are poised
to shed even more light within oncology. With the advantages of speed, sensitivity, and scale, NGS can take your
cancer research to the next level.
In addition to NGS, microarrays are a valuable tool for variant detection, from discovery applications to routine
screening. Powered by widely adopted Infinium technology, Illumina microarrays provide the trusted data quality
needed to accelerate research. The scalable, multi-sample format supports labs conducting large studies or
institutions processing high sample volumes, making it easy to keep pace with demand.
Overview of cancer research workflows

Simple, comprehensive workflows for a broad range of cancer research applications
NGS-based workflows
Library prep
Library prep kits are available for a range of applications, including exome, transcriptome, and whole-genome
sequencing (WGS).
Sequencing
Illumina offers a full portfolio of sequencing platforms, from the benchtop iSeq™ 100 System to the production-scale
NovaSeq™ 6000 System, that deliver the right level of speed, capacity, and cost for various laboratory or sequencing
center requirements. Illumina has pioneered major advances in sequencing simplicity, flexibility, and platform
performance. Experiments that used to require complex workflows now use simple push-button workflows.
Data analysis
Illumina cancer research workflows using the DRAGEN pipelines provide user-friendly data analysis tools that are
easily accessible through the web with BaseSpace Sequence Hub or onboard the NextSeq 1000 and NextSeq 2000
Systems for push-button analysis. Additionally, BaseSpace Sequence Hub enables integration of many workflow
steps, including library prep planning and sample management, run set-up and chemistry validation, and real-time
data monitoring and data transfer to computing and analysis modules.
Microarray workflows
Illumina offers ready-to-use microarrays with expertly-designed content. BeadChips can be processed and scanned
using Illumina systems as part of a rapid and automation-compatible DNA-to-data workflow.
Methods
Single-cell sequencing to understand tumor evolution and the microenvironment
Understand the molecular drivers of cancer at the DNA, RNA, epigenetic, and protein level at single-cell resolution.

Spatial analysis to probe gene expression in situ
Combine molecular profiling with immunofluorescence imaging for multiomic analyses within the native morphology.

ATAC-Seq to determine chromatin accessibility across the genome
Map the chromatin regulatory landscape in cancers and elucidate the role of epigenetics in cancer progression.

BEN-Seq to simultaneously assay protein and RNA
Interrogate tumor heterogeneity to characterize tumor cells, infiltrating immune cells, and the microenvironment.

T-cell receptor sequencing to interrogate immune function
Characterize tumor-reactive T-cells to understand the central players in immunosurveillance.

WGS to identify a comprehensive list of genetic drivers of cancer progression
Identify cancer-driving genetic events beyond protein-coding variants.

Mutational profiling for neoantigen prediction
Identify tumor-specific peptides that may be capable of inducing an immune response.

Methylation arrays for comprehensive coverage of sites genome-wide
Quantitatively interrogate methylation sites across the genome at single-nucleotide resolution.
Method 1: Single-cell sequencing
Understand the molecular drivers of cancer at single-cell resolution.
Key benefits over bulk sequencing:
• Detect functional cell populations in the tumor microenvironment
• Uncover the B-cell receptor (BCR) and T-cell receptor (TCR) sequences of individual tumor lymphocytes
• Understand the effects of epigenetic heterogeneity in cancer progression
• Construct the evolution of somatic variants from tumor samples
Specimens
• Easy: Fresh
• Difficult: Frozen
Tissue preparation
• Tissue dissociation: Mechanical, enzymatic, or combinatorial
Single-cell isolation and library prep
• Single-cell isolation: FACS, droplet fluidics platforms, microwells
• Library prep, transcriptome: full-length RNA-Seq, mRNA end-tag amplification, targeted panels
• Library prep, genome: MALBAC, DOP-PCR, targeted panels
• Library prep, epigenome: ATAC-Seq, HiC
• Library prep, proteome: AbSeq, CITE-Seq, REAP-Seq, BEN-Seq
Sequencing
• NovaSeq 6000 System
• NextSeq 1000 and NextSeq 2000 Systems
Analysis, visualization, and interpretation
• Multiple methods
• Open-source: Seurat, Monocle, velocyto, chromVAR
• Commercial: DRAGEN Single-Cell RNA Pipeline, 10x Loupe Browser, 10x Cell Ranger ATAC, Tapestri Insight, SeqGeq Software, Partek Flow
Training
• Customer site training: N/A
Service contracts
• Tiered service plans: NovaSeq 6000 Plans, NextSeq 2000 Plans
Potential applications of single-cell cancer studies
Identify biomarkers associated with immunotherapy response

Immune checkpoint blockade has been shown to induce durable responses in
patients across multiple types of cancer. Unfortunately, only a fraction of patients
respond to this type of therapy. The antitumor immune response is a complex
mechanism. Identification of new biomarkers may help to differentiate responders
from nonresponders.
Identify specific tumor-reactive T-cell profiles

Immunosurveillance is mediated through the CD8+ T-cell recognition of neoantigens
presented on the surface of tumor cells. This recognition is mediated by T-cell
receptors whose diversity, originating through V(D)J recombination, is a hallmark of
the adaptive immune response. Using single-cell RNA-Seq techniques to elucidate
specific TCR sequences that drive neoantigen recognition, along with clonal
expansion and TCR diversity, may help drive an understanding of the central players
in immunosurveillance and immunotherapy.
Elucidate the role of epigenetics in drug resistance

While the key role of epigenetics in tumor progression has been known for quite
some time, cell-to-cell heterogeneity of the epigenome may have profound effects on
tumor progression and most notably, drug resistance. Drug-resistant tumor clones
have been known to co-opt epigenetic pathways to suppress gene expression that
drives drug sensitivity. Single-cell assays for transposase-accessible chromatin using
sequencing (ATAC-Seq) can be used to detect this kind of heterogeneity to drive a
deeper understanding of drug resistance mechanisms.
Uncover the genetic drivers of relapse

One barrier to effective cancer treatment paradigms is the failure to prevent relapse
after the initial treatment, even after complete remission. Using targeted single-cell
DNA sequencing (scDNA-Seq) of clones present in both pretreatment and relapsed
sample settings, the molecular hallmarks of relapse can be elucidated.
Single-cell RNA Sequencing (scRNA-Seq)
This guide presents an overview of scRNA-Seq in cancer research. For a complete overview of single-cell sequencing
techniques, download the single-cell eBook at www.illumina.com/single-cell-rna-sequencing.
Step 1 Single-cell tissue preparation
• Tissue preparation from solid tumors can be difficult, as the goal is to obtain viable, individualized fresh cells
• For RNA-Seq, this difficulty can translate into artifacts, as tissue manipulation can lead to transcriptional changes
• In cancer research, isolation of nuclei alone has also been employed, which enables preparation of fixed
tissues and reduces dependence on cell viability and integrity
• In general, after dissociation, fluorescence-activated cell sorting (FACS) to isolate single cells has been used,
especially with tumor samples
Step 2 Single-cell isolation
Advances in microfluidics technologies have enabled high-throughput single-cell profiling, where researchers can
cost-effectively examine hundreds to tens of thousands of cells per experiment. This guide only covers these high-
throughput technologies; for low-throughput options, consult the single-cell eBook.
Method: Droplet fluidics platforms
• Example commercial offerings: 10x Genomics Chromium Controller, Mission Bio Tapestri Platform, Bio-Rad ddSEQ Single-Cell Isolator
• Advantages: Unique molecular identifiers (UMIs) and cell barcodes enable cell- and gene-specific identification, low cost per cell, and extensive support from commercial providers
• Disadvantages: Requires specialized equipment and poses technical challenges
Method: Microwells
• Example commercial offerings: BD Rhapsody Single-Cell Analysis System, Celsee Genesis System
• Advantages: Supports imaging and short-term culture of cells; ideal for adherent cells
• Disadvantages: Uses emerging technology and offers limited commercial solutions
Step 3 Library prep

Single-cell method: RNA-Seq
• Description: Capture of mRNA by 3' polyadenylation (poly(A)) tails enables sequencing of the coding transcriptome with strand-specific information
• Example commercial offerings: 10x Genomics Chromium Single Cell Gene Expression Solution (3' WTA), Bio-Rad SureCell™ WTA 3' Library Prep Kit for the ddSEQ System
Single-cell method: Immune-repertoire-seq (IR-Seq)
• Description: IR-Seq is a targeted sequencing method used to quantify the composition of B- or T-cell antigen receptor repertoires
• Example commercial offering: 10x Genomics Chromium Single Cell Immune Profiling Solution
Single-cell method: ATAC-Seq
• Description: Understand genome-wide chromatin accessibility within each cell
• Example commercial offerings: 10x Genomics Chromium Single Cell ATAC Solution, Bio-Rad SureCell ATAC-Seq Library Prep Kit
Single-cell method: Targeted DNA-Seq
• Description: Simultaneously detect single nucleotide variants and copy number variants from the same cells
• Example commercial offering: Mission Bio Tapestri Platform
Single-cell method: AbSeq
• Description: DNA-tagged antibodies enable protein profiling by NGS
• Example commercial offering: BD AbSeq Assay
Single-cell method: CITE-Seq
• Description: Cellular Indexing of Transcriptomes and Epitopes by Sequencing (CITE-Seq) uses oligonucleotide-labeled antibodies to convert protein detection into a quantitative assay by NGS
• Reference: Stoeckius M, Hafemeister C, Stephenson W et al. Simultaneous epitope and transcriptome measurement in single cells. Nat Methods. 2017;14:865-868.
Single-cell method: REAP-Seq
• Description: RNA expression and protein sequencing (REAP-Seq) uses antibodies conjugated to DNA barcodes to measure protein levels at the single-cell level
• Reference: Peterson VM, Zhang KX, Kumar N et al. Multiplexed quantification of proteins and transcripts in single cells. Nat Biotechnol. 2017;35:936-939.
Step 4 Sequencing
The number of cells being evaluated varies depending on the experimental study design. For illustrative purposes,
this guide uses 5000 cells per experiment.
Single-cell method: Reads per cell/nucleus
• RNA-Seq: 20K
• IR-Seq: 5K
• ATAC-Seq: 50K
• Targeted DNA-Seq: Depends on the panel size

Product: NextSeq 1000 and NextSeq 2000 Systems
• Most important to me: Instrument affordability and desktop footprint
• No. of experiments/flow cell, RNA-Seq: 4 (P2 flow cell), 12 (P3 flow cell)
• No. of experiments/flow cell, IR-Seq: 16 (P2 flow cell), 48 (P3 flow cell)
• No. of experiments/flow cell, ATAC-Seq: 1 (P2 flow cell)
Product: NovaSeq 6000 System
• Most important to me: Low cost/sample
• No. of experiments/flow cell, RNA-Seq: 6–8 (SP flow cell), 13–16 (S1 flow cell), 33–41 (S2 flow cell), 80–100 (S4 flow cell)
• No. of experiments/flow cell, IR-Seq: 26–32 (SP flow cell), 52–64 (S1 flow cell), 132–164 (S2 flow cell)
• No. of experiments/flow cell, ATAC-Seq: 2–3 (SP flow cell), 5–6 (S1 flow cell), 32–40 (S4 flow cell)
DNA-Seq (a)
a. Using the Mission Bio Tapestri Single-Cell DNA ALL Panel as an example (includes 107 genes with 283 amplicons)
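The experiment counts above follow from simple arithmetic: the reads needed for one experiment equal the number of cells multiplied by the reads per cell, and the number of experiments per flow cell is the flow cell output divided by that figure. A minimal Python sketch of the calculation, using the 5000-cell example and the reads-per-cell values from this guide. The flow cell outputs (400M reads for P2, 1.2B reads for P3) are assumptions chosen for illustration and should be checked against current instrument specifications.

```python
# Reads per cell/nucleus, taken from the table in this guide.
READS_PER_CELL = {"RNA-Seq": 20_000, "IR-Seq": 5_000, "ATAC-Seq": 50_000}

# ASSUMPTION: approximate flow cell outputs in reads, for illustration only.
ASSUMED_FLOW_CELL_READS = {"P2": 400_000_000, "P3": 1_200_000_000}


def reads_per_experiment(method: str, cells: int = 5000) -> int:
    """Total reads needed for one experiment of `cells` cells."""
    return READS_PER_CELL[method] * cells


def experiments_per_flow_cell(method: str, flow_cell: str, cells: int = 5000) -> int:
    """Whole experiments that fit on one flow cell (rounded down)."""
    return ASSUMED_FLOW_CELL_READS[flow_cell] // reads_per_experiment(method, cells)


if __name__ == "__main__":
    for method in READS_PER_CELL:
        print(method,
              reads_per_experiment(method),
              experiments_per_flow_cell(method, "P2"))
```

Under these assumed outputs, the sketch gives 4 and 12 RNA-Seq experiments and 16 and 48 IR-Seq experiments on P2 and P3 flow cells. Changing the `cells` argument shows how quickly the per-flow-cell count falls as experiments grow.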
Step 5 Single-cell analysis and visualization
After the single-cell sequencing run is complete, downstream analysis can be performed. Generally, the analysis
pipeline for single-cell sequencing experiments involves three phases: primary analysis (base calling), secondary
analysis (demultiplexing, alignment, and genetic characterization), and tertiary analysis (data visualization and
interpretation). There is no one correct way to carry out an analysis pipeline for single-cell sequencing experiments.
Many approaches and software programs are available for each step in the pipeline. The research objective, single-
cell isolation platform, and general lab considerations will largely determine the specific pipeline used. For more
information on analysis, consult the single-cell eBook.
Primary analysis: File conversion (.bcl file to .fastq file)
1 File conversion
Raw data files (BCL) are converted to FASTQ format for downstream analysis.

Secondary analysis: Demultiplexing (if applicable), sequence alignment, data set QC and filtering, and initial genetic characterization
2 Demultiplexing
If the samples were multiplexed for sequencing, resulting read files are demultiplexed prior to downstream analyses.
3 Sequence alignment
The reads are mapped and aligned to a reference genome.
4 Data set QC and filtering
Noncellular barcodes and low-quality cells are excluded from downstream analysis by various metrics.
5 Genetic characterization
Quality-controlled data sets are analyzed for genomic variants, gene expression, chromatin accessibility, protein expression, etc.

Tertiary analysis: Data visualization and interpretation
6 Data visualization
Multidimensional data plots enable the clustering of cells and identification of subpopulations.
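The QC and filtering step (step 4) can be sketched as a filter over per-barcode summary metrics. This is an illustration only: the field names and the three thresholds (`min_umis`, `min_genes`, `max_mito_fraction`) are hypothetical, and suitable cutoffs depend on the isolation platform and tissue.

```python
from dataclasses import dataclass


@dataclass
class BarcodeStats:
    barcode: str
    umi_count: int        # total UMIs observed for this barcode
    gene_count: int       # number of genes with at least one UMI
    mito_fraction: float  # fraction of UMIs from mitochondrial genes


def filter_cells(stats, min_umis=500, min_genes=200, max_mito_fraction=0.2):
    """Return the barcodes that pass all three checks.

    The default thresholds are hypothetical, not recommendations.
    """
    kept = []
    for s in stats:
        if s.umi_count < min_umis:
            continue  # likely an empty droplet or noncellular barcode
        if s.gene_count < min_genes:
            continue  # too few genes detected: low-quality cell
        if s.mito_fraction > max_mito_fraction:
            continue  # a high mitochondrial share often marks damaged cells
        kept.append(s.barcode)
    return kept


if __name__ == "__main__":
    demo = [
        BarcodeStats("AAAC", 4200, 1500, 0.05),
        BarcodeStats("CCGT", 40, 25, 0.02),
        BarcodeStats("GGTA", 3000, 900, 0.45),
    ]
    print(filter_cells(demo))
```

In the demo, only the first barcode passes: the second has too few UMIs and the third exceeds the mitochondrial cutoff.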
Downstream tertiary analysis and visualization solutions
There are many options for single-cell tertiary analysis tools, including open-source analysis tools developed
by academic labs in popular programming languages like R and Python, ‘plug-and-play’ packages that allow
researchers to use preconfigured analysis workflows, and commercial offerings. The tools chosen will depend on the
research goals and experimental objectives.
Open-source bioinformatics tools (freeware)

Seurat
Seurat is an R-based scRNA-Seq analysis software designed to assess cellular heterogeneity using normalization, dimensionality reduction approaches, plots, heat maps, and data integration tools.
Learn more: satijalab.org/seurat/archive/v3.0/integration.html

Monocle
Monocle is an R-based scRNA-Seq analysis software designed to determine cell developmental trajectory. Monocle is ideal for experiments where there are known beginning and terminal cell states.
Learn more: cole-trapnell-lab.github.io/monocle-release/

velocyto
The velocyto software package enables analysis of gene expression dynamics in scRNA-Seq data to estimate RNA velocity, which describes the rate of change in gene expression for a particular gene at a specific point in time.
Learn more: velocyto.org/

chromVAR
chromVAR is an open-source R package for analyzing variations in chromatin accessibility in scATAC-Seq data to identify associated motifs or genomic annotations.
Learn more: github.com/GreenleafLab/chromVAR

CITE-seq-Count and CiteFuse
Bioinformatics tools such as CITE-seq-Count and CiteFuse can be applied to various single-cell protein profiling methods to support data analysis and visualization.
Learn more: cite-seq.com/

Commercially available bioinformatics tools

DRAGEN Single-Cell RNA Pipeline
The DRAGEN Single-Cell RNA Pipeline processes multiplexed single-cell RNA-Seq data sets from reads to a cell-by-gene UMI count gene expression matrix. The pipeline includes: RNA-Seq (splice-aware) alignment and matching to annotated genes for transcript reads, cell-barcode and UMI error correction for the barcode read, UMI counting per cell and gene to measure gene expression, and sparse matrix output and QC metrics.
Learn more: support.illumina.com/content/dam/illumina-support/help/Illumina_DRAGEN_Bio_IT_Platform_v3_7_1000000141465/Content/SW/Informatics/Dragen/SingleCellRNA_fDG.htm

10x Loupe Browser
The 10x Loupe Browser is a desktop application that allows researchers to visualize and analyze single-cell sequencing data, including 10x Chromium scRNA-Seq and scATAC-Seq data. The software enables users to find significant genes, cell types, and substructure within scRNA-Seq data quickly and interactively.
Learn more: support.10xgenomics.com/genome-exome/software/visualization/latest/what-is-loupe

10x Cell Ranger ATAC
Cell Ranger ATAC is a set of analysis pipelines designed to process 10x Chromium Single Cell ATAC data. The software performs primary through tertiary analysis, including dimensionality reduction, cell clustering, and clustering of regions of differential accessibility.
Learn more: support.10xgenomics.com/single-cell-atac/software/pipelines/latest/what-is-cell-ranger-atac

Tapestri Insight
Tapestri Insight is a software solution for analyzing Tapestri single-cell DNA sequencing data that includes sequence import, data analysis, and visualization capabilities. The software enables variant identification, including SNVs and copy number variants (CNVs), at the clonal and subclonal level.
Learn more: support.missionbio.com/hc/en-us/articles/360047770774-Introduction-to-Tapestri-Insights

Bio-Rad SureCell ATAC-Seq Analysis Toolkit
The SureCell ATAC-Seq Analysis Toolkit is used with the SureCell ATAC-Seq Library Prep Kit to enable epigenetic analysis of single cells genome-wide. The software can estimate gain and loss of chromatin accessibility within peaks, cluster scATAC-Seq profiles, characterize sequence motifs associated with gene expression, and identify cis- and trans-acting elements that are the source of diverse cellular phenotypes.
Learn more: www.bio-rad.com/webroot/web/pdf/lsr/literature/Bulletin_7191.pdf

SeqGeq v1.6 Software
SeqGeq Software is a platform-agnostic desktop bioinformatics platform designed for the analysis of single-cell experiments. It includes a wide suite of informatics features, including V(D)J analysis, Seurat clustering, Monocle trajectory inference, and more. It allows researchers to perform advanced analysis, data exploration, and visualization with an easy-to-use drag-and-drop interface.
Learn more: www.flowjo.com/solutions/seqgeq

Partek Flow
Partek Flow allows researchers to quickly and reliably discover biological meaning in single-cell genomic data, and is compatible with multiple single-cell applications, including scRNA-Seq, CITE-Seq, cell hashing, spatial transcriptomics, and more.
Learn more: www.partek.com/partek-flow/
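The DRAGEN Single-Cell RNA Pipeline entry above mentions UMI counting per cell and gene, with a sparse matrix as output. The idea can be illustrated with a minimal sketch that is not the DRAGEN implementation: reads sharing the same cell barcode, gene, and UMI are collapsed to one molecule, and only nonzero (cell, gene) entries are stored. The input tuples are hypothetical, and cell-barcode and UMI error correction are deliberately omitted.

```python
from collections import defaultdict


def count_umis(reads):
    """Collapse reads to unique molecules and count them per (cell, gene).

    `reads` is an iterable of (cell_barcode, umi, gene) tuples, assumed to
    be already aligned and assigned to genes. The returned dict is keyed by
    (cell_barcode, gene) and holds only nonzero entries, which is the
    essence of a sparse count matrix.
    """
    molecules = defaultdict(set)
    for cell, umi, gene in reads:
        # PCR duplicates share cell, gene, and UMI, so the set keeps one copy.
        molecules[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in molecules.items()}


if __name__ == "__main__":
    demo_reads = [
        ("AAAC", "TTGCA", "CD8A"),
        ("AAAC", "TTGCA", "CD8A"),  # duplicate of the read above
        ("AAAC", "GGACT", "CD8A"),
        ("CCGT", "TTGCA", "GZMB"),
    ]
    print(count_umis(demo_reads))
```

Counting unique UMIs rather than reads is what lets the matrix reflect molecules captured instead of amplification depth.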
Method 2: Spatial transcriptomics analysis
Perform multiomic profiling of tumors within the native morphological structure and context.
Key benefits over single parameter methods:
• Preserve spatial context for gene expression studies without the need for tissue dissociation
• Combine molecular profiling with immunofluorescence or histochemical staining and imaging on the same sample
• Obtain results rapidly with streamlined, automation-compatible workflows
Nanostring GeoMx Digital Spatial Profiler
• Specimens: Fresh frozen (FF) or formalin-fixed, paraffin-embedded (FFPE) tumor samples + normal
• Prepare samples, image, and prepare libraries: Nanostring GeoMx Digital Spatial Profiler (DSP)
• Sequence: NovaSeq 6000 System, NextSeq 1000 and NextSeq 2000 Systems
• Data visualization: Nanostring GeoMx Digital Spatial Profiling with DRAGEN on BaseSpace Sequence Hub

10x Genomics Visium Spatial Gene Expression
• Specimens: FF or FFPE tumor samples + normal
• Prepare samples, image, and prepare libraries: 10x Genomics Visium Spatial Gene Expression
• Sequence: NovaSeq 6000 System, NextSeq 1000 and NextSeq 2000 Systems
• Data visualization: 10x Genomics Space Ranger and Loupe Browser

Training: N/A
Service contracts: NovaSeq 6000 and NextSeq 2000 Plans
Potential applications of spatial genomics analysis in cancer studies
Identify biomarkers related to tissue location

Cellular position within a tissue or tumor is lost in traditional bulk analysis. Spatial
genomics analysis preserves morphological context to enable identification and
profiling of biomarkers related to location to find novel potential therapeutics.
Map unique tumor and tumor microenvironment molecular profiles

Tissue dissociation before traditional methods for bulk analysis destroys the
morphological context of a tissue. Spatial genomics analysis enables delineation and
mapping of the tumor and its surrounding microenvironment.
Track and profile tumor infiltrating lymphocytes

Tumor-infiltrating lymphocytes (TILs) have been shown to correlate with improved
clinical outcomes in various cancers; however, the ability to monitor TILs can
be challenging with bulk analysis methods. Spatial genomics analysis provides
identification and molecular profiling of TILs within the morphological context of the
tumor for highly accurate and detailed assay of immune function in response to
cancer.
Step 1 Sample preparation
Spatial analysis is performed on tissue sections mounted on slides. Illumina recommends following the respective
protocol for either the 10x Genomics Visium Spatial Gene Expression or Nanostring GeoMx DSP for sample
preparation. Options are available for targeted gene expression analysis.
Provider: 10x Genomics Visium
• Spatial Gene Expression: Captures whole-transcriptome gene expression
• Human Pan-Cancer Panel: Targets ~1200 genes associated with tumors, the tumor microenvironment, and immune response
Provider: Nanostring GeoMx DSP
• GeoMx Whole Transcriptome Atlas: Enables profiling of >18K protein-coding genes
• GeoMx Cancer Transcriptome Atlas: Enables profiling of >1800 genes related to immune response, tumor biology, and the microenvironment
Step 2 Imaging
Product: Nanostring GeoMx DSP
• Most important to me: Imaging for both protein and RNA using RNAscope
Product: 10x Genomics Visium
• Most important to me: Imaging can be performed with a standard microscope
Step 3 Library preparation
Illumina recommends following the respective manufacturer’s protocol for library preparation.
Product: Nanostring GeoMx DSP
• Minimum input requirements: Sequence a region of interest (ROI)
• Sample type: Fresh frozen or FFPE
Product: 10x Genomics Visium
• Minimum input requirements: Sequence everything
• Sample type: Fresh frozen or FFPE
Step 4 Recommended sequencing systems
Systems: NextSeq 1000 and NextSeq 2000 Systems, NovaSeq 6000 System
10x Genomics no. of reads/flow cell
• 150M reads per section
• 4 sections per slide
Nanostring no. of reads/flow cell
• 200M reads per 24 ROIs for whole-transcriptome atlas
• 50M reads per 24 ROIs for cancer transcriptome atlas
Recommended read length for 10x Genomics
• NextSeq 1000 and NextSeq 2000 Systems: 2 × 151 bp
• NovaSeq 6000 System: 2 × 151 bp
Recommended read length for Nanostring
• NextSeq 1000 and NextSeq 2000 Systems: 2 × 35 bp
• NovaSeq 6000 System: 2 × 35 bp
Step 5 Visualization
Illumina recommends using the respective analysis software for each manufacturer.
10x Genomics Space Ranger
• Application: Automated overlay of spatial gene expression data on tissue images and analysis
• Input: 10x Genomics Visium Spatial Gene Expression data files
10x Genomics Loupe Browser
• Application: Visual exploration of spatial expression data
• Input: Data output from Space Ranger software
Nanostring GeoMx DSP Data Analysis
• Application: Automated analysis and visualization of spatial data
• Input: Nanostring GeoMx data files
Method 3: Assay for Transposase-Accessible Chromatin Sequencing (ATAC-Seq)
Determine chromatin accessibility across the genome, without prior knowledge of regulatory elements.
Key benefits over traditional methods for assessing chromatin accessibility, such as chromatin immunoprecipitation
sequencing (ChIP-Seq), formaldehyde-assisted isolation of regulatory elements sequencing (FAIRE-Seq), or DNase I
hypersensitive sites sequencing (DNase-Seq):
• Avoid sensitive enzymatic digestion or rigorous validation of antibodies by using the Tn5 transposase
• Obtain results rapidly, with a streamlined workflow that can be completed in < 3 hours
• Interrogate precious samples with inputs as low as 500 cells (≥ 50,000 cells recommended)
Specimens
• FF or FFPE
Sample prep, tagmentation, and library prep
• ≥ 50,000 cells or nuclei is recommended
• Inputs as low as 500 cells or nuclei are possible
Sequence
• NovaSeq 6000 System
• NextSeq 1000 and NextSeq 2000 Systems
Analysis and peak calling
• Alignment: BWA, Bowtie2
• Peak calling: Genrich, MACS2
Visualization
• Integrative Genomics Viewer (IGV)
• UCSC Genome Browser
Service contracts: NextSeq 2000 Plans
Potential applications of ATAC-Seq in cancer studies
Interrogate the chromatin regulatory landscape in cancers

Mutations in regulatory elements that affect transcription factor binding can result
in increased or decreased chromatin accessibility to help drive tumor formation or
cancer progression. ATAC-Seq enables surveillance of chromatin accessibility to
identify patterns of gene regulation specific to different cancer types.
Map nucleosome redistribution during cancer progression

Nucleosomes are the structural subunit of chromatin, enabling packaging of long DNA
molecules into more compact forms, and play a central role in epigenetic regulation.
Changes in nucleosome positioning throughout the genome have been associated
with dysregulation of gene expression that can contribute to tumor formation and
cancer progression. ATAC-Seq provides genome-wide mapping of nucleosome
distribution to help understand the etiology of cancer.
Elucidate the role of epigenetics in drug resistance

Epigenetics plays a key role in tumor progression and drug resistance. Drug-resistant
tumor clones have been known to co-opt epigenetic pathways to suppress gene
expression that drives drug sensitivity. ATAC-Seq can be used to profile these
epigenetic changes to drive a deeper understanding of drug resistance mechanisms.
ATAC-Seq is performed on intact cells or nuclei, not isolated genomic DNA. Illumina recommends input requirements
for ATAC-Seq of ≥ 50,000 cells or nuclei. While lower inputs are possible (as few as 500 cells), they may cause issues
with low library complexity.
Step 2 Tagmentation and amplification
Illumina recommends following the protocol outlined in ATAC-seq: A Method for Assaying Chromatin Accessibility
Genome-Wide. IDT for Illumina Nextera DNA Unique Dual Indexes can be incorporated during PCR amplification to
generate a sequencing-ready library.
Product: IDT for Illumina Nextera DNA Unique Dual Indexes
• Minimum input requirements: ≥ 50,000 cells or nuclei
• Total library prep time: < 3 h
• Sample type: FF or FFPE
• Sample index sets: 384 unique dual indexes

Samples/flow cell (a)
• NextSeq 1000 and NextSeq 2000 Systems: 8 (P2 flow cell), 20 (P3 flow cell)
• NovaSeq 6000 System: 16/4 (SP flow cell), 32/8 (S1 flow cell), 82/20 (S2 flow cell), 200/50 (S4 flow cell)
Recommended read length
• NextSeq 1000 and NextSeq 2000 Systems: 2 × 151 bp
• NovaSeq 6000 System: 2 × 151 bp
a. Illumina recommends 50M+ reads for assaying differences in chromatin accessibility and ≥ 200M reads for transcription factor footprinting.
Step 4 Primary and secondary analysis
Illumina recommends using the BWA Aligner app on BaseSpace Sequence Hub for primary sequencing data analysis
and the third-party Genrich or MACS2 software applications for secondary analysis.
Primary analysis: alignment
• Software: Burrows Wheeler Aligner (BWA), Bowtie2
• Application: Alignment to reference genome
• Input: FASTQ files
Secondary analysis: peak calling
• Software: Genrich, Model-based Analysis of ChIP-Seq (MACS2)
• Application: Analysis of alignment files to call peaks of significant enrichment
• Input: Output files from alignment application
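Peak calling looks for regions where aligned fragments pile up. The following is a deliberately simplified sketch of that idea using a fixed coverage threshold. It is not the statistical model used by Genrich or MACS2, and the function name, the interval format (half-open start and end positions on a single chromosome), and the threshold are assumptions made for illustration.

```python
from collections import defaultdict


def call_peaks(fragments, min_coverage=3):
    """Return (start, end) regions covered by at least `min_coverage` fragments.

    `fragments` is a list of half-open (start, end) intervals on one
    chromosome. Coverage changes are accumulated per position and then
    swept from left to right.
    """
    delta_at = defaultdict(int)
    for start, end in fragments:
        delta_at[start] += 1  # coverage rises where a fragment starts
        delta_at[end] -= 1    # and falls where it ends

    peaks = []
    coverage = 0
    peak_start = None
    for position in sorted(delta_at):
        previous = coverage
        coverage += delta_at[position]
        if previous < min_coverage <= coverage:
            peak_start = position                 # crossed the threshold upward
        elif coverage < min_coverage <= previous:
            peaks.append((peak_start, position))  # dropped below the threshold
    return peaks


if __name__ == "__main__":
    demo = [(100, 200), (150, 250), (180, 300), (190, 220), (500, 600)]
    print(call_peaks(demo))
```

Dedicated peak callers replace the fixed threshold with a significance test against background signal, which is why they are recommended over a simple cutoff for real data.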
Step 5 Visualization
Illumina recommends using the IGV app on BaseSpace Sequence Hub or UCSC Genome Browser for visualizing
called peaks in genomic context.
Integrative Genomics Viewer (IGV)
• Application: Visual exploration of genomic data
• Input: Any project can be used as input, including BAM, VCF, BED, BW, and BEDGRAPH
UCSC Genome Browser
• Application: Visual exploration of genomic data
• Input: Multiple file types, including BAM, BED, and more
Method 4: Bulk epitope and nucleic acid sequencing (BEN-Seq)
Perform simultaneous bulk protein and gene expression profiling.
Key benefits over single parameter methods:
• Detect multiple proteins more efficiently than traditional methods, such as Western blotting
• Achieve similar levels of accuracy for protein measurement as flow cytometry with added benefit of quantifying
RNA expression
Prepare samples and isolate TotalSeq ADTs
• Stain cells with BioLegend TotalSeq antibodies, lyse, and isolate antibody-derived tags (ADTs)
Prepare libraries
• Illumina library prep kits
Sequence
• NovaSeq 6000 System
• NextSeq 1000 and NextSeq 2000 Systems
Analyze data
• DESeq2 BaseSpace App
• R ComplexHeatmap module
Service contracts: NextSeq 2000 Plans
Potential applications of BEN-Seq in cancer studies
Perform multiomics for a more comprehensive view of cancer cells

Various tumor processes, including initial formation, progression, immune evasion,
angiogenesis, and metastasis involve dynamic changes in the genome, transcriptome,
and proteome of cancer cells. BEN-Seq enables simultaneous analysis of RNA and
protein levels for detailed analysis of cancer cells.
Discover new biomarkers and targets for therapy

By integrating protein and RNA analyses into a single assay, BEN-Seq has the
potential to accelerate identification of cancer biomarkers (prognostic, predictive, and
diagnostic) and novel targets for therapeutics.
Step 1 Sample preparation and antibody staining
Illumina recommends following the BioLegend BEN-Seq protocol to prepare samples and stain cells with TotalSeq
antibodies for protein analysis. Illumina recommends input requirements of 8K cells. RNA and protein associated
libraries (ADTs) are separated using streptavidin-magnetic beads loaded with complementary oligonucleotides,
according to the protocol.
Step 2 Recommended library preparation methods
Product: TruSeq Stranded mRNA Library Prep Kit
• Minimum input requirements: ≥ 10 ng RNA
• Total library prep time: ~ 10.5 h
• Sample type: FF
• Sample index sets: 96

Samples per run
• NextSeq 1000 and NextSeq 2000 Systems: 16 (P2 flow cell), 44 (P3 flow cell)
• NovaSeq 6000 System: 128 (S1 flow cell), 256 (S2 flow cell), 768 (S4 flow cell)
Step 4 Data analysis and visualization
Illumina recommends using the DESeq2 app on BaseSpace Sequence Hub for data analysis and the third-party
R ComplexHeatmap module software for data visualization.

DESeq2
• Application: Differential expression analysis on aligned RNA samples
• Input: Output from RNA-Seq Alignment app
R ComplexHeatmap
• Application: Flexible generation, arrangement, and annotation of heat maps
Method 5: TCR (T-cell Receptor) sequencing
Characterize TCR diversity to understand immune function and responsiveness.
Key benefits over traditional methods for TCR characterization, such as PCR and flow cytometry:
• Target all three complementarity-determining regions (CDRs) for comprehensive coverage of the TCR region
• Detect T-cell clones with far greater sensitivity and lower false-positive and false-negative rates
• Interrogate precious samples with input requirements as low as 25 ng of RNA
Specimens
• Easy: Fresh
• Difficult: FF and FFPE
Nucleic acid extraction
• QIAGEN AllPrep DNA and RNA FFPE Kit
Library prep
• AmpliSeq for Illumina Immune Repertoire Plus, TCR beta Panel
• AmpliSeq for Illumina TCR beta-SR Panel
Sequencing
• NovaSeq 6000 System
• NextSeq 1000 and NextSeq 2000 Systems
Analysis, visualization, and interpretation
• MiXCR Immune Repertoire Analyzer BaseSpace App
Service contracts: NextSeq 2000 Plans
Potential applications of immune profiling in cancer studies
Identify and monitor minimal residual disease (MRD)

MRD describes the small number of cancer cells that persist in the body after cancer
treatment. The number of cells may be so small that they do not cause symptoms
in the patient and may be challenging to detect by traditional methods, such as
histology. Accurate testing for MRD is important, as any residual cancer cells can
cause relapse in the patient. Targeted NGS of TCRs provides rapid, highly sensitive
detection of MRD in acute lymphoblastic leukemia (ALL). Monitoring MRD with NGS
can show how cancer has responded to treatment, evaluate remissions, and identify
patients who may need to restart treatment or start alternative therapies.
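To make the idea of sequencing-based MRD monitoring concrete, the following minimal sketch tracks the frequency of a previously identified clonotype across two time points. The CDR3 sequences, read counts, and time point labels are invented for illustration, and `clonotype_frequency` is a hypothetical helper. Real MRD studies rely on controls and thresholds that are not shown here.

```python
def clonotype_frequency(clone_counts, index_clonotype):
    """Fraction of all TCR reads assigned to the tracked clonotype."""
    total = sum(clone_counts.values())
    if total == 0:
        return 0.0
    return clone_counts.get(index_clonotype, 0) / total


# Hypothetical CDR3 amino acid sequence of the clone seen at diagnosis.
INDEX_CLONOTYPE = "CASSLGQGYEQYF"

# Hypothetical read counts per clonotype at each time point.
timepoints = {
    "diagnosis": {
        "CASSLGQGYEQYF": 90_000,
        "CASSPDRGNTEAFF": 6_000,
        "CASSQETQYF": 4_000,
    },
    "post_treatment": {
        "CASSLGQGYEQYF": 5,
        "CASSPDRGNTEAFF": 480_000,
        "CASSQETQYF": 519_995,
    },
}

frequencies = {
    label: clonotype_frequency(counts, INDEX_CLONOTYPE)
    for label, counts in timepoints.items()
}
for label, freq in frequencies.items():
    print(f"{label}: {freq:.2e}")
```

In this toy data set the tracked clone falls from 90% of reads at diagnosis to 5 reads per million after treatment, the kind of low-frequency signal that deep sequencing is intended to resolve.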
Characterize tumor-reactive T-cell profiles

Analysis of TCR sequences can reveal the clonal content of tumor-infiltrating T-cells,
which has been shown to correlate with improved clinical outcomes in various
cancers. TCR repertoire sequencing elucidates TCR diversity to help drive an
understanding of the central players in immunosurveillance and immunotherapy.
Step 1 Recommended extraction methods
Several DNA and RNA extraction methods can be used with FF samples. For FFPE samples, Illumina
recommends the QIAGEN AllPrep DNA/RNA FFPE Kit. In Illumina internal studies, this kit extracts DNA and RNA of
high quantity and quality in the same workflow.
Step 2 Recommended library prep methods
Product | AmpliSeq for Illumina Immune Repertoire Plus, TCR beta Panel | AmpliSeq for Illumina TCR beta-SR Panel
Minimum input requirements | 25 ng | 10 ng
Total library prep time | 5–6 h | 5–6 h
Sample type | FF | FF and FFPE compatible
Sample index sets | 384 unique dual indexes | 384 unique dual indexes
NextSeq 1000 and

Recommended number of reads 400M total PE reads and 30M PE reads per sample
Step 4 Secondary analysis
Illumina recommends using the MiXCR Immune Repertoire Analyzer app on BaseSpace Sequence Hub for TCR
repertoire sequencing data analysis.
MiXCR Immune Repertoire Analyzer | V-(D)-J segment mapping, alignment, mutation analysis | Any sequencing data type with any level of TCR coverage
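Once clonotypes have been called, TCR diversity is often summarized with simple statistics. The sketch below computes Shannon entropy and a clonality score (1 minus normalized entropy) from per-clonotype read counts. The counts are invented, the function names are hypothetical, and this is a generic illustration rather than output of the MiXCR Immune Repertoire Analyzer app.

```python
import math


def shannon_entropy(clone_counts):
    """Shannon entropy (natural log) of a list of clonotype read counts."""
    total = sum(clone_counts)
    if total == 0:
        raise ValueError("repertoire contains no reads")
    freqs = [c / total for c in clone_counts if c > 0]
    return -sum(p * math.log(p) for p in freqs)


def clonality(clone_counts):
    """1 - normalized entropy: near 0 for an even repertoire, 1 for a single clone."""
    n_clones = sum(1 for c in clone_counts if c > 0)
    if n_clones == 0:
        raise ValueError("repertoire contains no reads")
    if n_clones == 1:
        return 1.0
    return 1.0 - shannon_entropy(clone_counts) / math.log(n_clones)


# Hypothetical repertoires: one evenly distributed, one dominated by a single clone.
even_repertoire = [25, 25, 25, 25]
skewed_repertoire = [97, 1, 1, 1]

print(round(clonality(even_repertoire), 3))
print(round(clonality(skewed_repertoire), 3))
```

A repertoire dominated by one expanded clone scores close to 1, while an evenly distributed repertoire scores close to 0.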
Method 6: WGS
Identify cancer-driving genetic events beyond protein-coding variants.
Key benefits over targeted or exome sequencing
• Obtain a complete view of the mutational profile of a tumor sample, including noncoding and structural variants
• Identify mutational signatures linked to cancer progression
• Differentiate driver from passenger mutations
Specimens | DNA extraction | Library prep | Sequencing | Expression analysis
FF and FFPE | QIAGEN AllPrep DNA Kit | Illumina DNA Prep | NovaSeq 6000 System | DRAGEN Bio-IT Platform

Training | Service contracts | Professional services
Customer site training | Tiered service plans | Proof-of-Concept Service
Illumina DNA Prep | NovaSeq 6000 Plans | Standard application functional testing with your samples
Potential applications of WGS cancer studies
Link somatic mutational signatures with cancer progression

Multiple processes can lead to a buildup of somatic mutations in cancer genomes.
Each process may leave a distinct mutational signature. WGS is an ideal way
to identify these mutational signatures. Characterizing these signatures can clarify
the underlying mutational processes and yield novel insights
into cancer progression and possible therapeutic targets.
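Signature analysis commonly begins by tallying single-base substitutions by their trinucleotide context, with the mutated base reported as a pyrimidine (C or T). The sketch below shows that classification step only. The variants are invented, the function names are hypothetical, and this is a generic illustration that is not part of the DRAGEN pipelines.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def sbs_channel(context, alt):
    """Label a substitution by trinucleotide context, e.g. 'A[C>T]G'.

    context is the reference trinucleotide with the mutated base in the
    middle; alt is the alternate base. Purine references are flipped to
    the opposite strand so the reported reference base is C or T.
    """
    context, alt = context.upper(), alt.upper()
    ref = context[1]
    if ref == alt:
        raise ValueError("alternate base equals reference base")
    if ref in "AG":
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    return f"{context[0]}[{ref}>{alt}]{context[2]}"


# Hypothetical somatic calls as (reference trinucleotide, alternate base).
variants = [("ACG", "T"), ("CGT", "A"), ("TTA", "C")]
spectrum = Counter(sbs_channel(ctx, alt) for ctx, alt in variants)
print(spectrum)
```

The first two toy variants describe the same change seen from opposite strands, so both fall into the `A[C>T]G` channel. The resulting counts form the mutational spectrum that signature-fitting tools take as input.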
Elucidate the role of noncoding mutations in cancer

Research on the link between somatic mutations and cancer progression has largely focused
on the effects of coding mutations on cellular processes. While most noncoding
mutations may simply be passenger mutations, some may be in regulatory elements
that themselves exert effects on cancer progression. WGS can identify coding and
noncoding somatic mutations and, using computational techniques, assess the
importance of these mutations for cancer progression.
Identify the role of viral integration in cancer

The driving role of genomic viral integration in cancer progression has been known
for many years. Unfortunately, systematic studies of the role of viral
integration are lacking. WGS can be used to identify viral integrations across the
genome. This hypothesis-free approach can elucidate associations between certain
viruses and cancer progression, beyond what is known today.
Several DNA extraction methods can be used with FF samples. For FFPE samples, Illumina recommends
the QIAGEN AllPrep FFPE Kit. In Illumina internal studies, this kit yields DNA of high quantity and quality.
Product Illumina DNA Prep

Input requirements 100–500 ng
Total library prep time 3–4 hrs
Sample type gDNA from blood, FFPE, and FF samples
Sample index sets 384 unique dual indexes

Samples/flow cell (a) 1 (S1 flow cell); 2–3 (S2 flow cell)
Recommended read length 2 × 101 bp
a. Illumina recommends 220 Gb reads for tumor DNA and 85 Gb reads for normal DNA.
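The recommended data volumes can be translated into approximate sequencing depth with simple arithmetic. The sketch below assumes a haploid human genome size of about 3.1 Gb, a value that is not stated in this guide, and ignores duplicates, filtering, and unmapped reads, so the figures are rough upper bounds. The function names are hypothetical.

```python
# Assumption: approximate haploid human genome size in gigabases.
HUMAN_GENOME_GB = 3.1


def mean_coverage(yield_gb, genome_gb=HUMAN_GENOME_GB):
    """Approximate mean depth from raw sequencing yield."""
    return yield_gb / genome_gb


def yield_per_pair(tumor_gb, normal_gb):
    """Total raw data needed for one tumor-normal pair."""
    return tumor_gb + normal_gb


tumor_depth = mean_coverage(220)    # recommended tumor DNA yield
normal_depth = mean_coverage(85)    # recommended normal DNA yield
pair_gb = yield_per_pair(220, 85)

print(f"tumor ~{tumor_depth:.0f}x, normal ~{normal_depth:.0f}x, {pair_gb} Gb per pair")
```

Under these assumptions, the recommendations correspond to roughly 70× raw depth for the tumor and just under 30× for the matched normal.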
Illumina recommends using the DRAGEN pipelines on BaseSpace Sequence Hub or on a DRAGEN Server to obtain
somatic variant calls and gene expression data. When using BaseSpace Sequence Hub, users can monitor runs in
real time while securely streaming data directly from the instruments into the ecosystem for push-button analysis.
Pipeline | Application | Input
DRAGEN Somatic Pipeline | Somatic variant detection in tumor samples; includes tumor-only and tumor-normal modes | Tumor DNA FASTQ; Normal DNA FASTQ
Method 7: Mutational profiling
Identify tumor-specific peptides that may be capable of inducing an immune response.
Key benefits over traditional methods such as qPCR, Sanger sequencing, and small NGS panels
• Obtain a comprehensive view of the mutational profile of a tumor sample

• Detect highly expressed somatic variants in one workflow
• Identify specific neoantigens that may drive tumor immunity
Specimens | Extract nucleic acids | Prepare libraries | Sequence | Variant calling and expression analysis | Neoantigen prediction
FF or FFPE tumor samples + normal | QIAGEN AllPrep DNA and RNA FFPE Kit | Tumor and normal DNA: Illumina DNA Prep with Enrichment with exome oligos; Tumor RNA: Illumina RNA Prep with Enrichment | NovaSeq 6000 System; NextSeq 1000 and NextSeq 2000 Systems | DRAGEN Platform (DNA: Somatic Pipeline; RNA: RNA-Seq Pipeline) | Multiple methods

Training | Service contracts | Professional services
Customer site training | Tiered service plans | Proof-of-Concept Service
Illumina DNA Prep with Enrichment and Illumina RNA Prep with Enrichment | NovaSeq 6000 Plans; NextSeq 2000 Plans | Standard application functional testing with your samples
Potential research applications of neoantigen prediction
Identify responders and nonresponders to immune checkpoint blockade

Immune checkpoint blockade has been shown to induce durable responses in
patients across multiple types of cancer. Unfortunately, only a fraction of patients
respond to this type of therapy. Multiple biomarkers have been proposed to distinguish
responders from nonresponders, though none has been shown to be completely
effective. As neoantigens play a direct role in inducing antitumor immunity, accurate
prediction of these molecules may be the most effective biomarker for predicting
response to emerging immunotherapies.
Drive personalized vaccine development

Recent studies have illustrated the promise of neoantigen-based vaccines.
Unfortunately, while clinical trials are ongoing in several different tumor types, multiple
challenges remain. The first step in the development of a vaccine is the accurate
identification of putative neoantigens. Optimizing this technique is key to developing
personalized vaccines and realizing the hope of this promising new therapeutic
approach.
Identify tumor-reactive T-cells

Another emerging therapeutic approach across multiple types of cancers is adoptive
cell transfer of neoantigen-targeting tumor-infiltrating lymphocytes (TILs) into patients.
The key to this approach is the identification of the appropriate neoantigen-reactive
T-cells that can mediate a durable antitumor immune response. The first step to
identifying this possibly disease-altering population of cells is the identification of the
neoantigens that can drive this response.
Several DNA and RNA extraction methods can be used with FF samples. For FFPE samples, Illumina
recommends the QIAGEN AllPrep DNA/RNA FFPE Kit. In Illumina internal studies, this kit extracts DNA and RNA of
high quantity and quality in the same workflow.
Product | Illumina DNA Prep with Enrichment | Illumina RNA Prep with Enrichment
Most important to me | Minimal input requirements, fewest number of steps, and high uniformity of coverage
Minimum input requirements | 10 ng | 10 ng
Total library prep time | ~6.5 h | 9.5 h (a)
Sample type | FF and FFPE compatible | FF and FFPE compatible
Sample index sets | 384 unique dual indexes | 384 indexes available (b)
a. Turnaround time for 24 samples with 3-plex enrichment
b. Up to 192 unique dual indexes are currently supported; 384 indexes will be available later in 2021
Sequencing system | NextSeq 1000 and NextSeq 2000 Systems | NovaSeq 6000 System
Samples/flow cell (a) | 2–3 (P2 flow cell); 6–10 (P3 flow cell) | 8–10 (S1 flow cell); 20–26 (S2 flow cell)
a. Illumina recommends 200M reads for tumor DNA, 75M reads for normal DNA, and 30M reads for tumor RNA.
Illumina recommends using the DRAGEN pipelines on BaseSpace Sequence Hub, a DRAGEN Server, or onboard
the NextSeq 1000 and NextSeq 2000 Systems to obtain somatic variant calls and gene expression data. When
using BaseSpace Sequence Hub, users can monitor runs in real time while securely streaming data directly from the
instruments into the ecosystem for push-button analysis.
Pipeline | Application | Input
DRAGEN Enrichment Pipeline | Rapid alignment of reads from targeted enrichment experiments | Tumor DNA FASTQ
DRAGEN Somatic Pipeline | Somatic variant detection in tumor samples; includes tumor-only and tumor-normal modes | Tumor DNA FASTQ; Normal DNA FASTQ
DRAGEN RNA Pipeline | Rapid alignment, splice junction mapping and quantification, and fusion detection | Tumor RNA FASTQ
DRAGEN RNA Differential Expression | Pairwise differential gene expression analysis. Available on BaseSpace Sequence Hub only. | Tumor RNA FASTQ
Step 5 Tertiary analysis
Before neoantigen prediction, users must use the somatic variant calls and gene expression data for human
leukocyte antigen (HLA) typing, peptide processing, and major histocompatibility complex (MHC)-binding prediction.
There are many options for the user in each of these steps. For a review of these options and instructions for use,
read “Best practices for bioinformatic characterization of neoantigens for clinical utility” at
https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-019-0666-2.
Analysis step | Description
HLA typing | The use of exome data to determine a patient’s HLA alleles and corresponding MHC complexes
Peptide processing | The generation of short peptides using a sliding window that is applied to the mutant protein sequence
MHC-binding prediction | An analysis of peptide affinity toward the MHC complexes identified
Neoantigen prioritization | The prioritization of selected peptides based on variant frequency, binding affinity, and other factors
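The peptide processing and prioritization steps can be sketched in a few lines. The example below slides windows across a mutant protein sequence, keeps every peptide that contains the altered residue, and then ranks candidates by hypothetical scores. The protein sequence, peptide lengths of 8 to 11 residues, binding values, and variant frequencies are all assumptions made for illustration, and the function names are hypothetical. Actual studies would take binding predictions from a dedicated MHC-binding tool.

```python
def mutant_peptides(protein, mut_pos, mut_aa, lengths=(8, 9, 10, 11)):
    """Return all peptides of the given lengths that span the mutated residue.

    protein is the wild-type sequence, mut_pos is the 0-based position of
    the substitution, and mut_aa is the replacement amino acid.
    """
    if not 0 <= mut_pos < len(protein):
        raise ValueError("mutation position is outside the protein")
    mutant = protein[:mut_pos] + mut_aa + protein[mut_pos + 1:]
    peptides = []
    for k in lengths:
        first = max(0, mut_pos - k + 1)          # earliest window covering the site
        last = min(mut_pos, len(mutant) - k)     # latest window that still fits
        for start in range(first, last + 1):
            peptides.append(mutant[start:start + k])
    return peptides


def prioritize(candidates):
    """Rank candidates: stronger predicted binding first, then higher variant frequency."""
    return sorted(candidates, key=lambda c: (c["ic50_nm"], -c["vaf"]))


# Hypothetical 20-residue protein with a substitution at position 10 (M to A).
peptides = mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "A", lengths=(9,))

# Hypothetical predicted binding affinities (nM) and variant allele frequencies.
candidates = [
    {"peptide": peptides[0], "ic50_nm": 450.0, "vaf": 0.40},
    {"peptide": peptides[4], "ic50_nm": 35.0, "vaf": 0.40},
    {"peptide": peptides[8], "ic50_nm": 35.0, "vaf": 0.10},
]
ranked = prioritize(candidates)
print([c["peptide"] for c in ranked])
```

Restricting the windows to those that overlap the mutation keeps only peptides that differ from the wild-type protein, which are the only ones that can be tumor specific.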
Method 8: Methylation arrays
Interrogate methylation status across the genome of cancer cells at single-nucleotide resolution.
Key benefits over other methods:
• Comprehensive genome-wide coverage, including CpG islands, CHH sites, enhancers, open chromatin,
transcription factor binding sites, and more
• High-throughput research capabilities at minimal cost per sample
• User-friendly, streamlined workflow with high assay reproducibility and support for FFPE samples
Specimens | Sample and BeadChip prep | Scanning | Analysis
FF, FFPE | Infinium MethylationEPIC BeadChip; Infinium Mouse Methylation BeadChip | iScan System; NextSeq 550 System | GenomeStudio™ Software: Methylation Module

N/A iScan System Service plan
Potential applications of methylation array cancer studies
Characterize epigenetic patterns in cancer

Studies of cancer epigenetics, such as aberrant methylation and altered
transcription factor binding, can provide insight into important tumorigenic
pathways. Methylation arrays can assist translational researchers in tumor
classification applications.
Perform epigenome-wide association studies in cancer

Methylation arrays enable epigenome-wide association studies (EWAS) that can
analyze multiple cancer samples in parallel, enabling identification of epigenetic
variants associated with cancer.
Illumina recommends an input requirement of 250 ng genomic DNA. FFPE samples are supported.
Step 2 BeadChip hybridization
Illumina recommends following the Infinium MethylationEPIC BeadChip protocol for analysis of human samples and
the Infinium Mouse Methylation BeadChip protocol for supporting studies with mouse models of cancer.
Product | Infinium MethylationEPIC BeadChip | Infinium Mouse Methylation BeadChip
Minimum input requirement | 250 ng DNA | 250 ng DNA
No. of markers | > 850K methylation sites per sample | > 285K methylation sites per sample
Sample type | FF or FFPE | FF or FFPE
No. of samples per array | 8 | 12
Step 3 Recommended array scanning systems
Product | iScan System | NextSeq 550 System
Most important to me | High-throughput processing of Infinium MethylationEPIC BeadChips | Dual sequencing and array scanning capability
Scan time per BeadChip | 20 minutes | 40 minutes
Scan time per sample | ~2.5 minutes | 5 minutes
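The per-sample scan times follow from the per-BeadChip times and the eight samples on each Infinium MethylationEPIC BeadChip. The sketch below reproduces that arithmetic and extends it to a hypothetical 96-sample batch, assuming BeadChips are scanned one after another with no loading or handling overhead. The function names are hypothetical.

```python
import math

EPIC_SAMPLES_PER_BEADCHIP = 8  # from the BeadChip table in this guide


def scan_minutes_per_sample(minutes_per_beadchip, samples_per_beadchip):
    """Average scan time attributable to each sample on a BeadChip."""
    return minutes_per_beadchip / samples_per_beadchip


def batch_scan_minutes(n_samples, minutes_per_beadchip, samples_per_beadchip):
    """Total scan time for a batch, scanning whole BeadChips serially."""
    beadchips = math.ceil(n_samples / samples_per_beadchip)
    return beadchips * minutes_per_beadchip


iscan_per_sample = scan_minutes_per_sample(20, EPIC_SAMPLES_PER_BEADCHIP)
nextseq550_per_sample = scan_minutes_per_sample(40, EPIC_SAMPLES_PER_BEADCHIP)

# Hypothetical batch of 96 samples.
iscan_batch = batch_scan_minutes(96, 20, EPIC_SAMPLES_PER_BEADCHIP)
nextseq550_batch = batch_scan_minutes(96, 40, EPIC_SAMPLES_PER_BEADCHIP)

print(iscan_per_sample, nextseq550_per_sample, iscan_batch, nextseq550_batch)
```

Under these assumptions, a 96-sample batch occupies 12 BeadChips, or about 4 hours of scanning on the iScan System and 8 hours on the NextSeq 550 System.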
Step 4 Analysis and visualization
Illumina recommends using the Methylation Module in GenomeStudio Software.
Application | Description
GenomeStudio Software Methylation Module | Calculates methylation levels (beta values) and analyzes differential methylation levels between experimental groups. Single-site resolution data can be visualized as line plots, bar graphs, scatter plots, histograms, dendrograms, box plots, or heat maps.
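For readers unfamiliar with beta values, the sketch below shows the commonly cited definition for Infinium arrays: the methylated signal divided by the sum of the methylated and unmethylated signals plus a small stabilizing offset (100 is the value usually quoted). The probe intensities are invented and the function names are hypothetical. The exact calculation and the statistical testing performed by GenomeStudio Software may differ from this simplified illustration.

```python
from statistics import mean


def beta_value(methylated, unmethylated, offset=100):
    """Methylation level between 0 (unmethylated) and 1 (fully methylated)."""
    m = max(methylated, 0)
    u = max(unmethylated, 0)
    return m / (m + u + offset)


def delta_beta(group_a, group_b):
    """Difference in mean beta value between two groups at one site."""
    return mean(group_a) - mean(group_b)


# Hypothetical (methylated, unmethylated) intensities for one CpG site.
tumor_betas = [beta_value(9000, 900), beta_value(8000, 1900)]
normal_betas = [beta_value(400, 9500), beta_value(1400, 8500)]

print(tumor_betas, normal_betas, round(delta_beta(tumor_betas, normal_betas), 2))
```

In this toy example, the site is heavily methylated in the tumor group and largely unmethylated in the normal group, giving a difference in mean beta of 0.76.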
Illumina • 1.800.809.4566 toll-free (US) • +1.858.202.4566 tel • techsupport@illumina.com • www.illumina.com
For Research Use Only. Not for use in diagnostic procedures.
© 2021 Illumina, Inc. All rights reserved. All trademarks are the property of Illumina, Inc. or their respective owners. For specific trademark
information, see www.illumina.com/company/legal.html. M-GL-00029 v01
