Analysis of Sequence-Based Copy Number Variation Detection Tools for Cancer Studies
amplification
Sequence-based technologies (using next-generation sequencing (NGS)) are emerging
Advantages
  Higher resolution and accuracy
  Single platform
  Rapid cost reduction
Disadvantages
  Currently expensive
  No standard analysis
  Computationally demanding
NGS
data
@SRR034720.3267591 length=36
ATTATTTTATGTTATTTATTTTGTATGTTTTTTTTT
+
88888888888888888888885888%888888/8)
@SRR034720.3267592 length=36
TCGGGAACGTCTCGACCGAAATTATTTTGTATGTCT
+
8888788888888888878-188288881878"888
...
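The records above follow the four-line FASTQ layout (header, bases, separator, per-base quality). A minimal sketch of reading such records is given here; the names `FastqRecord`, `parse_fastq` and `phred_scores` are hypothetical, and the quality offset of 33 is an assumption, since the slides do not state which encoding the data use.

```python
from typing import Iterable, Iterator, List, NamedTuple


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    quality: str


def parse_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield one record per four-line FASTQ entry, skipping blank lines."""
    rows = [line.strip() for line in lines if line.strip()]
    if len(rows) % 4 != 0:
        raise ValueError("FASTQ input must contain four lines per record")
    for i in range(0, len(rows), 4):
        header, sequence, separator, quality = rows[i:i + 4]
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed record starting at row {i}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch at row {i}")
        # Keep only the identifier; drop trailing fields such as length=36.
        yield FastqRecord(header[1:].split()[0], sequence, quality)


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a quality string; offset=33 is an assumption, not from the slides."""
    return [ord(char) - offset for char in quality]


example = [
    "@SRR034720.3267591 length=36",
    "ATTATTTTATGTTATTTATTTTGTATGTTTTTTTTT",
    "+",
    "88888888888888888888885888%888888/8)",
]
records = list(parse_fastq(example))
```

The sketch loads all lines into memory for simplicity; a real sequencing run would need a streaming reader.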
[Figure labels: Counting window, Short read, Read count, Reference genome]
The number of reads that align to a position in a genome is proportional to the copy number at that position.
[Figure labels: Sample, Somatic deletion, Control]
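The read-depth idea can be sketched as counting read start positions in fixed windows and comparing sample with control. This is an illustration only, not the method of any tool discussed here; `window_counts` and `log2_ratios` are hypothetical names, and scaling by total read count is an assumed normalization.

```python
import math
from collections import Counter
from typing import List, Optional, Sequence


def window_counts(read_starts: Sequence[int], window_size: int) -> Counter:
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window_size for pos in read_starts)


def log2_ratios(sample_starts: Sequence[int], control_starts: Sequence[int],
                window_size: int, n_windows: int) -> List[Optional[float]]:
    """Per-window log2(sample/control) after scaling by total read count.

    Windows with no reads in either data set are reported as None.
    """
    sample = window_counts(sample_starts, window_size)
    control = window_counts(control_starts, window_size)
    sample_total, control_total = len(sample_starts), len(control_starts)
    ratios: List[Optional[float]] = []
    for window in range(n_windows):
        if sample[window] == 0 or control[window] == 0:
            ratios.append(None)
            continue
        sample_frac = sample[window] / sample_total
        control_frac = control[window] / control_total
        ratios.append(math.log2(sample_frac / control_frac))
    return ratios


# Toy data: the control has 10 reads in each of four 100 bp windows;
# the sample has half as many reads in the third window (a one-copy loss).
control_starts = [w * 100 + i * 10 for w in range(4) for i in range(10)]
sample_starts = [w * 100 + i * 10
                 for w, n in enumerate([10, 10, 5, 10]) for i in range(n)]
ratios = log2_ratios(sample_starts, control_starts, window_size=100, n_windows=4)
```

In the toy data the third window has a log2 ratio exactly one unit below the others, which is the signature of a halved read depth.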
Tool | Method | Read type | Yes/No column (header lost in extraction)
ReadDepth | Analyze depth of coverage (DOC), circular binary segmentation algorithm, use negative binomial distribution, use PE information | SE, PE | No
FREEC | Analyze DOC, use LASSO-based algorithm for segmentation, normalize for GC content | SE, PE | Yes/No
CNVer | Analyze DOC with paired-end mapping information, repeat graph algorithm | PE | No
CNV-seq | Analyze DOC, fixed window, no segmentation | SE, PE | Yes
SegSeq | Analyze DOC, extend a window to include a fixed number of reads | SE, PE | Yes
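One step named in the tool descriptions, normalizing read counts for GC content, can be illustrated with a generic median-based correction. This sketch is not the procedure used by FREEC or any other tool listed; the function names and the 5% GC bin width are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median
from typing import List, Optional, Sequence


def gc_fraction(sequence: str) -> Optional[float]:
    """Fraction of G and C among called bases; None if no A/C/G/T is present."""
    sequence = sequence.upper()
    called = sum(sequence.count(base) for base in "ACGT")
    if called == 0:
        return None
    return (sequence.count("G") + sequence.count("C")) / called


def gc_normalize(counts: Sequence[float], gc_fractions: Sequence[float],
                 bin_percent: int = 5) -> List[Optional[float]]:
    """Scale each window so every GC bin has the same median read count."""
    def gc_bin(gc: float) -> int:
        # Bin on integer percent to avoid floating-point edge effects.
        return int(round(gc * 100)) // bin_percent

    by_bin = defaultdict(list)
    for count, gc in zip(counts, gc_fractions):
        by_bin[gc_bin(gc)].append(count)
    overall = median(counts)
    bin_median = {key: median(values) for key, values in by_bin.items()}

    corrected: List[Optional[float]] = []
    for count, gc in zip(counts, gc_fractions):
        reference = bin_median[gc_bin(gc)]
        corrected.append(count * overall / reference if reference > 0 else None)
    return corrected


# Toy data: high-GC windows are sequenced at half the depth of low-GC windows.
counts = [100, 100, 50, 50]
gc_values = [0.40, 0.41, 0.60, 0.61]
corrected = gc_normalize(counts, gc_values)
```

After correction the four toy windows have equal depth, so the remaining variation in a real data set would be less likely to reflect GC bias alone.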
CNV size distributions: cell lines
The size, number, span and type of the detected CNVs differ across the tools.
[Figure: CNV size distributions for each tool; panel label: MCF7]
[Figure panels: Sensitivity, Precision]
Benchmark: CNVs common to two published results. Detected CNVs with more than 70% overlap with the benchmark CNVs are called true positives.
[Figure panels: Sensitivity and Precision versus Coverage (0.5x, 2x, 5x, 10x, 20x, 50x) for SegSeq, CNVnator, CNV-seq, CNVer, FREEC and ReadDepth; sensitivity axis ticks shown: 0%, 20%, 40%, 60%]
Benchmark: known synthesized CNVs. Detected CNVs with more than 70% overlap with the benchmark CNVs are called true positives.
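The evaluation rule can be sketched as follows. The slides do not say how overlap is measured, so this example assumes it is the fraction of the benchmark CNV covered by a detected CNV on the same chromosome, with half-open intervals; `overlap_fraction` and `evaluate` are hypothetical names.

```python
from typing import List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), half-open


def overlap_fraction(detected: Interval, benchmark: Interval) -> float:
    """Fraction of the benchmark interval covered by the detected interval."""
    d_chrom, d_start, d_end = detected
    b_chrom, b_start, b_end = benchmark
    if d_chrom != b_chrom or b_end <= b_start:
        return 0.0
    shared = min(d_end, b_end) - max(d_start, b_start)
    return max(0, shared) / (b_end - b_start)


def evaluate(detected: List[Interval], benchmark: List[Interval],
             min_overlap: float = 0.7) -> Tuple[float, float]:
    """Return (sensitivity, precision) under the overlap rule assumed above."""
    def matches(call: Interval, truth: Interval) -> bool:
        return overlap_fraction(call, truth) > min_overlap

    true_calls = [d for d in detected if any(matches(d, b) for b in benchmark)]
    found = [b for b in benchmark if any(matches(d, b) for d in detected)]
    sensitivity = len(found) / len(benchmark) if benchmark else 0.0
    precision = len(true_calls) / len(detected) if detected else 0.0
    return sensitivity, precision


benchmark_cnvs = [("chr1", 100, 200), ("chr1", 500, 600)]
detected_cnvs = [("chr1", 90, 190),    # covers 90% of the first benchmark CNV
                 ("chr1", 550, 700),   # covers only 50% of the second
                 ("chr2", 100, 200)]   # different chromosome, no match
sensitivity, precision = evaluate(detected_cnvs, benchmark_cnvs)
```

With these toy intervals one of two benchmark CNVs is recovered and one of three calls is a true positive. A reciprocal-overlap rule would be a stricter alternative to the one assumed here.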
Conclusions
The CNV results across the tools are not consistent.
Most of the tools show high sensitivity and breakpoint accuracy; however, their precision is not high.
Tools with advanced algorithms, such as CNVnator and FREEC, perform better; however, they are computationally more expensive.
Tools that utilize paired-end information, such as CNVer and ReadDepth, detect CNVs more accurately.
Development of more efficient and accurate tools is required.
Acknowledgements
Laboratory of Personalized Medicine (LPM)
Peter Tonellato
Zengqui Cai
Erik Gafni
Vincent Fusaro
Chih-Lin Chi
Michiyo Yamada
Jessica Correia
Matthew Crawford