
LETTER doi:10.1038/nature11003

The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity
Jordi Barretina¹,²,³†*, Giordano Caponigro⁴*, Nicolas Stransky¹*, Kavitha Venkatesan⁴*, Adam A. Margolin¹†*, Sungjoon Kim⁵,
Christopher J. Wilson⁴, Joseph Lehár⁴, Gregory V. Kryukov¹, Dmitriy Sonkin⁴, Anupama Reddy⁴, Manway Liu⁴, Lauren Murray¹,
Michael F. Berger¹†, John E. Monahan⁴, Paula Morais¹, Jodi Meltzer⁴, Adam Korejwa¹, Judit Jané-Valbuena¹,², Felipa A. Mapa⁴,
Joseph Thibault⁵, Eva Bric-Furlong⁴, Pichai Raman⁴, Aaron Shipway⁵, Ingo H. Engels⁵, Jill Cheng⁶, Guoying K. Yu⁶, Jianjun Yu⁶,
Peter Aspesi Jr⁴, Melanie de Silva⁴, Kalpana Jagtap⁴, Michael D. Jones⁴, Li Wang⁴, Charles Hatton³, Emanuele Palescandolo³,
Supriya Gupta¹, Scott Mahan¹, Carrie Sougnez¹, Robert C. Onofrio¹, Ted Liefeld¹, Laura MacConaill³, Wendy Winckler¹,
Michael Reich¹, Nanxin Li⁵, Jill P. Mesirov¹, Stacey B. Gabriel¹, Gad Getz¹, Kristin Ardlie¹, Vivien Chan⁶, Vic E. Myer⁴,
Barbara L. Weber⁴, Jeff Porter⁴, Markus Warmuth⁴, Peter Finan⁴, Jennifer L. Harris⁵, Matthew Meyerson¹,²,³, Todd R. Golub¹,³,⁷,⁸,
Michael P. Morrissey⁴*, William R. Sellers⁴*, Robert Schlegel⁴* & Levi A. Garraway¹,²,³*

The systematic translation of cancer genomic data into knowledge of tumour biology and therapeutic possibilities remains challenging. Such efforts should be greatly aided by robust preclinical model systems that reflect the genomic diversity of human cancers and for which detailed genetic and pharmacological annotation is available¹. Here we describe the Cancer Cell Line Encyclopedia (CCLE): a compilation of gene expression, chromosomal copy number and massively parallel sequencing data from 947 human cancer cell lines. When coupled with pharmacological profiles for 24 anticancer drugs across 479 of the cell lines, this collection allowed identification of genetic, lineage, and gene-expression-based predictors of drug sensitivity. In addition to known predictors, we found that plasma cell lineage correlated with sensitivity to IGF1 receptor inhibitors; AHR expression was associated with MEK inhibitor efficacy in NRAS-mutant lines; and SLFN11 expression predicted sensitivity to topoisomerase inhibitors. Together, our results indicate that large, annotated cell-line collections may help to enable preclinical stratification schemata for anticancer agents. The generation of genetic predictions of drug response in the preclinical setting and their incorporation into cancer clinical trial design could speed the emergence of ‘personalized’ therapeutic regimens².

Human cancer cell lines represent a mainstay of tumour biology and drug discovery through facile experimental manipulation, global and detailed mechanistic studies, and various high-throughput applications. Numerous studies have used cell-line panels annotated with both genetic and pharmacological data, either within a tumour lineage³⁻⁵ or across multiple cancer types⁶⁻¹². Although affirming the promise of systematic cell line studies, many previous efforts were limited in their depth of genetic characterization and pharmacological interrogation.

To address these challenges, we generated a large-scale genomic data set for 947 human cancer cell lines, together with pharmacological profiling of 24 compounds across ~500 of these lines. The resulting collection, which we termed the Cancer Cell Line Encyclopedia (CCLE), encompasses 36 tumour types (Fig. 1a and Supplementary Table 1; see also http://www.broadinstitute.org/ccle). All cell lines were characterized by several genomic technology platforms. The mutational status of >1,600 genes was determined by targeted massively parallel sequencing, followed by removal of variants likely to be germline events (Supplementary Methods). Moreover, 392 recurrent mutations affecting 33 known cancer genes were assessed by mass spectrometric genotyping¹³ (Supplementary Table 2 and Supplementary Fig. 1). DNA copy number was measured using high-density single nucleotide polymorphism arrays (Affymetrix SNP 6.0; Supplementary Methods). Finally, messenger RNA expression levels were obtained for each of the lines using Affymetrix U133 plus 2.0 arrays. These data were also used to confirm cell line identities (Supplementary Methods and Supplementary Figs 2–4).

We next measured the genomic similarities by lineage between CCLE lines and primary tumours from Tumorscape¹⁴, expO, MILE and COSMIC data sets (Fig. 1b–d and Supplementary Methods). For most lineages, a strong positive correlation was observed in both chromosomal copy number and gene expression patterns (median correlation coefficients of 0.77, range = 0.52–0.94, P < 10⁻¹⁵, for copy number, and 0.60, range = 0.29–0.77, P < 10⁻¹⁵, for expression, respectively; Fig. 1b, c and Supplementary Tables 3 and 4), as has been described previously³⁻⁵,¹⁵. A positive correlation was also observed for point mutation frequencies (median correlation coefficient = 0.71, range = −0.06 to 0.97, P < 10⁻² for all but 3 lineages; Supplementary Fig. 5), even when TP53 was removed from the data set (median correlation coefficient = 0.64, range = −0.31 to 0.97, P < 10⁻² for all but 3 lineages; Fig. 1d and Supplementary Table 5). Thus, with relatively few exceptions (Supplementary Information), the CCLE may provide representative genetic proxies for primary tumours in many cancer types.

Given the pressing clinical need for robust molecular correlates of anticancer drug response, we incorporated a systematic framework to ascertain molecular correlates of pharmacological sensitivity in vitro. First, 8-point dose–response curves for 24 compounds (targeted and cytotoxic agents) across 479 cell lines were generated (Supplementary Tables 1 and 6, and Supplementary Methods). These curves were represented by a logistical sigmoidal function with a maximal effect level (Amax), the concentration at half-maximal activity of the compound (EC50), a Hill coefficient representing the sigmoidal transition, and the concentration at which the drug response reached an absolute inhibition of 50% (IC50).

Broadly active compounds, exemplified by the HDAC inhibitor LBH589 (panobinostat), showed a roughly even distribution of Amax and EC50 values across most cell lines (Fig. 2a). In contrast, the RAF inhibitor PLX4720 had a more selective profile: Amax or EC50 values for most cell lines could be categorized as ‘sensitive’ or ‘insensitive’ to
¹The Broad Institute of Harvard and MIT, Cambridge, Massachusetts 02142, USA. ²Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts 02115, USA. ³Center for Cancer Genome Discovery, Dana-Farber Cancer Institute, Harvard Medical School, Boston, Massachusetts 02115, USA. ⁴Novartis Institutes for Biomedical Research, Cambridge, Massachusetts 02139, USA. ⁵Genomics Institute of the Novartis Research Foundation, San Diego, California 92121, USA. ⁶Novartis Institutes for Biomedical Research, Emeryville, California 94608, USA. ⁷Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts 02115, USA. ⁸Howard Hughes Medical Institute, Chevy Chase, Maryland 20815, USA. †Present addresses: Novartis Institutes for Biomedical Research, Cambridge, Massachusetts 02139, USA (J.B.); Sage Bionetworks, 1100 Fairview Ave. N., Seattle, Washington 98109, USA (A.A.M.); Department of Pathology, Memorial Sloan-Kettering Cancer Center, New York, New York 10065, USA (M.F.B.).
*These authors contributed equally to this work.

29 MARCH 2012 | VOL 483 | NATURE | 603
©2012 Macmillan Publishers Limited. All rights reserved

[Figure 1 graphic not reproduced. Panels: a, distribution of CCLE cell lines by lineage; b, copy number correlation heat map (scale 0 to 1), primary tumours (Tumorscape) versus CCLE; c, expression correlation heat map (scale −0.5 to 1), primary tumours (expO/MILE) versus CCLE; d, mutation frequency correlation heat map excluding TP53 (scale 0 to 1), primary tumours (COSMIC) versus CCLE.]

Figure 1 | The Cancer Cell Line Encyclopedia. a, Distribution of cancer types in the CCLE by lineage. b, Comparison of DNA copy-number profiles (GISTIC G-scores) between cell lines and primary tumours. The diagonal of the heat map shows the Pearson correlation between corresponding tumour types. Because cell lines and tumours are separate data sets, the correlation matrix is asymmetric: the top left showing how well the tumour features correlate with the average of the cell lines in a lineage, and the bottom right showing the converse. c, Comparison of mRNA expression profiles between cell lines and primary tumours. For each tumour type, the log fold change of the 5,000 most variable genes is calculated between that tumour type and all others. Pearson correlations between tumour type fold changes from primary tumours and cell lines are shown as a heat map. d, Comparison of point mutation frequencies between cell lines and primary tumours in COSMIC (v56), restricted to genes that are well represented in both sample sets but excluding TP53, which is highly prevalent in most tumour types. Pairwise Pearson correlations are shown as a heat map. Asterisk indicates that the correlations of oesophageal, liver, and head and neck cancer mutation frequencies are restored when including TP53.
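The fold-change comparison described for Fig. 1c can be sketched in a few lines. The following is a minimal illustration, not the authors' pipeline: it assumes log2 expression values arranged as one row per sample, omits the selection of the 5,000 most variable genes, and uses an invented toy data set with three lineages and four genes. The function names (`fold_change_profile`, `lineage_correlations`) are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def fold_change_profile(expr, labels, lineage):
    """Per-gene log fold change: mean log2 expression in `lineage` minus all other samples."""
    inside = [row for row, lab in zip(expr, labels) if lab == lineage]
    outside = [row for row, lab in zip(expr, labels) if lab != lineage]
    n_genes = len(expr[0])
    return [mean(r[g] for r in inside) - mean(r[g] for r in outside)
            for g in range(n_genes)]


def lineage_correlations(cell_expr, cell_labels, tumour_expr, tumour_labels):
    """Correlate tumour and cell-line fold-change profiles for every lineage pair."""
    lineages = sorted(set(cell_labels) & set(tumour_labels))
    cell_fc = {l: fold_change_profile(cell_expr, cell_labels, l) for l in lineages}
    tumour_fc = {l: fold_change_profile(tumour_expr, tumour_labels, l) for l in lineages}
    return {(t, c): pearson(tumour_fc[t], cell_fc[c]) for t in lineages for c in lineages}


# Invented toy data (log2 expression, 4 genes); not taken from the CCLE.
labels = ["melanoma", "melanoma", "breast", "breast", "lung", "lung"]
cell_expr = [[9, 3, 3, 5], [8, 2, 4, 5], [3, 9, 2, 5], [2, 8, 3, 6], [3, 2, 9, 5], [4, 3, 8, 5]]
tumour_expr = [[8, 3, 2, 6], [9, 4, 3, 5], [2, 8, 3, 5], [3, 9, 4, 5], [2, 3, 8, 6], [3, 3, 9, 5]]

corr = lineage_correlations(cell_expr, labels, tumour_expr, labels)
for (tumour, cell), r in sorted(corr.items()):
    print(f"tumour {tumour:9s} vs cell line {cell:9s}: r = {r:+.2f}")
```

In this toy case the matching lineage pairs give the highest correlations, which is the pattern the diagonal of the heat map is meant to convey.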

PLX4720, with sensitive lines enriched for the BRAF V600E mutation (Fig. 2a). To capture simultaneously the efficacy and potency of a drug, we designated an ‘activity area’ (Fig. 2b and Supplementary Fig. 6). The 24 compounds profiled showed wide variations in activity area, and those with similar mechanisms of action clustered together (Supplementary Fig. 7).

Genomic correlates of drug sensitivity may be extracted by predictive models using machine learning techniques⁶,¹⁰. We therefore assembled all CCLE genomic data types into a matrix wherein each feature was converted to a z-score across all lines (Supplementary Methods). Next, we adapted a categorical modelling approach that used a naive Bayes classification and discrete sensitivity calls, or an elastic net regression analysis¹⁶ for continuous sensitivity measurements. Both approaches were applied to all compounds and genomic data with or without gene expression features (Supplementary Methods). Prediction performance was determined using tenfold cross-validation, and the elastic net features were bootstrapped to retain only those that were consistent across runs (Supplementary Methods).

Out of >50,000 input features, the regression-based analysis identified multiple known features as top predictors of sensitivity to several agents (Supplementary Table 7 and Supplementary Figs 8 and 9), with robust cross-validated performance (Supplementary Figs 10 and 11). For example, activating mutations in BRAF and NRAS were among the top four predictors of sensitivity in models generated for the MEK inhibitor PD-0325901 (ref. 10) (Fig. 2c). Additional predictive features for MEK inhibition included expression of PTEN, PTPN5 and SPRY2 (which encodes a regulator of MAPK output). KRAS mutations were also identified, albeit with a lower predictive value (Fig. 2c, Supplementary Tables 8 and 9 and Supplementary Fig. 8).

Other top predictors included EGFR mutations and ERBB2 amplification/overexpression for erlotinib⁸ and lapatinib¹⁷, respectively; BRAF V600E for RAF inhibitors (PLX4720 (ref. 18) and RAF265); HGF expression and MET amplification for the MET/ALK inhibitor PF-2341066 (ref. 19); and MDM2 overexpression for Nutlin-3 (ref. 20) sensitivity. Variants affecting the EXT2 gene, which encodes a glycosyltransferase involved in heparan sulphate biosynthesis, were significantly correlated with erlotinib effects (Supplementary Fig. 12). This observation is intriguing in light of a report linking heparan sulphate with erlotinib sensitivity²¹. In addition, NQO1 expression was identified as the top predictive feature for sensitivity to the Hsp90 inhibitor 17-AAG, a quinone moiety metabolized by NAD(P)H:quinone oxidoreductase (NQO1). NQO1 produces a high-potency intermediate (17-AAGH₂)²², and has previously been identified as a potential biomarker for Hsp90 inhibitors²³.

Because some genetic/molecular alterations occur commonly in specific tumour types, lineage may become a confounding factor in predictive analyses. Indeed, a classifier built using the entire cell-line data set performed suboptimally when applied exclusively to melanoma-derived cell lines (Fig. 2d), whereas a model built with only melanoma cell lines performed better (Fig. 2d). Predictive features in the melanoma-only model showed a strong overexpression of genes regulated by the transcription factors MITF and SOX10 (Supplementary Table 10), which may also help predict RAF inhibitor drug sensitivity in melanoma cell lines.

Nonetheless, lineage emerged as the predominant predictive feature for several compounds. For example, elastic net studies of the HDAC inhibitor panobinostat identified haematological lineages as predictors of sensitivity (Fig. 2e and Supplementary Fig. 9). Interestingly, most clinical responses to panobinostat and related compounds (for example, vorinostat and romidepsin) have been observed in haematological cancers. Similarly, most multiple myeloma cell lines (12 of 14 lines tested) exhibited enhanced sensitivity to the IGF1 receptor inhibitor

[Figure 2 graphic not reproduced. Panels: a, Amax versus EC50 (log10) for PLX4720 (BRAF V600E), PLX4720 (BRAF WT) and panobinostat; b, relative growth inhibition (%) versus drug concentration, marking Amax, EC50, IC50 and the activity area; c, elastic net feature weights, feature heat map and PD-0325901 activity area across cell lines; d, average true positive rate versus false positive rate for the melanoma-only categorical model, the global categorical model applied to all CCLE lines, the global categorical model applied to melanoma, and random; e, panobinostat activity area in haematopoietic versus solid lines (P = 7.06 × 10⁻²¹); f, AEW541 activity area versus IGF1 expression (log2, RMA) for multiple myeloma and other lines (P = 5.88 × 10⁻⁸ and P = 3.14 × 10⁻⁸).]
[Panel c features, with the value printed in parentheses: Expr. ACSS3 (0.82); Expr. C5orf39 (0.85); Expr. PTPN5 (0.82); Mut. LOF+nnMS SULF2 (0.92); Expr. GRIN2A (0.8); Expr. NCRNA00173 (0.83); Expr. CYP27B1 (0.87); Mut. nnMS KRAS (0.82); Expr. GAPDHS (0.84); Expr. S100A4 (0.84); Mut. cosmicMS NRAS (1); Mut. LOF+nnMS BRAF (0.92); Expr. PTEN (0.84); Expr. SPRY2 (0.98).]

Figure 2 | Predictive modelling of pharmacological sensitivity using CCLE genomic data. a, b, Drug responses for panobinostat (green) and PLX4720 (orange/purple) represented by the high-concentration effect level (Amax) and transitional concentration (EC50) for a sigmoidal fit to the response curve (b). c, Elastic net regression modelling of genomic features that predict sensitivity to PD-0325901. The bottom curve indicates drug response, measured as the area over the dose–response curve (activity area), for each cell line. The central heat map shows the CCLE features in the model (continuous z-score for expression and copy number, dark red for discrete mutation calls), across all cell lines (x axis). Bar plot (left): weight of the top predictive features for sensitivity (bottom) or insensitivity (top). Parentheses indicate features present in >80% of models after bootstrapping. LOF, loss of function mutation; nnMS, non-neutral missense mutation (Supplementary Methods). d, Specificity and sensitivity (receiver operating characteristic curves) of cross-validated categorical models predicting the response to a MEK inhibitor, PD-0325901 (activity area). Mean true positive rate and standard deviation (n = 5) are shown when models are built using all lines (global categorical model, in blue and orange), or within only melanoma lines (green). e, Activity area values for panobinostat between cell lines derived from haematopoietic (n = 61) and solid tumours (n = 387). The middle bar, median; box, inter-quartile range; bars extend to 1.5× the inter-quartile range. f, Distribution of activity area values for AEW541 relative to IGF1 mRNA expression. Orange dots, multiple myeloma cell lines (n = 14); blue dots, cell lines from other tumour types (n = 434). Box-and-whisker plots show the activity area or mRNA expression distributions relative to each cell line type (line, median; box, inter-quartile range), with bars extending to 1.5× the inter-quartile range.
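The curve parameters named in Fig. 2a, b (Amax, EC50, Hill coefficient, IC50 and activity area) can be related to one another with a short sketch. This is an illustration under stated assumptions rather than the fitting procedure used for the CCLE: inhibition is expressed as a positive fraction between 0 and Amax (the figure plots it on a negative percentage scale), the curve is a standard Hill-type sigmoid, the activity area is approximated as the sum of inhibition over the tested doses, and the eight dose values are hypothetical.

```python
def inhibition(conc, a_max, ec50, hill):
    """Fractional growth inhibition at concentration `conc` (same units as ec50, conc > 0)."""
    return a_max / (1.0 + (ec50 / conc) ** hill)


def ic50(a_max, ec50, hill, level=0.5):
    """Concentration giving an absolute inhibition of `level`; None if a_max never reaches it."""
    if a_max <= level:
        return None
    # Solve a_max / (1 + (ec50 / c)^hill) = level for c.
    return ec50 * (level / (a_max - level)) ** (1.0 / hill)


def activity_area(concs, a_max, ec50, hill):
    """Assumed definition: inhibition summed over the tested doses (area over the curve)."""
    return sum(inhibition(c, a_max, ec50, hill) for c in concs)


# Hypothetical 8-point dilution series in micromolar; not the doses used in the study.
doses = [8.0 / 3.16 ** i for i in range(8)]

# Two invented compounds with the same efficacy (Amax) but different potency (EC50).
potent = dict(a_max=0.9, ec50=0.05, hill=1.0)
weak = dict(a_max=0.9, ec50=5.0, hill=1.0)

for name, params in (("potent", potent), ("weak", weak)):
    print(name, "IC50 =", ic50(**params), "activity area =", activity_area(doses, **params))
```

The sketch shows why a single summary such as the activity area is convenient: the EC50 describes only the midpoint of the transition and the IC50 is undefined whenever Amax stays below 50% inhibition, whereas the summed inhibition reflects both efficacy and potency.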

AEW541 (Fig. 2f and Supplementary Figs 8 and 9) and showed high IGF1 expression (Fig. 2f). Interestingly, elevated IGF1R expression also correlated with AEW541 sensitivity (Supplementary Fig. 9). The CCLE results indicate that multiple myeloma may be a promising indication for clinical trials of IGF1 receptor inhibitors²⁴ and that these drugs may have enhanced efficacy in cancers with high IGF1 or IGF1R expression.

Whereas BRAF and NRAS mutations are known single-gene predictors of sensitivity to MEK inhibitors, several ‘sensitive’ cell lines lacked mutations in these genes, whereas other lines harbouring these mutations were nonetheless ‘insensitive’ (Fig. 2c). The elastic net regression model derived from the subset of cell lines with validated NRAS mutations identified elevated expression of the AHR gene (which encodes the aryl hydrocarbon receptor) as strongly correlated with sensitivity to the MEK inhibitor PD-0325901 (Fig. 3a). This finding was interesting in light of previous studies indicating that a related MEK inhibitor (PD-98059) may also function as a direct AHR antagonist²⁵. We therefore hypothesized that the enhanced sensitivity of some NRAS-mutant cell lines to MEK inhibitors might relate to a coexistent dependence on AHR function.

To test this hypothesis, we first confirmed the correlation between AHR expression and sensitivity to MEK inhibitors in a subset of NRAS-mutant cell lines (Fig. 3b and Supplementary Fig. 13). Next, we performed short hairpin RNA (shRNA) knockdown of AHR in cell lines with high or low AHR expression (Fig. 3c). Silencing of AHR suppressed the growth of three NRAS-mutant cell lines with elevated AHR expression (Fig. 3d–f), but had no effect on the growth of two lines with low AHR expression (Fig. 3g, h). The growth inhibitory effect was confirmed with two additional shRNAs, where evidence for dose dependence was also apparent (Fig. 3i, j). We also tested the hypothesis that allosteric MEK inhibitors may suppress AHR function by measuring the effect of PD-0325901 and PD-98059 on endogenous CYP1A1 mRNA, a transcriptional target of AHR in some contexts. Both compounds reduced CYP1A1 levels in NRAS-mutant melanoma cells (IPC-298 and SK-MEL-2; Fig. 3k) but not in neuroblastoma cells (CHP-212; Fig. 3k), indicating that other factors may govern CYP1A1 expression in the latter lineage. Together, these results suggest that AHR dependency may co-occur with MAP kinase activation in some NRAS-mutant cancer cells, and that elevated AHR may serve as a

[Figure 3 graphic not reproduced. Panels: a, elastic net feature weights and PD-0325901 activity area across NRAS-mutant cell lines (Oncomap), with features Expr. C20orf173 (0.4), Expr. PHRF1 (0.43), Expr. AHRR (0.5) and Expr. AHR (0.71); b, absorbance (%) versus log10 [PD-0325901 (μM)] for CHP-212, IPC-298, SK-MEL-2, ONS-76, SK-N-SH and control; c, AHR/GAPDH mRNA relative to CHP-212 across NRAS-mutant cell lines; d–h, absorbance versus time in culture (days) for CHP-212, IPC-298, SK-MEL-2, ONS-76 and SK-N-SH; i, j, absorbance versus time in culture (days) and AHR/actin immunoblots for shLuc, shAHR_1 and shAHR_4 in IPC-298 and SK-MEL-2; k, CYP1A1/GAPDH mRNA relative to DMSO in CHP-212, IPC-298 and SK-MEL-2 treated with DMSO, PD-0325901 or PD-98059.]

Figure 3 | AHR expression may denote a tumour dependency targeted by MEK inhibitors in NRAS-mutant cell lines. a, Predictive features for PD-0325901 sensitivity (using the ‘varying baseline’ activity area) in validated NRAS-mutant cell lines. b, Growth inhibition curves for NRAS-mutant cell lines expressing high (red) or low (blue) levels of AHR mRNA in the presence of the MEK inhibitor PD-0325901. c, Relative AHR mRNA expression across a panel of NRAS-mutant cell lines (arrows indicate cell lines where AHR dependency was analysed). d–h, Proliferation of NRAS-mutant cell lines displaying high (d–f) and low (g, h) AHR mRNA expression, after introduction of shRNAs against AHR (red lines) or luciferase (blue lines). i, Left: proliferation of IPC-298 cells (high AHR) after introduction of additional shRNAs against AHR (shAHR_1 and shAHR_4; green and purple lines, respectively) or luciferase (control shLuc; blue line). Right: corresponding immunoblot analysis of AHR protein. j, Equivalent studies as in i using SK-MEL-2 cells (high AHR). k, Endogenous CYP1A1 mRNA expression in the neuroblastoma line CHP-212 or the melanoma lines IPC-298 and SK-MEL-2 after exposure to vehicle (blue) or MEK inhibitors (PD-0325901, green or PD-98059, purple). Error bars indicate standard deviation between replicates, with n = 12 (b), n = 3 (c), n = 6 (d–k).
mechanistic biomarker for enhanced MEK inhibitor sensitivity in this setting.

We also looked for markers predictive of response to several conventional chemotherapeutic agents (Supplementary Fig. 7 and Supplementary Table 6) and identified SLFN11 expression as the top correlate of sensitivity to irinotecan (Fig. 4a), a camptothecin analogue that inhibits the topoisomerase I (TOP1) enzyme. SLFN11 expression also emerged as the top predictor of topotecan sensitivity (another TOP1 inhibitor; Supplementary Figs 8 and 14). Overall, 12 of 16 lineages showed significant SLFN11 associations for topotecan or irinotecan sensitivity (Pearson’s r ≥ 0.2, Supplementary Fig. 14b). This finding was independently validated using data from the NCI-60 collection (Supplementary Fig. 15). SLFN11 knockdown did not affect steady-state growth sensitivity profiles (Supplementary Fig. 14d–f).
[Figure 4 graphic not reproduced. Panels: a, elastic net feature weights and irinotecan activity area across cell lines, with features Expr. SPOCK3 (0.7), Expr. TMEM90B (0.73), Mut. LOF+nnMS ULK4 (0.78), Mut. nnMS SEPT9 (0.72), Expr. NF1 (0.85) and Expr. SLFN11 (1); b, per cent inhibition (median) versus log10 [Irinotecan (M)] for the Ewing’s sarcoma lines MHH-ES-1, SK-ES-1 and TC-71 and for HCC-56 and SK-HEP-1; c, SLFN11 expression (log2, RMA) across primary tumour types, with sample numbers in parentheses.]

Figure 4 | Predicting sensitivity to topoisomerase I inhibitors. a, Elastic net regression analysis of genomic correlates of irinotecan sensitivity is shown for 250 cell lines. b, Dose–response curves for three Ewing’s sarcoma cell lines (MHH-ES-1, SK-ES-1 and TC-71) and two control cell lines with low SLFN11 expression (HCC-56 and SK-HEP-1). Grey vertical bars, standard deviation of the mean growth inhibition (n = 2). c, SLFN11 expression across 4,103 primary tumours. Box-and-whisker plots show the distribution of mRNA expression for each subtype, ordered by the median SLFN11 expression level (line), the inter-quartile range (box) and up to 1.5× the inter-quartile range (bars). Sample numbers (n) are indicated in parentheses.
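The elastic net panels (Figs 2c, 3a and 4a) list, in parentheses, how often a feature was retained across bootstrapped models. The sketch below illustrates that idea and is not the authors' implementation: it uses a small coordinate-descent elastic net solver written for this example, synthetic data in which only one feature drives the response, and hypothetical settings for the penalty strength (`lam`), the L1/L2 mixing parameter (`alpha`) and the number of resamples (`n_boot`). Only the 80% cutoff comes from the Fig. 2 legend.

```python
import random


def soft_threshold(z, t):
    """Shrink z towards zero by t (the lasso part of the penalty)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2).

    X is a list of rows (cell lines) of centred features; y is centred.
    """
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    beta = [0.0] * p
    resid = list(y)  # y - X @ beta, starting from beta = 0
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            norm = sum(v * v for v in col) / n
            if norm == 0.0:
                continue  # a constant feature carries no information
            # Covariance of feature j with the residual that ignores feature j.
            rho = sum(v * (r + v * beta[j]) for v, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam * alpha) / (norm + lam * (1.0 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - v * delta for v, r in zip(col, resid)]
                beta[j] = new
    return beta


def centre(values):
    """Subtract the mean from a list of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def bootstrap_frequency(X, y, lam, alpha, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each feature gets a non-zero weight."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cell lines with replacement
        cols = [centre([X[i][j] for i in idx]) for j in range(p)]
        Xb = [[cols[j][k] for j in range(p)] for k in range(n)]
        yb = centre([y[i] for i in idx])
        beta = elastic_net(Xb, yb, lam, alpha)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Synthetic example: 40 "cell lines", 5 features, only feature 0 drives the response.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(40)]
y = [2.0 * row[0] + data_rng.gauss(0, 0.5) for row in X]

freq = bootstrap_frequency(X, y, lam=0.5, alpha=0.5)
stable = [j for j, f in enumerate(freq) if f > 0.8]  # cutoff echoing the >80% rule
print("selection frequency per feature:", freq)
print("features kept:", stable)
```

Reporting the selection frequency alongside the weight, as the figure panels do, separates features that are chosen consistently from those that enter the model only for particular resamples of the cell lines.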


All three Ewing’s sarcoma cell lines screened showed both high SLFN11 expression and sensitivity to irinotecan (Fig. 4b and Supplementary Fig. 14). Ewing’s sarcomas also exhibited the highest SLFN11 expression among 4,103 primary tumour samples spanning 39 lineages (Fig. 4c), suggesting that TOP1 inhibitors might offer an effective treatment option for this cancer type. Towards this end, several ongoing trials in Ewing’s sarcoma are examining irinotecan-based combinations, or the addition of topotecan to standard regimens²⁶. For some lineages with high SLFN11 expression (for example, cervical adenocarcinoma), topoisomerase inhibitors already comprise a standard chemotherapy regimen. In other tumours where topoisomerase inhibitors are commonly used (for example, colorectal and ovarian cancers), a range of SLFN11 expression was observed, raising the possibility that high SLFN11 expression might enrich for tumours more likely to respond. If confirmed in correlative clinical studies, SLFN11 expression may offer a means to stratify patients for topoisomerase inhibitor treatment.

By assembling the CCLE, we have expanded the process of detailed annotation of preclinical human cancer models (http://www.broadinstitute.org/ccle). Genomic predictors of drug sensitivity revealed both known and novel candidate biomarkers of response. Even within genetically defined sub-populations, or when agents were broadly active without clear genetic targets, elastic net modelling studies identified key predictors or mechanistic effectors of drug response. Additional efforts that increase the scale and provide complementary types of information (for example, whole-genome/transcriptome sequencing, epigenetic studies, metabolic profiling or proteomic/phosphoproteomic analysis) should enable additional insights. In the future, comprehensive and tractable cell-line systems provided through this and other efforts²⁷ may facilitate numerous advances in cancer biology and drug discovery.

METHODS SUMMARY
A total of 947 independent cancer cell lines were profiled at the genomic level (data available at http://www.broadinstitute.org/ccle and Gene Expression Omnibus (GEO) using accession number GSE36139) and compound sensitivity data were obtained for 479 lines (Supplementary Table 11). Mutation information was obtained both by using massively parallel sequencing of >1,600 genes (Supplementary Table 12) and by mass spectrometric genotyping (OncoMap), which interrogated 492 mutations in 33 known oncogenes and tumour suppressors. Genotyping/copy number analysis was performed using Affymetrix Genome-Wide Human SNP Array 6.0 and expression analysis using the GeneChip Human Genome U133 Plus 2.0 Array. Eight-point dose–response curves were generated for 24 anticancer drugs using an automated compound-screening platform. Compound sensitivity data were used for two types of predictive models that used the naive Bayes classifier or the elastic net regression algorithm. The effects of AHR expression silencing on cell viability were assessed by stable expression of shRNA lentiviral vectors targeting either this gene or luciferase as control. The effect of compound treatment on AHR target gene expression was assessed by quantitative RT–PCR. A full description of the Methods is included in Supplementary Information.

Received 25 July 2011; accepted 1 March 2012.

10. Solit, D. B. et al. BRAF mutation predicts sensitivity to MEK inhibition. Nature 439, 358–362 (2006).
11. Staunton, J. E. et al. Chemosensitivity prediction by transcriptional profiling. Proc. Natl Acad. Sci. USA 98, 10787–10792 (2001).
12. Weinstein, J. N. et al. An information-intensive approach to the molecular pharmacology of cancer. Science 275, 343–349 (1997).
13. Thomas, R. K. et al. High-throughput oncogene mutation profiling in human cancer. Nature Genet. 39, 347–351 (2007).
14. Beroukhim, R. et al. The landscape of somatic copy-number alteration across human cancers. Nature 463, 899–905 (2010).
15. Ross, D. T. et al. Systematic variation in gene expression patterns in human cancer cell lines. Nature Genet. 24, 227–235 (2000).
16. Zou, H. & Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. B 67, 301–320 (2005).
17. Konecny, G. E. et al. Activity of the dual kinase inhibitor lapatinib (GW572016) against HER-2-overexpressing and trastuzumab-treated breast cancer cells. Cancer Res. 66, 1630–1639 (2006).
18. Tsai, J. et al. Discovery of a selective inhibitor of oncogenic B-Raf kinase with potent antimelanoma activity. Proc. Natl Acad. Sci. USA 105, 3041–3046 (2008).
19. Zou, H. Y. et al. An orally available small-molecule inhibitor of c-Met, PF-2341066, exhibits cytoreductive antitumor efficacy through antiproliferative and antiangiogenic mechanisms. Cancer Res. 67, 4408–4417 (2007).
20. Müller, C. R. et al. Potential for treatment of liposarcomas with the MDM2 antagonist Nutlin-3A. Int. J. Cancer 121, 199–205 (2007).
21. Nishio, M. et al. Serum heparan sulfate concentration is correlated with the failure of epidermal growth factor receptor tyrosine kinase inhibitor treatment in patients with lung adenocarcinoma. J. Thorac. Oncol. 6, 1889–1894 (2011).
22. Guo, W. et al. Formation of 17-allylamino-demethoxygeldanamycin (17-AAG) hydroquinone by NAD(P)H:quinone oxidoreductase 1: role of 17-AAG hydroquinone in heat shock protein 90 inhibition. Cancer Res. 65, 10006–10015 (2005).
23. Kelland, L. R., Sharp, S. Y., Rogers, P. M., Myers, T. G. & Workman, P. DT-Diaphorase expression and tumor cell sensitivity to 17-allylamino, 17-demethoxygeldanamycin, an inhibitor of heat shock protein 90. J. Natl Cancer Inst. 91, 1940–1949 (1999).
24. Moreau, P. et al. Phase I study of the anti insulin-like growth factor 1 receptor (IGF-1R) monoclonal antibody, AVE1642, as single agent and in combination with bortezomib in patients with relapsed multiple myeloma. Leukemia 25, 872–874 (2011).
25. Reiners, J. J. Jr, Lee, J. Y., Clift, R. E., Dudley, D. T. & Myrand, S. P. PD98059 is an equipotent antagonist of the aryl hydrocarbon receptor and inhibitor of mitogen-activated protein kinase kinase. Mol. Pharmacol. 53, 438–445 (1998).
26. Wagner, L. M. et al. Temozolomide and intravenous irinotecan for treatment of advanced Ewing sarcoma. Pediatr. Blood Cancer 48, 132–139 (2007).
27. Garnett, M. J. et al. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature http://dx.doi.org/10.1038/nature11005 (this issue).

Supplementary Information is linked to the online version of the paper at www.nature.com/nature.

Acknowledgements We thank the staff of the Biological Samples Platform, the Genetic Analysis Platform and the Sequencing Platform at the Broad Institute. We thank S. Banerji, J. Che, C. M. Johannessen, A. Su and N. Wagle for advice and discussion. We are grateful for the technical assistance and support of G. Bonamy, R. Brusch III, E. Gelfand, K. Gravelin, T. Huynh, S. Kehoe, K. Matthews, J. Nedzel, L. Niu, R. Pinchback, D. Roby, J. Slind, T. R. Smith, L. Tan, V. Trinh, C. Vickers, G. Yang, Y. Yao and X. Zhang. The Cancer Cell Line Encyclopedia project was enabled by a grant from the Novartis Institutes for Biomedical Research. Additional funding support was provided by the National Cancer Institute (M.M., L.A.G.), the Starr Cancer Consortium (M.F.B., L.A.G.), and the NIH Director’s New Innovator Award (L.A.G.).

Author Contributions For the work described herein, J.B. and G.C. were the lead research scientists; N.S., K.V. and A.A.M. were the lead computational biologists; M.P.M., W.R.S., R.S. and L.A.G. were the senior authors. J.B., G.C., S.K., P.M., J.M., J.T., A.S., N.L. and K.A. performed cell-line procural and processing; P.M. and K.A. performed or directed nucleic acid extraction and quality control; S.G., W.W. and S.B.G. performed or directed genomic data generation; C.J.W., F.A.M., E.B.-F., I.H.E., P.A., M.d.S., K.J. and V.E.M. performed pharmacological data generation; N.S., K.V., G.V.K., A.R., M.F.B., J.C., G.K.Y., M.D.J., T.L., M.R. and G.G. contributed to software development; N.S., K.V., A.A.M., J.L., G.V.K., D.S., A.R., M.L., M.F.B., A.K., P.R., J.C., G.K.Y., J.Y., M.D.J., L.W., C.H., E.P., J.P.M., V.C.
1. Caponigro, G. & Sellers, W. R. Advances in the preclinical testing of cancer and M.P.M. performed computational biology and bioinformatics analysis; J.B., G.C.,
therapeutic hypotheses. Nature Rev. Drug Discov. 10, 179–187 (2011). N.S., L.M., J.E.M., J.J.-V., M.P.M., W.R.S., R.S. and L.A.G. performed biological analysis and
2. MacConaill, L. E. & Garraway, L. A. Clinical implications of the cancer genome. interpretation; N.S., K.V., A.A.M., J.L., A.R., M.L., L.M., A.K., J.J.-V., J.C., G.K.Y. and J.Y.
J. Clin. Oncol. 28, 5219–5228 (2010). prepared figures and tables for the main text and Supplementary Information; J.B., G.C.,
3. Lin, W. M. et al. Modeling genomic diversity and tumor dependency in malignant N.S., K.V., A.A.M., J.L., G.V.K., J.J.-V., M.P.M. and L.A.G. wrote and edited the main text and
melanoma. Cancer Res. 68, 664–673 (2008). Supplementary Information; J.B., G.C., N.S., K.V., S.K., C.J.W., J.L., S.M., C.S., R.C.O., T.L.,
4. Neve, R. M. et al. A collection of breast cancer cell lines for the study of functionally L.McC., W.W., M.R., N.L., S.B.G., K.A. and V.C. performed project management; J.P.M.,
distinct cancer subtypes. Cancer Cell 10, 515–527 (2006). V.E.M., B.L.W., J.P., M.W., P.F., J.L.H., M.M. and T.R.G. contributed project oversight and
5. Sos, M. L. et al. Predicting drug susceptibility of non-small cell lung cancers based advisory roles; and M.P.M., W.R.S., R.S. and L.A.G. provided overall project leadership.
on genetic lesions. J. Clin. Invest. 119, 1727–1740 (2009).
6. Dry, J. R. et al. Transcriptional pathway signatures predict MEK addiction and Author Information Data have been deposited in the Gene Expression Omnibus (GEO)
response to selumetinib (AZD6244). Cancer Res. 70, 2264–2273 (2010). using accession number GSE36139 and are also available at http://
7. Garraway, L. A. et al. Integrative genomic analyses identify MITF as a lineage survival www.broadinstitute.org/ccle. Reprints and permissions information is available at
oncogene amplified in malignant melanoma. Nature 436, 117–122 (2005). www.nature.com/reprints. The authors declare competing financial interests: details
8. Greshock, J. et al. Molecular target class is predictive of in vitro response profile. accompany the full-text HTML version of the paper at www.nature.com/nature.
Cancer Res. 70, 3677–3686 (2010). Readers are welcome to comment on the online version of this article at
9. McDermott, U. et al. Identification of genotype-correlated sensitivity to selective www.nature.com/nature. Correspondence and requests for materials should be
kinase inhibitors by using high-throughput tumor cell line profiling. Proc. Natl addressed to L.A.G. (Levi_Garraway@dfci.harvard.edu) or R.S.
Acad. Sci. USA 104, 19936–19941 (2007). (robert.schlegel@novartis.com).

29 MARCH 2012 | VOL 483 | NATURE | 607
©2012 Macmillan Publishers Limited. All rights reserved
CORRECTIONS & AMENDMENTS
ADDENDUM
doi:10.1038/nature11735

Addendum: The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity

Jordi Barretina, Giordano Caponigro, Nicolas Stransky, Kavitha Venkatesan, Adam A. Margolin, Sungjoon Kim, Christopher J. Wilson, Joseph Lehár, Gregory V. Kryukov, Dmitriy Sonkin, Anupama Reddy, Manway Liu, Lauren Murray, Michael F. Berger, John E. Monahan, Paula Morais, Jodi Meltzer, Adam Korejwa, Judit Jané-Valbuena, Felipa A. Mapa, Joseph Thibault, Eva Bric-Furlong, Pichai Raman, Aaron Shipway, Ingo H. Engels, Jill Cheng, Guoying K. Yu, Jianjun Yu, Peter Aspesi Jr, Melanie de Silva, Kalpana Jagtap, Michael D. Jones, Li Wang, Charles Hatton, Emanuele Palescandolo, Supriya Gupta, Scott Mahan, Carrie Sougnez, Robert C. Onofrio, Ted Liefeld, Laura MacConaill, Wendy Winckler, Michael Reich, Nanxin Li, Jill P. Mesirov, Stacey B. Gabriel, Gad Getz, Kristin Ardlie, Vivien Chan, Vic E. Myer, Barbara L. Weber, Jeff Porter, Markus Warmuth, Peter Finan, Jennifer L. Harris, Matthew Meyerson, Todd R. Golub, Michael P. Morrissey, William R. Sellers, Robert Schlegel & Levi A. Garraway

Nature 483, 603–607 (2012); doi:10.1038/nature11003

In the Supplementary Information of this Letter, the use of distinct data normalization and directionality methods for pharmacological response calculations caused minor inconsistencies. We have therefore updated Supplementary Table 11 and some of the Supplementary Figures to resolve any confusion (see the Supplementary Information to this Addendum). We also wish to describe the relevant drug sensitivity normalization and response score calculations more completely.

Two versions of the drug response data were generated. First, raw activity values were calculated at each dose as $A = 100(T/U - 1)$, in which T represents the Cell Titer Glo (CTG) level measured for the compound-treated well, and U is the median level of the untreated wells across the plate. This raw A is 0% with no drug and −100% for fully active compounds, when no CTG is detected. Second, the data were adjusted to a plate surface pattern and normalized to the MG132 positive control, as described in the Supplementary Methods. This normalized A is also 0% with no drug, but −100% corresponds to the median MG132 response on that plate. Although normalized drug responses were used to determine EC50, IC50 and Amax values, we used the raw drug responses for calculating the activity area (AA). This distinction is now clear in the corrected Supplementary Table 11 (the two AA measures, derived from raw or normalized data, correlate closely: r = 0.98).

The activity area is the sum of differences between the measured $A_i$ at concentration i and A = 0, excluding positive A values: $AA = \sum_i \{0 - \min(0, A_i/100)\}$. This AA has a value of 0 with no drug, and +8 for a compound inhibiting at A = −100% at all eight drug concentrations, as illustrated in Fig. 2b of the original Letter. We hope that this definition eliminates any confusion that may have existed in the original Supplementary Methods (page 13) and enables others to reproduce our AA results starting from the raw drug sensitivity data. As a further means of clarification, we have added three columns to Supplementary Table 11 showing the raw (non-normalized) response data necessary to calculate AA, MG132 activity, and AA derived from normalized response data.

In addition, although all computational analyses used the above AA formula, a few Supplementary Figures (Supplementary Figs 6, 11, 9 and 14b) used a scale showing 8 − AA. This value was used for display purposes, so that low values corresponded to sensitive cell lines and the visualization remained consistent with other sensitivity metrics (IC50, Amax). This specification was noted in Supplementary Fig. 8 but had been inadvertently cut off the Supplementary Fig. 9 legend. We have therefore updated the Supplementary Fig. 9 legend to clarify where an inverted scale was used, and updated the scale of Supplementary Figs 6, 11 and 14b to reflect our definition of AA (noted above).

These changes do not affect the analyses, results or scientific conclusions presented in the paper. The authors are indebted to B. Yadav, who alerted them to these inconsistencies.

Supplementary Information is available in the online version of the Addendum.
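
The raw-activity and activity-area definitions given in the Addendum can be illustrated with a minimal Python sketch. It assumes the raw activity is A = 100(T/U − 1), with U the median of the untreated wells, and AA = Σ_i {0 − min(0, A_i/100)}. The function names and the CTG readings are hypothetical and chosen only for illustration; they are not taken from the paper or its Supplementary Table 11, and the plate-pattern adjustment and MG132 normalization are not modelled.

```python
from statistics import median


def raw_activity(treated, untreated):
    """Raw activity A = 100 * (T / U - 1), in percent.

    `treated` is the CTG level of one compound-treated well (T);
    `untreated` is the list of untreated-well CTG levels on the plate,
    whose median is U.
    """
    u = median(untreated)
    return 100.0 * (treated / u - 1.0)


def activity_area(activities):
    """Activity area AA = sum_i (0 - min(0, A_i / 100)).

    Positive A values (apparent growth above control) contribute nothing.
    """
    return sum(0.0 - min(0.0, a / 100.0) for a in activities)


# Hypothetical CTG readings (arbitrary units), one per dose in an
# eight-point dose-response curve. Not data from the paper.
untreated_wells = [1000.0, 980.0, 1020.0]
treated_wells = [1000.0, 950.0, 800.0, 600.0, 400.0, 200.0, 100.0, 0.0]

activities = [raw_activity(t, untreated_wells) for t in treated_wells]
aa = activity_area(activities)

# Inverted scale described for some Supplementary Figures: low = sensitive.
display_value = 8 - aa

print(f"AA = {aa:.2f}, 8 - AA = {display_value:.2f}")
```

Under these definitions a compound with no effect at any dose gives AA = 0, and one that abolishes the CTG signal at all eight doses gives AA = 8, which matches the range stated in the Addendum.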
