DNA Methylation-Carcinogenesis-2009

Carcinogenesis vol.30 no.3 pp.416–422, 2009
doi:10.1093/carcin/bgp006
Advance Access publication January 6, 2009

Epigenetic profiling reveals etiologically distinct patterns of DNA methylation in head and neck squamous cell carcinoma

Carmen J.Marsit1,†,*, Brock C.Christensen1,†, E.Andres Houseman2,3, Margaret R.Karagas4, Margaret R.Wrensch5,6, Ru-Fang Yeh6, Heather H.Nelson7, Joseph L.Wiemels5,6, Shichun Zheng5, Marshall R.Posner8, Michael D.McClean9, John K.Wiencke5 and Karl T.Kelsey1,3

1Department of Pathology and Laboratory Medicine, Brown University, Box G-E537, Providence, RI 02912, USA, 2Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA, 3Department of Community Health, Center for Environmental Health and Technology, Brown University, Providence, RI 02912, USA, 4Department of Community and Family Medicine, Dartmouth Medical School, Lebanon, NH 03766, USA, 5Department of Neurological Surgery and 6Department of Epidemiology and Biostatistics, University of California-San Francisco, San Francisco, CA 94143, USA, 7Division of Epidemiology and Community Health, Masonic Cancer Center, University of Minnesota, Minneapolis, MN 55455, USA, 8Head and Neck Oncology Program, Dana-Farber Cancer Center, Boston, MA 02215, USA and 9Department of Environmental Health, Boston University School of Public Health, Boston, MA 02118, USA

*To whom correspondence should be addressed. Tel: +1 401 863 6508; Fax: +1 401 863 9008; Email: carmen_marsit@brown.edu
†These authors contributed equally to this work.

Downloaded from http://carcin.oxfordjournals.org/ by guest on May 28, 2014

Head and neck squamous cell carcinomas (HNSCCs) represent clinically and etiologically heterogeneous tumors affecting >40 000 patients per year in the USA. Previous research has identified individual epigenetic alterations and, in some cases, the relationship of these alterations with carcinogen exposure or patient outcomes, suggesting that specific exposures give rise to specific types of molecular alterations in HNSCCs. Here, we describe how different etiologic factors are reflected in the molecular character and clinical outcome of these tumors. In a case series of primary, incident HNSCC (n = 68), we examined the DNA methylation profile of 1413 autosomal CpG loci in 773 genes, in relation to exposures and etiologic factors. The overall pattern of epigenetic alteration could significantly distinguish tumor from normal head and neck epithelial tissues (P < 0.0001) more effectively than specific gene methylation events. Among tumors, there were significant associations between specific DNA methylation profile classes and tobacco smoking and alcohol exposures. Although there was a significant association between methylation profile and tumor stage (P < 0.01), we did not observe an association between these profiles and overall patient survival after adjustment for stage; however, methylation of a number of specific loci falling in different cellular pathways was associated with overall patient survival. We found that the etiologic heterogeneity of HNSCC is reflected in specific patterns of molecular epigenetic alterations within the tumors and that the DNA methylation profiles may hold clinical promise worthy of further study.

Abbreviations: FFPE, formalin-fixed paraffin embedded; HNSCC, head and neck squamous cell carcinoma; HPV, human papilloma virus; OOB, out of bag; RF, random forest.

Introduction

Head and neck squamous cell carcinoma (HNSCC) is a physically, etiologically and molecularly heterogeneous disease, with an annual incidence in the USA of >40 000 cases. The majority of head and neck cancers are associated with tobacco and alcohol use, acting both independently and synergistically (1,2). However, human papilloma virus (HPV), particularly the high-risk type 16, is associated with 20–25% of HNSCC, and individuals with HPV-positive disease compared with HPV-negative have better overall survival (3,4). Given the established association of etiologic factors with clinical outcome, identifying the molecular character of tumors arising from varying exposures will aid in understanding the mechanisms influencing prognosis and provide novel targets for diagnosis and therapy of HNSCC.

Study of the contribution of epigenetic alterations to tumor biology is now a vast field, and it is widely accepted that epigenetic alterations in target tissues are part of the causal path to the development of malignancy (5,6). DNA methylation-associated epigenetic silencing of tumor suppressor genes is an aberrant mark of cancer with considerable specificity. DNA hypermethylation in HNSCCs targets genes in pathways such as DNA repair, cell cycle control, apoptosis, angiogenesis, cell–cell interaction and metastasis (7). Associations among HPV16, smoking, betel nut use and methylation of specific genes have been identified (8–10). These findings, though, have focused on single gene methylation alterations and their associations to exposures, but have not examined how exposures might be influencing the overall processes leading to epigenetic alteration. We have previously demonstrated that exposures and age, in bladder cancer, lead to an increased propensity for gene promoter hypermethylation in a panel of 16 tumor suppressor genes (11). Using now available high-throughput technologies, we are better equipped to understand the process by which carcinogenic exposures act to alter the DNA methylation status of a developing tumor.

We aim to more completely understand the etiology of epigenetic alterations by examining the relationships between these alterations and carcinogen exposures. In this manner, we hope to define novel pathways through which HNSCC can arise and aid in the development of diagnostic screening tools and targeted therapies. We characterized DNA methylation profiles of primary human HNSCC tumors by examining DNA methylation status of 1400 CpG sites in 800 cancer-related genes in a population-based case series of incident, primary HNSCC and non-diseased head and neck epithelium. Both the diagnostic and prognostic utility of these markers were defined; and uniquely, we also revealed how etiologic factors responsible for head and neck carcinogenesis are associated with the molecular character of these tumors.

Subjects and methods

Study population
The study population has been described previously (8,12). Briefly, incident cases of histologically confirmed HNSCC were identified from nine medical facilities in the Boston, MA, metropolitan area. Diagnoses were confirmed by an independent study pathologist. All cases enrolled in the study provided written informed consent as approved by the Institutional Review Boards of the participating institutions. Archived pathology specimens were used for analysis of promoter hypermethylation, and a total of 42 formalin-fixed paraffin-embedded (FFPE) and 26 fresh-frozen tumor samples were selected for analysis. Data on HPV16 tumor DNA status and serology from the parent case–control study (3) have been previously reported. Demographic and exposure information was collected through self-administered questionnaires and clinical information through medical chart reviews. Table I shows the demographic characteristics of the final population studied. In addition to the case tumor tissues, non-malignant head and neck tissues from individuals without head and neck cancer were obtained from the National Disease Research Interchange. Clinicopathologic information is limited by this anonymous tissue bank, but all samples were obtained from patients who were not previously diagnosed with any cancer, and thus whose cause of death was not cancer related.

DNA extraction and methylation analysis
For FFPE tissues, tumor sections with the greatest proportion of malignant tissue were selected by the study pathologist for use in our molecular analyses.

© The Author 2009. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org
Epigenetic profiling in HNSCC

Table I. Characteristics of the subjects with tissue involved in methylation analysis

Characteristic                                    HNSCC cases (n = 68)    Non-diseased head and neck tissues (n = 11)
Age, mean (±SD)                                   57.6 (11.4)             66.2 (7.9)
Gender, n (%)
  Female                                          14 (21)                 3 (27)
  Male                                            54 (79)                 8 (73)
Sample location, n (%)
  Oral                                            35 (52)                 3 (28)
  Pharyngeal                                      26 (38)                 4 (36)
  Laryngeal                                       7 (10)                  4 (36)
Cigarette smoking(a), n (%)
  Never                                           16 (24)                 —
  Former                                          38 (56)                 —
  Current                                         13 (20)                 —
Lifetime average packs per day, mean (±SD)(a)     1.3 (0.5)               —
Number of years smoking, mean (±SD)(a)            32.2 (14.5)             —
Lifetime pack-years smoked(a)                     41.1 (26.1)             —
Lifetime average alcoholic drinks per week        28.3 (35.5)             —
Tumor HPV16 DNA status(b)
  Negative                                        56 (85%)                —
  Positive                                        10 (15%)                —

(a) Smoking data not available on one case and metrics of smoking (average packs per day, years smoked and pack-years smoked) based only on current and former smokers.
(b) HPV data not available on two cases.

Three 20 µm sections were cut from each FFPE tumor sample and the sections were transferred into microcentrifuge tubes. The paraffin was dissolved using Histochoice Clearing Agent (Sigma–Aldrich, St Louis, MO) followed by two washes with 100% ethanol and one wash with phosphate-buffered saline. The samples were then incubated in sodium dodecyl sulfate lysis solution (50 mM Tris–HCl, pH 8.1, 10 mM ethylenediaminetetraacetic acid, 1% sodium dodecyl sulfate) with proteinase K (Qiagen, Valencia, CA) overnight at 55°C. De-crosslinking was performed by adding NaCl (final concentration 0.7 M) and incubating at 65°C for 4 h. DNA was recovered using the Wizard DNA clean-up kit (Promega, Madison, WI) according to the manufacturer’s protocols. For fresh-frozen tumor tissues and peripheral blood samples, DNA was extracted using the QIAamp DNA mini kit according to the manufacturer’s protocol (Qiagen). Sodium bisulfite modification of the DNA was performed using the EZ DNA Methylation Kit (Zymo Research, Orange, CA) following the manufacturer’s protocol, with the addition of a 5 min initial incubation at 95°C prior to addition of the denaturation reagent. The de-crosslinking steps in the extraction as well as the 95°C incubation ensure more complete melting of the DNA and thus more complete sodium bisulfite conversion, especially for the highly cross-linked formalin-fixed specimens. Illumina GoldenGate® methylation bead arrays were used to simultaneously interrogate 1505 CpG loci associated with 803 cancer-related genes. This is a commercially available array designed by Illumina to interrogate genes with CpG islands considered cancer related. Bead arrays were run at the University of California-San Francisco Institute for Human Genetics, Genomics Core Facility according to the manufacturer’s protocol and as described in ref. (13).

Statistical analysis
We assembled data with BeadStudio Methylation software from the array manufacturer Illumina (San Diego, CA). All array data points are represented by fluorescent signals from both methylated (Cy5) and unmethylated (Cy3) alleles, and methylation level is given by $\beta = \max(\mathrm{Cy5}, 0) / (|\mathrm{Cy3}| + |\mathrm{Cy5}| + 100)$; the average methylation ($\beta$) value is derived from the 30 replicate methylation measurements for each locus. At each locus for each sample, the detection P-value is defined as 1 − P-value computed from the background model characterizing the chance that the signal was distinguishable from negative controls. Using this as a metric for quality control for sample performance, seven samples (9%), all FFPE, where <75% of loci had a detection P-value $< 1 \times 10^{-5}$, were dropped from analysis. A similar quality control for CpG loci eliminated those with median detection P-value >0.05 (n = 8, 0.5%).

Subsequent analyses were carried out in R. For exploratory/visualization purposes, hierarchical clustering using the Manhattan metric and average linkage was performed. Associations between sample type or covariates such as age or gender and methylation at individual CpG loci were tested with a generalized linear model. The beta distribution of average beta values was accounted for with a quasibinomial logit link with an estimated scale parameter constraining the mean between 0 and 1. In contrast, array-wide scanning for autosomal CpG loci (n = 1413) associations with sample type or covariate used false discovery rate correction and q-values computed by the qvalue package in R. A false discovery rate q-value of <0.05 was considered significant for this examination. Initial analyses were performed to examine if there were systematic differences in methylation profile between FFPE and fresh tumor samples. Clustering did not reveal linkage by sample type, nor was any association observed across individual loci, thus these sample types (FFPE and fresh) were combined in all subsequent analyses.

R was also used to build classifiers of sample type with the random forest (RF) approach, a supervised classification analysis. RF is a tree-based classification algorithm similar to Classification and Regression Tree (14–16) and was performed on CpG average beta values using RF R package version 4.5-18 by Liaw and Wiener. RF builds each individual tree by taking a bootstrap sample (sampling with replacement) of the original data and on average about one-third of the original data are not sampled [out of bag (OOB)]. Those sampled are used as the training set to grow the trees, and the OOB data are used as the test set. At each node of the tree, a random sample of m out of the total M variables is chosen and the best split is found among the m variables. The default value for m in the RF R package is $\sqrt{M}$. In this analysis, we will test a range of m from half of $\sqrt{M}$ to two times $\sqrt{M}$ and will use the m that gives the lowest prediction error. The OOB error rate is the percentage of time the RF prediction is incorrect. A test for association between methylation (predictors) and sample type was conducted by comparing the OOB error obtained on the data set with the null distribution of OOB errors obtained by permuting sample type labels and running the RF procedure 20 000 times.

For inference, data were clustered using a mixture model (17) with a mixture of beta distributions (18), and the number of classes was determined by Bayesian information criterion (19,20). The mixture model was fit by recursively partitioning the data using a 2-class mixture model, with Bayesian information criterion used as a criterion for the split, as described in ref. (21). Class membership was obtained from the mixture model and subsequent univariate associations were tested via permutation test with 10 000 permutations each. For continuous variables, the Kruskal–Wallis test statistic was used, whereas for categorical variables, the standard chi-square goodness-of-fit test was used. For those variables having a significant univariate association, multinomial logistic regression was used to model methylation class while controlling for potential confounders. Because of the potentially large number of methylation classes, logistic regression coefficients were regularized using a ridge (L2) penalty, with coefficients for a common (non-intercept) covariate across outcome levels shrunk toward zero (22,23); the tuning parameter was selected by minimizing Bayesian information criterion. To test multivariate associations with stage, ordinary logistic regression and likelihood ratio tests were used. Cox proportional hazards models and likelihood ratio tests of models with and without methylation classes were used for survival analyses. In addition, we performed a locus-by-locus analysis of the 500 most variable loci to examine associations with patient survival using a Cox proportional hazards model, using false discovery rate correction and q-values computed by the qvalue package in R.

Pathway analysis was conducted with the use of Ingenuity Pathways Analysis (Ingenuity® Systems, Redwood City, CA; www.ingenuity.com) (24). Canonical pathways analysis identified the pathways from the Ingenuity Pathways Analysis library of canonical pathways that were most significant to the data set. The top 500 most variable CpG gene loci tested locus-by-locus for a survival association that was also associated with a canonical pathway in the Ingenuity Pathways Knowledge Base were considered for the analysis. Since we are using a cancer panel, the 500 CpG gene loci analyzed were also used as the reference set to avoid bias toward cancer-related pathways by selecting ‘user data set’ in the edit analysis parameters window. The significance of the association between the data set and the canonical pathway was measured in two ways: (i) a ratio of the number of gene loci from the data set that map to the pathway divided by the total number of gene loci that map to the canonical pathway is displayed and (ii) Fisher’s exact test was used to calculate a P-value determining the probability that the association between the genes in the data set and the canonical pathway is explained by chance alone.

Results

Characterization of the profile of DNA methylation alterations in non-malignant head and neck epithelial tissues compared with HNSCC

C.J.Marsit et al.

tumor samples was completed using the Illumina Goldengate Methylation BeadArray. Unsupervised hierarchical clustering of the DNA methylation data with Manhattan distance and average linkage as the metric across the 1250 most variable autosomal loci (Figure 1A) depicts relatively tight clustering of the non-malignant tissues compared with the tumors, as well as the extent of variability in the methylation beta value across the loci. In a locus-by-locus analysis applying a false discovery rate cutoff q-value of 0.05, we identified 261 loci with significantly differential methylation between tumors and normal (see supplementary Figure S1 available at Carcinogenesis Online). Of those, 125 loci showed greater methylation in tumors compared with normal, whereas 136 loci exhibited lower methylation levels in tumors compared with normal tissues (see list of loci in supplementary Materials and Table 2, available at Carcinogenesis Online). The confusion matrix (Table II) resulting from RF analysis shows which samples are correctly classified, those that are


Fig. 1. (A) Unsupervised hierarchical clustering and heat map of methylation beta values for 1250 most variable loci across all samples. (B) Recursive
partitioning mixture model classification of normal and tumor head and neck tissues using all methylation beta values resulting in eight classes whose average
methylation beta values are represented in the heat map. Distribution of normal and tumor samples within each class is depicted in pie charts on the right.
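
The clustering behind panel (A) can be illustrated with a minimal pure-Python sketch of agglomerative clustering using Manhattan distance and average linkage. This is not the authors' R code: the function names and the four toy beta-value profiles are hypothetical and chosen only to make the merge order easy to follow.

```python
from itertools import combinations


def manhattan(u, v):
    """Manhattan (city-block) distance between two beta-value profiles."""
    return sum(abs(a - b) for a, b in zip(u, v))


def average_linkage(profiles):
    """Agglomerative clustering with average linkage.

    Returns the merge history as (members_a, members_b, distance), where the
    distance between two clusters is the mean pairwise Manhattan distance.
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        def linkage(pair):
            a, b = pair
            total = sum(manhattan(profiles[i], profiles[j]) for i in a for j in b)
            return total / (len(a) * len(b))

        # Merge the closest pair of clusters at each step.
        a, b = min(combinations(clusters, 2), key=linkage)
        merges.append((a, b, linkage((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
    return merges


# Hypothetical beta values (rows: samples, columns: CpG loci), illustration only.
samples = [
    [0.10, 0.20, 0.90],  # sample 0
    [0.12, 0.18, 0.88],  # sample 1
    [0.80, 0.75, 0.10],  # sample 2
    [0.78, 0.70, 0.15],  # sample 3
]
merges = average_linkage(samples)
for a, b, dist in merges:
    print(a, b, round(dist, 3))
```

Samples with similar profiles merge first, which is the behavior the heat map dendrogram summarizes; a real analysis would use an optimized library routine rather than this quadratic search.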


misclassified and the misclassification error rate for each sample type. Five normal tissues (45%) were misclassified as tumor tissues, while only two tumors were misclassified as normal tissues (3.0%). This results in an overall error rate of 8.86%, a significant improvement in sample classification compared with the expected error under the null hypothesis (P < 0.0001). These results, consistent with our previous work, suggest that use of overall patterns of methylation alterations may have more utility in capturing the tumorigenic process than do individual alterations.

Table II. RF classification of HNSCC tumor status using DNA methylation

                       Tumor sample    Non-diseased sample    Error rate (%)
Tumor sample           66              2                      2.9
Non-diseased sample    5               6                      45

Overall out of the bag (OOB) estimate of error rate = 8.86%, permutation test for association: P < 0.0001.

A recursive partitioning mixture model applied to methylation data from all autosomal loci in tumors and non-tumor head and neck epithelial tissues delineated 11 distinct methylation classes (Figure 1B). This model demonstrates that methylation class membership was a highly significant predictor of tumor status (permutation P < 0.0001).

To examine how known risk factors for HNSCC are associated with these profiles, we utilized a case series approach and reconstructed the recursive partitioning mixture models using only the tumor data (Figure 2A), resulting in the delineation of six tumor-specific classes. A permutation test with tumor stage, dichotomized as high (Stage III or IV) versus low (Stage I or II) revealed a significant association between methylation class membership and stage (P < 0.01). A logistic regression model of stage (see supplementary Table 1 available at Carcinogenesis Online) suggested that inclusion of methylation class is significant in predicting stage (likelihood ratio P < 0.01) and that membership in Class 6 was associated with a significantly reduced risk of high-stage disease (odds ratio 0.1, 95% confidence interval 0.01, 1.0). Membership in Class 2 showed a similar protective effect, whereas membership in Class 5 was associated with an


Fig. 2. Recursive partitioning mixture model classification of HNSCCs (A) resulting in six classes with average methylation beta values across loci depicted in
the heat map. (B) Average age of (C) lifetime average packs of cigarettes smoked per day by, (D) distribution of tumor location of and (E) lifetime average
alcoholic drinks per week consumed by patients whose samples are members of the distinct methylation classes depicted in (A).
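
Comparisons such as panel (B), a continuous variable across methylation classes, were tested in the paper with a permutation test on the Kruskal–Wallis statistic. The following is a minimal sketch of that idea under stated assumptions: the ages and class labels are invented for illustration, the H statistic omits the tie correction, and the (hits + 1)/(permutations + 1) convention for the P-value is a choice made here, not something the paper specifies.

```python
import random


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(values, labels):
    """Kruskal-Wallis H statistic (no tie correction)."""
    n = len(values)
    rank_sums, counts = {}, {}
    for rank, group in zip(average_ranks(values), labels):
        rank_sums[group] = rank_sums.get(group, 0.0) + rank
        counts[group] = counts.get(group, 0) + 1
    total = sum(rank_sums[g] ** 2 / counts[g] for g in counts)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


def permutation_p_value(values, labels, n_perm=10_000, seed=0):
    """Compare the observed H with H under randomly shuffled class labels."""
    rng = random.Random(seed)
    observed = kruskal_wallis_h(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if kruskal_wallis_h(values, shuffled) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical patient ages and methylation class labels, illustration only.
ages = [41, 45, 48, 70, 72, 75, 55, 58, 60, 62]
classes = [2, 2, 2, 4, 4, 4, 5, 5, 5, 5]
h_obs, p_val = permutation_p_value(ages, classes)
print(round(h_obs, 3), p_val)
```

Because the null distribution is built by shuffling labels, no chi-square approximation is needed, which is convenient when some classes contain only a handful of tumors.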


increased risk of high-stage disease, although the small numbers of tumors in these classes made these estimates unstable.

In order to identify if exposures leading to this disease are associated with these methylation classes, we examined the associations between individual risk factors for HNSCC and these classes. Methylation class was significantly associated with patient age as a continuous variable (permutation test P < 0.01, Figure 2B); methylation Class 2 members had lower patient age, and Class 4 higher age compared with other classes. Smoking intensity (packs per day) also significantly differed across methylation classes (P < 0.04, Figure 2C); Class 1 demonstrated lower smoking intensity, and Class 3 relatively high intensity. However, we did not observe a significant association of methylation class with smoking duration (years smoked) or pack-years smoked. A borderline significant association was observed with tumor site by methylation class (oral, pharyngeal and laryngeal, P < 0.1) (Figure 2D). Tumor HPV16 DNA status also demonstrated an association with methylation class that approached statistical significance (P < 0.1, Figure 2E); patients in Class 4 had a greater prevalence of HPV-positive tumors. Finally, lifetime average drinks per week also showed a strong differential trend by methylation class (P < 0.1, Figure 2F).

Multinomial logistic regression results are shown in Table III, with the classes numbered as they were in Figure 2A and with Class 5 serving as the referent class as this class had the largest membership. The overall Wald P-value indicates whether the covariate significantly differentiates class membership overall. Individual confidence intervals for each covariate within a class identify the magnitude of any association and significance of the association of a covariate on membership in that class compared with the referent class (Class 5), conditional on membership in either class.

Patient age and average alcohol drinks per week each significantly differentiated membership across classes (Wald P < 0.0001). Laryngeal tumors were less likely to be members of Class 1, and the odds of membership in Class 1 were significantly reduced with each year of age. In addition, the odds of membership in Class 1 compared with Class 5 were significantly decreased by almost 20% for each additional pack of cigarettes smoked per day on lifetime average. Each year of age reduced the odds of membership in Class 2 compared with Class 5 by >10%, and tumors in this class were most likely to be oral tumors compared with pharyngeal or laryngeal. Only age demonstrated a significant effect on membership in Class 3 and Class 4, leading to a reduced odds of membership in Class 3 compared with Class 5, but an increased odds of membership in Class 4 compared with Class 5. Class 6 tumors were significantly less likely to be HPV-positive tumors, but more likely to be from patients with greater lifetime alcohol exposures. These results overall suggest that differing etiologies of this disease influence the pattern of epigenetic alteration observed in the resulting tumors.

Next, we examined if the DNA methylation profiles or methylation at specific loci were associated with patient survival. Among the 68 samples examined for methylation using the array, there were 22 deaths and a mean of 2.75 years of follow-up among surviving patients (range 0.75–5 years). We found no significant association between methylation classes derived from the recursive partitioning mixture modeling procedure among tumors and overall patient survival, controlling for tumor stage and patient age.

Finally, we tested the hypothesis that biologic pathways, rather than overall profiles of methylation, are important in determining survival. To examine this hypothesis, we utilized Ingenuity Pathway Analysis to examine which specific pathways were overrepresented among the top 500 loci having both positive and negative correlation with survival as determined by loci-specific Cox proportional hazards analysis (24). The pathways identified to be significantly overrepresented are listed in Table IV, as well as correlation between the increase in methylation beta value of the genes represented by that pathway and patient survival, such that those with a positive correlation would represent loci whose increasing methylation level is associated with improved patient survival (i.e. hazard ratio <1), whereas those with a negative correlation are loci where increasing methylation is associated with poorer survival (i.e. hazard ratio >1). We also identified 18 loci with a false discovery rate <20% in their association with overall patient survival in models stratified by tumor stage and controlled for patient age, and those loci are shown in supplementary Table 2 (available at Carcinogenesis Online). Of note, only two of these 18 loci (ZAP70 and GP1BB) are associated with a hazard ratio >1, whereas 16 demonstrate a protective hazard ratio <1. Such a negative association with risk could indicate, in fact, that loss of methylation at these loci may be associated with increased risk, as one might expect from oncogene activation.

Discussion

This study revealed that patterns of epigenetic alteration, rather than the identification of single or small numbers of epigenetic alterations affecting individual genes, may hold greater utility in identifying the process through which carcinogens act epigenetically to drive tumorigenesis as well as in providing useful diagnostic value to these molecular markers. The utility of profiles is probably due to the strong, yet multidirectional, correlation between epigenetic alterations across loci, as we have previously reported (25), and as clearly seen in the

Table III. Multinomial logistic regression of methylation class membership by etiologic factors

Covariate                                            Class 1             Class 2             Class 3             Class 4             Class 6             Overall
                                                     OR (95% CI),        OR (95% CI),        OR (95% CI),        OR (95% CI),        OR (95% CI),        Wald P
                                                     n = 17              n = 3               n = 4               n = 4               n = 17

Age, per year                                        0.94 (0.94, 0.95)   0.89 (0.87, 0.91)   0.97 (0.97, 0.98)   1.08 (1.04, 1.12)   0.98 (0.98, 0.98)   0.0001
Tumor site                                                                                                                                               0.001
  Oral                                               Referent            Referent            Referent            Referent            Referent
  Pharyngeal                                         1.07 (0.93, 1.22)   0.95 (0.92, 0.98)   0.98 (0.90, 1.06)   0.94 (0.88, 1.01)   0.90 (0.79, 1.01)
  Laryngeal                                          0.95 (0.90, 1.00)   0.98 (0.97, 1.00)   1.06 (0.95, 1.19)   0.98 (0.96, 1.01)   1.01 (0.92, 1.12)
Tumor HPV16 DNA status                                                                                                                                   0.32
  Negative                                           Referent            Referent            Referent            Referent            Referent
  Positive                                           1.00 (0.90, 1.11)   1.01 (0.95, 1.07)   0.98 (0.94, 1.01)   1.09 (0.98, 1.20)   0.93 (0.86, 1.00)
Lifetime average packs of cigarettes per day smoked  0.84 (0.71, 0.99)   0.93 (0.85, 1.01)   1.07 (0.94, 1.22)   0.95 (0.85, 1.06)   0.92 (0.81, 1.04)   0.07
Lifetime average alcoholic drinks per week           0.99 (0.99, 1.00)   1.00 (0.99, 1.01)   1.01 (0.99, 1.02)   1.01 (0.99, 1.02)   1.02 (1.01, 1.02)   0.0001

The odds ratio for each covariate in each class is conditional on membership in the given class compared with Class 5, the referent class (n = 23). The model is controlled for all covariates listed in the table. Results in bold italics are considered statistically significant (P < 0.05). OR, odds ratio; CI, confidence interval.
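
As a reading aid for Table III, per-unit odds ratios can be converted to a percent change in odds and compounded over several units. The sketch below assumes the usual log-linear form of a logistic model; the helper names are hypothetical, and the two inputs are the Class 2 age and Class 1 smoking estimates from the table.

```python
import math


def odds_ratio_over(or_per_unit, units):
    """Odds ratio implied by a change of `units`, assuming log-odds are linear in the covariate."""
    return math.exp(units * math.log(or_per_unit))


def percent_change_in_odds(odds_ratio):
    """Percent change in odds corresponding to an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


# Class 2 versus Class 5: OR 0.89 per year of age (Table III).
print(round(percent_change_in_odds(0.89)))   # about an 11% reduction per year
print(round(odds_ratio_over(0.89, 10), 3))   # odds ratio implied by a 10-year age difference

# Class 1 versus Class 5: OR 0.84 per additional pack per day (Table III).
print(round(percent_change_in_odds(0.84)))   # about a 16% reduction per pack
```

The 10-year figure is an extrapolation from the per-year estimate and inherits its uncertainty, so it is best read as an interpretation aid rather than an additional result.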


Table IV. Pathways significantly overrepresented in analysis of loci-specific associations with patient survival

Pathway                                              Direction of correlation   P-value   Genes on array in pathway
                                                     with survival
Ephrin receptor signaling                            +                          0.009     SRC, EGF, CRK, EPHA2, GNG7, PDGFB, FGF1
SAPK/c-jun N-terminal kinase signaling               +                          0.012     LCK, GADD45A, CRK, GNG7
Platelet-derived growth factor signaling             +                          0.020     SRC, PDGFRA, CRK, JAK3, PDGFB
Cell cycle: G2/M DNA damage checkpoint regulation    +                          0.030     CDKN2A, GADD45A, CDKN1A, SFN
NF-κB signaling                                      +                          0.039     LCK, BMP4, IL1RN, LTA, ZAP70, PDGFRA, EGF, IRAK3
p53 signaling                                        +                          0.040     CDKN2A, GADD45A, CDKN1A, SERPINB5, BAX, SFN
Acute phase response signaling                       +                          0.043     IL1RN, IL6, RBP1
Hepatic fibrosis/hepatic stellate cell activation    +                          0.045     IFNG, IFNGR2, EGF, MYH11, MMP2, BAX, IL6, PDGFB, FGF1, COL1A1, CYP2E1, PDGFRA, TGFB3
Synaptic long-term depression                        -                          0.045     IGF1, GUCY2D, PLA2G2A, NOS2A, NOS3, NPR2
Purine metabolism                                    -                          0.049     TJP2, GUCY2D, PDE1B, NPR2

SAPK, stress activated protein kinase.
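
The pathway P-values above come from Fisher's exact test as run inside Ingenuity Pathways Analysis. A minimal sketch of the underlying calculation is the upper tail of a hypergeometric distribution, shown below. The right-tailed form is an assumption about how the tool applies the test, the function name is hypothetical, and apart from the 500-locus reference set described in the methods, the counts are invented for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n loci are drawn without replacement from N loci,
    K of which belong to the pathway (right-tailed Fisher's exact test)."""
    top = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, top + 1))
    return favorable / comb(N, n)


# Reference set of 500 loci (from the methods); the remaining counts are hypothetical.
N = 500   # loci in the reference set
K = 20    # reference loci that map to the pathway
n = 50    # loci associated with survival
k = 7     # survival-associated loci that map to the pathway

p_value = hypergeom_upper_tail(k, K, n, N)
print(p_value)
```

Using the analyzed loci as the reference set, as the authors describe, keeps the denominator restricted to genes that could have been detected on a cancer-focused array.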



present data. Our method for identification of methylation classes based solely on the patterns of DNA methylation alterations derived from the GoldenGate Methylation BeadArray limits the bias introduced by selection of candidate genes for examination and allows for statistical inference, so that the association between these classes and various clinical or risk factors can be assessed with rigor. It is also critically important to note that it is not only hypermethylation of specific loci that is used in defining these signatures of epigenetic alterations between non-malignant and malignant tissues but also hypomethylation of loci at particular CpG sites. The locus-by-locus approach suggested that 125 loci have greater methylation in tumors compared with normal tissues, whereas 136 loci exhibit lower methylation levels in tumors compared with normal tissues. The RF classification suggests that using DNA methylation at all CpG loci may be clinically useful in confirming tumor, as misclassification of tumor as non-tumor tissue occurred at a frequency of only 3%, but that as a screening tool, differentiating non-tumor from tumor tissue, misclassification may be higher. The recursive partitioning approach defined patterns that differentiate well between non-diseased oral epithelium and malignant tissues, with class membership highly significantly associated with tissue type (P < 0.0001). Similarly, looking specifically within tumors, these methylation alteration patterns significantly differentiated between low-stage (I, II) and high-stage (III, IV) disease (P < 0.01), with Class 6 being more probably of lower stage than Class 1, controlling for the other methylation patterns. It should be noted that differentiation of adjacent normal tissue in patients with tumors may be more complicated than differentiation of tumor from normal tissue of completely non-diseased individuals, and such an examination is worthy of further study.

Patients whose tumors comprised Class 3 had greater exposures to carcinogens causal to this disease, namely tobacco smoke and alcohol, and their tumors were more often laryngeal (Table III). Thus, the chemical carcinogens, particularly those in cigarette smoke, may be leading to this specific pattern of epigenetic alteration, either directly or through mutation of key genes related to the control of DNA methylation. On the other hand, the patients whose tumors comprised Class 1 demonstrated modest or low exposures to tobacco and alcohol, were not, for the most part, HPV16 DNA positive within the tumors and were relatively young compared with classes 3–6. Thus, these individuals may represent a group of people with true susceptibility for this disease, such that even low to modest exposure can initiate oral carcinogenesis. Further studies are necessary to confirm these observations in a larger series of tumors and may also consider integration of these profiles with genome-wide polymorphism studies to aid in identifying the genes or pathways responsible for this increased susceptibility and thereby aid in the prevention or treatment of this disease.

The individuals whose tumors reside in classes 4–6 demonstrated a greater intensity of tobacco smoke exposure as well as an elevated exposure to alcohol and appeared to be older than individuals of the other classes. Thus, the methylation patterns exhibited by these individuals may be driven more by these exposures and indicate that they are somewhat less susceptible to the disease, as they presented at a later age and required greater exposure to attain this disease.

We were unable to observe any significant associations between these methylation profiles and overall patient survival. This may indicate that our approach (defining methylation classes) classifies only the phenotypic results of methylation and is not integrating the obviously crucial results of genetic changes. Indeed, the epigenetic alteration profile may be driven predominantly by exposures that do not relate to patient outcome. Our analyses of pathways overrepresented among the loci demonstrating a relationship to patient survival suggest that patient outcome is more probably determined not by the pattern of methylation but by the combined targets of genetic and epigenetic alteration. The most significantly overrepresented pathway in this analysis was the 'ephrin receptor-signaling pathway', containing a number of oncogenes such as SRC, EGF and EPHA2. The results of the Cox model demonstrate a positive relationship between methylation beta value of these loci and improved patient survival, or more appropriately that loss of methylation at these loci is associated with poorer patient survival. Overexpression of EPHA2, for example, has been linked to invasiveness in breast (26) and ovarian (27) tumors, and in general, deregulation and overexpression of this family of genes are related to changes in cell motility, morphology, migration and adhesion, all marks of an aggressive phenotype (28,29). Similarly, loss of methylation of the stress-activated protein kinase (SAPK)/c-jun N-terminal kinase and platelet-derived growth factor-signaling pathways, traditional oncogenic growth-signaling pathways, probably represents increasing tumor growth leading to poorer patient outcome. Gene-specific analyses also suggest that loss of methylation of critical oncogenic growth factors and receptors including HGF, FGF, ATP10A and NTRK3 is associated with poorer patient survival, whereas hypermethylation of ZAP70 and GP1BB is also associated with poor survival, further suggesting that these genes may be acting as novel tumor suppressors in HNSCC. Additional analyses using a larger independent set of tumors will aid in confirming these results and determining if a targeted panel of genes, such as those predicted by this array-based approach, can be clinically useful as prognostic indicators in this disease. In addition, these data suggest that integrative analyses of human tumor samples, including epigenetic alteration, genetic copy number changes and mutation, may be necessary to truly understand the biology and better identify meaningful disease biomarkers.

HNSCC is a relatively common tumor, characterized by its aggressiveness and high morbidity, particularly at later stages, as well as by the distinct clinical outcomes associated with its disparate etiologies. Molecular characterizations of HNSCC tumors have yet to clearly identify pathways responsible for driving the differential aggressiveness or treatment response related to HPV etiology. Due to its aggressiveness at high stages, the diagnostic and prognostic utility of somatic molecular alterations including DNA methylation and genetic alterations in accessible materials, such as saliva or buccal swabs, at early stages of disease development is highly desirable, but has proven elusive due to a general lack of markers with appropriate levels of sensitivity and specificity (30). Identification of novel sites or patterns of DNA methylation alterations in tumors may provide the necessary sensitivity and specificity to eventually make non-invasive, early diagnosis of this disease possible. In addition, understanding how the disease risk factors drive these alterations may allow for more specific diagnostic panels, as well as more targeted preventative strategies based on the molecular character of the resulting disease. It is clear that exposures causal to this disease not only can influence the genetic make-up of the tumor, as has been previously demonstrated, but also significantly and distinctly alter the epigenetic profile. This concept that carcinogens and lifestyle factors contribute to tumorigenesis through epigenetic mechanisms is critical and holds great promise in disease prevention and treatment. In order to define those individuals at greatest risk and to limit the effect of carcinogens in these individuals, the epigenetic character of the individual and the epigenetic mechanism of the carcinogens must be considered. In addition, treatment aimed at the epigenome holds great promise as we continue to understand the complex and powerful contribution that this mode of somatic alteration makes to tumorigenesis.

Supplementary materials

Supplementary Tables 1 and 2 and Figure S1 can be found at http://carcin.oxfordjournals.org/

Funding

Flight Attendants Medical Research Institute to C.J.M. and M.D.M.; National Institutes of Health (R01CA078609, R01CA100679, R01CA52689, P50CA097257); Tendrich/Berkow Fund; Friends of the Dana-Farber Cancer Institute.

Acknowledgements

Conflict of Interest Statement: None declared.

References

1. Blot,W.J. et al. (1988) Smoking and drinking in relation to oral and pharyngeal cancer. Cancer Res., 48, 3282–3287.
2. Hashibe,M. et al. (2007) Alcohol drinking in never users of tobacco, cigarette smoking in never drinkers, and the risk of head and neck cancer: pooled analysis in the International Head and Neck Cancer Epidemiology Consortium. J. Natl Cancer Inst., 99, 777–789.
3. Furniss,C.S. et al. (2007) Human papillomavirus 16 and head and neck squamous cell carcinoma. Int. J. Cancer, 120, 2386–2392.
4. Loning,T. et al. (1985) Analysis of oral papillomas, leukoplakias, and invasive carcinomas for human papillomavirus type related DNA. J. Invest. Dermatol., 84, 417–420.
5. Jones,P.A. et al. (1999) Cancer epigenetics comes of age. Nat. Genet., 21, 163–167.
6. Gaudet,F. et al. (2003) Induction of tumors in mice by genomic hypomethylation. Science, 300, 489–492.
7. Fan,C.Y. (2004) Epigenetic alterations in head and neck cancer: prevalence, clinical significance, and implications. Curr. Oncol. Rep., 6, 152–161.
8. Marsit,C.J. et al. (2006) Epigenetic inactivation of the SFRP genes is associated with drinking, smoking and HPV in head and neck squamous cell carcinoma. Int. J. Cancer, 119, 1761–1766.
9. Marsit,C.J. et al. (2004) Inactivation of the Fanconi anemia/BRCA pathway in lung and oral cancers: implications for treatment and survival. Oncogene, 23, 1000–1004.
10. Tran,T.N. et al. (2005) Frequent promoter hypermethylation of RASSF1A and p16INK4a and infrequent allelic loss other than 9p21 in betel-associated oral carcinoma in a Vietnamese non-smoking/non-drinking female population. J. Oral Pathol. Med., 34, 150–156.
11. Marsit,C.J. et al. (2007) Promoter hypermethylation is associated with current smoking, age, gender and survival in bladder cancer. Carcinogenesis, 28, 1745–1751.
12. Ting Hsiung,D. et al. (2007) Global DNA methylation level in whole blood as a biomarker in head and neck squamous cell carcinoma. Cancer Epidemiol. Biomarkers Prev., 16, 108–114.
13. Bibikova,M. et al. (2006) High-throughput DNA methylation profiling using universal bead arrays. Genome Res., 16, 383–393.
14. Breiman,L. (2001) Random forests. Mach. Learn., 45, 5–32.
15. Breiman,L. et al. (2007) http://www.math.usu.edu/adele/forests/. (20 January 2009, date last accessed).
16. Pico,A. et al. (2007) SNPLogic: an interactive web resource for the pathway-based selection and prioritization of SNPs for genotyping studies. Approaches to Complex Pathways in Molecular Epidemiology. Santa Ana Pueblo, NM.
17. Siegmund,K.D. et al. (2004) A comparison of cluster analysis methods using DNA methylation data. Bioinformatics, 20, 1896–1904.
18. Ji,Y. et al. (2005) Applications of beta-mixture models in bioinformatics. Bioinformatics, 21, 2118–2122.
19. Fraley,C. et al. (2002) Model-based clustering, discriminant analysis, and density estimation. J. Am. Stat. Assoc., 97, 611–631.
20. Houseman,E.A. et al. (2006) Feature-specific penalized latent class analysis for genomic data. Biometrics, 62, 1062–1070.
21. Houseman,E.A. et al. (2008) Model-based clustering of methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions. BMC Bioinformatics, 9, 365–380.
22. Hastie,T. et al. (2004) Efficient quadratic regularization for expression arrays. Biostatistics, 5, 329–340.
23. Cawley,G.C. et al. (2007) Sparse multinomial logistic regression via Bayesian L1 regularisation. In Scholkopf,B., Platt,J. and Hofmann,T. (eds.) Advances in Neural Information Processing Systems. Vol. 19. MIT Press, Cambridge, MA, pp. 209–216.
24. (2008) Ingenuity Pathways Analysis Application. Ingenuity Systems, Redwood City, CA.
25. Marsit,C.J. et al. (2006) Examination of a CpG island methylator phenotype and implications of methylation profiles in solid tumors. Cancer Res., 66, 10621–10629.
26. Fox,B.P. et al. (2004) Invasiveness of breast carcinoma cells and transcript profile: Eph receptors and ephrin ligands as molecular markers of potential diagnostic and prognostic application. Biochem. Biophys. Res. Commun., 318, 882–892.
27. Herath,N.I. et al. (2006) Over-expression of Eph and ephrin genes in advanced ovarian cancer: ephrin gene expression correlates with shortened survival. BMC Cancer, 6, 144.
28. Wimmer-Kleikamp,S.H. et al. (2005) Eph-modulated cell morphology, adhesion and motility in carcinogenesis. IUBMB Life, 57, 421–431.
29. Campbell,T.N. et al. (2008) The Eph receptor/ephrin system: an emerging player in the invasion game. Curr. Issues Mol. Biol., 10, 61–66.
30. Chai,R.L. et al. (2006) Advances in molecular diagnostics and therapeutics in head and neck cancer. Curr. Treat. Options Oncol., 7, 3–11.

Received October 8, 2008; revised November 24, 2008; accepted December 22, 2008
