LIFE Co-Exposures 092617 EST
Toward Capturing the Exposome: Exposure Biomarker Variability and Co-Exposure Patterns in the
Shared Environment

Ming Kei Chung, Kurunthachalam Kannan, Germaine M. Buck Louis, and Chirag J. Patel*

Department of Biomedical Informatics
Harvard Medical School
Harvard University
10 Shattuck Street
Boston, MA 02115

Division of Environmental Health Sciences
Wadsworth Center, New York State Department of Health, and
Department of Environmental Health Sciences
The University at Albany
Albany, New York 12201
Kurunthachalam.kannan@health.ny.gov

Office of the Director
Division of Intramural Population Health Research
Eunice Kennedy Shriver National Institute for Child Health and Human Development
The National Institutes of Health
6710b Rockledge Drive
Bethesda, MD 20892
1-301-496-6155
louisg@mail.nih.gov

*Corresponding author:
Chirag J. Patel

Department of Biomedical Informatics
Harvard Medical School
Harvard University
10 Shattuck Street, Room 302
Boston, MA 02115
1-617-432-1195
*E-mail: chirag_patel@hms.harvard.edu

Acknowledgements
The authors declare no competing financial interest. This study is supported by NIH grants
(ES025052 and ES023504) and the Intramural Research Program of the Eunice Kennedy Shriver
National Institute of Child Health and Human Development (contracts # N01-HD-3-3355; N01-HD-3-3356;
N01-HD-3-3358; HHSN27500002; HHSN27500003; HHSN27500006).

Supporting Information
Correlation heat maps showing the Spearman's rank correlations (rs) across 128 chemicals grouped in
13 different classes are supplied as Supporting Information.

Abstract
Many factors affect the variation in the exposome. We examined the influence of shared household
and partner's sex on the variation in 128 endocrine disrupting chemical (EDC) exposures
among couples. In a cohort comprising 501 couples trying for pregnancy, we measured 128 persistent
and non-persistent EDCs (13 chemical classes) and estimated 1) sex-specific differences; 2)
variance explained by shared household; and 3) Spearman's rank correlation coefficients (rs) for
females', males', and couples' exposures. Sex was correlated with 8 EDCs including polyfluoroalkyl
substances (PFASs) (p < 0.05). Shared household explained 43% and 41% of the total variance for PFASs
and blood metals, respectively, but less than 20% for the remaining 11 EDC classes. Co-exposure
patterns of the exposome were similar between females and males, with within-class rs higher for
persistent than for non-persistent chemicals. Median rs values of polybrominated compounds and urine
metalloids were 0.45 and 0.09, respectively, for females (0.41 and 0.08 for males; 0.21 and 0.04 for
couples). Individual, rather than shared, environment could be a major factor influencing the
co-variation of 128 markers of the exposome. Correlations between exposures are lower in couples than
in individual partners, which has important analytical and sampling implications for epidemiological
studies.

Keywords: exposome, co-exposures, EDCs, combined exposure, POPs, metals, phthalates, persistent [...]

TOC Abstract Art:

Introduction
[...] influenced by factors shared by individuals, such as those within a household, and non-shared
factors specific to individuals, such as their sex. One could apply the exposome paradigm 1, a model to
capture the totality of exposures from conception onwards, to comprehensively characterize the
individual-level and shared differences of exposure. However, the exposome is a dynamic entity with
variations across time (temporal) and place (spatial) 2, underscoring the importance of considering [...]

Instead of trying to capture all lifetime exposures, investigators can focus on critical and sensitive
time windows in human development such as pregnancy, infancy, childhood and adolescence to [...]
sampling individuals from households) has been posited to be sufficient surrogates for all individuals in
the household 4. These assumptions may help characterize the exposome and study its
time-dependent relationship with health outcomes. For example, the Human Early-Life Exposome (HELIX)
project seeks to define the pregnancy and early-life exposomes and health 5, while the EXPOsOMICS
project builds its conceptual framework on a life-course approach to a broader range of exposures 6.

Another challenge is the difficulty of interpreting exposure-disease associations due to the
dense correlations among all exposures 7. First, the dense correlation pattern makes it hard to identify the
directionality of the potential causality 8. Second, correlations between exposures vary (e.g. absolute
median correlation from almost 0 to above 0.5) and, thus, there is no universal scale to assess the
biological significance 9. In addition, exposome-wide, or equivalently, environment-wide association
studies (EWASs), assess all the associations between exposures and an outcome to identify potential
etiologic signals 10. The data-driven approach assumes little to no collinearity between environmental
predictors, but it is almost impossible to select any single uncorrelated exposure from the dense
exposome. One strategy for addressing these analytical issues is to characterize the correlations in
diverse cohorts to provide reference levels to gauge the biological significance of associations.

We investigated whether cohabiting couples trying for pregnancy would have similar concentrations of
endocrine disrupting chemicals (EDCs) given their shared households, and whether concentrations
varied across partners in light of their individual exposures arising from other environments such as
lifestyle, recreation or occupation. This avenue of study is important given that EDCs have been found
to affect human fecundity and fertility 11,12, though much of the available evidence relies on research
conducted in either men or women but not couples. We utilized the Longitudinal Investigation of
Fertility and the Environment (LIFE) Study to empirically assess couples' shared and individual
variations in a mixture of EDCs. We selected the LIFE Study because it has 13 classes of EDCs quantified
in both partners of the couple, in keeping with the couple-based nature of human reproduction 13–15. To
meet our overarching aim, we characterize the dense correlation structure of couples' EDC
concentrations and then estimate shared and individual variability. Lastly, we discuss the implications [...]

Methods
[...] from 16 counties in Michigan and Texas, 2005–2009. Couples were followed daily until pregnant or up
to 12 months of trying to become pregnant (infertile). Study participants were screened for eligibility
based on a set of criteria and the complete details have been previously published 16.

Following enrollment and completion of the baseline interview, couples provided preconception blood
(20 mL) and urine (120 mL) samples for the quantification of both man-made and natural EDCs (e.g.
phytoestrogens). We also included serum cotinine, a metabolite of nicotine, for a total of 128
chemicals from 13 different classes (Table 1). Persistent EDCs included 4 classes of serum persistent [...]
(OCPs), 11 polybrominated chemicals [polybrominated diphenyl ethers (PBDEs) and 1 polybrominated
biphenyl (PBB)], and 7 polyfluoroalkyl substances (PFASs), and were quantified by a single laboratory [...]

[...] phytoestrogens, 14 phthalate metabolites, 6 phenols [bisphenol A (BPA) and benzophenones (BPs)], 12
antimicrobial chemicals [parabens, triclosan (TCS) and triclocarban (TCC)], and 2 paracetamol and derivatives,
and were quantified by another laboratory using published standard operating procedures 19–22. The
remaining 3 classes of EDCs comprised 3 blood metals, 17 urinary metals, and 4 urinary metalloids 23.

Table 1. List of chemical classes and measured chemicals in the current study.
Polychlorinated biphenyls | 36 | Congeners: 28, 44, 49, 52, 66, 74, 87, 99, 101, 105, 110, 114, 118, 128, 138, 146, 149, 151, 153, 156, 157, 167, 170, 172, 177, 178, 180, 183, 187, 189, 194, 195, 196, [...] | Serum | 1-3 pg/g, wet weight
Organochlorine pesticides | 9 | Hexachlorobenzene (HCB), -hexachlorocyclohexane (-HCH), [...] | Serum | 1-3 pg/g, wet weight
Polybrominated chemicals^a | 11 | Brominated biphenyl (BB 153); brominated diphenyl ethers (BDEs) congeners: 17, 28, 47, 66, 85, 99, 100, 153, 154, 183 | Serum | 5 pg/g, wet weight
Polyfluoroalkyl substances | 7 | 2-(N-ethyl-perfluorooctane sulfonamido) acetate (Et-PFOSA-AcOH), 2-(N-methyl- [...] | Serum | 0.04-0.1 ng/mL, wet [...]
Phytoestrogens | 6 | Genistein, daidzein, O-desmethylangolensin, equol, enterodiol, and enterolactone | Urine | 0.2-0.6 ng/mL
Phthalate metabolites | 14 | Mono (3-carboxypropyl) phthalate (mCPP), monomethyl phthalate (mMP), monoethyl phthalate (mEP), mono (2-isobutyl phthalate) (miBP), mono-n-butyl phthalate (mBP), [...] | Urine | 0.2-2 ng/mL
Phenols^b | 6 | Total bisphenol A (BPA); benzophenones (BPs): 4-hydroxybenzophenone (4-OH- [...] dihydroxy-4-methoxybenzophenone (2,2-OH-4-MeO-BP) | Urine | 0.02-0.05 ng/mL
Antimicrobial chemicals^c | 12 | Triclosan (TCS) and triclocarban (TCC); parabens: methyl paraben (MP), ethyl paraben (EP), propyl paraben (PP), butyl paraben (BP), benzyl paraben (BzP), heptyl [...] | Urine | 0.02-0.05 ng/mL
[...]
Blood metals | 3 | Cadmium (Cd), lead (Pb), and mercury (Hg) | Blood | 10 ng/L to 10 µg/L
Urine metals | 17 | Manganese (Mn), chromium (Cr), beryllium (Be), cobalt (Co), molybdenum (Mo), cadmium (Cd), tin (Sn), caesium (Cs), barium (Ba), nickel (Ni), copper (Cu), zinc (Zn), tungsten (W), platinum (Pt), thallium (Tl), lead (Pb), and uranium (U) | Urine | 10 ng/L to 10 µg/L
Urine metalloids | 4 | Selenium (Se), arsenic (As), antimony (Sb), and tellurium (Te) | Urine | 10 ng/L to 10 µg/L
[...]
^d Serum cotinine is not an EDC but is included for comprehensive investigation.

Serum lipids were quantified by measuring total cholesterol, free cholesterol, triglycerides, and
phospholipids by an enzymatic method 24, and we calculated the total serum lipids as described by
Phillips et al. 25. Creatinine was measured by a Roche/Hitachi Model 912 clinical analyzer (Dallas, TX,
USA) using the Creatinine Plus Assay (Roche Diagnostics). Serum cotinine was quantified by isotope [...]

Our overall analytical scheme is shown in Figure 1. First, we adjusted each of the indicators of
exposure for potential confounders in addition to total lipids (for lipophilic chemicals, n = 56) and
creatinine (for urinary chemicals, n = 61) to reduce estimate variability and susceptibility to bias 27,28.
Specifically, we adjusted PCBs, OCPs, and polybrominated chemicals for total lipids and all
urinary chemicals for creatinine. Chemical concentrations were log-transformed as log(x + 1) and
continuous covariates were rescaled to have mean zero and unit variance (Figure 1A). After extracting
the residuals, we calculated the Spearman's rank correlation (rs) matrices for EDCs in females, males,
and couples.

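The adjustment and correlation steps can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the simulated lipid and chemical values, and the use of ordinary least squares with a single covariate are all assumptions made for illustration.

```python
import numpy as np

def residualize(conc, covariates):
    """Residuals of log(x + 1) concentrations after a linear adjustment."""
    y = np.log1p(np.asarray(conc, dtype=float))
    z = np.asarray(covariates, dtype=float).reshape(len(y), -1)
    z = (z - z.mean(axis=0)) / z.std(axis=0, ddof=1)  # mean zero, unit variance
    design = np.column_stack([np.ones(len(y)), z])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta

def average_ranks(x):
    """Ranks 1..n, with tied values sharing their average rank."""
    x = np.asarray(x, dtype=float)
    ranks = np.empty(len(x))
    ranks[np.argsort(x, kind="mergesort")] = np.arange(1, len(x) + 1)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    return (np.bincount(inverse, weights=ranks) / counts)[inverse]

def spearman_matrix(residuals):
    """Spearman rs matrix: Pearson correlation of the column-wise ranks."""
    ranked = np.column_stack([average_ranks(col) for col in residuals.T])
    return np.corrcoef(ranked, rowvar=False)

# Hypothetical data: two lipophilic chemicals and total serum lipids.
rng = np.random.default_rng(0)
n = 200
lipids = rng.lognormal(mean=6.0, sigma=0.3, size=n)
chem_a = rng.lognormal(size=n) * lipids / lipids.mean()
chem_b = chem_a * rng.lognormal(sigma=0.5, size=n)

resid = np.column_stack([residualize(c, lipids) for c in (chem_a, chem_b)])
rs = spearman_matrix(resid)
```

Here `rs[0, 1]` is the lipid-adjusted rank correlation between the two simulated chemicals; in a real analysis the covariate set would follow the scheme in Figure 1.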
Multiplicity adjustment procedures such as the Bonferroni correction assume independence between the
variables tested. One way to resolve this analytical issue is to account for the correlation. To study the
effect of correlations between exposures on the family-wise error rate, we calculated the effective number of
variables (Meff) 29 for estimating the Bonferroni-adjusted p values with formula 1, where Var(λobs) is the
variance of the eigenvalues of the correlation matrix and M is the original number of variables.

$$M_{\mathrm{eff}} = 1 + (M - 1)\left(1 - \frac{\mathrm{Var}(\lambda_{\mathrm{obs}})}{M}\right) \qquad \text{(formula 1)}$$

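As a hedged illustration of formula 1, the sketch below computes Meff from a correlation matrix in Python. The function name is hypothetical, and using the sample variance (ddof = 1) of the eigenvalues is an assumption chosen so that fully correlated variables give Meff = 1.

```python
import numpy as np

def effective_number(corr):
    """Effective number of variables (Meff) from a correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    m = corr.shape[0]
    eigenvalues = np.linalg.eigvalsh(corr)  # symmetric matrix, real eigenvalues
    var_obs = eigenvalues.var(ddof=1)       # Var(lambda_obs), assumed sample variance
    return 1.0 + (m - 1.0) * (1.0 - var_obs / m)

# Two limiting cases with M = 4 variables.
m_eff_independent = effective_number(np.eye(4))      # no correlation: Meff = M
m_eff_identical = effective_number(np.ones((4, 4)))  # perfect correlation: Meff = 1

# Bonferroni-type threshold using Meff in place of M.
threshold = 0.05 / m_eff_independent
```

For scale, with the values reported later in the Results, 0.05/128 is about 3.9e-4 whereas 0.05/112 is about 4.5e-4, so the Meff-based threshold is only slightly less stringent.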
Figure 1. Analytical scheme to investigate the variability and correlations in this study. A) We first
extracted the residuals from a linear model after adjusting for the base covariates (total lipids or
creatinine) to calculate the effective number of variables (Meff) and Spearman's rank correlation (rs). B)
Then, we used another linear model with an additional age variable to obtain residuals and conducted a
paired t-test to test the difference in biomarkers between females and males living in the same
household. C) Afterward, we further adjusted for sex prior to extracting residuals to calculate the
percentage of biomarker variance explained by the shared environment.

We estimated the sex-specific difference with a paired t-test (by household) after extracting the
residuals from a linear model adjusted for age (Figure 1B). We used a similar approach to estimate the
percentage of variance explained by the shared environment. However, sex and age variables were
excluded in the adjustment step to isolate their effects (Figure 1C). Afterward, we extracted and
regressed the residuals against the household variable to obtain the adjusted coefficient of [...]

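A minimal Python sketch of these two steps follows, assuming the residuals have already been extracted and that each household contributes exactly one female and one male. The function names and simulated values are hypothetical; the adjusted R2 is computed for a one-way household factor, which is one plausible reading of "regressed the residuals against the household variable".

```python
import numpy as np

def paired_t(female, male):
    """Paired t statistic for within-household female minus male differences."""
    d = np.asarray(female, dtype=float) - np.asarray(male, dtype=float)
    return d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

def household_adjusted_r2(female, male):
    """Adjusted R2 from regressing residuals on a household factor."""
    pairs = np.column_stack([female, male]).astype(float)
    n_households, n_obs = pairs.shape[0], pairs.size
    ss_total = ((pairs - pairs.mean()) ** 2).sum()
    ss_within = ((pairs - pairs.mean(axis=1, keepdims=True)) ** 2).sum()
    r2 = 1.0 - ss_within / ss_total
    # The household factor uses n_households - 1 dummy predictors.
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_households)

# Hypothetical residuals: a shared household component plus individual noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=500)
female = shared + rng.normal(size=500)
male = shared + rng.normal(size=500)

t_sex = paired_t(female, male + 0.5)            # simulated sex difference of 0.5
r2_shared = household_adjusted_r2(female, male)  # expected to be near 0.5 here
```

In this simulation the shared and individual components have equal variance, so roughly half of the variance is attributable to the household.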
We computed concordance as the Pearson correlation coefficient (r) between the chemical relatedness
rs in this study and that in the 2003–2004 National Health and Nutrition Examination Survey (NHANES)
9 to assess generalizability of the co-exposure patterns. We estimated the concordance based on a
total of 101 matched biomarkers between the 2 studies. We chose the 2003–2004 NHANES because 1)
many of the chemicals measured in LIFE were also measured in NHANES; 2) they were measured by the
same laboratory; and 3) the time period (2003–2004) is close to the beginning of recruitment for the LIFE
Study.

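The concordance calculation can be sketched as follows, assuming both correlation matrices are restricted to the same matched biomarkers in the same order. The function name and the small example matrices are hypothetical.

```python
import numpy as np

def concordance(corr_a, corr_b):
    """Pearson r between the off-diagonal entries of two correlation matrices."""
    a = np.asarray(corr_a, dtype=float)
    b = np.asarray(corr_b, dtype=float)
    upper = np.triu_indices_from(a, k=1)  # each chemical pair counted once
    return np.corrcoef(a[upper], b[upper])[0, 1]

# Hypothetical rs matrix for 4 matched biomarkers in one study.
study = np.array([
    [1.0, 0.6, 0.2, 0.1],
    [0.6, 1.0, 0.3, 0.0],
    [0.2, 0.3, 1.0, 0.5],
    [0.1, 0.0, 0.5, 1.0],
])
off_diagonal = ~np.eye(4, dtype=bool)

# Same pattern with uniformly weaker correlations.
survey = study.copy()
survey[off_diagonal] *= 0.8

# Reversed pattern: strongly related pairs become the weakest ones.
reversed_pattern = study.copy()
reversed_pattern[off_diagonal] = 0.6 - study[off_diagonal]

r_same = concordance(study, survey)
r_reversed = concordance(study, reversed_pattern)
```

Because the measure is a Pearson r across chemical pairs, it reflects similarity in the pattern of relatedness rather than agreement in absolute magnitude, which is why uniformly weaker correlations can still give a high concordance.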
We used all the instrument-derived concentrations for the analyses 30,31. We replaced missing values
by multiple imputation, assuming a missing-at-random scenario 32. We conducted
imputation based on information from available demographic variables, previous history of clinical
symptoms, and all other chemical variables, and created a total of 10 imputed data sets for males and [...]

We visualized the correlations between exposures as an exposome globe using the R package circlize (v
0.3.1) 33. EDCs were sorted from lipophilic to hydrophilic to aid visual interpretation of the patterns.
We combined the final estimates from imputations using Rubin's method 34 and calculated the p values
of correlations by permutation tests. To adjust for multiple testing, we used the false discovery rate
(FDR) q values unless otherwise specified. We executed all analyses using the computing environment
R (v 3.3.1) 35. For reproducibility, all analytic code is publicly available on GitHub under an MIT [...]

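The permutation test and FDR adjustment can be sketched in Python as below. The text does not state which FDR procedure or how many permutations were used, so the Benjamini-Hochberg step-up method, the 999 permutations, and all function names here are assumptions for illustration only.

```python
import numpy as np

def ranks(x):
    """Ranks 1..n; assumes no ties (continuous residuals)."""
    x = np.asarray(x, dtype=float)
    r = np.empty(len(x))
    r[np.argsort(x)] = np.arange(1, len(x) + 1)
    return r

def permutation_pvalue(x, y, n_perm=999, seed=0):
    """Two-sided permutation p value for a Spearman correlation."""
    rng = np.random.default_rng(seed)
    rx, ry = ranks(x), ranks(y)
    observed = abs(np.corrcoef(rx, ry)[0, 1])
    exceed = sum(
        abs(np.corrcoef(rx, rng.permutation(ry))[0, 1]) >= observed
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)

def benjamini_hochberg(pvalues):
    """FDR q values by the Benjamini-Hochberg step-up procedure."""
    p = np.asarray(pvalues, dtype=float)
    m = len(p)
    order = np.argsort(p)
    scaled = p[order] * m / np.arange(1, m + 1)
    q_sorted = np.minimum.accumulate(scaled[::-1])[::-1]  # enforce monotonicity
    q = np.empty(m)
    q[order] = np.minimum(q_sorted, 1.0)
    return q

# Hypothetical residuals: one related pair and one unrelated pair.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
p_related = permutation_pvalue(x, x + 0.3 * rng.normal(size=100))
p_unrelated = permutation_pvalue(x, rng.normal(size=100))
q_values = benjamini_hochberg([p_related, p_unrelated])
```

With 999 permutations the smallest attainable p value is 0.001, so more permutations would be needed to resolve smaller p values across thousands of chemical pairs.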
Results
Important sociodemographic differences were observed between partners (Table 2). Overall, female
partners were younger, had a lower body mass index (BMI) (< 25), consumed fewer alcoholic drinks,
were less likely to report hypertension or high cholesterol, and had lower serum cotinine and lipids [...]

Nine (7%) chemicals were correlated with partners' sex (p < 0.05), i.e., cotinine; blood lead, mercury,
and cadmium; and 5 PFASs: perfluorodecanoate (PFDeA), perfluorononanoate (PFNA), perfluorooctane
sulfonamide (PFOSA), perfluorooctane sulfonate (PFOS), and perfluorooctanoate (PFOA). Of note,
findings were robust to FDR correction with the exception of blood cadmium.

Figure 2 shows the boxplot summary of the variance explained by the shared environment. We
estimated that two classes of chemicals, PFASs and blood metals, had higher levels of explained
variance (median 0.43 and 0.41, respectively) than the others. For the remaining 11 classes, median
explained variances ranged from 0.04 (phthalates) to 0.21 (cotinine). A few persistent organic
compounds, namely PCB congener 28, PBDE congener 47, and -hexachlorocyclohexane (-HCH), had [...]

The exposome globe 36 displays the rs values for females (right half), for males (left half), and for couples
(across the left and right halves of the globe) (Figure 3). To assist interpretation, we only presented the rs
values outside the range of -0.25 to 0.25 as lines connecting different parts of the track; these represent
less than 10% of all the rs values. For females, we observed two larger positively correlated clusters across EDC
classes: A) a dense cluster with serum persistent organic compounds such as PCBs and OCPs (upper
right of Figure 3); and B) a more loosely packed cluster with urinary EDCs such as phytoestrogens,
phthalates, phenols, and antimicrobial compounds (lower right of Figure 3). Correlations between
serum and urinary EDCs were mostly small and distributed between -0.25 and 0.25. For males, there [...]

Table 2. Sociodemographic and lifestyle characteristics of females and males in the LIFE Study.
Characteristic | Females, n (%) | Males, n (%)
BMI (kg/m2) ** | [...] | [...]
Regular vigorous exercise in the past 12 months | 200 (40) | 211 (42)
[...]
7 - 10 | 15 (3) | 8 (2)
16 - 25 | 3 (1) | 10 (2)
[...]
4 | 19 (4) | 62 (12)
5 | 9 (2) | 54 (11)

Figure 2. Summary of the percentage variance explained by the shared environment. Boxplots of the
adjusted coefficient of determination (R2) within different chemical classes are shown. Interquartile
range is not shown for the cotinine class because it contains only 1 compound. For each box, the median and
interquartile range are drawn and the whiskers extend to the most extreme values within
1.5 × the interquartile range. Black dots denote values outside the range covered by the whiskers.

Figure 3. Exposome correlation globe showing the relationships of biomarkers between females, males
and couples. The right half represents biomarkers in females; the left half represents biomarkers in males.
Only Spearman's rank correlations greater than 0.25 or smaller than -0.25 are shown as
connections in the globe. Red lines denote positive correlations and dark green lines denote negative
ones. Color intensity and line width are proportional to the size of the correlation. Within-class and
between-class correlations are shown outside and inside of the track, respectively. Correlations in
couples are indicated by the lines linking females and males (i.e. crossing the vertical midline of the globe).

While we found similar correlations in the populations of males and females separately, we found that
correlations in couples living in the same household were, in fact, less densely packed and with values [...]

A summary of the within-class correlations as absolute magnitudes is shown in Figure 4. For females
(Figure 4A), we found that polybrominated compounds had the highest median correlation (rs = 0.45)
while urine metalloids had the lowest (rs = 0.09). Across different chemical classes, persistent organic
compounds such as polybrominated compounds, PCBs and OCPs had higher median correlations (0.45,
0.38, and 0.34, respectively). For the rest of the classes, the median correlations were all below 0.25.
Males (Figure 4B) had within-class correlation distributions similar to those found in females. The classes
with the highest and lowest median correlations were polybrominated compounds (rs = 0.41) and [...]

In contrast, we found a strong diminishing effect on the within-class correlations, both in terms of the
median and interquartile range (IQR), in couples (Figure 4C). Comparing the couples with females and
males, all chemical classes had median correlations below 0.25, and urine metals were the class
with the greatest percentage drop (83%). We also observed a substantial reduction in the IQRs of the
chemical classes. For example, the IQR of polybrominated compounds was 0.21, which corresponds to
a drop of over 35% relative to females and males. We found that urine metals had the largest drop in
IQR (over 77%). Looking more closely at the data, correlations between the same chemicals in couples
(i.e. the diagonal of the correlation matrix of couples) were generally higher than those among members
of the same class. Because of the low number of exposure indicators in blood metals (3 chemicals),
paracetamols (2 chemicals), and cotinine (1 chemical), the diminishing effect on within-class
correlations was countered and thus the drops in medians and IQRs of these classes were not as [...]

Figure 4. Boxplots of Spearman's rank correlations (rs) within different chemical classes. A) Females; B)
Males; and C) Couples. For couples, summary statistics were estimated with the full 128 x 128
correlation matrix instead of with the half triangle. Certain classes contain only 1 pairwise correlation
(paracetamols in females, paracetamols in males, and cotinine in couples). "All" represents the
grouping by the correlation of all pairs of chemicals available. The horizontal line drawn across the
chemical classes is equal to the 95th percentile of the null distribution obtained from permuting the
concentrations of all chemicals. Whiskers and black dots are defined in the caption of
Figure 2.

Bonferroni-adjusted p values obtained with Meff were highly similar between females and males
(Table 3). Both groups had similar Meff values with a difference smaller than 1 variable, suggesting that
the overall degree of correlation was also similar. After adjusting for the correlation, we found the
total number of variables decreased from 128 to 112 for females and 113 for males.

Table 4 shows the concordance between correlations in the LIFE Study (for females, males, and
couples) and the 2003–2004 NHANES. Overall, the Pearson r varied greatly between chemical
groups, from -0.78 to 0.98. Certain chemical groups had small sample sizes (n ≤ 6), which could cause low
and inverse correlations. However, the r values were more consistent and comparable when we
discarded group information and considered all chemicals as a whole (females: 0.88; males: 0.84;
couples: 0.67); therefore, we conclude the exposure correlation patterns captured in LIFE are [...]

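Table 3. [...] correlation.
[Table 3 values not recovered. Columns: Females, Males. Row groups: Serum persistent organic compounds (polychlorinated biphenyls (PCBs), organochlorine pesticides (OCPs), polybrominated chemicals, polyfluoroalkyl substances (PFASs)); Urinary non-persistent organic compounds (phthalate metabolites, antimicrobial chemicals, paracetamol & derivatives, [...]); Others.]
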
Note: M, number of variables in the corresponding chemical classes; Meff, number of effective variables
after taking account of within-class correlation; Mdiff, the difference between M and Meff; P value,
Bonferroni-adjusted p values as 0.05/Meff.

Table 4. Correlations of chemical relatedness between 2003–2004 NHANES and current study by [...]

Note: Pearson r, Pearson correlation coefficients of the chemical relatedness (Spearman correlations of
chemicals) between the National Health and Nutrition Examination Survey (NHANES) and this study; n, [...]
^a Only 9 classes are shown because not all chemicals in the 13 chemical classes in this study could be [...]

Discussion
Findings
Understanding the co-exposure patterns is an important step toward investigating the joint health
effects of chemical mixtures and for the statistical design of exposome-related investigations. We describe
our high-level findings here. First, exposure levels of one individual in the household were only weakly
correlated with those of the other individual in the same household. Second, the percentages of significant rs (q <
0.05) in males and females were 25.3% and 23.1%, respectively, in contrast to only 9.5% between couples
in the same household (Figure 4). Chemical correlations in a household setting were concordant with
those in the 2003–2004 NHANES, indicating reproducible co-exposure correlations with respect to the patterns [...]

Although couples in our study potentially shared a large degree of dietary and indoor environmental
factors, their exposures were only modestly correlated (low rs). We believe that there are two
additional factors affecting the familial co-exposure patterns in our investigation. The first
concerns how long the couples have been living together. In the U.S., the median age at first
marriage has been over 24 since 2000 37, and newly married couples could show a greater discordance in
chemical co-exposure relationships due to different pre-marriage exposures. The second factor is
potential physiological dampening of exposure variability related to the half-life of the target chemicals
38,39. Polybrominated compounds, PCBs, and OCPs are persistent chemicals with high lipophilicity and
longer half-lives (on the order of years). Their serum concentrations are integrated over a period of
time and not completely associated with recent exposure 40. Since most of the couples recruited in the LIFE
Study were living together, we posit that this phenomenon may explain the drop in rs of persistent
chemicals relative to the short-lived urinary chemicals (such as phthalate metabolites, whose half-lives
are on the order of days) in couples (Figure 4). These factors should be considered when using [...]

Wu et al. 41 conducted a household-based study, measured serum PBDE levels in children and
parents, and reported that the rs between child and parent was in the range of 0.66 to 0.74 (median = 0.68,
n = 68) for a number of PBDE compounds. The pairs shared a substantial portion of genes, diet, and
living environment, and the authors found that the latter 2 components, measured as floor-wipe PBDEs,
canned meat, tuna, and whitefish, were predictive of serum PBDEs. Furthermore, they found higher rs
of PBDEs between older couples (age ≥ 55, range = 0.45 to 0.78, median = 0.72). These findings are [...]

Persistent organic chemicals are known to cause adverse health effects and are prioritized by the U.S.
Centers for Disease Control and Prevention for health monitoring 42. Many of them had wide
applications in electrical/electronic equipment, agricultural chemicals, and furniture 43,44. Although
other emerging EDCs such as BPA and phthalates have short half-lives, they have extensive modern
applications in cosmetics and consumer products 15,45–47. The ubiquity of these chemicals in different
microenvironments such as schools and offices suggests that the household environment alone is not a
major contributor to body burden, consistent with our results (Figure 2). PFASs and blood metals had
higher variance explained by the shared environment than other chemical classes, and we believe that
this could be related to the lifestyle of the subjects in the LIFE cohort (e.g. PFAS exposures via food [...]

Our investigation includes 128 chemical biomarkers with diverse physicochemical properties that span
from persistent lipophilic to non-persistent hydrophilic chemicals and is one of the first attempts to
systematically characterize their correlations in a household setting. However, we do not know how
long the couples had been living together at the baseline of the study, so we cannot quantitatively assess
how this factor affects co-exposure patterns. Also, we only collected biological samples at the baseline; thus, it
is not possible to study how exposure levels and co-exposure patterns change longitudinally with time,
which could be an important piece of information for assessing fecundity outcomes and chemical
exposures.

Our findings have implications for high-throughput association tests between correlated exposures and
health outcomes and phenotypes. One of the typical approaches to adjusting for multiplicity in EWASs is
controlling the family-wise error rate (e.g. Bonferroni correction). However, the tests are assumed to
be independent. Nyholt 29 provided one solution to address correlation in the Bonferroni correction by
calculating the effective number of variables (Meff, formula 1), which relies on identifying the number
of variables, M, and estimating the eigenvalue variance, Var(λobs), of the correlation matrix. For
example, when conducting an EWAS of a response with respect to the chemical class PCBs containing 36
congeners, one can calculate the correlation matrix (M = 36) and estimate the associated Var(λobs) to
obtain Meff (Table 3). A new significance level for this set of comparisons will be α/Meff, which is less
stringent than the standard Bonferroni threshold (α/M) because Meff will be smaller than M if correlations
exist between congeners. Benjamini and Yekutieli 48 and Fan et al. 49 documented a procedure that considers the [...]

In the future, exposome-related investigations will include many more variables than ever before with
the emergence of highly sensitive high-resolution mass spectrometry techniques 50 and the availability of
large data sets from reproducibility initiatives 10. Attempts to assess health outcomes associated with a
large number of exposures may inflate R2 and lead to overfitting 51. Additionally, dense correlations
among exposures indicate that multicollinearity may also influence the reliability of the association size
in multiple regression or even introduce potential confounding. We suggest that these analytic challenges could be
ameliorated through understanding co-exposure patterns. For example, statistical approaches for
evaluating the effects of mixture exposures such as principal component analysis generally involve
dimensionality reduction that relies on estimating the correlation structure to reduce the number of
exposures being considered prior to analysis, but the results are difficult to interpret 52. Other regression
approaches, such as the Elastic Net 53, can consider highly collinear exposure variables while giving
coefficients that are similar to those from a typical regression model; however, inferential
estimates (e.g. p values and confidence intervals) are still in development 54.

Capturing population-level exposure variability and the demographic variables that are associated
with the variation, such as sex, location, and time, is a grand ambition of the exposome concept 10.
Given that people spend over 90% of their time indoors and more than 12 hours a day at home (bls.gov/tus),
household samples (e.g. house dust) might be a reasonable surrogate to represent home
exposures of family members who share the same living environment. While sampling the household is
a tempting approach because of its simplicity and cost savings over personal measurement, it may fail
to capture a significant fraction of exposure variability in the population, as we found that the shared
environment explained only a small percentage of biomarker variance (Figure 2). For epidemiological
investigations that sample study participants who are nested in a unit, for example, schools or homes,
conducting preliminary measurements to assess the influence of shared environment is one of the [...]

Finally, correlations of chemical mixtures can also be an important tool used in exposure science and
cumulative risk assessment, as groups of correlated chemicals are often released from a single source
(e.g. power plant and vehicular exhaust) 55. Thus, studying the co-exposure patterns is the first step to
identify the sources prior to more in-depth source apportionment methods such as positive matrix
factorization and chemical mass balance receptor models 56. In cumulative risk assessment, dose
addition is one of the common approaches to estimate the risks from mixture exposures by assuming a
shared toxicity mechanism between chemicals (e.g. binding to the same receptor) 57. Knowing the
co-exposure correlations could be a first step toward identification of exposures with similar [...]

442 References
443 (1) Wild, C. P. Complementing the Genome with an Exposome: The Outstanding Challenge of
446 (2) Wild, C. P.; Scalbert, A.; Herceg, Z. Measuring the exposome: a powerful basis for evaluating
447 environmental exposures and cancer risk. Environ. Mol. Mutagen. 2013, 54 (7), 480–499.
448 (3) Stingone, J. A.; Buck Louis, G. M.; Nakayama, S. F.; Vermeulen, R. C. H.; Kwok, R. K.; Cui, Y.;
449 Balshaw, D. M.; Teitelbaum, S. L. Toward Greater Implementation of the Exposome Research
450 Paradigm within Environmental Epidemiology. Annu. Rev. Public Health 2017, 38, 315–327.
451 (4) Potera, C. The HELIX Project: tracking the exposome in real time. Environ. Health Perspect. 2014,
453 (5) Vrijheid, M.; Slama, R.; Robinson, O.; Chatzi, L.; Coen, M.; van den Hazel, P.; Thomsen, C.; Wright,
454 J.; Athersuch, T. J.; Avellana, N.; et al. The human early-life exposome (HELIX): project rationale
455 and design. Environ. Health Perspect. 2014, 122 (6), 535–544.
456 (6) Vineis, P.; Chadeau-Hyam, M.; Gmuender, H.; Gulliver, J.; Herceg, Z.; Kleinjans, J.; Kogevinas, M.;
457 Kyrtopoulos, S.; Nieuwenhuijsen, M.; Phillips, D. H.; et al. The exposome in practice: Design of the
458 EXPOsOMICS project. Int. J. Hyg. Environ. Health 2017, 220 (2 Pt A), 142–151.
459 (7) Ioannidis, J. P. A. Exposure-wide epidemiology: revisiting Bradford Hill. Stat. Med. 2016, 35 (11),
460 1749–1762.
461 (8) Ioannidis, J. P. A.; Loy, E. Y.; Poulton, R.; Chia, K. S. Researching genetic versus nongenetic
462 determinants of disease: a comparison and proposed unification. Sci. Transl. Med. 2009, 1 (7),
463 7ps8.
464 (9) Patel, C. J.; Ioannidis, J. P. A. Placing epidemiological results in the context of multiplicity and
465 typical correlations of exposures. J. Epidemiol. Community Health 2014, 68 (11), 1096–1100.
466 (10) Manrai, A. K.; Cui, Y.; Bushel, P. R.; Hall, M.; Karakitsios, S.; Mattingly, C. J.; Ritchie, M.; Schmitt, C.;
467 Sarigiannis, D. A.; Thomas, D. C.; et al. Informatics and Data Analytics to Support Exposome-Based
468 Discovery for Public Health. Annu. Rev. Public Health 2017, 38, 279–294.
469 (11) te Velde, E.; Burdorf, A.; Nieschlag, E.; Eijkemans, R.; Kremer, J. A. M.; Roeleveld, N.; Habbema, D.
470 Is human fecundity declining in Western countries? Hum. Reprod. 2010, 25 (6), 1348–1353.
471 (12) Hauser, R. The environment and male fertility: recent research on emerging chemicals and semen
473 (13) Patel, C. J.; Sundaram, R.; Buck Louis, G. M. A data-driven search for semen-related phenotypes in
475 (14) Buck Louis, G. M.; Barr, D. B.; Kannan, K.; Chen, Z.; Kim, S.; Sundaram, R. Paternal exposures to
476 environmental chemicals and time-to-pregnancy: overview of results from the LIFE study.
478 (15) Buck Louis, G. M.; Sundaram, R.; Sweeney, A. M.; Schisterman, E. F.; Maisog, J.; Kannan, K. Urinary
479 bisphenol A, phthalates, and couple fecundity: the Longitudinal Investigation of Fertility and the
480 Environment (LIFE) Study. Fertil. Steril. 2014, 101 (5), 1359–1366.
481 (16) Buck Louis, G. M.; Schisterman, E. F.; Sweeney, A. M.; Wilcosky, T. C.; Gore-Langton, R. E.; Lynch,
482 C. D.; Boyd Barr, D.; Schrader, S. M.; Kim, S.; Chen, Z.; et al. Designing prospective cohort studies
483 for assessing reproductive and developmental toxicity during sensitive windows of human
484 reproduction and development – the LIFE Study. Paediatr. Perinat. Epidemiol. 2011, 25 (5), 413–
485 424.
486 (17) Sjödin, A.; Jones, R. S.; Lapeza, C. R.; Focant, J.-F.; McGahee, E. E., 3rd; Patterson, D. G., Jr.
487 Semiautomated high-throughput extraction and cleanup method for the measurement of
490 (18) Kuklenyik, Z.; Needham, L. L.; Calafat, A. M. Measurement of 18 perfluorinated organic acids and
491 amides in human serum using on-line solid-phase extraction. Anal. Chem. 2005, 77 (18), 6085–
492 6091.
493 (19) Mumford, S. L.; Kim, S.; Chen, Z.; Boyd Barr, D.; Buck Louis, G. M. Urinary Phytoestrogens Are
494 Associated with Subtle Indicators of Semen Quality among Male Partners of Couples Desiring
496 (20) Guo, Y.; Alomirah, H.; Cho, H.-S.; Minh, T. B.; Mohd, M. A.; Nakata, H.; Kannan, K. Occurrence of
497 phthalate metabolites in human urine from several Asian countries. Environ. Sci. Technol. 2011, 45
499 (21) Asimakopoulos, A. G.; Wang, L.; Thomaidis, N. S.; Kannan, K. A multi-class bioanalytical
500 methodology for the determination of bisphenol A diglycidyl ethers, p-hydroxybenzoic acid esters,
501 benzophenone-type ultraviolet filters, triclosan, and triclocarban in human urine by liquid
503 (22) Smarr, M. M.; Grantz, K. L.; Sundaram, R.; Maisog, J. M.; Honda, M.; Kannan, K.; Buck Louis, G. M.
504 Urinary paracetamol and time-to-pregnancy. Hum. Reprod. 2016, 31 (9), 2119–2127.
505 (23) Bloom, M. S.; Buck Louis, G. M.; Sundaram, R.; Maisog, J. M.; Steuerwald, A. J.; Parsons, P. J. Birth
506 outcomes and background exposures to select elements, the Longitudinal Investigation of Fertility
507 and the Environment (LIFE). Environ. Res. 2015, 138, 118–129.
508 (24) Akins, J. R.; Waldrep, K.; Bernert, J. T. The estimation of total serum lipids by a completely
509 enzymatic summation method. Clin. Chim. Acta 1989, 184 (3), 219–226.
510 (25) Phillips, D. L.; Pirkle, J. L.; Burse, V. W.; Bernert, J. T., Jr; Henderson, L. O.; Needham, L. L.
511 Chlorinated hydrocarbon levels in human serum: effects of fasting and feeding. Arch. Environ.
513 (26) Bernert, J. T., Jr; Turner, W. E.; Pirkle, J. L.; Sosnoff, C. S.; Akins, J. R.; Waldrep, M. K.; Ann, Q.;
514 Covey, T. R.; Whitfield, W. E.; Gunter, E. W.; et al. Development and validation of sensitive method
516 chromatography/atmospheric pressure ionization tandem mass spectrometry. Clin. Chem. 1997,
518 (27) Schisterman, E. F.; Whitcomb, B. W.; Louis, G. M. B.; Louis, T. A. Lipid adjustment in the analysis of
519 environmental contaminants and human health risks. Environ. Health Perspect. 2005, 113 (7),
520 853–857.
521 (28) Heavner, D. L.; Morgan, W. T.; Sears, S. B.; Richardson, J. D.; Byrd, G. D.; Ogden, M. W. Effect of
522 creatinine and specific gravity normalization techniques on xenobiotic biomarkers in smokers'
523 spot and 24-h urines. J. Pharm. Biomed. Anal. 2006, 40 (4), 928–942.
524 (29) Nyholt, D. R. A simple correction for multiple testing for single-nucleotide polymorphisms in
525 linkage disequilibrium with each other. Am. J. Hum. Genet. 2004, 74 (4), 765–769.
526 (30) Schisterman, E. F.; Vexler, A.; Whitcomb, B. W.; Liu, A. The limitations due to exposure detection
527 limits for regression models. Am. J. Epidemiol. 2006, 163 (4), 374–383.
528 (31) Richardson, D. B.; Ciampi, A. Effects of exposure measurement error when an exposure variable is
529 constrained by a lower limit. Am. J. Epidemiol. 2003, 157 (4), 355–363.
530 (32) Louis, G. M. B.; Sundaram, R.; Schisterman, E. F.; Sweeney, A. M.; Lynch, C. D.; Gore-Langton, R. E.;
531 Maisog, J.; Kim, S.; Chen, Z.; Barr, D. B. Persistent environmental pollutants and couple fecundity:
532 the LIFE study. Environ. Health Perspect. 2013, 121 (2), 231–236.
533 (33) Gu, Z.; Gu, L.; Eils, R.; Schlesner, M.; Brors, B. circlize Implements and enhances circular
535 (34) Schafer, J. L. Multiple imputation: a primer. Stat. Methods Med. Res. 1999, 8 (1), 3–15.
536 (35) R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for
538 (36) Patel, C. J.; Manrai, A. K. Development of exposome correlation globes to map out environment-
540 (37) U.S. Census Bureau. Current Population Survey: Annual Social and Economic Supplement. US
542 (38) Makey, C. M.; McClean, M. D.; Sjödin, A.; Weinberg, J.; Carignan, C. C.; Webster, T. F. Temporal
543 variability of polybrominated diphenyl ether (PBDE) serum concentrations over one year. Environ.
545 (39) Rappaport, S. M.; Kupper, L. L. Quantitative Exposure Assessment; Rappaport: El Cerrito, CA, 2008.
546 (40) Aylward, L. L.; Hays, S. M.; Smolders, R.; Koch, H. M.; Cocker, J.; Jones, K.; Warren, N.; Levy, L.;
547 Bevan, R. Sources of variability in biomarker concentrations. J. Toxicol. Environ. Health B Crit. Rev.
549 (41) Wu, X. M.; Bennett, D. H.; Moran, R. E.; Sjödin, A.; Jones, R. S.; Tancredi, D. J.; Tulve, N. S.; Clifton,
550 M. S.; Colón, M.; Weathers, W.; et al. Polybrominated diphenyl ether serum concentrations in a
551 Californian population of children, their parents, and older adults: an exposure assessment study.
552 Environ. Health 2015, 14, 23.
553 (42) Li, Q. Q.; Loganath, A.; Chong, Y. S.; Tan, J.; Obbard, J. P. Persistent organic pollutants and adverse
554 health effects in humans. J. Toxicol. Environ. Health A 2006, 69 (21), 1987–2005.
555 (43) Whitehead, T.; Metayer, C.; Buffler, P.; Rappaport, S. M. Estimating exposures to indoor
556 contaminants using residential dust. J. Expo. Sci. Environ. Epidemiol. 2011, 21 (6), 549–564.
557 (44) Dodson, R. E.; Perovich, L. J.; Covaci, A.; Van den Eede, N.; Ionas, A. C.; Dirtu, A. C.; Brody, J. G.;
558 Rudel, R. A. After the PBDE phase-out: a broad suite of flame retardants in repeat house dust
559 samples from California. Environ. Sci. Technol. 2012, 46 (24), 13056–13066.
560 (45) Bloom, M. S.; Whitcomb, B. W.; Chen, Z.; Ye, A.; Kannan, K.; Buck Louis, G. M. Associations
561 between urinary phthalate concentrations and semen quality parameters in a general population.
563 (46) Smarr, M. M.; Sundaram, R.; Honda, M.; Kannan, K.; Louis, G. M. B. Urinary Concentrations of
564 Parabens and Other Antimicrobial Chemicals and Their Association with Couples' Fecundity.
566 (47) Louis, G. M. B.; Kannan, K.; Sapra, K. J.; Maisog, J.; Sundaram, R. Urinary concentrations of
567 benzophenone-type ultraviolet radiation filters and couples' fecundity. Am. J. Epidemiol. 2014,
569 (48) Benjamini, Y.; Yekutieli, D. The Control of the False Discovery Rate in Multiple Testing under
571 (49) Fan, J.; Han, X.; Gu, W. Estimating False Discovery Proportion Under Arbitrary Covariance
573 (50) Jones, D. P. Sequencing the exposome: A call to action. Toxicol. Rep. 2016, 3, 29–45.
574 (51) Hawkins, D. M. The problem of overfitting. J. Chem. Inf. Comput. Sci. 2004, 44 (1), 1–12.
575 (52) Taylor, K. W.; Joubert, B. R.; Braun, J. M.; Dilworth, C.; Gennings, C.; Hauser, R.; Heindel, J. J.;
576 Rider, C. V.; Webster, T. F.; Carlin, D. J. Statistical Approaches for Assessing Health Effects of
579 (53) Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Series B
581 (54) Taylor, J.; Tibshirani, R. J. Statistical learning and selective inference. Proc. Natl. Acad. Sci. U. S. A.
583 (55) Ravindra, K.; Sokhi, R.; Van Grieken, R. Atmospheric polycyclic aromatic hydrocarbons: Source
584 attribution, emission factors and regulation. Atmos. Environ. 2008, 42 (13), 2895–2921.
585 (56) Rizzo, M. J.; Scheff, P. A. Utilizing the Chemical Mass Balance and Positive Matrix Factorization
586 models to determine influential species and examine possible rotations in receptor modeling
588 (57) Chen, J. J.; Chen, Y. J.; Rice, G.; Teuschler, L. K.; Hamernik, K.; Protzel, A.; Kodell, R. L. Using dose
589 addition to estimate cumulative risks from exposures to multiple chemicals. Regul. Toxicol.
591