
Environment International 125 (2019) 505–514

Contents lists available at ScienceDirect

Environment International
journal homepage: www.elsevier.com/locate/envint

Exposome-wide association study of semen quality: Systematic discovery of endocrine disrupting chemical biomarkers in fertility require large sample sizes
Ming Kei Chung (a), Germaine M. Buck Louis (b, c), Kurunthachalam Kannan (d), Chirag J. Patel (a)

(a) Department of Biomedical Informatics, Harvard Medical School, Harvard University, 10 Shattuck Street, Boston, MA 02115, United States of America
(b) Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health & Human Development, National Institutes of Health, 6710B Rockledge Drive, Room 3148, Bethesda, MD 20892, United States of America
(c) Dean's Office, College of Health and Human Services, George Mason University, 4400 University Drive, Fairfax, VA 22030, United States of America
(d) Division of Environmental Health Sciences, Wadsworth Center, New York State Department of Health, Department of Environmental Health Sciences, The University at Albany, Albany, NY 12201, United States of America

ARTICLE INFO

Handling Editor: Heather Stapleton
Keywords: Chemical mixtures; Endocrine disruptors; Exposome; Fecundity; Semen quality; Statistical power

ABSTRACT

Objectives: Exposome-wide association studies (EWAS) are a systematic and unbiased way to investigate multiple environmental factors associated with phenotype. We applied EWAS to study semen quality and queried the sample size requirements to detect modest associations in a reproductive cohort.
Study design and setting: We 1) conducted a multivariate EWAS of 128 endocrine disrupting chemicals (EDCs) from 15 chemical classes measured in urine/serum relative to 7 semen quality endpoints in a prospective cohort study comprising 473 men and 2) estimated the sample size requirements for EWAS etiologic investigations.
Results: None of the EDCs were associated with semen quality endpoints after adjusting for multiple tests. However, several EDCs (e.g., polychlorinated biphenyl congeners 99, 105, 114, and 167) were associated with raw p < 0.05. In a post hoc statistical power analysis with the observed effect sizes, we determined that EWAS research in male fertility will require a mean sample size of 2696 men (1795–3625) to attain a power of 0.8. The average size of four published studies is 201 men.
Conclusion: Existing cohort studies with hundreds of participants are underpowered (< 0.8) for EWAS-related investigations. Merging cohorts to ensure a sufficient sample size can facilitate the use of EWAS methods for assessing EDC mixtures that impact semen quality.

1. Introduction

Unintended exposure to endocrine disrupting chemicals (EDCs), including pesticides and phthalates, may adversely influence reproductive health, such as diminished semen quality, namely decreased sperm concentration and motility and increased abnormal morphology (Abdelouahab et al., 2011; Buck Louis et al., 2015a; Joensen et al., 2009; Vitku et al., 2016). Low dose and consistent exposures to EDCs are hypothesized to influence human reproduction and neurodevelopment (Diamanti-Kandarakis et al., 2009). Nevertheless, it is unclear how these chemicals as a group may influence semen phenotypes.
Traditionally, investigations of EDCs and semen phenotypes have been based on exposure to individual EDCs or conducted by associating a class of chemicals, such as polychlorinated biphenyls (PCBs), rather than a mixture-based analytic strategy that more closely resembles human exposure. However, as measurement capacity increases and more phenotypes and exposures are measured in epidemiological and observational cohorts, findings are prone to publication bias and associations could be falsely identified (Ioannidis, 2005, 2008). Exposome-wide association study, or equivalently, environment-wide association study (EWAS) techniques are an agnostic data-driven approach that accounts for multiple testing of chemical mixtures associated with a phenotype. It calls for systematically testing the associations between all measured exposures and the outcome while controlling the type I error rate. EWAS techniques have recently been used to assess environmental factors and chronic diseases (e.g., type 2 diabetes, high blood pressure, and peripheral arterial disease) and mortality (McGinnis et al., 2016; Patel et al., 2010; Zhuang et al., 2018; Patel et al., 2013). However,


Corresponding author at: Department of Biomedical Informatics, Harvard Medical School, Harvard University, 10 Shattuck Street, Room 302, Boston, MA 02115,
United States of America.
E-mail addresses: glouis@gmu.edu (G.M. Buck Louis), Kurunthachalam.kannan@health.ny.gov (K. Kannan), chirag_patel@hms.harvard.edu (C.J. Patel).

https://doi.org/10.1016/j.envint.2018.11.037
Received 24 August 2018; Received in revised form 18 October 2018; Accepted 14 November 2018
Available online 22 December 2018
0160-4120/ © 2018 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/BY-NC-ND/4.0/).
M.K. Chung et al. Environment International 125 (2019) 505–514

EWAS techniques have not been empirically used for the assessment of human fecundity endpoints and, in particular, semen quality despite the relevancy of the exposome research paradigm for the sensitive windows underlying human reproduction and development (Buck Louis et al., 2013). Furthermore, the power to detect associations in existing cohorts designed for discovery of exposures in male fertility has not been well documented. These are critical and timely questions in light of concerns about temporal patterns reflecting declining semen quality believed attributable to environmental factors and the community drive to design epidemiological investigations to shed light on these concerns (Skakkebaek et al., 2016; Smarr et al., 2017; Vested et al., 2014).
Although EWAS is a powerful approach in the emerging exposome era, detecting signals from noise requires large sample sizes. EDC biomarkers have dense correlational structure (Patel and Manrai, 2015), modest and small association sizes (Buck Louis et al., 2015a; Mumford et al., 2015a), and low concentrations with a substantial proportion of concentrations below laboratory detection limits (Rappaport et al., 2014). Along with multiple testing, these characteristics present unique analytical challenges that hinder discovery in reproductive health research. These issues often decrease the sensitivity of the statistical models but can be ameliorated through increasing the sample size of the study. In contrast to cohorts of tens of thousands of participants (McGinnis et al., 2016), applying EWAS to smaller cohorts such as the Longitudinal Investigation of Fertility and the Environment (LIFE) Study that targets more specific questions between EDCs and reproductive health remains unexplored, and it is unclear how many participants are enough to drive discovery.
In this study, we sought to explore the utility of EWAS techniques for assessing the relation between a mixture of EDCs measured in urine/serum and semen quality phenotypes, and to provide methodologic insights for future exposome-related research. Specifically, we 1) used EWAS techniques to investigate the association between 128 EDCs and seven semen quality endpoints using data from the LIFE Study; 2) conducted a post hoc estimation of statistical power for EWAS-type analysis; and 3) assessed the statistical power of four existing cohorts for answering questions about EDC mixtures and semen quality.

2. Methods

2.1. Study population

The referent study population comprised 501 couples who participated in the LIFE Study, all of whom were discontinuing contraception for purposes of becoming pregnant. Participants were recruited between 2005 and 2009 from 16 counties in Michigan and Texas, USA. From this cohort, 473 (94%) male partners provided semen samples, representing the study cohort for analysis. Inclusion criteria for male partners were minimal: ≥18 years of age, in a committed relationship, and no physician-diagnosed infertility. Complete details about the construction of the cohort are provided elsewhere (Buck Louis et al., 2011).

2.2. Data collection

Male partners completed standardized baseline interviews then provided blood and urine samples for quantification of a mixture of persistent and non-persistent EDCs, respectively. Human subject approval was granted for all participating institutions, and informed consent was obtained from all men prior to any data collection.

2.3. Quantification of EDCs & semen analysis

Persistent and non-persistent EDCs (n = 128) representing 15 chemical classes were quantified at the Centers for Disease Control and Prevention and the Wadsworth Center (New York State Department of Health), respectively, using published methods. As listed in Table 1, five classes of persistent EDCs measured in serum were included in the analysis: 1) OCPs; 2) polybrominated biphenyl (PBB); 3) PBDEs; 4) PCBs; and 5) per- and polyfluoroalkyl substances (PFASs). Gas chromatography with high-resolution mass spectrometry was used to quantify persistent EDCs with the exception of PFASs, which were measured using high performance liquid chromatography-tandem mass spectrometry (HPLC-MS/MS) (Kuklenyik et al., 2005; Sjödin et al., 2004; Kato et al., 2011).
For six classes of non-persistent EDCs, we used HPLC-MS/MS methods to quantify urinary 1) bisphenol A (BPA); 2) benzophenones; 3) anti-microbials (triclosan, triclocarban, and parabens); 4) phthalate metabolites; 5) paracetamol and derivatives; and 6) phytoestrogens using established protocols (Kunisue et al., 2010; Guo et al., 2011; Zhang et al., 2011; Asimakopoulos et al., 2014; Mumford et al., 2015b; Smarr et al., 2016). We quantified metal(loid)s using inductively coupled plasma mass spectrometry (Bloom et al., 2015).
Other quantified exposures included serum cotinine using an HPLC-MS/MS method (Bernert et al., 1997) and serum lipids using a commercially available enzymatic method (Akins et al., 1989; Phillips et al., 1989). We used a Roche/Hitachi Model 912 clinical analyzer (Dallas, TX) and the Creatinine Plus Assay to quantify creatinine, a marker of urinary dilution for non-persistent EDCs.

2.4. Semen analysis

Consistent with the population-based sampling framework used in the LIFE Study, men collected semen samples following two days of abstinence after enrollment into the cohort and a second sample approximately 1 month later. Both samples were returned to the National Institute for Occupational Safety and Health's andrology laboratory for next day analysis using overnight delivery. Within 24 h, samples were analyzed for next day motility (%), volume (mL), sperm concentration (×10^6/mL), total sperm count (×10^6), morphology using both strict and WHO criteria (%), DNA fragmentation index (%), and high DNA stainability (%). A complete description of the laboratory methods for assessing semen quality is provided elsewhere (Buck Louis et al., 2014).

2.5. Statistical analyses

A. Overall Analyses

Fig. 1 shows the overall analytical scheme of our study. We conducted three major analyses, namely: 1) EWAS with the LIFE Study data to uncover associations between EDCs and semen quality using a multivariate model to assess the relationships between each EDC and all semen endpoints simultaneously (i.e., multiple phenotypes versus an exposure biomarker); 2) post hoc power analysis with LIFE Study findings to investigate statistical power relative to the observed effect sizes from step 1; and 3) implementation of a field-wide post hoc power analysis (Serghiou et al., 2016) to investigate if existing cohorts studying semen quality are powered for EWAS-type investigations.

B. Multivariate EWAS

Fig. 2 shows the EWAS procedure. As an initial step, we assessed the distributions of all exposures and semen outcomes and characterized the cohort by key covariates. Given the high correlation between the WHO and strict criteria for determining normal morphology, we included only the former. Thus, our EWAS approach considered seven continuous semen endpoints, viz., next day motility, seminal volume, sperm concentration, total sperm count, morphology (WHO criteria), DNA fragmentation, and high DNA stainability.
All instrument-derived chemical concentrations were used to minimize bias introduced from adjusting values below laboratory limits of detection when estimating human health outcomes with the model (Richardson and Ciampi, 2003; Schisterman et al., 2006). We imputed


Table 1
Listing of 128 endocrine disruptors included in the analysis.

Chemical classes | # | Chemicals

Serum persistent organic compounds
Polychlorinated biphenyls (PCBs) | 36 | Congeners: 28, 44, 49, 52, 66, 74, 87, 99, 101, 105, 110, 114, 118, 128, 138, 146, 149, 151, 153, 156, 157, 167, 170, 172, 177, 178, 180, 183, 187, 189, 194, 195, 196, 201, 206, and 209
Organochlorine pesticides (OCPs) | 9 | Hexachlorobenzene (HCB), β-hexachlorocyclohexane (β-HCH), γ-hexachlorocyclohexane (γ-HCH), oxychlordane, trans-nonachlor, p,p′-DDT, o,p′-DDT, p,p′-DDE, and mirex
Polybrominated diphenyl ethers (PBDEs) | 10 | Congeners: 17, 28, 47, 66, 85, 99, 100, 153, 154, and 183
Polybrominated biphenyl (PBB) | 1 | Congener: 153
Per- and polyfluoroalkyl substances (PFASs) | 7 | 2‑(N‑ethyl‑perfluorooctane sulfonamido) acetate (Et-PFOSA-AcOH), 2‑(N‑methyl‑perfluorooctane sulfonamido) acetate (Me-PFOSA-AcOH), perfluorodecanoate (PFDeA), perfluorononanoate (PFNA), perfluorooctane sulfonamide (PFOSA), perfluorooctane sulfonate (PFOS), and perfluorooctanoate (PFOA)

Urinary non-persistent organic compounds
Anti-microbials (a) | 12 | Triclosan (TCS) and triclocarban (TCC); Parabens: methyl paraben (MP), ethyl paraben (EP), propyl paraben (PP), butyl paraben (BP), benzyl paraben (BzP), heptyl paraben (HP), 4‑hydroxy benzoic acid (4‑HB), 3,4‑dihydroxy benzoic acid (3,4‑DHB), methyl‑protocatechuic acid (OH-Me-P), and ethyl‑protocatechuic acid (OH-Et-P)
Phytoestrogens | 6 | Genistein, daidzein, O‑desmethylangolensin (O‑DMA), equol, enterodiol, and enterolactone
Phthalate metabolites | 14 | Mono (3‑carboxypropyl) phthalate (mCPP), monomethyl phthalate (mMP), monoethyl phthalate (mEP), mono (2‑isobutyl phthalate) (miBP), mono‑n‑butyl phthalate (mBP), mono (2‑ethyl‑5‑carboxypentyl) phthalate (mECPP), mono‑[(2‑carboxymethyl) hexyl] phthalate (mCMHP), mono (2‑ethyl‑5‑oxohexyl) phthalate (mEOHP), mono (2‑ethyl‑5‑hydroxyhexyl) phthalate (mEHHP), monocyclohexyl phthalate (mCHP), monobenzyl phthalate (mBzP), mono (2‑ethylhexyl) phthalate (mEHP), mono-isononyl phthalate (mNP), and monooctyl phthalate (mOP)
Benzophenones (BPs) | 5 | 4‑Hydroxybenzophenone (4‑OH‑BP), 2,4‑dihydroxybenzophenone (2,4‑OH‑BP), 2,2′,4,4′‑tetrahydroxybenzophenone (2,2′,4,4′‑OH‑BP), 2‑hydroxy‑4‑methoxybenzophenone (2‑OH‑4‑MeO‑BP), and 2,2′‑dihydroxy‑4‑methoxybenzophenone (2,2′‑OH‑4‑MeO‑BP)
Bisphenol A (BPA) | 1 | Total bisphenol A
Paracetamol & derivatives | 2 | Paracetamol and 4‑aminophenol

Short-lived chemicals
Blood metals | 3 | Cadmium (Cd), lead (Pb), and mercury (Hg)
Urinary metals | 17 | Manganese (Mn), chromium (Cr), beryllium (Be), cobalt (Co), molybdenum (Mo), cadmium (Cd), tin (Sn), caesium (Cs), barium (Ba), nickel (Ni), copper (Cu), zinc (Zn), tungsten (W), platinum (Pt), thallium (Tl), lead (Pb), and uranium (U)
Urinary metalloids | 4 | Selenium (Se), arsenic (As), antimony (Sb), and tellurium (Te)

Lifestyle chemicals
Serum cotinine (b) | 1 | Cotinine

(a) Anti-microbials contain mostly parabens with TCS and TCC.
(b) Serum cotinine is not an endocrine disrupting chemical but included for completeness of the study.

missing EDC data stemming from insufficient sample volume with a multiple imputation technique under the “missing-at-random” assumption. Specifically, we imputed data using all demographic and chemical variables and then created 10 imputed data sets for EWAS analyses.
To search for EDCs associated with semen quality in the context of a mixture, we executed a multivariate multiple regression model for each EDC with all seven semen endpoints as dependent variables. Specifically:

[morphology + log(motility) + log(volume) + log(concentration) + log(count) + log(fragmentation) + log(stainability)]
    = log(EDC + 1) + age + BMI + smoke + exercise + parity + lipid/creatinine

We also adjusted for a fixed set of five a priori potential

Fig. 1. Analytical scheme of current study. (A) Using LIFE cohort data, we conducted a multivariate exposome-wide association study (EWAS) to systematically assess the associations between seven semen quality endpoints simultaneously and endocrine disrupting chemicals (EDCs). (B) Using LIFE cohort data, we conducted a post hoc power analysis to gauge the sample size requirement for each of the endpoints for future EWAS. (C) We employed the meta-review technique to identify studies investigating the effects of EDC exposure on semen quality and pooled related outcomes together. We conducted a post hoc power analysis to assess whether the sample size of existing fecundity and fertility cohorts can produce enough statistical power to drive EWAS.


Fig. 2. Illustration of the analytical scheme for the exposome-wide association (EWAS) analysis. The LIFE Study comprises 473 men for whom 128 endocrine
disrupting chemicals (EDCs) have been quantified. Missing EDC concentrations were imputed using multiple imputation techniques. All semen quality endpoints,
except percent morphologically normal sperm, were log-transformed prior to analysis. For each EDC, we modeled seven semen variables with a multivariate multiple
regression, adjusted for a set of five a priori covariates based on chemical class. We did not adjust for lipids or creatinine for per- and poly-fluoroalkyl substances,
blood metals or cotinine. Then, we combined the F statistic from the imputed data sets and used family-wise error rate to control for type I error adjusted for false
discovery.
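The exposure preprocessing in this scheme (a log(x + 1) transform of each EDC concentration followed by rescaling to zero mean and unit variance, as described in the Methods) can be sketched as follows. This is a minimal illustration in Python rather than the authors' R code; the function name, the use of the sample standard deviation, and the example concentrations are assumptions made only for this sketch.

```python
import math
from statistics import mean, stdev


def log1p_standardize(concentrations):
    """Apply log(x + 1), then rescale to zero mean and unit variance.

    Assumption: the sample standard deviation (n - 1 denominator) is used;
    the source does not say which denominator was applied.
    """
    logged = [math.log1p(x) for x in concentrations]
    center = mean(logged)
    spread = stdev(logged)
    if spread == 0:
        raise ValueError("cannot rescale a constant exposure")
    return [(value - center) / spread for value in logged]


# Hypothetical right-skewed concentrations for a single EDC (arbitrary units).
example_edc = [0.0, 1.5, 3.2, 10.0, 45.0]
scaled_edc = log1p_standardize(example_edc)
print([round(value, 3) for value in scaled_edc])
```

Because the transform is monotone, the ordering of participants by exposure is preserved; only the scale changes, which is what allows coefficients for chemicals reported in different units to be compared.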

confounders, i.e., age (years), body mass index (lean/normal < 25.0, overweight 25.0–29.9, obese ≥ 30.0), currently smoking (yes/no), regular vigorous exercise in past year (yes/no), and having previously fathered a pregnancy (yes/no). In addition, we included either total serum lipids (ng/g serum) calculated according to Phillips et al. (Phillips et al., 1989) for lipophilic EDCs or creatinine (mg/dL urine) for urinary EDCs. Since EDC distributions are right-skewed and reported in different units, we log-transformed (x + 1) and rescaled each to have zero mean and unit variance to facilitate comparison with each other. We also log-transformed six semen endpoints (excluding percent morphology) to conform with the multivariate normality assumption.
For our regression model, the null hypothesis is that the coefficients of an EDC are simultaneously equal to zero across all semen phenotypes, while the alternative hypothesis is that one or more of the EDC coefficients differ from zero. To test this hypothesis, we calculated the multivariate F statistic (Pillai's Trace statistic) using the multivariate analysis of variance technique. We combined the F statistics from the imputed data sets using the miceadds package in R (Robitzsch et al., 2017). Finally, we estimated both the family-wise error rate (FWER) with Bonferroni correction and the Benjamini-Hochberg false discovery rate (FDR) using p values obtained from the model to adjust for multiple comparisons (Benjamini and Hochberg, 1995). Bonferroni correction is a conservative method and, hence, we provided FDR for comparison.

C. Post Hoc Power Analysis

We conducted post hoc statistical power calculations using the R package pwr (Champely et al., 2017) to inform future EWAS-type investigations. Power is defined as the probability of rejecting the null hypothesis given that the alternative hypothesis is true, i.e., the probability of detecting a true effect. We ran EWASs on each of the semen endpoints to study the power and sample size relationship. The association size is an estimate used to quantify the association between an EDC and a semen quality endpoint. We assumed an effect size ƒ² (Cohen, 1988) that is calculated by comparing variance explained (R²) in the full and reduced multiple regression models (formula 1).


$$f^2 = \frac{R^2_{\mathrm{Full}} - R^2_{\mathrm{Reduced}}}{1 - R^2_{\mathrm{Full}}} \qquad (1)$$

The predictors of the full model included an EDC and a set of covariates as described earlier (Fig. 2), while the reduced model excluded the EDC. The power analysis set the Bonferroni corrected significance level to 0.05/128. Since the effect sizes are typically low and we do not know the biologically significant sizes for EDC exposures, we assumed a null effect size distribution and took the 95th percentile (P) ƒ² as a threshold of important effect size to estimate the required sample size and statistical power. We selected Bonferroni correction over FDR for direct interpretation of the effect of multiple comparisons. All sample sizes were reported with Bonferroni correction unless otherwise specified. For comparison, we also estimated the required sample sizes to reach 80% power using FDR methods. Details of the FDR simulation procedures can be found in Appendix A.

D. Post Hoc Field-Wide Power Analysis

Lastly, we sought to ascertain whether EWAS techniques could be readily applied to the typical cohort sizes utilized in epidemiologic and clinical research. To this end, we employed the meta-review (i.e., overview of reviews) techniques (Smith et al., 2011; Francke et al., 2008) to systematically extract and summarize the association sizes reported for human research (Fig. 3). The Medical Subject Headings (MeSH) is a thesaurus that contains a set of controlled descriptors in a hierarchical structure for indexing biomedical journals. Investigators at the National Library of Medicine annotate each article indexed in PubMed. We used MeSH to perform a search in PubMed with the following terms: “Semen Analysis” [Mesh] AND (“Endocrine Disruptors” [Mesh] OR “Environmental Pollutants” [Mesh]). We identified 423 papers and selected 40 publications in English meeting our inclusion criteria. We used the reviews to identify relevant observational studies that reported the Pearson correlation coefficient (r) as a metric of effect size between serum/plasma or urinary EDCs (e.g., pesticides and PCBs) and semen related outcomes (e.g., semen volume and sperm count). Although the odds ratio is the most commonly reported point estimate for estimating the magnitude of an association (e.g., testing cases versus controls) and we estimated ƒ² in the previous analysis, we chose r in this field-wide analysis given 1) the ease of computation; 2) simpler assumptions (e.g., without specifying baseline prevalence of outcome and case-control sample size ratio); and 3) that the standardized effect size r can facilitate direct comparison. Finally, we included 47 pairs of rs from four independent research papers for the power analysis (De Jager et al., 2006; Haugen et al., 2011; Richthoff et al., 2003; Rignell-Hydbom et al., 2004).

Fig. 3. Flow diagram illustrating meta-review techniques of the published literature for the extraction of Pearson correlation coefficients (n = 47 pairs) for endocrine
disrupting chemicals (EDCs) and semen phenotypes.


Table 2
Description of study cohort (n = 473).

Characteristic | # | %
Age (years):
  < 25 | 16 | 3
  25–29 | 151 | 32
  30–34 | 176 | 37
  ≥35 | 130 | 28
Race/ethnicity:
  Black, Non-Hispanic | 20 | 4
  White, Non-Hispanic | 381 | 81
  Hispanic | 38 | 8
  Other | 34 | 7
Household income:
  < $50,000 | 71 | 15
  $50,000–$89,999 | 120 | 26
  ≥$90,000 | 275 | 59
Fathered a pregnancy before enrollment:
  No | 215 | 45
  Yes | 258 | 55

Characteristic | Mean (±SD)
Age (years) | 31.8 (4.9)
Body mass index (kg/m²) | 29.9 (5.6)
Mean abstinence time (days) | 4.1 (3.4)

Characteristic | Geometric mean (95% CI)
Serum cotinine (ng/mL) | 0.04 (0.04, 0.06)
Serum total lipids (ng/g) | 693 (593.0, 811.8)
Urinary creatinine (mg/dL) | 6.55 (6.45, 6.66)

We compared multiplicity in three scenarios: 1) without adjustment; 2) adjusting with 12 pairs of comparisons (empirical average); and 3) adjusting with 100 pairs of comparisons (arbitrarily set for an EWAS). We set the Bonferroni corrected significance level at 0.05, 0.05/12 and 0.05/100, respectively. We selected the 95th P of the absolute r distribution as a metric for important effect size and used a one-sided alternative hypothesis in the power analysis. As in the previous analysis, all sample sizes were reported with Bonferroni correction unless otherwise specified.
Further analysis of the correlations between semen quality endpoints can be found in Appendix B. We conducted all statistical analyses using the computing environment R (v 3.3.1).

3. Results

Overall, the LIFE cohort comprised largely white men (81%) with a mean age of 31.8 (±4.9) years and a body mass index of 29.9 (±5.6), with most (55%) having previously fathered a pregnancy (Table 2). Most of the men resided in a household with an annual income of ≥$90,000.
Fig. 4 is a Manhattan plot that illustrates the EWAS results for the 128 EDCs. The p values for each EDC estimated from the multivariate F test are shown on the vertical axis. We found that 7/15 chemical classes had p values < 0.1, viz., PCBs, PBDEs, PFASs, phthalates, benzophenones, anti-microbials, and urinary metals. Only two PCB congeners, 104 and 115, and one PFAS (Me-PFOSA-AcOH) had p values < 0.01. Overall, the findings were not robust to adjustment for multiple testing. We did not observe any EDCs to be significantly associated with semen phenotypes, as none of the p values passed Bonferroni correction (0.05/128; shown as the red line in the plot) or the FDR threshold of 0.1 (line not shown).
Since effect sizes of EDCs are modest and mixture analysis requires adjusting for multiple comparisons, we investigated whether lack of power could explain the null findings. In a post hoc analysis, we found the power of our study was modest with respect to detecting the association of 128 EDCs (Fig. 5). Taking sperm motility as an example (Fig. 5A), with a cohort size of 473, the third, second, and first quartiles of statistical power were 0.012, 0.002, and 0.001, respectively. To detect the 95th P ƒ², the number of men required for a statistical power of ≥0.8 at a Bonferroni corrected significance level (0.05/128) would be: 2100 (next day motility), 2168 (seminal volume), 3625 (sperm concentrations), 3486 (total sperm count), 2185 (morphology), 3510 (DNA fragmentation), and 1795 (DNA stainability). The cohort would need to be 3.8 times its current size (n = 473) to detect the 95th P associations between EDCs and DNA stainability.
With FDR, the sample sizes to reach power of ≥0.8 to detect the 95th P effect sizes were generally lower than with Bonferroni correction: 1094 (next day motility), 1110 (seminal volume), 2024 (sperm concentrations), 1744 (total sperm count), 1331 (morphology), 2116 (DNA fragmentation), and 925 (DNA stainability).
The sample size and statistical power relationship to detect correlation rs in our field-wide post hoc analysis is depicted in Fig. 6. After pooling the data, the 25th, 50th, 75th, and 95th P of rs were: 0.044, 0.090, 0.140, and 0.229, respectively. We found that, on average, these studies had 201 men and tested a subset of individual EDCs with 12 hypotheses per study. Using these average settings and the 95th P r, previous investigations had a statistical power of 0.69 at a significance level of 0.05/12. For scenarios that did not adjust for multiple testing and that adjusted for 100 pairs of comparisons, the statistical power was 0.91 and 0.42, respectively.

4. Discussion

4.1. Multivariate EWAS

While we could not identify robust associations when analyzing a mixture of 128 EDCs, several associations have been reported for the LIFE Study when assessing specific candidate chemical classes of EDCs (e.g., benzophenones, phthalates) and semen quality (Buck Louis et al., 2015a; Mumford et al., 2015b; Bloom et al., 2015; Buck Louis et al., 2015b). Other cohorts of men also have reported adverse associations between PCBs and PBDEs and sperm motility (Abdelouahab et al., 2011; Rignell-Hydbom et al., 2004; Meeker and Hauser, 2010) and sperm morphology for PBDEs and PFASs (Hauser et al., 2003; Toft et al., 2012).
Possible reasons accounting for our inability to identify significant EDCs in the EWAS analysis include choice of statistical models, differences in model specification, sources of biofluids for EDC measurement (urine, serum, semen), and a lack of attention to multiple testing in chemical class approaches that do not account for mixtures (Patel et al., 2015). Although EWAS is the most sensitive approach for the detection of associations when assessing mixtures (Agier et al., 2016), we hypothesize that limited statistical power is a key reason for our null findings in this study. For example, to detect association sizes similar to those estimated in this study (n = 473), we concluded that we would require at least 1795 men to detect the associations with sperm DNA stainability.

4.2. Consideration of multiple comparisons in mixture analysis

The large sample size requirement underlying EWAS techniques stems from correcting the type I error rate in the context of multiple comparisons (along with overcoming errors in measurement of the exposures and phenotypes). To highlight power requirements for the field, we analyzed selected studies in the field-wide analysis. Importantly, none reported original findings with correction for multiplicity. We calculated that the power to detect associations was as high as 0.91 (32% increase from 0.69) without adjustment for multiple comparisons (significance level at 0.05). On average, each study tested 12 hypotheses and hence the Bonferroni corrected significance level is 0.05/12. This illustrates that failing to adjust for even 12 pairs of comparisons in a study with a modest sample size may inflate apparent statistical power and is therefore more likely to lead to false


Fig. 4. Manhattan plot showing the results from multivariate exposome-wide association study. We tested the null hypothesis that endocrine disrupting chemicals (EDCs) were not associated with any of the seven semen quality endpoints. The Y axis represents the –log10 of the p values associated with multivariate F statistic. The X axis represents the 128 EDCs from persistent lipophilic to non-persistent compounds (left to right) that are colored by chemical class. Horizontal lines are drawn at p values of 0.005, 0.001, and Bonferroni correction level (0.05/128). None of the EDCs were statistically significant at a false discovery rate of 0.1. EDCs with p values < 0.05 are labeled. PCB: Polychlorinated biphenyl; OCPs: Organochlorine pesticides; PBBs: Polybrominated biphenyls; PBDE: Polybrominated diphenyl ether; PFASs: Per- and polyfluoroalkyl substances; Me-PFOSA-AcOH: 2‑(N‑methyl‑perfluorooctane sulfonamido) acetate; PFOSA: Perfluorooctane sulfonamide.
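The two thresholds referred to in this caption, the Bonferroni level (0.05/128) and a Benjamini-Hochberg false discovery rate of 0.1, can be illustrated with a short sketch. It is written in Python for illustration only and is not the code used in the study; the function names and the example p values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that bounds the family-wise error rate."""
    return alpha / n_tests


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p values decrease.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Bonferroni level for 128 tests at a family-wise alpha of 0.05.
print(bonferroni_threshold(0.05, 128))

# Hypothetical raw p values for four exposures (not taken from the study).
raw_p = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(raw_p)
print([q <= 0.1 for q in q_values])
```

An exposure would be flagged under the FDR criterion when its adjusted value is at most 0.1, and under the Bonferroni criterion when its raw p value is below the per-test level.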

Fig. 5. Graph showing the relationships between statistical power and sample size in the LIFE Study. The graph reflects post hoc power analysis using empirical data.
Panels A to G represent analyses with different endpoints. A) next day motility; B) seminal volume; C) sperm concentration; D) total sperm count; E) morphology
(WHO criteria); F) DNA fragmentation, and G) high DNA stainability. For example, in A), we regressed sperm motility on each endocrine disrupting chemical (EDC)
separately. We calculated Cohen's f² as the effect size and used a Bonferroni corrected significance level at 0.05/128 to estimate power. In the graph, each of the 128
EDCs is represented by a curve. The color of the curve denotes the EDC class. The top five EDCs are annotated. PCB: Polychlorinated biphenyl; OCPs: Organochlorine
pesticides; PBBs: Polybrominated biphenyls; PBDE: Polybrominated diphenyl ether; PFASs: Per- and polyfluoroalkyl substances.
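
The power calculation summarized in the Fig. 5 caption can be written out as a sketch, assuming the standard formulation of Cohen (1988) for a single EDC added to a regression model. The symbols are notation introduced here rather than taken from the article: n is the sample size, c is the number of adjustment covariates (not stated in this section), u and v are the numerator and denominator degrees of freedom, λ is the noncentrality parameter, and F′ denotes a noncentral F variable.

```latex
\begin{align*}
f^2 &= \frac{R^2_{\text{full}} - R^2_{\text{reduced}}}{1 - R^2_{\text{full}}},
  & u &= 1, \quad v = n - c - 2, \\
\lambda &= f^2\,(u + v + 1),
  & \alpha &= \frac{0.05}{128}, \\
\text{power} &= \Pr\!\left[\, F'(u, v, \lambda) > F_{1-\alpha}(u, v) \,\right].
\end{align*}
```

Because λ grows linearly with sample size for a fixed f², each curve in Fig. 5 rises with n at a rate set by that EDC's effect size.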


Fig. 6. Graph showing the relationships between statistical power and sample size, using data from four published semen quality studies. Vertical dash-dot blue lines indicate study sample sizes. Results are shown for three Bonferroni-corrected significance level (α) scenarios: no correction for multiple comparisons (α = 0.05); 12 comparisons per study (α = 0.05/12), which is the average of the selected studies; and 100 comparisons per study (α = 0.05/100), which is a value we arbitrarily set for a comprehensive exposome-wide association study. We extracted a total of 47 Pearson correlation coefficients as input and used the Bonferroni corrected α to estimate power. In the graph, curves corresponding to the correlations at the 25th, 50th, 75th and 95th percentiles (P) are shown. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
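
The three α scenarios in the Fig. 6 caption can be explored with a minimal sketch of power for a Pearson correlation. It uses the Fisher z normal approximation, which is a swapped-in technique chosen so the example needs only the Python standard library; it is not necessarily the routine used for the figure. The function names, the correlation of 0.1, and the sample size of 200 are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_power(r, n, alpha):
    """Approximate two-sided power to detect a Pearson correlation r with n subjects."""
    if n <= 3:
        raise ValueError("n must exceed 3 for the Fisher z approximation")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under the alternative, Fisher's z is roughly normal with this mean shift.
    shift = abs(atanh(r)) * sqrt(n - 3)
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def required_n(r, alpha, target_power=0.8, n_max=1_000_000):
    """Smallest n whose approximate power reaches target_power (linear search)."""
    n = 4
    while correlation_power(r, n, alpha) < target_power:
        n += 1
        if n > n_max:
            raise ValueError("target power not reached within n_max")
    return n


# Scenarios from the caption: 1, 12, and 100 comparisons per study.
hypothetical_r = 0.1   # illustrative effect size, not a study estimate
hypothetical_n = 200   # illustrative sample size, not a study value
for n_comparisons in (1, 12, 100):
    alpha = 0.05 / n_comparisons
    power = correlation_power(hypothetical_r, hypothetical_n, alpha)
    needed = required_n(hypothetical_r, alpha)
    print(f"alpha = 0.05/{n_comparisons}: power = {power:.3f}, n for 80% power = {needed}")
```

Stricter α values lower the power at a fixed sample size and raise the sample size needed for 80% power, which is the pattern the three scenarios are meant to show.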


discovery. Given that the associations between EDCs and semen quality endpoints are modest, cautious interpretation of findings on EDCs and semen quality is required (Patel and Ioannidis, 2014).

4.3. Limitations of post hoc statistical power analysis

Our field-wide power analysis in semen quality has several limitations. First, studies providing data were published from 2002 to 2011, and the extent to which they have external validity for other time periods remains unknown. Second, we could only extract 47 r values from four studies. Third, we summarized semen endpoints and chose r as the effect size measurement. Therefore, results could not be compared directly with our LIFE post hoc power analysis, which estimated f² on individual semen endpoints.

4.4. Approaches to increase statistical power of EWAS

There are several approaches to increase the power for EWAS investigations as we move forward to implement these tools. First, EDC concentrations are typically low and/or below the laboratory detection limits, especially when study participants are sampled from the general population. To reduce the number of comparisons, one possibility is to exclude exposures with a low detection percentage (e.g., 5%). Alternatively, one can retain the information by aggregating rare exposures by chemical class (Auer and Lettre, 2015). Secondly, when calculating the FWER, tests are assumed to be independent. One may take account of the correlations between exposures and estimate a new significance threshold. For example, the Bonferroni correction is calculated by dividing an a priori significance level (e.g., 0.05) by the number of comparisons made. Nyholt (2004) provided a method to calculate an “effective number of variables”, which is smaller than the number of comparisons when tests are correlated and therefore produces a higher significance threshold. Alternatively, FDR controlling procedures are generally more powerful than those for FWER, but at the cost of a higher type I error rate (Benjamini and Yekutieli, 2001; Kim and van de Wiel, 2008). Thirdly, one may use joint analyses for studies with more than one correlated endpoint (Zhou and Stephens, 2014), i.e., modeling data using one multivariate multiple regression in place of a few multiple regressions, as we attempted in this investigation. In this study, we selected seven endpoints. For an EWAS driven by multiple regressions, we would have 896 pairs of comparisons (128 × 7), whereas the number of comparisons is reduced to 128 for multivariate multiple regression. Lastly, while it is not feasible for individual investigators to conduct large-scale studies without substantial resources, meta-analyzing existing cohorts may be a cost-effective way to increase effective sample size and hence statistical power; however, a challenge remains in harmonizing studies across regions and with varying methodologies.

5. Conclusions

We did not identify any EDC significantly associated with diminished semen quality in a multivariate EWAS (FDR = 0.1). In a post hoc power analysis, we conclude that the sample size requirements range from 1795 to 3625 men when using a Bonferroni correction and from 925 to 2116 men when using a false discovery rate to mitigate type I error. Last, despite the importance of investigating endocrine disrupting chemicals for public health and male fertility, we found that existing cohort investigations are vastly underpowered to undertake discovery-based or EWAS-like approaches, and greater investment in larger sample sizes is required to identify environmental factors associated with semen phenotypes given their modest association sizes.

Acknowledgements

This work was supported by the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD), National Institutes of Health (Contracts #N01-HD-3-3355, N01-HD-3-3356, N01-HD-3-3358, HHSN27500001, HHSN27500002, HHSN27500003, HHSN27500006), and the National Institute of Environmental Health Sciences grants (ES023504 and ES025052). NICHD had a signed memo of understanding with the Centers for Disease Control and Prevention for the analysis of semen quality and persistent environmental chemicals.

Declaration of financial interests

None of the authors has any competing interests with this work.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.envint.2018.11.037.

References

Abdelouahab, N., Ainmelk, Y., Takser, L., 2011. Polybrominated diphenyl ethers and sperm quality. Reprod. Toxicol. 31, 546–550.
Agier, L., Portengen, L., Chadeau-Hyam, M., Basagaña, X., Giorgis-Allemand, L., Siroux, V., et al., 2016. A systematic comparison of linear regression-based statistical methods to assess exposome-health associations. Environ. Health Perspect. 124, 1848–1856.
Akins, J.R., Waldrep, K., Bernert, J.T., 1989. The estimation of total serum lipids by a completely enzymatic “summation” method. Clin. Chim. Acta 184, 219–226.
Asimakopoulos, A.G., Wang, L., Thomaidis, N.S., Kannan, K., 2014. A multi-class bioanalytical methodology for the determination of bisphenol A diglycidyl ethers, p‑hydroxybenzoic acid esters, benzophenone-type ultraviolet filters, triclosan, and triclocarban in human urine by liquid chromatography-tandem mass spectrometry. J. Chromatogr. A 1324, 141–148.
Auer, P.L., Lettre, G., 2015. Rare variant association studies: considerations, challenges and opportunities. Genome Med. 7.
Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. 57, 289–300.
Benjamini, Y., Yekutieli, D., 2001. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165–1188.
Bernert, J.T., Turner, W.E., Pirkle, J.L., Sosnoff, C.S., Akins, J.R., Waldrep, M.K., et al., 1997. Development and validation of sensitive method for determination of serum cotinine in smokers and nonsmokers by liquid chromatography/atmospheric pressure ionization tandem mass spectrometry. Clin. Chem. 43, 2281–2291.
Bloom, M.S., Whitcomb, B.W., Chen, Z., Ye, A., Kannan, K., Buck Louis, G.M., 2015. Associations between urinary phthalate concentrations and semen quality parameters in a general population. Hum. Reprod. 30, 2645–2657.
Buck Louis, G.M., Schisterman, E.F., Sweeney, A.M., Wilcosky, T.C., Gore-Langton, R.E., Lynch, C.D., et al., 2011. Designing prospective cohort studies for assessing reproductive and developmental toxicity during sensitive windows of human reproduction and development—the LIFE Study. Paediatr. Perinat. Epidemiol. 25, 413–424.
Buck Louis, G.M., Yeung, E., Sundaram, R., Laughon, S.K., Zhang, C., 2013. The exposome – exciting opportunities for discoveries in reproductive and perinatal epidemiology. Paediatr. Perinat. Epidemiol. 27, 229–236.
Buck Louis, G.M., Sundaram, R., Schisterman, E.F., Sweeney, A., Lynch, C.D., Kim, S., et al., 2014. Semen quality and time-to-pregnancy, the LIFE Study. Fertil. Steril. 101, 453–462.
Buck Louis, G.M., Chen, Z., Schisterman, E.F., Kim, S., Sweeney, A.M., Sundaram, R., et al., 2015a. Perfluorochemicals and human semen quality: the LIFE Study. Environ. Health Perspect. 123, 57–63.
Buck Louis, G.M., Chen, Z., Kim, S., Sapra, K.J., Bae, J., Kannan, K., 2015b. Urinary concentrations of benzophenone-type ultraviolet light filters and semen quality. Fertil. Steril. 104, 989–996.
Champely, S., Ekstrom, C., Dalgaard, P., Gill, J., Weibelzahl, S., Anandkumar, A., et al., 2017. pwr: Basic Functions for Power Analysis.
Cohen, J., 1988. Statistical Power Analysis for the Behavioral Sciences. L. Erlbaum Associates, Hillsdale, NJ.
De Jager, C., Farias, P., Barraza-Villarreal, A., Avila, M.H., Ayotte, P., Dewailly, E., et al., 2006. Reduced seminal parameters associated with environmental DDT exposure and p,p′‑DDE concentrations in men in Chiapas, Mexico: a cross-sectional study. J. Androl. 27, 16–27.
Diamanti-Kandarakis, E., Bourguignon, J.P., Giudice, L.C., Hauser, R., Prins, G.S., Soto, A.M., et al., 2009. Endocrine-disrupting chemicals: an endocrine society scientific statement. Endocr. Rev. 30, 293–342.
Francke, A.L., Smit, M.C., de Veer, A.J., Mistiaen, P., 2008. Factors influencing the implementation of clinical guidelines for health care professionals: a systematic meta-review. BMC Med. Inform. Decis. Mak. 8, 38.
Guo, Y., Alomirah, H., Cho, H.S., Minh, T.B., Mohd, M.A., Nakata, H., et al., 2011. Occurrence of phthalate metabolites in human urine from several Asian countries. Environ. Sci. Technol. 45, 3138–3144.
Haugen, T.B., Tefre, T., Malm, G., Jönsson, B.A.G., Rylander, L., Hagmar, L., et al., 2011.
Differences in serum levels of CB-153 and p,p′‑DDE, and reproductive parameters between men living south and north in Norway. Reprod. Toxicol. 32, 261–267.
Hauser, R., Chen, Z., Pothier, L., Ryan, L., Altshul, L., 2003. The relationship between human semen parameters and environmental exposure to polychlorinated biphenyls and p,p′‑DDE. Environ. Health Perspect. 111, 1505–1511.
Ioannidis, J.P.A., 2005. Why most published research findings are false. PLoS Med. 2.
Ioannidis, J.P.A., 2008. Why most discovered true associations are inflated. Epidemiology 19, 640–648.
Joensen, U.N., Bossi, R., Leffers, H., Jensen, A.A., Skakkebæk, N.E., Jørgensen, N., 2009. Do perfluoroalkyl compounds impair human semen quality? Environ. Health Perspect. 117, 923–927.
Kato, K., Wong, L.Y., Jia, L.T., Kuklenyik, Z., Calafat, A.M., 2011. Trends in exposure to polyfluoroalkyl chemicals in the U.S. Population: 1999–2008. Environ. Sci. Technol. 45, 8037–8045.
Kim, K.I., van de Wiel, M.A., 2008. Effects of dependence in high-dimensional multiple testing problems. BMC Bioinf. 9, 114.
Kuklenyik, Z., Needham, L.L., Calafat, A.M., 2005. Measurement of 18 perfluorinated organic acids and amides in human serum using on-line solid-phase extraction. Anal. Chem. 77, 6085–6091.
Kunisue, T., Wu, Q., Tanabe, S., Aldous, K.M., Kannan, K., 2010. Analysis of five benzophenone-type UV filters in human urine by liquid chromatography-tandem mass spectrometry. Anal. Methods 2, 707–713.
McGinnis, D.P., Brownstein, J.S., Patel, C.J., 2016. Environment-wide association study of blood pressure in the national health and nutrition examination survey (1999–2012). Sci. Rep. 6, 30373.
Meeker, J.D., Hauser, R., 2010. Exposure to polychlorinated biphenyls (PCBs) and male reproduction. Syst Biol Reprod Med 56, 122–131.
Mumford, S.L., Kim, S., Chen, Z., Gore-Langton, R.E., Barr, D.B., Buck Louis, G.M., 2015a. Persistent organic pollutants and semen quality: the LIFE Study. Chemosphere 135, 427–435.
Mumford, S.L., Kim, S., Chen, Z., Barr, D.B., Louis, G.M.B., 2015b. Urinary phytoestrogens are associated with subtle indicators of semen quality among male partners of couples desiring pregnancy. J. Nutr. 145, 2535–2541.
Nyholt, D.R., 2004. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am. J. Hum. Genet. 74, 765–769.
Patel, C.J., Ioannidis, J.P.A., 2014. Placing epidemiological results in the context of multiplicity and typical correlations of exposures. J. Epidemiol. Community Health 68, 1096–1100.
Patel, C.J., Manrai, A.K., 2015. Development of exposome correlation globes to map out environment-wide associations. Pac. Symp. Biocomput. 231–242.
Patel, C.J., Bhattacharya, J., Butte, A.J., 2010. An Environment-Wide Association Study (EWAS) on type 2 diabetes mellitus. PLoS One 5, e10746.
Patel, C.J., Rehkopf, D.H., Leppert, J.T., Bortz, W.M., Cullen, M.R., Chertow, G.M., et al., 2013. Systematic evaluation of environmental and behavioural factors associated with all-cause mortality in the United States national health and nutrition examination survey. Int. J. Epidemiol. 42, 1795–1810.
Patel, C.J., Burford, B., Ioannidis, J.P.A., 2015. Assessment of Vibration of Effects due to Model Specification Can Demonstrate the Instability of Observational Associations. J. Clin. Epidemiol. 68, 1046–1058 (June).
Phillips, D.L., Pirkle, J.L., Burse, V.W., Bernert, J.T., Henderson, L.O., Needham, L.L., 1989. Chlorinated hydrocarbon levels in human serum: effects of fasting and feeding. Arch. Environ. Contam. Toxicol. 18, 495–500.
Rappaport, S.M., Barupal, D.K., Wishart, D., Vineis, P., Scalbert, A., 2014. The blood exposome and its role in discovering causes of disease. Environ. Health Perspect. 122, 769–774.
Richardson, D.B., Ciampi, A., 2003. Effects of exposure measurement error when an exposure variable is constrained by a lower limit. Am. J. Epidemiol. 157, 355–363.
Richthoff, J., Rylander, L., Jönsson, B.A.G., Akesson, H., Hagmar, L., Nilsson-Ehle, P., et al., 2003. Serum levels of 2,2′,4,4′,5,5′‑hexachlorobiphenyl (CB-153) in relation to markers of reproductive function in young males from the general Swedish population. Environ. Health Perspect. 111, 409–413.
Rignell-Hydbom, A., Rylander, L., Giwercman, A., Jönsson, B.A.G., Nilsson-Ehle, P., Hagmar, L., 2004. Exposure to CB-153 and p,p′‑DDE and male reproductive function. Hum. Reprod. 19, 2066–2075.
Robitzsch, A., Grund, S., Henke, T., 2017. miceadds: Some Additional Multiple Imputation Functions, Especially for “mice”.
Schisterman, E.F., Vexler, A., Whitcomb, B.W., Liu, A., 2006. The limitations due to exposure detection limits for regression models. Am. J. Epidemiol. 163, 374–383.
Serghiou, S., Patel, C.J., Tan, Y.Y., Koay, P., Ioannidis, J.P.A., 2016. Field-wide meta-analyses of observational associations can map selective availability of risk factors and the impact of model specifications. J. Clin. Epidemiol. 71, 58–67.
Sjödin, A., Jones, R.S., Lapeza, C.R., Focant, J.F., McGahee, E.E., Patterson, D.G., 2004. Semiautomated high-throughput extraction and cleanup method for the measurement of polybrominated diphenyl ethers, polybrominated biphenyls, and polychlorinated biphenyls in human serum. Anal. Chem. 76, 1921–1927.
Skakkebaek, N.E., Rajpert-De Meyts, E., Buck Louis, G.M., Toppari, J., Andersson, A.M., Eisenberg, M.L., et al., 2016. Male reproductive disorders and fertility trends: influences of environment and genetic susceptibility. Physiol. Rev. 96, 55–97.
Smarr, M.M., Grantz, K.L., Sundaram, R., Maisog, J.M., Honda, M., Kannan, K., et al., 2016. Urinary paracetamol and time-to-pregnancy. Hum. Reprod. 31, 2119–2127.
Smarr, M.M., Sapra, K.J., Gemmill, A., Kahn, L.G., Wise, L.A., Lynch, C.D., et al., 2017. Is human fecundity changing? A discussion of research and data gaps precluding us from having an answer. Hum. Reprod. 32, 499–504.
Smith, V., Devane, D., Begley, C.M., Clarke, M., 2011. Methodology in conducting a systematic review of systematic reviews of healthcare interventions. BMC Med. Res. Methodol. 11, 15.
Toft, G., Jönsson, B.A.G., Lindh, C.H., Giwercman, A., Spano, M., Heederik, D., et al., 2012. Exposure to perfluorinated compounds and human semen quality in arctic and European populations. Hum. Reprod. 27, 2532–2540.
Vested, A., Giwercman, A., Bonde, J.P., Toft, G., 2014. Persistent organic pollutants and male reproductive health. Asian J. Androl. 16, 71–80.
Vitku, J., Heracek, J., Sosvorova, L., Hampl, R., Chlupacova, T., Hill, M., et al., 2016. Associations of bisphenol A and polychlorinated biphenyls with spermatogenesis and steroidogenesis in two biological fluids from men attending an infertility clinic. Environ. Int. 89–90, 166–173.
Zhang, Z., Alomirah, H., Cho, H.S., Li, Y.F., Liao, C., Minh, T.B., et al., 2011. Urinary bisphenol A concentrations and their implications for human exposure in several Asian countries. Environ. Sci. Technol. 45, 7044–7050.
Zhou, X., Stephens, M., 2014. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nat. Methods 11, 407–409.
Zhuang, X., Ni, A., Liao, L., Guo, Y., Dai, W., Jiang, Y., et al., 2018. Environment-wide association study to identify novel factors associated with peripheral arterial disease: evidence from the National Health and Nutrition Examination Survey (1999–2004). Atherosclerosis 269, 172–177.
