Professional Documents
Culture Documents
kww009 PDF
kww009 PDF
kww009 PDF
12
© The Author 2016. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. DOI: 10.1093/aje/kww009
This is an Open Access article distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, Advance Access publication:
provided the original work is properly cited. May 10, 2016
Practice of Epidemiology
Initially submitted September 22, 2015; accepted for publication January 8, 2016.
Progress has recently been made in understanding the genetic basis of schizophrenia and other psychiatric dis-
orders. Longitudinal studies are complicated by participant dropout, which could be related to the presence of psy-
chiatric problems and associated genetic risk. We tested whether common genetic variants implicated in
schizophrenia were associated with study nonparticipation among 7,867 children and 7,850 mothers from the
Avon Longitudinal Study of Parents and Children (ALSPAC; 1991–2007), a longitudinal population cohort study.
Higher polygenic risk scores for schizophrenia were consistently associated with noncompletion of questionnaires
by study mothers and children and nonattendance at data collection throughout childhood and adolescence (ages
1–15 years). These associations persisted after adjustment for other potential correlates of nonparticipation. Re-
sults suggest that persons at higher genetic risk for schizophrenia are likely to be underrepresented in cohort stud-
ies, which will underestimate risk of this and related psychiatric, cognitive, and behavioral phenotypes in the
population. Statistical power to detect associations with these phenotypes will be reduced, while analyses of
schizophrenia-related phenotypes as outcomes may be biased by the nonrandom missingness of these pheno-
types, even if multiple imputation is used. Similarly, in complete-case analyses, collider bias may affect associations
between genetic risk and other factors associated with missingness.
Avon Longitudinal Study of Parents and Children; attrition bias; cohort studies; genetic risk; longitudinal studies;
schizophrenia; study nonparticipation
Abbreviations: ALSPAC, Avon Longitudinal Study of Parents and Children; GWAS, genome-wide association study; IQ, intelligence
quotient; MDD, major depressive disorder; PRS, polygenic risk scores; SNP, single-nucleotide polymorphism.
Schizophrenia is a highly heritable and severely impairing may also be limited by nonparticipation at given time points,
neurodevelopmental disorder with onset typically in early or even complete loss to follow-up.
adulthood. In a recent genome-wide association study (GWAS) There are multiple factors associated with nonparticipation
meta-analysis of 34,241 schizophrenia cases, 45,604 con- in cohort studies, including socioeconomic adversity, male
trols, and 1,235 parent-offspring trios, the Schizophrenia sex, and cognitive, emotional, and behavioral problems (3–6).
Working Group of the Psychiatric Genomics Consortium re- Given that many of these factors are associated with risk of
ported 128 independent genome-wide significant single- schizophrenia (7, 8), persons at higher risk for this disorder
nucleotide polymorphisms (SNPs) associated with risk of may also be more likely to be lost to follow-up, even before
this disorder (1). Given that schizophrenia has a relatively illness onset. Studies show that genetic risk of schizophrenia
low lifetime morbidity risk of about 0.7% in the general pop- overlaps with risks of other psychiatric conditions, including
ulation (2), population-based cohort samples are unlikely to bipolar disorder, major depressive disorder (MDD), attention-
include many affected individuals. Such longitudinal studies deficit/hyperactivity disorder, autism spectrum disorder,
and intellectual disability (9–12). Given the breadth of delivery dates between April 1, 1991, and December 31,
phenotypic-genetic overlap, it is plausible that schizophrenia 1992. Among these pregnancies, there were 14,062 liveborn
genetic risk predisposes people to a broad range of psychopa- children, of whom 13,988 were alive at 1 year of age. An ad-
thology that could be at subthreshold levels with regard to ditional 713 children who would have been eligible but
disorder diagnosis but result in higher levels of nonparticipa- whose mothers did not enroll during pregnancy were enrolled
tion in longitudinal studies. Thus, schizophrenia could be re- after age 7 years, resulting in a total sample of 14,701 chil-
lated to nonparticipation in cohort studies either through dren alive at age 1 year. The study website contains details
clinical phenotypic features associated with the disorder or on all available data through a fully searchable data dictionary
through genetic risk factors that affect an individual’s pre- (17). Ethical approval for the study was obtained from the
morbid state. ALSPAC Ethics and Law Committee and local research eth-
Missing data in longitudinal studies are associated with a ics committees.
Am J Epidemiol. 2016;183(12):1149–1158
Genetic Schizophrenia Risk and Study Nonparticipation 1151
8,000
7,000
6,000
5,000
No. of People
4,000
2,000
1,000
0
7 10 13 15 1 4 7 10 13 15 7 10 13 15
Clinic Attended Mother-Completed Questionnaire Child-Completed
Questionnaire
Source of Data and Age, years
Figure 1. Data availability in a sample of mothers and children with genetic data from the Avon Longitudinal Study of Parents and Children, 1991–
2007. Black indicates persons who participated in data collection at a given time point; gray indicates persons with missing data.
maternal educational level (Certificate of Secondary Education, strand mismatches were corrected using the “flip” command
vocational, O-levels, A-levels, or undergraduate degree), in PLINK. Schizophrenia PRS were subsequently calculated
home ownership (yes/no), and parental socioeconomic for ALSPAC children and mothers using the --score function
position. Three socioeconomic categories were defined (low: in PLINK. Scores were based on 35,768 SNPs in children and
unskilled workers/unemployed; medium: manual and non- 35,756 SNPs in mothers. The scores were normally distribu-
manual skilled/partially skilled workers; high: professional ted and were standardized using z-score transformation to aid
and managerial workers) on the basis of whichever reported interpretation of the results.
parental occupation was higher-level; the occupations were
classified using the United Kingdom Standard Occupational Statistical analysis
Classification (21). Analyses were performed in Stata, version 13.1 (StataCorp
Discovery sample and calculating PRS LP, College Station, Texas). Logistic regression analyses were
used to estimate odds ratios and 95% confidence intervals for
We used the largest schizophrenia GWAS published to date missingness per 1-standard-deviation increase in PRS. For
(1) to derive PRS in ALSPAC children and mothers. The each test, the Nagelkerke pseudo-R 2 value is presented as an
P value threshold (PT) used to define risk alleles was P < estimate of the amount of variance in outcome (missingness)
0.05, the threshold at which PRS maximally predict caseness explained by PRS. Analyses were performed separately for
in independent samples (9). Scores were calculated separately child PRS and maternal PRS for each of the 3 types of data
for ALSPAC children and mothers using imputed SNPs. First, collection (clinic attendance and child- and mother-completed
SNPs in the discovery sample were filtered for a minor allele questionnaires). Secondary analyses explored the addition of
frequency greater than 1% and an INFO imputation score (a covariates, unique child and maternal associations, and wheth-
measure of the certainty and quality of the imputation results) er the association with PRS changed over time. The latter anal-
greater than 0.9 (18). Linkage disequilibrium-based clumping ysis was conducted using generalized estimating equations,
was performed for SNPs overlapping discovery and ALSPAC including an interaction term for the interaction of PRS with
samples in PLINK software, version 1.9 (22), using the follow- age, to examine change in the association over time.
ing parameters: --clump-p1 0.5, --clump-r2 0.25, and --clump-
kb 500. Additional quality control procedures were subsequently RESULTS
performed that checked for strand mismatches and substantial
differences in allele frequencies (>0.1) between the Psychiatric Table 1 displays descriptive statistics for covariates in the
Genomics Consortium schizophrenia sample (1) and the sam- study sample, stratified by whether children had genetic
ples of ALSPAC children and mothers. Where appropriate, data. Persons with genetic data were more likely to have a
Am J Epidemiol. 2016;183(12):1149–1158
1152 Martin et al.
Table 1. Distribution of the Study Sample According to Availability of Children’s Genetic Data, Avon Longitudinal
Study of Parents and Children, 1991–2007
Abbreviations: CSE, Certificate of Secondary Education; MDD, major depressive disorder; SDQ, Strengths and
Difficulties Questionnaire.
a
Descriptive statistics are based on 7,867 persons with genetic data available for analysis and 5,930 persons
without genetic data in the analyses.
b
Low: unskilled worker/unemployed; medium: manual or nonmanual skilled/partially skilled worker; high:
professional or managerial worker.
c
Values are presented as mean (standard deviation).
d
t = −3.8; P = 1.2e −4.
higher socioeconomic position, mothers with higher levels of of missing data points. A linear regression of the number of
education, parents who owned their own home, lower risk of missing data points on schizophrenia PRS showed that PRS
maternal MDD, higher risk of a family history of psychosis, were associated with an increase of 0.07 missing data points
and lower scores on the Strengths and Difficulties Question- (0.08 in mothers) per 1-standard-deviation increase in PRS.
naire than persons without genetic data. Results of multivariable regression analyses of the asso-
There was strong evidence that higher schizophrenia PRS ciation between schizophrenia PRS and missing data, after
in the children were associated with missing data for all mea- adjustment for family history of MDD or schizophrenia,
sures and time points (see Table 2). The odds ratios reflected child variables (sex and total Strengths and Difficulties Ques-
an increased likelihood of missing data per 1-standard- tionnaire problems), and socioeconomic variables (family
deviation increase in schizophrenia PRS. PRS in mothers socioeconomic position, maternal education, and home own-
were similarly associated with missing data for all time points ership), can be found in Table 3. Complete data on all vari-
(see Table 2). The mean differences in PRS for persons with ables were available for 5,601 children and 5,347 mothers.
and without missing data are shown in Figures 2 and 3 for PRS for schizophrenia in children continued to show an
PRS in children and mothers, respectively. association with availability of data for the majority of data
Figures 4 and 5 display mean PRS for schizophrenia in collection time points (except mother-completed question-
children and mothers, depending on the cumulative number naires at age 4 years and child-completed questionnaires at
Am J Epidemiol. 2016;183(12):1149–1158
Genetic Schizophrenia Risk and Study Nonparticipation 1153
Table 2. Association of Schizophrenia Polygenic Risk Scores With Missing Data Among Children and Mothers From
the Avon Longitudinal Study of Parents and Children, 1991–2007
Clinic attendance
7 1.16 1.10, 1.22 2.0e −8 0.0035 1.17 1.11, 1.22 1.0e −8 0.0042
−7
10 1.14 1.08, 1.19 1.5e 0.0028 1.14 1.09, 1.19 3.0e −8 0.0030
13 1.14 1.09, 1.20 1.0e −8 0.0032 1.14 1.09, 1.19 2.0e −8 0.0030
15 1.14 1.09, 1.19 1.0e −8 0.0031 1.16 1.11, 1.21 1.0e −8
0.0038
Abbreviations: CI, confidence interval; OR, odds ratio; PRS, polygenic risk scores.
0.150
0.100
0.050
Mean z Score
0.000
–0.050
–0.100
7 10 13 15 1 4 7 10 13 15 7 10 13 15
Clinic Attendance Mother-Completed Questionnaire Child-Completed
Questionnaire
Source of Data and Age, years
Figure 2. Polygenic risk scores (mean z score) for schizophrenia among children from the Avon Longitudinal Study of Parents and Children (1991–
2007), depending on data availability. White bars represent persons with data available; gray bars represent persons with missing data. Error bars
show standard errors.
Am J Epidemiol. 2016;183(12):1149–1158
1154 Martin et al.
0.150
0.100
0.050
Mean z Score
–0.050
–0.100
7 10 13 15 1 4 7 10 13 15 7 10 13 15
Clinic Attendance Mother-Completed Questionnaire Child-Completed
Questionnaire
Source of Data and Age, years
Figure 3. Polygenic risk scores (mean z score) for schizophrenia among mothers from the Avon Longitudinal Study of Parents and Children
(1991–2007), depending on data availability. White bars represent persons with data available; gray bars represent persons with missing data.
Error bars show standard errors.
0.06
0.04
0.02
0.00
–0.02
Mean z Score
–0.04
–0.06
–0.08
–0.10
–0.12
–0.14
–0.16
0 1 1–2 1–3 1–4 1–5 1–6 1–7 1–8 1–9 1–10 1–11 1–12 1–13 1–14
Cumulative No. of Missing Data Points
Figure 4. Polygenic risk scores (mean z score) for schizophrenia among children from the Avon Longitudinal Study of Parents and Children (1991–
2007), according to the number of missing data points. The x-axis displays categories of persons with no missing data, missing data for only 1 time
point, missing data for 1–2 time points, missing data for 1–3 time points, and so on. Error bars show standard errors.
Am J Epidemiol. 2016;183(12):1149–1158
Genetic Schizophrenia Risk and Study Nonparticipation 1155
0.06
0.04
0.02
0.00
–0.02
Mean z Score
–0.04
–0.08
–0.10
–0.12
–0.14
–0.16
0 1 1–2 1–3 1–4 1–5 1–6 1–7 1–8 1–9 1–10 1–11 1–12 1–13 1–14
Figure 5. Polygenic risk scores (mean z score) for schizophrenia among mothers from the Avon Longitudinal Study of Parents and Children
(1991–2007), according to the number of missing data points. The x-axis displays categories of persons with no missing data, missing data for
only 1 time point, missing data for 1–2 time points, missing data for 1–3 time points, and so on. Error bars show standard errors.
Table 3. Association of Schizophrenia Polygenic Risk Scores With Missing Data Among Children and Mothers, After
Adjustment for Known Correlates of Study Nonparticipation,a Avon Longitudinal Study of Parents and Children,
1991–2007
Clinic attendance
7 1.15 1.07, 1.24 1.9e −4 0.0029 1.18 1.10, 1.26 1.4e −6 0.0043
−4 −4
10 1.13 1.06, 1.20 3.8e 0.0022 1.13 1.06, 1.20 2.8e 0.0023
13 1.15 1.08, 1.21 3.6e −6 0.0031 1.09 1.03, 1.16 2.7e −3 0.0014
15 1.15 1.09, 1.22 7.7e −7 0.0032 1.13 1.06, 1.19 4.2e −5 0.0023
Mother-completed questionnaire
1 1.24 1.06, 1.44 6.2e −3 0.0048 1.10 0.96, 1.27 0.17 0.0011
4 1.01 0.90, 1.14 0.88 0.0000 1.27 1.12, 1.44 1.5e −4 0.0068
7 1.11 1.03, 1.20 9.0e −3 0.0016 1.10 1.02, 1.19 0.013 0.0014
10 1.09 1.02, 1.17 0.013 0.0012 1.13 1.05, 1.20 6.5e −4 0.0022
13 1.08 1.02, 1.15 0.013 0.0010 1.10 1.04, 1.17 2.3e −3 0.0015
−4
15 1.11 1.05, 1.18 4.5e 0.0018 1.11 1.05, 1.18 5.5e −4 0.0017
Child-completed questionnaire
7 1.04 0.97, 1.10 0.25 0.0002 1.03 0.97, 1.10 0.30 0.0002
−3 −6
10 1.10 1.03, 1.19 6.5e 0.0015 1.18 1.10, 1.26 4.9e 0.0040
13 1.11 1.04, 1.18 9.4e −4 0.0017 1.12 1.05, 1.19 3.7e −4 0.0021
15 1.15 1.09, 1.22 6.0e −7 0.0032 1.10 1.04, 1.17 7.0e −4 0.0015
Abbreviations: CI, confidence interval; OR, odds ratio; PRS, polygenic risk scores.
a
Family history of schizophrenia or depression, child sex and behavioral/emotional problems, family socioeconomic
position, maternal educational level, and home ownership.
Am J Epidemiol. 2016;183(12):1149–1158
1156 Martin et al.
Table 4. Association of Schizophrenia Polygenic Risk Scores With Missing Data Among Children and Mothers, After
Adjustment for Shared Associations, Avon Longitudinal Study of Parents and Children, 1991–2007
Clinic attendance
7 1.12 1.04, 1.21 3.6e −3 0.0016 1.07 1.00, 1.16 0.066 0.0006
10 1.14 1.06, 1.23 5.3e −4 0.0021 1.03 0.96, 1.11 0.43 0.0001
13 1.15 1.08, 1.23 4.0e −5 0.0025 1.02 0.95, 1.09 0.62 <0.0001
15 1.11 1.04, 1.19 1.1e −3 0.0015 1.06 1.00, 1.13 0.056 0.0005
Abbreviations: CI, confidence interval; OR, odds ratio; PRS, polygenic risk scores.
age 7 years), even after accounting for the known correlates Web Table 2). The results suggested that the association of
of missing data in longitudinal research. Similar results were child PRS with child-completed questionnaires increased
found for PRS calculated in mothers (see Table 3), with over time, but there was no evidence that the association
strong evidence of association between maternal PRS and with other data sources changed over time.
data availability at the majority of data collection time points,
after accounting for the known correlates of nonparticipation.
On the whole, effect sizes for the association of PRS with DISCUSSION
availability of data at the collection time points were not
The results of this study show a consistent pattern of associ-
changed upon the addition of these correlates to the models,
ation between increased PRS for schizophrenia in children and
although the evidence for association was marginally stron-
mothers from the ALSPAC general population sample and non-
ger than in the unadjusted models. (See Web Table 1, avail-
participation in data collection at time points across ages 1–15
able at http://aje.oxfordjournals.org/, for results of unadjusted
years. The magnitudes of these associations remained similar
analyses confined to persons with no missing data.)
after adjustment for several known correlates of nonparticipa-
Secondary analyses tion (i.e., family history of schizophrenia or depression, child
sex and behavioral/emotional problems, family socioeconomic
In order to further examine the source of the genetic asso- position, maternal educational level, and home ownership (3–
ciation with nonparticipation, we repeated the analyses while 6)). Together, these results suggest that analyses of schizo-
including both child and maternal PRS in the same model, for phrenia (and genetically related psychiatric phenotypes) as
child-mother pairs with available genetic data (n = 5,262). outcomes in longitudinal population cohort studies are likely
The correlation between children’s and mothers’ scores was to be underpowered, given that persons with a genetic predispo-
0.509. As can be seen from Table 4, children’s PRS were as- sition to these phenotypes will be underrepresented in longitu-
sociated with missing data independently of the association dinal population cohorts. They also indicate that some analyses
of maternal PRS for the majority of data collection time of schizophrenia and related outcomes may be biased because
points (except for mother-completed questionnaires at age data for these outcomes may be missing not-at-random.
4 years). There was no evidence of association of maternal Given the low prevalence of schizophrenia in the popula-
PRS with nonparticipation, above and beyond the variance tion (2) and the high pleiotropy of genetic risk of schizophre-
shared with child PRS. On the whole, odds ratios were higher nia (10–12), the results further imply that nonparticipation
for child PRS than for maternal PRS. in longitudinal research may be associated with other psychi-
Finally, we estimated the population-averaged association atric disorders (e.g., MDD or attention-deficit/hyperactivity
between schizophrenia PRS and missing data across time disorder), as well as subclinical psychopathology or related
points and examined whether this changed over time (see phenotypes (e.g., personality traits, cognitive function),
Am J Epidemiol. 2016;183(12):1149–1158
Genetic Schizophrenia Risk and Study Nonparticipation 1157
even in the absence of a psychiatric diagnosis. The observed missing data. However, there were consistent associations
association between genetic risk of schizophrenia and study of child PRS with nonparticipation at all data collection
nonparticipation is likely to be the result of an influence of time points. This finding suggests that additional genetic
schizophrenia risk alleles on phenotypes beyond those gener- risk factors beyond those carried by the mother influence
ally considered directly related to the disorder (e.g., other the likelihood of mothers and children not returning question-
psychopathology or behavioral, cognitive, or personality naires or attending the clinic visits.
phenotypes), since the association was seen by the age of It is possible that the current study underestimated the effect
7 years—an age at which the probability of children in this size of the association between PRS in mothers and nonpar-
population sample having schizophrenia or schizophrenia- ticipation because some mothers with higher PRS may have
spectrum disorders was very low. This is an important obser- not enrolled in the study at all or may not have provided a
vation that is likely to be relevant for analyses of any popula- DNA sample and so were not part of the analyses. Genetic
Am J Epidemiol. 2016;183(12):1149–1158
1158 Martin et al.
imputation models may still not be met even after inclusion of 6. de Graaf R, Bijl RV, Smit F, et al. Psychiatric and
variables related to family history, socioeconomic position, and sociodemographic predictors of attrition in a longitudinal study:
childhood behavioral and emotional problems. These results the Netherlands Mental Health Survey and Incidence Study
open up possibilities for future research into nonparticipation (NEMESIS). Am J Epidemiol. 2000;152(11):1039–1047.
7. Woodberry KA, Giuliano AJ, Seidman LJ. Premorbid IQ in
in longitudinal studies—for example, regarding whether inclu-
schizophrenia: a meta-analytic review. Am J Psychiatry. 2008;
sion of genetic risk scores in imputation of missing data might 165(5):579–587.
allow associations to be estimated with less bias. 8. Reichenberg A, Caspi A, Harrington H, et al. Static and dynamic
cognitive deficits in childhood preceding adult schizophrenia: a
30-year study. Am J Psychiatry. 2010;167(2):160–169.
ACKNOWLEDGMENTS 9. Hamshere ML, Stergiakouli E, Langley K, et al. Shared
polygenic contribution between childhood attention-deficit
Am J Epidemiol. 2016;183(12):1149–1158