Profiling Allele-Specific Gene Expression in Brains From Individuals With Autism Spectrum Disorder Reveals Preferential Minor Allele Usage
https://doi.org/10.1038/s41593-019-0461-9
One fundamental but understudied mechanism of gene regulation in disease is allele-specific expression (ASE), the preferential expression of one allele. We leveraged RNA-sequencing data from human brain to assess ASE in autism spectrum disorder (ASD). When ASE is observed in ASD, the allele with lower population frequency (minor allele) is preferentially more highly expressed than the major allele, opposite to the canonical pattern. Importantly, genes showing ASE in ASD are enriched in those downregulated in ASD postmortem brains and in genes harboring de novo mutations in ASD. Two regions, 14q32 and 15q11, containing all known orphan C/D box small nucleolar RNAs (snoRNAs), are particularly enriched in shifts to higher minor allele expression. We demonstrate that this allele shifting enhances snoRNA-targeted splicing changes in ASD-related target genes in idiopathic ASD and 15q11–q13 duplication syndrome. Together, these results implicate allelic imbalance and dysregulation of orphan C/D box snoRNAs in ASD pathogenesis.
ASD is defined by deficits in social interaction and repetitive behaviors1,2 and has a global prevalence of approximately 1 case per 68 people1. Although genetic factors contribute to a substantial proportion of disease liability, the genetics of ASD are complex with contributions from both rare and common alleles, including de novo mutations and polygenic inheritance2. Despite this heterogeneity and polygenicity, we have previously identified a reproducible shared pattern of gene expression alterations in human postmortem cortex from people with idiopathic ASD3 and replicated this signal in independent cortex samples with idiopathic ASD and dup15q4. This gene expression signal may be mediated by epigenetic factors5,6, such as DNA methylation7 or histone modification6, in addition to underlying genetic variation.

ASE is a form of genetic regulation in which the expression of mRNA at a specific locus is biased toward a specific allele8–10. An extreme case of ASE is genomic imprinting, resulting in monoallelic expression (MAE), whereby a particular allele is completely silenced8,11. But most ASE represents more subtle shifts in the allelic ratio rather than complete silencing of one allele9,12 as evidenced by the observations of preferentially reduced expression of mutant disease alleles and that a substantial proportion of cis-acting expression quantitative trait loci (cis-eQTL) are mediated by ASE12–14. Indeed, cis-eQTL and ASE are complementary mechanisms involved in the regulation of gene expression15. Further, as it relies primarily on within-sample expression comparisons, rather than comparisons between samples, analysis of ASE is less influenced by genetic, environmental and technical confounders15,16.

A number of diseases can be attributed to abnormalities in ASE17,18. For example, Prader–Willi syndrome and Angelman syndrome are rare neurodevelopmental disorders resulting from deletions within the highly imprinted region, 15q11–q13. Conversely, recurrent maternally derived duplications of this region lead to dup15q, a rare but penetrant syndromic form of ASD comprising approximately 1% of cases19. Although duplications generally increase gene expression19, varied patterns are observed across genes in the region impacted by dup15q due to complex local regulatory mechanisms4,19–21.

In addition to known imprinted regions, ASE extends to over 5% of autosomal genes and is especially enriched in transmembrane receptors and cell surface molecules9. Random monoallelic expression (RMAE), which reflects MAE not associated with parent-of-origin imprinting, may be important for cell-type-specific gene dosage effects10,22. Moreover, emerging evidence suggests that RMAE is involved in neurodevelopmental disorders, as recently observed for developmental dyspraxia caused by RMAE in FOXP2 (ref. 23). In this case, ASE may contribute to disease susceptibility by exacerbating the deleterious effects of a mutation via haploinsufficiency, or mosaic somatic expression.

Except for one notable study, which showed that ASE in several specific genes may play a role in a subset of ASD cases8, the role of genome-wide ASE has not been explored in a sufficiently powered cohort. The overall paucity of ASE studies in brain disorders, such as ASD, is likely due to several challenges including limited access to human brain tissue, small sample sizes, incomplete transcriptome-wide genotyping data and reference bias in mapping transcripts8,12,13,24,25. Reference bias is attributable to reduced mapping of alternative-allele-containing RNA-sequencing (RNA-seq) reads (due to a greater number of mismatches) and can generate
1Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA. 2Department of Computer Science, Henry Samueli School of Engineering, University of California, Los Angeles, Los Angeles, CA, USA. 3Center for Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA. 4Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA. 5Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA. 6Center for Autism Research and Treatment, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA. 7Present address: Department of Neuroscience, Peter O’Donnell Jr. Brain Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA. *e-mail: dhg@mednet.ucla.edu
[Fig. 2 graphic. Panel a: Venn diagram of ASE genes in BA9, BA41 and vermis. Panel b: enrichment heatmap. Panel c: enrichment for de novo variant and GWAS gene sets in BA9, BA41 and vermis. Color scale: −log10(P value).]
Fig. 2 | ASE patterns shared among cases and controls. a, Venn diagram comparing the overlap of ASE genes from three brain tissues: BA9, BA41 and
vermis. b, Enrichment analyses of genes exhibiting ASE with ASD-relevant (SFARI gene list36; Methods) and cell-type-specific gene lists (Methods)31.
Across all three brain regions, ASE genes showed strong overlap with known imprinted genes (Methods) as well as targets of FMRP (ref. 28), HuR
(ref. 29) and RBFOX1 (ref. 30). Plots show ORs and FDR-corrected P values for enrichment, if significant. c, Enrichment analysis of genes exhibiting ASE with
candidate psychiatric risk genes. Likely gene-disrupting de novo variants32,33 were considered from studies of schizophrenia (SCZ), intellectual disability
(ID) and ASD. The datasets for SCZ, ID and ASD1 were gene lists containing de novo likely gene disrupting mutations from Iossifov et al.32, and the data
for ASD2 represent risk genes integrating de novo copy number variations (FDR ≦ 0.01) from Sanders et al.33 (Methods). The risk variants from GWAS34,35
were considered for ADHD, BPD, MDD, SCZ and ASD (Methods). In this case, ASD1 and ASD2 represent the GWAS datasets from the Cross-Disorder
Group of the Psychiatric Genomics Consortium34 and Grove et al.35, respectively. If significant, the plot shows ORs and FDR-corrected P values for de novo variant
datasets and FDR-corrected P values for GWAS. The sample numbers of BA9, BA41 and vermis are 67, 64 and 64, respectively. Ver, vermis.
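
The enrichment P values in Fig. 2 are reported after FDR correction. The caption does not name the procedure, so the following minimal Python sketch assumes the widely used Benjamini-Hochberg method. The function name `bh_adjust` and the example P values are illustrative and are not taken from the study.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Assumption: the study's "FDR-corrected" values may have been
    computed with a different procedure.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so that each adjusted value
    # never exceeds the adjusted value of the next-larger P value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for three gene-set tests.
print(bh_adjust([0.01, 0.04, 0.03]))
```

With three tests, the smallest P value is multiplied by 3, and the two larger ones both end up at the adjusted value of the largest, which keeps the adjusted values monotone.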
per million mapped reads (RPSM)): 0.37 versus 0.71; Supplementary Fig. 2d). ASE was also enriched at known imprinted genes as well as the targets of RNA-binding proteins, FMRP (ref. 28), ELAVL1 (HuR)29 and RBFOX1 (ref. 30) (Fig. 2b; Methods). Genes showing evidence for ASE were also preferentially enriched for neuronal31 and astrocyte markers31 (Fig. 2b and Supplementary Fig. 2e). Consistent with a strong neuronal signal, they were also enriched for genes encoding proteins of the postsynaptic density (PSD)32, similar to previous observations9. Remarkably, across all three brain regions, ASE genes are enriched for genes previously shown to be downregulated in postmortem brain from people with ASD4 (Supplementary Fig. 2f); 37% of downregulated genes were subject to ASE. In cerebral cortex, genes downregulated in ASD postmortem brain were enriched among those showing ASE, and those upregulated in ASD were depleted in genes showing ASE. In cortex, 448 and 503 genes were up- and downregulated in ASD, respectively. However, among ASE genes, 84 and 181 genes were respectively up- and downregulated in ASD (OR = 1.92 and P = 7.03 × 10−6). Both ASE genes and those downregulated in ASD are involved in synaptic transmission and vesicular transport4, consistent with their enrichment in PSD and targets of FMRP28, HuR29 and RBFOX1, genes previously implicated in ASD4. Importantly, genes showing ASE also showed significant enrichment in genes harboring de novo ASD risk variants identified in two different studies32,33 (Fig. 2c; Methods). However, genes manifesting ASE did not show any significant enrichment with genome-wide association study (GWAS) data from several major psychiatric disorders including attention-deficit/hyperactivity disorder (ADHD)34, bipolar disorder (BPD)34, major depressive disorder (MDD)34, schizophrenia (SCZ)34 and ASD34,35.

ASE patterns shared across idiopathic ASD cases and controls. Similar to controls, idiopathic ASD cases showed tissue-specific patterns of ASE across the three brain regions (Fig. 3a and Supplementary Fig. 3a,b). Since the two cortical regions showed strong overlap in ASE and cortical manifestations of transcriptional dysregulation in ASD are more prominent than those in cerebellum4, we focused on the cerebral cortex for most of the subsequent case–control comparisons. We found that SNPs showing ASE are broadly and evenly distributed throughout the genome in both control and idiopathic ASD samples (Fig. 3b). The majority of genes exhibiting ASE (70.41%) are shared across case and control groups, providing further confidence in the reproducibility of these results (Fig. 3c and Supplementary Table 3). Nevertheless, fewer genes showing ASE are observed in idiopathic ASD compared with controls (Fig. 3c), and this was recapitulated when studying the rate of ASE at the level of each individual sample (Fig. 3d).

Comparing biological pathway enrichment in cases and controls revealed overlap and differences in ASE gene function. Genes exhibiting ASE in both idiopathic ASD and controls are enriched in phosphoprotein, RNA splicing and neuron projection pathways (Fig. 3e and Supplementary Table 4). Genes exhibiting control-specific ASE are significantly enriched for pathways related to growth cone and synapse (Supplementary Fig. 3c and Supplementary Table 4). Although there are fewer idiopathic ASD-specific ASE genes than control-specific ASE genes, those showing idiopathic ASD-specific ASE are significantly enriched for pathways involved in transport (Supplementary Fig. 3d and Supplementary Table 4). Genes exhibiting ASE in both idiopathic ASD and controls show significant enrichment in genes harboring de novo ASD risk variants (Supplementary Fig. 3e; OR = 1.68, P = 5 × 10−4 for ASD1; OR = 3.83, P = 0.001 for ASD2). However, control-specific and idiopathic ASD-specific ASE genes do not show any enrichment either with rare de novo ASD risk variants or with the common variants from GWAS data (Supplementary Fig. 3e).
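
The gene-set enrichments above are reported as odds ratios (ORs) with P values. As a rough illustration of how such a statistic can be computed from a 2 × 2 overlap table, the Python sketch below implements an odds ratio and a two-sided Fisher's exact test. The background gene universe is not given in this section, so the counts in the example are hypothetical and the sketch does not attempt to reproduce any reported OR; the function names are also illustrative.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]] (b and c nonzero)."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Hypothetical layout: rows = in gene set / not in gene set,
# columns = ASE gene / non-ASE gene.
table = (8, 2, 1, 5)
print(odds_ratio(*table))              # 20.0
print(fisher_exact_two_sided(*table))  # about 0.035
```

The two-sided P value here follows the common convention of summing the probabilities of all tables at least as unlikely as the observed one; other conventions exist and may give slightly different values.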
Quantitative allelic imbalance in idiopathic ASD. ASE is not always completely monoallelic and may involve subtle shifts in allelic balance, whereby one allele is quantitatively favored over the other8,15. To investigate this full spectrum of ASE in control and idiopathic ASD, we compared the distributions of the minor allele expression fraction for SNPs in which these calls are high quality (Methods; Fig. 4a). As expected, the majority of heterozygous alleles show no evidence of ASE (herein called ‘balanced’) with the fraction near 50%. A much smaller number of alleles show complete MAE and accordingly their fractions are either 0% or 100%. Finally, we define ‘imbalanced’ alleles as those that lie between the balanced and MAE distribution ranges. Controls had a higher density of balanced alleles than ASD cases, while idiopathic ASD cases showed more imbalanced alleles (Fig. 4a and Supplementary Fig. 4a). However, controls showed significantly more major allele MAE for genes harboring low-frequency minor alleles (minor allele frequency (MAF) < 0.05) compared with idiopathic ASD (Supplementary Fig. 4b). These patterns were recapitulated in the pseudoautosomal region (PAR) of chromosome X (Supplementary Fig. 4c).

We next investigated whether the differential ASE pattern was related to the MAF genome wide (Fig. 4b and Supplementary Fig. 4b). The disease-specific patterns (balanced alleles: control > idiopathic ASD; imbalanced alleles: control < idiopathic ASD) were largely stable across the range of MAFs. However, we observed that complete MAE was limited to rare alleles. Furthermore, preferential major allele expression becomes stronger as the MAF decreases (Fig. 4c). Major alleles exhibiting MAE are enriched in genes that are more tolerant of loss of function12 (LoF; Methods) and missense mutations compared with those showing balanced expression (chi-squared test P = 0.0134; Fig. 4d). This suggests that a bias toward major allele expression may provide a buffer against the consequences of potentially deleterious (rarer) mutations. Remarkably, this preferential pattern of major allele expression shows a significant interaction with idiopathic ASD (Fig. 4e). On average, minor alleles are preferentially more highly expressed in idiopathic ASD than in controls (minor/major allele ratio: 0.79 in ASD and 0.75 in control; chi-squared test P = 7.33 × 10−11). Together, these results suggest that rare and potentially pathogenic alleles are more likely to be unmasked in idiopathic ASD brain by the patterns of ASE.

MAE occurs primarily from the major allele. Overall, we observed a strong bias toward expression of the major allele for sites in genes harboring rare minor alleles and MAE alleles. However, unlike global ASE patterns (Fig. 3c), MAE patterns differed between cases and controls (Fig. 5a). Although MAE SNPs were observed across the genome (Supplementary Fig. 5), there was an approximately twofold enrichment on chromosome 15 harboring a number of known imprinted loci (Supplementary Table 5). Reasoning that this could provide a convergent mechanism underlying the co-occurrence of dup15q syndrome and idiopathic ASD, we next investigated MAE in eight cases of dup15q. We observed a substantial enrichment of minor allele MAE in ASD and dup15q compared with controls (Fig. 5b; 2.5- and 2.7-fold, respectively). This genome-wide enrichment pattern persisted when SNPs were analyzed at individual genes (Supplementary Fig. 6a,b).

Overall, minor allele-predominant MAE was significantly enriched on chromosomes 14 and 15 (Fig. 5c), with ASD and dup15q showing stronger patterns compared with controls. Loci on chromosome 15 exhibited a substantially higher minor allele MAE fraction in both ASD and dup15q than controls (3.9- and 1.6-fold, respectively; chi-squared test P < 2.2 × 10−16 and P < 0.0184, respectively; Fig. 5c and Supplementary Table 5). Chromosomes 3, 6 and 7 also showed higher fractions of minor allele MAE in ASD and dup15q samples than controls, of which only chromosome 3 was significant in both (4.2- and 9.8-fold, respectively; chi-squared P = 0.0022 and P = 8.89 × 10−8, respectively). However, we focused on chromosomes 14 and 15 for further analysis since chromosome 3 had an overall lower degree of minor allele MAE.

Pathway analysis of minor allele MAE genes showed strong enrichment for known imprinted genes, as expected (Fig. 5d). Interestingly, these MAE genes were enriched for distinct pathways from typical ASE genes that favor major allele expression (Fig. 2b), suggesting that orthogonal biological processes are involved. For example, genes showing minor allele MAE showed no overlap with astrocyte markers31, PSD32 or ASD-specific downregulated genes4 (Supplementary Fig. 6c). However, they did show enrichment for genes harboring known rare mutations increasing risk for ASD (Simons Foundation Autism Research Initiative (SFARI) genes36; Methods) whereas other classes of genes showing ASE did not. Similar to genes that manifest ASE (Fig. 2c), major allele MAE genes show significant enrichment with genes associated with the rare de novo ASD risk variants (Supplementary Fig. 6d). However, MAE genes observed in control and ASD samples showed some enrichment for common variants from GWAS data of BPD and SCZ (Supplementary Fig. 6d).

Allele shift regions enriched in minor allele MAE in ASD and dup15q and snoRNAs. We have so far demonstrated that ASD and dup15q are associated with an increased rate of genes showing minor allele MAE, which are enriched on chromosomes 14 and 15 and overlap known ASD risk genes, especially those harboring deleterious mutations that act via haploinsufficiency. Interestingly, the genomic regions showing MAE allele shifts from major to minor allele in both ASD and dup15q relative to controls are located on just a few chromosomes (Table 1) and overlap regions known to harbor ASD-associated copy-number variants (CNVs)37. Among common MAE alleles, the frequencies of the allele shift in ASD and dup15q are 15.52% and 2.62%, respectively. To understand the effect of this apparent MAE allele shift in ASD, we compiled a set of high-confidence genomic regions which we called ‘allele shift rich regions’. We defined these regions as having such allele shifts in both ASD and dup15q relative to controls. This analysis identifies only two genomic regions, 14q32 (chr14:101,302,638-101,544,745) and 15q11 (chr15:25,223,730-25,582,395). Both are known imprinted loci that have many small splice junctions (Fig. 6) and show continuous high expression (Supplementary Fig. 7) as single transcript units38–40. Both regions also contain tandem repeats of multiple orphan C/D box snoRNA genes39,40, which are located within repeated introns of MEG8 (ref. 40), SNURF-SNRPN and SNHG14 transcripts. Indeed, all known orphan C/D box snoRNA genes are located within these two
Fig. 3 | ASE patterns distinguish control and idiopathic ASD brains. a, The comparison of ASE genes from BA9, BA41 and vermis within control and
idiopathic ASD groups. b, Manhattan plots for control and idiopathic ASD cortex show a broad distribution of genomic loci exhibiting ASE. Sample
numbers of control and idiopathic ASD brains are 69 and 62, respectively. c, Venn diagram showing comparison of ASE genes across groups in cortex.
d, ASE rates for individual samples from idiopathic ASD and control groups (Methods). Idiopathic ASD samples (n = 32) show lower overall rates of ASE
compared with controls (n = 31). For the violin plots, ASE rates were calculated per chromosome for each cortical sample (Methods). The Wilcoxon rank
sum test was two-sided. e, GO analyses are shown for common ASE genes between control (n = 69) and idiopathic ASD groups (n = 62). In the upper
interactive graph, the bubble color indicates the P value of the GO term, and bubble size indicates its frequency49. The P values and other results of the GO
analysis are in Supplementary Table 4. Highly similar GO terms are linked by edges, and the line width indicates the degree of similarity. In the lower figure,
the white, yellow and orange boxes represent P > 10−3, 10−5 < P ≦ 10−3 and 10−7 < P ≦ 10−5, respectively. Ctl, control.
[Fig. 3 graphic. Panel a: Venn diagrams of ASE genes in BA9, BA41 and vermis for control (Ctl) and ASD. Panel b: Manhattan plots for Ctl and ASD (x axis: chromosome, 1–22 and X; y axis: −log10(P)). Panel d: ASE rate. Panel e: GO term labels in the graphic include regulation of heart rate by cardiac conduction, behavior, adult behavior, cognition, positive regulation of nervous system development, cell part morphogenesis, regulation of membrane potential, response of light intensity, regulation of dephosphorylation, phosphorylation, synapse, membrane part, cell part, synapse part, cell junction and dendritic spine.]
[Fig. 4 graphic. Panel a: density versus minor allele expression fraction (%). Panel b: SNP fraction (%) versus MAF (0.0–0.5) for balanced, imbalanced and MAE SNPs. Panel c: SNP fraction (%) versus MAF for preferential major allele expression and preferential minor allele expression. Panel d: SNP number for possible LoF, possible amino-acid-changing and possible synonymous mutations, with separate plots for major allele MAE and minor allele MAE. Panel e: table with columns Major, Minor and Minor/Major; Ctl row: 96,071, 71,882 and 0.74822.]
Fig. 4 | Quantitative allelic imbalance in idiopathic ASD and control cortex. a, The distribution of minor allele expression fraction for idiopathic ASD
(red) and control (blue) groups for autosomal SNPs. The majority of loci show balanced expression. The density plot near 0% and 100% is zoomed in to
show patterns of MAE. b, SNP fractions of balanced, imbalanced or MAE expression per MAF. c, SNP fractions showing preferential major or minor allele
expression per MAF. In b and c, open circles are control, and closed circles are idiopathic ASD. d, Numbers of SNPs that could cause LoF mutations,
amino acid changes or synonymous mutations in control cortex. Major-allele MAE can act to buffer deleterious LoF and missense
mutations. Y axes show SNP numbers. e, SNP counts showing preferential expression of the major and minor alleles in control and idiopathic ASD. Their
ratio is expressed as Minor/Major.
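
To make the balanced, imbalanced and MAE categories concrete, the following Python sketch labels a heterozygous SNP from its allele read counts and recomputes the Minor/Major ratio for the control row of Fig. 4e. The cut-offs `balanced_band` and `mae_band` are hypothetical placeholders, since the actual thresholds are described in the Methods and not in this section; the function name is likewise illustrative.

```python
def classify_snp(minor_reads, major_reads, balanced_band=10.0, mae_band=5.0):
    """Label a heterozygous SNP by its minor allele expression fraction (%).

    balanced_band and mae_band are hypothetical cut-offs chosen for
    illustration; they are not the thresholds used in the study.
    """
    total = minor_reads + major_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    fraction = 100.0 * minor_reads / total
    if fraction <= mae_band:
        return fraction, "major allele MAE"
    if fraction >= 100.0 - mae_band:
        return fraction, "minor allele MAE"
    if abs(fraction - 50.0) <= balanced_band:
        return fraction, "balanced"
    return fraction, "imbalanced"


print(classify_snp(48, 52))  # (48.0, 'balanced')
print(classify_snp(25, 75))  # (25.0, 'imbalanced')

# Minor/Major ratio from the control row of Fig. 4e.
ctl_major, ctl_minor = 96_071, 71_882
ctl_ratio = ctl_minor / ctl_major
print(round(ctl_ratio, 5))  # 0.74822
```

The recomputed control ratio matches the value printed in Fig. 4e, which suggests that the ratio is a simple quotient of the two SNP counts.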
To begin to understand the impact of the allele shifts in these regions, we next assessed whether the relevant genes were differentially expressed in ASD or dup15q cases versus controls. Analysis of RNA-seq data showed regional downregulation of 15q11 in ASD (P = 0.0016; Fig. 7a). SNORD116-24 downregulation was observed in ASD and dup15q from small noncoding RNA-seq (sncRNA-seq) data42 (P = 0.0347 and 0.0019, respectively; Fig. 7b).

As snoRNA genes regulate the splicing of downstream targets, allele shifts would change potential binding and may alter the splicing patterns of snoRNA targets. To test this hypothesis, we used a known splicing target list41 of the snoRNAs to investigate whether these genes show splicing changes based on RNA-seq data from postmortem ASD brain4. The accuracy of these splicing target predictions has been confirmed in previously published studies43,44, providing
[Fig. 5 graphic. Panel a: Venn diagram of MAE SNPs in Ctl, ASD and dup. Panel c: x axis, chromosome (1–22 and X). Panel d: enrichment heatmap with ORs and P values; color scale: −log10(P value).]
Fig. 5 | MAE SNPs across control, ASD and dup15q groups. a, Venn diagram showing overlap of MAE SNPs across groups. Less than 15% of MAE SNPs
overlapped between cases and controls. b, The numbers of major allele MAE SNPs and minor allele MAE SNPs across groups. Both chi-squared tests in
ASD and dup15q compared with control return P < 2.2 × 10−16. c, Minor allele MAE fraction (minor/(major + minor)) across chromosomes for control,
ASD and dup15q. d, Gene set enrichment for major and minor allele MAE genes in control (n = 69), ASD (n = 62) and dup15q (n = 15) groups (Methods).
Enrichment was assessed for known ASD risk genes (SFARI36; Methods); known imprinted genes (Methods); and PSD32, FMRP (ref. 28), HuR (ref. 29) and
RBFOX1 target genes30. Enrichment was assessed separately for genes showing major allele and minor allele MAE. Dup, dup15q.
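
The minor allele MAE fraction in Fig. 5c is defined as minor/(major + minor), and group differences are assessed with chi-squared tests. The Python sketch below shows one way to compute the fraction, the fold change relative to control and a Pearson chi-squared test on a 2 × 2 table. The SNP counts are hypothetical, and the sketch assumes no continuity correction, which the text does not specify.

```python
from math import erfc, sqrt


def minor_mae_fraction(major, minor):
    """Minor allele MAE fraction, minor / (major + minor), as in Fig. 5c."""
    return minor / (major + minor)


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 d.f., no continuity correction)
    for the 2x2 table [[a, b], [c, d]]; returns (statistic, P value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-squared with 1 d.f. is erfc(sqrt(x / 2)).
    return stat, erfc(sqrt(stat / 2))


# Hypothetical (major, minor) allele MAE SNP counts per group.
ctl = (900, 100)
asd = (750, 250)
fold = minor_mae_fraction(*asd) / minor_mae_fraction(*ctl)
stat, p = chi2_2x2(*ctl, *asd)
print(f"fold = {fold:.1f}, chi2 = {stat:.1f}, P = {p:.2g}")
```

A fold change above 1 indicates a higher share of minor allele MAE in cases than in controls; the chi-squared test then asks whether that difference in proportions is larger than expected by chance.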
Table 1 | Regions that shift to minor allele MAE in ASD and dup15q and previously reported relevant ASD-related mutations

Coordinate                      Region    Size (bp)   SNP no.   Previous report in ASD
ASD
chr3:66,429,475                 3p14.1    1           1         3p14.1 de novo microdeletion
chr14:101,325,640-101,390,093   14q32.2   64,454      2         14q32.2 duplication
chr15:23,889,739-25,561,958     15q11.2   1,672,220   83        15q11.2-q13.1 duplication
dup15q
chr2:207,173,390-207,175,070    2q33.3    1,681       2         2q33.3-q34 interstitial deletion, 2q32.3-q37.3 duplication
chr11:2,690,293                 11p15.5   1           1         11p15.5-p15.4 duplication
chr14:101,321,515-101,349,017   14q32.2   27,503      2         14q32.2 duplication
chr15:25,372,247                15q11.2   1           1         15q11.2-q13.1 duplication

If there were more than two allele shifts, we defined the boundaries using their maximum and minimum coordinates.
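
The region sizes in Table 1 are consistent with inclusive coordinates, that is, size = end − start + 1. The short Python sketch below parses a coordinate string in the table's format and recomputes the size; the helper names `parse_coord`, `region_size` and `region_from_shifts` are illustrative.

```python
def parse_coord(text):
    """Parse 'chr14:101,325,640-101,390,093' into (chrom, start, end).

    A single position such as 'chr3:66,429,475' gives start == end.
    """
    chrom, span = text.split(":")
    parts = [int(p.replace(",", "")) for p in span.split("-")]
    return chrom, min(parts), max(parts)


def region_size(text):
    """Size in bp, counting both end positions (end - start + 1)."""
    _, start, end = parse_coord(text)
    return end - start + 1


def region_from_shifts(chrom, positions):
    """Bound a region by the minimum and maximum allele-shift positions."""
    return f"{chrom}:{min(positions):,}-{max(positions):,}"


print(region_size("chr14:101,325,640-101,390,093"))  # 64454
print(region_size("chr3:66,429,475"))                # 1
print(region_from_shifts("chr2", [207_175_070, 207_173_390]))
```

Applying `region_size` to each coordinate in Table 1 returns the listed sizes, including 1 bp for the single-SNP entries.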
experimentally validated snoRNA targets including some studies in neuronal cells43. From receiver operating characteristic analysis44, the splicing target prediction showed a 90–100% true positive rate within a 0.02–0.30% false positive rate range. Indeed, we find that the snoRNA target genes show more splicing changes in ASD and dup15q compared with controls (Fisher’s exact tests: OR = 1.40 and
[Fig. 6 graphic. Panel a: chromosome 14. Panel b: chromosome 15. Each panel shows ASD and dup allele-shift tracks above the splice junction tracks and gene model.]
Fig. 6 | Allele-shift-rich regions in ASD and dup15q. a,b, The allele-shift-rich regions on chromosome 14 (chr14:101,302,638-101,544,745; 242,108 bp) (a)
and on chromosome 15 (chr15:25,223,730-25,582,395; 358,666 bp) (b). Chromosomal locations of allele shifting to minor allele MAE are shown at the
top. The ASD and dup15q tracks show the allele shifts to minor allele MAE. We visualized all splice junctions identified from RNA-seq mapping
data and their chromosomal directions, which are shown by ‘+’ and ‘−’. The RefSeq Gene model is shown at the bottom, indicating known imprinted genes,
as well as maternally (outlined in red rectangles) and paternally (outlined in blue rectangles) expressed genes.
1.72, respectively; Fig. 7c). The snoRNA targeting sites were located preferentially proximal to alternatively spliced junctions of their target genes in ASD (Supplementary Fig. 8a), and their splicing changes show significant correlations with the expression changes of their specific snoRNAs (Supplementary Fig. 8b,c). When one snoRNA has two different targets, their splicing changes also show strong correlation (Supplementary Fig. 8d).

Furthermore, among snoRNA targets, genes showing altered splicing changes in ASD and dup15q are significantly enriched for known ASD risk genes (SFARI) and genes encoding PSD proteins compared with the other target genes without splicing changes (Fig. 7d; Methods). These results are consistent with the bioinformatic predictions above and potentially implicate disruption of snoRNA-mediated splicing of synaptic ASD risk genes in the pathophysiology of ASD. Moreover, the shared patterns of MAE allele shifting in ASD and dup15q provide a potential convergent biological mechanism linking idiopathic ASD and dup15q syndrome. Similar to ASE genes (Fig. 2c), the snoRNA targets that change splicing in ASD and dup15q showed significant enrichment with two different datasets of de novo ASD risk variants (in ASD, OR = 2.30 (P = 0.04) for the first and OR = 7.42 (P = 0.005) for the second; in dup15q, OR = 2.14 (P = 0.05) for the first and OR = 5.97 (P = 0.02) for the second; Fig. 7e).

Discussion
This study provides a large-scale, genome-wide investigation of ASE patterns in human brain samples from people with ASD and
Fig. 7 | Characterization of allele-shift-rich regions in ASD and dup15q. a, Regional expression changes of the shared allele-shift-rich regions across ASD
and dup15q. The x axes are log2(reads per kilobase of gene model per million mapped reads) (log2(RPGM)), and two-tailed unpaired t-test P = 0.6397,
P = 0.0016, P = 0.1538 and P = 0.1525 for 14q32 in ASD, 15q11 in ASD, 14q32 in dup15q and 15q11 in dup15q, respectively. Significantly downregulated
regions are marked with an asterisk. 14q32 in ASD also has a lower mean than control, similar to the others. The minimum, first quartile, median, third
quartile and maximum expression values of the boxplots are shown for 14q32 (control: 1.671, 2.048, 2.270, 2.442 and 2.806; ASD: 0.7351, 2.0643,
2.3355, 2.5048 and 2.9375; dup15q: 0.9286, 1.8464, 2.1859, 2.2984 and 3.0108, respectively) and 15q11 loci (control: 2.039, 2.890, 3.130, 3.302 and
3.524; ASD: 0.3255, 2.5385, 2.8784, 3.2264 and 3.7170; dup15q: 1.114, 2.537, 3.105, 3.208 and 3.430, respectively). b, snoRNA gene expression changes
in ASD and dup15q. Yellow and blue backgrounds indicate 14q32 and 15q11, respectively. Among snoRNA genes, 51 genes (RPKM ≧ 1) are selected as
expressed. Based on a linear mixed model-based differential expression gene analysis, significantly downregulated SNORD116-24 genes (ASD: P = 0.0347;
dup15q: P = 0.0019) are marked with asterisks (*P ≦ 0.05; **P ≦ 0.001). c, The numbers of snoRNA target genes and genes with splicing changes in ASD
and dup15q. Two-sided Fisher’s exact test P values are shown at the bottom of tables. d, Gene set enrichment analysis for snoRNA target genes. Among
snoRNA target genes, we compared splicing change and the other genes in ASD and dup15q brain samples. The labels ‘mono major’ and ‘mono minor’ are
major and minor allele MAE genes, respectively. The labels ‘expression up’ and ‘expression down’ represent significantly up- and downregulated genes in
idiopathic ASD4. e, Enrichment of snoRNA target genes with candidate psychiatric disorder risk genes, as defined in Fig. 2c. Plots show ORs and P values
if significant. For the GWAS datasets defined in Fig. 2c, plots show FDR-corrected P values if significant (Methods). For a, b, d and e, RNA-seq sample
numbers of control, ASD and dup15q are 69, 62 and 15, respectively. Dup, participants with dup15q.
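
Expression in Fig. 7a is plotted as log2 of reads per kilobase of gene model per million mapped reads. The Python sketch below assumes the standard definition of this normalization and applies no pseudocount before the log transform; whether the study added one is not stated in this section. The function name and input values are hypothetical.

```python
from math import log2


def reads_per_kb_per_million(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    length_kb = length_bp / 1_000
    mapped_millions = total_mapped_reads / 1_000_000
    return read_count / (length_kb * mapped_millions)


# Hypothetical region: 500 reads on a 2,000 bp gene model
# in a library with 10 million mapped reads.
value = reads_per_kb_per_million(500, 2_000, 10_000_000)
print(value)        # 25.0
print(log2(value))  # about 4.64
```

Dividing by both length and library size makes expression comparable between regions of different sizes, such as 14q32 and 15q11, and between samples sequenced to different depths.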
splicing changes in ASD brain, providing evidence of an association between snoRNA and ASD.

Among all forms of small noncoding RNAs, miRNAs have received the most focus as a target for ASD and other psychiatric disease studies38,42. Although the orphan C/D box snoRNA is a small family of snoRNAs existing only at chromosomes 14q32 and 15q11, this family controls the splicing of a unique set of about 400 target genes that are distributed across the genome. Although we could not find enrichment for common ASD risk in these targets, it may be due to simple lack of power of current GWAS studies in
[Fig. 7 graphic. Panel a: log2(RPGM) for 14q32 and 15q11 in Ctl versus ASD and in Ctl versus dup; 15q11 in ASD is marked with an asterisk. Panel b: log2(fold change) of snoRNA genes in ASD and dup. Panel d: enrichment heatmap for splicing-changing and non-splicing-changing targets in ASD and dup; color scale: −log10(P value). Panel e: enrichment heatmap; color scale: −log10(P value).]
Reporting Summary
Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency
in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist.
Statistics
For all statistical analyses, confirm that the following items are present in the figure legend, table legend, main text, or Methods section.
n/a Confirmed
The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
A statement on whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
The statistical test(s) used AND whether they are one- or two-sided
Only common tests should be described solely by name; describe more complex techniques in the Methods section.
For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted
Give P values as exact values whenever suitable.
For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
Our web collection on statistics for biologists contains articles on many of the points above.
Data analysis R and Python were used to analyze the data. For data processing, we also used GATK (version 2.7-2), Samtools (version 0.1.19), TopHat2
(version 2.0.9), Bowtie2 (version 2.1.0), Mach (version 1.0), Minimac (version beta-2013.7.17), Picard (version 1.100), PLINK (version
0.08), MAGMA (version 1.07b), FASTX-Toolkit (version 0.0.13.2), and htseq-count (version 0.6.1p1).
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers.
We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.
Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability
October 2018
The data that we analyzed were previously published (Parikshak et al., Nature, 2016; Wu et al., Nat. Neurosci., 2016). We obtained the raw
data directly from those groups.
nature research | reporting summary
Field-specific reporting
Please select the one below that is the best fit for your research. If you are not sure, read the appropriate sections before making your selection.
Life sciences Behavioural & social sciences Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/documents/nr-reporting-summary-flat.pdf
Data exclusions To ensure that groups were balanced with respect to age and brain bank, we removed samples based on age (≤7) and brain bank (The UK
Brain Bank for Autism and Related Developmental Research and the MRC London Neurodegenerative Diseases Brain Banks).
Replication The genes with allele-specific expression reported in this manuscript showed significant overlap with those from a previous study (Savova
et al., Nucleic Acids Res., 2016). The findings of allele-specific expression were confirmed using multiple analytical approaches on the same
dataset, showing the robustness of our findings.
Randomization To avoid reference bias, we prepared a masked reference based on allele randomization.
Blinding Data analysis was not performed blind to the metadata information of the samples.
Ethics oversight No ethics oversight was required because we analyzed open-source datasets.
Note that full information on the approval of the study protocol must also be provided in the manuscript.
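The masked reference mentioned under Randomization can be illustrated with a minimal sketch. It assumes one possible reading of "allele randomization": at each known SNP site the reference base is replaced by a randomly chosen allele (reference or alternative), so that neither allele is systematically favored during read alignment. The function name, the input layout and the toy sequence are hypothetical, and the actual procedure is the one described in the Methods.

```python
import random


def randomize_alleles(sequence, snps, seed=0):
    """Return a copy of `sequence` with a randomly chosen allele at each SNP site.

    `snps` maps a 0-based position to a (ref, alt) pair of single bases.
    A fixed seed keeps the masked reference reproducible.
    """
    rng = random.Random(seed)
    bases = list(sequence)
    for pos, (ref, alt) in sorted(snps.items()):
        # Guard against coordinate or strand mix-ups in the SNP list.
        if bases[pos].upper() != ref.upper():
            raise ValueError(f"reference mismatch at position {pos}")
        bases[pos] = rng.choice((ref, alt))
    return "".join(bases)


# Hypothetical toy input, not data from the study.
toy_reference = "ACGTACGTAC"
toy_snps = {1: ("C", "T"), 6: ("G", "A")}
masked = randomize_alleles(toy_reference, toy_snps, seed=42)
```

Only the SNP positions can differ from the input; all other bases and the sequence length are preserved, so read coordinates remain comparable between the original and the masked reference.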