
Technical Report

https://doi.org/10.1038/s41556-021-00696-9

Global profiling of RNA-binding protein target sites by LACE-seq
Ruibao Su1,2,11, Li-Hua Fan3,4,11, Changchang Cao1,11, Lei Wang1,5, Zongchang Du1,2, Zhaokui Cai1,2,
Ying-Chun Ouyang4, Yue Wang2,4, Qian Zhou2,4, Ligang Wu6, Nan Zhang7, Xiaoxiao Zhu1,2,
Wen-Long Lei4, Hailian Zhao1,2, Yong Tian1,2, Shunmin He1, Catherine C. L. Wong7,8,9,10 ✉,
Qing-Yuan Sun3,4 ✉ and Yuanchao Xue1,2 ✉

RNA-binding proteins (RBPs) have essential functions during germline and early embryo development. However, current
methods are unable to identify the in vivo targets of an RBP in these low-abundance cells. Here, by coupling RBP-mediated
reverse transcription termination with linear amplification of complementary DNA ends and sequencing, we present the
LACE-seq method for identifying RBP-regulated RNA networks at or near the single-oocyte level. We determined the binding
sites and regulatory mechanisms for several RBPs, including Argonaute 2 (Ago2), Mili, Ddx4 and Ptbp1, in mature mouse
oocytes. Unexpectedly, transcriptomics and proteomics analysis of Ago2−/− oocytes revealed that Ago2 interacts with endogenous
small interfering RNAs (endo-siRNAs) to repress mRNA translation globally. Furthermore, the Ago2 and endo-siRNA
complexes fine-tune the transcriptome by slicing long terminal repeat retrotransposon-derived chimeric transcripts. The precise
mapping of RBP-binding sites in low-input cells opens the door to studying the roles of RBPs in embryonic development
and reproductive diseases.

RNA-binding proteins (RBPs) interact with RNA via specific motifs or structural elements to control their processing, modification, localization and degradation1. Besides critical roles in somatic cells, many RBPs seem essential for germline and early embryo development2–4. As transcription is silent during late germline development and early embryogenesis, post-transcriptional regulation, which is primarily controlled by RBPs, becomes the dominant way to determine cellular RNA and protein levels3. The mutation or aberrant expression of some RBPs is intimately linked to germline deficiency and reproductive pathologies5. For example, Ddx4, Dazl and Argonaute 2 (Ago2) all contribute to the meiotic maturation of oocytes6–8, but their functional mechanisms are poorly understood.

Oocytes and early embryos are not expandable, which makes it difficult to identify protein–RNA interaction sites within these rare cells. RNA immunoprecipitation with sequencing (RIP-seq) and crosslinking immunoprecipitation coupled with high-throughput sequencing (CLIP-seq or HITS-CLIP) are two premier approaches to identify RBP targets from millions of cells9–11. Unlike RIP-seq, CLIP-seq can identify the precise binding positions of defined RBPs. However, dozens of complicated steps have hindered its general applicability12. Many variants have been developed to overcome these limitations, including iCLIP, eCLIP and irCLIP13–15. Unfortunately, these state-of-the-art methods still need at least 20,000 cells. To reduce cell input, tRIP-seq and TRIBE (the targets of RBPs identified by editing) were developed to identify RBP targets from thousands or hundreds of cells, respectively16,17. Nonetheless, TRIBE suffers from low editing efficiency and frequent A-to-I editing at non-target sites. Therefore, mapping protein–RNA interactions in fewer cells still represents a substantial challenge.

Argonaute family RBPs are vital players in small-RNA-guided gene silencing, including microRNAs (miRNAs), endogenous small interfering RNAs (endo-siRNAs) and PIWI-interacting RNAs (piRNAs)18. In mouse oocytes, miRNA activity is globally suppressed19,20, while transposon and pseudogene-derived endo-siRNAs are loaded onto Ago2 to cleave fully complementary transposon and protein-coding transcripts21,22. Moreover, whether Ago2–endo-siRNA complexes have non-cleavage functions is unknown. Although the loss of Ago2 or endo-siRNA impairs the meiotic maturation of mouse oocytes7,8, their complete target repertoire and regulatory mechanism remain unclear. To fill these knowledge and technical gaps, we developed the linear amplification of complementary DNA ends and sequencing (LACE-seq) method for unbiasedly mapping the binding sites of Ago2 and multiple RBPs at single-nucleotide resolution in low-input cells or single oocytes.

Results
Overview of the LACE-seq method. LACE-seq relies on RBP-mediated reverse transcription termination on the immunopurified protein–RNA complex and the subsequent linear amplification of terminating cDNA ends (Fig. 1a). Cells are first irradiated with ultraviolet C (UV-C) light to crosslink RBPs and their interacting RNAs. We next use a batch of validated antibodies to specifically pull down protein–RNA complexes from the cell lysate.

1Key Laboratory of RNA Biology, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China. 2University of Chinese Academy of Sciences, Beijing,
China. 3Fertility Preservation Lab, Reproductive Medicine Center, Guangdong Second Provincial General Hospital, Guangzhou, China. 4State Key Laboratory
of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China. 5College of Life Sciences, Xinyang Normal
University, Xinyang, China. 6State Key Laboratory of Molecular Biology, Shanghai Key Laboratory of Molecular Andrology, CAS Center for Excellence
in Molecular Cell Science, Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, China. 7Center for Precision Medicine
Multi-Omics Research, Peking University Health Science Center, Beijing, China. 8School of Basic Medical Sciences, Peking University Health Science
Center, Beijing, China. 9Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China. 10Peking University First Hospital, Beijing, China.
11These authors contributed equally: Ruibao Su, Li-Hua Fan, Changchang Cao. ✉e-mail: catclw321@126.com; sunqy@gd2h.org.cn; ycxue@ibp.ac.cn

664 Nature Cell BIology | VOL 23 | June 2021 | 664–675 | www.nature.com/naturecellbiology


[Figure 1, panels a–e (image not reproduced). Recoverable labels: a, workflow steps: RNA immunoprecipitation on Protein A/G beads; 3′ linker ligation, RT; cDNA release, poly(A) tailing, streptavidin beads; second-strand cDNA, pre-PCR; IVT, RT; PCR barcoding, deep sequencing. b, normalized cDNA density versus position relative to LACE-seq peak (nt) for CLIP-seq, iCLIP and LACE-seq from 10,000, 1,000, 100 and 10 cells and single cells. c, CLIP-seq, iCLIP and LACE-seq tracks at EIF4G1/SNORD66. d,e, normalized complexity across −100 nt to 100 nt around the PTBP1 binding motif, with a randomized control.]

Fig. 1 | Overview of the LACE-seq method. a, Schematic of the LACE-seq method. A circled B represents biotin modification. N, random nucleotide; V
can be A, G or C. b, LACE-seq peaks showed a 5-nt shift relative to bulk CLIP-seq and iCLIP. In the schematic, the black arrow denotes a CLIP read, while
the purple arrow stands for a LACE-seq read. The yellow oval represents PTBP1 protein. c, PTBP1 LACE-seq reads were terminated before the CUUCCU
motif in the EIF4G1/SNORD66 transcript. d, WebLogo showing the base frequency at and around the PTBP1–RNA crosslinking sites. e, PTBP1 LACE-seq
detected cDNA ends were accumulated at CU-rich motifs. Dashed lines denote randomized controls. Data in b–e represent results from two independent
experiments.
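
As a rough illustration of the kind of tally summarized in Fig. 1e, the following Python sketch counts cDNA-end positions by their offset from each occurrence of a motif. It is not the authors' pipeline: the function names, the toy RNA sequence and the end positions are hypothetical, and in practice the end coordinates would come from aligned reads.

```python
from collections import Counter


def motif_positions(seq, motif):
    """Return the 0-based start of every (possibly overlapping) motif occurrence."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def end_offsets(seq, cdna_ends, motif, window=10):
    """Count cDNA ends by offset (end - motif start) within +/- window nt."""
    counts = Counter()
    for m in motif_positions(seq, motif):
        for end in cdna_ends:
            offset = end - m
            if -window <= offset <= window:
                counts[offset] += 1
    return counts


# Hypothetical toy transcript with two CUUCCU motifs (starts at 4 and 18).
seq = "AAAA" + "CUUCCU" + "AAAAAAAA" + "CUUCCU" + "AAAA"
# Hypothetical cDNA-end coordinates on that transcript.
ends = [10, 10, 11, 24, 25, 2]

profile = end_offsets(seq, ends, "CUUCCU", window=8)
for offset in sorted(profile):
    print(offset, profile[offset])
```

Running the same function on position-shuffled ends would give a randomized control comparable in spirit to the dashed lines in the figure.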

To cut RNAs into individual RBP-associated short fragments, we treat the RBP–RNA complexes on the beads with micrococcal nuclease (MNase). The 3′ ends of fragmented RNAs are then dephosphorylated and ligated with a 5′ pre-adenylated linker containing four randomized nucleotides. Next, a biotinylated primer containing the T7 promoter is used for reverse transcription on beads (Fig. 1a). Since reverse transcriptase would efficiently stop before the intact RBP–RNA crosslinking sites, sequencing these truncated cDNA ends enables us to identify 3′ sequences downstream of the bound element of a defined RBP.

As immunopurified RNAs from single or dozens of cells are very limited, we adopted a T7 RNA polymerase-based in vitro transcription (IVT) approach to amplify trace amounts of truncated cDNAs linearly. The cDNAs are tailed with poly(A) and subsequently enriched using streptavidin beads via the biotin moiety at the 5′ end of the T7 primer (Fig. 1a). Next, we synthesize second-strand cDNA on beads with an adaptor containing oligo-(dT). After pre-amplifying the double-stranded DNA (dsDNA) fragments by PCR for 14–18 cycles, the products are purified and used for IVT. Subsequently, the transcribed RNAs are converted into libraries for single-end deep sequencing (~180 bp; Fig. 1a and Extended Data Fig. 1a).

Validation of the LACE-seq method. We first created PTBP1 LACE-seq libraries using gradually reduced HeLa cells (Extended Data Fig. 1b). The correlation of two biological replicates progressively decreased, which reflects the heterogeneities of individual cells. PTBP1 binding sites identified by LACE-seq mostly correlated with bulk CLIP-seq datasets23 (~2 × 10⁷ cells, R ≥ 0.72) (Extended Data Fig. 1c). However, the cDNA ends showed an approximately 5-nucleotide shift to the downstream region of bulk CLIP-seq and iCLIP peaks24 (Fig. 1b), which indicates that the exact PTBP1 binding positions were located upstream of the termination sites. Indeed, classical CU-rich motifs were concentrated around the termination sites, such as the CUUCCU motif in SNORD66 (Fig. 1c). The termination sites were also stacked at the classical CU-rich motif, and top peaks showed more cDNA end covering reads and a higher CU-rich motif density (Fig. 1d,e and Extended Data Fig. 1d,e). Therefore, high-priority binding sites could be sorted on the basis of peak strength and the ratio of cDNA end covering reads. Although target gene numbers identified by LACE-seq decreased along with reduced cell numbers, the precision was still similar (Extended Data Fig. 1f–h and Methods). Furthermore, compared with non-targets, PTBP1-associated cassette exons showed significant changes following PTBP1 knockdown (Extended Data Fig. 1i), and LACE-seq could sensitively capture lowly expressed targets from few cells (Extended Data Fig. 1j).

Comparison to CLIP-based methods. Next, we validated LACE-seq by comparing it to several CLIP-based methodologies. We found the following results: (1) LACE-seq tended to produce fewer PCR duplicates than CLIP-seq, iCLIP, sCLIP, eCLIP and tRIP-seq14–16,23,25 (Extended Data Fig. 2a,b); (2) LACE-seq produced more uniquely mapped reads than other methods, and 89.7% of the LACE-seq-specific peaks (compared to irCLIP) also contained CU-rich motifs (Extended Data Fig. 2c,d); (3) LACE-seq was highly specific when comparing the PTBP1 binding signals before and after knockdown (4.9-fold reduction), and it showed a higher


[Figure 2, panels a–f (image not reproduced). Recoverable labels: a, UCSC genes and Ddx4 LACE-seq tracks from 50, 30, 10 and 1 oocytes with an IgG control on Chr17 (qB1), with a magnified view at Hspa1l and Lsm2. b, read distribution across 3′ UTR, 5′ UTR, exon, intron, intergenic and repeat regions, with repeats split into LTR, SINE, LINE, simple repeat, low complexity, DNA repeat and others. c, number of hexamers (×1,000) versus Z-score, with top values 176.74 and 148.58 and an inset consensus (bits, positions 1–8). d, Ddx4 and IgG LACE-seq signals at D10Wsu102e and Pou5f1 with SINE, LINE, LTR and other repeat annotations. e, gel shift of Pou5f1 WT and MT RNA (37 nt) with increasing Ddx4 (nM), showing Ddx4-bound and free RNA. f, relative mobility versus Ddx4 (0–1,000 nM) for Pou5f1 WT and MT RNA; Kd values shown: 3.75 × 10⁻³ nM and 3.48 × 10⁻² nM.]

Fig. 2 | Mapping of Ddx4 binding sites in single oocytes. a, UCSC genome browser view of Ddx4 LACE-seq reads on chromosome 17 (Chr17). A magnified
view of the Ddx4-binding sites at the Hspa1l transcript is shown at the bottom. Six independent Ddx4 LACE-seq experiments are shown. b, Genomic
distribution of the Ddx4 LACE-seq reads. c, Histogram showing overrepresented Ddx4-binding motifs identified by LACE-seq. The Z-score of the top-two
hexamers is indicated. The inset shows the Ddx4-binding consensus calculated from the top-20 enriched hexamers. d, Ddx4 LACE-seq signals at Pou5f1
and D10Wsu102e. Repeat elements are shown as grey/black boxes and the corresponding repeat classes are indicated on the left. The yellow box marked
region was used for gel-shift validation in e. Data in b–d represent results from six independent experiments. e, Gel-shift assay showing that Ddx4 binds
to Pou5f1 via a CG-rich motif. MT, mutant. The experiment was independently repeated twice with similar results. f, Relative mobility was calculated as the
ratio of Ddx4-bound RNA to free RNA. Data are plotted using the mean value from two biological replicates.
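
One common way to obtain hexamer Z-scores like those plotted in Fig. 2c is to compare observed hexamer counts in peak sequences with counts in shuffled versions of the same sequences. The source does not state how its Z-scores were computed, so the Python sketch below is an assumption-laden illustration only: the shuffled-sequence background, the function names, the number of shuffles and the toy peaks with a planted hexamer are all hypothetical.

```python
import random
import statistics
from collections import Counter


def count_kmers(seqs, k=6):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def shuffle_seq(seq, rng):
    """Return a mononucleotide-shuffled copy of seq (composition preserved)."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def kmer_zscores(peak_seqs, k=6, n_shuffles=50, seed=0):
    """Z = (observed - mean of shuffled counts) / s.d. of shuffled counts."""
    rng = random.Random(seed)
    observed = count_kmers(peak_seqs, k)
    background = [
        count_kmers([shuffle_seq(s, rng) for s in peak_seqs], k)
        for _ in range(n_shuffles)
    ]
    zscores = {}
    for kmer, obs in observed.items():
        values = [bg[kmer] for bg in background]
        sd = statistics.pstdev(values)
        if sd > 0:  # k-mers with zero background variance are skipped
            zscores[kmer] = (obs - statistics.mean(values)) / sd
    return zscores


# Hypothetical toy peaks: random 40-nt sequences with CGGCGC planted at position 10.
rng = random.Random(1)
peaks = []
for _ in range(30):
    s = "".join(rng.choice("ACGU") for _ in range(40))
    peaks.append(s[:10] + "CGGCGC" + s[16:])

scores = kmer_zscores(peaks)
top = sorted(scores, key=scores.get, reverse=True)[:5]
print(top)
```

A mononucleotide shuffle is the simplest background; a dinucleotide-preserving shuffle or a matched set of unbound regions would be a stricter control.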

signal-to-noise ratio than eCLIP (88.8% versus 82.3%; Extended Data Fig. 2e–h); (4) LACE-seq outperformed irCLIP at multiple levels, including sensitivity, accuracy and precision (Extended Data Fig. 2i–l); (5) the experimental time for constructing a LACE-seq library was 2.5 days, which is significantly shorter than other methods (Supplementary Table 1); and (6) LACE-seq could map RBP binding sites in dozens of cells at single-nucleotide resolution, while all the other methods require at least thousands of cells (Supplementary Table 1). Together, these comparisons demonstrate the reliability of the LACE-seq method.

Profiling Ddx4 binding sites in single oocytes. Next, we selected Ddx4, an abundant and germline-specific RNA helicase6, for LACE-seq analysis using 1–50 mouse metaphase II (MII) oocytes. After optimizing the reaction buffers and the MNase concentration (Extended Data Fig. 3a,b; 1:600,000 dilution), we developed a protocol suitable for a few cells. Compared to negative control IgG, the optimized protocol facilitated the generation of thousands of specific peaks for Ddx4 even in a single oocyte (Fig. 2a, Extended Data Fig. 3b,c and Supplementary Table 2). Notably, we removed the nonspecific background by excluding the overlapping peaks shown in the IgG controls.

Several lines of evidence indicated that the identified targets of Ddx4 were biologically relevant. First, 23% of the Ddx4 LACE-seq reads were mapped to repeat elements, including short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and long terminal repeats (LTRs) (Fig. 2b), which coincides with its direct association with PIWI and piRNAs in transposon silencing26. Second, Ddx4 can regulate translation in Drosophila germlines27, and our meta-analysis revealed its predominant binding around start codons, stop codons and 3′ untranslated regions (UTRs) (Extended Data Fig. 3d). Third, among the 14,661 Ddx4 peaks (average length of 51.4 nt), 89.8% had at least one GC-rich hexamer, and 51% of the peaks contained at least one top-20 motif (Fig. 2c). The motifs were validated by an in vitro systematic evolution of ligands by exponential enrichment (SELEX) assay and a gel-shift experiment with the well-known Ddx4 target Pou5f1 (ref. 28) (Fig. 2d and Extended Data Fig. 3e,f). Specifically, the mutation of CG to AA in Pou5f1 RNA reduced its binding affinity to Ddx4 proteins by approximately tenfold (Fig. 2e,f). Collectively, these data demonstrate that LACE-seq can identify direct and functional protein–RNA interactions in vivo.

Binding landscape of Argonaute proteins. Having validated the LACE-seq methodology, we next mapped Ago2 and endo-siRNA targets using dozens of mouse MII oocytes. RNA co-precipitated with Ago2 was different from the control RNA enriched by pre-immune rabbit serum or by an anti-Ptbp1 antibody, but similar to

[Figure 3, panels a–f (image not reproduced). Recoverable labels: a, correlation heatmap (scale 0.2–1) of IgG, Ago2-100-oocyte rep1 and rep2, Ago2-50-oocyte, Mili-50-oocyte rep1–3 and Ptbp1-50-oocyte rep1–3 libraries. b, Ago2 binding sites by genomic region (intron, intergenic region, 5′ UTR, 3′ UTR, CDS, ncRNAs, LTR/ERV-L-MaLR, LTR/ERV-K, other repeat) and by MaLR subfamily (MTA-Mm, MTA-Mm-int, MTB, other). c, Mili binding sites by genomic region. d, Venn diagram of Ago2-specific and Mili-specific targets; counts shown: 13, 2,372 and 3,637. e, tracks (RPM) for endo-siRNA, Ago2 LACE-seq, piRNA, Mili LACE-seq and IgG LACE-seq at Pdzd11 and Kif4, with SINE, LINE, LTR and simple repeat annotations. f, network of Ago2-specific targets with the terms histone modification, regulation of chromosome organization, microtubule cytoskeleton organization and nuclear division.]

Fig. 3 | The binding landscape of Ago2 and Mili in MII oocytes. a, Heatmap showing the correlations in LACE-seq reads among Ago2, Mili and Ptbp1.
IgG served as the negative control. The colour intensity indicates Pearson’s correlation coefficient values. Three independent experiments (rep1–3) for
Ago2, Mili and Ptbp1 are shown. b, Genomic distribution of Ago2 binding sites. CDS, coding sequence; ncRNA, noncoding RNA. c, Genomic distribution
of Mili binding sites. d, Venn diagram showing the direct overlap between Ago2 and Mili targets. Mili-specific targets, n = 13; Ago2-specific targets,
n = 3,637. e, Snapshot of Ago2 and Mili nonoverlapping targets. Endo-siRNA and piRNA sequencing data in MII oocytes were downloaded and analysed
from our previous report44. f, Network analysis of the enriched GO terms of Ago2-specific targets. Data in b–f represent results from three independent
experiments.
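
The heatmap in Fig. 3a reports Pearson's correlation coefficients between libraries. As a minimal sketch of that calculation, the Python below computes a pairwise correlation matrix from per-bin read counts. The binning, the library names and the count values are hypothetical toy data, not values from the study.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("vectors must have equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(libraries):
    """Return a nested dict of pairwise Pearson coefficients."""
    names = list(libraries)
    return {a: {b: pearson(libraries[a], libraries[b]) for b in names}
            for a in names}


# Hypothetical read counts in six genomic bins for three libraries.
libraries = {
    "Ago2": [10, 0, 8, 1, 9, 0],
    "Mili": [9, 1, 7, 0, 10, 1],
    "IgG": [1, 2, 1, 2, 1, 2],
}

matrix = correlation_matrix(libraries)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

In this toy input the Ago2 and Mili vectors correlate strongly with each other and not with IgG, which mirrors the qualitative pattern described for Fig. 3a.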

RNA co-precipitated with Mili (a germ-cell-specific PIWI-clade Argonaute protein) (Fig. 3a). Ptbp1 preferred the CU-rich motif in oocytes (Extended Data Fig. 4a), and 6.5% of the LACE-seq reads were aligned to endo-siRNA loci, which is consistent with its reported roles in repressing splicing through intronic LINEs29. Conversely, the most enriched hexamer of Ago2 was CCTGAC, and 74% of the Ago2 clusters (average length of 677.3 nt) contained such a motif, which might correspond to the sequence in endo-siRNA-26 (Extended Data Fig. 4a). Moreover, 25.9% of the Ago2 reads were mapped to endo-siRNA loci, and Ago2 bound 96.4% of the reported endo-siRNAs (Extended Data Fig. 4b). Among the 26,611 Ago2 clusters, approximately 70% were derived from LTR retrotransposons, while 20.5% were aligned to protein-coding genes (Fig. 3b). We found that Ago2-binding signals were reduced by approximately threefold in Ago2-null oocytes (Extended Data Fig. 4c–e); the remaining signals may be caused by full-length Ago2 mRNA and protein present after Ago2 ablation. Notably, the IVT step was critical to obtaining enough peaks if starting with dozens of cells (Extended Data Fig. 4f,g). Saturation analysis revealed that Ago2 targets identified by LACE-seq nearly reached the saturation point, even when using so few oocytes (Extended Data Fig. 4h). Ago2 mainly bound the MaLR family of retrotransposons, such as MTA, MTB and MTC, but not murine ERV-L (MuERV-L). Similar to Ago2, approximately 94% of the Mili clusters (average length of 657.9 nt) were derived from LTR retrotransposons (Fig. 3c and Supplementary Table 3), which agrees well with its predominant roles in silencing transposable elements (TEs).

Unexpectedly, nearly all the Mili targets directly overlapped with the Ago2 targets (Fig. 3d, Extended Data Fig. 4i and Supplementary Table 4), which suggests that there is a functional redundancy


[Figure 4, panels a–f (image not reproduced). Recoverable labels: a, α-tubulin/DAPI staining of Ctrl and cKO oocytes. b, RNA-seq tracks (20 kb window, scale 0–750) for Ctrl-1 to Ctrl-5 and cKO-1 to cKO-5 at Ago2, with exon 3 marked. c, volcano plot of −log10(P value) versus log2(cKO/Ctrl): down, n = 775; unchanged, n = 13,759; up, n = 1,525. d, GO of upregulated genes plotted by gene ratio, count and P value, with terms including nuclear division, meiotic cell cycle, chromosome segregation, fertilization, spindle organization, oogenesis and chromatin remodelling. e,f, cumulative probability versus log2(cKO/Ctrl) for all targets (e) and upregulated targets (f), split into None, Low, Medium and High groups, each with P < 2.2 × 10⁻¹⁶ versus none.]

Fig. 4 | Transcriptome changes in Ago2-null oocytes. a, IF showing aberrant chromosome alignments and meiotic spindles in Ago2-ablated oocytes. Ctrl,
control of Ago2loxP/+. α-tubulin, green; 4,6-diamidino-2-phenylindole (DAPI), blue. The experiment was independently repeated three times with similar
results. Representative images were selected and illustrated from 68 Ctrl and 75 cKO oocytes. Scale bar, 20 μm. b, Single-cell RNA-seq results showing
the complete deletion of exon 3 in cKO oocytes. Five independent scRNA-seq experiments are shown. c, Volcano plot showing transcriptome changes in
Ago2-ablated oocytes. d, GO term analysis of upregulated genes. e, Cumulative distribution function (CDF) plot showing the RNA expression level changes
after Ago2 depletion in oocytes. Ago2 targets were classified into low, medium and high groups based on the LACE-seq read density. RNAs without Ago2
binding were classified as None. f, CDF plot showing the RNA expression level changes after Ago2 depletion for the upregulated targets. P values in e and f
were calculated by two-tailed Kolmogorov–Smirnov test. Data in c–f represent results from five independent experiments.
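
The comparisons in Fig. 4e,f rest on the two-sample Kolmogorov–Smirnov test, whose statistic is the largest vertical gap between two empirical cumulative distribution functions. The Python sketch below computes only that statistic (not the P value) for two hypothetical sets of log2 fold-changes; the function name and the toy values are illustrative assumptions, not data from the study.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: the maximum absolute ECDF difference."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The ECDF gap can only change at observed values, so checking them suffices.
    for x in a + b:
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d


# Hypothetical log2(cKO/Ctrl) values for unbound and strongly bound transcripts.
unbound = [-1.0, -0.5, 0.0, 0.2, 0.5]
bound = [0.4, 0.8, 1.2, 1.5, 2.0]

d = ks_statistic(unbound, bound)
print(f"D = {d:.2f}")
```

A right-shifted CDF for bound transcripts, as described for the High group, yields a large D. For P values on real data, an established implementation such as `scipy.stats.ks_2samp` would be the usual choice.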

between these two RBPs and their associated small RNAs in oocytes. This overlapping phenomenon might explain why Mili-deficient oocytes were healthy and fertile. Gene Ontology (GO) and network analyses revealed that Ago2-specific targets (3,637 genes) were enriched for terms related to chromosome segregation and spindle assembly (Fig. 3e,f and Extended Data Fig. 4j). Together, these data demonstrate that LACE-seq can identify global RBP binding sites in previously unapproachable oocytes.

Transcriptome changes in Ago2-null oocytes. To study the detailed mechanisms of Ago2–endo-siRNA complexes, we generated an oocyte-specific Ago2 conditional knockout mouse model by placing two loxP sites in the introns adjacent to exon 3 (Extended Data Fig. 5a). The deletion of exon 3 introduced a premature termination codon at exon 4, thereby disrupting Ago2 expression. We isolated Ago2-null MII oocytes by crossing the loxP-flanked strain with Zp3-cre mice that expressed Cre recombinase exclusively in growing oocytes30. Similar to previous reports7,8, Ago2-deficient oocytes (Ago2loxP/loxP,Zp3−cre, cKO) showed defects in chromosome condensation, segregation and spindle morphology (Fig. 4a and Extended Data Fig. 5b,c). Similar defects were also found in Dicer- and DicerO-null oocytes31–33. Next, we performed single-cell RNA-seq and confirmed that exon 3 had been completely deleted in five Ago2-null oocytes (Fig. 4b). Clustering and principal component analysis (PCA) clearly distinguished the Ago2-null oocytes from the control oocytes (Ago2loxP/+; Extended Data Fig. 5d,e).

Single-cell transcriptomics analysis identified 775 downregulated, 1,525 upregulated and 13,759 unchanged transcripts in Ago2-deficient oocytes (Fig. 4c and Supplementary Table 5). The upregulated genes may account for the Ago2-null phenotypes because enriched GO terms were related to nuclear division, spindle organization and cell cycle regulation (Fig. 4d). Next, we divided all the detectable transcripts into none, low, medium and high groups based on mapped Ago2 binding events. Compared with the none group, the Ago2-bound transcripts were preferentially upregulated in Ago2-null oocytes, and the binding strength positively correlated with the upregulation fold-change (right shift; Fig. 4e). This trend was more evident for the upregulated transcripts (Fig. 4f), which highlights the cleavage activity of Ago2–endo-siRNA complexes.

By comparing RNA-seq data with LACE-seq identified peaks, we identified 300 downregulated, 530 upregulated and 5,179 unchanged transcripts as direct targets of Ago2 in MII oocytes (Extended Data Fig. 5f). The enormous number of unchanged transcripts strongly suggested that Ago2 and its associated small RNAs might primarily function at the translational level to regulate gene expression. Consistently, 93.2% of the Ago2 targets were not changed in DicerO-null oocytes (Extended Data Fig. 5g), in which the RNA interference pathway is disabled33. Notably, the redundancy between Ago2 and Mili might also contribute to the unchanged transcript levels because the piRNA pathway still actively functioned in Dicer-null oocytes. Indeed, the RNA level of Mili targets was mainly unchanged in Ago2-null oocytes (88.5%; Extended

Data Fig. 5h), and significantly more Ago2 binding than Mili was observed for 227 upregulated Mili targets (Extended Data Fig. 5i). Together, these results indicate that Ago2 mainly functions at the post-transcriptional level to repress gene expression in oocytes.

Ago2 and endo-siRNAs repress LTR-driven chimeric transcripts. In exploring how Ago2 depletion led to RNA upregulation, we noted that Ago2–endo-siRNA complexes bound many intronic LTR retrotransposons such as MTA, MTB and MTC (Fig. 5a). Our RNA-seq data showed that these LTRs were dramatically activated in Ago2-null oocytes and functioned as promoters to drive the formation of chimeric transcripts with their host genes by providing first exons (Fig. 5a). Notably, many LTRs, SINEs and LINEs themselves were also upregulated after Ago2 deletion (Extended Data Fig. 6a). To globally identify LTR-driven transcripts, we reconstructed whole transcriptomes based on our single-cell RNA-seq datasets. The first exons of the reconstructed chimeric transcripts were then applied for alignment with LTR and Ago2 cluster sequences (Fig. 5b and Methods). We identified more than 2,700 LTR-derived chimeric transcripts and found that Ago2-bound chimeric transcripts were upregulated to a greater extent than unbound transcripts in Ago2-deficient oocytes (Fig. 5c and Supplementary Table 6). Moreover, the ratio of chimeric transcripts to endogenous transcripts increased by 2.88-fold after Ago2 ablation (Extended Data Fig. 6b), which highlights the suppressive role of Ago2–endo-siRNAs on LTRs. Furthermore, LTRs such as MTA, MTB and MTC were the top three most active retrotransposons driving chimeric transcript generation (Fig. 5d).

Among the 1,287 Ago2-bound chimeric transcripts, 244 were upregulated by at least 1.5-fold after Ago2 ablation (Supplementary Table 7). This LTR-promoter mechanism covered 46% of the upregulated transcripts in Ago2-null oocytes (Extended Data Fig. 5f). Pseudogene-derived endo-siRNAs could interact with Ago2 to cleave base-paired protein-coding transcripts, which would also be upregulated after Ago2 depletion21,22. Therefore, we developed an algorithm to identify naturally occurring pseudogene–gene pairs (Extended Data Fig. 6c), and revealed 165 double-stranded RNA (dsRNA) transcripts that were stabilized after Ago2 depletion (Extended Data Fig. 6d and Supplementary Table 7). Together, these two mechanisms explained approximately 77% of the upregulated transcripts in Ago2-deficient oocytes.

Unexpectedly, we observed dozens of downregulated chimeric and pseudogene-derived transcripts in Ago2-null oocytes (Extended Data Fig. 6d). Although the detailed mechanism remains unclear, we observed the following interesting phenomena: (1) the downregulated chimeric transcripts often showed stronger Mili binding signals than Ago2 compared with the upregulated chimeric transcripts (Extended Data Fig. 6e); (2) the downregulated chimeric transcripts usually corresponded to LTR loci containing a higher endo-siRNA density than the upregulated ones (Extended Data Fig. 6f), which indicates that they might be preferred substrates of Dicer; and (3) the downregulated pseudogene transcripts in Ago2-null oocytes also showed reduced expression in DicerO-null oocytes (Extended Data Fig. 6g), which suggests that they were not direct targets of endo-siRNA.

Next, we performed GO analysis of the LTR-activated host genes. We found that the terms of male/female gamete generation, spermatogenesis, chromatin assembly or disassembly, and oogenesis-related genes were significantly enriched (Extended Data Fig. 6h), thereby highlighting the potential contribution of these chimeric transcripts to the observed Ago2-null phenotypes. Collectively, these data suggest that Ago2 engages with endo-siRNAs to play a crucial role in repressing LTR-driven chimeric transcripts.

[…] that highly abundant endo-siRNAs tended to engage with Ago2 (Extended Data Fig. 7a,b). Next, we classified the Ago2 clusters into two classes: class I clusters (n = 20,966) were fully base-paired and overlapped with endo-siRNAs, while class II (n = 5,645) were partially base-paired with endo-siRNAs (Fig. 6a). Most class I clusters were derived from TEs or gene–pseudogene pairs (Fig. 6b); conversely, ~80% of the class II clusters were localized at mRNAs (Fig. 6b). RNAhybrid35 analysis revealed that LTR-derived endo-siRNAs in the antisense direction preferred to bind Ago2 clusters in the sense LTR (Extended Data Fig. 7c). Conversely, sense LTR-, SINE- and LINE-derived endo-siRNAs all had the potential to partially base-pair with the Ago2 cluster in mRNA. As a positive control, miRNAs mainly targeted mRNAs (Extended Data Fig. 7c). Moreover, the calculated base-pairing energy between class II targets and endo-siRNAs was significantly lower than that of random controls (Extended Data Fig. 7d), which indicates that partially base-paired interactions between endo-siRNAs and target mRNAs might exist in vivo.

To test this possibility, we performed dual-luciferase assays by injecting Renilla luciferase mRNAs containing two entirely or partially base-paired target sites of endo-siRNA-336 into growing oocytes. Compared with perfectly base-paired controls, double- and single-nucleotide mutations in positions 2–8 led to higher luciferase activity (Fig. 6c and Extended Data Fig. 7e). Therefore, endo-siRNA also followed a seed rule to repress mRNA translation. Notably, the ‘off-target’ effect of exogenous siRNA in somatic cells may be ascribed to this reason36–38. Taken together, we propose that endo-siRNA functions in a miRNA-like manner in oocytes and that the seed region is crucial for its activity.

To investigate how many protein-coding transcripts might be regulated by endo-siRNAs, we compiled 19,024 Ago2-associated endo-siRNAs and used miRanda and TargetScan algorithms39,40 to predict their candidate targets within Ago2 cluster sequences (Fig. 6d). At a minimum free energy (MFE) cut-off of −14 kcal mol⁻¹, both algorithms predicted 22,335,946 endo-siRNA target sites in 5,520 protein-coding genes (Fig. 6e and Supplementary Table 8). Among these, we identified 3,331 genes that contained both Ago2 clusters and TEs (named TE genes), and the sequences of these TE clusters showed stronger base-pairing abilities with endo-siRNAs (MFE ≤ −30 kcal mol⁻¹; Fig. 6f,g). In the remaining 2,085 genes (other genes), Ago2 bound to non-TE regions, including the 5′ UTR, 3′ UTR and exons. Notably, 69% of them might represent partially base-paired targets of endo-siRNA because they had relatively higher free energies (−30 kcal mol⁻¹ < MFE ≤ −14 kcal mol⁻¹; Fig. 6f,g). Conversely, the remaining 31% showed a strong base-pairing ability and might represent naturally formed dsRNAs in vivo.

Endo-siRNA target validation. We next chose seven predicted endo-siRNA target genes for validation. The genomic fragments covering individual Ago2 clusters were placed into the 3′ UTR of the Renilla luciferase gene, and the luciferase activity was measured by co-transfection with different endo-siRNA mimics. Among the 23 target sites, 78.3% showed reduced luciferase activity in HEK293 cells, which indicates that these Ago2 clusters might be functional endo-siRNA target sites (Fig. 7a,b and Extended Data Fig. 8a–c). We next focused on Calm1, Bub3, Nuf2 and Chk1 for further analysis in mouse oocytes because of their reported functions in meiotic maturation41–43. Compared to vehicle injections, we found that the luciferase activities of the Calm1, Bub3, Nuf2 and Chk1 reporters were reduced by 50–97% (Fig. 7c and Extended Data Fig. 8d). The reduction mainly occurred at the translational level because the reporter mRNAs were unchanged (Fig. 7d). Moreover, the decreased lucifer-
ase activities of Calm1, Nuf2, Bub3 and Chk1 were partially rescued
Endo-siRNA targeting rules. To identify endo-siRNAs loaded by mutating the seed regions of endo-siRNA 336, 503, 1603, 3492,
onto Ago2, we performed single-cell CAS-seq34 using Ago2- 607, 2837, 1, 1976 and 2932 in oocytes and HEK293 cells (Fig. 7e
immunoprecipitated small RNAs from MII oocytes. We found and Extended Data Fig. 8b–d). Furthermore, blocking endogenous


[Fig. 5 graphic not reproduced.
Panel a: genome browser tracks (RPM) of endo-siRNA smRNA-seq, Ago2 LACE-seq and RNA-seq (Ctrl and cKO) at the Gne, Aspm, Atp5c1, Pkd1l3, Gin1 and Nipal1 loci; LTR tracks labelled MTA, MTB, ORR1A2, MTC, RLTR10 and MTD.
Panel b: workflow of mapping RNA-seq reads to the genome, de novo transcriptome assemblies, identification of LTR-driven transcripts and overlap with Ago2 clusters.
Panel c: LTR-driven transcripts in oocytes; cumulative probability against log2(cKO/Ctrl) for Ago2 unbound (n = 1,484) and Ago2 bound (n = 1,287); P < 2.2 × 10−16.
Panel d: LTR-driven gene count by LTR type.]

Fig. 5 | Ago2 and endo-siRNAs repress LTR-driven chimeric transcripts in oocytes. a, Genome browser views of LTR-driven chimeric transcripts and
Ago2–endo-siRNA binding at those loci. Red arrow indicates LTR promoter direction. smRNA-seq, small RNA sequencing. The combined RNA-seq,
smRNA-seq and LACE-seq results were from five, two and three independent experiments, respectively. b, The strategy used for analysing LTR-driven
transcripts. c, Ago2-bound LTR-driven transcripts showing stronger upregulation in Ago2 knockout oocytes. The P value was calculated by two-tailed
Kolmogorov–Smirnov test. d, The number of chimeric transcripts driven by different LTR types in oocytes. Data in c and d represent results from five
independent experiments.
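The comparison in panel c can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code: the function names, the pseudocount and the toy expression values are all assumptions, and only the Kolmogorov–Smirnov statistic (the largest gap between the two empirical cumulative distributions) is computed, not the P value.

```python
import math
from bisect import bisect_right


def log2_fold_change(cko, ctrl, pseudocount=1.0):
    """log2(cKO/Ctrl); the pseudocount is an assumption to avoid division by zero."""
    return math.log2((cko + pseudocount) / (ctrl + pseudocount))


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)  # fraction of sample_a <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of sample_b <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Toy (Ctrl, cKO) expression values, invented for illustration only.
bound = [log2_fold_change(cko, ctrl) for ctrl, cko in [(1, 7), (3, 15), (1, 3)]]
unbound = [log2_fold_change(cko, ctrl) for ctrl, cko in [(3, 3), (7, 3), (1, 1)]]
print(ks_statistic(bound, unbound))
```

A statistic close to 1 means the two cumulative curves barely overlap; a value near 0 means they are almost identical.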

endo-siRNAs targeting Calm1, Nuf2 and Bub3 by sponges (containing 8× complementary target sites) markedly increased their protein levels in MII oocytes (1.45–2.23-fold; Fig. 7f). Importantly, the mRNA levels of Calm1, Nuf2 and Bub3 were unchanged (Extended Data Fig. 8e–g), thereby highlighting the translational repression mediated by endo-siRNAs.

To examine whether endo-siRNA-mediated translational repression required the slicer activity of Ago2, we focused on Bub3 for detailed analysis because of the high-quality antibody available for immunofluorescence (IF) analysis. We collected germinal vesicle (GV) oocytes from Ago2−/− female mice and injected them with either Myc-tagged Ago2 or Myc-tagged Ago2ADH (catalytically inactive mutant) mRNAs (Fig. 7g). The subsequent IF analysis with specific antibodies against Myc and Bub3 revealed that the Bub3 protein level was dramatically reduced after injection with wild-type (WT) or Ago2ADH mRNA (Fig. 7g,h). Together, these data suggest that the slicer activity of Ago2 is not required for endo-siRNA-mediated translational repression in oocytes.

[Fig. 6 graphic not reproduced.
Panel a: Ago2 clusters, class I (n = 20,966) and class II (n = 5,645); region in clusters (%) for LTR, LINE, SINE, genic, intergenic and others.
Panel b: class I and class II examples at Kifc1 and Calm1; tracks (RPM) for endo-siRNA, Ago2 LACE-seq and IgG LACE-seq.
Panel c: position of mismatches in the endo-siRNA 336 target region (perfect and mismatched targets); relative luciferase activity (RL/FL) for each mutation.
Panel d: endo-siRNAs loaded into Ago2 and Ago2 footprints in mRNA, analysed with miRanda and TargetScan (ΔG ≤ −14 kcal mol−1).
Panel e: miRanda and TargetScan predictions, with the values 20,440,500, 22,335,946 and 115,696,071 shown (22,335,946 is the overlap reported in the main text).
Panel f: number of predicted hybrids (×10^5) against MFE (kcal mol−1), split into strong and weak base-pairing.
Panel g: Ago2-bound genes (×10^3) for TE genes and other genes, split into strong and weak base-pairing.]

Fig. 6 | Endo-siRNA targeting rules and global target prediction. a, Two classes of Ago2 clusters were revealed based on their overlapping situation
with endo-siRNAs. Class I clusters (n = 20,966) with overlapping endo-siRNAs are mainly localized at LTR loci, while class II clusters (n = 5,645) without
overlapping endo-siRNAs mostly reside in protein-coding genes. b, Examples of LACE-seq revealed class I and class II clusters. Data in a and b represent
results from three independent Ago2 LACE-seq experiments. c, The base-pairing of endo-siRNA-336 (blue) to its fully or partially base-paired targets in
the Renilla luciferase (RL) reporter mRNA. The positions of the mismatches (red) and their effects on the expression of Renilla luciferase in mouse oocytes
are shown (bottom). FL, firefly luciferase. Data are the mean ± s.e.m.; n = 3 biological replicates, two-tailed unpaired Student’s t-test. d, The computational
workflow for predicting Ago2–endo-siRNA targets using miRanda and TargetScan. e, The overlapping targets predicted by miRanda and TargetScan. f, The
distribution of MFE for the hybrids between overlapping targets and endo-siRNA. MFEs lower or higher than −30 kcal mol−1 were classified as strong or
weak base-pairing, respectively. Ago2-binding sites deduced from three independent LACE-seq experiments were used for the data analysis in e and f.
g, The number of Ago2-bound genes containing or not containing a TE. TE genes: Ago2 binds at the TE region in the gene. Other genes: Ago2 binds to the
non-TE region in the gene.
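The two MFE thresholds used for target prediction and classification can be written as a small decision rule. This is a minimal sketch under stated assumptions: the function and constant names are hypothetical, the boundaries follow the main text (MFE ≤ −14 kcal mol−1 for a predicted site, MFE ≤ −30 kcal mol−1 for strong base-pairing), and the two example values are the MFEs given in Fig. 7b for endo-siRNA-607 on Bub3 and endo-siRNA-336 on Calm1.

```python
# Thresholds taken from the text; the names are illustrative only.
PREDICTION_CUTOFF = -14.0  # kcal/mol, sites above this are not called
STRONG_CUTOFF = -30.0      # kcal/mol, at or below this counts as strong


def classify_hybrid(mfe):
    """Classify an endo-siRNA:target hybrid by its minimum free energy."""
    if mfe > PREDICTION_CUTOFF:
        return "not predicted"
    if mfe <= STRONG_CUTOFF:
        return "strong"
    return "weak"


for label, mfe in [("Bub3 : endo-siRNA-607", -31.87), ("Calm1 : endo-siRNA-336", -26.70)]:
    print(label, classify_hybrid(mfe))
```

Under this rule a weak hybrid corresponds to the partially base-paired, miRNA-like targets discussed in the text, and a strong hybrid to near-perfect duplexes.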

Proteomics analysis of Ago2-mediated translational repression. To further explore Ago2–endo-siRNA-mediated translational regulation globally, we collected ~60 control and Ago2-null MII oocytes for deep shotgun proteomics sequencing using an advanced trapped-ion mobility selecting (timsTOF Pro) mass spectrometer (Fig. 8a). Two biological replicates showed strong correlations in the detected proteins (R = 0.947 and R = 0.858; Extended Data Fig. 9a). We identified 170 upregulated and 80 downregulated proteins in Ago2-depleted oocytes compared with WT controls (P < 0.05; Fig. 8b and Supplementary Table 9). Although Calm1, Nuf2, Bub3 and Chk1 showed consistent upregulation as revealed by western blotting (Fig. 7f), only Calm1 passed the stringent cut-off criteria used in our proteomics analysis (Extended Data Fig. 9b). Of the 170 upregulated proteins, 49% contained Ago2 clusters identified


[Fig. 7 graphic not reproduced.
Panel a: Ago2 LACE-seq and IgG LACE-seq tracks (RPM) at Calm1, Bub3 and Nuf2, with predicted endo-siRNA sites labelled 336, 18, 248, 503, 607, 3,100, 1,030, 2,837, 3,492, 2,188 and 1,603.
Panel b: predicted hybrids and their MFEs:
  Calm1-3′ UTR : endo-siRNA-336, MFE = −26.70 kcal mol−1
  Calm1-3′ UTR : endo-siRNA-503, MFE = −25.00 kcal mol−1
  Bub3-3′ UTR : endo-siRNA-607, MFE = −31.87 kcal mol−1
  Bub3-3′ UTR : endo-siRNA-2837, MFE = −26.43 kcal mol−1
  Nuf2 : endo-siRNA-1603, MFE = −26.69 kcal mol−1
  Nuf2 : endo-siRNA-3492, MFE = −25.61 kcal mol−1
Panel c: relative RL activity, vehicle versus Calm1, Bub3 and Nuf2 3′ UTR reporters.
Panel d: relative RL mRNA level for the same reporters.
Panel e: relative RL activity of WT versus mut reporters.
Panel f: blots for Anti-Calm1 (1.0, 1.45, 1.47, 1.57), Anti-Bub3 (1.0, 1.49, 1.54, 1.47) and Anti-Nuf2 (1.0, 2.06, 2.23, 1.98) in Ctrl and sponge (sp) samples, with Anti-GAPDH or Anti-tubulin as loading controls.
Panel g: IF images (Bub3, Myc tag, DNA, Merge) of Ago2 KO oocytes, Ago2 KO oocytes +Myc–Ago2 and Ago2 KO oocytes +Myc–Ago2ADH.
Panel h: Bub3 signal intensity for the three groups.]

Fig. 7 | TE-derived endo-siRNAs repress mRNA translation in mouse oocytes. a, Snapshot of Ago2 LACE-seq reads and the predicted endo-siRNA targets
in Calm1, Bub3 and Nuf2 mRNA. Red indicates Watson strand, blue indicates Crick strand. Ago2 LACE-seq data represent results from three independent
experiments. b, The base-pairing between endo-siRNA and Calm1, Bub3 and Nuf2 mRNA. c, The relative RL reporter activity of Calm1, Bub3 and Nuf2 in
mouse oocytes. The RL reporter mRNAs containing the 3′ UTR sequence of Calm1, Bub3 or Nuf2 (blue column) were microinjected into the oocytes. The RL
reporter activities were normalized to the co-injected FL activities. The vehicle (grey column) served as the control. d, qPCR showing that the mRNA levels
of luciferase reporter genes were not changed. e, The RL activities of Calm1, Nuf2 and Bub3 reporters were rescued after mutating (mut) the endo-siRNA
target region (red columns). f, Simple western blotting results showing that endogenous protein levels increased after blocking endo-siRNAs with sponges
(sp). The experiment was independently repeated twice with similar results. g, IF showing that upregulated Bub3 in Ago2-null oocytes was abolished after
injection of WT or Ago2ADH mRNAs. Bub3, red; Myc tag, green; DAPI, blue. Scale bar, 20 μm. h, Quantification of Bub3 signal intensity in water (n = 29),
Myc–Ago2 mRNA (n = 37), and Myc–Ago2ADH mRNA-injected (n = 42) Ago2−/− oocytes. Data in c,d,e and h are the mean ± s.e.m.; n = 3 biological replicates,
two-tailed unpaired Student’s t-test.
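The normalization described for panels c and e amounts to simple ratio arithmetic. This is a minimal sketch rather than the authors' calculation: the function names and the raw readings are hypothetical, and it assumes that each reporter's RL/FL ratio is divided by the RL/FL ratio of the vehicle control so that the vehicle equals 1.

```python
def relative_rl_activity(rl, fl, vehicle_rl, vehicle_fl):
    """Renilla signal normalized to firefly, then expressed relative to vehicle."""
    return (rl / fl) / (vehicle_rl / vehicle_fl)


def percent_reduction(relative_activity):
    """Reduction relative to the vehicle control, in percent."""
    return (1.0 - relative_activity) * 100.0


# Hypothetical raw luminescence readings, for illustration only.
activity = relative_rl_activity(rl=50.0, fl=100.0, vehicle_rl=100.0, vehicle_fl=100.0)
print(activity, percent_reduction(activity))
```

Dividing by the co-injected firefly signal corrects for differences in injection volume between oocytes before reporters are compared.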

[Fig. 8 graphic not reproduced.
Panel a: six-week-old female mouse treated with PMSG and hCG; oocyte collection from ovaries; ~60 control oocytes and ~60 Ago2 cKO oocytes analysed by timsTOF Pro MS.
Panel b: significance (−10log10[P value]) against log2(cKO/Ctrl); Calm1 is labelled.
Panel c: MS-detected genes (n = 2,715); number of transcripts with mRNA not changed or mRNA changed: Ago2 bound, 878 and 277; Ago2 unbound, 983 and 577.
Panel d: cumulative probability against log2(cKO/Ctrl) for Ago2-unbound transcripts (n = 983) and Ago2-bound transcripts (n = 878); P = 1.28 × 10−7.
Panel e: cumulative probability against log2(cKO/Ctrl) for endo-siRNA density > 0.72 (n = 276) and < 0.72 (n = 276); P = 0.019.
Panel f: model schematic showing LTR/LINE, LTR-driven gene, gene and pseudogene loci, endo-siRNAs and Ago2.]

Fig. 8 | Proteomics analysis of Ago2-mediated translational repression. a, Schematic of oocyte collection and MS analysis. b, Significantly upregulated
(red) and downregulated (blue) proteins in Ago2 cKO oocytes. c, Transcripts that are bound by Ago2 but for which the mRNA is unchanged in cKO
oocytes. The timsTOF Pro mass spectrometer detected 2,715 protein-related transcripts used for the analysis. d, Ago2-bound transcripts showed stronger
upregulation at the protein level than unbound transcripts in cKO oocytes. e, The predicted endo-siRNA density (endo-siRNA counts/mRNA length)
showing a positive correlation with protein level changes after Ago2 ablation in oocytes. Two-tailed Kolmogorov–Smirnov test was used to calculate the P
values in d and e. MS data in b–e represent results from two independent experiments. f, Model of Ago2–endo-siRNA functional mechanisms in mouse
oocytes. Endo-siRNAs are mainly generated from TEs and gene–pseudogene pairs in mouse oocytes. In addition to repressing themselves, we found
that TE-derived endo-siRNAs interact with Ago2 to partially base-pair with mRNAs to repress their translation. Furthermore, these Ago2–endo-siRNA
complexes play critical roles in controlling LTR-driven chimeric transcripts in mouse oocytes.
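The density measure in panel e (endo-siRNA counts divided by mRNA length) and the split into two equally sized groups can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names, the per-kilobase scaling and the toy values are hypothetical, and the use of the median as the cut-off is inferred only from the equal group sizes (n = 276 each) in the figure.

```python
from statistics import median


def endo_sirna_density(site_count, mrna_length_nt, per_nt=1000):
    """Predicted endo-siRNA sites per `per_nt` nucleotides of mRNA (scaling is an assumption)."""
    return site_count * per_nt / mrna_length_nt


def split_by_median(densities):
    """Split a {gene: density} mapping into high and low groups around the median."""
    cutoff = median(densities.values())
    high = sorted(gene for gene, d in densities.items() if d > cutoff)
    low = sorted(gene for gene, d in densities.items() if d < cutoff)
    return cutoff, high, low


# Toy genes with invented site counts and lengths.
toy = {
    "geneA": endo_sirna_density(1, 1000),
    "geneB": endo_sirna_density(4, 2000),
    "geneC": endo_sirna_density(3, 1000),
    "geneD": endo_sirna_density(8, 2000),
}
print(split_by_median(toy))
```

The protein-level log2(cKO/Ctrl) values of the two groups would then be compared with a two-sample Kolmogorov–Smirnov test, as stated in the legend.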

by LACE-seq, which indicates that they might be direct targets of Ago2–endo-siRNA complexes.

Using timsTOF Pro, we identified 2,715 proteins from distinct mRNAs. Among these, 1,155 were bound by Ago2, while 1,560 did not contain Ago2 clusters (Fig. 8c). By checking our RNA-seq data (Fig. 4c), we identified 878 Ago2-bound and 983 Ago2-unbound mRNAs that did not show any observable changes in the cKO oocytes (red columns, Fig. 8c). Interestingly, Ago2-bound transcripts showed higher upregulation at the protein level than the unbound group (P = 1.28 × 10−7; Fig. 8d). Furthermore, we classified mass spectrometry (MS)-detected proteins into two groups based on the predicted endo-siRNA density. The groups with higher endo-siRNA density showed stronger upregulation in Ago2-null oocytes (P = 0.019; Fig. 8e). Notably, no such trend was observed using the predicted miRNA density (P = 0.361; Extended Data Fig. 9c). In addition, the proteomics analysis detected 17.6% of the LTR-derived and upregulated chimeric proteins in Ago2-null oocytes (Extended Data Fig. 9d,e). Together, our proteomics


analysis confirms the presence of chimeric proteins and demonstrates that Ago2–endo-siRNA-mediated translational regulation is widely present in mouse oocytes.

Discussion
In summary, we presented a LACE-seq method for mapping RBP binding sites in as few as a single cell. Using the LACE-seq method, we profiled the binding landscape of the Ago2–endo-siRNA complex in oocytes and found that TE-derived endo-siRNAs globally repress mRNA translation in a miRNA-like manner (Fig. 8f). We estimate that at least 18% of protein-coding genes (fragments per kilobase of transcript per million mapped reads (FPKM) > 1) might be controlled by endo-siRNAs via partial base-pairing. Among the validated endo-siRNA targets, Bub3 (ref. 41), Chk1 (ref. 42) and Nuf2 (ref. 43) are critical for chromosome segregation. We further extended the control of those meiotic maturation-related genes to the endo-siRNA–Ago2 pathway.

Dicer is responsible for the biogenesis of miRNA and endo-siRNAs in oocytes18. Although the catalytically inactive Ago2 (Ago2ADH) recapitulates the Dicer-null phenotypes during oocyte maturation19,31,32, many dysregulated genes in Dicer-null and Ago2ADH oocytes are different8. This result strongly suggests that endo-siRNAs may have cleavage-independent activity. By analysing single-cell RNA-seq and LACE-seq datasets, we identified 9.8-fold more cleavage-independent targets of endo-siRNAs. Importantly, we demonstrated that this hidden layer function of Ago2–endo-siRNAs did not rely on the catalytic activity of Ago2 but required the imperfect base-pairing between endo-siRNAs and target mRNAs.

We performed many steps on beads to reduce material loss and hands-on time, which enabled the profiling of RBP binding sites in dozens of cells within 2.5 days. To increase the resolution, we directly mapped the termination sites of reverse transcriptase on the 3′ end of the intact RBP-bound elements. This design is reminiscent of the elegant iCLIP approach14. As smaller peptides are inefficient in blocking reverse transcriptase, many read-throughs are present in the iCLIP datasets. Similar to sCLIP25, we adopted an IVT strategy to lower the input materials. However, sCLIP can only map RBP binding sites from 10^6 cells. We believe that LACE-seq opens the door to studying the roles of RBPs in previously uncharted cell types and will expedite the discovery of RNA regulatory networks in various diseased and healthy cells.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41556-021-00696-9.

Received: 30 April 2020; Accepted: 7 May 2021; Published online: 9 June 2021

References
1. Hentze, M. W., Castello, A., Schwarzl, T. & Preiss, T. A brave new world of RNA-binding proteins. Nat. Rev. Mol. Cell Biol. 19, 327–341 (2018).
2. Benoit, B. et al. An essential role for the RNA-binding protein Smaug during the Drosophila maternal-to-zygotic transition. Development 136, 923–932 (2009).
3. Christou-Kent, M., Dhellemmes, M., Lambert, E., Ray, P. F. & Arnoult, C. Diversity of RNA-binding proteins modulating post-transcriptional regulation of protein expression in the maturing mammalian oocyte. Cells 9, 662 (2020).
4. Lee, M. H. & Schedl, T. Identification of in vivo mRNA targets of GLD-1, a maxi-KH motif containing protein required for C. elegans germ cell development. Genes Dev. 15, 2408–2420 (2001).
5. Khalaj, K. et al. RNA-binding proteins in female reproductive pathologies. Am. J. Pathol. 187, 1200–1210 (2017).
6. Medrano, J. V., Ramathal, C., Nguyen, H. N., Simon, C. & Reijo Pera, R. A. Divergent RNA-binding proteins, DAZL and VASA, induce meiotic progression in human germ cells derived in vitro. Stem Cells 30, 441–451 (2012).
7. Kaneda, M., Tang, F., O’Carroll, D., Lao, K. & Surani, M. A. Essential role for Argonaute2 protein in mouse oogenesis. Epigenetics Chromatin 2, 9 (2009).
8. Stein, P. et al. Essential role for endogenous siRNAs during meiosis in mouse oocytes. PLoS Genet. 11, e1005013 (2015).
9. Licatalosi, D. D. et al. HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464–469 (2008).
10. Yeo, G. W. et al. An RNA code for the FOX2 splicing regulator revealed by mapping RNA–protein interactions in stem cells. Nat. Struct. Mol. Biol. 16, 130–137 (2009).
11. Zhao, J. et al. Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol. Cell 40, 939–953 (2010).
12. Ule, J. et al. CLIP identifies Nova-regulated RNA networks in the brain. Science 302, 1212–1215 (2003).
13. Zarnegar, B. J. et al. irCLIP platform for efficient characterization of protein–RNA interactions. Nat. Methods 13, 489–492 (2016).
14. Konig, J. et al. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat. Struct. Mol. Biol. 17, 909–915 (2010).
15. Van Nostrand, E. L. et al. Robust transcriptome-wide discovery of RNA-binding protein binding sites with enhanced CLIP (eCLIP). Nat. Methods 13, 508–514 (2016).
16. Masuda, A. et al. tRIP-seq reveals repression of premature polyadenylation by co-transcriptional FUS-U1 snRNP assembly. EMBO Rep. 21, e49890 (2020).
17. McMahon, A. C. et al. TRIBE: hijacking an RNA-editing enzyme to identify cell-specific targets of RNA-binding proteins. Cell 165, 742–753 (2016).
18. Kim, V. N., Han, J. & Siomi, M. C. Biogenesis of small RNAs in animals. Nat. Rev. Mol. Cell Biol. 10, 126–139 (2009).
19. Suh, N. et al. MicroRNA function is globally suppressed in mouse oocytes and early embryos. Curr. Biol. 20, 271–277 (2010).
20. Ma, J. et al. MicroRNA activity is suppressed in mouse oocytes. Curr. Biol. 20, 265–270 (2010).
21. Tam, O. H. et al. Pseudogene-derived small interfering RNAs regulate gene expression in mouse oocytes. Nature 453, 534–538 (2008).
22. Watanabe, T. et al. Endogenous siRNAs from naturally formed dsRNAs regulate transcripts in mouse oocytes. Nature 453, 539–543 (2008).
23. Xue, Y. et al. Direct conversion of fibroblasts to neurons by reprogramming PTB-regulated microRNA circuits. Cell 152, 82–96 (2013).
24. Coelho, M. B. et al. Nuclear matrix protein Matrin3 regulates alternative splicing and forms overlapping regulatory networks with PTB. EMBO J. 34, 653–668 (2015).
25. Kargapolova, Y., Levin, M., Lackner, K. & Danckwardt, S. sCLIP—an integrated platform to study RNA–protein interactomes in biomedical research: identification of CSTF2tau in alternative processing of small nuclear RNAs. Nucleic Acids Res. 45, 6074–6086 (2017).
26. Kuramochi-Miyagawa, S. et al. MVH in piRNA processing and gene silencing of retrotransposons. Genes Dev. 24, 887–892 (2010).
27. Liu, N., Han, H. & Lasko, P. Vasa promotes Drosophila germline stem cell differentiation by activating mei-P26 translation by directly interacting with a (U)-rich motif in its 3′ UTR. Genes Dev. 23, 2742–2752 (2009).
28. Tanaka, S. S. et al. The mouse homolog of Drosophila Vasa is required for the development of male germ cells. Genes Dev. 14, 841–853 (2000).
29. Attig, J. et al. Heteromeric RNP assembly at LINEs controls lineage-specific RNA processing. Cell 174, 1067–1081.e17 (2018).
30. Lewandoski, M., Wassarman, K. M. & Martin, G. R. Zp3-cre, a transgenic mouse line for the activation or inactivation of loxP-flanked target genes specifically in the female germ line. Curr. Biol. 7, 148–151 (1997).
31. Murchison, E. P. et al. Critical roles for Dicer in the female germline. Genes Dev. 21, 682–693 (2007).
32. Flemr, M. et al. A retrotransposon-driven dicer isoform directs endogenous small interfering RNA production in mouse oocytes. Cell 155, 807–816 (2013).
33. Taborska, E. et al. Restricted and non-essential redundancy of RNAi and piRNA pathways in mouse oocytes. PLoS Genet. 15, e1008261 (2019).
34. Yang, Q. et al. Single-cell CAS-seq reveals a class of short PIWI-interacting RNAs in human oocytes. Nat. Commun. 10, 3389 (2019).
35. Kruger, J. & Rehmsmeier, M. RNAhybrid: microRNA target prediction easy, fast and flexible. Nucleic Acids Res. 34, W451–W454 (2006).
36. Jackson, A. L. et al. Widespread siRNA ‘off-target’ transcript silencing mediated by seed region sequence complementarity. RNA 12, 1179–1187 (2006).
37. Maida, Y., Kyo, S., Lassmann, T., Hayashizaki, Y. & Masutomi, K. Off-target effect of endogenous siRNA derived from RMRP in human cells. Int. J. Mol. Sci. 14, 9305–9318 (2013).
38. Jackson, A. L. et al. Expression profiling reveals off-target gene regulation by RNAi. Nat. Biotechnol. 21, 635–637 (2003).
39. Agarwal, V., Bell, G. W., Nam, J. W. & Bartel, D. P. Predicting effective microRNA target sites in mammalian mRNAs. eLife 4, e05005 (2015).

40. Betel, D., Koppal, A., Agius, P., Sander, C. & Leslie, C. Comprehensive modeling of microRNA targets predicts functional non-conserved and non-canonical sites. Genome Biol. 11, R90 (2010).
41. Li, M. et al. Bub3 is a spindle assembly checkpoint protein regulating chromosome segregation during mouse oocyte meiosis. PLoS ONE 4, e7701 (2009).
42. Chen, L. et al. Checkpoint kinase 1 is essential for meiotic cell cycle regulation in mouse oocytes. Cell Cycle 11, 1948–1955 (2012).
43. Zhang, T. et al. Nuf2 is required for chromosome segregation during mouse oocyte meiotic maturation. Cell Cycle 14, 2701–2710 (2015).
44. Yang, Q. et al. Highly sensitive sequencing reveals dynamic modifications and activities of small RNAs in mouse oocytes and early embryos. Sci. Adv. 2, e1501482 (2016).

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© The Author(s), under exclusive licence to Springer Nature Limited 2021


Methods
LACE-seq method. RNA immunoprecipitation and fragmentation. HeLa cells (American Type Culture Collection (ATCC), CCL-2), K562 cells (ATCC, CCL-243) or mouse oocytes were resuspended in PBS and manually picked under a microscope using needles. A specific number of cells were collected into 1.5-ml LoBind microcentrifuge tubes (Eppendorf, 022431021) and quickly spun down using an IKA mini G centrifuge. The cells were irradiated twice with UV-C light on ice at 400 mJ. The crosslinked samples were stored at −80 °C until use or directly proceeded to the next step.

A total of 10 µl protein A/G magnetic beads (per sample; Thermo Scientific, 26162) was washed twice with 200 μl BSA/PBS solution (0.1% BSA in 1× PBS) and blocked with 200 μl blocking buffer (1× PBS, 0.2 mg ml−1 glycogen and 0.2 mg ml−1 BSA) at room temperature for 1 h. The blocked beads were washed once with 200 µl of 0.1 M sodium-phosphate buffer (93.2 mM Na2HPO4, 6.8 mM NaH2PO4 and 0.05% Tween 20, pH 8.0) and then resuspended in 40 µl sodium-phosphate buffer containing 2 µg specific antibody and rotated at room temperature for 1 h. The antibody-coupled beads were washed twice with 200 µl wash buffer (1× PBS, 0.1% SDS, 0.5% NP-40 and 0.5% sodium deoxycholate) and resuspended in 10 µl of wash buffer per sample.

Cells were lysed on ice using 50 µl wash buffer for 10 min. Next, 1 µl SUPERase·In RNase inhibitor (Ambion, AM2696) and 4 µl RQ1 DNase (Promega, M6101) were applied to the lysate and incubated at 37 °C for 3 min. After snap-chilling the tube on ice for 3 min, 10 µl antibody-coupled beads was added to the lysate and rotated for 1 h at 4 °C. The tube was subsequently placed on a magnetic stand for 1 min and the supernatant was discarded. The beads were washed three times with wash buffer, once with high-salt wash buffer (5× PBS, 0.1% SDS, 0.5% NP-40 and 0.5% sodium deoxycholate) and once with PNK buffer (50 mM Tris-HCl, pH 7.4, 10 mM MgCl2 and 0.5% NP-40).

The immunoprecipitated RNAs were then fragmented by MNase (Thermo Scientific, EN0181) at a dilution factor of 300,000 to 600,000 in 1× MN reaction buffer (50 mM Tris-HCl, pH 8.0 and 5 mM CaCl2). Next, 10 µl of the diluted MNase was added to the thoroughly washed beads and incubated at 37 °C for 3 min. The MNase was quenched by washing beads twice with 1× PNK + EGTA buffer (50 mM Tris-HCl, pH 7.4, 20 mM EGTA and 0.5% NP-40), twice with wash buffer and twice with PNK buffer.

RNA dephosphorylation and 3′ linker ligation. The beads were resuspended in 20 μl FastAP mixture (2 μl 10× FastAP buffer, 1 µl FastAP alkaline phosphatase (Thermo Scientific, EF0651), 17 µl water) and incubated at 37 °C for 10 min. After washing twice with 1× PNK + EGTA buffer, twice with 1× PNK buffer and twice with BSA solution (0.2 mg ml−1 BSA in DEPC water), the beads were resuspended in 20 μl ligation mixture (12.5 μl water, 2 μl 10× ligation buffer, 0.5 μl 3′ linker (1 µM), 1 μl T4 RNA ligase 2, truncated (NEB, M0242), 4 μl 50% PEG8000) and incubated at 25 °C for 2.5 h in a ThermoMixer C with intermittent vortexing for 15 s at 500 r.p.m. every 3 min. The tube was placed on a magnetic stand for 1 min and the supernatant was discarded. The beads were washed three times with 1× PNK buffer.

The 3′ linker (/5rApp/NNNN AGATCGGAAGAG CGTCGTGTAGG GAAAGAGTG T/3ddC/, where 5rApp denotes pre-adenylated 5′ nucleotides, N represents a randomized nucleotide and 3ddC indicates 3′-dideoxycytidine) was synthesized and HPLC-purified by Integrated DNA Technologies.

Reverse transcription on beads. The beads were resuspended in 8.5 µl DEPC water and 1 µl T7-RT primer (0.5 nM; /Biotin/GATCACT AATACGACTCACT ATAGGGACACTCT TTCCCTACACG ACGCTCTTCC GATCT). After denaturation at 65 °C for 5 min and snap-chilling on ice for 2 min, the reverse transcription (RT) mixture (3 μl 5× first-strand buffer, 0.5 μl 0.1 M dithiothreitol (DTT), 0.5 μl Superscript II reverse transcriptase (Life Technologies, 18064014), 0.5 μl RNase inhibitor (Thermo Scientific, EO0381) and 1 μl dNTP mix (10 mM;

To generate the dsDNA template for IVT, the poly(A)-tailed cDNA was first amplified with the following mixture: 12.5 μl cDNA, 0.5 μl second-strand primer (CAAGCAGA AGACGGCA TACGAGATTT TTTTTTTTTTT TTTTTTTVN, 10 μM; Sangon), 0.5 μl primer A (GATCACTAATACGACTCACTATAGG, 10 μM; Sangon) and 12.5 μl 2× KAPA HiFi HotStart ReadyMix (KAPA Biosystems, KK2601). The PCR program was set as follows: 98 °C for 3 min; 98 °C for 15 s, 50 °C for 20 s, and 72 °C for 30 s (14–18 cycles); 72 °C for 5 min, then hold at 12 °C. The PCR tube was subsequently placed on a magnetic stand for 2 min and the supernatant was transferred to a new LoBind tube for purification with 46.8 μl Ampure XP beads (1.8:1 ratio; Beckman Coulter, A63881) as per the manufacturer’s instructions. The PCR products were eluted in 13 μl water and transferred to a new PCR tube.

IVT and RNA purification. The IVT reaction mixture (2 μl 10× reaction buffer, 2 μl NTP (25 mM; NEB, N0466), 1 μl 0.1 M DTT, 0.5 μl RNase inhibitor and 2 μl T7 RNA polymerase (NEB, M0251)) was added to the PCR tube and incubated at 37 °C for 24 h.

For removal of the DNA template, DNase I mixture (3 μl 10× TURBO buffer, 1 μl TURBO DNase (Thermo Scientific, AM2238) and 6 μl water) was added to the IVT solution and incubated at 37 °C for 30 min. The transcribed RNA was purified with 66 μl Agencourt RNA Clean beads (2.2:1 ratio; Beckman Coulter, A63987) per the manufacturer’s protocol. The RNA was eluted in 13 μl nuclease-free water and transferred to a new PCR tube.

RT, PCR barcoding and deep sequencing. The linear amplified RNA was converted into cDNA with 1 μl 10 μM P7 primer (CAAGCAGAAGACG GCATACGAGAT), 4 μl 5× first strand buffer, 1 μl 0.1 M DTT, 0.5 μl Superscript II reverse transcriptase, 0.5 μl RNase inhibitor and 1 μl dNTP mix (10 mM) using the program of 42 °C for 50 min, 70 °C for 15 min, then hold at 12 °C.

PCR was performed with 20 μl cDNA, 1 μl P7 primer, 1 μl P5 index primer (AATGATACGGC GACCACCGAGAT CTACACNNNNN ACACTCTTTCC CTACACGACGCT CTTCCGATCT, Sangon), 3 μl 10× Pfx buffer, 1 μl 50 mM MgSO4, 0.6 μl dNTP (25 mM; Enzymatics, N2050L), 0.8 μl Pfx DNA polymerase (Invitrogen, C11708021) and 2.6 µl water. The PCR program was set as follows: 94 °C for 3 min; 94 °C for 15 s, 62 °C for 30 s, and 72 °C for 30 s (8–12 cycles); 72 °C for 10 min, then hold at 12 °C.

PCR products between 130 and 300 bp were excised from a 2% agarose gel and purified using a gel extraction kit (Qiagen, 28604). The LACE-seq library was single-end sequenced using Illumina HiSeq 2500 at Novogene. The step-by-step protocol for LACE-seq is provided at Nature Protocol Exchange45.

LACE-seq data mapping. The adapter sequences and poly(A) tails at the 3′ end of raw reads were removed using Cutadapt (v.1.15)46 with two parameter sets: -f fastq -q 30,0 -a ATCTCGTATGCCGTCTTCTGCTT -m 18 --max-n 0.25 --trim-n, and -f fastq -a A{15} -m 18 -n 2. Clean reads were first aligned to human or mouse pre-rRNA using Bowtie software (v.1.2.3)47, and the remaining unmapped reads were then aligned to the human (hg19) or mouse (mm9) reference genome. For LACE-seq data mapping, two mismatches were allowed (Bowtie parameters: -v 2 -m 10 --best --strata; -v 2 -k 10 --best --strata). Pearson’s correlation coefficient between LACE-seq replicates or between LACE-seq and bulk CLIP-seq samples was calculated as previously described23.

Peak, cluster and motif identification. We used Piranha software (http://smithlabresearch.org/software/piranha/, v.1.2.1) to identify peaks in HeLa cells. The parameters were as follows: -s -p 0.001 -b 20 -d ZeroTruncatedNegativeBinomial. Ago2–Mili LACE-seq clusters were defined as previously described48. For motif analysis, LACE-seq peaks/clusters were first
NEB, N0447)) was directly added to the PCR tube, incubated at 42 °C for 50 min, extended 30 nt to 5′ upstream, and overrepresented hexamers in the extended
70 °C for 15 min, then held at 12 °C. Subsequently, 2 μl of 10× Exonuclease I buffer sequences were identified as previously described48. Ddx4 in vitro SELEX-enriched
and 3 μl Exonuclease I (NEB, M0293) were applied to the cDNA mixture and RNAs within the randomized 25-nt sequences were used for deducing hexamers.
incubated at 37 °C for 1 h and 80 °C for 20 min. The consensus motifs were generated from the top-20 enriched hexamers using
WebLogo (https://weblogo.berkeley.edu/logo.cgi).
First-strand cDNA capture by streptavidin beads. The first-strand cDNA was
released from the Protein A/G beads by adding 3 μl 10× RNase H buffer, 1 μl RNase Sensitivity and precision of PTBP1 LACE-seq. In Extended Data Fig. 1f,g, the
H and 6 μl water, and incubated at 37 °C for 30 min and 65 °C for 20 min. The PCR sensitivity was defined as the percentage of target genes revealed by bulk CLIP-seq
tube was placed on a magnetic stand for 1 min. The supernatant that contained the that LACE-seq also captured. The precision was defined as the percentage of target
released cDNAs was transferred to a 1.5-ml LoBind tube. Next, 5 µl streptavidin C1 genes captured by LACE-seq that belonged to the set of target genes identified by
beads (Invitrogen, 650002) in 1× B & W buffer (5 mM Tris-HCl, pH 7.5, 0.5 mM bulk CLIP-seq.
EDTA and 1 M NaCl) was added to the tube and incubated at room temperature
for 30 min and occasionally mixing every 5 min. The supernatant was discarded Small RNA-seq. The Ago2-IP-enriched small RNA library was created and
and the beads were washed twice with 200 µl 1× B & W buffer and once with 200 µl analysed as previously described34. Clean reads longer than 17 nt were then
BSA solution. mapped to miRNAs, tRNAs, rRNAs, sn/snoRNAs, endo-siRNA and piRNAs
sequentially. Repeat elements in the mouse reference genome were re-defined by
Poly(A) tailing and pre-PCR. The streptavidin beads were resuspended in 9 μl water integrating the annotation of retrotransposon coordinates from Choi et al.,49 and
and transferred to a new PCR tube. Poly(A) tailing was performed by adding TdT the RepeatMasker genomic datasets (http://www.repeatmasker.org/). Endo-siRNAs
mixture (1.25 μl 10× terminal transferase buffer, 1.25 μl CoCl2, 0.5 μl terminal were mapped to the mouse reference genome and classified as sense-LTR,
transferase (NEB, M0315) and 0.5 μl dATP (0.2 μM; NEB, N0440)) directly to the sense-LINE, sense-SINE, antisense-LTR, antisense-LINE, antisense-SINE or other
PCR tube and incubated at 37 °C for 8 min and 70 °C for 10 min. types according to the location and strand bias.
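
The repeat-based classification of endo-siRNAs described under Small RNA-seq can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the interval representation, the names Interval, Repeat and classify_read, and the rule that "sense" means the read maps to the same strand as the overlapping repeat element are assumptions made for the example.

```python
from typing import NamedTuple, Sequence


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


class Repeat(NamedTuple):
    region: Interval
    family: str  # e.g. "LTR", "LINE" or "SINE"


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def classify_read(read: Interval, repeats: Sequence[Repeat]) -> str:
    """Label a mapped small RNA by the first overlapping repeat and its orientation.

    Assumption: 'sense' = same strand as the repeat annotation.
    Reads that overlap no LTR/LINE/SINE element are labelled 'other'.
    """
    for rep in repeats:
        if rep.family in ("LTR", "LINE", "SINE") and overlaps(read, rep.region):
            orientation = "sense" if read.strand == rep.region.strand else "antisense"
            return f"{orientation}-{rep.family}"
    return "other"
```

A linear scan is enough for a toy example; a genome-scale run would normally use an interval index or a tool such as bedtools instead.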

Base-pairing preference and GO analysis. RNAhybrid was employed to predict
potential duplexes formed between endo-siRNAs (reads per million (RPM) > 1;
3,550) and class I or class II Ago2 cluster sequences. The 263 miRNAs expressed in
mouse oocytes were also used for the analysis. The base-pairing preference of each
small RNA was calculated by the free energy change for all the potential hybrids
formed within these two classes of Ago2 clusters and by the proportion change for
clusters that could form hybrids with each small RNA. GO analysis was performed
using clusterProfiler (v.3.6.0)50.

SELEX and gel-shift assay. Ddx4 protein was expressed in Rosetta
(DE3)-competent cells and purified using a His GraviTrap column (GE Healthcare
Life Sciences, 11-0033-99). Input RNA was generated by IVT using the following
dsDNA: GATCACTAATACGACTCACTATAGGGTACACGACGCTCTTCCGATCT (N25)
AGATCGGAAGAGCACACGTCT (N indicates randomized
nucleotides).
For SELEX, 5.8 pmol of Ddx4 was first equilibrated in 250 µl binding buffer
(25 mM Tris pH 7.5, 150 mM KCl, 3 mM MgCl2, 0.01% Tween 20, 1 mg ml−1 BSA,
1 mM DTT and 40 U SUPERase·In RNase inhibitor) for 30 min. After adding input
RNA to a final concentration of 1 µM and incubating at room temperature for
1 h, 40 µl protein A/G magnetic beads pre-coupled with Ddx4 antibody (Abcam,
ab108392) were applied to pull down Ddx4-bound RNAs. The enriched RNAs
were subsequently purified by proteinase K digestion and phenol–chloroform
extraction for generating DNA pools to make new input RNAs. We repeated the
selection three times.
Different amounts of Ddx4 protein (0–1 µM) were incubated with 0.3 pmol
of 32P-labelled RNA probe in gel-shift buffer (10 mM Tris-HCl, pH 8.0, 25 mM
NaCl, 0.1 mM EDTA, 0.1 mg ml−1 tRNA and 5 μg ml−1 heparin) at 30 °C for 15 min
before loading on a 6% polyacrylamide gel. Gels were run at 150 V for 3 h in 0.5×
TBE buffer and exposed for imaging with a Typhoon FLA 7000 scanner (GE
Healthcare).

IVT. Renilla luciferase fragments containing endo-siRNA target sites and a firefly
luciferase fragment were cloned into a pCS2+ vector for IVT. The transcribed
RNAs were capped with mMESSAGE mMACHINE kits (Ambion, AM1340 and
AM1344) and subsequently polyadenylated using a Poly(A) polymerase tailing kit
(Epicentre, PAP5104H). The Ago2 and Ago2ADH coding sequences were also cloned
into pCS2+ plasmid (primers listed in Supplementary Table 10).

Oocyte collection, culture and microinjection. The Animal Use and Care
Committee of the Institute of Zoology approved the animal procedures and oocyte
collection. Female mice aged 6–8 weeks were injected with 10 IU of pregnant
mare serum gonadotropin (PMSG) for 42–48 h. Fully grown GV-intact oocytes
were collected into M2 medium (Sigma, M7167) containing 2.5 μM milrinone
(Sigma, M4659). The oocytes were microinjected with 5–10 pl mRNA using a
Nikon Narishige microinjector and incubated in M2 medium containing 2.5 μM
milrinone in a CO2 incubator at 37 °C for 24 h. After washing in fresh M2 medium,
the oocytes were cultured in vitro to resume and complete the first meiosis for
12–13 h. In vivo-matured MII oocytes were collected from mice primed with 10 IU
PMSG for 42–48 h and 10 IU human chorionic gonadotropin (hCG) for 13 h.

Quantitative PCR with RT and luciferase assays. Oocyte RNA was extracted
using a Dynabeads mRNA DIRECT Micro kit (Life Technology, 61021) and
converted into cDNA using All-In-One RT MasterMix (Abm, G485). Quantitative
PCR with RT (RT–qPCR) was performed with FastStart Universal SYBR Green
Master mix (CWbiotech, CW0957S) and gene-specific primers in a LightCycler
480 (Roche).
Luciferase reporters were constructed by inserting the Ago2-bound fragments
into the psiCHECK-2 vector between XhoI and NotI restriction sites. HEK293
cells (ATCC, CRL-1573) were seeded in 24-well plates and transfected with a
mixture containing 20 ng reporter plasmid and 20 pmol endo-siRNA mimics
(GenePharma) using Lipofectamine 2000 (Life Technology, 11668019). Luciferase
activity was measured using a Dual-Luciferase Reporter Assay kit (Promega,
E1910) on a Veritas Microplate Luminometer (Promega).

Antibodies. Anti-GAPDH (CST, 2118), anti-β-tubulin (CST, 2128), anti-Calm1
(Abcam, ab45689), anti-Bub3 (CST, 3049) and anti-Nuf2 (BETHYL, A304-319A)
antibodies were used for western blot analysis. Anti-Ddx4 (Abcam, ab108392),
anti-Ago2 (Abcam, ab186733), anti-PTBP1 (Abcam, 133734), anti-Mili (CST,
2071) and anti-PTBP1 (monoclonal BB7) antibodies were used for LACE-seq
analysis. Rat anti-Ago2 antibody (Sigma, SAB4200085) was used for single-cell
CAS-seq analysis.

IF. GV oocytes injected with Ago2 or Ago2ADH mRNA were maintained in
2.5 μM milrinone for 12 h and subsequently washed in milrinone-free medium
to allow resumption of meiosis. After further culture for 6.5 h, the oocytes were
collected for IF analysis as we previously described41. The following antibodies were
used: anti-Bub3 (Abclonal, A8831; 1:50); anti-Myc–FITC (Invitrogen, R953-25;
1:500); Alexa-Fluor-594-conjugated goat anti-rabbit secondary antibody
(Invitrogen, A-11012; 1:1,000); and anti-α-tubulin–FITC (Sigma, F2168; 1:200).
Notably, α-tubulin staining was performed on MII oocytes collected from mice
superovulated with PMSG and hCG. The oocytes were mounted on glass slides and
examined using a confocal microscope (Zeiss LSM 880 META).

Simple western analysis. About 50–80 oocytes were collected in 4.5 µl loading
buffer and denatured at 95 °C for 5 min. After cooling on ice for 5 min, 3 µl lysate
was loaded onto a Wes Separation Module microplate (SM-W002) for analysis
according to the manufacturer’s instructions (ProteinSimple). Calm1, Nuf2, Bub3,
GAPDH and β-tubulin were detected using anti-Calm1 (1:20; Abcam, ab45689),
anti-Nuf2 (1:10; Bethyl, A304-319A), anti-Bub3 (1:5; CST, 3049), anti-GAPDH
(14C10, 1:20; CST, 2118) and anti-β-tubulin (1:25; CST, 2128) antibodies,
respectively. Relative protein levels were quantified using ImageJ (v.1.52a).

Ago2 cKO mice. Ago2loxP/loxP mice were generated by the animal core facility of
the Institute of Biophysics. A mixture of Cas9 mRNA, sgRNA and loxP sequences
containing pLSODN-1 donor plasmids was microinjected into C57BL/6 fertilized
eggs. The injected zygotes were then transferred into the uterus of pseudopregnant
ICR females. Targeted alleles were identified by PCR and Sanger sequencing.
Ago2loxP/+ lines were crossed with Zp3-Cre mice, and their progeny were
intercrossed to produce Ago2loxP/loxP, Zp3−cre female mice.

Single-cell RNA-seq. Control (Ago2loxP/+) and cKO oocytes (Ago2loxP/loxP, Zp3−cre)
from littermate mice were individually collected for RNA-seq analysis following
the Smart-seq2 protocol51. Significantly changed TEs were screened using
TEtranscripts software (v.2.1.4)52. DESeq2 (v.1.24.0)53 was used to identify
differentially expressed genes. Based on the gene expression profiles, individual
samples were clustered by PCA using the function prcomp in R. To predict
LTR-driven chimeric transcripts, uniquely mapped reads were subjected to
StringTie (v.2.0.4)54 for de novo transcriptome assemblies with the following
parameters: -f 0.05 -a 3 -M 0.25. A newly assembled transcript whose first
exon overlapped with an annotated LTR was selected as an LTR-driven chimeric
transcript. The FPKM of chimeric and endogenous transcripts was computed using
StringTie software.

LC–MS/MS analysis of mouse oocytes. Oocytes were lysed with 100 μl RIPA
cleavage buffer on ice for 30 min and then centrifuged at 16,000 × g for 30 min. The
supernatant was precipitated with 34 μl trichloroacetic acid solution for 4 h at 4 °C.
The precipitate was resuspended in 20 μl of 8 M urea (500 mM Tris-HCl, pH 8.5)
and sequentially treated with Tris(2-carboxyethyl)phosphine hydrochloride and
iodoacetamide (Sigma). After digestion by trypsin overnight, the resulting peptides
were desalted using a Monospin C18 column (GL Sciences) and redissolved into
0.1% formic acid.
Data were acquired using a timsTOF Pro mass spectrometer (Bruker
Daltonics) in data-dependent mode with a 120-min nonlinear LC gradient and
300 nl min−1 flow rate. We set the accumulation and ramp times to 100 ms each and
recorded mass spectra in the range of m/z 100–1,700 in positive electrospray mode.
The ion mobility was scanned from 0.6 to 1.6 Vs cm−2. The overall acquisition
cycle was 1.16 s consisting of one full TIMS-MS scan and 10 PASEF MS/MS scans.
The acquired MS/MS data were analysed against a UniProt mouse database and a
home-built Introns Database using Peaks Online X (v.1.2.1010.85). Mass tolerances
for precursor ions were set at 10 ppm, and the fragment mass error tolerance was
set as 0.05 Da.

Statistics and reproducibility. All experiments were independently repeated at
least twice, and no inconsistent results were observed. Statistical analyses were
carried out using GraphPad software or RStudio. Data are presented as the
mean ± s.e.m. The box borders in the boxplots represent lower and upper quartiles
(25th and 75th percentiles, respectively), and the centre line denotes the median.
The statistical tests and P values are indicated in the figure legends. P values of
<0.05 were considered significant. All data were reproducible, and details of
replicates are stated in the figure legends.

Reporting Summary. Further information on research design is available in the
Nature Research Reporting Summary linked to this article.

Data availability
All the sequencing data generated in this paper have been deposited in the Gene
Expression Omnibus under accession number GSE137925. The MS data have been
deposited in ProteomeXchange with the primary accession code PXD025846.
Previously published CLIP-seq, iCLIP, irCLIP, eCLIP, sCLIP and tRIP-seq data that
were re-analysed here are available under accession codes GSE42701, E-MTAB-3108,
GSE78832, GSE92205, GSE92995 and DRA005743, respectively. RNA-seq for WT
and DicerSOM/SOM oocytes were downloaded from the Gene Expression Omnibus
database under accession number GSE132121. The small RNA-seq data were
downloaded from the Sequence Read Archive database under accession number
SRP045287. The UniProt mouse database was downloaded from
https://www.uniprot.org/uniprot/?query=mouse&fil=reviewed%3Ayes&sort=score. Source
data are provided with this paper. All other data supporting the findings of this study
are available from the corresponding authors upon reasonable request.


Code availability
The custom code for analysing LACE-seq data is available at GitHub at
https://github.com/caochch/LACEseq.

References
45. Su, R., Wang, D., Cao, C. & Xue, Y. Profiling the binding sites of
RNA-binding protein by LACE-seq. Protoc. Exch. https://doi.org/10.21203/rs.3.pex-1499/v1 (2021).
46. Martin, M. Cutadapt removes adapter sequences from high-throughput
sequencing reads. EMBnet. J. 17, 10–12 (2011).
47. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and
memory-efficient alignment of short DNA sequences to the human genome.
Genome Biol. 10, R25 (2009).
48. Xue, Y. et al. Genome-wide analysis of PTB–RNA interactions reveals a
strategy used by the general splicing repressor to modulate exon inclusion or
skipping. Mol. Cell 36, 996–1006 (2009).
49. Choi, Y. J. et al. Deficiency of microRNA miR-34a expands cell fate potential
in pluripotent stem cells. Science 355, eaag1927 (2017).
50. Yu, G., Wang, L.-G., Han, Y. & He, Q.-Y. clusterProfiler: an R package for
comparing biological themes among gene clusters. OMICS 16, 284–287
(2012).
51. Picelli, S. et al. Full-length RNA-seq from single cells using Smart-seq2. Nat.
Protoc. 9, 171–181 (2014).
52. Jin, Y., Tam, O. H., Paniagua, E. & Hammell, M. TEtranscripts: a package for
including transposable elements in differential expression analysis of RNA-seq
datasets. Bioinformatics 31, 3593–3599 (2015).
53. Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and
dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
54. Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome
from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

Acknowledgements
This work was supported by the Ministry of Science and Technology of China
(2017YFA0504400), the Strategic Priority Program of CAS (XDB37000000) and the
National Natural Science Foundation of China (32025008, 91740201, 91940306 and
81921003) to Y.X.; by the National R&D Program (2018YFA0107701) to Q.-Y.S.;
by the Fundamental Research Funds for the Central Universities (BMU2017YJ003)
and the Outstanding Technology Talent Program of Chinese Academy of Sciences
(BMU2018XTZ002) to C.C.L.W.; by the Beijing Municipal Natural Science Foundation
(5182024) grant to C.C.; and by the Young Scientists Fund of the National Natural
Science Foundation of China to C.C. and L. Wang (31900465, 31701109).

Author contributions
Y.X. conceived the project and designed the experiments. R.S. developed the
LACE-seq method, performed the dual-luciferase assay and western blotting and
prepared the small RNA-seq samples (with help from L. Wu). Z.C. performed the
gel-shift experiments. L.-H.F., Y.-C.O., Y.W., W.-L.L. and Q.Z. collected the mouse
oocytes, prepared the luciferase mRNA and completed the injection experiments
(under the guidance of Q.-Y.S.). N.Z. performed the proteomics analysis of mouse
oocytes (under the guidance of C.C.L.W.). X.Z. generated the Ago2loxP/loxP mice
(under the supervision of Y.T.). C.C., L. Wang, H.Z. and Z.D. performed the
bioinformatics analysis (under the guidance of S.H.). Y.X. wrote the manuscript
with help from C.C., N.Z. and L.-H.F.

Competing interests
The authors declare no competing interests.

Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41556-021-00696-9.
Supplementary information The online version contains supplementary material
available at https://doi.org/10.1038/s41556-021-00696-9.
Correspondence and requests for materials should be addressed to C.C.L.W., Q.-Y.S.
or Y.X.
Peer review information Nature Cell Biology thanks Haruhiko Siomi, Wayne Miles and
the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Reprints and permissions information is available at www.nature.com/reprints.


Extended Data Fig. 1 | Validation of the LACE-seq method. a, Bioanalyzer 2100 analysis of the PTBP1 LACE-seq libraries generated from different numbers
of HeLa cells. b, Scatter plots showing gradually reduced correlations between PTBP1 LACE-seq libraries generated from the sequentially decreased
number of HeLa cells. Pearson’s R is indicated. Two independent LACE-seq experiments are shown here. c, PTBP1 LACE-seq reads were highly correlated
with previously published bulk CLIP-seq data. d, PTBP1 peaks containing a higher percentage of cDNA end covering reads usually have a higher CU-rich
motif density. The number of peaks is 1403, 2562, 3140, 4042, 5003, 4442, 4702, 4679, 4151, and 7461 from left to right. e, The CU-rich motif density
was significantly higher in PTBP1 peaks containing more sequencing reads and termination sites. Peaks were classified into four equal categories (Q1 to
Q4, average 10235 peaks for each quarter) based on the read density. P-values were calculated by two-tailed unpaired Student’s t-test. f-g, The sensitivity
and precision of LACE-seq compared with targets generated from bulk CLIP-seq (GSE42701). The sensitivity decreased along with the reduced cell
numbers, but the precision showed little change. h, UCSC genome browser views of LACE-seq reads on known PTBP1 targets GPRC5A and HMGA1. i, CDF
plot showing the absolute value of delta percent-spliced-in (PSI) for cassette exons was significantly changed upon PTBP1 knockdown. Cassette exons
and adjacent introns without PTBP1 binding were defined as the ‘Non-targets’ group. The PSI of each cassette exon was calculated based on the RNA-seq
data (GSE42701) of wild-type (WT) and PTBP1 knockdown (KD) HeLa cells. P-values were calculated by one-tailed Kolmogorov-Smirnov test. j, Boxplots
showing the abundance of PTBP1 target genes identified by CLIP-seq and LACE-seq. FPKM: fragments per kilobase of transcript per million mapped reads.
The number of target genes is 7292, 4806, 1686, 1110, 208, and 307 from left to right. For box plots in d, e, and j, the centre line represents the median, the
box borders represent the first (Q1) and third (Q3) quartiles, and the whiskers are the most extreme data points within 1.5× the interquartile range (from
Q1 to Q3). Data in c-j represent results from two independent LACE-seq experiments.
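
Panels f and g rely on the sensitivity and precision definitions given in the Methods: sensitivity is the percentage of bulk CLIP-seq target genes that LACE-seq also captured, and precision is the percentage of LACE-seq target genes that belong to the bulk CLIP-seq set. A minimal sketch of that arithmetic follows; the function name and the gene identifiers in the usage note are hypothetical.

```python
from typing import Iterable, Tuple


def sensitivity_precision(lace_genes: Iterable[str],
                          clip_genes: Iterable[str]) -> Tuple[float, float]:
    """Return (sensitivity %, precision %) of LACE-seq against bulk CLIP-seq.

    sensitivity = shared targets / CLIP-seq targets * 100
    precision   = shared targets / LACE-seq targets * 100
    """
    lace, clip = set(lace_genes), set(clip_genes)
    shared = lace & clip
    sensitivity = 100.0 * len(shared) / len(clip) if clip else float("nan")
    precision = 100.0 * len(shared) / len(lace) if lace else float("nan")
    return sensitivity, precision
```

For example, with four hypothetical LACE-seq targets of which three are among eight CLIP-seq targets, the function returns a sensitivity of 37.5 and a precision of 75.0. A small LACE-seq target set that is mostly contained in the CLIP-seq set lowers sensitivity but not precision, which is the pattern the caption describes for reduced cell numbers.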

Extended Data Fig. 2 | Comparison between LACE-seq and CLIP-based methods. a,b, LACE-seq tends to produce fewer PCR duplicates than other
methods at high or low sequencing depths. c, Mapping rate comparison for LACE-seq, iCLIP, eCLIP, irCLIP, CLIP-seq, sCLIP, and tRIP-seq methods.
All CLIP variant-generated datasets were mapped to the genome using the same pipeline. d, PTBP1 LACE-seq-specific peaks (red line) also showed
accumulated CU-rich motifs compared with random controls (blue line). We picked 30 M of PTBP1 LACE-seq and irCLIP reads and generated an equal
number of randomized peaks for such comparisons. e, Western blot showing the knockdown (KD) efficiency of PTBP1 in K562 cells. The experiment
was independently repeated twice with similar results. f, Heatmap showing the LACE-seq signal around the peaks before and after PTBP1 knockdown in
K562 cells. The scale stands for the number of LACE-seq reads per million. g, Metaprofile of the PTBP1 LACE-seq signal around the identified peaks. h,
LACE-seq achieved a higher signal-to-noise ratio than eCLIP in K562 cells. The dashed line represents the cutoff of the two-fold signal-to-noise ratio. The
P-value was calculated by two-tailed unpaired Student’s t-test. i, LACE-seq captured more bulk CLIP-seq-revealed target genes than irCLIP-seq with the
same number of reads. j, The number of target genes identified by LACE-seq and irCLIP that could be confirmed by bulk CLIP-seq data. k, The sensitivity
of LACE-seq is higher than that of irCLIP. The sensitivity was calculated by counting how many target genes revealed by bulk CLIP-seq could be captured
by LACE-seq and irCLIP, along with different sequencing read inputs. l, The precision of LACE-seq is better than that of irCLIP. The precision was calculated by
comparing the identified targets of LACE-seq and irCLIP to bulk CLIP-seq. Identical numbers of reads were randomly downsampled 10 times in i, j, k, and l
(n = 10). Data are mean ± s.e.m., P-values in i, j, k, and l were calculated by two-tailed paired Student’s t-test. LACE-seq data in d and f-l represent results
from two independent experiments.
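
The repeated downsampling behind panels i to l (ten random draws of an identical number of reads, summarized as mean ± s.e.m.) can be illustrated as below. This is a sketch under stated assumptions: reads are held in an in-memory list, the metric callable stands in for whatever is computed per draw (for instance the number of confirmed target genes), and the names mean_sem and downsample_metric are hypothetical.

```python
import math
import random
import statistics
from typing import Callable, Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n) if n > 1 else float("nan")
    return mean, sem


def downsample_metric(reads: Sequence, n_reads: int,
                      metric: Callable[[Sequence], float],
                      n_draws: int = 10, seed: int = 0) -> Tuple[float, float]:
    """Draw n_reads without replacement n_draws times and summarize the metric."""
    rng = random.Random(seed)  # fixed seed so the draws are reproducible
    scores = [metric(rng.sample(reads, n_reads)) for _ in range(n_draws)]
    return mean_sem(scores)
```

The paired t-test named in the caption is not implemented here; it would compare the per-draw scores of the two methods.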


Extended Data Fig. 3 | LACE-seq methodology optimization and validation in oocytes using Ddx4 antibody. a, A meta-analysis of Ddx4 LACE-seq
libraries generated by cutting with a series of diluted micrococcal nuclease (from 1:120K to 1:15M). IgG served as a control. TSS: transcription start site;
TES: transcription end site. b, The number of peaks detected in the Ddx4 libraries. Four independent LACE-seq experiments are shown in a and b. c,
Heatmap showing the correlation of six Ddx4 LACE-seq datasets generated from different numbers of mouse oocytes. IgG samples were generated from
ten oocytes. The colour intensity indicates the scale of Pearson’s correlation coefficient. d, Ddx4-RNA interacting sites were enriched around the start
codons (left), stop codons (middle), and poly(A) sites (right). The IgG sample is shown as the blue line. e, Schematic diagram of the in vitro SELEX strategy
to enrich Ddx4 preferentially bound RNA sequences. f, Ddx4-binding motifs deduced from the in vitro SELEX enriched reads. Top: in vitro SELEX-deduced
consensus motif for Ddx4. Bottom: relative enrichment of U-rich or GC-rich sequences around the LACE-seq peaks of Ddx4. LACE-seq data in d and f
represent results from six independent experiments.
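
The hexamer step behind panel f (deducing enriched hexamers from SELEX-selected sequences and keeping the top 20 for the sequence logo) can be sketched as follows. The scoring used in the paper follows a previously described method that is not restated here, so the log2 frequency ratio against a background pool, the pseudocount, and the function names are assumptions chosen for illustration.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def kmer_counts(seqs: Iterable[str], k: int = 6) -> Counter:
    """Count all overlapping k-mers, treating DNA 'T' as RNA 'U'."""
    counts: Counter = Counter()
    for seq in seqs:
        seq = seq.upper().replace("T", "U")
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def top_enriched(selected: Iterable[str], background: Iterable[str],
                 k: int = 6, top: int = 20,
                 pseudo: float = 1.0) -> List[Tuple[str, float]]:
    """Rank k-mers by log2(frequency in selected / frequency in background)."""
    sel, bg = kmer_counts(selected, k), kmer_counts(background, k)
    sel_total, bg_total = sum(sel.values()), sum(bg.values())
    n_kmers = 4 ** k  # size of the k-mer space, used to smooth the frequencies
    scores = {}
    for kmer, count in sel.items():
        f_sel = (count + pseudo) / (sel_total + pseudo * n_kmers)
        f_bg = (bg[kmer] + pseudo) / (bg_total + pseudo * n_kmers)
        scores[kmer] = math.log2(f_sel / f_bg)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]
```

The ranked hexamers could then be aligned and passed to a logo tool such as WebLogo, as the Methods describe.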


Extended Data Fig. 4 | Analysis of PTBP1, Ago2, and Mili binding features in mouse oocytes. a, PTBP1- and Ago2-binding motifs identified by LACE-seq
in mouse oocytes. b, Ago2 LACE-seq reads mapping to known endo-siRNA loci. c, Heatmap showing the LACE-seq signal around the identified peaks
before and after Ago2 knockout in oocytes. The scale stands for the number of LACE-seq reads per million. d, Metaprofile of Ago2 LACE-seq signals
from control (ctrl) or Ago2−/− mouse oocytes around the identified peaks. e, An example of abolished Ago2 binding at the transcript Tuba1a in Ago2-null
oocytes. f, The number of reads and the identified peaks by the LACE-seq protocol with or without IVT steps from different cell inputs. LACE-seq data in
c-f represent results from two independent experiments. g, Gel images and bar graphs showing that the yield of the LACE-seq library is strictly dependent
on the IVT step if starting with 50 mouse oocytes. Data are mean ± s.e.m.; n = 3 or 4 biological replicates, two-tailed unpaired Student’s t-test. h,
Saturation analysis of the identified Ago2 peaks in mouse oocytes. ‘Fraction’ indicates the percentage of reads randomly selected as input. i, Snapshot
of Ago2 and Mili LACE-seq signals on mRNA specifically bound by Ago2 or by both Ago2 and Mili. Repeat elements are shown as black boxes at the
bottom. j, GO analysis of Ago2-specific targets. LACE-seq data in a, b, and h–j represent results from three independent experiments.


Extended Data Fig. 5 | The meiotic defects of oocytes in Ago2 conditional knockout mice. a, Schematic of the Ago2 conditional knockout (denoted as
cKO or Ago2−/−, Ago2loxP/loxP; Zp3−cre) strategy in mouse oocytes. Two loxP sites inserted at the adjacent intronic regions of exon 3 are shown as red triangles.
The agarose gel shown in the bottom panel is the genotyping result with two primers (F+R) flanking both sides of one loxP site. The experiment was
independently repeated three times with similar results. b, Over 80% of the Ago2 cKO oocytes showed spindle defects compared with oocytes from
control littermates (Ago2loxP/+). Ctrl, control, n = 68; cKO, n = 75. c, The classification and percentage of abnormal phenotypes in Ago2−/− and Dicer−/−
oocytes. The representative phenotypes are shown above the column. α-tubulin, green; DAPI, blue. d, Clustering analysis of single-cell RNA-seq data
generated from control (Ctrl) and Ago2 conditional knockout (cKO) oocytes. e, Ago2 cKO samples cluster together and separately from control oocytes
in PCA. f, Percentage of Ago2-bound (red) and Ago2-unbound (blue) transcripts revealed by LACE-seq. The upregulated, downregulated, and unchanged
transcripts revealed by RNA-seq were further classified into Ago2-bound or Ago2-unbound groups. g, Scatter plot showing the abundance of Ago2 targets
in wild-type (WT) and DicerO knockout (DicerSOM/SOM) oocytes. The percentage of unchanged targets is listed. h, Scatter plot showing that most Mili targets
are not changed in Ago2−/− oocytes. Mili-specific targets are marked in red. i, Boxplot showing that the upregulated Mili targets have a higher ratio of Ago2
occupancy to Mili than other targets. The P-value was calculated by one-tailed unpaired Student’s t-test. The centre line represents the median, the box
borders represent the first (Q1) and third (Q3) quartiles, and the whiskers are the most extreme data points within 1.5× the interquartile range (from Q1 to
Q3). Data in d-f, h, and i represent results from five independent scRNA-seq experiments.
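
The box plot convention stated in this caption (median centre line, Q1 and Q3 box borders, whiskers at the most extreme data points within 1.5× the interquartile range) can be written out as a short sketch. Quartile estimates differ slightly between software packages; the "inclusive" interpolation used below is an assumption and may not match the plotting software used for the figures.

```python
import statistics
from typing import Dict, Sequence


def box_stats(values: Sequence[float]) -> Dict[str, object]:
    """Quartiles, whisker ends and outliers for a Tukey-style box plot."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # most extreme point within the lower fence
        "whisker_high": inside[-1],  # most extreme point within the upper fence
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }
```

Note that the whiskers end at observed data points, not at the fences themselves, so a single extreme value such as 100 in a series of single-digit numbers is reported as an outlier and does not stretch the whisker.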


Extended Data Fig. 6 | LTR-driven chimeric transcripts in mouse oocytes. a, The number of upregulated (Up) and downregulated (Down) TEs revealed
by RNA-seq in Ago2-null oocytes. Red: upregulated TEs; black: downregulated TEs. b, Boxplot showing that the median ratio of LTR-driven chimeric
transcripts (n = 244) to endogenous transcripts is increased by 2.88-fold in Ago2−/− oocytes. The median value is shown in the boxes. c, Strategy for
analysing naturally formed dsRNA transcripts. d, The number of LTR-driven transcripts and naturally occurring double-stranded RNAs among
the upregulated and downregulated genes, as revealed by RNA-seq. Red: upregulated genes; black: downregulated genes. e, Boxplot showing that the
downregulated LTR-driven transcripts (n = 75) tend to have more Mili binding than Ago2 binding compared with upregulated chimeric transcripts
(n = 244). f, Boxplot showing that the downregulated LTR-driven transcripts (n = 75) tend to have more Dicer binding than Ago2 binding compared with
upregulated chimeric transcripts (n = 244). The Dicer-binding density at a specific transcript was quantified by normalizing the endo-siRNA amounts
derived from small RNA-seq to the transcript length. g, Boxplot showing that the downregulated naturally formed dsRNA transcripts (n = 89) in Ago2-null
oocytes also tend to be downregulated in DicerSOM/SOM oocytes. h, GO analysis of LTR-driven chimeric transcripts in MII oocytes. P-values in b, e, f, and g
were calculated by two-tailed unpaired Student’s t-test. For the box plots in b, e, f, and g, the centre line represents the median, the box borders represent
the first (Q1) and third (Q3) quartiles, and the whiskers are the most extreme data points within 1.5× the interquartile range (from Q1 to Q3). Data in a, b,
and d-h represent results from five independent scRNA-seq experiments.
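
The selection rule for LTR-driven chimeric transcripts given in the Methods (an assembled transcript whose first exon overlaps an annotated LTR) can be sketched as follows. The data layout and function names are hypothetical, and the sketch assumes that "first exon" means the most 5′ exon in transcript orientation and that no strand match with the LTR is required, since the text does not specify either point.

```python
from typing import List, Sequence, Tuple

Exon = Tuple[int, int]      # (start, end), 0-based half-open
Ltr = Tuple[str, int, int]  # (chrom, start, end), 0-based half-open


def first_exon(exons: List[Exon], strand: str) -> Exon:
    """Return the most 5' exon: leftmost on '+', rightmost on '-'."""
    ordered = sorted(exons)
    return ordered[0] if strand == "+" else ordered[-1]


def is_ltr_driven(chrom: str, strand: str, exons: List[Exon],
                  ltrs: Sequence[Ltr]) -> bool:
    """True if the first exon shares at least one base with any annotated LTR."""
    start, end = first_exon(exons, strand)
    return any(c == chrom and s < end and start < e for c, s, e in ltrs)
```

In practice the exon coordinates would come from the assembled GTF and the LTR coordinates from the repeat annotation; both would need to be converted to the same coordinate convention first.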


Extended Data Fig. 7 | Endo-siRNA targeting rules in mouse oocytes. a, Ago2-IP enriched small RNAs. b, Boxplots showing that endo-siRNAs (n = top
200) are preferentially loaded into Ago2 over miRNAs (n = top 200) due to their relative abundance in oocytes. P-values were calculated by two-tailed
Wilcoxon test. The centre line represents the median, the box borders represent the first (Q1) and third (Q3) quartiles, and the whiskers are the most
extreme data points within 1.5× the interquartile range (from Q1 to Q3). c, Scatter plot showing the preference of endo-siRNA base-paired with class I or
class II Ago2 clusters. Each point represents an endo-siRNA, and different colours indicate the source for endo-siRNA. The horizontal axis represents the
difference in the average MFE of all the potential hybrids formed between each class of clusters and a given endo-siRNA. The vertical axis represents the
difference in the proportion of clusters that could form hybrids with each endo-siRNA. As a positive control, miRNAs mostly pair with class II clusters.
d, Endo-siRNAs paired with class II RNAs have a significantly lower MFE value than random controls. Two-tailed unpaired Student’s t-test was used to
calculate the P-value. Ago2-IP enriched smRNA-seq data in a-d represent results from a single experiment. e, The single-nucleotide mutation at the seed
region compromises the repression mediated by endo-siRNA-336 in oocytes. Data are mean ± s.e.m.; n = 3 biological replicates, two-tailed unpaired
Student’s t-test.


Extended Data Fig. 8 | Validation of functional endo-siRNA target sites in HEK293 cells. a, The predicted endo-siRNA target sites in the 3′ UTRs of
Birc5, Cdc42, Chk1, and Ska1 based on the Ago2 LACE-seq signal. Deduced base-pairing potentials and calculated MFEs are illustrated. Ago2 LACE-seq
data represent results from three independent experiments. b-c, The relative Renilla luciferase reporter assay in HEK293 cells. Negative control (Ctrl) or
endo-siRNA mimics were cotransfected with WT or seed mutants. d, The relative luciferase reporter assay of wild-type (WT) and mutant (MT) Chk1 in
mouse oocytes. e-g, qPCR showing that the mRNA levels of Calm1, Bub3, and Nuf2 were not changed upon treatment with endo-siRNA-specific sponges
in oocytes. Two-tailed unpaired Student’s t-test was used to calculate the P-values in b-g. Data are mean ± s.e.m., n = 3 biological replicates.


Extended Data Fig. 9 | Proteome analysis of Ago2-null oocytes. a, The mass spectrometry data were highly correlated between two biological replicates. b, The
fold change and significance of the Calm1, Nuf2, Chk1, and Bub3 protein levels as revealed by mass spectrometry in Ago2-null oocytes. c, The predicted
miRNA density (miRNA counts/mRNA length×1000) showing no correlation with the protein level changes upon Ago2 ablation in oocytes. P-value was
calculated by two-tailed Kolmogorov-Smirnov test. d, The percentage of LTR-driven and upregulated chimeric proteins detected by mass spectrometry. e,
The fold change in MS-detected (17.6%) and LTR-derived chimeric proteins in Ago2-null oocytes. MS data in b-e represent results from two independent
experiments.
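
The predicted miRNA density in c (miRNA counts/mRNA length × 1000) amounts to the number of predicted sites per kilobase of transcript. A minimal sketch of that normalization is given below; the function name `mirna_density` and the example numbers are hypothetical and are not taken from the authors' pipeline.

```python
def mirna_density(mirna_count, mrna_length_nt):
    """Predicted miRNA sites per 1,000 nt of mRNA (hypothetical helper)."""
    if mrna_length_nt <= 0:
        raise ValueError("mRNA length must be a positive number of nucleotides")
    # miRNA counts / mRNA length x 1000, as stated in the legend.
    return mirna_count * 1000 / mrna_length_nt


# Made-up example: 6 predicted sites on a 2,400-nt mRNA.
example_density = mirna_density(6, 2400)
```

Normalizing by length keeps long transcripts, which accumulate more predicted sites by chance, from dominating the comparison with protein-level changes.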
