The Recombinome of IKZF1 Deletions in B-ALL
Bruno Lopes
(brunolopes@id.uff.br)
Instituto Nacional de Cancer
https://orcid.org/0000-0003-1072-470X
Claus Meyer
Johann Wolfgang Goethe-Universität
Heloysa Bouzada
Instituto Nacional de Câncer
Marius Külp
Goethe-University of Frankfurt
Ana Luiza Maciel
Instituto Nacional de Câncer
Patrizia Larghero
Goethe-University of Frankfurt
Thayana Barbosa
Instituto Nacional de Cancer
Caroline Poubel
Instituto Nacional de Câncer
Caroline Blunck
Instituto Nacional de Cancer (INCA)
Nicola Venn
Children's Cancer Institute Australia for Medical Research
Luciano Dalla-Pozza
The Children's Hospital at Westmead
Draga Barbaric
Sydney Children’s Hospital
Chiara Palmi
Centro Ricerca M. Tettamanti, University of Milano-Bicocca
Grazia Fazio
Centro Ricerca M. Tettamanti, University of Milano-Bicocca
Claudia Saitta
Centro Ricerca M. Tettamanti, Pediatrics, University of Milano Bicocca
https://orcid.org/0000-0002-5842-7774
Thais Aguiar
Arthur Siqueira Cavalcanti Hematology Institute (HEMORIO)
Mecneide Lins
Instituto de Medicina Integral Prof. Fernando Figueira (IMIP)
Maura Ikoma-Colturato
Hospital Amaral Carvalho
Marcia Schramm
Prontobaby Hospital da Criança and Hospital do Câncer I, INCA
Eduardo Chapchap
Hospital Israelita Albert Einstein
Giovanni Cazzaniga
Centro Ricerca M. Tettamanti, Università di Milano Bicocca
Rosemary Sutton
Children's Cancer Institute
https://orcid.org/0000-0002-0188-6005
Rolf Marschalek
Goethe-University of Frankfurt
https://orcid.org/0000-0003-4870-3445
Mariana Emerenciano
Instituto Nacional de Câncer
https://orcid.org/0000-0003-2337-8420
Article
Keywords: IKZF1 deletion, acute lymphoblastic leukemia, B-ALL, breakpoints, MLPA, multiplex PCR
DOI: https://doi.org/10.21203/rs.3.rs-2697729/v1
License: This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
IKZF1 deletions are associated with an increased risk of relapse in B-cell precursor acute lymphoblastic
leukemia (B-ALL), and their accurate detection has great clinical impact. Here, we included four
international cohorts of pediatric and adult patients with B-ALL, and reviewed the literature to illustrate the
recombination map of IKZF1 deletions, with a focus on non-recurrent deletions. We provide a substantial
basis for the improvement of diagnostic methods based on MLPA and multiplex PCR for the
identification of IKZF1 deletions, and also demonstrate that rare IKZF1 deletions increase the incidence of
relapse in these patients. Of note, non-recurrent deletions comprised a wide range of alterations, but the
majority were Δ1 and Δ1–3. They were often associated with reciprocal IKZF1 fusions. So far, a total of
23 IKZF1 gene fusions were identified in B-ALL. We also verified the occurrence of the heptamer sequence
(E-value: 9.9 × 10⁻⁹) and an enrichment of GC nucleotides (71% versus 56%; P value = 4.9 × 10⁻³)
exclusively within breakpoint clusters, suggesting that RAG recombination and TdT activity may promote
the majority of IKZF1 deletions, although rare types of alterations may be associated with other
molecular mechanisms of leukemogenesis, such as microhomology-mediated end joining.
Introduction
The occurrence of IKZF1 deletions increases the risk of relapse in patients with B-cell precursor acute
lymphoblastic leukemia (B-ALL) 1. Therefore, the development of methods for rapid and accurate
detection of this alteration is clinically relevant. Traditionally, the identification of IKZF1 deletions has
been based on multiplex ligation-dependent probe amplification (MLPA), although distinct approaches
are also available, including array-based comparative genomic hybridization (array CGH), optical genome
mapping 2, and whole genome sequencing (WGS). The latter techniques provide a broad panorama of
copy number or structural alterations, but either their results require validation or their high costs
limit their use in routine diagnosis worldwide. Multiplex (M)-PCR is a feasible and accurate
complementary method for the screening and validation of the majority of IKZF1 deletions, especially the
recurrent intragenic ones: Δ2–3, Δ2–7, Δ2–8, Δ4–7, and Δ4–8 3,4. In addition, several studies have described
non-recurrent IKZF1 deletions in B-ALL and demonstrated the importance of these rare alterations
for determining patient outcome 5. Notably, exon 1 deletions and other non-recurrent deletions can be
detected by MLPA but cannot be verified by M-PCR. Therefore, we undertook an
international effort to assemble a large cohort of B-ALL samples and reviewed the literature to provide the
landscape of IKZF1 deletions in this subtype of leukemia. Our study focuses on non-recurrent
structural alterations encompassing this gene, both to illustrate the genetic
profile of rare lesions and to provide a basis for upgrading current methods to determine IKZF1
status. In addition, we analyzed sequence features underlying different types of IKZF1 deletions to explore
possible mechanisms of leukemogenesis.
Patients
This study included pediatric and adult patients with B-ALL. Four patient cohorts were used for analyses:
two from Australia (n = 506 and n = 161), one from Brazil (n = 277), and another from Italy (n = 596). A
total of 1,540 patients were analyzed for the identification of IKZF1 deletions. Participating centers
obtained local ethical committee approval and written informed consent in accordance with the
Declaration of Helsinki. This study was approved by human ethics committees (CAAE
#33709814.7.0000.5274 and SCHN 2019/ETH06161).
Literature data
We reviewed previously published data on IKZF1 deletions in B-ALL to compile breakpoint
sequence information for these deletions. We searched PubMed with the terms "IKZF1 deletion" AND
"acute lymphoblastic leukemia", which returned 207 articles. We then manually inspected all
manuscripts to extract patient and breakpoint sequence data. This database was restricted to breakpoint
sequences identified at the DNA or RNA level and the respective B-ALL patient data. The
breakpoint sequences retrieved from the literature were mapped to the hg38 human reference genome
using BLASTN 13.
also contained 2.5 mM MgCl2 and 0.20 µM of each primer mix. The amplification was performed with an
initial denaturation at 95°C for 5 min, followed by 35 cycles of denaturation at 95°C for 30 s, annealing at
60°C for 30 s, and extension at 72°C for 4 min. The final extension occurred at 72°C for 12 min. Then, M-
PCR products were separated by agarose gel electrophoresis, and stained with GelRed to allow
visualization of amplicons by ultraviolet light.
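The cycling conditions stated above can be written down as a small data structure, which makes it easy to check the programmed run time. This is only a sketch: the variable and function names are illustrative, and the total ignores ramping between temperatures, which depends on the thermocycler.

```python
# Thermal profile as stated in the text: (step, temperature in degrees C, seconds).
INITIAL_DENATURATION = ("initial denaturation", 95, 5 * 60)
CYCLE_STEPS = [
    ("denaturation", 95, 30),
    ("annealing", 60, 30),
    ("extension", 72, 4 * 60),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 12 * 60)


def total_hold_seconds():
    """Sum of programmed hold times (ramp times are not included)."""
    per_cycle = sum(duration for _, _, duration in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


total_minutes = total_hold_seconds() / 60
print(f"Programmed hold time: {total_minutes:.0f} min")
```

Under these assumptions the programmed hold time is 192 minutes, most of it spent in the 4-minute extension step of each cycle.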
Survival analysis
The B-ALL patients were divided into three comparison groups: i) non-recurrent IKZF1 deletions,
ii) IKZF1 wild-type, and iii) IKZF1 Δ4–7. Reference groups ii and iii had twice the number of patients of
group i and were matched by leukemia subtype, treatment protocol, age group, WBC group, and sex. The
number of years between diagnosis and death or last follow-up was collected for every patient to
calculate the overall survival (OS). A Kaplan–Meier curve was fitted for each group, and the curves were
compared by means of the log-rank test. Additionally, the time between diagnosis and relapse was
calculated for the analysis of the cumulative incidence of relapse (CIR). The CIR between IKZF1 groups was
compared by means of the Gray test. The analyses were performed using the survminer and tidycmprsk R packages.
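To make the overall survival estimate concrete, the following Python sketch computes a Kaplan–Meier curve from follow-up times and event indicators. It is not the study's code (the analyses were run with R packages); the function name and the toy values are hypothetical and are not patient data.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- years from diagnosis to death or last follow-up
    events -- 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort: six patients, three deaths, three censored.
toy_times = [1.0, 2.0, 2.0, 3.5, 4.0, 5.0]
toy_events = [1, 1, 0, 1, 0, 0]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t = {t:.1f} years, S(t) = {s:.3f}")
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate only steps down at times where a death occurs.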
Statistical analyses
Data analyses were performed using R (version 4.2.1) in RStudio. Categorical variables were compared by
Pearson's Chi-squared test. Continuous data were analyzed by a variance test, followed by Student’s t-test
(unpaired and two-tailed). P values < 0.05 were considered significant. Data manipulation and
visualization were performed using tidyverse packages. The majority of plots were generated using
ggplot2 16. For genomic data, ProteinPaint (proteinpaint.stjude.org/FusionEditor) was used
to illustrate IKZF1 gene fusions. Additionally, Bioconductor packages were used for preparing the data
(BSgenome.Hsapiens.UCSC.hg38, version 1.4.4) and for illustrating genomic maps (Gviz, version 1.40.1) 17.
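As an illustration of the categorical comparison, the sketch below implements Pearson's Chi-squared test for a 2×2 table in plain Python. It omits any continuity correction (the text does not state whether one was applied), the function name is illustrative, and the counts are hypothetical.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson's Chi-squared test (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Upper-tail probability of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, e.g. a feature present/absent in two groups.
chi2, p = chi_squared_2x2(30, 10, 20, 20)
print(f"chi2 = {chi2:.3f}, P value = {p:.4f}, significant at 0.05: {p < 0.05}")
```

The closed-form P value holds only for one degree of freedom; larger tables need the general chi-squared distribution from a statistics library.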
Reference sequences
We used the following transcript identification for the description of genes and respective exon/intron
numbering in this study: ABCA13 (NM_152701.5), ADSS1 (NM_152328.5), CDK2 (NM_001798.5), CEP170
(NM_014812.3), COBL (NM_015198.5), DDC (NM_001082971.2), DLG2 (NM_001142699.3), DNAH14
(NM_001367479.1), ENSG00000278996 (NR_146144.1), ETV6 (NM_001987.5), FIGNL1
(NM_001287492.4), GRB10 (NM_001350814.2), IKZF1 (NM_006060.6), LOC105377979 (XR_942936.3),
KMT2A (NM_001197104.2), NUTM1 (NM_001284292.2), PAX5 (NM_016734.3), SETD5
(NM_001080517.3), SPATA48 (NM_001161834.3), STIM2 (NM_020860.4), STK38L (NM_015000.4),
TRPV2 (NM_016113.5), TYW1 (NM_018264.4), ZEB2 (NM_014795.4), and ZPBP (NM_007009.3).
Results
Frequency of IKZF1 alterations
A total of 1,540 B-ALL samples, which represent a combination of four different cohorts, were screened
for IKZF1 deletions: two cohorts from Australia (n = 667; n1 = 506 and n2 = 161), one from Brazil (n = 277)
and another from Italy (n = 596). In this study, the identification of IKZF1 deletions was performed by a
combination of several methodologies. A total of 33 samples initially characterized by MLPA (P335)
showed suspected non-recurrent IKZF1 deletions. After validation using the MLPA P202 probemix,
M-PCR and/or NGS, 27% proved to be true positives, while 73% were reclassified as either
IKZF1 wild-type (n = 12) or IKZF1 recurrent deletion (n = 12). We also assessed the peak ratios of MLPA
P335 probes expected to indicate monoallelic deletions (median: 0.63; range: 0.48–1.00) or no alterations
(median: 0.97; range: 0.57–1.74) (Fig. 1a), as well as the peak ratios of MLPA P202 probes expected to
indicate monoallelic deletions (median: 0.64; range: 0.15–1.11) or no alterations (median: 0.98; range:
0.59–1.53) (Fig. 1b). We noticed that a cutoff value at 0.80 allowed for the interpretation of most non-
recurrent IKZF1 deletions, although false positive deletions may be identified. In summary, the overall
frequency of IKZF1 deletion was 17%. Non-recurrent IKZF1 deletions were found in 1.2% of the total
cohort, which represents 7% of all IKZF1 deletions (Fig. 1c; Supplementary Table S4). The clinical
characteristics of these patients are summarized in Supplementary Table S5.
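The percentages in this paragraph can be re-derived from the reported counts, and the 0.80 cutoff can be expressed as a simple rule. In the sketch below, the number of confirmed samples is derived by subtraction (assuming every suspected sample was either confirmed or reclassified), and the rule "ratio below the cutoff means deletion" is an assumed reading of the cutoff; all names are illustrative.

```python
# Counts reported in the text.
suspected = 33                 # suspected non-recurrent deletions by MLPA P335
reclassified_wild_type = 12
reclassified_recurrent = 12

reclassified = reclassified_wild_type + reclassified_recurrent
confirmed = suspected - reclassified          # derived, not stated in the text
pct_confirmed = round(100 * confirmed / suspected)
pct_reclassified = round(100 * reclassified / suspected)

# Share of non-recurrent deletions among all IKZF1 deletions,
# from the reported frequencies (1.2% and 17% of the total cohort).
share_nonrecurrent = round(100 * 1.2 / 17)


def call_probe(peak_ratio, cutoff=0.80):
    """Assumed rule: a peak ratio below the cutoff is read as a deletion."""
    return "deletion" if peak_ratio < cutoff else "no alteration"


print(pct_confirmed, pct_reclassified, share_nonrecurrent)
print(call_probe(0.63))  # median ratio of probes expected to be deleted (P335)
print(call_probe(0.97))  # median ratio of probes expected to be unaltered (P335)
print(call_probe(0.57))  # lowest ratio reported for an unaltered probe (P335)
```

The last call shows why false positives remain possible: the reported range for unaltered P335 probes (0.57–1.74) extends below the cutoff, so a probe at 0.57 would be called deleted.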
Considering the rarity of some IKZF1 deletions, we reviewed the literature to compile information on B-ALL
patients carrying this genetic alteration. We included breakpoint sequences available at the DNA level (n =
478) and RNA level (n = 16). All data from the current work and literature are summarized in
Supplementary Table S6. First, we noticed that non-recurrent IKZF1 deletions comprise miscellaneous
alterations: Δ1, Δ1–2, Δ1–3, Δ1–4, Δ1–5, Δ1–7, Δ2, Δ3, Δ4–6, Δ5, Δ5–7, Δ5–8 and Δ6–8 (Fig. 1d). Most
of them (82%) are associated with the loss of IKZF1 promoter. In addition, a total of 23 IKZF1 gene
fusions were identified with the following partner genes: ABCA13, ADSS1, CDK2, CEP170, COBL, DDC,
DLG2, DNAH14, ENSG00000278996, ETV6, FIGNL1, LOC105377979, KMT2A, NUTM1, PAX5, SETD5,
SPATA48, STIM2, STK38L, TRPV2, TYW1, ZEB2, and ZPBP (Fig. 1e; Supplementary Fig. S1). Seven of
them were exclusively found in this work (Supplementary Table S7). Of note, only 9 out of 36 fusions
were in-frame, while the remaining produced head-to-head (e.g. ZPBP::IKZF1), tail-to-tail (e.g.
IKZF1::COBL) fusions, or even rearrangements with genes encoding long non-coding RNAs (e.g.
IKZF1::ENSG00000278996), which may not produce functional chimeric IKZF1 proteins.
The sequence data allowed us to investigate the breakpoint clusters (BC) of IKZF1 deletions. A total of 24 BC
were identified, half of which were associated with 5' and the other half with 3' breaks (Fig. 2a). The
detailed genomic location of each cluster is provided in Supplementary Table S8. The most frequent clusters
were 5'BC09, 5'BC12, 3'BC07, and 3'BC10, which are located within IKZF1 intron 1, intron 3, intron 7, and
~11.6 kb downstream of this gene, respectively. DNA lesions at these sites promote the most recurrent IKZF1
deletions: Δ2–7 (5'BC09–3'BC07), Δ4–7 (5'BC12–3'BC07), Δ2–8 (5'BC09–3'BC10), and Δ4–8 (5'BC12–
3'BC10). Conversely, the majority of DNA breaks triggering non-recurrent IKZF1 deletions were not located
at BC. Nevertheless, we observed four clusters (5'BC01, 5'BC02, 3'BC01, and 3'BC02) associated with rare
rearrangements. Of note, the majority of Δ1 deletions (9 out of 13) were associated with 3'BC01 (Fig. 2b).
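The pairing between breakpoint clusters and recurrent deletion types described above amounts to a lookup table. The sketch below encodes only the four pairings listed in the text; the function name and the fallback label are illustrative, and other cluster combinations are deliberately left unresolved.

```python
# Cluster pairings stated in the text: (5' cluster, 3' cluster) -> deletion type.
RECURRENT_BY_CLUSTERS = {
    ("5'BC09", "3'BC07"): "Δ2–7",
    ("5'BC12", "3'BC07"): "Δ4–7",
    ("5'BC09", "3'BC10"): "Δ2–8",
    ("5'BC12", "3'BC10"): "Δ4–8",
}


def deletion_from_clusters(five_prime, three_prime):
    """Look up the recurrent deletion produced by a pair of breakpoint clusters."""
    return RECURRENT_BY_CLUSTERS.get(
        (five_prime, three_prime), "not one of the four listed pairings"
    )


print(deletion_from_clusters("5'BC12", "3'BC07"))
print(deletion_from_clusters("5'BC01", "3'BC01"))
```

Two 5' clusters combined with two 3' clusters therefore account for four of the recurrent deletions, which is why primers flanking each of these clusters cover several deletion types at once.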
To further explore the role of cryptic RSS on the establishment of IKZF1 deletions, we performed a
quantitative ChIP mapping of RAG1 occupancy within breakpoint regions using the Nalm6 leukemia cell
line (Supplementary Fig. S3). Although we observed an enrichment of RAG1 within 5’BC03 (P value =
0.001 by t-test), its occupancy levels were similar to control regions (without any breakpoint promoting
IKZF1 deletions) in the remaining breakpoint clusters of recurrent and non-recurrent IKZF1 deletions, as
well as in sporadic breakpoint sites. These results suggest that time-dependent RAG1 expression and
activation might have limited our ability to demonstrate its role in the establishment of IKZF1
deletions.
Although the frequency of filler DNA varied depending on the type of IKZF1 deletion, the size of these
additional sequences was similar between all comparison groups (Fig. 4g–i). The median number of
nucleotides ranged between 4 and 7 base pairs. Conversely, the nucleotide content of filler DNAs was
unbalanced. Although the difference was not statistically significant, recurrent deletions had the highest GC
content, while non-recurrent deletions displayed an intermediate one (70% versus 64%, respectively;
P value = 0.235 by Pearson's Chi-squared test; Fig. 4j). Remarkably, complete IKZF1 deletions (Δ1–8)
presented the lowest GC content (50%) compared to the majority of the recurrent deletions (Fig. 4k):
Δ2–3 (70%; P value = 4.4 × 10⁻²), Δ2–7 (72%; P value = 1.3 × 10⁻²), Δ2–8 (71%; P value = 3.1 × 10⁻²),
Δ4–7 (72%; P value = 9.5 × 10⁻³), and Δ4–8 (65%; P value = 0.118). Of note, structural alterations derived
from DNA breaks within clusters at both sides had the highest frequency of GC nucleotides compared to
breakpoints outside any cluster (71% versus 56%; P value = 4.9 × 10⁻³; Fig. 3l), whereas deletions derived
from breaks at one cluster had an intermediate GC content (66%; P value = 0.213).
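The GC content of filler DNA reported above is the fraction of inserted nucleotides that are guanine or cytosine. A minimal sketch, assuming the fraction is pooled over all nucleotides in a group of junctions (the pooling choice and the example sequences are assumptions, not data from the study):

```python
def gc_counts(sequence):
    """Return (GC count, total count) over the A/C/G/T bases of one sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    gc = sum(1 for b in bases if b in "GC")
    return gc, len(bases)


def pooled_gc_fraction(filler_sequences):
    """GC fraction pooled over all nucleotides of a group of filler DNAs."""
    gc_total = 0
    n_total = 0
    for seq in filler_sequences:
        gc, n = gc_counts(seq)
        gc_total += gc
        n_total += n
    if n_total == 0:
        raise ValueError("no nucleotides to evaluate")
    return gc_total / n_total


# Hypothetical filler sequences from three breakpoint junctions.
example_fillers = ["GGCC", "GCGTA", "CCT"]
print(f"Pooled GC content: {100 * pooled_gc_fraction(example_fillers):.0f}%")
```

Pooled GC and non-GC counts from two groups form a 2×2 table, which is the kind of input a Pearson's Chi-squared test compares.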
Discussion
IKZF1 deletions are recurrent alterations in B-ALL, and offer relevant information for risk stratification of
these patients, considering their association with a higher relapse risk 1,18. The genetic landscape of
these abnormalities is complex, as they comprise different types of deletions and may co-occur
with multiple CNA. For instance, B-ALL patients in whom IKZF1 deletions co-occur with deletions in CDKN2A,
CDKN2B, PAX5, or within the pseudoautosomal region 1 (PAR1), in the absence of ERG deletion, have a more
adverse outcome and are defined as the IKZF1plus group 19. Although the first studies addressing the relationship between IKZF1 deletions and
the prognosis of B-ALL had more representation of common deletions (i.e. Δ4–7 and Δ1–8), it has also
been demonstrated that less frequent deletions (i.e. Δ2–7 and Δ2–8) reduce event-free survival of these
patients 5. In this regard, our study adds novel information to those previous results, especially for IKZF1
exon 1 deletions. In summary, we have substantial evidence that IKZF1 deletions confer a dismal
prognosis for B-ALL patients, regardless of the range of deletion.
Traditional methodologies (e.g. MLPA) can detect all IKZF1 deletions in patient samples, provided that a
large proportion of clones carries this genetic alteration. However, rare types of deletions may be
challenging to interpret confidently, considering that MLPA may provide
false positive results, especially for deletions within exon 1. As previously demonstrated, methodological
adjustments, including a longer denaturation step in MLPA, may overcome this limitation 20. In addition,
the MLPA peak ratios of non-recurrent deletions frequently fell between the thresholds of wild-type and
monoallelic deletion, thus indicating that most of them were subclonal. Indeed, we reveal that structural
alterations either within IKZF1 exon 1 or ranging from exon 1 to 3 are the most common non-recurrent
alterations of this gene. This information highlights the importance of considering these rare types of
deletions, while carefully controlling methodological variables to minimize false positive
interpretations. In this regard, we advise the inclusion of an extra MLPA probe within the overlapping
deletion region upstream of the IKZF1 promoter.
Previous PCR-based approaches for the detection of IKZF1 deletions have identified a substantial number
of samples escaping MLPA detection (half of B-ALL samples at diagnosis) due to subclonal lesions, low
blast count, or hemodilution 4. As recently discussed, both methods have pros and cons 21. While MLPA
detects a broader range of deletions within IKZF1 and other relevant genes for ALL risk stratification 19,22,
M-PCR is more sensitive and accurate in identifying these genetic alterations. Although the prognostic
relevance of subclonal IKZF1 lesions has not been specifically addressed 23, we might assume that the
higher incidence of these genetic alterations at relapse would justify a proper investigation of IKZF1
status. Therefore, the combination of broader approaches (e.g. MLPA or array techniques) with M-PCR is
ideal. According to the World Health Organization, the astonishing discrepancies in cure rates of
childhood cancer between high-income and low- or middle-income countries include lack of diagnosis,
misdiagnosis or delayed diagnosis 24. These limitations may be partially circumvented by access to
low-cost methodologies for the genetic diagnosis of B-ALL. Here, we also improved the M-PCR to cover a
wider spectrum of IKZF1 alterations. Of note, previous PCR methods did not include primers flanking
3'BC10, which is one of the most recurrent BC, leading to Δ2–8 and Δ4–8 deletions. Therefore, we
expect that our novel M-PCR may detect a higher number of IKZF1 deletions in ALL.
The occurrence of gene fusions is a common feature in acute leukemia. Although they comprise a smaller
share of IKZF1 aberrations, here we have identified seven novel partner genes of IKZF1 fusions:
ABCA13, DDC, ENSG00000278996, LOC105377979, PAX5, STK38L, and ZPBP. Among them, only
ABCA13::IKZF1 and IKZF1::DDC were in-frame fusions, and derived from interstitial deletions within
chromosome 7. ABCA13 belongs to a large family of ATP-binding cassette (ABC) transmembrane
transporters, and contributes to cholesterol internalization 25. It is highly expressed in leukemia cells and
in the bone marrow 26. The predicted chimeric protein lacks both transmembrane domains (TMDs) and
nucleotide-binding domains (NBDs) of ABCA13, thus disrupting the normal function of this transporter.
Although all IKZF1 zinc fingers are conserved, its regulation by the ABCA13 promoter as well as the
overall structural disruption of both proteins may together impact leukemogenesis. DDC encodes a
protein that catalyzes the decarboxylation of several aromatic amino acids. Its dysfunction results in a
lack of monoamine neurotransmitters, including serotonin and catecholamines 27. Moreover, genetic
variants in this gene were previously linked to childhood ALL 28–30. Of note, previous works have also
identified IKZF1 gene fusions with ADSS1 31, CDK2 32, CEP170 31, COBL 3,33, DLG2 34, DNAH14 35,
ETV6 34,36, FIGNL1 37, KMT2A 38, NUTM1 32, SETD5 32, SPATA48 32,36,37, STIM2 39, TRPV2 32,
TYW1 40, and ZEB2 39. As
the majority of them are reciprocal and out-of-frame IKZF1 fusions, they disrupt its promoter or truncate
its protein to contribute to leukemia promotion.
The process of B-cell maturation includes rearrangements of immunoglobulin genes, which are
relevant for the diversification of the immune response. This event is orchestrated by the products of the
recombination-activating genes (RAG), which recognize recombination signal sequences to mediate
interstitial deletions at these genomic loci. Subsequently, terminal deoxynucleotidyl transferase (TdT)
displays template-independent activity and adds nucleotides to the junctions 41. Here, we demonstrate
that most IKZF1 deletions are associated with genetic signatures of RAG and TdT activity, thus explaining
their recurrence in B-ALL. Although heptamer sequences for RAG-mediated recombination are expected to
occur once per ~16 kb, we identified a 74-fold enrichment of these sequences (once per ~0.2 kb) within BC.
Conversely, rare types of deletions and rearrangements outside BC lacked those signatures. Of interest, a
recent study identified off-target RAG-mediated rearrangements (which occur outside immunoglobulin
genes) in about 10% of normal lymphocytes 42. In addition, another study including patients in the chronic
phase of chronic myeloid leukemia (CML) identified an overexpression of RAG1/2 and DNTT (encoding TdT)
prior to progression to lymphoid blast crisis and acquisition of secondary alterations, such as IKZF1
deletions 43. This evidence indicates that off-target RAG-mediated recombination is a common genetic event
in lymphocytes that also promotes secondary alterations in ALL, including the majority of IKZF1
deletions.
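The background and enriched frequencies quoted above follow from simple arithmetic on a 7-nucleotide motif. The sketch below assumes equal base frequencies, a single strand, and exact matching to the consensus heptamer CACAGTG; it does not reproduce the study's motif analysis, and the function names and test sequence are illustrative.

```python
HEPTAMER = "CACAGTG"  # consensus heptamer of the recombination signal sequence


def expected_spacing_bp(motif_length, alphabet_size=4):
    """Expected distance between exact matches of one fixed motif on one strand,
    assuming independent, equally frequent bases."""
    return alphabet_size ** motif_length


def count_motif(sequence, motif=HEPTAMER):
    """Count exact (possibly overlapping) occurrences of the motif."""
    sequence = sequence.upper()
    width = len(motif)
    return sum(
        1 for i in range(len(sequence) - width + 1) if sequence[i:i + width] == motif
    )


background_bp = expected_spacing_bp(len(HEPTAMER))  # 4**7 bp, i.e. ~16 kb
enriched_bp = background_bp / 74                    # spacing after 74-fold enrichment

print(f"Background: once per {background_bp} bp")
print(f"74-fold enriched: once per {enriched_bp:.0f} bp")
print(count_motif("TTCACAGTGAACACAGTGTT"))  # hypothetical sequence with two heptamers
```

With 4^7 = 16,384 possible heptamers, one match is expected per ~16 kb, and dividing by 74 gives roughly 221 bp, consistent with the reported ~0.2 kb.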
Following the cleavage of the DNA double strand, TdT incorporates random nucleotides in an
untemplated mode. When it reaches the downstream DNA, it becomes template-dependent 44. Although
TdT is able to add several nucleotides to a template strand in vitro, this polymerase is assembled in a
complex of the non-homologous end-joining (NHEJ) machinery in vivo 45, which keeps the upstream and
downstream DNA closer and facilitates the transition from a template-independent to a template-dependent
mode. Therefore, the length of these additional sequences is restricted to 1–20 nucleotides in vivo, and
reaches a maximum at four nucleotides 46. Here, we also observed a similar number of additional
nucleotides at the breakpoint junctions, regardless of the type and range of IKZF1 deletion.
One relevant aspect of TdT is that it requires divalent metal ions as cofactors (Mg2+, Mn2+, Zn2+, and
Co2+), and the incorporation of nucleotides is biased depending on their concentration in the
microenvironment. For instance, purines are more frequently added in the presence of Mg2+, while
pyrimidines are preferred when TdT is chelated to Co2+ 47. In this regard, in vivo studies have shown that
TdT more often incorporates guanine and cytosine nucleotide bases compared to adenine and thymine
48. In this scenario, we observed high GC content in additional nucleotides of the majority of IKZF1
deletions. The most pronounced GC incorporation was observed in sequences derived from
recombination between two BC and intragenic IKZF1 deletions. Conversely, whole-gene deletions had the
lowest frequency of filler DNAs within the breakpoint junction and lacked this nucleotide assimilation
bias. Instead, they more often presented microhomologies between both rearranged strands, thus
indicating that microhomology-mediated end joining (MMEJ) might play a significant role in the
formation of large structural alterations that remove the entire IKZF1 gene. This line of evidence is consistent
with a role of TdT in the establishment of most IKZF1 deletions, though a discrete one in complete deletions.
In conclusion, this study provides a wide spectrum of structural alterations that affect the IKZF1 gene. It
derives from an international effort to gather information on a large number of B-ALL samples to
illustrate the genetic alterations behind rare types of deletions, and their prognostic relevance for B-ALL
patients. In this regard, we point out methodological adjustments in MLPA and M-PCR to fine-tune the
detection of IKZF1 deletions, since we showed that every patient with either recurrent or non-recurrent
deletions will benefit from accurate risk stratification. Also, we summarize several lines of evidence that
support the idea that RAG and TdT mediate the majority of these alterations, although microhomology-
mediated repair may contribute to the genesis of this secondary alteration in B-ALL to some extent.
Declarations
Acknowledgments
We are grateful to the children and their parents for their participation in research. We also thank
Australian and New Zealand Children’s Haematology and Oncology Group hospital staff and oncologists
and Sydney Children’s Tumour Bank Network for their support and the MRD teams in Sydney and Monza.
We thank Dr. Jinghui Zhang, Dr. Xiaotu Ma, Dr. Qingsong Gao, and Dr. Charles Mullighan for kindly
providing contig sequences of IKZF1 deletions. We also appreciate the contributions made by Luana
Batista as well as the technical assistance provided by Alessandra J. Faro, André F. Duarte and Jodie
Giles.
These results were partially based on data derived from the Therapeutically Applicable Research to
Generate Effective Treatments (TARGET; https://ocg.cancer.gov/programs/target) initiative.
BAL was supported by the Brazilian Ministry of Health and INCA. BAL received a return research grant
(Ref 3.2 - 1193718 - BRA - HFSTCAPES-P) and a fellowship for a short stay in Germany (Ref 3.2 - BRA /
1193718) from the Alexander von Humboldt Foundation. ME is supported by Brazilian National Council
of Technological and Scientific Development – CNPq (PQ-311220/2020-7) and Fundação Carlos Chagas
Filho de Amparo à Pesquisa do Estado do Rio de Janeiro – FAPERJ (E_26/203.214/2017; E-26-
010.101072-2018; and E-26/010.002187//2019) research grants. RM was supported by grants from the
DFG (Ma 1876/12-1) and Wilhelm Sander foundation (2018.070.2). RS, LDP and BD received research
grant funding from NH&MRC Australia APP1128727 and Cancer Australia PdCCRS APP1024232.
Author Contributions
BAL, CM and ME designed the study. ALTM, TCB, CPP, CB, NCV, RS, CP, GF, CS and GC contributed with
MLPA data of individual patients. LDP, DB, CP, GF, TFA, MML, MRVIC, MS, EC, GC and RS provided
samples and patient data. BAL and ALTM performed MLPA validation of suspicious rare deletions. BAL
and ME provided these materials. RM, CM and BAL provided materials for NGS sequencing. BAL and PL
performed library preparation and NGS sequencing. PL mapped these data. HB carried out M-PCR. MK
performed quantitative ChIP analysis. BAL prepared data, reviewed literature, performed data analyses
and wrote the manuscript. ME reviewed the first draft of the manuscript and provided important insights
to this work. All authors critically reviewed and approved the final version of the manuscript.
Competing Interests
All data supporting the findings of this study are available from the corresponding author upon
reasonable request. The TARGET dataset is available at the Genomic Data Commons (GDC) Data Portal
(portal.gdc.cancer.gov/projects).
References
1. Mullighan CG, Su X, Zhang J, Radtke I, Phillips LAA, Miller CB et al. Deletion of IKZF1 and prognosis
in acute lymphoblastic leukemia. N Engl J Med 2009; 360: 470–480.
2. Lühmann JL, Stelter M, Wolter M, Kater J, Lentes J, Bergmann AK et al. The clinical utility of optical
genome mapping for the assessment of genomic aberrations in acute lymphoblastic leukemia.
Cancers (Basel) 2021; 13. doi:10.3390/cancers13174388.
3. Meyer C, Zur Stadt U, Escherich G, Hofmann J, Binato R, Barbosa TC et al. Refinement of IKZF1
recombination hotspots in pediatric BCP-ALL patients. Am J Blood Res 2013; 3: 165–173.
4. Caye A, Beldjord K, Mass-Malo K, Drunat S, Soulier J, Gandemer V et al. Breakpoint-specific multiplex
polymerase chain reaction allows the detection of IKZF1 intragenic deletions and minimal residual
disease monitoring in B-cell precursor acute lymphoblastic leukemia. Haematologica 2013; 98: 597–
601.
5. Boer JM, van der Veer A, Rizopoulos D, Fiocco M, Sonneveld E, de Groot-Kruseman HA et al.
Prognostic value of rare IKZF1 deletion in childhood B-cell precursor acute lymphoblastic leukemia:
an international collaborative study. Leukemia 2016; 30: 32–38.
6. Babraham Bioinformatics - FastQC A Quality Control tool for High Throughput Sequence Data.
http://www.bioinformatics.babraham.ac.uk/projects/fastqc (accessed 11 Nov 2022).
7. Chiang C, Layer RM, Faust GG, Lindberg MR, Rose DB, Garrison EP et al. SpeedSeq: ultra-fast
personal genome analysis and interpretation. Nat Methods 2015; 12: 966–968.
8. Li H. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv 2013.
doi:10.48550/arxiv.1303.3997.
9. Faust GG, Hall IM. SAMBLASTER: fast duplicate marking and structural variant read extraction.
Bioinformatics 2014; 30: 2503–2505.
10. Tarasov A, Vilella AJ, Cuppen E, Nijman IJ, Prins P. Sambamba: fast processing of NGS alignment
formats. Bioinformatics 2015; 31: 2032–2034.
11. Layer RM, Chiang C, Quinlan AR, Hall IM. LUMPY: a probabilistic framework for structural variant
discovery. Genome Biol 2014; 15: R84.
12. Robinson JT, Thorvaldsdóttir H, Winckler W, Guttman M, Lander ES, Getz G et al. Integrative
genomics viewer. Nat Biotechnol 2011; 29: 24–26.
13. Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K et al. BLAST+: architecture and
applications. BMC Bioinformatics 2009; 10: 421.
14. Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in
biopolymers. Proc Int Conf Intell Syst Mol Biol 1994; 2: 28–36.
15. Grant CE, Bailey TL, Noble WS. FIMO: scanning for occurrences of a given motif. Bioinformatics
2011; 27: 1017–1018.
16. Wickham H. ggplot2 - Elegant Graphics for Data Analysis. Springer-Verlag New York: New York, NY,
2016 doi:10.1007/978-0-387-98141-3.
17. Hahne F, Ivanek R. Visualizing genomic data using gviz and bioconductor. Methods Mol Biol 2016;
1418: 335–351.
18. Kuiper RP, Waanders E, van der Velden VHJ, van Reijmersdal SV, Venkatachalam R, Scheijen B et al.
IKZF1 deletions predict relapse in uniformly treated pediatric precursor B-ALL. Leukemia 2010; 24:
1258–1264.
19. Stanulla M, Dagdan E, Zaliova M, Möricke A, Palmi C, Cazzaniga G et al. IKZF1plus Defines a New
Minimal Residual Disease-Dependent Very-Poor Prognostic Profile in Pediatric B-Cell Precursor Acute
Lymphoblastic Leukemia. J Clin Oncol 2018; 36: 1240–1249.
20. Morel G, Deau M-C, Simand C, Caye-Eude A, Arfeuille C, Ittel A et al. Large deletions of the 5’ region of
IKZF1 lead to haploinsufficiency in B-cell precursor acute lymphoblastic leukaemia. Br J Haematol
2019; 186: e155–e159.
21. Mitchell RJ, Kirkwood AA, Barretta E, Clifton-Hadley L, Lawrie E, Lee S et al. IKZF1 alterations are not
associated with outcome in 498 adults with B-precursor ALL enrolled in the UKALL14 trial. Blood Adv
2021; 5: 3322–3332.
22. Hamadeh L, Enshaei A, Schwab C, Alonso CN, Attarbaschi A, Barbany G et al. Validation of the United
Kingdom copy-number alteration classifier in 3239 children with B-cell precursor ALL. Blood Adv
2019; 3: 148–157.
23. Stanulla M, Cavé H, Moorman AV. IKZF1 deletions in pediatric acute lymphoblastic leukemia: still a
poor prognostic marker? Blood 2020; 135: 252–260.
24. Childhood and Adolescence Cancer - PAHO/WHO | Pan American Health Organization.
http://www.paho.org/en/topics/childhood-and-adolescence-cancer (accessed 11 Nov 2022).
25. Nakato M, Shiranaga N, Tomioka M, Watanabe H, Kurisu J, Kengaku M et al. ABCA13 dysfunction
associated with psychiatric disorders causes impaired cholesterol trafficking. J Biol Chem 2021; 296:
100166.
26. Prades C, Arnould I, Annilo T, Shulenin S, Chen ZQ, Orosco L et al. The human ATP binding cassette
gene ABCA13, located on chromosome 7p12.3, encodes a 5058 amino acid protein with an
extracellular domain encoded in part by a 4.8-kb conserved exon. Cytogenet Genome Res 2002; 98:
160–168.
27. Rizzi S, Spagnoli C, Frattini D, Pisani F, Fusco C. Clinical Features in Aromatic L-Amino Acid
Decarboxylase (AADC) Deficiency: A Systematic Review. Behav Neurol 2022; 2022: 2210555.
28. Papaemmanuil E, Hosking FJ, Vijayakrishnan J, Price A, Olver B, Sheridan E et al. Loci on 7p12.2,
10q21.2 and 14q11.2 are associated with risk of childhood acute lymphoblastic leukemia. Nat Genet
2009; 41: 1006–1010.
29. Treviño LR, Yang W, French D, Hunger SP, Carroll WL, Devidas M et al. Germline genomic variants
associated with childhood acute lymphoblastic leukemia. Nat Genet 2009; 41: 1001–1005.
30. Ellinghaus E, Stanulla M, Richter G, Ellinghaus D, te Kronnie G, Cario G et al. Identification of germline
susceptibility loci in ETV6-RUNX1-rearranged childhood acute lymphoblastic leukemia. Leukemia
2012; 26: 902–909.
31. Brown LM, Lonsdale A, Zhu A, Davidson NM, Schmidt B, Hawkins A et al. The application of RNA
sequencing for the diagnosis and genomic classification of pediatric acute lymphoblastic leukemia.
Blood Adv 2020; 4: 930–942.
32. Lilljebjörn H, Henningsson R, Hyrenius-Wittsten A, Olsson L, Orsmark-Pietras C, von Palffy S et al.
Identification of ETV6-RUNX1-like and DUX4-rearranged subtypes in paediatric B-cell precursor acute
lymphoblastic leukaemia. Nat Commun 2016; 7: 11790.
33. Lopes BA, Meyer C, Barbosa TC, Poubel CP, Mansur MB, Duployez N et al. IKZF1 Deletions with COBL
Breakpoints Are Not Driven by RAG-Mediated Recombination Events in Acute Lymphoblastic
Leukemia. Transl Oncol 2019; 12: 726–732.
34. Schieck M, Lentes J, Thomay K, Hofmann W, Behrens YL, Hagedorn M et al. Implementation of RNA
sequencing and array CGH in the diagnostic workflow of the AIEOP-BFM ALL 2017 trial on acute
lymphoblastic leukemia. Ann Hematol 2020; 99: 809–818.
35. Mata-Rocha M, Rangel-López A, Jiménez-Hernández E, Morales-Castillo BA, González-Torres C,
Gaytan-Cervantes J et al. Identification and Characterization of Novel Fusion Genes with Potential
Clinical Applications in Mexican Children with Acute Lymphoblastic Leukemia. Int J Mol Sci 2019; 20.
doi:10.3390/ijms20102394.
36. Ma X, Liu Y, Liu Y, Alexandrov LB, Edmonson MN, Gawad C et al. Pan-cancer genome and
transcriptome analyses of 1,699 paediatric leukaemias and solid tumours. Nature 2018; 555: 371–
376.
37. Li B, Brady SW, Ma X, Shen S, Zhang Y, Li Y et al. Therapy-induced mutations drive the genomic
landscape of relapsed acute lymphoblastic leukemia. Blood 2020; 135: 41–55.
38. Meyer C, Burmeister T, Gröger D, Tsaur G, Fechina L, Renneville A et al. The MLL recombinome of
acute leukemias in 2017. Leukemia 2018; 32: 273–284.
39. Gu Z, Churchman M, Roberts K, Li Y, Liu Y, Harvey RC et al. Genomic analyses identify recurrent
MEF2D fusions in acute lymphoblastic leukaemia. Nat Commun 2016; 7: 13331.
40. Panagopoulos I, Brunetti M, Stoltenberg M, Strandabø RAU, Staurseth J, Andersen K et al. Novel
GTF2I-PDGFRB and IKZF1-TYW1 fusions in pediatric leukemia with normal karyotype. Exp Hematol
Oncol 2019; 8: 12.
41. Christie SM, Fijen C, Rothenberg E. V(D)J recombination: recent insights in formation of the
recombinase complex and recruitment of DNA repair machinery. Front Cell Dev Biol 2022; 10:
886718.
42. Machado HE, Mitchell E, Øbro NF, Kübler K, Davies M, Leongamornlert D et al. Diverse mutational
landscapes in human lymphocytes. Nature 2022; 608: 724–732.
43. Thomson DW, Shahrin NH, Wang PPS, Wadham C, Shanmuganathan N, Scott HS et al. Aberrant
RAG-mediated recombination contributes to multiple structural rearrangements in lymphoid blast
crisis of chronic myeloid leukemia. Leukemia 2020; 34: 2051–2063.
44. Loc’h J, Rosario S, Delarue M. Structural basis for a new templated activity by terminal
deoxynucleotidyl transferase: implications for V(D)J recombination. Structure 2016; 24: 1452–1463.
45. Graham TGW, Walter JC, Loparo JJ. Two-Stage Synapsis of DNA Ends during Non-homologous End
Joining. Mol Cell 2016; 61: 850–858.
46. Murugan A, Mora T, Walczak AM, Callan CG. Statistical inference of the generation probability of T-
cell receptors from sequence repertoires. Proc Natl Acad Sci USA 2012; 109: 16161–16166.
47. Fowler JD, Suo Z. Biochemical, structural, and physiological characterization of terminal
deoxynucleotidyl transferase. Chem Rev 2006; 106: 2092–2110.
48. Motea EA, Berdis AJ. Terminal deoxynucleotidyl transferase: the story of a misguided DNA
polymerase. Biochim Biophys Acta 2010; 1804: 1151–1166.
Figures
Figure 1
Genetic landscape of IKZF1 deletions. Individual MLPA peak ratios derived from (a) P335 and (b) P202
were compared between wild-type (WT) and monoallelically deleted loci in samples in which IKZF1 status
was verified. The horizontal line (gray) indicates the peak ratio of 0.8. (c) Frequency of IKZF1 deletions in
four independent cohorts and overall. Gene losses were classified into two groups: recurrent (Δ1–8, Δ2–3,
Δ2–7, Δ2–8, Δ4–7, Δ4–8) and non-recurrent (the remaining deletion ranges). (d) Types of non-recurrent
IKZF1 deletions identified in the four cohorts of this study and in the literature (restricted to DNA and RNA
sequencing data). (e) Spectrum of IKZF1 gene fusions reported so far. Fusions are in-frame (blue) or
out-of-frame (gray). Bold indicates those found exclusively in this study. *DDC was a partner
gene in two samples: one had an in-frame fusion and the other an out-of-frame fusion.
Figure 2
Breakpoint map of IKZF1 deletions. (a) Genomic location of BC associated with IKZF1 deletions. A total
of 24 BC were identified (horizontal ticks) and located at 5’ (n=12; BC01–12 in blue) and 3’ (n=12; BC01–
12 in red) breakpoints of IKZF1 deletions. The count of all breakpoints is also displayed by genomic
coordinate, where the y-axis shows the number of DNA breaks (the maximum value was restricted to
15). (b) Number of IKZF1 deletions associated with each combination of BC. The color gradient illustrates
the frequency of non-recurrent IKZF1 deletions for every BC combination. The bar plots indicate the
number of sequences for individual 5'BC (right) and 3'BC (above). "No" refers to breakpoints outside BC.
Figure 3
Opportunities to improve methodologies for the determination of IKZF1 status. (a) Genomic map of
IKZF1 deletions encompassing exon 1 (n=23; red horizontal bars). The location of commercially available
MLPA probes (P202-C1 assay; orange) and guanine-cytosine content (GC content; gray) are also
displayed. The commonly deleted region of the IKZF1 promoter is highlighted (light red). After excluding the
region of high GC content, the remaining target area (chr7:50,275,701–50,302,000) may be used for the
design of novel MLPA probes, allowing more accurate definition of exon 1 deletions. (b) M-PCR assay to
determine IKZF1 status. Agarose gel electrophoresis was used to visualize the amplicons of IKZF1
Δ2–3 in the first M-PCR, and of Δ2–7, Δ2–8, and Δ4–7 in the second reaction. We also included IKZF1 wild-type
samples (lanes 3 and 8) as well as negative controls (N).
Figure 4
Genetic signatures of IKZF1 deletions. DNA motifs identified at breakpoints after agnostic search of (a)
BC and (b) individual recurrent IKZF1 deletions. (c) The most significant heptamer-like sequences (5'-
CACAGTG-3') identified at BC. Different types of genetic signatures at the junctions of IKZF1 deletions
were compared based on (d) the frequency of IKZF1 deletions, (e) types of recurrent deletions, and (f)
DNA breaks at clusters. MH, microhomologies between both breakpoints. The (g–i) size of additional
nucleotides at DNA junctions and the (j–l) frequency of each nucleobase (A, T, G or C) within them were
also compared by the same parameters. Statistical tests were performed between IKZF1 deletion groups:
Chi-square test (analysis of genetic signatures and frequency of nucleobases) and two-tailed unpaired
Student’s t-test (size of filler DNA).
Figure 5
Outcome of B-ALL patients with non-recurrent IKZF1 deletions. Kaplan–Meier curves illustrate the (a) OS
and (b) CIR of patients with non-recurrent IKZF1 deletions versus IKZF1 wild-type or IKZF1 Δ4–7.
Supplementary Files
This is a list of supplementary files associated with this preprint.
SupplementaryTables.pdf