
Radiotherapy and Oncology 173 (2022) 262–268

Contents lists available at ScienceDirect

Radiotherapy and Oncology


journal homepage: www.thegreenjournal.com

Original Article

Development and evaluation of an automated EPTN-consensus based organ at risk atlas in the brain on MRI
Jeroen A. Crouzen a, Anna L. Petoukhova b, Ruud G.J. Wiggenraad a, Stefan Hutschemaekers a, Christa G.M. Gadellaa-van Hooijdonk a, Noëlle C.M.G. van der Voort van Zyp a, Mirjam E. Mast a, Jaap D. Zindler a,⇑
a Haaglanden Medical Center, Department of Radiotherapy; and b Haaglanden Medical Center, Department of Medical Physics, BA Leidschendam, The Netherlands

Article info

Article history:
Received 3 December 2021
Received in revised form 29 April 2022
Accepted 8 June 2022
Available online 15 June 2022

Keywords:
Organs at risk
Neuroimaging
Computer-assisted radiotherapy planning
Magnetic resonance imaging

Abstract

Background and purpose: During radiotherapy treatment planning, avoidance of organs at risk (OARs) is important. An international consensus-based delineation guideline was recently published with 34 OARs in the brain. We developed an MR-based OAR autosegmentation atlas and evaluated its performance compared to manual delineation.
Materials and methods: Anonymized cerebral T1-weighted MR scans (voxel size 0.9 × 0.9 × 0.9 mm³) were available. OARs were manually delineated according to international consensus. Fifty MR scans were used to develop the autosegmentation atlas in a commercially available treatment planning system (RayStation®). The performance of this atlas was tested on another 40 MR scans by automatically delineating 34 OARs, as defined by the 2018 EPTN consensus. Spatial overlap between manual and automated delineations was determined by calculating the Dice similarity coefficient (DSC). Two radiation oncologists determined the quality of each automatically delineated OAR. The time needed to delineate all OARs manually or to adjust automatically delineated OARs was determined.
Results: DSC was ≥ 0.75 in 31 (91 %) out of 34 automated OAR delineations. Delineations were rated by radiation oncologists as excellent or good in 29 (85 %) out of 34 OAR delineations, while 4 were rated fair (12 %) and 1 was rated poor (3 %). Interobserver agreement between the radiation oncologists ranged from 77–100 % per OAR. The time to manually delineate all OARs was 88.5 minutes, while the time needed to adjust automatically delineated OARs was 15.8 minutes.
Conclusion: Autosegmentation of OARs enables high-quality contouring within a limited time. Accurate OAR delineation helps to define OAR constraints to mitigate serious complications and helps with the development of NTCP models.
© 2022 The Author(s). Published by Elsevier B.V. Radiotherapy and Oncology 173 (2022) 262–268. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Abbreviations: ABS, Automated atlas-based segmentation; AI, Artificial intelligence; DSC, Dice similarity coefficient; EPTN, European Particle Therapy Network; ESTRO, European Society for Radiotherapy and Oncology; NTCP, Normal tissue complication probability; OAR, Organ at risk.
⇑ Corresponding author at: Haaglanden Medical Center, Department of Radiotherapy, Burgemeester Banninglaan 1, 2262 BA Leidschendam, The Netherlands.
E-mail addresses: j.crouzen@haaglandenmc.nl (J.A. Crouzen), a.petoukhova@haaglandenmc.nl (A.L. Petoukhova), s.hutschemaekers@haaglandenmc.nl (S. Hutschemaekers), n.van.der.voort.van.zyp@haaglandenmc.nl (N.C.M.G. van der Voort van Zyp), m.mast@haaglandenmc.nl (M.E. Mast), j.zindler@haaglandenmc.nl (J.D. Zindler).

https://doi.org/10.1016/j.radonc.2022.06.004
0167-8140/© 2022 The Author(s). Published by Elsevier B.V.

With modern radiotherapy techniques, radiation dose to organs at risk (OARs) can be actively reduced during treatment planning to minimize the risk of acute and long-term toxicity. Therefore, accurate delineation of OARs is essential. In 2018, a consensus-based atlas for the delineation of OARs in the brain region was published by the European Particle Therapy Network (EPTN) of ESTRO [1]. The aim of this consensus is to reduce inter- and intraobserver variability in delineating OARs relevant in neuro-oncology. Another goal is to enable the development and validation of normal tissue complication probability (NTCP) models, which correlate radiotherapy dose in OARs with observed toxicity. To develop these NTCP models, uniform and precise delineation of OARs is crucial.

However, the manual delineation of all these structures (34 OARs in total) is a labor intensive process of more than an hour and can thereby impede the clinical workflow. This number includes structures such as the hippocampus as well as substructures such as the anterior part of the left hippocampus. Furthermore, manual delineation is subject to inter- and intraobserver variability [2,3]. For this reason, computerized techniques based on medical image processing and analysis have been developed
to automate the delineation process (autosegmentation), usually based on CT imaging [4]. This approach can reduce interobserver variability and delineation time, although the results are not always accurate enough for clinical use [2,5]. At present, OARs are usually delineated manually in clinical practice.

Automated atlas-based segmentation (ABS) is the most common method to automate the OAR delineation process and is commercially available in several treatment planning systems [6,7]. Deep learning-based approaches have become more common in the last decade [8]. Early evaluations of deep learning-based autosegmentation have shown that these delineations are generally accurate and consistent [9–13]. Van Dijk et al. found that deep learning was generally preferred for OAR delineations in the neck region but that atlas-based OAR delineations in the brain region performed better than or equal to their deep learning counterparts [14]. Similarly, Urago et al. compared atlas-based segmentation with AI-based segmentation and concluded that there was no difference between the two methods for head and neck cancer [15]. New and improved algorithms are being developed continuously for each approach, meaning that comparisons between methods are only valid in comparing the specific algorithms which were analyzed.

Autosegmentation of OARs in the brain region on MR scans instead of CT scans is rare, although MR is recommended for delineation due to its superior soft tissue contrast resolution [6,16–21]. Usually in the literature, only a limited number of OARs in the brain region are segmented. To our knowledge, there are no publications about autosegmentation of OARs in the brain region based on an international consensus-based atlas.

Radiotherapy of brain tumors can cause serious complications, such as blindness, by exceeding the tolerance dose of the optic nerves/chiasm. To prevent serious complications, it is essential to accurately delineate OARs and respect the constraints of relevant OARs [22]. The dose–effect relations for serious complications after photon therapy were summarized in the QUANTEC papers [23–25]. Strategies to lower the risk of serious complications after stereotactic radiotherapy have also been reported [26]. Limited literature is available about OAR dose constraints for proton therapy, although serious complications after proton therapy, such as radiation-induced optic neuropathy, have also been reported [27,28].

Another opportunity for uniform OAR delineation, as proposed by the EPTN, is the development and validation of NTCP models, which require a large amount of dosimetric and prospective clinical data in combination with accurate OAR delineations [1,29]. These models calculate the risk of complications and are used to reduce complicated dosimetric and anatomic information to a single risk measure. In the Netherlands, the indication of proton therapy instead of photon therapy is determined by the lower chance of complications with proton therapy [30]. These calculations are mainly based on a lower risk of complications such as grade 2 or grade 3 toxicity. Currently, no externally validated NTCP models are available in relation to brain toxicity after radiotherapy [1]. Accurate OAR delineation through ABS in combination with prospective data registrations makes the development of NTCP models more viable for brain toxicity by reducing the manual workload and interobserver variability [5].

We developed an automated MR-based atlas in our treatment planning system based on the EPTN International Neurological Contouring Atlas. In this study, we compared its performance to manual delineation in both model and evaluation cohorts.

Materials and methods

Commercially available ABS software (RayStation version 9B, RaySearch Laboratories AB, Stockholm, Sweden) was used to develop the atlas. In ABS, data of previously delineated OARs can be used to segment new image data based on delineations from multiple image sets [31]. Then, an automated atlas is trained based on this input. The produced atlases, which closely match the new image data, are found through rigid image registration. Additionally, model-based segmentation (MBS) for 50 atlases was applied. This means that the OAR geometries are then deformed onto the new image data by applying deformable image registration [32].

The Medical Ethical Committee Leiden-The Hague-Delft provided consent for this study (reference number G21.076). Seventy anonymized cerebral 3D MR scans (T1-weighted, voxel size 0.9 × 0.9 × 0.9 mm³, slice thickness 0.9 mm), which had been used for radiotherapy treatment planning in the period of January 2018 until May 2021, were available. The brain region anatomy in this group was variable, in accordance with recommendations from the ABS software developers. Of these seventy MR scans, 50 were used for the development of the automated atlas (model cohort). Twenty out of 50 MR scans from the model cohort were used for internal evaluation (internal cohort). From the original 70 MR scans, 20 MR scans had not been used in the model cohort, but were used for the external evaluation (external cohort). External validation refers to the fact that the MR scans from this cohort were not used to train the atlas, while the MR scans from the internal validation cohort were initially used to train the automated atlas. Baseline characteristics are summarized in Table 1.

Table 1
Baseline characteristics.

Total cohort (n = 70)       Model cohort (n = 50/70)   Internal evaluation cohort (n = 20/50)*   External evaluation cohort (n = 20/70)**
Brain metastasis            23 (46 %)                  12 (60 %)                                 8 (40 %)
Glioblastoma multiforme     17 (34 %)                  4 (20 %)                                  8 (40 %)
Meningioma                  5 (10 %)                   1 (5 %)                                   3 (15 %)
Oligodendroglioma           2 (4 %)                    2 (10 %)                                  1 (5 %)
Other                       3 (6 %)                    1 (5 %)                                   0
Postoperative (%)           32 (64 %)                  15 (75 %)                                 9 (45 %)

*All MR scans from the internal evaluation cohort were also part of the model cohort.
**All MR scans from the external evaluation cohort were not part of the model cohort.

The OAR delineations in the model cohort were redelineated or adjusted to meet the criteria of the EPTN consensus [1]. Final approval for use of the delineations in the atlas was given by an experienced radiation oncologist (JZ). The automated atlas was trained to delineate 34 OARs from the EPTN consensus, the full list of which can be found in Tables 2 and 3. After the development of the atlas based on the 50 model cohort MR scans, the atlas was used to automatically delineate 34 OARs on MR scans from the internal and external evaluation cohorts.

The atlas-based delineations were compared to the manual delineations using the Dice similarity coefficient (DSC) for each OAR. DSC describes the relative overlap of two segmentation volumes (X and Y) and is defined as DSC(X, Y) = 2|X ∩ Y| / (|X| + |Y|). The DSC values range between 0 (no overlap) and 1 (complete overlap).

The quality of the atlas-based delineations was determined by expert evaluation through a scoring system. Experts were defined as having over 5 years of experience as a neuro-radiation oncologist. In this system, two experienced radiation oncologists (JZ, RW) independently determined the number of transverse MR slices where adjustments were deemed to be required for the specific OAR to be clinically acceptable (Fig. 1). The percentage of transverse MR slices where adjustments were required was then calculated for each OAR. A score of 0 % meant that no adjustments for that OAR were required in any MR scans, while a score of 10 % meant that in 10 % of MR slices with delineations for that specific OAR adjustments were required to make it clinically acceptable. Scores of < 5 %, 5 to 15 %, 15 to 25 % and ≥ 25 % were deemed excellent, good, fair, and poor, respectively. Interobserver agreement between the two radiation oncologists in the expert evaluation was determined by calculating the average score similarity per OAR.

The time needed for one radiation oncologist (JZ) to manually delineate all structures and the time needed to correct mistakes from the atlas-based delineations in a clinical setting were measured for 10 MR scans. The time difference between these two methods to delineate OARs was calculated for each MR scan. Computation time to generate atlas-based delineations was not included in the measurements, because this is an automated process which does not require significant effort from treatment planning staff.

Descriptive statistics were used to describe the DSC, reporting means and standard deviations (SD). Boxplots of the calculated DSC values of all 34 OARs were generated. To compare the differences in DSC values between the internal and external cohorts, normality tests were executed. Based on the results, we decided to use the nonparametric independent-samples Mann-Whitney U test. To correct for multiple testing, a Bonferroni adjustment was performed. A two-sided p value ≤ 0.001 (0.05/34) was considered statistically significant. Statistical analysis was performed using SPSS version 26.0 (IBM Corp, Armonk, NY).

Results

In the internal evaluation cohort, 14 (41 %), 17 (50 %) and 3 (9 %) out of 34 OAR delineations showed mean DSCs of ≥ 0.90, 0.75–0.90 and < 0.75, respectively. In the external evaluation cohort, 13 (38 %), 18 (53 %) and 3 (9 %) out of 34 OAR delineations showed mean DSCs of ≥ 0.90, 0.75–0.90 and < 0.75, respectively. After Bonferroni adjustment for multiple testing, only one OAR (right cochlea) differed significantly between the cohorts (p = 0.001). The lowest mean DSC values were measured in the delineations of the lacrimal glands and semicircular canals, while the highest were found in the delineations of the pons, cerebellum and brainstem. The mean DSC for each OAR is shown in Table 2. Fig. 2 shows DSC box plots of 34 OAR delineations from the external evaluation cohort, as well as the average volume of each OAR. In Fig. 2, one case (number 9) was an outlier with DSC values outside of the 75th percentiles in 13 OARs.

In the expert evaluation, 25 of 34 OAR delineations were rated excellent (74 %), 4 were rated good (12 %), 4 were rated fair (12 %) and 1 was rated poor (3 %) in the internal evaluation cohort. Similarly, 22 out of 34 OAR delineations were rated excellent (65 %), 7 were rated good (21 %), 4 were rated fair (12 %) and 1 was rated poor (3 %) in the external evaluation cohort. Fig. 3 shows the mean percentage of adjusted slices per OAR. Interobserver agreement between the radiation oncologists in the expert evaluation was high for all OARs in both cohorts, with agreement ranging from 77–100 %. Table 3 shows interobserver agreement in the expert evaluation.
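The two evaluation measures described in Materials and methods, the DSC and the slice-based quality score, can be illustrated with a minimal Python sketch. This is not the RayStation or SPSS implementation used in the study: the function names, the representation of a delineation as a set of voxel coordinates, and the handling of scores that fall exactly on a category boundary are assumptions made for illustration only.

```python
def dice_similarity(x, y):
    """Dice similarity coefficient of two voxel sets: 2|X ∩ Y| / (|X| + |Y|)."""
    x, y = set(x), set(y)
    if not x and not y:
        # Assumption: two empty delineations are treated as a perfect match.
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


def rate_delineation(adjusted_slices, total_slices):
    """Map the percentage of slices needing adjustment to a quality label."""
    percent = 100 * adjusted_slices / total_slices
    if percent < 5:
        return "excellent"
    if percent < 15:
        return "good"
    # Assumption: a score of exactly 15 % is counted as fair; the text
    # leaves this shared boundary open.
    if percent < 25:
        return "fair"
    return "poor"


# Hypothetical toy delineations given as (slice, row, column) voxel indices.
manual = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
automated = {(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 1, 1)}

print(dice_similarity(manual, automated))  # 3 shared voxels: 2 * 3 / (4 + 4)
print(rate_delineation(4, 40))             # 10 % of slices adjusted
```

In this toy case the two delineations share three of their four voxels, giving a DSC of 0.75, and an OAR with 4 of 40 slices adjusted would fall in the "good" category.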

Table 2
Mean Dice Similarity Coefficient. To correct for multiple testing, a Bonferroni adjustment was performed. A two-sided p value ≤ 0.001 (0.05/34) was considered statistically significant. P values in bold are statistically significant.

OAR | Internal evaluation cohort: mean DSC (SD, range) | External evaluation cohort: mean DSC (SD, range) | p value
Brain 0.93 (0.06, 0.83–0.99) 0.95 (0.06, 0.81–0.99) 0.3
BrainStem 0.96 (0.02, 0.87–0.97) 0.96 (0.004, 0.95–0.97) 0.3
BrainStem_interior 0.94 (0.03, 0.83–0.95) 0.95 (0.004, 0.94–0.95) 0.8
BrainStem_surface 0.80 (0.06, 0.58–0.84) 0.81 (0.02, 0.77–0.84) 0.5
Cerebellum 0.96 (0.02, 0.88–0.98) 0.96 (0.01, 0.91–0.97) 0.8
Cerebellum_anterior 0.93 (0.02, 0.86–0.96) 0.94 (0.01, 0.91–0.96) 0.1
Cerebellum_posterior 0.94 (0.03, 0.87–0.96) 0.95 (0.02, 0.87–0.96) 0.8
Cochlea_L 0.81 (0.04, 0.74–0.90) 0.84 (0.04, 0.73–0.91) 0.05
Cochlea_R 0.80 (0.06, 0.68–0.92) 0.86 (0.04, 0.78–0.91) 0.001
Cornea_L 0.81 (0.05, 0.68–0.92) 0.81 (0.2, 0.18–0.91) 0.1
Cornea_R 0.80 (0.07, 0.70–0.87) 0.83 (0.1, 0.38–0.9) 0.005
Hippocampus_L 0.90 (0.03, 0.76–0.92) 0.91 (0.02, 0.85–0.93) 0.007
Hippocampus_anterior_L 0.89 (0.03, 0.82–0.93) 0.90 (0.04, 0.75–0.94) 0.5
Hippocampus_posterior_L 0.85 (0.06, 0.61–0.89) 0.86 (0.03, 0.76–0.90) 0.6
Hippocampus_R 0.91 (0.02, 0.86–0.95) 0.92 (0.02, 0.87–0.94) 0.1
Hippocampus_anterior_R 0.88 (0.02, 0.84–0.94) 0.90 (0.02, 0.85–0.93) 0.02
Hippocampus_posterior_R 0.85 (0.07, 0.60–0.91) 0.85 (0.05, 0.70–0.89) 0.6
Hypothalamus 0.78 (0.07, 0.55–0.86) 0.77 (0.06, 0.63–0.89) 0.4
LacrimalGland_L 0.76 (0.2, 0.26–0.88) 0.83 (0.06, 0.66–0.91) 0.05
LacrimalGland_R 0.68 (0.2, 0.10–0.85) 0.65 (0.2, 0.27–0.90) 0.9
Lens_L 0.83 (0.06, 0.73–0.91) 0.81 (0.08, 0.61–0.91) 0.3
Lens_R 0.80 (0.1, 0.46–0.9) 0.79 (0.1, 0.36–0.9) 0.7
MedullaOblongata 0.92 (0.02, 0.89–0.95) 0.92 (0.05, 0.73–0.95) 0.7
Midbrain 0.95 (0.01, 0.93–0.98) 0.94 (0.009, 0.92–0.96) 0.3
OpticChiasm 0.83 (0.02, 0.79–0.86) 0.85 (0.03, 0.79–0.91) 0.03
OpticNerve_L 0.83 (0.04, 0.71–0.88) 0.84 (0.1, 0.41–0.90) 0.03
OpticNerve_R 0.84 (0.05, 0.66–0.88) 0.85 (0.1, 0.33–0.92) 0.003
Pituitary 0.84 (0.2, 0.01–0.94) 0.89 (0.02, 0.84–0.93) 0.07
Pons 0.96 (0.005, 0.95–0.97) 0.96 (0.007, 0.94–0.97) 0.3
Retina_L 0.90 (0.02, 0.84–0.92) 0.87 (0.2, 0.22–0.93) 0.08
Retina_R 0.90 (0.03, 0.8–0.92) 0.87 (0.2, 0.14–0.93) 0.07
SpinalCord 0.91 (0.03, 0.81–0.94) 0.89 (0.1, 0.46–0.95) 0.4
Vestibulum_SemicircularCanals_L 0.49 (0.3, 0.05–0.87) 0.42 (0.3, 0.04–0.84) 0.6
Vestibulum_SemicircularCanals_R 0.60 (0.4, 0.01–0.9) 0.68 (0.2, 0.02–0.86) 0.4
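As a cross-check of the summary counts reported in the Results, the rounded means in Table 2 can be grouped into the three DSC bands with a short Python sketch. The dictionary below is copied from Table 2; the variable and function names are illustrative assumptions, and the study itself may have banded unrounded values. The sketch also evaluates the Bonferroni term 0.05/34, whose unrounded value is about 0.0015 and which the text reports as 0.001.

```python
# Mean DSC per OAR copied from Table 2: (internal cohort, external cohort).
MEAN_DSC = {
    "Brain": (0.93, 0.95), "BrainStem": (0.96, 0.96),
    "BrainStem_interior": (0.94, 0.95), "BrainStem_surface": (0.80, 0.81),
    "Cerebellum": (0.96, 0.96), "Cerebellum_anterior": (0.93, 0.94),
    "Cerebellum_posterior": (0.94, 0.95), "Cochlea_L": (0.81, 0.84),
    "Cochlea_R": (0.80, 0.86), "Cornea_L": (0.81, 0.81),
    "Cornea_R": (0.80, 0.83), "Hippocampus_L": (0.90, 0.91),
    "Hippocampus_anterior_L": (0.89, 0.90),
    "Hippocampus_posterior_L": (0.85, 0.86), "Hippocampus_R": (0.91, 0.92),
    "Hippocampus_anterior_R": (0.88, 0.90),
    "Hippocampus_posterior_R": (0.85, 0.85), "Hypothalamus": (0.78, 0.77),
    "LacrimalGland_L": (0.76, 0.83), "LacrimalGland_R": (0.68, 0.65),
    "Lens_L": (0.83, 0.81), "Lens_R": (0.80, 0.79),
    "MedullaOblongata": (0.92, 0.92), "Midbrain": (0.95, 0.94),
    "OpticChiasm": (0.83, 0.85), "OpticNerve_L": (0.83, 0.84),
    "OpticNerve_R": (0.84, 0.85), "Pituitary": (0.84, 0.89),
    "Pons": (0.96, 0.96), "Retina_L": (0.90, 0.87),
    "Retina_R": (0.90, 0.87), "SpinalCord": (0.91, 0.89),
    "Vestibulum_SemicircularCanals_L": (0.49, 0.42),
    "Vestibulum_SemicircularCanals_R": (0.60, 0.68),
}


def count_bands(values):
    """Return the number of values that are >= 0.90, in 0.75-0.90, and < 0.75."""
    high = sum(v >= 0.90 for v in values)
    low = sum(v < 0.75 for v in values)
    return high, len(values) - high - low, low


internal = count_bands([i for i, _ in MEAN_DSC.values()])
external = count_bands([e for _, e in MEAN_DSC.values()])

# Bonferroni-adjusted significance level for 34 comparisons.
bonferroni_alpha = 0.05 / len(MEAN_DSC)

print(internal)                    # (14, 17, 3)
print(external)                    # (13, 18, 3)
print(round(bonferroni_alpha, 4))  # 0.0015
```

The counts agree with the 14/17/3 and 13/18/3 split given in the Results for the internal and external evaluation cohorts.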

The mean time required to manually delineate all 34 OARs in a clinical setting was 88.5 minutes (range 80–95, SD 4.8) in 10 MR scans. The mean time required to verify each OAR and correct any mistakes in the automated delineations was 15.8 minutes (range 13–18, SD 1.7) in the same 10 MR scans. The mean time difference was 72.7 minutes (range 67–77, SD 3.4).

Table 3
Interobserver agreement between two radiation oncologists in the expert evaluation (see Fig. 3).

OAR | Internal evaluation cohort, % | External evaluation cohort, %
Brain 84 91
BrainStem 100 99
Brainstem_interior 100 99
Brainstem_surface 100 99
Cerebellum 96 97
Cerebellum_anterior 92 77
Cerebellum_posterior 83 94
Cochlea_L 100 100
Cochlea_R 100 100
Cornea_L 100 100
Cornea_R 100 100
Hippocampus_L 100 100
Hippocampus_anterior_L 100 100
Hippocampus_posterior_L 100 100
Hippocampus_R 100 82
Hippocampus_anterior_R 100 79
Hippocampus_posterior_R 100 90
Hypothalamus 78 77
LacrimalGland_L 100 100
LacrimalGland_R 100 100
Lens_L 99 100
Lens_R 100 100
MedullaOblongata 98 94
Midbrain 94 91
OpticChiasm 98 100
OpticNerve_L 85 78
OpticNerve_R 83 82
Pituitary 100 100
Pons 87 90
Retina_L 98 100
Retina_R 100 100
SpinalCord 98 100
Vestibulum_SemicircularCanals_L 97 96
Vestibulum_SemicircularCanals_R 95 95

Discussion

We demonstrated that ABS of OARs in the brain on MR enables high-quality contouring within a limited time compared to manually delineating all 34 OARs, with a time gain of over 70 minutes. Only a few OARs, such as the brain and optic nerves, required frequent corrections to achieve clinical acceptability. This makes accurate and efficient OAR delineation according to the EPTN consensus possible in routine clinical practice. Because our automated atlas is MR-based, more soft tissue structures can be accurately delineated automatically.

In this study, a significant difference in DSC was found between the internal and external cohorts in only one OAR (right cochlea). This finding does not seem clinically relevant because the mean DSC values of both cohorts for this small OAR were acceptable (0.80 and 0.86). Furthermore, the cochlea may be segmented more accurately on CT scans due to the increased visibility of this structure on CT. The DSCs of other OAR delineations were similar between the internal and external cohorts. The fact that no other significant difference in DSC between the internal and external evaluation cohorts could be found and that the DSC values in both cohorts were generally high suggests that automated OAR delineation quality is not decreased in new (external) image data.

Several OAR delineations with a smaller volume showed lower mean DSC values than larger OAR delineations, even when the automated delineations were deemed acceptable in the expert evaluation. This effect can be seen in Fig. 2, where some OARs with low mean DSC values had a small volume (lacrimal glands, vestibulum/semicircular canals). OARs that were rated as the poorest in the expert evaluation, such as the brain and anterior cerebellum, did not also show low mean DSC values. This finding is consistent with other studies, which showed that DSC is dependent on the delineation volume [15,18]. This does not hinder the clinical workflow because smaller OAR delineations take less time to adjust. The delineations of the brain were rated lower in our evaluation cohort (Fig. 3) due to insufficient contrast between brain and skull in some MR scans. This would lead to the brain delineation extending to the skull, which in practice should be manually corrected. An example of this can be seen in Fig. 1. In some MR scans, the optic nerves did not border the optic chiasm, which meant that these structures required more corrections (Fig. 3). The automated delineations of the hypothalamus were rated lower than other OARs because these delineations often did not cover the entire hypothalamus.

Fig. 1. A: Example of clinically acceptable automated delineations on a transverse slice of MR scan. B: Example of clinically unacceptable automated delineations (brain, optic nerve) on a transverse slice of MR scan. The delineations of the optic chiasm and optic nerve are not connected.

Fig. 2. Box plots of DSCs for OARs (external cohort). The central lines represent median values. The lower and upper box edges represent the 25th and 75th percentiles. The whiskers show the maximum and minimum values. Circles represent outliers at least 1.5 box lengths from the median along with the corresponding patient number. Asterisks represent extremes at least three box lengths from the median. OARs are sorted from largest mean volume (left) to smallest mean volume (right). VSC = Vestibulum/semicircular canals.

Fig. 3. Mean percentage of adjusted slices per OAR (based on the average percentage of observer 1 and observer 2). Green: not adjusted. Red: adjusted percentage.

This study shows that with ABS, a generally high interobserver agreement can be achieved, although for a limited number of OARs, such as the hypothalamus and the anterior part of the cerebellum, interobserver agreement was lower. To achieve the highest quality of OAR delineation, the authors recommend thorough evaluation of all OARs after ABS for clinical use by at least one radiation oncologist and preferably a second observer to make the final adjustments.

Quality assurance requirements are different if deformable registration is limited to contouring segmentations or is extended to dose mapping and dose accumulation [33]. In our study, registration between patient images and an automated MR-based OAR atlas was performed for segmentation purposes. The quality assurance of contour propagation was based on visual inspection by trained radiation oncologists. The quality of CT and MR registration is also very important as the next step of radiotherapy planning.

Large tumors causing disruption of the regular anatomy could have a negative influence on the performance of the automated atlas. It can require more effort from the radiation oncologist to correct mistakes. However, the authors have rarely seen this effect in practice. In both the training as well as the evaluation of the automated atlas, MR scans with larger tumors were also included to ensure a larger variety of anatomy to train the automated atlas with. Additionally, this is an argument in favor of MR-based delineation, because disruption of the regular anatomy is more visible on MR compared to CT in, for example, the optic system. Furthermore, we did not specifically look at differences in quality of automated delineations between pre- and post-operative MR scans. As can be seen in Table 1, 75 % of the internal evaluation cohort and 45 % of the external evaluation cohort consisted of post-operative MR scans. The quality of automated delineations did not differ between the internal and external evaluation cohort, so it is suggested that the automated atlas still performs well in post-operative MR scans. A significant disruption of the regular anatomy in a post-operative scan can require more effort from the radiation oncologist to correct mistakes.

A limitation of this study is that there is a need for validation of this model by applying it to datasets from other institutions with evaluation from more observers [34]. While we have established that the atlas works at our institution, factors such as imaging quality could influence its performance in other institutions. This was illustrated with one case, where non-clinically relevant motion artifacts, and therefore lower image quality, likely led to poorer results for several OAR delineations (Fig. 2; case 9). Another limitation of this study was that the model was based on only 50 cerebral MR scans. The input of more image data may further improve its performance. Structures that can easily move such as the lens and optic nerves are advised by the international consensus to be delineated on (planning) CT since this anatomy is used for treatment planning with or without patient instructions. Also, the brain and skin are best delineated using CT instead of MR. In our
study, these structures (except skin) were primarily ABS delineated on MR and afterwards adjusted on CT if it was needed. In future developments, an atlas can be created on co-registered CT and MR images. Additionally, the EPTN consensus has recently been updated and now includes additional OARs [35]. Therefore, further development of our automated atlas is warranted.

In conclusion, MR-based ABS enables efficient and high-quality contouring of 34 OARs according to the EPTN consensus. It can be used to replace manual delineation of OARs to save time in the clinical workflow, if all OARs are thoroughly evaluated by experienced radiation oncologists. Furthermore, accurate OAR delineation helps to define OAR constraints to mitigate serious complications and enables the development of NTCP models.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] Eekers DBP, in ’t Ven L, Roelofs E, Postma A, Alapetite C, Burnet NG, et al. The EPTN consensus-based atlas for CT- and MR-based contouring in neuro-oncology. Radiother Oncol 2018;128:37–43. https://doi.org/10.1016/j.radonc.2017.12.013.
[2] Tao C-J, Yi J-L, Chen N-Y, Ren W, Cheng J, Tung S, et al. Multi-subject atlas-based auto-segmentation reduces interobserver variation and improves dosimetric parameter consistency for organs at risk in nasopharyngeal carcinoma: A multi-institution clinical study. Radiother Oncol 2015;115:407–11. https://doi.org/10.1016/j.radonc.2015.05.012.
[3] Vogin G, Hettal L, Bartau C, Thariat J, Claeys M-V, Peyraga G, et al. Cranial organs at risk delineation: heterogenous practices in radiotherapy planning. Radiat Oncol 2021;16. https://doi.org/10.1186/s13014-021-01756-y.
[4] Chaney EL, Pizer SM. Autosegmentation of Images in Radiation Oncology. J Am Coll Radiol 2009;6:455–8. https://doi.org/10.1016/j.jacr.2009.02.014.
[5] Teguh DN, Levendag PC, Voet PWJ, Al-Mamgani A, Han X, Wolf TK, et al. Clinical validation of atlas-based auto-segmentation of multiple target volumes and normal tissue (swallowing/mastication) structures in the head and neck. Int J Radiat Oncol Biol Phys 2011;81:950–7. https://doi.org/10.1016/j.ijrobp.2010.07.009.
[6] Vrtovec T, Močnik D, Strojan P, Pernuš F, Ibragimov B. Auto-segmentation of organs at risk for head and neck radiotherapy planning: From atlas-based to deep learning methods. Med Phys 2020;47:e929–50. https://doi.org/10.1002/mp.14320.
[7] La Macchia M, Fellin F, Amichetti M, Cianchetti M, Gianolini S, Paola V, et al. Systematic evaluation of three different commercial software solutions for automatic segmentation for adaptive therapy in head-and-neck, prostate and pleural cancer. Radiat Oncol 2012;7. https://doi.org/10.1186/1748-717X-7-
[14] ... organs at risk by Deep Learning Contouring. Radiother Oncol 2020;142:115–23. https://doi.org/10.1016/j.radonc.2019.09.022.
[15] Urago Y, Okamoto H, Kaneda T, Murakami N, Kashihara T, Takemori M, et al. Evaluation of auto-segmentation accuracy of cloud-based artificial intelligence and atlas-based models. Radiat Oncol 2021;16. https://doi.org/10.1186/s13014-021-01896-1.
[16] Orasanu E, Brosch T, Glide-Hurst C, Renisch S. Organ-at-risk segmentation in brain MRI using model-based segmentation: benefits of deep learning-based boundary detectors. Shape Med Imaging 2018;11167:291–9. https://doi.org/10.1007/978-3-030-04747-4_27.
[17] Mlynarski P, Delingette H, Alghamdi H, Bondiau P-Y, Ayache N. Anatomically consistent CNN-based segmentation of organs-at-risk in cranial radiotherapy. J Med Imag 2020;7:014502. https://doi.org/10.1117/1.jmi.7.1.014502.
[18] Isambert A, Dhermain F, Bidault F, Commowick O, Bondiau P-Y, Malandain G, et al. Evaluation of an atlas-based automatic segmentation software for the delineation of brain organs at risk in a radiation therapy clinical context. Radiother Oncol 2008;87:93–9. https://doi.org/10.1016/j.radonc.2007.11.030.
[19] Chen H, Lu W, Chen M, Zhou L, Timmerman R, Tu D, et al. A recursive ensemble organ segmentation (REOS) framework: Application in brain radiotherapy. Phys Med Biol 2019;64:025015. https://doi.org/10.1088/1361-6560/aaf83c.
[20] Dolz J, Laprie A, Ken S, Leroy H-A, Reyns N, Massoptier L, et al. Supervised machine learning-based classification scheme to segment the brainstem on MRI in multicenter brain tumor treatment context. Int J Comput Assist Radiol Surg 2016;11:43–51. https://doi.org/10.1007/s11548-015-1266-2.
[21] Agn M, Munck af Rosenschöld P, Puonti O, Lundemann MJ, Mancini L, Papadaki A, et al. A modality-adaptive method for segmenting brain tumors and organs-at-risk in radiation therapy planning. Med Image Anal 2019;54:220–37. https://doi.org/10.1016/j.media.2019.03.005.
[22] Lambrecht M, Eekers DBP, Alapetite C, Burnet NG, Calugaru V, Coremans IEM, et al. Radiation dose constraints for organs at risk in neuro-oncology; the European Particle Therapy Network Consensus. Radiother Oncol 2018;128:26–36. https://doi.org/10.1016/j.radonc.2018.05.001.
[23] Bentzen SM, Constine LS, Deasy JO, Eisbruch A, Jackson A, Marks LB, et al. Quantitative analyses of normal tissue effects in the clinic (QUANTEC): An introduction to the scientific issues. Int J Radiat Oncol Biol Phys 2010;76:S3–9. https://doi.org/10.1016/j.ijrobp.2009.09.040.
[24] Marks LB, Yorke ED, Jackson A, Ten Haken RK, Constine LS, Eisbruch A, et al. Use of normal tissue complication probability models in the clinic. Int J Radiat Oncol Biol Phys 2010;76:S10–9. https://doi.org/10.1016/j.ijrobp.2009.07.1754.
[25] Kirkpatrick JP, Marks LB, Mayo CS, Lawrence YR, Bhandare N, Ryu S. Estimating normal tissue toxicity in radiosurgery of the CNS: application and limitations of QUANTEC. J radiosurgery SBRT 2011;1:95–107.
[26] Lo SS, Sahgal A, Chang EL, Mayr NA, Teh BS, Huang Z, et al. Serious complications associated with stereotactic ablative radiotherapy and strategies to mitigate the risk. Clin Oncol 2013;25:378–87. https://doi.org/10.1016/j.clon.2013.01.003.
[27] Kountouri M, Pica A, Walser M, Albertini F, Bolsi A, Kliebsch U, et al. Radiation-induced optic neuropathy after pencil beam scanning proton therapy for skull-base and head and neck tumours. Br J Radiol 2020;93:20190028. https://doi.org/10.1259/bjr.20190028.
[28] Köthe A, van Luijk P, Safai S, Kountouri M, Lomax AJ, Weber DC, et al. Combining clinical and dosimetric features in a PBS proton therapy cohort to develop a NTCP model for radiation-induced optic neuropathy. Int J Radiat Oncol Biol Phys 2021;110:587–95. https://doi.org/10.1016/j.ijrobp.2020.12.052.
[29] Eekers DBP, in ’t Ven L, Deprez S, Jacobi L, Roelofs E, Hoeben A, et al. The posterior cerebellum, a new organ at risk? Clin Transl Radiat Oncol
160. 2018;8:22–6. https://doi.org/10.1016/j.ctro.2017.11.010.
[8] Boldrini L, Bibault JE, Masciocchi C, Shen Y, Bittner MI. Deep learning: A review [30] Langendijk JA, Hoebers FJP, de Jong MA, Doornaert P, Terhaard CHJ,
for the radiation oncologist. Front Oncol 2019;9:977. https://doi.org/ Steenbakkers RJHM, et al. National protocol for model-based selection for
10.3389/fonc.2019.00977. proton therapy in head and neck cancer. Int J Part Ther 2021;8:354–65.
[9] Wong J, Fong A, McVicar N, Smith S, Giambattista J, Wells D, et al. Comparing https://doi.org/10.14338/IJPT-20-00089.1.
deep learning-based auto-segmentation of organs at risk and clinical target [31] Loi G, Fusella M, Lanzi E, Cagni E, Garibaldi C, Iacoviello G, et al. Performance of
volumes to expert inter-observer variability in radiotherapy planning. commercially available deformable image registration platforms for contour
Radiother Oncol 2020;144:152–8. https://doi.org/10.1016/j. propagation using patient-based computational phantoms: A multi-
radonc.2019.10.019. institutional study. Med Phys 2018;45:748–57. https://doi.org/10.1002/
[10] Ibragimov B, Xing L. Segmentation of organs-at-risks in head and neck CT mp.12737.
images using convolutional neural networks. Med Phys 2017;44:547–57. [32] Raystation 9B User Manual. RaySearch Laboratories AB, Stockholm, Sweden,
https://doi.org/10.1002/mp.12045. 2019.
[11] Wang J, Lu J, Qin G, Shen L, Sun Y, Ying H, et al. Technical note: A deep [33] Brock KK, Mutic S, McNutt TR, Li H, Kessler ML. Use of image registration and
learning-based autosegmentation of rectal tumors in MR images. Med Phys fusion algorithms and techniques in radiotherapy: Report of the AAPM
2018;45:2560–4. https://doi.org/10.1002/mp.12918. Radiation Therapy Committee Task Group No. 132. Med Phys 2017;44:
[12] Zhong T, Huang X, Tang F, Liang S, Deng X, Zhang Y. Boosting-based cascaded e43–76. https://doi.org/10.1002/mp.12256.
convolutional neural networks for the segmentation of CT organs-at-risk in [34] Collins GS, Reitsma JB, Altman DG, Moons KGM. Transparent reporting of a
nasopharyngeal carcinoma. Med Phys 2019;46:5602–11. https://doi.org/ multivariable prediction model for individual prognosis or diagnosis
10.1002/mp.13825. (TRIPOD): The TRIPOD Statement. BMC Med 2015;13:1–10. https://doi.org/
[13] van der Veen J, Willems S, Deschuymer S, Robben D, Crijns W, Maes F, et al. 10.1186/s12916-014-0241-z.
Benefits of deep learning for delineation of organs at risk in head and neck [35] Eekers DBP, Di Perri D, Roelofs E, Postma A, Dijkstra J, Ajithkumar T, et al.
cancer. Radiother Oncol 2019;138:68–74. https://doi.org/10.1016/j. Update of the EPTN atlas for CT- and MR-based contouring in Neuro-Oncology.
radonc.2019.05.010. Radiother Oncol 2021;160:259–65. https://doi.org/10.1016/j.
[14] van Dijk LV, Van den Bosch L, Aljabar P, Peressutti D, Both S, J.H.M. radonc.2021.05.013.
Steenbakkers R, et al. Improving automatic delineation for head and neck

