Disruption Prediction Investigations Using Machine Learning Tools On DIII-D and Alcator C-Mod
PAPER
E-mail: crea@mit.edu
Abstract
Using data-driven methodology, we exploit the time series of relevant plasma parameters for a
large set of disrupted and non-disrupted discharges to develop a classification algorithm for
detecting disruptive phases in shots that eventually disrupt. Comparing the same methodology
on different devices is crucial in order to have information on the portability of the developed
algorithm and the possible extrapolation to ITER. Therefore, we use data from two very different
tokamaks, DIII-D and Alcator C-Mod. We focus on a subset of disruption predictors, most of
which are dimensionless and/or machine-independent parameters, coming from both plasma
diagnostics and equilibrium reconstructions, such as the normalized plasma internal inductance ℓi
and the n=1 mode amplitude normalized to the toroidal magnetic field. Using such
dimensionless indicators facilitates a more direct comparison between DIII-D and C-Mod. We
then choose a shallow Machine Learning technique, called Random Forests, to explore the
databases available for the two devices. We show results from the classification task, where we
introduce a time dependency through the definition of class labels on the basis of the elapsed
time before the disruption (i.e. ‘far from a disruption’ and ‘close to a disruption’). The
performances of the different Random Forest classifiers are discussed in terms of several metrics,
by showing the number of successfully detected samples, as well as the misclassifications. The
overall model accuracies are above 97% when identifying a ‘far from disruption’ and a
‘disruptive’ phase for disrupted discharges. Nevertheless, the Forests are intrinsically different in
their capability of predicting a disruptive behavior, with C-Mod predictions comparable to
random guesses. Indeed, we show that C-Mod recall index, i.e. the sensitivity to a disruptive
behavior, is as low as 0.47, while DIII-D recall is ∼0.72. The portability of the developed
algorithm is also tested across the two devices, by using DIII-D data for training the forests and
C-Mod for testing and vice versa.
Plasma Phys. Control. Fusion 60 (2018) 084004 C Rea et al
1. Introduction
The physics of disruptions in present tokamak devices still remains a challenge for the fusion community. A thorough, physical understanding of the transition mechanisms that drive the plasma away from a stable phase into an unstable regime has not yet been reached. Lacking a comprehensive theoretical model, scientists have addressed disruptions using advanced statistical analysis [1], and recent efforts have seen a boosted interest in the exploitation of state-of-the-art Machine Learning techniques [2] to develop data-driven predictors for disruption avoidance or mitigation [3–12]. Current devices are not extremely affected by these unforeseen discharge terminations; nevertheless, the consequences of disruption events for future tokamaks and reactors could be disastrous, given the energy scale and size.
Disruption precursors can be very different, and their phenomenology strongly depends on the analyzed device. Inspiring work has been published so far [13], presenting manual classifications of the chain of events that can lead to disruptions in tokamak plasmas. From these, one could implement advanced statistical techniques to automatically classify such transitions and thus define a possible disruption predictor. Other approaches foresee the incorporation of physics-based first-principle models in a complex statistical learning architecture [14].
These studies require an intense and expensive human intervention in the definition of the different cause-effect sequences that can be identified during the transition to a disruptive end, under the most varied operational conditions, which might indeed be very device-dependent. Most frequently, these studies are not affordable in terms of either time or human resources. The approach we propose in this paper follows the data science paradigm of letting the algorithm learn from the data, with as little human interference as possible. Statistical inference, developed after the algorithm has learned from the provided data, can provide important physics insights on the disruption dynamics, especially when comparing a similar predictive methodology on two very different devices.
In this paper we present the results regarding the application of the Random Forests algorithm [15] to two datasets coming from two very different tokamaks, DIII-D and Alcator C-Mod. The construction of the databases, as well as the selection of the relevant input features for our algorithms, will be discussed in section 2. In section 3, we will discuss and compare in detail the behavior of two plasma signals as disruption precursors, the normalized internal inductance ℓi and the n=1 mode amplitude. We will then thoroughly describe the Random Forests algorithm in section 4: this is a popular and very powerful Machine Learning algorithm, widely used in many different applications [16–18]. The results of its application to DIII-D and C-Mod data will be presented in section 5. Since a binary classification scheme is adopted, one efficient way of displaying the results is through the utilization of a confusion matrix. The number of correctly classified samples as well as the misclassified ones are reported, from which we can extract several performance metrics. We will also discuss the ranking of the relatively most important input variables in our binary classification scheme for both devices. In section 6 we will also test the portability of Random Forests across the two different devices; we will show the results coming from the algorithm trained on DIII-D data and tested on C-Mod and vice versa. Finally, conclusions are drawn in section 7, where we will discuss the reasons that led to such different predictive capabilities on DIII-D and C-Mod.
2. The DIII-D and C-Mod disruption databases
The physical processes that lead up to a disruption are complex [19], but based on extensive empirical experience, it is generally believed that changes in behavior of some routinely measured plasma parameters are correlated with the approach of a disruption. Therefore, our database currently consists of the time series values of ∼45 disruption-relevant signals, sampled simultaneously throughout the duration of all 2146 plasma discharges in the 2015 campaign on DIII-D, which includes 678 disruptions, and all 1821 plasma discharges in the 2015 campaign on Alcator C-Mod, which includes 643 disruptions. We include data from both disruptive and non-disruptive discharges because, in addition to predicting impending disruptions with high accuracy, we want to avoid predicting disruptions on discharges that will not disrupt (i.e. false positives).
The times at which the data are sampled consist of two distinct sets for each machine. For all shots in the DIII-D database we sample every 25 ms, starting at t=0.100 s and continuing until the end of the discharge. (Typical DIII-D shot durations range from 3 to 8 s.) For disruptive shots, we add a set of samples taken at 2 ms intervals for the 100 ms period preceding the disruption (see footnote 4). For all shots in the C-Mod database, we sample every 20 ms, starting at t=0.060 s and continuing until the end of the discharge. (The typical C-Mod shot duration is 2 s.) For disruptive shots, we add a set of samples taken at 1 ms intervals for the 20 ms period preceding the disruption. On each machine, in order to avoid overlap between the two sampling sets, we remove the slow-sampled points during the pre-disruption period of high-frequency sampling.
Footnote 4: The disruption time tD is defined to be the time of max(|dIp/dt|), which is typically about halfway down the decay of the plasma current, i.e. the current quench.
Our choice of parameters to include in the databases is based partly on our own tokamak operational experience, and partly on those specified in the relevant literature [12, 20]. These include plasma parameters directly measured by diagnostics, such as radiated power Prad and density, as well as those derived from EFIT equilibrium reconstructions [21], such as q95, elongation, and ℓi. We initially included a few plasma parameters that are explicit time derivatives, such as dWth/dt (where Wth is the stored plasma energy), but we found that the noise in these time derivative signals was much larger than any observed changes due to impending disruptions, and that they were therefore not useful. There are also several control-related parameters, such as the programmed plasma current Iprog (useful for separating data into the rampup, flattop, and rampdown phases of the discharges), a power supply status flag, an intentional disruption flag, etc. Many of the plasma parameters can be cast in a normalized form, such as Greenwald fraction and Prad/Pinput, which is useful for cross-machine analyses.
There are a number of additional factors that were considered in the design of the database. Since the ultimate goal is to develop a real-time disruption warning algorithm, we chose only parameters that, in principle, can be available in real-time to the plasma control system. This precludes the use of parameters that may be very useful for disruption warning, but which require extensive offline analysis to derive. On DIII-D (and several other tokamaks) a highly optimized version of EFIT runs in real-time in the plasma control system [22], so we are justified in including EFIT-derived data in our database. The goal of a real-time disruption predictor introduces another constraint on the data, namely the avoidance of non-causal filtering. Several of the desired parameters, such as Prad, are available from the MDSplus archiving system [23], but have been processed using non-causal smoothing windows. Incorporation of these data into the disruption database could lead to incorrect identification of these parameters as useful for disruption prediction. Therefore we have re-analyzed the offending signals to avoid or minimize non-causal filtering.
Although real-time EFITs are done on DIII-D, we have elected to use EFIT-derived data from full EFIT reconstructions done after the shot, since they are more accurate. However, the standard post-shot EFITs are not done at the higher sampling rates that we desire prior to disruptions. In order to avoid excessive interpolation, we have rerun EFIT on all the shots in our database, using the sampling times that we desire. For these custom EFITs, we also reduced the non-causal filtering of the magnetic diagnostic signals upon which the Grad–Shafranov reconstructions are based. The data from these custom EFITs are archived in an alternative MDSplus EFIT tree for each shot.
The data are stored in an SQL relational database and can be retrieved by any analysis software that supports SQL queries, including Matlab, IDL, Python, etc. Population of the database is done with a Matlab main program that loops through a specified list of shots, calling Matlab subroutines to extract and process the relevant parameter data from MDSplus for each shot, and interpolating data at the desired times. This architecture makes it rather easy to incorporate additional parameters, which we continue to do occasionally. Even adding data from other campaign years is feasible, with the primary effort being the need to run dedicated EFITs as described previously.
Each record in our SQL database consists of the values of the ∼45 parameters at a single time on a single shot. The shot number and the time are two of the parameters (the primary keys) in each record. The time_until_disrupt is another parameter in each record, but it is only defined for shots that disrupt. (For non-disruptive shots, it has a null value, or NaN in Matlab.) The ∼45 parameters can be thought of as columns of the database. There are typically 200–250 records (i.e. time slices) for each of the 2146 shots in the DIII-D database, totaling nearly 0.5 million records and containing ∼22 million parameter values, and 80–100 records for each of the 1821 shots in the C-Mod database, totaling 0.2 million records and containing ∼7.7 million parameter values.
For the Machine Learning studies described in the following sections of this paper, a subset of 10 particular parameters was chosen, based on the literature on disruption prediction [5, 7–12]. One of the chosen plasma parameters, the electron temperature profile width normalized to the plasma minor radius, was not available for most of the 2015 C-Mod database and was therefore excluded from the specific Machine Learning application on C-Mod. The subset of chosen signals, reported in table 1, contains mostly machine-independent and dimensionless parameters, which can therefore enable cross-device analysis and comparisons. Each signal’s description is given together with the associated name of the variable as it appears in the figures and tables of the following sections.
Table 1. List of signals considered for Machine Learning applications on DIII-D and C-Mod.
Signal description                                                                 Variable name
Percent error between measured and programmed plasma current, (Ip − Iprog)/Ip     ip_error_frac
Poloidal beta, βp                                                                  betap
Greenwald density fraction, n/nG                                                   n/nG
Safety factor at 95% of minor radius, q95                                          q95
Normalized internal inductance, ℓi                                                 li
Radiated power fraction, Prad/Pinput                                               prad_frac
Loop voltage, Vloop (V)                                                            Vloop
Stored plasma energy, Wth (J)                                                      Wmhd
n=1 mode amplitude, normalized to Btor                                             n_equal_1_normalized
Electron temperature profile width, normalized to plasma minor radius              Te_width_normalized
  (not available for C-Mod)
We report in table 2 a schematic summary of the number of disrupted and non-disrupted discharges used for Machine Learning applications on DIII-D and C-Mod.
For these initial studies, we have chosen to concentrate only on the flattop phase of discharges and on shots that disrupted during flattop. Furthermore, we did not use the data from disruptions that were caused by hardware (power supply) failures, nor intentionally-triggered disruptions (usually for studies of disruption mitigation). These restrictions leave us with a set of 194 discharges that disrupted during the flattop on DIII-D and 189 discharges that disrupted during flattop on C-Mod. We complemented these with data from the flattop of 1366 non-disruptive DIII-D discharges and 1160 C-Mod discharges, preferentially from the same experimental runs. This will ensure similar operational spaces for our analyses. Thus for each of the 10 selected parameters, we are using a total of ∼253 000 records from the DIII-D database, selected from a total of 1562 discharges, and ∼74 000 records from the C-Mod database, selected from a total of 1349 discharges.
3. Univariate data analysis
Before presenting the results related to the application of Machine Learning techniques, we present a detailed analysis on some of the parameters of interest for both devices.
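As a side illustration of the database construction in section 2, the two-rate sampling scheme (slow samples throughout the shot, fast samples in a short window before the disruption, with the overlapping slow samples removed) can be sketched as follows. This is not the authors' Matlab code: the function name, its arguments, the example disruption times, and the choice to include tD itself as the last fast sample are assumptions made for illustration. Times are kept as integer milliseconds to avoid floating-point drift.

```python
def sampling_times_ms(t_start, t_end, slow_dt, t_disrupt=None,
                      fast_dt=None, fast_window=None):
    """Return the sorted sample times (integer ms) for one discharge.

    Hypothetical helper: slow samples every `slow_dt` ms from `t_start` to
    `t_end`; for a disruptive shot, fast samples every `fast_dt` ms over the
    last `fast_window` ms before `t_disrupt`, with the slow samples inside
    that window removed so the two sets do not overlap.
    """
    slow = list(range(t_start, t_end + 1, slow_dt))
    if t_disrupt is None:
        return slow
    fast_start = t_disrupt - fast_window
    # Assumption: the fast set runs from t_D - window up to and including t_D.
    fast = list(range(fast_start, t_disrupt + 1, fast_dt))
    slow_kept = [t for t in slow if t < fast_start]
    return slow_kept + fast


# DIII-D settings from the text: 25 ms from 100 ms; 2 ms over the last 100 ms.
# The disruption time of 3000 ms is a made-up example value.
d3d = sampling_times_ms(100, 3000, 25, t_disrupt=3000, fast_dt=2, fast_window=100)

# C-Mod settings from the text: 20 ms from 60 ms; 1 ms over the last 20 ms.
# The disruption time of 1000 ms is a made-up example value.
cmod = sampling_times_ms(60, 1000, 20, t_disrupt=1000, fast_dt=1, fast_window=20)

print(len(d3d), len(cmod))
```

Dropping the slow samples inside the fast window keeps the time axis strictly increasing, so each record still corresponds to a unique (shot, time) pair.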
Table 2. Number of DIII-D and C-Mod discharges considered for Machine Learning applications during 2015 campaigns.
          Disrupted   Non-disrupted
DIII-D    194         1366
C-Mod     189         1160
Table 3. Performance metrics for DIII-D and C-Mod binary classification tasks. All the indices are limited between 0 and 1, with 1 representing optimal performances.
          Accuracy   Precision   Recall   F1 score
DIII-D    0.9836     0.8196      0.7205   0.7668
C-Mod     0.9785     0.8146      0.4688   0.5951
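Because the F1 score is the harmonic mean of precision and recall, the F1 column of table 3 can be cross-checked from the precision and recall columns. The short sketch below does this with the reported values; the function and variable names are ours and purely illustrative.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Values as reported in table 3.
table3 = {
    "DIII-D": {"precision": 0.8196, "recall": 0.7205, "f1": 0.7668},
    "C-Mod": {"precision": 0.8146, "recall": 0.4688, "f1": 0.5951},
}

recomputed = {
    device: f1_from_precision_recall(m["precision"], m["recall"])
    for device, m in table3.items()
}

for device, value in recomputed.items():
    print(f"{device}: reported F1 = {table3[device]['f1']:.4f}, "
          f"recomputed F1 = {value:.4f}")
```

The recomputed values agree with the reported ones to within rounding, which also shows how strongly the low C-Mod recall pulls the F1 score down despite a precision similar to that of DIII-D.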
Figure 1. Normalized internal inductance, ℓi, as a function of time before disruption during the flattop phase of all the considered disrupted discharges. To reduce the clutter of the signal’s time traces we used different greyscale colors depending on the initial value of ℓi. For DIII-D (top), starting from approximately −0.35 s or earlier, ℓi starts to increase on a large fraction of discharges; the orange-colored time traces help highlight such behavior. For C-Mod (bottom), a less obvious increase in ℓi starts ∼50–60 ms before the disruption time.
[...] but also in other tokamaks [24–26], and are responsible for most locked mode disruptions.
Statistical analysis of m/n=2/1 modes conducted in [27] on a database including 22 500 discharges on DIII-D showed that more than 18% of disruptions between 2005 and 2014 were due to locked or slowly rotating modes.
For DIII-D, an estimate of the perturbed radial field of non-rotating modes is provided in real-time to the Plasma Control System (PCS) ‘Alarms’ category by the difference pairs of the integrated external saddle loops (ESLDs). These consist of a toroidal array of six external saddle loops, positioned at 60° intervals around the outboard midplane, capable of resolving modes with toroidal number 0 ≤ n ≤ 2. A compensation matrix is used to account for the pickup of the driven non-axisymmetric coils (I-coils and C-coils) and the vacuum coupling. Detailed discussion on the technique adopted for the detection of n=1 modes can be found in [28].
The raw ESLD saddle loop signals are archived for all discharges. However, the real-time n=1 amplitude signal is only available and archived for those discharges that require the PCS ‘Alarms’ category to be enabled during the experiment. For the analyzed 2015 database, the n=1 amplitude indicator was missing on ∼38% of the discharges. For these discharges, we reconstructed an equivalent n=1 amplitude signal by performing the same computation on the ESLDs as done in the PCS. To speed up the analyses of thousands of discharges, we used the recently developed tool TokSearch [29], which automates parallel processing of discharges on multiple nodes.
Figure 2. For (a) DIII-D and (b) Alcator C-Mod discharges, probability histograms of the normalized internal inductance, ℓi, are shown for all non-disruptive discharges (blue) and disruptive discharges, where the latter data is split into times far from (orange) and close to (yellow) the disruption. Note that the thresholds between ‘far from’ and ‘close to’ disruption data are different for C-Mod and DIII-D.
Figure 3. Median of ℓi probability distributions, for DIII-D (red) and Alcator C-Mod (blue) discharges, extracted at different time ranges before the disruption event. On both machines there is an increase in the median value of ℓi before disruptions, but both the warning time and the magnitude of the increase are noticeably smaller for C-Mod. The gray box in the left plot identifies the zoomed-in region in the right plot.
Alcator C-Mod did not have functional saddle loops, so we used the analog-integrated signals from 4 poloidal magnetic field Bp sensors located on the internal surface of the vacuum vessel, near the outboard midplane. These sensors are sensitive also to rotating modes and not only to locked ones, unlike the saddle loops used for the DIII-D analysis. The 4 Bp sensors were all at the same poloidal location, and were toroidally distributed at roughly 90° intervals. (These 4 Bp sensors were a subset of the 104 Bp sensors used for real-time control and for EFIT equilibrium reconstructions.) The Bp signals were compensated for baseline offsets, integrator drifts, and toroidal field pickup. However, the contribution of applied non-axisymmetric magnetic fields from external error field coils has not yet been compensated for. At each sampling time we used a least-squares fit to fit the 4 signals to the series $A_1(t) + A_2(t)\cos(\phi) + A_3(t)\sin(\phi)$, where $\phi$ is the toroidal angle of the sensor. The n=1 mode amplitude at each time is $\sqrt{A_2^2 + A_3^2}$. This amplitude is then normalized by dividing by the toroidal field at each time.
A statistical analysis, similar to the one described above for ℓi, can also be done for the n=1 mode amplitude normalized to the value of the toroidal magnetic field on-axis. Figure 4 (top) and (bottom) show the database parameter n_equal_1_normalized as a function of the time before the disruption event, for all the analyzed flattop disruptions on DIII-D and C-Mod, respectively. There is an obvious difference in the average magnitude of $B_p^{n=1}/B_{\mathrm{tor}}$ between the two machines, with C-Mod values being about 3× higher. Both machines also show a general trend of increasing $B_p^{n=1}/B_{\mathrm{tor}}$ as the disruption time is approached.
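The least-squares reconstruction of the C-Mod n=1 amplitude described above can be sketched with NumPy. This is a minimal illustration, not the authors' implementation: the function name, the array layout, the exact 90° sensor spacing (the text says "roughly 90°") and the synthetic test signal are all assumptions.

```python
import numpy as np


def n1_amplitude_normalized(bp_signals, phi, b_tor):
    """Fit A1 + A2*cos(phi) + A3*sin(phi) at each time; return sqrt(A2^2 + A3^2)/|Btor|.

    bp_signals : array, shape (n_times, n_sensors), compensated Bp signals
    phi        : array, shape (n_sensors,), toroidal sensor angles in radians
    b_tor      : array, shape (n_times,), toroidal field at each time
    """
    bp_signals = np.asarray(bp_signals, dtype=float)
    phi = np.asarray(phi, dtype=float)
    # Design matrix with one row per sensor: [1, cos(phi), sin(phi)].
    design = np.column_stack([np.ones_like(phi), np.cos(phi), np.sin(phi)])
    # Solve all time slices at once: each column of bp_signals.T is one slice.
    coeffs, *_ = np.linalg.lstsq(design, bp_signals.T, rcond=None)
    a2, a3 = coeffs[1], coeffs[2]
    return np.sqrt(a2**2 + a3**2) / np.abs(np.asarray(b_tor, dtype=float))


# Synthetic check (made-up numbers): 4 sensors at exactly 90 degree spacing,
# a constant offset plus an n=1 perturbation of amplitude 2e-3 with phase 0.3 rad.
phi = np.deg2rad([0.0, 90.0, 180.0, 270.0])
n_times = 5
bp = 0.5 + 2.0e-3 * np.cos(phi[None, :] - 0.3) * np.ones((n_times, 1))
b_tor = np.full(n_times, 5.0)

amp = n1_amplitude_normalized(bp, phi, b_tor)
print(amp)
```

With only four sensors and three fit coefficients the fit has a single residual degree of freedom, so any signal content that is not n=0 or n=1 (for example n=2) cannot be separated reliably from noise.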
Figure 4. For DIII-D (top) and C-Mod (bottom), n=1 mode amplitude normalized to the value of the toroidal magnetic field as a function of time before disruption during the flattop phase for the set of disrupted discharges.
Better quantification of this behavior can be seen in the evolving probability distributions displayed in figure 5(a) and (b). A significantly larger fraction of impending disruptions on DIII-D exhibit an increase in $B_p^{n=1}/B_{\mathrm{tor}}$ compared to C-Mod, and the increases tend to occur much earlier before the disruption.
Future work will focus on removing from the reconstructed C-Mod signal the contribution of external error fields as well as the contribution of rotating modes, in order to have a more direct comparison with the DIII-D n=1 mode indicator.
4. Random Forests for disruption prediction
In this paper we present the application of a supervised classification algorithm, called Random Forests [15], to the available DIII-D and Alcator C-Mod databases. We chose to adopt the scikit-learn [30] implementation: this is an open-source Python library for Machine Learning that we used through the OMFIT framework [31].
The Random Forests algorithm is among the most versatile advanced statistical models. The details of this algorithm have already been discussed in a previous paper [18], to which we refer for a detailed discussion of the algorithmic methodology. It belongs to the family of ensemble learners: the forests are grown by developing parallel sets of predictors, thus collecting a large number of independent and identically distributed, de-correlated decision trees [32]. The trees are usually fully grown and the final prediction is aggregated, using majority voting, from a very large number of trees. A representation of a single tree in a forest is given in figure 6. These individual learners (i.e. the trees) can be defined as
hierarchical data structures that are grown through a divide-and-conquer strategy. When dealing with a supervised classification problem, this means that the original input space is recursively partitioned until the almost complete separation of samples belonging to different labels. Starting from a root node, the data is split on the input feature that results in the largest information gain, and this process is recursively repeated until the decision tree ends with nodes as pure as possible. For the purpose of estimating the information gain, several equivalent metrics can be adopted; in particular we adopt the Gini impurity measure. Such measure is differentiable, and hence amenable to numerical optimization. In particular, it is defined as:
$$\mathrm{Gini} = \sum_{k=1}^{K} \hat{p}_{tk}\,(1 - \hat{p}_{tk}), \qquad (1)$$
where $\hat{p}_{tk}$ is the proportion of class k observations in node t. For two classes, if p is the proportion in the positive class, $\mathrm{Gini} = 2p(1 - p)$. It can be seen in figure 6 that each node is associated with a specific impurity measurement given by the Gini index, and a Gini closer to zero indicates that the node contains almost all samples belonging to just one of the two classes.
Random Forests have the great advantage of being a Machine Learning model with low bias and low variance due [...]
Figure 5. For (a) DIII-D and (b) C-Mod, histograms of n_equal_1_normalized during the flattop of non-disruptive discharges (blue distribution), and disruptive discharges, separated into far from (orange) and close to (yellow) disruption datasets.
Figure 6. Graphical depiction of an individual tree in a Random Forest, zoomed in on the first three layers. We can see that in the root node, at the very top, we have all the bootstrapped samples (100%) with a class composition given by [0.96, 0.04], reflecting the [‘non-disruptive’, ‘disruptive’] class populations in the whole DIII-D dataset. Nodes are then branched on the basis of real values of input features; the feature chosen as the best candidate to reduce the impurity measure at each split can be picked from a random subset of the initially available input features. This mechanism strongly reduces the correlation among the grown trees [15]. The blue and brown colors represent the two different classes in the binary classification task. The class populations are reported in each node. The color at each node is assigned depending on the majority class of the samples populating that node.
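As a small worked example of the Gini impurity used for node splitting, the sketch below evaluates the sum of p(1 − p) over classes for the root-node composition [0.96, 0.04] quoted in the figure 6 caption, and computes the impurity decrease of one split. The function names and the child-node compositions are hypothetical, chosen only to illustrate the arithmetic.

```python
def gini_impurity(proportions):
    """Gini impurity of a node given its class proportions (summing to 1)."""
    return sum(p * (1.0 - p) for p in proportions)


def impurity_decrease(parent, left, right, frac_left):
    """Parent impurity minus the sample-weighted impurity of the two children."""
    frac_right = 1.0 - frac_left
    return (gini_impurity(parent)
            - frac_left * gini_impurity(left)
            - frac_right * gini_impurity(right))


# Root node composition quoted in the figure 6 caption.
root = [0.96, 0.04]
root_gini = gini_impurity(root)  # 2 * 0.96 * 0.04

# Hypothetical split: 90% of the samples go left into an almost pure node,
# 10% go right into a node enriched in the 'disruptive' class.
# (0.9 * 0.99 + 0.1 * 0.69 = 0.96, so the children are consistent with the root.)
gain = impurity_decrease(root, left=[0.99, 0.01], right=[0.69, 0.31], frac_left=0.9)

print(round(root_gini, 4), round(gain, 4))
```

A pure node has zero impurity and a balanced two-class node has the maximum value of 0.5, so with only 4% of samples in the 'disruptive' class the root node already starts close to pure.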
Table 4. Relative variable importance for binary classifications on DIII-D and C-Mod. C-Mod ranking is reported for completeness, but given the poor performances of the classifier, the feature importance is not related to the true disruptivity on C-Mod.
DIII-D                               C-Mod
Parameter              Importance    Parameter              Importance
q95                    0.241         ip_error_frac          0.168
n_equal_1_normalized   0.235         n/nG                   0.155
n/nG                   0.170         li                     0.123
li                     0.079         n_equal_1_normalized   0.114
betap                  0.066         q95                    0.106
Wmhd                   0.059         Vloop                  0.096
Te_width_normalized    0.047         Wmhd                   0.083
Vloop                  0.047         betap                  0.081
ip_error_frac          0.034         prad_frac              0.071
prad_frac              0.020
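A ranking of the kind shown in table 4 can be obtained from a fitted scikit-learn `RandomForestClassifier` through its `feature_importances_` attribute. The sketch below is only an illustration on synthetic data: the feature values, the labeling rule and the hyperparameters (`n_estimators=50`, `random_state=0`) are assumptions and do not reproduce the authors' configuration or the tokamak databases.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Names borrowed from table 1 for readability; the data below are synthetic.
feature_names = ["q95", "n_equal_1_normalized", "li", "betap"]

# Synthetic stand-in data (NOT tokamak data): four standard-normal features,
# with the 'disruptive' label (1) driven only by low values of the first one.
X = rng.normal(size=(2000, len(feature_names)))
y = (X[:, 0] < -1.0).astype(int)

forest = RandomForestClassifier(n_estimators=50, criterion="gini", random_state=0)
forest.fit(X, y)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
ranking = sorted(zip(feature_names, forest.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, importance in ranking:
    print(f"{name:22s} {importance:.3f}")
```

Because the importances are normalized to sum to one, they only rank the features relative to each other; as the caption of table 4 notes for C-Mod, a ranking from a poorly performing classifier says little about the underlying physics.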
[...] positive misclassifications. In other words, it represents the ability of the classifier not to label as ‘disruptive’ a sample that belongs to the ‘far from disruption or non-disruptive’ class. Precision needs to be evaluated along with another metric named recall, also called sensitivity or true positive rate: this represents the ratio of ‘disruptive’ class instances that are correctly detected by the classifier, i.e. $\mathrm{TP}/(\mathrm{TP} + \mathrm{FN})$.
It is often convenient to combine precision and recall into a single metric called F1 score, which is obtained from the harmonic mean of precision and recall: $F_1 = 2\,\mathrm{TP}/(2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN})$.
Each of the performance metrics reported in table 3 provides a piece of useful information. It is often instructive to report all of them to understand the differences between the classifiers. From table 3 it is indeed possible to notice at first glance the different performances of the Random Forest classifier on DIII-D and C-Mod data. In particular, the recall index and the F1 score reveal the very poor predictive capabilities of the classifier for C-Mod disruptive samples. The recall shows that a Random Forest trained on C-Mod data is capable of discriminating ‘disruptive’ samples in not even half of the available observations. The F1 score, instead, provides a more global view of the correct detection of the positive class, taking into account the global misclassification error.
As introduced in section 4, Random Forests provide a relative importance ranking for the input features, i.e. plasma signals, used as training set to build the model. The relative rank of a feature reflects the relative importance of that feature with respect to the predictability of the target variable, in our case the discrimination of ‘disruptive’ class labels.
For DIII-D (table 4), these relative importance measures represent a well known correlation between disruptivity, n=1 mode activity, and the safety factor.
Instead, given the poor performance of the classifier on C-Mod data, we cannot rely on the variable ranking reported in table 4. Since the Random Forests algorithm is capable of discriminating ‘disruptive’ labels in less than 50% of the cases, the relative ranking does not reflect any correlation between the input features and the actual disruptivity on C-Mod. It only reflects the relatively most important variables, e.g. ip_error_frac or n/nG, in detecting fewer than half of the disruptive samples.
6. The cross-device analysis
In order to test the portability of the Random Forests predictive algorithm across different devices, we selected from table 1 those variables available for both DIII-D and C-Mod data, trained the Random Forests on one device’s dataset, tested on the other, and vice versa. Results are reported in table 5; we also trained the algorithm using the two different thresholds in time that we adopted on DIII-D and C-Mod for the identification of a ‘disruptive’ phase in disrupted discharges.
Table 5. The table reports on Random Forests performance when trained on one device’s data and tested on the other. Performances are reported in terms of the F1 score. We also tested the algorithm against different thresholds in time for the discrimination of ‘disruptive’ class labels.
Train     Test      Thr (s)   F1 score
C-Mod     DIII-D    0.35      0.2758
C-Mod     DIII-D    0.04      0.1772
DIII-D    C-Mod     0.35      0.1746
DIII-D    C-Mod     0.04      0.0114
In this analysis we used all the available dimensionless signals: we discarded Te_width_normalized because it was not available in the C-Mod dataset, and Vloop and Wmhd because they are not normalized quantities.
As we can see from the F1 score, the algorithm performs rather poorly when trained on DIII-D data and tested on C-Mod and vice versa. Relatively better models can be obtained using C-Mod data as training set and predicting DIII-D sample labels. Nevertheless, the predictive capabilities of all the developed models are rather insufficient. Apart from differences in geometry, configuration or material, the two devices are characterized by intrinsically different timescales for the evolution of the physics processes: the confinement time on DIII-D is of the order of 0.1 s, whereas on C-Mod it is around 0.04 s [38]; the current relaxation time is about 1 s for DIII-D and ∼0.2 s on C-Mod [39, 40]. These differences in timescales are likely one reason for the
poor performance. Further work shall be devoted to improving the methodology and understanding how to reconcile such intrinsic differences.

7. Discussion and conclusions

In this paper we presented a comparative study between DIII-D and C-Mod using Machine Learning algorithms for disruption prediction. The chosen methodology is explained in detail in section 4: Random Forests is a very powerful and versatile Machine Learning technique, allowing the extraction of valuable information from the database, such as the relative importance ranking for the input features.

We chose to adopt the same set of features for the two machines, listed in table 1, with the exception of the electron temperature data, which was not available on a large fraction of 2015 C-Mod data. Both the constructed algorithms show accurate predictions when analyzing ‘far from disruption or non-disruptive’ samples, with a very low percentage of false positive misclassifications. When asked to identify ‘disruptive’ time slices in discharges that eventually disrupt, the classifiers’ performances differ substantially, with the C-Mod case showing percentages of correct classifications actually worse than a random guess. To make this clearer, we show in table 3 the recall index for both DIII-D and C-Mod cases.

The poor performance of disruption prediction in C-Mod may reflect the fact that some C-Mod disruptions are thought to be caused by tiny flecks of molybdenum penetrating into the plasma and very effectively radiating away its thermal energy on a millisecond timescale, due to the high atomic number. Though not reported in this paper, the analysis of the radiated power versus the time before the disruption shows that on a fraction of C-Mod discharges, Prad has a very rapid increase starting a few milliseconds before the disruption event. The molybdenum injections presumably originate from overheated edges/corners of the molybdenum tiles that comprise the plasma facing surface of the divertor. This is not surprising, since the energy density is higher than in all other currently operating tokamaks. Since the B-field is also high, the scrape-off width is smaller than in all other tokamaks, resulting in parallel heat fluxes to the divertor of order 0.5 GW m−2 near the strikepoints. This can lead to the overheating of tile edges/corners.

In contrast, DIII-D has lower thermal energy density and lower B-field, and therefore less heat flux on the divertor. In addition, the plasma facing surface is graphite, not high-Z metal, and graphite in a hot plasma does not radiate as effectively as high-Z metals. It should be noted that ITER will have a high-Z divertor (tungsten), like C-Mod, and its thermal energy density, B-field and parallel heat flux to the divertor will be very similar to those of C-Mod; therefore disruptions on ITER may be more like disruptions on C-Mod than on DIII-D.

All of these differences have to be taken into account when testing the portability of the predictive model across DIII-D and C-Mod. Results from the cross-device analysis have shown the very poor predictive capabilities of the Random Forests algorithm, training on one device’s data and testing on the other’s (and vice versa), even when using dimensionless parameters as input variables.

This does not bode well for the realization of a ‘universal’ classifier, capable of predicting disruptions in ITER with sufficient warning time. Nevertheless, different strategies can still be explored. Different Machine Learning algorithms dealing with data as a full time series could have better results for predicting disruptions, especially on a device characterized by such fast timescales as C-Mod. Furthermore, the incorporation of more dimensionless and device-independent quantities might provide more information for developing an efficient disruption warning algorithm. In terms of the applicability to ITER, the experience on machines with similar characteristics has to be explored to extrapolate all the possible knowledge that can be used to predict disruptive behaviors or assess the predictive algorithm itself. It may be possible that all the efforts developed on currently available data will have limited effect in the realization of a disruption warning algorithm for ITER. Given the gradual transition to full performance, though, an adaptive retraining on initially available low-power and low-current data could be exploited, also including data coming from simulations. Finally, cutting-edge techniques in Artificial Intelligence can prove to be beneficial in learning domain-invariant representations that can be used to adapt the algorithm between different operational regimes and tokamaks.

Acknowledgments

This work was supported by the US Department of Energy under DE-FC02-04ER54698, DE-SC0014264 and DE-FG02-04ER54761.

DIII-D data shown in this paper can be obtained in digital format by following the links at https://fusion.gat.com/global/D3D_DMP.

Disclaimer

This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
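Illustrative note on the recall index mentioned in section 7. The following is a minimal sketch, not the authors' code, of how a recall value for the ‘disruptive’ class can be computed and compared against a coin-flip baseline. The function name `recall`, the toy label vectors and the 0.5 baseline (which assumes a fair coin-flip guess on each sample) are all hypothetical choices made for illustration; none of the numbers come from the paper's tables.

```python
def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive.

    recall = TP / (TP + FN)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive samples in y_true")
    return tp / (tp + fn)


# Toy labels (made up): 1 = 'disruptive' time slice, 0 = 'far from disruption'.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred_a = [1, 1, 1, 0, 0, 0, 0, 0]  # hypothetical classifier A
y_pred_b = [1, 0, 0, 0, 0, 0, 0, 1]  # hypothetical classifier B

recall_a = recall(y_true, y_pred_a)  # 3 of 4 disruptive slices found
recall_b = recall(y_true, y_pred_b)  # 1 of 4 disruptive slices found

# Assumption: a fair coin flip flags each positive sample with probability 0.5,
# so its expected recall on the positive class is 0.5.
COIN_FLIP_RECALL = 0.5
b_is_worse_than_guessing = recall_b < COIN_FLIP_RECALL
```

In this toy setting, classifier B has high accuracy on the ‘far from disruption’ samples yet recovers fewer ‘disruptive’ slices than a coin flip would on average, which is the kind of behavior a recall index makes visible.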