Professional Documents
Culture Documents
Aircraft Taxi Time Prediction Comparisons and Insights
Aircraft Taxi Time Prediction Comparisons and Insights
a r t i c l e i n f o a b s t r a c t
Article history: The predicted growth in air transportation and the ambitious goal of the European Commission to have
Received 28 January 2013 on-time performance of flights within 1 min makes efficient and predictable ground operations at air-
Received in revised form ports indispensable. Accurately predicting taxi times of arrivals and departures serves as an important
13 September 2013
key task for runway sequencing, gate assignment and ground movement itself. This research tests dif-
Accepted 6 October 2013
Available online 23 October 2013
ferent statistical regression approaches and also various regression methods which fall into the realm
of soft computing to more accurately predict taxi times. Historic data from two major European air-
ports is utilised for cross-validation. Detailed comparisons show that a TSK fuzzy rule-based system
Keywords:
Data mining outperformed the other approaches in terms of prediction accuracy. Insights from this approach are then
Fuzzy rule-based system presented, focusing on the analysis of taxi-in times, which is rarely discussed in literature. The aim of this
Regression research is to unleash the power of soft computing methods, in particular fuzzy rule-based systems, for
Airport ground movement taxi time prediction problems. Moreover, we aim to show that, although these methods have only been
Decision support system recently applied to airport problems, they present promising and potential features for such problems.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction and will provide important predictions for all of the parties which
are involved in the handling of aircraft. Furthermore, it will also
The latest vision for air transportation in Europe is predict- be increasingly important to have accurate control of the aircraft
ing marked growth in this sector. The European Commission [1] on the ground using an advanced decision support system [5].
assumes an increase in the global volume of air traffic from 2.5 Idris et al. [6] published the first paper on taxi-out time esti-
billion passengers in 2011 to 16 billion passengers in 2050. Thus, mation based on multiple linear regression, with data from Boston
the number of commercial flights in Europe per year is expected Logan International Airport (BOS). The importance of taking into
to increase from 9.4 million to 25 million during the same time account the current amount of traffic on the surface, when predict-
period. Nevertheless, one of the formulated goals by 2050 is also ing taxi-out times, was highlighted for the first time. The take-off
that on-time performance of flights is within 1 min. queue size was found to be an especially important explanatory
Efficient ground movement operations are key to successful variable for the problem. With the introduction of Collaborative
operations of air transportation networks [2]. Especially since Decision Making (CDM) systems at airports within the last few
airport ground movement operations link runway sequencing years, practitioners at airports realised the need for having more
and gate assignment problems and accurate taxi time predictions accurate taxi times and, driven by that, more researchers have ana-
support various different stakeholders at airports. For example, lysed the problem of taxi time prediction. Several authors have
the benefits for the take-off sequence of having accurate taxi published their results about taxi-out time prediction at US airports
times was shown in [3] and recent developments for Heathrow [4] [7–12]. Balakrishna et al. [7–9,11] used a reinforcement learning
require accurate taxi times both for take-off sequencing and for algorithm which showed good results for data from Detroit Inter-
allocating pushback times to aircraft, at which they should leave national Airport (DTW) and Tampa International Airport (TPA), but
the stands. A significant proportion of the actual travel time can be the results were not very consistent for data from John F. Kennedy
spent at the airport, especially with short-haul flights. To achieve International Airport (JFK). However, this approach cannot provide
the stated on-time performance from the European Commission, the same insights into the problem as some other approaches.
it is crucial to more accurately predict taxi times at European Clewlow et al. [10] highlighted that the number of arrivals does
airports. This will enable runway planning with less uncertainty affect the taxi-out times, which was not sufficiently taken into
account before that. Their multiple regression approach was based
on John F. Kennedy International Airport and Boston Logan Inter-
∗ Corresponding author. national Airport. Jordan et al. [13] developed a sequential forward
E-mail address: smr@cs.nott.ac.uk (S. Ravizza). floating subset selection method with the aim of selecting the most
1568-4946/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.asoc.2013.10.004
398 S. Ravizza et al. / Applied Soft Computing 14 (2014) 397–406
set threshold. A value C is normally defined to weight the trade-off for the problem of estimating aircraft taxi times, and preliminary
between the two objective functions. The dual formulation of this results, can be found in the work by Chen et al. [16].
optimisation model is often solved by preference. Support vector Some of the key features of Mamdani FRBSs highlighted in [16]
regression can be extended to non-linear models by incorporat- are (1) the ability to approximate complex non-linear systems, (2)
ing a kernel function which transforms the original training data the ability for rules to differ in different regions, (3) the ability
into a higher dimensional space. The best kernel found for this par- to integrate human expertise, and (4) the ability to interpret the
ticular problem and these datasets turned out to be a normalised underlying system.
polynomial kernel. A tutorial for this approach can be found in [24].
3.6. TSK fuzzy rule-based systems
3.4. M5 model trees
Another form of fuzzy inference system, originally proposed by
Another way of predicting numeric values is by using decision Takagi and Sugeno [27], has fuzzy sets involved only in the premise
trees which store linear regression models on their leaves. Such part. By using Takagi and Sugeno’s fuzzy inference scheme (TSK),
trees are called model trees and are similar to piecewise linear one can describe the fuzzy if-then rules as follows:
functions for the entire model. Model trees are usually smaller and
more accurate than regression trees which have only an average j
Ri : If x1 is A1i and x2 is A2i , . . ., and xj is Ai
value on their leaves [25]. The tree is constructed by the divide-
and-conquer method, where the splitting criterion determines the Then yi = gi (x1 , x2 , . . ., xj ). (3)
best explanatory variables to split on, based on the expected error
reduction. The splitting process finishes when the standard devia- Ri denotes the ith rule to be considered. gi () is any function, and
tion of the subset of the training data is below a certain threshold could, for example, be linear or quadratic. Normally, using a linear
or the size of this subset is too small. Afterwards, a linear regression function for gi () is enough, since the fuzzy if-then rule has already
model is calculated for each leaf. Pruning can be applied in a sec- embedded non-linearity inherently. When a linear model structure
ond stage: all non-leaves are tested for whether it is better to keep is assumed then a rule base with k rules takes the following format:
the subtree or whether a linear model could replace the subtree. j
R1 : If x1 is A11 and x2 is A21 , . . ., and xj is A1
j
Then y1 = b01 + b11 · x1 + . . . + b1 · xj
Additionally, a smoothing stage can be added to reduce the dis-
... (4)
continuities between the linear models for different leaves. More
j j
details of the M5 model tree can be found in the paper by Quinlan Rk : If x1 is A1k and x2 is A2k , . . ., and xj is Ak Then yk = b0k + b1k · x1 + . . . + bk · xj .
[25].
The crisp output from the input (x1 , x2 , . . ., xj ) is obtained as the
weighted sum of the consequences of the k rules:
3.5. Mamdani fuzzy rule-based systems
k
k
j
Fuzzy Rule-Based Systems (FRBSs) are a way of modeling pro- y= ˇi · yi = ˇi · (b0i + b1i · x1 + . . . + bi · xj ), (5)
cesses which have the ability to be interpreted with linguistic i=1 i=1
statements. First introduced by Zadeh [26], they give the possi-
where ˇi actually represents the certainty of each rule contributed
bility of combining human expertise together with mathematical
by the premise of the corresponding rule:
models. The input is first mapped with a fuzzification interface then
decisions can be made, before these are mapped to a single crisp A1 (x1 ) ∗ A2 (x2 ) ∗ . . . ∗ j (xj )
i i A
i
output using a defuzzification interface. ˇi = k . (6)
The concept behind a modified and enhanced Mamdani FRBSs (x ) ∗ A2 (x2 ) ∗ . . . ∗ j (xj )
l=1 A1 1
l l A
l
is introduced first, before discussing another concept based on a
different type of FRBS in the next section. The general ‘rule-base’ The membership function Al (xl ) of the premise part is again a
i
of an FRBS has the following form, with a number of fuzzy if-then Gaussian function as in Eq. (2).
rules Ri : The above presented Mamdani FRBS utilised a combined k-
j means algorithm [16] and genetic algorithm [28], to automatically
Ri : If x1 is A1i and x2 is A2i , . . ., and xj is Ai Then yi = Zi . (1) identify the initial values of the parameters both in the premise
and consequent parts. The same method is used to determine the
The values xl for all l = 1, . . ., j are the explanatory variables and the Ali
parameters associated with the premise part for the TSK FRBS. The
are the ith linguistic values (fuzzy sets) which are each represented
least square approach is then used to determine the initial values
by a membership function Al (xl ). They take the form of Gaussian j
i of b0i , b1i , . . ., bi for all i = 1, . . ., k for the consequent part. To further
functions of the following form in this work in order to give smooth refine the initial fuzzy system which is obtained, a genetic algo-
predictions and to ensure its partial differentiability: rithm, namely G3PCX [29], is incorporated into TSK to fine-tune the
2
premise part, followed by a least square approach to obtain the con-
l
1 (xl − ci ) sequent part. This process continues iteratively until a pre-specified
Al (xl ) = exp − · 2
, (2)
i 2 (il ) condition is met in order to reach a more accurate fuzzy system.
G3PCX is a real-parameter genetic algorithm using a parent-centric
where cil denotes the centre of the bell-shape curve and il denotes recombination operator (PCX) and an elite-preserving, computa-
the standard deviation. The consequence part Zi is a fuzzy set for a tionally fast evolutionary model (G3).
Mamdani FRBS, which is here modelled as a bell-shaped member- As mentioned in [16], in comparison to the Mamdani FRBS, the
ship function. The utilisation of a bell-shape membership function following distinctive features associated with the TSK FRBS can be
ensures again the function’s partial differentiability. The chosen identified:
membership types along with the centre of gravity defuzzifica-
tion ensure that gradient based optimization method, such as the • The TSK FRBS could in some ways be viewed as an extension of
Back-Error-Propagation technique, can be utilised to fine tune the multiple linear regression. Each rule in the rule base resembles a
obtained Mamdani FRBS. More details of the approach, the tuning multiple linear regression model for a decomposed explanatory
S. Ravizza et al. / Applied Soft Computing 14 (2014) 397–406 401
variable region. Hence, the explanatory ability associated with 4.2.2. Mean-absolute error
multiple linear regression automatically applies to the TSK FRBS. A second measure, which is also in the same dimension as the
• These rules work cooperatively to produce estimations, which predicted value, is the mean-absolute error (MAE). This perfor-
may result in more accurate estimations. mance measure averages the individual errors by neglecting their
• Although one could lose certain linguistic meanings in the con- sign. It is defined as follows:
sequent part in comparison to the Mamdani FRBS, due to the
|y1 − ŷ1 | + . . . + |yn − ŷn |
function form of the TSK consequent part, such a form should MAE = . (8)
n
be able to approximate the sub region more accurately than a
fuzzy set. 4.2.3. Root relative-squared error
In the root relative-squared error (RRSE) the total squared errors
4. Comparisons and insights are divided by the total squared errors when using the simplest
prediction model (which just outputs the average value y) as can
This section analyses the different regression approaches for be seen below:
predicting taxi times at airports and shows comparisons between
2 2
approaches and insights from the best performing approach. First, it (y1 − ŷ1 ) + . . . + (yn − ŷn )
RRSE = . (9)
is necessary to specify the experimental setup and the performance 2
(y1 − y) + . . . + (yn − y)
2
measures used.
4.2.4. Relative-absolute error
4.1. Experiment setup The relative-absolute error (RAE) is the total absolute error,
which is normalised as in the root relative-squared error by using
All of the experiments were executed on a standard desktop PC a simple average predictor:
(Intel Core 2 Duo, 3 GHz, 2 GB RAM). WEKA was used to perform
|y1 − ŷ1 | + . . . + |yn − ŷn |
all of the experiments apart from the last two models, which are RAE = . (10)
related to the fuzzy rule-based systems [20,21]. Analysis with the |y1 − y| + . . . + |yn − y|
support vector regression method identified that the best param-
eter for these experiments was to use the value C equal to 2 and 4.2.5. Coefficient of determination
to employ a normalised polynomial kernel with exponent 3. Both A commonly used performance measure related to linear regres-
of the fuzzy rule-based systems were implemented and tested in sion is the coefficient of determination R2 . It can be determined
MATLAB R2010a. Mamdani FRBS was based on 12 rules, as pub- from the root relative-squared error and takes values between 0
lished in [16]. TSK FRBS was analysed in detail and preliminary and 1 for linear regression models, with values closer to 1 indicating
results showed that 4 rules for Stockholm-Arlanda Airport and a better fit. R2 is defined as follows:
8 rules for Zurich Airport were the most promising settings. All 2 2
(y1 − ŷ1 ) + . . . + (yn − ŷn )
experiments were based on 10-fold cross-validation if not other- R2 = 1 − 2 2
. (11)
(y1 − y) + . . . + (yn − y)
wise stated. This is suggested to be the recommended setting and
leads to relative low bias and variance [30,31]. Furthermore, 15 Sometimes an adjusted coefficient of determination is used, which
repetitions were done for each experiment, as recommended by penalises models with many explanatory variables. In this study,
Nadeau and Bengio [31]. They also recommend using the corrected the explanatory variables are fixed and the size of the datasets is
resample t-test to test whether the difference between the two pre- much larger than the number of explanatory variables, making such
diction models is significant. The corrected resample t-test should a correction term unnecessary.
be preferred over a normal paired t-test, because the suggested
test adjusts the variance in relation to the overlaps between sub- 4.2.6. Prediction accuracy
sets of the data [19]. To compare the different models the same The last set of performance measures is used to show practi-
seeds were used to generate the subsets for cross-validation for tioners the accuracy of the models and is of a form that they will be
the different repetitions for both utilised software packages. The familiar with. The percentage of the prediction accuracy measure
significance level ˛ was set to 0.05. indicates what percentage of the flights in the dataset are predicted
within ± 1, 2, 3, 5 or 10 min.
4.2. Performance measures
4.3. Visual comparisons
This research aims to compare the different prediction methods
using various performance measures to provide as much insight as Fig. 2 shows the predication accuracy of the 6 different regres-
possible and to enable further comparisons. The utilised measures sion approaches for Zurich Airport. The x-axis represents the
are discussed below. aircraft, which were sorted from underestimated to overestimated
taxi times within each approach. The analysis is based on 15 rep-
etitions of 10-fold cross-validation. The range on the y-axis is only
4.2.1. Root mean-squared error
shown within the interval of ± 5 min. To halve the height of the
The root mean-squared error (RMSE) is a very commonly used
graph and make the differences more obvious, the absolute error is
measure and was preferred over the mean-squared error in this
shown, so the left hand side shows under predictions and the right
paper since it gives values in the same dimension as the predicted
hand side over predictions. The solid black line visualises the multi-
values. The formula is
ple linear regression approach (LinReg) which is used as a baseline
2 2 analysis. It is clear that least median square linear regression (LMS)
(y1 − ŷ1 ) + . . . + (yn − ŷn )
RMSE = , (7) performs poorly for predictions which underestimate the actual
n
taxi time. Support vector regression (SMOreg), Mamdani FRBS and
where yi is the actual taxi time of aircraft i and ŷi is the corre- TSK FRBS seem to perform the best, but it is hard to distinguish
sponding predicted taxi time. The value n represents the number clearly based on this figure, so a numerical comparison will now be
of aircraft in the dataset. presented.
402 S. Ravizza et al. / Applied Soft Computing 14 (2014) 397–406
5
Multiple Linear Regression
Least Median Squared Linear Regression
Support Vector Regression
Absolute value of difference between predicted taxi times and true taxi times
4.5
M5 Model Trees
Mamdani Fuzzy Rule−Based Systems
TSK Fuzzy Rule−Based Systems
4
3.5
Underestimated taxi times Overestimated taxi times
3
[minutes]
2.5
1.5
0.5
0
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
Sorted Aircraft
Table 2 shows the first four performance measures for both air- As the comparisons in the last section showed, TSK FRBS seems
port datasets. Bold numbers highlight the best (smallest) result for to perform the best for the analysed prediction problem. Therefore,
each performance measure at each airport. The newly introduced this section focuses on the results of that particular approach. Ear-
TSK FRBS outperforms the other approaches in almost all cases. lier results and analysis for multiple linear regression and Mamdani
Only in the case of Zurich Airport does support vector regression FRBS can be found in the papers by Ravizza et al. [15] and by Chen
have the same result for the mean-absolute error and is slightly bet- et al. [16], respectively.
ter in terms of the relative-absolute error. Tests with the corrected Fig. 3 illustrates the membership functions of two explanatory
resample t-test showed that there is always a significant improve- variables for the TSK FRBS at Stockholm-Arlanda Airport. The first
ment between the multiple linear regression approach and TSK figure is related to the indication of whether an aircraft is arriving
FRBS at Zurich Airport. For this dataset, TSK FRBS also significantly or departing and the second figure shows the final model related to
outperformed least median square linear regression and, apart the logarithmic transformation of the total taxi distance. Input val-
from the relative-absolute error, also outperform the M5 model ues are scaled to the range [−1,1] using a linear scaling from their
trees. Although the numeric results are better for the TSK FRBS, it range (see Table 1). The other explanatory variables had less distinct
only significantly outperformed the Mamdani approach in terms of membership functions for the different rules and were omitted due
the root mean-square error and the root relative-squared error and to space limitations. This model is based upon using the approach
did not outperform the support vector regression. The results for with the entire dataset as training data. The four rules are repre-
Stockholm-Arlanda Airport are very similar, but fewer tests identify sented with different lines, showing that rules 1 and 2 are more
significant differences. The best values found for the coefficient of focusing on departures and rules 3 and 4 on arrivals. Furthermore,
determination R2 were 80.85% and 93.25% for Stockholm-Arlanda the rules cover different total taxi distances starting with rule 3 for
Airport and Zurich Airport, respectively, and were both obtained smaller distances, following by rules 2, 4 and 1. The consequence
using the TSK FRBS. part of the four rules can be seen in Table 4. Each rule has the form
The prediction accuracy within ±1, 2, 3, 5 and 10 min can be of a multiple linear regression approach with the coefficients stated
found in Table 3. Again, TSK FRBS outperformed, in most cases, the in the table.
other regression approaches in both datasets. Mamdani FRBS also TSK FRBS has the ability to model non-linear relationships,
did very well in comparison to the others. Our findings are in line which gives such a method an advantage over multiple linear
with the analysis by Wu et al. [32] where TSK FRBSs with fewer regression, least median squared linear regression or M5 model
rules were more successful than Mamdani FRBSs. Although sup- trees, where automatic transformations of linear functions with
port vector regression often seemed a good alternative, the ±2 and a polynomial transformation are only possible to a certain extent
3 min accuracy in the case of Stockholm-Arlanda Airport reported [33]. Fig. 4 shows how the functions look for the TSK FRBS and
the worst values. multiple linear regression on the example of the total taxi distance
S. Ravizza et al. / Applied Soft Computing 14 (2014) 397–406 403
Table 2
Comparisons of performance measures for Stockholm-Arlanda Airport and Zurich Airport.
Root mean-squared error ARN 1.52 1.57 1.50 1.51 1.46 1.44
ZRH 1.47 1.60 1.32 1.36 1.33 1.30
Mean-absolute error ARN 1.14 1.14 1.09 1.13 1.07 1.06
ZRH 1.08 1.10 0.96 0.99 0.97 0.96
Root relative-squared error ARN 45.70% 47.47% 45.27% 45.54% 44.19% 43.53%
ZRH 29.29% 31.76% 26.28% 27.00% 26.41% 25.89%
Relative-absolute error ARN 45.80% 45.92% 43.98% 45.61% 43.28% 42.83%
ZRH 26.85% 27.52% 23.87% 24.55% 24.30% 23.93%
Table 3
Comparisons of accuracies for Stockholm-Arlanda Airport and Zurich Airport.
Accuracy within ± 1 min ARN 54.28% 56.02% 57.86% 54.61% 58.21% 58.80%
ZRH 58.38% 59.66% 64.05% 62.49% 62.97% 63.33%
Accuracy within ± 2 min ARN 85.30% 85.18% 84.91% 85.19% 86.73% 86.81%
ZRH 86.12% 85.99% 88.98% 88.15% 88.55% 89.07%
Accuracy within ± 3 min ARN 95.40% 94.80% 94.32% 95.43% 95.72% 96.16%
ZRH 95.55% 94.26% 96.60% 96.46% 96.54% 96.89%
Accuracy within ± 5 min ARN 99.16% 98.81% 99.16% 99.18% 98.97% 99.08%
ZRH 99.21% 98.56% 99.45% 99.46% 99.53% 99.62%
Accuracy within ± 10 min ARN 99.92% 99.92% 99.92% 99.92% 99.92% 99.92%
ZRH 99.92% 99.87% 99.97% 99.97% 99.98% 99.97%
1 1
0.8 0.8
Membership Degree
Membership Degree
0.6 0.6
0.4 0.4
0.2 0.2
0 0
−1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1
Departure vs. Arrival (scaled) Scaled log( Distance )
Fig. 3. Four fuzzy rules extracted from the TSK FRBS analysis for Stockholm-Arlanda Airport.
404 S. Ravizza et al. / Applied Soft Computing 14 (2014) 397–406
8.5
12
8
11
7.5
9
7
8
Predicted Taxi Time [min]
6.5
7
6 6
5
5.5
4
5 3
10
8
4.5 6
4 6000
5000
Amount of Traffic 4000
Departure with TSK FRBS 2 3000
4 Arrival with TSK FRBS 2000
0 1000
Departure with Multiple Linear Regression 0 Distance [m]
Arrival with Multiple Linear Regression
3.5
0 1000 2000 3000 4000 5000 6000 Fig. 5. Analysis of predicted taxi-in times at Stockholm-Arlanda Airport with the
Distance [m] TSK FRBS.
1 1
Membership Degree
Membership Degree
0.8 0.8
0.6 0.6
0.4 0.4
0.2 0.2
0 0
−1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1
Departure vs. Arrival (scaled) Scaled log( Distance )
1 1
Membership Degree
Membership Degree
0.8 0.8
0.6 0.6
0.4 0.4
0.2 0.2
0 0
−1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1
Scaled log( Angle ) Mode 1 (scaled)
1 1
Membership Degree
Membership Degree
0.8 0.8
0.6
0.6
0.4
0.4
0.2
0.2 0
−1 −0.5 0 0.5 1 −1 −0.5 0 0.5 1
Mode 3 (scaled) Mode 5 (scaled)
Fig. 6. Eight fuzzy rules extracted from the TSK FRBS analysis for Zurich Airport.
S. Ravizza et al. / Applied Soft Computing 14 (2014) 397–406 405
Table 5
Consequence part of the TSK FRBS analysis for Zurich Airport.
of OpenAccess Series in Informatics (OASIcs). ISBN 978-3-939897-33-0; pp. [26] L. Zadeh, Fuzzy sets, Information and Control 8 (3) (1965) 338–353,
134–145. doi:10.4230/OASIcs.ATMOS.2011.134. http://dx.doi.org/10.1016/S0019-9958(65)90241-X.
[17] J.A.D. Atkin, E.K. Burke, J.S. Greenwood, D. Reeson, Hybrid metaheuristics to aid [27] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to
runway scheduling at London Heathrow Airport, Transportation Science 41 (1) modeling and control, IEEE Transactions on Systems, Man and Cybernetics 15
(2007) 90–106, http://dx.doi.org/10.1287/trsc.1060.0163. (1) (1985) 116–132.
[18] U. Dorndorf, A. Drexl, Y. Nikulin, E. Pesch, Flight gate scheduling: state- [28] J. Chen, Biological inspired optimisation algorithms for transparent knowledge
of-the-art and recent developments, Omega 35 (3) (2007) 326–334, extraction allied to engineering materials process, Department of Automatic
http://dx.doi.org/10.1016/j.omega.2005.07.001. Control and Systems Engineering, The University of Sheffield, 2009 (Ph.D. the-
[19] J. Demšar, Statistical comparisons of classifiers over multiple data sets, Journal sis).
of Machine Learning Research 7 (2006) 1–30. [29] K. Deb, A. Anand, D. Joshi, A computationally efficient evolutionary algo-
[20] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, I.H. Witten, The WEKA rithm for real-parameter optimization, Evolutionary computation 10 (4) (2002)
data mining software: an update, ACM SIGKDD Explorations Newsletter 11 (1) 371–395, http://dx.doi.org/10.1162/106365602760972767.
(2009) 10–18, http://dx.doi.org/10.1145/1656274.1656278. [30] J. Han, M. Kamber, J. Pei, Data Mining: Concepts and Techniques. The Morgan
[21] I.H. Witten, E. Frank, M.A. Hall, Data Mining: Practical Machine Learning Tools Kaufmann Series in Data Management Systems, 3rd ed., Elsevier Science, 2011,
and Techniques. The Morgan Kaufmann Series in Data Management Systems, ISBN: 9780123814791.
3rd ed., Elsevier Science, 2011, ISBN: 9780123748560. [31] C. Nadeau, Y. Bengio, Inference for the generalization error, Machine Learning
[22] D.C. Montgomery, E.A. Peck, G.G. Vining, Introduction to Linear Regression 52 (2003) 239–281, http://dx.doi.org/10.1023/A:1024068626366.
Analysis, 3rd ed., John Wiley & Sons, Inc, 2001. [32] Y. Wu, B. Zhang, J. Lu, K.L. Du, Fuzzy logic and neuro-fuzzy systems: a systematic
[23] P. Rousseeuw, A. Leroy, Robust Regression and Outlier Detection. Wiley Series introduction, International Journal of Artificial Intelligence and Expert Systems
in Probability and Mathematical Statistics: Applied Probability and Statistics, 2 (2) (2011) 47–80.
Wiley, 1987, ISBN: 9780471852339. [33] G.E.P. Box, D.R. Cox, An analysis of transformations, Journal of the Royal Statis-
[24] A.J. Smola, B. Schölkopf, A tutorial on support vector regression, Statis- tical Society Series B (Methodological) 26 (2) (1964) 211–252.
tics and Computing 14 (2004) 199–222, http://dx.doi.org/10.1023/B:STCO. [34] Eurocontrol. Airport CDM implementation manual. Tech. Rep. version 4; Euro-
0000035301.49549.88. control; 2012.
[25] J.R. Quinlan, Learning with continuous classes, in: Proceedings of the 5th Aus-
tralian Joint Conference on Artificial Intelligence, 1992.