PII: S0950-7051(14)00460-2
DOI: http://dx.doi.org/10.1016/j.knosys.2014.12.022
Reference: KNOSYS 3031
Please cite this article as: M-Y. Cheng, N-D. Hoang, A Swarm-Optimized Fuzzy Instance-Based Learning approach
for predicting slope collapses in mountain roads, Knowledge-Based Systems (2014),
doi: http://dx.doi.org/10.1016/j.knosys.2014.12.022
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers
we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and
review of the resulting proof before it is published in its final form. Please note that during the production process
errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
TITLE PAGE
A Swarm-Optimized Fuzzy Instance-Based Learning Approach for Predicting Slope Collapses in
Mountain Roads
Position: Professor
Affiliation: Department of Civil and Construction Engineering, National Taiwan University of Science and
Technology
Email: myc@mail.ntust.edu.tw
Position: Lecturer
Affiliation: Institute of Research and Development, Faculty of Civil Engineering, Duy Tan University
*Corresponding author
Abstract. Due to the disastrous consequences of slope failures, forecasting their occurrences is a
practical need of government agencies to develop strategic disaster prevention programs. This
predicting slope collapses. The proposed model utilizes the Fuzzy k-Nearest Neighbor (FKNN)
determine the model’s hyper-parameters appropriately, the Firefly Algorithm (FA) is employed
as an optimization technique. Experimental results show that the newly established
SOFIL outperforms the other benchmark algorithms. The proposed model is therefore a
promising tool to help decision-makers cope with the slope collapse prediction problem.
1. Background
Roads are the major transportation infrastructure that supports modern society; many researchers
have found significant positive correlations between roads and economic growth at both local
and regional levels [1, 2]. Accordingly, in various countries around the world, extensive
networks of mountain roads have recently been built to catch up with the population expansion
possibly occur in many sections of the road network. These catastrophic events are often
triggered by earthquakes or heavy rainfalls during typhoons or monsoon storms [5, 6]. Slope
collapses are highly undesirable since they damage man-made structures, disrupt
traffic, and cause irrecoverable loss of human life. Hence, slope stability assessment is an essential
task that should be regularly conducted by roadway maintenance authorities [7, 8]. The
analysis results can be used to identify collapse-prone areas and to allocate scarce
resources when establishing an overall disaster prevention program [9]. Physical models,
expert evaluation, and machine learning are the three common methods for analyzing slope
stability [10].
The physical model method is based on a slope displacement model, which analyzes
slope stability by identifying the most dangerous sliding surface and calculating the factor of
safety [11]. Although this approach can deliver accurate analytical results, it requires input
parameters for every calculation point of the investigated area. Therefore, the physical model
method is only appropriate for evaluating stability in small areas, and its capacity for analysis
The expert evaluation approach utilizes expert judgments and information on slope collapse
events that occurred in the past [8, 13]. Using expert knowledge, the main influencing features and
possible triggering factors can be identified [14]. Based on that information, experts then
evaluate the stability of a slope. The main drawbacks of this method are its reliance on many
subjective judgments and the inconsistency of its prediction results.
Recently, machine learning approaches have been utilized to automate the slope assessment
process because of their greater flexibility and predictive capability compared to the traditional
approaches. Generally, machine learning based models are established by combining artificial
intelligence (AI) techniques and historical databases [15]. Using these models, slope
evaluation can be treated as a classification task in which prediction outputs are either “stable”
or “unstable”.
Lu and Rosenbaum [16], Zhou and Chen [17], Jiang [18], Das et al. [7], Cho [19], Lee et al.
[20], and Wang et al. [21] applied the Artificial Neural Network (ANN) to predict the slope
condition. Zhao et al. [22] employed the Relevance Vector Machine (RVM) to explore the
nonlinear relationship between slope stability and its influence factors. Slope stability forecasting
models based on the Support Vector Machine (SVM) were developed by Li and Wang [23],
Cheng et al. [24], Zhao [25], Samui [26], and Li and Dong [27]; these studies found that SVM
based models are very effective under the condition of limited data.
Although the ANN has been extensively applied for predicting slope collapse, the
implementation of this approach has several drawbacks. The major disadvantage of the ANN is
that its training process is achieved through a gradient descent algorithm on the error space,
which can be very complex and may contain many local minima [28]. Moreover, the SVM
this means that training on large data sets is computationally expensive [29].
Most importantly, the black-box nature of the ANN, SVM, and RVM algorithms makes it
difficult for practicing engineers or government agencies to comprehend how they predict slope
collapses.
Different from the aforementioned AI methods, the Fuzzy k-Nearest Neighbor (FKNN)
algorithm [30] belongs to the class of instance-based learning. This algorithm utilizes the whole
collected data to establish its memory. An FKNN classifier utilizes the information obtained from
the k nearest neighbors of a sample vector and assigns class memberships to it. The vector’s
Moreover, the algorithm also assigns fuzzy memberships as a function of the vector’s distance
from its k nearest neighbors and those neighbors’ memberships in the possible classes [31].
This approach is simple to implement, and its classification outcomes are
easily interpretable. In addition, the competitive prediction performance of the FKNN has been
demonstrated in various studies [30, 32-34]. Nevertheless, no previous work has evaluated
Additionally, the implementation of the FKNN requires a proper setting of two tuning
parameters: the neighboring size (k) and the fuzzy strength (m). Furthermore, this parameter
been illustrated to be feasible to tackle the optimization problem at hand [35-39]. Recently
developed by Yang [40], the Firefly Algorithm (FA) is a fast and effective meta-heuristic for
have demonstrated the superior performance of the FA over other meta-heuristic methods [41-
43]. Nonetheless, few research works have investigated the capability of this algorithm in
optimizing the parameter selection process of the FKNN. Thus, this study proposes to hybridize
the FKNN with FA [40] to automatically search for appropriate hyper-parameters of the
prediction model.
Thus, this research employs the FKNN classifier as the machine learning technique to
construct a prediction model for slope collapse assessment, with the FA [40] searching
automatically for an appropriate combination of tuning parameters. The newly established
approach is named Swarm-Optimized Fuzzy Instance-based Learning (SOFIL). The remainder of
this paper is organized as follows. The second section presents the research methodology. The
third section describes the framework of the proposed SOFIL. The fourth section reports the
experimental results. The final section states the conclusions of the study.
2. Methodology
The FKNN algorithm is an instance-based classifier that incorporates fuzzy set theory
into the classification process [30]. In the FKNN, each sample is assigned fuzzy memberships
in the different classes, and the class with the maximum membership degree is chosen
as the winner. The first step of the FKNN algorithm is to calculate the fuzzy partition matrix U =
[uij] from the memory, which stores a set of n training sample vectors [x1,…,xn]. Herein, j
denotes the vector index (j = 1, 2, …, n), where n is the number of training samples, and
i denotes the class index (i = 1, 2, …, C), where C is the number of classes. For each
training case x, we identify its k nearest neighbors by calculating Euclidean distances. The
where ni is the number of neighbors that belong to class i and c(xj) represents the
class label of the sample vector xj. Each uij is an element of the C-by-n matrix U.
The purpose of Eq. (1) is to assign higher fuzzy
membership grades to the training samples that lie far from the decision boundary and lower
fuzzy membership grades to the patterns that lie in the vicinity of the decision boundary [30]. This
is because the information supplied by the samples in the region close to the decision surface is
Since uij is a fuzzy membership grade of the sample xj in the class i, uij must satisfy the
following properties:
$$\sum_{i=1}^{C} u_{ij} = 1 \tag{3}$$

$$0 < \sum_{j=1}^{n} u_{ij} < n \tag{4}$$
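As an illustration of this first step, the following Python sketch builds the fuzzy partition matrix U for a small invented data set. Because Eq. (1) is not reproduced in this copy of the manuscript, the sketch assumes the initialization rule commonly attributed to Keller et al. [30] (0.51 + 0.49 ni/k for a sample's own class and 0.49 ni/k otherwise). The function name, the toy data, and the exclusion of each sample from its own neighbor list are assumptions made here, not details taken from the paper.

```python
import math
from collections import Counter


def init_memberships(X, labels, classes, k):
    """Return U as a dict: U[c][j] is the membership of training sample j in class c."""
    n = len(X)
    U = {c: [0.0] * n for c in classes}
    for j in range(n):
        # k nearest neighbours of x_j among the OTHER training samples (Euclidean distance).
        others = sorted((i for i in range(n) if i != j),
                        key=lambda i: math.dist(X[i], X[j]))
        counts = Counter(labels[i] for i in others[:k])
        for c in classes:
            share = 0.49 * counts[c] / k
            # Assumed Keller-style rule: the sample's own class receives an extra 0.51.
            U[c][j] = 0.51 + share if c == labels[j] else share
    return U


# Hypothetical toy data: label 1 = failed slope, -1 = stable slope.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (0.4, 0.45)]
labels = [-1, -1, -1, 1, 1, 1]
U = init_memberships(X, labels, classes=(-1, 1), k=3)
```

Under this rule every column of U sums to 1, in line with property (3). The last sample, which carries label 1 but sits closer to the cluster of the other class, keeps only the minimum own-class grade of 0.51.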
The second step of the FKNN approach is to assign fuzzy memberships of the unknown
$$u_i(x) = \frac{\sum_{j=1}^{k} u_{ij} \left( 1 / \|x - x_j\|^{2/(m-1)} \right)}{\sum_{j=1}^{k} \left( 1 / \|x - x_j\|^{2/(m-1)} \right)} \tag{5}$$
where i = 1, 2, …, C and j = 1, 2, …, k; j indexes the jth sample vector among the k nearest
neighbors of x. C is the number of classes; k denotes the neighboring size. The fuzzy strength m
determines how heavily the distance is weighted when computing each neighbor’s
contribution to the membership value. ||x − xj|| represents the distance between x and its jth
nearest neighbor xj. In this study, the Euclidean metric is used as the distance measure. uij
denotes the membership degree of the sample vector xj in the class i and is computed in the first
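The weighting in Eq. (5) can be sketched in Python as follows. This is an illustrative reading of the formula rather than the authors' implementation: the function name, the toy data, and the small constant eps that protects against zero distances are assumptions. Each weight is divided by the largest one, which leaves ui(x) unchanged but keeps the computation stable when m is close to 1 and the exponent 2/(m−1) becomes very large.

```python
import math


def fknn_memberships(x, X, U, k, m):
    """Eq. (5): memberships of query x, given training vectors X and stored grades U[c][j]."""
    # (distance, index) pairs of the k training samples closest to x.
    dists = sorted((math.dist(x, xj), j) for j, xj in enumerate(X))[:k]
    eps = 1e-12  # assumption: avoids division by zero for exact duplicates of x
    d_min = max(dists[0][0], eps)
    exponent = 2.0 / (m - 1.0)
    # (d_min / d_j)^(2/(m-1)) is proportional to 1 / d_j^(2/(m-1)); the common
    # factor cancels between numerator and denominator of Eq. (5).
    weights = [(d_min / max(d, eps)) ** exponent for d, _ in dists]
    total = sum(weights)
    return {c: sum(w * U[c][j] for w, (_, j) in zip(weights, dists)) / total
            for c in U}


# Hypothetical memory with crisp stored memberships for two classes.
X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 0.8)]
U = {"stable": [1.0, 1.0, 0.0, 0.0], "unstable": [0.0, 0.0, 1.0, 1.0]}
u = fknn_memberships((0.1, 0.1), X, U, k=3, m=2.0)
```

A larger m flattens the weights so that all k neighbors contribute more evenly, whereas m close to 1 lets the single nearest neighbor dominate.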
desirable performance of the prediction model [29]. Thus, in this study, we utilize the FA as a
means for tuning the FKNN parameters. The description of the FA algorithm is provided in the
communication behavior of fireflies [44]. In the natural world, a firefly is attracted to brighter
ones as it randomly explores the habitat. Based on that phenomenon in nature, the FA is
formulated as a global optimization method in which the brightness of fireflies characterizes the
value of the objective function. Previous studies have reported that this swarm
intelligence method is fast and effective for locating the global optimum, and the superior
performance of the FA over other meta-heuristic algorithms has been reported in various
applications [39, 41, 42].
The FA utilizes the following rules: (1) all fireflies are unisex, so each firefly is attracted to
other fireflies regardless of their sex, (2) the attractiveness of a firefly is proportional to its
brightness and decreases as the distance increases. A firefly moves randomly if no other firefly is
brighter, and (3) the brightness of a firefly is affected or determined by the landscape of the
The brightness of an individual firefly can be defined similarly to the fitness value in the
genetic algorithm. The light intensity I(r) varies according to the following equation:
$$I(r) = I_0 \exp(-\gamma r^2) \tag{6}$$
where Io denotes the light intensity of the source. γ is the light absorption coefficient. r represents
$$\beta = \beta_0 \exp(-\gamma r^2) \tag{7}$$
In a D-dimensional space, the distance between any two fireflies i and j, located at xi and xj, is
calculated as follows:
$$r_{ij} = \|x_i - x_j\| = \sqrt{\sum_{k=1}^{D} (x_{i,k} - x_{j,k})^2} \tag{8}$$
Since a specific firefly xi is attracted to the brighter one xj, the movement of the ith firefly can
be expressed as:
where γ is the light absorption coefficient, γ varies from 0.1 to 10; β0 represents the attractiveness
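A single firefly move can be sketched in Python as follows. The movement equation itself is not reproduced in this copy of the manuscript, so the sketch assumes the update commonly given for the FA [40]: the firefly is pulled toward the brighter one with the attractiveness of Eq. (7), evaluated at the distance of Eq. (8), plus a small random perturbation. The function names, the step size alpha, and all default values are illustrative assumptions.

```python
import math
import random


def attractiveness(beta0, gamma, r):
    """Eq. (7): attractiveness of a firefly seen from distance r."""
    return beta0 * math.exp(-gamma * r * r)


def move_firefly(xi, xj, beta0=1.0, gamma=1.0, alpha=0.2, rng=random):
    """Move firefly xi toward the brighter firefly xj (assumed standard FA update)."""
    r = math.dist(xi, xj)  # Eq. (8): Euclidean distance between the two fireflies
    beta = attractiveness(beta0, gamma, r)
    # Attraction term plus a uniform random step in [-alpha/2, alpha/2) per dimension.
    return [a + beta * (b - a) + alpha * (rng.random() - 0.5)
            for a, b in zip(xi, xj)]


# Hypothetical positions in a two-dimensional (k, m) search space.
rng = random.Random(42)
new_position = move_firefly([5.0, 2.0], [7.0, 1.5], gamma=0.1, rng=rng)
```

A small γ keeps the attractiveness nearly constant over distance, so distant bright fireflies still pull strongly; a large γ makes the attraction fade quickly and the search behaves more like a random walk.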
The historical data used in this research contains 211 slope evaluation cases collected along
Taiwan Provincial Highways No. 18 and No. 21 during the typhoons Herb (1996), Nari (2001),
and Toraji (2013). The database comprises 105 failure and 106 non-failure cases. The
condition of each slope, either failure or non-failure, was determined and recorded during field
surveys. Specifically, a slope is classified as non-failure when there is no soil movement on
the slope surface that affects the safety of road traffic. Moreover, a case of slope is characterized
For the purpose of slope collapse prediction, this study employs 16 slope attributes divided
into 9 groups: landforms, geological structure, stratigraphy, rock properties, vegetation coverage,
water condition, road properties, earthquake, and rainfall. These attributes can be considered as
influencing factors that determine slope conditions; they were selected based on engineering
judgment, available statistical data, and findings from previous research [4, 9, 45-47].
Table 1 provides the influencing factors and their statistical descriptions.
Table 2 illustrates the database, where an output of 1 indicates a failed slope and
an output of -1 represents a stable slope. In Table 1, the first group covers four factors: slope
aspect, slope gradient, slope height, and slope form. The slope aspect refers to the horizontal
direction that a mountain slope faces; it has an indirect impact on the moisture content of
the soil, which is related to the reduction of the effective stresses at the potential failure surface.
The slope gradient measures the steepness of the slope. The slope height, defined as the
distance from crest to toe of a slope, is physically related to the magnitude of the stress and the
pore-water pressure in the lower slope [48]. The slope form describes the geometry of the slope
surface, which influences soil movement, rill patterns, and run-off production [49].
The second group depicts the geological characteristic of the area along the mountain roads.
The stratigraphic feature of the region is described in the third group; it includes two impact
factors, namely the angle between slope aspect and trend and the angle between gradient and
inclination. In addition, the fourth group of factors provides information on rock properties,
for which the rock mass size and the rock mass volume are taken into account. Moreover, the
characteristic of vegetation on the slope surface is also critical when assessing slope stability; to
quantify this characteristic, the vegetation coverage percentage and the vegetation coverage
Furthermore, in this research, the water condition of a slope is reflected by the size of the
catchment area, which is computed from a digital terrain model (DTM) [4]. Additionally, since road
construction is an artificial factor that can cause slope failure, our study considers two features
of mountain roads: excavation height at the slope toe and change of slope gradient due to toe cutting.
On the other hand, earthquakes and typhoons are generally considered the two natural hazards
that trigger slope collapse events in many places (e.g. Taiwan). In our study, the maximum
ground acceleration at the slope location during an earthquake and the maximum accumulated
rainfall during a typhoon are taken into account to measure their effects on slope stability.
This section describes the proposed slope collapse prediction method, named
SOFIL, in detail. The model (see Fig. 2) is established by a fusion of the FKNN algorithm and
(1) Input Data: The input data provides the attributes of a slope. As mentioned earlier, the slope
attributes consist of influencing factors that impose significant impacts on the slope collapse
events. The data can be real values or integers and they should be normalized into a range of (0,
1). This transformation can help avoid numerical difficulties and prevent the situation in which
attributes with greater numeric magnitudes dominate those with smaller magnitudes.
(2) Tuning Parameter Initialization: The aforementioned tuning parameters of the model are
randomly generated within the range of lower and upper boundaries. In this study, the lower and
upper boundaries of the neighboring size (k) are 1 and 30, respectively. Meanwhile, these two
values of the fuzzy strength (m) are 1.0001 and 10. Moreover, the equation used for generating
where X i , 0 is the tuning parameter i at the first generation. rand[0,1] denotes a uniformly
distributed random number between 0 and 1. LB and UB are two vectors of lower bound and
(3) Class Membership Assignment: In this step, the FKNN algorithm is deployed to assign fuzzy
memberships of an input vector to different classes. This step requires two parameters (the
neighboring size and the fuzzy strength) that are acquired from the FA component. It is noted
that the slope assessment problem is a two-class classification problem with two labels: “collapse”
and “non-collapse”. Thus, for each input pattern x, there are two outputs, u1(x) and u2(x),
explore the various combinations of the tuning parameters (k and m). At each generation, the
optimizer carries out its searching process to guide the population of fireflies to the optimal
solution. By evaluating the fitness of each firefly, the algorithm discards inferior combinations of
m and k, and permits robust combinations of these parameters to be passed on to the next
generations.
(5) Output Defuzzification: Because the FKNN yields fuzzy memberships of an input pattern in
the two classes (u1(x) and u2(x)), a step of defuzzification is employed to convert fuzzy outputs to
$$Y(x) = \arg\max_{i = 1, 2} \, u_i(x) \tag{11}$$
(6) Fitness Evaluation: In this step, the training data set is divided into five mutually exclusive
subsets. In each run, one subset is used as a validating set; meanwhile, the other subsets are used
for constructing the model memory. In order to determine the optimal tuning parameters of the
$$F_{fitness} = \frac{1}{\sum_{k=1}^{5} AR_k} \tag{12}$$
where ARk denotes the classification accuracy of the validating set at the kth run. The
(7) Stopping Condition: The optimization process of the FA terminates when the
maximum number of generations is reached. If the stopping condition is not met, the FA will
(8) Optimal Prediction Model: When the program terminates, the optimal set of tuning
parameters has been successfully identified. The SOFIL is ready to predict new input patterns.
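Several of the steps above reduce to short computations, sketched below in Python. The sketch makes assumptions that the text does not state: the normalization in step (1) is taken to be min-max scaling, Eq. (10) is read as LB + rand[0,1] × (UB − LB) from the description of its symbols, the neighboring size k is rounded to an integer, and Eq. (12) is read as the reciprocal of the summed validation accuracies. All function names and sample values are hypothetical.

```python
import random


def min_max_normalize(rows):
    """Step (1): scale each attribute (column) to [0, 1]; constant columns map to 0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]


def init_params(lb, ub, rng=random):
    """Step (2), Eq. (10) as read here: LB_i + rand[0,1] * (UB_i - LB_i) per parameter."""
    return [lo + rng.random() * (hi - lo) for lo, hi in zip(lb, ub)]


def defuzzify(memberships):
    """Step (5), Eq. (11): return the class label with the largest membership."""
    return max(memberships, key=memberships.get)


def fitness(fold_accuracies):
    """Step (6), Eq. (12) as read here: reciprocal of the summed accuracies (lower is better)."""
    return 1.0 / sum(fold_accuracies)


# Bounds on (k, m) stated in step (2).
LB, UB = [1, 1.0001], [30, 10]
k_raw, m = init_params(LB, UB, random.Random(0))
k = int(round(k_raw))  # assumption: k is rounded to the nearest integer

scaled = min_max_normalize([[10, 200], [20, 400], [30, 300]])
label = defuzzify({"collapse": 0.3, "non-collapse": 0.7})
score = fitness([0.9, 0.9, 0.9, 0.9, 0.9])
```

Because the fitness is a reciprocal, a firefly whose (k, m) pair yields higher cross-validation accuracy receives a smaller fitness value, so the search is a minimization.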
4. Experimental Results
results acquired from other benchmark approaches including the ANN, FKNN, RVM, and
SVM algorithms. As mentioned earlier, in the SOFIL, the neighboring size (k) and the fuzzy
strength (m) are automatically chosen by the FA optimization. In the benchmark FKNN algorithm, the
neighboring size k is allowed to vary between 1 and 30 and is
selected via a five-fold cross validation process based on the training cases. The fuzzy strength
parameter (m) in the FKNN algorithm is set to 2, as recommended by the previous work
[30]. Moreover, the parameters of the SVM are determined via the grid search approach [51, 52].
When using an ANN, one must specify the number of hidden layers, the number of
neurons in the hidden layer, the learning rate, and the number of training epochs [53]. These
parameters are generally selected via repetitive trial-and-error processes. The
network configuration is as follows: the number of hidden layers is set to 1; the
number of neurons in the hidden layer is 16; and the number of training epochs is
2000. The back-propagation approach is used as the method for training the ANN model [54].
In the experiment, the whole database is randomly divided into two sets: set 1 (80%
of the cases), used to construct the prediction model, and set 2 (20% of the cases),
used to test the model. Model performance is evaluated with the classification accuracy rate
(CAR), which is the ratio of correctly predicted cases
to the total number of cases [55, 56].
Moreover, the predictive capability of the classifiers can also be assessed using the following
four metrics [56]: true positive rate (the percentage of positive instances correctly classified),
true negative rate (the percentage of negative instances correctly classified), false positive rate
(the percentage of negative instances misclassified), and false negative rate (the percentage of
positive instances misclassified). These four metrics can be summarized in a confusion matrix
[57]. A well-known approach to incorporate these four measures and to produce an evaluation
criterion is to employ the Receiver Operating Characteristic (ROC) curve [58]. Furthermore, the
area under the ROC curve, denoted as AUC, provides a single measure of a classifier’s
It is noted that a higher AUC value indicates a better predictive performance. Generally, a
classifier with perfect predictive ability has an AUC of 1; meanwhile, a poor classifier with
random predictions has an AUC of 0.5. Moreover, an AUC of the range (0.7, 0.8) indicates an
performance is attained. If AUC ≥ 0.9, the classifier has attained an outstanding
performance.
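The measures described above can be computed as in the following Python sketch. It is an illustration under stated assumptions, not the evaluation code used in the study: the function names and the toy labels and scores are invented, and the AUC is obtained from the pairwise-ranking identity (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half), which equals the area under the ROC curve.

```python
def confusion_rates(y_true, y_pred, positive=1):
    """Return CAR and the four rates summarized by a two-class confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "CAR": (tp + tn) / len(pairs),
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }


def auc(y_true, scores, positive=1):
    """Area under the ROC curve via pairwise ranking (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test cases: 1 = failed slope, -1 = stable slope; the score is
# assumed to be the fuzzy membership of each case in the failure class.
y_true = [1, 1, 1, -1, -1, -1]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else -1 for s in scores]
rates = confusion_rates(y_true, y_pred)
area = auc(y_true, scores)
```

Unlike the CAR, the AUC does not depend on a single decision threshold, which is why both measures are reported together.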
When the FA-based parameter tuning process terminates, the optimized hyper-parameters of
the SOFIL have been identified as k = 5 and m = 1.28. Furthermore, the evolutionary process of the
SOFIL can be observed in Fig. 3. The value of the fitness function (Eq. (12))
gradually improved and reached its best value at iteration 102. During the latter part of
the searching process, the fitness value remains the same until the stopping condition (the
The experimental results demonstrate that the FA is an effective meta-heuristic since
it helps the proposed model converge quickly toward the most desirable set of hyper-
parameters. Details of the SOFIL’s prediction results for the testing data are shown in Table 3.
Herein, µ1(X) and µ2(X) represent the membership degrees of the input vector X in the two
The confusion matrices of the SOFIL and the other methods are given in Table 4. In the
training process, the numbers of common false positives and false negatives of the five models
are 1 and 3, respectively. Meanwhile, in the testing process, the proposed approach and the SVM
do not commit any false positives. The RVM, ANN, and FKNN algorithms have 1 overlapping
false positive. In addition, the number of overlapping false negatives of the five methods is 2. It
can be observed that the proposed SOFIL has achieved the fewest false positives and false
Table 5 provides the results obtained from the training and testing processes of the SOFIL and
the other benchmark methods. The CAR results of the SOFIL, SVM, RVM, ANN, and FKNN
methods are 95.24%, 92.85%, 88.01%, 88.10%, and 85.71% in the testing process, respectively.
When predicting testing samples, the AUC values of the five methods are 0.95, 0.94, 0.89, 0.88
and 0.86, respectively. The proposed SOFIL thus delivers the best slope
collapse prediction results in both the training and testing processes. Notably, the newly established method
correctly classifies 40 testing cases with only 2 misclassifications. Thus, the SOFIL deems best
5. Conclusion
In this research, a new slope collapse prediction model, named SOFIL, has been proposed.
Experimental results obtained from both training and testing processes have verified that the new
model outperforms the other benchmark methods in terms of all performance measurements. This
demonstrates that the SOFIL is a promising alternative to support decision makers in slope
collapse assessment.
learning classifier and the FA, a swarm intelligence optimization technique. The SOFIL utilizes
the FKNN algorithm to assign a membership grade in each class to an unknown pattern of slope
attributes. Additionally, the FA searching algorithm is deployed to identify the most appropriate
set of the FKNN’s tuning parameters. As a result, the proposed method can eliminate the need for
human effort or domain knowledge in parameter setting. Since the SOFIL is an effective
classifier, it can be applied to other problems in the field of civil engineering. Moreover,
References
[1] J. Bryan, S. Hill, M. Munday, A. Roberts, Road infrastructure and economic development in the periphery: the
case of A55 improvements in North Wales, J. Transp. Geogr., 5 (1997) 227-237.
[2] A.H. Munnell, Policy Watch - Infrastructure Investment and Economic Growth, J. Econ. Perspect., 6 (1992) 189–
198.
[3] S. Yang, C. Shen, C. Huang, C. Lee, C. Cheng, C. Chen, Prediction of Mountain Road Closure Due to Rainfall-
Induced Landslides, J. Perform. Constr. Fac., 26 (2012) 197-202.
[4] J. Ching, H.-J. Liao, J.-Y. Lee, Predicting rainfall-induced landslide potential along a mountain road in Taiwan,
Geotechnique 61, No. 2, 153–166, (2011).
[5] H.-M. Lin, S.-K. Chang, J.-H. Wu, C.H. Juang, Neural network-based model for assessing failure potential of
highway slopes in the Alishan, Taiwan Area: Pre- and post-earthquake investigation, Eng. Geol., 104 (2009)
280-289.
[6] H.A. Nefeslioglu, E. Sezer, C. Gokceoglu, A.S. Bozkir, T.Y. Duman, Assessment of Landslide Susceptibility by
Decision Trees in the Metropolitan Area of Istanbul, Turkey, Math. Probl. Eng., 2010 (2010).
[7] S.K. Das, R.i. Biswal, N. Sivakugan, B. Das, Classification of slopes and prediction of factor of safety using
differential evolution neural networks, Environ. Earth Sci., 64 (2011) 201-210.
[8] M.Y. Cheng, C.H. Ko, Automated Safety Monitoring and Diagnosis System for Unstable Slopes, Comput-aided
Civ. Inf., 18 (2003) 64-77.
[9] M.-Y. Cheng, A.F.V. Roy, K.-L. Chen, Evolutionary risk preference inference model using fuzzy support vector
machine for road slope collapse prediction, Expert Syst. Appl., 39 (2012) 1737-1746.
[10] A. Ahangar-Asr, A. Faramarzi, A.A. Javadi, A new approach for prediction of the stability of soil and rock
slopes, Eng. Computation, 27 (2010) 878 - 893.
[11] R. Baker, Sufficient conditions for existence of physically significant solutions in limiting equilibrium slope
stability analysis, Int. J. Solids. Struct, 40 (2003) 3717-3735.
[12] Y. Song, J. Gong, S. Gao, D. Wang, T. Cui, Y. Li, B. Wei, Susceptibility assessment of earthquake-induced
landslides using Bayesian network: A case study in Beichuan, China, Comput. Geosci., 42 (2012) 189-199.
[13] A.K. Sinha, M. Sengupta, Expert system approach to slope stability, Mining Science and Technology, 8 (1989)
21-29.
[14] K. Muthu, M. Petrou, Landslide-Hazard Mapping Using an Expert System and a GIS, IEEE Trans. Geosci.
Remote Sens., 45 (2007) 522-531.
[15] M.-Y. Cheng, N.-D. Hoang, A novel groutability estimation model for ground improvement projects in sandy
silt soil based on Bayesian framework, Tunn.Undergr. Sp. Tech., 43 (2014) 453-458.
[16] P. Lu, M.S. Rosenbaum, Artificial Neural Networks and Grey Systems for the Prediction of Slope Stability, Nat.
Hazards, 30 (2003) 383-398.
[17] K.-p. Zhou, Z.-Q. Chen, Stability Prediction of Tailing Dam Slope Based on Neural Network Pattern
Recognition, in: In Proc. of the Second International Conference on Environmental and Computer Science,
Dubai, 2009, pp. 380-383.
[18] J. Jiang, BP neural networks for Prediction of the factor of safety of slope stability, In Proceedings of the
International Conference on Computing, Control and Industrial Engineering (CCIE), Wuhan, China, (2011).
[19] S.E. Cho, Probabilistic stability analyses of slopes using the ANN-based response surface, Comput. Geotech.,
36 (2009) 787-797.
[20] T.-l. Lee, H.-m. Lin, Y.-p. Lu, Assessment of highway slope failure using neural networks, J. Zhejiang Univ.
Sci. A, 10 (2009) 101-108.
[21] H.B. Wang, W.Y. Xu, R.C. Xu, Slope stability evaluation using Back Propagation Neural Networks, Eng. Geol.,
80 (2005) 302-315.
[22] H. Zhao, S. Yin, Z. Ru, Relevance vector machine applied to slope stability analysis, Int. J. Numer. Anal. Meth.
Geomech., 36 (2012) 643–652.
[23] J. Li, F. Wang, Study on the Forecasting Models of Slope Stability under Data Mining, in: Proc. of the Earth
and Space 2012: Engineering, Science, Construction, and Operations in Challenging Environments, Honolulu,
Hawaii, United States, ASCE, 2010, pp. 765-776.
[24] M.-Y. Cheng, Y.-W. Wu, K.-L. Chen, Risk Preference Based Support Vector Machine Inference Model for
Slope Collapse Prediction, Autom. Constr., 22 (2012) 175-181.
[25] H.-b. Zhao, Slope reliability analysis using a support vector machine, Comput. Geotech., 35 (2008) 459-467.
[26] P. Samui, Slope stability analysis: a support vector machine approach, Environ Geol, 56 (2008) 255-267.
[27] J. Li, M. Dong, Method to Predict Slope Safety Factor Using SVM, in: In Proc. of the Earth and Space 2012:
Engineering, Science, Construction, and Operations in Challenging Environments, Pasadena, California, United
States, ASCE, 2012, pp. 888-899.
[28] S. Kiranyaz, T. Ince, A. Yildirim, M. Gabbouj, Evolutionary artificial neural networks by multi-dimensional
particle swarm optimization, Neural Net., 22 (2009) 1448-1462.
[29] M.-Y. Cheng, N.-D. Hoang, Interval Estimation of Construction Cost at Completion Using Least Squares
Support Vector Machine, J. Civ. Eng. Manag., 20 (2013) 223-236.
[30] J.M. Keller, M.R. Gray, J.A. Given, A Fuzzy K-Nearest Neighbor Algorithm, IEEE T. Syst. Man Cy., 15 (1985)
580-585.
[31] S.-T. Li, H.-F. Ho, Predicting financial activity with evolutionary fuzzy case-based reasoning, Expert Syst.
Appl., 36 (2009) 411-422.
[32] M. Govindarajan, R.M. Chandrasekaran, Evaluation of k-Nearest Neighbor classifier performance for direct
marketing, Expert. Syst. Appl., 37 (2010) 253-258.
[33] H.-L. Chen, C.-C. Huang, X.-G. Yu, X. Xu, X. Sun, G. Wang, S.-J. Wang, An efficient diagnosis system for
detection of Parkinson’s disease using fuzzy k-nearest neighbor approach, Expert. Syst. Appl., 40 (2013) 263-
271.
[34] M. Cheng, N. Hoang, Groutability Estimation of Grouting Processes with Microfine Cements Using an
Evolutionary Instance-Based Learning Approach, J. Comput. Civ. Eng., ASCE, 28 (2014) 04014014.
[35] M.-Y. Cheng, N.-D. Hoang, L. Limanto, Y.-W. Wu, A novel hybrid intelligent approach for contractor default
status prediction, Knowl.-Based Syst., 71 (2014) 314-321.
[36] H.-L. Chen, B. Yang, G. Wang, J. Liu, X. Xu, S.-J. Wang, D.-Y. Liu, A novel bankruptcy prediction model
based on an adaptive fuzzy k-nearest neighbor method, Knowl.-Based Syst., 24 (2011) 1348-1359.
[37] G. Kim, D. Seo, K. Kang, Hybrid Models of Neural Networks and Genetic Algorithms for Predicting
Preliminary Cost Estimates, J. Comput. Civ. Eng., ASCE, 19 (2005) 208-211.
[38] W. Zhang, P. Niu, G. Li, P. Li, Forecasting of turbine heat rate with online least squares support vector machine
based on gravitational search algorithm, Knowl.-Based Syst., 39 (2013) 34-44.
[39] T. Xiong, Y. Bao, Z. Hu, Multiple-output support vector regression with a firefly algorithm for interval-valued
stock price index forecasting, Knowl.-Based Syst., 55 (2014) 87-100.
[40] X.-S. Yang, Firefly algorithm, Luniver Press, Bristol, UK, (2008).
[41] B. Amiri, L. Hossain, J.W. Crawford, R.T. Wigand, Community Detection in Complex Networks: Multi–
objective Enhanced Firefly Algorithm, Knowl.-Based Syst., 46 (2013) 1-11.
[42] I. Fister, I. Fister Jr, X.-S. Yang, J. Brest, A comprehensive review of firefly algorithms, Swarm. Evol. Comput.,
13 (2013) 34-46.
[43] L.d.S. Coelho, V.C. Mariani, Improved firefly algorithm approach applied to chiller loading for energy
conservation, Energ. Buildings, 59 (2013) 273-278.
[44] A. Baykasoğlu, F.B. Ozsoydan, An improved firefly algorithm for solving dynamic multidimensional knapsack
problems, Expert Syst. Appl., 41 (2014) 3712-3725.
[45] S.Z. Chen, L.J. Hu, G.H. Chen, The investigation and remedy on the slope failure of Mt. So-San, Hazard
Mitigation Report No. 77-12, National Science Council, Taiwan, Republic of China (1988).
[46] J.B. Hsu, A study of landslide damage risk evaluation model for mountain roads, MS Thesis, Department of
Construction Engineering, National Taiwan Univ. of Sci. and Tech., (2006).
[47] D.H. Lee, A study on mechanical properties of representative rocks at Mt. So-San and in situ measurements of
slope movements, Report to the National Science Council No. 77-0414-P006-12B, National Cheng Kung
University, Taiwan, Republic of China, (1989).
[48] C.T. Lee, C.C. Huang, J.F. Lee, K.L. Pan, M.L. Lin, J.J. Dong, Statistical approach to storm event-induced
landslides susceptibility, Nat. Hazards Earth Syst. Sci., 8 (2008) 941-960.
[49] D.H. Rieke-Zapp, M.A. Nearing, Slope Shape Effects on Erosion: A Laboratory Study, Soil Sci. Soc. Am. J.,
69 (2005) 1463-1471.
[50] W.-T. Lin, W.-C. Chou, C.-Y. Lin, Earthquake-induced landslide hazard and vegetation recovery assessment
using remotely sensed data and a neural network-based classifier: a case study in central Taiwan, Nat. Hazards,
47 (2008) 331-347.
[51] J. Suykens, T. Van Gestel, J. De Brabanter, B. De Moor, J. Vandewalle, Least Squares Support Vector Machines,
World Scientific Publishing Co. Pte. Ltd., Singapore, (2002).
[52] C.W. Hsu, C.C. Chang, C.J. Lin, A practical guide to support vector classification, Technical Report,
Department of Computer Science, National Taiwan University, (2010).
[53] S. Samarasinghe, Neural Networks for Applied Sciences and Engineering, Taylor and Francis (2006).
[54] S.J. Russell, P. Norvig, Artificial Intelligence: A Modern Approach, 2nd Edition, Prentice Hall, Pearson
Education, Inc., (2003).
[55] M.-Y. Cheng, N.-D. Hoang, Groutability prediction of microfine cement based soil improvement using
evolutionary LS-SVM inference model, J. Civ. Eng. Manag., (2014) 1-10.
[56] V. López, A. Fernández, S. García, V. Palade, F. Herrera, An insight into classification with imbalanced data:
Empirical results and current trends on using data intrinsic characteristics, Inform. Sciences, 250 (2013) 113-141.
[57] T. Fawcett, An introduction to ROC analysis, Pattern Recognit. Lett., 27 (2006) 861-874.
[58] A.P. Bradley, The use of the area under the ROC curve in the evaluation of machine learning algorithms,
Pattern Recognit., 30 (1997) 1145-1159.
[59] H. Jin, C.X. Ling, Using AUC and accuracy in evaluating learning algorithms, IEEE Trans. Knowl. Data Eng.,
17 (2005) 299-310.
[60] H.P. Tserng, G.-F. Lin, L.K. Tsai, P.-C. Chen, An enforced support vector machine model for construction
contractor default prediction, Autom. Constr., 20 (2011) 1242-1249.
List of Figures
[Figure: plot with x-axis "Iterations" (0 to 300) and y-axis values ranging from 0.0100 to 0.0120]
Table 3 Prediction results of the SOFIL for the testing data
Columns: case number (No.); influencing factors 1 to 16; membership grades µ1(X) and µ2(X); results YA and YP
No. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 µ1(X) µ2(X) YA YP
1 0.73 0.75 0.06 0.46 0.80 0.50 0.86 0.21 0.69 0.81 0.55 0.02 0.06 0.00 0.47 1.00 1.00 0.00 -1 -1
2 0.54 0.50 0.07 0.49 0.00 0.50 0.67 0.06 0.38 0.81 0.21 0.00 0.04 0.00 0.52 1.00 0.78 0.22 -1 -1
3 0.37 0.63 0.66 0.30 0.80 0.86 0.32 0.27 0.75 0.05 0.06 0.28 0.10 0.33 0.64 0.46 0.03 0.97 1 1
4 0.20 0.75 0.10 0.62 0.80 0.50 0.48 0.25 0.75 0.65 0.06 0.03 0.02 0.00 0.68 0.63 0.73 0.27 -1 -1
5 0.39 0.44 0.10 0.57 1.00 0.50 0.62 0.00 0.69 0.00 0.01 0.05 0.16 0.78 0.75 0.03 0.00 1.00 1 1
6 0.27 0.81 0.38 0.10 0.80 0.67 0.71 0.06 0.50 0.16 0.21 0.06 0.26 0.00 0.87 0.10 0.01 0.99 1 1
7 0.13 0.75 0.07 0.49 0.80 0.50 0.38 0.42 0.88 0.75 0.44 0.00 0.10 0.00 0.53 1.00 0.99 0.01 -1 -1
8 0.44 0.75 0.09 0.58 1.00 0.50 0.86 0.04 0.88 0.00 0.10 0.02 0.08 0.22 0.66 0.38 0.04 0.96 1 1
9 0.27 0.69 0.24 0.50 0.80 0.50 0.33 0.06 0.69 0.70 0.33 0.01 0.10 0.00 0.69 0.28 0.20 0.80 1 1
10 0.08 0.63 0.10 0.48 0.40 0.50 0.76 0.08 0.56 0.81 0.44 0.00 0.10 0.22 0.52 0.37 0.22 0.78 1 1
11 0.44 0.63 0.17 0.34 0.80 0.50 0.76 0.06 0.38 0.75 0.44 0.03 0.16 0.33 0.90 0.32 0.08 0.92 1 1
12 0.42 0.50 0.24 0.35 0.80 0.50 0.67 0.02 0.56 0.86 1.00 0.05 0.06 0.00 0.52 0.01 0.21 0.79 1 1
13 0.28 0.69 0.10 0.42 0.80 0.50 0.81 0.02 0.38 0.48 0.06 0.02 0.04 0.33 0.63 0.25 0.10 0.90 1 1
14 0.18 0.69 0.03 0.72 0.80 0.50 0.81 0.21 0.38 0.81 0.10 0.00 0.04 0.00 0.65 0.63 0.95 0.05 -1 -1
15 0.82 0.31 0.14 0.63 0.80 0.50 0.52 0.13 0.75 0.91 0.06 0.04 0.02 0.33 0.67 0.63 0.98 0.02 -1 -1
16 0.38 0.38 0.14 0.18 0.80 0.89 0.10 0.04 0.31 0.65 0.06 0.08 0.03 0.56 0.62 0.11 0.15 0.85 1 1
17 0.56 0.63 0.05 0.66 0.80 0.50 0.76 0.25 0.69 0.27 0.10 0.00 0.14 0.44 0.98 1.00 1.00 0.00 -1 -1
18 0.55 0.63 0.12 0.67 0.80 0.50 0.76 0.04 0.38 0.97 0.55 0.02 0.03 0.00 0.63 0.63 0.99 0.01 -1 -1
19 0.61 0.44 0.17 0.28 0.40 0.50 0.62 0.04 0.00 0.16 0.33 0.09 0.12 0.00 0.58 0.39 0.02 0.98 1 1
20 0.25 0.94 0.09 0.63 0.80 0.19 0.57 0.58 0.88 0.22 0.06 0.01 0.02 0.22 0.64 0.10 0.04 0.96 1 1
21 0.00 0.75 0.45 0.28 0.80 0.14 0.57 0.17 0.44 0.05 0.10 1.00 0.20 0.56 0.66 0.04 0.02 0.98 1 1
22 0.72 0.50 0.31 0.69 0.80 0.50 0.67 0.38 0.63 0.27 0.06 0.01 0.02 0.22 0.56 0.33 0.60 0.40 1 -1
23 0.15 0.55 0.14 0.58 1.00 0.50 0.70 0.06 0.50 0.00 0.10 0.01 0.04 0.00 0.59 0.63 0.07 0.93 1 1
24 0.28 0.63 0.04 0.79 0.80 0.50 0.76 0.17 0.63 0.75 0.55 0.00 0.06 0.00 0.50 1.00 1.00 0.01 -1 -1
25 0.82 0.50 0.06 0.61 0.80 0.50 0.67 0.13 0.63 0.81 0.21 0.00 0.10 0.33 0.95 1.00 1.00 0.00 -1 -1
26 0.46 0.75 0.10 0.68 0.40 0.33 0.38 0.08 0.63 0.22 0.33 0.00 0.24 0.22 0.50 0.21 0.02 0.98 1 1
27 0.49 0.63 0.14 0.25 0.40 0.28 0.29 0.06 0.38 0.81 0.44 0.01 0.14 0.00 0.50 0.36 0.09 0.91 1 1
28 0.00 0.63 0.14 0.55 0.20 0.50 0.76 0.02 0.38 0.86 0.78 0.01 0.16 0.00 0.50 0.36 0.26 0.74 1 1
29 0.25 0.50 0.04 0.64 0.80 0.08 0.26 0.02 0.38 0.91 0.21 0.06 0.02 0.00 0.67 0.63 0.93 0.07 -1 -1
30 0.15 0.63 0.07 0.61 0.80 0.50 0.76 0.29 0.69 0.97 0.66 0.00 0.06 0.00 0.47 1.00 1.00 0.00 -1 -1
31 0.56 0.63 0.07 0.61 0.80 0.56 0.48 0.17 0.75 0.32 0.06 0.00 0.02 0.11 0.62 0.12 0.02 0.98 1 1
32 0.89 0.56 0.05 0.55 0.80 0.50 0.71 0.25 0.69 0.86 0.44 0.00 0.06 0.00 0.47 1.00 1.00 0.00 -1 -1
33 0.54 0.75 0.31 0.50 0.80 0.50 0.81 0.08 0.69 0.59 0.21 0.00 0.60 0.22 0.00 0.72 0.19 0.81 1 1
34 0.04 0.94 0.06 0.64 0.80 0.58 0.71 0.58 0.94 0.38 0.06 0.00 0.02 0.11 0.64 0.63 0.66 0.34 -1 -1
35 0.23 0.63 0.05 0.72 0.80 0.50 0.76 0.25 0.75 0.86 0.33 0.00 0.06 0.00 0.83 1.00 1.00 0.00 -1 -1
36 0.24 0.63 0.08 0.45 0.80 0.50 0.76 0.21 0.63 0.81 0.55 0.01 0.06 0.00 0.53 1.00 1.00 0.00 -1 -1
37 0.37 0.19 0.03 0.79 1.00 0.50 0.43 0.04 0.38 0.91 0.33 0.03 0.02 0.00 0.61 0.63 1.00 0.00 -1 -1
38 0.00 0.75 0.14 0.27 0.80 0.50 0.86 0.25 0.50 0.81 0.66 0.13 0.18 0.67 0.00 0.98 0.80 0.20 1 -1
39 0.39 0.63 0.03 0.45 1.00 0.50 0.76 0.04 0.00 0.05 0.21 0.06 0.05 0.44 0.64 0.39 0.03 0.97 1 1
40 0.34 0.63 0.10 0.54 0.40 0.00 0.29 0.08 0.56 0.81 0.44 0.01 0.10 0.00 0.00 0.36 0.00 1.00 1 1
41 0.00 0.31 0.09 0.15 0.80 0.67 0.24 0.08 0.38 0.59 0.01 0.01 0.07 0.00 0.62 0.12 0.20 0.80 1 1
42 0.06 0.69 0.10 0.41 0.80 0.50 0.81 0.02 0.13 0.16 0.10 0.04 0.03 0.00 0.64 0.45 0.06 0.94 1 1
Note: YA and YP denote the actual and predicted slope collapse outcomes, respectively. Outputs of 1 and -1 indicate a failed slope and a stable slope, respectively.
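The membership grades and outcomes in Table 3 can be tallied with a short script. The sketch below is illustrative only and is not the authors' implementation: it assumes that µ1(X) is the membership grade of the stable class (-1) and µ2(X) that of the failed class (1), an interpretation inferred from the rows of the table, and that the predicted label is the class with the larger grade. The names ROWS, predict, and confusion are hypothetical. The tuples are the last four columns of Table 3, and the counts describe only these 42 testing cases.

```python
# (mu1, mu2, YA, YP) for the 42 testing cases, copied from the last four columns of Table 3.
ROWS = [
    (1.00, 0.00, -1, -1), (0.78, 0.22, -1, -1), (0.03, 0.97, 1, 1),
    (0.73, 0.27, -1, -1), (0.00, 1.00, 1, 1), (0.01, 0.99, 1, 1),
    (0.99, 0.01, -1, -1), (0.04, 0.96, 1, 1), (0.20, 0.80, 1, 1),
    (0.22, 0.78, 1, 1), (0.08, 0.92, 1, 1), (0.21, 0.79, 1, 1),
    (0.10, 0.90, 1, 1), (0.95, 0.05, -1, -1), (0.98, 0.02, -1, -1),
    (0.15, 0.85, 1, 1), (1.00, 0.00, -1, -1), (0.99, 0.01, -1, -1),
    (0.02, 0.98, 1, 1), (0.04, 0.96, 1, 1), (0.02, 0.98, 1, 1),
    (0.60, 0.40, 1, -1), (0.07, 0.93, 1, 1), (1.00, 0.01, -1, -1),
    (1.00, 0.00, -1, -1), (0.02, 0.98, 1, 1), (0.09, 0.91, 1, 1),
    (0.26, 0.74, 1, 1), (0.93, 0.07, -1, -1), (1.00, 0.00, -1, -1),
    (0.02, 0.98, 1, 1), (1.00, 0.00, -1, -1), (0.19, 0.81, 1, 1),
    (0.66, 0.34, -1, -1), (1.00, 0.00, -1, -1), (1.00, 0.00, -1, -1),
    (1.00, 0.00, -1, -1), (0.80, 0.20, 1, -1), (0.03, 0.97, 1, 1),
    (0.00, 1.00, 1, 1), (0.20, 0.80, 1, 1), (0.06, 0.94, 1, 1),
]

STABLE, FAILED = -1, 1


def predict(mu1, mu2):
    """Assumed decision rule: choose the class with the larger membership grade."""
    return STABLE if mu1 >= mu2 else FAILED


def confusion(rows):
    """Count outcomes, treating a failed slope (1) as the positive class."""
    counts = {"TP": 0, "FN": 0, "FP": 0, "TN": 0}
    for _mu1, _mu2, ya, yp in rows:
        if ya == FAILED:
            counts["TP" if yp == FAILED else "FN"] += 1
        else:
            counts["FP" if yp == FAILED else "TN"] += 1
    return counts


# Check that the assumed rule reproduces every YP value listed in the table.
rule_matches = all(predict(m1, m2) == yp for m1, m2, _ya, yp in ROWS)
counts = confusion(ROWS)
accuracy = (counts["TP"] + counts["TN"]) / len(ROWS)
print(rule_matches, counts, round(accuracy, 3))
```

Under these assumptions the rule reproduces all 42 listed predictions, and the two misclassified cases (No. 22 and No. 38) are failed slopes predicted as stable.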
Table 4 Confusion matrices
Research highlights
• The SOFIL integrates the Fuzzy k-Nearest Neighbor (FKNN) algorithm and the Firefly Algorithm (FA).