Multi Geohazard Susceptibility Mapping Based On Machine Learning, A Case Study in Jiuzhaigou, China
https://doi.org/10.1007/s11069-020-03927-8
ORIGINAL PAPER
Received: 29 January 2020 / Accepted: 12 April 2020 / Published online: 5 May 2020
© Springer Nature B.V. 2020
Abstract
Jiuzhaigou, located in the transitional area between the Qinghai–Tibet Plateau and the
Sichuan Basin, is highly prone to geological hazards (e.g., rock fall, landslide, and debris
flow). High-performance hazard prediction models are therefore urgently required
to prevent related hazards and manage potential emergencies. Current research mainly
focuses on susceptibility to a single hazard and ignores that different types of geological
hazards might occur simultaneously in a complex environment. Here, we first built a
multi-geohazard inventory from 2000 to 2015 based on a geographical information system and
satellite data in Google Earth, and then chose twelve conditioning factors and three
machine learning methods (random forest, support vector machine, and extreme gradient
boosting [XGBoost]) to generate rock fall, landslide, and debris flow susceptibility
maps. The results show that the debris flow models presented the best prediction capabilities
[area under the receiver operating characteristic curve (AUC) of 0.95], followed by rock fall
(AUC 0.94) and landslide (AUC 0.85). Additionally, XGBoost outperformed the other two
methods, with the highest AUC of 0.93. All three methods had AUC values larger than
0.84, which suggests that these models perform fairly well in assessing geological hazard
susceptibility. Finally, an evolution index was constructed based on a joint probability of these
three hazard models to predict the evolution tendency of 35 unstable slopes in Jiuzhaigou.
The results show that these unstable slopes are likely to evolve into debris flows with a
probability of 46%, followed by landslides (43%) and rock falls (29%). Areas of higher
susceptibility to geohazards were mainly located in the southeast and middle of Jiuzhaigou,
implying that geohazard prevention and mitigation measures should be taken there in the near
future.
* Zhao Zhang
Zhangzhao@mail.bnu.edu.cn
Extended author information available on the last page of the article
852 Natural Hazards (2020) 102:851–871
1 Introduction
The occurrence of a geological hazard is a complex process, and many natural (e.g.,
earthquake and tsunami) and anthropogenic factors (e.g., deforestation and urbaniza-
tion) can trigger such hazards, including ground subsidence, landslides, rock falls, and
debris flows (Ge et al. 2013; Gutiérrez et al. 2014; Tehrany et al. 2015). Geological
hazards have caused major threats to the lives and property of humankind, and heavy
damage to our environment and resources (Corominas et al. 2013; Huang and Zhao
2018). The economic losses from such geological hazards have amounted to about 20
billion Yuan (CNY) every year in China (Hong et al. 2016). The geological hazards of
China are characterized by various types, wide distribution, and high frequency, especially
in its western mountainous areas (Guha-Sapir et al. 2012). Therefore, it is of great
importance for hazard relief and policy-making entities to obtain high-resolution
susceptibility maps, especially for typical hazards such as landslides, rock falls, and
debris flows. Moreover, for the potential threat from unstable slopes, we should know their
evolution tendencies and take timely preventive measures. In this way, we can better
understand the spatial probability of geological hazard occurrence, and monitor,
forecast, and warn of such hazards more accurately. Consequently, risk assessment, hazard
prevention, and emergency management can be conducted in a timely manner.
Some traditional methods (e.g., heuristic and deterministic models) have been criti-
cized because of their excessive dependence on either the subjective judgments of
experts (Hong et al. 2016) or the large amounts of data required (Pourghasemi et al.
2012). In contrast, machine learning methods can handle various types of data (e.g., ratio, interval,
nominal, or ordinal data) and are capable of identifying complex, nonlinear relationships
(Ge et al. 2013; Youssef et al. 2015; Zou et al. 2013). Therefore, machine learning
has been used widely, particularly to handle problems in which the characteristics of the
underlying processes are difficult to describe using physical equations (Cao et al. 2019).
Consequently, machine learning has become a potential alternative approach to study
geological hazard susceptibility. Such methods have been widely applied to hazard anal-
ysis through learning the relationship between a certain geological hazard and condi-
tioning factors without assuming a structural model at first (Dickson and Perry 2016;
Huang and Zhao 2018); there has been increasing interest in using machine learning
techniques to study susceptibility of geological hazards, e.g., the boosted regression tree
(BRT), classification and regression tree (CART), generalized linear model (GLM), ran-
dom forest (RF), and support vector machine (SVM) methods. For example, SVM was
used to map landslide susceptibility in China (Huang and Zhao 2018; Yao et al. 2008).
Moreover, different machine learning methods have been compared and evaluated in
different regions of the world (Marjanović et al. 2011; Mokhtari and Abedian 2019;
Youssef et al. 2015; Zhu et al. 2017), such as Saudi Arabia (Youssef et al. 2015),
Sichuan, China (Zhu et al. 2017), Iran (Pourghasemi and Rahmati 2018), and Western
Serbia (Marjanović et al. 2011). Nevertheless, these previous studies have
focused on only a single type of hazard and ignored that different types of geological hazards
might occur simultaneously in a complex environment. Thus, multi-hazard studies
are needed to support scientific decision-making. Additionally, very few
studies have addressed the evolution tendency of potential hazards (e.g., unstable
slopes). More urgently, decision-makers should first know the evolution tendency of
potential hazards and then implement the corresponding prevention and reduction
measures in advance (Huang and Zhao 2018).
After the 2008 Ms 8.0 Wenchuan earthquake, many unstable slopes have threatened residents
in the western mountainous areas of Sichuan Province. Jiuzhaigou is one such hotspot,
as it has been affected by frequent geological hazards due to climatic, seismic,
and anthropogenic factors. For example, on August 8, 2017, an Ms 7.0
earthquake caused a great number of geological hazards in Jiuzhaigou, and researchers
have conducted a series of studies in the area (Fabbri et al. 2003; Wu et al. 2018). In this
study, based on multisource datasets together with remote sensing (RS) and geographical
information system (GIS) techniques, three typical machine learning methods (SVM, RF,
and XGBoost) were employed to assess multi-hazard (landslide, rock fall, and
debris flow) susceptibility and predict the evolution tendency of unstable slopes in Jiuzhaigou.
The main objectives were: (1) to develop RF, SVM, and XGBoost models to map the
susceptibility of the three main geological hazards, (2) to validate these hazard susceptibility
maps based on receiver operating characteristic (ROC) curves and predictive accuracy
(ACC), and identify the main conditioning factors, and (3) to predict the evolution tendency
of unstable slopes in Jiuzhaigou. Our research may help
Chinese decision-makers monitor and issue warnings for geological hazards.
2 Materials and method
2.1 Study area
An accurate and detailed hazard inventory map was prepared. In this study, geological
hazard information (i.e., occurrence location, size, type, and occurrence time) was compiled
from official records of the Bureau of Land and Resources of Sichuan (BLRS) from
2000 to 2015. To delineate the hazard areas, satellite images were interpreted
in Google Earth Pro 7.1 for both geological hazards (landslide, rock fall,
and debris flow) and potential hazards (unstable slopes). The contour and texture features of
common geohazards in satellite images are shown in Fig. 2.
Fig. 1 Location of the study area in China’s Sichuan Province (a–b); the distribution of geological hazards
in towns (c); the locations of the geological hazards in Jiuzhaigou (d)
Fig. 2 Interpretation of geological hazards: landslide (a), rock fall (b), debris flow (c), and unstable slope (d)
Fig. 3 Temporal patterns of geological hazards: annual occurrence number (bar graph) and total scale
(blue line; unit: 10⁴ m³) (a), and seasonal changes (b) of geological hazards from 2000 to 2015 in Jiuzhaigou
period from March to July (367, 93.4%), especially in June (255, 64.8%), followed by
May (50, 12.7%) and March (23, 5.85%).
2.4 Conditioning factors
Geological hazards are caused by many potential factors (e.g., geology, topography, and
earthquakes tendency) and triggering factors (e.g., earthquakes, rainfall, stream scouring,
and human activities) (Chuang and Shiu 2018; van Westen et al. 2008; Yao et al. 2008).
Considering the causes of geological hazards and the geomorphologic characteristics of
Jiuzhaigou, we selected twelve factors (i.e., altitude, aspect, slope, land use, fault density,
distance to rivers, distance to roads, normalized difference vegetation index (NDVI), mean
annual rainfall, distance to faults, distance to epicenters, and lithology) related to triggering
mechanisms (e.g., rainfall, earthquake, and human activities) and potential variables (e.g.,
topographic structure, vegetation cover, and river systems) as conditioning factors (Fig. 4).
All the data were converted to 90 × 90 m grid data and a unified projection (UTM-Zone
48, WGS84 datum). Pairwise correlation coefficients between the conditioning
factors were calculated to check for collinearity (Fig. S2). Kornejady et al. (2015, 2017)
suggested an absolute correlation coefficient of 0.7 as the
threshold for judging collinearity between two factors, because an absolute value
larger than 0.7 will bias the hazard susceptibility map. Hence, highly correlated
factors should be removed (Kornejady et al. 2015; Kornejady et al. 2017). No significant
correlations were found among the 12 conditioning factors, suggesting that all factors can
reasonably be used to develop the models. In addition, the distributions of all geohazards with respect to the
12 conditioning factors are illustrated in Fig. S1 (see supplementary material). The results
show clearly nonlinear relationships between the conditioning factors and
geohazards. The factors are described in detail as follows:
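The collinearity screen described above can be illustrated with a minimal sketch. It assumes the factor values have already been sampled per pixel into plain lists; the function names, the three example factors, and their values are hypothetical and are not data from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def collinear_pairs(factors, threshold=0.7):
    """Return (factor_a, factor_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(factors.items(), 2):
        r = pearson(a, b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical per-pixel samples (illustration only, not values from the paper).
factors = {
    "altitude": [1500, 2100, 2600, 3200, 3900, 4300],
    "slope": [12, 35, 8, 27, 41, 19],
    "rainfall": [720, 650, 610, 560, 500, 470],
}
print(collinear_pairs(factors))
```

In this toy example only the altitude and rainfall pair exceeds the 0.7 threshold, so one of the two would be a candidate for removal under the rule above.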
2.4.1 Triggering factors
Rainfall (mean annual rainfall), earthquakes (distance to epicenters), and certain engineer-
ing measures related to human activities (distance to roads, land use) are the main causes
of geological hazards in Jiuzhaigou. Rainfall data were derived from China’s meteorologi-
cal data sharing service system (http://data.cma.cn). We calculated the mean annual rain-
fall from 2000 to 2015 and used the inverse distance weighted (IDW) method to interpolate
rainfall per pixel (Fig. 4i). The locations of historical earthquakes (Ms ≥ 3) from 1970 were
obtained from the China Earthquake Networks Center (http://news.ceic.ac.cn). Information
on land use in Jiuzhaigou in 2015 was derived from the Resource and Environment Data
Cloud Platform (http://www.resdc.cn/). The road information was extracted from a topo-
graphic map, obtained from the China Geological Survey (http://gsd.cgs.cn/download.asp).
The Euclidean Distance tool in ArcGIS 10.3 was used to produce the maps of distance to
epicenters (Fig. 4g) and distance to roads (Fig. 4j).
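As an illustration of the inverse distance weighted (IDW) interpolation mentioned above, the following sketch estimates rainfall at one pixel from station values. The power of 2 is a common default and an assumption here; the paper does not state the exponent or the search radius used, and the station coordinates and values are made up.

```python
import math


def idw(stations, x, y, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    stations: iterable of (station_x, station_y, value) in projected coordinates.
    power: distance exponent (assumed; not specified in the paper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # the pixel coincides with a station
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical stations 1 km apart with mean annual rainfall in mm.
stations = [(0.0, 0.0, 600.0), (1000.0, 0.0, 800.0)]
print(idw(stations, 500.0, 0.0))   # midpoint: equal weights
print(idw(stations, 100.0, 0.0))   # closer to the first station
```

Applying the function to the centre of every 90 × 90 m cell would produce a rainfall raster comparable in form to the one described in the text.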
2.4.2 Potential factors
The geomorphology (altitude, slope, and aspect), geology (lithology, distance to faults, and
fault density), vegetation cover (mean annual NDVI), and river systems (distance to riv-
ers) are the basic internal factors of geological hazards. A digital elevation model (DEM)
with 90 × 90 m resolution was derived from the Consultative Group for International
Fig. 4 Conditioning factors: a altitude, b slope, c aspect, d land use, e fault density, f distance to rivers,
g distance to roads, h NDVI, i rainfall, j distance to epicenters, k distance to faults, l lithology. Lithology codes:
Carboniferous (C), Devonian (D), Paleogene (E), Permian (P), Quaternary (Q), Triassic (T), Ediacaran (Z),
Cambrian (∈)
2.5 Methods
The analysis is conducted in five steps: (1) preparing landslide, rock fall, and debris flow
inventories, and conditioning factors; (2) analyzing the correlation and rescaling the con-
ditioning factors, and determining the relationships between the hazards and the factors;
(3) optimizing the crucial parameters and constructing RF, SVM, and XGBoost models,
and producing three geological hazard susceptibility maps; (4) evaluating and comparing
the models using ROC curves and ACC, and sorting the factors by their importance; (5)
finally, predicting the evolution tendencies of unstable slopes in Jiuzhaigou (Fig. 5).
An equal number of non-hazard sites were randomly selected as hazard-free samples. All
hazard and non-hazard samples were then randomly partitioned into
training data (70%) and validation data (30%) (Cao et al. 2019; Trigila et al. 2015). To
reduce variability, tenfold cross-validation and the GridSearchCV function were applied to
optimize the hyper-parameters of each method from empirical candidate values, using only the
training data (Chen et al. 2017).
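The sampling and partitioning step can be sketched as follows. This is a minimal illustration using only the standard library; the function name, the pixel representation as (row, column) tuples, and the fixed random seed are assumptions made for the example, not details given in the paper.

```python
import random


def build_samples(hazard_pixels, candidate_pixels, train_fraction=0.7, seed=42):
    """Pair hazards with an equal number of random non-hazards, then split 70/30.

    Returns (train, valid), each a list of (pixel, label) with label 1 for
    hazard and 0 for non-hazard.
    """
    rng = random.Random(seed)
    hazard_set = set(hazard_pixels)
    # Non-hazard candidates are all pixels that are not in the inventory.
    free = [p for p in candidate_pixels if p not in hazard_set]
    non_hazard = rng.sample(free, len(hazard_pixels))
    labelled = [(p, 1) for p in hazard_pixels] + [(p, 0) for p in non_hazard]
    rng.shuffle(labelled)
    cut = round(train_fraction * len(labelled))
    return labelled[:cut], labelled[cut:]


# Hypothetical 100 x 100 pixel grid with 50 hazard pixels on the diagonal.
hazards = [(i, i) for i in range(50)]
grid = [(r, c) for r in range(100) for c in range(100)]
train, valid = build_samples(hazards, grid)
print(len(train), len(valid))
```

Hyper-parameter tuning would then be run on the training part only, so that the validation part stays unseen until the final assessment.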
2.5.1 SVM
Support vector machine (SVM) is a supervised learning model based on the principle of
structural risk minimization (SRM) (Pourghasemi et al. 2012; Vapnik 2000). A kernel
function is efficiently applied for linear classification and nonlinear classification (Boser
2008; Cortes and Vapnik 1995) because it can transform the training data into high-
dimensional feature spaces (Huang and Zhao 2018). Selection of the kernel function, such
as sigmoid, polynomial, linear, or radial basis function (RBF), will affect the prediction
accuracy. In this paper, we used the two-class (0 or 1) SVM with RBF, the most popu-
lar function commonly used for landslide susceptibility mapping, to build the SVM model
(Marjanović et al. 2011; Pourghasemi and Rahmati 2018). The algorithm generates a sepa-
rating hyper-plane between the points of two distinct classes.
Because the classification of hazards from conditioning factors is a nonlinear problem, the
SVM approach transforms the nonlinear case into a linear one by using the kernel
function. Finally, we followed previous studies (Pourghasemi and Rahmati 2018; Pradhan
2013) to determine three key parameters of the SVM: the penalty factor (C), the regularization
parameter (ϑ) of the kernel function, and the kernel width (γ). The grid search cross-validation
(GridSearchCV) function was used to optimize crucial parameters (Vapnik 1999) and
“SVM” (Pedregosa et al. 2011) in scikit-learn of the Python 3.5 software to assess multi-
hazard susceptibility.
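A minimal sketch of an RBF-kernel SVM tuned with GridSearchCV and tenfold cross-validation in scikit-learn is given below. The synthetic data, the candidate values for C and gamma, the use of StandardScaler for rescaling, and the ROC AUC scoring choice are all assumptions for illustration; the paper does not list its candidate grid.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Synthetic stand-in for 200 samples x 12 conditioning factors (not study data).
X = rng.normal(size=(200, 12))
# Labels follow a nonlinear (circular) boundary in the first two factors.
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1.5).astype(int)

# 70% training, 30% validation, keeping the class ratio in both parts.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Rescale the factors, then fit a two-class SVM with an RBF kernel.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
param_grid = {
    "svc__C": [0.1, 1, 10, 100],          # penalty factor candidates (assumed)
    "svc__gamma": [0.001, 0.01, 0.1, 1],  # kernel width candidates (assumed)
}
search = GridSearchCV(pipeline, param_grid, cv=10, scoring="roc_auc")
search.fit(X_train, y_train)  # tuning uses the training data only

# Probability of the hazard class, used here as the susceptibility value.
susceptibility = search.predict_proba(X_valid)[:, 1]
print(search.best_params_)
```

Setting `probability=True` lets the fitted model return class probabilities, which is one way to obtain a continuous susceptibility value per pixel rather than a hard 0/1 label.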
2.5.2 RF
2.5.3 XGBoost
2.6 Model assessment
ROC curves and the area under the curve (AUC) have been widely used to characterize the performance
of multi-hazard susceptibility models and are the most common evaluation
indices in this field (Chen et al. 2017; Hong et al. 2016; Kim et al. 2018; Mokhtari and
Abedian 2019; Youssef et al. 2015). The AUC value of an ROC curve indicates the goodness
of the model predictions. For a useful model, the AUC lies between 0.5 (no better than
random) and 1, and a higher AUC value indicates better predictive ability. AUC values less
than 0.7 indicate poor predictive capability, with higher values indicating moderate (0.7–0.8), good (0.8–0.9), and
excellent (0.9–1) predictive capability (Swets 1988). Furthermore, the ACC is also a useful
tool to assess the predictive capability of these models. ACC is the proportion of hazard
and non-hazard pixels that models correctly classified. In this study, ACC together with
AUC was used to evaluate performances of the three hazard models.
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \tag{1}$$
where TP (true positive) and TN (true negative) are the number of pixels that are correctly
classified and FP (false positive) and FN (false negative) are the numbers of pixels incor-
rectly classified (Hong et al. 2016; Wang et al. 2015).
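Both measures can be computed directly from labels and predicted susceptibility values, as in the sketch below. The AUC is computed here through its rank interpretation (the probability that a randomly chosen hazard pixel scores higher than a randomly chosen non-hazard pixel, with ties counted as one half). The 0.5 cut-off used to turn scores into classes and the six example pixels are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Accuracy = (TP + TN) / (TP + FP + TN + FN), as in Eq. (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return (tp + tn) / (tp + fp + tn + fn)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical validation pixels: 1 = hazard, 0 = non-hazard.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

print(round(accuracy(y_true, y_pred), 3))  # 4 of 6 pixels correct
print(round(auc(y_true, scores), 3))       # 8 of 9 pairs ranked correctly
```

The pairwise form is quadratic in the number of pixels, so for full rasters a library routine such as scikit-learn's `roc_auc_score` would be the more practical choice.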
Triggered by earthquakes, rainstorms, human activities, etc., an unstable slope
could evolve into a landslide, debris flow, rock fall, or other geohazard. To predict
the evolution tendencies of the 35 unstable slopes recorded in Jiuzhaigou, we established
an evolution index (EI) by determining a joint probability of the three machine learning
methods discussed above for landslide, rock fall, and debris flow, respectively. First, we
extracted the susceptibility values (calculated via the three machine learning methods for landslide,
rock fall, and debris flow) of the 35 unstable slopes using ArcGIS 10.3. Second, we
calculated the average susceptibility value of the 35 unstable slopes for landslide, rock fall,
and debris flow (produced by the three techniques) as the respective EI. Finally, we defined an
unstable slope as likely to evolve into a given geohazard if the susceptibility values calculated by all three
machine learning methods are larger than the EI of the corresponding hazard.
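The rule above can be sketched for a single hazard type as follows. The sketch assumes that the EI is the mean of the susceptibility values over all slopes and all three models, which is one reading of the description; the slope identifiers and values are hypothetical.

```python
def evolution_index(scores):
    """Mean susceptibility over all slopes and models for one hazard type.

    scores: {slope_id: {model_name: susceptibility}}
    Assumption: the average is taken over slopes and models together.
    """
    values = [v for per_model in scores.values() for v in per_model.values()]
    return sum(values) / len(values)


def slopes_likely_to_evolve(scores):
    """Slopes whose susceptibility exceeds the EI under every model."""
    ei = evolution_index(scores)
    return [
        slope
        for slope, per_model in scores.items()
        if all(v > ei for v in per_model.values())
    ]


# Hypothetical debris flow susceptibility values for three unstable slopes.
debris_flow = {
    "S1": {"RF": 0.90, "SVM": 0.80, "XGBoost": 0.85},
    "S2": {"RF": 0.70, "SVM": 0.40, "XGBoost": 0.75},
    "S3": {"RF": 0.20, "SVM": 0.30, "XGBoost": 0.10},
}
print(round(evolution_index(debris_flow), 3))
print(slopes_likely_to_evolve(debris_flow))
```

Requiring all three models to exceed the EI is a conservative rule: slope S2 is excluded in the example because a single model falls below the threshold. Running the same functions once per hazard type allows a slope to be flagged for more than one evolution tendency.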
3 Results
Three machine learning techniques (RF, SVM, and XGBoost) were employed to
produce susceptibility maps for three geohazards (landslide, rock fall, and debris flow), and
their accuracy was then compared and evaluated for each type of hazard. Finally, the importance of
the conditioning factors was analyzed, and the evolution tendency of unstable slopes
in Jiuzhaigou was predicted.
RF, SVM, and XGBoost techniques were used to calculate the susceptibility index val-
ues throughout the study area. After that, the susceptibility map was reclassified into five
classes using natural break point method (Cao et al. 2019; Chen et al. 2017; Dragicevic
et al. 2015; Kornejady et al. 2017). The three hazard susceptibility maps are illustrated
in Fig. 6, and the area percentages of each class are shown in Tables 1, 2 and 3. The maps show inverted
“Y” shapes: the highly susceptible areas for landslides
are mainly concentrated in the southeast (Fig. 6a–c), and those for rock falls and debris flows are
mainly concentrated in the southeast and middle of the study area (Fig. 6d–i). Moreover,
the highly susceptible areas (including very high and high areas) for rock falls and debris
flows are relatively clustered, and the low susceptibility areas are relatively small. Specifically,
the highly susceptible area (very high and high classes combined)
for landslides accounted for only approximately 14% of the study area, but almost 91% of
Fig. 6 Susceptibility maps of landslides (a–c), debris flows (d–f), and rock falls (g–i)
historical landslides occurred in those areas; similar results were found for rock
falls (~12% vs. ~92%) and debris flows (~11% vs. ~91%) (Tables 1, 2, 3).
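The area and hazard percentages per susceptibility class can be tabulated as in the sketch below. The four break values are fixed by hand here purely for illustration; in the study they come from the natural break method, which is not reimplemented in this example, and the pixel values are made up.

```python
from collections import Counter

LEVELS = ["Very low", "Low", "Moderate", "High", "Very high"]


def classify(value, breaks):
    """Map a susceptibility value to one of five levels.

    breaks: four ascending upper bounds for the first four levels.
    """
    for level, upper in zip(LEVELS, breaks):
        if value <= upper:
            return level
    return LEVELS[-1]


def class_percentages(values, breaks):
    """Percentage of values falling in each susceptibility level."""
    counts = Counter(classify(v, breaks) for v in values)
    total = len(values)
    return {level: 100.0 * counts[level] / total for level in LEVELS}


# Hypothetical break values and susceptibility samples (not from the paper).
breaks = [0.2, 0.4, 0.6, 0.8]
all_pixels = [0.10, 0.10, 0.30, 0.50, 0.70, 0.90, 0.95, 0.10, 0.15, 0.05]
hazard_pixels = [0.90, 0.95, 0.70, 0.85]

print(class_percentages(all_pixels, breaks))     # area (%) per class
print(class_percentages(hazard_pixels, breaks))  # hazard (%) per class
```

Comparing the two outputs shows the pattern reported in the text: a small share of the area in the high and very high classes can contain most of the recorded hazards.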
In this study, the ACC, ROC curves, and AUC values of these three techniques using
training data are shown in Table 4 and Fig. 7 (landslide, rock fall, and debris flow). The
XGBoost model has the highest performance in terms of ACC and AUC, with mean values
of 0.92 and 0.95, respectively (Table 4 and Fig. 7). The RF and SVM techniques exhib-
ited slightly lower ACC and AUC values than XGBoost techniques, with mean ACC/AUC
values of 0.90/0.94 and 0.89/0.94 for the RF and SVM models, respectively (Table 4 and
Fig. 7). All these results indicated reasonable goodness-of-fit performance with the train-
ing dataset, and the XGBoost performed slightly better than the other two techniques.
Table 1 Landslide susceptibility areas and landslide percent for three techniques

Model     Susceptibility level   Area (%)   Landslide   Landslide (%)
SVM       Very low               45.83      2           2.99
          Low                    22.84      0           0.00
          Moderate               14.95      4           5.97
          High                   8.46       14          20.90
          Very high              7.92       47          70.15
RF        Very low               55.03      0           0
          Low                    19.01      1           1.49
          Moderate               11.74      3           4.48
          High                   7.18       8           11.94
          Very high              7.05       55          82.09
XGBoost   Very low               68.99      5           7.46
          Low                    11.09      0           0.00
          Moderate               6.66       3           4.48
          High                   5.11       4           5.97
          Very high              8.16       55          82.09
Table 2 Rock fall susceptibility areas and rock fall percent for three techniques

Model     Susceptibility level   Area (%)   Rock fall   Rock fall (%)
SVM       Very low               61.86      2           0
          Low                    15.55      2           0.92
          Moderate               8.84       4           1.83
          High                   7.01       17          9.17
          Very high              6.74       84          88.07
RF        Very low               69.89      1           0.92
          Low                    6.63       2           1.83
          Moderate               9.74       6           5.5
          High                   8.52       29          26.61
          Very high              5.23       71          65.14
XGBoost   Very low               81.2       0           7.46
          Low                    5.53       1           0.00
          Moderate               3.88       2           4.48
          High                   3.33       10          5.97
          Very high              6.06       96          82.09
The prediction capabilities of the three constructed models for three hazards were
evaluated using validation data; Fig. 8 and Table 4 show the ROC curves and ACC
for the three techniques of three geological hazards (landslides, rock falls, and debris
flows). All three models exhibited good prediction performance for rock falls and debris
flows (AUC ≥ 0.9; ACC ≥ 0.85). In the case of landslides, the AUCs/ACCs of RF, SVM,
and XGBoost corresponded to 0.84/0.81, 0.88/0.86, and 0.86/0.84, respectively. Cor-
responding values for rock falls (debris flows) were 0.90/0.91 (0.93/0.91), 0.95/0.85
(0.96/0.86), and 0.97/0.92 (0.97/0.92), respectively. We concluded that all three tech-
niques exhibited reasonably good prediction capabilities in the study area. In addition,
Table 3 Debris flow susceptibility areas and debris flow percent for three techniques
Model Susceptibility level Area (%) Debris flow Debris flow (%)
Table 4 ACC for the three techniques on training and validation data

ACC               Landslide   Rock fall   Debris flow   Mean
Training data
  SVM             0.90        0.86        0.90          0.89
  RF              0.82        0.92        0.95          0.90
  XGBoost         0.86        0.93        0.96          0.92
  Mean            0.86        0.90        0.94          –
Validation data
  SVM             0.86        0.85        0.86          0.86
  RF              0.81        0.91        0.91          0.88
  XGBoost         0.84        0.92        0.92          0.89
  Mean            0.84        0.89        0.90          –
the best model for predicting landslides was SVM, whereas the best model for rock falls and debris flows was
XGBoost, with AUCs/ACCs of 0.88/0.86, 0.97/0.92, and 0.97/0.92, respectively,
although the three landslide models performed more poorly than the models for rock
falls and debris flows.
XGBoost, the best-performing method, was applied to assess the importance of all conditioning factors
(Fig. 9). The results showed that altitude (89.7%, 98.3%, and 96.1% for landslides,
rock falls, and debris flows, respectively) was the most important factor, which is consistent
with previous research (Chen et al. 2017; Tien Bui et al. 2015). Our results
further indicated that geomorphic conditions are closely related to the occurrence of
geohazards. Moreover, other factors, such as distance to roads (85.8%, 91.8%, 91.6%),
rainfall (65.5%, 61.2%, 70.2%), and distance to epicenters (68.9%, 65.5%, 52.4%), also
contribute greatly to the susceptibility models, which indicates that rainfall, earthquakes,
and engineering measures related to human activities significantly affect geohazard
occurrence.
Table 5 shows the statistics of the susceptibility values of these unstable slopes; the EIs (the bold
blue numbers in Table 5) were 0.76 (debris flow), 0.69 (rock fall), and 0.63 (landslide),
respectively. In this research, we defined an unstable slope as likely to become the corresponding
hazard (marked in red in Table S1) if the joint probability of the three models is larger
than the EI of the corresponding hazard. A detailed summary of the susceptibility values
and their distributions is shown in Table S1 and Fig. 10. The unstable slopes could
potentially develop as follows: 16 unstable slopes (45.7%) could become debris flows, 15
(42.9%) landslides, and 10 (28.6%) rock falls. According to the results, some unstable
slopes have more than one evolution tendency. In terms of their spatial patterns, unstable
slopes prone to landslides are mainly located in southern towns (Fig. 10a), especially in Nanping
Town (five red triangles) and Zhangzha Town (three red triangles). Unstable slopes
prone to rock falls are sparsely scattered in eastern towns, especially in Nanping Town
(four black circles) (Fig. 10b). However, the unstable slopes prone to debris flows clearly form a belt,
with many green points in a mosaic pattern near the southeast towns
(Fig. 10c), especially in Nanping Town (six green circles).
4 Discussion
Each machine learning technique has its advantages and shortcomings, and performance
may vary in different areas (Huang and Zhao 2018; Pourghasemi and Rahmati 2018;
Youssef et al. 2015). The results showed all models (XGBoost, RF, and SVM) performed
well for the three geological hazards in Jiuzhaigou county. We attributed their good
Compared with the RF technique, XGBoost uses a gradient boosting method to improve
model accuracy, whereas RF uses an ensemble of trees that vote independently. In addition,
XGBoost reduces error mainly by reducing bias, whereas RF does so mainly by reducing variance.
Although prediction accuracy may differ for different parameters and different
datasets (Trigila et al. 2015), we found that XGBoost generally performed better than
SVM and RF.
4.2 Suggestions and implications
Based on the factor importance analysis, distance to roads ranked second after altitude,
which demonstrates the close relationship between human activities and hazard susceptibility.
Moreover, the spatial susceptibility patterns of the three geological hazards (Fig. 6) indicated
that highly susceptible areas are located along roads, rivers, settlements, and valleys, because
intensive human activities (built-up areas, road construction, and the related soil erosion) can
reduce the natural stability of the original slopes and change their topographic and geological
conditions (Cao et al. 2019; Chuang and Shiu 2018; Gorsevski et al. 2006). According to statistical
records, tourism is the largest industry in Jiuzhaigou, and tourist numbers and income
have risen sharply in recent years (Fig. S3). Some previous studies have shown that Jiuzhaigou
needs to build a great deal of housing and many roads to accommodate the dramatically increasing tourist arrivals,
especially in Zhangzha Town, where 138 geological hazards, accounting for 35.03% of
the total and including 85 hazards in scenic areas, have occurred (Cao et al. 2019). Therefore,
how the government can balance and control eco-sustainable tourism development in Jiuzhaigou
should be investigated in the future.
4.3 Some limitations
There were also some limitations in this study. First, the predicted evolution tendency
of unstable slopes is based on the assumption that these unstable slopes are more likely to
become landslides, rock falls, and debris flows against the historical background of a mountainous
climate, frequent earthquakes, and intensive human activities. However, land morphology
changes continually; multi-hazard susceptibility maps and the evolution tendency of unstable
slopes will consequently change as the environment changes. Therefore, hazard inventory
maps and conditioning factors should be updated regularly. Second, RF, SVM, and
XGBoost are all statistical models that do not represent hazard mechanisms. Therefore, future
studies should focus on the mechanisms of geological hazards. Finally,
machine learning considers only the attribute information of spatial objects and ignores their
spatial structural information, which leads to suboptimal geohazard susceptibility mapping.
In addition, the selection of conditioning factors was not fully objective, which
may reduce the reliability of susceptibility mapping. To address these problems, combining
GeoDetector (Geographical Detectors), machine learning models, and the spatial auto-regression
(SAR) model for geohazard susceptibility mapping is proposed for further research (Yang
et al. 2019), which can make full use of both the spatial structure and attribute information of
spatial objects.
5 Conclusion
References
Boser BE (2008) A training algorithm for optimal margin classifiers. Proc Annu ACM Workshop Comput Learn Theory 5:144–152. https://doi.org/10.1145/130385.130401
Breiman L (2001) Random forests. Mach Learn 45:5–32
Cao J, Zhang Z, Wang C, Liu J, Zhang L (2019) Susceptibility assessment of landslides triggered by earthquakes in the Western Sichuan Plateau. CATENA 175:63–76. https://doi.org/10.1016/j.catena.2018.12.013
Chen T, Guestrin C (2016) Xgboost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pp 785–794. ACM
Chen W, Xie X, Wang J et al (2017) A comparative study of logistic model tree, random forest, and classification and regression tree models for spatial prediction of landslide susceptibility. CATENA 151:147–160. https://doi.org/10.1016/j.catena.2016.11.032
Cheng S, Zhang S, Li L, Zhang D (2018) Water quality monitoring method based on TLD 3D fish tracking and XGBoost. Math Probl Eng 7:1–12. https://doi.org/10.1155/2018/5604740
Chuang YC, Shiu YS (2018) Relationship between landslides and mountain development-integrating geospatial statistics and a new long-term database. Sci Total Environ 622–623:1265–1276. https://doi.org/10.1016/j.scitotenv.2017.12.039
Corominas J et al (2013) Recommendations for the quantitative analysis of landslide risk. Bull Eng Geol Environ. https://doi.org/10.1007/s10064-013-0538-8
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
Dickson ME, Perry GLW (2016) Identifying the controls on coastal cliff landslides using machine-learning approaches. Environ Modell Softw 76:117–127. https://doi.org/10.1016/j.envsoft.2015.10.029
Dragićević S, Lai T, Balram S (2015) GIS-based multicriteria evaluation with multiscale analysis to characterize urban landslide susceptibility in data-scarce environments. Habitat Int 45:114–125. https://doi.org/10.1016/j.habitatint.2014.06.031
Fabbri AG, Chung CJF, Cendrero A, Remondo J (2003) Is prediction of future landslides possible with a GIS? Nat Hazards 30:487–503. https://doi.org/10.1016/j.habitatint.-2014.06.031
Ge Y, Dou W, Gu Z et al (2013) Assessment of social vulnerability to natural hazards in the Yangtze River Delta, China. Stoch Environ Res Risk Assess 27:1899–1908. https://doi.org/10.1007/s00477-013-0725-y
Goetz JN, Brenning A, Petschko H, Leopold P (2015) Evaluating machine learning and statistical pre-
diction techniques for landslide susceptibility modeling. Comput Geosci 81:1–11. https://doi.
org/10.1016/j.cageo.2015.04.007
Gorsevski PV, Gessler PE, Boll J, Elliot WJ, Foltz RB (2006) Spatially and temporally distributed mod-
eling of landslide susceptibility. Geomorphology 80:178–198. https://doi.org/10.1016/j.geomo
rph.2006.02.011
Guha-Sapir D, Vos F, Below R, Ponserre S (2012) Annual disaster statistical review 2011: the numbers
and trends. Centre for Research on the Epidemiology of Disasters (CRED).
https://doi.org/10.13140/RG.2.2.10378.88001
Gutiérrez F, Parise M, De Waele J, Jourde H (2014) A review on natural and human-induced geohazards
and impacts in karst. Earth Sci Rev 138:61–88. https://doi.org/10.1016/j.earscirev.2014.08.002
Ho TK (1998) The random subspace method for constructing decision forests. IEEE Trans Pattern Anal
20:832–844. https://doi.org/10.1109/34.709601
Hong H, Pourghasemi HR, Pourtaghi ZS (2016) Landslide susceptibility assessment in Lianhua
County (China): a comparison between a random forest data mining technique and bivariate and
multivariate statistical models. Geomorphology 259:105–118.
https://doi.org/10.1016/j.geomorph.2016.02.012
Huang Y, Zhao L (2018) Review on landslide susceptibility mapping using support vector machines.
CATENA 165:520–529. https://doi.org/10.1016/j.catena.2018.03.003
Kamp U, Growley BJ, Khattak GA, Owen LA (2008) GIS-based landslide susceptibility mapping for the
2005 Kashmir earthquake region. Geomorphology 101:631–642.
https://doi.org/10.1016/j.geomorph.2008.03.003
Kim HG, Lee DK, Park C, Ahn Y, Kil S-H, Sung S, Biging GS (2018) Estimating landslide
susceptibility areas considering the uncertainty inherent in modeling methods. Stoch Environ Res
Risk Assess 32:2987–3019. https://doi.org/10.1007/s00477-018-1609-y
Kornejady A, Heidari K, Nakhavali M (2015) Assessment of landslide susceptibility, semi-quantitative
risk and management in the Ilam dam basin, Ilam, Iran. Environ Resour Res 3:85–109.
https://doi.org/10.22069/ijerr.2015.2563
Kornejady A, Ownegh M, Bahremand A (2017) Landslide susceptibility assessment using maximum
entropy model with two different data sampling methods. CATENA 152:144–162.
https://doi.org/10.1016/j.catena.2017.01.010
Li W, Zhang Q, Liu C, Xue Q (2006) Tourism’s impacts on natural resources: a positive case from
China. Environ Manag 38:572–579. https://doi.org/10.1007/s00267-004-0299-z
Marjanović M, Kovačević M, Bajat B, Voženílek V (2011) Landslide susceptibility assessment using SVM
machine learning algorithm. Eng Geol 123:225–234. https://doi.org/10.1016/j.enggeo.2011.09.006
Mokhtari M, Abedian S (2019) Spatial prediction of landslide susceptibility in Taleghan basin, Iran.
Stoch Environ Res Risk Assess 33:1297–1325. https://doi.org/10.1007/s00477-019-01696-w
Pedregosa F, Varoquaux G, Gramfort A et al (2011) Scikit-learn: machine learning in Python. J Mach
Learn Res 12:2825–2830
Pourghasemi HR, Rahmati O (2018) Prediction of the landslide susceptibility: which algorithm, which
precision? CATENA 162:177–192. https://doi.org/10.1016/j.catena.2017.11.022
Pourghasemi HR, Pradhan B, Gokceoglu C (2012) Application of fuzzy logic and analytical hierarchy
process (AHP) to landslide susceptibility mapping at Haraz watershed, Iran. Nat Hazards
63:965–996. https://doi.org/10.1007/s11069-012-0217-2
Pradhan B (2013) A comparative study on the predictive ability of the decision tree, support vector
machine and neuro-fuzzy models in landslide susceptibility mapping using GIS. Comput Geosci
51:350–365. https://doi.org/10.1016/j.cageo.2012.08.023
Qiao X, Du J, Lugli S, Ren J, Xiao W, Chen P, Tang Y (2016) Are climate warming and enhanced
atmospheric deposition of sulfur and nitrogen threatening tufa landscapes in Jiuzhaigou National
Nature Reserve, Sichuan, China? Sci Total Environ 562:724–731.
https://doi.org/10.1016/j.scitotenv.2016.04.073
Swets JA (1988) Measuring the accuracy of diagnostic systems. Science 240(4857):1285–1293.
https://doi.org/10.1126/science.3287615
Tehrany MS, Pradhan B, Jebur MN (2015) Flood susceptibility analysis and its verification using a novel
ensemble support vector machine and frequency ratio method. Stoch Environ Res Risk Assess
29:1149–1165. https://doi.org/10.1007/s00477-015-1021-9
Tien Bui D, Tuan TA, Klempe H, Pradhan B, Revhaug I (2015) Spatial prediction models for shallow
landslide hazards: a comparative assessment of the efficacy of support vector machines, artificial
neural networks, kernel logistic regression, and logistic model tree. Landslides 13:361–378.
https://doi.org/10.1007/s10346-015-0557-6
Publisher’s Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
Affiliations