Intelligent Approach Based On Random Forest For Safety Risk Prediction of Deep Foundation Pit in Subway Stations
Abstract: The number of safety accidents caused by excavation of deep foundation pits in subway stations has been increasing rapidly in recent years. Thus, precise prediction of the safety risks of subway deep foundation pits is important. Existing methods, such as machine learning models, have been established for predicting such risks. However, these methods are unable to provide accurate results for deep foundation pits in subway stations due to small and unbalanced data samples. In this research, an intelligent model based on random forest (RF) was established for risk prediction of deep foundation pits in subway stations. To this end, different types of monitoring data and the corresponding risk levels were supplied to the RF to train the model and to estimate the unknown relationships between monitoring values and safety risks of pits. An actual deep foundation pit in a subway station of the Wuhan Metro was used to demonstrate the applicability of the developed RF risk prediction model. The results demonstrated the superiority of the proposed RF risk prediction model, which can be used as a basis for a decision-making tool for predicting safety risks of subway foundation pits. The importance evaluation function of the model helps onsite engineers determine the causes of safety risks, thus facilitating the implementation of emergency measures in advance. DOI: 10.1061/(ASCE)CP.1943-5487.0000796. © 2018 American Society of Civil Engineers.
Author keywords: Subway station; Deep foundation pit; Safety risk prediction; Random forest.
1 Professor, School of Civil Engineering and Mechanics, Huazhong Univ. of Science and Technology, Wuhan, Hubei 430074, China. Email: ying_zhou@hust.edu.cn
2 Master Student, School of Civil Engineering and Mechanics, Huazhong Univ. of Science and Technology, Wuhan, Hubei 430074, China. Email: 13237166717@163.com
3 Associate Professor, School of Civil Engineering and Mechanics, Huazhong Univ. of Science and Technology, Wuhan, Hubei 430074, China (corresponding author). Email: chengzhou@hust.edu.cn
4 Professor, School of Civil Engineering and Mechanics, Huazhong Univ. of Science and Technology, Wuhan, Hubei 430074, China. Email: luohbcem@hust.edu.cn
Note. This manuscript was submitted on January 23, 2018; approved on May 25, 2018; published online on September 26, 2018. Discussion period open until February 26, 2019; separate discussions must be submitted for individual papers. This paper is part of the Journal of Computing in Civil Engineering, © ASCE, ISSN 0887-3801.

Introduction

Deep foundation pits in subway stations are typically characterized by long duration of construction, substantial uncertainties, and serious effects on the surrounding environment, which lead to frequent accidents (Zhou et al. 2017a; Ding et al. 2017). The Singapore Nicoll Highway collapse in 2004, Guangzhou Metro Haizhu Square accident in 2005, and Hangzhou Metro Line 1 station accident in 2008 were all caused by foundation pit collapse and resulted in a large number of casualties and economic losses. To reduce the damage caused by accidents in foundation pits and to minimize their detrimental environmental effects, the safety risks of deep foundation pits in subway stations must be effectively predicted (Ding and Jie 2017).

In the last few decades, numerous scholars have tried to predict the risks of deep foundation pits via various methods, including empirical methods (Whittle et al. 1993; Mair and Taylor 1997; Lee and Halpin 2003), numerical simulations (Chen et al. 2003, 2004; Yoo and Lee 2008), and machine learning methods (Sun and Wu 1998; Su et al. 2009; Sun 2010; Zhou and Zhang 2011). Empirical models are often difficult to apply to different kinds of ground conditions and construction techniques (Loganathan and Poulos 1998). On the other hand, when adopting numerical simulation methods, complex engineering conditions and construction parameters are constantly simplified, thus leading to large deviations in results (Li 2000). Machine learning methods, such as artificial neural networks (ANN), Bayesian network (BN), support vector machine (SVM), and random forest (RF), have gradually become the mainstream approach to risk prediction in recent years. Monitoring data are limited, and only small sample data of 100–200 sets can be collected (Hasofer and Qu 2002) because the average construction period of a deep foundation pit project in a subway station is approximately 1 year, and the monitoring frequency of each monitored item is generally once every 1–2 days. Among the current popular machine learning algorithms, ANN and BN are generally used for thousands of sample data sets and are unsuitable for small amounts of sample data (Kecman 2001; Zhou et al. 2017b). In comparison with other algorithms, SVM and RF are more suitable for problems with a small amount of sample data and can achieve higher prediction accuracy. However, SVM is time consuming when solving complex problems and is sensitive to missing and unbalanced data (Martens et al. 2007). In actual deep foundation pit projects, there are consistently fewer high-risk than low-risk data, thereby resulting in imbalanced sample data.

Compared with SVM, RF is easy to understand and robust to unbalanced data (Zhou et al. 2016a). RF also features a function that is superior to other algorithms: importance evaluation of predictive variables, which can determine the most important predictive variable and find the cause of risks (Kuhn and Johnson 2013). Based on the previous reasons, this study primarily aimed to explore the capability of RF for risk prediction of deep foundation pits in subway stations. Thus, a RF risk prediction model for deep foundation pits in subway stations was established. The feasibility of using monitoring data as risk predictive variables of deep foundation pits in subway stations was verified. The proposed approach was validated in a real deep foundation pit project in a subway station of the Wuhan Metro, and the importance analysis
function of RF also effectively helped the exploration of risk sources.

The rest of this paper is divided into four parts. The first section briefly reviews research on predicting foundation pit risks. The second part introduces the RF algorithm and the process of establishing the RF risk prediction model for deep foundation pits in subway stations. The third part applies the developed model to an actual project, a foundation pit project in a subway station of the Wuhan Metro. A SVM risk prediction model is also established for comparison. Finally, the fourth section summarizes the main achievements and conclusions of this research.

Literature Review

In recent years, several machine learning methods have been used to analyze and predict construction risks of deep foundation pits, namely ANN, BN, and SVM. Based on these methods, a number of complex problems can be addressed by learning simple relations without knowing the exact relationship among parameters. The lack of a definite expression between risk results and various influencing factors can be overcome. Table 1 shows several applications of these machine learning methods in foundation pit risk prediction. The table also summarizes the advantages and limitations of these methods. As denoted in the table, these methods are not suited to cases involving complex calculations, small amounts of sample data, or unbalanced and missing sample data. Additionally, using these models for calculation is extremely time consuming. Therefore, these methods are unsuitable for safety risk prediction of deep foundation pits in subway stations because only limited and unbalanced sample data can be retrieved in real projects. Calculations should also be generated in a short time when applying the methods in actual constructions.

Table 1. […]
Reference and application:
  […] in clay
  Sun (2010): Predict deformation of deep foundation pits in soft soil areas
  Li et al. (2016): Predict horizontal displacement of deep foundation pits
  Zhou et al. (2017c): Forecast potential safety hazards in construction of deep foundation pits
Advantages: […] dimensional small data sets; good at handling complex biological nonlinear data
Limitations: […] time and space; sensitive to missing and unbalanced sample data; lack of transparency

The advent of the RF algorithm aids in solving complex engineering problems due to its capability to discover nonlinear complex relationships between independent and dependent variables without statistical assumptions (Rodriguez-Galiano et al. 2014). A comparative experiment of 10 supervised learning methods was conducted to classify a data set of 246 rockburst events. When these methods were compared, the RF method was considered the best (Zhou et al. 2016a). Several researchers have also applied RF to other engineering problems. For example, Hu et al. (2015) applied the RF method to predict the hard rock tunnel boring machine penetration rate. Zhou et al. (2016b, c) investigated the feasibility of using RF to forecast surface movements induced by tunnel construction. Hong et al. (2017) applied logistic regression and RF to analyze the landslide susceptibility of the Wuyuan area in Jiangxi Province, China. The superiority of the RF method has been determined via comparative analyses. These studies showed promising applications of RF in engineering risk prediction problems.

Deep foundation pit projects in subway stations are constantly constructed in city centers, which are densely populated. Risk prediction of foundation pits in subway stations is a complex problem, which presents a challenge when using traditional methods. Unlike steel structures, rock soil acts as a transmission medium. Therefore, in construction, if small deformations occur in foundation pits, then such deformations will expand accordingly when effective treatments are not applied, eventually leading to large-scale collapse of foundation pits and surrounding buildings and serious damage. Therefore, an accurate prediction should be accomplished immediately to promptly adopt the appropriate remedial measure. RF is algorithmically simpler and computationally lighter than other machine learning methods (Rodriguez-Galiano et al. 2014). RF is also robust to missing and unbalanced data; thus, rescaling and modification of data are unnecessary (Hong et al. 2017). In addition, RF can identify the most important predictive variable. In theory, RF is suitable for risk prediction of deep foundation pits in subway stations. However, no research has been conducted in relation to this subject. In this work, a RF risk prediction model was introduced to predict the safety risks of deep foundation pits in subway stations.

Research Methods

This section explains the RF algorithm and establishment of a RF risk prediction model.

RF Method

The RF algorithm is an ensemble learning method proposed by Breiman in 2001. In this model, numerous trees with the same distribution are used to set up a forest to train and predict the sample data (Kuhn and Johnson 2013). The generalization error of the forest approaches a limit as the number of trees increases (Breiman 2001a). RF is extensively used for prediction and feature selection […]
[…] node of a tree obtains two branches. All nonleaf nodes use an attribute selection measure to determine the optimal attribute and are divided according to this attribute. The attribute selection measure is a heuristic method. Ideally, divisions obtained from this measure should be pure, indicating that sample data at the same node belong to the same category. A node's impurity function F can thus be used to determine the performance of divisions. A higher F value indicates a higher impurity level of the node. When the value of F reaches zero, all sample data at that node belong to the same category. CART trees generally use the most popular impurity function F, the Gini index (Rodriguez-Galiano et al. 2014), as the split standard and select attributes with minimized Gini indices as optimal attributes.

CART requires no prior knowledge. Thus, this algorithm is easier to explain than neural networks and other methods. However, the established trees may become extremely complicated during recursion due to their numerous nodes, thus possibly reducing classification accuracy. Pruning can mitigate overfitting, although it complicates the model-building procedure.

Integrated Learning Method

A single classifier often cannot achieve satisfactory classification accuracy and easily encounters overfitting, thereby resulting in weak generalization. The integrated learning method is the combination of several classifiers. Through a combination of the results of each basic classifier, sample data categories can be determined. The integrated learning method can achieve better classification performance than any single classifier and thereby effectively improve the generalization capability of the learning system.

Bagging (Breiman 1996) is an integrated learning method based on the idea of bootstrapping (Davison and Hinkley 1997) from statistics. Bootstrapping is achieved via resampling randomly with replacement. In the bagging algorithm, several training subsets can be drawn through bootstrapping, and corresponding basic classifiers can be obtained from training. Each training subset retains the original size of the sample data, and some data in the same training subset may be repeated (Han et al. 2017). When the size of the original sample data is N, the probability of each datum not being drawn is $[1 - (1/N)]^N$. If N is large enough, then this probability approaches $1/e \approx 0.368$, indicating that approximately 37% of the original sample data are not drawn each time. This part of the sample is called out-of-bag (OOB).

The bagging algorithm can improve the classification accuracy of data-sensitive classifiers, such as CART. Bagging can also train multiple basic classifiers in parallel, thereby saving considerable time.

RF Algorithm

RF, which is an integrated algorithm formed through the combination of CART trees and bagging, encounters no overfitting and improves the deficiencies of single decision trees (Breiman 2001b). In this algorithm, multiple training subsets are obtained by […] implemented as follows:
1. The bootstrapping method was used to obtain ntree training subsets from the original sample data. Each training subset featured the same size as the original sample data, indicating that some data may be repeated or left out.
2. Each training subset was used to generate a CART tree. At each tree node, a subset of attributes of predictive variables was randomly selected. The number of attributes is mtry, which is not higher than the number of predictive variables. The best split variable from the subset was used to divide the node.
3. Each CART tree contributed a single voting category to the forest, and the classification result was obtained by taking the majority of the voting categories.

Establishment of RF Risk Prediction Model

The RF risk prediction model was implemented as follows: (1) the sample data were collected and processed; (2) the monitoring data were used as input and the safety risk levels of the foundation pit in a subway station as output to establish the model and analyze the correlation and importance of predictive variables; and (3) classification accuracy of the model was verified. Fig. 1 shows the specific process.

Collecting and Processing of Sample Data

Multiple monitoring items are used during the construction of deep foundation pits in subway stations, and hidden dangers are indicated by abnormal monitoring data. Therefore, safety risk levels of foundation pits in subway stations can be determined by analyzing monitoring data, and preventive measures can be implemented accordingly. Common types of monitoring include settlement, stress, groundwater level, lateral displacement, and excavation depth. Currently, selection of monitoring items for deep foundation pits in China is based on the national standard "Technical Code of Urban Rail Transit (MHURDPRC 2009)," and its monitoring rules are shown in Table 2. The table categorizes foundation pits into three types on the basis of excavation depth. Based on the standard, the monitoring items are divided into two types: compulsory and optional measurements. Selection of optional monitoring items in the actual project should be compatible with the design and construction plan. Geological hydrology, construction parameters, and the surrounding environment should also be fully considered when selecting optional monitoring items.

The traditional method of evaluating safety risks of pits typically involves analyzing one or two abnormal monitoring values. However, assessments based on information from different monitoring types and those based on different monitoring points of the same monitoring types may conflict due to the complexity of subway foundation pit projects and considerable uncertainty in evaluation. Therefore, inferring the safety risk levels based only on the abnormal values of one or two monitoring types is unreliable, and the values of multiple monitoring types should be considered jointly. In this study, a RF risk prediction model
was established via training. The monitoring values of important monitoring types and points were used as predictive variables, and safety risk levels of the foundation pit in a subway station were used as output.

Fig. 1. [Flowchart: preprocessing of M samples; splitting into training data (N samples × variables) and testing data (M − N samples × variables); k-fold CV over sub training data 1 to k; training and testing of CART tree 1 to CART tree k.]

Table 2. Selection of urban rail transit monitoring items
                                                  Pit category
Monitoring project                                Level 1   Level 2   Level 3
Retaining wall (pile) horizontal displacement     C         C         C
Retaining wall (pile) vertical displacement       C         C         C
Deep horizontal displacement                      C         C         O
Pillar vertical displacement                      C         C         O
Internal force of retaining wall                  O         O         O
Internal force of support                         C         C         C
Internal force of column                          O         O         O
Internal force of bolt                            C         C         C
Internal force of soil nail                       O         O         O
Pit bottom uplift (rebound)                       O         O         O
Retaining wall lateral force                      O         O         O
Pore water pressure                               O         O         O
Groundwater level                                 C         C         C
Soil layered settlement                           O         O         O
Surface settlement                                O         O         O
Building settlement                               C         C         C
Building inclinometer                             O         O         O
Building horizontal displacement                  O         O         O
Building/surface crack                            C/O       C/O       C/O
Underground pipeline settlement                   C/O       C/O       C/O
Note: C = compulsory measurement; O = optional measurement; Level 1 = design depth (DD) ≥ 20 m; Level 2 = 20 > DD ≥ 10 m; and Level 3 = DD < 10 m.

On the basis of the principles of representativeness, universality, and compactness, historical monitoring data and records of safety risk levels were selected from the monitoring system. Denoising or standardization of monitoring data is unnecessary because the RF is resistant to outliers in predictive variables and can automatically handle missing values (Catani et al. 2013). To prevent overfitting, at least 80% of the samples were used to train the model, and the remaining samples were used to validate the model. To avoid densely stacking similar sample data and to guarantee effective prediction results, the order of the sample data was randomly shuffled.

Establishment of Risk Prediction Model

Two major parameters of RF must be optimized when establishing a model (Breiman 2001a). These parameters are ntree and mtry, which refer to the number of trees in the forest and the number of prediction variables at each node of the tree, respectively. If the value of mtry is small, then overfitting may occur and the accuracy of the model may decrease. If the value of mtry is large, then calculating speed may decrease. If the value of ntree is small, then training may become inadequate. If the value of ntree is large, then computing complexity of the model may increase.

A k-fold cross-validation method (Kohavi 1995) was used to optimize these two major parameters. In this method, sample data are divided into k subsets, k − 1 subsets are used to train the model, and the remaining subset is used for testing. The subsets alternately act as the independent test set, whereas the others serve as training sets (Arlot and Celisse 2010). Based on the k-fold cross-validation test, the optimal parameters were determined. The parameters and training sets were used as input to generate a final pit risk classifier, and a RF risk […]
[…] the predicted and actual risk levels, the classification accuracy rate was determined. Higher values of the classification accuracy rate indicate better classification performance of the classifier

$$\mathrm{Accuracy} = \frac{1}{n}\sum_{i=1}^{C} x_{ii} \times 100\%$$

In this study, another machine learning algorithm, specifically the SVM, was also used to verify the classification effects of RF. By comparing its classification accuracy rate with that of the RF, the validity of the established RF risk prediction model can be tested.

After the capability of the model has been verified, the model can be applied to an actual deep foundation pit construction in a subway station. Through this model, onsite engineers can be informed in a timely manner once a risk has been identified. Thus, measures can be implemented accordingly to prevent potential safety accidents.

Case Study

A foundation pit project in a subway station of the Wuhan Metro was selected for this case study. This station lies in the center of Wuhan City, which is a densely populated area. An open-cut method was used to construct the pit, as shown in Fig. 2. The pit was completely covered by ground buildings. The pit depth of the station was around 24–25.5 m, its excavation length measured 203.5 m, and its width spanned 22.15 m. Waterproof impermeable diaphragm walls and inner support constituted the strutting structure of the pit. The diaphragm wall was supported by concrete and reinforced by steel support. The wall depth was 60 m, and the wall thickness approximately 1 m.

Fig. 2. Construction site in the subway station.

[…] River, which is the main factor behind the dynamic changes in groundwater. Fluctuation of groundwater critically affects the level of confined water and thus influences the stability of the foundation pits (Ding et al. 2017). This subway station is in a commercial center. A safety accident will lead to critical consequences. Therefore, a risk prediction model must be established to forecast and analyze safety risks. On the basis of forecast results, effective suggestions can be proposed and the corresponding control and emergency measures can be implemented in time. Thus, safety risks can be reduced to an acceptable range. Fig. 3 illustrates the establishment and application of the risk prediction model for foundation pits in this subway station.

Collecting and Processing of Sample Data

The northern section of the foundation pit, which is adjacent to a large number of commercial buildings, was the key monitoring area. Because the excavation depth of the foundation pit is greater than 20 m, the selection of the monitoring items mainly referred to the compulsory measurement items of the foundation pit of Level 1 in the specification. In this study, various types of monitoring points from the northern section were used as research objects.

A web-based early warning system for subway safety risk was recently developed by the Huazhong University of Science and Technology (Ding and Zhou 2013). Considerable monitoring information for construction of different foundation pits was recorded in this system. The safety risk levels of foundation pits were determined via a hybrid data fusion model on the basis of multisource information, in which information sources of monitoring measurements, construction parameters, and field inspections were combined (Zhou et al. 2017c). The assessment process included the following: (1) primary safety risk assessment of the foundation pit was implemented by 130 experts, (2) the basic probability assignment (BPA) of safety risk assessment was calculated using the BPA function, and (3) Dempster–Shafer theory was used to determine safety risks. Safety risk levels of foundation pits were divided into three levels, namely low, medium, and high risk, which were indicated in the platform by green, yellow, and orange boxes, respectively. The monitoring types and daily safety risk levels of the foundation pit in this subway station can be found in the system.

Eight monitoring types that remarkably influence the safety status were selected on the basis of the corresponding technical specifications and experience of experts. These types included surface, building, underground pipeline, and structural settlements; diaphragm wall and structural horizontal displacements; steel support axial force; and concrete support steel stress. The maximum cumulative changing values and rates were selected to compose 16 predictive variables. The monitoring data and daily safety risk levels were collected from the subway safety early warning system. Monitoring was planned to be conducted once a day. When the data
of a monitoring item vary dramatically, monitoring of this item should be conducted several times a day. Fig. 4 displays the monitoring points in the northern section of the pit. Several monitoring types exhibited multiple monitoring points. Only the points with the highest absolute values were selected. Table 3 lists the collected monitoring data and safety risk levels.

Fig. 3. [Flowchart: a set of sample data as input; RF classifier; correlation analysis; importance analysis; optimal RF risk prediction model; output.]

A total of 200 sets of sample data were acquired. These sets consisted of 92 low-risk, 75 medium-risk, and 33 high-risk sets. Each set of sample data included 16 predictive variables and their corresponding safety risk levels. The category label causes no effect on RF classification accuracy. Thus, the three safety risk levels were labeled 1, 2, and 3. The order of the sample data was randomly shuffled. A total of 160 sets of sample data were used for model training, and the remaining 40 were used for model testing. Table 4 presents a part of the training sample data.

Establishment of RF Risk Prediction Model

The collated sample data were used to establish a RF risk prediction model. Inputs for the model included 16 predictive variables, and the outputs were safety risk levels. Two important parameters of the RF algorithm, ntree and mtry, were determined via fivefold cross-validation, and classification accuracy was used as the assessment criterion. Fig. 5 shows the fivefold cross-validation accuracy under different values of ntree and mtry. As displayed in the figure, the XOY plane represents the search range of ntree and mtry. The search range of mtry was set to 1–16, and the step distance was 1. The search range of parameter ntree was set to 0–500, and the step distance was 5. The Z-axis represents the change in cross-validation accuracy. Each group of ntree and mtry corresponds to a cross-validation accuracy; different mtry and ntree values can achieve the same accuracy. However, the lowest values of the parameters were used to reduce computing complexity and time. The results showed that when ntree = 215 and mtry = 5, the accuracy rate of cross validation was the highest at 99.38%. Based on the determined optimal parameters, the training sample data were used to generate the RF risk prediction model.

A correlation matrix was provided to illustrate the correlation among predictive variables. In comparison with the maximum changing rate of the monitoring types, the numerical relationship between maximum cumulative changing values can better reflect the changes in pit safety status and reveal the deformation trend (Lu and Zhang 2013; He et al. 2014). On the other hand, if the maximum changing rate and maximum cumulative changing values of the monitoring items were used simultaneously, then duplication would occur during calculation. Therefore, in this research, the maximum cumulative changing values of the monitoring items were used for correlation analysis. Fig. 6 illustrates the correlation matrix of the maximum cumulative changing values A1–A8.

The diagonal boxes in Fig. 6 show the distribution of each predictive variable, the boxes in the lower-left area show the scatter plots with fitted lines of each pair of predictive variables, and the boxes in the upper-right area show the correlation values and significance levels of each pair of predictive variables. Significance levels in the figure present a one-to-one correspondence with several symbols {correspondence between p values [(0.0, 0.001), (0.001, 0.01), (0.01, 0.05), (0.05, 0.1), (0.1, 1)] and symbols (***, **, *, °, ·)}. The eight predictive variables exhibited good correlation with one another. Among these variables, A1, A2, A5, and A6 manifested the strongest correlation. A1 and A2 showed a highly positive correlation, and both showed a highly negative correlation with A5 and a highly positive correlation with A6. A5 was also highly correlated with A6. In excavation, most monitoring data were linked with one […]
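The correlation values and significance symbols described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, the A1 and A2 values are only the ten training rows printed in Table 4 (so the coefficient will differ from the one computed on the full data set), boundary p values are assigned to the upper interval by assumption, and the p value itself is assumed to come from a statistics package.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def significance_symbol(p):
    """Map a p value to the symbol used in the correlation matrix."""
    for upper, symbol in [(0.001, "***"), (0.01, "**"), (0.05, "*"), (0.1, "°")]:
        if p < upper:
            return symbol
    return "·"

# A1 and A2 from the ten training rows shown in Table 4 (rows 1-5 and 156-160).
a1 = [-31.13, -49.75, -30.10, -29.30, -71.34, -73.10, -73.66, -46.02, -45.25, -42.96]
a2 = [-32.16, -49.38, -32.07, -32.63, -58.88, -57.96, -57.52, -47.30, -46.97, -44.19]

r = pearson_r(a1, a2)
print(f"r(A1, A2) on the ten listed rows: {r:.3f}")
print(significance_symbol(0.0005), significance_symbol(0.03), significance_symbol(0.5))
```

On these ten rows the coefficient is strongly positive, which agrees in sign with the relationship between A1 and A2 reported in the text.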
Number A1 B1 A2 B2 A3 B3 A4 B4 A5 B5 A6 B6 A7 B7 A8 B8 risk level
1 −31.13 0.47 −32.16 0.49 −16.87 0.29 −6.55 1.64 −2.95 −0.21 8.39 −0.96 771.82 −44.06 1,781.34 −664.79 1
2 −49.75 −2.73 −49.38 −2.89 −46.77 −3.95 −15.40 3.23 −4.30 −0.75 15.00 5.50 809.64 −34.24 1,814.50 394.69 3
3 −30.10 0.71 −32.07 −1.06 −32.74 0.76 5.98 −1.60 −3.20 −0.65 9.10 −1.25 828.39 42.50 1,662.07 664.79 1
4 −29.30 0.31 −32.63 0.31 −17.69 0.33 −9.93 −0.90 −2.85 0.03 9.93 −0.40 817.32 −31.56 2,275.36 270.46 1
5 −71.34 −2.35 −58.88 −2.20 −183.63 −3.65 −19.08 −1.28 −5.60 −0.38 16.75 0.89 359.35 −40.62 17.41 −3.53 2
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
156 −73.10 −0.71 −57.96 −0.60 −184.29 −7.10 −20.63 −0.40 −5.10 −0.35 18.00 1.00 414.15 −120.18 15.84 −2.40 2
157 −73.66 0.68 −57.52 0.55 −184.36 0.53 −20.42 −0.38 −5.20 −0.16 18.50 1.10 489.33 −90.30 21.82 −10.84 3
158 −46.02 −0.64 −47.30 −0.64 −46.27 −0.62 −10.73 0.73 −4.10 −0.25 15.38 6.25 802.58 39.79 16,611.02 263.89 2
159 −45.25 −0.91 −46.97 −0.71 −45.76 −0.63 −8.51 −1.25 −3.80 −0.22 14.50 1.25 904.21 96.22 1,874.97 152.56 1
160 −42.96 −0.35 −44.19 −0.8 −43.53 0.35 −5.37 2.90 −3.50 −0.30 15.00 2.00 884.07 128.74 1,297.88 −224.72 1
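The rows above belong to the 160-sample training split. A minimal sketch of the shuffle-and-split step described in the text is given below, assuming only the reported class counts (92 low-risk, 75 medium-risk, and 33 high-risk sets, labeled 1, 2, and 3); the seed and variable names are hypothetical, and the resulting per-class counts will not match the authors' actual split.

```python
import random
from collections import Counter

# Labels reconstructed from the reported counts only; the feature values are omitted.
labels = [1] * 92 + [2] * 75 + [3] * 33

rng = random.Random(42)          # hypothetical seed, for repeatability
order = list(range(len(labels)))
rng.shuffle(order)               # randomize the order of the samples

train_idx, test_idx = order[:160], order[160:]   # 160 for training, 40 for testing
train_counts = Counter(labels[i] for i in train_idx)
test_counts = Counter(labels[i] for i in test_idx)

print("training class counts:", dict(sorted(train_counts.items())))
print("testing class counts:", dict(sorted(test_counts.items())))
```

Printing the per-class counts makes the imbalance visible: high-risk sets form the smallest class in the full data set, so a random 40-sample test set may contain only a handful of them.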
[…] pipelines.
[…] ntree and mtry.
[…] when sudden risk events occur. Thus, the maximum changing rate […]
[…] items cannot be eliminated. Although the maximum changing rate […]
[…] thereby leading to lateral displacement of the retaining structure […]
Fig. 7. Importance analysis of predictive variables: (a) mean decrease accuracy; and (b) mean decrease Gini.
Fig. 10. Monitoring data from April 3 to May 2: (a) surface accumulative settlement; and (b) building accumulative settlement.
Fig. 13. Changing values of monitoring data on May 1: (a) surface cumulative settlement; and (b) building cumulative settlement.
Conclusion

[…] was the value of accumulated surface subsidence. By exploring the cause of outliers, hints of dangerous sources were found promptly, and remedial measures were implemented to prevent major safety incidents.

In future research, real-time automatic data acquisition systems should be considered to achieve automatic collection and extraction of sample data. Therefore, for any type of problem, the superiority of any method cannot be broadly generalized. This study showed that the RF features a good application prospect for deep pit risk prediction.

Acknowledgments

The presented work has been supported by the National Science Foundation of China (NSFC) through Grant No. 71471072. The authors gratefully acknowledge the NSFC's support.

References

Arlot, S., and A. Celisse. 2010. "A survey of cross-validation procedures for model selection." Stat. Surv. 4: 40–79. https://doi.org/10.1214/09-SS054.
Breiman, L. 1996. "Bagging predictors." Mach. Learn. 24 (2): 123–140. https://doi.org/10.1007/BF00058655.
[…]
Hong, H., P. Tsangaratos, I. Ilia, W. Chen, and C. Xu. 2017. "Comparing the performance of a logistic regression and a random forest model in landslide susceptibility assessments, the case of Wuyaun area, China." In Workshop on World Landslide Forum, 1043–1050. Cham, Switzerland: Springer.
Hu, T., J. Wang, and L. Zhang. 2015. "Prediction of hard rock TBM penetration rate using random forests." In Proc., Control and Decision Conf., 3716–3720. New York: IEEE.
Iverson, L. R., A. M. Prasad, S. N. Matthews, and M. Peters. 2008. "Estimating potential habitat for 134 eastern US tree species under six climate scenarios." For. Ecol. Manage. 254 (3): 390–406. https://doi.org/10.1016/j.foreco.2007.07.023.
Jan, J. C., S. L. Hung, S. Y. Chi, and J. C. Chern. 2002. "Neural network forecast model in deep excavation." J. Comput. Civ. Eng. 16 (1): 59–65. https://doi.org/10.1061/(ASCE)0887-3801(2002)16:1(59).
Kecman, V. 2001. Learning and soft computing: Support vector machines, neural networks, and fuzzy logic models. Cambridge, MA: MIT Press.
Kohavi, R. 1995. "A study of cross-validation and bootstrap for accuracy estimation and model selection." In Proc., Int. Joint Conf. on Artificial Intelligence, 1137–1143. Burlington, MA: Morgan Kaufmann.
Kuhn, M., and K. Johnson. 2013. Applied predictive modeling. New York: Springer.
Lee, S., and D. W. Halpin. 2003. "Predictive tool for estimating accident risk." J. Constr. Eng. Manage. 129 (4): 431–436. https://doi.org/10.1061/(ASCE)0733-9364(2003)129:4(431).
Li, H. 2000. "Grey forecast and precaution system for foundation pit deformation." [In Chinese.] Site Invest. Sci. Technol. 6: 40–44.
Breiman, L. 2001a. “Random forests.” Mach. Learn. 45 (1): 5–32. https:// https://doi.org/10.3969/j.issn.1001-3946.2000.06.009.
doi.org/10.1023/A:1010933404324. Li, W. D., M. H. Wu, and N. Lin. 2016. “Horizontal displacement predic-
Breiman, L. 2001b. “Using iterated bagging to debias regressions.” Mach. tion research of deep foundation pit based on the least square support
Learn. 45 (3): 261–277. https://doi.org/10.1023/A:1017934522171. vector machine.” In Proc., 3rd Int. Conf. on Wireless Communication
Breiman, L., J. Friedman, R. Olshen, and C. J. Stone. 1984. Classification and Sensor Networks. Paris: Atlantis Press.
and regression trees. Belmont, CA: Wadsworth International Group. Loganathan, N., and H. G. Poulos. 1998. “Analytical prediction for
Catani, F., D. Lagomarsino, S. Segoni, and V. Tofani. 2013. “Landslide tunneling-induced ground movements in clays.” J. Geotech. Geoen-
susceptibility estimation by random forests technique: Sensitivity and viron. Eng. 124 (9): 846–856. https://doi.org/10.1061/(ASCE)1090
scaling issues.” Nat. Hazards Earth Syst. Sci. 13 (11): 2815–2831. -0241(1998)124:9(846).
https://doi.org/10.5194/nhess-13-2815-2013. Lu, Z. G., and J. D. Zhang. 2013. “Spatial and temporal analysis of pit
Chen, C., S. Pei, and J. Jiao. 2003. “Land subsidence caused by ground- deformation monitoring based on GIS.” Appl. Mech. Mater. 239–240:
water exploitation in Suzhou city, China.” Hydrogeol. J. 11 (2): 536–543. https://doi.org/10.4028/www.scientific.net/AMM.239-240.536.
275–287. https://doi.org/10.1007/s10040-002-0225-5. Ma, F., Y. Zheng, and F. Yang. 2008. “Research on deformation prediction
Chen, C., S. Zhang, and Y. Yu. 2004. “Prediction of retaining structure method of soft soil deep foundation pit.” J. Coal Sci. Eng. 14 (4):
displacement in foundation pit.” [In Chinese.] Chin. J. Rock Mech. 637–639. https://doi.org/10.1007/s12404-008-0430-5.
Eng. 23 (12): 2065–2068. Mair, R. J., and R. N. Taylor. 1997. “Theme lecture: Bored tunneling in the
Davison, A. C., and D. V. Hinkley. 1997. Bootstrap methods and their urban environment.” In Proc., 14th Int. Conf. on Soil Mechanics and
application. Cambridge, UK: Cambridge University Press. Foundation Engineering. Hamburg, Germany.
Ding, L., and X. U. Jie. 2017. “A review of metro construction in China: Martens, D., M. D. Backer, R. Haesen, J. Vanthienen, M. Snoeck, and B.
Organization, market, cost, safety and schedule.” Front. Eng. Manage. Baesens. 2007. “Classification with ant colony optimization.” Trans.
4 (1): 4–19. https://doi.org/10.15302/J-FEM-2017015. Evolut. Comput. 11 (5): 651–665. https://doi.org/10.1109/TEVC.2006
Ding, L., K. Li, Y. Zhou, and P. E. D. Love. 2017. “An IFC-inspection .890229.
process model for infrastructure projects: Enabling real-time quality MHURDPRC (Ministry of Housing and Urban-Rural Development of the
monitoring and control.” Autom. Constr. 84: 96–110. https://doi.org/10 People’s Republic of China). 2009. Technical code of urban rail transit.
.1016/j.autcon.2017.08.029. Beijing: China Planning Press.
Ding, L. Y., and C. Zhou. 2013. “Development of web-based system for Rodriguez-Galiano, V. F., B. Ghimire, J. Rogan, M. Chica-Olmo, and
safety risk early warning in urban metro construction.” Autom. Constr. J. P. Rigol-Sanchez. 2012. “An assessment of the effectiveness of a
34 (2): 45–55. https://doi.org/10.1016/j.autcon.2012.11.001. random forest classifier for land-cover classification.” J. Photogramm.
Grinand, C., D. Arrouays, B. Laroche, and M. P. Martin. 2008. “Extrapo- Remote Sens. 67 (1): 93–104. https://doi.org/10.1016/j.isprsjprs.2011
lating regional soil landscapes from an existing soil map: Sampling .11.002.
Sun, F. X. 2010. “SVM in predicting the deformation of deep foundation pit Zhou, J., X. Z. Shi, K. Du, X. Y. Qiu, X. B. Li, and H. S. Mitri. 2016c.
in soft soil area.” In Proc., 2010 Int. Conf. on Machine Vision and “Development of the ground movements due to shield tunneling
Human-Machine Interface, 761–763. New York: IEEE. prediction model using random forests.” In Proc., 4th Geo-China
Sun, H. T., and X. Wu. 1998. “Study on neural networks method of de-
Int. Conf., 108–115. Reston, VA: ASCE.
formation prediction of foundation pit based on artificial.” [In Chinese.]
Zhou, Y., L. Y. Ding, and L. J. Chen. 2013b. “Application of 4D visuali-
Rock Soil Mech. 4: 11.
zation technology for safety management in metro construction.”
Whittle, A. J., Y. M. A Hashash, and R. V. Whitman. 1993. “Analysis of
Autom. Constr. 34 (13): 25–36. https://doi.org/10.1016/j.autcon.2012
deep excavation in Boston.” J. Geotech. Eng. 119 (1): 69–90. https://doi
.10.011.
.org/10.1061/(ASCE)0733-9410(1993)119:1(69).
Wu, X., H. Liu, L. Zhang, M. J. Skibniewski, Q. Deng, and J. Teng. 2015. Zhou, Y., L. Y. Ding, Y. Rao, H. Luo, B. Medjdoub, and H. Zhong. 2017a.
“A dynamic Bayesian network based approach to safety decision sup- “Formulating project-level building information modeling evaluation
port in tunnel construction.” Reliab. Eng. Syst. Saf. 134: 157–168. framework from the perspectives of organizations: A review.” Autom.
https://doi.org/10.1016/j.ress.2014.10.021. Constr. 81: 44–55. https://doi.org/10.1016/j.autcon.2017.05.004.
Yoo, C., and D. Lee. 2008. “Deep excavation-induced ground surface Zhou, Y., H. Luo, and Y. Yang. 2017b. “Implementation of augmented real-
movement characteristics—A numerical investigation.” Comput. Geotech. ity for segment displacement inspection during tunneling construction.”
35 (2): 231–252. https://doi.org/10.1016/j.compgeo.2007.05.002. Autom. Constr 82: 112–121. https://doi.org/10.1016/j.autcon.2017
Zhang, L., X. Wu, M. J. Skibniewski, J. Zhong, and Y. Lu. 2014. .02.007.
“Bayesian-network-based safety risk analysis in construction projects.” Zhou, Y., and Y. Peng. 2016. “A case history of deep excavation above
Reliab. Eng. Syst. Saf. 131: 29–39. https://doi.org/10.1016/j.ress.2014 an operational metro subway.” [In Chinese.] Soil Eng. Found.
.06.006. 30 (5): 541–543+565.
Zhou, H. B., and H. Zhang. 2011. “Risk assessment methodology for a Zhou, Y., W. Su, L. Ding, H. Luo, and P. E. Love. 2017c. “Predicting safety
deep foundation pit construction project in Shanghai, China.” J. Constr. risks in deep foundation pits in subway infrastructure projects: A sup-
Eng. Manage. 137 (12): 1185–1194. https://doi.org/10.1061/(ASCE) port vector machine approach.” J. Comput. Civ. Eng. 31 (5): 04017052.
CO.1943-7862.0000391. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000700.