
Journal Pre-proof

Machine Learning Applications for Building Structural Design and Performance Assessment: State-of-the-Art Review

Han Sun, Henry V. Burton, Honglan Huang

PII: S2352-7102(20)33449-5
DOI: https://doi.org/10.1016/j.jobe.2020.101816
Reference: JOBE 101816

To appear in: Journal of Building Engineering

Received Date: 21 March 2020


Revised Date: 17 July 2020
Accepted Date: 11 September 2020

Please cite this article as: H. Sun, H.V. Burton, H. Huang, Machine Learning Applications for Building
Structural Design and Performance Assessment: State-of-the-Art Review, Journal of Building
Engineering, https://doi.org/10.1016/j.jobe.2020.101816.

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition
of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of
record. This version will undergo additional copyediting, typesetting and review before it is published
in its final form, but we are providing this version to give early visibility of the article. Please note that,
during the production process, errors may be discovered which could affect the content, and all legal
disclaimers that apply to the journal pertain.

© 2020 Elsevier Ltd. All rights reserved.


Highlights:

- Provides detailed formulation of machine learning (ML) algorithms that are relevant to building structural engineering
- Synthesizes the state of practice and research for ML applications in building structural engineering based on four major categories
- Discusses the challenges and opportunities in bringing ML applications in building structural engineering into practice
Credit Author Statement

Han Sun: writing-original draft preparation and editing, literature review, methodology presentation

Henry Burton: conceptualization, writing-reviewing and editing, supervision

Honglan Huang: writing-reviewing and editing
Machine Learning Applications for Building Structural Design and Performance Assessment: State-of-the-Art Review

Han Sun a,*; Henry V. Burton, M.ASCE b; Honglan Huang c

a Research Engineer, Yahoo Research. Email: hansun2014@ucla.edu
b Associate Professor, Department of Civil and Environmental Engineering, University of California Los Angeles. Email: hvburton@seas.ucla.edu
c PhD Candidate, Department of Civil and Environmental Engineering, University of California Los Angeles. Email: honglanhuang@ucla.edu
* corresponding author

ABSTRACT: Machine learning provides a powerful tool for predicting and assessing structural performance, identifying structural condition and informing preemptive and recovery decisions by extracting patterns from data collected via various sources and media. This paper presents a review of the historical development and recent advances in the application of machine learning to the area of building structural design and performance assessment. To this end, an overview of machine learning theory and the most relevant algorithms is provided with the goal of identifying problems suitable for machine learning and the appropriate models to use. The machine learning applications in building structural design and performance assessment are then reviewed in four main categories: (1) predicting structural response and performance, (2) interpreting experimental data and formulating models to predict component-level structural properties, (3) information retrieval using images and written text and (4) recognizing patterns in structural health monitoring data. The challenges of bringing machine learning into building engineering practice are identified, and future research opportunities are discussed.

KEY WORDS: machine learning; artificial intelligence; building structural design and performance assessment; supervised learning; unsupervised learning

1. Introduction

Machine learning (ML) refers to a set of methodologies that are capable of automatically detecting patterns in data, which can then be used to develop forecasting models and support decision making under uncertain conditions (Murphy, 2012) [1]. There are three main types of learning: supervised, unsupervised and reinforcement. Supervised learning is used to develop predictive models where the goal is to map a set of inputs (also known as features, attributes or covariates) to one or more outputs (also known as the response variable). Supervised learning problems are described as classification or pattern-recognition when the response variables are categorical and regression when the outputs are numerical variables. Unsupervised or descriptive learning is associated with much less well-defined problems, where the goal is to discover underlying relationships in the data. Both supervised and unsupervised learning can be achieved using parametric and non-parametric models. Whereas the former utilizes a fixed number of parameters, the size of the training dataset determines the number of parameters in the latter. Parametric models are often easier to construct and implement but are constrained by the assumptions that they make about the data distribution. Non-parametric models are much more flexible but their complexity increases with the size of the dataset. Reinforcement learning, the least popular of the three categories, is used to acquire knowledge on how to act or behave (i.e. make decisions) under uncertainty (Hastie et al. 2009 [2]; Murphy, 2012 [1]). Note that semi-supervised learning, which, for the purposes of this paper, is not included as a primary category, combines elements of both supervised and unsupervised learning.

ML methods are not foreign to building structural design and performance assessment (SDPA) as applications in this area can be traced back to as early as the late 1980s when Adeli and Yeh (1989) [3] developed and applied an ML-based methodology to a beam design problem. This pioneering work was followed by several studies during the 1990s that applied artificial neural networks (ANNs) (Hopfield, 1982) [4] to building SDPA problems. One of the first in this series of studies was conducted by Vanluchene and Sun (1990) [5], who applied back-propagation neural networks (Rumelhart et al. 1986 [6]) to three distinct building SDPA problems related to locating the load on a beam, designing a reinforced concrete beam and analyzing a simply supported plate. This study was closely followed by several others (Hajela and Berke, 1991; Ghaboussi et al. 1991; Wu et al. 1992; Masri et al. 1993; Kang and Yoo, 1994; Messner et al. 1994; Elkordy et al. 1994) [7–13], most of which utilized the back-propagation network. Recognizing the growing popularity of ANNs in building SDPA, Gunaratnam and Gero (1994) [14] conducted a detailed examination of the factors that influence their performance, some of which were domain-specific, while others were domain-independent. The authors highlighted the importance of reduced dimensionality (i.e. the number of features or predictors) and embedment of domain-specific knowledge in achieving effective learning. To address specific challenges associated with the back-propagation methodology such as the slow rate of learning, Adeli and Park (1995) [15] explored the use of counter-propagation algorithms to address building SDPA problems. Whereas back-propagation networks utilized only supervised learning, the counter-propagation algorithm combined both supervised and unsupervised learning. In the Adeli and Park study, the two algorithms were applied to four building SDPA problems including the concrete beam design and simply supported plate analysis defined by Vanluchene and Sun and two others involving the analysis of a steel beam.

In the late 1990s, Reich (1997) [16] conducted a review of the literature on the application of ML to civil engineering problems. In addition to building SDPA, the review included other civil engineering domains such as transportation, construction management, water resources, environmental and materials. In fact, only sixteen of the ninety-seven citations were specific to building SDPA. In addition to reviewing the literature, the author highlighted several issues to be addressed towards the practical application of ML in civil engineering. They include (1) having a deep understanding of the learning problems, (2) knowing which ML technique is most suitable for the problem at hand, (3) the ease of implementation or availability of various ML techniques, (4) proper evaluation of trained models and (5) the availability of efficient information management systems.


Due to the limited availability of computational power and storage, early ML applications in building SDPA (such as the ones described in the previous two paragraphs) were limited to a few relatively simple problems involving small datasets. In contrast, the increase in computational resources and resurgence of artificial intelligence over the past two decades has led to the development of more sophisticated tools and techniques that can harness these new data streams and solve highly nonlinear learning problems. Within building SDPA, the revitalization of ML has been fueled by the complexity of modern systems, which requires the generation and/or manipulation of large datasets to rigorously assess their performance under various loading conditions. These datasets can be produced from (1) reconnaissance and remote sensing from past extreme events, (2) measurement data from large-scale (or multiple small-scale) physical experiments, (3) response of instrumented systems under normal operating loading conditions, (4) large-scale computational simulations and (5) relevant audio-visual media (e.g. images, videos and written text).

The abundance of studies on ML methods applied to building SDPA problems since the Reich paper warrants a more current state-of-the-art review. The goal of this paper is to synthesize past research on this topic towards a common understanding of the types of problems that are suited to ML applications, the characteristics of ML methods, the challenges associated with applying ML to building SDPA (ML-SDPA) and opportunities for the future. The review begins with a brief introduction to ML that includes a general problem formulation and discussion of relevant sub-topics (feature engineering and model training and performance evaluation). Next, the mathematical details of some ML algorithms that are increasingly being applied to building SDPA problems are presented. This is followed by a review of the existing ML-SDPA literature categorized in terms of the following four application areas: (1) predicting structural response and performance, (2) interpreting experimental data and formulating models to predict component-level structural properties, (3) information retrieval using images and written text and (4) recognizing patterns in structural health monitoring data. Subsequently, a discussion of specific challenges and future research opportunities related to the availability and collection of useful data, the explainability and interpretability (or lack thereof) of some ML models and challenges with overfitted models is presented.

2. Overview of Machine Learning Problems

This section presents an overview of ML beginning with a generalized formulation of supervised (classification and regression) and unsupervised learning problems. A brief discussion of feature engineering and model training and performance evaluation is also included. The material presented in this section is obtained from several statistics and ML sources that provide more details.

2.1 General Formulation

For supervised learning, the dataset of feature variables can be described by a matrix $X$ with dimension $N \times D$, where $N$ is the total number of observations (data points) and $D$ is the number of features (or independent variables). The response variable is described by an $N \times 1$ vector $y$ containing the label for each observation. For a classification problem, $y$ is a categorical variable and for regression, $y$ is a numerical variable. Unsupervised learning problems include the feature matrix $X$ but not the response variable $y$. The objective of supervised learning is to solve the generalized optimization problem by minimizing the empirical loss function defined by Equation 1 (Murphy, 2012) [1].

$$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{N} L\left(y_i, f(x_i; \theta)\right) + \lambda\, \Omega(\theta) \quad (1)$$

where $y_i$ is the response variable for observation $i$, $f(x_i; \theta)$ (also commonly denoted as $\hat{y}_i$, including later in this paper) is the predicted response from the ML model based on the feature vector $x_i$ and $\theta$ represents the set of model parameters. $L(y_i, f(x_i; \theta))$ is a loss measure between the true $y_i$ and predicted $\hat{y}_i$ value of the response variable. $\Omega(\theta)$ is a regularization term that penalizes the model based on its complexity by restricting the parameter set $\theta$ through some regularizing function. $\lambda$ is a model parameter that is determined as part of the optimization process. The objective is to find the set of model parameters $\hat{\theta}$ that minimizes the empirical loss over the training data with the regularization penalty considered. Note that a fixed parameter set $\theta$ is only used in parametric models, which have a finite number of parameters. For example, linear regression models always have $D + 1$ model parameters. On the other hand, rather than using a finite number of parameters to define the data distribution, non-parametric models utilize a flexible parameter set whose size, in theory, can be infinite, and is often treated as a function. The Support Vector Machine with radial basis function kernel is an example of a non-parametric model whose parameter set depends on the training data. Equation 1 is a convenient generalized formulation that is adopted by many supervised learning methods including ordinary least squares, ridge, least absolute shrinkage and selection operator (LASSO) and logistic and kernel regression. Depending on the ML method, the minimization problem can be solved using a closed form solution, gradient-based optimization, or convex relaxation.
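As an illustration of Equation 1, the following is a minimal sketch (Python with numpy) of the regularized empirical loss for a linear model; the synthetic dataset, the squared-error loss and the L2 regularizer are illustrative assumptions rather than values from any reviewed study.

```python
# Minimal sketch of Equation 1 for a ridge-regularized least-squares model:
# theta_hat = argmin_theta sum_i L(y_i, f(x_i; theta)) + lambda * Omega(theta)
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))             # N = 100 observations, D = 3 features
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)
lam = 0.1                                 # regularization parameter (lambda)

def empirical_loss(theta, X, y, lam):
    residuals = y - X @ theta             # y_i - f(x_i; theta), with f linear
    return np.sum(residuals**2) + lam * np.sum(theta**2)

# For this particular choice of L and Omega, the minimizer has a closed form
theta_hat = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
print(empirical_loss(theta_hat, X, y, lam))
```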

The goal of unsupervised learning is to infer the underlying structure and parameters of the model that generated the data, which can then be used to group the data into clusters, generating new instances and drawing inferences. The objective function for unsupervised learning is shown in Equation 2, where $\theta$ is the set of model parameters that characterizes a learned structure for the given dataset. The objective function can take the variant forms of negative log-likelihood and Kullback-Leibler divergence. In clustering analysis, $L(x_i; \theta)$ quantifies the cost of assigning a data point $x_i$ to a particular cluster. Examples of ML methods that follow this generalized formulation include the Gaussian Mixture Model, K-means and K Nearest Neighbors (Buhmann and Held, 1999) [17].

$$\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{N} L(x_i; \theta) \quad (2)$$

In ML theory, the objective function expressed in Equations 1 and 2 is defined as the empirical loss over the training dataset, denoted as $\mathcal{L}_{emp}(f)$ for a given model $f$. The theoretical solution to the ML problem, which is shown in Equation 3, is the set of parameters that minimizes the loss function over the entire data space, $\mathcal{L}(f)$. However, real problems are almost always limited by the amount of data sampled from the entire space. Therefore, the ideal solution is often approximated by minimizing the empirical loss over the training data instead.

$$\hat{\theta} = \arg\min_{\theta} \mathcal{L}(f) = \arg\min_{\theta} \int_{\Omega} L\left(y, f(x; \theta)\right) p(x, y)\, dx\, dy \quad (3)$$

where $p(x, y)$ is the theoretical joint distribution of the feature and response variables over the entire data space, $\Omega$.
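A minimal clustering sketch of Equation 2, assuming scikit-learn's K-means on synthetic data; here $L(x_i; \theta)$ is the squared distance between a point and its assigned centroid, and the fitted inertia is the summed assignment cost described above.

```python
# Minimal sketch of Equation 2 using K-means clustering
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Three synthetic, well-separated groups of 2-D points (illustrative data)
X = np.vstack([rng.normal(loc=c, size=(50, 2)) for c in (0.0, 5.0, 10.0)])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
# km.inertia_ is the minimized empirical loss: the summed cost of assigning
# each data point to its cluster (theta = the fitted centroids)
print(km.cluster_centers_, km.inertia_)
```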

2.2 Feature Engineering

Prior to training an ML model, the features that influence model performance, improve training efficiency and increase flexibility must be selected and extracted. Most ML methods deploy standard feature selection and extraction algorithms. However, some also have the ability to adjust features to achieve the best possible prediction performance.

Feature selection can be categorized into three methods: filter, wrapper and embedded. The filter method ranks the original features according to an importance measure, such as the scores from a Chi-square test or correlation coefficients between individual features and the response variable, and selects a subset to be used for model training. The wrapper method recursively includes or excludes features from an initial pool and selects the best performing feature set based on feedback from the ML model. Embedded methods are used by those algorithms that incorporate automatic feature selection (e.g., LASSO regression). Both filter and wrapper methods are good at avoiding overfitting issues by reducing model complexity and improving training efficiency by reducing highly correlated features.

Feature extraction consists of two major tasks that increase the effectiveness of ML models. The first is dimension reduction, which is achieved by applying methods such as Principal Component Analysis (PCA), which performs a linear mapping from the original data space to a lower dimensional space such that the data variance over each resulting orthogonal component is maximized. The second involves transforming the data into a higher dimensional space such that the patterns become sparse and separable, as in kernel-based ML algorithms (Huang et al. 2006) [18].
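The following is a minimal sketch of a filter-based selection step and a PCA-based dimension reduction step, assuming scikit-learn; the synthetic data and the number of retained features and components are illustrative.

```python
# Minimal sketch of filter-based feature selection and PCA dimension reduction
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_regression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                  # 10 original features
y = X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=200)

# Filter method: rank features by an importance score, keep the best subset
X_selected = SelectKBest(f_regression, k=4).fit_transform(X, y)

# Dimension reduction: linear mapping that maximizes variance per component
X_reduced = PCA(n_components=4).fit_transform(X)
print(X_selected.shape, X_reduced.shape)
```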

Besides the general feature engineering techniques described above, specific feature designs have proven to be very successful for domain-specific problems. For instance, the use of Haar-like features achieved human-level face-recognition accuracy with far less computational effort (Viola and Jones, 2001; Lienhart and Maydt, 2002) [19,20], SIFT features (Lowe, 2004) [21] are very effective for object detection within images and HOG features (Dalal and Triggs, 2005) [22] are particularly good for human detection. However, these domain-specific feature engineering techniques require considerable trial and error testing and are designed to only work for very specific problems and data structures. Neural networks and the associated deep learning approaches are extremely popular because they automate feature engineering to achieve state-of-the-art performance in many pattern recognition and data mining domains. This approach has gained widespread popularity in recent years because of the increase in computational power, which has made complex neural network architectures trainable, thus achieving superb feature extraction.

2.3 Model Training and Performance Assessment

There are many well-established procedures for training ML models that attempt to achieve stable and effective prediction performance for new data given a training dataset. One common strategy is $k$-fold cross validation (also discussed in Reich 1997 [16]), which randomly splits a dataset into $k$ different subsets and trains the model $k$ times, each time using the $k$-th subset as testing data and the remaining $k - 1$ subsets for model training. The best performing of the $k$ models over the testing dataset is selected. This procedure is intended to reduce overfitting on the training dataset. Another popular technique that is used to avoid overfitting is bootstrapping, which randomly samples a subset of the data with replacement and trains the model $B$ times. The final model is selected as an average over the predicted results (regression) or based on a majority vote (classification) (Bunke and Droge, 1984 [23]) from the $B$ models. Both bootstrapping and $k$-fold cross validation effectively reduce model variance and remove bias, and are the primary training techniques used to develop data-driven models. These training procedures are evaluated by using various performance metrics for model selection. For example, performance metrics for binary classification models include accuracy, precision and recall. Precision refers to the number of correct "positive" (e.g. building is red-tagged) predictions normalized by the number of positive predictions. Recall is the number of correct positive predictions normalized by the number of actual positive classes. Accuracy is the number of correct predictions normalized by the total number of predictions (Powers, 2011) [24]. For multi-class classification problems, in addition to the aforementioned three metrics, a confusion matrix and top-$k$ class accuracy are also used (Krizhevsky et al. 2012) [25]. Regression models are typically evaluated using the mean squared error (MSE), root mean squared error (RMSE), adjusted root mean square error, coefficient of multiple determination, the median absolute error and the median absolute relative deviation (MARD) (Mack et al. 2007; Burton et al. 2017) [26,27].
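A minimal sketch of both training procedures, assuming scikit-learn; the ridge model, $k = 5$ and $B = 50$ are illustrative choices.

```python
# Minimal sketch of k-fold cross validation and bootstrap aggregation
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import cross_val_score, KFold

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

# k-fold cross validation: train k times, each time holding out one fold
scores = cross_val_score(Ridge(alpha=1.0), X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0),
                         scoring="neg_mean_squared_error")
print(-scores.mean())  # average held-out MSE across the 5 folds

# Bootstrapping: B models trained on resamples (with replacement), averaged
bagged = BaggingRegressor(Ridge(alpha=1.0), n_estimators=50,
                          random_state=0).fit(X, y)
print(bagged.predict(X[:3]))
```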

3. Machine Learning Models Commonly used for Building SDPA Problems

Some ML algorithms that are commonly employed for building SDPA problems are introduced. Only supervised learning (classification and regression) models are included because very few ML-SDPA studies involving unsupervised learning can be found in the literature. The next section summarizes the recent (mostly within the last decade) ML-SDPA research. Most of the methods included in the literature review are covered in this section. However, some of the more advanced algorithms such as recurrent or convolutional neural networks, while included in the ML-SDPA review, are not discussed in this section. The relevant references are provided for readers who would like to become more familiar with the details of these methodologies.

3.1 Linear Regression: Ordinary Least Squares, LASSO, Ridge and Polynomial Basis Functions

Ordinary Least Squares (OLS) is one of the simplest regression methods and is very commonly used in building SDPA. It is included in this section because it provides a basis for understanding some of the less common regression techniques (e.g. LASSO and ridge), which are described later in this section. In the OLS formulation, the response variables are approximated using Equation 4.

$$y = \hat{y} + \epsilon = X\beta + \epsilon \quad (4)$$

where $y$ is the observed response variables, $\hat{y}$ represents the predicted response variables, $X$ is the feature matrix described earlier and $\epsilon$ is an $N \times 1$ vector of residuals, which is taken as the difference between the observed and predicted values of the response variables. $\beta$ (same as the model parameters, $\theta$, in Equation 1) is a $D \times 1$ vector of predictor coefficients, which is derived by minimizing the residual sum of squares, $RSS = (y - X\beta)^T (y - X\beta)$. Note that for OLS regression, $RSS$ represents the loss function described earlier and no regularization term is included. The OLS minimizing predictors are computed using the closed form solution in Equation 5a.

LASSO (Tibshirani, 1996) [28] and ridge (Hoerl and Kennard, 1970) [29] are two linear regression methods that, as noted earlier, incorporate feature engineering as part of the overall formulation. The LASSO method integrates both feature selection (by setting some of the predictor coefficients to zero) and shrinking the OLS coefficients by including a penalty on the $RSS$ loss function. As shown in Equation 5b, the regularizing function ($\Omega(\theta)$ in Equation 1) is taken as the sum of the absolute values (the $L_1$ norm) of the predictor coefficients. Like LASSO, ridge adds a penalty to the $RSS$ loss function. However, the root sum of the squares of the predictor coefficients (the $L_2$ norm) is used as the regularization function. Also, ridge regression does not incorporate feature selection, i.e. none of the predictor coefficients are shrunk to zero.

$$\hat{\beta}^{OLS} = \arg\min_{\beta} RSS = (X^T X)^{-1} X^T y \quad (5a)$$

$$\hat{\beta}^{LASSO} = \arg\min_{\beta} RSS + \lambda \|\beta\|_{L_1} \quad (5b)$$

$$\hat{\beta}^{ridge} = \arg\min_{\beta} RSS + \lambda \|\beta\|_{L_2}^2 = (X^T X + \lambda I)^{-1} X^T y \quad (5c)$$

where $I$ is a $D \times D$ identity matrix. Note that while the predictor coefficients for OLS ($\hat{\beta}^{OLS}$) and ridge ($\hat{\beta}^{ridge}$) can be solved analytically, there is no closed form solution for the LASSO coefficients $\hat{\beta}^{LASSO}$. For both ridge and LASSO, the value of the regularization parameter $\lambda$ can be determined using $k$-fold cross validation and/or by minimizing some information criterion such as the Akaike Information Criterion (AIC) (Akaike, 1974) [30] and Bayesian Information Criterion (BIC) (Schwarz, 1978) [31].

By replacing $X$ in Equation 4 with a nonlinear transformation of itself (also referred to as a basis function, e.g., $\phi(X)$), linear regression can be used to create models that capture a nonlinear relationship between the response and feature variables. It is important to note that the regression model itself is still linear because the parameters $\beta$ are linear (Murphy, 2012) [1]. In the literature review section presented later in the paper, several studies have utilized linear regression with higher-order polynomial basis functions, i.e. $x$ in Equation 4 is replaced with $[1\ x\ x^2 \dots x^d]$. The complexity of the model can be increased by using higher values of $d$ or by utilizing multiple piece-wise polynomial basis functions as in multiple adaptive regression splines (MARS) (Friedman, 1991) [32].
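A minimal numpy sketch of the closed-form solutions in Equations 5a and 5c (the LASSO coefficients in Equation 5b have no closed form and require an iterative solver); the data and $\lambda$ are illustrative.

```python
# Minimal sketch of the closed-form OLS (Eq. 5a) and ridge (Eq. 5c) solutions
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([2.0, 0.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)
lam = 0.5  # illustrative regularization parameter

beta_ols = np.linalg.solve(X.T @ X, X.T @ y)                      # Eq. 5a
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)  # Eq. 5c
print(beta_ols, beta_ridge)
```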

3.2 Kernel Regression

As noted earlier, linear regression models can be adapted to capture complex non-linear relationships between features and response variables by employing nonlinear basis functions. One strategy that has been adopted in several of the ML-SDPA studies discussed later in the paper is the use of kernel basis functions. The word "kernel" has several interpretations; however, it is often described as a measure of the similarity between two observations $x_i$ and $x_j$. This similarity (or lack thereof) is quantified using a kernel function, $k(x_i, x_j)$ (note that $x_i$ and $x_j$ are feature vectors of an individual observation). Examples of commonly used kernel functions include linear, polynomial, sigmoid and Gaussian or squared exponential. Equation 6 describes the Gaussian kernel, which was commonly used in the reviewed ML-SDPA studies.

$$k(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) \quad (6)$$

where the bandwidth, $\sigma$, is a model parameter that is determined during the training process (e.g. using $k$-fold cross validation or based on AIC). In the context of kernel ridge regression (i.e. ridge regression performed using a kernel basis function), the feature matrix $X$ is replaced by a kernel matrix $K$ (Equation 7) and the minimization problem takes on the form shown in Equation 8.

$$K = \begin{bmatrix} k(x_1, x_1) & k(x_1, x_2) & \dots & k(x_1, x_N) \\ k(x_2, x_1) & k(x_2, x_2) & \dots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ k(x_N, x_1) & k(x_N, x_2) & \dots & k(x_N, x_N) \end{bmatrix} \quad (7)$$

$$\hat{\alpha}^{ridge} = \arg\min_{\alpha} |y - K\alpha|^2 + \lambda \alpha^T K \alpha = (K + \lambda I)^{-1} y \quad (8)$$

where $I$ is an $N \times N$ identity matrix and $\alpha$ is the vector of regression coefficients in kernel space. Equation 9 is used to compute the response function $\hat{y}_{test}$ for a test data point $x_{test}$.

$$\hat{y}_{test} = \sum_{i=1}^{N} \hat{\alpha}_i\, k(x_i, x_{test}) \quad (9)$$
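A minimal numpy sketch of Equations 6 through 9, assuming an illustrative, untuned bandwidth $\sigma$ and regularization parameter $\lambda$.

```python
# Minimal sketch of Gaussian-kernel ridge regression (Equations 6-9)
import numpy as np

def gaussian_kernel(A, B, sigma):
    # k(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 sigma^2)), Equation 6
    d2 = ((A[:, None, :] - B[None, :, :])**2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma**2))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(80, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=80)
sigma, lam = 1.0, 0.1  # illustrative, untuned values

K = gaussian_kernel(X, X, sigma)                      # Equation 7
alpha = np.linalg.solve(K + lam * np.eye(80), y)      # Equation 8
x_test = np.array([[0.5]])
y_test = gaussian_kernel(x_test, X, sigma) @ alpha    # Equation 9
print(y_test)
```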


3.3 Tree-Based Algorithms: Decision Trees, Random Forests and Adaptively Boosted Trees

Tree-based algorithms can be used for both classification and regression. The models that belong to this category recursively divide the training dataset while exploring and learning its structure towards creating subspaces that are mutually exclusive or have high levels of purity (the ratio between the class with the most samples and the size of the data subset). The rules used to grow and prune each tree (e.g. node splitting and stopping criteria), the number of considered trees and the approach to aggregating information from multiple trees are what distinguish the different types of tree-based algorithms.

The structure of a tree can be described using nodes, each of which represents a data subset of predictors and response variables. The lowest level node, which is the one that comprises the entire dataset, is called the root node. Additional nodes are created when a parent node is divided based on some criteria (described later) to create child nodes. The highest level nodes, which are also referred to as the leaf nodes, represent the data subsets created at the very end of the data-division process. In other words, no further splitting of the data occurs beyond the leaf nodes, whose associated subspaces provide the response variable prediction. All other nodes (excluding the root and leaf nodes) are referred to as interior nodes. For a classification problem, the prediction is taken as the dominant class (or categorical variable) within the leaf node that meets the data-division criteria for all nodes leading to it. For a regression problem, the predicted value of the response variable is taken as the mean value at the corresponding leaf node.

The Decision Tree (DT), which is the simplest of the tree-based algorithms, considers all features when splitting each data subset and chooses the one that minimizes the impurity measure. The Gini index $G_m$ (Equation 10) is commonly used to measure impurity.

$$G_m = \sum_{k=1}^{C} \hat{p}_{mk} \left(1 - \hat{p}_{mk}\right) \quad (10)$$

where $\hat{p}_{mk} = \frac{1}{N_m} \sum_{x_i \in R_m} I(y_i = k)$ represents the fraction of observations belonging to the $k$-th class within the region (or data subset) $R_m$. $I(\cdot)$ is an indicator function. Several alternative criteria such as the minimum number of samples needed at a given node for additional splitting and the maximum depth of the tree are used to terminate the growth (or data-division) process (Breiman et al. 1984; Hastie et al. 2009) [33,34].

Adaptive boosting seeks to improve the performance of DTs by iteratively creating new models that correct the errors of the previous one. This is achieved by applying weights to the training datapoints based on some set of criteria. The first model is created using uniform weights $w_i = \frac{1}{N}$, $i = 1, 2, \dots, N$ for the training data $\{(x_i, y_i)\,|\,i = 1, 2, \dots, N\}$, $y_i \in \{1, 2, \dots, C\}$. The error $err^{(m)}$ associated with iteration $m$ is computed and the weights are updated according to Equations 11a and 11b.

$$\alpha^{(m)} = \log\left(\frac{1 - err^{(m)}}{err^{(m)}}\right) \quad (11a)$$

$$w_i^{(m+1)} = w_i^{(m)} \cdot \exp\left(\alpha^{(m)} \cdot I\left(y_i \neq h^{(m)}(x_i)\right)\right) \quad (11b)$$

where $\alpha^{(m)}$ is the weight factor assigned to the base model $h^{(m)}$, and $I(\cdot)$ is the indicator function. Based on Equation 11a, the misclassified data points in iteration $m$ are assigned a higher weight in iteration $m + 1$. A linear aggregation of the weighted base models is used to give the final prediction (Equation 12) (Freund and Schapire, 1997; Hastie et al. 2009; Abu-Mostafa et al. 2012) [35–37].

$$\hat{y}_{test} = \arg\max_{k} \sum_{m=1}^{M} \alpha^{(m)} I\left(h^{(m)}(x_{test}) = k\right) \quad (12)$$

The Random Forests (RF) model uses an aggregated set of Decision Trees, which are constructed by applying bootstrap sampling to the training dataset. For regression problems, the model prediction is taken as the average of the considered DTs and the class that is predicted by the majority of trees is used for classification. During the data-division process at each node, a randomly selected subset of features is considered, thus reducing the correlation across the different trees and avoiding overfitting (Breiman, 2001) [38].
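A minimal sketch of the three tree-based algorithms, assuming scikit-learn and illustrative hyperparameters on a synthetic binary classification problem.

```python
# Minimal sketch of DT, RF and adaptive boosting on synthetic data
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

dt = DecisionTreeClassifier(criterion="gini", max_depth=4)   # Gini, Eq. 10
rf = RandomForestClassifier(n_estimators=100, random_state=0)
ada = AdaBoostClassifier(n_estimators=50, random_state=0)    # Eqs. 11-12

for model in (dt, rf, ada):
    print(type(model).__name__, model.fit(X, y).score(X, y))
```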

3.4 Logistic Regression

Logistic regression is often used as a classification algorithm because it is fairly easy to implement and to interpret the final results. Given the feature vector $x$ and assuming a binary response for each observation, $y_i \in \{0, 1\}$, the probability of the $y = 1$ class is computed using the sigmoid transformation of the linear function of $x$ (Equation 13).

$$P(y = 1 \,|\, X = x) = \frac{\exp(\beta^T x + \beta_0)}{1 + \exp(\beta^T x + \beta_0)} \quad (13)$$

where $\beta = [\beta_1, \beta_2, \dots, \beta_D]^T$ and $\beta_0$ are the predictor coefficients vector and the bias term, respectively. Multinomial logistic regression is used for problems with more than two classes, i.e. $y_i \in \{1, 2, \dots, C\}$, where $C$ represents the number of classes. The probability that $x$ belongs to the $k$-th class is computed using Equation 14.

$$P(y = k \,|\, X = x) = \frac{\exp(\beta_k^T x + \beta_{k0})}{\sum_{j=1}^{C} \exp(\beta_j^T x + \beta_{j0})} \quad (14)$$

where $\beta_k$ and $\beta_{k0}$ are the predictor coefficients vector and the bias term associated with computing the probability of the $k$-th class. The predicted class is taken as the one with the highest probability (Bishop, 2006) [39]. The training process used to retrieve $\beta$ is similar to that of the OLS method discussed above, except that no closed-form solution exists and the coefficients are obtained through iterative (e.g. gradient-based) optimization.
293 training process used to retrieve 2 is similar that of the OLS method discussed above with a closed-form solution.
-p
re
294 3.5 Support Vector Machines
lP

295 Originally developed as a binary classifier, support vector machines (SVM) seek to determine the hyperplane that

296 separates a dataset into two classes with the widest possible gap between them. If the training data is linearly separable, a
na

297 hard-margin version of SVM is applied. Otherwise, the hinge loss function (Rosasco, et al., 2004) [40] is introduced to
ur

298 maintain a soft margin for the decision boundary, which begins by defining the %7 regularized objective function shown in
Jo

299 Equation 15.

300 ∑d , ! + ‖2‖7:D (15)

301 where the response approximation function in the original feature space is ! = 26 + ƒ† . By adopting the Š-insensitive

302 loss function together with slack variables (because the objective function is not differentiable) (Cortes & Vapnik, 1995)

303 [41] and an appropriate kernel function + , , the response approximation becomes ! = ∑d # + , + ƒ† where

304 k# | = 1,2, … , ‹l are the regression coefficients in kernel (high-dimensional) space.

305 For a binary class problem with the training dataset defined by k , |1,2, … , l, ∈ k−1,1l and feature vector

306 observation , the classification is based on sign ! . For a multi-class problem, an one-versus-all or one-versus-one
307 approach can be adopted. For a set of m classes i.e. ∈ k1,2, … , ml, the data from class f is treated as positive and the

c c<
7
308 data from the other classes is treated as negative in the one-versus-all approach. In the one-versus-one approach,

309 classifiers are trained and a prediction is established for each pair. The class with the highest number of votes is then used as

310 the prediction (Bishop, 2006; Murphy, 2012) [1,39].
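A minimal sketch of a soft-margin, Gaussian-kernel SVM classifier, assuming scikit-learn; C and gamma are illustrative, untuned values.

```python
# Minimal sketch of a soft-margin SVM with an RBF (Gaussian) kernel
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# decision_function_shape="ovo" trains C(C-1)/2 one-versus-one classifiers
# for multi-class problems; for binary data the sign of the decision
# function gives the class, as described above.
clf = SVC(kernel="rbf", C=1.0, gamma="scale",
          decision_function_shape="ovo").fit(X, y)
print(clf.decision_function(X[:3]), clf.predict(X[:3]))
```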

3.6 K-Nearest Neighbors

K-Nearest Neighbors (KNN) is a non-parametric algorithm that is used for both classification and regression. First, the $K$ observations in the training data that are nearest to the query observation $x_0$ are identified based on some pre-defined distance metric. An empirical function is then created based on the number of each class in that subset of datapoints (defined as $N_0$). For $y \in \{1, 2, \dots, C\}$, the probability that observation $x_0$ belongs to the $k$-th class is computed using Equation 16.

$$P(y = k \,|\, X = x_0, K) = \frac{1}{K} \sum_{i \in N_0} I(y_i = k) \quad (16)$$

where $I(s)$ is an indicator function that is equal to 1 if $s$ is true and 0 if $s$ is false, and $s$ represents whether the observation belongs to class $k$. The Euclidean distance is often used as the distance metric in KNN and the value of $K$ can be chosen using $k$-fold cross validation. The observation is assigned to the class with the highest empirical probability. KNN can also be used for regression, where the value of the response variable is taken as the average (or median) value over the $K$ nearest neighbors (Murphy, 2012) [1].
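A minimal numpy sketch of Equation 16; $K$ and the synthetic two-class data are illustrative assumptions.

```python
# Minimal sketch of a K-nearest-neighbors vote (Equation 16)
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 2))
y_train = (X_train[:, 0] > 0).astype(int)   # two classes: 0 and 1
x0, K = np.array([0.2, -0.1]), 5            # query point and K (illustrative)

dist = np.linalg.norm(X_train - x0, axis=1)   # Euclidean distance metric
nearest = np.argsort(dist)[:K]                # indices of the K neighbors
p_class1 = np.mean(y_train[nearest] == 1)     # empirical prob. of class 1
print(p_class1, int(p_class1 > 0.5))          # probability, predicted class
```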

3.7 Discriminant Analysis

Discriminant analysis is a technique that is used to address binary or multi-class classification problems. The methodology assumes that the feature variables within a particular class take on a multi-variate normal (MVN) distribution. More specifically, the distribution of the feature vector $x$ conditioned on class $k$ is defined by $N(x \,|\, \mu_k, \Sigma_k)$, where $\mu_k$ and $\Sigma_k$ are the mean vector and covariance matrix computed using the data subset associated with class $k$. The probability that observation $x$ belongs to class $k$ is obtained by applying Bayes theorem to the class-conditioned multi-variate normal distribution (Equation 17).

$$P(y = k \,|\, X = x) = \frac{\pi_k\, N(x \,|\, \mu_k, \Sigma_k)}{\sum_{j=1}^{C} \pi_j\, N(x \,|\, \mu_j, \Sigma_j)} \quad (17)$$

where $N(x \,|\, \mu_k, \Sigma_k)$ is the class-conditioned MVN probability density function (pdf) for the feature vector $x$ and $\pi_k$ is the prior probability of being in class $k$ (estimated as the fraction of class $k$ observations in the training dataset). The classification (or discriminant) function is obtained by substituting the MVN pdf into Equation 17. In linear discriminant analysis (LDA), the same covariance matrix $\Sigma$ is used across all classes (i.e. computed using the feature vectors for the entire training dataset), which produces a linear decision boundary between each pair of classes. The LDA classifier $\delta_k(x)$ is shown in Equation 18.

$$\delta_k(x) = x^T \Sigma^{-1} \mu_k - \frac{1}{2} \mu_k^T \Sigma^{-1} \mu_k + \log \pi_k \quad (18)$$

The predicted class is taken as the one with the highest $\delta_k(x)$ value. Quadratic Discriminant Analysis (QDA) uses class-conditioned covariance matrices $\Sigma_k$ (i.e. computed using only the feature vectors from the class $k$ data subset), which produces a quadratic decision boundary between each pair of classes. The QDA classifier is shown in Equation 19 (Hastie et al. 2009; Murphy, 2012) [1,36].

$$\delta_k(x) = -\frac{1}{2} \log|\Sigma_k| - \frac{1}{2} (x - \mu_k)^T \Sigma_k^{-1} (x - \mu_k) + \log \pi_k \quad (19)$$
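A minimal sketch of LDA and QDA, assuming scikit-learn and an illustrative synthetic dataset.

```python
# Minimal sketch of linear and quadratic discriminant analysis
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

X, y = make_classification(n_samples=300, n_features=6, n_classes=3,
                           n_informative=3, random_state=0)

lda = LinearDiscriminantAnalysis().fit(X, y)     # shared covariance, Eq. 18
qda = QuadraticDiscriminantAnalysis().fit(X, y)  # per-class covariance, Eq. 19
print(lda.predict(X[:3]), qda.predict(X[:3]))    # highest-score class wins
```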

343 3.8 Artificial Neural Networks and its Variants

344 ANNs refer to a category of pattern recognition algorithms that are inspired by the biological nervous system. The

345 network takes a set of features as inputs and applies complex feature fusion operations through a series of layers. ™š ,

346 which is denotes the oth hidden layer and consists of š


neurons, is computed based on a linear combination of the

347 neurons at the previous layer via the š


× š<
weight matrix ›š and š
× 1 bias vector œš , followed by the

348 activation š
∙ (Wu et. al., 2018) [42].

349 ™š = š
›š ™š< + œš (20)

350 The final layer, which could be a linear layer for regression problems or a softmax layer for classification problems
351 (Bishop, 2006) [39], outputs the predicted response !. Loss function choices include 455 for regression and cross

352 entropy loss for classification. The ANN model is trained through backpropagation, which is a gradient-based algorithm

353 that calculates error gradients over each model parameter based on the chain rule (Rumelhart, et. al., 1986) [6].

354 Numerous variants of ANNs have been developed to achieve faster convergence, better prediction performance and less

355 memory usage. ANN variants can differ based on the activation function (e.g., leaky rectified linear unit), the type of

356 layer-connections (e.g., dropout, max-pooling) or the connection mechanism (e.g., Recurrent Neural Network (RNN)).

357 The term deep learning (DL) is used to describe ANNs with many layers. The success of DL started with Krizhevsky et

of
ro
358 al. (2012) [25], who formulated a deep convolutional neural network (CNN) ImageNet classification model that achieved

359 -p
superb performance. Because of its pattern recognition capability, DL has since been successfully used in many domains
re
360 including computer vision, speech recognition and natural language processing.
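A minimal numpy sketch of the forward pass in Equation 20, with random (untrained) weights and a ReLU activation as illustrative assumptions; in practice the weights would be fitted via backpropagation as described above.

```python
# Minimal sketch of the forward pass H_l = sigma_l(W_l H_{l-1} + b_l), Eq. 20
import numpy as np

rng = np.random.default_rng(0)
relu = lambda z: np.maximum(0.0, z)   # illustrative activation sigma_l

x = rng.normal(size=(5, 1))                                     # input (D = 5)
layers = [(rng.normal(size=(8, 5)), rng.normal(size=(8, 1))),   # W1, b1
          (rng.normal(size=(4, 8)), rng.normal(size=(4, 1))),   # W2, b2
          (rng.normal(size=(1, 4)), rng.normal(size=(1, 1)))]   # output layer

H = x
for W, b in layers[:-1]:
    H = relu(W @ H + b)        # hidden layers, Equation 20
W_out, b_out = layers[-1]
y_hat = W_out @ H + b_out      # linear final layer (regression)
print(y_hat)
```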
lP

4. Prior Studies on Applying Machine Learning to Building SDPA Problems

A broad range of relatively recent (mostly within the last decade) ML-SDPA publications are summarized based on the four categories identified earlier, which are also schematically illustrated in Figure 1: (1) predicting structural response and performance, (2) models developed using data from physical experiments, (3) information retrieval using images and written text and (4) models developed using field reconnaissance and structural health monitoring data. It is recognized that some of the reviewed studies belong to more than one category. For instance, models developed with the intent of using measured structural responses (from instrumented buildings) and/or observed damage (during reconnaissance) to estimate building performance states belong to categories (1) and (4) (e.g. Ghiasi et al. 2016 [43]; Zhang and Burton, 2019 [44]). Another example is text-feature-based building damage prediction models trained using field reconnaissance data, which belong to categories (3) and (4) (Mangalathu and Burton, 2019) [45]. The purpose of the categorization is to help researchers and practitioners make decisions about which types of ML problems are suited to specific structural engineering problems.

Figure 1 Schematic illustration of the four ML-SDPA categories

4.1 Predicting Structural Response and Performance

Nonlinear structural response simulation is recognized as the ideal approach to assessing the performance of built systems under extreme loading. Prior studies have used ML to complement or expand the predictive capabilities of "mechanistic" or "physics-based" structural response simulations. These so-called surrogate (or meta-) models serve as compact statistical representations of the relationship between a set of input variables (e.g. structural properties, loading characteristics) and the response or performance quantities of interest. They are useful for reducing the number of mechanistic simulations needed for computationally intensive applications such as uncertainty quantification and performance optimization within a high-dimension parameter space.

Table 1 summarizes the studies that have used ML models to estimate structural response demands or performance metrics (e.g. collapse fragility). For each study, the structural system type, category of response and predictor variables, adopted ML algorithm(s) and model performance evaluation methods are shown. The listed studies focused on building seismic systems including steel (Seo et al. 2012; Khojastehfar et al. 2014; Jough and Sensoy, 2016; Kiani et al. 2019) [46–49] and concrete moment frames (with and without masonry infill) (Mitropoulou and Papadrakakis, 2011; Burton et al. 2017; Morfidis et al. 2017; Zhang et al. 2018) [27,50–52], steel braced frames (Moradi and Burton, 2018; Moradi et al. 2018) [53,54] and reinforced concrete shear walls (Sun et al. 2019; Zhang and Burton, 2019) [44,55]. Most of the reviewed studies developed regression models with engineering demand parameters (e.g. story drift ratio, peak floor acceleration) as the response variables and, in many of those cases, this prediction was an intermediate step towards developing limit state fragility functions (e.g. Seo et al. 2012) [46]. Utilizing a slightly different approach, a few studies directly incorporated the limit state fragility parameters (e.g. median and dispersion of the collapse intensity) as the response variable (Khojastehfar et al. 2014; Jough and Sensoy, 2016; Burton et al. 2017) [27,47,48]. Whereas most of the studies sought to predict mainshock demands and limit state parameters, three (Burton et al. 2017; Zhang et al. 2018; Zhang and Burton, 2019) [27,44,52] focused on aftershock performance. Binary classification models were used in only two of the fourteen studies (Zhang et al. 2018; Kiani et al. 2019) [49,52]. While some studies used only ground motion intensity measures as the model features (e.g. Mitropoulou and Papadrakakis, 2011; Kiani et al. 2019) [49,50], others also included structural configuration (e.g. number of stories in building), modeling (e.g. damping ratio) and material properties (e.g. yield strength of steel).

Several different algorithms were used to develop the ML-based structural response and limit state parameter prediction models. For regression, ANN and linear models with polynomial basis functions were most widely used. Other adopted regression algorithms include ridge (conventional and kernel), LASSO, elastic net, PCA and SVM. Some authors used a single algorithm (e.g. Mitropoulou and Papadrakakis, 2011) [50] while others compared the performance of multiple algorithms (e.g. Burton et al. 2017) [27]. Similarly, of the two studies that developed classification models, one used RF (Zhang et al. 2018) [52] while the other compared the performance of several algorithms (Kiani et al. 2019) [49]. Most studies used training-testing splits to evaluate the performance of the developed ML model. These splits ranged from 33%-67% (i.e. 33% training and 67% testing) on one extreme to 80%-20% on the other. None of the studies evaluated the effect of the training-testing partition point on model performance. The ratio of the predicted to actual value of the response variable was the most widely used performance metric in the regression studies. Others included the coefficient of determination (R2), MSE, MARD and mean absolute error (MAE). For the classification studies, accuracy, precision, recall and F-measure were used as the performance metrics.
Table 1 Summary of ML models developed in prior studies to predict building structural response and/or performance

| Study | Structure Type | Response Variable(s) | Predictor Variables | ML Algorithm(s) | Training/Testing Split | Performance Metric |
|---|---|---|---|---|---|---|
| Mitropoulou and Papadrakakis, 2011 | RC Frames | Structural response parameters | Ground motion parameters | ANN | 33-67 | Ratio of predicted to actual response |
| Seo et al. 2012 | Steel Moment Frames | Structural response parameters | Earthquake direction, steel properties, damping and building age and configuration | Linear regression with polynomial basis function | NA | NA |
| Khojastehfar et al. 2014 | Steel Moment Frames | Collapse fragility parameters | Ground motion parameters | ANN | 65-30 | Mean absolute error and MSE |
| Jough and Sensoy, 2016 | Steel Moment Frames | Collapse fragility parameters | Frame strength and ductility parameters | Linear regression with polynomial basis function | NA | R2, MSE and mean absolute error |
| Burton et al. 2017 | RC Infilled Frames | Aftershock collapse fragility parameters | Mainshock response demands (e.g. SDR) and component damage levels | OLS, PC, LASSO and ridge (conventional and kernel) regression | 80-20 | MARD |
| Morfidis et al. 2017 | RC Frames | SDR and associated damage | Ground motion (e.g. peak ground acceleration) and structural parameters (e.g. fundamental period) | ANN | 70-30 | MSE |
| Moradi and Burton, 2018 | Controlled Rocking Steel Braced Frames | Structural response parameters | Rocking frame design parameters (e.g. post-tensioning force and frame aspect ratio) | Linear regression with polynomial basis function | 35-65 | R2, ratio of predicted to actual response |
| Moradi et al. 2018 | Controlled Rocking Steel Braced Frames | Performance limit states (e.g. life safety) | Rocking frame design parameters (e.g. post-tensioning force and frame aspect ratio) | Linear regression with polynomial basis function | 80-20 | R2, ratio of predicted to actual response |
| Zhang et al. 2018 | RC Frames | Post-earthquake safety state | Mainshock response demands (e.g. SDR) and component damage levels | Random Forests | 75-25 | Confusion matrix |
| Kiani et al. 2019 | Steel Moment Frames | Damage fragility parameters | Ground motion parameters | Logistic, LASSO, SVM, Naïve Bayes, DT, RF, KNN, DA and ANN | 70-30 | Recall, precision and F-measure |
| Zhang and Burton, 2019 | Tall RC Building with Moment Frame and Core Walls | Aftershock damage fragility parameters | Mainshock response demands | Support Vector Machines | 75-25 | MSE |

4.2 Models Developed using Experimental Data

There is a long history of using empirical data from physical experiments to develop statistical models for predicting structural parameters (e.g. component stiffness, strength and/or deformation capacity). Many of the earlier models, which were developed using very small datasets (on the order of tens of datapoints), adopted relatively simple analytical expressions with one or two input parameters (Hobbs, 1972; Bažant & Zebich, 1983; Bažant & Chern, 1984; Bažant et al., 1991; Bažant & Kim, 1991; Carpinteri et al., 1995) [56–61]. As the size of the datasets increased (on the order of hundreds), more complex multi-variate analytical expressions were adopted (e.g. Haselton et al. 2016; Lignos & Krawinkler, 2010) [62,63]. Some of the most recent studies, which are summarized in the next paragraph, have implemented advanced ML models.

Table 2 summarizes the studies that have developed ML models using experimental data. The categories are the same as Table 1, with one exception: the models developed to predict structural response demands and limit state parameters are based on the results from nonlinear analyses, which, for the most part, meant that the authors had the ability to control the size of the dataset. However, for the ML models developed using experimental data, the size of the dataset is controlled by the number of experiments conducted in prior studies. For this reason, the number of sample points is also documented in Table 2. Only reinforced concrete components were used in the reviewed studies, including columns (Naeej et al. 2013; Luo and Paal, 2018; Luo and Paal, 2019; Mangalathu and Jeon, 2018) [64–67], beam-column joints (Jeon et al. 2014; Luo and Paal, 2018; Luo and Paal, 2019; Mangalathu and Jeon, 2018) [65–68], slabs (Vu and Hoang, 2016) [69], infilled frames (Huang and Burton, 2019) [70] and shear walls (Mangalathu et al., 2020) [71]. The number of specimens ranged from 65 (Naeej et al. 2013) [64] to 536 (Mangalathu and Jeon, 2018) [67]. Both classification and regression models have been developed using experimental data. The former has been used to predict the failure mode in specific components (Huang and Burton, 2019; Mangalathu and Jeon, 2019; Mangalathu et al., 2020) [67,70,71] while the latter has been employed to predict component-level structural parameters including column confinement coefficients, beam-column shear strengths, punching shear strength in slabs, beam-column drift capacity and backbone parameters (e.g. strength, stiffness and deformation capacity). The Luo and Paal (2018) [65] model is the only one that was developed to predict multiple output parameters. The input parameters always comprise the cross section, reinforcement and material properties of the component.

Linear (ordinary and piecewise linear) models, DT, MARS, symbolic methods, least square SVM and ANNs have been used for regression. The methods adopted for the classification studies include adaptive boosting, logistic regression, ANNs, RF, SVM and DT. Again, most studies utilized training-testing splits to evaluate model performance; however, because of the generally small size of the datasets, the partition point was mostly skewed towards a much larger proportion for the training subset (e.g. 90%-10% training-testing). MSE, RMSE and R2 were most commonly used to evaluate the performance of regression models. Bias, scatter index, correlation coefficient, agreement index, coefficient of variance, mean average percentage error and absolute error were also used for this purpose. Similar to the structural response and limit state assessment studies, accuracy, recall and precision were used to evaluate the classification models.
Table 2 Summary of ML models developed using data from physical experiments of building components

| Study | Component Type | Number of Sample Points | Response Variable(s) | Predictor Variables | ML Algorithm(s) | Training/Testing Split | Performance Metric |
|---|---|---|---|---|---|---|---|
| Naeej et al. 2013 | RC Column | 65 | Lateral confinement coefficient | Material and geometric properties | M5 regression tree | 37-28 | Bias, scatter index, correlation coefficient and agreement index |
| Jeon et al. 2014 | RC Beam-Column Joint | 516 | Shear strength | Material and geometric properties | Linear, MARS and symbolic regression | NA | RMSE, CC and coefficient of variance |
| Luo and Paal 2018 | RC Column | 262 | Backbone curve parameters (e.g., yield and maximum shear force) | Material and geometric properties, applied axial loads and failure mode | Least square SVM regression | 90-10 | R2 and RMSE |
| Luo and Paal 2019 | RC Column | 160 | Drift capacity | Material and geometric properties | Locally weighted least square SVM regression | 70-30 | R2, RMSE and mean absolute prediction error (MAPE) |
| Hoang et al. 2019 | RC Components | 218 | Reinforcement-concrete ultimate bond strength | Steel and concrete properties | Least square SVM regression | NA | R2, RMSE and MAPE |
| Hoang 2019 | RC Slab | 140 | Punching shear capacity | Material and geometric properties and fibre volume | Piecewise linear and ANN regression | NA | R2 |
| Vu and Hoang 2016 | FRP RC Slab | 82 | Punching shear capacity | Material and geometric properties | Least square SVM and ANN regression | 72-10 | R2, RMSE and MAPE |
| Mangalathu and Jeon 2018 | RC Beam-Column Joint | 311 | Shear strength and failure mode | Material and geometric properties | Linear, logistic, LASSO, ridge, stepwise and elastic net for regression; KNN, Naïve Bayes, SVM, DT and RF for classification | 70-30 | R2, MSE and absolute error for regression and accuracy for classification |
| Huang and Burton 2019 | RC Frames with Infills | 114 | In-plane failure mode | Infill material and geometric properties, stiffness of the system and axial loading | AdaBoost, DT, logistic, multi-layer perceptron, RF and SVM | 70-30 | Recall score |
| Mangalathu et al. 2020 | RC Core Wall | 393 | Failure mode | Wall geometric properties, material properties of rebar and concrete and design parameters | Naïve Bayes, KNN, DT, RF, AdaBoost, XGBoost, LightGBM and CatBoost | 70-30 | Confusion matrix |
4.3 Information Retrieval using Images, Video and Written Text

Techniques for automatically extracting information from images, video and written text have been broadly applied to several fields, including engineering, medicine and the physical, natural and social sciences. In building SDPA, large numbers of images, written text and (to a lesser extent) videos are often generated during laboratory experiments, field reconnaissance and routine inspections. With systematic collection, curation and organization, ML models can be developed to extract useful information from these three types of media.

Computer Vision (CV) is a sub-category of artificial intelligence that seeks to empower computers to extract meaningful information from images and videos (Szeliski, 2010) [72]. It is worth noting that while ML methods can be incorporated, some CV tasks can be performed using non-ML algorithms. In fact, most of the existing CV applications related to the built environment did not utilize ML algorithms. These studies focused on (i) visual identification and retrieval of concrete (Zhu et al. 2011; German et al. 2012; German et al. 2013; Koch et al. 2014; Koch et al. 2015) [73–77] and steel (Kong and Li, 2018) [78] crack properties (e.g. crack width, length and orientation) and spalling (concrete only) from images and videos, (ii) automatic development of as-built models using images (Brilakis et al. 2011; Koch et al. 2014; Koch et al. 2015) [76,77,79] and (iii) structural component-level damage classification (German et al. 2013; Paal et al. 2015) [75,80].

Unlike the previously mentioned studies, which explicitly make use of designated features for visual content detection/classification, the more modern CV applications, such as the ones summarized in Table 3, utilize ML methods to automatically extract visual features. ML-based computer vision has been used to detect RC cracks and spalling (Cha et al. 2018; Kucuksubasi and Sorguc, 2018; Hoang et al. 2019b) [81–83], detect loosened and corroded steel bolts (Cha et al. 2016; Cha et al. 2018) [81,84] and identify and classify structure and component types as well as the presence and severity of damage (Gao and Mosalam 2018; Gonzalez et al., 2020; Naito et al., 2020) [85–87]. While CNN is the most widely used method among these studies, SVM and logistic regression have also been implemented. The training-testing splits ranged from as high as 90%-10% (in terms of the relative size of the training set) to as low as 2%-98%. It is worth noting that the latter involved a study that utilized transfer learning, which typically does not require large amounts of training data. The confusion matrix, accuracy, precision, recall and the F1 score were used as performance metrics.
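To illustrate why transfer learning eases the training-data burden, the sketch below fine-tunes an ImageNet-pretrained CNN for a binary crack/no-crack classification task. This is a minimal, hypothetical example rather than the pipeline of any reviewed study: the directory layout (data/crack, data/no_crack), image size, network choice (VGG16) and training settings are all assumptions.

```python
# Minimal transfer-learning sketch for crack classification.
# Assumed (hypothetical) layout: data/crack/*.jpg and data/no_crack/*.jpg.
import tensorflow as tf
from tensorflow.keras import layers, models

# Load a CNN pre-trained on ImageNet and freeze its convolutional base,
# so only the small classification head is trained on the limited data.
base = tf.keras.applications.VGG16(include_top=False, weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False

model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # crack vs. no crack
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Labels are inferred from the sub-folder names.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "data", image_size=(224, 224), batch_size=16)
train_ds = train_ds.map(
    lambda x, y: (tf.keras.applications.vgg16.preprocess_input(x), y))
model.fit(train_ds, epochs=5)
```

Because the frozen base already encodes generic visual features, only a few thousand parameters in the head are learned, which is why such models can tolerate very small training sets.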

The Mangalathu and Burton (2019) [45] study, which is also summarized in Table 3, is the only one that utilized text-based media. The authors trained a long short-term memory (LSTM) deep learning model (Graves and Schmidhuber, 2005; Hochreiter and Schmidhuber, 1997) [88,89] to classify building damage based on the ATC-20 categories (red, yellow and green) (ATC, 1995) using natural language damage descriptions as the features. The dataset included 3,423 buildings affected by the 2014 South Napa earthquake, with written documentation of the damage and the assigned ATC-20 tags. A 75%-25% training-testing split was used and the model performance was also assessed using the confusion matrix.
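As a deliberately simplified illustration of this type of text-based model, the sketch below wires a bidirectional LSTM to a pair of placeholder damage descriptions. The vocabulary size, sequence length and example sentences are assumptions, not details of the Mangalathu and Burton (2019) study.

```python
# Hedged sketch of an LSTM classifier mapping damage descriptions to
# ATC-20 tags (0 = green, 1 = yellow, 2 = red); texts/labels are placeholders.
import tensorflow as tf
from tensorflow.keras import layers

texts = ["diagonal cracks in exterior walls", "no visible damage"]  # hypothetical
labels = [2, 0]

# Map raw strings to padded integer token sequences.
vectorize = layers.TextVectorization(max_tokens=5000, output_sequence_length=50)
vectorize.adapt(texts)

model = tf.keras.Sequential([
    vectorize,
    layers.Embedding(input_dim=5000, output_dim=64),
    layers.Bidirectional(layers.LSTM(64)),   # bidirectional LSTM [88]
    layers.Dense(3, activation="softmax"),   # red / yellow / green
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(tf.constant(texts), tf.constant(labels), epochs=3)
```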
Table 3 Summary of ML models developed using images, videos and written text

Study | Structure and/or Component Type | Task | Media Type | ML Algorithm(s) | Training/Testing Split | Performance Metric
Cha et al. 2016 | Steel bolts | Detecting loosened bolts | Images | SVM | ~90-10 | Accuracy
Cha et al. 2018 | Steel and concrete components | Detecting concrete cracks, steel corrosion, bolt corrosion and steel delamination | Images | Faster R-CNN | ~80-20 | Precision
Gao and Mosalam 2018 | Multiple building systems and components | Structural component and system classification; damage type and level classification | Images | CNN | ~80-20 | Confusion matrix
Kucuksubasi and Sorguc 2018 | RC buildings | RC crack detection | Images | CNN | ~2-98 | Accuracy
Hoang et al. 2019b | RC building components | RC spalling detection | Images | Logistic regression | ~90-10 | Recall and F1 score
Mangalathu and Burton 2019 | Multiple building types | ATC-20 tag (red, yellow, green) | Text | RNN | 75-25 | Confusion matrix
Gonzalez et al. 2020 | Multiple building types | Building materials and lateral load resisting system types | Images | CNN | N/A | Precision and recall
Naito et al. 2020 | Multiple building types | Seismic damage level | Images | CNN | N/A | Precision and recall

4.4 Models Developed using Structural Health Monitoring and Field Reconnaissance Data

Structural health monitoring (SHM) and post-event field reconnaissance have been central to the advancement of building SDPA. The data generated from both of these activities provide insights into the performance of different types of structures, especially under extreme loading conditions, and are well-suited for ML applications.

SHM is generally concerned with using various types of sensors to detect the type, location and extent of damage to a structure. Some of the more traditional techniques that have been used to detect damage from SHM data include auto-regressive model fitting (e.g. Sohn and Farrar, 2001) [90], the Fast Fourier Transform (e.g. Lynch, 2002) [91] and wavelet transformation (e.g. Noh et al. 2011; Hwang and Lignos, 2018) [92,93]. Some notable ML-SDPA-SHM studies are summarized in Table 4. A broad range of structure types were considered, including an aluminum frame test specimen (Figueiredo et al. 2011) [94], a steel frame and truss (Ghiasi et al. 2016) [43] and tall RC building structures (Rafiei and Adeli, 2017; Sun et al. 2019) [55,95]. All but one of the studies involved damage detection, localization and/or classification. The Sun et al. study was focused on reconstructing seismic structural responses in tall buildings and was also the only one that utilized regression (i.e. all others incorporated classification). In most cases, the predictor variables (e.g. auto-regressive and frequency-domain parameters, wavelet features) were extracted from accelerometer recordings. However, while the Sun et al. model was developed with the intention of being applied to accelerometer measurements, it was demonstrated using data generated from nonlinear response history analyses. The adopted ML methods include SVM, ANN and kernel ridge. Also, the Rafiei and Adeli (2017) [95] study implemented a neural dynamic algorithm that was previously developed by the second author (Adeli and Park, 1995) [15]. Most studies utilized training-testing splits and two of them (Rafiei and Adeli, 2017; Sun et al. 2019) [55,95] evaluated the effect of the partition-point on model performance. The Figueiredo et al. study was the only one to utilize the receiver operating characteristic (ROC) curve to evaluate model performance. The other studies utilized some of the metrics from prior sections (e.g. accuracy, MARD, mean absolute error).
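A minimal sketch of this common SHM workflow is given below, assuming synthetic stand-ins for the acceleration records: auto-regressive (AR) coefficients are extracted from each signal and fed to an SVM classifier. The AR order, 66-34 split (echoing Ghiasi et al. 2016) and random labels are placeholders, not reproductions of any reviewed study.

```python
# Illustrative sketch: AR coefficients from acceleration records as
# damage-sensitive features for an SVM damage classifier.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def ar_features(signal, order=4):
    """Fit an AR(order) model by least squares; return its coefficients."""
    X = np.column_stack([signal[i:len(signal) - order + i] for i in range(order)])
    y = signal[order:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(0)
signals = [rng.standard_normal(1024) for _ in range(100)]  # placeholder records
labels = rng.integers(0, 2, size=100)                      # 0 = undamaged, 1 = damaged

X = np.array([ar_features(s) for s in signals])
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.34, random_state=0)

clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("test accuracy:", clf.score(X_te, y_te))
```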

The only study to develop ML models using post-event field reconnaissance data (besides the ones mentioned in the earlier sections) is by Mangalathu et al. (2020) [96]. Using a similar dataset from the 2014 South Napa earthquake (discussed in the previous section), the authors developed a second damage classification model based on ATC-20 tags. However, instead of written damage descriptions, this model utilized features related to the building (e.g. age, number of stories), site (closest distance to the surface projection of the fault rupture) and shaking intensity.

Table 4 Summary of ML models developed using structural health monitoring and field reconnaissance data

Study | Structure Type | Task | Predictor Variables | ML Algorithm(s) | Training/Testing Split | Performance Metric
Figueiredo et al. 2011 | Aluminum frame test specimen | Damage detection and classification | Auto-regressive parameters extracted from accelerometer recordings | ANN | NA | ROC curve
Ghiasi et al. 2016 | Steel frame and truss | Damage detection and localization | Wavelet features extracted from accelerometer recordings | SVM | 66-34 | Mean absolute error
Rafiei and Adeli 2017 | RC tall building | Damage classification | Frequency-domain parameters from acceleration recordings | Neural dynamic algorithm | Multiple | Accuracy
Sun et al. 2019 | RC tall building | Structural response prediction | Structure (e.g. wall thickness) and building (e.g. height and location) properties | Kernel ridge and kernel SVM regression | Multiple | MARD
Mangalathu et al. 2020 | Portfolio of buildings | Damage classification | Building (e.g. number of stories) and site (e.g. shaking intensity) properties | DT, RF, KNN and LDA | 70-30 | Accuracy, precision and recall

5. Discussion

As evidenced by the previous section, the application of ML to building SDPA problems has regained significant momentum within the past decade, following its dormancy from the late 1990s to the late 2000s. Most (if not all) of the reviewed studies have been exploratory and there is no evidence that any of the applications have made their way into practice. For ML-SDPA to advance from conception and research into practice, there are several challenges that must be overcome. A synthesis of those challenges as well as opportunities for future work is presented in this section.

5.1 Data

One big contributor to the success of ML in other fields is access to adequate data. Although the amount of data required to achieve reasonable performance for ML models depends on the problem and goal, it is essential to have sufficient high-quality data such that the sampled group represents the true distribution. This enables the adopted ML algorithm(s) to discover underlying patterns and produce predictive models that are truly generalizable within the problem scope. One of the major challenges in ML-SDPA applications is that the datasets are often limited in quantity and diversity. In the studies that sought to predict structural response and performance using ML models, the data was generated from nonlinear response history analyses by the researchers performing the study (e.g. Seo et al. 2012; Moradi and Burton, 2018) [46,53]. However, to the authors' knowledge, none of these datasets have been made publicly available. To have a truly representative dataset of structural response demands, an open access repository should be instituted with rigorous quality control measures. The recently established DesignSafe platform (Rathje et al., 2017) [97] makes the creation of such a repository more feasible. The studies related to automatic information retrieval from visual media and the models developed using field reconnaissance and SHM data face similar challenges with the lack of diversity in the adopted datasets. Resources such as Structural ImageNet (Gao and Mosalam, 2018) [85], the Natural Hazards Engineering Research Infrastructure (NHERI) RAPID facility (https://rapid.designsafe-ci.org/) and the DataCenterHub (http://datacenterhub.org) will, over time, help alleviate this challenge. Despite being relatively small (on the order of hundreds of datapoints), there is more diversity in the datasets that have been generated from physical experiments. In other words, the prior studies in this area have utilized data generated from a broad range of experiments conducted by many researchers (e.g. Jeon et al. 2014; Huang and Burton, 2019; Hoang et al. 2019a) [68,70,98].
One partial solution to the shortage of data from physical experiments is to incorporate domain knowledge within the ML algorithm, which reduces the complexity of the model space and consequently the amount of data needed to achieve good performance. Transfer learning is another technique that can be used to address the data-shortage issue. The basic idea behind transfer learning is that the knowledge acquired from training one model for a specific problem or domain can be "transferred" to another. Additionally, there are procedures such as Monte Carlo simulation, and generative models such as Generative Adversarial Networks (GAN) (Goodfellow et al. 2014) [99] and Variational Autoencoders (VAE) (Kingma and Welling 2014) [100], which can augment existing datasets through the generation of synthetic data. Future efforts should focus on all four of these options: (i) collecting and curating more diverse datasets, (ii) generating synthetic data, (iii) utilizing transfer learning and (iv) incorporating domain knowledge in the design of the ML model.
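The sketch below illustrates the simplest form of Monte Carlo data augmentation, assuming a multivariate Gaussian fitted to a placeholder feature matrix; in more realistic settings a GAN or VAE, or a distribution justified by domain knowledge, would replace the Gaussian.

```python
# Hedged illustration of Monte Carlo data augmentation: fit a simple
# parametric model to the available feature vectors and sample from it.
import numpy as np

rng = np.random.default_rng(0)
X_real = rng.random((120, 5))            # placeholder: 120 experiments, 5 features

# Fit a multivariate Gaussian to the observed features.
mean = X_real.mean(axis=0)
cov = np.cov(X_real, rowvar=False)

# Draw synthetic samples and append them to the real data.
X_synth = rng.multivariate_normal(mean, cov, size=500)
X_augmented = np.vstack([X_real, X_synth])
print(X_augmented.shape)                 # (620, 5)
```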

Another important concern is data quality, which is a common challenge for ML models. Currently, there are no formal methods for collecting and synthesizing datasets generated by the building SDPA community. This lack of systematic curation procedures can lead to issues such as the existence of outliers in the data, which can have an adverse effect on the performance of ML models. This is especially true for ML algorithms such as logistic regression, which are less capable of dealing with noise. There are anomaly detection procedures such as DBSCAN (Ester et al. 1996) [101], K-Means clustering (Lloyd, 1982) [102] and the Z-score (Rousseeuw and Hubert, 2011) [103] that can be used to address outliers. Building SDPA domain-specific procedures should also be implemented. In other words, generic ML data-filtering procedures should be carefully integrated with building SDPA domain knowledge. Ultimately, many of the challenges related to the quality of building SDPA datasets can be addressed if precise collection and processing techniques are established and adopted.
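For illustration, the following sketch applies two of the cited screening procedures, the Z-score rule and DBSCAN, to a placeholder feature matrix. The thresholds (|z| > 3, eps, min_samples) are assumptions that would need domain-specific tuning before use on real SDPA data.

```python
# Sketch of the outlier-screening step discussed above.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 3)), [[8.0, 8.0, 8.0]]])  # one gross outlier

# Z-score rule: flag rows with any standardized feature beyond |z| > 3.
z = StandardScaler().fit_transform(X)
z_outliers = np.any(np.abs(z) > 3, axis=1)

# DBSCAN: points not assigned to any dense cluster receive the label -1.
db_labels = DBSCAN(eps=1.0, min_samples=5).fit_predict(X)
db_outliers = db_labels == -1

print("Z-score flags:", np.where(z_outliers)[0])
print("DBSCAN flags:", np.where(db_outliers)[0])
```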

Once standardized benchmark datasets such as the ones suggested in the previous section have been created, a unified set of performance measures and context-specific thresholds for determining when models are deemed adequate should be developed. This, along with the creation of the ImageNet dataset [104], was a key factor in the success of the field of computer vision.

5.2 Methods

A wide range of algorithms were used in the reviewed ML-SDPA studies. Unfortunately, no consensus or general takeaway about ML method-selection could be inferred from the review. In some studies, the author(s) chose to focus on a single method (e.g. Morfidis and Kostinakis, 2017; Luo and Paal, 2018) [51,65]. However, no clear compelling reason is ever provided for the selected method. Other studies focused on comparing the performance of ML models developed using different methods. However, the findings from these comparative assessments are difficult to generalize because they are very much conditioned on the adopted dataset and on the model training (e.g. whether or not k-fold cross validation is adopted) and testing (e.g. partition point for the training-testing split, performance metric) choices. Future efforts should place a greater focus on analyzing domain-specific characteristics of the adopted datasets and applying knowledge-informed strategies in selecting ML algorithms instead of using a purely performance-driven search. For example, multioutput models are especially useful for predicting backbone curves due to their capability of predicting multiple response variables. Addressing some of the aforementioned challenges with creating systematic and well-curated datasets would also help with the method-selection issue. The advantage of having such benchmark datasets is that a standard dataset will encourage focused attention on integrating domain knowledge and the associated data patterns. Nevertheless, performance-driven model selection is often the ideal solution when there is no sense of how domain knowledge can be incorporated.

An immediate strategy that can be used to guide method-selection is to begin by training and evaluating the performance of a linear (basis function) ML model (OLS, LASSO, ridge) for regression problems and logistic regression for classification problems. With the exception of very specific problems (e.g. computer vision or natural language processing), linear models have been shown to perform reasonably well, while being easy to implement (e.g. Burton et al. 2017; Mangalathu and Jeon, 2018) [27,67] and, more importantly, having high model transparency and interpretability. In the event that the initial linear models do not perform well, they should be further investigated before moving on to more complex models. For example, the poor performance could be the result of the simplicity of the linear ML model or of noisy data. The former situation requires exploring advanced (e.g. non-parametric) models that can capture the complexity of the data, whereas issues of noise in the data can be addressed through filtering.
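A minimal sketch of this "start linear" strategy is given below, assuming a placeholder regression dataset: OLS, LASSO and ridge are benchmarked with 5-fold cross-validation before any more complex model is considered. The regularization strengths are arbitrary illustrative values.

```python
# Benchmark simple linear models with k-fold cross-validation first.
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((200, 8))                                 # placeholder features
y = X @ rng.random(8) + 0.1 * rng.standard_normal(200)   # placeholder response

for name, model in [("OLS", LinearRegression()),
                    ("LASSO", Lasso(alpha=0.01)),
                    ("Ridge", Ridge(alpha=1.0))]:
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R2 = {scores.mean():.3f}")
```

Only if these baselines fail after the data have been checked for noise would the search move on to non-parametric or ensemble methods.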


5.3 Explainability and Interpretability of Machine Learning Models

One of the most significant challenges associated with ML-SDPA models is explaining the feature effects and interpreting the physical meaning of the model parameters. A commonly held view is that ML models, especially the more advanced ones, are black boxes. In other words, it is difficult to extract mechanistic relationships between the input (features) and output (response variables) parameters of data-driven models. One approach to increasing model explainability is to perform feature importance tests to understand the marginal effect of each feature on the response variable, which can then be benchmarked against the fundamental principles that are known to govern the phenomena. Statistical methods such as the F test (e.g. Sun et al. 2019) [55] and analysis of variance (ANOVA) (e.g. Moradi and Burton, 2018) [53] can be used to evaluate the relative strengths of association between features and response variables. In addition, the partial dependence (PD) plot and its variant, individual conditional expectation (ICE) curves, are also widely used [105,106]. Besides these general measures of feature importance, model-specific techniques have been developed, such as class activation mapping (CAM), which visualizes the image regions that a CNN focuses on [107,108].
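The sketch below illustrates these diagnostics with scikit-learn on synthetic data: permutation importance for the marginal contribution of each feature, and a combined PD/ICE display for the shape of a feature effect. The model, features and response are placeholders chosen only for illustration.

```python
# Feature-effect diagnostics: permutation importance and PD/ICE curves.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance, PartialDependenceDisplay

rng = np.random.default_rng(0)
X = rng.random((300, 4))
y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + 0.1 * rng.standard_normal(300)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Marginal contribution of each feature to predictive performance.
imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("permutation importances:", imp.importances_mean.round(3))

# PD curve (average effect) with ICE curves (per-sample effects) for feature 0.
PartialDependenceDisplay.from_estimator(model, X, features=[0], kind="both")
plt.show()
```

The recovered importance ranking and curve shapes can then be compared against the mechanics known to govern the response, as suggested above.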

On the other hand, some recent efforts on the interpretability of ML have demonstrated the benefit of introducing domain knowledge into ML algorithms by incorporating a physics-based loss function. A specific example is to embed hard conditions into the loss function with a Lagrange multiplier (e.g. Karpatne et al. 2017a; Muralidhar et al. 2018) [109,110]. This approach provides a means of explaining part of the ML model by adding a physics-based law to the objective function. In Karpatne et al. (2017b) [111], a spectrum of approaches is discussed whereby the wealth of domain knowledge is leveraged to improve the performance of data-driven models. One recent article in the SHM domain (Zhang and Sun, 2020) [112] combines labeled field observations with unlabeled simulation data using a physics-guided neural network whose loss function contains additional terms that reflect the discrepancy between observed and simulated output. Combining ML and physics-based models remains a challenging problem, especially for the SE community, and will continue to be explored in future research.
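A minimal sketch of such a physics-guided loss is given below, assuming a hypothetical constraint (non-negative predicted response) and placeholder data; lam plays the role of the multiplier weighting the physics penalty against the data-fit term. This is an illustration of the general idea, not the formulation of any cited study.

```python
# Hedged sketch of a physics-guided loss: data-fit term plus a penalty
# that is positive whenever predictions violate a (hypothetical) physical
# constraint, here the requirement that the predicted response be >= 0.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()
lam = 10.0                      # weight (multiplier) on the physics penalty

X = torch.rand(256, 8)          # placeholder features
y = torch.rand(256, 1)          # placeholder responses

for epoch in range(100):
    optimizer.zero_grad()
    pred = model(X)
    data_loss = mse(pred, y)
    # relu(-pred) is zero when pred >= 0, so only violations are penalized.
    physics_loss = torch.relu(-pred).mean()
    loss = data_loss + lam * physics_loss
    loss.backward()
    optimizer.step()
```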



5.4 Overfitted ML Models

Overfitting, which results in inadequate performance outside of the data used for training and/or testing, is a domain-agnostic challenge that is faced by the broader ML community. In Figure 2, assuming f* represents the "true" model in the model space Θ, overfitted models (f̂_o) have high variance and do not generalize well, while underfitted models (f̂_u) have high bias and inferior predictive performance. In most cases there is no so-called "true" model, and the goal of ML is to find a balanced model (f̂_b). Standard ML procedures seek to address the overfitting issue by utilizing training/testing splits, k-fold cross validation, bagging and bootstrapping, as well as other algorithm-specific approaches. For instance, the stochastic procedure used by RF to generate trees was intentionally developed to avoid the overfitting challenge associated with DT. It should be noted that overfitting is associated not only with model training but also with model selection: a sophisticated nonlinear model trained on a dataset with low-dimensional (a small number of) features can also be overfitted. For the SE community, the application of domain knowledge can also help with avoiding overfitting issues. The combination of a data-driven procedure and domain knowledge, similar to the approach used to deal with data sparsity, may prove to be powerful. Although overfitting has been extensively studied in the broader ML community, it could be more critical to SE applications given the complexity of some of the mechanistic relationships that data-driven models attempt to replicate. Consequently, ML-SDPA models often require large amounts of data, better noise filtering processes, and careful tuning to reduce the effects of overfitting.
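The gap between training accuracy and k-fold cross-validated accuracy is a simple overfitting diagnostic. The sketch below contrasts an unconstrained decision tree with a random forest on placeholder data to illustrate the point made above about RF and DT; the dataset and hyperparameters are assumptions.

```python
# Training score vs. 5-fold cross-validated score as an overfitting check.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((300, 10))
y = (X[:, 0] + 0.3 * rng.standard_normal(300) > 0.5).astype(int)

for name, model in [("decision tree", DecisionTreeClassifier(random_state=0)),
                    ("random forest", RandomForestClassifier(random_state=0))]:
    train_acc = model.fit(X, y).score(X, y)     # fit accuracy on training data
    cv_acc = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: train={train_acc:.2f}, 5-fold CV={cv_acc:.2f}")
```

A large train/CV gap (typical for the unconstrained tree, which fits the noise perfectly) signals high variance; the ensemble's averaging narrows that gap.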

Figure 2 Illustrating the tradeoff between bias and variance in machine learning models

6. Conclusion

This paper provides a review of machine learning (ML) applications in building structural design and performance assessment (SDPA). The vulnerability of aged structures under natural hazards and the complexity of modern building systems call for efficient and reliable frameworks for performance assessment, condition monitoring and risk-informed decision making. The increase in computational power in recent years has enhanced the capability of ML in complex applications involving large-scale, high-dimensional nonlinear data. With its advantages in pattern recognition and function approximation, ML offers a natural choice to help address the aforementioned challenges in building SDPA.

In order to provide a good understanding of the building SDPA problems that are suitable for ML applications and the available models for solving specific problems, an overview of the ML methodology is given, followed by a review of the supervised learning algorithms most utilized in the building SDPA literature: Linear Regression, Kernel Regression, Tree-Based Algorithms, Logistic Regression, Support Vector Machine, K-Nearest Neighbors and Neural Networks. Next, the ML applications in previous building SDPA studies are placed into the following four categories and reviewed: (1) predicting structural response and performance, (2) interpreting experimental data and formulating models to predict component-level structural properties, (3) information retrieval using images and written text and (4) recognizing patterns in structural health monitoring data. These successful applications have demonstrated the capability of ML to efficiently extract information from multi-media building SDPA data and assess structural performance.

To bring ML into building SDPA practice, several key challenges need to be addressed. First, adequate high-quality data sources essential for ML model development are currently unavailable within the building SDPA community. Therefore, a unified effort is needed to generate, collect and curate diverse datasets in an open-source repository that can be populated by researchers and practitioners. This effort should also include the creation of benchmark datasets for specific SDPA sub-domains to align and focus research resources. Data augmentation, transfer learning and the reasonable design of ML algorithms with domain knowledge can also help address the data sparsity. Second, previous studies did not establish general guidelines for the selection of ML models. Future studies should incorporate more knowledge-informed selection strategies. As a rule of thumb, initial exploration should focus on simple linear models, which are usually easy to interpret and explain. The complexity of the data space can also inform the model selection. Third, the results from ML models are often difficult to interpret. This can be addressed by using importance testing to better understand the individual effects of features on the response variable. The introduction of physics-based loss functions can offer insight into ML model training and interpretation and can potentially improve robustness. Lastly, overfitting is a significant issue for ML models, especially when attempting to capture complex mechanistic relationships in building SDPA problems. This issue can be further studied by examining the SDPA data space and proposing physics-based validation and evaluation techniques. Future research should also focus on finding ways to combine data-driven procedures with building SDPA domain knowledge, which will serve to boost performance and provide model insights.
Acknowledgements

The research presented in this paper is supported by two National Science Foundation CMMI research grants: No. 1538866 and No. 1554714.

References

1. Murphy KP. Machine learning: a probabilistic perspective. MIT Press; 2012.
2. Friedman J, Hastie T, Tibshirani R. The elements of statistical learning. vol. 1. Springer Series in Statistics, New York; 2001.
3. Adeli H, Yeh C. Perceptron learning in engineering design. Computer-Aided Civil and Infrastructure Engineering 1989; 4(4): 247–256.
4. Hopfield JJ. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences 1982; 79(8): 2554–2558.
5. Vanluchene R, Sun R. Neural networks in structural engineering. Computer-Aided Civil and Infrastructure Engineering 1990; 5(3): 207–215.
6. Rumelhart DE, Hinton GE, Williams RJ. Learning representations by back-propagating errors. Nature 1986; 323(6088): 533–536.
7. Hajela P, Berke L. Neurobiological computational models in structural analysis and design. Computers & Structures 1991; 41(4): 657–667.
8. Ghaboussi J, Garrett Jr J, Wu X. Knowledge-based modeling of material behavior with neural networks. Journal of Engineering Mechanics 1991; 117(1): 132–153.
9. Wu X, Ghaboussi J, Garrett Jr J. Use of neural networks in detection of structural damage. Computers & Structures 1992; 42(4): 649–659.
10. Masri S, Chassiakos A, Caughey T. Identification of nonlinear dynamic systems using neural networks. 1993.
11. Kang HT, Yoon CJ. Neural network approaches to aid simple truss design problems. Computer-Aided Civil and Infrastructure Engineering 1994; 9(3): 211–218.
12. Messner JI, Sanvido VE, Kumara SR. StructNet: A neural network for structural system selection. Computer-Aided Civil and Infrastructure Engineering 1994; 9(2): 109–118.
13. Elkordy M, Chang K, Lee G. A structural damage neural network monitoring system. Computer-Aided Civil and Infrastructure Engineering 1994; 9(2): 83–96.
14. Gunaratnam D, Gero J. Effect of representation on the performance of neural networks in structural engineering applications. Computer-Aided Civil and Infrastructure Engineering 1994; 9(2): 97–108.
15. Adeli H, Park HS. A neural dynamics model for structural optimization—theory. Computers & Structures 1995; 57(3): 383–390.
16. Reich Y. Machine learning techniques for civil engineering problems. Computer-Aided Civil and Infrastructure Engineering 1997; 12(4): 295–310.
17. Buhmann JM, Held M. Unsupervised learning without overfitting: Empirical risk approximation as an induction principle for reliable clustering. International Conference on Advances in Pattern Recognition, Springer; 1999.
18. Huang TM, Kecman V, Kopriva I. Kernel based algorithms for mining huge data sets. vol. 1. Springer; 2006.
19. Viola P, Jones M. Rapid object detection using a boosted cascade of simple features. Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. 1, IEEE; 2001.
20. Lienhart R, Maydt J. An extended set of haar-like features for rapid object detection. Proceedings of the 2002 International Conference on Image Processing, vol. 1, IEEE; 2002.
21. Lowe DG. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 2004; 60(2): 91–110.
22. Dalal N, Triggs B. Histograms of oriented gradients for human detection. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), vol. 1, IEEE; 2005.
23. Bunke O, Droge B. Bootstrap and cross-validation estimates of the prediction error for linear regression models. The Annals of Statistics 1984: 1400–1424.
24. Powers DM. Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. 2011.
25. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 2012.
26. Mack Y, Goel T, Shyy W, Haftka R. Surrogate model-based optimization framework: a case study in aerospace design. Evolutionary Computation in Dynamic and Uncertain Environments, Springer; 2007.
27. Burton HV, Sreekumar S, Sharma M, Sun H. Estimating aftershock collapse vulnerability using mainshock intensity, structural response and physical damage indicators. Structural Safety 2017; 68: 85–96.
28. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society: Series B (Methodological) 1996; 58(1): 267–288.
29. Hoerl AE, Kennard RW. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 1970; 12(1): 55–67.
30. Akaike H. A new look at the statistical model identification. IEEE Transactions on Automatic Control 1974; 19(6): 716–723.
31. Schwarz G. Estimating the dimension of a model. The Annals of Statistics 1978; 6(2): 461–464.
32. Friedman JH. Multivariate adaptive regression splines. The Annals of Statistics 1991: 1–67.
33. Breiman L, Friedman J, Stone CJ, Olshen RA. Classification and regression trees. CRC Press; 1984.
34. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. Springer Science & Business Media; 2009.
35. Freund Y, Schapire RE. A decision-theoretic generalization of on-line learning and an application to boosting. European Conference on Computational Learning Theory, Springer; 1995.
36. Hastie T, Rosset S, Zhu J, Zou H. Multi-class AdaBoost. Statistics and Its Interface 2009; 2(3): 349–360.
37. Abu-Mostafa YS, Magdon-Ismail M, Lin HT. Learning from data. vol. 4. AMLBook, New York, NY, USA; 2012.
38. Breiman L. Random forests. Machine Learning 2001; 45(1): 5–32.
39. Bishop CM. Pattern recognition and machine learning. Springer; 2006.
40. Rosasco L, Vito ED, Caponnetto A, Piana M, Verri A. Are loss functions all the same? Neural Computation 2004; 16(5): 1063–1076.
41. Cortes C, Vapnik V. Support-vector networks. Machine Learning 1995; 20(3): 273–297.
42. Wu YN, Gao R, Han T, Zhu SC. A tale of three probabilistic families: Discriminative, descriptive, and generative models. Quarterly of Applied Mathematics 2019; 77(2): 423–465.
43. Ghiasi R, Torkzadeh P, Noori M. A machine-learning approach for structural damage detection using least square support vector machine based on a new combinational kernel function. Structural Health Monitoring 2016; 15(3): 302–316.
44. Zhang Y, Burton HV. Pattern recognition approach to assess the residual structural capacity of damaged tall buildings. Structural Safety 2019; 78: 12–22.
45. Mangalathu S, Burton HV. Deep learning-based classification of earthquake-impacted buildings using textual damage descriptions. International Journal of Disaster Risk Reduction 2019; 36: 101111.
46. Seo J, Dueñas-Osorio L, Craig JI, Goodno BJ. Metamodel-based regional vulnerability estimate of irregular steel moment-frame structures subjected to earthquake events. Engineering Structures 2012; 45: 585–597.
47. Khojastehfar E, Beheshti-Aval SB, Zolfaghari MR, Nasrollahzade K. Collapse fragility curve development using Monte Carlo simulation and artificial neural network. Proceedings of the Institution of Mechanical Engineers, Part O: Journal of Risk and Reliability 2014; 228(3): 301–312.
48. Jough FKG, Şensoy S. Prediction of seismic collapse risk of steel moment frame mid-rise structures by meta-heuristic algorithms. Earthquake Engineering and Engineering Vibration 2016; 15(4): 743–757.
49. Kiani J, Camp C, Pezeshk S. On the application of machine learning techniques to derive seismic fragility curves. Computers & Structures 2019; 218: 108–122.
50. Mitropoulou CC, Papadrakakis M. Developing fragility curves based on neural network IDA predictions. Engineering Structures 2011; 33(12): 3409–3421.
51. Morfidis K, Kostinakis K. Seismic parameters' combinations for the optimum prediction of the damage state of R/C buildings using neural networks. Advances in Engineering Software 2017; 106: 1–16.
52. Zhang Y, Burton HV, Sun H, Shokrabadi M. A machine learning framework for assessing post-earthquake structural safety. Structural Safety 2018; 72: 1–16.
53. Moradi S, Burton HV. Response surface analysis and optimization of controlled rocking steel braced frames. Bulletin of Earthquake Engineering 2018; 16(10): 4861–4892.
54. Moradi S, Burton HV, Kumar I. Parameterized fragility functions for controlled rocking steel braced frames. Engineering Structures 2018; 176: 254–264.
55. Sun H, Burton H, Wallace J. Reconstructing seismic response demands across multiple tall buildings using kernel-based machine learning methods. Structural Control and Health Monitoring 2019: e2359.
56. Hobbs D. The compressive strength of concrete: a statistical approach to failure. Magazine of Concrete Research 1972; 24(80): 127–138.
57. Bažant ZP, Zebich S. Statistical linear regression analysis of prediction models for creep and shrinkage. Cement and Concrete Research 1983; 13(6): 869–876.
58. Bažant ZP, Chern JC. Bayesian statistical prediction of concrete creep and shrinkage. ACI Journal, Proceedings 1984; 81(4): 319–330.
59. Bažant ZP, Kim JK, Panula L. Improved prediction model for time-dependent deformations of concrete: Part 1—Shrinkage. Materials and Structures 1991; 24(5): 327–345.
60. Bažant ZP, Kim JK. Improved prediction model for time-dependent deformations of concrete: Part 2—Basic creep. Materials and Structures 1991; 24(6): 409.
61. Carpinteri A, Ferro G, Invernizzi S. A truncated statistical model for analyzing the size-effect on tensile strength of concrete structures. Proceedings of the 2nd International Conference on Fracture Mechanics of Concrete Structures, ed. F.H. Wittmann, Zürich, Aedificatio; 1995.
62. Haselton CB, Liel AB, Taylor-Lange SC, Deierlein GG. Calibration of model to simulate response of reinforced concrete beam-columns to collapse. ACI Structural Journal 2016; 113(6).
63. Lignos DG, Krawinkler H. Deterioration modeling of steel components in support of collapse prediction of steel moment frames under earthquake loading. Journal of Structural Engineering 2011; 137(11): 1291–1302.
64. Naeej M, Bali M, Naeej MR, Amiri JV. Prediction of lateral confinement coefficient in reinforced concrete columns using M5′ machine learning method. KSCE Journal of Civil Engineering 2013; 17(7): 1714–1719.
65. Luo H, Paal SG. Machine learning–based backbone curve model of reinforced concrete columns subjected to cyclic loading reversals. Journal of Computing in Civil Engineering 2018; 32(5): 04018042.
66. Luo H, Paal SG. A locally weighted machine learning model for generalized prediction of drift capacity in seismic vulnerability assessments. Computer-Aided Civil and Infrastructure Engineering 2019; 34(11): 935–950.
67. Mangalathu S, Jeon JS. Classification of failure mode and prediction of shear strength for reinforced concrete beam-column joints using machine learning techniques. Engineering Structures 2018; 160: 85–94.
68. Jeon JS, Shafieezadeh A, DesRoches R. Statistical models for shear strength of RC beam-column joints using machine-learning techniques. Earthquake Engineering & Structural Dynamics 2014; 43(14): 2075–2095.
69. Vu DT, Hoang ND. Punching shear capacity estimation of FRP-reinforced concrete slabs using a hybrid machine learning approach. Structure and Infrastructure Engineering 2016; 12(9): 1153–1161.
70. Huang H, Burton HV. Classification of in-plane failure modes for reinforced concrete frames with infills using machine learning. Journal of Building Engineering 2019; 25: 100767.
71. Mangalathu S, Jang H, Hwang SH, Jeon JS. Data-driven machine-learning-based seismic failure mode identification of reinforced concrete shear walls. Engineering Structures 2020; 208: 110331.
72. Szeliski R. Computer vision: algorithms and applications. Springer Science & Business Media; 2010.
73. Zhu Z, German S, Brilakis I. Visual retrieval of concrete crack properties for automated post-earthquake structural safety evaluation. Automation in Construction 2011; 20(7): 874–883.
74. German S, Brilakis I, DesRoches R. Rapid entropy-based detection and properties measurement of concrete spalling with machine vision for post-earthquake safety assessments. Advanced Engineering Informatics 2012; 26(4): 846–858.
75. German S, Jeon JS, Zhu Z, Bearman C, Brilakis I, DesRoches R, et al. Machine vision-enhanced postearthquake inspection. Journal of Computing in Civil Engineering 2013; 27(6): 622–634.
76. Koch C, Paal SG, Rashidi A, Zhu Z, König M, Brilakis I. Achievements and challenges in machine vision-based inspection of large concrete structures. Advances in Structural Engineering 2014; 17(3): 303–318.
77. Koch C, Georgieva K, Kasireddy V, Akinci B, Fieguth P. A review on computer vision based defect detection and condition assessment of concrete and asphalt civil infrastructure. Advanced Engineering Informatics 2015; 29(2): 196–210.
78. Kong X, Li J. Vision-based fatigue crack detection of steel structures using video feature tracking. Computer-Aided Civil and Infrastructure Engineering 2018; 33(9): 783–799.
79. Brilakis I, Fathi H, Rashidi A. Progressive 3D reconstruction of infrastructure with videogrammetry. Automation in Construction 2011; 20(7): 884–895.
80. Paal SG, Jeon JS, Brilakis I, DesRoches R. Automated damage index estimation of reinforced concrete columns for post-earthquake evaluations. Journal of Structural Engineering 2015; 141(9): 04014228.
81. Cha YJ, Choi W, Suh G, Mahmoudkhani S, Büyüköztürk O. Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types. Computer-Aided Civil and Infrastructure Engineering 2018; 33(9): 731–747.
82. Kucuksubasi F, Sorguc A. Transfer learning-based crack detection by autonomous UAVs. arXiv preprint arXiv:1807.11785, 2018.
83. Hoang ND, Nguyen QL, Tran XL. Automatic detection of concrete spalling using piecewise linear stochastic gradient descent logistic regression and image texture analysis. Complexity 2019; 2019.
84. Cha YJ, You K, Choi W. Vision-based detection of loosened bolts using the Hough transform and support vector machines. Automation in Construction 2016; 71: 181–188.
85. Gao Y, Mosalam KM. Deep transfer learning for image-based structural damage recognition. Computer-Aided Civil and Infrastructure Engineering 2018; 33(9): 748–768.
86. Gonzalez D, Rueda-Plata D, Acevedo AB, Duque JC, Ramos-Pollán R, Betancourt A, et al. Automatic detection of building typology using deep learning methods on street level images. Building and Environment 2020: 106805.
87. Naito S, Tomozawa H, Mori Y, Nagata T, Monma N, Nakamura H, et al. Building-damage detection method based on machine learning utilizing aerial photographs of the Kumamoto earthquake. Earthquake Spectra 2020: 8755293019901309.
88. Graves A, Schmidhuber J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 2005; 18(5–6): 602–610.
89. Hochreiter S, Schmidhuber J. LSTM can solve hard long time lag problems. Advances in Neural Information Processing Systems, 1997.
90. Sohn H, Farrar CR. Damage diagnosis using time series analysis of vibration signals. Smart Materials and Structures 2001; 10(3): 446.
91. Lynch JP. Decentralization of wireless monitoring and control technologies for smart civil structures. PhD Thesis. Stanford University, Stanford, CA; 2002.
92. Young Noh H, Krishnan Nair K, Lignos DG, Kiremidjian AS. Use of wavelet-based damage-sensitive features for structural damage diagnosis using strong motion data. Journal of Structural Engineering 2011; 137(10): 1215–1228.
93. Hwang SH, Lignos DG. Assessment of structural damage detection methods for steel structures using full-scale experimental data and nonlinear analysis. Bulletin of Earthquake Engineering 2018; 16(7): 2971–2999.
94. Figueiredo E, Park G, Farrar CR, Worden K, Figueiras J. Machine learning algorithms for damage detection under operational and environmental variability. Structural Health Monitoring 2011; 10(6): 559–572.
95. Rafiei MH, Adeli H. A novel machine learning-based algorithm to detect damage in high-rise building structures. The Structural Design of Tall and Special Buildings 2017; 26(18): e1400.
96. Mangalathu S, Sun H, Nweke CC, Yi Z, Burton HV. Classifying earthquake damage to buildings using machine learning. Earthquake Spectra 2020; 36(1): 183–208.
97. Rathje EM, Dawson C, Padgett JE, Pinelli JP, Stanzione D, Adair A, et al. DesignSafe: new cyberinfrastructure for natural hazards engineering. Natural Hazards Review 2017; 18(3): 06017001.
98. Hoang ND, Tran XL, Nguyen H. Predicting ultimate bond strength of corroded reinforcement and surrounding concrete using a metaheuristic optimized least squares support vector regression model. Neural Computing and Applications 2019: 1–21.
99. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, et al. Generative adversarial nets. Advances in Neural Information Processing Systems, 2014.
100. Kingma DP, Welling M. Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114, 2013.
101. Ester M, Kriegel HP, Sander J, Xu X. A density-based algorithm for discovering clusters in large spatial databases with noise. KDD, vol. 96, 1996.
102. Lloyd S. Least squares quantization in PCM. IEEE Transactions on Information Theory 1982; 28(2): 129–137.
103. Rousseeuw PJ, Hubert M. Robust statistics for outlier detection. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2011; 1(1): 73–79.
104. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L. ImageNet: A large-scale hierarchical image database. 2009 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2009), IEEE; 2009.
105. Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of Statistics 2001: 1189–1232.
106. Goldstein A, Kapelner A, Bleich J, Pitkin E. Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation. Journal of Computational and Graphical Statistics 2015; 24(1): 44–65.
107. Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A. Learning deep features for discriminative localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
108. Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D. Grad-CAM: Visual explanations from deep networks via gradient-based localization. Proceedings of the IEEE International Conference on Computer Vision, 2017.
109. Karpatne A, Atluri G, Faghmous JH, Steinbach M, Banerjee A, Ganguly A, et al. Theory-guided data science: A new paradigm for scientific discovery from data. IEEE Transactions on Knowledge and Data Engineering 2017; 29(10): 2318–2331.
110. Muralidhar N, Islam MR, Marwah M, Karpatne A, Ramakrishnan N. Incorporating prior domain knowledge into deep neural networks. 2018 IEEE International Conference on Big Data (Big Data), IEEE; 2018.
111. Karpatne A, Watkins W, Read J, Kumar V. Physics-guided neural networks (PGNN): An application in lake temperature modeling. arXiv preprint arXiv:1710.11431, 2017.
112. Zhang Z, Sun C. Structural damage identification via physics-guided machine learning: a methodology integrating pattern recognition with finite element model updating. Structural Health Monitoring 2020: 1475921720927488.

Declaration of interests

☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

Han Sun
Research Engineer, Yahoo Research

Henry Burton
Assistant Professor, Department of Civil and Environmental Engineering, University of California Los Angeles

Honglan Huang
Ph.D. Candidate, Department of Civil and Environmental Engineering, University of California Los Angeles
