2022 - Predicting Nominal Capacity of RC Wall in Building
A R T I C L E  I N F O

Keywords:
Reinforced concrete shear wall
Structural mechanics
Shear capacity
Machine learning
Extreme gradient boosting
Metaheuristic optimization
Jellyfish search optimizer
Symbiotic organisms search

A B S T R A C T

Reinforced concrete shear walls are used in many structural systems to resist earthquake loading. In recent earthquakes, shear wall buildings have tended to perform well. Modern building codes include provisions concerning shear capacity, which are recognized for their effectiveness. Studies have demonstrated that the American Concrete Institute (ACI) 318-19 provision uses a low safety factor and does not cover high-strength concrete shear walls, whereas the Eurocode 8 provision is overly conservative. A rational method for predicting shear wall capacity could be used as an alternative to the simplified provisions in the codes. Nevertheless, the use of rational methods may present some difficulties for structural engineers because they require an iterative calculation to determine the peak strengths of shear walls. Accordingly, an appropriate data-driven machine learning scheme that accurately determines shear capacity is needed. Three experimental cases that involve various input variables are adopted herein to train single models and ensemble models. Numerical analytics show that the best result is achieved by using extreme gradient boosting (XGBoost), which involves conventional parameters and synthetic parameters that are inspired by the ACI shear wall strength equation. Subsequently, two metaheuristic optimization algorithms are used to fine-tune the hyperparameters of the generally recognized XGBoost. The two proposed metaheuristically optimized hybrid models, jellyfish search (JS)-XGBoost and symbiotic organisms search (SOS)-XGBoost, outperform the ACI provision equation and grid search optimization (GSO)-XGBoost from the literature in predicting the nominal capacity of reinforced concrete shear walls in buildings. Metaheuristics-optimized machine learning models can be used to improve building safety, simplify a cumbersome shear capacity calculation process, and reduce material costs. The systematic approach utilized herein also serves as a general framework for quantifying the performance of various mechanical models and empirical formulas used in design standards.
1. Introduction
Reinforced concrete (RC) shear walls (SWs) (Fig. 1) are used in many structural systems to resist earthquake loading [1]. In recent
earthquakes, shear wall buildings have generally performed well [2]. Modern building codes, such as American Concrete Institute (ACI) 318-19 and EC-2, include provisions on SW flexural and shear capacity and are recognized for their effectiveness. The mechanism that determines flexural capacity has been adequately explained by flexural theory [3], but the ACI code provisions for shear capacity are relatively unsophisticated [4]. Studies have shown that the ACI 318-19 provision has a low safety factor and does not cover high-strength concrete SWs; moreover, the Eurocode 8 provision is overly conservative [3].

* Corresponding author.
E-mail addresses: jschou@mail.ntust.edu.tw (J.-S. Chou), d10905002@mail.ntust.edu.tw (C.-Y. Liu), d10905818@mail.ntust.edu.tw (H. Prayogo), d10905812@mail.ntust.edu.tw (R.R. Khasani), m10905839@mail.ntust.edu.tw (D. Gho), m10905833@mail.ntust.edu.tw (G.G. Lalitan).
https://doi.org/10.1016/j.jobe.2022.105046
Received 10 June 2022; Received in revised form 22 July 2022; Accepted 25 July 2022
Available online 17 August 2022
2352-7102/© 2022 Elsevier Ltd. All rights reserved.
J.-S. Chou et al., Journal of Building Engineering 61 (2022) 105046
A rational method to predict peak shear wall strength could be used in place of the simplified provisions in building codes.
Nevertheless, the use of rational methods, such as the softened strut-and-tie [5] and truss models [3], may present some difficulties to structural engineers because such methods require a relatively complex calculation to determine the peak strengths of the shear walls.
Hence, an alternative approach that can provide accurate values of shear strength and is simple enough to use is required.
In the last few years, artificial intelligence (AI)-based models have been shown to be effective in predicting the shear strengths of
deep beams [6], soils [7], and concrete columns [8]. The use of a data-driven AI-based model is desirable because of its relative
simplicity and ease of development relative to rule-based/rational models [9]. Furthermore, the end-user of an AI model does not need
to perform complex calculations.
Despite the ease of use of recently developed AI models, developing an accurate AI model is a daunting challenge. The difficulty lies in optimizing the hyperparameters of the AI algorithm. The use of sub-optimal hyperparameters may result in unsatisfactory
model performance [10]. A metaheuristic optimizer can be used to tune the hyperparameters, yielding an optimized AI-based model
that offers improved precision and performance in predicting shear strength of RC walls in buildings. The results of this research can
support building safety and simplify an otherwise tedious shear strength calculation process.
This paper is organized as follows. Section 2 reviews the literature on rational and machine learning methods for predicting the
shear capacity of RC shear walls. Section 3 provides the basics of machine learning techniques, metaheuristic optimization algorithms,
hybrid model construction, and methods for evaluating models. Section 4 describes the collected data, data preprocessing, Pearson’s
correlation analysis, and the setting of hyperparameters of models. Section 5 comprehensively compares prediction models to identify
the best one; this model is then optimized by fine-tuning its hyperparameters using metaheuristic algorithms and optimal hybrid AI
models are thus proposed. The final section draws conclusions.
2. Literature review
Fig. 2 displays a shear wall: a structural wall designed to resist combinations of shear, moments, and axial forces in the plane
of the wall [11]. Reinforced concrete shear walls are often used in high-rise buildings to withstand lateral forces due to wind or
earthquake loads. Shear walls can be grouped into the following three categories [12].
• Short/squat walls: reinforced concrete walls with a height-to-length ratio of less than or equal to two. The failure of squat walls is
generally shear-related and non-ductile.
• Slender/flexural walls: reinforced concrete walls with a height-to-length ratio greater than or equal to three. The behavior of
slender walls tends to be controlled by flexure.
• Intermediate walls: reinforced concrete walls with a height-to-length ratio value between two and three. The behavior of such
reinforced concrete walls is governed by shear and flexure.
The main target of this research is squat walls, which tend to fail under shear rather than flexure. The nominal shear strength of a reinforced concrete shear wall is given by Eq. (1) and shall not exceed $0.66 A_{cv} \sqrt{f'_c}$ in the ACI 318-19 provision [11].

$$V_n = A_{cv}\left(\alpha_c \lambda \sqrt{f'_c} + \rho_t f_y\right) \tag{1}$$

where $V_n$ is the nominal shear strength (resistance); $A_{cv}$ is the gross area of the concrete section bounded by the web thickness and the length of the section in the direction of the shear force; $\alpha_c$ is the height-to-length ratio of the wall; $\lambda$ is a modification factor that reflects the fact that the mechanical properties of lightweight concrete are poorer than those of normal-weight concrete of the same compressive strength; $f'_c$ is the compressive strength of concrete; $\rho_t$ is the ratio of the area of distributed transverse reinforcement to the gross concrete area perpendicular to that reinforcement; and $f_y$ is the yield strength of non-prestressed reinforcement.

Fig. 2. Required design inputs and shear capacity of reinforced concrete shear wall.

Fig. 2 shows those geometric factors and additional mechanical properties that affect the shear strength of reinforced concrete walls, such as the wall height $h_w$, wall length $l_w$, web thickness $t_w$, flange width $b_f$, flange thickness $t_f$, vertical web reinforcement ratio $\rho_v$ and strength $f_{yv}$, horizontal web reinforcement ratio $\rho_h$ and strength $f_{yh}$, longitudinal boundary reinforcement ratio $\rho_L$ and strength $f_{yL}$, applied axial load $P$, aspect ratio $\alpha_c$ (wall height $h_w$ divided by wall length $l_w$), and square root of concrete compressive strength $\sqrt{f'_c}$.
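For concreteness, Eq. (1) and its upper cap can be expressed as a short function. This is an illustrative sketch, not part of the study's code: the function and argument names and the unit choices (mm² and MPa, returning newtons) are assumptions, and $\alpha_c$ and $\lambda$ are simply taken as given coefficients.

```python
import math

def aci_nominal_shear_strength(acv_mm2, alpha_c, lam, fc_mpa, rho_t, fy_mpa):
    """Nominal shear strength Vn per Eq. (1), capped at 0.66*Acv*sqrt(f'c).

    Inputs in mm^2 and MPa; returns Vn in newtons."""
    vn = acv_mm2 * (alpha_c * lam * math.sqrt(fc_mpa) + rho_t * fy_mpa)
    cap = 0.66 * acv_mm2 * math.sqrt(fc_mpa)
    return min(vn, cap)

# Example: 1000 mm x 100 mm web, normal-weight concrete (lambda = 1.0)
vn = aci_nominal_shear_strength(acv_mm2=100_000, alpha_c=0.17, lam=1.0,
                                fc_mpa=30.0, rho_t=0.0025, fy_mpa=420.0)
```

A heavily reinforced wall (large $\rho_t f_y$) simply hits the $0.66 A_{cv}\sqrt{f'_c}$ cap, which is the behavior the provision mandates.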
Following the successful application of machine learning (ML) in the engineering field [13,14], several investigations of the capacity of shear wall structures have been undertaken. Mangalathu et al. [15] developed an ML-based method to predict the failure modes of reinforced concrete shear walls based on experimental data concerning shear walls. Siam et al. [16] applied ML techniques to classify the performance and predict the drift of masonry shear walls. Gondia et al. [17] proposed a genetic programming model to predict the shear strength of squat walls.
Concrete structures are among the most common structures, and owing to their wide range of uses, understanding their behavior is
essential in structural engineering. Artificial Neural Networks (ANNs) [18] and fuzzy systems have been successfully used to estimate
the capacity of structural RC members and to determine the properties of concrete that influence it [19]. Tran, for instance, used ML
models to predict the chloride diffusion coefficient of concrete that contains supplementary cementitious materials, such as silica fume,
ground granulated blast furnace slag, and fly ash [20]. The ML technique represents a new way of predicting the effective stiffness of precast concrete columns with greatly improved accuracy relative to conventional methods and has been used to investigate systematically the effects of design parameters on the stiffness of precast concrete columns [21].
In particular, ANN, Support Vector Regression (SVR) [22], and Random Forest (RF) [23] are widely used to predict the compressive
strength of concrete [6,24], handwritten digit recognition task [25], energy consumption [26–28], and solar energy generation [29].
Studies have shown that the aforementioned models are highly effective for prediction. As the field of ML progresses, some advanced
ensemble models have been developed. One such model is extreme gradient boosting (XGBoost). Most recently, Feng et al. [30] used
XGBoost [31] to improve the prediction accuracy of squat shear wall strength.
XGBoost is extensively used not only in engineering but also in a wide range of other fields, predicting, for example, insurance claims [32] and protein-protein interactions [33]. This widespread use has established that the model is robust and applicable to real-world problems. However, XGBoost has more hyperparameters than other ML techniques, such as ANN, SVR, and RF [32]. The use of a sub-optimal set of hyperparameters may yield unsatisfactory model performance, so developing an optimal XGBoost model is quite challenging.
Metaheuristic algorithms have been shown to be effective for optimizing ML models that are difficult to formulate mathematically (also known as black-box models). Since they are gradient-free optimizers, metaheuristic algorithms have been used to
optimize SVM [34], ANN [35], and other ML model parameters. However, a review of the recent corpus of metaheuristic optimization
has identified the use of many ‘classical’ metaheuristic algorithms, such as Genetic Algorithm (GA) and Particle Swarm Optimization
(PSO) [36–38]. These classical algorithms, while effective, may not offer optimal results efficiently.
The newer metaheuristic algorithms Jellyfish Search (JS) and Symbiotic Organisms Search (SOS) [39,40] are parameter-free and so, unlike GA and PSO, require no parameter fine-tuning. This practical advantage over classical metaheuristic algorithms motivates their use in this study. The following numerical experiments will identify the optimal ML model to predict RC
wall shear strength. The usage of the superior metaheuristic algorithm with the best ML model constitutes a comprehensive framework
for precisely predicting RC shear wall strength.
3. Methodology
The following sub-sections present the machine learning techniques, metaheuristic optimization algorithms, hybrid model
framework, and evaluation methods that are adopted in this research.
An SVR model estimates the regression function

$$f(x) = w^{T}\varphi(x) + b$$

where $f(x)$ is the regression function; $\varphi(x)$ is the kernel function that converts the input data $x$ into a higher-dimensional space; $w$ is the weight vector of the hyperplane; and $b$ is the bias parameter.

The XGBoost prediction is an additive ensemble of decision trees,

$$\hat{y}_i = \sum_{k=1}^{K} \alpha_k f_k(x_i)$$

where $\hat{y}_i$ is the predicted value; $K$ is the maximum tree depth; $f_k(\cdot)$ is the prediction function of a single decision tree; and $\alpha_k$ is a learning rate that is used to avoid overfitting.
XGBoost builds trees by minimizing the following loss function.

$$L_t = \sum_{i=1}^{n} l\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t), \quad \text{where}\ \Omega(f) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} \|w_j\|^2 \tag{4}$$

where $\hat{y}_i^{(t-1)}$ is the prediction of the i-th datum at the (t-1)-th iteration, and $l(\cdot)$ is a function that quantifies the squared difference between the prediction $\hat{y}_i$ and the target $y_i$. The purpose of the rest of the loss function is to determine the appropriate decision function that minimizes the loss. The second term $\Omega(\cdot)$ is a regularization term that penalizes the complexity of the model based on the parameters ($\gamma$ and $\lambda$) used in generating the decision trees; $T$ is the number of leaves in the tree, and $w_j$ is the weight of leaf $j$.

A second-order Taylor approximation can be used to optimize the objective quickly in the general setting; removing the constant terms yields the simplified objective

$$\tilde{L}_t = \sum_{i=1}^{n}\left[g_i f_t(x_i) + \frac{1}{2} h_i f_t^{2}(x_i)\right] + \Omega(f_t) \tag{5}$$

where $g_i = \partial_{\hat{y}^{(t-1)}} l\left(y_i, \hat{y}^{(t-1)}\right)$ and $h_i = \partial^{2}_{\hat{y}^{(t-1)}} l\left(y_i, \hat{y}^{(t-1)}\right)$ are first- and second-order gradient statistics on the loss function, i.e., the first- and second-order partial derivatives with respect to $\hat{y}_i^{(t-1)}$. The optimal weight $w_j$ used in Eq. (4) is calculated using Eq. (6) [31], where $I_j$ is the set of data instances in leaf $j$.

$$w_j = -\frac{\sum_{i\in I_j} g_i}{\sum_{i\in I_j} h_i + \lambda} \tag{6}$$
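For the squared-error loss used in regression, the gradient statistics reduce to $g_i = \hat{y}_i - y_i$ (the negative residual) and $h_i = 1$, so Eq. (6) makes each leaf predict a regularized mean of its residuals. A minimal sketch under that assumption (function names are illustrative):

```python
def leaf_weight(residuals, lam=1.0):
    """Optimal leaf weight w_j from Eq. (6) for squared-error loss,
    where g_i = yhat_i - y_i = -residual_i and h_i = 1."""
    g = [-r for r in residuals]           # residual r_i = y_i - yhat_i
    return -sum(g) / (len(g) + lam)       # Eq. (6): -sum(g) / (sum(h) + lambda)

# With residuals [2, 4] and lambda = 1, the leaf predicts (2 + 4) / (2 + 1) = 2.0
w = leaf_weight([2.0, 4.0], lam=1.0)
```

Note how $\lambda > 0$ shrinks the leaf output toward zero, which is exactly the overfitting penalty described above.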
In building XGBoost trees for regression, the similarity score of each leaf in the tree is calculated using Eq. (7) [31].

$$\text{Similarity score} = \frac{\left(\sum_{i\in I} g_i\right)^{2}}{\sum_{i\in I} h_i + \lambda} \tag{7}$$
Pruning is carried out by calculating the difference between the Gain and a user-defined tree complexity parameter, γ. If the result is positive, the branch is not pruned; otherwise, the branch is pruned (removed). If the branch is pruned, then γ must be subtracted from the Gain of the upper branch. The XGBoost model keeps building new trees until the residuals reach a certain threshold or the trees reach the maximum depth.
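The similarity score and the gain-versus-γ pruning rule can be sketched as follows, again assuming squared-error loss so that residuals stand in for the negative gradients (since Eq. (7) squares the sum, the sign does not matter); the helper names are illustrative:

```python
def similarity(residuals, lam=1.0):
    """Similarity score of a leaf, Eq. (7), for squared-error loss (h_i = 1)."""
    return sum(residuals) ** 2 / (len(residuals) + lam)

def keep_branch(left, right, lam=1.0, gamma=0.0):
    """Gain = sim(left) + sim(right) - sim(parent); prune when Gain - gamma <= 0."""
    gain = (similarity(left, lam) + similarity(right, lam)
            - similarity(left + right, lam))
    return gain - gamma > 0

# A split that separates positive from negative residuals has a large gain,
# so it survives a moderate gamma but is pruned by a very large one.
```

Raising γ therefore prunes more aggressively, which is why it appears later in the tuned hyperparameter search space (Table 5).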
In the JS, movement toward an ocean current is exploration; movements within a jellyfish swarm are exploitation, and a time
control mechanism switches between them. Initially, the probability of exploration exceeds that of exploitation; over time, the
probability of exploitation increases, ultimately becoming much higher than that of exploration. The jellyfish identify the best location
inside the searched areas. Fig. 4 presents the flowchart of the artificial JS optimizer.
3.2.1.1. Population initialization. The logistic map is used to improve the diversity of the initial population [39]. $X_i$ is the logistic chaotic value of the location of the i-th jellyfish; $X_0 \in (0, 1)$, with $X_0 \notin \{0.0, 0.25, 0.5, 0.75, 1.0\}$, is used to generate the initial population of jellyfish; and the parameter $\eta$ is set to 4.0. The population size, nPop, is set according to the complexity of the problem.

$$X_{i+1} = \eta X_i (1 - X_i) \tag{9}$$
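Eq. (9) can be iterated directly to seed the population. The sketch below generates the chaotic values in (0, 1); mapping them onto the actual search bounds per dimension is implied by the algorithm but not shown here, and the starting value 0.37 is an arbitrary admissible choice:

```python
def logistic_population(n_pop, x0=0.37, eta=4.0):
    """Generate n_pop chaotic values via Eq. (9): X_{i+1} = eta * X_i * (1 - X_i).

    x0 must lie in (0, 1) and avoid {0.0, 0.25, 0.5, 0.75, 1.0}, the fixed
    points and pre-images at which the map degenerates."""
    values, x = [], x0
    for _ in range(n_pop):
        x = eta * x * (1.0 - x)
        values.append(x)
    return values

pop = logistic_population(n_pop=20, x0=0.37)
```

At η = 4.0 the map is fully chaotic, which spreads the initial jellyfish more evenly over the search space than uniform random sampling tends to.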
3.2.1.2. Boundary conditions. Oceans are located all around the world and represent the search space of the JS algorithm. The earth is approximately spherical, so when a jellyfish moves outside the bounded search area, meaning that it moves across the Earth's North (South) Pole, it is assumed to return to the opposite bound. Equation (10) presents this re-entry process, where $X_{i,d}$ is the location of the i-th jellyfish in the d-th dimension [39]; $X'_{i,d}$ is the updated location after the boundary constraints have been imposed; and $U_{b,d}$ and $L_{b,d}$ are the upper and lower bounds of the search space in the d-th dimension, respectively.

$$X'_{i,d} = \begin{cases} \left(X_{i,d} - U_{b,d}\right) + L_{b,d} & \text{if } X_{i,d} > U_{b,d} \\ \left(X_{i,d} - L_{b,d}\right) + U_{b,d} & \text{if } X_{i,d} < L_{b,d} \end{cases} \tag{10}$$
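The re-entry rule of Eq. (10) is a one-line wrap per dimension. A minimal sketch (the function name is illustrative, and a single application assumes the overshoot is smaller than the search range):

```python
def reenter(x, lb, ub):
    """Eq. (10): a jellyfish leaving one bound re-enters from the opposite bound."""
    if x > ub:
        return (x - ub) + lb   # overshoot past the upper bound wraps to the lower
    if x < lb:
        return (x - lb) + ub   # overshoot past the lower bound wraps to the upper
    return x                   # inside the bounds: unchanged
```

Unlike simple clipping, this wrap-around preserves the size of the step, so the swarm does not pile up on the boundary.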
3.2.1.3. Ocean current. The ocean current contains large amounts of nutrients, so the jellyfish are attracted to it. The direction of the ocean current, $\overrightarrow{trend}$, is determined by averaging all of the vectors from each jellyfish in the ocean to the jellyfish that is currently in the best location (Eq. (11)). $X^*$ is the location of the jellyfish that currently has the best location in the swarm; $\mu$ is the mean location of all jellyfish; $X_i(t+1)$ is the new location of each jellyfish (Eq. (12)); and $\beta > 0$ is a distribution coefficient related to the length of $\overrightarrow{trend}$. From the results of a sensitivity analysis [39] based on numerical experiments, $\beta = 3$ is obtained.

$$\overrightarrow{trend} = X^* - \beta \times rand(0, 1) \times \mu \tag{11}$$
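The ocean-current step can be sketched as below. Eq. (12) falls on a page lost in extraction, so the update $X_i(t+1) = X_i(t) + rand(0,1)\cdot\overrightarrow{trend}$ is stated here as an assumption following the original JS paper [39]; the sphere fitness used to pick the best jellyfish is a placeholder for the study's actual objective (the cross-validated RMSE):

```python
import random

def ocean_current_step(positions, beta=3.0, rng=random):
    """Move each jellyfish toward the ocean current, Eqs. (11)-(12):
    trend = X* - beta * rand(0,1) * mu,  X_i(t+1) = X_i(t) + rand(0,1) * trend."""
    dim = len(positions[0])
    best = min(positions, key=lambda p: sum(v * v for v in p))  # placeholder fitness
    mu = [sum(p[d] for p in positions) / len(positions) for d in range(dim)]
    moved = []
    for p in positions:
        trend = [best[d] - beta * rng.random() * mu[d] for d in range(dim)]
        moved.append([p[d] + rng.random() * trend[d] for d in range(dim)])
    return moved

swarm = [[0.1, 0.2], [0.4, 0.4], [0.9, 0.8]]
new_swarm = ocean_current_step(swarm)
```

In a full JS implementation, each moved position would then be passed through the Eq. (10) re-entry rule.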
3.2.2.4. Parasitism phase. Parasitism benefits only one organism, while the other organism suffers a disadvantage from the interaction. In SOS, organism Xi is given a role similar to that of a parasite by creating a 'Parasite_Vector', which is produced in the search space by duplicating organism Xi and modifying it in randomly selected dimensions. Organism Xj is selected at random from the ecosystem and acts as a host to the parasite vector. The fitness values of both organisms are then evaluated. If Parasite_Vector has a better fitness value than organism Xj, it replaces Xj in the ecosystem [40].
3.2.2.5. Termination criteria. The SOS algorithm is iteratively implemented until pre-specified termination criteria are satisfied. The
termination criteria that are used in metaheuristic applications usually involve the maximum number of iterations (max_iter) or the
number of fitness evaluations (FE). Once the algorithm is terminated, a solution is presented.
Table 1
ANN, SVR, RF, and XGBoost parameter settings.

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^{2}} \tag{21}$$

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{22}$$

$$MAPE = \frac{100\%}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{23}$$

where $n$ is the number of observations; $y_i$ is the actual value of the i-th observation; and $\hat{y}_i$ is the predicted value of the i-th observation.
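Eqs. (21)–(23) translate directly into code. A minimal stdlib sketch (function names are illustrative; a library such as scikit-learn provides equivalent metrics):

```python
import math

def rmse(y, yhat):
    """Root mean squared error, Eq. (21)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y))

def mae(y, yhat):
    """Mean absolute error, Eq. (22)."""
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def mape(y, yhat):
    """Mean absolute percentage error in %, Eq. (23); assumes no y_i is zero."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)

# For y = [100, 200] kN and predictions [110, 190] kN:
# RMSE = 10 kN, MAE = 10 kN, MAPE = 7.5%
```

Note that MAPE weights errors on small walls more heavily than RMSE does, which is why the two metrics can rank models differently in the tables that follow.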
Table 2
Selected feature factors in three cases.
Data exploration is conducted to understand and visualize data; this step helps to identify patterns in a dataset. Data exploration has three primary goals: to identify the characteristics of single variables, to reveal patterns in data distributions, and to determine relationships between variables. Visualization methods graphically represent data in graphs and charts to facilitate understanding of complex structures and relationships within the data. Table 3 presents the statistical description of each variable, and Fig. 8 displays a visualization of the relevant statistical distributions. These variables clearly display non-normal and irregular distributions, supporting the need to determine the input-output relationship by advanced data analytics.
The input variables are the wall height $h_w$, wall length $l_w$, web thickness $t_w$, flange width $b_f$, flange thickness $t_f$, concrete compressive strength $f'_c$, vertical web reinforcement ratio $\rho_v$ and strength $f_{yv}$, horizontal web reinforcement ratio $\rho_h$ and strength $f_{yh}$, longitudinal boundary reinforcement ratio $\rho_L$ and strength $f_{yL}$, applied axial load $P$, gross area of concrete section $A_{cv}$, aspect ratio $\alpha_c$, and square root of concrete compressive strength $\sqrt{f'_c}$. The output is simply the nominal shear capacity of the squat wall, $V_n$. From Feng et al., 13 input variables and one output are used, and three new input variables, $A_{cv}$, $\alpha_c$, and $\sqrt{f'_c}$, are added in this study (Table 2). A total of 492 data with 16 input variables and one output variable are thus involved. The entire database is randomly split into training (70%) and testing (30%) sets.
Pearson's correlation coefficient is used to measure the linear correlation between two variables. Its value lies between -1 and +1, where -1 indicates a perfect negative linear correlation, 0 indicates no linear correlation, and +1 indicates a perfect positive linear correlation. A coefficient with a magnitude between 0.50 and 1 is said to indicate a strong correlation. Fig. 9 presents a heatmap of the correlation coefficients between pair-wise variables. The newly added variables based on the ACI equation are seen to be strongly correlated with existing variables. For example, the gross area of concrete section $A_{cv}$ is strongly correlated with seven variables: $h_w$, $l_w$, $t_w$, $t_f$, $b_f$, $P$, and $V_n$. The aspect ratio $\alpha_c$ is strongly correlated with wall height $h_w$. The square root of concrete strength $\sqrt{f'_c}$ is strongly correlated with three variables: $f'_c$, $f_{yL}$, and $P$. The output variable $V_n$ is strongly correlated with six variables: $l_w$, $t_w$, $t_f$, $b_f$, $P$, and $A_{cv}$.
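The coefficient underlying the heatmap is straightforward to compute. A stdlib sketch (in practice a call such as pandas' `DataFrame.corr()` would produce the full pair-wise matrix; the function name here is illustrative):

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient r between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# A |r| between 0.50 and 1.0 is treated as a strong correlation in this study.
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly linear relationship
```

Because $A_{cv}$, $\alpha_c$, and $\sqrt{f'_c}$ are derived from the geometric inputs, the strong correlations reported above are expected by construction.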
5.4. Integration of machine learning and optimization algorithm to establish hybrid models

Two recently developed optimization algorithms, jellyfish search (JS) and symbiotic organisms search (SOS), are integrated with the best model (XGBoost), identified in the preceding section, to establish hybrid models, JS-XGBoost and SOS-XGBoost, for the prediction of shear wall strength. The search space for the various hyperparameters of XGBoost, including eta, gamma, max_depth, min_child_weight, max_delta_step, subsample, colsample_bytree, lambda, and alpha, is defined as a range of values taken from the literature [52] and provided in Table 5. The search range and hardware devices used in this study can serve as a reference, although a larger search range should be considered in future work. In the SOS algorithm, the objective function is evaluated four times per iteration: twice in the mutualism phase and once each in the commensalism and parasitism phases. In the JS algorithm, the objective function is evaluated only once per iteration. To make a fair comparison between the optimization algorithms, the initialization control parameters for JS and SOS are given in Table 6.

Table 3
Variables in database of shear strength of squat walls.
The objective function of the metaheuristic algorithms is the minimization of RMSE (kN). The training dataset is used to optimize the hyperparameter settings, and ten-fold cross-validation is adopted to evaluate the performance (fitness) of each candidate model. After the hybrid models have been optimized, the test dataset is used to evaluate prediction performance. As the generalizability of the resulting model must be considered, the hybrid models are run five times to eliminate random bias in the population initialization of the JS and SOS algorithms. Three shear wall datasets are used to evaluate the hybrid models.
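The fitness evaluated by JS and SOS can be sketched as a k-fold cross-validated RMSE over the training set. The function and parameter names below are illustrative, and the trivial mean predictor merely stands in for fitting XGBoost with a candidate hyperparameter vector from the Table 5 search space:

```python
import math

def kfold_rmse(x, y, train_and_predict, k=10):
    """Fitness used by the metaheuristics: mean k-fold cross-validated RMSE.

    `train_and_predict(x_tr, y_tr, x_va)` stands in for training XGBoost with
    a candidate hyperparameter vector and predicting the validation fold."""
    n = len(y)
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    scores, start = [], 0
    for size in fold_sizes:
        stop = start + size
        x_tr, y_tr = x[:start] + x[stop:], y[:start] + y[stop:]
        preds = train_and_predict(x_tr, y_tr, x[start:stop])
        scores.append(math.sqrt(
            sum((p - t) ** 2 for p, t in zip(preds, y[start:stop])) / size))
        start = stop
    return sum(scores) / k          # the metaheuristic minimizes this value

# Trivial stand-in model: always predict the training-fold mean.
mean_model = lambda x_tr, y_tr, x_va: [sum(y_tr) / len(y_tr)] * len(x_va)
cv = kfold_rmse(list(range(20)), [1.0] * 20, mean_model, k=10)
```

Each JS or SOS candidate position is decoded into an XGBoost hyperparameter set, scored with this fitness, and the best-scoring set is finally refit on the whole training set before test evaluation.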
In the first case, in which only the 13 factors from Feng et al. (2021) are used, JS-XGBoost exhibits the best average performance
with R2, RMSE, MAE, and MAPE values of 0.976, 104.46 kN, 64.06 kN, and 16.92% respectively, over five runs. Table 7 compares the
performance measures of hybrid models. Both JS-XGBoost and SOS-XGBoost achieve better RMSE values than the original XGBoost. An
average RMSE of 104.46 kN was obtained using JS-XGBoost in five runs. This result is 3.22 kN better than the 107.68 kN that was
obtained using XGBoost with the default hyperparameters.
Table 4
Comparison of performance measures for single and ensemble models.
Table 5
XGBoost tuning hyperparameters.
Parameter Lb Ub Description
Fig. 11 presents the convergence histories of the RMSE for JS and SOS; the vertical axis represents RMSE (kN). SOS yields a better optimal solution than JS in four of the five runs of the optimization process. Furthermore, the optimization runtime of SOS was 1.5 min shorter than that of JS on average. Nevertheless, better convergence efficiency of the objective function optimizer may not translate into better performance as measured by the test metrics of interest.
In the second case, five variables that were identified in the ACI equation for the shear strength of a structural wall, Eq. (1), are
Table 6
Control parameters in metaheuristic algorithms.
SOS 50 20
JS 50 80 (20 × 4)
Table 7
Comparison of performance measures of hybrid models in case 1.
Fig. 11. Convergence histories of RMSE (kN) for JS and SOS in case 1.
Fig. 12. Convergence histories of RMSE (kN) for JS and SOS in case 2.
used. Fig. 12 shows that SOS converges faster than JS and attains a smaller objective function value, but the best algorithm cannot be determined from the convergence history alone. The best model is identified by comparing multiple performance measures of the models, as in Table 8. Neither of the hybrid models (JS-XGBoost and SOS-XGBoost) outperforms the ensemble models RF and XGBoost (Table 4). However, a comparison between only JS-XGBoost and SOS-XGBoost demonstrates that SOS-XGBoost performs better in all respects except R2, on which the two are equal. The results in case 2 indicate that with fewer factors, ensemble models are better than single and hybrid models for predicting shear wall strength.
Additionally, this study used the ACI 318-19 provision equation, Eq. (1), to calculate the shear strength. The resulting R2, RMSE, MAE, and MAPE values are 0.853, 393.725 kN, 233.755 kN, and 34.736%, as shown in Table 8. Under the same features in case 2, SOS-XGBoost achieved mean R2, RMSE, MAE, and MAPE values of 0.953, 145.03 kN, 78.99 kN, and 19.55%. The proposed SOS-XGBoost thus yields a 63.16% improvement in RMSE and a 43.72% improvement in MAPE relative to the ACI 318-19 provision equation in shear capacity estimation.
In the last case, with 16 factors, JS-XGBoost yields R2, RMSE, and MAE values similar to those of SOS-XGBoost, as shown in Table 9. Both JS-XGBoost and SOS-XGBoost achieve better performance than the original XGBoost with the default hyperparameters on all metrics except MAPE. The efficacies of the two hybrid algorithms are comparable, and both outperform the original XGBoost in case 3.
Fig. 13 displays the convergence histories of the RMSE of JS and SOS for case 3; the vertical axis represents RMSE (kN). SOS obtains a better optimal solution and converges faster than JS in all five runs of the optimization process. Additionally, the optimization runtime of SOS was 1.43 min shorter than that of JS, as seen in Table 9. This result confirms that better convergence of the objective function optimizer does not necessarily yield better performance on the test metrics of interest.
Consistent with the “no free lunch” concept, the hyperparameters of machine learning models are often set case by case. Table 10
displays the settings of the hyperparameters of JS-XGBoost and SOS-XGBoost in case 3, demonstrating the challenge of manually
setting various algorithm parameters to determine the optimal models.
5.6. Discussion
The analytical results in case 3 are illustrative because this case uses all of the original and synthetic features (Table 2) to predict shear wall strength more accurately than the other two cases. Fig. 15 presents a histogram that compares the MAPEs of the best single, ensemble, and hybrid models: SVR (32.87%), XGBoost (13.22%), and JS-XGBoost (11.57%). The MAPE value of JS-
Table 8
Comparison of performance measures of hybrid models in case 2.
Table 9
Comparison of performance measures of hybrid models in case 3.
Fig. 13. Convergence histories of RMSE (kN) for JS and SOS in case 3.
Table 10
Tuning of hyperparameters of hybrid models in case 3.
Hyperparameter Feasible range Default setting in Python Optimal value in JS-XGBoost Optimal value in SOS-XGBoost
Fig. 14. Convergence histories of RMSE (kN) for JS and SOS with original data.
XGBoost is 12.48% and 64.8% better than those of XGBoost and SVR, respectively.
For XGBoost in cases 1 to 3 (Table 4), the maximum difference in error rates between the training and test datasets is 14.17%, revealing that XGBoost may suffer from slight overfitting, as shown in Fig. 16. Notably, after the hyperparameters of XGBoost were fine-tuned by integrating it with the JS and SOS algorithms, the variability of its MAPE values was reduced. For JS-XGBoost, the difference between the error rates of the training and test datasets is 11.23% in case 1 and 9.09% in case 3. For SOS-XGBoost, the difference is 11.09% in case 2. The results imply two possible ways of mitigating overfitting in the shear wall datasets: (1) using metaheuristic optimization algorithms to fine-tune the XGBoost hyperparameters, and
Table 11
Comparison of performance measures for hybrid models using original dataset.
Fig. 15. Histogram of MAPEs for best single, ensemble, and hybrid models in case 3.
Fig. 16. Histogram of MAPEs for best XGBoost and metaheuristics-optimized XGBoost models.
(2) increasing the number of input features according to the ACI provision equation.
The tendency of the ACI code to underestimate nominal shear capacity relative to that predicted by metaheuristics-optimized ML
raises issues of design quality and safety, as shown in Fig. 17. Notably, the values of nominal shear capacity that are predicted using JS-
XGBoost in case 3 tend to be overestimated in the middle-strength range (680–1000 kN), yielding a larger predictive variability than
that of the ACI code; they tend to be underestimated at higher strengths (above 2000 kN). The overall accuracy of JS-XGBoost exceeds
that of the ACI code. For the purposes of design and quality control, further research into data quality, ML model improvements, and
the use of outputs from the prediction system must be carried out.
Fig. 17. Scatter plot of actual versus predicted/calculated nominal shear capacities.
6. Conclusion
Accurately calculating the shear capacity of an RC wall is complex and time-consuming. Thus, most engineers use simplified
equations for design purposes. However, the simplified equations in current building codes have low safety factors and do not cover
new materials. Accordingly, an appropriate machine learning scheme that can accurately determine peak shear strength is needed.
This investigation considered three cases. In the first case, the models use all of the original variables from the literature; in the second, they use synthetic variables derived from the ACI equation for the shear strength of a wall; and in the third, they use a combination of original and synthetic variables.
Among the single and ensemble models (ANN, SVR, RF, and XGBoost) evaluated in these three cases, XGBoost performed best, with R2, RMSE, MAE, and MAPE values of 0.978, 99.36 kN, 59.96 kN, and 13.22%, respectively, in case 3. We conclude that incorporating the synthetic features improved the results over those obtained using only the original variables (case 1) or only the ACI variables (case 2). Overall, the ensemble models outperform the single models. In particular, XGBoost exhibits greater predictive accuracy when more factors are involved, and RF is effective even when the number of factors is small.
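The four performance measures reported above can be computed as follows; these are the standard definitions, and minor implementation details (such as MAPE being expressed as a percentage) are assumptions rather than the paper's exact code.

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute R^2, RMSE, MAE, and MAPE (in %) for a set of predictions.
    Standard textbook definitions; assumed to match the study's usage."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mape = 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae, mape

# Toy example with shear capacities in kN
r2, rmse, mae, mape = regression_metrics([500, 800, 1200, 2000], [520, 780, 1250, 1900])
```

Note that RMSE and MAE carry the units of the target (kN here), whereas R2 and MAPE are dimensionless, which is why the study reports all four together.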
To further increase predictive accuracy, the JS and SOS metaheuristic optimization algorithms were integrated into XGBoost to fine-tune its hyperparameters, and both algorithms yielded better predictive accuracy than the grid search results taken from the literature. The best RMSE value, 85.19 kN, was obtained using JS-XGBoost over five runs in case 3 with 16 factors; this value is 14.17 kN better than the 99.36 kN obtained using XGBoost with the default hyperparameters, representing a 14.26% improvement in shear strength predictive performance. Furthermore, SOS-XGBoost improves the RMSE by 63.16% over the ACI provision equation in estimating shear wall capacity (case 2), allowing the designed size of a shear wall or the concrete strength to be reasonably reduced while meeting the building codes and saving construction and material costs. The efficacies of the two algorithms are similar, and both yield better results than the original XGBoost; the JS algorithm finds the best solutions, whereas SOS performs more stably.
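The metaheuristic tuning loop can be sketched generically. The code below is a minimal population-based search under stated assumptions: `validation_rmse` stands in for training an XGBoost model with the candidate hyperparameters and scoring it on a validation split, the search bounds are illustrative, and a simple perturb-and-keep rule stands in for the far more structured JS (ocean-current and swarm motions) and SOS (mutualism, commensalism, parasitism phases) update rules.

```python
import random

# Hyperparameter search space.  Names follow common XGBoost parameters;
# the bounds are illustrative assumptions, not the paper's exact ranges.
BOUNDS = {"learning_rate": (0.01, 0.5),
          "max_depth": (2, 10),
          "subsample": (0.5, 1.0)}

def validation_rmse(params):
    """Stand-in objective.  In the actual workflow this would train an
    XGBoost model with `params` and return its validation RMSE; a smooth
    synthetic function keeps the sketch self-contained and runnable."""
    return ((params["learning_rate"] - 0.1) ** 2
            + 0.01 * (params["max_depth"] - 6) ** 2
            + (params["subsample"] - 0.9) ** 2)

def tune(pop_size=10, iters=50, seed=7):
    """Minimal population-based metaheuristic loop: sample an initial
    population, then repeatedly perturb the incumbent best within bounds
    and keep any improvement.  (In real use, max_depth would be rounded
    to an integer before training.)"""
    rng = random.Random(seed)
    sample = lambda: {k: rng.uniform(*b) for k, b in BOUNDS.items()}
    best = min((sample() for _ in range(pop_size)), key=validation_rmse)
    for _ in range(iters):
        cand = {k: min(max(v + rng.gauss(0, 0.1 * (BOUNDS[k][1] - BOUNDS[k][0])),
                           BOUNDS[k][0]), BOUNDS[k][1])
                for k, v in best.items()}
        if validation_rmse(cand) < validation_rmse(best):
            best = cand
    return best

best_params = tune()
```

Because the loop only ever accepts improvements, the returned configuration is guaranteed to score no worse than the best initial sample; the JS and SOS moves replace the naive Gaussian perturbation to balance exploration and exploitation across the whole population.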
In the three cases of interest, the hybrid models generally exhibit more reliable predictive performance than the single and ensemble models. In particular, JS-XGBoost achieves the best performance, with a MAPE of 11.57% in case 3, although some hybrid models perform worse than the ensemble models in case 2. Based on the analytical results, this study recommends using metaheuristics-optimized XGBoost for the shear strength prediction of RC walls in buildings. The value of this research warrants examination by other researchers in other contexts or with other hybrid models. The proposed machine learning framework favors building safety and simplifies a cumbersome shear capacity calculation process. It can also be used as a general tool for quantifying the performance of various mechanical models and empirical formulas in design standards.
Replication of results
The datasets (Appendix 1), codes, and results that were generated and/or analyzed during the current study are available from the
corresponding author upon reasonable request. Appendix 2 provides details of the best metaheuristics-optimized XGBoost models for
each case study, which may be of benefit to researchers and practitioners in the future.
Acknowledgements
The authors would like to thank the National Science and Technology Council, Taiwan, for financially supporting this research
under contract NSTC 110-2221-E-011-080-MY3.