Naser 2024 — Integrating Machine Learning Models Into Building Codes and Standards: Establishing Equivalence Through …
Abstract: The traditional approach to formulating building codes often is slow and labor-intensive, and may struggle to keep pace with the
rapid evolution of technology and domain findings. Overcoming such challenges necessitates a methodology that streamlines the mod-
ernization of codal provisions. This paper proposes a machine learning (ML) approach to append a variety of codal provisions, including
those of empirical, statistical, and theoretical natures. In this approach, a codal provision (i.e., equation) is analyzed to trace its properties
(e.g., engineering intuition and causal logic). Then a ML model is tailored to preserve the same properties and satisfy a collection of
similarity and performance measures until declared equivalent to the provision at hand. The resulting ML model harnesses the predictive
capabilities of ML while arriving at predictions similar to those of the codal provision used to train it, and hence can be used in lieu of the codal expression. This approach was examined successfully for seven structural engineering phenomena contained within
various building codes, including those in North America and Australia. The findings suggest that the proposed approach could lay the
groundwork for implementing ML in the development of future building codes. DOI: 10.1061/JSENDH.STENG-12934. © 2024 American
Society of Civil Engineers.
Author keywords: Building codes; Machine learning; Prediction.
Perhaps of most importance is the lack of transparency of some ML models. This opaqueness can pose difficulties when attempting to integrate ML models into building codes, because these codes need to have clearly defined and understandable criteria so that engineers can follow and verify compliance (Naser and Ross 2022). Although the data-driven nature of ML is its strength, the value of engineering principles, intuition, and domain knowledge cannot be overlooked—especially in our field, in which decisions can impact occupants' safety (Naser 2021a).

Understandably, engineers may remain skeptical of ML models due to the aforementioned challenges and the absence of a regulatory framework that allows such models to be included in building codes. This absence of a framework means there are no defined standards to follow, which makes it difficult for engineers to fully trust and adopt ML models. For example, engineers might be wary of using a ML model if there is not a universally accepted standard for such models. Given the absence of a structured way to integrate ML models into the design and construction processes outlined by the building codes, engineers often opt for traditional, code-approved methods—even in scenarios requiring performance-based principles or out-of-the-box solutions.

On a more positive note, the rapid advancements in ML have allowed the integration of eXplainable AI (XAI), which can help create more reliable and interpretable models that could be adopted into building codes. This concept refers to methods and techniques that allow users (i.e., engineers) to understand ML models' decision-making processes (Emmert-Streib et al. 2020). For example, if a ML model could explain that it predicts a specific outcome (e.g., sectional capacity) based on specific characteristics of the building design (such as material properties and geometric features), engineers would be more confident integrating these insights into their work than if the ML were a traditional black-box model. Simply put, if a ML model can explain its decisions in a way that aligns with the stipulations of codal provisions, this could significantly enhance the potential for ML integration into these codes.

Providing transparency and a deeper understanding of the underlying relationships in the data can bridge the gap between the dynamic, data-driven world of ML and the static, calmer nature of building codes. This paper proposes such a methodology. In this methodology, the first stage analyzes a codal equation to trace its properties (e.g., logic). Then a ML model is tailored to learn and map the same logic—thereby transforming into a higher-order model that preserves the same properties of the codal equation. Finally, the ML model is tuned on big data to achieve superior performance, given the ability of ML to accommodate highly nonlinear relations. Our findings indicate that the proposed approach could set the stage for ML implementation in future building codes. Although significant research and development are needed to refine these techniques and to demonstrate their reliability and effectiveness in the context of civil engineering, with the involvement of regulatory and code-developing organizations, this work and its findings could be an important step toward such an effort.

Traditionally, a codal provision begins with a testing campaign; the data from such an investigation then are analyzed via statistical regression. To a certain extent, civil engineers favor using linear and nonlinear regression (Benjamin and Cornell 2014). Both forms of regression comprise a number of assumptions that the data at hand need to satisfy, as well as assumptions pertaining to preselected parametric forms (e.g., the use of linear regression implies the validity of a linear and additive nature of features, and so forth), some of which may or may not be satisfied.

Evidently, a proper statistical regression analysis boils down to finding the best coefficients that can improve the accuracy of the preselected parametric form from the available test or simulation data (Ribarič et al. 2008). Such an expression is treated further to attain the expected reliability measures inherently required by a building code's technicalities. When such an expression is fully verified, vetted, and voted upon, it is likely to be incorporated into a guiding document. This procedure has been duly noted in the literature; one recent example is that developed by Kuchma et al. (2019), credited with updating the one-way shear design provisions for RC in the latest ACI 318 code (ACI 2019) adopted by the American Concrete Institute.

Depending on the comprehensiveness of a testing campaign, it is quite likely that additional tests or numerical simulations can be added to supplement the available data. For example, Franssen and Dotreppe (2003) followed a similar approach when developing the basis for Eurocode 2's assessment method for the fire resistance of columns. Similarly, Wade et al. (1997) combined tests with numerical simulations to derive an equation to evaluate the fire resistance of RC columns for the Australian building code AS 3600 (Standards Australia 1994). Other examples were presented by Feldman and Bartlett (2005) and Ollgaard et al. (1971).

This paper argues that, whether or not first principles are available, and when the primary motivation behind utilizing statistical-based expressions in building codes is their capacity to best fit existing data, it is of merit to utilize higher-order models (such as tree-based and neural network models, and so forth) that are capable of fitting the data more accurately than their traditional counterparts. Because such models can better handle high-dimensional data, enjoy nonparametric freedom, and adhere less strictly to data-related assumptions, this method is well placed to create suitable codal provisions.

A supporting philosophical view is that, for the same data and features, the statistical-based procedure of iteratively fitting a codal expression by best fitting the coefficients of its predefined form can be viewed similarly to training a ML model to learn patterns in the available data (Fig. 1). A primary difference is that the available data tune the coefficients of the already-defined parametric form for the former and the algorithmic logic (internal model structure) for the latter. One then can view the parametric form as equivalent to the algorithmic logic, because both connect the key parameters to realize the prediction. However, a critical difference between the two approaches is that the procedure of statistical regression is familiar, and hence the steps that are carried out to interpret the outcome of such analysis can be traced. As is explained in the section "Proposed Methodology," recent advances in explainability have made this possible for ML (Molnar 2019).
Proposed Methodology
General Description

Suppose the same logic and properties of an empirical codal provision (i.e., equation) are imposed on a superior high-dimensional ML model. In this case, such a model can be considered a natural substitute for the statistically derived equation, because the model can arrive at predictions, and interpret their outcomes, similarly to the codal equation.

This is of importance because, in reality, although many of the phenomena addressed by civil engineers may not conform fully to the assumptions inherent in the statistical methods used to derive empirical expressions, engineers still apply domain knowledge to assign the position of, and interactions between, the parameters fitted into such expressions. Thus, the proposed methodology starts by constraining the algorithmic structure of a candidate ML model to one that imitates the same characteristics (in terms of engineering logic, domain knowledge, and so forth) and the essence of the codal provision.

This process is defined herein as model equivalence. Although there could be multiple ways to establish equivalence (as is detailed in the section "Additional Thoughts, Limitations, and Future Directions"), this work establishes equivalence by leveraging two eXplainable artificial intelligence methods, namely partial dependence plots (PDPs) and accumulated local effects (ALEs), to constrain the algorithmic structure to yield PDPs or ALEs similar to those obtained from the codal equation (to a prespecified degree). Such a degree is outlined via several similarity measures, five of which are presented subsequently. Finally, when equivalence is established, model predictions are compared with those obtained from the codal equation via statistical measures. This section starts by covering the principles of PDPs and ALEs and then addresses the similarity measures selected to establish equivalence.

Equivalence through Partial Dependence Plots

For a feature (or feature set) $X_S$ of interest, the PDP averages the model's predictions $f$ over the observed values of the remaining features:

$$\mathrm{PDP}_{X_S} = \frac{1}{n}\sum_{i=1}^{n} f\left(X_S, X_{C_i}\right) \qquad (2)$$

where $X_{C_i}$ = remaining features for the ith observation; and $n$ = number of observations.

The key insight of PDPs is that by averaging out the effects of all other features, this tool can isolate the effect of the feature of interest. In this way, the relationship between the target and a feature can be revealed. Fig. 2 illustrates a hypothetical PDP obtained from a ML model and from a physical equation. This PDP shows how changing a specific feature while keeping all other features constant affects the predicted target. The hypothetical model's PDP has a trend and behavior similar, to some extent, to those of the hypothetical equation. This implies that model and equation predictions lie within a certain bound (i.e., percentage) of one another.
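To make the averaging in Eq. (2) concrete, the following is a minimal brute-force PDP sketch. The synthetic data, the choice of RandomForestRegressor, and the grid resolution are assumptions of this illustration; it is not the author's script (which is available on request).

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def partial_dependence(model, X, feature_idx, grid):
    """Brute-force PDP per Eq. (2): fix the feature of interest at each grid
    value for every observation, then average the model's predictions."""
    pdp = []
    for value in grid:
        X_mod = X.copy()
        X_mod[:, feature_idx] = value              # X_S fixed for all rows
        pdp.append(model.predict(X_mod).mean())    # (1/n) * sum_i f(X_S, X_Ci)
    return np.asarray(pdp)

# Usage on synthetic data (illustrative only)
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 3))
y = 2.0 * X[:, 0] + X[:, 1] ** 2 + rng.normal(scale=0.05, size=200)
model = RandomForestRegressor(random_state=0).fit(X, y)
grid = np.linspace(X[:, 0].min(), X[:, 0].max(), 25)
pdp_curve = partial_dependence(model, X, 0, grid)  # near-linear trend expected
```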
Equivalence through Accumulated Local Effects

Although PDPs provide valuable insights, they can be biased when features are correlated, because PDPs treat features independently. ALEs also can be used to explain ML models. However, unlike PDPs, ALEs provide a more accurate representation when features are correlated, because they visualize feature effects while taking into account the average effects of the other features. Hence, ALE plots can be especially useful when there are interactions between features. ALEs handle interactions between features by considering only local effects; hence, they are not biased by global correlations as PDPs are. They also are centered around the average prediction over the training data, which makes them easier to interpret in terms of the average prediction.

The ALE of a feature is calculated by accumulating the local effects (i.e., the difference between the prediction for a slightly higher value of the feature and the prediction for a slightly lower value of the same feature, averaged over the data set). To calculate the ALE, the feature of interest is divided into intervals. For each interval, the differences in predictions between the interval's upper and lower bounds are computed for the observations falling within that interval, averaged, and then accumulated across the intervals.
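The interval-accumulation procedure just described can be sketched as follows. Quantile-based interval edges and plain mean-centering are simplifying assumptions of this sketch, not the paper's implementation.

```python
import numpy as np

def accumulated_local_effects(model, X, feature_idx, n_intervals=10):
    """First-order ALE: average, within each interval, the prediction change
    between the interval's upper and lower edges; accumulate; then center."""
    x = X[:, feature_idx]
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_intervals + 1))
    local_effects = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (x >= lo) & (x <= hi)
        if not mask.any():                 # empty interval -> no local effect
            local_effects.append(0.0)
            continue
        X_lo, X_hi = X[mask].copy(), X[mask].copy()
        X_lo[:, feature_idx] = lo
        X_hi[:, feature_idx] = hi
        local_effects.append(float((model.predict(X_hi) - model.predict(X_lo)).mean()))
    ale = np.cumsum(local_effects)
    return ale - ale.mean()                # centered around the average prediction
```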
For example, consider a model that predicts the sectional capacity of a load-bearing member from, among other features, its cross-sectional area and material strength. A PDP would show how the predicted sectional capacity changes with the cross-sectional area while averaging out the effect of the material strength. If there is a strong interaction between the cross-sectional area and the material (e.g., if changing the material significantly affects how the cross-sectional area influences the capacity), this interaction effect could be hidden in a PDP. Conversely, suppose that there is an interaction between the cross-sectional area and the material of the load-bearing member. An ALE plot can capture this interaction by showing the effect of changing the cross-sectional area on the predicted capacity while considering the material feature's average effect. For visualization purposes, both PDP and ALE are quite similar.

During tuning, the adjustment at each iteration is driven by a learning rate (which decays with decay rate × iteration), the difference in PDPs or ALEs, and a sensitivity parameter that represents how sensitive the target is to changes in this feature. The adjustment is calculated as follows:

$$\text{adjustment}_{\mathrm{pdp}} = \text{learning rate}_{\mathrm{pdp}} \times \text{pdp difference}[i] \times \text{sensitivity}_{\mathrm{pdp}}$$

$$\text{adjustment}_{\mathrm{ale}} = \text{learning rate}_{\mathrm{ale}} \times \text{ale difference}[i] \times \text{sensitivity}_{\mathrm{ale}}$$
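Transcribed literally, the update signal might look as follows. The exponential form of the learning-rate decay and the per-grid-point application of the adjustment are inferred from the fragment above; the full tuning loop that consumes this signal is not reproduced here.

```python
import numpy as np

def decayed_rate(base_rate, decay_rate, iteration):
    # Assumed form: the learning rate decays exponentially with the iteration count
    return base_rate * np.exp(-decay_rate * iteration)

def adjustment(base_rate, decay_rate, iteration, curve_model, curve_target, sensitivity):
    """Per-grid-point adjustment: learning rate x (PDP/ALE difference) x sensitivity."""
    lr = decayed_rate(base_rate, decay_rate, iteration)
    diff = np.asarray(curve_model) - np.asarray(curve_target)  # pdp/ale difference[i]
    return lr * diff * sensitivity
```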
As discussed earlier, building codes are governed by physical equations/provisions. Thus, if a ML model is developed to mimic these physical processes, the PDP and ALE of the model also need to mimic the logic of the physical equation. The elemental idea is that if the PDP and/or ALE of a feature in the model follows the same trend as the physical equation, this implies that the ML model has learned the underlying engineering intuition and causal logic of the physical process. For example, if the physical equation describes a linear relationship between a feature and its target, and the PDP or ALE of these variables in the model also shows a linear relationship, this suggests that the model has learned this relationship.

This section presents the selected similarity measures. These measures were selected to be largely independent of each other to maximize their strengths and minimize their limitations. The derivations of these measures are presented in their respective references.

… returns similarity in the same units as the target; a default setting was chosen arbitrarily as 10% of the observed target in this study.

Bounding Measure
The custom bounding measure calculates an upper and a lower bound based on a specified threshold (Fig. 5). The idea is to determine whether the PDP or ALE values from the ML model fall within a certain range of values defined by the physical equation. The measure returns the proportion of model values that fall within the bounds. This measure assumes that the physical equation provides a ground-truth range of acceptable values. The choice of threshold is arbitrary (it was taken as 10% in this study), and this measure does not take into account the order of the values.
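Under one plausible reading, where the ±10% band is taken relative to the equation's value at each grid point, the bounding measure reduces to a few lines:

```python
import numpy as np

def bounding_measure(model_vals, eq_vals, threshold=0.10):
    """Proportion of the model's PDP/ALE points that fall inside the band
    defined by the physical equation's values +/- the threshold (10% default)."""
    model_vals = np.asarray(model_vals, dtype=float)
    eq_vals = np.asarray(eq_vals, dtype=float)
    half_band = threshold * np.abs(eq_vals)
    inside = (model_vals >= eq_vals - half_band) & (model_vals <= eq_vals + half_band)
    return float(inside.mean())    # order of the values is deliberately ignored
```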
The coefficient of determination (R²) and the correlation coefficient (R) are calculated as

$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(P_i - A_i\right)^2}{\sum_{i=1}^{n}\left(A_i - A_{\mathrm{mean}}\right)^2} \qquad (5)$$

$$R = \frac{\sum_{i=1}^{n}\left(A_i - \bar{A}\right)\left(P_i - \bar{P}\right)}{\sqrt{\sum_{i=1}^{n}\left(A_i - \bar{A}\right)^2 \, \sum_{i=1}^{n}\left(P_i - \bar{P}\right)^2}} \qquad (6)$$

where A = actual measurements; P = predictions; n = number of data points; and overbars denote mean values.
Machine Learning Models

Five ML algorithms were used herein; these were selected because they are commonly used algorithms in our domain, as noted in a survey by Tapeh and Naser (2022). All ML algorithms were used in their default settings (including hyperparameters) for transparency and to allow repetition of the proposed approach. The following sections briefly describe these algorithms; their full descriptions and scripts are presented in their respective references. The specific script used in creating the provided analysis can be requested from the author.

Decision Trees
A decision tree (DT) is a supervised ML algorithm that uses a tree-like graph of decisions and their possible consequences (Naser 2021b). It is a flowchart-like structure, in which each internal node denotes a test of a feature, each branch represents an outcome, and each leaf node holds a class label. The process of constructing a decision tree involves selecting features to test at each node in the tree to minimize the total variance of the target variable among the instances at each node. There are several strategies for making this choice, but a common strategy is to use information gain, which is related to the concept of entropy (a measure of impurity) (Kent 1983).

Random Forest
Random forest (RF) is a versatile ensemble learning algorithm that combines the power of decision trees and randomization techniques (Breiman 2001). The RF algorithm constructs an ensemble of decision trees by randomly selecting subsets of the training data and features. The individual trees are trained independently using the selected subsets, and their predictions are aggregated to produce the final prediction. Each decision tree in the RF ensemble is built using a random subset of the training data (known as bootstrap aggregating, or bagging). This random sampling process helps to reduce the risk of overfitting and improves the model's generalization ability. Additionally, the algorithm selects a random subset of features at each split in the tree construction, introducing diversity among the trees and reducing the impact of individual features.

Light Gradient Boosting Machine
The light gradient boosting machine (LGBM) improves on XGBoost by pairing gradient boosting with the optimizations of the LightGBM framework to achieve high performance (Ke et al. 2017). It optimizes the objective function by iteratively fitting new trees to the negative gradient of the loss function with respect to the current predictions. The final prediction is obtained by summing the predictions of all the trees. The exclusive feature bundling technique in LGBM bundles mutually exclusive features in a near-lossless way, leveraging sparsity within the data. This technique improves computational efficiency by reducing the number of distinct feature values and enabling efficient memory usage during the training process.

Support Vector Machines
The support vector machines (SVM) algorithm aims to find the optimal hyperplane that maximally separates the data points (scikit-learn 2020b). The objective is to minimize the deviation, or error, between the actual data points and the hyperplane while considering the regularization parameters. To handle nonlinearly separable data, SVM utilizes kernel functions to transform the input data into a higher-dimensional feature space in which it becomes linearly separable.

Performance Metrics

After the ensemble was created, it was trained by randomly shuffling and splitting the data into training (T), validation (V), and testing (S) sets. The ensemble was trained and validated against the T and V sets and then examined on the S set following a k-fold cross-validation procedure, wherein the collected data set was split randomly into k = 10 groups. The ensemble was trained using nine unique sets and validated on the remaining set until each unique set had been used as the validation set.

Then, in addition to the aforementioned measures selected to verify and compare the similarity between the PDPs and ALEs of the ML model and codal provision, the validity of the created ML models also was examined using another independent performance metric, namely the mean absolute error (MAE), as well as R² (Naser and Alavi 2021; Tapeh and Naser 2022)

$$\mathrm{MAE} = \frac{\sum_{i=1}^{n} \left| P_i - A_i \right|}{n} \qquad (7)$$
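The evaluation just described, k = 10 folds scored with MAE [Eq. (7)] and R² [Eq. (5)], can be sketched as below; scikit-learn's KFold and a generic fit/predict regressor interface are assumptions of this sketch rather than the author's script.

```python
import numpy as np
from sklearn.model_selection import KFold

def mae(actual, predicted):
    # Eq. (7): mean absolute error
    return float(np.mean(np.abs(np.asarray(predicted) - np.asarray(actual))))

def r_squared(actual, predicted):
    # Eq. (5): coefficient of determination
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return 1.0 - np.sum((predicted - actual) ** 2) / np.sum((actual - actual.mean()) ** 2)

def cross_validate(make_model, X, y, k=10, seed=0):
    """Train on k-1 folds and validate on the held-out fold until every fold
    has served as the validation set; report average MAE and R^2."""
    scores = []
    for train_idx, val_idx in KFold(n_splits=k, shuffle=True, random_state=seed).split(X):
        model = make_model().fit(X[train_idx], y[train_idx])
        pred = model.predict(X[val_idx])
        scores.append((mae(y[val_idx], pred), r_squared(y[val_idx], pred)))
    return np.mean(scores, axis=0)
```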
Case Studies

Seven phenomena and their respective codal provisions were used herein to showcase and verify the proposed methodology. These are presented in the order of their complexity (i.e., the number of involved features). Two phenomena were taken from ACI 318. Two phenomena were curated from ASCE 29 (ASCE 2006); one of these phenomena also appears in NDS (2018) for wood construction.
Fig. 6. Comparison for Case study 1 (unit: psi). (f = compressive strength of concrete. Base ML model: RF.)
The R² between the tuned ensemble and the codal equation was 1.0. It is clear that the tuned ensemble outperformed all individual models in matching the codal equation's PDPs and ALEs. In this case study, and in the next case study, matching PDPs often was found to be faster (and more affordable) than matching ALEs. This stems from the more complex derivation of ALEs relative to PDPs. Given this observation, the proposed methodology was amended to allow the user or engineer to select whether to match the PDP, the ALE, or both. Such an amendment did not alter the analysis of the first case study.

Fig. 7. Comparison for Case study 2 (unit: min). (b and d = breadth and depth of timber beams, respectively. Base ML model: LGBM.)

Per this provision, the shear strength should not be taken as less than zero, nor as greater than 5λ(f′c)^0.5 b_w d, and N_u/(6A_g) should not be taken as greater than 0.05 f′c. In this case study, beams with applied axial forces were not considered. In addition, λ_s (the size-effect term) and λ were taken as unity for simplicity.

The LGBM was used in this case study as the base model. Fig. 8 shows the behavior of the tuned ensemble in terms of PDPs and ALEs after satisfying two similarity measures, namely the Fréchet distance and the bounding measure. The MAE_ensemble was calculated to be 3,444.348 lbs (1 lbs = 4.45 N), and that for the untuned model was 3,620.0 lbs. The R² for the ensemble was 99%. Clear matching between the ensemble and codal equation is evident.

Fig. 8. Comparison for Case study 3 (unit: lbs). (Base ML model: LGBM.)

To examine further the performance of the tuned ensemble (as well as the proposed approach), an additional layer of validation was performed. In this process, 100 random beams were selected from ACI's shear database (Reineck et al. 2003) for RC members without shear reinforcement used in deriving the codal expression and examined against the codal expression and tuned ensemble. Fig. 9 shows the results of this comparison; the ensemble achieved better predictive capability than the codal expression in predicting the shear strength of the selected beams (MAE_ensemble = 4,930.51 lbs; MAE_codal expression = 5,219.73 lbs; R²_ensemble = 84.1%; and R²_codal expression = 80.83%). This additional external verification could not be conducted in the other case studies because the data used to create the codal provisions of interest were not available.

ASCE 29 Section 5.2.3 on Concrete-Filled Hollow Steel Columns

Another complex codal provision is addressed in this case study. ASCE 29 also presents guidance for evaluating the fire resistance rating of hollow steel columns filled with unreinforced normal-weight concrete. The equation was established empirically through a number of publications and has been verified by computer simulations at the National Research Council of Canada (NRC) (Kodur 1999; Kodur and MacKinnon 2000). This expression also is listed under Section D-2.6.6 in the National Building Code of Canada and also is applicable to tubular shapes; however, only the case of circular tubes is presented herein:

$$R = a \, \frac{(f + 20)}{60\left(L_e - 1000\right)} \, D^2 \sqrt{D/C} \qquad (11)$$

where R = fire resistance rating (h); a = 0.07 for circular columns filled with siliceous aggregate concrete, and 0.08 for circular columns filled with carbonate aggregate concrete; f = specified 28-day compressive strength of concrete (MPa); L_e = column effective length (mm); D = outside diameter for circular columns (mm); and C = compressive force due to unfactored dead load and live load (kN).

This codal provision is accompanied by additional stipulations that were enforced in this analysis, including that the required R rating should be less than or equal to 2 h, the specified f should be equal to or greater than 20 MPa and should not exceed 40 MPa, the column's effective length should be at least 2,000 mm and should not exceed 4,000 mm, the outside diameter should be at least 140 mm and should not exceed 410 mm, and the compressive force should not exceed the design strength of the concrete core determined in accordance with AISC LRFD-94 (AISC 1994).
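For illustration, Eq. (11) and its stipulations translate into a short routine such as the following; the function name, argument conventions, and the decision to raise errors outside the stated ranges are choices of this sketch, not of ASCE 29.

```python
import math

def fire_resistance_cft(f, le, d, c, aggregate="siliceous"):
    """Fire resistance rating R (h) of a circular concrete-filled steel column, Eq. (11).

    f: 28-day concrete strength (MPa); le: effective length (mm);
    d: outside diameter (mm); c: axial force from unfactored loads (kN).
    """
    if not 20.0 <= f <= 40.0:
        raise ValueError("f must be 20-40 MPa")
    if not 2000.0 <= le <= 4000.0:
        raise ValueError("effective length must be 2,000-4,000 mm")
    if not 140.0 <= d <= 410.0:
        raise ValueError("outside diameter must be 140-410 mm")
    a = 0.07 if aggregate == "siliceous" else 0.08   # 0.08 for carbonate aggregate
    r = a * (f + 20.0) / (60.0 * (le - 1000.0)) * d ** 2 * math.sqrt(d / c)
    return r    # note: the provision's stated applicability is limited to R <= 2 h
```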
The RF algorithm was used as the base model. The convergence
of the ensemble on the PDPs and ALEs was difficult to obtain,
and hence the results in Fig. 10 are shown for a converged run
wherein only the PDPs were matched using all aforementioned
similarity measures. Despite matching only the PDPs, the ALEs
also seem to have attained some degree of matching to that from
the equation. The MAEensemble of the tuned model was calculated to
be 0.281 h, versus 0.471 h for the untuned model—an improvement
of about 40%.
Fig. 10. Comparison for Case study 4 (unit: h). (Base ML model: LGBM.)
This case study addresses an equation used to evaluate the fire resistance of concrete columns (Wade et al. 1997); this equation was fitted using the results of standard fire tests and numerical simulations:

$$R = \frac{k \times f_c^{1.3} \times B^{3.3} \times D^{1.8}}{10^5 \times N^{1.5} \times L^{0.9}} \qquad (12)$$

where R = fire resistance (min); k = a constant dependent on the cover and steel reinforcement ratio, which equals 1.47 and 1.48 for cover less than 35 mm and greater than or equal to 35 mm, respectively; f_c = 28-day compressive strength of concrete (MPa); B = least dimension of column (mm); D = greatest dimension of column (mm); N = axial load during fire (kN); and L = effective length (mm).

This case study utilized the RF algorithm as a base model. In addition, R² and the bounding measure were satisfied in this case study. MAE_ensemble (with default settings) was calculated to be 56.957 min, and that for the same algorithm without tuning was 68.03 min (the R² between the ensemble and the codal equation was 89%, and it is quite possible to improve the performance of the tuned ensemble by tuning the default settings). The tuned ensemble matched a good degree of similarity in terms of trends between PDPs and ALEs with the former codal provision (Fig. 11). In some features (namely k, f, and L), there was a gap between the PDPs and ALEs obtained from all ML models and the equation. This gap seems to have a shifting effect, wherein the models capture the trends of the PDPs and ALEs, but not the values, in some of the involved features.

Additional Case Studies Based on Provisions Stemming from a Theoretical Derivation

These two case studies explored the applicability of the proposed approach to codal provisions that are not empirical in nature but rather are derived from first principles.

AISC Manual Section F6.1 for I-Shaped Members and Channels Bent about Their Minor Axis

Section F6.1 in the AISC manual (AISC 2016) presents the following simple equation to evaluate the nominal flexural strength, M_n, in yielding:

$$M_n = f_y \, Z \leq 1.6 f_y S_y \qquad (13)$$

where S_y = elastic section modulus about the y-axis.

This simple codal provision has two main features, and the data used in this case study were selected to satisfy those conditions. XGBoost was selected to be the base ML model with two similarity measures (i.e., the Fréchet distance and the bounding measure). The outcome of this analysis is shown in Fig. 12. Both similarity metrics were satisfied, and MAE_ensemble was calculated to be 1,637.95 kip·in. (1 kip·in. = 113 N·m), and that for the same algorithm without tuning was 1,659.74 kip·in. The tuned ensemble matched the PDP and ALE of the codal equation and outperformed those obtained from the other models (Fig. 12). Although this was a simple case study, and there is a need for additional tests in future work, this finding seems to indicate the suitability of the proposed approach and the ability of ML to mimic principle-based codal provisions.
Various Building Codes: On the Elastic Deformation of Beams

Another theoretically derived expression was used as a case study. The equation for elastic deformation in simply supported beams loaded with a point load placed at the midspan, which is used in a variety of building codes and standards [such as Table 3-23 in the AISC manual (AISC 2016)], is

$$\text{Elastic deformation} = \frac{PL^3}{48EI} \qquad (14)$$
Fig. 11. Comparison for Case study 5 (unit: min). (k = constant for cover and steel reinforcement ratio; fc = 28-day compressive strength of concrete; B = least dimension of column; D = greatest dimension of column; N = axial load during fire; and L = effective length. Base ML model: XGBoost.)
where P = applied load (N); L = span of the beam (mm); E = modulus of elasticity (Pa); and I = moment of inertia (mm⁴).

For this case study, the XGBoost model was selected as the base model. All five similarity measures were used, but only one measure (R²) was satisfied (given the default settings). Fig. 13 shows the PDPs and ALEs generated from the model and Eq. (14). Despite satisfying only one similarity measure, the achieved performance can be deemed acceptable. MAE_ensemble was calculated to be 2.461 mm, and that for the same algorithm without tuning was 2.391 mm. The R² value between the ensemble and the equation was 96%.
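In code form, Eq. (14) is a one-liner; consistent units are assumed in this sketch (P in N, L in mm, E in N/mm², I in mm⁴, giving the deflection in mm).

```python
def midspan_deflection(p, l, e, i):
    """Eq. (14): elastic midspan deflection of a simply supported beam under a
    central point load; consistent units are the caller's responsibility."""
    return p * l ** 3 / (48.0 * e * i)

# e.g., a 5 kN load on a 4 m beam (E = 200,000 N/mm^2, I = 2.0e7 mm^4)
delta = midspan_deflection(5000.0, 4000.0, 200000.0, 2.0e7)   # ~1.67 mm
```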
Additional Thoughts, Limitations, and Future Directions

This section documents some of the main limitations of the proposed methodology and shares insights into future research directions.

A Note on Other Matching Methods

An engineer might entertain the idea of achieving equivalence through matching feature importance (e.g., a measure of how valuable a feature is in the construction of the model) between the codal provision and the ML model. For example, if a given model's error increases after systematically permuting a feature, that feature is important, because permuting it adversely affected the model's predictions (Altmann et al. 2010).

Although this concept could possibly be integrated into the proposed methodology, and was successful in the first two case studies (these results were not shown in the present work), feature importance measures the total effect of shuffling a feature on prediction accuracy. Although this is a common approach in ML, the perception of feature importance in an equation is somewhat of a nuanced concept, simply because all features in a given equation must hold true (Naser 2023b). Therefore, one could argue that all features in an equation are equally important.

Other explainability measures [e.g., SHapley Additive exPlanations (SHAP) (Lundberg and Lee 2017), local interpretable model-agnostic explanations (LIME) (Ribeiro et al. 2016), prototypes, and so forth] could be explored to build upon the current analysis. The author believes that, despite their computational complexity, such measures could be proven successful, and hence can be streamlined and added to the proposed methodology. There is room to investigate the influence of model type (i.e., tree-based, ANN, and so forth) versus the degree of explainability. It is possible that some models might fit a phenomenon more easily, more accurately, or with higher explainability than other models. Therefore, there is a possibility of a trade-off between accuracy and explainability.
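A sketch of the permutation-based importance described above; using MAE as the error metric and the repeat count shown are assumptions of this sketch.

```python
import numpy as np

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Importance of feature j = average increase in error after shuffling
    column j, which breaks its link with the target (in the spirit of
    Altmann et al. 2010)."""
    rng = np.random.default_rng(seed)
    base_error = np.mean(np.abs(model.predict(X) - y))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])      # permute one feature in place
            err = np.mean(np.abs(model.predict(X_perm) - y))
            increases.append(err - base_error)
        importances[j] = float(np.mean(increases))
    return importances   # larger error increase => more important feature
```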
Limitations of the Proposed Methodology and Future Directions

The presented approach can be viewed as complex and somewhat lengthy. As such, this approach may struggle to guarantee the convergence of a ML model that both fits the data well and matches the physical equation's behavior. In reality, the proposed approach converged within 2 min of analysis for all the case studies mentioned in this work using a typical desktop that can be found in a common engineering design office. However, in lengthy simulations or multiphased codal provisions, it is possible that the proposed …
Fig. 12. Comparison for Case study 6 (unit: kip·in.). (Base ML model: XGBoost.)
Fig. 13. Comparison for Case study 7 (unit: mm). (Base ML model: XGBoost.)