4 Machine-learning models for predicting surgical site infections using patient pre operative risk and surgical procedure

American Journal of Infection Control 51 (2023) 544−550
Major Article
Machine-learning models for predicting surgical site infections using patient pre-operative risk and surgical procedure factors

Rabia Emhamed Al Mamlook PhD a,c,*, Lee J. Wells PhD a, Robert Sawyer PhD b
a Department of Industrial and Entrepreneurial Engineering & Engineering Management, Western Michigan University, Kalamazoo, MI
b Department of Surgery, Western Michigan University Homer Stryker School of Medicine, Kalamazoo, MI
c Department of Industrial, Engineering University of Zawiya, Al Zawiya City, Libya
Keywords: Machine-learning models; Pre-operative risk; Surgical procedure factors; Surgical site infections

ABSTRACT

Background: Surgical site infections (SSIs) are a significant health care problem as they can cause increased medical costs and increased morbidity and mortality. Assessing a patient’s preoperative risk factors can improve risk stratification and help guide the surgical decision-making process. Previous efforts to use pre-operative risk factors to predict the occurrence of SSIs have relied upon traditional statistical modeling approaches. The aim of this paper is to develop and validate, using state-of-the-art machine learning (ML) approaches, classification models for the occurrence of SSI to improve upon previous models.

Methods: In this work, using the American College of Surgeons’ National Surgical Quality Improvement Program (ACS NSQIP) database, the performances (eg prediction accuracy) of 7 different ML approaches (Logistic Regression (LR), Naïve Bayesian (NB), Random Forest (RF), Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Network (ANN), and Deep Neural Network (DNN)) were compared. The performance of these models was evaluated using the area under the curve, accuracy, precision, sensitivity, and F1-score metrics.

Results: Overall, 2,882,526 surgical procedures were identified in the study for the SSI predictive models’ development. The results indicate that the DNN model offers the best predictive performance with 10-fold cross-validation compared to the other 6 approaches considered (area under the curve = 0.8518, accuracy = 0.8518, precision = 0.8517, sensitivity = 0.8527, F1-score = 0.8518). Emergency case surgeries, American Society of Anesthesiologists (ASA) Index of 4 (ASA_4), BMI, vascular surgeries, and general surgeries were the most significant features influencing the development of an SSI.

Conclusions: Equally important is that the commonly used LR approach for SSI prediction displayed mediocre performance. The results are encouraging as they suggest that the prediction performance for SSIs can be improved using modern ML approaches.

© 2022 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.
* Address correspondence to Rabia Emhamed Al Mamlook, PhD, Department of Industrial and Entrepreneurial Engineering & Engineering Management, Western Michigan University, 3635 kenbrooke ct, Kalamazoo, MI, 49008, USA.
E-mail addresses: Rabia.emhamedm.almamlook@wmich.edu, rabiaemhamedm.almamlook@wmich.edu (R.E.A. Mamlook).
https://doi.org/10.1016/j.ajic.2022.08.013
0196-6553/© 2022 Association for Professionals in Infection Control and Epidemiology, Inc. Published by Elsevier Inc. All rights reserved.

Surgical site infections (SSIs) are infections that occur after a surgical procedure and are the second most common health care-associated infections.1,2,3 The United States (US) Centers for Disease Control and Prevention has estimated that 300,000-500,000 cases occur each year in the US,4 and these infections can lead to extended hospital stays,5 hospital readmissions, and increased surgery-related costs.6,7 SSIs cost the US health care system upwards of $1.6 billion and 3.3 million additional hospital days.7,8 In the US, the incidence of SSIs reported in the literature ranges from 2%-5% of all patients undergoing surgical procedures, depending on the type of surgery.4,9

Even though most infections are treatable with antibiotics, SSIs remain a significant cause of morbidity and mortality after surgery.10,11 Considering the impact SSIs have on morbidity, mortality, and hospital costs,10 it is important to have a clear understanding of the risk factors that contribute to their occurrence. Predicting the occurrence of an SSI preoperatively, based upon multiple risk factors, could allow SSI-preventative strategies to be applied to prevent the occurrence of SSI, reduce pain, speed recovery, and reduce medical costs. Several risk factors have been identified as significant SSI
predictors in the literature, such as: (1) patient risk factors, such as sex,12 body mass index (BMI), diabetes,12 and smoking13; (2) surgical procedure risk factors, such as duration of surgery and anesthesia type; and (3) hospital-related risk factors, such as duration of hospitalization, institution, and surgical staff volume. Classification models based upon these factors can be developed at the preoperative, intraoperative, or postoperative stage. A great deal of research, as shown in Table 1, has been done on the development of prediction models for SSIs, which have shown reasonable discriminative ability.23 However, most of the models are based on postoperative risk factors. In addition, many of these research efforts use traditional statistical modeling approaches, such as univariate or multivariate logistic regression (LR). LR models are widely used for estimating the probability of occurrence of an event because of their simplicity and interpretability.30 Furthermore, only a few research studies have developed models for SSI prediction based solely on preoperative risk factors.19 ML has been successfully applied to many health care problems to explain complex relationships and/or provide insight into important features within large datasets.14 Therefore, ML may be useful in predicting the occurrence of SSIs using large clinical databases, such as the American College of Surgeons’ National Surgical Quality Improvement Program (ACS NSQIP) database. Previous studies have indicated that DNNs may have better prediction accuracy than other ML methods.31 Expanding the models considered for predicting SSI based on preoperative data, our motivation was to improve the predictive accuracy of the previous models. This study proposes the use of ML approaches and a large dataset with various features to improve the accuracy of SSI prediction.

Table 1
Articles focused on pre-operative and post-operative modelling of SSIs
Article | Objective | Surgery Type | Methods | Phase
14 | Develop classification model for SSI | Neonatal surgery | Logistic regression, decision trees, & boosted decision trees | Pre-operative & Post-operative
15 | Validate model for SSIs | All | Multivariate logistic regression | Post-operative
16 | Develop risk factor model for SSI | All | Logistic regression | Post-operative
17 | Compare SSI model with National Health care Safety Network Risk Index | All | Stepwise logistic regression & bootstrap resampling | Post-operative
18 | Develop user friendly tool | All | Logistic regression | Post-operative
19 | Improve SSI risk score | Vascular surgery | Multivariate logistic regression & bootstrap resampling | Pre-operative
20 | Identify risk factors & develop classification model | Abdominal | Multivariate logistic regression | Post-operative
21 | Identify patients who will develop SSIs | General | Naïve Bayesian | Post-operative
22 | Develop model by improving accuracy | Orthopedic trauma | Logistic regression | Pre-operative
23 | Improve accurate prediction model | Spine surgery | Logistic regression | Pre-operative
24 | Compare artificial neural network with logistic regression model for SSI | Head & neck surgery | Multivariate logistic regression & artificial neural network | Pre-operative & Post-operative
13 | Determine the risk factors for developing an SSI | Pelvic | Multivariate logistic regression | Pre-operative & Post-operative
25 | Predictive risk factors for SSI | General surgery | Bivariate & regression analyses | Post-operative
26 | Predicting postoperative SSIs | All | Logistic regression & random forest | Post-operative
14 | Develop an accurate model to predict neonatal SSI | All | Different statistical methods | Preoperative & Intraoperative
27 | Compare multivariate logistic regression with different machine learning approaches | Liver dysfunction | Random forest, multivariate logistic regression, naïve Bayesian & support vector machine | Post-operative
28 | SSI detection models | All | Lasso logistic regression & bootstrap | Post-operative
29 | Predict SSIs after posterior spinal | Spinal fusions | Deep neural network | Post-operative

More specifically, ML has been shown to positively predict postoperative complications (Arvind et al., 2020). However, to our knowledge, no studies have applied various ML models, other than traditional LR-based models, to the prediction of SSIs from preoperative risk factors across all surgery types. Therefore, the purpose of this paper is to (1) identify significant preoperative risk factors for classifying the occurrence of SSIs; (2) perform comparisons across multiple ML models; and (3) develop the most accurate risk prediction model possible. The remainder of this paper is organized as follows. The dataset used for this study, data preprocessing implementation, and considered ML methods are described in Section 2. In Section 3, the results are presented and discussed. Finally, conclusions and future work are presented in Section 4.

METHODS

The main process was divided into 4 stages: (1) dataset and patient cohort identification, (2) data preprocessing, (3) ML model development, and (4) model performance evaluation, which are illustrated in Figure 1. Seven ML models were developed to classify the occurrence of SSIs. The first step towards developing these machine learning models is a preprocessing phase to make the raw data suitable for model development.

Fig 1. Machine Learning Framework.

Dataset source and patient cohort identification

This study was approved by the Western Michigan University School of Medicine Institutional Review Board (IRB) (WMed-2018-0333). The classification models were created using the NSQIP
database. The 2,882,526 adult patients (age 18+) who underwent surgical procedures between January 1st, 2013, and December 31st, 2016, were reported by the ACS NSQIP affiliated hospitals. All surgical procedures including cardiac surgeries, gynecology, general, interventional, otolaryngology, radiologist, orthopedic, plastic, and vascular were considered in this study. These surgeries were identified by the relevant Current Procedural Terminology (CPT) codes. The NSQIP database used in this work consisted of 273-305 (depending upon the year) variables, including preoperative risk factors, intra-operative variables, and 30-day postoperative mortality and morbidity outcomes. Considering only preoperative risk factors, type of surgery, and intraoperative risk factors, a thorough literature review and expert clinical opinions were used to refine the list of potential features to 28.

Feature selection is one of the important steps for building a ML model. It is the method of identifying and removing as much unrelated information as possible. This reduces the dimensionality of the data and may allow learning algorithms to operate faster and more efficiently. Dimension reduction can also improve modeling accuracy, reduce overfitting, and decrease training time (Kotsiantis et al., 2006). There are 3 common feature selection methods: filters, wrappers, and embedded.32 The wrappers method provides an optimal set of features for training the model, resulting in better accuracy and prediction performance than other methods (Chen et al., 2021).

In our study, the wrappers method was used with backwards elimination for feature selection. This process is used to optimize the performance of the ML model as it will only remove irrelevant features. In this study, when the P-value from a test statistic following the chi-squared distribution is greater than a significance level of 5%, that feature is removed. In total, 25 risk factors from the NSQIP database, as shown in Table 2, were considered for this work.

Data preprocessing

Data preprocessing is an important stage for developing ML models. The first step of data preprocessing was to clean the data, which consisted of dealing with missing values and outliers. Missing values were found in the NSQIP dataset. The attributes with missing data were weight, height, age, and ASA factor, with 3,425 missing values among them. The missing values for these attributes were handled by replacing them. Missing numeric attributes (ie weight, height, and age) and ordinal attributes (ie ASA) were replaced by the average and median values for those attributes, respectively. It should be noted that several approaches for replacing missing values exist, such as k-means imputation. However, given the high dimensionality of the dataset and the fact that less than 0.1% of the dataset’s samples contain missing values, the use of mean imputation seemed more practical. Patient cases were deemed outliers and subsequently removed if any of their continuous features lay more than ±3 standard deviations (estimated) from the mean (estimated). In total, 7,200 cases were identified as outliers and removed from the dataset.

The second step of the data preprocessing was attribute creation. A common medical risk factor is adiposity (fatness), which is often indirectly measured via the body mass index (BMI). While NSQIP does not record BMI, it does record a patient’s height and weight. Therefore, a new attribute for BMI was added to the dataset.
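
The cleaning and attribute-creation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the column names (`height_in`, `weight_lb`, `age`, `asa`) and the function name are hypothetical, `None` marks a missing value, the sample standard deviation is assumed for the ±3 SD rule, and BMI is computed with the standard conversion for US customary units (703 × weight in pounds / height in inches squared), which the paper does not spell out.

```python
from statistics import mean, median, stdev

NUMERIC = ("height_in", "weight_lb", "age")  # hypothetical column names


def clean_and_add_bmi(rows):
    """Impute missing values, drop +/-3 SD outliers, and add a BMI attribute."""
    # Mean imputation for numeric attributes, median for the ordinal ASA class.
    fills = {k: mean([r[k] for r in rows if r[k] is not None]) for k in NUMERIC}
    fills["asa"] = median([r["asa"] for r in rows if r["asa"] is not None])
    filled = [{k: fills[k] if r[k] is None else r[k] for k in fills} for r in rows]

    # Estimate the mean and standard deviation of each continuous attribute.
    stats = {k: (mean([r[k] for r in filled]), stdev([r[k] for r in filled]))
             for k in NUMERIC}

    cleaned = []
    for r in filled:
        # A case is treated as an outlier if any continuous value lies
        # more than 3 estimated standard deviations from the estimated mean.
        if any(abs(r[k] - stats[k][0]) > 3 * stats[k][1] for k in NUMERIC):
            continue
        # BMI from US customary units: 703 * weight (lb) / height (in)^2.
        cleaned.append({**r, "bmi": 703 * r["weight_lb"] / r["height_in"] ** 2})
    return cleaned
```

Calling `clean_and_add_bmi(rows)` on a list of patient dictionaries returns new dictionaries with imputed values and a `bmi` key; the input rows are left unchanged. Whether the original study computed BMI before or after outlier removal is not stated, so the ordering here is an assumption.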
Table 2
Pre-operative features of the NSQIP database
Attribute ID | Attribute | Type | Description
1 | Age | Numerical |
2 | Sex | Categorical | Female; Male
3 | Functional health status prior to surgery | Categorical | Independent; Partially dependent; Totally dependent
4 | In/Out-patient status | Categorical | Inpatient; Outpatient
5 | Emergency case | Categorical | No; Yes
6 | Surgical specialty | Categorical | Interventional; Gynecology; Otolaryngology; General; Radiologist; Plastic; Vascular; Orthopedic; and Cardiac surgery
7 | Height in inches | Numerical |
8 | Weight in pounds | Numerical |
9 | Diabetes mellitus with oral agents | Categorical | Non-Diabetic; Diabetic requiring therapy with a non-insulin anti-diabetic; Diabetic requiring insulin therapy
10 | Current smoker | Categorical | No; Yes
11 | Elective surgery | Categorical | No; Yes; Unknown
12 | Congestive heart failure in 30 days before surgery | Categorical | No; Yes
13 | Alcohol abuse | Categorical | No; Yes
14 | History of severe chronic obstructive pulmonary disease | Categorical | No; Yes
15 | Currently on dialysis (pre-op) | Categorical | No; Yes
16 | Wound infection | Categorical | No; Yes
17 | Steroid use for chronic condition | Categorical | No; Yes
18 | Disseminated cancer | Categorical | No; Yes
19 | Weight loss > 10% | Categorical | No; Yes
20 | Bleeding disorders | Categorical | No; Yes
21 | Ventilator dependent | Categorical | No; Yes
22 | Systemic sepsis | Categorical | No sepsis; SIRS; Sepsis; Septic shock
23 | American Society of Anesthesiologists (ASA) Index | Categorical | ASA 1 (no disturb); ASA 2 (mild disturb); ASA 3 (severe disturb); ASA 4 (threat to life)
24 | Preoperative transfusion | Categorical | No; Yes
25 | Transfer origin | Categorical | Not transferred; Acute care hospital; Nursing home; Transfer from other; and Transfer from outside emergency room
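
As a hedged illustration of how the two attribute types in Table 2 might be prepared for the ML models, the sketch below applies Min-Max normalization to a numerical attribute and one-hot encoding to a categorical one, the two transformations named in the data preprocessing section. The function names and sample values are hypothetical; only the sepsis levels are copied from Table 2.

```python
def min_max(values):
    """Rescale a list of numbers onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values, categories):
    """Expand a categorical column into one binary indicator per category."""
    return [[1 if v == c else 0 for c in categories] for v in values]


# Levels of the "Systemic sepsis" attribute as listed in Table 2.
SEPSIS_LEVELS = ["No sepsis", "SIRS", "Sepsis", "Septic shock"]

ages = [18, 45, 90]  # hypothetical sample values
print(min_max(ages))
print(one_hot(["SIRS", "No sepsis"], SEPSIS_LEVELS))
```

With this encoding, a four-level attribute such as systemic sepsis becomes four binary columns, which is how a list of 25 attributes can expand into a wider model input.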
The third step of data preprocessing was handling the issue of unbalanced data, which often reduces the performance of a classification model. In particular, the NSQIP database consisted of 2,882,526 cases, where ≈97% (2,794,324 cases) of the dataset were cases without SSIs and ≈3% (88,202) were cases with SSIs. In general, a ML model created from significantly imbalanced data tends to be biased toward classifying the dominant class. In other words, a ML model created from this dataset would be inclined to classify patients as not having an SSI. One solution to the imbalanced dataset problem is to use sampling techniques. Three major sampling techniques are used for dealing with imbalanced datasets: oversampling, undersampling, and a combination of both (eg SMOTE).33 In undersampling, some samples are eliminated from the majority class (ie non-SSI) until it is in balance with the minority class (ie SSI). The main disadvantage of undersampling is that data (potential information) is omitted. In oversampling, minority class samples must be replicated to achieve a more balanced distribution, which may result in overfitting. Given the large number of samples present in the dataset being used, undersampling was implemented since potential information loss was not a concern.33

The fourth step of data preprocessing was transforming the data to be more suitable for applying ML approaches. To ensure that all the feature values are on the same scale and treated with equal weight, normalization is important. A Min-Max normalization was performed to transform each numerical attribute (eg BMI) onto the range [0,1]. Some ML approaches (eg support vector machines (SVMs) and LR) cannot handle multinomial variables. To compensate, multinomial features were transformed into a set of multiple binary factors through one-hot encoding.34

Model validation

Stratified k-fold cross-validation (CV) is used to evaluate the performance of ML models and avoid biased performance estimates. In this approach, the dataset is randomly divided into k folds, resulting in k classification models. Each of these models is learned using k − 1 folds as training data and the remaining fold is used for testing (obtaining performance), where each model is learned/tested with a different combination of folds. This procedure is repeated k times, and the final performance of each model is averaged to obtain an unbiased estimate of the model’s performance. The k-fold CV procedure, described above, is used in this paper to determine hyperparameters (ML model parameters that are not learned by the model) to optimize each model’s performance. However, the performances obtained from this CV for different ML models should not be used for comparisons. When the same cross-validation procedure and dataset are used to both determine hyperparameters and compare different models, the performance estimates become biased.

Model performance and evaluation

To compare and assess a model’s performance, a test dataset (separate from the data used for CV) was used to estimate the performance metrics: accuracy, sensitivity, specificity, and F1-score. These estimates were calculated using the number of True Positives (TPs), True Negatives (TNs), False Positives (FPs), and False Negatives (FNs) resulting from applying a given model to the test dataset, as described in Eqs. 1-4. In this study, TP is the number of patients that had an SSI and the model correctly classified them as having an SSI, TN is the number of patients that did not have an SSI and the model correctly classified them as not having an SSI, FP is the number of patients that did not have an SSI and the model incorrectly classified them as having an SSI (Type I error), and FN is the number of patients that had an SSI and the model incorrectly classified them as not having an SSI (Type II error).

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{1}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{2}$$

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{3}$$

$$\text{F Score} = \frac{2 \cdot \text{Sensitivity} \cdot \text{Specificity}}{\text{Specificity} + \text{Sensitivity}} \tag{4}$$

By graphing a classification model’s TP rate versus FP rate, the receiver operating characteristic (ROC) curve provides a graphical representation of the model’s performance. Calculating the Area Under the ROC Curve (AUC) provides an additional measure of model performance that will also be assessed in this study. AUC ranges over [0,1], and the greater the AUC, the better the classification performance between the positive and negative classes. An AUC of 0 indicates a model that incorrectly classifies all subjects with SSI as negative and all subjects with non-SSI as positive. Conversely, an AUC of 1 indicates a perfect classifier. Overall, an AUC of less than 0.5 indicates a very poor classifier, 0.7-0.8 indicates a mediocre classifier, and 0.8-1 indicates a good to perfect classifier.

It should be noted that the models being considered are “eager” learners, in that they are learned off-line and evaluated as needed. Since the model already exists when it is used to predict SSIs, the computational costs for their development are not considered in this comparison.

Machine learning classification methods

To achieve the best performance, each model’s parameters were optimized during its development. For each optimization, k-fold cross-validation was performed with 80% of the original dataset (the remaining 20% is used for model comparison), where the optimal parameters resulting in the largest AUC were used to construct the final model. In this study, 7 popular ML classification methods were considered, which are briefly introduced in the following subsections.

Logistic Regression (LR)

LR is a common classification method used to classify discrete values typically in the form of binary classes (eg true/false, 0/1, yes/no) based on a set of independent variables. The idea is to map the results of linear functions to sigmoid functions (generating a number between 0 and 1), which are used to obtain the probability of a class (eg true/false). In this study, the commonly used ordinal logistic regression was applied. The best performance was achieved for LR with the penalty and cost parameters equal to 11 and 10, respectively.

Support Vector Machine (SVM)

SVM is a classification method that can perform efficient nonlinear classification via mathematical kernels (Vapnik et al., 2013). For classification, SVM maps input attributes to a higher dimensional feature space, and in that space constructs a hyperplane to separate the 2 classes (Vapnik et al., 1999). This optimal hyperplane is determined by selecting support vectors that create the maximum separation (maximal margin) between classes.35 In this study, SVM models were constructed considering four different kernels: linear, Gaussian, polynomial, and sigmoid. The best performance was achieved with a linear kernel with regularization parameter (C) set to 10.
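
The two linear decision rules described in the LR and SVM subsections can be sketched side by side. This is an illustrative sketch only: the weights, bias, and feature vector are hypothetical rather than fitted values from the study, and training (including the regularization parameter C) is not shown.

```python
import math


def linear_score(weights, bias, x):
    """Weighted sum w.x + b shared by both linear classifiers."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def lr_probability(weights, bias, x):
    """Logistic regression: map the linear score through a sigmoid into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_score(weights, bias, x)))


def svm_label(weights, bias, x):
    """Linear-kernel SVM: the class is the side of the hyperplane w.x + b = 0."""
    return 1 if linear_score(weights, bias, x) >= 0 else 0


# Hypothetical two-feature example (eg normalized BMI and an emergency-case flag).
w, b, patient = [1.2, -0.4], -0.5, [1.0, 0.5]
print(lr_probability(w, b, patient), svm_label(w, b, patient))
```

Both models share the same linear score; LR converts it into a class probability that can be thresholded, while the linear SVM only uses its sign.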
Decision Tree (DT)

DT is a simple and easily interpretable classification method that can be created relatively quickly compared to other well-known classifiers.36 DTs consist of a root node, internal nodes, and leaf nodes, in which the internal nodes allow the tree to split into branches depending upon attribute values, and each leaf node represents a class. In this study, the use of bagging and boosting procedures (the latter including AdaBoost and Gradient Boosting) was investigated to reduce the variance and increase the accuracy of the model. In addition, DTs were explored using both “Gini” and “entropy” for information gain. The best results were obtained using bagging, “Gini”, a maximum tree depth of 13, a random state of 10, a minimum of 110 samples required to split a node, and a minimum of 20 samples required at a leaf node.

Random Forest (RF)

RF is a nonlinear tree-based integrated learning model. It is an ensemble learning method that constructs a group of randomly induced DTs to increase classification accuracy. For the classification problem, the voting method is used, and the class receiving the most votes is the final model output. In this study, bagging and boosting procedures were investigated. The best results were achieved using bagging, with a random forest of 60 trees, a maximum tree depth of 13, a minimum of 110 samples required to split a node, and a minimum of 20 samples required at a leaf.

Naïve Bayesian (NB)

NB classifiers are probabilistic models, based on Bayes’ theorem, that are robust to real data noise and missing values. The NB classifier is one of the most effective and efficient classification algorithms in the literature and is simple to implement and use. NB classifiers have been demonstrated to be highly accurate and reliable for large datasets. In this study, 3 kinds of NB models were investigated: Gaussian (Gaussian Naïve Bayes), Multinomial (Multinomial Naïve Bayes), and Bernoulli (Bernoulli Naïve Bayes). The best results were achieved using the Gaussian algorithm.

Artificial Neural Network (ANN)

Artificial Neural Network (ANN) is a widely used ML model in classification tasks. ANNs considered in the study consist of three node layers: input, hidden, and output. Each node consists of a transfer function applied to the weighted sum of the previous layer’s nodes and a bias. The sigmoid (logistic) activation function was compared with others such as ReLU (Rectified Linear Unit), Tanh, and SoftMax. The best model was built with 2 hidden layers. The input layer has about 76 neurons. The first hidden layer has 64 neurons, and the activation function is ReLU. The second hidden layer has 32 neurons, also with the ReLU function. The last layer is a single-unit output layer with 1 neuron, and the activation function is a sigmoid. The best performance was achieved with an ANN trained for 150 epochs (150 iterations over all samples) and a batch size of 10.

Deep Neural Network (DNN)

DNN is also known as deep structured learning or hierarchical learning. Most modern DNNs are based on an ANN approach. A general definition of a DNN is an ANN with more than one hidden layer between the input and output layers of the network. After carrying out several experiments, we found that having 4 hidden layers with activation functions delivered the best result. A fully connected DNN was designed, which consists of 4 hidden layers. The input layer has about 76 neurons. The first hidden layer has 64 neurons, the second has 32 neurons, the third has 16 neurons, and the fourth has 8 neurons; the activation function in each of these layers is ReLU. The last layer is a single-unit output layer with one neuron, and the activation function is a sigmoid, which maps the output into the [0,1] interval. The adaptive moment estimation (Adam) algorithm was used as the gradient descent optimization technique for minimizing the loss function. Training of the optimal network model is done with several iterations to avoid overtraining. The best performance was achieved with a DNN trained for 50 epochs (50 iterations over all samples) and a batch size of 256.

RESULTS

This study compared seven different ML models for the classification of SSI in patients using the remaining 20% of the dataset (data not used for model creation). The results for Accuracy, Sensitivity, Specificity, F1-Score, and AUC with k-fold CV (k = 5, 7, and 10) used in this comparison are given in tabular form in Table 3 and presented visually in Figures 2 and 3.
Table 3
Performance of classification models with k-fold CV (k = 5, 7, and 10)
k | Model | Accuracy | Sensitivity | Specificity | F1-Score | AUC
5 | Logistic Regression (LR) | 0.7921 | 0.7898 | 0.7992 | 0.7944 | 0.7912
5 | Support Vector Machines (SVM) | 0.7856 | 0.7899 | 0.8085 | 0.7990 | 0.7866
5 | Decision Tree (DT) | 0.8011 | 0.8032 | 0.851 | 0.8041 | 0.7958
5 | Random Forest (RF) | 0.8012 | 0.7996 | 0.8022 | 0.8008 | 0.8007
5 | Naive Bayes (NB) | 0.6553 | 0.6795 | 0.6912 | 0.6853 | 0.6611
5 | Artificial Neural Network (ANN) | 0.7956 | 0.8047 | 0.7925 | 0.7985 | 0.7845
5 | Deep Neural Network (DNN) | 0.8158 | 0.8215 | 0.8145 | 0.8179 | 0.8056

  | Decision Tree (DT) | 0.8018 | 0.8121 | 0.8153 | 0.8136 | 0.8112
  | Random Forest (RF) | 0.8210 | 0.8211 | 0.8263 | 0.8236 | 0.8256
  | Naive Bayes (NB) | 0.6952 | 0.7058 | 0.7125 | 0.7091 | 0.7012

  | Decision Tree (DT) | 0.8181 | 0.8152 | 0.8140 | 0.8146 | 0.8149
  | Random Forest (RF) | 0.8380 | 0.8378 | 0.8379 | 0.8378 | 0.8381
  | Naive Bayes (NB) | 0.7130 | 0.7578 | 0.7108 | 0.7335 | 0.7132
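
As a hedged consistency check on Table 3, the F1-Score column can be recomputed from the Sensitivity and Specificity columns using the F Score formula as printed in Eq. 4 (the harmonic mean of the two). The function name is illustrative; the numbers are the k = 5 rows copied from the table, and agreement is only expected up to the four-decimal rounding of the reported values.

```python
def f_score(sensitivity, specificity):
    """Eq. 4 as printed: harmonic mean of sensitivity and specificity."""
    return 2 * sensitivity * specificity / (specificity + sensitivity)


# (sensitivity, specificity, reported F1-Score) for k = 5, copied from Table 3.
rows = {
    "LR": (0.7898, 0.7992, 0.7944),
    "SVM": (0.7899, 0.8085, 0.7990),
    "RF": (0.7996, 0.8022, 0.8008),
    "NB": (0.6795, 0.6912, 0.6853),
    "ANN": (0.8047, 0.7925, 0.7985),
    "DNN": (0.8215, 0.8145, 0.8179),
}

for name, (sens, spec, reported) in rows.items():
    # Print the recomputed value next to the reported one for comparison.
    print(name, round(f_score(sens, spec), 4), reported)
```

The check is a sketch of how the reported columns relate to one another, not a re-evaluation of the models.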
Fig 2. Performance of Classification Models.

Fig 3. ROC Curves for the SSI Classification Model.

In Table 3, the best and worst performance for each metric is indicated by bold and italicized font, respectively. As can be seen, the DNN model uniformly outperformed all other considered models, followed closely by the RF model. Similar performances were seen with the LR, SVM, DT, and ANN models for all metrics considered.

The NB model uniformly underperformed all the other considered models. In terms of Accuracy, Sensitivity, Specificity, F1-Score, and AUC, the NB model performed much worse than the others. The DNN model demonstrated excellent discriminatory capability to classify SSIs (AUC of 0.8518), the highest among the compared classifiers. The DNN model exhibited the best AUC, and its performance was significantly better than that of the other models.

In addition to being able to predict the occurrence of an SSI, understanding the influential features that lead to SSIs can also aid in preventative strategies to reduce SSIs. Therefore, feature importance assessment has been performed on the DNN model. The results of this assessment are given in Figure 4, which lists the features’ importance in descending (top to bottom) order. Emergency surgery, ASA_4 Index, BMI, and patient age were the major influential variables leading to SSIs. Two types of surgical specialty also have a high incidence of SSIs (ie vascular and general).

Fig 4. Feature Importance for the DNN Classification Model.

The feature importance results are in agreement with the literature, where vascular and general surgery are consistently associated with much higher SSI rates relative to other types of surgery.19 In the US, the overall incidence of SSIs following general and vascular surgery remains high.19 SSIs after vascular and general surgeries have outcomes for patients, such as extended length of hospital stays, increased mortality, more hospital readmissions, the need for additional operations, as well as poorer patient quality of life and increasing health care costs to both individuals and health care systems.37

DISCUSSION

This study used ML to analyze the NSQIP dataset to improve preoperative insights regarding the occurrence of SSIs, with the goal of decreasing health care costs, increasing efficiency of service delivery, reducing operational time, and improving patient satisfaction and clinical outcomes. To the best of our knowledge, no studies have applied various ML models other than traditional LR modeling to the preoperative prediction of SSIs across surgery types. This is the first study to construct and compare various ML models in terms of predicting SSI based on NSQIP datasets. We found evidence to support the hypothesis that using ML models, other than LR, to classify SSIs based on the NSQIP dataset leads to better performance.

Our study developed and compared seven well-accepted ML classification models and found that DNN models were well suited for the classification of SSIs and resulted in the highest AUC of 0.85, indicating good prediction accuracy. The results are encouraging and may lead to a better ability to identify patients at significant risk for SSIs and ultimately reduce their chances of developing an SSI. While the DNN developed in this study demonstrated high accuracy, it may be possible to improve this performance by identifying optimum classification models for specific surgery types. Future research efforts will focus on identifying the best prediction models for specific surgery types (eg general and vascular). By using the classification model developed in this study to predict SSIs preoperatively, we can significantly reduce infections after surgery.

Limitations

All surgical procedures including cardiac surgeries, gynecology, general, interventional, otolaryngology, radiologist, orthopedic, plastic, and vascular were considered in this study. The NSQIP database used in this study did not contain information regarding cancer patients, which limits the applicability of the resulting model. The NSQIP dataset was not designed for cancer surgery. To expand upon the work that was done in this study, a similar form of analysis should be applied to additional types of SSI for cancer patients. Similarly, the
critical implications for patients because they result in worse health NSQIP dataset only contained data regarding adult patients.
Furthermore, all patients over the age of 90 are grouped together as 90+ (which was set to 90 for this work). Further research is required to understand the modeling of SSIs for these age groups.

In addition to the shortcomings resulting from unrepresented groups, the performance of the models developed in this work may be affected by assumptions within the NSQIP dataset. For example, this work attempted to predict the occurrence of SSIs based upon a binary attribute in the NSQIP dataset that identifies whether a patient developed an SSI within the first 30 days, at the hospital or after discharge. Complications (eg, SSI) occurring after this period are not included. Also, many of the attributes within the dataset are generic in nature; for example, smoking is recorded as a binary attribute, and the degree to which the patient smokes is not considered.

CONCLUSIONS

The results indicated that the DNN approach performed the best compared to the other ML approaches considered. Considering any surgery type, the DNN model suggests that older patients undergoing emergency surgeries with a high BMI and ASA index are the most vulnerable to developing an SSI. In addition, such patients undergoing vascular and general surgeries are extremely vulnerable, as the DNN model shows that these surgeries have the highest incidence of SSIs.

Acknowledgments

The researchers extend their thanks and appreciation to all those who contributed, including Western Michigan University for the facilities provided and the staff responsible for obtaining the IRB. The researchers also appreciate the significant role of the Western Michigan University Homer Stryker School of Medicine for its support with this research and access to the dataset used in this study.

References

1. Najjar YW, Al-Wahsh ZM, Hamdan M, Saleh MY. Risk factors of orthopedic surgical site infection in Jordan: a prospective cohort study. Int J Surg Open. 2018;15:1–6.
2. Tschelaut L, Assadian O, Strauss R, et al. A survey on current knowledge, practice and beliefs related to pre-operative antimicrobial decolonization regimens for prevention of surgical site infections among Austrian surgeons. J Hosp Infect. 2018;100:386–392.
3. Ahmad HF, Kallies KJ, Shapiro SB. The effect of mupirocin dressings on post-operative surgical site infections in elective colorectal surgery: a prospective, randomized controlled trial. Am J Surg. 2019;217:1083–1088.
4. Schiavone MB, Moukarzel L, Leong K, et al. Surgical site infection reduction bundle in patients with gynecologic cancer undergoing colon surgery. Gynecol Oncol. 2017;147:115–119.
5. Takahashi Y, Takesue Y, Fujiwara M, et al. Risk factors for surgical site infection after major hepatobiliary and pancreatic surgery. J Infect Chemother. 2018;24:739–743.
6. Lubega A, Joel B, Lucy NJ. Incidence and etiology of surgical site infections among emergency post-operative patients in Mbarara Regional Referral Hospital, South Western Uganda. Surg Res Pract. 2017;2017:1–6.
7. Kang SI, Oh H, Kim MH, et al. Systematic review and meta-analysis of randomized controlled trials of the clinical effectiveness of impervious plastic wound protectors in reducing surgical site infections in patients undergoing abdominal surgery. Surgery. 2018;164:939–945.
8. Segreti J, Parvizi J, Berbari E, Ricks P, Berríos-Torres SI. Introduction to the centers for disease control and prevention and health care infection control practices advisory committee guideline for prevention of surgical site infection: prosthetic joint arthroplasty section. Surg Infect (Larchmt). 2017;18:394–400.
9. Korol E, Johnston K, Waser N, et al. A systematic review of risk factors associated with surgical site infections among surgical patients. PLoS One. 2013;8:e83743.
10. Hoang SC, Klipfel AA, Roth LA, Vrees M, Schechter S, Shah N. Colon and rectal surgery surgical site infection reduction bundle: to improve is to change. Am J Surg. 2019;217:40–45.
11. Rudder NJ, Borgert AJ, Kallies KJ, Smith TJ, Shapiro SB. Reduction of surgical site infections in colorectal surgery: a 10-year experience from an independent academic medical center. Am J Surg. 2019;217:1089–1093.
12. Ban KA, Minei JP, Laronga C, et al. American college of surgeons and surgical infection society: surgical site infection guidelines, 2016 update. J Am Coll Surg. 2017;224:59–74.
13. Lachiewicz MP, Moulton LJ, Jaiyeoba O. Pelvic surgical site infections in gynecologic surgery. Infect Dis Obstet Gynecol. 2015;2015:1–8.
14. Bartz-Kurycki MA, Green C, Anderson KT, et al. Enhanced neonatal surgical site infection prediction model utilizing statistically and clinically significant variables in combination with a machine learning algorithm. Am J Surg. 2018;216:764–777.
15. Walraven CV, Musselman R. The Surgical site infection risk score (SSIRS): a model to predict the risk of surgical site infection. PLoS One. 2013.
16. Li X, Nylander W, Smith T, Han S, Gunnar W. Risk factors and predictive model development of thirty-day post-operative surgical site infection in the veterans administration surgical population. Surg Infect (Larchmt). 2018;19:278–285.
17. Mu Yi, et al. Improving risk-adjusted measures of surgical site infection for the national health care safety network. Infect Control Hosp Epidemiol. 2011;32:970–986.
18. Sampalis JS. Development of risk scoring tool to predict surgical site infections. SOJ Surgery. 2015;2:1–10.
19. Leekha S, Lahr BD, Thompson RL, Sampathkumar P, Duncan AA, Orenstein R. Pre-operative risk prediction of surgical site infection requiring hospitalization or reoperation in patients undergoing vascular surgery. J Vasc Surg. 2016;64:177–184.
20. Ejaz A, Schmidt C, Johnston FM, Frank SM, Pawlik TM. Risk factors and prediction model for inpatient surgical site infection after major abdominal surgery. J Surg Res. 2017;217:153–159.
21. Gbegnon A, Monestina J, Cromwell J. Machine learning algorithm for accurate, automated, and real-time prediction of surgical site infections using EHR Data. J Surg Res. 2014;186:527.
22. Reese SM, Knepper B, Young HL, Mauffrey C. Development of a surgical site infection prediction model in orthopaedic trauma: the denver health model. Injury. 2017;48:2699–2704.
23. Janssen DM, van Kuijk SM, d’Aumerie B, Willems P. A prediction model of surgical site infection after instrumented thoracolumbar spine surgery in adults. Eur Spine J. 2019;28:775–782.
24. Kuo P, Wu S, Chien P, et al. Artificial neural network approach to predict surgical site infection after free-flap reconstruction in patients receiving surgery for head and neck cancer. Oncotarget. 2018;9.
25. Haridas M, Malangoni MA. Predictive factors for surgical site infection in general surgery. Surgery. 2008;144:496–503.
26. Petrosyan Y, Thavorn K, Smith G, et al. Predicting postoperative surgical site infection with administrative data: a random forests algorithm. BMC Med Res Method. 2021;21:1–11.
27. Shi S, Lei G, Yang L, et al. Using machine learning to predict postoperative liver dysfunction after aortic arch surgery. J Cardiothorac Vasc Anesth. 2021.
28. Zhu Y, Simon GJ, Wick EC, et al. Applying machine learning across sites: external validation of a surgical site infection detection algorithm. J Am Coll Surg. 2021;232:963–971.
29. Hopkins BS, Mazmudar A, Driscoll C, et al. Using artificial intelligence (AI) to predict postoperative surgical site infection: a retrospective cohort of 4046 posterior spinal fusions. Clin Neurol Neurosurg. 2020;192:105718.
30. Cooper JN, Wei L, Fernandez SA, Minneci PC, Deans KJ. Pre-operative prediction of surgical morbidity in children: comparison of five statistical models. Comput Biol Med. 2015;57:54–65.
31. Fkirin A, Attiya G, El-Sayed A, Shouman MA. Copyright protection of deep neural network models using digital watermarking: a comparative study. Multimedia Tools Appl. 2022;81:15961–15975.
32. Kotsiantis SB, Zaharakis I, Pintelas P. Supervised machine learning: a review of classification techniques. Emerg Artif Intell Appl Comput Eng. 2007;160:3–24.
33. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artificial Intelligence Res. 2002;16:321–357.
34. Chen CW, Tsai YH, Chang FR, Lin WC. Ensemble feature selection in medical datasets: combining filter, wrapper, and embedded feature selection results. Expert Systems. 2020;37:e12553.
35. Al Mamlook Rabia Emhamed. Advanced risk models to predict surgical site infections using machine learning approaches and risk-adjusted control charts. Dissertations. 2021;3779.
36. Vapnik V. An overview of statistical learning theory. IEEE Trans Neural Networks. 1999;10:988–999.
37. Ott E, Bange FC, Sohr D, Teebken O, Mattner F. Risk factors associated with surgical site infections following vascular surgery at a German university hospital. Epidemiol Infect. 2013;141:1207–1213.
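
Supplementary sketch. The model comparison reported in Table 3 relies on five metrics: Accuracy, Sensitivity, Specificity, F1-Score, and AUC. The Python sketch below shows one common way these quantities can be computed for a binary SSI classifier. It is an illustration only and not the authors' implementation: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for the example and are not drawn from the NSQIP dataset.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for binary labels (1 = SSI, 0 = no SSI)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # recall on SSI cases
    specificity = tn / (tn + fp) if tn + fp else 0.0  # recall on non-SSI cases
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random SSI case receives a higher
    score than a random non-SSI case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical, for illustration only).
Y_TRUE = [1, 1, 1, 0, 0, 0, 0, 1]
Y_SCORE = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
Y_PRED = [1 if s >= 0.5 else 0 for s in Y_SCORE]  # assumed 0.5 threshold

if __name__ == "__main__":
    print(classification_metrics(Y_TRUE, Y_PRED))
    print("AUC:", auc(Y_TRUE, Y_SCORE))
```

On this toy data, accuracy, sensitivity, specificity, and F1 all equal 0.75, while the AUC is 0.9375. The gap illustrates why a threshold-free metric such as AUC is reported alongside threshold-dependent ones when comparing classifiers.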