
The Monitoring and Improvement of Surgical-Outcome Quality
WILLIAM H. WOODALL
Virginia Tech, Blacksburg, Virginia 24061-0439, USA

SANDY L. FOGEL, MD
Virginia Tech Carilion School of Medicine, Roanoke, Virginia 24014, USA

STEFAN H. STEINER
University of Waterloo, Waterloo, Ontario, Canada N2L 3G1

In this expository paper, we review methods for monitoring medical outcomes with a focus on surgical
quality. We discuss the importance and role of risk adjustment. We give the advantages and disadvantages
of various competing surveillance methods. We provide an extensive literature review and give some ideas
for future research. In addition, we describe the highly effective American College of Surgeons National
Surgical Quality Improvement Program (NSQIP), which offers data-based benchmarking of participating
hospitals and provides information on best surgical practices. A case study illustrates improvements of
mortality and surgical-site infection rates based on the NSQIP approach.

Key Words: CRAM Chart; Cumulative Sum (CUSUM) Chart; NSQIP; Risk-Adjustment; Statistical Process
Monitoring; VLAD Chart.

Dr. Woodall is Professor in the Department of Statistics at Virginia Tech. He is a Fellow of ASQ. His email address is bwoodall@vt.edu.

Dr. Fogel is NSQIP Surgeon Champion at the Carilion Clinic. His email address is slfogel@carilionclinic.org.

Dr. Steiner is Professor in the Department of Statistics and Actuarial Science at the University of Waterloo. He is a Fellow of ASQ. His email address is shsteine@uwaterloo.ca.

Journal of Quality Technology, Vol. 47, No. 4, October 2015, www.asq.org

1. Introduction

There has been a great deal of interest in improving the quality of health care, with particular emphasis on surgical quality. The Institute for Healthcare Improvement (IHI) (2014) pointed out, for example, that surgical-site infections continue to represent a significant portion of healthcare-associated infections. Their impact on morbidity, mortality, and cost of care has resulted in their reduction being identified as a top national priority in the U.S. Department of Health and Human Services (2013) “National Action Plan to Prevent Health Care-Associated Infections: Roadmap to Elimination”.

According to the healthcare quality framework of Donabedian (1966), one can assess structural, process, or outcome quality. Structural quality refers to the use of metrics such as nurse-to-bed ratios. An example of a process-quality variable would be the percentage of surgical patients receiving antibiotics within a prescribed time period before surgery. Outcome-quality variables, on the other hand, reflect the patients’ results. An example of an outcome variable would be whether or not a surgical patient developed a surgical-site infection within 30 days following surgery.

Generally, the proper use of outcome variables requires considerably more effort in data collection, but it is the most informative approach. Porter and Teisberg (2007) and Department of Health (2010), among others, have made a strong case for the use of outcome results to provide vital feedback on what works and what does not instead of concentrating on process targets. The Donabedian (1966) framework was outlined in more detail in the excellent paper by Ko (2009), who also stressed the importance of using outcome results.

Overviews of process monitoring in healthcare applications have been provided by Benneyan (1998a, b), Woodall (2006), Winkel and Zhang (2007, 2012), Woodall et al. (2012), and others. In our paper, we restrict attention to the various approaches for monitoring surgical-outcome quality. Blackstone (2004) reviewed the history of process monitoring in surgical applications and gave an overview of some of the important issues.

Because effective monitoring requires one to account for the variation among patients, we briefly review risk adjustment in Section 2. In Section 3, we give our notation and define some of the surveillance-method performance metrics. In Section 4, we describe the various surveillance methods. In Section 5, we discuss the related analysis of surgical learning curves. Outcome monitoring is important and useful in detecting and understanding changes in performance. It can motivate the need for improvement, which can then come through targeting the most promising opportunities and implementing improvement projects. For this reason, we provide an overview of the American College of Surgeons National Surgical Quality Improvement Program (NSQIP) in Section 6 with an NSQIP-based case study given in Section 7. Some ideas for future research on outcome monitoring are given in Section 8 and our conclusions follow in Section 9.

2. Risk Adjustment

Surgical patients vary considerably with respect to physical characteristics, such as age and weight, and with respect to health status. In order to perform useful monitoring or meaningful comparisons of the surgical outcomes stratified by surgeons or hospitals, there must be adjustments for the patient mix of risk factors. Many aspects of risk adjustment were covered in considerable detail by Iezzoni (2012).

Most often logistic regression models are used for patient-level risk adjustment when binary outcomes are used. See, for example, Cohen et al. (2009) and New York State Department of Health (2001). The probability of a particular adverse event, such as death within 30 days of surgery, is modeled with various physical and health characteristics used as the explanatory variables. Examples would be the use of scores, such as the Parsonnet score or the EuroSCORE II, in a logistic regression model for risk adjustment for adult cardiac surgery (Parsonnet et al. (1989), Nashef et al. (2012)). Risk factors used to calculate these types of scores can include gender, age, diabetic status, hypertension status, dialysis status, and so forth.

In virtually all cases, the logistic regression models are based on discrete or categorical explanatory variables modeled using indicator variables and do not include interaction terms. The number of explanatory variables is often in the range from 20 to 30. Models have been built for many other types of adverse events such as surgical-wound infection, anastomotic leak (a leak at the surgical connection of two structures), and deep vein thrombosis (Bruce et al. (2001)).

The risk-adjustment model must be fit based on some historical data from all of the surgeons or hospitals of interest. Most often, the data from a particular time period is somewhat arbitrarily selected to serve as the baseline. Paynabar et al. (2012) discussed the analysis of historical baseline data.

Cook et al. (2008) provided an excellent discussion of some of the issues related to risk adjustment. It is important that the risk-adjustment model accurately estimate the probability of the adverse event of interest because the risk-adjusted surveillance methods detect deviations from the risk-adjustment model predictions. It is also important that risk-adjustment models be periodically updated because they can begin to overestimate the probability of the adverse event due to process improvement.

In comparing hospital performance, it is common to use random-intercept multilevel logistic regression modeling or other types of hierarchical generalized linear models. See, for example, Clark et al. (2010) and COPPS-CMS White Paper Committee (2012).

3. Monitoring Background

It is important to note that in industrial applications the monitoring methods are designed based on information obtained using background data collected from the particular process of interest. The collection and analysis of these data are referred to as phase I, an area reviewed by Jones-Farmer et al. (2014). In most of the methods covered in our paper, however, the performance of a particular hospital or surgeon is monitored relative to a risk-adjustment model constructed using data from a number of hospitals. Thus, it is often performance relative to a standard that is being monitored. Steiner (2014) discussed this issue in more detail.

We let $p_{0i}$ represent the probability obtained from the risk-adjustment model that the $i$th surgical patient, $i = 1, 2, \ldots$, experiences the adverse event of interest. We let $Y_i = 1$ if the $i$th patient experiences the adverse event of interest and $Y_i = 0$ otherwise. The assumption that $Y_i$, $i = 1, 2, 3, \ldots$, are mutually independent is ubiquitous. The odds of the adverse event occurring are $p_{0i} : (1 - p_{0i})$ under the risk-adjustment model. A change in the odds ratio of size $\delta$ leads to odds of the event of $\delta p_{0i} : (1 - p_{0i})$ with a corresponding probability of the event of $p_{1i} = \delta p_{0i} / (1 + p_{0i}(\delta - 1))$. Monitoring methods can be designed to detect specified changes in the odds ratio.

Risk-adjusted monitoring methods are usually compared based on the average run length (ARL), where the run length is the number of surgical patients until a signal is given that there has been a change in the process. We would like the ARL when the process is stable, i.e., the in-control ARL, to be large and the ARL to be small when there is a significant change in the odds of the adverse event.

The ARL can be calculated assuming that any process shift occurred before monitoring begins (called the zero-state ARL) or assuming any shift occurs sometime after the start of monitoring (steady-state ARL). We prefer the use of the steady-state ARL because the assumption of a possibly delayed change in the process seems more realistic. Gombay et al. (2011) pointed out, however, that the in-control ARL can be misleading and considered other metrics, such as the probability of a false alarm within a given number of patients. Sun and Kalbfleisch (2013) also took this latter approach.

4. Various Monitoring Methods

Grigg and Farewell (2004a), Rogers et al. (2004), and Cook et al. (2008) provided review papers on risk-adjusted monitoring. Cook et al. (2008) provided an appendix with all related formulas. Woodall (2006) included a section on risk-adjusted monitoring. A considerable amount of research has been done in the last decade or so, however, with many more applications. A nontechnical review and discussion of issues related to risk-adjusted monitoring was given by Steiner (2014).

Monitoring can lead to insights and is useful for detecting changes in performance and understanding trends over time. Outcome monitoring can be used to identify problems, motivate the need for improvement, and quantify the extent to which improvement initiatives have been successful.

4.1. Risk-Adjusted Sets Method and Resetting SPRT

In their review, Grigg and Farewell (2004a) focused to some extent on the risk-adjusted sets method of Grigg and Farewell (2004b), which has not become widely used. To signal a process deterioration, the sets method requires that the waiting time between adverse events, measured in the number of cases, be below a specified threshold for a specified number of consecutive adverse events.

Sego et al. (2008) showed, in the non-risk-adjusted case, that the apparent performance advantage of the sets method is due to an implicit “headstart” feature that leads to good zero-state ARL performance, but poor steady-state ARL performance relative to competing methods. These performance results likely carry over to the risk-adjusted application.

We do not recommend the use of the resetting risk-adjusted sequential probability ratio test (RSPRT) proposed by Spiegelhalter et al. (2003) and Grigg et al. (2003). This method was discussed in some detail by Cook et al. (2008) and applied by Rogers et al. (2005). With this approach, one sets up a sequential probability-ratio hypothesis test (SPRT) with the null hypothesis corresponding to the risk-adjustment model being correct and the alternative hypothesis corresponding to a specified shift in the odds of the adverse event occurring. If the null hypothesis is accepted, then the hypothesis test is repeated. Rejecting the null hypothesis is a signal that performance may have changed.

There are two issues with this approach. The first is that the SPRT type I and type II error probabilities, $\alpha$ and $\beta$, respectively, are often misinterpreted and are not meaningful in assessing the run-length performance of the RSPRT. The second issue is that the RSPRT is a generalization of the risk-adjusted Bernoulli cumulative sum (RA-CUSUM) method discussed in Section 4.4 with the primary effect being that the RSPRT chart is generally less able to detect deterioration in performance after a period of good performance and vice versa. This phenomenon is referred to as building up “credit” in the healthcare surveillance literature and referred to as issues
with “inertia” in the industrial statistical process-monitoring literature.

4.2. Risk-Adjusted p Chart

Alemi et al. (1996) and Alemi and Oliver (2001) proposed aggregating the patients into consecutive groups and using the mean and variance of the sum of the Bernoulli observations within each group to determine Shewhart-type control limits. Hart et al. (2003, 2004) gave related work on Shewhart charts based on aggregated data. Cockings et al. (2006) gave examples of risk-adjusted p charts with patients grouped into consecutive blocks of size 30. Gustafson (2000), on the other hand, compared various types of charts for monitoring infection rates with data aggregated by month.

Although these approaches can provide a good overview summary of the performance of the process over time, greater levels of aggregation cause longer delays in detecting changes in the process relative to methods that incorporate the data on a case-by-case basis. This point was also made by Cook et al. (2003). The adverse effect of data aggregation on monitoring was discussed in a general context by Schuh et al. (2013).

4.3. CRAM and VLAD Charts

The popular variable life adjusted display (VLAD) method of Lovegrove et al. (1997) and the equivalent cumulative risk-adjusted mortality (CRAM) method of Poloniecki et al. (1998) are based on plots of cumulative sums of either $p_{0i} - Y_i$ or $Y_i - p_{0i}$. In the first case, the Y-axis is often labeled “Lives Saved” or “Statistical Lives Saved”, whereas, in the second case, it is frequently labeled “Excess Mortality”. Sometimes the plot is referred to as an observed minus expected (O-E) CUSUM chart. The VLAD name is most common, however, so we will use it in our paper.

The book by the Clinical Practice Improvement Centre (2008) provides a detailed description of the use of the VLAD and many practical issues related to its use. Albert et al. (2003) and Lovegrove et al. (1999) provided a number of examples of VLAD plots. Treasure et al. (2004) and Sherlaw-Johnson et al. (2000) also discussed the use of the VLAD chart.

The VLAD chart shown in Figure 1 was provided to us by Dr. Albert Yuen of the Hong Kong Hospital Authority. It shows the net lives saved/lost following emergency operations at a hospital in Hong Kong. Under the risk-adjustment model, the VLAD statistic wanders over time in a nonstationary manner. It has no tendency to return to any particular value, including zero, because it can be modeled as a random-walk process. The statistics in Figure 1 decrease substantially, however, over time, indicating deterioration in performance compared with the risk-adjustment model. The “rocket-tail” control limits in Figure 1, which widen over time, are based on percentiles of the marginal distribution of the cumulative sum. These limits have been recommended, for example, by Grunkemeier et al. (2003, 2009) and Noyez (2009). The use of these limits to signal changes in quality leads to a problem with inertia because process deterioration, for example, could occur when the VLAD statistic is near the upper limit. In addition, the percentile values used to determine the limits are not directly related to the run-length performance of the method.

FIGURE 1. Rocket Tail VLAD Plot. Provided by W.-C. Yuen, Hong Kong Hospital Authority.

Sismanidis et al. (2003) and Poloniecki et al. (2003) proposed an ad hoc signal rule for the VLAD and CRAM charts. The lack of a theoretically justifiable way to determine a signal rule for the chart has led, however, to the use of the VLAD as an easily understood visual aid with reliance on the RA-CUSUM chart, as described in the next subsection, to signal shifts in the performance of the surgical process.

4.4. Risk-Adjusted Bernoulli CUSUM Chart

The risk-adjusted Bernoulli cumulative sum (CUSUM) chart (referred to as the RA-CUSUM chart) of Steiner et al. (2000, 2001) is preferred over the VLAD chart for detecting changes in performance. The RA-CUSUM chart is a generalization of the Bernoulli CUSUM chart of Reynolds and Stoumbos (1999), which was used by Leandro et al. (2005) to monitor the outcomes of liver-transplant surgery.

The one-sided RA-CUSUM chart statistics for a chart designed to detect a change in the odds ratio to a specified value are

$$S_i = \max(0, S_{i-1} + W_i), \quad i = 1, 2, 3, \ldots,$$

where $S_0 = 0$, $W_i = \ln(p_{1i}/p_{0i})$ if $Y_i = 1$ and $W_i = \ln[(1 - p_{1i})/(1 - p_{0i})]$ if $Y_i = 0$. The chart signals when $S_i > h$, where $h$ is selected to provide a specified in-control ARL. The selection of $h$ depends on the relevant population of risk scores.

Often the RA-CUSUM chart is implemented to be two-sided with the simultaneous application of two one-sided charts, one to detect improvement in performance and the other to detect deterioration. The signs on the statistics on the chart used to detect improvement are usually changed so the two one-sided charts can be plotted together more easily. An example of a two-sided RA-CUSUM chart provided in Steiner (2014) is given in Figure 2. In this example, the outcome is mortality within 30 days of cardiac surgery with risk-adjustment based on logistic regression using Parsonnet scores. The upper CUSUM chart was designed to detect a doubling of the odds ratio while the lower CUSUM chart was designed to detect a halving of the odds ratio. The upper CUSUM boundary was crossed with patient 253, indicating poor performance relative to the risk-adjustment model.

FIGURE 2. Risk-Adjusted Bernoulli CUSUM Chart for Surgeon 2 (from Steiner (2014)).

There have been a number of applications of the RA-CUSUM chart in the literature. For example,
Axelrod et al. (2006, 2009) discussed the use of the RA-CUSUM method to assess the performance of organ-transplant centers. Beiles and Morton (2004) and Collins et al. (2011) gave applications in assessing the performance of arterial surgery and gastroesophageal surgery, respectively. Morton et al. (2008) discussed the monitoring of healthcare-acquired infections and used the RA-CUSUM chart as an example. Bottle and Aylin (2008) discussed the reliance on RA-CUSUM charts in a system for monitoring clinical performance involving 100 hospitals in England. For other applications, see Harris et al. (2005), Novick et al. (2006), Moore et al. (2007), and Chen et al. (2011).

It has been recommended that one display the more easily interpretable VLAD chart, but with a RA-CUSUM chart run in the background to signal any changes in quality. This approach was advocated by Sherlaw-Johnson (2005), Sherlaw-Johnson et al. (2005, 2007), Cook et al. (2008), Clinical Practice Improvement Centre (2008), and Collett et al. (2009). We also support this approach, which is illustrated in Figures 3 and 4.

FIGURE 3. RA-CUSUM Chart for Surgeon A with Two Signals of Poor Performance. (Reprinted by permission from Macmillan Publishers Ltd: Journal of the Operational Research Society, Sherlaw-Johnson et al., 2007, published by Palgrave Macmillan.)

FIGURE 4. Rocket-Tail VLAD Plot for Surgeon A with Two RA-CUSUM Chart Signals Indicated. (Reprinted by permission from Macmillan Publishers Ltd: Journal of the Operational Research Society, Sherlaw-Johnson et al., 2007, published by Palgrave Macmillan.)

A number of researchers have studied the performance of the RA-CUSUM chart. Jones and Steiner (2012) investigated the effect of estimation error on the performance of the chart. Webster and Pettitt (2007) studied some technical issues related to the computation of chart performance metrics.

Steiner et al. (2001) and Tian et al. (2015) showed that, for a given risk-adjustment model and given control limits, the in-control ARL depends heavily on the population of risk scores. They showed that the effect was considerably greater than that found by Loke and Gan (2012). Zhang and Woodall (2015), however, have developed a method for designing the RA-CUSUM charts based on the method of Shen et al. (2013). This is a computationally intensive approach which requires on-line dynamic simulation to determine the control limits patient-by-patient to design each chart for the specific sequence of patient risk scores observed. This means each chart is customized to the specific sequence of patients at hand. This alleviates the major disadvantage of the risk-adjusted Bernoulli CUSUM chart, which is any concern that misspecification or changes in the risk score population can affect to a considerable extent the in-control performance of the chart.

It is frequently stated that the RA-CUSUM has optimal performance based on the results of Moustakides (1986), but these optimality results are based on the assumption of an independent and identically distributed sequence of observations. The observations in the sequence $Y_i$, $i = 1, 2, 3, \ldots$, are assumed to be independent but they are not identically distributed because patients have varying risk factors.

Grigg and Farewell (2004) pointed out the usefulness of the approximation

$$(\mathrm{ARL}_C)^{-1} \cong (\mathrm{ARL}_L)^{-1} + (\mathrm{ARL}_U)^{-1},$$

where $\mathrm{ARL}_C$ is the ARL for the two-sided RA-CUSUM chart and $\mathrm{ARL}_L$ and $\mathrm{ARL}_U$ are the ARLs for the component lower- and upper-sided RA-CUSUM charts, respectively. Megahed et al. (2011) gave conditions under which this relationship is exact.

4.5. Risk-Adjusted Exponentially Weighted Moving-Average Methods

Cook et al. (2011) proposed a risk-adjusted exponentially weighted moving average (RA-EWMA) method. The advantages given for their RA-EWMA method are that it communicates information about the current level of an indicator in a direct and understandable way and it explicitly displays information about the current patient case mix. Also, because it is not reset after a signal, they considered the RA-EWMA chart to be a more natural chart to use in healthcare applications, where a process of care can rarely be changed quickly. One might note that it is common not to reset surveillance statistics to their initial values after a signal in prospective public health surveillance applications.

Steiner and MacKay (2014) also proposed an EWMA-based approach that gives more weight to recent outcomes and plots a clinically interpretable estimate of the failure rate for a “standard” patient. They pointed out some advantages of their EWMA approach compared with those of Cook et al. (2011) and Grigg and Spiegelhalter (2007). One advantage claimed is that fewer historical data are needed to set up their surveillance method.

Regardless of any relative advantages of these three EWMA methods, the RA-CUSUM chart is likely to remain the most accepted surveillance approach for monitoring with risk-adjusted binary outcomes in the near future.

4.6. Methods Based on Survival Models

There has been more recent work that incorporates information on the times of any deaths within the given time window following surgery or survival times more generally. Monitoring of survival times is more common in organ-transplantation applications. For reviews of methods used to assess the performance of organ-transplantation centers and related issues, we recommend Collett et al. (2009) and Neuberger et al. (2010). For mortality outcome data in organ-transplantation applications, the time window after surgery is most often one year.

Survival model-based surveillance methods have been proposed by Biswas and Kalbfleisch (2008), Sego et al. (2009), Steiner and Jones (2010), Gandy et al. (2010), and Sun and Kalbfleisch (2013). These methods can lead to better statistical performance than the RA-CUSUM chart based on binary outcomes, but they are considerably more complicated and require a survival model assumption. Phinikettos and Gandy (2014), however, recently proposed a nonparametric approach based on the Kaplan–Meier survival curve estimator.

Steiner and Jones (2010) showed that the method of Sego et al. (2008) is at a performance disadvantage relative to the competing methods. This is because, until information on the final status of a particular patient is available, no information on patients operated on after this patient can be used in the analysis. In recent developments, Snyder et al. (2014) discussed the application of the CUSUM method of Sun and Kalbfleisch (2008) in the monitoring of survival times after organ transplantation. In addition, Assareh and Mengersen (2014a, b) proposed Bayesian methods for change-point estimation for step shifts and trends, respectively, in monitoring risk-adjusted survival times.

4.7. Other Approaches

There have been a number of alternative approaches proposed for the monitoring of risk-adjusted binary outcomes. Steiner et al. (1999) proposed methods for monitoring paired binary surgical outcomes, which allows the simultaneous monitoring of mortality and “near-misses”. Chang (2008) compared a risk-adjusted Shiryayev–Roberts scheme to the performance of the RA-CUSUM chart. The Shiryayev–Roberts method was found to be less able to detect deterioration in performance.

Gan and Tan (2010) proposed a risk-adjusted version of the time-between-events chart, which in the non-risk-adjusted application typically involves use of the geometric distribution. Albers (2011) proposed a related method based on the number of cases between a specified number of adverse events, which led to a generalization of the negative binomial distribution. In the non-risk-adjusted case, however, Szarka and Woodall (2011) reported that these types of time-between-event charts fare poorly compared with the performance of a CUSUM chart.

Zeng and Zhou (2011) proposed a risk-adjusted monitoring method based on Bayesian methods that is said to require less data than other methods in order for monitoring to begin. More recently, Gan et al. (2012) proposed a generalization of the RA-CUSUM chart to detect changes in parameters other than the odds ratio.
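
As a rough illustration of the statistics described in Sections 4.3 and 4.4, the following Python sketch computes the probability of the adverse event after a shift in the odds ratio, the one-sided RA-CUSUM statistic, and the VLAD statistic (the cumulative sum of the risk-model probability minus the outcome). It assumes the risk-adjustment probabilities are already available and lie strictly between 0 and 1. The function names, the argument name `delta` for the odds-ratio shift, and the toy inputs are illustrative assumptions and are not taken from the cited papers. In practice the limit `h` would have to be chosen, for example by simulation over the relevant population of risk scores, to give a specified in-control ARL.

```python
import math
from typing import List, Optional, Sequence, Tuple


def shifted_probability(p0: float, delta: float) -> float:
    """Event probability after the odds p0 : (1 - p0) are multiplied by delta."""
    return delta * p0 / (1.0 + p0 * (delta - 1.0))


def ra_cusum(
    p0: Sequence[float], y: Sequence[int], delta: float, h: float
) -> Tuple[List[float], Optional[int]]:
    """One-sided RA-CUSUM path S_1, S_2, ... and the first patient with S_i > h.

    p0[i] is the risk-model probability for patient i + 1 (assumed in (0, 1)),
    y[i] is 1 if the adverse event occurred and 0 otherwise.
    The statistic is not reset after a signal in this sketch.
    """
    s = 0.0  # S_0 = 0
    path: List[float] = []
    first_signal: Optional[int] = None
    for i, (p, outcome) in enumerate(zip(p0, y), start=1):
        p1 = shifted_probability(p, delta)
        # Log-likelihood-ratio weight W_i for the observed outcome.
        if outcome == 1:
            w = math.log(p1 / p)
        else:
            w = math.log((1.0 - p1) / (1.0 - p))
        s = max(0.0, s + w)
        path.append(s)
        if first_signal is None and s > h:
            first_signal = i  # 1-based patient number
    return path, first_signal


def vlad(p0: Sequence[float], y: Sequence[int]) -> List[float]:
    """VLAD statistic: cumulative sum of p0_i - Y_i ("statistical lives saved")."""
    total = 0.0
    out: List[float] = []
    for p, outcome in zip(p0, y):
        total += p - outcome
        out.append(total)
    return out
```

For example, `ra_cusum(p0, y, delta=2.0, h=h)` would correspond to an upper chart designed to detect a doubling of the odds ratio, and `delta=0.5` to a chart designed to detect a halving; the value of `h` in any such call is a placeholder until it has been calibrated.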


5. Use of a Sequential Technique to Assess Learning Curves

It is important to distinguish between the use of the RA-CUSUM method and the use of what is referred to as the CUSUM technique to assess surgical learning curves. With learning curves, the cumulative number of failures is often plotted against the patient number if there is no risk adjustment. Sometimes cumulative values of $Y_i - p_0$ are plotted, where $p_0$ is the acceptable or expected failure rate. Decision lines are determined based on the sequential hypothesis testing (SPRT) approach of Wald (1947) for a sequence of independent Bernoulli outcomes, where one must also specify an unacceptable failure rate and type I and type II error rates. Rogers et al. (2004) referred to this method as resulting in a “cumulative failure chart”. The method can be risk adjusted. The SPRT stopping rule, under which sampling stops when a decision line is crossed, is typically not followed.

For more information on this CUSUM learning curve technique, see Williams et al. (1992), Novick and Stitt (1999), Bolsin and Colson (2000), Novick et al. (2001), Grunkemeier et al. (2003), and Yap et al. (2007). Most of the CUSUM applications found in the review by Biau et al. (2007) were learning curve analyses. It is frequently incorrectly stated in the literature that the learning curve CUSUM is related to the CUSUM method of Page (1954), whereas it is the RA-CUSUM method that is an extension of Page’s work. It seems that the learning curve SPRT approach has been misleadingly referred to as a CUSUM procedure because this was the terminology used by de Leval et al. (1994).

In applications of the learning curve approach, it is frequently expected that performance levels will change over time, perhaps more than once. Performance could either improve or deteriorate. Thus, it does not seem reasonable to use a method relying on a one-sided hypothesis testing approach where the error probabilities are based on the assumption of a constant level of performance. In learning curve applications, it is assumed that there is a sequence of independent Bernoulli observations where the interest is in detecting changes in the probability of failure. The large literature on this topic was reviewed by Szarka and Woodall (2011).

6. National Surgical Quality-Improvement Program

Over 560 U.S. hospitals and health systems and 43 outside the U.S. are American College of Surgeons National Surgical Quality Improvement Program (NSQIP) participants. Maggard-Gibbons (2013) reported that about 10% of the hospitals in the U.S. participate in NSQIP and that they account for about 30% of the over 40 million operations performed annually. The participating hospitals provide to NSQIP data on samples of surgical patients designed to represent all types of surgical procedures and all surgeons. NSQIP then provides risk-adjustment models and performance results to the hospitals. Thus hospital administrators and surgeons can see how their results compare to other hospitals and how individual surgeons compare to each other and to the overall NSQIP performance. This process can be used to identify areas needing improvement. Without the sort of benchmarking provided by NSQIP, hospital administrators and surgeons cannot accurately assess their current performance or as easily identify areas most needing improvement. Maggard-Gibbons (2013) reported that in a survey of NSQIP participants, over half reported that prior to joining NSQIP they did not know their surgical mortality rates, much less how their rates compared to other hospitals.

Ko (2009) and Maggard-Gibbons (2013) provided very good descriptions of the history and structure of NSQIP. Also, see http://site.acsnsqip.org/. Cohen et al. (2013) discussed a number of statistical aspects of the NSQIP analyses. It is reported at the NSQIP website that hospitals participating in NSQIP benefit from an average savings of about $3 million per year, reduced readmissions and lengths of stay, higher patient satisfaction, better patient outcomes, better performance on publicly reported measures, and better performance under pay-for-performance programs. In a thorough study, Hall et al. (2009) found that surgical outcomes improved across a majority of the NSQIP participating hospitals in the private sector. Improvement was found for both poor- and well-performing facilities. They reported that NSQIP hospitals appeared to be avoiding substantial numbers of complications, improving care, and reducing costs.

A key aspect leading to the success of NSQIP is that each hospital has a surgeon champion that identifies and leads improvement initiatives. There is also a well-trained surgical clinical reviewer, who is responsible for collecting complete and accurate clinical data, as opposed to reliance on less accurate and less relevant claims or administrative data. About 140 variables are measured for each surgical

Journal of Quality Technology Vol. 47, No. 4, October 2015



FIGURE 5. Funnel Plot Showing All-Cause Risk-Adjusted In-Hospital 30-Day Mortality for English National Health Service
Hospital Trusts (from Symons et al., 2013). Reprinted with permission from John Wiley and Sons.
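
As a rough illustration of how limits of the kind drawn on a funnel plot such as Figure 5 can be computed, the sketch below places binomial control limits around a target rate using a normal approximation. The function names, the target rate, and the choice of z = 3.09 (roughly 99.8% two-sided limits) are illustrative assumptions; this is not the method of Symons et al. (2013), and no overdispersion adjustment is made.

```python
import math


def funnel_limits(target_rate, n_cases, z):
    """Approximate funnel-plot limits for a proportion around target_rate.

    Uses the normal approximation to the binomial, so the limits are only
    rough for small n_cases or very small target rates.
    """
    if n_cases <= 0:
        raise ValueError("n_cases must be positive")
    se = math.sqrt(target_rate * (1.0 - target_rate) / n_cases)
    lower = max(0.0, target_rate - z * se)
    upper = min(1.0, target_rate + z * se)
    return lower, upper


def classify(observed_rate, target_rate, n_cases, z=3.09):
    """Say where a provider's observed rate falls relative to the funnel."""
    lower, upper = funnel_limits(target_rate, n_cases, z)
    if observed_rate > upper:
        return "above upper limit"
    if observed_rate < lower:
        return "below lower limit"
    return "within limits"


# Hypothetical providers with the same observed rate but different volumes.
for n in (50, 400):
    print(n, classify(0.10, 0.05, n))
```

Because the limits narrow as the number of cases grows, the same observed rate can fall inside the funnel for a low-volume provider and outside it for a high-volume one, which is the point of plotting the metric against the number of cases.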

patient included in the NSQIP database. There are 40 adverse events for which risk-adjustment models are constructed. These include such adverse events as cardiac occurrences, pneumonia, unplanned intubation, ventilator dependence over 48 hours, renal failure, and urinary tract infection, as well as death.

NSQIP previously used “caterpillar” plots to identify outlying performance. The healthcare providers were ordered by the O/E ratio, i.e., the ratio of the observed number of adverse events during a given time period to the expected number based on the risk-adjustment model. Note that, if the adverse event is death, the O/E ratio is often referred to as the standardized mortality ratio (SMR). If the confidence interval on the O/E ratio falls completely below unity, then the hospital is considered to have better than expected performance. A confidence interval falling completely above unity indicates performance below expected. Because much of the ordering of the performance of the hospitals is random, this caterpillar plot can lead to an overemphasis on relative position. Small differences in performance can be attributed to chance and small differences can change the position of a hospital on the caterpillar plot substantially. For this reason, we prefer the use of the funnel plot of Spiegelhalter (2005a, b) where the performance metric is plotted versus the number of cases. An example of a funnel plot from Symons et al. (2013) is shown in Figure 5. See Mayer et al. (2009) for a discussion of the use of funnel plots in surgical applications.

Because the number of participating hospitals has increased markedly from the inception of NSQIP, the caterpillar plots became unreadable. With this in mind, as well as the above-mentioned shortcoming of the plot, Cohen et al. (2013) reported that NSQIP moved to reporting the odds ratios with confidence intervals, along with the decile in which each hospital lies for each of the adverse events. It is easier to follow progress over time with this information than it is following position on the caterpillar plot.

A variety of methods have been proposed for identifying outlying performers in the comparison of hospitals. Bilimoria et al. (2010) showed that the number of hospitals identified as outlying varies widely depending on the method used. Spiegelhalter (2005) argued that it is important to allow for some overdispersion. It is also important to adjust for the number of hospitals being compared. In this regard, Jones and Spiegelhalter (2008) showed that adjusting the thresholds based on the false discovery rate was better than using the Bonferroni adjustment method. Other discussions of the issues involved in how to classify hospitals as below average, average, or above average were provided by Jones and Spiegelhalter (2011), He et al. (2014), Seaton and Manktelow (2012), Kalbfleisch and Wolfe (2013), Cohen et al. (2013), and Ieva and Paganoni (2015), among others.

One limitation of NSQIP analyses is that the full risk-adjusted outcome reports are provided based on


data aggregated over 6-month intervals with some additional delay for processing. Monitoring on a more continuous basis is possible by running non-risk-adjusted reports on a daily, weekly, or monthly basis. This only allows trending, however, with the assumption that the patient populations are uniform. Providing risk-adjusted charts based on patient-by-patient outcomes, even plotted retrospectively, would likely provide more insight into hospital performance over time and into the effect of improvement initiatives. Cohen et al. (2013) reported, however, that NSQIP is moving toward timelier monitoring.

The Surgical Outcomes Monitoring and Improvement Program of the Hong Kong Hospital Authority is structured similarly to NSQIP. Yuen (2013) reported on their evaluation of 17 public hospitals and their comparison of the observed mortality rates with the expected rates using data from 23,700 operations performed during the period July 2010–June 2012. Elective surgery and emergency surgery were treated separately. Outlying performance was identified using caterpillar plots.

7. NSQIP Case Study

In this section, we report on the surgical quality improvement obtained at the Carilion Clinic in Roanoke, Virginia, with Dr. Sandy L. Fogel, MD, as NSQIP surgeon champion and James Jones, BSN, as surgical clinical reviewer.

Based on the initial NSQIP results in 2007 showing O/E ratios significantly above one, the focus of improvement was in reducing the rate of surgical-site infections and the general surgery mortality. Adverse events tend to be expensive. NSQIP (2014) reported that the cost of a surgical-site infection averages around $27,000, while Dimick et al. (2004) estimated that a case of ventilator-associated pneumonia added about $50,000 to the cost of a surgical admission.

Best practices were used to identify improvement projects. Projects were undertaken to improve each of the following best practices, which were either inconsistently done or not done at all, in order to reduce the rate of surgical-site infections:

• Normothermia (patient warming) throughout the surgical and post-op period
• Post-op glucose control (through EndoTool™)
• Pre-op skin antisepsis at home
• Methicillin-resistant Staphylococcus aureus (MRSA) screening and selective decontamination and/or use of Vancomycin for pre-op antibiotic
• Better pre-op glucose control
• Identification and treatment of pre-op infection (especially urinary tract infections)
• Increasing the dose of pre-op antibiotics for obesity
• Redosing antibiotics at 3 hours into procedure
• Transport of post-op patients on oxygen
• Pre-op optimization of respiratory status.

The sequence of implementation of the quality-improvement projects was based on a combination of the expected relative impact on outcomes and the ease of accomplishment, including an assessment of the financial resources needed. The home antisepsis, patient warming, and improved glucose-control projects were implemented first.

It is important to control glucose levels because high blood-sugar levels are associated with higher rates of infection. EndoTool™ is a computerized system for calculating dosing for intravenous insulin. Fogel and Baker (2013) showed that use of this computerized system leads to better glucose control than standard paper-based protocols, where insulin doses are calculated using a worksheet. There were seven paper-based protocols at Carilion before the adoption of EndoTool™, with none used particularly well. The percentage of patients with blood-sugar levels above the high level of 150 milligrams per deciliter (mg/dL) was 31% over a 6-month period before use of EndoTool™ and 16% in the 6-month period afterward. Some patients are insulin-resistant, making it impossible to prevent having some patients with high blood-sugar levels.

With respect to other process-quality variables, the projects led to the following implementation rate changes: home antisepsis, from a 20% rate to over 95%; warming, from less than 30% to over 95%; redosing in the operating room for cases over 3 hours, from 0% to over 75%; and an MRSA screening rate of 100%, with MRSA treatment pre-op from 0% to more than 50%.

The dramatic effect of the initiatives in lowering the surgical-site infection rates can be seen in Figure 6. No O/E values from December 2009 onward, marked by open circles, were significantly different from one. Being able to monitor performance over time is a key benefit of participating in NSQIP.

With respect to the general mortality rate, reviews of medical records for prior cases showed that


FIGURE 6. Time-Series Plot of Carilion Surgical Site Infection O/E Ratios. Values marked by open circles are not significantly different from one.
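
To make the idea of an O/E ratio that is or is not "significantly different from one" concrete, the sketch below computes the ratio with an approximate 95% confidence interval. It assumes the observed count is Poisson distributed and uses Byar's approximation for the limits; the function names and example counts are hypothetical, and this is not a description of how NSQIP computes its own risk-adjusted reports.

```python
import math


def oe_ratio_ci(observed, expected, z=1.96):
    """O/E ratio with an approximate CI (Byar's approximation, Poisson count).

    `expected` is the sum of model-predicted risks for the period and is
    treated as fixed, which is an assumption of this sketch.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    ratio = observed / expected
    if observed == 0:
        lower = 0.0
    else:
        lower = observed * (
            1.0 - 1.0 / (9.0 * observed) - z / (3.0 * math.sqrt(observed))
        ) ** 3 / expected
    o1 = observed + 1
    upper = o1 * (1.0 - 1.0 / (9.0 * o1) + z / (3.0 * math.sqrt(o1))) ** 3 / expected
    return ratio, lower, upper


def compare_to_one(observed, expected, z=1.96):
    """Label a period by whether its interval excludes one."""
    _, lower, upper = oe_ratio_ci(observed, expected, z)
    if lower > 1.0:
        return "higher than expected"
    if upper < 1.0:
        return "lower than expected"
    return "not significantly different from one"


# Hypothetical reporting periods: (observed events, expected events).
for obs, exp in [(30, 20.0), (22, 20.0), (8, 20.0)]:
    print(obs, exp, compare_to_one(obs, exp))
```

In a time-series display of this kind, periods labeled "not significantly different from one" would correspond to the open circles.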

some surgical patients were less than medically optimized prior to surgery. This included patients going to surgery with poorly controlled hypertension, diabetes, cardiac disease, etc. Thus, the patient-screening process was moved back from 2–3 days before the planned operation to 2–3 weeks before in order to provide more time for proactive treatment. The effect on the mortality rate of this change and the changes implemented to reduce the rate of surgical-site infections is illustrated in Figure 7. The O/E values from June 2010 onward, marked by open circles, are not significantly different from one, evidence of improvement over earlier performance.

The largest barrier to successful implementation of a given quality-improvement project was surgeon

FIGURE 7. Time-Series Plot of Carilion 30-Day Mortality O/E Ratios. Values marked by open circles are not significantly different from one.


habit. It is simply hard to institute a change in behavior. The key was to make the changes as invisible to the surgeon as possible by relying on policies and protocols, with automatic population of the electronic medical record with the appropriate physician orders. Such process changes are difficult, time consuming, and require a great deal of teamwork among individuals and groups to accomplish. Obviously, the surgeons are needed, but so are anesthesiologists, nurses (pre-op, intra-op, and post-op), systems analysts, data analysts, financial representatives, purchasing agents, supply managers, and many others. One of the lessons learned was that it was the processes that needed to be improved, not the performance of any of the surgeons.

There were very significant improvements made in surgical quality that prevented many surgical-site infections and saved many lives. The raw data showed a reduction in mortality from a high of 3.7% to a low of 1.8%. This was a reduction in mortality of approximately 50%. The hospital performs roughly 20,000 surgical procedures a year, which translates into approximately 300 lives saved per year. The need for the improvements was made clear through the NSQIP benchmarking process. The effects of improvement initiatives were then monitored over time using NSQIP reports as process changes were implemented.

8. Some Potential Research Topics

Some topics related to risk-adjusted monitoring that merit further research include the following:

(a) There need to be studies of alternative methods for risk adjustment, including further study of the use of interaction terms in the logistic regression model. Multiple years of NSQIP data (the Participant Use Data File) are available to NSQIP participants. The 2012 file, for example, contains information on 543,885 cases submitted from 374 participating sites. These data could be used to study the performance of other risk-adjustment approaches. The evolving methodology used by NSQIP was discussed in considerable detail by Cohen et al. (2013).
(b) It is important to study the effect of estimation error on the various monitoring methods, particularly those described in Section 4.6 that incorporate the time until any death within a given time window following surgery. The bootstrap method of Jones and Steiner (2012) and Gandy and Kvaløy (2013) could perhaps be used to control the percentage of the time that the in-control ARL falls below a specified value. This can help to avoid designing charts that result in many false alarms.
(c) It may be possible to build on the work of Yeh et al. (2009) to develop a prospective profile monitoring approach to determine when a risk-adjustment logistic model needs to be updated. Similarly, the change-point approach of Gurevich and Vexler (2005) may be useful in the analysis of the baseline data used to design the surveillance methods, a topic needing more study generally.
(d) There seems to be an opportunity to develop alternatives to the SPRT method for the analysis of learning curves.
(e) The effect of data aggregation on the performance of the various methods needs to be quantified.
(f) Current surveillance methods are based on an assumption of independence of the outcomes. As pointed out by Morton (2003), there could be dependence over time or overdispersion compared with the assumed models. This requires study of current methods under these conditions and the development of new methods. Mousavi and Reynolds (2009) considered the design of a Bernoulli CUSUM chart using a model for dependence over time in the non-risk-adjusted case.
(g) Some applications involve monitoring many process data streams, a topic included in the overview by Woodall and Montgomery (2014). For example, Spiegelhalter et al. (2012) considered the problem of using CUSUM charts to monitor over 200,000 indicators for excess mortality. How to monitor such a large number of data streams most effectively is an area that needs more attention. One must ideally keep the number of false alarms low while maintaining the ability to detect significant outlying performance.
(h) Tang et al. (2015) proposed a method for risk-adjusted monitoring that allows for more than two outcomes. More work is needed in this area. Their method could be designed using the approach of Zhang and Woodall (2015) in order to make the method invariant to the underlying risk distribution.
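
Several of the topics above concern the design of risk-adjusted CUSUM charts for binary outcomes. As a minimal sketch of the kind of statistic involved, the code below uses the log-likelihood-ratio scoring commonly associated with risk-adjusted CUSUM charts, testing an odds ratio of `odds_ratio` against an in-control odds ratio of one. The function name, the odds ratio of 2, and the control limit of 4.5 are illustrative assumptions; in practice the limit would be chosen to achieve a desired in-control ARL for the patient mix at hand.

```python
import math


def ra_cusum(outcomes, risks, odds_ratio=2.0, limit=4.5):
    """Upper risk-adjusted CUSUM for binary outcomes (1 = adverse event).

    outcomes: sequence of 0/1 results in time order.
    risks:    model-predicted probabilities of an adverse event, one per patient.
    Returns the path of the statistic and the index of the first signal
    (None if the limit is never reached).
    """
    if len(outcomes) != len(risks):
        raise ValueError("outcomes and risks must have the same length")
    stat = 0.0
    path = []
    signal_at = None
    for i, (y, p) in enumerate(zip(outcomes, risks)):
        # Log-likelihood ratio score for this patient under the shifted odds.
        weight = y * math.log(odds_ratio) - math.log(1.0 - p + odds_ratio * p)
        stat = max(0.0, stat + weight)
        path.append(stat)
        if signal_at is None and stat >= limit:
            signal_at = i
    return path, signal_at


# Hypothetical run: ten low-risk patients, all with adverse events.
path, signal_at = ra_cusum([1] * 10, [0.1] * 10)
print(signal_at)
```

An adverse event in a low-risk patient moves the statistic up by more than one in a high-risk patient, which is how the chart accounts for case mix on a patient-by-patient basis.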


9. Conclusions

It is clearly important to monitor and improve healthcare quality, which includes surgical quality. We believe that there will be an increasing emphasis on the monitoring and public reporting of risk-adjusted outcome performance metrics, as, for example, in the UK (Bottle and Aylin (2008), Spiegelhalter et al. (2012)). Performance indicators are publicly available for each health trust in the UK. See, for example, Dr. Foster Intelligence (2014).

We strongly encourage hospital administrators to participate in NSQIP or some other collaborative network of hospitals to evaluate their performance results. The business case and the benefits to patients more than justify such participation. For those interested in best surgical practices to improve surgical quality, we also recommend the information provided through the Institute for Healthcare Improvement (www.ihi.org).

In general, we support the monitoring of surgical outcomes on a case-by-case basis with as little aggregation of data over time as possible. Data aggregation can slow the detection of changes in quality and make it more difficult to determine the immediate effects of specific quality-improvement initiatives. The RA-CUSUM chart combined with a VLAD plot is our recommended approach for monitoring on a case-by-case basis with binary data.

Acknowledgments

We thank Dr. Albert Yuen of the Hong Kong Hospital Authority for providing Figure 1. The work of W. H. Woodall was supported by National Science Foundation Grant CMMI-1436365.

References

Albers, W. (2011). “Risk-Adjusted Control Charts for Health Care Monitoring”. International Journal of Mathematics and Mathematical Sciences Article ID 895273, 16 pages.
Albert, A. A.; Walter, J. A.; Arnrich, B.; Hassanein, W.; Rosendahl, U. P.; Bauer, S.; and Ennker, J. (2004). “On-Line Variable Live-Adjusted Displays with Internal and External Risk-Adjusted Mortalities, A Valuable Method for Benchmarking and Early Detection of Unfavorable Trends in Cardiac Surgery”. European Journal of Cardio-Thoracic Surgery 25, pp. 312–319.
Alemi, F. and Oliver, D. (2001). “Tutorial on Risk-Adjusted p Charts”. Quality Management in Health Care 10, pp. 1–9.
Alemi, F.; Rom, W.; and Eisenstein, E. (1996). “Risk-Adjusted Control Charts for Health Care Assessment”. Annals of Operations Research 67, pp. 45–60.
Assareh, H. and Mengersen, K. (2014a). “Change Point Estimation in Monitoring Survival Times”. PLoS ONE 7(3). DOI: 10.1371/journal.pone.0033630.
Assareh, H. and Mengersen, K. (2014b). “Estimation of the Time of a Linear Trend in Monitoring Survival Time”. Health Services and Outcomes Research Methodology 14(1–2), pp. 15–33.
Axelrod, D. A.; Guidinger, M. K.; Metzger, R. A.; Wiesner, R. H.; Webb, R. L.; and Merion, R. M. (2006). “Transplant Center Quality Assessment Using a Continuously Updatable, Risk-Adjusted Technique (CUSUM)”. American Journal of Transplantation 6, pp. 313–323.
Axelrod, D. A.; Kalbfleisch, J. D.; Sun, R. J.; Guidinger, M. K.; Biswas, P.; Levine, G. N.; Arrington, C. J.; and Merion, R. M. (2009). “Innovations in the Assessment of Transplant Center Performance: Implications for Quality Improvement”. American Journal of Transplantation 9(2), pp. 959–969.
Beiles, C. B. and Morton, A. P. (2004). “Cumulative Sum Control Charts for Assessing Performance in Arterial Surgery”. ANZ Journal of Surgery 74, pp. 146–151.
Benneyan, J. C. (1998a). “Statistical Quality Control Methods in Infection Control and Hospital Epidemiology. Part I: Introduction and Basic Theory”. Infection Control and Hospital Epidemiology 19, pp. 194–214.
Benneyan, J. C. (1998b). “Statistical Quality Control Methods in Infection Control and Hospital Epidemiology. Part II: Chart Use, Statistical Properties, and Research Issues”. Infection Control and Hospital Epidemiology 19, pp. 265–283.
Biau, D. J.; Resche-Rigon, M.; Godiris-Petit, G.; Nizard, R. S.; and Porcher, R. (2007). “Quality Control of Surgical and Interventional Procedures: A Review of the CUSUM”. Quality and Safety in Health Care 16, pp. 203–207.
Bilimoria, K. Y.; Cohen, M. E.; Merkow, R. P.; Wang, X.; Bentrem, D. J.; Ingraham, A. M.; Richards, K.; Hall, B. L.; and Ko, C. Y. (2010). “Comparison of Outlier Identification Methods in Hospital Surgical Quality Improvement Programs”. Journal of Gastrointestinal Surgery 14, pp. 1600–1607.
Biswas, P. and Kalbfleisch, J. D. (2008). “A Risk-Adjusted CUSUM in Continuous Time Based on the Cox Model”. Statistics in Medicine 27, pp. 3382–3406.
Blackstone, E. H. (2004). “Monitoring Surgical Performance”. Journal of Thoracic and Cardiovascular Surgery 128(6), pp. 807–810.
Bolsin, S. and Colson, M. (2000). “The Use of the Cusum Technique in the Assessment of Trainee Competence in New Procedures”. International Journal for Quality in Health Care 12(5), pp. 433–438.
Bottle, A. and Aylin, P. (2008). “Intelligent Information: A National System for Monitoring Clinical Performance”. Health Services Research 43, pp. 1–31.
Bruce, J.; Russell, E. M.; Mollison, J.; and Krukowski, Z. H. (2001). “The Measurement and Monitoring of Surgical Adverse Events”. Health Technology Assessment 5(22).
Chang, T.-C. (2008). “Cumulative Sum Schemes for Surgical Performance Monitoring”. Journal of the Royal Statistical Society, Series A 171(2), pp. 407–432.
Chen, T.-T.; Chung, K.-P.; Hu, F.-C.; Fan, C. M.; and Yang, M.-C. (2011). “The Use of Statistical Process Control (Risk-Adjusted CUSUM, Risk-Adjusted RSPRT and CRAM with Prediction Limits) for Monitoring the Outcomes of Out-of-Hospital Cardiac Arrest Patients Rescued by the EMS


System”. Journal of Evaluation in Clinical Practice 17, pp. 71–77.
Clark, D. E.; Hannan, E. L.; and Wu, C. (2010). “Predicting Risk-Adjusted Mortality for Trauma Patients: Logistic Versus Multilevel Logistic Models”. Journal of the American College of Surgeons 211(2), pp. 224–231.
Clinical Practice Improvement Centre. (2008). VLADs for Dummies. Milton, Queensland, Australia: Wiley Publishing Australia Pty Ltd. Available on request from VLAD Queries@health.qld.gov.au
Cockings, J. G. L.; Cook, D. A.; and Iqbal, R. K. (2006). “Process Monitoring in Intensive Care with the use of Cumulative Expected Minus Observed Mortality and Risk-Adjusted p Charts”. Critical Care 10, R28. DOI: 10.1186/cc3996.
Cohen, M. E.; Dimick, J. B.; Bilimoria, K. Y.; Ko, C. Y.; Richards, K.; and Hall, B. L. (2009). “Risk Adjustment in the American College of Surgeons National Surgical Quality Improvement Program: A Comparison of Logistic Versus Hierarchical Modeling”. Journal of the American College of Surgeons 209(6), pp. 687–693.
Cohen, M. E.; Ko, C. Y.; Bilimoria, K. Y.; Zhou, L.; Huffman, K.; Wang, X.; Liu, Y.; Kraemer, K.; Meng, X.; Merkow, R.; Chow, W; Matel, B.; Richards, K.; Hart, A. J.; Dimick, J. B.; and Hall, R. I. (2013). “Optimizing ACS NSQIP Modeling for Evaluation of Surgical Quality and Risk: Patient Risk Adjustment, Procedure Mix Adjustment, Shrinkage Adjustment, and Surgical Focus”. Journal of the American College of Surgeons 217(2), pp. 336–346.
Collett, D.; Sibanda, N.; Pioli, S.; Bradley, A.; and Rudge, C. (2009). “The UK Scheme for Mandatory Continuous Monitoring of Early Transplant Outcome in All Kidney Transplant Centers”. Transplantation 88, pp. 970–975.
Collins, G. S.; Jibawi, A.; and McCulloch, P. (2011). “Control Charts Methods for Monitoring Surgical Performance: A Case Study from Gastro-Oesophageal Surgery”. European Journal of Surgical Oncology 37, pp. 473–480.
Cook, D. A.; Coory, M.; and Webster, R. A. (2011). “Exponentially Weighted Moving Average Charts to Compare Observed and Expected Values for Monitoring Risk-Adjusted Hospital Indicators”. BMJ Quality and Safety 20, pp. 469–474.
Cook, D. A.; Duke, G.; Hart, G. K.; Pilcher, D.; and Mullany, D. (2008). “Review of the Application of Risk-Adjusted Charts to Analyze Mortality Outcomes in Critical Care”. Critical Care Resuscitation 10(3), pp. 239–251.
Cook, D. A.; Steiner, S. H.; Cook, R. J.; Farewell, V. T.; and Morton, A. P. (2003). “Monitoring the Evolutionary Process of Quality: Risk-Adjusted Charting to Track Outcomes in Intensive Care”. Critical Care Medicine 31(6), pp. 1676–1682.
COPPS-CMS White Paper Committee. (2012). Statistical Issues in Assessing Hospital Performance. imstat.org/news/2012/03/05/1330972991833.html. Accessed on May 29, 2014.
de Leval, M. R.; Francois, K.; Bull, C.; Brawn, W. B.; and Spiegelhalter, D. J. (1994). “Analysis of a Cluster of Surgical Failures”. Journal of Thoracic and Cardiovascular Surgery 104, pp. 914–924.
Dimick, J. B.; Chen, S. L.; Taheri, P. A.; Henderson, W. G.; Khuri, S. F.; and Campbell, Jr., D. A. (2004). “Hospital Costs Associated with Surgical Complications: A Report from the Private Sector National Surgical Improvement Program”. Journal of the American College of Surgeons 202(6), pp. 531–537.
Department of Health. (2010). Equity and Excellence: Liberating the NHS. London: Department of Health.
Dr. Foster Intelligence (2014). “My Hospital Guide 2013”. myhospitalguide.drfosterintelligence.co.uk/#/mortality. Accessed June 2, 2014.
Donabedian, A. (1966). “Evaluating the Quality of Medical Care”. Milbank Memorial Fund Quarterly 44, pp. 166–206.
Fogel, S. L. and Baker, C. C. (2013). “Effects of Computerized Decision Support Systems on Blood Glucose Regulation in Critically Ill Surgical Patients”. Journal of the American College of Surgeons 216(4), pp. 828–833.
Gan, F. F.; Lin, L.; and Loke, C. K. (2012). “Risk-Adjusted Cumulative Sum Charting Procedures”. In Frontiers in Statistical Quality Control, Vol. 10, Lenz, H.-J.; Wilrich, P.-T.; and Schmid, W., eds., pp. 207–225. Physica-Verlag.
Gan, F. F. and Tan, T. (2010). “Risk-Adjusted Number-Between Failures Charting Procedures for Monitoring a Patient Care Process for Acute Myocardial Infarctions”. Health Care Management Science 13, pp. 222–233.
Gandy, A. and Kvaløy, J. T. (2013). “Guaranteed Conditional Performance of Control Charts via Bootstrap Methods”. Scandinavian Journal of Statistics 40, pp. 647–668.
Gandy, A.; Kvaløy, J. T.; Bottle, A.; and Zhou, F. (2010). “Risk-Adjusted Monitoring of Time to Event”. Biometrika 97, pp. 375–388.
Gombay, E.; Hussein, A. A.; and Steiner, S. H. (2011). “Monitoring Binary Outcomes Using Risk-Adjusted Charts: A Comparative Study”. Statistics in Medicine 30, pp. 2815–2826.
Grigg, O. and Farewell, V. (2004a). “An Overview of Risk-Adjusted Charts”. Journal of the Royal Statistical Society, Series A 167, pp. 523–539.
Grigg, O. and Farewell, V. (2004b). “A Risk-Adjusted Sets Method for Monitoring Adverse Medical Outcomes”. Statistics in Medicine 23, pp. 1593–1602.
Grigg, O. A.; Farewell, V. T.; and Spiegelhalter, D. J. (2003). “The Use of Risk-Adjusted CUSUM and RSPRT Charts for Monitoring in Medical Contexts”. Statistical Methods in Medical Research 12, pp. 147–170.
Grigg, O. and Spiegelhalter, D. J. (2007). “Simple Risk-Adjusted Exponentially Weighted Moving Average”. Journal of the American Statistical Association 102, pp. 140–152.
Grunkemeier, G. L.; Jin, R.; and Wu, Y. (2009). “Cumulative Sum Curves and Their Prediction Limits”. Annals of Thoracic Surgery 87, pp. 361–364.
Grunkemeier, G. L.; Wu, Y. X.; and Furnary, A. P. (2003). “Cumulative Sum Techniques for Assessing Surgical Results”. Annals of Thoracic Surgery 76, pp. 663–667.
Gurevich, G. and Vexler, A. (2005). “Change Point Problems in the Model of Logistic Regression”. Journal of Statistical Planning and Inference 131, pp. 313–331.
Gustafson, T. L. (2000). “Practical Risk-Adjusted Quality Control Charts for Infection Control”. American Journal of Infection Control 28(6), pp. 406–414.
Hall, B. L.; Hamilton, B. H.; Richards, K.; Bilimoria, K. Y.; Cohen, M. E.; and Ko, C. Y. (2009). “Does Surgical Quality Improve in the American College of Surgeons National Surgical Quality Improvement Program: An Evaluation of All Participating Hospitals”. Annals of Surgery 205(3), pp. 363–376.


Harris, J. R.; Forbes, T. L.; Steiner, S. H.; Lawlor, K.; Derose, G.; and Harris, K. A. (2005). “Risk-Adjusted Analysis of Early Mortality After Ruptured Abdominal Aortic Aneurysm Repair”. Journal of Vascular Surgery 42, pp. 387–391.
Hart, M. K.; Lee, K. Y.; Hart, R. F.; and Robertson, J. W. (2003). “Application of Attribute Control Charts to Risk-Adjusted Data for Monitoring and Improving Health Care Performance”. Quality Management in Health Care 12(1), pp. 5–19.
Hart, M. K.; Robertson, J. W.; Hart, R. F.; and Lee, K. Y. (2004). “Application of Variables Control Charts to Risk-Adjusted Time-Ordered Healthcare Data”. Quality Management in Health Care 13(2), pp. 99–119.
He, Y.; Selck, F.; and Sharon-Lise, T. N. (2014). “On the Accuracy of Classifying Hospitals on Their Performance Measures”. Statistics in Medicine 33, pp. 1081–1103.
Ieva, F. and Paganoni, A. M. (2015). “Detecting and Visualizing Outliers in Provider Profiling via Funnel Plots and Mixed Effect Models”. Health Care Management Science, to appear. DOI 10.1007/s10729-013-9264-9.
Iezzoni, L. (2012). Risk Adjustment for Measuring Health Care Outcomes, 4th edition. Chicago, IL: Health Administration Press.
Institute for Healthcare Improvement (2014). “Surgical Site Infection”. www.ihi.org/Topics/SSI/Pages/default.aspx. Accessed June 1, 2014.
Jones, H. E. and Spiegelhalter, D. J. (2008). “Use of False Discovery Rate When Comparing Multiple Healthcare Providers”. Journal of Clinical Epidemiology 61(3), pp. 232–240.
Jones, H. E. and Spiegelhalter, D. J. (2011). “The Identification of ‘Unusual’ Health-Care Providers from a Hierarchical Model”. The American Statistician 65(3), pp. 154–163.
Jones, M. A. and Steiner, S. H. (2012). “Assessing the Effect of Estimation Error on Risk-Adjusted CUSUM Chart Performance”. International Journal for Quality in Health Care 24(2), pp. 176–181.
Jones-Farmer, L. A.; Woodall, W. H.; Steiner, S. H.; and Champ, C. W. (2014). “An Overview of Phase I Analysis for Process Improvement and Monitoring”. Journal of Quality Technology 46(3), pp. 265–280.
Kalbfleisch, J. D. and Wolfe, R. A. (2013). “On Monitoring Outcomes of Medical Providers”. Statistics in Biosciences 5(2), pp. 286–302.
Ko, C. Y. (2009). “Measuring and Improving Surgical Quality”. Patient Safety and Quality Healthcare 6(6), pp. 36–41.
Leandro, G.; Rolando, N.; Gallus, G.; Rolles, K.; and Burroughs, A. K. (2005). “Monitoring Surgical and Medical Outcomes: The Bernoulli Cumulative SUM Chart. A Novel Application to Assess Clinical Interventions”. Postgraduate Medical Journal 81, pp. 647–652.
Loke, C. K. and Gan, F. F. (2012). “Joint Monitoring Scheme for Clinical Failures and Predisposed Risks”. Quality Technology and Quantitative Management 9(1), pp. 3–21.
Lovegrove, J.; Sherlaw-Johnson, C.; Valencia, O.; Treasure, T.; and Gallivan, S. (1999). “Monitoring the Performance of Cardiac Surgeons”. Journal of the Operational Research Society 50, pp. 684–689.
Lovegrove, J.; Valencia, O.; Treasure, T.; Sherlaw-Johnson, C.; and Gallivan, S. (1997). “Monitoring the Results of Cardiac Surgery by Variable Life-Adjusted Display”. Lancet 18, pp. 1128–1130.
Maggard-Gibbons, M. (2013). “Use of Report Cards and Outcome Measurements to Improve Safety of Surgical Care: American College of Surgeons National Quality Improvement Program”. Chapter 14 in Making Healthcare Safer. II: An Updated Critical Analysis of the Evidence for Patient Safety Practices, Evidence Reports/Technology Assessments, No. 211, Report No. 13-E001-EF, pp. 140–157. Rockville, MD: Agency for Healthcare Research and Quality.
Mayer, E. K.; Bottle, A.; Rao, C.; Darsi, A. W.; and Thanos, A. (2009). “Funnel Plots and Their Emerging Application in Surgery”. Annals of Surgery 249(3), pp. 376–383.
Megahed, F. M.; Kensler, J. L. K.; Bedair, K.; and Woodall, W. H. (2011). “A Note on the ARL of Two-Sided Bernoulli-Based CUSUM Control Charts”. Journal of Quality Technology 43(1), pp. 43–49.
Moore, R.; Nutley, M.; Cina, C. S.; Motamedi, M.; Faris, P.; and Abuznadah, W. (2007). “Improved Survival After Introduction of an Emergency Endovascular Therapy Protocol for Ruptured Abdominal Aortic Aneurysms”. Journal of Vascular Surgery 4, pp. 443–450.
Morton, A. P. (2003). “The Use of Statistical Process Control Methods in Monitoring Clinical Performance—Letter to the Editor”. International Journal for Quality in Health Care 15(4), pp. 361–362.
Morton, A. P.; Clements, A. C. A.; Doidge, S. R.; Stackelroth, J.; Curtis, M.; and Whitby, M. (2008). “Surveillance of Healthcare-Acquired Infections in Queensland, Australia: Data and Lessons Learned in the First 5 Years”. Infection Control and Hospital Epidemiology 29(8), pp. 695–701.
Mousavi, S. and Reynolds, M. R., Jr. (2009). “A CUSUM Chart for Monitoring a Proportion with Autocorrelated Binary Observations”. Journal of Quality Technology 41(4), pp. 401–414.
Moustakides, G. V. (1986). “Optimal Stopping Times for Detecting Changes in Distribution”. The Annals of Statistics 14, pp. 1379–1387.
Nashef, S. A. M.; Roques, F.; Sharples, L. D.; Nilsson, J.; Smith, C.; Goldstone, A. R.; and Lockowandt, U. (2012). “EuroSCORE II”. European Journal of Cardio-Thoracic Surgery 41(4), pp. 734–745.
National Surgical Quality Improvement Program (2014). site.acsnsqip.org/about/business-case/. Accessed on May 29, 2014.
Neuberger, J.; Madden, S.; and Collett, D. (2010). “Review of Methods for Measuring and Comparing Center Performance After Organ Transplantation”. Liver Transplantation 16, pp. 1119–1128.
New York State Department of Health (2001). Coronary Artery Bypass Surgery in New York State 1996–1998. www.health.ny.gov/statistics/diseases/cardiovascular/heart disease/docs/1996-1998 adult cardiac surgery.pdf. Accessed on May 29, 2014.
Novick, R. J.; Fox, S. A.; Stitt, L. W.; Forbes, T. L.; and Steiner, S. H. (2006). “Direct Comparison of Risk-Adjusted and Non-Risk-Adjusted CUSUM Analyses of Coronary Artery Bypass Surgery Outcomes”. Journal of Thoracic and Cardiovascular Surgery 132, pp. 386–391.
Novick, R. J.; Fox, S. A.; Stitt, L. W.; Swinamer, S. A.; Lehnhardt, K. R.; Rayman, R.; and Boyd, W. D. (2001). “Cumulative Sum Failure Analysis of a Policy Change from On-Pump to Off-Pump Coronary Artery Bypass Grafting”. The Annals of Thoracic Surgery 72(3), pp. S1016–S1021.


Novick, R. J. and Stitt, L. W. (1999). “The Learning Curve of an Academic Cardiac Surgeon: Use of the CUSUM Method”. Journal of Cardiac Surgery 14(5), pp. 312–320.
Noyez, L. (2009). “Control Charts, Cusum Techniques and Funnel Plots. A Review of Methods for Monitoring Performance in Healthcare”. Interactive Cardiovascular and Thoracic Surgery 9, pp. 494–499.
Page, E. S. (1954). “Continuous Inspection Schemes”. Biometrika 41, pp. 100–114.
Parsonnet, V.; Dean, D.; and Bernstein, A. D. (1989). “A Method of Uniform Stratification of Risks for Evaluating the Results of Surgery in Acquired Adult Heart Disease”. Circulation 779(Supplement 1), pp. 1–12.
Paynabar, K.; Jin, J. H.; and Yeh, A. B. (2012). “Phase I Risk-Adjusted Control Charts for Monitoring Surgical Performance by Considering Categorical Covariates”. Journal of Quality Technology 44(1), pp. 39–53.
Phinikettos, I. and Gandy, A. (2014). “An Omnibus CUSUM Chart for Monitoring Time to Event Data”. Lifetime Data Analysis 20, pp. 481–494.
Poloniecki, J.; Sismanidis, C.; Bland, M.; and Jones, P. (2004). “Retrospective Cohort Study of False Alarms Associated with a Series of Heart Operations: The Case for Hospital Mortality Monitoring Groups”. British Medical Journal 328, pp. 375–378.
Poloniecki, J.; Valencia, O.; and Littlejohns, P. (1998). “Cumulative Risk-Adjusted Mortality Chart for Detecting Changes in Death Rate: Observational Study of Heart Surgery”. British Medical Journal 316, pp. 1697–1700.
Porter, M. E. and Teisberg, E. O. (2007). “How Physicians Can Change the Future of Health Care”. Journal of the American Medical Association 297(10), pp. 1103–1111.
Reynolds, M. R., Jr. and Stoumbos, Z. G. (1999). “A CUSUM Chart for Monitoring a Proportion when Inspecting Continuously”. Journal of Quality Technology 31, pp. 87–108.
Rogers, C. A.; Ganesh, J. S.; Nicholas, R.; Banner, N. R.; and Bonser, R. S. (2005). “Cumulative Risk Adjusted Monitoring of 30-Day Mortality After Cardiothoracic Transplantation: UK Experience”. European Journal of Cardio-Thoracic Surgery 27, pp. 1022–1029.
Rogers, C. A.; Reeves, B. C.; Caputo, M.; Ganesh, J. S.; Bonser, R. S.; and Angelini, G. D. (2004). “Control Chart Methods for Monitoring Cardiac Surgical Performance and Their Interpretation”. Journal of Thoracic and Cardiovascular Surgery 128, pp. 811–819.
Schuh, A.; Woodall, W. H.; and Camelio, J. A. (2013). “The Effect of Aggregating Data When Monitoring a Poisson Process”. Journal of Quality Technology 45(3), pp. 260–272.
Seaton, S. E. and Manktelow, B. N. (2012). “The Probability of Being Identified as an Outlier with Commonly Used Funnel Plot Control Limits for the Standardized Mortality Ratio”. BMC Medical Research Methodology 12, p. 98.
Sego, L. H.; Reynolds, M. R., Jr.; and Woodall, W. H. (2009). “Risk-Adjusted Monitoring of Survival Times”. Statistics in Medicine 28, pp. 1386–1401.
Sego, L. H.; Woodall, W. H.; and Reynolds, M. R., Jr. (2008). “A Comparison of Surveillance Methods for Small Incidence Rates”. Statistics in Medicine 27(8), pp. 1225–1247.
Shen, X.; Tsung, F.; Zou, C.; and Jiang, W. (2013). “Monitoring Poisson Count Data with Probability Control Limits When Sample Sizes Are Time-varying”. Naval Research Logistics 60(8), pp. 625–636.
Sherlaw-Johnson, C. (2005). “A Method for Detecting Runs of Good and Bad Clinical Outcomes on Variable Life-Adjusted Display (VLAD) Charts”. Health Care Management Science 8, pp. 61–65.
Sherlaw-Johnson, C.; Lovegrove, J.; Treasure, T.; and Gallivan, S. (2000). “Likely Variations in Perioperative Mortality Associated with Cardiac Surgery: When Does High Mortality Reflect Bad Practice?” Heart 84, pp. 79–82.
Sherlaw-Johnson, C.; Morton, A.; Robison, M. B.; and Hall, A. (2005). “Real-Time Monitoring of Coronary Care Mortality: A Comparison and Combination of Two Monitoring Tools”. International Journal of Cardiology 100, pp. 301–307.
Sherlaw-Johnson, C.; Wilson, P.; and Gallivan, S. (2007). “The Development and Use of Tools for Monitoring the Occurrence of Surgical Wound Infections”. Journal of the Operational Research Society 58, pp. 228–234.
Sismanidis, C.; Bland, M.; and Poloniecki, J. (2003). “Properties of the Cumulative Risk-Adjusted Mortality (CRAM) Chart, Including the Number of Deaths Before a Doubling of the Death Rate Is Detected”. Medical Decision Making 23(3), pp. 242–251.
Snyder, J. J.; Salkowski, N.; Zaun, D.; Leppke, S. N.; Leighton, T.; Israni, A. K.; and Kasiske, B. L. (2014). “New Quality Monitoring Tools Provided by the Scientific Registry of Transplant Recipients: CUSUM”. American Journal of Transplantation 14, pp. 515–523.
Spiegelhalter, D. J. (2005a). “Funnel Plots for Comparing Institutional Performance”. Statistics in Medicine 24, pp. 1185–1202.
Spiegelhalter, D. J. (2005b). “Handling Over-Dispersion of Performance Indicators”. Quality and Safety in Healthcare 14, pp. 347–351.
Spiegelhalter, D.; Grigg, O.; Kinsman, R.; and Treasure, T. (2003). “Risk-Adjusted Sequential Probability Ratio Tests: Applications to Bristol, Shipman and Adult Cardiac Surgery”. International Journal for Quality in Health Care 15, pp. 7–13.
Spiegelhalter, D.; Sherlaw-Johnson, C.; Bardsley, M.; Blunt, I.; Wood, C.; and Grigg, O. (2012). “Statistical Methods for Healthcare Regulation: Rating, Screening and Surveillance (with Discussion)”. Journal of the Royal Statistical Society, Series A 175(1), pp. 1–47.
Steiner, S. H. (2014). “Risk-Adjusted Monitoring of Outcomes in Health Care”. Chapter 14 in Statistics in Action: A Canadian Outlook, Lawless, J. F., ed., pp. 245–264. Chapman and Hall/CRC.
Steiner, S. H.; Cook, R. J.; and Farewell, V. T. (1999). “Monitoring Paired Binary Surgical Outcomes Using Cumulative Sum Charts”. Statistics in Medicine 18, pp. 69–86.
Steiner, S. H.; Cook, R. J.; and Farewell, V. T. (2001). “Risk-Adjusted Monitoring of Binary Surgical Outcomes”. Medical Decision Making 21(3), pp. 163–169.
Steiner, S. H.; Cook, R. J.; Farewell, V. T.; and Treasure, T. (2000). “Monitoring Surgical Performance Using Risk-Adjusted Cumulative Sum Charts”. Biostatistics 1, pp. 441–452.
Steiner, S. H. and Jones, M. (2010). “Risk-Adjusted Survival Time Monitoring with an Updating Exponentially Weighted Moving Average (EWMA) Control Chart”. Statistics in Medicine 29, pp. 444–454.

Journal of Quality Technology Vol. 47, No. 4, October 2015



Steiner, S. H. and MacKay, R. J. (2014). “Monitoring Risk-Adjusted Medical Outcomes Allowing for Changes over Time”. Biostatistics 15(4), pp. 665–676.
Sun, R. J. and Kalbfleisch, J. D. (2013). “A Risk-Adjusted O-E CUSUM with Monitoring Bands for Monitoring Medical Outcomes”. Biometrics 69, pp. 62–69.
Symons, N. R. A.; Moorthy, K.; Almoudaris, A. M.; Bottle, A.; Aylin, P.; Vincent, C. A.; and Faiz, O. D. (2013). “Mortality in High-Risk Emergency General Surgical Admissions”. British Journal of Surgery 100, pp. 1318–1325.
Szarka, III, J. L. and Woodall, W. H. (2011). “A Review and Perspective on Surveillance of Bernoulli Processes”. Quality and Reliability Engineering International 27, pp. 735–752.
Tang, X.; Gan, F. F.; and Zhang, L. (2015). “Risk-Adjusted Cumulative Sum Charting Procedure Based on Multi-Responses”. Journal of the American Statistical Association, to appear.
Tian, W.; Sun, H.; Zhang, X.; and Woodall, W. H. (2015). “The Impact of Varying Patient Populations on the In-Control Performance of the Risk-Adjusted Bernoulli CUSUM Chart”. International Journal for Quality in Health Care 27(1), pp. 31–36.
Treasure, T.; Gallivan, S.; and Sherlaw-Johnson, C. (2004). “Monitoring Cardiac Surgical Performance: A Commentary”. Journal of Thoracic and Cardiovascular Surgery 128, pp. 823–825.
U.S. Department of Health and Human Services. (2013). “National Action Plan to Prevent Health Care-Associated Infections: Roadmap to Elimination”. www.health.gov/hai/prevent_hai.asp. Accessed June 6, 2014.
Wald, A. (1947). Sequential Analysis. New York, NY: Wiley.
Webster, R. A. and Pettitt, A. N. (2007). “Stability of Approximations of Average Run Length of Risk-Adjusted CUSUM Schemes Using the Markov Approach: Comparing Two Methods of Calculating Transition Probabilities”. Communications in Statistics - Simulation and Computation 36, pp. 471–482.
Williams, S. M.; Parry, B. R.; and Schlup, M. M. T. (1992). “Quality Control: An Application of the CUSUM”. British Medical Journal 304, pp. 1359–1361.
Winkel, P. and Zhang, N. F. (2007). Statistical Development of Quality in Medicine. Hoboken, NJ: John Wiley & Sons, Inc.
Winkel, P. and Zhang, N. F. (2012). “Statistical Process Control in Clinical Medicine”, Chapter 15 in Statistical Methods in Healthcare, Faltin, F. W.; Kenett, R.; and Ruggeri, F., eds., pp. 309–334. John Wiley & Sons, Inc.
Woodall, W. H. (2006). “Use of Control Charts in Health-Care and Public-Health Surveillance (with Discussion)”. Journal of Quality Technology 38(2), pp. 89–104.
Woodall, W. H.; Adams, B. M.; and Benneyan, J. C. (2012). “The Use of Control Charts in Healthcare”, Chapter 12 in Statistical Methods in Healthcare, Faltin, F. W.; Kenett, R.; and Ruggeri, F., eds., pp. 253–267. John Wiley & Sons, Inc.
Woodall, W. H. and Montgomery, D. C. (2014). “Some Current Directions in the Theory and Application of Statistical Process Monitoring”. Journal of Quality Technology 46(1), pp. 78–94.
Yap, C.-H.; Colson, M. E.; and Watters, D. A. (2007). “Cumulative Sum Techniques for Surgeons: A Brief Review”. ANZ Journal of Surgery 77, pp. 583–586.
Yeh, A. B.; Huwang, L.; and Li, Y.-M. (2009). “Profile Monitoring for a Binary Response”. IIE Transactions 41(11), pp. 931–941.
Yuen, W.-C. (2013). “Applying Variable Life Adjusted Display in Monitoring Surgical Outcomes”. Paper presented at the 2013 Hong Kong Hospital Authority Convention. www.ha.org.hk/haconvention/hac2013/proceedings/downloads/MC1.1.pdf. Accessed on June 1, 2014.
Zeng, L. and Zhou, S. (2011). “A Bayesian Approach to Risk-Adjusted Outcome Monitoring in Healthcare”. Statistics in Medicine 30, pp. 3431–3446.
Zhang, X. and Woodall, W. H. (2015). “Dynamic Control Limits for the Risk-Adjusted Bernoulli CUSUM Chart”. Statistics in Medicine, to appear.

