
J. Account. Public Policy xxx (xxxx) xxx

Contents lists available at ScienceDirect

J. Account. Public Policy

journal homepage: www.elsevier.com/locate/jaccpubpol

Using machine learning to predict auditor switches: How the likelihood of switching affects audit quality among non-switching clients

Joshua O.S. Hunt a,*, David M. Rosser b, Stephen P. Rowe c

a Mississippi State University, United States
b University of Texas at Arlington, United States
c University of Arkansas, United States

Article info

Article history: Available online xxxx

Keywords: Auditor change; Audit quality; Auditor independence; Machine learning; Data analytics

Abstract

In this paper, we utilize machine learning techniques to identify the likelihood that a company switches auditors and examine whether an increased likelihood of switching is associated with audit quality. Building on research that finds a deterioration in audit quality associated with clients that engage in audit opinion shopping, we predict and find lower audit quality among companies that are more likely to switch auditors but remain with their incumbent auditor. Specifically, we find that companies more likely to switch auditors have a higher likelihood of misstatement and larger abnormal accruals. These results are consistent with auditors sacrificing audit quality to retain clients that might otherwise switch. Our findings are especially concerning because there is no public signal of this behavior, such as an auditor switch. Our methodology is designed such that it could be implemented by investors, audit firms and regulators to identify companies with a higher probability of switching auditors and preemptively address the deterioration in audit quality.

Published by Elsevier Inc.

1. Introduction

Regulators have long been concerned about the possibility that company management could switch auditors in order to
obtain a more favorable audit opinion (U.S. Senate, 1977). This ‘‘opinion shopping” poses a threat to auditor independence
and, consequently, audit quality, which has prompted regulators to implement policies designed to prevent it.1 For instance,
the SEC requires companies switching auditors to publicly disclose auditor-client disagreements in Form 8-K, and the Sarbanes-Oxley Act of 2002 (SOX) transferred the responsibility of auditor appointment from management to the audit committee of the
board of directors. Recent research indicates, however, that these efforts have not eliminated management involvement in audi-
tor selection and potential opinion shopping behavior (e.g., Dhaliwal et al., 2015; Newton et al., 2016). While it is important for
policy makers to understand and address the independence threat posed by auditor switching, of equal or greater concern are
instances when a switch is threatened but the auditor capitulates in some way in order to retain the client.2 In such cases,

* Corresponding author.
1 Prior research has used ‘‘opinion shopping” to refer to a variety of audit outcomes that a company might switch auditors to obtain, including an unqualified audit opinion (instead of an adverse opinion or an opinion modified because of going concern uncertainties), a clean opinion on the effectiveness of internal controls, or less conservative accounting treatment.
2 A threat of switching can be implied rather than explicit. Furthermore, prior research indicates that weakening of auditors’ independence mindset can be unintentional (Bazerman et al., 1997; Nelson 2009).

https://doi.org/10.1016/j.jaccpubpol.2020.106785
0278-4254/Published by Elsevier Inc.

Please cite this article as: J. O. S. Hunt, D. M. Rosser and S. P. Rowe, Using machine learning to predict auditor switches: How the likelihood
of switching affects audit quality among non-switching clients, J. Account. Public Policy, https://doi.org/10.1016/j.jaccpubpol.2020.106785

market participants cannot observe a switch and have no warning of potential financial reporting weaknesses. In this paper, we
utilize machine learning techniques to identify instances when companies are likely to switch auditors but do not and we inves-
tigate whether the likelihood of switching is associated with lower audit quality for these companies. Our approach could help
regulators identify continuing audit engagements that are susceptible to deteriorated audit quality.
Prior research in auditing provides evidence that companies engage in opinion shopping, which has the potential to
decrease auditor independence and audit quality (e.g., Krishnan, 1994; Lennox, 2000; Newton et al., 2016). Decreased audit
quality provides less assurance that the financial statements reflect the underlying economic condition of the company and
increases the likelihood that the financial statements reflect management bias and contain material errors (Watkins et al.,
2004; DeFond and Zhang, 2014). Most prior opinion shopping studies examine instances when companies switch auditors
and investigate the effects of switching on audit quality. While the implications of this form of opinion shopping are impor-
tant to understand, this stream of research does not investigate the full extent of the potential problem because the analyses
are limited to instances when a switch is observed. We contribute to this literature by identifying instances when auditor
independence may be impaired but no auditor switch is observed. We design our tests based on the conjecture that an
increased likelihood of switching when no switch occurs may indicate that the auditor compromised their independence
by capitulating in some way to avert the loss of the client.
We use machine learning techniques to estimate the likelihood that a company will switch their auditor and examine the
association between the likelihood of switching and audit quality for clients that retain their incumbent auditor. Machine
learning is a statistical approach that uses data to identify patterns and relations between variables using automated model
building processes with limited human interaction (Cecchini et al., 2010). Advances in this approach have improved our abil-
ity to model likelihoods and investigate settings where observable outcomes are rare but potentially meaningful. Machine
learning is a valuable tool to use for our purpose because machine learning methods are largely specialized for prediction
tasks (Gu et al., 2020). These methods have the potential to improve on the traditional models that audit research has used
to make predictions (e.g., logistic regression) because they are well-suited to discovering complex patterns and they rely on
fewer, and less restrictive, assumptions (Hastie et al., 2009; Mullainathan and Spiess, 2017).
In order to be useful, however, machine learning techniques require appropriate training and testing procedures
(Mullainathan and Spiess, 2017).3 We begin our training and testing procedures by modeling auditor switches as our target
variable, with input variables selected following prior auditor turnover research (Brown and Knechel, 2016), using a sample of company-years from
2002 through 2016. We use five-year rolling windows for our training sets and assess the accuracy
of our prediction models out-of-sample in the sixth year. We compare the accuracy of predictions from multiple machine learn-
ing models (gradient boosting, random forest, neural networks, and support vector machines) and logistic regression because it
is commonly used in prior research. We use the probability of switching auditors estimated using the gradient boosting method
in our remaining analyses because gradient boosting consistently outperforms the other methods in our setting.
Similar to the opinion shopping literature, we interpret lower audit quality as indicative of impaired auditor indepen-
dence. Accordingly, we hypothesize that audit quality among companies retaining their incumbent auditor decreases as
the probability of switching auditors increases. In our primary analyses, we limit the sample to companies with a continuing
auditor–client relationship (i.e., companies that do not switch their auditor). We use two proxies for audit quality commonly
used in prior research: the likelihood of misstatement (subsequently revealed through a restatement of the previously issued
financial statements) and abnormal accruals (DeFond and Zhang, 2014). We find that the probability of switching auditors is
positively and significantly associated with the likelihood of misstatement and with abnormal accruals. We also find that the
decreased audit quality is concentrated among companies in the top decile of the probability of switching.
In additional analyses we further investigate our conjecture that the findings from our primary tests are attributable to
impaired auditor independence. We separately examine large/small clients and large/small audit offices. Large audit offices
may be less susceptible to independence impairment because the loss of a single client is less impactful (DeAngelo, 1981)
and the threat of switching auditors is likely less credible for large clients because they have fewer realistic alternatives
(Krishnan, 1994). We find that our results are generally stronger for small clients and for clients audited by smaller audit
offices. The results from these tests, in conjunction with our primary results, support the notion that the likelihood of switch-
ing auditors decreases auditor independence. However, we acknowledge that there may be alternative explanations because
audit office and client size have been used to proxy for other constructs unrelated to auditor independence.
Lastly, we expand our sample to include companies that switched auditors in order to provide insight into how switching
companies compare to the companies in our primary sample because prior research indicates that companies that switch
auditors obtain more favorable audit opinions (Lennox, 2000). We continue to find strong and consistent evidence that
the probability of switching is associated with lower audit quality among companies remaining with their incumbent audi-
tor. We also find some evidence that audit quality is lower among non-switching companies with a high probability of
switching than it is among companies that switched.
Our study makes two primary contributions. First, we leverage machine learning techniques to estimate the probability
that a company will switch auditors and we investigate whether this probability is associated with adverse audit outcomes.
While prior research has investigated many reasons to be concerned when a company switches its auditor (Schwartz and

3 Training and testing procedures are integral to machine learning techniques because the techniques require training data in order to ‘‘learn” and out-of-sample testing for assessing the accuracy of predictions in order to ensure that the models do not overfit the data. We discuss these procedures in greater detail in Section 3.


Menon, 1985; Roberts et al., 1990; Schwartz and Soo, 1996; Ashbaugh-Skaife et al., 2007; Lennox and Park, 2007), to the
best of our knowledge, our paper is the first to identify similar concerns among companies that are more likely to switch
auditors but do not. The audit quality implications may be more serious because there is no observable signal, as there is in the
case of an auditor switch. As such, we contribute to the literature on factors impacting auditor independence, which could
introduce bias into auditor judgments similar to those identified in prior literature on opinion shopping (e.g., Lennox, 2000;
Lu, 2006).
Second, we describe an approach to identifying audits with potentially poor audit quality that can be adapted and used by
regulators, investors, and audit firms. Our findings highlight the need to scrutinize auditor–client relationships with a high
probability of switching auditors where no auditor switch is observed (our results suggest that the top decile of the distri-
bution of the probability of switching would be a reasonable cutoff to use). For example, the PCAOB could incorporate the
probability of switching into the risk factors that they use to identify audits to review and audit quality review teams within
audit firms could implement a similar approach to reduce audit failures. An advantage of our approach is that it uses publicly
available data (financial results for public companies from previous years) and can be implemented in a timely manner (the
analyses can be performed on a sample as soon as the auditor for the current year is known). Our findings suggest that the
analyses described here could help financial statement users identify audits at higher risk of low quality, including serious
quality impairments (i.e., ‘‘Big R” restatements).
The rest of this paper proceeds as follows. Section 2 discusses prior literature and presents our hypothesis, Section 3
describes our methodology for predicting the probability of switching auditors, Section 4 discusses the sample and method-
ology used in our analyses and presents results, and Section 5 concludes.

2. Prior literature and hypothesis

2.1. Machine learning

Broadly defined, machine learning refers to an approach that uses automated processes to identify patterns and relations
between variables in data sets with limited human interaction (Cecchini et al. 2010). The process can be used to identify
complex patterns and relations that would be difficult for humans to discover using traditional statistical techniques.
Machine learning can be broadly classified as a subset of design science, the objective of which is to develop useful tools
to help solve important problems, whereas natural and social science seek to develop theories and test them (Kogan
et al., 2019). Machine learning techniques have been used outside of accounting research to investigate various topics of con-
cern for accounting and audit researchers, such as detecting fraud (e.g., Maes et al., 2002; Cecchini et al., 2010; Whiting et al.,
2012) and predicting bankruptcy (e.g., Min and Lee, 2005; Tsai and Wu, 2008).
These techniques have also been introduced in accounting research for bankruptcy prediction (Jones, 2017), identification
of peer companies (Ding et al., 2019), estimating equity risk premia (Gu et al. 2020) and predicting financial statement fraud
(Green and Choi, 1997; Perols and Lougee, 2011; Perols et al., 2017; Bao et al., 2020; Bertomeu et al., 2020). The use of
machine learning to predict financial statement fraud is the most closely related to our study because it attempts to predict
rare events that have a meaningful impact on the reliability of the financial statements. Prior research generally finds that,
while the best technique to use varies by setting, predicting financial statement reliability can often be improved by using
machine learning methods relative to traditional parametric methods such as logistic regression. For instance, Bertomeu
et al. (2020) use machine learning techniques to predict SEC Accounting and Auditing Enforcement Releases (AAERs) and
restatements and find that gradient boosting outperforms logistic regression.
Machine learning techniques have particularly great potential to improve prediction tasks because prediction is often the
primary purpose for which they are designed (Gu et al. 2020). Other models commonly used in accounting research, such as
logistic regression, are well-suited to provide parameter estimates but may not be as effective at making out-of-sample pre-
dictions. Machine learning methods may also improve upon predictions made using other methods because they require
fewer assumptions about the data generating process (Mullainathan and Spiess, 2017). Consequently, they are flexible
and well-suited to approximating complex and unknown data generating processes (Gu et al. 2020). Of particular concern
in our setting are the logistic regression assumptions that the independent variables in the model are linearly related to the
log odds and that the independent variables have little multicollinearity (Hastie et al. 2009). The machine learning methods
that we consider do not make linearity (they are nonparametric) or multicollinearity assumptions and are likely to improve
predictions if these assumptions are violated in our setting.
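To illustrate this point with a toy example of our own (not from the paper), a tree-based ensemble can recover a classification rule driven purely by an interaction between two inputs, a pattern that violates logistic regression's log-odds linearity assumption:

```python
# Illustrative only: synthetic XOR-style data where the class depends on the
# interaction of two inputs, so no single linear combination is informative.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(int)          # pure interaction effect
X_tr, X_te, y_tr, y_te = X[:3000], X[3000:], y[:3000], y[3000:]

logit = LogisticRegression().fit(X_tr, y_tr)
gb = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

logit_auc = roc_auc_score(y_te, logit.predict_proba(X_te)[:, 1])
gb_auc = roc_auc_score(y_te, gb.predict_proba(X_te)[:, 1])
# logit_auc should hover near 0.5 (chance), while gb_auc approaches 1.0,
# because the trees can split on each input conditional on the other.
```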

2.2. Opinion shopping and auditor independence

The term ‘‘opinion shopping” is commonly used to refer to instances where companies switch auditors in order to obtain
a more favorable audit opinion. Prior research on opinion shopping identifies certain ‘‘unsatisfactory” opinions that a com-
pany might seek to avoid, including opinions modified for going concern uncertainty (e.g., Geiger et al., 1998; Lennox, 2000;
Carcello and Neal, 2003) and opinions reporting material weaknesses in internal controls (Newton et al. 2016). This stream of
research has shown that auditor switches are particularly likely when audit opinions or the overall accounting preferences (e.g., discretionary accruals) enforced by auditors are conservative (Krishnan, 1994; DeFond and Subramanyam,


1998). A major concern with opinion shopping behavior identified by much of the prior literature is that it reduces overall
audit quality by decreasing auditor independence.
Audit quality is the level of assurance that the financial statements accurately portray the financial performance of the
company, generally by being free of material errors or omissions (Watkins et al., 2004; DeFond and Zhang, 2014). A large
literature has emerged that examines the factors that affect the supply of, and demand for, audit quality (DeFond and
Zhang, 2014 provide an extensive literature review). Audit quality related to the likelihood of companies switching auditors
is generally classified as a factor impacting auditors’ supply of ‘‘independent” assurance over the financial statements. Audi-
tor independence is a critical attribute that provides value to the audit by reducing outside bias; however, achieving an unbi-
ased audit is difficult (Bazerman et al. 1997). Prior research has examined various factors within the audit setting that could
reduce independence and introduce bias into auditor judgments, such as the provision of non-audit services (DeFond et al.,
2002; Kinney et al., 2004), client importance (Craswell et al., 2002; Li, 2009), auditor tenure (Gul et al., 2007) and auditor
opinion shopping (Lennox, 2000; Lu, 2006). Within this broad literature addressing auditor independence, our study relates
most closely to the literature investigating the independence threat associated with opinion shopping.
Companies choosing to switch auditors because of a disagreement with the auditor or in order to obtain a less conserva-
tive audit opinion pose a serious risk to audit quality. However, there is disagreement about the broader impact of auditor
switches on audit and financial reporting quality. Some studies demonstrate improvement in audit quality following an
auditor switch (Lu, 2006) or provide evidence that audit judgments and opinions are similar after a switch (Chow and
Rice, 1982; DeFond and Subramanyam, 1998). However, Lennox (2000) demonstrates that audit opinions are more favorable
to management following a change in auditor, using predicted opinions instead of actual opinions. Importantly, previous
opinion shopping research is generally limited to observed auditor–client switching activities and to comparing the audit
quality of the departing auditor and receiving auditor.
Our study is related to prior opinion shopping research in that similar auditor–client disagreements that lead companies
to switch auditors may be present in auditor–client relationships where no switch is observed. Our conjecture is that audi-
tors sometimes capitulate to their clients’ demands in order to retain the client, compromising their independence.4 We use
machine learning techniques to identify instances where an auditor switch is more likely to occur but the company retains its
incumbent auditor, and we investigate whether auditor independence appears to be impaired for these companies. Our expec-
tations are based on the ability of machine learning methods to predict switching events and the notion that companies’ ability
to switch away from more rigorous auditors can decrease auditor independence (Lennox, 2000). Similar to the opinion shopping
literature, we interpret decreased audit quality as evidence of impaired auditor independence. Specifically, we hypothesize:

H1: Audit quality among companies retaining their incumbent auditor decreases as the probability of switching auditors
increases.

3. Predicting the probability of switching auditors

We begin with the intersection of Compustat and Audit Analytics from fiscal year 2002 through 2016. We require total
assets of at least $1 million. Similar to prior literature, we exclude companies in the financial (two-digit SIC 60 through 69)
and utilities (two-digit SIC 44 through 49) industries because the audit market in these highly regulated industries is likely
substantially different from other industries included in our sample. We use this sample (73,898 observations) as our train-
ing and testing dataset to generate our estimated probability of switching auditors (PROB_SWITCH). We impute values for
missing variables using the median (for continuous variables) or mode (for indicator variables) by year. Table 1 presents
descriptive statistics for our training and testing sample.5
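The by-year imputation step described above can be sketched in pandas as follows (the toy DataFrame and its column names are illustrative, not the paper's code):

```python
import pandas as pd

# Toy company-year panel with missing values; `ROA` stands in for a
# continuous input and `LOSS` for an indicator input.
df = pd.DataFrame({
    "year": [2002, 2002, 2002, 2003, 2003, 2003],
    "ROA":  [0.05, None, 0.10, -0.02, 0.03, None],
    "LOSS": [0, 0, None, 1, None, 1],
})

# Continuous variables: fill missing values with the within-year median.
df["ROA"] = df.groupby("year")["ROA"].transform(lambda s: s.fillna(s.median()))
# Indicator variables: fill missing values with the within-year mode.
df["LOSS"] = df.groupby("year")["LOSS"].transform(
    lambda s: s.fillna(s.mode().iloc[0]))
```

Grouping on the year before filling keeps each imputed value comparable to its cross-sectional peers rather than pooling across very different economic periods.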
In order to identify the independent variables likely to be most important to include in our training and testing models,
we begin with the variables used by Brown and Knechel (2016, p. 747 and p. 752) in their model of auditor turnover at the
office level, with some adaptations to our setting. Specifically, we include the following variables for each company in our
sample, measured at year t-1: the natural log of the company’s total assets in $ millions (LN_AT), receivables plus inventory
scaled by total assets (INVT_RECT), discretionary accruals estimated using the modified Jones model (DACC),6 cash plus cash
equivalents scaled by total assets (CASH), income before extraordinary items scaled by total assets (ROA), an indicator variable
set equal to one if ROA is less than zero, and zero otherwise (LOSS), the change in total assets scaled by prior year total assets
(AT_GROW), an indicator variable set equal to one if cash outflows related to acquisitions or the contribution of acquisitions to
sales exceed ten percent of total assets, and zero otherwise (ACQUIRE), an indicator variable set equal to one if the company is in
the introduction or growth stage of its life cycle, and zero otherwise (CFEARLY),7 an indicator variable set equal to one if the
company is in the mature stage of its life cycle, and zero otherwise (CFMATURE), an indicator variable set equal to one if Com-

4 Auditor capitulation need not be conscious on the part of the auditor. Experimental research in auditing has demonstrated many instances when auditors’ incentives, evidence or biases reduce their ability to identify, and their willingness to propose, audit adjustments (e.g., Nelson 2009; Nolder and Kadous 2018; Rowe 2019).
5 Machine learning requires training data (training sets) for the learning process and out-of-sample data (testing sets) in order to assess how well the models perform.
6 Specifically, DACC are the residuals obtained from the discretionary accruals model used by DeFond and Subramanyam (1998, p. 47).
7 We identify life cycle stages following Dickinson (2011).


Table 1
Descriptive statistics - training and testing sample.

              N        Mean     St. Dev.    p25      Median    p75
SWITCH        73,898    0.063     0.244     0.000     0.000     0.000
LN_AT         73,898    5.436     2.480     3.636     5.429     7.177
INVT_RECT     73,898    0.248     0.197     0.087     0.213     0.362
DACC          73,898    0.001     0.670    -0.048     0.008     0.065
CASH          73,898    0.181     0.210     0.035     0.104     0.241
ROA           73,898   -0.193     1.417    -0.143     0.015     0.066
LOSS          73,898    0.441     0.496     0.000     0.000     1.000
AT_GROW       73,898    0.263     4.458    -0.055     0.045     0.176
ACQUIRE       73,898    0.088     0.283     0.000     0.000     0.000
CFEARLY       73,898    0.431     0.495     0.000     0.000     1.000
CFMATURE      73,898    0.365     0.482     0.000     0.000     1.000
MODOP         73,898    0.369     0.483     0.000     0.000     1.000
SPECIALIST    73,898    0.154     0.361     0.000     0.000     0.000
CLIENTS       73,898   25.344    24.933    10.000    15.000    30.000
MARKET        73,898    8.436     5.368     6.000     7.000     9.000
SHORT_TEN     73,898    0.112     0.315     0.000     0.000     0.000
BIGN          73,898    0.663     0.473     0.000     1.000     1.000

See Appendix A for variable definitions.

pustat reports a nonstandard audit opinion, and zero otherwise (MODOP), an indicator variable set equal to one if the company’s
auditor has at least five percent more clients than the next largest auditor in both the industry and the MSA, and zero otherwise
(SPECIALIST), the number of clients in the auditor’s MSA (CLIENTS), the number of auditors having five or more clients in the
same MSA as the company (MARKET), an indicator variable if auditor tenure with the company is three years or less, and zero
otherwise (SHORT_TEN), and an indicator variable set equal to one if the company is audited by one of the four largest auditors,
and zero otherwise (BIGN).8
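The DACC input merits a brief illustration. Footnote 6 points to the discretionary accruals model of DeFond and Subramanyam (1998); the sketch below uses the standard modified Jones specification estimated within a group such as an industry-year, which may differ in detail from the paper's exact estimation, and all function and variable names here are ours:

```python
import numpy as np

def modified_jones_residuals(ta, d_rev, d_rec, ppe, lag_at):
    """Discretionary accruals as regression residuals for one estimation group.

    ta: total accruals; d_rev / d_rec: changes in revenue / receivables;
    ppe: gross PP&E; lag_at: lagged total assets (all 1-D arrays).
    """
    # Scale everything by lagged assets, per the modified Jones specification.
    y = ta / lag_at
    X = np.column_stack([1.0 / lag_at, (d_rev - d_rec) / lag_at, ppe / lag_at])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef   # the unexplained, "discretionary" component
```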
The purpose of the training and testing procedure is to calculate an estimated probability that a company-year observa-
tion will switch auditors. Our target variable is an indicator variable set equal to one if the company switches auditors from
the previous (year t-1) to the current (year t) year, and zero otherwise (SWITCH). Using data from previous years to estimate
the probability of switching auditors for the current year allows our procedure to be implemented in a timely manner - the
only current information necessary is whether the company has retained their incumbent auditor. This also mitigates the
possibility that the findings from our main analyses (presented in Section 4) are the result of a mechanical relation between
the independent variables included in our training models and the dependent variables used in the subsequent analyses. We
use five year rolling windows to construct our training sets. For example, we use financial information for 2002 through 2006
as our training set and use auditor switches from 2006 to 2007 as our testing set to assess which prediction model performs
best. We then estimate the probability that a company will switch auditors (PROB_SWITCH) for the 2008 audit by inputting
2007 financial information into the prediction model that performed best in the out-of-sample testing.9 We use five-year
windows to balance the need to keep the estimated probabilities relevant while allowing for sufficient auditor switch observa-
tions. Auditor switches are a relatively rare event (approximately six percent of our training and testing sample) and five years
of data allows for a larger number of auditor switch events. However, this also increases the length of time between the earliest
observations in the training set and the year of the estimated probability. The rolling window allows the predictions to vary
temporally.10
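The rolling design described above can be sketched as a generator of (training years, testing year, prediction year) triples; the function and argument names are ours:

```python
def rolling_splits(first_year=2002, last_year=2016, window=5):
    """Yield (train_years, test_year, predict_year) for each rolling window.

    The first triple trains on 2002-2006 financials, assesses out-of-sample
    predictions of switches into 2007, and then scores the 2008 audit by
    feeding 2007 financials to the best-performing model.
    """
    for start in range(first_year, last_year - window):
        train_years = list(range(start, start + window))   # e.g. 2002-2006
        yield train_years, start + window, start + window + 1
```

For a 2002-2016 sample this produces nine windows, the last one training on 2010-2014, testing on switches into 2015, and scoring the 2016 audit.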
We assess the performance of several machine learning techniques: gradient boosting, random forest, neural networks,
and support vector machines. We also assess the performance of logistic regression because it is a technique commonly used
in prior accounting research (e.g., Dechow et al., 2011).11 We use two fit statistics, the receiver operator characteristic curve
(ROC curve) and the precision-recall curve (PR curve) to assess the performance of each of the prediction methods and use
PROB_SWITCH estimated using the method with the best fit statistics in our later analyses so that our method can be imple-
mented in practice.12 The ROC curve plots the false positive rate against the true positive rate of the prediction model. In
our setting, false positives are observations predicted to switch auditors that do not switch and true positives are observations

8 Brown and Knechel (2016) include only clients of Big N auditors in their sample. We adapt their model to include control variables suitable for a sample with clients of Big N and non-Big N auditors. We also exclude the client similarity variables that are the focus of their paper.
9 To complete the example, all dependent variables and control variables used in our main analyses in conjunction with PROB_SWITCH for 2008 are measured during 2008. Our main analyses are discussed in Section 4.
10 An alternative is to train on a static sample and use the predictions from the model in multiple future years. However, this would likely make the predictions less informative in the later years of the sample.
11 Logistic regression is not a machine learning technique. Our logistic regression model regresses SWITCH on the same set of year t-1 independent variables that we include in the training procedures for the other techniques. We calculate the predicted probability of switching auditors for each observation using the estimated coefficients from the regression.
12 Fit statistics can be evaluated before audit outcomes are observable. We caution that the particular method that has the best fit for our data set (gradient boosting) is unlikely to always be the method that best fits other data sets.


predicted to switch auditors that do switch. The PR curve plots the ratio of true positives to the total of true positives plus false
positives (precision) against the ratio of true positives to the total of true positives plus false negatives (recall). In our setting, the
PR curve plots the ratio of correctly classified auditor switches relative to the total of those predicted to be switches against the
ratio of correctly classified auditor switches relative to the total number of actual switches. We use both fit statistics because
ROC curves are commonly used in previous research, but they can be overly optimistic and less accurate than PR curves when
predicting rare events (Davis and Goadrich, 2006; Saito and Rehmsmeier, 2015). In our setting, however, the best fitting pre-
diction model is the same regardless of the fit statistic we use for our assessment and we use PROB_SWITCH estimated using
the gradient boosting technique in our analyses because it consistently outperforms the other models. Specifically, the area
under the PR curve is 0.229 and the area under the ROC curve is 0.806 for the gradient boosting method and these values
are significantly larger than any other method (p < 0.01 for all tests of differences).
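Both fit statistics are standard and can be computed with off-the-shelf tools; the sketch below uses scikit-learn on illustrative arrays (the reported values of 0.806 and 0.229 come from the paper's own data, not from this toy):

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

# Illustrative labels and predicted probabilities with a rare positive
# class, loosely mimicking the ~6% auditor switch rate: 2 switches in 10.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_prob = np.array([.05, .10, .20, .10, .30, .05, .40, .20, .70, .35])

auc_roc = roc_auc_score(y_true, y_prob)           # area under the ROC curve
auc_pr = average_precision_score(y_true, y_prob)  # area under the PR curve

# A no-skill classifier scores about 0.5 on ROC AUC regardless of class
# balance, but only about the positive base rate on PR AUC (0.2 in this
# toy, roughly 0.06 at the paper's switch rate), which is why the PR curve
# is the stricter benchmark for rare events.
```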
Gradient boosting is a nonparametric ensemble method (Hastie et al., 2009). An ensemble method combines the results from multiple weak learners in order to outperform the weak learners used individually.13 In our gradient boosting implementation, decision trees are the weak learners. A decision tree is a set of binary splits. Each split creates an internal node using a value of one of the input variables. The first split is chosen based on how well it separates the data into distinct classes (purity). Every variable and every possible split is considered until the split with the highest purity is found. The process is recursive, meaning that it continues to split the data until a stopping criterion is reached.14 The process is also a greedy algorithm, meaning that it solves for a local optimum with the hope of finding a global optimum. In the case of decision trees, the algorithm identifies the variable that creates the best split but does not consider future splits. When the process is complete, a new observation is classified into a class (auditor switch or no auditor switch in our setting) by passing it down the decision tree to a terminal node.
Gradient boosting combines decision trees additively to form a strong predictive model. Gradient refers to the optimization process the algorithm uses to minimize errors.15 Boosting refers to the general process of combining the decision trees, using an iterative process to improve prediction. In the first iteration, a decision tree is fit. In the second iteration, more weight is given to the larger errors from the first iteration. Gradient boosting uses these errors as the target variable for the second iteration and fits a decision tree to the errors. It then combines the model from the second iteration with the model from the first iteration. In the third iteration, the errors from the combination of the first two models are used as the target variable. This process continues until the optimal solution for the algorithm is determined.
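The procedure described above can be sketched with scikit-learn's off-the-shelf implementation. This is an illustration only: the data, feature names, and tuning-parameter values below are assumptions, not the paper's actual specification.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 5))  # stand-ins for client characteristics
# Synthetic binary outcome (auditor switch / no switch) driven by two features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=2000) > 1.5).astype(int)

# Each iteration fits a shallow decision tree (the weak learner) to the errors
# of the current ensemble and adds it to the model, scaled by a learning rate.
gbm = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, max_depth=3)
gbm.fit(X, y)

# Predicted probability of the positive class: the analogue of PROB_SWITCH
prob_switch = gbm.predict_proba(X)[:, 1]
```

In practice, `n_estimators`, `learning_rate`, and `max_depth` are the kind of tuning parameters that the paper selects via ten-fold cross validation (footnote 14).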

4. Sample and analyses

We use the following model in order to investigate the effect of the probability of switching auditors on audit quality:16

AQ_it = β0 + β1 PROB_SWITCH_it + β2 LN_AT_it + β3 LEV_it + β4 ROA_it + β5 LOSS_it + β6 INVT_RECT_it + β7 CFEARLY_it + β8 CFMATURE_it + β9 MTB_it + β10 VOL_it + β11 AT_GROW_it + β12 CFO_it + β13 LIT_IND_it + β14 EXFIN_it + β15 SPECIALIST_it + β16 BIGN_it + IndustryFE + YearFE + ε_it    (1)

where AQ is one of our proxies for audit quality commonly used in prior research (e.g., Tan and Young, 2015; Brown and Knechel, 2016; Aobdia, 2019). We use two versions of financial statement misstatement: i) MISS_ALL, an indicator variable set equal to one if the company subsequently restates year t financial statements for reasons related to accounting, fraud, or an SEC investigation, and zero otherwise, and ii) MISS_BIGR, an indicator variable set equal to one if the company subsequently restates year t financial statements and the restatement is disclosed in a Form 8-K filing, and zero otherwise. We estimate performance-adjusted abnormal accruals following Kothari et al. (2005),17 and use two versions of abnormal accruals: i) ABS_ACC, the absolute value of abnormal accruals, and ii) POS_ACC, income-increasing abnormal accruals (income-decreasing abnormal accruals are set to zero).
PROB_SWITCH is the probability of an auditor switch from year t-1 to t as previously defined and is our primary variable of interest. LEV is long-term debt scaled by total assets, MTB is the market value of common shares outstanding divided by the book value of total equity, VOL is the standard deviation of monthly stock returns for the previous twelve months, CFO is cash flow from operations scaled by total assets, LIT_IND is an indicator variable set equal to one for companies in litigious industries, and zero otherwise,18 and EXFIN is an indicator variable set equal to one if debt issuances exceed twenty percent of total assets

13. A weak learner is an algorithm that performs slightly better than random chance.
14. Identifying the stopping criterion is one of several ‘‘tuning” parameters that gradient boosting requires. We use ten-fold cross validation in order to select tuning parameters. This procedure randomly selects different sets of tuning parameters and uses the set that performs the best. Bertomeu et al. (2020) provide a good discussion of gradient boosting tuning parameters and Perols et al. (2017) provide a good discussion of ten-fold cross validation.
15. Gradients are conceptually similar to error terms from an OLS regression.
16. We adapt the Brown and Knechel (2016) misstatement model on p. 754 for our setting. Specifically, we exclude the client similarity variables that are the focus of their paper and we include AT_GROW, CFO, LIT_IND, EXFIN, SPECIALIST, and BIGN as additional control variables.
17. Specifically, we estimate ACC as the residual from the following regression model: Total Accruals_it = γ0 + γ1(1/Assets_it-1) + γ2 ΔSales_it + γ3 PPE_it + γ4 ROA_it-1 + η_it, with ΔSales and PPE scaled by lagged total assets. We estimate the regression by industry-year and require industry-years to have at least 10 observations. η is our estimate of abnormal accruals.
18. We follow Francis et al. (1994) and define litigious industries as four-digit SIC 2833 through 2836, 3570 through 3577, 3600 through 3674, 5200 through 5961, and 7370 through 7374.


or equity issuances exceed ten percent of total assets, and zero otherwise. All other variables are as previously defined. We winsorize all continuous variables at the 1st and 99th percentiles and cluster standard errors by company.
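The winsorization step is mechanical and can be sketched in a few lines of NumPy (an illustration on synthetic data, not the authors' pipeline; the clustered-standard-error estimation itself would be handled by a regression package such as statsmodels with a cluster-robust covariance option).

```python
import numpy as np

def winsorize(x, lower=1.0, upper=99.0):
    """Clip values of x at its lower/upper percentiles (1st and 99th by default)."""
    lo, hi = np.percentile(x, [lower, upper])
    return np.clip(x, lo, hi)

rng = np.random.default_rng(2)
# 998 well-behaved observations plus two extreme outliers
x = np.concatenate([rng.normal(size=998), [50.0, -50.0]])
x_w = winsorize(x)
```

Clipping at the 1st and 99th percentiles pulls the two outliers in to the nearest percentile boundary while leaving the center of the distribution (e.g., the median) unchanged.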
The sample we use in our analyses is the intersection of Audit Analytics and Compustat from 2008 through 2017 with non-missing estimated probabilities, excluding financial (two-digit SIC 60 through 69) and utilities (two-digit SIC 44 through 49) industries (37,755 observations). We begin in 2008 because our probability models require five previous years of data for the training sets and we wish to minimize possible confounding effects on company-auditor alignments related to the implementation of the Sarbanes-Oxley Act of 2002. We drop 17,791 observations that do not have the required variables for the misstatement regressions. We drop 879 observations that switch auditors from year t-1 to t because we are interested in the effect of the estimated probability of switching auditors in companies that remain with their incumbent auditor, leaving a sample of 19,085 observations for our analyses. Table 2 presents descriptive statistics for this sample.
Table 3 presents the distribution of observations that switch auditors (n = 879) relative to the non-switching observations used in the primary analyses (n = 19,085) by decile of PROB_SWITCH. The percentage of observations switching auditors increases from 0.95% in the first decile to a high of 21.59% in the top decile of PROB_SWITCH.19 This provides descriptive evidence that PROB_SWITCH effectively captures the probability of switching auditors. The large increase in the percentage of switching companies from the ninth to the tenth decile (4.16% versus 21.59%) also suggests that the effects of PROB_SWITCH on audit quality may be concentrated among observations in the top decile. We investigate this possibility further below (see Table 5 and the related discussion).
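A decile table of this kind is straightforward to construct with pandas. The sketch below uses synthetic probabilities and switches (column names and the switch-generating process are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
df = pd.DataFrame({"prob_switch": rng.uniform(size=10_000)})
# Synthetic outcome whose switch rate rises with the predicted probability
df["switch"] = rng.binomial(1, 0.1 * df["prob_switch"].to_numpy())

# Bucket observations into deciles of the predicted switch probability,
# then compute counts and the realized switch rate per decile (as in Table 3)
df["decile"] = pd.qcut(df["prob_switch"], 10, labels=range(1, 11))
table = (df.groupby("decile", observed=True)["switch"]
           .agg(total="size", switches="sum", pct_switch="mean"))
```

If the estimated probabilities carry signal, `pct_switch` should rise across deciles, with the realized switch rate highest in decile 10.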
Table 4 presents results for Eq. (1) and provides support for our hypothesis. Columns 1 and 2 present results for MISS_ALL and MISS_BIGR, respectively, using logistic regression. Columns 3 and 4 present results for ABS_ACC and POS_ACC, respectively, using OLS regression.20 We find consistent evidence in Columns 1 and 2 indicating that the probability of switching auditors is positively associated with the likelihood of misstatement. Specifically, the coefficient on PROB_SWITCH is 2.514 (p < 0.01) in Column 1 and 2.971 (p < 0.05) in Column 2. We find consistent evidence in Columns 3 and 4 indicating that the probability of switching auditors is also positively associated with abnormal accruals. Specifically, the coefficient on PROB_SWITCH is 0.084 (p < 0.01) in Column 3 and 0.038 (p < 0.05) in Column 4. Taken together, Table 4 provides strong support for our hypothesis and suggests that audit quality decreases as the estimated probability of switching auditors increases among companies continuing with their incumbent auditor.
Table 5 presents results for Eq. (1) after replacing PROB_SWITCH with an indicator variable set equal to one if the company is in the top decile of the distribution of PROB_SWITCH, and zero otherwise (HIGH_PROB). Table 3 indicates that a substantially higher percentage of companies in the top decile switch auditors, suggesting that the decrease in audit quality evidenced in Table 4 may be concentrated among companies in this decile.21 The results presented in Table 5 support this conjecture. Columns 1 and 2 present results for MISS_ALL and MISS_BIGR, respectively, and Columns 3 and 4 present results for ABS_ACC and POS_ACC, respectively. We find consistent evidence in all columns that companies in the top decile of PROB_SWITCH have a significantly higher likelihood of misstatement and significantly higher abnormal accruals than companies in deciles one through nine. Specifically, the coefficient on HIGH_PROB is 0.328 (p < 0.01) in Column 1, 0.401 (p < 0.05) in Column 2, 0.016 (p < 0.01) in Column 3, and 0.012 (p < 0.01) in Column 4. These findings indicate that the negative effects of PROB_SWITCH on audit quality are concentrated among companies in the top decile of PROB_SWITCH. This also suggests that regulators and audit firms could focus on companies and clients in the top decile to address the deterioration in audit quality.
In Tables 6 and 7 we investigate whether our main results for probabilities estimated using gradient boosting models are sensitive to the sampling methods used. Because auditor switches are a relatively rare event (approximately 6% in our training and testing sample presented in Table 1), our sample may be unbalanced and our estimated probabilities might be improved by using sampling techniques designed to help machine learning algorithms with unbalanced data sets. The sampling techniques that we examine are up-sampling, down-sampling, and SMOTE sampling, all of which are designed to make the training dataset more balanced. Up-sampling balances the data by randomly sampling the less prevalent class (companies switching auditors in our setting), with replacement, to match the number of observations in the majority class (companies remaining with their incumbent auditor in our setting). Down-sampling takes essentially the opposite approach, balancing the data by selecting a random sample of the majority class matched in number of observations to the size of the less prevalent class. SMOTE is a more sophisticated sampling method that combines elements of up- and down-sampling: it down-samples the majority class while also synthesizing new observations for the less prevalent class (Chawla et al., 2002). SMOTE uses a nearest neighbor approach to synthesize new observations.22
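The three resampling ideas can be sketched directly in NumPy. This is an illustration on synthetic data rather than the paper's implementation (production work would typically use the imbalanced-learn package, and the SMOTE step below is deliberately simplified):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(1000, 3))
y = np.zeros(1000, dtype=int)
y[:60] = 1  # ~6% minority class (auditor switches), as in the paper's sample

minority = np.where(y == 1)[0]
majority = np.where(y == 0)[0]

# Up-sampling: draw minority rows with replacement until classes are balanced
up_idx = rng.choice(minority, size=len(majority), replace=True)
X_up = np.vstack([X[majority], X[up_idx]])

# Down-sampling: keep a random majority subset the size of the minority class
down_idx = rng.choice(majority, size=len(minority), replace=False)
X_down = np.vstack([X[down_idx], X[minority]])

# SMOTE-style synthesis (simplified): interpolate between a minority point and
# another randomly chosen minority point; real SMOTE restricts the partner to
# one of the k nearest minority neighbors (k = 5 in the paper, per footnote 22)
a, b = X[rng.choice(minority, size=2, replace=False)]
synthetic = a + rng.uniform() * (b - a)
```

Each balanced training set (`X_up`, `X_down`, or a SMOTE-augmented set) would then be fed to the gradient boosting estimator in place of the raw unbalanced sample.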

19. We also examine the distribution of switching and non-switching observations in the testing sample and find a similar distribution. Specifically, 0.9% of observations in the first decile switch auditors relative to 36.1% in the top decile.
20. Sample sizes decrease to 19,021 observations in the MISS_ALL regressions and 17,629 observations in the MISS_BIGR regressions because certain industries do not have any misstatement observations. All inferences are unchanged if we use OLS instead of logistic regression and maintain the full sample. Sample sizes in the abnormal accruals regressions are smaller (n = 18,313) because of the additional variables and the number of observations per industry-year required to estimate accruals.
21. We also use Youden's J statistic to determine the optimal threshold of PROB_SWITCH that best classifies switchers and non-switchers (Youden, 1950). The optimal threshold is 0.076 in our setting, slightly higher than the threshold for the top decile (0.072). All results are similar and all inferences unchanged using the optimal threshold to identify HIGH_PROB.
22. SMOTE synthesizes new observations that fall in the feature space between members of a group of nearest neighbors. We use groups of five members for this task (the number of neighbors must be chosen by the programmer). The SMOTE procedure ensures that the synthetic observations fall within plausible ranges of actual occurrences.


Table 2
Descriptive statistics – sample used in primary analyses.

N Mean St. Dev. p25 Median p75


PROB_SWITCH 19,085 0.048 0.042 0.028 0.040 0.056
MISS_ALL 19,085 0.119 0.323 0.000 0.000 0.000
MISS_BIGR 19,085 0.020 0.139 0.000 0.000 0.000
ABS_ACC 18,313 0.070 0.086 0.018 0.040 0.084
POS_ACC 18,313 0.036 0.073 0.000 0.001 0.041
LN_AT 19,085 6.245 2.060 4.757 6.167 7.707
LEV 19,085 0.178 0.208 0.000 0.114 0.290
ROA 19,085 0.080 0.312 0.093 0.023 0.070
LOSS 19,085 0.401 0.490 0.000 0.000 1.000
INVT_RECT 19,085 0.231 0.176 0.087 0.202 0.337
CFEARLY 19,085 0.397 0.489 0.000 0.000 1.000
CFMATURE 19,085 0.429 0.495 0.000 0.000 1.000
MTB 19,085 3.360 6.543 1.232 2.180 3.979
VOL 19,085 0.137 0.085 0.080 0.116 0.170
AT_GROW 19,085 0.148 0.506 0.056 0.041 0.172
CFO 19,085 0.007 0.251 0.001 0.075 0.127
LIT_IND 19,085 0.432 0.495 0.000 0.000 1.000
EXFIN 19,085 0.306 0.461 0.000 0.000 1.000
SPECIALIST 19,085 0.273 0.446 0.000 0.000 1.000
BIGN 19,085 0.705 0.456 0.000 1.000 1.000

See Appendix A for variable definitions. All continuous variables are winsorized at the 1st and 99th percentiles.

Table 3
Distribution of switches by decile of PROB_SWITCH.

Decile Remain Switch Total % Switch


1 1,978 19 1,997 0.95%
2 1,966 30 1,996 1.50%
3 1,970 27 1,997 1.35%
4 1,962 34 1,996 1.70%
5 1,944 52 1,996 2.61%
6 1,938 59 1,997 2.95%
7 1,917 79 1,996 3.96%
8 1,932 65 1,997 3.25%
9 1,913 83 1,996 4.16%
10 1,565 431 1,996 21.59%
Total 19,085 879 19,964 4.40%

Tables 6 and 7 present results using the probability of switching auditors estimated with different sampling methods. In both tables, Column 1 is included for comparison and presents results without a sampling technique, Column 2 presents results using up-sampling, Column 3 presents results using down-sampling, and Column 4 presents results using SMOTE sampling. Table 6 presents results for Eq. (1) estimated using logistic regression with MISS_ALL as the dependent variable and Table 7 presents results for Eq. (1) estimated using OLS with ABS_ACC as the dependent variable.23 The coefficient on PROB_SWITCH is positive and significant (p < 0.01) in all columns in both tables. Taken together, the results from Tables 6 and 7 indicate that our primary results are robust to using alternative sampling methods and provide little evidence that sampling techniques improve the identification of low-quality audits in our setting.

4.1. Additional analyses

4.1.1. Auditor independence


In this section we discuss the results of additional analyses designed to investigate our conjecture that our results are attributable to impaired auditor independence. Our primary finding is that audit quality is lower among companies with a high predicted probability of switching auditors that do not switch. If the results from our primary tests are attributable to a deterioration of auditor independence created by management's threat of switching auditors, then we expect the deterioration in independence to be less severe when the threat of switching is likely to be less credible and when the loss of a single audit client is likely to be less costly.

23. We also perform these analyses using MISS_BIGR and POS_ACC as the variables of interest and find that results are quite similar to those presented in Table 4 for all sampling methods.


Table 4
Audit quality.

(1) (2) (3) (4)


MISS_ALL MISS_BIGR ABS_ACC POS_ACC
PROB_SWITCH 2.514*** 2.971** 0.084*** 0.038**
(3.358) (2.343) (3.872) (2.068)
LN_AT 0.072*** 0.014 0.004*** 0.002***
(2.674) (0.240) (7.664) (5.074)
LEV 0.776*** 1.232*** 0.003 0.015***
(4.047) (3.117) (0.667) (3.891)
ROA 0.117 0.205 0.037*** 0.049***
(0.600) (0.548) (6.421) (7.557)
LOSS 0.041 0.070 0.002 0.011***
(0.504) (0.406) (1.392) (7.083)
INVT_RECT 0.047 0.613 0.038*** 0.046***
(0.169) (1.047) (6.297) (9.027)
CFEARLY 0.010 0.170 0.009*** 0.002
(0.128) (0.945) (4.575) (1.345)
CFMATURE 0.169** 0.047 0.008*** 0.004**
(2.032) (0.236) (4.147) (2.427)
MTB 0.008 0.001 0.000* 0.000
(1.560) (0.067) (1.669) (0.008)
VOL 0.447 1.325** 0.080*** 0.037***
(1.217) (2.009) (7.184) (3.661)
AT_GROW 0.139** 0.160 0.057*** 0.034***
(2.562) (1.593) (24.540) (12.481)
CFO 0.315 0.143 0.016** 0.122***
(1.196) (0.256) (2.285) (16.489)
LIT_IND 0.143 0.253 0.009*** 0.012***
(1.045) (0.768) (3.265) (4.772)
EXFIN 0.078 0.145 0.002 0.001
(1.036) (0.909) (0.962) (0.967)
SPECIALIST 0.051 0.240 0.002 0.001
(0.624) (1.343) (1.216) (0.597)
BIGN 0.259** 0.269 0.007*** 0.006***
(2.386) (1.161) (3.271) (3.488)
Constant 2.696*** 3.949*** 0.048*** 0.026**
(3.629) (6.378) (3.089) (2.555)
Obs. 19,021 17,629 18,313 18,313
Adj./Pseudo R2 0.064 0.076 0.253 0.160

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

Our first set of analyses investigates whether our primary results are stronger among smaller companies, which likely have lower costs associated with switching auditors and a greater number of alternative auditors, thereby making a threat of switching more credible. We delineate larger companies as those with total assets above the sample median and smaller companies as those with total assets below the sample median. Table 8 presents the results. Columns 1 and 2 present results using MISS_ALL as the proxy for audit quality while Columns 3 and 4 present results using ABS_ACC as the proxy for audit quality. Columns 1 and 3 present results for larger companies (Big) while Columns 2 and 4 present results for smaller companies (Small). We find that the coefficient on PROB_SWITCH is insignificant in Column 1 but positive and significant in Column 2 (p < 0.01), suggesting that the likelihood of misstatement increases with the probability of switching auditors for smaller companies but not for larger companies. We find that the coefficient on PROB_SWITCH is positive and significant in Columns 3 (p < 0.05) and 4 (p < 0.01), indicating that abnormal accruals increase as the probability of switching auditors increases for all companies, though the effect is less significant for larger companies.24 Taken together, our findings suggest that the more egregious audit quality problems (i.e., misstatements) associated with the probability of switching are concentrated among smaller companies.
Our second set of analyses investigates whether our primary results vary by the size of the audit office. If our findings are
attributable to impaired independence, we would expect our results to be stronger when there are fewer companies audited
by the office. We expect that audit offices for which the loss of a single client would be more consequential are likely to exert
greater effort to keep the client and might be more likely to acquiesce to management’s desires. We conjecture that each
individual client is more important for smaller local audit offices than for larger local audit offices. We use MSA to delineate

24. We also perform the analyses presented in Tables 7 and 8 using the different sampling techniques discussed earlier and find similar results (untabulated). Our inferences that the results from our main tests are concentrated among smaller companies and smaller local audit offices are unchanged.


Table 5
High probability indicator.

(1) (2) (3) (4)


MISS_ALL MISS_BIGR ABS_ACC POS_ACC
HIGH_PROB 0.328*** 0.401** 0.016*** 0.012***
(3.261) (2.015) (5.748) (4.591)
LN_AT 0.066** 0.017 0.004*** 0.002***
(2.462) (0.303) (8.074) (5.102)
LEV 0.770*** 1.220*** 0.003 0.015***
(4.015) (3.079) (0.730) (3.869)
ROA 0.094 0.172 0.036*** 0.051***
(0.479) (0.458) (6.245) (7.740)
LOSS 0.047 0.074 0.002 0.011***
(0.572) (0.424) (1.307) (7.089)
INVT_RECT 0.028 0.555 0.036*** 0.045***
(0.101) (0.935) (6.117) (8.800)
CFEARLY 0.010 0.165 0.009*** 0.002
(0.123) (0.919) (4.595) (1.354)
CFMATURE 0.174** 0.039 0.008*** 0.004**
(2.090) (0.199) (4.249) (2.465)
MTB 0.008 0.000 0.000* 0.000
(1.581) (0.021) (1.688) (0.005)
VOL 0.477 1.360** 0.080*** 0.036***
(1.302) (2.065) (7.228) (3.615)
AT_GROW 0.132** 0.147 0.056*** 0.033***
(2.453) (1.498) (24.304) (12.300)
CFO 0.320 0.140 0.016** 0.122***
(1.214) (0.251) (2.222) (16.478)
LIT_IND 0.143 0.251 0.009*** 0.012***
(1.047) (0.760) (3.286) (4.799)
EXFIN 0.073 0.153 0.001 0.002
(0.974) (0.961) (0.814) (1.071)
SPECIALIST 0.050 0.243 0.002 0.001
(0.604) (1.354) (1.067) (0.448)
BIGN 0.264** 0.259 0.006*** 0.005***
(2.400) (1.106) (2.923) (3.023)
Constant 2.585*** 3.845*** 0.050*** 0.026**
(3.458) (6.276) (3.216) (2.518)
Obs. 19,021 17,629 18,313 18,313
Adj./Pseudo R2 0.064 0.076 0.255 0.162

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

a local audit office and use the median number of audit clients at the local audit office level to delineate larger versus smaller
local audit offices.25
Table 9 presents the results. Columns 1 and 2 present results using MISS_ALL as the proxy for audit quality while Columns 3 and 4 present results using ABS_ACC as the proxy for audit quality. Columns 1 and 3 present results for larger audit offices (Big) while Columns 2 and 4 present results for smaller audit offices (Small). We find that the coefficient on PROB_SWITCH is insignificant in Column 1 but positive and significant in Column 2 (p < 0.01), suggesting that the likelihood of misstatement increases with the probability of switching auditors for companies with auditors from smaller audit offices but not those with auditors from larger audit offices. We find that the coefficient on PROB_SWITCH is positive and significant in Columns 3 (p < 0.05) and 4 (p < 0.01), indicating that abnormal accruals increase as the probability of switching auditors increases for all companies, though the effect is less significant for companies with auditors from larger offices. These results are very similar to the results in Table 8 and suggest that audit quality is most negatively affected in situations where the threat of switching auditors is likely more plausible and the costs of losing a client are likely greater. Taken together, the results from Tables 8 and 9, combined with our previous results, provide support for our conjecture that the lower audit quality we observe in our main results may be attributable, at least in part, to impaired auditor independence. However, we acknowledge that we cannot rule out alternative explanations because audit office and client size have also been used to proxy for other constructs such as client complexity, litigation risk, financial distress, and auditor expertise.

25. We also perform these tests proxying for client importance using audit fees (results untabulated). Specifically, we calculate the proportion of audit fees received from the company relative to the total audit fees received from all clients by the local audit office. All inferences are the same using this alternative proxy.


Table 6
Sampling methods - misstatements (MISS_ALL).

(1) (2) (3) (4)


Base Up Down SMOTE
PROB_SWITCH 2.514*** 0.999*** 0.812*** 1.156***
(3.358) (3.102) (2.647) (3.879)
LN_AT 0.072*** 0.095*** 0.087*** 0.092***
(2.674) (3.244) (3.001) (3.285)
LEV 0.776*** 0.761*** 0.763*** 0.768***
(4.047) (3.980) (3.983) (4.023)
ROA 0.117 0.146 0.144 0.134
(0.600) (0.747) (0.735) (0.688)
LOSS 0.041 0.021 0.028 0.030
(0.504) (0.254) (0.337) (0.369)
INVT_RECT 0.047 0.029 0.034 0.045
(0.169) (0.106) (0.121) (0.161)
CFEARLY 0.010 0.009 0.010 0.006
(0.128) (0.113) (0.124) (0.074)
CFMATURE 0.169** 0.161* 0.165** 0.164**
(2.032) (1.950) (2.001) (1.982)
MTB 0.008 0.008 0.008 0.008
(1.560) (1.558) (1.565) (1.579)
VOL 0.447 0.407 0.432 0.439
(1.217) (1.107) (1.172) (1.193)
AT_GROW 0.139** 0.132** 0.137** 0.138**
(2.562) (2.421) (2.526) (2.534)
CFO 0.315 0.308 0.314 0.296
(1.196) (1.166) (1.191) (1.123)
LIT_IND 0.143 0.144 0.145 0.144
(1.045) (1.055) (1.058) (1.056)
EXFIN 0.078 0.078 0.080 0.075
(1.036) (1.041) (1.056) (0.999)
SPECIALIST 0.051 0.046 0.051 0.040
(0.624) (0.561) (0.620) (0.481)
BIGN 0.259** 0.298*** 0.278** 0.282***
(2.386) (2.699) (2.538) (2.581)
Constant 2.696*** 3.088*** 2.971*** 2.959***
(3.629) (4.088) (3.898) (3.901)
Obs. 19,021 19,021 19,021 19,021
Pseudo R2 0.064 0.064 0.064 0.065

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

4.1.2. Switching companies


Our final set of analyses provides insight into how our sample companies (companies remaining with their incumbent auditor) compare to companies that switch auditors. We expand the sample used in our primary analyses to include the 879 clients that switch auditors, some of which are likely opinion shopping as suggested by prior research (Lennox, 2000). We also include an indicator variable set equal to one if the company changed auditors from the previous year to the current year, and zero otherwise (SWITCH), and the interaction between SWITCH and our variable of interest (PROB_SWITCH). Table 10 presents the results.26 The main effects on PROB_SWITCH are similar in magnitude and significance to the results presented in Table 4. The interactions between SWITCH and PROB_SWITCH are not significant at conventional levels, indicating that the probability of switching does not have an incremental effect on audit quality for companies that switched auditors. Additional analyses using separate samples (results untabulated) confirm that the probability of switching does not have a significant effect on audit quality among companies switching auditors.
We also perform joint tests and tests of differences in coefficients (results untabulated) in order to evaluate how audit quality among companies switching auditors compares with audit quality among high-probability non-switching companies. Results indicate that the total effect of switching auditors on audit quality is statistically smaller than the effect of PROB_SWITCH for companies remaining with their incumbent auditor for MISS_ALL and ABS_ACC, though the differences are not significant for MISS_BIGR and POS_ACC.27 This provides some evidence that the negative effects of the probability of switching auditors on the audit quality of non-switching companies are worse than the audit quality effects of switching. However, we believe that the results of the tests in this section should be interpreted cautiously because our tests are not designed to investigate audit quality among companies switching auditors and the number of auditor switches in our sample is relatively

26. Results using HIGH_PROB instead of PROB_SWITCH are similar (untabulated).
27. Specifically, we test whether SWITCH + SWITCH*PROB_SWITCH = PROB_SWITCH. We estimate Columns 1 and 2 using OLS for purposes of the joint test in order to ease interpretation (Shipman et al., 2017; Cassell et al., 2019).


Table 7
Sampling methods – abnormal accruals (ABS_ACC).

(1) (2) (3) (4)


Base Up Down SMOTE
PROB_SWITCH 0.084*** 0.031*** 0.029*** 0.029***
(3.872) (4.077) (4.002) (3.681)
LN_AT 0.004*** 0.003*** 0.003*** 0.004***
(7.664) (5.818) (6.127) (6.748)
LEV 0.003 0.003 0.003 0.003
(0.667) (0.764) (0.773) (0.741)
ROA 0.037*** 0.038*** 0.038*** 0.038***
(6.421) (6.567) (6.568) (6.522)
LOSS 0.002 0.003* 0.003 0.002
(1.392) (1.733) (1.640) (1.457)
INVT_RECT 0.038*** 0.037*** 0.037*** 0.038***
(6.297) (6.233) (6.240) (6.336)
CFEARLY 0.009*** 0.009*** 0.009*** 0.009***
(4.575) (4.595) (4.577) (4.612)
CFMATURE 0.008*** 0.008*** 0.008*** 0.008***
(4.147) (4.040) (4.112) (4.104)
MTB 0.000* 0.000* 0.000* 0.000*
(1.669) (1.677) (1.670) (1.676)
VOL 0.080*** 0.079*** 0.079*** 0.081***
(7.184) (7.101) (7.139) (7.260)
AT_GROW 0.057*** 0.057*** 0.057*** 0.057***
(24.540) (24.448) (24.454) (24.539)
CFO 0.016** 0.017** 0.016** 0.017**
(2.285) (2.314) (2.264) (2.315)
LIT_IND 0.009*** 0.009*** 0.009*** 0.009***
(3.265) (3.290) (3.280) (3.273)
EXFIN 0.002 0.002 0.002 0.001
(0.962) (0.951) (0.972) (0.858)
SPECIALIST 0.002 0.002 0.002 0.001
(1.216) (1.119) (1.189) (1.021)
BIGN 0.007*** 0.006*** 0.006*** 0.006***
(3.271) (2.727) (2.905) (3.144)
Constant 0.048*** 0.037** 0.038** 0.044***
(3.089) (2.264) (2.344) (2.752)
Obs. 18,313 18,313 18,313 18,313
Adj. R2 0.253 0.253 0.253 0.253

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

small (n = 879). We interpret the overall tenor of these analyses as suggesting similarly low audit quality among companies that retain their auditors despite having a high probability of switching relative to companies that switch auditors (some of which may have successfully opinion shopped). These results highlight the need for methods that can identify companies whose audit quality is as low as that of opinion shopping companies even when no auditor switch is observable.

5. Conclusion

Auditor independence is a vital component of audit quality, providing assurance that the financial statements appropriately reflect the company’s economic reality. Prior opinion shopping research indicates that auditor independence is compromised when a client is able to switch auditors in order to obtain a more favorable audit opinion or less conservative accounting treatment. Policy makers and regulators have demonstrated their concern surrounding possible opinion shopping by requiring companies switching their auditor to publicly disclose disagreements with the predecessor auditor and by shifting the responsibility to appoint the auditor from management to the audit committee. Of potentially greater concern, however, are instances when an auditor seeks to retain an existing client by capitulating to client pressure. In such cases, auditor independence is compromised without an observable auditor switch. We investigate the implications of this possibility by examining audit quality among companies that are more likely to switch auditors but do not, expecting that these companies may exhibit lower audit quality. Similar to the opinion shopping literature, we conjecture that lower audit quality is indicative of impaired auditor independence in our setting.
We utilize machine learning techniques to predict auditor switches. We begin by evaluating the out-of-sample prediction accuracy of several nonparametric machine learning techniques and find that gradient boosting consistently outperforms the other models in our setting. In our main analyses, we regress a series of audit quality indicators on the probability of an auditor switch generated using our gradient boosting prediction model. Using misstatements and abnormal accruals to proxy for audit quality, we find that the estimated probability of switching auditors is negatively associated with audit quality among

Please cite this article as: J. O. S. Hunt, D. M. Rosser and S. P. Rowe, Using machine learning to predict auditor switches: How the likelihood
of switching affects audit quality among non-switching clients, J. Account. Public Policy, https://doi.org/10.1016/j.jaccpubpol.2020.106785
J.O.S. Hunt et al. / J. Account. Public Policy xxx (xxxx) xxx 13

Table 8
Big vs. Small Companies.

(1) (2) (3) (4)


MISS_ALL Big MISS_ALL Small ABS_ACC Big ABS_ACC Small
PROB_SWITCH 1.175 2.991*** 0.086** 0.081***
(0.699) (3.567) (2.553) (3.104)
LN_AT 0.035 0.269*** 0.002*** 0.009***
(0.752) (4.035) (2.972) (6.250)
LEV 0.772*** 0.474* 0.005 0.005
(2.824) (1.794) (1.199) (0.679)
ROA 0.739** 0.227 0.013 0.032***
(2.209) (0.985) (1.270) (4.787)
LOSS 0.025 0.076 0.007*** 0.005**
(0.212) (0.663) (3.180) (1.971)
INVT_RECT 0.394 0.395 0.024*** 0.043***
(0.837) (1.127) (3.078) (5.201)
CFEARLY 0.089 0.056 0.015*** 0.006**
(0.738) (0.546) (4.714) (2.258)
CFMATURE 0.113 0.189 0.015*** 0.010***
(0.955) (1.539) (5.071) (3.889)
MTB 0.009 0.002 0.000 0.000
(1.372) (0.349) (1.563) (0.303)
VOL 0.270 0.318 0.070*** 0.086***
(0.372) (0.721) (4.965) (5.935)
AT_GROW 0.223** 0.093 0.044*** 0.062***
(2.523) (1.347) (11.521) (21.511)
CFO 0.894 0.067 0.050*** 0.022**
(1.265) (0.218) (3.873) (2.513)
LIT_IND 0.209 0.053 0.004 0.011**
(1.084) (0.271) (1.514) (2.571)
EXFIN 0.074 0.103 0.001 0.000
(0.706) (0.932) (0.342) (0.150)
SPECIALIST 0.053 0.093 0.002 0.001
(0.488) (0.733) (1.269) (0.486)
BIGN 0.274 0.112 0.004 0.002
(1.276) (0.857) (1.505) (0.600)
Constant 0.597 2.605*** 0.058*** 0.052**
(0.608) (3.814) (3.502) (2.304)
Obs. 9,446 9,449 9,156 9,157
Adj./Pseudo R2 0.077 0.064 0.165 0.237

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

companies remaining with their incumbent auditor. We also find evidence that the negative audit quality effects are concentrated among companies in the top decile of the distribution of the probability of switching. Furthermore, we find that our results are robust to various sampling techniques (up, down, and SMOTE sampling), which help alleviate potential problems when using machine learning algorithms to predict rare events.
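As a rough illustration of this prediction step, a gradient boosting classifier can be fit to labeled company-years and used to generate out-of-sample switch probabilities (a PROB_SWITCH analogue). The sketch below uses scikit-learn with simulated data; it is not the authors' implementation, and the features are random stand-ins for the paper's predictors.

```python
# Minimal sketch (not the authors' code): gradient boosting to estimate
# the probability of an auditor switch, a rare event (~5% of company-years).
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 6))             # illustrative company-year features
y = (rng.random(n) < 0.05).astype(int)  # 1 = company switched auditors

# Hold out company-years to evaluate out-of-sample prediction accuracy,
# stratifying so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

gbm = GradientBoostingClassifier(
    n_estimators=200, learning_rate=0.05, max_depth=3, random_state=0)
gbm.fit(X_train, y_train)

# PROB_SWITCH analogue: predicted probability of a switch for each holdout
# company-year, which could then serve as a regressor in audit quality tests.
prob_switch = gbm.predict_proba(X_test)[:, 1]
print(roc_auc_score(y_test, prob_switch))
```

On real data, the class imbalance could additionally be addressed with the up-, down-, or SMOTE-sampling schemes the paper mentions before fitting the classifier.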
In additional analyses we find that our main results for misstatements are concentrated among smaller companies that
face lower costs related to switching auditors and among smaller audit offices, for which the loss of a client is likely to be
more consequential. Our main results for less egregious audit quality problems (abnormal accruals) are similar for smaller
and larger companies and audit offices. These findings suggest that more egregious audit quality problems (misstatements)
are concentrated among auditor–client relationships where the auditor is likely to face greater pressure to capitulate to the
client, providing additional support for our conjecture that our primary results are attributable, at least in part, to impaired
auditor independence. We caution, however, that because audit office and client size have been used to proxy for several
other constructs, such as complexity and auditor expertise, we cannot rule out alternative explanations for our results.
Our final analyses investigate how our sample companies compare to companies that switch auditors. Overall, the results
suggest that audit quality is at least as poor among companies with a high probability of switching auditors that remain with
their incumbent auditors as it is among switching companies, for which there is a publicly observable signal.
Our findings should be of interest to academics, regulators and audit firms. First, prior opinion shopping research has focused on accounting treatment and audit opinions received following the switch to a new auditor. We extend this research by investigating whether similar problems manifest when companies are likely to switch yet remain with their incumbent auditor. Our findings should raise concerns among policy makers, regulators and investors because, unlike an opinion shopping event where there is an observable auditor switch, there is no public signal warning of potential audit quality impairments among companies with a high probability of switching auditors that do not switch. Second, the methodology we use is designed to be implementable in a timely fashion, using historical information that is publicly available and could be used by regulators and audit firms to preemptively identify audit engagements that are more likely to experience pressure from


Table 9
Big vs. Small Audit Offices.

(1) (2) (3) (4)


MISS_ALL Big MISS_ALL Small ABS_ACC Big ABS_ACC Small
PROB_SWITCH 1.335 3.348*** 0.089** 0.079***
(1.017) (3.717) (2.291) (3.123)
LN_AT 0.002 0.145*** 0.004*** 0.004***
(0.045) (3.663) (6.487) (4.959)
LEV 0.933*** 0.628** 0.000 0.006
(3.404) (2.263) (0.047) (0.948)
ROA 0.620** 0.292 0.040*** 0.033***
(2.313) (1.004) (4.814) (4.157)
LOSS 0.103 0.005 0.007*** 0.002
(0.855) (0.045) (2.947) (0.669)
INVT_RECT 0.240 0.112 0.031*** 0.042***
(0.550) (0.310) (4.060) (5.019)
CFEARLY 0.028 0.041 0.009*** 0.010***
(0.249) (0.376) (3.142) (3.269)
CFMATURE 0.085 0.265** 0.008*** 0.008***
(0.719) (2.300) (2.958) (2.856)
MTB 0.012* 0.002 0.000 0.000
(1.726) (0.229) (1.324) (0.770)
VOL 0.347 0.885* 0.094*** 0.070***
(0.575) (1.877) (5.722) (4.613)
AT_GROW 0.115 0.167** 0.054*** 0.060***
(1.362) (2.380) (17.438) (17.377)
CFO 1.004*** 0.270 0.000 0.031***
(2.813) (0.706) (0.041) (3.073)
LIT_IND 0.101 0.054 0.008** 0.010**
(0.472) (0.292) (2.398) (2.515)
EXFIN 0.202* 0.014 0.003 0.000
(1.805) (0.133) (1.312) (0.043)
SPECIALIST 0.223** 0.207* 0.002 0.002
(2.076) (1.658) (0.936) (0.843)
BIGN 0.243 0.211 0.010** 0.006**
(0.987) (1.569) (2.366) (2.323)
Constant 2.030** 3.315*** 0.045** 0.070***
(2.053) (3.812) (2.308) (4.671)
Obs. 9,059 9,837 9,138 9,175
Adj./Pseudo R2 0.084 0.072 0.260 0.248

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

Table 10
Non-switching vs. Switching Companies.

(1) (2) (3) (4)


MISS_ALL MISS_BIGR ABS_ACC POS_ACC
PROB_SWITCH 2.364*** 2.788** 0.078*** 0.031*
(3.146) (2.173) (3.577) (1.656)
SWITCH 0.201 0.390 0.013*** 0.011**
(1.284) (1.309) (2.700) (2.559)
PROB_SWITCH*SWITCH 1.105 2.091 0.053 0.038
(0.991) (1.023) (1.592) (1.310)
LN_AT 0.073*** 0.013 0.004*** 0.002***
(2.749) (0.229) (7.516) (5.001)
LEV 0.737*** 1.109*** 0.003 0.015***
(3.932) (2.895) (0.815) (3.841)
ROA 0.063 0.093 0.036*** 0.051***
(0.336) (0.257) (6.256) (8.131)
LOSS 0.047 0.071 0.003 0.011***
(0.586) (0.431) (1.554) (7.544)
INVT_RECT 0.060 0.589 0.039*** 0.048***
(0.221) (1.039) (6.448) (9.196)
CFEARLY 0.003 0.110 0.010*** 0.003
(0.045) (0.644) (4.757) (1.518)
CFMATURE 0.182** 0.018 0.008*** 0.004**
(2.258) (0.094) (4.025) (2.257)
MTB 0.009* 0.005 0.000* 0.000
(1.746) (0.414) (1.787) (0.046)

VOL 0.542 1.283** 0.083*** 0.043***
(1.551) (2.091) (7.449) (4.243)
AT_GROW 0.132*** 0.184** 0.058*** 0.034***
(2.615) (1.981) (25.631) (13.005)
CFO 0.283 0.168 0.021*** 0.128***
(1.127) (0.317) (2.876) (17.741)
LIT_IND 0.131 0.273 0.009*** 0.013***
(0.974) (0.877) (3.367) (5.095)
EXFIN 0.062 0.179 0.001 0.002
(0.845) (1.155) (0.868) (1.082)
SPECIALIST 0.055 0.226 0.002 0.001
(0.690) (1.300) (1.289) (0.647)
BIGN 0.261** 0.293 0.008*** 0.007***
(2.494) (1.306) (3.837) (4.157)
Constant 2.691*** 3.852*** 0.048*** 0.026**
(3.611) (6.308) (3.033) (2.469)
Obs. 19,896 18,595 19,156 19,156
Adj./Pseudo R2 0.064 0.073 0.260 0.168

See Appendix A for variable definitions. Coefficient estimates above, t-statistic below. Robust standard errors are clustered by company with year and
industry fixed effects. *, **, *** indicate significance at p < 0.10, p < 0.05, and p < 0.01 (two-tailed), respectively.

clients, in which the auditor may acquiesce to favorable outcomes in order to retain the client. Specifically, we find that clients in the top decile of switching probability have lower audit quality. More broadly, we provide evidence that machine learning techniques can be useful for predicting the likelihood of rare events that may have a negative effect on audit quality.
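The top-decile flag used in these analyses corresponds to HIGH_PROB in Appendix A. Its construction can be sketched as follows, with simulated probabilities standing in for the prediction model's output:

```python
# Minimal sketch (illustrative data): flagging the top decile of the
# estimated switching probability, analogous to HIGH_PROB in Appendix A.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "company_id": range(1000),
    "prob_switch": rng.beta(1, 9, size=1000),  # simulated model output
})

cutoff = df["prob_switch"].quantile(0.9)       # 90th percentile cutoff
df["high_prob"] = (df["prob_switch"] >= cutoff).astype(int)

print(df["high_prob"].mean())                  # roughly 0.10 by construction
```

In practice a regulator or audit firm would compute the cutoff over the full population of company-years each period and monitor the flagged engagements.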

Appendix A. Variable definitions

Variable Definition
Dependent Variables
MISS_ALL Indicator variable set equal to one if the company subsequently restates year t financial statements for
reasons related to accounting, fraud, or an SEC investigation, and zero otherwise.
MISS_BIGR Indicator variable set equal to one if the company subsequently restates year t financial statements and
the restatement is disclosed in a Form 8-K filing, and zero otherwise.
ABS_ACC Absolute discretionary accruals estimated using the modified Jones model, adjusted for prior year
company performance (Kothari et al. 2005).
POS_ACC Income increasing discretionary accruals estimated using the modified Jones model, adjusted for prior
year company performance (Kothari et al. 2005). Income-decreasing accruals are set to zero (Brown and
Knechel, 2016).

Variables of Interest
PROB_SWITCH Probability of the company switching auditors for the current year estimated using our prediction
models.
HIGH_PROB Indicator variable set equal to one if the company is in the top decile of the distribution of
PROB_SWITCH, and zero otherwise.

Control Variables
ACQUIRE Indicator variable set equal to one if cash outflows related to acquisitions or the contribution of
acquisitions to sales exceed ten percent of total assets, and zero otherwise.
AT_GROW Change in total assets scaled by prior year total assets.
BIGN Indicator variable set equal to one if the company is audited by one of the four largest auditors, and zero
otherwise.
CASH Cash plus cash equivalents scaled by total assets.
CFEARLY Indicator variable set equal to one if the company is in the introduction or growth stage of its life cycle,
and zero otherwise.
CFMATURE Indicator variable set equal to one if the company is in the mature stage of its life cycle, and zero
otherwise.


CFO Cash flow from operations scaled by total assets.
CLIENTS Number of clients in the auditor’s MSA.
DACC Discretionary accruals estimated using the modified Jones model (DeFond and Subramanyam, 1998).
EXFIN Indicator variable set equal to one if debt issuances exceed twenty percent of total assets or equity
issuances exceed ten percent of total assets, and zero otherwise.
INVT_RECT Receivables plus inventory scaled by total assets.
LEV Long-term debt scaled by total assets.
LIT_IND Indicator variable set equal to one for companies in litigious industries, and zero otherwise.
LN_AT Natural log of the company’s total assets in $ millions.
LOSS An indicator variable set equal to one if ROA is less than zero, and zero otherwise.
MARKET Number of auditors having five or more clients in the same MSA as the company.
MODOP Indicator variable set equal to one if Compustat reports a nonstandard audit opinion, and zero
otherwise.
MTB Market value of common shares outstanding divided by the book value of total equity.
ROA Income before extraordinary items scaled by total assets.
SHORT_TEN Indicator variable set equal to one if auditor tenure with the company is three years or less, and zero otherwise.
SPECIALIST Indicator variable set equal to one if the company’s auditor has at least five percent more clients than
the next largest auditor in both the industry and the MSA, and zero otherwise.
SWITCH Indicator variable set equal to one if the company changed auditors from the previous year to the
current year, and zero otherwise.
VOL Standard deviation of monthly stock returns for the previous twelve months.
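Many of the variables above are simple ratios or transformations of Compustat fields. The sketch below illustrates how a few of them could be constructed; the column mnemonics and toy values are assumptions, not the authors' code, and the POS_ACC line shows only the zeroing of income-decreasing accruals, not the Kothari et al. (2005) estimation itself.

```python
# Minimal sketch (assumed column mnemonics, toy values): constructing a few
# of the Appendix A variables from Compustat-style fields.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "at":     [100.0, 250.0, 80.0],   # total assets
    "at_lag": [90.0, 260.0, 75.0],    # prior-year total assets
    "dltt":   [20.0, 100.0, 5.0],     # long-term debt
    "ib":     [5.0, -10.0, 2.0],      # income before extraordinary items
    "che":    [10.0, 30.0, 8.0],      # cash and equivalents
    "dacc":   [0.03, -0.02, 0.01],    # discretionary accruals (DACC)
})

df["LN_AT"] = np.log(df["at"])                            # log total assets
df["LEV"] = df["dltt"] / df["at"]                         # leverage
df["ROA"] = df["ib"] / df["at"]                           # return on assets
df["LOSS"] = (df["ROA"] < 0).astype(int)                  # loss indicator
df["CASH"] = df["che"] / df["at"]                         # cash holdings
df["AT_GROW"] = (df["at"] - df["at_lag"]) / df["at_lag"]  # asset growth
df["POS_ACC"] = df["dacc"].clip(lower=0)                  # zero out income-decreasing accruals
```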

References

Aobdia, D., 2019. The validity of publicly available measures of audit quality: Evidence from the PCAOB inspection data. J. Account. Econ. (Forthcoming)
Ashbaugh-Skaife, H., Collins, D.W., Kinney Jr., W.R., 2007. The discovery and reporting of internal control deficiencies prior to SOX-mandated audits. J.
Account. Econ. 44 (1–2), 166–192.
Bao, Y., Ke, B., Li, B., Yu, Y.J., Zhang, J., 2020. Detecting accounting fraud in publicly traded U.S. firms using a machine learning approach. J. Account. Res. 58
(1), 199–235.
Bazerman, M.H., Morgan, K.P., Loewenstein, G.F., 1997. The impossibility of auditor independence. Sloan Manage. Rev. 38 (4), 89–94.
Bertomeu, J., Cheynel, E., Floyd, E., Pan, W., 2020. Using machine learning to detect misstatements. Rev. Acc. Stud. (Forthcoming)
Brown, S.V., Knechel, W.R., 2016. Auditor-client compatibility and audit firm selection. J. Account. Res. 54 (3), 725–775.
Carcello, J.V., Neal, T.L., 2003. Audit committee characteristics and auditor dismissals following ‘‘new” going-concern reports. Account. Rev. 78 (1), 95–117.
Cassell, C., Hunt, E., Narayanamoorthy, G., Rowe, S.P., 2019. A hidden risk of auditor industry specialization: Evidence from the financial crisis. Rev. Acc. Stud.
24, 891–926.
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P., 2002. SMOTE: Synthetic minority over-sampling technique. J. Artificial Intell. Res. 16, 321–357.
Cecchini, M., Aytug, H., Koehler, G.J., Pathak, P., 2010. Detecting management fraud in public companies. Manage. Sci. 56 (7), 1146–1160.
Chow, C.W., Rice, S.J., 1982. Qualified audit opinions and auditor switching. Account. Rev. 57 (2), 326–335.
Craswell, A., Stokes, D.J., Laughton, J., 2002. Auditor independence and fee dependence. J. Account. Econ. 33, 253–275.
Davis, J., Goadrich, M., 2006. The relationship between precision-recall and ROC curves. In: Proceedings of the 23rd International Conference on Machine
Learning. ACM, pp. 233–240.
DeAngelo, L., 1981. Auditor independence, ‘‘low-balling” and disclosure regulation. J. Account. Econ. 3 (2), 113–127.
Dechow, P.M., Ge, W., Larson, C.R., Sloan, R.G., 2011. Predicting material accounting misstatements. Contemp. Account. Res. 28 (1), 17–82.
DeFond, M.L., Raghunandan, K., Subramanyam, K.R., 2002. Do non-audit service fees impair auditor independence? Evidence from going concern audit
opinions. J. Account. Res. 40 (4), 1247–1274.
DeFond, M.L., Subramanyam, K.R., 1998. Auditor changes and discretionary accruals. J. Account. Econ. 25 (1), 35–67.
DeFond, M., Zhang, J., 2014. A review of archival auditing research. J. Account. Econ. 58 (2–3), 275–326.
Dhaliwal, D.S., Lamoreaux, P.T., Lennox, C.S., Mauler, L.M., 2015. Management influence on auditor selection and subsequent impairments of auditor
independence during the post-SOX period. Contemp. Account. Res. 32 (2), 575–607.
Dickinson, V., 2011. Cash flow patterns as a proxy for firm life cycle. Account. Rev. 86 (6), 1969–1994.
Ding, K., Peng, X., Wang, Y., 2019. A machine learning-based peer selection method with financial ratios. Account. Horizons 33 (3), 75–87.
Francis, J., Philbrick, D., Schipper, K., 1994. Shareholder litigation and corporate disclosures. J. Account. Res. 32 (2), 137–164.
Geiger, M.A., Raghunandan, K., Rama, D.V., 1998. Costs associated with going-concern modified audit opinions: An analysis of auditor changes, subsequent
opinions and client failures. Adv. Account. 16 (1), 117–139.
Green, B.P., Choi, J.H., 1997. Assessing the risk of management fraud through neural network technology. Auditing: A J. Practice Theory 16 (1), 14–28.
Gu, S., Kelly, B., Xiu, D., 2020. Empirical asset pricing via machine learning. Rev. Financ. Stud. 33 (5), 2223–2273.
Gul, F.A., Jaggi, B.L., Krishnan, G.V., 2007. Auditor independence: Evidence on the joint effects of auditor tenure and nonaudit fees. Auditing: A J. Practice
Theory 26 (2), 117–142.
Hastie, T., Tibshirani, R., Friedman, J., 2009. The elements of statistical learning: data mining, inference and prediction. Springer, New York.
Jones, S., 2017. Corporate bankruptcy prediction: A high dimensional analysis. Rev. Acc. Stud. 22 (3), 1366–1422.
Kinney Jr., W.R., Palmrose, Z.V., Scholz, S., 2004. Auditor independence, non-audit services, and restatements: Was the U.S. government right? J. Account.
Res. 42 (3), 561–588.
Kogan, A., Mayhew, B.W., Vasarhelyi, M.A., 2019. Audit data analytics research – An application of design science. Account. Horizons 33 (3), 69–73.
Kothari, S.P., Leone, A.J., Wasley, C.E., 2005. Performance matched discretionary accruals measures. J. Account. Econ. 39 (1), 163–197.
Krishnan, J., 1994. Auditor switching and conservatism. Account. Rev. 69 (1), 200–215.
Lennox, C.S., 2000. Do companies successfully engage in opinion-shopping? Evidence from the UK. J. Account. Econ. 37 (2), 201–231.


Lennox, C.S., Park, C.W., 2007. Audit firm appointments, audit firm alumni, and audit committee independence. Contemp. Account. Res. 24 (1), 235–258.
Li, C., 2009. Does client importance affect auditor independence at the office level? Empirical evidence from going-concern opinions. Contemp. Account. Res.
26 (1), 201–230.
Lu, T., 2006. Does opinion shopping impair auditor independence and audit quality?. J. Account. Res. 44 (3), 561–583.
Maes, S., Tuyls, K., Vanschoenwinkel, B., Manderick, B., 2002. Credit card fraud detection using Bayesian and neural networks. In: Proceedings of the 1st
international naiso congress on neuro fuzzy technologies, pp. 261–270.
Min, J.H., Lee, Y.C., 2005. Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Syst. Appl. 28 (4),
603–614.
Mullainathan, S., Spiess, J., 2017. Machine learning: An applied econometric approach. J. Econ. Perspect. 31 (2), 87–106.
Nelson, M.W., 2009. A model and literature review of professional skepticism in auditing. Auditing: A J. Practice Theory 28 (2), 1–34.
Newton, N.J., Persellin, J.S., Wang, D., Wilkins, M.S., 2016. Internal control opinion shopping and audit market competition. Account. Rev. 91 (2), 603–623.
Nolder, C.J., Kadous, K., 2018. Grounding the professional skepticism construct in mindset and attitude theory: A way forward. Acc. Organ. Soc. 67, 1–14.
Perols, J.L., Bowen, R.M., Zimmermann, C., Samba, B., 2017. Finding needles in a haystack: Using data analytics to improve fraud prediction. Account. Rev. 92
(2), 221–245.
Perols, J.L., Lougee, B.A., 2011. The relation between earnings management and financial statement fraud. Adv. Account. 27 (1), 39–53.
Roberts, R.W., Glezen, G.W., Jones, T.W., 1990. Determinants of auditor change in the public sector. J. Account. Res. 28 (1), 220–228.
Rowe, S.P., 2019. Auditors’ comfort with uncertain estimates: More evidence is not always better. Acc. Organ. Soc. 76 (July), 1–64.
Saito, T., Rehmsmeier, M., 2015. The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets.
PloS ONE 10 (3), e0118432.
Schwartz, K.B., Menon, K., 1985. Auditor switches by failing firms. Account. Rev. 60 (2), 248–261.
Schwartz, K.B., Soo, B.S., 1996. The association between auditor changes and reporting lags. Contemp. Account. Res. 13 (1), 353–370.
Shipman, J.E., Swanquist, Q.T., Whited, R.L., 2017. Propensity score matching in accounting research. Account. Rev. 92 (1), 213–244.
Tan, C.E.L., Young, S.M., 2015. An analysis of ‘‘Little r” restatements. Account. Horizons 29 (3), 667–693.
Tsai, C.F., Wu, J.W., 2008. Using neural network ensembles for bankruptcy prediction and credit scoring. Expert Syst. Appl. 34 (4), 2639–2649.
U.S. Senate. 1977. Report of the subcommittee on reports, accounting, and management of the Committee on Government Operations (Metcalf Committee
report). U.S. Government Printing Office, Washington, DC
Watkins, A.L., Hillison, W., Morecroft, S.E., 2004. Audit quality: A synthesis of theory and empirical evidence. J. Account. Literat. 23, 153.
Whiting, D.G., Hansen, J.V., McDonald, J.B., Albrecht, C., Albrecht, W.S., 2012. Machine learning methods for detecting patterns of management fraud.
Comput. Intell. 28 (4), 505–527.
Youden, W.J., 1950. Index for rating diagnostic tests. Cancer 3 (1), 32–35.
