Article 6 - Big Data As Complementary Audit Evidence
SYNOPSIS: In this paper we argue for the use of Big Data as complementary audit
evidence. We evaluate the applicability of Big Data using the audit evidence criteria
framework and provide cost-benefit analysis for sufficiency, reliability, and relevance
considerations. Critical challenges, including integration with traditional audit evidence,
information transfer issues, and information privacy protection, are discussed and
possible solutions are provided.
INTRODUCTION
Audit evidence is defined as the entire set of information collected and evaluated by
auditors when deciding whether a firm’s financial statements are stated in accordance with
generally accepted accounting principles (SAS No. 106, AICPA 2004). In practice,
external auditors deal with large amounts of information that, although less than the entire set of
possible audit evidence, meet the accepted professional requirement of being “sufficient and
appropriate” (SAS No. 106, AICPA 2004). Audit evidence may be obtained through the
examination of underlying accounting records as well as from other corroborative information
sources such as observations, confirmations from third parties, and any other information that may
provide a reasonable basis for conclusions (Louwers, Ramsey, Sinason, and Strawser 2007).
Auditors are becoming more holistic in their audit risk assessments, examining evidence
available from various sources in order to decrease the probability of material misstatement and
audit failure (Bell, Peecher, and Solomon 2005). This approach has been facilitated by new
technologies that provide auditors with a wider variety of both financial and nonfinancial
information, as well as improved audit efficiency resulting from computerization and audit
automation (Trompeter and Wright 2010).
Kyunghee Yoon is a Ph.D. student and Li Zhang is an Assistant Professor at Rutgers, The State
University of New Jersey, Newark, and Lucas Hoogduin is a senior manager with KPMG LLP.
We thank Helen Brown-Liburd, Paul Byrnes, Paul Griffin, Eunice Hung, Hussein Issa, Alexander Kogan, John Peter
Krahel, Brad Tuttle, Miklos Vasarhelyi, Arnold Wright, and anonymous reviewers for their valuable suggestions.
The information contained herein is of a general nature and is not intended to address the specific circumstances of any
particular individual or entity.
This article represents the views of the authors only, and does not necessarily represent the views or professional advice
of KPMG LLP or the American Accounting Association.
Submitted: February 2015
Accepted: February 2015
Published Online: February 2015
Corresponding author: Kyunghee Yoon
Email: yoonkhee@rutgers.edu
In the Big Data era, techniques such as pattern recognition, data mining, and natural-language
processing have improved the predictive power of data analysis routines. Accordingly, it is
expected that decisions will be more data driven than experience driven (Lohr 2012). Given this
new reality, we anticipate that by utilizing Big Data, auditors’ efforts to collect sufficient and
appropriate audit evidence can be enhanced. In this paper, we evaluate Big Data as audit evidence
from an evidentiary requirement perspective and provide a cost-benefit analysis for possible
applications. We contend that Big Data is a valuable complement to traditional audit evidence due
to its special characteristics. We will also discuss several challenges to its use during the audit
process. In addition, we will suggest a few related questions for future research.
Accounting Horizons
June 2015
complement for a client’s internal information that is not readily available to auditors. For example,
when auditing a manufacturing concern, auditors may request management’s sales forecasts
because they can be used to understand the production volume and inventory levels (Louwers et al.
2007). If managers’ sales forecasts are not available or are of low quality, then auditors could use
text analysis to analyze Big Data from news articles, product discussion forums, and social
networks to better understand the sales trends of the client.
Big Data can offer support when traditional audit evidence is deficient, as might be true in a
case of fraud. Obtaining evidence of fraud is difficult because the evidence bearing on
motivation and rationalization relates to an individual’s lifestyle, conduct, and morality (SAS
No. 99, AICPA 2002), none of which is necessarily observable. Evaluating emails can be
particularly helpful in identifying a person’s motivation and probable rationalization, such as
discontent with the firm. Holton (2009) used automated text mining to identify emails from
disgruntled employees for fraud detection.
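To make the idea of screening emails for discontent concrete, the following is a minimal sketch in Python. It is not Holton's (2009) method; it assumes a small, hypothetical lexicon of discontent cues (`DISCONTENT_TERMS`) and an arbitrary threshold, and simply ranks emails by the share of words that match the lexicon. A real engagement would need a validated word list or a trained classifier.

```python
import re
from collections import Counter

# Hypothetical lexicon of discontent cues (an assumption for illustration only).
DISCONTENT_TERMS = {"unfair", "underpaid", "resent", "deserve", "furious", "quit"}


def discontent_score(text: str) -> float:
    """Return the share of words in `text` that appear in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[term] for term in DISCONTENT_TERMS)
    return hits / len(words)


def flag_emails(emails: dict[str, str], threshold: float = 0.05) -> list[str]:
    """Return ids of emails scoring at or above `threshold`, highest first."""
    scored = {eid: discontent_score(body) for eid, body in emails.items()}
    flagged = [eid for eid, score in scored.items() if score >= threshold]
    return sorted(flagged, key=lambda eid: scored[eid], reverse=True)
```

Flagged emails would only be candidates for follow-up; a high score indicates language associated with discontent, not fraud itself.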
The major sufficiency-related benefit is the abundance of information that different forms
of data provide in massive quantities. The primary cost is the data processing effort needed to
support a given audit assertion. Fortunately, advanced data analytics are available, and these
tools are more powerful on larger data sets and accommodate unstructured data (Russom 2011). Auditors
can also create their own data warehouses to achieve economies of scale across clients and reduce
data processing costs.
nature. Auditors can analyze timely news reports to evaluate their clients’ financial performance
changes and business planning.
Professional standards also require that auditors evaluate the risks related to internal control
weaknesses and fraudulent statements (SAS No. 107, AICPA 2007). Previous literature has
provided various ways to assess client risks (Johnstone 2000), and management disclosures could
be particularly useful for this task. SAS No. 99 (AICPA 2002) points out that “overly optimistic
press releases or annual report messages” are risk factors related to potential fraud. In this vein,
Humpherys, Moffitt, Burns, Burgoon, and Felix (2011) study fraudulent disclosures by analyzing
Management Discussion and Analysis sections. Similarly, Larcker and Zakolyukina (2012) find
that executive use of deceptive language in conference calls can help to identify financial
misstatements. Therefore, text analysis of management disclosures is relevant to assessing the risk
of management fraud.
The nature of e-commerce presents a unique opportunity to use Big Data-based auditing
techniques. There has been a significant shift in the retail industry from brick-and-mortar sales to
Internet sales. Even though e-commerce sales constituted only 6.6 percent of total U.S. retail sales in
the third quarter of 2014, they are growing more rapidly (16.2 percent relative to the third quarter of
2013) than traditional types of retail sales.¹ This trend indicates that auditors will increasingly
be facing clients who have very different types of business processes, prompting the need for
collecting different forms of audit evidence. For example, auditors can compare a client’s website
traffic data to that of competitors with similar customers over the same time period. Any
inconsistency should be identified for special attention, even when the client’s own sales record
shows no problem.
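The traffic comparison described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the function names, the use of the peer median, and the 10-percentage-point tolerance are all hypothetical choices, and each input is a (prior period, current period) pair.

```python
from statistics import median


def growth(prior: float, current: float) -> float:
    """Period-over-period growth rate."""
    return (current - prior) / prior


def flag_inconsistency(client_sales, client_traffic, peer_traffic, tolerance=0.10):
    """Compare reported sales growth with web-traffic growth.

    `client_sales` and `client_traffic` are (prior, current) pairs;
    `peer_traffic` is a list of such pairs for comparable competitors.
    The tolerance is an assumed cut-off, not a professional standard.
    """
    sales_g = growth(*client_sales)
    traffic_g = growth(*client_traffic)
    peer_g = median([growth(prior, current) for prior, current in peer_traffic])
    return {
        "sales_growth": sales_g,
        "traffic_growth": traffic_g,
        "peer_traffic_growth": peer_g,
        # Reported sales moving very differently from the client's own traffic.
        "sales_vs_traffic": abs(sales_g - traffic_g) > tolerance,
        # Client traffic moving very differently from comparable competitors.
        "traffic_vs_peers": abs(traffic_g - peer_g) > tolerance,
    }
```

For instance, a client reporting 30 percent sales growth while its traffic grew 2 percent would be flagged on `sales_vs_traffic`, even if its sales records are internally consistent.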
The Big Data approach is relevant because it provides unique and sometimes more timely
evidence compared to the traditional audit approach. The major cost is that the evidence generated
from Big Data mainly suggests association, not causation (Cao, Chychyla, and Stewart 2015). In
the above example, deceptive language in earnings conference calls does not cause financial
misstatement. Rather, it reflects the deceptive behavior of CEOs and CFOs and is therefore associated
with financial misstatement (Larcker and Zakolyukina 2012).
¹ See: https://www.census.gov/retail/mrts/www/data/pdf/ec_current.pdf for the 2014 third quarter release.
CRITICAL CHALLENGES
Integration with Traditional Audit Evidence
Integration involves identifying and establishing important relationships among information
from separate sources (Moeckel 1991). In most cases, auditors will obtain both qualitative and
quantitative, or objective and subjective, information (Louwers et al. 2007). Hence, integrating
audit evidence is difficult but critical to the quality of an audit opinion
(Moeckel 1991).
Integrating Big Data with more traditional audit evidence is critical because Big Data is often
unstructured and, specifically, may not have the structure needed by relational databases to uniquely
identify transactions, customers, or products. For example, auditors may use GPS location data to
verify certain transactions. However, matching Big Data information, such as pictures or GPS data,
to traditional accounting records can prove difficult. To successfully incorporate Big Data into the
audit, auditors must find the appropriate “bridge methods” to link the new information and
traditional audit evidence (Vasarhelyi, Kogan, and Tuttle 2015).
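One possible bridge between GPS data and accounting records is to match on time and place. The Python sketch below assumes hypothetical record layouts (`Delivery` from the ledger, `GpsPing` from a fleet-tracking feed) and arbitrary matching thresholds; it lists recorded deliveries that no GPS ping corroborates.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Delivery:  # hypothetical ledger record
    invoice_id: str
    lat: float
    lon: float
    recorded_at: datetime


@dataclass
class GpsPing:  # hypothetical fleet-tracking record
    lat: float
    lon: float
    seen_at: datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def unmatched_deliveries(deliveries, pings, max_km=1.0, window=timedelta(hours=2)):
    """Return invoice ids with no GPS ping nearby in both space and time."""
    unmatched = []
    for d in deliveries:
        corroborated = any(
            abs(p.seen_at - d.recorded_at) <= window
            and haversine_km(d.lat, d.lon, p.lat, p.lon) <= max_km
            for p in pings
        )
        if not corroborated:
            unmatched.append(d.invoice_id)
    return unmatched
```

An unmatched invoice is a prompt for further procedures rather than proof of a problem, since GPS coverage may simply be incomplete.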
Summarization and evaluation of Big Data to integrate with traditional audit evidence present
other hurdles to providing useful audit evidence. Because of the size and unstructured nature of Big
Data, auditors must have sophisticated techniques to sift and summarize the information, such as
the data-mining and statistical analysis techniques studied in the previous literature (Pang and Lee
2008; Russom 2011). Auditors are less familiar with the many sources of Big Data (compared to
traditional sources) and with their audit evidence properties, such as sufficiency, reliability, and relevance.
Hence, it is difficult to predict a priori how effective the use of Big Data will be for any specific
purpose. Furthermore, data sources may be correlated (such as social media and third-party news
articles), so the incremental value of a data source may be limited, and the evidence obtained from
multiple sources may be less than the sum of the evidence from each part.
Weighting Big Data could also become an issue. Auditors have the responsibility to collate
different formats of audit evidence from traditional sources. Based on previous experience, auditors
are likely to have their own hierarchical system in place to weight such evidence (Louwers et al.
2007). However, it may not be easy to adhere to the traditional systems for weighting evidence
from Big Data. Big Data commonly does not offer precise information, and sometimes the data
from sources such as news articles may be affected by biases (Vasarhelyi 2008).
Facing these challenges, auditors should weight Big Data evidence under the framework of
evidentiary requirements. That is, auditors must estimate the total amount of audit evidence for each
specific audit objective complying with sufficiency, reliability, and relevance requirements. The
amount of evidence provided by Big Data could be determined based on the pros and cons of Big
Data for each evidentiary requirement, as well as the level of deficiency of traditional audit
evidence. In the belief function framework, the cluster of variables with available belief functions in
the evidential network needs to be identified (Srivastava 1995). In order to reduce detection risk,
more weight should be given to the audit evidence generated by Big Data if it provides
disconfirming evidence (Fukukawa and Mock 2012). The answer to the problem of how much Big
Data should be used in an audit engagement may also vary significantly among different industry
types and firm sizes.
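As an illustration of how two items of evidence might be aggregated in a belief function framework, the sketch below applies Dempster's rule of combination, the standard combination rule in Dempster-Shafer theory, to a single assertion. The names (`Mass`, `combine`) and the two-state frame (fairly stated versus misstated, with the remainder left uncommitted) are assumptions for illustration; the paper does not specify a formula, and the sketch does not implement any extra weighting of disconfirming evidence.

```python
from typing import NamedTuple


class Mass(NamedTuple):
    """Belief masses over one assertion; the three values should sum to 1."""
    fair: float       # support that the account is fairly stated
    misstated: float  # support that the account is misstated
    unknown: float    # uncommitted belief (the whole frame)


def combine(m1: Mass, m2: Mass) -> Mass:
    """Combine two independent items of evidence with Dempster's rule."""
    conflict = m1.fair * m2.misstated + m1.misstated * m2.fair
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence; the rule is undefined")
    norm = 1.0 - conflict
    fair = (m1.fair * m2.fair + m1.fair * m2.unknown + m1.unknown * m2.fair) / norm
    misstated = (
        m1.misstated * m2.misstated
        + m1.misstated * m2.unknown
        + m1.unknown * m2.misstated
    ) / norm
    unknown = (m1.unknown * m2.unknown) / norm
    return Mass(fair, misstated, unknown)
```

Under these assumptions, traditional evidence `Mass(0.6, 0.0, 0.4)` combined with confirming Big Data evidence `Mass(0.5, 0.0, 0.5)` raises support for fair statement to about 0.8, while disconfirming evidence `Mass(0.0, 0.5, 0.5)` lowers it to about 0.43. The rule also assumes the items of evidence are independent, which may not hold for correlated Big Data sources.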
Information Transfer
Auditors who specialize in certain industries are often considered to produce higher-quality
audits because of both their in-depth knowledge and greater economies of scale (Balsam, Krishnan,
and Yang 2003; Danos and Eichenseher 1982). Even though the codes of professional conduct
prohibit auditors from disclosing any confidential client information without a client’s specific
permission (ET Sec. 301, AICPA 1992), the general knowledge and expertise obtained through a
client engagement is transferable to other engagements. For example, when external auditors
“know” general customer responses to the new iPad device, this direct knowledge can be used as
audit evidence not only for Apple Inc., but also for others in the same industry.
Consequently, clients who fear information transfer to competitors may actually avoid hiring
specialized auditors (Kwon 1996). While this is not an issue unique to Big Data, it can be
intensified when external auditors endeavor to access a wider scope of internal data sources.
Because there are higher costs in collecting and analyzing Big Data, the economies of scale may
grow. Hence, specialized audit firms may utilize more Big Data as audit evidence than other firms
do, leading to a higher barrier that prevents competitors from entering specialized industries. Clients
who are concerned about spillover of their information may also restrict access to the proprietary
data sources.
To solve the information transfer issue, auditors should formally contract with clients with
regard to the usage of clients’ internal data, such as meeting minutes and website traffic. If one
client’s internal data is used for another client’s auditing task, then the key identifying information
should be deleted or hidden. In general, auditors should only use highly synthesized information
from Big Data for other auditing tasks and limit the access to the original unprocessed data.
Information Privacy
Information privacy, described as “the ability of the individual to control, personally,
information about one’s self” (Stone, Gueutal, Gardner, and McClure 1983), is a significant
challenge to utilizing Big Data as audit evidence. Smith, Milberg, and Burke (1996) discuss the
major concerns individuals have related to information privacy, such as internal and external
unauthorized secondary use. Internal emails could be used to detect fraudulent employee behavior.
However, once external auditors have access to employees’ emails, employees may feel their
information privacy is violated. If auditors can access an even wider scope of information including
GPS data, videos, and audio files, then such concerns will only be heightened.
To alleviate these concerns, auditing firms should cooperate with their clients and inform the
employees in advance that any work-related data sources could be used for audit purposes. They
should also communicate to employees that work-related data will be used for a particular
audit objective only. The information should be anonymized unless fraud is detected.
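One way to hide identities while keeping the option of re-identification if fraud is detected is keyed pseudonymization. The Python sketch below is an assumption-laden illustration, not a prescribed procedure: it replaces identifying fields with an HMAC-SHA256 digest, and the field names, the `emp_` prefix, and the key handling are all hypothetical. Strictly speaking, this is pseudonymization rather than full anonymization, because whoever holds the key can recompute the mapping.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using a secret key."""
    normalized = identifier.strip().lower().encode("utf-8")
    digest = hmac.new(key, normalized, hashlib.sha256).hexdigest()
    return "emp_" + digest[:12]


def anonymize_records(records, key, fields=("sender", "recipient")):
    """Return copies of `records` with the identifying fields pseudonymized."""
    return [
        {k: pseudonymize(v, key) if k in fields else v for k, v in record.items()}
        for record in records
    ]
```

Because the same identifier always maps to the same pseudonym under a given key, analysts can still link one person's messages together without seeing who that person is.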
in different industries for auditing purposes, and updating auditing standards to regulate information
transfer and privacy issues.
REFERENCES
American Institute of Certified Public Accountants (AICPA). 1988. Analytical Procedures. Statement on
Auditing Standards No. 56. New York, NY: AICPA.
American Institute of Certified Public Accountants (AICPA). 1992. Confidential Client Information. Code
of Professional Conduct ET Sec. 301. New York, NY: AICPA.
American Institute of Certified Public Accountants (AICPA). 2002. Consideration of Fraud in a Financial
Statement Audit. Statement on Auditing Standards No. 99. New York, NY: AICPA.
American Institute of Certified Public Accountants (AICPA). 2004. Audit Evidence. Statement on Auditing
Standards No. 106. New York, NY: AICPA.
American Institute of Certified Public Accountants (AICPA). 2007. Audit Risk and Materiality in
Conducting an Audit. Statement on Auditing Standards No. 107. New York, NY: AICPA.
Balsam, S., J. Krishnan, and J. S. Yang. 2003. Auditor industry specialization and earnings quality.
Auditing: A Journal of Practice & Theory 22 (2): 71–97.
Bell, T. B., M. E. Peecher, and I. Solomon. 2005. The 21st Century Public Company Audit: Conceptual
Elements of KPMG’s Global Audit Methodology. New York, NY: KPMG International.
Bennett, G. B., and R. C. Hatfield. 2012. The effect of the social mismatch between staff auditors and client
management on the collection of audit evidence. The Accounting Review 88 (1): 31–50.
Buhl, H. U., M. Röglinger, D. K. F. Moser, and J. Heidemann. 2013. Big Data: A fashionable topic
with(out) sustainable relevance for research and practice? Business & Information Systems
Engineering 5 (2): 65–69.
Cao, M., R. Chychyla, and T. Stewart. 2015. Big data analytics in financial statement audits. Accounting
Horizons 29 (2).
Danos, P., and J. W. Eichenseher. 1982. Audit industry dynamics: Factors affecting changes in client-
industry market shares. Journal of Accounting Research 20 (2): 604–616.
Dhillon, I. S., and D. S. Modha. 2001. Concept decompositions for large sparse text data using clustering.
Machine Learning 42 (1/2): 143–175.
Dunn, J. 1996. Auditing: Theory and Practice. Vol. 2. Upper Saddle River, NJ: Prentice Hall, Inc.
Engle, R. F., C. W. Granger, J. Rice, and A. Weiss. 1986. Semiparametric estimates of the relation between
weather and electricity sales. Journal of the American Statistical Association 81 (394): 310–320.
Fukukawa, H., and T. J. Mock. 2012. Auditors’ evidence evaluation and aggregation using beliefs and
probabilities. International Journal of Approximate Reasoning 53 (2): 190–199.
Holton, C. 2009. Identifying disgruntled employee systems fraud risk through text mining: A simple
solution for a multi-billion dollar problem. Decision Support Systems 46 (4): 853–864.
Humpherys, S. L., K. C. Moffitt, M. B. Burns, J. K. Burgoon, and W. F. Felix. 2011. Identification of
fraudulent financial statements using linguistic credibility analysis. Decision Support Systems 50 (3):
585–594.
Issa, H., and A. Kogan. 2014. A predictive ordered logistic regression model as a tool for quality review of
control risk assessments. Journal of Information Systems 28 (2): 209–229.
Ittner, C. D., and D. F. Larcker. 1998. Are nonfinancial measures leading indicators of financial
performance? An analysis of customer satisfaction. Journal of Accounting Research 36
(Supplement): 1–35.
Johnstone, K. M. 2000. Client-acceptance decisions: Simultaneous effects of client business risk, audit risk,
auditor business risk, and risk adaptation. Auditing: A Journal of Practice & Theory 19 (1): 1–25.
Kwon, S. Y. 1996. The impact of competition within the client’s industry on the auditor selection decision.
Auditing: A Journal of Practice & Theory 15 (1): 53–70.
Larcker, D. F., and A. A. Zakolyukina. 2012. Detecting deceptive discussions in conference calls. Journal
of Accounting Research 50 (2): 495–540.
Lohr, S. 2012. The age of Big Data. The New York Times (February 11).
Louwers, T. J., R. J. Ramsey, D. H. Sinason, and J. R. Strawser. 2007. Auditing and Assurance Services.
New York, NY: McGraw-Hill.
Moeckel, C. 1991. Two factors affecting an auditor’s ability to integrate audit evidence. Contemporary
Accounting Research 8 (1): 270–292.
Moffitt, K., and M. A. Vasarhelyi. 2013. AIS in an age of Big Data. Journal of Information Systems 27 (2):
1–19.
Pang, B., and L. Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information
Retrieval 2 (1/2): 1–135.
Russom, P. 2011. Big Data Analytics. TDWI Best Practices Report, Fourth Quarter. Available at:
http://tdwi.org/research/2011/09/~/media/TDWI/TDWI/Research/BPR/2011/TDWI_BPReport_Q411_Big_Data_Analytics_Web/TDWI_BPReport_Q411_Big%20Data_ExecSummary.ashx
Smith, H. J., S. J. Milberg, and S. J. Burke. 1996. Information privacy: Measuring individuals’ concerns
about organizational practices. MIS Quarterly 20 (2): 167–196.
Srivastava, R. P. 1995. The belief-function approach to aggregating audit evidence. International Journal of
Intelligent Systems 10 (3): 329–356.
Srivastava, R. P., and G. R. Shafer. 1992. Belief-function formulas for audit risk. The Accounting Review 67
(2): 249–283.
Starr-McCluer, M. 2000. The Effects of Weather on Retail Sales. Available at:
http://www.federalreserve.gov/pubs/feds/2000/200008/200008pap.pdf
Stone, E. F., H. G. Gueutal, D. G. Gardner, and S. McClure. 1983. A field experiment comparing
information-privacy values, beliefs, and attitudes across several types of organizations. Journal of
Applied Psychology 68 (3): 459–468.
Tetlock, P. C. 2007. Giving content to investor sentiment: The role of media in the stock market. The
Journal of Finance 62 (3): 1139–1168.
Tetlock, P. C., M. Saar-Tsechansky, and S. Macskassy. 2008. More than words: Quantifying language to
measure firms’ fundamentals. The Journal of Finance 63 (3): 1437–1467.
Trompeter, G., and A. Wright. 2010. The world has changed—Have analytical procedure practices?
Contemporary Accounting Research 27 (2): 669–700.
Tufekci, Z. 2013. Big Data: Pitfalls, Methods, and Concepts for an Emergent Field. Available at:
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2229952
Vasarhelyi, M. A. 2008. Evolving accounting systems research with business measurement practice: A
letter from the editor. Journal of Emerging Technologies in Accounting 5 (1): i–x.
Vasarhelyi, M. A., A. Kogan, and B. Tuttle. 2015. Big Data in accounting: An overview. Accounting
Horizons 29 (2).
Waller, M. A., and S. E. Fawcett. 2013. Data science, predictive analytics, and Big Data: A revolution that
will transform supply chain design and management. Journal of Business Logistics 34 (2): 77–84.