SMU560

MOSS & ASSOCIATES: ACCOUNTING FOR FINANCIAL FRAUD


In June 2019, Cheryl Leong, Head of Fraud Analytics and Data Management at Moss & Associates,
a mid-sized New York accounting firm, hurried down two blocks to get a sandwich from a nearby
food truck on Third Avenue, New York, USA. Due to the busy quarter end reporting period, it would
likely be her only meal of the day. She gazed wistfully toward Central Park; she would have to skip
her usual sunset walk. There were several clients reporting earnings at the end of the business day
and she had to get her team ready to go over the financial statements.

The accounting industry provided services to help companies present financial information in an
orderly manner to investors, regulators and other stakeholders. Some accounting firms would also
provide additional services to help companies minimise taxes and grow their businesses. Auditors
were required to be detail-oriented in ensuring the accuracy of financial statements, to highlight a
company’s risk exposure based on its internal controls, and to adapt to the changing regulatory
environment.

The incidence of companies reporting material falsehoods had risen in recent years and regulators
were pushing accounting firms to detect those instances more efficiently. Fraud detection was
becoming increasingly difficult. Previously, fraud involved reporting numbers in a non-compliant
manner. Presently, companies were finding innovative ways to deceive investors by misrepresenting
their financial health through the misuse of qualitative text. They would benefit from fraudulent
activity by maintaining their access to capital markets at low interest rates and keeping their share
prices high.

Moss & Associates was a data-driven accounting firm that was sensitive to regulatory requirements
and utilised technology to supplement its workforce. The firm had essentially managed to automate
the auditing of numerical data but the checking of qualitative data remained manual. In the past,
Leong’s team had had to meticulously pore over executives’ statements and their future earnings
guidance. However, she was working on a data analytics platform that could detect textual fraud
more quickly and accurately.

Leong paid for her meal and picked up another cup of coffee in preparation for another sleepless
night. The data analytics platform was designed to improve the prospects for fraud detection and
increase the productivity of her team. She hoped it would stem the rise of financial statement fraud.
Qualitative data would have to be collected and converted to a machine-readable format. Leong
was familiar with several text mining techniques that would break down the data before classifying
it for her team to go over. She needed the new fraud detection tool to work in order to cover all
the qualitative text. What steps should she programme the tool to take?

This case was written by Professor Swapna Gottipati, and Professor Venky Shankararaman at the Singapore Management
University. The case was prepared solely to provide material for class discussion. The authors do not intend to illustrate either
effective or ineffective handling of a managerial situation. The authors may have disguised certain names and other identifying
information to protect confidentiality.

Copyright © 2019, Singapore Management University Version: 2019-01-10

This document is authorized for educator review use only by Norma Ortiz, Universidad de Los Andes - Colombia (UniAndes) until Mar 2024. Copying or posting is an infringement of copyright.
Permissions@hbsp.harvard.edu or 617.783.7860
SMU-19-0023 Moss & Associates: Accounting for Financial Fraud

Moss & Associates

The accounting industry was dominated by the Big Four1. As a mid-sized firm, Moss & Associates
had fewer internal conflicts of interest and was more responsive to regulatory changes. It also
provided closer attention to individual clients in order to compete. The firm also developed a niche
in alternative investment valuations by establishing a division that was familiar with products as
specialised as art and wine.

While providing value-added advisory services to family offices and private companies was a
growing part of the business, most of its work involved auditing listed companies. It had recently
improved its processes through implementing new technology. For example, Robotic Process
Automation handled basic audit and reconciliation functions and removed the need for random
sampling as the system could sift through entire data sets quickly and accurately. Moss & Associates
continually searched for ways to fine-tune its operational systems.

Auditing Financial Statements

Traditionally, auditors reviewed the disclosure of the financial positions and achievements of
companies. In the financial reports, companies needed to account for their results and for the
policies and procedures they had pursued. A company would not only have to disclose information
concerning its business activities, organisational structure, and mission statement, but also reveal
how it engaged with its most important suppliers and customers. Additionally, a company would have to provide
some financial analysis for its performance and explain the rationale for material transactions such
as takeovers and investments, and its personnel and remuneration policy. Finally, it also had to
discuss its expectations and expected developments for the future.2

Listed companies in the United States were required to file annual reports with the Securities and
Exchange Commission (SEC). The SEC provided a clear structure for annual reports in Form 10-K
for U.S.-based companies and Form 20-F for foreign companies.3,4 An annual report would typically
include four financial statements: the balance sheet, the income statement, the statement of
shareholders’ equity and the cash flow statement. The balance sheet reflected the financial position
of the company in terms of its assets, liabilities and equity. The income statement showed the income,
expenses and profits of the preceding year. The statement of shareholders’ equity listed the changes
in the ownership interest of the company. Finally, the cash flow statement reported the holdings of
cash and cash equivalents in the company and how they were used. The annual report would also include
textual information in the notes to the financial statements that explained specific items in the
financial statements.

Stakeholders used financial metrics to determine the financial performance of a company. These
metrics included indicators such as “quick assets to current liabilities, market value of equity to total
assets, total liabilities to total assets, interest payments to earnings before interest and tax, net income
to total assets, and retained earnings to total assets”5. Common financial ratios included the
price-to-earnings (P/E) ratio, the dividend payout ratio, the leverage ratio and the return on assets ratio.
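
The ratios named above can be illustrated with a minimal Python sketch. The function names and input figures are hypothetical, and the definitions are assumptions based on common textbook usage rather than formulas given in the case. In particular, "leverage" is taken here as total liabilities to total assets, which is one of several definitions in use.

```python
def ratio(numerator, denominator):
    """Divide two figures, refusing a zero denominator."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return numerator / denominator


def financial_ratios(share_price, earnings_per_share, dividends,
                     net_income, total_liabilities, total_assets):
    """Return four common ratios (definitions are assumptions, see lead-in)."""
    return {
        "price_to_earnings": ratio(share_price, earnings_per_share),
        "dividend_payout": ratio(dividends, net_income),
        "leverage": ratio(total_liabilities, total_assets),
        "return_on_assets": ratio(net_income, total_assets),
    }


# Hypothetical figures for illustration only
example = financial_ratios(
    share_price=50.0, earnings_per_share=2.5, dividends=40.0,
    net_income=100.0, total_liabilities=600.0, total_assets=1000.0,
)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Such ratios only become informative when compared with a company's own history or with industry peers.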

1 Deloitte, Ernst & Young, KPMG and PricewaterhouseCoopers.
2 Jan Klaassen, Martinus Hoogendoorn, and Rudolphus Vergoossen, Externe Verslaggeving. Noordhoff Uitgevers, 2008.
3 U.S. Securities and Exchange Commission, Form 10-K, June 26, 2009, https://www.sec.gov/fast-answers/answers-form10khtm.html, accessed May 2019.
4 Will Kenton, SEC Form 20-F, Investopedia, June 28, 2018, https://www.investopedia.com/terms/s/sec-form-20-f.asp, accessed May 2019.
5 M.A. Sahaf, Management Accounting: Principles & Practice, 3rd Edition, Vikas Publishing House, p 191.

Corporate disclosure would take place through annual reports, press conferences, and corporate
announcements to provide stakeholders with the pulse of the company. Its main purpose was to
provide information about a company’s performance, financial health, current problems, successes
and failures, and prospects of future development.6 The annual report included a management
discussion and analysis (MD&A) section that allowed company executives to provide an overview
of the past performance of the company, the macroeconomic environment in which the industry
operated, and future company policies. Stakeholders would gain insights provided by management
explaining the financial performance within the macroeconomic context. Key sub-sections of the
MD&A involved the executive overview, discussions on results and operations, liquidity and capital
resources, management of risk and price fluctuations, and performance goals.7

The MD&A section was arguably the part of the annual report that was the most read.8 Companies
would have to undergo financial statement audits by independent auditors who would examine their
financial statements before making accompanying disclosures. Auditors would return an unqualified
audit report if they did not find any issues, or an adverse opinion if the financial statements
were found to be materially misstated.

The prevalence of financial statement fraud was one of the biggest challenges faced by managers and
investors. Fraud could be defined as intentional or irresponsible conduct that conveyed deception or
misrepresentation.9 The common forms of financial statement fraud included “fictitious sales,
improper expense recognition, incorrect asset valuation, hidden liabilities and unsuitable
disclosures”10. These forms of fraud would affect the numbers in a company’s financial overviews.

Fraud Analytics and Data Management Division

Moss & Associates worked closely with regulators to ensure it was at the forefront of accounting
trends and remained vigilant to the potential for material misstatements. A company could influence
financial measurements and the qualitative context of financial statements to present a rosier picture
of its financial health to investors. The growing trend was for companies to influence people more
subtly through textual information instead of altering numbers in the financial statements. Compared
to financial metrics, the textual information in annual reports was more easily understood and could
have a greater reach and impact on stakeholders. Companies could deceive stakeholders by
intentionally interpreting the numbers wrongly or making unrealistic forecasts.

The qualitative narratives might not be explicitly fraudulent; however, it was possible to identify
fraud indicators through a closer examination of the syntax (the language structure) and the
semantics (the meaning of the language components) of the language used. Perpetrators of fraud
would try to camouflage these indicators in their corporate disclosure documents. For better analysis
of the company’s performance, the financial reports should be analysed together with other documents
such as annual reports and newsletters released by the company.11

6 Arline Savage and Cynthia Miree, “Financial Analysts and Enron: Asleep at the Wheel?”, published in Practical Financial Economics: A New Science, January 1, 2003: 75-101.
7 Niamh Brennan & Doris M. Merkl-Davies, "Accounting Narratives and Impression Management," Open Access publications 10197/4949, Research Repository, University College Dublin, 2013.
8 Management’s Discussion and Analysis: Guidance on preparation and disclosure, https://www.cpacanada.ca/-/media/site/business-and-accounting-resources/docs/managements-discussion-and-analysis-guidance-on-preparation-and-disclosure-july-2015.pdf?la=en&hash=975160EAF268AC96D7C6A95969A553805DF59AEF
9 Belinna Bai, Jerome Yen and Xiaoguang Yang, “False Financial Statements: Characteristics of China's Listed Companies and CART Detecting Approach”, International Journal of Information Technology & Decision Making 2008; 7: 339–359.
10 Arthur Pinkasovitch, Detecting Financial Statement Fraud, Investopedia, January 4, 2018, https://www.investopedia.com/articles/financial-theory/11/detecting-financial-fraud.asp, accessed May 2019.

After conducting a strategy review, Moss & Associates endeavoured to refine its audit process. In
2018, a fraud analytics and data management division was established to counter the use of non-
numerical data to embellish the outlook of certain companies. Leong and her team would have to
work closely with the data science, auditing and accounting teams and peruse entire annual reports
to spot anomalies (refer to Exhibit 1 for the key areas to focus on in the financial statements for fraud
detection). The team was responsible for predicting fraud in the annual reports using text mining,
dashboard techniques, visualisation techniques, and complex reporting solutions utilising machine
learning techniques.

Fraud Detection Process

Examining the text incorporated in annual reports would complement the analysis of numerical data
for the purpose of fraud detection. The text in annual reports was easier to manipulate because it was
not subjected to as many rules as the financial information. Thus, a company's management had more
freedom in its textual disclosures. Furthermore, the reach of textual information was greater than that
of the financial information. More people would understand the textual information in the annual
reports more easily than the quantitative information. In the past decade, regulators realised that there
had been a big change in how words were used to describe the health of companies.12 Company
executives could use strategically placed phrases to frame financial results to their advantage. Given
the amount of discretion at their disposal, they might even resort to disseminating outright lies.

Therefore, text should be considered an important source of information in fraud detection
procedures. The textual information ultimately had the power to influence and, in case of fraud,
deceive a greater number of people (refer to Exhibit 2A & 2B for sample fraudulent disclosures from
Enron’s MD&A section and the letter to shareholders). Textual disclosures in annual reports could
be analysed by reading the text. However, the process was time consuming as the data had to be
benchmarked against industry peers or compared to previous disclosures before it could be
considered abnormal.

Computers would be able to process a higher number of reports in a shorter amount of time. Leong
and her team had been building a data analytics platform to help the team sieve through the data. The
analytics platform included methods such as data mining, statistics and machine learning. As a first
step, the target data, which were noisy, needed cleaning before they could be processed by the
software tools. The challenge was to extract the company background information and other
qualitative content that were in PDF or HTML format, with HTML tags and programming syntax that
needed to be converted (refer to Exhibits 3 & 4 for an illustration). Although companies had to
follow a specific section format, they could organise information and phrase sentences in different
ways. Data had to be collected and converted into a readable format before any analysis could be performed.
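
The cleaning step for HTML filings can be sketched in Python using only the standard library. This is a minimal illustration under stated assumptions, not the platform's actual method: the class name `TextExtractor` and the function `html_to_text` are hypothetical, and real filings would need further handling (tables, PDF sources, inline XBRL) that is not shown.

```python
import re
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores the bodies of script/style tags."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """Strip markup from an HTML string and collapse whitespace."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with spaces keeps block-level text apart; extra runs are collapsed
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


print(html_to_text("<p>Net <b>income</b> rose.</p><script>var x = 1;</script>"))
```

Joining fragments with a space is a deliberate simplification: it keeps adjacent paragraphs apart, at the cost of splitting a word in the rare case where markup falls inside it.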

The second step required data to be mined using principled algorithms. Finally, analytics had to be
applied to gain valuable insights from the results. While computers would not be able to understand
the content of the text, they were better equipped to extract the more abstract linguistic
information. The team aimed to uncover three main categories of ‘red flags’: management's
overoptimistic characteristics and the attitude of the management toward the internal control system;
industry conditions; and operating characteristics and financial stability. These factors could be
detected by studying various aspects of the qualitative sections in the financial statements.

11 Wei Dong, Shaoyi Liao, and Liang Liang, "Financial Statement Fraud Detection Using Text Mining: A Systemic Functional Linguistics Theory Perspective" (2016). PACIS 2016 Proceedings. 188. https://aisel.aisnet.org/pacis2016/188
12 Saliha Minhas and Amir Hussain, “From Spin to Swindle: Identifying Falsification in Financial Text”, CrossMark, May 21, 2015, https://core.ac.uk/download/pdf/81187791.pdf, accessed May 2019.

Impression Management

Impression management was a process in which company executives would attempt to shape the
perceptions of stakeholders in their favour. The theory hypothesised that textual information could
be used to conceal bad results or exaggerate good performance. With this knowledge, Leong and her
team would look for potential fraud indicators. For example, the frequency of terms associated with
risk and uncertainty in 10-K reports, which could be used to predict future earnings, might indicate
that bad results were being moderated. Another indicator was that executives from companies that did
not perform well were less likely to use personal references or active sentences, and preferred to
focus more on future prospects in the chairman's statement (refer to Exhibit 5 for an example of how
impression management was used).
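
One of the indicators described above, the use of personal references, can be approximated by counting first-person pronouns. The sketch below is illustrative only: the pronoun list and the function name `personal_reference_rate` are assumptions, and detecting passive versus active sentences would need a syntactic parser that is not shown here.

```python
import re

# Assumed list of first-person pronouns; not taken from the case
FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}


def personal_reference_rate(text):
    """Return the share of words in `text` that are first-person pronouns."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return 0.0
    hits = sum(word in FIRST_PERSON for word in words)
    return hits / len(words)


print(personal_reference_rate("I know that we improved our margins."))
print(personal_reference_rate("Margins were improved during the year."))
```

Because the text is lowercased, the abbreviation "US" would be counted as the pronoun "us"; a production tool would need to handle such cases before counting.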

Financial Distress Management

The key approach to search for signs of financial distress was to study the quantitative numbers.
However, the chairman's statement or other qualitative data in the annual report would also contain
information about the future of the company. Auditors would be able to better detect fraud by adding
sentiment information from the disclosed text. This would lead to a more accurate prediction of
financial distress than when the prediction was based on financial quantitative information alone.13
A financial dictionary of sentiment words in four categories, Fin-Neg (financial-negative), Fin-Pos
(financial-positive), Fin-Unc (financial-uncertainty) and Fin-Lit (financial-litigious), would be
useful for this approach:

1. Fin-Neg: negative business terminologies (e.g., deficit, default).
2. Fin-Pos: positive business terminologies (e.g., achieve, profit).
3. Fin-Unc: words denoting uncertainty, with emphasis on the general notion of imprecision rather
than exclusively focusing on risk (e.g., appear, doubt).
4. Fin-Lit: words reflecting a propensity for legal contest or, per our label, litigiousness (e.g., amend,
forbear).14
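
A dictionary-based count over these four categories can be sketched as follows. The lexicons here are toy sets seeded only with example words quoted in the case; the full published word lists are much larger and are not reproduced. The function name `sentiment_counts` is hypothetical.

```python
import re
from collections import Counter

# Toy lexicons built from example words in the case, for illustration only
LEXICONS = {
    "Fin-Neg": {"deficit", "default", "loss", "losses", "impairment", "decline"},
    "Fin-Pos": {"achieve", "profit", "attain", "improve", "profitable"},
    "Fin-Unc": {"appear", "doubt", "depend", "fluctuate", "uncertain"},
    "Fin-Lit": {"amend", "forbear"},
}


def sentiment_counts(text):
    """Count how many words of `text` fall into each sentiment category."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for word in words:
        for category, lexicon in LEXICONS.items():
            if word in lexicon:
                counts[category] += 1
    # Report every category, including those with no matches
    return {category: counts[category] for category in LEXICONS}


sample = ("The deficit may appear to improve, but default risk "
          "and losses remain uncertain.")
print(sentiment_counts(sample))
```

Exact matching misses inflected forms such as "declined" unless words are stemmed first or the lexicon lists each form explicitly.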

Text analytics tools could be used to parse data. In general, the most frequently appearing Fin-Neg
words in the MD&A section were loss, losses, claims, impairment, decline, against, adverse, delay,
etc. Common Fin-Pos words included “achieve, attain, efficient, improve, profitable, and upturn”15.
The Fin-Unc words included those such as “approximate, contingency, depend, fluctuate, indefinite,
uncertain, and variability”16. Fin-Lit words reflected a “propensity for legal contest or, per our label,
litigiousness”17 (e.g., amend, forbear). Further, words expressing levels of confidence also indicated
a relationship with financial risk. These were captured in the MW-Strong (Strong Modal Words) and
MW-Weak (Weak Modal Words) lexicons:

1. MW-Strong: words expressing strong levels of confidence (e.g., always, must).
2. MW-Weak: words expressing weak levels of confidence (e.g., could, might).18

13 Chuan-Ju Wang, Ming-Feng Tsai, Tse Liu, and Ching-Ting Chang, “Financial Sentiment Analysis for Risk Prediction”, IJCNLP, 2013.
14 Tim Loughran and Bill McDonald, “When is a Liability not a Liability? Textual Analysis, Dictionaries, and 10-Ks”, Journal of Finance, 66 (2011), pp. 35-65.
15 Ibid.
16 Ibid.
17 Ibid.
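
The MW-Strong and MW-Weak lexicons can be applied in the same way as the sentiment categories. The sketch below uses only the four example words given in the case, and the `weak_share` measure is a hypothetical summary chosen for illustration rather than a metric defined in the source.

```python
import re

# Example words from the case only; the full lexicons are larger
MW_STRONG = {"always", "must"}
MW_WEAK = {"could", "might"}


def modal_profile(text):
    """Count strong and weak modal words and the weak share of all modals."""
    words = re.findall(r"[a-z]+", text.lower())
    strong = sum(word in MW_STRONG for word in words)
    weak = sum(word in MW_WEAK for word in words)
    total = strong + weak
    # weak_share is undefined when the text contains no modal words at all
    weak_share = weak / total if total else None
    return {"strong": strong, "weak": weak, "weak_share": weak_share}


print(modal_profile("We must always deliver, but results could vary."))
```

A high share of weak modal words is one possible sign of the vague language that the case associates with companies in financial difficulty.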

Using text analytics would help provide a better picture of a company’s overall financial health as
executives might use vague language to obscure the actual financial position of companies on the
brink of bankruptcy.

Detection of Deception (Lies)

While impression management and concealing financial distress were situations in which companies
used words to spin negative circumstances, executives would sometimes disseminate outright lies.
In deception detection research, deception was defined as a deliberate attempt to mislead others. This
was the most difficult task as the story in well-written documents such as annual reports would
usually be well thought through and quantitative metrics were not able to discover the deception.
Deception detection theory offered several techniques to detect lies in content within a given context
(refer to Exhibit 6A & 6B for examples of how the use of certain text phrases and repetition could
indicate the possibility of fraud when associated with financial metrics). There were also words
or phrases to look out for that would be useful in detecting fraud (refer to Exhibit 7 for indicators of
lies).19

Launching the Tool

Leong and her team had worked hard to get the data analytics platform ready. As she was getting
ready to debut the fraud detection tool, she knew that it should greatly improve the firm’s efficiency
in fraud detection if programmed correctly. She had the option of using a variety of text mining
techniques to extract the relevant data for analysis (refer to Appendix A for a list of text mining
techniques). Did she adequately address the challenges of analysing text in annual reports? What
steps should she programme the tool to take?

18 Ibid.
19 Chi-Chen Lee, Natalie Churyk, and B. Douglas Clinton, “Validating Early Fraud Prediction Using Narrative Disclosures”, Journal of Forensic and Investigative Accounting, Vol. 5 No. 1, 2013: 35-57.

EXHIBIT 1: FRAUD DETECTION AREAS

Financial Statements
    Quantitative numbers: financial metrics; financial ratios
    Qualitative text: auditor's comments; financial narratives

Source: Provided by Authors

EXHIBIT 2A: SNIPPETS FROM MD&A SECTION OF ENRON’S 2000 SEC 10-K FORM

Enron Transportation Services is expected to provide stable
earnings and cash flows during 2001. The four major natural gas
pipelines have strong competitive positions in their respective
markets as a result of efficient operating practices, competitive
rates and favorable market conditions. Enron Transportation
Services expects to continue to pursue demand-driven expansion
opportunities.
...
The combination of knowledge gained in building networks in
key energy markets and the application of new technology, such as
EnronOnline, is expected to provide the basis to extend Wholesale
Services' business model to new markets and industries. In key
international markets, where deregulation is underway, Enron
plans to build energy networks by using the optimum combination
of acquiring or constructing physical assets and securing
contractual access to third party assets. Enron also plans to
replicate its business model to new industrial markets such as
metals, pulp, paper and lumber, coal and steel. Enron expects to
use its Ecommerce platform, EnronOnline, to accelerate the
penetration into these industries.

Source: Securities and Exchange Commission Archives,
https://www.sec.gov/Archives/edgar/data/1024401/000102440101500010/ene10-k.txt, accessed May 2019.

EXHIBIT 2B: EXTRACTS FROM ENRON’S LETTER TO SHAREHOLDERS, ANNUAL REPORT 2000

Enron’s performance in 2000 was a success by any measure, as we continued to outdistance the
competition and solidify our leadership in each of our major businesses. In our largest business,
wholesale services, we experienced an enormous increase of 59 percent in physical energy
deliveries. Our retail energy business achieved its highest level ever of total contract value. Our
newest business, broadband services, significantly accelerated transaction activity, and our oldest
business, the interstate pipelines, registered increased earnings. The company’s net income
reached a record $1.3 billion in 2000. (p. 4)

Enron hardly resembles the company we were in the early days. During our 15-year history, we
have stretched ourselves beyond our own expectations. We have metamorphosed from an asset-based
pipeline and power generating company to a marketing and logistics company whose biggest
assets are its well established business approach and its innovative people. (pp. 6-7)

Our performance and capabilities cannot be compared to a traditional energy peer group. Our
results put us in the top tier of the world’s corporations. We have a proven business concept that is
eminently scalable in our existing businesses and adaptable enough to extend to new markets. (p. 7)

Our talented people, global presence, financial strength and massive market knowledge have
created our sustainable and unique businesses. EnronOnline will accelerate their growth. We plan
to leverage all of these competitive advantages to create significant value for our shareholders. (p. 7)

Source: Securities and Exchange Commission Archives,
https://www.sec.gov/Archives/edgar/data/1024401/000102440101500010/ene10-k.txt, accessed May 2019.

EXHIBIT 3: SAMPLE SUB-SECTION OF FINANCIAL STATEMENT

[Figure not reproduced]

Source: Securities and Exchange Commission Archives,
https://www.sec.gov/Archives/edgar/data/1288776/000128877614000020/goog2013123110-k.htm, accessed May 2019.

EXHIBIT 4: VARIOUS REPRESENTATIONS OF EMPLOYEE INFORMATION

[Figure not reproduced]

Source: Securities and Exchange Commission Archives, https://www.sec.gov/Archives/edgar/data/, accessed May 2019.

EXHIBIT 5: USE OF IMPRESSION MANAGEMENT

Fraud indicators:

Metaphors
Enron’s performance in 2000 was a success by any measure, as we continued to outdistance the
competition and solidify our leadership in each of our major businesses.

Hyperboles
Enron has built unique and strong businesses that have tremendous opportunities for growth.
These businesses … can be significantly expanded within their very large existing markets and
extended to new markets with enormous growth potential.

Non-Fraud indicators:
Use of personal pronouns
I know from my year as chairman of the Administration Board that budgeting has been a very
delicate operation over the last two years.

Source: Craig, R.J. and Amernic, J.H. (2004) ‘Enron discourse: the rhetoric of a resilient capitalism’, Critical
Perspectives on Accounting, 15(6/7): 813–851.
Hyland, K. (1998) ‘Exploring corporate rhetoric: metadiscourse in the CEO’s letters’, Journal of Business
Communication, 35(2), 224–245.

EXHIBIT 6A: POSSIBLE TEXT PHRASES THAT CAN INDICATE THE POSSIBILITY OF FRAUD WHEN ASSOCIATED WITH FINANCIAL METRICS

[Figure not reproduced]

Source: Saliha Minhas and Amir Hussain, From Spin to Swindle: Identifying Falsification in Financial Text,
CrossMark, May 21, 2015, https://core.ac.uk/download/pdf/81187791.pdf, accessed May 2019.

EXHIBIT 6B: ILLUSTRATION OF THE USE OF REPETITION

Use of repetition
Repeating short key phrases at the beginning of successive sentences (anaphora) is used in the
narratives to emphasise business intangibles and future growth.20

Example:
It’s about the liberation
It’s about the creation [. . .] [note also the repetitive rhyme of “ion”]
And it’s about growth [. . .]
The first phrase continues thus:
It’s about the liberation of our people and our assets – these new businesses will be free to innovate
and free to operate at speed.

Source: Provided by Authors

20 Jane Davison, “Rhetoric, Repetition, Reporting and the ‘dot.com’ Era: Words, Pictures, Intangibles”, Accounting, Auditing & Accountability Journal, 21(6) 2008: 791–826.
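
The anaphora pattern in Exhibit 6B can be flagged by counting how often successive sentences open with the same words. The sketch below is a rough illustration: the function name `repeated_openings` and the choice of a two-word opening are assumptions, and it expects the text to have been split into sentences already.

```python
import re
from collections import Counter


def repeated_openings(sentences, n=2):
    """Return opening n-word phrases shared by more than one sentence."""
    openings = Counter()
    for sentence in sentences:
        # Keep straight and curly apostrophes inside words such as "it's"
        words = re.findall(r"[a-z'\u2019]+", sentence.lower())
        if len(words) >= n:
            openings[" ".join(words[:n])] += 1
    return {phrase: count for phrase, count in openings.items() if count > 1}


lines = [
    "It's about the liberation",
    "It's about the creation",
    "And it's about growth",
]
print(repeated_openings(lines))
```

The third line is not counted because it opens with "And"; stripping leading conjunctions before comparing would catch it, at the cost of more false positives.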

EXHIBIT 7: INDICATORS FOR LIE DETECTION

a. fewer terms indicating positive emotion, such as happy, pretty, good
b. fewer present tense verbs, such as walk, is, be
c. an increased number of words
d. fewer colons
e. fewer semicolons
f. lower use of "for example"
g. lower lexical
h. repetitions of certain phrases in the narratives.22

Source: Saliha Minhas and Amir Hussain, From Spin to Swindle: Identifying Falsification in Financial Text,
CrossMark, May 21, 2015, https://core.ac.uk/download/pdf/81187791.pdf, accessed May 2019.
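
Several of the Exhibit 7 indicators are simple surface counts and can be sketched as below. The function name `lie_indicator_counts` is hypothetical, and only the indicators that can be counted without a part-of-speech tagger are covered (items a, c, d, e and f); the counts are meaningful only relative to a baseline, such as the same company's earlier reports or those of its peers.

```python
import re

# The three positive-emotion examples given in Exhibit 7
POSITIVE_EMOTION = {"happy", "pretty", "good"}


def lie_indicator_counts(text):
    """Return raw counts for the surface-level indicators in Exhibit 7."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    return {
        "word_count": len(words),
        "positive_emotion_terms": sum(w in POSITIVE_EMOTION for w in words),
        "colons": text.count(":"),
        "semicolons": text.count(";"),
        "for_example": len(re.findall(r"\bfor example\b", lowered)),
    }


sample = "Results were good; for example, margins rose: a good sign."
print(lie_indicator_counts(sample))
```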


22 Saliha Minhas and Amir Hussain, Cogn Comput (2016) 8: 729. https://doi.org/10.1007/s12559-016-9413-9

APPENDIX A: TEXT MINING TECHNIQUES

Speech and natural language can be analysed through the application of natural language
processing23 systems. Using computational techniques, these systems take strings of words
(sentences) as their input and produce structured representations capturing the meaning of those
strings as their output. The nature of this output depends heavily on the task at hand. The following
tasks are popularly used for information retrieval, information extraction, machine translation,
summarisation, and sentiment analysis.

Sentence Tokeniser

In preparation for text mining, the computer breaks the text down using tokenisation methods. A
sentence tokeniser divides text into a list of sentences. It does so by using an unsupervised
algorithm to build a model of abbreviations, collocations, and words that start sentences, such as
capitalised words. Tokenisers are smart enough to know that the periods in Mr. Smith and
Johann S. Bach do not mark sentence boundaries. They must also allow for sentences that start with
non-capitalised words.

FIGURE 1: EXAMPLE INPUT AND OUTPUT OF SENTENCE TOKENISER


[Figure 1 image not reproduced: Input and Output panels]

Source: Provided by Authors
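
The behaviour described above can be sketched with a few hand-written rules. This is a simplified stand-in, not the unsupervised model the text refers to: the `ABBREVIATIONS` set is a hypothetical, hard-coded list, whereas an unsupervised tokeniser would learn such abbreviations from the text itself. The function name `split_sentences` is also an assumption.

```python
import re

# Hypothetical abbreviation list; an unsupervised tokeniser would learn these.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "corp", "inc"}


def split_sentences(text):
    """Split text at ., ! or ? unless the period follows an abbreviation or initial."""
    sentences = []
    start = 0
    # Candidate boundaries: punctuation followed by whitespace or end of text.
    for match in re.finditer(r"[.!?]+(?=\s|$)", text):
        words_before = text[start:match.start()].split()
        last_word = words_before[-1].lower() if words_before else ""
        is_abbreviation = last_word in ABBREVIATIONS or len(last_word) == 1
        if match.group() == "." and is_abbreviation:
            continue  # e.g. "Mr." or the initial "S." does not end a sentence
        sentences.append(text[start:match.end()].strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


text = "Mr. Smith admired Johann S. Bach. he played the organ every day. It was loud!"
for sentence in split_sentences(text):
    print(sentence)
```

The second sentence deliberately starts in lower case to show that the split relies on punctuation and the abbreviation list rather than on capitalisation. A sentence that genuinely ends in a single letter would be missed by this rule, which is one reason learned models are preferred.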

Word Tokeniser

This process involves splitting up the elements of textual content so that only separate tokens remain.
Some of these tokens then need to be adjusted or excluded before the extraction and selection step.
Firstly, a computer treats two words as different when one starts with a capitalised character, as
most first words in English sentences do, and the other is written only in lower case characters. For
example, the words ‘Two’ and ‘two’ are different to a computer. To circumvent this, all characters are
transformed to lowercase characters. Secondly, similar words are grouped in a process called
stemming.24 For example, ‘analysis’ and ‘analysing’ share the root word ‘analyse’. Other cleaning
processes, such as spelling correction and the grouping of proper nouns, are also undertaken during
word tokenisation. The system performance can be tested under various optional settings, such as
stopword removal, stemming and the grouping of words.

23 https://www.scm.tees.ac.uk/isg/aia/nlp/NLP-overview.pdf
24 https://www.meaningcloud.com/developer/resources/doc/models/models/text-tokenization-multiwords

FIGURE 2: WORD TOKENS GENERATED FROM A SENTENCE

Source: Provided by Authors
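
The word tokenisation steps (lowercasing, optional stopword removal, optional stemming) can be sketched as follows. The `STOPWORDS` set and the suffix-stripping `stem` function are hypothetical simplifications chosen for illustration; a production system would more likely use an established stemming algorithm and a fuller stopword list.

```python
import re

# Hypothetical stopword list and suffix list, kept short for illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "was", "were"}
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenise(text, remove_stopwords=True, apply_stemming=True):
    """Lowercase the text, split it into word tokens, then apply optional cleaning."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if apply_stemming:
        tokens = [stem(t) for t in tokens]
    return tokens


print(tokenise("Two auditors reported the results; two are reporting."))
```

Because both settings are switchable, the same text can be processed with and without stopword removal or stemming to compare how each option affects downstream performance.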

Parts of Speech Tagger

The Part-Of-Speech Tagger25 (POS Tagger) is a piece of software that reads text in some language
and assigns a part of speech, such as noun, verb or adjective, to each word (and other tokens).
Computational applications generally use more fine-grained POS tags, such as 'noun-plural'.

FIGURE 3: EXAMPLE INPUT AND OUTPUT OF POS TAGGER

Input: "The little yellow dog barked at the cat"
Process: POS tagger based on Markov Models

Tag   Meaning
NP    Noun Phrase
NN    Noun, singular or mass
IN    Preposition or subordinating conjunction
VBD   Verb, past tense
DT    Determiner
JJ    Adjective

Source: Provided by Authors
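
To illustrate how a tagger based on Markov models might work, the sketch below trains a tiny bigram hidden Markov model and decodes a sentence with the Viterbi algorithm. Everything here is an assumption made for illustration: the three hand-tagged sentences in `TRAIN`, the smoothing constant `ALPHA`, and the function names. A real tagger would be trained on a large annotated corpus.

```python
import math
from collections import Counter, defaultdict

# Hypothetical hand-tagged training sentences (not taken from the case).
TRAIN = [
    [("the", "DT"), ("dog", "NN"), ("barked", "VBD")],
    [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD"),
     ("on", "IN"), ("the", "DT"), ("mat", "NN")],
    [("a", "DT"), ("yellow", "JJ"), ("bird", "NN"), ("looked", "VBD"),
     ("at", "IN"), ("the", "DT"), ("dog", "NN")],
]
START = "<s>"
ALPHA = 0.01  # small smoothing constant so unseen events keep a non-zero probability


def train(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    trans = defaultdict(Counter)  # trans[previous_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sentence in sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    vocab = {w for counts in emit.values() for w in counts}
    return trans, emit, sorted(emit), vocab


def log_prob(counter, key, n_outcomes):
    """Smoothed log probability of `key` under the counts in `counter`."""
    total = sum(counter.values())
    return math.log((counter[key] + ALPHA) / (total + ALPHA * n_outcomes))


def viterbi(words, model):
    """Return the most probable tag sequence for a non-empty list of words."""
    trans, emit, tags, vocab = model
    n_tags, n_words = len(tags), len(vocab) + 1  # +1 leaves room for unknown words
    # best[i][tag] = (log score of the best path ending in tag at word i, previous tag)
    best = [{t: (log_prob(trans[START], t, n_tags)
                 + log_prob(emit[t], words[0], n_words), None) for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for tag in tags:
            prev_tag, score = max(
                ((p, best[i - 1][p][0] + log_prob(trans[p], tag, n_tags)) for p in tags),
                key=lambda item: item[1],
            )
            column[tag] = (score + log_prob(emit[tag], words[i], n_words), prev_tag)
        best.append(column)
    # Backtrack from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]


MODEL = train(TRAIN)


def tag_sentence(sentence):
    words = sentence.lower().split()
    return list(zip(words, viterbi(words, MODEL)))


print(tag_sentence("The little yellow dog barked at the cat"))
```

The model never saw two adjectives in a row during training, yet it still tags both "little" and "yellow" as JJ because the emission evidence outweighs the unseen transition. That interplay between word evidence and tag-sequence evidence is the core idea of Markov-model tagging.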

25 http://nlp.stanford.edu/software/tagger.shtml

Named Entity Tagger

Named entity taggers are tools that seek to locate and classify named entity mentions in unstructured
text into pre-defined categories such as person names, organisations, locations, medical codes, time
expressions, quantities, monetary values, and percentages.26

FIGURE 4: EXAMPLE INPUT AND OUTPUT OF THE NAMED ENTITY TAGGER

Input: "Jim bought 300 shares of Acme Corp. in 2006"
Process: Named Entity tagger based on CRF Models
Output: [Jim]Person bought 300 shares of [Acme Corp.]Organization in [2006]Time

Source: Provided by Authors
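
The input and output format of a named entity tagger can be reproduced with a much simpler mechanism than a CRF. The sketch below is a rule-based stand-in, not a CRF model: it uses regular expressions and a hypothetical `PERSON_NAMES` list, whereas a trained CRF would learn its cues from labelled text. The function name `tag_entities` is an assumption.

```python
import re

# Hypothetical name list; a trained model would not need a fixed list.
PERSON_NAMES = {"Jim"}

# Each pattern is tried in order; earlier matches take priority on overlap.
PATTERNS = [
    # One or more capitalised words followed by a company suffix.
    ("Organization", re.compile(r"\b(?:[A-Z][a-z]+ )+(?:Corp\.|Inc\.|Ltd\.)")),
    # A four-digit year starting with 19 or 20.
    ("Time", re.compile(r"\b(?:19|20)\d{2}\b")),
    ("Person", re.compile(r"\b(?:" + "|".join(sorted(PERSON_NAMES)) + r")\b")),
]


def tag_entities(text):
    """Wrap each recognised entity as [entity]Label, leaving other text unchanged."""
    spans = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            overlaps = any(match.start() < end and match.end() > start
                           for start, end, _ in spans)
            if not overlaps:
                spans.append((match.start(), match.end(), label))
    spans.sort()
    pieces, position = [], 0
    for start, end, label in spans:
        pieces.append(text[position:start])
        pieces.append(f"[{text[start:end]}]{label}")
        position = end
    pieces.append(text[position:])
    return "".join(pieces)


print(tag_entities("Jim bought 300 shares of Acme Corp. in 2006"))
```

Note that "300" is left untagged here because only three categories are covered; quantities and monetary values would need further patterns.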

Lexical Dictionaries (Domain/Task specific features)

Using dictionaries for text mining may improve the outcomes of the tool because they are domain
specific. Researchers use lexical dictionaries to help determine connotation, as some words may be
either positive or negative depending on the context. For example, ‘fast’ is a positive word in the
context of a movie plot but a negative word in the context of teaching. Further, the word lists from
the lexicon can be compared with similar words to handle various word forms. Text mining tasks use
lexical dictionaries to compile domain-specific or task-specific words because most dictionaries
(e.g., WordNet27) list synonyms and antonyms for each word.

FIGURE 5: DICTIONARY WITH THE WORDS AND ASSOCIATED CATEGORISED FEATURES


[Figure 5 image not reproduced; surviving label: (WORD FORMS)]

Source: https://www.visualthesaurus.com/app/view

26 Wikipedia, https://en.wikipedia.org/wiki/Named-entity_recognition
27 George A. Miller, “WordNet: a Lexical Database for English”, 1995, Commun. ACM 38, 11 (November 1995), 39-41. DOI=10.1145/219717.219748 http://doi.acm.org/10.1145/219717.219748

Similarity techniques can be applied to map the text documents to the words in the dictionary to
compute the overall category of the text document. If a particular word is relevant to both the fields
of medicine and engineering, the document would be compared against both word lists and classified
according to the computed similarity score.
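
One simple way to realise this comparison is to score a document by the share of its tokens that appear in each domain word list. The sketch below assumes hypothetical word lists in `DOMAIN_WORDS` and uses a plain overlap ratio as the similarity score; other similarity measures would work equally well.

```python
import re
from collections import Counter

# Hypothetical domain word lists; "stress" is deliberately in both.
DOMAIN_WORDS = {
    "medicine": {"patient", "dose", "clinical", "stress", "treatment"},
    "engineering": {"load", "beam", "stress", "circuit", "tolerance"},
}


def domain_scores(text):
    """Share of the document's tokens that appear in each domain word list."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid dividing by zero for empty text
    return {domain: sum(counts[w] for w in words) / total
            for domain, words in DOMAIN_WORDS.items()}


def classify(text):
    """Assign the domain with the highest similarity score."""
    scores = domain_scores(text)
    return max(scores, key=scores.get)


document = "The beam failed under stress because the load exceeded tolerance."
print(domain_scores(document))
print(classify(document))
```

The ambiguous word "stress" contributes to both scores, and the other words in the document decide the outcome.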

Document Classification

Predictive analytics uses historical data in combination with algorithms, and occasionally with
external knowledge, to determine the probable future behaviour of an entity or a new data
point.28,29 A classifier algorithm can be trained on such data to predict the category of a new data
point with a certain probability.30 To accomplish this, each document is represented as a set of
features (words, POS tags, lexical ratios, etc.) before classifiers such as Support Vector Machines
are applied to assign the documents to specific categories. The trained classifier can also be used to
predict the category of a new document. In this way, opinions expressed by the company can be
classified as either positive or negative.

FIGURE 6: ILLUSTRATION OF CLASSIFIER ALGORITHM31


The classifier algorithm predicts new data points by using support vectors: a statistical approach
in which a linear plane is derived from known data for a binary classification. New data can then be
classified by this model.

Source: Provided by Authors
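
As a rough illustration of the idea, the sketch below trains a linear support vector classifier on word-count features using sub-gradient descent on the regularised hinge loss. It is a toy under stated assumptions: the four labelled snippets in `TRAINING`, the hyperparameters, the omission of a bias term and all function names are hypothetical, and a real project would normally rely on an established machine learning library.

```python
import re
from collections import Counter, defaultdict

# Hypothetical labelled snippets: +1 = positive tone, -1 = negative tone.
TRAINING = [
    ("strong growth and record profit", 1),
    ("profit improved with strong demand", 1),
    ("weak sales and heavy loss", -1),
    ("loss widened on weak demand", -1),
]


def features(text):
    """Represent a document as word counts (a bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def train_linear_svm(examples, epochs=50, lr=0.1, lam=0.01):
    """Sub-gradient descent on the L2-regularised hinge loss, without a bias term."""
    weights = defaultdict(float)
    data = [(features(text), label) for text, label in examples]
    for _ in range(epochs):
        for x, y in data:
            margin = y * sum(weights[f] * v for f, v in x.items())
            # Regularisation step: shrink every weight slightly.
            for f in list(weights):
                weights[f] *= 1 - lr * lam
            # Hinge-loss step: only examples inside the margin move the plane.
            if margin < 1:
                for f, v in x.items():
                    weights[f] += lr * y * v
    return weights


def predict(weights, text):
    """Classify by which side of the learned plane the document falls on."""
    score = sum(weights.get(f, 0.0) * v for f, v in features(text).items())
    return "positive" if score >= 0 else "negative"


WEIGHTS = train_linear_svm(TRAINING)
print(predict(WEIGHTS, "record profit growth"))
print(predict(WEIGHTS, "heavy loss on weak sales"))
```

Words that occur in both classes, such as "and" or "demand", receive weights near zero, while words seen in only one class pull the score towards that class. A document made up entirely of unknown words scores zero, which this sketch arbitrarily reports as positive.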

Sentiment Classification

28 Rado Kotorov, “Enhancing Decision-Making, Cost-Efficiency, and Profitability with Predictive Analytics”, 2009, Information Builders, http://www.informationbuilders.com/about_us/whitepapers/download_form/4575, accessed June 2014.
29 Charles Nyce, Predictive Analytics White Paper, 2007, American Institute for Chartered Property Casualty Underwriters, Insurance Institute of America, p. 1, http://www.aicpcu.org/doc/predictivemodelingwhitepaper.pdf, accessed June 2014.
30 Chih-Wei Hsu, Chih-Chung Chang, and Chih-Jen Lin, “A Practical Guide to Support Vector Classification” (Technical report), 2003, Department of Computer Science and Information Engineering, National Taiwan University, http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf, accessed June 2014.
31 The classifier algorithm can predict new data points by using support vectors through a statistical approach where a linear plane is gleaned from known data based on a binary classification. New data can be predicted by using this model. David Larcker and Anastasia Zakolyukina, “Detecting Deceptive Discussions in Conference Calls”, Journal of Accounting Research, 50(2), 2012: 495-540.

This is a more focused classification task. The aim of sentiment classification is to classify the data
into positive or negative polarities using supervised or unsupervised methods. Fine-grained
sentiment analysis is desirable because it is highly effective for understanding the pulse of consumers
at the feature level. The task of sentiment target detection aims at extracting the sentiment targets in
the reviews using sentiment classification techniques.32

FIGURE 7: SENTIMENT CLASSIFICATION OF MOVIE REVIEWS

Source: Provided by Authors
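
An unsupervised approach to sentiment classification can be as simple as counting words from a polarity lexicon. The sketch below assumes small hypothetical `POSITIVE`, `NEGATIVE` and `NEGATORS` word lists and a one-word negation rule; it is meant only to show the mechanics on movie-review style text, not to reproduce the figure.

```python
import re

# Hypothetical polarity lexicon and negation words.
POSITIVE = {"good", "great", "enjoyable", "brilliant"}
NEGATIVE = {"bad", "boring", "dull", "poor"}
NEGATORS = {"not", "never", "no"}


def polarity(text):
    """Label text as positive, negative or neutral from lexicon word counts."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        value = (token in POSITIVE) - (token in NEGATIVE)  # +1, -1 or 0
        # Flip the polarity when the previous word is a negator ("not good").
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(polarity("A brilliant and enjoyable film"))
print(polarity("The plot was not good and the ending was dull"))
```

A supervised alternative would learn the word weights from labelled reviews instead of relying on a fixed lexicon.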

Rule-based Information Extraction


The task of extracting valuable information from unstructured data can be very complex. There
are several methods to extract such information. A typical rule-based method consists of two parts:
a collection of rules, and a set of policies to control when these rules are applied.33 These rules are
either manually coded or derived from labelled sources.

FIGURE 8: RULES FOR INFORMATION EXTRACTION FROM UNSTRUCTURED TEXT



{String = “The”} {Orthography type = Capitalised word}

{Orthography type = All capitalised} {Dictionary type = Company end}

Source: Provided by Authors

Using these rules, useful information could be extracted from unstructured text.

FIGURE 9: INFORMATION EXTRACTION FROM UNSTRUCTURED TEXT



Source: Provided by Authors

32 The deceptive messages in conference calls contain fewer positive and more negative words compared to truthful messages.
33 Martin Atzmueller, Peter Kluegl, and Frank Puppe. “Rule-Based Information Extraction for Structured Data Acquisition using TextMarker”. Proc. LWA 2008 Knowledge Discovery and Machine Learning Track, University of Wuerzburg.

The first term can be treated as optional, the second term matches abbreviations written entirely in
capitals, and the last term matches all capitalised words that form the last word of any entry in a
dictionary of company names. Similarly, numerical data can also be extracted by specifying suitable
rules.
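
A rule of this shape maps naturally onto a regular expression. The sketch below is one possible reading of the three terms, with a hypothetical `COMPANY_ENDS` dictionary standing in for the dictionary of company-name endings; the names and the sample sentence are assumptions.

```python
import re

# Hypothetical dictionary of words that end company names.
COMPANY_ENDS = ["Corp", "Inc", "Ltd", "Group", "Holdings"]

COMPANY_RULE = re.compile(
    r"(?:\bThe\s+)?"                           # term 1: optional "The"
    r"\b[A-Z]{2,}\s+"                          # term 2: abbreviation in capitals
    r"(?:" + "|".join(COMPANY_ENDS) + r")\b"   # term 3: company-name ending
)


def extract_companies(text):
    """Return every span of text that satisfies the three-term rule."""
    return [match.group(0) for match in COMPANY_RULE.finditer(text)]


print(extract_companies("Shares of The ABC Corp rose while XYZ Holdings fell."))
```

Extending the dictionary changes what the rule extracts without touching the rule itself, which is the practical appeal of keeping rules and dictionaries separate.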

The rules in the rule engine incorporate terms that extract various aspects from the text to
generate the desired output. The first term matches the amount type: net or gross. The second
term matches the profit or loss section, the third term matches all capitalised words that form the
last word of any entry in a dictionary of company names, the fourth term matches the amount, and
the last term matches the quarter.
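
The five terms can be sketched as named groups in a single regular expression. This is an illustrative reading only: the sentence layout, the connecting words, the currency format and the `COMPANY_ENDS` list are assumptions, since the original figure is not available here.

```python
import re

# Hypothetical dictionary of words that end company names.
COMPANY_ENDS = "Corp|Inc|Ltd|Group|Holdings"

FACT_RULE = re.compile(
    r"(?P<amount_type>[Nn]et|[Gg]ross)\s+"                   # term 1: net or gross
    r"(?P<result>profit|loss)\s+(?:of|at|for)\s+"            # term 2: profit or loss
    r"(?P<company>(?:The\s+)?[A-Z]{2,}\s+(?:" + COMPANY_ENDS + r"))\b"  # term 3
    r".*?(?P<amount>\$\d+(?:\.\d+)?\s*(?:million|billion)?)"  # term 4: amount
    r".*?(?P<quarter>Q[1-4])"                                # term 5: quarter
)


def extract_fact(sentence):
    """Return the matched terms as a dictionary, or None if the rule does not fire."""
    match = FACT_RULE.search(sentence)
    return match.groupdict() if match else None


print(extract_fact("Net profit of ABC Corp was $4.2 million in Q2."))
```

The structured output can then be loaded into a table and compared against the numbers reported in the financial statements.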
