
Chapter 6 Data-Driven Fraud Detection
(Fraud Examination 6e
by Albrecht, Albrecht, Albrecht, Zimbelman)

ACC 5375-260 – Forensic Accounting


Source: Adapted from Cengage Learning
Learning Objectives
• Obj. 1: Describe the importance of data-driven fraud detection,
including the difference between accounting anomalies and fraud.
• Obj. 2: Explain the steps in the data analysis process.
• Obj. 3: Be familiar with common data analysis packages.
• Obj. 4: Understand the principles of data access, including open
database connectivity (ODBC), text import, and data warehousing.
• Obj. 5: Perform basic data analysis procedures for fraud detection.
• Obj. 6: Read and analyze a Matosas matrix.
• Obj. 7: Understand how fraud is detected by analyzing financial
statements.
Errors Versus Fraud
➢ Errors…
✓ are not intentional
✓ are simply problems caused by failures in systems, procedures, or policies
✓ do not represent fraud and normally do not result in legal action
✓ are usually spread evenly throughout a data set
➢ Fraud…
✓ is the intentional circumvention of controls by intelligent human beings
✓ is concealed by perpetrators who create false documents or change records in database systems
✓ may leave evidence in very few transactions
✓ shows symptoms in single cases or limited areas of the data set

Audit Sampling and Fraud
➢ Statistical sampling has become a standard auditing procedure.
✓ Audit sampling is an effective analysis procedure for finding routine errors spread
throughout a data set.
✓ In contrast, sampling is usually a poor analysis technique when looking for a needle
in a haystack.
✓ If you sample at a 5 percent rate, any single fraudulent transaction has a 95 percent
chance of falling outside the sample and being missed.
➢ Often, fraud examiners strive to complete full-population analysis to ensure
that the “needles” are found.
➢ Given the right tools and techniques, full-population analysis is often the
preferred method in a fraud investigation.
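
The sampling arithmetic above can be checked with a short calculation. The sketch below assumes a simple random sample drawn without replacement, so the chance of missing every fraudulent record follows the hypergeometric distribution. The function name and the population figures are illustrative assumptions, not values from the text.

```python
from math import comb


def prob_miss_all(population, fraudulent, sample_size):
    """Probability that a simple random sample (drawn without replacement)
    contains none of the fraudulent records."""
    if sample_size > population - fraudulent:
        # The sample is too large to avoid every fraudulent record.
        return 0.0
    clean = population - fraudulent
    return comb(clean, sample_size) / comb(population, sample_size)


# Hypothetical population: 10,000 transactions, 5 percent sample (500 records).
for fraud_count in (1, 3, 10):
    print(fraud_count, round(prob_miss_all(10_000, fraud_count, 500), 3))
```

With one fraudulent record the result is 0.95, matching the 95 percent figure. The chance of missing all of them falls as the number of fraudulent records grows, but it stays high when the fraud sits in only a handful of transactions.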

The Data Analysis Process
➢Fraud investigators must be prepared to learn new methodologies,
software tools, and analysis techniques to successfully take advantage
of data-oriented methods.
➢Data-driven fraud detection is proactive in nature.
✓ The investigator no longer has to wait for a tip to be received.
✓ The investigator brainstorms the schemes and symptoms that might be found
and then looks for them.
➢Data-driven detection is essentially a hypothesis-testing approach:
✓ The investigator makes hypotheses and tests to see which are supported by the
data.

Figure 6.1 The Proactive Method of Fraud Detection

The Data Analysis Process – Six Steps

1. Understand the business
2. Identify possible frauds that could exist
3. Catalog possible fraud symptoms
4. Use technology to gather data about symptoms
5. Analyze results
6. Investigate symptoms
Step 1: Understand the Business
➢The same fraud detection procedures cannot be applied generically to
all businesses or even to different units of the same organization.
➢Several potential methods to gather information about a business are
as follows:
✓ Tour the business, department, or plant
✓ Become familiar with competitor processes
✓ Interview key personnel (ask them where fraud might be found)
✓ Analyze financial statements and other accounting information
✓ Review process documentation
✓ Work with auditors and security personnel
✓ Observe employees performing their duties
Step 2: Identify Possible Frauds That Could Exist

➢ This risk assessment step requires an understanding of the
nature of different frauds, how they occur, and what
symptoms they exhibit.
➢The fraud identification process begins by conceptually
dividing the business unit into its individual functions or
cycles.
➢During this stage, the fraud detection team should
brainstorm potential frauds by type and player.
Step 3: Catalog Possible Fraud Symptoms
➢ In Step 3, the fraud examiner should carefully consider what symptoms could
be present in the potential frauds identified in Step 2.
➢ Types of Fraud Symptoms (See Chapter 5):
✓ Accounting anomalies
✓ Internal control weaknesses
✓ Analytical anomalies
✓ Extravagant lifestyles
✓ Unusual behaviors
✓ Tips and complaints

Figure 6.2 Red Flags of Kickbacks

➢ Analytical Symptoms
✓ Increasing prices
✓ Larger order quantities
✓ Increasing purchases from favored vendor
✓ Decreasing purchases from other vendors
✓ Decreasing quality

➢Behavioral Symptoms
✓ Buyer doesn’t relate well to other buyers and vendors
✓ Buyer’s work habits change unexpectedly


➢Lifestyle Symptoms
✓ Buyer lives beyond known salary
✓ Buyer purchases more expensive automobile
✓ Buyer builds more expensive home


➢Control Symptoms
✓ All transactions with one buyer and one vendor
✓ Use of unapproved vendors

➢Document Symptoms
✓ 1099s from vendor to buyer’s relative


➢ Tips and Complaints
✓ Anonymous complaints about buyer or vendor
✓ Unsuccessful vendor complaints
✓ Quality complaints about purchased products

Step 4: Use Technology to Gather Data about Symptoms
➢Searching and analysis
✓ Data analysis applications
✓ Custom structured query language (SQL) queries and scripts
➢The deliverable of this step is a set of data that matches the symptoms
identified in the previous step
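
As one illustration of a custom SQL query for this step, the sketch below looks for the kickback red flag "increasing purchases from favored vendor". It uses Python's built-in sqlite3 module with an in-memory database so that it runs anywhere. The `purchases` table, its columns, the sample rows, and the 80 percent threshold are all hypothetical assumptions; a real engagement would query the company's own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (buyer TEXT, vendor TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        ("B1", "V1", 2022, 100.0), ("B1", "V2", 2022, 100.0),
        ("B1", "V1", 2023, 180.0), ("B1", "V2", 2023, 20.0),
        ("B2", "V1", 2022, 50.0), ("B2", "V2", 2022, 50.0),
        ("B2", "V1", 2023, 50.0), ("B2", "V2", 2023, 50.0),
    ],
)

# Share of each buyer's yearly spending that goes to each vendor;
# keep only buyer/vendor/year rows at or above the threshold.
QUERY = """
SELECT p.buyer, p.vendor, p.year,
       SUM(p.amount) / MAX(t.total) AS share
FROM purchases AS p
JOIN (SELECT buyer, year, SUM(amount) AS total
      FROM purchases
      GROUP BY buyer, year) AS t
  ON t.buyer = p.buyer AND t.year = p.year
GROUP BY p.buyer, p.vendor, p.year
HAVING SUM(p.amount) / MAX(t.total) >= :threshold
ORDER BY p.buyer, p.vendor, p.year
"""

flagged = conn.execute(QUERY, {"threshold": 0.8}).fetchall()
for row in flagged:
    print(row)
```

The rows returned are the deliverable described above: a small set of data matching one cataloged symptom, ready for analysis in Step 5.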

Step 5: Analyze Results
➢ Once anomalies are refined and determined by the examiners to be likely
indications of fraud, they are analyzed using either traditional or
technology-based methods:
✓ Screening results using computer algorithms
✓ Real-time analysis and detection of fraud
➢ One advantage of the data-driven approach is its potential reuse.

Step 6: Investigate Symptoms
➢ The final step of the data-driven approach is investigation into the most
promising indicators.
➢ The primary advantage of the data-driven approach is that the investigator takes
charge of the fraud investigation process.
✓ Instead of waiting for tips or other indicators to become egregious enough to show
on their own, the data-driven approach can highlight frauds while they are small.
➢ The primary drawback to the data-driven approach is that it can be more
expensive and time intensive than the traditional approach.

Data Analysis Software
➢ACL Audit Analytics
✓ Powerful program for data analysis
✓ Most widely used by auditors worldwide
➢CaseWare’s IDEA
✓ Recent versions include an increasing number of fraud techniques
✓ ACL’s primary competitor

Data Analysis Software
➢Microsoft Office + ActiveData
✓ a plug-in for Microsoft Office
✓ provides data analysis procedures
✓ based in Excel and Access
✓ less expensive alternative to ACL and IDEA
➢ Other software packages include
✓ SAS and SPSS (statistical analysis programs with available fraud modules)
✓ General-purpose programming languages like Perl, Python, Ruby, and Visual Basic,
as well as specialized data mining platforms

Data Access
➢The most important (and often most difficult) step
in data analysis is gathering the right data in the
right format during the right time period.
➢Methods include:
o Open Database Connectivity (ODBC)
o Text Import
o Hosting a Data Warehouse
Open Database Connectivity (ODBC)
➢ standard method of querying data from corporate relational
databases
➢ a connector between front-end analysis applications (such as ACL and
IDEA) and back-end corporate databases (such as Oracle, SQL Server,
and MySQL)
➢ generally the best way to retrieve data for analysis because
✓ it can retrieve data in real time
✓ it allows use of the SQL language
✓ it allows repeated pulls for iterative analysis
✓ it retrieves metadata (like column types and relationships) directly
Text Import
➢ Several text formats exist for copying data from one application (e.g., a
database) to another (e.g., an analysis application).
➢ Common text formats
✓ Delimited text
o Comma-separated values (CSV):
ID, Date, First Name, Last Name, Phone Number, etc.
342, 12/23/2007, Seth, Knab, 000-000-0000, etc.
o Tab-separated values (TSV):
ID Date First Name Last Name Phone
342 12/23/2007 Seth Knab 000-000-0000
✓ Fixed-width format
✓ Extensible markup language (XML) – used in many new applications
✓ EBCDIC - Used primarily on IBM mainframes
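
The delimited and fixed-width formats above can be read with a few lines of code. This is a minimal sketch using Python's standard csv module and the sample record from the slide. The fixed-width column layout (`FIXED_LAYOUT`) and the helper names are assumptions made for illustration, since real layouts come from the source system's file specification.

```python
import csv
import io

CSV_TEXT = (
    "ID, Date, First Name, Last Name, Phone Number\n"
    "342, 12/23/2007, Seth, Knab, 000-000-0000\n"
)
TSV_TEXT = (
    "ID\tDate\tFirst Name\tLast Name\tPhone\n"
    "342\t12/23/2007\tSeth\tKnab\t000-000-0000\n"
)
# Hypothetical fixed-width layout: (column name, start offset, end offset).
FIXED_LAYOUT = [("ID", 0, 5), ("Date", 5, 15), ("First Name", 15, 25), ("Last Name", 25, 35)]
FIXED_TEXT = f"{'342':<5}{'12/23/2007':<10}{'Seth':<10}{'Knab':<10}\n"


def read_delimited(text, delimiter):
    """Parse delimited text into one dict per record, keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter, skipinitialspace=True)
    return list(reader)


def read_fixed_width(text, layout):
    """Slice each line at the given offsets and strip the padding."""
    return [
        {name: line[start:end].strip() for name, start, end in layout}
        for line in text.splitlines()
    ]


print(read_delimited(CSV_TEXT, ","))
print(read_delimited(TSV_TEXT, "\t"))
print(read_fixed_width(FIXED_TEXT, FIXED_LAYOUT))
```

Every value arrives as text, so dates, amounts, and identifiers still need type conversion before analysis.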
Hosting a Data Warehouse
➢Many investigators simply import data directly into their analysis
application, effectively creating a simplified data warehouse.
➢While most programs are capable of storing millions of records in
multiple tables, most analysis applications are relatively poor data
repositories.
➢Databases are the optimal method of storing data.
➢ Analysis applications like ACL and IDEA provide options for
server-based storage of data.
Data Analysis Techniques
➢ Once data are retrieved and stored in a data warehouse, analysis application, or text
file, they need to be analyzed to identify transactions that match the indicators
identified earlier in the process.
➢ Analysis techniques that are most commonly used by fraud investigators:
✓ Data Preparation
✓ Benford’s Law
✓ Digital Analysis
✓ Outlier Investigation
✓ Stratification and Summarization
✓ Time Trend Analysis
✓ Fuzzy Matching
✓ Real-Time Analysis
Data Preparation
➢ One of the most important (and often most difficult) tasks in data
analysis is proper preparation of data.
➢Areas of concern
✓ Type conversion and consistency of values
✓ Descriptives about columns of data
✓ Time standardization
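
The three areas of concern above can be sketched in a few helper functions. This is an illustrative example only: the input formats (currency strings with "$", commas, and parenthesized negatives; timestamps as month/day/year with a known UTC offset) and the function names are assumptions, not formats named in the text.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal
from statistics import mean


def to_amount(raw):
    """Type conversion: turn text such as '$1,234.50' or '(75.00)' into a Decimal."""
    cleaned = raw.strip().replace("$", "").replace(",", "")
    if cleaned.startswith("(") and cleaned.endswith(")"):
        cleaned = "-" + cleaned[1:-1]  # accounting-style negative
    return Decimal(cleaned)


def to_utc(raw, utc_offset_hours):
    """Time standardization: convert a local timestamp to UTC."""
    local = datetime.strptime(raw, "%m/%d/%Y %H:%M")
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return local.replace(tzinfo=local_zone).astimezone(timezone.utc)


def describe(values):
    """Descriptives: basic statistics that reveal odd ranges in a column."""
    return {"count": len(values), "min": min(values), "max": max(values), "mean": mean(values)}


amounts = [to_amount(text) for text in ("$1,234.50", "(75.00)")]
print(describe(amounts))
print(to_utc("12/23/2007 18:30", -7))
```

Running the descriptives before any fraud test is a cheap way to catch columns that were imported with the wrong type or scale.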

Digital Analysis
➢Digital analysis is the art of analyzing the digits that make up number
sets like invoice amounts, reported hours, and costs.
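➢ For many kinds of financial data, Benford’s Law accurately predicts how often each
digit (1 through 9) appears as the first digit of the numbers in a set: the first digit is
d with probability log10(1 + 1/d), so about 30 percent of numbers begin with 1 and
fewer than 5 percent begin with 9.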
✓ Using Benford’s Law to detect fraud has the major advantage of being a very
inexpensive method to implement and use.
✓ The disadvantage of using Benford’s Law is that it is tantamount to hunting
fraud with a shotgun.
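
A first-digit test along these lines takes only a few lines of code, which is part of why the method is inexpensive. The sketch below computes the expected Benford probability for each first digit and compares it with the observed share in a list of amounts. The function names are illustrative assumptions, and a real test would also need a rule for how large a deviation is worth investigating.

```python
from collections import Counter
from math import log10


def benford_expected(digit):
    """Expected share of numbers whose first digit is `digit` (1 through 9)."""
    return log10(1 + 1 / digit)


def first_digit(amount):
    """First significant digit; scientific notation puts it in position 0."""
    return int(f"{abs(amount):.15e}"[0])


def first_digit_profile(amounts):
    """Map each digit to (observed share, expected share), ignoring zeros."""
    digits = [first_digit(a) for a in amounts if a != 0]
    if not digits:
        raise ValueError("no nonzero amounts to analyze")
    counts = Counter(digits)
    total = len(digits)
    return {d: (counts[d] / total, benford_expected(d)) for d in range(1, 10)}


for d in range(1, 10):
    print(d, round(benford_expected(d), 3))
```

Running `first_digit_profile` separately for each vendor or employee, rather than on the whole population, helps narrow the "shotgun" result to the cases that deviate most.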

Table 6.1 Benford’s Law Probability Values

Figure 6.3 Digital Analysis—Supply Management

Figure 6.4 Supplier Graphs

Stratification
➢Stratification is the splitting of complex data sets into groupings.
➢The data set must be stratified into a number of “subtables” before
analysis can be done.
➢For many data sets, stratification can result in thousands of subtables.
➢While basic programs like spreadsheets make working with this many
tables difficult and time consuming, analysis applications like ACL
and IDEA make working with lists of tables much easier.

Summarization
➢Summarization is an extension of stratification.
➢Summarization runs one or more calculations on the subtables to
produce a single record representing each subtable.
➢Basic summarization usually produces a single results table with one
record per case value.
➢Pivot tables (also called cross tables) are two-dimensional views with
cases in one dimension and the calculations in the detail cells.
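
Stratification followed by summarization can be sketched as two small functions: one splits the records into subtables by case value, and the other reduces each subtable to a single summary record. The invoice data, field positions, and function names below are hypothetical, chosen only to make the example self-contained.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (vendor, invoice amount).
invoices = [("V1", 100.0), ("V1", 120.0), ("V2", 90.0), ("V2", 95.0), ("V2", 400.0)]


def stratify(records, key_index=0):
    """Split the data set into one subtable per case value."""
    subtables = defaultdict(list)
    for record in records:
        subtables[record[key_index]].append(record)
    return dict(subtables)


def summarize(subtables, value_index=1):
    """Produce one result record per subtable."""
    results = []
    for case, rows in subtables.items():
        values = [row[value_index] for row in rows]
        results.append({
            "case": case,
            "count": len(values),
            "total": sum(values),
            "mean": mean(values),
            "max": max(values),
        })
    return results


for record in summarize(stratify(invoices)):
    print(record)
```

The output is the single results table described above, with one record per vendor. Comparing each vendor's maximum to its own mean, rather than to the overall mean, is the reason for stratifying first.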

Time Trend Analysis
➢Time trend analysis is a summarization technique that produces a
single number that summarizes each graph.
➢By sorting the results table appropriately, the investigator quickly
knows which graphs need further manual investigation.
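
The slide does not say which single number is used. One common choice, assumed here, is the slope of a least-squares line fitted through each series. The sketch below computes that slope for evenly spaced periods and sorts the cases so the steepest increases come first. The monthly totals and employee labels are hypothetical.

```python
def trend_slope(values):
    """Least-squares slope of values observed at periods 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two periods to measure a trend")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


# Hypothetical monthly purchase totals per employee.
series = {
    "E1": [100, 102, 99, 101],
    "E2": [100, 140, 180, 260],
    "E3": [200, 190, 185, 170],
}

ranked = sorted(series, key=lambda case: trend_slope(series[case]), reverse=True)
print(ranked)
```

Only the cases at the top of the sorted list need to be graphed and reviewed by hand.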

Figure 6.5 Time Trend Graph

Fuzzy Matching
➢ Another common technique is fuzzy matching of textual values.
➢ This technique allows for searches to be performed that will find matches
between some text and entries in a database that are less than 100 percent
identical.
➢ The first and most common method of fuzzy matching is use of the
Soundex algorithm.
➢ A more powerful technique for fuzzy matching uses n-grams. This technique
compares runs of letters in two values to get a match score from 0 to 100
percent.
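
The n-gram idea can be sketched as follows. This example breaks each value into overlapping three-letter runs and scores the overlap with the Dice coefficient, scaled to 0 to 100. The choice of trigrams, the Dice formula, and the normalization (lowercasing and collapsing whitespace) are assumptions made for illustration; analysis packages may score matches differently.

```python
def ngrams(text, n=3):
    """Set of overlapping n-letter runs in the normalized text."""
    normalized = " ".join(text.lower().split())
    if len(normalized) < n:
        return {normalized} if normalized else set()
    return {normalized[i:i + n] for i in range(len(normalized) - n + 1)}


def match_score(a, b, n=3):
    """Dice similarity of the two n-gram sets, as a percentage from 0 to 100."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    shared = len(grams_a & grams_b)
    return 100.0 * 2 * shared / (len(grams_a) + len(grams_b))


# Hypothetical vendor names compared against a hypothetical employee-owned business.
print(match_score("Acme Supply Co", "Acme Supply Company"))
print(match_score("Acme Supply Co", "Zenith Freight"))
```

A typical use is to compare vendor names or addresses against employee records and review the pairs that score above a chosen cutoff.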
Real-Time Analysis
➢ Data-driven investigation is one of the most powerful methods of
discovering fraud.
➢ It is usually performed during investigations or periodic audits, but it can be
integrated directly into existing systems to perform real-time analysis on
transactions.
➢ Although real-time analysis is similar to traditional accounting controls
because it works at transaction time, it is a distinct technique because it
specifically analyzes each transaction for fraud (rather than for accuracy or
some other attribute).
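
A real-time check of this kind can be pictured as a function that the transaction system calls before posting each record. The sketch below is a simplified assumption of how such a hook might look: the approved-vendor list, the approval limit, the 95 percent band, and the transaction fields are all hypothetical, and only the "unapproved vendor" rule comes from the kickback red flags listed earlier.

```python
APPROVED_VENDORS = {"V1", "V2"}  # hypothetical approved-vendor list
APPROVAL_LIMIT = 5000.0          # hypothetical limit above which a second approval is required


def screen_transaction(txn):
    """Return the fraud red flags raised by a single transaction."""
    flags = []
    if txn["vendor"] not in APPROVED_VENDORS:
        flags.append("unapproved vendor")
    # Amounts just under the limit may indicate an attempt to avoid review.
    if 0.95 * APPROVAL_LIMIT <= txn["amount"] < APPROVAL_LIMIT:
        flags.append("amount just under approval limit")
    return flags


print(screen_transaction({"vendor": "V9", "amount": 4990.0}))
print(screen_transaction({"vendor": "V1", "amount": 1200.0}))
```

Unlike an accuracy control, which would reject a malformed record, this function lets a valid transaction through and attaches indicators for an investigator to review.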

