The Effect of Early Career Experience On Auditors' Assessments of Error Explanations in Analytical Review
INTRODUCTION
This paper examines the analytical review judgments of staff-level auditors. The
performance of analytical review procedures has traditionally been associated with
senior-level auditors (see survey data in Bonner and Pennington [1991] and Prawitt
[1995]). However, changes in the audit environment (e.g., increased team auditing, a shift away
from compliance testing to increased analytical procedures, and more work ‘‘pushed down’’ to
lower-level staff) over the last 10 to 15 years may have led to staff-level auditors being exposed to
the analytical review process, and more specifically, financial statement error knowledge, earlier in
their careers. Thus, an assessment of staff-level auditors’ performance on analytical review
procedures is necessary, both to increase our understanding of the learning curve as it relates to
I thank Theresa Libby (editor), Penelope Bagley, Sudip Bhattacharjee, Caroline Ford, Morris McInnes, Laurie Pant,
Tracey Riley, Lew Shaw, Karen Teitel, Kristy Towry, anonymous reviewers, and workshop participants at the College of
the Holy Cross and the 2010 AAA Auditing Section Midyear Meeting for helpful comments and suggestions. I thank
Laurie Pant for her assistance in obtaining study participants, and I thank the many students and professionals who
volunteered to participate in this research study. An earlier version of this paper was presented at the 2008 AAA Annual
Meeting.
Published Online: March 2012
analytical review procedures and to provide evidence on the readiness of staff-level auditors to
perform these traditionally more senior-level procedures.
To examine this research question, I conduct an experiment based on Heiman (1988, 1990).
Heiman (1990) finds that professional auditors at the senior-level do not spontaneously consider
alternative explanations when assessing the likelihood of a hypothesized explanation for an
unexpected analytical review fluctuation. She proposes and tests two strategies (i.e., providing
alternative explanations and asking participants to self-generate alternative explanations) for
restructuring analytical review to improve these likelihood assessments. Heiman (1990) finds that
when senior-level auditors are provided with alternative explanations, their likelihood assessments
are reduced. When senior-level auditors are asked to self-generate alternative explanations, their
likelihood assessments are also reduced in a similar manner, if they self-generate at least two
alternatives.1 Heiman (1988) compares these same two strategies using students as subjects and
finds that only one of the strategies (providing alternative explanations) is successful in reducing
students’ likelihood assessments. When students are asked to self-generate alternative explanations,
their likelihood assessments are not reduced—instead, they actually increase.
Heiman (1988, 16) attributes the different results for senior-level auditors versus students to
differences in their knowledge of the financial statement errors that could have caused a given
analytical review fluctuation. With the performance implications of senior-level experience versus
no experience established, the primary objective of the current study is to examine the performance
of those individuals who fall in between these two groups, namely staff-level auditors who have
some full-time experience but are not yet at the senior level. Although these staff-level auditors
have substantially less on-the-job experience than the senior-level auditors previously studied, the
developments of the last 10 to 15 years in auditing described earlier may serve to offset this. Given
these developments, will staff-level auditors perform more like Heiman’s (1990) senior-level
auditors, like students, or somewhere in between?
To answer this research question, I conduct an experiment based on Heiman (1988, 1990),
using a 2 × 2 × 2 research design, with one within-subject factor and two between-subjects factors.
I manipulate one of the between-subjects factors, the manner by which alternatives are made
available. Consistent with Heiman (1988, 1990), participants either are provided with alternative
explanations or are asked to self-generate alternative explanations. The other between-subjects
factor, audit experience, is measured. For this factor, there are two levels, staff-level auditors and a
control group of accounting students with no full-time audit experience. The within-subject factor
(the ‘‘pre’’ and ‘‘post’’ likelihood assessments) represents the participants’ judged likelihood of a hypothesized explanation for an unexpected analytical review fluctuation, measured before and after alternative explanations are either provided or self-generated.
I find that both staff-level and student participants overestimate (underestimate) the likelihood
of specified (unspecified) causes, and that providing either group with alternative explanations
reduces their likelihood assessments for an original specified cause. On the other hand, when
participants are asked to self-generate alternative explanations, the likelihood assessments for
staff-level auditors and students are different. Students’ likelihood assessments increase after being
asked to self-generate alternative explanations. The pattern of results for the students in this study is
consistent with the pattern observed for the students in Heiman (1988). For staff-level auditors,
1. Prior research demonstrates that as participants are given more information (i.e., provided with alternative
explanations), their likelihood assessments for an initially hypothesized explanation decrease. Accordingly, in
Heiman (1988, 1990) and in this study, the posttest likelihood assessments of the participants who are provided
with alternative explanations serve as a benchmark for the posttest likelihood assessments of the participants who
are asked to self-generate alternative explanations. Additionally, in Heiman (1988, 1990) and in this study, the
posttest likelihood assessments of the participants who are provided with alternative explanations serve also as a
benchmark for the pretest likelihood assessments for all participants.
their assessments are reduced after being asked to self-generate alternative explanations, if they
self-generate at least two alternatives. The pattern of results for staff-level auditors in this study is
similar to the pattern observed for senior-level auditors in Heiman (1990). Overall, these results
show that staff-level auditors perform more like Heiman’s (1990) senior-level auditors than like
students on this analytical review task.2
In the auditing environment, the likelihood assessments described earlier first take place during
the hypothesis generation component of analytical review, as characterized by Koonce (1993). In
this component (and in the hypothesis evaluation component that follows), auditors retrieve (or
inherit) hypotheses to explain unexpected account balance fluctuations, make preliminary
likelihood assessments, and use the results of the assessments to make decisions about subsequent
information search and audit actions (Koonce 1993; Hirst and Koonce 1996). Because of their
impact on the scope of subsequent audit testing, judgments made during hypothesis generation/
evaluation have far-reaching implications that could affect the efficiency or effectiveness of the
audit. An understanding of these processes and the role of experience/knowledge in conducting
these procedures is important for audit firms as they continue to increase the use of analytical
procedures in planning and in execution of the audit due to competitive pressures in the industry.
The results of this study, in conjunction with Heiman (1988, 1990), provide evidence on the
effect of experience (at three levels—senior, staff, and student) on the performance of analytical
procedures. In doing so, this study, when viewed along with Heiman (1988, 1990), not only
provides evidence that experience/knowledge matters when performing analytical procedures, but
also provides evidence on the shape of the learning curve, by showing when the knowledge to
perform certain analytical procedures is acquired and what particular factors (e.g., industry
experience) are associated with acquiring that knowledge. Evidence on the learning curve is
important to audit firms, as they consider how to staff audit assignments, with a view to minimizing
costs while ensuring that the staff assigned are qualified to perform the assignment. Evidence
presented in this paper also contributes to our understanding of ‘‘the nature of an auditor’s transition
from a novice to an expert,’’ which is an area highlighted for more attention by Nelson and Tan
(2005, 49). Lehmann and Norman (2006, 68) also state that an analysis of the intermediate level of
experience, as with staff-level auditors used in the current study, is important in understanding the
process of becoming an expert.
This study also extends research by examining the factors that influence individuals’ likelihood
assessments. At the time that Heiman (1988) and Heiman (1990) were published, it was assumed
that individuals’ likelihood assessments were mediated by the content of the alternative
explanations that were either provided or self-generated. Subsequent to Heiman (1988, 1990), an
alternate theory emerged—that individuals’ likelihood assessments are mediated by their subjective
experience of the ease or difficulty of recall of alternative explanations, for those individuals asked
to self-generate alternative explanations (Schwarz et al. 1991; Schwarz 1998). Prior studies in
psychology (Sanna et al. 2002) and accounting (Kadous et al. 2006) examine the relative roles of
content versus individuals’ subjective experience by manipulating features of the recall task and
find support for the primacy of individuals’ subjective experience in mediating likelihood
assessments. In the current study, by utilizing participants with different knowledge levels, I vary
the ease or difficulty of recall/self-generation of alternative explanations in a different manner than
2. The results in this study showing that the staff-level auditors perform comparably to Heiman’s (1990) senior-level
auditors are not intended to suggest that today’s staff-level auditors are comparable to today’s senior-level
auditors. Changes in the audit environment may have led to knowledge increases at all experience levels such that
today’s senior-level auditors maintain a ‘‘gap’’ over staff-level auditors. A comparison of today’s senior-level
auditors and staff-level auditors is beyond the scope of this study. Even without this comparison, however, the
results of this study have implications for the staffing of analytical review procedures, a task traditionally
associated with senior-level auditors.
these prior studies. The results from taking this different approach, in particular the results from this
study’s ‘‘control group’’ (i.e., the students) who had greater difficulty self-generating alternative
explanations, provide further support for the alternate theory that individuals’ likelihood
assessments are mediated by their subjective experience.
The remainder of the paper is organized as follows. The next section contains the theoretical framework and hypotheses. This is followed by a description of the experiment used to test the hypotheses and a presentation of the results. The final section summarizes the findings and concludes.
3. Heiman (1990) argues that the results from her within-subjects design were not compromised either by learning
or by experimenter-demand effects. Accordingly, the design in the current study follows Heiman’s within-
subject design. The previous studies cited in this paper have found results in both between-subject and within-
subject designs.
4. Implicit in H1a and H1b, as well as in the hypotheses specified later, is an unstated hypothesis of a three-way
interaction between audit experience, source of alternatives, and ‘‘pre’’ versus ‘‘post.’’ These ‘‘simple effect’’
hypotheses have been stated assuming that the hypothesized interaction occurs.
specified, and in line with Kelley’s (1973) discounting mechanism, senior-level auditors have
greater decreases in their likelihood assessments than students do. In fact, students’ likelihood
assessments actually increase on average after they are asked to self-generate alternative
explanations.
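The discounting mechanism referenced above can be illustrated with a toy calculation. The sketch below is not from the paper; it simply models Kelley’s (1973) discounting as renormalization over a hypothetical set of plausibilities, to show why the assessed likelihood of an original hypothesis falls once alternative explanations enter consideration.

```python
# Illustrative only: Kelley's (1973) discounting modeled as renormalization.
# All plausibility numbers below are hypothetical, not from the study.

def discounted_likelihood(original, alternatives):
    """Relative likelihood of the original hypothesis once alternative
    explanations, each with its own plausibility, enter consideration."""
    return original / (original + sum(alternatives))

before = 0.60                                       # original hypothesis considered alone
after = discounted_likelihood(0.60, [0.25, 0.25])   # two plausible alternatives surface
print(round(after, 2))  # 0.55 -- discounted below the initial 0.60
```

Under this simple model, each additional plausible alternative further reduces the original hypothesis’s relative likelihood, consistent with the pattern for auditors who self-generate at least two alternatives.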
With the performance implications of senior-level experience and no experience established,
the primary objective of this study is to examine the performance of those individuals who fall in
between these two groups, namely staff-level auditors who have some full-time experience but are
not yet at the senior-level.
Three earlier studies that examine the effect of experience on analytical review judgments (and
include staff-level auditors in their experience continuum) are potentially relevant to the current
study. However, they provide conflicting and/or incomplete evidence. Using the same ratio analysis
task as Heiman (1988, 1990), Libby and Frederick (1990) examine the effect of experience on error
knowledge, the precursor to the likelihood assessments made by auditors of a hypothesized
explanation, and find that experience leads to the knowledge to self-generate more alternative
explanations. They find differences in the number of self-generated alternative explanations
between managers and staff-level auditors and between staff-level auditors and students. However,
Libby and Frederick (1990) do not examine self-generated explanations for senior-level auditors, so
it is not clear how senior-level and staff-level auditors would perform relative to each other.
Tubbs (1992) looks at error knowledge by using an unconstrained free recall task to examine
the error knowledge of students, staff-level auditors, senior-level auditors, and managers. He finds
an overall experience effect, but it is mostly driven by the managers in the study, and he finds that
the error knowledge of students, staff-level auditors, and senior-level auditors are not significantly
different from each other. In the third study, Bonner (1990) finds that staff-level auditors perform
worse than managers in cue selection and cue weighting in an analytical risk assessment task.
However, she does not compare staff-level auditors to either senior-level auditors or students.5
Taken together, the results from these earlier studies provide limited guidance on how staff-
level auditors will perform in the current study. Accordingly, I follow the approach proposed by
Bonner and Pennington (1991) where, for studies of experience/knowledge on performance, they
advocate understanding the key cognitive processes of the task in question, and understanding when
the knowledge to perform the task is acquired.
In a survey conducted by Bonner and Pennington (1991), audit partners and senior managers
on average estimate that 34.5 percent of the knowledge required to perform the analytical review
task such as in Heiman (1988, 1990) is acquired through formal instruction in college. The
remaining knowledge is acquired via firm training, self-instruction or firm reference manuals, task
feedback, overall performance feedback, and environmental feedback (Bonner and Pennington
1991). Therefore, staff-level auditors have the resources to acquire the remaining knowledge to
perform analytical review procedures as soon as they start working. However, there may be barriers
to staff-level auditors acquiring the remaining knowledge via instruction and/or feedback.
In terms of staffing audit assignments, survey research conducted in the 1980s and 1990s
indicates that analytical review procedures are typically not assigned to staff-level auditors
(Abdolmohammadi 1987; Prawitt 1995). These results have multiple implications. First, they reflect
audit firms’ beliefs about staff-level auditors’ readiness (or lack thereof) to perform analytical
review procedures. These policies, if still enforced today, would also limit the opportunities for
staff-level auditors to acquire the knowledge to perform analytical review procedures. If staff-level
auditors are not assigned to perform analytical review procedures, then they would not receive
5. Other notable studies that examine the performance of staff-level auditors do so in the context of higher-level
tasks, such as workpaper review (Moeckel 1990), going concern evaluation (Choo and Trotman 1991; Lehmann
and Norman 2006), and planning (Christ 1993), so their results are not particularly germane to the current study.
feedback to increase their knowledge. In addition, most audit firms’ formal training programs are set up on a ‘‘just-in-time’’ basis, with training on analytical review procedures delivered around the time that staff-level auditors are promoted to the senior level; increases in knowledge from firm training programs are therefore unlikely during the staff-level years. Furthermore, training of
any type, formal or self-instructed, without feedback has been shown to not increase knowledge for
novices (Bonner and Walker 1994). Accordingly, it may be that, without exposure to analytical
review procedures in their first few years at an audit firm, staff-level auditors would still be similar
to students in their ability to self-generate alternative explanations and in their likelihood
assessments of an initially hypothesized explanation.
The surveys referred to earlier were conducted from 15 to over 20 years ago. Changes in the audit environment in the last 10 to 15 years may have led to increased opportunities for staff-level
auditors to acquire the knowledge to perform analytical review procedures, such that they will be
similar to Heiman’s (1990) senior-level auditors in their ability to self-generate alternative
explanations and in their likelihood assessments. One of these changes is a shift away from
compliance testing to increased use of analytical review procedures in audits. Because more of the
audit is done via analytical review procedures, staff-level auditors may have certain analytical
review procedures delegated to them,6 or they may be assigned to perform the procedures outright,
giving them opportunities for practice and feedback earlier in their careers than before. This
‘‘pushing down’’ of work is reinforced by employee turnover and tight client deadlines, which
result in more responsibility being assumed by lower-level staff auditors (Kelley et al. 1999; Pierce
and Sweeney 2004; Sweeney and Pierce 2004).
Other changes in at least one firm’s audit approach, discussed in Vinograd et al. (2000), include
an increased emphasis on understanding the client’s business objectives and related risks, and the
real-time coaching of audit staff in the field where supervisors give more immediate feedback.
Under these circumstances, staff-level auditors may acquire second-hand outcome feedback by
participating in team meetings and by observing the feedback given to other more senior members
of the audit team.
These expectations of staff-level auditors are reflected in recruiting materials posted on one Big
4 firm’s website. The website states that, in the first few years, staff will acquire ‘‘research and
analysis skills, including how to verify, challenge and interpret information used to create financial
statements’’ and ‘‘knowledge of financial performance and measurement concepts related to . . .
analytical reviews’’ (Ernst & Young 2009).
Based on these recent developments in the audit environment, I hypothesize that staff-level
auditors’ knowledge in this domain is sufficiently developed, and they will be able to self-generate
alternative explanations and have reduced likelihood assessments of an initially hypothesized cause.
Consequently, I propose the following hypothesis:
H2a: Staff-level auditors’ likelihood assessments of a hypothesized cause of an unexpected
analytical review fluctuation will be lower after they are asked to self-generate
alternative explanations than before they received such instructions.
Consistent with Heiman (1988), I expect that students’ knowledge in this domain is not sufficiently developed, making self-generation of plausible alternatives more difficult. Based on this, students’
likelihood assessments for an initially hypothesized cause will not decrease after they are asked to
self-generate alternative explanations. Consequently, I propose the following hypothesis:
6. Abdolmohammadi’s (1999) survey results show that analytical review procedures are considered to be semi-structured tasks. Even so, his respondents report that they are comfortable assigning some analytical procedures
to assistant level staff.
7. Note that participants who are provided with alternative explanations do not experience the ease or difficulty of
recalling alternative explanations. Accordingly, there is no subjective experience for these participants, and their
likelihood assessments are expected to be mediated entirely by the content of the alternative explanations. In
accounting, there is evidence in support of this from the ‘‘yoked’’ participants who were provided with another
participant’s self-generated alternative explanations in Experiment 2 of Kadous et al. (2006).
METHOD
Task
The experimental task in this study involved reading through case materials and making two
judgments (‘‘pre’’-treatment and ‘‘post’’-treatment) as to the likelihood of a given hypothesized
explanation for an unexpected financial statement fluctuation. As for the treatment, administered
between the two judgments, participants were either provided with (five) alternative explanations or
were asked to self-generate alternative explanations. The task used in this study was identical to the
task used by Heiman (1988, 1990).
Design
The experiment used a 2 × 2 × 2 design with one within-subject factor and two between-
subjects factors. The within-subject factor was the ‘‘pre’’ versus ‘‘post’’ likelihood assessment.
‘‘Pre’’ represents the initial likelihood assessment of an initially hypothesized explanation, and
‘‘post’’ represents the assessment made of that same explanation after being provided with
alternatives or being asked to self-generate alternatives. The between-subjects factors were the
source of the alternative explanations that were made available, which was manipulated, and audit
experience, which was measured. The two sources from which alternatives were made available
were experimenter-provided, where participants were provided with five alternative explanations to
the original hypothesized explanation, and self-generated, where participants were asked to self-
generate up to five alternatives. For audit experience, there were two levels: staff-level auditors,
who had at least three months of full-time audit experience, including at least one ‘‘busy season,’’
and graduate students in accounting, who had no full-time audit experience.9
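The mixed design described above can be pictured as a small data layout. The sketch below uses entirely hypothetical responses (the study’s data are not reproduced here) to show how the within-subject factor yields one pre–post change score per participant, aggregated within each between-subjects cell.

```python
# A minimal sketch of the 2 x 2 x 2 mixed design's data layout, with
# hypothetical responses. Factors: audit experience (between, measured),
# source of alternatives (between, manipulated), and pre vs. post (within).

from statistics import mean

# Each record: (experience, source, pre_assessment, post_assessment); 0-100 scale.
responses = [
    ("staff",   "provided",       70, 45),
    ("staff",   "self-generated", 65, 50),
    ("student", "provided",       75, 50),
    ("student", "self-generated", 60, 72),  # students' assessments may increase
]

def cell_change(experience, source):
    """Mean post-minus-pre change for one between-subjects cell."""
    deltas = [post - pre for exp, src, pre, post in responses
              if exp == experience and src == source]
    return mean(deltas)

print(cell_change("student", "self-generated"))  # positive under these hypothetical data
```

The hypothesized three-way interaction corresponds to the pre–post change differing across the four experience-by-source cells.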
Participants
Forty-six staff-level auditors and 41 graduate accounting students with no full-time audit
experience participated in the experiment and were paid a flat wage of ten dollars plus the chance of
8. It should be noted that the current study is not specifically designed to disentangle the effect of content from the
effect of participants’ subjective experience. Contrast this with the experiments in Schwarz et al. (1991), Sanna et
al. (2002), and Kadous et al. (2006), which were specifically designed to disentangle these effects, by utilizing a
‘‘generate many’’ versus ‘‘generate few’’ manipulation. Schwarz et al. (1991) noted that such a manipulation had
to be utilized because, in most settings, manipulations introduced to increase/decrease the ease of recall/
generation are also likely to affect the content of participants’ recall. Along these lines, in the current study,
experience is likely to affect both the ease of recall and content. Nevertheless, if certain patterns emerge, this
study can provide evidence on the relative importance of each in likelihood assessments. Additionally, by
utilizing participants with different knowledge levels, the current study varies the ease or difficulty of self-
generation in a different manner than earlier studies.
9. Participants were classified based on their full-time audit experience. Participants with the requisite full-time
audit experience, who were also students at the same time, were classified in the ‘‘staff-level auditor’’ group.
winning a drawing for a number of restaurant gift certificates. The graduate accounting students
were recruited from two accounting classes at an urban university in the northeastern United States.
The staff-level auditors were recruited from professional contacts of the author, including alumni of
the same urban university that was the source of the student participants, and professional contacts
of members of the university’s accounting department advisory board. In total, 16 audit firms were
represented in the sample, providing a broad-based sample. The largest number of participants from
any single firm was 15. Seventeen of the participants (38 percent) were employed by the ‘‘Big 4’’
audit firms, and all of the Big 4 firms were represented in the sample.10 All of the staff-level
auditors were based in offices in the northeastern United States, and the average full-time work
experience of these auditors was 1.1 years. Twelve percent of the students and 13 percent of the
staff-level auditors reported having audit internship experience.
Procedure
For the student participants, the data were collected in two accounting classes. For the staff-
level auditors, the data were collected in one of four ways: at a training session for one of the firms,
in a classroom setting,11 in small groups at the author’s office, or on a ‘‘take-home’’ basis.12
Participants in each experience group were randomly assigned to one of two conditions: ‘‘provided
alternatives’’ or ‘‘self-generated alternatives.’’ Following a brief verbal introduction, or written
introduction in the case of the take-home participants, participants began by reading a set of
instructions describing the case and the procedures to be followed during the experiment.
The instructions included a brief description of a medium-sized manufacturing company.
Participants also received the company’s income statement and balance sheet with current year
(unaudited) balances and three financial ratios (gross margin, current ratio, and quick ratio), which
were calculated based on the prior year (audited) balances and the current year (unaudited)
balances. Participants were told that there was one material error (or repeated occurrences of the
10. The senior-level auditors in Heiman (1990) were entirely from two of the then-Big 8 audit firms, while this study
uses staff-level auditors from 16 firms, of which 38 percent are from Big 4 firms. Ex ante, it is not clear whether
the presence of non-Big 4 auditors in this study, relative to a sample composed entirely of current Big 4 auditors,
would bias in favor of finding results (if auditors in non-Big 4 firms take on analytical review responsibilities
earlier in their career) or against finding results (if the hiring criteria at Big 4 firms are more selective or there is
better formal training than at non-Big 4 firms). Results from this study do not show a strong effect in either
direction, as the results for the Big 4 staff-level auditors and the non-Big 4 staff-level auditors are comparable,
once the not-for-profit industry specialists (one Big 4, 13 non-Big 4) are removed.
11. One of the professional participants was enrolled in one of the classes that was the source of the student
participants.
12. In Heiman (1990), seven percent of her professional participants completed the materials on a take-home basis.
In this experiment, a majority of the professional participants completed the materials on a take-home basis. To
guard against potential validity threats from conducting a larger proportion of the study in this manner, I
instructed study participants to complete the experimental materials in sequence and not to go back and change
answers previously provided. To confirm that this was the case, I reviewed the returned experimental materials
and noted that there were no significant differences in the completion of the materials by those who completed
them in a classroom setting and those who completed them on a take-home basis. In addition, it is not clear how
the study’s results would have been affected if some take-home professional participants did in fact change their
initial (‘‘pre’’) measures (in violation of the study instructions). If these participants had the ability to change
their initial measures and were interested in impression management, they likely would have adjusted their ‘‘pre’’
judgment to be closer to (or to match) their ‘‘post’’ judgment, which would bias against finding results for H1a
and H2a. Additionally, completing the materials on a take-home basis is not any less realistic than completing the
materials in a training session, for example, and may possibly replicate real-world conditions more closely.
Nevertheless, the fact that a majority of the professional participants completed the study on a take-home basis is
a minor limitation of this study.
same error) in the current year’s financial statements, and that the difference between the two years’
sets of ratios was a result of the material error and normal year-to-year variation.13
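The three ratios supplied to participants are standard; for readers unfamiliar with them, the sketch below computes each from hypothetical balances (the actual case figures are not reproduced in this paper).

```python
# Standard definitions of the three ratios given to participants.
# The balances used here are hypothetical, not the case materials' figures.

def gross_margin(sales, cost_of_goods_sold):
    return (sales - cost_of_goods_sold) / sales

def current_ratio(current_assets, current_liabilities):
    return current_assets / current_liabilities

def quick_ratio(current_assets, inventory, current_liabilities):
    # The quick ratio excludes inventory, the least liquid current asset.
    return (current_assets - inventory) / current_liabilities

# Hypothetical current-year (unaudited) balances:
print(gross_margin(1_000_000, 650_000))        # 0.35
print(current_ratio(400_000, 200_000))         # 2.0
print(quick_ratio(400_000, 150_000, 200_000))  # 1.25
```

Because a single financial statement error shifts specific account balances, it moves these ratios in predictable directions, which is what makes the ratio comparison diagnostic in the task.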
Next, the pretest judgment was elicited. All participants were asked the question: ‘‘Given the
fact that a financial statement error occurred and affected the ratios on the previous page, how likely
do you think it is that the error was next period’s credit sales being recorded in the current period?’’
Participants were asked to provide their assessment by making a mark on a judgment scale running
from 0 to 100, where the endpoints were labeled ‘‘extremely unlikely’’ and ‘‘extremely likely,’’
respectively. Deciles of the scale were also indicated. After answering this question, they were
instructed not to make any further changes to their likelihood assessment on this page. The
following page of the case materials elicited the posttest judgment, which differed between
treatments, by the source of the alternative explanations.
In the ‘‘provided alternatives’’ condition, participants were told that they would be asked to
reconsider the previous question, but first they were to consider ‘‘these alternative specific errors
which could have led to the change’’ in the set of ratios. They were then provided with five ‘‘high-
frequency’’ financial statement errors (according to Coakley and Loebbecke 1985), which could
have caused the unusual fluctuation. Participants were then reminded that there was only one error
in the financial statements and were asked to reconsider the previous question. The question was
restated, and the judgment scale was presented again to obtain the posttest judgment. In the ‘‘self-
generated alternatives’’ condition, participants were told that they would be asked to reconsider the
previous question, but first they were to list alternative specific errors that could have led to the
change in the set of financial ratios. Space was provided for them to list five errors that could have
caused the unusual fluctuation. Participants were reminded that there was only one error in the
financial statements and were asked to reconsider the previous question. The question was restated,
and the judgment scale was presented again to elicit the posttest judgment.
Following these questions, participants answered a post-experimental questionnaire, which
contained manipulation check questions and asked for demographic information. As part of the
questionnaire, participants in the alternatives provided condition were asked to self-generate one
alternative explanation, not among the six previously provided by the case materials. This question
served as a filler question, to attempt to equalize the time that each group spent on the case
materials. The task required about 15 to 20 minutes to complete.
RESULTS
Manipulation Checks
13. The procedures followed in this study are consistent with those followed in Heiman (1988, 1990). To emphasize that identical procedures were followed, the description of the procedures from Heiman (1988, 1990) is repeated in this paragraph and in the following two paragraphs, with some minor modifications.
14. The data were analyzed both including and excluding participants who answered this manipulation check question incorrectly, and results were not significantly different. The results reported in this paper reflect the inclusion of these participants.
I also asked participants to report the number of alternative explanations listed (either by the
experimenter or by the participant). Eighty-four percent of the participants in the ‘‘asked to self-
generate alternatives’’ condition correctly reported the number of alternatives that they had listed.
Seventy-five percent of the participants in the ‘‘alternatives provided’’ condition correctly reported
the number of alternatives that had been provided.15,16
Experimental Results
Recall that participants were asked for two likelihood assessments for a given hypothesized
explanation being the true cause of an unexpected analytical review fluctuation. The first
likelihood assessment (the ‘‘pretest’’) was elicited prior to participants being provided with
alternative explanations or being asked to self-generate alternative explanations. The second
likelihood assessment (the ‘‘posttest’’) was elicited after participants were provided with
alternative explanations or were asked to self-generate alternative explanations. Table 1
presents the descriptive statistics for pretest and posttest assessments in each of the between-
subject experimental conditions.17 Figure 1 presents the means for these assessments
graphically.
There were no significant differences for ‘‘pretest’’ likelihood assessments in the four crossed-
factor conditions. There was no significant main effect for source of alternatives, which is
appropriate, since the participants had not yet been exposed to the source of alternatives
manipulation. Additionally, there was no significant main effect for audit experience, and there was
no significant interaction.
A 2 × 2 × 2 ANOVA with repeated measures on the within-subject factor was performed. The
two between-subjects variables were the source of the alternative explanations (experimenter-
provided versus self-generated) and audit experience (staff-level auditor versus student). The
within-subject variable was the pretest versus posttest likelihood assessment, which measures the
change in participants’ judgments. The results of the ANOVA (not tabulated) indicate a three-way
interaction between pre versus post, source of alternatives, and audit experience (F = 4.517, p =
0.04, two-tailed). Accordingly, the data were split based on ‘‘source of alternatives,’’18 and separate
2 × 2 ANOVAs with repeated measures on the within-subject factor were performed. The results of
these ANOVAs are presented in Table 2.
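The split-sample analysis above can be sketched in code. Because the within-subject factor has only two levels, every interaction involving pretest versus posttest in the mixed ANOVA is equivalent to the corresponding between-subjects effect in an ANOVA on change scores (posttest minus pretest). The sketch below is illustrative only: it assumes a balanced design and simulates change scores loosely patterned on Table 1 rather than using the study’s data.

```python
# Hypothetical illustration: with two levels of the within-subject factor,
# the within x between interactions of the mixed ANOVA reduce to a two-way
# between-subjects ANOVA on change scores (posttest - pretest).
import random
from statistics import mean

random.seed(1)

def change_scores(pre_mu, post_mu, n, sd=20.0):
    """Simulate n participants' (posttest - pretest) change scores."""
    return [random.gauss(post_mu - pre_mu, sd) for _ in range(n)]

# Balanced 2 x 2 design: audit experience x source of alternatives.
# Cell means loosely patterned after Table 1 (simulated, not the study's data).
cells = {
    ("auditor", "provided"): change_scores(56.3, 49.5, 20),
    ("auditor", "selfgen"):  change_scores(56.6, 51.9, 20),
    ("student", "provided"): change_scores(58.3, 41.1, 20),
    ("student", "selfgen"):  change_scores(53.5, 63.5, 20),
}

n = 20                                 # per-cell size (balanced by assumption)
grand = mean(v for cell in cells.values() for v in cell)
a_means = {a: mean(v for (x, _), c in cells.items() if x == a for v in c)
           for a in ("auditor", "student")}
b_means = {b: mean(v for (_, y), c in cells.items() if y == b for v in c)
           for b in ("provided", "selfgen")}
cell_means = {k: mean(c) for k, c in cells.items()}

# Textbook balanced-design sums of squares.
ss_a = 2 * n * sum((m - grand) ** 2 for m in a_means.values())
ss_b = 2 * n * sum((m - grand) ** 2 for m in b_means.values())
ss_ab = n * sum((cell_means[(a, b)] - a_means[a] - b_means[b] + grand) ** 2
                for a, b in cells)
ss_err = sum((v - cell_means[k]) ** 2 for k, c in cells.items() for v in c)

df_err = 4 * (n - 1)
f_ab = (ss_ab / 1) / (ss_err / df_err)   # interaction F(1, df_err)
print(f"Interaction F(1, {df_err}) = {f_ab:.2f}")
```

With real data, the unbalanced cell sizes in this study (n = 24, 22, 20, 21) would call for a regression-based (Type III) ANOVA rather than these textbook sums of squares.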
15. Some participants in the alternatives provided condition treated the original hypothesized explanation as a ‘‘provided alternative,’’ thus reporting six provided alternatives. For purposes of this manipulation check, these participants were treated as having correctly reported the number of provided alternatives.
16. The data were analyzed both including and excluding participants who answered this manipulation check question incorrectly, and results were not significantly different. The results reported in this paper reflect the inclusion of these participants.
17. The pattern of means for the students in this study is consistent with the pattern of means observed by Heiman (1988) from data collected at two large state universities and by the author from data collected in a separate study at another large state university, suggesting that there are no school-related differences.
18. Data were also split based on audit experience, with consistent results.
TABLE 1
Descriptive Statisticsa
Mean Likelihood Assessments (Standard Deviations)

                                                      Before Alternative    After Alternative
                                                      Explanations          Explanations
                                                      (Pretest)             (Posttest)
Staff-Level Auditors
  Alternative Explanations Provided (n = 24)          56.3 (24.8)           49.5 (26.5)
  Self-Generated Alternative Explanations (n = 22)    56.6 (24.0)           51.9 (19.5)
Students
  Alternative Explanations Provided (n = 20)          58.3 (23.8)           41.1 (26.1)
  Self-Generated Alternative Explanations (n = 21)    53.5 (25.9)           63.5 (20.0)

a Participants were asked to provide two likelihood assessments on a 101-point scale with the end points labeled 0 = ‘‘extremely unlikely’’ and 100 = ‘‘extremely likely.’’
between pretest versus posttest and audit experience (F = 1.25, p = 0.27, two-tailed). These results
are consistent with H1a and H1b.
FIGURE 1
Mean Likelihood Assessments for an Initially Hypothesized Explanation for an Analytical
Review Fluctuation (Before and After Alternative Explanations) (Actual)
are self-generated (p = 0.05, two-tailed). When fewer than two plausible alternatives are self-
generated, there is a marginally significant increase from the pretest (46.4) to the posttest (57.9) (p = 0.08, two-
tailed). Thus, H2a is supported when staff-level auditors self-generate two or more plausible alternative
explanations, a pattern very similar to Heiman’s (1990) results with senior-level auditors.
H2b predicted that students’ likelihood assessments of the initially hypothesized cause would not
be lower after they were asked to self-generate alternative explanations. Simple effects analysis using a
paired sample t-test shows that, when students are asked to self-generate alternatives, their likelihood
TABLE 2
Analysis of Variance
Audit Experience × Pretest versus Posttesta

a This table shows the results of two ANOVAs conducted after the initial three-way ANOVA indicated a three-way interaction (results for the three-way ANOVA are not tabulated). Following the three-way interaction, the data were split on the basis of ‘‘source of alternatives’’; an ANOVA was conducted for participants who were provided with alternative explanations (Panel A), and an ANOVA was conducted for participants who self-generated alternative explanations (Panel B).
assessments increased on average (pretest = 53.5, posttest = 63.5). Consistent with H2b, the mean
posttest assessment is not lower than the pretest. In fact, the mean posttest assessment is significantly
higher than the mean pretest assessment (t = 1.994, p = 0.06, two-tailed), consistent with H2c.
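The paired-sample t-tests used for these simple effects can be illustrated with a minimal sketch. The data below are simulated; only the test mechanics (mean difference divided by its standard error, with n − 1 degrees of freedom) mirror the analysis described above.

```python
# Hypothetical sketch of a paired-sample t-test (e.g., comparing students'
# pretest vs. posttest likelihood assessments). Data are simulated.
from statistics import mean, stdev
from math import sqrt

def paired_t(pre, post):
    """Paired t statistic and degrees of freedom for post - pre differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)        # standard error of the mean difference
    return mean(diffs) / se, n - 1     # (t statistic, degrees of freedom)

# Simulated likelihood assessments for a small group of participants.
pre  = [50, 60, 45, 70, 55, 40, 65, 50]
post = [60, 68, 50, 75, 60, 52, 70, 58]
t, df = paired_t(pre, post)
print(f"t({df}) = {t:.2f}")            # prints: t(7) = 7.56
```

The two-tailed p-value would then be looked up against the t distribution with df degrees of freedom (e.g., via scipy.stats if available).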
19. Eight of these 14 participants had not-for-profit as the only industry in which they spent more than one-third of their time, suggesting that they could be exclusively working on not-for-profit audits.
TABLE 3
Simple Effects Analysis
Self-Generated Alternativesa

Hypothesis                                                                 t-statistic   p-value
H2a: Staff-level auditors’ ‘‘post’’ likelihood assessments
     lower than ‘‘pre’’ likelihood assessments (n = 22)                    0.90          0.38
H2b/H2c: Students’ ‘‘post’’ likelihood assessments
     not lower/higher than ‘‘pre’’ likelihood assessments (n = 21)         1.99          0.06

a This table shows the tests of H2a, H2b, and H2c. These simple effects analyses were performed following an initial three-way interaction (source of alternatives, audit experience, and ‘‘pre’’ versus ‘‘post’’ likelihood assessments) (not tabulated) and after the two-way ANOVA conducted for participants who self-generated alternative explanations indicated a two-way interaction (as presented in Table 2). p-values are two-tailed.
performance of these not-for-profit industry specialists as a separate group and also re-analyzed the
overall staff-level auditor results with the not-for-profit specialists excluded from the analysis. For
the 14 not-for-profit industry specialists, their pattern of results was similar to that of the student
group, where likelihood assessments are reduced when alternatives are provided, and more
importantly, likelihood assessments increase when alternatives are self-generated (however, due to
low power, none of the differences is significant, nor is the interaction; for the interaction, F =
1.574, p = 0.23, two-tailed). Also, excluding these participants from the overall analysis actually
leads to a stronger audit experience effect, as the (initially tested) three-way interaction of pre
versus post, audit experience, and source of alternative explanations is stronger than before (F =
6.834, p = 0.01, two-tailed).
Recall that data were collected from participants on two intermediate stage measures—a rating
of task difficulty and the number of plausible alternatives self-generated—that are expected to be,
respectively, negatively and positively associated with participants’ knowledge and their ability to
perform analytical review. Analysis of these intermediate stage measures for the not-for-profit
industry specialists provides some additional support for the notion that industry experience affects
one’s ability to self-generate alternative explanations. Not-for-profit industry specialists who were
asked to self-generate alternatives rated the task as significantly more difficult (mean = 5.0 on a
seven-point Likert scale) than the other staff-level auditors who were asked to self-generate
alternatives (mean = 4.25) (p = 0.05, one-tailed). Similarly, not-for-profit industry specialists self-
generated on average 1.83 plausible alternative explanations, while the remaining staff-level
auditors self-generated 2.75 (this difference is not significant, presumably due to low power;
p = 0.28, two-tailed). Overall, these results suggest that not-for-profit industry experience (and
specialization) is associated with failing to acquire the ability to self-generate alternative
explanations. Conversely, for-profit industry experience is associated with staff-level auditors
acquiring that ability, which would enhance their performance of analytical procedures.20
20. These analyses of industry effects should be interpreted with caution, as industry effects may be entangled with firm-specific characteristics: one firm accounted for nearly all of the not-for-profit industry specialists in total, and for all of the not-for-profit industry specialists in the self-generated alternatives condition. Similar to the results when the not-for-profit industry specialists are excluded from the analysis, when all of the staff-level auditors from this firm are excluded from the analysis, the (initially tested) three-way interaction is stronger (F = 6.627, p = 0.01) than when they are included.
The intermediate stage measures collected were also used to develop a more complete picture
of the process that individuals go through when self-generating alternative explanations and
providing likelihood assessments.
Interestingly, the mean number of plausible self-generated alternative explanations does not
appear to be related to experience level. The mean number of plausible alternative explanations
generated by staff-level auditors was 2.50,21 while the mean number of plausible alternative
explanations generated by students was 2.28 (these means are not significantly different from each
other [p = 0.67, two-tailed]).
On the other hand, there are marginally significant experience-related differences in
participants’ task difficulty ratings (ratings are reported for participants in the self-generated
alternatives condition only). Students rated the task as more difficult (mean = 4.93 on a seven-point
Likert scale) than the staff-level auditors did (mean = 4.46) (p = 0.09, one-tailed). Earlier, in the test
of H2c, students’ likelihood assessments increased after being asked to self-generate alternative
explanations, and that increase was interpreted as evidence of the primacy of accessibility
experiences over accessible content in explaining likelihood assessments. The fact that students’
likelihood assessments increased despite having essentially the same mean number of alternative
explanations as staff-level auditors, while reporting higher task difficulty ratings, provides
additional support for this interpretation.22
21. In Heiman (1988), the mean number of self-generated alternatives for the senior-level auditors was 2.63.
22. These analyses were also performed with the not-for-profit industry specialists excluded. With these industry specialists excluded, the difference in the number of plausible alternative explanations self-generated is still not significant (p = 0.24, two-tailed), while the difference in task difficulty becomes more pronounced (p = 0.03, one-tailed).
I find that, when participants are asked to self-generate alternative explanations, the likelihood
assessments for staff-level auditors and students are indeed different. Students’ likelihood
assessments increase after being asked to self-generate alternative explanations, consistent with
Heiman (1988). For staff-level auditors, their likelihood assessments decrease after being asked to
self-generate alternative explanations, if they self-generate at least two alternatives. The pattern of
results for staff-level auditors in this study is similar to the pattern observed for senior-level auditors
in Heiman (1990). Overall, these results show that staff-level auditors perform more like Heiman’s
(1990) senior-level auditors than students on this analytical review task.
These results, in conjunction with Heiman (1988, 1990), provide evidence on when the
knowledge to perform certain analytical procedures is acquired, and help to provide a better
understanding of the shape of the knowledge curve as it relates to certain analytical procedures.
Audit firms continue to expand their use of analytical review procedures in response to
competitive pressures. This study provides evidence on the readiness of staff-level auditors to
perform certain analytical review procedures, which has staffing and training implications for audit
firms. The study also contributes to our understanding of the transition of an auditor from novice to
expert (Nelson and Tan 2005; Lehmann and Norman 2006).
The results of this study have other implications for audit researchers. Abdolmohammadi and
Wright (1987) examine the appropriateness of using junior-level staff and students as surrogates for
more senior auditors in experimental auditing research. The results of this study suggest that
staff-level auditors may be appropriate surrogates for senior-level auditors for certain tasks, under
certain conditions. This has implications for researchers because staff-level auditors may be more
accessible than senior-level auditors: they are less time constrained and more connected to
universities, whether as recent graduates or as students enrolled in a fifth-year Master’s program
while working as staff-level auditors.
Finally, this study extends research on how individuals’ subjective experience while
self-generating alternative explanations affects their likelihood assessments. Whereas prior studies in
psychology and accounting varied the ease or difficulty of recall/self-generation by manipulating
features of the task (i.e., generate many versus generate few), I vary the ease or difficulty by
utilizing participants with different knowledge levels and find a similar primary role for individuals’
subjective experience. I also reconcile Heiman’s (1988) results for her students who self-generated
alternative explanations (and had an increase in their likelihood assessments) to more recent
psychology and accounting research, and replicate her results with the students in the current study.
Avenues for future research include following up on the finding that not-for-profit specialists
performed worse than other staff-level auditors, when asked to self-generate alternative
explanations. This is contrasted with the performance of the financial services audit specialists (n
= 18) in the sample, who arguably were also ‘‘out of their industry,’’ because the study case
involved a manufacturing company. Even though both not-for-profit auditors and financial services
auditors were ‘‘out of their industry,’’ only the not-for-profit auditors performed worse
than the whole sample of staff-level auditors. Future research could ascertain whether it is the
absence of a profit motive (for clients) that leads to a different approach to auditing and training at
the firm level for not-for-profit auditors. For example, many firms with for-profit clients use
manufacturing firms as sample firms in their generic training materials, so even financial services
auditors at these firms would be exposed to a manufacturing setting. Alternatively, it may be that
the additional compliance testing that is required as part of many not-for-profit audits is preventing
these auditors from gaining knowledge about performing analytical review procedures. Another
area for future research would be to examine staff-level auditor judgments on tasks that are less
structured than the task used in this study, which was characterized as a semi-structured task
(Abdolmohammadi 1999).
REFERENCES
Abdolmohammadi, M. J. 1987. A Taxonomy of Audit Task Complexity for Decision Aid Development.
Working paper, Bentley College.
Abdolmohammadi, M. J. 1999. A comprehensive taxonomy of audit task structure, professional rank and
decision aids for behavioral research. Behavioral Research in Accounting 11: 51–92.
Abdolmohammadi, M., and A. Wright. 1987. An examination of the effects of experience and task
complexity on audit judgments. The Accounting Review 62 (January): 1–13.
Bonner, S. E. 1990. Experience effects in auditing: The role of task-specific knowledge. The Accounting
Review 65 (January): 72–92.
Bonner, S. E., and N. Pennington. 1991. Cognitive processes and knowledge as determinants of auditor
expertise. Journal of Accounting Literature 10: 1–50.
Bonner, S. E., and P. F. Walker. 1994. The effects of instruction and experience on the acquisition of
auditing knowledge. The Accounting Review 69 (January): 157–178.
Choo, F., and K. T. Trotman. 1991. The relationship between knowledge structure and judgments for
experienced and inexperienced auditors. The Accounting Review 66 (July): 464–485.
Christ, M. Y. 1993. Evidence on the nature of audit planning problem representations: An examination of
auditor free recalls. The Accounting Review 68 (April): 304–322.
Coakley, J. R., and J. K. Loebbecke. 1985. The expectation of accounting errors in medium-sized
manufacturing firms. Advances in Accounting 2: 199–246.
Ernst & Young. 2009. Working at Ernst & Young-Assurance Services-External Audit. Available at: http://
www.ey.com/US/en/careers/students
Fischhoff, B., P. Slovic, and S. Lichtenstein. 1978. Fault trees: Sensitivity of estimated failure probabilities
to problem representation. Journal of Experimental Psychology: Human Perception and
Performance 4 (May): 330–344.
Heiman, V. B. 1988. Auditors’ Assessments of the Likelihood of Analytical Review Explanations. Ph.D.
dissertation, University of Michigan.
Heiman, V. B. 1990. Auditors’ assessments of the likelihood of error explanations in analytical review. The
Accounting Review 65 (October): 875–890.
Hirst, D. E., and L. Koonce. 1996. Audit analytical procedures: A field investigation. Contemporary
Accounting Research 13 (Fall): 457–486.
Hoch, S. J. 1985. Counterfactual reasoning and accuracy in predicting personal events. Journal of
Experimental Psychology: Learning, Memory, and Cognition 11 (October): 719–731.
Kadous, K., S. D. Krische, and L. M. Sedor. 2006. Using counter-explanation to limit analysts’ forecast
optimism. The Accounting Review 81 (March): 377–397.
Kelley, H. H. 1973. The processes of causal attribution. American Psychologist 28 (February): 107–128.
Kelley, T., L. Margheim, and D. Pattison. 1999. Survey on the differential effects of time deadline pressure
versus time budget pressure on auditor behavior. The Journal of Applied Business Research 15 (4):
117–128.
Koonce, L. 1993. A cognitive characterization of audit analytical review. Auditing: A Journal of Practice &
Theory 12 (Supplement): 57–76.
Lehmann, C. M., and C. S. Norman. 2006. The effects of experience on complex problem representation and
judgment in auditing: An experimental investigation. Behavioral Research in Accounting 18: 65–83.
Libby, R., and D. M. Frederick. 1990. Experience and the ability to explain audit findings. Journal of
Accounting Research 28 (Autumn): 348–367.
Lipe, M. G. 1985. Attribution Theory: A Proposed Model. Ph.D. dissertation, University of Chicago.
Mehle, T. 1982. Hypothesis generation in an automobile malfunction inference task. Acta Psychologica 52
(November): 87–106.
Moeckel, C. 1990. The effect of experience on auditors’ memory errors. Journal of Accounting Research 28
(Autumn): 368–386.
Nelson, M., and H.-T. Tan. 2005. Judgment and decision making research in auditing: A task, person, and
interpersonal interaction perspective. Auditing: A Journal of Practice & Theory 24 (Supplement): 41–
71.
Pierce, B., and B. Sweeney. 2004. Cost-quality conflict in audit firms: An empirical investigation. European
Accounting Review 13 (3): 415–441.
Prawitt, D. F. 1995. Staffing assignments for judgment-oriented audit tasks: The effects of structured audit
technology and environment. The Accounting Review 70 (July): 443–465.
Sanna, L. J., N. Schwarz, and S. L. Stocker. 2002. When debiasing backfires: Accessible content and
accessibility experiences in debiasing hindsight. Journal of Experimental Psychology: Learning,
Memory, and Cognition 28 (3): 497–502.
Schwarz, N. 1998. Accessible content and accessibility experiences: The interplay of declarative and
experiential information in judgment. Personality and Social Psychology Review 2 (2): 87–99.
Schwarz, N., F. Strack, H. Bless, G. Klumpp, H. Rittenauer-Schatka, and A. Simons. 1991. Ease of retrieval
as information: Another look at the availability heuristic. Journal of Personality and Social
Psychology 61 (2): 195–202.
Sweeney, B., and B. Pierce. 2004. Management control in audit firms: A qualitative examination.
Accounting, Auditing and Accountability Journal 17 (5): 779–812.
Tubbs, R. M. 1992. The effect of experience on the auditor’s organization and amount of knowledge. The
Accounting Review 67 (October): 783–801.
Vinograd, B. N., J. S. Gerson, and B. L. Berlin. 2000. Audit practices of PricewaterhouseCoopers. Auditing:
A Journal of Practice & Theory 19 (Fall): 175–182.