
Applied Neuropsychology: Child

ISSN: 2162-2965 (Print) 2162-2973 (Online) Journal homepage: https://www.tandfonline.com/loi/hapc20

Verbal fluency and digit span variables as performance validity indicators in experimentally induced malingering and real world patients with TBI

Jessica Hurtubise, Tabarak Baher, Isabelle Messa, Laura Cutler, Ayman Shahein, Maurissa Hastings, Marilou Carignan-Querqui & Laszlo A. Erdodi

To cite this article: Jessica Hurtubise, Tabarak Baher, Isabelle Messa, Laura Cutler, Ayman Shahein, Maurissa Hastings, Marilou Carignan-Querqui & Laszlo A. Erdodi (2020): Verbal fluency and digit span variables as performance validity indicators in experimentally induced malingering and real world patients with TBI, Applied Neuropsychology: Child, DOI: 10.1080/21622965.2020.1719409

To link to this article: https://doi.org/10.1080/21622965.2020.1719409

Published online: 21 Feb 2020.


Full Terms & Conditions of access and use can be found at


https://www.tandfonline.com/action/journalInformation?journalCode=hapc20
APPLIED NEUROPSYCHOLOGY: CHILD
https://doi.org/10.1080/21622965.2020.1719409

Verbal fluency and digit span variables as performance validity indicators in experimentally induced malingering and real world patients with TBI
Jessica Hurtubise (a), Tabarak Baher (a), Isabelle Messa (a), Laura Cutler (a), Ayman Shahein (b), Maurissa Hastings (a), Marilou Carignan-Querqui (c), and Laszlo A. Erdodi (a)

(a) Department of Psychology, University of Windsor, Windsor, Canada; (b) Department of Clinical Neurosciences, University of Calgary, Calgary, Canada; (c) Department of Languages, Literature and Cultures, University of Windsor, Windsor, Canada

ABSTRACT
Objective: This study was designed to examine the classification accuracy of verbal fluency (VF) measures as performance validity tests (PVTs).
Method: Student volunteers were assigned to the control (n = 57) or experimental malingering (n = 24) condition. An archival sample of 77 patients with TBI served as a clinical comparison.
Results: Among students, FAS T-score ≤29 produced a good combination of sensitivity (.40–.42) and specificity (.89–.95). Animals T-score ≤31 had superior sensitivity (.53–.71) at .86–.93 specificity. VF tests performed similarly to commonly used PVTs embedded within Digit Span: RDS ≤7 (.54–.80 sensitivity at .93–.97 specificity) and age-corrected scaled score (ACSS) ≤6 (.54–.67 sensitivity at .94–.96 specificity). In the clinical sample, specificity was lower at liberal cutoffs [Animals T-score ≤31 (.89–.91), RDS ≤7 (.86–.89), and ACSS ≤6 (.86–.96)], but comparable at conservative cutoffs [Animals T-score ≤29 (.94–.96), RDS ≤6 (.95–.98), and ACSS ≤5 (.92–.96)].
Conclusions: Among students, VF measures had higher signal detection performance than previously reported in clinical samples, likely due to the absence of genuine impairment. The superior classification accuracy of animal relative to letter fluency was replicated. Results suggest that existing validity cutoffs can be extended to cognitively high functioning examinees, and emphasize the importance of population-specific cutoffs.

KEYWORDS: Experimental malingering; performance validity; population-specific cutoffs; verbal fluency

Performance validity and cognitive testing

Establishing the credibility of a given response set is critical to the validity of clinical interpretations based on neuropsychological test scores (Fuermaier et al., 2017; Merten & Rogers, 2017; Stevens, Friedel, Mehren, & Merten, 2008). Since clinical judgment is notoriously poor at detecting invalid performance (Dandachi-FitzGerald, Merckelbach, & Ponds, 2017; Heaton, Smith, Lehman, & Vogt, 1978), performance validity tests (PVTs) have been developed to provide objective measures of the extent to which a given test score (or the entire neurocognitive profile) is an accurate reflection of the examinee's underlying abilities. Recent surveys have detected a trend toward increased utilization of validity tests in neuropsychology (Martin, Schroeder, & Odland, 2015), following longstanding recommendations by professional organizations (Bush et al., 2005; Bush, Heilbronner, & Ruff, 2014; Heilbronner et al., 2009). However, a recent report on the discrepancy between self-reported and observed PVT use cautions against equating survey results with actual clinical practice (MacAllister, Vasserman, & Armstrong, 2019).

Types of PVTs

There has been a gradual increase in the use of embedded validity indicators (EVIs) to complement free-standing PVTs dedicated to discriminating valid and invalid responding. Later co-opted as PVTs, EVIs are tests developed initially as measures of cognitive ability. The recent proliferation of EVI research, both in terms of introducing new instruments (Berger et al., 2019; Erdodi, Seke, et al., 2017; Erdodi et al., 2016; Fuermaier et al., 2016; Lichtenstein, Erdodi, & Linnea, 2017; Ord, Boettcher, Greve, & Bianchini, 2010; Rai, An, Charles, Ali, & Erdodi, 2019; Whiteside, Kogan, et al., 2015) and the ongoing cross-validation of existing ones (Abeare, Sabelli, et al., 2019; Erdodi, Pelletier, & Roth, 2018; Lange et al., 2013; Persinger et al., 2018; Schroeder, Twumasi-Ankrah, Baade, & Marshall, 2012; Webber & Soble, 2018; Whiteside, Caraher, Hahn-Ketter, Gaasedelen, & Basso, 2019; Whitney, Davis, Shepard, Bertram, & Adams, 2009) is a testament to their clinical utility. Add-ons to free-standing PVTs (post-publication enhancements that are analogous to EVIs) have also been shown to improve classification accuracy (Boone, Salazar, Lu, Warner-Chacon, & Razani, 2002; Erdodi, Tyson, et al., 2017, 2018; Kim et al., 2010; Lupu, Elbaum, Wagner, & Braw, 2018; Tomer, Lupu, Golan, Wagner, & Braw, 2019).

Free-standing PVTs vs. EVIs

Unlike most free-standing PVTs, EVIs provide information on both performance validity and cognitive ability, making them a cost-effective choice in assessment contexts where time spent on test administration and scoring is a highly valued commodity (Bortnik et al., 2010; Erdodi, Kirsch, Sabelli, & Abeare, 2018; Glassmire, Wood, Ta, Kinney, & Nitch, 2019). Additionally, EVIs are inseparable from their host instrument, reducing the inferential leap from free-standing PVTs to measures of cognitive ability when determining the credibility of a response set (Bigler, 2015). Since they are harder to identify, EVIs are also more resistant to the effect of coaching (Brennan et al., 2009; Kanser et al., 2017). EVIs also help reduce the appearance of bias toward malingering detection due to reliance on multiple free-standing PVTs dedicated solely to the identification of non-credible response sets (Boone, 2013).

Despite the myriad of advantages, individually, EVIs tend to have inferior signal detection properties relative to free-standing PVTs (Erdodi, Green, Sirianni, & Abeare, 2019). This is particularly problematic in forensic settings, where false positive errors have far-reaching consequences. More importantly, EVIs are often criticized for blurring the line between credible deficits and invalid performance. Although free-standing PVTs are not immune to this line of reasoning either (Bigler, 2012; Leighton, Weinborn, & Maybery, 2014), EVIs are especially vulnerable, at least on theoretical grounds, to the claim that they conflate genuine impairment and non-credible responding (Boskovic et al., 2018; Eglit et al., 2019; Glassmire et al., 2019). While the inherent difficulty of tests can be manipulated effectively through carefully planned test construction and/or deliberate choice of cutoff, all neuropsychological tests require some level of cognitive ability to yield a clinically meaningful score, an inescapable epistemological challenge for PVT designers and users (Bigler, 2014; Erdodi, 2019; Lippa, 2018). Ultimately, no PVT is robust enough to provide meaningful data when confounded by extreme levels of impairment (Boone, 2013; Green, Montijo, & Brockhaus, 2011; Larochette & Harrison, 2012). Nevertheless, free-standing PVTs tend to have better classification accuracy than EVIs (MacAllister et al., 2019).

However, EVIs face an additional, unique challenge: the invalid before impaired paradox (Erdodi & Lichtenstein, 2017). Namely, several validity cutoffs reach into the range of scores historically considered to indicate normal cognitive functioning. At face value, such occurrences undermine the credibility of EVIs, as they appear to erase the range of credible impairment. Although a series of arguments have been marshalled to defend EVIs against such concerns (Hilsabeck, 2017), free-standing PVTs are immune to the invalid before impaired paradox by default, as they are expected only to provide information on performance validity. Since they are typically designed to be easy while appearing difficult and are passed by the vast majority of credible patients with genuine and severe deficits (Carone, Green, & Drane, 2014; Green, Montijo, & Brockhaus, 2011), free-standing PVTs at sufficiently conservative cutoffs tend to have superior specificity. Therefore, a score in the failing range on such tests makes a convincing rational and empirical argument for invalid performance.

Verbal fluency tests as EVIs

Verbal fluency measures have long been known to be sensitive to non-credible responding. Hayward, Hall, Hunt, and Zubrick (1987) reported that nurses who were instructed to feign credible impairment performed significantly more poorly than patients with medically verified TBI on animal fluency. The effect size (d = 1.46) was comparable to that observed on Digit Span (d = 1.51; very large). In contrast, Backhaus, Fichtenberg, and Hanks (2004) found that, although invalid performance had a large effect on letter fluency (d = 0.81), it was among the measures most robust to non-credible performance. As a reference, a much larger effect (d = 1.48) emerged on the written version of the Symbol-Digit Modalities Test.

These findings were later replicated by the same research group: non-credible responding had a large effect (d = 0.83) on letter fluency scores (Johnson, Silverberg, Millis, & Hanks, 2012). More recently, Whiteside, Gaasedelen, et al. (2015) reported a medium-large effect for psychometrically defined invalid performance on animal fluency (d = 0.65) among patients with mild TBI. In addition, Johnson et al. (2012) combined the total score on a different version of the letter fluency test (CFL) with the change in output over time in a logistic regression equation. The standard cutoff (.50) produced a good combination of sensitivity (.67) and specificity (.88). Increasing the cutoff to .60 disproportionately sacrificed sensitivity (.43) for specificity (.89).

In light of accumulating evidence that verbal fluency measures had the potential to serve as EVIs, formal validity cutoffs were first introduced by Curtis, Thompson, Greve, and Bianchini (2008). They proposed that demographically adjusted T-scores of ≤33 (in mild TBI) and ≤30 (in moderate-to-severe TBI) achieved .90 specificity at .14–.34 sensitivity. Their findings were replicated by Whiteside, Kogan, et al. (2015), who concluded that a more conservative cutoff (T ≤ 24) on letter fluency was needed to maintain .90 specificity. However, this cutoff was insensitive to invalid performance (.05). Animal fluency cutoffs (T ≤ 25 and T ≤ 24) produced slightly better combinations of sensitivity (.23–.25) and specificity (.89–.91).

The superior signal detection performance of animal versus letter fluency was also observed in a study by Sugarman and Axelrod (2015). In their large sample of clinically referred veterans, a T-score of ≤30 on the FAS produced .30 sensitivity at .90 specificity. More liberal cutoffs (T ≤ 33 and T ≤ 31) achieved a better combination of sensitivity (.42–.44) and specificity (.89–.91) on animal fluency. Table 1 provides a quick visual summary of the literature reviewed above.
Table 1. Brief summary of the classification accuracy of verbal fluency measures as PVTs.

Study | Sample characteristics | Criterion | Test | Cutoff | SENS | SPEC
Curtis et al., 2008 | 204 patients with traumatic brain injury (TBI); M Age = 39.6; M Education = 12.3 | Slick, Sherman, and Iverson (1999) criteria for MND | FAS (mild TBI) | T ≤ 33 | .34 | .92
 | | | | T ≤ 31 | .21 | .97
 | | | FAS (moderate-to-severe TBI) | T ≤ 33 | .29 | .79
 | | | | T ≤ 31 | .19 | .87
 | | | | T ≤ 30 | .14 | .91
 | | | | T ≤ 28 | .14 | .95
Whiteside, Kogan, et al., 2015 | 57 compensation-seeking patients with mild TBI & 61 non-compensation-seeking patients with moderate-to-severe TBI | Valid: 0 PVT failures; Invalid: ≥2 PVT failures | FAS | T ≤ 25 | .09 | .87
 | | | | T ≤ 24 | .05 | .90
 | | | | T ≤ 17 | .02 | 1.00
 | | | Animals | T ≤ 25 | .25 | .89
 | | | | T ≤ 24 | .23 | .91
 | | | | T ≤ 23 | .19 | .94
Sugarman & Axelrod, 2015 | 623 physician-referred VA patients; M Age (valid) = 49.6; M Education (valid) = 13.1; M Age (invalid) = 51.3; M Education (invalid) = 12.6 | Valid: 0 PVT failures; Invalid: ≥2 PVT failures | FAS | T ≤ 30 | .30 | .90
 | | | | T ≤ 29 | .26 | .91
 | | | Animals | T ≤ 33 | .44 | .89
 | | | | T ≤ 31 | .42 | .91

Note. PVT: performance validity test; MND: Malingered Neurocognitive Dysfunction (Slick et al., 1999); SENS: sensitivity; SPEC: specificity.

The review of existing evidence on verbal fluency measures as EVIs converges on several conclusions. First, there is significant variability in cutoffs both across and within studies. When compared directly, animal fluency cutoffs outperformed letter fluency cutoffs. Second, once .90 specificity was achieved, sensitivity tended to be low and variable. Third, TBI severity was related to the signal detection profile of the verbal fluency based EVIs: more conservative cutoffs were required in patients with moderate-to-severe TBI to maintain the same level of specificity compared to patients with mild TBI (Curtis et al., 2008). Finally, performance on letter and category fluency tests varied across studies as a function of sample characteristics, stimulus presentation (Silverberg, Hanks, Buchanan, Fichtenberg, & Millis, 2008), study design, and criterion grouping (Crowe, 1996). These findings underscore the importance of developing population-specific cutoffs (Glassmire et al., 2019; Pearson, 2009) as a potential practical solution to divergent evidence on the classification accuracy of EVIs within measures of verbal fluency.

Current study

Combined with recent reports suggesting that PVTs tend to be insensitive to non-credible performance in cognitively intact individuals (Abeare, Messa, et al., 2019; An, Kaploun, Erdodi, & Abeare, 2017; Roye, Calamia, Bernstein, De Vito, & Hill, 2019), the current state of knowledge calls for a replication of validity cutoffs on verbal fluency measures in a non-clinical sample. This study was designed to use an experimental malingering (expMAL) paradigm to examine the classification accuracy of existing cutoffs within a non-clinical student sample. Given the paucity of research on EVIs nested within verbal fluency tests (Curtis et al., 2008; Johnson et al., 2012; Whiteside, Kogan, et al., 2015), the analyses were extended to EVIs nested within Digit Span, a well-researched instrument in the context of performance validity assessment.

Given that free-standing PVTs may be insensitive to subtle manifestations of non-credible responding in high functioning examinees while simultaneously conflating invalid performance with genuine deficits in clinical patients, relying on healthy, young, and cognitively high-functioning participants serves as a potentially valuable proof of concept. Since such a design controls for genuine deficits, scores below the validity cutoff have a clearer interpretation (i.e., non-credible responding). On the other hand, EVI cutoffs calibrated on a cognitively high functioning sample (Abeare, Freund, Kaploun, McAuley, & Dumitrescu, 2017; Erdodi, Sagar, et al., 2018) often fail to replicate in patients with more severe deficits (Eglit et al., 2019; Glassmire et al., 2019). Therefore, the classification accuracy statistics in students were compared to a clinical sample of patients with TBI.

We hypothesized that previously published cutoffs (FAS and Animals T-score ≤33) would produce better classification accuracy in university students who volunteered as research participants compared to the clinical samples, as the signal detection model would not be contaminated by genuine cognitive impairment as a confounding variable (Berger et al., 2019; Bodner, Merten, & Benke, 2019; Rai et al., 2019). Similarly, we predicted that verbal fluency based EVIs would be sensitive to injury severity. Finally, we hypothesized an inverse dose-response relationship between failure rates on other PVTs and TBI severity (mild vs moderate/severe), consistent with previous reports (Abeare, Sabelli, et al., 2019; Erdodi & Rai, 2017; Green et al., 2011).

Method

Participants

Student sample
Participants were recruited through the University's online research platform and were randomly assigned to either the control or the expMAL condition, following a 2:1 ratio. Inclusion criteria for this study were age 18–35, absence of major neurological disorders, and enrollment in an undergraduate psychology or business course. There were no differences between the two groups in age [M Control = 20.6, SD = 2.6; M expMAL = 21.0, SD = 3.0; t(79) = 0.66, p = .514], education [M Control = 14.3, SD = 1.2; M expMAL = 14.7, SD = 1.2; t(79) = 1.53, p = .129] or proportion of females [84% vs 83%, χ²(1) = 0.08, p = .785].
Table 2. List of neuropsychological tests administered to the student sample (n = 81).

Name of the test | Abbreviation | Reference | Norms
Animal Fluency | Animals | Gladsjo et al., 1999 | Heaton
Boston Naming Test—Short Form (Abbreviated) | BNT-15 | Goodglass, Kaplan, & Barresi, 2001; Erdodi, Jongsma, et al., 2017 | Manual
Complex Ideational Material | CIM | Goodglass et al., 2001 | Heaton
Digit Span | DS (WAIS-III) | Wechsler, 1997 | Manual
Digit-Symbol Coding | CD (WAIS-III) | Wechsler, 1997 | Manual
Grooved Pegboard Test | GPB | Matthews & Klove, 1964 | Heaton
Hopkins Verbal Learning Test—Revised | HVLT-R | Brandt & Benedict, 2001 | Manual
Letter Fluency | FAS | Gladsjo et al., 1999 | Heaton
Rey 15-Item Test with Recognition | Rey-15 | Rey, 1941; Boone, Salazar, Lu, Warner-Chacon, & Razani, 2002 | –
Wisconsin Card Sorting Test—64 Card Version | WCST-64 | Kongs, Thompson, Iverson, & Heaton, 2000 | Manual
Single Word Reading | WRAT-4 | Wilkinson & Robertson, 2006 | Manual
Symbol-Digit Modalities Test | SDMT | Smith, 2007 | Manual
Trails 2 and 4 | D-KEFS | Delis, Kaplan, & Kramer, 2001 | Manual
Word Choice Test | WCT | Pearson, 2009 | –

Note. WAIS-III: Wechsler Adult Intelligence Scale—Third Edition; WRAT-4: Wide Range Achievement Test—Fourth Edition; D-KEFS: Delis–Kaplan Executive Function System; Heaton: demographically adjusted norms published by Heaton, Miller, Taylor, and Grant (2004).

Table 3. Components of the VI-7, cutoffs and corresponding base rates of failure in the student sample (n = 81).

VI-7 Component | Scale | Cutoff | BR Fail (%) | Reference
BNT-15 T2C | Raw | ≥85″ | 9.9 | An et al., 2019; Erdodi, Dunn, et al., 2018
CD (WAIS-III) | ACSS | ≤5 | 14.8 | Ashendorf, Clark, & Sugarman, 2017; Erdodi, Abeare, et al., 2017; Erdodi & Lichtenstein, 2017; Trueblood, 1994
CIM (BDAE) | Raw | ≤9 | 27.2 | An et al., 2019; Erdodi, 2019; Erdodi & Roth, 2017; Erdodi et al., 2016
FMS (WCST-64) | Raw | ≥2 | 8.6 | Greve, Bianchini, Mathias, Houston, & Crouch, 2002; Lichtenstein, Holcomb, et al., 2018; Suhr & Boyer, 1999
GPB DH | T | ≤29 | 27.2 | Erdodi, Kirsch, et al., 2018; Erdodi, Seke, et al., 2017
RD (HVLT-R) | Raw | ≤6 | 16.0 | Bailey, Soble, Bain, & Fullen, 2018; Sawyer, Testa, & Dux, 2017
Trails 2 (D-KEFS) | ACSS | ≤5 | 17.3 | Erdodi, Hurtubise, et al., 2018; Erdodi & Lichtenstein, 2019

Note. VI-7: Validity Index Seven; BNT-15: Boston Naming Test—Short Form; T2C: time to completion (seconds); CD (WAIS-III): Coding subtest of the Wechsler Adult Intelligence Scale—Third Edition; CIM (BDAE): Complex Ideational Material subtest of the Boston Diagnostic Aphasia Examination; FMS (WCST-64): Failures to Maintain Set on the 64-card version of the Wisconsin Card Sorting Test; GPB DH: Grooved Pegboard Test, dominant hand; RD (HVLT-R): Recognition Discrimination index of the Hopkins Verbal Learning Test—Revised (yes/no recognition trial true positives (hits) minus false positives); Trails 2 (D-KEFS): Trails 2 subtest (Number Sequencing) on the Delis–Kaplan Executive Function System; ACSS: age-corrected scaled score.

Clinical sample
To provide a direct comparison to patients with genuine neuropsychological deficits, the classification accuracy of the EVIs of interest was recomputed in a sample of 77 patients (57.1% male) clinically referred for neuropsychological testing following a TBI at a nearby academic medical center. The majority of the injuries (75.3%) were classified as mild based on available injury parameters (Glasgow Coma Scale, duration of loss of consciousness and peri-traumatic amnesia, intracranial abnormalities on neuroradiological findings). The remaining 24.7% were classified as moderate/severe. Patients were selected from a consecutive case sequence used in previous publications by the same research group (Abeare, Sabelli, et al., 2019; Erdodi & Abeare, 2019; Erdodi, Abeare, Medoff, et al., 2018; Erdodi, Roth, et al., 2014; Erdodi, Taylor, et al., 2019). Inclusion criteria were age 50 or younger and data available on animal fluency and the same free-standing PVT used in the student sample. Mean age was 32.7 years (SD = 10.5), while the mean level of education was 13.3 years (SD = 2.4). All patients were in the post-acute stage of recovery (>3 months since a mild TBI and >12 months since a moderate-to-severe TBI) and evaluated in an outpatient setting.

Materials

Student sample
Table 2 provides a list of neuropsychological tests administered to the student sample (n = 81). The Rey-15 and Word Choice Test (WCT) were the two free-standing PVTs. In addition, seven EVIs were aggregated into a single composite labeled "Validity Index Seven" (VI-7). Components of the VI-7 are presented in Table 3, along with references to the cutoffs used. The value of the VI-7 is the cumulative number of failures across all seven components. As such, possible scores range from 0 (all seven PVTs passed) to 7 (all seven PVTs failed). The majority of the sample (72.8%) scored in the Pass range (≤1) on the VI-7; a small proportion (6.2%) scored in the Borderline range (2), and one fifth (21.0%) scored in the Fail range (≥3). To maintain the purity of the criterion groups, participants in the Borderline range were excluded from analyses in which the VI-7 served as the dichotomous (Pass/Fail) reference PVT, consistent with emerging practice standards (Abeare, Messa, et al., 2019; Axelrod, Meyers, & Davis, 2014; Erdodi, 2019; Schroeder, Martin, Heinrichs, & Baade, 2019; Sugarman & Axelrod, 2015; Whiteside, Kogan, et al., 2015).
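To make the aggregation logic concrete, the sketch below scores a hypothetical examinee against VI-7-style component cutoffs and bands the total into the Pass/Borderline/Fail ranges described above. It is a minimal illustration keyed to Table 3; the dictionary keys, cutoff directions, and example scores are ours, not the authors' scoring code.

```python
# Illustrative sketch of VI-7-style scoring: count failures across the
# seven embedded validity indicators, then band the total into the
# Pass (<=1) / Borderline (2) / Fail (>=3) ranges described in the text.
# Component names, cutoff directions, and the example scores are ours.

VI7_CUTOFFS = {
    "BNT15_T2C":    lambda raw:  raw >= 85,   # time to completion (s)
    "CD_ACSS":      lambda acss: acss <= 5,   # WAIS-III Coding
    "CIM_raw":      lambda raw:  raw <= 9,    # Complex Ideational Material
    "FMS_WCST64":   lambda raw:  raw >= 2,    # Failures to Maintain Set
    "GPB_DH_T":     lambda t:    t <= 29,     # Grooved Pegboard, dominant hand
    "RD_HVLTR":     lambda raw:  raw <= 6,    # Recognition Discrimination
    "Trails2_ACSS": lambda acss: acss <= 5,   # D-KEFS Number Sequencing
}

def score_vi7(scores):
    """Return (number of component failures, Pass/Borderline/Fail band)."""
    failures = sum(fails(scores[name]) for name, fails in VI7_CUTOFFS.items())
    if failures <= 1:
        band = "Pass"
    elif failures == 2:
        band = "Borderline"   # excluded when VI-7 is the criterion PVT
    else:
        band = "Fail"
    return failures, band

# Hypothetical examinee: fails Coding, FMS, and Trails 2 -> (3, 'Fail')
example = {"BNT15_T2C": 62, "CD_ACSS": 4, "CIM_raw": 10, "FMS_WCST64": 3,
           "GPB_DH_T": 41, "RD_HVLTR": 7, "Trails2_ACSS": 5}
print(score_vi7(example))
```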
Clinical sample
Patients were administered a comprehensive battery of neuropsychological tests. However, data on FAS and certain Digit Span variables (longest span forward and backward) were not available. As both samples were administered the WCT, it was used as the shared criterion PVT. Additionally, a composite validity index was created mirroring the VI-7, based on EVIs available in the data set (Table 4). The majority of the sample (58.4%) scored in the Pass range (≤1) on the VI-7. A small proportion (13.0%) scored in the Borderline range (2), and 28.6% scored in the Fail range (≥3).
Table 4. Components of the VI-7, cutoffs and corresponding base rates of failure in the clinical sample (n = 77).

VI-7 Component | Scale | Cutoff | BR Fail (%) | Reference
CD (WAIS-IV) | ACSS | ≤5 | 26.0 | Ashendorf et al., 2017; Erdodi & Abeare, 2019; Erdodi, Abeare, et al., 2017; Erdodi & Lichtenstein, 2017; Trueblood, 1994
FCR (CVLT-II) | Raw | ≤15 | 24.7 | Erdodi, Abeare, et al., 2018; Persinger et al., 2018; Schwartz et al., 2016
FMS (WCST) | Raw | ≥2 | 11.8 | Greve et al., 2002; Lichtenstein, Holcomb, et al., 2018; Suhr & Boyer, 1999
FTT DH | T | ≤31 | 11.7 | Erdodi, Taylor, et al., 2019
GPB DH | T | ≤29 | 25.0 | Erdodi, Kirsch, et al., 2018; Erdodi, Seke, et al., 2017
LNS (WAIS-IV) | ACSS | ≤7 | 25.7 | Erdodi & Abeare, 2019; Shura et al., 2016
SS (WAIS-IV) | ACSS | ≤6 | 29.9 | Erdodi, Abeare, et al., 2017; Erdodi & Abeare, 2019; Trueblood, 1994

Note. VI-7: Validity Index Seven; CD (WAIS-IV): Coding subtest of the Wechsler Adult Intelligence Scale—Fourth Edition; FCR (CVLT-II): Forced-Choice Recognition on the California Verbal Learning Test—Second Edition; FMS (WCST): Failures to Maintain Set on the Wisconsin Card Sorting Test; FTT DH: Finger Tapping Test, dominant hand; GPB DH: Grooved Pegboard Test, dominant hand; LNS: Letter-Number Sequencing; SS: Symbol Search; ACSS: age-corrected scaled score.

Procedure

Student sample
Testing was completed in a quiet, distraction-free room. Informed consent was obtained from all participants prior to psychometric testing. Participants assigned to the expMAL condition were provided a script that instructed them to feign credible cognitive impairments in the fictional context of personal injury litigation in which they were the victim of a motor vehicle collision. The scenario has been previously used by this research group (An et al., 2019; Rai et al., 2019) and was modeled after comparable vignettes developed by DenBoer and Hall (2007).

At the end of testing, all participants completed a manipulation check. They were presented with a paper-and-pencil questionnaire that included the open-ended prompt "explain what you were asked to do in this study" and a request to rate their compliance with task instructions on a scale from 0 to 10. Participants in the expMAL condition had additional items. First, they were asked to rate how well they could relate to the imagined initial script on a scale from 0 to 10. Second, they were asked to select which strategies they used to pretend to be impaired during testing from a variety of options (including "I didn't pretend"). The study was approved by the University's Research Ethics Board.

Clinical sample
Patients were evaluated at the outpatient neurorehabilitation service of a large academic medical center. Neuropsychological tests were administered and scored by Master's level psychometrists under the supervision of a licensed clinical neuropsychologist. The evaluation was completed in two half-day (i.e., four-hour) appointments. Only de-identified data were collected for research purposes. The project was approved by the Institutional Review Board of the hospital.

Data analysis
Descriptive statistics [M, SD, base rates of PVT failure (BR Fail), risk ratios (RR)] were reported where relevant. The inferential statistics were independent-samples t-tests and χ². The assumption of homogeneity of variance was examined with Levene's test. Effect size estimates were expressed as Cohen's d and φ². Given that studies based on the expMAL paradigm tend to produce predictably inflated effect sizes, the lower limit for a moderate effect has been redefined as .75, for a large effect as 1.25, and for a very large effect as 1.75 (Rogers, Sewell, Martin, & Vitacco, 2003). Receiver operating characteristics [area under the curve (AUC), 95% CIs] were computed in SPSS version 25.0. AUC values in the .70–.79 range are considered acceptable, whereas values ≥.90 are considered outstanding (Hosmer & Lemeshow, 2000). Sensitivity and specificity values were calculated using standard formulas. The minimum acceptable level of specificity is .84 (Larrabee, 2003), but values ≥.90 are desirable and becoming the emerging norm (Boone, 2013).

To examine the potential contribution of process variables within verbal fluency to differentiating valid from invalid responding, logistic regression classifiers (LRC) were generated using MATLAB's generalized-linear-model package glmfit. They were cross-validated via n = 1000 repeated cross-validation on an 80:20 training-set versus testing-set split with binomial classification using expMAL status as the criterion. Stepwise analysis was performed based on an embedded t-test analysis testing the null hypothesis that the parameter coefficients do not significantly differ from zero (i.e., offer no significant contribution to LRC performance).
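As a concrete illustration of the "standard formulas" mentioned above, here is a minimal Python sketch of sensitivity and specificity for a single validity cutoff. The authors worked in SPSS and MATLAB; the helper name and the simulated scores below are purely illustrative assumptions.

```python
# Minimal sketch of the "standard formulas": sensitivity and specificity
# of a PVT cutoff, treating invalid responding as the positive class and
# scores at or below the cutoff as failures. Data are simulated.
import numpy as np

def sens_spec(scores, is_invalid, cutoff):
    scores = np.asarray(scores)
    is_invalid = np.asarray(is_invalid, dtype=bool)
    fail = scores <= cutoff                     # e.g., Animals T-score <= 31
    sens = (fail & is_invalid).sum() / is_invalid.sum()
    spec = (~fail & ~is_invalid).sum() / (~is_invalid).sum()
    return sens, spec

rng = np.random.default_rng(0)
t_scores = np.concatenate([rng.normal(45, 9, 57),     # credible responders
                           rng.normal(30, 14, 24)])   # simulated feigners
invalid = np.array([False] * 57 + [True] * 24)
print(sens_spec(t_scores, invalid, cutoff=31))
```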
Results

Comparing the two samples: demographic characteristics and neuropsychological profiles

The student sample was significantly younger (d = 1.57, very large effect) and more educated (d = 0.57, medium effect). Also, a significantly lower proportion of students was male (14.8%) compared to the clinical sample (57.1%). Although single word reading performance was in the Average range in both samples, the students had a significantly higher mean score (d = 0.56, medium effect). Likewise, the student sample scored above the TBI sample on most neuropsychological tests (Table 5). Interestingly, there was no difference in animal fluency performance between the two samples.
Table 5. Comparing the student and clinical samples on demographic and neuropsychological variables.

Variable | Student: n, M, SD | Clinical: n, M, SD | t | p | d
Age | 81, 20.7, 2.7 | 77, 32.7, 10.5 | 9.95 | <.001 | 1.57
Education | 81, 14.4, 1.3 | 77, 13.3, 2.4 | 3.61 | <.001 | 0.57
% Male | 81, 14.8, – | 77, 57.1, – | 5.56 | <.001 | –
Reading WRAT-4* | 57, 100.8, 13.6 | 77, 93.6, 12.2 | 3.22 | .002 | 0.56
WCT* | 57, 49.6, 1.0 | 77, 47.8, 3.6 | 3.67 | <.001 | 0.72
RDS* | 57, 9.9, 1.6 | 77, 9.2, 2.2 | 2.04 | .044 | 0.36
Animals T-score* | 57, 44.7, 9.5 | 77, 42.3, 12.5 | 1.21 | .227 | –
GPB DH* | 57, 43.3, 11.9 | 77, 38.8, 13.0 | 2.05 | .042 | 0.36
DS (WAIS-III & IV)* | 57, 10.0, 2.5 | 77, 8.7, 2.8 | 2.78 | .006 | 0.49
CD (WAIS-III & IV)* | 57, 10.9, 2.8 | 77, 8.0, 3.4 | 5.25 | <.001 | 0.93

Note. *Scores on neuropsychological tests within the student sample were only reported for the control group (n = 57); Reading WRAT-4: Single Word Reading subtest of the Wide Range Achievement Test—Fourth Edition; WCT: Word Choice Test; RDS: Reliable Digit Span; GPB DH: Grooved Pegboard Test, dominant hand (demographically adjusted T-score); DS (WAIS-III & IV): age-corrected scaled score of the Digit Span subtest of the Wechsler Adult Intelligence Scale—Third (students) and Fourth (patients) Editions; CD: Coding age-corrected scaled score.

Table 6. The effect of experimental malingering on verbal fluency, Digit Span, PVTs and select neuropsychological tests in the student sample (n = 81).

Test/Variable | Control (n = 57): M (SD) | expMAL (n = 24): M (SD) | t | p | d | Levene p (σ1² vs. σ2²)
FAS | 42.1 (9.3) | 31.8 (10.0) | 4.47 | <.001 | 1.07 | .746
Animals | 44.7 (9.5) | 29.6 (14.5) | 5.50 | <.001 | 1.23 | .011
LDF | 5.9 (1.0) | 3.9 (2.0) | 6.01 | <.001 | 1.26 | <.001
LDB | 4.1 (0.9) | 2.7 (1.4) | 5.03 | <.001 | 1.19 | .074
RDS | 9.9 (1.6) | 6.6 (3.2) | 6.15 | <.001 | 1.30 | <.001
ACSS Digit Span | 10.0 (2.5) | 6.3 (3.3) | 5.47 | <.001 | 1.26 | .032
LDF–LDB | 1.8 (1.1) | 1.2 (1.2) | 2.45 | .017 | 0.62 | .309
WCT | 49.6 (1.0) | 40.2 (7.8) | 8.98 | <.001 | 1.69 | <.001
Rey-15 | 28.9 (2.0) | 21.1 (8.6) | 6.44 | <.001 | 1.25 | <.001
VI-7 | 0.5 (0.9) | 3.0 (2.2) | 7.33 | <.001 | 1.49 | <.001
Reading WRAT-4 | 100.8 (13.6) | 94.7 (17.8) | 1.67 | .099 | 0.39 | .169
CAT WCST-64 | 3.7 (1.1) | 2.4 (1.7) | 4.09 | .001 | 0.91 | .004
SDMT Written | 1.62 (1.05) | −1.32 (2.01) | 8.26 | <.001 | 1.83 | <.001
HVLT-R 1–3 | 25.9 (3.8) | 21.0 (7.6) | 3.88 | <.001 | 0.82 | <.001
HVLT-R DR | 8.8 (2.1) | 5.3 (3.2) | 5.77 | <.001 | 1.29 | .004
Trails 4 (D-KEFS) | 9.4 (2.8) | 6.8 (3.8) | 3.43 | .001 | 0.78 | .008

Note. LDF: longest digits forward; LDB: longest digits backward; RDS: Reliable Digit Span; ACSS: age-corrected scaled score (M = 10, SD = 3); Reading WRAT-4: Single Word Reading subtest on the Wide Range Achievement Test—Fourth Edition standard score (M = 100, SD = 15); CAT WCST-64: categories completed on the Wisconsin Card Sorting Test—64 Card Version; SDMT: Symbol-Digit Modalities Test z-score (M = 0.00, SD = 1.0); HVLT-R: Hopkins Verbal Learning Test—Revised (raw scores); Trails 4 (D-KEFS): Trails 4 (Number-Letter Switching) on the Delis–Kaplan Executive Function System, ACSS; expMAL: experimental malingering; σ1² vs. σ2²: the p-value associated with Levene's test of homogeneity of variance.
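For readers who want to reproduce contrasts of the kind reported in Table 6, the sketch below pairs an independent-samples t-test and a pooled-SD Cohen's d with Levene's test for the variance column. The exact formulas the authors used are not spelled out beyond the text, and the data here are simulated; this is one reasonable operationalization, not the study's analysis script.

```python
# Sketch of Table 6-style group contrasts: independent-samples t-test,
# pooled-SD Cohen's d, and Levene's test for the variance column.
# Simulated data; the authors' exact computations are not specified.
import numpy as np
from scipy import stats

def contrast(control, expmal):
    t, p = stats.ttest_ind(control, expmal)
    n1, n2 = len(control), len(expmal)
    pooled_sd = np.sqrt(((n1 - 1) * np.var(control, ddof=1) +
                         (n2 - 1) * np.var(expmal, ddof=1)) / (n1 + n2 - 2))
    d = (np.mean(control) - np.mean(expmal)) / pooled_sd
    levene_p = stats.levene(control, expmal).pvalue
    return t, p, d, levene_p

rng = np.random.default_rng(1)
ctrl = rng.normal(9.9, 1.6, 57)   # e.g., RDS, control group
mal = rng.normal(6.6, 3.2, 24)    # e.g., RDS, expMAL group
print(contrast(ctrl, mal))
```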

The effect of expMAL on verbal fluency and digit span performance in the student sample

Participants in the control group scored significantly higher on all measures compared to those in the expMAL condition (Table 6). Effect sizes ranged from medium [d = .62; longest digit span forward (LDF) minus longest digit span backward (LDB)] to large [d = 1.30; reliable digit span (RDS)]. In addition, expMAL was associated with significantly higher within-group variability on animal fluency, LDF, RDS, and Digit Span age-corrected scaled scores (ACSS).

Predictably, performance among participants in the expMAL condition was significantly below the normative mean on both letter (d = 1.82, very large) and animal fluency (d = 1.64, very large). Likewise, participants scored below the norms on Digit Span ACSS (d = 1.10, moderate effect). Interestingly, even the control group scored significantly lower than the normative mean on both letter (d = 0.81, large) and animal fluency (d = 0.54, medium effect). However, they performed within the expected range on Digit Span ACSS (Table 6).

The effect of expMAL on free-standing/composite PVTs and tests of cognitive ability in the student sample

A large effect was observed on all three criterion PVTs (d: 1.25–1.69). Participants in the expMAL condition also produced significantly higher within-group variability. The difference on the single word reading test was non-significant. However, significant contrasts emerged on tests of concept formation (d = 0.91, moderate effect), graphomotor processing speed (d = 1.83, very large effect), and visuomotor scanning and switching (d = 0.78, moderate effect). On an auditory verbal learning test, there was a moderate effect (d = 0.82) on the acquisition trials and a large effect on the delayed free recall trial (d = 1.29). As before, expMAL was associated with significantly higher within-group variability.
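The Digit Span indices named above are simple derivations from the subtest itself. The sketch below implements their conventional definitions (LDF/LDB as the longest span with at least one of the two trials correct; RDS as the sum of the longest forward and backward spans with both trials correct); the trial bookkeeping format is our own illustration, not the study's data structure.

```python
# Sketch of the Digit Span derivative indices under their conventional
# definitions: LDF/LDB are the longest spans with at least one of the two
# trials correct; RDS sums the longest forward and backward spans with
# BOTH trials correct. The input format is illustrative.

def longest_span(trials, require_both):
    """trials maps span length -> (trial1_correct, trial2_correct)."""
    passed = [length for length, (t1, t2) in trials.items()
              if ((t1 and t2) if require_both else (t1 or t2))]
    return max(passed, default=0)

forward = {3: (True, True), 4: (True, True), 5: (True, False), 6: (False, False)}
backward = {2: (True, True), 3: (True, False), 4: (False, False)}

ldf = longest_span(forward, require_both=False)            # 5
ldb = longest_span(backward, require_both=False)           # 3
rds = (longest_span(forward, require_both=True) +
       longest_span(backward, require_both=True))          # 4 + 2 = 6
print(ldf, ldb, ldf - ldb, rds)                            # 5 3 2 6
```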
Table 7. Overall classification accuracy of verbal fluency and Digit Span variables against criterion measures in the student sample (n = 81).

Test/Variable | expMAL: AUC (95% CI) | WCT: AUC (95% CI) | Rey-15: AUC (95% CI) | VI-7: AUC (95% CI)
FAS | .77 (.65–.89) | .68 (.53–.83) | .64 (.47–.81) | .64 (.48–.81)
Animals | .80 (.69–.91) | .88 (.78–.98) | .74 (.58–.90) | .86 (.74–.98)
LDF | .81 (.71–.92) | .90 (.81–.99) | .83 (.68–.98) | .83 (.69–.96)
LDB | .79 (.68–.91) | .84 (.73–.96) | .86 (.76–.97) | .75 (.60–.90)
RDS | .82 (.71–.94) | .90 (.80–.99) | .87 (.74–.99) | .80 (.66–.95)
ACSS | .80 (.89–.91) | .89 (.81–.98) | .82 (.68–.95) | .80 (.65–.95)
LDF–LDB | .64 (.51–.77) | .71 (.58–.85) | .64 (.46–.81) | .71 (.57–.85)

Note. LDF: longest digits forward; LDB: longest digits backward; RDS: Reliable Digit Span; ACSS: age-corrected scaled score; expMAL: experimental malingering; WCT: Word Choice Test (Pearson, 2009; Fail defined as ≤45; Barhon, Batchelor, Meares, Chekaluk, & Shores, 2015; Bain & Soble, 2019; Davis, 2014; Erdodi et al., 2014; Erdodi & Lichtenstein, 2019; Zuccato, Tyson, & Erdodi, 2018); Rey-15: Rey Fifteen-Item Test (Rey, 1941) combination score (free recall + recognition hits; Fail defined as ≤23; Boone et al., 2002; Poynter et al., 2019); VI-7: Validity Index Seven (Fail defined as ≥3; Erdodi, Kirsch, et al., 2018; Lichtenstein, Flaro, et al., 2019).

Table 8. Classification accuracy of select verbal fluency and Digit Span cutoff scores in the student sample (n = 81).

Criterion measures (BR Fail, %): expMAL (29.6), WCT (21.2), Rey-15 (18.8), VI-7 (21.0)

Test/Variable | Cutoff | BR Fail | expMAL SENS/SPEC | WCT SENS/SPEC | Rey-15 SENS/SPEC | VI-7 SENS/SPEC
FAS | ≤33 | 28.7 | .58/.84 | .47/.76 | .47/.75 | .47/.76
FAS | ≤31 | 22.5 | .54/.91 | .41/.82 | .40/.81 | .41/.83
FAS | ≤29 | 16.3 | .42/.95 | .41/.90 | .40/.89 | .41/.91
FAS | ≤27 | 8.8 | .29/1.00 | .29/.97 | .27/.95 | .24/.95
Animals | ≤33 | 23.8 | .58/.91 | .76/.90 | .60/.84 | .81/.91
Animals | ≤31 | 21.3 | .54/.93 | .71/.92 | .53/.86 | .71/.93
Animals | ≤29 | 15.0 | .42/.96 | .59/.97 | .53/.94 | .59/.98
LDF | ≤4 | 18.8 | .50/.95 | .71/.95 | .73/.95 | .65/.95
LDF | ≤3 | 11.3 | .38/1.00 | .47/.98 | .53/.98 | .47/1.00
LDB | ≤2 | 3.8 | .38/1.00 | .47/.98 | .53/.98 | .47/.98
RDS | ≤7 | 18.8 | .54/.96 | .71/.95 | .80/.97 | .59/.93
RDS | ≤6 | 11.3 | .38/1.00 | .47/.98 | .53/.98 | .47/1.00
RDS | ≤5 | 10.0 | .33/1.00 | .47/1.00 | .47/.98 | .47/1.00
ACSS | ≤6 | 18.8 | .54/.96 | .65/.94 | .67/.94 | .65/.95
ACSS | ≤5 | 11.3 | .38/1.00 | .47/.98 | .53/.98 | .47/1.00
ACSS | ≤4 | 10.0 | .33/1.00 | .41/.98 | .47/.98 | .41/1.00
LDF–LDB | ≤0 | 18.8 | .33/.88 | .35/.85 | .33/.84 | .41/.88
LDF–LDB | ≤−1 | 1.3 | .04/1.00 | .06/1.00 | .07/1.00 | .06/1.00

Note. BR Fail: base rate of failure (% of the sample that failed a given cutoff); expMAL: experimental malingering; WCT: Word Choice Test (Pearson, 2009; Fail defined as ≤45; Barhon et al., 2015; Bain & Soble, 2019; Davis, 2014; Erdodi et al., 2014; Erdodi & Lichtenstein, 2019; Zuccato, Tyson, & Erdodi, 2018); Rey-15: Rey Fifteen-Item Test (Rey, 1941) combination score (free recall + recognition hits; Fail defined as ≤23; Boone et al., 2002; Poynter et al., 2019); VI-7: Validity Index Seven (Fail defined as ≥3; Erdodi, Kirsch, et al., 2018; Lichtenstein, Flaro, et al., 2019); SENS: sensitivity; SPEC: specificity.
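Conceptually, Tables 8 and 9 are the output of a simple cutoff sweep: for each candidate cutoff, compute the base rate of failure and the sensitivity/specificity against a dichotomous criterion. The sketch below illustrates the mechanics under simulated data; the helper and variable names are ours.

```python
# Sketch of the cutoff sweep behind Tables 8-9: for each candidate cutoff,
# report the base rate of failure (BRFail) and sensitivity/specificity
# against a dichotomous criterion PVT. Simulated data.
import numpy as np

def sweep(scores, invalid, cutoffs):
    scores = np.asarray(scores)
    invalid = np.asarray(invalid, dtype=bool)
    for c in cutoffs:
        fail = scores <= c
        br = 100 * fail.mean()
        sens = (fail & invalid).sum() / invalid.sum()
        spec = (~fail & ~invalid).sum() / (~invalid).sum()
        print(f"<={c:>3}  BRFail {br:4.1f}%  SENS {sens:.2f}  SPEC {spec:.2f}")

rng = np.random.default_rng(2)
animals_t = np.concatenate([rng.normal(45, 9, 57), rng.normal(30, 14, 24)])
criterion_fail = np.array([False] * 57 + [True] * 24)   # e.g., WCT Pass/Fail
sweep(animals_t, criterion_fail, cutoffs=[33, 31, 29])
```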

ROC curves for verbal fluency and digit span variables in the student sample

The FAS was a significant predictor of expMAL and Pass/Fail status on the WCT, but not the Rey-15 and the VI-7 (Table 7). In contrast, animal fluency produced significant AUCs (.74–.88) against all criterion measures. LDF, LDB, RDS and Digit Span ACSS were significant predictors of all criterion measures (AUCs: .75–.90). However, LDF–LDB had significantly lower AUC values (.64–.71) and failed to reach significance against the Rey-15.

Classification accuracy of verbal fluency cutoffs in the student sample

The FAS T-score ≤33 and ≤31 cutoffs failed to reach minimum specificity against the WCT, Rey-15, and VI-7 (Table 8). Lowering the cutoff to ≤29 achieved a good combination of sensitivity (.40–.42) and specificity (.89–.95). Making the cutoff more conservative (≤27) disproportionately sacrificed sensitivity (.24–.29) for specificity (.95–1.00).

An animal fluency T-score of ≤33 achieved minimum specificity (.84–.91) against all criterion PVTs, with respectable sensitivity (.58–.81). Lowering the cutoff to ≤31 resulted in the predictable tradeoff: improved specificity (.86–.93) and decreased sensitivity (.53–.71). Making the cutoff more conservative (≤29) resulted in a balanced recalibration of sensitivity (.42–.59) and specificity (.94–.98).

Relative contribution of process variables and multivariate models in the student sample

In an attempt to replicate the findings of Johnson et al. (2012), we first examined whether the decline in output offered any additional benefit over total scores in predicting expMAL status. Then we explored whether combining both
Table 9. Classification accuracy of select verbal fluency and Digit Span cutoff scores in the clinical sample (n = 77).

Criterion measures (BR Fail, %): WCT (Fail defined as ≤45; 19.5), VI-7 (Fail defined as ≥3; 32.8)

Test/Variable | Cutoff | BR Fail | WCT SENS/SPEC | VI-7 SENS/SPEC
Animals T-score [AUC .74 (95% CI .63–.86) vs WCT; AUC .74 (95% CI .61–.87) vs VI-7]
 | ≤33 | 23.4 | .33/.79 | .36/.82
 | ≤31 | 14.3 | .27/.89 | .27/.91
 | ≤29 | 11.7 | .27/.94 | .27/.96
 | ≤27 | 10.4 | .27/.94 | .23/.96
 | ≤25 | 6.4 | .20/.97 | .14/.96
Reliable Digit Span [AUC .77 (95% CI .64–.91) vs WCT; AUC .78 (95% CI .66–.89) vs VI-7]
 | ≤7 | 20.8 | .47/.86 | .27/.89
 | ≤6 | 9.1 | .27/.95 | .18/.98
 | ≤5 | 3.9 | .07/.95 | .05/.98
Digit Span ACSS [AUC .83 (95% CI .72–.94) vs WCT; AUC .85 (95% CI .75–.95) vs VI-7]
 | ≤6 | 22.1 | .53/.86 | .50/.96
 | ≤5 | 14.3 | .40/.92 | .27/.96
 | ≤4 | 9.1 | .27/.95 | .14/.96

Note. BR Fail: base rate of failure (% of the sample that failed a given cutoff); WCT: Word Choice Test (Pearson, 2009; Fail defined as ≤45; Barhon et al., 2015; Bain & Soble, 2019; Davis, 2014; Erdodi et al., 2014; Erdodi & Lichtenstein, 2019; Zuccato, Tyson, & Erdodi, 2018); VI-7: Validity Index Seven (Fail defined as ≥3; Erdodi, Kirsch, et al., 2018; Lichtenstein, Flaro, et al., 2019); SENS: sensitivity; SPEC: specificity; ACSS: age-corrected scaled score; AUC: area under the curve.

letter and category fluency metrics improved signal detection accuracy over each task being used in isolation, employing a stepwise LRC and subsequent ROC analysis. The number of correct responses within the 60-s time limit was split into four 15-s intervals. The difference between the first [0–15 s] and second quarter [16–30 s] was used as an estimate of the decline in response output over time. The discrepancy between these two intervals was chosen because they offered the largest and most consistent decline in output, maximizing the signal-to-noise ratio for classification while preserving the metric's simplicity for potential use in the clinical setting.

The following parameters were loaded into the first LRC: animals total score, FAS total score, animals declining output, and FAS declining output. Subsequent t-tests on the LRC coefficients were only significant for the animals (p < 0.012) and FAS (p < 0.045) total scores. Therefore, the declining output parameters were discarded in a stepwise fashion. A new LRC was generated with the animals total score and the FAS total score combined to explore the potential benefit of aggregating the two metrics. Overall classification accuracy for the combined LRC was 77.6%, a negligible advantage over the models based on animals or FAS alone (75.1–74.6%).
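Read procedurally, the analysis above has two ingredients: a decline score derived from the first two 15-s bins, and a logistic regression classifier scored over repeated random 80:20 splits. The sketch below mirrors that logic with scikit-learn standing in for MATLAB's glmfit; the simulated response counts, group labels, and feature construction are illustrative assumptions, not the study's data or code.

```python
# Sketch of the process-variable analysis: derive a decline score from the
# first two 15-s bins, then score a logistic regression classifier over
# 1000 random 80:20 train/test splits (scikit-learn standing in for
# MATLAB's glmfit). Response counts and group labels are simulated.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit

rng = np.random.default_rng(3)
bins = rng.poisson(lam=[8, 5, 4, 3], size=(81, 4))        # words per 15-s bin
bins[57:] = rng.poisson(lam=[5, 3, 2, 2], size=(24, 4))   # suppressed output
total = bins.sum(axis=1)                                  # 60-s total score
decline = bins[:, 0] - bins[:, 1]                         # quarter 1 - quarter 2
X = np.column_stack([total, decline])
y = np.array([0] * 57 + [1] * 24)                         # expMAL status

accuracies = []
splitter = ShuffleSplit(n_splits=1000, test_size=0.2, random_state=0)
for train, test in splitter.split(X):
    model = LogisticRegression().fit(X[train], y[train])
    accuracies.append(model.score(X[test], y[test]))
print(f"mean classification accuracy: {np.mean(accuracies):.3f}")
```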
cutoffs in the clinical sample
Animal fluency T-scores, RDS and Digit Span ACSS pro-
Classification accuracy of digit span cutoffs in the
duced significant AUC against both the WCT and VI-7
student sample
(.74–.85). Animals T  33 failed to achieve minimum specifi-
An LDF cutoff 4 produced uniformly high specificity (.95) city (.79–.82). Lowering the cutoff to T  31 improved speci-
and moderate sensitivity (.50–.73). Making the cutoff more ficity (.89–.91), at .27 sensitivity. Making the cutoff even
conservative (3) disproportionately traded sensitivity more conservative (T  29) achieved excellent specificity
(.38–.53) for specificity (.98–1.00). The only LDB cutoff that (.94–.96) without compromising sensitivity (Table 9). The
achieved a reasonable balance of sensitivity (.38–.53) and liberal RDS cutoff (7) cleared the minimum specificity
specificity (.98–1.00) was 2 (Table 8). threshold (.86–.89) at .27–.47 sensitivity. The conservative
Table 10. Base rates of PVT failure as a function of TBI severity in the clinical sample (n = 77).

PVT | Scale | Cutoff | Mild (BR Fail, %) | M/S (BR Fail, %) | χ² | p | φ² | RR
WCT | Raw | ≤45 | 22.4 | 10.5 | 1.29 | .256 | .017 | 2.13
VI-7 | Raw | ≥3 | 39.2 | 12.5 | 3.94 | .047 | .047 | 3.14
Animals | T | ≤31 | 12.1 | 21.1 | 0.94 | .331 | .012 | 0.57
Animals | T | ≤29 | 10.0 | 15.8 | 0.41 | .521 | .005 | 0.63
RDS | Raw | ≤7 | 24.1 | 10.5 | 1.61 | .204 | .021 | 2.30
RDS | Raw | ≤6 | 12.1 | 0.0 | 2.52 | .112 | .033 | NA
Digit Span | ACSS | ≤6 | 29.3 | 0.0 | 7.15 | .008 | .093 | NA
Digit Span | ACSS | ≤5 | 19.0 | 0.0 | 4.20 | .040 | .055 | NA

Note. TBI: traumatic brain injury; PVT: performance validity test; WCT: Word Choice Test; VI-7: Validity Index Seven; RDS: Reliable Digit Span; ACSS: age-corrected scaled score; M/S: moderate-to-severe; RR: risk ratio (base rate of failure among mild TBI patients divided by the base rate of failure among moderate-to-severe TBI patients).

Base rates of PVT failure as a function of injury severity in the clinical sample

Given that EVIs are vulnerable to the confluence of genuine cognitive impairment and non-credible responding by design (Silverberg et al., 2008), BR Fail values were compared between patients with mild TBI and those with moderate or severe TBI. The mild TBI subsample had a higher BR Fail on both the WCT and the VI-7 (RR: 2.13–3.14), although the former contrast failed to reach statistical significance (Table 10). In contrast, patients with moderate or severe TBI failed animal fluency validity cutoffs at a higher rate than patients with mild TBI (RR: 1.58–1.74, non-significant contrasts). At RDS ≤7, BR Fail was higher in the mild TBI subsample (RR = 2.30, non-significant contrast). No patient with moderate or severe TBI failed the RDS ≤6 cutoff. Similarly, Digit Span ACSS ≤6 and ≤5 had perfect specificity (i.e., a zero BR Fail) in the subsample with moderate or severe TBI (significant contrasts).

Discussion

Review of main findings

To our knowledge, this is the first study to evaluate the signal detection performance of validity cutoffs on verbal fluency measures in a non-clinical sample using the expMAL paradigm. We predicted that classification accuracy would be higher than what was reported in previous research on clinical patients, that verbal fluency based EVIs would be sensitive to TBI severity, and that an inverse dose-response relationship between failure rates on other PVTs and TBI severity would be observed. Overall, the results supported these hypotheses. On FAS, sensitivity was consistently higher (.40–.54 versus .05–.34) in the student sample with specificity fixed at the .90 level. Likewise, on animal fluency, higher sensitivity (.53–.81 versus .23–.42) was observed at .90 specificity. Animal fluency validity cutoffs were the only ones that had a higher BR Fail in patients with moderate/severe TBI, consistent with the reports of Curtis et al. (2008) on the FAS. On all other PVTs, patients with mild TBI produced a higher BR Fail.

However, other aspects of the findings contradict our omnibus prediction. For example, the FAS cutoff T ≤ 33 failed to meet specificity standards in the current sample (.75–.84), although it was specific (.92) to non-credible responding in the mild TBI subsample in the Curtis et al. (2008) study. Similarly, in our sample, the FAS cutoff T ≤ 31 failed to reach minimum specificity against three of the four criterion PVTs (.81–.83). In contrast, the same cutoff had .87 specificity even in patients with moderate-to-severe TBI in the study by Curtis et al. (2008).

Process variables and multivariate models using verbal fluency measures

Despite previous reports that component analysis of verbal fluency measures provides unique and clinically useful information to characterize the pattern of cognitive deficits in patients with TBI (Zakzanis, McDonald, & Troyer, 2011, 2013), process variables failed to demonstrate a meaningful psychometric advantage over traditional summary scores in predicting non-credible responding. Beyond the total scores, results of the logistic regression analyses suggest that there is no statistical benefit to tracking the decline in output over time on verbal fluency measures as a derivative index of response validity. Similarly, the small gain in classification accuracy hardly justifies the burden of computing and interpreting LRCs that combine the predictive power of the animals and FAS total scores. Using univariate cutoffs appears to be the most parsimonious solution, from a statistical, practical, and clinical perspective.

This finding is in direct contradiction with a growing body of evidence on the advantages of combining multiple PVTs to determine the credibility of a neurocognitive profile, both in general (Bashem et al., 2014; Davis & Millis, 2014; Larrabee, 2008, 2014; Larrabee, Rohling, & Meyers, 2019; Lichtenstein, Greenacre, et al., 2019; Meyers et al., 2014; Odland, Lammy, Martin, Grote, & Mittenberg, 2015; Tyson et al., 2018) and for verbal fluency measures specifically (Johnson et al., 2012; Silverberg et al., 2008). A likely explanation for this divergence from the existing literature is collinearity. Process variables and the total scores on the two types of verbal fluency tests (category and letter) contribute highly redundant information and thus fail to enhance the overall predictive power of the more complex model. This negative finding is a sobering reminder of longstanding warnings about the importance of using independent PVTs (i.e., with low inter-correlation) in multivariate models of performance validity assessment (Nelson et al., 2003; Rosenfeld, Sands, & van Gorp, 2000).
Verbal fluency measures as EVIs in a broader context

The effect of expMAL on verbal fluency measures was comparable (d: 1.07–1.23) to that on Digit Span variables (d: 1.19–1.26), with the isolated exception of LDF–LDB. This pattern of findings is consistent with previous research suggesting that derivative validity indices either have inherently low sensitivity (Abeare, Sabelli, et al., 2019; Arnold et al., 2005; Axelrod, Fichtenberg, Millis, & Wertheimer, 2006; Erdodi, Hurtubise, et al., 2018; Erdodi, Sagar, et al., 2018; Iverson, Lange, Green, & Franzen, 2002; Lichtenstein, Flaro, Baldwin, Rai, & Erdodi, 2019; Merten, Bossink, & Schmand, 2007; Miller, Ryan, Carruthers, & Cluff, 2004; Powell, Locke, Smigielski, & McCrea, 2011) or their classification accuracy may be population-specific (Erdodi, Abeare, et al., 2017; Glassmire et al., 2019). Verbal fluency and Digit Span based EVIs were outperformed by free-standing and composite PVTs (d: 1.25–1.69), re-affirming the superior signal detection properties of the latter.

Consistent with the reports of Whiteside, Kogan, et al. (2015), animal fluency had higher AUC values (.74–.88) than the FAS (.64–.77). Compared to Digit Span based EVIs (except LDF–LDB), the FAS produced consistently lower AUCs (.64–.77 versus .75–.90), whereas animal fluency achieved similar AUC values (.74–.88). At the level of individual cutoffs, animal fluency produced consistently better classification accuracy (Table 8). This finding is also consistent with the results of the two previous studies that provided a direct comparison between animal fluency and the FAS (Sugarman & Axelrod, 2015; Whiteside, Kogan, et al., 2015).

Within-group variability as an index of performance validity

Reliably higher within-group variability associated with the expMAL condition was an incidental finding that replicates previous reports (Kanser, Rapport, Bashem, & Hanks, 2019; Larrabee et al., 2019; Rapport, Farchione, Coleman, & Axelrod, 1998). A possible explanation for this is a divergence in malingering strategy: participants may have used different templates for producing credible impairment (Cottingham, Victor, Boone, Ziegler, & Zeller, 2014; Erdodi, Kirsch, Lajiness-O'Neill, Vingilis, & Medoff, 2014). This line of reasoning underlines the importance of continuous monitoring of performance validity (Boone, 2009; Chafetz et al., 2015; Critchfield et al., 2019; Schutte, Axelrod, & Montoya, 2015) using multiple PVTs that represent a wide range of cognitive domains, sensory modalities, and testing paradigms (Boone, 2013; Erdodi, 2019).

Limitations of the experimental malingering paradigm

A more skeptical (and parsimonious) account of the inflated SDs in the expMAL group is incomplete comprehension, lack of experiential basis, or limited motivation for the optimal execution of the expMAL instructions (An et al., 2017). As previous investigators have pointed out, group assignment is a pseudo-independent variable in expMAL paradigms: the experimenter only controls the instructions given to the participants, not the fidelity of their execution (Rai et al., 2019). Case in point: 28.1% of participants in the control group failed ≥1 PVT, and 14.0% failed ≥2 PVTs. In other words, a quarter of the sample assumed to demonstrate valid performance had psychometric evidence to the contrary (Abeare, Messa, et al., 2019; Davis, Axelrod, McHugh, Hanks, & Millis, 2013; Proto et al., 2014). While this is consistent with previous reports on fluctuating effort in undergraduate research participants (An et al., 2017; An, Zakzanis, & Joordens, 2012; Roye et al., 2019), it provides objective evidence for criterion group contamination. Conversely, 25–40% of the expMAL subsample scored in the Average range or above on neuropsychological tests that are sensitive to diffuse neurocognitive deficits (Bialystok, Craik, Binns, Ossher, & Freedman, 2014; Curtis, Greve, & Bianchini, 2009; Donders & Strong, 2015; Savla et al., 2011; Tyson et al., 2018).

Arguably, the discrepancy between the request to produce credible impairment and the extent to which participants are willing or able to do so is the Achilles heel of research designs based on the expMAL paradigm. Anecdotally, some participants reported struggling with the increased cognitive demands of simultaneously feigning deficits while avoiding detection. In defense of participants assigned to the expMAL condition, their task does appear to be objectively more difficult. Research on the neurophysiological correlates of deception has repeatedly found increased overall activation (Browndyke et al., 2008; Suchotzki, Crombez, Smulders, Meijer, & Verschuere, 2015), often outside of the brain regions typically involved in performing the task "to the best of their ability" (Larsen, Allen, Bigler, Goodrich-Hunsaker, & Hopkins, 2010), suggesting that (perhaps even unsuccessful) malingering attempts are energetically more expensive (Yu, Tao, Zhang, Chan, & Lee, 2019).

Is single word reading immune to non-credible responding?

Another incidental finding was that performance on a single-word reading test was robust to the global deleterious effects of expMAL, replicating earlier reports (Coleman, Rapport, Millis, Ricker, & Farchione, 1998; Greiffenstein, Baker, & Gola, 1996; Kanser et al., 2017). In contrast, a more recent study found a strong linear relationship between standard scores on a different single-word reading test and PVT failures (Martin et al., 2018). The same research group also reported that an excessive decline from estimated pre-morbid functioning based on performance on a single-word reading test was an emerging index of non-credible responding (Martin, Hunter, Rach, Heinrichs, & Schroeder, 2017).

In a rare example of rigorous experimental control, Frazier, Frazier, Busch, Kerwood, and Demaree (2008) demonstrated the methodological complexity of isolating the effect of expMAL. Even the lowest performing group scored in the Average range. However, this was significantly below baseline testing (performance obtained outside of expMAL
APPLIED NEUROPSYCHOLOGY: CHILD 11

instructions). This finding suggests that a score within the normal range does not rule out non-credible responding, especially in cognitively high functioning examinees, such as athletes assessed to establish baseline cognitive functioning in sports concussion management programs (Abeare, Messa, Zuccato, Merker, & Erdodi, 2018; Abeare, Messa, et al., 2019; Gaudet & Weyandt, 2017; Higgins, Denney, & Maerlender, 2017; Iverson & Schatz, 2015; Lichtenstein, Linnea, & Maerlender, 2018; McCrea et al., 2003; Tsushima et al., 2019). Similarly, a large effect emerged as a function of the type of disorder being feigned. Given the widespread reliance on such measures to establish baseline cognitive abilities (Boone, 2013; Green et al., 2008; Johnstone, Callahan, Kapila, & Bouman, 1996; Mathias, Bowden, Bigler, & Rosenfeld, 2007; McFarlane, Welch, & Rodgers, 2006; Steward et al., 2017), more research is clearly needed to establish the relationship between single-word reading tests and invalid performance.

The attenuation of classification accuracy from student to clinical sample

Naturally, correctly identifying invalid performance presents a greater psychometric challenge in a clinical sample, where genuine deficits and non-credible responding can coexist. Nevertheless, only a marginal shrinkage in specificity was observed on animal fluency cutoffs from students to patients at T ≤ 31 (from .86–.93 to .89–.91) and T ≤ 29 (from .94–.98 to .94–.96). The loss of signal detection power was more apparent at RDS ≤ 7 (from .93–.97 among students to .86–.89 among patients with TBI). Although specificity values were comparable at RDS ≤ 6 between the two samples (.98–1.00 vs. .95–.98), sensitivity was notably higher in the student sample (.38–.53 vs. .18–.27). Finally, Digit Span ACSS ≤ 6 produced similar combinations of sensitivity and specificity in both samples, although classification accuracy was consistently higher among students. In the clinical sample, Digit Span ACSS had better classification accuracy than RDS. This finding is consistent with the most recent reports (Shura, Martindale, Taber, Higgins, & Rowland, 2019) and the cumulative evidence (Babikian, Boone, Lu, & Arnold, 2006; Kiewel, Wisdom, Bradshaw, Pastorek, & Strutt, 2012; Spencer et al., 2013; Webber & Soble, 2018; Whitney et al., 2009).
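The practical meaning of these sensitivity and specificity pairs also depends heavily on the base rate of invalid performance in the population being tested (cf. Rosenfeld, Sands, & van Gorp, 2000). The sketch below is offered purely as an illustration of that arithmetic; the operating characteristics and the 15% base rate are hypothetical assumptions, not values estimated in the present study:

```python
# Illustrative only: hypothetical cutoff characteristics and base rate,
# not estimates from the present study.

def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV) for a single validity cutoff via Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = (1 - specificity) * (1 - base_rate)
    true_neg = specificity * (1 - base_rate)
    false_neg = (1 - sensitivity) * base_rate
    ppv = true_pos / (true_pos + false_pos)  # P(invalid | cutoff failed)
    npv = true_neg / (true_neg + false_neg)  # P(valid | cutoff passed)
    return ppv, npv

# A liberal versus a conservative cutoff at an assumed 15% base rate
# of invalid responding:
for label, sens, spec in [("liberal", 0.55, 0.88), ("conservative", 0.30, 0.95)]:
    ppv, npv = predictive_values(sens, spec, base_rate=0.15)
    print(f"{label}: PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Under these assumed inputs, the conservative cutoff yields the higher positive predictive value (.51 vs. .45) despite its much lower sensitivity, which is one way to motivate the minimum specificity standards referenced throughout this discussion.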
Strengths and limitations of the study

As the first investigation of verbal fluency measures as EVIs in a non-clinical sample, this study extended the scope of these instruments to cognitively intact populations. Previous research suggests that since most PVTs were designed to be robust to genuine deficits, their utility is limited in high functioning examinees (Abeare, Messa, et al., 2019; An et al., 2017; Roye et al., 2019). Animal and letter fluency cutoffs were validated against a variety of criterion PVTs, providing a thorough test of their signal detection properties. Results support the need for population-specific cutoffs.

Inevitably, the study has several limitations, too. It was based on a sample of convenience recruited from a single location. Thus, results may not generalize to other geographic regions (Kura, 2013; León & León, 2014; Lichtenstein, Greenacre, et al., 2019; Lynn, 2010; McDaniel, 2006; Roth, Erdodi, McCulloch, & Isquith, 2015). In addition, participants were limited to healthy young adults. Since age has been reported to influence base rates of PVT failure, especially in children (Abeare et al., 2018; Lichtenstein, Holcomb, & Erdodi, 2018), future studies may benefit from expanding the age range of the examinees. The choice of criterion measures may have also inadvertently influenced the classification accuracy (Erdodi, 2019; Rai & Erdodi, 2019; Schroeder et al., 2019; Schwartz et al., 2016). Replication is needed using different samples and criterion PVTs. Likewise, involving newer versions of verbal fluency measures such as the Action (Piatt, Fields, Paolo, & Tröster, 1999) and Emotion Word Fluency Test (Abeare et al., 2017) could further enhance the expanding knowledge base on this family of EVIs. Finally, as described above, the expMAL paradigm itself has a number of inherent epistemological limitations.

As the student sample consisted of healthy young adults from a single location, the extent to which results would generalize to older clinical patients in different geographic regions remains an open question. At the same time, even the control group scored significantly below the normative mean on both measures of verbal fluency, a puzzling finding that is not without precedent (Abeare, Messa, et al., 2019; An et al., 2017). Equally surprisingly, the FAS cutoff (T ≤ 33) that was specific (.92) to non-credible responding in patients with mild TBI in the Curtis et al. (2008) study failed to clear the minimum specificity threshold in the present sample of cognitively intact participants. These findings suggest that the commonly used clinical vs. non-clinical distinction may be an artificial dichotomy. At least in terms of verbal fluency scores, there seems to be a substantial overlap in mean performance between healthy controls (Abeare, Messa, et al., 2019; An et al., 2017; Erdodi, Jongsma, & Issa, 2017; McCrea et al., 2003; Piatt, Fields, Paolo, Koller, & Tröster, 1999) and clinical patients (Johnson et al., 2012; Piatt, Fields, Paolo, Koller, et al., 1999; Sugarman & Axelrod, 2015; Tyson et al., 2018; Whiteside, Kogan, et al., 2015; Zakzanis et al., 2011, 2013).

Nevertheless, given the sensitivity of verbal fluency measures to diffuse neuropsychiatric disorders (Henry & Crawford, 2004; Iverson, Franzen, & Lovell, 1999; Loring et al., 1995), low scores on these tests should not be interpreted as evidence of non-credible responding until alternative explanations (i.e., genuine deficits due to various etiologies ranging from neurodevelopmental disorders through acquired aphasia to limited English proficiency) have been ruled out (Bodner et al., 2019; Boone, Victor, Wen, Razani, & Pontón, 2007; Erdodi, Nussbaum, Sagar, Abeare, & Schwartz, 2017; Schroeder & Marshall, 2010). Consequently, although failing validity cutoffs on FAS or animals can be helpful in ruling in invalid performance, they should never be used in isolation to determine the credibility of the overall neurocognitive profile. However, in combination with failures on robust free-standing PVTs, they can provide incremental evidence to improve the assessor's confidence in the final conclusion (Erdodi, 2019; Lippa, 2018; Merten & Rogers, 2017; Odland et al., 2015).
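One way to make the logic of such aggregation explicit is the chaining of likelihood ratios across indicators (Larrabee, 2008; Meyers et al., 2014). The sketch below is a simplified illustration of that reasoning, not a model fitted in this study; the prior probability and likelihood ratios are hypothetical placeholders, and the independence assumption it encodes rarely holds exactly among real PVTs:

```python
# Simplified illustration of multivariate PVT reasoning via chained
# likelihood ratios (cf. Larrabee, 2008). All inputs are hypothetical.

def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability of invalid performance after a series
    of PVT outcomes, assuming (strongly) that indicators are independent."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr  # LR > 1 for a failed PVT, LR < 1 for a passed one
    return odds / (1 + odds)

# Two failed EVIs (LR = 4 each) and one passed free-standing PVT (LR = 0.5),
# starting from an assumed 15% prior probability of invalid responding:
print(round(posterior_probability(0.15, [4, 4, 0.5]), 2))  # ~0.59
```

Even in this toy example, no single indicator settles the question; it is the accumulation of converging evidence that moves the posterior, which mirrors the interpretive stance taken here.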
Future studies that provide a direct comparison between healthy controls, experimental malingerers, and credible patients with genuine deficits could further illuminate the potential confluence of legitimate impairment and invalid performance. Such a design would allow researchers to estimate the relative contribution of credible deficits and non-credible responding (Giromini, Viglione, Pignolo, & Zennaro, 2018; Hopwood, Morey, Rogers, & Sewell, 2007; Kanser et al., 2019). Including an additional group of patients with medically verified conditions known to be associated with decreased verbal fluency performance (who are also instructed to malinger) would further enhance the heuristic value of future research designs (Giromini et al., 2019; Inman & Berry, 2002; Iverson, Franzen, & McCracken, 1994; Rogers et al., 2003; Rose, Hall, Szalda-Petree, & Bach, 1998; Vickery et al., 2004; Viglione et al., 2019; von Helvoort, Merckelbach, & Merten, 2019).

Conclusions

Verbal fluency measures were effective at separating valid from invalid response sets in a non-clinical sample of undergraduate student volunteers using the expMAL paradigm. Surprisingly, FAS cutoffs previously validated in clinical patients (Curtis et al., 2008) failed to achieve minimum specificity standards in the present sample, underlining the importance of population-specific cutoffs. Consistent with previous research, animal fluency had higher classification accuracy than FAS. However, both EVIs performed notably better in the student sample than in previous studies and had a comparable signal detection profile to Digit Span variables as EVIs. The most plausible explanation for this pattern of findings is that, as a test of cognitive ability, verbal fluency is more sensitive to diffuse neuropsychological deficits than Digit Span.

As such, genuine impairment and non-credible responding may be more closely intertwined in clinical patients and, thus, more difficult to differentiate using verbal fluency measures. Therefore, failing validity cutoffs on the FAS or animals should not be interpreted as conclusive evidence of invalid performance. Indeed, patients with moderate or severe TBI had a higher BRFail on animal fluency validity cutoffs, while they had a lower BRFail on all other PVTs examined. Nevertheless, before automatically discounting an EVI failure as a false positive, the broader clinical context should be considered, in conjunction with the ongoing consultation of the empirical literature. In fact, on the EVIs examined in this study, conservative cutoffs provided an effective safeguard against false positive errors even among patients with verified neurological disorders. Ultimately, the golden rule remains that these EVIs should only be used and interpreted in combination with other PVTs that are known to be robust to genuine impairment.

Ethical approval

Relevant ethical guidelines were followed throughout the project. All data collection, storage, and processing were done with the approval of the relevant institutional authorities regulating research involving human participants, in compliance with the 1964 Helsinki Declaration and its subsequent amendments or comparable ethical standards.

Disclosure statement

The last author provides forensic consultation and medicolegal assessments, for which he receives financial compensation.

References

Abeare, C. A., Freund, S., Kaploun, K., McAuley, T., & Dumitrescu, C. (2017). The Emotion Word Fluency Test (EWFT): Initial psychometric, validation, and physiological evidence in young adults. Journal of Clinical and Experimental Neuropsychology, 39(8), 738–752. doi:10.1080/13803395.2016.1259396
Abeare, C., Messa, I., Whitfield, C., Zuccato, B., Casey, J., Rykulski, N., & Erdodi, L. (2019). Performance validity in collegiate football athletes at baseline neurocognitive testing. Journal of Head Trauma Rehabilitation, 34(4), E20–E31. doi:10.1097/HTR.0000000000000451
Abeare, C. A., Messa, I., Zuccato, B. G., Merker, B., & Erdodi, L. A. (2018). Prevalence of invalid performance on baseline testing for sport-related concussion by age and validity indicator. JAMA Neurology, 75(6), 697–703. doi:10.1001/jamaneurol.2018.0031
Abeare, C., Sabelli, A., Taylor, B., Holcomb, M., Dumitrescu, C., Kirsch, N., & Erdodi, L. (2019). The importance of demographically adjusted cutoffs: Age and education bias in raw score cutoffs within the Trail Making Test. Psychological Injury and Law, 12(2), 170–182. doi:10.1007/s12207-019-09353
An, K. Y., Charles, J., Ali, S., Enache, A., Dhuga, J., & Erdodi, L. A. (2019). Re-examining performance validity cutoffs within the Complex Ideational Material and the Boston Naming Test-Short Form using an experimental malingering paradigm. Journal of Clinical and Experimental Neuropsychology, 41(1), 15–25. doi:10.1080/13803395.2018.1483488
An, K. Y., Kaploun, K., Erdodi, L. A., & Abeare, C. A. (2017). Performance validity in undergraduate research participants: A comparison of failure rates across tests and cutoffs. The Clinical Neuropsychologist, 31(1), 193–206. doi:10.1080/13854046.2016.1217046
An, K. Y., Zakzanis, K. K., & Joordens, S. (2012). Conducting research with non-clinical healthy undergraduates: Does effort play a role in neuropsychological test performance? Archives of Clinical Neuropsychology, 27(8), 849–857. doi:10.1093/arclin/acs085
Arnold, G., Boone, K. B., Lu, P., Dean, A., Wen, J., Nitch, S., & McPherson, S. (2005). Sensitivity and specificity of finger tapping test scores for the detection of suspect effort. The Clinical Neuropsychologist, 19(1), 105–120. doi:10.1080/13854040490888567
Ashendorf, L., Clark, E. L., & Sugarman, M. A. (2017). Performance validity and processing speed in a VA polytrauma sample. The Clinical Neuropsychologist, 31(5), 857–866. doi:10.1080/13854046.2017.1285961
Axelrod, B. N., Fichtenberg, N. L., Millis, S. R., & Wertheimer, J. C. (2006). Detecting incomplete effort with Digit Span from the Wechsler Adult Intelligence Scale – Third Edition. The Clinical Neuropsychologist, 20(3), 513–523. doi:10.1080/13854040590967117
Axelrod, B. N., Meyers, J. E., & Davis, J. J. (2014). Finger tapping test performance as a measure of performance validity. The Clinical Neuropsychologist, 28(5), 876–888. doi:10.1080/13854046.2014.907583
Babikian, T., Boone, K. B., Lu, P., & Arnold, G. (2006). Sensitivity and specificity of various Digit Span scores in the detection of suspect effort. The Clinical Neuropsychologist, 20(1), 145–159. doi:10.1080/13854040590947362
Backhaus, S. L., Fichtenberg, N. L., & Hanks, R. A. (2004). Detection of sub-optimal performance using a floor effect strategy in patients with traumatic brain injury. The Clinical Neuropsychologist, 18(4), 591–603. doi:10.1080/13854040490888558
Bailey, K. C., Soble, J. R., Bain, K. M., & Fullen, C. (2018). Embedded performance validity tests in the Hopkins Verbal Learning Test-Revised and the Brief Visuospatial Memory Test-Revised: A replication study. Archives of Clinical Neuropsychology, 33(7), 895–900. doi:10.1093/arclin/acx111
Bain, K. M., & Soble, J. R. (2019). Validation of the Advanced Clinical Solutions Word Choice Test (WCT) in a mixed clinical sample: Establishing classification accuracy, sensitivity/specificity, and cutoff scores. Assessment, 26(7), 1320–1328. doi:10.1177/1073191117725172
Barhon, L. I., Batchelor, J., Meares, S., Chekaluk, E., & Shores, E. A. (2015). A comparison of the degree of effort involved in the TOMM and the ACS Word Choice Test using a dual-task paradigm. Applied Neuropsychology: Adult, 22(2), 114–123.
Bashem, J. R., Rapport, L. J., Miller, J. B., Hanks, R. A., Axelrod, B. N., & Millis, S. R. (2014). Comparison of five performance validity indices in bona fide and simulated traumatic brain injury. The Clinical Neuropsychologist, 28(5), 851–875. doi:10.1080/13854046.2014.927927
Berger, C., Lev, A., Braw, Y., Elbaum, T., Wagner, M., & Rassovsky, Y. (2019). Detection of feigned ADHD using the MOXO-d-CPT. Journal of Attention Disorders. doi:10.1177/1087054719864656
Bialystok, E., Craik, F. I. M., Binns, M. A., Ossher, L., & Freedman, L. (2014). Effects of bilingualism on the age of onset and progression of MCI and AD: Evidence from executive function tests. Neuropsychology, 28(2), 290–304. doi:10.1037/neu0000023
Bigler, E. D. (2012). Symptom validity testing, effort and neuropsychological assessment. Journal of the International Neuropsychological Society, 18(4), 632–642. doi:10.1017/S1355617712000252
Bigler, E. D. (2014). Effort, symptom validity testing, performance validity testing and traumatic brain injury. Brain Injury, 28(13–14), 1623–1638. doi:10.3109/02699052.2014.947627
Bigler, E. D. (2015). Neuroimaging as a biomarker in symptom validity and performance validity testing. Brain Imaging and Behavior, 9(3), 421–444. doi:10.1007/s11682-015-9409-1
Bodner, T., Merten, T., & Benke, T. (2019). Performance validity measures in clinical patients with aphasia. Journal of Clinical and Experimental Neuropsychology, 41(5), 476–483. doi:10.1080/13803395.2019.1579783
Boone, K. B. (2013). Clinical practice of forensic neuropsychology. New York, NY: Guilford.
Boone, K. B. (2009). The need for continuous and comprehensive sampling of effort/response bias during neuropsychological examination. The Clinical Neuropsychologist, 23(4), 729–741. doi:10.1080/13854040802427803
Boone, K. B., Salazar, X., Lu, P., Warner-Chacon, K., & Razani, J. (2002). The Rey 15-item recognition trial: A technique to enhance sensitivity of the Rey 15-item memorization test. Journal of Clinical and Experimental Neuropsychology, 24(5), 561–573. doi:10.1076/jcen.24.5.561.1004
Boone, K. B., Victor, T. L., Wen, J., Razani, J., & Pontón, M. (2007). The association between neuropsychological scores and ethnicity, language, and acculturation variables in a large patient population. Archives of Clinical Neuropsychology, 22(3), 355–365. doi:10.1016/j.acn.2007.01.010
Bortnik, K. E., Boone, K. B., Marion, S. D., Amano, S., Ziegler, E., Victor, T. L., & Zeller, M. A. (2010). Examination of various WMS-III logical memory scores in the assessment of response bias. The Clinical Neuropsychologist, 24(2), 344–357. doi:10.1080/13854040903307268
Boskovic, I., Biermans, A. J., Merten, T., Jelicic, M., Hope, L., & Merckelbach, H. (2018). The Modified Stroop Task is susceptible to feigning: Stroop performance and symptom over-endorsement in feigned test anxiety. Frontiers in Psychology, 9, 1195. doi:10.3389/fpsyg.2018.01195
Brandt, J., & Benedict, R. H. B. (2001). Hopkins Verbal Learning Test – Revised. Odessa, FL: Psychological Assessment Services.
Brennan, A. M., Meyer, S., David, E., Pella, R., Hill, B. D., & Gouvier, W. D. (2009). The vulnerability to coaching across measures of effort. The Clinical Neuropsychologist, 23(2), 314–328. doi:10.1080/13854040802054151
Browndyke, J. N., Paskavitz, J., Sweet, L. H., Cohen, R. A., Tucker, K. A., Welsh-Bohmer, K. A., … Schmechel, D. E. (2008). Neuroanatomical correlates of malingered memory impairment: Event-related fMRI of deception on a recognition memory task. Brain Injury, 22(6), 481–489. doi:10.1080/02699050802084
Bush, S. S., Heilbronner, R. L., & Ruff, R. M. (2014). Psychological assessment of symptom and performance validity, response bias, and malingering: Official position of the Association for Scientific Advancement in Psychological Injury and Law. Psychological Injury and Law, 7(3), 197–205. doi:10.1007/s12207-014-9198-7
Bush, S., Ruff, R., Tröster, A., Barth, J., Koffler, S., Pliskin, N., … Silver, C. (2005). Symptom validity assessment: Practice issues and medical necessity (NAN Policy and Planning Committees). Archives of Clinical Neuropsychology, 20(4), 419–426. doi:10.1016/j.acn.2005.02.002
Carone, D. A., Green, P., & Drane, D. L. (2014). Word memory test profiles in two cases with surgical removal of the left anterior hippocampus and parahippocampal gyrus. Applied Neuropsychology: Adult, 21(2), 155–160. doi:10.1080/09084282.2012.755533
Chafetz, M. D., Williams, M. A., Ben-Porath, Y. S., Bianchini, K. J., Boone, K. B., Kirkwood, M. W., … Ord, J. S. (2015). Official position of the American Academy of Clinical Neuropsychology Social Security Administration policy on validity testing: Guidance and recommendations for change. The Clinical Neuropsychologist, 29(6), 723–740. doi:10.1080/13854046.2015.1099738
Coleman, R. D., Rapport, L. J., Millis, S. R., Ricker, J. H., & Farchione, T. J. (1998). Effects of coaching on detection of malingering on the California Verbal Learning Test. Journal of Clinical and Experimental Neuropsychology, 20(2), 201–210. doi:10.1076/jcen.20.2.201.1164
Cottingham, M. E., Victor, T. L., Boone, K. B., Ziegler, E. A., & Zeller, M. (2014). Apparent effect of type of compensation seeking (disability vs. litigation) on performance validity test scores may be due to other factors. The Clinical Neuropsychologist, 28(6), 1030–1047. doi:10.1080/13854046.2014.951397
Critchfield, E., Soble, J. R., Marceaux, J. C., Bain, K. M., Chase Bailey, K., Webber, T. A., … O'Rourke, J. J. F. (2019). Cognitive impairment does not cause invalid performance: Analyzing performance patterns among cognitively unimpaired, impaired, and noncredible participants across six performance validity tests. The Clinical Neuropsychologist, 33(6), 1083–1101. doi:10.1080/13854046.2018.1508615
Crowe, S. F. (1996). Traumatic anosmia coincides with an organic disinhibition syndrome as measured by responding on the Controlled Oral Word Association Test. Psychiatry, Psychology and Law, 3(1), 39–45. doi:10.1080/13218719609524873
Curtis, K. L., Greve, K. W., & Bianchini, K. J. (2009). The Wechsler Adult Intelligence Scale-III and malingering in traumatic brain injury. Assessment, 16(4), 401–414. doi:10.1177/1073191109338161
Curtis, K. L., Thompson, L. K., Greve, K. W., & Bianchini, K. J. (2008). Verbal fluency indicators of malingering in traumatic brain injury: Classification accuracy in known groups. The Clinical Neuropsychologist, 22(5), 930–945. doi:10.1080/13854040701563591
Dandachi-FitzGerald, B., Merckelbach, H., & Ponds, R. W. H. M. (2017). Neuropsychologists' ability to predict distorted symptom presentation. Journal of Clinical and Experimental Neuropsychology, 39(3), 257–264. doi:10.1080/13803395.2016.1223278
Davis, J. J. (2014). Further consideration of Advanced Clinical Solutions Word Choice: Comparison to the Recognition Memory Test – Words and classification accuracy on a clinical sample. The Clinical Neuropsychologist, 28(8), 1278–1294. doi:10.1080/13854046.2014.975844
Davis, J. J., Axelrod, B. N., McHugh, T. S., Hanks, R. A., & Millis, S. R. (2013). Number of impaired scores as a performance validity indicator. Journal of Clinical and Experimental Neuropsychology, 35(4), 413–420. doi:10.1080/13803395.2013.781134
Davis, J. J., & Millis, S. R. (2014). Examination of performance validity test failure in relation to number of tests administered. The Clinical Neuropsychologist, 28(2), 199–214. doi:10.1080/13854046.2014.884633
Delis, D. C., Kaplan, E. F., & Kramer, J. H. (2001). Delis-Kaplan executive function system. San Antonio, TX: Psychological Corporation.
DenBoer, J. W., & Hall, S. (2007). Neuropsychological test performance of successful brain injury simulators. The Clinical Neuropsychologist, 21(6), 943–955. doi:10.1080/13854040601020783
Donders, J., & Strong, C. A. H. (2015). Clinical utility of the Wechsler Adult Intelligence Scale – Fourth Edition after traumatic brain injury. Assessment, 22(1), 17–22. doi:10.1177/1073191114530776
Eglit, G. M. L., Jurick, S. M., Delis, D. C., Filoteo, J. V., Bondi, M. W., & Jak, A. J. (2019). Utility of the D-KEFS Color Word Interference Test as an embedded measure of performance validity. The Clinical Neuropsychologist, 1–21. doi:10.1080/13854046.2019.1643923
Erdodi, L. A., & Abeare, C. A. (2019). Stronger together: The Wechsler Adult Intelligence Scale – Fourth Edition as a multivariate performance validity test in patients with traumatic brain injury. Archives of Clinical Neuropsychology. doi:10.1093/arclin/acz032
Erdodi, L. A., Abeare, C. A., Medoff, B., Seke, K. R., Sagar, S., & Kirsch, N. L. (2018). A single error is one too many: The Forced Choice Recognition trial on the CVLT-II as a measure of performance validity in adults with TBI. Archives of Clinical Neuropsychology, 33(7), 845–860. doi:10.1093/arclin/acx110
Erdodi, L. A., Roth, R. M., Kirsch, N. L., Lajiness-O'Neill, R., & Medoff, B. (2014). Aggregating validity indicators embedded in Conners' CPT-II outperforms individual cutoffs at separating valid from invalid performance in adults with traumatic brain injury. Archives of Clinical Neuropsychology, 29(5), 456–466. doi:10.1093/arclin/acu026
Erdodi, L. A., Taylor, B., Sabelli, A. G., Malleck, M., Kirsch, N. L., & Abeare, C. A. (2019). Demographically adjusted validity cutoffs on the Finger Tapping Test are superior to raw score cutoffs in adults with TBI. Psychological Injury and Law, 12(2), 113–126. doi:10.1007/s12207-019-09352-y
Erdodi, L. A. (2019). Aggregating validity indicators: The salience of domain specificity and the indeterminate range in multivariate models of performance validity assessment. Applied Neuropsychology: Adult, 26(2), 155–172. doi:10.1080/23279095.2017.1384925
Erdodi, L. A., Abeare, C. A., Lichtenstein, J. D., Tyson, B. T., Kucharski, B., Zuccato, B. G., & Roth, R. M. (2017). WAIS-IV processing speed scores as measures of non-credible responding – The third generation of embedded performance validity indicators. Psychological Assessment, 29(2), 148–157. doi:10.1037/pas0000319
Erdodi, L. A., Dunn, A. G., Seke, K. R., Charron, C., McDermott, A., Enache, A., … Hurtubise, J. (2018). The Boston Naming Test as a measure of performance validity. Psychological Injury and Law, 11(1), 1–8. doi:10.1007/s12207-017-9309-3
Erdodi, L. A., Green, P., Sirianni, C., & Abeare, C. A. (2019). The myth of high false positive rates on the Word Memory Test in mild TBI. Psychological Injury and Law, 12(2), 155–169. doi:10.1007/s12207-019-09356-8
Erdodi, L. A., Hurtubise, J. L., Charron, C., Dunn, A., Enache, A., McDermott, A., & Hirst, R. (2018). The D-KEFS Trails as performance validity tests. Psychological Assessment, 30(8), 1082–1095. doi:10.1037/pas0000561
Erdodi, L. A., Jongsma, K. A., & Issa, M. (2017). The 15-item version of the Boston Naming Test as an index of English proficiency. The Clinical Neuropsychologist, 31(1), 168–178. doi:10.1080/13854046.2016.1224392
Erdodi, L. A., Kirsch, N. L., Lajiness-O'Neill, R., Vingilis, E., & Medoff, B. (2014). Comparing the Recognition Memory Test and the Word Choice Test in a mixed clinical sample: Are they equivalent? Psychological Injury and Law, 7(3), 255–263.
Erdodi, L. A., Kirsch, N. L., Sabelli, A. G., & Abeare, C. A. (2018). The Grooved Pegboard Test as a validity indicator – A study on psychogenic interference as a confound in performance validity research. Psychological Injury and Law, 11(4), 307–324. doi:10.1007/s12207-018-9337-7
Erdodi, L. A., & Lichtenstein, J. D. (2019). Information processing speed tests as PVTs. In K. B. Boone (Ed.), Assessment of feigned cognitive impairment: A neuropsychological perspective. New York, NY: Guilford.
Erdodi, L. A., & Lichtenstein, J. D. (2017). Invalid before impaired: An emerging paradox of embedded validity indicators. The Clinical Neuropsychologist, 31(6–7), 1029–1046. doi:10.1080/13854046.2017.1323119
Erdodi, L. A., Nussbaum, S., Sagar, S., Abeare, C. A., & Schwartz, E. S. (2017). Limited English proficiency increases failure rates on performance validity tests with high verbal mediation. Psychological Injury and Law, 10(1), 96–103. doi:10.1007/s12207-017-9282-x
Erdodi, L. A., & Rai, J. K. (2017). A single error is one too many: Examining alternative cutoffs on Trial 2 on the TOMM. Brain Injury, 31(10), 1362–1368. doi:10.1080/02699052.2017.1332386
Erdodi, L. A., Pelletier, C. L., & Roth, R. M. (2018). Elevations on select Conners' CPT-II scales indicate noncredible responding in adults with traumatic brain injury. Applied Neuropsychology: Adult, 25(1), 19–28. doi:10.1080/23279095.2016.1232262
Erdodi, L. A., Sagar, S., Seke, K., Zuccato, B. G., Schwartz, E. S., & Roth, R. M. (2018). The Stroop Test as a measure of performance validity in adults clinically referred for neuropsychological assessment. Psychological Assessment, 30(6), 755–766. doi:10.1037/pas0000525
Erdodi, L. A., Seke, K. R., Shahein, A., Tyson, B. T., Sagar, S., & Roth, R. M. (2017). Low scores on the Grooved Pegboard Test are associated with invalid responding and psychiatric symptoms. Psychology & Neuroscience, 10(3), 325–344. doi:10.1037/pne0000103
Erdodi, L. A., Tyson, B. T., Abeare, C. A., Lichtenstein, J. D., Pelletier, C. L., Rai, J. K., & Roth, R. M. (2016). The BDAE Complex Ideational Material – A measure of receptive language or performance validity? Psychological Injury and Law, 9(2), 112–120. doi:10.1007/s12207-016-9254-6
Erdodi, L. A., Tyson, B. T., Abeare, C. A., Zuccato, B. G., Rai, J. K., Seke, K. R., … Roth, R. M. (2018). Utility of critical items within the Recognition Memory Test and Word Choice Test. Applied Neuropsychology: Adult, 25(4), 327–339. doi:10.1080/23279095.2017.1298600
Erdodi, L. A., Tyson, B. T., Shahein, A., Lichtenstein, J. D., Abeare, C. A., Pelletier, C. L., … Roth, R. M. (2017). The power of timing: Adding a time-to-completion cutoff to the Word Choice Test and Recognition Memory Test improves classification accuracy. Journal of Clinical and Experimental Neuropsychology, 39(4), 369–383. doi:10.1080/13803395.2016.1230181
Erdodi, L. A., & Roth, R. M. (2017). Low scores on BDAE Complex Ideational Material are associated with invalid performance in adults without aphasia. Applied Neuropsychology: Adult, 24(3), 264–274. doi:10.1080/23279095.2017.1298600
Frazier, T., Frazier, A., Busch, R., Kerwood, M., & Demaree, H. (2008). Detection of simulated ADHD and reading disorder using symptom validity measures. Archives of Clinical Neuropsychology, 23(5), 501–509.
Fuermaier, A. B. M., Tucha, O., Koerts, J., Grabski, M., Lange, K. W., Weisbrod, M., … Tucha, L. (2016). The development of an embedded figures test for the detection of feigned Attention Deficit/Hyperactivity Disorder in adulthood. PLoS One, 11(10), e0164297. doi:10.1371/journal.pone.0164297
Fuermaier, A. B. M., Tucha, O., Koerts, J., Lange, K. W., Weisbrod, M., Aschenbrenner, S., & Tucha, L. (2017). Noncredible cognitive performance at clinical evaluation of adult ADHD: An embedded validity indicator in a visuospatial working memory test. Psychological Assessment, 29(12), 1466–1479. doi:10.1037/pas0000534
Gaudet, C. E., & Weyandt, L. L. (2017). Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT): A systematic review of the prevalence and assessment of invalid performance. The Clinical Neuropsychologist, 31(1), 43–58. doi:10.1080/13854046.2016.1220622
Giromini, L., Viglione, D. J., Pignolo, C., & Zennaro, A. (2018). A clinical comparison, simulation study testing the validity of SIMS and IOP-29 with an Italian sample. Psychological Injury and Law, 11(4), 340–350. doi:10.1007/s12207-018-9314-1
Giromini, L., Lettieri, S. C., Zizolfi, S., Zizolfi, D., Viglione, D. J., Brusadelli, E., … Zennaro, A. (2019). Beyond rare-symptom endorsement: A clinical comparison study using the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) with the Inventory of Problems-29 (IOP-29). Psychological Injury and Law, 12(3–4), 212–224. doi:10.1007/s12207-019-09357-7
Gladsjo, J. A., Schuman, C. C., Evans, J. D., Peavy, G. M., Miller, S. W., & Heaton, R. K. (1999). Norms for letter and category fluency: Demographic corrections for age, education, and ethnicity. Assessment, 6(2), 147–178. doi:10.1177/107319119900600204
Glassmire, D. M., Wood, M. E., Ta, M. T., Kinney, D. I., & Nitch, S. R. (2019). Examining false-positive rates of Wechsler Adult Intelligence Scale (WAIS-IV) processing speed-based embedded validity indicators among individuals with schizophrenia spectrum disorders. Psychological Assessment, 31(1), 120–125.
Goodglass, H., Kaplan, E., & Barresi, B. (2001). Boston Diagnostic Aphasia Examination (3rd ed.). Philadelphia: Lippincott Williams & Wilkins.
Green, P., Montijo, J., & Brockhaus, R. (2011). High specificity of the Word Memory Test and Medical Symptom Validity Test in groups with severe verbal memory impairment. Applied Neuropsychology, 18(2), 86–94. doi:10.1080/09084282.2010.523389
Green, R. E. A., Melo, B., Christensen, B., Ngo, L. A., Monette, G., & Bradbury, C. (2008). Measuring premorbid IQ in traumatic brain injury: An examination of the validity of the Wechsler Test of Adult Reading (WTAR). Journal of Clinical and Experimental Neuropsychology, 30(2), 163–172. doi:10.1080/13803390701300524
Greiffenstein, M. F., Baker, W. J., & Gola, T. (1996). Motor dysfunction profiles in traumatic brain injury and postconcussion syndrome. Journal of the International Neuropsychological Society, 2(6), 477–485. doi:10.1017/S1355617700001648
Greve, K. W., Bianchini, K. J., Mathias, C. W., Houston, R. J., & Crouch, J. A. (2002). Detecting malingered neurocognitive dysfunction with the Wisconsin Card Sorting Test: A preliminary investigation in traumatic brain injury. The Clinical Neuropsychologist, 16(2), 179–191. doi:10.1076/clin.16.2.179.13241
Hayward, L., Hall, W., Hunt, M., & Zubrick, S. R. (1987). Can localized brain impairment be simulated on neuropsychological test profiles? Australian & New Zealand Journal of Psychiatry, 21(1), 87–93. doi:10.3109/00048678709160904
Heaton, R. K., Miller, S. W., Taylor, M. J., & Grant, I. (2004). Revised comprehensive norms for an expanded Halstead-Reitan battery: Demographically adjusted neuropsychological norms for African American and Caucasian adults. Lutz, FL: Psychological Assessment Resources.
Heaton, R. K., Smith, H. H., Lehman, R. A. W., & Vogt, A. T. (1978). Prospects for faking believable deficits on neuropsychological testing. Journal of Consulting and Clinical Psychology, 46(5), 892–900. doi:10.1037/0022-006X.46.5.892
Heilbronner, R. L., Sweet, J. J., Morgan, J. E., Larrabee, G. J., Millis, S. R., & Conference Participants. (2009). American Academy of Clinical Neuropsychology consensus conference statement on the neuropsychological assessment of effort, response bias, and malingering. The Clinical Neuropsychologist, 23, 1093–1129. doi:10.1080/13854040903155063
Henry, J. D., & Crawford, J. R. (2004). A meta-analytic review of verbal fluency performance following focal cortical lesions. Neuropsychology, 18(2), 284–295. doi:10.1037/0894-4105.18.2.284
Higgins, K. L., Denney, R. L., & Maerlender, A. (2017). Sandbagging on the Immediate Post-Concussion Assessment and Cognitive Testing (ImPACT) in a high school athlete population. Archives of Clinical Neuropsychology, 32(3), 259–266. doi:10.1093/arclin/acw108
Hilsabeck, R. C. (2017). Psychometrics and statistics: Two pillars of neuropsychological practice. The Clinical Neuropsychologist, 31(6–7), 995–999. doi:10.1080/13854046.2017.1350752
Hopwood, C., Morey, L. C., Rogers, R., & Sewell, K. (2007). Malingering on the Personality Assessment Inventory: Identification of specific feigned disorders. Journal of Personality Assessment, 88(1), 43–48. doi:10.1080/00223890709336833
Hosmer, D. W., & Lemeshow, S. (2000). Applied logistic regression (2nd ed.). New York: Wiley.
Inman, T. H., & Berry, D. T. R. (2002). Cross-validation of indicators of malingering: A comparison of nine neuropsychological tests, four tests of malingering, and behavioral observations. Archives of Clinical Neuropsychology, 17(1), 1–23. doi:10.1016/S0887-6177(00)00073-1
Iverson, G. L., Franzen, M. D., & Lovell, M. R. (1999). Normative comparisons for the Controlled Oral Word Association Test following acute traumatic brain injury. The Clinical Neuropsychologist, 13(4), 437–441. doi:10.1076/1385-4046(199911)13:04;1-Y;FT437
Iverson, G. L., Franzen, M. D., & McCracken, L. M. (1994). Application of a forced-choice memory procedure designed to detect experimental malingering. Archives of Clinical Neuropsychology, 9(5), 437–450. doi:10.1093/arclin/9.5.437
Iverson, G. L., Lange, R. T., Green, P., & Franzen, M. D. (2002). Detecting exaggeration and malingering with the Trail Making Test. The Clinical Neuropsychologist, 16(3), 398–406. doi:10.1076/clin.16.3.398.13861
Iverson, G. L., & Schatz, P. (2015). Advanced topics in neuropsychological assessment following sports-related concussion. Brain Injury, 29(2), 263–275. doi:10.3109/02699052.2014.965214
Johnson, S. C., Silverberg, N. D., Millis, S. R., & Hanks, R. A. (2012). Symptom validity indicators embedded in the Controlled Oral Word Association Test. The Clinical Neuropsychologist, 26(7), 1230–1241. doi:10.1080/13854046.2012.709886
Johnstone, B., Callahan, C. D., Kapila, C. J., & Bouman, D. E. (1996). The comparability of the WRAT-R Reading test and the NAART as estimates of premorbid intelligence in neurologically impaired patients. Archives of Clinical Neuropsychology, 11(6), 513–519. doi:10.1016/0887-6177(96)82330-4
Kanser, R. J., Rapport, L. J., Bashem, J. R., Billings, N. M., Hanks, R. A., Axelrod, B. N., & Miller, J. B. (2017). Strategies of successful and unsuccessful simulators coached to feign traumatic brain injury. The Clinical Neuropsychologist, 31(3), 644–653. doi:10.1080/13854046.2016.1278040
Kanser, R. J., Rapport, L. J., Bashem, J. R., & Hanks, R. A. (2019). Detecting malingering in traumatic brain injury: Combining response time with performance validity test accuracy. The Clinical Neuropsychologist, 33(1), 90–107. doi:10.1080/13854046.2018.1440006
Kiewel, N. A., Wisdom, N. M., Bradshaw, M. R., Pastorek, N. J., & Strutt, A. M. (2012). Retrospective review of Digit Span-related effort indicators in probable Alzheimer's disease patients. The Clinical Neuropsychologist, 26(6), 965–974. doi:10.1080/13854046.2012.694478
Kim, M. S., Boone, K. B., Victor, T., Marion, S. D., Amano, S., Cottingham, M. E., … Zeller, M. A. (2010). The Warrington Recognition Memory Test for words as a measure of response bias: Total score and response time cutoffs developed on "real world" credible and noncredible subjects. Archives of Clinical Neuropsychology, 25(1), 60–70. doi:10.1093/arclin/acp088
Kongs, S. K., Thompson, L. L., Iverson, G. L., & Heaton, R. K. (2000). WCST-64: Wisconsin Card Sorting Test-64 card version professional manual. Odessa, FL: Psychological Assessment Resources.
Kura, K. (2013). Japanese north–south gradient in IQ predicts differences in stature, skin color, income, and homicide rate. Intelligence, 41(5), 512–516. doi:10.1016/j.intell.2013.07.001
Lange, R. T., Iverson, G. L., Brickell, T. A., Staver, T., Pancholi, S., Bhagwat, A., & French, L. M. (2013). Clinical utility of the Conners' Continuous Performance Test-II to detect poor effort in U.S. military personnel following traumatic brain injury. Psychological Assessment, 25(2), 339–352. doi:10.1037/a0030915
Larochette, A. C., & Harrison, A. G. (2012). Word Memory Test performance in Canadian adolescents with learning disabilities: A preliminary study. Applied Neuropsychology: Child, 1(1), 38–47.
Larrabee, G. J. (2003). Detection of malingering using atypical performance patterns on standard neuropsychological tests. The Clinical Neuropsychologist, 17(3), 410–425. doi:10.1076/clin.17.3.410.18089
Larrabee, G. J. (2008). Aggregation across multiple indicators improves the detection of malingering: Relationship to likelihood ratios. The Clinical Neuropsychologist, 22(4), 666–679. doi:10.1080/13854040701494987
Larrabee, G. J. (2014). False-positive rates associated with the use of multiple performance and symptom validity tests. Archives of Clinical Neuropsychology, 29(4), 364–373. doi:10.1093/arclin/acu019
Larrabee, G. J., Rohling, M. L., & Meyers, J. E. (2019). Use of multiple performance and symptom validity measures: Determining the optimal per test cutoff for determination of invalidity, analysis of skew, and inter-test correlations in valid and invalid performance groups. The Clinical Neuropsychologist, 33(8), 1354–1372. doi:10.1080/13854046.2019.1614227
Larsen, J. D., Allen, M. D., Bigler, E. D., Goodrich-Hunsaker, N. J., & Hopkins, R. O. (2010). Different patterns of cerebral activation in genuine and malingered cognitive effort during performance on the Word Memory Test. Brain Injury, 24(2), 89–99. doi:10.3109/02699050903508218
Leighton, A., Weinborn, M., & Maybery, M. (2014). Bridging the gap between neurocognitive processing theory and performance validity assessment among the cognitively impaired: A review and methodological approach. Journal of the International Neuropsychological Society, 20(9), 873–886. doi:10.1017/S135561771400085X
León, F. R., & León, A. B. (2014). Why complex cognitive ability increases with absolute latitude. Intelligence, 46, 291–299. doi:10.1016/j.intell.2014.07.011
Lichtenstein, J. D., Erdodi, L. A., & Linnea, K. S. (2017). Introducing a forced-choice recognition task to the California Verbal Learning Test – Children's Version. Child Neuropsychology, 23(3), 284–299. doi:10.1080/09297049.2015.1135422
Lichtenstein, J. D., Flaro, L., Baldwin, F., Rai, J. K., & Erdodi, L. A. (2019). Further evidence for embedded validity tests in children within the Conners' Continuous Performance Test – Second Edition. Developmental Neuropsychology, 44(2), 159–171. doi:10.1080/87565641.2019.1565536
Lichtenstein, J. D., Greenacre, M. K., Cutler, L., Abeare, K., Baker, S. D., Kent, K. J., … Erdodi, L. A. (2019). Geographic variation and instrumentation artifacts: In search of confounds in performance validity assessment in adults with mild TBI. Psychological Injury and Law, 12(2), 127–145. doi:10.1007/s12207-019-09354-w
Lichtenstein, J. D., Holcomb, M., & Erdodi, L. A. (2018). One-Minute PVT: Further evidence for the utility of the California Verbal Learning Test – Children's Version Forced Choice Recognition Trial. Journal of Pediatric Neuropsychology, 4(3–4), 94–104. doi:10.1007/s40817-018-0057-4
Lichtenstein, J. D., Linnea, K. S., & Maerlender, A. C. (2018). Patterns of referral in high school concussion management programs: A pilot study of consultants from different disciplines. Applied Neuropsychology: Child, 7(4), 334–341. doi:10.1080/21622965.2017.1340158
Lippa, S. M. (2018). Performance validity testing in neuropsychology: A clinical guide, critical review, and update on a rapidly evolving literature. The Clinical Neuropsychologist, 32(3), 391–421. doi:10.1080/13854046.2017.1406146
Loring, D. W., Meador, K. J., Lee, G. P., King, D. W., Nichols, M. E., Park, Y. D., … Smith, J. R. (1995). Wada memory asymmetries predict verbal memory decline after anterior temporal lobectomy. Neurology, 45(7), 1329–1333. doi:10.1212/WNL.45.7.1329
Lupu, T., Elbaum, T., Wagner, M., & Braw, Y. (2018). Enhanced detection of feigned cognitive impairment using per item response time measurements in the Word Memory Test. Applied Neuropsychology: Adult, 25(6), 532–542. doi:10.1080/23279095.2017.1341410
Lynn, R. (2010). In Italy, north–south differences in IQ predict differences in income, education, infant mortality, stature, and literacy. Intelligence, 38(1), 93–100. doi:10.1016/j.intell.2009.07.004
MacAllister, W. S., Vasserman, M., & Armstrong, K. (2019). Are we documenting performance validity testing in pediatric neuropsychological assessment? A brief report. Child Neuropsychology, 25(8), 1035–1042. doi:10.1080/09297049.2019.1569606
Martin, P. K., Hunter, B. P., Rach, A. M., Heinrichs, R. J., & Schroeder, R. W. (2017). Excessive decline from premorbid functioning: Detecting performance invalidity with the WAIS-IV and demographic predictions. The Clinical Neuropsychologist, 31(5), 829–843. doi:10.1080/13854046.2017.1284265
Martin, P. K., Schroeder, R. W., & Odland, A. P. (2015). Neuropsychologists' validity testing beliefs and practices: A survey of North American professionals. The Clinical Neuropsychologist, 29(6), 741–746. doi:10.1080/13854046.2015.1087597
Martin, P. K., Schroeder, R. W., Wyman-Chick, K. A., Hunter, B. P., Heinrichs, R. J., & Baade, L. E. (2018). Rates of abnormally low TOPF Word Reading scores in individuals failing versus passing performance validity testing. Assessment, 25(5), 640–652. doi:10.1177/1073191116656796
Mathias, J. L., Bowden, S. C., Bigler, E. D., & Rosenfeld, J. V. (2007). Is performance on the Wechsler Test of Adult Reading affected by traumatic brain injury? British Journal of Clinical Psychology, 46(4), 457–466. doi:10.1348/014466507X190197
Matthews, C. G., & Klove, K. (1964). Instruction manual for the Adult Neuropsychology Test Battery. Madison, WI: University of Wisconsin Medical School.
McCrea, M., Guskiewicz, K. M., Marshall, S. W., Barr, W., Randolph, C., Cantu, R. C., … Kelly, J. P. (2003). Acute effects and recovery time following concussion in collegiate football players. JAMA, 290(19), 2556–2564. doi:10.1001/jama.290.19.2556
McDaniel, M. A. (2006). Estimating state IQ: Measurement challenges and preliminary correlates. Intelligence, 34(6), 607–619. doi:10.1016/j.intell.2006.08.007
McFarlane, J., Welch, J., & Rodgers, J. (2006). Severity of Alzheimer's disease and effect on premorbid measures of intelligence. British Journal of Clinical Psychology, 45(4), 453–463. doi:10.1348/014466505X71245
Merten, T., Bossink, L., & Schmand, B. (2007). On the limits of effort testing: Symptom validity tests and severity of neurocognitive symptoms in nonlitigant patients. Journal of Clinical and Experimental Neuropsychology, 29(3), 308–318. doi:10.1080/13803390600693607
Merten, T., & Rogers, R. (2017). An international perspective on feigned mental disabilities: Conceptual issues and continuing controversies. Behavioral Sciences & the Law, 35(2), 97–112. doi:10.1002/bsl.2274
Meyers, J. E., Miller, R. M., Thompson, L. M., Scalese, A. M., Allred, B. C., Rupp, Z. W., … Junghyun Lee, A. (2014). Using likelihood ratios to detect invalid performance with performance validity measures. Archives of Clinical Neuropsychology, 29(3), 224–235. doi:10.1093/arclin/acu001
Miller, L., Ryan, J., Carruthers, C., & Cluff, R. (2004). Brief screening indexes for malingering: A confirmation of Vocabulary minus Digit Span from the WAIS-III and the Rarely Missed Index from the WMS-III. The Clinical Neuropsychologist, 18(2), 327–333. doi:10.1080/13854040490501592
Nelson, N. W., Boone, K., Dueck, A., Wagener, L., Lu, P., & Grills, C. (2003). The relationship between eight measures of suspect effort. The Clinical Neuropsychologist, 17(2), 263–272. doi:10.1076/clin.17.2.263.16511
Odland, A. P., Lammy, A. B., Martin, P. K., Grote, C. L., & Mittenberg, W. (2015). Advanced administration and interpretation of multiple validity tests. Psychological Injury and Law, 8(1), 46–63. doi:10.1007/s12207-015-9216-4
Ord, J. S., Boettcher, A. C., Greve, K. W., & Bianchini, K. J. (2010). Detection of malingering in mild traumatic brain injury with the Conners' Continuous Performance Test-II. Journal of Clinical and Experimental Neuropsychology, 32(4), 380–387. doi:10.1080/13803390903066881
Pearson. (2009). Advanced clinical solutions for the WAIS-IV and WMS-IV – Technical manual. San Antonio, TX: Author.
Persinger, V. C., Whiteside, D. M., Bobova, L., Saigal, S. D., Vannucci, M. J., & Basso, M. R. (2018). Using the California Verbal Learning Test, Second Edition as an embedded performance validity measure among individuals with TBI and individuals with psychiatric disorders. The Clinical Neuropsychologist, 32(6), 1039–1053. doi:10.1080/13854046.2017.1419507
Piatt, A. L., Fields, J. A., Paolo, A. M., Koller, W. C., & Tröster, A. I. (1999). Lexical, semantic, and action verbal fluency in Parkinson's disease with and without dementia. Journal of Clinical and Experimental Neuropsychology, 21(4), 435–443. doi:10.1076/jcen.21.4.435.885
Piatt, A. L., Fields, J. A., Paolo, A. M., & Tröster, A. I. (1999). Action (verb naming) fluency as an executive function measure: Convergent and divergent evidence of validity. Neuropsychologia, 37(13), 1499–1503. doi:10.1016/S0028-3932(99)00066-4
Powell, M. R., Locke, D. E., Smigielski, J. S., & McCrea, M. (2011). Estimating the diagnostic value of the trail making test for suboptimal effort in acquired brain injury rehabilitation patients. The Clinical Neuropsychologist, 25(1), 108–118. doi:10.1080/13854046.2010.532912
Poynter, K., Boone, K. B., Ermshar, A., Miora, D., Cottingham, M., Victor, T. L., … Wright, M. (2019). Wait, there's a baby in this bath water! Update on quantitative and qualitative cut-offs for Rey 15-Item Recall and Recognition. Archives of Clinical Neuropsychology, 34(8), 1367–1380. doi:10.1093/arclin/acy087
Proto, D. A., Pastorek, N. J., Miller, B. I., Romesser, J. M., Sim, A. H., & Linck, J. M. (2014). The dangers of failing one or more performance validity tests in individuals claiming mild traumatic brain injury-related postconcussive symptoms. Archives of Clinical Neuropsychology, 29(7), 614–624. doi:10.1093/arclin/acu044
Rai, J., An, K. Y., Charles, J., Ali, S., & Erdodi, L. A. (2019). Introducing a forced choice recognition trial to the Rey Complex Figure Test. Psychology & Neuroscience. doi:10.1037/pne0000175
Rai, J., & Erdodi, L. (2019). The impact of criterion measures on the classification accuracy of TOMM-1. Applied Neuropsychology: Adult. doi:10.1080/23279095.2019.1613994
Rapport, L. J., Farchione, T. J., Coleman, R. D., & Axelrod, B. N. (1998). Effects of coaching on malingered motor function profiles. Journal of Clinical and Experimental Neuropsychology, 20(1), 89–97. doi:10.1076/jcen.20.4.89.1143
Rey, A. (1941). L'examen psychologique dans les cas d'encéphalopathie traumatique [Psychological examination in cases of traumatic encephalopathy]. Archives de Psychologie, 28, 286–340.
Rogers, R., Sewell, K. W., Martin, M. A., & Vitacco, M. J. (2003). Detection of feigned mental disorders: A meta-analysis of the MMPI-2 and malingering. Assessment, 10(2), 160–177. doi:10.1177/1073191103010002007
Rosenfeld, B., Sands, S. A., & van Gorp, W. G. (2000). Have we forgotten the base rate problem? Methodological issues in the detection of distortion. Archives of Clinical Neuropsychology, 15(4), 349–359. doi:10.1016/S0887-6177(99)00025-6
Roye, S., Calamia, M., Bernstein, J. P., De Vito, A. N., & Hill, B. D. (2019). A multi-study examination of performance validity in undergraduate research participants. The Clinical Neuropsychologist, 33(6), 1138–1155. doi:10.1080/13854046.2018.1520303
Rose, F. E., Hall, S., Szalda-Petree, A. D., & Bach, P. (1998). A comparison of four tests of malingering and the effects of coaching. Archives of Clinical Neuropsychology, 13(4), 349–363. doi:10.1016/S0887-6177(97)00025-5
Roth, R. M., Erdodi, L. A., McCulloch, L. J., & Isquith, P. K. (2015). Much ado about norming the Behavior Rating Inventory of Executive Function. Child Neuropsychology, 21(2), 225–233. doi:10.1080/09297049.2014.897318
Savla, G., Twamley, E. W., Thompson, W. K., Delis, D. C., Jeste, D. V., & Palmer, B. W. (2011). Evaluation of specific executive functioning skills and the processes underlying executive control in schizophrenia. Journal of the International Neuropsychological Society, 17(1), 14–23. doi:10.1017/S1355617710001177
Sawyer, R. J., Testa, S. M., & Dux, M. (2017). Embedded performance validity tests within the Hopkins Verbal Learning Test – Revised and the Brief Visuospatial Memory Test – Revised. The Clinical Neuropsychologist, 31(1), 207–218. doi:10.1080/13854046.2016.1245787
Schroeder, R. W., & Marshall, P. S. (2010). Validation of the Sentence Repetition Test as a measure of suspect effort. The Clinical Neuropsychologist, 24(2), 326–343. doi:10.1080/13854040903369441
Schroeder, R. W., Martin, P. K., Heinrichs, R. J., & Baade, L. E. (2019). Research methods in performance validity testing studies: Criterion grouping approach impacts study outcomes. The Clinical Neuropsychologist, 33(3), 466–477. doi:10.1080/13854046.2018.1484517
Schroeder, R. W., Twumasi-Ankrah, P., Baade, L. E., & Marshall, P. S. (2012). Reliable Digit Span: A systematic review and cross-validation study. Assessment, 19(1), 21–30. doi:10.1177/1073191111428764
Schutte, C., Axelrod, B. N., & Montoya, E. (2015). Making sure neuropsychological data are meaningful: Use of performance validity testing in medicolegal and clinical contexts. Psychological Injury and Law, 8(2), 100–105. doi:10.1007/s12207-015-9225-3
Schwartz, E. S., Erdodi, L., Rodriguez, N., Jyotsna, J. G., Curtain, J. R., Flashman, L. A., & Roth, R. M. (2016). CVLT-II forced choice recognition trial as an embedded validity indicator: A systematic review of the evidence. Journal of the International Neuropsychological Society, 22(8), 851–858. doi:10.1017/S1355617716000746
Shura, R. D., Miskey, H. M., Rowland, J. A., Yoash-Gantz, R. E., & Denning, J. H. (2016). Embedded performance validity measures with postdeployment veterans: Cross-validation and efficiency with multiple measures. Applied Neuropsychology: Adult, 23(2), 94–104. doi:10.1080/23279095.2015.1014556
Shura, R. D., Martindale, S. L., Taber, K. H., Higgins, A. M., & Rowland, J. A. (2019). Digit Span embedded validity indicators in neurologically-intact veterans. The Clinical Neuropsychologist. doi:10.1080/13854046.2019.1635209
Silverberg, N. D., Hanks, R. A., Buchanan, L., Fichtenberg, N., & Millis, S. R. (2008). Detecting response bias on an expanded version of the Controlled Oral Word Association Test. The Clinical Neuropsychologist, 22(1), 140–157. doi:10.1080/13854040601160597
Slick, D. J., Sherman, E. M. S., & Iverson, G. L. (1999). Diagnostic criteria for malingered neurocognitive dysfunction: Proposed standards for clinical practice and research. The Clinical Neuropsychologist, 13(4), 545–561. doi:10.1076/1385-4046(199911)13:04;1-Y;FT545
Smith, A. (2007). Symbol Digit Modalities Test. Torrance, CA: Western Psychological Services.
Spencer, R. J., Axelrod, B. N., Drag, L. L., Waldron-Perrine, B., Pangilinan, P. H., & Bieliauskas, L. A. (2013). WAIS-IV Reliable Digit Span is no more accurate than Age Corrected Scaled Score as an indicator of invalid performance in a veteran sample undergoing evaluation for mTBI. The Clinical Neuropsychologist, 27(8), 1362–1372. doi:10.1080/13854046.2013.845248
Stevens, A., Friedel, E., Mehren, G., & Merten, T. (2008). Malingering and uncooperativeness in psychiatric and psychological assessment: Prevalence and effects in a German sample of claimants. Psychiatry Research, 157(1–3), 191–200. doi:10.1016/j.psychres.2007.01.003
Steward, K. A., Novack, T. A., Kennedy, R., Crowe, M., Marson, D. C., & Triebel, K. L. (2017). The Wechsler Test of Adult Reading as a measure of premorbid intelligence following traumatic brain injury. Archives of Clinical Neuropsychology, 32(1), 98–103. doi:10.1093/arclin/acw081
Suchotzki, K., Crombez, G., Smulders, F. T., Meijer, E., & Verschuere, B. (2015). The cognitive mechanisms underlying deception: An event-related potential study. International Journal of Psychophysiology, 95(3), 395–405. doi:10.1016/j.ijpsycho.2015.01.010
Sugarman, M. A., & Axelrod, B. N. (2015). Embedded measures of performance validity using verbal fluency tests in a clinical sample. Applied Neuropsychology: Adult, 22(2), 141–146. doi:10.1080/23279095.2013.873439
Suhr, J. A., & Boyer, D. (1999). Use of the Wisconsin Card Sorting Test in the detection of malingering in student simulator and patient samples. Journal of Clinical and Experimental Neuropsychology, 21(5), 701–708. doi:10.1076/jcen.21.5.701.868
Tomer, E., Lupu, T., Golan, L., Wagner, M., & Braw, Y. (2019). Eye tracking as a mean to detect feigned cognitive impairment in the Word Memory Test. Applied Neuropsychology: Adult, 27(1), 49–61. doi:10.1080/23279095.2018.1480483
Trueblood, W. (1994). Qualitative and quantitative characteristics of malingered and other invalid WAIS-R and clinical memory data. Journal of Clinical and Experimental Neuropsychology, 16(4), 597–607. doi:10.1080/088639408402671
Tsushima, W. T., Yamamoto, M. H., Ahn, H. J., Siu, A. M., Choi, S. Y., & Murata, N. M. (2019). Invalid baseline testing with ImPACT: Does sandbagging occur with high school athletes? Applied Neuropsychology: Child, 1–10.
Tyson, B. T., Baker, S., Greenacre, M., Kent, K. J., Lichtenstein, J. D., Sabelli, A., & Erdodi, L. A. (2018). Differentiating epilepsy from psychogenic nonepileptic seizures using neuropsychological test data. Epilepsy & Behavior, 87, 39–45. doi:10.1016/j.yebeh.2018.08.010
Vickery, C. D., Berry, D. T. R., Dearth, C. S., Vagnini, V. L., Baser, R. E., Cragar, D. E., & Orey, S. A. (2004). Head injury and the ability to feign neuropsychological deficits. Archives of Clinical Neuropsychology, 19(1), 37–48. doi:10.1093/arclin/19.1.37
Viglione, D. J., Giromini, L., Landis, P., McCullaugh, J. M., Pizitz, T. D., O'Brien, S., … Abramsky, A. (2019). Development and validation of the False Disorder Score: The Focal Scale of the Inventory of Problems–29. Journal of Personality Assessment, 101(6), 653–661. doi:10.1080/00223891.2018.1492413
von Helvoort, D., Merckelbach, H., & Merten, T. (2019). The Self-Report Symptom Inventory (SRSI) is sensitive to instructed feigning, but not to genuine psychopathology in male forensic inpatients: An initial study. The Clinical Neuropsychologist, 33(6), 1069–1082. doi:10.1080/13854046.2018.1559359
Webber, T. A., & Soble, J. R. (2018). Utility of various WAIS-IV Digit Span indices for identifying noncredible performance validity among cognitively impaired and unimpaired examinees. The Clinical Neuropsychologist, 32(4), 657–670. doi:10.1080/13854046.2017.1415374
Wechsler, D. (1997). Wechsler Adult Intelligence Scale (3rd ed.). San Antonio, TX: The Psychological Corporation.
Whiteside, D. M., Kogan, J., Wardin, L., Phillips, D., Franzwa, M. G., Rice, L., … Roper, B. (2015). Language-based embedded performance validity measures in traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 37(2), 220–227. doi:10.1080/13803395.2014.1002758
Whitney, K. A., Davis, J. J., Shepard, P. H., Bertram, D. M., & Adams, K. M. (2009). Digit span age scaled score in middle-aged military veterans: Is it more closely associated with TOMM failure than reliable digit span? Archives of Clinical Neuropsychology, 24(3), 263–272. doi:10.1093/arclin/acp034
Whiteside, D. M., Caraher, K., Hahn-Ketter, A., Gaasedelen, O., & Basso, M. R. (2019). Classification accuracy of individual and combined executive functioning embedded performance validity measures in mild traumatic brain injury. Applied Neuropsychology: Adult, 26(5), 472–481. doi:10.1080/23279095.2018.1443935
Whiteside, D. M., Gaasedelen, O. J., Hahn-Ketter, A. E., Luu, H., Miller, M. L., Persinger, V., … Basso, M. R. (2015). Derivation of a cross-domain embedded performance validity measure in traumatic brain injury. The Clinical Neuropsychologist, 29(6), 788–803. doi:10.1080/13854046
Wilkinson, G. S., & Robertson, G. J. (2006). Wide Range Achievement Test 4. Lutz, FL: Psychological Assessment Resources, Inc.
Yu, J., Tao, Q., Zhang, R., Chan, C. C. H., & Lee, T. M. C. (2019). Can fMRI discriminate between deception and false memory? A meta-analytic comparison between deception and false memory studies. Neuroscience & Biobehavioral Reviews, 104, 43–55. doi:10.1016/j.neubiorev.2019.06.027
Zakzanis, K. K., McDonald, K., & Troyer, A. K. (2011). Component analysis of verbal fluency in patients with mild traumatic brain injury. Journal of Clinical and Experimental Neuropsychology, 33(7), 785–792. doi:10.1080/13803395.2011.558496
Zakzanis, K. K., McDonald, K., & Troyer, A. K. (2013). Component analysis of verbal fluency in patients with mild traumatic brain injury. Brain Injury, 27(7–8), 903–908. doi:10.3109/02699052.2013.775505
Zuccato, B. G., Tyson, B. T., & Erdodi, L. A. (2018). Early bird fails the PVT? The effects of timing artifacts on performance validity tests. Psychological Assessment, 30(11), 1491–1498. doi:10.1037/pas0000596