(Oxford Library of Psychology) James N. Butcher - Oxford Handbook of Personality Assessment-Oxford University Press (2009) PDF
Editor-in-Chief
Peter E. Nathan
OXFORD LIBRARY OF PSYCHOLOGY
Oxford Handbook of
Personality Assessment
Edited by
James N. Butcher
2009
Oxford University Press, Inc., publishes works that further Oxford University’s objective
of excellence in research, scholarship, and education.
With offices in
Argentina Austria Brazil Chile Czech Republic France Greece
Guatemala Hungary Italy Japan Poland Portugal Singapore
South Korea Switzerland Thailand Turkey Ukraine Vietnam
www.oup.com
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of Oxford University Press
Contributors
Table of Contents
Chapters
Appendix
Index
OXFORD LIBRARY OF PSYCHOLOGY
and world’s most productive and best-respected psychologists have agreed to edit
Library handbooks or write authoritative chapters in their areas of expertise.
For whom has the Oxford Library of Psychology been written? Because of its
breadth, depth, and accessibility, the Library serves a diverse audience, including
graduate students in psychology and their faculty mentors, scholars, researchers,
and practitioners in psychology and related fields. Each will find in the Library the
information they seek on the subfield or focal area of psychology in which they
work or are interested.
Befitting its commitment to accessibility, each handbook includes a comprehensive
index, as well as extensive references to help guide research. And because
the Library was designed from its inception as an online as well as a print resource,
its structure and contents will be readily and rationally searchable online. Further,
once the Library is released online, the handbooks will be regularly and thoroughly
updated.
In summary, the Oxford Library of Psychology will grow organically to provide a
thoroughly informed perspective on the field of psychology, one that reflects both
psychology’s dynamism and its increasing interdisciplinarity. Once published
electronically, the Library is also destined to become a uniquely valuable interactive
tool, with extended search and browsing capabilities. As you begin to consult this
handbook, we sincerely hope you will share our enthusiasm for the more than
500-year tradition of Oxford University Press for excellence, innovation, and quality, as
exemplified by the Oxford Library of Psychology.
Peter E. Nathan
Editor-in-Chief
Oxford Library of Psychology
James N. Butcher
CONTRIBUTORS
BERNADETTE GRAY-LITTLE
University of North Carolina at Chapel Hill
Chapel Hill, NC
ROGER L. GREENE
Pacific Graduate School of Psychology
Redwood City, CA
DONALD K. GUCKER
Albuquerque, NM
ALLAN R. HARKNESS
Department of Psychology
University of Tulsa
Tulsa, OK
JENNIFER L. HARRINGTON
School of Professional Psychology
Pacific University
Hillsboro, OR
T. MARK HARWOOD
Department of Psychology
Wheaton College
Wheaton, IL
STEPHEN N. HAYNES
Department of Psychology
University of Hawai‘i at Manoa
Honolulu, HI
LOWELL W. HELLERVIK
Personnel Decisions, Inc.
Minneapolis, MN
RICHARD E. HEYMAN
Department of Psychology
Stony Brook University
Stony Brook, NY
JESSICA JONES
Department of Psychology
University of Minnesota
Minneapolis, MN
KAREN KLOEZEMAN
Department of Psychology
University of Hawai‘i at Manoa
Honolulu, HI
ROBERT F. KRUEGER
Department of Psychology
Washington University
St. Louis, MO
MARTIN LEICHTMAN
Leawood, KS
ANGUS W. MACDONALD, III
Department of Psychology
University of Minnesota
Minneapolis, MN
KRISTIAN E. MARKON
Department of Psychology
University of Iowa
Iowa City, IA
ROBERT R. MCCRAE
National Institute on Aging
National Institutes of Health
Baltimore, MD
EDWIN I. MEGARGEE
Havana, FL
J. REID MELOY
Department of Psychiatry
University of California San Diego
La Jolla, CA
GREGORY J. MEYER
Department of Psychology
University of Toledo
Toledo, OH
DAVID S. NICHOLS
School of Professional Psychology
Pacific University
Hillsboro, OR
JENNIFER LAFFERTY O’CONNOR
Remuda Programs for Eating Disorders
Wickenburg, AZ
MIMI OKAZAKI
Florida State Hospital
Chattahoochee, FL
SUMIE OKAZAKI
Department of Psychology
University of Illinois
Champaign, IL
RAYMOND L. OWNBY
Department of Psychiatry and Behavioral Sciences
Miller School of Medicine
University of Miami
Miami, FL
JULIA N. PERRY
VA Medical Center
Minneapolis, MN
MICHELLE B. RANSON
Fielding Graduate University
Santa Barbara, CA
DAMON ANN ROBINSON
Daphne, AL
STEVEN V. ROUSE
Seaver College
Pepperdine University
Malibu, CA
LINDSEY J. SCHIPPER
Department of Psychology
University of Kentucky
Lexington, KY
ANNE L. SHANDERA
Department of Psychology
University of Kentucky
Lexington, KY
GREGORY T. SMITH
Department of Psychology
University of Kentucky
Lexington, KY
DOUGLAS K. SNYDER
Department of Psychology
Texas A&M University
College Station, TX
MYRIAM J. SOLLMAN
Department of Psychology
University of Kentucky
Lexington, KY
SUSAN C. SOUTH
Department of Psychological Sciences
Purdue University
West Lafayette, IN
STEPHEN STARK
Department of Psychology
University of South Florida
Tampa, FL
RONALD A. STOLBERG
California School of Professional Psychology
Alliant International University
San Diego, CA
STANLEY SUE
Department of Psychology
University of California
Davis, CA
NATHAN C. WEED
Department of Psychology
Central Michigan University
Mount Pleasant, MI
IRVING B. WEINER
University of South Florida
Tampa, FL
THOMAS A. WIDIGER
Department of Psychology
University of Kentucky
Lexington, KY
JUDITH WORELL
University of Kentucky
Lexington, KY
DAWN T. YOSHIOKA
Department of Psychology
University of Hawai‘i at Manoa
Honolulu, HI
TAMIKA C. B. ZAPOLSKI
Department of Psychology
University of Kentucky
Lexington, KY
CONTENTS
13. Clinical Applications of Behavioral Assessment
Stephen N. Haynes, Dawn T. Yoshioka, Karen Kloezeman, and Iruma Bello
14. The MMPI-2: History, Interpretation, and Clinical Issues
Andrew C. Cox, Nathan C. Weed, and James N. Butcher
15. Personality Assessment with the Rorschach Inkblot Method
Irving B. Weiner and Gregory J. Meyer
16. The Five-Factor Model and the NEO Inventories
Paul T. Costa, Jr. and Robert R. McCrae
17. The California Psychological Inventory
Edwin I. Megargee
18. Personality Disorders Assessment Instruments
Thomas A. Widiger and Sara E. Boyd
19. Functional Imaging in Clinical Assessment? The Rise of Neurodiagnostics with fMRI
Angus W. MacDonald, III and Jessica A. H. Jones
Part Seven Interpretation and Reporting of Assessment Findings
32. Assessment of Feigned Psychological Symptoms
David T. R. Berry, Myriam J. Sollman, Lindsey J. Schipper, Jessica A. Clark, and Anne L. Shandera
33. Assessment of Clients in Pretreatment Planning
T. Mark Harwood and Larry E. Beutler
34. Assessment of Treatment Resistance via Questionnaire
Julia N. Perry
35. Writing Clinical Reports
Raymond L. Ownby
36. How to Use Computer-Based Reports
James N. Butcher
37. Overview and Future Directions
James N. Butcher
Appendix
Index
PART 1
Overview and Introduction
CHAPTER 1
Clinical Personality Assessment: History, Evolution, Contemporary Models, and Practical Applications
James N. Butcher
Abstract
Our goal in this volume is to provide a comprehensive view of the field of personality assessment, from
its historical roots, through the major methodological issues that define the field and have led to numerous
approaches and applications, to the broad range of assessment techniques available today. One can
find throughout history an awareness of the importance of evaluating personality and character using
observation and other means of acquiring information on which to make such decisions. For over 3,000
years, an elaborate system of competitive examinations was used in China. The modern era for personality
assessment began in late nineteenth-century England with the work of Sir Francis Galton. The first
self-report personality inventory used to obtain personality information was developed by Robert
Woodworth during World War I. Numerous personality inventories were developed in the years that
followed. One of the most successful has been the Minnesota Multiphasic Personality Inventory (MMPI)
developed in the late 1930s by Starke Hathaway and J. C. McKinley. Paralleling the work on personality
inventories was the development of projective assessment instruments. The most significant projective
test was developed by the Swiss psychiatrist Hermann Rorschach in 1921. Another projective instrument,
the Thematic Apperception Test, or TAT, was developed in 1935 by Henry Murray and Christiana
Morgan. These instruments came to be the most widely used assessment instruments during the
expansion of applied psychology after World War II and are still the predominant tests in use today.
In this handbook, an overview of contemporary personality assessment will be provided by a number of
prominent assessment psychologists who give the reader a comprehensive perspective on assessment
methodology and instruments. In addition, some challenges that assessment psychology faces will be
described. Discussion is provided to highlight significant assessment trends and projections about new
directions for research.
Keywords: historical roots, highlights in assessment, origins, phrenology, fMRI, Rorschach, MMPI, TAT,
testing standards
People have likely been making personality assessment decisions and personality appraisals since the beginning of human interactions. As an adaptive human activity, effective personality appraisal has clear survival value for our species (Buss, 1984). If our cave-dwelling ancestors happened to choose a hunting partner with the “wrong” personality and motivations, the outcome could be tragic. Or if groups from antiquity failed to evaluate whether potential leaders possessed essential personal qualities, their own survival could be threatened. Personality assessments in antiquity were probably based on diverse, if not particularly valid or reliable, sources of information. Early writings, for example, mention such factors as dreams, signs from the gods, oracular speculation, spies, traitors, direct observation, pedigree, physiognomic characteristics, and the interview as important sources of information about people. Even “tests” were employed to evaluate personality characteristics. Hathaway (1965) described an early personality screening situation from the Old Testament:
For example, Gideon had collected too large an army, and now the Lord saw that the Israelites would give Him scant credit if so many men overran the Midianites camped in the valley. The Lord suggested two screening items for Gideon. The first of these items had face validity: Gideon proclaimed that all who were afraid could go home. More than two of every three did so. The second was subtle: Those who fought the Midianites were the few who drank from their cupped hands instead of stooping to drink. Altogether it was a battle decided by psychological devices, and 300 men literally scared the demoralized Midianites into headlong flight (p. 457).
Highlights in History
Efforts to appraise personality characteristics were evident in antiquity. DuBois (1970) pointed out that for over 3,000 years an elaborate system of competitive examinations was used for selecting government personnel in China. Some of these tests were designed to assess the personal characteristics of applicants. This system of examinations was described in the year 2200 BC, when the emperor of China examined government officials every 3 years to determine their fitness for office (DuBois, 1970). In 1115 BC, candidates for government positions were examined for their proficiency in the “six arts”: music, archery, horsemanship, writing, arithmetic, and the rites and ceremonies of public and private life (DuBois, 1970). Examinees were required to write for hours at a time and for a period of several days in succession. For a picture of the early testing booths, see Ben-Porath and Butcher (1991).
One can find throughout history an awareness of the importance of evaluating personality and character, using observation and other means of acquiring information on which to make such personality-based judgments. However, interest in understanding and evaluating underlying personality and personal qualities began to take a more objective focus in the nineteenth century. Some scholars came to believe that human physical characteristics such as bumps on the head or the shape of the eyes were associated with underlying personality or character and that one could understand people by careful observation of their physical features. This belief, known as phrenology, was held by a number of prominent physicians. Phrenology appealed to intellectuals who followed a biological determinism that promised individuals they could improve themselves in life by being able to read and understand characteristics of other people from their appearance. Boring (1950) pointed out that the modern character of phrenology (as well as its name) was initially founded by Johann Spurzheim (1776–1832), a Viennese physician who studied with Franz Joseph Gall (1758–1828) in 1800 and was soon employed as his assistant. Gall, also a Viennese physician and lecturer, had been investigating his theory that a powerful memory in humans was reflected in very prominent eyes and that other bodily characteristics, such as head size, were indicative of special talents for painting or music. Gall and Spurzheim were very popular speakers among upper-class intellectuals and scientists in Europe in the 1820s and influenced British physician George Combe (1788–1858), who was the most prolific British phrenologist of the nineteenth century.
During the brief period that Spurzheim and Gall worked together, Gall had hoped to have Spurzheim as his successor and included him as a co-author in a number of publications. However, they had a disagreement that prompted Spurzheim to start his own career, writing extensively about the physiognomical system. Spurzheim was highly successful in his effort to popularize phrenology throughout Europe, particularly in France and England. After leaving the tutorship of Professor Gall, he continued to expand his views and established a new and more complete topography of the skull, filling in blanks that Gall felt had not been empirically established. He also revised the terminology for the faculties used in phrenology. Once he completed the system to his satisfaction, Spurzheim set about spreading the word on phrenology in other countries. Through his lectures in Europe he influenced others such as John C. Warren (1778–1856), a professor of medicine at Harvard (also known for performing the first surgery under ether in the United States). Warren had begun the study of Gall’s ideas on phrenology in 1808, but during a trip to France he studied Spurzheim’s views on phrenology in 1821. When he returned to Boston, he developed a series of lectures on phrenology at Harvard and later incorporated these ideas into presentations for a broader audience at the Massachusetts Medical Society.
The most prominent promoter of phrenology in the United States during the 1820s was Charles Caldwell (1772–1853), who had also attended Spurzheim’s lectures in Paris. He toured throughout the United States lecturing on phrenology and founded a number of societies that promoted phrenology. In 1832, Spurzheim traveled to the
Table 1.1. Highlights in the History of Personality Assessment.
Year Personality Test Reference
Development of the Three Major Clinical Personality Assessment Instruments of the Twentieth Century: Rorschach, TAT, and MMPI
The Rorschach
The most significant event in the history of projective psychological assessment occurred during the early years of the twentieth century when the Swiss psychiatrist Hermann Rorschach published his monograph Psychodiagnostik, which detailed the development of the “Rorschach inkblot technique.” During this period, inkblots were commonly used in Europe as stimuli for imagination in the popular parlor game known as Klecksographie or blotto, in which people made up responses to various inkblot designs. Although Rorschach had experimented with various types and forms of inkblots in 1911, he did not pursue these efforts until his work in 1921 that detailed the interpretation of responses to the inkblots to assess mental health symptoms and personality. This was the only work that Rorschach published on the test because he died of acute appendicitis the following year.
Subsequent developments and refinements of the Rorschach inkblot technique occurred several years later in the United States when Beck (1938), Klopfer and Tallman (1938), and Hertz (1938) began to use the Rorschach blots to understand personality and emotional characteristics of patients. This movement was to see the publication of thousands of articles and the recruitment of countless advocates who began to use the 10 blots in clinical settings, and it continues today. Beck, Klopfer, and Hertz also developed separate interpretation systems for the inkblots in the 1940s. The most widely used contemporary Rorschach interpretive system was published by John Exner (1983).
An important theoretical article defining this approach to personality assessment was published by L. K. Frank (1939). This article, detailing the use of ambiguous stimuli like the Rorschach inkblots to assess personality, provided an important theoretical explanation that came to be known as the projective method in assessment. In Frank’s view, the use of ambiguous stimuli allowed people to “project” their internal meanings and feelings from their unconscious personality in their responses. This process of projection allowed an interpreter to gain a better understanding of the client’s thinking. Frank’s theoretical account of the projective process was influential as projective tests such as the Rorschach and TAT expanded in use during the 1940s.
The TAT
The Thematic Apperception Test, or TAT, was developed in 1935 by Henry Murray and Christiana Morgan and described in Murray’s early works (1938, 1943). The TAT is a projective measure composed of a series of pictures, for each of which a client is asked to make up a story describing the events going on in the picture. The test allows the clinician to evaluate the client’s thought patterns, attitudes, beliefs, observational capacity, and emotional
joined the committee a few years later at the data analysis stage. The MMPI-2 was published in 1989 and the MMPI-A (for adolescents) in 1992 (Butcher, Williams, Graham, & Ben-Porath, 1992).
Expansion of Personality Assessment During and Following World War II
The entry of the United States into World War II in 1941 brought with it an important need for psychological services in both the clinical service area and personnel selection. The military services implemented several programs in which tests like the MMPI were used in personnel selection for positions such as pilots and special services personnel (Altus, 1945; Blair, 1950; Fulkerson, Freud, & Raynor, 1958; Jennings, 1949; Melton, 1955).
Other wartime assessment efforts in the selection of special forces included those of the Office of Strategic Services or OSS, a predecessor to the present Central Intelligence Agency, which performed extensive psychological evaluations on persons who were to be assigned to secret overseas missions during the war. Henry Murray created and supervised the selection process in which over 5,000 candidates were evaluated. The psychologists involved in the program used over a hundred different psychological tests and specially designed procedures to perform the evaluations. The extensive assessment project was described, after the work was declassified when the war ended, in the staff history provided by the Office of Strategic Services Assessment Staff (1948). See also the article by Handler (2001).
Expansion of Personality Inventory Development in the Later Part of the Twentieth Century
After World War II, developments in personality assessment continued at a high rate. Most notably, shortly after the war, Cattell began working on the factor analysis of personality and published the Sixteen Personality Factors (16-PF) based upon a number of factor analyses he conducted on a large item pool of adjectives he used to construct trait names. Cattell’s 16-PF included a set of 15 personality trait scales and one scale to assess intelligence, which were designed to assess the full range of normal personality functioning (Cattell & Stice, 1957). This personality inventory has gained wide acceptance and use in both personnel and research applications.
In 1948, Harrison Gough, who had studied with Hathaway at Minnesota, began work on a set of personality trait scales that would assess personality characteristics in nonclinical populations. The California Psychological Inventory or CPI, published in 1956 (Gough, 1956), contained 489 items (over 200 of which were from the original MMPI). He included an additional group of items to address personality traits that were not addressed by MMPI items. The personality scales for the CPI were grouped into four categories: (1) Poise, (2) Socialization, (3) Achievement Potential, and (4) Intelligence and Interest Modes. The CPI scales were designed to assess a broad range of personality attributes found in “normal” populations and became a standard measure for assessing personality in personnel selection and in conducting psychological research (see discussion by Megargee in Chapter 17).
In 1977, Theodore Millon developed the Millon Clinical Multiaxial Inventory (MCMI) (see Millon, 1977) to assess personality problems among clients in psychotherapy. Millon based his test development strategy upon his theory of psychopathology (a strategy that is very different from the strict empirical method followed by Hathaway and McKinley with the MMPI). Item development followed a rational strategy and his comparison samples were patients in psychotherapy rather than a “normal” population. The MCMI largely addresses Diagnostic and Statistical Manual of Mental Disorders (DSM) Axis II dimensions of personality rather than the symptom disorders on Axis I of DSM that are addressed by the MMPI.
The NEO Personality Inventory (NEO PI) was developed by Paul Costa and Robert McCrae to assess the personality dimensions that had been described as the “Big Five” or Five Factor Model of personality. The NEO was published in 1985 as a measure of the major dimensions of normal personality: openness, agreeableness, neuroticism, extraversion, and conscientiousness (Costa & McCrae, 1985).
The Personality Assessment Inventory (PAI) was developed by Leslie Morey in 1991. This instrument, similar to the MMPI, was designed to address the major clinical syndromes, such as depression (Morey, 1991). The PAI contains item content and scales that are similar to those of the MMPI and was devised for clinical assessment in a range of settings analogous to the applications of the MMPI.
Computer-Based Personality Assessment
No treatment of the history of personality assessment would be complete without mention of the
personality is structured have led to rather different assumptions about what data are important in understanding personality. For example, psychodynamic theorists have viewed personality as a complex and intricate system of drives and forces that cannot be understood without extensive personal historical information. The means of understanding the connections between personal history and personality can come from many sources, such as the individual’s response to intentionally ambiguous stimulus material such as the Rorschach inkblots or their response to a highly structured sentence acknowledging a clinical symptom. Other perspectives in personality have been oriented more toward “surface” behaviors. For example, a learning- or behavioral-based viewpoint may not involve such historic assumptions as the psychodynamic view but may rely more on overt, observable behaviors. Most researchers and practitioners are familiar with the “standard” sources of personality information that have been developed to appraise personality today: personality inventories, projective methods, and the clinical interview.
Goals of the Handbook
Our goal in this volume is to provide a broad-based and comprehensive view of the field of personality assessment from its historical roots to the major sources of theoretical stimuli, and the methodological issues defining the field that have led to numerous approaches and applications. This volume will serve as an introductory textbook in psychological assessment for graduate students in clinical and counseling psychology as well as a comprehensive “refresher course” for active professionals in the field. At the outset, we provide an introduction to the major sources of personality-based thought, both genetic and cultural sources of information. We next turn to presenting a framework for personality assessment and examine the methodological and conceptual factors and the psychometric considerations defining the field. We have included a number of chapters that address broad methodological issues and strategies for developing personality assessment approaches. We also include several chapters that examine the meanings of personality assessment results. In addition, we include a number of topics that deal with issues that can affect findings or that require particular attention if results are to be meaningful. We have tried to incorporate a broad range of personality assessment procedures and instruments to give the reader a picture of the depth of the personality assessment field today. We also address factors related to a number of specific populations that require special attention in conducting personality assessments. We examine traditional clinical personality assessment approaches and include new and expanding avenues of assessment such as behavior genetics and functional magnetic resonance imaging (fMRI). Finally, we examine several problem areas in clinical personality assessment that require special attention because they present problems or issues that are somewhat different from those in mental health settings, where much of the interpretive lore for clinical personality tests has evolved.
Psychological assessments are undertaken in many different settings and for many different purposes. The same psychological tests might be employed to evaluate clients in forensic settings (for example, to assist the court in determining custody of minor children in family disputes) as in mental health settings to determine the nature and extent of psychological problems in pretreatment planning.
In this book, contributions were invited to provide the reader with illustrations of a number of different applications. Authors and topics were selected to cover as many problem areas as practical. Space limitations, of course, prevent a full exploration of all the areas that might touch on assessment. It is hoped that the topics chosen will illustrate both the diversity and the effectiveness of the assessment techniques involved. In all cases, chapters were invited that would address problem-oriented assessment, and they were written by noted psychologists with substantial expertise in the assessment area in question.
Genetic and Cultural Perspectives
We begin this volume by providing two chapters that orient the readers to major sources of variance in the development of personality and important considerations for developing personality measures to assure relevance for practical assessment. A great deal of personality-based genetic research suggests that many personality traits (those enduring personality characteristics) are likely to be in part inherited qualities. In Chapter 2, Susan South, Robert Krueger, and Kristian Markon focus on the genetic factors pertinent to understanding and assessing personality in the light of extensive behavior genetics research today. South, Krueger, and Markon provide a background in behavioral genetics and detail the important influences on personality assessment emanating from this rapidly growing research domain. Their chapter “Behavior genetic perspectives on clinical personality assessment” discusses how twin studies work to
They discuss test standardization issues and focus upon the importance of test administration factors in assessment. The authors also highlight the importance of considering differences that can emerge and accommodations that might be required in the testing of minority clients and those with disabilities. Over time, psychological assessment procedures that have been established as standards in the field for measuring aspects of personality begin to show their age or become targeted for change for some reason. After a personality inventory has been widely used for a number of years, its limitations may begin to be recognized, and efforts emerge to modify or surpass the existing scales with new or modified measures. Are there important factors that need to be considered if standard measurement instruments or tests become altered? In Chapter 7, entitled “Changing or replacing an established standard: Issues, goals, and problem,” Michelle Ranson, David Nichols, Steven Rouse, and Jennifer Harrington address these factors and discuss the issues involved in developing test revisions or modifications of instruments that have become widely used standards. They consider several factors: ethical standards, when a measure should be revised, goals for the revised measure, pragmatic concerns for widely used procedures, challenges for self-report inventories, and challenges for free-response measures. They also provide a case study of the MMPI, describing the MMPI-2 restandardization project and some recent changes being developed for the instrument, to highlight the importance of assuring continuity in the applied psychological assessment base.
An important issue in personality assessment and prediction of behavior involves the accuracy of the assessment procedure. It is important for practitioners to consider the relative frequency of phenomena (base rates) in particular settings in order to interpret psychological test scores appropriately. One factor that influences the power of an instrument to predict behavior is the base rate of the occurrence of the phenomenon in the population. Stephen Finn, in Chapter 8, “Incorporating base rate information in daily clinical decision making,” addresses this important issue. Base rates are defined for specified populations and are restricted to them. Thus, a base rate is the [...] behaviors (e.g., suicide), the interpretation of test data (e.g., MMPI-2 scores), and the making of diagnostic decisions.
Predicting phenomena with low base rates is difficult. In general, it is easier to increase predictive accuracy when the events to be predicted are moderately likely to occur (i.e., occur with a probability close to 50%) than when the events are unlikely to occur. Much improvement in clinical decision making could be accomplished if each of us remembered to think about base rates every time we made a clinical hypothesis. The chapter by Finn provides clear strategies for incorporating base rate information in the prediction process.
Personality assessment has been substantially advanced by theoretical conceptualization of the variables that make up personality. The view that personality is composed of definable factors such as personality traits (see the classical work of Allport) has made developing personality-based measures an easier task. Allan Harkness in Chapter 9, “Theory and measurement of personality traits,” provides a contemporary perspective on the importance of traits in personality assessment. Harkness discusses individual differences in traits and describes the “traits versus states” debate in the search for explanatory concepts in personality assessment. He examines factors that create, develop, and maintain traits, and he explores the causal source of personality variance. As an illustration, he describes the development of the MMPI-2 PSY-5 Scales and illustrates their use for assessing personality dimensions.
Computer-based interpretation of psychological tests has a long history dating back to 1962, when psychologists at the Mayo Clinic developed the first MMPI interpretation program. In contemporary assessment psychology, many instruments provide computer-derived reports. The extent of this medium of interpretation will be explored in Chapter 10, “Computer-based interpretation of personality scales,” by James Butcher, Julia Perry, and Brooke Dean. The authors describe the issues involved in providing computer-based personality tests and discuss the professional guidelines for offering computer testing services. Research on the adequacy of widely used computer
a priori chance or prior odds that a member of a testing resources will be summarized.
specified population will have a certain characteristic,
if we know nothing else about this person other than Personality Assessment Procedures and
that he or she is a member of the population we are Instruments
examining. Base rates have important implications for We next turn to an exploration of a number
a wide variety of issues in clinical practice. Although of personality assessment procedures and instru-
rarely acknowledged, base rates affect the prediction of ments that are widely used in contemporary
BUTCHER 15
a number of self-report inventories such as the PAI, MCMI-III, MMPI-2, and Hare Checklist. They provide a solid perspective on the issues that need to be considered in the assessment of personality disorders including gender, culture, and ethnicity.
New psychophysiologic approaches to personality have emerged in recent years to assess personality factors or problems. Research on magnetic brain imaging has been increasing at a very rapid pace over the past 10 years. There have been numerous studies published on using fMRI to explore and identify brain processes underlying mental disorders, such as schizophrenia, and to improve the evaluation of these conditions. We have included in this handbook a chapter that addresses the emerging fMRI assessment field, entitled “Functional imaging in clinical assessment? The rise of neurodiagnostic imaging with the fMRI” (Chapter 19), by Professors Angus MacDonald, III, and Jessica Hurdelbrink. The chapter includes a description of the use of fMRI in assessing psychiatric disorders and provides an up-to-date survey of the important findings in the area. A great deal of research effort is being devoted to the psychobiology of abnormal behavior today. MacDonald and Hurdelbrink address a new, but rapidly expanding, area of assessment research and its potential in providing unique information in understanding clinical problems.
Specific Populations
By its very nature the field of personality assessment is concerned with individual differences and is enveloped in human diversity. To understand the behavior and personality of individuals, it is imperative that many “status” variables be given careful consideration. Background factors such as age, gender, and ethnicity are important, since personality is in many respects influenced by the environment or groups to which a person belongs. People may share certain characteristics of groups with which they are affiliated. In the assessment of individuals it is important to consider influences that may come from belonging to a group that has been treated differently by our social institutions than others have. Being a member of an ethnic minority group in the United States places a person at considerable risk for discrimination and diminished opportunity. Those who are cast in less dominant societal roles by fact of birth may be at risk for developing problems or adjustment difficulties that are not shared by the majority classes. To fully explore the potential problems associated with being a minority group member in contemporary society, several contributors were asked to provide different perspectives on issues involved in the psychological assessment of minority clients. We begin our discussion in Chapter 20 with an article entitled “Clinical personality assessment with Asian Americans.” Sumie Okazaki, Mimi Okazaki, and Stanley Sue contribute an informative discussion of possible factors involved in the assessment of ethnic Asian minorities that need to be considered in cross-ethnic psychological evaluations. Their work with Asian-American clients has made a substantial contribution to clinical as well as to cross-cultural psychology. In this article they provide information on the demographics of the population and a description of the recent literature on research with Asian-American populations. They describe the problem of the imposed-etic perspective in studies involving Asian-American populations. They also describe studies with overseas Asian populations, provide a critique of imposed-etic studies, and discuss the alternate approach, the indigenous or emic perspective. They conclude with a framework and guidelines for conducting assessments with Asian-Americans. Professor Bernadette Gray-Little, in her chapter entitled “The assessment of psychopathology in racial and ethnic minorities” (Chapter 21), addresses factors to consider in conducting psychological evaluations on African-American clients. She describes the factors that are important in the clinical interview pertinent to understanding bias and social distance that can impact an evaluation. She examines racial and ethnic variations that can occur in symptoms of distress and addresses potential factors to consider in employing psychological tests such as the MMPI-2, the MCMI, the Rorschach Inkblot Test, the TAT, and other projective instruments.
This handbook also addresses several populations requiring the assessment psychologist to consider population-specific factors in dealing effectively with potential differences in personality assessment. Women may experience situations or activities differently because of the roles in which they may have been cast and in their lack of access to equal opportunities in society. In Chapter 22, “Issues in clinical assessment with women,” Judith Worell and Damon Robinson provide valuable insights into understanding specific factors in assessment of women. They include factors related to the purposes of the assessment that is being undertaken and deal with issues related to possible gender bias in the measurement, for example, factors such as sex of
the process of screening for substance abuse. They discuss several traditional personality scales that have been widely used to identify alcohol or drug problems such as the MacAndrew Alcoholism Scale (MAC-R) and the Addiction Admission and Addiction Potential Scales (MMPI-2), scales b and t on the MCMI/MCMI-II. In addition, they describe additional screening scales such as the Alcohol Use Inventory (AUI). The authors also provide a discussion on the timing of psychological assessment and the role of objective psychological tests traditionally used in alcohol or drug treatment programs.
Edwin I. Megargee, in Chapter 28, entitled “Understanding and assessing aggression and violence,” discusses the issues pertinent to defining and understanding these problems in society. Dr. Megargee discusses the factors leading to aggressive behavior and violence and provides an overview of contemporary methods being used to assess and predict aggressive behavior and violence. He describes the types of assessments and deals with referral questions and assessment contexts such as the not guilty by reason of insanity defense, treatment planning, and needs assessments. He details useful information on assessment tools and techniques that are effective for retrospective and prospective predictions. In his chapter, he provides an overview of violent crimes (assault and murder, domestic violence, forcible rape and sexual battery, kidnapping and hostage taking, arson, bombing, and terrorism) and describes the types of violent offenders delineated in the literature. He also highlights a conceptual framework for the analysis of aggression and violence including his “algebra of aggression” for predicting violence. He describes some personal factors that decrease the likelihood of aggression and discusses situational factors influencing the likelihood of aggression. In a related chapter (Chapter 29), “Assessing antisocial and psychopathic personalities,” Carl Gacono and Reid Meloy provide a comprehensive treatment on the assessment of antisocial personality disorder. They begin with a discussion of forensic assessment and issues, and then provide a thorough overview of our knowledge of the personality disorder and a discussion of the relevant research findings on the use of the Psychopathy Checklist-Revised, the Rorschach, and the MMPI-2. They also include an overview of various measures of cognition and intelligence and their relevance to understanding antisocial personality.
The next chapter in the handbook provides an example of personality assessment in a nonclinical context, personnel selection. Chapter 30, “Clinical personality assessment in the employment context,” by Butcher, Gucker, and Hellervik provides an overview of the use of personality measurement techniques in employment applications. Major goals of the chapter include the following areas of emphasis. First, a rationale for the inclusion of clinical personality assessment measures in personnel decisions such as employment screening, fitness for duty evaluations, and reliability screenings for making recommendations for promotion to responsible positions or for the issuance of security clearances for sensitive applications will be provided. The history of personnel assessment is highlighted. A number of contemporary issues pertaining to clinical personnel screening will be presented. Next, a historical summary of the use of the most widely used personality measure, the MMPI-2, will be presented along with a strategy for interpreting the instruments used in employment settings. The chapter will include several practical examples of clinically based assessments in employment settings. The target audience for the article is professionals who are using personality assessment measures in these industrial-organizational (I-O) settings or are considering a career as an assessment practitioner in business/industrial settings. The strategy of interpreting psychological measures in personnel selection is illustrated with a case example. The article includes descriptive information on the most widely used measure (the MMPI-2) and contains up-to-date references and resources that interested readers wishing more thorough information can follow up on.
We next turn to a guide for the care and effective documentation of psychological test results. Regardless of one’s basic approach to personality assessment, careful accumulation and analysis of information are important considerations in any psychological assessment. In this regard, the discussion by Irving Weiner (Chapter 31) provides the practitioner with important background information and a clear rationale for employing meticulous safeguards in conducting personality evaluations in order to avoid potential legal or ethical problems. Weiner, who has made substantial contributions to the field of personality assessment, provides a valuable overview of the issues and guidelines for practitioners in his chapter entitled “Anticipating ethical and legal challenges in personality assessment.” In this chapter, he discusses topics such as accepting a referral, selecting the test battery for conducting the psychological evaluation, how to prepare and present a report, and factors important to managing case records in an assessment practice.
No treatment of clinical assessment would be complete without a chapter on the validity of
fill an important niche in the overall plan. The vastness of the field of personality assessment today does not permit all noted authorities and all perspectives to be equally represented. Some selectivity was required given the limitations of space. The reader will, of course, be the final judge as to how the many parts blend into an integral picture. As for me, I believe that the final pieces matched the initial plan quite well. A primary goal of this volume was to provide a practical and comprehensive overview of the field of personality assessment. I believe that the contributions included here provide the reader with a substantial compendium of assessment resources with diverse and interesting elements. I hope that clinicians and clinicians-in-training who are new to the field of personality assessment will be tantalized by the views and strategies presented here and will travel these paths further when this book is set aside.
Perpetuation of valuable scholarly resources for a scientific discipline is an important goal of this volume to help maintain valuable standards while at the same time incorporating new developments in the field. Following the general plan of the Oxford Handbook Series, this volume ends with a chapter, “Personality assessment: Overview and future directions” (Chapter 37), by the volume editor. An overview of the status of personality assessment as described in the contributions to this handbook will be provided and the significant contributions to future development will be summarized. Some lingering challenges to the field of personality assessment will be highlighted. Discussion is provided to highlight significant assessment trends and projections about new directions for research.
Index of Assessment Procedures
A great variety of personality assessment instruments and procedures are discussed in this volume. Because it is unlikely that readers are familiar with all of them, contributors were asked to provide a brief description of the tests they discuss in their chapter. These assessment procedures have been summarized and are described in an appendix.
References
Allport, G. W. (1937). Personality: A psychological interpretation. New York: Holt, Rinehart & Winston.
Allport, F. H., & Allport, G. W. (1921). Personality traits: Their classification and measurement. The Journal of Abnormal Psychology and Social Psychology, 16, 6–40.
Altus, W. D. (1945). The adjustment of army illiterates. Psychological Bulletin, 42, 461–476.
Atlis, M. M., Hahn, J., & Butcher, J. N. (2006). Computer-based assessment with the MMPI-2. In J. N. Butcher (Ed.), MMPI-2: The practitioner’s handbook (pp. 445–476). Washington, DC: American Psychological Association.
Beck, S. J. (1938). Personality structure in schizophrenia: A Rorschach investigation in 81 patients and 64 controls. Nervous and Mental Disorders Monograph Series, 63, ix–88.
Ben-Porath, Y. S., & Butcher, J. N. (1991). The historical development of personality assessment. In C. E. Walker (Ed.), Clinical psychology: Historical and research roots (pp. 121–156). New York: Plenum Publishing Corporation.
Bernreuter, R. G. (1931). The personality inventory. Palo Alto, CA: Consulting Psychologists Press.
Bernreuter, R. G. (1933). The theory and construction of the personality inventory. Journal of Social Psychology, 4, 387–405.
Blair, W. R. N. (1950). A comparative study of disciplinary offenders and non-offenders in the Canadian Army, 1948. Canadian Journal of Psychology, 4, 49–62.
Boring, E. G. (1950). A history of experimental psychology (2nd ed.). New York: Appleton-Century-Crofts.
Buss, D. M. (1984). Toward a psychology of person–environment (PE) correlation: The role of spouse selection. Journal of Personality and Social Psychology, 47, 361–377.
Butcher, J. N. (Ed.). (1972). Objective personality assessment: Changing perspectives. New York: Academic Press.
Butcher, J. N. (2006). Assessment in clinical psychology: A perspective on the past, present challenges, and future prospects. Clinical Psychology: Research and Practice, 13(3), 205–209.
Butcher, J. N., Dahlstrom, W. G., Graham, J. R., Tellegen, A. M., & Kaemmer, B. (1989). Minnesota Multiphasic Personality Inventory-2 (MMPI-2): Manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.
Butcher, J. N., Ones, D. S., & Cullen, M. (2006). Personnel screening with the MMPI-2. In J. N. Butcher (Ed.), MMPI-2: The practitioner’s handbook (pp. 381–406). Washington, DC: American Psychological Association.
Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual differences and clinical assessment. Annual Review of Psychology, 47, 87–111.
Butcher, J. N., Williams, C. L., Graham, J. R., Archer, R., Tellegen, A., Ben-Porath, Y. S., et al. (1992). MMPI-A manual for administration, scoring, and interpretation. Minneapolis, MN: University of Minnesota Press.
Butcher, J. N., Williams, C. L., Graham, J. R., Tellegen, A., Ben-Porath, Y. S., Archer, R. P., et al. (1992). Manual for administration, scoring, and interpretation of the Minnesota Multiphasic Personality Inventory for Adolescents: MMPI-A. Minneapolis, MN: University of Minnesota Press.
Cattell, R. B., & Stice, G. E. (1957). The sixteen personality factors questionnaire. Champaign, IL: Institute for Personality and Ability Testing.
Costa, P. T., Jr., & McCrae, R. E. (1985). The NEO Personality Inventory manual. Odessa, FL: Psychological Assessment Services.
DuBois, P. L. (1970). A history of psychological testing. Boston: Allyn & Bacon.
Exner, J. E., Jr. (1983). The Exner report for the Rorschach Comprehensive System. Minneapolis, MN: National Computer Systems.
Frank, L. K. (1939). Projective methods for the study of personality. Journal of Psychology, 8, 543–557.
Fulkerson, S. C., Freud, S. L., & Raynor, G. H. (1958). The use of the MMPI in the psychological evaluation of pilots. Journal of Aviation Medicine, 29, 122–129.
PART 2
Genetic and Cultural Perspectives
CHAPTER 2
Behavior Genetic Perspectives on Clinical Personality Assessment
Abstract
This chapter discusses how the field of behavior genetics—the genetic and environmental contributions to
individual differences in human behavior—can aid and inform personality assessment. These two fields of
study are often quite distinct: personality assessment applies to the study of a singular individual; behavior
genetics typically is used to describe population-level individual differences. However, behavior genetic
methodology has been vital in helping to understand how genetic and environmental influences transact in
the development of personality. Nature and nurture are both important contributors to variation in
human personality and newer methodologies from both behavior and molecular genetics hold great
promise for understanding how different etiological factors interact in the development of personality.
This chapter discusses biometric models and the important contributions from decades of behavior
genetic research into personality and how research using newer biometric moderation models allows for
group-specific estimates of heritability and environmental influences on personality.
Behavior genetics has had a profound influence on current personality research and theorizing. For much of the twentieth century, most psychologists assumed that nurture was the primary cause of personality variation. This view has been tempered by an understanding that nature and nurture work together in personality development. But what are the implications of this more integrated understanding for the assessment of personality in clinical settings? How might behavior genetic research findings impact upon the clinician’s understanding of the meaning of psychological test scores? What of the extraordinary advances in research on human molecular genetics? Will we soon see the day where analysis of a client’s DNA contributes directly to our assessment report?
Our task in writing this chapter was to provide a perspective on these questions: a perspective on how clinical personality assessment might be influenced by genetically informative research on personality. We begin by reviewing what has been learned about personality from classical genetically informative research designs, such as twin and adoption studies. We then discuss the emerging field of molecular genetic personality research. We conclude with a discussion of the implications of research on the genetics of personality for assessment in clinical settings.
How Do Twin Studies Work and What Have They Taught Us about Personality?
Twins provide a fascinating natural experiment in human variability. Twins come in two basic varieties, identical (or monozygotic, MZ) and fraternal (or dizygotic, DZ). Identical twins share all of their genes, whereas fraternal twins share, on average, half of the genes that vary from person to person. In a study of MZ and DZ twins reared together, it is possible to separate the variance in a phenotype (an observed difference among people, such as a personality trait) into three sources that “add up” to account for the total variance of the phenotype.
The first source of differences among persons is what most people think of when they think of twin studies: genetic variance, also known as heritability. The heritability of a trait is the proportion of variance in the trait that can be accounted for by genetic differences in a particular sample. In a study of twins reared together, a trait is judged to be heritable if the MZ twins are found to be more similar on average than the DZ twins. The second source of variation is known as the shared or common environment. This is the extent to which twins are similar to one another by virtue of having grown up in the same families. In a study of twins reared together, the importance of the shared environment is suggested if both MZ and DZ twins are similar to their co-twins, within families, but MZ twins are no more similar than DZ twins. The third source of variation is known as the unique or nonshared environment. This is the extent to which twins are different in spite of the fact that they share rearing environments and genes. In a study of twins reared together, the importance of the unique environment is indicated if MZ twins are not phenotypically identical.
One key assumption of the twin method is known as the equal trait–relevant environments assumption. Sometimes this assumption is described simply as the equal environments assumption, but we feel this terminology is somewhat less accurate, as we explain below. The assumption can be understood as follows: If MZ twins are found to be more similar than DZ twins on a phenotype, we would like to be able to infer that this is because MZ twins are more genetically similar than DZ twins. But what if MZ twins are treated more similarly, and this makes them more similar than DZ twins in terms of the phenotypes we want to study (e.g., personality)? Then our inference of a genetic effect, based on greater MZ similarity, might be flawed. Fortunately, extensive evidence supports the validity of the equal trait–relevant environments assumption (Goodman & Stevenson, 1991; Loehlin & Nichols, 1976; Scarr & Carter-Saltzman, 1979). Although MZ twins are treated more similarly in some ways (e.g., their parents might dress them similarly), and might therefore be described as having more “equal environments,” these similarities are not linked to phenotypic similarity in major individual-differences domains, such as personality.
Model-Fitting Approaches in the Analysis of Twin Data
Most twin data analyses now take advantage of the speed and power of modern computers in fitting explicit statistical models to data obtained from twins. Here we can offer only a brief description of some of these powerful approaches to understanding the genetic and environmental sources of individual differences; details and extensions can be obtained by consulting Neale and Maes (in press).
Twin data models can be usefully described as “biometric” models. The models are biological because they take into account what is known about the genetic relationships among research participants. The models are metric in that they attempt to measure genetic and environmental contributions to phenotypic variance, and also in the sense that they depend fundamentally on the precise measurement of phenotypes. That is, accurate estimation of parameters such as proportions of genetic and environmental variance in a trait demands precise measurement of the trait. Although this may seem somewhat obvious, the point is sometimes underappreciated. For example, measuring a trait using a shortened scale with reduced reliability will contribute to the apparent dissimilarity of MZ twins. This dissimilarity will be “read” as nonshared environmental variance, when in fact it is error variance. This is the reason many research reports contain the caveat that the reported nonshared variance estimates confound nonshared environment and measurement error. Although it may not always be feasible to measure constructs with high reliability, the use of measures with high reliability will improve biometric inferences. In sum, biometric inferences are constrained by psychometric considerations; biometrics and psychometrics are intimately intertwined (cf. Goldsmith, Buss, & Lemery, 1997).
A univariate (one-variable) model for data obtained from MZ and DZ twins reared together is presented in Figure 2.1. Latent variables (variables whose effects are inferred from patterns in the data) are contained in circles, and manifest variables (variables that are measured directly) are contained in squares. The model also contains lines with arrows (paths), which represent relationships among variables. The model in Figure 2.1 states that the scores of twins on a specific, measured variable result from the three influences we described previously. First, twin scores are influenced by additive genetic effects, symbolized by paths leading from the latent “A” variables to the twins’ phenotypes. These are genetic effects that result from the influence of multiple genes acting independently and “adding up” to influence a phenotype. MZ twins are perfectly correlated for additive genetic effects, whereas DZ twins are correlated 0.5 for additive genetic effects. In the figure, this is symbolized by the curved, double-headed arrow connecting the twins’ phenotypes via the latent “A” variables, set to 1.0 for MZ twins and 0.5 for DZ twins. The latent “A” variables are connected to the phenotypes by the path labeled a. Second, twin scores are influenced by shared environments. Because these are twins reared together, both MZ and DZ twins are perfectly correlated for shared environments (i.e., both types of twins grew up in the same families, within pairs), as reflected by the curved, double-headed arrow connecting the twins’ phenotypes via the latent “C” variables, and set to 1.0 for both MZ and DZ twins. These variables are connected to the twins’ scores via the path labeled c. Third, twin scores are influenced by nonshared environments. These effects are symbolized by the latent “E” variables, which, by definition, are not correlated across twins within pairs (note the absence of a curved, double-headed arrow connecting the twins via E). These variables are connected to the twins’ scores via the path labeled e.
[Figure 2.1: path diagram of the univariate twin model, with paths a, c, and e leading from the latent A, C, and E variables to the phenotypes of Twin 1 and Twin 2.]
The process of fitting the model in Figure 2.1 to twin data involves using a computer to search for the most likely values of a, c, and e, given the observed data. In the case of Figure 2.1, the data would be the scores of both halves of twin pairs (twins 1 and 2) within two different twin groups (MZ and DZ) on a single trait. This gives us two variables (twin 1 scores and twin 2 scores) within two groups (MZ and DZ), allowing us to compute two variance-covariance matrices, one for each twin group. The first matrix will be for the MZ twins, and will contain the variances of MZ twin 1 and MZ twin 2, as well as the covariance between MZ twin 1 and MZ twin 2 (three data points). The second matrix will be for the DZ twins, and will contain the variances of DZ twin 1 and DZ twin 2, as well as the covariance between DZ twin 1 and DZ twin 2 (three more data points). That is, we observed six data points, and we have a model for the six data points that involves three unknown quantities (a, c, and e). Hence, model fitting to twin data on a single variable measured in twins reared together involves using a computer to try to find the most likely values of a, c, and e (three unknowns), given the six data points we observed and the theoretical model in Figure 2.1. The rules of path models (like the model in Figure 2.1) tell us that if we square each path coefficient and add up these squared values, we should get a value close to the variance in twins’ scores (see Loehlin, 1998, for an excellent introduction to path models). We could write this as $a^2 + c^2 + e^2 = p^2$, where $p^2$ is the total phenotypic variance in the trait under study. Similarly, the rules of path models tell us that the covariance between twin 1 and twin 2 scores will be $a^2 + c^2$ for MZ twins and $0.5a^2 + c^2$ for DZ twins. Note that the predicted covariances for MZ and DZ twins contain a term that differs based on the different degree of relationship in MZ and DZ twins (a), a term that is the same by virtue of the fact that the twins all grew up in the same families, regardless of zygosity (c), and do not contain e because e does not contribute to the similarity of the twins. This helps to make the meanings of a, c, and e explicit: a is the extent to which twins resemble one another due to additive genes, c is the extent to which twins
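The path rules for the univariate twin model can be made concrete with a small numerical sketch. The Python below is an illustration only and is not the model-fitting procedure the chapter describes: instead of a computer search for the most likely values, it solves the three expectations ($a^2 + c^2 + e^2$ for the phenotypic variance, $a^2 + c^2$ for the MZ covariance, and $0.5a^2 + c^2$ for the DZ covariance) algebraically. The function names and the example path values are hypothetical, and the sketch assumes the same phenotypic variance in both twin groups.

```python
def expected_moments(a, c, e):
    """Apply the path rules for twins reared together.

    Returns (phenotypic variance, MZ covariance, DZ covariance)
    implied by the path coefficients a, c, and e.
    """
    variance = a**2 + c**2 + e**2   # all three sources add up
    cov_mz = a**2 + c**2            # A correlated 1.0, C correlated 1.0
    cov_dz = 0.5 * a**2 + c**2      # A correlated 0.5, C correlated 1.0
    return variance, cov_mz, cov_dz


def solve_ace(variance, cov_mz, cov_dz):
    """Recover (a^2, c^2, e^2) from observed moments by simple algebra.

    This is a closed-form shortcut, assumed here for illustration;
    it is not a maximum-likelihood fit.
    """
    a2 = 2.0 * (cov_mz - cov_dz)    # MZ minus DZ covariance equals 0.5 * a^2
    c2 = cov_mz - a2                # what remains of the MZ covariance
    e2 = variance - cov_mz          # variance not shared by MZ twins
    return a2, c2, e2


# Hypothetical path coefficients chosen so the total variance is 1.0.
variance, cov_mz, cov_dz = expected_moments(a=0.7, c=0.4, e=0.35 ** 0.5)
a2, c2, e2 = solve_ace(variance, cov_mz, cov_dz)
print(f"a^2={a2:.2f}, c^2={c2:.2f}, e^2={e2:.2f}")
```

With real twin samples, sampling error can push these algebraic estimates below zero or above the total variance, which is one reason explicit model fitting with constrained parameters is generally preferred.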
Fanny M. Cheung
Abstract
Culture affects personality through the ways that people are represented psychologically. Global or etic
approaches to the study of culture and personality compare universal dimensions across cultures, whereas
focal or emic approaches interpret and identify indigenous dimensions on the basis of local phenomena and
experiences. This chapter reviews the relationship between culture and personality, and the impact of
cultural factors in personality assessment. Practical and methodological issues of personality assessment
across cultures, including issues of equivalence and cross-cultural validation in test translation and
adaptation, are discussed using the MMPI as an example. At the theoretical level, the etic and emic
approaches to personality assessment are compared, and the contributions of a combined emic–etic
approach in developing culturally relevant personality assessment based on the experience of the
Cross-cultural (Chinese) Personality Assessment Inventory (CPAI-2) are illustrated. These issues highlight
the need for incorporating cross-cultural training as an integral part of psychology in order to enhance the
cultural relevance in the practice and research in personality assessment.
Keywords: cross-cultural, validity, emic, etic, test equivalence, test translation, personality constructs,
training
different language versions were used have serious implications for clinical and forensic decisions. I was struck by the apparent lack of familiarity among practitioners with the available measures and the knowledge base in culturally relevant personality assessment. There is a definite need for enhancing training and research in the cultural perspectives of personality assessment.
Learning about the strengths and limitations of translated instruments, our research team went on to develop an indigenously derived personality inventory suitable for the Chinese cultural context, the Chinese Personality Assessment Inventory (CPAI; Cheung et al., 1996). This is one of the few comprehensive personality measures developed in a non-English language that have been studied across cultures. Comparing the CPAI with Western personality measures and using the CPAI with non-Chinese samples enlightened our understanding of personality structure, and functioning and measurement across cultures. Work with the CPAI illustrates the reverse side of the coin in cross-cultural personality assessment, when an indigenously derived measure in the Chinese culture is compared to well-established personality measures in mainstream psychology. In a way, the English-language personality tests could also be considered as indigenous measures originating in the Western culture. This broader context offers us a more balanced view to discuss the cultural perspective in personality assessment.
There is now a growing literature on cross-cultural studies in personality and personality assessment. There are also international guidelines for adapting tests for use in various different linguistic and cultural contexts. In this chapter, I review the relationship between culture and personality, and the impact of cultural factors in personality assessment. I discuss the methodological and theoretical issues of personality assessment across cultures, and make recommendations on the training needs for enhancing the cultural perspective in personality assessment.
Culture and Personality
Buss (2001) postulated that culture poses “puzzles for personality theories that aspire to explain human nature and individual differences” at the same time (p. 975). Buss defined culture as a set of “ideas, beliefs, representations, behavior patterns, practices, artifacts, and so forth that are transmitted socially across generations within a group, resulting in patterns of within-group similarity and between-group differences” (pp. 955–956). While personality psychology seeks to integrate both human nature and individual differences, studies of cultural differences in personality are useful in understanding whether certain prevalent features of personality are universal and common among all people, and whether the level of these personality features differs across cultures.
Approaches to Describe Culture
In their book Social Psychology and Culture, Chiu and Hong (2006) constructed the definition of culture as “a set of shared meanings, which provides a common frame of reference for a human group to make sense of reality, coordinate their activities in collective living, and adapt to the external environment” (pp. 16–17). They summarized different strategies that psychologists have used to describe culture: In the global approach, universal dimensions are assumed to be part of human nature and have the same meaning in all cultures. Variations along these dimensions may be found in different cultures. Cross-cultural psychologists have often referred to this approach as the “etic” approach (the term is derived from “phonetics”), which attempts to compare universal pre-established constructs across cultures. Some psychologists have criticized the Western bias in the etic approach in that most etic or “universal” dimensions originate in Western cultural traditions and are imposed on other cultures (Gergen, Gulerce, Lock, & Misra, 1996).
In contrast, the focal approach to cultural description uses local or indigenous concepts originating within a culture to interpret and organize the data for that cultural group. This approach highlights culturally relevant meanings and allows the examination of the variations of these meanings in different contexts and in response to social changes. In cross-cultural psychology, the indigenous concepts are often referred to as “emic” concepts (the term is derived from “phonemics”). The development of indigenous psychologies has adopted a bottom-up approach, building theories on the basis of local phenomena and experiences. These indigenous approaches could contribute to the development of more comprehensive universal theories in mainstream psychology (Allwood & Berry, 2006). However, at the early stage of development when cultural psychologists focus on their emic approaches, there is the danger of cultural relativity and “intellectual provincialism” that limits comparisons across cultures (Triandis, 2000).
Impact of Culture on Personality
Culture per se is not a causal agent of cultural differences in personality. Culture affects personality through the ways that people are represented
psychologically, including descriptions of the self, others, and groups (Chiu & Hong, 2006). People are socialized in a unique family and social environment that influences the ways they behave and view the world. Cross-cultural psychologists have studied specific cultural features affecting these cognitive representations of persons.
Hofstede (1991) identified major cultural dimensions that differentiated organizations globally. One such dimension, individualism versus collectivism, has been studied extensively in relation to cultural differences in personality. For example, people in collectivistic cultures tend to perceive themselves as members of communal groups. Their roles and behaviors are affected by their relationships with in-group members. They would give priority to in-group goals. They focus more on the context and the situation in making attributions (Triandis, 2001). Markus and Kitayama (1991) further delineated the relationship between individualism-collectivism and personality in their model of independent versus interdependent self-construals. These two self-construals are considered overarching schemata of self-regulation and have been used to explain observed cultural differences in cognition, emotion, and motivation. Independent self-construals are commonly found in Western cultures that emphasize the inherent separateness of individuals. The view of the self is derived from each individual’s unique internal attributes, which are independent from others. On the other hand, interdependent self-construals are represented in many non-Western cultures that emphasize the interconnectedness between people. The self is viewed as a part of a social relationship and is not considered as separate or differentiated from others in the group. Markus and Kitayama suggested that people who endorse independent self-construals tend to be more optimistic and more consistent in their self-perceptions, whereas those who endorse the interdependent self-construals emphasize maintaining harmony and collective goals.
People in different cultures also differ in their implicit theories of the consistency and malleability of personality. Cultures that emphasize the interdependent self-construal would expect less consistency between internal personality attributes and social behavior. Research also confirmed that East Asians from collectivistic cultures were more likely to believe that individual persons were relatively malleable whereas the world was relatively stable. Thus, they were less likely to explain social behavior in terms of global traits and were more inclined to attribute to external forces. On the other hand, North Americans who are from individualistic cultures believed more strongly in the person’s unique and fixed personality traits and were more likely to make internal attributions (Chiu & Hong, 2006).
Culture and personality further interact to affect the manifestation of psychopathology in the context of behavioral environments, which consist of culturally constituted self-orientations, objects, space, time, and social norms. One dimension of cultural influence on the perception of psychopathology that has been discussed extensively in cultural psychiatry is the biased observation of somatization among Chinese persons with depressive disorders. The cultural bias in this discussion is rooted in the Western philosophical tradition of Cartesian dualism. In dualistic thinking, psychopathology is viewed in terms of distinct organic and psychological problems (Lewis-Fernandez & Kleinman, 1994). On the other hand, Chinese dialectical thinking encompasses the context and allows for the simultaneous expression of psychological distress in the form of somatic symptoms (Cheung, 1998). When the medical model of psychopathology is imposed, the somatic presentation of distress among Chinese patients is construed by Western-trained practitioners as the somatization tendency in which psychological affects were allegedly denied or suppressed. In the medical model of psychopathology, indigenous cultural beliefs are often reduced to misinformed or superstitious obstacles that interfere with proper diagnosis and treatment of real diseases. These ethnocentric assumptions in understanding psychopathology focus on the self, individual experiences and internal attributes, ignoring the social and interpersonal contexts that affect the conceptualization and manifestation of a range of physical and psychological symptoms. When practitioners understand the cultural values and holistic health beliefs of the Chinese people, somatization may be reconceptualized as a metaphor of distress in the cultural context of an illness experience with implications to social relationships, coping, and help-seeking behavior. In studies of Chinese personality and psychopathology, somatization was found to be related to culturally related personality features that emphasize harmony and traditionalism (Cheung, Gan, & Lo, 2005).
These cultural differences highlight the need to integrate a cultural perspective in understanding personality. Matsumoto (2007) proposed a comprehensive model that illustrates the interaction among human nature (universal processes), culture (via
of results of objective personality tests. These cultural exchanges have increased the awareness that cultural contexts need to be taken into account in personality assessment and the study of personality. The cultural perspective is particularly relevant when we consider the widespread application of personality instruments on multicultural test users in occupational, academic, and health settings. Practitioners need to consider the basic parameters of their tools and methods: Are the existing psychological tests useful and relevant? Are there suitable language versions of these tests? Are the translated versions valid?
The broader issue behind these practical questions lies in the challenge posed by the cultural perspective to the mainstream knowledge base in personality psychology. Butcher, Mosch, Tsai, and Nezami (2006) summarized the tension between cultural influences and personality psychology in a series of questions: Is personality universal or culture-specific? Do the same personality traits manifest themselves differently across cultural contexts? How do culture and personality interact to influence behavior as well as psychopathology? Can personality be adequately measured across cultural groups? These cultural considerations can no longer be ignored in the study of personality and personality assessment. It is in this contemporary context that we discuss the impact of cultural factors on personality assessment and approaches to adapt personality assessment across cultures.
Impact of Cultural Factors on Personality Assessment
Marsella and Leong (1995) referred to two major problems of ethnocentrism in cross-cultural psychological assessment: the error of omission and the error of commission. The error of omission refers to “the failure to conduct cross-cultural comparison in reaching conclusions about human behavior” (p. 203), with the assumption that conclusions drawn from psychological studies conducted usually in Western cultures can be generalized without cultural variations. In the error of commission, researchers include different cultural groups for comparison, but neglect the cultural differences in the psychological measures that they use in making the comparison. Measures or instruments that are developed in Western cultures are imposed on the other cultural groups with the assumption that these measures are valid for all groups. Early cross-cultural studies generally adopt these imposed etic measures to identify and reify cross-cultural similarities and differences in the attributes being studied.
Recent advances have highlighted important considerations in cultural differences that need to be taken into account in cross-cultural personality assessment. In the first place, testing itself is not a universally familiar practice. Paper-and-pencil tests require test users to be able to read and write. The test users also need to respond to a forced-choice format of yes or no answers or a number on a rating scale. Response biases such as the tendency toward acquiescence or extremity ratings have been noted in different cultures. For example, cultural differences have been observed in the response style of Asian participants who tend to adopt the midpoint in responding to rating scales under the influence of the moderation norm (Hamid, Lai, & Cheng, 2001). Studies with Chinese bilingual participants also showed that the Chinese version of an instrument produced more culturally accommodating responses than the English version of the same test, which could affect the results of a cross-cultural study (Ralston, Cunniff, & Gustafson, 1995).
The matching of the ethnic background of the assessor and the client is another source of administration bias that may affect the outcome of the assessment. Russell and his group of researchers (1996) have found that ethnic matching of therapist and client results in higher ratings of client functioning. Hence, difference from or unfamiliarity with the client’s cultural background may limit the therapist’s ability to assess optimal client functioning.
Other than response biases and therapist variables, how can differences in test results be explained? In cross-cultural applications of translated personality tests, such as the MMPI-2, differences in the mean scores among normal samples of participants have been observed. One of the inquiries I have received on the Chinese version of the MMPI-2 came from an American forensic expert working on the case of a Chinese plaintiff in a lawsuit. The psychologist described to me the difficulty he encountered when he administered the English MMPI-2 on this Chinese client before he learnt about the availability of the Chinese MMPI-2:
VRIN was about 70T and the clinical scales did not resemble how she actually was. I consulted with a Chinese-speaking psychologist here who offered to read the items to her in Chinese and to help her understand the linguistic nuances of the questions. The result was even worse. VRIN was about 80T and now she looked overtly psychotic, which she clearly was not, and almost all the clinical scales were elevated.
bilingual and experienced in both cultures. After the initial translation, independent translators are engaged in the process of back-translation, which involves retranslation of translated items back to the source language in order to determine if the translated items retain their meaning. Items that are difficult to translate or mistranslated are discovered and retranslated. This translation–back-translation process is repeated until the discrepancies between the items of the source language and back-translation versions are minimized.
Particular attention should be paid to idiomatic use of the language that may not be readily picked up in back-translation. For example, in the MMPI-2, expressions like “I feel blue” could be translated literally into the color blue and back-translated accurately, but the meaning of the emotional state may be lost in this literal translation. In the translation of the Chinese MMPI-2, the translators consulted the original authors to ensure that the nuances of the language were accurately reflected in the translation. Careful choice of translators who are proficient in both languages and consultation with bilingual experts have been recommended as necessary procedures in test adaptation.
Given the increasing adaptation of tests for use in various different linguistic and cultural contexts, a 13-person committee representing a number of international organizations at the International Test Commission has developed a set of guidelines. These guidelines are grouped under four main categories: the cultural context, the technicalities of instrument development and adaptation, test administration, and documentation and interpretation. For details of the guidelines, readers are referred to the ITC Guidelines on Adapting Tests (International Test Commission, www.intestcom.org; Hambleton, Merenda, & Spielberger, 2005; Van de Vijver & Hambleton, 1996).
Construct Equivalence
Construct equivalence or conceptual equivalence refers to the generalizability of the personality variables or constructs being examined from the source culture to the target culture. Personality variables may differ in form and quantity across cultures and constructs may not have the same psychological meaning across cultures. To test for construct equivalence in adapted personality measures, one of the recommended approaches is to use bilingual test–retest studies (Butcher et al., 2006). In these studies, a group of bilinguals in the target culture would take the original form and the translated version of the test within 1–2-week intervals (Butcher, Nezami, & Exner, 1998). The correlations between the bilingual results may be compared to the test–retest reliability of the single-language versions to determine if the translated version could be considered to be an alternate form of the original version (Butcher, Derksen, Sloore, & Sirigatti, 2003).
In the case of the international versions of the MMPI, programs are also available to compare the differences in item endorsement between two cultural groups (Butcher & Pancheri, 1976). Items showing large discrepancies in endorsement rates between the cultural groups may suggest either errors in translation, or differences in the relevance of the item to the construct being measured. As shown in the earlier example of the item on the MMPI scale 2 (D), the higher rate of endorsement of that item among Chinese college students compared to that of the American college students, and the results that showed that the item was rated as more socially desirable by the Chinese students suggested that this item did not reflect depression in the Chinese cultural context (Cheung, 1985).
In addition to bottom-up approaches, recent studies have also adopted the top-down approach to establish construct equivalence by multilevel analysis (McCrae et al., 2005). To test for cross-cultural equivalence of the underlying structure of the constructs being measured, some researchers recommended the use of exploratory factor analysis with country-level data in addition to individual-level data (Van de Vijver & Poortinga, 2002). The use of structural equation modeling within the framework of a multilevel model provides the most rigorous approach to testing for cross-level equivalence. Structural equation modeling allows the researcher to consider both the individual and the cultural group level of the data simultaneously (Cheung et al., 2006). However, one critical restriction of this approach is the need for large samples of cross-cultural groups, which is often not the purpose for establishing construct equivalence in translating and adapting a personality test.
Psychometric Equivalence
Psychometric equivalence refers to an instrument possessing similar psychometric properties in different cultures (Butcher & Han, 1996), which includes reliability, item–scale correlation, item endorsement frequencies, inter-item correlations, inter-scale correlations, and factor structure of the scales. Butcher et al. (2006) described a variety of objective indices used to demonstrate the
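The comparison of item endorsement rates between two cultural groups mentioned under Construct Equivalence can be sketched in a few lines of Python. This is a minimal illustration, not the programs cited in the text: the function names, the item labels, the toy responses, and the 0.25 cutoff for a "large" discrepancy are all assumptions made for the example.

```python
def endorsement_rate(responses):
    """Proportion of respondents who endorsed (answered True to) an item."""
    return sum(responses) / len(responses)


def flag_discrepant_items(group_a, group_b, threshold=0.25):
    """List items whose endorsement rates differ markedly between two groups.

    group_a and group_b map an item label to a list of True/False answers.
    The threshold is a hypothetical cutoff; the text gives no numeric value.
    """
    flagged = []
    for item in sorted(set(group_a) & set(group_b)):
        difference = endorsement_rate(group_a[item]) - endorsement_rate(group_b[item])
        if abs(difference) >= threshold:
            flagged.append((item, difference))
    return flagged


# Toy data for two hypothetical samples; not taken from any real study.
sample_a = {"item_1": [True, True, True, False], "item_2": [True, False]}
sample_b = {"item_1": [True, False, False, False], "item_2": [True, False]}
print(flag_discrepant_items(sample_a, sample_b))
```

A flagged item is only a starting point: as the text notes, a large discrepancy may reflect a translation error or a genuine difference in how relevant the item is to the construct, so each flagged item still needs review by bilingual experts.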
importation of well-established tests provides psychologists in countries where assessment is only beginning to develop with an efficient repertory of assessment tools (Cheung, 2004).
These imported instruments also allow for cross-cultural comparison of common personality constructs (Cheung & Leung, 1998). The personality model that has been studied most widely across cultures is the FFM using the NEO PI-R (Costa & McCrae, 1992). McCrae et al. (2005) studied personality profiles of 51 cultures by using mean trait levels of culture members by asking college students from their countries to rate an individual whom they knew well. While cultural variations were found on the mean levels of the five personality factors, the FFM structure was found to be a universally valid taxonomy of personality (McCrae & Allik, 2002). McCrae asserted that these etic or universal traits are rooted in biology while culture shapes their expressions (Hofstede & McCrae, 2004).
When importing etic personality measures, cross-cultural psychologists usually ask the following questions: How well do the personality dimensions assessed by the imported tests replicate across cultures? What are the significant differences in personality traits that are identified by the imported measures? Can imported tests predict relevant criteria across cultures (Church, 2001)? Cross-cultural research on the well-established personality measures such as the MMPI-2 and the NEO PI-R has replicated the factor structure and the relationships with external correlates, supporting the relevance and usefulness of these imported measures. However, different factor structures could also be extracted in cross-cultural studies, and some of the factors in the NEO PI-R, such as Openness, did not replicate well in European and Asian cultures, suggesting that the imported measures may be imposing their model of factor structure in the new cultural context (Cheung, Cheung et al., 2008; Church, 2001). There may be other personality constructs that are important to the local cultures that are not covered in the imported measures, or are not construed in the same taxonomy that is meaningful to the local culture. The rise of indigenous movements in psychology has led to the development of emic personality measures which derive the constructs from within the culture.
Indigenous Measures of Personality
In the movement toward indigenization of psychology since the 1970s, psychologists in non-Western cultures have proposed indigenous personality theories and attempted to develop their own personality measures (Enriquez, 1993; Yang, 1997). Ho (1998) defined an indigenous psychology as “the study of human behavior and mental processes within a cultural context” in which cultural “conceptions and methodologies rooted in that cultural group [are] employed to generate knowledge” (p. 94). Personality constructs are derived from the local cultural perspective instead of being imported. Many of the indigenous personality constructs derived in Asia reflect the relational nature of human experience in a social and interpersonal context (Cheung, Cheung, Wada, & Zhang, 2003). Examples include the Chinese concepts of “harmony” and “face” (public sense of self) (Cheung et al., 2001), the Japanese concept of amae (“sweet indulgence”), the Korean concept of chong (“affection”) (Kim, Park, & Park, 1999), and the concept of selflessness, or “selfless-self” in Taoism, Buddhism, and Hinduism (Verma, 1999).
Some of the early indigenous personality measures modified imported instruments by selecting relevant scales and collecting data to establish local norms. Others made use of the lexical approach to extract personality dimensions that are relevant to the local culture (Guanzon-Lapeña, Church, Carlota, & Katigbak, 1998; Isaka, 1990; Yang & Bond, 1990).
However, few of the indigenous measures have been followed through with a program of research on their development and validation that is necessary for their utility as assessment tools. Nor have they addressed satisfactorily the questions posed by cross-cultural psychologists (Church, 2001): How culture-specific are the indigenous personality constructs and measures? Do these constructs and measures provide incremental validity beyond that provided by the imported measures? Church concluded that most indigenous measures did not identify culture-specific constructs that could not be subsumed under the universal FFM. He acknowledged that the best support for the incremental validity of indigenously derived personality measures came from the CPAI.
Combined Emic–Etic Approach
The CPAI was developed to measure Chinese personality from an indigenous perspective, using a combined emic–etic approach (Cheung et al., 1996, 2001). Universal and indigenous personality traits considered to be important in the Chinese culture were generated in a bottom-up approach to develop a set of normal personality and clinical scales for comprehensive personality assessment. Borrowing
Training Needs in Enhancing the Cultural Perspective in Personality Assessment
The increasing use of personality assessment across cultures illustrates the need to develop culturally sensitive measures. Recent advances in cross-cultural psychology have introduced useful methodological and statistical tools for research in this area. The American Psychological Association has published two important documents to promote scholarship on multicultural awareness: Guidelines on Multicultural Education, Training, Research, Practice, and Organizational Change for Psychologists (American Psychological Association, 2003) and Resolution on Culture and Gender Awareness in International Psychology (American Psychological Association Council of Representatives, 2004). It has established a task force initiated by the Division of International Psychology (Division 52) and the Division of Measurement, Evaluation, and Statistics (Division 5). The purpose of the task force was to identify methodological aspects of cross-cultural research that were in need of improvement and to promote training in these aspects. In addition to training offered by the APA, other international professional organizations such as the International Test Commission, International Association of Applied Psychology, International Congress of Psychology, and the European Congresses of Psychology also offer training workshops on cross-cultural testing and measurement during their conventions.
At a more practical level, it is imperative that practitioners become familiarized with the literature on cross-cultural validity and applications of the instruments that they intend to use across cultures. It is insufficient to be only familiar with the original tests that they may be trained to administer. They should consult the test publishers on the language versions of the tests that are available in the cultures in which they intend to administer their assessment. They should be wary of cross-cultural biases and incorporate their knowledge of cross-cultural differences in their interpretation of test results. The MMPI-2 international conferences and workshops are examples of the venue where practitioners can gain such knowledge.
As psychology joins the global village, the discipline can no longer ignore the cultural contexts that inform our knowledge base. The cultural perspective is not a sideline to mainstream psychology. Cross-cultural training should become an integral part of graduate and professional training, not only for students interested in cross-cultural psychology, but also as a basic tenet of training in psychological assessment and research methodology.
Acknowledgments
The CPAI projects reported in this chapter were partially supported by the Hong Kong Government Research Grants Council Earmarked Grants CUHK4333/00H, CUHK4326/01H, CUHK4259/03H, CUHK4254/03H, and CUHK4715/06H, and by Direct Grants #89106, 91113, 2020662, 2020745, 2020717, and 2020871 of the Chinese University of Hong Kong.
References
Allwood, C. M., & Berry, J. W. (2006). Origins and development of indigenous psychologies: An international analysis. International Journal of Psychology, 41, 243–268.
American Psychological Association (2003). Guidelines on multicultural education, training, research, practice, and organizational change for psychologists. American Psychologist, 58, 377–402.
American Psychological Association Council of Representatives (2004). Resolution on culture and gender awareness in international psychology. Washington, DC: American Psychological Association.
Barnouw, V. (1963). Culture and personality. Homewood, IL: Dorsey Press.
Buss, D. M. (2001). Human nature and culture: An evolutionary psychological perspective. Journal of Personality, 69, 955–978.
Butcher, J. N. (2004). Personality assessment without borders: Adaptation of the MMPI-2 across cultures. Journal of Personality Assessment, 83, 90–104.
Butcher, J. N., & Cheung, F. M. (2006, May). Discussion and Case Interpretation. Presentation at the Workshop on MMPI-2 Jointly Organized by the Clinical Psychological Service Branch of Social Welfare Department and the Chinese University of Hong Kong, Hong Kong.
Butcher, J. N., Derksen, J., Sloore, H., & Sirigatti, S. (2003). Objective personality assessment of people in diverse cultures: European adaptations of the MMPI-2. Behavior Research and Therapy, 41, 819–840.
Butcher, J. N., & Han, K. (1996). Methods of establishing cross-cultural equivalence. In J. N. Butcher (Ed.), International adaptations of the MMPI-2 (pp. 44–63). Minneapolis, MN: University of Minnesota Press.
Butcher, J. N., Mosch, S. C., Tsai, J., & Nezami, E. (2006). Cross-cultural applications of the MMPI-2. In J. N. Butcher (Ed.), MMPI-2: The practitioner’s guide (pp. 505–537). Washington, DC: American Psychological Association.
Butcher, J. N., Nezami, E., & Exner, J. (1998). Psychological assessment of people in diverse cultures. In S. Kazarian & D. R. Evans (Eds.), Cross-cultural clinical psychology (pp. 61–105). New York: Oxford University Press.
Butcher, J. N., & Pancheri, P. (1976). Handbook of cross-national MMPI research. Minneapolis, MN: University of Minnesota Press.
Cheung, F. M. (1985). Cross-cultural considerations for the translation and adaptation of the Chinese MMPI in Hong Kong. In J. N. Butcher & C. D. Spielberger (Eds.), Advances in personality assessment (Vol. 4, pp. 131–158). Hillsdale, NJ: Lawrence Erlbaum.
Cheung, F. M. (1995). Administration manual of the Minnesota Multiphasic Personality Inventory (MMPI) (Chinese ed.). Hong Kong: The Chinese University Press.
Cheung, F. M. (1998). Cross-cultural psychopathology. In A. S. Bellack & M. Hersen (Eds.), Comprehensive clinical psychology,
Lewis-Fernandez, R., & Kleinman, A. (1994). Culture, personality, and psychopathology. Journal of Abnormal Psychology, 103, 67–71.
Lin, E. J., & Church, A. T. (2004). Are indigenous Chinese personality dimensions culture-specific? An investigation of the Chinese Personality Assessment Inventory in Chinese American and European American samples. Journal of Cross-Cultural Psychology, 35, 586–605.
Markus, H. R., & Kitayama, S. (1991). Culture and self: Implications for cognition, emotion, and motivation. Psychological Review, 98, 224–253.
Marsella, A., & Leong, F. T. L. (1995). Cross-cultural issues in personality and career assessment. Journal of Career Assessment, 3, 202–218.
Marsella, A. J., Dubanoski, J., Hamada, W. C., & Morse, H. (2000). The measurement of personality across cultures: Historical, conceptual, and methodological issues and considerations. American Behavioral Scientist, 44, 41–62.
Matsumoto, D. (2007). Culture, context, and behavior. Journal of Personality, 75, 1285–1319.
McCrae, R. R., & Allik, J. (Eds.). (2002). The Five-Factor Model across cultures. New York: Kluwer Academic Press/Plenum.
McCrae, R. R., & Terracciano, A. (2006). National character and personality. Current Directions in Psychological Science, 15, 156–161.
McCrae, R. R., Terracciano, A., & 79 members of the Personality Profiles of Cultures Projects. (2005). Personality profiles of cultures: Aggregate personality traits. Journal of Personality and Social Psychology, 89, 407–425.
Meiring, D., van de Vijver, F., Rothmann, I., & de Bruin, D. (2008, July). Uncovering the Personality Structure of the 11 Language Groups in South Africa: SAPI Project. In F. M. Cheung (Chair), Testing and assessment in emerging and developing countries. II: Challenges and recent advances. Symposium conducted at the meeting of the 29th International Congress of Psychology, Berlin, Germany.
Ralston, D. A., Cunniff, M. K., & Gustafson, D. J. (1995). Cultural accommodation: The effect of language on the responses of bilingual Hong Kong Chinese managers. Journal of Cross-Cultural Psychology, 26, 714–727.
Russell, G. L., Fujino, D. C., Sue, S., Cheung, M., & Snowdon, L. R. (1996). The effects of therapist–client ethnic match in the assessment of mental health functioning. Journal of Cross-Cultural Psychology, 27, 598–615.
Sue, S., Fujino, D. C., Hu, L., Takeuchi, D. T., & Zane, N. (1991). Community mental health services for ethnic minority groups: A test of the cultural responsiveness hypothesis. Journal of Consulting and Clinical Psychology, 59, 533–540.
Terracciano, A., Abdel-Khalek, A. M., Adam, N., Adamovova, L., Ahn, C. K., Ahn, H. N., et al. (2005). National character does not reflect mean personality trait levels in 49 cultures. Science, 310, 96–100.
Triandis, H. C. (2000). Dialectics between cultural and cross-cultural psychology. Asian Journal of Social Psychology, 3, 185–197.
Triandis, H. C. (2001). Individualism-collectivism and personality. Journal of Personality, 69, 907–924.
Van de Vijver, F., & Hambleton, R. (1996). Translating tests: Some practical guidelines. European Psychologist, 1, 89–99.
Van de Vijver, F., & Leung, K. (1997). Methods and data analysis for cross-cultural research. Thousand Oaks, CA: Sage.
Van de Vijver, F., & Leung, K. (2001). Personality in cultural context: Methodological issues. Journal of Personality, 69, 1007–1031.
Verma, J. (1999). Hinduism, Islam and Buddhism: The source of Asian values. In K. Leung, U. Kim, S. Yamaguchi, & U. Kashima (Eds.), Progress in Asian social psychologies (pp. 23–36). Singapore: John Wiley & Sons.
Yang, K. S. (1997). Theories and research in Chinese personality: An indigenous approach. In H. S. R. Kao & D. Sinha (Eds.), Asian perspectives on psychology (pp. 236–262). Thousand Oaks, CA: Sage.
Yang, K. S., & Bond, M. H. (1990). Exploring implicit personality theories with indigenous or imported constructs: The Chinese case. Journal of Personality and Social Psychology, 58, 1087–1095.
Abstract
This chapter reviews traditional approaches for the psychometric analysis of responses to personality
inventories, including classical test theory item analysis, exploratory factor analysis, and item response
theory. These methods, which can be called “dominance” models, work well for items assessing
moderately positive or negative trait levels, but are unable to describe adequately items representing
intermediate (or average) trait levels. This necessitates a shift to an alternative family of psychometric
models, known as ideal point models, which stipulate that the likelihood of endorsement increases as
respondents’ trait levels get closer to an item’s location. An ideal point model for personality measures
using single statements as items is described, data are reanalyzed to show how the change of modeling
framework improves fit, and the pairwise preference format is described for use in personality assessment.
Two illustrative ideal point models for unidimensional and multidimensional pairwise preferences are
discussed, and an empirical comparison shows that, after correcting for unreliability, correlations of traits
assessed with single statements, unidimensional pairs, and multidimensional pairs are very close to unity,
suggesting that the choice of format does not affect the trait being measured. Some exciting areas for
future research are noted.
Since Spearman’s classic 1904 paper “General intelligence, objectively determined and measured,” thousands of psychometrician-years have been devoted to the development and evaluation of test theories for the assessment of cognitive ability. Classical test theory (CTT; Gulliksen, 1950), item response theory (IRT; Lord, 1980), computerized adaptive testing (CAT; Sands, Waters, & McBride, 1997), and many other innovations have been developed to improve the measurement of cognitive ability. In addition to a great deal of research, ability testing is big business. The Educational Testing Service, for example, employed more than 2,400 people, administered more than 12 million exams, and had revenue in excess of $800 million in its fiscal year ending June 2007 (Hoovers, 2007).
So what are the implications of this massive research literature and business practice for personality measurement? Is it reasonable to simply adopt best models and practices from the cognitive ability domain or is there a need for different approaches and innovation? In this chapter, we will first review and illustrate the application of “tried and true” methods developed in the context of cognitive ability testing to personality assessment and contrast them with alternative ways of scaling personality items. We then focus on pairwise preference response formats, which are not commonly found in cognitive ability testing, but carry a number of advantages over the single-statement (SS) format for personality assessment. We present some methods for scoring both unidimensional and multidimensional pairwise preferences and relate these results to traditionally scored measures. In keeping with the cognitive ability tradition, we do not review methods for empirical keying; see Bergman, Drasgow, Donovan, Juraska, and Nejdlik (2006) for a review and illustration of several of these methods.
It is important to note that successful application of any method for modeling response data depends critically on assumptions about the underlying response process. At the most general level, for example, test theories for cognitive ability assume that individuals with greater ability have a higher probability of answering items correctly. Coombs (1964) used the term “dominance” to refer to models with this property.
Are dominance models appropriate for personality measurement? Consider, for example, an extraversion item that asks whether the respondent enjoys chatting quietly with a friend at a café. Are individuals with increasingly greater degrees of extroversion more and more likely to endorse this item? Or would individuals with moderate levels of extroversion have the greatest probability of endorsement? Coombs used the term “ideal point” in the context of models in which individuals are assumed to be more likely to endorse an item when it is close to them.
The idea for ideal point models can be traced back to a series of remarkable papers by Louis Thurstone (1927, 1928, 1929) that developed theory and methods for measuring attitudes. In one approach, a set of attitude statements is first compiled and then “two or three hundred subjects are asked to arrange the statements in eleven piles ranging from opinions most strongly affirmative to those most strongly negative” (1928, p. 545). These judgments are then used to compute scale values for the statements. Only after scaling the statements is it possible to assess an individual’s attitude: In a second step, a respondent indicates which items he/she agrees with and then the mean of the scale values of the endorsed items is taken as his/her attitude value. For example, on a scale assessing attitude toward the church, two individuals might endorse three of ten items but receive very different attitude scale values because one person endorses items like “I think the church seeks to impose a lot of worn-out dogmas and medieval superstitions” (Thurstone, 1929, p. 232) and the other person endorses items such as “I feel the church services give me inspiration and help to live up to my best during the following week” (p. 232).
Certainly, Thurstone’s methods are time-consuming and laborious. Likert (1932) introduced a technique that greatly simplified the analysis and yielded very similar results. Specifically, Likert suggested identifying one end of the attitude as positive, reverse-scoring items at the other end of the attitude continuum, and then computing a person’s score as the sum of his or her item scores. For a five category rating scale that ranged from “strongly approve” to “strongly disapprove,” Likert found scoring response options with consecutive integers (e.g., 5 = “strongly approve” to 1 = “strongly disapprove”) produced attitude scores that correlated in excess of .99 with scores computed following a more elaborate method of assigning values to rating scale categories. Likert also found that “using the method here described, a measure of a person’s attitude as reliable as that obtained by the Thurstone method is secured by asking him to react to one-half as many items” (p. 33).
Likert (1932) used two methods for evaluating a scale and its items. First, Likert carefully examined the reliability of the scores produced by a given method. In addition, Likert used item–total correlations to assess the contribution of individual items. He noted, “If a zero or very low [item–total] correlation coefficient is obtained, it indicates that the statement fails to measure that which the rest of the statements measure. . . . Thus item analysis reveals the satisfactoriness of any statement so far as its inclusion in a given attitude scale is concerned” (pp. 48–49).
Likert (1932) found that “double-barreled items” such as “compulsory military training in all countries should be reduced but not eliminated” (p. 34) on an assessment of internationalism had low item–total correlations and thus he felt they should be eliminated. Thurstone, on the other hand, deliberately included such items. For example, his church attitude scale included “Sometimes I feel that the church and religion are necessary and sometimes I doubt it” (1929, p. 232). Thurstone found such items to have intermediate scale values, whereas the “worn-out dogmas” item had a very small scale value and the “church services give me inspiration” item had a very large scale value. Moreover, Thurstone devised methods to assess the discrepancy between a model and the data and found reasonably good fits for the intermediate items (e.g., see Table 6 in Thurstone, 1929).
The apparent contradiction between Likert and Thurstone’s conclusions regarding intermediate items can be explained by the ideal point response process but not the dominance process. Specifically, if individuals endorse items whose scale values are close to the individuals’ standings on the attitude being measured, then individuals who are low or high will not endorse intermediate items; only individuals who are intermediate will endorse such items. In contrast, only individuals who are high will endorse the most positive items. And only individuals who are low will endorse the most negative
alpha for the 20 item scale is .76, but can be increased to .83 by deleting the six neutral items. This is reminiscent of Likert’s (1932) findings concerning Thurstone’s intermediate items and highlights the importance of assessing the fit of a model to the data analyzed. More specifically, Thurstone’s scaling methods provide assessments of model-data fit, whereas CTT analyses do not. In the present case, we see that coefficient alpha increases when the six neutral items are deleted, but we have no information about whether the underlying dominance model assumptions are satisfied.

IRT Dominance Analyses
Next, the data were analyzed by two dominance IRT models widely used in personality research. The first model, the 2PL, can be used to analyze dichotomous response data from assessment instruments such as the Minnesota Personality Questionnaire, the Hogan Personality Inventory, or the California Psychological Inventory (for an example, see Reise & Waller, 1990). The second model, the SGR, can be used to analyze polytomous data with ordered response categories such as NEO PI-R or Goldberg’s Big Five markers (for an example, see Chernyshenko, Stark, Chan, Drasgow, & Williams, 2001). Note also that the SGR model is a generalization of the 2PL to the case of ordered polytomous responses, so that the SGR simplifies to the 2PL when the data consist of only two response categories.

DICHOTOMOUS ANALYSIS
For the 2PL analysis, the well-being data (after reverse-scoring the negative items) were dichotomized (Strongly Disagree and Disagree were coded as negative responses and Agree and Strongly Agree were coded as positive responses) and submitted to the BILOG (Mislevy & Bock, 1991) computer program. BILOG estimates an item discrimination parameter, $a_i$, and an item extremity parameter, $b_i$, for each item $i$. The mathematical formula for the 2PL model relating the probability of a positive response ($U_i = 1$) to item $i$ and person $j$’s trait level on the continuum underlying responses, $\theta_j$, is

$$P(U_i = 1 \mid \theta_j) = \frac{1}{1 + \exp[-1.7\,a_i(\theta_j - b_i)]},$$

where $a_i$ is the discrimination parameter for item $i$, $b_i$ is the location parameter for item $i$, and 1.7 is a scaling constant used for historical reasons.
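
The 2PL formula can be checked numerically. The following is a minimal Python sketch, not the BILOG implementation; the function name `irf_2pl` is a hypothetical helper introduced here for illustration, and the parameter values are the estimates this chapter reports for item WELL41 (a = 1.08, b = –1.13).

```python
import math


def irf_2pl(theta, a, b, scale=1.7):
    """Probability of a positive response under the 2PL model.

    theta : respondent's standing on the latent trait
    a     : item discrimination parameter
    b     : item location (extremity) parameter
    scale : the 1.7 scaling constant used in the chapter's formula
    """
    return 1.0 / (1.0 + math.exp(-scale * a * (theta - b)))


# WELL41 estimates quoted in the chapter (illustrative use only).
A_WELL41, B_WELL41 = 1.08, -1.13

for theta in (-3.0, -1.0, 0.5, 3.0):
    p = irf_2pl(theta, A_WELL41, B_WELL41)
    print(f"theta = {theta:+.1f}  P(positive response) = {p:.3f}")
```

With these values the function gives roughly .56 at a trait level of –1 and roughly .95 at a trait level of 0.5, and exactly .50 when the trait level equals the location parameter, which is the defining property of b in a dominance model.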
Fig. 4.1 Scree plots for well-being scale with and without neutral items (eigenvalue plotted against factor number).
Figure 4.2 illustrates the results for the item WELL41, which has $a_{41} = 1.08$ and $b_{41} = -1.13$. The x-axis refers to the latent personality trait, well-being, that the item measures. For convenience, BILOG assumes that the latent trait assessed by the items is standardized to a mean of zero and a standard deviation of 1. The y-axis refers to the probability of endorsing the item, conditional on one’s standing on well-being. For example, among individuals with well-being equal to –1, there is about a .5 probability of a positive response whereas for individuals with well-being equal to 0.5, there is about a .95 probability of a positive response.

Fig. 4.2 Item response function for WELL41 with a = 1.08 and b = –1.13.

The smoothly rising probability curve in Figure 4.2 illustrates several characteristics of dominance items. First, Guttman perfect scale items notwithstanding, the probability of endorsing rises smoothly; there is no discontinuity in the curve. Second, individuals with low standings on well-being are virtually certain to not endorse the item, and individuals with high standings to endorse the item. Third, we have portrayed the item as measuring only a single latent characteristic of respondents. This “simple structure” (Thurstone, 1947) reflects the fact that we wish to avoid factorially complex items. Fourth, Figure 4.2 reflects the fact that individuals low in well-being cannot have a negative probability of endorsing the item and high well-being individuals cannot have a probability of endorsement greater than 1. Thus, the curve is monotonically, but not linearly, increasing. Finally, the probability curve in Figure 4.2 is the nonlinear regression of the dichotomously scored (1 = endorsement and 0 = nonendorsement) response to WELL41 on the latent trait. To see this, let the dichotomously scored response variable be denoted $U_i$ and use $\theta$ to denote the latent well-being trait. Then the nonlinear regression of $U_i$ on $\theta$ is

$$E(U_i \mid \theta) = 1 \cdot P(U_i = 1 \mid \theta) + 0 \cdot P(U_i = 0 \mid \theta) = P(U_i = 1 \mid \theta).$$

The 2PL results in Table 4.2 are consistent with the CTT and EFA results in Table 4.1 and again suggest that the neutral items fail to adequately assess well-being. For example, WELL17 has a very low discrimination parameter estimate (0.26) and would be normally deleted from further consideration during the scale construction process. Other items judged to have neutral content had even lower discrimination parameter estimates and all would have been eliminated, because most test developers would use a minimum cutoff of 0.50 for the 2PL discrimination parameter. Yet all these items should assess well-being as they tap into such content domains as self-esteem, depression, and overall life happiness.

An advantage of IRT is that the fit of the model can be tested. For example, let $f(\theta)$ denote the distribution of the latent trait, which we can assume to be the standard normal. Then the expected number of positive responses to item $i$ is

$$E_i(U_i = 1) = N \int P(U_i = 1 \mid \theta)\, f(\theta)\, d\theta,$$

where $P(U_i = 1 \mid \theta)$ is the probability of a positive response to item $i$ by individuals with standing $\theta$ on the latent trait and $N$ is the sample size. The expected number of negative responses, $U_i = 0$, can be computed as

$$E_i(U_i = 0) = N \int [1 - P(U_i = 1 \mid \theta)]\, f(\theta)\, d\theta,$$

and then a chi-square statistic can be computed by the usual formula comparing observed $O_i$ and expected $E_i$ number of responses,

$$\chi^2_i = \frac{[O_i(U_i = 1) - E_i(U_i = 1)]^2}{E_i(U_i = 1)} + \frac{[O_i(U_i = 0) - E_i(U_i = 0)]^2}{E_i(U_i = 0)}.$$

This analysis can be extended to assess the fit of the IRT model to more than one item at a time. For example, the expected number of positive responses to both item $i$ and item $m$ is

$$E_{i,m}(U_i = 1, U_m = 1) = N \int P(U_i = 1 \mid \theta)\, P(U_m = 1 \mid \theta)\, f(\theta)\, d\theta.$$

When this analysis was conducted for two neutral items, WELL17 and WELL26, a chi-square of 18.8 with three degrees of freedom (df) was obtained. Because chi-square statistics are sensitive to sample size, we have found it useful to adjust observed values to that expected for a fixed sample size of 3,000 (Drasgow, Levine, Tsien, Williams, & Mead, 1995). This can be accomplished by noting that the expected value of a noncentral chi-square is equal to its degrees of freedom plus $N$ times the noncentrality parameter $\delta$,

$$E(\chi^2) = df + N\delta.$$
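
The single-item fit check described above can be sketched in Python. This is an illustration under stated assumptions rather than the authors' procedure: the integrals over the standard normal trait distribution are approximated with Gauss-Hermite quadrature (a numerical choice made here), the function names are hypothetical, and the sample-size adjustment is simply the algebraic consequence of the stated relation between the expected chi-square, its degrees of freedom, N, and the noncentrality parameter (estimate the noncentrality as (chi-square − df)/N, then re-express at N = 3,000).

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def irf_2pl(theta, a, b, scale=1.7):
    """2PL probability of a positive response (vectorized over theta)."""
    return 1.0 / (1.0 + np.exp(-scale * a * (theta - b)))


def expected_counts(a, b, n, n_nodes=61):
    """Expected numbers of positive and negative responses for one item,
    integrating the 2PL curve over a standard normal trait distribution."""
    # hermegauss uses the weight exp(-x^2 / 2); dividing by sqrt(2*pi)
    # turns the weights into standard normal probabilities.
    nodes, weights = hermegauss(n_nodes)
    weights = weights / np.sqrt(2.0 * np.pi)
    p_pos = float(np.sum(weights * irf_2pl(nodes, a, b)))
    return n * p_pos, n * (1.0 - p_pos)


def item_chi_square(obs_pos, obs_neg, a, b):
    """Chi-square comparing observed and expected response frequencies."""
    n = obs_pos + obs_neg
    e_pos, e_neg = expected_counts(a, b, n)
    return (obs_pos - e_pos) ** 2 / e_pos + (obs_neg - e_neg) ** 2 / e_neg


def adjusted_chi_square(chi2, df, n, n_target=3000):
    """Re-express a chi-square at a fixed sample size.

    Assumes E(chi2) = df + n * delta, so delta is estimated by
    (chi2 - df) / n and the adjusted value is df + n_target * delta.
    """
    return df + n_target * (chi2 - df) / n


if __name__ == "__main__":
    # Hypothetical counts and sample size, for illustration only.
    print(item_chi_square(obs_pos=540, obs_neg=460, a=1.0, b=0.0))
    print(adjusted_chi_square(chi2=18.8, df=3, n=6000))
```

The chi-square of 18.8 with 3 df is taken from the text; the sample size of 6,000 in the usage lines is invented purely to show how the adjustment shrinks a statistic obtained from a larger sample.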
                              SGR                  2PL              GGUM
Item    Content         a     b1     b2     b3      a      b
WELL02  Negative     0.50   3.52   1.27   1.81   0.52   1.19   0.87   4.02   2.72
WELL04  Negative     0.57   3.01   1.17   1.27   0.61   1.09   0.96   3.65   2.44
WELL06  Negative     0.81   2.64   0.77   1.58   0.68   0.79   1.09   2.44   1.46
WELL09  Negative     0.77   2.74   0.74   1.63   0.67   0.77   1.22   1.81   0.95
WELL13  Negative     0.76   2.91   0.69   1.65   0.78   0.67   1.30   1.89   1.11
WELL16  Neutral      0.12   9.70   0.43   9.89   0.23   0.19   0.47   0.71   0.71
WELL17  Neutral      0.14  13.33   3.96   7.36   0.26   2.14   1.24   0.10   1.45
WELL19  Neutral      0.04  47.57  15.02  27.24   0.14   4.56   0.97   0.46   1.89
WELL20  Neutral      0.12  11.93   6.04   3.79   0.16   4.59   0.55   0.61   2.82
WELL23  Neutral      0.13  17.41   8.57   6.61   0.32   3.53   0.74   0.51   3.19
WELL26  Neutral      0.16   6.16   3.52  10.29   0.24   2.45   1.27   0.26   1.48
WELL29  Positive     0.81   2.70   0.70   1.75   0.79   0.70   1.31   2.12   2.82
WELL30  Positive     0.66   3.88   1.90   0.93   0.62   2.01   1.03   2.18   4.17
WELL34  Positive     0.50   4.20   1.51   1.95   0.46   1.59   1.19   0.59   1.85
WELL38  Positive     0.80   2.57   0.65   1.38   0.69   0.70   1.08   1.92   2.60
WELL40  Positive     0.81   2.50   0.08   2.17   0.74   0.04   1.14   2.45   2.45
WELL41  Positive     0.84   2.87   1.25   1.32   1.08   1.13   1.94   2.40   3.51
WELL43  Positive     1.09   3.09   1.57   0.79   0.92   1.71   1.95   0.95   2.53
WELL45  Positive     0.75   3.23   0.99   1.76   0.77   0.98   1.27   1.64   2.61
WELL46  Positive     0.65   3.78   1.13   1.65   0.61   1.15   1.27   0.89   1.92
Note: SGR = Samejima’s graded response model, 2PL = two-parameter logistic model, GGUM = generalized graded unfolding model.
Fig. 4.3 Option response functions for WELL41 with a = 0.84, b1 = –2.87, b2 = –1.25, and b3 = 1.32.
where i is item i’s discrimination parameter, i is a response category threshold parameter, and is a scaling constant. The probability of the latent agree from above category is

To illustrate the ideal point analysis, we analyzed the dichotomously scored TAPAS well-being data with GGUM. Figure 4.4 shows the results for WELL17. In this figure, “Option 1” refers to the
[Figures: probability of a positive response (0.0 to 1.0) plotted against well-being (–3.0 to 3.0), including panels (a) and (b).]
Fig. 4.6 Item response surface for an MDPP item measuring two dimensions (order and sociability).

Table 4.3. Summary statistics for single-stimulus, unidimensional, and multidimensional pairwise preference order, self-control, and sociability scales.

                            SS               UPP               MDPP
SS    Order            .74   .41   .14  1.00   .38   .08  1.00   .27   .14
      Self-control     .26   .54   .40   .47  1.00   .43   .50   .91   .21
      Sociability      .09   .23   .62   .18   .46  1.00   .29   .43  1.00
UPP   Order            .75   .30   .12   .75   .45   .15   .99   .36   .17
      Self-control     .24   .55   .27   .28   .53   .47   .49  1.00   .29
      Sociability      .06   .28   .76   .12   .30   .78   .23   .43   .96
MDPP  Order            .75   .32   .20   .74   .31   .18   .75   .48   .19
      Self-control     .19   .54   .27   .25   .62   .31   .34   .66   .24
      Sociability      .10   .13   .75   .13   .18   .73   .14   .17   .73
Note: N = 602, SS = single statement, UPP = unidimensional pairwise preference, MDPP = multidimensional pairwise preference. Reliability estimates appear in bold on the main diagonal, observed correlations are provided below the diagonal, and correlations corrected for unreliability are provided above the diagonal. Monotrait/heteromethod validities appear in bold.
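
The correction for unreliability referred to in the table note can be illustrated with the standard correction for attenuation, in which an observed correlation is divided by the square root of the product of the two reliabilities. The sketch below is a minimal Python illustration; the function name `disattenuate` is hypothetical, and capping the result at 1.00 is an assumption made here because the table reports no corrected value above 1.00.

```python
import math


def disattenuate(r_xy, rel_x, rel_y, cap=1.0):
    """Correct an observed correlation for unreliability in both measures.

    r_xy  : observed correlation between the two scores
    rel_x : reliability estimate of the first score
    rel_y : reliability estimate of the second score
    cap   : upper bound applied to the corrected value (an assumption here)
    """
    corrected = r_xy / math.sqrt(rel_x * rel_y)
    return min(corrected, cap)


# Inputs read from Table 4.3 (observed correlations and reliabilities).
examples = {
    "SS Order with SS Self-control": (0.26, 0.74, 0.54),
    "UPP Order with MDPP Order": (0.74, 0.75, 0.75),
    "SS Order with UPP Order": (0.75, 0.74, 0.75),
}

for label, (r, rx, ry) in examples.items():
    print(f"{label}: corrected r = {disattenuate(r, rx, ry):.2f}")
```

For these three pairs the sketch reproduces the corrected entries printed above the diagonal (.41, .99, and 1.00). Other cells can differ in the second decimal because the published inputs are themselves rounded.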
Chernyshenko, Stark, Drasgow, & Roberts, 2007; Stark et al., 2006) that responding to personality questions is fundamentally different from responding to cognitive ability test items. Whereas dominance models provide excellent fits to the cognitive ability tests, they perform poorly in the personality domain unless items are preselected and responses are reverse-scored. If, however, the full range of statements is retained for pretesting, only ideal point models appear to provide better fit and yield parameters that are easily interpretable and intuitively understood. Moreover, the shift in response process assumptions from dominance to ideal point effectively paves the way for a variety of forced-choice formats in personality assessment, which are rapidly becoming popular in personnel selection settings due to their resistance to a variety of response biases (i.e., halo, social desirability). In this chapter, we discussed two ideal point IRT models for unidimensional and multidimensional pairwise preference formats (i.e., the ZG and MDPP models) and empirically showed them to produce scores comparable to those produced by the ideal point model with an SS format (i.e., the GGUM). Consequently, it appears ideal point methods, as a class of models, are well suited for personality assessment.
With suitable models, interesting and important innovations are feasible. For example, IRT differential item functioning analysis can provide insights about the effects of culture on responses to personality items. Highly efficient computer adaptive assessment should substantially reduce the numbers of items needed for accurate personality assessment. New measures like TAPAS or NCAPS are examples of this new generation of personality assessment tools. Perhaps most importantly, questions about the fundamental structure of personality can be examined without analytical artifacts possibly lurking in the background. We predict that these and other advances will be realized as more ideal point models and methods for evaluating quality of items are developed.
In conclusion, we believe that psychometric models for personality assessment data should reflect the unique characteristics of this domain. Even though it is convenient to use models and methods from the cognitive ability domain, we believe that such an approach short-changes the richness of personality. We hope this chapter has demonstrated that new models are needed and that, as a first attempt, ideal point methods provide a better characterization of personality assessment data. Clearly, measurement theory for personality assessment is an area offering a great opportunity for innovative new research.

Acknowledgments
The work described in this chapter was supported in part by a grant from the U.S. Army Research Institute, contract number W74V8H-06-C-006. We thank Dr. Len White for his extensive contributions to this research.

References
Andrich, D. (1988). The application of an unfolding model of the PIRT type to the measurement of attitude. Applied Psychological Measurement, 12, 33–51.
Andrich, D. (1989). A probabilistic IRT model for unfolding preference data. Applied Psychological Measurement, 13, 193–216.
Andrich, D., & Luo, G. (1993). A hyperbolic cosine latent trait model for unfolding dichotomous single-stimulus responses. Applied Psychological Measurement, 17, 253–276.
Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review, 11, 150–166.
Bergman, M. E., Drasgow, F., Donovan, M. A., Juraska, S. E., & Nejdlik, J. B. (2006). Scoring situational judgment tests: Once you get the data, your troubles begin. International Journal of Selection and Assessment, 14, 223–235.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test scores (pp. 395–479). Reading, MA: Addison-Wesley.
Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51.
Bock, R. D., Gibbons, R., & Muraki, E. (1988). Full-information item factor analysis. Applied Psychological Measurement, 12, 261–280.
Bock, R. D., Gibbons, R., Schilling, S. G., Muraki, E., Wilson, D. T., & Wood, R. (2003). TESTFACT 4.0 (Computer software and manual). Lincolnwood, IL: Scientific Software International.
Böckenholt, U. (2004). Comparative judgments as an alternative to ratings: Identifying the scale origin. Psychological Methods, 9, 453–465.
Borman, W. C., Buck, D. E., Hanson, M. A., Motowidlo, S. J., Stark, S., & Drasgow, F. (2001). An examination of the comparative reliability, validity, and accuracy of performance ratings made using computerized adaptive rating scales. Journal of Applied Psychology, 86, 965–973.
Bossuyt, P. (1990). A comparison of probabilistic unfolding theories for paired comparisons data. Berlin: Springer-Verlag.
Bowen, C., Martin, B. A., & Hunt, S. T. (2002). A comparison of ipsative and normative approaches for ability to control faking in personality questionnaires. International Journal of Organizational Analysis, 10, 240–259.
Brady, H. E. (1989). Factor and ideal point analysis for interpersonally incomparable data. Psychometrika, 54, 181–202.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Abstract
Current construct validity theory emphasizes the following points. The process of validating measures of
unobservable constructs is also the process of validating theories of psychological functioning. That
process is ongoing for each measure and theory: construct validity is never fully established, but rather
evidence for the validity of measures and theories accrues over time. To facilitate the process, researchers
should conduct validation tests that are as informative as possible, that is, which directly address core
components of one’s theory. Measure and theory validation should be conducted on measures of
homogeneous constructs: the use of single scores to represent multidimensional entities cannot be
defended. An important new development is that it is now possible to validate the claims of personality
theory by testing them in relation to daily, moment-to-moment reactions and behavior. Extraordinary
progress has been made in the study of personality, and the further implementation of sound construct
validation principles will lead to continued, improved, and successful measure and theory validation in the
future.
Over the last century, both validation theory and personality theory have made enormous strides, and progress in the two domains has been closely linked. A great deal of the history of validation theory in psychology has been driven by the need to assess personality-based dysfunction; at the same time, progress in understanding how to validate measures of unobservable constructs has led to improvements in personality theory and assessment.
In this chapter, we begin with an overview of this history. We then discuss recent advances in construct validation theory in some depth. In doing so, we discuss the philosophical and theoretical basis for the recent advances, as well as ongoing difficulties in the implementation of construct validation methods. One important, recent advance is a growing appreciation of the need to make measures of unidimensional, rather than multidimensional, constructs the focus of both construct validation and theory validation. Although this need has been recognized since the modern beginning of validation efforts, researchers have not consistently focused their efforts on unidimensional measures. As a result, some important changes in the measurement and validation of personality constructs are necessary. We then provide examples of the kinds of advances in knowledge that have been realized in personality and psychopathology research following the introduction of construct validity and also following appreciation of the need to focus on homogeneous psychological entities. Finally, we discuss an important new challenge and opportunity for the validation of personality measures: to validate measures in relation to humans’ day-to-day and moment-to-moment reactions and behaviors.
An Historical Overview of the Joint History of Construct Validation Theory and Personality Theory

Early Attempts to Validate Personality/Psychopathology Measures

A good place to start a brief history of validation theory is with the Woodworth Personal Data Sheet (WPDS), a questionnaire developed in 1919 to help the U.S. Army screen out recruits who might be susceptible to “shell shock” or “war neurosis.” It was thought to measure emotional stability (Garrett & Schneck, 1928; Morey, 2002). Although the failure of the WPDS is well known, it is nevertheless the case that in both the construction of the test and in subsequent attempts to use it, researchers did show real concerns with the scale’s validity. We briefly consider this literature, because one can observe the presence of concerns that very much anticipate the ensuing development of methods for construct and theory validation.

In constructing the test, Woodworth developed its 116 dichotomous items on both rational and empirical grounds. He drew on the case histories of individuals identified as neurotic for item content, and then deleted items scored in the dysfunctional direction by 50% or more of his normal test group (Garrett & Schneck, 1928). In subsequent writing on the test, some researchers stressed the need to establish the scale’s validity empirically. For example, Flemming and Flemming (1929) began their empirical investigation of the WPDS with this statement: “[Of the nine studies published using the WPDS], in only one . . . has any attempt been made to validate the test by the correlation technique” (p. 500). Similar statements decrying the lack of validity evidence for many current instruments are abundant in today’s literature.

Total WPDS score (or, alternatively, a percentage of all items endorsed in the dysfunctional direction) did not well differentiate dysfunctional from normal individuals. Garrett and Schneck (1928) found that the WPDS did not differentiate between college freshmen (at that time, a more elite group than are freshmen today) and “avowed psychoneurotics.” Flemming and Flemming (1929) found that WPDS scores did not correlate with teacher ratings of students’ emotional stability. To investigate the basis for this failure, researchers considered the obvious diversity of item content (Garrett & Schneck, 1928; Laird, 1925). Sample WPDS items are: “Have you ever lost your memory for a time?,” “Can you sit still without fidgeting?,” “Does it make you uneasy to have to cross a wide street or an open square?,” and “Does some particular useless thought keep coming into your mind to bother you?” In noting the variety of items and the mixture of mental complaints that sparked them, Garrett and Schneck (1928) concluded:

It is this fact, among others, which is causing the present-day trend away from the concept of mental disease as an entity. Instead of saying that a patient has this or that disease, the modern psychiatrist prefers to say that the patient exhibits such and such symptoms. (p. 465)

This thinking led them to investigate the performance of individual items, rather than the test as a whole; further, they sought to identify covariation between individual item responses and specific diagnoses, rather than membership in a global category of “mentally disturbed.” In this early study, Garrett and Schneck recognized both the need to avoid combining items of different content and the need to avoid combining individuals with different symptom pictures. Their approach anticipated two important subsequent advances. First, their use of an empirical item–person classification produced very different results from prior rational classifications (Laird, 1925), thus implicating the importance of empirical validation and anticipating criterion-keying methods of test construction. Second, they anticipated the current appreciation for construct homogeneity, with its emphasis on unidimensional traits and unidimensional symptoms as the proper objects of theoretical study and measure validation (Edwards, 2001; McGrath, 2005, 2007; Smith, Fischer, & Fister, 2003; Smith, McCarthy, & Zapolski, 2007).

In the end, the WPDS rested on a far too incomplete understanding of personality and psychopathology to prove effective. Although efforts to validate the measure occurred in the absence of well-developed test and construct validation theory, this work undoubtedly paved the way for ensuing advances in both personality/psychopathology theory and validation theory.

The Centrality of Criterion-Related Validity in the Early and Middle Twentieth Century

In the following years, perhaps in part due to the failures of rationally based test construction, and perhaps in part due to suspicion of theories describing unobservable entities (Blumberg & Feigl, 1931), test validity came to be understood in terms of a test’s ability to predict a practical criterion (Cureton, 1950; Kane, 2001). In fact, many
SMITH, ZAPOLSKI 83
advances that have taken place since then. Perhaps ironically, the success of the criterion-related validity method led to its ultimate replacement with construct validity theory. The criterion approach led to significant advances in knowledge, which made possible the development of integrative theories of personality and psychopathology. But such theories could not be validated using the criterion approach; there was thus a need for advances in validation theory to make possible the emerging advances in theories of personality and psychopathology. This need was addressed by several construct validity authors in the middle of the twentieth century (Campbell & Fiske, 1959; Cronbach & Meehl, 1955; Loevinger, 1957).

The sciences of personality and psychopathology have moved far beyond the need to rely on strict criterion prediction for validity, but that is true because of the strong insistence on empirical, criterion-based validation that characterized the best research during that period of history. Today, theory testing and the psychometric representation of unobservable processes is the norm. Even research using instruments originally developed through criterion-keying now emphasizes theory development and validation. Recent advances in research using the MMPI-2 have addressed precisely these topics; authors have developed and validated content scales thought to represent basic, homogeneous dimensions of both personality (Harkness & McNulty, 2006) and psychopathology (Butcher, 1995).

The Emergence of Construct Validity and of Theory Testing

To address the emerging need to validate theories, rather than simply predict criteria of importance, Meehl and Challman, working as part of the American Psychological Association Committee on Psychological Tests, introduced the concept of construct validity in the 1954 Technical Recommendations (American Psychological Association, 1954). Cronbach and Meehl (1955), of course, further developed the meaning of the concept, as did Loevinger (1957). The core idea of construct validity is as follows.

To develop theories of psychological functioning, it is typically necessary to refer to psychological attributes that, though presumably real, are not directly observable. For example, the theory that both low positive affect and high negative affect contribute to depression (Clark & Watson, 1991) is a theory that specifies three entities along which individuals vary: levels of positive affect, levels of negative affect, and levels of depression. None of these three entities is directly observable, so to test the theory, one needs a measure to represent each entity. The challenge, of course, is to determine whether individual differences on one’s measure validly reflect individual differences in the unobservable target entity. This validation process is necessarily inferential. One must demonstrate that one’s measure of a given construct relates to measures of other constructs in theoretically predictable ways. For measures of inferred constructs, the only way to build evidence that a measure reflects a construct validly is to test whether scores on the measure conform to a theory, of which the target construct is a part.

If I develop a measure of hypothetical construct A, I can only validate my measure if I have some theoretical argument that, for instance, A relates positively to B, but is unrelated to C. If I have such a theory, and if I have measures of constructs B and C, I can test whether my measure of A performs as predicted by my theory. Thus, the construct validation process requires the prior development of theories. And more than that, the process simultaneously involves testing the validity of a measure and the validity of the theory of which the measured construct is a part. Construct validation is also theory validation.

As one can readily see, the construct validation process is indeterminate. If my measure of A performs as expected by my theory, then I am likely to have increased confidence in both my theory and my measure of A. However, I cannot be certain that both my theory is correct and my measure is accurate. Suppose, instead, my new measure of A inadvertently overlaps with B (known not to correlate with C), and my supportive results are really due to the measures of A and B partly reflecting the same construct. Thus, positive results increase one’s confidence in one’s measure and in one’s theory, but they do not constitute either proof of a theory or full validation of a measure.

Similarly, if my measure of A does not perform as expected by my theory, I must consider whether the measure, the theory, both, or neither, lack validity. Suppose A relates to both B and C. I have no certain basis for determining whether (a) my theory was accurate but my measure of A was inadequate; (b) my theory was incorrect and my measure of A was adequate; or (c) my theory was correct, my measure of A was adequate, but my measure of C was inadequate. There are other possibilities as well, including inadequate study design.
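The A, B, and C example above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the latent scores, effect sizes, error variances, and the cutoffs used to judge "conforms to theory" are all invented for illustration and do not come from the chapter. It shows the mechanics of checking whether a measure of A relates to B but not to C; as the text stresses, a conforming pattern raises confidence but proves neither the theory nor the measure.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_measures(n=2000, seed=0):
    """Simulated observed scores for A, B, C (all values are assumptions)."""
    rng = random.Random(seed)
    a_scores, b_scores, c_scores = [], [], []
    for _ in range(n):
        latent_a = rng.gauss(0, 1)
        # Assumed theory: A relates positively to B ...
        latent_b = 0.6 * latent_a + 0.8 * rng.gauss(0, 1)
        # ... and A is unrelated to C.
        latent_c = rng.gauss(0, 1)
        # Each observed measure is the latent construct plus measurement error.
        a_scores.append(latent_a + 0.5 * rng.gauss(0, 1))
        b_scores.append(latent_b + 0.5 * rng.gauss(0, 1))
        c_scores.append(latent_c + 0.5 * rng.gauss(0, 1))
    return a_scores, b_scores, c_scores


def conforms_to_theory(r_ab, r_ac, min_ab=0.30, max_abs_ac=0.10):
    """True when the observed pattern matches the prediction (arbitrary cutoffs)."""
    return r_ab >= min_ab and abs(r_ac) <= max_abs_ac


a, b, c = simulate_measures()
r_ab, r_ac = pearson(a, b), pearson(a, c)
print(f"r(A,B) = {r_ab:.2f}, r(A,C) = {r_ac:.2f}, "
      f"conforms: {conforms_to_theory(r_ab, r_ac)}")
```

If the measure of A were contaminated with B-like content, the same check would still pass, which is the indeterminacy the passage describes.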
programs. They address questions that will clarify the validity of theories and the measures used to test them. Because of the iterative nature of the development and validation of theory and measurement, theoretical tests of this kind are likely to provide important information.

One characteristic of informative theory tests is that they evaluate, as directly as possible, specific claims made for a theory. In clinical risk factor research, to assert that trait A is a risk factor for syndrome B requires as direct a test of that specific claim as possible. Demonstration of a positive cross-sectional correlation between A and B does provide information (the absence of a correlation would pose serious problems for a risk theory), but its information value is limited because it is not a direct test of the risk factor hypothesis. In contrast, prospective designs in which trait A predicts onset of syndrome B, and other possible explanations for the onset of B have been controlled, are more informative. Such a design provides greater reason for confidence in both the theory and the measures of the constructs specified by the theory. Tests of mediation in which the putative cause predicts subsequent changes in the putative mediator, and the mediator then predicts still later changes in the putative consequence (e.g., Fried, Cyders, & Smith, 2008; Stice, 2001), are informative because they are more direct tests of a mediational risk process than are cross-sectional correlations, and so have ruled out several possible explanations that compete with the mediational theory. Tests comparing alternate theoretical explanations of the same data (Bartusch, Lynam, Moffitt, & Silva, 1997) can also be informative, because they hold multiple theories up to critical examination, both individually and in comparison to each other. Doing so is an effective way to provide information to researchers and clinicians.

Advances in Philosophy of Science: Nonjustificationism and Construct Validation

The second problem in Cronbach and Meehl’s (1955) use of the nomological network concept has been illuminated by advances in both the philosophy of science and the study of the history of science; these advances have taken place since their seminal paper was published.

Earlier perspectives from the philosophy of science on theory validation have been described, by Bartley (1962) and others, as justificationist: theories could be fully justified or fully disproved based on observation or empirical evidence. The classic idea of a critical experiment, the results of which could demonstrate that a theory is false, is an example of justificationism (Duhem, 1914/1991; Lakatos, 1968). Logical positivism (Blumberg & Feigl, 1931), with its belief that theories are straightforward derivations from observed facts, is one example of justificationist philosophy of science. Under justificationism, one could imagine the validity of a theory and its accompanying measures being fully and unequivocally established following a series of critical experiments.

In recent decades, historians of science and philosophers of science have moved away from justificationism, endorsing, instead, what is referred to as nonjustificationism (Bartley, 1987; Campbell, 1987, 1990; Feyerabend, 1970; Kuhn, 1970; Lakatos, 1968; Rorer & Widiger, 1983; Weimer, 1979). Nonjustificationism refers to the concept that scientific theories are neither fully justified nor definitively disproven by individual, empirical studies. It is based on the following considerations.

The test of any theory presupposes the validity of several other theories (often referred to as auxiliary theories), including theories of measurement, that also influence the empirical test (Lakatos, 1999; Meehl, 1978, 1990). Consider the brief example we presented above, in which we referred to measures of constructs A, B, and C. To test our theory of A, we must rely on the validity of the theory of how construct B operates, the validity of the theory that we have a valid measure of construct B, the validity of the theory of how construct C operates, the validity of the theory that we have a valid measure of construct C, and the validity of the theory that B is unrelated to C. Empirical results that are negative for A could thus reflect the failure of any number of theories other than the validity of our measure of construct A.

In part for this reason, no theory is ever fully proved or disproved, hence the term nonjustificationism. Instead, at any one time, evidence tends to favor some theories over others. Confirming evidence can be evaluated in terms of how informative the theory test was (Smith, 2005). To evaluate disconfirming evidence, one must make judgments as to whether the negative results likely stemmed from problems in the core theory under consideration, one of the auxiliary theories invoked to conduct the test, or another, more specific auxiliary hypothesis (Lakatos, 1968, 1999; Meehl, 1990). Obviously, researchers can and do disagree. Philosophers have proposed various means for evaluating the evidence: for example, Lakatos (1999) advocated judgments as to whether a research
with measures of other constructs. The second source of uncertainty is perhaps more severe than the first. The same composite score is likely to reflect different combinations of constructs for different members of the sample.

McGrath (2005) clarified this point by drawing a useful distinction between psychological constructs that represent real psychological entities, on the one hand, and concepts designed to refer to covariations among more elemental psychological entities. Consider the NEO PI-R measure of the Five-Factor Model of personality (Costa & McCrae, 1992; Chapter 16). One of the five factors is neuroticism, which is understood to be composed of six elemental constructs. Two of those are angry hostility and anxiety. Measures of those two traits covary reliably; they consistently fall on a neuroticism factor in exploratory factor analyses conducted in different samples and across cultures (McCrae, Zonderman, Costa, Bond, & Paunonen, 1996). However, they are not the same construct. Their correlation was .47 in the standardization sample; they share only 22% of their variance.

Clearly, one person could be high in angry hostility and low in anxiety, and another could be low in angry hostility and high in anxiety. Those two different patterns could produce exactly the same score on neuroticism. It therefore makes sense to develop theories relating angry hostility, or anxiety, to other constructs, and tests of such theories are coherent. However, a theory relating overall neuroticism to other constructs is imprecise and unclear. If neuroticism correlates with another measure, one does not know which traits account for the covariation, or even whether the same traits account for the covariation for each member of the sample. Using McGrath’s (2005) language, we should understand neuroticism as a social construction, designed to indicate the consistent covariation among identified psychological entities. We should not understand it as a coherent, psychological entity. Presuming continued evidence for the homogeneity of measures of angry hostility and anxiety, those two measures can be theorized to represent real psychological entities.

Hough and Schneider (1995), McGrath (2005, 2007), Paunonen and Ashton (2001), Schneider, Hough, and Dunnette (1996), and Smith et al. (2003), among others, have all noted that use of scores of broad measures often obscures predictive relationships. Paunonen (1998) and Paunonen and Ashton (2001) have shown that prediction of theoretically relevant criteria is improved when one uses facets of the big five personality scales, rather than the composite, big five dimensions themselves. Consider, for example, the occupational variable of service orientation to consumers (Hogan & Hogan, 1992). Although appealing, it would be imprecise to hypothesize that conscientiousness, as measured by the NEO PI-R, relates to service orientation. Using McGrath’s language, such a hypothesis involves a social construction, conscientiousness, rather than presumably real, unidimensional psychological entities.

Instead, one should develop hypotheses concerning the putative entities themselves. Indeed, Costa and McCrae (1995) found that one trait within the conscientiousness domain, dutifulness, correlated .35 with service orientation; while another, achievement striving, correlated .01 with the same criterion. Achievement striving did correlate highly with a different occupational variable, managerial potential (r = .63), but another facet of conscientiousness, order, accounted for comparatively little variance in that criterion (r = .25). These findings are typical: facets within domains of the NEO PI-R often have markedly different external correlates. This reality strongly supports the psychometric argument for focusing validation tests on homogeneous construct measures. If, instead, researchers conduct validation tests using single scores to reflect multidimensional measures, they risk averaging the different effects of different constructs. When researchers study homogeneous constructs, their tests are more precise and more coherent; as a result, they can be more informative.

A second example concerns the construct of impulsivity. Over the last several years, clinical researchers have recognized that the term “impulsivity” has been used in a variety of ways: various measures with that label can tap what appear to be different constructs, and many impulsivity measures appear to include items tapping multiple constructs (Bagby, Joffe, Parker, & Schuller, 1993; Depue & Collins, 1999; Evenden, 1999; Petry, 2001; Smith et al., 2007; Whiteside & Lynam, 2001, 2003; Whiteside, Lynam, Miller, & Reynolds, 2005; Zuckerman, 1994).

Efforts to disaggregate the set of constructs included within the impulsivity framework have proved quite useful. One current model involves identification of five, separate constructs that involve dispositions to rash action: sensation seeking, lack of planning, lack of perseverance, positive urgency (the tendency to engage in rash actions when in an extremely positive mood), and negative urgency (the tendency to engage in rash actions when in an
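The angry hostility and anxiety figures discussed earlier in this section can be checked with a short calculation. The sketch below squares the reported correlation to get shared variance, and then shows two respondents with opposite facet profiles receiving the same composite score. The respondent scores are invented for illustration, and the composite uses only two facets with unit weights, which is a simplification rather than the NEO PI-R scoring procedure.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r ** 2


def composite(facets):
    """Unit-weighted composite: the plain sum of the facet scores (assumed scoring)."""
    return sum(facets.values())


# Correlation between angry hostility and anxiety reported in the text.
r_hostility_anxiety = 0.47
print(f"shared variance: {shared_variance(r_hostility_anxiety):.0%}")

# Two hypothetical respondents with opposite facet profiles (invented scores).
person_1 = {"angry_hostility": 30, "anxiety": 10}
person_2 = {"angry_hostility": 10, "anxiety": 30}

print(composite(person_1), composite(person_2))
```

The two composites are equal even though the facet profiles differ, which is why a correlation involving the composite cannot say which facet carries the relationship for a given person.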
researchers and clinicians have a much clearer understanding of which sets of lower-level traits tend to covary. Theory testing should take place at the level of homogeneous constructs (here represented by the 30 facets of the NEO PI-R), but by understanding the structural relations among these 30 trait dimensions, researchers and clinicians have a growing understanding of typical patterns of individual differences in personality functioning.

Distinctions among putatively elemental psychological entities can alter researchers’ understanding of the hierarchical structure of those entities. Consider the above example concerning impulsivity-like constructs, in which five, distinct pathways to rash action have been identified. Interestingly, there has been no evidence that those five constructs should be understood as facets of a broader, overall impulsivity domain. Instead, positive and negative urgency appear to be facets of an overall emotion-based disposition to rash action; lack of perseverance and lack of planning appear to be facets of an overall low conscientiousness disposition to rash action; and sensation seeking stands alone, loading on neither of those factors (Cyders & Smith, 2007). Thus, a higher-order integration of the five pathways suggests at least three different broad dispositions to rash action. The term “impulsivity” appears not to have a clear referent in the personality domain. Recognition of the structure of these traits clarifies understanding of the personality contribution to rash, ill-considered action.

Other distinctions among elemental traits suggest hierarchical organizations of dysfunction. Recognition that the dimensions of positive affect and negative affect were distinct helped researchers understand the difference between anxiety and depression (Clark & Watson, 1991; Diener, Larsen, Levine, & Emmons, 1985; Watson, Clark, & Carey, 1988). The two appear to share high, overall negative affect, but they differ in positive affect. Individuals high in negative affect but unremarkable in positive affect tend to be anxious, and those high in negative affect and also low in positive affect tend to be depressed. Thus, researchers and clinicians have available a hierarchical, tripartite structure for distress; it includes an overall affective distress factor (high negative affect) and differentiation among two expressions of distress: anxiety and depression (Clark & Watson, 1991).

In recent years, progress in differentiating among constructs and the resulting implications for hierarchical organization has begun to be applied to the study of psychopathology. A great deal of research uses DSM-IV diagnoses, or DSM-IV symptom counts, as predictors or criteria in validation studies. Using diagnoses and symptom counts in that way involves the presumption of the validity of the disorders, and recent advances suggest that presumption may not always be warranted. Researchers have noted that many currently defined disorders may not reflect coherent psychological constructs; instead, they appear to represent composites of multiple, separable constructs.

Applying our above discussion of construct homogeneity, if disorders are not unidimensional entities, then the use of disorder scores or symptom counts as variables in an analysis is problematic for at least two reasons. First, the scores represent the influence of multiple psychological constructs, so they lack clear theoretical meaning. Second, different individuals are likely attaining the same score through endorsement of different symptoms, so the relative degree of influence of the different constructs varies from person to person (McGrath, 2005, 2007). Thus, for at least some disorders, the meaning of scores may not be clear.

Clinical researchers have increasingly demonstrated the likely reality of this concern. In one striking example, Widiger and Trull (2007) noted that two individuals could both be diagnosed with obsessive compulsive personality disorder, and yet not share a single symptom in common. Clinical researchers have increasingly identified multiple dimensions, in some cases with different etiologies, within single mental disorders. We next briefly review relevant evidence pertaining to a sample of disorders.

THE DISAGGREGATION OF PSYCHOPATHY

The disaggregation of the many components of psychopathy has received considerable attention and constitutes a good, recent example of progress stemming from construct validation theory, including the need to study homogeneous constructs (Brinkley, Newman, Widiger, & Lynam, 2004; Cooke & Michie, 2001; Harpur, Hakstian, & Hare, 1988; Harpur, Hare, & Hakstian, 1989; Lynam & Widiger, 2007). See Chapter 29 for a more detailed consideration of psychopathy. Hare’s (2003) Psychopathy Checklist Revised (PCL-R) identified two separate factors, one representing the callous and remorseless use of others and the other representing a deviant and antisocial lifestyle. In the PCL-R, the two factors share only 25% of their variance (Hare, 1990; Harpur et al., 1988), and they have numerous different correlates (Harpur et al., 1989). Cooke and Michie (2001) identified
avoidance, emotional numbing, and hyperarousal) fit their 17-symptom clinical interview better than did any other model, including a single-factor model or a hierarchical model, in which an overall factor was thought to underlie the four factors. Clearly, identical posttraumatic stress disorder symptom counts can refer to very different symptom pictures. It thus appears to be the case that the term “posttraumatic stress disorder” does not denote a real, unidimensional psychological entity; rather, it denotes the tendency of four dimensions of distress to covary. Again, this recognition can only accelerate the rate of acquisition of new knowledge about these dimensions of distress.

THE DISAGGREGATION OF SCHIZOTYPAL PERSONALITY DISORDER

The apparent heterogeneity of some DSM-IV disorders is not limited to Axis I disorders. Fossati et al. (2005) compared several different factor structures for the schizotypal personality disorder criteria, and found that a three-factor model (cognitive-perceptual, interpersonal, and disorganization) fit best. Intercorrelations among the three factors ranged from .14 to .63, again indicating substantial unshared variance in each factor. Here, too, individuals can be high on one factor but not on another. Again in this case, the same quantitative symptom count could reflect very different dysfunctional experiences.

IMPLICATIONS OF THIS DISAGGREGATION RESEARCH

It follows from these recent advances that, for many disorders, the use of diagnostic status or a disorder score as either a predictor or a criterion in theory-testing studies will tend to produce unclear results. A score on depression, or a depression diagnosis, reflects scores on several constructs. To test a theory that experience X is a risk factor for depression is to be imprecise. It may be the case that experience X is a risk factor for one factor within depression, but not for other factors. A validation test in which the criterion was overall depression scores could easily produce nonsignificant results. One would risk missing the association altogether. Perhaps more problematically, in such a situation one has to assume that the symptom count score reflects the same variable for each person, but that may well not be true. Individuals could have very similar symptom counts, yet very different patterns of scores on individual constructs. When that is the case, the symptom count is not a coherent theoretical entity, and its correlation with measures of other constructs has unclear meaning. In 1928, Garrett and Schneck argued that one should not validate the WPDS against overly broad criteria, such as disease status. Instead, they felt, researchers should use specific symptoms as criteria.

It is certainly true that our understanding of psychopathology has come a long way since 1928. For example, the DSM-IV committee conducted and published 175 qualitative literature reviews and 48 additional empirical studies to provide the scientific underpinnings of their efforts (Widiger & Clark, 2000). Each new version of the DSM has relied more heavily on the available science than previous versions, and each new version has benefited from an ever-increasing body of scientific knowledge.

At the same time, it appears to be the case that validation research still suffers from the use of overly broad, multidimensional criteria in the form of putative mental disorders. The recent success in disaggregating some of these disorders should perhaps be best understood as a continuation of the effort called for by Garrett and Schneck in 1928. Just as theories of psychopathology have advanced dramatically since 1928, we can expect them to continue to advance as researchers further sharpen their focus to study elemental, homogeneous psychological entities. Perhaps today, just as in 1928, one of the biggest obstacles to test validation is the imprecision present in many criterion variables.

It seems to us that one of the obvious implications of this disaggregation research may be that, in some cases, the diagnoses themselves need to be rethought. Just as some diagnostic categories may be best understood as social constructions denoting a set of moderately related constructs (McGrath, 2005), it is also true that certain constructs are represented in the diagnostic criteria for many different disorders. Psychological constructs cut across disorders, and disorders combine separate psychological constructs. Perhaps a system describing dysfunction in terms of homogeneous dimensions of functioning would accelerate research progress dramatically. It is also true that, in the case of some disorders, researchers may identify clear disease processes, where a single cause produces elevations on what are otherwise only moderately correlated dimensions of functioning. (Although achy muscles, nausea, and a fever are only moderately correlated, they can occur together as a result of a single cause, such as is the case with the flu.) However, that possibility cannot be confirmed until the specific, homogeneous dimensions are themselves identified, described, and measured.

The possibility of describing dysfunction along dimensions of functioning is not an abstract ideal for an imaginary future. It is happening right now.
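The point that an overall depression score can hide a risk factor that operates on only one factor within depression can be sketched with simulated data. Everything in this sketch is an assumption made for illustration: three factors, an effect of experience X on the first factor only, the effect size of 0.4, and factors that are independent of one another (the disorders discussed above have moderately correlated factors, so this is a deliberate simplification). It is not a reanalysis of any study cited in the chapter.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=2000, seed=1):
    """Experience X affects factor 1 only; the composite sums three factors."""
    rng = random.Random(seed)
    x, factor_1, composite = [], [], []
    for _ in range(n):
        xi = rng.gauss(0, 1)
        # Assumed: X is a risk factor for factor 1 (standardized effect 0.4).
        f1 = 0.4 * xi + math.sqrt(1 - 0.4 ** 2) * rng.gauss(0, 1)
        # Assumed: factors 2 and 3 are unrelated to X (and, for simplicity, to f1).
        f2 = rng.gauss(0, 1)
        f3 = rng.gauss(0, 1)
        x.append(xi)
        factor_1.append(f1)
        composite.append(f1 + f2 + f3)  # stands in for an overall disorder score
    return x, factor_1, composite


x, factor_1, composite = simulate()
r_factor = pearson(x, factor_1)
r_composite = pearson(x, composite)
print(f"r(X, factor 1) = {r_factor:.2f}; r(X, composite) = {r_composite:.2f}")
```

Under these assumptions the correlation with the composite is noticeably weaker than the correlation with the single factor, and in a smaller sample it could fall short of significance, which is the risk of missing the association that the text describes.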
important to demonstrate that individual differences in personality do in fact predict individual differences in daily reactions and behaviors. Recently, validation evidence of this kind has begun to emerge. To provide an example of this work, we briefly consider EMA as applied to the disorder bulimia nervosa.
One model of risk for the binge-eating characteristic of bulimia nervosa is that it is precipitated by the experience of negative mood (Agras & Telch, 1998; Sanftner & Crowther, 1998; Telch & Agras, 1996). With the emergence of ED technology, it has become possible to test this hypothesis “in the moment.” A series of studies have recently begun to do so.
Wonderlich et al. (2007) compared bulimic participants with three different personality profiles in their daily moods and their symptomatic behavior. They contrasted a cluster of individuals characterized by interpersonal-emotional distress (e.g., high scores on cognitive dysregulation, identity problems, anxiousness, suspiciousness, and self-harm) with individuals characterized by stimulus seeking-hostile behavior and individuals with a relative absence of personality pathology. Consistent with self-reported personality data, the interpersonal-emotional cluster reported more momentary negative mood and less momentary positive mood than did either of the other two clusters via ED assessment. And consistent with theory, those individuals also reported the highest rates of daily binge eating and vomiting.
Smyth et al. (2007) found that, among bulimia nervosa patients, days in which individuals engaged in binge eating or vomiting were characterized by lower levels of positive affect, higher levels of negative affect, higher levels of hostility, and more stress. They also analyzed within-day trends, and found that decreasing positive affect, increasing negative affect, and increasing anger/hostility preceded symptomatic events. And, consistent with the view that binge eating and purging provide negative reinforcement in the form of distress relief, they found that following symptomatic events, positive affect increased and both negative affect and anger/hostility decreased. Together, these findings provide important validation for the negative reinforcement theory of bulimia nervosa symptom expression. They also provide important validation for the personality measures contributing to the personality clusters (in this case, taken from the Dimensional Assessment of Personality Pathology: Livesley & Jackson, 2002); individual differences on the measures predicted individual differences in both mood and behavior.
These tests had a number of properties that we believe characterize sound validation tests. First, they were informative: they put both the personality measures and the negative reinforcement theory to direct tests. Failures of prediction would have cast doubt on the theory, the measures, or both. Second, the criteria were likely unidimensional constructs: negative mood, positive mood, hostility, binge eating, and vomiting. That the authors used specific symptoms as their predictive criteria satisfies the demand of Garrett and Schneck (1928) and the recent rerecognition of the importance of unidimensionality in theory tests. Third, they used what appear to be homogeneous personality constructs to create latent profiles: they identified groups of individuals with co-occurring personality traits, but they understood the groups to reflect probabilistic combinations of distinct traits.
ED methodology provides an opportunity for rigorous, informative tests of the validity of personality measures. We anticipate that research will show that traits do tend to predict individual differences in reactivity to the environment and in behavioral choices, as measured on a moment-to-moment basis (as was true in the examples we selected). If that turns out to be true, personality researchers will have come one step closer to describing individual differences in pathways from inherited disposition all the way to present behavior.
Of course, failures to find expected relations between personality and daily experience will lead to re-examination of both theory and assessment method. In either case, these types of studies are likely to be highly informative, and therefore likely to advance our understanding of personality. They constitute useful validation tests for both theories and the measures used to embody them.

Summary
We wish to emphasize the following points from this chapter. First, the history of validity theory is characterized by rigorous efforts to determine the best ways to validate measures of psychological constructs, and, over time, to validate theories of psychological functioning. It is inaccurate to look at past efforts, such as the attempt to validate the WPDS and the use of criterion-related validity, as simply inadequate efforts. Rather, each step in validity theory can be understood to provide advances that made possible the next step. And, the advances in validity theory occurred hand-in-hand with advances in personality theory; each stimulated the development of the other.
Bagby, R. M., Joffe, R. T., Parker, J. D. A., & Schuller, D. R. (1993). Re-examination of the evidence for the DSM-III personality disorder clusters. Journal of Personality Disorders, 7, 320–328.
Bartley, W. W., III. (1962). The retreat to commitment. New York: A. A. Knopf.
Bartley, W. W., III. (1987). Philosophy of biology versus philosophy of physics. In G. Radnitzky & W. W. Bartley, III (Eds.), Evolutionary epistemology, rationality, and the sociology of knowledge (pp. 7–46). La Salle, IL: Open Court.
Bartusch, D. R. J., Lynam, D. R., Moffitt, T. E., & Silva, P. A. (1997). Is age important? Testing a general versus a developmental theory of antisocial behavior. Criminology, 35, 13–48.
Bechtoldt, H. P. (1951). Selection. In S. S. Stevens (Ed.), Handbook of experimental psychology (pp. 1237–1266). Oxford: John Wiley & Sons.
Blumberg, A. E., & Feigl, H. (1931). Logical positivism. Journal of Philosophy, 28, 281–296.
Briggs, S. R., & Cheek, J. M. (1986). The role of factor analysis in the development and evaluation of personality scales. Journal of Personality, 54, 107–147.
Brinkley, C. A., Newman, J. P., Widiger, T. A., & Lynam, D. R. (2004). Two approaches to parsing the heterogeneity of psychopathy. Clinical Psychology: Science and Practice, 11, 69–94.
Butcher, J. N. (1990). Use of the MMPI-2 in treatment planning. New York: Oxford University Press.
Butcher, J. N. (1995). Clinical personality assessment: Practical approaches. New York: Oxford University Press.
Butcher, J. N. (2002). Assessing pilots with “the wrong stuff”: A call for research on emotional health factors in commercial aviators. International Journal of Selection and Assessment, 10, 1–17.
Campbell, D. T. (1987). Evolutionary epistemology. In G. Radnitzky & W. W. Bartley, III (Eds.), Evolutionary epistemology, rationality, and the sociology of knowledge (pp. 47–89). La Salle, IL: Open Court.
Campbell, D. T. (1990). The Meehlian corroboration-verisimilitude theory of science. Psychological Inquiry, 1, 142–147.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multi-trait multi-method matrix. Psychological Bulletin, 56, 81–105.
Clark, L. A. (2007). Assessment and diagnosis of personality disorder: Perennial issues and an emerging reconceptualization. Annual Review of Psychology, 58, 227–257.
Clark, L. A., & Watson, D. (1991). Tripartite model of anxiety and depression: Psychometric evidence and taxonomic implications. Journal of Abnormal Psychology, 100, 316–336.
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.
Cleckley, H. (1941). The mask of sanity. St Louis: Mosby.
Cooke, D. J., & Michie, C. (2001). Refining the construct of psychopathy: Towards a hierarchical model. Psychological Assessment, 13, 171–188.
Costa, P. T., & McCrae, R. R. (1992). Revised NEO Personality Inventory manual. Odessa, FL: Psychological Assessment Resources.
Costa, P. T., & McCrae, R. R. (1995). Domains and facets: Hierarchical personality assessment using the revised NEO Personality Inventory. Journal of Personality Assessment, 64, 21–50.
Cronbach, L. J. (1988). Five perspectives on validation argument. In H. Wainer & H. Braun (Eds.), Test validity (pp. 3–17). Hillsdale, NJ: Erlbaum.
Cronbach, L. J., & Gleser, G. C. (1965). Psychological tests and personnel decisions. Urbana, IL: University of Illinois Press.
Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52, 281–302.
Cureton, E. E. (1950). Validity. In E. F. Lindquist (Ed.), Educational measurement. Washington, DC: American Council on Education.
Cyders, M. A., & Smith, G. T. (2007). Mood-based rash action and its components: Positive and negative urgency and their relations with other impulsivity-like constructs. Personality and Individual Differences, 43, 839–850.
Cyders, M. A., Smith, G. T., Spillane, N. S., Fischer, S., Annus, A. M., & Peterson, C. (2007). Integration of impulsivity and positive mood to predict risky behavior: Development and validation of a measure of positive urgency. Psychological Assessment, 19, 107–118.
Depue, R. A., & Collins, P. F. (1999). Neurobiology of the structure of personality: Dopamine, facilitation of incentive motivation, and extraversion. Behavioral and Brain Sciences, 22, 491–569.
Derksen, J., Gerits, L., & Verbruggen, A. (2003). MMPI-2 profiles of nurses caring for people with severe behavior problems. Paper given at the 38th MMPI-2 Conference on Recent Developments in the Use of the MMPI-2 and MMPI-A, Minneapolis, MN.
Diener, E., Larsen, R. J., Levine, S., & Emmons, R. A. (1985). Intensity and frequency: Dimensions underlying positive and negative affect. Journal of Personality and Social Psychology, 48, 1253–1265.
Digman, J. M. (1997). Higher-order factors of the big five. Journal of Personality and Social Psychology, 73, 1246–1256.
Duhem, P. (1914/1991). The Aim and Structure of Physical Theory (P. Weiner, trans.). Princeton, NJ: Princeton University Press (First published in 1914 as La Théorie physique: Son objet, sa structure).
Edwards, J. R. (2001). Multidimensional constructs in organizational behavior research: An integrative analytical framework. Organizational Research Methods, 4, 144–192.
Evenden, J. (1999). Varieties of impulsivity. Psychopharmacology, 146, 348–361.
Feyerabend, P. (1970). Against method. In M. Radner & S. Winokur (Eds.), Minnesota studies on the philosophy of science, Vol. IV: Analyses of theories and methods of physics and psychology (pp. 17–130). Minneapolis, MN: University of Minnesota Press.
Flemming, E. G., & Flemming, C. W. (1929). The validity of the Matthews’ revision of the Woodworth personal data questionnaire. Journal of Abnormal and Social Psychology, 23, 500–506.
Fischer, S., & Smith, G. T. (2008). Binge eating, problem drinking, and pathological gambling: Linking behavior to shared traits and social learning. Personality and Individual Differences, 44, 789–800.
Fischer, S., Smith, G., Annus, A., & Hendricks, M. (2007). The relationship of neuroticism and urgency to negative consequences of alcohol use in women with bulimic symptoms. Personality and Individual Differences, 43, 1199–1209.
Fischer, S., Smith, G. T., Spillane, N., & Cyders, M. A. (2005). Urgency: Individual differences in reaction to mood and implications for addictive behaviors. In A. V. Clark (Ed.), The Psychology of Mood (pp. 85–108). New York: Nova Science Publishers.
Fossati, A., Citterio, A., Grazioli, F., Borroni, S., Carretta, I., Maffei, C., et al. (2005). Taxonic structure of schizotypal personality disorder: A multiple-instrument, multi sample
W. Bukoski (Eds.), Effective mass media strategies for drug abuse prevention campaigns (pp. 27–43). New York: Springer.
Paunonen, S. V. (1998). Hierarchical organization of personality and prediction of behavior. Journal of Personality and Social Psychology, 74, 538–556.
Paunonen, S. V., & Ashton, M. C. (2001). Big five factors and facets and the prediction of behavior. Journal of Personality and Social Psychology, 81, 524–539.
Perry, J. N., Miller, K. B., & Klump, K. (2006). Treatment planning with the MMPI-2. In J. Butcher (Ed.), MMPI-2: A practitioner’s guide (pp. 143–164). Washington, DC: American Psychological Association.
Petry, N. (2001). Substance abuse, pathological gambling, and impulsiveness. Drug and Alcohol Dependence, 63, 29–38.
Piasecki, T., Hufford, M., Solhan, M., & Trull, T. (2007). Assessing clients in their natural environments with electronic diaries: Rationale, benefits, limitations, and barriers. Psychological Assessment, 19, 25–43.
Rogers, R. (1995). Diagnostic and structured interviewing. Odessa, FL: Psychological Assessment Resources.
Rorer, L., & Widiger, T. (1983). Personality structure and assessment. Annual Review of Psychology, 34, 431–463.
Samuel, D., & Widiger, T. (2006). Clinicians’ judgments of clinical utility: A comparison of the DSM-IV and Five-Factor models. Journal of Abnormal Psychology, 115, 298–308.
Sanftner, J. L., & Crowther, J. H. (1998). Variability in self-esteem, moods, shame, and guilt in women who binge. International Journal of Eating Disorders, 23, 391–397.
Schneider, R. J., Hough, L. M., & Dunnette, M. D. (1996). Broadsided by broad traits: How to sink science in five dimensions or less. Journal of Organizational Behavior, 17, 639–655.
Schwarz, N., & Oyserman, D. (2001). Asking questions about behavior: Cognition, communication, and questionnaire construction. American Journal of Evaluation, 22, 127–160.
Shiffman, S., Hufford, M., Hickcox, M., Paty, J. A., Gnys, M., & Kassel, J. D. (1997). Remember that? A comparison of real-time versus retrospective recall of smoking lapses. Journal of Consulting and Clinical Psychology, 65, 292–300.
Simms, L., Watson, D., & Doebbeling, B. (2002). Confirmatory factor analyses of posttraumatic stress symptoms in deployed and nondeployed veterans of the Gulf War. Journal of Abnormal Psychology, 111, 637–647.
Smith, G. T. (2005). On construct validity: Issues of method and measurement. Psychological Assessment, 17, 396–408.
Smith, G. T., Fischer, S., Cyders, M. A., Annus, A. M., Spillane, N. S., & McCarthy, D. M. (2007). On the validity and utility of discriminating among impulsivity-like traits. Assessment, 14, 155–170.
Smith, G. T., Fischer, S., & Fister, S. M. (2003). Incremental validity principles in test construction. Psychological Assessment, 15, 467–477.
Smith, G. T., McCarthy, D. M., & Zapolski, T. C. B. (2007). On the Value of Homogeneous Constructs for Construct Validation, Theory Testing, and the Description of Psychopathology (Invited manuscript submitted for publication).
Smyth, J. M., Wonderlich, S. A., Heron, K. E., Sliwinski, M. J., Crosby, R. D., Mitchell, J. E., et al. (2007). Daily and momentary mood and stress are associated with binge eating and vomiting in bulimia nervosa patients in the natural environment. Journal of Consulting and Clinical Psychology, 75, 629–638.
Stice, E. (2001). A prospective test of the dual-pathway model of bulimic pathology: Mediating effects of dieting and negative affect. Journal of Abnormal Psychology, 110, 124–135.
Stone, A. A., & Shiffman, S. (2002). Capturing momentary self-report data: A proposal for reporting guidelines. Annals of Behavioral Medicine, 24, 236–243.
Swets, J. A., Dawes, R. M., & Monahan, J. (2000). Psychological science can improve diagnostic decisions. Psychological Science in the Public Interest, 1, 1–26.
Telch, C. F., & Agras, W. S. (1996). Do emotional states influence binge eating in the obese? International Journal of Eating Disorders, 20, 271–290.
Watson, D., Clark, L. A., & Carey, G. (1988). Positive and negative affect and their relation to anxiety and depressive disorders. Journal of Abnormal Psychology, 97, 346–353.
Watson, D., & Wu, K. (2005). Development and validation of the Schedule of Compulsions, Obsessions, and Pathological Impulses (SCOPI). Assessment, 12, 50–65.
Weimer, W. B. (1979). Notes on the methodology of scientific research. Hillsdale, NJ: Lawrence Erlbaum.
Whiteside, S. P., & Lynam, D. R. (2001). The Five Factor Model and impulsivity: Using a structural model of personality to understand impulsivity. Personality and Individual Differences, 30, 669–689.
Whiteside, S. P., & Lynam, D. R. (2003). Understanding the role of impulsivity and externalizing psychopathology in alcohol abuse: Applications of the UPPS impulsive behavior scale. Experimental and Clinical Psychopharmacology, 11, 210–217.
Whiteside, S. P., Lynam, D. R., Miller, J. D., & Reynolds, S. K. (2005). Validation of the UPPS impulsive behavior scale: A four-factor model of impulsivity. European Journal of Personality, 19, 559–574.
Widiger, T. A., & Clark, L. A. (2000). Toward DSM V and the classification of psychopathology. Psychological Bulletin, 126, 946–963.
Widiger, T., & Samuel, D. (2005). Diagnostic categories or dimensions? A question for the Diagnostic and statistical manual of mental disorders, fifth edition. Journal of Abnormal Psychology, 114, 494–504.
Widiger, T. A., & Trull, T. J. (2007). Plate tectonics in the classification of personality disorder: Shifting to a dimensional model. The American Psychologist, 62, 71–83.
Wonderlich, S., Crosby, R., Engel, S., Mitchell, J., Smyth, J., & Miltenberger, R. (2007). Personality-based clusters in bulimia nervosa: Differences in clinical variables and ecological momentary assessment. Journal of Personality Disorders, 21, 340–357.
Zuckerman, M. (1994). Behavioral expressions and biological bases of sensation seeking. Cambridge: Cambridge University Press.
Abstract
The meaningfulness of measures depends upon a number of factors: validity studies, score scales that are
understood by test users, knowledge that the measure has been studied with members of a population to
which an individual test taker or a group of test takers belongs, and so on. Fundamental to such
considerations is that the measure has been administered under the same or comparable circumstances as
it was when normed, validated, checked for appropriate types of reliability, and otherwise researched. This
chapter addresses issues specific to testing individuals. In some cases, specialized validity studies are
needed to document that a test can be adapted or accommodated to the needs of specific test takers and
have its scores still carry the same or similar meaning as with the original population. Essentially, we need
to perform generalizability studies to show that the meaningfulness and validity of tests generalize across
administrative changes, language variations, and cultural differences for varying groups of individuals. Truly,
psychological testing needs to stay rooted in its very basis: the science of individual differences.
We would like to begin this chapter with a fictionalized court case that draws heavily upon reality, as experienced by many psychometricians and others concerned with assessment. The case is that of Jenna C. v. Unified School District.

Jenna C. v. Unified School District
It was a blustery day outside, but the temperature in the courtroom was at least 78 degrees. The various parties nervously adjusted their seats and looked around the courtroom as though trying to memorize the scene. The court clerk called, “All rise; this court is now in session.” A formal-looking judge entered the room and was seated. Her hair was a variety of shades of gray and tied sternly in a bun at the back of her head. Papers were shuffled. The judge fixed her gaze on the attorney for the prosecution, who rose.
“We call Dr. Glaubenweiss to the stand.” The short but distinguished-looking psychologist rose slowly, leaning heavily on his cane. He wore a gray tweed jacket, and gray flannel slacks. He began a slow trek to the front of the courtroom. The man reached the witness stand and turned slowly and stiffly. As he took the oath, his heavy German accent was evident. He settled into the witness chair and in response to questions from the attorney began tracing his background: his education among masters in Vienna and Zurich, his many years of practice in New York City, awards that he had received from various educational and civic groups, and the popular books that he had written on raising children with learning disabilities. He grunted and scowled in response to a query about Newsweek and other popular publications referring to him as “the Benjamin Spock of the learning disabled.” The questioning attorney asked the court to qualify the witness as an expert. The judge shifted her line of vision toward the opposing attorney, who began to rise. This second attorney was silent for a moment after he stood; then he stated slowly, “The people are not convinced that Dr. Glaubenweiss meets current standards as an expert in this area.” He paused momentarily, but before he was able to continue, the first attorney retorted, “Your Honor, he has already been qualified as an expert in this district and six other federal districts as well as in Canada and several European countries. He has given invited talks at the annual meetings of the American Bar Association on the legal nature of learning disabilities; if he is not an expert, who is?”
The judge shrugged slightly, frowned momentarily, and stated, “The court recognizes him as an expert. Objections to and limitations on that qualification can be discussed in cross-examination.”
The first attorney returned to his questioning. “Are you familiar with a student in the Unified School District who is being referred to in these proceedings as Jenna C.?”
“I am,” Dr. Glaubenweiss responded.
“Did you make an appraisal of her educational status at the request of her parents and her counsel?”
“I did.” Dr. Glaubenweiss took off his thick glasses as he responded and peered at them as if they troubled him.
“So Dr. Glaubenweiss, please tell the court what measures you used to assess Jenna C.”
Dr. Glaubenweiss commenced reciting his procedures as he began to clean his glasses against his sweater vest. “I first met with Jenna C. along with her parents. Her parents told me that Jenna had not been performing as well as school officials thought that she should. They told me that they had tried both working with her themselves and tutoring, but that nothing worked. Then I met Jenna on several occasions. During the first meeting, I conducted an interview with her to determine what activities she enjoyed and to establish rapport. On the second and third visits, I administered, of course,” he paused for effect and then returned to his description, “I administered the Glaubenweiss Test of Learning Disabilities and the Glaubenweiss Survey of Instructional Deficits. Administration of these measures was quite straightforward and the results, as usual, were quite informative . . . definitive, really.”
“And what were those results, Doctor?”
“That Jenna C. is experiencing significant learning disabilities of unknown origin and that her school district needs to provide additional instruction, one-on-one. They have been unwilling to do so, quite unwilling, and she is being harmed each and every day that they refuse. Indeed, the results show that she is learning disabled.”
The attorney confirmed, “So your opinion is that she is learning disabled?”
“Indeed, correct,” was the response.
“And how certain are you of your diagnosis, Doctor?”
“Oh. Quite certain, quite certain. I have never used the Glaubenweiss instruments and been incorrect. No, never. It’s a certainty.” After he finished speaking, he made a quiet “Humph” sound to no one in particular, as if he had been surprised by such a naive question.
“And you’re scientifically convinced of the remediation that is needed?” the attorney asked.
“Ja, yes, quite certain. I detailed what is needed in my report. You have that, no? You want me to discuss my report?”
“No, Doctor, we have already entered your report as evidence; that is not necessary. Thank you, Doctor, thank you.” The attorney looked at the judge and said, “That concludes our direct testimony then, your Honor, but we reserve the right to redirect after cross.”
The judge stated, “So noted,” and shifted her gaze to the defense attorney, who rose slowly, still intently reviewing a thick, dog-eared, legal-sized pad of notes and questions.
“Doctor, you have never published an article in a refereed journal, now have you?”
“A journal? No, not a journal. It’s the wrong place to reach the people. The wrong place.”
“And you’ve never engaged in empirical research, is that correct?”
“Wrong,” he retorted, “I have indeed. I do research all the time, all the time. Silly question, silly.”
“Let me rephrase my question, then, Doctor. You have never validated either your Glaubenweiss Test of Learning Disabilities or the Glaubenweiss Survey of Instructional Deficits, is that not correct?”
“No need to validate. They are valid measures. I know it.”
“But there’s a difference, isn’t there, Doctor, between validation and validity? Isn’t it true that you believe that your measures are valid, but you have not validated them?”
“Ja. Validation is the process of proving that a measure is valid. These measures are valid, however, so there’s no need to perform a validation study.”
“Have there been any published studies by others documenting that your measures are valid, Doctor?”
“Not that I know of. Because only I use my measures, that would be difficult, no?”
“Have you developed norms for your measures, Doctor, norms against which to evaluate the meaning of scores?”
“No, there are no norms, but I know what scores mean.”
The defense attorney managed to roll his eyes at the judge while informing the judge that he had completed
Abstract
The question of whether to revise a psychological test often emerges during the test development phase
rather than after the publication of the test because there are always unanswered questions or alternative
strategies one could have taken. For an assessment “standard” that has found broad acceptance among
practitioners and researchers the considerations that bear on its change or revision can be especially
complex and demanding. This chapter provides an overview of some of these considerations, including
how changes in a test are to be occasioned and warranted; the motives that may become involved
in driving changes; issues pertaining to the differing and at times conflicting interests of the various
constituencies (designers, advisors, consultants, publishers, distributors, practitioners, consumers) that
may be involved in the preparation, decisions, and processes entailed in change and revision; and the
various ethical expectations, implications, and responsibilities that have a role to play in guiding and
influencing all of these. We have included a case study to illustrate some of the recent issues that have
emerged with proposed changes to one standard personality measure, the Minnesota Multiphasic
Personality Inventory-2 (MMPI-2).
Keywords: test revision, personality, testing standards, Watson and Tellegen model, MMPI-2, RC Scales,
MMPI-2-RF, ethical issues in test development
Many articles, book chapters, and books have been written on the process of test development. By contrast, the literature on the topic of test revision is relatively undeveloped; this is unfortunate because, as Adams (2000) wrote, test revision is a continuous process, where consideration of the need for revision begins at the time of the release of any edition of a test. This chapter reviews literature relevant to the process of test revision and draws examples from recent developments in the Minnesota Multiphasic Personality Inventory (MMPI-2), and in view of these issues, offers a critique of the wisdom, conceptualization, methodology, and marketing issues associated with a recent and significant scale revision effort.

Reasons for Revision
Several valid reasons for test revision have been mentioned in the literature, but they generally fall into two categories. On one hand, there are changes that can be made to reduce flaws of the current form of the test. We might refer to these as “deficiency changes.” In fact, several authors (Adams, 2000; Butcher, 2000; Campbell, 2002) have noted that a revision is a healthy process, allowing the test developers to identify and correct imperfections in the test. On the other hand, there are changes that are not designed to correct flaws but to allow the test to develop in new ways. We might refer to these as “growth changes.”

Deficiency Changes
When considering changes to a self-report personality test, the motivations that come to mind most quickly (and are most often discussed in the professional literature) are those changes that will repair or reduce the flaws that have been discovered since the publication of the current version. For example, several years before formal work began on the revision of the MMPI in the 1980s, Dahlstrom (1972) noted that the revision process should be guided by critiques in the published literature. Several changes of this type were integral to the revision of the MMPI as well as to those of other measures.
First, items may be outdated or even antiquated (Butcher, 2000). An item that had clear relevance to a personality construct when the test was created may, with the passage of time, become irrelevant or tap an entirely different construct. For example, Campbell (1972) noted that, at the time of revision, one version of the Strong Vocational Interest Blank included references to interest in magazines which had not even been published during the lifetimes of college-aged test takers. Similarly, prior to its revision, the MMPI possessed items that were clearly outdated by today’s standards. Examples of outdated MMPI items included such statements as “I used to like drop-the-handkerchief” and “Horses that don’t pull should be beaten or kicked.” Even when the items are not antiquated, cultural trends gradually change to the point where some items seem old-fashioned to the modern test taker (Adams, 2000). In these cases, a test revision provides the opportunity to remove or rephrase items that are no longer adequate.
Second, test revisions are necessary when the norms are no longer adequate for the current purpose of test use (Adams, 2000). Because a raw score on a self-report test generally only gains meaning by being compared against scores obtained by a normative sample, it is essential for the normative sample to be representative of the population of test takers who are currently assessed with that test. For example, Butcher (2000) voiced concern that the norms of the Millon Clinical Multiaxial Inventory, third edition (MCMI-III; Millon, 1994) were based upon a psychiatric sample. While this would be appropriate if the test were used exclusively in a psychiatric context, as the test’s use grew beyond the realm of clinical assessment (such as into the area of personnel selection), the norms were no longer sufficient for that expanded range of applications.
Third, and especially relevant to the area of forensic assessment, is the necessity to revise an instrument when the items or test stimuli have leaked into the public domain (Adams, 2000). For example, if copyright laws have been violated by placing items on Internet sites, or when a large number of individuals have been coached on test-taking practices that are likely to lead to desired scores on the test, the integrity of the test may be compromised to the point where confidence in its scores is undermined.
Fourth, a test revision may be motivated by the goal of improving flawed psychometric properties of the original instrument. Reise, Waller, and Comrey (2000) noted that factor analytic studies sometimes demonstrate that a scale’s factor structure published in the test manual does not replicate across divergent samples. Similarly, factor analytic studies might suggest that a scale is not providing an adequate representation of a construct of interest. If a component of the construct is not present in the factor structure of that scale, the validity of the scale will be severely compromised. It is even possible, according to Reise et al., that factor analytic research might show that the scale is actually more highly loaded on a related construct than it is on the intended construct. While all of these concerns were likely considered before the publication of the first form of the test, the gradual accumulation of data might point to psychometric flaws that were not apparent with the initial research samples. An example of the way factor analytic research can guide test development and revision is illustrated in the most recent revision of the Job Descriptive Index (JDI), as recounted by Stanton, Bachiochi, Robie, Perez, and Smith (2002). The original scale was developed in 1969 as a measure of job satisfaction, but when it was revised in 1985 a small number of new items were added. As research accumulated, however, it became clear that some of the new items did not measure job satisfaction specifically. Instead these items measured job stress. They asked questions about whether work was too tiring or frustrating and whether respondents felt they had too much work to complete. Although job satisfaction and job stress are negatively correlated, they represent two distinctly different constructs. Therefore, when the scale was revised again in 1997, the job stress items were replaced with additional items, more directly related to job satisfaction. Although research on the factor structure showed that stress and satisfaction are intertwined, the resulting scale is more unidimensional in nature.
Finally, a test that once provided a valid assessment of a construct might need revision when the professional understanding of that construct changes. For example, if a test provides a valid assessment of posttraumatic stress disorder, in accordance
RC Scale                                   Parent Clinical Scale
RCd: Demoralization
RC1: Somatic Complaints                    Scale 1: Hypochondriasis
RC2: Low Positive Emotions                 Scale 2: Depression
RC3: Cynicism                              Scale 3: Hysteria
RC4: Antisocial Behavior                   Scale 4: Psychopathic Deviate
RC6: Ideas of Persecution                  Scale 6: Paranoia
RC7: Dysfunctional Negative Emotions       Scale 7: Psychasthenia
RC8: Aberrant Experiences                  Scale 8: Schizophrenia
RC9: Hypomanic Activation                  Scale 9: Hypomania
REDUCTION OF CLINICAL SCALE COVARIATION
The problems represented by the extensive covariation among the Clinical Scales, as well as straightforward approaches to reducing it, have been recognized for more than half a century. Although multiple approaches are available for reducing such covariation, by far the most obvious is to simply eliminate the items that are shared among two or more of the Clinical Scales. Helmes and Reddon (1993) identified item overlap as among the major structural deficiencies of the MMPI/MMPI-2 (pp. 457–459). Eliminating the overlapping items addresses the RC authors’ stated aims simultaneously, both because such items are the most obvious contributors to clinical scale covariation (CSC) and because they are, by definition, nondistinctive.
Although not mentioned by Tellegen et al. (2003), this was precisely the approach taken by Welsh (1952) and later by Adams and Horn (1965) to address the CSC problem. In commenting on “the source of high correlations between measures of presumably very distinctive personality attributes,” Tellegen et al. (2003) recognized the problem of item overlap but appear to minimize it: “the high scale intercorrelations reflect only in part the spurious correlations between measurement errors that occur when the scales in question contain overlapping items” (p. 6). Among the possible competitors with overlapping items in accounting for CSC, Tellegen et al. (2003) include “response-style variance” (p. 6) (which they tend to dismiss), “unanticipated ‘comorbidity’” (p. 6), and “invalid” (p. 5) subtle items (note 2) (although the latter items, relative to the so-called obvious items, show a much lower rate of overlap). If one is to judge from the information provided in the RC Scales monograph, however, no effort was made to explore the relative strength of these competing factors, nor to evaluate the extent of the role played by item overlap itself, either within their own samples or generally.
It should be noted that our criticism, here, is not that the strategy used in the development of the RC Scales (i.e., reducing CSC and increasing Clinical Scale purification by eliminating or reducing item overlap) was abandoned after being tried and found wanting. Rather, we are concerned that this strategy was ignored, despite historical precedent, and despite its representing a more straightforward means of achieving the authors’ stated goals.
As a demonstration of the achievements in scale independence that could be had from more parsimonious methods, Nichols (2006a) explored three alternative solutions to the problem of Clinical Scale covariation using a data set of 26,118 male and 26,425 female MMPI-2 protocols (Caldwell, 1997a). In one test, Nichols simply eliminated from each Clinical Scale the 35 items that overlap at least three of the basic Clinical Scales (1–4 and 6–9). As a result, the average intercorrelation among these scales dropped from .59 to .39, a decrease in shared variance of 20% [(.59 × .59) – (.39 × .39)], while the average correlation between the altered scale and the intact parent scale remained high at .94, a loss of only 12% [1 – (.94 × .94)] in shared variance. In the second test, Nichols identified 37 items from the Clinical Scales that loaded on the First Factor in two independent factor analyses (Johnson, Butcher, Null, & Johnson, 1984; Waller, 1999); he then removed the variance of this set of 37 items from each of the Clinical Scales. Using this strategy, Nichols achieved an increase in scale independence of 13% (with average intercorrelations dropping from .59 to .47), while maintaining an average correlation with their parent scales of .97, a loss of only 6% in shared variance. In the third test, Nichols combined the two strategies by removing from the Clinical Scales those items that appeared on both markers
Clinical
Alcohol/Drug 1 510 349 Yes Greene et al. (1996)
Alcohol/Drug 2 799 369 Yes McKenna & Butcher (1987)
Psychotherapy 176 252 No Butcher (1998)
State Hospital 234 268 Yes Greene (1992)
VA Inpatient 1,161 87 Yes Schinka (2004)
College Students
Sample 1 57 80 No Butcher (1998)
Sample 2 510 762 No Butcher, Graham et al. (1990)
Forensic
Child Custody 1 844 863 No Butcher (1997)
Child Custody 2 952 950 No Caldwell (2004)
Child Custody 3 534 514 No Hoppe et al. (2004)
Corrections 263 33 Yes Butcher (1997)
Criminal Psych. 780 122 Yes Davis (2001)
Death Row 74 0 Yes Gelbort (2002)
Personal Injury 1 49 95 No Butcher (1997)
Personal Injury 2 74 67 No Greenberg, Lees-Haley, & Otto (2005)
Medical Chronic Pain 104 316 No Caldwell (1998)
Neuropsych. 1 97 71 No Cripe (2000)
Neuropsych. 2 41 41 No Solomon (2004)
Military, Active Duty 1,354 174 No Butcher, Jeffrey, Cayton, Colligan, DeVore, & Minnegawa (1990)
MMPI-2 Normative 1,108 1,421 No Butcher et al. (1989)
Native American 264 361 No Robin et al. (2003)
Personnel Screening
Clergy 234 276 No Friedman & Damsteegt (2002)
General 3,630 1,569 No Caldwell (1997b)
Police 6,168 926 No Christiansen & Cohen (2003)
among the Clinical Scales. The average correlation between each RC Scale and its Clinical Scale parent is .724, a loss of 48% in the variance shared between the RC Scales and their parent Clinical Scales. Even after eliminating the weakest correlation from inclusion (that between Scale 3 and RC3), the average correlation achieved between the RC Scales and their respective parent scales is .808, a loss of 35% in the average shared variance between the RC Scale and its corresponding Clinical Scale. Thus the small gain in the independence achieved in the RC Scales is swamped by a loss in the variance that they share with their Clinical Scale counterparts.
In summary, improved Clinical Scale independence can be achieved by simply eliminating from the scales those nondistinctive items that overlap three or more Clinical Scales, or mark the major source of nonspecific variance on the Clinical Scales, or both. In the present investigation, we achieved reductions in CSC of between 9% and 20%, with decreases in fidelity to their unaltered versions of 8% to 23%. This finding largely replicates Nichols’ previous demonstration (2006a), wherein, using the same methods, he achieved reductions in CSC of between 13% and 25%, with decreases in fidelity of 6% to 23%. In contrast, the correlations among the RC Scales in our aggregate sample show only nominal reductions in covariation (5%), but at a cost of a roughly 10-fold loss in fidelity (48%) to their parent Clinical Scales. By this standard, the first of the RC authors’ goals, that of substantially reducing the covariation that compromises the discriminant validity of the MMPI-2 Clinical Scales, appears to have been met only marginally, and by methods both substantially more questionable and complicated than could be achieved by more straightforward means.
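The shared-variance percentages quoted in this section all come from squaring correlations, so they can be checked with a few lines of arithmetic. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and the correlations are the ones quoted in the text.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two scales share, given their correlation r."""
    return r * r


def covariation_reduction(r_before: float, r_after: float) -> float:
    """Drop in shared variance when the average intercorrelation falls."""
    return shared_variance(r_before) - shared_variance(r_after)


def fidelity_loss(r_parent: float) -> float:
    """Variance an altered scale no longer shares with its parent scale."""
    return 1.0 - shared_variance(r_parent)


# Correlations as quoted in the text (hypothetical labels).
examples = {
    "overlapping items removed": (0.59, 0.39, 0.94),
    "First Factor items removed": (0.59, 0.47, 0.97),
}
for label, (before, after, parent) in examples.items():
    print(label,
          "reduction:", round(covariation_reduction(before, after), 2),
          "fidelity loss:", round(fidelity_loss(parent), 2))

# RC Scales versus their parent Clinical Scales (average r = .724).
print("RC fidelity loss:", round(fidelity_loss(0.724), 2))
```

Under these figures a correlation of .94 with the parent scale costs about 12% of shared variance, whereas .724 costs about 48%, which is the contrast the argument rests on.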
Table 7.3. The proportion of RC7 and Scale 7 items that overlap with empirically derived markers for the First Factor.
           A     JB1   W1    JBW72  AJBW
RC7        48    76    90    67     38
Scale 7    27    52    79    50     17
Note: Decimals omitted. RC7 = Dysfunctional Negative Emotions. (The divisor for RC7 was 21 because three of the RC7 items do not appear on the First Factor markers, all of which were developed within the MMPI-1 environment.) Scale 7 = Psychasthenia. A = Anxiety (Welsh, 1956). JB1 = Johnson–Butcher first factor (Johnson et al., 1984). W1 = Waller first factor (Waller, 1999). JBW72 = Items overlapping both JB1 and W1. AJBW = Items overlapping all three First Factor markers.
                      RC2   Anhed  HedCap  INTR  Si
Study 1
Physical Anhedonia    .21   .26    .24     .26   .20
Social Anhedonia      .25   .36    .34     .37   .29
Study 2
Physical Anhedonia    .26   .23    .27     .39   .25
Social Anhedonia      .42   .42    .54     .51   .45
Note: RC2 = Low Positive Emotions, Anhed = Anhedonia, HedCap = Hedonic Capacity (reversed keyed), INTR = Introversion/Low Positive Emotionality, Si = Social Introversion.
Scale, RC2, makes possible the comparison of its performance against other MMPI/MMPI-2 measures of anhedonia.
Study 1. The sample for our first study consisted of 450 male and 438 female Midwestern University undergraduates between the ages of 18 and 25 years, inclusive, who served as participants in earlier research reported by Merritt, Balogh, and DeVinney (1993). All subjects completed both the original MMPI and the Physical and Social Anhedonia Scales (Chapman et al., 1976).
Study 2. Participants in the second study were drawn from a sample of 120 male and 200 female Midwestern undergraduates between the ages of 18 and 24 years, inclusive, who participated in the study in order to receive course credit. In order to be considered for inclusion in the final sample, participants’ responses had to meet several validity criteria for the MMPI-2 and the Wisconsin schizotypy scales (WSS; Kwapil, Chapman, & Chapman, 1999). Respondents were eliminated if they had more than 20 omitted items; raw scores greater than 13 on VRIN, less than 5 or greater than 13 on TRIN, greater than 30 on F, greater than 20 on Fb; T-scores greater than 120 on Fp, or greater than 83 on L; 38 participants were dropped from the sample after application of these validity criteria. The remaining validity criteria consisted of endorsement of fewer than three items on the WSS infrequency scale and omission of no more than 10 items. The application of both sets of validity criteria resulted in a final sample of 89 males and 189 females.
For both samples, Pearson product moment correlations were computed between the Wisconsin anhedonia scales and each of five MMPI/MMPI-2-based anhedonia scales: RC2, anhedonia, hedonic capacity (reversed), INTR, and the MMPI-2 social introversion scale (Si).
The correlations between all of the MMPI/MMPI-2 anhedonia measures and each of the Wisconsin anhedonia scales were significant at p < .0001. The results from both samples, presented in Table 7.4, indicate that RC2 predicts no better than, and in most cases less well than, the MMPI/MMPI-2 measures of anhedonia that were already available at the time that RC Scales were developed. Scale 0 also predicted scores on the social anhedonia scale somewhat better than RC2, and predicted physical anhedonia scores only slightly less well, in both samples. INTR performed best among the MMPI/MMPI-2 scales for both of the Wisconsin anhedonia scales, in both samples.
Invalid Motivations for Change
Although there are several valid motivations for test revision, a test need not be revised simply because of the passage of time. Hathaway (1972b), for example, when considering the possibility of the revision of the original MMPI, noted that there were flaws in the measure and hoped that the future would bring an improvement (either in the form of a revised version of the MMPI or a new similar test), but he argued that test revision should only be attempted with unconventional thinking about the constructs or unconventional methods of test development. A test need not be revised due to age unless there is a compelling reason to change or a compelling strategy for improvement.
Adams (2000) noted that although the ethical standards of the American Psychological Association do prohibit psychologists from using outdated tests, there are no ethical codes driving the timing of test revision. However, one could
Stephen E. Finn
Abstract
This chapter discusses the need for practitioners to consider the relative frequency of phenomena (base
rates) in particular clinical settings in order to make diagnoses and interpret psychological test scores
appropriately. Topics discussed include what are base rates, base rates and predictive power, implications
of base rates for clinical practice and test use, and how to use base rates to benefit you and your clients.
Keywords: base rates, clinical decision making, clinical judgment, clinical prediction, confirmatory bias,
cutting scores, diagnosis
Table 8.1. Predicting assault by base rates alone.
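The cell percentages discussed for Tables 8.1 and 8.2 can be reproduced from three numbers: the base rate, the chance of predicting assault for a client who will assault, and the chance of predicting assault for a client who will not. The sketch below is an illustration under that reading; the function names are hypothetical, and treating the base-rate-only strategy as random prediction of assault for 5% of clients is an assumption taken from the cell values quoted in the text.

```python
def fourfold(base_rate: float, p_flag_event: float, p_flag_no_event: float) -> dict:
    """Cells A-D of a prediction-by-outcome table, as proportions of all clients.

    A: assault predicted, assault occurs      B: assault predicted, none occurs
    C: no assault predicted, assault occurs   D: no assault predicted, none occurs
    """
    return {
        "A": base_rate * p_flag_event,
        "B": (1 - base_rate) * p_flag_no_event,
        "C": base_rate * (1 - p_flag_event),
        "D": (1 - base_rate) * (1 - p_flag_no_event),
    }


def accuracy(cells: dict) -> float:
    """Overall accuracy: true positives (Cell A) plus true negatives (Cell D)."""
    return cells["A"] + cells["D"]


BASE_RATE = 0.05  # 5% of clients assault a staff person

# Assumption: base rates alone means flagging a random 5% of clients.
base_rates_alone = fourfold(BASE_RATE, BASE_RATE, BASE_RATE)
# The colleague flags every assaulter and one in eight nonassaulters.
colleague = fourfold(BASE_RATE, 1.0, 0.125)
# Always predicting "no assault".
always_no = fourfold(BASE_RATE, 0.0, 0.0)

for name, cells in [("base rates alone", base_rates_alone),
                    ("colleague", colleague),
                    ("always 'no assault'", always_no)]:
    print(f"{name}: accuracy = {accuracy(cells):.3f}")
```

Varying `BASE_RATE` toward 0.5 shows how quickly the advantage of the base-rate-only strategies shrinks.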
row percentages, since your predictions will be made randomly and are consistent with the axiom of independent probabilities (note 1). Cell A shows the probability of your predicting assault for a client who will actually assault a staff person (0.25%). Cell D shows the probability of your predicting no assault for a client who will not make an assault (90.25%). Thus, your predictions of assault, made with base rates alone, would be right 90.5% of the time (Cell A + Cell D = total accuracy).
Now suppose a colleague, unfamiliar with the concept of base rates, claims that he can identify future assaulters by having a 5-min conversation with each patient. Suppose further that he is able to successfully identify all (100%) assaulters (Cell A is 100% of 5%), but that he incorrectly classifies one out of every eight (12.5%) of the nonassaulters as assaulters (Cell B is 12.5% of 95%). As can be seen in Table 8.2, his overall predictive accuracy, despite hours of interviews, will be lower than yours (88.1%, obtained by summing true positives, or Cell A, and true negatives, or Cell D).
This is a counterintuitive point: Decisions derived from more information may actually be less accurate than those derived from less information. A procedure that has predictive power beyond the base rate accuracy is called “efficient.” Clearly, in the above example, the colleague’s interview would not be considered efficient relative to the base rates. However, if we were exclusively interested in predicting true positives (Cell A), he would be considered efficient, for by his method 100% of the assaulters were successfully identified versus only 5% of the assaulters successfully identified by base rates alone. In other words, efficiency depends on the purpose of the test or procedure; we will return to this issue when we discuss utilities. Let us assume for now that we are interested in the overall accuracy of our classifications: When is it useful (efficient) to use a test? When does it increase the number of correct decisions? (note 2)
Implications of Base Rates for Clinical Practice
Predicting Phenomena with Low Base Rates Is Difficult
In general, it is easier to increase predictive accuracy when the events to be predicted are moderately
FINN 141
likely to occur (i.e., occur with a probability close to 50%) than when the events are unlikely to occur (i.e., occur with a probability closer to 0%). This is easy to see from the example in the previous section. It is very difficult to design a test that will improve upon the overall accuracy of base rate predictions (such as those in Table 8.1) when the base rate is very low. As shown in the table, simply predicting “no assault” all the time will result in 95% predictive accuracy.
Meehl and Rosen (1955), in their paper on antecedent probabilities, derived a helpful rule from Bayes’ theorem to help decide whether a test adds to predictive accuracy. In statistical terms, a test or procedure increases the overall number of correct decisions when
\[ \frac{\text{base rate of event}}{\text{base rate of no event}} > \frac{\text{false positives using the procedure}}{\text{true positives using the procedure}} \tag{1} \]
In other words, a test or procedure predicts better than the base rate when the fraction of the base rate of occurrence to the base rate of nonoccurrence exceeds the fraction of the rates of false positives to the rates of true positives. In our example we would like to know whether
\[ \frac{\text{base rate of assault}}{\text{base rate of no assault}} > \frac{\text{false positives using the interview}}{\text{true positives using the interview}} \tag{2} \]
This would mean that the fraction of false positives (i.e., erroneously predicted assaulters) to true positives (i.e., successfully predicted assaulters) should be lower than 5%/95% ≈ 0.05. Let us verify this for the colleague. His false-positive rate (Cell B) is 11.9%, his true-positive rate is 5% (Cell A), which when divided results in 2.4. This is greater than 0.05; thus, the colleague’s predictions are not consistent with the rule for efficient tests. It is again demonstrated that his interviews lead to fewer correct decisions than merely following base rate predictions.
One way of dealing with the problem of predicting rare events is to increase the base rate (bringing it closer to a probability of 50%) by formulating a more restrictive definition of the population for which you are trying to predict the behavior. In our example, the colleague might get better results if he limits his predictions to patients who already have a history of assault. By thus making the definition more restrictive, he is likely to find more positive cases in his population, that is, to obtain a higher base rate. There would be a cost to this restriction (his interview could only be used with a subset of the entire psychiatric population), but it most likely would result in greater predictive accuracy.
Optimal Test Cutting Scores Vary by Base Rates
Very commonly, psychological tests that produce a range of numerical scores are used to classify patients into dichotomous categories. When this is done, one must set a “cutting score” and place people who obtain a score equal to or higher than the cutting score in one category and all other people in the other category. For example, in one school in my area, students are eligible for classes for the “gifted” only if their IQ test scores are equal to or above 130.
For reasons similar to those outlined earlier, optimal cutting scores for psychological tests depend on the base rate of occurrence of the behavior or trait you are trying to predict with the test. To explain this, we will examine what happens when one overlooks the effects of base rates upon making predictions. Exner and Wylie (1977) attempted to predict suicide using multiple indicators from the Rorschach. They compared Rorschach protocols of successful suicides (n1 = 59) with the protocols of patients who made suicide attempts that failed (n2 = 31), the protocols of depressed patients (n3 = 50), the protocols of schizophrenic patients (n4 = 50), and the protocols of normals (n5 = 50) to identify an optimal set of predictors of suicide (which later came to be known as the suicide constellation, or S-CON).
On the basis of the above groups, Exner and Wylie proposed a cutting score of 8 on the S-CON for identifying a person as being at risk for a serious suicide attempt. Their sample had a base rate of 24.6% of suicide (59 out of 240). How can we determine overall predictive accuracy, taking base rates into account? First, let’s make a table that crosses prediction of suicide with actual suicide, just as we did with assault in Tables 8.1 and 8.2. The overall accuracy can be calculated by multiplying the correctly predicted percentage of suicide cases with the base rate of suicide and adding to this the product of the percentage of correct “no suicide” predictions with the base rate of “no suicide.” Table 8.3 shows the results for three different base rates.
Score on   Base rate 24.6%                  Base rate 5%                     Base rate 1%
S-CON      %True      %True      %Total     %True      %True      %Total     %True      %True      %Total
           positives  negatives  accuracy   positives  negatives  accuracy   positives  negatives  accuracy
6          23         35         58         05         44         48         01         45         46
7          20         48         68         04         60         64         01         63         64
8          18         63         81         04         79         83         01         83         83
9          16         69         85*        03         87         90         01         90         91
10         13         72         85*        03         90         93         01         95         95
11         06         73         79         01         92         94*        00         96         96*
Note: The total percentages are not always equal to the sum of the true positives and true negatives because of rounding errors.
* Denotes highest overall accuracy (hit rate).
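The calculation behind Table 8.3, and the Meehl and Rosen (1955) rule discussed earlier, can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names are hypothetical, the rule is applied here to conditional rates (the share of cases and of non-cases that the test flags), and the sensitivity and specificity used for a cutting score of 8 are approximations back-calculated from the rounded table entries.

```python
def overall_accuracy(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Hit rate: correct 'event' calls weighted by the base rate plus
    correct 'no event' calls weighted by the complement of the base rate."""
    return sensitivity * base_rate + specificity * (1 - base_rate)


def meehl_rosen_improves(base_rate: float, true_pos_rate: float, false_pos_rate: float) -> bool:
    """True when the test yields more correct decisions than always predicting
    'no event' (intended for base rates below 50%)."""
    return base_rate / (1 - base_rate) > false_pos_rate / true_pos_rate


# Assumed values for an S-CON cutting score of 8, derived from Table 8.3:
# 18 / 24.6 is about 0.73 (sensitivity), 63 / 75.4 is about 0.836 (specificity).
SENSITIVITY, SPECIFICITY = 0.73, 0.836

for base_rate in (0.246, 0.05, 0.01):
    hit_rate = overall_accuracy(SENSITIVITY, SPECIFICITY, base_rate)
    always_no = 1 - base_rate  # accuracy of always predicting 'no suicide'
    helps = meehl_rosen_improves(base_rate, SENSITIVITY, 1 - SPECIFICITY)
    print(f"base rate {base_rate:.3f}: test {hit_rate:.2f}, "
          f"always 'no' {always_no:.2f}, rule satisfied: {helps}")

# The colleague's interview from the assault example: all assaulters flagged,
# one in eight nonassaulters flagged, base rate 5%.
print(meehl_rosen_improves(0.05, 1.0, 0.125))
```

With these assumed rates the rule is satisfied only at the 24.6% base rate, which matches the point that a fixed cutting score does not serve equally well across settings.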
As you can see in the table, a cutting score of 9 or 10 appears to maximize the overall hit rate in Exner and Wylie’s sample, but at the cost of fewer true-positive predictions than their proposed cutting score of 8 (note 3). More important, for settings in which the base rate of suicide is lower than 24.6% (and I assume that, fortunately, there are many such settings), a different cutting score should be used. For example, in a setting with a base rate of 5% or 1%, a cutting score of 11 signs maximizes the overall predictive accuracy. Clearly, base rates are essential in deciding on optimal cutting scores for test predictions. Awareness of this issue can greatly improve our clinical decision making.
Where do we encounter base rate effects in clinical practice? As clinicians, we are often faced with situations in which the same numbers have different meanings in different settings (because different settings are likely to have different base rates). Two everyday examples will suffice: MMPI-2 scores and clinical diagnoses. It has long been known that MMPI-2 elevations should be interpreted differently across settings. This is reflected in statements in MMPI-2 code books like “in inpatient settings, it is likely that the patient has . . .” or “in women, this score tends to be associated with . . .” Why this type of qualification? Because the claims made are only valid for the patient population under study; with different base rates, different cutting scores or interpretations are necessary.
Another illustration was provided in a study on the Rorschach trauma content index (TCI; Kamphuis, Kugeares, & Finn, 2000): Predictive accuracy of the TCI in detecting sexual abuse greatly depends on the local base rate (and even at best, is insufficient for definitively identifying clients who have been sexually abused).
Although it is not yet widely recognized, the DSM-IV-TR (American Psychiatric Association, 2000) diagnostic rules are also (in the best of cases) dependent on minimizing error for a certain base rate. For example, a client must meet five or more of nine criteria to be given the diagnosis of borderline personality disorder (BPD). It is unclear from the DSM-IV-TR for what setting or base rate this cutting score was chosen. However, as is clear from the S-CON example, this decision rule may not apply equally well across clinical settings if they have significantly different base rates of BPD (Finn, 1982; Widiger, Hurt, Frances, Clarkin, & Gilmore, 1984). In a clinical setting that sees very few clients with BPD, the best diagnostic rule may be eight or more criteria. In a setting that sees a great number of clients with BPD, a cutting score of four or three criteria may maximize diagnostic accuracy.
Utilities Can Also Affect Cutting Scores
Back to the Exner and Wylie (1977) study: The earlier discussion assumed that we strive to optimize the overall rate of correct diagnostic decisions. However, overall predictive accuracy may not be the appropriate criterion for evaluating the usefulness of a psychological test. Surely, failing to predict a suicide is of much more consequence than overpredicting suicide. Conversely, some might argue that underdiagnosing schizophrenia is less of a problem than overdiagnosing the disorder, in view of the societal “labeling costs” of being diagnosed schizophrenic. Hence, the usefulness of a certain test or procedure depends not only on base rates but also on the relative seriousness of the types of error it produces. If a researcher designs a test to rule out a certain disorder, he or she should pay more attention to the rate of false negatives than to the rate of false positives.
The issues outlined above refer to the asymmetry of utilities (i.e., costs and benefits) of Type I (false-positive) and Type II (false-negative) errors. If you
do not specify a decision about the weights that you attach to the two types of error, a default decision of equal weighting will result: the so-called utility-balanced diagnostic rule (Finn, 1983).
Salient Personal Experience Tends to Be Weighed More Than Base Rates
A great deal of research now shows that human decision makers are quite poor in using base rates to reach informed decisions. For example, in one of a series of classic studies, Kahneman and Tversky (1972, 1973) asked subjects to estimate the base rate of students in various graduate school programs, such as law, social work, computer science, and engineering. Subjects then read the following short biography:
    Tom W. is of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and by flashes of imagination of the sci-fi type. (Kahneman & Tversky, 1973, p. 283)
After this, subjects made two ratings: (1) the degree to which Tom W. was similar to their experience of students in the different graduate programs; and (2) the likelihood that Tom W. was a graduate student in law, social work, computer science, and so on. As you might anticipate, subjects weighed their personal experience more heavily than the estimated base rates in judging the likelihood of Tom W. pursuing various areas of study. For example, Tom W. was given the highest likelihood of studying computer science, although subjects estimated that only 7% of graduate students were in that area. Subjects considered him least likely to be studying social work or humanities, although they judged the base rates in these areas to be 17% and 20%, respectively. The overall correlation between base rates and likelihood ratings was r = –.65, while the correlation between similarity and likelihood ratings was r = .97.
The implications of this type of error for clinical practice are clear and have been expounded upon by others (Arkes, 1981; Garb, 1998). Let us return to the Exner and Wylie suicide constellation index for an example. You may tell fellow clinicians that in their outpatient setting (with a base rate of suicide of 5%) only 4% of clients with an S-CON score of 8 are likely to make serious suicide attempts (see Table 8.3). However, clinicians who have dealt with a suicide attempt from a past client with an S-CON score of 8 may still claim that the suicide risk with such clients is extremely high and urge that they be hospitalized immediately. Past suicide attempts are very salient for clinicians and tend to override any information about base rates (note 4).
Subjective Base Rate Estimates Are Difficult to Shift
Subjective base rates are probability estimates we develop and hold internally, based on our experience with a certain population. Thus, a clinician will probably have a subjective estimate of the base rate of psychosis in his or her clinical setting from accumulated experience working with clients. One well-known problem in clinical decision making occurs when practitioners develop relatively accurate subjective base rates in one setting but fail to shift their base rate estimates when making decisions in a different setting.
As an example, imagine that two different clinicians, Dr. Smith and Dr. Jones, are shown a videotape of diagnostic interviews of five clients who have mixed symptomatology of psychosis and severe dissociation. Dr. Smith specializes in the treatment of psychosis and primarily has seen psychotic clients for the last 10 years of her professional life. Dr. Jones heads an inpatient program for the treatment of dissociative disorders, and most patients referred to the program have been previously diagnosed as having a dissociative disorder. Research has shown that Dr. Smith is more likely to diagnose the five clients as psychotic and Dr. Jones is more likely to diagnose them as dissociative (Katz, Cole, & Lowery, 1969). These biases probably do not reflect Dr. Smith’s or Dr. Jones’ salient experience with one or two clients, nor do they indicate that Dr. Jones and Dr. Smith are ignoring base rate information altogether. Rather, the two clinicians’ diagnoses are likely to reflect their tendency to use their subjective base rate estimates from their own settings when making diagnostic decisions in a new setting. To make the most accurate diagnoses, each clinician would have to inquire about the relative frequency of psychotic and dissociative clients in the population from which the research sample was drawn, and then hold this information in mind when making a diagnosis. Although we tend to think of diagnoses as real and immutable, in fact their accuracy is influenced by base rates just as much as the Exner and Wylie (1977) S-CON index.
Base Rates Can Influence Perceptions and Information Collection Strategies
An even more disturbing bias exists than subjective base rates influencing diagnoses: subjective base
These figures and the resulting lack of association are exactly as expected given our previously mentioned knowledge about schizophrenia.
What most clinicians may not know is that if they were intake workers in such an inpatient setting, it would be very easy to perceive a correlation between Latino heritage and schizophrenia, even though none exists. Smedslund (1963), Nisbett and Ross (1980), Arkes (1981), Kayne and Alloy (1988), and others have shown that high co-occurring base rates give observers the impression of a substantial association, because of the high number of cases in Cell A. Apparently, most of us have difficulty realizing that Cells B, C, and D are as important for estimating an association as is Cell A. You, as an intake worker, might be protected from overestimating an association if you already knew that the large number of clients you saw in Cell A was due to the peculiarities of geography and how your hospital shares resources with others. You might also be helped if you knew the research literature on schizophrenia and the general lack of association between schizophrenia and Latino heritage. Suppose, however, that you were faced with variables that were not as well researched or understood?
This is the situation that I faced some years ago as part of the Eating Disorders Research Group at the University of Minnesota. The research group was contacted over several years by a number of clinicians in Minneapolis who said they noted a strong association between a history of sexual abuse and eating disorders in women they were treating in psychotherapy. These clinicians were intrigued by the association they perceived and had begun musing about a causal relationship between sexual abuse of women and subsequent eating disorders. When the eating disorders research team assessed abnormal eating patterns and sexual abuse in the very clinical settings from which the reports came, the results were as follows. There was a high base rate of past sexual abuse among the 85 women in the sample (70%) as well as a high rate of abnormal eating patterns (82%). There was no significant statistical association at all between eating disorders and sexual abuse (χ² = 0, = .50), and the cell frequencies of the fourfold table were exactly equal to those that would be derived from the cross-multiplication of cell and column frequencies: Cell A = 57%, Cell B = 25%, Cell C = 13%, and Cell D = 5% (Finn, Hartman, Leon, & Lawson, 1986). Although post hoc hypotheses must be considered tentatively, it seems likely that the clinicians who gave the initial reports of an association between eating disorders and sexual abuse had been influenced by the large number of women in their settings who had both of these problems (i.e., the large number of cases in Cell A). In effect, the clinicians had no “control group” (women without sexual abuse or eating disorders) to help shape their perceptions.
How to Use Base Rates to Benefit You and Your Clients
Should we clinicians give up entirely the task of making diagnostic and treatment decisions or use Ouija boards to guide our actions? We don’t think so. A few simple guidelines can greatly help all of us to incorporate base rates intelligently in clinical decision making.
Investigate the Base Rates in Your Setting
If you want base rates to work for you, use the tests and decision rules that minimize overall error for your particular clients. This will require you to have at least some knowledge of the base rates of certain client characteristics in your setting. Pertinent information can be collected through examination of previous records or perhaps through comparison with data from related settings. If you collect information about the base rates of phenomena of interest in your setting, you are likely to improve your clinical decision making in the long run.
Be Suspicious of Simple Interpretations of Test Scores
ERROR IS A FACT
Especially with low base rate phenomena, errors in prediction are a given. Be aware of this and avoid claiming more than you can substantiate, especially if you work in the area of forensic psychology. For example, clinicians are often asked to make predictions of dangerousness, suicide potential, and other low base rate characteristics in court. Consistent with the current ethical code of psychologists, you should appropriately qualify any opinions you express about a particular client showing such characteristics (American Psychological Association, 2002; Principles 5.01 and 9.06).
Tests are most useful if the base rate is close to 50%. We have seen in the hypothetical example of predicting assault in your clinic that it is difficult for a test to improve upon the base rate predictions if the base rates are extreme, that is, are close to 0%. It follows from Meehl and Rosen’s (1955) optimizing rule that tests are more likely to contribute positively to the quality of decision making when the base rate of the behavior or condition to be predicted is closer to
impressions. This goes against our natural tendency to look for information that confirms our internalized base rates. Thus, in the previous example, I should have rigorously sought all information that might prove my client was psychotically depressed, as well as information that was in accord with my internalized base rates. This would have helped me make an accurate diagnosis.

Delay Clinical Decisions while Information Is Collected
Consistent with the above strategies, research has generally shown that the most accurate clinical decision makers tend to arrive at their conclusions later than do less accurate clinicians (Elstein, Shulman, & Sprafka, 1978; Sandifer, Hordern, & Green, 1970). We may all be tempted to impress students and colleagues by showing how quickly we can diagnose a client or how few responses we need to interpret a Rorschach protocol. In such instances, however, we appear to be most susceptible to bias from internalized base rates.

When You Think There Is an Association, Look for the Control Group
The Minneapolis clinicians who thought they perceived an association between eating disorders and sexual abuse are to be commended; they asked researchers to test this hypothesis among the women the clinicians were treating. Unfortunately, many of us in clinical practice do not have research teams at our immediate disposal. In such a situation, a good strategy is to look for the control group that will test the association we think we perceive (i.e., clients who do not possess the characteristic of interest).

To return to our example in Table 8.4, should clinicians perceive an association between Latino heritage and schizophrenia (because of the large number of cases in Cell A), they need only ask themselves, “Are non-Latino clients more likely than Latino clients to receive a diagnosis other than schizophrenia?” (a comparison of Cell B with Cell D). As one quickly sees, they are not, which shows that there is no association between Latino heritage and schizophrenia.

Do Not Confuse Statistically Based Decisions with a Lack of Caring for Clients
When I urge clinicians to incorporate base rates into their clinical decision making, I find that many are reluctant to do so because they feel it involves “treating clients like numbers” or “not caring about the individual.” Meehl (1977) bemoaned this same phenomenon in his classic paper, “Why I Do Not Attend Case Conferences,” reporting that when he spoke about base rates in case conferences some clinicians would respond, “We aren’t dealing with groups, we are dealing with an individual case.” Meehl responded pointedly in his paper to this reasoning: “If you depart in your clinical decision making from a well-established (or even moderately well-supported) empirical frequency . . . [this] will result in the misclassifying of other cases that would have been correctly classified had such nonactuarial departures been forbidden” (p. 234). As clinicians, we may believe that we are caring more about clients when we avoid using statistics to make decisions that affect them. In fact, however, the costs of such decisions are likely to far outweigh their benefits. We show our greatest caring when we use all the information available to make the most accurate decisions possible about our clients.

Notes
1. The probability of two independent events co-occurring is equal to the probability of one event times the probability of the other event.
2. Many indices assess the psychometric quality of tests used for clinical decision making. “Sensitivity” is the probability of a person with a certain trait or condition being picked up by a test [a/(a + c)]; “specificity” is the probability of a person without a certain trait or condition being identified by the test as not having that trait or condition [d/(b + d)]. “Positive predictive power” is the likelihood of a person identified by the test as having a certain trait or condition actually possessing that characteristic [a/(a + b)]; “negative predictive power” indicates the probability of a person identified by the test as not having a certain trait or condition actually not having that characteristic [d/(c + d)]. All of these indices, including the rate of correct decisions we use in Tables 8.1–8.3, are helpful in clinical decision making, depending on the specific goal of the clinician. See Baldessarini, Finklestein, and Arana (1983), Kraemer (1992), and Streiner (2003) for in-depth discussions of these various indicators.
3. Exner and Wylie’s (1977) cutting score of 8 is entirely reasonable; apparently they opted to increase the number of true-positive decisions at the cost of a slight decrease in the overall number of accurate decisions.
4. Hospitalization of such clients could be based on clinicians’ decision to weigh false-negative errors much more than false-positive errors. However, if this is true, the clinicians should appropriately argue that even though their clients appear to have only a 4% probability of a serious suicide attempt, the negative utilities of not intervening are so high that some steps must be taken.

References
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th ed., Text rev.) (DSM-IV-TR). Washington, DC: American Psychiatric Association.
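The indices defined in Note 2, and the multiplication rule in Note 1, lend themselves to a short worked sketch. What follows is only an illustration under stated assumptions: the cell layout (a and b are test-positive, a and c have the condition) is inferred from the formulas in Note 2, the names `FourfoldTable` and `expected_cells` are hypothetical, the counts in the example are invented, and the correct-decision rate is assumed to be (a + d) divided by the total. The only figures taken from the chapter are the 82% and 70% base rates of the eating disorders study; which characteristic defines Cell B and which defines Cell C is likewise an assumption made so that the output matches the reported cells.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourfoldTable:
    """Cell counts of a 2 x 2 table, laid out as Note 2's formulas imply.

    a: test positive, condition present    b: test positive, condition absent
    c: test negative, condition present    d: test negative, condition absent
    """
    a: float
    b: float
    c: float
    d: float

    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        return self.d / (self.b + self.d)

    def positive_predictive_power(self) -> float:
        return self.a / (self.a + self.b)

    def negative_predictive_power(self) -> float:
        return self.d / (self.c + self.d)

    def correct_decision_rate(self) -> float:
        # Assumed definition: all correct decisions over all decisions.
        return (self.a + self.d) / (self.a + self.b + self.c + self.d)


def expected_cells(p_first: float, p_second: float) -> dict:
    """Cell proportions expected if two characteristics are independent (Note 1)."""
    return {
        "A": p_first * p_second,              # both present
        "B": p_first * (1 - p_second),        # first present, second absent
        "C": (1 - p_first) * p_second,        # first absent, second present
        "D": (1 - p_first) * (1 - p_second),  # neither present
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
example = FourfoldTable(a=40, b=10, c=20, d=30)
print(round(example.sensitivity(), 2), example.specificity())

# Base rates from the eating disorders study: 82% abnormal eating, 70% past abuse.
cells = expected_cells(0.82, 0.70)
print({name: round(100 * p) for name, p in cells.items()})
# → {'A': 57, 'B': 25, 'C': 13, 'D': 5}
```

The second call reproduces the cell percentages reported for the eating disorders sample, which is what one expects when the two characteristics are unrelated: a large Cell A follows from two high base rates alone.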
CHAPTER 9
Theory and Measurement of Personality Traits
Allan R. Harkness
Abstract
This chapter describes the connections between basic emotion systems and personality traits. The fear,
anger (or rage), interest (or seeking), and joy emotion systems are described and linked to personality
traits. The case is argued that modern psychological findings from behavior genetic, longitudinal, and
clinical studies cannot be interpreted without including stable individuating personality traits as causes in
psychological theories. Using personality sketches to illustrate behavioral features, the chapter provides
an introduction to trait concepts that can enrich clinical case conception. The highly available Minnesota
Multiphasic Personality Inventory-2 (MMPI-2) personality trait scales, called the Personality
Psychopathology-Five (PSY-5) scales, are described. The chapter provides a brief review of key PSY-5
research and clinical literature.
What is a personality trait? Permit me to narrate the inner experience of a hypothetical patient:

Jane thought her coffee tasted a bit off, perhaps a taste of pencil eraser. She thought “cryptosporidium.” This germ just popped into her mind. She remembered it was an infectious agent that survives water treatment. She had seen a picture of them; they had ugly microscopic capsules that protected their innards from chlorine. She thought, “That stuff is run-off from pig farms or chicken farms, right?” Jane tried to use humor to calm herself. She thought of funny chickens, funny pigs. She hoped the coffee maker would kill those things. But worry, a readiness to think of creepy cryptosporidium, was stronger than humor or faith in coffee makers.

Jane started her car. She caught a new sound. Subtle, but new. It could be expensive. Expensive like the last time. That sound. That subtle sound meant giving up control. That sound meant being at the mercy of people she did not trust. It meant another month with no savings. No savings. “These are the years when I SHOULD be saving for retirement,” she scolded herself. The thoughts fluidly associated to other dangers. She turned on the radio to tune herself out.

The first purpose of this chapter is to describe what is meant by the term personality trait. I will use the case of hypothetical Jane to make trait theory concrete and understandable for the reader. A second purpose of this chapter is to introduce a convenient measure of personality traits for clinical work: the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) Personality Psychopathology-Five (PSY-5) scales (Harkness, McNulty, Ben-Porath, & Graham, 2002).

Personality Trait Theory: At the Core of Psychology
Personality traits connect emotions, thinking, learning, attention, the patterning of behavioral outputs, and clinical problems. They are indispensable because no psychological theory can cope with the data available today without them. The data of behavioral genetics, which allow us to compare people of different degrees of relatedness and environment
sharing, cannot be interpreted without personality traits. The interpretation of longitudinal studies, for example, work described by Caspi (2000) or Saudino, Pedersen, Lichtenstein, McClearn, and Plomin (1997), would be incomprehensible without traits acting as causal agents. The facts of the “comorbidity” of psychiatric disorders and the longitudinal patterns that persist over time require an ingredient in psychological theories: powerful individual differences that persist over time, are real, causal, and can be studied.

What is a personality trait? Tellegen (1988) defined a trait as “. . . a psychological (therefore organismic) structure underlying a relatively enduring behavioral disposition, i.e., a tendency to respond in certain ways under certain circumstances. In the case of a personality trait some of the behaviors expressing the disposition have substantial adaptational implications” (p. 622).

Think back to Jane. Think of her reaction to the odd taste of the coffee, to the new sound from her car. There could be many different causes for her mental contents and behaviors. Perhaps her brother had just had a horrible cryptosporidium infection. Perhaps she had had several very bad nights’ sleep because of a sore back. Perhaps she was just in an unexplained mood or a hormonal flux. But none of those explanations are personality traits.

For Jane to have a personality trait, she must have an “organismic structure that underlies a disposition.” Trait theorists believe that traits are real psychobiological structures that cause things to happen. Currently some of the evidence for the existence of structures is indirect: behavior genetic evidence shows consistent heritabilities for traits, psychopharmacologic interventions can change the operating properties of naturally occurring systems, and so on. Other evidence for these psychobiological structures is more direct. For example, LeDoux (1996) has elegantly synthesized anatomical studies and psychobiological studies to make the case for a high-speed danger analysis system with the amygdala as its hub. If our Jane were to have some enduring property of this system, say a readiness to detect danger, this would yield a behavioral disposition.

Take a moment to distinguish between underlying structures and dispositions. As an analogy, the underlying atomic structure of a piece of glass (great irregularity compared to the regular structure of crystal) gives rise to the disposition of “fracture upon breakage.” The underlying atomic structure is always there in the glass, but the disposition of fracture upon breakage is a potential that is only observed under special circumstances. Given a baseball bat, the disposition can be demonstrated. In the case of Jane, if a personality trait were at work, then some underlying structure, such as enduring properties of cell networks in her nervous system, would generate behavioral dispositions. Jane might be chronically disposed to treat relatively neutral but novel stimuli as newly detected dangers.

Traits Are Individuating
Much of psychology is concerned with general laws: psychological principles that apply to all vertebrates, or all mammals, or all human beings. In contrast, the personality traits described in this chapter are part of individual differences science (Harkness & Lilienfeld, 1997). These traits display naturally occurring variation in the systems that underlie important adaptive behaviors. Individual differences in the levels of the trait, across a group showing variation, create what Tellegen (1988) has called a “trait dimension.” Think back to Jane. If her inner experience were replicated across a variety of occasions and events, it would suggest that she has a readiness to treat novel information as clues to danger. Perhaps she is at a high trait level of negative emotionality. But the dimension of negative emotionality does not exist in any single person. The dimension exists only in a population of Janes, Kaylas, Toms, and Maxwells, each of whom has a real threat analysis system that endures across time, and that is set at an individuating trait level.

Traits versus States
Traits last a long time. States last a short time. Systems displaying stable parameters over months, years, or decades show the endurance required of a trait. Structures that create dynamic parameters lasting instants, seconds, minutes, hours, or perhaps even weeks have the endurance of states. Yes, Jane’s danger analysis system does vary from moment to moment, day to day. However, if there are fundamental parameters that endure, then she has a trait level around which state variation occurs.

So is Jane at a high trait level of readiness to detect danger? Or was she just in a bad mood? Had she been a recent victim of distressing events? For the clinician, her history would be critical. We would be very interested in information like this:

From early childhood, Jane has had a talent for imagining how things could go wrong. Always on alert, she prides herself on not being surprised by bad events. The emotion of tense worry, of anxiety, is a frequent
visitor. Jane experiences more negative emotions than most people do.

Is It Plausible that Real Organismic Structural Differences Underlie Traits?
When traits are studied by behavior genetic methods, the evidence is overwhelming that a substantial portion of the variance in those traits arises from genetic variation, consistent with Tellegen’s concept of trait as an organismic structure underlying a relatively enduring behavioral disposition.

For example, Loehlin and Rowe (1992) conducted an exemplary early study of the behavior genetics of personality traits using structural equations modeling. This allowed the simultaneous examination of data from large twin studies, adoption studies, and studies of the relatives of twins. They reported findings that have held up well in light of subsequent research:

Somewhere between 40 and 50 percent of the Big Five Traits is genetic, according to this analysis, and between 50 percent and 60 percent is due to environment, plus error and interaction. The environmental effects are mostly non-familial, but there is a small component due to shared environment.
(Loehlin & Rowe, 1992, p. 358).

The above quotation contains a finding that I believe will eventually come to be regarded as one of the great surprises and achievements of twentieth-century social science. Environment is powerful in shaping personality, but Loehlin and Rowe found that it does not work the way we assumed. In contrast to our expectations, environmental forces on personality sculpt in a way that does not create similarity among family members. David Rowe (1994) developed this theme in a book entitled The Limits of Family Influence: Genes, Experience, and Behavior. Rowe (1994) contrasted well-replicated findings from behavior genetics with the core assumptions of socialization theories. When we usually consider the power of the environment, we instinctively think in social learning terms. We think of parents as powerful role models shaping personalities of their young. The general picture that has emerged from a vast literature of behavior genetic examinations of the source of population variance in personality trait dimensions indicates that the resemblance between family members comes mostly from shared genetics. When the environment shapes personality, for the most part, it does not produce similarity between people who share the same family.

The extremely informative German Observational Study of Adult Twins (GOSAT; Borkenau, Riemann, Angleitner, & Spinath, 2001) included not only self-reported personality, but also observations of behavior made by multiple independent judges. The work was methodologically rigorous, with predominantly molecular methods used to assess type of twinning and some 1.26 million ratings collected on videotapes of individual twins in 15 behavior challenges. Reliable variance in self-reported and peer-rated personality was estimated to be about 67% genetic in origin and 33% due to environment, with little or no evidence that sharing family life makes people more similar in personality. Analyses of the behavioral observations suggested 40% of the variance was genetic in origin, 35% was environmental variance that does not tend to produce family similarity, and 25% of the variance arose from environmental influence that does produce similarity among those sharing a family environment. Behavior is, of course, determined by many causes, not just traits. Certainly family styles and habits may be among those influences on individual behavioral observations. However, when self-reporters and observers reflect on the major enduring and individuating dispositions that characterize a personality, both individuating genetics and individuating environment seem to be predominant shaping forces.

If it is personality we are seeing in Jane, then we would expect to see similarities between Jane and a genetically identical twin of Jane, even if they were raised in different families. But we would not expect Jane’s family life to strongly shape those structures, the traits that shape her major enduring dispositions. Nevertheless, in looking at any particular stream of behavior, we might see the signatures of Jane’s family habits, beliefs, and values.

Some Personality Traits Are Stable Individuating Properties of Emotion Systems
Meehl (1975) and Tellegen (1982, 1985) both suggested that some personality traits might best be understood as the dispositions of emotion systems. All mammals share a number of basic emotion systems (Panksepp, 1998). Some of the emotion systems that are particularly relevant to personality traits are enumerated in Table 9.1. The major emotion theorists who developed the understanding of these systems are listed in Table 9.1. Most of these theorists describe seven or more systems, and generally agreement across theorists is quite good with a few areas of disagreement. When the list of emotion systems is pared down to a few
that are particularly relevant to personality traits, as shown in Table 9.1, the agreement between theorists is excellent.

Each of the basic emotion systems referenced in Table 9.1 can rapidly detect a distinct class of stimulus events (Ekman, 2007). Starting with the left column of Table 9.1, the fear system detects signals of impending danger. As each emotion system is activated by its own relevant class of stimuli, the system organizes adaptive behaviors, structures thought, and creates powerful motivations (Izard, 1991). The systems in Table 9.1 have reasonably distinct and well-characterized underlying neural systems, and some have distinct biochemical signaling systems (Depue & Lenzenweger, 2006; Panksepp, 1998; Patrick & Bernat, 2006).

The fear system is explored further in Figure 9.1. The system is represented by an elongated oval. The left side of the oval represents the input side of the system, and the right side represents the output side. The fear system allows the organism to respond quickly to major adaptive challenges even before consciousness becomes engaged (LeDoux, 1996). For example, very rudimentary analyses of rapidly looming objects (predator signature) or loss of support (falling) can activate the fear system at the same time the cortex is receiving the information for deeper analysis, for the assignment of meaning, and for providing representations of the event in consciousness. The fear system can be potentiated by factors such as being alone, being in the dark, novel stimulation such as odd tastes, sounds, sights, skin sensations, and so on. Some set of releasing stimuli and potentiators are thought to be species-typical givens, the unlearned primitives of the fear system. This is shown on the left end of the oval in Figure 9.1.

Ekman (2007) offers the term “autoappraisers” for the cognitive elements of an emotion system that are charged with a continual screening of information from the external senses. They are screening for any stimuli that are releasers of the system. In the fear system, autoappraisers search for signals of “potential for bodily injury or death.” Through learning mechanisms, new stimuli can become rapid releasers of the outputs of the fear system. Ekman has called the set of rapid releasers for a system the “triggers” of the system. Figure 9.1 raises the interesting possibility that some of the emotion systems have their own onboard learning subunit, a learning system that operates in parallel with the more general learning systems of the organism.

On the right side in Figure 9.1, we see that the fear system coordinates a range of behavioral and cognitive outputs such as a characteristic facial expression. Ekman (2007) describes the facial output of a falling man: “his upper eyelids are raised as high as they can go, . . . eyebrows are raised and drawn together, . . . lips are stretched horizontally toward his ears . . . his chin is pulled back” (p. 162). Cognition is structured by the system. The ability to think creatively is lost: thinking focuses on the threat. There are changes in blood pressure. Heart rate changes, and blood flow is directed toward the legs. Depending on the behavior and proximity of the threat, there may be total behavioral inhibition (freezing) or escape behavior. And after much of this behavior is well under way, only then does the person possibly begin to sense a feeling; perhaps there is the feeling of fear. Thus it is critical to understand that what is being discussed here are evolved emotion systems (Panksepp calls them mammalian homologies), which are much more than a feeling: they are complete adaptive dynamic systems that deal with classes of stimuli that are far too important to the organism to be left to slower thought and reflection.

It is critical to be aware that the emotion system may be strongly controlling cognition even before the person becomes cognitively aware that they are
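Figure 9.1 (diagram not reproduced). The fear system, with its input and output sides. Labels in the figure: fast approach, fast drop, alone, dark, car repair bill, cryptosporidium, learning, consciousness, face, blood pressure, heart rate, blood to legs, thinking, freeze or move, and feeling: fear.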
having an emotion. On December 29, 1972, as Eastern Airlines Flight 401, a Lockheed L-1011, was approaching Miami International for landing, the crew became aware that one of the landing gear indicator lights had not come on. The pilot, the copilot, and the flight engineer, each under the command of their own emotion systems, began to focus on the threat. First someone tapped on the light. Then they began to take apart the light display. The flight engineer was sent below to visually verify the position of the nose gear, as the rest of the crew worked on the light. As they all diligently focused on the threat cue, the airliner crashed into the Florida Everglades and 101 of 176 aboard died. Range of thought was narrowed by the fear system: all focus was on the threat.

Note that in Figure 9.1, the systems involved in consciousness are just sketched in, but they are separated from the fear system. Conscious awareness of the various inputs and outputs of the system may be strongly controlled by the system. Each flight crew member was focusing on a threat cue, the landing gear display. Thinking was strongly directed toward that cue, and away from an even more important danger: terrain contact. Even though the fear system was commanding cognition, it is possible that none of the flight crew was even conscious of fear when working on the light display. The work of Ekman (2007), Izard and Ackerman (2004), and LeDoux (1996) is particularly recommended for fascinating accounts of the interaction of emotion systems and consciousness.

Again, consider Jane. Like all of us, Jane has a fear system. Perhaps her system, like a jumpy car alarm, has a low threshold of activation. An odd taste, which might simply potentiate the system for most people, actually activates Jane’s system. She regards the taste as a signal of a dangerous threat, a germ, and she has difficulty letting go of it. Another potential threat, an unusual sound, then sets off thoughts about a potential automotive repair bill. When the emotion system controls on her cognitions are slightly loosened, she can still think only of associated dangers, costs, financial problems, and future poverty. Jane has the same fear system that is homologous in all mammals. But there are enduring, individuating parameters of that system that are at the core of Jane’s personality, and will remain with Jane throughout her life.

All mammals share a number of these systems. Returning to Table 9.1, the second column refers to the anger/rage system. This system detects frustration of one’s goals and agendas. The primitive triggers relate to frustration of one’s basic needs at the lower levels of a Maslowian hierarchy. Physical restraint sets off the system in a 2-year-old. The system then organizes the topography of behavioral responses with characteristic facial expression, sympathetic preparation for energy expenditure, an agonistic orientation to the agent of frustration, and strong motivation to overcome obstacles. Even pain is dampened as the rage system is activated. Feelings of different qualities result from the emotion systems on the right-hand side of Table 9.1.
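Fig. 9.3 Hierarchical structure of personality. Adapted from Markon, Krueger, and Watson (2005), p. 148; Copyright American Psychological Association. (Diagram not reproduced. Surviving labels: α and β at the two-factor level; Negative emotionality, Disinhibition, and Positive emotionality at the three-factor level.)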
Figure 9.3, adapted from their article, presents the factors they found underlying these 44 popular scales. At the top of Figure 9.3 are two ovals bearing the labels α and β. This is the result of extracting two factors. The three-factor extraction is the next row of three ovals, and so on down to the last row showing the five-factor level. There are very broad bandwidth traits at the two-factor level, and more differentiated, narrow traits at each subsequent level of extraction. The authors argued that taken together, the factor solutions comprise a single coherent hierarchical structure.

At the two-factor level, Markon et al. (2005) found that the α factor has loadings from scales tapping worry, anxiety, the experience of negative emotions, emotional instability, aggressiveness, negative distortion of thinking, lack of conscientiousness, oppositionality, identity disturbance, and poor self-control. Jane, the hypothetical patient at the beginning of this chapter, might be well characterized with descriptors like worry, anxiety, the experience of negative emotions, and negative distortions of thinking. However, we do not know if Jane would be well characterized by general emotional instability, aggressiveness, oppositionality, identity disturbance, and poor self-control. The high-level α factor appears to be too coarse for a good description of Jane. As you look at how α is decomposed at lower levels of the figure, you see how more narrow traits offer further opportunities to refine the description.

Markon et al. reported that the β factor had high loadings on scales measuring extraversion, a sense of well-being, and social ease. Note that this factor remains relatively stable at subsequent three- and four-factor extraction levels. From the brief description of Jane’s inner experience we cannot estimate her status on this dimension.

Markon et al. next extracted three factors. The three factors are coherently related to the earlier two factors. The big negative factor split into two more narrow factors. They found a split between worry, anxiety, and being easily stressed and a split-off factor relating to being aggressive and poorly self-controlled. The third factor remained extraversion, social interest, and well-being.

At this three-factor level, we can better capture Jane. From the brief narration of inner experience, we would guess that she is characterized by elevated negative emotionality, but that we do not have strong evidence on her status on disinhibition or, again, on positive emotionality.

At the four-factor level, Markon et al. (2005) found that aggressiveness split off from poor self-control. Aggressiveness separated from what Watson and Clark (1993) referred to as behavioral disinhibition. Tellegen (1982) referred to this trait dimension as Constraint. In work with my collaborators, I used Tellegen’s trait construct and labeled the same trait dimension from the reverse end as disconstraint (Harkness et al., 2002). Markon et al.’s four factors
Table 9.2 A few links between emotion systems and major personality traits
emotion systems and the personality traits lies at the connection of the two branches of psychological science. Much of psychology is the psychology of general laws—principles like habituation or operant conditioning, the power and log laws relating stimulus magnitudes to sensory neuron firing rates; these are general laws that may apply across all vertebrates, all mammals, or all people. The basic emotion systems we have considered are, as Jaak Panksepp (1998) noted, mammalian homologies. All mammals—all mice, all cats, goats, whales, lemurs, and humans—have these basic emotion systems. But every mouse, cat, goat, and human has enduring parameters—long-lasting characteristics of those shared systems. This is what Allport (1937) meant by a nomothetic trait—individual variation in a system that is shared by a population. Those long-lasting individuating properties—those are personality traits. Traits belong to the other side of the scientific psychology—the individual differences. Personality traits can guide the clinician to properly apply the general laws to the case of the individual patient.
From an evolutionary perspective, why would individual differences remain in a critical adaptive system? If the systems deal with important problems, should not selection have wrung out all the variation between individuals? No. Selection can never remove individual differences when the selection cannot be unidirectional. Too much fear is bad, but so is too little. And fear is a particularly interesting example for the preservation of variance in that fear of predators may act in exactly the opposite way to fear of sexual competitors in terms of contributions to reproductive fitness. With a fear system set point controlled by many genes, a single gene promoting a highly reactive fear system can at times act like a protective factor, useful in a population because it prevents, on average, having a fear system set too low. This action as a protective factor balances its cost as a risk factor in promoting too strong a fear system. And you can see on the other side, that genes promoting low fear set points can serve as protective factors against overly reactive fear systems, balancing their cost as promoters of inadequate fear systems. Individual differences can be quite prominent and stable in major adaptive systems with polygenic control when there can be no unidirectional selection. Nike shoe company may tout “No Fear,” but nature does not.
The MMPI-2 PSY-5 Scales
A convenient overview of personality traits can be scored from an administration of the MMPI-2: The Personality Psychopathology Five (PSY-5) scales (Harkness, McNulty, & Ben-Porath, 1995). Because the PSY-5 Scales are scored from the MMPI-2, they are readily available to practitioners. Pope, Butcher, and Seelen (2006) suggest that the MMPI-2 remains the most widely used psychopathology and personality assessment instrument.
The psychometric properties of the PSY-5 scales and guidelines for interpretation are reported in a University of Minnesota Press test report (Harkness et al., 2002). The PSY-5 model has been described by Butcher and Rouse (1996), Millon and Davis (2000), Widiger and Trull (1997), and by Widiger and Simonsen (2005). Pearson Assessments provides PSY-5 scores with three MMPI-2 reports: the Extended Score Report, The Minnesota Reports, and the Criminal Justice and Correctional Report. A number of textbooks offer instruction on interpretation of the PSY-5 scores: Butcher and Williams (2000), Friedman, Lewak, Nichols, and Webb (2000), and Graham (2006).
The PSY-5 measurement model (Harkness & McNulty, 1994) is connected to current models of personality and psychopathology (Watson, Clark, & Harkness, 1994). For example, it measures the four-factor level in the meta-analysis of Markon et al. (2005), described in a preceding section, with its Aggressiveness, Disconstraint, Negative Emotionality/Neuroticism, and Introversion/Low Positive Emotionality Scales. Further, the PSY-5 taps clinically relevant disconnection from reality with its Psychoticism scale. The relationship of the PSY-5 model to normal sample-based five factor models has been well established and described (McNulty & Harkness, 2002; Trull, Useda, Costa, & McCrae, 1995), evidencing robustness across translation and culture (Egger, De Mey, Derksen, & van der Staak, 2003).
Brief Introduction to PSY-5 Scale Interpretations
In this section, I provide a brief introduction to PSY-5 interpretation based on a test report (Harkness et al., 2002). The clinician interested in a more detailed description of the MMPI-2 PSY-5 scales can consult the test report and also Harkness and McNulty (2006).
AGGRESSIVENESS (AGGR)
Aggression can be used to try to get things and to try to dominate others. The underlying psychological system detects connections between one’s agenda and the potential threats to that agenda. Energy is organized and motivation is directed to agonistically remove obstacles. Thus it is hypothesized that AGGR is an
Lykken and Auke Tellegen produced a precursor to the MPQ harm avoidance scale (Lykken, Tellegen, & Katzenmeyer, 1973). A major feature of Lykken’s (1957) dissertation had involved the performance of psychopaths in a mental maze that required anticipation of future punishment.
PSY-5 DISC correlates solidly with MPQ constraint (reversed). Persons with scores on PSY-5 DISC greater than 65 tend to be more risk taking, impulsive, and less traditional. Since an alternative future does not readily appear for them, they are more susceptible to boredom when faced with a routine task in the here and now. High-DISC outpatients tend to have a history of being arrested and abusing alcohol, cocaine, and marijuana. They show a slight tendency to prefer romantic partners with elevated DISC.
In contrast, low scores on DISC (uniform T-scores of less than 40) suggest a constrained personality pattern. Constrained individuals tend to take less risk, and exhibit greater self-control and boredom tolerance. They show more tendency to be rule followers, and evidence a slight tendency to prefer romantic partners with similarly constrained personality patterns. Some people are just more likely than others to consider what that eagle tattoo will look like in 50 years!
DISC predicts which back pain patients will be unwilling to try potentially painful but effective treatment regimes (Vendrig, Derksen, & De Mey, 2000). DISC appears to be unusually effective in predicting alcohol and drug use (Rouse, Butcher, & Miller, 1999). DISC also appears to be effective in isolating an externalizing subtype of PTSD patient (Miller, Kaloupek, Dillon, & Keane, 2004).
NEGATIVE EMOTIONALITY/NEUROTICISM (NEGE)
This chapter began with a description of Jane, who thought “cryptosporidium” when her coffee tasted odd. The danger analysis system hypothesized by Tellegen (1985) and elucidated by LeDoux (1996) is thought to underlie individual differences in NEGE. Jane, and others at a high trait level of NEGE, would tend to focus on the signals of danger in incoming information. Watson and Clark (1984) noted such persons tend to worry, to be self-critical, to feel guilty, and to concoct worst-case scenarios. Outpatients with NEGE scores greater than 65 were found to have higher rates of depression and dysthymia, were more socially isolated, and tended to have lower levels of functioning. Their therapists rated them higher on anxiety, depression, and sad mood state. Low NEGE scores should not be interpreted.
INTROVERSION/LOW POSITIVE EMOTIONALITY (INTR)
Both lower levels of incentive motivation and lower responses to positive consummatory events are the hallmarks of high INTR. Meehl (1975) drew attention to the joy side of the individual differences in his work on “hedonic capacity.” When a warm fire on a cold night, a giggly baby, and chocolates fail to produce the experience of joy and positive engagement, the person is hedonically impaired. The PSY-5 INTR scale elevations also imply reduced levels of incentive motivation, or lower urgency or drive. Outpatient men with INTR scores greater than 65 showed increased rates of dysthymia and depression diagnoses, were depressed and sad during completion of a mental status exam, and were rated by their therapists as having low achievement orientation. Outpatient women with INTR scores greater than 65 were more likely to have been prescribed antidepressant medication, and to have few or no friends.
Persons with low INTR scores (MMPI-2 uniform T-scores of 40 or less) tend to exhibit an extroverted/high positive emotionality pattern, having a greater capacity to experience pleasure and joy, and to be more social and energetic. Extremely low scores may be associated with hypomanic features.
Two Futures for Jane
Traits are enduring, but traits are not destiny. Given that Jane is at a high trait level of negative emotionality, that she has a low-threshold danger analysis system, what kind of life will she have? Let me briefly sketch two alternative Janes. “Clinical Jane” finds that her frequent sense of tension is reduced, for a brief time, by inhaled nicotine. Her fingers and lungs become stained by a heavy addiction to cigarettes. Clinical Jane finds that it is more comfortable to focus her talented danger detector on the TV set rather than on her real life. Yes, she worries about the characters on TV, but that is less upsetting than worrying about her job, taxes, retirement, and relationship problems. Unfortunately, this means she spends little time actively solving her problems. Clinical Jane’s frequent experience of tension and worry is fertile soil for the negative reinforcement of quick-acting tension reducers. In addition to cigarettes, Clinical Jane finds that candy, fatty foods, and alcohol all bring respite. As she assesses her acquaintances, their faults and failings are foremost in her thoughts: they are danger signals, and her fear system commands that she focus on
presented at the 34th Annual MMPI-2 Symposium, Huntington Beach, CA, April 1999.
Izard, C. E. (1991). The psychology of emotions. New York: Plenum.
Izard, C. E., & Ackerman, B. P. (2004). Motivational, organizational, and regulatory functions of discrete emotions. In M. Lewis & J. M. Haviland-Jones (Eds.), Handbook of emotions (2nd ed., pp. 253–264). New York: Guilford.
LeDoux, J. (1996). The emotional brain: The mysterious underpinnings of emotional life. New York: Touchstone.
Loehlin, J. C., & Rowe, D. C. (1992). Genes, environment, and personality. In G. Capara & G. L. Van Heck (Eds.), Modern personality psychology: Critical reviews and new directions (pp. 352–370). New York: Harvester Wheatsheaf.
Lykken, D. T. (1955). A study of anxiety in the sociopathic personality. Unpublished doctoral dissertation, University of Minnesota.
Lykken, D. T. (1957). A study of anxiety in the sociopathic personality. Journal of Abnormal and Social Psychology, 55, 6–10.
Lykken, D. T., Tellegen, A., & Katzenmeyer, C. (1973). Manual for the activity preference questionnaire. Unpublished manuscript, Department of Psychiatry, University of Minnesota.
Markon, K. E., Krueger, R. F., & Watson, D. (2005). Delineating the structure of normal and abnormal personality: An integrative hierarchical approach. Journal of Personality and Social Psychology, 88, 139–157.
McNulty, J. L., & Harkness, A. R. (2002). The MMPI-2 PSY-5 Scales and the Five-Factor Model. In B. De Raad & M. Perugini (Eds.), Big Five assessment (pp. 435–456). Seattle: Hogrefe & Huber.
Meehl, P. E. (1975). Hedonic capacity: Some conjectures. Bulletin of the Menninger Clinic, 39, 295–307.
Meehl, P. E. (1986). Trait language and behaviorese. In T. Thompson & M. D. Zeiler (Eds.), Analysis and integration of behavioral units (pp. 315–334). Hillsdale, NJ: Erlbaum.
Miller, M. W., Kaloupek, D. G., Dillon, A. L., & Keane, T. M. (2004). Externalizing and internalizing subtypes of combat-related PTSD: A replication and extension using the PSY-5 Scales. Journal of Abnormal Psychology, 113, 636–645.
Millon, T., & Davis, R. (2000). Personality disorders in modern life. New York: John Wiley & Sons.
Panksepp, J. (1998). Affective neuroscience: The foundations of human and animal emotions. New York: Oxford University Press.
Patrick, C. J., & Bernat, E. M. (2006). The construct of emotion as a bridge between personality and psychopathology. In R. F. Krueger & J. L. Tackett (Eds.), Personality and psychopathology (pp. 174–209). New York: Guilford.
Petroskey, L. J., Ben-Porath, Y. S., & Stafford, K. P. (2003). Correlates of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) Personality Psychopathology Five (PSY-5) Scales in a Forensic Assessment Setting. Assessment, 10, 393–399.
Pope, K. S., Butcher, J. N., & Seelen, J. (2006). The MMPI, MMPI-2, and MMPI-A in court testimony. In K. S. Pope, J. N. Butcher, & J. Seelen (Eds.), The MMPI, MMPI-2, & MMPI-A in court: A practical guide for expert witnesses and attorneys (3rd ed., pp. 7–36). Washington, DC: American Psychological Association.
Rouse, S. B., Butcher, J. N., & Miller, K. B. (1999). Assessment of substance abuse in psychotherapy clients: The effectiveness of the MMPI-2 substance abuse scales. Psychological Assessment, 11, 101–107.
Rowe, D. C. (1994). The limits of family influence: Genes, experience, and behavior. New York: Guilford.
Saudino, K. J., Pedersen, N. L., Lichtenstein, P., McClearn, G. E., & Plomin, R. (1997). Can personality explain genetic influences on life events? Journal of Personality and Social Psychology, 72, 196–206.
Sharpe, J. E., & Desai, S. (2001). The revised NEO Personality Inventory and the MMPI-2 Psychopathology Five in the prediction of aggression. Personality and Individual Differences, 31, 505–518.
Tellegen, A. (1982). Brief manual for the Differential Personality Questionnaire. Unpublished manuscript, University of Minnesota, Minneapolis, MN (since renamed Multidimensional Personality Questionnaire).
Tellegen, A. (1985). Structures of mood and personality and their relevance to assessing anxiety, with an emphasis on self-report. In A. H. Tuma & J. D. Maser (Eds.), Anxiety and the anxiety disorders. Hillsdale, NJ: Lawrence Erlbaum.
Tellegen, A. (1988). The analysis of consistency in personality assessment. Journal of Personality, 56, 621–663.
Tomkins, S. (1962). Affect-imagery-consciousness, Vol. I: The positive affects. New York: Springer.
Tomkins, S. (1991). Affect-imagery-consciousness, Vol. III: The negative affects, anger and fear. New York: Springer.
Trull, T. J., & Durrett, C. A. (2005). Categorical and dimensional models of personality disorder. Annual Review of Clinical Psychology, 1, 355–380.
Trull, T. J., Useda, J. D., Costa, P. T., & McCrae, R. R. (1995). Comparison of the MMPI-2 Personality Psychopathology Five (PSY-5), the NEO PI, and NEO PI-R. Psychological Assessment, 7, 508–516.
Vendrig, A. A., Derksen, J. J. L., & De Mey, H. R. (2000). MMPI-2 Personality Psychopathology Five (PSY-5) and prediction of treatment outcome for patients with chronic back pain. Journal of Personality Assessment, 74, 423–438.
Watson, D., & Clark, L. A. (1984). Negative affectivity: The disposition to experience aversive emotional states. Psychological Bulletin, 96, 465–490.
Watson, D., & Clark, L. A. (1993). Behavioral disinhibition versus constraint: A dispositional perspective. In D. M. Wegner & J. W. Pennebaker (Eds.), Handbook of mental control (pp. 506–527). New York: Prentice Hall.
Watson, D., Clark, L. A., & Harkness, A. R. (1994). Structures of personality and their relevance to psychopathology. Journal of Abnormal Psychology, 103, 18–31.
Weisenburger, S. M., Harkness, A. R., McNulty, J. L., Graham, J. R., & Ben-Porath, Y. S. (2008). Interpreting low PSY-5 aggressiveness scores on the MMPI-2: Use of graphical, robust, and resistant data analysis. Psychological Assessment, 20, 403–408.
Widiger, T. A. (1998). Four out of five ain’t bad. Archives of General Psychiatry, 55, 865–866.
Widiger, T. A., & Simonsen, E. (2005). Alternative dimensional models of personality disorders: Finding a common ground. Journal of Personality Disorders, 19, 110–130.
Widiger, T. A., & Trull, T. J. (1997). Assessment of the Five-Factor Model of personality. Journal of Personality Assessment, 68, 228–250.
10 Computer-Based Assessment
Abstract
Practitioners are increasingly using computer-administered and computer-interpreted personality
assessments in their practice to gain a broader and quicker evaluation of their client’s problems.
Although psychologists have made great strides in developing computer-based test interpretation
reports of traditional measures, the available technology has not been fully utilized in several areas such
as development of virtual reality (lifelike) test stimuli, maximal data processing methods, and complex
interpretation strategies that account for the complexities of the real world. Also, adaptive versions are
likely to have limited utility since most practitioners (particularly forensic psychologists) need thorough
evaluations, not just to save a few minutes of client time. Internet-based assessment is faced with
problems as well as opportunities. The development and validation of Internet-based assessment
resources has not kept pace with the technological developments and there has been relatively limited
research on the equivalence in testing conditions, equivalence of normative databases, etc., important to
reliable and valid test use.
Keywords: computer adaptive testing, Internet-based assessment, computerized interviews, CBTI, BTPI,
MMPI/MMPI-2
Thematic Apperception Test (TAT), the Wide Range Achievement Test (WRAT), the House-Tree-Person Projective Technique, the Wechsler Memory Scale-Revised (WMS-R), the Millon Clinical Multiaxial Inventory (MCMI), and (tied with the MCMI for 10th position) the Beck Depression Inventory. Strikingly, only 12% of the clinical psychologists from this survey reported that they engage in assessment-related practice (i.e., administration, scoring, and interpretation/report writing) for more than 10 hr per week. Underscoring the perceived value of these select few instruments in guiding psychological services is the fact that there has been little change over the last 30 years with regard to the assessments endorsed as being most frequently used by clinicians. In addition, it is interesting to note that they still include a mix of objective and projective assessments (Cashel, 2002; Lubin, Larsen, Matarazzo, & Seever, 1985; Watkins, Campbell, Nieberding, & Hallmark, 1995).
Not only do clinicians have a multitude of assessment instruments from which to choose, but there are also numerous settings in which these assessments can be utilized. For instance, assessments are commonly used in college counseling centers, community clinics, hospitals, law enforcement, businesses and organizations, and forensic settings. Data show that the amount of time spent using assessments may depend on the psychologist’s work setting. For instance, Camara and colleagues’ (2000) study suggests that most clinical psychologists spend less than 10 hr per week in assessment-related practice, but also shows that one-third of neuropsychologists spend an average of 20 hr or more per week in assessment-related practice; thus they would have more opportunity to incorporate computerized assessments into their work. However, there has been a recent trend toward limiting compensation or assessment benefits for insurance holders on the part of managed care policies. This is apt to have a negative impact on psychologists utilizing particular assessments, such as the Wechsler scales and the MMPI-2, in their clinical work due to lack of reimbursement (Camara et al., 2000; Cashel, 2002).
Utilization of Computerized Assessments
Recent technological advances have propelled all of society into a new generation of computer usage, and psychological professionals are no exception (Bartram & Hambleton, 2006). Psychological assessment has not gone untouched during this recent fervor to use computers more and paper-and-pencil tasks less. In general, approximately 75% of clinicians report using computers as part of their daily work, albeit primarily for word processing purposes (Rosen & Weil, 1996). Moreover, approximately 25% of clinicians indicated they had Internet access at their practice location (Rosen & Weil, 1996). Nonetheless, even though computer interactive strategies have been shown to engage clients in tasks effectively (Hake, 1998), survey results suggest that clinicians are not using computer-based assessments at the rates that we might expect in light of the increasing popularity of computer technology in the general population. As a means of communication related to service delivery, only 2% of clinicians sampled in one recent study reported utilization of Internet or satellite-based video connection technology, whereas nearly every clinician endorsed use of the telephone for practice-related communication (VandenBos & Williams, 2000). These findings beg the question of how computer technology has affected the use of computerized assessments specifically, as it has been suggested that there are few common behaviors among clinicians regarding computer-related technology (McMinn, Buchanan, Ellens, & Ryan, 1999).
Almost 20 years ago, Farrell (1989) surveyed 227 practicing psychologists and determined that 41% frequently used computers for test scoring, and 29% used them for test interpretation. More recent surveys of computer assessment utilization actually report a decrease in rates, such that approximately 26% of clinicians frequently use computers for test scoring, and 20% use them for test interpretation (McMinn et al., 1999; see also Rosen & Weil, 1996). Although clinicians may not frequently use computers in their daily practice activities, at least two-thirds of them do report using computers for testing purposes or assisting with psychological testing (Ball, Archer, & Imhof, 1994). Another recent survey found that the most common service utilized by both neuropsychologists and clinical psychologists is scoring with computer-based assessments, which comprises about 10% of all testing services conducted. On the other hand, administration and interpretation applications each accounted for only 3% of testing services (Camara et al., 2000). Moreover, clinicians reported using computer-based assessment most often for personality and/or psychopathology assessment (Camara et al., 2000), suggesting considerable potential when one considers the range of areas in which computers are seemingly underutilized.
BTPI profile (T-scores, plotted on a scale from 0 to 100+):
INC 51, VIR 49, EXA 65, CLM 58
REL 57, SOM 65, EXP 38, NAR 41, ENV 62
DEP 62, ANX 74, A-O 68, A-I 47, PSY 72
Composite scales: GPC 67, TDC 58
TREATMENT ISSUES
Individuals who score high on the Somatization of Conflict (SOM) scale are reporting a considerable amount of
physical distress at this time. They seem to feel their problems are, for the most part, physical, and they do not like to
deal with emotional conflict. They have a tendency to channel conflict into physical symptoms such as headache, pain,
or stomach distress. They tend to worry about their health and seem to be reducing their life activities substantially as a
result of their physical concerns. They view themselves as tired and worried that their health is not better.
CURRENT SYMPTOMS
Raymond appears to be extremely anxious at this time. He is reporting great difficulty as a result of this
tension, fearfulness, and inability to concentrate effectively. He seems to worry a great deal and feels that he
can’t seem to sit still at times. His daily functioning is severely impaired because of his worries and an inability
to make decisions. The elevation he obtained on the Anxiety (ANX) scale is relatively common among
psychotherapy clients. The ANX scale was the second most frequent peak score in the Minnesota
Psychotherapy Assessment Project, with 28% of the cases having T-scores > 64 (6% of these with ANX as the
peak Current Symptom scale score). However, peak ANX elevations were more prominent for women (9%)
than for men (2%).
In addition, he has presented a number of other serious problems through the BTPI items that require careful
evaluation at this time. He has also obtained a high score on the PSY scale. Scores in this range reflect very
unusual thinking. Individuals with extreme scores, such as his, are reporting that their minds are not working
well and that they are having difficulty with their thought processes. Unusual and magical thoughts are
characteristic of their belief systems. He also appears to be extremely mistrustful and suspicious of others.
There is some possibility that his unusual thoughts are delusional in nature.
Along with the problems described above, there are other symptoms reflected in his BTPI response pattern
that need to be considered in assessing his current symptomatic picture. His responses to the BTPI items
suggest that he is likely to be somewhat aggressive and irritable toward other people. He feels as though he
lives in a world full of antagonism and reports feeling so tense and irritable that he thinks he is ‘‘going to
explode.’’ He reports behavior that suggests he has temper control problems and may feel angry and resentful
of other people.
TREATMENT PLANNING
His symptom description suggests some concerns that could become the focus of psychological treatment if the
client can be engaged in the treatment process. The provision of test
feedback about his problem description might prove valuable in promoting accessibility to therapy.
His reliance upon somatic defense mechanisms and his need to view conflicts in medical terms need to be
dealt with in therapy if he is going to be able to effectively resolve conflicts that occur in his interpersonal relations.
Treatment planning should proceed with the understanding that the client maintains the view that he lives in a very
unsupportive environment. This perceived lack of a supportive context for change, whether real or imagined, can
prove frustrating to the client’s efforts at self-improvement.
His intense anxiety and high tension would likely be good target symptoms to address in therapy. He appears to be
experiencing some disabling cognitions and may need to explore these vulnerabilities in some detail to alleviate
the sources of these anxious states. Directing treatment toward tension reduction and focusing upon more effective
stress management strategies would likely serve to improve his life adjustment.
His rigid beliefs and opinionated interaction style are likely to challenge the therapist as treatment proceeds.
PROGRESS MONITORING
Raymond obtained a general pathology composite (GPC) T-score of 67. His GPC index score indicates that he
has endorsed a number of mental health symptoms that may require consideration in psychological treatment
planning. For a statistically significant change, based on a 90% confidence interval, a subsequent GPC T-score
must be above 73 or below 61.
Raymond obtained a treatment difficulty composite (TDC) T-score of 58. Overall, his TDC index score is well
within the normal range, indicating that he has not acknowledged many of the personality-based symptoms
addressed by the BTPI to assess difficult treatment relationships. For a statistically significant change, based on a
90% confidence interval, a subsequent TDC T-score must be above 65 or below 51.
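The change thresholds quoted in these progress-monitoring paragraphs amount to a symmetric band around the obtained T-score. As a rough illustration only, the Python sketch below rebuilds such a band and checks a retest score against it. The half-widths of 7 points (TDC) and 6 points (GPC) are read off the bands printed in the reports (a TDC of 58 gives 65 and 51; the follow-up GPC of 50 gives 56 and 44). The function names, and the conversion from a standard error of the difference to a half-width through a two-sided 90% normal interval, are assumptions made for illustration and are not the BTPI's documented scoring procedure.

```python
from statistics import NormalDist


def half_width_from_se(se_diff, confidence=0.90):
    """Half-width of a two-sided normal interval (hypothetical helper)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.645 for 90%
    return z * se_diff


def change_band(t_score, half_width):
    """Return (lower, upper); a retest must fall outside this band to count as change."""
    return (t_score - half_width, t_score + half_width)


def is_significant_change(baseline_t, retest_t, half_width):
    """True when the retest T-score lies strictly outside the band around baseline."""
    lower, upper = change_band(baseline_t, half_width)
    return retest_t < lower or retest_t > upper


# Half-widths as implied by the printed reports, not derived from a published SE.
GPC_HALF_WIDTH = 6
TDC_HALF_WIDTH = 7

print(change_band(58, TDC_HALF_WIDTH))                # (51, 65)
print(is_significant_change(67, 50, GPC_HALF_WIDTH))  # True
```

Under these assumptions, the drop in GPC from 67 at intake to 50 at follow-up falls well outside the band, which is in line with the improvement described in the case narrative.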
Follow-up BTPI profile (T-scores, plotted on a scale from 0 to 100+): DEP 53, ANX 58, A-O 47, A-I 43, PSY 62, GPC 50
Fig. 10.3 Follow-up BTPI for Mr. R. Reproduced with permission of MHS.
diagnoses and arranged for additional psychiatric evaluation of his suitability for antipsychotic medication. The patient and psychologist decided to expand his psychotherapy plan by adding group therapies to his individual psychotherapy regimen.
REASSESSMENT
After 3 months, Mr. R’s psychologist requested that he complete the MMPI-2 and BTPI again, in order to monitor his treatment progress. He repeated the full MMPI-2 battery and was administered an abbreviated version of the BTPI that included only the Current Symptoms Scales. His new MMPI-2 profile was similar to his previous one in its indication of ruminative thinking; however, the second MMPI-2 did not evidence the degree of paranoid ideation, persecutory thinking, and cynicism toward others that was suggested by the initial one. Figure 10.3 shows his second BTPI profile. He did not produce elevated scores on any of the BTPI symptom scales, supporting the notion of improvement in his thinking problems. See Figure 10.4 for the follow-up narrative reports of the BTPI.
CONCLUSIONS ABOUT THE ASSESSMENT
The computer-based test interpretations in this case served multiple purposes. First, they provided objective baseline data against which to measure changes in Mr. R’s problems and symptoms over time. Second, they provided extensive, objective input regarding diagnoses and treatment considerations. The computerized interpretations afforded the psychotherapist and the patient scientifically rigorous data. The psychologist found that having such information facilitated the feedback process by helping to reduce Mr. R’s skepticism about the information in his profiles and his general defensiveness about his symptoms. He was accepting of the feedback because he could appreciate it as an objective distillation of his own answers to a large number of specific items. Mr. R’s demeanor in the feedback sessions suggested that the profiles had helped him to get some distance from his problems. The resulting discussions, which were about highly sensitive subjects, thereby seemed to be more collaborative. Finally, the computer-based summaries eased some of the burden on the psychologist with regard to interpreting the test data. In addition to facilitating and truncating the interpretation process, they reduced the likelihood of biased interpretative statements and human error. Nonetheless, as was discussed with Mr. R, the interpretations did need to be individualized so as to ensure that they applied specifically to him.
Developments in Computer-Based Assessment
In this section we will examine in more detail two issues concerning computer-based assessment that are gaining increasing attention today: CAT and Internet-based test administration and interpretation.
Computerized Adaptive Testing
People differ greatly in terms of the number and range of problems and characteristics they experience. Why then would it not be desirable, in a mental health assessment, to use a varied symptom review strategy that is tailored to the person’s problems rather than using a fixed questionnaire that asks the same questions of all persons in a standard order as the majority of personality questionnaires do? In most standard psychological tests, everyone is administered the same items in a fixed sequence even though some
PROGRESS MONITORING
The general pathology composite (GPC) T-score of 50 places Raymond in the 62nd percentile when compared to
the normative sample. Overall, his GPC index score is well within the normal range suggesting that he has not
acknowledged many mental health symptoms. For a statistically significant change, based on a 90% confidence
interval, a subsequent GPC T-score must be above 56 or below 44.
Fig. 10.4 Follow-up BTPI report for Mr. R. Reproduced with permission.
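The adaptive administration logic discussed in this section (score each response at once, pick the next item from what is known so far, stop when enough information has been gathered) can be made concrete with a small Python sketch. It is illustrative only. The two-parameter logistic form of the item characteristic curve, maximum-information item selection, the grid-based trait estimate with a standard normal prior, the standard-error stopping rule, and every name and number in the item bank are assumptions chosen for the example. They are not the procedures of any particular published instrument or of the studies cited here.

```python
import math


def icc(theta, a, b):
    """Item characteristic curve: probability of a keyed response at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information that a two-parameter logistic item contributes at theta."""
    p = icc(theta, a, b)
    return a * a * p * (1.0 - p)


def estimate_theta(responses, grid):
    """Posterior-mode trait estimate over a grid, using a standard normal prior."""
    def log_posterior(theta):
        total = -0.5 * theta * theta
        for (a, b), keyed in responses:
            p = icc(theta, a, b)
            total += math.log(p if keyed else 1.0 - p)
        return total
    return max(grid, key=log_posterior)


def run_cat(item_bank, answer, se_target=0.5):
    """Administer items adaptively until the standard error reaches se_target."""
    grid = [g / 10.0 for g in range(-40, 41)]   # candidate trait levels, -4.0 to 4.0
    remaining = list(item_bank)
    responses = []
    theta = 0.0                                  # start at the population mean
    while remaining:
        # Select the unused item that is most informative at the current estimate.
        item = max(remaining, key=lambda it: item_information(theta, *it))
        remaining.remove(item)
        responses.append((item, answer(item)))   # score the response immediately
        theta = estimate_theta(responses, grid)
        info = sum(item_information(theta, *it) for it, _ in responses)
        if 1.0 / math.sqrt(info) <= se_target:   # termination criterion
            break
    return theta, len(responses)


# Hypothetical bank: 25 items (discrimination, difficulty), difficulties -3.0 to 3.0.
bank = [(2.0, d / 4.0) for d in range(-12, 13)]
# Simulated examinee at trait level 1.0 who endorses every item at or below that level.
theta_hat, n_used = run_cat(bank, answer=lambda item: item[1] <= 1.0)
print(theta_hat, n_used, len(bank))
```

In this toy run the procedure stops after administering only part of the 25-item bank, which is the kind of saving in test length that adaptive administration is meant to deliver. Other selection and stopping rules would fit the same loop.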
might not be appropriate for some clients. An “adaptive” test, on the other hand, is an individualized procedure that aims to provide the same information as the full test but with fewer items. A computer adaptive instrument can be programmed to determine when sufficient information has been collected from the client to estimate the traits or characteristics being assessed. The primary goal of an adaptive test is to acquire the necessary information by administering a reduced number of items. Studies have shown that an adaptive administration can result in reduced item administration (Forbey & Ben-Porath, 2007).
During the adaptive test administration procedure, the first administered item is scored immediately to assess the difficulty and discrimination levels to determine the next item to be administered. For example, in a computer adapted ability test, if the examinee answers the item in a particular direction, the next item to be administered will be a more difficult one or the one designed to assess the content area more thoroughly. If the examinee fails to respond in a predetermined direction then the next item will be an easier one or the one that is aimed at sampling a slightly different aspect of the same domain. The third item is administered and scored in the same manner and is chosen based on the responses from the previous two items. The adaptive administration is discontinued when the procedure determines that a sufficient number of items have been answered to assess the characteristic in question.
Most adaptive testing strategies have been based on a testing procedure referred to as item response theory or IRT (Weiss, 1985). In this procedure, the conditional probability of the response to an item for different levels of underlying latent trait or ability is estimated. For example, if an individual has a high latent trait, the probability of that person responding to an item in the keyed direction is considered to be greater. This relationship between the probability of the item endorsement and the latent trait characteristic can be graphically expressed in an S-shaped curve, called item characteristic curve (ICC). The major goal of the IRT procedure is to determine the most accurate ICC for each test taker (Weiss & Yoes, 1988).
An extensive program of research by Weiss and his colleagues (Brown & Weiss, 1977; Kiely, Zara, & Weiss, 1983; Weiss & Yoes, 1988) has demonstrated the effectiveness of IRT and CAT procedures over traditional methods in ability testing. In their approach, computers administer items, score the responses, and select the next items based on the responses to the previous items, and determine if the optimal amount of information has been obtained and thus terminate the test. Their procedure starts with an item of predetermined difficulty; then the preprogrammed computer decides if an examinee answered in the keyed or nonkeyed direction, based on which it selects the next item that can provide maximal information about the person’s latent trait. The computer program then automatically discontinues this process if the termination criterion is met.
The major advantage of CAT over conventional testing is that participants are provided with only the items that are necessary to measure the assessment questions, and they do not have to take the redundant or inappropriate items. As a result, test length and time can be significantly reduced.
COMPUTERIZED ADAPTIVE TESTING WITH THE MMPI-2
Can personality characteristics be evaluated by computer adaptive strategies such as IRT? Some
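The adaptive cycle described in this section (start with an item of predetermined difficulty, score the response, select the item expected to be most informative, stop when a termination criterion is met) can be illustrated with a short sketch. This is not the procedure used by Weiss and his colleagues or by any MMPI-2 program. It assumes a two-parameter logistic form for the item characteristic curve, a standard normal prior on the latent trait, a coarse grid for estimation, and a standard-error stopping rule; the function names, the item bank, and all parameter values are hypothetical.

```python
import math

def icc(theta, a, b):
    """Probability of a keyed response at trait level theta (assumed 2PL curve).

    a = discrimination, b = difficulty; both are illustrative parameters.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information a 2PL item provides at trait level theta."""
    p = icc(theta, a, b)
    return a * a * p * (1.0 - p)

# Coarse grid of candidate trait levels from -4.0 to 4.0 (an assumption).
GRID = [g / 10.0 for g in range(-40, 41)]

def estimate_theta(responses, bank):
    """Posterior mean and SD of theta under a standard normal prior.

    responses: list of (item_index, keyed) pairs, keyed being True or False.
    """
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized prior density
        for idx, keyed in responses:
            p = icc(t, *bank[idx])
            w *= p if keyed else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)

def run_cat(bank, answer, se_target=0.5, max_items=None):
    """Administer items adaptively until the trait estimate is precise enough.

    bank: list of (a, b) tuples; answer: callable mapping an item index to
    True (keyed) or False (nonkeyed). Returns (theta, se, administered).
    """
    theta, se = 0.0, float("inf")  # start at the middle of the trait range
    responses = []
    remaining = set(range(len(bank)))
    limit = max_items or len(bank)
    while remaining and len(responses) < limit and se > se_target:
        # Pick the unused item with maximal information at the current estimate.
        nxt = max(sorted(remaining),
                  key=lambda i: item_information(theta, *bank[i]))
        remaining.remove(nxt)
        responses.append((nxt, answer(nxt)))
        theta, se = estimate_theta(responses, bank)
    return theta, se, [i for i, _ in responses]
```

With a hypothetical bank such as `[(2.0, k / 4.0) for k in range(-12, 13)]` and a simulated examinee who endorses every item whose difficulty is below 1.0, `run_cat` begins with the middle-difficulty item and stops once the posterior standard deviation falls to the target, without presenting the whole bank. Selecting by maximum information is one common choice, used here only because it matches the text's description of picking the item that "can provide maximal information".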
11 Behavioral Observations
Martin Leichtman
Abstract
Introductory sections of test reports devoted to “Behavioral Observations” are among the most neglected
and poorly handled aspects of psychological evaluations. To highlight the contributions these paragraphs
can make to reports and provide a framework for presenting them, this chapter examines (1) ways of
conceptualizing the behavior in test situations to which they are devoted; (2) the role of behavioral
observations in the inference process through which psychologists can seek answers to diagnostic
questions; and (3) the function these paragraphs serve in reports.
that so interfered with his schoolwork he was unable to finish the 10th grade. Asked by the referring psychiatrist whether the boy exhibited severe character pathology as well as an obsessive-compulsive disorder, a psychologist wrote the following paragraph on the basis of the TSAT:
A tall, slender, effeminate adolescent whose nervous mannerisms are reminiscent of the actor Anthony Perkins, Don approached the tests with considerable anxiety and a strong desire to be ingratiating. Yet he had little idea of how to interact in socially appropriate ways and, from the first, treated the examiner as an admirer who would be fascinated by his every thought, no matter how strange. Thus, he began the first session by launching into a detailed lecture on his odd obsessions as if he were a case in a medical text. On the tests, he tried to be precise and meticulous, but had increasing difficulty doing so. So eager was he to impress that he was loath to give one answer to a question when three would do, but often was unable to decide which of the three were good and which poor. He had particular difficulty with mathematical tasks because he had forbidden himself the use of certain numbers and he clutched his left side when they were mentioned. He worried constantly about making mistakes and, when encountering test items he could not do, he would begin to shake, noting nonchalantly that he often had such “violent spasms” when upset. As the testing progressed, his thinking became looser. Occasionally he used odd words of his own creation or spoke in Russian to demonstrate his proficiency in a language he had taught himself. Toward the end of the tests, tangential associations to a question led him to give an elaborate description of an imaginary kingdom he had invented and about which he spent many hours each day fantasizing over the last several years.
Classifying his TSAT story according to the DSM-IV scheme favored by his colleague, the psychologist suggested that, while there was evidence of the obsessive-compulsive problems noted by the psychiatrist, they were present in a boy with prominent schizotypal personality characteristics who was at high risk for decompensation.
In addition to the standard TSAT task, an optional variation may be undertaken as well. After completing the initial story, psychologists are encouraged to imagine that their clients will soon enter situations that bear some similarity to testing. For example, they may assume that a child who has just taken intelligence and achievement tests is about to enter a remediation program or an adult who has taken the Rorschach is soon to begin psychotherapy. A second story may then be written about how subjects are likely to behave in such situations and how teachers or therapists might most effectively respond to them.
Field tests of the TSAT indicate that it possesses some extraordinary characteristics. For example, the scope of its application extends far beyond that of any established assessment instrument. It may be utilized in unmodified form with subjects who range in age from neonates to the elderly and who display every known form of psychopathology. It is equally appropriate for the severely retarded and the gifted, and neither physical disabilities nor neurological impairments pose any restrictions on its use. Indeed, the TSAT can even be employed with other species, although how far down the phylogenetic scale we may proceed before it ceases to be fruitful is still a matter of contention.
As a means of studying character and predicting behavior, the TSAT has been found to have greater content validity than other projective tests. Early investigations suggest that concurrent validity is high. For example, there is an impressive correlation between diagnostic impressions formed by experienced psychologists on the basis of the TSAT and those advanced by experienced clinicians on the basis of interviews. In addition, studies of predictive validity have established that subjects described as reflective on the TSAT do well in expressive psychotherapy; subjects who are depicted as distractible, hyperactive, and emotionally labile frequently display these characteristics at home and at school; and subjects who are described as assaulting the examiner on the TSAT not only have histories of violent behavior but are at high risk for aggressive acting out in the future.
As anticipated at the time of the test’s introduction, only three problems have been encountered. First, reliability is a bit lower than that of many objective instruments, though higher than that of most projective tests. Second, the TSAT is subject to misinterpretation. Misunderstanding the nature of the task, supervisors and colleagues at times do not focus on the subjects of the stories but instead analyze protocols from the standpoint of what they reveal about their authors. On occasion, this has exposed psychologists to rude comments about their clinical judgment, their literary style, and even their grammar and punctuation. Third, and most important, because the test does not require so much as a modestly priced manual or inexpensive scoring form, let alone a costly kit, it has been of little interest to major publishers. In the absence of
The “Behavior” Observed
“Behavioral Observations” and Their Rationale
Of the problems encountered in “Behavioral Observations” paragraphs, two types are especially worthy of note because of what they reveal about how these accounts are conceived. Undoubtedly the most exasperating is the long, banal description of behavior that has little bearing on the psychological issues the report is intended to address (Tallent, 1988). For example, clinicians who have extensive contact with a child may be informed he has blue eyes and blond hair, wore jeans and a T-shirt emblazoned with the word “Cougars” to the initial test session, smiled when meeting the examiner in the waiting room, but glanced apprehensively at his mother, walked down the hall holding the psychologist’s hand, looked in the toy closet upon entering the office, but came to the desk when asked, and so forth. At the other extreme are “Behavioral Observations” that consist of little more than a terse statement such as: “The patient is a white, female Caucasian, age 36, who exhibited constriction, dysphoric affect, and psychomotor retardation in the course of testing.” In contrast to rambling, unfocused accounts of subjects’ behavior, such paragraphs are blessedly brief and move readers on quickly to the main test findings. At the same time, their impersonal, technical tone typically heralds a report written in the same fashion that will afford little sense of what subjects are like as people.
What is most significant about the first example is that it may be seen as a conscientious effort to act on a central tenet of the catechism that all psychologists, regardless of sect, learn at the beginning of their training: Everything a person thinks, feels, and does is behavior and merits careful observation. This tenet is hard to dispute. In fact, so much can be gleaned from careful observation of test behavior that psychologists can give remarkably good diagnostic pictures of profoundly disturbed children on the basis of how they act in the process of being “untestable” (Kaplan, 1975; Leichtman & Nathan, 1983). Yet the principle is difficult to translate into practice. Acting on it, inexperienced clinicians may produce long, meandering narratives of test behavior that have the unintended consequence of suggesting that no test behavior is of particular significance. Skilled clinicians can use test behavior to gain insights into every aspect of psychological functioning, but the value of their observations does not derive chiefly from the assumption that all behavior is important. Hours of testing yield an enormous volume of material, and the precept provides no basis for deciding how observations are to be selected and organized into a paragraph of presumably moderate length.
The second example may be seen as an outgrowth of assumptions rooted in a purely psychometric approach to testing that takes laboratory procedures as its model (Cronbach, 1960, pp. 59–60). Though willing to concede that everything that occurs in the test situation is behavior, adherents of such a perspective presuppose that there are two classes of behavior and a “class system” based on qualitative differences between them. On one hand, there are test data, data gathered in standardized, delimited conditions, subject to rigorous observation and precise measurement, and analyzed in terms of established norms. On the other, there are those additional observations psychologists make while gathering this information, observations that center on more obvious, superficial behavior, that are less systematic and more subject to bias, and that are, in any case, only a limited sample of the kind of behavior that can be explored more fully in diagnostic interviews. The former data are viewed as the source of the unique, invaluable contributions psychologists make to diagnostic evaluations. The latter are relatively unimportant, and hence are best stated in as brief and objective a manner as possible.
For “Behavioral Observations” to play an effective role in test reports, a rationale is required that involves greater specificity than the view that all behavior is important and that recognizes that these accounts have a legitimate and even indispensable role to play in the report.
The Nature of the Test Process
The basis for such a rationale can be appreciated if we examine how test data are gathered. Consider, for example, this process with John, a 5-year-old referred for an evaluation 18 months after he suffered organic damage that resulted in a variety of cognitive impairments, hyperactivity, problems with attention and concentration, and aggressive behavior. As would be anticipated, referral questions center on assessment of his intellectual abilities and the impact of his injury on his psychological development. As would also be anticipated with the referral, the psychologist receives fair warning that the boy will not be easy to test.
This warning is repeated by the receptionist, who makes it clear that the rambunctious child needs to leave the waiting room quickly. Upon meeting John, the psychologist encounters a feisty youngster who,
context of this relationship, tests are administered in a standardized form or at least as close to a standardized form as seems possible. Yet the case also illustrates a point of particular importance that is typically acknowledged in manuals and texts only to be quickly glossed over in the interest of emphasizing the standardization of test conditions: Depending on differences in clients and examiners, there are, in fact, marked variations in how “rapport” is established and even what constitutes “standardized administration.”
Rapport, for example, is usually described in general, abstract ways as if it consists of little more than adopting a “positive, nonthreatening tone,” “putting the child at ease,” and “conveying interest and enthusiasm” to create a positive, relaxed atmosphere (Wechsler, 1991). However, in emphasizing the “importance of rapport” in the administration of the Stanford-Binet, Terman and Merrill (1960) wrote:
To elicit the subject’s best efforts and maintain both high motivation and optimal performance level throughout the test session are the sine qua non of good testing, but the means by which these ends are accomplished are so varied as to defy specific formulation. The address which puts one child at ease with a strange adult may belittle or even antagonize another. The competent examiner, like the good clinician, must be able to sense the needs of the subject so that he can help him accept and adjust to the testing situation. Sympathetic, understanding relationships with children are achieved in the most diverse ways and no armory of technical skills is a satisfactory substitute for this kind of interpersonal know-how. (pp. 50–51)
Stylistic and personality differences among examiners also affect this process. As Phares and Tull (1997, p. 257) observed, “There are many ways to achieve good rapport—perhaps as many as there are clinicians.”
Similarly, there is considerable diversity in “standardized” test administration. Sattler (1988, p. 87) noted, “Countless variations preclude giving an examination that is always the same.” Even with relatively simple evaluations, the pace of testing differs because some children react quickly and others slowly, and the number and types of follow-up questions vary with the ambiguity of responses and examiners’ judgments of them. Moreover, with individuals referred because of serious emotional problems, evaluations are anything but simple. Clients come struggling desperately to hide their psychoses, fearful of failure, angry, frightened, elated, depressed, or oppositional. Obtaining an accurate diagnostic picture requires giving tests in ways that address clients’ concerns and cope with the defensive maneuvers through which they seek to escape serious engagement with the tests (Palmer, 1983). While at times this necessitates explicit interpretations and modification of test procedures, in most cases it is done without going beyond the parameters of standard test administration. As with John, examiners may handle these problems in subtle, often nonverbal ways through their posture, expression, or tone of voice, through the timing of the tests, or through judgments about whether to accept responses, repeat them, or inquire further. In a sense, test instructions are like a play. Examiners are bound by the script, but there is wide latitude in how they and their clients interpret their roles. This latitude is, in fact, the subject of an extensive literature on examiner effects on test performance and their implications for the validity of particular tests such as intelligence scales (see Jensen, 1980; Sattler, 1988) or the Rorschach (Exner, 2003).
These influences, it should be stressed, are not unfortunate intrusions of extraneous variables into the test process that can be minimized or fully controlled, but rather are inescapable consequences of the fact that tests are given by one human being to another. If tests were administered to each subject in exactly the same way, the result would be less useful information rather than more. For example, were John tested in this manner, rather than seeing his capacities we might learn little more than that the boy’s hyperactivity, distractibility, and defensiveness interfered with his performance on tests and with every other aspect of his life, a point already clear to the receptionist after a few minutes’ contact with him.
Even more important, it is essential to realize that what are referred to as “rapport” and appropriate test administration are processes that are a great deal more complex than the often cursory discussions of them in textbooks and manuals would lead one to believe. Rapport involves more than developing a positive, accepting relationship in order to “arouse the test takers’ interest in the test, elicit their cooperation, and encourage them to respond in a manner appropriate to the objectives of the test” (Anastasi & Urbina, 1997, p. 17). It requires sensing roles clients strive to enact that are central to their identities, and in natural, intuitive, often unrecognized ways enacting reciprocal roles (Leichtman, 1996).
Similarly, administering tests effectively involves far more than adhering to their instructions in rigid, unbending ways. It also requires attending to and helping compensate for the specific manner
during the testing, a few lines may be sufficient to give a reasonable view of the overriding impression the client creates. Indeed, the term “Behavioral Observations” is a misnomer in the sense that what is sought are generalizations. For example, a multitude of observations may go into formulating an impression of a youngster’s attitude toward and modes of approaching the tests, but only a good example or two are necessary to illustrate it.
Behavioral Observations and the Inference Process
Attitudes toward Clinical Impressions
Psychologists typically are most comfortable with those questions about the test situation concerned with the client’s modes of handling the tests because answers can be based on observable behavior that lends itself to quantification (Sattler, 1988). In contrast, many are uneasy with questions about the roles clients play, and even more about relationships with examiners, because these require greater reliance on subjective impressions and clinical judgment. Consequently, there are significant differences of opinion about how these answers should be given and what role they should play in the inference process through which diagnostic formulations are developed.
Articulating a position shared by many psychologists with psychometric and behavioral orientations, Knoff (1986) advocated caution in the use of observations and behavior. He asserted: “To date, there is no empirically sound observational system available for completion by the practitioner during or immediately after the individual assessment section; nor are there procedures to control the potential bias when data (observed or recalled) are generalized into diagnostic hypotheses” (p. 552). Accordingly, he recommended: (1) recognizing that “assessment observations are based on a narrow, artificial situation and may not represent the child’s behavior in ‘real life’ situations”; (2) emphasizing “observed and documented behavior over recollections and inferences”; (3) “utiliz[ing] observers behind one-way mirrors and determining interrater reliabilities for observations and interpretation”; and (4) stressing “consistencies across the entire assessment process” and “discounting inconsistencies that may be situation-specific and ‘chance fluctuations’” (p. 552).
Most clinicians, however, are likely to find these recommendations excessively cautious and impractical. Though useful in training, one-way mirrors and checks on inter-rater reliability can play little role in a busy clinical practice. Too heavy an emphasis on documented behavior and a reluctance to generalize, while admirable from a scientific standpoint, may diminish the usefulness of reports to referral sources and leave the task of generalizing to others who are probably less skilled and certainly less familiar with the data. In addition, as Aiken (1999) suggests, such attitudes may underestimate the accuracy with which human beings can read kinesic, proxemic, and paralinguistic cues they have spent a lifetime mastering. Although acknowledging that mistakes can be made, he observes: “. . . most people probably succeed more often than not in interpreting nonverbal messages in their own culture correctly” (p. 134). Most important, this approach underestimates the value of what Polanyi (1958) described as “personal knowledge,” the judgment and expertise that constitute the art of a profession.
The Interplay of Inferences from Different Sources of Data
An alternate approach to the problem is one that encourages the use of a disciplined subjectivity in the inference process. A useful exercise in teaching this kind of clinical skill consists of having trainees rate the degree of pathology exhibited by clients they have just tested on a 5-point scale ranging from normality to psychosis and categorize the client’s character style and type of pathology according to DSM-IV (Diagnostic and Statistical Manual, 4th edition) or whatever nosological scheme is preferred in their setting. In approaching these tasks, they are asked to imagine that they will not be allowed to score and analyze tests or make use of individual test responses, but must rely solely on their impressions of clients in the test situation and articulate the bases for those impressions. Then they are asked to repeat the task assuming answers to the diagnostic questions must be based only on information gathered from each of the tests they have administered. Finally, they compare the answers they have offered.
The exercise has a number of benefits. First, subjective impressions are brought into the open and examined rather than driven underground where they are less likely to disappear than to influence diagnoses in unacknowledged ways. Second, insisting that distinct sets of inferences can be made from different sources of data helps wean novice psychologists away from a covert overreliance on their behavioral observations and teaches them how to use test data. Third, the exercise sharpens skills in forming clinical impressions, since these impressions can be compared with those of
coping with marked impulsivity and distractibility, but also a disconcerting aversion to the child. When, with some reluctance, he shares these feelings in his report, the child’s psychiatrist and teachers are relieved and reveal that they have similar reactions that have been hard to acknowledge. The family therapist adds that she suspects that the parents feel the same way and that their anger and guilt interfere with accepting the child and setting appropriate limits. Clearly, dealing with the youngster’s attention problems, hyperactivity, and learning disability will not be enough. Effective treatment will require understanding and addressing why the child is disliked and why the child creates this feeling.
The value of subjective data is not confined to those cases in which unanticipated information about clients is uncovered. In every testing, the psychologists’ personal experience of clients provides a means of translating objective, quantitative, often impersonal data from tests into full-bodied pictures of human beings. This experience also allows psychologists to empathize with families and clinicians to whom their reports are addressed. It is this integration of objective and subjective data that enables psychologists to reconcile the “scientific” and “humanistic” aspects of the assessment process (Sugarman, 1978).
Perhaps the most significant contributions the examiner’s experience of clients can make in formulating cases are to treatment recommendations. Although clinicians at times act as if diagnosis and treatment are separate processes, clients do not. For them, diagnosis is the first step in the treatment process. Even proponents of objective instruments recognize that when conducted properly, testing can have important therapeutic benefits (Butcher & Perry, 2008; Finn & Kamphuis, 2006). Psychodynamic psychologists who view testing as a clinical interview go further. Treating the test situation as paradigmatic of the therapy process, they argue that transference–countertransference patterns established in the former can predict those that will arise in the latter (Schlesinger, 1973; Sugarman, 1981). For example, Lerner (1998) asserts: “Because of the similarities between the assessment and treatment frames, the examiner who explicitly sets the assessment frame and then observes patient responses to it is in a unique position to predict patient reactions to the treatment frame” (p. 68).
With John, the neurologically impaired youngster described above, the relationship established with the psychologist during the testing afforded a remarkably good basis for anticipating the ways in which the boy was to use his therapist and the issues that would disrupt the therapeutic relationship in the years that followed (Leichtman, 1992b). Appelbaum (1972) attached such importance to this aspect of the testing that he advocated devoting a major section of the report to “How the Patient Responds to Various Interpersonal Approaches.” Although many psychologists would dispute so close an identification of testing and therapy, they would acknowledge important common features. Therapists, like testers, need to establish rapport and engage in explorations in which they seek to learn what clients are like, including secrets and vulnerabilities wished hidden. Also, as psychologists help children take intelligence or achievement tests, they grapple with many of the same problems with which teachers and tutors struggle in educational settings. Because of these similarities, examiners can draw on their experiences of what facilitates and disrupts testing to make the kinds of specific, individualized recommendations that are of particular help to those who will treat clients.
Appreciating the potential benefits of the use of subjective data helps place concerns about their liabilities in a proper perspective. Though drawing on private, emotion-laden experience, psychologists can check their impressions against those of others and against test data. Though we can never be sure that our impressions are not biased or idiosyncratic, confining reports to objective data because of concerns about bias and unreliability will not keep our virtue unsullied. There are sins of omission as well as commission, Type I as well as Type II errors. To ignore subjective data may be to neglect vital information about clients, miss opportunities to offer richer pictures of them in reports, and diminish the utility of treatment recommendations. Clients may be better served when psychologists risk mistakes, trust their capacities to make empathic contact, and work to develop the kinds of skills that warrant such trust.
The Function of “Behavioral Observations” Sections in Reports
The importance of the phenomena about which “Behavioral Observations” are made and their significance in the inference process are not sufficient to justify devoting a separate introductory section of test reports to them. Such information, after all, can be included in a variety of places. In fact, contending that the formats in which these observations are treated separately create an artificial distinction between internal and external aspects of personality, Sugarman (1981) argued that these data should be
seemingly arcane theories are pseudoscientific and ‘‘Behavioral Observations’’ can stimulate an
blind us to what is obvious and practical in assisting interest in learning more about clients by drawing
clients. Such suspicions are likely to be especially high attention to incongruous phenomena. For example,
when conclusions about clients from testing differ when an examiner experiences an adolescent as
from what their overt behavior or presenting symp- dependent and surprisingly cooperative in spite of
toms lead others to believe. Similar problems occur even when there is a difference of opinion with sophisticated clinicians who share the same theoretical perspective, as for example when test findings suggest a markedly different view of a client than does a psychiatric interview.
Although it has been suggested that readers are put off by reports that tell them what they already know about clients (Tallent, 1988), the opposite may be the case. Well-observed accounts of how clients present themselves in the testing situation can reassure others that psychologists are seeing the same individuals they are. When reports start by establishing a common ground, readers are more inclined to accept test results that differ from surface impressions and to see differences of opinion as a product of the perspectives afforded by particular methods rather than of a misdiagnosis on someone’s part.

Art and Style
Appelbaum (1970) also suggested that writing effective test reports involves art as well as science and salesmanship. He noted: “Just as advertising people do, the writer of psychological reports may use techniques of art (again, with different motives as well as technical differences). As does the artist, the writer of a test report may arouse expectations, build, sustain and relieve tension, integrate and recapitulate” (p. 351). By providing a transition from statements of symptoms and referral questions to the body of reports, “Behavioral Observations” can make important contributions in these respects as well.
Because behavioral and characterological problems manifest themselves so readily in the test situation, a description of how patients present themselves and act often provides readers with an experience of the clients’ difficulties that goes far beyond what is often a generic list of symptoms in the introduction. What is meant by “hyperactivity,” for example, can be brought home to the reader by a vignette describing trying to test this particular child while he is sitting on, under, and through his chair. Similarly, readers gain an appreciation of a referral problem stated as “problems managing aggression” when an examiner recounts walking on eggshells as she tries to cope with a client’s simmering rage and subtle, and at times none-too-subtle, intimidation.
the child’s effort to maintain a tough, antisocial facade, readers are likely to be curious about why. Or the contrast between presenting problems and the psychologist’s observations may highlight central issues that reports need to address. Why is a child who is impulsive, hyperactive, and distractible at school capable of working diligently for extended periods of time in the test situation? Why does a client others describe as psychologically minded and insightful leave the examiner feeling frustrated and annoyed as she tries to use the testing to explore and clarify the man’s problems? What is one to make of the cheerful, relentlessly upbeat woman whose history has been one of repeated trauma and loss? Such questions capture the reader’s attention and prepare the way for answers that may be critical in recommending particular forms of treatment.
Even where “Behavioral Observations” are consistent with both presenting symptoms and test findings, they can provide an initial statement of a theme that subsequent sections of reports will develop with increasing richness. For example, with Don, the adolescent described in the introduction, “Behavioral Observations” give readers a sense of his obsessional and schizotypal characteristics. Subsequent sections of his report use data from the tests to amplify the particular qualities of his thinking, his relationships with others, and his central conflicts and defensive structure.
The stylistic contributions of “Behavioral Observations” to reports can be as varied as the individuals tested, the goals of reports, and the talents and ingenuity of their writers. What is important for present purposes is a recognition of the diverse opportunities these sections afford for making their subjects come alive and transforming reports into something more than dry inventories of test findings.

Conclusion: The Test Situation as a Test
Although there is no TSAT, the test situation in fact resembles a projective test. Like a TAT card, it has a structure to which all clients must respond in some way. Tester and test taker play more or less objective roles according to prescribed social conventions, and the tests are standardized tasks to which all subjects must respond. Yet as with a TAT story, the meaning clients and examiners attribute to each aspect of the test situation may be
Palmer, J. O. (1983). The psychological assessment of children (2nd ed.). New York: John Wiley & Sons.
Peebles-Kleiger, M. J. (2002). The art and science of planning psychotherapy. Hillsdale, NJ: Analytic Press.
Phares, E. J., & Tull, T. J. (1997). Clinical psychology: Concepts, methods, and profession (5th ed.). Pacific Grove, CA: Brooks/Cole.
Polanyi, M. (1958). Personal knowledge: Towards a post-critical philosophy. Chicago: University of Chicago Press.
Rapaport, D., Gill, M. M., & Schafer, R. (1945–1946). Diagnostic psychological testing (2 Vols). Chicago: Year Book.
Sattler, J. M. (1988). Assessment of children’s intelligence and special abilities (3rd ed.). San Diego: Jerome M. Sattler.
Schachtel, E. (1966). Experiential foundations of Rorschach’s test. New York: Basic Books.
Schafer, R. (1954). Psychoanalytic interpretation in Rorschach testing. New York: International Universities Press.
Schafer, R. (1956). Transference in the patient’s reaction to the tester. Journal of Projective Techniques, 20, 26–32.
Schechtman, F. (1979). Problems in communicating psychological understanding: Why won’t they listen to me?! American Psychologist, 34(9), 781–790.
Schlesinger, H. (1973). Interaction of dynamic and reality factors in the diagnostic testing interview. Bulletin of the Menninger Clinic, 37, 495–517.
Seagull, E. A. W. (1979). Writing the report of a psychological assessment of a child. Journal of Clinical Child Psychology, 8, 39–42.
Shevrin, H., & Schectman, F. (1973). The diagnostic process in psychiatric evaluations. Bulletin of the Menninger Clinic, 37, 451–494.
Sugarman, A. (1978). Is psychodiagnostic assessment humanistic? Journal of Personality Assessment, 42, 11–21.
Sugarman, A. (1981). The diagnostic use of countertransference reactions in psychological testing. Bulletin of the Menninger Clinic, 45, 473–490.
Tallent, N. (1988). Psychological report writing (3rd ed.). Englewood Cliffs, NJ: Prentice-Hall.
Terman, L., & Merrill, M. (1960). Stanford-Binet Intelligence Scale. Boston: Houghton-Mifflin.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children, third edition: Manual. San Antonio, TX: Psychological Corporation.
Wolber, G. J., & Carne, W. F. (2002). Writing psychological reports: A guide for clinicians (2nd ed.). Sarasota, FL: Professional Resource Press.
Woody, R. H., & Robertson, M. (1988). Becoming a clinical psychologist. Madison, CT: International Universities Press.
Robert J. Craig
Abstract
This chapter presents the anatomy of the basic clinical interview. We discuss its structure, content,
factors affecting the clinical interview, interviewing techniques, types of clinical interviews, patient and
psychologists’ expectations, interview reliability, and computer-assisted clinical interviews.
In this chapter, I use the term “clinical interview” to mean the assessment of personality, personality disorders, and psychopathology using the clinical interview. There are other (supplemental) means of performing this clinical function, such as collateral information, record reviews, behavioral observations, psychological testing, and other sources of anamnestic data. Indeed, when these supplemental sources are consulted, the accuracy of clinical diagnoses and assessment conclusions is often enhanced. I lament the fact that too often clinicians rely primarily on the clinical interview as their major tool for assessment and diagnosis. This is rarely the case among forensic psychologists, who routinely consult these sources as part of their exploration to determine fact. However, in this chapter, the focus is exclusively on using the clinical interview devoid of these other sources, even though, in clinical practice, such overreliance is frowned upon.
Also, in this chapter, I use the terms “client” and “patient” interchangeably. Psychologists continue to argue which is the preferred term to use. Clinicians operating in psychiatric outpatient clinics and hospitals generally use the term “patient,” while clinicians working in private practice or in university mental health clinics often use the term “client.” Both terms are used in this chapter with no value pinned on one or the other.
Finally, this chapter focuses exclusively on clinical interviews with adults. There are other sources on interviewing children (Hersen & Thomas, 2007a; Logan, 2005) and the reader is urged to consult these authoritative sources.
Matarazzo (1965, 1978) was among the first psychologists to study the anatomy of the clinical interview, reporting on its reliability and validity in a plethora of clinical usages. Matarazzo was able to take a unilateral perspective of the clinical interview because special focus interviews had not yet been developed. Contemporary usage of the clinical interview has evolved into more standardized and semistandardized structured and semistructured clinical interviews, such that it is no longer appropriate to discuss the reliability and validity of a clinical interview. Rather, we must now study the reliability and validity of a specific (kind of) interview under consideration (Craig, 2003). Indeed, a substantial number of clinical interviews have now been published that are designed to aid in the diagnosis of major psychiatric syndromes and clinical personality patterns/disorders.

Anatomy of a Clinical Interview
The elements of a clinical interview include its structure, theoretical orientation of the clinical interviewer, the expectations of both client and clinician, as well as the content and process of the interview.
Clinician’s Theoretical Orientation
Patients come to an interview with certain expectations; clinicians also approach a clinical interview with their own set of predispositions. They do not enter the room with a blank slate. Rather they approach the patient with a philosophical orientation that influences the structure, process, and content of the interview. This theoretical framework influences the areas and methods of inquiry, how they conduct the evaluation, and how they understand the meaning of elicited client information. Let us take a closer look at how one’s theoretical orientation influences the interview process. We will briefly discuss the interview process from the perspective of the psychodynamic, phenomenological-existential-nondirective, behavioral and family systems orientations.

PSYCHODYNAMIC
Clinicians operating from a psychodynamic philosophical orientation emphasize the role of early developmental history, trauma, vulnerability to self-esteem injury, and unconscious conflict in the development of psychological maladjustment (Yalof & Abraham, 2005). Drive theory, self-psychology, and object relations theories may be the philosophical substrate that guides the clinician’s thinking. However, it is not unusual to find both the models of intrapsychic structure and interpersonal relations in a psychodynamic interview.
Given this background, psychodynamic clinicians pay particular attention to the way the patient is relating to the clinician and are continually observing for the development of transference, even in the initial session. They view the resolution of this transference as the sine qua non of treatment. Interpretations are the main source of interventions and may be attempted within the first session, if the clinician deems this appropriate. Both supportive and exploratory interventions may occur here.

CLIENT-CENTERED/PHENOMENOLOGICAL/EXISTENTIAL
Client-centered therapy was developed by Carl Rogers (1942, 1951), who argued that empathy, genuineness and congruence, and unconditional positive regard practiced by the clinician toward the client (Rogers did not believe in calling the person a “patient”) are the necessary and sufficient conditions that allow for client self-actualization and personal growth (Rogers, 1957). A client-centered approach anchors itself in phenomenological theory. Phenomenology holds that the object of study is people’s perceptions and life experiences. The therapist’s goal is to understand these in terms of the patient’s intentions, purposes, and personal meaning and to come to a clear understanding of the patient’s underlying assumptions and presuppositions (Gruba-McCallister, 2005). A patient’s behavior is totally determined by their beliefs and life experiences. The focus is on the here-and-now and how the self develops and grows amid these personal beliefs.
Existential theory is closely allied with these two humanistic approaches. Existential theory posits that the search for meaning is the basic motivation underlying behavior (Husserl, 1962). The goal of therapy is to help the person discover their true meaning. The clinician is a participant-observer in the phenomenological world of the client.

COGNITIVE BEHAVIOR
Cognitive-behavior theory anchors itself within a behaviorist epistemology, where behavior is a function of antecedents and consequences (i.e., rewards and punishments) within a specific environmental context. It relies on classical and operant conditioning models as well as the role of modeling in shaping human behavior. All behavior is learned and there are no mediating variables (i.e., traits) to explain behavior. The clinical interview is the start of a “functional analysis of behavior” which helps the client identify the antecedent conditions to the problem, the cognitions that relate to the problem, and the consequences of problematic behavior. Interventions may begin within the initial interview. For example, the family of an enuretic might be asked to begin to self-monitor and record the extent and frequency of bedwetting, knowing that the act of self-monitoring may also reduce the behavior being monitored. An alcoholic man might be told to vary his route home in an effort to break the chain of behaviors that leads him into the bar after work.
Cognitive behavior therapists have offered suggestions as to the content of the initial interview. Lazarus (1976) suggested that the comprehensive assessment of the patient follow the mnemonic device BASIC ID, where

B = behavior, especially those that are problematic;
A = affective responses, especially those requiring modification;
S = sensory deficits (e.g., emotional pain);
I = imagery (such as fantasies);
C = cognitions, especially those that are illogical (Ellis, 1962);
unstructured approach within the interview process so as to not interfere with the cognitive and emotional flow of the client. These types of interviews will be elaborated on later in this chapter.
With the introduction of DSM-III (APA, 1980), and continuing through DSM-III-R (APA, 1987) and DSM-IV (APA, 1994), a large number of structured and semistructured clinical interviews began to be published in the professional literature. Initially they were comprehensively focused on surveying major Axis I syndromes, such as the Schedule for Affective Disorders and Schizophrenia (SADS) (Endicott & Spitzer, 1978), the Diagnostic Interview Schedule (DIS) (Robins, Helzer, Croughan, & Ratcliff, 1981), and the Structured Clinical Interview for DSM-III (SCID) (Spitzer & Williams, 1984). Then structured clinical interviews began to appear for the assessment of Axis II personality disorders, such as the Structured Clinical Interview for DSM-III-R (SCID-II) (Spitzer, Williams, Gibbon, & First, 1992), the Personality Disorder Diagnostic Interview (Widiger, Mangine, Corbitt, Ellis, & Thomas, 1995), and the Diagnostic Interview for DSM-IV Personality Disorders (Zanarini, Frankenburg, Sickel, & Yong, 1995). Thereafter followed the development of structured and semistructured clinical interviews for specific psychiatric syndromes, including anxiety disorders (Spitzer & Williams, 1988), depression (Jamison & Scogin, 1992), and posttraumatic stress disorder (Watson, Juba, Manifold, Kucala, & Anderson, 1991). Today, structured clinical interviews have been published for most major psychiatric disorders. (For a more detailed presentation of these instruments, see Craig, 2003, and Rogers, 2001.)
The SADS is a semistructured diagnostic interview consisting of two main sections. The instrument is designed to reduce the need for clinical judgment. The first section asks questions to ascertain a patient’s symptoms and their severity. The level of impairment is determined by comparing the patient’s responses to standard descriptions. The second section queries mental disorders clustered within each main section of the DSM. This section addresses history of current episode, mood, symptoms, and impairment. The instrument takes 1–2 hrs to administer.
The DIS is a structured clinical interview. The clinician reads the questions exactly as they appear in the interview booklet without subsequent probing. There are 32 different diagnoses assessed with the DIS. Current symptoms are assessed for the past 2 weeks, 4 weeks, 6 months, and 12 months. It takes 45–90 min to administer.
The SCID is a semistructured clinical interview with an administration time of 60–90 min. The patient initially describes the problem and the clinician responds with open-ended questions guided by the interview booklet. After each section, the examiner rates the severity level of the disorder as mild, moderate, or severe. DSM provides a hierarchical structure to its organization (e.g., substance use disorders, psychotic disorders, affective disorders, anxiety disorders), and the SCID follows this hierarchical structure in its questioning. The SCID-II was then developed to assess DSM personality disorders.
Within psychiatry the SCID/SCID-II has become the most frequently used (semistructured) clinical interview and is the standard to diagnose Axis I and Axis II disorders in psychiatric research.
Table 12.1 presents an illustration of how these structured clinical interviews appear in actual text. Screening for antisocial personality disorder is highlighted.

Structure of the Initial Clinical Interview
Even an apparently unstructured clinical interview may actually have a structure. The social psychiatrist Harry Stack Sullivan believed that interpersonal and social relationships were more important in the derivation of psychological disorders than unconscious conflicts and arrested psychosexual development. Sullivan published no material on his seminal ideas, but his psychiatric residents recorded his notes and subsequently published them under his name in a critical book entitled The Psychiatric Interview (Sullivan, 1954).
Sullivan argued that the structure of the clinical interview consisted of four parts: (a) formal inception, (b) reconnaissance, (c) detailed inquiry, and (d) termination.
In the first phase of the interview, or formal inception, the clinician learns why the client is seeking assistance. Sullivan believed this should take no more than 5 min. In the reconnaissance, the clinician should take no more than 20 min to get an overview of the patient, and ascertain history, strengths, defenses, and associated concurrent comorbidities. In the detailed inquiry, the clinician develops ideas as to the cause of the problem as well as a working plan to assist in problem resolution. The last phase (termination) should take 5–10 min and provide the patient with an action plan as well as address basic issues such as frequency of future sessions, fees, confidentiality, and the like.
Conduct disorder
1. Prior to the age of 15, have you ever done things for which you — —
could have been arrested?
If yes, ask: What were some of those things?
Have you ever been arrested?
Have you ever spent time in jail? How long?
Deceitfulness, manipulations & conning
2. Would you say that you are especially good at fooling people? — —
If yes, say ‘‘Give me some examples’’
Impulsivity
3. Do you prefer to (a) think about things before you act or (b) jump right in? — —
If ‘‘a’’ ask ‘‘Give me two examples’’
Aggressiveness and fighting
4. Have you ever engaged in physical fighting? — —
If yes, ask ‘‘What were the circumstances of these fights?’’
Recklessness
5. Do you take chances that others might consider reckless? — —
If yes, ask ‘‘Give me some examples’’
Irresponsibility
6. Have you ever lost, been fired, or walked off a job? — —
If yes, ask ‘‘What were the reasons you lost these jobs?’’
7. Are you in debt? — —
If yes, ask “What for and how much do you owe?”
Lack of guilt or remorse
8. Do you feel that people have abused you, hurt you, or in other ways — —
mistreated you?
If yes, ask ‘‘Give me some examples’’
9. In general do you feel guilty when you physically or emotionally hurt — —
someone?
If no, ask “Why not?” (look for justifications and rationalizations)
Criminality
10. As an adult have you been in trouble with the law? — —
If yes, ask “Why were you arrested?”
“What did they charge you with?”
“How much time have you spent in jail?”
Maintenance of independence
11. Do you maintain your independence at all costs? — —
If yes, ask ‘‘Why is that?’’
12. Do you think that others try to control you? — —
If yes, ask ‘‘Give me some examples’’
13. Record here any additional question(s) you may have asked and the patient’s response(s)
to them.
14. Record below the patient’s (a) way of relating to you, (b) degree of cooperativeness,
and (c) manner of dress.
15. Rate the overall truthfulness of the patient’s responses.
(A) Probably truthful — (B) Mostly truthful —
(C) Somewhat truthful — (D) Unlikely to be truthful — (E) Uncertain —
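For readers interested in computer-assisted administration, the branching pattern of the screening items above (a yes/no question, followed by a probe only when a particular answer is given) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not part of any published instrument: the names ScreeningItem, probe_on, and follow_up_prompts are hypothetical, and only the question wording is taken from the table.

```python
from dataclasses import dataclass, field


@dataclass
class ScreeningItem:
    """One yes/no screening question plus the probes asked on a trigger answer."""
    domain: str
    question: str
    follow_ups: list = field(default_factory=list)
    probe_on: str = "yes"  # hypothetical field: the answer that triggers the probes


def follow_up_prompts(item, answer):
    """Return the probes to ask next, or an empty list if none are triggered."""
    if answer.strip().lower() == item.probe_on:
        return list(item.follow_ups)
    return []


# Wording copied from three items in the table; the selection is illustrative.
ITEMS = [
    ScreeningItem(
        "Aggressiveness and fighting",
        "Have you ever engaged in physical fighting?",
        ["What were the circumstances of these fights?"],
    ),
    ScreeningItem(
        "Recklessness",
        "Do you take chances that others might consider reckless?",
        ["Give me some examples"],
    ),
    ScreeningItem(
        "Maintenance of independence",
        "Do you maintain your independence at all costs?",
        ["Why is that?"],
    ),
]

if __name__ == "__main__":
    # Show which probes a "yes" answer would trigger for each item.
    for item in ITEMS:
        print(item.domain, "->", follow_up_prompts(item, "yes"))
```

Keeping the trigger answer as data rather than hard-coding “yes” allows for items whose probe follows a “no” response, while the clinician’s judgment about additional questions and overall truthfulness (items 13 to 15) remains outside the sketch.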
Other clinicians have sought to provide a structural model of a clinical interview from a different theoretical perspective. For example, Sullivan took a perspective in creating his structural model, whereas Benjamin (1981) took a psychosocial perspective to the clinical interview and divided it into three main stages called initiation (i.e., statement of the problem), development (where client and clinician mutually agree on the nature of the problem), and closing.
From a behavioral perspective, the clinical interview has been divided into role structuring, creating a therapeutic alliance, developing a commitment for change, analyzing behavior, negotiating treatment, and planning and implementation (Kanfer & Scheft, 1988). More recently, Beach (2005) argued that a behavioral clinical interview should consist of the identification of problematic behaviors, a determination of the variables causally related in a behavioral chain to the development of the problem, identification of patient characteristics that will facilitate or retard treatment, the assessment of environmental variables that will positively or negatively reward behavior, and finally selecting a behavioral treatment method.
Beach (2005) has recommended that the initial interview begin with a definition of the presenting problem, invoking one of the assessment models previously discussed (i.e., Wolpe, Lazarus, and Kanfer & Scheft), developing a preliminary hypothesis for variables affecting the patient’s behavior, testing this hypothesis through detailed questioning of variables controlling the patient’s behavior, and then designing a treatment approach that will lead to problem resolution and behavior change.
From a client-centered perspective, Carl Rogers characterized the clinical interview in the following manner: the client comes for help, the situation is defined through therapist acceptance, clarification, and the expression of positive feelings, and finally there is the development of insight (Rogers, 1942).
I have adapted Sullivan’s model of the structure of a clinical interview both in my clinical practice and when leading seminars in clinical interviewing among graduate students in clinical psychology. I conceive of the interview as consisting of an “introduction,” which roughly corresponds to Sullivan’s formal inception phase. Here we learn why the patient is seeking mental health services. It is also the phase where we begin to establish both trust and rapport. This can be facilitated by discussing privacy and confidentiality issues, addressing the clients by their preferred name, arranging seating to promote observation, avoiding unnecessary interruptions, using language that is appropriate to the client, being nonjudgmental, and displaying empathy.
I call the next phase “exploration.” This phase corresponds to Sullivan’s reconnaissance and detailed inquiry. In this phase, the clinician ascertains a clinical diagnosis and associated comorbidities, and develops an initial hypothesis based on his or her theoretical orientation. You may characterize the nature of the problem as an unbalanced family hierarchy with triangulated communications (a family theory perspective), problematic negative reinforcements (a behavioral perspective), or distorted object relations and arrested psychological development (a psychoanalytic perspective). The important point here is to develop a working hypothesis that is tested within subsequent moments of the interview. Eliciting this kind of information can best be accomplished by asking open-ended questions (see below), intervening at critical points of client elaborations, clarifying any inconsistencies, and interrupting in ways that suggest clinician competence.
I need to digress somewhat here to comment on the importance of “assessing the syndrome” in this phase of the interview. By that I mean that the clinician demonstrates his or her knowledge of the problem by inquiring about symptoms, problems, and disorders known to be associated with the primary problem or diagnosis. For example, suppose that you are a diabetic and experience a job relocation that requires you to find another physician to help manage your diabetes. You receive the names of two physicians and contact them to make an appointment, stating that you are looking for someone to treat your elevated glucose levels. Both of them give you an appointment and instruct you to go to a local lab for a blood test. When you see Dr. A, s/he reviews your test results, takes your medical history, and prescribes 500 mg of Glucophage (metformin) BID. Next you visit Dr. B, who reviews your blood test results, takes a medical history, examines your heart, checks your extremities for peripheral neuropathy, orders a 24-hr kidney functioning test, recommends that you see an ophthalmologist annually to check for diabetic retinopathy, and also suggests you take a cardiac stress test. S/he also prescribes 500 mg of Glucophage (metformin) BID. Other things being equal, which physician will you select? I believe you will choose Dr. B, who, based on the actions within the interview, has demonstrated thorough knowledge of the syndrome being treated. Dr. A may be just as knowledgeable
Table 12.2. Overview of the structure of a clinical interview.
Phase Requirements
purposes, but in reality, their boundaries are permeable and clinicians often engage in several of these activities within a single session.

BRIEF SCREENING INTERVIEWS
The brief screening interview is distinguished by its time-limited and focused structure designed to address a specific question. Is the patient suicidal? Is the patient dangerous to others? Is the client psychologically fit to stand trial? Does the patient need to be hospitalized? Is medical referral needed? Does the patient have an adequate relapse prevention plan? Does the patient have acceptable aftercare plans? Does the patient understand the consequences should s/he test positive for HIV? In most cases, a referral for some other mental health intervention follows the brief clinical interview.

CASE HISTORY EXAMS
Case histories often comprise the richest part of a clinical interview. Occasionally they are conducted separately, but more often than not they are part of a routine clinical interview. In some cases a more detailed sketch of the patient’s background is required,
phase dysphoria.” Of course only a physician would be able to diagnose the phase of the menstrual cycle. Women’s advocacy groups also objected to the implication that they would be deemed normal for 3 weeks out of every month, but would have a psychiatric disorder 1 week out of every month. These groups also objected to the possible inclusion of a sadistic and a masochistic personality disorder. They argued that women in abusive relationships might be diagnosed with a masochistic personality disorder, whereas other explanatory models might also explain why they remain in these situations (e.g., the psychology of victimization, women socialized into care-taking roles, lack of independent economic viability). They also feared that rapists might receive attenuated prison sentences because they would be diagnosed with a sadistic personality disorder.
There were other complaints about the DSM, but the important point here is to recognize that the DSM is both a scientific and a sociopolitical document.

ETIOLOGIC INTERVIEWS
Regardless of theoretical orientation, all clinicians want to understand why the patient is behaving the way s/he is behaving. This requires some formulation as to motivation. This motivation may be unconscious, such as in psychodynamic or family theory formulations, or it may be semiconscious, as in client-centered formulations, or simply a matter of failing to consider the various environmental influences and conditionings, as explicated by the behaviorists. At the end of the first interview, the clinician should have a working hypothesis as to the cause of the problematic behavior. This will lead toward goals of treatment.

FORENSIC INTERVIEWS
Forensic interviews address such issues as competency to stand trial, the question of legal insanity, psychological damage following industrial accidents, assessment of sexual predators, and child custody evaluations. Many times the interviewee is referred to as a defendant rather than client or patient.
A forensic interview differs in a number of ways from a traditional clinical interview. First, contrary to clinical interviews, forensic interviews are not confidential. In fact, the clinician must inform the defendant that any and all material uncovered in this process can and will be reported to the court. Second, while clinical interviews take a supportive, accepting, and empathic stance toward the patient, the forensic interview adopts a more investigative and probative role in seeking the truth. Third, the goal of the clinician interviewer is to help the client, whereas the goal of the forensic interviewer is to help the court. Fourth, while a psychotherapist will see the client over multiple sessions, the forensic clinician may see the client only once or twice. Fifth, the forensic clinician spends more time examining records and historical material, compared to the usual clinical work. Finally, the referred patient may not be the “patient” to the forensic clinician. Rather, the court or the referring attorney may be the actual client here (Craig, 2005a).

FOLLOW-UP INTERVIEWS
These are generally special-focus evaluations to determine the status or outcome of previously delivered services. The most frequent example of this kind of interview is a researcher who contacts a patient 1 year after termination of treatment to determine the effectiveness of an intervention. I once had a contract with an HMO to screen the patient, determine the diagnosis and the kind of treatment required, and authorize the number of sessions to resolve the problem. The contract required me to interview the patient at the midpoint of treatment and at the end of treatment to determine outcome and patient satisfaction with the services.

INTAKE INTERVIEWS
Intake interviews generally occur within agencies that need to determine if the client is eligible for services or whether a referral to more specialized treatment is warranted. They are also used to communicate agency policies and procedures, or to gain enough clinical material to present the case at a clinical case conference, where treatment goals and perhaps therapist assignment are decided.

MENTAL STATUS EXAMS
Mental status exams may be conducted to determine the amount of mental impairment and cognitive deficit associated with the given clinical disorder. These exams are routinely conducted for major psychiatric disorders and substance-induced disorders or when neurological complications are suspected for a given disorder. Several mental status exams have been published (e.g., Folstein, Folstein, & McHugh, 2005), but most only provide content areas that should be addressed during these exams and provide only qualitative rather than quantitative findings. Furthermore, these standard-type measures are more typically used in neurological rather than in psychiatric settings.
Mental status exams may be conducted using standardized instruments or, more generally, they are given while discussing other content areas. For
Appearance: Level of consciousness; Eye contact; Grooming; Hygiene; Position of body; Physical appearance; Dress/clothing; Gait; Distant; Sensory impairments
Attention: Distractibility
Attitude: Cooperative; Uncooperative; Guarded; Defensive; Candid; Subdued; Indifferent/lethargic; Seductive
Insight: Intact comprehension; Impaired
Language: Audible; Clear; Appropriate; Expressive/receptive
Memory: Immediate; Remote; Short term; Long term; Recent; Dissociative; Amnestic; Confabulation
Mood (affect): Euphoric; Dysphoric; Anxious; Flat; Apathetic; Appropriate; Hostile; Manic; Alexithymic; Euthymic; Bland; Restricted; Labile; Blunted; Exaggerated; Irritable
Movements: Tics; Spasms; Compulsions; Automatic, spontaneous; Voluntary; Involuntary
Perception: Hallucinations; Delusions; Depersonalization; Derealization; Superstitions; Déjà vu
Orientation: Time; Place; Person; Space
Speech: Articulation; Aphasic; Stream of consciousness; Stuttering; Volume; Rate; Loquacious; Fluency; Poverty (minimal speech); Mute; Pressured; Slow; Impoverished
Thought content: Blocked; Overinclusive thinking; Clanging; Perseverations
continued
CRAIG 211
Table 12.4 (continued).
only utilize a few of these techniques (e.g., reflection, silence), while others (e.g., psychodynamic) may place emphasis on a particular technique (e.g., interpretation). Still, the eclectic clinician will tend to use most if not all of the techniques listed below.

QUESTIONING
We begin this section with the use of questioning, which is the heart of most clinical interviews, especially the initial clinical interview, which is generally replete with questioning as a predominant interviewing tool. In general, questioning is the fastest way to generate information and so this technique is frequently employed in most clinical interviews.
There are two basic kinds of questions: open and closed. The latter often begins with words such as who, where, when, and why. These are direct questions that elicit a closed response, especially when they are framed to elicit a “yes” or “no” response. The client will then wait for the interviewer to ask the next question. Excessive use of this method is regressive and stifles true communication. Closed questions should be used sparingly. An example of a direct or closed question is “How many times have you been fired from a job?”
An open-ended question allows for a fuller range of client response and elaboration, and facilitates the interviewing process. Open-ended questions do not evoke a “yes” or “no” response nor a specific reply. An example of an open-ended question is “How do you feel the next morning after you wake up with yet another hangover?”
Students learning how to interview tend to overrely on the method of direct questioning and generally have trouble framing their questions so that they elicit an open-ended response. It actually requires clinical acumen to frame a question so that it elicits a maximum return while promoting the free-floating communication required for a good interview. To demonstrate the awkwardness of an overreliance on direct questions, role-play a clinical interview and ask only direct questions for 5 min. At the end of this time, ask your partner how s/he felt when being asked these questions in this way. Reverse roles now: the counselor becomes the client and is required to answer only closed-ended questions from the examiner. At the end of this time, process how you felt while being asked questions in this format.
Both open- and closed-ended questions will be necessary to conduct a competent clinical interview. Try to avoid too many closed-ended questions because they inhibit the flow of the session.

ACCURATE EMPATHY
Empathy is defined as the ability of the clinician to understand the client from the perspective of the client. Most people want to be understood, and demonstrating accurate empathy (along with active listening) communicates to clients that you understand them. Accurate empathy is not feeling what the client is feeling. Rather it is understanding what the client is feeling. Sometimes this is not really expressed directly, but can be palpably felt within the session.
Example:

Client: You know, doc, I want to stop using drugs and actually stopped for three weeks. Then I went into this bar and picked up a girl and we got a room at a hotel and she pulled some cocaine out of her purse and so I used because I didn’t want her to think I wasn’t a man. Afterwards I felt awful and thought to myself, “How could I do that?” You know the same thing happened when my mom found drugs in my room. She yelled at me and I just sat there and took it.
Therapist: You know the common thread to both of these situations is your guilt over your drug-using behavior. When you’re in a situation where you might be criticized, you begin to feel less than a man.
Client: You know, I think you’re right.

Active listening and the expression of accurate empathy are techniques that should be used throughout the clinical interview. A long history of psychological research has demonstrated that these techniques (along with unconditional positive regard and acceptance) are related to positive outcomes in mental health. They actually are more of an attitude than a technique.

ACTIVE LISTENING
Active listening is defined as the ability of the clinician to accurately state the content, feeling, and/or meaning of what the patient has just communicated to you. It actually is one of the most important tools in the clinician’s armamentarium. It requires you to “listen” to what the client says, “restate” their words using “reflective” statements, and then “observe” the patient’s response to your reflection. The client’s reaction to your intervention will tell you whether or not you have accurately understood the client.
In order to promote active listening, you must position yourself vis-à-vis the patient so you can directly observe verbal and nonverbal
counseling should not use this without first discussing it with their clinical supervisor. Even seasoned counselors should only use confrontation sparingly, if at all.
Example:

Client: I think my problem drinking is behind me.
Therapist: Your wife told me you came home drunk last night!

EDUCATION
It has been said that education is the answer when ignorance is the problem. Sometimes it is necessary to educate a client as part of the interview process. Psycho-immunology is a term used to describe efforts by a psychologist to educate a patient in order to prevent problematic behaviors. Education is a technique frequently used by substance abuse counselors to help clients learn about their alcoholism. It deserves a place in our interviewing technique repertoire.
Example:

Client: After I’m discharged from this program, I plan to go back to my musical career. However, I assure you I’m through using drugs. Besides I have learned that using drugs the way I did puts me at risk for getting AIDS and believe me I don’t want to get that.
Therapist: That’s good news! However, you also have a history of picking up women and the women you picked up were having sex with guys who were also using drugs so you could get AIDS from them too.

EXPLORATION
Rarely does a patient provide information in sufficient detail such that no further questions are warranted. Furthermore, some areas require a more in-depth inquiry than others so that most clinicians will be required to engage in additional questioning in order to get a fuller picture of the issue at hand. Most clients expect to be questioned and generally do not resent a clinician’s further inquiry. They may even wonder (to themselves) why certain areas were not addressed further. The mental health interviewer should not be reluctant to explore areas that many consider sensitive.
Example:

Client: When I was in Vietnam, my unit was overrun by the Viet Cong, so I had to lay there and pretend to be dead.
Therapist (sample exploration questions): How long did you lie there? What about your bodily functions? How were you rescued? And so on.

HUMOR
Originally, clinical interviewing was viewed as a distinctly professional interaction between client and clinician and one where humor played no role in the process. Freud believed that humor was the highest defense mechanism, but thought that the use of humor by the psychoanalyst was evidence of countertransference. We are now beginning to realize the value of humor in mental health interventions. In fact, there are conferences devoted exclusively to the role of humor in psychotherapy.
As with any technique, it should be used in moderation and only to benefit the client. Such benefits may include the reduction of anxiety in the patient, facilitating the clinical interviewing process and enhancing the flow of the interview. However, there is the danger that the client may not take you seriously so one has to be careful and judicious when injecting humor into the session.
Example:

Client: Sometimes, Doc, I do so many strange things that I don’t even recognize myself. Maybe I have a split personality.
Clinician: In that case, the fee will be $100 each.

INTERPRETATION
One area that separates psychologists and psychiatrists from many other mental health professionals is the use of interpretation, particularly psychodynamic interpretation. Freud considered interpretation of the unconscious as the sine qua non of psychoanalysis, but felt that the analyst had to wait until transference became obvious before the analyst should begin to use this technique. Freud believed that human motivation was largely outside the conscious awareness of the analysand and that one of the goals of therapy was to make the unconscious conscious.
Interpretation is essentially exploring a client’s motivating factors in their life so that they are understood and altered when necessary. Interpretation tries to link two or more streams of consciousness (i.e., free association), events, or behaviors so that this linkage reveals a common source. Interpreting a client’s behavior and motivation should always be done carefully. It is a difficult technique to master and requires comprehensive knowledge of a patient’s history, motivation, and dynamics. It requires expertise in personality theory and should rarely be done by clinicians in training unless they first discuss this with their clinical supervisor.
realize that telling the patient personal information about ourselves can be therapeutic, if used in a judicious manner.
Conveying personal information can be counterproductive, however, and should always be done to facilitate the therapeutic process. Clinicians should ask themselves, “Why am I telling this patient this information?” Usually it is done to reduce an emotional blockage in a client, to promote elaboration, or to reduce resistance. Substance abuse counselors with a personal history of addiction who are now in recovery often tell alcoholics that they, too, are in recovery. For many such patients, it seems to promote understanding between these two parties.
Example:

Client: How would you know what I’m going through?
Clinician: Because I’m an alcoholic and have been in recovery for over 15 years now.

SILENCE
Sometimes the best response is no response. Carl Rogers (1951), in fact, argued that silence can and does have many beneficial effects and discussed a case wherein he had said nothing during the entire session. The next session the client came in and thanked Rogers for how much he had helped him in the previous session. Rogers began to reflect on what had occurred during this process and came to the conclusion that silence allows the client to process his thoughts and feelings in a safe, nonjudgmental environment.
Silence allows clients to recompose themselves after an emotional expression, and helps promote introspection. It can help clients consolidate and integrate session material and move the therapeutic process forward. Again, it must be done in a timely and appropriate manner. The important point here is that clinicians do not always have to say something in response to a client’s statements. Sometimes a clinician’s too rapid response can actually interfere with the clinical interview.
Counselors in training often feel silence on the part of a client as a dreadful experience. They feel somewhat inadequate and tend to believe that interviews must always proceed with continuous verbiage. Silence may occur because of clinician inadequacy as an interviewer, but it may also occur as part of the clinician’s intervention as a clinical tool.
Example:

Client: This guy at work keeps harassing me. He never agrees with anything I say, he continuously interrupts me, says bad things about me behind my back and recently told people that he thinks I’m gay. I swear one of these days I’m going to meet him after work in the parking lot and make sure that he never has children, if you know what I mean.
Clinician: (no response).

CLINICIAN ATTITUDES
These interviewing techniques are also used in conducting psychotherapy. A few other “attitudes” deserve mention, even though they are not specific interviewing techniques. I am referring to the qualities of acceptance, positive regard, and a nonjudgmental frame of reference.
Students have frequently discussed the difficulty they have in accepting a client’s behavior, which often is heinous and incorrigible. The same students talk about the reprehensible behaviors they hear of in counseling sessions and feel quite judgmental about them. They report difficulty in having any feelings of positive regard for such clients.
Carl Rogers and his students have argued that acceptance, unconditional positive regard, and empathy are the necessary and sufficient conditions of behavior change and have produced a prodigious amount of research to back up their claims. The essential thing to remember is that we are accepting the person and not the person’s behavior.

SUMMARIZING
Briefly summarizing what a client has conveyed to you in the interview may be a useful tool because it can convey to the client that you understand the problem or issue at hand and allows the client to correct any clinician misunderstandings.
Example:

Clinician: OK. So the main issue as I understand it is that you feel stuck in an abusive relationship. You feel guilty for thinking about leaving him because your religion teaches you to stay in a marriage and try to work things out, and you don’t have the economic means to live independently. One solution you have thought about is suicide but, again, your religion teaches you that this is a serious sin and you are afraid to see a psychiatrist for medication because this will upset your husband, who might beat you again for spending money unnecessarily.

A Practice Exercise
Using the example below, practice each interviewing technique listed above. Record your answers and discuss them with other students who should do the same.
Table 12.5. Basic clinical interviewing techniques.

Technique: Questioning (closed)
Client statement: I got arrested again for a DUI.
Clinician response: How many DUI arrests do you have?

Technique: Questioning (open)
Client statement: I got arrested again for a DUI.
Clinician response: What were the circumstances?

Technique: Active listening
Client statement: For once I have been able to do something without the interference of my parents. It feels good to know that I can do it.
Clinician response: You can feel good to know that you can be independent.

Technique: Accurate empathy
Client statement: Whenever I see these good looking women in their skimpy outfits I get so aroused I want to jump them then and there.
Clinician response: It’s difficult for you to control your feelings and behavior when you get so aroused.

Technique: Advising
Client statement: I was fired from my job because I failed a drug test. I used marijuana at a party and they let me go.
Clinician response: They can’t do that. The law states they have to allow you to get help. Mention this to your boss or see an EEO counselor at work.

Technique: Clarification
Client statement: I got arrested but it wasn’t my fault. I was with some people and just got caught up in the action.
Clinician response: What were you doing and what were you charged with?

Technique: Confrontation
Client statement: My wife over-reacts. I can handle my liquor.
Clinician response: You go out every night and get drunk. You come home and argue with your wife. And you were recently arrested for a DUI. Help me understand how this shows you can handle liquor.

Technique: Education
Client statement: The doctor said the medication I’m on will control my bipolar disease.
Clinician response: Yes it will, but you also have to stop drinking because people with your disease also tend to develop a problem with drinking.

Technique: Exploration
Client statement: I came home and found my wife in bed with the post man.
Clinician response: How did that make you feel? What was your reaction? What did you say to her?

Technique: Humor
Client statement: My shoulders have really been hurting me.
Clinician response: Maybe it’s rust.

Technique: Interpretation
Client statement: It’s so easy for me to pick up women. I take them home, screw them and then don’t ask them out again. I don’t know why. They want to keep seeing me.
Clinician response: Your wife “screwed” you in a sense when she left you. So now, to get back at her you screw other women and then leave them just like your wife did to you.

Technique: Reflection
Client statement: I came home last night and my wife told me she was leaving me for my best friend. I didn’t know anything was wrong with our marriage.
Clinician response: You are both perplexed and upset to learn that your wife is leaving when things were going so well.

Technique: Reframing
Client statement: My girl friend dumped me and gave me no reason. She said she just didn’t want to go out with me any more.
Clinician response: While her leaving upsets you now, this will give you the chance to go out with other girls and perhaps find a better match.

Technique: Self disclosure
Client statement: I just can’t learn. No matter how hard I try I don’t get good grades.
Clinician response: I had a problem with dyslexia, too, but it can be overcome with special education.

Technique: Silence
Client statement: Some day I’m going to tell my boss “take this job and shove it.”
Clinician response: (no response)

Technique: Summarizing
Client statement: (Client has finished talking)
Clinician response: Your plans are to go to AA, avoid your high risk situations and relapse triggers, attend outpatient counseling and agree to random alcohol breath tests.
much of what he had bought. He left there and drank alcohol to help him come down from the cocaine. He was riding on a bus when he passed a lagoon. Feeling depressed and a failure, he took off his jacket and swam out to the middle of the lagoon with the intention of drowning. However, he became quite cold and returned to shore, put on his jacket, and then went to his mother’s house, where he told her he had just tried to kill himself.
She convinced him to apply for public aid and gave him some clothes. He was still depressed, withdrawn, and felt that life was hopeless. He found the odor and the people in the abandoned buildings to be offensive. In the interim he got high, went to his mother’s house, stole her watch, and threatened her, something he said he had never done before. He bought some rat poison and once again went to his mother, telling her that he was thinking of committing suicide. His mother, now fearful of him, threw him out of the house.
The patient reasoned that he would rather be dead or in jail than living this way so he approached a White policeman stating that he was a drug abuser, stole from his mother, and was considering suicide. He felt sure that a White cop would arrest him and put him in jail. However, the officer recognized the patient as a trumpet player from one of the local nightclubs and convinced the patient to seek hospitalization and drug treatment. He was taken to a neighborhood treatment program and then transferred here.

History
The patient’s history shows dysfunctional parenting, a chaotic environment, and an absence of positive role models. His mother was an alcoholic and has only been in recovery for the past 4 years. He was born out of wedlock and his biological father was married to someone else and was not involved in his rearing. The patient’s mother had a series of boyfriends, many of whom were also alcoholic and some of whom physically abused him. His uncle often came to the house, beat him once, but ridiculed him often. The patient also reports fearing for his life while in grammar school, often running home to avoid the gangs, and reports that, on one occasion, gang members tried to recruit him and chased him up the stairs trying to kill him when he refused to join them. He used music and athletics as social interests that helped him avoid the gangs.
He entered service following graduation from high school and spent most of the last two decades as a professional musician, traveling around the country, drinking, using marijuana, and womanizing. He shows a repeated pattern of moving in with a female and then escalating his alcohol use and/or sleeping with other women, thereby causing that woman to kick him out. He was married once, to a White female who he said was an opera singer. They have one child, but this union lasted only 2 years. He was caught in bed with another woman, leading to the separation. Divorce is pending. He admits to hitting his wife “once” and professes guilt over this behavior. He says he doesn’t seem to be able to maintain a commitment to one female and never thought he would marry. He also reported one prior suicide attempt precipitated by a woman leaving him for another man. He withdrew into his room for 2 weeks, lost his appetite, and was anergic. He tried to overdose by taking a bottle of sleeping pills. His tolerance for depressant drugs, caused by repeated alcohol and marijuana use, probably saved his life.
He reports no medical problems nor any sexual dysfunction. Legal history reflects only minor charges for possession and for disorderly conduct. He has never been in jail.

Mental Status
A mental status exam revealed no signs of psychosis, no homicidal nor current suicidal ideation, but there did remain some mild depression or dysphoria with cognitive but not vegetative symptoms. He was oriented in time, place, and person, and his degree of reality contact was good. He was neatly dressed in color-coordinated clothes and hygiene seemed good. There were no signs of hallucinations or obvious delusions. He did evince some mild errors in judgment, believing that the music industry is controlled by the Jewish and Italian mafia who might kill you if you do not obey them. He related in a rather open manner. His speech was audible but somewhat circumlocutory in relating his story, though he tolerated interruptions and direct questioning without apparent resentment. He seemed desirous to please the examiner. History does show some impulsivity.

Case Formulation
This patient is an orally fixated personality with many unresolved dependency needs that have left him unable to successfully cope with stress, frustrations, and setbacks, and perhaps prone toward depression and substance abuse as well. His family history is positive for alcoholism, which further increases his risk in these areas as well. Mother, an alcoholic herself, probably did not meet his basic psychological needs in infancy, leaving the patient to feel rejected, unloved, and emotionally
II. (301.60) Dependent personality disorder
III. None, by patient self-report
IV. Problems related to the social environment: General psychosocial deterioration induced by free-basing cocaine
V. 45

References
American Psychiatric Association (1980). Diagnostic and statistical manual of mental disorders (3rd ed.). Washington, DC: American Psychiatric Association.
American Psychiatric Association (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.). Washington, DC: American Psychiatric Association.
American Psychiatric Association (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: American Psychiatric Association.
Beach, D. A. (2005). The behavioral interview. In R. J. Craig (Ed.), Clinical and diagnostic interviewing (2nd ed., pp. 91–105). Lanham, MD: Jason Aronson.
Benjamin, A. (1981). The helping interview (3rd ed.). Boston: Houghton Mifflin.
Brammer, R. (2002). Effects of experience and training on diagnostic accuracy. Psychological Assessment, 14, 110–113.
Chipman, M., Hassell, J., Magnabosco, J., Nowlin-Finch, N., Marusak, S., & Young, A. S. (2007). The feasibility of computerized patient self-assessment at mental health clinics. Administration and Policy in Mental Health Services Research, 34, 401–409.
Cozby, P. C. (1973). Self-disclosure: A literature review. Psychological Bulletin, 79, 73–91.
Craig, R. J. (2003). Assessing personality and psychopathology with interviews. In J. Graham & J. Naglieri (Eds.), Handbook of psychology. Vol. 10: Assessment psychology (pp. 487–508). New York: John Wiley & Sons.
Craig, R. J. (2005a). Personality-guided forensic psychology. Washington, DC: American Psychological Association.
Craig, R. J. (Ed.). (2005b). Clinical and diagnostic interviewing (2nd ed.). Lanham, MD: Jason Aronson.
Craig, R. J., & Olson, R. E. (2004). Predicting methadone maintenance treatment outcomes using the Addiction Severity Index and the MMPI-2 Content Scales (Negative Treatment Indicators and Cynicism Scales). American Journal of Drug and Alcohol Abuse, 30, 823–839.
Ellis, A. (1962). Reason and emotion in psychotherapy. New York: Lyle Stuart.
Endicott, J., & Spitzer, R. L. (1978). A diagnostic interview: The schedule for affective disorders and schizophrenia. Archives of General Psychiatry, 35, 837–844.
Folstein, M. F., Folstein, S. E., & McHugh, P. R. (2005). The mini-mental state examination. Odessa, FL: Psychological Assessment Resources.
Garb, H. N. (2007). Computer-administered interviews and rating scales. Psychological Assessment, 19, 4–13.
Gruba-McCallister, F. (2005). Phenomenological orientation to the interview. In R. J. Craig (Ed.), Clinical and diagnostic interviewing (2nd ed., pp. 42–53). Lanham, MD: Jason Aronson.
Hersen, M., & Thomas, J. C. (2007a). Handbook of clinical interviewing with children. Thousand Oaks, CA: Sage Publications.
Hersen, M., & Thomas, J. C. (2007b). Handbook of clinical interviewing with adults. Thousand Oaks, CA: Sage Publications.
Hester, R. K., & Miller, J. H. (2006). Economic perspectives: Screening and intervention: Computer-tools for diagnosis and treatment of alcohol problems. Alcohol Research and Health, 29, 36–40.
Husserl, E. (1962). Ideas (W. R. Boyce Gibson, Trans.). New York: Collier.
Jamison, C., & Scogin, F. (1992). Development of an interview-based geriatric depression rating scale. International Journal of Aging and Human Development, 35, 193–204.
Kanfer, F. H., & Scheft, B. K. (1988). Guiding the process of therapeutic change. Champaign, IL: Research Press.
Kelly, B., Kay-Lambkin, F. J., & Kavanagh, D. J. (2007). Rurally isolated populations and co-existing mental health and drug and alcohol problems. In A. Baker & R. Velleman (Eds.), Clinical handbook of co-existing mental health and drug and alcohol problems (pp. 159–176). New York: Routledge/Taylor & Francis.
Komiti, A. A., Jackson, H. J., Judd, F. K., Cockram, A. M., Kyrios, M., & Yeatman, R. (2001). A comparison of the Composite International Diagnostic Interview (CIDI-Auto) with clinical assessment in diagnosing mood and anxiety disorders. Australian and New Zealand Journal of Psychiatry, 35, 224–230.
Koski-Jannes, A., Cunningham, J. A., Tolonen, K., & Bothas, H. (2007). Internet-based self-assessment of drinking: 3-month follow-up data. Addictive Behaviors, 32, 533–542.
Lazarus, A. (1976). Multimodal behavior therapy. New York: Springer.
Logan, N. (2005). Diagnostic assessment of children. In R. J. Craig (Ed.), Clinical and diagnostic interviewing (2nd ed., pp. 303–322). Lanham, MD: Jason Aronson.
Maddi, S. (2005). The existential/humanistic interview. In R. J. Craig (Ed.), Clinical and diagnostic interviewing (2nd ed., pp. 106–130). Lanham, MD: Jason Aronson.
Matarazzo, J. D. (1965). The interview. In B. Wolman (Ed.), Handbook of clinical psychology (pp. 403–452). New York: McGraw-Hill.
Matarazzo, J. D. (1978). The interview: Its reliability and validity in psychiatric diagnosis. In B. Wolman (Ed.), Clinical diagnosis of mental disorders (pp. 47–96). New York: Plenum.
McLellan, A. T., Kushner, H., Metzger, D. S., Peters, R., Smith, I., Grissom, G., et al. (1992). The fifth edition of the Addiction Severity Index. Journal of Substance Abuse Treatment, 9, 199–213.
Miller, W. R., & Rollnick, S. (2002). Motivational interviewing: Preparing people to change addictive behavior. New York: Guilford Press.
Morrison-Beedy, D., Carey, M. P., & Tu, X. (2006). Accuracy of audio computer-assisted self-interviewing (ACASI) and self-administered questionnaires for the assessment of sexual behavior. AIDS and Behavior, 10, 541–552.
Prochaska, J., & DiClemente, C. C. (1983). Stages and processes of self-change of smoking: Toward an integrated model of change. Journal of Consulting and Clinical Psychology, 51, 390–395.
Prochaska, J. O., DiClemente, C. C., & Norcross, J. C. (1992). In search of how people change: Applications to the addictive behaviors. American Psychologist, 47, 1102–1114.
Prochaska, J. O., Velicer, W. F., Rossi, J. S., Goldstein, M. G., Marcus, B. H., Rakowski, W., et al. (1994). Stages of change
CHAPTER 13
Clinical Applications of Behavioral Assessment
Abstract
This chapter uses a case study of a family with marital, substance use, mood, external stressors, and
parent–child difficulties to illustrate the principles and methods of pretreatment behavioral assessment.
Several guiding principles are discussed: (a) an emphasis on identifying functional relations relevant to
behavior problems and treatment goals, especially those that help us explain behavior problems; (b) a
multimodal focus during assessment, attending to overt behavior, thoughts, emotions, and physiology;
(c) the importance of the conditional and contextual nature of behavior; (d) an emphasis on temporally
contiguous and environmental events, thoughts, and emotions in triggering and maintaining behavior
problems; (e) the dynamic nature of behavior problems and causal variables; (f) an emphasis on specificity
and precision of measurement; and (g) individual differences in the elements, correlates, and causes of
behavior problems. A diverse set of methods is congruent with the principles of behavioral assessment and
includes behavioral interviews and questionnaires, observation in natural and analog environments, and
self- and instrument-aided monitoring of behavior and events in natural and analog environments.
Keywords: functional relation, idiographic, self-monitoring, self-report, time sampling, treatment goals
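As a rough illustration of the abstract's emphasis on specificity and precision of measurement, the sketch below shows one way a self-monitoring log might be reduced to numerical values for the rate, duration, and intensity of a target behavior. It is only a sketch under stated assumptions: the record layout, the 0 to 10 intensity rating, the function name, and the sample data are all hypothetical and are not taken from the chapter.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Episode:
    """One self-monitored episode of a target behavior (hypothetical record)."""
    day: int        # day of the monitoring period, starting at 1
    minutes: float  # how long the episode lasted
    intensity: int  # client's own 0-10 severity rating (assumed scale)


def summarize(episodes, days_observed):
    """Return rate (episodes per day), mean duration, and mean intensity."""
    if days_observed <= 0:
        raise ValueError("days_observed must be positive")
    if not episodes:
        # No episodes recorded: the rate is zero, the other values are undefined.
        return {"rate_per_day": 0.0, "mean_minutes": None, "mean_intensity": None}
    return {
        "rate_per_day": len(episodes) / days_observed,
        "mean_minutes": mean(e.minutes for e in episodes),
        "mean_intensity": mean(e.intensity for e in episodes),
    }


# Invented one-week log, for illustration only.
log = [Episode(1, 10, 4), Episode(1, 20, 6), Episode(3, 30, 8), Episode(7, 20, 6)]
summary = summarize(log, days_observed=7)
print(summary)
```

Keeping the three dimensions separate, rather than collapsing them into a single score, reflects the point that a behavior problem can change in rate without changing in duration or intensity.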
Introduction to Behavioral Assessment
Advances in a science-based discipline depend on the measurement technology available to that discipline. Our ability to describe and explain the behavior of planets, protons, parrots, and people depends on measurement “accuracy”: the degree to which our measures reflect the true state of light, particle motion, sound, and human actions.¹ As the fundamental technology of a science, “measurement” is the process of assigning a numerical value to an attribute dimension (e.g., rate, intensity, and duration of planetary motion or of a person’s behavior). The accuracy of a measure is the degree to which the assigned numbers truly reflect the attribute being measured. Advances in clinical applications of psychological science, which is the focus of this chapter, are also contingent on measurement accuracy. “Behavioral assessment” is a conceptual and methodological approach to clinical assessment that emphasizes accuracy and precision of measurement and validity of the clinical judgments that are derived from measurements.
The ultimate goals of clinical assessment are to describe, predict, and especially to explain human behavior. “Descriptions” of human behavior include formal diagnoses of behavior problems, such as the diagnostic system described in the Diagnostic and statistical manual of mental disorders, fourth edition, text revision (DSM-IV-TR; American Psychiatric Association, 2000), the identification of a person’s behavior problems and strengths (e.g., through interviews or problem checklists), the specification of treatment effects, and the classification of social and physical environments and events (e.g., classification of different life stressors or social environments). “Predictions” of human behavior include estimates of a person’s likely response to a treatment, the likelihood that a person will relapse following successful therapy, or the degree to which he or she is vulnerable to the adverse effects of future traumatic
life events. ‘‘Explanations’’ of human behavior 1. Behavioral assessment emphasizes the ‘‘functional
include the identification of the factors that can relations’’2 relevant to behavior problems and
trigger or maintain a person’s behavior problems, treatment goals. In addition to specifying problems
the mediating variables that account for a relation- and goals, behavioral assessment attempts to explain
ship between distal events and a person’s current behavior by identifying controlling variables, such as
behavior problem (e.g., variables that account for antecedent and consequent environmental events,
the likelihood that early trauma will affect later physiological events, social interactions, and
social functioning), and the mechanisms that cognitions that affect behavior. For example, we are
underlie treatment effects. not only interested in specifying deficits and excesses
The degree to which these goals of clinical science associated with depressed mood—we also are
can be achieved depends on the accuracy with which interested in the factors that affect mood.
behavioral, cognitive, physiological, emotional, and 2. Consistent with an emphasis on functional
environmental events can be measured. Ultimately, relations, behavioral assessment emphasizes the
our judgments about treatment outcome and other ‘‘conditional’’ and ‘‘contextual’’ nature of behavior.
ultimate goals in clinical science depend on the degree That is, we presume that behavior problems are
that they are based on the best available science-based often conditional: their rate, duration, and intensity
measures of persons and events. A principle of beha- can vary across contexts. For example, we are
vioral assessment is that clinical inferences are most interested in the degree to which a child’s aggressive
likely to be valid when they are based on measures behavior (e.g., its rate, form, and target) varies across
that are empirically supported, precise, accurate, com- classroom settings, between home and school, across
prehensive, and sensitive to change. social settings, as a function of sleep quality, or
Behavioral assessment is also a conceptual as well depending on medication state.
as a methodological paradigm. It emphasizes that the 3. Although historical events are often informative,
description, prediction, and explanation of behavior behavioral assessment emphasizes the role of
and behavior problems are strengthened by exam- temporally contiguous behaviors, environmental
ining the contexts for behavior; that is, the identifi- events, thoughts, and emotions in triggering and
cation of functional relations among motor, maintaining behavior problems. While not ignoring
behavioral, cognitive, physiological, and environ- the importance of historical factors in the
mental events. We presume that these goals are development of behavior problems, the methods of
most likely to be achieved with the use of multiple behavioral assessment (see ‘‘Methods of Behavioral
assessment methods and sources of information; the Assessment’’ section) are congruent with this focus
use of lower-level, less inferential, and more specific on contemporaneous events.
variables; and time-sampling measurement strategies 4. Behavioral assessment emphasizes the ‘‘dynamic
to capture the dynamic nature of behavior. nature of behavior problems’’ and ‘‘causal variables.’’
This chapter outlines the concepts and methods of That is, behavior problems, such as mood, sleep, and
behavioral assessment that are applicable to clinical social interactions, and causal variables, such as
assessment. Although our focus is on pretreatment triggering stimuli or response contingencies, can
clinical assessment for case formulation and treatment change in intensity, frequency, or duration across
design, behavioral assessment is a useful paradigm for time. To capture these important changes across
treatment outcome evaluation and research (Haynes, time, we often use ‘‘time-sampling assessment
Kaholokula, & Yoshioka, 2008), clinical research strategies’’ in which important variables are
(Haynes, Pinson, Yoshioka, & Kloezeman, 2008), repeatedly sampled across time.
and psychopathology research (Haynes, 1992). Our 5. Behavioral assessment emphasizes ‘‘specificity’’
presentation is necessarily limited but in-depth pre- and ‘‘precision’’ of measurement. Because of the
sentations of behavioral assessment are available in high degree of error variance associated with the use
Haynes and O’Brien (2000), Haynes and Heiby of heterogeneous higher-order personality constructs,
(2004), and Hersen (2006). We begin the chapter the accuracy and utility of clinical inferences can
with a case study to illustrate the conceptual founda- often be increased by using more specific, lower-
tions and diversity of behavioral assessment that are level constructs. Thus, for example, we are more
discussed in subsequent sections. likely to emphasize measures of specific thoughts,
To prime the reader for major points emphasized social behaviors, and sleep behaviors of a client with
in this chapter, we list several distinguishing attri- depressed mood rather than measure the higher-order
butes of behavioral assessment: construct of ‘‘depressed mood.’’
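Attributes 2 and 4 in the list above (the conditional nature of behavior and time-sampling strategies) can be illustrated with a short sketch. It assumes a hypothetical set of sampled observation intervals, each labeled with a context and with whether the target behavior occurred, and computes the proportion of intervals with the behavior in each context. The function name, the data, and the choice of summary statistic are illustrative assumptions, not a procedure taken from the chapter.

```python
from collections import defaultdict


def conditional_rates(samples):
    """Proportion of sampled intervals in which the behavior occurred, per context.

    `samples` is an iterable of (context, occurred) pairs, one pair per
    sampled interval. Both the structure and the name are hypothetical.
    """
    counts = defaultdict(lambda: [0, 0])  # context -> [intervals with behavior, all intervals]
    for context, occurred in samples:
        counts[context][0] += int(occurred)
        counts[context][1] += 1
    return {context: hits / total for context, (hits, total) in counts.items()}


# Hypothetical time-sampled observations of a child's aggressive behavior.
samples = [
    ("home", True), ("home", False), ("home", True), ("home", True),
    ("school", False), ("school", False), ("school", True), ("school", False),
]
rates = conditional_rates(samples)
print(rates)
```

In this made-up example the behavior appears in a larger share of the sampled intervals at home than at school, which is the kind of cross-context difference a conditional view of behavior problems is meant to surface.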
[Figure: causal diagram for the case study. Variables shown: parenting skills deficits; impairments in marital, parental, and occupational functioning; financial stressors; Frank’s behavior problems; husband’s alcohol use; conflicts with husband; depressed mood; decreased social and physical activity; husband’s work stress; problem-solving skills deficits; worry and feelings of guilt; sleep disturbance; family history of depression. Legend (Symbols): original, unmodifiable causal variable (X); causal variable or mediating variable (X); behavior problem or effect of behavior problem (Y, Z); values .2, .4, .8; moderating relationship.]
the frequency of the couple’s arguments and level of marital distress. Frank’s increasing levels of aggressive and oppositional behaviors were a function of behavior management skills by his parents (e.g., less attention to and inconsistent responses to his positive and negative behaviors).
Increased levels of marital discord have been found to have a negative effect on a child’s adjustment (see Cummings & Davies, 2002, for a review), which is consistent with the assessor’s hypothesis that the couple’s relationship distress might be contributing to their son’s behavior problems. Additionally, one study on mediators and moderators in the relationship between parent problem drinking and their child’s adjustment found that maternal depressive symptoms partially accounted for the relationship between paternal parental drinking and child social problems (El-Sheikh &
Method: Interviews, checklists, and questionnaires
Reference: Nock, Holmberg, Photos, & Michel (2007)
Disorder/Focus: Self-injurious behaviors
Target Population: Adolescents and young adults
Assessment Instrument/Measures: Self-Injurious Thoughts and Behaviors Interview (SITBI)
Clinical Applicability/Utility: Assesses the presence, frequency, and characteristics of self-injurious behaviors including suicidal ideation, plans, gestures, and attempts as well as non-suicidal self-injurious behaviors. Broad screening measures that provide basic information
Reliability: Good inter-rater reliability: kappas ranged from .90 to 1.0. Test–retest reliability: kappas ranged from .25 to 1.0
Validity: Construct validity (using other measures of self-injury): kappas ranged from .48 to 1.0. Statistically significant correlations ranging from .64 to .73 found between the SITBI and another measure of nonsuicidal self-injurious behaviors
Sources of Error/Limitations in Clinical Assessment: Small sample consisting only of adolescents. Fails to ask about other self-destructive behaviors that may not directly be self-injurious

Method: Interviews, checklists, and questionnaires
Reference: Barton et al. (2005)
Disorder/Focus: Depression
Target Population: Adult outpatients with depression
Assessment Instrument/Measures: Sentence Completion Test for Depression (SCD)
Clinical Applicability/Utility: Idiographic assessment of depressive thinking. Can be used to complement other depression questionnaires. May help identify hypotheses of target problems and dysfunctional beliefs within a cognitive-behavioral case formulation
Reliability: Internal consistency: Kuder-Richardson 20 = .882. Sensitivity varied from 92% to 100%. Specificity varied from 96% to 100%
Validity: Good construct, content, and discriminant validity. Pearson correlation between SCD negative statements and BDI scores: r = .43. SCD negative statements were significantly greater in the depressed group, p < .001
Sources of Error/Limitations in Clinical Assessment: Further study is necessary to ensure generalizability to other depressed populations such as adolescents, older adults, non-Western cultures, etc.

Method: Interviews, checklists, and questionnaires
Reference: Fromme, Stroot, & Kaplan (1993)
Disorder/Focus: Positive and negative expected effects of alcohol
Target Population: Undergraduate students in psychology classes
Assessment Instrument/Measures: Comprehensive Effects of Alcohol Questionnaire (CEOA)
Clinical Applicability/Utility: Provides information about the effects (physiological, psychological, and behavioral) that people associate with different amounts of drinking. By measuring subjective evaluations of drinking clinicians can track client’s appraisal of drinking and predict future involvement
Reliability: Test–retest reliability: correlations ranged between .53 and .81, p < .01
Validity: Construct validity: confirmatory factor analysis revealed a good four factor structure; loadings ranged from .15 to .84. Criterion-related validity: positive and negative expectancy were significantly related to alcohol use measures
Sources of Error/Limitations in Clinical Assessment: May not be generalizable outside of college students. May fail to capture important variables identified using an idiographic approach

Method: Naturalistic observation
Reference: Buckley, Klein, Durbin, Hayden, & Moerk (2002)
Disorder/Focus: Child temperament
Target Population: Preschool-aged children
Assessment Instrument/Measures: Child Temperament and Behavior Q-Set
Clinical Applicability/Utility: May be a good measure of observer-rated preschooler’s behavior in their natural home environment. May help with the identification of externalizing problems in children. May be too time consuming for practitioners but others could be trained to use it to gather data on temperament and behavior for the clinician’s use
Reliability: Inter-rater reliability: ranged from 0.58 to 0.80. Test–retest stability: median ranged from 0.14 to 0.51. Reliability of four observers: ranged from 0.66 to 0.86
Validity: Construct validity: High correlations on maternal ratings for externalizing problems with the Child Behavior Checklist (r = .55, p < .01). High correlations with several subscales of the Child Behavior Questionnaire (.30–.49)
Sources of Error/Limitations in Clinical Assessment: Necessary to cross-validate results with a different sample. Make constructs relevant to developmental stages

Method: Analog behavioral observation
Reference: Johnson (2002)
Disorder/Focus: Marital satisfaction
Target Population: Newly wed couples
Assessment Instrument/Measures: Behavioral Affective Rating System (BARS)
Clinical Applicability/Utility: Couples are asked to come to a resolution about an important marital complaint including sex, in-laws, or money. Discussions are videotaped and then coded for affect. Helps predict marital satisfaction (dyadic expressions of positive emotions have been associated with marital happiness while expressions of negative emotions have been associated with marital distress)
Reliability: Reliability of behavioral observations ranged from .41 to .85
Validity: Correlations with other measures of marital satisfaction (e.g., MAT and QMI) ranged from .72 to .89, p < .01
Sources of Error/Limitations in Clinical Assessment: Results may not be generalizable to more established couples. Possibility of observer bias. Did not compute inter-rater reliability

Table 13.1. (continued)
Columns: Reference | Disorder/Focus | Target Population | Assessment Instrument/Measures | Clinical Applicability/Utility | Reliability | Validity | Sources of Error/Limitations in Clinical Assessment

Method: Ambulatory self-monitoring
Reference: Johnston, Beedie, & Jones (2006)
Disorder/Focus: Work-related stress
Target Population: Stressed workers
Assessment Instrument/Measures: Palm IIIe, personal digital assistant (PDA). Diary software program written for the study
Clinical Applicability/Utility: Ambulatory diary for ecological momentary assessment (EMA). Diaries entered on small hand-held computers (PDAs) in which recordings are made in real time in the working environment. PDA is user friendly with entries made via tapping the stylus on the screen. Behavioral diaries that measure effort-demand, control, reward, and stress throughout nurse’s shift at approx. 90-min intervals (–5, 10, 15 min determined randomly) over 3 or 4 shifts. Format for diary questions based on the Diary of Ambulatory Behavioural States¹
Reliability: Self-reported stress across 3 nursing shifts covaried reliably with both demand–control and effort–reward imbalance (p < .001)
Validity: Slope of EMA measures on questionnaire measures (Job Content Questionnaire, Karasek, 1985; Effort–Reward Imbalance Questionnaire, Siegrist, 1996) significant for demand–control (p < .01) and effort–reward imbalance (p < .001)
Sources of Error/Limitations in Clinical Assessment: May be more time consuming and costly compared with paper-and-pencil questionnaires. May have technical problems. May have response burden on subjects. May lack stringent control over stimuli and responses found in lab setting

Method: Ambulatory psychophysiological
Reference: Paul, Wai, Jewell, Shaffer, & Varadan (2006)
Disorder/Focus: Cough, including psychogenic coughing
Target Population: Children and adults of all ages
Assessment Instrument/Measures: Cough monitoring system; monitor w/ accelerometer, electronic package, cable to connect accelerometer to package, compact-flash memory card
Clinical Applicability/Utility: Objective, non-invasive, self-contained, ambulatory cough monitoring system that measures vibration and transmits output data through cable to package. Typically worn on belt or in a pocket; stores data on memory card; stores up to 24 hr of data. Accelerometer attached to skin at suprasternal notch; advantages: below larynx so speech that causes vibrations is unintelligible on audio recording, maintains privacy for subject in ambulatory setting, eliminates interference from swallowing, relatively comfortable location does not interfere w/ daily activities, eliminates movement artifact or distance from externally located microphone
Reliability: Agreement between 2 investigators (1 w/ prior experience in cough research and 1 w/o prior experience; experience not found to impact findings): For audio counts = 0.998 (p < 0.001). For video counts = 0.997 (p < 0.001)
Validity: Used gold-standard of video recording; comparison of investigators’ video counts to corresponding audio counts: For investigator 1 = 0.968 (p = 0.026). For investigator 2 = 0.973 (p = 0.015)
Sources of Error/Limitations in Clinical Assessment: Findings limited in that each did not match video recorded cough to the device recorded cough

Method: Functional interview
Reference: Kearney, Cook, Chapman, & Bensaheb (2006)
Disorder/Focus: Maladaptive/problem behaviors (functional relations: attention, tangible, escape, sensory)
Target Population: Children through adults
Assessment Instrument/Measures: Motivation Assessment Scale (MAS)
Clinical Applicability/Utility: 16-item, 7-point Likert-type rating scale (ranging from Never to Always) that assesses the functions or motivations of behavior problems. May be completed independently or through an interview (parents, teachers, careproviders). Questions about the likelihood of a behavior problem occurring in a variety of situations
Reliability: Depending on the severity of the behavior problem: Inter-rater reliability varies from .24 to .92. Test–retest reliability varies from .39 to .98
Validity: Predictive validity between MAS and observational data (for developmentally disabled children w/ frequent self-injurious behaviors): r = .99, p < .001
Sources of Error/Limitations in Clinical Assessment: May be less effective in cases of very low or very high frequency behavior. Reliability may be lower because of possible unstable factor structure. May be best used in combination with various methods of functional analysis

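Several entries in Table 13.1 report inter-rater reliability as kappa values. As a rough illustration of what such a coefficient summarizes, the sketch below computes Cohen's kappa for two raters who each assign one category per observation. The table does not say which kappa variant each study used, so treating it as two-rater Cohen's kappa is an assumption, and the function name and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: agreement between two raters, corrected for chance agreement."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of observations")
    n = len(rater_a)
    # Observed proportion of observations on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical category")
    return (observed - expected) / (1 - expected)


# Hypothetical codes from two observers rating the same four intervals.
observer_1 = ["yes", "yes", "no", "no"]
observer_2 = ["yes", "no", "no", "no"]
kappa = cohens_kappa(observer_1, observer_2)
print(kappa)
```

Here the observers agree on three of four intervals (0.75), but half of that agreement would be expected by chance given their marginal frequencies, so kappa is lower than raw percent agreement.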
were not reported during interviews can be observed in the natural environment. Direct observation of parent–child, couple, teacher–child, patient–patient, and staff–patient interactions can also compensate for some of the errors and imprecision associated with self-report methods.
Naturalistic observations are often designed to maximize the chance that clinically relevant patterns of behavior are observed. This is often accomplished by imposing minimal constraints on the environment. For example, during an observation, a family might be requested to remain in two rooms, keep the television off, not engage in long phone conversations, and not receive visitors. These constraints can not only increase the cost benefits of naturalistic observations but also compromise the ecological validity of the obtained data.
Interactions are often video- and tape-recorded and later quantified through the systematic use of definitions and codes (see Kahng & Iwata, 1998, for a discussion of computerized data collection from observation videotapes). Furthermore, participant observers (e.g., teachers, nurses, and parents) who are part of the setting can gather data (see discussions in Dishion & Granic, 2004).
A study by Buckley et al. (2002) illustrates the role of external observers in measuring preschoolers’ behavior and temperament. Trained graduate and undergraduate students observed the children during two home visits lasting 2–3 hr. Researchers scheduled each visit at different times of the day and arranged for different observers to attend each one. This strategy was designed to minimize observer bias by rotating observers to increase the setting generalizability of the data.
Mehl, Gosling, and Pennebaker (2006) tried to reduce the reactive effects of observation by using an electronically activated recorder (EAR) rather than a live observer. The EAR is a digital voice recorder that periodically records snippets of ambient sounds. Participants wore the device for 2 days and were unaware of when it was recording. The recordings were later collected, coded, and rated for depressive talk. Judges’ ratings were significantly correlated with BDI scores (r = .55) but only for those participants who had higher levels of depressive symptoms. An important limitation of this assessment strategy, given the emphasis on functional relations in behavioral assessment, is that the snippets of data provided very little information about the contexts in which the recorded behaviors occurred.
Both of these studies, and the sources previously cited, highlight the utility and limitations of behavioral observation in natural environments: (a) it can be a rich source of data and hypotheses about functional relations; (b) it can often be conducted by observers who are normally part of the setting; (c) it can be expensive; (d) the presence of an observer can evoke atypical behaviors from the participants; (e) observers can make consistent assessment errors, fabricate data, or decline in consistency and accuracy over time; and (f) careful construction of the observation strategy is necessary to strengthen its ecological validity and cost benefits.

Behavioral Observations in Analog Environments
In analog behavioral observations (ABOs), the assessment context is arranged to efficiently observe important behaviors and interactions (see Haynes & O’Brien, 2000; Heyman & Slep, 2004; and the special section on analogue behavioral observation in Psychological Assessment, March 2001, Vol. 13, No. 1). In ABOs, the physical setting, presence of confederates, and cues and instructions to the client are arranged to increase the likelihood that important behaviors and functional relations will occur. Examples include the observation in a clinic setting of couples attempting to resolve a problem in their relationship, role plays with a psychiatric patient to assess social skills and social anxiety, behavioral avoidance tests with clients experiencing fears and phobias, and “experimental functional analysis” involving the systematic manipulation of presumed antecedent and consequent variables (e.g., testing the effect of response contingent attention on the self-injurious behaviors of a child).
ABOs are designed to be more cost-beneficial, compared to naturalistic observations, and can be used to collect data across response modes. Not only can important behaviors and social interactions be observed, but assessors can also gather data on clients’ emotions, affect, feelings, thoughts, and psychophysiological reactions during the ABO (see Hawkins, Carrère, & Gottman, 2002, for an example of measuring affect during couple interactions).
A study by Johnson (2002) evaluated marital satisfaction in newly wed couples using an ABO. Couples were asked to discuss and reach a resolution about an important marital complaint (e.g., sex, in-laws, money). The discussions were videotaped and then coded for affect using the Behavioral Affective Rating Scale (BARS). This method is useful for assessing marital satisfaction since expressions of positive emotions have been found to be associated
Abstract
The Minnesota Multiphasic Personality Inventory-2nd edition (MMPI-2; Butcher, Dahlstrom, Graham,
Tellegen, & Kaemmer, 1989) is the most widely used personality assessment instrument in research and
clinical practice. The goals of this chapter are to orient new users to the test and to highlight a number of
current issues with the test that may also be of interest to experienced users. The chapter begins by briefly
illustrating the development of MMPI-2, from its origins in the University of Minnesota Hospitals to its
current form. The sections that follow describe some of its key features and the contributions they make
to assessing personality and psychopathology: Validity Scales, Clinical Scales, Content Scales,
Supplementary Scales, Personality Psychopathology Five Scales, and Restructured Clinical Scales. The
chapter concludes with discussions of specific clinical issues related to using the MMPI-2 and of new
developments involving the MMPI-2: assessment of ethnic minority clients, international adaptation of the
MMPI-2, short forms, and computer applications. Although basic test interpretive information is provided,
this chapter should not be considered a comprehensive interpretive guide. An annotated bibliography is
included for those interested in further reading on the MMPI-2.
Keywords: Clinical Scales, content interpretation, international test adaptation, MMPI-2, short forms of
the MMPI-2, Validity Scales
Brief History of the MMPI-2
Development of the Original MMPI
The first Minnesota Multiphasic Personality Inventory (MMPI) was published by Starke Hathaway, PhD, and J. Charnley McKinley, MD, in 1943. Diagnostic assessment at that time typically entailed subjecting each patient to an interview, mental status exam, and individually administered psychological testing. Prior to the MMPI’s publication, the test authors worked in the University of Minnesota Hospitals system, and they found this process quite time consuming and demanding of hospital resources. Hathaway and McKinley believed that a paper-and-pencil survey, which could be administered in a group format, would allow clinicians to gather a large amount of information in a relatively short period of time. This instrument could be designed to detect features of the most common diagnostic categories of the day in a relatively objective way. It was hoped that clinicians would be able to use test results to arrive at one or more conclusive diagnoses, making the assessment process quicker and more reliable.
Hathaway and McKinley began constructing this new instrument by creating an item pool of approximately 1,000 statements. These statements were assembled from a review of case histories, textbooks, and a variety of other trait scales that were in use at the time. About half of the statements were eliminated based on redundant content, leaving a total of 504 items keyed true or false. These items were then administered to approximately 1,500 adults not hospitalized for a psychiatric condition, the majority of whom were visitors of patients at the University of Minnesota Hospitals, and to a group of 221 psychiatric inpatients in the same hospital system. These inpatients were selected based on the presence of one of eight common psychiatric diagnoses. Patients with multiple diagnoses, or a diagnosis that was questionable, were excluded from the sample.
Items were assigned to scales using the empirical keying method. This method involves the selection of items for scale membership that maximally distinguish between two criterion groups, in this case Minnesota normals and inpatients assigned a particular diagnosis. Responses to each item were compared between the relatively nonpathological Minnesota sample and each inpatient diagnostic sample. Items that were statistically meaningful predictors of group membership were assigned to a scale. No attempt was made to avoid item overlap between scales, so many items were scored on multiple scales.
Eight separate measures emerged from this empirical keying process. They were known collectively as the Clinical Scales, and they were named for their original diagnostic criterion groups: Hypochondriasis (Hs), Depression (D), Hysteria (Hy), Psychopathic Deviate (Pd), Paranoia (Pa), Psychasthenia (Pt), Schizophrenia (Sc), and Hypomania (Ma). After the MMPI’s publication, two more Clinical Scales were added. The Masculinity–Femininity (Mf) scale (Hathaway, 1956) was designed to identify features of “male sexual inversion” (homosexuality). Drake (1946) developed the Social Introversion (Si) scale to measure pathological variations of shyness. Together, these 10 Clinical Scales formed the cornerstone of the original MMPI.
In addition to the Clinical Scales, the MMPI also contained a set of Validity Scales designed to detect problematic response biases. Extreme scores on any of these scales suggested that the test taker had responded in an atypical manner, and that the profile should be interpreted with caution or not at all. The Cannot Say (?) scale was simply a count of the total number of unscorable items (either not answered or marked both true and false). The Lie (L) scale measured naive attempts to fake good or present oneself as unrealistically virtuous. Attempts to malinger, on the other hand, were detected with the Infrequency (F) scale, composed of items only rarely endorsed in the normal sample. Test takers tended to obtain high scores on this scale if they were severely disturbed, pretending to be severely disturbed, or responding randomly. The final Validity Scale on the original MMPI was the Correction scale (K). High scores on K indicated reluctance to disclose psychological problems, and portions of scores on this scale were typically added to scores on some of the Clinical Scales to correct for defensiveness.
Many other important scales, including subscales of the Clinical Scales, face-valid Content Scales, and Supplementary Scales, were added to the MMPI throughout the years using the items available in its copious item pool. Scoring keys for scales found to be useful were published and made part of the standard test. Many of these original scales are still in use on the MMPI-2, in updated versions, and will be discussed in greater depth in later sections.
Within a short time, the MMPI became the most widely used personality assessment instrument in the United States in both inpatient and outpatient settings (Lubin, Larsen, & Matarazzo, 1984). Even though it was designed to be used only with adults, it also became the most popular inventory for assessing adolescents (Archer, Maruish, Imhof, & Piotrowski, 1991). By the late 1980s, over 10,000 studies had been published using the MMPI.

Reasons for Restandardization
Despite its unparalleled success in the field of objective personality assessment, the MMPI eventually began to show its age. By the 1970s, a substantial number of researchers and practitioners were calling for a revision. Two key criticisms of the original test formed the basis of these arguments.
First, much of the item content was inappropriate given contemporary standards. Some items included antiquated terms that would not resonate with modern examinees, such as streetcars, sleeping powders, and playing “drop the handkerchief.” Others used language that could be considered sexist, pathologized certain religious beliefs, or contained crude references to bodily functions. Some items that were neither archaic nor offensive were simply poorly worded or suffered from grammatical errors. In addition, over time it became apparent that the original item pool was inadequate for measuring some important aspects of psychopathology, such as suicidality and illicit drug use.
Serious criticisms were also leveled against the normative sample. The norms still used to score the test were generated in the early 1940s, and American society had changed meaningfully since then. The norms were also decidedly unrepresentative of the U.S. population at the time. All of the original respondents lived in Minnesota, the vast majority were Caucasian, and most were married, living in low-population areas, and had only about 8 years of education.

The Restandardization Project and the Birth of the MMPI-2
After years of debate and controversy, the University of Minnesota Press formed a restandardization committee in 1982 that consisted of
1940 First journal publication on the Multiphasic Schedule (Hathaway & McKinley)
1943 Time magazine announcement of the broad use of the MMPI
Applications of the MMPI in medical assessment (Schiele, Baker, & Hathaway)
1945 Value of the MMPI in personnel selection established (Abramson)
Meehl’s empiricist manifesto
First use of the MMPI with adolescents (Capwell)
1946 Development of the Social Introversion scale (Drake)
1947 Simulated patterns on the MMPI (Gough)
1948 First international translations of the MMPI
1951 Successful analysis and prediction of delinquency (Hathaway & Monachesi)
1952 First factor analysis of the test establishing the factor structure of the MMPI scales (Welsh)
1954 Meehl’s article on clinical vs. statistical prediction, establishing the actuarial prediction approach in psychology
1955 Work by Reitan contributed to the inclusion of the MMPI in neuropsychological assessment
Empirical validation of code types (Halbower)
Development of the Harris–Lingoes Subscales (Harris & Lingoes)
1956 Factors in test translation (Sundberg)
1960 First comprehensive interpretative text for the MMPI (Dahlstrom & Welsh)
1962 Development of the first computer interpretation system at the Mayo Clinic (Rome, Swenson, Mataya,
McCarthy, Pearson, Keating, & Hathaway)
1963 Marks & Seeman’s interpretive cookbook for MMPI codes with outpatients based on actuarial prediction
methods
1965 Gilberstadt & Duker’s MMPI cookbook for inpatients
Block’s book dispelling the power of response sets
Establishment of the Symposium on Recent Developments in the Use of the MMPI, which has brought new
research and clinical interpretation strategies to psychologists for over 40 years
Craig MacAndrew’s development of the MacAndrew Alcoholism Scale
1966 Wiggins’s content interpretation approach
Gottesman & Shields’s MMPI study on the genetics of schizophrenia and personality
Strength of actuarial methods in psychological assessment (Sines)
1967 Development of the first narrative MMPI interpretive program with Roche Laboratories (Fowler)
1969 First public discussion of the need for an MMPI revision by a panel of MMPI experts at the 5th MMPI
Symposium on Recent Developments in the Use of the MMPI at Minneapolis, MN
1970 First International Symposium on Recent Developments in the MMPI in Mexico, highlighting international use
of the MMPI
1972 Changing perspectives in objective personality assessment; detailed discussion on the need for and factors in the
revision of the MMPI (Butcher)
1976 Development of rigorous methodology for MMPI translation (Butcher & Pancheri)
1977 Established a system for the classification of profiles of criminal offenders (Megargee)
Publication of John Graham’s long-running and widely used interpretive MMPI textbook, The MMPI:
A Practical Guide
1978 Innovative clinical programming that integrated the MMPI into chronic pain treatment programs (Fordyce)
1982 The MMPI-2 Restandardization Committee is formed, consisting of James Butcher, John Graham, W. Grant
Dahlstrom, & Auke Tellegen
1985 Predicting behavior with the MMPI in job applicants (Beutler)
1988 Malingering assessment in psychological tests (Schretlen)
1989 MMPI-2 is published (Butcher et al., 1989)
Criteria for assessing inconsistent patterns of item endorsement (Nichols, Greene, & Schmolck)
1990 Development of the MMPI-2 content scales (Butcher, Graham, Williams, & Ben-Porath)
1991 Research verified the utility of the MMPI-2 Validity Scales (Berry, Baer, & Harris; Graham, Watts, & Timbrook)
1992 MMPI-A was published (Butcher, Williams, Graham, Archer, Tellegen, Ben-Porath, & Kaemmer)
Therapeutic assessment (test feedback in therapy) demonstrated to be an effective treatment strategy (Finn &
Tonsager).
appear on a single scale. If an examinee fails to properly answer a large number of items, it is standard practice to request that the person reconsider his or her responses to those items so that a valid profile may be obtained.
Variable Response Inconsistency (VRIN)
The Variable Response Inconsistency (VRIN) scale was added to the MMPI-2 during the restructuring process. It is scored by comparing the examinee’s responses to various item pairs judged to be related in their content. Contradictory responding to these items increases scores on this scale. Examinees who respond randomly or who do not adequately evaluate item content before responding tend to return elevated scores on this measure.
True Response Inconsistency (TRIN)
True Response Inconsistency (TRIN) was also created by the restructuring committee for the MMPI-2. Scores are generated by contrasting pairs of items that are opposite in their content. Inconsistent true or false responding increases the score. Both acquiescence (the tendency to answer in the affirmative regardless of content) and nonacquiescence (the tendency to answer in the negative regardless of content) produce elevated scores on this scale. Wetter and Tharpe (1995) conducted a simulation study to test the effects of randomly inserting true or false item responses on TRIN scores. They concluded that TRIN is sensitive to both kinds of response styles, and that it demonstrates utility beyond the other Validity Scales in detecting them.
Infrequency (F)
The F scale of the original MMPI was largely retained in the MMPI-2, only some items having been deleted in the restructuring process. The scale is composed of items answered in the scored direction by fewer than 10% of the MMPI normative sample. Although it was originally designed to detect failure on the part of the examinee to properly attend to the
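The pair-based logic described for VRIN and TRIN can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the item numbers, the pairs, and the function names are hypothetical, and the actual MMPI-2 scoring keys and score transformations are not reproduced here. The first function counts related-content pairs answered in a contradictory way; the second counts opposite-content pairs answered in the same direction, separately for true-true and false-false.

```python
from typing import Dict, List, Optional, Tuple

Responses = Dict[int, Optional[bool]]  # item number -> True, False, or None (omitted)

# Hypothetical keys, NOT the real MMPI-2 item pairs.
# Each related-content entry names the answer combination that counts as contradictory.
HYPOTHETICAL_RELATED_PAIRS: List[Tuple[int, bool, int, bool]] = [
    (1, True, 2, False),
    (1, False, 2, True),
    (3, True, 4, True),
]
# Each opposite-content entry is a pair of items with opposed meaning.
HYPOTHETICAL_OPPOSITE_PAIRS: List[Tuple[int, int]] = [(5, 6), (7, 8)]


def count_contradictory_pairs(
    responses: Responses, pairs: List[Tuple[int, bool, int, bool]]
) -> int:
    """Count related-content pairs answered in the keyed contradictory way."""
    return sum(
        1
        for item_a, ans_a, item_b, ans_b in pairs
        if responses.get(item_a) is ans_a and responses.get(item_b) is ans_b
    )


def count_same_direction_pairs(
    responses: Responses, pairs: List[Tuple[int, int]]
) -> Tuple[int, int]:
    """Return (true-true count, false-false count) for opposite-content pairs."""
    true_true = sum(
        1 for a, b in pairs if responses.get(a) is True and responses.get(b) is True
    )
    false_false = sum(
        1 for a, b in pairs if responses.get(a) is False and responses.get(b) is False
    )
    return true_true, false_false


example: Responses = {1: True, 2: False, 3: True, 4: True,
                      5: True, 6: True, 7: False, 8: False}
print(count_contradictory_pairs(example, HYPOTHETICAL_RELATED_PAIRS))
print(count_same_direction_pairs(example, HYPOTHETICAL_OPPOSITE_PAIRS))
```

Omitted items are stored as None and never match a keyed answer, so they add nothing to either count. Keeping the true-true and false-false counts separate makes it possible to tell acquiescence from nonacquiescence, which the text identifies as the two response styles that elevate TRIN.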
outcome in psychotherapy. It was created empirically by contrasting the responses of patients judged to have improved in treatment with those of patients who did not. Do and Re were also empirically derived. Research participants were given definitions of the constructs and asked to nominate peers either high or low in each. Item responses from each group were contrasted in order to create the scales.
The internal consistencies of Es and Re were both low in the MMPI-2 normative sample, while the internal consistency for Do was good. Retest reliability over a 1-week interval was acceptable for all of these scales. High scores on Es are suggestive of good psychological adjustment. If examinees who obtain elevated scores on this scale are experiencing problems, those problems are likely to be transient and easily ameliorated. Those who score high on Do exhibit greater than average mastery of social situations. They appear confident, do not suffer from debilitating psychological problems, and may possess other attributes that contribute to good leadership skills. Re scores are associated with examinees’ dependability and reliability. High scorers have a strong sense of justice and strive to maintain moral values.
SCALES OF ADJUSTMENT
These scales include College Maladjustment (Mt), Posttraumatic Stress Disorder (PK), and Marital Distress (MDS). Each is designed to measure pathological adjustment to potentially stressful situations. Mt and PK were derived empirically, using methods similar to those discussed above. For Mt, Kleinmuntz (1961) identified a group of maladjusted college students seeking counseling at a university clinic, and contrasted their responses to the MMPI with those of a group of well-adjusted students. PK was constructed by contrasting the responses of Vietnam veterans diagnosed with posttraumatic stress disorder (PTSD) with those who had some other diagnosis (Keane, Malloy, & Fairbank, 1984). The creation of MDS (Hjemboe, Almagor, & Butcher, 1992) was substantially different from the other two. Items were selected for the scale based on their correlation with the Dyadic Adjustment Scale (Spanier, 1976) in a mixed group of distressed and non-distressed couples. Once the initial scale was formed, items were then either removed or added based on both rational and empirical considerations.
The internal consistency for Mt and PK is excellent, whereas it is low for MDS. Short-interval retest reliability is also excellent for Mt and PK, and acceptable for MDS. Mt is best viewed as a scale of internalizing symptoms among college students. Little is known about its validity with other populations. PK scores are strongly related to symptoms of PTSD, although this scale is related to general distress as well. High scorers report mixed internalizing symptoms, emotional turmoil, and fear of losing control. MDS appears to be sensitive to marital distress, and is only interpreted in this context. Those who score high on this scale may feel hurt, rejected, angry, or depressed regarding their situation with their partner.
SCALES OF ANGER
Two scales are included in this cluster: Hostility (Ho) and Overcontrolled-Hostility (O-H). Ho was
Abstract
This chapter reviews the development of the Rorschach Inkblot Method (RIM) and delineates its value in
identifying a broad range of personality characteristics. Attention is paid to particularly useful purposes
served by Rorschach assessment in clinical and forensic evaluations when it is integrated with self-report
measures. Available evidence indicates that Rorschach assessment is a psychometrically sound procedure
that is being actively practiced and studied, both in the United States and in many other parts of the world,
and that can be meaningfully applied in diverse countries and cultures. Some concern is raised, however,
about the adequacy of education and training in Rorschach assessment currently provided in graduate
programs.
Rorschach assessment became public in 1921, when Hermann Rorschach published the results of his inkblot studies with 117 non-patients and 288 mental hospital patients with various types of disorder (Rorschach, 1921/1942). Interest in the inkblot method as an instrument for assessing personality functioning spread to many parts of the world during the 1920s and reached the United States near the end of that decade. The first English-language articles on Rorschach’s method were published by Samuel Beck in 1930 (Beck, 1930a, 1930b), and several years later the first English-language textbooks on Rorschach assessment were authored by Beck (1937) and by Bruno Klopfer and Douglas Kelley (1942). Since that time, this personality assessment measure, now often referred to as the Rorschach Inkblot Method (RIM; Weiner, 1994), has become widely known to professional psychologists and among the lay public as well.
During the first 50 years of its history, a proliferation of methods of administering and scoring the RIM impeded systematic accumulation of information about its psychometric characteristics and practical utility. This situation began to change with the 1974 publication of Exner’s The Rorschach: A Comprehensive System, which integrated various aspects of previous Rorschach methods into a standardized and research-based set of administration and coding guidelines (Exner, 1974). Over the course of four editions of this text (see Exner, 2003), the Comprehensive System (CS) became by far the most widely used Rorschach method in the United States and in many other countries as well, and its standardization contributed to substantial advances in Rorschach research and international communication about Rorschach assessment issues.
By virtue of being a performance-based method and providing a broad range of information about personality characteristics, the RIM is frequently valuable to include in a psychological test battery. Rorschach assessment serves particularly useful purposes in clinical and forensic evaluations, which are the two contexts in which it is most often applied. This chapter reviews the value of the RIM in a test battery and how Rorschach assessment can be applied. Following these reviews, the chapter takes up four questions that frequently have been asked
about the RIM: (1) Is Rorschach assessment a psychometrically sound procedure? (2) Is Rorschach assessment actively practiced and studied? (3) Can Rorschach assessment be applied in diverse countries and cultures? (4) Is Rorschach assessment being widely and adequately taught? Although of long standing, these questions define ongoing contemporary issues, and the discussion presents current information on the psychometric foundations, usage frequency, and cross-cultural applicability of Rorschach assessment and the status of Rorschach education and training.
Value of Rorschach Assessment in a Test Battery
Following the suggestion of the Psychological Assessment Work Group (PAWG), a task force appointed by the American Psychological Association Board of Professional Affairs, it has become common practice to refer to two different types of personality assessment instruments as “self-report” or “performance-based” (Meyer et al., 2001). The data in self-report measures consist of what people say about themselves when asked; the data in performance-based measures consist of how people go about performing a task. Self-report assessment is thus a direct, explicit procedure based on acceptance of what examinees say, whereas performance-based assessment is an indirect, implicit procedure based on inferences from what examinees do.
As elaborated elsewhere, self-report and performance-based methods have some potential advantages and limitations relative to each other (Meyer, 1997; Weiner, 2005c). Usually the best way to learn something about people is to ask them about it. How people answer direct questions about themselves is more likely to provide definitive information about what they think, feel, plan, and have experienced than indirect impressions based on how they perform some task. On the other hand, what people self-report is limited to what they are able and willing to say about themselves, which depends on how fully they are aware of their own characteristics and how inclined they are to be open and truthful. Limited self-awareness or reluctance to disclose can detract from the dependability of the answers people give on a self-report inventory (Dunning, Heath, & Suls, 2004; Wilson & Dunn, 2004).
Performance-based measures can often circumvent this limitation of self-report inventories. Although indirect methods generate fewer definite conclusions than a direct inquiry, they are often more likely than self-reports to reveal personality characteristics that people do not fully recognize in themselves or are hesitant to admit when asked about them directly (Bornstein, 1999; Schmukle & Egloff, 2005). They also are more adept at providing a clinician with a glimpse of the clients’ internal representations and behavioral or problem-solving predilections. As a performance-based measure in which inferences about people are derived not from what they say about themselves but from what they actually do (i.e., say about inkblots), the RIM provides valuable balance to the self-report data obtained in a personality assessment. In recognition of the fact that self-report and performance-based methods measure personality characteristics at different levels of a person’s ability to recognize and willingness to acknowledge these characteristics, and in light of the relative advantages and limitations of these methods, contemporary authors commonly recommend an integrative approach to personality assessment that combines both kinds of measures. According to the PAWG report, for example, conjoint testing with both self-report and performance-based measures is a procedure “by which practitioners have historically used the most efficient means at their disposal to maximize the validity of their judgments about individual clients” (Meyer et al., 2001, p. 150; see also Beutler & Groth-Marnat, 2003; Weiner & Greene, 2008).
Purposes Served by Rorschach Assessment
As a personality assessment instrument, the RIM provides information about an individual’s adaptive capacities, coping style, underlying attitudes and concerns, and dispositions to think, feel, and act in certain ways. These indications of a person’s states, traits, and inner life frequently facilitate decisions in which personality characteristics are a relevant consideration. Rorschach assessment thereby serves useful purposes in clinical and forensic psychology, which commonly call for personality-based decisions and, as noted, are the two contexts in which Rorschach findings are most often applied.
Clinical Purposes
Rorschach assessment contributes to decision making in clinical cases by virtue of the relevance of personality characteristics to the manner in which various types of psychological disturbance are manifest and experienced (see American Psychiatric Association, 2000; PDM Task Force, 2006). In addition to assisting in differential diagnosis, personality characteristics identified or suggested by Rorschach data are often useful in treatment planning and outcome evaluation.
Abstract
Personality traits provide distal explanations for behavior and are compatible with personality development,
useful in clinical applications, and intrinsically interesting. They must, however, be understood in the context
of a broader system of personality functioning. For decades factor analysts offered competing models of trait
structure. By the 1980s the Five-Factor Model (FFM) emerged, and studies comparing its dimensions to
alternative models led to a growing consensus that the FFM is comprehensive. The model is also universal,
applicable to psychiatric as well as normal samples. To assess the FFM we developed the NEO Inventories,
which offer computer administration and interpretation, are available in a number of languages, and adopt a
novel approach to protocol validity. Research using the NEO Inventories has led to a reconceptualization of
the importance of the person in the social sciences, and may be the basis for a revolutionary new approach to
the diagnosis of personality disorders.
The NEO Inventories are operationalizations of the Five-Factor Model (FFM) of personality traits (Digman, 1990; McCrae & John, 1992). The FFM is currently the most widely accepted model of personality trait structure, and the NEO Inventories have been used around the world in clinical, research, and applied contexts (Costa & McCrae, 2008). In this chapter we provide an overview of the FFM and its place in personality psychology; we then give a detailed account of the development, validation, and applications of the NEO Inventories. First, however, we must address some hurdles to an appreciation of the contribution of traits themselves to an understanding of people.
Overcoming the Prejudice against Trait Psychology
Despite the fact that trait measures have been used for decades and continue to be a central feature in psychological research, there is no doubt that the trait approach is stigmatized by many psychologists and defended by few. Even some of the major contributors to contemporary trait models seem eager to distance themselves from the topic. Jerry Wiggins (1997), in a classic defense of the trait construct, backed away from the simple dispositional view that laypersons (and some of us psychologists) have of traits, claiming only that they represented regularities in behavior that called for explanation by other mechanisms. Saucier and Goldberg (1996) declared that trait theory was “a rubric that may have no meaning outside introductory personality texts” (p. 25). Historically, the major schools of psychology were taken to be psychoanalysis, behaviorism, and humanism; a recent analysis (Robins, Gosling, & Craik, 1999) replaced humanism with cognitive psychology and neuroscience. Trait psychology continues to be marginalized.
Other psychologists are openly hostile or frankly indifferent to traits. Among the objections voiced are that (a) traits are mere cognitive fictions; (b) even if they are real, they offer only descriptions, not explanations for behavior; (c) the trait construct is incompatible with human growth and development;
(d) because traits cannot be changed, they are irrelevant to clinical practice; (e) trait accounts of personality are dry and uninteresting; and (f) traits offer an incomplete account of human psychology or even personality psychology. Contemporary trait theory and research can address each of these objections.
Cognitive Fictions?
Critiques of trait psychology by Mischel (1968), Shweder (1975), and others led many researchers (especially social psychologists) to the view that traits were mere attributions that did not refer to any real psychological mechanism. The chief basis for this inference was the fact that people could easily attribute personality traits to strangers based on little or no information, and that these ratings mimicked the structure of real trait ratings. It is surely the case that people can and do make false attributions about traits, but studies of consensual validation (McCrae et al., 2004), behavior genetics (Yamagata et al., 2006), and the prediction of behaviors (Funder & Sneed, 1993) and life outcomes (Ozer & Benet-Martínez, 2006) have by now produced “uncontroversial and overwhelming evidence” (Perugini & Richetin, 2007, p. 980) of the existence of traits and the validity of their assessments from knowledgeable raters (including, of course, self-reports). Like them or not, personality traits are a fact of life.
Trait Explanations
From Lamiell (1987) to Cervone (2004), some critics have argued that traits at best provide a description of patterns of behavior; they cannot provide a causal explanation of it. A full response to this charge cannot be made here; the interested reader can see McCrae and Costa (1995, 2008a). But in brief, we argue that social-cognitive explanations are proximal, whereas trait explanations are distal. Both forms of explanation are legitimate, and both are useful in some contexts. If one is trying to prevent quarrels between two lab partners who must cooperate to pass a required course, proximal explanations grounded in the situational context are probably more useful than distal ones. But if one is trying to understand why an individual quarrels with her lab partner, is prone to fits of jealousy, regards herself as better than others, and alienates her roommate, then a distal, trait explanation (one suspects disagreeableness) is more useful.
Trait Immutability?
The philosopher Ralph Waldo Emerson was perhaps the first to raise the objection that “temperament puts all divinity to rout” (Emerson, 1844/1990). Instead of the freedom and spontaneity that we think we see in people’s behavior, long-term observation shows that actions simply reflect underlying and enduring traits. In Dweck’s (2000) terminology, Emerson was by preference an “incremental theorist” who wanted to believe in the malleability of traits but was forced by long experience to become an “entity theorist,” seeing traits as fixed. Dweck and other researchers have shown that these implicit views of personality have powerful effects on personal motivation and on attributions about others. Some of the harshest views of trait psychology come from researchers who seem to be incremental theorists and who believe that trait psychology is only compatible with entity theories. Lifespan developmentalist Orville Brim, for example, was quoted as saying “Properties like gregariousness don’t interest me. . . . You want to look at how a person grows and changes, not at how a person stays the same” (Rubin, 1981, p. 24).
Curiously, even some trait psychologists seem to regard others as entity theorists, attributing to them a belief in the immutability of traits (Costa & McCrae, 2006). But the data do not support any claim of immutability. Individuals change in rank order (Terracciano, Costa, & McCrae, 2006), and people as a whole show predictable developmental changes in the mean levels of traits. For example, conscientiousness increases from adolescence through age 70, whereas extraversion declines, albeit very gradually (Terracciano, McCrae, Brant, & Costa, 2005). Trait levels rise or fall in response to clinical depression and its remission (Costa, Bagby, Herbst, & McCrae, 2005), and neurological conditions such as Alzheimer’s disease profoundly affect personality traits (Siegler et al., 1991).
McCrae and colleagues (2000) are perhaps singled out as entity theorists who conceptualize traits as “inherent and immutable internal dispositions” (Johnson, Hicks, McGue, & Iacono, 2007, p. 266) because their Five-Factor Theory (FFT; McCrae & Costa, 2008b) postulates that traits are biologically based, largely uninfluenced by life experience. A study of essentialism by Haslam, Rothschild, and Ernst (2000) showed that laypersons have a conception of “natural kinds” of categories characterized by discreteness, naturalness, immutability, historical stability, and necessary features. The categories “men” and “women,” for example, are distinct types, biologically based, fixed (at least until the advent of sex-change surgery), seen throughout history, and quintessentially defined by the number of X-chromosomes they have. Perhaps the claim that personality traits
[Figure 16.1 (diagram not reproduced). Labeled components: Basic tendencies (Neuroticism, Extraversion, Openness, Agreeableness, Conscientiousness); Characteristic adaptations (culturally conditioned phenomena: personal strivings, attitudes); Self-concept (self-schemas, personal myths); External influences (cultural norms, life events: situation); and emotional reactions, mid-career shifts: behavior. The components are connected by arrows labeled “Dynamic processes.”]
Fig. 16.1 A representation of the FFT personality system. Core components are in rectangles; interfacing components are in ellipses. Adapted from
McCrae & Costa, 2008b.
implicitly “delineate the particular circumstances in which the behavioral trait manifestations are likely to take place” (p. 17). If we assert that extraverts like to laugh, sing, and dance, it is understood by any competent English speaker that we mean “when the circumstances permit it,” not, say, at a funeral or in a hostage situation. FFT goes farther than this by making explicit that characteristic adaptations are formed when traits operate in the context of the social environment, and that behavior emerges from the contextualized expression of characteristic adaptations.
In FFT the elements of the personality system are connected by arrows that represent dynamic processes; these are the causal pathways that specify the operation of the system. The distinctive feature of FFT, in contrast to most other personality theories (e.g., Rentfrow, Gosling, & Potter, 2008), is the lack of any direct arrow from “External influences” to “Basic tendencies.” This is a highly controversial feature of the theory, but it is supported by surprisingly strong data, and, even if it ultimately turns out to be incorrect, seems well chosen to inspire revealing research.
The Structure of Personality Traits
The Problem of Structure
People are routinely characterized by a vast number of more or less distinct traits. They may be called “arrogant,” “bawdy,” “complicated,” “dense,” “ebullient,” and so on. Psychologists and psychiatrists have created hundreds of technical terms to refer to trait-like characteristics, such as psychological mindedness, intolerance of ambiguity, and ego strength. An indefinitely long list of traits poses a host of problems for the science of personality psychology: How do we know that any particular selection adequately covers the range of relevant traits? If we are interested in personality development, what traits ought we to select for a longitudinal study? How can a clinician be confident that she has assessed those aspects of personality that explain a client’s problems in living? How can we compare the results of two studies when different traits are chosen by different investigators? The possibilities for systematic and cumulative research and assessment are distinctly limited.
Fortunately, psychologists long ago realized that traits are not independent, but almost always covary with other traits. People who are “arrogant” are also “haughty”; in fact, it is not clear that these are any more than different words for the same trait. But beyond mere synonyms, it is clear that traits show empirical relations. People who are energetic are also likely to be cheerful, and every clinician knows that clients who are prone to anxiety are also likely to experience feelings of depression. These observations long ago led personality psychologists to seek clusters of traits that could provide a framework for
CPI | McCrae et al., 1993 | Well-Being (R); Sociability; Flexibility; Femininity; Achievement via conformance
MBTI | McCrae & Costa, 1989 | Extraversion; Intuition; Feeling; Judging
PAI | Costa & McCrae, 1992a | Borderline features; Paranoia (R)
PRF | Costa & McCrae, 1988 | Defendence; Play; Understanding; Aggression (R); Order
BPI | Costa & McCrae, 1992a | Anxiety; Self-depreciation (R); Denial (R); Interpersonal problems (R)
DAPP | Markon et al., 2005 | Identity disturbance; Restricted expression (R); Stimulus seeking; Callousness (R); Compulsivity
SNAP | Markon et al., 2005 | Dependency; Exhibitionism; Eccentric perceptions; Manipulativeness (R); Workaholism
MMPI-2 | Quirk et al., 2003 | Psychasthenia; Social introversion (R); Psychopathic deviance (R)
MCMI-II | Costa & McCrae, 1990 | Passive–aggressive; Schizoid (R); Antisocial (R); Compulsive
MIPS | Millon, 1994 | Hesitating; Outgoing; Intuiting; Agreeing; Systematizing
TCI | De Fruyt, Van de Wiele, & Van Heeringen, 2000 | Self-directedness (R); Harm avoidance (R); Self-transcendence; Cooperativeness; Persistence
16PF | Conn & Rieke, 1994 | Anxiety; Extraversion; Tough-mindedness (R); Independence (R); Self-control
Notes: All factor loadings or correlations between the factors and associated scales are greater than .40 in absolute magnitude. “(R)” indicates a reversed scale. CPI = California Psychological Inventory (Gough, 1987), MBTI = Myers–Briggs Type Indicator (Myers & McCaulley, 1985), PAI = Personality Assessment Inventory (Morey, 1991), PRF = Personality Research Form (Jackson, 1974), BPI = Basic Personality Inventory (Jackson, 1989), DAPP = Dimensional Assessment of Personality Pathology (Livesley & Jackson, 2008), SNAP = Schedule for Nonadaptive and Adaptive Personality (Clark, 1993), MMPI-2 = Minnesota Multiphasic Personality Inventory-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989), MCMI-II = Millon Clinical Multiaxial Inventory (Millon, 1983), MIPS = Millon Inventory of Personality Styles (Millon, 1994), TCI = Temperament and Character Inventory (Cloninger, Przybeck, Svrakic, & Wetzel, 1994), 16PF = Sixteen Personality Factor Questionnaire (Conn & Rieke, 1994).
variety of instruments onto a common framework. This is of particular importance in meta-analyses, where a variety of different personality measures have been used to study the same phenomenon. It has now become routine to use the FFM as the framework for meta-analyses (e.g., DeNeve & Cooper, 1998; Feingold, 1994; Roberts, Walton, & Viechtbauer, 2006).
The FFM as a Theory of Everyone
Research conducted in the Augmented BLSA showed that the FFM can be generalized across a wide range of personality constructs and measures, but the BLSA is a very select sample of healthy and generally well-educated volunteers. It was not clear when we began our research whether the structure we saw could be replicated in different populations. Indeed, our earlier work on the three-factor model had been conducted in a population of men (Costa & McCrae, 1980), so its replication in women was an important step. Because we had worked only with adult samples from longitudinal studies, we did not know for some time whether the FFM could be found in college samples. A large and diverse sample of employees from a national organization allowed us to compare personality structure in younger versus older groups, in men versus women, and in White versus non-White subsamples. The FFM was found in all groups (Costa, McCrae, & Dye, 1991). The structure was later replicated in a sample of older African-Americans (Savla, Davey, Costa, & Whitfield, 2007). Recently, using a version with slightly simplified language, we found the same structure in middle school children (Costa, McCrae, & Martin, 2008).
All of these studies, however, were conducted in North American samples using the original English-language version of the test. Beginning around 1990, investigators around the world began to approach us and ask if they could translate the NEO-PI-R. At that time it was an open question whether the FFM could be found in other cultures, and whether the items of the NEO-PI-R could be meaningfully translated. One reviewer doubted it: “The simplistic (a posteriori) basis of the Five-Factor Model, as it is derived from colloquial usage of language, makes the model and its tools intrinsically bound to the culture and language that spawned it. Different cultures and different languages should give rise to other models that have little chance of being five in number nor of having any of the factors resemble those derived from the linguistic/social network of middle-class Americans” (Juni, 1996, p. 864).
That was a testable hypothesis, and as translators began to gather data the results soon became clear: Neuroticism and conscientiousness were invariably found, and openness was always suggested, although some of its facets occasionally failed to show substantial loadings (McCrae & Costa, 1997b). Extraversion and agreeableness emerged in most cultures, but in a minority of cultures, varimax rotation instead produced love and dominance factors (Rolland, 2002). Readers familiar with the interpersonal circumplex will recognize these as two of its axes, about 45° away from the positions of extraversion and agreeableness (McCrae & Costa, 1989b). Because the axes chosen to define a circumplex are always more or less arbitrary, we eventually learned to compare factor structures after rotating them to maximum similarity. This is a form of confirmatory factor analysis that seems better suited to the analysis of personality instruments than the maximum likelihood methods that are sometimes used (McCrae, Zonderman, Costa, Bond, & Paunonen, 1996). The procedure we now recommend uses targeted rotation and evaluates the degree of replication by computing congruence coefficients for both factors and variables; factor congruence coefficients above .85 are considered replications (Lorenzo-Seva & ten Berge, 2006).
The cross-cultural generalizability of the FFM was tested again in a study of 50 cultures (McCrae et al., 2005). Unlike previous studies, which had examined self-reports of personality, this study asked college students to rate someone they knew well on the third-person version of the NEO-PI-R (Costa & McCrae, 1992b), which had been translated into 26 different languages. Assigned targets were either college age or adults over age 40. A factor analysis of the full sample (N = 11,985) was compared to the adult American self-report normative structure; factor congruence coefficients were .98, .97, .97, .97, and .97 for neuroticism, extraversion, openness to experience, agreeableness, and conscientiousness, respectively, suggesting almost perfect replication. The structure was also clearly replicated in college-age males, college-age females, adult males, and adult females, with congruence coefficients ranging from .96 to .98.
When analyses were conducted on the 50 individual cultures, the smaller sample sizes introduced more error. Even so, the total congruence coefficient was greater than .90 in 46 (92%) of the cultures. The lowest factor congruences (as low as .53) were found in five Black African nations (Burkina Faso, Nigeria, Uganda, Ethiopia, and Botswana), which might
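As an illustration of the factor congruence coefficients discussed in this section, the following is a minimal sketch, assuming the index in question is Tucker's congruence coefficient (the subject of the Lorenzo-Seva and ten Berge paper cited above). It compares one factor's loadings in two samples and applies the .85 replication criterion quoted in the text. The function names and the loading values are hypothetical, and the sketch omits any rotation of one solution toward the other that a full analysis might perform first.

```python
import math


def congruence_coefficient(x, y):
    """Tucker's congruence coefficient between two loading vectors.

    x and y hold the loadings of the same variables on one factor
    in two different samples (same order, same length).
    """
    if len(x) != len(y):
        raise ValueError("loading vectors must have the same length")
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if denominator == 0:
        raise ValueError("a loading vector with all zeros has no direction")
    return numerator / denominator


def is_replication(phi, threshold=0.85):
    """Apply the 'above .85' criterion quoted in the text."""
    return phi > threshold


if __name__ == "__main__":
    # Made-up loadings for six variables on one factor in two samples.
    normative = [0.75, 0.70, 0.68, 0.10, -0.05, 0.12]
    new_sample = [0.72, 0.66, 0.71, 0.15, 0.02, 0.08]
    phi = congruence_coefficient(normative, new_sample)
    print(round(phi, 3), is_replication(phi))
```

Because the coefficient depends only on the direction of the two loading vectors, multiplying all loadings in one sample by a positive constant leaves it unchanged.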
Special Features of the NEO Inventories
COMPUTER INTERPRETATION
Like most published personality instruments, the NEO Inventories lead a double life: They are widely used in research on groups, which accounts for their familiarity in scientific journals, but they are also intended for the assessment of individuals, where research findings can help psychologists and psychiatrists understand real human beings and their problems and promise. Profile sheets are available on which clinicians or counselors can plot raw scores and obtain normed profiles; these can be interpreted by reference to the manual, the literature, and training materials provided at occasional workshops.
But a simpler method relies on computer technology. Since their original publication, the NEO Inventories have included the option of computer administration, scoring, and interpretation. Computer interpretation of personality inventories was relatively new in 1985, and our major concern was that the interpretations offered were scientifically based. In the first manual we gave illustrations of cases and the reports that the computer would
The NEO-PI-R is widely used in clinical practice as a basic tool for understanding the client and establishing rapport, anticipating the course of therapy, and selecting appropriate forms of therapy (Piedmont, 1998; Singer, 2005). Although the clinical value of the FFM was pointed out some time ago (Costa, 1991), much research remains to be done to demonstrate the most effective ways to utilize information from the NEO-PI-R in clinical practice (McCrae & Sutin, 2007). In contrast, there have been hundreds of studies on the diagnostic relevance of FFM traits, especially in relation to depression (Bagby et al., 1998), psychopathy (J. D. Miller, Lynam, Widiger, & Leukefeld, 2001), and the personality disorders (Costa & Widiger, 2002). These studies make it clear that, as T. A. Widiger has remarked, the NEO Inventories do not measure “normal” personality traits; they measure “general” personality traits, applicable to everyone and relevant to many forms of psychopathology.
Perhaps the most exciting potential application of the FFM is in the revision of DSM-V’s Axis II,
Edwin I. Megargee
Abstract
This chapter first describes the California Psychological Inventory (CPI) and presents a brief overview of
how it has been used (Gough, 1957, 1987; Gough & Bradley, 1996). Next, it discusses Harrison Gough’s
philosophy of test construction and how it guided the development and evolution of the CPI. The third
section provides a brief description of the 20 current CPI folk construct scales, focusing on their
construction and the correlates of high and low scores. This is followed by a brief account of the CPI’s
factor structure. The next section discusses the CPI’s three vector scales assessing the dimensions of
person orientation, values orientation, and level of realization and describes how they are used to define
four personality types. The concluding section offers a brief overview of CPI interpretation.
Scientists and theologians continue to debate whether life as we know it was created all at once or whether it evolved gradually through a process of trial and error. There can be no doubt, however, regarding the origins of the California Psychological Inventory (CPI; Gough, 1957, 1987; Gough & Bradley, 1996). The CPI as we know it definitely evolved over a period of six decades.
The CPI’s immediate ancestor was the original Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1943), from which its author, Harrison Gough, borrowed items and empirical scale construction techniques as he sought to measure constructs that were relevant to everyday social interactions such as academic achievement (Gough, 1953a, c), social status (Gough, 1948), and prejudice (Gough, 1951). Other forebears were scales developed to assess behaviors relevant to research in sociology and political science, such as delinquency, dominance, and responsibility (Gough, McClosky, & Meehl, 1951, 1952; Gough & Peterson, 1952).
Although the DNA for the CPI originated on the snowy campus of the University of Minnesota where Gough, a native Minnesotan, obtained his BA in sociology (1942) and his MA (1947) and PhD (1949) in psychology, it developed, matured, and eventually flourished on the fog-shrouded hillsides of the University of California at Berkeley, specifically at the Institute of Personality Assessment and Research (IPAR), which specialized in broad but in-depth assessments of well-functioning and creative individuals. There additional CPI scales were developed and there Gough was able to relate these measures to wide arrays of collateral information collected in the course of IPAR evaluations.
The first copyrighted edition of the CPI contained 548 true–false items and included 15 scales. Five years later, an 18-scale, 480-item version was issued by Consulting Psychologists Press, along with a series of administrator’s manuals (Gough, 1957, 1960, 1969). In 1972, the author’s California Psychological Inventory Handbook, which attempted to integrate the fast-burgeoning CPI literature, was published (Megargee, 1972). During the 1960s and 1970s, the conceptual validity of the “CPI 480,” as it
later came to be called, was strengthened as the 18 scales were related to other measures and to behavioral observations in the course of IPAR evaluations and independent personality research.
The next generation in the evolution of the CPI was a 462-item version (the “CPI 462”) which added two new scales, a revised profile sheet that was more consistent with the CPI’s factor structure, and a new approach to interpretation based on three new “vector scales” utilizing a “cuboid theoretical model” (Gough, 1987). And, in less than a decade, a third revision, the “CPI 434,” was brought forth with revised items designed to be compatible with the 1990 Americans With Disabilities Act (ADA), along with a new 419-page administrator’s manual (Gough & Bradley, 1996) that was substantially more detailed and comprehensive than any of its predecessors.
In the pages that follow, I will first describe the CPI and then present a brief overview of how it has been used. Next, I will discuss Harrison Gough’s philosophy of test construction and how it guided the evolution of the CPI. The third section will provide a brief description of the 20 current CPI folk construct scales, focusing on their construction and the correlates of high and low scores. This will be followed by a brief account of the CPI’s factor structure. The next section will discuss the three vector scales, the four CPI personality types, and the cuboid model. This will be followed by a list of special purpose and research scales that may be scored at the examiner’s discretion. The concluding section will offer a brief overview of CPI interpretation.
Description
The current version of the CPI is a 434-item, true–false, self-administering personality inventory that is scored on 20 scales reflecting “folk culture” constructs drawn from everyday life and social discourse in cultures around the world. It can be taken on a computer or in paper and pencil format with a reusable test booklet. The raw scores are converted to T-scores and displayed on a profile sheet on which high scores reflect the positive aspect of the various dimensions.
The profile sheet groups the scales into four clusters or sectors. Sector I consists of seven scales assessing various aspects of interpersonal effectiveness, social poise, and self-assurance. Sector II is comprised of seven scales measuring adjustment, socialization, and normative values. Three of these measures also serve as validity scales. Sector III contains three scales assessing cognitive and intellectual functioning and achievement motivation. Sector IV includes three scales assessing roles and personal styles.
Recorded separately are the scores on three orthogonal “vector” or “structural” scales assessing dimensions underlying the 20 folk culture scales. Vector 1 assesses interpersonal orientation, ranging from extraversion to introversion, while Vector 2 assesses incorporation of social values, ranging from norm-favoring to norm-questioning. Plotted together in a classificatory grid, the four possible combinations of high and low scores on these two vectors define four different personality types. Type Alpha is extraverted and norm accepting, Type Beta is introverted and norm accepting, Type Gamma is extraverted and norm questioning, and Type Delta is introverted and norm questioning (see Figure 17.1).
Fig. 17.1 Grid classifying the four CPI types on the basis of their scores on Vector Scales 1 and 2. Vector 1 runs from Externality (low) to Internality (high); Vector 2 runs from Norm-questioning (low) to Norm-favoring (high). Quadrants: Alpha (low Vector 1, high Vector 2), Beta (high Vector 1, high Vector 2), Gamma (low Vector 1, low Vector 2), Delta (high Vector 1, low Vector 2). Adapted from Gough (1989, Fig. 4-1).
Vector 3 defines seven levels of psychological integration and fulfillment that indicate how effectively people of each personality type realize its potential. They range from ineffectiveness, maladjustment, and poor integration at the lowest levels, through average realization and coping ability in the mid-range, to superior integration, and self-actualization at the highest levels (Gough, 1989; Gough & Bradley, 1996).
Use
Over the years, the CPI has been widely used by practitioners and researchers in the fields of education, counseling, and industrial/organizational psychology. Some of the most frequently studied areas are the following:
MEGARGEE 325
items with the highest correlations with the total scores on this preliminary measure were then chosen for further validation.
When the various scales were first gathered together, profiled, and published as the CPI, it was decided that high scores on all scales (except for F/M) would reflect the more culturally approved or socially valued pole. This resulted in changing the direction and renaming some scales such as Delinquency (De), which became Socialization (So), and Prejudice (Pr), which became Tolerance (To). Over the years, as the CPI was used in IPAR assessments, the empirical correlates of these scales were further explored as they were correlated with a wide array of other measures used in these evaluations.
The Twenty Folk Culture Scales
Sector I Scales: Measures of Poise, Self-Assurance, and Patterns of Interpersonal Behavior
Gough’s first sector consists of seven scales which measure ascendancy, social skills, and self-confidence: Dominance (Do), Capacity for Status (Cs), Sociability (Sy), Social Presence (Sp), Self-Acceptance (Sa), Independence (In), and Empathy (Em). Gough regards these seven scales as reflecting different “facets or themes within the domain of interpersonal effectiveness” (Gough & Bradley, 1996, p. 86).
DOMINANCE (DO; 36 ITEMS)
The Dominance scale was originally derived in connection with research on political participation (Gough et al., 1951). The goal was to identify strong, dominant, influential, and ascendant individuals who are able to take the initiative and exercise leadership. It was constructed by contrasting the responses of male and female high school and college students who had been nominated by their peers as being exceptionally high or low in social influence, self-confidence, and leadership ability in face-to-face interactions (Megargee, 1972). Many validity studies have shown that individuals with high Do scale scores are described as self-confident, verbally fluent, assertive, enterprising, and able to influence others in a socially positive manner, while those with low Do scores are seen as socially awkward, reticent, submissive, ineffective, and lacking ambition.²
CAPACITY FOR STATUS (CS; 28 ITEMS)
Cs was developed to provide a personality scale that would correlate significantly with external criteria of socioeconomic status (SES) such as the relative levels of income, education, prestige, and power attained in one’s sociocultural milieu (Gough, 1968). However, over time it came to be regarded as a measure of the personal attributes such as ambition and self-assurance that led to the attainment of social status rather than SES per se (Gough & Bradley, 1996). It was initially derived by comparing the MMPI responses of small-town Minnesota high school students who were one standard deviation above or below the group’s mean on a measure of SES. In addition to differing in SES, the initial criterion groups also differed significantly in IQ and grade point average (Gough, 1948).
High scorers on Cs are typically described as being ambitious, optimistic, socially effective, verbally fluent, and having wide-ranging interests. Low scorers are seen as being inadequate, socially ineffective, inarticulate, and low in achievement motivation.
SOCIABILITY (SY; 32 ITEMS)
Originally designated the Sp or “Social Participation” scale, the Sy scale was devised to differentiate people with an outgoing, sociable, participative temperament from those who shun involvement and avoid social visibility (Gough, 1952c, 1968, 1969). The CPI item pool was administered to three samples of high school seniors who ranked in the top and bottom quartiles of their classes in the number of extracurricular activities in which they participated. Those items that significantly differentiated the groups in all three samples were selected. Although it was originally designed to predict participation in social activities, subsequent research with the measure convinced Gough that it assessed sociability more than social participation, and he renamed the scale accordingly.
People with high Sy scale scores enjoy going out, meeting, and interacting with people. They make favorable impressions on others and are described as being outgoing, extraverted, self-confident, and talkative. Low scorers prefer more solitary pursuits, such as reading or watching TV. They hang back, speak only when spoken to, and are seen as being awkward, inept, reticent, and inhibited.
SOCIAL PRESENCE (SP; 38 ITEMS)
The Sp scale was constructed by means of internal consistency analyses to assess poise, self-confidence, verve, and spontaneity in social interactions (Gough, 1968, 1969). A pool of items that appeared to reflect these behaviors was assembled and administered to unspecified samples. The
written items of high school and college students who were nominated as being high or low in responsibility by fellow students or by teachers.
Re differs from the related So and Sc scales in that Re emphasizes the degree to which values and controls are conceptualized and understood (Gough, 1965a). High scorers are described as being conscientious, reliable, trustworthy, and attentive to their duties and obligations. Low scorers are seen as irresponsible, undisciplined, hedonistic, unpredictable, and unethical. Rebellious and self-indulgent, they are apt to break the rules and may have records of delinquent or even criminal behavior. Early Sunday morning, a high scorer is apt to be dressing for church while a low scorer is lurching home from a night on the town.
SOCIALIZATION (SO; 46 ITEMS)
So was originally published independently as the Delinquency (De) scale (Gough & Peterson, 1952). De was constructed through external criterion analyses by comparing the responses of several samples of delinquent and nondelinquent adolescents to items reflecting Gough’s role-taking theory of sociopathy (Gough, 1948; Gough & Peterson, 1952). When the De scale was included on the CPI, its keying was reversed and it was renamed Socialization. Subsequent research demonstrated that So actually ordered people throughout the full range of socialization, with low scores indicating antisocial and even criminal behavior, mid-range scores representing ordinary adherence to social norms, and high scores identifying people of unusual honesty and integrity such as high school “best citizens” (Gough, 1960, 1994; Megargee, 1972). Therefore, the present So scale reflects the degree of social maturity, integrity, and rectitude the individual has attained (Gough, 1965b).
Moderately high scorers are described as being conscientious, honest, reliable, and well-organized, with strong, internalized values. Those with extremely high scores may be overly righteous and moralistic. Low scorers are seen as unconventional, independent, and undependable, with poorly developed values. As a result, they are apt to flout society’s rules and get into trouble with the law.
SELF-CONTROL (SC; 38 ITEMS)
Initially termed “Impulsivity,” the Sc scale was designed to assess nondelinquent freedom from restraints “aimed at impetuosity, high spirits, caprice, and a taste for deviltry” (Gough, 1987, p. 45). Later, the Impulsivity scale was reversed and renamed Self-control in accordance with the decision to have high scores on all CPI scales reflect positive attributes.
The Sc scale was constructed through internal consistency analyses in which a small cluster of CPI items that appeared to reflect impetuous behavior was rationally devised. Scores on this cluster were then correlated with the total CPI item pool; those having the highest correlations for both genders in various unspecified research samples were selected (Gough, 1987). When the revised CPI 462 was published, the scale was shortened by deleting a number of items that overlapped with other scales.
Individuals with elevated Sc scale scores are described as disciplined, self-denying, judgmental, and conscientious. Indeed, Groth-Marnat (1997) suggests that extremely high scores may be indicative of my syndrome of overcontrol of hostile impulses (Megargee, 1966). Low scorers are seen as impulsive, willful, uninhibited, and pleasure seeking.
GOOD IMPRESSION (GI; 40 ITEMS)
Gi is one of three validity scales on the CPI, the others being Communality (Cm) and Well-being (Wb). The original purpose of the scale was to identify dissimulated records for which the normative data did not apply (Gough, 1952b, 1968). It is also used to assess the degree to which test-takers engage in impression management in their responses to the CPI items and their concern about how others react to them (Gough, 1969).
Gi was constructed through external criterion analyses of the responses of high school students to 150 specially devised items. The questionnaire was first administered using standard instructions and later with instructions to respond as if one were applying for a very important job or trying to make an especially favorable impression. Those items which changed significantly from the first to the second administration were selected (Gough, 1952b). Gi was thus one of the first scales constructed to assess what came to be called “social desirability.”
Gi raw scores ≥30 suggest positive dissimulation, especially if the case history suggests that the client’s overall adjustment is less than perfect. Moderately elevated scores are obtained by people who are tactful, obliging, helpful, and concerned with making a favorable impression on others, especially superiors. However, they may be less considerate of subordinates. Low-scoring individuals have an arrogant disregard for the feelings or opinions of others. They are seen as cynical, dissatisfied, complaining nonconformists who want to have things their own
appreciation of structure and organization” (Gough, 1968, p. 69).
The Ac scale was constructed by the external criterion method. After reviewing the literature on academic achievement, Gough (1953c) made up a pool of items that appeared to be related to high school achievement. They were administered to samples of students matched for IQ but differing in grade point average. Items discriminating the higher- and lower-achieving students in all the samples were retained. The scale was initially cross-validated by correlating scores with grade point averages in new samples. Ac proved to be effective in predicting high school achievement, but did not correlate well with college grades.
High scorers are described as being ambitious, well-organized, and effective in structured settings. They are seen as hard-working, well-adjusted, conscientious and diligent, with traditional social and family values. Low scorers are described as nonconformists who feel confined by regulations and structure. If they have high ability, as indicated by Ai and Ie, they may be inventive or creative. If not, they are likely to be low achievers who are more interested in having fun than they are in following the rules or attaining academic success. Such rebellious behavior can lead to authority conflicts and delinquency.
ACHIEVEMENT VIA INDEPENDENCE (AI; 38 ITEMS)
Ai was initially developed to forecast college-level academic success, especially in undergraduate courses in psychology (Gough, 1953a). The derivation method closely paralleled that used with Ac, with a specially formulated pool of items being administered to samples of higher- and lower-achieving college students, except that with Ai the samples were not matched for intelligence. As data on the scale accumulated, it appeared that Ai predicted achievement in settings where independence of thought, creativity, and self-actualization were rewarded. Hence, Ai, which had initially been dubbed Honor Point Ratio (Hr), was renamed Achievement via Independence.
High scorers on Ai are described as being bright, independent, and clear thinking, with a high tolerance for ambiguity. They have wide-ranging interests and do well dealing with complexity and abstractions. However, they may be somewhat egotistic and cool toward others. Low scorers are lacking in ambition and do not do well in high school or college, especially on ambiguous or unstructured tasks. Interpersonally, they are warmer and more comfortable and conventional than their high-Ai peers.
INTELLECTUAL EFFICIENCY (IE; 42 ITEMS)
Originally referred to as a “nonintellectual intelligence test,” the Ie scale was constructed to provide a set of personality items that would correlate significantly with accepted measures of intelligence. Ie was constructed empirically by contrasting the responses of senior high school students scoring in the top and bottom quartiles on conventional IQ tests to a pool of MMPI items supplemented by specially written items (Gough, 1953b). The scale was cross-validated and refined by correlating it with measures of intelligence in various samples of high school and college students.
Those scoring high on Ie are described as being intelligent, effective, persevering, capable people who are successful and achievement-oriented. Low scorers are seen as dull, uncertain, and ineffective people who are uninterested in intellectual activities.
Sector IV Scales: Measures of Role and Personal Style
Sector IV includes three scales assessing intellectual and interest modes and personal style: Psychological-mindedness (Py), Flexibility (Fx), and Femininity/Masculinity (F/M).
PSYCHOLOGICAL-MINDEDNESS (PY; 28 ITEMS)
The purpose of the Py scale was to identify people interested in and capable of outstanding achievement in academic psychology (as opposed to other fields). During the late 1940s and early 1950s, a pool of 300 experimental items was given to 25 outstanding young psychologists who, in their later careers, rose to positions of eminence in the field, being elected to office in state, regional, and national professional organizations, winning numerous awards for professional and research contributions to psychology, and producing a vast array of publications. Their responses were contrasted with those of people in other fields and training programs. The 75 items that distinguished these groups were then correlated with instructors’ ratings of the competence and potential of graduate students in psychology. Those items that correlated significantly with this second criterion were retained (Gough, 1968).
High scorers are described as being bright, fluent, individualistic, perceptive, and creative, with wide-ranging interests and an openness to new experiences. They excel at research, but are somewhat distant and cool in their interpersonal relations. Low scorers are seen as conventional, practical, down to earth, and wedded to routine. Somewhat
achievement orientation. Gough (1987) calls this factor “control.” This factor appeared as Factor 1 in analyses of the CPI 480, where it was variously described as “positive adjustment,” “conformity and value orientation,” or “mental health and personal efficiency” (Megargee, 1972).
Factor 3 on all forms of the CPI is defined by large loadings from Scales To, Ai, and Fx, as well as by smaller loadings from Py and Ie. These scales all deal with independent thinking and openness to new experiences. Gough (1987) calls this factor “flexibility.” Previous analysts have referred to it as “capacity for independent thought and action” and “adaptive autonomy” (Megargee, 1972).
Factor 4 has its primary loadings from Scale Cm along with Scales Re, So, and Wb, which reflect conventional social values and conformity. Gough (1987) terms this factor “consensuality.” Other analysts have described it as “conventionality,” “social conformity,” and “super-ego strength” (Megargee, 1972).
Smallest Space Analysis
SSA is a technique for geometrically representing the relationships among data points such as correlation coefficients. Karni and Levin’s (1972) SSA of the CPI identified three dimensions: a “person orientation,” a “values orientation,” and a “self-orientation.” This SSA analysis, which is compatible with the factor analytic findings, influenced the next major advance in the evolution of the CPI, namely the development of orthogonal “vector” scales assessing these three dimensions.
Vector Scales and the Cuboid Model
The Three Vector Scales
Approximately three decades after the publication of the CPI 480, the second generation of CPIs, Forms 462 (Gough, 1987) and 434 (Gough & Bradley, 1996), were introduced. In addition to two new folk culture scales, In and Em, these two new forms of the CPI had evolved to include three new “vector” scales assessing the dimensions identified in Karni and Levin’s (1972) SSA analysis. These three measures were used to define four CPI-based personality types, each with seven different levels of realization.
Each vector scale was designed to be bipolar and unrelated or orthogonal to the other two. Gough (1987) constructed them by means of internal analyses in which he sought items that had high correlations with the CPI scales that marked each SSA cluster along with negligible correlations with the scales marking the other clusters, a process he described as a “long and arduous task, involving a great deal of trial and error, and taking a number of years to complete” (Gough & Bradley, 1996, p. 14).
VECTOR 1 (V. 1; 34 ITEMS)
Vector 1 measures the “external/internal” or “involved/detached” dimension associated with the SSA “person orientation.” Scales Sa, Sy, Do, and Sp were used as markers for the extraversion pole while F/M, Sc, and Gi were the markers for introversion (Gough, 1987). According to Gough (1989, p. 78), “Lower scores on v. 1 are associated with a more outgoing, involved, interpersonally responsive orientation, and high scores with a more inwardly directed, private, and interpersonally focused inclination.”
VECTOR 2 (V. 2; 36 ITEMS)
Vector 2 assesses the norm-questioning/norm-favoring dimension associated with the SSA “value orientation.” The norm-favoring pole was marked by scales Cm and So, while the norm-questioning dimension was marked by Scale Fx. Gough (1989, p. 78) wrote, “Lower scores on v. 2 are associated with rule-questioning, rule-resisting, and even rule-violating dispositions. Higher scores are associated with rule-accepting, rule-observing, and even rule-cathecting tendencies.”
VECTOR 3 (V. 3; 58 ITEMS)
The third vector scale was designed to define seven levels of self-realization or competence. Items were selected that had minimal correlations with the v. 1 and v. 2 scales and maximal correlations with the 19 CPI folk scales exclusive of Scale F/M. The sum of the T-scores on Scales Wb, To, and Ie served as a marker for Scale v. 3. According to Gough (1989, p. 80), “Persons with high scores on v. 3 tend to be described by others as intelligent, resourceful, and insightful. Persons with low scores tend to be described as complaining, narrow in interests, and self-defeating.”
The Cuboid Model
When scores on the first two orthogonal vector scales, v. 1 and v. 2, are plotted at right angles on the reverse side of the CPI profile sheet, they form a grid with four equal quadrants (see Figure 17.1). Each quadrant, labeled Alpha, Beta, Gamma, or Delta, is thought to define “four lifestyles or ways of living . . . each having its own potentiality and particular type of fulfillment, and each having its own pitfalls and dangers to be avoided” (Gough, 1989, p. 79).
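The quadrant assignment described for the cuboid model can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name is hypothetical, the cutting scores are parameters supplied by the caller because the chapter does not report them (they would have to come from the manual), and treating a score exactly at the cut as "high" is an arbitrary choice made here. The mapping itself follows the text: low v. 1 is the outgoing (external) pole, high v. 1 the inwardly directed (internal) pole, and high v. 2 the norm-favoring pole.

```python
def classify_cpi_type(v1, v2, v1_cut, v2_cut):
    """Return the quadrant label for a pair of vector-scale scores.

    v1_cut and v2_cut are hypothetical cutting scores supplied by the
    caller; they are not taken from the chapter or the manual.
    A score equal to its cut is treated as high (an assumption).
    """
    internal = v1 >= v1_cut        # high v. 1: inwardly directed, private
    norm_favoring = v2 >= v2_cut   # high v. 2: rule-accepting
    if norm_favoring:
        return "Beta" if internal else "Alpha"
    return "Delta" if internal else "Gamma"


if __name__ == "__main__":
    # Made-up scores and cut points, for illustration only.
    for v1, v2 in [(10, 30), (25, 30), (10, 12), (25, 12)]:
        print(v1, v2, classify_cpi_type(v1, v2, v1_cut=18, v2_cut=21))
```

The level of realization from v. 3 is left out of the sketch because the chapter gives no boundaries for the seven levels.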
As noted earlier, Cm raw scores ≤24 indicate atypical or nonresponsive responding, such as
fixed-pattern or random responding. Raw scores ≥30 on Gi suggest positive dissimulation or faking
good, while Wb raw scores ≤20 imply faking bad (malingering) or serious personal problems. A set of
three more precise equations involving weighted combinations of raw scores on several scales, which
are applied sequentially in a decision tree approach, has been devised for CPI 462 and CPI 434
(Lanning, 1989). These equations and their cutting scores, along with relevant normative and
validity data, can be found in the manual (Gough & Bradley, 1996, pp. 67–73).

Interpreting the Profile
CPI interpretation begins with ascertaining the cuboid type (Alpha, Beta, Gamma, or Delta) and
its level of realization. The type and the level establish the overall context for the
interpretation.
Once the profile is classified, the interpreter should consider its overall elevation, paying
particular attention to those scales that are above T60 and below T40, and considering the
differences in elevation within and between the four clusters or sectors. A wealth of information
can be found in the CPI interpretive literature (Gough, 1968, 1989; Gough & Bradley, 1996;
Groth-Marnat, 1997; McAllister, 1996; Megargee, 1972). In general the higher the elevation, the
more favorable the behavioral implications. This examination of the individual scales should
sharpen and refine the broader implications of the cuboid type and level.
The next step is to examine the configuration of selected sets of scales. For example, the
relative elevations of Scales Ac and Ai have important implications for patterns of
achievement-oriented behavior, Re, So, and Sc for honesty versus misbehavior, and Do, Re, and Gi
for the manner in which leadership is exercised (Groth-Marnat, 1997; McAllister, 1996;
Megargee, 1972).
After extracting as much information as possible from the CPI profile, the examiner should
consider relevant situational and environmental factors before making behavioral assessments and
predictions. For example, in a series of experiments, my colleagues and I discovered that whether
or not men and women with high scores on the Dominance scale (High Do) assumed leadership
positions when paired with low-Do partners was strongly influenced by situational variables such
as the demands of the task, social mores, and gender role expectations (Megargee, 1969; Megargee,
Bogart, & Anderson, 1966). In these studies, only by considering situational as well as
personality factors could accurate predictions be made.

Conclusion
In the six decades since its first scales were developed, the CPI has evolved significantly. The
initial scales assessing a few selected aspects of social functioning grew into an interrelated
array of measures designed to assess, singly or in combination, all relevant aspects of effective
interpersonal functioning. Later, orthogonal vector scales defining four different personality
types and assessing their level of effectiveness were added.
The CPI was designed to be an open system, and new applications and uses continue to be added.
While I have focused on the concepts and the structure of the CPI, as with other personality
inventories there have also been important technological advances as hand scoring has given way
to computerized administration, scoring, and interpretation. If the past is a guide to the
future, it seems certain that the CPI will continue to evolve in the years to come.

Notes
1. When critics frown on the fact that most of the CPI’s folk culture scales are positively
correlated with one another, Gough points to the Strong Vocational Interest Blank (SVIB), a
collateral ancestor of the CPI, and notes that no one disapproves of the fact that the SVIB’s
scales for physicist and mathematician are positively correlated because, in the real world, most
physicists have mathematical talents, and many mathematicians have an interest in physics.
Therefore, valid measures of these interests should be associated with one another (Gough &
Bradley, 1996, p. 8).
2. These descriptors for Do and the other 19 folk culture scales are drawn from Gough (1989),
Gough and Bradley (1996), Groth-Marnat (1997), and Megargee (1972).
3. Gough notes that some have difficulty with the notion of a bipolar Femininity/Masculinity
scale (Gough & Bradley, 1996). Regardless of the conceptual difficulties, the fact remains that
F/M scores correlate reliably with how people describe one another. Those who prefer unipolar
scales on which people can obtain high or low scores on both masculinity and femininity can use
the Baucom scales for Masculinity (B-MS) and Femininity (B-FM) (Baucom, 1976).

References
Adorno, T. W., Frenkel-Brunswik, E., Levinson, D., & Sanford, R. (1950). The authoritarian
personality. New York: Harper.
Baucom, D. H. (1976). Independent masculinity and femininity scales on the California
Psychological Inventory. Journal of Consulting and Clinical Psychology, 44, 876.
Costa, P. T., Jr., & McCrae, R. R. (1992). Revised NEO Personality Inventory (NEO PI-R) and NEO
Five-Factor Inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment
Resources.
CHAPTER 18
Personality Disorders Assessment Instruments
Abstract
This chapter provides an overview of the assessment of personality disorder, focusing in particular on
self-report inventories and semistructured interviews. The convergent validity coefficients reported in 68 studies
among five semistructured interviews, one rating form, and 10 self-report inventories are summarized
and discussed. Discussed as well are discriminant validity, the boundaries with Axis I disorders, the
boundaries with normal personality functioning, the boundaries among the personality disorders, and
gender, culture, and ethnicity bias.
Personality disorders were placed on a separate diagnostic axis (Axis II) in the influential third
edition of the American Psychiatric Association’s (APA) Diagnostic and statistical manual of
mental disorders (DSM; APA, 1980). Most other mental disorders were and continue to be placed on
Axis I (APA, 2000). This unique distinction for the personality disorders is to encourage the
recognition of maladaptive personality traits in virtually every patient because personality
traits will affect the occurrence, expression, course, and/or treatment of most Axis I disorders
(Frances, 1980). In addition, a significant proportion of patients will be in treatment primarily
because of a personality disorder. Personality disorders are among the most difficult of the
mental disorders to treat, but they are not untreatable (Perry, Banon, & Ianni, 1999). Clinically
significant and meaningful improvements to personality functioning are often obtained.
The assessment of personality disorders should then be of substantial importance to clinicians
(Widiger, 2002). Clinicians who are concerned primarily with the treatment of an Axis I mental
disorder should include a measure of maladaptive personality functioning if they intend to account
fully for the variation in their patients’ treatment responsivity, and clinicians who are
concerned primarily with the treatment of a personality disorder are advised to include objective
measures of personality disorder symptomatology to document empirically the effectiveness of their
intervention.
Personality disorders, however, can be difficult to assess. One of the major innovations of
DSM-III (APA, 1980) was the provision of relatively more specific and explicit diagnostic
criterion sets that facilitated substantially the obtainment of more reliable clinical diagnoses.
This innovation, however, has been problematic for the personality disorders. It is difficult,
perhaps even impossible, to provide a brief list of specific diagnostic criteria for the broad and
complex behavior patterns that constitute a personality disorder (Westen, 1997). As acknowledged
in DSM-III-R, “for some disorders . . . , particularly the Personality Disorders, the criteria
require much more inference on the part of the observer” (APA, 1987, p. xxiii).
An additional difficulty is that many of the personality disorders affect self-description.
Antisocial persons will be dishonest, dependent persons can be overly self-denigrating, paranoid
persons will be
reluctant to provide personal information, histrionic persons would be expected to be
melodramatic, and narcissistic persons might deny the existence of faults and inadequacies. The
self-description of persons with personality disorders should not be taken at face value (Widiger
& Samuel, 2005). Most self-report inventories include validity scales to detect denial,
exaggeration, or other forms of dissimulation. These scales are typically used to identify invalid
or questionable test results but they could also be detecting personality disorder
symptomatology.
One approach for addressing distortions in self-image is to request that one or more persons who
know the patient well provide their perspective on his or her personality. Spouses, friends, and
close colleagues will not provide an entirely accurate description of an identified patient, as
they will not be familiar with all aspects of the person’s functioning and they may have their own
axes to grind. Nevertheless, they do provide a useful source of additional information, and they
would lack the distortions, denials, and exaggerations that characterize the patient’s personality
disorder. Agreement between self-descriptions and informant descriptions of personality disorders
has been only adequate to poor (Klein, 2003; Klonsky, Oltmanns, & Turkheimer, 2002; Ready & Clark,
2002). The reasons for the poor agreement are many (e.g., distortion of self-descriptions
secondary to personality and mood disorder, or biased and inadequate bases for informant
descriptions), but the disagreement does at least indicate the importance of obtaining multiple
sources of input for a diagnostic assessment and of conducting additional research into the bases
for this disagreement. Oltmanns and Turkheimer (2006) have been conducting a particularly creative
and intriguing investigation of the agreement and disagreement between self-descriptions and peer
nominations of personality disorder. They indicated, for example, that there is considerable
agreement among peers within a particular social circle regarding who is narcissistic but,
perhaps not surprisingly, the persons identified as being narcissistic do not necessarily
describe themselves that way. They described themselves as being outgoing, gregarious, and
likeable.
A further difficulty is the amount of time it takes to provide a systematic and comprehensive
assessment of all of the personality disorders included within DSM-IV-TR (APA, 2000). The
DSM-IV-TR personality disorders include 80 diagnostic criteria, not counting the 14 diagnostic
criteria for the two disorders (negativistic and depressive) included within an appendix for
disorders not yet receiving official approval. Even if one spent “only” 2 hr assessing the
personality disorders, that would still allow (on average) only 90 s to assess borderline identity
disturbance or narcissistic lack of empathy. As a result, clinicians often fail to assess
adequately the full range of personality disorder symptomatology that is in fact present in their
clients (Garb, 2005; Zimmerman & Mattia, 1999a, b).

Instruments for the Assessment of Personality Disorders
Fortunately, there are a number of instruments available that improve considerably the
reliability and validity of personality disorder assessments. This chapter will be confined to
interviews and self-report inventories. There is currently only limited research on the use of
projective tests for the assessment of most of the DSM-IV-TR personality disorders (Huprich, 2005)
and some controversy over the validity of these assessments (Wood, Garb, Lilienfeld, & Nezworski,
2002). In any case, this additional literature is beyond the scope of this chapter. Huprich
(2005) provides a thorough coverage of the literature on the use of the Rorschach in the
assessment of personality disorders and Gacono and Meloy (Chapter 29) review the literature on the
use of projective techniques for the assessment of the antisocial personality disorder.

Semistructured Interviews
The preferred method for the assessment of personality disorders in general clinical practice is
an unstructured clinical interview (Watkins, Campbell, Nieberding, & Hallmark, 1995; Westen,
1997), whereas the preferred method in research is the semistructured interview (Segal &
Coolidge, 2007; Widiger & Samuel, 2005; Zimmerman, 2003). Semistructured interviews have several
advantages over unstructured interviews (McDermut & Zimmerman, 2008; Rogers, 2003).
Semistructured interviews ensure and document that a systematic and comprehensive assessment of
each personality disorder diagnostic criterion has been made. This documentation can be
particularly helpful in situations in which the credibility or validity of the assessment might
be questioned, such as forensic or disability evaluations. Semistructured interviews also
increase the likelihood of reliable and replicable assessments (Farmer, 2000; Rogers, 2003; Segal
& Coolidge, 2007; Wood et al., 2002).
Instruments PRN SZD SZT ATS BDL HST NCS AVD DPD OBC PAG
MCMI/PDQ1 .30 .28 .38 .15 .47 .15 .47 .68 .53 –.47 .59
MCMI/MMPI2 .33 .64 .41 .30 .55 .61 .66 .62 .52 –.38 .51
MCMI/MMPI3 .44 .35 .51 .14 .28 .66 .55 .65 .68 –.42 .50
MCMI/MMPI4 .69 .68 .78 .25 .54 .71 .55 .76 .68 –.31 .48
MCMI/MMPI5 .45 .61 .55 .14 .49 .71 .70 .77 .60 –.49 .70
MCMI/MMPI5 .19 .22 .57 .13 .49 .44 .49 .69 .59 –.50 .65
MCMI/MMPI6 .08 .67 .74 .15 .42 .68 .78 .82 .50 –.30 .57
MCMI/MMPI7 .32 .74 .53 .25 .37 .69 .73 .79 .67 –.27 .46
MCMI/MMPI7 .26 .71 .44 .20 .52 .64 .61 .67 .67 –.24 .46
MCMI/MMPI8 .27 .62 .48 .09 .46 .63 .66 .76 .53 –.13 .50
MCMI-II/MMPI9 .50 .73 .86 .57 .68 .74 .65 .87 .56 –.04 .70
MCMI-II/MMPI2-SB10 .70 .79 .77 .76 .88 .64 .29 .82 .50 –.11 —
MCMI-II/MMPI210 .68 .71 .85 .71 .70 .65 .49 .83 .49 –.08 .75
MCMI-II/MMPI211 .65 .61 .70 .53 .65 .64 .60 .69 .56 .13 .65
MCMI-II/MMPI212 .48 .54 .70 .61 .66 .61 .54 .50 .31 –.02 .65
MCMI-II/PDQ-413 .73 .60 .70 .67 .70 .52 .57 .73 .36 .15 —
MCMI-II/MMPI214 .52 .66 .68 .46 .68 .57 .68 .76 .63 –.10 .70
MCMI-III/MMPI215 .79 .71 .84 .57 .57 .73 .65 .84 .75 –.26 —
MCMI-III/MMPI2-SB15 .81 .79 .84 .72 .82 .77 .58 .83 .80 –.30 —
MCMI-III/MMPI2-SB16 .69 .61 .66 .65 .75 .68 .56 .73 .72 –.12 —
MCMI-III/MMPI216 .65 .54 .67 .66 .60 .61 .62 .74 .68 –.07 —
MCMI-III/MMPI216 .66 .60 .63 .65 .73 .67 .56 .72 .70 –.15 —
MCMI-II/CATI17 .58 .22 .65 .57 .87 .72 .38 .80 .43 .10 .86
MCMI-II/CATI18 .55 –.13 .57 .70 .88 .10 .40 .55 .20 –.11 .77
MCMI-II/CATI11 .56 .30 .62 .63 .72 .64 .52 .66 .54 .40 .65
MCMI/WISPI19 .38 .48 .43 .32 .14 .10 .57 .79 .68 –.26 .50
MMPI-PD/CATI11 .59 .40 .65 .60 .63 .47 .28 .75 .74 .35 .62
MMPI/PDQ-R20 .61 .23 — .63 — –.04 .24 .73 .58 .47 .57
MMPI/PDQ-R21 .42 .26 .46 .51 .75 .32 –.04 .57 .60 .36 .62
MMPI/PDQ-R22 .38 .31 .50 .50 .53 .09 –.12 .56 .35 .19 .20
MMPI2/PDQ-423 .59 .59 .56 .60 .62 .29 .21 .80 .68 .44 —
MMPI2/PDQ-415 .73 .64 .69 .53 .62 .21 .14 .72 .53 .62 —
MMPI2-SB/PDQ-415 .66 .70 .72 .60 .71 .10 –.11 .69 .55 .70 —
WISPI/PDQ19 .66 .37 .72 .68 .54 .79 .67 .75 .79 .57 .67
SNAP/PDQ-423 .78 .70 .77 .78 .76 .66 .67 .76 .82 .60 —
SNAP/MMPI223 .76 .72 .76 .79 .69 .59 .42 .81 .77 .51 —
SNAP/MAPP24 .55 .48 .53 .49 .50 .38 .40 .44 .42 .31 —
SNAP/MAPP24 .49 .51 .51 .51 .52 .39 .40 .49 .46 .43 —
Median 0.57 0.60 0.65 0.57 0.62 0.62 0.55 0.73 0.59 –0.07 0.62
Notes: PRN = paranoid, SZD = schizoid, SZT = schizotypal, ATS = antisocial, BDL = borderline, HST = histrionic, NCS = narcissistic, AVD = avoidant,
DPD = dependent, OBC = obsessive compulsive, PAG = passive–aggressive, MCMI = Millon Clinical Multiaxial Inventory (Millon et al., 1997), PDQ =
Personality Diagnostic Questionnaire (Hyler, 1994), MMPI = Minnesota Multiphasic Personality Inventory (Morey et al., 1985), MMPI-SB = MMPI-II
(Somwaru & Ben-Porath, 1995) scales, CATI = Coolidge Axis II Inventory (Coolidge & Merwin, 1992), WISPI = Wisconsin Personality Disorder
Inventory (Klein et al., 1993), SNAP = Schedule for Nonadaptive and Adaptive Personality (Clark et al., in press), MAPP = Multisource Assessment of
Personality Pathology (Oltmanns & Turkheimer, 2006).
1 Reich, Noyes, & Troughton (1987), 2 Streiner & Miller (1988), 3 Dubro & Wetzler (1989), 4 Morey & LeVine (1988), 5 Zarrella, Schuerger, & Ritz
(1990), 6 McCann (1989), 7 Schuler, Snibbe, & Buckwalter (1994), 8 Wise (1994), 9 McCann (1991), 10 Jones (2005), 11 Sinha & Watson (2001), 12 Wise
(2001), 13 Blackburn, Donnelly, Logan, & Renwick (2004), 14 Wise (1996), 15 Hicklin & Widiger (2000), 16 Rossi, vanden Brande, et al. (2003),
17 Coolidge & Merwin (1992), 18 Silberman, Roth, Segal, & Burns (1997), 19 Klein et al. (1993), 20 Trull, Goodwin, Schopp, Hillenbrand, & Schuster
(1993), 21 Trull (1993), 22 O’Maille & Fine (1995), 23 Haigler & Widiger (2001), 24 Oltmanns & Turkheimer (2006).
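The Median row of the table above can be reproduced from the column entries. The following is a minimal Python sketch for the paranoid (PRN) column only, using the 38 coefficients transcribed from the table; the variable names are illustrative and are not part of the original chapter.

```python
from statistics import median

# PRN (paranoid) column of the table above, transcribed row by row
# (38 convergent validity coefficients between self-report inventories).
prn = [
    .30, .33, .44, .69, .45, .19, .08, .32, .26, .27,
    .50, .70, .68, .65, .48, .73, .52, .79, .81, .69,
    .65, .66, .58, .55, .56, .38, .59, .61, .42, .38,
    .59, .73, .66, .66, .78, .76, .55, .49,
]

# With an even number of entries, the median is the mean of the two
# middle values after sorting (here .56 and .58).
prn_median = round(median(prn), 2)

if __name__ == "__main__":
    print(f"n = {len(prn)}, median = {prn_median:.2f}")
```

The result, .57, agrees with the value reported for PRN in the Median row.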
Instruments PRN SZD SZT ATS BDL HST NCS AVD DPD OBC PAG
Categorical
PDQ/SIDP1 .00 .00 .27 .14 .30 .38 .00 .20 .08 .13 .00
PDQ-R/SIDP2 .27 .25 .28 .32 .54 .49 .14 .37 .40 .34 .21
PDQ-R/SIDP-R3 .18 — .17 .46 .17 .09 .08 .20 .30 .16 .28
PDQ-R/IPDE4 .12 –.02 .54 .36 .46 .18 .42 .53 .52 .38 .21
PDQ-R/IPDE5 .10 .26 .00 — .42 .22 .10 .37 .14 .37 .33
PDQ-R/SIDP6 .00 .11 .24 .24 .20 .02 –.04 — .15 .04 .14
PDQ-R/SCID-II4 .27 .43 .48 .42 .53 .24 .34 .63 .57 .30 .23
PDQ-R/SCID-II5 .25 .00 –.03 — .37 .32 .23 .46 .53 .42 .46
PDQ-R/SCID-II7 .25 .13 .05 — .15 .33 .24 .30 .07 .34 .16
PDQ-4/SCID-II8 .16 .09 .09 .28 .19 .12 .28 .15 .24 .09 .09
PDQ-4/IPDE9 .13 .21 .11 .28 .16 .10 .09 .19 –.02 .16 —
PDQ-4/PDI-IV10 .11 .08 .02 .33 .11 .03 .11 .21 .24 .08 .11
PDQ-4/SCID-II11 .35 .05 .11 .49 .57 — .14 .40 — .20 .18
PDQ-4/PAS12 .24 .03 .21 .30 .15 .12 .18 .14 .48 .05 —
MMPI2/SCID-II13 .05 .21 .22 .20 .28 –.05 .19 .42 .37 — .18
MMPI/SIDP-R3 .08 — — .20 .26 –.03 — — –.10 — .18
MCMI/SIDP14 .38 .06 .07 — .51 .22 — .14 .22 –.06 .29
MCMI/SIDP15 .19 –.03 .18 .06 .53 .12 .25 .36 .15 .00 .38
MCMI-II/SCID-II13 .05 .21 .24 .10 .53 .12 .37 .28 .13 –.05 .22
MCMI-II/SCID-II16 .21 — — — .25 .14 .16 .51 .28 .38 .30
MCMI-II/MMPI13 .30 .31 .48 .19 .34 .25 .37 .44 .27 — .25
MCMI-II/IPDE17 .00 .00 –.02 .00 .41 .29 –.04 .30 .26 –.07 –.05
MCMI-II/IPDE9 .20 .30 .05 .36 .30 –.04 .08 .17 .04 .13 —
ADP-IV/SCID-II18 .34 .07 .14 .29 .65 .45 .19 .77 .50 .33 —
WISPI-IV/SCID-II19 .48 — — — .38 — — .34 .19 .08 .14
Median 0.19 0.09 0.15 0.28 0.34 0.14 0.17 0.34 0.24 0.14 0.21
Dimensional
MCMI/SIDP20 .29 .40 .31 .23 .32 .05 .04 .53 .51 –.29 .28
MCMI/SIDP21 .22 .39 .37 — .32 .20 .18 .42 .38 –.05 .14
MCMI/SIDP14 .28 .20 .15 .30 .80 .22 .14 .31 .38 .15 .50
MCMI/SIDP15 .20 .31 .23 .14 .63 .07 .26 .56 .31 .02 .41
MCMI/SIDP22 .03 .47 .39 .23 .33 .26 .34 .60 .21 –.04 .17
MCMI/PDI-I23 .08 .02 .33 .28 .51 .01 .21 .53 .64 –.32 .15
MCMI-II/SCID-II24 .39 .31 .17 .47 .51 .32 .34 .55 .40 .08 .38
MCMI-II/PDI-II23 .30 .52 .21 .32 .63 .24 .32 .64 .36 .11 .64
MCMI-II/PDI-III25 .44 .53 .61 .58 .63 .30 .42 .58 .50 –.04 .44
MCMI-II/IPDE17 .38 .48 .39 .37 .60 .56 .41 .51 .38 –.05 .41
MCMI-II/IPDE9 .33 .52 .50 .49 .41 .13 .10 .64 .30 .20 —
PDQ/SIDP20 .56 .33 .49 .78 .64 .47 .53 .51 .59 .52 .46
PDQ/SIDP1 .43 .24 .34 .55 .39 .42 .26 .30 .35 .47 .37
PDQ-R/SIDP6 .22 .32 .31 .20 .39 .38 .15 .21 .36 .29 .26
PDQ-R/SIDP2 .32 .44 .45 .60 .50 .43 .30 .49 .44 .26 .39
PDQ-R/SIDP3 .31 .60 .32 .44 .48 .40 .38 .35 .55 .47 .43
PDQ-4/SCID-II8 .36 .19 .20 .37 .40 .29 .42 .36 .39 .28 .30
PDQ-4/IPDE9 .36 .45 .50 .57 .46 .18 .21 .66 .48 .36 —
PDQ-4/PDI-IV10 .34 .19 .23 .47 .39 .29 .31 .43 .35 .32 .34
OMNI/IPDE26 .52 .67 .61 .57 .81 .66 .52 .67 .71 .65 —
MMPI/SIDP3 .33 .47 .35 .53 .66 .31 .10 .47 .40 .24 .47
WISPI/SCID-II27 .43 .40 .51 .24 .61 .29 .15 .65 .49 .59 .46
WISPI/IPDE27 .11 .36 .18 .39 .47 .22 .38 .58 .51 .40 .54
WISPI-4/SCID-II19 .60 .36 .32 .40 .60 .43 .54 .60 .53 .44 .38
ADP-IV/SCID-II18 .58 .49 .34 .38 .65 .53 .49 .67 .51 .52 —
Median 0.33 0.40 0.34 0.39 0.51 0.29 0.31 0.53 0.40 0.26 0.39
Notes: PRN = paranoid, SZD = schizoid, SZT = schizotypal, ATS = antisocial, BDL = borderline, HST = histrionic, NCS = narcissistic, AVD = avoidant,
DPD = dependent, OBC = obsessive compulsive, PAG = passive–aggressive, PDQ = Personality Diagnostic Questionnaire (Hyler, 1994), SIDP =
Structured Interview for Personality Disorders (Pfohl et al., 1997), IPDE = International Personality Disorder Examination (Loranger, 1999), SCID-II =
Structured Clinical Interview for Personality Disorders (First & Gibbon, 2004), PDI = Personality Disorder Interview-IV (Widiger et al., 1995),
PAS = Personality Assessment Schedule (Tyrer & Alexander, 1979), MMPI = Minnesota Multiphasic Personality Inventory (Morey et al., 1985),
MCMI = Millon Clinical Multiaxial Inventory (Millon et al., 1997), ADP-IV = Assessment of DSM-IV Personality Disorders (Schotte et al., 1998),
WISPI = Wisconsin Personality Disorder Inventory (Klein et al., 1993), OMNI = OMNI Personality Inventory (Loranger, 2001).
1 Zimmerman & Coryell (1990), 2 DeRuiter & Greeven (2000), 3 Trull & Larson (1994), 4 Hyler et al. (1990), 5 Hyler, Skodol, Oldham, Kellman, &
Doidge (1992), 6 Yeung, Lyons, Waternaux, Faraone, & Tsuang (1993), 7 van Velzen, Luteijn, Scholing, van Hout, & Emmelkamp (1999), 8 Fossati et al.
(1998), 9 Blackburn, Donnelly, Logan, & Renwick (2004), 10 Yang et al. (2000), 11 Davison, Leese, & Taylor (2001), 12 Whyte, Fox, & Coxell (2006),
13 Hills (1995), 14 Nazikian, Rudd, Edwards, & Jackson (1990), 15 Jackson, Gazis, Rudd, & Edwards (1991), 16 Renneberg, Chambless, Dowdall, Fauerbach,
& Gracely (1992), 17 Soldz, Budman, Demby, & Merry (1993), 18 Schotte et al. (2004), 19 Smith, Klein, & Benjamin (2003), 20 Reich et al. (1987),
21 Torgensen & Alneas (1990), 22 Hogg, Jackson, Rudd, & Edwards (1990), 23 Widiger & Freiman (1988), 24 Marlowe, Husband, Bonieskie, Kirby, & Platt
(1997), 25 Corbitt (1995), 26 Loranger (2001), 27 Barber & Morse (1994).
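The two Median rows of the table above (categorical and dimensional) can be set side by side to see how much convergence changes with the rating format. The sketch below is illustrative only: it uses just the eleven median coefficients printed in the table, and the names `gain`, `smallest`, and `largest` are assumptions introduced here, not terms from the chapter.

```python
# Median rows of the categorical and dimensional panels, transcribed from the table above.
disorders = ["PRN", "SZD", "SZT", "ATS", "BDL", "HST", "NCS", "AVD", "DPD", "OBC", "PAG"]
categorical = [.19, .09, .15, .28, .34, .14, .17, .34, .24, .14, .21]
dimensional = [.33, .40, .34, .39, .51, .29, .31, .53, .40, .26, .39]

# Difference between the dimensional and categorical median for each disorder.
gain = {
    name: round(dim - cat, 2)
    for name, cat, dim in zip(disorders, categorical, dimensional)
}

smallest = min(gain, key=gain.get)  # disorder with the smallest difference
largest = max(gain, key=gain.get)   # disorder with the largest difference

if __name__ == "__main__":
    for name in disorders:
        print(f"{name}: +{gain[name]:.2f}")
    print(f"smallest gain: {smallest}, largest gain: {largest}")
```

In this comparison the dimensional median exceeds the categorical median for all eleven disorders, by .11 (antisocial) to .31 (schizoid). The comparison is descriptive; the two panels do not draw on identical sets of studies.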
The improvement in convergent validity for a dimensional rating is not simply a statistical
artifact of the increase in power provided by more quantitative ratings. The results indicate that
reliable and valid information is being lost by the use of categorical distinctions. The results
also indicate little to no confidence in the agreement between a self-report inventory and a
semistructured interview with respect to a categorical diagnosis.
Table 18.3 provides the convergent validity coefficients for clinical interviews, whether
semistructured or unstructured, as reported in 21 studies. The extent of convergence among the
existing semistructured interviews is difficult to judge because only three studies have even
attempted to provide this information (i.e., O’Boyle & Self, 1990; Pilkonis et al., 1995; Skodol
et al., 1991). All three studies were confined to just two of the five semistructured interviews.
Only two of the studies administered the interview schedules to the same patients (i.e., O’Boyle
& Self, 1990; Skodol et al., 1991), and only one reported findings for at least half of the
personality disorders (Skodol et al., 1991). Skodol et al. administered the IPDE and SCID-II to
100 inpatients of a personality disorders treatment unit. Both interviews were administered blind
to one another on the same day (one in the morning, the other in the afternoon). Order of
administration was staggered. Kappa for individual diagnoses ranged from a low of .14 (schizoid)
to a high of .66 (dependent), with a median kappa of .50 (obsessive compulsive). This convergence
is better than the consistently poor agreement between a self-report inventory and a
semistructured interview, but the authors acknowledged that the agreement for some of the
categorical diagnoses was discouraging. “It is fair to say that, for a number of disorders (i.e.,
paranoid, schizoid, schizotypal, narcissistic, and passive–aggressive) the two [interviews]
studied do not operationalize the diagnoses similarly and thus yield disparate results” (Skodol
et al., 1991, p. 22). They also indicated that, in general, the SCID-II yielded appreciably more
personality disorder diagnoses than the IPDE (35 versus 15 diagnoses). On the other hand,
agreement with respect to the extent to which each personality disorder was present (i.e.,
dimensional rating) was considerably better, with correlations ranging from a low of .58
(schizoid) to
Instruments PRN SZD SZT ATS BDL HST NCS AVD DPD OBC PAG
Convergence of Semistructured Interviews
Categorical
IPDE/SCID-II1 .29 .14 .44 .59 .53 .58 .44 .56 .66 .50 .21
Dimensional
IPDE/SCID-II1 .68 .58 .72 .87 .76 .77 .80 .78 .81 .77 .74
Convergence with Unstructured Clinical Interviews
Categorical
Clinician/Clinician2 .35 .01 .19 .49 .29 .23 — .23 .40 .20 —
IDCL-P/IPDE3 .63 –.03 –.03 — .55 –.07 .54 .71 .27 .30 .66
Clinician/PDQ4 .40 –.16 .01 .07 .46 .15 .10 .10 .08 .08 –.02
Conf/SIDP-R5 .04 — — .25 .40 .46 .39 .51 .52 .40 .47
Conf/IPDE5 — — — — –.06 .24 .20 .37 .09 .51 .11
Conf/IPDE1 .25 — .34 — .01 .31 .04 .29 .41 .06 –.01
Conf/SCID-II1 .54 — .53 — .22 .25 .03 .23 .60 .30 .03
SWAP-200/SCID-II6 .13 .30 –.06 .49 .27 — .54 — — .65 .10
Clinician/SCID-II7 .26 .13 .32 .49 .77 .34 .20 –.04 –.04 .31 .10
Conf/PDQ-48 .05 — — — .20 — — .26 .13 .08 —
Clinician/PDQ-R9 .12 –.03 .12 –.03 .27 .25 .11 .31 .52 .13 .07
Median 0.19 0.13 0.32 0.49 0.24 0.28 0.11 0.26 0.41 0.21 0.07
Dimensional
Clinician/Clinician10 .58 .14 .52 .49 .51 .41 .45 .34 .39 .16 —
Clinician/MCMI11 –.08 .12 .33 .04 .13 .00 .09 .06 .05 .05 .03
Clinician/MCMI-III12 .13 .11 .09 .29 .47 .13 .32 .31 .34 .13 —
SWAP-200/SCID-II13 –.06 .33 — .70 .20 .49 .47 .37 .30 .23 —
SWAP-200/SCID-II6 .48 .52 –.07 .73 .37 .43 .33 .44 .41 .49 .39
PAF/WISPI14 .23 .21 .40 .41 .27 .36 .49 .28 .37 .26 .40
SWAP-200/SWAP-SR15 .07 .22 — .11 .24 .27 .43 .63 .46 –.06 —
Clinician/SWAP-20016 .55 .68 .62 .86 .82 .80 .67 .82 .67 .82 —
Clinician/SWAP-20017 — .46 .55 .71 .48 — .59 — — .57 —
Clinician/SWAP-20018 .53 .68 — .70 — .58 .51 — — .32 —
Clinician/SWAP-200A19 — .51 — .72 .47 .69 .55 — — — —
Clinician/SWAP-200A20 — — .62 .69 .68 .72 .75 .67 — .57 —
Clinician/SWAP-200A21 .71 .66 .56 .78 .60 .65 .68 .77 .54 .75 —
Median 0.53 0.52 0.58 0.71 0.48 0.69 0.57 0.67 0.54 0.57 0.39
Notes: PRN = paranoid, SZD = schizoid, SZT = schizotypal, ATS = antisocial, BDL = borderline, HST = histrionic, NCS = narcissistic, AVD =
avoidant, DPD = dependent, OBC = obsessive compulsive, PAG = passive–aggressive. IPDE = International Personality Disorder Examination (Loranger,
1999), SCID-II = Structured Clinical Interview for Personality Disorders (First & Gibbon, 2004), Clinician = rating for each personality disorder provided
by a clinician based on unstructured interview, IDCL-P = International Diagnostic Checklist for Assessment of DSM-III-R & ICD-10 Personality Disorders,
Conf = Consensus conference ratings using all available information other than the other instrument (Pilkonis et al., 1995), MCMI = Millon Clinical
Multiaxial Inventory (Millon et al., 1997), WISPI = Wisconsin Personality Disorder Inventory (Klein et al., 1993), PAF = Personality Assessment Form
(Klein et al., 1993), SWAP-200 = Shedler Westen Assessment Procedure (Westen & Shedler, 1999a, 1999b), SWAP-200A = SWAP-200 Adolescent
Version (Westen, Dutra, & Shedler, 2005), SWAP-SR = SWAP-200 Self-Report Version.
1 Skodol, Oldham, Rosnick, & Hyler (1991), 2 Mellsop et al. (1982), 3 Bronisch & Mombour (1994), 4 Hyler et al. (1989), 5 Pilkonis, Heape, Proietti,
Clark, McDavid, & Pitts (1995), 6 Marin-Avellan, McGauley, Campbell, & Fonagy (2005), 7 Fridell & Hesse (2006), 8 Wilberg, Dammen, & Friis (2000),
9 Bronisch, Flett, Garcia-Borreguero, & Wolf (1993), 10 Hesse (2005), 11 Chick, Sheaffer, Goggin, & Sison (1993), 12 Rossi, Hauben, Van Den Brande, &
Sloore (2003), 13 Loffler-Stastka et al. (2007), 14 Klein et al. (1993), 15 Davison, Obonsawin, Sells, & Patience (2003), 16 Westen & Muderrisoglu (2003),
17 Shedler & Westen (2004), 18 Westen & Shedler (1999a, 1999b), 19 Westen, Shedler, Durrett, Glass, & Martens (2003), 20 Westen et al. (2005),
21 Durrett & Westen (2005).
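The categorical and dimensional panels of Table 18.3 rest on two different agreement statistics: kappa for present/absent diagnoses and a correlation for the extent to which a disorder is present. The following minimal Python sketch shows how the two can diverge on the same patients. The criterion counts are hypothetical and are not taken from any study cited in the table; the diagnostic threshold of 5 criteria and all function and variable names are likewise assumptions made for illustration.

```python
from math import sqrt

def cohen_kappa(a, b):
    """Cohen's kappa for two lists of 0/1 diagnoses made on the same patients."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # Agreement expected by chance from each interview's base rate.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# HYPOTHETICAL data: number of criteria rated present for one disorder
# by two interviews administered to the same eight patients.
interview_a = [2, 4, 5, 6, 7, 3, 8, 4]
interview_b = [3, 5, 4, 6, 8, 2, 7, 4]
THRESHOLD = 5  # assumed number of criteria required for a diagnosis

# Categorical rating: diagnosis present (1) or absent (0).
dx_a = [int(count >= THRESHOLD) for count in interview_a]
dx_b = [int(count >= THRESHOLD) for count in interview_b]

kappa = cohen_kappa(dx_a, dx_b)          # categorical agreement
r = pearson_r(interview_a, interview_b)  # dimensional agreement

if __name__ == "__main__":
    print(f"kappa = {kappa:.2f}, r = {r:.2f}")
```

In this constructed example the two interviews never differ by more than one criterion, yet two patients fall on opposite sides of the threshold, so kappa is .50 while the correlation between criterion counts is about .90. This is one way in which a categorical distinction can discard information that a dimensional rating retains.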
Abstract
This chapter provides an update on progress in the use of neuroimaging for predicting clinical states,
with particular attention to diagnosis. The chapter discusses the underpinnings of the blood oxygenation
level–dependent (BOLD) response used in functional magnetic resonance imaging (fMRI), and issues
involved in measuring this signal reliably. It then examines the logic underpinning the development of models
based on brain data to examine latent states, such as deception, and latent traits, such as the diagnosis of
schizophrenia. It concludes that neuroimaging, while not currently a practical tool for clinical assessment, is
likely to provide an important avenue of new ideas. Biomarkers, such as those derived from neuroimaging,
are likely to have a role in understanding dimensionality and the common origins of certain disorders (e.g.,
depression and anxiety) by providing biological principles around which to organize thinking in these areas.
Keywords: diagnosis, fMRI, independent components analysis, lie detection, pattern recognition,
reliability, schizophrenia
The science fiction scenario that has provoked you into reading this chapter is this: Your client
sits down across from you looking quite nervous, and reluctantly asks, “How bad is it, Doc?”
You mouse through a brief file sent over yesterday from radiology featuring brain images with
colourful circles and lines. Soon you arrive at a familiar table at the bottom of the
computer-generated readout. “The good news is that your brain scan rules out a schizophrenia
spectrum disorder, as well as any indication of pathological appetitive processes.” Then, after a
pause, “The bad news is that your default network and primary sensory components suggest very
dominant avoidance processes . . . here and here,” you add, circling on the screen the highlighted
pictures of the amygdala and medial orbitofrontal cortex.
She looks back, bewildered, “But, Doc, what does it all mean?”
You shuffle in your seat. You have hardly met the woman and yet you know the tests have placed her
well within the confidence intervals of three different internalizing disorders. How can you tell
her that if she’s not depressed already she soon will be? Or, that the scan shows she is unlikely
to be a good candidate for medication? “Well, it means that we know what to work on.”
Imagining such a scenario is easy. We have had tests that function in this manner long before
there were electronic medical records in which to record them. What takes a little more work is
imagining how such a scenario might come about. What are all the little pieces that need to be in
place to probe the brain’s psychological and psychopathological states in a manner useful for
clinical assessment? In this chapter, we will check in with current efforts to evaluate the little
pieces that underpin the logic of clinical assessment, including personality assessment, with
functional imaging. We will first describe the increasingly well-understood physiological basis of
functional magnetic resonance imaging (fMRI). Next, we will address a topic close to the hearts of
psychometrists, which is the reliability of functional imaging. We will then explore current
efforts to
understand implicit brain states, using studies of lie detection as an illustration. Finally, we
will examine recent efforts to diagnose schizophrenia and other psychiatric illnesses using
neuroimaging, and reflect on their relevance for the future of clinical assessment, including
personality assessment.
For many, the prospect of viewing states of the brain directly, and from this inferring some
latent truth about a person’s internal state, is as mysterious as it is awe-inspiring.
Neuroimaging technology is indeed awe-inspiring, but throughout this chapter seasoned clinical
scientists will recognize familiar statistical problems refracted through this new lens. In
addition to test–retest reliability, issues such as sensitivity, specificity, and generalizability
must be addressed. We have not escaped from these familiar testing criteria simply because we are
dealing with images of biological phenomena rather than behaviors. Despite their promise, imaging
approaches still have a long way to go to catch up with established interview and paper-and-pencil
assessment practices.

Neural Basis of the BOLD Response
The BOLD response is the “blood oxygenation level–dependent” response that is measured in 99% of
fMRI studies. (The other 1% measures the actual level of blood flow itself, but this is tricky
business indeed, and no more will be said of it in this chapter.) Since very few people actually
care whether the blood in the brain is oxygenated or deoxygenated, it is instructive to unpack
this signal further to understand why this is a useful way to understand regional brain function.
However, the following discussion is not for everyone. Readers interested in the practicalities of
fMRI for clinical assessment with no interest in its theoretical basis should feel welcome to
jump to the next section.
Now, it is a misunderstanding to think of an MR scanner as just a big magnet. It is in fact a
number of magnets working in tandem. The biggest magnet, B0, is a supercooled monstrosity that
produces a magnetic field many thousand times greater than

Because protons in a magnetic field cannot help but wobble as they spin, which is known more
elegantly as their precession. Luckily, the extent to which protons wobble depends on the strength
of the magnetic field. This is where a collection of other magnets, known as gradients, comes into
play. The gradient magnets alter the main B0 field ever so slightly. As a result, all those
protons precess at an ever so slightly different frequency across space. This is the necessary
setup for obtaining pretty pictures.
It is not enough to simply have a number of magnets working in tandem. MRI scanners must also have
the capacity to transmit and receive radio signals. (Scanner rooms are surrounded with thick
shielding to prevent outside radio signals from interfering with these local broadcasts.) The
transmission and reception of signals is carefully orchestrated by a pulse sequence. A pulse
sequence begins with a radio frequency (RF) pulse sent from a coil close to a person’s head. This
RF pulse knocks the protons from their alignment with B0 into some other, predetermined alignment.
This also synchronizes their precessions, which is like setting the hands of a clock to 12:00. As
each proton finds its way back into its own precession, a very slight signal is emitted which can
be picked up with a very sensitive antenna if it is close to the source. By no coincidence at all
(remember those gradients?), each location in space broadcasts at a slightly different frequency,
so by listening for a particular frequency, one can learn something about the proton’s journey
back to its original precession in that location. The broadcasts that are heard most readily are
those from the hydrogen protons in H2O.
Yes, this is a simplification of a number of complicated equations and technical marvels that have
consumed physicists and engineers for more than half a century. It is not so gross a
simplification as to obscure the following insight: a proton’s journey back to its own precession
reveals important information about its immediate environment. One aspect of that environment is a
useful magnetic property of the nearby hemoglobin. Dispatched from the lungs, and propelled by the
heart, hemo-
the strength of the Earth’s magnetic field and globin carries oxygen to all tissue in the body. When
whose strength is measured in T, or tesla. Most hemoglobin is on its outward, oxygenated path, it is
functional neuroimaging today occurs on magnets diamagnetic like most tissue. That is, it has no
whose Bo strength is 1.5–3 T. Such fields are strong magnetic properties to speak of. After the hemo-
enough to accelerate a hammer through a block of globin has made its oxygen delivery, it becomes
concrete. For non-ferromagnetic people, MRI scan- paramagnetic. That means that within a larger mag-
ners are harmless and are found by many to be very netic field deoxygenated hemoglobin has the capa-
soothing. Bo needs to be this strong because it aligns city to generate its own small magnetic field. This
all the protons within its magnetic field in more or small magnetic field then affects the local protons as
less the same orientation. Why ‘‘more or less’’? they desynchronize in the milliseconds after the RF
[Figure: Control, Schizophrenia, and Multiple sclerosis groups plotted against the first canonical discriminant function (horizontal axis).]
the algorithm derived on the initial sample in an independent sample of 46 subjects. Of the
classification schemes that had given 100% correct classification in the first sample, many provided excellent
classification in the new sample (hundreds provided better than 95% correct classification, and
thousands more provided better than 90% correct classifications). The cross-validation approach used in
these analyses is quite powerful; however, the relative scarcity of MEG devices around the world and an
unproven ability to repeat the results at another location will slow the adoption of this approach in
clinical practice. Thus, the question can well be asked whether the data available from functional
MRI is capable of a similar trick.

The use of fMRI to distinguish between diagnostic categories remains in its early stages, but a
growing body of work suggests explosive growth in this area may be just around the corner. The work of
Calhoun and colleagues (Calhoun, 2005; Calhoun, Maciejewski, Pearlson, & Kiehl, 2007) has made the
greatest contributions we know of using this kind of data. Calhoun shares with Georgopoulos a
theory-poor, machine-driven approach to data analysis and classification. It also utilizes the incidental
correlations that arise across the brain. Whereas Georgopoulos used the very high temporal
resolution of MEG to observe correlations as they occurred in time, Calhoun’s method relies on ICA, described
above.

To study whether such ICA brain maps were indicative of psychopathology, Calhoun evaluated
two components that emerged when 21 chronic schizophrenia patients, 14 bipolar patients, and 26
healthy controls were asked to perform an auditory oddball task during fMRI. The oddball task is simply
a series of stimuli (one every 2 s) that establish a dominant pattern. Occasionally (25% of the time)
something different from the dominant tone occurs, an oddball, and participants respond with a button
press. Many studies of the oddball task have been conducted, and these have focused on brain regions
associated with detecting and responding to the oddball stimuli. Analyzing the independent components
that could be found within brains engaged in this task, however, was something quite different from
these more traditional analyses. In this case, the researchers calculated maps of the default mode and
temporal lobe networks from all of the participants. These networks are so named because they represent
two different canonical sets of regions: the first is typified by activity in the posterior anterior cingulate,
precuneus, and frontopolar cortex; the temporal lobe network is characterized by activity in the inferior
temporal lobe and temporal poles. The researchers then left out one participant from each group
during the training of an algorithm to distinguish people in each group. That is, they allowed the
algorithm to learn the important differences between the different diagnoses in their default and temporal
networks. (And conversely, the algorithm learned to ignore between-individual differences that were
irrelevant to classification.) To test the resulting classification algorithm, they calculated how frequently it
correctly identified the participants left out of that training. As illustrated in Figure 19.2, the average
sensitivity of the models was 90%, whereas the average specificity was 95%. Among the different
diagnostic groups, controls were classified correctly 95% of the time and bipolar patients were classified
[Figure 19.2: three scatterplot panels, (a), (b), and (c); axis labels Schizo-bipo, Control-schizo, and Control-bipo; points labeled Cont, Schizo, and Bipo.]
Fig. 19.2 Classification results from 21 chronic schizophrenia patients, 14 bipolar patients, and 26 healthy controls included in Calhoun et al.
(2007). Each case is represented with a dot, with the gray scale indicating diagnosis. (a) Lower left shaded region shows the a priori decision region
for controls versus noncontrols; (b) lower right shaded region shows the a priori decision region for schizophrenia versus nonschizophrenia patients;
(c) upper right shaded region shows the decision region for bipolar disorder versus nonbipolar patients. Reprinted with the kind permission of the
author and publisher.
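The leave-one-out logic behind such classification figures, and the way sensitivity and specificity are computed from held-out cases, can be illustrated with a minimal sketch. It assumes a standard leave-one-out scheme and a simple nearest-centroid classifier; neither is claimed to be the procedure used by Calhoun and colleagues, and the feature values, labels, and function names below are hypothetical and invented purely for illustration.

```python
from statistics import mean


def centroid(points):
    """Mean position of a list of (x, y) feature pairs."""
    return (mean(p[0] for p in points), mean(p[1] for p in points))


def nearest_centroid_predict(train, sample):
    """Assign `sample` to the label whose training centroid is closest."""
    by_label = {}
    for features, label in train:
        by_label.setdefault(label, []).append(features)

    def squared_distance(label):
        cx, cy = centroid(by_label[label])
        return (sample[0] - cx) ** 2 + (sample[1] - cy) ** 2

    return min(by_label, key=squared_distance)


def leave_one_out(data):
    """Hold out each case in turn, train on the rest, predict the held-out case."""
    results = []
    for i, (features, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        results.append((label, nearest_centroid_predict(train, features)))
    return results


def sensitivity_specificity(results, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for true, pred in results if true == positive and pred == positive)
    fn = sum(1 for true, pred in results if true == positive and pred != positive)
    tn = sum(1 for true, pred in results if true != positive and pred != positive)
    fp = sum(1 for true, pred in results if true != positive and pred == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical, made-up feature values (not data from any study).
DATA = [
    ((1.0, 1.2), "patient"), ((1.3, 0.9), "patient"), ((0.8, 1.1), "patient"),
    ((1.1, 1.4), "patient"), ((-0.2, -0.1), "patient"),  # an atypical patient
    ((-1.0, -1.1), "control"), ((-1.2, -0.8), "control"), ((-0.9, -1.3), "control"),
    ((-1.1, -1.0), "control"), ((-0.7, -0.9), "control"),
]

results = leave_one_out(DATA)
sens, spec = sensitivity_specificity(results, positive="patient")
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because every prediction is made for a case the classifier never saw during training, the resulting sensitivity and specificity estimate how the rule might generalize to new cases from the same setting, though not necessarily to cases from other scanners or sites.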
correctly 83% of the time, with schizophrenia patients in between. Interestingly, the inclusion of
more ICA networks did not appreciably increase sensitivity or specificity and indeed risked overfitting the
data and thereby losing the ability to generalize to the untrained cases, much less cases from other scanners.

It is useful to remind readers before going on that this level of sensitivity and specificity is at least on a
par with clinical ratings. For example, Jakobsen and colleagues (2005) reported that the agreement between
the hospital record–based OPCRIT (operationalized criteria) diagnosis of schizophrenia and that of an
experienced psychiatrist was 85%, with 93% sensitivity and 62% specificity. When DSM-IV
emergency room diagnoses were compared to inpatient discharge diagnoses (Woo, Sevilla, & Obrocea, 2006),
there was 86% agreement for all patients eventually diagnosed with either schizophrenia or bipolar
disorder, with 88% sensitivity and 85% specificity for the diagnosis of schizophrenia or schizoaffective
disorder. Compared to these efforts to apply the same phenotypic framework across different settings, the
classification algorithm based on brain activity alone appears to be rather promising.

It is also useful to reflect on the importance of the tasks used in these two studies: they were largely
irrelevant. That is, the cognitive demands of the tasks were not the basis of the diagnostic
information. While Calhoun’s participants were doing an oddball task, it would seem in principle that they
may as well have been counting monkeys in the jungle. This is quite surprising for someone steeped
in clinical cognitive neuroscience. One might think that tasks that tapped the deficits associated with
schizophrenia would be those most useful in making a diagnosis. But the implicit or explicit
premise that the investigators began with is that schizophrenia is largely a disorder of connectivity;
given enough machine power, functional data will reveal the nature of that disconnectivity in the
absence of any particular task demand. This is important for many reasons, but two appear to be
most directly relevant to the future of clinical assessment: first, the problems and questions associated
with matching performance of patients and controls and the confounds that arise from individual
differences in performance are ameliorated by using this kind of approach; second, schizophrenia may be an
unusual disorder in this regard. Clinical assessment in domains other than psychosis, such as
internalizing or externalizing pathology, may still require task-related activity rather than resting state activity
to observe clinically relevant brain activity. That is, there are no basal patterns of activity or connectivity
that distinguish someone with generalized anxiety disorder or a gambling addiction from controls.
Thus, although in this case task demands and performance were not discussed at length,
these may be important considerations for understanding some forms of psychopathology.

Conclusions
What are we to make of these developments at the edge of science fiction? On the one hand, the
evidence for reliability as traditionally understood is poor. Test–retest reliability of voxels within an
independent component, for example, is an unimpressive .49, on average. On the other hand, there is
something of the feel of a large locomotive coming through a tunnel. At this point we can only feel the
rumble and the wind being pushed in front of it; but soon, perhaps, it may change the landscape of
Abstract
The Asian American population is fast-growing, with enormous within-group heterogeneity.
Although the past decade has seen significant gains in psychological research on this population, there
continues to be a notable paucity of research in some areas of clinical personality assessment with Asian
American clients. This chapter’s goal is to provide guidelines for practical considerations in conducting
clinical personality assessment with Asian Americans. The chapter reviews recent research on the use of
clinical assessment measures with Asian Americans and presents general guidelines for clinical assessment,
followed by the case example of an Asian American client in a forensic assessment case. The chapter
concludes with a discussion of remaining challenges and future directions.
Clinicians and researchers conducting personality assessment of a culturally different client are faced
with the formidable task of evaluating the contribution of cultural factors to an individual’s personality
structure and functioning. With Asian Americans, the difficulty of the task is magnified due to a
number of factors. First, there is an enormous degree of heterogeneity within this population on a
number of important dimensions including culture of origin, languages, history in Asia, immigration
and settlement history, and so forth. Second, there are no existing norms for this population even for the
most extensively used assessment tools such as the Minnesota Multiphasic Personality Inventory
(MMPI; Greene, 1987; Hall, Bansal, & Lopez, 1999), and many current versions of widely used
instruments are not available in Asian languages. Finally, there is no consensus on how to
conceptualize individual and group differences on measures of personality.

In the United States, the term Asian is used to designate persons from more than 20 culturally and
linguistically distinct ethnic groups from East Asia, Southeast Asia, and the Indian subcontinent. The Asian
population in the United States has continued to grow in numbers in the past few decades. For
example, the 1990 census (U.S. Bureau of the Census, 1993) showed the Asian and Pacific
Islander population to number 7.3 million, or 2.9% of the U.S. population. The 2000¹ census
showed that the population had reached 11.9 million, thus comprising 4.2% of the U.S. population
(U.S. Bureau of the Census, 2004). The census bureau estimated the Asian population in July 2005 to
be 14.4 million, or 5% of the U.S. population. Between 2004 and 2005, the Asian population grew at
3%, the highest rate of any race group in the same period (U.S. Bureau of the Census, 2000). Chinese was the
largest ethnic group at 3.3 million, followed by Filipinos (2.8 million), Asian Indians (2.5 million),
Koreans (1.4 million), Japanese (1.2 million), and Vietnamese (1.2 million). Other Asian ethnic
groups that compose at least 1% of the U.S. Asian population include Cambodian, Hmong, Laotian,
Pakistani, and Thai. Other Asian groups include those who trace their ancestry to Bangladesh,
Malaysia, Sri Lanka, Indonesia, and Nepal. With the exception of Japanese Americans, the majority
(60%) of Asian Americans in 1997 were born in foreign countries (U.S. Bureau of the Census,
1999). According to the 2000 census (U.S. Bureau of the Census, 2004), nearly one-third of Asian
Americans were born in the U.S. (31.1%), another third were foreign-born naturalized U.S. citizens
(34.4%), and the final third were foreign-born noncitizens (34.5%). Asian Americans also tend to
be geographically concentrated and many are also linguistically isolated. The 2000 census indicated
that over half (or 51%) of the Asian Americans lived in just three states: New York, California, and Hawaii.
And of the 9.5 million Asians aged 5 and over, 79% spoke a language other than English at home and
about 40% spoke English less than “very well.”

Given such a diverse population, the purpose of this chapter is to provide guidelines for practical
considerations in conducting a personality assessment of persons of Asian descent. To this end, the chapter is
organized into two main sections: (1) a review of existing research on the use of clinical assessment
measures with Asian Americans; and (2) practical considerations and guidelines for assessment. Due to
gaps in research on personality assessment of Asian Americans, some of our discussion will be based on
the literature from ethnic minority research and personality research on Asians overseas.

A Review
Although psychological research on Asian Americans has grown in volume and in scope
within the past decade (Leong, Okazaki, & David, 2006), there appears to be relatively little research on
clinical personality assessment instruments and their uses.² However, across the Pacific, personality scales
such as the MMPI-2 are enjoying increasing usage in various Asian nations (Butcher, Cheung, & Lim,
2003), and there is an active program of research in the cross-cultural adaptation of the Chinese
Personality Assessment Inventory across Asia (F. M. Cheung, S. F. Cheung, Wada, & Zhang,
2003). In Asian American psychology, the more active area of research in personality assessment
surrounds the development and validation of scales designed to assess cultural identity and values that
characterize this population. In addition, there have been some new findings from a nationwide
psychiatric epidemiological assessment of Asian Americans (e.g., Takeuchi et al., 2007). In the practical
considerations section of this chapter, we discuss the implications and applications of these various
developments to the practice of clinical personality assessment with Asian Americans.

MMPI Studies
MMPI STUDIES WITH ASIAN AMERICAN POPULATIONS
The MMPI is perhaps the most extensively studied personality instrument across cultures (Butcher, 1985)
with available translated versions estimated to be in 125 languages (Lonner, 1990). Earlier, there was a
concern about its use with ethnic minorities in the United States because of the lack of norms among
minority populations (Colligan, Osborne, Swenson, & Offord, 1983) or because of interpretive
bias (Dana, 1988). However, a review by Greene (1987) found no consistent main effects of majority
or minority ethnic group status on the MMPI scales. A meta-analytic review of MMPI and MMPI-2
research (Hall, Bansal, & Lopez, 1999) also found that these instruments on the whole did not portray
African American and Latino populations as more pathological than the White American population.
Although these studies may have allayed some of the reservations regarding its use with ethnic minorities,
the meta-analyses did not include Asian Americans because of lack of data. To this day, there continue
to be very few studies on the use of the MMPI-2 with Asian Americans (in the United States).³

Two early studies with the original MMPI found that scores on various clinical and validity
scales among Asian Americans tended to be significantly more elevated (Marsella, Sanborn, Kameoka,
Shizuru, & Brennan, 1975; S. Sue & D. W. Sue, 1974), especially among men. However, when
Tsushima and Onorato (1982) compared the MMPI scores of White and Japanese American
medical and neurological patients in Hawaii, they found no significant ethnic differences in the
T-scores of validity or clinical scales when diagnoses were matched across ethnic groups. A significant
gender effect was found across the two ethnic groups such that males scored higher than females
on some of the scales (1, 2, 3, 4, 5, 7, 8, and 9) and females scored higher than males on Scale 0. The
researchers suggested that their results, in contrast to those of the earlier study by Sue and Sue (1974),
might be due to their matching the ethnic groups on diagnostic classifications.

Limited data on Asian Americans’ performance on the MMPI-2 are available. Sue, Keefe, Enomoto,
Durvasula, and Chao (1996) administered the MMPI-2 to a group of 133 English-speaking Asian
American and 91 White American university students. Comparable internal consistency estimates (alpha
coefficients) for the validity and clinical scales were reported for the two ethnic groups. Less acculturated
Bernadette Gray-Little
Abstract
As accurate assessment is critical to diagnosis and treatment, understanding the association between race
or ethnicity and the assessment of psychopathology has important practical significance. However,
detecting and mitigating racial bias in assessment is a profoundly complex challenge. This chapter considers
several conceptual issues germane to assessing U.S. minority groups and reviews efforts to address ethnic
and racial bias in frequently used adult assessment instruments.
Keywords: acculturation, bias, clinical interview, ethnicity, minority, race, picture-story test, projective
test, Rorschach, socioeconomic status, test
During the past 40 years considerable effort has been devoted to understanding how race and ethnicity
might be associated with the assessment of psychopathology. Although knowledge of this association
would benefit our theoretical understanding of the way culture influences personality and
psychopathology, its practical value stems from the assumptions that accurate assessment is necessary for
appropriate diagnosis, and that misdiagnosis leads to disparate treatment and poorer outcomes for
minority group members. Moreover, because stigma is associated with severe mental disorders,
findings of more severe or frequent psychopathology in minority groups foster negative stereotypes that
may, in turn, be the basis for further discrimination. Investigations of this topic fall into two broad,
overlapping categories: attempts to negate or affirm the presence of bias in assessment instruments and
efforts to develop modifications that eliminate presumed bias. Attempted modifications include both
adjustments to the scoring and interpretation of existing instruments and the development of new
instruments for particular ethnic groups. Despite such efforts, however, concerns persist that accurate
assessment of psychopathology in minority groups lags behind that of majority group members, leading to
poorer treatment outcomes. Detecting and mitigating ethnic and racial bias in psychological tests is
a deeply complicated undertaking with numerous conceptual conundrums and methodological
challenges. For example, what constitutes membership in a minority group? How do we identify the
distinctive psychological profile of a particular minority group and then determine to which members of the
group the profile is applicable? Perhaps more important, how do we define bias? Are group mean
differences in test scores irrelevant, with claims of bias warranted only by evidence of a differential
relationship between test scores and a criterion (slope bias) and by an over- or underprediction of the criterion
(intercept bias) (Urbina, 2004)? This chapter will consider several conceptual issues germane to
assessing U.S. minority groups and review efforts to address ethnic and racial bias in frequently used
adult assessment instruments.

Race, Ethnicity, and Minority Status
For reasons that are primarily historical rather than logical, comparisons of Caucasians with
African-Americans have traditionally been referred to as studies of racial difference, whereas those involving
Asian-, Hispanic-, or Native Americans are referred
to as ethnic or cultural comparisons. The terms race, ethnicity, and culture often are used interchangeably
and may refer to the same characteristics. The primary distinction is that race more often refers to a major
biological subgroup of interbreeding persons with similar physical characteristics, whereas ethnicity
refers to primarily cultural features, the nonmaterial aspects or conceptual structures that determine
patterns of living, such as language, customs, religion, and values. However, even race, as a category, can
be viewed as primarily socially constructed because within-group genetic variability often exceeds that
between any two racial groups (Williams, Lavizzo-Mourey, & Warren, 1994; see Rutter & Tienda,
2005, for a discussion of the genetic construction of ethnicity).

Although the study of ethnic groups requires attention to cultural features, it is also important to
consider the status of the group within the socioeconomic system. Socioeconomic status (SES),
which ranks individuals with regard to the distribution of resources, may set in motion some of the
same features as ethnicity. For example, it is common in sociological literature to refer to the
culture of the lower class or to middle-class values. Past and current research suggests that SES is
reliably related to psychopathology (Bruce & Phelan, 2006; Johnson, Cohen, Dohrenwend,
Link, & Brook, 1999) and may have a more profound effect than ethnicity in such areas as
child-rearing, academic achievement, the outcome of child behavior disorders, and psychiatric symptomatology
(Dohrenwend et al., 1998; Eaton, 1980). Moreover, identifiable ethnic groups may be economically and
educationally disadvantaged, and as a result, both distinctions (race/ethnicity and recognizable
class) may be applied to one group. African-Americans, for example, constitute a group with
loosely clustered cultural elements whose members are disproportionately represented among lower SES
classes.

When members of a racial or ethnic group are differentiated from others in the same society and
occupy a subordinate position, they are also considered minorities. Minority status may also contribute
to distinctive experiences that are similar to ethnicity. Fernando (2002) described ethnicity as a shared
sense of belonging-together arising from similarities among persons within a group; he also suggested
that a sense of ethnicity may arise when external pressures lead to the formation of an alliance against
a common enemy or threat. Thus, disadvantaged-minority status can lead to effects similar to
culturally based ethnic features. As minority-group membership is a basis for differentiation in power,
prestige, and resources, it might reasonably be seen as contributing to distress and psychopathology in
much the same way as low SES (Eaton, 1980). However, research with both Native Americans
(Davis, Hoffman, & Nelson, 1990) and African-Americans (Kessler & Neighbors, 1986) suggests
that minority status makes an independent contribution to the understanding of psychopathology.
Accordingly, the effects of SES and minority status are also pertinent and contribute to the complexity
of understanding the assessment of pathology in ethnic minority groups.

Ethnic Identity and Acculturation
Ethnic groups share elements of the national culture, and each has distinctive cultural features. In the
content of their beliefs and values, and in particular behavior patterns, members of minority groups may
be more similar to the majority culture than to one another. Thus, there is little justification for
automatically equating differences that might be found between Caucasians and Native Americans, for
example, with differences between Caucasians and Hispanic-Americans. Because assignment to an
ethnic group typically is derived from self-declared membership, it reveals neither biological race nor
strength of ethnic identification. In an effort to better define the meaning of ethnicity, numerous
investigators have attempted to develop scales to assess the strength and correlates of racial and ethnic
group identification (e.g., Cross, 1978; Cuellar, Harris, & Jasso, 1980; Helms & Parhams, 1996;
Kwan & Sodowsky, 1997; Phinney, 1992; Sellers & Nicole, 2003). However, ethnic identity scales have
not found widespread use in clinical practice, perhaps because they address an issue (strength of identity
with one’s ethnic group) that many view as tangential to the use of traditional assessment instruments.

A more direct gauge of whether standardized tests provide useful information may lie in the
level of “acculturation” of the group to the American mainstream. Like many constructs,
acculturation, or the adoption of the beliefs, behaviors, and language of the majority, is more readily
conceptualized than measured (Kang, 2006). Numerous acculturation scales have been
developed. Matsudaira (2006) identified more than 50 acculturation scales, which vary widely in their
conceptualization and structure. Although research investigators occasionally use acculturation as a
control variable (Roesch, Wee, & Vaughn, 2006),
GRAY-LITTLE 397
acculturation scales have not yet achieved widespread use in clinical practice (see reviews by
Burlew, Bellow, & Lovett, 2000 and Roysircar-Sodowsky & Maestas, 2000).

Culture- or ethnic-specific instruments also have been proposed as a way to ensure appropriate
assessment of members of ethnic minority groups. Because the development of valid group-specific
instruments would be at least as challenging as the development of existing instruments has been, there
are significant practical obstacles to this approach. The few ethnic-specific instruments developed thus
far have met with mixed psychometric success and limited use (Cervantes, Padilla, & de Snyder,
1991; Costello, 1977; Dahlstrom, Lachar, & Dahlstrom, 1986; Jones, 1996). For example,
Cervantes and colleagues developed the Hispanic Stress Inventory (HSI) as a clinical and research
measure of stress-related psychopathology. Factor analyses showed that the HSI was not conceptually
equivalent for American-born Latinos and Mexican-American immigrants, and that it was necessary to
develop different versions of the inventory for the two groups. As new immigrations create additional
ethnic minority groups, the challenge of appropriate assessment will grow, further straining attempts
to create ethnic-specific instruments. Though promising, neither the concepts of ethnic identity
and acculturation nor the development of ethnic-specific tests has yet been comprehensively incorporated
into general assessment practice.

Assessment of Psychopathology
The Clinical Interview
The clinical interview remains the most common diagnostic strategy and often is used in conjunction
with psychological testing. Because the clinical interview is ubiquitous in clinical settings, embodies
many of the interpersonal features germane to the testing situation, and is often the criterion against
which psychological tests are validated, a brief discussion of the clinical interview precedes the
discussion of psychological testing. Numerous studies conducted during the past several decades have
shown that clinical interviews can result in excess diagnosis of severe pathology or recommendations
of more restrictive treatment for African-, Hispanic-, and Native American patients than for Whites
(Blake, 1973; Flaherty & Meagher, 1980; Lawson, Yesavage, & Werner, 1984; Lu, 2004; Soloff &
Turner, 1981; Mukherjee, Shukla, Woodle, Rosen, & Olarte, 1983; Neighbors, Trierweiler, Ford, &
Muroff, 2003; Pavkov, Lewis, & Lyons, 1989; Raskin, Crook, & Herman, 1975; Simon, Fleiss,
Gurland, Stiller, & Sharpe, 1973; Strawkowski et al., 1995). A few have reported opposite results:
Kunen, Niederhauser, Smith, Morris, and Mark (2005), for example, found an underdiagnosis of
psychosis but still a relative overdiagnosis of schizophrenia in African-Americans, based on an
examination of hospital charts. Investigators have reported both overdiagnosis (Aldwin & Greenberger, 1987)
and underdiagnosis of psychopathology (Lu, 2004) of Asian-Americans relative to Caucasians. In a study
that underscores the complexity of this issue, Zhang and Snowden (1999) showed that the magnitude
and direction of ethnic differences varied with the geographic region of the sample and the type of
disorder studied.

Ethnic differences in diagnosis may be attributable to one or a combination of factors: true
differences in the rate of psychopathology, the presence of culturally meaningful differences that are
misinterpreted as psychopathology, or bias in the clinician. As establishing true differences would necessitate
prior elimination of the second and third explanations, most attention has centered on ethnic-specific
manifestations of behavior and clinician bias.

Ethnic-specific Manifestations of Behavior
The cultural expression perspective assumes that racial and ethnic minorities may conduct themselves
in distinctive ways that can lead to misdiagnosis during the clinical interview. Mental health and
mental illness concern emotions, such as depression, happiness, or anger experienced by the individual,
and the manner of expression is influenced by cultural and social factors (Barry, 2007; Mesquita &
Frijda, 1992; Snowden, 1999; Whaley, 1997). In situations where the interviewer and client share a
similar understanding of idioms of expression, communication is eased and idioms natural to both can
be used. In the absence of shared understanding there is potentially a gulf, akin to that between a
diagnostician and patient who speak different languages. This gulf might result in mislabeling
“normal” behavior as problematic or in miscategorization of problematic behavior. The existence of
ethnic variations in the expression of disorder, as well as the occurrence of culture-specific syndromes,
is documented in psychiatric literature and recognized in recent diagnostic manuals (American
Psychiatric Association, 2000; Westermeyer, 1987). For example, empirical studies have at times
revealed more frequent somatic symptoms in depression for Hispanics (Beltran, 2006; Fabrega,
GRAY-LITTLE 399
patients, but elevated and dysphoric mood symptoms more in the diagnosis of other patients, a bias which helps to explain the frequent “overdiagnosis” of schizophrenia in African-American patients. Neighbors et al. (2003) suggest that clinicians use a different process to link observed symptoms to diagnosis for African-American than Caucasian patients, especially in reference to schizophrenia diagnosis. Trierweiler et al. (2006) underscored the complexity of this literature by showing that clinician ethnicity may also be a factor in the way symptoms are weighted in the diagnostic process. The work of Trierweiler and associates along with studies by Malgady and Constantino (1998) and Iwamasa, Larrabee, and Merritt (2000) shows that rather than being ignored, race, ethnic, and cultural factors are used in a way that influences diagnostic decisions. Ethnic variations in symptoms present a challenge for the practitioner when they result in a poorer fit between symptoms and diagnostic category. This challenge is compounded by social distance, differential symptom attribution, and other cognitive processing errors. Whereas the traditional clinical interview seems maximally susceptible to such influences, structured diagnostic interviews were developed to overcome these sources of error (Kashner et al., 2003; Lyketsos, Aritzi, & Lyketsos, 1994).
Structured Clinical Interviews
Structured diagnostic interviews are more commonly used in research, especially epidemiological studies and clinical trials, than in routine clinical practice, where the time required to train-to-standard and for complete administration are frequently cited as barriers (Jensen & Weisz, 2002). However, the emerging emphasis on evidence-based practice in psychology has spurred recommendations for their use in routine clinical practice (APA Task Force, 2006). By design, structured interviews minimize the sources of variability that lead to unreliable psychiatric diagnoses (Summerfeldt & Antony, 2002). Like psychological tests, they have standardized content, format, and ordering of questions and provide algorithms for making a diagnostic decision. They should reduce unwarranted disparities in diagnosis among ethnic groups. Several researchers have shown that indeed the association between race and diagnosis found through a traditional interview can be attenuated or eliminated when the diagnosis is derived from structured interviews or other objective criteria (Craig, Goodman, & Hauglaund, 1982; Kashner et al., 2003; Mukherjee et al., 1983; Pavkov et al., 1989; Neighbors et al., 2003; Whaley, 2004). Greater use of standardized interviews, which provide more consistent information, should be encouraged.
Psychological Testing
Psychological tests were developed to provide objective and standardized appraisals to support inferences about behavior or psychological processes. With any test used to measure psychopathology in ethnic groups, the assumption is implicit that an underlying disorder is expressed in the same way in each population and that the test in question adequately represents these various symptoms. If the disorder is manifest in different symptoms in different groups, this assumption is violated and the instrument is not equally appropriate for all groups. A second assumption is that the scores obtained represent true scores, whereas, in fact, each score measures the target construct with a certain level of error, and the amount of error may differ from one group to another (Choca, Shanley, & Van Denburg, 1992; Walters, Greene, Jeffrey, Kruzich, & Haskin, 1983). Further, as many psychological tests require interpretation, test use may also be subject to distortions similar to those affecting the clinical interview. All of these considerations are relevant to the most contentious issue in this debate, test fairness: whether psychological tests offer equally accurate assessment for minority groups, and, in particular, whether tests provide equally adequate basis for diagnosis, effective treatment, and for predicting level of adjustment.
Although there are scores of psychological tests that might be used to assess psychopathology, the number used with high frequency to assess adults in clinical practice and training settings is relatively small (Camara, Nathan, & Puente, 2000; Clemence & Handler, 2001). The Minnesota Multiphasic Personality Inventory (MMPI) and the Millon Clinical Multiaxial Inventory (MCMI) are the two broad-gauged self-report instruments used most frequently to assess adult personality and psychopathology.² Among instruments that assess specific disorders, the Beck Depression Inventory (BDI) is frequently used; however, little attention has been devoted to exploring its cross-ethnic utility (Okazaki & Tanaka-Matsumi, 2006). The most commonly used self-expressive or projective instruments are the Rorschach Inkblot Test, the Thematic Apperception Test (TAT), sentence completion tests (SCTs), and drawings (Watkins, Campbell, Nieberding, & Hallmark, 1995). For the most
DSM-III acknowledges cultural influences, it nevertheless assumes a cultural view of maladjustment that is difficult to apply without modification to minority cultures. Pace et al. (2006) also conducted their own investigation of the MMPI-2 with tribal members recruited from two distinct American Indian communities in Oklahoma, viz. Eastern Woodland Oklahoma (EWO) and Southwest Plains Oklahoma (SWPO). Participants in this study were non-patients and volunteers attending tribal fairs or other tribal events. They found significant clinical elevations on the majority of MMPI-2 scales, even when education level was accounted for. Specifically, elevations near or above 5 T-points on Scales F, 1, 4, 5, 6, 7, and 8 were observed in both tribes, with elevations in the L scale among the SWPO but not the EWO (which the authors interpreted as reflecting the lower educational levels among the SWPO). The authors argued that such extensive elevation in a non-patient sample, after accounting for education, is at the very least problematic, and urged great caution in the use of MMPI-2 data in understanding personality among American Indians. For example, they argued, “unusual thinking” within the MMPI-2 may not necessarily be linked to distress or impairment within a particular tribal context. The authors urged researchers and clinicians to interpret MMPI-2 results using a cultural view consistent with the history of the particular tribes studied rather than to rely solely on the categorical, descriptive system on which the DSM is based.
African-Americans
Timbrook and Graham (1994) found no MMPI-2 validity scale differences among matched samples of Blacks and Whites; but on clinical scales Black men scored significantly higher than Whites on Scales 8 and 9; among women, Blacks had higher scores on Scales 4, 5, and 9. Their comparison of unmatched samples revealed significant racial differences for both men and women on the validity scales. Additional research by Ben-Porath and his colleagues compared samples of Black and White forensic and psychiatric patients on clinical, validity, and content scales of the MMPI-2 (Arbisi, Ben-Porath, & McNulty, 1998; Ben-Porath, Shondrick, & Stafford, 1995; McNulty, Graham, Ben-Porath, & Stein, 1997). Ben-Porath et al. (1995) found no significant differences in clinical and validity scales in their forensic sample, though the other studies report some statistically significant differences that are less than five T-score points. By contrast, Gironda (1998) reported a large number of significant differences, some exceeding five T-score points, in clinical and content scales for Black and White men and women referred for forensic assessment. Fortunately, these studies encompassed extra-test correlates of scale scores in their analyses. Correlates of MMPI-2 scale scores did not differ by race when cases from the normative sample were compared with special scales developed in 1993 by Long (Timbrook & Graham, 1994); nor, for the most part, when forensic clients’ scores were compared with collateral information taken from their case files (Gironda, 1998); nor with therapists’ ratings of personality and symptomatology (McNulty et al., 1997). The quality of the extra-test criteria used in these studies might be questioned as they do not represent a higher standard than the MMPI. In the aggregate, however, this body of research suggests comparable predictive validity of MMPI-2 for African-Americans who are matched for education and SES. This work is also consistent with a recent meta-analytic synthesis of 31 years of comparative MMPI research, which concluded that neither African-Americans nor Latinos were unfairly portrayed as pathological when tested with the MMPI-2 (Hall, Bansal, & Lopez, 1999). This finding is echoed in work by Arbisi, Ben-Porath, and McNulty (2002), who explored ethnic differences in MMPI-2 profiles between African-American and Caucasian psychiatric inpatients to determine whether the two ethnic groups produced differentially elevated profiles and, if so, whether this reflects test bias (slope or intercept) or actual group differences. Mean differences on several validity and clinical scales were reported, and these were generally consistent with past research. Specifically, African-American men scored significantly higher than Caucasian men on anger, antisocial practices, bizarre mentation, cynicism, family problems, fears, depression, health concerns, negative treatment indicators, alcoholism and addiction acknowledgment. African-American women scored significantly higher on the first five, plus alcoholism. A step-down hierarchical multiple regression indicated the presence of prediction bias in 32 of the 65 comparisons among men, and 12 of the 65 among women. None, however, exceeded a small effect size, and the source of bias was differences in intercepts between the groups. When bias was present, it was consistently in the direction of underpredicting psychopathology in African-American males, which is inconsistent with past critiques of the MMPI as overpredicting psychopathology in African-Americans. The authors
the medium- and high-acculturation groups, but the latter group did not differ from Caucasians. The caveat concerning ethnic identity measures is also relevant here: We do not know whether acculturation measures tap the process of psychopathology or whether the MMPI-2 pathologizes cultural difference. Nonetheless, MMPI-2 research with Asian-American populations dramatically highlights the potential importance of cultural factors in using psychological tests for diagnosis and treatment planning.
Our brief review of MMPI-2 use with four ethnic groups reveals quite disparate results. First, recent MMPI-2 research with African-Americans reveals little evidence of a systematic overestimation of psychopathology in samples of comparable social class. The improved picture with African-Americans may represent acculturative changes in the population, improved psychometric properties of the test, or a more inclusive normative sample of the MMPI-2 in comparison with the original MMPI. There continues to be little evidence of systematic over- or underprediction of psychopathology with Hispanic groups, and differences that are found (L scale and Scale 5) are readily understood as cultural, rather than pathological. Research with Asian-Americans is still sparse, but is suggestive of a very strong moderating influence of acculturation; whereas studies with Native Americans suggest strong caution in interpreting elevated profiles. Inter-ethnic differences highlight the distinctiveness of each group, while the association of MMPI-2 scale elevations with acculturation and ethnic identity within ethnicity underscores intra-group variability. Research incorporating both controls for SES and measures of ethnic identity and acculturation would be useful in gauging their relative contribution to MMPI-2 results in different ethnic groups.
THE MILLON CLINICAL MULTIAXIAL INVENTORY (MCMI)
There are important differences between the MCMI-I and the original MMPI in both theoretical underpinnings and construction. Unlike the original MMPI, the standardization sample for the MCMI-I included representative percentages of minority groups (a total of 19% comprising Asians, Blacks, Hispanics, and mixed ethnics). In addition, the procedures for converting raw scores into standard scores include separate conversion tables for men and women, for Blacks and Hispanics, and for the general population. Despite these efforts and evidence of similar factor structure for Blacks and Whites, significant racial differences occurred not only at the level of item endorsement, but also in the standardized scale scores, as well as in the way the MCMI-I predicted DSM-III Axis I and Axis II disorders (Choca et al., 1992). For example, African-Americans received significantly higher scores on narcissistic, paranoid, and delusional disorder scales of the MCMI-I in a sample of psychiatric outpatients matched for age, education, and employment status (Hamberger & Hastings, 1992). In a study by Munley, Vacha-Haase, Busby, and Paul (1998), African-American inpatients scored higher than Caucasians on the same three scales in addition to scales of histrionic and drug dependence disorder on the MCMI-II; these differences were attenuated when Munley and colleagues matched their patients on primary diagnosis and comorbidity. Several investigators have used the MCMI-II with other ethnic minority samples or with samples that include ethnic minorities (Daly, 1998; Rudd & Orman, 1996; Stewart, 1999). However, there is exiguous comparative research or research on the predictive validity of the MCMI-I or -II with U.S. minority groups other than African-Americans (Craig & Bivens, 1998; Craig, Bivens, & Olson, 1997). One exception is the study by Glass, Bieber, and Tkachuk (1996) that reported significantly higher scores on several MCMI-II scales when comparing Alaskan natives with a primarily Anglo group of incarcerated offenders. Millon and Meagher (2004, p. 119) suggested that “it may be useful to develop separate norms for different ethnic groups” and Rossi and Sloore (2005) advocated for increased sensitivity to issues of equivalence in cross-cultural and cross-national usage; however, little empirical work on the appropriate use of the MCMI with American ethnic populations has been published in recent years. Thus, Choca and colleagues’ (1992) admonition to temper interpretation of MCMI scores for African-Americans until we understand more exactly the meaning of score elevations is still sound advice and is appropriate for other ethnic minority groups as well.
Projective Techniques: Self-Expressive Assessment of Psychopathology
Projective or self-expressive instruments are widely used in clinical practice, although Belter and Piotrowski (2001) reported a substantial decline in the number of clinical programs that offer training in projective instruments. Projective tests have been considered more culturally fair than self-report instruments because of their ambiguous stimuli,
2005). Although this controversy addresses concerns about the reliability, validity, efficiency, and effectiveness of the Rorschach, per se, the debate has also touched on the appropriateness of the Rorschach for use with members of minority groups, a topic that has a long history.
The Rorschach has been used extensively in anthropological studies of personality and culture (e.g., Hallowell, 1945). The first racial (Black–White) comparison on the Rorschach was conducted more than 50 years ago (Abel, 1973); a study involving Puerto Rican children occurred in 1966 (Ruse and Berkowitz, cited in Abel, 1973); and a number of studies were completed by Boyer and Klopfer and their associates with Apache children and adults (1963, 1964, 1968). Between 1970 and 1990, only a handful of studies addressed the performance of ethnic groups on the Rorschach (Howes & DeBlassie, 1989; Scott, 1981). A cursory examination of Exner’s (1993) comprehensive scoring and interpretive guide revealed no systematic consideration of race and ethnicity in use of the Rorschach. In their Rorschach study of incarcerated Alaskan native and nonnative men, Glass et al. (1996) found a few predicted and several unpredicted differences between the groups. There were, however, no significant differences in Rorschach scores for three different levels of acculturation (traditional, bicultural, and assimilated) among the Alaskan native sample. Presley et al. (2001) formed comparison groups of 44 Whites and 44 African-Americans from the Comprehensive System normative sample (N = 700). They compared the two samples, which were matched on age, sex, education, and socioeconomic class, on 23 variables, and found significant differences on only three. These three variables were suggestive of greater anger, higher schizophrenia index, and less cooperative responses among African-Americans, although only the last was considered large enough to suggest clinical significance. The authors interpreted the lower level of cooperative responses as suggesting that African-Americans expect cooperative reactions less routinely than Caucasians and perceive others to be less sensitive to their needs. The most striking finding, however, was similarity of the groups on the vast majority of variables studied in these matched samples. This study explored the question of mean differences on Rorschach scores and found little evidence that race moderated the scores. In a more in-depth study, Meyer (2002) used a large clinical “files” sample of 432 patients including 242 Whites, 157 African-Americans, and 33 members of other ethnic minority groups. He attempted to systematically test for bias by demonstrating a common factor structure for Whites, African-Americans, and a general minority group category; and by determining whether there was evidence of slope or intercept bias. The analyses generally supported Meyer’s position that the Rorschach can be used effectively with majority and minority clients who are matched for age, education, marital, and inpatient status. As Meyer noted, however, a major limitation of the study was the fact that the criterion against which the Rorschach was measured was prior clinical diagnosis. The results should therefore be viewed in light of the limitations of clinical diagnosis noted above: Demonstrating that one instrument predicts another potentially biased one speaks about the correlation between the tests, but does not settle the question of their fairness. Nonetheless, the work of Presley et al. and Meyer makes an important beginning contribution and suggests that the Rorschach may have comparable utility for very carefully matched Caucasian and African-American groups. Research with other ethnic minority groups is still too scant to be conclusive.
Picture-Story Tests
The prototype of the picture-story test is Murray’s Thematic Apperception Test (TAT). The TAT was developed on the assumption that somewhat ambiguous, but emotionally suggestive, pictures will elicit personal narratives reflecting underlying needs and experiences. The scoring of the TAT with any population is problematic, and numerous scoring systems have been developed. Lilienfeld et al. (2000) describe some of the scoring systems as promising, but note that none has been adopted widely in clinical practice, and scant research has addressed the applicability of these scoring schemes to different cultural groups (Smith, Atkinson, McClelland, & Veroff, 1992). In clinical situations, few practitioners adhere to a standard system; instead, patients’ stories are examined for themes and conflicts; and the scoring may be entirely impressionistic (Rossini & Moretti, 1997). For that reason, the scoring, and possibly the interpretation of the TAT, has been more variable than for the Rorschach, for which there are standard scoring systems. Despite the existence of a standard pool of stimuli, use of the TAT in common practice does not appear to constitute standardized assessment. It is not surprising then that serious questions have been raised about its reliability and validity for any group and about its incremental
Sentence Completion and Figure Drawing Tests
If one eliminates clinical interviews and intelligence and neurological tests from the list of commonly used procedures, SCTs and figure drawings are ranked among the top 10 (Camara et al., 2000; Watkins et al., 1995). Each of these procedures encompasses numerous strategies ranging from the use of a standard set of stimuli with a structured scoring system and interpretive guidelines through informal use in which the practitioner may select stimuli from a standard set or develop items and interpretations for each patient (Dana, 1998; Holaday, Smith, & Sherry, 2000). Sherry, Dahlen, and Holaday (2004) identified more than 40 different SCTs; however, only a few, viz. the Rotter Incomplete Sentences Blank and Washington University Incomplete Sentence Test of Ego Development, have been researched extensively. The scoring procedures developed for these SCT instruments are not used routinely in clinical practice (Holaday et al., 2000).
There are also several prominent graphic techniques. The best known are the Draw-A-Person (DAP), House-Tree-Person (HTP), and the Kinetic Family Drawing. The last of these is considered a kinetic technique because unlike the other two it directs the patient to draw a figure in action. Among the three, only the DAP is frequently used with adult populations. Like the Rorschach, graphic techniques have been the subject of intense recent controversy (Lilienfeld et al., 2000). Critics have demonstrated that there is no empirical evidence for the validity of specific quantitative indicators such as line thickness, figure size, and detail (e.g., Joiner, 1997; Joiner, Schmidt, & Barnett, 1996). Further, without strong interpretive guidelines, practitioners appear to carry over unsupported lay schemas about the meaning of drawings, for example, that a figure with broad shoulders indicates that the client feels burdened (Smith & Dumont, 1995), whereas empirical studies do not support such associations. Supporters of projective drawings have long held, however, that figure drawings are most meaningful when interpreted holistically in an approach that integrates the client’s experience rather than from an enumeration of quantitative indicators (Handler & Riethmiller, 1998; Riethmiller & Handler, 1997; Tharinger & Stark, 1990). The appropriate use of projective drawings with ethnic minority clients in the United States has not typically been a part of this debate (see Naglieri, McNeish, & Bardos, 1991, for an exception). However, there is evidence to indicate that drawings from both children and adults in other societies (German, Nigerian, Australian Aboriginal, and Mexican, among others) are different from those of Americans (Handler, Campbell, & Martin, 2004), suggesting that environment and culture exert an influence.
Comparative studies of the SCT and DAP have more often been directed at comparisons of cross-national groups than of ethnic subgroups within the United States (Handler et al., 2004; Sherry et al., 2004). Given variations in language use and physical characteristics among diverse U.S. groups, it is conceivable that clients of different ethnic minority groups may give distinctive responses to both the SCTs and projective drawings. When SCTs and projective drawings are employed in the informal manner common in present clinical practice, they might be thought of as adjuncts to the clinical interview rather than as formal testing procedures, and their utility with ethnic minority persons will depend on the skill and cultural awareness of the practitioner. In view of the widespread use of these procedures, there is a great need for additional study of their basic psychometric qualities, their added value, their clinical utility, and their applicability to ethnic minority groups.
Conclusions
There have been many advances in the development of assessment instruments with applicability to diverse groups in the past decade. The MMPI-2 is a prime example. Some of the most extreme cautions about use of the MMPI with African-Americans and Hispanic Americans have abated. Recent research with the Rorschach also holds out the likelihood that it is equally applicable to very closely matched groups of African-Americans and Whites. It bears repeating, however, that not all minority groups are alike. Findings that apply to one group are not necessarily applicable to all. For example, research with the MMPI-2 suggests substantial similarity in mean scores for Hispanic and normative samples, but substantial and poorly understood overdiagnosis of psychopathology for Native American samples (Greene et al., 2003; Pace et al., 2006), and the applicability of the Rorschach to ethnic groups other than African-Americans remains to be established. Thus, we have not reached the point of universal applicability of the major instruments used to assess psychopathology. Among other frequently used instruments such as the SCT and figure drawings, concerns about their reliability and validity
References
Abel, T. M. (1973). Psychological testing in cultural contexts. New Haven, CT: College and University Press.
Aldwin, C., & Greenberger, E. (1987). Cultural differences in the predictors of depression. American Journal of Community Psychology, 15(6), 789–813.
Allen, J. (2007). A multicultural assessment supervision model to guide research and practice. Professional Psychology: Research and Practice, 38, 248–258.
American Psychiatric Association. (2000). Diagnostic and statistical manual of mental disorders (4th Rev. ed.). Washington, DC: Author.
Anonymous. (2005). The status of the Rorschach in clinical and forensic practice: An official statement by the board of trustees of the Society for Personality Assessment. Journal of Personality Assessment, 85, 219–237.
APA Presidential Task Force on Evidence-Based Practice. (2006). Evidence-based practice in psychology. American Psychologist, 61, 271–285.
Arbisi, P., Ben-Porath, Y. S., & McNulty, J. L. (1998). Impact of ethnicity on the MMPI-2 in inpatient settings. Paper presented at the 106th annual meeting of the American Psychological Association, San Francisco.
Arbisi, P., Ben-Porath, Y., & McNulty, J. L. (2002). A comparison of MMPI-2 validity in African American and Caucasian psychiatric inpatients. Psychological Assessment, 14, 3–15.
Bailey, B. E., & Green, J. (1977). Black Thematic Apperception Test stimulus material. Journal of Personality Assessment, 41, 25–30.
Barry, D. T. (2007). Ethnicity, race, and men’s mental health. In J. E. Grant & M. N. Potenza (Eds.), Textbook of men’s mental health (pp. 343–361). Washington, DC: American Psychiatric Publishing.
Bellak, L., & Abrams, D. M. (1997). The T.A.T., the C.A.T., and the S.A.T. in clinical use (6th ed.). Needham, MA: Allyn & Bacon.
Belter, R., & Piotrowski, C. (2001). Current status of doctoral-level training in psychological testing. Journal of Clinical Psychology, 57(6), 717–726.
Beltran, I. (2006). The relation of culture to differences in depressive symptoms and coping strategies: Mexican American and European American college students. Dissertation Abstracts, 67(4-B), 2214.
Ben-Porath, Y. S., Shondrick, D. D., & Stafford, K. (1995). MMPI-2 and race in a forensic diagnostic sample. Criminal Justice and Behavior, 22, 12–32.
Blake, W. (1973). The influence of race on diagnosis. Studies in Social Work, 43, 184–192.
Boyer, L., Boyer, R. M., Brawer, F. B., Kawai, H., & Klopfer, B. (1964). Apache age groups. Journal of Projective Techniques and Personality Assessment, 28, 397–402.
Boyer, L., Boyer, R. M., Kawai, H., Scheiner, S. B., & Klopfer, B. (1968). Apache “learners” and “non-learners.” Journal of Projective Techniques and Personality Assessment, 32, 147–159.
Boyer, L., Klopfer, B., Brawer, F. B., & Kawai, H. (1963). Comparisons of shamans and pseudoshamans of the Apaches of the Mescalero Indian Reservation. Journal of Personality Assessment, 28, 173–180.
Bruce, B. G., & Phelan, J. C. (2006). Fundamental social causes: The ascendancy of social factors as determinants of distributions of mental illnesses in populations. In W. W. Eaton (Ed.), Medical and psychiatric comorbidity over the course of life (pp. 77–94). Washington, DC: American Psychiatric Publishing.
Burlew, A. K., Bellow, S., & Lovett, M. (2000). Racial identity measures: A review and classification. In R. Dana (Ed.), Handbook of cross-cultural and multicultural assessment (pp. 173–196). Mahwah, NJ: Erlbaum.
Butcher, J. N., Cabiya, J., Lucio, E., & Garrido, M. (2007). Assessing Hispanic clients using the MMPI-2 and MMPI-A. Washington, DC: American Psychological Association.
Camara, W., Nathan, J., & Puente, A. (2000). Psychological test usage: Implications in professional psychology. Professional Psychology: Research and Practice, 31, 141–154.
Campos, L. P. (1989). Adverse impact, unfairness, and bias in psychological screening of Hispanic peace officers. Hispanic Journal of Behavioral Sciences, 11, 122–135.
Cervantes, R. C., Padilla, A., & de Snyder, N. S. (1991). The Hispanic Stress Inventory: A culturally relevant approach to psychosocial assessment. Psychological Assessment, 3, 438–447.
Choca, J. P., Shanley, L. A., & Van Denburg, E. (1992). Interpretive guide to the Millon Clinical Multiaxial Inventory. Washington, DC: American Psychological Association.
Clemence, A. J., & Handler, L. (2001). Psychological assessment on internship: A survey of training directors and their expectations of students. Journal of Personality Assessment, 76, 18–47.
Constantino, G., & Malgady, R. G. (2000). Multicultural and cross-cultural utility of the TEMAS (Tell Me a Story) Test. In R. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment (pp. 481–513). Mahwah, NJ: Erlbaum.
Constantino, R., Dana, R. H., & Malgady, R. G. (2007). TEMAS (Tell Me a Story) assessment in multicultural societies. Mahwah, NJ: Erlbaum.
Costello, R. M. (1977). Construction and cross-validation of an MMPI Black–White scale. Journal of Personality Assessment, 41, 514–519.
Craig, R. J., & Bivens, A. (1998). Factor structure of the MCMI-III. Journal of Personality Assessment, 70, 190–196.
Craig, R. J., Bivens, A., & Olson, R. (1997). MCMI-III-derived typological analysis of cocaine and heroin addicts. Journal of Personality Assessment, 69, 583–595.
Craig, T., Goodman, A. B., & Hauglaund, G. (1982). Impact of DSM-III on clinical practice. American Journal of Psychiatry, 139, 922–925.
Cross, W. E. (1978). The Thomas and Cross models of psychological nigrescence: A review. Journal of Black Psychology, 5, 13–31.
Cuellar, I., Harris, I. C., & Jasso, R. (1980). An acculturation scale for Mexican American normal and clinical populations. Hispanic Journal of Behavioral Sciences, 2, 199–217.
Dahlstrom, W. G., Lachar, D., & Dahlstrom, L. E. (Eds.). (1986). MMPI patterns of American minorities. Minneapolis, MN: University of Minnesota Press.
Daly, J. E. (1998). Predicting compliance among men who batter: The contribution of demographics, violence-related factors and psychopathology. Dissertation Abstracts International: Section B: The Sciences and Engineering, 59(5-B), 2414.
Dana, R. H. (1993). Multicultural assessment perspectives for professional psychology. Boston: Allyn & Bacon.
Dana, R. H. (1998). Projective assessment of Latinos in the United States: Current realities, problems, and prospects. Cultural Diversity and Mental Health, 3, 165–184.
Dana, R. H. (2000). Handbook of cross-cultural personality assessment. Mahwah, NJ: Lawrence Erlbaum.
Lawson, W. B. (1986). Racial and ethnic factors in psychiatric M. J. Hilsenroth & D. L. Segal (Eds.), Comprehensive hand-
research. Hospital and Community Psychiatry, 37, 50–54. book of psychological assessment (pp. 108–121). Hoboken, NJ:
Lawson, W. B., Yesavage, J. A., & Werner, P. D. (1984). Race, John Wiley & Sons.
violence, and psychopathology. Journal of Clinical Psychiatry, Monopoli, J. (1984). A culture-specific interpretation of thematic
45, 294–297. test protocols for American Indians (Unpublished master’s
Leichtman, M. (2004). Projective tests: The nature of the task. In thesis, University of Arkansas, Fayetteville).
M. Hersen (Series ed.) & M. J. Hilsenroth & D. L. Segal (Vol. Monopoli, J., & Alworth, L. (2000). The use of the thematic
eds.), Comprehensive handbook of psychological assessment, Vol. 2: apperception test in the study of Native American psycholo-
Personality assessment (pp. 297–314). Hoboken, NJ: John gical characteristics: A review and archival study of Navaho
Wiley & Sons. men. Genetic, Social, and General Psychology Monographs, 126,
Leininger, A. (2002). Vietnamese–American personality and 43–78.
acculturation: An exploration of relations between personality Mukherjee, S., Shukla, S., Woodle, J., Rosen, A., & Olarte, S.
traits and cultural goals. In R. McCrae & J. Allik (Eds.), The (1983). Misdiagnosis of schizophrenia in bipolar patients: A
Five-Factor Model of personality across cultures (pp. 197–225). multiethnic comparison. American Journal of Psychiatry, 140,
New York: Kluwer Academic/Plenum. 1571–1574.
Lilienfeld, S., Wood, J., & Garb, H. (2000). The scientific status Mumford, D. (1993). Somatization: A transcultural perspective.
of projective techniques. Psychological Science in the Public International Review of Psychiatry, 5, 231–242.
Interest, 1, 27–66. Munley, P. H., Vacha-Haase, T., Busby, R. M., & Paul, B. D.
Lim, R. (2006). Clinical manual of cultural psychiatry. (1998). The MCMI-II and race. Journal of Personality
Washington, DC: American Psychiatric Publishing. Assessment, 70, 183–189.
Liss, J. L., Welner, A., Robins, E., & Richardson, M. (1973). Naglieri, J. A., McNeish, T. J., & Bardos, A. N. (1991). Draw-a-
Psychiatric symptoms in White and Black inpatients: I. Person: Screening procedures for emotional disturbance. Austin,
Record study. Comprehensive Psychiatry, 14, 475–481. TX: ProEd.
Lopez, S., & Nunez, J. A. (1987). Cultural factors considered in Neighbors, H., Trierweiler, S., Ford, B., & Muroff, J. (2003).
selected diagnostic criteria and interview schedules. Journal of Racial differences in DSM diagnosis using a semi-structured
Abnormal Psychology, 96, 270–272. instrument: The importance of clinical judgment in the diag-
Loring, M., & Powell, B. (1988). Gender, race, and DSM-III: A nosis of African Americans. Journal of Health and Social
study of the objectivity of psychiatric diagnostic behavior. Behavior, 44, 237–256.
Journal of Health and Social Behavior, 29, 1–22. Okazaki, S., & Sue, S. (2000). Implications of test revisions for
Lu, F. (2004). Culture and inpatient psychiatry. In W.-S. Tseng & Asian Americans. Psychological Assessment, 12, 272–280.
J. Streltzer (Eds.), Cultural competence in clinical psychiatry Okazaki, S., & Tanaka-Matsumi, J. (2006). Cultural considera-
(pp. 21–36). Washington, DC: American Psychiatric tions in cognitive-behavioral assessment. In P. Hays & G.
Publishing. Iwamasa (Eds.), Culturally responsive cognitive-behavioral
Lyketsos, C., Aritzi, S., & Lyketsos, G. (1994). Effectiveness of therapy: Assessment, practice, and supervision (pp. 247–266).
office-based psychiatric practice using a structured diagnostic Washington, DC: American Psychological Association.
interview to guide treatment. Journal of Nervous and Mental Pace, T., Robbins, R., Choney, S., Hill, J., Lacey, K., & Blair, G.
Disease, 182, 720–723. (2006). A cultural-contextual perspective on the validity of
McNulty, J. L., Graham, J. R., Ben-Porath, Y. S., & Stein, L. A. the MMPI-2 with American Indians. Cultural Diversity and
R. (1997). Comparative validity of MMPI-2 scores of African Ethnic Minority Psychology, 12, 320–333.
American and Caucasian mental health center clients. Park, B., & Rothbart, M. (1982). Perception of out-group homo-
Psychological Assessment, 9, 464–470. geneity and levels of social categorization: Memory for the
Maduro, R. (1983). Curanderismo and Latino views of diseases subordinate attributes of in-group and out-group members.
and curing. Western Journal of Medicine, 139, 868–874. Journal of Personality and Social Psychology, 42, 1051–1068.
Malgady, R. G., & Constantino, G. (1998). Symptom Pavkov, T. W., Lewis, D. A., & Lyons, J. S. (1989). Psychiatric
severity in bilingual Hispanics as a function of clinician diagnoses and racial bias: An empirical investigation.
ethnicity and language of interview. Psychological Professional Psychology: Research and Practice, 20, 364–368.
Assessment, 10, 120–127. Phinney, J., 1992. The Multigroup Ethnic Identity Measure: A
Manson, S. M. (1994). Culture and depression. Discovering new scale for use with diverse groups. Journal of Adolescent
variations in the experience of illness. In W. B. Lonner & Research, 7, 156–176.
R. S. Malpass (Eds.), Psychology and culture (pp. 285–290). Poland, J., & Caplan, P. J. (2004). The deep structure of bias in
Boston: Allyn & Bacon. psychiatric diagnosis. In P. J. Caplan & L. Cosgrove (Eds.),
Marsella, A. J., Kinsie, D., & Gordon, P. (1973). Ethnic varia- Bias in psychiatric diagnosis (pp. 9–23). Lanham, MD: Jason
tions in the expression of depression. Journal of Cross-cultural Aronson.
Psychology, 4, 435–458. Pollack, D., & Shore, J. H. (1980). Validity of the MMPI with
Matsudaira, T. (2006). Measures of psychological acculturation: Native Americans. American Journal of Psychiatry, 137,
A review. Transcultural Psychiatry, 43, 462–483. 946–950.
Mesquita, B., & Frijda, N. H. (1992). Cultural variations in Presley, G., Smith, C., Hilsenroth, M., & Exner, J. (2001).
emotions: A review. Psychological Bulletin, 112, 179–204. Clinical utility of the Rorschach with African Americans.
Meyer, G. (2002). Exploring possible ethnic differences and bias Journal of Personality Assessment, 77, 491–507.
in the Rorschach Comprehensive System. Journal of Raskin, A., Crook, T. H., & Herman, K. D. (1975). Psychiatric
Personality Assessment, 78, 104–129. history and symptom differences in Black and White
Millon, T., & Meagher, S. E. (2004). The Millon depressed inpatients. Journal of Consulting and Clinical
Clinical Multiaxial Inventory-III (MCMI-III). In Psychology, 43, 73–80.
Wade, T. C., & Baker, T. B. (1977). Opinions and use of Whatley, P., Allen, J., & Dana, R. (2003). Racial identity and the
psychological tests: A survey of clinical psychologists. MMPI in African American male college students. Cultural
American Psychologist, 32, 874–882. Diversity and Ethnic Minority Psychology, 9, 345–353.
Walters, G. D., Greene, R. L., Jeffrey, T. B., Kruzich, D. J., & Whitworth, R. H., & McBlaine, D. D. (1993). Comparison of
Haskin, J. J. (1983). Racial variations on the MacAndrew the MMPI and MMPI-2 administered to Anglo and
Alcoholism Scale of the MMPI. Journal of Consulting and Hispanic university students. Journal of Personality
Clinical Psychology, 51, 947–948. Assessment, 61, 19–27.
Watkins, C. E., Campbell, V. L., Nieberding, R., & Hallmark, R. Whitworth, R. H., & Unterbrink, C. (1994). Comparison of
(1995). Contemporary practice of psychological assessment MMPI-2 clinical and content scales administered to
by clinical psychologists. Professional Psychology, 26, 54–60. Hispanic and Anglo-Americans. Hispanic Journal of
Weaver, C., & Sklar, D. (1980). Diagnostic dilemmas and cultural Behavioral Sciences, 16, 255–264.
diversity in emergency rooms. Western Journal of Medicine, Williams, D. R., Lavizzo-Mourey, R., & Warren, R. (1994). The
133, 356–366. concept of race and health status in America. Public Health
Westermeyer, J. (1987). Cultural factors in clinical assessment. Reports, 109, 26–41.
Journal of Consulting and Clinical Psychology, 55, 471–478. Williams, R. L., & Johnson, R. C. (1981). Progress in developing
Whaley, A. L. (1997). Ethnicity/race, paranoia, and psychiatric Afrocentric measuring instruments. Journal of Non-White
diagnoses: Clinician bias versus sociocultural differences. Concerns, 9, 3–18.
Journal of Psychopathology and Behavioral Assessment, 19, Wood, J. M., Nezworski, M. T., Lilienfeld, S. O., & Garb, H. N.
1–20. (2003). What’s wrong with the Rorschach? San Francisco:
Whaley, A. (2001). Cultural mistrust: An important psycholo- Jossey Bass.
gical constructs for diagnosis and treatment of African Worrell, F., & Cross, W. (2004). The reliability and validity of Big
Americans. Professional Psychology: Research & Practice, Five Inventory scores with African American college students.
32(6), 555–562. Journal of Multicultural Counseling and Development, 32, 18–32.
Whaley, A. L (2004). A two-stage method for the study of cultural Zhang, A., & Snowden, L. (1999, May). Ethnic characteristics of
bias in the diagnosis of schizophrenia in African Americans. mental disorders in five U.S. communities. Cultural Diversity
Journal of Black Psychology, 30, 167–186. and Ethnic Minority Psychology, 5, 134–146.
Abstract
This chapter aims to provide a framework for integrating gender-relevant issues with practical procedures
for the psychological assessment of women in clinical practice. It reviews some historical and current
approaches to gender in the assessment process, and provides suggestions for information-collection
strategies that acknowledge the influence of gender-related variables on women’s psychological health and
well-being. Discussed are some of the concerns and research associated with gender bias in assessment,
evidence for the influence of bias in the diagnostic process, and the important role of gender in selected
clinically relevant topics. Finally, topics related to women’s positive psychological health and well-being are
proposed as essential elements of a competent assessment. Many of the topics discussed here are also
relevant to assessment with preadolescent and adolescent girls. However, their concerns may differ
developmentally and in substantial ways from those of adult women, and thus are not addressed here.
relevant to assessment and intervention with girls and women. The guidelines were developed to
coordinate with those for several other groups whose members include women, including lesbian, gay,
and bisexual clients (American Psychological Association, 2000), clients with multicultural
identities (American Psychological Association, 2003), and older adult clients (American
Psychological Association, 2004). Clinicians who work with women clients from many diverse
identities will find a rich source of research and practical applications within the guidelines
for practice.

Purposes of Assessment
Clinical assessment typically serves the purpose of gathering multiple sources of information to
assist in clinical conceptualization of presenting psychological problems and concerns. This
conceptualization often drives decisions about types and directions of intervention made by the
client and the clinician. Other types of clinical assessment may serve more circumscribed
purposes, often with no expectation of further therapeutic intervention. Such examples include
career and management decisions, Social Security evaluations, forensic evaluations, child custody
issues, competence to stand trial and “mental status” at the time of offense, and evaluations
performed for the purpose of providing expert testimony. This chapter focuses primarily on
assessment for the purposes of intervention and treatment, but many of the concepts presented are
of importance in the consideration of gender variables in forensic and other types of evaluations.

A Woman-Centered Approach
Assessment does not occur in a vacuum. There is clearly no one set of questions for women or one
model that is the “right” or “most complete” approach. The elements and focus of clinical
assessment here are driven by at least four factors: (1) cultural knowledge, assumptions, and
attitudes; (2) preferred theoretical or therapeutic orientations (Butcher, 1987; Lopez, 1989;
MacDonald, 1984; McReynolds, 1989); (3) restrictions imposed on clinicians’ decision making by the
insurance industry; and (4) the knowledge base of the clinician regarding issues specific to women
and the significant contexts of their lives (Caplan & Cosgrove, 2004). Our purpose here is to
reinforce and expand this knowledge base.

The suggestions we offer specifically for assessment with women derive from both theory and
research on women’s psychological health and treatment. However, it is essential to be aware that
our women clients are in many ways experts on their own lives and experiences. Our willingness to
open doors to difficult and emotion-laden issues (e.g., physical or sexual abuse and violence)
sends a message that such issues are acceptable and potentially important assessment and
therapeutic areas. Indeed, for some women, a clinical psychological assessment may include
questions that they have never before been asked. For individuals who identify with marginalized
ethnic or cultural groups, revealing such private information may be personally difficult or
culturally proscribed (Greene, 2007; Nettles, 2007). Women who identify with a community that has
been subjected to oppression and discrimination may believe that protection of their
ethnic/cultural group is paramount; for others, shame and humiliation may prevent self-disclosure
of sensitive material (Fisher, Daigle, Cullen, & Turner, 2003; Robinson & Howard-Hamilton, 2000;
West, 2002).

We propose that clinical assessment with women be done within a culturally sensitive and
woman-centered context. This approach considers the environmental factors that may impact women’s
psychological health concerns, including disadvantages and differential life experiences that
create subtle and not so subtle gender, culture, and class inequities (American Psychological
Association, 2007; Sue et al., 2007). Gender stratification, expressed as power imbalance and
power assertion, can be seen as a direct contributing factor in some presenting problems (e.g.,
sexual harassment, rape, domestic violence), in women’s socially gendered self-identity and
self-worth (in many cultures women are considered as subordinate and “less than”), and in women’s
beliefs about their ability or feasibility to change their behaviors, feelings, and lives.

Multiple Identities
In the context of gender, we should be aware that assessment issues may also be related to other
group identity characteristics for each woman, including but not limited to ethnoracial identity,
culture, social class, (dis)ability, religious orientation, and sexual orientation. Simple
comparisons between women and men in most of the extant research fail to capture the complexities
that should be considered when clients other than White middle-class heterosexual women are being
evaluated. Clients may be struggling with several intersecting identity concerns that become
salient or brought into awareness depending on the situation (Marsella & Yamada, 2000; Stewart &
McDermott, 2004; Williams, McCandless, & Dunlap, 2002).
1. Violence, abuse, and trauma: Physical, sexual, and psychological abusive or traumatic experiences, either current or
by history. Additionally, was there violence or abuse present in her family: for example, did she witness her mother
being battered by her father, or other family violence? What messages did she learn about power differentials and
violence in her family that affect her today? If she has had abusive or traumatic experiences, was she able to tell anyone,
and what was the response? Has she experienced incidents of sexual harassment? As a woman of color and/or of diverse
sexual orientations, how have experiences of racism, discrimination, violence, or exclusion affected her? How did her
experiences as a veteran of distant or recent conflicts affect her?
2. Caretaking responsibilities: Many women find themselves caring for children, husbands, partners, relatives, and
parents. For the single mothers, pressures resulting from financial insufficiency, isolation, and role overload may
present additional stressors. The resulting exhaustion, frustration, and lack of caretaking of self may contribute to
feelings of depression, anger, or anxiety.
3. Health—Medical conditions, medication prescribed, and medical concerns: Openly questioning about any
reproductive or sexual problems is recommended. A history or current symptoms of eating-disordered behaviors
or weight-related concerns can be explored. Screening for abuse of substances, whether legal, illicit, or prescription drugs,
should be a standard part of the assessment. Addressing physical disabilities or limitations on daily functioning and role
responsibilities is also suggested.
4. Gender-role messages: Querying for underlying beliefs about traditional women’s “roles” within her culture, and her
comfort with her real-life roles (e.g., boss, employee, wife, partner, single woman, mother, etc.), may help in
understanding self-concept and self-worth issues. Understanding how these beliefs affect her day-to-day living in both
negative and positive ways can enrich understanding of presenting symptomatology.
5. Relationship beliefs: The quality of her interpersonal relationships, including romantic, familial, and friendships,
can shed light on the extent and quality of her social support network. Her evaluation of herself based on the
presence or lack of a romantic relationship, and the quality of such prior relationships, is an area frequently
impacted by stereotypical gender role and culturally driven messages. Finally, are her relationships the source of
discrimination or additional stress (e.g., being a lesbian, bisexual, etc., especially if a member of a minority
community)?
6. Previous therapy experiences: Obtaining an overview of prior therapy, including her judgment of its quality and
helpfulness, and the type of therapy she received, can be useful information. Along these lines, her readiness to
undertake therapy may be a factor, for example, in an involuntarily committed individual who feels she does not
need hospitalization or treatment. In cases where she has previously explored abuse issues, an understanding of what
type of therapy was used, particularly if high-risk interventions were used (e.g., hypnosis or detailed re-creations of
traumatic events), is critical. Finally, the possibility of sexual touching or sexual relationships with former therapists
should be noted.
7. Communication patterns: Does she have verbal access to her emotional life? How does she typically deal with
strong feelings? Does she display explosive or inappropriate anger? Is she able to be assertive, or to express anger?
Does she engage in unhealthy behaviors to deal with her feelings (e.g., overeating, substance use, self-injurious
behaviors)? Finally, does she believe her communication styles are effective, or does she consider learning new
methods as potentially helpful for her?
8. Attributional style: Does she attribute problems only to herself (self-blame), or is she able to see any external factors?
Does she tend to expect negative outcomes, or engage in depressogenic thinking?
9. Ability to self-nurture: Does she actively take care of her needs, and value those needs, or does she tend to engage
in caretaking of others with little or no attention to herself? Does she perceive that she is able to care for herself (e.g.,
an older widowed woman who has fears of living alone with financial concerns)? Are these perceived ideas
pragmatically driven (e.g., does she lack adequate financial resources to care for herself)?
10. Career/employment concerns: Is her career a meaningful part of her life, or does she see herself working
only to provide income? Does she feel isolated in her work, or does she have a support network that nurtures
her in her profession? Does she face gender or ethnoracial discrimination or sexual harassment in her workplace? Are
there stressors resulting from home–career conflicts?
11. Resource assessment: Does she have adequate income to sufficiently manage her life? Does she receive child
support, or has this been difficult to obtain? Does she have adequate insurance or financial resources to pay for
medical care and medications? Has she ever been, or is she in danger of being, homeless? Does she have adequate
friends and family, or community groups to whom she can turn for emotional support?
12. Expectations of therapy and perception of her needs: If the evaluation is performed in preparation for therapy,
her expectations for therapy should be explored. In all cases, her perception of therapeutic needs and goals
should be explored. Such questions validate her importance as a collaborator in therapy, and may be for many
women the first time they have been so immediately included in the process of treatment planning. Are there
obstacles to obtaining therapy (e.g., financial, feeling stigmatized, unsupported by friends or family)?
and physical assaults as adults, and more cumulative PTSD symptomatology (Nishith, Mechanic, &
Resick, 2000). Adding to these accumulated data is evidence that survivors of sexual assault may
not report their experiences at all due to fear of negative outcomes, such as reprisal or
disbelief and blame by others (Ahrens, Campbell, Ternier-Thames, Wasco, & Sefl, 2007).

Adding the variable of vulnerability increased the likelihood of abuse. Among women who were
frequently homeless and severely mentally distressed (predominately diagnosed as having
schizophrenia, schizoaffective, bipolar, or major depressive disorder), the lifetime prevalence of
physical and sexual assault was astronomical (Goodman, Dutton, & Harris, 1995). Of the 100
predominately single African-American women surveyed, 85% reported childhood physical abuse and
65% reported sexual abuse in childhood. The majority of these attacks were considered severe in
nature. In adulthood, 87% of these women had experienced a severe physical assault, predominately
by someone they knew well. Among these women, 76% had experienced sexual assault in adulthood. In
sum, 97% of this sample of women had experienced physical or sexual abuse either during childhood
or adulthood. The authors concluded that among this sample of women, the risk for “violent
victimization is so high that rape and physical battery are normative experiences” (p. 474).
Similarly, Buckner, Bassuk, and Zima (1993) concluded that a history of traumatic events across
their lifetime is markedly common for homeless women, and many of their presenting psychological
symptoms may well be representative of survival skills rather than of clinical pathology.

REACTIONS OF OTHERS
Finally, the reactions of others to reports of violence among psychologically disturbed
individuals may vary widely. Marley and Bulia (1999) questioned 234 adults who had previously
received diagnoses of schizophrenia, bipolar, or schizoaffective disorder, about their experiences
in reporting interpersonal violence to police. The women in this study reported rape or other
adult sexual abuse by a known perpetrator or unknown perpetrators, as well as childhood sexual
abuse. Abuse survivor variables leading to more police responses included higher levels of
education, fewer lifetime victimization incidents, being married, having a higher income, having a
home, and having no history of substance abuse. Subjectively, some respondents felt that they
would not be believed because they had been diagnosed with a mental illness. In sum, hopelessness
and psychological disturbance are variables that increase women’s vulnerability to interpersonal
violence, as well as level of belief by others to whom they may report the violence. Reliable data
on incidence rates are thus conflated with reporting variables.

Barriers to Screening
Two major variables will affect the probability and effectiveness of screening for sexual and
physical abuse: those associated with the therapist and those connected to the client. For many
mental health professionals, asking a client about a history of sexual or physical abuse is
experienced as being uncomfortable or intrusive. Research indicates that the rates of querying for
this information are quite low. For example, among a sample of psychologists, only 17% routinely
inquired as to a history of lifetime sexual abuse (Pruitt & Kappius, 1992). In some instances,
questions regarding lifetime abuse experiences are simply not asked (Craine, Hensen, Colliver, &
MacLean, 1988; Pruitt & Kappius, 1992). Intuitively, a direct question about lifetime abuse must
elicit more acknowledgment of such a
Carlton S. Gass
Abstract
The measurement of personality and psychopathology in neuropsychological contexts is essential because
brain injury affects psychological status and psychological problems can mimic brain dysfunction. The
Minnesota Multiphasic Personality Inventory-2 (MMPI-2) is, by far, the most widely researched and
clinically utilized test to meet this need. This chapter reviews special administrative and interpretive
considerations that apply in neuropsychological settings. Recent and controversial developments in
MMPI-2 research and application are also discussed.
Keywords: Alzheimer’s disease, brain injury, cognitive test performance, correction factor, FBS,
feedback of results, malingering, MMPI-2, neurological symptom reporting, neuropsychological
deficits, post-concussive syndrome, Clinical Scales, seizure disorder, somatoform disorders
Brain injury commonly causes psychological symptoms, and psychological symptoms often masquerade
as manifestations of neurological disease. In either scenario, clarification of psychological
status is essential to proper diagnosis and treatment. Neuropsychologists are intent on using
reliable and valid performance measures to assess neurobehavioral abilities. Unfortunately, a
double standard is applied in selecting methods for assessing psychological status. A clinical
interview is commonly the sole method used to evaluate emotional status. In some cases, clinicians
combine the clinical interview with very limited data based on brief tests that narrowly address
issues such as depression or anxiety.

The MMPI-2/MMPI (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989) has a long history of
application within the field of neuropsychology. In fact, the MMPI was incorporated into a
standardized and subsequently validated neuropsychological test battery by Ralph Reitan over 50
years ago (Reitan, 1955). The amount of past and present research efforts involving the MMPI-2 in
the neuropsychological context is substantial and far exceeds that involving any other
multidimensional measure of personality or psychopathology. This extensive research base is
certainly one of the reasons for widespread use of the MMPI-2 by neuropsychologists (Butcher &
Ben-Porath, 2004; Sharland & Gfeller, 2007). This chapter is a review of some important aspects of
using the MMPI-2 specifically in clinical neuropsychological practice though some of this material
lends itself to a broader application.

Special Administrative Issues in Neuropsychological Settings
Clarifying the Purpose of the MMPI-2
Many medical patients who are referred for neuropsychological assessment do not initially
understand why they have been asked to take a test of psychological status. In addition, the
content of MMPI-2 items includes occasional references to highly personal matters and other issues
that might appear to be irrelevant to the reason for referral. This will sometimes elicit an
uncooperative response set that yields inaccurate information. For example, some of these
individuals respond to the test items defensively, feeling compelled to assert normalcy and deny
any problems so as not to be mistaken as “crazy.” Others may be offended and refuse to cooperate
with the remainder of the assessment process. The clinician can prevent these unfortunate outcomes
by preparing the patient prior to administering the MMPI-2. This is part of the ethical
requirement for clinicians to obtain informed consent. Patients should never be administered the
MMPI-2 or other instruments without an adequate explanation of its function and the rationale for
giving it. The MMPI-2 can be introduced as a routine part of the examination process designed to
measure health-related feelings and attitudes. Although MMPI-2 administration is presented as a
routine procedure, the clinician should be sensitive to the patient’s needs and encourage
questions or expressions of concern.

Administrative Format
The MMPI-2 can be used with most individuals who are at least 18 years old and have at least a
sixth grade level of reading ability. The popularity of the MMPI-2 among neuropsychologists
suggests that, in most clinical settings, administration is not a significant problem. However,
patients who are substantially handicapped by cognitive impairment or behavioral disturbances are
often unable to manage the requirements of the test, and are therefore unable to provide valid
information. A wide variety of conditions can preclude a valid administration of the MMPI-2. These
include impatience and low frustration tolerance, visual disturbances, confusion, dyslexia,
impaired reading comprehension, inattention and distractibility, and florid psychotic symptoms.
Burke, Smith, and Imhoff (1989), using a sample of 66 post-acute traumatic brain injury (TBI)
patients, reported a 20% incidence of inconsistent responding to the MMPI. On the other hand,
there is evidence suggesting that patients who have moderate neuropsychological impairment
secondary to brain injury are typically able to produce profiles that are valid in regard to
content-response consistency (Mittenberg, Tremont, & Rayls, 1996; Paniak & Miller, 1993).

Completion of the MMPI-2 sometimes requires breaking the session into several shorter periods. In
testing the more severely impaired patients who are unable to manage its cognitive or sensorimotor
demands, clinicians sometimes attempt to read the test items aloud to the patient and assume the
task of filling out the answer sheet for the patient. The impact of the examiner’s involvement in
this situation is unknown. If this nonstandardized procedure is used (presumably as a last
resort), caution must be exercised in interpreting the results. In many instances, a better
alternative to this administrative procedure is to use an audiotape or a compact disk (CD) version
released by Pearson Assessments. The CD first presents the general test instructions,
followed by two readings of each item to ensure that examinees understand and have time to mark their response. If the standard MMPI-2 answer sheet proves too challenging for the examinee because of impaired visual and visuomotor function, a customized answer sheet with larger font can be made, including the words “True” and “False” after each item number. The patient can then be instructed to circle the appropriate answer.
The selection of the written or audio administrative format is based on a consideration of time and examinee competency. Higher functioning individuals who can read without difficulty are best suited for the written format. Most examinees who can read the MMPI-2 booklet complete the inventory in about 90 min. Some take considerably longer and some cannot complete it. Those who have reading difficulties and require the audio format will need 135 min to complete the full MMPI-2. If an individual is ruminative and indecisive, the audio version might be preferable because of its constant pace, forcing a decision within a reasonable amount of time. The important point is that MMPI-2 administration is an intervention that requires clinician sensitivity and attentiveness. The audio and written formats yield equally valid results (Herrman, Dorfman, Roth, & Burns, 1997).
Monitoring the Examinee
Brain-impaired examinees should be monitored closely but unobtrusively to ensure that the instructions are properly followed as MMPI-2 administration begins. This is particularly critical in the case of those individuals who are more cognitively impaired, are more easily confused, and can lose track of the questions and their markings on the answer sheet. Most examinees who successfully complete the first several items are able to finish the remainder of the MMPI-2, though an additional session is sometimes required due to fatigue or time restrictions. In some instances, the examinee will have increasing difficulties as the testing continues, which may become apparent in several ways. First, the patient may inform the examiner directly by saying so. Unfortunately, neurologically impaired examinees sometimes “complete” the test, answering items without adequately understanding item content. The clinician can sometimes observe the quietly confused patient making unusual marks with the pen, writing in the wrong location on the answer sheet, skipping items, or not answering items at all. In any case, it is important for the clinician to monitor the patient throughout the course of the administration, so that an appropriate intervention can be made if a problem arises.
The MMPI-2 should be completed by the examinee in a supervised setting where privacy can be assured and assistance is available if needed. There are two mistakes that clinicians sometimes make in MMPI-2 administration. The first is allowing the examinee to complete the MMPI-2 unsupervised or in the presence of friends or relatives. Some examinees are naturally inclined to solicit help in answering the test items. Caregivers of neurologically impaired individuals often insist on answering items on behalf of the patient. Even the presence of other examinees is a potential problem, as some examinees are predisposed to make the MMPI-2 a group experience, opting for a democratic process in answering test items. On one occasion, I entered a testing room where four or five examinees were voting on which answer to give for each MMPI item. Profile invalidity is the consequence. A second mistake is to allow patients to take the MMPI-2 home. In this scenario, one can never know who really completed the inventory, or what influences may have intervened (Pope, Butcher, & Seelen, 1993).
Use of a Short Form
In some cases, the MMPI-2 administration will have to be terminated because of various difficulties that emerge. If the first 370 MMPI-2 items are successfully completed by the examinee, this abbreviated form is perfectly adequate for scoring the L, F, K, and the basic Clinical Scales as well as the Harris–Lingoes subscales. A substantial loss of information is the unfortunate consequence. Short forms of the MMPI-2 have been proposed. Under certain circumstances an accurate short form might offer advantages that outweigh the loss of comprehensiveness (Dahlstrom & Archer, 2000). The proposed 180-item format has been shown to be very unreliable for predicting clinical code types, identifying the high-point scale, or predicting scores within 5–10 T-score points on most of the Clinical Scales (Gass & Luis, 2001a; Gass & Gonzalez, 2003). A standard interpretation of the 180-item short form is therefore likely to be inaccurate. However, in the event that a patient can complete only the first 180 items, the clinician can salvage some information that addresses the presence or absence of general problem areas without regard to their level of severity or specific symptom manifestations. If a prorated T-score on the 180-item short form basic clinical scale exceeds T65, it indicates that the full-form MMPI-2
Empirical investigations of the FBS provide mixed findings that are not surprising given the variety of sampling characteristics across studies. A substantial body of evidence suggests that the FBS is highly sensitive to complaints of somatic discomfort and distress (Greiffenstein, Baker, Axelrod, Peck, & Gervais, 2004; Greiffenstein et al., 2007; Greve et al., 2006; Ross, Millis, Krukowski, Putnam, & Adams, 2004), whereas a number of studies suggest important limitations that qualify its use. For example, the FBS was not very sensitive to atypical presentations by mild TBI litigants (57%) using the widely recommended cutoff of 23 (Greiffenstein, Baker, Gola, Donders, & Miller, 2002). FBS was not very sensitive in identifying chronic pain litigants (42%) in a study by Meyers, Millis, and Volkert (2002). In a study of probable malingered neurocognitive dysfunction, FBS (cutoff = 25) had a sensitivity of .52 (Greve et al., 2006).
Variable support for FBS sensitivity to malingering extends to the issue of its specificity, that is, its effectiveness in not misclassifying bona fide patients as malingerers. High scores on the FBS might lead clinicians to erroneously classify as malingerers an unacceptably high number of patients who have organic-based physical conditions or certain psychiatric disorders (Butcher, Arbisi, Atlis, & McNulty, 2003). Studies have demonstrated that FBS scores are raised by objectively verified sources of physical discomfort (Iverson, Henrichs, Barton, & Allen, 2002; Meyers et al., 2002). Premorbid psychiatric history is associated with higher scores on FBS (Martens, Donders, & Millis, 2001). In TBI cases, scores on FBS appear to be significantly influenced by bona fide physical injury including anosmia and motor impairment (Greiffenstein et al., 2002). Collectively, the data suggest that physical injury combined with psychological distress can produce high scores on FBS in the absence of malingering.
A meta-analytic study (Nelson, Sweet, & Demakis, 2006) is often cited as support for the FBS without due consideration to its methodological limitations. For example, of the studies described in this investigation, the most dramatic effect size for FBS was, by far, that (3.52) found in the data reported in Guez, Brannstrom, Nyberg, Toolanen, and Hildingsson (2005). (The grand effect size using all 19 studies in the meta-analysis was substantially less: .96.) A whiplash subsample was extracted from the Guez et al. study and classified as a malingering group on the grounds that it “may have had reason to exaggerate symptoms” (p. 43). In fact, Guez et al. were emphatic in stating that this sample was not malingering based on several lines of evidence including an absence of incentive (the majority of subjects were not involved in a compensation-seeking context), negative malingering test findings, and normal neuropsychological test results. Thus, although these data were cited as supportive of FBS, they actually raise concern regarding FBS validity and underscore its potential for misleading clinicians into erroneously suspecting malingering where none exists.
Larrabee (2007) reviewed numerous studies on psychological and physical symptom exaggeration pertinent to neuropsychological evaluation. Support for FBS as a malingering scale is largely based on studies that show its effectiveness in differentiating litigating from nonlitigating samples, as litigants typically score higher than nonlitigants on FBS. Compensation-seeking individuals in neuropsychological cases are more likely to have more symptomatic complaints and perform more poorly on neuropsychological tests. However, it should not be assumed that seeking compensation, in itself, causes a higher level of symptom reporting. The converse could be true: greater concern over acquired symptoms or deficits could precipitate the quest for compensation. The fact that litigants report more problems raises the question of the cause of their complaints. As Larrabee (2000) noted, additional evidence of malingering is required, such as clinically atypical presentations (Greiffenstein et al., 2002), in addition to a compensation-seeking context. Some studies that support FBS have used samples of subjects that had an improbable symptom presentation or produced evidence of incomplete effort. The limitations of FBS and its scope of application are described by Strauss, Sherman, and Spreen (2006).
Clinical Profile Interpretation
The long history of MMPI-2 use in neuropsychology has generated numerous areas of investigation designed specifically to enhance its application with individuals with known or suspected brain damage (see Table 23.1). The possibility that the MMPI could be used to diagnostically identify the presence of brain damage was one such area (Golden, Sweet, & Osmon, 1979). This pursuit failed. It was based on faulty presuppositions and flawed research methodologies. In neuropsychological or psychiatric settings, patient self-report of symptomatic complaints is an unreliable basis for identifying the presence of brain damage. Actual
Symptom                              Item   Scale(s)
Motor symptoms
  Paralysis or weakness (paresis)    295    Sc
  Hand tremor                        172    Hy
  Manual clumsiness                  177    Sc
  Movement difficulties              182    Sc, Ma
  Problems walking/balance           179    Hs, D, Hy
  Slurred speech (dysarthria)        106    Sc, Ma
Sensory symptoms
  Numbness                           247    Hs, Sc
  Paresthesia                         53    Hs
  Diminished vision                  349    Hs, Hy
Cognitive symptoms
  Poor concentration                 325    Pt, Sc
  Distractibility                     31    D, Hy, Pt, Sc
  Forgetfulness                      165    D, Pt, Sc
Miscellaneous physical symptoms
  Generalized weakness               175    Hs, Hy, Pt
  Rapid fatigue                      152    Hs, Hy
  Dizzy spells                       164    Hs, Hy
Note: Gass, 1991, 1992, 1996b; Gass & Wald, 1997.
case history is much more reliable. Symptom questionnaires are ineffective and generally lack specificity in identifying brain dysfunction. Although people who have brain damage report their neurological symptoms on the MMPI-2, the test has many more items that assess psychopathology and personality characteristics. Numerous studies using carefully constructed samples found that the MMPI could be used to successfully discriminate brain-damaged subjects from those with specified forms of psychopathology (especially schizophrenia). In fact, early optimism about the MMPI in identifying brain damage was unwarranted, because the discriminative power was based almost completely on MMPI sensitivity to psychopathology and not on its ability to identify brain damage. For example, in a research sample that combines two discretely defined groups of brain-damaged and schizophrenic individuals, it is not surprising that lower scorers on Scale 8 are more likely to be brain-injured subjects.
The objective of differentiating forensic TBI cases from normals was investigated by Senior, Lothrop, and Deacon (1999) using the 14-item correction for head injury (Gass, 1991b). The authors reported a sensitivity of .80 and specificity of .88, supporting the fact that the correction is effective in representing brain injury complaints. However, the correction was not designed to detect brain injury in clinical settings. The MMPI-2 also has no utility in localizing brain lesions, which is an asset of many other neurodiagnostic techniques including neuropsychological testing (Gass & Russell, 1986b). More salient in recent MMPI-2 research is the fact that a very small number of MMPI-2 items are highly sensitive to brain dysfunction and pose an interpretive challenge when brain-injured individuals are examined (Gass, 1999).
Clinical Scale Interpretation
Analysis of the scores on the basic Clinical Scales is essential to MMPI-2 interpretation, because these scales have the advantage of decades of extensive empirical research upon which interpretive inferences are based. However, detailed information regarding the properties of these (and other) scales in brain-injured samples is relatively limited. Gass (1997) examined the MMPI-2 scoring characteristics of samples of 54 closed-head injury and 40 stroke patients (all males) who were referred for neuropsychological evaluation at the Miami Veterans Affairs Healthcare System. These and other research data (Gass, 2000), which will be alluded to throughout this section, were based exclusively on valid protocols. In addition, none of these referrals had a premorbid history of psychiatric treatment or addictive disorder. All had documented brain injuries, and none were seeking compensation benefits.
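The sensitivity and specificity figures cited in this section (for example, the .80 and .88 reported by Senior et al., 1999) describe a scale's hit rates within known groups. They do not by themselves indicate how often a positive classification is correct in a given referral setting. The following is a minimal Python sketch of that arithmetic. The function names and the two base rates are hypothetical and chosen only for illustration; they are not taken from the studies cited.

```python
def classification_rates(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


def positive_predictive_value(sensitivity, specificity, base_rate):
    """Probability that a positive classification is correct (Bayes' rule)."""
    true_positive_share = sensitivity * base_rate
    false_positive_share = (1 - specificity) * (1 - base_rate)
    return true_positive_share / (true_positive_share + false_positive_share)


# Sensitivity and specificity as reported in the text (Senior et al., 1999).
SENSITIVITY, SPECIFICITY = 0.80, 0.88

# Hypothetical base rates of the target condition, for illustration only.
for base_rate in (0.50, 0.10):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, base_rate)
    print(f"base rate {base_rate:.2f}: positive predictive value {ppv:.2f}")
```

Under these assumed base rates, the same sensitivity and specificity yield quite different proportions of correct positive classifications. This is one reason that accuracy figures obtained in one sample should be generalized to other settings with caution.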
SCALE 1. HYPOCHONDRIASIS (HS)
This 32-item scale was originally referred to as the hypochondriasis scale because item selection was based on the responses of a group of neurotic patients who had multiple physical complaints and health-related preoccupations in the absence of any discernible medical condition or abnormal physical findings. In the neuropsychological context, elevations on Scale 1 are quite common. Brain-injured individuals commonly score moderately high on Scale 1 (65T to 75T) because of their frank endorsement of items that are highly sensitive to brain dysfunction. These include references to diminished general health status (45), paresthesias (53), tiredness and fatigue (152), weakness (175), pain (224), periodic dizzy spells (164), difficulty walking (179), and numbness (247). Although there are exceptions, empirical data suggest that scored responses to neurologically related items such as these account for an average increase of 10–15 T-score points on Scale 1, with a range of 0–30 points (Gass, 2000).
SCALE 2. DEPRESSION (D)
The 57-item depression scale assesses common symptoms of depression, including subjective dysphoria, sadness, low self-esteem, psychomotor retardation, a sense of hopelessness about the future, and vegetative features such as poor appetite, fatigue, and insomnia. High scorers (T > 70) are also socially withdrawn and often isolate themselves. They feel very inadequate, lack self-confidence, and are prone to feeling guilty. They experience cognitive inefficiency in daily living, such as poor concentration and forgetfulness, though their actual capability in these neurobehavioral domains may be intact (Gass & Russell, 1986a; Gass, Russell, & Hamilton, 1990). They are often indecisive and plagued by doubts about themselves. In inspecting the scores on these subscales, the clinician will frequently note that individuals who have neurological impairment score relatively high on mental dullness (D4) and physical malfunctioning (D3) as an expression of their cognitive and somatic difficulties (Gass & Lawhorn, 1991; Gass & Russell, 1991; Gass et al., 1990). High scores are also common on subjective depression (D1), which is consistent with the fact that symptomatic depression is probably the most common psychological sequela of brain disease or injury.
Brain-injured individuals frequently produce marginal elevations (60T–70T) on Scale 2. Scores in this range are difficult to interpret because of the scale’s heterogeneous item content. Therefore, not all Scale 2 correlates necessarily apply. For this reason, an analysis of the scoring pattern on the Harris–Lingoes subscales is especially helpful in clarifying the meaning of marginally elevated scores on Scale 2. Individuals who have CNS impairment and report their neurological symptoms on the MMPI-2 typically increase their scores on Scale 2 regardless of whether they are depressed. This occurs because Scale 2 contains a subset of items that are related to bona fide physical and cognitive manifestations of brain damage. Included are items that refer specifically to problems with distractibility (31), convulsions (142), diminished reading comprehension (147), memory difficulty (165), generalized weakness (175), and walking or balance (179). Acknowledgment of these symptoms increases the T-score on Scale 2 by an average of 5–10 points, with a range of 0–12 points (Gass, 2000). When protocols were not corrected for neurological item endorsement, half of the closed-head injury and 43% of the stroke patients comprising the Miami veterans referral sample scored above 65T on Scale 2. Scores on Scale 2 that exceed 75T in brain-injured individuals almost invariably reflect depressive symptoms in addition to any neurologic item-related artifact that might exist. In examining the protocols of brain-injured patients, more reliable information pertaining to depression can be obtained using the more homogeneous content scale DEP (depression).
Although depression is widely believed to affect cognitive test performance adversely, scores on Scale 2 are not usually predictive of neuropsychological test performance. Studies have suggested that Scale 2 scores in neuropsychological referrals are independent of the level of performance on measures of attention and memory (Gass, 1996a; Gass & Russell, 1986a; Gass et al., 1990), fluency or mazes (Gass et al., 1994), or alternating attention on the Trail-Making Test, Part B (Gass & Daniel, 1990).
SCALE 3. HYSTERIA (HY)
The hysteria scale comprises 60 items that were originally selected on the basis of their association with psychologically based sensory or motor abnormalities (conversion disorder). Accordingly, when a score on Scale 3 is high (T > 70) and has a prominently elevated position in the profile, individuals show a tendency to report a variety of somatic complaints and develop physical symptoms in response to psychological conflict or stressful circumstances. Behind these symptoms, in many cases, is secondary gain in the form of obtaining affectionate
SCALE 6. PARANOIA (PA)
The criterion sample used by Hathaway to select the 40 items on the paranoia scale consisted of individuals who had frank paranoid features or a diagnosed paranoid disorder. Scale 6 contains few, if any, items that refer directly to physical, cognitive, or health-related symptoms of brain dysfunction. Studies suggest that most neurologically impaired individuals score within the average range on Scale 6. However, traumatic brain-injured individuals endorse Scale 6 items with some frequency, particularly if they are examined very soon after injury. Their scores are sometimes sufficiently elevated (T > 65) to suggest suspiciousness, distrust, and a sense of having received a “raw deal” from life. Posttraumatic paranoia is not uncommon, particularly in acute head injury, though it rarely persists over a period of months (Grant & Alves, 1987). In the Miami veterans sample, 35% of the closed-head injury sample scored above 65T, whereas only 15% of the stroke patients scored this high (Gass, 1997). Personal injury claimants are more likely to report symptoms that increase scores on Scale 6. In the forensic neuropsychological context, mild head trauma litigants with late postconcussive syndrome (LPCS) commonly produce Scale 6 scores between 65T and 75T, with a mean T-score that is about 10 points higher than that found in the Miami veterans brain-damaged sample (Youngjohn et al., 1997).
SCALE 7. PSYCHASTHENIA (PT)
The 48 items that comprise this scale were intended to measure psychasthenia, which is roughly synonymous with obsessive compulsive disorder. High scorers (T > 65) on Scale 7 are very distressed, anxious, and worried, often about an overabundance of seemingly minor issues. A substantial number of neurologically impaired individuals score above 65T on Scale 7. In the Miami veterans study, 50% of the closed-head injury and 33% of the stroke patients produced elevated (noncorrected) scores on this scale. Although many brain-injured individuals become highly distressed and worried about their health condition, the frequency of high scores in this population is partially due to the endorsement of items that fall within the larger cluster of empirically identified neurologically related items. These items include references to distractibility (31), reading problems (147), memory difficulty (165), generalized weakness (175), forgetfulness (308), and concentration difficulty (325). Acknowledgment of these symptoms increases the T-score on Scale 7 by an average of 5 points, although the effect can be as large as 12 points in an individual case (Gass, 1991b). Thus, slightly elevated scores on Scale 7 do not necessarily reflect anxiety and distress in individuals who have bona fide brain injury. Scores that exceed 70T, however, almost always indicate the presence of these symptoms.
Complaints of poor concentration, inattention, and forgetfulness are perhaps the most common reason for referral for neuropsychological evaluation. Scale 7 is the best single predictor of cognitive complaint severity in a comprehensive neuropsychological battery (Gass & Apple, 1997; Gass, Ardern, Howell, Dowd, Levy, & McKenzie, 2005; Gass & Freshwater, 1999). Cognitive complaints are a reflection of anxiety and, in some cases, are also indicative of actual deficits (Gass & Apple, 1997).
No subscales exist for Scale 7, though the results of a factor analytic study suggest that the scale includes components of maladjustment, neuroticism, anxiety, psychotic tendencies, withdrawal, denial of antisocial behavior, poor concentration, agitation, and poor physical health (Comrey, 1958). To assist in making inferences about anxiety, fearfulness, or obsessional thinking, scores on Scale 7 can be compared with scores on several content scales that measure anxiety-related symptoms: generalized anxiety (ANX); obsessional thinking (OBS); and specific phobias, apprehensions, and a general tendency to be fearful (FRS).
SCALE 8. SCHIZOPHRENIA (SC)
The schizophrenia scale consists of 78 items that were pooled by Hathaway and McKinley from several groups of items that were originally intended (but failed) to be specific to four subtypes of schizophrenia (paranoid, simple, hebephrenic, and catatonic). The resulting scale was both lengthy and heterogeneous in content. The heterogeneity of Scale 8 partially accounts for the fact that scores on it can be increased by factors that are largely unrelated to schizophrenia. For example, Graham (1993) reported that slightly higher scores frequently occur in adolescents, college students, African-American males, amphetamine users, and in people who have certain medical conditions, such as epilepsy. Butcher and Williams (1992) further observed that scores on Scale 8 are sometimes increased because of adherence to nontraditional or countercultural values, as well as in individuals who have severe sensory impairment or organic brain dysfunction. Additional sources of information, such as interview and observational data, relevant history, contextual considerations, and scores on Scales F and F(p), can be especially useful in interpreting high scores on Scale 8.
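The scale descriptions above repeatedly refer to neurologically related items that add to Clinical Scale scores when they are endorsed. The bookkeeping behind that idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the item-to-scale assignments are transcribed from the symptom table earlier in this section and are not the published correction key; each item answered in the keyed direction is assumed to contribute one raw-score point to every scale listed for it; the function name and the raw scores in the example are hypothetical; and conversion of raw scores to T-scores requires the published norms, which are not reproduced here.

```python
# Item number -> scales on which the item is scored, as listed in the
# symptom table in this section (an excerpt, not a complete scoring key).
NEURO_ITEM_SCALES = {
    295: ("Sc",), 172: ("Hy",), 177: ("Sc",), 182: ("Sc", "Ma"),
    179: ("Hs", "D", "Hy"), 106: ("Sc", "Ma"), 247: ("Hs", "Sc"),
    53: ("Hs",), 349: ("Hs", "Hy"), 325: ("Pt", "Sc"),
    31: ("D", "Hy", "Pt", "Sc"), 165: ("D", "Pt", "Sc"),
    175: ("Hs", "Hy", "Pt"), 152: ("Hs", "Hy"), 164: ("Hs", "Hy"),
}


def corrected_raw_scores(raw_scores, keyed_items):
    """Remove one raw point per endorsed neurologically related item.

    raw_scores:  dict mapping scale abbreviation to standard raw score.
    keyed_items: set of item numbers answered in the keyed direction.
    Assumes each keyed item adds exactly one raw point per listed scale.
    """
    corrected = dict(raw_scores)
    for item in keyed_items:
        for scale in NEURO_ITEM_SCALES.get(item, ()):
            if scale in corrected:
                corrected[scale] -= 1
    return corrected


# Hypothetical protocol: raw scores and endorsed items are invented.
standard = {"Hs": 14, "D": 26, "Hy": 28, "Pt": 20, "Sc": 22}
endorsed = {31, 165, 175, 179, 247}
corrected = corrected_raw_scores(standard, endorsed)
for scale in standard:
    print(scale, standard[scale], "->", corrected[scale])
```

The standard and corrected raw scores would each then be converted to T-scores with the appropriate norms, so that the two profiles can be compared side by side.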
The RC Scales and the Restructured MMPI-2
The MMPI-2 Restructured Form (MMPI-2-RF, Ben-Porath & Tellegen, 2007) is a redesigned version of the MMPI-2 that has as its core the Restructured Clinical (RC) Scales (Tellegen et al., 2003). The RC Scales were developed to enhance the distinctiveness of the original Clinical Scales by removing a common “demoralization” factor that exists across the scales. The result is that they achieve greater homogeneity and discriminative validity in predicting specific behavioral criteria using fewer items. In neuropsychological settings, primary elevations are most likely to occur on RC1 because of its physical symptom and health-focused item composition. In contrast, RC3 is unlikely to be elevated as it no longer contains somatic item content and is quite different from its Hy counterpart (Butcher, Hamilton, Rouse, & Cumella, 2006).
The RC Scales are purer measures of a core construct underlying each of the original eight Clinical Scales (excluding Scales 5 and 0), but the core construct is narrower than that measured by the original scales. In evaluating the MMPI-2-RF, the critical issue is whether this 338-item inventory has sufficient advantages to offset its high cost in information loss based on decades of research on the behavioral correlates of the MMPI/MMPI-2 Clinical Scales, code-type patterns, and other aspects of score configuration. It is alleged that the RC Scales sacrifice a wealth of specific and dynamic interpretative predictions that are provided by their original Clinical Scale counterparts, particularly when well-established code-type information is utilized (Caldwell, 2006; Nichols, 2006). The radical alteration of Scale 3 (RC3), which is aptly described as a deconstruction by Butcher et al. (2006), is particularly disconcerting from a neuropsychological standpoint, given the original scale’s sensitivity to conversion symptomatology. One example of this is psychogenic seizures, which account for almost one-third of seizure presentations (Ansley et al., 1995).
The RC Scales might have additional shortcomings. Nichols (2006) observed that the core constructs measured by the RC Scales, with the exception of RC9, show extremely high redundancy with existing MMPI-2 content scales (Butcher, Graham, Williams, & Ben-Porath, 1990) (rs between .80 and .95). More importantly, the RC Scales are more vulnerable to a defensive response style than are the Clinical Scales (Sellbom, Ben-Porath, McNulty, Arbisi, & Graham, 2006). Elsewhere, these authors attempted to dismiss this apparent weakness by attributing it to subtle item content on the standard Clinical Scales (Tellegen, Ben-Porath, Sellbom, Arbisi, McNulty, & Graham, 2006). However, if the subtle items constitute random noise, as the authors maintain, one would not expect their net effect to systematically increase (or decrease) standardized T-scores on the Clinical Scales. Anecdotal evidence that the RC Scales underestimate level of psychopathology received support in a recent study of substance abusers (Forbey & Ben-Porath, 2007). RC scores were consistently lower than their Clinical Scale counterparts.
The elimination of over 200 MMPI-2 items that are “working items” has additional implications for information loss and its potentially adverse impact on clinical use of the MMPI-2. Current interpretive practice uses the RC Scales to refine interpretation of the basic Clinical Scales by removing the common factor of “demoralization” that exists in the scales. This information can be quite helpful. Beyond this, it is unclear whether the potential benefits of the MMPI-2-RF outweigh the cost in information loss. The architects of the restructured MMPI-2 maintain that “profile configurations defined by well-chosen more homogeneous scales in principle should achieve a more accurate recognition of more distinctive syndromal response patterns. We believe the RC Scales will make it easier to explore true syndromal assessment” (Tellegen et al., 2006, p. 165). Whether this grand hope will be actualized remains to be seen. It is clear, however, that if clinicians abandon the original Clinical Scales and body of code-type information, they will sacrifice the most impressive body of empirically based interpretive material ever amassed in the history of personality assessment.
Diagnosing Neuropsychological Deficits
The diagnostic efficacy of neuropsychological tests depends, in part, on their specificity in detecting underlying brain dysfunction. To the extent that emotional and motivational factors interfere with neurobehavioral performance, inferences regarding brain-based abilities are more uncertain. Conation, which is “will power” or application of effort, can be adversely affected by neurological (Reitan & Wolfson, 2002) and non-neurological conditions. The latter include emotional states and situational factors in which optimal effort is expected to have a painful or unpleasant outcome. This is a common problem in the assessment of compensation claimants who stand to lose by performing well on tests of ability (Green, Rohling, Lees-Haley, & Allen, 2001).
[Figure: three MMPI-2 profile plots with T-score axes. (1) Validity and Clinical Scales profile: VRIN, TRIN, F, FB, FP, L, K, S, Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma, Si (axis 30–110). (2) MMPI-2 Restructured Clinical Scales profile (axis 30–120). (3) Content Scales profile: ANX, FRS, OBS, DEP, HEA, BIZ, ANG, CYN, ASP, TPA, LSE, SOD, FAM, WRK, TRT (axis 30–110).]
and Johnson (1982) examined correlations between the MMPI and HRNB in a sample of normal individuals and found no statistically significant relationships. In stroke patients, only modest relationships were found between MMPI variables and neuropsychological test performance (Gass & Ansley, 1994a).
the WAIS-R (Full Scale IQ, r = -.35) and HRNB (Average Impairment Rating, r = -.30, ps < .001).
Controlling for Neurological Symptom Reporting
Scales 1, 2, 3, and 8 contain physical and cognitive symptom items that have been a source of confusion and controversy when the MMPI-2 is used with people in medical and medicolegal settings. In neuropsychological settings, group studies show that these scales typically show the highest elevations (Gass, 2000; Wooten, 1983). The interpretation of Scales 2, 3, and 8 is facilitated using scores on the Harris–Lingoes subscales. In brain-impaired and other medical samples, high scores on Scale 2 are most commonly associated with complaints of physical malfunctioning (D3) and mental dullness (D4). High scores on Scale 3 are associated with high scores on lassitude-malaise (Hy3) and somatic complaints (Hy4). On Scale 8, high scores are usually due, in part, to reports of cognitive difficulties (Sc3) and bizarre sensory experiences (Sc6). These types of somatic and cognitive complaints are common in brain-injured individuals who report their symptoms of CNS disease. Unfortunately, the extent to which scores on these subscales are reflective of psychological disturbance, as opposed to CNS symptom reporting, is unknown. These subscales are not diagnostically sensitive or specific, nor are they uniformly reliable.
The architects of the MMPI (Starke Hathaway, PhD, and J. Charnley McKinley, MD) were clinicians who had a strong interest in behavioral neurology. Hathaway, a clinical psychologist with a keen interest in engineering, published a textbook in 1942 entitled Physiological Psychology, in which he described and made artistic drawings to illustrate functional neuroanatomy. In fact, Hathaway’s impetus for meeting the neuropsychiatrist, Dr. McKinley, was a shared interest in the measurement of CNS evoked potentials. According to Hathaway, McKinley “was an admirable neurologist . . . he was no psychiatrist at all” (Butcher, 1973). In constructing the MMPI, McKinley expressed an interest in measuring neurological as well as psychiatric symptoms, though the criterion groups used to compose the Clinical Scales were psychodiagnostic groups screened for brain dysfunction. After the method of scale construction with carefully defined psychodiagnostic groups, these CNS items were assigned along with other MMPI items to the various Clinical (psychopathology) Scales.
Psychometric principles are seriously challenged when these scales are applied to patients who, having bona fide CNS symptoms, differ in a relevant and systematic way from the psychiatric patients used in the criterion groups to construct the scales. In brain-injured samples, neurologically sensitive items constitute a distinctive source of variance in MMPI-2 responses (Gass, 1991b, 1992). At one extreme, some authors recommend ignoring the presence of these neurologically sensitive items or their potential profile effects when evaluating brain-impaired individuals. Some authors opt for a crystal ball approach, recommending a reliance upon vaguely defined “clinical judgment,” but they fail to provide specific guidelines. At another extreme, some authors argue that the MMPI-2 as applied to brain-injured patients is essentially uninterpretable because of the confounding influence of neurological symptom reporting and the resulting inflation of scores across the Clinical Scales. They go so far as to recommend using rationally (not empirically) selected MMPI-2 items as a neurological symptom checklist.
A prudent approach to MMPI-2 use with brain-impaired patients requires a systematic examination of the potential impact of neurologically sensitive item endorsement on MMPI-2 profiles. However, the initial identification of neurologically sensitive items must be done empirically, not subjectively. Once the key items are identified, the clinician can inspect the examinee’s responses to these items, singling out the items that were answered in the keyed direction, and calculating the effects of their endorsement on the clinical profile. The original MMPI-2 profile, scored in the standard way, can then be compared with the “corrected” profile (Arbisi & Ben-Porath, 1999; Gass, 1991b, 1992). To the extent that the two profiles are similar, the clinician can have greater certainty regarding the profile’s accuracy as a reflection of emotional status and personality characteristics. To the extent that the two profiles differ, the clinician has to proceed more cautiously, knowing that there is a higher likelihood of artifactual score elevation. In either case, unless this interpretive method is followed, the clin-
test was developed, Hathaway and McKinley ician is left with the highest degree of uncertainty
(1940) identified several item clusters of MMPI regarding the potential influence of neurological
items reflecting ‘‘general neurologic’’ (19 items), symptom reporting.
‘‘cranial nerve’’ (11 items), and ‘‘motility and coordi- A fundamental question is, when assessing
nation’’ (6 items). Using the empirical keying the brain-impaired patient, which items on the
Table 23.2. MMPI-2 item table for evaluating traumatic brain injury.
Item F 1 2 3 4 7 8 9 0
reported that the correction items were endorsed by the patient specifically because of problems resulting from the injury. Consider an example: Item 179, “I have had no difficulty in keeping my balance in walking.” Thirteen of the patients answered “false” and, in the interview, all 13 of them attributed the imbalance to the head injury. Sixteen family members were interviewed independently and asked about the patient’s answer to this item. All 16 family members attributed the symptom to the head injury. Across the 14 items, the patients were 10 times more likely to attribute the endorsed problem to the head injury than to their preinjury status. Similarly, their significant others independently attributed the item endorsements to the head injury about 13 times more often than to preinjury status. Overall, 109 of 119 endorsements of the 14 items were attributed to the head injury, supporting the specificity of the correction when applied to TBI patients. To date, no other study that has attempted to address the specificity of the correction has examined individual cases in this careful and detailed manner.
The 14-item correction was examined in a sample of 67 inpatients who were consecutive admissions to a general hospital immediately following a TBI (Gass et al., 1999). Exclusionary criteria included recent substance abuse and a history of either neurological or psychiatric disorder. Upon admission, these patients were administered the Glasgow Coma Scale (GCS) and had CT scans. An incidental finding in this study was that the endorsement frequency on the 14 MMPI-2 correction items was positively related to injury severity on the GCS (r = .26) and findings on the CT scan (r = .26, ps < .05). These findings provide empirical support for the 14-item correction. Nevertheless, they are unexpected given the well-established weak relationship between the number of symptom complaints and degree of brain injury, whether estimated on the basis of GCS scores, radiological findings, duration of loss of consciousness, or posttraumatic amnesia (Bornstein, Miller, & van Schoor, 1988). It is surprising to find a correlation between the amount of symptom reporting and estimates of injury severity, particularly when these estimates are typically crude and temporally separated from the time of symptom (MMPI-2) reporting by highly variable time periods, ranging from several weeks to many years. To complicate matters, mild head-trauma patients with questionable injuries commonly report more symptom complaints than bona fide brain injury patients (Berry et al., 1995; Youngjohn et al., 1995; Youngjohn et al., 1997). Furthermore, retrospective reports of injury severity made by mild head trauma litigants with large financial incentives and symptoms persisting up to 7 years post-incident (e.g., Brulot, Strauss, & Spellacy, 1997) have doubtful validity and are misleading and irrelevant for assessing the validity of an MMPI-2 correction.
Studies that use bona fide brain injury samples have supported the validity of the correction. Rayls, Mittenberg, Burns, and Theroux (2000) investigated the question of whether the 14 correction
were other test results. She claimed that she was unable to concentrate well enough to handle any work during the 2 years following the accident. Neuropsychological testing revealed borderline performance across tasks of attention, concentration, and memory, and normal performance on the most sensitive indicators. She applied for medical discharge from the Navy and sought compensation for her injury. At 1 year post injury, she began generating a very long list of vague physical complaints. She made numerous visits to neurology, orthopedics, and audiology. None of these services were able to find any physical basis for her complaints.
The Navy investigated her jeep accident and found her at fault. The board concluded that she was driving recklessly during a temper tantrum. In fact, the Navy brought her up on disciplinary charges. She then sought psychotherapy and attended three sessions. She avoided talking about emotional issues, focusing entirely on physical complaints. She abruptly stopped showing up for appointments when she was informed that her request for a medical discharge was denied. Her MMPI-2 profile shown in Figure 23.4 revealed a test-taking attitude of admitting to significant problems of a circumscribed nature (Scale F), in this case physical, and at the same time attempting to wave a flag of very strong moral integrity (Scale L). The 13/31 code type is suggestive of somatoform and specifically conversion symptoms.
The Harris–Lingoes subscales for Scale 3 revealed evidence of personality characteristics associated with hysteria, specifically the intense need for affection and the tendency to inhibit hostility in situations where anger is elicited. On the content scales, she revealed problems with anxiety, worry, and depression, all of which are less intense than her health concerns. The 14-item correction (Gass, 1991b) was applied to ascertain the profile effects of her answering items associated with TBI. The dotted line in her profile, which is the result after correction, indicates that even if she sustained a brain injury (which is questionable), her MMPI-2 is a reflection of prominent somatoform symptoms.

MMPI-2, Mild Head Trauma, and Late Postconcussive Symptoms
The neurobehavioral effects of uncomplicated mild head trauma typically resolve within several weeks. When symptoms persist beyond several months (late postconcussive symptoms, LPCS), an important role of psychological and motivational factors must be considered (Binder, 1997; Binder et al., 1997). Predisposing factors for development of LPCS appear to include a context for
[Figure 23.4: line plot of MMPI-2 T-scores (vertical axis, 30 to 110) across the scales VRIN, TRIN, F, FB, FP, L, K, S, Hs, D, Hy, Pd, Mf, Pa, Pt, Sc, Ma, and Si.]
Fig. 23.4 Amy’s basic validity and clinical profile with correction (dotted line).
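The comparison between a standard profile and a corrected one, as plotted in Fig. 23.4, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the item numbers, keyed responses, and scale memberships in `CORRECTION_KEY` are hypothetical placeholders, not the published 14-item correction key, and the function names are invented for this sketch. It works on raw scores only; clinical interpretation rests on T-scores, which require the normative conversion tables that are not reproduced here.

```python
# Hypothetical correction key: item number -> (keyed response, scales scored).
# Placeholder values for illustration only; NOT the published 14-item key.
CORRECTION_KEY = {
    901: (False, {"Hs", "Hy"}),
    902: (True, {"D", "Sc"}),
    903: (True, {"Sc"}),
}


def corrected_raw_scores(raw_scores, responses, key=CORRECTION_KEY):
    """Subtract one raw-score point per scale for each correction item
    that the examinee answered in the keyed direction."""
    corrected = dict(raw_scores)
    for item, (keyed_response, scales) in key.items():
        if responses.get(item) == keyed_response:
            for scale in scales:
                if scale in corrected:
                    corrected[scale] -= 1
    return corrected


def profile_shift(raw_scores, corrected):
    """Raw-score points removed from each scale by the correction."""
    return {scale: raw_scores[scale] - corrected[scale] for scale in raw_scores}


# Made-up raw scores and item responses for one examinee.
raw = {"Hs": 14, "D": 25, "Hy": 27, "Sc": 20}
answers = {901: False, 902: True, 903: False}

corrected = corrected_raw_scores(raw, answers)
shift = profile_shift(raw, corrected)
print(corrected)
print(shift)
```

A small shift on every scale would correspond to the case in which the two profiles are similar and the standard interpretation can be held with more confidence; a large shift would signal a higher likelihood of artifactual elevation.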
On the MMPI-2, his results suggested a very different picture. He produced high scores on Scales 1, 2, 3, 7, and 8, and endorsed items that refer to suicidal behavior. Scores on Scales 1 and 3 were partially elevated because of his reported physical and health-related effects of stroke (Gass, 1992). An immediate follow-up feedback session and interview were conducted, in which he admitted having a clear and specific plan to shoot himself with his revolver out of a sense of hopelessness and frustration about his seemingly intractable medical condition. At this point, staff members were able to intervene to prevent a negative outcome. People often use self-report instruments to convey problems that they may not bring up or admit to in an initial interview.
The reverse phenomenon sometimes occurs, too. Some people adopt an exaggerated or defensive posture in answering MMPI-2 items, and the clinical interview provides more accurate information about the examinee. When psychological test protocols are either exaggerated or defensive, a feedback and follow-up interview session is often enlightening (Gass & Brown, 1992). Joe was a 57-year-old combat veteran who was referred for a neuropsychological evaluation due to complaints of increasing forgetfulness and periodic “blank spells.” He was applying for an increase in his disability rating for posttraumatic stress disorder (PTSD). His MMPI-2 profile revealed consistent (content-dependent) responding (TRIN and VRIN T-score < 80) with high scores on F (110T), Fb (108T), and Fp (92T). His profile showed an 86 code type, with high scores on Scale 8 (94T) and Scale 6 (86T) suggestive of psychotic symptoms and corroborated by an elevation on BIZ (76T). In the feedback session, he was queried regarding his endorsement of critical items suggestive of hallucinations and paranoid ideation. In every instance, he attributed his endorsement of these items to various experiences of flashbacks, nightmares, and drug intoxication. This interview helped to clarify that psychosis was not part of his clinical picture.
An excellent systematic and detailed approach to providing MMPI-2 feedback is presented by Butcher (1990). The feedback session is a two-way transaction between the clinician and the examinee, perhaps accompanied by significant others. It is a means for reviewing findings and providing clarification, and it is also an opportunity for additional assessment and intervention. Examinees’ reactions to test feedback, such as conveyances of acceptance or rejection, can be informative and might be predictive of whether recommendations are followed. MMPI-2 feedback can be part of a structured intervention yielding positive therapeutic benefits including symptom reduction and enhancement of self-esteem (Finn, 1996). Many individuals who are referred for neuropsychological evaluation have untreated emotional difficulties and problematic behaviors that are made manifest through their MMPI-2 results. Through this process of feedback provision and discussion, the neuropsychologist can help individuals obtain the assistance they need.

Summary
• A neuropsychological evaluation is incomplete without an assessment of personality and emotional status.
• The use of a comprehensive self-report instrument such as the MMPI-2 holds many advantages. For example, some patients use the MMPI-2 to disclose significant problems that they will not bring up or acknowledge in a clinical interview.
• Profile invalidity can result from the ill-advised use of proposed short forms of the MMPI-2, inadequate preparation of the examinee, inadequate supervision over the testing session, or exclusive reliance on the standard written administration of the MMPI-2 with examinees who have limited reading ability. Reading difficulties can be circumvented through the use of the audiotape or CD format.
• High scores on the FBS (Lees-Haley, English, & Glenn, 1991) occur in cases involving the malingering of physical symptoms, but they also occur in a variety of other clinical conditions. FBS does not distinguish malingerers from individuals with somatoform disorders, which are commonplace in the personal injury arena.
• The MMPI-2 content scales, subscales, and RC Scales provide information that helps the clinician refine and supplement Clinical Scale interpretation.
• The RC Scales assess core constructs underlying most of the standard Clinical Scales. Reliance on the MMPI-2-RF as a replacement for the MMPI-2 entails a substantial loss of information, including the loss of important inferences based on Scale 3 (Hy).
• Most scores on the MMPI-2 are not reliable predictors of performance on neuropsychological tests. In the absence of corroborative evidence, clinicians should be reluctant to use elevated MMPI-2 scores as an explanatory basis for poor performance on neuropsychological tests.
• Scales 1, 2, 3, and 8 contain items that are not only sensitive to psychopathology, but also to
Gass, C. S. (1991a). Emotional variables in neuropsychological test performance. Journal of Clinical Psychology, 47, 100–104.
Gass, C. S. (1991b). MMPI-2 interpretation and closed-head injury: A correction factor. Psychological Assessment, 3, 27–31.
Gass, C. S. (1992). MMPI-2 interpretation of patients with cerebrovascular disease: A correction factor. Archives of Clinical Neuropsychology, 7, 17–27.
Gass, C. S. (1996a). MMPI-2 variables in attention and memory test performance. Psychological Assessment, 8, 135–138.
Gass, C. S. (1996b). MMPI-2 interpretation and stroke: Cross-validation of a correction factor. Journal of Clinical Psychology, 52, 569–572.
Gass, C. S. (1997). Assessing patients with neurological impairments. Presented at the University of Minnesota MMPI-2 Clinical Workshops & Symposia, Minneapolis, MN, June 5.
Gass, C. S. (1999). Assessment of emotional functioning with the MMPI-2. In G. Groth-Marnat (Ed.), Neuropsychological assessment in clinical practice: A practical guide to test interpretation and integration (chap. 14). New York: John Wiley & Sons.
Gass, C. S. (2000). Personality evaluation in neuropsychological assessment. In R. D. Vanderploeg (Ed.), Clinician’s guide to neuropsychological assessment (2nd ed., pp. 155–194). Mahwah, NJ: Lawrence Erlbaum Associates.
Gass, C. S. (2002). Does test anxiety impede neuropsychological test performance? Archives of Clinical Neuropsychology, 17, 860.
Gass, C. S., & Apple, C. (1997). Cognitive complaints in closed-head injury: Relationship to memory test performance and emotional disturbance. Journal of Clinical and Experimental Neuropsychology, 19, 290–299.
Gass, C. S., & Ansley, J. (1994a). MMPI correlates of poststroke neurobehavioral deficits. Archives of Clinical Neuropsychology, 9, 461–469.
Gass, C. S., & Ansley, J. (1994b). Neuropsychological correlates of MMPI-2 scores in an inpatient psychiatric sample. Paper presented at the 29th Annual Symposium on Recent Developments in the Use of the MMPI (MMPI-2 and MMPI-A), May 24, Minneapolis, Minnesota.
Gass, C. S., Ansley, J., & Boyette, S. (1994). Emotional correlates of fluency test and maze performance. Journal of Clinical Psychology, 50, 586–590.
Gass, C. S., Ardern, H., Howell, S., Dowd, A., Levy, S., & McKenzie, K. (2005). Cognitive complaints in relation to test performance and emotional status in a neuropsychological referral sample [Abstract]. Archives of Clinical Neuropsychology, 20, 893.
Gass, C. S., & Brown, M. C. (1992). Neuropsychological test feedback to patients with brain dysfunction. Psychological Assessment, 4, 272–277.
Gass, C. S., & Daniel, S. K. (1990). Emotional impact on Trail Making Test performance. Psychological Reports, 67, 435–438.
Gass, C. S., & Freshwater, S. (1999). MMPI-2 symptom disclosure and cognitive complaints in a head injury sample [Abstract]. Archives of Clinical Neuropsychology, 14, 29–30.
Gass, C. S., & Gonzalez, C. (2003). MMPI-2 short form proposal: CAUTION. Archives of Clinical Neuropsychology, 18, 521–527.
Gass, C. S., & Lawhorn, L. (1991). Psychological adjustment following stroke: An MMPI study. Psychological Assessment, 3, 628–633.
Gass, C. S., & Luis, C. A. (2001a). MMPI-2 short form: Psychometric characteristics in a neuropsychological setting. Assessment, 8, 213–219.
Gass, C. S., Luis, C. A., Rayls, K. R., & Mittenberg, W. (1999). MMPI-2 profiles in acute traumatic brain injury: Impact of demographic variables and neurological symptom reporting. Journal of the International Neuropsychological Society, 5, 140.
Gass, C. S., & Russell, E. W. (1986a). Differential impact of brain damage and depression on memory test performance. Journal of Consulting and Clinical Psychology, 54, 261–263.
Gass, C. S., & Russell, E. W. (1986b). MMPI-2 correlates of lateralized cerebral lesions and aphasic deficits. Journal of Consulting and Clinical Psychology, 54, 359–363.
Gass, C. S., & Russell, E. W. (1987). MMPI correlates of performance–intellectual deficits in patients with right hemisphere lesions. Journal of Clinical Psychology, 43, 484–489.
Gass, C. S., & Russell, E. W. (1991). MMPI profiles of closed-head trauma patients: Impact of neurologic complaints. Journal of Clinical Psychology, 47, 253–260.
Gass, C. S., Russell, E. W., & Hamilton, R. A. (1990). Accuracy of MMPI-based inferences regarding memory and concentration in closed-head trauma. Psychological Assessment, 2, 175–178.
Gass, C. S., & Wald, H. (1997). MMPI-2 interpretation and closed-head trauma: Cross-validation of a correction factor. Archives of Clinical Neuropsychology, 12, 199–205.
Golden, C. J., Sweet, J. J., & Osmon, D. C. (1979). The diagnosis of brain damage by the MMPI: A comprehensive evaluation. Journal of Personality Assessment, 43, 138–142.
Graham, J. R. (1993). MMPI-2: Assessing personality and psychopathology (2nd ed.). New York: Oxford University Press.
Graham, J. R. (2006). MMPI-2: Assessing personality and psychopathology (4th ed.). New York: Oxford University Press.
Graham, J. R., Watts, D., & Timbrook, R. E. (1991). Detecting fake-good and fake-bad MMPI-2 profiles. Journal of Personality Assessment, 57, 264–277.
Grant, I., & Alves, W. (1987). Psychiatric and psychosocial disturbances in head injury. In H. S. Levin, J. Grafman, & H. M. Eisenberg (Eds.), Neurobehavioral recovery from head injury (pp. 232–261). New York: Oxford University Press.
Green, P., Rohling, M. L., Lees-Haley, P. R., & Allen, L. M. (2001). Effort has a greater effect on test scores than severe brain injury in compensation claimants. Brain Injury, 15, 1045–1060.
Greiffenstein, M. F., & Baker, W. J. (2001). Comparison of premorbid and postinjury MMPI-2 profiles in late postconcussive claimants. The Clinical Neuropsychologist, 15, 162–170.
Greiffenstein, M. F., & Baker, W. J. (2003). Premorbid clues? Preinjury scholastic performance and present neuropsychological functioning in late postconcussive syndrome. The Clinical Neuropsychologist, 17, 561–573.
Greiffenstein, M. F., Baker, W. J., Axelrod, B., Peck, E. A., & Gervais, R. (2004). The Fake Bad Scale and MMPI-2 F-family in detection of implausible psychological trauma claims. The Clinical Neuropsychologist, 18, 573–590.
Greiffenstein, M. F., Baker, W. J., Gola, T., Donders, J., & Miller, L. J. (2002). The FBS in atypical and severe closed head injury litigants. Archives of Clinical Neuropsychology, 19, 333–336.
Greiffenstein, M. F., Fox, D., & Lees-Haley, P. R. (2007). The MMPI-2 Fake Bad Scale in detection of noncredible brain injury claims. In K. B. Boone (Ed.), Detection of noncredible cognitive performance. New York: Guilford Press.
Greve, K. W., Bianchini, K. J., Love, J. M., Brennan, A., & Heinly, M. T. (2006). Sensitivity and specificity of MMPI-2 validity scales and indicators to malingered neurocognitive
and interpretation. Minneapolis, MN: University of Minnesota Press.
Vanderploeg, R. D., & Curtiss, G. (2001). Malingering assessment: Evaluation of validity of performance. NeuroRehabilitation, 16, 245–251.
Van Reekum, R., Cohen, T., & Wong, J. (2000). Can traumatic brain injury cause psychiatric disorders? Journal of Neuropsychiatry and Clinical Neuroscience, 12, 316–327.
Wooten, A. J. (1983). MMPI profiles among neuropsychology patients. Journal of Clinical Psychology, 39, 392–406.
Youngjohn, J. R., Burrows, L., & Erdal, K. (1995). Brain damage or compensation neurosis? The controversial postconcussive syndrome. The Clinical Neuropsychologist, 9, 595–598.
Youngjohn, J. R., Davis, D., & Wolf, I. (1997). Head injury and the MMPI-2: Paradoxical severity effects and the influence of litigation. Psychological Assessment, 9, 177–184.
Zillmer, E. A., & Perry, W. (1996). Cognitive-neuropsychological abilities and related psychological disturbance: A factor model of neuropsychological, Rorschach, and MMPI indices. Assessment, 3, 209–224.
24 Assessing Couples
Abstract
Couple distress has a high prevalence as well as high comorbidity with a broad range of emotional,
behavioral, and physical health problems. Relationship problems also influence individuals’ response to
treatment for a wide range of psychological disorders. Hence, clinicians need to be skilled in assessing
relationships in order to provide effective interventions whether working primarily with individuals,
couples, or the broader family system. This chapter first introduces brief screening measures and clinical
methods for diagnosing couple distress in clinical as well as research applications. The bulk of the chapter is
devoted to conceptualizing and assessing couple distress for the purpose of planning and evaluating
treatment. Emerging technologies for assessing intimate relationships are presented, along with
recommendations for future research.
Individuals rate having a satisfying marriage or relationship as one of their most important goals in life (Roberts & Robins, 2000). Indeed, marital happiness exceeds satisfaction in other domains (e.g., health, work, or children) as the strongest single predictor of overall life satisfaction (Fleeson, 2004). When intimate relationships become distressed, the negative effects on partners’ emotional and physical well-being can be far-reaching. Moreover, failure to address clients’ relationship well-being can compromise the effectiveness of individual therapies targeting a broad range of psychological disorders (Snyder & Whisman, 2004). Hence, it behooves mental health practitioners to gain expertise in assessing couples as a prerequisite to planning and implementing effective interventions, whether conducting individual- or couple-based treatments.
Assessment of couples shares basic principles of assessing individuals, namely that (a) the foci of assessment methods be empirically linked to target problems and constructs hypothesized to be functionally related; (b) selected assessment instruments and methods demonstrate evidence of reliability, validity, and cost-effectiveness; and (c) findings be linked within a theoretical or conceptual framework of the presumed causes of difficulties, as well as to clinical intervention or prevention.
Couple assessment differs from individual assessment in that couple assessment strategies (a) focus specifically on relationship processes and the interactions between individuals; (b) provide an opportunity for direct observation of target complaints involving communication and other interpersonal exchange; and (c) must be sensitive to potential challenges unique to establishing a collaborative alliance when assessing highly distressed or antagonistic partners, particularly in a conjoint context. Our discussion of strategies for assessing couples is necessarily selective, emphasizing dimensions empirically related to couple distress, identifying alternative methods and strategies for obtaining relevant assessment data, and highlighting specific techniques within each method.
We begin this chapter by defining couple distress and noting its prevalence and comorbidity with
emotional, behavioral, and physical health problems of individuals in both clinical and community populations. Both brief screening measures and clinical methods are presented for diagnosing couple distress in clinical as well as research applications. The bulk of the chapter is devoted to conceptualizing and assessing couple distress for the purpose of planning and evaluating treatment. Toward this end, we review empirical findings regarding behavioral, cognitive, and affective components of couple distress and specific techniques derived from clinical interview, behavioral observation, and self-report methods. In most cases, these same assessment methods and instruments are relevant to evaluating treatment progress and outcome. Also included is a discussion of emerging technologies for assessing intimate relationships. We conclude with general recommendations for assessing couple distress and directions for future research.

Conceptualizing Couple Relationship Distress
Defining Couple Distress
The fourth edition of the Diagnostic and statistical manual of mental disorders, text revision (DSM-IV-TR; American Psychiatric Association, 2000) defines a “partner relational problem” as a pattern of interaction characterized by negative or distorted communication, or “noncommunication (e.g., withdrawal),” that is associated with clinically significant impairment in individual or relationship functioning or the development of symptoms in one or both partners. The acknowledgment of relational problems as a “frequent focus of clinical attention,” but their separation from other emotional and behavioral disorders, constitutes only a marginal improvement over earlier versions of the DSM that all but ignored the interpersonal context of distressed lives.
What are the limitations to this conceptualization of partner relational problems? First is an almost exclusive emphasis on the etiological role of communication in the impairment of functioning or development of symptoms in one or both partners. Although group comparisons document differences in communication between clinic and community couples (Heyman, 2001), and “communication problems” constitute the most frequent presenting complaint of couples (Geiss & O’Leary, 1981), evidence that communication differences precede, rather than follow from, relationship distress is weak or nonexistent. Moreover, research with community samples indicates that some forms of “negative” communication predict better rather than worse relationship outcomes longitudinally (Gottman, 1993). In addition, positive changes in relationship satisfaction following couple therapy correspond only weakly or nonsignificantly with actual changes in communication behavior (Jacobson, Schmaling, & Holtzworth-Munroe, 1987; Sayers, Baucom, Sher, Weiss, & Heyman, 1991).
In proposing a broadened conceptualization of relationship disorders for the DSM-V, First et al. (2002, p. 161) defined relational disorders as “persistent and painful patterns of feelings, behavior, and perceptions involving two or more partners in an important personal relationship . . . marked by distinctive, maladaptive patterns that show little change despite a great variety of challenges and circumstances.” Still lacking in this conceptualization (as well as in the DSM-IV-TR) is a recognition of “nonsymptomatic” deficiencies that couples often present as a focus of concern, including those that detract from optimal individual or relationship well-being. These include deficits in feelings of security and closeness, shared values, trust, joy, love, and similar positive emotions that individuals typically value in their intimate relationships.
Heyman, Feldbau-Kohn, Ehrensaft, Langhinrichsen-Rohling, and O’Leary (2001) proposed and evaluated criteria for relationship distress modeled loosely on the DSM major depressive disorder (MDD) diagnosis. Their definition of relationship distress required that “during the past month, a subjective sense of relationship dissatisfaction was noted by one or both partners and was accompanied by at least one significant symptom in the interactional, cognitive, or affective domains” (see Table 24.1). Heyman et al.’s original criteria were expanded to include additional interactional indicators of relationship distress by a task force of psychologists and psychiatrists examining the role of relational processes in revisions to the DSM (Beach & Wamboldt, 2007). Overall, these criteria emphasize both (a) a subjective global indicator of the construct’s core (relationship dissatisfaction as noted by reports of a sense of unhappiness, thoughts of divorce, or need for professional help for the relationship), and (b) specific indicators of distress (at least one behavioral, cognitive, or affective symptom). Items for Criterion A, overall dissatisfaction, were based on a rational approach to conceptualizing significant relationship distress. Items for Criteria B, C, and D were based on scientific literature demonstrating the linkage between low relationship satisfaction and each of the following: higher likelihood of reciprocating or escalating negativity during
conflict and of withdrawal (see review by Heyman, 2001), distressed attributional patterns (e.g., Bradbury & Fincham, 1990), low sense of efficacy that the relationship can improve (e.g., Vanzetti, Notarius, & NeeSmith, 1992), and intense or persistent levels of anger/contempt or sadness (e.g., Gottman, 1999). Overall dissatisfaction accompanied by apathy was included to make the symptom criteria exhaustive and because apathy or emotional detachment is a primary presenting complaint of some couples (O’Leary, Heyman, & Jongsma, 1998).
Cognitive
Individual: Intelligence; memory functions; thought content; thought quality; analytic skills; cognitive distortions; schemas; capacity for self-reflection and insight
Dyad (Couple, parent–child): Cognitions regarding self and other in relationship; expectancies, attributions, attentional biases, and goals in the relationship
Nuclear family system: Shared or co-constructed meanings within the system; family ideology or paradigm; thought sequences between members contributing to family functioning
Extended system (Family of origin, friends): Intergenerational patterns of thinking and believing; co-constructed meaning shared by therapist and family or other significant friends or family
Culture/community: Prevailing societal and cultural beliefs and attitudes; ways of thinking associated with particular religious or ethnic groups that are germane to the family or individual
Affective/emotional
Individual: Mood; affective range, intensity, and valence; emotional lability and reactivity
Dyad (Couple, parent–child): Predominant emotional themes or patterns in the relationship; cohesion; range of emotional expression; commitment and satisfaction in the relationship; emotional content during conflict; acceptance and forgiveness
Nuclear family system: Family emotional themes of fear, shame, guilt, or rejection; system properties of cohesion or emotional disaffection; emotional atmosphere in the home, including humor, joy, love, and affection as well as conflict and hostility
Extended system (Family of origin, friends): Emotional themes and patterns in extended system; intergenerational emotional legacies; patterns of fusion or differentiation across generations
Culture/community: Prevailing emotional sentiment in the community, culture, and society; cultural norms and mores regarding the expression of emotion
Behavioral
Individual: Capacity for self-control; impulsivity; aggressiveness; capacity to defer gratification; substance abuse; overall health, energy, and drive
Dyad (Couple, parent–child): Recursive behavioral sequences displayed in the relationship; behavioral repertoire; reinforcement contingencies; strategies used to control other’s behavior
Nuclear family system: Repetitive behavioral patterns or sequences used to influence family structure and power; shared recreation and other pleasant activities
Extended system (Family of origin, friends): Behavioral patterns displayed by the extended system (significant friends, family of origin, therapist) used to influence the structure and behaviors of the extended system
Culture/community: Cultural norms and mores of behavior; behaviors which are prescribed or proscribed by the larger society
Table 24.2. (continued)
Interpersonal/communication
Individual: Characteristic ways of communicating and interacting across relationships or personality (e.g. shy, gregarious, narcissistic, dependent, controlling, avoidant)
Dyad (Couple, parent–child): Quality and frequency of the dyad’s communication; speaking and listening skills; how couples share information, express feelings, and resolve conflict
Nuclear family system: Information flow in the family system; paradoxical messages; family system boundaries, hierarchy, and organization; how the family system uses information regarding its own functioning; family decision-making strategies
Extended system (Family of origin, friends): Degree to which information is shared with and received from significant others outside the nuclear family system or dyad; the permeability of boundaries and the degree to which the family or couple is receptive to outside influences
Culture/community: Information that is communicated to the family or individual by the community or culture in which they live; how the family or individual communicates their needs and mobilizes resources
Structural/developmental
Individual: All aspects of physiological and psychosocial development; personal history that influences current functioning, including psychosocial stressors; intrapersonal consistency of cognitions, affect, and behavior
Dyad (Couple, parent–child): History of the relationship and how it has evolved over time; congruence of partners’ cognitions, affect, and behavior
Nuclear family system: Changes in the family system over time; current stage in the family life cycle; stressors related to childrearing; congruence in needs, beliefs, and behaviors across family members
Extended system (Family of origin, friends): Developmental changes across generations; significant historical events influencing current system functioning (e.g. death, illness, divorce, abuse); congruence of beliefs and values across extended social support systems
Culture/community: The cultural and political history of the society in which the family or individual lives; current political and economic changes; congruence of the individual’s or couple’s values with those of the larger community
Note: From Snyder, D. K., & Abbott, B. V. (2002). Couple distress. In M. M. Antony & D. H. Barlow (Eds.), Handbook of assessment and treatment planning for psychological disorders (pp. 341–374). New York: Guilford. Copyright 2002
by Guilford Press. Reprinted with permission.
although providing guidance regarding initial areas of inquiry from a nomothetic perspective, the relation of any specific component to relationship distress for a given individual or couple needs to be determined from a functional analytic approach and applied idiographically (Cone, 1988; Haynes, Leisen, & Blaine, 1997; Haynes & O’Brien, 2000). Moreover, interactive effects occur within domains across levels, within levels across domains, and across levels and domains. For example, individual differences in emotion regulation could significantly impact how partners interact when disclosing personal information or attempting to resolve conflict (Snyder, Simpson, & Hughes, 2006). Later in this chapter, we highlight more salient components of this assessment model operating primarily at the dyadic level as they relate to case conceptualization and treatment planning.
Assessment for Diagnosis and Screening
A diagnosis of couple distress is based on the subjective evaluation of dissatisfaction by one or both partners with the overall quality of their relationship. By comparison, “relationship dysfunction” may be determined by external evaluations of partners’ objective interactions (as observed by the assessor or reported by the partners). Although subjective and external evaluations frequently converge, partners may report being satisfied with a relationship that, by outsiders’ evaluations, would be rated as dysfunctional due to observed deficits in conflict resolution, emotional expressiveness, management of relationship tasks involving finances or children, interactions with extended family, and so forth. Similarly, partners may report dissatisfaction with a relationship that to outsiders appears characterized by effective patterns of interacting in these and other domains. Discrepancies between partners’ subjective reports and outside observers’ evaluations may result, in part, from raters’ differences in personal values, gender, ethnicity, or cultural perspectives (Tanaka-Matsumi, 2004) or from a lack of opportunity to observe relatively infrequent behaviors (e.g., physical abuse, or cruel comments when drunk).
Partners may also disagree in their relationship accounts, either because of actual differences in subjective experiences of their relationship or because of differences in ability or willingness to convey these experiences via interview or self-report measures. Various methods for combining partners’ self-reports have been proposed including “weak link,” “strong link,” and “averaged report” models. The “weak link” approach emphasizes partner reports denoting higher levels of distress based on the premise that relationships come under duress and may be dissolved even if only one partner is significantly dissatisfied. Alternatively, the “strong link” approach asserts that one partner’s relatively higher levels of satisfaction or commitment may lead to resilience and maintenance of a relationship through particularly difficult or stressful times. The “averaged report” method, although the most common approach to combining partners’ reports in couple research prior to multilevel modeling techniques, may be the least justifiable approach because it obscures partner differences and provides no information about either resilience or vulnerability (i.e., couples wherein both partners report “moderate” distress are regarded as equivalent to couples wherein one partner reports extensive distress and the other little or no distress).
For screening purposes, a brief structured interview may be used to assess overall relationship distress and partner violence. Heyman et al.’s (2001) distress interview (see Table 24.1) has demonstrated high inter-rater reliability and convergent validity with observed interactions. Heyman and Slep (2006) have developed an interview for partner and child maltreatment that resulted in over 90% agreement on diagnostic decisions between community assessors and master reviewers.
An alternative approach to detecting relationship distress builds on taxometric analyses by Whisman, Beach, and Snyder (2008) identifying a “marital distress taxon” reliably distinguishing clinic from community couples based on partners’ self-reports across five domains of relationship functioning assessed by the Marital Satisfaction Inventory-Revised (MSI-R; Snyder, 1997). Incorporating both taxometric findings and item analytic procedures, Whisman and colleagues subsequently constructed a 10-item interview with high sensitivity and specificity for detecting relationship distress; the same items can be administered in a self-report format with different cutoffs for tailoring sensitivity and specificity to specific assessment purposes (Whisman, Snyder, & Beach, 2009).
The emphasis on partners’ subjective evaluations of couple distress has led to development of numerous self-report measures of relationship satisfaction and global affect. There is considerable convergence across measures purporting to assess such constructs as marital “quality,” “satisfaction,” “adjustment,” “happiness,” “cohesion,”
continued
Problem solving
Codebook of Marital and Family Interaction (Notarius et al., 1991)
*Communication Skills Test (Floyd, 2004)
Dyadic Interaction Scoring Code (Filsinger, 1983)
Living In Family Environments Coding System (Hops et al., 1995)
Verbal Tactics Coding Scheme (Sillars, 1982)
Support/intimacy
*Social Support Interaction Coding System (Pasch et al., 2004)
Assessment Methods for Treatment Monitoring and Evaluation
Self-report measures
*Dyadic Adjustment Scale (Spanier, 1976)
Goal Attainment Scaling (Kiresuk et al., 1994)
Kansas Marital Satisfaction Scale (Schumm et al., 1986)
*Marital Satisfaction Inventory—Revised (Snyder, 1997)
Observational methods
*Rapid Marital Interaction Coding System (Heyman, 2004)
Note: *denotes highly recommended based on overall psychometric characteristics and demonstrated utility.
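The “weak link,” “strong link,” and “averaged report” strategies for combining partners’ self-reports, described earlier in this section, can be illustrated with a minimal Python sketch. The function name, the 0 to 100 distress scale, and the example scores are hypothetical and chosen only for illustration; none of them comes from a published measure. The sketch assumes that higher scores mean greater distress, so the weak link is the more distressed partner’s score and the strong link is the less distressed (more satisfied) partner’s score.

```python
from statistics import mean


def combine_partner_reports(distress_a: float, distress_b: float) -> dict:
    """Combine two partners' distress scores (assumption: higher = more distressed)."""
    return {
        # Weak link: the relationship is characterized by the more distressed partner.
        "weak_link": max(distress_a, distress_b),
        # Strong link: the relationship is characterized by the more satisfied partner.
        "strong_link": min(distress_a, distress_b),
        # Averaged report: the simple mean, which hides partner differences.
        "averaged_report": mean([distress_a, distress_b]),
    }


# Hypothetical couples on an illustrative 0-100 distress scale.
both_moderate = combine_partner_reports(50, 50)
discrepant = combine_partner_reports(80, 20)

print(both_moderate)
print(discrepant)
```

In this hypothetical example the two couples receive the same averaged score even though only the second couple includes one highly distressed partner, which is the limitation of the averaged report noted in the text; the weak-link and strong-link values keep the two couples distinct.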
and affectional expression. For abbreviated screening measures of couple distress, several alternatives are available, including a brief (7-item) version of the DAS (Hunsley, Best, Lefebvre, & Vito, 2001). An even briefer measure, the Kansas Marital Satisfaction Scale (KMSS; Schumm et al., 1986), includes three Likert items assessing satisfaction with marriage as an institution, the marital relationship, and the character of one’s spouse. New global measures of relationship sentiment continue to be developed for both research and clinical purposes. These include a new set of three Couple Satisfaction Index (CSI) scales constructed using item response theory (IRT) and comprising 32, 16, and 4 items each (Funk & Rogge, 2007), and two factor scales derived from the MSI-R distinguishing overt conflict (disharmony) from emotional distance or isolation (disaffection) (Herrington et al., 2008).
Despite its widespread use, a review of psychometric properties reveals important limitations to the DAS. Factor analyses have failed to replicate its four subscales (Crane, Busby, & Larson, 1991), and the internal consistency of the affectional expression subscale is weak. There is little evidence that the full-length DAS and similar longer global scales offer incremental validity above the 3-item KMSS, although preliminary evidence suggests that the new CSI scales may offer higher precision of measurement and greater sensitivity for detecting differences in relationship satisfaction.
Because partners frequently present for treatment together, clinicians have the rare opportunity to observe the reciprocal social determinants of problem behaviors without venturing outside the therapy office. Structured observations constitute a useful assessment method because they minimize inferences needed to assess behavior, can facilitate formal or informal functional analysis, can provide an additional method of assessment in a multimethod strategy (e.g., integrated with interview and questionnaires), and can facilitate the observation of otherwise difficult to observe behaviors (Haynes & O’Brien, 2000; Heyman & Slep, 2004). We discuss analog behavioral observation of couple interactions and describe specific observational coding systems at greater length in the following section on case conceptualization and treatment planning. However, for purposes of initial screening and diagnosis, we advocate two approaches to assessing partners’ descriptions of relationship problems, expression of positive and negative feelings, and efforts to resolve conflicts and reach decisions: specifically, the Rapid Marital Interaction Coding System (RMICS; Heyman, 2004) and the Rapid Couples Interaction Scoring System (RCISS; Krokoff, Gottman, & Hass, 1989). Even when not formally coding couples’ interactions, clinicians’ familiarity with the behavioral indicators for specific communication patterns previously demonstrated to covary with relationship accord or distress should facilitate empirically informed screening of partners’ verbal and nonverbal exchanges.
When a couple presents for therapy with primary complaints of dissatisfaction in their relationship,
Abstract
The Minnesota Multiphasic Personality Inventory-Adolescent (MMPI-A) was developed using adolescent
norms and adolescent-specific content to offer clinicians an objective assessment instrument of adolescent
psychopathology across many settings. The MMPI-A includes Validity Scales that measure the extent to
which an adolescent responds to questions in a cooperative and truthful manner. The MMPI-A also
includes 10 basic Clinical Scales, 15 Content Scales, 31 Content Component Scales, and 42 Supplementary
Scales. MMPI-A profiles require specialized knowledge and training to correctly interpret adolescent
functioning and psychopathology. Computer-driven scoring and reporting may assist clinicians in accurate,
comprehensive, and integrated interpretation of the MMPI-A’s many scales. The MMPI-A can be used to
provide feedback to adolescents about their self-reported psychopathology and gain their cooperation
with the therapeutic process.
Keywords: adolescent, assessment, Clinical Scales, Content Component Scales, Content Scales, MMPI,
MMPI-A, psychopathology, Supplementary Scales, Validity Scales
Before the original Minnesota Multiphasic Personality Inventory (MMPI; Hathaway & McKinley, 1943) was published in 1943, it was already being used in personality research with adolescents (Capwell, 1945). Large longitudinal MMPI studies with adolescents continued through the 1950s and 1960s, demonstrating that the MMPI could well predict later juvenile delinquency and schizophrenia in boys and girls (Hanson, Gottesman, & Heston, 1990; Monachesi & Hathaway, 1969). However, this research revealed differences in item endorsement patterns between adolescents and adults, and longitudinally as subjects moved from middle to late adolescence. To address these issues and open MMPI access more broadly to adolescent assessment, Marks and Briggs developed adolescent norms (Marks, Seeman, & Haller, 1974). These norms derived from the MMPI responses of 1,800 normal adolescents and were provided separately for males and females ranging from ages 12 to 18. Marks and Briggs further used external criteria to develop 29 adolescent specific scale elevation patterns known as “code types,” differentially describing both high and low scores. The Actuarial Use of the MMPI with Adolescents and Adults (Marks et al., 1974) established the reliability, validity, and broad utility of the MMPI with adolescents across multiple settings. Later research offered adolescent norms using larger samples (Gottesman, Hanson, Kroeker, & Briggs, 1987) or more recent test dates (Colligan & Offord, 1989), extended norms to Content and Special Scales, and examined the clinical correlates of additional single-scale and two-point code types (Archer, Gordon, Giannetti, & Singles, 1988; Williams & Butcher, 1989).
Over time, the extensive MMPI research with adolescents suggested that the original MMPI, although often employed in assessing adolescent psychopathology, was burdened with several limitations among teenage subjects. Archer (2005, p. 38) summarized the following issues: (a) need for a revised item pool, eliminating offensive and
developmentally inappropriate items, rewriting items for easier readability, and adding items with adolescent specific content; (b) need for a universally accepted set of contemporary adolescent norms based on a nationally and ethnically representative adolescent sample; (c) need to create scales to assess adolescent specific problems and developmental issues; and (d) need for clear test interpretation guidelines for adolescent MMPI profiles. During the MMPI Restandardization Project, the University of Minnesota Press determined that the most effective way to address these limitations was to develop an adolescent specific version of the MMPI (Butcher & Williams, 2007, p. 6). The MMPI-Adolescent (MMPI-A) was completed and published in 1992 (Butcher et al., 1992).
To develop the MMPI-A, 154 newly written adolescent specific items along with the original MMPI items, some reworded, were administered to approximately 2,500 adolescents in eight states across the United States during the late 1980s. Approximately one-third of the tests were excluded due to invalidity or age, leaving 1,620 adolescents comprising the adolescent normative sample. The average age of the sample was 15.6 years. The sample closely matched the ethnic distribution of the U.S. population at the time, but overrepresented adolescents from college educated families, suggesting an upward socioeconomic skew (Butcher et al., 1992). The MMPI-A was developed using this adolescent normative sample. The MMPI-A contains 478 items. Only the first 350 items are needed to score the basic Validity and Clinical Scales. The remaining items are necessary to score the additional Validity, Supplementary, and Content Scales. Thus, the MMPI-A is shorter than the MMPI or MMPI-2, a feature deemed important based on experience with adolescents taking the original MMPI (e.g., Archer, Maruish, Imhof, & Piotrowski, 1991). By design, the MMPI-A maintains substantial continuity with the original MMPI and MMPI-2. It embellishes these adult-oriented instruments by adding adolescent specific Content and Supplementary Scales, and refines Scales F, Mf, and Si by deleting items that did not meet statistical criteria for inclusion in the adolescent sample (Butcher et al., 1992).
The MMPI-A is a sophisticated self-report instrument that allows clinicians to assess psychopathology in adolescent clients despite complex developmental, clinical, and methodological issues that can interfere with the psychological assessment of teens. MMPI-A is designed to assess psychopathology among adolescents aged 14–18 in outpatient, residential, and inpatient general and mental health settings, as well as academic and forensic settings.
The MMPI-A provides an objective assessment of an adolescent’s level of functioning in relation to standardized dimensions of psychopathology. The MMPI-A can serve as an index of therapeutic change if administered repeatedly during treatment. The ability to accurately assess changes in psychopathology over time is particularly useful with adolescents, as they undergo significant developmentally related psychological transitions.
The MMPI-A has rapidly become the most widely used objective personality assessment instrument with adolescents (Archer & Newsom, 2000). Since its publication, hundreds of books, book chapters, and research articles have appeared covering the MMPI-A’s use and performance in a range of settings and populations. These investigations have continued to refine our knowledge of MMPI-A interpretation.
MMPI-A Interpretation
A strength of the MMPI-A is its ability to detect invalid response patterns, which occur frequently in adolescents (Archer, 2005). MMPI-A Validity Scales can be used to detect a variety of response styles that may distort a respondent’s clinical profile. Under-reporting and over-reporting of psychopathology, random and haphazard responding, extreme responding, and defensive or guarded response styles can all be detected by MMPI-A Validity Scales.
Validity Scales
MMPI-A Validity Scales must be carefully examined by clinicians prior to interpreting a client profile. The Validity Scales provide information about whether or not the adolescent responded to test items in a cooperative, consistent, and accurate manner. Validity Scales provide information not only about the technical validity of the response pattern, but also about behavioral correlates and personality features that likely apply to the respondent.
CANNOT SAY SCALE
The Cannot Say Scale consists of the total number of items that a respondent fails to answer. Adolescents typically omit 10 or fewer items when completing the MMPI-A. When 11 or more items are omitted, the Cannot Say score is considered elevated.
Elevated Cannot Say Scale scores can occur for a variety of reasons. Higher Cannot Say Scale scores
Abstract
The assessment of suicide risk remains one of the most important, complex, and difficult tasks performed by
clinicians. Because suicide is one of the few fatal consequences of a psychiatric illness, accurate assessment of
suicide risk is essential. Therefore, the purpose of this chapter is to present the assessment of suicide risk
from a clinically balanced and research-informed standpoint. The optimal risk assessment integrates a sound
clinical interview with actuarial instruments providing supplementary or clarifying information.
Keywords: suicide, risk assessment, self-harm, suicide risk, assessment, MMPI-2, MCMI-III, suicide
attempts, suicide potential, Rorschach, BDI, suicidal ideation, suicidal intent, hopelessness
There are many reasons why the assessment of suicide risk is one of the most important aspects of clinical work. The fact that the majority (up to two-thirds) of those who commit suicide have had contact with a health-care professional in the month before their death (Kutcher & Chehil, 2007) illustrates this point profoundly. Because suicide is one of the few fatal consequences of psychiatric illness, accurate assessment of suicidal risk is essential (Packman, Marlitt, Bongar, & Pennuto, 2004). It is estimated that there are about a quarter million nonfatal suicide attempts each year in the United States. About 15% of those who attempt suicide will eventually take their own lives, and one-third of those who complete suicide have had nonfatal attempts in their past (Yufit & Lester, 2005). Suicide was the 11th leading cause of death in the United States in 2000, when an estimated 29,350 deaths were attributed to it (National Institute of Mental Health, 2003).
The purpose of this chapter is to present the assessment of suicide risk from a clinically balanced and research-informed standpoint. There is no question that clinicians are increasingly faced with decisions about what to do when a patient reports suicidal ideation or when assessment data lead to the same conclusion (Wingate, Joiner, Walker, Rudd, & Jobes, 2004). Unfortunately, many suicidal individuals do not voluntarily report their thoughts or plans for self-harm to the professionals who are trained to act accordingly. If we work from the premise that not all patients are willing to admit their suicidal ideation to their health-care provider (Glassmire, Stolberg, Greene, & Bongar, 2001; Johnson, Lall, Bongar, & Nordlund, 1999), then we must utilize other resources, namely the psychological assessment and a sound clinical review of risk factors, to guide our intervention strategies.
A wide array of instruments has been developed for the measurement of various aspects of suicidality. These actuarial instruments can be a helpful supplement in risk evaluation (Bryan & Rudd, 2006), but they have a notoriously high false-positive rate. Thus, optimal risk assessment integrates a sound clinical interview with actuarial instruments providing supplementary or clarifying information.
A recent trend is to focus on new empirically based strategies in the assessment of suicide potential (Bongar, 2000; Bryan & Rudd, 2006). Some researchers have turned their attention away from making predictions of patient behavior to clinicians’ own views of critical factors when assessing for suicide (Bongar, 1991; Bruno, 1995; Bryan & Rudd, 2006;
Greaney, 1995; Kutcher & Chehil, 2007; Mahrer, 1993), attempting to identify the standards of care for this practice situation. This approach seeks to bridge the study of patient characteristics with that of clinician education, training, and experience in hope of describing reasonable and prudent practitioner behaviors. A profile of these behaviors not only is useful in identifying the standard of care, but can also help identify professional myths and deficiencies in practice, both essential to training and education efforts.
The assessment of suicide risk remains among the most important, complex, and difficult tasks performed by clinicians (Bongar, Maris, Berman, Litman, & Silverman, 1993; Chemtob, Bauer, Hamada, Pelowski, & Muraoka, 1989; Motto, 1991; Reinecke & Franklin-Scott, 2005). Because suicidology literature is voluminous, diverse, and sometimes contradictory, clinicians may have difficulty determining the relative importance of various factors when assessing the individual. Some researchers have emphasized particular charts of variables. Gutheil and Appelbaum (1982), for example, stated that the best approach to gathering data “focuses on previous psychiatric history; recent behavior or behavioral change; significant alteration of circumstances (e.g., loss of job); bizarreness of ideation or action; threats to self or others, or related behavior such as the purchase of poison, rope, or a gun; history of substance abuse and the like” (p. 52). These authors pointed out that the central clinical and legal concerns involve negligence in evaluation and in involuntary interventions (i.e., hospitalization).
Brent, Kupfer, Bromet, and Dew (1988) noted that an accurate diagnostic assessment is necessary with regard to primary and comorbid psychiatric disorder, alcohol and drug abuse, personality disorders, and attendant medical disorders. In addition, they pointed out that a proper assessment mobilizes the families and significant others in order to improve compliance with treatment and decrease the chance of a relapse.
Much of the difficulty in the assessment of suicide risk comes from the psychometric weaknesses of existing suicide scales and measures (Rogers & Oney, 2005). We suggest that all measures of suicide risk (e.g., scales, tests, and critical items) need to be reviewed in context of the assessment itself. What is the referral question, why is this person here with me at this time, and how previous experiences or cognitions may impact the current presentation are all vital to consider. Recent advances in risk assessment research show areas that have been empirically demonstrated to be essential to the risk assessment (Rudd, Joiner, & Rajab, 2001). Bryan and Rudd (2006) present a clear outline for a thorough risk assessment which includes an examination of:
1. predispositions to suicidal behavior (diagnosis, history of attempts, gender, etc.);
2. identifiable precipitant or stressors (loss, health problems, relationship problems, etc.);
3. symptomatic presentation (specific diagnosis with elevated risk factors);
4. presence of hopelessness (severity and duration);
5. the nature of suicidal thinking (intensity, duration, intent, means, lethality, etc.);
6. previous suicidal behavior (opportunity, frequency, and context of attempts, etc.);
7. impulsivity and self-control (subjective and objective control); and
8. protective factors (support, religion, children, life satisfaction, etc.).
Kutcher and Chehil (2005) authored a quick structured approach to assessing suicide risk levels called the Tool for Assessment of Suicide Risk (TASR). Readers interested in a copy of the instrument and direction for its use are directed to Kutcher and Chehil (2007). One of the other useful components of the TASR is a list of important questions to ask. A few of those questions are listed here:
1. Do you have thoughts of death or suicide?
2. Do you have a specific plan?
3. What methods have you considered?
4. Do you have access to a plan?
5. Do you have a date and place in mind?
6. If you were alone now would you try to kill yourself?
Along these lines Fremouw, de Perczel, and Ellis (1990), as well as Bassuk (1982), noted that the assessment of a patient’s potential suicide risk necessitates the gathering and weighing of a variety of information and data, and the importance of this particular assessment has led these psychologists to construct an impressive decision model that integrates and formalizes the steps for a thorough and reasonable decision about the risk for suicide for a particular patient. Somewhat like Bassuk’s 1982 checklist system, their decision model involves seven steps for the psychologist:
1. The collection of demographic information (e.g., age, sex, race, marital status, and living situation) to determine whether the patient is in a high-risk or low-risk group.
2. The examination of clinical and historical indicators as the more specific information that
Abstract
This chapter provides a conceptual framework for the use of self-report measures in assessing alcohol
and drug abuse. The chapter reviews the use of self-report measures for assessing a number of different
issues: screening for substance abuse; traditional scales used to identify alcohol or drug abuse; timing of
assessment within an alcohol/drug treatment program; specific tests that have particular utility within the
area; and identification of patients with dual disorders. The Minnesota Multiphasic Personality Inventory-2
(MMPI-2) is emphasized throughout the chapter, but the results can be generalized to other self-report
measures. Self-report measures continue to show promise in addressing alcohol/drug abuse. Research is
still needed to address these issues more comprehensively.
Keywords: Addiction Admission Scale (AAS), Addiction Potential Scale (APS), Alcohol Use Inventory
(AUI), AUDIT, CAGE, MacAndrew Alcoholism Scale-Revised (MAC-R), MAST, Millon Clinical Multiaxial
Inventory (MCMI/MCMI-II/MCMI-III), Minnesota Multiphasic Personality Inventory (MMPI/MMPI-2), Rapid
Alcohol Problems Screen 4–Quantity Frequency (RAPS4-QF)
The use of any form of psychological assessment in alcohol/drug abuse has been hampered by the traditional bias in this field against any form of assessment, which has stated more or less explicitly that once the person stops drinking/drugging the other problems will go away. It is only in the last few decades that progress has been made in the use of psychological assessment in this field with the recognition of the (a) specific contributions that can be made by psychological assessment in planning interventions and treatment; (b) existence of persons with dual disorders that must be diagnosed and treated concurrently; and (c) need for screening for alcohol/drug abuse in many areas of our society that has increased public awareness of the role of assessment. This chapter will focus on the use of self-report measures in several areas of assessing alcohol/drug abuse: (a) issues involved in screening for substance abuse; (b) traditional scales used to identify alcohol or drug abuse; (c) timing of psychological assessment within an alcohol/drug treatment program; (d) specific tests that have particular utility within the area; and (e) identification of patients with dual disorders. The primary emphasis in this chapter will be on the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989), although the results can be generalized easily to any self-report measures.
Preliminary Considerations
A number of potential issues must be given explicit consideration in understanding research on alcohol or drug abuse. The parameters of the person’s use of substances need to be reported explicitly. These data can be as simple as whether the person uses substances constantly, only on weekends, or during binges. The type of substance(s) used also is important, at least as to generic class, for example, sedative-hypnotics, stimulants, and so on, as well as whether the person uses only one substance or multiple substances. The work of Skinner and his
colleagues (cf. Morey & Skinner, 1986; Morey, Skinner, & Blashfield, 1984; Skinner, Jackson, & Hoffmann, 1974) may be very useful in this area in terms of identifying patterns of use that can be compared with psychological test performance.
The social-interpersonal aspects of the alcohol or drug use should be identified. Persons may use substances only alone, withdraw and isolate themselves as the result of the use of substances, only use in social contexts, or some combination of these behaviors. Misuse of alcohol is not only associated with multiple physical and mental health problems, but also increases risk factors related to criminal behavior. Psychosocial aspects of substance use are particularly important in women, younger persons, and older persons who may start drinking at a later time in their life, all of whom generally have shorter periods of substance use history. The use and misuse of alcohol among college students has been widely documented (e.g., O’Malley & Johnston, 2002; Perkins, 2002) and more recently has been an area of public concern.
The factors that have led the person to receive treatment need to be identified. Although persons enter treatment as a result of “hitting bottom,” the factors that lead to that bottom are very different. Persons may be entering treatment because of the “encouragement” of the legal system, employers, spouses, and so on. The increased use of substance use testing in the workplace may result in persons being identified and referred for treatment before “hitting bottom,” that is, at an earlier stage of substance abuse.
Any comparisons that are made between polydrug and alcoholic patients must consider the potential effects of age since polydrug patients average 15–20 years younger than alcoholics. Another consideration is that persons dependent on alcohol are 3 times more likely than those in the general population to use tobacco, and people dependent on tobacco are 4 times more likely than the general population to be dependent on alcohol (Grant, Hasin, Chou, Stinson, & Dawson, 2004). The influence of socioeconomic class and education, which have significant effects on self-report personality test performance, also must be considered. The general failure to even consider the role of such moderator variables, which is so characteristic of research in this area, must be corrected if meaningful data are to be obtained.
One of the most significant needs in this entire area is to focus on treatment process and outcome issues, that is, whether subgroups of patients who use specific substances respond better to one type of treatment and whether they have differential recovery rates. Project MATCH Research Group (1993) has provided a comprehensive overview of this topic that should be consulted by anyone who is working in this area. The time is long past when delineating whether alcoholics or drug addicts are a unitary group is a viable question; the data are clear that substance users are very heterogeneous. Research questions need to be more sharply focused, such as: Do alcoholics with both a 2-4/4-2 codetype and an elevated score on the MacAndrew Alcoholism Scale-Revised (MAC-R) have better recovery rates in a confrontational-based AA program? Or, are attrition rates across the course of treatment within this codetype higher or lower than in a 4-9/9-4 codetype, and if so, how can this problem be addressed? It also would be informative to determine whether patients within these specific subgroups have different histories of personal and familial substance use. Consequently, it is mandatory that researchers do more than simply report that the patients are in some unspecified treatment program. Detailed databases that encompass both the individual’s and his/her familial history of substance use as well as social and environmental factors that may be involved must be reported as well as the patient’s MMPI-2 performance. Substance use is a complex process; it is time that research designs start to appreciate this complexity and report data that can begin to give some insight into the multitude of factors that are involved.
The point in time at which the data are collected is very important. If the data are collected too early in treatment the patients may be too toxic to report accurately. This specific issue will be discussed below.
How the persons respond after they have entered treatment also must be considered, particularly in any study that is evaluating treatment effectiveness or outcome. Some persons will continue to deny or minimize their use of substance(s) throughout treatment, and/or leave treatment prematurely. Clinicians who collect their data early in treatment and do not consider that some patients may leave before completing treatment will have different results than those who collect data later or toward the end of treatment.
Finally, the basis on which the diagnosis is made must be given careful consideration. All too often it appears that the only criterion for the diagnosis is the person’s presence in a treatment facility with little regard for whether the diagnosis is appropriate or
Edwin I. Megargee
Abstract
This chapter considers the issues involved in understanding and, especially, assessing, human aggression
and violence. After defining aggression and discussing a broad range of factors that influence its assessment,
it differentiates individualized from generalized assessments. In the former, the clinician analyzes the
factors influencing aggressive behavior in a particular individual. A conceptual framework, the ‘‘algebra of
aggression,’’ is offered for analyzing physiological and psychological factors such as instigation, inhibitions,
and habit strength that influence the relative response strength of different aggressive and nonaggressive
acts.
Generalized assessments are focused on identifying which members of a group, such as applicants for
parole, are most likely to be aggressive. A number of subjective and objective rationally and actuarially
derived risk assessment instruments are described and evaluated.
Keywords: aggression and violence, individual assessments, generalized assessments, risk assessment
On August 1, 1966, a “false negative” shot 38 people on the University of Texas’s (UT) Austin campus, killing 14. Several weeks earlier, Charles Whitman, a UT student, had gone to the student health center complaining of headaches and uncontrollable urges to shoot people at random from the 22-story tower that dominates the campus. The psychiatrist whom Whitman consulted prescribed an analgesic for the headaches and suggested that he return for another session the following week. Instead, on the night of July 31, Whitman killed his mother and, later that evening, his wife. The following morning, he donned the uniform of a maintenance worker, packed a trunk full of guns and ammunition, and wheeled it to a tower elevator. Ascending to the top floor, he shot and killed the attendant, sealed off the elevators, and took his weapons to the balustrade overlooking the campus. When he thought classes were about to change, flooding the campus with students, he opened fire. Along with hundreds of other UT faculty and students, I was pinned down for over 90 minutes until a police officer and an armed civilian ascended the tower stairs and fatally shot Whitman.
When assessing the risk of aggression and violence, it is the possibility of missing false negatives, such as Whitman, that most worries clinicians. However, it is the false positives, namely individuals incorrectly diagnosed as being aggressive, who are far more numerous (Otto, 1992; Wollert, 2006).
The consequences of being labeled dangerous can be severe. They can include prolonged imprisonment, revocation of probation or parole, involuntary commitment to a mental health facility, and loss of parental rights in custody disputes. In Texas, people convicted of capital crimes who are assessed as being a continuing threat to society often end up on death row (Cunningham & Reidy, 1999; Dorland & Krauss, 2005; Edens, Buffington-Vollum, Keilen, Roskamp, & Anthony, 2005). In many states, sex offenders regarded as dangerous may be civilly committed, subjected to preventive detention, or effectively barred from living in some communities (Alexander, 2004; Mercado & Ogloff, 2007).
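The observation that false positives far outnumber false negatives follows from simple arithmetic whenever the behavior being predicted is rare. The sketch below is only an illustration: the function name, the population of 1,000, the 5% base rate, and the 80% sensitivity and specificity are hypothetical values chosen for the example, not figures reported by the sources cited above.

```python
def expected_outcomes(population, base_rate, sensitivity, specificity):
    """Expected counts of each prediction outcome for a screening decision.

    All inputs are hypothetical illustration values, not empirical estimates.
    """
    violent = population * base_rate          # people who will in fact be violent
    nonviolent = population - violent         # people who will not
    true_pos = violent * sensitivity          # violent and correctly flagged
    false_neg = violent - true_pos            # violent but missed
    true_neg = nonviolent * specificity       # nonviolent and correctly cleared
    false_pos = nonviolent - true_neg         # nonviolent but labeled dangerous
    return true_pos, false_neg, false_pos, true_neg


# Assumed scenario: 1,000 people assessed, 5% base rate of violence,
# and an instrument that is right 80% of the time in both groups.
tp, fn, fp, tn = expected_outcomes(1000, 0.05, 0.80, 0.80)

# Share of people labeled dangerous who actually are (positive predictive value).
ppv = tp / (tp + fp)

print(f"false negatives: {fn:.0f}, false positives: {fp:.0f}, PPV: {ppv:.2f}")
```

Under these assumed values, roughly 190 people would be incorrectly labeled dangerous for every 10 violent individuals who are missed, and fewer than one in five of those labeled dangerous would actually be violent. Lowering the assumed base rate makes the imbalance larger.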
Defining Aggression and Violence
Johnson (1972, p. 8) wrote that, “the most important thing that can be said about defining aggression” is that, “there is no single kind of behavior that can be called ‘aggressive’ nor is there any single process which represents ‘aggression’.” Indeed, it would be possible, although not very helpful, to devote this entire chapter to a discussion of the difficulties encountered in formulating a universally acceptable definition of aggression (Baron, 1977; Johnson, 1972; Megargee, 1993).
Several important issues must be considered when one tries to define aggression or violence. The first is “intentionality.” I am sure everyone agrees that Charles Whitman’s deliberately planned and well-executed mass murder constituted aggression and violence. However, was it also aggression or violence in February 2006 when Vice President Cheney accidentally sprayed a hunting partner with birdshot? Some behaviorists attempted to avoid this issue by defining aggression as any behavior that, “delivers noxious stimulation to another organism” regardless of whether or not it was intended (Buss, 1961, p. 1). However, today most authorities insist that the perpetrator must intend to harm the target for the behavior to be considered aggressive (Anderson & Bushman, 2002).
Assuming intent is a necessary component of aggression, the next question is whether harmful intent alone is sufficient to constitute aggression. If a woman who plans to commit a suicide bombing in a crowded market is arrested before she can carry out her attack, is her scheme alone enough to be considered aggression?
“Legality” is yet another issue. Charles Whitman and the two men who shot him to death all committed homicide on that hot August afternoon. Were all three engaging in violent aggression? Some authorities reserve the terms “aggression” or “violence” for illegal or criminal behavior while others include any act that harms another human being, regardless of its legality.
The “target” of the aggressive behavior is another concern. Anderson and Bushman (2002, p. 28) regard human aggression as harmful “behavior directed toward another individual,” but what of harm inflicted on other targets such as animals, objects, or property? Is someone who deliberately sets a forest fire aggressive if no people are injured? What about a worker who slaughters animals at a meatpacking plant? In my seminars on aggression, discussions of these issues could go on for hours, even days, if left unchecked.
In this chapter, which is limited to human behavior, “aggression” refers to overt verbal or physical behavior that can harm people and other living creatures by causing them distress, damage, pain, or injury, or by damaging their property or reputation. “Violence” is severe physical aggression that is likely to cause serious damage or injury. To be considered aggression, the harm must be intended or the possibility of such harm accepted by the perpetrator. Such behavior will be viewed as aggressive whether or not hurting the victim was the aggressor’s primary intention and regardless of whether the behavior is classified as legal or illegal in the particular society in which it occurs.
This definition includes instrumental aggression where the primary intent is not necessarily to harm someone but in which a person’s pain or suffering is an acceptable outcome, as well as the legal aggression or violence that might occur in military combat. It does not include accidental or unintentionally harmful behavior, or unavoidable pain inflicted for altruistic reasons such as might occur in medical or dental treatment.1
Heterogeneity of Aggression and Violence
Just as there is no simple definition of aggression, there is also no single prototype for those who behave aggressively or who engage in violence. After Seung-Hui Cho shot and killed 32 people on the Virginia Tech campus in April 2007, I was asked how closely he resembled Charles Whitman, whom I had studied four decades earlier. The two could hardly have been more different. Whitman, a former altar boy and Eagle Scout, was a hard-working, well-regarded, achievement-oriented married veteran with a temporal lobe tumor. Cho was an estranged, alienated, under-achieving misfit with no known physical impairment whose agitated behavior and bizarre writings frightened his teachers and fellow students.
In the 1960s, many social psychologists thought that, although the causes of the milder forms of aggressive behavior that they studied in the laboratory could be quite complex, violent crimes were typically committed by offenders who had inadequate controls and inhibitions against aggressive acting out (Berkowitz, 1962; Buss, 1961). Such undercontrol could result from (a) a failure to incorporate society’s moral codes and prohibitions against aggression, (b) being socialized into a subculture of violence in which aggression or violence was expected and rewarded in certain circumstances, or (c) impairment of normal inhibitions against
aggression by organic conditions, toxic substances, or functional psychopathology.
Although many aggressive and violent people clearly do lack adequate inhibitions or controls, we discovered that some extremely violent offenders are, paradoxically, overcontrolled individuals whose rigid and excessive inhibitions against any type of aggressive behavior allow aggressive instigation to accumulate until it is unleashed in one often cataclysmic act, after which they revert to their typical overcontrolled pattern (Megargee, 1966, 1982). In addition to these under- and overcontrolled assaultive types, it is also possible for normally socialized people with culturally appropriate value systems to engage in violence. This includes law enforcement officers and military personnel whose jobs may require them to behave aggressively. Normally socialized people may also engage in aggression in response to being attacked, severely provoked, or extremely frustrated, or if they are caught up in the contagious chaos of a riot or rebellion (Berkowitz, 1970).
This heterogeneity makes it unlikely that any single global device for the assessment or prediction of aggression or violence will be universally effective. For example, Hare’s Psychopathy Checklist-Revised (PCL-R; Hare, 2003) might be very useful in detecting poorly socialized criminal offenders who have inadequate inhibitions against aggressive behavior or violence, but it is unlikely to identify the normally socialized people driven to aggression by extreme situational factors.
Issues and Factors Influencing Assessments of Aggression and Violence
In addition to the complexity of aggressive behavior, there are a number of other issues that influence its assessment. They include the type of appraisal required, the context and setting in which the assessment takes place, and whether an individual or a group is being assessed.
Type of Appraisal
Most assessments of aggressive behavior fall into one of two general categories: “retrospective” or “diagnostic” assessments in which individuals who have already been violent are evaluated, and “prospective” or “prognostic” assessments in which their potential for future aggression or violence is estimated.
In retrospective assessments, psychologists are typically confronted with aggression that has already occurred and are asked to explain it. Criminal liability might depend on the mental status of the perpetrator at the time of offense. For example, Mary Winkler, a minister’s wife who shot and killed her sleeping husband in March 2006, was convicted of a reduced charge of voluntary manslaughter and sentenced to only 210 days of confinement after a psychologist for the defense testified that her husband’s abusiveness resulted in a mental disorder that made it impossible for her to form criminal intent (Grinberg, 2007). Retrospective evaluations are especially important in program planning. For example, the management and treatment needs of undercontrolled and overcontrolled assaultive offenders obviously differ greatly (Megargee, 1966).
In prospective or prognostic evaluations or “risk assessments,” psychologists are asked whether a person, group, or even a nation is likely to commit aggression in the future. If so, what form is the aggressive behavior likely to take? Against whom or what is the aggressive behavior likely to be directed? Under what conditions is such aggression most likely to occur? What circumstances or interventions are likely to increase or decrease the risk of such behavior (Borum, Fein, Vossekuil, & Berglund, 1999; Heilbrun & Heilbrun, 1995)? The answers to these questions might be used to help determine whether patients or prisoners are ready for release or if a child should be returned to the custody of a possibly abusive parent. As anyone who has ever bet on a sporting event knows, it is much more difficult to predict what will happen in the future than it is to explain what has happened in the past.
Some assessment techniques work much better in retrospective than in prospective assessments. Retrospectively, the overcontrolled hostility (O-H) scale for the Minnesota Multiphasic Personality Inventory-2 (MMPI-2; Megargee, Cook, & Mendelsohn, 1967) can help identify overcontrolled assaultive offenders, but prospectively the O-H scale is of limited utility in predicting violent behavior because of the rarity of this syndrome.
Context and Setting
The context of the assessment and the setting in which it is carried out are also important considerations. Assessments of aggression and violence may take place in jails, prisons, or closed psychiatric wards. Security considerations may dictate that the client is in a cell or even in some sort of restraints, and other people, such as attorneys, correctional officers, or psychiatric aides, may be present. The
Individualized Assessment of Aggression
The Algebra of Aggression: A Conceptual Framework for the Analysis of Aggressive Behavior
In his recent Sutherland Award address, Nagin (2007) proposed that the concepts of choice and decision making should occupy “center stage” in criminological theory and research. Choices are central to my conceptual framework for understanding human aggression. According to this formulation, every act of human aggression results from the interaction of numerous factors and involves dozens of implicit and explicit choices. This is true whether the act is planned and deliberate or spontaneous and impulsive. These choices involve emotions as well as cognitions, physiology as well as psychology, and situational as well as personal factors.
Although we may not be consciously aware of it, at any given time we are confronted with dozens of alternative behaviors, some of which may be aggressive. These potential aggressive responses may be verbal or physical, mild or extreme, direct or indirect. How do we make these choices? According to the algebra of aggression, we typically select the response that appears to offer us the most satisfactions and the fewest dissatisfactions in that particular situation.
This simple statement conceals a rapid but complex internal bargaining process in which we weigh the capacity of each response to fulfill various competing drives and motives against the discomforts or disappointments we might suffer from choosing that response. By means of this “internal algebra,” we calculate the net strength of each possible response, compare it with all the other responses, and select the strongest.
What determines the net strength of a potential response? We must consider both “personal” and “situational” factors. In the case of an aggressive or violent act, we can isolate three personal factors that interact to determine its response strength. The first two induce us to act aggressively while the third deters us from aggressive behavior.
The first personal factor which promotes aggression is “instigation to aggression.” Instigation to aggression is the sum of all the forces that can motivate us to commit a violent or aggressive act. It includes both “intrinsic or angry” instigation, which is our conscious or unconscious desire to harm the victim in some fashion, and “extrinsic or instrumental” instigation, which is our wish for other desirable outcomes that the aggressive act in question might achieve for us, such as economic gain in the case of a robbery or political benefits from an act of terrorism.
The second personal factor contributing to aggressive behavior is “habit strength,” the extent to which a given response has been rewarded or punished in the past. Other things being equal (which they rarely are), the more often we have been reinforced for aggressive behavior in the past, or the more we have observed others being rewarded for aggression, the more likely we are to aggress in the future.
Instigation to aggression and habit strength both induce us to act aggressively. What deters us? Opposing the motivational factors is the third set of personal variables, namely “inhibitions against aggression.” Inhibitions include all the reasons we might refrain from a given aggressive act directed at a particular target. They include both moral prohibitions, which cause us to feel that this act of aggression is wrong, and practical considerations, such as our fear of retaliation. Inhibitions can be general or specific and can vary as a function of the act, the target, and the circumstances.
Instigation, habit strength, and inhibitions are all personal characteristics, but behavior results from individuals interacting with their milieus. The next set of variables is comprised of “situational factors” which encompass all those external factors that may influence the likelihood that we will engage in aggressive behavior. These include “environments,” “settings,” “situations,” and “stimuli,” and they may either facilitate or impede aggressive behavior.
“Reaction potential,” my last major construct, consists of the net strength of any given response after all the inhibitory factors, both personal and situational, have been balanced against the excitatory ones. A response will be blocked and cannot occur whenever the inhibitions exceed the instigation. A response is “possible” (i.e., has a positive reaction potential) if the forces favoring the aggressive response exceed those opposing it. After all the possible responses compete with one another, the one with the highest reaction potential, that is, the capacity to satisfy the most needs at the least cost, should be chosen (Megargee, 1993).
Given this overall conceptual framework, we can then ask salient questions such as, “What are the causes of anger? What happens to instigation once it is aroused? What factors increase or decrease our inhibitions? How might environmental factors be manipulated to increase or decrease the likelihood of aggressive behavior?” General questions such as these can guide psychological research on aggression.
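The bookkeeping implied by the algebra of aggression can be illustrated with a small sketch. Everything concrete in it is an assumption made for illustration only: the chapter describes the constructs verbally and does not give a formula, so the simple additive combination, the arbitrary numeric scale, the class and function names, and the three example responses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """One candidate response, scored on an arbitrary, hypothetical scale."""
    name: str
    instigation: float        # intrinsic plus extrinsic motivation for the act
    habit_strength: float     # past reinforcement for this kind of response
    inhibitions: float        # moral and practical reasons to refrain
    situational: float = 0.0  # positive if the situation facilitates, negative if it impedes

    def reaction_potential(self) -> float:
        # Assumed additive form: excitatory forces minus inhibitory forces.
        return (self.instigation + self.habit_strength
                + self.situational - self.inhibitions)


def choose_response(responses):
    """Return the possible response with the highest reaction potential.

    Responses whose reaction potential is not positive are treated as blocked.
    Returns None if every response is blocked.
    """
    possible = [r for r in responses if r.reaction_potential() > 0]
    if not possible:
        return None
    return max(possible, key=Response.reaction_potential)


# Hypothetical competing responses to a provocation.
candidates = [
    Response("walk away", instigation=2.0, habit_strength=1.0, inhibitions=0.5),
    Response("verbal insult", instigation=4.0, habit_strength=1.5,
             inhibitions=2.0, situational=0.5),
    Response("physical assault", instigation=5.0, habit_strength=0.5,
             inhibitions=7.0),
]

chosen = choose_response(candidates)
print(chosen.name if chosen else "no response possible")
```

With these made-up values the physical assault is blocked because its inhibitions outweigh the forces favoring it, and the verbal insult wins the competition among the remaining responses. Changing a single term, for example lowering the inhibitions against assault, changes which response is selected, which is the kind of question the framework is meant to raise.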
appear to depend on previous learning and cultural values.
ASSESSING AND TREATING INTRINSIC INSTIGATION
Obviously, the more intense and long-lasting the instigation, the easier it is to assess. Because anger and rage are transitory, we may not observe them in a single session. As with the assessment of other emotions, the more often we interact with a subject, the greater the range of emotions, including anger, we are likely to observe.
Our primary assessment questions are the amount, the duration, and the direction of instigation to aggression: “Is the subject angry?” “How angry?” “How long does he stay angry?” “At whom is she angry?” The secondary questions are what causes this instigation and how the subject deals with it: “What angers him?” “What does she do when she is angry?”
One of the best ways to assess anger and hostility is also the simplest: asking subjects, using a variety of synonyms, what makes them angry or annoyed, who aggravates them, what they would like to do when they are provoked, to whom they would like to do it, and what it is they actually do and why. Anger is a strong emotion; given the opportunity, many people will express it quite openly, even when it is not in their best interests to do so. Because anger and hostility are difficult to conceal, their friends and associates should also be able to comment on a client’s temper.
A number of self-report tests and scales have been constructed to assess anger and hostility. The Buss–Durkee Hostility Guilt Inventory (Buss & Durkee, 1957) features seven scales designed to assess various aspects of hostility and aggression, plus one for assessing guilt over its expression, a potential measure of inhibitions. Spielberger’s (1996) State Trait Anger Expression Inventory (STAXI) has three standard scales, “Anger out” (Ax/out), “Anger in” (Ax/in), and “Anger control” (Ax/con), as well as two experimental subscales, “Anger control in” (Ax/con-in) and “Anger control out” (Ax/con-out). Novaco and his colleagues have devised several measures related to anger and its causes, including the Novaco Anger Scale (NAS; Novaco, 1994, 2003), the Novaco Provocation Inventory (NPI; Novaco, 1975, 1988), and the Dimensions of Anger Reaction Scale (DAS and DAS-5; Hawthorne, Mouthaan, Forbes, & Novaco, 2006).
Scales designed to assess anger and hostility are also included on self-report inventories such as the MMPI-2 (Butcher, Dahlstrom, Graham, Tellegen, & Kaemmer, 1989). Since instigation to aggression has been implicated as a factor in cardiovascular and gastrointestinal disorders, a number of measures of anger and hostility have been devised for use in behavioral medicine (Chesney & Rosenman, 1985). Most of these scales have obvious content and can easily be dissembled.
Researchers have also devised observational rating instruments such as the Ward Anger Rating Scale (WARS; Novaco, 1994; Novaco & Taylor, 2004) designed to allow staff members to record and quantify inpatients’ angry and aggressive behavior over the course of a week.
Ascertaining the sources of instigation to aggression can have important clinical implications for assessment and for treatment. The court that sentenced Mary Winkler to the time she had already served in jail plus some additional counseling probably concluded that she was unlikely to engage in further violence since, by killing her abusive husband, she had already eliminated her major source of instigation to aggression.
With regard to treatment, a clinician might investigate whether a client’s frustration and resulting anger stem from unreasonable expectations. If so, cognitive therapy designed to correct these beliefs might be effective (Feindler & Ecton, 1986; Goldstein & Keller, 1987). On the other hand, if the grievance is genuine and the anger justified, a better solution might be to help the client find a more constructive way of coping with the frustrating situation, such as seeking redress in the courts. Cognitive redefinition, especially through humor, is another effective mechanism for reducing instigation (Megargee, 1993).
Extrinsic (Instrumental) Instigation to Aggression
People aggress not only because they want to hurt the victim, but also because they hope aggression may help them attain other goals. As Al Capone reportedly said, “You can get much farther with a kind word and a gun than you can with a kind word alone” (Peter, 1977, p. 141).
SOURCES OF EXTRINSIC INSTIGATION
There is a vast array of possible extrinsic motives, both primary and secondary. Primary motives include (a) personal gains and satisfactions, such as acquisition of property or enhancement of self-esteem; (b) removal of problems or impediments, such as eliminating enemies or witnesses to crimes; (c) achieving social goals, such
Although direct reinforcement of our own aggressive responses is most effective, we can also develop habit strength indirectly by observing role models successfully engage in aggressive behavior. Children can learn aggressive scripts from observing domestic violence, from watching countless portrayals of aggressive behavior in the media, and from playing violent video games. These aggressive scripts, once learned, can then serve as guides for future aggressive behavior in real-life situations (Anderson & Bushman, 2002; Bandura, 1973; Huesmann, 1986, 1998; Huesmann & Eron, 1986). For example, in the videos Seung-Hui Cho mailed to NBC the day of his multiple killings at Virginia Tech, he acknowledged the influence of Dylan Klebold and Eric Harris’s killing spree at Columbine High School in 1999, referring to them as “martyrs” (Healy, 2007).
REDUCING HABIT STRENGTH
Once acquired, aggressive habits and scripts are extraordinarily difficult to eliminate. As we all learned in Psychology 101, extinction is the only certain technique for eliminating habits. However, extinction requires the elimination of any reinforcement for aggressive behavior, which is virtually impossible in our culture. Although punishment may temporarily suppress aggressive behavior, it does not eliminate aggressive responses from our repertoires. Moreover, to the extent that punishment is frustrating or perceived as an attack, it can also increase intrinsic instigation. This is one reason prisons are often ineffective.
ASSESSING AND TREATING HABIT STRENGTH
Of all the constructs in the algebra of aggression, habit strength should be the easiest to assess, based as it is on the individual’s reinforcement history as documented in the case history. Habit strength is the variable that most strongly predicts aggression; the longer and stronger the history of aggression, the more likely an individual will behave similarly in the future. One can infer strong habit strength from a long history of aggressive behavior, especially if the aggression appears to have been successful. A history of aggressive behavior is a key element in diagnosing Megargee’s (1966) undercontrolled assaultive type, and is central to the objective prediction devices used in many mental health and correctional risk assessments (Borum, 1996; Otto, 2000). However, as in the case of the overcontrolled assaultive type, the fact that some people have no known history of aggression or violence is no guarantee that they will never aggress in the future.
With regard to treatment, Brown and Elliott’s (1965) experiment also demonstrated how difficult it is to eliminate habits once they have been formed. As we noted, both verbal and physical aggression diminished during their experiment when their nursery school teachers were instructed to minimize the amount of attention they paid to aggressive students. Unfortunately, fighting resumed once the experiment was over, presumably because the teachers could not bring themselves to continue ignoring physical aggression while rewarding only prosocial behavior.
Inhibitions against Aggression
Opposing instigation and habit strength are inhibitions against aggression. Inhibitions include both moral prohibitions and pragmatic concerns. They can be general or specific, lasting or temporary, and they vary as a function of the aggressive act, the target, and the circumstances.
SOURCES OF INHIBITIONS
Moral prohibitions go by various terms such as “taboos,” “conscientiousness,” and “superego.” In the course of normal development, we all learn that certain aggressive acts directed at particular people are morally wrong. For example, in most societies aggression against one’s parents, especially one’s mother, is considered wrong, while attacking an enemy, especially one who is from an alien culture, is more acceptable.
Pragmatic issues are another source of internal inhibitions. One practical concern is the fear that bad things, such as punishment or retribution, may occur following a particular aggressive act. Fearing he might be attacked by a mob following his planned assassination of President James A. Garfield in 1881, Charles Guiteau hired a cab to wait and take him to the DC prison as soon as he had shot the president (Clark, 1982).
Another concern is the fear that the proposed act might fail to accomplish its objective. On several occasions in 1972, Arthur Bremer set out to assassinate President Richard Nixon, but was deterred when he was unable to get close enough to get a clear shot at the President. In a classic case of displacement, he next targeted and eventually shot Gov. George Wallace, who was running for the Democratic nomination for president (Clark, 1982).
REDUCING INHIBITIONS
Unfortunately, it is much easier to decrease inhibitions than it is to foster them. Physiological factors that reduce inhibitions include (a) injuries or
MEGARGEE 551
(a) living in a society that encourages aggressive behavior (environment); (b) being in a war zone such as any of a number of Middle Eastern trouble spots (setting); (c) being caught up in the general chaos of a firefight or riot (situation); and (d) being attacked by an antagonist (stimulus).
Factors that might inhibit violence include (a) living in a peaceful and harmonious society (environment); (b) attending a Quaker religious service (setting); (c) being in the presence of loved ones or authorities who disapprove of aggression (situation); and (d) being lovingly embraced by a potential target of aggression (stimulus). The boundaries between these terms are not significant; the point is that a wide range of external factors and events helps determine whether or not an aggressive act takes place. The common denominator for these events, according to Monahan and Klassen (1982), is that they all occur outside our skins.
Situational influences involve times as well as places; much more violence occurs in New York City’s Central Park between 2:00 and 4:00 A.M. on Sunday morning than from 2:00 to 4:00 P.M. on Sunday afternoon. Of course, personality and situational factors are not independent. Those peaceful people who flock to Central Park on a sunny afternoon generally shun it in the early morning hours, leaving it to the predators and police.
Contagion effects are also important. Berkowitz (1970) demonstrated an increase in violence following highly publicized aggressive crimes. We have already noted the fact that Seung-Hui Cho was influenced by the Columbine High School killings. In addition to his Virginia Tech massacre, there have been at least 25 high school shootings involving multiple victims in the United States as well as at least 9 mass school shootings in other countries (Infoplease.com, 2007).
The circumstances and stimuli to which clients are exposed and the conditions in which they live are extremely important determinants of aggression. Cognitive expectancies interacting with situational realities can cause frustration, the major source of intrinsic instigation (Dollard et al., 1939). Environmental factors can moderate the relation of personality factors, such as empathy, to aggression (Rose & Feshbach, 1990). Given strong enough threat or provocation, almost anyone may become violent.
Since this book is about personality assessment, space does not permit a discussion of all the external factors that influence aggression. Among those that have been emphasized in the literature are ambient temperature (Anderson & Bushman, 2002; Baron & Ransberger, 1978; Megargee, 1977); architectural design (Newman, 1972); crowding (Megargee, 1977; Russell, 1983); the family, peer, and job environments; the availability of alcohol and of potential victims (Monahan & Klassen, 1982); and the behavior of antagonists, victims, associates, and bystanders (Megargee, 1993; Wolfgang, 1958). Perhaps the most widely discussed variable is access to weapons by either or both antagonists (Cook, 1982; MacDonald, 1975; Monahan & Klassen, 1982).
ASSESSING SITUATIONAL FACTORS
In both prospective and retrospective assessments, clinicians must consider the influence of situational as well as personality factors. There is no substitute for familiarity with a client’s various environments and living conditions. First-hand appraisals are the best source of information. This is done most easily in hospitals and prisons in which one is attempting to predict institutional violence. It is more difficult in community settings, but if home visits can be arranged they are invaluable. In most cases, however, we must rely on descriptions in case history material, police reports, and information gained in interviews with clients.
Compared with personality measures, there are few instruments for systematically assessing situational influences. An exception is the Correctional Institution Environment Scale (CIES; Moos, 1975), a 90-item instrument scored on nine environmental scales that is designed to be administered to inmates and staff. The CIES has been used more to evaluate the effects of various program changes on the social climate of prisons than as a measure to be considered in analyzing the causes of institutional aggression.
In recent years, several instruments have been devised to assess bullying and identify bullies in schools using both self- and peer reports. However, their reliability and validity have not yet been well established (Cornell, Sheras, & Cole, 2006).
Bjørkly (1993, 1994) devised a situational “Scale for the Prediction of Aggression and Dangerousness in Psychotic Patients,” which is designed to predict the frequency and intensity of psychiatric patients’ aggressive behavior in 29 different situations, both on the ward and off. Recent research (Bjørkly, Havik, & Løberg, 1996) has attested to its inter-rater reliability, but its validity is yet to be established, as is its applicability outside of Norway.
Monahan and Klassen (1982) suggested that we
Response Competition and Reaction Formation
An aggressive or violent act will be blocked if inhibitory factors exceed motivating ones. However, if all the forces favoring a particular response exceed those opposing it, that act is possible, at least as long as those conditions prevail. Before it can be carried out, however, all the possible responses, both violent and nonviolent, must compete with one another. The one with the highest reaction potential (that is, the greatest capacity to satisfy the most needs at the least cost) should be chosen.
ASSESSING REACTION POTENTIAL
In a retrospective analysis of aggressive behavior, it is obvious that the act that was actually performed had the highest reaction potential. If that act was socially or personally undesirable, and the clinical goal is to decrease the likelihood of a recurrence, other more satisfactory alternatives should be investigated. Why were these acts not selected? Were the inhibitions too great? Their chance of success too low? Did cognitive beliefs suggest that the aggressive act was the more honorable alternative? Did friends or onlookers encourage the aggressive behavior? The answers to such questions might indicate how the reaction potential of the objectionable behavior may be decreased and that of more acceptable responses increased.
In prospective assessments, the clinician’s task is to try to determine the range of possible responses and estimate their relative reaction potential. Situational circumstances, such as whether or not certain interventions are attempted, may be critical.
In the 1950s and 1960s, diagnoses of dangerousness were made clinically by psychiatrists, psychologists, and case workers using whatever criteria their training and experience suggested. In the 1970s, researchers began examining the accuracy of these diagnoses by conducting follow-ups of psychiatric patients who had been labeled dangerous and subsequently released into the community. Official records such as arrest reports indicated that only 20–35% of these patients actually engaged in aggressive behavior after they had been released (Kozol, Boucher, & Garafalo, 1972; Steadman & Cocozza, 1974). More recent studies using more sophisticated assessments and improved research methods now suggest that the accuracy of short-term clinical predictions of violence among acute psychiatric patients may be as high as 50% (Borum, 1996; Monahan, 1996; Otto, 1992). This improved accuracy still risks classifying many nonaggressive people as potentially violent (“false positives”), especially in settings in which aggressive behavior is infrequent.
Clinical risk assessment has been criticized as being overly subjective and potentially influenced by illusory correlation, stereotypes, and hindsight bias (Towl & Crighton, 1995). This was illustrated by Cooper and Werner’s (1990) investigation of the abilities of 10 psychologists and 11 case workers to predict institutional violence based on 17 variables in a sample of 33 male federal correctional institution inmates, 8 of whom (24%) were violent and 25 of whom (76%) were not. They found that inter-judge reliabilities were quite low, averaging only .23 among pairs of judges. Pooled judgments were substantially more reliable than individual assessments. Accuracy was appalling; the psychologists’ predictive
accuracy averaged only –.08 (range = –.25 to +.22), while the case workers’ mean accuracy was +.08 (range = –.14 to +.34). The main reason for the inaccuracy appeared to be illusory correlation, with judges often using cues that proved to be unrelated to the criterion.
Subjective classification procedures are difficult to document, which can result in a lack of oversight and accountability. Correctional facilities relying on subjective appraisals have been plagued by chronic “overclassification,” with prisoners being assigned to more restrictive conditions of confinement than necessary. In a series of cases, the courts held that subjective classifications were too often arbitrary, capricious, inconsistent, and invalid (Austin, 1993; Solomon & Camp, 1993). In Laaman v. Helgemoe (437 F. Supp. 318, D.N.H. 1977, quoted by Solomon & Camp, 1993, p. 9), the court stated that prison classification systems “cannot be arbitrary, irrational, or discriminatory,” and in Alabama the court took over the entire correctional system and ordered every inmate reclassified (Fowler, 1976).
PERSONALITY TESTS
Psychological tests developed in other contexts and for other purposes have also been used in risk assessment. The original MMPI and its successor, the MMPI-2, are the world’s most widely used and researched clinical inventories.2 Megargee and Mendelsohn (1962) found that extremely violent and moderately violent male applicants for probation did not differ from nonviolent criminals and noncriminals on a dozen original MMPI scales especially designed to measure hostility and aggression.
Megargee and Carbonell (1995) correlated a number of MMPI scales with measures of institutional adjustment and violence. The correlations, while significant, were generally too low to be of much value in prospective risk assessment, and multiple regression equations were not much better.
Bohn (1979) was able to reduce the incidence of serious assaults in a federal correctional institution by 46% by using the MMPI-based offender classification system (Megargee et al., 1979, 2001) to identify those inmates most likely to be predatory and those most likely to be preyed upon so they could be housed in different dormitories. However, the MMPI-2 is better at assessing offenders’ attitudes, mental health, emotional adjustment, and need for treatment or other professional interventions (“needs assessment”) than it is at estimating how dangerous they are (Megargee, 2006b).
The psychological test with the best track record with regard to violence risk assessment is Hare’s (2003) PCL-R. The PCL-R was devised to assess the construct of psychopathy as originally delineated by Cleckley (1941/1976). Since 1980, an impressive array of empirical evidence from personality, physiological, and cognitive psychological research as well as from criminology and corrections has attested to the construct validity of the PCL-R. Barone (2004, p. 113) recently referred to the PCL-R as the “gold standard” for the measurement of psychopathy.
PCL-R assessments require a thorough review of the clinical, medical, and legal records, followed by a clinical interview in which a complete chronological case history is obtained. For research purposes, it is possible to compute PCL-R scores based only on (extensive) file data (Grann, Långström, Tengström & Kullgren, 1999), but this is not recommended (Meloy & Gacono, 1995).
The PCL-R consists of 20 symptoms of psychopathy, each of which is scored on a 3-point scale from 0 (absent) to 2 (clearly present). Inter-rater reliabilities average .86, and users are advised to base assessments on the average of two or more independent ratings whenever possible (Hare et al., 1990). Although Hare et al. (1990) regard the PCL-R as a homogeneous, unidimensional scale based on its average alpha coefficient of .88, there are two well-defined factors. The first reflects an egocentric, selfish interpersonal style with its principal loadings from such items as glibness/superficial charm (.86), grandiose sense of self-worth (.76), pathological lying (.62), conning/manipulative (.59), shallow affect (.57), lack of remorse or guilt (.53), and callousness/lack of empathy (.53). The items loading on the second factor suggest the chronic antisocial behavior associated with psychopathy: impulsivity (.66), juvenile delinquency (.59), and need for stimulation, parasitic lifestyle, early behavior problems, and lack of realistic goals (all loading .56) (Hare et al., 1990).
Reviewing a number of empirical investigations, both retrospective and prospective, Hart (1996) reported that psychopaths, as diagnosed by the PCL-R, had higher rates of violence in the community and in institutions than nonpsychopaths, and that psychopathy, as measured by the PCL-R, was predictive of violence after admission to a hospital ward and also after conditional release from a hospital or correctional institution. He estimated that the average correlation of PCL-R scores with violence in these studies was about .35. In their
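The scoring arithmetic described for the PCL-R (20 items, each rated 0 to 2, with totals averaged over two or more independent raters) can be sketched in Python as follows. This is only an illustration of the arithmetic: the function names and the example ratings are hypothetical, and nothing here reproduces the instrument’s item definitions, administration rules, or any cutting score.

```python
from statistics import mean

N_ITEMS = 20               # the text describes 20 items
VALID_RATINGS = (0, 1, 2)  # 3-point scale: 0 (absent) to 2 (clearly present)


def total_score(ratings):
    """Sum one rater's 20 item ratings into a total (possible range 0 to 40)."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    if any(r not in VALID_RATINGS for r in ratings):
        raise ValueError("each rating must be 0, 1, or 2")
    return sum(ratings)


def averaged_total(rating_sheets):
    """Average the totals of two or more independent raters."""
    if len(rating_sheets) < 2:
        raise ValueError("averaging needs at least two independent ratings")
    return mean(total_score(sheet) for sheet in rating_sheets)


# Hypothetical ratings from two raters (illustrative values only).
rater_a = [2] * 5 + [1] * 5 + [0] * 10
rater_b = [2] * 6 + [1] * 5 + [0] * 9
print(total_score(rater_a), total_score(rater_b), averaged_total([rater_a, rater_b]))
```

Averaging over raters is the step the text recommends; with only one rating sheet the sketch refuses to average rather than silently returning a single rater’s total.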
Table 28.1. Objective devices (continued).
(g) companions, (h) alcohol/drug, (i) emotional/personal, and (j) attitude/orientation. It thus utilizes both static and dynamic information. Studies of the LSI have reported adequate reliability and significant associations of LSI total scores with success or failure in institutional settings, on probation and parole, and with general recidivism among both male and female offenders (Catchpole & Gretton, 2003; Girard & Wormith, 2004; Hollin & Palmer, 2006; Kroner & Mills, 2001; Palmer & Hollin, 2007; Simourd, 2004). However, the LSI-R’s associations with violent recidivism are less robust than those with general recidivism (Daffern et al., 2005; Gendreau et al., 2002). Girard and Wormith (2004) recently added a new specific/risk section to the latest “Ontario” revision of the LSI to improve its assessment of aggressive behavior.
OBJECTIVE ASSESSMENT TOOLS: ACTUARIALLY DERIVED
Whereas rationally constructed instruments try to capture the judgments of classification experts and apply them in a standard fashion, actuarial tools are derived by selecting items that previous research has shown are empirically related to the behavior in question, such as institutional misbehavior or recidivism. These “risk factors” are then combined into a predictive scheme. Some investigators use multiple regression equations or weighted discriminant functions, some use simple additive models in which points are assigned for each risk factor, and some use decision trees. These schemes seem to work best in correctional mental health facilities, which typically have more data available, including psychological evaluations, than ordinary correctional institutions. It is generally agreed that, as a group, actuarially derived objective instruments outperform those that are rationally constructed (Barbaree, Seto, Langston, & Peacock, 2001; Dawes, Faust, & Meehl, 1989; Grove & Meehl, 1996; Grove, Zald, Hallberg, Lebow, Snitz, & Nelson, 2000; Kroner et al., 2007). However, in the area of risk assessment, it appears that the best rational and actuarial devices are comparable in their validity (Grann, Belfrage, & Tengström, 2000).
The best-established actuarial instrument for the prediction of violent recidivism is the Violence Risk Appraisal Guide (VRAG; Harris et al., 1993), which was derived from a population of 618 violent
offenders at a maximum-security psychiatric institution, 191 of whom (31%) engaged in further violence after their eventual release. The clinical records of these “violent failures” were combed, coded, and compared with those of the 427 violent offenders who did not subsequently engage in violent recidivism. Multivariate analyses identified 12 variables from the 50 or so studied which significantly differentiated the two criterion groups. They included three history variables (elementary school maladjustment, separation from parents before age 16, and never marrying), four clinical history variables (history of alcohol abuse, DSM-III diagnoses of schizophrenia or personality disorder, and score on the PCL-R), two overall criminal history variables (record of property offenses and prior failures on conditional release), and three factors regarding the index offense (injuries to the victim, gender of victim, and age of offender at the time of the offense). Interestingly, the PCL-R was the variable most closely related to violent recidivism. These 12 variables were differentially weighted (with the PCL-R receiving the highest weight) and combined to yield a total VRAG score which was used to assign the subjects to nine categories ranging from the lowest to the highest chance of recidivism (Harris et al., 1993).
Because the VRAG consists only of unchanging static items, Webster and Polvi (1995) created the ASSESS-LIST, a device consisting of 10 more dynamic clinical variables: antecedent history, self-presentation, social and psychosocial adjustment, expectations and plans, symptoms, supervision, life factors, institutional management, sexual adjustment, and treatment progress. Together, the VRAG and ASSESS-LIST make up the Violence Prediction Scheme (VPS; Webster, Harris, Rice, Cormier, & Quinsey, 1994).
Actuarially derived tools should be cross-validated on new independent samples before being used, and instruments devised in one setting or locale should be checked for accuracy before they are adopted elsewhere (Craig, Beech, & Brown, 2007). The VRAG has been successfully cross-validated on new samples of serious mentally disordered offenders by its creators and independent researchers (Douglas et al., 2005; Grann et al., 2000; Harris, Rice, & Cormier, 2002; Harris et al., 1993; Kroner & Mills, 2001; Loza & Dhaliwal, 1997; Quinsey, Harris, Rice, & Cormier, 1998; Rice & Harris, 1994, 1995).
Given its reliance on psychiatric diagnoses and psychological factors involved in previous violence, it is questionable whether the VPS could be applied in many correctional settings or to patients who have no prior history of violence.
STRUCTURED PROFESSIONAL JUDGMENT
Whether they are rationally or actuarially derived, the essence of objective assessment tools such as the LSI-R and the VRAG is that decisions regarding the risks for aggression posed by those being assessed are based entirely on the scores they obtain on the instruments or the classifications they receive from the decision tree. In “structured professional judgment” (SPJ), objective tools are used to guide clinical judgments to a greater or lesser degree (Douglas et al., 2005).
A number of SPJ instruments have been developed (Douglas et al., 2005).5 Of these, the best known and most widely used is the rationally constructed HCR-20 (Douglas & Webster, 1999; Webster, Douglas, Eaves, & Hart, 1997). The HCR-20 consists of 20 records-based ratings in three categories: historical, clinical, and risk management. The ten historical (H) items are similar to the items on the VRAG. They include history of (a) previous violence, (b) young age at first violent incident, (c) relationship instability, (d) employment problems, (e) substance abuse, (f) history of major mental illness, (g) psychopathy as measured by the PCL-R, (h) early maladjustment, (i) personality disorder, and (j) prior supervision failure (Douglas & Webster, 1999).
The five clinical (C) items reflecting current emotional status are (a) lack of insight, (b) negative attitudes, (c) active symptoms of major mental illness, (d) impulsivity, and (e) unresponsiveness to treatment.
The five risk management (R) items are (a) plans lacking feasibility, (b) exposure to destabilizers, (c) lack of personal support, (d) lack of compliance with remediation attempts, and (e) stress (Douglas & Webster, 1999). These dynamic C and R items, which can change to reflect a client’s current situation, are the clearest difference between the HCR-20 and instruments that are based exclusively on historical or file data.
Because it is designed to guide structured clinical judgments, the HCR-20 does not have recommended cutting scores or algorithms that categorize people. Instead, it produces numerical scores on each of its three scales as well as a total score that clinicians are encouraged to consider when deciding whether people are at low, medium, or high risk of violent offending (Douglas et al., 2005). Nevertheless, a
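The structure just described for the HCR-20 (ten H items, five C items, five R items, yielding three scale scores and a total, with no built-in cutting score) can be sketched in Python as follows. The 0 to 2 item coding is an assumption made here for illustration, since the passage above does not state the item range; the function name and example ratings are likewise hypothetical.

```python
SCALE_SIZES = {"H": 10, "C": 5, "R": 5}  # item counts given in the text
MAX_ITEM_RATING = 2  # ASSUMPTION: 0-2 item coding; not stated in the passage


def hcr20_scores(ratings):
    """Return the H, C, and R scale scores plus the total score.

    `ratings` maps each scale letter to its list of item ratings.
    No risk category is returned: the low/medium/high judgment is left
    to the clinician, as the text describes.
    """
    scores = {}
    for scale, n_items in SCALE_SIZES.items():
        items = ratings[scale]
        if len(items) != n_items:
            raise ValueError(f"scale {scale} needs {n_items} ratings")
        if any(not 0 <= r <= MAX_ITEM_RATING for r in items):
            raise ValueError(f"ratings must lie between 0 and {MAX_ITEM_RATING}")
        scores[scale] = sum(items)
    scores["total"] = sum(scores[scale] for scale in SCALE_SIZES)
    return scores


# Hypothetical ratings for one client (illustrative values only).
example = {
    "H": [2, 1, 1, 0, 2, 0, 1, 1, 0, 2],
    "C": [1, 0, 1, 2, 0],
    "R": [0, 1, 1, 0, 1],
}
print(hcr20_scores(example))
```

The sketch deliberately stops at the scores. Unlike a purely actuarial scheme, it applies no threshold, which mirrors the distinction the text draws between objective tools and structured professional judgment.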
consequences of incorrect classifications (Megargee, 1976). In Bohn’s (1979) study, those prisoners labeled “potential predator” or “potential prey” based on their MMPI-based classifications were simply separated by being assigned to different dormitories. Their designations had no impact on their programming or other conditions of confinement. With such a benign outcome, Bohn could tolerate large numbers of false positives in both categories. However, in many cases, the consequences for false positives can be quite serious. For example, those diagnosed as dangerous sex offenders may be civilly committed or subjected to preventive detention over and above their criminal sentence; those released may be forced to register with the police, denied certain types of employment, and restricted with regard to where they can live (Janus, 2003; Kroner et al., 2007).
Although the emphasis in risk assessment is on diagnosing the most dangerous offenders, the greater contribution of these classification tools has been to identify low-risk individuals (true negatives) who can be placed in less secure settings or released to the community (Austin, 1993; Glaser, 1987; Solomon & Camp, 1993). When making subjective predictions of violence, assessors are often overly conservative, placing too many individuals in higher-than-necessary risk categories (Heilbrun & Heilbrun, 1995; Monahan, 1981, 1996; Proctor, 1994; Solomon & Camp, 1993). This is not surprising. There is little public concern over false positives who may be retained in overly restrictive mental health or correctional facilities, but there is an understandable outcry when a future mass murderer such as Charles Whitman or Seung-Hui Cho is assessed as not dangerous.
Reducing the extent of overclassification is especially important in criminal justice and correctional settings. First, it is the correct thing to do. The courts have consistently ruled that criminal offenders have the right to be maintained in the least restrictive settings consistent with maintaining safety, order, and discipline. Second, less restrictive settings are more economical. Confining a criminal offender in a maximum-security institution costs $3,000 a year more than a minimum-security facility and $7,000 more than a community setting. Third, all residents benefit because more programming is possible in less restrictive correctional settings, and the deleterious effects of crowding are diminished (Proctor, 1994).
Perhaps the greatest limitation on the utilization of objective risk assessment devices is the reluctance of clinicians to rely on such tools. The more complex the statistical procedures, the less likely decision makers are to use them (Brennan, 1993). In his classic paper, “Why I no longer attend case conferences,” Paul Meehl (1977) lamented the tendency for clinicians to place more faith in their subjective impressions than they do in base rates or the results of objective instruments. In a recent risk assessment study, Kroner and Mills (2001) found that experienced clinicians provided with base rate data apparently ignored them when estimating the likelihood of violent offending. As a result, they did no better than a contrast group of clinicians who had no information on base rates.
Hilton and Simmons (2001) reviewed the factors associated with decisions made by the Ontario Review Board on whether mentally disordered offenders charged with serious, usually violent, offenses should be detained in maximum security. VRAG scores, which were provided to the members of the board, were not related to the eventual security-level decisions. The most important single variable was the testimony of the senior clinician, although criminal history, psychotropic medication, institutional behavior, and even the physical attractiveness of the patient also played a role. They concluded, “contrary to current optimism in the field, actuarial risk assessment had little influence on clinical judgments and tribunal decisions about mentally disordered offenders in this maximum security setting . . . . It is apparent . . . that simply creating actuarial instruments and making the results available to decision makers does not alter long-established patterns of forensic decision making” (Hilton & Simmons, 2001, pp. 402, 406).
Notes
1. Although we have only considered human aggression, there is also a vast literature on animal aggression, which further complicates definitional issues. Because this book concerns clinical personality assessment, discussion has been limited to human aggression and violence, but readers should keep in mind that a comprehensive theory of aggressive behavior should be applicable to animal as well as human behavior.
2. Gendreau et al. (2002, p. 422) recently complained, “Is there an offender anywhere in North America who has not at some point been administered the MMPI/MMPI-2?”
3. Salekin et al. (1996, p. 211) hailed the PCL-R as being “unparalleled as a measure for making risk assessments.” Gendreau et al. (2002) took umbrage at this effusive praise and took 30 journal pages to explain why the LSI-R was superior to the PCL-R. Hemphill and Hare (2004) took offense at Gendreau et al.’s (2002) criticisms and spent 41 pages explaining why the PCL-R was at least as good as the LSI-R for predicting criminal justice criteria, even though the PCL-R was designed to assess the construct of psychopathy rather than predict recidivism.
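The chapter’s point about base rates (false positives become most numerous where violence is infrequent) can be illustrated with simple arithmetic. The Python sketch below is not drawn from any of the instruments discussed: the function name is hypothetical, and the sensitivity and specificity of .80 are assumed values chosen only for illustration. The 31% base rate is the one reported for the VRAG derivation sample; the 5% base rate is a hypothetical low-violence setting.

```python
def screening_outcomes(n_people, base_rate, sensitivity, specificity):
    """Expected classification counts when everyone in a group is assessed.

    sensitivity: proportion of violent people correctly flagged.
    specificity: proportion of nonviolent people correctly cleared.
    """
    violent = n_people * base_rate
    nonviolent = n_people - violent
    true_pos = violent * sensitivity
    false_neg = violent - true_pos
    true_neg = nonviolent * specificity
    false_pos = nonviolent - true_neg
    flagged = true_pos + false_pos
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        "true_neg": true_neg,
        "false_neg": false_neg,
        # Share of people flagged as dangerous who actually are violent.
        "share_of_flagged_violent": true_pos / flagged if flagged else 0.0,
    }


# Same hypothetical instrument, two different base rates.
for base_rate in (0.31, 0.05):
    result = screening_outcomes(1000, base_rate, sensitivity=0.8, specificity=0.8)
    print(base_rate, {name: round(value, 3) for name, value in result.items()})
```

Under these assumed values, about 64% of those flagged are actually violent when the base rate is 31%, but only about 17% when the base rate is 5%: the instrument has not changed, yet most of the people it flags in the low base rate setting are false positives.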
Chesney, M. A., & Rosenman, R. H. (Eds.). (1985). Anger and Douglas, K. S., Yeomans, M., & Boer, D. P. (2005). Comparative
hostility in cardiovascular and behavior disorders. Washington, validity analysis of multiple measures of violence risk in a
DC: Hemisphere. sample of criminal offenders. Criminal Justice and Behavior,
Clark, J. W. (1982). American assassins. Princeton, NJ: Princeton 32, 479–510.
University Press. Edens, J. F., Buffington-Vollum, J. K., Keilen, A., Roskamp, P.,
Cleckley, H. (1976). The mask of sanity. St. Louis, MO: C. V. & Anthony, C. (2005). Predictions of future violence in
Mosby (Original work published in 1941). capital murder trials: Is it time to disinvent the wheel? Law
Cohen, D., Nisbett, R. E., Bowdle, B. F., & Schwarz, N. (1996). and Human Behavior, 29, 55–86.
Insult, aggression, and the southern culture of honor: An Epperson, D. L., Kaul, J. D., & Hesselton, D. (1998). Final report
‘‘experimental ethnography.’’ Journal of Personality and Social of the development of the Minnesota Sex Offender Screening Test-
Psychology, 70, 945–960. Revised: MnSOST-R. Paper presented at the 17th Annual
Cohen, L. E., & Felson, M. (1979). Social change and crime rate Conference of the Association for the Treatment of Sexual
trends: A routine activity approach. American Sociological Abusers, Vancouver, BC.
Review, 44, 588–608. Feindler, E. L., & Ecton, R. B. (1986). Adolescent anger control:
Cook, P. I. (1982). The role of firearms in violent crime: An Cognitive behavioral techniques. New York: Pergamon.
interpretative review of the literature. In M. E. Wolfgang & Fowler, R. (1976). Sweeping reforms ordered in Alabama prisons.
N. A. Weiner (Eds.), Criminal violence (pp. 236–291). APA Monitor, 7(4), 1, 15.
Beverly Hills, CA: Sage. Gendreau, P., Goggin, D. S., & Smith, P. (2002). Is the PCL-R
Cooper, R. P., & Werner, P. D. (1990). Predicting violence in really the ‘‘unparalleled’’ measure of offender risk? Criminal
newly admitted inmates. A lens model of staff decision Justice and Behavior, 29, 397–426.
making. Criminal Justice and Behavior, 17, 431–447. Girard, L., & Wormith, J. S. (2004). The predictive validity of the
Copas, J., & Marshall, P. (1998). The Offender Group Level of Service Inventory-Ontario Revision on general and
Reconviction Scales: A statistical reconviction scale for use violent recidivism among various offender groups. Criminal
by probation officers. Applied Statistics, 47, 159–171. Justice and Behavior, 31, 150–181.
Cornell, D. G., Sheras, P. L., & Cole, J. C. (2006). Assessment of Glaser, D. (1987). Classification for risk. In D. M. Gottfredson &
bullying. In S. R. Jimerson & M. Furlong (Eds.), Handbook of M. Tonry (Eds.), Prediction and classification in criminal
school violence and school safety: From research to practice justice decision making (pp. 249–291). Chicago: University
(pp. 191–209). Mahwah, NJ: Lawrence Erlbaum. of Chicago Press.
Costa, P. T., Jr., & McCrae, R. R. (1992). NEO-PI-R professional Goldstein, A. P., & Keller, H. (1987). Aggressive behavior:
manual. Odessa, FL: Psychological Assessment Resources. Assessment and intervention. New York: Pergamon.
Craig, L. A., Beech, A. R., & Brown, K. D. (2007). The impor- Gottfredson, D. M. (1987). Prediction and classification in crim-
tance of cross-validating actuarial measures. Forensic Update, inal justice decision making. In D. M. Gottfredson &
40, 34–39. M. Tonry (Eds.), Prediction and classification in criminal
Cunningham, M. D., & Reidy, T. J. (1999). Don’t confuse me justice decision making (pp. 1–20). Chicago: University of
with the facts: Common errors in risk assessment at capital Chicago Press.
sentencing. Criminal Justice and Behavior, 26, 20–43. Gough, H. G., & Bradley, P. (1996). CPI manual (3rd ed.). Palo
Cunningham, M. D., & Sorenson, J. R. (2006). Actuarial models for Alto, CA: Consulting Psychologists Press.
assessing prison violence risk: Revision and extension of the Risk Grann, M., Belfrage, H., & Tengström, A. (2000). Actuarial
Assessment Scale for Prison (RASP). Assessment, 13, 253–265. assessment of risk for violence: The predictive validity of the
Cunningham, M. D., Sorenson, J. R., & Reidy, T. J. (2005). An VRAG and the historical part of the HCR-20. Criminal Justice
actuarial model for prison violence risk among maximum and Behavior, 27, 97–114.
security inmates. Assessment, 12, 40–49. Grann, M., Lǻngström, N., Tengström, A., & Kullgren, G. (1999).
Daffern, M., Ogloff, J. R. P., Ferguson, M., & Thomson, L. Psychopathy (PCL-R) predicts violent recidivism among criminal
(2005). Assessing risk for aggression in a forensic psychiatric offenders with personality disorders in Sweden. Law and Human
hospital using the Level of Service Inventory-Revised: Behavior, 23, 205–217.
Screening version. International Journal of Forensic Mental Grinberg, E. (2007). Jury convicts Mary Winkler of voluntary
Health, 4, 201–206. manslaughter in husband’s shooting death. Court Radio/TV,
Dahle, K.-P. (2006). Strength and limitations of actuarial predic- June 12, 2007, 11:16 a.m. ET.
tion of criminal reoffence in a German prison sample: A Grove, W. M., & Meehl, P. E. (1996). Comparative efficiency of
comparative study of LSI-R, HCR-20, and PCL-R. informal (subjective, impressionistic) and formal (mechanical,
International Journal of Law and Psychiatry, 29, 431–442. algorithmic) prediction procedures: The clinical-statistical
Dawes, R. M., Faust, D., & Meehl, P. E. (1989). Clinical versus controversy. Psychology, Public Policy, and the Law, 2,
actuarial judgment. Science, 243, 1668–1674. 293–323.
Dollard, J., Doob, L. W., Miller, N. E., Mowrer, O. H., & Sears, Grove, W. M., Zald, D. H., Hallberg, A. M., Lebow, B., Snitz, E.,
R. R. (1939). Frustration and aggression. New Haven, CN: & Nelson, C. (2000). Clinical versus mechanical judgment: A
Yale University Press. meta-analysis. Psychological Assessment, 12, 19–30.
Dorland, M., & Krauss, D. (2005). The danger of dangerousness Hanson, R. K. (1997). Development of a brief scale for sexual
in capital sentencing: Exacerbating the problem of arbitrary offender recidivism. Ottawa, ON: Public Works and
and capricious decision-making. Law and Psychology Review, Government Services of Canada.
29, 63–105. Hanson, R. K., & Thornton, D. (1999). STATIC-99: Improving
Douglas, K. S., & Webster, C. D. (1999). The HCR-20 violence actuarial risk assessments for adult sex offenders. User report
risk assessment scheme: Concurrent validity in a sample of 1999–2002. Ottawa, ON: Department of the Solicitor
incarcerated offenders. Criminal Justice and Behavior, 26, 3–19. General of Canada.
MEGARGEE 563
Kropp, P. R., Hart, S. D., Webster, C. D., & Eaves, D. (1994). Megargee, E. I. (1997). Internal inhibitions and controls. In
Manual for the Spousal Assault Risk Assessment Guide. S. R. Briggs, R. Hogan, & W. H. Jones (Eds.), Handbook
Vancouver, BC: Canada Institute on Family Violence. of personality psychology (pp. 581–614). New York:
Levene, K. S., Augimeri, L. K., Pepler, D. J., Walsh, M. M., Academic Press.
Webster, C. D., & Koegl, C. J. (2001). Early Assessment Risk Megargee, E. I. (2003). Psychological assessment in correctional
List for Girls (EARL-G), Version 2. Toronto, ON: Earl’s Court settings. In J. R. Graham & J. A. Naglieri (Eds.), Handbook of
Child and Family Centre. psychology, Vol. 10: Assessment psychology (pp. 365–388).
Loza, W., & Dhaliwal, G., (1997). Psychometric evaluation of the New York: John Wiley & Sons.
Risk Appraisal Guide (RAG): A tool for assessing violent Megargee, E. I. (2006a). Using the MMPI-2 in correctional
recidivism. Journal of Interpersonal Violence, 12, 779–793. settings. In J. N. Butcher (Ed.), MMPI-2: A handbook for
Loza, W., Dhaliwal, G., Kroner, D., & Loza-Fanous, A. (2000). practitioners (pp. 327–360). Washington, DC: American
Reliability, construct and concurrent validities of the Self Psychological Association.
Appraisal Questionnaire. Criminal Justice and Behavior, 27, Megargee, E. I. (2006b). Using the MMPI-2 in criminal justice and
356–374. correctional settings: An empirical approach. Minneapolis, MN:
Loza, W., & Loza-Fanous, A. (2003). More evidence for the University of Minnesota Press.
validity of the Self-Appraisal Questionnaire for predicting Megargee, E. I., Bohn, M. J., Jr., Meyer, J., Jr., & Sink, F. (1979).
violent and non-violent recidivism. Criminal Justice and Classifying criminal offenders: A new system based on the MMPI.
Behavior, 30, 709–721. Beverly Hills, CA: Sage.
MacDonald, J. (1975). Armed robbery: Offenders and their victims. Megargee, E. I., & Carbonell, J. L. (1995). Use of the MMPI-2 in
Springfield, IL: Charles C. Thomas. correctional settings. In Y. S. Ben-Porath, J. R. Graham,
McGuire, J. S., & Megargee, E. I. (1976, September). Prediction G. C. N. Hall, R. D. Hirschman, & M. S. Zaragoza (Eds.),
of dangerous and disruptive behavior in a federal institution Forensic applications of MMPI-2 (pp. 127–159). Thousand
for youthful offenders. In C. S. Moss (Chair), Objective studies Oaks, CA: Sage.
on violence in correctional settings. Symposium presented at the Megargee, E. I., Carbonell, J. L., Bohn, M., & Sliger, G. L.
meeting of the American Psychological Association, (2001). Classifying criminal offenders with MMPI-2: The
Washington, DC. Megargee system. Minneapolis, MN: University of Minnesota
McNiel, D. E. (1988). Empirically based evaluation and manage- Press.
ment of the potentially violent patient. In P. K. Kleespie (Ed.), Megargee, E. I., Cook, P. E., & Mendelsohn, G. A. (1967).
Emergencies in mental health practice: Evaluation and manage- Development and validation of an MMPI scale of assaultive-
ment (pp. 95–116). New York: Guilford. ness in overcontrolled individuals. Journal of Abnormal
McNiel, D. E., & Binder, R. L. (1994). Screening for risk of Psychology, 72, 519–528.
inpatient violence: Validation of an actuarial tool. Law and Megargee, E. I., & Mendelsohn, G. A. (1962). A cross-validation
Human Behavior, 18, 579–586. of 12 MMPI indices of hostility and control. Journal of
Meehl, P. E. (1957). When shall we use our heads instead of the Abnormal and Social Psychology, 65, 431–438.
formula? Journal of Counseling Psychology, 4, 268–273. Meloy, J. R., & Gacono, C. (1995). Assessing the psychopathic
Meehl, P. E. (1977). Why I do not attend case conferences. In personality. In J. N. Butcher (Ed.), Clinical personality assess-
Psychodiagnosis: Selected papers (pp. 255–302). New York: ment: Practical approaches (2nd ed., pp. 410–422). New York:
Norton. Oxford University Press.
Meehl, P. E., & Rosen, A. (1955). Antecedent probability and the Mercado, C. C., & Ogloff, J. R. P. (2007). Risk and the
efficiency of certain psychometric signs, patterns, and cutting preventive detention of sex offenders in Australia and the
scores. Psychological Bulletin, 52, 194–216. United States. International Journal of Law and Psychiatry,
Megargee, E. I. (1966). Undercontrolled and overcontrolled per- 30, 49–59.
sonality types in extreme antisocial aggression. Psychological Monahan, J. (Ed.) (1972). Who is the client? The ethics of psycho-
Monographs, 80 (3, whole No. 611). logical intervention in the criminal justice system. Washington,
Megargee, E. I. (1972). The California Psychological Inventory DC: American Psychological Association.
handbook. San Francisco: Jossey-Bass. Monahan, J. (1981). The clinical prediction of violent behavior.
Megargee, E. I. (1976). The prediction of dangerous behavior. Rockville, MD: National Institute of Mental Health.
Criminal Justice and Behavior, 3, 3–22. Monahan, J. (1996). Violence prediction: The past twenty years.
Megargee, E. I. (1977). The association of population density, Criminal Justice and Behavior, 23, 107–120.
reduced space and uncomfortable temperatures with miscon- Monahan, J., & Klassen, D. (1982). Situational approaches to
duct in a prison community. American Journal of Community understanding and predicting criminal violence. In
Psychology, 5, 289–298. M. Wolfgang & N. Wiener (Eds.), Criminal violence (pp.
Megargee, E. I. (1981). Methodological problems in the predic- 292–319). Beverly Hills, CA: Sage.
tion of violence. In J. R. Hays, T. K. Roberts, & K. S. Solway Monahan, J., Steadman, H. J., Applebaum, P. S., Grisso, T.,
(Eds.), Violence and the violent individual (pp. 179–191). New Mulvey, E. P., Roth, L. H., et al. (2006). The classifica-
York: S. P. Scientific and Medical Books. tion of violence risk. Behavioral Science and the Law, 24,
Megargee, E. I. (1982). Psychological correlates and determi- 721–730.
nants of criminal violence. In M. Wolfgang & N. Wiener Monahan, J., Steadman, H. J., Robbins, P. C. Applebaum, P.,
(Eds.), Criminal violence (pp. 81–170). Beverly Hills, CA: Banks, S., Grisso, T., et al. (2005). An actuarial model of risk
Sage. assessment for persons with mental disorders. Psychiatric
Megargee, E. I. (1993). Aggression and violence. In H. Adams & Services, 56, 810–815.
P. Sutker (Eds.), Comprehensive handbook of psychopathology Moos, R. H. (1975). Evaluating correctional and community set-
(2nd ed., pp. 617–644). New York: Plenum. tings. New York: John Wiley & Sons.
MEGARGEE 565
Wollert, R. (2006). Low base rates limit expert certainty when assessment tool. Psychology, Public Policy, and the Law, 12,
current actuarials are used to identify sexually violent preda- 279–309.
tors: An application of Bayes’s theorem. Psychology, Public Worling, J. R., & Curwen, T. (2001). The ‘‘ERASOR’’: Estimate of
Policy, and the Law, 12, 56–85. Risk of Adolescent Sexual Offense Recidivism, Version 2.
Wong, S. C. P., & Gordon, A. (2006). The validity and reliability Toronto, ON: SAFE T Program, Thistletown Regional
of the Violence Risk Scale: A treatment-friendly violence risk Centre.
Abstract
The assessment of antisocial and psychopathic personalities presents special challenges for the forensic
evaluator. This chapter emphasizes use of the Hare Psychopathy Checklist-Revised (PCL-R), Rorschach,
and Minnesota Multiphasic Personality Inventory (MMPI) for a comprehensive evaluation of these patients.
These measures lend incremental validity to understanding these difficult patients, especially when
combined with testing of intelligence and cognitive functioning. Integrating data from multiple domains is
essential to answering the psycholegal and forensic treatment questions surrounding the antisocial and
psychopathic patient. The forensically trained clinical psychologist is best suited to assess psychopathy, a
task that historically has been overlooked or avoided in traditional mental health settings.
Understanding that antisocial personality disorder and psychopathy are distinct but related constructs is crucial to clinical and forensic assessment of these patients. While antisocial personality disorder (ASPD; DSM-IV; American Psychiatric Association [APA], 1994, 2000) evolved from a social deviancy model (Robins, 1966) and the term sociopathy (DSM; American Psychiatric Association (APA), 1952),1 the construct of psychopathy can be traced to the more traditional psychiatric conceptualizations originating in late nineteenth-century Germany (Cleckley, 1976). ASPD criteria are primarily behavioral, while psychopathy criteria include both behaviors and traits that significantly overlap with most of the DSM-IV Cluster B syndromes (narcissistic, histrionic, borderline, and antisocial personality disorders; Gacono, Nieberding, Owen, Rubel, & Bodholdt, 2001). A clinician arrives at a diagnosis of ASPD by verifying that the patient meets specific criteria outlined in the DSM-IV-TR (APA, 2000), which include (a) a pervasive pattern of disregard for and violation of the rights of others since age 15; (b) a history of conduct disorder prior to the age of 15; and, (c) age 18 or older. In contrast, the diagnosis of psychopathy requires a formal assessment with the Psychopathy Checklist-Revised (PCL-R; Hare, 1991, 2003) that typically involves a review of collateral information and a semistructured interview (Gacono, 2005).
Two additional findings support the need to differentiate between these terms. First, base rates for ASPD and psychopathy are not the same. Although most psychopathic subjects will meet criteria for ASPD, at most one-third of ASPD samples in maximum-security prisons will meet the PCL-R criteria for “psychopathy” (Hare, 1991, 2003). The clinical importance of this fact can be stated differently. Most ASPD adults, male or female, are not psychopathic and will not meet the factor analytic definition of this construct, in particular the personal qualities and behavior characterized by a callous and remorseless disregard for the rights and feelings of others and a chronic antisocial lifestyle (Hare, 1991, 2003).
Second, an ASPD diagnosis provides far less predictive utility in clinical/forensic decision making than PCL-R scoring (Hare, 2003; Lyon & Ogloff, 2000). An impressive body of literature
(see Hare, 2003) has demonstrated that, when compared with low scorers, prisoners with high PCL-R scores
• commit a greater quantity and variety of offenses (Hare & Jutai, 1983);
• commit a greater frequency of violent offenses in which predatory violence (Meloy, 1988, 2006) is used against male strangers (Hare & McPherson, 1984; Williamson, Hare, & Wong, 1987);
• have lengthier criminal careers (Hare, McPherson, & Forth, 1988);
• have a poorer response to therapeutic intervention (Ogloff, Wong, & Greenwood, 1990), which, in some cases, may be followed by an increase in their subsequent arrest rates for violent crimes (Rice, Harris, & Cormier, 1992); and
• are at high risk for problematic and disruptive behavior while in treatment (Gacono, Meloy, Sheppard, Speth, & Roske, 1995; Gacono, Meloy, Speth, & Roske, 1997; Young, Justice, Erdberg, & Gacono, 2000).
Additionally, PCL-R item analysis provides valuable information for treatment planning with offenders (Gacono, 1998; Gacono, Jumes, & Grey, 2008). These robust findings make psychopathy assessment a useful, and in some cases essential, tool for clinical/forensic examiners evaluating antisocial and/or psychopathic patients (Gacono & Bodholdt, 2001; Gacono, Loving, Evans, & Jumes, 2002).
Forensic Assessment and Issues
In all cases the forensic psychologist, as evaluator rather than therapist, is performing an investigation to gather data. He is not an agent of change. Confusion between these two fundamentally different roles may lead to misuse of information and unethical behavior (Meloy, 1989; Goldstein, 2007). The psychologist must have a clear conception of his or her role before the assessment begins.
The psychologist must also consider that psychopathic individuals are chronically deceptive and will lie to and mislead the assessor at every turn. Deceptive behaviors often include projection of blame, malingering or exaggeration of psychiatric symptoms, and/or conscious denial: all important behaviors to be noted as part of the assessment process (Kosson, Gacono, & Bodholdt, 2000). The goal of the psychopathic patient, and to a lesser degree the nonpsychopathic antisocial patient, is usually to gain a more dominant or pleasurable position in relation to his objects, whether a person, an institution, or legal proceeding (Meloy, 1988, 2001).
The forensic psychologist must always evaluate the validity of all data, particularly unsubstantiated self-reports, obtained from antisocial and psychopathic patients. Gathering data from three different sources (face-to-face interviews, independent historical information, and testing) aids the evaluator in addressing potential deception and combines to form the foundation of the evaluation (Meloy, 1989). Interviewing involves a face-to-face contact with the individual long enough to complete a mental status exam, assess specifically targeted areas (malingering, psychopathy level, and so forth), and gather self-reported problems and historical data. Additionally, face-to-face interviewing may provide the interviewer with adumbrations of possible transference and countertransference reactions, which in turn may inform or “flesh out” the interpersonal section of the evaluation (Kosson et al., 2000). Independent historical or contemporaneous information refers to any data that are not self-reported by the examinee, and includes such things as other psychiatric and psychological records, medical records, school and military records, employment records, criminal records, and interviews with historical and contemporary observers of the examinee (parents, siblings, legal, and health care professionals). Testing refers to psychological, neuropsychological, and medical tests, historical or contemporary, that provide objective reference points to further understand the psychology or psychobiology of the examinee. All three sources of information are necessary when assessing antisocial and psychopathic patients.
When evaluating antisocial and psychopathic patients, the psychologist must have a clear understanding of his or her purpose for assessing psychopathy level or the presence or absence of a psychopathic syndrome. The need to “assess” psychopathy varies with the nature of the setting and the function of the evaluation (Bodholdt, Richards, & Gacono, 2000; Gacono, 2000a; Gacono, Loving, & Bodholdt, 2001; Gacono, Loving et al., 2002). For example, prior to sentencing the psychologist is usually called upon to aid the trier of fact, the judge or jury, in answering a psycholegal question, such as intent, motivation, dangerousness, or sanity. Subsequent to institutional commitment, referral questions stem from institutional concerns and may involve severity of antisocial personality disorder, malingering, treatment amenability or planning, sanity, violence risk, threat management, sadism, sexual
Abstract
Clinical personality assessment is not widely used in industrial/organizational (I/O) psychology for
employment selection. The assessment of clinical symptoms, emotional instability, and personality
problems in personnel evaluations is primarily used in evaluating applicants for positions such as law
enforcement, fire department, nuclear power plant, and airline pilot positions in which high emotional
stability, successful stress skills, and high responsibility are considered important to the performance of a
high-risk job. The clinical personality assessment is characteristically conducted post hire and seen as
analogous to the pre-employment medical examination. The most widely used psychological test in this
application is the Minnesota Multiphasic Personality Inventory-2 (MMPI-2) which has a substantial research
base (more than 600 studies) supporting its use. This chapter provides interpretive guidelines for the
MMPI-2 in personnel screening and suggests an effective strategy (retest method) for dealing with
defensive applicants.
The term “clinical personality assessment” is not familiar to many industrial/organizational (I/O) psychologists. They might assume its meaning to be an evaluation of personality by a trained clinician using direct observations, interview, and psychological testing of an individual’s behavior to explore psychopathology rather than the broader personality appraisal of characteristics considered to be related to work functioning. Whatever meaning is attached to the term, personality constructs are central to the application. Many I/O academics and other researchers have avoided the use of personality measures in personnel evaluations for many years and have only recently begun to incorporate them for use in personnel assessments as potential significant contributors to their models of performance prediction in the workplace. After a number of years of low visibility, personality variables experienced a renaissance in the 1990s, with a phoenix-like rise to prominence as psychology has found great interest in the “Five Factor Model” (FFM; see Costa & McCrae, 1985; Hough, Eaton, Dunnette, Kamp, & McCloy, 1990).
Although “clinical” personality tests such as the MMPI/MMPI-2 have been largely ignored by I/O psychologists, they have continued to be used by a minority of psychologists (clinical or counseling psychologists) working in some personnel screening settings. Clinical assessments have been primarily incorporated into employment screening as part of the medical/mental health evaluation for assessing job applicants for positions that are considered to be unusually stressful or requiring a high degree of emotional stability or high responsibility such as police officers, nuclear power plant employees, airline pilots, or air traffic controllers. The more limited use of clinical assessment instruments in I/O psychology results, in part, because most employment decisions do not require a clinical personality evaluation (i.e., positions in sales, managerial, etc.), since a
“specific personality type” or a “spotless” mental health adjustment is not necessary for successful performance of most jobs. Moreover, there has been concern in the past over whether employers can “invade the privacy” of applicants through the use of personality measures (Ervin, 1965).
Evolution of Industrial/Organizational Psychology
The evolution of the I/O assessment model provides a clear perspective on testing practices in this context. Psychological assessment of people in organizations has recently passed its 100th birthday. As noted in Chapter 1, the first personality assessment evaluation research was published by Heymans and Wiersma (1906), who used a rating strategy in which the professional observer evaluated the individual on a set of personality dimensions. The first self-report personality assessment was developed a few years later by Robert Woodworth (1919, 1920) to obtain self-reported symptoms and personality characteristics of draftees into the army during World War I.
I/O psychology was broadly influenced by the work of psychologists such as Hugo Munsterberg (1913), at Harvard, who promoted applied psychology, especially mental testing, as a solution to the assessment problems of industry. He developed tests for the hiring of traveling salesmen, ship captains, and trolley drivers, and followed up with studies evaluating the tests’ effectiveness. At Northwestern University and, later, at the Carnegie Institute of Technology, Walter Dill Scott applied psychological principles to advertising, hiring, and motivation. He developed many now-standard assessment tools: the scored personal history blank, interview guide, performance rating form, and standardized measures of an applicant’s intelligence, alertness, carefulness, imagination, resourcefulness, and verbal facility.
Psychological testing as a professional activity expanded during America’s entry in World War I. A group of psychologists responded to a patriotic appeal from the American Psychological Association by developing and administering an intelligence test to more than 1,700,000 recruits. A performance test based on the work done by Paterson and Pintner (see Paterson, 1936) at Ohio State was administered to recruits with limited language skills. A second group of psychologists, led by Walter Dill Scott (see Scott & Clothier, 1923; Scott & Hayes, 1921), created and administered a personnel system that rated recruits according to their abilities, education, and experience and placed them in appropriate positions. For example, an early test examined whether truck drivers had the skills necessary to deliver ammunition and supplies to the front under battle conditions. By the war’s end, this second group of psychologists had developed 80 standardized tests. Their motto was “the right man in the right job.”
After the war, Walter Dill Scott formed the Scott Company, the first modern psychological consulting organization. It was dedicated to the application of scientific methods to personnel problems in business and industry.
The ensuing four decades saw modest expansion of psychological consulting corporations such as Scott’s, as well as similar consulting performed by academics. Job analysis studies began to have impact on carefully measuring requirements for each job. During World War II, millions of recruits were assessed by thousands of psychologists and assigned a military job category based on the results. In the 1960s, Professor Marvin Dunnette, at the University of Minnesota, following the “dust-bowl empiricist” tradition (see Dunnette, 1966), acknowledged successful work in analyzing specific jobs, but he said comparisons among large numbers of jobs were still a matter of subjective judgment and guesswork. Dunnette and his colleagues conducted empirical research on the relationship between personality characteristics and job performance (Kirchner, Dunnette, & Mousley, 1960). In addition, he called for the application of empirical information, developed, perhaps, by clustering and factor analysis and the use of “electronic computers.” Selection tests were of “limited usefulness and questionable validity” in Dunnette’s view, in part, because he found that they were being used by people who were “ill-equipped” either to appreciate the tests’ shortcomings or to do research. The use of the job interview, without more research, “had best be rationalized on the basis of its utility as a public relations device rather than as a personnel procedure.” Predictions of job success, he said, were no more sophisticated than the model presented by Munsterberg 60 years before.
As Dunnette was critiquing this state of affairs, a new development was occurring that would shape the I/O field substantially. Douglas Bray (Bray, 1964), the personnel research psychologist at American Telephone and Telegraph (AT&T), was conducting what was to become a highly influential study of “management progress.” In addition to standardized tests of intelligence, personality, and other characteristics, he created “work samples” or simulations of management
Irving B. Weiner
Abstract
Despite their best intentions, practitioners of personality assessment sometimes painfully discover they
have paid insufficient attention to what they should or should not have done. This chapter addresses ways
in which personality assessors can anticipate ethical and legal challenges and by so doing avoid them. The
overriding consideration in this regard is adherence to the APA Ethical Principles and Code of Conduct
and relevant state statutes and regulations. Practitioners can additionally protect themselves against
ethical and legal recriminations by following some recommended procedures for establishing clearly the
identity of the client when they accept a referral; selecting measures that are commonly used, well
substantiated, and familiar to them; conducting their evaluations in standard ways and recording the data
carefully; writing and presenting their reports in an appropriately denotative and circumspect manner; and
maintaining the confidentiality and availability of their records for an appropriate length of time.
Personality assessors rarely intend to break the law or violate the ethical standards of their profession. Likewise, few psychologists wish to look insensitive in their consulting rooms or foolish in the courtroom. At times, however, practitioners painfully discover that, despite their best intentions and wishes, they have paid insufficient attention to what they should or should not have done. This chapter addresses ways in which personality assessors can anticipate ethical and legal challenges and, by taking arms against them, avoid a sea of troubles.
As an overriding consideration in minimizing risk, psychologists can protect themselves against ethical and legal quagmires by adhering to the “Ethical Principles and Code of Conduct” of the American Psychological Association (APA, 2002) and the relevant statutory laws and professional regulations in the state in which they are practicing. Additional protections are specified in “Record Keeping Guidelines” (APA, 1993), “Guidelines for Child Custody Evaluations” (APA, 1994), “Guidelines on Multicultural Education, Training, Research, Practice, and Organizational Change for Psychologists” (APA, 2003), and “Specialty Guidelines for Forensic Psychologists” (Committee on Ethical Guidelines for Forensic Psychologists, 1991). Information about state laws and regulations is available in a series of APA books devoted to the individual states (e.g., for Florida, see Petrila & Otto, 1996).
Other contemporary elaborations on ethical practice in psychology include books by Bennett et al. (2006), Bucky, Callan, and Stricker (2007), Kitchener (2000), Knapp and VandeCreek (2006), Nagy (2005), Pope, Butcher, and Seelen (2006), and Pope and Vasquez (2007) (see also http://www.kspope.com). For further general information about ethical and legal risk in personality assessment, readers are referred to Weiner and Greene (2008, chap. 3) and to specific attention by Koocher (2006) to forensic assessment of children and adolescents, by Barnett and Dunning (2005) to assessing the elderly, and by Brabender and Bricklin (2001) and Turchik, Karpenko, Hammers, and McNamara (2007) to conducting assessments in diverse settings.
A few additional introductory comments about adherence to ethical and legal standards are in order. First, as elaborated by Barnett, Behnke, Rosenthal, and Koocher (2007), ethical codes furnish principles that guide clinicians in practicing ethically, but they do not supply solutions to every ethical uncertainty that may arise in clinical practice. Hence clinicians must often rely on their judgment as well as on formal codes of conduct in resolving ethical dilemmas, and their decisions must also take into account the context of each situation. A satisfactory solution to an ethical issue in one situation may not be a sensitive or discreet way of resolving a similar issue emerging in a different setting and involving different types of people with different types of problems.
Second, as discussed by Knapp, Gottlieb, Berman, and Handelsman (2007), clinicians may at times encounter legal demands or expectations that run counter to their ethical convictions. Practitioners are obliged to abide by the laws of their state, whatever the ethical principles of their profession. Yet there may be times when ethical principles warrant objections by psychologists to what they are being asked to do, and in these instances they may attempt to reconcile with the court a conflict between what is legal and what they see as violating a patient’s or client’s rights.
Third, awareness of ethical and legal issues should not be a now-and-then event occasioned by an unexpectedly problematic case or a complaint filed in civil court or with a licensing board. Instead, to minimize their risk of professional peril, practitioners need to be continuously and proactively alert to the ethical and legal implications of what they do or omit to do.
Fourth, as a seemingly obvious matter but one that nevertheless must be made explicit, psychologists who provide assessment services should protect themselves and the public by obtaining and maintaining licensure in the state in which they practice and by carrying appropriate malpractice insurance. In some locales, psychologists working in state agencies or other specified facilities may be exempted from licensing requirements, and some institutions may provide adequate malpractice coverage for their psychology staff. It behooves assessment psychologists to be certain of their particular circumstances in both of these respects.
The present chapter discusses five sequential phases of clinical personality assessment, in each of which lurk some ethical and legal hazards: (1) accepting a referral; (2) selecting the test battery; (3) conducting the psychological evaluation; (4) preparing and presenting a report; and (5) managing case records. As they proceed through these phases of an assessment, psychological examiners can sustain a high level of anticipation, and thereby sidestep any lurking hazards, by following the recommendations in this chapter and by imagining three scenarios:
1. Whatever you do, imagine that a knowledgeable and unfriendly critic is looking over your shoulder.
2. Whatever you say, imagine that it will be taken in the most unfavorable light possible and used against you.
3. Whatever you write, imagine that it will be read aloud, sarcastically, in a court of law.
Although these scenarios may suggest an unseemly paranoia, colleagues who have encountered the kinds of unpleasant situations illustrated in this chapter will surely think otherwise. However, as a less pathological construction than paranoia on how clinicians should conduct their practice, hypervigilance at least is in order. Practitioners can also be comforted by appreciating that the APA ethics code rests on the fundamental requirement for psychologists to be knowledgeable and responsible professionals who respect the rights and dignity of others, show concern for the welfare of their clients and colleagues, and present themselves fairly and honestly to their patients and their communities. These features of being a competent clinician and a decent person usually suffice to carry personality assessors with propriety through the several phases of conducting a psychological evaluation. In each phase, however, there are certain steps that can be taken to restrict ethical and legal jeopardy.
Accepting a Referral
From an ethical and legal perspective, a key consideration in accepting a referral for diagnostic consultation is knowing who the client is. The client in personality assessment has traditionally been regarded as the person who is examined, receives a report of the findings, pays for the services rendered, and commands the clinician’s primary allegiance. A self-referred examinee who expects to receive feedback concerning the psychologist’s conclusions and recommendations and to pay the bill can usually be identified as the client with little difficulty. Often, however, as elaborated by Monahan (1980), there are other circumstances in the delivery of assessment services that make clienthood a more complex matter than this.
WEINER 601
too much information to other interested parties, and vice versa; and (4) getting into disputes over unexpected or unpaid bills.
Selecting the Test Battery
To anticipate possible ethical and legal pitfalls in selecting a test battery, and especially in forensic cases, assessors should keep in mind the following five considerations:
1. The test battery should be limited to measures that are warranted by the referral question, and clinicians should recognize when to keep some of their favorite tests on the sidelines. If a referral question concerns neuropsychological impairment, for example, examiners should have a good reason for including the Rorschach Inkblot Method in their test battery, such as an explicit request for assessment of personality as well as neuropsychological functioning. Absence of such a request makes it difficult to justify administering and billing for a personality assessment instrument that rarely contributes to identifying central nervous system disorder. Similarly, examiners who include a Wechsler Adult Intelligence Scale-III (WAIS-III) in a test battery designed to answer questions about personality style may be challenged to defend the utility of the WAIS-III for this purpose.
2. The test battery should consist of widely used and well-known measures. This consideration is not intended to discourage the development of new tests or the clinical application of respectable measures that have not yet become widely used or well known. However, evaluations with forensic implications are ill suited for relying on experimental or esoteric measures. The APA ethics code cautions as well against including obsolete or outdated versions of tests in a test battery, and using short forms of standard measures can also put examiners in potential forensic jeopardy (see Crespi & Politikos, 2005). Examiners who administer obsolete, outdated, or abbreviated measures may be challenged to justify such questionable practice if they appear on the witness stand or before a regulatory board. To avoid potential unpleasantness when there is a possibility of having to testify about their data, assessors should limit their battery to tests that judges, juries, and attorneys, as well as fellow psychologists, are likely to recognize by name, that are reasonably invulnerable to challenge, or that can at least be shown to represent standard professional practice.
Should psychologists decide for some reason to administer a highly specialized or rarely used measure or a controversial new form of a test, they should be prepared to make a case for having done so, a case that goes beyond saying, “I sometimes get good information from this test” (which may give the unintended impression that most other professionals do not get much information from it, or else it would be more widely used). Assessors who cannot justify the composition of their test battery are at risk for having their credibility challenged, either for being cavalier about their work (throwing in measures on a whim) or for not conforming to standard practices in conducting a psychodiagnostic evaluation. Published surveys of the relative frequency with which various tests are used in clinical practice can help assessors select batteries with this consideration in mind (see Archer, Buffington-Vollum, Stredny, & Handel, 2006; Archer & Newsom, 2000; Camara, Nathan, & Puente, 2000; Cashel, 2002; Hogan, 2005; Quinnell & Bow, 2001).
3. Psychological assessors concerned about ethical and legal hazards should limit their test batteries to measures that have a solid psychometric foundation. This recommendation is not intended to be prejudicial to measures of uncertain or undetermined psychometric status that clinicians find helpful in their work and rely on for generating hypotheses, not as a basis for drawing conclusions. However, when examiners anticipate that a case may go to court, it is advisable for them to exclude from their battery any measures for which there are limited data concerning their reliability, validity, and normative expectations.
Avoiding psychometrically soft measures in cases that could elicit ethical or legal questions about an assessment does not mean merely refraining from mentioning them in one’s report. Assessors may be tempted at times to administer an extensive battery that includes some personal favorites or “interesting” but empirically untested measures, and then prepare a report based only on the psychometrically sound measures in the test battery. Surrendering to such temptation can buy a ticket to trouble. Should the slightest hint emerge in the courtroom that more tests were administered than are being testified to, what can happen next to an expert witness is not a pretty sight. After having said “Yes” when asked if other tests not mentioned in the report were also given, assessors must not only discuss but also defend these other tests (which were probably kept out of the report because they would be difficult to
impression that they are not very committed or conscientious in their work, or that they do not know as much as they should about the person who was examined, or both. Such undesirable ethical and legal outcomes can be anticipated and avoided simply by taking an active and visible role in the evaluation of all referred individuals.
As a second consideration, assessors should make sure that their test forms will stand up to inspection and give every appearance that the tests were administered, scored, and evaluated in a careful, proper, and thoroughly professional manner. Standard recording forms should be used for each test, and all of the information requested by the forms should be entered. Examiners who have their own preferred ways of recording test responses and entering scores should exercise their preferences only when they do not anticipate ethical or legal inquiry concerning their examination. Whenever ethical or legal issues do arise, idiosyncratic deviations from standard practice expose examiners to questions about why they proceeded as they did. No matter how adequately examiners believe they can answer such questions, and thereby defend their methods, the very fact of their having to defend nonstandard methods can raise doubts in others’ minds about the appropriateness of the procedures that were used and the reliability of the results that were obtained.
As for entering all of the requested data, completed test forms convey thoroughness and competence. WAIS-III forms in which the index scores are omitted look naked to the trained eye, on the other hand, as do other frequently used test forms in which spaces for inserting numbers or drawing profiles have been left blank. Whether experienced examiners feel they need to fill in test forms completely in order to interpret the data is beside the point. To spare themselves aggravation when their test data are inspected in an ethical or legal context, psychological assessors should collect and prepare raw test data in a manner that is beyond reproach.
A perhaps pedestrian but potentially problematic aspect of having adequate test forms for appropriate parties to inspect is their legibility. Examiners should ensure that whatever they have written down can be read by others, should it become necessary. Having legible records signifies a sense of professional responsibility and openness to sharing information as appropriate; examiners whose records cannot be read by other people or, worse yet, cannot be read by the examiners themselves, are at risk for appearing careless and indifferent to their clients’ welfare. Psychologists whose handwriting is difficult to read should consider copying their test protocols over or having them typed. In so doing, however, they should preserve the original protocols in their own handwriting, to document who took the record and the accuracy of the transcription.
For assistance in avoiding some of these potential problems with the quality of test records, examiners are advised to generate computerized scoring and data printouts when suitable software programs are available. These programs reduce scoring errors, which if discovered reflect poorly on an examiner whose work is being reviewed, and they ensure being able to present neat, complete, and legible summaries of test scores, scales, and indices. Computer printouts also have the indirect benefit of attesting to an examiner’s familiarity with and utilization of available assessment technology.
Preparing and Presenting a Report
For examiners to prepare and present accurate, useful, and effective reports, they must (a) be competent with respect to determining from the test data what a report should say in response to a particular referral question and (b) have some facility in phrasing reports, whether for oral or written delivery.
Competency in psychological assessment was mentioned previously in discussing the ethical necessity for examiners to be familiar with the tests they use and the literature concerning them. Sufficient competence in assessment to sustain ethical practice involves more than knowing how to select, administer, and score tests, however. Ethicality calls as well for proficiency in interpreting the test data that are obtained. Graduate and postgraduate education and training, participation in continuing education workshops, and perusal of assessment textbooks are among the routes to proficiency in test interpretation. The core components of psychological assessment competency are detailed further by Krishnamurthy et al. (2004). Also of note is a position statement by the Society for Personality Assessment Board of Trustees (2006) on the requirements for becoming a competent assessment psychologist, which begins as follows: “It is the position of the Society that psychological assessment is a specialty that requires intensive and ongoing education and training to be practiced competently and ethically and in order to protect the public” (p. 355).
Being expert in collecting and interpreting personality assessment data is still not enough to ensure competence in responding to a referral question, however. In order to provide helpful conclusions
Rorschach indices of schizophrenia. Accordingly, although the possibility of schizophrenia cannot be ruled out on the basis of these findings, the test results make it unlikely that his adjustment difficulties are attributable to the onset of a schizophrenic disorder.
Presenting negative findings in this way protects examiners not only against unwarranted overstatement, but also against professional embarrassment and possible liability should subsequent events document an instance of false negative findings.
2. Reports should concentrate on describing what people are probably like and comment only cautiously on what they have probably done or are likely to do (see Weiner, 2003). Specific and unqualified predictions from psychological test findings to behavior put examiners in a precarious position in which their conclusions will be difficult to justify. On the other hand, referrals for psychodiagnostic consultation often call for behavioral predictions, and psychological assessors are expected to provide them. Common examples in contemporary practice include requests to assess suicide potential, violence risk, and having been a victim or perpetrator of sexual abuse.
As an alternative in such instances to offering definite opinions about an individual’s past or future behavior, examiners are advised to construct conclusions based on comparisons between the firm findings in the case (i.e., test indications of what the person is probably or most certainly like) and empirical evidence concerning how people with personality characteristics similar to those of the examinee generally tend to behave or what they tend to have experienced. This evidence-based nomothetic approach in preparing reports helps examiners avoid unwarranted assertions about an individual’s future behavior or past experiences and has two related advantages in personality assessment.
First, an evidence-based nomothetic approach facilitates formulating conclusions with an appropriate degree of qualification. If an individual’s personality test responses (and other assessment findings as well) are very similar to those of individuals who very often behave aggressively toward other people, it is reasonable to conclude that “This person shows a high level of violence potential” or “This person is much more likely than most people to become violent if provoked.” If, on the other hand, the assessment data indicate a few personality characteristics that are sometimes associated with aggressive behavior, the examiner is justified in reporting that “This person shows some but not a high level of violence potential” or “This person is somewhat but not substantially more likely than most people to respond aggressively if provoked.”
Second, in addition to avoiding unwarranted assertions, this recommended approach to prediction or postdiction insulates examiners against courtroom insinuations that they have overstated their case. Suppose an attorney asks, “Are you saying that this man is going to go out in the street and start hurting people if he’s not put in jail?” Psychologists who have expressed their conclusions in nomothetic terms can reply, “No, I’m not saying that; I’m just saying that he shows a lot of the personality characteristics commonly found in people who behave violently, which means that he is at greater risk than most people for becoming violent himself.” Testimony presented in this way usually serves to get across the point that examiners are trying to make without putting them at risk for appearing to have drawn conclusions that go beyond their data.
3. Assessors should avoid describing people in unfavorable or prejudicial terms. Reports of a psychological evaluation may be read by many people who are not mental health professionals, including family members and examinees themselves. In assessments for administrative purposes, reports are likely to fall as well into the hands of lawyers, judges, health-care managers, employers, teachers, and principals. Comments that have traditionally seemed appropriate for entry into clinical records may strike non–mental health professionals as harsh, critical, judgmental, and pejorative. Examinees who feel they have been unfairly disadvantaged by what a psychologist has said or written about them may seek recourse by challenging the propriety of the examiner’s conduct. It therefore behooves personality assessors to find ways of saying what needs to be said about individuals who are disturbed or impaired without seeming to lack sympathy and understanding.
For example, in describing a resistive examinee who was difficult to evaluate, an assessor could say, “Ms. B was uncooperative, and she invalidated some of the tests by refusing to respond to many items.” Or, describing the same person, the assessor could say, “Apparently because of her considerable anxiety about being evaluated, Ms. B found it difficult to comply with the examination procedures and was unable to respond to many of the test items.” A disgruntled client or an independent observer could well consider the first of these comments
unexpectedly. Records that are complete, orderly, and in reasonably pristine form reflect well on the competence and conscientiousness of the assessor. Those that are disorganized and dog-eared suggest that examiners are paying insufficient attention to the rights and dignity of their clients.
Examiners must also take care to maintain the availability of their case records for an adequate period of time. Long after the fact of a psychological examination, clients and those concerned about them have a right to expect that the records will be available if needed for any purpose. The obligations of examiners in this regard are often specified in statutory regulations concerning the practice of psychology. In Florida, for example, the Administrative Code of the Board of Psychology requires complete psychological records to be maintained for at least 3 years following an examination and a summary of the records for an additional 4 years.
Furthermore, the ethical obligations of examiners to maintain records extend not only as long as they practice but beyond as well. The APA Ethical Standards specify that psychologists should plan in advance to preserve the confidentiality and availability of their records after they discontinue practice and in the event of their incapacity or death. Accordingly, psychological assessors should make provisions for the maintenance of their case records in future times when they themselves will no longer be available or able to share this information with appropriate parties.
To summarize this chapter, psychologists who conduct personality evaluations can protect themselves against ethical and legal recriminations by establishing clearly the identity of the client when they accept a referral; by selecting for their test battery measures that are commonly used, well substantiated, and familiar to them; by conducting their evaluations in standard ways and recording the data carefully; by writing and presenting their reports in an appropriately denotative and circumspect manner; and by maintaining the confidentiality and availability of their records for an appropriate length of time.
References
Allen, J., & Walsh, J. A. (2000). A construct-based approach to equivalence: Methodologies for cross-cultural/multicultural personality assessment research. In R. H. Dana (Ed.), Handbook of cross-cultural and multicultural personality assessment (pp. 63–86). Mahwah, NJ: Lawrence Erlbaum.
American Psychological Association. (1993). Record keeping guidelines. American Psychologist, 48, 984–986.
American Psychological Association. (1994). Guidelines for child custody evaluations in divorce proceedings. American Psychologist, 49, 677–680.
American Psychological Association. (2002). Ethical principles of psychologists and code of conduct. American Psychologist, 57, 1060–1073.
American Psychological Association. (2003). Guidelines on multicultural education, training, research, practice, and organizational change for psychologists. American Psychologist, 58, 377–401.
Archer, R. P., Buffington-Vollum, J. K., Stredny, R. V., & Handel, R. W. (2006). A survey of psychological test use patterns among forensic psychologists. Journal of Personality Assessment, 87, 84–94.
Archer, R. P., & Newsom, C. R. (2000). Psychological test usage with adolescent clients: Survey update. Assessment, 7, 227–235.
Bank, S. C., & Packer, I. K. (2007). Expert witness testimony: Law, ethics, and practice. In A. M. Goldstein (Ed.), Forensic psychology: Emerging topics and expanding roles (pp. 421–445). Hoboken, NJ: John Wiley & Sons.
Barnett, J. E., Behnke, S. H., Rosenthal, S. L., & Koocher, G. P. (2007). In case of ethical dilemma, break glass: Commentary on ethical decision making in practice. Professional Psychology: Research and Practice, 38, 7–12.
Barnett, J. E., & Dunning, C. B. (2005). Assessment and treatment of the elderly: Clinical and ethical issues. In L. VandeCreek (Ed.), Innovations in clinical practice: Focus on adults (pp. 19–32). Sarasota, FL: Professional Resources Press.
Bennett, B. E., Bricklin, P. M., Harris, E., Knapp, S., VandeCreek, L., & Younggren, J. N. (2006). Assessing and managing risk in psychological practice. Rockville, MD: The Trust.
Blau, T. H. (1998). The psychologist as expert witness (2nd ed.). New York: John Wiley & Sons.
Brabender, V., & Bricklin, P. (2001). Ethical issues in psychological assessment in different settings. Journal of Personality Assessment, 77, 192–194.
Brodsky, S. L. (2004). Coping with cross-examination and other pathways to effective testimony. Washington, DC: American Psychological Association.
Bucky, S., Callan, J., & Stricker, G. (Eds.). (2007). Ethical and legal issues for mental health professionals: In forensic settings. New York: Haworth Press.
Butcher, J. N. (2003). Computerized psychological assessment. In I. B. Weiner (Ed.), Handbook of psychology, Vol. 11: Assessment psychology (pp. 141–164). Hoboken, NJ: John Wiley & Sons.
Butcher, J. N. (in press). How to use computer-based reports. In J. N. Butcher (Ed.), Oxford handbook of clinical and personality assessment. New York: Oxford University Press.
Butcher, J. N., Perry, J., & Hahn, J. (2004). Computers in clinical assessment: Historical developments, present status, and future challenges. Journal of Clinical Psychology, 60, 331–345.
Camara, W. J., Nathan, J. S., & Puente, A. E. (2000). Psychological test usage: Implications in professional psychology. Professional Psychology, 31, 141–154.
Cashel, M. L. (2002). Child and adolescent psychological assessment: Current clinical practices and the impact of managed care. Professional Psychology: Research and Practice, 33, 446–453.
Committee on Ethical Guidelines for Forensic Psychologists. (1991). Specialty guidelines for forensic psychologists. Law and Human Behavior, 15, 655–665.
PART 7
Interpretation and Reporting of Assessment Findings
CHAPTER 32
Assessment of Feigned Psychological Symptoms
David T. R. Berry, Myriam J. Sollman, Lindsey J. Schipper, Jessica A. Clark, and Anne L. Shandera
Abstract
Feigned symptom reports have become of increasing interest in recent years, in part because the results of
psychological evaluations are more widely accepted in legal proceedings. This chapter reviews several
pertinent issues regarding assessment of malingering, including methodological concerns, base rates of
feigning, and coaching to avoid detection. Several frequently used measures of malingering, including the
Minnesota Multiphasic Personality Inventory-2 (MMPI-2), Personality Assessment Inventory (PAI), Millon
Clinical Multiaxial Inventory-III (MCMI-III), Structured Inventory of Malingered Symptomatology (SIMS),
Miller Forensic Assessment of Symptoms Test (M-FAST), and Structured Interview of Reported
Symptoms (SIRS), are reviewed. Cutting scores, sensitivity, specificity, and positive and negative predictive
powers are provided at various base rates of feigning. After the SIRS, the feigning scales of the MMPI-2 have
the most support for malingering detection, followed by the PAI scales. Generally, all measures reviewed
showed greater negative predictive power rates; thus, a two-stage sequential process for malingering
detection is discussed.
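The relationship among the statistics named in the abstract can be made concrete with a short calculation. The sketch below is illustrative only: it applies Bayes' theorem to obtain positive and negative predictive power from sensitivity, specificity, and the base rate of feigning. The function names and the sensitivity and specificity values (.80 and .90) are hypothetical and are not taken from any measure reviewed in the chapter.

```python
def positive_predictive_power(sensitivity, specificity, base_rate):
    """P(feigning | positive test result), by Bayes' theorem."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return true_pos / (true_pos + false_pos)


def negative_predictive_power(sensitivity, specificity, base_rate):
    """P(honest responding | negative test result), by Bayes' theorem."""
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    return true_neg / (true_neg + false_neg)


if __name__ == "__main__":
    sens, spec = 0.80, 0.90  # hypothetical values, for illustration only
    for base_rate in (0.10, 0.30, 0.50):
        ppp = positive_predictive_power(sens, spec, base_rate)
        npp = negative_predictive_power(sens, spec, base_rate)
        print(f"base rate {base_rate:.2f}: PPP = {ppp:.2f}, NPP = {npp:.2f}")
```

With these made-up values, positive predictive power drops markedly as the base rate falls while negative predictive power stays high, which is the kind of pattern that makes a screening measure more useful for ruling feigning out than for ruling it in.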
Normal people frequently attempt to manage the impressions they make on others to achieve their personal goals. For example, individuals meeting their prospective spouse’s parents for the first time are typically quite attentive to making the “right” impression. Classified advertisements placed to market an automobile or other merchandise usually highlight the positive qualities of the item. Students contacting their instructors to indicate that they will miss an examination or turn in an assignment late due to illness are often quite intent on communicating the severity of their symptoms. Everyday examples of impression management abound, suggesting that deliberate manipulation of the opinions of others is a pervasive phenomenon in contemporary society.
Indirect acknowledgements of the frequency of impression management, including its extreme form, lying, are also numerous. Parents of a child who has told a lie usually emphatically stress the need to tell the truth, and may shame or physically discipline for such offenses. Clients of physicians, attorneys, and psychotherapists are told that the confidentiality of their communications with the professional is protected by law (with exceptions duly noted) in order to encourage the candor that is felt to be in the client’s best interest but apparently not thought to occur spontaneously in all cases. Before testifying in a legal proceeding, witnesses are asked to swear that they will tell the truth, the whole truth, and nothing but the truth, a rather elaborate formulation that indirectly acknowledges the many forms that distortions of the truth may take. Clearly, a wide range of everyday habits in society, professional standards, and institutional procedures implicitly acknowledge that individuals may withhold or misrepresent information for their perceived advantage, sometimes to the point of outright lies.
Curiously, despite the widespread appreciation in society at large that individuals may misrepresent
information for their own perceived benefit, psychologists and other mental health professionals have traditionally devoted relatively little formal attention or resources to assessing objectively the accuracy of information presented by clients. Instead, clinicians typically depend on a global clinical judgment of uncertain validity that may be casually summarized along the lines of “the present results are judged representative of the client’s current functioning” with little or no substantive evidence offered in support.
This “ignorant bliss,” however, has been challenged as the results of psychological assessments have become increasingly accepted in legal and administrative proceedings determining the assignment of substantial benefits such as disability pensions, workers’ compensation awards, and monetary damages, as well as the adjudication of criminal offenses. Through vigorous cross-examinations, often aided by sources such as Coping with Psychiatric and Psychological Testimony (Ziskin, 1995) and widespread public discontent with “insanity pleas,” psychologists and other mental health clinicians have become increasingly aware of the need to evaluate carefully and routinely the veracity of information obtained from testing, clinical interviews, and related methods that are based ultimately on self-report. To address the issue of feigned symptom reports reliably, it is necessary to have well-validated, objective indicators of the phenomenon.
This chapter will provide an overview of the current status of structured psychological methods used to evaluate the possibility of one extreme type of impression management that may occur in psychological assessments. Deliberate exaggeration or fabrication of problems or misattribution of symptoms to obtain a desired external goal is behavior that, in the context of a psychological or psychiatric evaluation, is commonly known as “faking bad” or “malingering.” Psychologists have generally conceptualized feigned psychological complaints as a “response set,” and readers should be aware that it is only one of several possible response sets such as random responding, yea-saying or nay-saying, and faking good that may affect self-reports. Response sets such as random responding and yea-saying at times constitute important confounds that must be ruled out before a determination of faking bad is reached. In this chapter, other response sets will be addressed only as they bear on the detection of feigned reports, with relevant sources of information on the alternative response sets provided as appropriate.
Conceptual and Practical Issues
Criteria
False symptom reports are part of the criteria for identifying three conditions in the DSM-IV-TR (American Psychiatric Association [APA], 2000). Perhaps the most blatant example of feigned symptom reports involves malingering. Strictly speaking, malingering is not a diagnosis in DSM-IV-TR (APA, 2000). Instead, DSM-IV-TR treats malingering as a “V-code,” more specifically as one of several “additional conditions that may be the focus of clinical attention” (APA, 2000, p. 739). The essential feature of malingering is “the intentional production of false or grossly exaggerated physical or psychological symptoms, motivated by external incentives such as avoiding military duty, avoiding work, obtaining financial compensation, evading criminal prosecution, or obtaining drugs” (p. 739). Although no formal criteria are given for arriving at a determination of malingering, the clinician is told to strongly suspect malingering if two or more of the following are present: medico-legal context, discrepancy between subjective and objective information, lack of cooperation with assessment or treatment, or presence of antisocial personality disorder. In the DSM-IV-TR system, malingering is distinguished from similar conditions by two factors: whether or not conscious control is exerted over false symptom reports and whether apparent goals are external (money, narcotics, avoiding criminal prosecution, etc.) as opposed to intrapsychic (fulfill the sick role, resolve unconscious conflicts). The resulting hypothetical two-by-two table includes malingering (conscious feigning for external goals), factitious disorder (conscious feigning for internal goal; i.e., to achieve the sick role), and conversion disorder (unconscious production of signs and symptoms to resolve psychological conflicts). Oddly, the fourth quadrant of the hypothetical table resulting from crossing the two major factors (unconscious production of signs and symptoms for external goals) is not addressed in the DSM-IV-TR system, but might constitute a potentially important condition for future exploration.
The omission in the DSM-IV-TR entries on malingering and related disorders of explicit guidance for making the crucial and intrinsically difficult distinctions required by the system is particularly problematic. For example, reliable discrimination of conscious versus unconscious motivation and control of behavior requires a high level of inference in all but exceptional circumstances, yet DSM-IV-TR provides no definitive directions for
Table 32.1. Methodological quality issues in validation of malingering detection tests and scales.
Simulation designs
Are participant characteristics similar to those with whom the test will be used in clinical practice?
Are analog malingerers given a plausible “scenario” that they understand and may identify with?
Are analog malingerers instructed to be “believable” in their feigning (try to avoid detection)?
Are analog malingerers given time and resources to prepare to malinger successfully?
Are analog malingerers provided with information on symptoms and detection strategies to reflect knowledge that is probably easily available to “real world” malingerers?
Are analog malingerers provided with an incentive powerful enough to motivate them to attempt to feign successfully?
Are analog malingerers “debriefed” at the end of the study to determine if they recalled, comprehended, and complied with their instructions?
Are nonmalingering measures included in the experimental protocol, replicating clinical practice in which both malingering and standard tests are given?
Is a clinically relevant patient control group included?
Is the honesty of the patient control group verified through post-test debriefing or other independent means?
Are the numbers of participants in the malingering and honest groups adequate to detect an effect?
Known-groups designs
How valid is the criterion used to assign group membership (criterion validity will place a ceiling on predictor validity)?
How representative of malingerers are those assigned to the malingering group (if only flagrant malingerers are “caught,” results may misrepresent the ability of the test to identify malingering)?
Is the honesty of the patient control group verified through post-test debriefing or other independent means?
How representative of honest patients are those assigned to the nonmalingering control group?
Differential prevalence
Does the criterion chosen to select groups have demonstrated validity for producing different base rates of malingering?
Graham, Watts, & Timbrook, 1991 FB 30 Male Stud Sim N N N F T 120 .90 .92 .88
Hon 30 Male Pt N .97 .96 .98
F-K 23 .90 .92 .88
.97 .96 .98
FB 20 Fem Stud Sim N N N F T 120 .95 .88 .82
Hon 20 Fem Pt N .95 .98 .99
F-K 25 .90 .87 .81
.95 .96 .95
Lees-Haley, 1991 Mal 26 Pers Inj KG n/a n/a n/a F T 65 .81 1.00 1.00
Hon 21 Pers Inj n/a 1.00 .93 .96
Lees-Haley, 1992 Mal 55 Pers Inj KG n/a n/a n/a F T 62 .89 .94 .91
Hon 64 Pers Inj n/a .98 .96 .97
F-K 4 .82 .94 .91
.98 .94 .96
Rogers, Bagby, & Chakraborty, 1993 FBNC 15 CV Sim Y N N F T > 120 .60 .88 .83
Hon 37 Pt N .97 .87 .91
F-K >14 .67 .57 .45
.81 .87 .91
FBSC 15 CV Sim Y Y N F T > 120 .73 .90 .85
Hon 37 Pt N .97 .91 .94
F-K >14 .67 .57 .45
.81 .87 .91
FBVC 14 CV Sim Y Y N F T > 120 .21 .72 .62
Hon 37 Pt N .97 .77 .84
F-K >14 .29 .36 .26
.81 .76 .83
FBCB 15 CV Sim Y Y N F T > 120 .47 .85 .79
Hon 37 Pt N .97 .83 .89
F-K >14 .48 .48 .37
.81 .81 .87
Wetter et al., 1993 FB 20 CV Sim Y N Y F T 96 .85 .51 .40
Hon 20 PTSD N .70 .93 .95
F-K 15 .75 .65 .54
.85 .90 .94
FB 22 CV Sim Y N Y F T 104 1.00 .79 .70
Hon 20 PASC N .90 1.00 1.00
F-K 16 .95 .88 .82
.95 .98 .99
Table 32.2. (continued)
Study  Group  N  Sample  Design  Inc  Warn  Comp Check  Scale  Cutoff  Sn/Sp  PPP/NPP (BR = .27)  PPP/NPP (BR = .19)
Bagby, Rogers, & Buis, 1994 FB 58 Stud Sim Y Y N F T > 104 .88 .63 .52
Hon 184 Fsc Pt N .81 .95 .97
F-K >11 .88 .64 .53
.82 .95 .97
Bagby, Rogers, Buis, & Kalemba, 1994 FB 58 Stud Sim Y Y N F T > 89 .91 .64 .53
Hon 95 Pt N .81 .96 .97
F-K >7 .95 .64 .53
.80 .98 .98
Iverson, Franzen, & Hammond, 1995 FB 28 Corr Sim N Y N F T 92 .89 .94 .91
Hon 51 Pt N .98 .96 .97
F-K >9 .86 .94 .91
.98 .95 .97
Rogers, 1995 FB 33 Pt Sim Y Y N F T > 120 .94 .81 .73
Hon 38 Pt N .92 .98 .98
F-K >18 .88 .67 .56
.84 .95 .97
F(p) T > 110 .88 .87 .81
.95 .96 .97
Sivec, Hilsenroth, & Lynn, 1995 FB 65 Stud Sim Y Y Y F T > 99 .95 .78 .69
Hon 40 Pt N .90 .98 .99
Pensa et al., 1996 FB 20 Stud Sim Y N N F T 95 .93 .95 .92
Hon 20 Pt N .98 .97 .98
F-K 8 1.00 .76 .66
.88 1.00 1.00
Arbisi & Ben-Porath, 1998 FB 33 Pt Sim Y N Y F T 100 .97 .64 .53
Hon 41 Pt Y .80 .99 .99
F(p) T 100 .97 .95 .92
.98 .99 .99
Elhai et al., 2000 FB 79 Stud Sim Y Y N F T 120 .75 .55 .43
Hon 124 Pt N .77 .89 .93
F-K 17 .71 .46 .35
.69 .87 .91
Storm & Graham, 2000 FBNC 67 Male Stud Sim Y N Y F T > 120 .67 .93 .89
Hon 192 Male Pt N .98 .89 .93
F-K >23 .63 .92 .88
.98 .88 .92
F(p) T > 106 .75 .85 .78
.95 .91 .94
FBNC 124 Fem Stud Sim Y N Y F T > 120 .82 .91 .87
Hon 160 Fem Pt N .97 .94 .96
F-K > 17 .84 .84 .77
.94 .94 .96
F(p) T > 120 .86 .91 .87
.97 .95 .97
FBVC 93 Male Stud Sim Y Y Y F T > 101 .54 .54 .43
Hon 192 Male Pt N .83 .83 .89
F-K > 16 .35 .62 .51
.92 .79 .86
F(p) T > 94 .59 .69 .58
.90 .86 .90
FBVC 156 Fem Stud Sim Y Y Y F T > 85 .78 .44 .34
Hon 160 Fem Pt N .64 .89 .93
F-K >2 .90 .39 .29
.49 .93 .95
F(p) T > 89 .69 .60 .49
.83 .88 .92
Archer et al., 2001 FB 60 Male S&C Sim Y N N F T 90 .62 .52 .41
Hon 329 Male Pt N .79 .85 .90
F(p) T 70 .53 .54 .42
.83 .83 .88
FB 143 Fem S&C Sim Y N N F T 90 .50 .43 .32
Hon 288 Fem Pt N .75 .80 .86
F(p) T 70 .39 .46 .35
.83 .79 .85
Lewis, Simcox, & Berry, 2002 Mal 24 Corr KG n/a n/a n/a F T > 107 .67 1.00 1.00
Hon 31 Corr n/a 1.00 .89 .93
F(p) T > 100 .50 1.00 1.00
1.00 .84 .90
F-K >10 .79 .83 .76
.94 .92 .95
Heinze, 2003 Mal 33 Corr KG n/a n/a n/a F T 100 .73 .69 .59
Hon 33 Corr n/a .88 .90 .93
F(p) T 100 .52 .56 .45
.85 .83 .88
F-K 16 .70 .68 .58
.88 .89 .93
Ross, Millis, Krukowski, Putnam, & Adams, 2004 Mal 59 Fsc Pt KG n/a n/a n/a F T 65 .66 .40 .30
Hon 59 TBI n/a .64 .84 .89
F-K 6 .58 .33 .24
.56 .78 .85
Heinze & Vess, 2005 Mal 6 Corr KG n/a n/a n/a F T 100 .33 .64 .53
Hon 47 Corr n/a .93 .79 .86
F-K >15 .17 .44 .33
.92 .75 .83
F(p) T 100 .33 1.00 1.00
1.00 .80 .86
Arbisi, Ben-Porath, & McNulty, 2006 FB 35 PTSD Pt Sim Y Y Y F T 120 .49 1.00 1.00
Hon 55 PTSD Pt Y 1.00 .84 .89
F(p) T 100 .49 1.00 1.00
.80 .77 .84
Eakin et al., 2006 FB 29 PTSD Pt Sim Y Y N F >11 .71 .47 .36
Hon 23 Stud N .70 .87 .91
F(p) >4 .52 .60 .48
.87 .83 .89
Note: For studies, only first author is listed. Sn/Sp column indicates sensitivity in top line and specificity in bottom line for each comparison. n/a = not applicable. Inc = incentive for successful faking, Warn = warning to fake believably, Comp Check = check for compliance with faking instructions, Sn = sensitivity, Sp = specificity, BR = base rate, PPP = positive predictive power, NPP = negative predictive power. For Group: FB = fake bad, Hon = honest, FBCB = fake bad with coaching on symptoms and validity scales, FBNC = fake bad with no coaching, FBSC = fake bad with coaching on symptoms, FBVC = fake bad with coaching on validity scales, Mal = malingering. For Sample: Corr = correctional, CST = competency to stand trial evaluees, CV = community volunteers, Fem = female, Fsc = forensic, IPt = inpatient, Pers Inj = workers compensation or personal injury claimant, PASC = paranoid schizophrenia, PTSD = post traumatic stress disorder, S&C = student and community volunteers, Stud = student, TBI = traumatic brain injury. For Design: Sim = Simulation, KG = Known-groups. For cut score: M = male, F = female. In cases where results using more than one cutting score were reported, data from the cutting score with the highest overall hit rate are presented.
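As a sketch of how the predictive-power columns relate to the sensitivity and specificity columns, PPP and NPP can be computed from Sn, Sp, and an assumed base rate with Bayes' rule. The function name `predictive_powers` is illustrative and not part of the chapter; the inputs are the Graham, Watts, & Timbrook (1991) male-sample F-scale values (Sn = .90, Sp = .97) from the table.

```python
def predictive_powers(sensitivity, specificity, base_rate):
    """Return (PPP, NPP) for a test applied at a given base rate of feigning."""
    # Expected proportions of all examinees falling in each cell of the 2 x 2 table
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    ppp = true_pos / (true_pos + false_pos)   # P(feigning | positive test sign)
    npp = true_neg / (true_neg + false_neg)   # P(honest | negative test sign)
    return ppp, npp


# Sn = .90 and Sp = .97 at the two hypothetical base rates used in the tables
for base_rate in (0.27, 0.19):
    ppp, npp = predictive_powers(0.90, 0.97, base_rate)
    print(f"BR = {base_rate:.2f}: PPP = {ppp:.2f}, NPP = {npp:.2f}")
```

Rounded to two decimals, these inputs give PPP = .92 and NPP = .96 at a base rate of .27, and PPP = .88 and NPP = .98 at a base rate of .19, matching the tabled row. The sketch does not guard against a scale that never produces a positive (or never a negative) result, in which case the corresponding predictive power is undefined.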
feigning scales, including data on sample characteristics, study methodology, cutting scores, sensitivity, specificity, and positive and negative predictive powers at two hypothetical base rates of malingering (.19 and .27). The most notable aspect of these data is the sheer number of studies that have been published utilizing psychiatric control groups, as also illustrated in a comprehensive listing by Pope et al. (2006). Several known-groups studies are present, as well as simulation designs. The majority of simulation designs used the recommendations of Rogers (1997) and incorporated incentives for accurate feigning and warning to be believable, although very few utilized a compliance check for understanding and adherence to instructions. This remains true even with more recent studies, which raises methodological concerns given the specific recommendations provided by Rogers (1997).
Although this table highlights the tremendous number of studies available on the MMPI-2 validity scales, it is difficult to glean specific information on cutting scores and operating characteristics here. Thus, Table 32.8 provides a summary of the aggregated MMPI-2 studies as well as results from other instruments and scales to be reviewed below.
Table 32.8 provides mean cutting scores, sensitivity, specificity, and positive and negative predictive powers across the MMPI-2 studies presented in Table 32.2. It should be noted that these cannot be taken as exact estimates of operating characteristics, but rather as a rough approximation of the values based on available studies. Overall, considerable variability exists for cutting scores for all these scales. The F scale shows the best operating characteristics across studies, with high specificity and moderate sensitivity. Additionally, F showed the highest sensitivity overall, whereas F(p) had the highest overall specificity. Considering the clinically relevant predictive powers, none of these indices had impressive PPP. In contrast, NPP was generally at .90 or higher for almost all these scales at both base rates, suggesting better ability to rule out than to rule in feigning. The modest PPP for these scales suggests that a positive test sign on the MMPI-2 is insufficient, on its own, for identifying feigning. These results suggest that to identify feigning, data from an additional indicator, such as the SIRS, should be used to supplement MMPI-2 validity scales in a determination of feigning. An example combining results from the MMPI-2 and the SIRS for detecting feigning will be presented later in this chapter.
An important caveat that should be noted for the MMPI-2 validity scales is the subtle problem resulting from the availability of multiple validity scales embedded in the measure. Routine consideration of all possible feigning scales on the MMPI-2 (F, Fb, F-K, F(p), Ds, FBS, RBS, etc.), with positive results from any one held as suggestive of feigning, will result in an overall elevated false positive rate because as Greene (2000) indicates, these scales are imperfectly correlated and thus tap somewhat different aspects of false symptom presentation. Thus, in order to minimize potential Type I error, only two or possibly three of these scales should be routinely utilized and taken as evidence of malingering. Further, as noted previously, elevations on these scales should only be considered possible indicators of malingering after VRIN and TRIN have been examined and other response styles ruled out.
Personality Assessment Inventory
The Personality Assessment Inventory (PAI; Morey, 1991, 2007) is an objective self-report measure of personality and psychopathology that has steadily grown in popularity. The PAI is intended for use with adults 18 years and older. It is written at a fourth-grade reading level and consists of 344 items answered in a 4-point Likert format. Responses are arranged into 22 non-overlapping scales covering four domains: validity scales (4), clinical scales (11), treatment indication scales (5), and interpersonal style scales (2).
Of the original PAI validity indices, two assess inconsistent and random response styles and two assess positive and negative self-presentation styles. The inconsistency (ICN) scale consists of paired questions selected due to content similarity and the infrequency (INF) scale includes eight items that are either very rarely or nearly always endorsed, but which are not indicative of psychopathology. Both these scales are elevated by random responding.
The presence of the ICN and INF scales on the PAI facilitates Greene’s (2000) recommended systematic approach to evaluating profile validity. Therefore, the infrequency and inconsistency scales should be consulted first before addressing the possibility of symptom feigning.
The original PAI validity scales assessing self-presentation styles included a positive impression (PIM) scale and a negative impression (NIM) scale. The NIM scale includes nine items, which, when endorsed, suggest a tendency to report a greater degree of symptomatology than would be predicted by objective evidence. A NIM elevation therefore suggests an exaggerated, unfavorable impression or feigning.
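The caveat about treating an elevation on any one of several MMPI-2 feigning scales as suggestive of feigning can be illustrated numerically. The sketch below is not from the chapter: it computes the combined false positive rate under the simplifying assumption that the scales are statistically independent, together with bounds that hold whatever the correlation among scales. The function names are hypothetical, and the three specificities are borrowed from the Storm & Graham (2000) male uncoached comparison in Table 32.2 purely as illustrative inputs.

```python
from math import prod


def any_positive_fp_rate_independent(specificities):
    """False positive rate when ANY elevated scale counts as a positive sign.

    ASSUMPTION: the scales are independent among honest responders. The text
    notes they are imperfectly correlated, so this is only an approximation.
    """
    return 1.0 - prod(specificities)


def any_positive_fp_bounds(specificities):
    """Lower and upper bounds on the any-positive false positive rate.

    These hold for any degree of correlation among scales: the rate can never
    fall below the worst single scale, nor exceed the sum of the single rates.
    """
    fp_rates = [1.0 - sp for sp in specificities]
    return max(fp_rates), min(1.0, sum(fp_rates))


# Illustrative specificities for F, F-K, and F(p)
specificities = [0.98, 0.98, 0.95]
low, high = any_positive_fp_bounds(specificities)
combined = any_positive_fp_rate_independent(specificities)
print(f"Any-positive false positive rate (independence): {combined:.3f}")
print(f"Bounds for any correlation: {low:.3f} to {high:.3f}")
```

With these inputs the independence figure is close to .09, against .05 for the least specific single scale, and each further scale consulted can only push the combined rate upward. That is the arithmetic behind restricting routine use to two or three scales.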
Study  Group  N  Sample  Design  Inc  Warn  Comp Check  Scale  Cutoff  Sn/Sp  PPP/NPP (BR = .27)  PPP/NPP (BR = .19)
Rogers, Sewell, Morey, & Ustad, 1996 FB 182 Stud Sim Y Y N NIM T 77 .56 .52 .41
Hon 221 Pt N .81 .83 .89
Mal I 3 .32 .52 .43
.90 .78 .85
Liljequist, Kinder, & Schinka, 1998 FB 27 Stud Sim Y Y Y Mal I 3 .44 1.00 1.00
Hon 29 Pt-PTSD N 1.00 .83 .88
FB 27 Stud Sim Y Mal I 3 .97 .84 .77
Hon 30 Pt-alcoh N 1.00 .82 .88
Morey & Lanier, 1998 FB 44 Stud Sim Y N N NIM T > 80 .89 .75 .65
Hon 45 Pt N .89 .96 .97
RDF T > 65a .95 .90 .85
.96 .98 .99
Rogers et al., 1998 Mal 16 Corr KGb n/a n/a n/a NIM T > 80 .63 .82 .75
Hon 43 Corr IPt n/a .95 .87 .92
Rogers et al., 1998 Mal 57 Corr KGb n/a n/a n/a NIM T 77 .84 .54 .43
Hon 58 Corr IPt n/a .74 .94 .95
Mal I >4 .47 .55 .44
.86 .81 .87
RDF T > 65a .51 .40 .30
.72 .80 .86
Calhoun et al., 2000 FB 23 Stud Sim Y Y N NIM T 73c .83 .32 .23
Hon 23 Pt-PTSD N .35 .85 .90
NIM T 84d .74 .32 .23
.43 .82 .88
NIM T 92e .61 .39 .29
.65 .82 .88
Table 32.3. (continued)
Kucharski et al., 2007k Malm 31 Corr-CST KGb n/a n/a n/a NIM T 73 .90 .49 .38
Honm 85 Corr-CST n/a .66 .95 .97
NIM T 77 .90 .59 .48
.77 .95 .97
NIM T 81 .87 .61 .49
.79 .94 .96
NIM T 84o .84 .63 .52
.82 .93 .96
NIM T 88 .81 .68 .58
.86 .92 .95
NIM T 92 .71 .70 .60
.89 .89 .93
Note: For studies, only first author is listed unless additional authors required to identify reference. n/a = not applicable. Inc = incentive for successful faking, Warn = warning to fake believably, Comp Check = check for compliance with faking instructions, Sn = sensitivity, Sp = specificity, BR = base rate, PPP = positive predictive power, NPP = negative predictive power. For Group: FB = fake bad, FG = fake good, Hon = Honest, Mal = Malingering. For Sample: alcoh = alcohol dependence, Corr = correctional, CST = competency to stand trial evaluee, CV = community volunteer, FF = forensic feigning condition, IPt = inpatient, MDD = major depressive disorder, OPt = outpatient, PF = psychiatric feigning condition, Pt = patient, Psy = psychiatric (mixed or unspecified), PTSD = post-traumatic stress disorder, Stud = student. For Design: Sim = simulation, KG = known groups. For Scales: CDF = Cashel Discriminant Function Scale, Mal I = Malingering Index, NIM = Negative Impression Management, RDF = Rogers Discriminant Function. Sn/Sp column indicates sensitivity in top line and specificity in bottom line for each comparison.
a Cutoff given in paper was >.57 raw; this was converted to an approximate T-score for the purpose of this table.
b Formed using SIRS as criteria.
c Cutoff given in paper was 8 raw.
d Cutoff given in paper was 11 raw.
e Cutoff given in paper was 13 raw.
f Cutoff determined a priori.
g Cutoff given in paper was >.06 raw; this was converted to an approximate T-score for the purpose of this table.
h Cutoff given in paper was 1.70 raw; this was converted to an approximate T-score for the purpose of this table.
i Within-subjects design where individuals took test under standard instructions and under instructions to fake good (both within 2–24 hour period); order was randomized.
j Note that the same Hon comparison group was used for both FB and FG analyses.
k Personal communication with the first author indicated that the 2006 and 2007 studies had overlapping data sets; both are provided to illustrate test characteristics at various cutting scores.
l The Hon group was comprised of those scoring in the Indeterminate range (N = 44) and those producing valid SIRS results.
m Random responders were eliminated from this group prior to data analysis.
o The operating characteristics at this cutoff score did not contribute to the operating characteristic summary table, as the 2006 Kucharski et al. study provides data for this score based upon the more inclusive sample.
as a validity indicator. The desirability index taps the “inclination to appear socially attractive, morally virtuous, or emotionally well composed” (p. 118) and thus should be sensitive to a “fake good” approach to the test. The debasement index is the most directly applicable to detection of over-reporting of psychological problems. It assesses a tendency to “devalue oneself by presenting more troublesome emotional and personal difficulties than are likely” (p. 118).
Significant potential issues exist with regard to the MCMI-III validity scales. First, given the presence of only three items, the V scale may “miss” a significant proportion of random responders, possibly leading to erroneous conclusions regarding the presence of over-reporting of symptoms on the basis of other scales. Charter and Lopez (2002) reported that approximately 50% of entirely random MCMI-III protocols received scores of 0 or 1 on the V scale and thus would have avoided proper identification. Thus, the stepwise approach to evaluating protocol validity possible for the MMPI-2 is potentially much less powerful for the MCMI-III. Another issue is that the BR scores of 75 and 85 for the X, Y, and Z scales were set simply to demarcate percentile rankings of 75 and 90 in the normative patient sample without regard to any independent determination of presence or absence of response sets (pp. 61–62). This is a rather weak developmental method for scales used to make important decisions regarding response sets. Yet another potential concern is that BR score thresholds of 75 and 85 are used to signal the presence of a trait or condition on clinical scales. Thus, this metric may be confusing to clinicians who do not closely read the manual as they may misinterpret elevated scores on Scales X, Y, and Z as clearly indicating the presence of the target response set. A concern specific to Scales Y and Z is that they were developed based on the responses of a small number of graduate students (n = 12; Millon, 1987) who were asked to respond under “fake good” and “fake bad” instructions, and only minimal detail (group means) is offered regarding results from two small-scale validation studies reported in the MCMI-II manual. Although the scales were apparently modified somewhat in the transition from the MCMI-II to III, no further empirical work on the scales was presented in the MCMI-III manual, raising questions about the robustness and generalizability of results from these scales. Craig (1999), in a review paper on forensic applications of the MCMI, states “readers should be aware that the validity scales of the MCMI remain the least researched and least validated of all MCMI scales and hence could be subject to extensive cross-examination” (p. 295). Morgan, Schoenberg, Dorr, and Burke (2002) reported that the MCMI-III feigning scales were much less sensitive to feigning than the comparable MMPI-2 scales. In their study of 191 psychiatric inpatients, nearly 31% of protocols classified as invalid by the MMPI-2 F(p) scale were categorized as valid by the MCMI-III Scale X. Thus, there are a number of reasons to fear that the MCMI-III validity scales are too inaccurate for cases where malingering is a significant concern.
Table 32.4 presents results from published malingering detection studies of the MCMI-III that included a patient control group. Perhaps the most striking aspect here is the paucity of studies. Only three published studies were found that included a patient control group for the MCMI-III using traditional interpretative indices, all of which used simulation designs. These simulation studies each have minor methodological weaknesses, especially when a student population is used to “fake bad.” Table 32.4 shows generally low to moderate sensitivity values but somewhat stronger specificity statistics. Classification statistics from these studies suggest a low to moderate PPP, depending on the base rate (.29–.43 at a base rate of .19, and .39–.76 at a base rate of .27). In contrast to PPP, NPP results are generally stronger with values ranging from .81 to .93 at a base rate of .19, and from .73 to .89 at a base rate of .27.
Given the many questions raised earlier about the acceptability of the MCMI-III feigning indices, as well as the small number of patient-controlled studies, this test does not appear to be an ideal choice in settings characterized by higher base rates of malingering. In particular, given the modest PPP noted in Table 32.4, the MCMI-III appears inadequate to document feigned symptom reports without data from an additional malingering test, and thus no summary results are reported in Table 32.8.
Dedicated Malingering Tests
In addition to malingering detection scales embedded in multiscale inventories of personality and psychopathology, there are several tests devoted solely to the detection of feigned symptom reports. Although routine use of tests providing only information about malingering may be appropriate in settings with moderate to high base rates of malingering, in other settings they are likely to be used only for evaluees who have provided evidence for feigned symptoms on multiscale inventories or in
Study  Group  N  Sample  Design  Inc  Warn  Comp Check  Scale  Cutoff  Sn/Sp  PPP/NPP (BR = .27)  PPP/NPP (BR = .19)
Daubert & Metzler, 2000 FB 80 OPt Sim Y N N X 85BR .61 .54 .43
Hon 80 OPt .81 .85 .90
Y 35BR .58 .47 .36
.76 .83 .89
Z 85BR .55 .49 .38
.79 .83 .88
X* 80BR .76 .49 .36
.71 .89 .93
Y* 39BR .64 .48 .37
.74 .85 .90
Z* 81BR .64 .52 .41
.78 .85 .90
Schoenberg, Dorr, & Morgan, 2003 Mal 106 Stud Sim Y N Y X* >89BR .35 .52 .41
Hon 181 IPt .88 .79 .85
Y* <21BR .33 .45 .34
.85 .77 .84
Z* >82BR .59 .39 .29
.66 .81 .87
X >178r .00 .00 .00
1.00 .73 .81
Schoenberg, Dorr, & Morgan, 2006 FB 18 Stud Sim Y Y Y DisFxA n/a* .51 .76 .67
Hon 21 Ipt .94 .84 .89
DisFxB .64 .58 .47
.83 .86 .91
FB 55 Stud Sim Y Y Y DisFxA n/a* .45 .62 .51
Hon 88 Ipt .90 .82 .87
DisFxB .71 .61 .49
Note: For studies, only first author is listed. Sn/Sp column indicates sensitivity in top line and specificity in bottom line for each comparison. n/a = not applicable. Inc = incentive for successful faking, Warn = warning to fake believably, Comp Check = check for compliance with faking instructions, Sn = sensitivity, Sp = specificity, PPP = positive predictive power, NPP = negative predictive power. For Group: FB = fake bad, Hon = Honest, Mal = Malingering. For Sample: IPt = inpatient, OPt = outpatient, Stud = student. For Design: Sim = Simulation. For Scale: X = Disclosure, Y = Desirability, Z = Debasement; * = optimal cutting score in samples; DisFxA/DisFxB = Discriminant Function A/B.
other aspects of their presentations. The tests reviewed in this section include only those with adequate independent evaluation of their operating characteristics including patient control groups.
Structured Inventory of Malingered Symptomatology
The Structured Inventory of Malingered Symptomatology (SIMS) was developed by Smith (1992) with the goal of producing a brief malingering screening instrument that was easily administered, capable of high rates of detection, and written at a low reading level (fifth grade). It was recently published by Psychological Assessment Resources (Widows & Smith, 2005). In addition to using a self-report true/false format, the 75-item SIMS was designed to include scales sensitive to several specific forms of malingering, including psychosis (P), amnesia (Am), neurologic impairment (N), affective disorder (Af), and low intelligence (Li) (Smith & Burger, 1997). The total score, summed across all scales on the SIMS, appears to be the most commonly used variable for predicting the presence or absence of malingering. One limitation of the SIMS is that it does not have a scale for addressing the possibility of random responding. A totally random approach to the test would likely cause mid-range elevations on all the malingering scales, a potentially misleading finding.
Table 32.5 presents results from the five published studies of the SIMS that included an appropriate control group. Strengths of the available studies of the SIMS include the use of both known-groups and simulation designs, and the provision of incentives for successful feigners in the analog studies. Another noteworthy aspect of the available data is the use of a fairly consistent cutting score for predicting malingering, with all studies but one using the same cutoff score. In addition, there has been increased variety in control groups used in recent years, with adolescent, correctional, student, personal injury, and psychiatric populations all being studied. In spite of this progress, the number of nonforensic adult psychiatric inpatients studied is still quite low. An additional weakness of the available studies is the failure to check compliance in supposedly honest patient groups. A final potential concern is the lack of a random responding scale, although presumably the examiner would note such an approach.
Table 32.8 presents mean classification values summarizing data from the SIMS. It can be seen that the SIMS is characterized by moderately high sensitivity and moderate specificity. Translated into predictive powers, at both base rates NPP is excellent (>.90), although PPP is rather modest. This profile suggests that the SIMS might make an excellent brief screening tool for feigned symptom reports, as a negative test sign on the instrument is highly predictive of honest responding. In contrast, the relatively weak PPP makes the SIMS a poor candidate for following up a positive test sign from a multiscale inventory such as the MMPI-2 or the PAI. At that stage, high PPP is critical as labeling an evaluee a malingerer should never be done lightly. Thus, another instrument with stronger PPP should probably be preferred to the SIMS for ruling in psychiatric feigning.
Miller Forensic Assessment of Symptoms Test
The Miller Forensic Assessment of Symptoms Test (M-FAST; Miller, 2001) is a 25-item structured interview for psychiatric feigning. The M-FAST is intended to be a brief screening instrument, taking only 5 min to administer. Based on malingering detection strategies identified by Rogers (1997), the M-FAST consists of seven subscales: unusual hallucinations (UH), reported versus observed behavior (RO), rare symptom combinations (RC), extreme symptoms (ES), negative image (NI), unusual symptom course (USC), and suggestibility (S). The total score is obtained by summing the scale scores, and a cutting score of 6 or above suggests feigning.
Table 32.6 presents available published data on patient-controlled studies of the M-FAST. The M-FAST has been validated in a number of published studies with college students (Miller, 2001), forensic psychiatric inpatients (Jackson, Rogers, & Sewell, 2005; Miller, 2001), criminal defendants found incompetent to stand trial (Miller, 2004), prisoners (Guy & Miller, 2004; Jackson et al., 2005), and civil forensic populations (Alwes, Clark, Berry & Granacher, 2008). Favorable findings independent of the scale’s author have been reported, as have supporting results from both known-groups and simulation designs using a single recommended cutting score.
Table 32.8 shows mean values summarizing M-FAST findings. Mean sensitivity is moderate at .79, whereas mean specificity is somewhat stronger at .84. These parameters translate into low to moderate PPP, but NPP values which are strong (≥ .90). This profile suggests that the M-FAST is best used as a screen to rule out feigning. However, individuals with positive test signs on the M-FAST should
Study  Group  N  Sample  Design  Inc  Warn  Comp Check  Scale  Cutoff  Sn/Sp  PPP/NPP (BR = .27)  PPP/NPP (BR = .19)
Rogers, Hinds & Sewell, 1996 FB 53 Adol IPt Sim Y N N Total >16 .44a .71 .61
Hon 53 Adol IPt .94a .82 .88
Lewis, Simcox, & Berry, 2002 Mal 20 Corr KGb n/a n/a n/a Total >16 1.00 .45 .30
Hon 31 Corr .55 1.00 1.00
Merckelbach & Smith, 2003 FB 57 Stud Sim N Y N Total >16 .93 .95 .92
Hon 10 Psy IPt .98 .97 .98
Hon 231 Stud
Alwes et al., 2008 PsyM 23 Pers Inj KG n/a n/a n/a Total >16 .96 .52 .40
PsyH 172 Pers Inj .97 .98 .99
Neu M 75 Pers Inj KG n/a n/a n/a Total >16 .75 .41 .31
Neu H 178 Pers Inj .60 .87 .91
Edens et al., 2007 FB 30 Corr Sim Y Y Y Total >14 .90 .36 .26
Hon 30 Corr IPt Y .40 .92 .94
Note: For studies, only first author is listed. Sn/Sp column indicates sensitivity in top line and specificity in bottom line for each comparison. n/a = not applicable. Inc = incentive for successful faking, Warn = warning to fake believably, Comp Check = check for compliance with faking instructions, Sn = sensitivity, Sp = specificity, BR = base rate, PPP = positive predictive power, NPP = negative predictive power. For Group: FB = fake bad, Hon = Honest, Mal = Malingering, PsyM = psychiatric malingering, PsyH = psychiatric honest, Neu M = neurocognitive malingering, Neu H = neurocognitive honest. For Sample: Adol = Adolescent, Corr = correctional, IPt = inpatient, Stud = student, Pers Inj = Workers Compensation or Personal Injury claimant. For Design: Sim = Simulation, KG = Known-groups. For Scale: Tot = SIMS Total Score.
a Sensitivity and Specificity estimated from PPP, NPP, and base rate of 50%.
b The SIRS was used as criterion for individual classification.
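Footnote a states that sensitivity and specificity for one study were estimated from PPP, NPP, and a 50% base rate, but the calculation itself is not shown. One way such a back-calculation can be done, offered here only as a sketch, is to solve the two Bayes' rule expressions for Sn and Sp. Both function names are illustrative. The check is a round trip that starts from the tabled Sn = .44 and Sp = .94, because the study's original PPP and NPP are not reported in this section.

```python
def predictive_powers(sensitivity, specificity, base_rate):
    """Forward direction: (PPP, NPP) from Sn, Sp, and base rate."""
    true_pos = sensitivity * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    false_neg = (1.0 - sensitivity) * base_rate
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


def sn_sp_from_predictive_powers(ppp, npp, base_rate):
    """Inverse direction: (Sn, Sp) from PPP, NPP, and base rate.

    Let p be the proportion of examinees with a positive test sign. True
    positives are PPP * p and false negatives are (1 - NPP) * (1 - p); these
    must sum to the base rate, which gives p. Undefined when PPP + NPP == 1.
    """
    positive_rate = (base_rate + npp - 1.0) / (ppp + npp - 1.0)
    sensitivity = ppp * positive_rate / base_rate
    specificity = npp * (1.0 - positive_rate) / (1.0 - base_rate)
    return sensitivity, specificity


# Round trip at the 50% base rate mentioned in footnote a
ppp, npp = predictive_powers(0.44, 0.94, 0.50)
sn, sp = sn_sp_from_predictive_powers(ppp, npp, 0.50)
print(f"PPP = {ppp:.2f}, NPP = {npp:.2f} -> Sn = {sn:.2f}, Sp = {sp:.2f}")
```

Recovering Sn and Sp this way is what allows predictive powers to be recomputed at the .27 and .19 base rates used throughout the tables, even when a study reported only PPP and NPP at its own sample base rate.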
Table 32.6. Results from patient-controlled malingering detection studies using the M-FAST.
Study  Group  N  Sample  Design  Inc  Warn  Comp Check  Scale  Cutoff  Sn/Sp  PPP/NPP (BR = .27)  PPP/NPP (BR = .19)
Guy & Miller, 2004 Mal 21 Corr IPt KG N N N Total 6 .86 .65 .54
Hon 29 Corr IPt .83 .94 .96
Miller, 2004 Mal 14 Psy CST KG N N N Total 6 .93 .67 .56
Hon 36 Psy CST .83 .97 .98
Jackson & Rogers, 2005 FB 43 Corr Sim N Y Y Total 6 .76 .74 .64
Hon 96 Corr .90
Mal 8 Psy CST KG N N N Total 6 .91 .94
Hon 41 Psy CST
Veazey, Miller, & Hays, 2005 Mal 5 Psy IPt KG N N N Total 6 .80 .66 .56
Hon 39 Psy IPt .85 .92 .95
Guy, Kwartner, & Miller, 2006 FB 48 Stud Sim Y Y Y Total 6 .88 .64 .53
Hon 48 Scz .82 .95 .97
FB 41 Stud Sim Y Y Y Total 6 .84 .60 .48
Hon 20 BP .79 .93 .95
FB 51 Stud Sim Y Y Y Total 6 .62 .52 .41
Hon 25 MDD .79 .85 .90
FB 50 Stud Sim Y Y Y Total 6 .63 .61 .50
Hon 47 PTSD .85 .86 .91
Alwes, 2008 PsyM 23 Pers Inj KG n/a n/a n/a Total 6 .83 .77 .68
PsyH 172 Pers Inj .91 .94 .96
Note: For studies, only first author is listed unless others required to identify reference. Sn/Sp column indicates sensitivity in top line and specificity in bottom line for each comparison. n/a = not applicable. Inc = incentive for successful faking, Warn = warning to fake believably, Comp Check = check for compliance with faking instructions, Sn = sensitivity, Sp = specificity, BR = base rate, PPP = positive predictive power, NPP = negative predictive power. For Group: FB = fake bad, Hon = Honest, Mal = Malingering, PsyM = psychiatric malingering, PsyH = psychiatric honest. For Sample: BP = bipolar disorder, Corr = correctional, CST = competency to stand trial evaluees, IPt = inpatient, MDD = major depressive disorder, Pers Inj = workers compensation or personal injury claimant, PTSD = post traumatic stress disorder, Scz = schizophrenia, Stud = student. For Design: Sim = Simulation, KG = Known-groups.
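The screening role described for brief instruments, in which a positive test sign is followed by a second procedure, can be sketched as two successive applications of Bayes' rule. This is an illustration rather than the chapter's own worked example. The first stage uses the mean M-FAST sensitivity and specificity quoted in the text (.79 and .84); the second-stage values are hypothetical placeholders, and the calculation assumes the two tests err independently once true status is known, which may not hold in practice.

```python
def posterior_after_positive(prior, sensitivity, specificity):
    """Probability of feigning after a positive test sign, given a prior probability."""
    true_pos = sensitivity * prior
    false_pos = (1.0 - specificity) * (1.0 - prior)
    return true_pos / (true_pos + false_pos)


base_rate = 0.27  # one of the two hypothetical base rates used in the tables

# Stage 1: brief screen, using the mean M-FAST values reported in the text
after_screen = posterior_after_positive(base_rate, sensitivity=0.79, specificity=0.84)

# Stage 2: follow-up test. HYPOTHETICAL values chosen only to show the effect of
# high specificity; they are not estimates for any instrument in this chapter.
# ASSUMPTION: errors on the two tests are independent given true status.
after_followup = posterior_after_positive(after_screen, sensitivity=0.50, specificity=0.98)

print(f"After positive screen:    {after_screen:.2f}")
print(f"After positive follow-up: {after_followup:.2f}")
```

The positive screen alone leaves the probability of feigning at roughly two in three, which is the modest PPP noted in the text. A highly specific second procedure is what moves that figure into a range where ruling in feigning becomes defensible.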
receive another procedure to confirm presence of feigning, as the M-FAST’s PPP values are far too low to rule in false symptom reports with sufficient confidence.
Structured Interview of Reported Symptoms
The SIRS (Rogers et al., 1992) was developed to assess malingered psychological symptoms as well as other response sets in a structured interview format. The 172 items are organized into eight primary scales, which have consistently shown the ability to discriminate between malingerers and honest responders, and five supplementary scales which have not proven as effective in detecting malingering or are intended to assess other response sets (inconsistency, defensiveness). Three of the primary scales focus on very unusual symptom presentations (rare symptoms, symptom combinations, improbable and absurd symptoms), four assess the range and severity of symptoms (blatant symptoms, subtle symptoms, selectivity of symptoms, severity of symptoms), and the final primary scale examines discrepancies between self-reports of overt symptoms and interviewer observations (reported vs. observed symptoms). The structured interview format adds an additional dimension to data considered for identification of malingering beyond the traditional methods (unstructured clinical interview, self-report test results, behavioral observations, and record reviews). Additionally, this format standardizes the interviewer’s behavior and serves to minimize inadvertent examiner clues regarding the implausibility of symptom endorsements. Finally, structured clinical interviews have facilitated improved reliability and validity of psychiatric diagnoses and offer the same potential for the detection of malingering (Rogers, 1995).
Raw scores from each of the SIRS scales are classified as honest, indeterminate, probable feigning, or definite feigning based on extensive data from patients, prisoners, analog malingerers, and independently identified malingerers (Rogers et al., 1992). Overall results are interpreted as suggestive of feigned psychological symptoms if one or more primary scales are in the definite feigning range, if three or more primary scales are in the probable feigning range, or if the total raw score exceeds 76. Reports published by the test’s authors as well as independent investigators have supported the accuracy of the SIRS for detection of malingering. Inter-rater reliability, a key concern for interview-based data, has been reported to be quite strong (mean reliability = .96). Although the SIRS includes a supplementary consistency scale (INC), it does not appear to have been extensively validated for detection of random responding. Presumably, however, random answers would be noted by the interviewer.
Table 32.7 presents results from malingering detection studies using the SIRS and including an appropriate control group. It should be noted that the manual summarizes data from several published studies on the SIRS which have been collapsed for the sake of convenience in Table 32.7 as Rogers et al. (1992). Thus, both simulation and known-groups designs have been utilized to evaluate the SIRS. Groups of simulators have been offered incentives to malinger and given warnings to respond believably, and subjects’ compliance with the instructions has been checked, and coached subjects have not been found to avoid detection on the SIRS.
Classification data in Table 32.8 indicate that, using the primary scale (PS) with a cutoff of 3 in the probable feigning range, the SIRS has high PPP for identifying malingering at both base rates used in Table 32.8, and respectable NPP values also have been reported. The major strength of the SIRS appears to be its high PPP. Thus, when ruling in feigning, the SIRS appears to be the instrument of choice. Its major limitation may be the requirement for more direct clinician time for administration than necessary for self-report procedures.
Hypothetical Classification Rates for a Two-Stage Malingering Detection Strategy
Considering the available data on the classification rates of various malingering detection scales and tests in Table 32.8, it is clear that almost all available indices, with one exception, have much stronger NPP than PPP. This situation points to a need to explore the accuracy of sequential application of these indicators, or the “successive hurdles approach” described by Meehl and Rosen (1955). A simple model would involve a first-stage deployment of an indicator with high NPP to rule out malingering in the subset of evaluees with a negative test sign, and a second stage consisting of administration of an additional malingering detection procedure with high PPP to the subset of individuals with a positive test sign from stage one. This strategy potentially maximizes the utility of information obtained from tests with divergent NPP and PPP as well as benefits from the heightened base rate of malingering among second stage evaluees.
Implementing this sequential strategy requires identification of a test with high NPP for Stage 1
Study Group N Sample Design Inc Warn Comp Check Scale Cutoff Sn/Sp BR = .27 PPP/NPP BR = .19 PPP/NPP
Rogers et al., 1992 Mal 206 Mixed KG+Sim Y Y Y PS 3 Feign .48 .97 .96
Hon 197 Pt/Cor/Com n/a .99 .84 .89
Gothard et al., 1995 Mal 37 For/Cor KG+Sim Y Y Y PS 3 Feign .97 .93 .90
Hon 78 For/Cor Y .97 .99 .99
Note: For studies, only first author is listed. Inc = incentive for successful faking, Warn = warning to fake believably, Comp Check = check for compliance with faking instructions, Sn = sensitivity, Sp = specificity, BR = base rate,
PPP = positive predictive power, NPP = negative predictive power. For Group: Mal = malingering, Hon = honest. For Sample: Mixed = forensic, correctional, community, & student, Pt = patient, Cor = correctional, Com = community. For
Design: KG+Sim = includes both known groups and simulation designs. For Scale: PS = Primary Scales. For Cutoff: Feign = in probable feigning range. Sn/Sp column indicates sensitivity in top line and specificity in bottom line for each
comparison.
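The overall SIRS interpretation rule stated in the text (one or more primary scales in the definite feigning range, three or more in the probable feigning range, or a total raw score above 76) can be sketched as a simple decision function. This is a minimal illustration only: the function name and the range labels are hypothetical, and the per-scale raw-score cutoffs that assign each scale to a range come from the SIRS manual and are not reproduced here, so the function takes already-classified ranges as input.

```python
from collections import Counter


def sirs_suggests_feigning(primary_scale_ranges, total_raw_score):
    """Apply the overall SIRS rule described in the text.

    primary_scale_ranges: one label per primary scale, each of
        "honest", "indeterminate", "probable", or "definite"
        (hypothetical labels for the four classification ranges).
    total_raw_score: total SIRS raw score.
    """
    counts = Counter(primary_scale_ranges)
    return (
        counts["definite"] >= 1      # one or more scales in the definite range
        or counts["probable"] >= 3   # three or more scales in the probable range
        or total_raw_score > 76      # total raw score exceeds 76
    )


# Two probable-range scales and a total score of 50 do not meet any criterion.
print(sirs_suggests_feigning(["probable"] * 2 + ["honest"] * 6, 50))  # False
# Three probable-range scales meet the second criterion.
print(sirs_suggests_feigning(["probable"] * 3 + ["honest"] * 5, 50))  # True
```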
Table 32.8. Means across studies for cutting score, sensitivity, specificity, PPP, and NPP for detection of
malingering using the MMPI-2, PAI, SIMS, M-FAST, & SIRS.
Cutting score
Mean 102.38 13.55 94.54 82.2 3.5 66.0 15.7 >6 >3
Range 62–100 (4)–25 70–120 66–110 2–5 51–81 14–16 n/a n/a
SD 17.46 7.10 15.83 8.9 1.0 9.3 .82 0 n/a
Sensitivity
Mean .74 .73 .62 .73 .53 .69 .83 .79 .73
Range .21–1.00 .17–1.00 .33–.97 .05–.91 .13–1.00 .29–.95 .44–1.00 .62–.93 .48–.97
SD .20 .23 .20 .22 .27 .21 .21 .11 .33
Specificity
Mean .89 .86 .90 .75 .84 .77 .74 .84 .98
Range .64–1.00 .49–.98 .75–1.00 .17–1.00 0–1.00 .39–.99 .40–.98 .79–.91 .97–.99
SD .11 .12 .08 .21 .27 .22 .25 .04 .01
PPP BR = .27
Mean .77 .68 .77 .62 .67 .62 .57 .65 .95
Range .43–1.00 .35–.94 .46–1.00 .29–1.00 .27–1.00 .32–.96 .36–.95 .52–.77 .93–.97
SD .18 .19 .20 .21 .24 .23 .22 .04 .02
NPP BR = .27
Mean .91 .90 .87 .89 .76 .87 .93 .92 .91
Range .77–1.00 .75–1.00 .79–.99 .74–.96 0–.96 .78–.98 .82–1.00 .85–.97 .84–.97
SD .06 .07 .06 .06 .23 .07 .07 .07 .03
PPP BR = .19
Mean .68 .60 .70 .53 .60 .53 .47 .64 .93
Range .32–1.00 .26–.91 .35–1.00 .20–1.00 .19–1.00 .23–.93 .26–.92 .41–.68 .90–.96
SD .22 .21 .25 .24 .28 .25 .26 .08 .02
NPP BR = .19
Mean .94 .93 .91 .92 .81 .91 .95 .95 .94
Range .84–1.00 .83–1.00 .85–.99 .82–.97 0–.97 .85–.99 .88–1.00 .90–.98 .89–.99
SD .04 .05 .09 .04 .25 .05 .05 .03 .04
# comparisons 29 22 13 26 13 9 6 9 2a
Note: Cutting score = point at or above which scale value interpreted as a positive test sign. All cutting scores given in T-scores except F–K which is given in raw
scores. See text for definition of sensitivity, specificity, Positive Predictive Power (PPP) and Negative Predictive Power (NPP). BR = base rate of feigning.
a = Many published studies summarized in SIRS manual.
screening and a test with high PPP for Stage 2 classification of malingering. Although the PAI is a significant rival, the multiscale MMPI-2 has slightly more to recommend it for use in a first stage of malingering screening, including extensive documentation of the accuracy of its validity scales, relatively widespread use in forensic settings, and the availability of several indices with high NPP. Of the three MMPI-2 fake-bad indices reviewed above, the F scale has the highest NPP in Table 32.8. Thus, the MMPI-2 F scale will be used in the hypothetical two-stage process provided below as an example.
Because the second stage of evaluation here is aimed at identifying malingerers with a high level of confidence, high PPP is required for an instrument used at this stage. Of the three dedicated malingering detection instruments discussed above, the SIRS has the highest average PPP for the detection of malingering, relatively extensive empirical evaluation, and the theoretical advantage of assessing malingering in a different format (structured interview) than the self-report MMPI-2. Thus, the SIRS will be employed in the second stage of the example evaluation process.
Table 32.9 presents results from a hypothetical two-stage screening process using the MMPI-2 F scale at Stage 1 and the SIRS at Stage 2. The table includes Stage 1 data from the MMPI-2 F scale for 1,000 evaluees. Given average sensitivity and specificity results for the F scale using a cutting score of >102T, 280 evaluees would have a positive F sign. However, the associated PPP, given a malingering base rate of .27, would be about 71%. This means that
Table 32.9. Classification Parameters of One- and Two-Stage Feigning Detection Strategies.
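The arithmetic behind the two-stage example can be sketched as follows. The Stage 1 figures (about 280 positive F signs and a PPP of about 71% among 1,000 evaluees at a base rate of .27) reproduce the values given in the text when the mean F scale sensitivity and specificity (.74 and .89) are used. The Stage 2 figures are an illustration only and are not taken from Table 32.9: they assume that the mean SIRS sensitivity and specificity (.73 and .98, read here from the last column of Table 32.8) apply unchanged among evaluees with a positive Stage 1 sign, that is, that the two tests err independently given true status. The helper name screen is hypothetical.

```python
def screen(n_malingering, n_honest, sn, sp):
    """Expected classification counts when one test is applied to a group."""
    tp = n_malingering * sn      # malingerers correctly flagged
    fp = n_honest * (1 - sp)     # honest responders incorrectly flagged
    return {"tp": tp, "fp": fp, "fn": n_malingering - tp, "tn": n_honest - fp}


n_evaluees, base_rate = 1000, 0.27

# Stage 1: MMPI-2 F scale, mean Sn = .74 and Sp = .89
stage1 = screen(n_evaluees * base_rate, n_evaluees * (1 - base_rate), sn=0.74, sp=0.89)
positives1 = stage1["tp"] + stage1["fp"]
ppp1 = stage1["tp"] / positives1
print(f"Stage 1 positives: {positives1:.0f}, PPP = {ppp1:.2f}")

# Stage 2: SIRS given only to Stage 1 positives (assumed Sn = .73, Sp = .98).
# The base rate at this stage equals the Stage 1 PPP.
stage2 = screen(stage1["tp"], stage1["fp"], sn=0.73, sp=0.98)
positives2 = stage2["tp"] + stage2["fp"]
ppp2 = stage2["tp"] / positives2
print(f"Stage 2 positives: {positives2:.0f}, PPP = {ppp2:.2f}")
```

Under these assumptions the heightened base rate entering Stage 2 is what lifts the second-stage PPP; a different assumption about how the two tests' errors are related would change the Stage 2 result.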
Abstract
Pretreatment planning is an essential element in psychotherapy. Constraints imposed by managed care,
efficiency of treatment/cost-effectiveness issues, and increased likelihood of change and increased
magnitude of change are obvious justifications for the provision of individualized and comprehensive
pretreatment assessment. Relatedly, quality pretreatment assessment is essential for accurate
patient–treatment matching. In this vein, systematic treatment selection (STS), prescriptive psychotherapy
(PT), systematic treatment (ST) and Innerlife are variants of an overriding empirically supported model of
patient–treatment matching. More specifically, STS, PT, ST, and Innerlife employ accurate, comprehensive
pretreatment assessment to help guide the clinician in the selection and matching of specific
psychotherapeutic strategies and principles. These research-informed strategies and principles of
change, selected based on pretreatment data, efficiently and effectively address the unique treatment
needs of the patient and their presenting problem(s).
The first and most central question facing the clinician who is assessing a client is, “How will these findings be used?” A focused and meaningful evaluation requires that one keeps the ultimate uses of the report in mind. It is our experience that clinicians frequently fail to obtain a clear understanding of the referral question and may respond out of habit to certain inferred questions, thus providing results and information that are routine and overly general.
With the advent of managed health care, there has been decreasing support for the use of systematic psychological assessment in many clinical practices. Reliance on the clinical interview has largely replaced the use of reliable and valid psychological tests. This, we believe, is a serious mistake and places too many decisions on a method that suffers from poor reliability and uncertain validity. In a related way, much debate has occurred between those who believe that standard procedures and reports can be applied to the process of psychological assessment and those who maintain that each assessment should be specifically tailored to each client’s needs and each referent’s requests (Beutler & Groth-Marnat, 2003; Clarkin & Hurt, 1988; Cohen & Swerdlik, 2005; Sweeney, Clarkin, & Fitzgibbon, 1987). The epitome of the standard assessment approach is embodied in computer-generated reports. This approach utilizes a uniform set of one or more instruments, assesses a uniform set of dimensions, and applies standard descriptive statements to the evaluation of all patients. While this approach is satisfactory as a screening procedure, we believe that even if the same procedures are being used in different cases, an individually focused approach to interpreting and reporting the results makes better sense than a standard report in those instances when the clinician is asked to provide consultation to someone else.
Psychological assessment is a method for obtaining answers to questions. Without explicitly defining the questions in advance, the conclusions
that one reaches and the report that one creates may be needlessly complex, general, uninformative, or inaccurate. Hence, the first task of the clinician is to determine what questions are being asked. Asking the referent for clarification of the question being asked is an obvious first step when the clinician is functioning as a consultant to another professional. In this role, the questions asked by a referring professional usually fall within five categories: (a) What “diagnosis” fits this person’s current presentation? (b) What is the “prognosis” for this person’s condition? (c) How impaired is this person’s “current functioning”? (d) What “treatment” will be most likely to yield positive effects? (e) What factors are contributing to or “causing” this client’s disturbance?
While the report is tailored to the specific question or questions asked, there is some degree of consistency to the nature of the information that is needed in order to answer all of these questions. That is, there are common domains of patient functioning that must be addressed in order to adequately answer any of the questions that might be posed by a referent. These common areas include: (a) an assessment of current level, strengths, and limitations of intellectual, memory, and other cognitive functions; (b) an evaluation of mood, affect, and level of emotional control; (c) a determination of events, conflicts, and needs that trigger the problematic responses for which the patient is seeking help; and (d) a narrative formulation of the patient’s resources and deficits. This latter domain entails providing a conceptual formulation of the patient that is distilled from the information available from the other domains of functioning. This formulation usually includes a description of the nature of the patient’s interaction with the environments that both evoke and alleviate the problem behaviors, including levels of motivation for treatment, styles of coping with external and internal stressors, levels and types of social support, and a description of common interpersonal themes or patterns that characterize the patient’s functioning.
While an integration of descriptive and inferred information on these domains of functioning provides a common framework for responding to referral questions, different questions require more or less attention and detail to one or more of the different domains. For example, to answer a question that asks if the patient’s behavior is representative of a central nervous system disorder (a diagnosis question), the clinician will need to focus more on an assessment of cognitive functions (judgment, intelligence, short- and long-term memory, concept formation, sensory and perceptual functions, etc.) than would be required if the clinician were responding to a question about the severity of the patient’s depression (a question of current functional level). Likewise, a question regarding whether or not cognitive therapy would be helpful (a treatment question) would require the clinician to formulate an opinion regarding the patient’s motivation level and coping styles to a greater extent than a question asking if a history of child abuse is related to the child’s subsequent anxiety and social disturbance (a question of etiology).
Unfortunately, the questions actually asked on written referral forms are seldom as clearly framed as one would like. Often, these written questions are much more specific than the ones really motivating the referral and, if responded to naively, may cause the clinician to omit information the referent really desires and needs. For example, instead of asking for a “diagnosis,” the referring professional may ask for “an MMPI” or a “Rorschach.” At other times, the questions asked may be too broad to direct the clinician’s inquiry. Thus, instead of asking for the desired treatment recommendations directly, the referent may ask for “personality testing.”
In each of these cases, the referring professional may be using a shorthand method of attempting to get information from the consultant. Unless the consultant can decode the question, however, a response to the referent’s shorthand will not be found to be helpful. Each cryptically written request for consultation hides some assumptions and specific but unstated questions that the referring clinician is attempting to address. For example, the referent who makes a written request for “neuropsychological testing” may be seeking confirmation of organic etiology in order to allow the patient to receive insurance compensation, or he/she may be trying to predict whether the symptoms are likely to dissipate enough over time to allow the patient to go back to work. Obviously, the responding clinician will want either to undertake a different form of evaluation or to write the report differently in these two instances.
Unless the clinician is very familiar with the referent’s habits, he/she must clarify what decisions are being contemplated and what types of questions and dilemmas are facing the referent in order to have prompted this referral. Reframing requests in terms of questions that reflect one of the five aforementioned categories is desirable in order to provide the most reliable and useful information for the referent. In the final analysis, all questions that are asked of the consulting clinician have implications for
Julia N. Perry
Abstract
This chapter examines how evaluating traitlike resistance with objective personality assessment
instruments can assist therapists in better anticipating, understanding, and responding to their clients’ signs
of therapeutic resistance. Beginning with a brief review of the data regarding gender-influenced attitudes
toward psychological treatment, it then presents and discusses specific Minnesota Multiphasic Personality
Inventory (MMPI/MMPI-2) validity, clinical, and content scales that measure elements of treatment
resistance. A review of clinical and normative data on the Butcher Treatment Planning Inventory (BTPI), a
measure that was created specifically for treatment planning purposes, is discussed. Finally, a case example
to illustrate specific methods of informing psychotherapy with objective test data is presented.
There has been considerable study and commentary within the field of psychology regarding the range of factors affecting psychotherapy process and outcome. Myriad therapist qualities and client qualities have been investigated (e.g., Corey, 1991; Hersen & Ammerman, 1994). Establishing a maximally serviceable relationship between clients and therapists requires that both parties be genuinely engaged in the psychotherapy process. In order for that to happen, therapists need to be willing to “treat” their clients, typically using evidence-based methods that are matched to clients’ presenting difficulties. In addition, clients need to be open to and accepting of the “treatment” that they receive. Although both of these elements are necessary for therapeutic change, neither is sufficient. This chapter will focus on client factors by examining ways in which objective assessment can assist providers in understanding the sources of treatment resistance affecting their clients.
The Concept of “Resistance”
Broadly speaking, clients must be willing to involve themselves in psychotherapy to at least a minimal degree in order to achieve therapeutic benefit. Unwillingness on their part to do so is viewed as “psychotherapeutic resistance.” The “resistance” concept is rooted in psychoanalysis. Freud’s writings delineate several types of resistance, such as that relating to unwillingness to examine associations and to interpret dreams. But, he theorized, an even more fundamental type occurred “when we [psychoanalysts] undertake to cure a patient of his symptoms [and] he opposes against us a vigorous and tenacious resistance throughout the entire course of the treatment” (Freud, 1973, p. 297). This phenomenon was so important to psychoanalysis that Freud concluded that overcoming resistance was the most time-consuming and difficult element of all psychoanalytic work.
Freud’s conceptualization held that resistance was unconscious; thus, it went unrecognized by the person experiencing it. In fact, his theory was based on the idea that resistance could be manifested whenever an analyst attempted to bring an analysand’s unconscious material to the level of consciousness. However, in its present iteration, there is no presumption that clients’ resistive behavior is unconscious or goes unrecognized by them. In fact,
resistance can be a willful phenomenon. Whether it is purposeful or not, it can bear significantly on the therapeutic relationship and exert its influence at various points in time over the treatment course.
Resistance to Initiating Psychological Treatment
Resistance can begin quite early in the psychotherapeutic process. For example, the idea of immersing oneself in psychotherapy actually begins even before clients embark upon their treatment experiences. Frank and Frank (1991) outlined this idea in their discussion of the client’s “assumptive world,” in which suppositions are organized in attitudes that potentially determine behavior. This “behavior” may be whether to engage in psychological treatment at all.
Data indicate that a substantial proportion of individuals who may require psychological treatment never receive it. In recent decades, estimates of people who meet criteria for at least one mental disorder have run as high as 14% (e.g., Regier et al., 1993), whereas the portion of Americans who actually received formal psychological intervention has been estimated at around only 6% (Castro, 1993). The gap between these figures indicates that a sizable proportion of individuals with psychological problems are going untreated (or at least undertreated). This phenomenon may be due, in part, to what has been characterized as the “unappealing if not unacceptable” nature of many psychological treatments, from a client’s point of view (Cowen, 1982, p. 385). Thus, large numbers of people may choose not to “bring their personal troubles to mental health professionals at any point in their unfolding” (Cowen, 1982, p. 385).
A number of other studies have examined this phenomenon. Back in 1958, Rosenthal and Frank identified low levels of education and income as predictors of which individuals held negative attitudes toward psychotherapy and, consequently, would reject offered treatment. By 1971, Raynes and Warren had examined such variables as race and age and had found that Blacks younger than age 40 were the least likely group in their sample to initiate treatment. Decades later, Furnham and Wardley (1990) found older age to be predictive of negative beliefs about psychological interventions, likely in keeping with Brody’s (1994) conclusion that negative attitudes about treatment would be found among individuals with “conservative,” “old-fashioned,” and “traditional” points of view.
Gender also has been implicated as an important determining factor, with research generally demonstrating greater willingness to engage in psychotherapy among women than among men. This finding has been supported by the work of such individuals as Ryan (1969); Cheatham, Shelton, and Ray (1987); Johnson (1988); and Butcher, Rouse, and Perry (1998). Ryan’s (1969) research demonstrated that the therapy clients being studied were twice as likely to be women as men, underscoring the gender disparity with regard to psychotherapy involvement. More recent research has confirmed this phenomenon and also has provided possible explanations for it. A study by Cheatham et al. (1987) uncovered a decreased willingness for men to seek psychotherapy, as compared to women. Johnson (1988) also found that female college students were more willing than their male counterparts to recognize a need for professional help and to be open to sharing their difficulties with other people. Butcher et al. (1998) also looked at men’s and women’s attitudes toward experiences in therapy, using a seven-item “Survey of Treatment Attitudes” that was composed for their study. They asked 388 university students (213 women and 175 men) such questions as, “Have you ever been in psychological counseling in the past?”, and “Do you think that you would be willing to seek assistance from a psychologist or psychiatrist if you were experiencing problems in psychological adjustment?” They found that the women in their sample were more likely than the men to have participated in counseling and also to have considered participating in some kind of psychological treatment. The female participants also reported being more willing than the male participants to recommend psychological interventions to a family member or friend.
In explaining findings such as these, several theorists have implicated not the biological category of sex but the attitudes that men and women frequently hold, which are shaped by psychological and sociological factors. For example, research by Robertson and Fitzgerald (1992) showed that high scores on measures of masculinity predicted negative attitudes toward psychotherapy on the Fisher–Turner Attitudes Toward Seeking Professional Help Scale. Moreover, although women tended to prefer traditional forms of psychotherapy, men who possessed more stereotypically masculine attitudes demonstrated a preference for alternative, nontraditional interventions. Johnson’s (1988) study attributed male–female differences to sex role differences, not sex category, per se. Specifically, individuals who
items distinguished between the two groups. However, the scale has been found to measure one’s ability to withstand stress, rather than one’s potential for psychotherapy success (Butcher & Williams, 2000). Another measure, the Therapeutic Reactance Scale (TRS; Dowd, Milne, & Wise, 1991), has demonstrated greater potential to predict therapy response, at least among individuals with acute and transitory difficulties (Beutler, Goodrich, Fisher, & Williams, 1999).
Other MMPI-2 scales measure elements of treatment resistance as well. The Negative Treatment Indicators (TRT) content scale specifically examines clients’ attitudes toward accepting psychotherapy and other types of treatment (Butcher & Williams, 2000). It also measures their attitudes toward making behavioral changes. Individuals who produce elevated scores on TRT are thereby acknowledging negative attitudes toward health-care professionals, unwillingness to change their behavior, and lack of belief that change is possible (see study by Craig & Olson, 2003).
Elevated scores on several of the other MMPI-2 scales also have implications with regard to treatment resistance (e.g., Butcher, 1990; Butcher & Williams, 2000; Greene & Clopton, 1999). The instrument’s validity indicators provide information about individuals’ willingness to be forthcoming during the assessment process, which has logical implications for their willingness to be open and honest in any psychotherapy that may be undertaken. Therefore, elevated scores on several measures can be indicators of resistance: cannot say (Cs), indicating the number of items not answered; the lie scale (L), assessing the tendency to claim excessive virtue; the infrequency scales (F and FB), which look at deviant and random responding; the suppressor scale (K), which is another, subtler measure of problem denial; true response inconsistency (TRIN), measuring yea-saying and nay-saying response sets; and variable response inconsistency (VRIN), which assesses response inconsistency that possibly is due to random responding. Among these, L, K, and TRIN may be most important, given that individuals who produce elevated scores on these scales may be attempting to “put their best foot forward” in a manner that leads them to deny or minimize their difficulties. It stands to reason that clients who are unwilling to complete the MMPI-2 fully and accurately are less apt to comply fully with other facets of assessment and intervention (Greene & Clopton, 1999). However, it should also be noted that individuals may produce elevations on one or more of the validity indicators for other reasons (e.g., being unable to read English well enough to produce a valid profile); such clients will not necessarily come across as resistant to psychological interventions (see discussion in Butcher, Cabiya, Lucio, & Garrido, 2007).
Elevated (T ≥ 65) and even low (T < 45) scores on MMPI-2 clinical scales also have implications for the individual’s attitudes in treatment (e.g., Butcher & Williams, 2000; Greene & Clopton, 1999), although the relationships are not necessarily uniformly robust. Elevated scores on Scale 1 (Hs) are associated with a tendency to be difficult to engage in psychotherapy, in part due to likely pessimism about the benefits of treatment. Low scorers on Scale 2 (D) tend to have little internal motivation to undertake any kind of psychological treatment. High scorers on Scale 3 (Hy) may be resistant to psychological interpretations of their problems; although they initially may seem enthusiastic about their treatments, their eagerness is unlikely to persist. Individuals who produce elevated scores on Scale 4 (Pd) tend to demonstrate poor prognosis for change in therapy. Moreover, those who score low on Pd generally have little psychological insight and need motivation in order to make specific behavioral changes.
High scorers on Scale 6 (Pa) are also likely to have poor prognosis for treatment-related change. They often are so suspicious of others that they have difficulty in forming solid therapeutic relationships. Like those who score high on Hy, individuals who produce elevations on Scale 7 (Pt) may refuse to accept psychological interpretations of their problems; they may also be so tense and anxious that it is necessary to reduce their anxiety levels before psychological treatment is initiated. On the other hand, a low Pt score suggests that the individual is so comfortable with himself or herself that there does not appear to be a need for treatment or change.
A tendency to resist psychological interpretations of problems is associated with high scores on Scale 8 (Sc) as well. However, the individuals who produce such scores may remain in psychotherapy for longer than most patients. Finally, individuals who score in the elevated range on Scale 9 (Ma) are inclined to attend their therapy sessions irregularly, and they may terminate therapy prematurely as well.
Similar problems are associated with elevated scores on the MMPI-2 content scales (Greene & Clopton, 1999). High scorers on anxiety (ANX) may be so anxious that it is actually ineffective to initiate psychotherapy prior to reducing their anxiety
and empirical means. It is intended to be behaviorally oriented and fundamentally atheoretical in nature. The BTPI seeks to assess not only clients’ psychological symptomatology, but also those specific qualities that are likely to impede psychological treatment, making it an ideal candidate for research into treatment resistance issues.
The measure consists of 210 true-or-false items. Through their responses to these items, respondents can directly communicate key information with regard to the nature and process of psychotherapy. The content of each item assigns it to one or more of the instrument’s 14 scales, as summarized in Table 34.1.
The four scales that make up Cluster 1 assess the validity of the individual’s self-report on the instrument. Inconsistent responding (INC) consists of 21 pairs of items and looks at the degree to which individuals endorse each pair’s items in a semantically logical way. “I am very hard to get to know” and “I am very hard to get to know, and I prefer it that way” should both be answered either true or false. If a respondent indicates that one statement is true and the other is false, then his or her raw INC score would reflect that inconsistency. The 15 overly virtuous self-views (VIR) items look at respondents’ tendency to try to present themselves to others in an unrealistically positive manner and indicate that they are better adjusted than most people are. The 61 exaggerated problem presentation (EXA) items look at the degree to which individuals endorse an unrealistically high number of symptoms that tend not to be endorsed by mental health patients. The closed-mindedness (CLM) scale’s 19 items evaluate the individual’s tendency to resist disclosing personal information and making cognitive and behavioral changes.
The five scales comprising Cluster 2 assess personality-related issues that could make a significant impact on treatment. Problems in relationship formation (REL) is composed of 18 items that measure a lack of interpersonal trust as well as problems in relating to others. The 16 somatization of conflict (SOM) items look at the extent to which individuals try to cope with their psychological problems by developing new or worsening physical symptoms. Low expectation of therapeutic benefit (EXP) consists of 25 items that examine respondents’ skepticism about the appropriateness and value of making cognitive and behavioral changes through therapy. The 19 items that constitute the self-oriented/narcissism (NAR) scale look at the extent to which individuals are self-indulgent in their interpersonal styles and
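The pairwise logic described above for the INC scale can be illustrated with a short sketch. The item identifiers, the pair list, and the rule of one raw point per discordant pair are assumptions made for illustration only; the actual BTPI item pairs and scoring key are set out in the technical manual (Butcher, 2005b) and are not reproduced here.

```python
from typing import Dict, List, Tuple


def raw_inc_score(responses: Dict[str, bool], pairs: List[Tuple[str, str]]) -> int:
    """Count item pairs answered in opposite directions.

    Hypothetical scoring rule: each pair whose two items receive
    different true/false answers adds one point to the raw score.
    """
    return sum(1 for first, second in pairs if responses[first] != responses[second])


# Hypothetical item identifiers; real BTPI pairs are not shown in this chapter.
pairs = [("item_a", "item_b"), ("item_c", "item_d")]
responses = {
    "item_a": True,   # e.g., "I am very hard to get to know"
    "item_b": False,  # e.g., "... and I prefer it that way"
    "item_c": False,
    "item_d": False,
}

print(raw_inc_score(responses, pairs))
```

In this toy case only the first pair is answered inconsistently, so that pair alone contributes to the raw count.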
in forming relationships with others (higher scores on REL), and greater tendency to be self-punitive (higher scores on A-I).
There were some gender differences as well. Specifically, male participants produced higher scores than women on CLM, REL, and EXP, suggesting that male students were more closed-minded, reported more problems in forming relationships, and had lower expectations of benefiting from psychotherapy, as compared to female students. The data from this college student sample support the notion that treatment resistance can be measured using an objective measure. These individuals’ BTPI scores were largely consistent with the attitudes conveyed via the seven-item survey.
Hatchett, Han, and Cooker (2002) considered another operationalization of treatment resistance by examining the degree to which premature termination from counseling could be predicted on the basis of BTPI data. As part of the intake evaluation process at a university counseling center, 95 new clients agreed to complete the BTPI. According to the demographic information that they gathered, their typical participant was female and Caucasian, with a median age of 21. Termination status was considered “premature” if a client missed a scheduled appointment and made a unilateral decision to discontinue care.
Results showed that the BTPI scales assessing lack of openness about new ways of thinking and behaving (CLM), tendency to channel psychological distress into physical symptoms (SOM), and perception of having limited social support were the most useful predictors of premature termination, along with the two composite measures (GPC and TDC). Higher scores were associated with premature termination for all of those scales, with the exception of the GPC, which was found to enhance the prediction of termination status by suppressing extraneous variance in the other measures. When faced with the failure of EXP (as a measure of low expectation of benefit from psychotherapy) to predict premature termination, the researchers implicated the restricted range of scores on the scale within their sample; they further hypothesized that individuals whose expectations of benefit were very low perhaps never became involved in counseling in the first place. The authors concluded that their findings supported the theory and rationale underlying the BTPI and also its ability to identify college students at increased risk of early termination of counseling. The following sections extend these clinical findings by examining BTPI trends within other client samples.
A Heterogeneous Sample of Psychotherapy Clients
In the Butcher et al. study (2000; see also Butcher, 2005b) described previously in this chapter, the same group of 460 psychotherapy clients who completed the MMPI-2 also completed the BTPI. Within that sample, there were elevated scores on each of the BTPI scales for at least a portion of the participants (see Table 34.3).
For those of their clients who were still involved in psychotherapy 3 months after the initial data were collected, the study’s therapist participants provided
INC 6 4 8
VIR 10 11 10
EXA 26 28 24
CLM 19 19 20
REL 16 13 19
SOM 28 34 21
EXP 3 3 3
NAR 3 3 2
ENV 27 31 21
DEP 37 41 31
ANX 28 35 19
A-O 13 13 13
A-I 18 23 11
PSY 14 14 12
Note: Based on data from Butcher (2005b).
depressive disorders occurring most commonly. In addition, just over 17% of participants were diagnosed with an Axis II disorder, most commonly personality disorder not otherwise specified. The typical participant was a Caucasian, never-married woman whose psychological treatment was cognitive-behavioral in nature. Approximately half of the sample was self-referred for treatment. The mean number of previous courses of treatment was roughly 3. At the time of the study, the participants had completed an average of 39 psychotherapy sessions in their current courses of care and had an average of 17 sessions remaining in the initial treatment “contracts.” The mean Global Assessment of Functioning (GAF) rating was nearly 61.
For the sample as a whole, there were no scale scores at T ≥ 65. However, there were a number of scores falling in the T = 60–64 range (see Table 34.4). Predictably, the highest scores were found on the ANX scale (T = 62.67). When the sample was broken down by gender, results showed that the men in the group produced a mean T-score of just over 65 on the SOM scale, indicating high likelihood of expressing psychological problems through physical symptoms. Male participants’ scale scores were also significantly higher than those for female participants on EXP, SOM, ANX, PSY, and the TDC.
The participants’ primary anxiety diagnoses were also more broadly categorized by type, in order to examine trends among individuals with significantly overlapping symptoms. Among the notable findings were elevations on Scales SOM, ANX, and PSY, as well as on the GPC, among the 17 individuals who were diagnosed with either posttraumatic stress disorder or acute stress disorder.
Elevations and high scores (T = 60–64) on the BTPI scales were found more frequently among individuals with comorbid Axis II disorders than among individuals without them. In the total sample, individuals with personality disorder diagnoses produced high scores on EXA, SOM, ANX, PSY, and the GPC, as well as an elevated score (i.e., T ≥ 65) on DEP. This is in contrast to individuals without such additional diagnoses, for whom there were no elevated scores and high scores (i.e., T = 60–64) only on SOM and ANX. These findings indicate the potential for greater treatment resistance among individuals with Axis II disorders, in keeping with the interpersonal difficulties associated with such disorders.
If the BTPI scales are intended to point to potential treatment resistance issues among therapy clients, the data indicated a good deal of openness to psychological intervention in this sample of individuals with anxiety disorders. The relative lack of score elevations suggested that the participants were not experiencing many of the difficulties tapped by the BTPI scales. Therefore, one could expect their treatments to be less vulnerable to treatment resistance difficulties. Indeed, most anxiety disorders are considered to be highly amenable to treatment, particularly with behavioral and cognitive-behavioral therapies (e.g., Barlow, 1990; Chambless & Gillis, 1993; Foa, 1996). Thus, the types of individuals with these disorders may be open to interventions in a way that promotes successful treatment.
The notion of the relatively high level of functioning among this sample also is supported by the mean GAF rating of nearly 61. According to the DSM-IV, a score of 61 is associated with “mild symptoms” or “some difficulty in social, occupational, or school functioning”; individuals are to be assigned a GAF score of at least 61 if they are “generally functioning pretty well” and have “some meaningful interpersonal relationships” (American Psychiatric Association, 1994, p. 32). Assuming that these study participants likely would have been rated lower at the onset of their treatments (as it is unlikely that highly functioning individuals would have sought or been referred for psychological treatment), it could be hypothesized that their openness to therapy facilitated their doing so well by the time of the study; however, this conclusion cannot be made definitively on the basis of the available data.
That fact notwithstanding, there were noteworthy trends in the data that bear on the issue of treatment resistance. As stated previously, analyses revealed that male participants’ scale scores significantly exceeded those of female participants on EXP, SOM, ANX, PSY, and the TDC. Moreover, compared with the female participants, the men in the study were rated by their therapists as having made significantly less treatment progress to that point. Such data suggest the potential for greater treatment resistance among the men than among the women.
The significantly higher scores on EXP and the TDC are of special interest, particularly when they are coupled with therapists’ judgments of male clients’ progress. It could be that the men had made less progress than the women in part because their expectations were lower, creating the potential for a self-fulfilling prophecy. Difficulties in making behavioral and attitudinal changes also may have been influenced by qualities (such as somatization of
Table 34.5. MMPI-2 validity, clinical, and content scale T-scores for Lisa.
Name                                         Scale elevation (T-score)
Validity Scales/Indexes
? (Cannot Say Score)                         0
VRIN (Inconsistency)                         38
TRIN (Inconsistency)                         58T
F (Infrequency)                              65
F(B) (Infrequency, Back)                     93
F(p) (Infrequency-Psychopathology)           41
L (Lie)                                      71
K (Defensiveness)                            46
S (Superlative Self-Presentation)            43
Clinical Scales
Hs (Hypochondriasis)                         71
D (Depression)                               94
Hy (Hysteria)                                77
Pd (Psychopathic Deviation)                  60
Mf (Masculinity-Femininity)                  45
Pa (Paranoia)                                81
Pt (Psychasthenia)                           90
Sc (Schizophrenia)                           72
Ma (Hypomania)                               53
Si (Social Introversion-Extraversion)        75
Content Scales
ANX (Anxiety)                                81
FRS (Fears)                                  72
OBS (Obsessiveness)                          71
DEP (Depression)                             78
HEA (Health Concerns)                        68
BIZ (Bizarre Mentation)                      52
ANG (Anger Control Problems)                 56
CYN (Cynicism)                               53
ASP (AntiSocial Personality)                 47
TPA (Type-A Behavior)                        50
LSE (Low Self Esteem)                        86
SOD (Social Distrust)                        80
FAM (Family Problems)                        42
WRK (Work Adjustment Problems)               71
TRT (Treatment Planning)                     69
She had also responded to the BTPI items in a manner suggesting that she was not receptive to new ideas or suggestions from others, including those about the significance of her behavior (CLM T = 69).
Lisa’s treatment issues scales showed that she was very likely to channel emotional distress into physical symptoms (SOM T = 71). Her profile was associated with avoidance of dealing directly with emotional conflict. It was also indicative of such intense worry about physical health that substantial reduction in activity level could result. The treatment issues pattern was suggestive of poor self-concept and low self-esteem as well.
Lisa’s current symptoms scales were notable primarily for high scores on A-I (T = 87) and ANX (T = 83). Her profile was suggestive of turning one’s anger inward and being unreasonably self-punishing.
Fig. 34.1 Lisa’s scale and composite T-scores on the BTPI. Reproduced by permission
[Bar chart; T-score axis from 0 to 100+. Plotted values:]
INC 48, VIR 45, EXA 67, CLM 69
REL 56, SOM 71, EXP 46, NAR 29, ENV 58
DEP 61, ANX 83, A-O 53, A-I 87, PSY 48
Composite Scales: GPC 78, TDC 52
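As a reading aid for the figure, the sketch below sorts Lisa’s plotted T-scores into the bands this chapter uses elsewhere: “elevated” for T-scores of 65 or more (equivalently, above 64) and “high” for T-scores of 60 to 64. The function and variable names, and the label for scores under 60, are assumptions introduced for illustration; this is not part of the BTPI scoring software.

```python
from typing import Dict, List

ELEVATED_CUTOFF = 65  # chapter usage: elevated means T >= 65
HIGH_CUTOFF = 60      # chapter usage: high means T = 60-64


def band(t_score: float) -> str:
    """Classify a T-score into the bands used in the chapter's discussion."""
    if t_score >= ELEVATED_CUTOFF:
        return "elevated"
    if t_score >= HIGH_CUTOFF:
        return "high"
    return "below 60"  # label assumed for illustration


# Values as plotted in Fig. 34.1.
LISA_BTPI: Dict[str, int] = {
    "INC": 48, "VIR": 45, "EXA": 67, "CLM": 69,
    "REL": 56, "SOM": 71, "EXP": 46, "NAR": 29, "ENV": 58,
    "DEP": 61, "ANX": 83, "A-O": 53, "A-I": 87, "PSY": 48,
    "GPC": 78, "TDC": 52,
}


def scales_in_band(scores: Dict[str, int], label: str) -> List[str]:
    """Return the scale names whose T-score falls in the named band."""
    return [name for name, t in scores.items() if band(t) == label]


print("Elevated:", scales_in_band(LISA_BTPI, "elevated"))
print("High:", scales_in_band(LISA_BTPI, "high"))
```

Under these cutoffs the elevated scales are EXA, CLM, SOM, ANX, A-I, and the GPC composite, with DEP in the high band, which matches the scales singled out in the narrative interpretation.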
It also suggested high likelihood of interpersonal problems caused by one’s taking a subservient role in relationships. The profile was further indicative of anxiety, worry, and indecision that could interfere with daily functioning.
INDICATIONS OF RESISTANCE AND TREATMENT IMPLICATIONS
Lisa’s psychological evaluation results provided several pieces of helpful information bearing on her potential to demonstrate resistance in working with her care providers. The findings from both the MMPI-2 and BTPI suggested reluctance to disclose personal information, whether due to unwillingness to be cooperative, intentional distortion of her problems, lack of psychological sophistication, or other factors. Therefore, it would likely be hard at times for Lisa to talk openly with her providers, particularly about subjects that she regarded as highly sensitive. The evaluation results also indicated that she could be quite unreceptive to new ways of thinking. Such an attitude could interfere significantly with her willingness to hear and accept others’ suggestions and interpretations. Her tendency to channel psychological stress into physical symptoms might be especially hard for her to acknowledge and address in working with her therapist.
Computerized BTPI Report for Lisa
(Reproduced with permission of MHS)
VALIDITY OF THE REPORT
Most patients in psychological treatment are more open than Lisa in their responses to the BTPI items. Only 8% of clients in the Minnesota Psychotherapy Assessment Project attained peak CLM T-scores >64. Figure 34.1 contains Lisa’s computer-generated BTPI report.
She has responded to the BTPI in a manner that suggests she is generally closed to new ideas or suggestions from others as to possible meanings of her behavior. She endorses content that suggests she is not very open to new ways of thinking. It is likely that her lack of openness will influence her behavior in psychological therapy. She is not likely to be very receptive to psychological interpretation about her
behavior and may resent pressure on her to consider alternatives to her comfortable ways of behaving. The therapist should be aware of the need for confronting her lack of openness or defensiveness if it is observed in the treatment process.
Other response attitudes she showed on the BTPI items should be evaluated further in order to more fully understand potential problems with her treatment process.
She has endorsed a number of extreme items on the BTPI and claims to be experiencing a great deal of symptoms and attitudes that are unlikely to be endorsed by most people. The individual’s current situation should be evaluated to determine the reason(s) for the exaggerated symptom endorsement.
Other response attitudes she showed on the BTPI items should be evaluated further in order to more fully understand potential problems with her treatment attitudes.
She has endorsed a number of extreme items on the BTPI and claims to be experiencing a great deal of symptoms and attitudes that are unlikely to be endorsed by most people. The individual’s current situation should be evaluated to determine the reason(s) for the exaggerated symptom endorsement.
TREATMENT ISSUES
Lisa has attained a very high score on the SOM scale, suggesting that she is experiencing a considerable amount of physical distress at this time. She seems to feel that her problems are based on physical causes, and she does not like to deal with emotional conflict in a direct way. She tends to channel emotional conflict into physical symptoms such as headache, pain, or stomach distress. Moreover, she worries about her health to such an extent that she is reducing her activities substantially. She tends to view herself as tired and is worried that her health is not better.
The use of somatic defenses and the development of physical problems under psychological conflict are prominent mechanisms in outpatient therapy. Over a quarter (28%) of the clients in the Minnesota Psychotherapy Assessment Project produced high scores (T > 64) on the SOM scale. In addition, SOM led other Cluster 2 scales as the most frequent peak score (with 19% of the clients having a peak SOM T-score >64).
CURRENT SYMPTOMS
Lisa appears to have very low self-esteem and negative views about her ability to function in many situations. She usually takes a subservient role in interpersonal situations and readily takes blame for problems she has had no part in originating in order to placate other, more dominant persons. She turns anger inward and appears to punish herself unreasonably at times. A moderate percentage of outpatient psychotherapy clients in the Minnesota Psychotherapy Assessment Project (18%) obtained A-I T-scores >64. However, only 8% of these had A-I as the peak current symptom scale score, with T > 64.
In addition, she has also reported other psychological symptoms on the BTPI items that need to be taken into consideration in evaluating her mood state. She also appears to be very anxious at this time. She is reporting great difficulty as a result of her tension, fearfulness, and inability to concentrate effectively. She seems to worry even over small matters to the point that she cannot seem to sit still. Her daily functioning is severely impaired because of her worries and an inability to make decisions.
TREATMENT PLANNING
Her somatization defenses need to be a central focus in early treatment sessions. Therapists should be aware of her potential difficulty to think psychologically in treatment sessions.
There is some possibility that she is experiencing low self-esteem, which could impact her functioning in interpersonal contexts.
Extremely negative self-views were noted in her self-appraisal. Treatment planning should incorporate strategies designed to increase her self-esteem and reduce her tendency to view herself in such a negative light. The practitioner might find it valuable in treatment sessions to be aware of her extreme tendency to accept blame in situations for which the client bears no responsibility.
The therapist needs to keep in mind that she appears to be prone to react to stressful situations with catastrophic reactions.
PROGRESS MONITORING
Lisa obtained a GPC T-score of 78. She has endorsed a broad range of mental health symptoms on the BTPI. Her elevated GPC index score indicates that she acknowledged a number of mental health symptoms that require attention in psychological treatment. For a statistically significant change, based on a 90% confidence interval, a subsequent GPC T-score must be above 84 or below 72.
Lisa obtained a TDC T-score of 52. Overall, her TDC index score is well within the normal range
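The progress-monitoring rule quoted in the report for the GPC (baseline T-score of 78; a later score must be above 84 or below 72 to count as a statistically significant change) can be restated as a small check. The half-width of 6 T-score points is inferred here from the quoted band; how the BTPI software derives that band from the 90% confidence interval is not stated in this section, so the half-width is passed in as a parameter rather than computed. Function names are hypothetical.

```python
from typing import Tuple


def change_band(baseline_t: float, half_width: float) -> Tuple[float, float]:
    """Return the (lower, upper) bounds around a baseline T-score."""
    return baseline_t - half_width, baseline_t + half_width


def is_significant_change(baseline_t: float, followup_t: float, half_width: float) -> bool:
    """True when the follow-up score falls strictly outside the band."""
    lower, upper = change_band(baseline_t, half_width)
    return followup_t < lower or followup_t > upper


# Lisa's GPC baseline from the report; half-width inferred from "above 84 or below 72".
GPC_BASELINE = 78
GPC_HALF_WIDTH = 6

print(change_band(GPC_BASELINE, GPC_HALF_WIDTH))
print(is_significant_change(GPC_BASELINE, 70, GPC_HALF_WIDTH))
print(is_significant_change(GPC_BASELINE, 80, GPC_HALF_WIDTH))
```

A follow-up GPC T-score of 70 would fall outside the band and count as a change, whereas a score of 80 would not.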
anticipating problems and in reacting to them, such that the two can deviate from prescribed therapy plans in order to address resistance issues head on.
References
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Barlow, D. H. (1990). Long-term outcome for patients with panic disorder treated with cognitive-behavioral therapy. Journal of Clinical Psychiatry, 51, 17–23.
Barron, F. (1953). An ego strength scale which predicts response to psychotherapy. Journal of Consulting Psychology, 17, 327–333.
Ben-Porath, Y. S. (1997). Use of personality assessment instruments in empirically guided treatment planning. Psychological Assessment, 9, 361–367.
Beutler, L. E., Engle, D., Mohr, D., Daldrup, R. J., Bergan, J., Meredith, K., et al. (1991). Predictors of differential and self-directed psychotherapeutic procedures. Journal of Consulting and Clinical Psychology, 59, 333–340.
Beutler, L. E., Goodrich, G., Fisher, D., & Williams, O. B. (1999). Use of psychological tests/instruments for treatment planning. In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 81–113). Mahwah, NJ: Erlbaum.
Brody, S. (1994). Traditional ideology, stress, and psychotherapy use. Journal of Psychology, 128, 5–13.
Butcher, J. N. (1990). The MMPI-2 in psychological treatment. New York: Oxford University Press.
Butcher, J. N. (2005a). MMPI-2: A beginner’s guide (2nd ed.). Washington, DC: American Psychological Association.
Butcher, J. N. (2005b). Butcher Treatment Planning Inventory (BTPI): Technical manual. Toronto, ON: Multi-Health Systems.
Butcher, J. N., Cabiya, J., Lucio, E. M., & Garrido, M. (2007). Assessing Hispanic clients using the MMPI-2 and MMPI-A. Washington, DC: American Psychological Association.
Butcher, J. N., & Perry, J. N. (2008). Personality assessment in psychological treatment: Use of the MMPI-2 and BTPI. New York: Oxford University Press.
Butcher, J. N., Rouse, S. V., & Perry, J. N. (1998). Assessing resistance to psychological treatment. Measurement and Evaluation in Counseling and Development, 31, 95–108.
Butcher, J. N., Rouse, S. V., & Perry, J. N. (2000). Empirical description of psychopathology in therapy clients: Correlates of the MMPI-2 scales. In J. N. Butcher (Ed.), Basic sources on the MMPI-2 (pp. 487–500). Minneapolis, MN: University of Minnesota Press.
Butcher, J. N., & Williams, C. L. (2000). Essentials of MMPI-2 and MMPI-A interpretation (2nd ed.). Minneapolis, MN: University of Minnesota Press.
Castro, J. (1993, May 31). What price mental health? Time, 59–60.
Chambless, D. L., & Gillis, M. M. (1993). Cognitive therapy of anxiety disorders. Journal of Consulting and Clinical Psychology, 61, 248–260.
Cheatham, H. E., Shelton, T. O., & Ray, W. (1987). Race, sex, causal attribution, and help-seeking behavior. Journal of College Student Personnel, 26, 559–568.
Corey, G. (1991). Theory and practice of counseling and psychotherapy (4th ed.). Pacific Grove, CA: Brooks/Cole.
Cowen, E. L. (1982). Help is where you find it: Four informal helping groups. American Psychologist, 37, 385–395.
Craig, R. J., & Olson, R. E. (2003). Predicting the outcome of methadone maintenance treatment with the negative treatment indicators content scale from the MMPI-2. Psychological Reports, 93, 1056–1058.
Dowd, E. T., Milne, C. R., & Wise, S. L. (1991). The therapeutic reactance scale: A measure of psychological reactance. Journal of Counseling and Development, 69, 541–545.
Foa, E. B. (1996). The efficacy of behavioral therapy with obsessive-compulsives. The Clinical Psychologist, 49, 19–22.
Finn, S. E., & Kamphuis, J. H. (2006). Therapeutic assessment with the MMPI-2. In J. N. Butcher (Ed.), MMPI-2: The practitioner’s handbook (pp. 165–191). Washington, DC: American Psychological Association.
Finn, S. E., & Martin, H. (1997). Therapeutic assessment with the MMPI-2 in managed health care. In J. N. Butcher (Ed.), Personality assessment in managed health care: Using the MMPI-2 in treatment planning (pp. 131–152). New York: Oxford University Press.
Finn, S. E., & Tonsager, M. E. (1992). Therapeutic effects of providing MMPI-2 test feedback to college students awaiting therapy. Psychological Assessment, 4, 278–287.
Finn, S. E., & Tonsager, M. E. (1997). Information-gathering and therapeutic models of assessment: Complementary paradigms. Psychological Assessment, 9, 374–385.
Fisher, D., Beutler, L. E., & Williams, O. B. (1999). Making assessment relevant to treatment planning: The STS Clinician Rating Form. Journal of Clinical Psychology, 55, 825–842.
Frank, J. D., & Frank, J. B. (1991). Persuasion and healing: A comparative study of psychotherapy (3rd ed.). Baltimore, MD: Johns Hopkins University Press.
Freud, S. (1973). A general introduction to psychoanalysis (25th printing). New York: Pocket Books.
Furnham, A., & Wardley, Z. (1990). Lay theories of psychotherapy: I. Attitudes toward, and beliefs about, psychotherapy and therapists. Journal of Clinical Psychology, 46, 878–890.
Garfield, S. L. (1994). Research on client variables in psychotherapy. In S. L. Garfield & A. E. Bergin (Eds.), Handbook of psychotherapy and behavior change (4th ed., pp. 190–228). New York: John Wiley & Sons.
Garfield, S. L. (1995). Psychotherapy: An eclectic-integrative approach (2nd ed.). New York: John Wiley & Sons.
Graham, J. R. (2006). MMPI-2: Assessing personality and psychopathology. New York: Oxford University Press.
Greene, R. L., & Clopton, J. R. (1999). Minnesota Multiphasic Personality Inventory-2 (MMPI-2). In M. E. Maruish (Ed.), The use of psychological testing for treatment planning and outcomes assessment (2nd ed., pp. 1023–1049). Mahwah, NJ: Erlbaum.
Groth-Marnat, G. (1999). Current status and future directions of psychological assessment: Introduction. Journal of Clinical Psychology, 55, 781–785.
Hatchett, G. T., Han, K., & Cooker, P. G. (2002). Predicting premature termination from counseling using the Butcher Treatment Planning Inventory. Assessment, 9, 156–163.
Hersen, M., & Ammerman, R. T. (Eds.) (1994). Handbook of prescriptive treatment for adults. New York: Plenum.
Johnson, M. E. (1988). Influences of gender and sex role orientation on help-seeking attitudes. Journal of Psychology, 122, 237–241.
CHAPTER
Raymond L. Ownby
Abstract
Psychological assessment is a key aspect of professional practice and one that distinguishes psychology
from other disciplines. Reporting the results of assessments is thus an important professional function, but
surveys suggest that many consumers of reports are dissatisfied. This chapter reviews problems with
reports and discusses the expository process model. This model uses insights from basic psycholinguistic
research to inform reporting practices.
procedures as in the communication of the assessment results to others. The demands of the task, especially for psychologists in training, dominate the assessor’s experience. The mechanics of performing a standardized administration, for example, are seen as the proper focus of educational efforts. Conceptually integrating the results of diverse assessment measures and communicating them in a useful way to a reader, while perhaps acknowledged as part of the task, are given less attention. This orientation is revealed in assessment textbooks, which devote most of their chapters to domain-specific assessment techniques but perhaps one to the arguably more cognitively complex task of taking assessment data and integrating them into a coherent narrative account of the reason for the assessment, its results, and what should be done about them. While understandable in the context of the history of psychology and the qualifications of textbook writers (who are generally senior psychologists with extensive experience), the situation inevitably gives rise to a test-administration orientation when what is arguably needed is a communication orientation. The key problem in assessment is to communicate the information obtained in a useful way, but an early focus during training on the mechanics of test administration, scoring, and interpretation is not followed by a later focus on communicating the information thus derived in ways that are helpful.
Report writers can readily find advice on how to write reports. One common piece of advice is to “write clearly.” While it’s hard to argue with this, it’s not clear how useful this advice is to novice or troubled report writers. It’s akin to saying “Feel better!” to an anxious or depressed individual. It’s not that it’s a bad idea, but the recommendation lacks specificity and does not provide clear guidance about a course of action to remedy a problem. Far more useful would be help with a concrete set of procedures that can be followed in order to feel better, or in the case of reports, to write better. Exhortation to do good things leaves the individual who makes the recommendations personally satisfied and gives the impression that something important has been communicated, but may leave the reader with little concrete advice with which to guide his or her behavior. Writers thus need concrete strategies for writing reports, as well as to tailor reports to specific purposes, contexts, and readers, rather than exhortations to write clearly or succinctly. Impediments to report readability are readily identified (e.g., Harvey, 2006), but writers’ recommended remedies are often unintentionally vague or difficult to implement.
A failure to focus on conceptualization and communication in graduate coursework can be rationalized as the work of clinical practica and internships, but this is analogous to teaching someone to ski on one occasion and asking them to actually face a slope months or years later. The situation merits examination since habits acquired early in one’s career are likely to persist. Writing the clinical report is thus much more than recording results of test administration, and the higher-level cognitive skills required for integrating and communicating results deserve more explicit attention. The remainder of this chapter provides a guide by which psychologists can improve their ability to conceptualize and communicate results of their psychological assessments.
Problems in Psychological Reports
An area in which empirical data exist on reports concerns the complaints voiced by consumers of reports. Readers of reports consistently complain that writers use jargon, write unclearly, and do not provide specific report elements that are often requested in certain report contexts or by certain report consumers (Groth-Marnat & Horvath, 2006).
Psycholinguistic Concepts
Although little noticed by psychologists writing about how to write reports, a substantial body of research exists on how to make writing easier for the reader to understand (Gopen & Swan, 1990). In the absence of specific empirical research on clinical report writing, concepts from the more general field of technical writing and the basic science study of discourse comprehension processes in psycholinguistics may help writers communicate more effectively. Three concepts from psycholinguistics are particularly relevant to writing clinical reports: (a) the given-new contract, (b) coherence, and (c) cohesion.
The given-new contract refers to a core principle in discourse comprehension that states that to effectively communicate one tells the listener or reader something new about something he or she already knows about. This means that the person initiating the communication must establish common ground with the listener or reader before going forward with new information, and implies that the person who initiates communication must have some idea of what the message’s intended recipient already knows. In writing reports, this means that to effectively communicate the writer of a report must have a fairly good idea of what his or her reader already knows about the likely substance of the assessment. As will
be discussed below, this means that the writer must be able to link common terms used to describe abilities and personality (e.g., “intelligence,” “visuospatial skills,” “dependence,” or “anxiety”) to concrete behavioral descriptions or specific test scores. Following recommendations drawn from the given-new contract means that the writer must clearly establish what he or she is writing about before relating new information about the client assessed.

Coherence is a property of a written report that conveys to the reader a sense that statements are conceptually related to one another and the account of the assessment makes sense. The ideas, concepts, interpretations, and recommendations should all have clear conceptual links in order to leave the reader with the impression that all the elements discussed in the report are relevant and help lead the reader to understanding the author’s interpretation and recommendations.

Cohesion is a property of discourse in which individual elements such as clauses and paragraphs are logically related or linked. While coherence may be thought of as a property in which ideas are logically related, cohesion refers to the mechanical linking of ideas through grammatical devices such as coordination and subordination. Coordination, for example, is a grammatical device that allows the writer to link two distinct ideas that might be able to stand alone. This is often accomplished with words called conjunctions (“and,” “but,” “therefore”): “The client arrived at the office promptly and did not fatigue excessively during the course of the assessment.” Both clauses, or major sections, of this sentence can stand separately. They are linked in a coordinate fashion with the word “and.” Subordination is another grammatical device that allows the author to link clauses in instances in which one of the clauses would not stand independently: “After completing the MMPI-2, the client insisted on returning home for the day.” The clause “After completing the MMPI-2” would not make sense by itself, nor is it a complete sentence. It is grammatically as well as conceptually linked to the clause that follows it. Employed properly, these devices allow the author of a report to give an account of the assessment in which key ideas are presented in a way that makes the reader feel as though the ideas fit together.

The Purpose of Reports
Elsewhere it has been argued that any attempt to empirically examine psychological reports requires a definition of the purpose of reports (Ownby, 1997). The following statement is a postulate that will allow an ongoing evaluation of what makes reports effective:

    The purpose of psychological reports is to communicate the results of an evaluation in a way that is credible and persuasive.

Here, credible means that after reading a statement or a report, the reader would say, if asked, that he or she believes the assertion it contains. Credibility can be established by referencing assertions to basic assessment data that the reader is likely to accept based on the psychologist’s basic training in administering assessment tools. For example, a statement that the patient is depressed might be supported by reference to an elevation on Scale 2 of the Minnesota Multiphasic Personality Inventory-2 (MMPI-2), to a score on a self-report measure such as the Beck Depression Inventory, or to behavioral observations of psychomotor slowing or sad affect. Although the reader may be skeptical of or even explicitly disagree with the writer’s conclusions, credibility demands that raw assessment data be used to substantiate assertions that are one step removed from raw data but employ intermediate explanatory concepts such as “depression” (see the next section for more on these explanatory concepts).

Persuasive means that after reading a statement or a report, the reader would be more likely to follow a recommended course of action. Persuasive statements must be based in credible data, but the recommendations for behavior contained in them must also be logically linked to the credible data. A recommendation for psychotherapy for depression, for example, might follow from a credible conclusion that a patient is depressed, but would be more persuasive if accompanied by data showing that the client or patient is interested in obtaining psychotherapy and that he or she is likely to benefit from it. Unless the report is only written for archival purposes (and thus is merely a clerical exercise), the writer should give the reader guidance not only on the assessment data but also on the implications of those data.

The idea that psychological reports and the statements, conclusions, and recommendations they contain should be both credible and persuasive is a postulate. In the context of this chapter, this means that it is an assertion that is taken as basic, without other empirical support. A brief consideration of the role that reports play will for most readers support a conclusion that this postulate is reasonable. It does not exclude the possibility that
reports. The sentence should refer to given information (the client, a specific measure, or previously discussed ideas) and provide new information about the given. The new information will often be expressed in the form of an evaluation of a middle-level construct, such as “severe depression” or “low self-esteem.” At the sentence level, the middle-level construct must be linked to observable data or a concrete explanation.

The Paragraph
The EPM also provides guidance about paragraph structure. Many writers have been taught that each paragraph should contain a “topic sentence” that states the main idea of the paragraph. Other sentences in that paragraph should elaborate upon or substantiate whatever is said in the topic sentence. If the writer has written an outline prior to writing (a task that many writers experience as yet more unpleasant work), then topics from the outline can become topic sentences. This approach will help the writer provide a sense of coherence to what is written. The outline allows the writer to see the overall organization of the planned report and detect potential illogical transitions or extraneous material. Writers who approach writing the report by beginning with the first paragraph and simply proceeding through the narrative deprive themselves of this method for ensuring the coherence of their narrative. Two small studies support the hypothesis that using the EPM can result in sentences and paragraphs that are more credible and persuasive (Ownby, 1990a, b) than are non-model-based alternatives. A more extensive discussion of alternative paragraph structures is available elsewhere (Ownby, 1997).

Key Elements of Reports
The requirements of the given-new contract and coherence mean that some elements must be included in virtually all reports. The basic given information of the assessment must be established, and the path from given to the author’s interpretations and recommendations must be logical and clear. Further, the task of presenting credible and persuasive reports of assessments requires that the person assessed be identified, the reason why he or she was assessed be established, the methods used in the assessments be reported, results be discussed, and recommendations be made. The core elements that must be included in any psychological report are thus (a) identifying information, (b) reason for evaluation, (c) procedures employed, (d) procedure results, (e) implications of the results, (f) specific answers to the referral question, and (g) recommendations for follow-up action. Inclusion of these elements is essential, and failure to include them is likely to render the report ineffective in communicating results. Failing to include any of these elements is likely as well to leave the reader dissatisfied. As discussed in the next section, reporting models will dictate how and in what order the key elements are presented, but even in the shortest report these key elements should be addressed.

Report Sections
How the writer creates sections of the report will depend to some extent on his or her preferences and the model chosen for the report (e.g., hypothesis-, test-, or ability-oriented; see the next section). Here again, the requirements of the given-new contract, coherence, and cohesion provide guidance. The given-new contract suggests that the first section of the report should establish the given, usually the client. Most reports intuitively or by tradition follow this dictate and begin with a section of “identifying information.” Having established who the report is about, the requirement for coherence suggests that the next section of the report establish why the evaluation was conducted. Here again, most writers intuitively follow this practice and provide a description of why the client was seen for the evaluation. This section may also include some elaboration of relevant history about the client to substantiate or elaborate the reason for the referral. For example, the reason for evaluation might most simply be stated as “personality and cognitive evaluation.” Such a brief statement might be inadequate to many readers, failing to establish the necessary given information. If the writer includes, “Ms. X’s personality and memory are said to have changed after she was involved in a car crash earlier this year,” the reader may have a much better understanding of the assessment report that follows.

Overall Report Organization
Before writing, the writer should decide which of several models he or she should employ in reporting the assessment results. Reporting models provide guidance about how the essential elements of a report are organized, the level of detail at which they are presented, and the portion of the report in which they appear. The choice of reporting model should be guided by an explicit evaluation of (a) the referral source and his or her characteristics (psychologist, judge, physician, teacher, administrator); and (b) the context in which the assessment and its reporting
Specific Issues about Reports
Groth-Marnat and Horvath (2006) review controversies that arise in considering psychological reports, several of which merit discussion here as well.

Length
Most reports are too long, but the appropriate length for any report must be determined after consideration of the purpose of the report, the needs of the intended recipient, and the possibility that the report may have multiple readers over many years. Reports are too long because their writers focus on mechanical reporting of the assessment rather than a well-thought-out presentation of relevant data in a way that will assist the report’s reader in making clinical decisions. Donders (2001), for example, found that neuropsychologists said that their average report length was 5–10 pages. Psychologists who write reports of this length should consider the need for this report length and whether the report’s recipients find their reports useful. Merely following clinical practice tradition without a critical consideration of it is inadvisable.

Readability
Readability refers to the overall difficulty that the content and format of written text present to readers. It is perhaps most familiar to both readers and writers in its assessment through grade equivalent scores, so that one can refer to a passage of text as having been written at a fourth grade level or one can say that a person proficiently reads material at an eighth grade level. A typical recommendation is to write material for a general audience at an eighth grade level. This recommendation is based on the observation that most persons read independently at levels about four years below their highest level of education. Since most people in the United States complete 12 years of education, material should thus be written four years below this level.

No readily available study has reported on readability of psychological reports in the grade equivalent metric. It is well known, however, that much professional material is written at levels of much greater difficulty. Professionals often underestimate the degree of difficulty their written communications present for readers (Ownby, 2006), and by inference it is possible that psychologists may underestimate the difficulty their reports present. Assessment of readability has often been based on fairly simple formulas that use average word and sentence length in a passage of a given length to evaluate readability. Although in the past this task was tedious, most modern word processors will now provide readability estimates on demand. Improving readability as assessed by these formulas can be accomplished by using shorter words and limiting sentence length. In general, using shorter words helps the writer focus on using simpler concepts, and using shorter sentences reduces the grammatical complexity of sentences. Although evaluation of a report in this way can increase writers’ awareness of how difficult their prose is for readers to understand, using these indices to improve readability can be problematic if done uncritically. In many instances, for example, it may be important for conceptual precision to use a polysyllabic technical term, and emphasizing short and simple sentences can make the writing seem choppy.

Validity of Measures
Measures used in assessments may vary in the degree to which their use is based in research that establishes reliability, validity, and appropriate norms. The judicious use of any psychological measure is the responsibility of the psychologist who chooses, administers, and interprets it. The use of measures with limited validity is inadvisable, but when such a use is undertaken it is ethically mandatory that any interpretation based on such an instrument be labeled as based on a measure of unknown or limited validity (Michaels, 2006). The reader is referred to Standard 9 of the Ethical Principles of Psychologists for further guidance (American Psychological Association, 2003).

Inclusion of Test Scores
A number of authors have weighed in on the subject of including test scores in reports. Several issues arise in considering whether and how to report scores on specific measures in a report. The first is whether reporting a score places the client in any way at unwarranted risk. The second is whether the person who receives the test score will be able to interpret it meaningfully. Finally, if the score is written into a report that may persist in someone’s records for many years, will future readers of the report be able to interpret it? Even a cursory examination of these issues must lead psychologists to be cautious about releasing test scores.

Reporting test scores about clients’ intellectual abilities may facilitate attempts to help them receive needed services or provide appropriate career guidance, but might also be used to justify denying them important opportunities. Although few data
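
The readability formulas described under Readability combine average word length and average sentence length into a grade-level estimate. The chapter does not name a specific formula, so the sketch below assumes the widely published Flesch-Kincaid grade-level formula and pairs it with a crude vowel-group syllable counter. The function names and the two sample passages are hypothetical, and estimates from a word processor may differ because syllable counting rules vary.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of vowels, with a minimum of one."""
    vowel_groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(vowel_groups))


def flesch_kincaid_grade(text: str) -> float:
    """Estimate a U.S. grade level from average sentence and word length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    # Flesch-Kincaid grade level (assumed here; the chapter names no formula).
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical report fragments: short words and sentences vs. dense jargon.
simple = "The client was sad. She slept too much. She did not eat well."
dense = ("The examinee demonstrated psychomotor retardation, pervasive "
         "anhedonia, and neurovegetative symptomatology consistent with "
         "depressive disorder.")

print(f"simple: {flesch_kincaid_grade(simple):.1f}")
print(f"dense:  {flesch_kincaid_grade(dense):.1f}")
```

The dense passage scores far higher than the simple one, which also illustrates the caution in the text: the formula rewards replacing a precise polysyllabic term with shorter words whether or not the meaning survives.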
evaluation should include specific elements, be a specific length, and be organized in specific sections that all might be different from an assessment report sent to a psychiatrist colleague in private practice?
2) Empirical investigations are needed of how report variables such as content, length, and organization are received by consumers of reports and how these variables influence their subsequent behavior.
3) How can psychologists be better trained to understand the psychometric properties of measures and how these properties should affect the way they present assessment results in reports?

References
American Psychological Association. (2003). The ethical principles of psychologists. Washington, DC: Author.
Appelbaum, S. (1970). Science and persuasion in the psychological test report. Journal of Consulting and Clinical Psychology, 35, 349–355.
Butcher, J. N., Perry, J. N., & Atlis, M. M. (2000). Validity and utility of computer-based test interpretation. Psychological Assessment, 12, 6–18.
Butcher, J. N., Perry, J., & Hahn, J. (2004). Computers in clinical assessment: Historical developments, present status, and future challenges. Journal of Clinical Psychology, 60, 331–345.
Chambless, D. L., & Hollon, S. D. (1998). Defining empirically supported therapies. Journal of Consulting and Clinical Psychology, 66, 7–18.
Donders, J. (2001). A survey of report writing by clinical neuropsychologists: II. Test data, report format, and document length. The Clinical Neuropsychologist, 15, 150–161.
Donders, J. (2006). Performance discrepancies on the California Verbal Learning Test–second edition (CVLT–II) in the standardization sample. Psychological Assessment, 18, 453–463.
Dori, G. A., & Chelune, G. J. (2004). Education-stratified base-rate information on discrepancy scores within and between Wechsler Adult Intelligence Scale–third edition and the Wechsler Memory Scale–third edition. Psychological Assessment, 16, 146–154.
Gopen, G. D., & Swan, J. A. (1990). The science of scientific writing. American Scientist, 78, 550–558.
Groth-Marnat, G., & Horvath, L. S. (2006). The psychological report: A review of current controversies. Journal of Clinical Psychology, 62, 73–81.
Grove, W. M., Zald, D. H., Lebow, B. S., Snitz, B. E., & Nelson, C. (2000). Clinical versus mechanical prediction: A meta-analysis. Psychological Assessment, 12, 19–30.
Guyatt, G., Cairns, J., & Churchill, D., for the Evidence-Based Medicine Working Group. (1992). Evidence-based medicine: A new approach to teaching the practice of medicine. JAMA, 268, 2420–2425.
Harvey, V. S. (2006). Variables affecting the clarity of psychological reports. Journal of Clinical Psychology, 62, 5–18.
Hunsley, J., & Meyer, G. J. (2003). The incremental validity of psychological testing and assessment: Conceptual, methodological, and statistical issues. Psychological Assessment, 15, 446–455.
Klopfer, W. G. (1960). The psychological report: Use and communication of psychological findings. New York: Grune & Stratton.
Meehl, P. E. (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence. Minneapolis, MN: University of Minnesota Press.
Michaels, M. H. (2006). Ethical considerations in writing psychological assessment reports. Journal of Clinical Psychology, 62, 47–58.
Ownby, R. L. (1990a). A study of the expository process model in mental health settings. Journal of Clinical Psychology, 46, 366–371.
Ownby, R. L. (1990b). A study of the expository process model in school psychological reports. Psychology in the Schools, 27, 353–358.
Ownby, R. L. (1997). Psychological reports: A guide to report writing in professional psychology (3rd ed.). New York: John Wiley & Sons.
Ownby, R. L. (2006). Readability of consumer-oriented geriatric depression information on the Internet. Clinical Gerontologist, 29, 17–32.
Schretlen, D. J., Munro, C. A., Anthony, J. C., & Pearlson, G. D. (2003). Examining the range of normal intraindividual variability in neuropsychological test performance. Journal of the International Neuropsychological Society, 9, 864–870.
James N. Butcher
Abstract
The first computer program to interpret a psychological test, the original version of the Minnesota
Multiphasic Personality Inventory (MMPI), was developed in the 1960s. Computers have become an
indispensable component in psychological assessment in most health-care, forensic, mental health, and
personnel settings. This chapter addresses several aspects of the use of computer-based psychological test
reports. The rationale and value of using computer-based reports is described and reasons for the
widespread use of computerized test interpretation detailed. Computer-based interpretation systems can
usually provide a more comprehensive and objective summary of relevant test-based hypotheses than
practitioners have the time and resources to develop. Use of computer-based test results avoids or
minimizes subjectivity in selecting and emphasizing interpretive material. Computer-based psychological
test results can be obtained quite rapidly, saving the practitioner valuable time and making the results available for use in early clinical sessions.
Finally, a broad range of computer-based assessment services is available for the practitioner to
incorporate into his or her practice. The methods of computer test data processing that practitioners can
employ are varied and include the following data entry methods: mail-in service, in-office computer
processing, fax processing option, and Internet-based test applications.
Keywords: abbreviated forms, actuarial prediction, computer adaptive tests, computer-based reports,
MMPI-2, test feedback
Almost half a century has now passed since the first computer was programmed to interpret a psychological test, the original version of the Minnesota Multiphasic Personality Inventory (MMPI), at the Mayo Clinic (Rome et al., 1962). Today computers have become an indispensable partner in psychological assessment in health-care, forensic, mental health, and other settings (Butcher, Perry, & Atlis, 2000; Atlis, Hahn, & Butcher, 2006). One survey of clinical practitioners found that 67.8% of respondents used computer scoring of psychological tests and 43.8% also used computer-derived reports in their clinical practice (Downey, Sinnett, & Seeberger, 1998). Computers can be used to aid the psychological practitioner in a number of ways in performing clinical evaluations. For example, they can be used to administer test items, to rapidly score and profile test results, and to interpret test scores and generate a complete sophisticated report that describes and classifies problems. Moreover, the use of the computer, with its extensive memory and rapid combination capability, allows the psychologist to synthesize vast amounts of information and provide interpretations related to very complete results such as determining prognosis and treatment planning and aid in tracking symptom change in therapy.

This chapter will address several facets of the use of computer-based psychological test reports. First, the rationale and value of using computer-based reports will be explored. Then important factors to consider in using computer-based reports will be described and illustrated. Next, the clinical use of one computer-based test report, the Minnesota Report for the MMPI-2, will be described. The importance of profile base rates will be described (see Chapter 8). In this example, the use of the
MMPI-2 in providing test feedback to clients in psychological treatment will be illustrated. After this practical case example is described, the use of computer reports in forensic testimony will be explored. The chapter concludes with an examination of several possible problems with computer-based psychological test reports and provides a perspective on appropriate uses.

Value of Using Computer-Based Psychological Tests in Clinical Assessment
Several reasons can be found for the widespread use of computer-based test interpretation in clinical assessment. First, computer-based interpretation systems can usually provide a more comprehensive and objective summary of relevant test-based hypotheses than the clinical practitioner has the time and resources to develop (Atlis et al., 2006). In interpreting profiles, a computer will attend to all relevant scores and not ignore or fail to consider important information, as sometimes happens when human beings face a large array of complicated data. Computer-assisted test interpretations are usually more thorough and better documented than those typically derived by clinical assessment procedures.

Second, use of computer-based test results avoids or minimizes subjectivity in selecting and emphasizing interpretive material. Biasing factors, such as halo effects, can creep into less structured procedures such as the clinical interview. The computer can carry out the “look up and combine function” involved in test interpretation in an unbiased manner compared with most human interpreters.

Third, computer-based psychological test results can be obtained quite rapidly, thereby being available for the clinician to use in early clinical sessions. One can often have a summary of test results within minutes of the time the client has completed the testing. The summary information, as we will discuss later, can actually be used in the initial clinical contact with the client instead of a week or two later, as was common for psychological test evaluations in the precomputer era.

Fourth, computer interpretation systems are considered to be almost completely reliable; that is, they always produce the same results on the same set of scores. Moreover, tests that are computer administered have been shown to be equivalent to those administered by the traditional booklet form (Finger & Ones, 1999).

Finally, a broad range of computer-based assessment services is available for the practitioner to incorporate into his or her practice. Generally, the practitioner can have a great deal of confidence in most commercially available test scoring and interpretation programs, since the testing industry usually maintains high standards for computerized assessment. In his review of computerized psychological assessment programs, Bloom (1992) concluded that “the very high level of professional vigilance over test administration and interpretation software undoubtedly accounts for the fact that computerized assessment programs have received such high marks” (p. 172).

Factors to Consider in Test Scoring and Interpretation Services
Many factors will determine which of the commercially available computer scoring and interpretation services best fit one’s assessment needs. Foremost in the decision, of course, will be whether a particular test interpretation company provides the range and quality of test scoring and interpretation services that are needed. Testing services are usually dedicated to particular tests and not to others. No test interpretation companies offer all tests that a practitioner might wish to use in his or her practice. Some services have limited publication or distribution arrangements and are unable to provide scoring of some tests due to copyright restrictions. For example, a company might not have permission to computer-score the scales contained on a test, yet it might offer computer-based interpretive reports. In such cases, the tests must be scored manually and the scale scores entered into the computer to obtain interpretations. It is important for the practitioner to ensure that the testing company has the type of data processing option that the practitioner needs. Inquiries need to be made about the exact nature of the test scoring and processing options provided.

Methods of Computer Test Data Processing
The methods of computer test data processing that practitioners can employ are varied. The following are several possible data entry methods and test factors that the practitioner might consider in deciding which computer application best fits his or her practice requirements.

Mail-In Service
The oldest type of scoring and interpretation service commercially available is the mail-in test processing option. For the most part, this service is provided to practitioners who have limited in-house computer facilities or who may process large numbers of tests in batches. The tests in this scoring
BUTCHER 695
needed to assess the major symptom pattern(s) rather than having a fixed format of administering all the items (Weiss, 1985). Two strategies for item selection have been studied: the item response theory approach, which is applied most effectively with homogeneous item scales, and the “countdown” method (Butcher, Keller, & Bacon, 1985; Handel, Ben-Porath, & Watt, 1999), which attempts to provide an estimate of the total score based upon a set number of items. The computer-adapted MMPI-2 has been found to save about half an hour of testing time, but at the cost of the full evaluation of the client’s problems that the full version provides. In most cases, this small time saving does not warrant an abbreviated administration (see Atlis et al., 2006, for a fuller discussion).

Several questions to consider in selecting a computer-based test interpretation service to process psychological tests are outlined in Table 36.1.

This chapter will focus on practical considerations in using one type of computer-based psychological test interpretive report that is widely used in clinical evaluation settings. The case illustration involves the MMPI-2, since the MMPI and its derivatives have been the most widely used clinical instrument in most settings (Lubin, Larsen, & Matarazzo, 1984; Piotrowski & Keller, 1989) and have the most extensive history of computer adaptation for a psychological test. One survey of mental health professionals reported that the MMPI-2 was ranked as being the most frequently administered test (Frauenhoffer, Ross, Gfeller, Searight, & Piotrowski, 1998). Other surveys have reported similar conclusions (Borum & Grisso, 1995; Lees-Haley, 1992; Lees-Haley, Smith, Williams, & Dunn, 1996). In fact, 100% of frequent assessors surveyed in one study used the MMPI or MMPI-2 in personal injury cases (Boccaccini & Brodsky, 1999). The idea of computer-based MMPI-2 (MMPI) interpretation is based on the view that tests can more effectively be interpreted in an actuarial manner than through clinical interpretation strategies (see Grove & Meehl, 1996; Grove, Zald, Lebow, Snitz, & Nelson, 2000; Meehl, 1954). A report made up of objective applied personality and clinical symptom descriptions that have been established for the test scales or patterns can more accurately be combined by machine than can reports derived by clinical means. For the most part, computer-generated clinical reports can be viewed as predetermined or canned statements that are applied to individuals who obtain particular psychological test scores. The statements or paragraphs contained in the computerized report, which is often in narrative form, are stored in computer memory and automatically retrieved to match a case when it meets particular scale elevations, profile types, or scale indices. The behavioral correlate information for MMPI-2 scales and code types can be found in a number of sources (see Archer, Griffin, & Aiduk, 1995; Butcher, Graham, Williams, & Ben-Porath, 1990; Butcher, Rouse, & Perry, 2000; Butcher & Williams, 2000; Graham, 2006; Graham, Ben-Porath, & McNulty, 1999; Gilberstadt & Duker, 1963; Lewandowski & Graham, 1972; Marks, Seeman, & Haller, 1975). Computer interpretation requires that test scores be used to access appropriate specific symptoms, behaviors, and so on, that are stored in the computer.

The actuarial combination of data from the MMPI-2 scales has been well established. However, empirical research has not provided correlates for all
Table 36.1. Questions to consider in evaluating the adequacy of computerized psychological testing services.
Does the test on which the computer interpretation system is based have an adequate network of established validity research?
Do the system developers have the requisite experience with the particular test(s) to provide reliable, valid information?
Is there a sufficient amount of documentation available on the system of interest? Is there a published user’s guide to explain the test and system variables?
Is the computer-based report based on the most widely validated measures available on the test?
Is the system flexible enough to incorporate new information as it becomes available?
How frequently is the system revised to incorporate new empirical data on the test?
Was the interpretation system developed following the APA guidelines for computer-based tests?
Do the reports contain a sufficient evaluation of potentially invalidating response conditions?
How closely does the system conform to empirically validated test correlates? Is it possible to determine from the outputs whether the reports are consistent with interpretive strategies and information from published resources on the test?
Does the company providing computer interpretation services have a qualified technical staff to deal with questions or problems?
Are the reports sufficiently annotated to indicate appropriate cautions?
BUTCHER 697
Table 36.2. Steps in providing test feedback to clients.
Step 1. Explain why the test was administered
• Explain the rationale behind your giving test feedback to the client.
• Indicate that you want to provide information about the problem situation that is based on how the client
responds to various questions.
• Explain that personality characteristics and potential problems revealed in the test scores are compared to
those of other people who have taken the test, and that feedback gives the client perspective on his or her
problems that can be used as a starting point in therapy.
Step 2. Describe what the test is and its uses
• Describe the MMPI-2 and provide the patient with an understanding of how valid and accurate the test is
for clinical problem description. Explain, for example, that the MMPI-2 was originally developed as a means of
obtaining objective information about patients’ problems and personality characteristics.
• Indicate that the MMPI-2 is the most widely used clinical test in the United States, which has been
translated into many languages and which is used in many other countries for evaluating patient problems.
• Explain that it has been developed and used with many different patient problems and provides accurate
information about problems and issues the client is dealing with.
Step 3. Describe how the personality test works
• Begin this step in the feedback process by providing a description of several elements of the MMPI-2. Use
the patient’s own test profiles to provide a basis for visualizing the information you are going to provide.
• Explain that a scale is a group of items or statements that measure certain characteristics or problems such
as depression or anxiousness.
• Describe what an “average” score is on the profile. Show how scores are compared on the profile, and
indicate that higher scores reflect “more” of the characteristics involved and problems the patient is
experiencing.
• Point out where the elevated score range is and what a score at T = 65 or T = 80 means in terms of the
number of people obtaining scores in this range.
• If available, use an average profile from a relevant clinical group from the published literature to illustrate
how people with particular problems score on the measure. For example, if the patient has significant depression,
showing a group mean profile of depressed clients provides a good comparison group.
Step 4. Describe how the validity scales on the test work
Discuss the strategies the patient used in approaching the test content. Focus on the person’s self-presentation and on how
he or she is viewing the problem situation at this time. Discussion of the client’s validity pattern is one of the most
important facets of test feedback because it provides the therapist with an opportunity to explore the patient’s motivation
for and accessibility to treatment.
Step 5. Point out the client’s most significant departures from the norm on the Clinical Scales
Indicate that the client’s responses have been compared with those of thousands of other individuals who have taken the
test under different conditions. Describe your patient’s highest ranging clinical scores in terms of prevailing attitudes,
symptoms, or problem areas. It is also valuable to discuss the individual’s low points on the profile to provide a contrast
with other personality areas in which he or she does not seem to be having problems. But avoid descriptions that are low in
occurrence, and avoid using psychological jargon.
• Emphasize the individual’s highest point(s) on the test profiles.
• Point out where the scores fall in relation to the “average” scores.
• Provide understandable descriptions of the personality characteristics revealed by the prominent scale
elevations.
Step 6. Seek responses from the client during the feedback session
Encourage the client to ask questions about the scores and clear up any points of concern.
Step 7. Appraise client acceptance of the test feedback
Ask the client to summarize how he or she feels the test characterized personal problems to evaluate whether there were
aspects of the test results that were particularly surprising or distressing, and whether there were aspects of the
interpretation to which the patient objected.
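Step 3 of the table asks the clinician to explain what a score at T = 65 or T = 80 means in terms of how many people score in that range. The sketch below is one hedged way to produce such figures. It assumes a linear T scale (mean 50, standard deviation 10) and a normally distributed reference group; the function name `percent_at_or_above` is hypothetical and chosen only for illustration. The MMPI-2 reports uniform T scores on many of its scales, whose percentile equivalents differ somewhat from the normal curve, so the published norms should take precedence over these illustrative values in an actual feedback session.

```python
import math

# Assumption: a linear T scale with mean 50 and standard deviation 10.
T_MEAN = 50.0
T_SD = 10.0


def percent_at_or_above(t_score: float) -> float:
    """Percentage of a normally distributed reference group at or above t_score."""
    z = (t_score - T_MEAN) / T_SD
    # Upper-tail area of the standard normal distribution.
    upper_tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return 100.0 * upper_tail


if __name__ == "__main__":
    for t in (50, 65, 80):
        pct = percent_at_or_above(t)
        print(f"T = {t}: about {pct:.2f}% of the reference group scores this high or higher")
```

Under these assumptions, roughly 7% of the reference group scores at or above T = 65 and only about one person in a thousand scores at or above T = 80, which can help a client see how unusual an elevated score is.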
Table 36.3. Descriptive summary of the Minnesota Report on the MMPI-2 for a 46-year-old department store
manager.
The computer-based Minnesota Report addressed a number of problems the client faced and described the personality
characteristics likely to be present. An initial section was included that gave information about the client’s approach to
the testing in a profile validity paragraph. She produced a clinical profile that was likely valid. The report suggested that
she was cooperative with the evaluation.
The pattern of clinical and personality symptoms shown by the report appeared to provide a good summary of the client’s
current personality functioning. The clinical narrative used in the report incorporated features of her two high points, Pd
and Pa, which showed high profile definition. The report noted that persons with this MMPI-2 clinical profile tend to
show an extreme pattern of chronic psychological maladjustment including immaturity and alienation, and tendencies
to manipulate others for their own gratification. The client also seems quite self-indulgent, hedonistic, and narcissistic,
with grandiose ideas of herself and her abilities. The report noted that she is likely to be quite aggressive with others and
very impulsive, and likely tends to act out her problems rather than resolve them. She seems to rationalize her difficulties
and takes no direct responsibility for her actions, but tends to blame other people for her problems. The report also noted
that she may resort to hostile, resentful, and irritable behavior.
The computer report also pointed out that she appears to be a tense and high-strung person who believes that she
experiences more intensely than other people do and feels lonely and misunderstood.
The report also provides information as to the relative frequency of Clinical Scale symptoms that clients with similar patterns
show. Her high-point scale score on Pd occurs in over 9.5% of the MMPI-2 normative sample of women. However, only
4.7% of the sample have Pd scale peak scores at or above a T-score of 65, and only 2.9% of clients have well-defined Pd
spikes. This particular MMPI-2 profile pattern is relatively rare in samples of normals, occurring in fewer than 1% of
normative women. In medical samples, this high-point Clinical Scale score (Pd) is found in about 6.8% of women.
The client scored high on APS and AAS. This indicated that there is a possibility of a drug or alcohol abuse problem that
should be carefully evaluated. The narrative report also noted that this profile pattern is generally stable over time and that
if she is retested at a later date she is likely to produce similar results.
In terms of being able to get along with other people, the narrative report indicated that she has a great deal of difficulty in
her social relationships. She often feels that others do not respect her or give her enough sympathy. She tends to be aloof
from others and uncompromising in conflict situations. Divorce is relatively common for such clients. The report noted
that she is often hypersensitive about what others think of her and is concerned about her relationships with others. She
likely has difficulty expressing her feelings toward others.
In terms of clinical diagnosis, persons attaining this pattern of scores are usually viewed as having a severe personality
disorder. Antisocial features are present and the possibility of paranoid personality is highlighted. In addition, the report
concludes that she responds in a manner similar to that of clients who have paranoid disorder, a possibility that should be
ruled out.
She appears to have a number of personality characteristics that are associated with substance abuse or substance use
problems and she has acknowledged some problems with excessive use or abuse of addictive substances.
The Minnesota Report also provides some treatment suggestions. Persons with this pattern of scores tend to avoid
psychological treatment and are usually not good candidates for psychotherapy. The report notes that they tend to resist
psychological interpretation and rationalize their own input and tend to blame others for their problems. The report
suggests that she has acknowledged problems with alcohol or drug use, a factor that should be addressed more fully in
therapy.
Source: Adapted summary of the client’s Minnesota Report.
provided summaries of her present feelings, symptoms, and attitudes. Her Clinical Scale scores reflected a great deal of anger and hurt on her part. She seemed to be feeling mistrustful of others and showed a tendency to blame others for her own problems. The MMPI-2 scales reflect immaturity in behavior and relationships that have probably persisted for some time. The psychologist pointed out that she was clearly experiencing a great deal of physical distress and tension at this time and that perhaps many of the problems centered around relationship difficulties.
One of the greatest problems she seemed to be experiencing, although she tended to minimize it in the initial interview, was that of alcohol abuse. The MMPI-2 substance abuse indicators (as well as her acknowledged pattern of alcohol use reported in the initial session) suggested a strong likelihood that alcohol abuse was a much more significant problem than she had been willing to discuss before. The remainder of the interview was oriented toward highlighting her major problems and encouraging her to enter an alcohol treatment program. She acknowledged that her life had been going downhill
Forensic Use of Computer-Based Reports
It is becoming more and more important for psychologists to employ clinical procedures that will hold up under rigorous cross-examination in court, because more cases today are becoming the subject of litigation. Psychologists testifying in court might consider using computer-based test reports that provide results in an objective manner, reducing the possibility of subjectivity and bias in interpretation (Pope, Butcher, & Seelen, 2006). Computerized psychological test reports can lend considerable objectivity and credibility to a forensic psychological study. The use of a computer to analyze test score patterns can remove subjective bias from the interpretation process.
The value of using objective psychological measures in court testimony was described by Ziskin (1981):
I would recommend for forensic purposes the utilization of one of the automated MMPI services. One obvious advantage is the minimizing or eliminating of the possibility of scoring errors or errors in transposition of scores (Allard, Butler, Faust, & Shea, 1995). Also these systems are capable of generating more information about an individual than the individual clinician is usually capable of simply by virtue of their ability to deal with greater amounts of information. Another advantage is the reduction of the problems of examiner effects, such as biases entering into the collection, recording, and interpretation of the data.
(p. 9)
Psychologists developing reports in forensic settings might consult Pope et al. (2006).
Points to Consider
Several issues concerning the use of computer-based personality test evaluations in forensic assessments need to be addressed. Pope and colleagues (2006) point out that when forensic testimony involves the use of computer-based testing, it is very important to be able to establish the chain of custody for the test protocol in question, to document that the computer-based report is actually the report for the client in the case. The forensic psychologist should be prepared to explain how the answer sheet was obtained and to show that the computer-based results are actually those for the client in question. There have been cases in which the MMPI was excluded from evidence because the psychologist was unable to establish that the test interpretation was actually the one that the client completed. For example, allowing the client to take the test somewhere other than in the clinician’s office may well lead to exclusion of the results from testimony.
It is also important to ensure that the particular interpretation in the computer-based report is an appropriate match for the patient’s test scores. In cross-examination, an attorney might ask questions about how the psychologist knows that the computer report actually fits the client. As noted earlier, computer-based psychological tests are usually prototypes. Therefore, some of the report descriptions might not apply for a particular case. The psychologist should be prepared, on cross-examination, to deal with descriptive statements in the report that are considered not appropriate for the particular client.
Psychologists who use computer-based MMPI-2 reports in court typically are able to establish the objectivity and utility of the instrument for understanding the client’s problems and personality. Most people have found that the MMPI (MMPI-2) is relatively easy to explain to lay people such as juries or to judges. Moreover, judges and juries are often found to be interested in the idea that computers can be programmed to provide an objective appraisal of the test.
Cautions or Questions Regarding Use of Computerized Psychological Reports
Some problems or pitfalls can occur with the use of computerized psychological evaluations. Computer-based test users need to be vigilant about problems that can enter into blind reliance on computer-based personality assessment and implement remedies to avoid such problems. To have an effective practice the clinician must be alert to and avoid the following potential problems with computer-based testing:
Computerized assessment can promote an overly passive attitude toward clinical evaluation. Professionals who allow the computer to do all the work of gathering and summarizing clinical data may fail to pay sufficient attention to the patient in the assessment process to gain a thorough understanding of the patient’s problems. A great deal can be learned about a patient by carefully observing and keeping in close touch with the test data, and in thoughtfully summarizing the results. Having computerized results should not be considered a substitute for clinical observation and astute integration of a broad range of pertinent information.
Computerized assessments can introduce a mystical aura into the assessment process. Clinicians who become overly enthralled with the marvels of
somewhat different results for the same protocol. Theoretically, any computer-based interpretation of a test should be quite similar to any other, since all are based on the same correlate research literature. Interpretation of test scores can vary; however, computer interpretation systems that are based closely on research-validated indices tend to have similar outputs. Yet, research has shown that test interpretation services may differ with respect to their amount of information and the accuracy of the interpretations (Eyde, Kowal, & Fishburne, 1991). Practitioners using an all-automated interpretation system should be familiar with the issues of computerized test interpretation generally and the validity research on the particular system used (see Butcher et al., 1998; Eyde, 1993; Eyde et al., 1991; Moreland, 1987; Williams & Weed, 2004).
Are Computer-Based Psychological Assessments Ethical and Appropriate?
The psychological profession accepts computer interpretation as an appropriate means of evaluating test data on clients (see the American Psychological Association Guidelines for computer-based assessment, 1986): “A long history of research on statistical and clinical prediction has established that a well-designed statistical treatment of test results and ancillary information will yield more valid assessments than will an individual professional using the same information” (p. 9).
A survey of practitioners found that applied psychologists have substantially endorsed computer-based psychological assessment as an ethical practice (McMinn, Buchanan, Ellens, & Ryan, 1999).
Who Can Purchase and Use Computerized Reports?
Are the test reports available only to qualified psychologists or can other professionals, such as physicians and social workers, buy the services? The question of who can purchase and use computerized psychological test reports is frequently asked. The American Psychological Association (1986) has established guidelines for determining user qualifications for psychological test interpretation services. Psychological tests are grouped into different categories on the basis of purpose and use of the test. Recommendations for test user qualifications differ according to the different types of tests involved. For example, clinical personality tests (such as the MMPI-2) and computerized reports on them are considered B1-level instruments and are available to fellows, members, and associate members of the American Psychological Association as well as to psychologists, physicians, and marriage and family therapists who are licensed by the regulatory board of the state in which they practice.
Summary
Computerized psychological testing, originally developed in the 1960s, has evolved substantially over the past three decades. Computer-based psychological test reports can add significantly to the practitioner’s clinical evaluations in terms of providing valuable, thorough, and accurate information that is processed in a timely manner. Many reasons account for the widespread use of computerized reports in clinical practice today: computer-based test interpretations are an objective and comprehensive means of interpreting psychological tests; they can be rapidly processed and made available early in the treatment session; and they have a high degree of reliability.
This chapter provided a discussion of how clinicians can incorporate computer-based information into their clinical practice. A case example was included to illustrate the types of information available and the ways it can be incorporated into treatment planning. The use of computer-based interpretive reports in forensic settings was also discussed and some guidelines for their use in court testimony were provided.
Several possible pitfalls that can accompany the use of computer-based psychological tests were discussed. It is important for practitioners to be aware of and to avoid the potential problems that can occur with the incorporation of automated assessment into clinical practice. Ways of avoiding problems were discussed in order to provide the practitioner with assurance that computer-derived information from psychological tests can be readily incorporated into a clinical practice, making it more appropriate and effective.
References
Allard, G., Butler, J., Faust, D., & Shea, M. T. (1995). Errors in hand scoring objective personality tests: The case of the personality diagnostic questionnaire-revised (PDQ-R). Professional Psychology: Research and Practice, 26, 304–308.
American Psychological Association. (1986). American Psychological Association Guidelines for computer-based tests and interpretations. Washington, DC: Author.
Archer, R. P., Griffin, R., & Aiduk, R. (1995). Clinical correlates for ten common code types. Journal of Personality Assessment, 65, 391–408.
Lewandowski, D., & Graham, J. R. (1972). Empirical correlates of frequently occurring two-point code types: A replicated study. Journal of Clinical Psychology, 39, 467–472.
Lubin, B., Larsen, R., & Matarazzo, J. D. (1984). Patterns of psychological test usage in the United States: 1935–1982. American Psychologist, 39, 451–454.
McKenna, T., & Butcher, J. (1987, April). Continuity of the MMPI with alcoholics. Paper given at the 22nd Annual Symposium on Recent Developments in the Use of the MMPI. Seattle, Washington.
McMinn, M. R., Buchanan, T., Ellens, B. M., & Ryan, M. (1999). Technology, professional practice, and ethics: Survey findings and implications. Professional Psychology: Research and Practice, 30, 165–172.
Marks, P. A., Seeman, W., & Haller, D. (1975). The actuarial use of the MMPI with adolescents and adults. Baltimore, MD: Williams & Wilkins.
Matarazzo, J. (1986). Computerized clinical psychological test interpretations: Unvalidated plus all mean and no sigma. American Psychologist, 41, 14–24.
Meehl, P. (1954). Clinical versus statistical prediction: A theoretical analysis and a review of the evidence. Minneapolis, MN: University of Minnesota Press.
Moreland, K. (1987). Computerized psychological assessment: What’s available. In J. N. Butcher (Ed.), Computerized psychological assessment (pp. 26–49). New York: Basic Books.
O’Dell, J. (1972). P. T. Barnum explores the computer. Journal of Consulting and Clinical Psychology, 38, 270–273.
Piotrowski, C., & Keller, J. W. (1989). Psychological testing in outpatient mental health facilities: A national study. Professional Psychology: Research and Practice, 20, 423–425.
Pope, K., Butcher, J. N., & Seelen, J. (2006). The MMPI/MMPI-2/MMPI-A in court: Assessment, testimony, and cross-examination for expert witnesses and attorneys (3rd ed.). Washington, DC: American Psychological Association.
Rubenzer, S. (1991). Computerized testing and clinical judgment: Cause for concern. The Clinical Psychologist, 4, 63–66.
Rome, H. P., Swenson, W., Mataya, P., McCarthy, C. E., Pearson, J. S., Keating, F. R., et al. (1962). Symposium on automation technics in personality assessment. Proceedings of the Staff Meetings of the Mayo Clinic, 37, 61–82.
Tellegen, A., Ben-Porath, Y. S., McNulty, J., Arbisi, P., Graham, J. R., & Kaemmer, B. (2003). MMPI-2: Restructured Clinical (RC) Scales. Minneapolis, MN: University of Minnesota Press.
Weiss, D. J. (1985). Adaptive testing by computer. Journal of Consulting and Clinical Psychology, 53, 774–789.
Williams, J. E., & Weed, N. C. (2004). Review of computer-based test interpretation software for the MMPI-2. Journal of Personality Assessment, 83(1), 78–83.
Ziskin, J. (1981). The use of the MMPI in forensic settings. In J. N. Butcher, W. G. Dahlstrom, M. D. Gynther, & S. Schofield (Eds.), Clinical notes on the MMPI (No. 9, entire issue). Nutley, NJ: Hoffman-LaRoche Laboratories/NCS.
James N. Butcher
Abstract
Personality assessment emerged during the early twentieth century largely through two avenues:
development of personality questionnaires for assessing characteristics considered pertinent to screening
draftees in the military in World War I, and early experiments with inkblot perception. Personality
assessment broadened into other areas of applied and research psychology including clinical, forensic, and
personnel applications as psychologists’ professional roles expanded during the twentieth century.
Personality assessment methods and applications have evolved successfully over the past century even
though changes in social and professional practices have altered somewhat the focus of assessment during
some periods. The contributions in this volume, providing a cross section of personality assessment
techniques and instruments, have been extensively described and illustrated. Current challenges for
personality assessment were highlighted and the potential for assessment to contribute further to
understanding of personality and adjustment by various approaches was explored by the contributors.
New directions for personality assessment have been suggested. The international expansion or
globalization of Western-based personality assessment methodology was described and the fact that we
are experiencing an increasingly broadened and effective cross-cultural collaborative environment for
study of personality and psychopathology with the growing development of personality psychology
internationally has been noted.
Keywords: history, cross-cultural, ethnicity, computer-based, invasion of privacy, high stakes testing
A look backward in time at the development of personality assessment over the past 100 years gives one an interesting perspective on the diverse methodology and effective applications that occurred in the field. As described in Chapter 1, personality assessment emerged in the early twentieth century and expanded substantially until the close of the century. Researchers and practitioners in applied psychology incorporated early personality assessment strategies originated by Hermann Rorschach (1921) and Robert Woodworth (1920) to develop instruments to evaluate personality. The 1930s and 1940s witnessed a strong merger between applied psychology and personality assessment as new effective instruments and techniques became available. Two distinct tracks in personality assessment evolved during this period: the diverse development of personality questionnaires (Cattell & Stice, 1957; Meehl, 1945) and the use of ambiguous visual stimuli such as inkblots and pictures as a mode for eliciting “projective” responses from which personality characteristics are inferred (Exner, 1983; Lerner, 2007; Rapaport, Gill, & Schafer, 1945; Schafer, 1948).
As noted in Chapter 1, assessment psychology proceeded for years without a comprehensive underlying theoretical basis, and instrument development and acceptance was sporadic at best. In the 1930s, Gordon Allport and Henry Murray provided a needed perspective and theoretical rationale for many types of personality assessments, particularly instruments that were “projective” in nature. The empirical approach by Hathaway and McKinley (1940), though without a theoretical “flag” until Meehl’s thoughtful “empiricist manifesto” in 1945,
provided a strong empirical rationale and justification for instruments like the Minnesota Multiphasic Personality Inventory (MMPI). These “trait-oriented” approaches to understanding personality characteristics provided the theoretical grounding for a functional perspective needed in personality assessment. After the 1940s, personality assessment psychology continued to be central in personnel screening but also merged extensively with clinical and counseling psychology and became oriented toward assessing personality and mental health problems. Later, forensic psychology, with assessment as a primary vehicle, came to address varied problems ranging from correctional assessment to family custody evaluations, and assessing litigants in personal injury cases.
Past Challenges to Personality Assessment
Personality assessment methodology and evolving techniques have not been without critics. Some of the problems will be noted in order to give the reader a view on past difficulties that assessment psychologists have faced during the evolution of the field. A number of psychologists raised concerns about personality assessment methods or tests (e.g., Ellis, 1946; Ellis & Conrad, 1948). Some criticism has been directed at the theoretical bases of assessment by questioning the existence of “traits” (Mischel, 1968). Some criticism of testing has come from outside of psychology, for example, criticisms that the use of personality inventories was an “invasion of privacy” (Ervin, 1965) or Annie Paul’s general critique of “the cult of personality” (Paul, 2004). Other critics have more directly raised questions about specific instruments. For example, Jackson and Messick (1969) questioned aspects of self-report questionnaires in tests like the MMPI by viewing clients’ responses to the items as largely driven by response sets. Others have taken different strategies in their critiques; for example, Helmes and Reddon (1993) provided a number of blanket criticisms of the MMPI psychometrics including such factors as a perceived lack of a consistent measurement model, heterogeneous scale content, suspect diagnostic criteria, item overlap among scales, lack of cross-validation of the scoring keys, inadequacy of measures of response styles, and suspect norms. Other critics viewed the original MMPI as outdated and incomplete and provided criticisms that would eventually lead to revising the instrument (Butcher, 1972). Other instruments have also been criticized; for example, there has been a substantial controversy surrounding the Rorschach Inkblot technique as well as the Exner Comprehensive System and its norms for interpreting the test (Garb, 1999; Lilienfeld, Wood, & Garb, 2000).
In spite of these criticisms, personality assessment has continued to contribute to understanding of personality characteristics and problems of clients in numerous settings. In many ways, responses to these criticisms have actually improved interpretation strategies or strengthened the use of instruments like the Rorschach (see Chapter 15). For example, a recent issue of the Journal of Personality Assessment provided a series of articles reporting an extensive normative base for the Rorschach Comprehensive System from many countries (see Shaffer, Erdberg, & Meyer, 2007). In addition, the criticisms of the possibility of response sets affecting MMPI assessment prompted further research and development of validity scales (such as VRIN, TRIN, S scales) that provide valuable pictures of the client’s approach to the assessment. In another example, Mischel’s (1968) critique of personality assessment and trait conceptualizations was influential, although the viewpoint received considerable opposition and continued development. Trait theorists have actively pursued the trait approach to personality, and the past 20 years have witnessed an extensive development of trait-oriented research, specifically what has been described as the “Big Five” model from which a number of personality measures have evolved (see Chapters 9 and 16).
Broader developments in the field have also impacted assessment development of trait constructs in personality. During the 1970s, psychological assessment in mental health settings came to be viewed by some as less important in clinical settings because of the widespread popularity of behavioral therapy. There was a sharp reduction in the use of traditional assessment techniques in the clinical setting as shown by the dip in the total amount of research in psychological assessment across test instruments compared with the 1960s (Butcher, Atlis, & Hahn, 2003). Even some clinical and counseling training programs deemphasized personality assessment and focused upon providing behavioral treatment without training in traditional assessment methods. During this period some assessment psychologists began to move into settings other than mental health, particularly into medical, forensic, and personnel applications (Butcher, Ones, & Cullen, 2006).
The clinical practice of psychological assessment (particularly objective assessment methods) made a comeback with studies being conducted on more
An Increasing Variety of Personality Haynes & O’Brien, 2000). These instruments are
Inventories and Behavioral Assessment usually completed by the clinician or an observer
Strategies Are Available for Use in Clinical rather than the client and address different aspects of
Assessment the patient’s behavior than those addressed by tradi-
Most of these newer personality inventories tional assessment methods, such as self-report. There
follow somewhat different theoretical viewpoints is also a range of interview formats that are used both
and test construction strategies than was used with for research and clinical applications (see Chapter 12).
the MMPI and, consequently, address different
aspects of personality. For example, the Millon Assessment Has Been Successfully Integrated
Clinical Multiaxial Inventory (and revised versions into Psychological Treatment
such as the MCMI-II and MCMI-III) (Millon, In the past 20 years, a clinical treatment approach
1977) was developed from Millon’s general theory has evolved that has taken the strengths of psycholo-
of personality and standardized on patients under- gical assessment methods and incorporated assess-
going mental health care. He did not use a general ment data into psychotherapy as a behavioral change
normative population as was used in the develop- strategy through providing test feedback to the
ment of instruments such as the MMPI. The client (Butcher, 1990). This approach, referred to as
MCMI and the revised versions, the MCMI-II and ‘‘therapeutic assessment’’ (Finn & Tonsager, 1992),
MCMI-III inventories (Millon, 1996), are intended has been shown to effectively provide clients with
to evaluate patients undergoing psychotherapy and personality information based on their test results
follow the Diagnostic and Statistical Manual early in treatment—a tactic that leads to improved
(DSM) criteria. The use of psychotherapy patients self-esteem and reduced symptoms. A recent study
as the reference sample is both a strength and a by Smith, Wiggins, and Gorske (2007) reported the
limitation for some applications. That is, comparing results of a survey from 719 psychologist members of
a protocol with patients gives the practitioner a the International Neuropsychological Society, the
perspective on his or her client’s problems that is National Academy of Neuropsychology, and the
not provided with other tests. However, use of a Society for Personality Assessment who regularly con-
patient reference group (instead of ‘‘normals’’) ducted assessments as part of their professional activ-
weakens use in settings such as personnel selection ities. They found that the majority of respondents
or forensic evaluations in which reference samples (71%) frequently provided in-person assessment
need to be drawn from the normal population to be feedback to their clients and/or their clients’ families.
informative. Moreover, they reported that most respondents
Another instrument, the NEO PI (Costa & (72%) indicated that clients found this information
McCrae, 1992), was developed as a measure of to be helpful and positive in the clinical process.
‘‘normal range’’ personality characteristics but is
sometimes used in clinical assessment. The authors The Wide Availability of Electronic
constructed the NEO PI based upon the five-factor Computers Enhances Personality Assessment
theory of personality (see Digman, 1990) in order to Today
appraise the five major trait domains underlying this Computer scoring and test interpretation make
theoretical approach. Some inventories address psychological testing results more readily available
highly specialized assessment tasks such as the and cost-effective for incorporating into psycholo-
Psychopathy Checklist (PCL) by Hare (2002) as a gical assessment. Moreover, computer interpretation
means for typifying psychopathic personalities. In provides the practitioner with an objective frame-
applying psychological tests with clients, it is impor- work for a clinical evaluation (Atlis, Hahn, &
tant for the practitioner to review the test manual for Butcher, 2006; Chapter 10). A number of resources
the instrument to ascertain proper administration, are available that demonstrate the utility of com-
scoring, and interpretation guidelines to ensure that puter-based assessment in clinical evaluations (see
a test is properly used. Butcher et al., 1998, 2000; Butcher & Cheung,
In addition to psychological tests, there are a 2006).
number of formal behavioral assessment methods
and rating scales that have evolved over the last half METHODOLOGICAL DEVELOPMENTS IN
of the twentieth century to provide valuable infor- PERSONALITY ASSESSMENT
mation in clinical evaluations (see discussions by Psychologists developing personality assessment
Goldfried & Kent, 1972; in Chapter 13; and by instruments today, or those interested in evaluating
Asian-Americans is a field that continues to face the MMPI-2 that are presently available—over 33
many challenging problems that require attention. (Butcher, 1996). These adapted versions of the test
The authors call for continued programmatic enable research on clinical problem areas across
research on clinical personality instruments with many countries and can serve as a basis for cross-
Asian-American populations to ensure that the national psychopathology research (Butcher, 2006).
most effective procedures be used. Recent develop- A number of studies have shown highly similar
ments in test interpretation with Hispanic clients clinical patterns on the available translations of
(see Butcher, Cabiya, Lucio, & Garrido, 2007; MMPI clinical measures (Manos, 1985; Savacir &
Garrido & Velasquez, 2006; Geisinger, 1992) Erol, 1990). Clinical research on the MMPI-2 has
show both a broad research base and effective inter- been expanding greatly in other countries, for
pretive guidelines in applying the MMPI-2 with example, in Europe (Butcher, Derksen, Sloore, &
Latino clients. Sirigatti, 2003), and in China (Cheung, 2004;
With respect to another demographic variable Cheung, Cheung, & Zhang, 2004). Three recent
that requires careful consideration in assessment, MMPI-2 research projects have been conducted
Chapter 22 provides an important perspective on by teams of psychologists in several countries in
the historical and current approaches to gender in order to assess the effectiveness of computer-based
the assessment process, and provides suggestions for assessment across cultures (Butcher et al., 1998,
effective information collection strategies that 2000; Butcher & Cheung, 2006).
acknowledge the influence of gender-related vari- With the growth in applied psychology in other
ables on women’s psychological health and well- countries there is an increasing number of trained
being. They discuss several concerns and available personality researchers who have both the interest in
research associated with gender bias in assessment, and expertise for conducting collaborative interna-
and examine the evidence for the influence of bias in tional personality assessment research. International
the diagnostic process. research programs to study personality processes
are much more efficiently developed today.
INTERNATIONAL DEVELOPMENTS IN ASSESSMENT International research collaboration on personality
RESEARCH AND APPLICATIONS projects is more efficient with the rapid communica-
The use of well-established and demonstrated tion that is available through the high-speed Internet
equivalent assessment instruments can serve as an (Butcher, 2006). And most researchers have ready
effective vehicle for gaining a broader understanding access to the extensive body of published research
of personality factors in other cultures. that can be a great advantage to collaborating
Opportunities now exist for psychologists to researchers today; finally, the use of email to trans-
broaden our knowledge of psychopathology through port data and communicate findings or problems
gaining a more global perspective of mental health makes the collaborative process more time-efficient.
problems. Psychological test adaptations in cultures
that are different from the country of origin are not Current Challenges Facing Contemporary
novel. In fact, two of the most widely used assess- Personality Assessment Psychologists
ment techniques, the Rorschach and Binet There are a number of continuing challenges that
Intelligence Scale, were initially cross-cultural adap- can result in diminished effectiveness in assessment
tations from other countries to the United States. psychology today if they are not resolved and
Several important differences in today’s test adapta- improved; six problem areas that can impact future
tion are addressed. test applications or procedures will be described here.
Cross-cultural personality assessment research is
more viable today than ever before (see research on Diminished Assessment Training in
cross-cultural comparisons of personality dimen- Professional Programs
sions by Cheung, Cheung, Zhang, Leong, & Yeh, As noted earlier, one challenge facing psycholo-
2008). The capability of collecting comparable data gical assessment today is the reduced emphasis upon
describing similar clinical patterns in different coun- training graduate students in assessment (Merenda,
tries can contribute to our general understanding of 2007). There has been a noticeable reduction of
personality and adds cumulative information to the assessment training in many graduate programs,
field of personality. An illustration of the gains to be particularly with respect to projective techniques.
made in the field of personality can be seen in using Graduate students from some training programs
the large number of well-developed translations of often find it difficult to obtain placements in desired
assessment settings and have a substantial research center of a great controversy recently (see discussion
base in support of their varied applications. It takes below) as a result of bias. This RF approach to assess-
only an hour and a half (with an MMPI-2) to obtain ment uses the traditional name and reputation
an extensive review of a client’s symptoms, attitudes, (MMPI) to provide a completely new instrument in
test-taking behaviors, etc., that typically provides its place. The test user needs to be aware that the RC
valuable information in giving a perspective on a Scales bear little psychometric resemblance to the
client’s problems. In the majority of settings this is traditional clinical scales (see Butcher, Hamilton,
not an overbearing commitment of time. Rouse, & Cumella, 2006; Butcher & Williams,
The abbreviation of established tests in order to 2009; Nichols, 2006; Rouse, Greene, Butcher,
provide a seemingly more ‘‘user-friendly’’ option can Nichols, & Williams, 2008) and thus cannot effec-
result in misleading interpretations or results. That tively replace their functioning although the implica-
is, a shortened test may require less testing time but tion in the name, MMPI-2-RF, is that it does just that
can result in lowered validity and reliability or even a (see also the discussion in Chapter 23).
completely different set of results for the shortened The challenge: Abbreviated versions of standard
measure than the practitioner was expecting based on tests are typically not substitutes for the original form
the reputation of the original instrument. The clearest but are different tests and are often not equivalent to
examples of this problem were found in the popular the original. Clear differentiation of new versions from
strategy for developing MMPI short forms during the the original needs to be carefully made to avoid
1970s and 1980s when there were several attempts confusion and test misuse.
made such as the Mini-Mult (Kincannon, 1968), the
Faschingbauer (FAM; Faschingbauer, 1974), and the Erosion of Standards for Assessment
MMPI-168 (Overall & Gomez-Mont, 1974) (see Instruments
discussion in Chapter 14). These short forms, It is important for the profession to guard against
however, did not completely capture the meanings the dissemination of assessment instruments that do
of the full MMPI scale scores (see reviews by Butcher not perform psychometrically as promised or that
& Hostetler, 1990; Dahlstrom, 1980), and, for the perform in a biased manner and unfairly disadvantage
most part, disappeared from the assessment scene. clients. Professional vigilance to maintain quality
Two efforts to develop shortened MMPI-2 forms assurance in assessment techniques is required, espe-
were published. For example, Dahlstrom and Archer cially in ‘‘high stakes’’ evaluations such as personal
(2000) published a shortened form of the MMPI-2 injury forensic or work compensation cases.
that comprised simply the first 180 items in the Geisinger (2005) provided cautionary recom-
booklet. This abbreviated form resulted in mendations for psychologists using psychological
problems similar to those of the earlier efforts (Gass & Luis, tests in evaluations that can impact whether a
2001). The other recent shortened version by client will or will not receive benefits based on
McGrath, Terranova, Pogge, and Kravic (2003), the a psychologist’s test-based recommendations.
MMPI-297, proved more successful than the See also Chapters 6 and 7 which discuss the impor-
Dahlstrom and Archer version; however, it still tance of maintaining high standards in personality
produced T-scores that are substantially different assessment instruments; new or modified assessment
(discrepancies of T > 5) from those derived from procedures need to meet high standards in order
the full form of the MMPI-2, and its convergent to assess personality variables with accuracy and
validity is also weaker across a number of settings reliability, and to be consistent with the expectations
(McGrath et al., 2003). of qualified and ethical test users. Both chapters
Recently, there has been an effort to promote a provide illustrations of problem situations that can
shortened version of the MMPI-2 referred to as the occur in the field.
MMPI-2-RF that takes a very different approach to When high standards are ignored in the construc-
abbreviating the test than past efforts. This 338-item, tion of instruments, the field of personality assess-
extremely altered version includes as its core the ment is diminished, and clients, instead of being
Restructured Clinical or RC Scales (Tellegen et al., helped, can suffer consequences from an unfair
2003) in lieu of the traditional and well-established evaluation (see Armstrong, 2008). The distribution
Clinical Scales. In addition, there are a number of of flawed assessment devices under the guise of ‘‘high
other short measures that have not been previously standards’’ has recently occurred in changes made in
published and the inclusion of one scale, the the use of the MMPI-2. A scale that was developed
Lees-Haley Fake Bad Scale (FBS), has been at the several years ago from the MMPI-2 item pool
reflected in the range of contributions featured in Armstrong, D. (2008). PERSONALITY CHECK Malingerer Test
this book, we anticipate the evolution in the next roils personal-injury law ‘Fake Bad Scale’ bars real victims, its
critics contend. Wall Street Journal, March 5, 2008.
100 years with several questions: Will there be novel, Atlis, M. M., Hahn, J., & Butcher, J. N. (2006). Computer-based
improved, and more effective approaches to carry assessment with the MMPI-2. In J. N. Butcher (Ed.), MMPI-2:
the field forward? Or, will there be a persistence of The practitioner’s handbook (pp. 445–476). Washington, DC:
past successes with further refinements made to old American Psychological Association.
standards? Will the new technological advances we Ben-Porath, Y. S., & Tellegen, A. (2007). MMPI-2 Fake Bad
Scale (FBS). Retrieved December 4, 2007, from http://
highlight here be followed by further broadening of www.upress.umn.edu/tests/mmpi2_fbs.html
personality assessment methodology? Will contem- Butcher, J. N. (Ed.). (1972). Objective personality assessment:
porary society continue to rely upon available per- Changing perspectives. New York: Academic Press.
sonality assessment instruments as it currently does? Butcher, J. N. (1990). The MMPI-2 in psychological treatment.
Will personality assessment keep pace with society’s New York: Oxford University Press.
Butcher, J. N. (1996). International adaptations of the MMPI-2:
advances and expectations? Research and clinical applications. Minneapolis, MN:
Whatever avenues are pursued, it is likely that the University of Minnesota Press.
demands of current applications and the general Butcher, J. N. (Ed.) (1997). Personality assessment in managed care:
acceptance of well-established techniques will con- Using the MMPI-2 in treatment planning. New York: Oxford
tinue to encourage development in applied settings. University Press.
Butcher, J. N. (2006). Assessment in clinical psychology: A
Along with the successes come the challenges: Can perspective on the past, present challenges, and future prospects.
we meet these expectations? Clinical Psychology: Science and Practice, 13(3), 205–209.
Given the challenges that personality assessment Butcher, J. N., & Cheung, F. M. (2006, May). Workshop on the
psychologists are facing today, as described above, MMPI-2 and Forensic Psychology. Hong Kong: Chinese
we may wonder about the best strategies to take University of Hong Kong and the Hong Kong Social
Welfare Department.
with regard to the question of where we go Butcher, J. N., & Hostetler (1990). Abbreviating MMPI item
from here. In my view, we can be very optimistic administration: Past problems and prospects for the MMPI-2.
about the future pathways that personality assess- Psychological Assessment: A Journal of Consulting and Clinical
ment will follow. As we saw in this handbook, we Psychology, 2, 12–22.
have a very rich history and substantial contem- Butcher, J. N., & Rouse, S. V. (1996). Personality: Individual
differences and clinical assessment. Annual Review of
porary database from which to proceed over the Psychology, 47, 87–111.
coming years. Some well-traveled paths will likely Butcher, J. N., Arbisi, P. A., Atlis, M. M., & McNulty, J. L.
continue in use while other avenues will emerge as (2003). The construct validity of the Lees-Haley Fake Bad
exploration proceeds. As the chapters here attest, Scale: Does this scale measure somatic malingering and
we are well equipped methodologically to assure a feigned emotional distress? Archives of Clinical
Neuropsychology, 18, 473–485.
fruitful passage. Perhaps the most effective Butcher, J. N., Atlis, M., & Hahn, J. (2003). Assessment with the
approach to progress in personality assessment MMPI-2: Research base and future developments. In D. Segal
would be to confront the challenges we currently (Ed.), Comprehensive handbook of psychological assessment
face, as described in this book, through further (pp. 30–38). New York: John Wiley & Sons.
research while maintaining the strengths of past Butcher, J. N., Berah, E., Ellertsen, B., Miach, P., Lim, J., Nezami, E.,
et al. (1998). Objective personality assessment: Computer-
successes. With problems ameliorated, we can based MMPI-2 interpretation in international clinical set-
advance into the next generation of personality tings. In C. Belar (Ed.), Comprehensive clinical psychology:
assessment technology with confidence. Sociocultural and individual differences (pp. 277–312).
It is my view that the challenges we have noted in New York: Elsevier.
various places in this book can be overcome and that Butcher, J. N., Cabiya, J., Lucio, E. M., & Garrido, M. (2007). The
MMPI-2 and MMPI-A with Hispanic clients in the United States.
sounder steps can be taken to ensure that the future Washington, DC: American Psychological Association Books.
of personality assessment will be successful. I hope Butcher, J. N., Derksen, J., Sloore, H., & Sirigatti, S. (2003).
the accomplishments and the visions our personality Objective personality assessment of people in diverse cultures:
assessment forebears held will be maintained and European adaptations of the MMPI-2. Behavior Research and
the hopes they had for the future of the field will be Therapy, 41, 819–840.
Butcher, J. N., Ellertsen, B., Ubostad, B., Bubb, E., Lucio, E., Lim,
fulfilled. J., et al. (2000). International case studies on the MMPI-A: An
objective approach. Minneapolis, MN: MMPI-2 Workshops.
References Butcher, J. N., Gass, C. S., Cumella, E., Kally, Z., & Williams, C.
American Psychological Association Practice Directorate. (2006). L. (2008). Potential for bias in MMPI-2 assessments using the
Testing codes toolkit. Washington, DC: American Fake Bad Scale (FBS). Psychological Injury and Law, 1(3),
Psychological Association. 191–209.
of response. Educational and Psychological Measurement, Piotrowski, C. (1999). Assessment practices in the era of managed
29, 273–293. care: Current status and future directions. Journal of Clinical
Jöreskog, K. G. (2007). Factor analysis and its extensions. Psychology, 55(7), 787–796.
In R. Cudeck & R. C. MacCallum (Eds.), Factor analysis at Pope, K. S., Butcher, J. N., & Seelen, J. (2006). The MMPI/
100: Historical developments and future directions (pp. 47–77). MMPI-2/MMPI-A in court (3rd ed.). Washington, DC:
Mahwah, NJ: Lawrence Erlbaum. American Psychological Association.
Kincannon, J. (1968). Prediction of the standard MMPI scale Rapaport, D., Gill, M., & Schafer, R. (1945). Diagnostic psycho-
scores from 71 items: The Mini-Mult. Journal of Consulting logical testing (2 vols.). Chicago: World Book.
and Clinical Psychology, 32, 319–325. Rorschach, H. (1921). Psychodiagnostik. Bern: Hans Huber.
Lees-Haley, P. R., & Fox, D. D. (2004). Commentary on Rouse, S. V. (2007). Using reliability generalization methods
Butcher, Arbisi, Atlis, and McNulty (2003) on the Fake Bad to explore measurement error: An examination of the
Scale. Archives of Clinical Neuropsychology, 19, 333–336. MMPI-2 PSY-5 scales. Journal of Personality Assessment,
Lees-Haley, P. R., English, L. T., & Glenn, W. J. (1991). A Fake Rouse, S. V., Greene, R. L., Butcher, J. N., Nichols, D. S., &
Bad Scale on the MMPI-2 for personal injury claimants. Rouse, S. V., Greene, R. L., Butcher, J. N., Nichols, D. S., &
Psychological Reports, 68, 203–210. Williams, C. L. (2008). What do the MMPI-2 Restructured
Lerner, P. M. (2007). On preserving a legacy: Psychoanalysis and Clinical Scales reliably measure? Answers from multiple
psychological testing. Psychoanalytic Psychology, 24, 208–230. research settings. Journal of Personality Assessment,
Lilienfeld, S. O., Wood, J. M., & Garb, H. N. (2000). The 90, 435–442.
scientific status of projective techniques. Psychological Science Savacir, I., & Erol, N. (1990). The Turkish MMPI: Translation,
in the Public Interest, 1, 27–66. standardization, and validation. In J. N. Butcher & C. D.
Manos, N. (1985). Adaptation of the MMPI in Greece: Spielberger (Eds.), Advances in personality assessment (Vol. 8,
Translation, standardization, and cross-cultural comparisons. pp. 49–62). Hillsdale, NJ: Lawrence Erlbaum.
In J. N. Butcher & C. D. Spielberger (Eds.), Advances in Schafer, R. (1948). The clinical application of psychological tests:
personality assessment (Vol. 4). Hillsdale, NJ: Lawrence Diagnostic summaries and case studies. New York: International
Erlbaum. Universities Press.
McGrath, R. E., Terranova, R., Pogge, D. L., & Kravic, C. Shaffer, T. W., Erdberg, P., & Meyer, G. J. (2007). Introduction
(2003). Development of a short form for the MMPI-2 based to the JPA Special Supplement on international reference
on scale elevation congruence. Assessment, 10(1), 13–28. samples for the Rorschach Comprehensive System. Journal
Meehl, P. E. (1945). The dynamics of ‘‘structured’’ personality of Personality Assessment, 89, S2–S6.
tests. Journal of Clinical Psychology, 1, 296–303. Smith, S. R., Wiggins, C. M., & Gorske, T. T. (2007). A survey
Megargee, E. I. (2006). Use of the MMPI-2 in correctional of psychological assessment feedback practices. Assessment,
settings. In J. N. Butcher (Ed.), MMPI-2: The practitioner’s 14(3), 310–319.
handbook (pp. 327–460). Washington, DC: American Stith v State Farm Mutual Insurance, Case No. 50-2003 CA
Psychological Association. 010945AG, Palm Beach County, Florida, 2008.
Merenda, P. F. (2007). Update on the decline in the education Streiner, D. L. (2003). Starting at the beginning: An introduction
and training in psychological measurement and assessment. to coefficient alpha and internal consistency. Journal of
Psychological Reports, 101(1), 153–155. Personality Assessment, 80, 99–103.
Meyer, G. J., Finn, S. E., Eyde, L. D., Kay, G.G., Moreland, Tellegen, A., Ben-Porath, Y. S., McNulty, J., Arbisi, P., Graham,
K. L., Dies, R. R., et al. (2001). Psychological testing and J. R., & Kaemmer, B. (2003). MMPI-2: Restructured Clinical
psychological assessment: A review of evidence and issues. (RC) Scales. Minneapolis, MN: University of Minnesota
American Psychologist, 56(2), 128–165. Press.
Millon, T. (1977). Millon Clinical Multiaxial Inventory. Vacha-Haase, T., Henson, R. K., & Caruso, J. C. (2002).
Minneapolis, MN: National Computer Systems. Reliability generalization: Moving toward improved under-
Millon, T. (1996). The Millon Inventories. New York: Guilford standing and use of score reliability. Educational and
Press. Psychological Measurement, 62, 562–569.
Mischel, W. (1968). Personality and assessment. Hoboken, NJ: Vandergracht v. Progressive Express et al. (2007). Case #02-04552,
John Wiley & Sons. Hillsborough County, FL.
Nichols, D. S. (2006). The trials of separating bath water from Weiss, D., & Suhdolnik, D. (1985). Robustness of adaptive
baby: A review and critique of the MMPI-2 restructured testing to multidimensionality. In D. J. Weiss (Ed.),
clinical scales. Journal of Personality Assessment, 87, 121–138. Proceedings of the 1982 item response theory and computerized
Overall, J. E., & Gomez-Mont, F. (1974). The MMPI-168 for adaptive testing conference (pp. 248–280). Minneapolis, MN:
psychiatric screening. Educational and Psychological University of Minnesota, Department of Psychology,
Measurement, 34, 315–319. Computerized Adaptive Testing Laboratory.
Paul, A. M. (2004). The cult of personality—how personality tests are Williams v. CSX Transportation, Inc. (2007). Case
leading us to miseducate our children, mismanage our companies #04-CA-008892, Hillsborough County, FL.
and misunderstand ourselves. New York: Free Press. Woodworth, R. S. (1920). Personal data sheet. Chicago: Stoelting.
Basic Personality Inventory (BPI) Sales: 800.227.7271
Support: 800.627.7271 ext. 3235
Characteristics Assessed
General Info: 877.242.6767
Dimensions of psychopathology (hypochondriasis, anxiety, Call: 800.627-7271
depression, thinking disorder, etc.) divided into 11 clinical scales. Fax: 800.632.9011 or 952.681.3299
E-Mail: pearsonassessments@pearson.com
Publisher:
Sigma Assessment Systems, Inc. For More Information:
PO Box 610984 www.pearsonassessments.com/tests/aui.htm
Port Huron, MI 48061-0984 The test is self-administered and usually takes about 10–15 min.
1-800-265-1285 The administration can be delegated to clerical or nursing staff.
The test is self-administered and usually takes about 35–40 min. The test is computer scorable and can (a brief narrative) be
The administration can be delegated to clerical or nursing staff. interpreted by computer.
The test is computer scorable and can be interpreted by computer. Other Important Features
The address of the official computer scoring and interpretation 21 four-choice statements
service is
Sigma Assessment Systems, Inc. Key References
PO Box 610984 Beck, A. T. (1987). Beck depression inventory: manual. San
Port Huron, MI 48061-0984 Antonio, TX: Psychological Corporation.
1-800-265-1285 Beck, A. T., Ward, C. H., Mendelson, M., Mock, J., & Erbaugh, J.
(1961). An inventory for measuring depression. Archives of
Key Reference General Psychiatry, 4, 561–571.
Jackson, D. N. (1988, 1995, 1997). Basic Personality Inventory:
BPI manual. MI: Sigma Assessment Systems, Inc.
Butcher Treatment Planning Inventory (BTPI)
Characteristics Assessed
Beck Anxiety Inventory (BAI) The Butcher Treatment Planning Inventory (BTPI) was developed
Characteristics Assessed as a means of providing personality information on clients involved in
Self-report measure of anxiety symptomatology. psychological
intervention. It was derived on the basis of a combination of
Publisher: rational and empirical means. Through its construction, it was
Pearson Assessments created to be behaviorally oriented and fundamentally atheoretical
5601 Green Valley Drive in nature. The BTPI is intended to assess not only clients’
Bloomington, MN 55437-1099 psychological symptoms, but also those specific qualities that are
Sales: 800.227.7271 likely to impede the psychological treatment process. The scale
Support: 800.627.7271 ext. 3235 consists of 210 true/false items, each of whose content assigns it to
General Info: 877.242.6767 one or more of the instrument’s 14 scales. Four scales are included to
Call: 800.627-7271 assess the validity of the individual’s self-report on the instrument:
Fax: 800.632.9011 or 952.681.3299 Inconsistent Responding (INC), the Overly Virtuous Self-Views
E-Mail: pearsonassessments@pearson.com (VIR), the Exaggerated Problem Presentation (EXA), and the
For More Information: Closed-Mindedness (CLM). There are five scales that assess
www.pearsonassessments.com/tests/aui.htm treatment issues: There are five symptom scales: Depression (DEP)
The test is self-administered and usually takes about 10 min. assessing proneness to low mood; Anxiety (ANX) measuring the
The administration can be delegated to clerical or nursing staff. experience of tension and nervousness; Anger-Out (A-O) assessing
The test is not computer scorable and cannot be interpreted by hostile attitudes and exhibiting of anger; Anger-In (A-I), measuring
computer. internalization of anger and self-blame; and Unusual Thinking (PSY)
assessing strange or delusional beliefs and demonstrating of unusual
Key References behaviors.
Beck, A. T., Epstein, N., Brown, G., & Steer, R. A. (1988). An
Characteristics Assessed
inventory for measuring clinical anxiety: Psychometric proper-
Personality characteristics and process variables in treatment
ties. Journal of Consulting and Clinical Psychology, 56, 893–897.
planning.
Fydrich, T., Dowdall, D., & Chambless, D. L. (1992). Reliability
and validity of the Beck anxiety inventory. Journal of Anxiety
Disorders, 6, 55–61.
Beck Depression Inventory (BDI)
Characteristics Assessed
Depression.
Publisher:
Pearson Assessments
5601 Green Valley Drive
Bloomington, MN 55437-1099
Publisher:
Research Press
Dept. 27W
PO Box 9177
Champaign, IL 61826
Main: 217-352-3273
Toll Free: 800-519-2707
Fax: 217-352-1221
The test is computer scorable but cannot be interpreted by computer.
Other Important Features
A set of forms for completion by clinical staff as part of ongoing service delivery, specifically for a comprehensive social-learning program, but adaptable to other programs.
Functioning scores are derived across forms to provide weekly “rate-per-opportunity.”
Key Reference
Schroeder, M. L., Wormworth, J. A., & Livesley, W. J. (1992). Dimensions of personality disorder and their relationship to the Big Five dimensions of personality. Psychological Assessment, 4, 47–53.
Publisher:
Arnold R. Bruhn
The Topaz House
4400 E. West Highway #28
Bethesda, MD 20814
The test is self-administered and takes about 5 min for psychologists.
The administration can be delegated to clerical or nursing staff.
The test is not computer scorable and cannot be interpreted by computer.
Key Reference
Bruhn, A. R., & Feigenbaum, K. (1993). The interpretation of autobiographical memories (Unpublished manuscript).
Miller Behavioral Style Scale (MBSS)
Characteristics Assessed
Two basic coping styles: information distractors (blunters) and information seekers (monitors).
Where Test Can Be Obtained
Dr. Suzanne M. Miller
Department of Psychology, Weiss Hall
Temple University
113th and Columbia Streets
Philadelphia, PA 19122
The test is self-administered and takes about 10–15 min.
The administration can be delegated to clerical or nursing staff.
The test is not computer scorable and cannot be interpreted by computer.
Key References
Miller, S. M. (1987). Monitoring and blunting: Validation of a questionnaire to assess styles of information seeking under threat. Journal of Personality and Social Psychology, 52, 345–353.
Miller, S. M., Brody, D. S., & Summerton, S. (1988). Styles of coping with threat: Implications for health. Journal of Personality and Social Psychology, 54, 142–148.
Michigan Alcoholism Screening Test (MAST)
Characteristics Assessed
This instrument provides a quantifiable assessment of alcohol use problems. The inventory is a 25-item measure that is administered by an interviewer.
Publisher:
Melvin Selzer
6967 Paseo Laredo
La Jolla, CA 92037
Publisher:
Multi-Health Systems Inc.
908 Niagara Falls Blvd.
North Tonawanda, NY 14120-2060
800-496-8324
Fax: 800-540-4484
Multi-Health Systems Inc. (MHS)
3770 Victoria Park Ave.
Toronto, ON M2H 3M6
416-492-2627 (Ext. 330)
Fax: 416-492-3343
Comprehensive Assessment and Referral Evaluation (CARE)
Characteristics Assessed
Psychiatric, medical, nutritional, economic, and social problems of the older person.
The test is not computer scorable and cannot be interpreted by computer.
Other Important Features
Clinician rating scale
Key Reference
Gurland, B., Kuriansky, J., Sharpe, L., Simar, R., Stiller, P., & Birkett, P. (1977–1978). The Comprehensive Assessment and Referral Evaluation (CARE): Rationale, development and reliability. International Journal of Aging and Human Development, 8, 9–42.
Functional Assessment Inventory (FAI)
Characteristics Assessed
Functional information in five domains: social resources, economic resources, mental health, physical health, and ADL status.
Publisher:
Pearson Assessments
5601 Green Valley Drive
Bloomington, MN 55437-1099
The test is self-administered and takes about 45–60 min.
The administration can be delegated to clerical or nursing staff.
The test is computer scorable and is interpreted by computer.
The address of the official computer scoring and interpretation service is:
Pearson Assessments
5601 Green Valley Drive
Bloomington, MN 55437-1099
Key References
Craig, R. J. (Ed.). (1993). The Millon Clinical Multiaxial Inventory: A clinical research information synthesis. Hillsdale, NJ: Erlbaum.
Choca, J., Shanty, L., & Van Denberg, E. (1991). Interpretive guide to the Millon multiaxial inventory. Washington, DC: American Psychological Association.
Millon, T. (Ed.). (1997). The Millon inventories: Clinical and personality assessment. New York: Guilford Press.
Mini-Mental State (MMS)
Characteristics Assessed
Orientation; memory; attention; ability to name, follow verbal and written commands, and constructional ability.
Where Test Can Be Obtained
Dr. Marshal F. Folstein
Department of Psychiatry and Behavioral Science
Johns Hopkins Hospital
Baltimore, MD 21205
The test is not self-administered and usually takes about 5–10 min.
The administration cannot be delegated to clerical or nursing staff.
The test is not computer scorable and cannot be interpreted by computer.
Key References
Folstein, M. F., Folstein, S. E., & McHugh, P. (1975). “Mini-mental state”: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research, 12, 189–198.
Anthony, J. C., LeResche, L., Niaz, W., Von Korff, M. R., & Folstein, M. F. (1982). Limits of the mini-mental state as a screening test for dementia and delirium among hospital patients. Psychological Medicine, 12, 397–408.
Neuropsychology Behavior and Affect Profile
Characteristics Assessed
Indifference, mania, depression, behavioral inappropriateness, and communication problems (pragnosia).
Where Test Can Be Obtained
Dr. Linda Nelson
UCLA/Semel Institute
760 Westwood Plaza, Rm. C8-749
Los Angeles, CA 90095
The test is not self-administered.
The test is not computer scorable and cannot be interpreted by computer.
Other Important Features
The instrument consists of 106 statements and five scales, and has self- and other-rater versions.
Key Reference
Nelson, L. D., Satz, P., Mitrushina, M., Van Gorp, W., Cicchetti, D., Lewis, R., & Van Lancker, D. (1989). Development and validation of the neuropsychology behavior and affect profile. Psychological Assessment, 1, 225–272.
Penn State Worry Questionnaire (PSWQ)
Characteristics Assessed
Brief self-report measure of trait worry.
Where Test Can Be Obtained
Dr. Thomas D. Borkovec
Penn State University
417 Bruce Moore Bldg.
University Park, PA 16802
The test is self-administered and usually takes about 10 min.
The administration can be delegated to clerical or nursing staff.
The test is not computer scorable and cannot be interpreted by computer.
Key References
Meyer, T. J., Miller, M. L., Metzger, M., & Borkovec, T. D. (1990). Development and validation of the Penn State worry questionnaire. Behaviour Research and Therapy, 28, 487–495.
Brown, T. A., Antony, M. M., & Barlow, D. H. (1992). Psychometric properties of the Penn State worry questionnaire: In a clinical anxiety disorders sample. Behaviour Research and Therapy, 30, 33–37.
Publisher:
21st Century Assessment
PO Box 608
South Pasadena, CA 91031-0608
The test is self-administered and usually takes about 15 min.
The administration can be delegated to clerical or nursing staff.
Key References
Strack, S. (1987). Development and validation of an adjective checklist to assess the Millon personality types in a normal population. Journal of Personality Assessment, 51, 572–587.
Strack, S. (1991). Manual for the personality adjective checklist (PACL) (Rev. ed.). South Pasadena, CA: Stephen Strack.
Personality Assessment Form
Characteristics Assessed
Personality disorders.
Where Test Can Be Obtained
Dr. Paul Pilkonis
Western Psychiatric Institute & Clinic
3811 O’Hara Street
Pittsburgh, PA 15213
The test is self-administered and usually takes about 1–2 hr.
The administration cannot be delegated to clerical or nursing staff.
The test is not computer scorable and cannot be interpreted by computer.
Other Important Features
Unstructured to semistructured interview.
Key Reference
Pilkonis, P. A., Heape, C. L., Ruddy, J., & Serrano, P. (1991). Validity in the diagnosis of personality disorders: The use of the LEAD standard. Psychological Assessment, 3, 46–54.
Personality Assessment Inventory (PAI)
Characteristics Assessed
Dimensions of psychopathology (e.g., somatic complaints, anxiety, paranoia, alcohol problems, stress, warmth) divided into 11 clinical scales, five treatment scales, and two interpersonal scales.
Publisher:
Psychological Assessment Resources, Inc. (PAR)
PO Box 998
Odessa, FL 33556
The test is self-administered and usually takes about 45–55 min.
The administration can be delegated to clerical or nursing staff.
The test is computer scorable and can be interpreted by computer.
The address of the official computer scoring and interpretation service is:
Psychological Assessment Resources, Inc. (PAR)
PO Box 998
Odessa, FL 33556
Personality Diagnostic Questionnaire-4
Characteristics Assessed
Personality disorders.
Where Test Can Be Obtained
Dr. Steve Hyler
Note: The letter ‘t’ denotes tables generalized assessments of, 545, 553–5, selecting “critical,” 37
557–60 selecting functionally relevant, 38
Abbreviated forms, 695 habit strength, 546, 549 Allport, Gordon W., 9
of MMPI, 271–2, 434–5, 695 assessing and treating, 549 Alzheimer’s disease, 305, 435
undesirable side effects, 713–14 reducing, 550 Ambulatory psychophysiological
Acculturation, 384, 397–8, 409 sources of, 549–50 self-monitoring, continuous
indicators that assess, 387 heterogeneity, 543–4 in natural environments, 242t–3t, 245–6
Acculturation Rating Scale for Mexican individualized assessment of, 545–3 Ambulatory self-monitoring, 229, 242t,
Americans (ARSMA), 384 inhibitions against, 546, 550 245–6
Accuracy, 530 assessing, 551 in clinically controlled settings, 245
Achievement motivation, 590 reducing, 550–1 ecological momentary assessment using
Achievement via Conformance (Ac) scale, sources of, 550 PDAs, 229, 246
329–30 intrinsic instigation to, 546, 547 American Indians. See Native Americans
Achievement via Independence (Ai) assessing Americans with Disabilities Act (ADA) of
scale, 330 and treating, 548 1990, 593
Actuarial prediction, 696–7. See also under reducing, 547–8 Analog behavioral observation (ABO),
Violence risk assessment, objective sources of, 547 241t, 244–5, 471–3
assessment tools issues and factors influencing Anastasi, A., 83
Adaptability. See Flexibility assessments of, 544, 545 Andrich, D., 74
Adapted measures, 108. See also Translated context and setting, 544–5 Anger control, 590
instruments type of appraisal, 544 Anger/rage system, 154
Addiction Acknowledgment Scale (AAS), 268 response competition and reaction Anger (ANG) scale, 264
Addiction Admission Scale (AAS), 532, 533 formation, 553 Anger scales, MMPI-2, 267–8
Addiction Potential Scale (APS), 268, 534–5 situational factors, external conditions, Anhedonia, 128–30
Addiction Severity Index, 218 and, 546, 551–3 Antisocial (ATS) personality disorder
Adjective Check List (ACL), 327 Aggression (AGG) scale, 573, 574 (ASPD), 348t–50t, 567
Adolescents and MMPI, 485–6. See also Aggressive behavior, conceptual framework interview screening for, 205t, 340
MMPI-A (Minnesota Multiphasic for the analysis of, 546–7 MMPI-2 assessment, 572–5
Personality Inventory-Adolescent) Aggressiveness, 157. See also Psychopathic vs. psychopathy, 567–8. See also
Adoption studies, 32–3 Deviate (Pd) scale Psychopathy
Adult Suicidal Ideation Questionnaire Aggressiveness (AGGR) scale, 158–9, 269 Psychopathy Checklist-Revised and, 570
(ASIQ), 514 Agreeableness, 306t, 311t, 344. Rorschach assessment, 571–2
Advice giving in interviews, 215 See also Five-Factor Model (FFM); Antisocial Practices (ASP) scale, 264–5, 573–4
Affect, 472. See also Emotion systems NEO Inventories Anxiety. See Fear system; Social
“time” and measurement of, 123 Alcohol and Drug Problem Proneness Discomfort (SOD) scale; Social
African Americans, 105. See also Ethnic (PRO) Scale, 491 Introversion (Si) scale
and racial minority clients in U.S. Alcohol/Drug Problem Acknowledgment Anxiety (ANX) scale, 263, 670–1, 673
MMPI-2 use with, 402–3 (ACK) Scale, 491 Anxiety (A) scale, 125, 266, 492
paranoid personality and, 356–7 Alcoholism, 531. See also Substance abuse Anxiety disorders, 675t, 675–6
TAT use with, 407 Alcohol use, hidden/secretive, 531 Applied psychology, acceptance of
Aggression, 542. See also Assault; Alcohol Use Disorders Identification Test personality assessment in, 709
Violence risk assessment (AUDIT), 533, 534 Applied settings, need for objective
“algebra” of, 545–7 Alcohol Use Inventory (AUI), 535–6 evaluations of personality
assessing reaction potential, 546, 553 Allele approaches to microarray characteristics in, 711
defined, 543 construction, 37–9 Asian American psychological experiences,
extrinsic (instrumental) instigation to, Alleles, 39 measures of, 383–6
546, 548 predictive of external criteria, Asian Americans, personality
assessing and treating, 549 selecting, 38 assessment with, 377–8, 386, 391
reducing, 549 selecting alleles predictive of prognosis or future directions, 392
sources of, 548–9 treatment response, 38 MMPI studies, 378–80, 403–4
Asian personality, measures of, 381–3 Behavioral Affective Rating Scale (BARS), Borderline (BDL) personality
Asian populations, MMPI-2 studies with, 244–5 disorder (BPD), 143, 348t–50t,
380–1 Behavioral assessment, 226–7, 246–7 353–4, 418
Asian Values Scale (AVS), 385 attributes, 227–8 Borderline Features (BOR) scale, 511
Asperger’s disorder, 496–7 case study, 235–9 Brain damage, structural, 449. See also
Assault. See also Aggression; Violence assessment sessions, 228–3 Neuropsychological evaluations,
predicting, 505. See also Violence risk case formulation, 233f, 233–5 use of MMPI-2 in
assessment referral, 228 Brain injury. See Traumatic brain injury (TBI)
by base rates alone, 140–1, 141t conceptual and empirical foundations, Butcher, James N.
using interview data for, 141t 227–38 computerized systems and, 168
ASSESS-LIST, 558 methods, 238, 239, 240t–3t, 244–6 on culture and personality, 48
Association analysis, 33, 34 vs. assessment strategies, 238–9 MMPI-2 and, 108, 117, 252, 509, 587,
Attachment style, 656, 664 terminology, 247n1 588, 590, 671
Attorney coaching, 618 Behavioral assessment strategies, 238–9 on MMPI Revision Committee, 9, 251–2
Attribution bias, 399 increasing variety available for use in profile validity and, 587, 588
Autoappraisers, 153 clinical assessment, 710 Butcher Treatment Planning Inventory
Autonomy, adaptive, 332 Behavioral engagement, 472 (BTPI), 169–70
Avoidant (AVD) personality disorder, Behavioral observations, 187–90, 193–4 anxiety and, 673, 675–7
348t–50t in analog environments, 241t, 244–5 case example, 170, 171f–4f, 173
case example, 190–1 resistance and, 671–9
Background investigation (BI), 588 and the inference process, 194–6 scale and composite information, 672t,
Back Infrequency (Fb) scale, 255, attitudes toward clinical impressions, 672–3
435, 620 194 treatment resistance indicators, 673, 673t
Back-translation, 50 interplay of inferences from different
Base rate estimates, subjective data sources, 194–5 CAGE, 531, 533
difficulty shifting, 144 the value of subjectivity, 195–6 Calhoun, V. D., 370
Base rates in natural environments, 239 California Psychological Inventory (CPI),
clinical uses of problems with, 189, 190 10, 83, 323–4, 334
asking what the alternatives are, 147 rationale for, 190–4 cuboid model, 332–3
delaying decisions while collecting subjectivity, 189, 194–6 description, 324
information, 148 Behavioral Observations sections in development, 325–6
hunting for disconfirming reports, 193–4 interpretation, 333
information, 147–8 function, 196–7 establishing profile validity, 333–4
investigating base rates in your art and style, 198 interpreting the profile, 334
setting, 146 consumer rights, 197 scales
looking for a control group, 148 humanizing reports, 197 folk culture/sector, 326–1
simple interpretation of test scores, persuasion, 197–8 relevant to inhibitions, 551
146–7 nature of test process and, 190–3, 195 special purpose and research, 333
statistically-based decisions vs. lack of neglect of, 189 validity, 328–9, 333–4
caring for clients, 148 rationale for, 189 vector, 324, 332
high co-occurring, and erroneous Behavior problems structure, 331
perceptions of correlations and are conditional and vary across contexts factor analyses, 331–2
causal relationships, 145–6 and settings, 227, 237 smallest space analysis (SSA), 332
implications for clinical practice, 140–6 causal variables and dynamic nature of, use, 324–5
low, 147 227, 237 versions, 323–4
difficulty predicting phenomena multiple functionally related, 227, California Psychological Inventory
with, 141–2 236–7 Handbook (Megargee), 323
nature of, 140 variation over time, 237 Campbell, David P., 120, 132, 135
optimal test cutting scores vary by, Bernreuter Personality Inventory, 7 Cannot Say (?) scale, 252, 254, 486–7
142–3 Beutler, Larry E., 519 Capacity for Status (Cs) scale, 326
and perceptions and information Big Five Inventory (BFI), 409n2 Career counseling, Internet-based, 166–7
collection strategies, 144–5 “Big Five” personality traits, 10, 302, 382. Case formulations, limitations of, 235
and predictive power, 140–1 See also Five-Factor Model (FFM) Case history exams, 208–9
salient personal experience weighed Biometrical moderation models, 26, 31–2 Cattell, Raymond B., 10, 304
more than, 144 Bipolar disorder, 370–1, 371f Checklists, 239
utilities and, 143–4 Bivariate biometrical modeling, 29 Child behavior problem questionnaires,
BASIC ID, 202–3 Bizarre Mentation (BIZ) scale, 264 230
Beck Depression Inventory (BDI) and Black Thematic Apperception Test Child custody cases, 281
suicide risk, 512–13 (B-TAT), 407 China, 6, 44–6, 48, 314. See also Asian
Beck Hopelessness Scale (BHS) and suicide Blood oxygenation level-dependent populations, MMPI-2 studies with
risk, 512, 513 (BOLD) response Chinese American clients, 387. See also
Behavior, conditional and contextual neural basis, 365–6 Asian Americans, personality
nature of, 227, 237 reliability, 366–7 assessment with
INDEX
Chinese MMPI, 44–5, 380–1 Computer-based assessment, 10–11. Construct validation theory and
Chinese MMPI-2, 48–51, 381 See also Computer-based test personality theory, history of, 81,
Chinese Personality Assessment Inventory interpretation (CBTI) systems 84–5, 94
(CPAI), 45, 52, 53, 382–3 developments in, 173. See also centrality of criterion-related validity
Chinese personality perception, Computer adaptive testing in early and middle 20th century,
dimensions of, 382 assessment via Internet, 165, 176–8 82–4
Chronicity. See Problem complexity/ countdown method, 175–6 early attempts to validate personality/
chronicity (PCC) ethics and appropriateness, 704 psychopathology measures, 82
Clarification in interviews, 215 history, 167–70 emergence of construct validity and
Clark, D. C., 519 utilization, 164 theory testing, 84–5
Classical test theory (CTT), 59, 61–2 Computer-based narratives, integrated into Control, 331–2, 590. See also Self-control
CTT item statistics for TAPAS report, 691 (Sc) scale
well-being scale, 61–2, 62t Computer-based reports, 704 Conventionality, 332, 333
Classification procedure (CP) strategy, 175 cautions and questions regarding use Conversion symptoms. See Hysteria (Hy)
Client-centered perspective, 202, 206 of, 702–4 scale
Clinical decision support systems (CDSSs), forensic use, 702 Coolidge Axis II Inventory (CATI),
165–6 points to consider regarding, 702 342, 346t
Clinical judgment and decision making, who can purchase and use, 704 Coombs unfolding model, 71
148. See also Base rates Computer-based test interpretation Coping styles, 655–6, 661, 664
Clinical scale covariation (CSC), 116–18 (CBTI) systems, 106, 168, 169, Correctional Institution Environment
Clopton, J. R., 509 691, 704. See also Computer-based Scale (CIES), 552
Coaching individuals to feign test assessment; MMPI-2 (Minnesota Correctional settings. See Forensic settings
responses, 618 Multiphasic Personality Inventory- Correction factor (MMPI-2), 449
Cognitive-behavioral orientation, 202–3, 206 2nd edition) Correction (K) scale, 255–6, 487–8,
Cognitive biases of clinicians, 399–400 case example, 170, 171f–4f, 173 594, 670
Cognitive restructuring, 217, 548 caveats and limitations, 170, 607 Costa, Paul T., Jr., 344. See also Five-Factor
Cognitive style and ability, and structure factors to consider in scoring and Model (FFM); NEO Inventories
of personality, 308 interpretation services, 694 Couple assessment, 229–31, 457
Cognitive test performance questions to consider in evaluating for case conceptualization and treatment
brain injury and, 435, 437t, 438, 446. adequacy of, 696t planning, 466–75
See also Neuropsychological reasons for using, 694 conducting separate interviews with
evaluations, use of MMPI-2 in relative accuracy, 703–4 partners, 471
psychopathy and, 575–6 Computer-based tests, sources of for diagnosis and screening, 463–6
Cohort effects, 314 information for, 163 emerging assessment technologies,
Collaborative Longitudinal Personality Computerized adaptive testing. See 476–7
Disorders Study (CLPS), 353–4 Computer adaptive testing for treatment monitoring and outcome,
Collectivism, 46 Computer test data processing, methods 475–6
College Maladjustment (Mt) scale, 267 of, 694–7. See also specific methods Couple assessment interview, 470–1
Color-shading blend responses on in-office computer processing, 695 formats, 471
Rorschach, 506 mail-in service, 694–5 initial, 470, 471
Common pathways model, 30 Conceptual equivalence. See Construct Couple assessment methods,
Communality (Cm) scale, 329 equivalence evidence-based, 464t–5t
Communication Patterns Questionnaire Confirmatory bias, 145, 399 Couple distress. See also Couple
(CPQ), 467, 474 Confirmatory factor analysis (CFA), 61 relationship distress
Communication skills, 472 Conflicts, defensive approach to dealing assessment methods and techniques for
Comparative judgment, law of, 71 with, 590 evaluating, 469–70, 475.
Competence, professional, 603, 604. Conflict Tactics Scale (CTS), 468, 474 See also Couple assessment
See also Cultural perspective in Conformity, 331–3, 590 couple assessment interview, 470–1
assessment, training needs in Confrontation in interviews, 215–16 observational methods, 471–3
enhancing Conscientiousness, 306t, 311t. See also self- and other-report methods,
Components analysis, independent, 367, Five-Factor Model (FFM); NEO 473–5
370, 371 Inventories comorbid individual distress, 469
Comprehensive System (CS). See Exner Consensuality, 332 cultural differences in, 469
Comprehensive System (CS) Constraint. See Disconstraint domains to target when evaluating,
Computer adaptive testing (CAT), 59, “Construct drift,” 127 recommendations for assessing, 477–8
173–6, 272, 695–7 Construct equivalence, 49, 50 recommendations for assessing, 477–8
Computer applications Constructs, 52 recommendations for future research,
in clinical assessment, 165–7 homogeneity, 87–90 478–9
interviews, screening, and diagnosis, types of, 88 Couple functioning, assessment constructs
165–6 Construct validation, 81, 94–5 across domains and levels of, 460,
for psychotherapy interventions, 166–7 problems with and advances in concept 461t–2t
Computer availability, assessment of, 85 Couple interaction, analog behavioral
enhanced by, 710–11 strong vs. weak, 85–6 observation of, 230
Couple relationship distress. See also Cutting scores Disconstraint (DISC) scale, 159–60, 269
Couple distress modifying, 147 DNA, 36–7. See also Genetics
conceptualizing, 458–60, 463 problems with fixed, 147 Domestic violence, instruments designed
defined, 458–9 Cynicism (CYN) scale, 264, 589 to identify potential perpetrators
etiological considerations and of, 557t
implications for assessment, 460, Dahlstrom, W. Grant, 252 Dominance analyses, classical, 61, 67
463 Debasement (Z) scale, 626, 627t–30t, 631 Coombs unfolding model, 71
prevalence and comorbid conditions, Defensive invalid (DI) MMPI-2 profiles, exploratory factor analysis (EFA), 61–2
460 strategy to deal with, 588–9 ideal point analyses, 67–9
Couple Satisfaction Index (CSI), 465 Defensiveness (K) scale, 487–8. See also item response theory (IRT), 62–7
Criminal cases, 280–1. See also Forensic Correction (K) scale law of comparative judgment, 71
settings Defensive responding, 587–90. See also multidimensional pairwise preferences,
Crisis interview, 209 Superlative Self-Presentation (S) 74–6, 77t
Criterion-related validity scale; Validity scales pairwise preference formats, 69–70
centrality in early and middle 20th Demoralization (Dem) scale, 115, 122, unfolding model, 71
century, 82–4 126–9, 270, 594 Zinnes-Griggs (ZG) model, 71–4
limitations to an exclusive focus on, as theory-driven marker for Clinical Dominance models, 60–1, 67. See also
83–4 Scale covariation, 125–6 Dominance analyses, classical
Cronbach, L. J., 84, 85, 193 Dependability, 590 Dominance (Do) scale (CPI), 326
Cross-cultural Personality Assessment Dependency. See Rorschach Oral Dominance (Do) scale (MMPI), 266,
Inventory, 53. See also Chinese Dependency (ROD) scale 267
Personality Assessment Inventory Dependent (DPD) personality disorder, Dopamine receptor type 4 (DRD4),
(CPAI) 348t–50t 34–6
Cross-cultural validity of assessment, 44–5, Depression, disaggregation of, 91 “Double-barreled items,” 60
48–9, 108. See also Five-Factor Depression (DEP) scale, 264 Draw-A-Person (DAP), 407, 518
Model (FFM), as a theory for Depression (D) scale, 257, 438, 488 Drug abuse. See Substance abuse
everyone; Multinational groups Depression Index (DEPI), 288, 289 Dual diagnosis, 536–7
construct equivalence, 50 Depressive disorders Dunnette, Marvin, 583
linguistic equivalence, 49–50 in Chinese clients, 46, 49 Dyadic Adjustment Scale (DAS), 464–5
psychological equivalence, 51 Rorschach and, 279, 281, 288, 289
psychometric equivalence, 49–51 de Rivera, J., 425 Eastern Woodland Oklahoma (EWO),
Cross-sectional responses on Rorschach, Desirability (Y) scale, 626, 627t–30t, 631 401
507 Diagnosis, 223–4. See also Base rates Eating disorders, sexual abuse and, 31
Cross-validated tests, 147 computer-based applications for, 165–6 Ecological momentary assessment (EMA),
Cultural adaptation. See Acculturation Diagnostic and Statistical Manual of Mental 93, 94, 246
Cultural and racial factors in assessment, Disorders (DSM), 92, 93, 316–17. using PDAs, 246
48–9, 120, 420. See also Ethnic See also Diagnosis Educable mentally retarded (EMR), 105
and racial minority clients in U.S. structured clinical interviews for, 204, Education in interviews, 216
attending to, 711–12 338–9 Ego Strength (Es) scale, 266–7, 669–70
personality disorders and, 356–7 Diagnostic criteria, gender bias and, Ekman, P., 153
Cultural conflicts, 385–6 419–20 Electroconvulsive therapy (ECT), 659
Cultural dimensions, 46 Diagnostic interview, 204, 209–10. See also Electronically activated recorder (EAR),
Cultural diversity Structured clinical interviews 244
challenge of increasing, 715 Diagnostic Interview for Narcissism Electronic diaries (EDs), 93, 94, 476–7
globalization and, 47–8 (DIN), 339–40 Emic and etic approaches, 45, 381–2
Cultural-ecological context, locating client Diagnostic Interview for Personality combined emic-etic approach, 51–3
in, 386–7 Disorders (DIPD), 338, 339, Emics vs. etics personality measures, 51–3,
Culturally responsive assessment, practical 353–4 381–2
considerations in conducting, Diagnostic Interview Schedule (DIS), 204 value and limitations of universal
386–7, 603 Diaries. See also Ambulatory self- constructs, 51–2
interpreting results, 388–9 monitoring Emotional control, 590. See also Control
case example, 389–91 PDA, 246 Emotion systems, 152–5, 153t, 155f
testing procedures, 387–8 Differential diagnosis, 279 Empathy
Cultural perspective in assessment, training Dimensional Assessment of Personality accurate, 214
needs in enhancing, 54 Pathology-Basic Questionnaire in interviews, 214, 218
Cultural values, 384–5 (DAPP-BQ), 344, 345 Empathy (Em) scale, 327
Cultural Values Conflict Scale, 386 Disabilities, testing of individuals with, Employment settings. See also Industrial/
Culture 106–8. See also Learning organizational (I/O) psychology;
approaches to describe, 45 disabilities Personnel decisions; Vocational
defined, 45 Disaggregation research, 90–2 behavior
personality and, 45–9. See also implications, 92–3 assessing personality and mental health
Environmental influences on Disclosure (X) scale, 626, 627t–30t, 631 problems in, 586–7
personality Disconstraint, 157 assessing test defensiveness in, 587–9
computer-based interpretation of test reliability, 284–6 personality structure of FFM traits, 30
results, 592 suicide risk and, 507. See also Suicide scales associated with the five factors,
strategies for MMPI-2 interpretation, 590 Constellation (S-CON) 305, 306t
exclusion vs. inclusion rules in utility, 288–9 as a theory of everyone, 307–9
evaluation, 590 validity, 288–9 visual representation of, 303f
MMPI-2 clinical profile information, Exploration in interviews, 216 Five-Factor Theory (FFT), 300–3, 315
590–1 Exploratory factor analysis (EFA), 61–2 Fixed cutting scores, 147
personality information available Expository process model (EPM), 687–8 Flexibility, 332, 590
through assessment, 590 Extraversion, 306t, 311t, 331, 333. Flexibility (Fx) scale, 331
Empowerment, 426 See also Five-Factor Model (FFM); F minus K (F-K) index, 620
Enculturation, 384 NEO Inventories Follow-up interviews, 210
Environmental influences on personality, Forced-choice measures, 70
314, 315. See also Culture, Factor analysis, 304. See also specific trait scores for single-stimulus vs., 76, 77t
personality and personality models Forensic interviews, 210
“Environmental” measures, genetic effect confirmatory, 61 FRAMES, 212
on, 30 exploratory, 61–2 See also Psychopathy
Epigenesis, 34–5 genetic, 30 FRAMES, 212
Error, inevitability of, 146–7 meta-factor analysis of personality traits, Frank, L. K., 8
Errors of omission and commission, 48 155–6, 156f Frequency and Acceptability of Partner
Ethical and legal challenges in assessment, Fake Bad Scale (FBS), 435–6 Behavior Inventory (FAPBI), 467,
anticipating, 599–600, 608 “Faking bad.” See Malingering Full scores on elevated scales (FSES)
in accepting referrals, 600–2 False memories, 424, 425 Frequency (F) scale, 487. See also
in conducting psychological evaluation, Family background, 208t Infrequency (F) scale
601–3 Family Conflicts Scale (FCS), 385–6 Freud, Sigmund, 667
in managing case records, 607–8 Family Discord (Pd1) subscale, 262 Friedman, A. F., 508–9
in preparing and presenting reports, 603–7 Family functioning, 657. See also Couple F scales on MMPI-2, 255, 435, 487. See
in selecting test battery, 600–1 distress also Infrequency (F) scale
Ethical issues in test development, 130–1 assessment constructs across domains Full scores on elevated scales (FSES)
Ethnic and racial minority clients in U.S., and levels of, 460, 461t–2t strategy, 175
assessment of, 105–7, 109, 396–7. Family law cases, 282 Functional analysis, 236. See also Functional
See also Asian Americans, Family Problems (FAM), 265 relations
personality assessment with Family systems perspective, 203, 461t–2t limitations, 235
assessment of psychopathology, 396, Fax processing of computer test data, 695 Functional Analytic Clinical Case Model
408–9 Fb (Back Infrequency) scale, 255, 435, 620 (FACCM), 233, 233f, 234f, 236
clinical interviews, 398, 400 Fears (FRS) scale, 264, 445, 589 Functional equivalence, 51
clinician bias, 399–400 Fear system, 154, 154f, 157, 158 Functional impairment (FI), 651–2, 663
psychological testing, 400–8 Feedback Functional magnetic resonance imaging
structured clinical interviews, 400 acceptance of, 590 (fMRI). See also Blood oxygenation
future of, 392 to clients, 207, 213, 232–3. See also level-dependent (BOLD) response
with MMPI-2, 120, 270–1 under MMPI inferring latent states from, 367–8
with Rorschach, 290–1 neuropsychological intervention and, Functional relations, 227, 237–9, 247n2
test revision and, 120 451–2
Ethnic bias, 356–7, 396, 399–400. See also Feigned psychological symptoms. Gall, Franz Joseph, 6
Cultural and racial factors in See also Malingering Galton, Francis, 7
assessment base rates, 617–18 Gender bias, 356, 417–18, 427
Ethnic identity and acculturation, 397–8, 409 Feminine Gender Role (GF) scale, 268 clinical decisions and, 420–1
Ethnic-specific manifestations of behavior, Femininity/Masculinity (F/M) scale, 331 diagnosis and, 419–20
398–9 Figure drawing tests, 408 sex of clinician and, 418–19
Ethnocentricity, 46, 48. See also Culture Firestone Assessment of Self-Destructive Gender differences, 314–15, 531
Etic approach. See Emic and etic Thoughts (FAST), 514 Gender role identification, 422t
approaches selected items, 515t scales of, 268, 331. See also Masculinity-
Etiologic interviews, 210 Fitness-for-duty evaluations, 11, 586 Femininity (Mf) scale
European American Values Scale for Asian Five-Factor Model (FFM), 157, 305, 307. Gene-environment interactions, 35–6
Americans (EAVS-AA), 385 See also NEO Inventories Gene microarrays, 36–7
Event-contingent recording, 229 as covering full range of personality composition, 37–8
Existential theory, 202 traits, 305 General Ethnicity Questionnaire (GEQ),
Exner, John E., Jr., 142–3, 277 cross-cultural research on, 52 384, 385
Exner Comprehensive System (CS), 405. historical background, 304 Generalized graded unfolding model
See also Rorschach; Rorschach Openness factor and, 51. See also (GGUM), 67–9, 73, 75
assessment Openness Generalized vs. individualized
antisocial and psychopathic patients and overcoming prejudice against trait assessments, 545. See also
and, 571 psychology, 300–3 Nomothetic assessments
multinational groups and, 291–3, 406 personality disorders and, 343–4, 351–2 Genetic factor analysis, 30
Genetics. See also Molecular genetics Independent vs. interdependent functional, 243t
behavior, 25, 152, 314. See also self-construals, 46 parts, 204, 206
Molecular genetics Index of Potential Suicide, 514 practice exercise, 218–19
implications for assessment and Indigenous measures of personality, 52 reliability, 219, 221
classification theory, 39 Individual differences, 103, 105, 151, 228 semistructured, 337–40, 349–50, 350t
Genomic imaging, 38–9 Individualism, 46 structure, 203–4, 205t, 206–7, 208t
Georgopoulos, Apostolos, 369 Individualized vs. generalized assessments, techniques, 213–18, 220t
Given-new contract, 685–6 545. See also Idiographic assessment types of, 207–13
Glasgow Coma Scale (GCS), 448 Industrial/organizational (I/O) Intimacy, 472
Globalization. See under Cultural diversity psychology, 582–3. See also Introversion, 333
Goal attainment scaling (GAS), 476 Employment settings; Personnel Introversion/low positive emotionality, 157
Good Impression (Gi) scale, 328–9 decisions; Vocational behavior Introversion/Low Positive Emotionality
Gough, Harrison, 10, 323. See also evolution, 583–6 (INTR), 160, 269
California Psychological Inventory Infrequency (F) scale, 254–5, 435, Inventory of Interpersonal Problems (IIP),
Grady, Judge, 109 487, 620 343, 344
Graham, John R., 252 Infrequency (INF) scale, 625 Item characteristic curve (ICC), 174
Greene, Roger L., 170 Infrequency Psychopathology (Fp) scale, Item response functions (IRFs), 72
255, 435, 620 Item response theory (IRT), 59, 67, 119, 174
Habit strength. See under Aggression Initiative, 590 IRT parameters for TAPAS well-being
Halstead-Reitan Neuropsychology Battery Intake interviews, 210. See also Interviews scale, 65t
(HRNB), 443–5 Intellectual Efficiency (Ie) scale, 330 Item response theory (IRT) dominance
Hare Psychopathy Checklist-Revised. Intelligence measures analyses, 62
See Psychopathy Checklist-Revised minority groups and, 109 dichotomous analysis, 62–6
Harm avoidance, 159–60 personality measures and, 104 polytomous analysis, 66–7
Harris-Lingoes subscales, 261–2, 446, Interest/seeking system, 155, 157
492, 573, 574 Internal consistency analysis, 325–6 Jacobs, D. G., 519
Hathaway, Starke R., 9, 119, 250, 260, Internalization ratio (IR), 656 Japanese MMPI, 381
446, 573 International developments in assessment Job Description Index (JDI), 113
HCR-20, 558–9 research and applications, 712 Johnson, R. W., 131
Health Concerns (HEA) scale, 264 International Personality Disorder Joy system, 155
Helms, Janet E., 386 Examination (IPDE), 339, Justificationism vs. nonjustificationism, 86–7
Hierarchical organization, constructs and, 348t–50t, 349, 351, 353
90, 156, 156f International testing issues, 103, 108. Karson Clinical Report, 169
Hispanics, 103, 108, 271. See also Latino See also Cross-cultural validity of Korean MMPI, 381
heritage assessment; MMPI, international Koss-Butcher Critical Item Set-Revised, 509
MMPI-2 use with, 403 application; Multicultural K scale. See Correction (K) scale
Hispanic Stress Inventory (HSI), 398 assessment
History, patient, 208t, 222, 568, 569 Internet-based assessment, 165, 176–8 Late postconcussive syndrome (LPCS) and
Histrionic (HST) personality disorder, Internet-based career counseling, 166–7 MMPI-2, 440, 450–1
348t–50t Internet-based test applications, 695 Latino heritage. See also Hispanics
Hmong version of MMPI, 379–80 Internet psychotherapy interventions, 166 schizophrenia and, 145t, 145–6
Homogeneous constructs, 87–90 Interpersonal effectiveness, 326–7, 331 Learning disabilities, 99–103
Hong Kong. See Asian personality Interpersonal resistance. See Resistance Lethality of Suicide Attempt Rating Scale
Hostility (Ho) scale, 267–8 Interpretation in interviews, 216–17 (LSARS), 515
Humor in interviews, 216 Interview reports, 219 Level of Supervision Inventory (LSI), 555,
Hypochondriasis (Hs) scale, 257, 438, 488 sample, 230–1 557, 559
Hypomania (Ma) scale, 259–60, 441, 490 case formulation, 222–3 Lexical hypothesis, 304, 305
Hypothesis testing (interview), 207 mental status exam, 222 Lie detection, fMRI and, 367–8
Hysteria (Hy) scale, 257–8, 438–9, 489 reason for admission, 221–2 Lie (L) scale, 255, 435, 487, 670
treatment recommendations, 223 Likert, R., 60
Ideal point analyses, 67, 71 Interviews, 201, 239, 240t–1t, 568. Likert scales, 60
of single-statement items, 67–9 See also Diagnostic interview Linguistic equivalence, 49–50
Ideal point models, 60–1 clinician attitudes in, 201 Linkage analysis, 33–4
Idiographic assessment, 238, 545. computer-administered, 219, 221 Listening, active, 214–15
See also Individual differences
Immaturity (IMM) scale, 491 content, 207, 208t Loss of Face (LOF) scale, 385
Impulsivity, 88–90, 328, 590 elements, 201 Low Self-Esteem (LSE) scale, 265
Inconsistency (ICN) scale, 625 clinician’s theoretical orientation, L scale. See Lie (L) scale
Inconsistent responding (INC), 672 202–3, 206
Independence (In) scale, 327 personal beliefs and values, 203 MacAndrew Alcoholism Scale-Revised
Independent components analysis (ICA), with ethnic and racial minorities, 398, 400 (MAC-R) scale, 268, 491, 532, 533
367, 370, 371 expectations of client, 213 McCrae, Robert R., 344. See also Five-Factor
Independent thinking, 332 expectations of interviewer, 213 Model (FFM); NEO Inventories
McGrath, R. E., 88 Miller, W. R., 212 computer-based test interpretation,
McKinley, J. Charnley, 9, 250, 446, 573 Miller Forensic Assessment of Symptoms Test 168–9, 272–3, 592
Magnetic resonance imaging (MRI). See (M-FAST), 633, 635t, 636, 638t case history, 699–701, 700t, 701t
Functional magnetic resonance Millon Adolescent Personality Inventory clinical use of computerized report,
imaging (MACI), 511 697, 699–701
Magnetoencephalography (MEG), Millon Clinical Multiaxial Inventory computerized adaptive testing with,
369–70, 370f (MCMI), 10 174–6, 272, 696–7
Malgady, R. G., 109–10 CBTI systems for, 169 limitations of, 176
Malingering, 435–6, 575, 613–14, malingering screening and detection content interpretation of, 260, 261
639–40. See also Feigned scales, 626, 631, 632t advantages of, 261
psychological symptoms minority groups and, 404 limitations of, 260–1
conceptual and practical issues personality disorders and, 341–3, 346t, content scales, 262–6, 263t, 444f, 573
coaching, 618 347, 348t–50t, 352–3 critical items, 266
cost of misdiagnosis, 618–19 substance abuse scales (B and T), 533 First Factor, 124–6. See also Anxiety (A)
definition and diagnostic criteria, substance abuse settings and, 535 scale
614–15 suicide risk and, 510–11 Harris-Lingoes subscales, 261–2, 446,
methodological issues, 615–17 Minnesota Multiphasic Personality 492, 573, 574
DSM-IV-TR and, 614–15 Inventory. See MMPI history of, 251–2, 586
strategy for clinical practice, 619 Minnesota Report: Adolescent Interpretive timeline of developments in, 253t–4t
Malingering detection strategy, hypothetical System (Butcher & Williams), 493 international adaptations, 271, 379–81
classification rates for a two-stage, Minnesota Reports, 168, 592 Internet-based testing, 176–7
636, 638t, 638–9, 639t Minority groups. See Ethnic and racial minority clients and, 270–1, 401–4
Malingering detection tests and scales, minority clients in U.S.; Race, personality disorders and, 341, 346t, 347
methodological quality issues in ethnicity, and minority status personnel applications, 586–7
validation of, 616t, 616–17 MMPI (Minnesota Multiphasic strategies for interpretation, 590–1
Malingering (MAL) index, 626 Personality Inventory) psychopathy and, 572–5, 574t
Malingering screening and detection scales, computer-assisted interpretive systems, supplemental scales, 266, 267t. See also
619–20 168–9 Personality Psychopathology Five
MCMI-III, 626, 631, 632t development, 9–10, 250–1 (PSY-5) scales
MMPI-2, 620, 625, 638t, 638–9 international application, 44–5, 48–51, factor scales, 266
results from patient-controlled 271, 593. See also Asian scales of adjustment, 267
malingering detection studies, 620, Americans, personality assessment scales of anger, 267–8
621t–4t, 625 with; Asian populations, MMPI-2 scales of gender role identification, 268
Personality Assessment Inventory (PAI), studies with scales of personal resources, 266–7
625–6, 627t–30t, 638t providing test results feedback, 495–6, scales of substance abuse, 268
Malingering tests, dedicated, 631, 633, 697, 698t, 699–701 MMPI-2 RF (MMPI-2 Restructured
636, 638–9 as an intervention, 697, 699 Form), 114, 595. See also
Maltsberger, J. T., 520 reasons for restandardization, 112, 251 Restructured Clinical (RC) Scales
Management. See Industrial/organizational deficiency changes, 113, 114 MMPI-297, 272
(I/O) psychology growth changes, 119–30 MMPI-A (Minnesota Multiphasic
Mania. See Hypomania (Ma) scale resistance and, 669–71, 677 Personality Inventory-Adolescent),
Marginal maximum likelihood (MML), 73 restandardization project, 251–2 10, 380–1, 496
Marital distress. See Couple distress short forms, 271–2 case example, 494–5
Marital Distress (MDS) scale, 267 substance abuse and, 528–30, 534–6 development, 486
Marital distress taxon, 463 suicide risk and, 508–10 future directions, 496–7
Marital problem questionnaires, 230 validity scales, 251, 252, 254t, 254–6, interpretation, 486
Marital Satisfaction Inventory-Revised 435–6 adolescent interpretive system, 493
(MSI-R), 475 malingering and, 620, 621t–4t, 625, Clinical Scales, 488–90
Markon, K. E., 156–7 638–9 Content Component Scales, 491
Masculine Gender Role (GM) scale, 268 violence risk and, 554 Content Scales, 490–1
Masculinity-Femininity (Mf) scale, 258, MMPI-2 (Minnesota Multiphasic interpretive strategies, 492–3
439, 489 Personality Inventory-2nd norms and T-scores, 488
Matarazzo, J. D., 201 edition), 9–10, 48, 49, 273. Supplementary Scales, 491–2
Meehl, Paul E., 84, 85, 148, 260 See also MMPI; Personality validity scales, 486–8
Mental health care (MHC) programs, 647 Psychopathology Five (PSY-5) Molecular genetics
Mental retardation, 105 Personality Inventory-2nd norms and T-scores, 488
Mental status exams, 210–12, 222 abbreviated versions and short forms, future directions in, 38–9
content areas, 211t–12t 271–2, 434–5, 695 Molecular genetic studies, 33–4
Merrill, M., 192 Clinical Scales, 251, 256–9, 438–41. evidence for polygenic influence, 33–4
Meyer, Gregory J., 292, 293 See also Restructured Clinical (RC) mechanisms of polygenic influence, 34–6
Meyer, R. G., 509–10 Scales; specific scales Monoamine oxidase A (MAOA) gene, 35,
Michigan Alcohol Screening Test (MAST), identifying the “core” components of, 38, 39
531, 533–4 115 Motivational interviews, 212
Motto, J. A., 503, 506, 515, 517, 521, 522 Neuropsychological assessment, 432–3 Pairwise preference formats, 69–70
Motto Suicide Risk Assessment, 518 components, 432 advantages, 70–1
Multicultural assessment, 47–8, 715. psychopathy and, 575–6 Pairwise preferences, multidimensional,
See also Cross-cultural validity of Neuropsychological evaluations, use of 74–6, 77t
assessment; Cultural and racial MMPI-2 in, 432, 433, 452–3 Paranoia (Pa) scale, 258–9, 440, 489
factors in assessment; International administrative issues Paranoid (PRN) personality disorder,
testing issues administrative format, 433–4 348t–50t, 356–7
Multidimensional pairwise preferences clarifying purpose of MMPI-2, 433 Paraphrasing, 217
(MDPP), 70, 73t, 73–6, 77t monitoring the examinee, 434 Parent-child interactions, 231–2
item response surface for MDPP item use of short form, 434–5 analog behavior observations, 232
measuring two dimensions, 75, 75f clinical profile interpretation, 436–7, 437t Partner relational discord syndrome,
Multidimensional Personality case illustrations, 443, 443f, 444f, criteria for, 459t
Questionnaire (MPQ), 159–60 449–50, 450f Partner relational problems. See also
Multinational groups. Clinical Scale interpretation, 437–42 Couple distress
See also Cross-cultural validity controlling for neurological symptom definition and conceptualization, 458
of assessment reporting, 446–9 PASE v. Hannon, 109
Rorschach assessment with, 291–3 diagnosing neuropsychological Passive-aggressive (PAG) personality
Multi-trait Personality Inventory deficits, 442–6 disorder, 348t–50t
(MTPI), 382 incorporating results into reports, 451 Patient-examiner process, 193
Multivariate biometrical modeling, 29–31 mild head trauma and late Patient history, 208t, 222, 568, 569
Murphy, G. E., 521 postconcussive symptoms, 450–1 Patient predisposing variables.
Murray, Henry, 8–10 MMPI-2 item table for evaluating See Pretreatment planning
Myers-Briggs Type Indicator (MBTI), 305 traumatic brain injury, 446–9, 448t PDAs (personal digital assistants), ecological
neuropsychological intervention and momentary assessment using, 246
Narcissistic (NCS) personality disorder, feedback issues, 451–2 Performance-based measures, 278
348t–50t RC Scales and the restructured Personal Data Sheet, 7
Native Americans, 400–1, 407 MMPI-2, 442 Personal digital assistants. See PDAs
Naturalistic observation, 241t interpretive issues Personal injury cases, 281
Nature vs. nurture, 314. See also Culture, exaggerated symptom responding, 435 Personality, recent advances in the study of,
personality and; Environmental validity scale interpretation, 435–6 89–90, 93
influences on personality; Genetics Neuropsychological reports, incorporating Personality assessment
Negative emotionality/neuroticism, 157. MMPI-2 results into, 451 classic approaches to, 709
See also Neuroticism; Positive Affect Neuroticism, 306t, 311t. See also developments in, and the current scene,
(PA) and Negative Affect (NA) Five-Factor Model (FFM); 709–12
Negative Emotionality/Neuroticism Negative emotionality/ future directions, 715–16
(NEGE) scale, 160, 269 neuroticism; NEO Inventories history of, 4–5
Negative impression (NIM) scale, 625, 626 Nomothetic assessments, 545, 606 contemporary approaches and
Negative presentation management Nomothetic traits, 158 directions, 11–12
(NPM), 313 Nomological network concept, 86 expansion during World War II, 10
Negative Treatment Indicators (TRT), Nonepileptic seizure (NES) disorder, 439 highlights in, 6–7, 8t
265–6 Nonjustificationism, 86–7 20th-century developments, 7
NEO Five-Factor Inventory (NEO-FFI), Norms, test, 104 increased rate of research publication
310, 351. See also NEO Inventories changes in, 113 and clinical application of, 11
NEO Inventories, 10, 299, 317. See also methodological developments, 710–11
Five-Factor Model (FFM) Obsessive-compulsive disorder (OCD). new technologies, 711, 713
applications and future directions, 315–17 See also Psychasthenia (Pt) scale past challenges to, 708–9
clinical usefulness, 93 disaggregation of, 91 purposes, 176, 416–17
computer interpretation, 311–12 Obsessive-compulsive (OBC) personality Personality Assessment Inventory (PAI),
CPAI and, 53 disorder (OCPD), 347, 348t–50t 10, 342, 511, 625–6. See also
factors, 52, 53, 88, 89, 310, 311t, 316t Obsessiveness (OBS) scale, 264 Chinese Personality Assessment
history, 304–5, 309–10 Office of Strategic Services (OSS), 10 Inventory (CPAI)
protocol validity, 312–14 OMNI Personality Inventory, 341, Personality assessment procedures, lag in
research findings, 30, 52, 53, 314–15 344–5, 349t development of new, 713
translations, 307, 312 Online psychotherapy. See Internet Personality assessment psychologists,
NEO Personality Inventory-Revised psychotherapy interventions challenges facing, 712–15
(NEO-PI-R), 310. See also NEO Open- and closed-ended questions, 214 Personality constructs. See Constructs
Inventories Openness, 157, 306t, 311t, 332. Personality Disorder Interview-IV
Asian personality and, 382–3 See also Five-Factor Model (FFM); (PDI-IV), 339
personality disorders and, 344, 345 NEO Inventories Personality Disorder Questionnaire-4
Neurodiagnostics, 364, 368–73. See also Orientation interviews, 213 (PDQ-4), 342, 346t, 348t–50t
Blood oxygenation level- Outcome evaluation, 280 Personality disorders, 316–17, 336–7, 357
dependent (BOLD) response Overcontrolled-Hostility (O-H) scale, instruments for assessment of, 337
Neurological symptom reporting, 441 267, 268, 544 convergent validity research, 345,
controlling for, 446–9 Ownby, Raymond L., 684, 687 346t, 347, 348t–50t, 349, 351
discriminant validity research, 345, Positive presentation management Psychopathy Checklist-Revised (PCL-R),
351–2 (PPM), 313 339–40, 554, 569–71, 576–7
self-report inventories, 340–5 Positive psychology, 426 administration and scoring, 570
issues to consider in assessment of, 352 Postconcussive syndrome. See Late aggression, violence, and, 554–5, 558,
boundaries among personality postconcussive syndrome (LPCS) 560n3
disorders, 355–6 and MMPI-2 factor analysis, 90–1
boundaries with Axis I disorders, 352–4 Posttraumatic Stress Disorder (PK) scale, 267 MMPI and, 573–4, 574t, 576, 577
boundaries with normal personality Posttraumatic stress disorder (PTSD). reliability, validity, and utility, 554–5,
functioning, 354–5 See also Gender bias; Sexual and 558, 560n3, 567–9
gender, culture, and ethnic bias, 356–7 physical abuse Psychosis. See Psychotic features and
Psychopathy Checklist-Revised and, 570 disaggregation of, 91–2 psychosis
Personality inventories Rorschach and, 281 Psychosocial treatment, 658. See also
expansion in later part of 20th century, 10 Power, 472 Treatment
first, 7 Prediction, clinical. See Base rates Psychotherapeutic resistance. See
increasing variety available for use in Prejudice (Pr) scale, 329 Resistance
clinical assessment, 710 Prescriptive psychotherapy (PT), 661–2 Psychotherapy interventions. See also
Personality Inventory for Children (PIC), 169 Pretreatment planning, 643–5, 662–4 Treatment
Personality Psychopathology Five (PSY-5) future directions, 661–2 computer applications for, 166–7
scales, 128, 158–60, 268–9, 269t, 492 patient predisposing variables, 645, 647–8 Psychotic features and psychosis, 281.
interpretation, 158–60 diagnostic variables, 648–9 See also Schizophrenia (Sc) scale
personality disorders and, 344, 345 nondiagnostic variables, 649 Psychoticism, 157
Personality structure, 303–4 state-like qualities, 649–51 Psychoticism (PSYC) scale, 159, 269
Personality test theory, 59–61, 76, 78 subjective distress and arousal, 651–2
classical dominance analyses, 60–76 trait-like qualities, 652–8 Quantitative trait loci (QTL) analysis, 33
Personality theories, 11–12 problem-solving approach to, 645–7 Question, Persuade, Refer, and/or Treat
Personality traits, 158. See also Traits Problem complexity/chronicity (PCC), (QPRT), 518
major, 155–7 653–4, 663 Questioning in interviews, 214
emotion systems and, 157t, 157–8 Problem solving, 472 Questionnaires, 7
meta-factor analysis of, 155–6, 156f Professional programs, diminished behavioral, 239
structure of assessment training in, 712–13
creating consensus, 305, 307 Prognostic vs. diagnostic assessments, 544 Race, ethnicity, and minority status, 396–7.
discovery and rediscovery, 304–5 Projective method, 8. See also Test See also Cultural and racial factors in
the problem of structure, 303–4 situation, as a projective test
assessment; Ethnic and racial
Personality trait theory, 150–1 Projective techniques, 104.
minority clients in U.S.
Personal Progress Scale (PPS), 426 See also specific tests
Rape. See Sexual and physical abuse
Personnel decisions, 582, 595. See also minority groups and, 404–5
Rapid Alcohol Problems Screen 4
Employment settings; Industrial/ Prospective vs. retrospective assessments, 544
(RAPS4), 533, 534
organizational (I/O) psychology “Proteomics,” behavioral, 35
Rapport, 191–2, 195
MMPI-2 and, 592–4 Psychasthenia (Pt) scale, 126, 259, 440, 489
Rating scales, 7
case illustrations, 591–3, 592t Psychodynamic orientation, 202
Reaction potential, 546, 553
non-K-corrected Clinical Scale Psychological equivalence, 51
Readiness for change, 652, 663
profiles, 593 Psychological-mindedness (Py) scale, 330–1
Reality contact, 157
RC scales, 594–5 Psychologists. See Personality assessment
Reasons for Living Inventory, 516
MMPI-2 RF and, 595 psychologists
Receiver operator curves (ROCs), 530
need for well-developed and research Psychometric equivalence, 49–51
Recovered memories, 424–5
measures for, 593–5 Psychopathic Deviate (Pd) scale, 258, 439,
Referrals, 600–2, 644
Personnel settings, 11 489, 573, 574
“clinical” vs. “administrative,” 601
Person orientation, 332 Psychopathic personality inventory (PPI), 573
Reflection (interview technique), 217
Phenomenological theory, 202 Psychopathology, recent advances in the
Reframing (interview technique), 217
Phrenology, 6–7 study of, 89–93
Regulations. See Conformity
Physical abuse. See Sexual and physical abuse Psychopathy. See also Antisocial (ATS)
Reitan, R. M., 445
Picture-story tests, 406–7. See also personality disorder
Relational disorders. See also Couple distress
Thematic Apperception Test assessing, 568. See also Psychopathy
defined, 458
“Polygenic” model, 33–6
Relationship affect, 468–9
Positive adjustment, 332 cognitive and intellectual measures,
Relationship Attribution Measure
Positive Affect (PA) and Negative Affect (NA), 575–6
(RAM), 474
Watson and Tellegen model of, 115, forensic assessment and issues, 568–9
Relationship behaviors, 466–7
121–8. See also Demoralization integration of findings, 576–7
Relationship cognitions, 467–8
vs. other models, 122–4 MMPI-2 assessment, 572–5, 574t
Relationships. See also Couple functioning
Positive and Negative Affect Schedule Rorschach assessment, 571–2, 572t
defensive approach to dealing with, 590
(PANAS), 124 as categorical and dimensional
therapist-client, 646, 659–60
Positive emotionality. See Introversion/low construct, 569–70 Reliability Generalization (RG), 135
positive emotionality disaggregation of, 90–1
Reliability (personality trait), 590
Reports, psychological, 691 Risk management, professional. See Ethical Schedule for Nonadaptive and Adaptive
credibility and persuasiveness, 686–7 and legal challenges in assessment Personality (SNAP), 343–5, 346t
current guidelines on writing, 685 Risk-Rescue Scale, 514–15 Schizoid (SZD) personality disorder,
elements, 688 Rogers, Carl, 202, 206, 218 348t–50t
organization, 688–9 Rogers discriminant function (RDF), 626 Schizophrenia, 145
sections, 688 Rorschach: A Comprehensive System, The Five-Factor Model, NEO-PI-R, and, 308
ethical and legal challenges in preparing (Exner), 277, 507. See also Exner Latino heritage and, 145t, 145–6
and presenting, 603–7 Comprehensive System neurodiagnostics and, 367, 369–71,
future directions, 691–2 Rorschach assessment 370f, 371f
jargon vs. middle-level constructs in, 687 admissibility into evidence (forensic Schizophrenia (Sc) scale, 259, 440–1,
length, 690 cases), 282–3 489–90
paragraphs, 688 of antisocial and psychopathic patients, Schizotypal (SZT) personality disorder,
problems in, 685 571–2, 572t 348t–50t
psycholinguistic concepts and, 685–6 cross-cultural applicability, 290–3, 406 disaggregation of, 92
purpose, 686–7 literature on, 290 Science, advances in philosophy of, 86
readability, 690 psychometric foundations of, 283 nonjustificationism and construct
report models/strategies, 689 reliability, 283–6 validation, 86–94
ability- or domain-oriented report, 689 utility, 288–9 Screening interviews, 213
hypothesis-oriented report, 689 validity, 286–8, 405, 571 brief, 208
professional letter report, 689 purposes served by, 278 Seizure disorder, 439, 441, 442
test-oriented report, 689 clinical, 278–80 Self-acceptance (Sa) scale, 327
sentences, 687–8 forensic, 280–3 Self-construals, independent vs.
Report writing, 684–5. See also Behavioral status of education and training in, 293–4 interdependent, 46
observations; Interview reports suicide risk and, 506–8. See also Suicide Self-control. See Control
Repression (R) scale, 266, 492 Constellation (S-CON) Self-control (Sc) scale, 328, 551
Research methodology, 8 usage frequency, 289–90 Self-disclosure by clinician, 217–18
Resistance, 667, 681–2 value in a test battery, 278 Self-esteem. See Low Self-Esteem (LSE) scale
Butcher Treatment Planning Inventory Rorschach Inkblot Method (RIM), 277–8. Self-expressive tests. See Projective techniques
and, 671–7 See also Rorschach assessment Self-Harm (BOR-S) subscale, 511
case example, 677–81, 678t, 679f Rorschach Inkblot Test, 406–7. See also Self-monitoring, 229. See also Ambulatory
concept of, 667–8 Rorschach assessment self-monitoring
to initiating psychological treatment, 668–9 computer-based interpretation, 169 Self-orientation, 332
MMPI/MMPI-2 and, 669–71 development, 8 Self-report measures of psychopathology
and ongoing psychological treatment, 669 Rorschach Interpretation Assistance use with minority groups, 400–4
Resistance potential, 654–5, 663–4 Program, 169 Self-report personality inventories. See also
Response bias, 48 Rorschach Oral Dependency (ROD) scale, specific tests
Response sets, 614 288, 289 first, 7
Responsibility (Re) scale, 327–8, 551 Rosenhan, D. L., 145 vs. performance-based instruments, 278
Restructured Clinical (RC) Scales (MMPI-2), Rosenthal, Robert, 287 for personality disorders, 340, 343–5
114, 135–6, 269–70, 270t Rowe, David, 152 convergent validity, 345, 346t, 347–9,
and corresponding Clinical Scales, 116t Rules. See Conformity 348t–9t
criticism of, 595 Rutgers Alcohol Problem Index (RAPI), DSM-IV-TR, 341–3
development of 534 limitations of, 342–3
background for, 114 Rutgers Drugs Problem Screener (RDPS), Sensitivity, 148n2, 529, 617
deficiency changes pursued through, 114 534 Sentence completion tests (SCTs), 408
growth aims pursued in, 120–30 Serotonin transporter gene-linked
new analysis of Clinical Scale Samejima’s graded response (SGR) model, polymorphic region
covariation, 117–18 61, 62, 66, 67 (5-HTTLPR), 34
overview of, 114–18, 574–5, 594–5 Scalar equivalence, 51 Settings. See also under Treatment context
reduction of Clinical Scale Scale covariation. See Clinical scale tests designed for particular, 147
covariation, 116–17 covariation Sexual and physical abuse
steps in, 115 Scale for Assessing Suicide Risk (SASR), 514 eating disorders and, 146
in neuropsychological evaluations, 442, Scale for Suicidal Ideation (SSI), 513, 516 prevalence rates, 421, 423
443, 444f Scale for the Prediction of Aggression and reactions of other people to, 423
and the restructured MMPI-2, 442 Dangerousness in Psychotic screening for, 421, 422t
Retest method, 586–90, 593 Patients, 552 barriers to, 423–5
Revised Diagnostic Interview for Scale redundancy and test revision, 131–2 probing for abuse details, 425–6
Borderlines (DIB-R), 339–40 Schafer, Roy, 193 Sexual reoffending and recidivism, tools
Revised NEO Personality Inventory. See Schedule for Affective Disorders and devised to predict, 557t
NEO Personality Inventory-Revised Schizophrenia-Lifetime Version Shedler, J., 355
Rigidity, 331 (SADS-L), 401 Shedler-Westen Assessment Procedure
Risk assessment. See Aggression; Suicide risk Schedule for Affective Disorders and (SWAP-200), 339, 350t, 351, 352
assessment; Violence risk assessment Schizophrenia (SADS), 204 Shneidman, E. S., 521
Shortened forms, 695 Subjectivity Teamwork, 590
of MMPI, 271–2, 434–5, 695 of behavioral observations, 189, 194–6 Tellegen, Auke. See also Positive Affect
undesirable side effects, 713–14 “disciplined,” 195 (PA) and Negative Affect (NA)
Silence during interviews, 218 value of, 195–6 MMPI-2 and, 120, 121, 129, 159, 160,
Single-statement (SS) format, 59, 73t, 76, 77t Substance abuse, 527–9, 537 252
Single-stimulus measures, trait scores for definitions and diagnostic criteria, on MMPI restandardization
forced-choice vs., 77t 530–1 committee, 252
Sino-American Person Perception Scale dual diagnosis, 536–7 on traits, 151
(SAPPS), 382 gender differences, 531 TEMAS (Tell-Me-A-Story) Thematic
16 Personality Factors (16-PF), 10, 304, MMPI-2 supplemental scales of, 268 Apperception Test, 407
309, 510 prevalence, 529, 530 Terman, L., 192
Sociability (Sy) scale, 326 screening for, 529–34 Termination interviews, 213
Social Discomfort (SOD) scale, 265 screening test used with college students, 534 Test administration and standardization,
Social distance and clinician bias, 399 timing of psychological assessment, 534 99–105, 109, 120. See also
Social Introversion (Si) scale, 260, 441, 490 traditional self-report scales used to Disabilities; Ethnic and racial
subscales, 491 identify, 532–4. See also specific scales minority clients in U.S.
Socialization (So) scale, 328, 551 Addiction Admission and Addiction diversity in, 192
Social Participation (Sp) scale, 326 Potential scales, 534–5 Test equivalence. See Cross-cultural
Social poise, 326–7, 331 Scales B and T (MCMI-III), 535 validity of assessment
Social Presence (Sp) scale, 326–7 Substance abuse treatment programs, Test feedback. See Feedback
Social Responsibility (Re) scale, 266, 267 self-report tests used in, 534–6 Testing, defined, 568
Social role expectations, 649–50 Suicidal ideation, intent, and behavior Test items
Social support, 657–8, 664 assessment of, 513–17. See also Suicide leaked into public domain, 113
Socioeconomic status (SES), 397 risk assessment outdated/antiquated, 113
Somatization, 46. See also Hypochondriasis Suicidal Ideation Questionnaire (SIQ), Test of Memory Malingering (TOMM),
(Hs) scale 514 618
Somatization of Conflict (SOM) scale, 171 Suicidal Ideation (SUI) scale, 511 Test-Retest (TR) index, 133
Somatoform disorders, 439, 449–51. See Suicidal Impulsivity, 511 Test revision, 112, 135–6
also Hypochondriasis (Hs) scale Suicide Assessment Battery (SAB), 517, 518 preparation for, 132–3
Somwaru and Ben-Porath scales, 341 Suicide Constellation (S-CON), 142–4, professional acceptance, 135
South African Personality Inventory (SAPI), 53 143t, 289, 507 psychometric data at time of release,
Southwest Plains Oklahoma (SWPO), 401 Suicide Intent Scale (SIS), 516 134–5
Specificity, 148n2, 530, 617 Suicide Potential Scale (SPS), 509 reasons for, 112. See also MMPI, reasons
Spurzheim, Johan, 6 Suicide Probability Scale (SPS), 516 for restandardization
S (Superlative Self-Presentation) scale, 256, 587 Suicide risk assessment, 501–6. See also deficiency changes, 112–14
Stages of Change Questionnaire, 669 Suicidal ideation growth changes, 112, 119–20
Standardization, test, 101, 104–5, 109–10 decision model, 502–3 invalid motivations for change, 130–2
Standards for assessment instruments, formulation of clinical judgment in structural changes
erosion of, 714–15 interview, 520–2 changes in length of scales, 134
Stress tolerance, 590 limitations of theoretical orientation and item format changes, 133–4
Strong Vocational Interest Blank (SVIB), DSM-IV diagnosis, 519–20 test format changes, 133
133, 334n1 recent empirical evidence on risk Test revision process, stakeholders in, 132
Structural equation modeling, 50 assessment practices, 505–6 Tests, psychological
Structured Clinical Interview for DSM-III through psychological testing, 506–13 inappropriate use of, 713
(SCID), 204 through structured interviews and most commonly used, 163–4
Structured Clinical Interview for DSM-IV psychological battery, 517–19 Test scores, inclusion in reports, 690–1
Axis II Personality Disorders Suicide risk factors, 504–5, 520. See also Test situation
(SCID-II), 339, 348t–50t, 349, 351 Suicide risk assessment as a projective test, 187–9, 198–9
Structured clinical interviews, 400. Suinn-Lew Asian Self-Identity and as a standardized interview, 193–4
See also Interviews, semistructured Acculturation Scale (SL-ASIA), Test theory. See Classical test theory
Structured Interview for DSM-IV 384, 385 Test translation. See Translated
Personality Disorders (SIDP-IV), Sullivan, Harry Stack, 204, 206 instruments
339, 349t “Super-ego strength,” 332 Thematic Apperception Test (TAT),
Structured Interview for Personality Superlative Self-Presentation (S) scale, 256, 587 187, 406–7
Disorders (SIDP), 349t Systematic treatment (ST), 662 development, 8–9
Structured Interview of Reported Systematic treatment selection (STS), minority groups and, 406, 407
Symptoms (SIRS), 618, 625, 636, 651–7, 662 suicide risk and, 511–12
637t, 638t Themes Concerning Blacks (TCB), 407
Structured Inventory of Malingered Tailored Adaptive Personality Assessment Theory validation, 81, 82, 84–6, 95
Symptomatology (SIMS), 633, System (TAPAS) well-being scale, 61. Therapeutic assessment, 710
634t, 638, 638t, 639 See also Dominance analyses, classical Therapeutic Reactance Scale (TRS),
Structured professional judgment (SPJ), 558 CTT item statistics for, 61–2, 62t
Subjective distress, 651–2, 663 IRT parameters for, 65t Therapist-patient matching, 659
Three-parameter logistic (3PL) model, 67
Thurstone, Louis, 60
Time Questionnaire (TQ), 517–18
Tolerance (To) scale, 329
Tool for Assessment of Suicide Risk (TASR), 502
Training. See Professional programs
Training needs in enhancing cultural perspective in assessment, 54
Trait and behavior assessment, integration of, 93–4
Trait dimension, 151
Trait explanations, 300
Trait psychology
  dryness, 301–2
  incompleteness, 302–3
  overcoming prejudice against, 299–303
Traits, 160–1. See also Personality traits
  clinical relevance, 301
  as cognitive fictions, 300
  defined, 151
  as immutable, 300–1
  as individuating, 151
  real organismic structural differences as underlying, 152
  as stable individuating properties of emotion systems, 152–5, 153t, 155f
  vs. states, 151–7
  underlying structures vs. dispositions, 151
Translated instruments, 44–5, 49–50, 108, 603. See also specific tests
Transparencies on Rorschach, 507
Trauma content index (TCI), 143
Traumatic brain injury (TBI). See also Neuropsychological evaluations, use of MMPI-2 in
  MMPI-2 item table for evaluating, 446–9, 448t
Treatment
  assessment as successfully integrated into, 710
  environmental variables and, 657–8
  interviews before and after, 213
  patient expectations, 649–51
Treatment context, 645–6
  intensity, 659
  mode/format, 658–9
  relationship variables, 646, 659–60
  setting, 658
  strategies and techniques, 660–1
Treatment indicators. See Negative Treatment Indicators
Treatment monitoring and outcome, couple assessment for, 475–6
Treatment planning, 279, 423, 680. See also Butcher Treatment Planning Inventory; Pretreatment planning
  computer technology and, 167
  couple assessment for case conceptualization and, 466–75
Treatment recommendations, 196, 648
True Response Inconsistency (TRIN) scale, 254, 255, 435, 488, 620, 670
Turner syndrome, 35
TWEAK, 533, 534
Twin data
  model-fitting approaches in the analysis of, 26–8
  what we have learned from univariate modeling of, 28–9
Twin research, 25–6, 33, 34
  current directions, 31–2
  what it has taught us about personality, 26
Twins, 25–6
  multivariate and longitudinal studies of personality and its correlates in, 29–31
Two-parameter logistic (2PL) model, 61, 62, 67
Type A (TPA) scale, 265

Unfolding model, 71
Unidimensional pairwise preference (UPP) items, 71, 72f, 73t, 74, 76, 77t
Utility-balanced diagnostic rule, 144
Utility of available assessment methods, 709

Validation, 100, 235. See also Construct validation
Validation tests
  properties that characterize sound, 94
  toward informative, 85–6
Validity, profile. See Defensive responding; Validity scales
Validity, test, 100, 690
  evidence of, 709
Validity Index Profile (VIP), 618
Validity index (V) scale, 626, 627t–30t, 631
Validity scales, 313–14. See also Malingering screening and detection scales
  MMPI, 251, 252, 254t, 254–6, 435–6, 620, 621t–4t, 625
Value orientation, 332
Variable Response Inconsistency (VRIN) scale, 254, 255, 435, 488, 620, 670
Vietnamese Americans, 389–91. See also Asian Americans, personality assessment with
Vietnamese MMPI, 379–80
Violence, 542. See also Aggression; Assault
  defined, 543
  heterogeneity, 543–4
Violence Risk Appraisal Guide (VRAG), 557–9
Violence risk assessment. See also Assault, predicting
  issues in prospective, 559–60
  objective assessment tools, 556t–7t
    actuarially derived, 557–8
    limitations of, 559–60
    rationally derived, 555, 557
  personality tests, 554–5
  structured professional judgment, 558–9
  subjective judgment, 553–4
Vocational behavior, 315, 422t. See also Employment settings; Industrial/organizational (I/O) psychology

Watson and Tellegen model. See Positive Affect (PA) and Negative Affect (NA)
Well-being scale, 20-item. See Tailored Adaptive Personality Assessment System
Well-being (Wb) scale (CPI), 329
Westen, Drew, 355
Widiger, Thomas A., 344
Woman-centered approach to assessment, 416–17, 427
  assessing for strength and well-being, 426–7
  diagnosis in context, 417
  as multimodal approach, 417
Women, 427. See also Gender bias; Gender differences
  contextual assessment for, 421. See also Sexual and physical abuse
    suggested areas for clinical assessment, 422t–3t
  multiple identities, 396–7
Women’s health, 422t
  biopsychosocial view of, 415–16
Woodworth, Robert, 7
Woodworth Personal Data Sheet (WPDS), 82, 83
Work. See Vocational behavior
Work Interference (WRK), 265
World War II, 10
Wylie, J., 142–3

X (disclosure) scale, 626, 627t–30t, 631

Y (desirability) scale, 626, 627t–30t, 631

Z (debasement) scale, 626, 627t–30t, 631
Zinnes-Griggs (ZG) model, 71–4