
Handbook of Autism and Pervasive Developmental Disorders: Assessment, Interventions, and Policy, Third Edition


Edited by Fred R. Volkmar, Rhea Paul, Ami Klin and Donald Cohen
Copyright © 2005 John Wiley & Sons, Inc.

CHAPTER 28

Diagnostic Instruments in Autistic Spectrum Disorders

CATHERINE LORD AND CHRISTINA CORSELLO

The development of diagnostic instruments in the past 30 years is an example of the interplay between clinical and research needs in the field of autism. When judged from field trials of diagnostic criteria (Volkmar et al., 1994), autism is one of the most reliably diagnosed disorders in child psychiatry. However, many diagnostic aspects of the disorder provide unique challenges, as well as raising issues shared with other childhood onset disorders. In this chapter, first general and then specific issues pertaining to designing and selecting instruments for diagnosis and measurement of core features of autistic spectrum disorders (ASDs) are considered. A brief historical review of some of the first standardized instruments used for diagnosis of autism is next, followed by short descriptions of some of the most common instruments used in diagnosis and measurement of the features that define ASD. The chapter concludes with information about the use of instruments for specific purposes, such as measuring change, ending with a general discussion. Because the emphasis of the chapter is on issues pertaining to the design and selection of measures, sections on individual instruments are not intended to be comprehensive. See review articles by Parks (1988) and Teal and Wiebe (1986), as well as original works cited in the text, for further information.

GENERAL ISSUES IN DIAGNOSIS OF AUTISTIC SPECTRUM DISORDERS

Autism and other pervasive developmental disorders (ASDs) are associated with a broad range of intellectual and language skills, particularly across time. This range affects the way that the disorder’s defining symptoms are manifested. Because ASDs typically begin when children are infants or toddlers and continue into adulthood, precise identification of well-defined behaviors that are necessary and sufficient for diagnosis across developmental levels is a complex task (Lord, Pickles, DiLavore, & Shulman, 1996; Volkmar et al., 1994). For example, although deficits in simple pretense and elicited imitation are typical of most children with autism at certain points in development, these deficits do not necessarily discriminate autism from other disorders at either very basic levels of development (i.e., age equivalents of under 12 months; Charman et al., 1998) or at much more sophisticated levels of development (i.e., very high-functioning verbal adults; Happé, 1995).

Appreciation is expressed to NICHD (U19HD35482) through the Collaborative Program for Excellence in
Autism (CPEA) and NIMH (R01MH066496) that provided support to the authors during the preparation of
this manuscript and to Colleen Hall, Kaite Gotham, Daniel Karstofsky, and Amanda Edgell, who helped in
the preparation of this chapter.


The challenge presented by changes in development in autism is similar to issues that affect the measurement of general intellectual development in all children. In the case of general intelligence testing, however, years of investigation, access to large populations, and population samples of normative data have allowed the development of instruments such as the Wechsler tests: WISC-IV (Wechsler, 2003), WAIS-III (Wechsler, 1997), and WPPSI-III (Wechsler, 2002). These tests contain different tasks for children and adults at different levels. Standard scores are computed according to small gradations in age. In ASDs, with the exception of the revised Autism Diagnostic Observation Schedule (ADOS; Lord, Rutter, DiLavore, & Risi, 1999; Lord, Risi, et al., 2000), such grading has not yet been attempted, and may not be feasible, given the incidence and variability of the disorders.

In addition, while cognitive tests use chronological age and population demographics to define what is “average,” identifying the “average” child with autism is much more complicated, particularly with small samples. Using large samples who have not been systematically assessed or recruited according to epidemiological standards may also lead to unrepresentative scores (see Ozonoff, South, & Miller, 2000). Eventually, pooled research samples that result in very large sample sizes and/or methods such as latent class analyses may be helpful in this endeavor (Mahoney et al., 1998). In the meantime, studies that explicitly compare distributions of different samples (e.g., Szatmari et al., 2002) provide important information about the consistency of diagnosis across populations.

In addition, as discussed in more detail later in this chapter, issues arise about how best to define comparison groups for autism in order to generate appropriate norms. Providing normative data based on chronological age, as is done for most well-known general intellectual assessments, is not sufficient, because ASDs are often, but not always, associated with mental handicap. Thus, differences obtained between mentally handicapped children with autism and chronological-age matched nonautistic children who are not mentally handicapped may be attributed to autism, mental handicap, or both. On the other hand, the generation of norms based on all combinations of chronological age and level of mental handicap is not feasible without very large samples, and sometimes not even then (e.g., identifying infants with mild idiopathic mental retardation may be impossible).

A further factor is language delay. Even when level of mental handicap is addressed through a research design, children with autism-related disorders often (with some notable exceptions) show more severe language delays than other children of equivalent nonverbal level. Any diagnostic instrument that relies heavily on behaviors associated with receptive or expressive language competence must take this into account (Lord, Storoschuk, Rutter, & Pickles, 1993). However, exactly how to do so becomes a complex decision (Happé, 1995; Hobson, 1991). Trying to control for language delay may also “control for” autism itself. It may result in comparisons that are invalid for other reasons (e.g., comparing 2-year-olds with autism to nonhandicapped 8-month-olds of equivalent receptive language skill).

In addition, the relationship between autism and language impairment is complicated by the fact that the expressive language of individuals with no or very little spontaneous speech may not show as many abnormalities as the language of more verbally fluent persons with autism. This relationship affects attempts to quantify severity in any additive way. Thus, in the Autism Behavior Checklist (ABC; Krug, Arick, & Almond, 1980b) and in the Autism Diagnostic Interview-Revised (ADI-R; Rutter, Le Couteur, & Lord, 2003), both described later, an abnormality score is computed by adding the number of ways in which a child or adult’s language is unusual (e.g., pronoun reversal, delayed echolalia, neologisms). This strategy results in individuals with more complex language scoring as more abnormal than individuals who cannot speak (Miranda-Linne, Fredrika, & Melin, 1997; Rutter et al., 2003). A recent factor analysis carried out on the ADI-R (Lord, Rutter, & Le Couteur, 1994; Tadevosyan-Leyfer et al., 2003) credited nonverbal children with maximum scores of severity on verbal items. This resulted in nonverbal children scoring as most severe on a hierarchy of language items, overlapping with children with the most sophisticated language and many abnormalities; not a result that is very meaningful
or interpretable. The ADI-R attempts to avoid this problem by having separate domain scores for verbal and nonverbal communication; however, this strategy is not ideal for researchers who need a single overall severity score.

In general, classification systems and diagnostic instruments for ASDs have been most accurate in addressing autism in somewhat verbal, mildly to moderately mentally handicapped school age children. Classification systems and diagnostic instruments decrease in interpretability the farther one moves from this group (Lord & Bailey, 2002; Lord et al., 1996). Unfortunately, diagnostic instruments are most needed for children and adults who do not fall within this most easily recognized prototype. As discussed later, it is important that consumers who use diagnostic instruments take into account the biases that an instrument shows for populations who fall outside the most commonly studied group of children with autism, such as children with nonautism ASDs, including Asperger’s Disorder and Pervasive Developmental Disorders, Not Otherwise Specified (PDD-NOS). The difficulties are less relevant for Rett Syndrome and Fragile X, because these disorders have biological markers; however, questions remain when children with these disorders meet standard diagnostic criteria for autism.

Issues in Selecting the Appropriate Focus and Level of Analysis

An alternative to organizing a diagnostic instrument around very specific behaviors is to develop measures of broadly defined deficits, such as impairments in social reciprocity or circumscribed interests that are relevant to the behaviors of individuals across a range of chronological ages and developmental levels. However, answering questions about these broad conceptualizations may be difficult for naive observers, such as nonexpert clinicians (Volkmar et al., 1994) or parents (Schopler & Reichler, 1972). This seems especially true in diagnoses of young children (see Charman et al., 1998; DiLavore, Lord, & Rutter, 1995; Lord et al., 1993), for whom it may be difficult to disentangle well-coordinated social behaviors produced as part of familiar, physical routines from spontaneous, socially motivated interactions. For example, in a study comparing parent report in a structured interview to direct observations, good agreement across the two methods for the occurrence of abnormalities emerged for only 3 of 16 items taken from DSM-III-R: abnormal social play, stereotyped body movements, and restricted range of interests (Stone & Lemanek, 1990). Differentiation for adults between deficits specific to autism and those associated with any severe, chronic psychiatric disorder that drastically limits social contact and everyday opportunities also becomes more difficult (Rutter, Mawhood, & Howlin, 1992; Volkmar et al., 1994).

Parent and child reports are not interchangeable. This issue is most relevant to high-functioning older children, adolescents, and adults with autism and ASDs who can be asked to describe their own symptoms and concerns. For certain behaviors, parent report may be more valid and reliable over time (e.g., reports of friendships, development of play; Lord et al., 1989); for others, either direct observation (such as of very young children with autism; Lord, Cook, Leventhal, & Amaral, 2000) or self-reports, such as for mood and interest in the opposite sex (Howlin, Mawhood, & Rutter, 2000; Mawhood, Howlin, & Rutter, 2000), may be more accurate indicators. In other areas of developmental psychopathology, with a few notable exceptions (e.g., self-reports of anxiety or depressive feelings), informant accounts have often been better discriminators than alternative methods (Bird, Gould, & Staghezza, 1992).

Using multiple sources may address some of these issues by helping to place diagnostic information in developmental and social contexts. For example, if a child appeared fascinated by pencils during an observation, a parent’s account of his fascination with stick-like materials at home would be important in evaluating whether this was a consistent focus or a brief interest. Information about a history of very limited social interaction beginning in early childhood can place reports of social isolation into context for an adult client. From the reverse perspective, observation of how a child responds when a parent is asked to call his name may be a helpful complement to a parent’s description of the child’s response to family members’ attempts to get his attention at home. Ideally, diagnostic instruments would
maximize use of direct observations and parents’ and teachers’ descriptions, while getting broader information directly from individuals with ASD without requiring them to draw inferences that they often do not have the knowledge to make (e.g., about the nature of autism and the applicability of that term to themselves). However, how to best combine information from multiple sources is not obvious (Kraemer, 1992; Offord et al., 1996). For example, one method of quantifying severity might be to consider information from different sources as separate repeated measures of a hypothetical construct, such as qualitative impairments in social interaction (Grinager, Cox, & Yairi, 1997).

Instruments also differ in the degree to which they emphasize the presence of observable abnormalities or the absence of normally developing features. Sometimes this distinction is arbitrary, as in descriptions of the use of gaze by children with autism as either “unusual eye contact” or “failure to use gaze to regulate social interaction in subtle ways.” The former describes the presence of an abnormality and the latter describes the absence of a prosocial behavior. In young children with autism, the absence of behaviors such as eye contact, smiling, and social responses, may be more specific and more predictive of outcome than abnormalities (Lord, 1995; Venter, Lord, & Schopler, 1992). It is also more highly correlated with chronological and cognitive age (Tadevosyan-Leyfer et al., 2003). For other diagnostic features, the presence of clear abnormalities and the absence of normal development may be strongly related, but the two perspectives may not necessarily be the same. For example, developmental and behavioral intervention studies would suggest that the presence of unusual preoccupations and restricted interests is associated with the absence of early social play. If a child is taught developmentally appropriate play skills, he will show fewer stereotyped behaviors (Schopler, 1976); however, he may still have restricted interests. To our knowledge, this assumption has not been directly tested outside of evaluations of specific interventions.

Even though the two approaches (computing the presence of abnormalities and determining the number of absences of prosocial features) are clearly related, they have somewhat different implications for diagnostic instruments. Social-communicative features of autism tend to be described in terms of absences, while oddities in interests and behavior, as well as a few specific characteristics of language (e.g., stereotypic speech) tend to be described in terms of the presence of abnormalities. When they occur, odd behaviors, such as hand and finger mannerisms or repeated smelling of objects, may be more striking and obviously abnormal than the lack of typical development in a particular area. However, such obviously abnormal behaviors, even if a child or adult engages in them frequently at home or school, may not always occur during a relatively brief observation. For example, in one study, only 60% of verbal, mildly mentally handicapped adolescents with autism and 35% of very high-functioning, verbal adolescents with autism exhibited clearly observable repetitive behaviors during a half-hour structured observation, though all of these individuals were described by their parents as engaging in such behaviors at home on a regular basis (Lord et al., 1989). None of the language and chronological-age matched mentally handicapped and normally developing adolescents exhibited these behaviors during the observation. The presence of these behaviors during an observation was important diagnostically, but the absence during that one observation was not interpretable. As already noted, there is reason to believe that such abnormalities may be less directly related to clinical outcome than are social impairments and more broadly based aspects of communication (Cox et al., 1999; Venter et al., 1992). Nevertheless, brief descriptions of clearly abnormal behaviors, particularly sensory reactions to environmental stimuli, are more amenable to checklists and screening measures (Krug et al., 1980b; Rimland, 1971) than longer-winded descriptions of subtle differences in nonverbal social behaviors, though the abnormal behaviors may be less indicative of outcome and of diagnoses made by experienced clinicians than other measures.

It is important to remember that, in a diagnosis, the diagnosticians tend to find what they look for or ask about. That is, the content and the nature of the behaviors that are observed
(or described) and the content and the nature of the ways in which they are reduced or “coded” affect the end product of diagnosis. Scales that employ linear approaches to scores (e.g., using a single total) with a single cut-off more easily quantify examples of dysfunction, but also are more likely affected by factors outside autism, most notably co-occurring mental retardation, than are instruments that require thresholds in different areas. Scales that require meeting of multiple thresholds are tied to specific classification systems and the theories that underlie them (e.g., DSM-IV and ICD-10). Thus, they may underestimate cases because of requirements for distribution of scores or because the system is not quite correct (Cox et al., 1999; Hepburn, John, Lord, & Rogers, 2003; Lord, 1995; Pilowsky, Yirmiya, Shulman, & Dover, 1998).

For example, one study showed that both the Childhood Autism Rating Scale (CARS; Schopler, Reichler, & Renner, 1988) and the Autism Diagnostic Interview-Revised (Lord et al., 1994) were concordant with clinicians’ judgments in diagnosing autism in children at age 3 (Lord, 1995). Both were less accurate for children 2 years or younger, but for somewhat different reasons. The CARS consistently overdiagnosed nonautistic mentally handicapped children as having autism at age 2; CARS diagnoses of these children became more accurate by age 3, but were still less specific than has typically been reported for older children. The ADI-R was more accurate than the CARS with the nonautistic children at 2, but like the CARS it was over-inclusive for mentally handicapped and/or language delayed children. The ADI-R also failed to diagnose autism in about 10% of 2-year-olds who later met formal diagnostic criteria for the disorder because their parents did not report sufficient abnormal repetitive behaviors or abnormalities in language. Agreement between the ADI-R and CARS was in fact quite high; the difference was whether a simple total or thresholds across several domains (i.e., social reciprocity, communication, restricted, repetitive behaviors) were required for a diagnosis. Similar results were found in another study comparing the ADI-R and CARS with older children (Pilowsky et al., 1998). The ADI-R resulted in good specificity, but poor sensitivity at detecting childhood autism at 20 months of age (Cox et al., 1999). Furthermore, the ADI-R was not sensitive to other Pervasive Developmental Disorders, such as Asperger’s Disorder and PDD-NOS, when used with 20-month-old toddlers (Cox et al., 1999). Stone found that clinical diagnoses at age two identified children with later stable diagnoses of autism but not of PDD-NOS (Stone, Ousley, Yoder, Hogan, & Hepburn, 1997; Stone et al., 1999). As we discuss later, decisions about which approach is most appropriate may differ depending on the needs of the clinician or researcher and the developmental level of the child or adult who is assessed.

Implications of Information from Other Areas of Research for Diagnostic Instruments

Without a well-established biological marker, decisions about classification of autism and ASDs have often been based on the need to identify appropriate populations for services and research, rather than empirical bases (American Psychiatric Association, 1994; Volkmar et al., 1994; Wing & Gould, 1979). Though neurobiological factors may eventually result in a re-sorting of diagnoses in autism/PDD, biological heterogeneity is expected within and among the spectrum disorders. Thus, we will be dependent on descriptions of social and other behaviors for some time. Yet, the behavioral boundaries between autism and other disorders in the spectrum, such as PDD-NOS and Asperger’s Disorder, are not clearly defined, particularly when changes with development are taken into account (see Ghaziuddin, Tsai, & Ghaziuddin, 1992). Information, such as developmental trajectories and clustering of symptoms, that arises out of studies of diagnostic instruments may influence classification systems in the near future (Lord et al., 1996; Mahoney et al., 1998; Szatmari, Archer, Fisman, Streiner, & Wilson, 1995). The expectation is that diagnostic instruments may and should continue to change as more information is acquired.

Furthermore, priorities for the results of diagnoses may be different for clinical and research purposes. Clinical diagnoses offer families access to general information about their children, and are often the entry point to services. Service providers may use a diagno-
sis to allocate limited resources, whereas a priority for families and diagnosticians is to ensure that children or adults are not being excluded from appropriate services because of a particular label or classification (Wing & Attwood, 1987).

Researchers often prefer narrow diagnoses. Narrower formulations provide better cross-site reliability, eliminate outliers, and reduce overlap with control groups. Narrower diagnostic categories may reduce the likelihood of false positives. On the other hand, researchers seek populations of particular sizes and are interested in maximizing the number of participants who meet their criteria. All of these forces affect the goals addressed by diagnostic instruments and the ways in which they are used.

There is an urgent need for instruments to address diagnoses beyond autism, particularly ASDs, such as PDD-NOS and Asperger’s Disorder. In part, the absence of replicable, reliable, and valid instruments in this area is related to the absence of clear diagnostic criteria for these disorders (Sponheim, 1996; Szatmari et al., 2002). A lack of empirical data affects the ability to discriminate these disorders both from autism and from disorders outside the autism spectrum (e.g., severe attention deficit; severe communication impairment), which in turn affects the development and the operationalization of these criteria.

There are numerous sets of diagnostic criteria for ASDs, especially for Asperger’s Disorder, that suggest conceptualizations for them (Volkmar & Klin, 2001; Szatmari, 2000; Tantam, 2000), but that do not directly address the overlap with autism. In contrast, DSM-IV and ICD-10 criteria define Asperger’s Disorder purely in terms of its relationship with autism, but provide little conceptualization (DSM-IV-TR). Moreover, conceptualizations exist for disorders such as schizoid disorder and nonverbal learning disabilities, but without clear indications of their relationship with autism. As they are for Asperger’s Disorder, DSM-IV criteria for PDD-NOS and ICD-10 criteria for atypical autism are based solely on just missing autism criteria.

Developing standardized assessment instruments for Asperger’s Disorder is particularly tricky because there is little consensus in how to define the disorder, little consistency in the manner in which current criteria in the DSM-IV and the ICD-10 should be applied, and little agreement as to whether the diagnosis is distinct from autism or is a subtype of autism (Klin, Pauls, Schultz, & Volkmar, in press). Several different definitions for Asperger’s Disorder are currently used to make the diagnosis, including Gillberg’s criteria, Szatmari’s criteria, Tantam’s criteria, and the criteria listed in DSM-IV and ICD-10 (Leekam, Libby, Wing, Gould, & Gillberg, 2000). Some authors have reported that it is difficult, if not impossible, to diagnose Asperger’s Disorder given the current diagnostic criteria, which require that autism is excluded prior to making a diagnosis of Asperger’s Disorder (Miller & Ozonoff, 1997; Szatmari et al., 1995).

There are also different opinions of what criteria to use to determine if Asperger’s Disorder should be considered distinct from autism or not. Klin and his colleagues broach the need for a greater body of research on the validation of the syndrome (Klin, Sparrow, Marans, Carter, & Volkmar, 2000). Szatmari et al. (1995), on the other hand, argue that the decision should be based on clinical usefulness, taking into account course, response to treatment, and prognosis. Szatmari et al. indicate that the first priority should be to determine if there is a meaningful distinction between Asperger’s Disorder and autism and a second priority should be to determine if there is a distinction between Asperger’s Disorder and other related but also not well-defined groups (such as nonverbal learning disability). One of the difficulties is that different criteria for Asperger’s Disorder change not only the individuals who receive that diagnosis, but also who is then diagnosed with PDD-NOS and autism (Klin et al., in press).

At this point, there are very few instruments with extensive reliability and validity studies available to aid in the diagnosis of Asperger’s Disorder. The few that are available are described in this chapter. Most researchers continue to modify instruments designed for diagnostic purposes for autism in research studies on Asperger’s Disorder, particularly as there is still controversy as to whether it is a distinct syndrome.

Two Asperger’s Disorder algorithms were developed for the DISCO (Leekam et al.,
2000); one of the algorithms was based on Gillberg’s criteria and the other on ICD-10. In the study by Leekam et al. (2000), 91 (45%) of the subjects met criteria for Asperger’s Disorder using the algorithm based on Gillberg’s criteria, while only 3 (1%) met criteria based on the ICD-10. In most cases, this was at least partially due to the ICD-10 criteria requiring normal language development prior to 3 years and age-appropriate self-help or adaptive skills, or curiosity. This supports the difficulty in making the diagnosis based on current criteria as they are set forth in the diagnostic manuals. The authors of this article admit that they may have interpreted the ICD-10 criteria more strictly than was intended. Also of importance was the finding that all 91 of the children who met algorithm cut-offs for Asperger’s Disorder based on Gillberg’s criteria also met ICD-10 criteria for autism or atypical autism, again highlighting the issue of the overlap between Asperger’s Disorder and autism. Similar findings have been reported by numerous other researchers, including Ozonoff et al. (2000), Szatmari et al. (1995), and Klin, Volkmar, Sparrow, Cicchetti, and Rourke (1995).

Other instruments have tended to yield the same results. This is truly an unfortunate cycle: without reliable diagnostic criteria and measures, empirical findings are very difficult to interpret (Klin et al., in press; Sponheim, 1996). Without empirical data about the course and characteristics of nonautism pervasive developmental disorders, attempts to differentiate between these disorders and autism will not be effective. Data from genetic and family studies, as well as other neurobiological approaches, may make this task easier, but the results are also affected by instrumentation. Thus, researchers must arrive at working agreements that allow them to proceed in a reliable fashion.

In the face of these difficulties, autism as a field has the strength of its intense research history and the benefit of research teams from around the world investigating similar questions. Descriptive and experimental research have offered solutions to some of these difficulties, such as identifying developmentally meaningful behaviors (joint attention, theory of mind, response to name) that discriminate autism from other disorders at various points in development. It offers the promise of other knowledge, from new statistical techniques to neuroimaging to molecular genetics. As perspectives on autism have shifted with new theories and empirical findings, strategies and content of instruments used for its diagnosis have also shifted in numerous ways. However, in the newer instruments, roots can almost always be traced to strategies begun in earlier work. Science offers clinicians the opportunity to learn from accumulated knowledge and empirical testing of hypotheses.

Psychometric Issues

The American Psychological Association (APA) has issued guidelines for the development of psychometric instruments in the United States. A number of factors affect the psychometric appropriateness of an instrument. These issues are raised as they apply to the question of diagnostic instruments for the autism spectrum in general, followed by more specific discussions of selected instruments. Selected standards from these guidelines are presented in Table 28.1 (reliability) and Table 28.2 (validity). Many diagnostic instruments in autism/PDD, as noted, have addressed some of these issues, but few or none have addressed all of them. In part, this lack of information is understandable because of difficulties in achieving sufficiently large well-documented samples; in part, it reflects the limited history of instrument development in autism.

Reliability

Reliability, which is the degree to which a score or decision is free from errors of measurement, requires assessment in a number of forms, including across raters, across time, and within an instrument. Often the term reliability is used to describe these separate aspects of the stability of the results of an instrument as if they were interchangeable. However, this is not the case. For example, the degree to which different raters concur when using the same instrument cannot be determined by measuring the internal characteristics of a test. The internal consistency (i.e., the degree to which different items on a scale measure the same concept) of an instrument can be quite high, even though its inter-rater

TABLE 28.1 Reliability and Errors of Measurement: Issues Related to Diagnosis of Autistic
Spectrum Disorders
1. For each total score, subscore, or combination of scores that is reported, estimates of relevant reliabilities and
standard errors of measurement should be provided in adequate detail to enable the test user to judge whether
scores are sufficiently accurate for the intended use of the test.
2. The procedures that are used to obtain samples of individuals, groups or observations for the purpose of
estimating reliabilities and standard errors of measurement, as well as the nature of the populations involved,
should be described.
3. The conditions under which the reliability estimate was obtained and the situations to which it may be
applicable should be explained clearly.
4. Coefficients based on internal analysis should not be interpreted as substitutes for alternate-form reliability
or estimates of stability over time unless other evidence supports that interpretation in a particular context.
5. Where judgmental processes enter into the scoring of a test, evidence on the degree of agreement between
independent scorings should be provided.
6. Where cut scores are specified for selection or classification, the standard errors of measurement should be
reported for score levels at or near the cut score. For dichotomous decisions, estimates should be provided of
the percentage of test takers who are classified in the same way on two occasions or on alternate forms of
the test.
Selected and adapted from Standards for Educational and Psychological Testing, by AERA, APA, NCME, 1985,
Washington, DC: American Psychological Association.
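
Items 1 and 6 of Table 28.1 can be made concrete with a small calculation. The sketch below assumes the classical test theory formula for the standard error of measurement (standard deviation times the square root of one minus the reliability) and roughly normal measurement error. The function names, the cut score, and every number are hypothetical and are not taken from any instrument discussed in this chapter.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def score_band(score, sem, z=1.96):
    """Approximate 95% band around an observed score (normal errors assumed)."""
    return score - z * sem, score + z * sem


def classification_consistency(scores_1, scores_2, cut):
    """Proportion of individuals classified the same way on two occasions."""
    same = sum((a >= cut) == (b >= cut) for a, b in zip(scores_1, scores_2))
    return same / len(scores_1)


# Hypothetical values, for illustration only (not taken from any instrument).
CUT = 30
sem = standard_error_of_measurement(sd=6.0, reliability=0.91)
low, high = score_band(CUT, sem)

# Made-up total scores for the same eight children on two occasions.
time_1 = [12, 28, 31, 35, 29, 40, 18, 30]
time_2 = [14, 31, 29, 36, 27, 41, 17, 32]
consistency = classification_consistency(time_1, time_2, CUT)

print(f"SEM = {sem:.2f}; band around the cut score: {low:.1f} to {high:.1f}")
print(f"Classified the same way on both occasions: {consistency:.0%}")
```

With these made-up numbers, a reliability of .91 still leaves a band of several points around the cut score, and two of the eight children change category between occasions even though no score moves by more than three points.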

reliability is low. In a disorder such as autism that is defined by a pattern of difficulties across several areas (i.e., communication, social interaction, behavior), internal consistency in a scale is a worthwhile endeavor, but does not have the same meaning as in a scale that is not designed to describe a pattern of related, but different, deficits.

In the past, reliability estimates were often reported as correlations. A correlation measures whether the rankings of different individuals are similar across different raters. The difficulty with correlations is that the absolute scores of raters can be quite different, resulting in different diagnoses, even though they are highly correlated. That is, if one rater rated all participants relatively high and another rater rated the same participants relatively low and the raters had the same rankings of participants, the correlation of the two raters’ scores would be high. If diagnosis is based on exceeding a certain threshold, the fact that the rankings of the raters agreed would not prevent the scores from resulting in different diagnoses for

TABLE 28.2 Validity: Issues Related to Diagnosis of Autistic Spectrum Disorders


1. Evidence of validity should be presented for the major types of inferences for which the use of a test is
recommended.
2. If validity for some common interpretation has not been investigated, that fact should be made clear, and
potential users should be cautioned about making such interpretations.
3. The composition of the validation sample should be described in as much detail as is practicable.
4. When criteria are composed of rater judgments, the relevant training, experience, and qualifications of the
experts should be described.
5. When a test is proposed as a measure of a construct, that construct should be distinguished from other
constructs. Evidence should be presented to show that a test does not depend heavily on extraneous
constructs. If evidence indicates that a criterion measure is affected to a substantial degree by irrelevant
factors, this evidence should be reported.
6. When criteria are composed of rater judgments, the degree of knowledge that raters have concerning ratee
performance should be reported. The training and experience of the raters should be described.
7. If specific cut scores are recommended for decision making (for example, in differential diagnosis), the user’s
guide should caution that the rates of misclassification will vary depending on the percentage of individuals
tested who actually belong in each category.
Selected and adapted from Standards for Educational and Psychological Testing, by AERA, APA, NCME, 1985,
Washington, DC: American Psychological Association.
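
The caution in item 7 of Table 28.2 follows directly from Bayes' rule. As a minimal sketch, assume a cut score with fixed sensitivity and specificity and vary only the proportion of tested individuals who actually have the disorder. The function name and all values below are hypothetical and do not describe any instrument in this chapter.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (PPV, NPV, overall misclassification rate) for a cut score.

    base_rate is the proportion of tested individuals who truly belong
    to the diagnostic category.
    """
    true_pos = sensitivity * base_rate
    false_neg = (1.0 - sensitivity) * base_rate
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    true_neg = specificity * (1.0 - base_rate)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv, false_pos + false_neg


# Hypothetical accuracy figures, chosen only to illustrate the effect.
SENSITIVITY, SPECIFICITY = 0.90, 0.80

for base_rate in (0.50, 0.10, 0.01):
    ppv, npv, error = predictive_values(SENSITIVITY, SPECIFICITY, base_rate)
    print(f"base rate {base_rate:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}, "
          f"misclassified {error:.2f}")
```

Under these assumptions the same cut score that is usually right in a specialty clinic, where half of those tested have the disorder, yields mostly false positives when only a small fraction of those tested do.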

the same client. Thus, while correlations provide an important index of the relationship among scores, they are not sufficient to show agreement when cut-off scores are used to make categorical judgments about diagnoses.

In place of correlations, many investigators now employ measures of percent of agreement between pairs or larger groupings of raters. Agreement must be defined at a level commensurate with the aims of the instrument. It may be exact agreement or agreement within a certain number of points, depending on how scores are to be used. Clinicians and researchers can then evaluate the frequency with which their coding agrees with that of another person for a given individual. There are no set standards for levels of agreement, but generally, in self-report and interview studies, researchers have been able to achieve 90% or greater agreement on individual categorical measures and at least 80% on individual observational codes, with greater agreement for pooled or summary scores.

Item-level inter-rater agreement is very important when an instrument is being developed because it allows for experimentation with which items yield the most valid scores. Many of the most well-known assessment instruments (e.g., the Wechsler tests, the Vineland Adaptive Behavior Scales) do not have this level of inter-rater reliability because they rely on total or domain scores and because the internal consistency of these domains or the total is well documented. In the field of ASD, because of gradually changing conceptualizations, recent instruments have actually aimed for the establishment of more specific reliability among raters in order to retain the flexibility to rework scoring systems as different diagnostic frameworks emerge.

The difficulty with using percent agreement as a metric is the role of chance. If there is a high frequency of extreme scores without much variation within different populations (e.g., almost all zeroes for nonautism or high scores for autism), correlations and percent agreement among raters can be quite high because of the likelihood of agreement based on using the extreme scores, without attention to individual differences. That is, having seen a child’s performance on the first item of the test, a rater might predict that, because the child looked quite autistic on the first item, he will receive high scores on all further measures of abnormality. Having seen a typically developing child’s behavior on the same first item, a rater might predict, based on the child’s “normal” reaction to the first task, that she will receive “normal” scores on other items. If there is little variation across tasks and little overlap across populations, two raters might get better agreement using this strategy than by actually observing and coding the behaviors of the individual children. Specific statistics, called kappas (Cicchetti & Sparrow, 1981), allow some control of this phenomenon. However, no simple answer addresses all of these problems. Although kappas control for chance, they are sensitive to distributions and so, as with any statistic, must be interpreted in light of other information. Another strategy using reliability coefficients does not address the intersection between individual participants and individual raters, but allows quantification of the effects of each separately (Mundy, Sigman, Ungerer, & Sherman, 1986). This statistic tests whether scores are more affected by individual differences in children than by differences among raters. However, if there are large individual differences among children, finding that these differences exceed those among raters may not guarantee strong reliability.

These issues illustrate the importance of the nature of the samples on which psychometric analyses are conducted. Autism affects individuals across the lifespan who have a range of language and cognitive skills. If samples are not well matched and not relevant to the clinical or research contexts in which the instrument will be used, there will generally be little overlap in scores (e.g., if children with autism are compared to typical children). If instruments are developed only using very easily discriminable populations, documentation of reliable ratings will be difficult to achieve when statistics that take distributions into account are employed, although they may look good in terms of absolute agreement. When reliability estimates are presented only for totals, even when subscales are described and intended to be used, clinicians or researchers who want to base interpretations on specific items or subscales cannot do so. It is important that test users interpret their results within the context of the information that is available.
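
The contrast among correlations, percent agreement, and kappas discussed in this section can be illustrated numerically. The sketch below uses only the Python standard library and the usual two-rater form of Cohen's kappa, (observed agreement minus chance agreement) divided by (1 minus chance agreement). The rater scores, the cut score of 30, and the item codes are invented for illustration and come from no real instrument.

```python
import math
from collections import Counter


def pearson_r(x, y):
    """Pearson correlation between two raters' scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def percent_agreement(a, b):
    """Proportion of cases on which two raters give the same code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a, b):
    """Two-rater kappa: agreement corrected for chance agreement."""
    n = len(a)
    observed = percent_agreement(a, b)
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(a) | set(b)) / n ** 2
    if expected == 1.0:
        return float("nan")  # undefined when both raters use a single code
    return (observed - expected) / (1.0 - expected)


# Case 1 (hypothetical): rater B scores every child 4 points above rater A.
CUT = 30
rater_a = [22, 25, 27, 28, 31, 34]
rater_b = [score + 4 for score in rater_a]
dx_a = [score >= CUT for score in rater_a]
dx_b = [score >= CUT for score in rater_b]
print(f"r = {pearson_r(rater_a, rater_b):.2f}, "
      f"diagnostic agreement = {percent_agreement(dx_a, dx_b):.2f}, "
      f"kappa = {cohens_kappa(dx_a, dx_b):.2f}")

# Case 2 (hypothetical): an item coded 0 for nearly every child.
codes_a = [0] * 18 + [1, 0]
codes_b = [0] * 18 + [0, 1]
print(f"agreement = {percent_agreement(codes_a, codes_b):.2f}, "
      f"kappa = {cohens_kappa(codes_a, codes_b):.2f}")
```

In the first case the rankings are identical, so the correlation is perfect, yet the two raters disagree on the diagnosis of two of the six children. In the second case the raters agree on 90% of codes but never agree on a nonzero code, and kappa falls slightly below zero, which also shows how strongly kappa depends on the distribution of scores.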

Sometimes the reverse is the case. Researchers may present detailed psychometric data for items, but not present reliability for the diagnostic categorization for which the scale is intended. This is particularly problematic for ASDs. It is not difficult to find an instrument that identifies more abnormal behaviors occurring in ASD than in typical development. However, seldom is this the goal of an instrument. To be useful diagnostically, instruments must discriminate children with autism or ASD from nonautistic severely mentally handicapped, language-impaired children. Because it is often difficult to set a threshold that includes children with mild autism identified as such and excludes nonautistic severely mentally handicapped children, consistency across raters and across time with which an individual falls in or out of the category of autism or ASD must be measured directly.

The issue of test-retest reliability in autism is complex. Changes in behavior due to development would be expected if administrations were separated by substantial amounts of time. Some learning may occur within the testing situation that affects a child’s behavior if he or she is asked to carry out the same actions again. This is different from error in measurement, but still must be taken into account. In some cases, previous administration of an instrument (i.e., practice) may affect its scores or interpretation. For example, in the Autism Diagnostic Observation Schedule (ADOS, Module 1; Lord, Risi, et al., 2000), young children are taught a routine of bringing a balloon to the examiner if they do not do so spontaneously. If they are presented with the same task several weeks later, they may respond differently because of learning, not because of error in measurement. However, the examiner still needs to code the behavior he or she sees. Ideally, information about stability and expected changes across multiple administrations should be available for all instruments. For diagnostic instruments, this information must be presented at the level of each individual’s score and resulting diagnosis. Just because a task or instrument has been used in many studies, it cannot be assumed that it is reliable on an individual level at a standard appropriate for diagnostic work. Many experimental studies in psychology and psychiatry are primarily concerned with identifying group differences and so do not address issues at an individual level in much detail, if at all. For example, two studies reported substantial intra-individual variability across tasks and time in standard tasks used to assess theory of mind (see Chapters 41 and 42 in this book for a discussion of this concept) in autism (Holroyd & Baron-Cohen, 1993; Mayes & Zigler, 1992). While group effects on false belief tasks have had a major impact on the conceptualization of social-cognitive deficits in autism, and have been replicated across studies internationally, in neither of the recent studies were the results of the tasks sufficiently replicable within individuals to meet reasonable clinical standards for classification.

An important aspect of reliability is specification of exactly how and under what circumstances diagnostic instruments are to be used and how they are to be scored. Sometimes procedures reported in journal articles are described so briefly that it is difficult to determine what exactly was done and who did it. Differences in procedures, such as whether coding is carried out live or from videotape, whether interviews are done face to face or on the telephone, or how experienced in autism the raters are, may result in differences in scoring (Sanchez et al., 1995; Volkmar et al., 1994). It is helpful for users of instruments to know how, as “consumers,” they might improve and evaluate their own reliability with an instrument.

In studies of reliability and validity, raters should be unaware of children’s diagnostic categories or of scores on other diagnostic instruments, unless this information would typically be available prior to use of the instrument. If other information is assumed to be a critical part of the use of the instrument, this needs to be stated clearly as part of the procedures. For example, for the ADOS, general information about a participant’s likely level of expressive language is crucial in selecting the appropriate module and so is considered part of the assessment. How this information is used is specified in the manual. In addition, description of the training required for a rater and the circumstances of the training and the administration are critical aspects of reliability.

Another factor to be considered in autism and ASDs is parents’ awareness of their

child’s diagnosis. That is, in many research samples, parents of previously diagnosed autistic children are well versed in the characteristics of autism and how their children fit into the diagnostic scheme. Several recent studies have shown excellent agreement between questionnaires (i.e., the Social Communication Questionnaire [SCQ]; Rutter, Le Couteur, & Lord, 2003) and interview formats of similar items (i.e., the SCQ and ADI-R; Bishop & Norbury, 2002; Chakrabarti & Fombonne, 2001; Le Couteur, Lord, & Rutter, 2003; Lord et al., 1994; Vrancic et al., 2002 [Spanish SCQ by telephone]). However, if a parent report instrument is intended to be used in initial diagnoses, then it is appropriate that it is shown to be reliable and valid with caregivers who have not yet received formal diagnoses.

Validity

Validity is the most important aspect of a diagnostic instrument. Validation refers to the degree to which other evidence supports inferences drawn from the scores yielded by the diagnostic instrument. Thus, how validity is best measured is inherently related to the uses for which the instrument is intended.

Validity is often grouped into categories of content, construct, and criterion-related evidence. For the diagnosis of ASDs, questions of construct validity are related to those that underlie the diagnostic framework on which the instrument is based. For example, the ADI-R uses a concept of social reciprocity derived from theories of autism (see Lord & Bailey, 2002). It is operationalized in terms of specific questions to parents and caregivers about behaviors such as joint attention, shared enjoyment, comforting, and friendship. Data from studies of the ADI and ADI-R (Le Couteur et al., 1989; Lord et al., 1996) contributed to the understanding of this construct during preparation of DSM-IV and ICD-10 criteria, along with results of observational studies and field trials, in showing that traditional measures of attachment were not strongly related to other measures of social reciprocity (Lord et al., 1993; Sigman & Ungerer, 1984; Volkmar et al., 1994). A further study suggested that parental reports on the ADI of autistic children’s responses to separation and reunion (which were intended to be linked theoretically to conceptualizations of attachment) were more highly correlated with their children’s communicative competence than the same children’s observed responses to separation and reunion in a standardized setting (e.g., during administration of the Pre-Linguistic Autism Observation Schedule or PL-ADOS: DiLavore et al., 1995; Spencer, 1993).

Internal consistency for items within a diagnostic instrument can be used to support the assertion that a test measures a single construct. In ASD, this has meant support for the differentiation of ASDs from other developmental disorders or support for the three domains (social reciprocity; communication; restricted, repetitive behaviors) that define the syndrome. Measures of internal consistency for the most commonly used instruments in the diagnosis of autism (e.g., the ADI-R, the Autism Behavior Checklist or ABC, the ADOS, the Childhood Autism Rating Scale or CARS) have generally been high.

Content validity has to do with the degree to which a sample of items, tasks, or questions in an instrument is representative of a defined domain. In most cases, this domain is autism, either narrowly or broadly defined (i.e., ASD, PDD). For the purposes of this review, content validity is most often defined as the degree to which different instruments represent the diagnostic criteria for ASDs. Many of the instruments reviewed here predated the release of DSM-IV and ICD-10 criteria for autism and so do not correspond to the three-domain approach specified in these diagnostic systems. The exceptions are the ADI-R and ADOS. These are special cases because interpretation of results from the original versions of these instruments, the ADI and ADOS/PL-ADOS, influenced strategies tested in the field trials and the ICD-10 revisions.

Concurrent aspects of criterion-related validity of instruments have been most commonly addressed in the broad area of ASD by investigating convergence with the diagnostic categorizations yielded by another diagnostic instrument or with clinical judgment. As shown

in Table 28.3, convergent validity for three of the most common diagnostic instruments (ADI, ADOS, CARS) available in English has been quite good. Convergence between the CARS and several other instruments (e.g., the Autism Behavior Checklist; Krug et al., 1993; the Real-Life Rating Scale or RLRS; Freeman, Ritvo, Yokota, & Ritvo, 1986) has been good. Also, as depicted in Table 28.3, all of the diagnostic instruments have been shown to be adequate in identifying clinically diagnosed children with autism, with relatively rare false negatives within a “prototypical” group of mildly to moderately mentally handicapped school-age children with autism. There is more variability when instruments are used with younger (Lord, 1995; Lord et al., 1993) and older (CARS; Garfin, McCallon, & Cox, 1988; Piven, Harper, Palmers, & Arndt, 1996) populations, and with higher (Yirmiya, Sigman, & Freeman, 1994) and lower functioning groups (Fombonne, 1992; Lord et al., 1993). This pattern is not unique to the instruments, but reflects general difficulty in application of standard diagnostic criteria to various developmental levels. More detailed information about this issue is provided next with descriptions of particular instruments.

An even more serious, though less widespread, issue is that of false positives. Instruments differ considerably in the number of studies that include comparison groups. They also differ in the degree to which the comparison groups represent typical populations for whom a diagnosis of autism or ASD might be considered and rejected. Often studies include a comparison group of nonautistic mentally handicapped or language-impaired subjects, without sufficient information to determine the degree to which these subjects were comparable in ways other than the characteristics of autism to the autistic individuals. Autism is associated with particularly severe communication difficulties; and it is well established that the triad of deficits that define autism increases in frequency as level of mental retardation increases (Wing & Gould, 1979). Consequently, there is reason to be concerned that, without deliberate stratification, most comparison groups of nonautistic individuals will have markedly lower communication skills, adaptive abilities, and perhaps even general intellectual skills than autistic participants. Thus, comparisons of such samples, even though they may be representative of the population at hand, could yield differences interpreted as specific to autism that may be more accurately linked to severity of mental handicap or communication impairment (Lord et al., 1993). This is another reason why data concerning the size, the characteristics, and the ascertainment of samples are especially important in evaluating instruments. In addition, more sophisticated statistical techniques, such as latent class analyses and logistic regression, may allow researchers to take into account both positive and negative predictive values within a single metric (though still dependent on adequate samples on which to make comparisons).

Little information concerning predictive validity of diagnostic instruments in autism exists except for a few studies using the ADI-R. Our own follow-up study of 2-year-olds who were referred to a pediatric clinic for an evaluation of possible autism showed that both the ADI-R and the CARS tended to over-diagnose autism in mentally handicapped children at age 2. This was much less the case by age 3, and was less true for the ADI-R (in part, because of the requirement for a “triad” of deficits) than the CARS. On the other hand, Cox et al. (1999) found ADI-R diagnoses, when the threshold in repetitive behavior was not required, to be quite stable from 18 months to 3 years, for a select, higher-functioning group of children identified as having autism with a screening instrument called the CHAT (Baron-Cohen et al., 1992).

A follow-up study from early school age showed that retrospective ADI scores describing behavior at 4 to 5 years of age significantly predicted academic achievement and adaptive scores in adolescence and young adulthood in a group of mildly mentally handicapped to nonretarded autistic individuals (Venter et al., 1992). Social and communication deviance at age 5 made independent contributions, in addition to various measures of expressive and receptive language and nonverbal IQ, to current adaptive skill; whereas the severity of restricted and repetitive behaviors added to the predictive value of verbal and nonverbal predictors of academic achievement.
TABLE 28.3 Currently Available Diagnostic Instruments in Autism
Columns, in order: Instrument; Reliability (Interrater, Test-Retest, Internal Consistency); Validity (Construct/Content, Convergent, Discriminant Matched Sample); General Information (Published Guidelines for Diagnostic Decision, Subscales, Most Appropriate For, Level of Expertise)

Rimland’s E-2 form Unpublished Unpublished Unpublished Kanner (1943) — Poor — — Screening Parent check-
(E-2) list
Behavior Rating Instru- S: good — S: variable — — Limited Yes 8 Current observa- Requires
ment for Autistic and tion training
Atypical Children
(BRIAAC)

Real-Life Rating Scale T: moderate — T: good ASA CARS AUT/ MR / TYP — 5 Screening Minimal
(RLRS) I: marginal S: poor ABC
Social Responsiveness T: high T: high T: high DSM-IV — TYP/ PSY/AUT/ No 3 Symptoms —
Scale (SRS) PDD/AS Severity
Response to
treatment
Pervasive Developmental T: high T: high T: good–high DSM-III-R ABC AUT/AD/ LDD/ No 3 Preliminary Minimal
Disorders Rating Scale S: high S: high S: high MR / PDD-NOS/ stages
(PDDRS) William’s
Syndrome
Children’s Social S: variable T: high S: variable DSM-IV CBCL PDD/ADHD/ No 5 Current symp- Minimal
Behavior Questionnaire S: variable ABC TYP/ PSY/ AUT toms
(CSBQ)
Childhood Autism Rat- T: high — T: high DSM-III-R ABC AUT/ MR Yes 4 Targeted Moderate/
ing Scale (CARS) RLRS screening a video available
ADI
Autism Behavior T: variable — T: good — CARS — Yes 2/5 Measuring mal- Minimal
Checklist (ABC) S: poor RLRS adaptive behavior

Behavioral Summarized T: high — T: Adequate — Rimland E2 AUT/ MR / MP — — Symptoms for Requires


Evaluation-Revised I: good research training
(BSE-R)
Gilliam Autism Rating T: high T: high T: high ASA; DSM-IV ABC AUT/ MR / ED/ LD Yes 4 Needs further Parent check-
Scale (GARS) S: high S: high S: high ADPR (not matched) evaluation list
Autism Diagnostic S: high T: good T: unpublished DSM-IV CARS AUT/ MR Yes 3 Diagnostic clinics/ Experience,
Interview-Revised I: high S: high ICD-10 ADOS research across video, or re-
(ADI-R) developmental quires training
level
Diagnostic Interview I: variable — — ICD-10; Wing — AUT/ LD/ LANG No 4 Educational Requires
for Social and Commu- & Gould DIS planning training
nication Disorders (1979); Gill-
(DISCO) berg, Gillberg,
et al. (2001);
DSM-IV, 1994
Autism Diagnostic I: good T: Adequate S: high DSM-IV ADI-R MR / LANG Yes 3 Research and Experience,
Observation Schedule T: high ICD-10 CCC DIS/ PSY clinical diagnosis video, or re-
(ADOS) quires training
Psychoeducational Pro- T: good — S: high DSM-III-R CARS TYP No 4 Intervention rec- Experience,
file-Revised (PEP-R) ommendations video available
Adolescent and Adult S: variable — — — — — No 6 Intervention rec- Experience,
Psychoeducational Pro- T: high ommendations
file (AAPEP)
Communication and Sym- T: high T: high — — — TYP/ MR /ASD No Screening Minimal, video
bolic Behavior Scales S: high available
(CSBS Behavior Sample)
Children’s Communica- T: high — T: high — — AUT/AD/ PDD- — 5 Identifying prag- Checklist
tion Checklist (CCC) NOS/ADHD/ LD/ matic difficulties
LANG DIS
Asperger Syndrome I: high I: high — Gillberg, Gill- — PSY/ TYP No 6 Still in prelimi- —
(and high-functioning berg, et al. nary stages
autism) Diagnostic (2001); Szatmari
Interview (ASDI ) (1995); ICD-10;
DSM-IV
Australian Scale for — — — Behavior AS — TYP/ASD/ PSY Yes 5 Screening Questionnaire
Asperger’s Syndrome researchers
(ASAS) define as AS
a Most appropriate for school age children with mental retardation.
Note: All instruments are discussed in detail in text. AD = Asperger Disorder; ADHD = Attention Deficit Hyperactivity Disorder; AUT = Autistic; ED = Emotionally disturbed; I = Item; LANG DEL = Language
delayed; LANG DIS = Language disorder; LD = Learning disabled; MP = Multiple handicap; MR = Mentally retarded; PDD-NOS = Pervasive Developmental Disorder-Not Otherwise Specified; PSY = Psychiatric
disorder; S = Subscale; T = Total; TYP = Typical.

DIAGNOSTIC INSTRUMENTS FOR AUTISM

Next, instruments used in the diagnosis of autism and ASDs are discussed briefly following approximate chronological order according to when they were first introduced to the public and according to general categories of method. Descriptions are not meant to be comprehensive; some instruments will be described primarily as examples of kinds of measures or novel approaches. For more detailed information, the reader is referred to specific publications about each instrument or to a chapter by Parks (1988), for many of the older instruments. When several versions of the same or a similar scale have been disseminated, the focus is on the most recent version.

The First Empirically Developed Rating Scales and Questionnaires

The Rimland Diagnostic Form for Behavior-Disturbed Children (Form E-1) was the first widely used scale for the identification of autism (Rimland, 1968). It made an important contribution as a systematic diagnostic assessment that focused on a carefully selected range of symptoms rather than more abstract and inconsistently defined concepts, especially of emotional withdrawal. A revised form, Form E-2, is now scored without charge for parents by the Autism Research Institute in San Diego. Total scores are additive across all questions. The scale is based on the core symptoms defined by Kanner in 1943 and Kanner’s belief (Kanner, 1962, as cited in Rimland, 1971) that only a relatively small percentage of children labeled as autistic have “pure” autism.

Many parents have found information from the Autism Research Institute to be helpful. Comparisons with other scales suggest that the diagnosis yielded by the E-2 form is different from those offered by most other instruments. In the original validation study of the Childhood Autism Rating Scale (CARS; Schopler, Reichler, DeVellis, & Daly, 1980; also see below), over 200 children who met autism criteria and another 200 children who did not were all rated on the E-2 form. Only 8 were considered autistic by Rimland using the E-2 form and of those 8, 3 were considered nonautistic on the CARS. In another study, diagnostic overlap with the Behavior Rating Instrument for Autistic and Atypical Children (BRIAAC; Ruttenberg, Dratman, Fraknoi, & Wenar, 1966) was poor (Cohen et al., 1978).

Basic psychometric data and scoring information for the E-2 have not been published in scientific journals (Masters & Miller, 1970). Several studies suggested differences between parent and staff reports using the scale (Davids, 1975; Prior & Bence, 1975) and limited differentiation between children with autism and children with other disorders. While current diagnostic frameworks such as DSM-IV and ICD-10 continue to build on Kanner’s original descriptions of autism (Kanner, 1943), the ways in which symptoms are operationalized and weighted have changed substantially. Thus, the E-2 form may be most useful to parents who are beginning to familiarize themselves with behaviors associated with autism, rather than as a measure of standard diagnoses of autism or related disorders.

The BRIAAC is another scale that was created about the same time as Rimland’s first diagnostic checklist (Ruttenberg, Kalish, Wenar, & Wolf, 1977; Ruttenberg et al., 1966). It consists of eight subscales that measure behavior in different areas, yielding a diagnosis of autism. A trained rater completes the scale after substantial observations. The BRIAAC was important historically because it used direct observations of behaviors, defined on the basis of descriptions in case notes (Parks, 1988). Psychometrics were computed on various samples, including at least one study of autistic, mentally handicapped, and normally developing children. Reliability estimates in the form of correlations have consistently been high, though the scoring criteria are complex. More sophisticated estimates of inter-rater or test-retest reliability are not yet published. Results from validity studies have not indicated that diagnostic classifications based on the BRIAAC correspond to those yielded by other instruments or clinical judgment (Cohen et al., 1978). Because it is based only on current observations, the BRIAAC has the potential to be used as a measure of therapeutic effectiveness (Wenar & Ruttenberg, 1976), if more up-to-date, rigorous standards for reliability can be met.

Another scale that has been influential in the ent environments was unique and had the potential
field of ASDs has been the Handicaps, Behav- for usefulness in documenting changes in behav-
ior, and Skills schedule (HBS) (Wing & Gould, ior. The HBS has now been substantially revised.
1978). It was the first widely distributed semi- This revision is discussed later as the Diagnostic
structured interview for parents and caregivers Interview for Social and Communication Disor-
of children who were mentally retarded or ders (DISCO: Wing, Leekam, Libby, Gould, &
autistic (referred to as “psychotic” at the time). Larcombe, 2002).
It was used in the Camberwell epidemiological A final scale that was important in the first
study and, as the source of data for that project, group of diagnostic instruments emerging in
had a significant effect on the understanding of the 1970s was the Behavior Observation Scale
the “triad of impairments” seen in autism and (BOS; Freeman, Ritvo, Guthrie, Schroth, &
related disorders (Wing & Gould, 1979). The Ball, 1978). It includes ratings of 24 behaviors,
HBS was not a diagnostic instrument, but a carried out in 10-second intervals of a video-
“framework for eliciting, systematically, clini- taped free-play session. The BOS was the first
cal information to be used in conjunction with scale that emphasized the importance of con-
appropriate psychological tests for assessment trolling the environment in which a child was
and diagnosis” (Wing & Gould, 1978, p. 81). It observed, as well as standardizing what was
provided standard questions and topics so that observed. It used frequencies of behaviors to
an interviewer could elicit enough information differentiate among diagnostic groups. The
from a parent or caregiver to make an appropri- authors noted that this approach was not com-
ate rating for each item. Formal scoring was pletely successful for several reasons. Fre-
mapped onto the Vineland Social Maturity quencies of many behaviors were associated
Scale (Doll, 1965). The HBS took several hours with developmental levels as much as diagno-
to administer and consisted of 31 sections sis. In some cases, behaviors that occurred
that included questions about both diagnostic only rarely were very important, suggesting
and developmental issues. Psychometrics were that frequency was a less critical variable than
based on 171 children between 2 and 15 years the quality of behavior.
of age who comprised an epidemiological sam- The same authors then developed the Ritvo-
ple of children with IQs below 50 and/or who Freeman Real Life Rating Scale (RLRS;
were receiving special services who lived in the Freeman et al., 1986) to assess behaviors that
London borough of Camberwell. characterize autism more accurately, with an
Reliability, judged on the basis of compar- emphasis on unusual sensory behaviors. This
isons between pairs of ratings by parents, pro- scale can be used after observation of a 30-
fessional workers, and the authors, averaged minute free-play period. Marginal to adequate
from 77% to 81%. Summary ratings across in- reliability was found for individual items with
formants and observations in the form of 3- adequate subscale and total inter-rater reliabil-
point scales for each section showed near ity using kappas (Freeman et al., 1986; Sevin,
perfect agreement. Indices of association were Matson, Coe, Fee, & Sevin, 1991) for rela-
stronger for the absence of skills than the pres- tively brief samples of behavior coded by
ence, except for social development. Develop- raters with minimal training. For a sample of
mental variables were generally more reliable 24 children and adolescents with autism, 7 of
than ratings of behavioral abnormalities. 38 items did not occur at all and 4 others were
One unusual aspect of the reported research very rare. Inter-rater reliability for another 9
was comparisons among professional reports, par- items was not significant (Sevin et al., 1991).
ent reports, and the authors’ direct observations On the other hand, the correlation with the
of relevant behaviors. Parents tended to describe CARS total score was .77 for an autistic sam-
their children as more socially and emotionally ple. Three of the five subscales (social rela-
responsive than did professionals, but to report tionships, sensory, and language) and the total
more stereotyped movements and abnormal re- had adequate to high internal consistency
sponses to sensory stimuli. The more severe the (Sturmey, Matson, & Sevin, 1992). No specific
child’s impairment, the better was the agreement. cut-offs for diagnosis are provided. Thus, the
The mechanism for combining scores from differ- instrument is primarily useful as a general
index of diagnostic features, and potentially a measure of change, rather than as an independent source of classification.

SCALES THAT MEASURE CORE DEFICITS IN AUTISM SPECTRUM DISORDERS

Social Responsiveness Scale

The Social Responsiveness Scale (SRS; Constantino, 2002), formerly the Social Responsivity Scale, is a questionnaire designed to be completed by an adult, such as a parent or teacher, who observes a child in social situations for the purpose of measuring difficulties in reciprocal social interactions on a continuum (Constantino, Przybeck, Friesen, & Todd, 2000). The questionnaire takes only 15 to 20 minutes to complete and consists of 65 items covering dimensions of communication (6 items), social interactions (35 items), and repetitive and stereotyped behaviors and interests (20 items) associated with ASDs. Each item rates the frequency, not the intensity, of a behavior, on a scale from zero (not true) to three (almost always true). The item scores are totaled and result in a severity score along a continuum of difficulties in reciprocity in social interactions (Constantino & Todd, 2000).

Internal consistency of the measure was computed based on teacher-completed questionnaires for 195 school children between the ages of 4 and 7 years, resulting in a Cronbach’s alpha of .97. All 65 items were retained because reducing the number of items resulted in a reduced ability to distinguish subjects with PDD-NOS from clinical controls. In addition, factor loadings differed between groups of older and younger children. Test-retest reliability has been good with correlations reported between .83 and .88 (Constantino et al., 2004). Inter-rater reliability between parents and teachers ranged between correlations of .73 and .75 (Constantino & Todd, 2000; Constantino et al., 2004) and correlations between parents were also strong (r = .91). SRS scores were not related to IQ (Constantino et al., 2004) in one paper, but were in an earlier paper (Constantino, Przybeck, et al., 2000).

Scores on the SRS were significantly higher for children with diagnoses of autism, Asperger’s Disorder, and PDD-NOS than for children in the epidemiological school sample or clinical sample, which was comprised of child psychiatry patients with and without Pervasive Developmental Disorders (PDD). The scores of children with diagnoses of PDDs were approximately 2 standard deviations above the mean of the children with non-PDD psychiatric diagnoses. Approximately 8% of the sample of school children had scores that exceeded the mean of the children with ASDs. While children with PDD-NOS had significantly higher scores than nonautistic children in the clinical sample, overlap occurred between the lower 20% of scores in the PDD-NOS group and the upper 20% of scores in the children with mood and anxiety disorders. Results of a latent class analysis and principal components analysis on the epidemiological sample of school children revealed differences in severity, but not in patterns of scores, suggesting a continuously distributed variable (Constantino et al., 2000).

Strong correlations have been reported between the ADI-R algorithm scores and SRS scores, both based on parent report (Constantino et al., 2004). Principal Component Analysis resulted in a single factor explaining 35% of the variance (Constantino et al., 2004). At this point, the SRS is best used as a measure of severity of difficulties in social reciprocity, including odd behaviors. It has been used in genetic studies of ASDs (Constantino & Todd, 2000). The SRS does not take long to administer and demonstrates good reliability. Given that there is overlap between scores in ASD and non-ASD psychiatric populations, its primary use is for measuring symptom severity and response to treatment.

Pervasive Developmental Disorders Rating Scale

The Pervasive Developmental Disorders Rating Scale (PDDRS) is a revision of an earlier scale developed (Eaves, 1990; Eaves & Hooper, 1987), and includes 51 items across three subscales (arousal, affect, and cognition), based on the DSM-III-R. Each behavior is rated on a 5-point Likert scale. The author suggests that both the total score and the arousal factor score meet the cutoff of one standard deviation below the mean (standard score > 85), to classify a child as PDD.
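
The two-score PDDRS rule just described can be written out as a small sketch. It assumes a standard-score metric with a mean of 100 and a standard deviation of 15 (implied by the cutoff of 85 but not stated outright in the text), and it takes the comparison "> 85" exactly as printed. The function names and any norm values passed in are hypothetical illustrations, not part of the published instrument.

```python
# Minimal sketch of the two-score PDDRS classification rule described above.
# Assumption: standard scores have mean 100 and SD 15, so one SD below the mean is 85.
STANDARD_MEAN = 100.0
STANDARD_SD = 15.0
CUTOFF = STANDARD_MEAN - STANDARD_SD  # 85.0


def to_standard_score(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Convert a raw score to a standard score.

    norm_mean and norm_sd are hypothetical normative values supplied by the caller;
    the real PDDRS norms are not reproduced here.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd
    return STANDARD_MEAN + STANDARD_SD * z


def meets_pddrs_rule(total_standard: float, arousal_standard: float) -> bool:
    """Return True only if BOTH scores exceed the cutoff (standard score > 85)."""
    return total_standard > CUTOFF and arousal_standard > CUTOFF
```

Requiring both scores to pass, rather than either one, makes the rule more conservative: a child with an elevated total score but an arousal score at or below the cutoff would not be classified.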

The internal consistency, test-retest reliability and inter-rater reliability of the measure were evaluated. Internal consistency was good, resulting in reliability coefficients between .79 and .90 for the scales and .92 on the total score. Test-retest and inter-rater reliability were strong when based on an initial sample in which rating pairs were collected over a mean of 8.33 months, with correlation coefficients between .87 and .91. In a second sample, inter-rater and test-retest reliability were evaluated based on ratings completed by two different respondents over 14 months. Reliability was lower in this situation ranging from .44 to .53 (Eaves, Campbell, & Chambers, 2000).

Convergent and discriminant validity of the instrument were measured by comparing scores on the PDDRS with scores on the ABC and evaluating the sensitivity and specificity of the instrument. Partial correlations, with chronological age as the control variable, were run on the ABC scales and the PDDRS factors. All correlations were significantly different from zero with the exception of PDDRS Cognition and ABC Relating and PDDRS Cognition and ABC Body and Object Use, for which correlations ranged from .32 to .81. The mean score on each scale was significantly higher in the autistic group than a nonautistic group that included nonautism ASDs as well as moderate to severe mental retardation and Williams Syndrome. Using the recommended cut-off score, sensitivity and specificity were 88%. The ABC and the PDDRS scores were consistent in classifying children with autism in 85% of the sample (Eaves et al., 2000). In the validity studies, no standard diagnostic procedure was used to define the sample. Thus, the authors suggest that the instruments be used for screening rather than for diagnostic purposes (Eaves et al., 2000). Because the control group included children with ASDs and there was not a standardized procedure for establishing diagnosis, it is possible that the instrument may miss some children with autism, given that it “screens out” children with related ASDs.

Children’s Social Behavior Questionnaire

The Children’s Social Behavior Questionnaire (CSBQ; Luteijn, Luteijn, Jackson, Volkmar, & Minderaa, 2000) covers areas associated with ASDs and was designed to be completed by parents or caregivers of children between the ages of 4 and 18 years. It includes 96 items, 66 of which fall into five factors: Acting-Out, Social Contact Problems, Social Insight Problems, Anxious/Rigid, and Stereotypical (Luteijn et al., 2000). Each item focuses on recent behavior (over the past 2 months) and is rated from zero (“does not describe the child”) to two (“clearly applies to the child”).

Internal consistency, inter-rater reliability and test-retest reliability were all evaluated for the questionnaire. Internal consistency was fair to excellent with Cronbach’s alphas ranging from .76 on the Stereotypical scale to .92 on the Acting-Out scale. Inter-rater reliability between parents was good to excellent, with intra-class correlations ranging from .64 for the Anxious/Rigid scale to .85 for the Social Contacts scale. Test-retest reliability was also good to excellent for most scales, with intra-class correlations ranging from .62 on the Social Insight Problems scale to .90 on the Total, with the exception of the Stereotypical scale, which had a low intra-class correlation of .32 (Luteijn et al., 2000). Convergent and discriminant validity of the scales were measured by comparing scores on the CSBQ with scores on the Children’s Behavior Checklist (CBCL; Achenbach, 1981) and the ABC (Krug et al., 1980a) and by comparing mean scores on the measure between diagnostic groups (Luteijn et al., 2000). The scales of the CSBQ were highly correlated with the scales of both the ABC and the CBCL. Three scales of the CSBQ were significantly correlated (.31 to .46) with scores from a checklist based on the DSM-IV, completed by a clinician. The exceptions were Acting-Out and Anxious/Rigid, indicating that these two scales were less specific to difficulties associated with an ASD. A discriminant function analysis revealed that 50% of children in the original five groups (PDD-NOS, high-functioning autistic children, attention deficit hyperactivity disorder, clinical control group, mentally retarded children, normal control group) could be correctly classified on the basis of the four discriminant functions: (1) General psychopathology, (2) Withdrawn behaviors, (3) Negative correlation with Social Insight Problems and a positive correlation with Anxious/Rigid, and (4) A strong relationship with
Stereotypical Behaviors and Anxious/Rigid Behaviors (Luteijn et al., 2000).

The authors suggest that the instrument may offer important contributions to research and clinical work, particularly because it revealed different patterns of scores in children with autism and children with PDD-NOS. Specifically, children with PDD-NOS scored higher on the Acting-Out scale than children with autism (Luteijn et al., 2000). One limitation is that the diagnostic groups were determined based on clinical diagnosis alone, rather than with standardized measures. In addition, the correlations of the CSBQ scales with the DSM-IV checklist were not high, although they were significant. At this point, the CSBQ remains in the early stages. Further investigation will be important in determining its research and clinical utility.

Achenbach System of Empirically Based Assessment

The Achenbach System of Empirically Based Assessment, Preschool Forms and Profiles (Achenbach & Rescorla, 2000) includes the CBCL for ages 1 year to 5 years, the Language Development Survey (LDS), and the Caregiver-Teacher Report Form (C-TRF).

The CBCL is a questionnaire designed to be completed by parents or caregivers in a home setting, and only requires a fifth-grade reading level. The CBCL scores result in a Total Score and Internalizing and Externalizing Scales, as well as Syndrome and DSM Oriented Scales. The DSM Oriented Scales include a Pervasive Developmental Disorder Problems Scale that consists of 13 items. Each item is rated on a 0 to 2 point scale based on behaviors over the past 2 months, with “0” indicating “not true,” “1” indicating “sometimes true” or “somewhat true” and “2” indicating “very true” or “often true.” Based on raw scores, T-scores can be calculated for each of the DSM Oriented Scales. There are cut points for the “borderline range” and the “clinical range.” The C-TRF is a teacher rating form designed to be completed by daycare providers or teachers.

While the CBCL is not intended for diagnostic purposes, it is included in this chapter because it includes a Pervasive Developmental Disorders Scale as one of the DSM Oriented Scales. Achenbach and Rescorla (2000) specify that the DSM Oriented Scales are not equivalent to a diagnosis, because only behavior over the past 2 months is rated, the behaviors listed do not correspond exactly to diagnostic criteria, and the standard scores are based on age and gender comparisons, whereas the DSM-IV is not. However, the scores could be used to identify children with behavior difficulties, and children who have elevated scores (borderline or clinical range) could be referred for further evaluation.

Test-retest reliability was quite high for the Pervasive Developmental Problems scale for both parents (r = .86) and teachers (r = .83) when the checklist was completed a second time 8 days after the initial rating. Inter-rater reliability on the Pervasive Developmental Problems Scale was moderate between parent to parent ratings (r = .67) and teacher to teacher ratings (r = .67).

Validity was assessed based on a “clinic referred sample” and a “non-referred sample.” As a result, information is not available as to how valid the instrument is for screening specifically for ASD. While the CBCL and C-TRF should not be used as diagnostic instruments, they have potential value as screening tools or research measures of autistic behaviors.

CURRENTLY USED RATING SCALES

Childhood Autism Rating Scale

The Childhood Autism Rating Scale (CARS; Schopler, Reichler, & Renner, 1986) is the strongest, best-documented, and most widely used clinical rating scale for behaviors associated with autism. It has been used in studies all around the world and translated into many languages (Nordin, Gillberg, & Nyden, 1998; Pilowsky et al., 1998; Sponheim, 1996). It consists of 15 items on which children and adults are rated, generally after observation, on a 4-point scale. The scale requires minimal training. Training is available on videotape or in brief workshops. Points are added and a standard cut-off of 30 has been suggested and validated with various samples (Garfin et al., 1988; Schopler et al., 1980). Minor modifications have been suggested in which cut-offs are moved up a few points for very young children
(Lord, 1995) and down for high-functioning adolescents and adults (Mesibov, Schopler, Schaffer, & Michal, 1989).

Most of the information about the CARS is from studies of autistic children who function in the mild to moderate range of mental handicap. Studies of discriminant validity from carefully matched comparison groups are not yet available, though the CARS has been shown to discriminate autistic children from children without autism and some mental handicap (Schopler et al., 1988; Teal & Wiebe, 1986). Convergence between the CARS and the Autism Diagnostic Interview (ADI; Lord, 1995; Sevin et al., 1991; Venter et al., 1992) and correlations between CARS total scores and RLRS total scores (Sevin et al., 1991) were good for autistic children, but less good for young, nonautistic mentally handicapped children (Lord, 1995). Thus, the evidence that the CARS accurately identifies children with autism is stronger than the evidence that it discriminates between children with autism and mental-age matched children with other disorders.

The CARS was created before the introduction of DSM-IV and ICD-10 diagnostic frameworks. It shows good agreement with clinicians’ judgments using DSM-III-R, though it is somewhat over-inclusive compared to strict application of the criteria (Van Bourgondien, Marcus, & Schopler, 1992). Because, with the exception of the preceding reference, DSM-III-R was found to be more inclusive than clinicians’ judgments of autism (Hertzig, Snow, New, & Shapiro, 1990; Volkmar, Cicchetti, Bregman, & Cohen, 1992), this finding suggests that the CARS identifies more children as having autism than the currently accepted three-domain diagnostic frameworks of DSM-IV (American Psychiatric Association, 1994) and ICD-10 (World Health Organization, 1992). Children with minimal verbal skills and/or moderate to severe mental handicap may be more likely to fall into the range of autism, in part because items on the CARS rating language skill and mental handicap comprise part of the total score (Pilowsky et al., 1998). For the purposes of screening or determining services, over-inclusiveness of children with clear impairments is not as problematic as over-exclusion (Wing & Gould, 1979). However, implications may be different for research. The CARS cannot be used alone to make discriminations for complex diagnostic cases in which DSM-IV or ICD-10 criteria are the standard; nevertheless, as discussed earlier, multiple sources are important in any diagnostic decision making and it may provide important information in addition to other sources (Nordin & Gillberg, 1996a, 1996b).

The CARS total score has held up to repeated, careful examinations, as internally consistent (Kurita, Kita, & Miyake, 1992; Sturmey et al., 1992) and reliable across raters (Garfin et al., 1988; Kurita et al., 1992; Sevin et al., 1991). Inter-rater reliability for individual items has been found to be more variable. Some of the scales (e.g., Relating to People, Imitation) have consistently shown high correlations between different raters’ scores. Statistics such as kappas, which control for base rates, have not yet been employed (Garfin et al., 1988; Sevin et al., 1991). One of the important contributions of the CARS was the provision of specific anchorpoints for each item in a way that allows the rater to take into account developmental level. The difficulty with this strategy is that how anchorpoints are defined differs across items. Interpretation of scores on individual items, particularly given the inconsistent evidence of reliability at this level, must be carried out with care.

Besides direct observation by a clinician, for which the CARS was designed, it has also been used in chart review, scored directly by parents and teachers, and used as part of a parent interview (Schopler et al., 1988). On the whole, classifications and correlations between raters for total scores have been relatively high across different procedures. Several studies have suggested that clinicians tend to rate behaviors as more severe than do fathers or mothers (Bebko, Konstantareas, & Springer, 1987; Konstantareas & Homatidis, 1989), with other studies finding few differences (Freeman, Perry, & Factor, 1991; Schopler et al., 1988).

A factor analysis of the CARS of 90 children with clinical diagnoses of autism or PDD-NOS based on DSM-III-R criteria yielded five factors out of 15 items: Social Communication, Emotional Reactivity, Social Orienting, Cognitive and Behavioral Consistency, and Odd Sensory Exploration. Cognitive and Behavioral Consistency and Emotional Reactivity were significantly correlated with age;
Social Communication was significantly correlated with gender, IQ, and Vineland scores. Factor-based scales distinguished children with autism from those with PDD-NOS. It was suggested that use of these factor scores might increase the sensitivity of the CARS with younger and/or higher functioning individuals within the autistic spectrum (Stella, Mundy, & Tuchman, 1999).

Overall, the CARS is the most widely researched and employed rating scale of autism in the United States. Versions are available in numerous languages other than English. It is a reliable screening instrument for children with autism and mental retardation that can be used with minimal training across a range of situations. Its scores do not correspond to current formal diagnostic frameworks for autism, such as DSM-IV and ICD-10, and so for research purposes, it may identify a somewhat different population than suggested by those systems.

Autism Behavior Checklist

The Autism Behavior Checklist (ABC) is one component of the Autism Screening Instrument for Educational Planning (ASIEP; Krug et al., 1980b) and the only one that has been evaluated psychometrically. It builds on Rimland’s Form E-2, the original Kanner criteria (1943), the Behavior Observation Scale (Freeman et al., 1978), the BRIAAC (Ruttenberg et al., 1977), and several other sources. It contains 57 items in five areas: sensory, relating, body and object use, language and social interaction, and self-help. It was intended to be completed by teachers as an initial step in educational planning. No special training is required. It has also been used with parents on a retrospective basis for families of high-functioning children (Yirmiya et al., 1994) and on a current basis, yielding somewhat higher scores than with teachers (Volkmar et al., 1988). The rater completes dichotomous ratings, which are weighted according to the authors’ data and yield a total score. Ranges, on the basis of a very large, but unspecified sample, are provided for a high probability of autism (≥ 68), low probability of autism (under 53), and mixed. Several investigators have reported that the suggested cut-offs are too high, and result in a high proportion of false negatives (Miranda-Linne et al., 1997; Volkmar et al., 1988; Wadden, Bryson, & Rodger, 1991). More recently, Krug, Arick, and Almond (1993) recommended using a cut-off of greater than 53 for classifying a child as probably autistic. When using this lower cut-off, Eaves et al. (2000) found that overall classification accuracy was 80%, specificity (correct negatives) was 91% and sensitivity (correct positives) was 77%. Norms and standard profiles are provided for samples of autistic, typical, deaf, and blind students.

Initial estimates for inter-rater reliability were high, though based on small samples and not controlling for chance (Krug et al., 1980b). Later estimates have been less high (Volkmar et al., 1988). Discriminant validity has been variable, in part depending on whether investigators generated discriminant functions from data within their group or used the cut-offs suggested by the authors. In the latter case, there was considerable overlap between autistic and mentally handicapped populations (Volkmar et al., 1988). In the former case, diagnostic differentiation was, not surprisingly, better (Nordin & Gillberg, 1996b; Wadden et al., 1991). Current scores on the ABC did not meet criteria for most of a group of verbal adolescents with autism, but retrospective accounts did (Yirmiya et al., 1994). Differences in studies may also be related to the use of a somewhat broader definition of autism, in which case the ABC becomes more accurate in diagnosing autism, and inclusion of subjects with Down syndrome, which may decrease the false positive rate (Wadden et al., 1991).

Internal consistency for the total scale is good. Various investigations have yielded different results in terms of the internal consistency and intercorrelations of the five areas; both chronological and mental age may account for much of the variance. Subscales of relating and object/body use were the strongest in one study in terms of inter-item correlations and lack of rogue items (Sturmey et al., 1992). Several investigators have suggested that discriminant validity may be equally good using fewer items (Volkmar et al., 1994; Wadden et al., 1991).

Convergent validity between the ABC and other instruments has been measured for the CARS and the RLRS and found to be poor, suggesting that the ABC’s usefulness as an
independent diagnostic instrument may be limited, particularly since it was constructed before current theoretical frameworks for autism were proposed (Nordin & Gillberg, 1996b). For verbal autistic adolescents, retrospective parent ratings on the ABC about their children’s behavior between 3 and 5 years were related to whether children were considered to have “residual” autism or not, but diagnosis did not correspond to the cut-offs suggested by the authors of the scale (Yirmiya et al., 1994).

The ABC emphasizes autistic symptomatology rather than prosocial behaviors and so is quite different than several of the other instruments, for example, the ADI-R. Because of its emphasis on observable features associated with, but not limited to, autism, the ABC may be helpful in documenting change. This would be particularly true for changes in the presence of abnormal behaviors. Unlike several other autism scales that showed more consistent convergent validity with each other, the ABC is correlated with the American Association of Mental Deficiency (AAMD) Adaptive Behavior Scale-School Version (Sevin et al., 1991). The ABC alone cannot be considered a strong diagnostic instrument because of its limited relationship to current diagnostic frameworks. As it stands, it is of limited value as a screening instrument because of variable sensitivity. However, the ABC may be useful in documenting response to treatment and educational programming.

Revised Behavior Summarized Evaluation

The Revised Behavior Summarized Evaluation (BSE-R) is composed of items from two overlapping instruments, the Behavioral Summarized Evaluation scale (BSE) and the Infant Behavioral Summarized Evaluation scale (IBSE; Barthelemy et al., 1997) and is primarily designed to document behavioral symptoms associated with autism as they relate to neurophysiological measures. New items have been added concerning nonverbal communication, emotion, and perception, as well as intention and imitation. These scales are available in French and have been used in many basic research investigations of children with autism in France (for example, see Zakian, Malvy, Desombre, Roux, & Lenoir, 2000). There are 20 items in the BSE selected from 19 items from the autism factor in the IBSE, the form for children under 4 years of age (Adrien et al., 1992) and 20 in the original BSE (Barthelemy et al., 1990). Items are scored on a five-point scale administered by trained raters, on the basis of direct or videotaped observation, discussion of history, and access to information from multiple sources. With trained raters, most individual items have shown very good inter-rater reliability. Inter-rater reliability for total scores has been excellent, though ratings were not typically based on independently acquired information.

Factor analyses have shown loadings within one primary Interaction Disorder factor, accounting for 38% of the variance and a Modulation factor, accounting for 10%. Results from previous versions indicated adequate internal consistency (Adrien et al., 1992; Barthelemy et al., 1990). The Interaction factor was not correlated with age but was highly negatively correlated with IQ (r = −.59). Discriminant function analyses accurately grouped 80% to 85% of autistic and mentally handicapped children using the IBSE (Adrien et al., 1992). Interaction Disorder factor scores were correlated with expert ratings of severity of autism (Barthelemy et al., 1990, 1997). A cut-off score of 27 on the Interaction Disorder factor on the BSE-R yielded a sensitivity of .74 and a specificity of .71 (Barthelemy et al., 1997). Convergent validity with other measures except the Rimland E-2 is not yet published. There is some suggestion that the BSE-R may be particularly helpful in measuring response to treatment (Boiron, Barthelemy, Adrien, Martineau, & Lelord, 1992) and in neurophysiological studies (Barthelemy et al., 1997).

The Gilliam Autism Rating Scale

The Gilliam Autism Rating Scale (GARS; Gilliam, 1995) is a parent-completed surveillance questionnaire, designed to indicate the probability that a child has autism. It is intended for individuals between 3 and 22 years of age. The questionnaire consists of 56 items across four subscales: Social Interaction, Communication, Stereotyped Behaviors, and Developmental Disturbances. The first three subscales listed are based on a child’s current
behavior, and the final scale is based on a child’s developmental history. Each item is rated on a four-point scale, from “Never Observed” to “Frequently Observed.” Item scores are totaled for each scale and correspond to a standard score with a mean of 10 and a standard deviation of 3. Typically, all scales of the GARS are completed. However, if a child is nonverbal and/or the parent does not have knowledge of the child’s early history, the Communication or Developmental History scales may be omitted. A standard score or Autism Quotient can be based on 4, 3, or 2 scales of the GARS. An Autism Quotient is derived by summing relevant scale scores, yielding a standard score with a mean of 100 and a standard deviation of 15. The Autism Quotient is divided into seven ordinal categories, ranging from a “low probability” that a child has autism to a “high probability” that a child has autism.

Internal consistency of the items on the scale using Cronbach’s alpha yielded coefficient alphas ranging from .88 to .96 (Gilliam, 1995). Correlations among individual GARS scales rating current behaviors are relatively high. The Developmental Disturbances scale was not significantly correlated with any of the other scales, although it was weakly, but significantly, correlated with the Autism Quotient, r = .34 (South et al., 2002). Test-retest reliability on a small sample of 11 children, using three of the scales (excluding Developmental History), revealed correlations between totals ranging from r = .81 (Communication) to r = .88 (Autism Quotient). Neither item agreement nor classification agreement across time or rater was reported. Finally, inter-rater reliability was evaluated for the various pairs of raters (parent-parent, teacher-teacher, and parent-teacher). Teacher-to-teacher and teacher-to-parent inter-rater reliability estimates were all strong, ranging from .85 to .99. Ratings were weakest for the Parent-to-Parent ratings, with reliability ranging from .55 to .85 (Gilliam, 1995).

The initial reference sample for the GARS consisted of data collected for 1,092 children, adolescents, and adults (Gilliam, 1995). Although a parent or professional rater reported each individual’s diagnosis, an independent professional did not verify it. Thus, there was no “gold standard” for diagnosis. Concurrent validity was evaluated by correlating standard scores on the GARS with scores on the ABC; all correlations were large and significant. Discriminant validity was evaluated by determining how well the measure discriminated between groups that were diagnosed with autism as compared to those who were not. Significant differences were found between the means of those diagnosed with autism versus those who were not. Using the Autism Quotient, 90% of the subjects were classified accurately.

A later independent study of the validity of the GARS was based on a sample of 119 individuals with autism, all of whom had extensive diagnostic evaluations, including the ADOS and the ADI-R, by experts in ASDs (South et al., 2002). The validity data from this study were disappointing, with the GARS receiving a sensitivity of .48 compared to “gold standard” diagnoses, indicating that 52% of children with autism (based on the ADI, ADOS, and clinical impression) were missed by this instrument (South et al., 2002). Convergent validity was also investigated by comparing the GARS scores to scores on the ADOS and the ADI-R (South et al., 2002). There were no significant correlations between any of the GARS scales and the ADOS. Small but significant correlations were reported between the ADI-R Social Interactions score and the GARS Social Interaction Scale (r = .26), Stereotyped Behaviors Scale (r = .21), and the Autism Quotient (r = .23).

Given the current information about its validity and the high rates of false negatives, the GARS cannot yet be used in isolation as a diagnostic tool. This is particularly concerning because, although most research projects have used the instrument in conjunction with other instruments (Asano et al., 2001; Owley et al., 2001), some studies have employed the GARS for diagnostic purposes (Schreck & Mulick, 2000). The use of this instrument may be even more problematic in clinical settings where the professionals know less about autism. In this context, many children with autism could be missed and not referred for appropriate services. The author intends to revise the instrument and has proposed using a lower cut-off score (South et al., 2002).
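
The sensitivity figure reported for the GARS can be restated as simple arithmetic. The Python sketch below is illustrative only: it uses just the two numbers given above (a sample of 119 individuals and a sensitivity of .48), the function name `missed_cases` is hypothetical, and the resulting count is an approximation derived here rather than a figure reported by South et al. (2002).

```python
def missed_cases(n_with_autism: int, sensitivity: float) -> tuple[float, float]:
    """Return (false-negative rate, approximate number of cases missed).

    Sensitivity is the proportion of true cases the instrument detects,
    so the proportion missed is 1 - sensitivity.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie between 0 and 1")
    false_negative_rate = 1.0 - sensitivity
    return false_negative_rate, false_negative_rate * n_with_autism


# Values taken from the text: 119 individuals with autism, sensitivity of .48.
rate, missed = missed_cases(119, 0.48)
print(f"{rate:.0%} of cases missed, roughly {missed:.0f} of 119 individuals")
```

Read this way, a sensitivity below .50 means the instrument misses more individuals with autism than it identifies, which is the basis for the caution against using the GARS in isolation.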

DIAGNOSTIC INTERVIEWS

Autism Diagnostic Interview-Revised

The Autism Diagnostic Interview-Revised (ADI-R) is a semi-structured, investigator-based interview for caregivers of children and adults for whom autism or pervasive developmental disorders is a possible diagnosis. Originally developed as a research diagnostic instrument (ADI; Le Couteur et al., 1989), the ADI-R has been modified to be appropriate for a broader age range of children than the original ADI (Lord et al., 1994). It is linked specifically to ICD-10 and DSM-IV criteria. A revised shortened version is now available, consisting of about 93 items. The most recent version takes about 2 hours for an experienced interviewer to administer (Le Couteur, Lord, & Rutter, 2003). Researchers are required to participate in training workshops and to establish reliability with investigators from other centers. Clinicians are encouraged to use video training materials, and may use the instrument without intensive training within the ethical guidelines for test use in their professions. Nonetheless, administering the ADI-R requires general experience in both interviewing and working with individuals with autism to be effective. The ADI-R has been translated into 11 languages and the ADI and the ADI-R are cited as the “gold standard” for diagnosis in many countries.

Psychometric data for the ADI and ADI-R have been carefully acquired with attention to matching across samples and to maintaining as much “blindness” as possible for raters, but are based on very small samples (Rutter et al., 2003). This limitation is compensated for slightly by independent psychometric data published by other major research centers that have used the ADI or ADI-R as a diagnostic instrument (Constantino et al., 2004; Cuccaro, Shao, Grubber, et al., 2003; deBildt et al., 2004; Kolevzon et al., 2004; Saemundsen, Magnússon, Smári, & Sigurdardóttir, 2003). Inter-rater reliability has been good to excellent for individual items and excellent for domain scores, including those for each of the three subscales: social reciprocity, communication, and restricted, repetitive behaviors that correspond to the DSM-IV/ICD-10 domains (see Chakrabarti & Fombonne, 2001; Rutter et al., 2003). Test-retest reliability, on a very small sample, was also good (Lord et al., 1994). Change over time is reflected in items that include whether the behavior “ever” occurred and items that focus on “current” manifestations. On the whole, however, the ADI-R is not intended to measure change. There has been a deliberate attempt to include items that will reflect autism of varying levels of severity and at varying points in development.

Internal consistency is excellent within the three domains. Differentiation between autistic and mentally handicapped children and adults is excellent, with the restriction that the instrument tends to be over-inclusive for individuals with mental ages of less than 18 months (Lord et al., 1993) and with severely to profoundly retarded individuals (Nordin & Gillberg, 1998). One study found that the ADI was slightly under-inclusive with very verbal children with autism or pervasive developmental disorders (Yirmiya et al., 1994); another study reported that it was over-inclusive (Mahoney et al., 1998). Convergent validity with the CARS was excellent after age 3 (Lord, 1995; Pilowsky et al., 1998); convergent validity with the Autism Diagnostic Observation Schedule (see below) has also been good for most samples (Hepburn et al., 2003; Lord, Risi, et al., 2000; Lord et al., 1989). The exception was a recent study of Bishop and Norbury (2002) in which children with language impairments, in some cases also with ASD, were given the ADI-R or SCQ, the ADOS, and the Children’s Communication Checklist (CCC). ADI-R and SCQ classifications were comparable to school classifications but not with ADOS scores or scores on the CCC, which were similar to each other. It was not clear if this was related to specific difficulties using the ASD instruments in a relatively narrowly defined verbal sample, administration of a single module of the ADOS regardless of language level in some cases (all children were given Module 3), or differences in parent report and school classification systems and direct observations.

Because of the widespread use of the ADI-R in defining samples, there has been a recent surge of interest in how to use the ADI-R for a variety of other purposes beyond classification,
including quantifying severity (Lord, Leventhal, & Cook, 2001; Spiker, Lotspeich, Dimiceli, Myers, & Risch, 2002; Szatmari et al., 2002; Volkmar & Lord, 1998), describing individual differences (Alarcón et al., 2002; Cuccaro, Shao, Bass, et al., 2003; Tanguay, Robertson, & Derrick, 1998) and creating more homogeneous subsets of participants for genetic analyses (Buxbaum et al., 2001; Freitag, 2002; Shao, Raiford, et al., 2002; Shao, Wolpert, et al., 2002; Tadevosyan-Leyfer et al., 2003). These studies have used a wide range of analytic techniques, sometimes related to different purposes, and have been carried out on a wide range of items (e.g., sometimes all ADI-R item scores are included; sometimes selected items, sometimes domain scores). Studies have varied considerably in whether age, IQ, or verbal level were controlled. It is clear that, depending on the ranges studied, all three of these features can affect ADI/ADI-R scores (Cox et al., 1999; Cuccaro, Shao, Bass, et al., 2003; Spiker et al., 1994).

Overall, no differences were found in domain scores for multiplex families compared to singletons (Cuccaro, Shao, Bass, et al., 2003). Factor analyses of domain scores and Vineland adaptive behavior scores (Sparrow, Balla, & Cicchetti, 1984) in two separate samples yielded a symptom number factor and a separate factor for level of functioning, determined by the Vineland adaptive behavior scores (Szatmari et al., 2002). Only the ADI-R domain of nonverbal communication showed any evidence of concordance within multiplex families (MacLean et al., 1999), a relationship also found by another research group (Freitag et al., 2002). In another sample, heritability was supported for a continuous severity gradient composed of ADI-R scores, verbal-nonverbal status, and nonverbal IQ (Spiker et al., 2002). Several other studies that have measured concordance within twin pairs (Le Couteur et al., 1996) and families (Spiker et al., 1994) found contradictory results with little concordance on any dimension for monozygotic twins, but concordance for ADI-R repetitive scores found in families (Freitag et al., 2002; Spiker et al., 1994).

Several groups of genetics researchers have produced increased homogeneity and more significant results by subsetting groups by individual items within the ADI-R repetitive domain (Alarcón et al., 2002) or by the entire domain score (Silverman et al., 2002). Other studies have found that only particular combinations of items (e.g., insistence on sameness, compulsions) yielded similar results (Shao et al., 2003) for other genetic regions. Though potential genetic significance of repetitive behaviors emerges across papers, in most cases, studies have not replicated each other, nor have age, IQ, or verbal status been controlled consistently.

The other way in which the ADI-R has been used within genetic studies has been to produce subsets based on language delay (based on measures of age of first word or age of first phrase) or on current language level. Concordance for current verbal ability has been shown in some cases (Freitag et al., 2002; MacLean et al., 1999; Spiker et al., 2002), but on the whole, only delay in either first single words (Alarcón et al., 2002) or first phrases (Bradford et al., 2001; Buxbaum et al., 2001; Shao, Wolpert, et al., 2002) or delay accompanied by the presence of a language-delayed relative (Folstein & Mankoski, 2000) increased the significance of specific regions.

Several factor analyses and principal component analyses have been carried out, primarily with data from earlier versions of the ADI. In one study, factors emerged that reflected three aspects of social communication: Affective Reciprocity, Theory of Mind, and Joint Attention (Tanguay et al., 1998). In another study (Lord, 1990), social and communication items both loaded on two factors; in this case, the factors seemed to reflect initiations versus social responsiveness. In a recent study, six factors emerged that together accounted for about 40% of the variance (Tadevosyan-Leyfer et al., 2003). These factors consisted of items scored for both current functioning and “ever”/most abnormal 4 to 5, that were present in both the early ADI and the ADI-R, so they represent a particular subset of questions. Factors were validated in another sample using additional psychometric measures. Constantino and Todd (2003) recently reported a factor analysis of the ADI-R, with a different pattern.

It seems very likely that items within the ADI-R can be combined in more fruitful ways than the present algorithm domain scores. One consistent finding across these studies is the
overlap between “communication” and “social” items, suggesting that they are not separate domains of skill (Lord, 1996; Tadevosyan-Leyfer et al., 2003; Tanguay et al., 1998). Several factors with different organizations of repetitive behaviors have also been proposed. To date, however, factors in this area and across other domains have differed considerably across investigations. The development of a more stable measure or measures of repetitive interests and behaviors will be an important contribution to better understanding of phenotypes in ASDs. Larger samples, including individuals without autism, will be necessary in order to control effects of age, verbal status, and IQ. Replication across sites and samples will be crucial in determining the factors of greatest interest or usefulness.

ADI scores have also been shown to be related to ABC scores, given by history, for a group of high-functioning children (Yirmiya et al., 1994). Because of its clear link to DSM-IV and ICD-10 and its multidimensional approach, the ADI-R offers the potential of providing empirical information and diagnostic guidance about other PDDs besides autism. However, cut-offs for nonautism pervasive developmental disorders are not yet available. Several investigations have proposed various cut-offs, including one or two points below autism thresholds (over all the domains), but none have yet been empirically validated (Cox et al., 1999; Dawson et al., 2002).

The Diagnostic Interview for Social and Communication Disorders

The Diagnostic Interview for Social and Communication Disorders (DISCO; Wing et al., 2002) is a standardized, semi-structured interview, now in its ninth revision. It is based on the Handicaps, Behaviors and Skills schedule (HBS) (Wing & Gould, 1978, 1979). In 1990, a clinical need emerged for an instrument that extended beyond the school age years, into adulthood. At this time, the first version of the DISCO was developed to assess the pattern of development in individuals with ASDs and their individual needs (Wing et al., 2002). The primary purpose of the DISCO is not to provide a diagnostic classification. Rather, the instrument was designed to obtain information on behaviors relevant to autism for the purpose of assisting clinicians in determining a child’s development in different areas as well as his individual needs (Leekam, Libby, Wing, Gould, & Taylor, 2002). It is based on the concept of a spectrum of disorders rather than categorical diagnoses.

The DISCO is an investigator-based interview in which the interviewer asks questions designed to elicit descriptions of behavior and makes coding decisions based on the information provided. The coding of the items can be based on information obtained during the interview as well as through other information, such as direct observation. The DISCO includes items covering behavioral manifestations of the deficits associated with ASDs, including social interaction, communication, imagination, and repetitive activities. In addition, it includes items designed to assess developmental levels in a variety of domains. Many of these items are based on the Vineland Adaptive Behavior Scales (Sparrow et al., 1984). There is also a section on atypical behaviors that are not specific to autism. These include unusual responses to sensory stimuli, difficulties in attention and activity level, challenging behaviors, and other psychiatric disorders. Items relating to developmental delay are rated on a 3-point scale, as “delay,” “minor delay,” or “no problem.” An actual age is coded for some of the developmental items. Atypical behaviors receive codes for “current” and “ever” and are rated as “severe,” “minor,” or “not present.”

The reliability of the DISCO 9 was evaluated based on a sample of 82 children with diagnoses of ASDs, learning disability, or no diagnosis (typically developing) between the ages of 3 and 11 years (Wing et al., 2002). Inter-rater reliability was measured, comparing two interviewers/coders, using kappa for items with two or three codes and intraclass correlations (ICC) for items with four or more codes. Agreement was high (κ or ICC > .75) for 85% of all ratings for both preschool age and school age children. Within the Developmental Skills area, the lowest agreements (.67 to .80) were for items that were not part of the diagnostic algorithm (e.g., reading, drawing). Of greater concern was the low agreement (with kappas < .40) on some of
the social interaction items and for many of the repetitive routine items, which are part of the diagnostic algorithm. Inter-rater reliability was higher for the “ever” items than for the “current” items. Based on this information, the authors plan to make some changes designed to improve reliability, which will be included in the DISCO 10.

While the DISCO was designed for clinical purposes, provisional algorithms have been written for research purposes. Recently, two diagnostic algorithms for the DISCO 9 were developed and investigated (Leekam et al., 2002). One of the algorithms was based on criteria for autistic disorder in the ICD-10 (World Health Organization, 1992) and the other was based on the criteria for autistic spectrum disorder as defined by Wing and Gould (1979). When comparing clinical diagnosis to algorithm diagnoses for a sample of children with language disorders, learning disability, and autistic disorder, both algorithms were significantly related to a diagnosis of autistic disorder or nonautistic disorder. However, discrepancies were also found, primarily for the clinical nonautistic group using the ICD-10 algorithm, such that 10 children with clinical diagnoses of a language disorder or learning disability met ICD-10 algorithm criteria for autistic disorder. Four children with a learning disability diagnosis met both ICD-10 and Wing and Gould algorithm criteria for autistic disorder, while none with a language disorder met criteria using both algorithms. DISCO 9 algorithms were also generated for Gillberg’s diagnostic criteria for Asperger’s Disorder and ICD-10 criteria for Asperger’s Disorder. Of the 200 children included in the study, all of whom met ICD-10 criteria for autism or atypical autism, only 3 (1%) met criteria for Asperger’s Disorder based on the ICD-10 algorithm and 91 (45%) met criteria based on Gillberg’s algorithm criteria (Leekam, Libby, et al., 2000).

The DISCO was primarily designed for clinical purposes, particularly for assisting in generating recommendations for individuals and adults with autistic spectrum disorders. The authors are revising the instrument to improve inter-rater reliability and to generate diagnostic algorithms that can be used for research purposes.

DIRECT OBSERVATION SCALES

Autism Diagnostic Observation Schedule

The Autism Diagnostic Observation Schedule (ADOS) is a standardized protocol for the observation of social and communicative behavior of children for whom a diagnosis of autism or ASDs is in question (Lord, Rutter, DiLavore, & Risi, 1999; Lord, Risi, et al., 2000). The original ADOS was developed in order to be used with children who had fluent phrase speech; the Pre-Linguistic Autism Diagnostic Observation Schedule (PL-ADOS) was intended for preschool children with little or no expressive language (DiLavore et al., 1995; Lord et al., 1989). Recently they have been combined and extended within a single instrument, the Autism Diagnostic Observation Schedule (Lord, Rutter, DiLavore, & Risi, 1999), formerly called the ADOS-G, with the PL-ADOS comprising most of Module 1, the original ADOS comprising most of Module 3, and the addition of new modules for children with some language but not fluent spontaneous speech (Module 2) and for high-functioning adolescents and adults (Module 4). The new ADOS thus provides the same information as the original ADOS and PL-ADOS for individuals ranging in age and development from nonverbal toddlers to verbally fluent adults of average or higher intelligence.

The ADOS and PL-ADOS were originally developed as companion instruments for the ADI. Their purpose is to provide a series of structured and semi-structured “presses” for social interaction, communication, and play that can be coded immediately following administration (although often videotapes are made as well). They are scored in the context of a diagnostic algorithm for autism. The rationale is that context can have very significant effects on social-communicative behaviors. Consequently, it is important to standardize contexts as well as judgments in any diagnostic observation of these behaviors. Both instruments can be administered by a trained examiner in about 30 to 45 minutes. Training and establishment of reliability with another center is required for research, but not for clinical use. A substantial amount of experience, skill, and practice in working with individuals with
autism or PDD is necessary to use either instrument effectively.

Inter-rater reliability is very good for items and excellent for totals. Internal consistency within domains of social-communication and restricted-repetitive behaviors is excellent (Lord, Risi, et al., 2000); test-retest reliability is adequate. Discriminant validity is excellent for diagnostic algorithms using social-communication scores. In the normative data, within each module, social and communication scores were relatively independent of absolute expressive language level. However, recent studies have found relatively strong effects of level of verbal impairment (e.g., verbal IQ) on communication scores and on social domain scores, particularly with preschool children (Munson, Dawson, Lord, Rogers, & Sigman, in press). The instruments were expanded into the four modules that comprise the ADOS because of varying problems of sensitivity and specificity by age and language level. Diagnostic algorithms for the PL-ADOS were under-inclusive for children with phrase speech, with about 80% accuracy overall for autistic and/or nonautistic mentally handicapped 3- and 4-year-olds. For the original ADOS, the diagnostic algorithm was over-inclusive for children with mental handicap and difficult behaviors and was under-inclusive for very verbal adolescents, with about 87% accuracy comparing autistic to mentally handicapped and behavior-disordered, language impaired children. The design of four different modules has increased the diagnostic accuracy of the ADOS considerably, but nevertheless, it remains over-inclusive with very young (under 30 months), mentally retarded children (Hepburn et al., 2003), and under-inclusive with very mild, verbal adolescents and adults with autistic spectrum disorders (International Molecular Genetics Study of Autism Consortium, 1998; Lord, Risi, et al., 2000).

A recent factor analysis of the original ADOS yielded three factors that accounted for 72% of the variance (Robertson, Tanguay, L’Ecuyer, Sims, & Waltrip, 1999) in a sample of verbally fluent children with ASD. These factors, similar to those found in an analysis by the same authors of ADI data (Tanguay et al., 1998), were Joint Attention, Affective Reciprocity, and Theory of Mind. Theory of Mind scores on the two instruments were correlated, r = .31, but scores on the other factors were not. Together with findings from the related analysis of the ADI-R, these results highlight the importance of considering social development and communication together in the use of these diagnostic instruments.

Like the ADI, the ADOS was not originally intended to measure change, although it may be possible to use the standard behavior samples provided by the ADOS in conjunction with other coding systems as a measure of response to treatment (Owley et al., 2001). As is the case with the ADI, it is hoped that multidimensional scoring of the ADOS may allow for better quantification of nonautism pervasive developmental disorders, most notably PDD-NOS and Asperger’s disorder. Clinically, the ADOS is particularly helpful in providing information concerning social and communicative functioning, which has been collected in a positive but standard context, to parents, therapists, and teachers.

The Psychoeducational Profile-Revised

The Psychoeducational Profile (PEP; Schopler & Reichler, 1979) is a developmental and diagnostic assessment instrument designed specifically for assessing children with ASDs. It was revised in 1990 (PEP-R; Schopler, Reichler, Bashford, Lansing, & Marcus, 1990), is currently under revision once again, and will soon be available as the Psychoeducational Profile-Third Edition (PEP-3). The instrument has been translated into several different languages. The PEP and PEP-R are most appropriate for use with children between the chronological ages of about 3 and 7 years. The normative sample included 420 children between 1 year and 7 years of age. Much of the available published information covers the PEP as well as the PEP-R. Because this is a chapter on diagnosis, only the pathology scales will be reviewed. The pathology scales of the PEP-R are designed to rate the severity of the characteristics of autism in the following areas: Response to Materials (8 items), Language (11 items), Affect & Development of Relationships (12 items), and Sensory Modalities (12 items). On these scales, pathology is rated as “absent,” “mild,” or “severe.”

Both convergent and discriminant validity of the original PEP’s pathology section have been evaluated. Schopler and Reichler (1979) reported a high correlation (r = .80) between pathology scores on the Childhood Autism Rating Scale (CARS) and pathology scores on the PEP. When comparing children with autism with children without autism, children with autism exhibited higher pathology scores on the PEP (Lam & Rao, 1993). Internal consistency of the diagnostic section of the PEP-R pathology subscales has been reported to be good, with Cronbach’s alphas between .84 and .97 (Steerneman, Muris, Merckelbach, & Willems, 1997). Very little information is available on the inter-rater reliability of the pathology subscales of the PEP or the PEP-R. The one study that investigated inter-rater reliability reported a mean kappa score of .69 (Muris, Steerneman, & Ratering, 1997), which is considered adequate.

Much of the information available on the PEP is based on the original version of the measure. Many of the studies are small-scale studies conducted on translations of the instrument. The developmental scores of the PEP-R have been used in outcome studies (Ozonoff & Cathcart, 1998; Panerai, Ferrante, & Caputo, 1997). In research, the pathology subscales are used much less frequently than the developmental subscales, which are most often administered to establish the developmental levels of lower functioning children with ASDs or to evaluate treatment outcomes. The PEP is frequently used in conjunction with the CARS in research studies measuring both diagnostic classification and developmental level.

The Adolescent and Adult Psychoeducational Profile

The Adolescent and Adult Psychoeducational Profile (AAPEP; Mesibov, Schopler, & Caison, 1989) is an extension of the PEP and was also developed by Division TEACCH. Like the PEP, the AAPEP is designed to assess individuals with ASDs for the purpose of developing individualized treatment goals and recommendations. The AAPEP is a criterion-referenced test and targets individuals over 12 years of age with moderate to severe mental retardation. As a result, the targeted areas focus on concerns that often appear as adulthood approaches and include matters such as semi-independent functioning and psychopathology in the community.

The AAPEP incorporates three separate scales: a direct observation scale and two interview sections (a home scale and school/work scale). Each scale includes six functioning areas: vocational skills, independent functioning, leisure skills, vocational behavior, functional communication, and interpersonal behavior. Little information is available on the validity and reliability of this instrument. Inter-rater reliability was evaluated by calculating the percent agreement between two independent raters and was determined to be sufficient (with the exception of Interpersonal Behaviors on the Direct Observation scale; r = .68), ranging from r = .74 to r = .95 (Mesibov, Schopler, & Caison, 1989; Mesibov, Schopler, Schaffer, & Landrus, 1988). There has been little research using the AAPEP; however, one study evaluated progress in adults with ASDs who were living in a group home setting (Persson, 2000).

There are few scales available for measuring functional behaviors and skills in adults with autism. The AAPEP is not intended for diagnostic purposes and focuses primarily on the assessment of skills required for independent living. The best application of the AAPEP is for identifying target areas for intervention or skill building.

RELATED DIAGNOSTIC AND BEHAVIORAL ASSESSMENT INSTRUMENTS

The Communication and Symbolic Behavior Scales, Developmental Profile

The Communication and Symbolic Behavior Scales, Developmental Profile (CSBS DP) is a standardized instrument designed for screening and evaluating communication and symbolic abilities in young children between the ages of 6 and 24 months (Wetherby & Prizant, 2002). The published version is based on an earlier version designed specifically for research pur-
poses (Wetherby & Prizant, 1993). There are three separate parts of the CSBS DP, including a screening instrument (CSBS DP Infant and Toddler Checklist) and two follow-up assessment tools: a parent questionnaire (CSBS DP Caregiver Questionnaire) and a direct observation section (CSBS DP Behavior Sample). The purposes of the CSBS DP are screening and identifying children at risk for language and developmental delays, not specifically autism, as well as assessment and identification of delays in social communication, expressive language, and symbolic abilities. The CSBS DP also provides an opportunity for documentation of progress over time. The CSBS DP consists of seven cluster areas (Emotion and Eye Gaze, Communication, Gestures, Sounds, Words, Understanding, Object Use) that are included in one of three composites (Social Communication, Expressive Speech & Language, and Symbolic Abilities).

Although the CSBS was not specifically designed to screen or evaluate young children with ASDs, there is evidence that information gathered from the Behavior Sample may have some value in screening for ASDs. Wetherby et al. (in press) compared children with ASDs, children with developmental delays, and children who were typically developing using the Systematic Observation of Red Flags (SORF) for Autism Spectrum Disorders in Young Children (Wetherby et al., in press), which is based on the Behavior Sample of the CSBS. The SORF includes 29 items from both the diagnostic criteria and research on ASDs in young children. It covers five composite areas and includes reciprocal social interaction, unconventional gestures, unconventional sounds and words, repetitive behaviors and restricted interests, and emotional regulation (Wetherby et al., in press). Inter-rater reliability for the SORF was high (89.7% to 100% agreement across children and 83% to 100% across items).

of the typical group were correctly predicted (Wetherby et al., in press). At this point, there are no cut-offs suggested, but children who demonstrate most of the 15 red flags should be referred for further evaluation. More investigation using the SORF and CSBS behavior sample with children with ASDs is warranted as it shows promise as a valuable screening tool.

The Children’s Communication Checklist

The Children’s Communication Checklist (CCC) was developed by Dorothy Bishop (1998) to assess pragmatic difficulties within the speech and language impaired population. Although there are several standardized tests available for assessing language form, such as syntax and phonology, adequate standardized assessment instruments for assessing pragmatic difficulties are very rare. The CCC is designed to be completed by a professional, such as a teacher or a speech and language therapist, who knows the child well (Bishop, 1998; Bishop & Baird, 2001). It consists of five scales assessing pragmatic aspects of speech: inappropriate initiation, coherence, stereotyped language, use of context, and rapport. In addition, it includes two item sets designed to assess other aspects of speech and language (speech production and syntactic complexity), as well as two item sets intended to assess nonlanguage features of autistic spectrum disorders (social relationships and interests). Each behavior is described, and the rater is asked to indicate if it “definitely applies,” “applies somewhat,” “does not apply,” or if they are “unable to judge.” Most of the data available on the CCC is based on children between the ages of 5 and 17 years.

There is debate in the literature as to whether there is a pure group of children with pragmatic difficulties and the extent to which they overlap with children with ASDs (Bishop, 1998; Botting & Conti-Ramsden, 1999). At
The SORF shows promise as a screening in- least a subset of children with pragmatic lan-
strument, with sensitivity, specificity, positive guage difficulties also meet criteria for an
and negative predictive values all over 80% autistic spectrum disorder (Botting & Conti-
(Wetherby et al., in press), based on the Behav- Ramsden, 1999). This has led some individuals
ior Sample. A discriminant function analysis in- to hypothesize that Autistic Spectrum Disor-
dicated that when 15 red flags were considered, ders and Pragmatic Disorder may be related
100% of the children in the ASD group, 83% of both in symptoms and etiology (Bishop, 1998).
the developmentally delayed group, and 100% Although individuals with receptive-expressive
language disorders generally tend to function better than adults with autism, when adults
who were diagnosed with receptive-expressive impairments as children are assessed with the
ADI-R and ADOS, there is considerable overlap in adult diagnosis. In one study, 60% of the
developmental language disorder group was misclassified as autistic on at least one variable
(social functioning, independence, or ritualistic/stereotyped behavior) and 33% of the
autistic adults were misclassified as language impaired (Howlin et al., 2000). Other studies
have found that when a score of less than 132 on the CCC (lower scores indicating greater
pragmatic difficulties) is used as a cut-off, children with autism, semantic-pragmatic
language impairments, or semantic-pragmatic language impairments plus autistic
characteristics have lower scores than a speech language impaired group (Bishop, 1998).
Another study found using this same cut-off, children with autism had lower scores than a
learning disabilities group (Botting & Conti-Ramsden, 1999). The purpose of the CCC, however,
is not to differentiate children with a language disorder from the general population, but
rather to differentiate pragmatic difficulties from other aspects of language disorder within
the language impaired population (Bishop, 1998; Bishop & Baird, 2001).

Validity of the instrument was evaluated by comparing scores on the CCC between three
different diagnostic categories (Semantic-pragmatic pure: did not have autistic symptoms;
Semantic-pragmatic plus: did have autistic symptoms or an autistic disorder; Other speech and
language impairment: without pragmatic difficulties or autistic characteristics), based on
school system classifications (Bishop, 1998). Based on this study, children with a composite
score lower than 132 were more likely to be in the semantic-pragmatic pure or the
semantic-pragmatic plus (pragmatic disorder plus some autistic characteristics) groups and
those children with scores higher than 132 were more likely to be in the other speech and
language impaired group (Bishop, 1998). Interestingly, parent ratings on the CCC relate more
clearly to the child’s diagnostic status than do ratings by teachers. The authors recommend
combining parent and professional report to obtain the most accurate information.

INSTRUMENTS FOR ASPERGER’S DISORDER

The Asperger’s Syndrome (and High-Functioning Autism) Diagnostic Interview

The Asperger’s Syndrome (and High-Functioning Autism) Diagnostic Interview (ASDI) was
developed as a diagnostic tool specifically tailored for verbally fluent autism and
Asperger’s Disorder (Gillberg, Gillberg, Rastam, & Wentz, 2001). The interview is based on
Gillberg’s diagnostic criteria for Asperger’s Disorder, and includes 20 items that
operationalize six criteria (Social, Interests, Routines, Verbal and Speech, Communication,
and Motor). The ASDI is a structured interview that is administered to a person who knows the
subject of the interview quite well, and has some knowledge of the subject’s childhood. Each
question is rated on a three-point scale. The interviewer is instructed to obtain details on
actual behaviors to accurately code each item.

Initial reliability studies were conducted on a group of 20 individuals between 6 and 55
years of age. Inter-rater reliability was investigated and results indicated exact agreement
for 96% of the ratings (383 out of 400 ratings), resulting in a kappa of .91. Test-retest
reliability was also investigated, and complete agreement was achieved for 97% of the ratings
(465 out of 480), resulting in a kappa of .92.

Validity was assessed by comparing algorithm item scores with a clinical diagnosis made by
two independent neuropsychiatrists or neuropsychologists familiar with ASDs. All of the
subjects who received a clinical diagnosis of Asperger’s Disorder or Atypical Autism (n = 13)
met five or six of the algorithm criteria for Asperger’s Disorder on at least one of the
ratings. Of the remaining 11 individuals who were not diagnosed with Asperger’s Disorder,
only one met five criteria. The authors acknowledge that many of the individuals who met
algorithm criteria for Asperger’s Disorder would also meet DSM-IV criteria for autism.
This instrument is in the preliminary stages
and further investigation is warranted prior to using it as a diagnostic instrument.

The Australian Scale for Asperger’s Syndrome

The Australian Scale for Asperger’s Syndrome (ASAS) was developed by Garnett and Attwood and
published in Attwood’s book, Asperger’s Syndrome: A guide for parents, professionals, people
with Asperger’s Syndrome and their partners (Attwood, 1997), as well as on a web site
(http://www.tonyattwood.co.uk). Although there are no peer-reviewed published papers on this
instrument, it is widely used by educational systems and parents, in large part because of
the accessibility and popularity of the book and web site.

The ASAS covers five areas, which (as the developer of the instrument himself states)
“loosely correspond to the five broad categories of behavior identified by other researchers
to identify Asperger’s Syndrome.” These include social and emotional difficulties, cognitive
skills deficits, communication skills deficits, specific interests, and motor clumsiness. The
authors also indicate that there are at least two questions that are not based on current
diagnostic criteria, because their clinical observations differed from what was reported in
the literature. The instrument includes 19 items and is scored on a 7-point scale ranging
from “rarely” (0) to “frequently” (6). Each item describes a behavior that the parent or
teacher is asked to rate, followed by an example of that behavior.

A non-peer-reviewed study designed to evaluate the validity of the instrument in diagnosing
Asperger’s Disorder is available on Tony Attwood’s web site. The study included children and
adolescents between 3 and 19 years of age in three groups: a group of individuals referred to
a clinic for Asperger’s Disorder but not diagnosed with Asperger’s Disorder, a group of
individuals referred to a clinic and diagnosed with Asperger’s Disorder, and a typical
control group.

There are several concerns regarding the methods and design of this study. The ASAS was
administered as an interview by a clinician, which is not the manner in which the instrument
is typically completed. It is intended to be completed as a questionnaire by a parent,
teacher, or professional. Not only was it administered as an interview, but the interviewer
was not blind to diagnosis. In addition, the clinical assessment consisted of “an
unstructured clinical examination to decide whether they had AS.” The assessment included a
parent interview, an assessment with the child, a record review, and a diagnostic checklist.
There were no standardized instruments used in the assessment. The relationship between the
examiners in the Asperger’s Clinic who made the diagnosis and the authors of the instrument
was also unclear.

Based on a stepwise discriminant function analysis, accuracy for the predicted membership of
the Asperger’s Disorder group was 90%, and accuracy for the non-Asperger’s group was 65%.
Given its high sensitivity and low specificity, the authors recommend using this instrument
as a screener, rather than as a diagnostic instrument at this time. They also caution against
using the instrument clinically, given the lack of data on the reliability and validity of
the instrument. Clearly, considering these results and the lack of carefully controlled
studies, it is difficult to interpret results from the ASAS at this time.

Measuring Change in Core Behaviors

Investigators have often attempted to use diagnostic instruments in order to measure change
in response to treatment. On the whole, this has not been very successful. This is partly due
to the fact that most diagnostic instruments were designed to include a wide range of
deficits associated with ASDs, and so they are not sufficiently sensitive to changes within
an individual. In addition, expectations and contexts for behavior, especially for young
children, frequently change with time (Lord et al., 2001; Volkmar & Lord, 1998). Although a
child may be showing substantial improvement and acquiring specific behaviors, this
improvement may not be measurable if the comparison is to the quality of interaction seen in
typical children. On the other hand, for treatments that claim that they result in complete
recovery, changes should be observable even in standard diagnostic instruments.

There are a number of well-known instruments that measure behaviors that are not specific to
autism but that are frequently found in association with it. These measures have often been
used in psychopharmacology research. The most prominent one is the Aberrant Behavior
Checklist (ABC; Aman, 1994; Aman & Singh, 1986; Arnold et al., 2000). The Autism Behavior
Checklist (also known as the ABC; Krug et al., 1980a), although less appropriate as a
diagnostic instrument, has also been helpful in indicating the degree of overtly abnormal or
impairing behaviors produced, particularly by those children who are both autistic and
mentally handicapped. The Children’s Global Assessment Scale (Shaffer et al., 1983) gives a
general measure of impairment, which may be helpful for some investigators. In addition, the
Maladaptive Behavior Scale from the Vineland Adaptive Behavior Scale (Perry & Factor, 1989)
provides counts of particular maladaptive behaviors. The Real Life Rating Scale (Freeman
et al., 1986) has also been used for this aim. On the whole, most of these scales were not
designed for diagnosis or measuring change and do not have psychometric data to support this
particular use. The exception is the Aberrant Behavior Checklist. Recently, several
investigators have begun to use the ADOS either as a measure or as a context in which to
measure treatment responsiveness. In our own research, we see more quantifiable changes if we
re-administer identical items over extended time periods (several years) on the direct
observation schedules (e.g., ADOS), even given the variability that this entails, than we do
in parent reports, because of the very broad focus of the ADI. Time will tell if the ADOS has
a sufficient range of presses and contexts to be useful in this way (Owley et al., 2001).

CONCLUSIONS

Overall, there is a wealth of information and options for the diagnosis of autism, but there
is still much to be done to make our techniques stronger and broader in scope. There will
always be trade-offs between acquiring the maximum amount of meaningful information and
highest validity versus being able to reliably code and make decisions about information.
Users of diagnostic instruments should be aware of the needs of their particular situation
and population in order to make the most informed choice of instruments. In general, higher
standards in terms of limiting the amount of information given to the user of an instrument
tested (e.g., keeping examiners “blind” to diagnosis, attempting to use instruments with
parents who have not yet received a diagnosis), and including measurements of test-retest
reliability and appropriate analysis of reliability statistics, will aid in the
interpretability of the instruments. Clear descriptions of exactly how instruments were used
and are intended to be used, including cut-offs if categorical use is implied, are also
critical.

It seems particularly important to recognize that there are a variety of needs having to do
with formal diagnosis that may not be met by a single instrument. Screening of large
populations for possible autism is most likely to occur with very young children and needs to
be coordinated with developmental screening, because delays in language are inherently
entwined with the recognition of autism in many children (see Stone, this volume). After a
child has been identified as possibly having an ASD, procedures for early diagnosis may be
rather different than screening methods. Diagnostic procedures will involve fewer children
than screening and should have closer links to individual education and treatment plans, as
well as outline possible multiaxial diagnoses.

For research purposes, there is a need for lifetime diagnoses and standard procedures that
presumably yield the same final interpretation (though not necessarily the same raw data) for
the same individual at multiple points in his or her life. In contrast, there is also a need
for measurement of change. It seems very unlikely that any one instrument will accomplish all
of these objectives. However, for each of these needs, there are promising candidates.
Ensuring that the relationship between various instruments and goals is well understood will
also increase the usefulness of the endeavor. Recognizing that other factors, particularly
level of development and language skill, have marked effects on most measurements in autism
and pervasive developmental disorder is an important step in considering the meaning of any
clinical or research result.
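
The reliability statistics referred to in this chapter (percent agreement and kappa) can be
illustrated with a short calculation. The sketch below is an illustration only, not a
procedure taken from any of the instruments reviewed: it assumes two raters who each assign
one categorical code per item, and it computes unweighted Cohen's kappa, whereas the studies
cited may have used a weighted or otherwise different variant. The function names and the
ratings are hypothetical. For reference, the ASDI inter-rater figure quoted earlier, 383 of
400 ratings in exact agreement, corresponds to 95.75%, which is reported as 96%.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Proportion of items on which two raters gave the identical code."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement: for each code, multiply the two raters' marginal
    # proportions for that code, then sum over codes.
    p_e = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical code for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codes on a three-point scale (0, 1, 2); not data from any study.
rater_a = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
rater_b = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
# prints: agreement = 0.90, kappa = 0.85
```

Kappa is lower than raw agreement here because some agreement is expected by chance alone,
which is one reason percent agreement by itself can overstate how reliable an instrument is.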

Finally, there is a great need for the extension of the current instruments to diagnosis of
disorders other than autism in the autism spectrum. Part of the difficulty, as discussed in
later chapters, is that the definitions and discriminations from autism of these disorders
are not yet as clear as we would like. However, reliable ways of formally substantiating
diagnoses such as PDD-NOS, Asperger’s Disorder, and atypical autism are needed so that
researchers and clinicians can make informed decisions about the usefulness of these
concepts. Various instruments have been proposed to study these disorders, but at this point,
they have little relationship to each other and have not been found to be reliable.
Consequently, they offer limited scientific usefulness. A priority for researchers is to work
together to derive operationalized definitions and specific proposals for how their
approaches add to or fit in with those of other researchers. In the meantime, clinicians must
be careful to be informed about the kind of information a particular instrument provides and
to consider the implications for the appropriateness of that information to their immediate
clinical needs.

Cross-References

Issues in the diagnosis of autism and related conditions are discussed in Chapters 1 to 7.
Other aspects of assessment are reviewed in Chapter 27 and in Chapters 29 to 33.

REFERENCES

Achenbach, T. M. (1981). Childhood Behavior Checklist. Burlington: University of Vermont, Department of Psychiatry.
Achenbach, T. M., & Rescorla, L. (2000). Manual for the ASEBA forms and profiles. Burlington: University of Vermont, Center for Children, Youth and Families.
Adrien, J. L., Perrot, A., Sauvage, D., Leddet, I., Larmande, C., Hameury, L., et al. (1992). Early symptoms in autism from family home movies: Evaluation and comparison between 1st and 2nd year of life using I.B.S.E. Scale. Acta Paedopsychiatrica: International Journal of Child and Adolescent Psychiatry, 55(2), 71–75.
Alarcón, M., Cantor, R. M., Liu, J., Gilliam, T. C., Geschwind, D. H., & Autism Genetic Research Exchange Consortium. (2002). Evidence for a language quantitative trait locus on chromosome 7q in multiplex autism families. American Journal of Human Genetics, 70(1), 60–71.
Aman, M. G. (1994). Instruments for assessing treatment effects in developmentally disabled populations. Assessment in Rehabilitation and Exceptionality, 1, 1–20.
Aman, M. G., & Singh, N. N. (1986). Aberrant Behavior Checklist: Manual. East Aurora, NY: Slosson Educational Publications.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental disorders (4th ed.). Washington, DC: Author.
Arnold, L. E., Aman, M. G., Martin, A., Collier-Crespin, A., Vitiello, B., Tierney, E., et al. (2000). Assessment in multisite randomized clinical trials of patients with autistic disorder: The Autism RUPP Network. Journal of Autism and Developmental Disorders, 30(2), 99–111.
Asano, E., Chugani, D. C., Muzik, O., Behen, M., Janisse, J., Rothermel, R., et al. (2001). Autism in tuberous sclerosis complex is related to both cortical and subcortical dysfunction. Neurology, 57(7), 1269–1277.
Attwood, T. (1997). Asperger’s syndrome: A guide for parents and professionals. London: Jessica Kingsley.
Baron-Cohen, S., Allen, J., & Gillberg, C. (1992). Can autism be detected at 18 months? The needle, the haystack, and the CHAT. British Journal of Psychiatry, 161(1), 839–843.
Baron-Cohen, S., Cox, A., Baird, G., Swettenham, J., Nightingale, N., Morgan, K., et al. (1996). Psychological markers in the detection of autism in infancy in a large population. British Journal of Psychiatry, 168(2), 158–163.
Barthelemy, C., Adrien, J. L., Tanguay, P. E., Garreau, B., Fermanian, J., Roux, S., et al. (1990). The Behavioral Summarized Evaluation: Validity and reliability of a scale for the assessment of autistic behaviors. Journal of Autism and Developmental Disorders, 20(2), 189–204.
Barthelemy, C., Roux, S., Adrien, J. L., Hameury, L., Guerin, P., Garreau, B., et al. (1997). Validation of the Revised Behavior Summarized Evaluation Scale. Journal of Autism and Developmental Disorders, 27(2), 139–153.
Bebko, J. M., Konstantareas, M. M., & Springer, J. (1987). Parent and professional evaluations of family stress associated with characteristics of autism. Journal of Autism and Developmental Disorders, 17(4), 565–576.
Berument, S. K., Rutter, M., Lord, C., Pickles, A., & Bailey, A. (1999). Autism screening questionnaire: Diagnostic validity. British Journal of Psychiatry, 175, 444–451.
Bird, H. R., Gould, M. S., & Staghezza, B. (1992). Aggregating data from multiple informants in
child psychiatry epidemiological research. Journal of the American Academy of Child and Adolescent Psychiatry, 31(1), 78–85.
Bishop, D. V. M. (1998). Development of the Children’s Communication Checklist (CCC): A method for assessing qualitative aspects of communicative impairment in children. Journal of Child Psychology and Psychiatry and Allied Disciplines, 39(6), 879–891.
Bishop, D. V. M., & Baird, G. (2001). Parent and teacher report of pragmatic aspects of communication: Use of the Children’s Communication Checklist in a clinical setting. Developmental Medicine and Child Neurology, 43(12), 809–818.
Bishop, D. V. M., & Norbury, C. F. (2002). Exploring the borderlands of autistic disorder and specific language impairment: A study using standardized diagnostic instruments. Journal of Child Psychology and Psychiatry and Allied Disciplines, 43(7), 917–929.
Boiron, M., Barthelemy, C., Adrien, J. L., Martineau, J., & Lelord, G. (1992). The assessment of psychophysiological dysfunction in children using the BSE scale before and during therapy. Acta Paedopsychiatrica: International Journal of Child and Adolescent Psychiatry, 55(4), 203–206.
Botting, N., & Conti-Ramsden, G. (1999). Pragmatic language impairment without autism: The children in question. Autism, 3(4), 371–396.
Bradford, Y., Haines, J., Hutcheson, H., Gardiner, M., Braun, T., Sheffield, V., et al. (2001). Incorporating language phenotypes strengthens evidence of linkage to autism. American Journal of Medical Genetics, 105(6), 539–547.
Buxbaum, J. D., Silverman, J. M., Smith, C. J., Kilifarski, M., Reichert, J., Hollander, E., et al. (2001). Evidence for a susceptibility gene for autism on chromosome 2 and for genetic heterogeneity. American Journal of Human Genetics, 68(6), 1514–1520.
Chakrabarti, S., & Fombonne, E. (2001). Pervasive developmental disorders in preschool children. Journal of the American Medical Association [Special issue], 285(24), 3093–3099.
Charman, T., Swettenham, J., Baron-Cohen, S., Cox, A., Baird, G., & Drew, A. (1998). An experimental investigation of social-cognitive abilities in infants with autism: Clinical implications. Infant Mental Health Journal, 19(2), 260–275.
Cicchetti, D. V., & Sparrow, S. S. (1981). Developing criteria for establishing inter-rater reliability of specific items: Applications to assessment of adaptive behavior. American Journal of Mental Deficiency, 86(2), 127–137.
Cohen, D. J., Caparulo, B. K., Gold, J. R., Waldo, M. C., Shaywitz, B. A., Ruttenberg, B. A., et al. (1978). Agreement in diagnosis: Clinical assessment and behavior rating scales for pervasively disturbed children. Journal of the American Academy of Child Psychiatry, 17(4), 589–603.
Constantino, J. N. (2002). The Social Responsiveness Scale. Los Angeles: Western Psychological Services.
Constantino, J. N., Gruber, C. P., Davis, S., Hays, S., Passante, N., & Przybeck, T. (2004). The factor structure of autistic traits. Journal of Child Psychology and Psychiatry, 45(4), 719–726.
Constantino, J. N., Przybeck, T., Friesen, D., & Todd, R. D. (2000). Reciprocal social behavior in children with and without pervasive developmental disorders. Journal of Developmental and Behavioral Pediatrics, 21(1), 2–11.
Constantino, J. N., & Todd, R. D. (2000). Genetic structure of reciprocal social behavior. American Journal of Psychiatry, 157(12), 2043–2044.
Constantino, J. N., & Todd, R. D. (2003). Autistic traits in the general population: A twin study. Archives of General Psychiatry, 60(5), 524–530.
Cox, A., Klein, K., Charman, T., Baird, G., Baron-Cohen, S., Swettenham, J., et al. (1999). Autism spectrum disorders at 20 and 42 months of age: Stability of clinical and ADI-R diagnosis. Journal of Child Psychology and Psychiatry and Allied Disciplines, 40(5), 719–732.
Cuccaro, M. L., Shao, Y. J., Bass, M. P., Abramson, R. K., Ravan, S. A., Wright, H. H., et al. (2003). Behavioral comparisons in autistic individuals from multiplex and singleton families. Journal of Autism and Developmental Disorders, 33(1), 87–91.
Cuccaro, M. L., Shao, Y., Grubber, J., Slifer, M., Wolpert, C. M., Donnelly, S. L., et al. (2003). Factor analysis of restricted and repetitive behaviors in autism using the Autism Diagnostic Interview-R. Child Psychiatry and Human Development, 34(1), 3–17.
Davids, A. (1975). Childhood psychosis: The problem of differential diagnosis. Journal of Autism and Childhood Schizophrenia, 5(2), 129–138.
Dawson, G., Webb, S., Schellenberg, G. D., Dager, S., Friedman, S., Aylward, E., et al. (2002). Defining the broader phenotype of autism: Genetic, brain, and behavioral perspectives. Development and Psychopathology, 14(3), 581–611.
de Bildt, A., Sytema, S., Ketelaars, C., Kraijer, D., Mulder, E., Volkmar, F., et al. (2004). Interrelationship between autism diagnostic observation schedule-generic (ADOS-G), autism diagnostic interview-revised (ADI-R), and the diagnostic and statistical manual of mental disorders (DSM-IV-TR) classification in children and adolescents with mental retardation. Journal of Autism and Developmental Disorders, 34(2), 129–137.
DiLavore, P., Lord, C., & Rutter, M. (1995). Pre-Linguistic Autism Diagnostic Observation Schedule (PLADOS). Journal of Autism and Developmental Disorders, 25(4), 355–379.
Doll, E. A. (1965). Vineland Social Maturity Scale. Circle Pines, MN: American Guidance Service.
Eaves, R. C. (1990). The factor structure of autistic behavior. Paper presented at the annual Alabama Conference on Autism, Birmingham.
Eaves, R. C., Campbell, H. A., & Chambers, D. (2000). Criterion-related and construct validity of the Pervasive Developmental Disorders Rating Scale and the Autism Behavior Checklist. Psychology in the Schools, 37(4), 311–321.
Eaves, R. C., & Hooper, J. (1987). A factor analysis of psychotic behavior. Journal of Special Education, 21(4), 122–132.
Folstein, S. E., & Mankoski, R. E. (2000). Chromosome 7q: Where autism meets language disorder? American Journal of Human Genetics, 67(2), 278–281.
Fombonne, E. (1992). Diagnostic assessment in a sample of autistic and developmentally impaired adolescents. Journal of Autism and Developmental Disorders, 22(4), 563–581.
Freeman, N. L., Perry, A., & Factor, D. C. (1991). Child behavior as stressors: Replicating and extending the use of the CARS as a measure of stress: A research note. Journal of Child Psychology and Psychiatry and Allied Disciplines, 32(6), 1025–1030.
Freeman, B. J., Ritvo, E. R., Guthrie, D., Schroth, P., & Ball, J. (1978). The Behavior Observation Scale for Autism: Initial methodology, data analysis, and preliminary findings on 89 children. Journal of the American Academy of Child Psychiatry, 17(4), 576–588.
Freeman, B. J., Ritvo, E. R., Yokota, A., & Ritvo, A. (1986). A scale for rating symptoms of patients with the syndrome of autism in real life settings. Journal of the American Academy of Child Psychiatry, 25(1), 130–136.
Freitag, C. M. (2002). Phenotypic characteristics of siblings with autism and/or pervasive developmental disorder: Evidence for heterogeneity. American Journal of Medical Genetics, 114(7), 31.
Garfin, D. G., McCallon, D., & Cox, R. (1988). Validity and reliability of the Childhood Autism Rating Scale with autistic adolescents. Journal of Autism and Developmental Disorders, 18(3), 367–378.
Ghaziuddin, M., Tsai, L., & Ghaziuddin, N. (1992). Brief report: A comparison of the diagnostic criteria for Asperger’s Syndrome. Journal of Autism and Developmental Disorders, 22(4), 643–649.
Gillberg, C., Gillberg, C., Rastam, M., & Wentz, E. (2001). The Asperger Syndrome (and high-functioning autism) Diagnostic Interview (ASDI): A preliminary study of a new structured clinical interview [Special issue]. Autism, 5(1), 57–66.
Gilliam, J. E. (1995). Gilliam Autism Rating Scale. Austin, TX: ProEd.
Grinager, A. N., Cox, N. J., & Yairi, E. (1997). The genetic basis of persistence and recovery in stuttering. Journal of Speech and Hearing Research, 40(3), 567–580.
Happé, F. G. E. (1995). The role of age and verbal ability in the theory of mind task performance of subjects with autism. Child Development, 66(3), 843–855.
Hepburn, S., John, A., Lord, C., & Rogers, S. (2003). Sensitivity and specificity of the Autism Diagnostic Observation Schedule in young children. Manuscript in preparation.
Hertzig, M. E., Snow, M. E., New, E., & Shapiro, T. (1990). DSM-III and DSM-III-R diagnosis of autism and pervasive developmental disorder in nursery school children. Journal of the American Academy of Child and Adolescent Psychiatry, 29(1), 195–199.
Hobson, R. P. (1991). Methodological issues for experiments on autistic individuals’ perception and understanding of emotion. Journal of the American Academy of Child and Adolescent Psychiatry, 32(7), 1135–1158.
Holroyd, S., & Baron-Cohen, S. (1993). Brief report: How far can people with autism go in developing a theory of mind? Journal of Autism and Developmental Disorders, 23(2), 379–385.
Howlin, P., Mawhood, L., & Rutter, M. (2000). Autism and developmental receptive language disorder: A follow-up comparison in early adult life: Part II. Social, behavioural, and psychiatric outcomes. Journal of Child Psychology and Psychiatry and Allied Disciplines, 41(5), 561–578.
International Molecular Genetic Study of Autism Consortium. (1998). A full genome screen for autism with evidence for linkage to a region on chromosome 7q. Human Molecular Genetics, 7(3), 571–578.
Kanner, L. (1943). Autistic disturbances of affective contact. Nervous Child, 2, 217–250.
Klin, A., Pauls, D. L., Schultz, R., & Volkmar, F. R. (in press). Three diagnostic approaches to Asperger’s syndrome: Implications for research. Journal of Autism and Developmental Disorders.
Klin, A., Sparrow, S. S., Marans, W. D., Carter, A., & Volkmar, F. R. (2000). Assessment issues in
766 Assessment

children and adolescents with Asperger syn- Le Couteur, A., Rutter, M., Lord, C., Rios, P.,
drome. In A. Klin, F. R. Volkmar, & S. S. Spar- Robertson, S., Holdgrafer, M., et al. (1989).
row (Eds.), Asperger syndrome (pp. 309–339). Autism Diagnostic Interview: A standardized
New York: Guilford Press. investigator-based instrument. Journal of Autism
Klin, A., Volkmar, F. R., Sparrow, S. S., Cicchetti, and Developmental Disorders, 19(3), 363–387.
D. V., & Rourke, B. P. (1995). Validity and Leekam, S. R., Libby, S. J., Wing, L., Gould, J., &
neuropsychological characterization of As- Gillberg, C. (2000). Comparison of ICD-10
perger syndrome: Convergence with nonverbal and Gillberg’s criteria for Asperger syndrome
learning disabilities syndrome. Journal of [Special issue: Asperger syndrome]. Autism,
Child Psychology and Psychiatry and Allied 4(1), 11–28.
Disciplines, 36(7), 1127–1140. Leekam, S. R., Libby, S. J., Wing, L., Gould, J., &
Kolevzon, A., Smith, C. J., Schmeidler, J., Buxbaum, Taylor, C. (2002). The Diagnostic Interview
J. D., & Silverman, J. M. (2004). Familial for Social and Communication Disorders: Al-
symptom domains in monozygotic siblings with gorithms for ICD-10 childhood autism and
autism. American Journal of Medical Genetics Wing and Gould autistic spectrum disorder.
Part B: Neuropsychiatric Genetics, 129B, 76–81. Journal of Child Psychology and Psychiatry
Konstantareas, M. M., & Homatidis, S. (1989). As- and Allied Disciplines, 43(3), 327–342.
sessing child symptom severity and stress in Lord, C. (1990). A cognitive-behavioral model for
parents of autistic children. Journal of Child the treatment of social-communicative deficits
Psychology and Psychiatry and Allied Disci- in adolescents with autism. In R. J. McMahon
plines, 30(3), 459–470. & R. D. Peters (Eds.), Behavior disorders of
Kraemer, H. C. (1992). Measurement of reliability adolescence: Research, intervention and policy
for categorical data in medical research. Statistical in clinical and school settings (pp. 155–174).
Methods in Medical Research, 1(2), 183–199. New York: Plenum Press.
Krug, D. A., Arick, J. R., & Almond, P. J. (1980a). Lord, C. (1995). Follow-up of two-year-olds re-
Autism screening instrument for educational ferred for possible autism. Journal of Child
planning. Portland, OR: ASIEP Educational. Psychology and Psychiatry, 36(8), 1365–1382.
Krug, D. A., Arick, J. R., & Almond, P. J. (1980b). Lord, C. (1996). Treatment of a high-functioning
Behavior checklist for identifying severely adolescent with autism: A cognitive-behav-
handicapped individuals with high levels of ioral approach. Cognitive therapy with children
autistic behavior. Journal of Child Psychology and adolescents: A casebook for clinical prac-
and Psychiatry and Allied Disciplines, 21(3), tice (pp. 394–404). New York: Guilford Press.
221–229. Lord, C., & Bailey, A. (2002). Autism spectrum
Krug, D. A., Arick, J. R., & Almond, P. J. (1993). disorders. In M. Rutter & E. Taylor (Eds.),
Autism screening instrument for educational Child and adolescent psychiatry (4th ed.,
planning (2nd ed.). Austin, TX: ProEd. pp. 636–663). Oxford, England: Blackwell.
Kurita, H., Kita, M., & Miyake, Y. (1992). A com- Lord, C., Cook, E. H., Leventhal, B. L., & Amaral,
parative study of development and symptoms D. G. (2000). Autism spectrum disorders. Neu-
among disintegrative psychosis and infantile ron, 28(2), 355–363.
autism with and without speech loss. Journal Lord, C., Leventhal, B. L., & Cook, E. H., Jr.
of Autism and Developmental Disorders, 22(2), (2001). Quantifying the phenotype in autism
175–188. spectrum disorders. American Journal of Med-
Lam, M. K., & Rao, N. (1993). Developing a Chi- ical Genetics, 105(1), 36–38.
nese version of the Psychoeducational Profile Lord, C., Pickles, A., DiLavore, P. C., & Shulman, C.
(CPEP) to assess autistic children in Hong (1996). Longitudinal studies of young children
Kong. Journal of Autism and Developmental referred for possible autism. Paper presented at
Disorders, 23(2), 273–279. the biannual meeting of the International Soci-
Le Couteur, A., Bailey, A., Goode, S., Pickles, A., ety for Research in Child and Adolescent
Robertson, S., Gottesman, I., et al. (1996). A Psychopathology, Los Angeles.
broader phenotype of autism: The clinical Lord, C., Risi, S., Lambrecht, L., Cook, E. H., Jr.,
spectrum in twins. Journal of Child Psychol- Leventhal, B. L., DiLavore, P. C., et al. (2000).
ogy and Psychiatry and Allied Disciplines, The Autism Diagnostic Observation Schedule-
37(7), 785–801. Generic: A standard measure of social and
Le Couteur, A., Lord, C., & Rutter, M. (2003). The communication deficits associated with the
Autism Diagnostic Interview: Revised (ADI-R). spectrum of autism. Journal of Autism and De-
Los Angeles: Western Psychological Services. velopmental Disorders, 30(3), 205–223.
Lord, C., Rutter, M. L., DiLavore, P. C., & Risi, S. of Autism and Developmental Disorders, 19(1),
(1999). Autism Diagnostic Observation Sched- 33–40.
ule—WPS (WPS ed.). Los Angeles: Western Mesibov, G. B., Schopler, E., Schaffer, B., &
Psychological Services. Landrus, R. (1988). Adolescent and Adult Psy-
Lord, C., Rutter, M. L., Goode, S., Heemsbergen, choeducational Profile (AAPEP): Volume IV.
J., Jordan, H., Mawhood, L., et al. (1989). Austin, TX: ProEd.
Autism Diagnostic Observation Schedule: A Mesibov, G. B., Schopler, E., Schaffer, B., &
standardized observation of communicative Michal, N. (1989). Use of the Childhood
and social behavior. Journal of Autism and De- Autism Rating Scale with autistic adolescents
velopmental Disorders, 19(2), 185–212. and adults. Journal of the American Academy
Lord, C., Rutter, M. L., & Le Couteur, A. (1994). of Child and Adolescent Psychiatry, 28(4),
The Autism Diagnostic Interview—Revised: A 538–541.
revised version of a diagnostic interview for Miller, J. N., & Ozonoff, S. (1997). Did Asperger’s
caregivers of individuals with possible perva- cases have Asperger disorder? Journal of Child
sive developmental disorders. Journal of Autism Psychology and Psychiatry and Allied Disci-
and Developmental Disorders, 24(5), 659–685. plines, 38(2), 247–251.
Lord, C., Storoschuk, S., Rutter, M., & Pickles, A. Minshew, N. J., & Goldstein, G. (1993). Is autism
(1993). Using the ADI–R to diagnose autism in an amnesic disorder? Evidence from the Cali-
preschool children. Infant Mental Health Jour- fornia Verbal Learning Test. Neuropsychology,
nal, 14(3), 234–252. 7(2), 209–216.
Luteijn, E., Luteijn, F., Jackson, S., Volkmar, F., & Miranda-Linne, F. M., Fredrika, M., & Melin, L.
Minderaa, R. (2000). The Children’s Social (1997). A comparison of speaking and mute
Behavior Questionnaire for milder variants of individuals with autism and autistic-like con-
PDD problems: Evaluation of the psychomet- ditions on the Autism Behavior Checklist.
ric characteristics. Journal of Autism and De- Journal of Autism and Developmental Disor-
velopmental Disorders, 30(4), 317–330. ders, 27(3), 245–264.
MacLean, J. E., Szatmari, P., Jones, M. B., Bryson, Mundy, P., Sigman, M., Ungerer, J., & Sherman, T.
S. E., Mahoney, W. J., Bartolucci, G., et al. (1986). Defining the social deficits of autism:
(1999). Familial factors influence level of The contribution of nonverbal communication
functioning in pervasive developmental disor- measures. Journal of Child Psychology and
der. Journal of the American Academy of Child Psychiatry, 27(5), 657–669.
and Adolescent Psychiatry, 38(6), 746–753. Munson, J., Dawson, G., Lord, C., Rogers, S., &
Mahoney, W. J., Szatmari, P., MacLean, J. E., Sigman, M. (in press). Cognitive profiles and
Bryson, S. E., Bartolucci, G., Walter, S. D., adaptive functioning in preschool children with
et al. (1998). Reliability and accuracy of dif- autism spectrum disorder versus developmental
ferentiating pervasive developmental disorder delay.
subtypes. Journal of the American Academy Muris, P., Steerneman, P., & Ratering, E. (1997).
of Child and Adolescent Psychiatry, 37(3), Inter-rater reliability of the Psychoeducational
278–285. Profile (PEP). Journal of Autism and Develop-
Masters, J. C., & Miller, D. E. (1970). Early infan- mental Disorders, 27(5), 621–626.
tile autism: A methodological critique. Journal Nordin, V., & Gillberg, C. (1996a). Autism spec-
of Abnormal Psychology, 75(3), 342–343. trum disorders in children with physical or
Mawhood, L., Howlin, P., & Rutter, M. L. (2000). mental disability or both: Part I. Clinical and
Autism and developmental receptive language epidemiological aspects. Developmental Medi-
disorder—A comparative follow-up in early cine and Child Neurology, 38(4), 297–313.
adult life: Part I. Cognitive and language out- Nordin, V., & Gillberg, C. (1996b). Autism spec-
comes. Journal of Child Psychology and Psy- trum disorders in children with physical or
chiatry and Allied Disciplines, 41(5), 547–559. mental disability or both: Part II. Screening
Mayes, L. C., & Zigler, E. (1992). An observational aspects. Developmental Medicine and Child
study of the affective concomitants of mastery Neurology, 38(4), 314–324.
in infants. Journal of Child Psychology and Psy- Nordin, V., & Gillberg, C. (1998). The long-term
chiatry and Allied Disciplines, 33(4), 659–667. course of autistic disorders: Update on follow-
Mesibov, G. B., Schopler, E., & Caison, W. (1989). up studies. Acta Psychiatrica Scandinavica,
The Adolescent and Adult Psychoeducational 97(2), 99–108.
Profile: Assessment of adolescents and adults Nordin, V., Gillberg, C., & Nyden, A. (1998). The
with severe developmental handicaps. Journal Swedish version of the Childhood Autism
Rating Scale in a clinical setting. Journal of Piven, J., Nehme, E., Simon, J., Barta, P., Pearl, G.,
Autism and Developmental Disorders, 28(1), & Folstein, S. E. (1992). Magnetic Resonance
69–75. Imaging in autism: Measurement of the cere-
Offord, D. R., Boyle, M. H., Racine, Y., Szatmari, bellum, pons, and fourth ventricle. Biological
P., Fleming, J. E., Sanford, M., et al. (1996). Psychiatry, 31(5), 491–504.
Integrating assessment data from multiple in- Prior, M. R., & Bence, R. (1975). A note on the va-
formants. Journal of the American Academy lidity of the Rimland Diagnostic Checklist.
of Child and Adolescent Psychiatry, 35(8), Journal of Clinical Psychology, 31(3), 510–513.
1078–1085. Rimland, B. (1968). On the objective diagnosis of
Owley, T., McMahon, W., Cook, E. H., Laulhere, infantile autism. Acta Paedopsychiatrica: In-
T., South, M., Mays, L. Z., et al. (2001). Multi- ternational Journal of Child and Adolescent
site, double-blind, placebo-controlled trial of Psychiatry, 35(4/8), 146–161.
porcine secretin in autism. Journal of the Rimland, B. (1971). The differentiation of child-
American Academy of Child and Adolescent hood psychoses: An analysis of checklists for
Psychiatry, 40(11), 1293–1299. 2,218 psychotic children. Journal of Autism
Ozonoff, S., & Cathcart, K. (1998). Effectiveness and Childhood Schizophrenia, 1(2), 161–174.
of a home program intervention for young chil- Robertson, J. M., Tanguay, P. E., L’Ecuyer, S.,
dren with autism. Journal of Autism and Devel- Sims, A., & Waltrip, C. (1999). Domains of
opmental Disorders, 28(1), 25–32. social communication handicap in autism
Ozonoff, S., South, M., & Miller, J. N. (2000). DSM- spectrum disorder. Journal of the American
IV-defined Asperger syndrome: Cognitive, be- Academy of Child and Adolescent Psychiatry,
havioral and early history differentiation from 38(6), 738–745.
high-functioning autism [Special issue: As- Ruttenberg, B. A., Dratman, M. L., Fraknoi, J., &
perger syndrome]. Autism, 4(1), 29–46. Wenar, C. (1966). An instrument for evaluat-
Panerai, S., Ferrante, L., & Caputo, V. (1997). The ing autistic children. Journal of American
TEACCH strategy in mentally retarded children Academy of Child Psychiatry, 5, 453–478.
with autism: A multidimensional assessment: Ruttenberg, B. A., Kalish, B. I., Wenar, C., & Wolf,
Pilot study. Journal of Autism and Developmen- E. G. (1977). Behavior rating instrument for
tal Disorders, 27(3), 345–347. autistic and other atypical children (rev. ed.).
Parks, S. L. (1988). Psychometric instruments avail- Philadelphia: Developmental Center for Autis-
able for the assessment of autistic children. In tic Children.
E. Schopler & G. Mesibov (Eds.), Diagnosis Rutter, M., Le Couteur, A., & Lord, C. (2003).
and assessment in autism (pp. 123–136). New Manual for the ADI–WPS version. Los Ange-
York: Plenum Press. les: Western Psychological Services.
Perry, A., & Factor, D. C. (1989). Psychometric va- Rutter, M., Mawhood, L., & Howlin, P. (1992).
lidity and clinical usefulness of the Vineland Language delay and social development. In
Adaptive Behavior Scale and the AAMD P. Fletcher & D. Hall (Eds.), Specific speech
Adaptive Behavior Scale for an autistic sam- and language disorders in children: Correlates,
ple. Journal of Autism and Developmental Dis- characteristics, and outcomes (pp. 63–78).
orders, 19(1), 41–56. London: Whurr.
Persson, B. (2000). Brief report: A longitudinal Saemundsen, E., Magnússon, P., Smári, J., &
study of quality of life and independence among Sigurdardóttir, S. (2003). Autism Diagnostic
adult men with autism. Journal of Autism and Interview-Revised and the Childhood Autism
Developmental Disorders, 30(1), 61–66. Interview-Revised and the Childhood Autism
Pilowsky, T., Yirmiya, N., Shulman, C., & Dover, diagnosing autism. Journal of Autism and De-
R. (1998). The Autism Diagnostic Interview– velopmental Disorders, 33(3), 319–328.
Revised and the Childhood Autism Rating Sanchez, L. E., Adams, P. B., Uysal, S., Hallin, A.,
Scale: Differences between diagnostic sys- Campbell, M., & Small, A. M. (1995). A com-
tems and comparison between genders. Jour- parison of live and videotape ratings:
nal of Autism and Developmental Disorders, Clomipramine and haloperidol in autism. Psy-
28(2), 143–151. chopharmacology Bulletin, 31(2), 371–378.
Piven, J., Harper, J., Palmer, P., & Arndt, S. Schopler, E. (1976). Towards reducing behavior
(1996). Course of behavioral change in problems in autistic children. In L. Wing
autism: A retrospective study of high-IQ ado- (Ed.), Early childhood autism (pp. 221–246).
lescents and adults. Journal of the American London: Pergamon Press.
Academy of Child and Adolescent Psychiatry, Schopler, E., & Reichler, R. J. (1972). How well do
35(4), 523–529. parents understand their own psychotic child?
Journal of Autism and Childhood Schizophre- (2002). Symptom domains in autism and related
nia, 2(4), 387–400. conditions: Evidence for familiality. American
Schopler, E., & Reichler, R. J. (1979). Individual- Journal of Medical Genetics, 114(1), 64–73.
ized assessment and treatment for autistic and Smalley, S. L., Tanguay, P. E., Smith, M., & Gutier-
developmentally disabled children: Psychoedu- rez, G. (1992). Autism and tuberous sclerosis.
cational profile (Vol. 1). Baltimore: University Journal of Autism and Developmental Disor-
Park Press. ders, 22(3), 339–355.
Schopler, E., Reichler, R. J., Bashford, A., Lansing, South, M., Williams, B. J., McMahon, W. M.,
M. D., & Marcus, L. M. (1990). Psychoeduca- Owley, T., Filipek, P. A., Shernoff, E., et al.
tional Profile–Revised. Austin, TX: ProEd. (2002). Utility of the Gilliam Autism Rating
Schopler, E., Reichler, R. J., DeVellis, R., & Daly, Scale in research and clinical populations.
K. (1980). Toward objective classification of Journal of Autism and Developmental Disor-
childhood autism: Childhood Autism Rating ders, 32(6), 593–599.
Scale (CARS). Journal of Autism and Develop- Sparrow, S. S., Balla, D., & Cicchetti, D. (1984).
mental Disorders, 10(1), 91–103. Vineland Adaptive Behavior Scales. Circle
Schopler, E., Reichler, R. J., & Renner, B. R. Pines, MN: American Guidance Service.
(1986). The Childhood Autism Rating Scale Spencer, A. (1993). Separation and reunion in
(CARS) for diagnostic screening and classifi- autistic two year olds. Unpublished doctoral
cation of autism. Irvington, NY: Irvington. dissertation, University of North Carolina,
Schopler, E., Reichler, R. J., & Renner, B. R. (1988). Chapel Hill.
The Childhood Autism Rating Scale (CARS). Spiker, D., Lotspeich, L. J., Dimiceli, S., Myers,
Los Angeles: Western Psychological Services. R. M., & Risch, N. (2002). Behavioral pheno-
Schreck, K. A., & Mulick, J. A. (2000). Parental re- typic variation in autism multiplex families:
port of sleep problems in children with autism. Evidence for a continuous severity gradient.
Journal of Autism and Developmental Disor- American Journal of Medical Genetics, 114(2),
ders, 30(2), 127–135. 129–136.
Sevin, J. A., Matson, J. L., Coe, D. A., Fee, V. E., & Spiker, D., Lotspeich, L. J., Kraemer, H. C., Hall-
Sevin, B. M. (1991). A comparison and evalua- mayer, J., McMahon, W., Peterson, B., et al.
tion of three commonly used autism scales. (1994). Genetics of autism: Characteristics of
Journal of Autism and Developmental Disor- affected and unaffected children from 37 mul-
ders, 21(4), 551–556. tiplex families. American Journal of Medical
Shaffer, D., Gould, M. S., Brasic, J., Ambrosini, P., Genetics, 54(1), 27–35.
Fisher, P., Bird, H., et al. (1983). A Children’s Sponheim, E. (1996). Changing criteria of autistic
Global Assessment Scale (CGAS). Archives of disorders: A comparison of the ICD-10 re-
General Psychiatry, 40, 1228–1231. search criteria and DSM-IV with DSM-III-R,
Shao, Y. J., Cuccaro, M. L., Hauser, E. R., Raiford, CARS, and ABC. Journal of Autism and Devel-
K. L., Menold, M. M., Wolpert, C. M., et al. opmental Disorders, 26(5), 513–525.
(2003). Fine mapping of Autistic disorder to Steerneman, P., Muris, P., Merckelbach, H., &
chromosome 15q11-q13 by use of phenotypic Willems, H. (1997). Brief report: Assessment
subtypes. American Journal of Human Genet- of development and abnormal behavior in chil-
ics, 72(3), 539–548. dren with pervasive developmental disorders:
Shao, Y. J., Raiford, K. L., Wolpert, C. M., Cope, Evidence for the reliability and validity of the
H. A., Ravan, S. A., Ashley-Koch, A. A., et al. Revised Psychoeducational Profile. Journal of
(2002). Phenotypic homogeneity provides in- Autism and Developmental Disorders, 27(2),
creased support for linkage on chromosome 2 177–185.
in autistic disorder. American Journal of Stella, J., Mundy, P., & Tuchman, R. (1999). Social
Human Genetics, 70(4), 1058–1061. and nonsocial factors in the Childhood Autism
Shao, Y. J., Wolpert, C. M., Raiford, K. L., Menold, Rating Scale. Journal of Autism and Develop-
M. M., Donnelly, S. L., Ravan, S. A., et al. mental Disorders, 29(4), 307–317.
(2002). Genomic screen and follow-up analy- Stone, W. L., Lee, E. B., Ashford, L., Brissie, J.,
sis for autistic disorder. American Journal of Hepburn, S. L., Coonrod, E. E., et al. (1999).
Medical Genetics, 114, 99–105. Can autism be diagnosed accurately in chil-
Sigman, M., & Ungerer, J. (1984). Attachment be- dren under 3 years? Journal of Child Psychol-
haviors in autistic children. Journal of Autism ogy and Psychiatry and Allied Disciplines,
and Developmental Disorders, 14(3), 231–244. 40(2), 219–226.
Silverman, J. M., Smith, C. J., Schmeidler, J., Hol- Stone, W. L., & Lemanek, K. L. (1990). Parental
lander, E., Lawlor, B. A., Fitzgerald, M., et al. report of social behaviors in autistic
preschoolers. Journal of Autism and Develop- Volkmar, F. R., Cicchetti, D. V., Bregman, J., &
mental Disorders, 20(4), 513–522. Cohen, D. J. (1992). Three diagnostic systems
Stone, W. L., Ousley, O. Y., Yoder, P., Hogan, K., & for autism: DSM-III, DSM-III-R, and ICD-10
Hepburn, S. (1997). Nonverbal communication [Special issue: Classification and diagnosis].
in 2- and 3-year-old children with autism. Jour- Journal of Autism and Developmental Disor-
nal of Autism and Developmental Disorders, ders, 22(4), 483–492.
27(6), 677–696. Volkmar, F. R., Cicchetti, D. V., Dykens, E.,
Sturmey, P., Matson, J. L., & Sevin, J. A. (1992). Sparrow, S. S., Leckman, J. F., & Cohen, D. J.
Brief report: Analysis of the internal consis- (1988). An evaluation of the Autism Behavior
tency of three autism scales. Journal of Checklist. Journal of Autism and Developmen-
Autism and Developmental Disorders, 22(2), tal Disorders, 18(1), 81–97.
321–328. Volkmar, F. R., & Klin, A. (2001). Asperger’s dis-
Szatmari, P. (2000). Perspectives on the classifica- order and higher functioning autism: Same or
tion of Asperger Syndrome. In A. Klin (Ed.), different? In L. M. Glidden (Ed.), Interna-
Asperger Syndrome (pp. 403–407). New York, tional review of research in mental retardation:
NY: Guilford Press. Autism (Vol. 23, pp. 83–110). San Diego, CA:
Szatmari, P., Archer, L., Fisman, S., Streiner, Academic Press.
D. L., & Wilson, F. (1995). Asperger’s syn- Volkmar, F. R., Klin, A., Siegel, B., Szatmari, P.,
drome and autism: Differences in behavior, Lord, C., Campbell, M., et al. (1994). Field
cognition, and adaptive functioning. Journal of trial for autistic disorder in DSM-IV. Ameri-
the American Academy of Child and Adolescent can Journal of Psychiatry, 151(9), 1361–1367.
Psychiatry, 34(12), 1662–1671. Volkmar, F. R., & Lord, C. (1998). Diagnosis and
Szatmari, P., Merette, C., Bryson, S. E., Thivierge, definition of autism and other pervasive devel-
J., Roy, M. A., Cayer, M., et al. (2002). Quanti- opmental disorders. In F. R. Volkmar (Ed.),
fying dimensions in autism: A factor-analytic Autism and pervasive developmental disorders
study. Journal of the American Academy of (pp. 1–31). New York: Cambridge University
Child and Adolescent Psychiatry, 41(4), Press.
467–474. Vrancic, D., Nanclares, V., Soares, D., Kulesz, A.,
Tadevosyan-Leyfer, O., Dowd, M., Mankoski, R., Mordzinski, C., Plebst, C., et al. (2002). Sensi-
Winklosky, B., Putnam, S., McGrath, L., et al. tivity and specificity of the autism diagnostic
(2003). A principal components analysis of the inventory-telephone screening in Spanish. Jour-
Autism Diagnostic Interview-Revised. Journal nal of Autism and Developmental Disorders,
of the American Academy of Child and Adoles- 32(4), 313–320.
cent Psychiatry, 42(7), 864–872. Wadden, N., Bryson, S. E., & Rodger, R. (1991). A
Tanguay, P. E., Robertson, J., & Derrick, A. closer look at the Autism Behavior Checklist:
(1998). A dimensional classification of autism Discriminant validity and factor structure.
spectrum disorder by social communication Journal of Autism and Developmental Disor-
domains. Journal of the American Academy ders, 21(4), 529–542.
of Child and Adolescent Psychiatry, 37(3), Wechsler, D. (1991). Manual for the Wechsler Intel-
271–277. ligence Scale for Children–III. San Antonio,
Tantam, D. (2000). Psychological disorder in ado- TX: Psychological Corporation.
lescents and adults with Asperger Syndrome. Wechsler, D. (1997). Wechsler Adult Intelligence
Autism, 4(1), 47–62. Scale (3rd ed.). San Antonio, TX: Psycho-
Teal, M. B., & Wiebe, M. J. (1986). A validity logical Corporation.
analysis of selected instruments used to assess Wechsler, D. (2002). Wechsler Preschool and Pri-
autism. Journal of Autism and Developmental mary Scale of Intelligence–III. San Antonio,
Disorders, 16(4), 485–494. TX: Psychological Corporation.
Van Bourgondien, M. E., Marcus, L. M., & Wechsler, D. (2003). Wechsler Intelligence Scale for
Schopler, E. (1992). Comparison of DSM-III-R Children (4th ed.). San Antonio, TX:
and Childhood Autism Rating Scale diagnosis Psychological Corporation.
of autism. Journal of Autism and Developmen- Wenar, C., & Ruttenberg, B. A. (1976). The use of
tal Disorders, 22(4), 493–505. BRIAAC for evaluating therapeutic effective-
Venter, A., Lord, C., & Schopler, E. (1992). A ness. Journal of Autism and Childhood Schizo-
follow-up study of high-functioning autistic phrenia, 6(2), 175–191.
children. Journal of Child Psychology and Wetherby, A., & Prizant, B. (1993). Communica-
Psychiatry and Allied Disciplines, 33(3), tion and Symbolic Behavior Scales (Normed
489–507. ed.). Baltimore: Paul H. Brookes.
Wetherby, A., & Prizant, B. (2002). Communica- Wing, L., Leekam, S. R., Libby, S. J., Gould, J., &
tion and Symbolic Behavior Scales developmen- Larcombe, M. (2002). The Diagnostic Inter-
tal profile (First Normed Edition). Baltimore: view for Social and Communication Disorders:
Paul H. Brookes. Background, inter-rater reliability and clinical
Wetherby, A., Woods, J., Allen, L., Cleary, J., Dick- use. Journal of Child Psychology and Psychia-
inson, H., & Lord, C. (in press). Early indica- try and Allied Disciplines, 43(3), 307–325.
tors of autistic spectrum disorders in the second World Health Organization. (1992). The ICD-10
year of life. Classification of Mental and Behavioral Disor-
Wing, L., & Attwood, A. (1987). Syndromes of ders: Clinical descriptions and diagnostic guide-
autism and atypical development. In D. J. lines. Geneva, Switzerland: Author.
Cohen & A. M. Donnellan (Eds.), Handbook of Yirmiya, N., Sigman, M., & Freeman, B. J. (1994).
autism and pervasive developmental disorders Comparison between diagnostic instruments
(pp. 148–170). New York: Wiley. for identifying high-functioning children with
Wing, L., & Gould, J. (1978). Systematic recording autism. Journal of Autism and Developmental
of behaviors and skills of retarded and psy- Disorders, 24(3), 281–291.
chotic children. Journal of Autism and Child- Zakian, A., Malvy, J., Desombre, H., Roux, S., &
hood Schizophrenia, 8(1), 79–97. Lenoir, P. (2000). Early signs of autism: A new
Wing, L., & Gould, J. (1979). Severe impairments study of family home movies. Encephale-Revue
of social interaction and associated abnormal- De Psychiatrie Clinique Biologique Et Thera-
ities in children: Epidemiology and classifica- peutique, 26(2), 38–44.
tion. Journal of Autism and Developmental
Disorders, 9(1), 11–29.
