
Psychological Assessment, 2004, Vol. 16, No. 3, 323–325
Copyright 2004 by the American Psychological Association. 1040-3590/04/$12.00 DOI: 10.1037/1040-3590.16.3.323

BRIEF REPORTS

Values for Comparison of WAIS–III Index Scores With Overall Means


R. Stewart Longman
Foothills Hospital, Calgary

This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.
This document is copyrighted by the American Psychological Association or one of its allied publishers.

The Wechsler Adult Intelligence Scale—Third Edition (WAIS–III; Wechsler, 1997b) provides factor-based index scores but allows only for pairwise comparison of these scores, producing inflated Type I error rates and reducing profile interpretability. This article provides tables for simultaneous comparison to the overall mean index score, thus reducing error rates and aiding interpretation. The Working Memory Index or Processing Speed Index can also be specifically compared when an individual is believed to have a condition, such as a learning disability or traumatic brain injury, associated with the selective depression of these indexes. Tables for the infrequency of specific differences are also provided, allowing the practitioner to note how unusual an obtained difference is in the general population.
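
The inflation of Type I error mentioned above can be illustrated with a short calculation. The following is a minimal sketch, not the author's procedure: it assumes the six pairwise comparisons among four index scores behave as independent tests (a simplification, since the comparisons share scores), and the function name `familywise_error` is introduced here only for illustration. Under that assumption it gives roughly the 62% and 26% figures quoted in the article's introduction.

```python
from math import comb


def familywise_error(alpha: float, n_comparisons: int) -> float:
    """Chance of at least one Type I error across n independent tests.

    Assumes independence between tests, which is only an approximation
    for pairwise comparisons that share index scores.
    """
    return 1.0 - (1.0 - alpha) ** n_comparisons


# Four index scores (VCI, POI, WMI, PSI) give C(4, 2) = 6 pairwise comparisons.
N_PAIRS = comb(4, 2)

if __name__ == "__main__":
    for alpha in (0.15, 0.05):
        rate = familywise_error(alpha, N_PAIRS)
        print(f"alpha = {alpha:.2f}: familywise error = {rate:.1%}")
```

The same function shows why comparing each of the four indexes with the overall mean is less costly: `familywise_error(0.05, 4)` is smaller than `familywise_error(0.05, 6)`, so a milder correction is needed to hold the overall alpha level.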

Correspondence concerning this article should be addressed to R. Stewart Longman, Department of Psychology, Foothills Hospital, 1403 29th Street, Northwest, Calgary, Alberta T2N 2T9, Canada. E-mail: Stewart.Longman@calgaryhealthregion.ca

With the publication of the Wechsler Adult Intelligence Scale—Third Edition (WAIS–III; Wechsler, 1997b), there has been the provision of factor-based index scores, reflecting relatively more homogenous selections of skills and abilities than those making up the standard Verbal IQ (VIQ) and Performance IQ (PIQ). The Verbal Comprehension (VCI) and Working Memory (WMI) indexes are primarily based on subtests contributing to VIQ, whereas the Perceptual Organization (POI) and Processing Speed (PSI) indexes are primarily based on subtests contributing to the PIQ. In addition, the WAIS–III manual contains much information that was not available in previous editions, allowing for interpretation of subtest or index score differences. For example, Table B.3 of the WAIS–III manual (Wechsler, 1997b) allows for comparison of individual subtest scores to the mean of subtest scores (within the appropriate IQ scale or for comparison to all subtests), and Table B.1 allows identification of statistically significant differences between each pair of index scores.

The current method for interpretation of differences between index scores is limited to pairwise comparisons. Unfortunately, these comparisons do not address overall configuration, such as a relative strength or weakness on a specific index as compared with overall performance. Pairwise comparisons can also lead to results that are either confusing or difficult to interpret. For example, an individual aged 46 years may have a VCI of 95, a POI of 102, a WMI of 102, and a PSI of 103. In this case, only the difference between VCI and WMI is statistically significant (meaning likely to be replicable) at the α = .15 level, yet this is one of the smaller differences. Communicating this finding and conceptualizing the overall results is somewhat difficult. As well, although the difference between WMI and VCI may suggest further investigation and consideration of clinical hypotheses, a difference this large or larger is extremely common, found in 57.5% of adults (Wechsler, 1997b, pp. 206–207).

In addition to the reduced conceptual clarity resulting from pairwise interpretation, there is an increased chance of Type I errors, because of the multiple comparisons that are made (Knight & Godfrey, 1984). Indeed, given six possible pairwise comparisons, the probability of making a Type I error is 62% if using the α = .15 significance level for individual comparisons and is still 26% when using the more stringent α = .05 significance level. For these reasons, some sources recommend against the pairwise comparisons of subtest scores (e.g., Glutting & Watkins, 1997).

There is emerging evidence that profiles of index scores may be reliably characterized by a pattern of specific strengths or weaknesses and associated with particular diagnoses. For example, a relative weakness on the PSI appears to be a common consequence of traumatic brain injury (Axelrod, Fichtenberg, Liethen, Czarnota, & Stucky, 2002; Donders, Tulsky, & Zhu, 2001; Wechsler, 1997a, pp. 154–156). For adults with a history of learning disabilities, a reduced WMI is a relatively frequent finding (Wechsler, 1997a, pp. 176–178). For normal subjects in the WAIS–III normative sample, five clusters were derived based on index score profiles (Donders, Zhu, & Tulsky, 2001). Three profiles showed little difference between index scores and differed only by mean elevation, whereas the remaining two profiles were characterized by a relative strength or a weakness on the PSI, compared with the remaining indexes. In the Wechsler Intelligence Scale for Children—III (WISC–III; Wechsler, 1991), similar clusters have been found (Donders, 1996). These findings for normal and clinical groups suggest that review of index score profiles may aid consideration in the individual case.

Naglieri (1993) proposed two possible strategies to interpret index scores when using the WISC–III. One method is to use the Bonferroni correction for the six individual pairwise comparisons. This reduces Type I error but does not address the concerns about pairwise interpretation. The other method, the ipsative approach, involves comparing index scores to the overall mean, which reduces the number of comparisons (and the degree of a Bonferroni correction for multiple comparisons) and is also much easier to interpret. This approach allows the clinician to state that an individual shows a reliable strength (or weakness) on a specific index compared with overall performance. Furthermore, if a targeted
comparison of one index against the overall mean is suggested a priori, the Bonferroni correction can be eliminated, improving clinical sensitivity.

These two considerations (increased risk of Type I error from performing all possible pairwise comparisons and greater conceptual clarity and ease of communication of findings through performing multiple comparisons simultaneously) suggest that comparing obtained index scores to the overall mean index score may be preferable to the current practice of pairwise comparisons. This should lead to reduced frequency of Type I errors without the high degree of correction needed for a series of pairwise comparisons at the same overall alpha level and may aid in characterizing an individual WAIS–III profile as showing either no profile variation, a relative strength, or a relative weakness on a specific index (or indexes). This could aid in both conceptualizing and communicating assessment results.

Method

Davis (1959) provided a formula for calculating different scores required for statistical significance for an individual subtest or index compared with the overall mean of subtests or indexes. This formula requires obtaining standard errors of measurement and choosing appropriate Z values. This formula can be represented as

$$D = Z \sqrt{(SEM_t)^2 / K^2 + [(K - 2)/K](SEM_i)^2}.$$

In this formula, D is the deviation from the mean index score, Z is the critical Z value (after Bonferroni correction), $(SEM_t)^2$ is the sum of the squared standard errors of measurement for all the indexes, K is the number of indexes, and $(SEM_i)^2$ is the squared standard error of measurement for the particular index being compared.

Following Kaufman’s (1990, p. 436) and Naglieri’s (1993) studies, I calculated Bonferroni corrected values for omnibus comparisons, using two-tailed significance levels of .15, .05, and .01. These significance values were chosen to reflect the values presented for both pairwise and ipsative comparisons in the manual for the WAIS–III (Wechsler, 1997b), which correspond to the .15 value recommended for hypothesis generation by Davis (1959) and the .05 and .01 values for more stringent identification of strengths and weaknesses proposed by Kaufman (1990). Data for the standard errors of measurement were taken from Table 3.4 of the technical manual (Wechsler, 1997a, p. 54), using the values for the index scores averaged across all the age ranges. Averaged values were chosen for the ease of using a single table and because inspection of tabled values did not indicate any clear or consistent trends of increasing or decreasing standard errors with increasing age.

In addition to the statistical significance of differences, the relative infrequency of differences between a specific index and the overall mean was calculated, to give the 25th, 10th, 5th, 2nd, and 1st percentile values, helping to delimit relatively common and relatively uncommon discrepancies. This requires the calculation of the standard deviation of the mean of the four index scores (using formula 5-3 in Nunnally, 1978, p. 153), the correlation of that mean with each index score (using formula 5-7 in Nunnally, 1978, p. 166), and then the calculation of the abnormality of the difference between two correlated scores, using the formula from Payne and Jones (1957), which can be expressed as

$$A = Z \sqrt{SD_1^2 + SD_2^2 - 2(r_{1,2} SD_1 SD_2)}.$$

In this formula, A is the difference associated with a relative abnormality, Z is the standard normal deviate associated with a specific (two-tailed) probability, $SD_1$ is the standard deviation of the index score, $SD_2$ is the standard deviation of the mean of the index scores, and $r_{1,2}$ is the correlation between a specific index score and the overall mean. All percentiles were calculated as two-tailed values.

Results and Example

The standard errors and corresponding critical values for each index are presented in Table 1. In addition, given that in particular populations (individuals with suspected traumatic brain injury or learning disabilities) it may be appropriate to compare a single index score to the overall mean, critical values for a targeted comparison of only one index score against the overall mean are provided in Table 2. These use unprotected one-tailed alpha levels of .05 and .01. As expected, given that the PSI is composed of only two subtests, the standard error (and thus the required deviation from the overall mean) is higher than for the remaining indexes, both for the overall as well as the targeted comparisons. Finally, Table 3 indicates the 1st, 2nd, 5th, 10th, and 25th percentiles for differences between a specific index score and the overall mean. These can be used to give an indication of the relative infrequency of an obtained difference, and thus, the clinical importance of that difference.

For example, consider a 25-year-old man with a history of reading and spelling problems, referred for consideration of a possible learning disability. He shows a VCI of 98, a POI of 102, a WMI of 89, and a PSI of 96, giving an overall mean index score of 96.25. For this individual, given a specific hypothesis of academic learning disability on the basis of his history and background, comparison of the WMI against the overall mean is appropriate. In this case, the obtained value is 7.25, as compared with a critical value of 5.75 for the α = .05 level of statistical significance, indicating a relative weakness in WMI compared with his overall mean. None of his remaining indexes differ from the overall mean. In this case, use of pairwise comparisons at an α = .05 level only indicates a reliable difference between WMI and POI. Table 3 indicates that this discrepancy, although statistically significant, is not unusual, found in more than 25% of the general population.

Consider the example given earlier, an individual with a VCI of 95, a POI of 102, a WMI of 102, and a PSI of 103. In this case, the overall mean is 100.5, and none of the scores differs significantly from that value. The WAIS–III index score profile would be considered to show no overall strengths or weaknesses, and the previously noted difference between VCI and WMI may represent an overinterpretation of the data, as suggested by the relative frequency of the obtained score difference.

Table 1
Differences Between Each Index Score and Mean Index Score for Each Significance Level

                                   Significance level (α)
Index                      SEM      .15      .05      .01
Verbal Comprehension      3.01     6.04     7.32     8.88
Perceptual Organization   3.95     7.09     8.60    10.43
Working Memory            3.84     6.96     8.44    10.24
Processing Speed          5.13     8.54    10.35    12.56

Note. SEM data are from the WAIS–III/WMS–III Technical Manual (p. 54) by D. Wechsler. Copyright © 1997 by Harcourt Assessment, Inc. Reproduced by permission. All rights reserved.
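
The Davis (1959) formula given in the Method section can be turned into a short calculation. What follows is a minimal sketch under stated assumptions, not the author's actual computation: the SEM values are those listed in Table 1, the function names `davis_se` and `critical_difference` are hypothetical, and Z is taken from the exact normal quantile, whereas the article does not state how its Z values were rounded. Results should therefore land close to the tabled values, but exact agreement is not guaranteed.

```python
from math import sqrt
from statistics import NormalDist

# SEM values as listed in Table 1 (averaged across age ranges).
SEM = {"VCI": 3.01, "POI": 3.95, "WMI": 3.84, "PSI": 5.13}


def davis_se(index: str, sem: dict = SEM) -> float:
    """Standard error of (one index score minus the mean of all K indexes)."""
    k = len(sem)
    sum_sq = sum(s ** 2 for s in sem.values())  # (SEM_t)^2 in the formula
    return sqrt(sum_sq / k ** 2 + (k - 2) / k * sem[index] ** 2)


def critical_difference(index: str, alpha: float, n_comparisons: int = 1,
                        two_tailed: bool = True, sem: dict = SEM) -> float:
    """D = Z * SE, with an optional Bonferroni split of alpha.

    n_comparisons=4 mimics the omnibus case (each index vs. the mean);
    n_comparisons=1 with two_tailed=False mimics a targeted comparison.
    """
    per_test = alpha / n_comparisons
    tail = per_test / 2 if two_tailed else per_test
    z = NormalDist().inv_cdf(1.0 - tail)
    return z * davis_se(index, sem)


if __name__ == "__main__":
    for name in SEM:
        omnibus = critical_difference(name, 0.01, n_comparisons=4)
        print(f"{name}: omnibus .01 critical difference = {omnibus:.2f}")
    for name in ("WMI", "PSI"):
        targeted = critical_difference(name, 0.01, two_tailed=False)
        print(f"{name}: targeted .01 critical difference = {targeted:.2f}")
```

Keeping the Bonferroni split and the tail choice as parameters makes the difference between the omnibus comparison (four comparisons, two-tailed) and the targeted comparison (one comparison, one-tailed) explicit, which is the distinction the article draws between Tables 1 and 2.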
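
The two worked examples in the Results section can be reproduced with a small lookup against the published critical values. This is an illustrative sketch only: the dictionaries copy the values printed in Tables 1 and 2, the function names are hypothetical, and treating a difference equal to the critical value as significant (`>=`) is an assumption, since the article does not say how ties are handled.

```python
# Omnibus critical differences copied from Table 1 (Bonferroni corrected).
TABLE_1 = {
    "VCI": {0.15: 6.04, 0.05: 7.32, 0.01: 8.88},
    "POI": {0.15: 7.09, 0.05: 8.60, 0.01: 10.43},
    "WMI": {0.15: 6.96, 0.05: 8.44, 0.01: 10.24},
    "PSI": {0.15: 8.54, 0.05: 10.35, 0.01: 12.56},
}

# Targeted (single index, one-tailed) critical differences copied from Table 2.
TABLE_2 = {
    "WMI": {0.05: 5.75, 0.01: 7.88},
    "PSI": {0.05: 6.83, 0.01: 9.67},
}


def deviations(scores: dict) -> tuple:
    """Return the overall mean and each index's deviation from it."""
    mean = sum(scores.values()) / len(scores)
    return mean, {name: value - mean for name, value in scores.items()}


def flag_omnibus(scores: dict, alpha: float = 0.05) -> dict:
    """Label indexes whose deviation reaches the Table 1 critical value."""
    _, devs = deviations(scores)
    flags = {}
    for name, dev in devs.items():
        if abs(dev) >= TABLE_1[name][alpha]:  # tie handling is an assumption
            flags[name] = "strength" if dev > 0 else "weakness"
    return flags


def targeted_weakness(scores: dict, index: str, alpha: float = 0.05) -> bool:
    """A priori test that one index (WMI or PSI) falls below the mean."""
    mean, _ = deviations(scores)
    return (mean - scores[index]) >= TABLE_2[index][alpha]


if __name__ == "__main__":
    referral = {"VCI": 98, "POI": 102, "WMI": 89, "PSI": 96}
    earlier = {"VCI": 95, "POI": 102, "WMI": 102, "PSI": 103}
    print(deviations(referral)[0], targeted_weakness(referral, "WMI"))
    print(deviations(earlier)[0], flag_omnibus(earlier, alpha=0.15))
```

For the 25-year-old referral case the targeted WMI comparison is met (7.25 against 5.75), and for the earlier 46-year-old example no index is flagged even at the .15 level, matching the conclusions stated in the text.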

Discussion

For interpretive purposes, differences between obtained index scores and the overall index score mean are likely to be more useful and easier to report than comparing pairs of indexes. That is, it is much easier to conceptualize and describe a profile as showing a relative strength (or weakness) on one index, rather than stating that the difference between specific pairs of indexes was statistically significant. Such an overall comparison may also be a better reflection of the actual process of interpretation (Glutting & Watkins, 1997). The data provided in Table 1 allows for such comparisons, thus providing better integration of overall results than the current practice of pairwise comparisons. This is also conceptually consistent with the recommended procedure for interpretation of subtest strengths and weaknesses. Just as with the index scores, subtest scores were initially interpreted through pairwise comparisons, until the awareness of the logical and statistical difficulties involved led to the current emphasis on comparison of individual scores to the overall mean (Kaufman, 1990, p. 425; Knight & Godfrey, 1984).

The Davis (1959) formula does not require that the measures being compared have any particular properties, but the comparison will be more sensitive when the measures are more reliable (as index scores are more reliable than the individual subtests). The values provided in Tables 1 and 2 do not indicate the relative abnormality of specific differences but rather their statistical improbability when no true differences are expected, thus indicating if this is a reliable difference. This is similar to the situation with VIQ and PIQ, in which differences may be statistically significant (unlikely to arise by chance and reflecting some true variation in abilities) but may also be quite common.

Fortunately, the high correlations between the index scores and the overall mean (.80–.86) indicate that large discrepancies between index scores and the overall mean are relatively less common than similarly large discrepancies between pairs of index scores. This can be seen by comparing Table B.2 of the WAIS–III manual (Wechsler, 1997b) to Table 3 from the present study. This suggests that a reliable and unusual discrepancy between an index score and the mean score may well have some cognitive or diagnostic correlates outside the domain of WAIS–III scores.

The methods and tables presented here are intended to give a straightforward strategy for profile identification and description, while recognizing that the pattern of WAIS–III scores, by itself, may help to generate hypotheses, but further information is needed, such as patient history and other test data, to test these hypotheses. This may supplement studies such as that by Axelrod et al. (2002), who noted the relative infrequencies of varying discrepancies between subtest or index scores for a sample of adults with brain injury, demonstrating not only the replicability of the overall group profile but also the variability of individual findings within this broad diagnosis. The strategy of a single overall comparison should reduce overinterpretation of test results and direct attention to the most reliable and meaningful differences. Further studies examining WAIS–III profiles for different clinical groups will be important in identifying how well individual profiles aid in diagnosis and treatment planning.

Table 2
Differences Between WMI or PSI and Mean Score: .05 and .01 Significance Levels When Comparing a Single Index With the Mean Index

                     Significance level (α)
Index                   .05      .01
Working Memory         5.75     7.88
Processing Speed       6.83     9.67

Note. WMI = Working Memory Index; PSI = Processing Speed Index.

Table 3
Abnormality of Differences Between Index Scores and the Overall Mean Index Score

                                    Cumulative infrequency
Index                       25%      10%       5%       2%       1%
Verbal Comprehension       9.11    13.03    15.52    18.42    20.40
Perceptual Organization    8.69    12.43    14.81    17.58    19.46
Working Memory             9.27    13.26    15.80    18.75    20.76
Processing Speed          10.41    14.88    17.73    21.05    23.30

References

Axelrod, B. N., Fichtenberg, N. L., Liethen, P. C., Czarnota, M. A., & Stucky, K. (2002). Index, summary, and subtest discrepancy scores on the WAIS–III in traumatic brain injury patients. International Journal of Neuroscience, 112, 1479–1487.
Davis, F. B. (1959). Interpretation of differences among averages and individual test scores. Journal of Educational Psychology, 50, 162–170.
Donders, J. (1996). Cluster subtypes in the WISC–III standardization sample: Analysis of factor index scores. Psychological Assessment, 8, 312–318.
Donders, J., Tulsky, D. S., & Zhu, J. (2001). Criterion validity of new WAIS–III subtest scores after traumatic brain injury. Journal of the International Neuropsychological Society, 7, 892–898.
Donders, J., Zhu, J., & Tulsky, D. (2001). Factor index score patterns in the WAIS–III standardization sample. Assessment, 8, 193–203.
Glutting, J. J., & Watkins, M. M. (1997). The base rate problem and its consequences for interpreting children’s ability profiles. School Psychology Review, 26, 176–188.
Kaufman, A. S. (1990). Assessing adolescent and adult intelligence. Boston: Allyn & Bacon.
Knight, R. G., & Godfrey, H. P. D. (1984). Assessing the significance of differences between subtests on the Wechsler Adult Intelligence Scale—Revised. Journal of Clinical Psychology, 40, 808–810.
Naglieri, J. A. (1993). Pairwise and ipsative comparisons of WISC–III IQ and index scores. Psychological Assessment, 5, 113–116.
Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.
Payne, R. W., & Jones, G. (1957). Statistics for the investigation of individual cases. Journal of Clinical Psychology, 13, 115–121.
Wechsler, D. (1991). Wechsler Intelligence Scale for Children—III. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (1997a). WAIS–III/WMS–III technical manual. San Antonio, TX: The Psychological Corporation.
Wechsler, D. (1997b). Wechsler Adult Intelligence Scale—Third Edition. San Antonio, TX: The Psychological Corporation.

Received March 18, 2002
Revision received July 16, 2003
Accepted November 19, 2003
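
As a supplementary illustration of how the infrequency values in Table 3 might be consulted, the sketch below looks up the rarest tabled band that an obtained difference reaches. It is a hedged example rather than part of the published method: the dictionary copies the values printed in Table 3, the function name `infrequency_band` is hypothetical, the absolute difference is used because the article reports the percentiles as two-tailed, and counting a difference equal to a tabled value as reaching that band is an assumption.

```python
from typing import Optional

# Two-tailed cumulative infrequency thresholds copied from Table 3.
# Keys are percentages of the general population; values are the size of
# the difference between an index score and the overall mean index score.
TABLE_3 = {
    "VCI": {25: 9.11, 10: 13.03, 5: 15.52, 2: 18.42, 1: 20.40},
    "POI": {25: 8.69, 10: 12.43, 5: 14.81, 2: 17.58, 1: 19.46},
    "WMI": {25: 9.27, 10: 13.26, 5: 15.80, 2: 18.75, 1: 20.76},
    "PSI": {25: 10.41, 10: 14.88, 5: 17.73, 2: 21.05, 1: 23.30},
}


def infrequency_band(index: str, difference: float) -> Optional[int]:
    """Return the smallest tabled percentage reached by |difference|.

    None means the difference is smaller than the 25% threshold, that is,
    it occurs in more than 25% of the general population.
    """
    size = abs(difference)
    reached = [pct for pct, cutoff in TABLE_3[index].items() if size >= cutoff]
    return min(reached) if reached else None


if __name__ == "__main__":
    # WMI difference of 7.25 from the article's referral example.
    band = infrequency_band("WMI", -7.25)
    label = "more than 25%" if band is None else f"{band}% or fewer"
    print(f"WMI difference of 7.25 is found in {label} of the population")
```

Reporting the band alongside the significance test keeps the article's two questions separate: whether a difference is reliable (Tables 1 and 2) and whether it is unusual (Table 3).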
