INTELLIGENCE 13, 349-359 (1989)

Correlations of Mental Tests with Each
Other and with Cognitive Variables are
Highest for Low IQ Groups
Case Western Reserve University

The Psychological Corporation

Two studies showed an inverse relationship between ability level and correlations among
IQ measures. Low IQ subjects showed much higher correlations than high IQ subjects.
Intercorrelations of IQ subtests, correlations of cognitive ability measures with each other,
and correlations of IQ with measures of cognitive abilities all displayed the same effect. In
the first study, data from two experiments in which subjects took a battery of basic
cognitive tasks and a standard IQ test were analyzed. Measures from the basic tasks
correlated more highly in the low IQ group than in the high IQ group. In the second study,
data from the WAIS-R and WISC-R standardization samples were divided into five ability
groups. Average correlations among subtests were computed for each ability group. For
both the WAIS-R and WISC-R, average subtest correlations were highest in the low
ability group. Correlations declined systematically with increasing IQ. In both studies,
correlations were found to be two times higher in low IQ groups than in high IQ groups.

Spearman (1904) established the importance of positive manifold. Positive manifold
refers to the empirical observation that tests of mental ability are positively
correlated with each other. Spearman's formulation of 'g', general intelligence,
represented the degree of positive manifold among all tests in a battery of tests.
Positive manifold among mental tests is one of the most reliable, replicable,
and important empirical discoveries about human ability yet found. Attempts to
explain positive manifold have, directly or indirectly, occupied the efforts of
many researchers in psychometrics and individual differences. During the 85-
year history of this work, it was thought that positive manifold was uniformly
distributed over the full range of ability. That is, it was assumed that the correlation
among mental tests would be about the same in a group of low IQ subjects as
it would be in a group of high IQ subjects. (Both groups must represent similar
ranges of ability and, so, have equal standard deviations, or differences in cor-
relations could be due to restriction of range.)
Data are reported that show the uniform distribution assumption is incorrect.
The correlations among the WAIS-R and WISC-R subtests and the correlation of
basic cognitive measures with standardized tests of intelligence and with each
other were analyzed. The data to be reported show correlations among mental
tests are higher for low IQ subjects than for high IQ subjects. This relation was
first found (Study 1) for the correlation of measures from basic cognitive tasks
with intelligence test scores and with each other. It was later confirmed (Study 2)
for subtests of the WAIS-R and WISC-R using the national standardization
samples for those tests.

The data used in this study came from preliminary reports of two experiments by
Detterman, Caruso, Mayer, Legree, and Conners (1983) and Detterman (1986).
The first experiment (MR/College) used an extreme groups design comparing
mentally retarded persons with college students. The second experiment (High
School) employed a randomly selected sample of high school students. In both
experiments, subjects were given a set of computer-administered, basic cognitive
tasks and a standardized IQ test.

Subjects. The MR/College experiment compared 20 young adult mentally
retarded persons (M IQ = 67.5, SD = 7.56) with 20 college students (M IQ =
115.5, SD = 7.79). The High School experiment included 141 randomly se-
lected high school students (M IQ = 108.0, SD = 18.3). A low IQ group (M IQ
= 93.0, SD = 12.3) consisted of the 68 subjects below the mean of the entire
group. A high IQ group (M IQ = 122.0, SD = 9.9) consisted of 73 subjects
above the mean.

Tests. All subjects in both experiments were given the WAIS-R. They also
took a set of computer-administered basic cognitive tasks. In the MR/College
experiment, the battery consisted of 9 tests, one each for learning, relearning,
probe memory, Sternberg memory search, match-to-sample, tachistoscopic iden-
tification, tachistoscopic recognition, strategy development, and choice reaction
time. The High School experiment included variants of these tasks and a recogni-
tion memory task for a total of 10 tasks. All tasks were administered by computer
and all responses were made on a touch screen fitted to the computer monitor. All
the tasks used the same stimuli. Each task yielded several measures based on
latency or errors.

All measures had been extensively pretested and were known to have good
reliabilities (average split-half reliability = .79). For the MR/College experi-
ment, 31 measures from the basic tasks and full scale WAIS-R IQ scores were
available. For the High School experiment, 36 measures were chosen as best
representing all the measures obtained from all the basic cognitive tests. WAIS-R
full scale IQ scores were also included.

Procedure. Subjects in both experiments were administered the battery of
cognitive tasks in a darkened, quiet room. Administration took from 2 to 4 hours
for each subject. At a convenient time during computer testing or after comple-
tion of the computer battery, each subject was administered the WAIS-R by a
trained examiner.

Results and Discussion

The results of interest were the correlations between the measures from the
cognitive tasks and IQ for subjects low and high on IQ. The question was
whether basic cognitive measures correlate more highly with IQ in low IQ groups
than in high IQ groups. In the MR/College experiment, the subjects were initially
divided into groups and these divisions were used for analysis. In the High
School experiment two groups were formed by dividing the full group at the
mean of IQ.
Statistical analyses of the data were conducted in the same way for both
groups. A matrix of the correlations of all cognitive variables from the tasks plus
IQ was computed separately for the high and low IQ groups yielding two ma-
trices for each experiment. For the MR/College experiment there were 32 (31
cognitive variables + IQ) variables in each matrix. For the High School experi-
ment there were 37 (36 cognitive variables + IQ) variables in each matrix. All
correlations in each matrix were corrected for restriction of range. This was
essential to correct for different degrees of selection that might have occurred in
each of the subgroups. Corrections were based on the Full Scale IQ standard
deviation and followed the procedures suggested by Gulliksen (1950) for explicit
and incidental selection. Corrections for restriction of range correct the obtained
values to the value that would have been obtained had the range not been
restricted. However, the correction treats the sample data as typical of the full
range: it yields the correlations that would have been obtained from the entire
population if the relationships in the subgroup were typical of that population.
Thus, adjustments for restriction of range do not eliminate differences in
correlation between high and low IQ groups. They simply adjust both groups to a
common standard deviation so that correlations in the two groups can be
directly compared.
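
The corrections described above can be illustrated with a minimal sketch. It assumes the familiar textbook formulas for explicit selection (one of the two correlated variables is the selection variable) and incidental selection (selection occurred on a third variable). The function names and the example values are hypothetical, and the sketch is not a reproduction of the computations actually carried out for these data.

```python
import math


def correct_explicit(r, sd_full, sd_restricted):
    """Correct r when one of the two variables was the selection variable."""
    u = sd_full / sd_restricted  # ratio of unrestricted to restricted SD
    return r * u / math.sqrt(1.0 + r * r * (u * u - 1.0))


def correct_incidental(r_xy, r_xz, r_yz, sd_full, sd_restricted):
    """Correct r_xy when selection occurred on a third variable z.

    r_xz and r_yz are the restricted-sample correlations of x and y with z;
    the two SDs refer to the selection variable z.
    """
    k = (sd_full / sd_restricted) ** 2 - 1.0
    numerator = r_xy + r_xz * r_yz * k
    denominator = math.sqrt((1.0 + r_xz ** 2 * k) * (1.0 + r_yz ** 2 * k))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical example: population SD of 15, subgroup SD of 7.5.
    print(round(correct_explicit(0.30, 15.0, 7.5), 3))
```

When the restricted and unrestricted standard deviations are equal, both functions return the observed correlation unchanged, which is why the correction puts groups on a common footing without forcing their correlations to agree.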
The differences between corresponding correlations in the high and low IQ
group matrices were tested using Fisher's z. Only the upper triangular portion of
the matrix was used for these tests because the lower half is redundant. The value
of z was evaluated by a two-tailed test. The number of correlations which were
significantly larger in the low IQ group than the same correlation in the high IQ
group was counted. Next, χ² was used to test the difference between the number
of correlations significantly higher in the low group compared with the number
(5%) expected by chance. Note that even though significance was tested as a
two-tailed test, the hypothesis tested by χ² was that the low IQ group correlations
were larger than the high IQ group correlations, which is a one-tailed hypothesis.
This makes the test a conservative one.

TABLE 1
The Average Correlation of Basic Cognitive Task Measures
with IQ Scores on the WAIS-R and with Each Other (Cognitive)

             MR/College          High School
IQ Level     IQ     Cognitive    IQ     Cognitive
Low          .60    .44          .37    .26
High         .26    .23          .24    .18

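
The counting procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the original analysis code: the critical value of 1.96 corresponds to a two-tailed test at the .05 level, the χ² is written as a one-degree-of-freedom goodness-of-fit statistic against the 5% chance rate, and all function names are hypothetical.

```python
import math


def fisher_z_difference(r_low, n_low, r_high, n_high):
    """z statistic for the difference between two independent correlations."""
    z_low, z_high = math.atanh(r_low), math.atanh(r_high)
    standard_error = math.sqrt(1.0 / (n_low - 3) + 1.0 / (n_high - 3))
    return (z_low - z_high) / standard_error


def count_low_significantly_larger(low_rs, high_rs, n_low, n_high, critical=1.96):
    """Count correlations that are significantly larger in the low group."""
    return sum(
        1
        for r_low, r_high in zip(low_rs, high_rs)
        if fisher_z_difference(r_low, n_low, r_high, n_high) > critical
    )


def chi_square_against_chance(n_significant, n_total, chance_rate=0.05):
    """One-df goodness-of-fit chi-square: observed count vs. chance count."""
    expected_hits = chance_rate * n_total
    expected_misses = n_total - expected_hits
    observed_misses = n_total - n_significant
    return ((n_significant - expected_hits) ** 2 / expected_hits
            + (observed_misses - expected_misses) ** 2 / expected_misses)


if __name__ == "__main__":
    # Hypothetical correlations for three variable pairs in two groups of 20.
    low = [0.70, 0.55, 0.20]
    high = [0.05, 0.30, 0.25]
    hits = count_low_significantly_larger(low, high, 20, 20)
    print(hits, chi_square_against_chance(hits, len(low)))
```
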
Table 1 shows the results of both experiments. Clearly, the correlations be-
tween cognitive task measures and WAIS-R IQ are up to twice as large in low IQ
samples as in high IQ samples. The statistical analyses of the MR/College
experiment comparing low and high IQ groups confirmed this observation, yield-
ing a statistically significant effect (χ²(1) = 308.11, p < .001), as did the High
School experiment (χ²(1) = 87.79, p < .001). There were larger differences
between high and low IQ groups for the correlations of IQ tests with cognitive
variables than for cognitive variables with themselves although the difference
was found for both. Similar results were obtained if the same statistical pro-
cedure was carried out on the correlation matrices uncorrected for restriction of
range.
Another way of tabulating and comparing differences in correlations was
suggested by Kaiser (1968). The largest eigenvalue minus one divided by the
number of variables minus one yields an estimate of average correlation in the
matrix. This method was used to calculate the average correlation. The largest
difference between the previous method and Kaiser's method was .06. Evidently
both methods are similar.
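
Kaiser's estimate as stated above, (largest eigenvalue - 1) / (number of variables - 1), can be sketched in a few lines. The power-iteration routine below is an assumption made to keep the example free of external libraries; it is adequate for correlation matrices showing positive manifold but is not a general-purpose eigenvalue solver, and the function names are hypothetical.

```python
import math


def largest_eigenvalue(matrix, max_iterations=1000, tolerance=1e-12):
    """Approximate the largest eigenvalue of a symmetric matrix by power iteration."""
    size = len(matrix)
    vector = [1.0 / math.sqrt(size)] * size
    eigenvalue = 0.0
    for _ in range(max_iterations):
        product = [sum(row[j] * vector[j] for j in range(size)) for row in matrix]
        # Rayleigh quotient of the current unit-length vector.
        new_eigenvalue = sum(vector[i] * product[i] for i in range(size))
        norm = math.sqrt(sum(value * value for value in product))
        vector = [value / norm for value in product]
        if abs(new_eigenvalue - eigenvalue) < tolerance:
            return new_eigenvalue
        eigenvalue = new_eigenvalue
    return eigenvalue


def kaiser_average_correlation(matrix):
    """Estimate the average correlation from the largest eigenvalue."""
    return (largest_eigenvalue(matrix) - 1.0) / (len(matrix) - 1.0)


if __name__ == "__main__":
    # Hypothetical 3 x 3 correlation matrix; the simple mean correlation is .40.
    example = [[1.0, 0.5, 0.4],
               [0.5, 1.0, 0.3],
               [0.4, 0.3, 1.0]]
    print(kaiser_average_correlation(example))
```

For a matrix in which every correlation equals the same value, the estimate reproduces that value exactly; for unequal correlations it can differ slightly from the simple mean.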

If correlations between cognitive tasks and IQ scores are higher in low IQ
groups, then more complex mental tests should also intercorrelate more highly in
low IQ samples. To test this possibility, the standardization samples for the
WAIS-R and WISC-R were divided into subgroups and analyzed in the same way
as in Study 1.

Subgroups were formed by selecting subjects by one of their subtest scaled
scores. At first, it might seem best to select subgroups on Full Scale IQ. However,
that procedure would introduce spurious negative correlations among the
subtests. The problem is that subtests contribute to Full Scale IQ. If subjects are
selected by Full Scale IQ, their subscale scores must balance out to equal the Full
Scale score. Those with higher scores on one subtest must have a lower score on
one or more other subtests to keep their IQ within the range.
The negative correlations induced by such selection procedures are not trivial.
For example, when the WISC-R standardization sample was divided by Full
Scale IQ into subgroups 15 IQ points wide, the average subtest intercorrelation
ranged from -.03 to .00 for all subgroups except the lowest (IQ < 71),
for which the mean intercorrelation was .07.
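
The selection artifact described above can be reproduced with simulated data. The sketch below is illustrative only: the five equally weighted "subtests", their common population correlation of .50, the width of the selection band, and the function names are all assumptions, not values or procedures taken from the standardization samples.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_intercorrelation(rows):
    """Average correlation over all pairs of columns."""
    columns = list(zip(*rows))
    pairs = list(itertools.combinations(range(len(columns)), 2))
    return sum(pearson(columns[i], columns[j]) for i, j in pairs) / len(pairs)


def simulate(n_cases=20000, n_subtests=5, band_half_width=0.5, seed=0):
    """Return (full-sample mean r, mean r within a narrow band of total score)."""
    rng = random.Random(seed)
    weight = math.sqrt(0.5)  # gives a population correlation of .50 between subtests
    rows = []
    for _ in range(n_cases):
        general = rng.gauss(0.0, 1.0)
        rows.append([weight * general + weight * rng.gauss(0.0, 1.0)
                     for _ in range(n_subtests)])
    # Select cases whose total score falls in a narrow band around zero.
    band = [row for row in rows if abs(sum(row)) < band_half_width]
    return mean_intercorrelation(rows), mean_intercorrelation(band)


if __name__ == "__main__":
    full_r, band_r = simulate()
    print(round(full_r, 2), round(band_r, 2))
```

Within the band the total is nearly fixed, so a high score on one simulated subtest must be offset by lower scores on the others, and the average intercorrelation turns negative even though the full-sample correlation is substantial.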
To avoid this problem, cases must be chosen by a score that is not a sum of
some or all the subtest scores. The ideal solution would be to use a score from
another test to select cases. But no other test was available, so subjects were
divided into subgroups by one of their subtests. The Vocabulary subtest score
was used for one analysis and, as a check, the analysis was repeated using the
Information subtest for group assignment. The subtest used to choose groups was
included in the analysis because, after correction for restriction of range, including
it had little effect on the results.

Standardization Samples. The WAIS-R and WISC-R standardization samples
consisted of 1,880 and 2,200 subjects, respectively. For both tests, scores stan-
dardized within age for each subtest were used to remove age variance. All 11
subtests for the WAIS-R and all 12 subtests for the WISC-R were used for
analysis. Full Scale IQ was also included for both tests.

Sample Subdivision. Each standardization sample was divided into five sepa-
rate ability groups by standard scores on one of the subtests. This was done using
the Information and Vocabulary subtests for both the WAIS-R and WISC-R.
These two subtests were selected because of their high correlation with Full Scale
IQ and because they are stable across age. The standard score range (and its IQ
equivalent) and number of cases for each subgroup are shown in Table 2.

Results and Discussion

Analyses were conducted in the same manner as for Study 1. Correlation ma-
trices including all subtests and Full Scale IQ were constructed for each of the
five ability level groupings. Correlations were then corrected for restriction of
range based on the standard deviation of the subtest used for selection for that
matrix. This procedure corrected for within-subgroup differences of range. It
also allowed average subgroup correlations to reflect correlations that would
have been obtained from the entire population if the ability subgroup was typical
of all the population.

TABLE 2
Number of Subjects in WAIS-R and WISC-R Subgroups Formed
by Selection on Vocabulary (Voc) and Information (Inf)

                               WAIS-R         WISC-R
Group   Range    IQ Equiv.    Voc    Inf     Voc    Inf
1       1-5      <78          109    120     156    130
2       6-8      78-92        474    466     518    525
3       9-11     93-107       697    669     837    842
4       12-14    108-122      472    514     525    535
5       15-19    >122         128    111     164    168

Range = scaled score range, M = 10, SD = 3.
IQ Equivalent = the equivalent scaled score as an IQ, M = 100, SD = 15.

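
The IQ equivalents of the scaled score ranges follow from the two metrics (scaled scores with M = 10, SD = 3; IQ with M = 100, SD = 15). A minimal sketch of the conversion is given below. The half-point boundaries between adjacent scaled scores are an assumption; they appear consistent with the reported IQ bands, but the exact convention used is not stated. The function name is hypothetical.

```python
def scaled_to_iq(scaled, scaled_mean=10.0, scaled_sd=3.0, iq_mean=100.0, iq_sd=15.0):
    """Convert a subtest scaled score to the equivalent score on the IQ metric."""
    return iq_mean + iq_sd * (scaled - scaled_mean) / scaled_sd


if __name__ == "__main__":
    # Boundaries halfway between adjacent scaled-score groups (assumed convention).
    for boundary in (5.5, 8.5, 11.5, 14.5):
        print(boundary, scaled_to_iq(boundary))
```

Under this assumption the boundary between scaled scores 5 and 6 falls at an IQ equivalent of 77.5, so whole-number IQs of 78 and above belong to the second group.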
Figure 1 shows the average correlation at each ability level. Results when ability
subgroups are chosen by Vocabulary subtest scores are nearly identical to results
when groups are formed using Information subtest scores. The inclusion of Full Scale IQ in
these calculations has very little effect. Omission of Full Scale IQ reduced none
of the average correlations more than .04.
The most obvious and striking trend apparent in Figure 1 is that low ability
groups demonstrate correlations which are two times larger than high ability
groups. This trend is apparent in both the WAIS-R and WISC-R standardization
samples though the trend is more pronounced in the WAIS-R data. It is also
apparent that there is a systematic trend for successively lower ability levels to
show successively higher correlations.
Each point in Figure 1 is based on 78 averaged correlations for the WISC-R
and 66 averaged correlations for the WAIS-R (representing the upper triangular
portion of the matrix). Statistical analyses on these data using the same pro-
cedures as in Study 1 showed that the graphical trends were highly statistically
significant. Each correlation in a matrix was compared with the corresponding
correlation in every other matrix for each subgroup of that test and selection
method. For example, the correlations in the matrix for Group 1 of the WAIS-R
selected on Vocabulary were compared with the corresponding correlations for
all other subgroups of the WAIS-R selected on Vocabulary. As before, the cor-
relations were tested using Fisher's z and the number of statistically significant
differences was compared to differences expected by chance using χ². This
resulted in a total of 40 χ² statistics, 10 for each test and subtest selection
combination. Of these comparisons, only 4 χ² values were not statistically
significant. Three of these were in comparisons among subgroups of the WISC-R
with selection by Vocabulary: Group 4 was not less than Group 3; Group 5 was
not less than Group 4; and Group 5 was not less than Group 2. The only other χ²
which was not statistically significant was the comparison between Group 5 and
Group 4 on the WAIS-R with selection by Information. Figure 1 shows that all
these differences are consistent with slight deviations of a few data points from
the general trend.

[Figure 1: two panels, Information and Vocabulary. Horizontal axis: IQ Equivalent (<78, 78-92, 93-107, 108-122, >122). Vertical axis scaled from 0.1 to 0.9.]

FIG. 1. Average correlation among WAIS-R and WISC-R subtests within ability level group when
groups are selected by Vocabulary or Information subtests corrected for restriction of range.

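
The counts quoted in this section follow from simple combinatorics, as the short check below shows. It assumes that each matrix contains the subtests plus Full Scale IQ, as described in the text; the function name is hypothetical.

```python
from math import comb


def n_unique_correlations(n_variables):
    """Number of correlations in the upper triangle of a correlation matrix."""
    return comb(n_variables, 2)


if __name__ == "__main__":
    print(n_unique_correlations(11 + 1))  # WAIS-R: 11 subtests + Full Scale IQ -> 66
    print(n_unique_correlations(12 + 1))  # WISC-R: 12 subtests + Full Scale IQ -> 78
    # Pairwise comparisons among 5 ability groups, for 2 tests x 2 selection subtests.
    print(comb(5, 2) * 2 * 2)  # -> 40
```
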
For the remaining 36 comparisons between groups, the χ² values were very
large, mostly larger than 100. An idea of the size of this effect can be had from
the number of statistically significant differences between correlations of the
lower and higher groups. The following percentages of Fisher's z comparisons
were statistically significant in the predicted direction. WAIS-R: Vocabulary,
78%; Information, 72%. WISC-R: Vocabulary, 40%; Information, 68%. Remember
that this is a conservative test because a two-tailed test was used to test a
one-tailed hypothesis. Even so, the results obviously show that there is a system-
atic trend for the correlations among tests to be higher at lower ability levels.
LISREL offers another method of testing whether correlations are different
across ability level. Following the method suggested in the LISREL manual
(Joreskog & Sorbom, 1986), all five correlation matrices, uncorrected for re-
striction of range, were simultaneously compared to determine if they were
equal. This analysis was repeated for each test and for each method of selection.
LISREL gives a χ² showing the degree of fit to the hypothesis or model, which
was that all five matrices were equal. The results showed that the matrices could
not be regarded as equal for any of the tests or methods of selection. For WAIS-R
selected by Vocabulary, χ²(312) = 743.95, p < .001, and selected by Informa-
tion, χ²(312) = 1015.42, p < .001. For WISC-R selected by Vocabulary,
χ²(364) = 1390.27, p < .001, and selected by Information, χ²(364) = 1720.92,
p < .001. LISREL confirmed the results of the previous analysis showing that
correlations change across ability level.
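
The degrees of freedom reported for these tests are consistent with the usual count for a hypothesis that several matrices are equal: each matrix of p variables contributes p(p + 1)/2 distinct elements, and one common matrix is estimated across G groups, leaving (G - 1) p(p + 1)/2 degrees of freedom. The check below assumes that Full Scale IQ was included alongside the subtests; the exact model specification is not given in the text, so this is a plausibility check rather than a reconstruction. The function name is hypothetical.

```python
def equality_test_df(n_variables, n_groups):
    """Degrees of freedom for testing equality of n_groups covariance matrices."""
    distinct_elements = n_variables * (n_variables + 1) // 2
    return (n_groups - 1) * distinct_elements


if __name__ == "__main__":
    print(equality_test_df(12, 5))  # WAIS-R: 11 subtests + Full Scale IQ -> 312
    print(equality_test_df(13, 5))  # WISC-R: 12 subtests + Full Scale IQ -> 364
```
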
Because the correlations in Figure 1 were corrected for restriction of range,
comparisons can be made to the full standardization sample average correlations
among subtests which, for the WISC-R and WAIS-R, were .39 and .51 respec-
tively. It appears that the correlations in the full sample were more heavily
affected by the low end of the distribution. Differences between groups appear
larger at the low end of the distribution. This trend was most apparent for the
WISC-R. Though correlation differences were most obvious in the low IQ
groups, clearly the relationship between ability group and average subtest inter-
correlation was a systematic one.
If the average correlations were not corrected for restriction of range, the same
relationships reported above held except, of course, the correlations were smaller.
However, it is very important to correct for restriction of range in doing
analyses of this kind. Differences resulting from using Vocabulary or Information
subtests to form ability groups were nearly entirely eliminated by correction
for restriction of range. Average correlations for lowest to highest ability groups
assembled by Information subscale scores, uncorrected for restriction of range, for the
WISC-R were .42, .29, .26, .21, and .22 and for the WAIS-R were .56, .37,
.30, .25, and .26. Even these uncorrected differences were substantial and
One other effect apparent in Figure 1 is that the WISC-R has consistently
lower average correlations among subtests than the WAIS-R. This disparity
seems to be largest at the lower ability levels. The difference between the two
tests might be developmental in origin. However, the WISC-R and the WAIS-R
are two independent tests and so the differences between them could as easily be
because of dissimilarity in the tests themselves.
In summary, data from the WAIS-R and WISC-R provide strong support for
the contention that size of correlations among mental tests varies inversely as a
function of ability level. Correlations among subtests are higher for lower ability
subjects and lower for higher ability subjects.

The general finding from both studies is that mental tests, including mental tests
measuring basic cognitive ability, have higher intercorrelations in lower ability
groups than in higher ability groups. The data analyzed showed three separate
findings. First, cognitive tasks correlate more highly among themselves at lower
ability levels than at higher ability levels. Second, cognitive tasks correlate more
highly with IQ at lower ability levels than at higher ability levels. Third, subtests
of IQ tests intercorrelate more highly at lower ability levels than at higher ability
levels. The findings from the two studies are consistent and systematic.

There can be little doubt of the reliability of this finding. However, further
research will be required to determine if this same relationship can be found for
other tests than those used here. Given the regularity and systematic nature of the
findings for the tests analyzed here, it would be surprising if it were not a
replicable finding with other data sets.
Considering the consistency of these results, it seems surprising that they have
not been reported before. Anastasi (1970) reviewed the literature on variables
affecting psychological trait formation. She found a few studies which incidentally
reported differences in average correlations across ability groups. It seems
that the full importance of the finding was not understood or appreciated since
none of the studies were followed up.
The most similar finding to the one reported here was in a study by Maxwell
(1972). As a part of a statistical exercise, he discovered accidentally that two
groups of children divided by their scores on a reading test showed different
correlations among the subtests of the WPPSI. Correlations among subtests were
higher for the group that had low reading test scores. Maxwell proposed two ad
hoc explanations of the effect based on Thomson's sampling or bond theory.
Unfortunately, Maxwell seems not to have attempted to determine if the dif-
ferences in subtest correlations he found between high and low reading ability
subjects extended beyond the sample he analyzed. Maxwell thought the reading
test was the important variable in forming groups. Even so, it is very likely that
division by the subtests of the WPPSI would give the same results and so validate
the findings from this study.
If the finding that correlations between mental tests vary systematically by
level of ability is found to be a general one, not specific to certain tests, then the
implications of this finding are substantial. It suggests that much of what we
know about IQ would have to be reconsidered in light of this finding. Without
further verification and replication of this phenomenon, extensive theoretical
speculation is premature. Nevertheless, there are some obvious implications.
For example, Sternberg and Salter (1982) have discussed what they call the
.30 barrier, meaning that correlations between basic cognitive tasks and IQ have
often been under .30. The reason for this is probably that most researchers
investigating the relationship between cognitive abilities and IQ do not include
low IQ subjects in their samples. Table 1 shows what happens when only high IQ
subjects are used. Average correlations between cognitive tasks and IQ remain
below .30. But if the same correlations are computed for low IQ subjects, they
are about twice as large. Studies of individual differences in cognition that have
not included a proportional representation of low IQ subjects will be very hard to
interpret. Studies done with college students alone or with other highly selected
subject groups will not be generalizable to the full spectrum of intellectual
ability. Much of experimental psychology assumes that college students repre-
sent the mental structures of all people. These results would suggest that such an
assumption may not be warranted.

Higher correlations among low IQ subjects may also explain why IQ tests
seem to find more uses at the low end of the IQ scale than at the high end. The
WISC-R and WAIS-R, and perhaps other tests, will be more 'g'-loaded at the
low end of the distribution than at the high end. That is, a general factor should
account for more of the total variation among low ability subjects than it does
among high. Conversely, high ability subjects will show more subtest scatter.
The interpretation of many previous findings may be altered significantly if IQ
tests have different correlations at different ability levels. For example, to what
extent will the interpretation of heritability estimates be affected by this finding?
Are heritabilities higher or lower for low IQ subjects? Can subtest scatter be a
valid clinical tool for high IQ subjects even if it could be expected to be less
useful at low IQ levels? Should different kinds of testing be done for low IQ
subjects than is done for high IQ subjects? Can one factor analytic model ade-
quately represent high and low ability subjects simultaneously? (It is possible
that average correlations can change across ability level but that factor structure
could remain unchanged.)
Finally, what causes these differences in correlation at different ability levels
may be the most interesting of all the questions presented by these findings.
Although it is not the aim of this paper to consider fully potential explanations,
one possibility is suggested by a theory of mental retardation presented by
Detterman (1987). Briefly, he suggested that intelligence is a system made up of
a small number of independent processes. Mental retardation is caused by defi-
cits in central processes, meaning processes which most heavily affect all other
processes in the system. If these central processes are deficient, they limit the
efficiency of all other processes in the system. Because of the deficit in the
central process, the entire system is brought to a uniform low level of operation.
So all processes in subjects with deficits tend to operate at the same uniform
level. However, subjects without deficits show much more variability across
processes because they do not have deficits in important central processes. This
causes high correlations among mental measures in low IQ subjects and low
correlations in high IQ subjects.

Anastasi, A. (1970). On the formation of psychological traits. American Psychologist, 25, 899-910.
Detterman, D.K. (1986, November). Basic cognitive processes predict IQ. Paper presented at the
Psychonomic Society, New Orleans, LA.
Detterman, D.K. (1987). Theoretical notions of intelligence and mental retardation. American Journal
of Mental Deficiency, 92, 2-11.
Detterman, D.K., Caruso, D.R., Mayer, J.D., Legree, P.J., & Conners, F.A. (1983, March).
Assessing cognitive deficits in the mentally retarded: Findings from overall analyses of tasks.
Paper presented at the Gatlinburg Conference on Mental Retardation, Gatlinburg, TN.
Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
Joreskog, K.G., & Sorbom, D. (1986). LISREL VI. Mooresville, IN: Scientific Software Inc.
Kaiser, H.F. (1968). A measure of the average intercorrelation. Educational and Psychological
Measurement, 28, 245-257.
Maxwell, A.E. (1972). The WPPSI: A marked discrepancy in the correlations of the subtests for
good and poor readers. The British Journal of Mathematical and Statistical Psychology, 25,
Spearman, C.E. (1904). "General intelligence" objectively determined and measured. American
Journal of Psychology, 15, 201-293.
Sternberg, R.J., & Salter, W. (1982). Conceptions of intelligence. In R.J. Sternberg (Ed.), Handbook
of human intelligence (pp. 3-28). New York: Cambridge University Press.

