PredVerbal ELL

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/254114219
The Predictive Accuracy of Verbal, Quantitative, and Nonverbal Reasoning Tests:

Consequences for Talent Identiﬁcation and Program Diversity
Article in journal for the education of the gifted · October 2011

DOI: 10.1177/016235321103400404
CITATIONS READS
37 1,706
2 authors, including:
Joni Lakin
Auburn University
39 PUBLICATIONS 679 CITATIONS
SEE PROFILE
All content following this page was uploaded by Joni Lakin on 30 May 2014.
The user has requested enhancement of the downloaded file.

Running head: ABILITY TESTS AND LINGUISTICALLY DIVERSE GROUPS
Assessing the Cognitive Abilities of Culturally and Linguistically Diverse Students:
Predictive Validity of Verbal, Quantitative, and Nonverbal Tests
Joni M. Lakin
Auburn University
Draft of August 9, 2011
This manuscript was published in Psychology in the Schools.
Author Note
Joni M. Lakin, Department of Educational Foundations, Leadership, and Technology,

Auburn University.
The data analyzed in this paper were collected as part of the Project Bright Horizon
project, which was sponsored by a Jacob K. Javits Gifted and Talented Education grant to the
Project Bright Horizon Research Team: Peter Laing, Project Director/Co–Principal
Investigator,Washington Elementary School District, Phoenix, AZ; Dr. Jaime Castellano, Project
Consultant; and Dr. Ray Buss, Arizona State University at the West Campus, Principal
Investigator. The views and opinions expressed in this article are those of the author and should
not be ascribed to any members of the Project Bright Horizon staff or its consulting partners.
The author gratefully acknowledges the helpful suggestions of David Lohman, John Young,
Brent Bridgeman, Don Powers, and Dan Eignor on earlier drafts of this article.
Correspondence concerning this article should be addressed to Joni Lakin, Department of
Educational Foundations, Leadership, and Technology, Auburn University, Auburn, AL 36831.
Email: joni.lakin@auburn.edu
Running head: ABILITY TESTS AND LINGUISTICALLY DIVERSE GROUPS 2
Abstract
Verbal and quantitative reasoning tests provide valuable information about cognitive abilities
that are important to academic success. Information about these abilities may be particularly
valuable to teachers of students who are English-language learners (ELL), because leveraging
reasoning skills to support comprehension is a critical aptitude for their academic success.
However, due to concerns about cultural bias, many researchers advise exclusive use of
nonverbal tests with ELL students despite a lack of evidence that nonverbal tests provide greater
validity for these students. In this study, a culturally and linguistically diverse sample of students
were administered a test measuring verbal, quantitative, and nonverbal reasoning. The two-year
predictive relationship between ability and achievement scores revealed that nonverbal scores
had weaker correlations with future achievement than quantitative and verbal reasoning ability
for ELL and non-ELL students. Results do not indicate differential prediction and do not support
the exclusive use of nonverbal tests for ELL students.
Keywords: Cognitive ability testing, English-language learners, Hispanic students, Validity

Assessing the Cognitive Abilities of Culturally and Linguistically Diverse Students:
Predictive Validity of Verbal, Quantitative, and Nonverbal Tests
Cognitive ability tests that measure verbal, quantitative, and nonverbal reasoning skills
are widely used by schools to provide valuable information to teachers hoping to differentiate
instruction to their students’ cognitive strengths (Lohman, 2009; Lohman & Hagen, 2001b).
Verbal and quantitative reasoning skills are particularly critical for academic success because of
the heavy reliance on these skills in traditional academic domains. This may be even more true
for English-language learner (ELL) students, for whom leveraging verbal reasoning skills to
support comprehension is a critical aptitude for school success and language acquisition. For
example, knowing an ELL student has relatively weak verbal reasoning skills, a teacher might
provide that student with more linguistic support than other ELL students need. However,
despite the potential utility of such information, many researchers advise the exclusive use of
nonverbal tests with ELL students, suggesting that the linguistic and cultural demands of the
items creates measurement bias (Lewis, 2001; McCallum, Bracken, & Wasserman, 2001;
Naglieri & Ronning, 2000). While this argument is persuasive to many, there is little direct
evidence that nonverbal tests provide more useful information about ELL students’ academic
aptitude than verbal or quantitative tests.
This study evaluated the validity and fairness of the Cognitive Abilities Test (CogAT,
Form 6; Lohman & Hagen, 2001a) in predicting reading and math achievement in a sample of
Hispanic ELL and Hispanic and White non-ELL students. The CogAT consists of verbal,
quantitative, and nonverbal batteries. These batteries have been found to provide strong
predictive validity for achievement in non-ELL populations (Lohman & Hagen, 2002). The
purpose of the study was to explore whether the batteries provide similar predictive validity for
Hispanic ELL and non-ELL students.
Considerations in the use of multi-battery ability tests
Multi-battery ability tests assess cognitive ability by sampling multiple content domains.
Such tests are useful for teachers because they provide information about the range of a student’s
talents. Both individually and group-administered tests can provide multiple test scores for
students that contrast their performance in various domains (Sattler, 2008). Teachers can use this
information to target both student weaknesses—for extra practice and instructional support—and
student strengths—for enrichment opportunities that make school more enjoyable and
challenging.
A common misconception is that ability tests should enable users to measure innate
ability that is uninfluenced by educational opportunity. In fact, ability tests measure developed
and well-practiced reasoning skills (Anastasi, 1980). Rather than providing qualitatively distinct
information from achievement tests, ability and achievement tests differ in the degree to which
they tap into recent and specific learning accomplishments versus general and long-term
acquisitions (Anastasi, 1980; Lohman, 2001). Thus, ability tests offer a different broader
perspective on developed knowledge and skills that can be contrasted with more narrowly
focused achievement test performance and can be useful to teachers who want to adapt the pace
and content of their instruction to students who differ widely in the speed and readiness with
which they learn (Lohman & Hagen, 2001b).
The misconception about ability tests measuring innate capabilities leads many to
conclude that mean differences on verbal and quantitative ability tests by definition reflect bias
in the assessments. Large mean differences have been documented between ELL and non-ELL
students on a range of ability tests (Lakin & Lohman, in press; Palmer, Olivarez, Willson, &
Fordyce, 1989; Patterson, Mattern, & Kobrin, 2007). Thus, a number of researchers and
educators have called for the exclusive use of nonverbal tests in assessing the cognitive strengths
of culturally and linguistically diverse students (Lewis, 2001; Naglieri & Ford, 2003).
However, the existence of mean differences is not in itself evidence of bias (Jensen,
1980; Reynolds, 1982). When test users interpret scores appropriately by controlling for
opportunity to learn, mean differences do not negate the utility of the tests for differentiating
instruction. Furthermore, the exclusive use of nonverbal tests to predict achievement and make
academic placement decisions has been widely criticized by many researchers, because those
tests clearly under-represent the domain of interest and lack the obvious links that verbal and
quantitative reasoning have to learning and school success (Braden, 2000; Figueroa, 1989; Lakin
& Lohman, in press; Ortiz & Dynda, 2005). They also do not provide teachers with a clear path
for differentiating instruction.
Use of nonverbal ability tests with ELL students
To support the contention that nonverbal tests are more valid and useful for
differentiating instruction for ELL students, researchers must first show conclusive evidence of
bias for tests measuring verbal and quantitative reasoning, and, second, that nonverbal tests
provide an effective alternative. The evidence of bias for verbal and quantitative tests is not
conclusive because most proponents of nonverbal assessments rely solely on mean differences as
evidence of bias. Evidence of differential prediction for tests measuring verbal and quantitative
ability would provide more conclusive evidence, but has not been found by previous research.
For example, despite finding large mean differences, Palmer et al. (1989) found no differences in
the regression slopes between language proficiency groups when predicting achievement scores
from the Kaufman Assessment Battery for Children (K-ABC; Kaufman & Kaufman, 1983) using
ability scores from the Wechsler Intelligence Scale for Children-Revised (WISC-R; Wechsler,
1974).
In contrast, there is strong evidence that nonverbal tests do not provide an effective
alternative to verbal and quantitative tests because these tests often yield low validities for
predicting reading and math achievement (both critical domains of academic development). For
a sample of ELL students, Borghese (2009) reported correlations between the Universal
Nonverbal Intelligence Test (UNIT; Bracken & McCallum, 1998) and achievement of r = .28 for
reading achievement. Prediction of math achievement was stronger at r = .51. Jones (2006) found
correlations below .10 between UNIT scores in first grade and reading achievement on the Texas
Assessment of Knowledge and Skills (TAKS) in third grade for both ELL and non-ELL students.
Even in non-ELL samples, the correlations between nonverbal tests and achievement usually
range between .3 and .6 (e.g., Balboni, Naglieri, & Cubelli, 2010; Naglieri & Ronning, 2000).
These values are far below what is typically observed for CogAT verbal and quantitative
batteries with non-ELL samples, which predict their relevant domain of achievement (reading
and mathematics, respectively) with correlations of .75-.80 (Lakin & Lohman, in press). Lakin
and Lohman (in press) showed that differences in correlations of this magnitude (.5 vs. .8) have
practical importance in the identification of academically talented students.
The purpose of this study was to provide additional data on the predictive validity of
verbal, quantitative, and nonverbal test batteries for culturally and linguistically diverse students.
The research questions that guided this study were:

1. Are there substantial mean differences in verbal, quantitative, and nonverbal ability
scores between students who differ in racial/ethnic background and/or language
proficiency?
2. Are the same achievement and ability measures useful as predictors of future
achievement for ELL and non-ELL students?
3. Does the nonverbal battery play a more important role in predicting later achievement for
ELL students?
Methods
Two schools in Arizona participated in the Project Bright Horizons study developed by a
team of researchers and school administrators (see Lohman, Korb, & Lakin, 2008). The data
used in this study came from students in the sample who were in 3rd to 5th grade in the first year
of the study and reported either White or Hispanic ethnicity. The sample consisted of 124
Hispanic ELL students, 161 Hispanic non-ELL students, and 72 White non-ELL students.
Ethnicity was based on district data, which relies on U.S. Census classifications. Other ethnic
groups of non-ELL students (Asian, American Indian, and African American) included fewer
than 30 students each and were omitted from the analyses. Table 1 provides additional
demographic information on the sample.
[Table 1]
ELL status in this study relied on district classifications reported by the schools.
StudentsThese classifications were based partially on student scores on the Stanford English
Language Proficiency Test (SELP; Harcourt Educational Measurement, 2003). For this study,
students were classified based on their ELL status in year 1. The range of English proficiency
varied considerably within the ELL group: 13% were first-year ELL students (i.e., low
proficiency) while another 33% were reclassified by the second year of the study (likely high
proficiency).
In the first year of the study, students completed both ability and achievement tests in the
late spring. In the second year, only achievement tests were administered. The achievement tests
were administered as part of the schools’ annual accountability testing. Only students with
complete teststest records were used. The variables with the greatest proportion of missing data
were the year-two achievement scores. Unlike many studies, in this case, White students had
missing scores more often than ELL and Hispanic students., perhaps due to differences in school
mobility.
Measures
Cognitive Abilities Test (Form 6)
The CogAT consists of three separate batteries measuring verbal, quantitative, and
nonverbal reasoning (Lohman & Hagen, 2001a). In this study, students received the appropriate
level of the CogAT given their grade level (levels A to C, respectively). The verbal (65 items),
quantitative (60 items), and nonverbal batteries (65 items) each consist of three subtests that use
different item formats. Universal scale scores on a vertical scale spanning grades K through 12
were used in this study. A previous research study on the same dataset indicated that the factor
structure of this test was consistent for ELL and non-ELL students, though the variance of the
verbal factor was attenuated for ELL students (Author, 2010). Another study found that the
reliability of the verbal battery was adequate (Φ = .82) for ELL students, though lower than that
of non-ELL students (Φ = .96). See Lakin and Lai (in press) for a detailed exploration of the
reliability of the CogAT for ELL students.

All tests on the CogAT begin with directions that are read aloud by the teacher. In this
study, teachers read directions in Spanish as well as English when appropriate. All three subtests
of the verbal battery and one subtest of the quantitative battery require the examinee to complete
some reading in English. On the verbal battery, students must read either individual words
(verbal classification and verbal analogies) or short sentences (sentence completion). On the
quantitative relations subtest, students read individual words (e.g., foot, gallon). The other
quantitative tests and all of the nonverbal battery do not require reading.
Achievement test
The Arizona Instrument to Measure Standards Dual Purpose Assessment (AIMS DPA)
was designed to yield normative and criterion-referenced information about student achievement.
Thirty to fifty percent of items on the AIMS DPA come from the TerraNova achievement tests
(CTB/McGraw-Hill, 2002). The remaining items were developed by educators specifically for
the AIMS DPA to better align the test with state educational goals (Arizona Department of
Education, 2006). Reading/language arts and mathematics subtests of the AIMS DPA each
contained approximately 80 items. Separate scale scores are reported for mathematics and
reading.
Procedure
In separate models with year 2 reading and math achievement as the criterion, regression
analyses explored the incremental prediction of the ability tests when year-one achievement
scores are available. The order of entry for predictor variables was based on prior research,
which indicates that the best predictor of future achievement is prior achievement followed by
the ability to reason in the domain and then by general reasoning skills (Lohman, 2009). Thus,
year-one achievement scores entered first, followed by domain-relevant, year-one ability tests
scores (verbal or quantitative), and finally nonverbal ability scores. Variables for ethnicity (1 =
Hispanic, 0 = non-Hispanic) and ELL status (1 = ELL; 0 = non-ELL) were then entered as a
block. Finally, interaction terms of the ability scores with ELL status and Hispanic background
were entered as a block. To explore the utility of nonverbal tests for ELL students, a separate
series of regressions compared variance accounted for with different combinations of predictors.
Design
Interaction terms and regression residuals form the basis for analyzing differences in the
magnitude of the relationships between the predictor tests and the achievement criterion tests. In
the predictive bias framework originally outlined by Cleary (1968; see also Cleary, Humphreys,
Kendrick, & Wesman, 1975), two levels of differential prediction were defined. One type of
differential prediction was defined by an interaction of group membership with predictors in the
regression analysis and reflected bias in the slope of the regression lines. Differences in
regression slopes indicate that the predictors being used are less relevant to the criterion for one
group versus another. In this study, an interaction of the ability test scores with ethnicity or ELL
status might indicate that the tests are less predictive of achievement for those students.
Cleary (1968) defined another type of differential prediction as persistent under- or over-
prediction for one group. This form of differential prediction is detected by analyzing regression
residuals for evidence that one group’s observed criterion scores are significantly higher or lower
than the model predicts (Reynolds, 1982). In the absence of an interaction of group membership
with predictor variables, differences in residuals indicate that the regression slopes for two
groups are nearly parallel, but do not coincide. For this type of differential prediction with
parallel regression lines, Cleary et al. (1975) explained, “the test can be used within each group
with the same accuracy of prediction” (p. 27; see also Reynolds, 1982).
Results
Descriptive statistics are reported in Table 2. Mean differences were large between the
ELL and non-ELL groups (-1.0 to -2.1SD). The differences were largest for verbal reasoning and
math achievement and somewhat smaller for quantitative reasoning and year-one reading
achievement. Nonverbal reasoning scores and reading achievement in year two showed the
smallest differences, though they were still substantial. Mean differences between the two non-
ELL groups were much smaller. Only verbal reasoning and year-one reading showed moderate
effect sizes (-0.6SD and -0.4SD, respectively).
[Table 2]
Variance is another important characteristic of score distributions because restricted
range can attenuate correlations with other variables. In Table 2, the ratios of variance are
reported for each test. As an example, on the quantitative battery, the variance ratio of 1.7 for
ELL and non-ELL Hispanic groups indicated that the variance of non-ELL Hispanic students
was 70% greater than the variance for ELL Hispanic students. Across the board, non-ELL
students were much more variable than ELL students were and White students were more
variable than Hispanic students were. Despite this finding, there was no apparent floor effect in
the histograms of test scores. The data for all three groups also satisfied Bracken’s (2007)
heuristic for floor effects in that the range of scores extended above and below the mean by
2SDs.
Patterns of Correlations
Hispanic ELL students had substantially lower correlations between tests, which may be
related to their restricted variability in scores. See Table 3. Despite this, the pattern of
correlations between achievement and ability tests were consistent with previous research. For
all three groups of students, math achievement correlated most strongly with quantitative
reasoning and reading achievement correlated most strongly with verbal reasoning. Even for
ELL students, nonverbal ability scores had significantly lower correlations 1 with year-one
achievement than verbal had with reading and quantitative had with math. Furthermore, the
relationship between the ability scores and achievement remained strong through year two.
[Table 3]
Multiple Regressions Including Year-One Achievement
The strong correlations between ability scores and year-two achievement highlight their
relevance to future academic success. However, ability tests can also provide incremental
prediction beyond the data that schools already have—namely, previous achievement test scores.
Thus, a series of regression models tested the incremental prediction of year-two achievement
from both ability scores and year-one achievement.
Math Achievement
Year-one math achievement accounted for 64% of the variance in year-two math
achievement. See Table 4. Quantitative reasoning added an additional 6% to the variance
accounted for, and nonverbal added an additional 1%. When ELL status and ethnicity (White vs.
Hispanic) entered the model, they did not account for an appreciable amount of variance.
Interaction variables between ethnicity and the ability scores also did not contribute to
prediction, indicating that the regression slope was the same for all three groups.
Reading Achievement
Year-one reading achievement accounted for 70% of year-two achievement. See Table 5.
Verbal reasoning added an additional 1% to the variance explained, but nonverbal reasoning
1
Using a Fisher r-to-z transformation (p<.05). See Hays (1994).
failed to improve prediction any further. ELL status and ethnicity accounted for a significant but
negligible amount of variance (less than 1%). Coefficients in the final model indicated that
neither coefficient was significant. Inspection of the coefficients before the interactions were
added indicated that ELL status had a slight negative effect on achievement (b = -.10).
Interactions between ELL status and test scores failed to add significantly to the prediction of
year-two reading achievement.
[Table 5]
Residuals for overall regression
To test for differential prediction in the absence of group interactions, regression
residuals across groups were analyzed as a one-way ANOVA to detect consistent under- or over-
prediction for one group (Reynolds, 1982). The same regression analyses for reading and
mathematics achievement were repeated and residuals recorded without the effects for ELL and
ethnicity status included. Means and SDs are reported in Table 6. For math achievement, there
was no main effect for residuals, indicating that the three groups of students did not vary
significantly in the fit of the regression model. For reading achievement, however, there was a
significant effect (F [2, 373] = 3.80, p < .025). Follow-up tests using Tukey’s comparisons
indicated that there was significant, though slight, under-prediction of reading achievement for
Hispanic, non-ELL students in year two of around 6 points on the reading achievement scale (an
effect size of about .17). On average, both White non-ELL and Hispanic ELL students showed
over-prediction for reading achievement.
[Table 6]
Alternative Predictive Models for ELL Students
Given the lower correlations between ability and achievement for ELL students, and the
arguments by some researchers that nonverbal tests provide more valid information about the
abilities of ELL students, we explored combinations of ability scores were explored to see if they
improved predictive validity for ELL students. See Table 7. For math achievement, quantitative
reasoning added the most predictive variance: 42% when entered first and adding 14%
incrementally when nonverbal was added first. For reading achievement, verbal reasoning
adding the most predictive variance: 28% when entered first and adding 13% incrementally
when nonverbal was added first. When entered first, nonverbal ability accounted for just 30% of
variance for math achievement and 17% of variance for reading achievement. When entered
second, nonverbal ability accounted for just 2% of variance in either domain.
[Table 7]
Discussion
The research questions addressed (1) the presence of mean differences, (2) the pattern of
correlations between ability and achievement tests across groups, and (3) the interaction of
nonverbal tests with ELL and Hispanic group membership. Large mean differences were found
between the observed test scores for ELL and non-ELL students, while small-to-negligible
differences were found between Hispanic and White non-ELL students. For math achievement,
these differences translated into a small, but significant, positive main effect for Hispanic
students in the regression analysis indicating that their year 2 achievement scores were higher
than those for White and ELL students with similar achievement and ability scores in year 1. For
reading achievement, the tests indicated a small negative main effect of ELL status indicating
that ELL students’ scores were lower than for the other two groups when controlling for prior
achievement and ability. Neither main effect appeared practically significant.
An interaction between ELL or Hispanic variables and ability test scores would indicate
differential prediction between the three groups. However, none of the interaction terms entered
in the final step of the regression analyses. were statistically significant. This finding indicates
that the same test variables are similarly important to the prediction of later achievement for all
three groups of students. This conclusion is further supported by the table of observed test
correlations, which showed that the same ability tests were most important for predicting
achievement in all three groups. One contradictory finding came from the analysis of residuals,
which revealed that Hispanic non-ELL students’ reading achievement was somewhat under-
predicted by the common regression line.
Separate analyses explored whether nonverbal ability scores were particularly important
in predicting achievement for ELL students. For math achievement, nonverbal tests were clearly
inferior to quantitative tests in predicting year-two achievement. For reading achievement, verbal
ability was clearly the best predictor for ELL students. NonverbalIn contrast to the recommended
use of nonverbal tests for ELL students, nonverbal ability scores did not appear to provide
similar predictive validity compared to quantitative or verbal ability and did not add much
incremental prediction beyond those scores even for ELL students. Thus, although nonverbal
ability tests can play an important role as part of an assessment battery, their relationship to
current and future achievement is not as strong as for verbal and quantitative ability tests.
Therefore, for teachers seeking guidance on how best to adapt instruction to the cognitive
strengths of their ELL students, this study provides evidence that, overall, nonverbal tests do not
provide superior information about the cognitive strengths and academic promise of ELL
students. As with non-ELL students, the most relevant information comes from verbal reasoning
for reading domains and quantitative reasoning for mathematics domains.
It would be reasonable to expect these results to generalize to other nonverbal ability tests
that are primarily unidimensional. The CogAT nonverbal battery consists of three item formats:
figure analogies, figure classification, and paper folding. The figure analogies format is related to
the item formats used by the Naglieri Nonverbal Ability Test and Raven’s Progressive Matrices
and shows strong convergent validity with those tests (Lohman, Korb, & Lakin, 2008). On this
basis, it is reasonable to assume that these findings would generalize to those tests.
Implications and Directions for Future Research
The consistency of the regression slope between ELL and non-ELL students indicated
that the tests provide similar information about the future achievement of all three groups of
students. For educators seeking to differentiate instruction, verbal and quantitative reasoning
tests show equally strong predictive accuracy for reading and mathematics achievement,
respectively. In this study, nonverbal measures did not provide an effective alternative and were
less useful for making decisions about which students are most likely to succeed in traditional
academic domains relative to other students with similar linguistic and cultural backgrounds.
The main effects of ELL status for reading achievement and Hispanic background for
math achievement in addition to the slight underprediction of reading achievement for non-ELL
Hispanic students indicate that the use of those scores requires careful interpretation. As Cleary
et al. (1975) explained, despite the presence of main effects in the regression (or mean
differences in observed scores), “when the [regression] lines are parallel, the test can be used
within each group with the same accuracy of prediction” (p. 27; see also Reynolds, 1982).
Recently, there have been innovations in making appropriate inferences about ability when using
tests that are affected by opportunity to learn or access to the curriculum. This is discussed in the
next section.
Normative inferences based on opportunity to learn
The common misconception that ability tests measure innate intelligence often leads to
the (mistaken) conclusion that mean differences must either be interpreted as immutable group
differences in intelligence or test bias (Jensen, 1980; Lohman, 2006a). In fact, ability tests
measure developed capabilities that are impacted by educational experience and opportunity to
learn (Anastasi, 1980; Martinez, 2000). This does not negate their utility for making inferences
about students’ intellectual capacity as long as opportunity to learn is taken into account. In fact,
comparing the performance of ELL students to appropriate norm groups (i.e., those with similar
opportunities to learn) is critical for making valid inferences about the cognitive abilities of ELL
students (Author, 2010). Comparing ELL students to national norms based on predominantly
non-ELL students will not provide appropriate inferences about the skills of ELL students.
Two strategies have recently been suggested to account for group differences that likely
reflect different degrees of opportunity to learn. Lohman (2006b, 2009) proposed the use of local
subgroup norms to provide a rudimentary adjustment for opportunity to learn when identifying
students for gifted programs and talent development. Weiss, Saklofske, Prifitera, and Holdnack
(2006) used national subgroup norms based on proxies for acculturation, including years in U.S.
schools, to provide multiple perspectives on student scores for the WISC-IV. Contextualizing
student scores with multiple norm comparisons can identify students from minority cultural or
linguistic backgrounds who excel relative to their educational opportunities even when they may
not compare favorably to the national norms (Callahan, 2009; Gándara, 2005; Weiss et al.,
2006). Lohman (2006b, 2009) provides practical guidance as to how local norms can be
developed and used by teachers for the identification of students for gifted and talented
programs. Additional research is needed o expand the use of local norms to instructional
differentiation as well as to explore the practicality and political feasibility of these solutions.
Instructional differentiation
Although in this study a multi-battery test has been found to provide useful information
about the cognitive abilities of ELL and non-ELL students, it does not follow that all students
identified with, for example, verbal strengths require the same instructional interventions
(Callahan, 2009). Recommendations for non-ELL students are already available (Lohman &
Hagen, 2001b). Therefore, additional research is needed to explore appropriate instructional
differentiation for ELL students. 2 In fact, a wide range of research is needed to guide teachers’
use of assessment data to make appropriate educational decisions for ELL students (Young,
2009).
Limitations
Although no evidence of significant differential prediction was found in this study, there
may be other undetected sources of bias. For instance, if there is bias in both the predictor and
criterion, it will not affect the correlations (Cronbach, 1970). Given the central role that
achievement tests play in the modern educational system, bias in the criteria of this study
(reading and math achievement) deserve critical analysis that is beyond the scope of this paper.
Another important limitation is the unusual ethnic makeup of this study. Less than one-
third of the sample was White, which makes their relative weight in determining the shape of the
common regression line smaller than it would be in schools with a majority of White students.
2
It should be noted that efforts to capitalize on the apparent nonverbal strengths of ELL students (sometimes
misconstrued as spatial strengths) neglect the impact of opportunity to learn on ability scores. Many ELL students
may in fact have relative strengths in verbal and quantitative reasoning that are obscured by use of national norms.
However, the regression lines for all three groups were nearly identical. To the extent that the
White students in this sample are similar to the population of White students in the U.S. as a
whole, there is no reason to expect that the common regression line would be much different if
White students made up a larger proportion of the sample.
Finally, it should be noted that the choice of assessment should depend on the type of
instructional differentiation being considered (Callahan, 2009). This study focused on traditional
academic domains and thus the CogAT was appropriate for predicting success. However, if a
talent development program were targeting skills beyond general reasoning and verbal and
quantitative domains, other tests might be more appropriate. Multiple indicators of student
aptitude are always critical to making decisions about gifted and talented program placement.
Conclusion
This study confirmed that within ELL groups and Hispanic and White ethnic groups,
multi-battery ability tests provide useful and valid information about the future performance of
students. The exclusive use of nonverbal tests does not appear warranted when assessing ELL
students with some level of English proficiency and when interpreting scores using appropriate
normative comparisons. In fact, assessing the verbal reasoning skills of ELL students may be
particularly helpful for teachers. Verbal reasoning skills, which include the ability to make sense
of incomplete verbal information, is critical for the academic success of ELL students who must
constantly leverage these skills to make sense of teachers, other students, and reading materials.
Knowledge about which students struggle to make connections within verbal information may
help teachers target those students for additional linguistic support. Although verbal reasoning
scores for ELL students have limitations in their psychometric qualities relative to scores for
non-ELL students, they are still very useful in this regard. Efforts to improve these measures and
promote appropriate uses should continue.

References
Abedi, J., & Lord, C. (2001). The language factor in mathematics tests. Applied Measurement in
Education, 14, 219-234.
Anastasi, A. (1980). Abilities and the measurement of achievement. New Directions for Testing
and Measurement, 5, 1-10.
Arizona Department of Education. (2006, November). AIMS student guides. Retrieved
September 4, 2007, from http://www.ade.state.az.us/standards/AIMS/AIMSSTGuides/
Author (2010). Multidimensional ability tests and culturally and linguistically diverse students:
Evidence of measurement invariance. Manuscript submitted for publication.
Balboni, G., Naglieri, J.A., & Cubelli, R. (2010). Concurrent and predictive validity of the Raven
Progressive Matrices and the Naglieri Nonverbal Ability Test. Journal of
Psychoeducational Assessment, 28, 222–235. doi: 10.1177/0734282909343763
Borghese, P. (2009). An analysis of predictive, convergent, and discriminant validity of the
Universal Nonverbal Intelligence Test with limited English proficient Mexican-American
elementary students (Doctoral dissertation). Retrieved from ProQuest. (AAT 3351828)
Bracken, B. A. (2007). Creating the optimal preschool testing situation. In B. A. Bracken, & R. J.
Nagle (Eds.), Psychoeducational assessment of preschool children (4th ed., pp. 137-154).
Mahwah, NJ: Lawrence Erlbaum Associates.
Bracken, B.A., & McCallum, R.S. (1998). Universal Nonverbal Intelligence Test examiner’s
manual. Itasca, IL: Riverside.
Braden, J.P. (2000). Editor’s introduction: Perspectives on the nonverbal assessment of
intelligence. Journal of Psychoeducational Assessment, 18, 204-210.

Callahan, C.M. (2009). Myth 3: A family of identification myths: Your sample must be the same
as the population. There is a "silver bullet" in identification. There must be "winners" and
"losers" in identification and programming. Gifted Child Quarterly, 53, 239-241.
Cleary, T.A. (1968). Test bias: Prediction of grades of Negro and White students in integrated
colleges. Journal of Educational Measurement, 5, 115-124.
Cleary, T.A., Humphreys, L.G., Kendrick, S.A., & Wesman, A. (1975). Educational uses of tests
with disadvantaged students. American Psychologist, 30, 15-41.
Cronbach, L. J. (1970). Essentials of psychological testing (3rd ed.). New York: Harper & Row.
CTB/McGraw-Hill. (2002). TerraNova, the Second Edition. Monterey, CA: Author.
Figueroa, R.A. (1989). Psychological testing of linguistic-minority students: Knowledge gaps
and regulations. Exceptional Children, 56, 145-152.
Gándara, P. (2005). Fragile futures: Risk and vulnerability among Latino high achievers.
Princeton, NJ: Educational Testing Service.
Harcourt Educational Assessment. (2003). Stanford English Language Proficiency Test. San
Antonio, TX: Author.
Jensen, A. R. (1980). Bias in mental testing. New York, NY: The Free Press.
Jones, C.K. (2006). The relationship of language proficiency, general intelligence, and reading
achievement with a sample of low performing, limited English proficient students
(Doctoral dissertation). Retrieved from ProQuest. (AAT 3296415)
Kaufman, A.S., & Kaufman, N.L. (2004). Kaufman Assessment Battery for Children, Second
Edition (KABC-II. Circle Pines, MN: American Guidance Service.

Lakin, J.M., & Lai, E.R. (in press). Multi-group generalizability analysis of verbal, quantitative,
and nonverbal ability tests for culturally and linguistically diverse students. Educational
and Psychological Measurement.
Lakin, J.M., & Lohman, D.F. (in press). The predictive accuracy of verbal, quantitative, and
nonverbal reasoning tests: Consequences for talent identification and program diversity.
Journal for the Education of the Gifted.
Lewis, J. D. (2001). Language isn't needed: Nonverbal assessments and gifted learners. Growing
Partnerships for Rural Special Education, San Diego, CA.
Lohman, D. F. (2001, November). Aptitude for college: The importance of reasoning tests for
minority admissions. Talk given at Rethinking the SAT: The future of standardized testing
in university admissions. University of California at Santa Barbara. Retrieved from
http://faculty.education.uiowa.edu/dlohman/
Lohman, D. F. (2006a). Beliefs about differences between ability and accomplishment: From
folk theories to cognitive science. Roeper Review, 29, 32-40.
Lohman, D. F. (2006b). Practical advice on using the Cognitive Abilities Test as part of a talent
identification system. Retrieved from http://faculty.education.uiowa.edu/dlohman/
Lohman, D.F. (2009). Identifying academically talented students: Some general principles, two
specific procedures. In L. Shavinina (Ed.), International handbook on giftedness, (pp.
971-997). New York, NY: Springer.
Lohman, D. F., & Hagen, E. P. (2001a). Cognitive Abilities Test (Form 6). Itasca, IL: Riverside.
Lohman, D. F. & Hagen, E. P. (2001b). Cognitive Abilities Test (Form 6): Interpretive guide for
teachers and counselors. Itasca, IL: Riverside.

Lohman, D. F., & Hagen, E. P. (2002). Cognitive Abilities Test (Form 6): Research handbook.
Itasca, IL: Riverside.
Lohman, D. F., Korb, K. A., & Lakin, J. M. (2008). Identifying academically gifted English-
language learners using nonverbal tests: A comparison of the Raven, NNAT, and CogAT.
Gifted Child Quarterly, 52(4), 275-296.
Martinez, M.E. (2000).Education as the cultivation of intelligence. Mahwah, NJ: Lawrence
Erlbaum Associates.
McCallum, R. S., Bracken, B. A., & Wasserman, J. D. (2001). Essentials of nonverbal
assessment. Hoboken, NJ: Wiley.
Naglieri, J. A. (1996). Naglieri Nonverbal Ability Test (NNAT). San Antonio, TX: Harcourt
Brace Educational Measurement.
Naglieri, J. A., & Ford, D. Y. (2003). Addressing underrepresentation of gifted minority children
using the Naglieri Nonverbal Ability Test (NNAT). Gifted Child Quarterly, 47, 155-160.
Naglieri, J. A., & Ronning, M. E. (2000). The relationship between general ability using the
Naglieri Nonverbal Ability Test (NNAT) and Stanford Achievement Test ( SAT) reading
achievement. Journal of Psychoeducational Assessment, 18, 230–239.
Ortiz, S. O., & Dynda, A.M. (2005). Use of intelligence tests with culturally and linguistically
diverse populations. In D. P. Flanagan, & P. L. Harrison (Eds.), Contemporary
Intellectual Assessment: Theories, Tests, and Issues (2nd ed., pp. 545-556). New York:
Guilford Press.
Palmer, D.J., Olivarez, A., Willson, L.V., & Fordyce, T. (1989). Ethnicity and language
dominance-influence on the prediction of achievement based on intelligence test scores in
nonreferred and referred samples. Learning Disability Quarterly, 12, 261-274.

Patterson, B.F., Mattern, K.D., & Kobrin, J.L. (2007). Validity of the SAT for predicting FYGPA:
2007 SAT validity sample [Statistical Report]. New York, NY: College Board.
Raven, J. C., Court, J. H., & Raven, J. (1996). Manual for Raven’s Progressive Matrices and
Vocabulary Scales: Section3. Standard Progressive Matrices. Oxford, UK: Oxford
Psychologists Press.
Reynolds, C.R. (1982). Methods for detecting construct and predictive bias. In R.A. Berk,
Handbook of methods for detecting test bias (pp. 199-227). Baltimore, MD: Johns
Hopkins University Press.
Sattler, J.M. (2008). Assessment of children: Cognitive foundations (5th edition). La Mesa, CA:
Author.
Wechsler, D. (1974). Wechsler Intelligence Scale for Children-Revised (WISC-R). New York:
Psychological Corporation
Weiss, L. G., Saklofske, D.H., Prifitera, A., & Holdnack, J. A. (2006). WISC-IV Advanced
Clinical Interpretation. Burlington, MA: Elsevier.
Young, J.W. (2009). A Framework for Test Validity Research on Content Assessments Taken by
English Language Learners. Educational Assessment, 14, 122-138.

Table 1
Breakdown of Sample by Demographic Category
Percent
Total Home Grade Grade
Ethnicity N Female FRL lang. 3 4
Hispanic ELL 128 45 98 100 45 35
Hispanic non-ELL 161 55 94 16 20 38
White non-ELL 72 44 44 4 19 36
Note. FRL = Eligible for free or reduced lunch price. Home lang. = Primary home language other
than English
Table 2
Descriptive Statistics for ELL and Ethnic Groups
Means (SDs)
ELL
prog.
Grade yrs CogAT AIMS DPA
Non- Y1 Y1 Y2 Y2
Verbal Quant. verbal Rdg Math Rdg Math
Hispanic ELL 3.9 3.8 150.7 160.0 173.0 419.3 409.6 450.1 431.5
N=124 (1.5) (0.8) (11.2) (14.0) (17.7) (32.0) (31.1) (39.0) (31.6)
Hispanic non-ELL 4.2 177.5 182.0 192.7 468.4 470.5 498.1 485.0
N=161 (0.8) (16.9) (18.3) (17.8) (39.8) (37.3) (44.5) (34.7)
White non-ELL 4.3 190.4 186.0 197.4 487.6 481.2 503.1 489.9
N=72 (0.8) (24.5) (21.4) (21.8) (50.1) (49.4) (59.9) (41.6)
Cohen's d effect sizes
Hispanic ELL –
Hispanic non-ELL -1.9 -1.4 -1.1 -1.4 -1.8 -1.1 -1.6
Hispanic ELL -
White non-ELL -2.1 -1.4 -1.2 -1.6 -1.7 -1.0 -1.6
Hispanic - White
non-ELL -0.6 -0.2 -0.2 -0.4 -0.2 -0.1 -0.1
Variance Ratios
Hispanic non-ELL/
Hispanic ELL 2.26 1.70 1.01 1.55 1.44 1.30 1.20
White non-ELL/
Hispanic ELL 4.78 2.34 1.51 2.44 2.52 2.36 1.74
White/Hispanic
non-ELL 2.12 1.37 1.49 1.58 1.75 1.81 1.44
Table 3
Correlations Between Tests in Year 1 and 2 Across ELL and Ethnic Groups
CogAT AIMS DPA

Verbal Quant Nonverb. Y1 Mth Y1 Rdg Y2 Mth
Quant 0.60
Hispanic Nonverbal 0.58 0.67
ELL Y1 Mth 0.53 0.70 0.58
N = 128 Y1 Rdg 0.60 0.50 0.39 0.63
Y2 Mth 0.54 0.64 0.55 0.65 0.51
Y2 Rdg 0.54 0.47 0.41 0.52 0.67 0.68
Quant 0.62
Hispanic
Nonverbal 0.55 0.65
non-ELL
N = 161 Y1 Mth 0.69 0.76 0.68
Y1 Rdg 0.78 0.59 0.55 0.78
Y2 Mth 0.60 0.78 0.64 0.78 0.65
Y2 Rdg 0.67 0.54 0.47 0.66 0.77 0.69
White Quant 0.83
non-ELL Nonverbal 0.80 0.79
N = 72 Y1 Mth 0.85 0.90 0.76
Y1 Rdg 0.83 0.75 0.69 0.84
Y2 Mth 0.74 0.80 0.71 0.79 0.68
Y2 Rdg 0.74 0.66 0.62 0.69 0.75 0.77
Note. At these sample sizes, for correlations in the .40-.80 range, differences of .15 or greater are
significant at the .05 level.
Table 4
Multiple Regression for Math Achievement in Year 2
Comparison of models Coefficients for final model

Change Statistics 95.0% CICIf
Model R R2 R2 Δ FΔ dfDf Final model Beta Lower Upper
a
1 0.800 0.640 0.640 662.64** 1, 373 Y1 Math 0.39 0.29 0.54
2b 0.836 0.698 0.058 72.09** 1, 372 Quant 0.41 0.47 1.56
c
3 0.840 0.705 0.007 8.32** 1, 371 Nonverbal 0.16 -0.08 0.86
d
4 0.842 0.709 0.004 2.74 2, 369 ELL 0.13 -61.37 89.24
e
5 0.843 0.710 0.001 0.26 4, 365 Hispanic 0.26 -42.31 109.01
Interactions
ELL x Q -0.22 -0.70 0.41
ELL x N 0.09 -0.43 0.54
Eth x Q -0.03 -0.61 0.57
Eth x N -0.16 -0.68 0.46
Notes. a Model with Y1 AIMS Math. b Add CogAT Quantitative. c Add CogAT Nonverbal. d Add Hispanic and ELL variables. e Add
interaction terms. f The final model included Y1 math achievement, CogAT Quantitative and Nonverbal scores, Hispanic and ELL
variables, and interaction terms between the CogAT scores and Hispanic/ELL categories.
* p < .05. ** p < .01.
Table 5
Multiple Regression for Reading Achievement in Year 2
Comparison of models Coefficients for final modelmodelf

Change Statistics 95.0% CI
Model R R2 R2 Δ FΔ df Final model Beta Lower Upper
1a 0.839 0.704 0.704 889.24** 1, 374 Y1 Reading 0.59 0.43 0.63
2b 0.847 0.718 0.014 18.89** 1, 373 Verbal 0.15 -0.09 0.68
3c 0.849 0.721 0.002 3.24 1, 372 Nonverbal 0.05 -0.28 0.47
4d 0.853 0.728 0.007 4.68** 2, 370 ELL -0.32 -99.40 38.98
5e 0.854 0.729 0.002 0.53 4, 366 Hispanic -0.16 -79.67 44.74
Interactions
ELL x V -0.03 -0.53 0.49
ELL x N 0.26 -0.23 0.51
Eth x V 0.22 -0.29 0.57
Eth x N -0.02 -0.46 0.44
Notes. a Model with Y1 AIMS Reading. b Add CogAT Verbal. c Add CogAT Nonverbal. d Add Hispanic and ELL variables. e Add
interaction terms. f The final model included Y1 reading achievement, CogAT Verbal and Nonverbal scores, Hispanic and ELL
variables, and interaction terms between the CogAT scores and Hispanic/ELL categories.
* p < .05. ** p < .01.
Table 6
Descriptives for mean residuals
Mathematics Reading
M SD M SD
Hispanic ELL 0.97 27.83 -3.08 22.43
Hispanic non-ELL 2.16 25.33 3.71 22.4
White non-ELL -6.31 33.23 - 2.56 26.5
Table 7
Multiple Regression of Achievement for ELL students
Predicting Math Achievement

Order of Entry R R2 Δ Order of Entry R R2 Δ
Quantitative 0.65 0.42 Nonverbal 0.56 0.31
Nonverbal 0.67 0.03 Quantitative 0.67 0.14
Verbal 0.68 0.02 Verbal 0.68 0.02
Predicting Reading Achievement
Verbal 0.53 0.28 Nonverbal 0.42 0.17
Nonverbal 0.55 0.02 Verbal 0.55 0.13
Quantitative 0.57 0.02 Quantitative 0.57 0.02
View publication stats

PredVerbal ELL

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

PredVerbal ELL

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

The Predictive Accuracy of Verbal, Quantitative, and Nonverbal Reasoning Tests:

Article in journal for the education of the gifted · October 2011

The user has requested enhancement of the downloaded file.

Assessing the Cognitive Abilities of Culturally and Linguistically Diverse Students:

Predictive Validity of Verbal, Quantitative, and Nonverbal Tests

Draft of August 9, 2011

This manuscript was published in Psychology in the Schools.

Joni M. Lakin, Department of Educational Foundations, Leadership, and Technology,

the exclusive use of nonverbal tests for ELL students.

Keywords: Cognitive ability testing, English-language learners, Hispanic students, Validity

Assessing the Cognitive Abilities of Culturally and Linguistically Diverse Students:

Predictive Validity of Verbal, Quantitative, and Nonverbal Tests

aptitude than verbal or quantitative tests.

Hispanic ELL and non-ELL students.

Considerations in the use of multi-battery ability tests

which they learn (Lohman & Hagen, 2001b).

for differentiating instruction.

Use of nonverbal ability tests with ELL students

practical importance in the identification of academically talented students.

The research questions that guided this study were:

scores between students who differ in racial/ethnic background and/or language

achievement for ELL and non-ELL students?

demographic information on the sample.

Cognitive Abilities Test (Form 6)

reliability of the CogAT for ELL students.

effect sizes (-0.6SD and -0.4SD, respectively).

Variance is another important characteristic of score distributions because restricted

Multiple Regressions Including Year-One Achievement

from both ability scores and year-one achievement.

achievement. See Table 4. Quantitative reasoning added an additional 6% to the variance

year-two reading achievement.

Residuals for overall regression

To test for differential prediction in the absence of group interactions, regression

over-prediction for reading achievement.

Alternative Predictive Models for ELL Students

second, nonverbal ability accounted for just 2% of variance in either domain.

achievement and ability. Neither main effect appeared practically significant.

predicted by the common regression line.

for reading domains and quantitative reasoning for mathematics domains.

Implications and Directions for Future Research

Normative inferences based on opportunity to learn

Hagen, 2001b). Therefore, additional research is needed to explore appropriate instructional

White students made up a larger proportion of the sample.

promote appropriate uses should continue.

Education, 14, 219-234.

and Measurement, 5, 1-10.

Arizona Department of Education. (2006, November). AIMS student guides. Retrieved

September 4, 2007, from http://www.ade.state.az.us/standards/AIMS/AIMSSTGuides/

Evidence of measurement invariance. Manuscript submitted for publication.

Progressive Matrices and the Naglieri Nonverbal Ability Test. Journal of

Psychoeducational Assessment, 28, 222–235. doi: 10.1177/0734282909343763

Borghese, P. (2009). An analysis of predictive, convergent, and discriminant validity of the

Universal Nonverbal Intelligence Test with limited English proficient Mexican-American

elementary students (Doctoral dissertation). Retrieved from ProQuest. (AAT 3351828)

Mahwah, NJ: Lawrence Erlbaum Associates.

manual. Itasca, IL: Riverside.

Braden, J.P. (2000). Editor’s introduction: Perspectives on the nonverbal assessment of

intelligence. Journal of Psychoeducational Assessment, 18, 204-210.

"losers" in identification and programming. Gifted Child Quarterly, 53, 239-241.

colleges. Journal of Educational Measurement, 5, 115-124.

with disadvantaged students. American Psychologist, 30, 15-41.