
Minority Performance on the Naglieri Nonverbal Ability Test, Second Edition, Versus the Cognitive Abilities Test, Form 6: One Gifted Program’s Experience

Article in Gifted Child Quarterly · April 2013

DOI: 10.1177/0016986213477190


3 authors:

Jacob A. Giessman, Columbia (Mo.) Public Schools
James Gambrell, ACT, Inc.
Molly S. Stebbins, Columbia Public Schools



Gifted Child Quarterly http://gcq.sagepub.com/

Minority Performance on the Naglieri Nonverbal Ability Test, Second Edition, Versus the Cognitive Abilities Test, Form 6: One Gifted Program's Experience
Jacob A. Giessman, James L. Gambrell and Molly S. Stebbins
Gifted Child Quarterly 2013 57: 101
DOI: 10.1177/0016986213477190

The online version of this article can be found at:


http://gcq.sagepub.com/content/57/2/101

Published by:

http://www.sagepublications.com

On behalf of:

National Association for Gifted Children


>> Version of Record - Mar 13, 2013


Downloaded from gcq.sagepub.com at University of Missouri-Columbia on March 19, 2013



Article

Minority Performance on the Naglieri Nonverbal Ability Test, Second Edition, Versus the Cognitive Abilities Test, Form 6: One Gifted Program’s Experience

Gifted Child Quarterly
57(2) 101–109
© 2013 National Association for Gifted Children
Reprints and permission: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0016986213477190
gcq.sagepub.com

Jacob A. Giessman1, James L. Gambrell2, and Molly S. Stebbins1

Abstract
The Naglieri Nonverbal Ability Test, Second Edition (NNAT2), is used widely to screen students for possible inclusion in talent
development programs. The NNAT2 claims to provide a more culturally neutral evaluation of general ability than tests such
as Form 6 of the Cognitive Abilities Test (CogAT6), which has Verbal and Quantitative batteries in addition to a Nonverbal
battery. This study compared the performance of 5,833 second graders who took the CogAT6 and 4,038 kindergartners,
first graders, and second graders who took the NNAT2 between 2005 and 2011 as part of a grade-wide screening for a
gifted program. Comparison between minorities and Whites on the CogAT6 and the NNAT2 found slightly larger gaps on
the CogAT6 Composite for Hispanics and English-Language Learners (ELL) but the same gap for Black students. Considered
alone, the Nonverbal battery of CogAT6 produced smaller gaps than the NNAT2 for Blacks, Hispanics, Asians, and ELL
students. Fisher’s exact tests showed no significant differences between the CogAT6 Composite and the NNAT2 in subgroup
identification rates at hypothetical cuts for gifted identification (top 20%, 10%, or 5%), except for Asian and ELL students. The
CogAT6 Nonverbal score appeared to identify as many or more high-ability students from underrepresented groups as the
NNAT2. Wechsler Intelligence Scale for Children, Fourth Edition, follow-up on the top 5% showed greater predictive validity
for the CogAT6 Composite. These results suggest that gifted programs should not assume that using a figural screening test
such as the NNAT2, without other adjustments to selection protocol, will address minority underrepresentation.

Keywords
NNAT2, CogAT6, WISC-IV, gifted, talent, minority, underrepresentation

1 Columbia (Mo.) Public Schools, Columbia, MO, USA
2 University of Iowa, Iowa City, IA, USA
* This manuscript was accepted under the previous editor, Carolyn M. Callahan.

Corresponding Author:
Jacob A. Giessman, Center for Gifted Education, Columbia (Mo.) Public Schools, 4303 South Providence Road, Columbia, MO 65203, USA.
Email: jgiessma@columbia.k12.mo.us

The continued underrepresentation of Black, Hispanic, and English-Language Learner (ELL) students in gifted programs is recognized as an important problem by theorists and practitioners in the field (e.g., Callahan, 2005; Donovan & Cross, 2002; Ford, 1998; U.S. Department of Education, 1993). Borland (2004), for one, argues that, because of chronic underrepresentation of certain groups, gifted programs may actually “widen the gap between society’s have’s and have-not’s” (p. 6). Borland maintains that, although gifted education is by no means the primary cause of achievement differences between demographic groups, it is morally and politically imperative that administrators do what they can to address minority underrepresentation in gifted programs.

In recognition of this issue, the National Association for Gifted Children (NAGC; 2010b) recommends that “students with identified needs represent diverse backgrounds and reflect the total student population of the district” and, to that end, supports “non-biased and equitable” identification strategies, including the use of nonverbal tests (Standard 2.3). Some have cautioned that nonverbal tests do not measure entirely the same constructs as the tests they are meant to supplement or replace and that they may contain unique forms of bias (Anastasi & Urbina, 1997; Lohman, 2005b; Lohman & Gambrell, 2012). Others, however, have argued that nonverbal tests are relatively free of test bias against children from non-English speaking homes, culturally diverse backgrounds, or with limited opportunity to learn and that they are better measures of ability for any child. Naglieri (2010), for example, argued that ability tests with


verbal or quantitative content are inappropriate for measuring general ability because they are heavily loaded with achievement factors. Naglieri, Brulles, and Landsdowne (2008) deemed nonverbal measures more equitable for all children and argued that “a nonverbal measure of ability can overcome the injustice of under-representation of minorities in gifted programs” (p. 10).

The Naglieri Nonverbal Ability Test, Second Edition (NNAT2; Naglieri, 2008a), has been advertised by its publisher as “a culturally neutral evaluation of students’ nonverbal reasoning and general problem-solving ability, regardless of the individual student’s primary language, education, culture or socioeconomic background” (Pearson, 2012). In an analysis of the standardization data for the first edition of the NNAT (Naglieri, 1997), Naglieri and Ford (2003) found that White, Black, and Hispanic children had similar mean scores and were similarly likely to meet common percentile cuts for participation in gifted programming (see also Naglieri & Ronning, 2000). Carman and Taylor (2010), however, found that low socioeconomic status students from underrepresented minority groups scored 14 Naglieri Ability Index score (NAI) points lower on the NNAT than nonminority students from middle-class families. Like Villarreal (2005), Carman and Taylor cautioned that the NNAT be used only in conjunction with other measures of ability. Naglieri and Ford’s (2003) findings were also questioned on statistical and methodological grounds by Lohman (2005a). A response from Naglieri and Ford (2005) included a call for similar empirical investigations of race and ethnic differences on the Cognitive Abilities Test, Form 6 (CogAT6; Lohman & Hagen, 2001b). The present study responds to Naglieri and Ford’s request by analyzing archival data sets from one gifted program’s use of the NNAT2 and the CogAT6 in grade-wide screenings.

The Present Study

The Midwestern school district (approximately 18,000 students) studied used grade-wide screenings with a group ability test as one major means of identifying students who might benefit from gifted services. Students who met district cut scores on the group ability test were referred for further evaluation, which typically included administration of the Wechsler Intelligence Scale for Children, Fourth Edition (WISC-IV; Wechsler, 2003a), perhaps the most commonly used test in identification for gifted-talented services (NAGC, 2010a).

In the fall of 2010, the district switched from using the CogAT6 to the NNAT2 for its grade-wide screenings in hopes that the NNAT2 might yield a more diverse pool for further evaluation and, ultimately, a more diverse group of students in the district’s gifted programs. This study used district screening results from both instruments to explore three questions of interest to the district and to the larger conversation about nonverbal testing as a tool for addressing minority underrepresentation in gifted education.

Research Question 1: On which of the two screening tests were mean scores and variances most similar among subgroups?
Research Question 2: Which screening test best moderated minority underrepresentation at hypothetical gifted program cut scores (top 20%, 10%, and 5%)?
Research Question 3: Which screening test best predicted high performance on the WISC-IV?
Additional Exploration: Because the CogAT6 Nonverbal battery is similar to the NNAT2, CogAT6 Nonverbal Standard Age Scores (NSAS) were included for comparison where possible in the analysis.

Method

Sample

Data were drawn from district testing records that included 5,833 students who took the CogAT6 in second grade in the 2005-2006 to the 2009-2010 school years, and 4,035 students who took the NNAT2 in kindergarten, first grade, and second grade during the 2010-2011 school year. Because these were grade-wide screenings, the sample included four complete grade cohorts for the CogAT6 and three complete grade cohorts for the NNAT2. With the exception of a higher representation of ELL students in the NNAT2 group (6.2% as opposed to 3.4%), demographic characteristics were nearly identical between the two groups (approximately 51% male, 64% White, 20.5% Black, 5% Asian or Pacific Islander, 5% Hispanic, 5.5% multiracial, and 1% American Indian or Alaska Native).

Although the full sample was relevant to the first two research questions, only a subset of the sample was used for evaluation of our question pertaining to WISC-IV predictivity. District policy for the most part limited WISC-IV testing to students with high screening test scores; therefore, correlations including all students identified by the screening test could not be calculated. Instead, investigation of the third question focused on the top 5% of scorers for each screening test.

Measures

District databases provided gender and ethnicity as well as ELL status at the time of screening. Although of interest to the authors, socioeconomic status information in the form of free and reduced lunch status was withheld by the district due to its interpretation of privacy law.

CogAT6. The CogAT6 is a multidimensional group ability test and consists of three batteries measuring verbal, quantitative, and nonverbal reasoning (Lohman & Hagen, 2001a). At second grade (Level 2), there are 48 items on each battery, and two item types per battery. No reading is required at this level. On the Verbal and Quantitative batteries, children


listen to questions read by the test administrator and choose their answers from a set of pictures. On the Nonverbal Battery, the test administrator simply paces children through the 44 nonverbal items. The CogAT6 yields a standard age score (SAS) for each battery (VSAS, QSAS, and NSAS), three partial composite standard age scores (VQSAS, VNSAS, and QNSAS), and a full composite standard age score (Verbal, Quantitative, and Nonverbal standard age score [VQNSAS]), all with a mean of 100 and a standard deviation of 16 (Lohman & Hagen, 2002).

Reliabilities for the three batteries using the Kuder-Richardson Formula 20 (KR20) are reported in the research handbook for CogAT6 (Lohman & Hagen, 2002). These ranged from .86 to .92 in the Primary Battery (grades K-2). The KR20 reliability for VQNSAS for these grades was reported as .96, which corresponds to a standard error of measurement of 3.2 SAS points. The handbook also reported test–retest reliability as .92 when different forms of the test were administered 2 weeks apart. Correlations between the overall composite score and scores on other tests include .69 with the Woodcock-Johnson III (Lohman, 2003b; Woodcock, McGrew, & Mather, 2001), .79 with the WISC, Third Edition (Lohman, 2003a; Wechsler, 1991), and .86 with the Iowa Tests of Basic Skills (Hoover, Hieronymous, Frisbie, & Dunbar, 1994; Lohman & Hagen, 2002).

NNAT2. The NNAT2 is a shorter, unidimensional group-administered ability test that uses 48 figure matrix items at all levels (Naglieri, 2008a). The NNAT2 yields the NAI, which has a mean of 100 and a standard deviation of 16 (Naglieri, 2008b). The district used the online version.

The NNAT2 technical manual (Naglieri, 2008b) reported that KR20 reliability coefficients for the test levels used in the present study ranged from .84 to .92. The standard error of measurement at these levels ranged from 4.79 to 6.36. Test–retest reliability ranged from .75 to .78. Validity was examined through correlation with the Otis-Lennon School Ability Test, Eighth Edition (OLSAT-8; Otis & Lennon, 2003) and the Stanford Achievement Tests, Tenth Edition (Stanford 10; Pearson, 2003). Pearson r with OLSAT-8 at second grade was .53 for the Verbal section, .68 for the Nonverbal section, and .69 for the Composite. For kindergarten through second grade, correlations with Stanford 10 Reading ranged from .61 to .70 and with Stanford 10 Math ranged from .62 to .70. At first grade, a comparison of ELL student scores with matched control groups showed a difference of 3.57 NAI points for non-Spanish speaking ELL students and .93 NAI points for Spanish-speaking ELL students.

WISC-IV. The WISC-IV is an individual ability test with subtests yielding index scores for Verbal Comprehension (VCI), Perceptual Reasoning (PRI), Working Memory (WMI), and Processing Speed (PSI; Wechsler, 2003a). The first two batteries together yield a General Ability Index (GAI), and all four batteries in combination yield a Full-Scale IQ (FSIQ). These index scores have a mean of 100 and a standard deviation of 15 (Wechsler, 2003b).

Internal consistency estimates reported by the technical manual (Wechsler, 2003b) include .94 for VCI, .92 for PRI, .92 for WMI, .88 for PSI, and .97 for FSIQ. Rowe, Kingsley, and Thompson (2010) studied the correlation of GAI and FSIQ with the reading and math composites from the Wechsler Individual Achievement Test, Second Edition (WIAT-II; Psychological Corporation, 2001) among gifted referrals. They found GAI among these higher ability students to correlate with WIAT-II Reading at .50 and WIAT-II Math at .43. Correlations were higher with FSIQ (Reading, .59 and Math, .47).

Several analyses have been cited by NAGC (2010a) supporting use of GAI over FSIQ in identification, especially in cases where subscores are highly discrepant. During part of the study period, the district only administered the VCI and PRI subtests of the WISC-IV. Our analysis, therefore, is confined to VCI, PRI, and GAI.

Statistical Analyses

The shape of the score distribution on each screening test was analyzed in terms of mean, standard deviation, skewness, and kurtosis. Skewness and kurtosis were tested for significance at p < .05. Differences between subgroup means on each screening test were tested for significance at p < .05 and p < .001 with independent samples t tests. The lower and upper limits of the 95% confidence interval were also reported. We used PS (version 3.0) to determine that the sample size necessary to detect a real difference of 5 points at a power level greater than .80 was approximately 135 for the smaller group (Dupont & Plummer, 1997). Accordingly, the Native American and Pacific Islander groups were left out of analyses due to sample size. Comparisons between Asian and non-Asian ELL students were included, despite a less than ideal sample size, because observed differences were strikingly large. Along with each mean comparison, we also tested for differences between subgroup variances using Levene’s test. Mean comparisons with significant variance differences used a separate variance t test algorithm for significance testing.

Next, differences in the proportion of each subgroup scoring in the top 20%, 10%, and 5% on CogAT6 versus NNAT2 were tested for significance using Fisher’s exact test. The size of the effect is indexed using the natural log of the odds ratio (LOR; subgroup odds of selection on CogAT6/subgroup odds of selection on NNAT2). Rosenthal (1996) gave guidelines for interpretation of effect sizes in the odds ratio metric based on Cohen (1998). Suggested values for small, medium, and large effect sizes translate into LOR of .40, .90, and 1.5, respectively. The statistical power of these tests depends not only on assumptions about possible effect sizes but also on the exact proportions involved. The sample size necessary in the smaller (NNAT2) sample to detect a medium-sized effect increases from approximately 140 to


560 as the smaller proportion in the comparison decreases from 10% to 2%. Thus, statistical power should be sufficient to detect effects in the top 20% and 10% comparisons. For the top 5%, only comparisons in groups with relatively large samples and/or large proportions selected have adequate power, but results and tests for all groups were reported for descriptive purposes. To ensure that any possible differences were detected, we used an uncorrected alpha level of .05 despite the many comparisons. Although this decision increases the chances of detecting a difference where none exists, it minimizes the possibility of missing real differences. A more stringent alpha level could be viewed as a means of masking real differences between the tests.

Exploration of relationships between each screening test and WISC-IV performance was complicated by district testing policy and data collection. WISC-IVs usually were given to students scoring above 125 on VQNSAS when CogAT6 was administered, or above an NAI score of 118 when the NNAT2 was administered, but exceptions and incomplete WISC-IV records were not unusual. To create a fair comparison between the screening tests as predictors, we compared the WISC-IV performance of only those students scoring in the top 5% on either screening test. This cut point is just above the district cut score on VQNSAS and well above the district cut score on NAI. If the pool of talent available in the district was stable across the studied time period and both tests were equally good at predicting WISC-IV performance, then we could expect no difference in observed scores after the selectivity was equalized.

Finally, to explore whether the inclusion of kindergarten and first grade NNAT2 scores may have influenced the comparison between screening tests, given that the CogAT6 sample was composed entirely of second graders, we presented grade-disaggregated results for the NNAT2. All statistical analyses were conducted in SPSS 20.

Results

Mean Scores and Variances

Table 1. Descriptive Statistics for Subgroups.

                                     CogAT6 sample                         NNAT2 sample
Group                              n       NSAS, M (SD)   VQNSAS, M (SD)   n       NAI, M (SD)
Grade
  K                                                                        1,251   98.0 (18.3)
  1                                                                        1,432   97.5 (17.9)
  2                                5,833   104.1 (15.1)   102.4 (14.8)     1,352   96.3 (15.7)
Gender
  Female                           2,843   104.9 (14.9)   102.9 (14.7)     1,967   97.5 (16.7)
  Male                             2,990   103.5 (15.3)   102.0 (14.9)     2,068   97.0 (17.9)
Ethnicity
  White                            3,665   106.7 (14.1)   106.5 (13.4)     2,567   100.5 (15.6)
  Black                            1,217   94.6 (14.2)    90.5 (12.7)      820     84.5 (16.5)
  Hispanic                         284     101.0 (13.2)   96.1 (12.6)      191     93.2 (15.8)
  Asian                            296     114.8 (14.8)   108.0 (15.1)     214     109.6 (16.5)
  Native American                  30      106.1 (13.6)   102.6 (12.5)     21      102.0 (11.3)
  Native Hawaiian/Pacific Islander 9       109.4 (19.0)   100.0 (13.0)     8       97.6 (12.4)
  Multiracial                      332     103.6 (13.7)   101.2 (12.9)     214     97.4 (15.9)
ELL status
  Non-ELL                          5,634   104.2 (15.0)   102.8 (14.7)     3,786   97.5 (17.1)
  ELL                              199     103.1 (18.0)   92.0 (13.6)      249     93.2 (19.3)
    Non-Asian ELL                  127     95.8 (15.0)    87.3 (11.6)      172     88.5 (17.3)
    Asian ELL                      72      116.1 (15.3)   100.3 (13.0)     77      104.8 (19.4)

Note. CogAT6 = Cognitive Abilities Test–Form 6; NNAT2 = Naglieri Nonverbal Ability Test, Second Edition; NSAS = Nonverbal Standard Age Scores; VQNSAS = Verbal, Quantitative, and Nonverbal Standard Age Score; NAI = Naglieri Ability Index; ELL = English-language learner. The normative population mean and SD for all tests is 100 and 16, respectively.

The first research question asked, “Which screening test generated mean scores and variances that were more similar between subgroups?” Table 1 shows mean scores and standard deviation by subgroup for NSAS, VQNSAS, and NAI. Although NSAS and VQNSAS scores were normally distributed, NAI scores showed significant negative skew (−.461) and positive kurtosis (+.380), p < .05. This means there were more NAI scores at the extremes of the distribution, and in particular more very low scores, than would be expected under normality. NAI score standard deviations at


kindergarten and first grade were larger than the expected 16, which would exacerbate the tendency toward extreme scores. A similar pattern of deviation from the expected distribution was found for the first edition of NNAT (Lohman, Korb, & Lakin, 2008). Meanwhile, NSAS and VQNSAS standard deviations were slightly smaller than expected.

Table 2. Subgroup Score Differences.

                                    CogAT6 sample                                                   NNAT2 sample
                                    NSAS                         VQNSAS                             NAI
                                                95% CI                       95% CI                             95% CI
Group                       n       Difference  LL      UL       Difference  LL      UL     n       Difference  LL      UL
Gender
  Female − Male             2,843   1.4**       0.6     2.2      0.9*        0.2     1.7    1,967   0.6         −0.5    1.6
Ethnicity
  Black − White             1,217   −12.1**     −11.2   −13.0    −16.1**     −15.2   −16.9  820     −16.0**a    −17.3   −14.8
  Hispanic − White          284     −5.7**      −4.0    −7.4     −10.5**     −8.9    −12.1  191     −7.3**      −9.7    −5
  Asian − White             296     8.1**       9.8     6.4      1.5a        3.1     −.1    214     9.1**       6.9     11.3
  Multiracial − White       332     −3.1**      −1.6    −4.7     −5.4**      −3.9    −6.9   214     −3.1*       −5.3    −0.9
ELL status
  ELL − Non-ELL             199     −1.0a       −3.2    1.1      −10.8**     −12.8   −8.7   249     −4.3**a     −6.5    −2.1
  Asian ELL − Non-Asian ELL 72      20.3**      16.0    24.7     13.0**a     8.8     17.2   77      16.1**      11.5    20.7

Note. CogAT6 = Cognitive Abilities Test–Form 6; NNAT2 = Naglieri Nonverbal Ability Test, Second Edition; NSAS = Nonverbal Standard Age Scores; VQNSAS = Verbal, Quantitative, and Nonverbal Standard Age Score; NAI = Naglieri Ability Index; ELL = English-language learner; CI = confidence interval; UL = upper limit; LL = lower limit. Listed sample sizes are for the focal groups in each comparison.
a = Significant variance differences.
*p < .05. **p < .001.

Table 2 presents the differences between subgroup means using White students as a reference among ethnic groups. Mean scores were substantially higher on the CogAT6 than on the NNAT2, a difference of more than 6 points in most cases. This large overall mean difference is the reason we did not attempt any testing of mean differences across the screening tests. Little significant gender difference was noted for either test. Blacks had the lowest mean scores, scoring a full standard deviation below Whites on VQNSAS and NAI and three quarters of a standard deviation below Whites on NSAS (p < .001). However, Blacks did have larger score variability than Whites on the NAI (p = .005).

On all three measures, Hispanic and multiracial means fell between Black and White means, while Asians scored the highest. The difference between Asian and White means was 8.1 and 9.1 points, respectively, on NSAS and NAI (p < .001) but insignificant on VQNSAS (p > .05). Despite no mean advantage, Asians showed greater variance than Whites on VQNSAS (SD +1.7, p = .005).

ELL students scored 10.8 points lower than non-ELL students on VQNSAS (p < .001), but only 4.3 points lower on NAI (p < .001) and showed no significant difference on NSAS (p > .05). As would be expected, this indicates that the larger gap on VQNSAS is due to the Verbal and Quantitative batteries, which include spoken English language instructions at Grade 2. Further analysis showed sharp differences between Asian ELL and other, mostly Hispanic, ELL students. In fact, some of the largest mean score differences noted (13.0, 16.1, and 20.3, p < .001) favored Asian ELL over non-Asian ELL students. This suggests that any ELL advantage on NSAS or NAI was largely attributable to an overall Asian advantage on nonverbal measures. Due to the large gap between Asian and non-Asian ELL students, standard deviations were larger for the ELL than non-ELL sample on both NSAS (+3, p < .001) and NAI (+2.2, p = .002).

Identification Rates

The second research question asked which screening test yielded identification rates on likely cut scores most similar across subgroups. Table 3 details the percentage of each subgroup that fell within the top 20%, 10%, and 5% of sample scores for NSAS, VQNSAS, and NAI. If perfect proportionality among subgroups were to hold, each cell value would match the cut percent. For example, all of the cell values in the first section of the table (top 20%) would be 20.0.

Neither instrument identified proportionally at three hypothetical cut scores gifted programs might apply during identification for services. Fisher exact tests (p < .05) were used, though, to indicate subgroups for which one test held an advantage over another at each cut. The finding of larger variance for Black students on NAI did not translate into more high scores since the additional variability was caused by an excess of low scores. NAI identified proportionately more ELL and more Asian students at all three score levels.
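
The log odds ratio effect sizes and Fisher exact tests used in these comparisons can be illustrated with a short sketch. The Python below is not the authors' SPSS procedure; it is a minimal, standard-library illustration. The function names (`log_odds_ratio`, `fisher_exact_two_sided`) are hypothetical. The percentages are those reported in Table 3, and the cell counts in the ELL example are reconstructed by applying those percentages to the subgroup sizes in Table 1, so they approximate rather than reproduce the study's raw data.

```python
from math import comb, log


def log_odds_ratio(pct_cogat, pct_nnat):
    """ln(odds of selection on CogAT6 / odds of selection on NNAT2).

    Both arguments are percentages of a subgroup above a cut (0 < pct < 100).
    """
    p1, p2 = pct_cogat / 100.0, pct_nnat / 100.0
    return log((p1 / (1.0 - p1)) / (p2 / (1.0 - p2)))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x, margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    probs = [prob(x) for x in range(lo, hi + 1)]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in probs if p <= p_obs * (1 + 1e-7)))


# Percentages reported in Table 3 (top 20% cut).
print(round(log_odds_ratio(6.2, 4.1), 2))   # Black, NSAS vs. NAI  -> 0.44
print(round(log_odds_ratio(5.5, 17.7), 2))  # ELL, VQNSAS vs. NAI  -> -1.31

# ELL students at the top 20% cut. Counts are reconstructed (assumption):
# about 5.5% of 199 on VQNSAS -> 11 selected; about 17.7% of 249 on NAI -> 44.
p_ell = fisher_exact_two_sided(11, 199 - 11, 44, 249 - 44)
print(p_ell < 0.05)  # True
```

Read against the benchmarks cited in the Statistical Analyses section (LOR of .40, .90, and 1.5 for small, medium, and large effects), the ELL value at the 20% cut lies between the medium and large benchmarks in absolute size, and its negative sign indicates higher odds of selection on the NNAT2 than on the CogAT6 Composite.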


Table 3. Percentage of Students Within Subgroups Above Selected Score Levels and CogAT:NNAT Log Odds Ratio Effect Sizes.

                NNAT NAI   CogAT NSAS                 CogAT VQNSAS
Group           %          %       Log odds           %        Log odds
                                   effect size                 effect size
Top 20%
  Female        21.6       22.0    .02                21.8     .01
  White         25.4       24.0    −.08               26.9     .08
  Black         4.1        6.2     .44                2.6      −.47
  Hispanic      11.0       11.3    .03                6.0      −.66
  Asian         50.5       48.0    −.10               34.5*    −.66
  Multiracial   16.8       20.2    .23                16.8     .00
  ELL           17.7       25.1    .44                5.5*     −1.31
Top 10%
  Female        10.2       10.9    .07                11.0     .08
  White         11.6       12.1    .05                13.9*    .21
  Black         1.6        2.6     .50                1.0      −.48
  Hispanic      3.7        4.6     .23                3.5      −.06
  Asian         36.0       30.1    −.27               18.9*    −.88
  Multiracial   9.8        8.1     −.21               4.8      −.77
  ELL           10.4       15.1    .43                1.0*     −2.44
Top 5%
  Female        4.5        5.2     .15                5.2      .15
  White         5.7        6.1     .07                6.9      .20
  Black         0.6        0.7     .16                .4       −.41
  Hispanic      2.1        1.1     −.66               1.1      −.66
  Asian         22.0       16.9    −.33               11.1*    −.81
  Multiracial   1.9        2.1     .10                3.6      .66
  ELL           7.6        8.5     .12                .5*      −2.80

Note. NNAT = Naglieri Nonverbal Ability Test, Second Edition; CogAT = Cognitive Abilities Test; NSAS = Nonverbal Standard Age Score; VQNSAS = Verbal, Quantitative, and Nonverbal Standard Age Score; NAI = Naglieri Ability Index; ELL = English-language learners.
*Significant Fisher Exact test when compared with NAI percentage at p < .05.

The effect size for ELL students was large, whereas the effect for Asian students was moderate. The only significant subgroup advantage on the CogAT6 over the NNAT2 was a very small effect for Whites on VQNSAS at the 10% cut only. As this advantage occurred at only one cut point, it may be simply the result of chance.

No significant differences were found at any cut between NAI and CogAT6 NSAS. In fact, NSAS identification rates for underrepresented groups were as high as or slightly higher than NAI identification rates. From this we can also infer that the significantly lower Asian selection advantage on VQNSAS compared with NAI stems from differences on the VSAS and QSAS batteries.

Relationship to WISC-IV

The third research question asked which screening test best predicted high performance on the WISC-IV. Table 4 compares WISC-IV performance between the top 5% of VQNSAS scorers and NNAT2 scorers in the sample, showing what score level on WISC-IV is predicted by a high score on the screening test. Results showed VQNSAS was a significantly better predictor of high VCI and high GAI, with the top 5% scoring 12 and 5.8 points higher, respectively, on VCI and GAI than the top 5% of NAI scorers. NAI appeared nominally better at predicting high PRI, but the difference was not significant.

Table 4. WISC-IV Performance of Top 5%.

Screening test   n     VCI, M (SD)     PRI, M (SD)    GAI, M (SD)
VQNSAS           182   124.3* (11.5)   126.4 (10.2)   130.0* (10.6)
NAI              161   112.3 (14.8)    128.4 (11.5)   124.2 (13.1)

Note. WISC-IV = Wechsler Intelligence Scale for Children, Fourth Edition; VQNSAS = Verbal, Quantitative, and Nonverbal Standard Age Score; NAI = Naglieri Ability Index; VCI = Verbal Comprehension; PRI = Perceptual Reasoning; GAI = General Ability Index.
*Difference between CogAT6 and NNAT2 sample was significant at p < .001.

Impact of Grade and Age Differences in Sample

Because the CogAT6 data used in this study came exclusively from second graders and the NNAT2 data included a mix of kindergarteners, first graders, and second graders, it was conceivable that age differences influenced the comparison of subgroup performance and WISC-IV performance between the two screening tests. To investigate this possibility, means, standard deviations, and WISC-IV results were disaggregated by grade, as shown in Table 5.

Table 5. NNAT2 Descriptives and WISC-IV Scores of Top 5% by Grade.

                         Grade
                         K, M (SD)      1, M (SD)      2, M (SD)
Overall                  97.9 (18.3)    97.5 (17.9)    96.3 (15.7)
Female                   97.6 (17.9)    97.8 (17.0)    97.1 (15.0)
White                    101.3 (16.6)   101.2 (15.7)   99.0 (14.4)
Black                    85.0 (18.1)    83.5 (16.8)    85.2 (14.7)
Hispanic                 93.8 (18.9)    92.7 (14.7)    93.2 (14.0)
Asian                    108.8 (15.3)   112.5 (16.7)   107.5 (16.9)
Multiracial              96.4 (17.9)    99.1 (16.1)    97.0 (13.0)
ELL                      95.3 (21.7)    91.0 (19.7)    93.5 (17.0)
(Top 5%) WISC-IV VCI     110.2 (16.6)   113.8 (12.3)   113.3 (15.4)
(Top 5%) WISC-IV PRI     125.8 (11.3)   131.4 (10.2)   128.0 (12.8)
(Top 5%) WISC-IV GAI     121.3 (13.3)   126.9 (11.0)   124.5 (14.8)

Note. NNAT2 = Naglieri Nonverbal Ability Test, Second Edition; WISC-IV = Wechsler Intelligence Scale for Children, Fourth Edition; VCI = Verbal Comprehension; PRI = Perceptual Reasoning; GAI = General Ability Index.

No evidence of substantial differences in NAI mean scores across grade levels was found, although variability decreased as grade level increased. This increased variability in kindergarten and first grade may have resulted in more high scores on NNAT2 than would have been found if the sample had been restricted to second graders. First grade PRI and GAI scores were significantly higher than kindergarten scores (p < .05). However, the trend does not continue into


second grade, as these scores were not significantly different from either grade.

Discussion

The purpose of this study was to compare subgroup performance and WISC-IV prediction of the CogAT6 and the NNAT2 in the context of selection for gifted services. Our field data informs the debate about whether or not the NNAT2 is an effective tool for addressing the underrepresentation of minorities in gifted programming. In this study, none of the three screening measures (VQNSAS, NSAS, and NAI) yielded similar mean performance or identification rates across subgroups, meaning that performance gaps among subgroups persisted across instruments.

Within our sample, multiracial, Hispanic, and ELL students did perform less disparately on average from White students on the NNAT2 than they did on the CogAT6 VQNSAS, but this was not true for Black students. Furthermore, any narrowing of performance gaps did not translate into significantly higher rates of identification at likely selection cut scores, with the exception of ELL students. The advantage to ELL students on the NNAT2 may be attributable to an overall Asian advantage on nonverbal items. Asian ELL students outperformed non-Asian ELL students, and the overall Asian sample outperformed all other groups in both mean scores and identification rates, most significantly on the nonverbally oriented NSAS and NAI. Exceptional Asian and Asian-ELL performance may also be partly attributable to the fact that the Asian population in this district is affiliated disproportionately with a large research university and several medical institutions, and thus is a particularly talented Asian sample that has been attracted from other states and countries.

Of the three screening measures, VQNSAS yielded the lowest ELL means and identification rates relative to non-ELLs, which could suggest either a disadvantage on verbal items or difficulty with directions spoken in the English language. The CogAT6 Directions for Administration anticipate this and advise that

    students who have just begun instruction in English are not likely to be able to answer many of the questions on the Verbal and Quantitative batteries. . . . However, these students can generally take the tests in the Nonverbal battery. (Lohman & Hagen, 2001b, p. 9)

In fact, our results suggested that the CogAT6 Nonverbal battery is similar to the NNAT2 in identifying students from underrepresented groups at hypothetical cut scores and was better than the NNAT2 at moderating the mean score disadvantage to Black, Hispanic, multiracial, and non-Asian ELL students.

Of the three screening measures, VQNSAS was the best predictor of high GAI, which may be taken as evidence that it is a better measure of general intelligence because of the broader range of item formats and reasoning abilities sampled. This conclusion, however, assumes that GAI is a “gold standard” measure; one could conclude alternatively that the CogAT6 Verbal and Quantitative batteries and the WISC-IV Verbal subtests simply share a heavy loading of achievement or language factors.

Limitations

Several limitations of this study stem from its dependence on data from one district’s gifted testing records. First, it would have been preferable for analysis if all students had taken the CogAT6, the NNAT2, and the WISC-IV. Instead, practical and financial considerations at the district level meant that each student took either the CogAT6 or the NNAT2 and only a small portion of these students took the WISC-IV. Second, due to sample size limitations, we compared results from slightly different grade levels and from test administrations that took place over the course of 5 years. This increased the possibility of unmeasured changes in the sampled population. Third, the fact that the NNAT2 was administered online may raise questions about a disadvantage for subgroups with less early childhood experience with computers (Huff & Sireci, 2001). Fourth, the latest form of the CogAT (CogAT7, Lohman, 2012a), which has been updated to improve ELL fairness, was not represented in this study (see also Lohman & Gambrell, 2012). Finally, the study is limited to one Midwestern district and may not be representative of other districts.

Two theoretical issues also limit the practical implications of the results. First, one cannot expect one test to perform in isolation as a reliable, valid, and equitable selection tool when matching gifted services to students (NAGC, 2010b, Standard 2.2.5). Dai (2010) confirmed that dependence on a single measure is common (p. 248), but likened it to “putting all of the eggs in one basket” (pp. 224-225). Using a group ability test as a screening test to inform who goes on for individual testing is a similarly flawed practice. Ideally, one would administer multiple measures to all students and find a fair way to interpret them in combination (Lohman, 2012b). Furthermore, improving minority representation in gifted programming need not require the development of a test on which all subgroups perform identically. The use of local and subgroup norms, for example, offers a defensible framework for identifying talent among underrepresented groups (Lohman, 2012b, p. 27). The utility of the NNAT2 in addressing underrepresentation is as much about how the test fits into a larger approach to identification as it is about how different groups perform on it.

The second theoretical limitation relates to WISC-IV predictivity. The fact that high performance on the CogAT6 was more predictive of high performance on the WISC-IV than was high performance on the NNAT2 could be interpreted either as evidence that the WISC-IV and the CogAT6 share the “achievement” loading that the NNAT2 seeks to avoid, or as evidence that the CogAT6 is a better measure of general ability than the NNAT2. This is ultimately a philosophical
question about which abilities should be used to define academic giftedness, as well as a practical question of which abilities are most required by a particular gifted program.

Conclusion

This study raises doubts about the claims of at least one nonverbal test that it can better identify students from underrepresented groups for gifted services. Districts should not assume that one instrument will be a panacea and, instead, might consider using nonverbal ability tests as one tool in a wider approach to identifying and serving students in these groups.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

Anastasi, A., & Urbina, S. (1997). Psychological testing (7th ed.). Upper Saddle River, NJ: Prentice Hall.
Borland, J. H. (2004). Issues and practices in the identification and education of gifted students from under-represented groups (Research Monograph No. 04186). Storrs: University of Connecticut, The National Research Center on the Gifted and Talented.
Callahan, C. M. (2005). Identifying gifted students from underrepresented populations. Theory into Practice, 44, 98-104.
Carman, C. A., & Taylor, D. K. (2010). Socioeconomic status effects on using the Naglieri Nonverbal Ability Test (NNAT) to identify the gifted/talented. Gifted Child Quarterly, 54, 75-84.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Dai, D. Y. (2010). The nature and nurture of giftedness: A new framework for understanding gifted education. New York, NY: Teachers College Press.
Donovan, M. S., & Cross, C. T. (Eds.). (2002). Minority students in special and gifted education. Washington, DC: National Academies Press.
Dupont, W. D., & Plummer, W. D. (1997). PS power and sample size program available for free on the Internet. Controlled Clinical Trials, 18, 274.
Ford, D. Y. (1998). The underrepresentation of minority students in gifted education. Journal of Special Education, 32, 4-14.
Huff, K. L., & Sireci, S. G. (2001). Validity issues in computer-based testing. Educational Measurement: Issues and Practice, 20, 16-25.
Lohman, D. F. (2003a). The Wechsler Intelligence Scale for Children III and the Cognitive Abilities Test (Form 6): Are the general factors the same? Retrieved from http://faculty.education.uiowa/dlohman/pdf/CogAT-WISC_final_2col2r.pdf
Lohman, D. F. (2003b). The Woodcock-Johnson III and the Cognitive Abilities Test (Form 6): A concurrent validity study. Retrieved from http://faculty.education.uiowa.edu/dlohman/pdf/CogAT_WJIII_final_2col%202r.pdf
Lohman, D. F. (2005a). Review of Naglieri and Ford (2003): Does the Naglieri Nonverbal Ability Test identify equal proportions of high-scoring White, Black, and Hispanic students? Gifted Child Quarterly, 49, 19-28.
Lohman, D. F. (2005b). The role of nonverbal ability tests in identifying academically gifted students: An aptitude perspective. Gifted Child Quarterly, 49, 111-138.
Lohman, D. F. (2012a). Cognitive Abilities Test (Form 7). Rolling Meadows, IL: Riverside.
Lohman, D. F. (2012b). Decision Strategies. In S. L. Hunsaker (Ed.), Identification: The theory and practice of identifying students for gifted and talented education services (pp. 217-248). Mansfield Center, CT: Creative Learning Press.
Lohman, D. F., & Gambrell, J. L. (2012). Using nonverbal tests to help identify academically talented children. Journal of Psychoeducational Assessment, 30, 25-44.
Lohman, D. F., & Hagen, E. P. (2001a). Cognitive Abilities Test (Form 6). Itasca, IL: Riverside.
Lohman, D. F., & Hagen, E. P. (2001b). Cognitive Abilities Test (Form 6): Directions for administration. Itasca, IL: Riverside.
Lohman, D. F., & Hagen, E. P. (2002). Cognitive Abilities Test (Form 6): Research handbook. Itasca, IL: Riverside.
Lohman, D. F., Korb, K. A., & Lakin, J. M. (2008). Identifying academically gifted English-language learners using nonverbal tests. Gifted Child Quarterly, 52, 275-296.
Naglieri, J. A. (1997). Naglieri Nonverbal Ability Test. San Antonio, TX: Psychological Corporation.
Naglieri, J. A. (2008a). Naglieri Nonverbal Ability Test (2nd ed.). San Antonio, TX: NCS Pearson.
Naglieri, J. A. (2008b). Naglieri Nonverbal Ability Test (Second Edition) manual: Technical information and normative data. San Antonio, TX: NCS Pearson.
Naglieri, J. A. (2010, July). The truth about IQ and achievement. Paper presented at Learning and the Brain Conference, Boston, MA. Retrieved from http://www.jacknaglieri.com/wordpress/wp-content/uploads/2010/11/The-Truth-About-IQ-Ach-HNDT.pdf
Naglieri, J. A., Brulles, D., & Landsdowne, K. (2008). Helping all gifted children learn: A teacher’s guide to using the NNAT2. San Antonio, TX: Pearson.
Naglieri, J. A., & Ford, D. Y. (2003). Addressing underrepresentation of gifted minority children using the Naglieri Nonverbal Ability Test (NNAT). Gifted Child Quarterly, 47, 155-160.
Naglieri, J. A., & Ford, D. Y. (2005). Increasing minority children’s participation in gifted classes using the NNAT: A response to Lohman. Gifted Child Quarterly, 49, 29-36.
Naglieri, J. A., & Ronning, M. E. (2000). Comparison of White, African-American, Hispanic, and Asian children on the Naglieri Nonverbal Ability Test. Psychological Assessment, 12, 328-334.
National Association for Gifted Children. (2010a). NAGC position statement: WISC-IV. Retrieved from http://www.nagc.org/index.aspx?id=2455
National Association for Gifted Children. (2010b). NAGC pre-K-grade 12 gifted education programming standards: A blueprint for high quality gifted education programs. Washington, DC: Author.
Otis, A. S., & Lennon, R. T. (2003). Otis-Lennon School Ability Test (8th ed.). San Antonio, TX: Psychological Corporation.
Pearson. (2003). Stanford Achievement Test Series (10th ed.). San Antonio, TX: Author.
Pearson. (2012). Introduction to the Naglieri Nonverbal Ability Test–Second Edition (NNAT2). Retrieved from http://www.pearsonassessments.com/haiweb/Cultures/en-US/Site/Community/Education/Products/NNAT2/nnat2.htm
Psychological Corporation. (2001). Wechsler Individual Achievement Test (2nd ed.). San Antonio, TX: Author.
Rosenthal, J. A. (1996). Qualitative descriptors of strength association and effect size. Social Service Research, 21, 37-59.
Rowe, E. W., Kingsley, J. M., & Thompson, D. F. (2010). Predictive ability of the General Ability Index (GAI) versus the Full Scale IQ among gifted referrals. School Psychology Quarterly, 25, 119-128.
U.S. Department of Education. (1993). National excellence: A case for developing America’s talent. Washington, DC: Author.
Villarreal, C. A. (2005). An analysis of the reliability and validity of the Naglieri Nonverbal Ability Test (NNAT) with English Language Learner (ELL) Mexican-American Children (Doctoral dissertation). Retrieved from http://repository.tamu.edu/bitstream/handle/1969.1/3850/etd-tamu-2005A-SPSY-Villarr.pdf?sequence=1
Wechsler, D. (1991). Wechsler Intelligence Scales for Children (3rd ed.). San Antonio, TX: Psychological Corporation.
Wechsler, D. (2003a). Wechsler Intelligence Scales for Children (4th ed.). San Antonio, TX: Psychological Corporation.
Wechsler, D. (2003b). Wechsler Intelligence Scale for Children (Fourth Edition): Technical and interpretive manual. San Antonio, TX: Psychological Corporation.
Woodcock, R. W., McGrew, K. S., & Mather, N. (2001). Woodcock-Johnson III Tests of Cognitive Abilities. Itasca, IL: Riverside.

Author Biographies

Jacob A. Giessman is Co-Director of the Center for Gifted Education at the Columbia (Mo.) Public Schools. He is the former head of Academy Hill School in Springfield, Massachusetts, and served on the Massachusetts Department of Elementary and Secondary Education’s Gifted-Talented Advisory Council.

James L. Gambrell is a doctoral student in Educational Psychology at the University of Iowa. His research interests include test validity, growth modeling, school effectiveness, and the use of assessments to identify gifted children.

Molly S. Stebbins is the Coordinator of Psychological Services for the Columbia Public Schools in Missouri and has served in the public school system for over 13 years. She is a nationally certified school psychologist and an adjunct assistant professor at the University of Missouri-Columbia.