A Pilot Study Examining the Test–Retest and Internal Consistency Reliability of the ABLLS-R

Brief Article

Journal of Psychoeducational Assessment
1–6
© The Author(s) 2016
Reprints and permissions: sagepub.com/journalsPermissions.nav
DOI: 10.1177/0734282916678348
jpa.sagepub.com

James W. Partington¹, Autumn Bailey¹, and Scott W. Partington¹

Abstract
The literature contains a variety of assessment tools for measuring the skills of individuals with
autism or other developmental delays, but most lack adequate empirical evidence supporting
their reliability and validity. The current pilot study sought to examine the reliability of scores
obtained from the Assessment of Basic Language and Learning Skills–Revised (ABLLS-R). Two
forms of reliability were measured: internal consistency and test–retest reliability. Analyses
using data obtained from neuro-typical children (N = 50) yielded strong evidence of internal
consistency and test–retest reliability. These preliminary findings suggest that the ABLLS-R can
yield reliable scores.

Keywords
autism, assessments, ABLLS-R, reliability, psychometric properties

Numerous practitioners (e.g., educators, board certified behavior analysts, speech and language
pathologists, etc.) provide services to children with autism. Best practice requires practitioners to
obtain an accurate assessment of the strengths and skill deficits of each child to identify
appropriate teaching strategies (Guldberg, 2010). Furthermore, professional organizations such as the
American Educational Research Association (1999) and federal law (i.e., Individuals With
Disabilities Education Act, 2004) call for the use of valid and reliable assessments. Although
many of the comprehensive assessment tools available to professionals who serve children with
autism possess strong clinical significance, most lack sufficient empirical validation of their
psychometric properties. In contrast, norm-referenced assessments (e.g., Vineland II;
Sparrow, Cicchetti, & Balla, 2005) assess only a limited range of skills but have empirical
support for their validity and reliability. An ideal assessment would both provide a thorough
review of skills and have empirical support for its psychometric properties.
The Assessment of Basic Language and Learning Skills–Revised (ABLLS-R; Partington, 2010)
is an internationally recognized, criterion-referenced assessment tool that provides a
comprehensive review of 544 skills from 25 skill areas (i.e., repertoires) including language,
social interaction, self-help, academic, and motor skills. Leading researchers and professional
organizations identified the ABLLS-R as a useful tool to guide teaching language and other
critical learner skills to children with autism (American Medical Association, 2014; Thompson,
2011). Despite its widespread popularity, the ABLLS-R contains limited empirical support for
its psychometric properties.

¹Behavior Analysts, Inc., Walnut Creek, CA, USA

Corresponding Author:
James W. Partington, Behavior Analysts, Inc., 309 Lennon Lane, #104, Walnut Creek, CA 94598, USA.
Email: partington@behavioranalysts.com

To date, only one research report has examined the reliability and validity of the ABLLS-R (Usry,
2015). Usry asked experts in the field of autism treatment to rate the extent to which each ABLLS-R
item accurately measured the construct of interest. In addition, a second panel of experts who had
never previously conducted an ABLLS-R assessment watched a video and scored the performance
of an individual with autism on 86 ABLLS-R items. The results yielded evidence of good content
validity and interrater reliability. The assessment literature would benefit from a further
examination of the psychometric properties of the ABLLS-R. The current pilot study aimed to measure
other forms of reliability including internal consistency and test–retest reliability.

Method
Participants and Setting
Parents and professionals who received training on the administration of the ABLLS-R from the
first author volunteered to collect data from their typically developing children. All assessors
(N = 50; 48 women and two men) held at least a bachelor’s degree, 74% held a master’s degree
or higher, and 80% had administered the ABLLS-R before the study. The number of participants
whose data were used in each analysis varied due to our data analysis methodology. Our initial
participant sample (N = 53) was reduced to include only participants whose data were used in one
or more of the analyses (N = 50; 29 girls and 21 boys, age range: 6 to 72 months). Most
participants scored within the average range of adaptive functioning per their Adaptive Behavior
Composite Score (M = 110.54, range = 87–131) obtained from the Vineland II questionnaire
(Sparrow et al., 2005), and all parents reported their children as healthy and as having neither
mental health disorders nor learning disabilities. We did not offer compensation for participation,
but all participants received free access to their online ABLLS-R account (i.e., WebABLLS) for
the duration of the study.

Materials
Participants entered data using their WebABLLS account. This criterion-referenced assessment
consists of 544 items from 25 repertoires (e.g., Labeling, Requesting, etc.) with each repertoire
containing between six and 57 items. For example, the Labeling repertoire contains 47 items,
which assess how well one can label objects, actions, adjectives, and so on. Depending upon the
skills of the participant, one could score between 0 and 4 points per item based upon the scoring
criteria specified in the assessment.
A completed ABLLS-R grid allows one to calculate a total score and the percentage of possible
points for each repertoire. To calculate the total score for each repertoire, add the points
accumulated from each of the specific items within the repertoire. For example, adding scores
obtained from items F1 to F29 yields a total score for the Requesting repertoire with a maximum
score of 74. To obtain a percentage of possible points, divide the total score obtained for a
repertoire by the maximum points possible for that repertoire and multiply the quotient by 100.
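The scoring arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function name `repertoire_summary` and the item scores are hypothetical, and the only values taken from the text are the 29 Requesting items (F1 to F29) and the maximum score of 74.

```python
def repertoire_summary(item_scores, max_points):
    """Return (total score, percentage of possible points) for one repertoire."""
    total = sum(item_scores)
    if not 0 <= total <= max_points:
        raise ValueError("total score must lie between 0 and max_points")
    return total, total / max_points * 100


# Hypothetical item scores for F1 to F29 (illustration only, not study data).
requesting_scores = [1] * 20 + [0] * 9
total, percentage = repertoire_summary(requesting_scores, max_points=74)
print(f"Requesting: {total} of 74 points ({percentage:.1f}%)")
```

The same function would apply to any repertoire once its item scores and maximum possible points are known.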

Procedure
All participants provided informed consent and demographic information prior to receiving
instructions on the data collection process. Parents or other professionals who both knew the
child and received training on administering the ABLLS-R carried out the assessment in the
home setting. The data-collection period was between January 2007 and May 2013. Data collection
occurred as early as 6 months of age and at each 3-month interval thereafter through 72
months. Participants could begin collecting data at any time once their child reached the next
3-month stage in development (e.g., a parent of a 34-month-old child would begin collecting data
at age 36 months). Although we encouraged data collection through 6 years of age, participants
could terminate data collection at their discretion.
We asked participants to update their WebABLLS grids at each 3-month stage in development
(i.e., 6 months, 9 months, 12 months, etc.) within 1 week before or after the day of the
3-month mark. We informed all participants that we would only use fully completed ABLLS-R
grids, completed within the 2-week window, in our data analyses.
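The scheduling rules in this section can be expressed as a minimal sketch. The function names, the default parameters, and the example dates are assumptions made for illustration; the study does not describe how the day of each 3-month mark was computed, so that date is simply passed in.

```python
import math
from datetime import date


def next_stage_months(age_months, first=6, last=72, step=3):
    """Return the first 3-month stage at or after age_months (None if past `last`)."""
    stage = max(first, math.ceil(age_months / step) * step)
    return stage if stage <= last else None


def in_update_window(update_day, stage_day, days=7):
    """True if the grid update falls within `days` of the 3-month mark."""
    return abs((update_day - stage_day).days) <= days


print(next_stage_months(34))  # 36, as in the 34-month-old example
# Hypothetical dates, for illustration only.
print(in_update_window(date(2010, 3, 5), date(2010, 3, 1)))  # True (4 days after)
```

Under these assumptions, a grid updated more than 7 days before or after the mark would fall outside the 2-week window and be excluded.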

Data Analysis
Internal consistency reliability.  We measured the internal consistency reliability of the ABLLS-R
scores by computing a Cronbach’s alpha coefficient for each repertoire. The age selected for
computing alpha for each repertoire minimized the number of items with no variance (i.e., all
children at 0% or 100% for any given item). Researchers generally consider coefficients greater
than .70 as “acceptable,” greater than .80 as “good,” and greater than .90 as “excellent” (George
& Mallery, 2003).
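For readers unfamiliar with the statistic, the sketch below computes Cronbach's alpha with the standard formula, k / (k - 1) × (1 - sum of item variances / variance of total scores), and labels the result with the thresholds cited above. It is not the authors' analysis code; the function names and the scores are hypothetical, and sample variances are assumed.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per child)."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two children")
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def describe(alpha):
    """Label alpha using the thresholds cited in the text."""
    if alpha > 0.90:
        return "excellent"
    if alpha > 0.80:
        return "good"
    if alpha > 0.70:
        return "acceptable"
    return "below the cited thresholds"


# Hypothetical scores: five children by four items (not study data).
scores = [[0, 1, 0, 1], [1, 1, 2, 1], [2, 2, 2, 2], [3, 2, 4, 3], [4, 4, 4, 4]]
alpha = cronbach_alpha(scores)
print(round(alpha, 2), describe(alpha))
```

Items on which every child earns the same score contribute zero variance, which is why the age selected for each repertoire matters for this computation.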

Test–retest reliability.  The researchers established, a priori, a general first age point to use in the
test–retest correlations: the first age at which the median score was approximately 30% (see Table 2).
This point was chosen because scores from most repertoires contained minimal variability for the
youngest and oldest participants.
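A test–retest coefficient of this kind can be illustrated as a correlation between the same children's scores at the first age point and at a later retest. The sketch below assumes a Pearson correlation, which the text does not state explicitly; the function name and the percentage scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between paired scores from two administrations."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    mean_first = sum(first) / len(first)
    mean_second = sum(second) / len(second)
    covariation = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    spread_first = sum((a - mean_first) ** 2 for a in first)
    spread_second = sum((b - mean_second) ** 2 for b in second)
    if spread_first == 0 or spread_second == 0:
        raise ValueError("scores at one age point have no variance")
    return covariation / sqrt(spread_first * spread_second)


# Hypothetical percentage scores for six children (not study data).
baseline = [22, 30, 31, 35, 41, 48]
retest_3_months = [35, 41, 47, 44, 58, 63]
print(round(pearson_r(baseline, retest_3_months), 2))
```

The zero-variance check mirrors the concern raised above: when nearly all children score at the floor or ceiling, a correlation is uninformative or undefined.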

Results and Discussion


Internal Consistency Reliability
The internal consistency reliability analysis yielded strong indices of internal consistency
for scores from most of the ABLLS-R repertoires (see Table 1). Indeed, 18 of the 25 repertoires
had a Cronbach’s alpha coefficient greater than .90. Six repertoires, including Spontaneous
Vocalizations (α = .88), Play and Leisure (α = .89), Generalized Responding (α = .80), Eating
(α = .80), Spelling (α = .88), and Dressing (α = .85), had coefficients between .80 and .90.
Only one repertoire, Fine Motor (α = .78), yielded an alpha value less than .80. The analysis
also revealed that keeping all items in 16 of the 25 repertoires yielded the largest possible
Cronbach’s alpha coefficient. Each of the remaining nine repertoires contained between one and
three items whose removal would increase alpha by only .01.
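The item-removal check reported here is commonly called "alpha if item deleted." A minimal sketch follows, assuming the standard Cronbach's alpha formula with sample variances; the function names and scores are hypothetical, and this is not the authors' analysis code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for rows of item scores (one row per child)."""
    k = len(scores[0])
    item_variances = [variance(column) for column in zip(*scores)]
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_gain_if_deleted(scores):
    """Return full-scale alpha and, per item, the change in alpha if it is dropped."""
    full = cronbach_alpha(scores)
    gains = []
    for j in range(len(scores[0])):
        reduced = [row[:j] + row[j + 1:] for row in scores]
        gains.append(cronbach_alpha(reduced) - full)
    return full, gains


# Hypothetical scores: five children by four items (not study data).
scores = [[0, 1, 0, 2], [1, 1, 2, 0], [2, 2, 2, 1], [3, 2, 4, 0], [4, 4, 4, 1]]
full, gains = alpha_gain_if_deleted(scores)
for index, gain in enumerate(gains, start=1):
    print(f"item {index}: alpha changes by {gain:+.2f} if removed")
```

A positive change flags an item whose removal would raise alpha; in the study, no such change exceeded .01.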

Test–Retest Reliability
The analysis revealed strong test–retest correlations at the 3-, 6-, 9-, and 12-month marks across
the majority of the repertoires (see Table 2). The analysis yielded strong average test–retest
correlation coefficients at the 3- (r = .84) and 6-month (r = .77) retests and to a lesser extent at the
9- (r = .62) and 12-month (r = .54) retests.
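As a rough check on the 3-month average, the sketch below takes the 25 correlations in the 3-month column of Table 2 and averages them. The article does not state how the averages were computed, so the simple arithmetic mean used here is an assumption; because the table values are rounded, the same calculation on other columns may differ from the reported averages in the second decimal place.

```python
from statistics import mean

# 3-month test–retest correlations, copied from Table 2 in table order.
three_month_r = [
    .50, .89, .92, .79, .70, .68, .96, .87, .82, .90, .89, .93, .89,
    .96, .72, .89, .90, .90, .93, .80, .87, .72, .82, .94, .88,
]
average_r = mean(three_month_r)
print(round(average_r, 2))  # 0.84
```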
Table 1.  Cronbach’s Alpha Coefficients for Each ABLLS-R Repertoire.

Repertoire label                            Age (in months)  n     α
A  Cooperation & Reinforcer Effectiveness   18               24    .91
B  Visual Performance                       30               29    .94
C  Receptive Language                       21               28    .96
D  Motor Imitation                          21               28    .95
E  Vocal Imitation                          21               28    .95
F  Requests                                 21               28    .93
G  Labeling                                 27               28    .97
H  Intraverbals                             30               29    .99
I  Spontaneous Vocalizations                18               24    .88ᵃ
J  Syntax & Grammar                         30               29    .95ᵃ
K  Play & Leisure                           21               28    .89
L  Social Interaction                       21               28    .94
M  Group Instruction                        30               29    .91
N  Classroom Routines                       33               30    .96
P  Generalized Responding                   21               28    .80ᵃ
Q  Reading                                  54               25    .90
R  Math                                     51               30    .91
S  Writing                                  45               35    .91ᵃ
T  Spelling                                 54               25    .88ᵃ
U  Dressing                                 30               29    .85ᵃ
V  Eating                                   27               28    .80ᵃ
W  Grooming                                 27               28    .90ᵃ
X  Toileting                                33               30    .90
Y  Gross Motor                              27               32    .92
Z  Fine Motor                               30               29    .78ᵃ

Note. ABLLS-R = Assessment of Basic Language and Learning Skills–Revised.
ᵃRepertoires containing between one and three items for which removing an item would increase alpha by .01.

The current study examined the reliability of scores obtained from the ABLLS-R by obtaining
measures of internal consistency and test–retest reliability. The analyses yielded strong evidence
of both types of reliability. In addition to the primarily strong alpha coefficients obtained, we
found that removing an ABLLS-R item would only minimally improve the internal consistency
of nine ABLLS-R repertoires, which supports the inclusion of the existing ABLLS-R items. In
the test–retest reliability analysis, we observed a general trend of decreasing correlation
coefficients at each increasing retest period. However, one could predict this finding as children
continuously acquire skills, which likely accounts for the systematic decreases observed in the
retest correlations over time.
These findings complement the research by Usry (2015), who found evidence of interrater
reliability and content validity. The preliminary empirical evidence thus far suggests that the
ABLLS-R can produce reliable scores.

Limitations and Future Research


This pilot study has two noteworthy limitations. First, we obtained data from a small
participant sample relative to the number of variables examined. Second, both the total scores
and the individual items departed from the normal distribution. Despite these limitations, our
results can still guide future ABLLS-R research. Researchers could confirm our findings by
replicating this study with a larger participant sample and could extend them by measuring
other psychometric properties of the ABLLS-R.

Table 2.  Test–Retest Correlations for Each ABLLS-R Repertoire.

Repertoire label                            Age (in months)  Mdn score (%)  3 months  6 months  9 months  12 months
A  Cooperation & Reinforcer Effectiveness   18               33             .50*      .51*      .31       .27
B  Visual Performance                       27               38             .89***    .75***    .74***    .68***
C  Receptive Language                       18               36             .92***    .86***    .71***    .49*
D  Motor Imitation                          18               34             .79***    .68***    .50*      .51*
E  Vocal Imitation                          18               32             .70***    .53*      .51*      .48*
F  Requests                                 15               29             .68**     .63**     .48*      .37
G  Labeling                                 21               39             .96***    .84***    .62**     .48*
H  Intraverbals                             30               46             .87***    .78***    .70***    .59**
I  Spontaneous Vocalizations                15               29             .82***    .72***    .54*      .60**
J  Syntax & Grammar                         27               53             .90***    .84***    .76***    .60**
K  Play & Leisure                           18               33             .89***    .74***    .67***    .49*
L  Social Interaction                       15               31             .93***    .91***    .88***    .86***
M  Group Instruction                        27               42             .89***    .83***    .64**     .33
N  Classroom Routines                       30               38             .96***    .88***    .43*      .35
P  Generalized Responding                   18               42             .72***    .67***    .63**     .52*
Q  Reading                                  45               31             .89***    .80***    .70***    .65**
R  Math                                     39               27             .90***    .89***    .87***    .89***
S  Writing                                  39               35             .90***    .88***    .65**     .44*
T  Spelling                                 48               33             .93***    .87***    .75***    .55*
U  Dressing                                 27               38             .80***    .76***    .68***    .69***
V  Eating                                   18               30             .87***    .81***    .54*      .53*
W  Grooming                                 21               39             .72***    .63**     .59**     .56*
X  Toileting                                30               29             .82***    .56**     .29       .24
Y  Gross Motor                              21               40             .94***    .87***    .74***    .71***
Z  Fine Motor                               21               36             .88***    .81***    .47*      .56*

Note. n range: 21–34. ABLLS-R = Assessment of Basic Language and Learning Skills–Revised.
*p < .05. **p < .01. ***p < .001.

Acknowledgments
The authors wish to thank Frank Quinn for organizing the data set and Barry Collins, PhD, for his assistance
with conducting the statistical analyses.

Declaration of Conflicting Interests


The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or
publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this
article.

References
American Educational Research Association. (1999). Standards for educational and psychological testing.
Washington, DC: Author.
American Medical Association. (2014). CPT assistant. Chicago, IL: Author.
George, D., & Mallery, P. (2003). SPSS for Windows step by step: A simple guide and reference. 11.0
update (4th ed.). Boston, MA: Allyn & Bacon.
Guldberg, K. (2010). Educating children on the autism spectrum: Preconditions for inclusion and notions
of “best autism practice” in the early years. British Journal of Special Education, 37, 168-174.
doi:10.1111/j.1467-8578.2010.00482.x
Individuals With Disabilities Education Act. (2004). Evaluation procedures. Retrieved from
http://idea.ed.gov/explore/view/p/,root,regs,300,D,300%252E304,
Partington, J. W. (2010). The ABLLS-R—The Assessment of Basic Language and Learning Skills–Revised.
Walnut Creek, CA: Behavior Analysts, Inc.
Sparrow, S. S., Cicchetti, D. V., & Balla, D. A. (2005). Vineland Adaptive Behavior Scales (2nd ed.).
Bloomington, MN: Pearson. doi:10.1037/t15164-000
Thompson, T. (2011). Individualized autism intervention for young children. Baltimore, MD: Paul H.
Brookes.
Usry, J. N. (2015). Validation of the Assessment of Basic Language and Learning Skills–Revised for
students with autism spectrum disorder using an expert review panel (Unpublished doctoral dissertation).
Lynchburg, VA: Liberty University.
