Preventing School Failure: Alternative Education For Children and Youth
To cite this article: Gail Coulter , Karen Shavin & Margaret Gichuru (2009) Oral Reading Fluency: Accuracy of Assessing Errors and
Classification of Readers Using a 1-Min Timed Reading Sample, Preventing School Failure: Alternative Education for Children and Youth,
54:1, 71-76, DOI: 10.3200/PSFL.54.1.71-76
Oral Reading Fluency: Accuracy of Assessing
Errors and Classification of Readers Using
a 1-Min Timed Reading Sample
Gail Coulter, Karen Shavin, and Margaret Gichuru
Downloaded by [Colorado College] at 17:03 14 October 2014
ABSTRACT: Children in general education are classified by measures of oral reading fluency (ORF) to determine the level of support needed for reading. In addition, teachers use ORF measures with children who receive special education services to determine whether they are making progress toward their reading goals. In this descriptive study, the authors examined the accuracy of scoring for the 45 preservice teachers using the ORF subtest of the Dynamic Indicator of Basic Early Literacy Skills (R. H. Good & R. A. Kaminski, 2002) for 1st- and 6th-grade readers. Preservice teachers correctly classified a 1st-grade reader by using cutoff points. However, 48.8% of preservice teachers incorrectly classified the 6th-grade reader when an alternative classification was more appropriate. The authors make practical recommendations for teachers to ensure accuracy of scoring with timed reading measurements.
KEYWORDS: curriculum-based measurement, cutoff points, first- and sixth-grade readers, oral reading fluency, timed reading measurements
CURRICULUM-BASED MEASUREMENT of oral reading fluency (ORF) is especially useful because it accurately predicts later reading success (Barger, 2003; Buck & Torgesen, 2003; Crawford, Tindal, & Stieber, 2001; Vander Meer, Lentz, & Stollar, 2005; Wilson, 2005). Teachers who use this type of assessment are then able to intervene with specific remediation strategies in reading instruction at earlier stages, thus preventing later academic complications and possibly school failure. Not only are brief ORF measures efficient and effective in identifying children who need additional support (e.g., children with disabilities), they also provide a means for monitoring the progress of children in general education who may experience reading failure and children who receive special education services as they learn to read (Johns, 2005; Mehrens & Clarizio, 1993; Rodden-Nord & Shinn, 1991).
One type of ORF probe is a 1-min timed reading. The child reads for 1 min while the teacher notes the number of errors made during that time (Fuchs & Fuchs, 1993; McCurdy & Shapiro, 1992; Speece & Ritchey, 2005). These types of assessments are common throughout the United States. One commercial system for tracking ORF measures given by teachers is Aimsweb (Aimsweb, personal communication, January 22, 2008). The system reported scores of ORF for approximately 680,000 unique students from first grade through sixth grade for the 2006–2007 school year. It is important to note that Aimsweb alone showed 61,000 users, including school administrators and individual teachers. Furthermore, assessments similar to this are used throughout the country, and some are required in states in large and small districts. The state of Idaho assessed 53,000 students during the 2006–2007 school year using ORF measures (Idaho Reading Indicator Assessment Coordinator, personal communication, January 22, 2008). Although there are many types of ORF measures including commercial products, we selected the Dynamic Indicator of Basic Early Literacy Skills (DIBELS; Good & Kaminski, 2002) because it is used across the United States with more than 3 million children from Grades K–6 representing 3,000 districts with 12,000 schools (DIBELS Support, personal communication, August 31, 2006). The DIBELS subtests use cutoff points to identify children as needing no further assistance beyond typical classroom instruction; needing supplemental instruction; or needing intensive instruction (i.e., low risk, some risk, and at risk, respectively) in the areas of phonemic awareness, alphabetic principle, ORF, vocabulary, and comprehension.
Because of the number of children each year who are assessed using some form of ORF, it is essential that those people administering the assessments (namely, the classroom teachers) know how to administer the assessments according to the guidelines and how to use the scores for determining instruction. ORF measures are
Address correspondence to Gail Coulter, Western Washington University, Woodring College of Education, 516 High Street, Bellingham, WA 98225, USA; gail.coulter@wwu.edu (e-mail). Copyright © 2009 Heldref Publications
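The 1-min probe and the cutoff-point classification described in the introduction can be illustrated with a minimal sketch. It assumes that correct words per minute (cwpm) is the number of words attempted in the minute minus the errors marked, and that two cutoff points separate the three risk categories. The names `Cutoffs`, `correct_words_per_minute`, and `classify`, as well as the numeric cutoffs and word counts, are hypothetical and chosen only for illustration; actual cutoff points should be taken from the administration manual of the measure being used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cutoffs:
    """Hypothetical cutoff points, in correct words per minute (cwpm)."""
    some_risk_min: int  # scores below this value are classified "at risk"
    low_risk_min: int   # scores at or above this value are "low risk"


def correct_words_per_minute(words_attempted: int, errors: int) -> int:
    """Score a 1-min probe as words attempted minus marked errors (assumed rule)."""
    if errors < 0 or words_attempted < errors:
        raise ValueError("errors must be between 0 and words_attempted")
    return words_attempted - errors


def classify(cwpm: int, cutoffs: Cutoffs) -> str:
    """Map a cwpm score to one of the three risk categories."""
    if cwpm >= cutoffs.low_risk_min:
        return "low risk"
    if cwpm >= cutoffs.some_risk_min:
        return "some risk"
    return "at risk"


# Illustrative values only; they are not taken from any manual or from the study data.
example_cutoffs = Cutoffs(some_risk_min=20, low_risk_min=40)
score = correct_words_per_minute(words_attempted=50, errors=4)
print(score, classify(score, example_cutoffs))
```

Because the category depends only on which side of a cutoff the score falls, a scoring error of a few words matters little for a score far from a cutoff but can change the category for a score close to one.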
72 Preventing School Failure Vol. 54, No. 1
Step 3: Practice test. The participants in both groups listened to an audiotape of benchmark passages and practiced scoring rules on a DIBELS hardcopy as they listened to a first-grade child read orally on an audiotape from a first-grade benchmark passage. They marked a slash through an error when they heard one and marked the end of 1 min. Then, they listened to an audiotape of a graduate student who simulated a sixth-grade reader reading a sixth-grade benchmark passage.
Step 4: Training. We used guidelines for the procedures for training for ORF suggested by the Dynamic Measurement Group (K. Fleming, personal communication, September 28, 2006). We presented the most common rules for scoring as they appeared in the administration manual (e.g., self-corrects, hesitations, mispronounced words, word order, omissions, and inserted words; Good
ing groups. We expected that the variance for identification of errors within that 1 min would be greater for those in the instruction group (who only read the instructions) than for those in the training group (who read the instructions and received training on error identification and scoring). However, the results were similar. The instruction group’s mean was 46.80 (SD = 1.47), whereas the training group’s mean was 47.16 (SD = 1.49). For the first-grade reader, the actual cwpm was 46.
Results for sixth-grade passage. The sixth-grade reader had a rate of 124 cwpm. Table 2 presents the means, standard deviations, and ranges. The mean and standard deviation for the instruction group were 127.47 and 14.22, respectively. The standard deviation for the instruction group was greater than that of the training group. However, the difference in
TABLE 2. Means, Standard Deviations, and Ranges for Participants’ Correct Words
Per Minute
Instruction Training
Group M SD Range M SD Range
not change the meaning of the passage substantially. However, the administration guidelines specify that misidentifications are recorded as errors. The guidelines do not make provisions for whether the misidentifications change the meaning. Some of the common errors of confusion by the participants were the following: (a) surely was mistaken for assuredly, (b) saw was mistaken for spied, and (c) thankfully was mistaken for gratefully.
One other common error that surprised us was omission. Some omissions were words that were meaning-related. However, most omissions were of small words (e.g., a, the, an, then). Omissions, regardless of whether they were content or functional words, were difficult for participants to detect and score.
Last, participants seemed to have difficulty scoring errors when the erroneous word sounded similar to the word in the passage (e.g., held and helped, though and through, far and for, wet and went). The readers did not appear to have a dialect or any type of speech problem. This indicated the level of attention that participants needed when listening for errors. This also indicated the need for the participants to be familiar with the text to ensure accurate scoring.
The accuracy of the participants’ scores for the readers affected the reader’s classification as low risk, some risk, or at risk. This classification can determine whether a student receives additional assistance in reading. In this study, the participants correctly classified the first-grade reader. Because the scores fell well within the range of cutoff points, there was little chance of the participants’ misclassifying the student.
However, a problem arose when the participants scored the passage for the sixth-grade reader: The scores fell closer to the cutoff point. The closer the scores were to the cutoff point, the more difficulty the participants had in correctly classifying the reader for additional assistance. The reliability of the classification of the participants was compromised as the degree of difficulty of the passages and the speed of the reader increased.
Misclassifying a reader can have two consequences: A student would receive unnecessary supplemental assistance or intervention or would be denied needed assistance. In addition, an error in classification may not be detected for some time because students at benchmark would not be tested again until the next benchmark assessment, thus losing many valuable weeks of instruction. It is interesting to note that the sixth-grade passage (which had the most cwpm and the most errors in 1 min) also showed the greatest difference in classification among participants.
Lessons Learned
Because ORF measures are frequently administered in schools and can have a substantial effect on what kind of reading instruction and progress monitoring is provided to a student in general and special education settings, teachers need to be aware of the challenges when administering these assessments, regardless of how simple they appear. This is especially true when 1-min timed readings are administered by those who have not received extensive training and practice. We subsequently offer some practical suggestions from lessons learned to teachers who are likely to administer these measures or be responsible for their administration.
Teachers need to recognize that a test such as the DIBELS is different from most formative assessments. To get an accurate assessment, the measures must be administered consistently, following the standardization procedures. The measurements are designed neither to elicit best performance nor to provide an opportunity to teach. Consequently, the decisions on the basis of the results would only be accurate if the administration and scoring are accurate.
Before testing, teachers need to review the administration manual. In addition, it is helpful to preview each reading passage that would be given. When giving the assessment, teachers need to follow the presentation and scoring rules, even when instructions seem redundant. In addition, it is a good idea to practice with another adult who can provide feedback on any administrative errors. Another useful way to check reliability is to shadow score with a partner and compare results. Results should be within 2 points of each other to be reliable.
In any administration of an assessment, there is room for error. Therefore, the goal is to minimize the errors. One way to reduce errors is to administer three probes and use the middle score for making instructional decisions. If there is still doubt that the assessment accurately reflects the student’s skills, teachers should retest on a different day using an alternate probe.
Mistakes in timing were among the most common types of error that we noted. For this reason, training for those people who administer ORF measures should focus on accurate use of timing mechanisms as well as on following standardization procedures specifically related to timing. This is most important: Timing depends on the specific ORF test. In some cases, timing begins when the student reads the first word of the title. In other types of oral reading measures of the DIBELS, timing begins when the student reads the first word of the body of the passage. When to begin the timing for each type of measure is designated in the administration guideline. Further, using a stopwatch can be distracting to the teacher as well as to the child. Those administering the assessment would benefit from practicing with the timing mechanism with a variety of contingencies. If possible, the administrator should set the stopwatch for 1 min and let it count down, stopping automatically at the end of the minute. This frees the administrator to focus on the student response and not on the timer.
Students are likely to omit small words (especially before content words) or to provide substitute words. It is important to mark these errors, even if the error does not alter the meaning of the passage. The purpose of this type of 1-min timed reading is to assess accuracy and rate, not meaning, and so each error needs to be counted. In the case of the DIBELS, cutoff points for determining amount and quality of instruction have been established on the basis of large norming samples. Therefore, each error is important to note.
The faster the rate, the more likely there would be errors made by both the reader and the test administrator. For the administrator, it is more difficult to mark errors and follow the reader as the rate increases. The difficulty is even greater when there are numerous reader errors. If necessary, teachers can audiotape a student who may read at a faster rate and has frequent errors as they give the assessments and refer to the audiotape for scoring. If there is any suspicion that the assessment does not accurately reflect the skills of
Information derived from oral timed reading measures can lead to high-quality educational programming only if it is administered accurately.
AUTHOR NOTES
Gail Coulter is an assistant professor in special education at Western Washington University. Her research interests are reading (including assessment and interventions) and students who are at risk for school failure. Karen Shavin is a doctoral student in the educational leadership program at the College of Notre Dame of Maryland. For the past 5 years, she has worked with the Dynamic Indicators of Basic Early Literacy Skills, providing technical assistance throughout Maryland. Margaret Gichuru is a lecturer in interdisciplinary early childhood education at Murray State University and a doctoral student in the educational leadership program at Idaho State University. Her research interest is recruitment and retention of college students.
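Two of the Lessons Learned recommendations are simple arithmetic and can be sketched in a few lines: using the middle score of three probes, and checking that shadow scores from two scorers fall within 2 points of each other. The function names `middle_score` and `scores_agree` and the sample scores are hypothetical; this is an illustrative sketch, not part of any published scoring procedure.

```python
def middle_score(probe_scores: list[int]) -> int:
    """Return the middle (median) score from exactly three probes."""
    if len(probe_scores) != 3:
        raise ValueError("expected scores from exactly three probes")
    return sorted(probe_scores)[1]


def scores_agree(scorer_a: int, scorer_b: int, tolerance: int = 2) -> bool:
    """Shadow-scoring check: True when the two scores differ by at most `tolerance`."""
    return abs(scorer_a - scorer_b) <= tolerance


# Hypothetical cwpm scores, for illustration only.
print(middle_score([118, 131, 124]))  # 124
print(scores_agree(124, 126))         # True: within 2 points
print(scores_agree(124, 127))         # False: the pair should rescore or retest
```

Taking the middle score rather than the mean keeps a single unusually high or low probe from shifting the score used for instructional decisions.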