

Graduate Theses, Dissertations, and Problem Reports

2024

Effects of Commission Errors on Behavior Intervention Plan Outcomes

Olivia Brianne Harvey

Follow this and additional works at: https://researchrepository.wvu.edu/etd

Part of the Applied Behavior Analysis Commons

Recommended Citation
Harvey, Olivia Brianne, "Effects of Commission Errors on Behavior Intervention Plan Outcomes" (2024).
Graduate Theses, Dissertations, and Problem Reports. 12469.
https://researchrepository.wvu.edu/etd/12469

This Thesis is protected by copyright and/or related rights. It has been brought to you by The Research
Repository @ WVU with permission from the rights-holder(s). You are free to use this Thesis in any way that is
permitted by the copyright and related rights legislation that applies to your use. For other uses you must obtain
permission from the rights-holder(s) directly, unless additional rights are indicated by a Creative Commons license
in the record and/or on the work itself. This Thesis has been accepted for inclusion in the WVU Graduate Theses,
Dissertations, and Problem Reports collection by an authorized administrator of The Research Repository @ WVU.
For more information, please contact researchrepository@mail.wvu.edu.
Effects of Commission Errors on Behavior Intervention Plan Outcomes

Olivia B. Harvey

Thesis submitted
to the Eberly College of Arts and Sciences
at West Virginia University

in partial fulfillment of the requirements for the degree of

Master of Science in
Psychology

Claire St. Peter, PhD, Chair


Kathryn Kestner, PhD
Kathleen Morrison, PhD

Department of Psychology

Morgantown, West Virginia


2024

Keywords: treatment integrity, commission error, school-based intervention

Copyright 2024 Olivia Harvey


ABSTRACT

Effects of Commission Errors on Behavior Intervention Plan Outcomes

Olivia B. Harvey

When implemented well (with fidelity), behavior intervention plans (BIPs) improve student outcomes. Teachers tend to implement BIPs with poor overall fidelity, but little is known about the specific errors occurring during BIP implementation or the subsequent impacts these errors have on student outcomes. One possibility is that teachers learn which strategies suppress challenging behavior and implement those strategies regardless of what is written in the formal BIP. These added intervention components, termed commission errors, have not yet been evaluated in the context of BIP implementation. The present studies begin to address these gaps. In Study 1, we identified the prevalence and types of errors that three teachers made when implementing BIPs. A frequent commission error was selected for each student-teacher dyad to be assessed in Study 2. In Study 2, we manipulated the identified error to determine its impact on student outcomes. To accomplish this aim, we compared rates of challenging behavior when the error was present or absent during implementation of the BIP by a behavior analyst, using a reversal design. Teachers engaged in frequent errors, and one commission error enhanced the efficacy of a student's BIP.

Table of Contents

Introduction
General Method
    Recruitment and Consenting Process
    Demographic Questionnaires, Record Review, and Likert Scale
Study 1
    Method
    Setting
    Response Measurement
    Data Analysis
    Interobserver Agreement (IOA)
    Results and Discussion
Study 2
    Method
    Implementer, Consenting Procedures, & Setting
    Response Measurement and IOA
    Procedural Fidelity
    Error Selection
    Experimental Manipulation
    Experimental Design
    Results and Discussion
General Discussion
References
Tables
Figures
Appendices
Effects of Commission Errors on Behavior Intervention Plan Outcomes

Procedural fidelity is the extent to which a procedure is implemented as planned (Cook et

al., 2015). Procedures implemented with high procedural fidelity yield better outcomes for skill-

acquisition tasks (Bergmann et al., 2021; DiGennaro Reed et al., 2011; Holcombe et al., 1994;

Jenkins et al., 2015; Leon et al., 2014; Noell et al., 2002), self-care skills (Donnelly & Karsten,

2017), and behavior-reduction procedures (Foreman et al., 2022; St. Peter et al., 2016; St. Peter

Pipkin et al., 2010) relative to procedures implemented with reduced fidelity. However, the

specific effects of reduced procedural fidelity may depend on the type of error made.

There are at least two types of procedural fidelity errors: omission errors and commission

errors. An omission error consists of the absence of a step of an intervention. For example,

failing to deliver a reinforcer following an alternative (appropriate) response as specified in a

procedure would be an omission error. A commission error consists of adding a step to an

intervention context. For example, delivering a reinforcer following a target (challenging)

response, which is not specified in the procedure, would be a commission error.

There are relatively few studies evaluating effects of commission errors in isolation on

behavior-analytic procedures (Brand et al., 2019). Of the three studies that have evaluated

commission errors in isolation, all evaluated the same form of commission error: delivering a

reinforcer following unwanted behavior (DiGennaro Reed et al., 2011; Leon et al., 2014; St.

Peter Pipkin et al., 2010). For example, St. Peter Pipkin et al. (2010) evaluated commission

errors during differential reinforcement of alternative behavior with college students who

engaged in arbitrary responses. During Experiment 1, commission errors consisted of

periodically delivering a point when participants engaged in a target behavior that the

experimenters deemed analogous to challenging behavior. Commission errors resulted in higher

rates of target behavior relative to the intervention implemented without errors. Similarly,

DiGennaro Reed et al. (2011) evaluated commission errors during a skill-acquisition task with

elementary students diagnosed with autism spectrum disorder. Commission errors consisted of

praising incorrect responses. Commission errors resulted in low accuracy relative to teaching

without commission errors.

Several other forms of commission errors could occur. For example, Donnelly and

Karsten (2017) identified at least six forms of commission errors during observations of an

intervention for teaching self-care skills. One commission error was prompting the client to

complete the self-care skill steps out of order; this error produced undesirable outcomes

(participants did not reach mastery criterion) for both participants. A second commission error

was offering a choice of items prior to teaching when a choice should not have been offered.

Offering a choice of items may have improved outcomes (Bannerman et al., 1990), but its effects

were not evaluated.

To my knowledge, only one experiment (Carroll et al., 2013) has identified and evaluated

a commission error that was empirically demonstrated to improve outcomes. Carroll et al.

conducted a two-part study. In Study 1, experimenters observed four special-education teachers,

one regular-education teacher, one speech pathologist, and three paraprofessionals implement

discrete trial teaching with children diagnosed with autism spectrum disorder in a classroom

setting. During an average of 45% of trials, participants presented an instruction twice when the

procedure specified a single instruction. The researchers subsequently evaluated effects of this

error in Study 2. Introducing the error resulted in detrimental impacts (slower learning) for two

participants, but facilitative impacts (faster learning) for a third participant. Thus, in some

circumstances, commission errors could facilitate positive outcomes.

The two-part approach to first identifying errors and then manipulating them was also

adopted by Foreman et al. (2021) in the context of teachers’ implementation of school-based

behavior intervention plans (BIP) that included timeout procedures. Foreman et al. used

descriptive observation to identify common errors. The descriptive data demonstrated that

teachers frequently omitted timeout. Foreman et al. then experimentally manipulated the

frequency of timeout. For example, teachers implemented timeout following an average of one in

20 instances of challenging behavior for one student, so the researcher compared the efficacy of

timeout at 100% procedural fidelity (timeout after each response) and 5% procedural fidelity

(timeout following one in 20 responses, on average). Challenging behavior was suppressed for

two participants even when procedural fidelity was reduced. However, the timeout procedure

was only one component in more complex BIPs. Although Foreman et al. did not identify

frequent commission errors with the use of timeout, the limited scope of the evaluation may have

reduced the types and rates of commission errors that could have been observed.

Identifying a broader range of errors may be particularly important because teachers often

implement BIPs with notably low procedural fidelity. For example, Wickstrom et al. (1998)

observed 27 teachers with varying levels of teaching experience implement behavior-change

procedures (e.g., differential reinforcement of alternative behavior, response cost) in their

classrooms. Observers collected data on the extent to which teachers implemented consequences

for student behavior as planned. Teachers provided planned consequences with 0% to 21%

procedural fidelity (M = 4%).

More recently, Codding et al. (2008) observed three teachers who had received an 8-hr

didactic training to implement a class-wide behavior management plan. Observers collected data

on the teachers’ implementation of 14 procedural components. Teachers implemented the

behavior management plan with 0% to 57% procedural fidelity. Although Codding et al. reported

a higher procedural fidelity percentage than did Wickstrom et al. (1998), these percentages are

still low in comparison to the recommended standard of at least 80% procedural fidelity

(National Autism Center, 2015).

Although data from Wickstrom et al. (1998) and Codding et al. (2008) demonstrate that

teachers implement procedures with low overall fidelity, neither study included the types of

errors that teachers made. For example, Wickstrom et al. reported that teachers did not

implement the planned consequence following each instance of student behavior, but not what

the teacher did instead. Similarly, Codding et al. recorded teachers’ erroneous implementation of

a component as “not implemented as written,” which limits the interpretation of the types of

errors teachers made (p. 331).

Recently, Morris et al. (2024) provided recommendations for measuring procedural

fidelity using a checklist during direct observation of a procedure. Morris et al. suggested that

BIP steps be operationalized into observable and measurable units. Then, the steps should be

organized sequentially or under subheadings of the contexts in which the steps should occur.

Finally, each step should have a measure that accurately captures the relevant dimensions of the

step (e.g., frequency, duration). The authors note that modifying the measurement system to

include commission errors when observed may be important. However, none of the previous

studies on naturally occurring commission errors have used the strategies suggested by Morris et

al. to broadly capture various kinds of errors.

One possibility is that teachers add steps to BIPs to suppress challenging behavior. For

example, a teacher may deliver additional access to reinforcers (attention or access to items)

when the student is on-task, relative to what is specified in the BIP. This error may increase on-

task behavior, thus increasing the likelihood the teacher continues to make the error. When

teachers make such errors, the BIP may not result in the same student outcomes when the student

transitions to a new teacher. This difference in outcomes may even (or especially) occur if the

new teacher implements the BIP as written (with high fidelity) because they would be delivering

fewer reinforcers relative to the previous teacher, which may result in an increase in challenging

behavior.

Fidelity errors have predominantly been found to be detrimental to outcomes. However,

several other types of errors could exist, and such errors may have different effects on outcomes.

Teachers are known to make errors, but little is known about the types of errors teachers make.

One possibility is that teachers add steps to the BIP (i.e., commission errors not in the plan) and

that these additions are maintained by the suppression of challenging behavior (i.e., they facilitate

outcomes). A gap exists in the literature on how to capture a broader array of errors and the

impact such errors may have on outcomes. Given that errors occur, it is critical to (a) determine

the form and frequency of specific errors, and (b) evaluate impacts of those errors on outcomes.

Therefore, the purposes of the present studies were to identify the types and prevalence of

commission and omission errors in BIP implementation by teachers and to evaluate effects of an

identified commission error on student outcomes. Study 1 was a descriptive assessment to collect

data on types and frequencies of commission errors that occurred when teachers implemented

BIPs. Study 2 was an experimental manipulation of an identified commission error in a reversal

design to determine if an observed commission error from Study 1 affected rates of challenging

behavior.

General Method

Recruitment and Consenting Process

Student-teacher dyads were recruited from public elementary schools. We received

approval from the school district and contacted teachers who typically provided services to

students with BIPs. Researchers met with teachers who expressed interest to garner their consent.

The teachers were asked to provide information about the study to the legal guardian(s) of a

student(s) in their class who had a formal, written BIP. Legal guardian(s) who expressed interest

were sent a consent form. Once teacher and parent consent was secured, assent was obtained for

children above the age of 7 years without significant cognitive impairments.

Eight student-teacher dyads were recruited. Four dyads did not participate in the study:

two students were unenrolled from the study, one student changed schools, and one teacher

withdrew consent to participate before data collection began. Four dyads completed Study 1 and

were recruited for Study 2. Each of these dyads was from a public, alternative-education

elementary school.

Demographic Questionnaires, Record Review, and Likert Scale

Table 1 summarizes student demographic information and Table 2 summarizes teacher

demographic information and responses to the Likert Scale. See Appendix A and Appendix B for

the Questionnaires and Likert Scales.

Fabian (all names are pseudonyms) was an 8-year-old white male whose primary language was English. Fabian’s

parent reported that he had attention deficit disorder, depression, a learning disability, and

oppositional defiant disorder and that Fabian took guanfacine and methylphenidate medication

daily. Fabian’s BIP had been written 8 months before his enrollment in the study. Fabian’s BIP

specified he may protest, swear, insult others, leave his area, destroy property, or aggress.

Fabian’s educational records specified challenging behavior was maintained by escape. Fabian’s

BIP also included a pass that could be exchanged anytime to temporarily escape academic tasks

(tag-out area) and replace the ongoing academic task. Fabian’s BIP included a token system in

which tokens could be exchanged for items or activities with teachers on a reward menu (e.g.,

coloring, Legos). Fabian’s BIP contained two class-wide management strategies: a group token

system and the Good Behavior Game. Fabian’s teacher, Kelly, was a 40-year-old white female

whose primary language was English. Kelly had a master’s degree and over 10 years of

experience as a teacher. She held a credential as a Board Certified Behavior Analyst. Kelly had

known Fabian for 4 years and worked with him for 1.5 years. Kelly had assisted in writing Fabian’s

BIP.

Dakota was a 9-year-old white female whose primary language was English. Her legal

guardian reported that Dakota had attention deficit hyperactivity disorder and took Focalin,

Remeron, clonidine, risperidone, Zoloft, and melatonin daily. Dakota’s BIP had been written 5

months before her enrollment in the study. Dakota’s BIP specified she may refuse to comply,

leave her area, destroy property, aggress, elope, or injure herself. Dakota’s educational records

specified challenging behavior may be triggered by denied requests or not being within

proximity to a teacher and that challenging behavior was maintained by access to tangibles and

attention. Dakota’s BIP included certificates that could be exchanged anytime for access to adult

attention and frequent praise. Dakota’s BIP also included a token system in which tokens could

be exchanged for escape from academic tasks and access to tangible items. Dakota’s teacher,

Rachel, was a 45-year-old white female whose primary language was English. Rachel had a

master’s degree and 22 years of experience. Rachel had known and been working with Dakota

for 5 months. Rachel had assisted in writing Dakota’s BIP.

Warren was a 7-year-old white male whose primary language was English. The legal

guardian did not report any diagnoses or medications. Warren’s BIP had been written 6 months

before his enrollment in the study. Warren’s BIP specified he may bargain, yell, swear, leave his

area, destroy property, aggress, and elope. Warren’s educational records specified challenging

behavior was maintained by escape and attention. Warren’s BIP included a reward system for

completing worksheets without challenging behavior. For each worksheet that Warren completed

without challenging behavior, he could select from various rewards on a menu (e.g., skip-an-

assignment pass, play with a peer for 10 min). Warren’s BIP also included a morning check-in

and access to tangible items or teacher attention during transitions. Warren’s teacher, Rhett, was

a 27-year-old white male whose primary language was English. Rhett was obtaining a master’s

degree in education. Rhett was a long-term substitute with 2 months of experience. Rhett was

also the teacher for Wybie. Wybie was a 10-year-old white male whose primary language was

English. The legal guardian reported that Wybie had diagnoses of attention deficit hyperactivity

disorder, oppositional defiant disorder, and sensory processing disorder and took Focalin and

guanfacine. Wybie’s BIP had been written 9 months before his enrollment in the study. Wybie’s

BIP specified he may refuse to comply, negatively interact with others, leave his area, destroy

property, aggress, and elope. Wybie’s educational records specified challenging behavior was

maintained by escape, attention, and access to tangibles. Wybie’s BIP included curricular

revision for English Language Arts activities, frequent praise, and teacher attention for cleaning

up tangible items in a timely manner. Wybie’s BIP also included a tiered reward system for

completing academic activities and refraining from challenging behavior. Wybie could earn

access to a novel reward if he completed academic activities and refrained from challenging

behavior, a regularly available reward if he did not complete academic activities and refrained

from challenging behavior, or no reward if he engaged in challenging behavior. Rhett had known

and been working with Warren and Wybie for 2 months, and had not assisted with writing either

student’s BIP.

Immediately after providing consent, teachers were provided with a questionnaire about

the importance of and their experiences with BIPs. Teachers completed a 5-point Likert Scale

(ranging from strongly disagree to strongly agree) assessing their agreement or disagreement

with statements about BIP implementation and student behavior (see Appendix C). The

researcher accessed the students’ BIPs and educational records from the students’ school files.

The BIPs were used to create fidelity checklists (see Response Measurement below), and the

educational records (e.g., assessments) were used in the selection process of an error to

manipulate in Study 2 (see Error Selection section in Study 2).

Study 1

The purpose of Study 1 was to determine the types and frequency of errors that occurred

when teachers implemented BIPs during regularly occurring classroom routines.

Method

Setting

Observations occurred in the alternative-education classrooms during the students' typical

daily activities. Rachel’s classroom had students in 3rd and 4th grade and had an educational

assistant. Kelly’s and Rhett’s classrooms had students in Kindergarten through 2nd grade, and

each classroom had two educational assistants. Each classroom served fewer than 10 students.

Observations did not occur during irregular activities (e.g., fire drills, school assemblies)

or when the BIP was intentionally not implemented (e.g., assessments). Observers sat in a

location designed to minimize distraction to teachers and students and avoided interacting with

students. Each observation would have ended after 15 min or when the student transitioned to

working with a different teacher, whichever came first. However, the latter never occurred, and

all observations were 15 min in duration. The median number of observations per day was 2

(range: 2-3), and observations occurred across an average of 4 days for each student (range: 2-7).

The descriptive assessment for a student-teacher dyad was considered complete after a total of 50

commission errors had been recorded. If 50 commission errors were not recorded after 5 hr of

data collection, observations would have ended. However, this never happened.

Response Measurement

Each student’s existing BIP was used to create a procedural-fidelity checklist (see

Appendices D – G). A BIP step must have met three criteria to be included on the checklist (see

Table 3 for examples and nonexamples). First, only proactive steps were included to consistently

evaluate similar steps across observations, regardless of the occurrence of challenging behavior.

Proactive steps were procedures designed to prevent the occurrence of challenging behavior or

teach an adaptive response. Second, the step must have constituted a directly observable

interaction between the teacher and student. Third, the step must have been observable during a

15-min period. See Table 4 for a summary of the number of included and excluded steps for each

student-teacher dyad.

Four structural modifications were made to included BIP steps when applicable. First,

any step of the BIP that specified multiple teacher actions was divided into multiple steps on the

checklist. For example, “Set the timer and tell the student they have 3 min to engage in the

activity” would be divided into two steps (“set the timer for 3 min” and “tell the student they

have 3 min to engage in the activity”). This modification allowed us to detect the specific actions

for which teachers made errors. Second, any BIP step that was a duplicate (i.e., listed in two

places in the BIP) was only listed once on the checklist. Third, time windows were inferred for

the purposes of data collection because they were necessary for capturing errors. Each window specified a time

frame for the teacher to interact with their student after a context occurred. This was crucial for

determining whether a step should be scored as an omission or commission error. If the teacher

did not implement a step within the designated time window, the step was scored as an omission

error. If the teacher implemented a step outside the time window, the step was scored as a

commission error. The temporal window for a step to be considered correct was based on the

frequency or immediacy of implementation specified in the BIP. Steps that were specified to

occur immediately or within 2 min had a 15-s window. Thus, a step specified to occur

“immediately” was still considered correct if it occurred within 15 s, and a step specified to occur

“after 2 min” was considered correct if it occurred from 1.75 min to 2.25 min. Steps that should

occur in a range of 2 to 5 min had a 30-s window to be considered correct. For example, a step

that specified “praise every 5 min” would be considered correct if the teacher praised after 4.5

min or 5.5 min. Steps scheduled to occur every 5 min to 30 min had a 1-min window. Thus,

token exchanges scheduled to occur every 30 min were considered correct if they occurred

from 29 min to 31 min. We standardized time windows to ease data collection. Fourth,

additional contexts were provided for steps that were dependent on correct performance of a

previous step. For example, “When the [5 min] timer sounds…” is a context that would not have

occurred if the teacher did not set the timer. Therefore, we added the context of “(or 5 min

elapses)” to make the steps independent.
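
To illustrate, the standardized time windows can be expressed as a simple scoring rule. The following Python sketch is our own illustration rather than part of the thesis materials; the function names and the simplified omission/commission classification are assumptions based on the description above.

```python
def tolerance_s(scheduled_interval_s):
    """Return the +/- tolerance (in seconds) around a step's scheduled
    time, per the standardized windows described above (assumed rule)."""
    if scheduled_interval_s <= 2 * 60:    # "immediately" up to every 2 min
        return 15
    elif scheduled_interval_s <= 5 * 60:  # every 2 min to 5 min
        return 30
    else:                                 # every 5 min to 30 min
        return 60


def score_step(scheduled_interval_s, observed_latency_s):
    """Classify one scheduled step: correct (within the window), an
    omission error (never implemented), or a commission error
    (implemented outside the window)."""
    if observed_latency_s is None:
        return "omission error"
    if abs(observed_latency_s - scheduled_interval_s) <= tolerance_s(scheduled_interval_s):
        return "correct"
    return "commission error"


# Example: praise scheduled every 5 min, delivered at 5.5 min -> "correct"
print(score_step(5 * 60, 5.5 * 60))
```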

In some cases, additional clarification was needed to transform the written BIP into the

data-collection checklist. Once the checklist was drafted, two individuals with a BCBA-D

credential ensured that each BIP step that met the inclusion criteria appeared on the checklist and

identified possible remaining ambiguities on the checklist. If ambiguities existed, the researcher

interviewed the teacher to further operationalize ambiguous steps. This process iterated until

both BCBA-D reviewers agreed that the checklist was complete, and components were

operationalized. The researcher then piloted the checklist with the secondary observer. The two

observers discussed the discrepancies between recorded responses and resolved how the

interactions would be recorded going forward. If a discrepancy arose that could not be resolved

with only the language specified on the checklist, a scoring rule was added in the margins of the

checklist for observers to reference.

The checklists included space to narratively record and tally the frequency of commission

errors that were not in the BIP but met the inclusion criteria. For example, a student’s BIP may

not mention praise; however, a teacher may deliver praise following appropriate behavior. After

every observation, observers were allotted 2 min to add details to qualitative descriptions. Each

narratively recorded commission error was then operationally defined and added to the student’s

checklist before the next observation.

During each observation, observers categorized the teacher’s responses as being correct,

an omission error, a commission error, or not applicable. Correct implementation was defined as

the teacher implementing the step as described in the plan. For example, the teacher delivers a

token when the student meets the criterion to be awarded a token. An omission error was defined

as any portion of the step not occurring. For example, the teacher does not deliver a token when

the student meets the criterion. A commission error was defined as implementing a step that was

not specified in the plan or implementing it differently than as specified. There were two forms

of commission errors. A commission error in the plan was defined as the teacher adding or

modifying a step that appeared in the plan. For example, the teacher delivers two tokens when

the student meets the criterion, or the teacher delivers a token when the student does not meet the

criterion. A commission error not in the plan was defined as a student-teacher interaction that

was not a step described in the plan. For example, the teacher delivers candy when the student

engages in an appropriate response, which is not a step in the plan. Any BIP step that included

negative language (e.g., “do not comment on the behavior”) was scored as a commission error if

the behavior specified to be omitted occurred.

Brief interactions between the student and nonparticipating teachers in the classroom

were not recorded on the checklist, but the primary observer narratively noted these interactions.

If the participating teacher omitted a BIP step, it was counted as an omission error even if a

nonparticipating teacher implemented the step (but this rarely occurred). The observation would

have ended if the nonparticipating teacher implemented three BIP steps in succession, but this

never occurred.

Data Analysis

A global measure of procedural fidelity was calculated by dividing the number of correct

student-teacher interactions by the total count of student-teacher interactions (corrects and errors)

and multiplying by 100 to yield a percentage.

The rate of specific categories of errors per hour was calculated by dividing the count of

errors for each BIP step (including commission errors not listed on the BIP) by the total hours of

observation.
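
As a concrete sketch of these two calculations (our own illustration with hypothetical counts, not data from the study):

```python
def global_fidelity_pct(correct, omission, commission):
    """Global procedural fidelity: correct student-teacher interactions
    divided by the total count of interactions (corrects and errors),
    multiplied by 100."""
    total = correct + omission + commission
    return 100 * correct / total if total else 0.0


def error_rate_per_hr(error_count, observation_min):
    """Rate of a specific error category per hour of observation."""
    return error_count / (observation_min / 60)


# Hypothetical example: 10 correct interactions, 40 omission errors, and
# 25 commission errors over 120 min of observation.
print(global_fidelity_pct(10, 40, 25))  # ~13.3% global fidelity
print(error_rate_per_hr(40, 120))       # 20.0 omission errors per hour
```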

Interobserver Agreement (IOA)

Observers were trained to collect data during the initial pilot of each procedural fidelity

checklist. Training consisted of instructions on how to complete the checklist and feedback on

data collection by comparing their checklist to that of a trained researcher.

Because each form of error was typically infrequent during an observation, traditional

methods of calculating IOA resulted in very low agreement estimates. For example, if one

observer scored one instance of correct implementation of a step and the second observer scored

two instances, proportional agreement on that step would be 50% despite deviation by only one

count. Using a total-agreement calculation (i.e., smaller count divided by larger count for each category of data), the mean IOA score for “commission errors not in plan” was 44% for Fabian-Kelly, 64% for Dakota-Rachel, 34% for Warren-Rhett, and 50% for Wybie-Rhett (see Tables 5-8). Therefore, we used a correlation analysis to evaluate the believability of the data.

Correlations between counts obtained by the primary and secondary data collectors are shown in

Figure 1. Data for each participant are shown in a separate graph. Each data point represents the

counts obtained by each observer for that category (correct implementation, omission error,

commission error in plan, commission error not in plan) for a single observation. We collapsed

the categories of interactions and obtained significant Spearman r correlations for each student-

teacher dyad. No clear patterns in the differences across observers were obtained, although

commission errors not in the plan seemed less likely to be detected identically across both

observers (with only three of 12 observations [25%] resulting in perfect correspondence of these

kinds of errors across the observers, relative to 75% for commission errors in plan, 50% for

correct implementation, and 33% for omission errors). Although perfect correspondence was

relatively rare, with observers only having perfect correspondence in 48% of all interactions,

observers’ records typically deviated by only a few instances per category per observation.

Nonetheless, the imperfect agreement suggests that the results should be interpreted with caution.
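
As an illustration of the two agreement analyses described above, the following sketch is ours (with hypothetical counts, not study data); the thesis does not specify the software used for the Spearman correlations, so scipy is an assumption.

```python
from scipy.stats import spearmanr


def total_agreement_pct(count_a, count_b):
    """Total-agreement IOA for one category in one observation: smaller
    count divided by larger count, times 100 (100 when counts match)."""
    if count_a == count_b:
        return 100.0
    return 100 * min(count_a, count_b) / max(count_a, count_b)


# Hypothetical per-observation counts of one category from two observers.
primary = [3, 0, 5, 2, 1, 4]
secondary = [2, 0, 5, 3, 1, 2]

scores = [total_agreement_pct(a, b) for a, b in zip(primary, secondary)]
print(sum(scores) / len(scores))  # mean total-agreement IOA

# Believability check: rank-order correlation between observers' counts.
rho, p = spearmanr(primary, secondary)
print(rho, p)
```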

Results and Discussion

The Spearman correlations between the observers’ data were statistically significant for

all student-teacher dyads. However, agreement was far from perfect, and more typical measures

of IOA yielded low correspondence (34%-64% for commission errors not in the plan).

Disagreements may have been related to the number of discriminations required for the

observational method, how errors were categorized, or the exceptionally large difference

between the BIP as written and the BIP as implemented.

Observers had to engage in a lengthy discrimination process when recording data.

Observers had to (1) determine if the interaction occurred during a specified context, (2) and if

so, if the interaction was a BIP step, (3) and if so, the type of the interaction, and (4) if it was not

a context or BIP step, determine if the step met the three inclusion criteria, (5) and if so, if the

interaction was already operationally defined as a commission error not in the plan to tally, and

(6) if not operationalized, how to write a narrative description of the error. This process had to

occur quickly and repeatedly during live observations. Observers could make an error at any

point in the discrimination process or miss student-teacher interactions while recording previous

interactions.

The discrimination process was lengthy in part because interactions were categorized.

Procedural-fidelity scholars have categorized errors into the subtypes of omission and

commission (Brand et al., 2019). However, there is little information on the criteria or features of

an error that determine its classification (Vollmer et al., 2008). It is impractical to attempt to

classify errors when there is not sufficient information on how to classify errors (Han et al.,

2022). Researchers may consider identifying the features of an error that inform its

classification, and the best methods for classifying errors (Harvey & St. Peter, 2024).

The lack of correspondence between the procedural-fidelity checklist and the teacher’s

implementation made it difficult to determine if interactions were specified in the plan,

deviations in the plan, or deviations from the plan. This complicated consistently detecting

deviations from the plan because an interaction could look different each time it occurred. After every

observation, any recorded commission error(s) not specified in the plan were operationally

defined, even if only recorded once. Therefore, observers could have only one definitive example

of a commission error not in the plan for that specific teacher. Although alternative possible

examples were written on the checklist for observers to reference, it was not possible to

identify every example of what the teacher interaction could look like. For example, there are

many different praise statements a teacher could make (e.g., “Awesome job!” “Keep up the

good work!”). Even with an operational definition, it could be challenging to decide in the

moment if the specific wording of a phrase was praise or not.

It is uncertain what methods could improve IOA because our observers were experts

(specialization in procedural fidelity and extensive experience and training with data collection),

and we made several efforts to improve data collection (operationalization of the BIP,

modifications of the checklist during piloting, and addition of scoring rules). It may be that the

scope of the measurement system needs to be narrowed further. Behavior analysts could consider

using information about the client (e.g., functional assessments) to their advantage by selecting a

few BIP steps that appear to be function-based and are hypothesized to influence behavior. For

example, a behavior analyst may collect data on a student’s token system for earning breaks from

academic activities if they have an escape function. However, if the goal is to capture deviations

from the intervention, one would have to ensure that intervention steps not listed on the

checklist are not inadvertently captured as deviations from the plan.

It may be infeasible to collect procedural fidelity data on complex multistep BIPs. In a

pilot study, we determined that collecting procedural fidelity data on proactive and reactive steps

of a BIP was infeasible. Therefore, we narrowed our inclusion criteria to only proactive steps.

However, even after excluding reactive procedures (critical components of BIPs for which

procedural fidelity data should be collected), we did not obtain the minimum recommended

standard of 80% for IOA (Cooper et al., 2007). And, notably, two students (Warren and Wybie)

had more reactive than proactive steps in their BIPs. It is troublesome that observers could not

agree on whether or to what extent a treatment was implemented, especially because our

methods for developing a procedural-fidelity checklist aligned with recent recommendations

(Morris et al., 2024). The data should be interpreted with this caveat in mind, and our findings

are framed to suggest areas of future research to improve observational methods for procedural

fidelity data collection.

Figure 2 shows the total count of each category of teacher implementation: correct

implementation, omission error, commission error of a step in the plan (e.g., repeatedly

prompting when the BIP specified a single prompt), and commission error of a step not in the

plan (e.g., adding a potential reinforcer following appropriate behavior). Data were collected for a

total of 45 min, 120 min, 105 min, and 75 min for Fabian-Kelly, Dakota-Rachel, Warren-Rhett,

and Wybie-Rhett, respectively. These durations of data collection were sufficient to capture at

least 75 BIP-related student-teacher interactions (M = 104.5 interactions, range 75-143

interactions).

All teachers implemented BIPs with low fidelity (range: 0-17% correct) and engaged in

high rates of errors (range: 39-162 errors/hr). For two of the teachers (Kelly and Rachel),

omission errors were most common. In contrast, Rhett most commonly added potentially

impactful BIP steps (i.e., commission errors not in the plan).

These results replicate previous research demonstrating that teachers implement

procedures with notably low fidelity. The global fidelity percentages were particularly low in

comparison to the minimum recommended standard of 80% fidelity (National Autism Center,

2015). However, the recommended standard appears to be set arbitrarily. Some procedures may

need to be implemented with more than 80% fidelity (e.g., Jones et al., 2022), whereas others

may still be efficacious when implemented with less than 80% fidelity (e.g., Foreman et al.,

2021). Future research should continue to parametrically investigate the impact various global-

fidelity percentages have on intervention outcomes. Identifying interventions that are efficacious

despite low fidelity may be advantageous for behavior analysts in school settings. That is,

because teachers implement with low fidelity, behavior analysts in school settings could consider

recommending interventions more resistant to the impacts of low fidelity than alternatives.

The complexity of a BIP (e.g., the number of steps or the contiguity of interactions) may

reduce fidelity of teacher implementation. For example, Fabian’s BIP had the most steps (56)

and the densest schedule of reinforcement (VR 2 token delivery for appropriate behavior), and

Kelly never implemented a step correctly. Alternatively, Dakota’s BIP had the leanest schedule

of reinforcement (30-min DRO token delivery for absence of challenging behavior), and Rachel

had the highest fidelity. Procedural-fidelity scholars have hypothesized several factors, like

complexity, that may make procedures more prone to low fidelity (Allen & Warzak, 2000).

However, few studies have evaluated the impacts that factors hypothesized to reduce fidelity

have on fidelity. Of the studies that have examined these variables, the findings are mixed (see

Fiske, 2008, for a review). Moreover, there may be school-specific factors that impact teachers’

fidelity, like staff buy-in, staff burnout, and school environment or support (Garcia et al., 2022;

Kincaid et al., 2007; Schlichte et al., 2005) or teacher-specific factors like skill sets and training.

Research evaluating the influence of barriers in school settings on teacher fidelity is warranted.

The errors that reduce global fidelity may be more influential on student outcomes than

the global-fidelity percentage. A sub-optimal global-fidelity percentage may represent a few

errors across all steps of an intervention or several errors in only one step of an intervention. The

lack of specificity in a global measure of fidelity may overlook crucial errors in the

implementation of an intervention (Cook et al., 2015). All steps of an intervention might be

important. However, there are likely crucial steps of an intervention (e.g., reinforcer delivery).

Few studies have reported the types of errors made (Han et al., 2022; cf. Carroll et al., 2013;

Donnelly & Karsten, 2017). Therefore, it was important to examine not just the global-fidelity

scores and kinds of errors, but also the distribution of errors across the various components of the

BIPs.

The rate of each specific error is displayed as a bar graph (see Figures 3-6).

Kelly engaged in 12 forms of errors. The three most frequent errors were omitting token

deliveries (e.g., a token was not placed on a token board; 88/hr), adding praise/acknowledgment

(e.g., “Great job!”; 43/hr), and adding flexible seating (e.g., standing at his desk; 7/hr). Rachel

engaged in 24 forms of errors. The three most frequent errors were omitting praise directed

specifically to Dakota (12/hr), adding proximity to teacher (e.g., Rachel standing next to

Dakota’s desk or Dakota moving her chair next to the teacher; 9/hr), and omitting placing a

“dollar” (token) for each domain of behavior that met expectations in her “wallet” (e.g., not

delivering a dollar when expectations had been met; 8/hr). Rhett engaged in 21 kinds of errors

with Warren and 16 kinds of errors with Wybie. For Warren-Rhett, the three most frequent errors

were adding proximity to teacher (e.g., standing directly in front of Warren’s desk; 13/hr),

adding restrictions of access to items (e.g., removing potentially distracting items in Warren’s

academic area; 6/hr), and adding token deliveries (e.g., placing a token on a token board; 5/hr).

For Wybie-Rhett, the three most frequent errors were adding proximity (e.g., Rhett standing over

Wybie from behind at his desk; 14/hr), adding access to items (e.g., giving Wybie an item from

his lunch box; 11/hr), and omitting praise (7/hr).

Few previous studies have reported on errors involving teachers adding new steps to the

plan, despite recent discussion emphasizing the importance of capturing these forms of errors

(e.g., Colón & Wallander, 2023). The narrative descriptions of commission errors not in the plan

were idiosyncratic for each student-teacher dyad, with only a few common errors across all

dyads. The idiosyncrasy of procedural fidelity measures may hinder their adoption in practice.

Each procedural fidelity measure must be tailored to reflect a client’s individualized intervention.

Then, observers should capture the unique steps specified in the plan and deviations from the

plan. This complicates the creation of a measurement system and the identification of correct and

erroneous implementation during observation.

All dyads had two common commission errors not in the plan. All teachers moved in

proximity to their students (e.g., standing next to the student’s desk) and provided physical

attention (e.g., patting the student on the back). Proximity to the student was a notable

commission error not in the plan because the teacher often engaged in a second interaction, such

as physical attention. For example, Rachel stood next to Dakota’s desk, patted her on the back,

and said a statement of encouragement. For other examples, Kelly stood next to Fabian’s desk

and provided extra help, or Rhett stood over Wybie at his desk and provided gesture prompts.

Proximity, or one-on-one interactions with the teacher, may influence student outcomes. These

errors may have been common across all teachers because the classrooms had small teacher-to-

student ratios. Alternatively, these interactions may be common across teachers because teachers

engage in these interactions with their students or view these interactions as general classroom

management.

Teachers' experiences with and perceptions of BIPs may be important factors when

attempting to identify commission errors not in plans. Table 2 shows teachers’ demographic

information and responses to the questionnaire about the importance of BIPs. Teachers had

similar responses on three of six Likert scale questions. Teachers reported that they sometimes or

infrequently made errors when implementing BIPs. However, our descriptive assessment

findings suggest otherwise. Although teachers engaged in frequent errors, they agreed that

consistency in BIP implementation was critical to its success and disagreed that BIPs were

difficult to implement. Even with a formal data collection system, teachers may still not be able

to accurately record their behavior and overestimate their procedural fidelity (Hagermoser

Sanetti & Kratochwill, 2011). Given that the teacher’s perception is that errors are infrequent, the

teacher may not be highly motivated to seek out or receive feedback on performance. Additional

information is needed on the actions teachers view as teaching versus implementing a BIP.

The West Virginia Department of Education (2023) has five professional teaching

standards that serve as a basis for assessing the expected performance of teachers. The standard

“The Learner and the Learning Environment” sets the expectation that teachers will demonstrate

a fundamental understanding of student development and foster an environment that promotes

learning for all students. An indicator that a teacher is distinguished in the substandard of setting

expectations for student behavior is “The teacher has established with students a mutually agreed

upon set of behaviors that foster standards of conduct and consequences in an environment that

focuses on learning” (p. 26). Teachers are expected to establish rules and consequences for

behavior for all students regardless of a BIP. In addition, an indicator that a teacher is

distinguished in the substandard of differentiating learning is “The teacher guides students in

developing individual learning processes by demonstrating extensive and subtle understanding of

the needs, interest, learning style, cultural heritage, gender, and environment of students” (p. 18).

The teacher may have viewed their interactions with students as meeting these expected

performance standards, whereas we may have captured these interactions as errors. When

attempting to identify interactions added to procedures, it may be important to determine the

foundational expectations of worker performance to inform data-based decisions (e.g., feedback

to the implementer). Nonetheless, capturing interactions that are expectations of worker

performance is important. Future studies might first evaluate or establish that teachers are

engaging in the expected behavior to support all learners before measuring implementation of

specific BIP components. If teachers lack these core skills, building them first may facilitate

positive outcomes for multiple students, and excluding these steps from BIP fidelity may ease

subsequent measurement.

In sum, Study 1 replicates previous findings that teachers implement procedures with low

fidelity and provides preliminary evidence that teachers engage in high rates of errors, including

adding steps not in BIPs. Given that it may be infeasible to measure fidelity for all components

of a BIP simultaneously, knowing which errors are potentially impactful could inform

streamlined initial measurement systems.

Study 2

The purpose of Study 2 was to experimentally manipulate a commission error identified

during Study 1 to determine if it impacted BIP outcomes. To accomplish this aim, we

implemented the BIP with high fidelity (HF), with a programmed and possibly facilitative

commission error (FCE), or with no fidelity (NF) in a reversal design and compared the rate of

challenging behavior between conditions to draw conclusions about the error’s impact on treatment outcomes.

Method

Implementer, Consenting Procedures, & Setting

Because Study 2 required implementation of the BIP precisely as specified in each

experimental condition, a doctoral student in behavior analysis with three years of experience

developing and implementing BIPs (hereafter referred to as “researcher”) conducted all

procedures. The legal guardian again provided consent as described for Study 1. Assent was

obtained for all children because they all were above the age of 7 years and did not have

significant cognitive impairments. Assent was monitored before each block of sessions by asking

students if they would work with the researcher in an alternative room. If the student declined,

the session was not conducted. If the student declined for three consecutive sessions, they were

asked if they wanted to be asked again the next school day (Dakota and Wybie), or they were

withdrawn from the study (Fabian and Warren). If Dakota or Wybie wanted to be asked again

the next school day, the experimenter repeated the assent procedures. We changed the assent

procedures for Dakota and Wybie after challenges with obtaining assent from Fabian.

Assent issues occurred for each of the four participants. Warren did not provide initial

assent to participate, so no Study 2 data were collected. After the ninth session (in the No-

Fidelity condition), Fabian withdrew assent to participate. Although Fabian did not label a reason

for his withdrawal of assent, it seems likely due to there being more access to reinforcement in

the classroom than in the research sessions during that condition. Although we changed

procedures to ask across separate days, Wybie withdrew assent after four sessions. Dakota initially

participated consistently in April and May of 2023, but withdrew assent as the school year came

to a close. Like Fabian, she did not provide a reason for withdrawing assent. Anecdotally, the

rigor of academics in the classroom seemed to wane as the school year came to an end, and it

seems likely that participating in research sessions increased the amount of academic work that

Dakota was expected to complete. We resumed sessions with Dakota in October 2023. She

assented to only the first session, during which she contacted the academic tasks required

(journal prompts), which exceeded those expected of her in the classroom. Therefore, we

changed the academic task from journal prompts to mirror the tasks Dakota was expected to

complete in the classroom. This resulted in Dakota assenting to participate in five additional

sessions before she declined to participate in a session on one day. As with other refusals to

assent, no rationale was provided, but we hypothesized that Dakota may have not wanted to

leave her peers. After participating in three additional sessions, Dakota completed the

experiment.

Sessions were not conducted in the classroom to minimize disruption (as we presumed

challenging behavior would occur) and to prevent possible confounding variables (e.g., a peer

interrupting a session). Study 2 occurred in a barren office at the student’s school (hereafter

referred to as “research room”). Only the researcher, student, and data collectors were present.

All sessions were video recorded. Each session was 15 min in duration.

Response Measurement and IOA

The primary dependent variable was the rate of the individually defined challenging

behavior, which was expressed as aggregate responses per minute collapsed across all

topographies. Each student’s operational definitions of challenging behavior included any

topographies specified in the BIP. For example, Fabian’s property destruction definition included

ripping materials as that topography was written in his BIP, whereas Dakota’s property

destruction definition included swiping materials out of place as that topography was written in

her BIP. To maintain a consistent termination criterion, each student had the same definition of

elopement (which captured leaving the research room) regardless of whether it was specified in

the BIP. See Table 9 for each student’s operational definitions of challenging behavior.

Observers independently collected data from video using the Behavior Logger

Observation Coding System (BLOCS) software. The BLOCS software recorded timestamps as

observers press designated keyboard keys associated with participant responses. Data were

output to an Excel file, which included total session duration, a timestamp of each response, and

a summary of responses per minute or percentage of the session during which the response

occurred.

Observers were trained using a performance-based system, consisting of a self-instruction

manual, video models, and automated performance feedback. Training continued until observers

achieved IOA coefficients of at least 90% for all responses across two consecutive sessions

during the experiment, using the calculations described in the IOA section below. If IOA values

had decreased below 90% accuracy for any key for two consecutive sessions, the researcher

would have identified errors, and both observers would have returned to training. However, this

never occurred.

Interobserver agreement was calculated using a partial-agreement method with a 10-s

window, using the IOA calculator in BLOCS. The program divided each observer’s data into 10-

s intervals, divided the smaller count of behavior by the larger count for each interval, averaged

the values across intervals, and multiplied by 100 to yield a percentage. If observers agreed on

the absence of behavior in an interval, IOA for that interval was considered 100%. For Fabian,

IOA data were collected and calculated for 33% of HF sessions and 33% of NF sessions. IOA

was 100% for HF sessions and 98% for NF sessions. For Dakota, IOA data were collected and

calculated for 36% of HF sessions, 50% of FCE sessions, and the NF probe. IOA was 99% (97-

100%) for HF sessions, 100% for FCE sessions, and 100% for the NF probe. For Wybie, IOA

data were collected and calculated for 25% of HF sessions. IOA was 100% for HF sessions.
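
For illustration, the partial-agreement calculation described above can be sketched as follows. This is our own sketch with hypothetical timestamps; BLOCS performs this calculation internally, and its actual implementation may differ.

```python
def partial_agreement_ioa(ts_a, ts_b, session_s=900, bin_s=10):
    """Partial-agreement IOA with 10-s windows: bin each observer's
    response timestamps (in seconds) into 10-s intervals, score each
    interval as smaller count / larger count (1.0 when both observers
    agree, including agreement on the absence of behavior), average
    across intervals, and multiply by 100."""
    n_bins = session_s // bin_s
    counts_a, counts_b = [0] * n_bins, [0] * n_bins
    for t in ts_a:
        counts_a[min(int(t // bin_s), n_bins - 1)] += 1
    for t in ts_b:
        counts_b[min(int(t // bin_s), n_bins - 1)] += 1
    scores = [1.0 if a == b else min(a, b) / max(a, b)
              for a, b in zip(counts_a, counts_b)]
    return 100 * sum(scores) / len(scores)


# Hypothetical timestamps (s) scored by two observers from the same video
# of a 15-min (900-s) session.
print(partial_agreement_ioa([12.0, 47.5, 300.2], [12.4, 48.1, 299.0]))
```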

Procedural Fidelity

We measured the researcher’s procedural fidelity (i.e., programmed BIP errors would be

scored as correct). Procedural-fidelity data were collected using the modified version of the

procedural-fidelity checklist that included the possible facilitative error and reactive BIP steps.

Observers were trained to collect procedural fidelity data for each participant by collecting data

on video recordings of research sessions. Training continued until observers achieved IOA

coefficients of at least 90% across two consecutive sessions during the experiment, using the

calculations described below.

Procedural-fidelity scores were calculated by dividing the total number of correct

researcher responses by the total number of researcher responses and multiplying by 100 to yield

a percentage for each checklist. For Fabian, global procedural fidelity was 100%. For Dakota,

global procedural fidelity was 96% (67%-100%). For the HF condition, the mean procedural

fidelity was 95% (67%-100%). For the NF condition, the mean procedural fidelity was 100%.

For the FCE condition, the mean procedural fidelity was 95% (86%-100%). See Table 10 for a

summary of the errors made. For Wybie, global procedural fidelity was 64% (27%-100%). See

Table 11 for a summary of the errors made.
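
For illustration, the score for one checklist reduces to a one-line calculation; the sketch below assumes each researcher response was coded as correct, an omission, or a commission (the coding labels are illustrative).

```python
# Sketch of the per-checklist procedural-fidelity score: correct
# researcher responses divided by all researcher responses, times 100.
def fidelity_score(codes):
    correct = sum(1 for c in codes if c == "correct")
    return 100 * correct / len(codes)

session_codes = ["correct", "correct", "omission", "correct",
                 "commission", "correct"]
print(f"{fidelity_score(session_codes):.0f}%")  # 4 of 6 correct -> 67%
```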

Correlations between counts obtained by the primary and secondary data collectors are

shown in Figure 7. We evaluated IOA identically to Study 1 and obtained significant Spearman r

correlations for two of three students. Observers had perfect correspondence for 75% of all

interactions.
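
This correlational check can be reproduced with standard statistical software. A minimal sketch using SciPy, with made-up tallies for illustration:

```python
# Sketch of the tally-correlation analysis: Spearman correlation between
# the two observers' total tallies per interaction category (tallies
# below are invented for illustration).
from scipy.stats import spearmanr

primary_tallies = [12, 3, 0, 5, 1, 7]
secondary_tallies = [11, 3, 1, 5, 0, 7]
rho, p = spearmanr(primary_tallies, secondary_tallies)
print(f"Spearman r = {rho:.2f}, p = {p:.4f}")
```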

Error Selection

We selected a commission error that occurred frequently during Study 1 and was

expected to facilitate intervention outcomes. In the context of proactive BIP steps, a possibly

facilitative error is a form of error that may reinforce desirable behavior, attenuate an

establishing operation, or provide additional instructions or discriminative stimuli. For

example, the delivery of additional possible reinforcers (e.g., praise, proximity, access to items)

following appropriate behavior was categorized as a possibly facilitative error.

For Dakota, four commission errors were selected to simulate the student-teacher

interaction observed in the classroom. The four errors were as follows: proximity, access to a

new space, physical interactions, and statements of encouragement or assurance. Because these

errors co-occurred during Study 1, we treated all four error topographies as one combined error

during Study 2. Thus, each time an error occurred, the researcher moved close to Dakota, walked

with her into the corridor outside the session room, provided physical attention (e.g., patted

Dakota’s back), and made an encouraging statement (e.g., “You can do it!”). Dakota had a

certificate on her desk that had the phrase “talk to the teacher privately” to signal the presence of

these commission errors during the FCE condition. Given that one selected commission error for

Dakota was proximity to the teacher, the researcher measured the distance between Dakota’s

desk and the teacher’s desk in the classroom to simulate the classroom. Dakota’s desk was 7 ft

away from the researcher’s desk during all sessions.

Fabian, Wybie, and Warren did not experience commission errors because they either did

not provide assent or withdrew it.

Experimental Manipulation

Sessions consisted of a school routine like the one during which the error was observed in

the classroom. Fabian was presented with addition, subtraction, and sight word flashcards, as

Kelly had nominated these as skills Fabian should practice. The researcher presented all

flashcards for a particular skill before moving on to the next deck. The decks were presented in a

randomized order. Dakota was presented with two different kinds of academic activities. During

Sessions 1-18, she was instructed to write a response to a journal entry prompt. To address issues

with refusal of assent, during Sessions 19-27, she was instructed to complete activities that

would have been occurring in the classroom (e.g., educational computer program, worksheet).

Wybie’s BIP specified revising his curriculum based on his most recent assessment results.

Therefore, Wybie was presented with first-grade (rather than second-grade) English Language

Arts worksheets.

For Fabian, one to four sessions were conducted per day for four days, spanning 10

calendar days. For Dakota, sessions spanned 51 days with an average of 2.5 sessions (range: 1-4)

conducted on an average of 2 days per week (range: 1-5). For Wybie, two sessions were conducted

per day for two days, spanning 18 calendar days. Sessions were 15 min unless a termination

criterion was met. The researcher stopped the session if the student aggressed, injured

themselves, or eloped.
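
Because terminated sessions were shorter than 15 min, response rates presumably reflect the actual session duration rather than the programmed duration; a minimal sketch under that assumption:

```python
# Sketch of a response-rate calculation: count of responses divided by
# the actual session duration (which could be under 15 min when a
# termination criterion was met).
def responses_per_minute(timestamps, session_duration_s):
    return len(timestamps) / (session_duration_s / 60)

# Example: 3 responses in a session terminated at 300 s (5 min)
print(round(responses_per_minute([12.0, 95.5, 288.2], 300), 2))  # 0.6
```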

Experimental Design

Comparisons between a high-fidelity (HF) baseline and facilitative commission error

(FCE) condition were conducted using a reversal design for Dakota. Although reversal designs

were also planned for the other participants, either no data or only baseline (HF or No-Fidelity)

data were collected.

High-Fidelity (HF) Baseline. The purpose of the HF condition was to determine whether a

student's BIP was effective. All individualized components of the BIP relevant to

the academic routine were implemented exactly as written (100% procedural fidelity). The HF

baseline was conducted for at least three sessions and continued until there was no trend in response rates.

Fabian was provided a pass to take a break from academics and replace the activity;

however, he never used it. The researcher delivered tokens for every 1-3 appropriate responses.

The appropriate responses were answering flashcards (regardless of whether the answers were

correct) and staying in his seat for an entire work block (i.e., up until the delivery of the last

token). After he earned 12 tokens, the tokens were exchanged for 3-5 min of access to an activity

of his choosing from a reward menu. Fabian selected coloring or Legos. The duration of the

reward activity was signaled with a timer, and Fabian was provided two advanced notices of

when reward-menu activities would end. The first advanced notice was when 1 min remained,

and the second when 30 s remained. Given that Fabian’s BIP had ranges for the number of

appropriate responses and min of access, two lists of values were generated using an online

random number generator, one for each BIP step. The values within the range were block-

randomized by three. A data collector seated behind Fabian discreetly signaled to the researcher

when to deliver a reinforcer, and the number of minutes of access to an activity. If Fabian

refused or did not provide an answer, the researcher re-prompted the instruction.
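
A minimal sketch of this block randomization, assuming each block of three contains every value in the BIP-specified range exactly once (the study generated its lists with an online random number generator; the function here is only illustrative):

```python
# Sketch of block randomization by three: each block of three contains
# every value in the range once, in shuffled order.
import random

def block_randomized(values, n_blocks):
    sequence = []
    for _ in range(n_blocks):
        block = list(values)
        random.shuffle(block)
        sequence.extend(block)
    return sequence

token_criteria = block_randomized([1, 2, 3], n_blocks=4)  # responses per token
reward_minutes = block_randomized([3, 4, 5], n_blocks=4)  # min of reward access
print(token_criteria, reward_minutes)
```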

Dakota was provided with two certificates. The first certificate could be exchanged to

help a teacher complete a task, and the second could be exchanged to work at a teacher’s desk

for 5 min. To maintain consistency across sessions, Dakota was not permitted to exchange a

certificate during sessions; instead, the researcher provided an alternative option if Dakota

requested to use a certificate. The researcher rephrased the instructions for 1 min when

presenting a journal prompt. The researcher praised every 5 min and attended to every hand

raise. Dakota’s BIP specified that she could earn up to three tokens every 30 min in clock time

(i.e., 10:00 a.m.) for being safe, respectful, and responsible. Tokens were delivered during

sessions in which the clock time occurred. The researcher stated a reminder if Dakota put her

head down, vocalized loudly, refused to start an academic activity, or left her area. The

researchers removed materials when Dakota destroyed property.
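
The clock-time token rule described above can be made concrete with a short sketch that finds the :00 and :30 marks falling within a session window (the date and times below are invented for illustration):

```python
# Sketch of the clock-time token rule: tokens could only be delivered in
# sessions whose window contained a :00 or :30 clock mark.
from datetime import datetime, timedelta

def token_marks_in_session(start, duration_min=15):
    end = start + timedelta(minutes=duration_min)
    marks = []
    t = start.replace(minute=0, second=0, microsecond=0)
    while t <= end:
        for m in (0, 30):
            mark = t.replace(minute=m)
            if start <= mark <= end:
                marks.append(mark.strftime("%I:%M %p"))
        t += timedelta(hours=1)
    return marks

print(token_marks_in_session(datetime(2023, 5, 1, 9, 50)))  # ['10:00 AM']
```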

Wybie’s academic activities were revised to be at the first-grade level, and the researcher

praised every 5 min. If Wybie refused, the researcher asked Wybie if he needed help. If Wybie

left his area, the researcher stated a reminder.

Facilitative Commission Error (FCE); Dakota Only. The purpose of the FCE

condition was to determine whether a programmed error facilitated outcomes by comparing the rate of

challenging behavior to the HF condition. The BIP was implemented with 100% procedural

fidelity with the commission error, which consisted of the researcher moving close to Dakota,

walking with her to the corridor outside the session room, providing physical attention, and

making an encouraging statement. Before Session 11, the researcher told Dakota what the "talk to

the teacher privately" certificate was and how to use it. In correspondence with Dakota's BIP, the

researcher could attend to Dakota when she raised her hand. Any request by Dakota to use the

certificate following a called-upon hand raise was honored. We opted for Dakota to initiate the

interaction, rather than the researcher, for clinical reasons. We were near the end of the school

year, and Dakota would be transitioning to a different school; a student-initiated certificate would be

more readily available to embed in her new BIP.

No-Fidelity (NF) Baseline. The purpose of the NF condition was to determine whether a student

required a BIP. The NF condition consisted of omission of all proactive steps of the BIP.

Reactive steps were implemented correctly to maintain safety.

Fabian was not provided a pass to take a break from academics and replace the activity.

The researcher presented flashcards but never delivered tokens, and, therefore, did not permit

token exchange.

Dakota was not provided certificates. The researcher gave an initial

instruction but then did not provide praise, attend to hand raises, deliver tokens, or re-state the

instructions. The researcher stated a reminder if Dakota put her head down, vocalized loudly,

refused to start an academic activity, or left her area. The researchers removed materials when

Dakota destroyed property.

Results and Discussion

Fabian, Dakota, and Wybie’s rate of challenging behavior across conditions is displayed

in Figure 8.

Fabian participated in nine research sessions. Rates of challenging behavior were low

during the HF condition. Following the transition to the NF condition, Fabian did not engage in

challenging behavior for the first two sessions (Sessions 4 and 5). A higher rate of challenging

behavior (1.19 responses/minute) occurred in Session 6, but it decreased across subsequent

sessions. Fabian withdrew assent after Session 9, thus terminating his participation.

Although mean response rates were lower in the HF condition than the NF condition, the

delayed change in rates, high degree of variability, and lack of replication in the planned reversal

design preclude conclusion regarding the efficacy of the BIP. Fabian’s withdrawal of assent also

precluded the evaluation of the planned commission error of praise. Thus, Fabian’s data do not

permit the identification of functional relations.

Dakota participated in 27 research sessions. Sessions were terminated if Dakota

aggressed, eloped, or injured herself; terminated sessions are depicted on the graph by white data

points. During the first HF condition, Dakota initially engaged in variable rates of challenging

behavior and met the termination criteria in one session because of self-injury. However, overall

rates of challenging behavior were low. Therefore, we conducted an NF session to determine if

the BIP effectively suppressed responding. Dakota engaged in a high rate (1.28 responses/min)

of challenging behavior during the NF session, ultimately leading to the termination of the

session due to aggression. Although we had initially planned an NF condition, the dangerous

escalation in challenging behavior during the first NF session caused us to pivot back to HF,

using a single NF session as a probe. Like the first several HF sessions, Dakota initially did not

engage in challenging behavior upon the return to the HF condition. However, a

high rate (0.83 responses/min) of challenging behavior then occurred, ultimately leading to the

termination of that session due to aggression. Thus, across the HF condition, Dakota infrequently engaged in

challenging behavior, but the challenging behavior that occurred was severe and posed imminent

harm to herself and others. These results were replicated in the second HF condition.

During the first FCE condition, Dakota did not engage in challenging behavior across six

sessions. Technical difficulties occurred with the camera during Session 15, and we could not

retrieve the data. Although the frequency of severe behavior had increased in the replication of

the HF condition, no challenging behavior occurred across five consecutive sessions in the

replication of the FCE condition. Thus, implementing a BIP with high fidelity and a commission

error can actually facilitate student outcomes compared to the BIP without a commission error.

Dakota did not use the "talk to the teacher privately" certificate during the final FCE

condition and thus did not experience the commission error. One possibility is that the availability of

the certificate was sufficient to mitigate challenging behavior. This effect has been demonstrated

in other interventions for which the availability of reinforcers was signaled by a certificate or

pass. For example, Ravid et al. (2021) provided children with a bedtime pass that could be

exchanged for nighttime parental attention. The children initially exchanged the passes often but,

over time, kept all the passes instead of exchanging them. The intervention effectively reduced

co-sleeping and increased independent sleeping even when the children kept the passes. When

Dakota withdrew assent in June of 2023, we recommended to Rachel that a “talk to teacher

privately” certificate be added to Dakota’s BIP. When we re-obtained assent in October of 2023,

Dakota’s BIP included the talk to teacher privately certificate, and the date of the BIP indicated

the intervention had been in place for approximately 3 months of school. Dakota’s educational

records indicated Dakota had been successful, and the new teacher reported Dakota was not

using the "talk to the teacher privately" certificate in the classroom. Given that the

intervention had been in place for an extended time, Dakota may have started to keep the passes instead

of exchanging them, with the intervention still effectively suppressing challenging behavior.

However, the lack of certificate use during research sessions limits our ability to draw firm

conclusions about effects of the commission errors.

All three students withdrew their assent at some point in the study, with two of the three

students withdrawing their assent when the BIP was implemented with high fidelity. Upon

reporting assent issues to the teachers, the teachers offered to let us conduct the study in the

classroom. We declined because the students' data up to that point made it seem likely that

implementing the BIP as written would be disruptive to the classroom environment. However, in

the classroom, elementary students do not generally have opportunities to vocally assent to

participating in their BIPs. It has been suggested that students should have the opportunity to assent to the

interventions they receive, and that efforts should be made to teach students to self-advocate. In

behavior analysis, Breaux and Smith (2023) called for the adoption of assent-based practices,

which include teaching clients about assent and how to advocate on their own behalf. In schools,

student’s self-advocacy and academic outcomes improved when they were provided explicit

instruction on how to participate in IEP meetings (Blackwell & Rossetti, 2014). In the present

study, the student’s withdrawal of assent may have suggested that the BIP as written was

aversive. Given that teachers are federally mandated to implement BIPs, it may be worthwhile to

include students in discussions about the services they receive (Individuals with Disabilities

Education Improvement Act, 2004).

General Discussion

The current studies aimed to identify the types of errors teachers make when

implementing BIPs (Study 1) and the impacts such errors have on student outcomes (Study 2).

However, low agreement between observers in Study 1 requires the data to be interpreted

cautiously, and the lack of student assent in Study 2 limits the extent to which we could identify

possible functional relations. During Study 1, our process for creating procedural-fidelity

checklists aligned with recent best-practice recommendations (Morris et al., 2024), but we had

disagreements in IOA that reduced the believability of our data despite additional training and

feedback for observers. Nevertheless, our findings are consistent with previous research indicating

that teachers implement behavior-reduction procedures with sub-optimal fidelity in naturalistic

contexts (Codding et al., 2008; DiGennaro Reed et al., 2010; Foreman et al., 2021; Mouzakitis et

al., 2015; Sanetti et al., 2014). Our findings additionally extend the literature by identifying that

teachers engage in commission errors that are not in the plan. During Study 2, we had challenges

obtaining complete datasets due to a lack of student assent. However, we were able to obtain one

complete dataset. Contrary to previous research that identified commission errors as detrimental

to behavior-reduction procedure outcomes, our findings suggest that a commission error could

actually facilitate outcomes (Foreman et al., 2022; St. Peter et al., 2016; St. Peter Pipkin et al.,

2010). Collectively, these findings broaden our conceptualization of procedural fidelity as they

suggest that there are other errors that may need to be captured in data collection and that errors

may have a variety of impacts on outcomes. Additional research is still needed to identify

procedural fidelity measurement systems that produce better IOA to increase the believability of

data, and experimental analyses are needed to determine the various effects of errors on

behavior-analytic procedure outcomes.

Research studies that have identified naturalistic errors and then manipulated them in

experiments have mainly focused on evaluating errors that are already specified in the plan (Carroll et

al., 2013; Foreman et al., 2021). However, this approach may not be sufficient to

identify all the errors that occur. Sanetti et al. (2014) stated that they used the procedural fidelity

measurement system with the most empirical support to measure behavior plan implementation

in the classroom, which captured only errors specified in the plan. However, Sanetti et al.

acknowledged that additional research on fidelity would enhance our understanding, and the

measurement system would need to change to ensure all relevant variables are accounted for.

Given our findings, it is crucial to broaden the scope of errors identified in procedural fidelity

measurement systems to capture errors not in the plan. However, this approach has challenges,

such as the need for a sufficient level of expertise to understand how potential reinforcers impact

behavior in nuanced ways. Researchers may need to conduct exploratory studies using different

methods to identify common themes in commission errors that are not in plans. However, it is

important to note that the most common errors may not necessarily have the most significant

impact, and there is a vast array of possible commission errors to identify. As a result, further

investigation in this area may require some trial and error to identify the most efficacious

measurement systems as our understanding of procedural fidelity expands.

Teacher’s implementation of behavior plans in the classroom has been measured using

categorical measurement systems (Codding et al., 2005; Codding et al., 2008; DiGennaro Reed

et al., 2007; DiGennaro Reed et al., 2010; Mouzakitis et al., 2015; Sanetti et al., 2014). For

example, Codding et al. (2005) had observers score each procedure component as (1)

implemented as written, (2) not implemented as written, or (3) no opportunity to observe. Similar

to Study 1, Codding et al. also narratively recorded examples of deviations in implementation

and did not factor agreement of narrative descriptions into IOA calculations. Codding et al.

(2005) suggested that perhaps a more conservative method should be used that directly compares

observers' agreement on the type and frequency of deviations from the plan. Our

measurement system captured more detailed aspects of implementation, including the type and

frequency of errors, and even recorded implementation for each procedure step rather than by

procedure component. However, our system yielded low agreement. It may be useful to have more

conservative measures of implementation, but they need to be reliable. Additional research is

needed to identify features that make more detailed procedural fidelity measurement systems

feasible. Researchers might parametrically manipulate the number of components or steps in a

procedural-fidelity checklist and determine the point at which observers no longer have

acceptable levels of agreement.

There may have been several other variables contributing to low IOA. We obtained low

IOA in Study 1 and high IOA in Study 2, despite using the same checklist across studies. There

were methodological differences that may have made the measurement system more feasible in

Study 2 compared to Study 1. First, observers collected data live during Study 1 but collected

data from video in most cases during Study 2. When procedural fidelity data were collected from

video, observers could pause, rewind, and fast-forward. Observers could have additional time to

collect data or re-watch the interaction when needed to make accurate discriminations. Second,

observers for Study 2 were provided a cheat sheet on how to classify interactions. The sheet

listed each BIP step and described what researcher interactions would be classified as correct,

omission, or commission. Observers could reference this sheet at any time to inform how they

classified interactions. Third, there was more correspondence between the checklist and the BIP

as implemented during Study 2 relative to Study 1 because the researcher was trained to

implement the BIP steps precisely. Even when the BIP was implemented with no fidelity (all

omission errors), it was evident when the researcher did not engage in an interaction as planned.

For example, the researcher attended to Dakota’s hand raises in the HF condition and did not

attend to Dakota’s hand raises in the NF condition. Collection of procedural fidelity data for

complex BIPs may be more feasible when observers can use video recording, which would allow

for additional time and reference to supporting resources. When video recording is not feasible,

data collection may need to be initially limited to just a few components of the BIP.

We had difficulty operationalizing the details of the BIPs in the students' files, even with

the teachers' input and despite teachers often having input on the development of the BIP. The

challenges with operationalization may be due to a lack of guidance on how to write BIPs to

precisely describe interventions while maintaining language that is accessible to teachers.

Existing recommendations about BIP writing focus mainly on components to include in BIPs,

such as providing background information on the client, defining target and alternative behavior,

or including interventions that reward positive behavior (Higgins et al., 2023; Horner et al.,

2000; Quigley et al., 2018; Williams & Vollmer 2015). Teachers bring their own histories to the

classroom, which likely influences how they interpret and implement BIPs. Personal

interpretation may be particularly likely when the BIP is not operationalized.

Interventions need to be clearly described and easy to carry out

with high fidelity. Teachers receive inadequate training in classroom management, and student

disruptive behavior contributes to burnout, which in turn contributes to teacher shortages (National

Council on Teacher Quality, 2014; Kollerová et al., 2021). Additionally, teacher buy-in is a

barrier to the adoption of positive behavior supports in the classroom (Kincaid et al., 2007). If

interventions are incomprehensible or difficult to implement, it may only compound the problem

as disruptive behavior may persist and further reduce buy-in from the teacher. The next step for

improving measures of procedural fidelity may actually be addressing how to operationalize

BIPs, as identifying the steps in the plan is critical to be able to identify deviations from the plan.

Teachers may deviate from the plan because those deviations suppress challenging

behavior. In some cases, deviations appear to be valuable strategies that could be incorporated

into a student’s BIP. For example, Rachel’s one-on-one interactions with Dakota, which

provided encouragement, assurance, and physical interaction, facilitated outcomes. However,

just because a deviation suppresses challenging behavior does not make it an advantageous

deviation. Deviations may appear beneficial in the moment but may be detrimental in the long

term. For example, Wybie’s teacher was provided access to food whenever Wybie requested and

sometimes offered food even before a request. If Wybie was no longer provided access to food

throughout the day, it presumably may result in a dramatic increase in challenging behavior. This

deviation might not be sustainable long term. Researchers and clinicians should look for

deviations from the plan and hypothesize the possible effects the deviation has on outcomes.

Then, the practicality and longevity of deviations should be assessed by determining the

available resources and reviewing the long-term goals for the client.

References

Allen, K. D., & Warzak, W. J. (2000). The problem with parental nonadherence in clinical

behavior analysis: Effective treatment is not enough. Journal of Applied Behavior

Analysis, 33(3), 373-391. https://doi.org/10.1901%2Fjaba.2000.33-373

Bannerman, D. J., Sheldon, J. B., Sherman, J. A., & Harchik, A. E. (1990). Balancing the right

to habilitation with the right to personal liberties: The rights of people with

developmental disabilities to eat too many doughnuts and take a nap. Journal of Applied

Behavior Analysis, 23(1), 79-89. https://doi.org/10.1901%2Fjaba.1990.23-79

Bergmann, S., Kodak, T., & Harman, M. J. (2021). When do errors in reinforcer delivery affect

learning? A parametric analysis of treatment integrity. Journal of the Experimental

Analysis of Behavior, 115(2), 561-577. https://doi.org/10.1002/jeab.670

Blackwell, W. H., & Rossetti, Z. S. (2014). The development of Individualized Education

Programs: Where have we been and where should we go now? Sage Open, 4(2).

https://doi.org/10.1177/2158244014530411

Brand, D., Henley, A. J., DiGennaro Reed, F. D., Gray, E., & Crabbs, B. (2019). A review of

published studies involving parametric manipulation of treatment integrity. Journal of

Behavioral Education, 28(1), 1-26. https://doi.org/10.1007/s10864-018-09311-8

Breaux, C. A., & Smith, K. (2023). Assent in applied behaviour analysis and positive behavior

support: ethical considerations and practical recommendations. International Journal of

Developmental Disabilities, 69(1), 111-121.

https://doi.org/10.1080/20473869.2022.2144969

Carroll, R. A., Kodak, T., & Fisher, W. W. (2013). An evaluation of programmed treatment-

integrity errors during discrete-trial instruction. Journal of Applied Behavior Analysis,

46(2), 379-394. https://doi.org/10.1002/jaba.49

Codding, R. S., Feinberg, A. B., Dunn, E. K., & Pace, G. M. (2005). Effects of immediate

performance feedback on implementation of behavior support plans. Journal of Applied

Behavior Analysis, 38(2), 205-219. https://doi.org/10.1901/jaba.2005.98-04

Codding, R. S., Livanis, A., Pace, G. M., & Vaca, L. (2008). Using performance feedback to

improve treatment integrity of classwide behavior plans: An investigation of observer

reactivity. Journal of Applied Behavior Analysis, 41(3), 417-422.

https://doi.org/10.1901/jaba.2008.41-417

Colón, C. L., & Wallander, R. (2023). Treatment integrity. In J. L. Matson (Ed.), Handbook of

applied behavior analysis: Integrating research into practice (pp. 439-463). Springer

Cham. https://doi.org/10.1007/978-3-031-19964-6

Cook, J. E., Subramaniam, S., Brunson, L. Y., Larson, N. A., Poe, S. G., & St. Peter, C. C.

(2015). Global measures of treatment integrity may mask important errors in discrete-trial

training. Behavior Analysis in Practice, 8(1), 37-47. https://doi.org/10.1007/s40617-014-

0039-7

Cooper, J. O., Heron, T. E., & Heward, W. L. (2007). Applied behavior analysis (2nd ed.).

Pearson.

DiGennaro Reed, F. D., Reed, D. D., Baez, C. N., & Maguire, H. (2011). A parametric analysis

of errors of commission during discrete-trial training. Journal of Applied Behavior

Analysis, 44(3), 611-615. https://doi.org/10.1901/jaba.2011.44-611

Donnelly, M. G., & Karsten, A. M. (2017). Effects of programmed teaching errors on acquisition

and durability of self-care skills. Journal of Applied Behavior Analysis, 50(3), 511-528.

https://doi.org/10.1002/jaba.390

Fiske, K. E. (2008). Treatment integrity of school-based behavior analytic interventions: A

review of the research. Behavior Analysis in Practice, 1(2), 19-25.

https://doi.org/10.1007%2FBF03391724

Foreman, A. P., St. Peter, C. C., Mesches, G. A., Robinson, N., & Romano, L. M. (2021).

Treatment integrity failures during timeout from play. Behavior Modification, 45(6), 988-

1010. https://doi.org/10.1177/0145445520935392

Foreman, A. P., Romano, L. M., Mesches, G. A., & St. Peter, C. C. (2022). A translational

evaluation of commission fidelity errors on differential reinforcement of other behavior.

The Psychological Record. https://doi.org/10.1007/s40732-022-00528-8

Garcia, E., Han, E., & Weiss, E. (2022). Determinants of teacher attrition: Evidence from

district-teacher matched data. Education Policy Analysis Archives, 30(25).

https://doi.org/10.14507/epaa.30.6642

Hagermoser Sanetti, L. M., & Kratochwill, T. R. (2011). An evaluation of treatment integrity

planning protocol and two schedules of treatment integrity self-report: Impact on

implementation and report accuracy. Journal of Educational and Psychological

Consultation, 21, 284-308. https://doi.org/10.1080/10474412.2011.620927

Han, J. B., Bergmann, S., Brand, D., Wallace, M. D., St. Peter, C. C., Feng, J. & Long, B. P.

(2022). Trends in reporting procedural integrity: A comparison. Behavior Analysis in

Practice, 16, 388-398. https://doi.org/10.1007/s40617-022-00741-5

Harvey, O. B., & St. Peter C. C. (2024). Classifying fidelity errors in the context of behavioral

treatment [Unpublished Manuscript]. Department of Psychology, West Virginia

University.

Higgins, J. P., Riggleman, S., & Lohmann, M. J. (2023). A practical guide to writing behavior

intervention plans for young children. The Journal of Special Education Apprenticeship,

12(1). https://doi.org/10.58729/2167-3454.1160

Holcombe, A., Wolery, M., & Snyder, E. (1994). Effects of two levels of procedural fidelity with

constant time delay on children’s learning. Journal of Behavioral Education, 4(1), 49-73.

https://doi.org/10.1007/BF01560509

Horner, R. H., Sugai, G., Todd, A. W., & Lewis-Palmer, T. (2000). Elements of behavior support

plans: A technical brief. Exceptionality, 8(3), 205-215.

https://doi.org/10.1207/S15327035EX0803_6

Individuals with Disabilities Education Improvement Act, 20 U.S.C. § 300.530(f) (2004).

https://sites.ed.gov/idea/regs/b/e/300.530/f

Jenkins, S. R., Hirst, J. M., & DiGennaro Reed, F. D. (2015). The effects of discrete-trial training

commission errors on learner outcomes: An extension. Journal of Behavioral Education,

24(2), 196-209. https://doi.org/10.1007/s10864-014-9215-7

Jones, S. H., & St. Peter, C. C. (2022). Nominally acceptable integrity failures affect

interventions involving intermittent reinforcement. Journal of Applied Behavior Analysis,

55(4), 1109-1123. https://doi.org/10.1002/jaba.944

Kincaid, D., Childs, K., Blase, K. A., & Wallace, F. (2007). Identifying barriers and facilitators

in implementing schoolwide positive behavior support. Journal of Positive Behavior

Interventions, 9(3), 174-184. https://doi.org/10.1177/10983007070090030501

Kollerová, L., Květon, P., Zábrodská, K., & Janošová, P. (2021). Teacher exhaustion: The

effects of disruptive student behaviors, victimization by workplace bullying, and social

support from colleagues. Social Psychology of Education, 26, 885-902.

https://doi.org/10.1007/s11218-023-09779-x

Leon, Y., Wilder, D. A., Majdalany, L., Myers, K., & Saini, V. (2014). Errors of omission and

commission during alternative reinforcement of compliance: The effects of varying levels

of treatment integrity. Journal of Behavioral Education, 23(1), 19-33.

http://dx.doi.org/10.1007/s10864-013-9181-5

Morris, C., Jones, S. H., & Oliveira, J. P. (2024). A practitioner’s guide to measuring procedural

fidelity. Behavior Analysis in Practice. https://doi.org/10.1007/s40617-024-00910-8

National Autism Center. (2015). Findings and conclusions: National standards project, phase 2.

Randolph, MA: Author

National Council on Teacher Quality. (2014). Training our future teachers: Classroom

management.

https://www.nctq.org/dmsView/Future_Teachers_Classroom_Management_NCTQ_Repo

rt

Noell, G. H., Gresham, F. M., & Gansle, K. A. (2002). Does treatment integrity matter? A

preliminary investigation of instructional implementation and mathematics performance.

Journal of Behavioral Education, 11(1), 51-67. https://doi.org/10.1023/A:1014385321849

Peterson, L., Homer, A. L., & Wonderlich, S. A. (1982). The integrity of independent variables

in behavior analysis. Journal of Applied Behavior Analysis, 15(4), 477-492.

https://doi.org/10.1901/jaba.1982.15-477

Quigley, S. P., Ross, R. K., Field, S., & Conway, A. A. (2018). Toward an understanding of the

essential components of behavior analytic service plans. Behavior Analysis in Practice,

11(4), 436-444. https://doi.org/10.1007/s40617-018-0255-7

Ravid, A., Lagbas, E., Johnson, M., & Osborne, T. L. (2021). Targeting co-sleeping in children

with anxiety disorders using a modified bedtime pass intervention: A case series using a

changing criterion design. Behavior Therapy, 52(2), 298-312.

https://doi.org/10.1016/j.beth.2020.03.004

Schlichte, J., Yssel, N., & Merbler, J. (2005). Pathways to burnout: Case studies in teacher

isolation and alienation. Preventing School Failure: Alternative Education for Children

and Youth, 50(1), 35-40. http://dx.doi.org/10.3200/PSFL.50.1.35-40

St. Peter Pipkin, C., Vollmer, T. R., & Sloman, K. N. (2010). Effects of treatment integrity

failures during differential reinforcement of alternative behavior: A translational model.

Journal of Applied Behavior Analysis, 43(1), 47-70. https://doi.org/10.1901/jaba.2010.43-

47

St. Peter, C. C., Byrd, J. D., Pence, S. T., & Foreman, A. P. (2016). Effects of treatment-integrity

failures on a response-cost procedure. Journal of Applied Behavior Analysis, 49(2), 308-

328. https://doi.org/10.1002/jaba.291

Solomon, B. G., Klein, S. A., & Politylo, B. C. (2012). The effect of performance feedback on

teachers’ treatment integrity: A meta-analysis of the single-case literature. School

Psychology Review, 41(2), 160-175. https://doi.org/10.1080/02796015.2012.12087518

Vollmer, T. R., Sloman, K. N., & St. Peter Pipkin, C. (2008). Practical implications of data

reliability and treatment integrity monitoring. Behavior Analysis in Practice, 1(2), 4-11.

https://doi.org/10.1007%2FBF03391722

Wickstrom, K. F., Jones, K. M., LaFleur, L. H., & Witt, J. C. (1998). An analysis of treatment

integrity in school-based behavioral consultation. School Psychology Quarterly, 13(2),

141-154. https://doi.org/10.1037/h0088978

West Virginia Department of Education. (2023). West Virginia professional teaching standards.

https://wvde.us/wp-content/uploads/2023/05/WV-Professional-Teaching-Standards-

Final_5-3-2023.pdf

Williams, D. E., & Vollmer, T. R. (2015). Essential components of written behavior treatment

plans. Research in Developmental Disabilities, 36, 323-327.

https://doi.org/10.1016/j.ridd.2014.10.003

Table 1

Student Demographic Information

Participant | Age | Grade in school | Primary language | Sex | Race & ethnicity | Diagnoses | Number of proactive BIP steps
Fabian | 8 | 2 | English | Male | White | ADD, ODD, Depression, Learning Disability | 56
Dakota | 9 | 3 | English | Female | White | ADHD, Anxiety, Mood Regulation Disorder | 30
Warren | 7 | 2 | English | Male | White | N/A | 18
Wybie | 10 | 2 | English | Male | White | ADHD, ODD, Sensory Processing | 22

Note. ADD = attention deficit disorder, ADHD = attention deficit hyperactivity disorder, ODD = oppositional defiant disorder

Table 2

Teacher Demographic Information

Table 3

Examples and Nonexamples of BIP Steps that Meet the Inclusionary Criterion of a Directly

Observable Interaction Between the Teacher and Student

Criterion | Example | Nonexample
Proactive step | When the student completes a worksheet, deliver a token. | If the student refuses to start a worksheet, prompt a directive once every 30 s.
Proactive step | When the student raises their hand, call upon them. | If the student walks out of the classroom, follow them to monitor their safety.
Directly observable interaction between the teacher and student | At the start of each academic task, set one behavioral expectation. | Write the date on the daily data collection document. (not an interaction)
Directly observable interaction between the teacher and student | Throughout activities, honor every appropriate request to leave an area without challenging behavior. | Prepare the token economy materials before the student arrives to school. (not an interaction)
Directly observable interaction between the teacher and student | Before tests for which the student may not feel confident (i.e., math and writing), state a reminder that it is okay to be unsure of answers. | If the student leaves their seat on the bus, the bus driver should prompt the student to sit back in their seat. (interaction is not by the teacher; step is not proactive)
Directly observable interaction between the teacher and student | At the end of every day, acknowledge areas of success that school day. | If the student engages in an inappropriate void, call their parent. (interaction is not with the student; step is not proactive)
Identifiable in a 15-min observation | At the start of the first work period of the day, orient the student to the token board. | If the student meets the behavioral criteria for 10 consecutive school days, provide a special lunch.
Identifiable in a 15-min observation | When the timer has 10 seconds left, provide a verbal countdown. | Evaluate the students' daily behavioral data weekly.
Identifiable in a 15-min observation | Allow the student to select a quiet toy to take with them to the general education classroom. | Provide a consistent and predictable routine across school days.

Table 4
Number of BIP Steps Included and Excluded for Each Student-Teacher Dyad

Dyad | Included | Excluded (total) | Excluded: Reactive | Excluded: Not a Directly Observable Interaction | Excluded: Not Observable During a 15-min Period | Excluded: Duplicate
Fabian-Kelly | 61 | 48 | 35 | 3 | 6 | 4
Dakota-Rachel | 37 | 22 | 20 | 2 | 0 | 0
Warren-Rhett | 18 | 41 | 29 | 8 | 0 | 4
Wybie-Rhett | 22 | 39 | 35 | 1 | 2 | 1

Table 5

Interobserver Agreement Per Commission Error Not in Plan for Fabian-Kelly

Commission Error Not in Plan | Primary Observer Tallies | Secondary Observer Tallies | IOA Percentage
Acknowledgement/Praise | 16 | 21 | 76%
Allowed to switch or move seats or stand instead of sit | No IOA data
High-probability sequence | No IOA data
Proximity | 2 | 2 | 100%
Extra help | 2 | 0 | 0%
Physical attention | No IOA data
Restricted access to items | No IOA data
Stated a contingency or rule | 0 | 1 | 0%
Access to food during instruction | No IOA data
Allowed the student to show off a skill they requested to | No IOA data
Answered hand raise during independent work | No IOA data

Table 6

Interobserver Agreement Per Commission Error Not in Plan for Dakota-Rachel

Commission Error Not in Plan | Primary Observer Tallies | Secondary Observer Tallies | IOA Percentage
Proximity | 5 | 5 | 100%
Attend… | No IOA data
Reminders | 4 | 2 | 50%
Statements of encouragement or assurance | No IOA data
Provided choices | 1 | 2 | 50%
Let… | No IOA data
Break from academic demands | 1 | 2 | 50%
Access to peers during break | 1 | 2 | 50%
Access to items (Not in the BIP) | 2 | 2 | 100%
Told Dakota to get her dollars | No IOA data
Physical attention | No IOA data
Access to extra teachers | No IOA data
Awarded time with teacher certificate | 1 | 2 | 50%
Access to snack | No IOA data
Access to new classroom | No IOA data

Table 7

Interobserver Agreement Per Commission Error Not in Plan for Warren-Rhett

Commission Error Not in Plan | Primary Observer Tallies | Secondary Observer Tallies | IOA Percentage
Proximity | 5 | 1 | 20%
Restricted access to items | 1 | 0 | 0%
Awarded a token | 2 | 1 | 50%
Praise with token delivery | 0 | 1 | 0%
Statement of gratitude | 1 | 1 | 100%
Advanced notice | 1 | 0 | 0%
Access to items | 0 | 1 | 0%
Allowing the student to sit where he wants | No IOA data
Provided a choice | No IOA data
Granted requests to have access to an activity longer | No IOA data
Screening his view from peers | 1 | 1 | 100%
Physical attention | No IOA data

Table 8

Interobserver Agreement Per Commission Error Not in Plan for Wybie-Rhett

Commission Error Not in Plan | Primary Observer Tallies | Secondary Observer Tallies | IOA Percentage
Proximity | 8 | 12 | 66%
Access to items | 5 | 6 | 83%
Gesture prompt | No IOA data
Honoring appropriate requests for items | 1 | 0 | 0%
Statement of gratitude | 1 | 2 | 50%
Provided a choice | 1 | 1 | 100%
Advanced notice | 1 | 0 | 0%
Awarded a token | 1 | 2 | 50%
Allowing him to sit incorrectly | 1 | 1 | 100%
Physical attention | 1 | 0 | 0%
Statement of encouragement | No IOA data
Visual prompt | No IOA data
Offering access to food | No IOA data

Table 9

Operational Definitions of Student Challenging Behavior

Student | Behavior | Operational Definition
Fabian | Protesting | Refuses to engage in an activity with a reference to his own behavior.
Fabian | Swearing | Says a swear word.
Fabian | Insulting | Says a negative statement about appearance or intellect.
Fabian | Destroying property | Swiping, ripping, crumpling, throwing, or making forceful contact with items, furniture, or the building.
Fabian | Aggressing | Attempting to or making forceful contact with another or throwing large items towards the implementer.
Fabian | Eloping | The student takes one step outside of the research room door without permission from the researcher.
Dakota | Vocalizing loudly | Saying a statement or making noise above conversational volume.
Dakota | Protesting | Refuses to engage in an activity with reference to own behavior.
Dakota | Destroying property | Physically removing academic work from her proximity (swiping work off desk) or swiping materials out of place, throwing, breaking, or otherwise damaging materials or items.
Dakota | Aggressing | Attempting to or making forceful contact (hitting or kicking) towards another person that could cause harm.
Dakota | Injuring self | Attempting to or making forceful contact (hitting) towards self that could cause harm.
Dakota | Eloping | The student takes one step outside of the research room door without permission from the researcher.
Wybie | Protesting | Refuses to engage in an activity with a reference to his own behavior.
Wybie | Negative interactions | Making an insulting, cursing, or threatening statement towards another.
Wybie | Destroying property | Throwing, tipping, or making forceful contact with items.
Wybie | Aggressing | Attempting to or making forceful contact towards another person that could cause harm.
Wybie | Eloping | The student takes one step outside of the research room door without permission from the researcher.

Table 10

Procedural-Fidelity Errors in Study 2 for Dakota

Session Number | BIP Step | Error Type | Error Description
6 | Let her know she can raise her hand if she wants to talk to the teacher (within 15 s) | Omission | Researcher did not implement the step within 15 s
6 | Let her know she can raise her hand if she wants to talk to the teacher (within 15 s) | Commission | Researcher implemented the step after 15 s
13 | Attend to the hand raise | Commission | Researcher attended to Dakota without Dakota raising her hand
14 | Engage in at least one physical interaction (programmed commission error not in plan) | Omission | Did not engage in a physical interaction (limited visibility of interaction due to camera difficulties)
20 | Talk to Dakota for a minute about the task (rephrase the instructions) | Commission | Researcher implemented the step for more than 1 min
21 | Ask Dakota if she would like help with the assignment or to chat with a teacher | Omission (4) | Researcher did not ask
21 | Talk to Dakota for a minute about the task (rephrase the instructions) | Commission | Researcher implemented the step for more than 1 min
24 | Ask Dakota if she would like help with the assignment or to chat with a teacher | Commission | Researcher asked if she would like any help
25 | Talk to Dakota for a minute about the task (rephrase the instructions) | Omission | Researcher did not talk to Dakota for a minute about the task
26 | Talk to Dakota for a minute about the task (rephrase the instructions) | Omission | Researcher did not talk to Dakota for a minute about the task
27 | Talk to Dakota for a minute about the task (rephrase the instructions) | Omission | Researcher did not talk to Dakota for a minute about the task
Note. The numbers in parentheses next to the error type indicate the count of times the error was
made during the session.

Table 11

Procedural-Fidelity Errors in Study 2 for Wybie

Session Number | BIP Step | Error Type | Error Description
1 | N/A | Commission Error Not in Plan (8) | Gesture prompt
2 | Tell him he can raise his hand if he needs help | Omission | Researcher did not tell him that he can raise his hand if he needs help
2 | Tell him to sit in his seat | Omission | Researcher did not tell him to sit in his seat
2 | N/A | Commission Error Not in Plan (5) | Gesture prompt
3 | Praise Wybie once every 5 minutes | Omission | Researcher did not praise Wybie
Note. The numbers in parentheses next to the error type indicate the count of times the error was
made during the session.

Figure 1

Interobserver Agreement of Total Tallies for Each Type of Teacher Interaction

[Scatterplots, one panel per student, of total tallies scored by Observer 1 (x-axis) against Observer 2 (y-axis) for each interaction type: correct, omission, commission error in plan, and commission error not in plan. Spearman correlations: Fabian, r = 1.00, p = 0.0006; Dakota, r = 0.8587, p = 0.0006; Warren, r = 0.6885, p = 0.0008; Wybie, r = 1.00, p = 0.0001.]

Figure 2

Overall Count and Type of Errors by Student-Teacher Dyad

[Stacked bar graph of the count of teacher interactions (0-160) for each student-teacher dyad (Fabian-Kelly, Dakota-Rachel, Warren-Rhett, Wybie-Rhett), partitioned into correct, omission error, commission error in plan, and commission error not in plan.]

Figure 3

Type of Errors by Hour for Fabian-Kelly

[Graph of errors per hour by interaction not reproduced. X-axis label key:]

A: When Fabian is complying with academic instruction, gets to work immediately, raises his hand, does not interrupt a peer when it is their turn to answer, or stays in his area for an entire activity, deliver a token for every 1-3 instances of appropriate behavior
B: Acknowledgement or praise
C: Flexible seating
D: High-probability sequence
E: Proximity
F: Extra help
G: Physical attention
H: Restricted access to possibly distracting items
I: Stated a contingency or rule
J: Access to food during instruction
K: Allowed the student to show off a skill they requested to
L: Answered hand raise during independent work

Figure 4

Type of Errors by Hour for Dakota-Rachel

[Bar graph of errors per hour (0-20) by interaction (x-axis labels A-Z), partitioned into correct, omission error, commission error in plan, and commission error not in plan. X-axis label key:]

A: Throughout academic instruction, when Dakota attends to academic instruction, uses kind words, starts work right away, or completes a teacher directive, provide praise directed specifically to Dakota a minimum of once every 5 minutes
B: Proximity
C: When the 30-minute interval timer sounds or the minutes on the clock are 0 or 30, place a "dollar" for each domain of behavior that met expectations in her "wallet"
D: If Dakota raises her hand, attend to the hand raise
E: When the 30-minute interval timer sounds or the minutes on the clock are 0 or 30, deliver behavior-specific praise to Dakota that describes what she is doing well
F: Reminders
G: Statements of encouragement or assurance
H: Provided choices
I: When presenting a writing instruction, let her know she can raise her hand if she wants to talk to the teacher
J: Break from academic demands
K: Access to peers during break
L: Access to items (not in the BIP)
M: Before the end of the morning block and afternoon block, if Dakota was safe for the entire block, state that if you have enough for break you can take a break right now
N: Before the end of the morning block and afternoon block, if Dakota was safe for the entire block, state that it is time for dollars to be exchanged for reward(s)
O: Before the end of the morning block and afternoon block, if Dakota was safe for the entire block, remind the student that they can borrow from their bank if they need to
P: Before the end of the morning block and afternoon block, if Dakota was safe for the entire block, tell Dakota to count her dollars to determine what she would like
Q: Told Dakota to get her dollars
R: When presenting a writing instruction, talk to Dakota for a minute about the task
S: Physical attention
T: Access to extra teachers
U: Awarded time with teacher certificate
V: Access to snack
W: If Dakota requests help from a teacher on a task within her skill set, ask Dakota if she would like help with the assignment or to chat with a teacher
X: If Dakota would like to exchange a certificate at a time an exchange cannot occur or makes a request that cannot be granted, provide an alternative option
Y: If Dakota would like to exchange a certificate at a time an exchange cannot occur or makes a request that cannot be granted, praise acceptance of the choice of an alternative option
Z: Access to new classroom

Figure 5

Type of Errors by Hour for Warren-Rhett

[Graph of errors per hour by interaction not reproduced. X-axis label key:]

A: Proximity
B: Once per activity while he is on-task and working, praise or acknowledge him
C: Each time an assignment is to be presented, if a group assignment is available, ask him if he would like to work with a teacher or do a group activity
D: Restricted access to possibly distracting items
E: Awarded a token
F: Praise with token delivery
G: Statement of gratitude
H: Advanced notice
I: Once per activity while he is on-task and working, engage in a physical interaction
J: If he completes an activity without major challenging behavior, provide access to the selected reward
K: If he completes an activity without major challenging behavior, allow him to pick a reward
L: Access to items
M: Flexible seating
N: If Warren makes a reasonable request and is not engaging in challenging behavior, honor it
O: Provided a choice
P: Granted requests to have access to an activity longer
S: Screening his view from peers
T: Physical attention (not in BIP)
U: During transitions, allow and/or encourage Warren to bring a book with him or have an adult walk with him

Figure 6

Type of Errors by Hour for Wybie-Rhett

[Graph of errors per hour by interaction not reproduced. X-axis label key:]

A: Proximity
B: Praise Wybie once every 5 minutes
C: Access to items
D: Gesture prompt
E: Honoring appropriate requests for items
F: Statement of gratitude
G: If an activity is above his level in English Language Arts, state instructions about how he can request help or an alternative activity
H: Provided a choice
I: Advanced notice
J: Awarded a token
K: Flexible seating
L: Physical attention
M: Statement of encouragement
N: Visual prompt
O: Offering access to food
P: When Wybie must complete an academic activity independently, offer for him to sit near the teacher or in a seat of his choice


Figure 7

Interobserver Agreement of Total Tallies for Each Type of Experimenter Interaction

[Scatterplots, one panel per student, of total tallies scored by Observer 1 (x-axis) against Observer 2 (y-axis) for each interaction type: correct, omission, commission error in plan, and commission error not in plan. Spearman correlations: Fabian, r = 1.00, p = 0.0008; Dakota, r = 0.90, p < 0.0001; Wybie, r = 0.81, p = 0.25.]


Figure 8

Study 2 Experimental Designs


[Line graphs of all challenging behavior per minute across sessions, one panel per student: Fabian (HF, then NF), Dakota (HF, FCE, HF, FCE, with a single NF probe), and Wybie (HF only).]

Note. HF = High-Fidelity condition, NF = No-Fidelity condition, and FCE = Facilitative


Commission Error condition.


APPENDIX A. Teacher Demographic Questionnaire.


APPENDIX B. Student Demographic Questionnaire.


APPENDIX C. Teacher Questionnaire.


APPENDIX D. Procedural Fidelity Checklist for Fabian.


APPENDIX E. Procedural Fidelity Checklist for Dakota.


APPENDIX F. Procedural Fidelity Checklist for Warren.


APPENDIX G. Procedural Fidelity Checklist for Wybie.

