
Investigations in Mathematics Learning

ISSN: 1947-7503 (Print) 2472-7466 (Online) Journal homepage: https://www.tandfonline.com/loi/uiml20

Mathematics Classroom Observation Protocol for Practices (MCOP2): A validation study

Jim Gleason, Stefanie Livers & Jeremy Zelkowski

To cite this article: Jim Gleason, Stefanie Livers & Jeremy Zelkowski (2017) Mathematics Classroom Observation Protocol for Practices (MCOP2): A validation study, Investigations in Mathematics Learning, 9:3, 111-129, DOI: 10.1080/19477503.2017.1308697

To link to this article: https://doi.org/10.1080/19477503.2017.1308697

Published online: 11 Apr 2017.

INVESTIGATIONS IN MATHEMATICS LEARNING
2017, VOL. 9, NO. 3, 111–129
https://doi.org/10.1080/19477503.2017.1308697

ARTICLES

Mathematics Classroom Observation Protocol for Practices (MCOP2): A validation study

Jim Gleason (a), Stefanie Livers (b), and Jeremy Zelkowski (b)

(a) Department of Mathematics, The University of Alabama, Tuscaloosa, Alabama; (b) Department of Curriculum & Instruction, The University of Alabama, Tuscaloosa, Alabama

ABSTRACT
This article reports the robust validation process of the Mathematics Classroom Observation Protocol for Practices (MCOP2). The development of the instrument took into consideration both direct and dialogic instruction encompassing classroom interactions for the development of conceptual understanding, specifically examining teacher facilitation and student engagement. Instrument validation involved feedback from 164 external experts for content validity, interrater and internal reliability analyses, and structure analyses via scree plot analysis and exploratory factor analysis. Taken collectively, these results indicate that the MCOP2 measures the degree to which actions of teachers and students in mathematics classrooms align with practices recommended by national organizations and initiatives using a two-factor structure of teacher facilitation and student engagement.

KEYWORDS
Mathematics teaching practices; mathematics understanding; observation protocol; student discourse; student engagement; teaching facilitation

Introduction
For the past 30 years, mathematical organizations in the United States have created recommendations and guidelines of best practices based on research in mathematics teaching for kindergarten through undergraduate coursework (American Mathematical Association of Two-Year Colleges [AMATYC], 1995, 2006; Barker et al., 2004; Conference Board of the Mathematical Sciences [CBMS], 2016; National Council of Teachers of Mathematics [NCTM], 1989, 1991, 2000, 2014; National Governors Association Center for Best Practices, Council of Chief State School Officers [NGACBP, CCSSO], 2010; National Research Council [NRC], 2003). While the recommendations for each grade level include unique characteristics, all of them revolve around common characteristics of mathematics teaching: conceptual understanding, procedural fluency, and student engagement. These mathematical practices involve students engaged in problem solving, modeling with mathematics, using a variety of means to represent mathematical concepts, communicating about mathematics, and using the structure of mathematics to better understand concepts and their connections to other concepts, both inside and outside of mathematics.
The construct of conceptual understanding focuses on students making connections among
mathematical operations, representations, structure, reasoning, and modeling through discourse
(Hiebert & Carpenter, 1992; Hiebert & Grouws, 2007; NRC, 2003; Resnick & Ford, 1981; Skemp,
1971, 1976). A deep understanding of mathematics enables students to apply their knowledge and
abilities in a variety of contexts within both procedural and conceptual understanding (Rittle-
Johnson, Siegler, & Alibali, 2001). Hiebert and Grouws (2007) state, “there is no reason to believe,
based on empirical findings or theoretical arguments, that a single method of teaching is the most
effective for achieving all types of learning goals” (p. 374). As such, if a specific set of lesson
objectives revolves primarily around improving procedural fluency and efficiency through practices

CONTACT Jim Gleason jgleason@ua.edu Department of Mathematics, The University of Alabama, Box 870350,
Tuscaloosa, AL 35487-0350, USA.
© 2017 Research Council on Mathematics Learning
such as drill worksheets, the lesson would not fit well within this construct of conceptual understanding, though such practices are beneficial at times within the larger context of mathematics instruction.
The mathematics education community has produced a breadth of empirical research over the last five decades regarding best teaching practices, student learning, and their interconnections. This research identifies dialogic instruction as a desired means of teaching and learning during the standards-based era launched in 1989 (NCTM, 2000; NGACBP, CCSSO, 2010), while allowing for properly enacted direct instruction that addresses the critical role of numerical fluency in both later mathematics and everyday practice, the centrality of precise communication and careful reasoning to the discipline of mathematics, and problem-solving skills (Ball et al., 2005; Munter, Stein, & Smith, 2015). Historically, the Reformed Teaching Observation Protocol (RTOP; Sawada et al., 2002) carried the bulk of the weight as the primary tool for evaluation and research projects in mathematics and science teaching. However, as the United States moved into the era of Common Core State Standards implementation, classroom practices were expected to follow the Standards for Mathematical Practice (SMPs) in the K–12 setting, with which the RTOP does not fully align because it does not focus on both teacher and student practices. Because little evidence exists of large-scale practices and implementation at the classroom level across the United States, the need for mathematics classroom observation protocols for the practices of classroom teachers and students has extended beyond the research community.
Following the Standards for Educational and Psychological Testing (AERA, APA, & NCME, 2014),
this article presents sources of validity evidence that include evidence based on test content, response
processes, and internal structure of the MCOP2. First, we present the existing observation protocols
and discuss the need for an instrument focused on a comprehensive view of mathematics practices.
Then we describe the development process for the MCOP2 and the theoretical framework of the instrument. We end with a discussion of appropriate uses of the MCOP2 for both practitioners and researchers.

Mathematics classroom observation protocols


With both the unification and diversity of mathematics teaching practices (Munter et al., 2015), a
need exists to measure the degree to which different units of study (such as a single classroom
period, a unit during a course, or the entire course) align with different practice recommendations.
Primary methods of such measurement include fine-grained analyses such as video analysis (Hill
et al., 2008; Hill, Charalambous, Blazar, et al., 2012), instructional logs (Ball & Rowan, 2004), teacher
and student interviews (Ball & Rowan, 2004, Munter, 2014), analyzing student work (Matsumura,
Garnier, Slater, & Boston, 2008), mathematical task analysis (Stein, Grover, & Henningsen, 1996),
and large-grain tools such as self-report survey questionnaires (Mullens & Kasprzyk, 1999) and live
classroom observation protocols (Matsumura, Garnier, Pascal, & Valdes, 2002, Sawada et al., 2002;
Walkowiak, Berry, Meyer, Rimm-Kaufman, & Ottmar, 2014). Each of these tools has strengths and weaknesses but can be very useful for understanding specifics of mathematics teaching. In particular, classroom observation protocols allow researchers to measure mathematics teaching, at a basic level, in large numbers of classes.
Researchers and evaluators use classroom observation protocols for various purposes and with varying degrees of specificity toward measuring teaching practices (Maxwell, 1917). While classroom observation is a long-standing area of research, the emerging use of instrumentation protocols specific to evaluating multidimensional aspects of mathematics and science teaching traces back to Horizon's 1998 Core Evaluation Classroom Observation Protocol, which later became the Inside the Classroom Observation and Analytic Protocol (ITC COP), alongside the Local Systemic Change through Teacher Enhancement Protocol (LSC COP) in 2000 (Henry, Murray, & Phillips, 2007).
Developed at the same time as the Horizon instrumentation, the Reformed Teaching Observation
Protocol (RTOP) (Sawada et al., 2002) has many similarities and correlations with the ITC COP and
the LSC COP.

Observation tools used for practice, research, and evaluation serve the purposes of their development foci and/or research questions. The RTOP, IQA, MQI, and M-SCAN specifically have many productive uses but also some drawbacks (Boston, Bostic, Lesseig, & Sherman, 2015). We summarize the origins of, and concisely compare, the most recent and widely used instruments developed for use in mathematics classrooms, namely the RTOP, IQA, MQI, and M-SCAN, in Appendix A, and refer readers to Boston et al. (2015) for in-depth analyses of each protocol.

Mathematics Classroom Observation Protocol for Practices


The Mathematics Classroom Observation Protocol for Practices (MCOP2) measures the practices within
the mathematics classroom for teaching lessons that are goal-oriented toward conceptual understanding as
described by standards documents put forth by national organizations focused on mathematics teaching
and learning such as the National Council of Teachers of Mathematics (1989, 1991, 2000, 2014), American
Mathematical Association of Two-Year Colleges (1995, 2006), Mathematical Association of America
(Barker et al., 2004), Conference Board of the Mathematical Sciences (2016), National Research Council
(2003), and National Governors Association Center for Best Practices of the Council of Chief State School
Officers (2010). Specifically, the MCOP2, grounded in the instruction as interaction framework (Cohen, Raudenbush, & Ball, 2003) through its initial development and revision processes, examines teaching mathematics for conceptual understanding through the lenses of teacher facilitation and student engagement. The development of the MCOP2 involved a multistage iterative process over three years based on the standards for scale development (AERA, APA, & NCME, 2014; Bell et al., 2012; DeVellis, 2011).

Evolved theoretical framework


Because understanding mathematical concepts and procedures encompasses multiple classroom practices
for which both teacher and student engage in productive discourse, tasks, and reasoning (Barker et al., 2004;
NCTM, 2006, 2011; Stein, Engle, Smith, & Hughes, 2008; Stein, Smith, Henningsen, & Silver, 2009), we
follow the definition of Hiebert and Grouws (2007) of teaching as “classroom interactions among teachers
and students around content directed toward facilitating students’ achievement of learning goals” (p. 377).
This definition aligns well with the instruction as interaction framework of Cohen et al. (2003) focusing on
the interactions among the teacher, the mathematical content, and the students. Thus, the instruction as
interaction framework formed the foundation for the initial item development.
Through external expert surveys conducted as part of the validation process, we revisited the framework after almost all items were categorized into "Lesson Implementation" (Appendix E). We realized that our instrument aligns more appropriately with the framework of the mathematics classroom as a community of learners (Rogoff, Matusov, & White, 1996), with authority and responsibility shared between the teacher and the students. This was confirmed through internal structure analyses, with items from the two primary factors falling into categories focusing on the teacher or the student roles and responsibilities in the classroom.
In the community of learners framework, the teacher acts as a facilitator who provides structure for the lesson, targets students' zones of proximal development (Vygotsky, 1978), and gives guidance in the problem-solving process (Polya, 1945, 1957), while promoting positive, productive discourse norms within the classroom (Cobb, Wood, & Yackel, 1993; Nathan & Knuth, 2003). Likewise, student engagement includes actively participating in the lesson content, persisting through the problem-solving process, and engaging in productive discourse as an essential element for making the connections necessary for a conceptual understanding of mathematics. Neither of these roles and responsibilities thrives without the support of the other. Therefore, for a mathematics classroom to fit strongly within this framework, both teacher and students must fulfill their roles and responsibilities.

Teacher facilitation
Current research and reform efforts call for a teacher who can facilitate high-quality tasks, interac-
tions, and student reasoning (Barker et al., 2004; NCTM, 1991, 2006, 2011; NGACBP, CCSSO, 2010;
NRC, 2003; Stein et al., 2008, 2009). Key elements to consider with teacher facilitation of a task include the amount of scaffolding (Anghileri, 2006), productive struggle (Hiebert & Grouws, 2007), the level of questioning (Reys, Lindquist, Lambdin, & Smith, 2015; Schoenfeld, 1998; Winne, 1979), and peer-to-peer discourse (Cobb et al., 1993; Nathan & Knuth, 2003) that occur with a mathematical task, while keeping in mind the complex interplay between teachers' decisions and students' responses (Brophy & Good, 1986).
Teachers also take the leading role in creating and maintaining discourse that encourages student
thinking and communication (Lampert, 1990; Moschkovich, 2007; Sherin, Mendez, & Louis, 2004).
Such discourse must be respectful, student-centered, and occur peer-to-peer and student-to-teacher
to assist in developing equitable spaces where all students have agency and power to participate
equally (Hiebert & Grouws, 2007; NCTM, 2000; Sherin et al., 2004; Yackel & Cobb, 1996). These
characteristics of quality mathematics teaching are difficult to achieve, even with master teachers, but
should be the goal of all teachers (Silver, Mesa, Morris, Star, & Benken, 2009).

Student engagement
Student learning cannot occur without student engagement. Bruner (1960, 1961, 1966) and Vygotsky
(1978) emphasized the students’ environment as one of social interactions to develop, learn, and
understand. The interactions between student and teacher provide this essential learning culture that
differs from compliance. Engagement encompasses both emotional and behavioral contributions (Marks, 2000) to the learning of mathematics and involves "intensity and quality of participation in classroom activities, as seen in such things as students' ability to contribute materially and discursively to ongoing work and to build on one another's contributions over the long haul" (Azevedo, diSessa, & Sherin, 2012, p. 270).
As part of a mathematics community, students must engage in and contribute to the learning experience (Hiebert et al., 1997) in order to develop mathematical habits (Cuoco, Goldenberg, & Mark, 1996) such as making connections, recognizing patterns, and using repeated reasoning across similar tasks and experiences, which build a foundation for the conceptual understanding of mathematics (Ball, 1990; Ma, 1999). Students can demonstrate their thinking using a variety of means (models, drawings, graphs, materials, manipulatives, and/or equations) and assess their own strategies and the strategies of their peers and teacher in order to process and understand the mathematics. Students further demonstrate engagement by questioning, commenting, and extending their understanding through this type of critical thinking and assessment.

Validation process design


Following commonly accepted processes for scale development (DeVellis, 2011), we present evidence
for the validity of the MCOP2 based on test content, internal structure, and response processes
(AERA, APA, & NCME, 2014). The test content was validated through an iterative process of expert
surveys to clarify the items and their descriptors and to verify that the items measure the desired
constructs of classroom interactions. After the items and descriptors were modified based on the test content validation process, data collected with the instrument were used to determine the internal structure of the instrument using Horn's parallel analysis (DeVellis, 2011) and an exploratory factor analysis. This process resulted in two primary factors for the instrument, creating subscales. The interrater reliability was then analyzed on these subscales to provide evidence based on response processes.

Initial item development process


Using current mathematics education literature regarding classroom teaching and learning, a group
of six individuals in the field of mathematics education created 18 items focused on the interactions
of the mathematics classroom understood to promote conceptual understanding. The process
adapted some items from other instruments and created others to incorporate the framework and
language of the Standards for Mathematical Practices. (See Appendix B for relationship of the items
with the Standards for Mathematical Practice.) The rubrics for the MCOP2 items were developed through an iterative process that involved watching classroom videos as a group and determining specific criteria for each level in the rubric, along with referencing related literature for specific interactions. This process resulted in a user guide with detailed descriptors and rubrics (Gleason, Livers, & Zelkowski, 2015), along with an abridged user guide containing only the rubrics (Appendix C).

Feedback process from external expert panels


Initial expert panel
The Association of Mathematics Teacher Educators (AMTE) is an organization committed to
improving mathematics teacher education. Membership includes approximately 1,000 mathematics
teacher educators and provides a purposeful sample for this study. When the characteristics and
qualifications of the population are defined, purposeful sampling is a credible practice (Shadish,
Cook, & Campbell, 2002). In early 2014, the instrument development team sent an invitation to the
AMTE membership to participate in an initial online survey asking for feedback on the initial pool
of 18 items and their perceived usefulness in measuring various aspects of the mathematics class-
room. The initial survey asked the participants to rank the usefulness of the item to measure
mathematics instruction on a three-point scale (essential, not essential but useful, and not necessary)
and provide comments about the items. The initial three-point scale was chosen to determine the
usefulness of the items and allow individuals to differentiate between levels of importance.
For the initial expert panel online survey, 164 of 900 (18%) individuals fully completed the online
survey. Of these 164 professionals in mathematics education, 37% practiced in departments of
mathematics, 49% practiced in departments or colleges of education, and 14% held either joint
appointments or other types of positions. The respondents held positions as instructors (15%),
assistant professors (35%), associate professors (22%), and full professors (18%) and included a large
mix of individuals with varying amounts of experience as experts in the field. In addition, 53% of the
respondents currently supervised teachers, 29% had supervised teachers in the past, and 18% had
never supervised teachers. Of the 164 experts who completed the full survey, 65 (40%) agreed to
participate in a follow-up expert panel review by providing their email address during the initial
expert panel survey.
For each item, over 94% of the responses rated the item as either "essential" or "not essential but useful." In addition, over 10% of the respondents included an unsolicited comment along the lines of "these ideas are all critical to math lessons, but it is probably not reasonable to expect that all of them are present in every lesson" to explain why items were marked as not essential but useful (Appendix D). Based on these responses and the comments about the items, all but two of the original 18 items remained in the pool, with minor edits to their phrasing; the largest such change replaced "Students engaged in flexible alternative modes of investigation/problem solving" with "Students engaged in exploration/investigation/problem solving." One of the removed items was "Students explored prior to formal presentation," which received a number of comments about possible biases toward specific teaching methods and ambiguity about the meaning of its terms. The second removed item was "The lesson promoted connections across the discipline of mathematics," which received comments regarding ambiguity as to what constituted another area of mathematics.

Secondary expert panel


Following the revision of the item pool based on the results of the initial survey, the 65 respondents from the initial survey who had indicated a willingness to provide further feedback received a request to complete a second survey. This survey provided detailed descriptors of the different point values for the items to enable the respondents to understand the intended use. It also gave the respondents an overview of the theoretical construct of the instruction as interaction framework and asked them to categorize each item into the four factors of Lesson Design, Lesson Implementation, Student Engagement with the Content, and Student Engagement with Peers. Due to the a priori nonindependence of these four factors, the respondents classified each item into as many of the factors as they saw appropriate. For each item, the respondents described the extent to which scoring high on the item represents a lesson positively with regard to each factor on a four-point scale of extremely well, well, somewhat, and not at all. The researchers compressed these responses into two categories: "extremely well" and "well" were compressed into a positive category and the others into a negative category. This decision relied on the strong relationships among all interactions within a classroom, justifying the exclusion of the "somewhat" responses from the positive category so as to include only items that fit strongly within each construct. The expert panel's theoretical structure of the instrument was then determined from these responses using the content validity ratio (Ayre & Scally, 2014). The second survey also asked these experts to provide feedback regarding the sufficiency of the items in measuring the desired construct by asking whether any key components of the construct were missing from the items (DeVellis, 2011).
For the follow-up expert online survey, 26 of the 65 (40%) individuals who agreed to a second follow-up completed the follow-up survey fully. Over 85% of the respondents had experience supervising, observing, and/or evaluating mathematics teaching and represented a good mixture of individuals in terms of job positions (instructor/lecturer, 7%; assistant professor, 19%; associate professor, 33%; full professor, 30%; and other, 11%), years of experience in higher education (0–3 years, 7%; 4–6 years, 11%; 7–10 years, 11%; and 10+ years, 70%), and years of experience teaching mathematics in K–12 classrooms (0–3, 22%; 4–6, 30%; 7–10, 22%; and 10+, 26%). This mixture of respondents corresponds well with the desired population of experts in mathematics education in both research and practice.
With 26 respondents, an item was categorized under a factor if at least 18 respondents (≈70%) classified the item as fitting the factor extremely well or well, rather than the negatively oriented responses of somewhat and not at all. Based on the value of the content validity ratio (Ayre & Scally, 2014) necessary to reject the null hypothesis that an item does not fit the construct, we present the results of the follow-up survey of experts in Appendix E. Because all 16 remaining items loaded on at least one factor, all were included in the final instrument.
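The 18-of-26 cutoff follows from Lawshe's content validity ratio, for which Ayre and Scally (2014) tabulate exact critical values. As a rough illustration (the function name and this minimal formulation are ours, not the authors'), the ratio compares how many panelists endorse an item against the half-panel baseline:

```python
def content_validity_ratio(n_agree, n_panelists):
    """Lawshe's CVR: +1 when every panelist endorses the item,
    0 when exactly half do, and -1 when none do."""
    half = n_panelists / 2
    return (n_agree - half) / half

# 18 of 26 experts rating an item "well" or "extremely well":
print(content_validity_ratio(18, 26))  # about 0.385
```

Ayre and Scally's exact binomial tables then determine whether a given CVR is large enough to reject chance-level agreement for the panel size at hand.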
The fact that the external expert panel indicated that all of the items loaded onto the lesson implementation category was somewhat surprising, though not completely unexpected, because the activities and interactions occurring within the lesson rely heavily on the teacher's implementation of the lesson. Some of the items designed to fit within the lesson design and implementation constructs also loaded on the student engagement with content category from the expert panel. Similarly, some of the items designed to fit within the student engagement constructs were also loaded by the expert panel onto the lesson design construct. This reinforced the nonindependence of the factors and provided guidance in the categorization of the items into student engagement and teacher facilitation subscales, with the final decision for inclusion resting with the instrument development team.
Over half of the respondents to the survey included at least one comment in the questions that asked for additional comments about each item and whether the item might be measuring additional constructs. None of the comments expressed a need for additional constructs. Collectively, the comments provided feedback for improving details in the rubric descriptors for the items, and several of the item rubrics received minor revisions. In addition, several respondents expressed a desire for more items focusing on the quality of the tasks in the lesson. We determined that, because some items already address the topic of tasks and because task analysis guides (Resnick, 1976; Stein et al., 2009) exist for such a purpose, those guides could supplement the observation protocol in scenarios with a greater focus on mathematical tasks.

Internal structure
Data used for the analysis resulted from observations of 40 elementary, 53 secondary, and 36 tertiary mathematics classrooms in the southeastern United States. The classrooms observed at each grade band included teachers with experience ranging from 0 to 40 years, a gender mixture matching national norms for each grade band, and a mixture of direct and dialogic instruction in the lessons. Horn's parallel analysis (DeVellis, 2011) of a scree plot was used to determine the number of underlying factors in the protocol, followed by an exploratory factor analysis with the number of components suggested by the scree plot analysis to determine whether the data generated by the instrument matched the instrument's theoretical construct, as verified by the external expert survey. Any subscales generated by the factor analysis were then analyzed for internal reliability.
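Horn's parallel analysis retains only those factors whose observed correlation-matrix eigenvalues exceed the eigenvalues obtained from random, uncorrelated data of the same dimensions. The procedure can be sketched in Python with NumPy as follows (the simulation count and the use of mean simulated eigenvalues as the threshold are illustrative choices, not the authors' exact settings):

```python
import numpy as np

def horn_parallel_analysis(data, n_sim=500, seed=0):
    """Count factors whose observed correlation-matrix eigenvalues
    exceed the mean eigenvalues of simulated uncorrelated data
    of the same shape (Horn, 1965)."""
    rng = np.random.default_rng(seed)
    n_obs, n_items = data.shape
    # Eigenvalues of the observed interitem correlation matrix, descending.
    observed = np.sort(np.linalg.eigvalsh(np.corrcoef(data, rowvar=False)))[::-1]
    # Eigenvalues of many random datasets with the same dimensions.
    sim_eigs = np.empty((n_sim, n_items))
    for i in range(n_sim):
        random_data = rng.standard_normal((n_obs, n_items))
        sim_eigs[i] = np.sort(
            np.linalg.eigvalsh(np.corrcoef(random_data, rowvar=False)))[::-1]
    threshold = sim_eigs.mean(axis=0)
    # Number of observed eigenvalues above the chance baseline.
    return int((observed > threshold).sum())
```

With 129 observations on 16 items, two eigenvalues rising above the simulated curve (as in Figure 1) would yield a count of two, supporting the two-factor extraction.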
The eigenvalues of the interitem correlation matrix suggested a two-factor extraction: the first two factors accounted for 55% of the total variance, and a parallel analysis (see DeVellis, 2011) resulted in two eigenvalues with magnitudes greater than those generated by simulated data (see Figure 1). Because the data had a theoretical structure of multiple interdependent factors, we analyzed the data using a maximum-likelihood extraction of two factors with a promax rotation with Kaiser normalization (χ² = 189.48, df = 89, p < 0.001). The resulting pattern matrix (see Table 1) yielded nine items for each of the two factors, with a factor correlation of 0.536. Because the relative magnitudes of the item loadings were similar, the correlation between subscale totals computed from the factor score coefficients and from the raw scores exceeded 0.98 for each subscale. Therefore, a total raw score for each subscale is sufficient, rather than scaled scores based on the item coefficients from the factor score analysis.
The items in the first factor all depend heavily on the actions of the students, so the factor was
identified and labeled as student engagement (SE). The resulting subscale of nine items exhibits a
high internal consistency (Cronbach’s alpha = 0.897), thus effectively measuring differences at the
group level, or at the individual level with at least three observations (Lance, Butts, & Michels, 2006).
The second factor contained items that depended heavily on the actions of the teacher as a facilitator,
so this factor was identified and labeled as teacher facilitation (TF). This subscale also demonstrated
a reliability sufficiently high (Cronbach’s alpha = 0.850) to measure differences at the group level, or
at the individual level with at least three observations (Lance et al., 2006).
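The internal consistency values reported above are Cronbach's alpha, which can be computed directly from the item scores of a subscale. A minimal sketch (variable and function names are ours):

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for an (n_observations x n_items) array of
    item scores belonging to one subscale."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]
    # Sample variances of the individual items and of the total score.
    item_variances = scores.var(axis=0, ddof=1)
    total_variance = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)
```

An alpha near 0.9, as reported for the student engagement subscale, indicates that the nine items vary together closely across observed classrooms.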

Interrater reliability
Following the internal structural analysis, all raters in this study used the final instrument to analyze five
different classroom videos to determine the interrater reliability of the instrument. The raters comprised
five individuals with varying backgrounds. Two of the raters hold doctorates in mathematics education (one
elementary-focused and the other secondary-focused), one rater holds a doctorate in mathematics with
heavy involvement in the mathematics education community (both elementary and secondary), one rater
works as a mathematics specialist with secondary teachers and has taught at both the secondary and

8
6
Actual
4
2 Simulation
0
0 2 4 6 8 10 12 14 16

Figure 1. Scree plot and Horn’s parallel analysis of MCOP .


2
118 J. GLEASON ET AL.

Table 1. Pattern and factor matrices of the MCOP2.


Pattern matrix Factor score
Factor Factor Factor Factor
Item 1 2 1 2
Students engaged in exploration/investigation/problem solving. 0.783 0.172 0.023
Students used a variety of means (models, drawings, graphs, concrete materials, 0.612 0.087 0.022
manipulatives, etc.) to represent concepts.
Students were engaged in mathematical activities. 0.873 0.196 −0.033
Students critically assessed mathematical strategies. 0.359 0.410 0.061 0.099
Students persevered in problem solving. 0.737 0.146 0.029
The lesson involved fundamental concepts of the subject to promote 0.851 −0.018 0.255
relational/conceptual understanding.
The lesson promoted modeling with mathematics. 0.503 0.021 0.095
The lesson provided opportunities to examine mathematical structure. (symbolic notation, 0.585 0.041 0.147
patterns, generalizations, conjectures, etc.)
The lesson included tasks that have multiple paths to a solution or multiple solutions. 0.508 0.013 0.090
The lesson promoted precision of mathematical language. 0.568 −0.005 0.095
The teacher’s talk encouraged student thinking. 0.744 0.016 0.208
There was a high proportion of students talking related to mathematics. 0.831 0.174 −0.013
There was a climate of respect for what others had to say. 0.397 0.329 0.060 0.075
In general, the teacher provided wait time. 0.422 0.060 0.062
Students were involved in the communication of their ideas to others (peer to peer). 0.863 0.225 −0.002
The teacher used student questions/comments to enhance mathematical understanding. 0.738 0.003 0.182
Note: Extraction method: Maximum likelihood. Rotation method: Promax with Kaiser normalization

postsecondary levels, and the fifth rater is a graduate student in mathematics whose only background in education is teaching some introductory college mathematics classes. All raters received the detailed item descriptions and the rubric before observing the classes and asked clarifying questions prior to the observations. However, no formal training on the use of the instrument occurred, simulating its probable future use.
The videos represented a mix of levels of students and instruction, with one video chosen from each of the K–2, 3–5, 6–8, 9–12, and undergraduate levels. All raters analyzed all videos independently, creating a fully crossed, two-way design for the intraclass correlation (ICC) computation. Single-measures ICCs were used because scores from a single rater, rather than the average across raters, constitute the unit of analysis in typical use of the instrument (Hallgren, 2012). Because the instrument was designed to measure classrooms at the subscale level, not at the level of each particular item, the ICCs were computed from the subscale scores rather than the individual item scores. The interrater reliabilities are characterized by absolute agreement to allow comparison of results from the protocol across studies.
Therefore, for each of the subscale totals and the overall instrument total, a two-way mixed, absolute-agreement ICC (McGraw & Wong, 1996) assessed the degree to which coders rated classroom practices consistently across subjects. The resulting single-measure ICCs for the student engagement subscale (0.669) and the teacher facilitation subscale (0.616) both fell within the "good" range (Cicchetti, 1994), indicating a high degree of agreement among coders on both subscales, even with little to no training, practice, or discussion among some of the raters. The high ICC for each subscale suggests that independent coders with little training in the use of the MCOP2 introduce only a small amount of measurement error, and thus do not substantially reduce statistical power for subsequent analyses.
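The two-way, absolute-agreement, single-measures coefficient described above is ICC(A,1) in McGraw and Wong's (1996) notation, computed from the row (subject), column (rater), and residual mean squares of a fully crossed design. The sketch below implements that formula directly; the rating matrix is hypothetical, not the study's data.

```python
import numpy as np

def icc_a1(scores):
    """Two-way, absolute-agreement, single-measures ICC, i.e. ICC(A,1)
    in McGraw & Wong's (1996) notation.

    scores: (n_subjects, k_raters) array of subscale totals, with every
    rater scoring every subject (a fully crossed design).
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)   # per-classroom means
    col_means = x.mean(axis=0)   # per-rater means

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((row_means - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((col_means - grand) ** 2) / (k - 1)
    ss_err = np.sum((x - row_means[:, None] - col_means[None, :] + grand) ** 2)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # ICC(A,1) = (MSR - MSE) / (MSR + (k-1)MSE + (k/n)(MSC - MSE))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

# Hypothetical subscale totals: 5 videos, each rated by the same 5 raters.
ratings = np.array([
    [10, 11,  9, 10, 12],
    [20, 19, 21, 20, 18],
    [14, 15, 13, 14, 16],
    [25, 24, 26, 25, 23],
    [ 8,  9,  7,  8, 10],
])
print(round(icc_a1(ratings), 3))  # → 0.971
```

Because the coefficient uses absolute agreement rather than consistency, systematic rater leniency or severity lowers the ICC, which is what permits comparison of scores across studies and raters.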

Discussion
Use for research and practice
When choosing an instrument for research involving observations of mathematics teaching, the
MCOP2 stands apart in that it measures the actions of both the teacher and students. By capturing
both sets of actions, the MCOP2 differentiates between teacher implementation of a lesson and the
students’ engagement. This makes the MCOP2 a holistic assessment of the classroom environment
not found in observation protocols such as the RTOP, MQI, IQA, and MSCAN. Another benefit of
this instrument for research, in comparison to other observation tools, is that the MCOP2 is more
manageable. Extensive training or professional development is not necessarily required, beyond
reviewing the detailed user guide with the assumption that the user understands the terminology in
the rubrics (Gleason et al., 2015). This guide provides a holistic description of the 16 indicators.
We recommend that the observer stay for the entirety of the lesson to account for the teacher's design and structure of the lesson and for student interactions. We understand that the instructional decisions necessary to teach students effectively differ on a daily basis. This includes a teacher who realizes that a more directed instructional approach might be necessary following days of student-centered investigations. On the other hand, if the teacher is working on procedural fluency, those lessons may not score as high and would not be representative of the teacher's typical instructional practice. Because of the nature and complexity of teaching, the MCOP2 cannot accurately evaluate a teacher from a single observation. We recommend three to six observations using the MCOP2 to capture the mathematical practices of a teacher and the typical classroom interactions among students and teachers (Hill, Charalambous, & Kraft, 2012).
With a focus on the Standards for Mathematical Practice, the MCOP2 is a useful tool in the context of
mathematics teacher certification programs for instruction, individual evaluation, peer evaluation, and
program evaluation (see Appendix B). Using videos of mathematics classrooms, one might have pre-service
teachers use the MCOP2 to analyze the lesson and then discuss how the lesson reflects the desired teacher
facilitation and student engagement. Supervisors of pre-service teachers might in turn use the instrument as
the basis of discussion and evaluation of teacher candidates as part of the student-teaching process. Finally,
compiled data from these individual evaluations can form the basis of one part of the evaluation of the
teacher education program to find areas of strengths and weaknesses in the classroom implementations of
the desired program outcomes.
Another possibility for using the MCOP2 is in a pre–post design as part of professional development.
This allows teachers to see their baseline, highlighting the practices that are strong within their teaching
practice and those that are not as apparent, and then through professional development the teachers can
work to strengthen those practices. Our research team envisions teachers within schools and/or districts
using the MCOP2 as a means of peer observations to learn about themselves and their colleagues, as well as
their students, to improve site-wide practice to improve the teaching and learning of mathematics.
Limitations
All instruments have limitations; we identify two primary limitations for the present instrument. As discussed above, lessons focused primarily on improving procedural fluency will not score very high on many MCOP2 indicators. For instance, if a classroom has spent several lessons focusing on a mixture of conceptual understanding and procedural fluency around a topic such as multiplication, but on the day of the observation the main focus is improving the students' procedural fluency through reviewing multiplication flash cards, the single observation would not give a proper measure of the overall classroom environment.
A limitation of the design and development process for this instrument was the sample population for the teacher observations: in what could be perceived as a convenience sample, the MCOP2 was analyzed only in classrooms within one geographical region of the United States. The researchers were cognizant of this limitation and compensated by including a large sample from a variety of school types (e.g., city, rural, suburban, Title I) and teachers (elementary, secondary, and tertiary; a range of years of experience). A larger, more diverse national sample is ideal and is a goal of subsequent investigation.

Summary
The practices of the mathematics classroom encompass the interconnectedness of both the teacher
and the students' actions. There are common expectations for mathematical practices in grades K–16, so there is a need for a sound protocol to highlight the practices present in teaching for
mathematical understanding, as well as the practices for students to learn mathematics conceptually.
The MCOP2 represents a useful and reliable means for assessing mathematics classrooms.

ORCID
Jim Gleason http://orcid.org/0000-0001-9116-5166
Jeremy Zelkowski http://orcid.org/0000-0002-5635-4233

References
American Educational Research Association, American Psychological Association, & National Council on
Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: American
Educational Research Association.
American Mathematical Association of Two-Year Colleges. (1995). Crossroads in mathematics: Standards for intro-
ductory college mathematics before calculus. (D. Cohen, Ed.). Memphis, TN: American Mathematical Association of
Two-Year Colleges. Retrieved from http://www.amatyc.org/?page=GuidelineCrossroads
American Mathematical Association of Two-Year Colleges. (2006). Beyond crossroads: Implementing mathematics
standards in the first two years of college. (R. Blair, Ed.). Memphis, TN: American Mathematical Association of Two-
Year Colleges. Retrieved from http://beyondcrossroads.matyc.org/
Anghileri, J. (2006). Scaffolding practices that enhance mathematics learning. Journal of Mathematics Teacher
Education, 9(1), 33–52. doi:10.1007/s10857-006-9005-9
Ayre, C., & Scally, A. J. (2014). Critical values for Lawshe’s content validity ratio: Revisiting the original methods of calculation.
Measurement and Evaluation in Counseling and Development, 47(1), 79–86. doi:10.1177/0748175613513808
Azevedo, F. S., diSessa, A. A., & Sherin, B. L. (2012). An evolving framework for describing student engagement in
classroom activities. The Journal of Mathematical Behavior, 31(2), 270–289. doi:10.1016/j.jmathb.2011.12.003
Ball, D. L. (1990). The mathematical understandings that prospective teachers bring to teacher education. Elementary
School Journal, 90(4), 449–466. doi:10.1086/461626
Ball, D. L., Ferrini-Mundy, J., Kilpatrick, J., Milgram, R. J., Schmid, W., & Schaar, R. (2005). Reaching for common
ground in K-12 mathematics education. Notices of the American Mathematical Society, 52(9), 1055–1058.
Ball, D. L., & Rowan, B. (2004). Introduction: Measuring instruction. Elementary School Journal, 105(1), 3–10.
doi:10.1086/428762
Barker, W., Bressoud, D., Epp, S., Ganter, S., Haver, B., & Pollatsek, H. (2004). Undergraduate programs and courses in
the mathematical sciences: CUPM curriculum guide. Washington, DC: Mathematical Association of America.
Bell, C. A., Gitomer, D. H., McCaffrey, D. F., Hamre, B. K., Pianta, R. C., & Qi, Y. (2012). An argument approach to
observation protocol validity. Educational Assessment, 17(2–3), 62–87. doi:10.1080/10627197.2012.715014
Borko, H., Stecher, B. M., & Alonzo, A. C. (2005). Artifact packages for characterizing classroom practice: A pilot
study. Educational Assessment, 10(2), 73–104. doi:10.1207/s15326977ea1002_1
Boston, M., Bostic, J., Lesseig, K., & Sherman, M. (2015). A comparison of mathematics classroom observation
protocols. Mathematics Teacher Educator, 3(2), 154–175. doi:10.5951/mathteaceduc.3.2.0154
Boston, M., & Wolf, M. K. (2006). Assessing academic rigor in mathematics instruction: The development of the Instructional
Quality Assessment Toolkit (CSE Technical Report 672). Los Angeles, CA: The Regents of the University of California.
Retrieved from https://www.cse.ucla.edu/products/reports/r672.pdf
Brophy, J. E., & Good, T. (1986). Teacher behavior and student achievement. In M. Wittrock (Ed.), Handbook of
research on teaching (3rd ed., pp. 328–375). New York, NY: Macmillan.
Bruner, J. S. (1960). The process of education. Cambridge, MA: Harvard University Press.
Bruner, J. S. (1961). The act of discovery. Harvard Educational Review, 31, 21–32.
Bruner, J. S. (1966). Toward a theory of instruction. Cambridge, MA: Belknap Press.
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment
instruments in psychology. Psychological Assessment, 6(4), 284–290. doi:10.1037/1040-3590.6.4.284
Clare, L., & Aschbacher, P. R. (2001). Exploring the technical quality of using assignments and student work as
indicators of classroom practice. Educational Assessment, 7(1), 39–59. doi:10.1207/S15326977EA0701_5
Cobb, P., Wood, T., & Yackel, E. (1993). Discourse, mathematical thinking, and classroom practice. In E. A. Forman,
N. Minick, & C. A. Stone (Eds.), Contexts for learning: Sociocultural dynamics in children’s development (pp.
91–119). Oxford, UK: Oxford University Press.
Cohen, D. K., Raudenbush, S. W., & Ball, D. L. (2003). Resources, instruction, and research. Educational Evaluation
and Policy Analysis, 25(2), 119–142. doi:10.3102/01623737025002119
Conference Board of the Mathematical Sciences. (2016). Active learning in post-secondary mathematics education.
Retrieved from http://www.cbmsweb.org/Statements/Active_Learning_Statement.pdf
Cuoco, A., Goldenberg, P. E., & Mark, J. (1996). Habits of mind: An organizing principle for mathematics curricula.
The Journal of Mathematical Behavior, 15(4), 375–402. doi:10.1016/S0732-3123(96)90023-1
DeVellis, R. F. (2011). Scale development: Theory and applications (3rd ed.). Thousand Oaks, CA: Sage Publications.
Gleason, J., Livers, S., & Zelkowski, J. (2015). Mathematics Classroom Observation Protocol for Practices: Descriptors
manual. Retrieved from http://jgleason.people.ua.edu/mcop2.html
Hallgren, K. A. (2012). Computing inter-rater reliability for observational data: An overview and tutorial. Tutorials in
Quantitative Methods for Psychology, 8(1), 23–34. doi:10.20982/tqmp.08.1.p023
Henry, M., Murray, K. S., & Phillips, K. A. (2007). Meeting the challenge of STEM classroom observation in evaluating
teacher development projects: A comparison of two widely used instruments. St. Louis, MO: Henry Consulting.
Hiebert, J., Carpenter, T., Fennema, E., Fuson, K., Wearne, D., Murray, H., & Human, P. (1997). Making sense:
Teaching and learning mathematics with understanding. Portsmouth, NH: Heinemann.
Hiebert, J., & Carpenter, T. P. (1992). Learning and teaching with understanding. In D. A. Grouws (Ed.), Handbook of
research on mathematics teaching and learning (pp. 65–97). New York, NY: Macmillan.
Hiebert, J., & Grouws, D. A. (2007). The effects of classroom mathematics teaching on students’ learning. In F. K.
Lester (Ed.), Second handbook of research on mathematics teaching and learning (pp. 371–404). Reston, VA:
National Council of Teachers of Mathematics.
Hill, H. C., Blunk, M. L., Charalambous, C. Y., Lewis, J. M., Phelps, G. C., Sleep, L., & Ball, D. L. (2008). Mathematical
knowledge for teaching and the mathematical quality of instruction: An exploratory study. Cognition and
Instruction, 26(4), 430–511. doi:10.1080/07370000802177235
Hill, H. C., Charalambous, C. Y., Blazar, D., McGinn, D., Kraft, M. A., Beisiegel, M., … Lynch, K. (2012). Validating
arguments for observational instruments: Attending to multiple sources of variation. Educational Assessment, 17(2–3),
88–106. doi:10.1080/10627197.2012.715019
Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation systems and a
case for the generalizability study. Educational Researcher, 41(2), 56–64. doi:10.3102/0013189X12437203
Hill, H. C., Kapitula, L., & Umland, K. (2011). A validity argument approach to evaluating teacher value-added scores.
American Educational Research Journal, 48(3), 794–831. doi:10.3102/0002831210387916
Hill, H. C., Umland, K., Litke, E., & Kapitula, L. R. (2012). Teacher quality and quality teaching: Examining the relationship
of a teacher assessment to practice. American Journal of Education, 118(4), 489–519. doi:10.1086/666380
Junker, B., Weisberg, Y., Matsumura, L. C., Crosson, A., Wolf, M. K., Levison, A., & Resnick, L. (2006). Overview of the
Instructional Quality Assessment (CSE Technical Report 671). Los Angeles, CA: The Regents of the University of
California. Retrieved from https://www.cse.ucla.edu/products/reports/r671.pdf
Lampert, M. (1990). When the problem is not the question and the solution is not the answer: Mathematical knowing
and teaching. American Educational Research Journal, 27(1), 29–63. doi:10.3102/00028312027001029
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did
they really say? Organizational Research Methods, 9(2), 202–220. doi:10.1177/1094428105284919
Ma, L. (1999). Knowing and teaching elementary mathematics: Teachers’ understanding of fundamental mathematics in
China and the United States. Mahwah, NJ: Erlbaum.
Marks, H. M. (2000). Student engagement in instructional activity: Patterns in the elementary, middle, and high school
years. American Educational Research Journal, 37(1), 153–184. doi:10.3102/00028312037001153
Matsumura, L. C., Garnier, H., Pascal, J., & Valdes, R. (2002). Measuring instructional quality in accountability
systems: Classroom assignments and student achievement. Educational Assessment, 8(3), 207–229. doi:10.1207/
S15326977EA0803_01
Matsumura, L. C., Garnier, H. E., Slater, S. C., & Boston, M. D. (2008). Toward measuring instructional interactions
“at-scale.” Educational Assessment, 13(4), 267–300. doi:10.1080/10627190802602541
Maxwell, C. R. (1917). The observation of teaching. Boston, MA: Houghton Mifflin.
McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological
Methods, 1(1), 30–46. doi:10.1037/1082-989X.1.1.30
Moschkovich, J. (2007). Examining mathematical discourse practices. For the Learning of Mathematics, 27(1), 24–30.
Mullens, J., & Kasprzyk, D. (1999). Validating item responses on self-report teacher surveys. Washington, DC: U.S.
Department of Education.
Munter, C. (2014). Developing visions of high-quality mathematics instruction. Journal for Research in Mathematics
Education, 45(5), 584–635. doi:10.5951/jresematheduc.45.5.0584
Munter, C., Stein, M. K., & Smith, M. S. (2015). Dialogic and direct instruction: Two distinct models of mathematics
instruction and the debate(s) surrounding them. Teachers College Record, 117(11), 1–32.
Nathan, M. J., & Knuth, E. J. (2003). A study of whole classroom mathematical discourse and teacher change.
Cognition and Instruction, 21(2), 175–207. doi:10.1207/S1532690XCI2102_03
National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards. Reston, VA: Author.
National Council of Teachers of Mathematics. (1991). Professional standards for teaching mathematics. Reston, VA: Author.
National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA:
Author.
National Council of Teachers of Mathematics. (2006). Curriculum focal points for prekindergarten through grade 8
mathematics: A quest for coherence. Reston, VA: Author.
National Council of Teachers of Mathematics. (2011). Focus in high school mathematics: Fostering reasoning and sense
making for all students. M. E. Strutchens & J. R. Quander (Eds.). Reston, VA: Author.
National Council of Teachers of Mathematics. (2014). Principles to actions: Ensuring mathematical success for all.
Reston, VA: Author.
National Governors Association Center for Best Practices, Council of Chief State School Officers. (2010). Common
core state standards mathematics. Washington, DC: National Governors Association Center for Best Practices,
Council of Chief State School Officers. Retrieved from http://www.corestandards.org/Math
National Research Council. (2003). Evaluating and improving undergraduate teaching in science, technology, engineer-
ing, and mathematics. Washington, DC: The National Academies Press.
Polya, G. (1945). How to solve it: A new aspect of mathematical method. Princeton, NJ: Princeton University
Press.
Polya, G. (1957). How to solve it: A new aspect of mathematical method (2nd ed.). Princeton, NJ: Princeton
University Press.
Resnick, L. B. (1976). Task analysis in instructional design: Some cases from mathematics. In D. Klahr (Ed.), Cognition
and instruction. Hillsdale, NJ: Erlbaum.
Resnick, L. B., & Ford, W. W. (1981). The psychology of mathematics for instruction. Hillsdale, NJ: Erlbaum.
Reys, R. E., Lindquist, M. M., Lambdin, D. V., & Smith, N. L. (2015). Helping children learn mathematics (11th ed.).
Hoboken, NJ: John Wiley & Sons.
Rittle-Johnson, B., Siegler, R. S., & Alibali, M. W. (2001). Developing conceptual understanding and procedural skill in
mathematics: An iterative process. Journal of Educational Psychology, 93(2), 346. doi:10.1037/0022-0663.93.2.346
Rogoff, B., Matusov, E., & White, C. (1996). Models of teaching and learning: Participation in a community of
learners. In D. R. Olson & N. Torrance (Eds.), The handbook of education and human development (pp. 388–414).
Cambridge, MA: Blackwell Publishers.
Sawada, D., Piburn, M. D., Judson, E., Turley, J., Falconer, K., Benford, R., & Bloom, I. (2002). Measuring reform
practices in science and mathematics classrooms: The reformed teaching observation protocol. School Science and
Mathematics, 102(6), 245–253. doi:10.1111/ssm.2002.102.issue-6
Schoenfeld, A. H. (1998). Toward a theory of teaching-in-context. Issues in Education, 4, 1–94. doi:10.1016/S1080-9724(99)
80076-7
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized
causal inference. Boston, MA: Houghton Mifflin.
Sherin, M. G., Mendez, E. P., & Louis, D. A. (2004). A discipline apart: The challenges of ‘Fostering a Community of
Learners’ in a mathematics classroom. Journal of Curriculum Studies, 36(2), 207–232. doi:10.1080/
0022027032000142519
Silver, E. A., Mesa, V. M., Morris, K. A., Star, J. R., & Benken, B. M. (2009). Teaching mathematics for understanding:
An analysis of lessons submitted by teachers seeking NBPTS Certification. American Educational Research Journal,
46(2), 501–531. doi:10.3102/0002831208326559
Skemp, R. R. (1971). The psychology of learning mathematics. Middlesex, UK: Penguin.
Skemp, R. R. (1976). Relational understanding and instrumental understanding. Arithmetic Teacher, 26(3), 9–15.
Stein, M. K., Engle, R. A., Smith, M. S., & Hughes, E. K. (2008). Orchestrating productive mathematical discussions:
Five practices for helping teachers move beyond show and tell. Mathematical Thinking and Learning, 10(4), 313–
340. doi:10.1080/10986060802229675
Stein, M. K., Grover, B. W., & Henningsen, M. (1996). Building student capacity for mathematical thinking and
reasoning: An analysis of mathematical tasks used in reform classrooms. American Educational Research Journal,
33(2), 455–488. doi:10.3102/00028312033002455
Stein, M. K., Smith, M. S., Henningsen, M. A., & Silver, E. A. (2009). Implementing standards-based mathematics
instruction: A casebook for professional development (2nd ed.). New York, NY: Teachers College Press.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard
University Press.
Walkowiak, T. A., Berry, R. Q., Meyer, J. P., Rimm-Kaufman, S. E., & Ottmar, E. R. (2014). Introducing an
observational measure of standards-based mathematics teaching practices: Evidence of validity and score reliability.
Education Studies in Mathematics, 85(1), 109–128. doi:10.1007/s10649-013-9499-x
Winne, P. H. (1979). Experiments relating teachers’ use of higher cognitive questions to student achievement. Review
of Educational Research, 49, 13–50. doi:10.3102/00346543049001013
Yackel, E., & Cobb, P. (1996). Sociomathematical norms, argumentation, and autonomy in mathematics. Journal for
Research in Mathematics Education, 27(4), 458–477. doi:10.2307/749877

Appendix A

Affordances, constraints, and ideal uses for RTOP, IQA, MQI, M-SCAN, and MCOP2.
RTOP IQA MQI M-SCAN MCOP2
Affordances General indicators Specific to cognitive- Not biased toward Multiple Multiple dimensions of
of reform- challenging tasks instructional type dimensions of measure (teachers and
oriented teaching measure students) for holistic
view of classroom
Use in science Explicit connections to PD Analysis of video Low/medium/ Focused on Standards
and math high scores for Mathematical
Practice
Teacher-level Flexibility for live or School/district Random Well-defined rubrics
data validation recorded or evaluating level selection of
student work only teachers in
sample for
development
Rich descriptive School/district level Well-defined Quantifiable Reliability for K–16
data for teachers scores measures
Can show change Descriptive stats on rubrics Extensive Reliability Large expert validation
over time validation estimates
Many indicators Well-defined scores Online training K–6 grades Validated on pre-service
difficult to use at and in-service teacher
scale populations
Not truly Interrater reliability Multiple factors Sample only in Focuses on broad
mathematical grades 5–6 picture
Summing of Validation studies Rich descriptive Validity issues Not explicit for PD
subscales & data outside 5–6
overall score loses
interpretive value
Very difficult to Rich descriptive data for Explicit about Small limited Limited sample
get exact-point teachers increasing scores sample
reliability
between raters
No descriptors for Explicit about increasing Used over time, Small limited Lacks task analysis
score levels scores across sites or sample
comparison
Norming across Used over time, across sites, K–8 limited Norming across Requires observation of
studies difficult or comparison studies an entire lesson
Constraints Live/recorded Limited focus on tasks Not for live Five days of Live/recorded
observations training on-site Math K–16
Multiple content Biased toward ambitious Raters need Live/recorded Classroom, school,
math instruction adequate district, project, or
teacher certification
programs
Reform practice Accessibility of training knowledge
issues
Pre-service Not appropriate for Broader focus Math 5–6
programs comparisons or evaluation
with no expectation of
ambitious math instruction
Live/recorded/student work Many indicators Classroom or
makes connections school or district
not so explicit to
PD
Uses Math only in K–12 Recorded
Math K–8
Evaluating student
opportunity to
learn with
different
instruction
Adapted from Boston et al. (2015) and based on RTOP-Reformed Teaching Observation Protocol (Sawada et al., 2002); IQA—
Instructional Quality Assessment (Clare & Aschbacher, 2001; Junker et al., 2006; Matsumura et al., 2002; Boston & Wolf, 2006,
2008); MQI-Mathematics Quality of Instruction (Hill, Charalambous, Blazar, et al., 2012; Hill, Kapitula, & Umland, 2011; Hill,
Umland, Litke, & Kapitula, 2012; Hill et al., 2012); M-SCAN-Mathematics Scan (Borko, Stecher, & Alonzo, 2005; Walkowiak et al.,
2014).

Appendix B

Relationship between the MCOP2 and the Standards for Mathematical Practice.

Columns, left to right: (1) Make sense of problems and persevere in solving them; (2) Reason abstractly and
quantitatively; (3) Construct viable arguments and critique the reasoning of others; (4) Model with
mathematics; (5) Use appropriate tools strategically; (6) Attend to precision; (7) Look for and make use of
structure; (8) Look for and express regularity in repeated reasoning. Each X marks an alignment between an
MCOP2 item (rows 1–16) and a practice.

MCOP2 item
1 X X X
2 X X
3 X
4 X X X
5 X X X X
6 X X
7 X X
8 X X
9 X
10 X X
11 X
12 X
13 X
14 X
15 X
16 X

Appendix C

1. Students engaged in exploration/investigation/problem solving.


SE Description
3 Students regularly engaged in exploration, investigation, or problem solving. Over the course of the lesson, the majority
of the students engaged in exploration/investigation/problem solving.
2 Students sometimes engaged in exploration, investigation, or problem solving. Several students engaged in problem
solving, but not the majority of the class.
1 Students seldom engaged in exploration, investigation, or problem solving. This tended to be limited to one or a few
students engaged in problem solving, while other students watched but did not actively participate.
0 Students did not engage in exploration, investigation, or problem solving. There were either no instances of investigation or
problem solving, or the instances were carried out by the teacher without active participation by any students.
2. Students used a variety of means (models, drawings, graphs, concrete materials, manipulatives, etc.) to represent
concepts.
SE Description
3 The students manipulated or generated two or more representations to represent the same concept, and the connections
across the various representations, relationships of the representations to the underlying concept, and applicability or the
efficiency of the representations were explicitly discussed by the teacher or students, as appropriate.
2 The students manipulated or generated two or more representations to represent the same concept, but the
connections across the various representations, relationships of the representations to the underlying concept, and
applicability or the efficiency of the representations were not explicitly discussed by the teacher or students.
1 The students manipulated or generated one representation of a concept.
0 There were either no representations included in the lesson, or representations were included but were exclusively
manipulated and used by the teacher. If the students only watched the teacher manipulate the representation and did
not interact with a representation themselves, it should be scored a 0.
3. Students were engaged in mathematical activities.
SE Description
3 Most of the students spend two-thirds or more of the lesson engaged in mathematical activity at the appropriate level
for the class. It does not matter if it is one prolonged activity or several shorter activities. (Note that listening and taking
notes does not qualify as a mathematical activity unless the students are filling in the notes and interacting with the
lesson mathematically.)
2 Most of the students spend more than one-quarter but less than two-thirds of the lesson engaged in appropriate-level
mathematical activity. It does not matter if it is one prolonged activity or several shorter activities.
1 Most of the students spend less than one-quarter of the lesson engaged in appropriate-level mathematical activity.
There is at least one instance of students’ mathematical engagement.
0 Most of the students are not engaged in appropriate-level mathematical activity. This could be because they are never
asked to engage in any activity and spend the lesson listening to the teacher and/or copying notes, or it could be
because the activity they are engaged in is not mathematical—such as a coloring activity.
4. Students critically assessed mathematical strategies.
SE TF Description
3 3 More than half of the students critically assessed mathematical strategies. This could have happened in a variety of
scenarios, including in the context of partner work, small-group work, or a student making a comment during direct
instruction or individually to the teacher.
2 2 At least two but less than half of the students critically assessed mathematical strategies. This could have happened in a
variety of scenarios, including in the context of partner work, small-group work, or a student making a comment during
direct instruction or individually to the teacher.
1 1 An individual student critically assessed mathematical strategies. This could have happened in a variety of scenarios,
including in the context of partner work, small group work, or a student making a comment during direct instruction or
individually to the teacher. The critical assessment was limited to one student.
0 0 Students did not critically assess mathematical strategies. This could happen for one of three reasons: (1) No strategies
were used during the lesson. (2) Strategies were used but were not discussed critically. For example, the strategy may
have been discussed in terms of how it was used on the specific problem, but its use was not discussed more generally.
(3) Strategies were discussed critically by the teacher, but this amounted to the teacher telling the students about the
strategy(ies), and students did not actively participate.
5. Students persevered in problem solving.
SE Description
3 Students exhibited a strong amount of perseverance in problem solving. The majority of students looked for entry
points and solution paths, monitored and evaluated progress, and changed course if necessary. When confronted with
an obstacle (such as how to begin or what to do next), the majority of students continued to use resources (physical
tools as well as mental reasoning) to continue to work on the problem.
2 Students exhibited some perseverance in problem solving. Half of students looked for entry points and solution paths,
monitored and evaluated progress, and changed course if necessary. When confronted with an obstacle (such as how to
begin or what to do next), half of students continued to use resources (physical tools as well as mental reasoning) to
continue to work on the problem.
1 Students exhibited minimal perseverance in problem solving. At least one student but less than half of students looked
for entry points and solution paths, monitored and evaluated progress, and changed course if necessary. When
confronted with an obstacle (such as how to begin or what to do next), at least one student but less than half of
students continued to use resources (physical tools as well as mental reasoning) to continue to work on the problem.
There must be a roadblock to score above a 0.
0 Students did not persevere in problem solving. This could be because there was no student problem solving in the
lesson, or because when presented with a problem-solving situation, no students persevered. That is, all students either
could not figure out how to get started on a problem, or when they confronted an obstacle in their strategy, they
stopped working.
6. The lesson involved fundamental concepts of the subject to promote relational/conceptual understanding.
TF Description
3 The lesson includes fundamental concepts or critical areas of the course, as described by the appropriate standards, and
the teacher/lesson uses these concepts to build relational/conceptual understanding of the students with a focus on
the “why” behind any procedures included.
2 The lesson includes fundamental concepts or critical areas of the course, as described by the appropriate standards, but
the teacher/lesson misses several opportunities to use these concepts to build relational/conceptual understanding of
the students with a focus on the “why” behind any procedures included.
1 The lesson mentions some fundamental concepts of mathematics, but does not use these concepts to develop the
relational/conceptual understanding of the students. For example, in a lesson on the slope of the line, the teacher
mentions that it is related to ratios, but does not help the students to understand how it is related and how that can
help them to better understand the concept of slope.
0 The lesson consists of several mathematical problems with no guidance to make connections with any of the
fundamental mathematical concepts. This usually occurs with a teacher focusing on procedure of solving certain types
of problems without the students understanding the “why” behind the procedures.
7. The lesson promoted modeling with mathematics.
TF Description
3 Modeling (using a mathematical model to describe a real-world situation) is an integral component of the lesson, with
students engaged in the modeling cycle (as described in the Common Core State Standards).
2 Modeling is a major component, but the modeling has been turned into a procedure (i.e., a group of word problems
that all follow the same form, and the teacher has guided the students to find the key pieces of information and how to
plug them into a procedure); or modeling is not a major component, but the students engage in a modeling activity
that fits within the corresponding standard of mathematical practice.
1 The teacher describes some type of mathematical model to describe real-world situations, but the students do not
engage in activities related to using mathematical models.
0 The lesson does not include any modeling with mathematics.
8. The lesson provided opportunities to examine mathematical structure (symbolic notation, patterns, generalizations, conjectures, etc.).
TF Description
3 The students have a sufficient amount of time and opportunity to look for and make use of mathematical structure or
patterns.
2 Students are given some time to examine mathematical structure, but are not allowed adequate time or are given too
much scaffolding so that they cannot fully understand the generalization.
1 Students are shown generalizations involving mathematical structure, but have little opportunity to discover these
generalizations themselves or adequate time to understand the generalization.
0 Students are given no opportunities to explore or understand the mathematical structure of a situation.
9. The lesson included tasks that have multiple paths to a solution or multiple solutions.
TF Description
3 The lesson includes several tasks throughout, or a single task that takes up a large portion of the lesson, with multiple solutions and/or multiple paths to a solution, and increases the cognitive level of the task for different students.
2 Multiple solutions and/or multiple paths to a solution are a significant part of the lesson, but are not the primary focus,
or are not explicitly encouraged; or more than one task has multiple solutions and/or multiple paths to a solution that
are explicitly encouraged.
1 Multiple solutions and/or multiple paths minimally occur, and are not explicitly encouraged; or a single task has
multiple solutions and/or multiple paths to a solution that are explicitly encouraged.
0 A lesson that focuses on a single procedure to solve certain types of problems and/or strongly discourages students
from trying different techniques.
10. The lesson promoted precision of mathematical language.
TF Description
3 The teacher “attends to precision” with regard to communication during the lesson. The students also “attend to
precision” in communication, or the teacher guides students to modify or adapt nonprecise communication to improve
precision.
2 The teacher “attends to precision” in all communication during the lesson, but the students are not always required also
to do so.
1 The teacher makes a few incorrect statements or is sloppy about mathematical language, but generally uses correct
mathematical terms.
0 The teacher makes repeated incorrect statements or incorrect names for mathematical objects instead of their accepted
mathematical names.
11. The teacher’s talk encouraged student thinking.
TF Description
3 The teacher’s talk focused on high levels of mathematical thinking. The teacher may ask lower-level questions within the lesson, but this is not the focus of the practice. There are three possibilities for high levels of thinking: analysis, synthesis, and evaluation. Analysis: examines/interprets the pattern, order, or relationship of the mathematics; parts of the form of thinking. Synthesis: requires original, creative thinking. Evaluation: makes a judgment of good or bad, right or wrong, according to the standards he or she values.
2 The teacher’s talk focused on mid-levels of mathematical thinking. Interpretation: discovers relationships among facts, generalizations, definitions, values, and skills. Application: requires identification, selection, and use of appropriate generalizations and skills.
1 Teacher talk consists of “lower-order” knowledge-based questions and responses focusing on recall of facts. Memory:
recalls or memorizes information. Translation: changes information into a different symbolic form or situation.
0 Any questions/responses of the teacher related to mathematical ideas were rhetorical, in that there was no expectation
of a response from the students.
12. There was a high proportion of students talking related to mathematics.
SE Description
3 More than three-quarters of the students were talking related to the mathematics of the lesson at some point during
the lesson.
2 More than half but less than three-quarters of the students were talking related to the mathematics of the lesson at
some point during the lesson.
1 Less than half of the students were talking related to the mathematics of the lesson.
0 No students talked related to the mathematics of the lesson.
13. There was a climate of respect for what others had to say.
SE TF Description
3 3 Many students are sharing, questioning, and commenting during the lesson, including their struggles. Students are also
listening (active), clarifying, and recognizing the ideas of others.
2 2 The environment is such that some students are sharing, questioning, and commenting during the lesson, including
their struggles. Most students listen.
1 1 Only a few share as called on by the teacher. The climate supports those who understand or who behave appropriately.
Or some students are sharing, questioning, or commenting during the lesson, but most students are actively listening
to the communication.
0 0 No students shared ideas.
14. In general, the teacher provided wait time.
SE Description
3 The teacher frequently provided an ample amount of “think time” for the depth and complexity of a task or question posed by the teacher or a student.
2 The teacher sometimes provided an ample amount of “think time” for the depth and complexity of a task or question posed by the teacher or a student.
1 The teacher rarely provided an ample amount of “think time” for the depth and complexity of a task or question posed by the teacher or a student.
0 The teacher never provided an ample amount of “think time” for the depth and complexity of a task or question posed by the teacher or a student.
15. Students were involved in the communication of their ideas to others (peer to peer).
SE Description
3 Considerable time (more than half) was spent in peer-to-peer dialog (pairs, groups, whole class) related to the communication of ideas, strategies, and solutions.
2 Some class time (less than half, but more than just a few minutes) was devoted to peer-to-peer (pairs, groups,
whole class) conversations related to the mathematics.
1 The lesson was primarily teacher-directed, and few opportunities were available for peer-to-peer (pairs, groups, whole class) conversations. Any instances that developed during the lesson lasted less than 5 minutes.
0 No peer-to-peer (pairs, groups, whole class) conversations occurred during the lesson.
16. The teacher uses student questions/comments to enhance conceptual mathematical understanding.
SE TF Description
3 3 The teacher frequently uses student questions/comments to coach students, facilitate conceptual understanding, and
boost the conversation. The teacher sequences the student responses that will be displayed in an intentional order,
and/or connects different students’ responses to key mathematical ideas.
2 2 The teacher sometimes uses student questions/comments to enhance conceptual understanding.
1 1 The teacher rarely uses student questions/comments to enhance conceptual mathematical understanding. The focus is
more on procedural knowledge of the task versus conceptual knowledge of the content.
0 0 The teacher never uses student questions/comments to enhance conceptual mathematical understanding.
Note. SE = Student Engagement; TF = Teacher Facilitation.
© 2017 Research Council on Mathematics Learning
Appendix D
Theoretical structure of the MCOP2 based on initial expert survey. For each item, the percentages show the share of experts rating it Essential / Not essential, but useful / Not useful.

Students explored prior to formal presentation. 45.2% / 48.2% / 6.6%
Students engaged in flexible alternative modes of investigation/problem solving. 50.0% / 45.2% / 4.8%
Students used a variety of means (models, drawings, graphs, concrete materials, manipulatives, etc.) to represent concepts. 66.3% / 30.1% / 3.6%
Students were engaged in mathematical activities. 79.5% / 17.5% / 3.0%
Students critically assessed mathematical strategies. 53.6% / 42.8% / 3.6%
Students persevered in problem solving. 80.7% / 16.9% / 2.4%
The lesson involved fundamental concepts of the subject to promote relational/conceptual understanding. 86.1% / 11.4% / 2.4%
The lesson promoted connections across the discipline of mathematics. 36.7% / 57.8% / 5.4%
The lesson promoted modeling with mathematics. 39.8% / 57.2% / 3.0%
The lesson provided opportunities to examine elements of abstraction (symbolic notation, patterns, generalizations, conjectures, etc.). 48.2% / 49.4% / 2.4%
The lesson included tasks that have multiple paths to a solution or multiple solutions. 63.3% / 34.9% / 1.8%
The lesson promoted precision of mathematical language. 56.0% / 42.2% / 1.8%
The teacher’s talk encouraged student thinking. 89.8% / 9.6% / 0.6%
There was a high proportion of students talking related to mathematics. 77.1% / 20.5% / 2.4%
There was a climate of respect for what others had to say. 88.6% / 10.8% / 0.6%
In general, the teacher provided wait time. 70.5% / 29.5% / 0.0%
Students were involved in the communication of their ideas to others (peer to peer). 80.1% / 19.3% / 0.6%
The teacher uses student questions/comments to enhance mathematical understanding. 82.5% / 15.7% / 1.8%

Appendix E
Theoretical structure of the MCOP2 based on second expert survey. For each item, the percentages show how well experts judged the item to fit four constructs, in the order Lesson Design / Lesson Implementation / Student Engagement with Peers / Student Engagement with Content.

Students engaged in exploration/investigation/problem solving.
  Extremely well: 34.6% / 65.4% / 34.6% / 50.0%
  Well: 42.3% / 26.9% / 11.5% / 26.9%
  Somewhat: 23.1% / 7.7% / 30.8% / 19.2%
  Not at all: 0% / 0% / 23.1% / 3.8%

Students used a variety of means to represent concepts.
  Extremely well: 34.6% / 50.0% / 23.1% / 76.9%
  Well: 26.9% / 38.5% / 23.1% / 3.8%
  Somewhat: 34.6% / 11.5% / 30.1% / 11.5%
  Not at all: 3.8% / 0% / 23.1% / 7.7%

Students were engaged in mathematical activities.
  Extremely well: 30.8% / 53.8% / 26.9% / 53.8%
  Well: 46.2% / 34.6% / 15.4% / 34.6%
  Somewhat: 23.1% / 11.5% / 19.2% / 11.5%
  Not at all: 0% / 0% / 38.5% / 0%

Students critically assessed mathematical strategies.
  Extremely well: 30.8% / 57.7% / 38.5% / 57.7%
  Well: 26.9% / 26.9% / 30.8% / 38.5%
  Somewhat: 42.3% / 15.4% / 26.9% / 3.8%
  Not at all: 0% / 0% / 3.8% / 0%

Students persevered in problem solving.
  Extremely well: 30.8% / 38.5% / 23.1% / 65.4%
  Well: 30.8% / 38.5% / 23.1% / 26.9%
  Somewhat: 34.6% / 15.4% / 34.6% / 7.7%
  Not at all: 3.8% / 7.7% / 19.2% / 0%

The lesson involved fundamental concepts of the subject to promote relational/conceptual understanding.
  Extremely well: 76.9% / 50.0% / 0.0% / 23.1%
  Well: 19.2% / 23.1% / 11.5% / 34.6%
  Somewhat: 0% / 23.1% / 42.3% / 34.6%
  Not at all: 3.8% / 3.8% / 46.2% / 7.7%

The lesson promoted modeling with mathematics.
  Extremely well: 53.8% / 61.5% / 3.8% / 23.1%
  Well: 30.8% / 23.1% / 19.2% / 38.5%
  Somewhat: 15.4% / 15.4% / 34.6% / 34.6%
  Not at all: 0% / 0% / 42.3% / 3.8%

The lesson provided opportunities to examine mathematical structure.
  Extremely well: 73.1% / 61.5% / 15.4% / 57.7%
  Well: 23.1% / 30.8% / 11.5% / 34.6%
  Somewhat: 3.8% / 7.7% / 23.1% / 7.7%
  Not at all: 0% / 0% / 50.0% / 0%

The lesson included tasks that have multiple paths to a solution or multiple solutions.
  Extremely well: 80.8% / 50.0% / 11.5% / 23.1%
  Well: 19.2% / 23.1% / 11.5% / 50.0%
  Somewhat: 0% / 23.1% / 26.9% / 19.2%
  Not at all: 0% / 3.8% / 50.0% / 7.7%

The lesson promoted precision of mathematical language.
  Extremely well: 19.2% / 57.7% / 15.4% / 11.5%
  Well: 30.8% / 42.3% / 15.4% / 38.5%
  Somewhat: 38.5% / 0% / 15.4% / 42.3%
  Not at all: 11.5% / 0% / 53.8% / 7.7%

The teacher’s talk encouraged student thinking.
  Extremely well: 38.5% / 76.9% / 11.5% / 19.2%
  Well: 34.6% / 15.4% / 11.5% / 46.2%
  Somewhat: 26.9% / 7.7% / 15.4% / 23.1%
  Not at all: 0% / 0% / 61.5% / 11.5%

There was a high proportion of students talking related to mathematics.
  Extremely well: 23.1% / 57.7% / 38.5% / 57.7%
  Well: 34.6% / 34.6% / 23.1% / 15.4%
  Somewhat: 34.6% / 3.8% / 34.6% / 26.9%
  Not at all: 7.7% / 3.8% / 3.8% / 0%

There was a climate of respect for what others had to say.
  Extremely well: 19.2% / 53.8% / 73.1% / 30.8%
  Well: 34.6% / 42.3% / 26.9% / 42.3%
  Somewhat: 38.5% / 3.8% / 0% / 19.2%
  Not at all: 7.7% / 0% / 0% / 7.7%

In general, the teacher provided wait time.
  Extremely well: 15.4% / 73.1% / 7.7% / 11.5%
  Well: 34.6% / 23.1% / 15.4% / 26.9%
  Somewhat: 26.9% / 3.8% / 26.9% / 50.0%
  Not at all: 23.1% / 0% / 50.0% / 11.5%

Students were involved in the communication of their ideas to others.
  Extremely well: 26.9% / 53.8% / 88.5% / 50.0%
  Well: 50.0% / 42.3% / 11.5% / 30.8%
  Somewhat: 23.1% / 3.8% / 0% / 19.2%
  Not at all: 0% / 0% / 0% / 0%

The teacher used student questions/comments to enhance mathematical understanding.
  Extremely well: 34.6% / 84.6% / 15.4% / 50.0%
  Well: 42.3% / 15.4% / 34.6% / 34.6%
  Somewhat: 19.2% / 0% / 26.9% / 15.4%
  Not at all: 3.8% / 0% / 23.1% / 0%