Their Validity
Dissertation
Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy
By
Yating Liu
2013
Dissertation Committee:
Patricia Brosnan
Herb Clemens
Copyright by
Yating Liu
2013
Abstract
The study examined how middle school students evaluate arguments in a wide range of mathematical contexts, seeking to identify common aspects and features of arguments that impacted students' evaluation of them.
The study involved two phases, a survey and a follow-up interview. Over five
hundred 8th grade students from five Ohio public schools participated in the survey study,
where they were provided a variety of arguments in four different mathematical contexts
and were asked to determine which of these arguments were convincing, explanatory and
appealing to them. Eight subjects, whose survey responses were distinct from each other,
were selected to participate in the follow-up interviews, where they were asked to explain their evaluations in detail. Statistical data from the survey were used to identify types of mathematical arguments that students found convincing, explanatory and appealing. Interview data were coded using a proof classification framework to identify the aspects and features of arguments that impacted their evaluations.
Findings from the survey and interview suggested that the participants’ evaluation
of the same argument was highly diverse among individuals. Their judgment of the same
type of arguments also differed across the problem contexts. The subjects' explanations in
interviews revealed that the source of evidence had the largest impact on their judgment
of an argument, followed by representation. The reasoning mode, i.e. the link between
evidence and conclusion, was the aspect of least concern. Further investigation indicated that examples, i.e. results from immediate tests, were the most referenced type of evidence, while the subjects' attention to reasoning modes varied. Lastly, it was found that the subjects possessed personal standards for determining whether an argument was convincing. Most subjects did not consider the
arguments.
Dedication
Acknowledgement
From the first time I was introduced to the theories of teaching and learning, to the
process of designing, producing and refining the dissertation work, none of this could have been achieved without the help of those individuals whom I can only begin to thank here. My advisor, Dr. Azita Manouchehri, first sparked my interest in students' mathematical reasoning. Since then, she has invested much of her time and energy in cultivating my teaching and research skills. Through our many conversations, she has provided
thoughtful feedback, helping me clearly and coherently express ideas and shed light on
patterns in my analysis. Always urging me to take one step further, Dr. Manouchehri has shaped the way I think and work as a researcher.
I would also like to thank Professor Herb Clemens and Professor Patti Brosnan
for their contribution as members of my committee. I thank Dr. Clemens for his insightful mathematical perspective, and Dr. Brosnan, who, along with Professor Diana Erchick, introduced me to the Mathematics Coaching
Program, providing the environment through which I was able to develop my
understanding of the education system and to connect with the schools whose students
form the basis of my dissertation study. I would also like to thank the mathematics
coaches of those schools, who helped facilitate the collection of the data, as well as the teachers and students who took part in the study. I also very much appreciate my parents and my wife, who, with their love and support, sustained me
throughout my study. I would finally like to thank my graduate student colleagues, with
whom I could, when needed, commiserate — but more often and better still, collaborate
and celebrate.
Vita
Publications
Liu, Y., Zhang, P., Brosnan, P., & Erchick, D. (2012). Examining the geometry items of
state standardized exams using the van Hiele model: Test content and student
achievement. Research in Education, Assessment, and Learning, 3(1), 22-28.
Liu, Y., & Manouchehri, A. (2012). What kinds of arguments do eighth graders prefer:
Preliminary results from an exploratory study. In Proceedings of the 34th Annual
Conference of the North American Chapter of the International Group for the
Psychology of Mathematics Education. Kalamazoo, MI: Western Michigan
University.
Liu, Y., & Manouchehri, A. (2012). Nurturing high school students’ understanding of
proof as a convincing way of reasoning: Results from an exploratory study. In
Proceedings of the 12th International Congress on Mathematical Education (pp.
2848-2857). Seoul, Korea.
Manouchehri, A., Zhang, P., & Liu, Y. (2012). Forces hindering development of
mathematical problem solving among school children. In Proceedings of the 12th
International Congress on Mathematical Education (pp. 2974-2983). Seoul,
Korea.
Liu, Y., Harrison, R., & Zollinger, S. (2011). Enhancing K-8 mathematics coaches’
knowledge for teaching probability. In T. Lamberg & L. Wiest (Eds.),
Proceedings of the 33rd annual meeting of the North American Chapter of the
International Group for the Psychology of Mathematics Education. Reno, NV:
University of Nevada, Reno.
Liu, Y., Zhang, P., Brosnan, P., & Erchick, D. (2010). Examining the geometry content of
state standardized exams using the van Hiele model. In P. Brosnan, D. Erchick,
& L. Flevares (Eds.), Proceedings of the 32nd annual meeting of the North
American Chapter of the International Group for the Psychology of Mathematics
Education (Vol. 6, pp. 616-624). Columbus, OH: The Ohio State University.
Zhang, P., Brosnan, P., Erchick, D., & Liu, Y. (2010). Analysis and inference to students’
approaches about development of problem-solving ability. In P. Brosnan, D.
Erchick, & L. Flevares (Eds.), Proceedings of the 32nd annual meeting of the
North American Chapter of the International Group for the Psychology of
Mathematics Education (Vol. 6, p. 823). Columbus, OH: The Ohio State
University.
Field of Study
(Mathematics Education)
Table of Contents
Abstract .......................................................................................................................... ii
Dedication ......................................................................................................................iv
Acknowledgement ........................................................................................................... v
Vita ...............................................................................................................................vii
Theoretical Framework ......................................................................................... 47
Sample .................................................................................................................. 60
Appendix A. Survey results: Pairwise comparisons of arguments in each problem ....... 271
List of Tables
Table 1. The alignment between functions of proof and learners’ purpose in conducting
proof.................................................................................................................34
Table 11. Summary of the most understandable, convincing, explanatory and appealing
Table 12. Summary of the least understandable, convincing, explanatory and appealing
Table 13. Summary of high and low rated arguments by type ....................................... 140
Table 16. Categories of comments made by Abby ........................................................ 157
Table 37. Summary of the subjects’ rationale in argument evaluation ........................... 220
Table 38. Similarities and differences in the subjects’ rationale of argument evaluation 228
Table 39. Pairwise comparisons: Participants’ ratings on whether the arguments in each
Table 40. Pairwise comparisons of survey results: Participants’ ratings on whether the
Table 41. Pairwise comparisons of survey results: Participants’ ratings on whether the
Table 42. Pairwise comparisons of survey results: Participants’ ratings on whether the
Table 46. Pairwise comparison of the rankings of arguments in each problem .............. 288
List of Figures
Figure 2. Proof schemes and sub schemes (Sowder & Harel, 1998) ...............................39
Figure 4. The broad maturation of proof structure (Tall et al., 2012) ...............................46
Figure 6. Reading Comprehension of Geometry Proof (RCGP) Model (Yang & Lin,
2008) ................................................................................................................50
Figure 11. Illustration of Allen’s rationale for evaluating mathematical arguments ....... 112
Figure 12. The percentage of participants who considered each argument understandable
....................................................................................................................... 116
Figure 14. Distribution of the number of arguments indicated not understandable by each
Figure 15. Illustration of how understandable the arguments were to the participants ... 120
Figure 16. Illustration of how convincing the arguments were to the participants ......... 122
Figure 17. Illustration of how explanatory the arguments were to the participants ........ 126
Figure 18. The percentage of participants who considered each argument the most appealing
....................................................................................................................... 129
Figure 19. An example of the data transformation for within group ANOVA test ......... 130
Figure 20. Illustration of how appealing the arguments were to the participants ........... 131
Figure 21. Plots for variables on which the gender * school effect was significant ....... 147
Figure 22. Illustration of Abby’s rationale for evaluating mathematical arguments ....... 159
Figure 23. Illustration of Alice’s rationale for evaluating mathematical arguments ....... 167
Figure 24. Illustration of Amy’s rationale for evaluating mathematical arguments ........ 176
Figure 25. Illustration of Beth’s rationale for evaluating mathematical arguments ........ 185
Figure 26. Illustration of Betty’s rationale for evaluating mathematical arguments ....... 193
Figure 27. Illustration of Blake’s rationale for evaluating mathematical arguments ...... 202
Figure 28. Illustration of Brenda’s rationale for evaluating mathematical arguments .... 211
Figure 29. Factors that impacted the subjects’ conviction ............................................. 219
Figure 30. Factors that caused inconsistent evaluation of the same type of arguments .. 234
CHAPTER 1. INTRODUCTION
In a broad sense, proofs are arguments that are used to verify the truth of a statement. For example, in a judicial process, testimonies from witnesses are usually adopted as proofs. In an election, one's past career
games is often considered as proof of competence. In natural science, proofs come from
sufficiency at which evidence and arguments become proof, as a common foundation for
all sorts of discussion (Pruss, 2006). The conventions and regulations about what can be
used as a reliable source and what can be accepted as a valid argument are highly area-
dependent, even when the discussion is restricted within the study of mathematics (Baker,
2009; Tall, 1991; Thurston, 1995; Usiskin, 1980). Despite the absence of a fixed and precise standard, mathematical proof is no exception: it, too, certifies the truth of a claim.
There is no other scientific or analytical discipline that uses proof as readily and
mathematics special: the tightly knit chain of reasoning, following strict logical
device for establishing the absolute and irrevocable truth of statements in our
subject. This is the reason that we can depend on mathematics that was done by
Euclid 2300 years ago as readily as we believe in the mathematics that is done
today. No other discipline can make such an assertion (Krantz, 2007, p. 1).
Although the general idea of mathematical proof, i.e. deriving a new result from a known result, has remained unchanged for more than 2,000 years, details about how such a derivation should be carried out have varied throughout the development of mathematics (Jaffe & Quinn, 1993). Primitive forms of mathematics (before Euclid's Elements) did not reflect an awareness of the need for proofs, until the Elements came to be regarded as the prototype of how a mathematical system should look (Krantz, 2007).
Ever since the Elements, rules have been set to demand that a mathematical proof must
root in definitions and axioms and evolve following acceptable forms of deductions.
Despite the historical and ongoing debates about what can be used as definitions, axioms,
and deductions within the community, consensus exists among mathematicians that a
mathematical proof must be timeless, impersonal, rigid and dependable (Brabiner, 2009;
Davis, 1976; Krantz, 2007; Tall et al., 2012). It is such a pursuit that makes mathematics
a reliable tool that is widely applied in physics, engineering, economics, and many other
disciplines.
Traditional discussion of mathematical proof focused on its reliability in
determining the truth of a statement (Brown, 2008). Such a perspective places emphasis
on precise descriptions of the definitions and premises (axioms) and a rigorous layout of
steps of deductions to make sure proofs were presented as a delicate and complete product, a style epitomized by Gauss, who "didn't leave up the scaffolding so that people could see how he constructed a
building” (cited in Krantz, 2007). David Hilbert had hoped for a rigorization of
all of mathematics, but this conjecture was proved unachievable (Gödel, 1931). The influential modern mathematics
book series published in the middle of the last century, written by the Nicolas Bourbaki group, strictly adheres to the doctrines of formal mathematics, offering stern axiomatic structures and excluding pictorial or other forms of intuitive assistance in proofs.
The New Math curriculum extended such a style into the education of growing
individuals, expecting early exposure to a rigorous format could help integrate such
practices into students’ mathematical thinking (Hanna, 1983). The pursuit of formal proof
has influenced generations of mathematicians and has greatly advanced the community's knowledge, but its educational impact was also criticized by scholars (Freudenthal, 1973; Lakatos, 1976; Schoenfeld, 1988).
Lakatos (1976) wrote "… (In formal proof) all propositions are true and all inferences valid … an authoritarian air is secured for the subject … Deductivist style hides the struggle, hides
the adventure …” (p. 142). Hanna (2000b) also claimed that “a proof, valid as it might be
mathematician only when it leads to real mathematical understanding” (p. 7). Krantz
(2007) expressed the same opinion, advocating “In mathematics, we are not simply after
the result. Our ultimate goal is understanding” (p. 32). Tall (1999) added to the
discussion by suggesting “formal proof is appropriate only for some, that some forms of
proof may be appropriate for more" (p. 1). De Villiers (1990, 2003) offered a framework describing the functions of proof, including verification, explanation, systemization, discovery, communication and intellectual challenge. All these efforts tend to broaden the view of proof beyond formal verification. Meanwhile, the teaching of proof has also faced difficulties in school practice, especially at the introductory levels.
Historically (and currently), in the US, a course on Euclidean geometry has served as the
main venue for the development of students’ skills in deductive reasoning with the
expectation that such skills would automatically transfer to other mathematical and non-
mathematical areas (González & Herbst, 2006; Herbst & Brach, 2006). This goal, however, has hardly been achieved. Research findings on this matter are amazingly uniform; they show that most high school and college students do not know what constitutes a valid proof. Scholars have recognized that this failure might be due to the school treatment of topics in curriculum
and instruction. There is evidence that in many mathematics classrooms proofs and
the proving process are taught as procedural topics instead of as conceptual tools for reasoning
(Herbst & Brach, 2006; Reid, 2011). As a consequence, students tend to view proof as a
special “form” of producing written work (e.g. two-column proof) instead of a viable
vehicle for production of reliable explanations, or even means for understanding (Chazan,
1993; González & Herbst, 2006; Healy & Hoyles, 2000; Schoenfeld, 1988). Additionally,
students' ability to evaluate the validity of arguments remains underdeveloped at all grade levels (Chazan, 1993; Chazan
& Lueke, 2009; Harel & Sowder, 1998; Heinze & Reiss, 2009; Kuchemann & Hoyles,
2009; Mason, 2009; Waring, 2000; Weber, 2001; Schoenfeld, 1988). Furthermore, even if
a learner showed an awareness of and the ability to produce complete proofs in a certain
mathematical domain, such knowledge might not transfer to other topic areas, nor would it necessarily endure over time (Fawcett, 1938/1995; Freudenthal, 1971; Liu & Manouchehri, 2012; Reid, 2011). Therefore, calls
for shifting the focus of instruction from assimilating students into the tradition of formal proof toward helping them reason coherently about concrete contexts have been made. Following such a trend, recent
reform efforts on mathematics curriculum place less emphasis on the layout of proof
while paying more attention to nurturing students' proof skills through understanding of
specific topics throughout the grades (de Villiers, 1990, 2003; Hanna, 2000a, 2000b; Reid,
2011).
The Principles and Standards for School Mathematics (NCTM, 2000) emphasized that the ability to reason and produce proofs must be fostered at all levels of the mathematics
curriculum (Hanna, 2000a). According to the standards, K-12 mathematics education
should enable high school graduates to "recognize reasoning and proof as fundamental aspects of mathematics, make and investigate mathematical conjectures, develop and evaluate mathematical arguments and proofs, and select and use various types of
reasoning and methods of proof” (p. 56). Furthermore, there is an explicit statement that
suggests nurturing the proof capacity in a broader content area, addressing “reasoning
and proof cannot simply be taught in a single unit on logic, for example, or by "doing proofs" in geometry" (NCTM, 2000). The Common Core State Standards (CCSSO, 2010) also place tremendous
emphasis on the need to assist students in developing their proving skills. Among the 8 Standards for Mathematical Practice, several (reason abstractly and quantitatively, construct viable arguments and critique the reasoning of others, look for and make use of structure, look for and express regularity in repeated reasoning) are directly related to reasoning and proving. Realizing these standards demands a departure from the traditional classroom culture. The key idea of the transformation is that elements and
determinant of the curriculum and instruction. The nature of students’ thinking and
respected in the design and practice of teaching (Ball & Bass, 2000, 2003; Boero, 2007;
instructional and curricular models that nurture and promote students’ comprehension of
proof and their ability to produce mathematically complete arguments, an understanding
of the nature of students’ thinking in proof related activities must first be developed.
cohort seeks evidence that students possess the ability to use deductive reasoning in
constructing arguments and proofs, even at the early elementary grades. The second
cohort describes students’ common difficulties and mistakes in producing proofs across
the grade levels and content areas. The third cohort offers an account of pedagogical
factors that could facilitate students' learning about proofs. Although these three cohorts offer valuable implications for practice, they do not posit a framework to capture the features of
students’ thinking when performing proof related tasks. Studies of students’ proof
schemes tend to close this gap by creating a framework that classifies different types of
proofs that students offer. Following previous scholars’ work, such as Bell (1976) and
Balacheff (1988, 1991), Harel & Sowder (1998) organized the types of proof students
may use in various content areas of mathematics and proposed a taxonomy of proof
schemes consisting of three main categories, i.e. “external,” “empirical,” & “analytical,”
achieve a more mature comprehension of mathematical proof. The van Hiele levels (van
Hiele, 1986) is one of the most well-known frameworks to outline the stages in the
different content area. Frameworks that address explicitly proof learning include the
proof levels (Waring, 2000), reading comprehension of geometry proof (Yang & Lin,
2008), and the broad maturation of proof structure (Tall et al., 2012). Detailed account of
Harel & Sowder (1998, 2007) observed that students could simultaneously hold
different proof schemes when working on different problems. Their model detects such a
difference but does not explain why such inconsistency might exist. The cognitive development frameworks outline how learners' comprehension of proof may mature within a certain mathematical field, but fail to describe why and how such a development may emerge across content area differences. The categories, levels, and stages offered by
existing models are not precise enough to draw connection to students’ evaluation of the
arguments. Hence, little can be said about what kind of mathematical arguments students
find appealing, convincing, or explanatory since even arguments that are classified as the
same type can be judged quite differently among people and across the content areas.
In order for instruction to enable students to understand and appreciate proof as a reliable way of reasoning (de Villiers, 2003; Fawcett, 1938/1995; Reid, 2011), learning how students evaluate arguments is as important as teaching the skills of producing specific proofs. As Usiskin (1980) pointed
out, there are various ideas, methods and layouts of proofs in different branches of
mathematics. Therefore, investigations into the impact of content on students' use and evaluation of proofs are needed. To explore such an impact on students' preferences among arguments, a pilot study was conducted involving 41 secondary school students. The
participants were drawn from 19 different middle schools across the state of Ohio, suggesting variety in both the content and heuristics they may have experienced at the
time of data collection. A Survey of Reasoning (SR) was designed and used to examine the participants' evaluation of arguments, and as a means to closely inspect the potential relationship between a problem's content and proof schemes. The survey contained problems from four branches of school mathematics (i.e. number theory, geometry, probability, and algebra). Each problem
consisted of several parts. First, the participants were presented with a conjecture and
were asked to determine whether they agreed with and were certain of the accuracy and
completeness of the statement. They were also asked to offer an explanation for their
choice and factors they considered when evaluating the statement. In the second part,
four arguments, each embodying a different proof scheme supporting or refuting the same
statement, were offered. The participants were asked to compare their own argument to
those given, and to decide whether they preferred any of the optional statements over
their own method. Lastly, they reported whether or not they considered each of the arguments convincing and mathematically complete. I chose the terms convincing and mathematically complete to evaluate students' "two
conceptions of proof” (Healy & Hoyles, 2000), assuming that when judging the
The proof schemes of each individual’s favorite arguments varied across the
four problems.
argument convincing in one problem but labeled the argument with the same
The students didn’t necessarily persist on their own proof scheme when they
for it over his/her own argument, even when the two arguments represented
Results from the pilot study suggested that the students adopted and determined
their preferred reasoning schemes based on the concrete context of the problem instead of
following a broader uniform scheme. This implied that the transfer of proving skills from one content area to another cannot be taken for granted. Although it documented the participants' preferences among arguments, the pilot study could not explain why the participants had made certain choices.
For instance, the pilot study categorized arguments based upon researchers’ interpretation
of the brief written explanations that the students had produced. This information was limited; without access to the students' own accounts of how they interpreted the arguments, it is impossible to identify the factors that shape students' views of those
arguments. Therefore, the current research was conceptualized to extend the previous
work and to shed light on the processes and resources students draw from when judging
mathematical proofs.
Purpose of the Study
wide range of mathematical contexts. Data collection and analysis were guided by the following research question: Are there common aspects and features of arguments that significantly impact
individual decision making when evaluating mathematical arguments that can inform curriculum and instruction? Drawing from Harel & Sowder's (1998) proof scheme
taxonomy and Yang & Lin's (2008) Reading Comprehension of Geometry Proof (RCGP) model, a framework for analyzing students' evaluation of arguments was developed.
Adopting a mixed methods design (Greene, Caracelli, and Graham, 1989), the
study consisted of the development, administration and analysis of a survey and follow-up interviews. The survey and interview protocol were designed and refined in 2012. The
revised survey (Survey of Mathematical Reasoning, SMR, see Figure 8) was administered
in January-February 2013, and the follow-up interviews were conducted in April 2013.
The population of interest in this study was 8th grade students. Two reasons motivated this choice. First, according to Piaget's theory of cognitive developmental stages, middle school students are at a critical cognitive phase where they can engage in
abstract and logical thinking. Therefore, how they learn to value different arguments at
this stage could potentially impact their reasoning skills and thinking habits in the later
years. Second, the grade band serves as a bridge between middle and high school
mathematics, and as the link between informal and more formal, abstract mathematical
reasoning. According to the curriculum standards (CCSSO, 2010), most 8th grade
students should have obtained a basic understanding of numbers, shapes, chance, and
algebraic expressions, know some simple propositions and properties, and be able to see
the connection between concepts and ideas. However, they may not have yet adopted
proving techniques and forms. Therefore, the features of arguments they consider as
convincing, explanatory, and appealing can offer valuable references for the development
of resources and instructional explanations that can facilitate students' internalization and appreciation of mathematical reasoning.
Data collection followed two phases. During the first phase, over 500 8th grade
students from 5 different public schools in Ohio took the SMR. The students’ responses
were then analyzed quantitatively to investigate their evaluation of the arguments used in
the SMR. In particular, the goal was to identify the types of arguments that they found convincing, explanatory and appealing. During the second phase, eight selected subjects were interviewed. Common factors that impacted each subject's evaluation were summarized
and the individual differences were investigated through between subject contrasts.
Details about the participants of the study, development of survey instrument, procedures
of the interview, as well as the data analysis process are described in Chapter III.
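The pairwise comparisons of argument ratings reported in the appendices can be illustrated with a simple two-sided sign test. The sketch below is a hypothetical example (the study's actual statistical procedures are those described in Chapter III, not this code), assuming each pair of arguments is first reduced to counts of participants who rated one argument higher than the other, with ties dropped:

```python
from math import comb

def sign_test_p(wins_a: int, wins_b: int) -> float:
    """Two-sided sign test for a pairwise comparison of two arguments.

    wins_a / wins_b: numbers of participants who rated argument A
    (resp. B) higher; ties are excluded beforehand.  Returns the
    probability, under the null hypothesis that the two arguments are
    rated equally, of a split at least as extreme as the observed one.
    """
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Upper-tail P(X >= k) for X ~ Binomial(n, 1/2), doubled for a
    # two-sided test and capped at 1.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Example: 8 of 10 participants preferred argument A over argument B.
p = sign_test_p(8, 2)  # p = 0.109375
```

With 8 of 10 participants favoring one argument, the two-sided p-value is 0.109375, so a preference of that size would not reach significance at the 0.05 level; larger samples, like the 500+ participants here, allow much finer comparisons.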
This study had the potential to advance the understanding of proof learning on three levels. First, empirical studies of middle school students' evaluation of different proof types have been rare. Second, investigations that seek to identify
consistent features across content areas that individuals might consider when evaluating
mathematical arguments have been prominently absent from the literature. Lastly, the
type of studies that have worked towards developing a framework useful for identifying what students attend to when evaluating mathematical reasoning has been underdeveloped. The current study aims to make novel contributions at all three levels.
CHAPTER 2. LITERATURE REVIEW
This review offers a summary of literature on the nature and functions of proof in
and known principles by the continuous and uninterrupted action of a mind that has a
clear vision of each step in the process” (cited in Baker, 2009, p. 1). Krantz (2007)
described mathematics as “(i) coming up with new ideas and (ii) validating those ideas by
way of proof” (p. 33). Despite such emphasis, the precise nature and role of mathematical
proof has long been debated by mathematicians and mathematical educators (de Villiers,
1990, 1998; Hanna, 2000b; Lakatos, 1976; Krantz, 2007; Tall, 2002). Because of the
centrality of proofs and proving in mathematics, discussions surrounding their nature have
resided at the heart of the philosophy of mathematics. In this section I will offer a review of three major philosophical standpoints: Platonism, formalism, and constructivism. These philosophical standpoints offer different, if not contradictory, accounts of the role and
function of proof in mathematics and hence provide distinct educational implications for
proof instruction.
Platonism
Scholars generally agree that Platonism in mathematics (in its pure sense) views mathematical
objects as abstract yet eternal and unchanging (Armstrong, 1970; Balaguer, 2008). For
example, when considering the "fact" that 1 + 1 = 2, those espousing Platonism believe that the numbers exist as abstract objects, and consider their relationship demonstrated by the equation an eternal truth independent of human activity. The work of mathematicians under this paradigm is to explore and discover the unknown underlying truth (Weir, 2011).
According to Platonism, axioms must describe absolute and eternal truths of the world,
and proof is a method to find other absolute and eternal truths determined by the axioms.
However, Platonism has been "rapidly losing support" (Weir, 2011) with the development of modern philosophy of mathematics, while formalism and constructivism have gained considerable attention and raised extensive discussion.
Formalism
Formalism views mathematics as a systematic structure built upon axioms following certain rules. Carnap (1937) suggested:
The concepts of mathematics can be derived from logical concepts through
explicit definitions.
The theorems of mathematics can be derived from logical axioms through
purely logical deduction.
These suggest that a mathematical axiomatic system must start with finitely many
statements (namely axioms/postulates) that are assumed to be true, and that the judgment
of validity of other statements in that axiomatic system must be based on deduction from
them. With a deeper and more abstract conception of the deductive procedure, mathematicians
attempt to convert mathematics into a symbolic system using Set Theory (Johnson, 1972)
by restricting what procedure could be considered as logically valid (i.e. what deduction
is) and what concept is usable in deduction (i.e. what a set is). David Hilbert, arguably the
most prominent mathematician of the formalist genre, set the groundwork for this formalization program.
Formalism agrees with Platonism by envisioning an ideal and static system within
which truth and falseness are indisputable. As such, formalism has a strong tendency to
eliminate the impact of human perception since the criteria allowed in mathematical
deduction are impersonal. However, formalism only concerns the validity of statements
within the established system without articulating how truth within the system relates to
truth enclosed in nature (Weir, 2011). For instance, a + b = b + a holds in a commutative
group, but a formalist is not at all interested in how “a”, “b” and “+” in the equation relate
to quantities and operations in real life. In other words, “truth” as viewed by a formalist is
a relative “truth,” which is “attached” to the validity of the axioms (assumptions). Hence
formalism rests on a very restrictive and artificial setup of a mathematical system, one that is detached from the truth enclosed in nature. Gödel's (1931) incompleteness theorems posed a fundamental challenge to formalism. In particular, the first incompleteness theorem claimed that if the finitely many
axioms in the system are not contradictory to each other, then there must be a statement
within the system whose validity cannot be determined by the axioms and deductive
results upon them. The second incompleteness theorem strengthened the first by claiming that a system of this kind can never verify its own consistency: if an axiomatic system could derive a confirmation of its own consistency, then the existence of a contradiction (an inconsistency among the axioms) would become inevitable.
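The two theorems can be stated schematically as follows (a standard modern paraphrase, not a quotation from the 1931 paper). Let T be a consistent, effectively axiomatized theory strong enough to express elementary arithmetic:

```latex
\begin{align*}
\textbf{First:}\quad  & \exists\, G_T \ \text{such that}\ T \nvdash G_T \ \text{and}\ T \nvdash \neg G_T,\\
\textbf{Second:}\quad & T \nvdash \mathrm{Con}(T),
\end{align*}
```

where Con(T) abbreviates the arithmetized statement "T is consistent."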
The incompleteness theorems impacted formalism in two ways. First, the perfection of the axiomatic system envisioned by formalists is
totally denied. No matter how deliberate the axioms may be set up, the deductive system
won’t be able to solve all the problems within the system. In this sense, the methodology
theorems don’t demolish the value of the deductive system, in the sense that the system is
never perfect but functional and powerful in a very broad context. In fact, formalism
distance, area and volume. To date, the discovery and proof of new theorems is widely
Constructivism
Constructivism, radical or social, has largely been adopted in the social sciences,
and the study of mathematics education is not an exception (von Glasersfeld, 1994;
constructivism does not take any particular individual’s subjective opinion into
consideration. Instead, the theory makes assumptions about what types of deductions
people intuitively and commonly accept, and redefines a new type of logic such that it is
no longer based on the classical Set Theory. For example, the intuitionistic logic is
actually a revision of the classic Set Theory logic obtained by removing the law of the excluded middle.
mathematical concepts and axiomatic systems) built upon human intuition, representation
instead of a pre-existing, ideal and perfect form that people attempt to discover.
understanding (opposing Platonists’ view). Rather “concepts and structures are the result
(and time) and are further extended, by language and logic” (p. 22, Longo, 2009).
Therefore, mathematical concepts are not static. They may change as the result of the
discovery of new cases or when new needs emerge in the community. Different from
Footnote 1: This interpretation is consistent with the concept of constructivism that is widely used in social science research. Since
ultimately this study concerns how an individual can develop deductive reasoning skills, human experience and
activities in conceptualizing and establishing mathematical ideas and structures offer valuable references. Hence for
convenience, constructivism in later text of this dissertation will all refer to the second interpretation unless specially
explained.
Lakatos (1976) offered remarkable examples to illustrate such a perspective. In
Proofs and Refutations, Lakatos recreated an imaginary classroom scenario in which
students attempted to prove Euler's polyhedron formula.
During the process, students found that their conception about what a face, an edge and a
vertex might be was substantially less rigorous than they had realized. Consequently the
students engaged in a discussion about the precise definition of those concepts (e.g.
vertex, edge, face, simple polyhedron, etc.). Students challenged each other’s definitions
using specific counter examples to demonstrate the incompleteness of the defined terms
(polyhedron) was refined and deepened gradually in such a “proof and refutation”
process. Lakatos suggested that concepts, as the foundation of mathematical systems, are
humans and refined to fit into a more useful theory. Neither was it predetermined that
there must be Euler’s polyhedron formula in the theory. Indeed this property rests upon
existing definitions. However, there was no guarantee that it must serve as an important
1976) may lead the development of theory towards another direction (referring to
For instance, visual aids are used to prove the Pythagorean Theorem in Euclidean Geometry;
however, such an approach is not considered reliable in Real Analysis. Intuitionism even
rules out the method of proof by contradiction. Therefore, constructivism suggests that
concepts, axioms, propositions and proving methodologies in mathematics are all
classroom depicted the journey which mathematicians took to establish the current
enterprise of mathematics, advocating the need for absolute respect for the natural
system, which is consistent with a formalist’s view. In fact, constructivism may suggest
that many of the axiomatic systems built by formalists are the best models that
accepted by a formalist in many cases. Taking the proof of Jensen’s inequality (the simple
definitely more complete and accurate. Constructivism endorses the reliability of visual
implication since it is valid and efficient in many situations (e.g., elementary Euclidean
geometry, graphs of low-degree polynomials); therefore judgment with visual aids is
reliable when dealing with cases within a certain scope, even though this method may not
apply to a broader context. However, formalism may degrade the reliability of adopting
visual implication for this specific proof, since visual reasoning cannot deal with the Dirichlet function. In
broader context, which won’t be influenced by any particular problem solver’s own
Summary
There is consensus within the mathematics community that proof must start with
facts that are known or assumed to be true, use if-then logic (regardless of whether the
logic must be defined upon Set Theory), and then the truth or falsity (either in a relative or
absolute sense) of the targeted statements should be established (Harel & Sowder, 1998;
Krantz, 2007; Tall et al., 2012). Platonism suggests that proof discovers and verifies truth
based on known truth, while formalism and constructivism suggest that proof starts with
validity of a proposed statement. However, a closer look reveals four major differences
satisfies certain criteria which guarantee the validity of proof, despite the
sense that concepts, axioms and proofs are all invented to solve problems and
Static vs. Dynamic. Formalism holds a narrower and more static view of
what can be viewed as assumptions and what kind of deductive steps should
mathematicians. Hence they evolve with the rise of new questions and
Restrictive vs. Open. Formalism tends to deny or degrade the validity of
agrees that there could be proofs that are reliable in a broader context,
however the “best” proof mathematicians could come up with is not the only
acceptable form of proof and cannot replace the role and value of less rigid
arguments.
Global vs. Local. Formalism suggests that when proving the validity of a
know the scope in which the discussion lies and use methods within that
draw from their own experience and community to determine the scope of
implication for mathematics education, especially at the introductory levels (Hersh, 2009;
complete and rigid form, tends to introduce a theory from a careful layout of its basic
knowledge upon the foundation. Constructivism suggests guiding the learners through the
journey that previous mathematicians took to establish theories, i.e. first offering
premature and informal perception of the subject and then refining and formalizing the
understanding through problem solving and critical reflections. These two instructional
elementary, secondary and early college level, particularly following the decline of New
between and shift depending on specific situations. In addition, there are different
terminologies used by scholars (e.g. Absolutist vs. Fallibilist (Ernest, 1996)) to describe
philosophical views of mathematics, and these different classifications maintain their own
constructivism and various other terms are perceptual concepts instead of defined
concepts (Bruner, 1987). Nevertheless, the essential purpose of the comparison is not to
determine how the three philosophical perspectives differ, but to help identify and
describe the criteria and features that a mathematical proof may possess. From the
standpoint of a mathematics educator, the subject of study is learners with evolving views
constructivism serves as the basis for theoretical models in the learning of proof as well
as other fields in mathematics education research (Balacheff, 1991; de Villiers, 2012; Tall,
mechanical procedure. It shifts the attention from the content to learners’ thinking and
behavior. This implies that the content itself no longer solely determines how it should be
taught, rather learners’ “natural” behavior in the learning process must also be considered
strictly followed; rather, the instruction should guide students to develop a personal
meaning of proof, in particular why proof is needed and what features it should possess to
meet the need (Hersh, 2009). Such a focus calls for investigations into two critical
questions:
The following two sections are devoted to reviewing and summarizing the
served as a primary function in mathematics ever since it started to be used (Krantz, 2007;
Tall et al., 2012). Without understanding the concept of proof, it is impossible to perceive
what mathematical theory and practice might mean (Hanna, 2000b). With the prevailing
importance to the subject. More recently, with the rising attention to the perspective of
systematization. Balacheff (1991) suggested that “... a mathematical proof is a tool for
mathematicians for both establishing the validity of some statement, as well as a tool for
communication with other mathematicians” (p. 178). Schoenfeld (1994) claimed that “it
(p.76). Reflecting on existing literature and his own experiences with teaching and
learning of mathematics, de Villiers (1990, 2003) outlined six major functions of proof in
constructing a proof).
Verification
proved to be true without errors, then its correctness must have been clarified and there is
remains unclear until it is proved. Until then, the statement can only be regarded as a
Footnote 2: "systemization" is the original spelling used by de Villiers.
hypothesis even if it is regarded as true by mathematical authorities and no counterexample is
found. Although de Villiers also pointed out that “proof is not necessarily a prerequisite
for conviction – to the contrary, conviction is probably far more frequently a prerequisite
for the finding of a proof” (p. 18), the level of conviction that mathematicians acquire
before and after obtaining a proof can never be denied. There is a clear difference
between "pretty sure" and "absolutely sure." In fact, there were many occasions in the
history of mathematics when a widely believed statement was later
proved to be incorrect. One famous example could be the Kakeya needle problem, which
was proposed by Kakeya (1917) in an attempt to find the minimal area in 2-dimensional
Euclidean space within which a unit line segment can be rotated continuously through
180 degrees. Many mathematicians (including Kakeya himself) seemed to believe that
the deltoid would be the solution, since the deltoid is composed of such elegant curves that
seem to satisfy conditions so crucial to obtaining the "minimum." Much effort was
devoted to proving that the deltoid is the correct solution, until the Besicovitch set, a much
more complex and “artificial” shape, was proved to be the right response to the problem
(Besicovitch 1919; Pal, 1920). Another famous example is Leonhard Euler’s conjecture
which was found after almost 200 years (cited by IAS/PCMI, 2007). Since there is no
guarantee that a statement that seems true to people's or even experts' intuition
will hold true in mathematics, it is proofs that distinguish true results from seemingly
Explanation
A teacher asked a student to write a 4-digit whole number on the blackboard, then
the teacher immediately said “it is divisible by 3.” The student checked with a
calculator and found out the teacher was correct. Then the student challenged the
teacher again with even larger whole numbers and the teacher could make a
If you were the student, what would you want to know? Most likely, with the
cases, we gathered strong inductive evidence for it … When you have satisfied yourself
that the theorem is true, you start proving it” (p. 83-84). This leads to the second function
As illustrated in the imaginary scenario, knowing the teacher was correct didn't
satisfy the student's curiosity (in fact, curiosity was more likely to be aroused), and didn't
advance the student's understanding of the subject (Hanna, 2000b). De Villiers suggested
“Proof helps us understand and explain mathematics” (p. 16, Reid, 2011). Proof
connects phenomena with more basic rules (theorems and axioms) that seem obvious and
and intentionally pursuing the structure of knowledge built by proof leads to a higher
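Returning to the opening scenario, the teacher's trick rests on a proof, not on luck: every power of 10 leaves remainder 1 when divided by 3, so a number and its digit sum leave the same remainder. A minimal sketch of the trick (the function names are my own, chosen for illustration):

```python
def digit_sum(n: int) -> int:
    """Sum of the decimal digits of a non-negative integer."""
    return sum(int(d) for d in str(n))

def divisible_by_3(n: int) -> bool:
    """The teacher's trick: n is divisible by 3 iff its digit sum is.

    This works because 10**k % 3 == 1 for every k >= 0, so n and its
    digit sum leave the same remainder when divided by 3.
    """
    return digit_sum(n) % 3 == 0

# The trick agrees with direct division on every 4-digit number.
assert all(divisible_by_3(n) == (n % 3 == 0) for n in range(1000, 10000))
```

Checking examples with a calculator, as the student did, only verifies instances; the congruence argument explains why the trick can never fail, illustrating the explanatory function of proof.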
Systemization
indispensable tool for systematizing known results into a deductive axiomatic system (de
Villiers, 1990).
Even those who can generate a proof for some complex propositions (e.g. the Nine-Point
Circle) in Euclidean Geometry may not be conscious of the five (or ten, if counting
the assumptions about algebra) basic assumptions or be aware of how the axioms and
may only involve understanding of a small part of the system; and 2) accepting if-then
perceiving proof as a local illustration of how the system works. It is commonly believed
that an overarching understanding comes after adequate local experiences (e.g. the van
Hiele model, 1986). Additionally, proof and investigation of a single case may also
inspire or directly cause a more insightful or even revolutionary view about the
knowledge structure (Lakatos, 1976). For example, the proof of the Chinese Remainder
Theorem inspired the understanding of ideals in ring theory; Russell's paradox (1903)
caused a reconsideration and reconstruction of the logic system. This leads to the next function of proof.
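Russell's paradox, for instance, takes only one line to state:

```latex
% Unrestricted comprehension permits the set of all sets that do not
% contain themselves, which is immediately self-contradictory:
R = \{\, x \mid x \notin x \,\}
\quad\Longrightarrow\quad
(R \in R \iff R \notin R),
% a contradiction showing that unrestricted set formation is untenable,
% which motivated axiomatic reconstructions of set theory such as ZF.
```

A single deductive argument about one "case" thus forced a rebuilding of the foundations of the entire logical system.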
Discovery
empirical investigation (e.g., the law of large numbers) and guess-and-trial (e.g., the four-color map conjecture/theorem). There are also new results that "were discovered or
invented in a purely deductive manner” (de Villiers, 1990). Reid (2011) illustrates this
point by suggesting that a proof of the statement that “the sum of two consecutive odd
numbers is even” leads to the discovery of a new fact that the sum must be a multiple of 4.
Another example could be Euler's polyhedron formula, which sharply narrows down the
possible cases of regular polyhedra and directly implies the discovery of all the
possible cases.
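Both examples can be made explicit in a few lines of algebra:

```latex
% Sum of two consecutive odd numbers:
(2n+1) + (2n+3) = 4n + 4 = 4(n+1), \qquad n \in \mathbb{Z},
% so the proof of evenness discovers the stronger fact that the sum
% is always a multiple of 4.

% Regular polyhedra from Euler's formula V - E + F = 2: with p-gon
% faces and q faces meeting at each vertex, pF = 2E = qV, hence
\frac{1}{p} + \frac{1}{q} = \frac{1}{2} + \frac{1}{E} > \frac{1}{2},
% an inequality admitting exactly five pairs (p, q) with p, q >= 3:
% (3,3), (4,3), (3,4), (5,3), (3,5), i.e., the five regular polyhedra.
```

In both cases the deduction itself, rather than any empirical search, produces the new result.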
Perhaps the best presentations of proof's discovery function lie in the natural
sciences, where many phenomena were "found" in the theory before being
discovered in reality. This is among the most important reasons for mathematics to be
such a popular tool in those disciplines. A well-known example is the discovery of the
gravitational lens (bending of light by mass), which was deduced from Einstein's general
theory of relativity before its actual appearance was found. Discovery in natural science by proof has a strong implication for
Platonism, in the sense that there are perceivable and predictable orders pre-existing in
talking about the verification and explanation function of proof, we consider the process
of tracing the statement “down” to the axioms and theorems. However, neither an
individual nor a community can examine or discover all true statements within a system,
especially when the system is newly established. Hence, when building “up” the theory
upon the axioms, theorems and other known results, it is quite possible to encounter
Communication
perspective of Platonism, the communication function is more likely to share the basis of
mathematical proof has two major features: clear definition and rigid layout of causal
addition, communication by proof also serves to minimize the level of intuitional
confusion in the reasoning process. However, it is impossible to radically remove
intuition from proof. Description of the concepts involves external or even non-mathematical
language (Krantz, 2007). Even when performing perhaps the most rigorous
standard of deduction, as formalized by Set Theory, intuition still plays a part when
visualizing the inclusion and exclusion relationships. Nevertheless, there are intuitions
that seem to be accepted by all human beings, and hence they are used to form a common
ground for critical debates (Davis, 1976). Mathematical activities ultimately pursue
commonly accepted facts and perform commonly accepted reasoning. Proof, loyal to
both aspects, serves as the most explicit and reliable tool in communicating the substance
of mathematical thinking.
Intellectual challenge
and fulfillment. The motivation of doing proofs may come from the desire to conduct
mathematical proof is set to pursue a common ground and a generally accepted way to
present causal relationships, those who can actually understand, appreciate and utilize the
idea and structure of mathematical proof only compose a small portion of the population.
scholars and learners of mathematics, science, philosophy and other logic intensive
disciplines.
Implications of de Villiers' model for the learning of proof
mathematics and its educational implications are of great importance to the mathematics
proof can be stimulated by the curiosity of knowing “if something is true,” as well as the
willingness to know “why something is true,” “how things relate to each other,” “what
else may be true,” and “how to let other people know my ideas.” The functions proposed
by de Villiers are well aligned with learners’ interests brought into the context (see Table
1).
Table 1. The alignment between functions of proof and learners’ purpose in conducting
proof
Healy & Hoyles (2000) categorized students’ view of proof and its purposes in a
large scale empirical study of children aged 14-15. They found that 28% of the students
didn’t show any understanding of the purpose of proof. In addition, only 1% of the
students acknowledged that proof might help discover new theories or systemize ideas.
The most recognized functions of proof were verification and explanation (see Footnote 3). Furthermore,
Healy & Hoyles posited that students’ understanding of the purposes of proof had a
significant influence on their ability to identify and construct a proof. From the
explanatory they become more likely to assimilate it into their own reasoning method.
arguments, it is impossible to understand why certain decisions are made. Therefore, the
Certainly, identifying the possible intentions only serves as a starting point toward
might still not know why a certain strategy (algebraic vs. geometric, empirical checking
vs. deductive reasoning, etc.) was valued by the student. They might not know why the
student failed or succeeded in achieving his/her goal either. Hence, in order to obtain a
Footnote 3: In Healy & Hoyles' (2000) classification of students' view of the purposes of proof, the category named "explanation"
included both explanation and communication as identified in de Villiers’ (1990, 2003) model.
are needed to explore not only how students perform on tasks that demand proof, but also
their thinking, which can be approached through carefully designed questions. Studies that
research on proof, three major categories. The first category includes studies that
investigate students’ ability to perform proof related activity (Ball & Bass, 2003; Lampert,
1992; Marrades & Gutiérrez, 2000; Reid, 2002; Sekiguchi, 1991; Zack, 1997). This body
of work suggests that students naturally possess the ability (Piaget, 1928, 1987) to reason
even at early elementary grades (Zack, 1997). They call for the design of interventions
that encourage students to reason coherently instead of assuming they are not ready and
providing them cognitively soft tasks to do (Bloom, 1984; Usiskin, 1987). The second
category of studies describes students' common difficulties and mistakes when producing
proofs across different grades and content areas (Balacheff, 1988; Chazan, 1993;
Schoenfeld, 1988; Senk, 1985). The third category elaborates on the pedagogical factors
that could facilitate students’ learning about proofs (Hoyles, 1997). These three
theoretical essays offer insights into students' ability along with the challenges they
experience when learning proofs (Pirie, 1988). However, collectively, they fail to provide
performing proof related tasks. This gap inspired a body of studies on learners’ proof
schemes.
Proof schemes
The study of learners' proof schemes has a long history and is currently a main
strand in the didactics of mathematics. For instance, Bell (1976) identified "Empirical" and
“Deductive” as two major modes of justifications that students used when working on
examples (or on actions), and conceptual justifications are based on abstract formulations
conceptual justification, in which actions are internalized and dissociated from the
specific examples and the justification is based on the use of and the transformation of
formalized symbolic expressions (see Figure 1). Balacheff (1988) concluded that while
students experience difficulty producing proofs, they nevertheless show awareness of the
[Figure 1: Pragmatic justification vs. conceptual justification]
Extending the research of Bell (1976) and Balacheff (1991) and drawing from a
considerable collection of empirical data, Harel & Sowder (1998) proposed a taxonomy
of proof schemes consisting of three main categories, i.e. “external,” “empirical,” and
particular, external conviction proof schemes include instances where students determine
the validity of an argument by referring to external sources, such as the appearance of the
argument instead of its content (e.g. they tend to judge upon the kind of symbols used in
the argument instead of the embedded concepts and connection of those symbols), or
verify the validity of an argument; the former draws heavily on examination of cases for
convincing oneself, while the latter is grounded in more intuitively coordinated mental
axiomatic modes of reasoning which include resting upon defined and undefined terms,
Figure 2. Proof schemes and subschemes (Sowder & Harel, 1998)
Although the existing frameworks of proof schemes provide a powerful vehicle for
classifying the types of proofs produced, they do not trace the cognitive stages that
mathematical proof. Attempts have been made to address this gap by studies focused on the
A great deal of research has been undertaken that explores and describes the
proof from the early stages in which s/he only possesses a primitive understanding of
mathematical objects and actions to more advanced levels where s/he is capable of
axiomatic reasoning (Tall et al., 2012). Since the ability to generate logical arguments is
among the most essential goals of any area of mathematics, progress in understanding the
mathematical reasoning. The well-known van Hiele model (1986) for geometric thinking
The van Hiele model was originally proposed by two Dutch teachers, Pierre van
Hiele and Dina van Hiele-Geldof. They designed a framework which could depict the
development of geometric reasoning and hence explain how people grow in their
passes when learning geometry were identified, including “visual,”
Hiele, 1986, see Figure 3). A brief description of each level is presented below.
[Figure 3: The van Hiele levels. Level 1: Visual; Level 2: Descriptive/Analytic; Level 3: Informal Deduction; Level 4: Deduction; Level 5: Rigor]
At the visual level (Level 1), learners could identify, name, and compare
geometric figures, such as triangles, rectangles, angles, parallel lines, etc., according to
how they look. For example, at this level, students may see the difference between
triangles and quadrilaterals by counting the number of their sides, but they may not be
and properties of a figure; however, they cannot reason upon those properties. They are
able to describe figures in terms of their parts and relationships among these parts, to
summarize the properties of a class of figures, and use properties to solve basic
identification problems, but they cannot yet conduct deduction. For example, learners
know a right triangle is a triangle that has a right angle, but they cannot explain whether it
At the informal deductive level (Level 3), learners are able to connect figures with
their properties. They can classify figures by their properties as well as articulate the
properties of a given figure. The learners can understand and use precise definitions.
They are capable of using “if-then” thinking, but they cannot consciously use
mathematically correct language, nor can they realize the deductive property of their
mathematical foundation. For example, the learners are able to claim that it is impossible
for a right triangle to have two right angles because if so there will be two sides that
cannot “meet.”
At the deductive level (Level 4), learners can reason about geometric objects
using their defined properties in a deductive manner. They could consciously construct
the types of proofs that one would find in a typical high school geometry course. They are
At the highest level, rigor (Level 5), learners can compare different axiomatic
systems. Learners fully understand the structure of a system as well as its applications
The van Hiele model has been modified and extended by scholars to meet
particular research interests. For example, Clements and Battista (1992) added a level 0,
“pre-recognition,” where children were not able to visually identify the difference
between shapes, to this model to depict their cognition in geometry at the very beginning
stage. Pegg and Davey (1998) integrated the van Hiele model with another learning
theory, the SOLO taxonomy (Biggs & Collis, 1982), to describe how learning develops
The van Hiele model doesn’t trace the “in between levels of reasoning” (Burger &
Shaughnessy, 1986), nor does it offer enough details to depict how proof is perceived by
learners. This point became quite obvious when applying the model to study students’
development of proof skills. After all, the van Hiele model was not specially designed to
geometric reasoning. The early two levels concern sense making and concept building,
while the ability to produce justification mostly develops at the higher three levels. It is
not suggested that the development of reasoning ability could be separated from sense
Waring (2000) proposed the proof levels for elementary and secondary students to
“describe the development of proof concepts beginning with an appreciation of the need
for proof, then an understanding of the nature of proof, and finally pupils’ competence in
constructing proofs” (p. 10). Six levels are identified in the framework.
to offer a reason, or just refer to external (Harel & Sowder, 1998) sources, such as
however they are not cognizant that a claim should be verified in all possible cases.
Instead, they just check a few cases and suggest the results are sufficient to support the
claim.
Moving up to Level 2, students still rely on empirical checking. While they are
more careful in choosing examples to verify, and may notice certain patterns in the
process, they still cannot produce a proof that accounts for all cases. This may be due to
their inability to realize the entire scope of discussion, absence of knowledge about the
need to clarify every case, or lack of language tools to describe the patterns they detect.
At Level 3, students become aware of the need to offer justification for general
cases; however, they lack proof skills or basic understanding of the subject. Therefore,
At Level 4, students are both aware and capable of producing generalized proofs.
Compared to the van Hiele model, Waring’s proof levels offer an account of how
the shift from informal to formal understanding of proofs may occur. However, both
frameworks provide a linear account of development (i.e. changes happen one after
another) with no space in the structure to describe processes that might occur randomly or
parallel to each other. For instance, the lower van Hiele levels emphasize sense
making and concept building while reasoning is only emphasized in higher levels; while
proposed that the development of mathematical cognition follows a more complex and
non-linear format (Lakatos, 1976; Kieren & Pirie, 1991; Martin, 2008; Pirie & Kieren,
1992).
factors that are involved in the maturation of one’s proof ability (see Figure 4). This
framework captures six key components (i.e. perceptual recognition, verbal description
crystalline concepts, and deductive knowledge structure) and their relationships in the
broad maturation of proof structure. Different from the van Hiele model, Tall et al. (2012)
suggest that perceptual understanding doesn't develop only at the earlier stages. Instead it
continues to be refined as the understanding of the concept and deductive process
advances. This idea is consistent with the perspective of constructivism, in the sense that
a factor may impact other components (Lakatos, 1976; Tall, 2005). Nevertheless, Tall et
al. (2012) don’t suggest that all the components in the structure develop simultaneously.
Instead, certain types of understanding serve as a prerequisite for others to occur. This
Figure 4. The broad maturation of proof structure (Tall et al., 2012)
of its context” (p. 19). In other words, it is a concept with a pack of associated knowledge
attached to it. In order to construct deductive reasoning, involved concepts must not be
perceived as isolated objects. Only when the roads are built can a path be drawn.
Theoretical Framework
The pilot study (Liu & Manouchehri, 2012) adopted Harel & Sowder’s (1998)
if the proof scheme is a common indicator of whether an argument was found convincing
or appealing by an individual. The results of the pilot study suggested that students prefer
different proof schemes and may have distinct judgment of the same scheme in different
contexts. This result was consistent with Harel & Sowder’s (1998) finding that an
individual could simultaneously hold different proof schemes. Since the proof scheme
(1990, 2003) model offers one advantage (i.e. the intention of the learner in creating the
referring to stages the learners may have achieved at the time of assessment. The broad
maturation model proposed by Tall et al. (2012) adds to the conversation by considering
perception of proofs. Healy & Hoyles (2000) suggested gender can also be a factor. In
(Dreyfus, 1999; Hoyles, 1997; Herbst & Branch, 2006; Schoenfeld, 1988; etc.). Indeed,
intention, and school experience are factors that impact learners’ preference for and use
understanding of the argument (see Figure 5). In order to distinguish the types of
arguments students found convincing and appealing to them, we must understand what
exploration and justification of a mathematical conjecture, but few studies pay attention
to how students comprehend and evaluate a given proof. In addition, instruments that
2012). Research on students' comprehension of given arguments is rare and greatly
needed. This is for several reasons. First, in school practice, reading and understanding the
proofs offered by teachers or course materials serves as a main venue for students to
develop their conception and skills of proofs (Weber, 2004). Second, the evidence and
logic students use to construct a proof are usually familiar to them; however, when
evaluating a proof, they may encounter unknown resources and unfamiliar reasoning
would reveal some basic features of their conviction system. Lastly, the ability to judge a
impossible for one to monitor and inform his/her own construction of mathematical
proofs. Understanding how students evaluate and assimilate (or exclude) different ideas is
critical in understanding their learning process, and the learning of proof is not an
exception. Therefore, more studies that focus on students' thinking when reading and judging
proofs are needed (Healy & Hoyles, 2000; Mejia-Ramos & Inglis, 2009; Selden &
Yang & Lin (2008) took the initiative to address this gap by proposing the Reading
Comprehension of Geometry Proof (RCGP) model to describe the stages that learners go
through in understanding a given geometric proof (see Figure 6). They suggested that
students first identify isolated conceptual and procedural knowledge in the statement at
the surface level. They then start to recognize that some knowledge and statements are
premises, some are conclusions, and some are descriptions of properties. Moving up to the
chaining elements level, students are able to identify and understand the connection
between premises, conditions, properties and conclusions. At the highest level,
encapsulation, students gain a systematic and organized view of the elements in the proof;
they are well aware of what the premises and conclusions are and fully understand the
Figure 6. Reading Comprehension of Geometry Proof (RCGP) Model (Yang & Lin, 2008)
Following the identification of the four levels of understanding, Yang and Lin
(2008) further examined conditions under which a learner’s understanding could move
toward higher levels. In particular, they suggested that Basic Knowledge (i.e.
understanding of the terms and sentences), Logical Status (i.e. realization of the logical
(i.e. understanding to what extent the argument is valid), and Application (i.e. knowing
how to apply the proposition) make up the critical understanding a learner needs to
comprehend a proof (see Figure 6).
comprehension of a particular proof. The model suggests that only when certain
understanding is in place can a learner comprehend a proof as a logically coherent
argument in an informed manner, students must at least reach the understanding at the
chaining elements level since only at this level can students start to see the relations
within the statement. In other words, judgment can only be made upon certain levels of
understanding, and judgment about reasoning method can only be made when they see
the connection.
According to the RCGP model, there are three kinds of understanding that
students need to develop in order to reach the chaining elements level. Generally
speaking, students need to understand the concepts used in the argument, to identify the
evidence on which the argument is based, and to see the connection between the premises
and the results.
When shifting the attention from students to the arguments, it is noticeable that
these three types of understanding in fact point out three key aspects of a
mathematical argument, i.e. the representation (which describes the concepts and other
terms), the source of conviction (which states what is taken for granted), and the link
between the evidence and conclusion (which represents the reasoning process). Three
similar aspects were addressed by Stylianides and Stylianides (2008a) as the modes of
arguments, the set of accepted statements, and the modes of argumentation. It is assumed
that only when students understand the presentation, agree with the source, and recognize
the link would they consider an argument as reliable. With the three identified aspects,
the investigation becomes one of examining what kind of presentation, what kind of
source, and what kind of link contributes to students’ conviction of an argument. So the
next step of model building is to identify the different genres in each aspect.
ideas, i.e. enactive (which involves the use of gesture and physical actions), iconic (which
involves the use of pictures, graphs and visual tools) and symbolic (which includes the
use of natural language, numbers, and logic). However, in the current study, where the
communication is carried out by mathematical arguments, the enactive way is not utilized.
communication from casual language. Therefore we see the need to distinguish the two
numerical, symbolic, and visual. Narrative arguments refer to those using casual
language. A typical example could be “Because the car is slower, it takes a longer time to
get to the destination.” Numerical arguments refer to those using numbers and elementary
mathematics symbols (such as “+,” “-,” and “<”). For example, “since 12 = 3 * 4, then 12
is a multiple of 3” falls in this category. Symbolic arguments refer to those using letter
levels, it is not expected that students will use formal language as do those in an
advanced algebra class. Therefore, the symbolic arguments may contain a large amount
the role played by the letter symbols. For instance, the argument “Since x² − 2x + 1 = (x − 1)²,
then it must be non-negative” falls in this category. The last type is visual arguments,
where visual aids are provided to present concepts and to communicate ideas. This
(1966). A typical geometry proof that uses figures falls in this category.
The classification of sources of conviction as well as the link between source and
conclusion is informed by Harel & Sowder’s (1998) model. However, unlike Harel &
Sowder’s model which categorizes arguments as a whole, this framework classifies the
source of conviction and the link between source and conclusion separately. This
alternative approach was motivated by a reflection on the application of Harel & Sowder’s
model on several concrete cases. For example, a step-by-step correct and complete proof
initiated from the Pythagorean Theorem would most likely be classified as a deductive
proof using Harel & Sowder’s model. However, students who write down the same proof
may actually have different comprehension of it. For instance, some of them may view
some of them may view it as an assumption, and some of them may view it as part of
preference and evaluation of proofs in different contexts as observed in the pilot study
(Liu & Manouchehri, 2012) could partly be due to the lack of precision when using
Harel & Sowder’s model to depict students’ way of reasoning. Therefore by looking at
source and link separately, we expect to get a more detailed picture of students’
immediate test), imaginary (i.e. mental image created upon or recalled from previous
experience), fact (i.e. a well-known existing mathematical result), assumption (i.e. an
assumed truth for the argument to be based on), and opinion (conviction without an
explicit reason). The types of link between source and conclusion include direct
indication, perceptual connection, induction, transformation, ritual operation
(Healy & Hoyles, 2000), and deduction. In direct indication, the conclusion is
the required condition of source without any additional understanding (e.g. “Since the
squares of a positive number, a negative number and 0 are all non-negative, then the
square of any number must be non-negative”). Perceptual connection establishes the link
between source and conclusion based on visualization or intuition. The argument “Since f(x) is a much
longer term than g(x), then f(x) must be larger” is an illustration of the use of perceptual
connection in an argument. The use of metaphor falls into this category as well. Induction
and transformation both draw on empirical evidence; however, the latter involves a further
investigation and notice of properties that connect the empirical cases. The use of
generic examples (Balacheff, 1988) falls in the
category of transformation. Ritual operation and deduction both refer to a valid reasoning
procedure, however one using the former doesn’t know why the procedure works (e.g.
using an algorithm without knowing why it works) while one using the latter is well
aware that each step in the process connects the evidence to its required condition.

[Figure: summary of the framework. Presentation: Narrative, Numerical, Symbolic, Visual.
Link: Direct indication, Perceptual connection, Induction, Transformation, Ritual
operation, Deduction. Source: Authority, Example, Imaginary, Fact, Assumption, Opinion.]
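To make the classification concrete, the three aspects and their types could be encoded as follows. This is an illustrative sketch of my own (the names and structure are assumptions); the study itself did not describe any software implementation of the framework.

```python
# Illustrative encoding of the framework's three aspects of an internalized
# argument and the types within each aspect. Names are my own choices.
from dataclasses import dataclass
from enum import Enum

class Presentation(Enum):
    NARRATIVE = "narrative"
    NUMERICAL = "numerical"
    SYMBOLIC = "symbolic"
    VISUAL = "visual"

class Source(Enum):
    AUTHORITY = "authority"
    EXAMPLE = "example"
    IMAGINARY = "imaginary"
    FACT = "fact"
    ASSUMPTION = "assumption"
    OPINION = "opinion"

class Link(Enum):
    DIRECT_INDICATION = "direct indication"
    PERCEPTUAL_CONNECTION = "perceptual connection"
    INDUCTION = "induction"
    TRANSFORMATION = "transformation"
    RITUAL_OPERATION = "ritual operation"
    DEDUCTION = "deduction"

@dataclass
class InternalizedArgument:
    """One reader's comprehension of an argument, coded on three aspects."""
    presentation: Presentation
    source: Source
    link: Link

# The even-sum argument discussed in the text would most likely be coded as:
coded = InternalizedArgument(Presentation.NUMERICAL, Source.EXAMPLE, Link.INDUCTION)
print(coded.link.value)  # induction
```

Coding the three aspects separately, rather than labeling the argument as a whole, is exactly what distinguishes this framework from Harel & Sowder's model as described above.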
To exemplify how the framework is used in the current study, let’s consider the
following argument: “Since 2+2=4, 2+4=6, 2+6=8, 4+6=10, then the sum of two even
numbers must be even.” The source of conviction would most likely be classified as
example rather than previous experience, since it is based on results of several trials.
The link would most likely be
classified as induction. There shouldn’t be much ambiguity about the representation and
the source of conviction in this case. However, when judging the link between source and
conclusion, we cannot be certain about whether the nature of reasoning was purely
inductive. For instance, it is possible that when one reads the proof s/he may have noticed
some patterns from the trial results but hasn’t explicitly expressed the discoveries. In
such cases, follow-up probing would be needed to uncover
conjectures or assumptions regarding choices made. Questions such as “why do you think
checking on a few cases is sufficient for a conclusion about every case?” can potentially
reveal such implicit reasoning. Neither the representation, the source of
conviction, nor the link between source and the conclusion can be identified merely by
looking at the argument itself. Instead, they reside in one’s comprehension of the
argument, even though the expression of the argument can certainly influence one’s
called an “internalized argument” for the rest of the work. Accordingly, this framework,
arguments, the current study investigated the type of arguments students considered
convincing, explanatory and appealing. In addition, the study examined whether there
were common types of representation, source and link that contributed to students’
choices. Furthermore, similarities and differences among individuals and among the
contexts were studied in order to identify personal factors that influenced their judgment.
Detailed research methods and procedures are provided in the next chapter.
CHAPTER 3. METHODOLOGY
This study examined what kinds of arguments 8th grade students
considered as convincing, explanatory and appealing, and what factors influenced their
evaluations. In order to do so, both quantitative and qualitative data were collected and a
mixed methods design was employed. Mixed methods research combines quantitative and qualitative
studies that investigate the same phenomenon (Onwuegbuzie & Leech, 2006; Creswell &
Plano Clark, 2011). Quantitative research emphasizes deductive logic, and utilizes
numerical data; whereas qualitative research emphasizes inductive logic, and often
utilizes textual and pictorial data (Teddlie & Tashakkori, 2009). Quantitative research
tends to eliminate researchers' biases, so that they can remain emotionally detached and
uninvolved with the objects of the study and test or empirically justify their stated
hypotheses. Qualitative purists, in contrast, contend that “multiple-constructed realities
abound, that time- and context-free generalizations are neither desirable nor possible, that
research is value-bound, that it is impossible to differentiate fully causes and effects, that
logic flows from specific to general and that knower and known cannot be separated
because the subjective knower is the only source of reality” (Johnson & Onwuegbuzie,
2004, p. 14). A combination of quantitative and qualitative research designs serves five
major purposes: triangulation, complementarity, development, initiation, and
expansion (Greene, Caracelli, and Graham, 1989). More specifically, triangulation seeks
common results from different methods to reduce the inherent method bias of any
particular method, including the inquirer bias, theory bias, and context bias.
Complementarity seeks elaboration, enhancement, illustration, and clarification of the results from one method with the results from
other methods. Development uses the results from one method to inform the design of other
methods. Initiation deepens and broadens the inquiry by seeking new perspectives or
frameworks, or by discovering paradoxes and contradictions between results from different
methods. Lastly, expansion extends the scope of inquiry by selecting the methods most
appropriate for each component of the inquiry.
Johnson & Onwuegbuzie (2004) suggested that the logic of inquiry in mixed
methods research includes “the use of induction (or discovery of patterns), deduction
(testing of theories and hypotheses), and abduction (uncovering and relying on the best of
a set of explanations for understanding one’s results)” (p. 17). The quantitative and
evaluations. The quantitative method could help with collecting data from a large sample
and hence facilitate the discovery of patterns and testing of hypotheses. It served to
enhance the scope of inquiry and the generality of the findings (Cohen, 1988). Findings
based upon statistical analysis could highlight connections between students’ thinking
and characteristics of the content. However, since participants were only asked to
complete multiple-choice items in the survey that were predefined by the researchers, the
survey alone could not reveal the reasoning behind their choices. The qualitative
interviews served to support interpretations of emergent patterns and to enhance the
analysis of the study by providing
insights into specific cases (McConaughy & Achenbach, 2001; Yin, 2009). Both methods
were needed in order to provide a comprehensive and meaningful explanation for the
phenomenon under study. The study consisted of the design,
administration and analysis of a survey and follow-up interviews (see Table 2). The
survey and interview protocol were designed and refined in 2012. The survey was
administered in January - February 2013, and the follow up interviews were conducted in
April 2013. The survey was administered in the participants’ schools and took 30-60
minutes to complete. The instruments and participants,
as well as the data analysis process, are described in the following sections of this chapter.
Timeline | Task | Summary of the task
2012 | Instrument Development | The instrument for the survey and the interview protocol was designed based upon existing literature and findings from pilot studies.
January - February, 2013 | Survey Administration | The survey was administered online using the instrument called Survey of Mathematical Reasoning. The survey took 30-60 minutes to accomplish.
February - April, 2013 | Survey Analysis | Students’ evaluations of mathematical arguments were quantitatively analyzed. Survey results were used to determine the participants of the interview.
April, 2013 | Interviews Conducted | Follow-up one-on-one interviews were conducted with individuals selected from those who had taken the SMR to further investigate why they made certain choices in the survey. Each interview lasted about an hour.
April - June, 2013 | Interview Analysis | Students’ responses in the interview were qualitatively analyzed. Factors that influenced students’ decisions were conceptualized and synthesized.
Sample
The population of interest in this study was 8th grade students. Two reasons
motivated this choice. First, according to theories of cognitive development
stages, middle school students are at a critical cognitive phase where they can engage in
abstract and logical thinking. Therefore, how they learn to value different arguments at
this stage could potentially impact their reasoning skills and thinking habits in the later
years. Second, the grade band serves as a bridge between middle and high school
mathematics and the link between informal and more formal and abstract mathematical
reasoning (Knuth, Choppin, & Bieda, 2009). According to the curriculum standards
(CCSSO, 2010), most 8th grade students should have obtained basic understanding of
numbers, shapes, chance, and algebraic expressions, know some simple propositions and
properties, and should be able to see the connection between concepts and ideas.
However, they may not have yet adopted abstract thinking or deductive ways of
mathematical reasoning using conventional proving techniques and forms. Therefore, the
features of arguments they consider as convincing, explanatory, and appealing can offer
valuable references for the development of resources and instructional explanations that
support students’ argumentations.
Survey Participants
Over 500 8th grade students from 5 different public schools in Ohio took the
survey in January and February of 2013. According to the 2012 spring Ohio state
standardized 7th grade mathematics test results, two of the schools had performed below
state average (at least 10% below as measured by percentage of proficiency), one
school’s performance was at the state average, while the other two schools’ performance
was above the state average (about 10% above as measured by percentage of proficiency).
The survey was given to the students in their respective school setting during a regular
class period.
Data trimming was conducted to exclude unreliable information. We excluded
data from those who hadn’t completed the survey and those who had chosen the same
option for almost all questions. In particular, the survey contained 48 questions that
required students to select one of three options: “agree,” “disagree,” and “not sure.” If a
participant chose the same option for all but at most 5 questions (about 10% of the total),
we considered his/her responses as not based on careful analysis. Hence, the actual data used in
analysis in this study consisted of responses from 476 respondents. 48.1% of the
participants were “male,” and 49.8% were “female.” The remaining 2.1% chose not to
disclose their gender. In responses to the question about ethnicity, 78.6% selected “White,
not of Hispanic origin”, 7.1% selected “Black, not of Hispanic origin”, 1.7% selected
“Hispanic”, 2.1% selected “American Indian or Alaskan Native”, 0.4% selected “Asian
or pacific islander.” 10.1% of the respondents chose not to disclose their ethnicity. In
response to the question about the math courses completed, 88.5% of the students
indicated that they had taken or were taking Algebra I or an equivalent Integrated 8th
Grade Mathematics course, 10.3% indicated that they had taken or were taking Geometry,
and 2.5% indicated that they had taken or were taking Algebra II. Based on the
demographics of the sample, we believe our data to be fairly representative of the 8th
them. SMR was published online and participants took the survey on the website during
one of their class periods. All the items on the SMR were multiple-choice.
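The screening rule described above can be sketched as follows. This is a hypothetical illustration of my own (function and variable names are assumptions), not the study's actual analysis code.

```python
# Sketch of the response-screening rule: a respondent is excluded if they
# did not complete the survey, or if one option covers all but at most 5
# of the 48 agree/disagree/not-sure items (about 10% of the total).
from collections import Counter

TOTAL_ITEMS = 48
MAX_DEVIATIONS = 5  # roughly 10% of the items

def is_reliable(responses):
    """Return True if a respondent's answers pass both screening checks."""
    if len(responses) < TOTAL_ITEMS:  # incomplete survey
        return False
    most_common_count = Counter(responses).most_common(1)[0][1]
    # Flat profiles (same option on 43 or more items) are excluded.
    return TOTAL_ITEMS - most_common_count > MAX_DEVIATIONS

print(is_reliable(["agree"] * 45 + ["disagree"] * 3))       # False: flat profile
print(is_reliable(["agree", "disagree", "not sure"] * 16))  # True: varied responses
```

Applying such a filter to the raw responses would yield the 476 retained respondents reported above.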
BACKGROUND INFORMATION
Thank you for agreeing to participate in this study. Your responses to the survey are
confidential and will not be shared with your teachers in school.
The survey contains 4 mathematics problems. For each problem you will need to evaluate
4 mathematical arguments and answer related questions. There is no right answer to those
questions. We just want to know your opinion.
Please read the questions carefully and pick the options that best match your opinion.
Please plan on using 30 to 45 minutes to complete the survey.
Now let's start!
3. Your student ID as assigned by your school (or your math coach): ________________
4. Your gender:
Male Female I choose not to answer this question
6. Mathematics courses you have taken (including the course you are taking):
Pre-algebra Algebra I Algebra II Geometry
Integrated 7th grade math Integrated 8th grade math
Other (please specify) _____________________
On the next page, we will start to work on some math problems. Ready to go?
4. SMR used in the study was a web-based survey. Therefore, although sharing the same content, the survey on the internet had a different layout from what is shown here.
Figure 8 continued
PROBLEM A
Arguments A1 - A4 are offered by different people to justify Shaina’s claim. Please read
each of the arguments carefully and pick the options that best describe your thinking in
Questions 7 - 11.
************************************************************************
Argument A1: I’ve tried plenty of multiples of 6 (like 12, 60, 606, etc.) and found they
are multiples of 3 as well. So I am sure that Shaina’s statement must be true.
7. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument A2: Any multiple of 6 can be written as 6n. We know that 6n = 3•2n, which is
a multiple of 3. Therefore a multiple of 6 must also be a multiple of 3.
8. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument A3: If the total number of cookies is a multiple of 6, then we can put them
into several boxes where each box contains 6 cookies. We can further divide each
box into 2 packages, where each package contains 3 cookies. Now all the cookies
are put into packages of 3. Therefore, the total amount of cookies must also be a
multiple of 3.
9. What do you think of the argument above? Please pick the option that best matches
your opinion.
10. What do you think of the argument above? Please pick the option that best matches
your opinion.
11. After evaluating each argument, which of them is closest to what you will use in
arguing about Shaina's claim? 5
Argument A1 Argument A2 Argument A3 Argument A4
None of the arguments is close to what I will use. This is how I will argue:
5. A1 - A4 were relisted below this question in the online version of SMR to allow students to see all the arguments that needed to be compared. The same layout was adopted for the other three problems used in the survey.
PROBLEM B
Arguments B1 - B4 are offered by different people to justify Ryan’s claim. Please read
each of the arguments carefully and pick the options that best describe your thinking in
Questions 12 - 16.
************************************************************************
Argument B1: I’ve drawn several rectangles and measured the length of their sides and
diagonals. I found that the diagonal of any of those rectangles is longer than any
side of the same rectangle. So Ryan’s statement must be true for all rectangles.
12. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument B2: Imagine that you are standing on the corner of a football field. Then the
diagonal of the field is definitely longer than any of its sides. So Ryan’s claim
must be right.
13. What do you think of the argument above? Please pick the option that best matches
your opinion.
14. What do you think of the argument above? Please pick the option that best matches
your opinion.
16. After evaluating each argument, which of them is closest to what you will use in
arguing about Ryan's claim?
Argument B1 Argument B2 Argument B3 Argument B4
None of the arguments is close to what I will use. This is how I will argue:
PROBLEM C
There are two triangles. The lengths of the three sides of Triangle I are A, B, and C and
the lengths of the three sides of Triangle II are a, b, and c. Jennifer claims that:
“If A > a, B > b and C > c, then the area of Triangle I must also be larger than
Triangle II.”
Arguments C1 - C4 are offered by different people to justify Jennifer’s claim. Please read
each of the arguments carefully and pick the options that best describe your thinking in
Questions 17 - 21.
************************************************************************
17. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument C2: We all know that the area of a triangle equals 1/2 of the product of its
base and height. As shown in the figures below, the area of Triangle I = BH/2, and
the area of Triangle II = bh/2. We know that B > b. In addition, since A > a and
C > c, then it must be true that H > h. So BH/2 must be larger than bh/2. Therefore
the area of Triangle I must be larger than the area of Triangle II.
18. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument C3: As shown in the figures below, since each side of Triangle II is shorter
than the corresponding side of Triangle I, we can cut each side of Triangle I
shorter and then compose Triangle II using the shortened sides. Therefore, the
area of Triangle II must be smaller than the area of Triangle I.
19. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument C4: Since each side of Triangle I is longer than the corresponding side of
Triangle II, then the perimeter of Triangle I must also be longer than the perimeter
of Triangle II. If we make the two triangles using wires, then it needs a longer
wire to make Triangle I than Triangle II. Using a longer wire we can make a larger
triangle. Therefore the area of Triangle I is definitely larger than the area of
Triangle II.
20. What do you think of the argument above? Please pick the option that best matches
your opinion.
21. After evaluating each argument, which of them is closest to what you will use in
arguing about Jennifer's claim?
Argument C1 Argument C2 Argument C3 Argument C4
None of the arguments is close to what I will use. This is how I will argue:
PROBLEM D
The sales tax rate of the state where Ravi lives is 5%. Ravi is buying a new bike in a local
bike store and has a $20 coupon. Ravi claims that:
“I can always save $1 if the $20 coupon is applied before tax rather than after tax,
regardless of the actual price of the bike.”
Arguments D1 - D4 are offered by different people to justify Ravi’s claim. Please read
each of the arguments carefully and pick the options that best describe your thinking in
Questions 22 - 26.
************************************************************************
22. What do you think of the argument above? Please pick the option that best matches
your opinion.
23. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument D3: If the coupon is applied before tax, then Ravi doesn’t need to pay the tax
for the $20 discount. If the coupon is applied after tax, then he needs to pay the
tax of the original price of the bike. Notice that $20 × 5% = 1. Therefore Ravi
always saves one more dollar if the coupon is applied before tax rather than after
tax.
24. What do you think of the argument above? Please pick the option that best matches
your opinion.
Argument D4: Let x be the original price of the bike and y be how much Ravi actually
needs to pay (after applying the coupon and tax). Based on calculation, the graph
below is generated by a graphing calculator to illustrate the two situations: the
solid line represents how much Ravi needs to pay if the coupon is applied after
tax; the dashed line represents how much he needs to pay if the coupon is applied
before tax. From the graph, we can
see that the solid line is parallel to the
dashed line and is always 1 unit
above it. Therefore, Ravi can always
save one more dollar if the coupon is
applied before tax rather than after
tax.
25. What do you think of the argument above? Please pick the option that best matches
your opinion.
26. After evaluating each argument, which of them is closest to what you will use in
arguing about Ravi's claim?
Argument D1 Argument D2 Argument D3 Argument D4
None of the arguments is close to what I will use. This is how I will argue:
The design of SMR was informed by Healy & Hoyles’s (2000) student-proof
views about proofs. The questionnaire consisted of three sections. First, students were
proving. Then several mathematical conjectures and different arguments to justify the
conjectures were provided, and students were asked to pick arguments they would adopt
for themselves and those they considered would receive the best mark from their teachers.
Lastly, students were asked to offer an evaluation of the arguments based on how
convincing and explanatory they found each one. Based on the specific focus of this
proof,” rather, the kind of arguments they find convincing, explanatory and appealing.
understanding of “proof” on the survey, nor was there a need to identify arguments that
they believed would receive “the best mark” (Healy & Hoyles, 2000). However,
participants were still asked to identify an argument in each of the problem contexts that
they were likely to adopt for themselves. Moreover, they judged whether each argument
Second, the design of problem contexts (mathematical items) as well as the choice
As shown by various cognitive development models (Tall et al., 2012; van Hiele,
1986; Yang & Lin, 2008), understanding of the concepts was a prerequisite for the
realization of a connection to occur. Since this study focused on reasoning rather than
representation and concept building, it was necessary to make sure that the participants
understood the concepts so that the differences in their judgment could be attributed to
their evaluation of the reasoning methods. The reason to choose relatively “simple”
arguments was for the feasibility of analysis. If an argument involved multiple modes of
reasoning, it would be difficult to identify its features according to the CCIA model,
which would make the coding less usable. Various arguments were desirable to verify
the comparison and to examine the framework. Lastly, different contexts were supposed
To meet these goals, three conjectures (in Problems A, B, and D, see Figure 8)
were chosen, each representing a topic from one of the three branches of school
mathematics: number theory, algebra, and geometry (Problem A was also used by Stylianides &
Stylianides, 2008a). However, it was not assumed that students’ reasoning would be
identical within a branch. Instead, the purpose of choosing conjectures from three
different areas was to provide distinct contexts to detect differentiated judgment and
preference of argument types. The three conjectures were all true statements 6. In
addition, we included a false conjecture (in Problem C) in the survey with the intent to
seek contrasting data. The
cases that would falsify the conjecture in Problem C were not familiar to the students and
were not easy to detect. By including the contrasting problem, we aimed to detect
any patterns in students’ judgment that would persist when evaluating false
arguments.
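To illustrate why the conjecture in Problem C is false, here is one counterexample computed with Heron's formula. The side lengths are my own illustrative choices, not values taken from the survey instrument.

```python
# Counterexample to "larger sides imply larger area" (Problem C).
# Triangle I is long and thin; each of its sides exceeds the corresponding
# side of Triangle II, yet its area is far smaller.
import math

def heron_area(a, b, c):
    """Area of a triangle from its side lengths, via Heron's formula."""
    s = (a + b + c) / 2  # semi-perimeter
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

tri_1 = (10, 10, 19.99)  # nearly degenerate: 10 + 10 barely exceeds 19.99
tri_2 = (5, 5, 6)

print(all(x > y for x, y in zip(tri_1, tri_2)))  # True: every side of I is larger
print(round(heron_area(*tri_1), 2))  # 3.16
print(round(heron_area(*tri_2), 2))  # 12.0
```

Falsifying cases of this kind require a nearly degenerate triangle, which is consistent with the observation above that they were unfamiliar to the students and not easy to detect.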
6. In Problem D, it was assumed that a bike costs more than $20. This condition was not articulated in the statement of this problem in order to see if any student would point out this issue.
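For reference, Ravi's claim in Problem D can be checked directly. Writing p for the bike's pre-tax price (assuming p > 20, per the condition noted in the footnote):

```latex
% Coupon applied before tax: the 5% tax is charged on the discounted price
1.05\,(p - 20) = 1.05p - 21
% Coupon applied after tax: the 5% tax is charged on the full price
1.05\,p - 20
% Savings from applying the coupon before tax, for any price p:
(1.05p - 20) - (1.05p - 21) = 1
```

The constant difference of $1, independent of p, is exactly the quantity that Argument D3 explains via $20 × 5% = $1 and that Argument D4 displays as the 1-unit vertical gap between the two parallel lines.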
Figure 9 demonstrates the structure of SMR. Four arguments (e.g. Arguments A1 -
supported the validity of the conjecture (even the validity of the false conjecture in the
conjecture; however, finding some (but not all) examples that satisfy a conjecture is not
adequate to prove its validity. Since fostering realization of the latter point is one of the
major goals of proof instruction (Stylianides & Stylianides, 2008b; Waring, 2000), this
study leaned towards exploring student thinking when evaluating “proof” instead of
“refutation.” The four arguments developed in proving each conjecture were classified as
inductive, algebraic, visual, and perceptual. The inductive argument showed proving
attempts by offering a few examples that supported the validity of the proposed
conjecture. The algebraic argument engaged symbolic representation of the context and
then reinterpreted symbolic results to support the conjecture. The visual argument relied
on graphs and figures to provide proof evidence. The perceptual argument related the
problem to a more familiar context and supported the conjecture via such a connection.
Among all the arguments, four (A1, B1, C1 and D1) were inductive; four (A2, B3, C2
and D2) were algebraic; four (A3, B2, C4, D3) were perceptual; and four (A4, B4, C3,
D4) were visual.

Type | Arguments
Inductive | A1, B1, C1, D1
Algebraic | A2, B3, C2, D2
Perceptual | A3, B2, C4, D3
Visual | A4, B4, C3, D4
argument. They were asked to determine whether they understood the concepts used in
each of the arguments (for the purpose of confirmation), whether they believed the
argument showed the conjecture was always true, and if the argument helped them
understand why the conjecture was true. We designed these questions since verification
and explanation were regarded as two major functions of proofs that are
recognized by students (de Villiers, 1990, 2003; Hanna, 2000b; Healy & Hoyles, 2000).
After reading all 4 arguments for each conjecture, participants were asked to determine
which of them was the closest to what they would use in the same context. We were
why students might experience difficulty when learning about proofs. This knowledge
can also inform the design of tasks that encourage conceptual understanding of proof as a
Interview Participants
Participants for the follow up interviews were selected from those whose SMR
responses were used in the survey analysis. Based on the survey results, the participants
were divided into two groups, the consistent group and the inconsistent group. The
consistent group was composed of those who had preferred the same type of arguments in
at least 3 of the 4 problems (i.e. they chose the same type of argument at least 3 times in
Questions 11, 16, 21 and 26; see Figure 8). 141 of the 476 participants belonged to this
group. The inconsistent group was composed of the remaining participants (a total of
335). No member of this group had preferred any particular type of argument in more
than 2 of the 4 problems. We further divided the consistent
group into 4 subgroups, each of which consisted of members who demonstrated a tendency
toward one particular argument type. For each subgroup, we generated a
random number, say n, using a random number generator, and then chose the nth student
from the top of the list as an interview participant. By doing so, we randomly picked 4
representatives, Allen, Abby, Alice, and Amy, from the consistent group. The names are
pseudonyms and are gender appropriate. Using a similar strategy, we randomly picked
4 representatives (by running the random number generator 4 times), Beth, Betty, Blake,
and Brenda, from the inconsistent group. These names are pseudonyms and are gender
appropriate.
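The grouping rule described above can be sketched as follows. This is an illustrative reconstruction of my own (names and data are hypothetical), not the study's actual selection code.

```python
# Sketch of the consistent/inconsistent grouping rule: a respondent is
# "consistent" if the same argument type was preferred in at least 3 of
# the 4 problems (Questions 11, 16, 21 and 26 of the SMR).
from collections import Counter

def is_consistent(preferences):
    """preferences: the four chosen argument types, one per problem."""
    most_common_count = Counter(preferences).most_common(1)[0][1]
    return most_common_count >= 3

print(is_consistent(["algebraic", "algebraic", "visual", "algebraic"]))   # True
print(is_consistent(["inductive", "algebraic", "visual", "perceptual"]))  # False
```

Partitioning the 476 retained respondents with such a rule yields the 141-member consistent group and 335-member inconsistent group reported above.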
All the subjects were taking Algebra I or an equivalent Integrated 8th Grade
Mathematics class at the time when they were interviewed. Two of the subjects were
taking Honors Algebra I, among whom one was from the consistent group while the other
was from the inconsistent group. Since the interviews were recorded at the end of the
spring semester, the subjects were close to finishing their coursework for the school year.
Each group included 1 male and 3 females. Seven of the subjects were Caucasian
and only one interviewee (Betty) was African-American. All subjects were enrolled in
rural or suburban school districts. All subjects were native English speakers.
Interview Procedure
The survey results suggested that students’ preferences for arguments were highly
diverse across the problems and between individuals. Therefore, we were not able to
make conclusive assertions regarding the types of arguments that students found more
appealing, nor were we able to distinguish the features of the arguments, as pre-identified
by the researcher, that could have significantly impacted the students’ evaluation of the
arguments by comparing those who had received high and low ratings. Since students’
judgment was made upon their personal standards for each argument, we believed there
were hidden factors that had influenced that judgment. The follow-up interviews were
designed to further investigate the sources that students drew from when making their
evaluations.
Each subject was interviewed separately and each interview lasted approximately
an hour. Each interview consisted of three parts (see Table 5). During the first part, the
subjects were provided with the same problems that were used on the survey, however in
a different format. The subjects had the conjecture as well as each argument of the
problem on a separate piece of paper. Different problems were printed on paper with
different colors. The subjects were asked to read the conjecture again and to then rank the
arguments according to how convincing they found them. The subjects were
allowed to change their ranking of arguments at any time during the interview. We did so
to make sure the subjects’ ranking was not offered randomly but after careful consideration.
Table 5. Outline of the interview procedure

First part
- Subject: Reexamine each problem and rank the arguments based on how convincing
they were to him/her. Explain the rationale of the arrangement by explicit comparison
between arguments in the same context.
- Interviewer: Ask the subject why he/she believed one argument was more convincing
than another.

Second part
- Subject: Compare the rankings across the problems. Confirm or revise the
arrangement. Explain the differences between arguments in different contexts in
justifying the rankings.
- Interviewer: Identify any inconsistency in the subject’s rankings across the contexts
and explicitly point it out. Ask the subject to explain how he/she viewed the same
types of argument differently in different contexts.

Third part
- Subject: Rank the arguments for the new problem (see Figure 10) according to how
convincing they were. Explain the rationale of the arrangement again.
- Interviewer: Ask the subject why he/she believed one argument was more convincing
than another. Compare the subject’s responses for the new problem to his/her previous
answers and probe for an explanation.
It has been suggested that people usually find it difficult to reflect on their
own thoughts (Tarricone, 2011). However, by asking the subjects to justify their selection,
their explanation could reveal factors that had impacted their preference. In addition to
what the subjects offered, we selected the following items as backup questions in case
they remained quiet or didn’t provide explanations that were understandable to us.
- Do you think that one of the arguments can only prove the conjecture is true?
- Do you think that one argument helps you understand the problem better?
- Do you think that one argument’s evidence cannot support its conclusion?
Throughout the interviews, the subjects were encouraged to explain their thoughts
as they felt inclined to do so. Furthermore, if their answer to a question was yes but no
explanation followed, we probed further. Their responses to these questions allowed us to
identify their conception of the argument according to the CCIA framework.
During the first part of the interview the subjects were asked to compare
arguments in each context. During the second part, we asked students to compare the
arguments across the contexts. We were interested in whether the subjects would modify
the order after such a comparison. We were also interested in learning whether the
subjects from the consistent group would act differently from representatives from the
inconsistent group. Most importantly, we wanted to know how the subjects justified their
preference when diversity existed in the types of arguments they preferred (e.g. a subject
preferred an empirical argument in one problem while ranking it as the least convincing in
another problem). Their explanation again revealed factors and features they considered
as important when making judgments about mathematical arguments, and how such factors
shaped those judgments.
During the last part of the interview, a new problem, similar to the survey
problems, was given to each subject (see Figure 10). The new problem required only
basic knowledge of probability. The arguments used (E1 – E4) were inductive, visual,
perceptual and algebraic, respectively. The
subjects were again asked to rank the arguments according to how convincing they were
in justifying the conjecture. Comparing the ranking provided for Problem E to their
responses to the previous four problems, the subjects were asked for the last time to offer
an explanation of their preferences.
PROBLEM E
There are some white and orange ping-pong balls in a box. You cannot see what’s inside
the box but you will get a reward if you pick out an orange ping-pong ball from the box.
Jenna claims that:
“If the number of white ping-pong balls and the number of orange ping-pong balls
are both doubled, the chance for you to get a reward still stays the same.”
************************************************************************
Argument E1: Suppose there are 2 orange ping-pong balls and 3 white ping-pong balls
in the box, then the chance for you to get a reward is 2 out of 2+3, which is 40%.
If the numbers of ping-pong balls of each color are both doubled, then there will
be 4 orange ping-pong balls and 6 white ping-pong balls. Hence the chance for
you to get a reward is 4 out of 4 + 6, which is also 40%. Therefore, the chance of
winning the reward won’t change.
Argument E2: As shown in the figure below, if the numbers of orange and white ping-
pong balls are both doubled, the ratio between the ping-pong balls of the two
colors will still be the same. Therefore, the chance of winning won’t change.
[Figure: the orange and white ping-pong balls shown before and after doubling]
Argument E3: When the number of orange ping-pong balls is doubled, the cases for
winning the reward are also doubled. However, when the number of white ping-
pong balls is doubled, the cases for not winning the reward are also doubled. As a
result, the ratio of the cases of winning to the cases of not winning stays the same.
Therefore, the chance of winning won’t change.
Argument E4: Suppose there are n orange ping-pong balls and m white ping-pong balls
in the box, then the chance for you to get a reward is n / (n + m). If the numbers of
ping-pong balls of each color are both doubled, then the chance for you to get a
reward becomes 2n / (2n + 2m), which is equal to n / (n + m). Therefore, the
chance of winning the reward won’t change.
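The identity at the heart of Argument E4, n / (n + m) = 2n / (2n + 2m), can be spot-checked numerically. Below is a minimal sketch in Python using exact fractions; the box contents are arbitrary sample values, not part of the original problem:

```python
from fractions import Fraction

def win_chance(orange, white):
    # Probability of drawing an orange ball from the box.
    return Fraction(orange, orange + white)

# Check Jenna's claim for a few arbitrary box contents.
for orange, white in [(2, 3), (1, 1), (5, 7)]:
    assert win_chance(2 * orange, 2 * white) == win_chance(orange, white)

print(win_chance(2, 3))  # 2/5, the 40% computed in Argument E1
```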
Data Analysis
The data analysis consisted of quantitative analysis of the survey results and qualitative
analysis of the interviews. An outline of the data analysis follows.
Survey Data Analysis
- Cumulative data were used to identify the types of arguments that were
understandable, convincing, explanatory or appealing to the entire group of
participants.
- Between-subgroup comparisons were conducted to investigate between-subgroup
differences and possible causes.

Interview Data Analysis
- Each subject’s responses in the interview were coded, and factors that impacted the
individual’s decisions were identified.
- Common factors that impacted each individual’s evaluation were summarized, and
individual differences were investigated through between-subject contrasts.
- The subjects’ responses were revisited and summarized by problem, and the context’s
impact on students’ decisions was explored.
- Survey results were revisited; explanations of unexpected findings and proposed
hypotheses about the survey data were provided based on the interview analysis.
The analysis of the survey data was guided by the following questions:
- Which argument in each problem was indicated as understandable by the most
participants?
- Which argument in each problem was indicated as convincing by the most
participants?
- Which argument in each problem was indicated as explanatory by the most
participants?
- Which argument in each problem was indicated by the most participants as being
closest to what they preferred to use when encountering the same conjecture?
- Were the answers to the four questions above consistent for each problem?
- Were the participants’ ratings consistent across the problems when judging the same
types of arguments?
Frequencies and percentages of the participants’ responses were used to answer these
questions. For example, in order to determine which argument in Problem A
was indicated understandable by the most participants, we calculated and compared the
percentages of those who answered “agree” and “disagree” to the first question under
Arguments A1 – A4 (i.e. “You understand the concepts and notations used in the
argument”). The more participants answered “agree” and the fewer answered “disagree” to
this question under A1, the more understandable we considered A1 to be. Since the
questions listed above were directly related to the SMR items, a cumulative summary of
the responses to those items sufficed to address them.
We recognized that solely relying on the cumulative data was not
sufficient to identify whether the differences in the evaluations of two arguments
were significant. For example, more participants might have found A1 understandable
than answered “agree” to the same question under A2; however, without clarifying
whether that difference was statistically significant, we could not claim that A1 was more
understandable than A2. Therefore, tests of significance of the differences between the
cumulative percentages were used to clarify any claims regarding the survey data. In
particular, we adopted within-group ANOVA tests to examine the significance of these
differences.
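A within-group ANOVA comparison of this kind amounts to computing an F statistic across response groups. The sketch below is a minimal plain-Python illustration; the 0/1 "agree" indicators are hypothetical, not the actual survey responses:

```python
def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA across groups."""
    all_vals = [x for g in groups for x in g]
    n_total, k = len(all_vals), len(groups)
    grand_mean = sum(all_vals) / n_total
    # Between-group sum of squares (k - 1 degrees of freedom).
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares (n_total - k degrees of freedom).
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical "agree" indicators (1 = agree) for two arguments.
a1 = [1, 1, 1, 0, 1, 1]
a2 = [1, 0, 0, 1, 0, 0]
print(one_way_anova_f([a1, a2]))
```

In practice the F value is compared against the F distribution with (k - 1, n_total - k) degrees of freedom to obtain a p-value; a statistics library such as scipy can supply both steps.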
In addition to the analysis of the results from the entire group, we also examined
the responses of various subgroups. Doing so enabled us to associate inherent differences
between the subgroups with differences in their evaluations of the arguments. In particular,
we examined whether students who achieved higher scores on state standardized tests
evaluated the arguments differently, as measured by SMR. Data were also analyzed
according to gender. The rationale was that since the male and female students were
enrolled in the same schools and in the same classrooms, taught by the same teachers
using the same teaching materials and techniques, any systematic difference between
their responses would be unlikely to stem from instruction.
The techniques used in the between subgroup comparisons were similar to what was done
when analyzing the entire group’s responses. We examined and compared each
subgroup’s responses to each question in the SMR and adopted the between group
ANOVA to evaluate the significance of differences. During the analysis of survey results
from the entire group and from various subgroups, conjectures were also made to explain
why certain arguments received higher ratings from the participants based on the features
of those arguments. Evidence to support or refute those conjectures was sought during the
interview analysis.
The survey analyses characterized the participants’ preferences for mathematical
arguments at the macro level; however, they were not adequate to explain why an
individual had made certain decisions when completing the
survey. The latter was the focus of interview analysis. In examining the interview
responses, we first identified both positive and negative comments made by each subject
about each argument. These comments were summarized in a table as raw data. The
comments were then coded using
the CCIA framework (see Figure 7). Specifically, we identified if the comments referred
to the representation, evidence, or the link between evidence and conclusion. We then
compared the frequencies of these codes to draw conclusions about where the subjects
focused their attention when making decisions, and to determine which aspect had the
largest impact on each subject’s decision. Furthermore, we traced the specific types of
representation, evidence, and link in a similar way, by counting how many times each
was referenced by the subject in the explanation. The frequency of references to these
elements served to identify factors that dominated each subject’s evaluation. Since not all
comments could be captured in the CCIA framework, it was expected that other factors
that had impacted the subjects’
decision would be detected during the coding process. Those factors were considered as
personal standards and were specially studied. Below we offer an example to illustrate
the analytical model used when examining the interviews; the same model was utilized in
all other 7 cases. The following discussion details how Allen’s responses during the
interview were transformed into an analyzable form, how this form was coded, and how
the coding was interpreted to understand the rationale of his evaluations.
Allen was an 8th grade student enrolled in an Honors Algebra I class at the time of
data collection. In his responses to SMR, the visual arguments were indicated to be
closest to how he would argue in all but Problem C, where he selected C1, the inductive
argument. Based on these results, we believed that Allen had exhibited a preference for
visual arguments and classified him in the consistent group. The discussion of Allen’s
performance in the interview includes both
data report and data analysis. In data report, we describe the interview process in detail,
including how he ranked the arguments according to how convincing they were to him
and how he justified his rankings. In data analysis, we identify factors that seemingly
influenced Allen’s evaluation of the arguments based on his comments on and rankings of
the arguments.
Ranking arguments
In the first part of the interview, Allen was asked to work on Problems A through
D one by one. He (like the other subjects) decided which problem to work on first and
next; the decision, however, wasn’t based on the content of the problem but on the color of
the paper on which it was printed. Table 7 illustrates the rankings provided by Allen for
each problem. Column One of the table represents the order of problems that he tackled.
Allen chose to start with Problem C. From the most to the least convincing, Allen
ranked the arguments as C2 – C4 – C3 – C1. In explaining why he put C2 at the top of
the list, Allen suggested that “it uses formulas which I know are fact, and I like seeing
fact.” Later, he repeated similar comments, calling the formula the “simplest, quickest,
most effective way” and “very straightforward.” In explaining why he considered C4 less
convincing than C2 but more convincing than the other two arguments, Allen suggested
convincing than C2 but more convincing than the other two arguments, Allen suggested
that C4 was convincing because “it still uses sides and areas.” However, what made him
consider it less convincing was that “it’s less straightforward.” In addition, he
suggested that he had never done something similar to what was described in C4 but he
could imagine the scene. In particular, he suggested that the argument “clearly states, if
you’re using a wired outline which is the figure, I can picture that in my mind, I know
exactly what they’re talking about.” In explaining why he put C1 at the bottom of the list,
Allen suggested that he viewed the “wording” of the argument as problematic. For example,
he suggested that “it’s trying to relate too many things: a = b = c,” and that it would “trip
me up for the first few seconds.” Although the argument contained “a formula,” which he did like, it
was “not straightforward enough.” So this was the only argument in the problem that he
actually didn’t like. In explaining why he put C3 low on the list, Allen claimed that he
liked C3 since “it uses the length of the sides, which is what I would use any day of the
week.” He also liked the fact that “it has diagrams… which explains what they’re talking
about there.” However, when comparing C3 to C2 and C4, Allen only repeated the
reasons for which he ranked C2 and C4 high on the list and didn’t specify why he didn’t
consider C3 as convincing.
The next problem Allen worked on was Problem B. From the most to the least
convincing, he ranked the arguments as B3 (algebraic) – B4 (visual) – B2 (inductive) –
B1 (perceptual). He began by explaining why he didn’t consider B1 convincing. In
particular, he suggested that he was a visual learner but he wasn’t
“seeing any visual representation.” He considered the argument as a mere “opinion” and
suggested “there’s no other facts.” So “there’s not enough support for me, compared to
the other ones.” In evaluating B2, Allen suggested that “it does give a clear example”
which he did believe. So it was better than argument B1. However “it is still not my
favorite.” In evaluating B4, Allen suggested that “it shows a circle there, I do like these, I
can clearly relate to these.” He further explained his understanding of the details of the
argument, “I can see with, I guess you can call them formulas, bc here is the length of the
side of the rectangle, and it is smaller than bp, just by a little bit there, so I can believe it.”
He indicated that he liked B4 and B3 “very closely.” Lastly, Allen articulated that B3 was
his favorite since “I like the fact of using the Pythagorean theorem. It’s more
straightforward than using all the angle and the side relations.” He also suggested that by
using the Pythagorean theorem, he could “figure out the problem in a minute or so.”
The next problem Allen worked on was Problem D, where he ranked the
arguments as D4 (visual) – D1 (inductive) – D2 (algebraic) – D3 (perceptual). He
suggested that D4 was his favorite since “I like the
visual aid again.” When asked to explain his understanding of the graph, Allen pointed at
the graph and stated, “seeing that after and before are always 1 separated there, 1
separated there, and it never changes, since they’re parallel, so using those, I do think that
that is the best saying that you can always save an extra dollar.” When commenting on
D1, Allen expressed that it was “a little wordy, and that’s why I put it the second.”
However, he expressed that he liked “the fact that they used other, that they can also plug
in a price here, as well as using a formula and plugging a price in.” He claimed that he
liked the argument since he was “a formula kinda guy.” It was interesting that D1 didn’t
contain any formulas; instead, there were equations used to calculate the results. However,
Allen was able to conclude that replacing a particular value in the equation wouldn’t
change the result. In evaluating D2, which had the actual formula, Allen said that he didn’t
think “there was
too much difference.” However, he found D2 to have “more formulas” and “less
explaining.” In evaluating D3, Allen suggested that “that is more business, it’s not
applying directly to math.” He further explained that “just stating that and not giving that
much evidence, it’s not very convincing to me.” The last problem was Problem A, where
Allen ranked the arguments as A4 (visual) – A2 (algebraic) – A3 (perceptual) – A1
(inductive). He started justifying his ranking with A1, stating that
he didn’t see “very many supporting arguments.” He suggested that “there’s almost no
mathematical evidence here, except the opinions and personal work of other people doing
math, and not showing what they did.” Therefore, he considered A1 the least convincing.
Allen considered A3 to be more convincing than A1 since “it says what you can do to
figure it out.” However, since there was “no formula, or visual representation,” he didn’t
consider it as convincing as A2 and A4. To clarify, Allen claimed that in A3, “there is
proof; it’s just not solid, like always a formula.” In evaluating A2, Allen first substituted
a number, 3, to verify whether the formula was correct. He then suggested that the formula
seemed to work, although he wasn’t fully sure about the result. In evaluating A4, Allen
noted its “very simple visual representation.”
In the second part of the interview, Allen was asked to revisit Problems A through
D and compare the rankings he had provided. He was first asked why he considered
inductive arguments such as A1 unconvincing. He explained that those arguments did
provide some examples but were more like “opinions
and people doing things that I have not personally seen.” He didn’t think such examples
were as convincing as those backed up by theorems and graphs. Allen was asked to
justify why he considered D1 convincing since it also showed just a few cases. He
responded that “there’s always the showing, they’re working it out,” which was better
than “plainly stating what they had tried.” Furthermore, Allen was asked if A1 would be
more convincing to him if it had provided more details of the checking procedure. He
replied yes to this question and offered that “giving concrete numbers and facts and
stating their observations of what they did the experiment on” would have made it a better
argument.
Allen was then asked why he considered the algebraic and visual arguments convincing
in all problems. He stated that formulas and diagrams made arguments more clear and
that “if there’s a combination of visual diagrams and formulas, that would be fabulous,
that would be perfect.” When asked why he didn’t find the perceptual arguments
convincing, Allen reasoned that “simply saying to imagine it, then stating that it’s
definitely longer, you’re not giving any example” wasn’t enough.
During the third part of the interview, Allen was asked to examine Problem E and
rank E1 – E4 according to how convincing they were. His rank was: E2 (visual) - E4
(algebraic) - E1 (inductive) - E3 (perceptual). This rank was consistent with his earlier
rankings, in which visual and algebraic arguments were generally considered more
convincing than inductive and perceptual arguments. In evaluating E2,
Allen proposed that “immediately when I noticed the graph I know it will be high
ranking.” However the interviewer soon realized that Allen didn’t actually understand the
graph. Allen was allowed to reexamine the argument but he still couldn’t explain how the
graph supported the conjecture. Therefore, the interviewer explained what the graph
meant, in particular how it represented the “doubling” procedure stated in the problem.
This episode suggested that Allen’s preference for visual illustration might not be based
on a careful analysis of the argument. Instead, he might have been attracted to it due to its
appearance. It was unclear if this was an isolated case.
Another interesting finding was that Allen actually found it difficult to explain his
understanding of the graph in E2, so he chose to refer to E4 (algebraic) and used the
symbols to describe his ideas. This case demonstrated that Allen was comfortable in
using letters to represent variables in mathematical contexts. Allen further suggested that
E2 and E4 “are in principle the same,” but that he still preferred E2 to E4 since E2 “is
still stating that clearly, while giving me the visual.” Allen added that he in fact liked all
the arguments for this problem, although E1 “provides an example, not an opportunity to
provide your own examples,” while the description offered in E3, although “in principle
the same” as what was offered in E4, was less appealing to him than visual and algebraic
representations. This explanation again revealed his preference for visual and symbolic
representations.
Note that in both Problems D and E, Allen claimed that the inductive and
algebraic arguments contained “formula representations.” He claimed there were
formulas in D1, D2, E1 and E4 even though the text in D1 and E1 didn’t contain any
formulas (there were numerical equations instead).
This again demonstrated that Allen seemed to be able to conceptualize the formula by
looking at the equations. It was interesting that when evaluating D1, Allen expressed that
he “can see the formula,” while for E1, he claimed that “this is not straightforward
because it only gives one example” and “there would be a formula here, but it’s not
stated.” When evaluating D2, Allen suggested that “this is not straightforward, because it
is a longer and more complicated and not straightforward enough formula.” When
evaluating E4, he characterized it as “straightforward giving you the formula there,
instead of providing two examples.” That is, he used double standards when evaluating
the arguments: in some cases he considered numerical equations more “straightforward”
than the given algebraic formulas. It was unclear what being “straightforward” meant to
him; however, we suspected that the complexity of a formula and his familiarity with it
might have been two important factors.
Allen’s comments during the interview were summarized in Table 8. Each comment was
then characterized using a coding strategy in line with the CCIA framework.
Table 8. Allen’s comments on the arguments

Problem C
Positive comments:
- It uses formulas which I know are fact, and I like seeing fact. (E4, R4)
- [Formula is the] Simplest, quickest, most effective way. (P)
- Very straightforward. (P)
- I would like to see a diagram… I’m a visual learner. (E2, R1)
- It is a formula, which I do like. (E4, R4)
- It uses the length of the sides. (E4)
- I like the fact of using the Pythagorean Theorem. It’s more straightforward than using
all the angle and the side relations. (E4, P)
- It also has a formula that I can work out by myself and see the process of doing it.
(E4, L4)
Negative comments:
- It’s not straightforward enough. (P)
- It’s trying to relate too many things: a = b = c. (P)
- If I’m trying to figure this out for the first time, I wouldn’t think that a, it doesn’t
equal that, and that it doesn’t equal that, and that would like, trip me up for the first
few seconds. (E2)
- It is not clearly outlined. (P)
- It’s trying to relate too many things. (P)

Problem D
Positive comments:
- I like the visual aid again. (R1)
- Seeing that after and before are always 1 separated there, 1 separated there, and it
never changes, since they’re parallel, so using those, I do think that that is the best
saying that you can always save an extra dollar. (R1)
- I like the fact that they used other, that they can also plug in a price here, as well as
using a formula and plugging a price in. (E2, L5)
- It’s a formula, I’m a formula kinda guy. (E4, R4)
- I like the fact that it’s always constant, you can plug any value in. (E2, L4)
- More explaining, as well as they use examples here. (E2, R2)
Negative comments:
- It is a little wordy. (R2-)
- I like straightforward ones. (P)
- [There’s] less explaining. (P)
- Just stating that and not giving that much evidence, it’s not very convincing to me.
(E6-)

Problem A
Positive comments:
- It says what you can do to figure it out. (E3, L2)
- Very simple visual representation. (R1)
Negative comments:
- I’m not seeing very many supporting arguments. (E4)
- There’s almost no mathematical evidence here, except the opinions and personal work
of other people doing math, and not showing what they did. (E4, E6-)
- No formula… or visual representation… (R1, R4)
- There is proof, it’s just not solid, like always a formula. (R4)

Comparing Problems A–D
Positive comments:
- When someone is trying to convince me of something, I would like facts. (E4)
- Formulas and diagrams. (E4, R1, R4)
- There’s always the showing, they’re working it out. (P)
- Giving concrete numbers and facts and stating their observations of what they did the
experiment on. (E2, R3)
- If there’s a combination of visual diagrams and formulas, that would be fabulous, that
would be perfect. (E2, E4, R1, R4)
Negative comments:
- Opinions and people doing things that I have not personally seen. (E6-)
- There’s no solid, stated proof right there is not as convincing as a line graph, or the
Pythagorean theorem, or those. (E4, R1, R4)
- Just plainly stating. (E6-)
- Simply saying to imagine it, then stating that it’s definitely longer, you’re not giving
any example. (E2, E6-)
- This argument is based on shapes and common sense, but not explanatory sense. (E6-)

Problem E
Positive comments:
- This is still stating that clearly, while giving me the visual. (R1)
- It does explain it clearly. (P)
- An opportunity to provide your own examples. (E2)
- Work it out on my own, and find out more. (E2)
Negative comments:
- Just given that information and no outside knowledge that that would work for all
cases. (E6-)
- It provides an example, not an opportunity to provide your own examples. (E2)

Additional Comments
Positive comments:
- I love formulas, which are always in my mind second to visual representations. (E4,
R1, R4)
- This is more straightforward. (P)
- OK, because it gives examples that worked. (E2)
- I would still have the opportunity, because it’s a formula, to provide my own
examples. (E2, R4)
- Straightforward giving you the formula there. (E4, R4)
- I can see the formula. (E4, R4)
Negative comments:
- It’s not as clear. (P)
- It only gives one example. (E2)
- This is not straightforward… because it is a longer and more complicated and not
straightforward enough formula. (P)
- There would be a formula here, but it’s not stated. (R4)
Data Analysis
Allen consistently preferred arguments backed by visual illustrations and formulas
during the interview. He chose visual illustrations as the most convincing arguments in
Problems A & D, and selected algebraic arguments as the most convincing in Problems B
& C. He ranked the inductive and perceptual arguments low, believing that those
arguments didn’t offer enough support to prove the conjecture.
Allen’s general preference toward visual arguments was consistent with his responses in
the SMR.
In order to identify the factors and features of the arguments that had impacted
Allen’s evaluation, his comments during the interview were coded according to the CCIA
framework (see Table 9). We did so to identify whether each of his comments referred to
the representation, evidence, or the link between evidence and conclusion 7 (denoted by
the capital letters R, E, and L, respectively) and the kind of representation, evidence, and
link (denoted by a single digit following the letter) that had
positively/negatively impacted the subject’s evaluation of the argument. The coding is
included in Table 8 at the end of each comment. For example, the comment that “it uses
formulas which I know are fact, and I like seeing fact” was coded “E4” and “R4” (see
Table 8, Problem C), since it was based on a mathematical fact as evidence,
which was expressed in a symbolic form. According to the CCIA framework, this type of
7. The “link between evidence and conclusion” of an argument is referred to as the “link” of the argument
for convenience in the discussion.
evidence is considered “fact,” which is coded “E4” as listed in Table 9. The following
conventions were adopted during the coding:
1) Not all comments could be coded according to the CCIA framework. In cases
where the factors that contributed to the judgment were not identifiable or were not related
to the representation, evidence, or link of the argument, the comment was coded “P,”
indicating that there were personal standards that needed to be further examined. We
called these personal standards since those reasons might not be associated with any
particular type of
argument. For example, the comment “it’s not straightforward enough” was coded “P”
since it could apply to many different types of arguments. There were also cases when the
subject indicated that he/she was not able to understand an argument. We use “NA” to
denote such comments, suggesting that the subject was unable to provide an evaluation
of the argument.
2) To distinguish the direction of the effect, a “-” was added to the end of a code if
the identified factor made the argument less convincing. If there was no such mark after a
code, then the corresponding factor made the argument more convincing to the subject.
3) A comment might refer to more than one aspect or feature of an argument. In
that case, it was classified using multiple codes. For example, the comment that “it’s a
formula, I’m a formula kinda guy” referred to the formula as the mathematical evidence
as well as a symbolic representation. Hence it was coded both “E4” and “R4.”
4) There were cases where it was difficult to judge what exactly a comment
meant merely based on its text. In such cases, we reexamined the dialogue in which the
comment was situated. For example, from the comment “I’m not seeing very many
supporting arguments” alone, we were not able to understand what exactly the
“supporting arguments” meant. However, by placing this comment in the conversation,
we identified that Allen was referring to formulas and mathematical facts as what he
called “supporting arguments.”
5) Similar comments might have been made by Allen in multiple places during the
interview. Such comments were counted multiple times, under the assumption that a
point addressed multiple times should be viewed as more important to the subject.
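Tallying codes under these conventions reduces to counting occurrences. A minimal sketch in Python; the code list below is a small hypothetical sample in the style of Table 8, not Allen's full record:

```python
from collections import Counter

# Hypothetical CCIA-style codes attached to a subject's comments:
# R* = representation, E* = evidence, L* = link, P = personal standard;
# a trailing "-" marks a factor that made the argument LESS convincing.
codes = ["E4", "R4", "P", "E2", "R1", "E4", "L4", "E6-", "R4", "P"]

# Tally by aspect, i.e. by the first letter of each code.
by_aspect = Counter(code[0] for code in codes)
# Tally each specific code, keeping the "-" suffix distinct.
by_code = Counter(codes)

print(by_aspect)   # evidence (E) codes dominate this sample
print(by_code.most_common(3))
```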
The codes for Allen’s comments were then summarized to examine his evaluation of the
arguments.
As shown in Table 10, the total number of comments that focused on the
representation, evidence and link of the arguments were 27, 47, and 7, respectively,
indicating that the evidence of arguments had the largest impact on Allen’s judgment.
Among all types of evidence, Allen regarded facts (i.e. known mathematical results) and
examples (i.e. results from an immediate test) as reliable sources for establishing an
argument; they were referred to 17 and 18 times, respectively. His explanation was
heavily rooted in the distinction between facts and opinions. This was highlighted by his
claims that “when someone is trying to convince
me of something, I would like facts” and “giving concrete numbers and facts and stating
their observations of what they did the experiment on” would make an argument
convincing. In addition, he clearly emphasized that “opinions and people doing things
that I have not personally seen” didn’t make an argument convincing to him. Similar
statements were made 8 times during the interview. Overall, Allen’s comments also
revealed a clear attention to representation: he indicated that visual and symbolic
representations contributed to his conviction, each of which was noted 12 times during
the interview. Allen claimed that he loved “formulas, which are always in my mind
second to visual representations,” and that if “there’s a combination of visual diagrams
and formulas, that would be fabulous, that
would be perfect.” This tendency was backed up by his capability to represent variables
with symbols and manipulate the symbols fluently, as well as the capability to connect
However, Allen's algebraic skills didn't enable him to evaluate the logic used to connect evidence and conclusions. Among all the comments he made, 7 referred to a certain type of link between evidence and the conclusion of an argument. In 2, 3, and 1 cases, respectively, Allen found a perceptual, transformational, and ritual link convincing. We found that Allen was not able to recognize that showing a few examples couldn't prove a general statement; he was convinced as long as he saw "examples that worked." Nevertheless, this didn't suggest that Allen's mathematical reasoning ability was underdeveloped. In fact, we argue that the ability to examine a single case carefully was a required step toward a further conceptualization of generic examples (Balacheff, 1988). We had noticed that Allen was capable of extracting properties he saw in one example and applying them to other cases. This was demonstrated when he claimed "I like the fact that it's always constant, you can plug any value in" after working through a single case.
In addition, Allen possessed personal standards for deciding whether an argument was convincing or not. There were 14 comments that were coded as "P" (see Table 8). In particular, 9 of these comments referred to the simplicity of the arguments, using words such as "quick," to explain why he was or was not convinced, while the other 6 comments referred to the clarity of the arguments (e.g. "There's always the showing, they're working it out."). In fact, we found that the pursuit of simplicity and clarity overrode his preference for the type of representation and evidence. This was demonstrated by his claim that "this is not straightforward… because it is a longer and more complicated and not straightforward enough formula." That means, in order for formulas, one of his preferred representations, to be convincing, they needed to be simple and clear. These considerations formed the platform underlying Allen's evaluation of the arguments (see Figure 11). Allen viewed arguments that utilized precise description and simple procedures as convincing. To him, mathematical facts and concrete examples were the most straightforward sources of evidence, while visual and symbolic representations were the clearest ways to describe and relate those examples. However, since Allen was not yet able to reflect on the rigor of the logic embedded in an argument, the type of link between the evidence and conclusion was not among his major focuses. Arguments that used a transformational, perceptual or ritual link were all perceived as convincing by him, as long as the evidence was trustworthy and the presentation was simple and clear.
[Figure 11. Allen's evaluation platform: convincing arguments drew on examples and facts as evidence, connected through ritual, perceptual, or transformational links, and were presented with precise description and simple procedures in visual or symbolic representations.]
With this platform, Allen's rankings of the arguments (see Table 7) became more sensible. In Problem C, the clarity of the evidence provided in each argument determined the ranking. The evidence provided in C2, C4, C3, and C1 was the triangle area formula, the imaginary triangle made of wire, the drawn triangle within a transformation process, and a collection of triangles, respectively. Among all of these, the formula was the most simple and clear; the imaginary triangle made of wire was less clear but also very simple; the triangle within a transformation process looked more complex; while the collection of triangles offered a mixed pool of information that would "trip me (Allen) up for the first few …"
Arguments in Problem B were also ranked based on the simplicity and clarity of the evidence provided by them. Compared to his ranking for Problem C, the only difference was that the rankings of the visual and perceptual arguments were switched. Allen's explanation was that the image of the triangle made of wire was clearer than the image of a football field. Therefore, the argument based on the football field scene was ranked lower.
In the other three problems, Allen found the visual arguments to be the most convincing option while the algebraic arguments were ranked lower. A possible explanation was that in Problems C and B, both algebraic arguments contained well-known mathematical facts (the triangle area formula and the Pythagoras Theorem); however, in Problems D, A and E, the algebraic expressions were not known results but were used to represent the variables' relationship in the problem. Therefore, the absence of clear and familiar mathematical facts may have lowered the rankings of those algebraic arguments.
The different rankings of the inductive arguments across the problems could also be explained by this platform. A1, B1 and C1 were considered the least convincing. This was because there was no concrete example given in A1 and B1, while in C1, the examples might have seemed confusing to him. However, since D1 and E1 discussed more details about the examples, they were considered more convincing.
Overall, we found that the pursuit of simple and clear statements, the need to see mathematical facts and concrete examples, and a preference towards visual and symbolic representations shaped Allen's evaluation of the arguments.
Cross comparison
Seven other subjects' interview data were analyzed using the same process as illustrated above. Details of these analyses are included in the next chapter. Following each individual analysis, a cross comparison of data for all subjects was performed in order to document the similarities and differences among their responses. In seeking the similarities, we considered whether there were factors that consistently impacted all (or the majority of) subjects' decisions. We calculated the frequency of occurrence of factors for all subjects and identified those most prominently referenced. We also contrasted the subjects' responses in terms of the elements identified in the CCIA framework and any additional personal standards they expressed.
The final stage of analysis focused on exploring context-specific factors that influenced the subjects' decisions. In particular, we examined causes for the inconsistent rankings that were provided by the subjects to the same type of arguments. Those factors were typically tied to the specific content and presentation of the arguments. Details of the survey and interview results, as well as findings from the cross comparison, are presented in the next chapter.
CHAPTER 4. RESULTS
This chapter is composed of two sections. The first section is dedicated to the
analysis of the survey data. The second section offers a discussion of the results of the
interviews.
We analyzed SMR responses from 476 eighth grade students. The survey results suggested that students' judgments of the same type of arguments were highly diverse.
The first step in the analysis considered whether the participants responded "agree" to the question "You understand the concepts and notations used in the argument" under each argument (see Figure 12). If a participant answered "agree" to this question, the argument was considered understandable to that participant.
Figure 12. The percentage of participants who considered each argument understandable
As shown in Figure 12, most arguments were considered understandable by the majority of the participants, and 80.5% of the participants claimed that they understood more than half of the arguments. This suggested that the items were appropriate for this age group. We also generated a similar graph to illustrate the distribution of the number of arguments that were identified as not understandable (not including "not sure" responses, see Figure 14). As shown, those who claimed not to have understood more than 4 arguments accounted for less than 15% of the participants. These data suggested that the arguments were generally understandable to the respondents. This also suggested that the SMR problems didn't exceed the participants' self-assessed mathematical ability, so that their evaluation of the arguments could be considered meaningful.
[Figure 13. Distribution of the number of arguments indicated understandable by each participant (bar chart of participant counts over 0–16 arguments; chart data omitted)]
Figure 14. Distribution of the number of arguments indicated not understandable by each
participant
Arguments A1, B1, C3 and D1 were the most understood in each of the respective problems (see Figure 12). Among them, A1, B1 and D1 analyzed
the conjecture by examining a few examples; while C3 was a visual illustration of why
the conjecture was true. Among the second most understood arguments in each problem
(i.e. A4, B4, C4, and D3), A4 and B4 were visual arguments, while C4 and D3 argued in
a perceptual way. Among the least understood arguments in each problem (i.e. A2, B3,
C1&C2 (tie), and D2), all utilized symbolic representations and argued algebraically
except Argument C1, which was an inductive argument but involved plenty of symbols
and labeled figures. Argument C1 required a more careful realization of the connection
between figures, symbols and narratives; hence it may have been harder to understand.
In order to clarify whether the differences in ratings were statistically significant, we applied within-group ANOVA tests (i.e. repeated measures ANOVA) to the data. Each response was assigned the numerical value "1" for "agree", "-1" for "disagree", and "0" for "not sure". We call each of these numbers a participant's rating on whether an argument was understandable. We then adopted the within-group ANOVA to test whether the students' ratings (which were considered within-subjects variables) on the arguments in each problem were significantly different from each other. The results are included in Appendix A, Table 39.
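The encoding step described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the study's actual analysis script, and the survey responses shown are hypothetical:

```python
# Encode survey responses as numerical ratings, as described above:
# "agree" -> 1, "disagree" -> -1, "not sure" -> 0.
RATING = {"agree": 1, "disagree": -1, "not sure": 0}

def encode_responses(responses):
    """Map each participant's responses (one per argument) to ratings."""
    return [[RATING[r] for r in participant] for participant in responses]

# Hypothetical responses of two participants to the four arguments in one problem.
survey = [
    ["agree", "not sure", "disagree", "agree"],
    ["disagree", "agree", "agree", "not sure"],
]
print(encode_responses(survey))  # [[1, 0, -1, 1], [-1, 1, 1, 0]]
```

The resulting matrix (participants × arguments) is the within-subjects layout that a repeated measures ANOVA routine would consume.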
Figure 15 illustrates the results presented in Table 39. In particular, the arguments in each problem were listed sequentially from the most understandable (top) to the least understandable (bottom). Two arguments were connected using a curve if the ratings they received were not significantly different from each other (p > .05). In an intuitive sense, if two arguments were connected, then they were "close" to (not significantly different from) each other; if not connected, then the two arguments were separated from (significantly different from) each other. For
example, in Problem D, from top to bottom, D1 received the highest rating; D3 received
the second highest; D4 came after D3; and D2 received the lowest rating. The differences
in ratings between D1 & D3, and D4 & D2 were insignificant so each pair was connected
using a curve. The differences in ratings between D1 & D4, D1 & D2, D3 & D4, and D3
& D2 were significant so each pair was not connected using a curve. As shown in Figure
15, D1 and D3 were considered significantly more convincing than D4 and D2.
Figure 15. Illustration of how understandable the arguments were to the participants
The results in Figure 15 suggested that students were more likely to understand an argument when it showed examples. In addition, it was also detected that the differences between A1 (inductive) and A4 (visual), and between D1 (inductive) and D3 (perceptual), were not significant. This signaled that students may have been able to understand other types of arguments as well as inductive ones. The participants' view of the inductive argument in Problem C was different from the other three problems: there, the inductive argument was considered less understandable, though not significantly so, than the two others (perceptual and visual). Therefore, the results tended to suggest that although inductive arguments were generally well understood by the participants, other types of arguments were also well perceived by them when satisfying certain conditions (e.g. when the arguments connected the problem with familiar representations or contexts).
The participants' evaluation of whether an argument showed that the conjecture was always true was also analyzed. Students' judgments of the second claim under each argument (i.e. "the argument shows that the statement is always true", see Figure 8) were assessed using numerical values: "1" for "agree", "-1" for "disagree", and "0" for "not sure". We call each of these numbers a participant's rating on whether an argument was convincing. Note that if a participant chose "not sure", we considered him/her to be not sure if the argument was convincing. We then adopted the within-group ANOVA to test whether the students' ratings (which were considered within-subjects variables) on the arguments in a problem were significantly different from each other. The results of the statistical analysis of this data set are included in Appendix A, Table 40.
In a manner similar to that described in the previous section, the arguments in Figure 16 were listed sequentially from the most convincing (top) to the least convincing (bottom) by problem.
In addition, two arguments were connected using a curve if the ratings they received were
not significantly different from each other (p > .05). In an intuitive sense, if two
arguments were connected, then they were “close” to (not significantly different from)
each other; if not connected, then the two arguments were separated from (significantly
different from) each other. For example, in Problem D, all the arguments were connected using curves, indicating that the difference between any pair of arguments was insignificant.
Figure 16. Illustration of how convincing the arguments were to the participants
In Problem A, A4 (visual) was considered the most convincing argument, while A1 (inductive) received the lowest rating. Note that the differences between A1 and any of the other 3 arguments were significant, and the differences between A4 and any of the other 3 arguments were also significant. This
suggested that the participants found the visual demonstration more convincing than any
other type of argument in the number theory problem, while they suggested that checking
a few numbers couldn’t convince them that the conjecture was always true. We suspected
that the figures used in A4, where manipulatives commonly used in mathematics
instruction were represented, might have contributed to the high rating on A4.
In Problem B, B3 (algebraic) received a rating that was significantly higher than any other argument, while B1 (inductive), again the inductive argument, received a significantly lower rating than B3 (algebraic) and B4 (visual). The same as what was detected in Problem A, the inductive argument received the lowest rating, and the perceptual argument's rating was not significantly higher. This finding suggested that neither an imaginary scene (i.e. the football field in B2) nor the few concrete examples cited in B1 made these arguments as well perceived as the visual and algebraic arguments in this problem. In particular, the participants found the algebraic argument more convincing than all other arguments in this geometry problem. We suspected two factors might have contributed to the high rating of B3. First, we realized that the Pythagoras Theorem is one of the best known results in school geometry and therefore strongly recognizable by the students. Second, we perceived that 8th grade students may have just learned the theorem and the topic was still fresh in their minds.
In Problem C, C2 (algebraic) was rated significantly higher than C3 (visual) and C1 (inductive) but was not rated significantly higher than C4 (perceptual). The differences among the ratings were smaller in the sense that the between-argument differences were not significant for many pairs. In particular, the ANOVA test suggested insignificant differences among the ratings on C1, C3 and C4. C2 stood out as being significantly more convincing than two of the other three arguments. We suspected the well-known triangle area formula might have contributed to its high rating. Similar to what was detected in Problems A and B, this finding could be viewed as the mathematics curriculum's impact on the participants.
In Problem D, although D4 (visual) received the highest rating, it was not rated significantly higher than any of the other arguments used in this problem. Therefore, all four arguments seemed to have been equally convincing to the participants. This was a good illustration that in some cases there might not be a unique most convincing argument, and that approaching a problem using multiple strategies might be the only plausible way to address students' diverse preferences.
Data from the survey suggested that the participants were not completely satisfied with empirical checking and verifying: among all the lowest rated arguments in each problem, two were inductive. However, it was premature to claim that the participants were able to realize that checking a few examples was inadequate to prove the general validity of a conjecture. We made this claim since among all 476 respondents, only 10 could identify, in all four contexts, that the inductive argument was not sufficient to establish the validity of the conjecture. At the same time, there were some indicators implying that information other than pure empirical checking could have contributed to students' conviction in the process. However, it was not clear what type of information was most helpful. As shown in Figure 16, visual illustration (A4 & D4), a theorem (B3), a formula (C2), a mental image of a real life experience (C4), and a closer examination of examples (D1) could all contribute to a higher rating. Further investigation of how these various types of information contributed to students' conviction of the conjectures was carried out during the interview phase of the study.
Students' judgments of the third claim under each argument (i.e. "the argument helps you better understand why the statement is true", see Figure 8) were assessed by assigning the numerical values: "1" for "agree", "-1" for "disagree", and "0" for "not sure". Each of these numbers was called a participant's rating on whether an argument was explanatory. Note that if a participant chose "not sure", we considered him/her to be not sure if the argument was explanatory. We then adopted the within-group ANOVA to test whether the students' ratings (which were considered within-subjects variables) on the arguments in a problem were significantly different from each other. The statistical results of the analysis are included in Appendix A, Table 41.
Figure 17 illustrates the results presented in Table 41. In particular, the arguments in each problem were listed sequentially from the most explanatory (top) to
the least explanatory (bottom). In addition, two arguments were connected using a curve
if the ratings they received were not significantly different from each other (p > .05). In
an intuitive sense, if two arguments were connected, then they were “close” to (not
significantly different from) each other; if not connected, then the two arguments were
separated from (significantly different from) each other. For example, in Problem A, A4
was not connected with any other argument while the other three arguments were
connected to each other. This suggests that the rating on A4 (visual) was significantly
higher than the other three arguments, whose ratings were not significantly different from
each other. The results show that the participants found the visual demonstration more
explanatory than any other type of argument in the number theory problem.
Figure 17. Illustration of how explanatory the arguments were to the participants
In Problem B, B3 (algebraic) received a rating that was significantly higher than the other three arguments, whose
ratings were not significantly different from each other. Therefore, B3 was not only the
most convincing argument in this problem, but was also the most explanatory one to the
participants. Again, it was suspected that the use of the Pythagoras Theorem as evidence contributed to B3's high ratings.
The ratings on all arguments in Problem C were not significantly different from
each other. This implies that the participants could extract information from each of the
arguments, which helped them understand better why the conjecture was valid. If true,
this could demonstrate the benefit and need of explaining mathematical results from
multiple aspects.
In Problem D, D2 (algebraic) received a rating that was significantly lower than any of the other three arguments. Different from the cases in the other three problems, there was a single argument in Problem D that received a significantly lower rating. We suspected that the way in which the variable was used in D2 might be unfamiliar to the students. In classrooms, students are usually asked to solve for the variable when it is given in equation form. However, in D2, the presence of the variable didn't require solving for a value. Rather, the variable was used to represent general cases and was eventually cancelled out in the calculation to reach the conclusion.
Visual arguments appeared twice on top of the lists, in the contexts of number theory and geometry. This demonstrated the power of visual illustration in helping students understand a problem better. The algebraic argument was considered the most explanatory in the geometry problem but the least explanatory in the algebra problem. This demonstrated that students possessed the ability to understand arguments in abstract form, but might still have had difficulties if the form was unfamiliar or too complex for them. Overall, no single type of argument was considered more explanatory than the others in all the problem contexts. In addition, as demonstrated in Figure 17, the "explanatory" ratings were not significantly different in 14 of the total 24 pairwise comparisons. This suggested the need for and benefit of explaining mathematical results from multiple aspects. A closer look at the data revealed that even the least explanatory argument (D2) was considered explanatory by close to 60% of the participants who claimed to understand it.
After evaluating all the arguments in a problem, the participants were asked to choose the one which they believed was closest to how they themselves would have argued (e.g. see Question 11, Figure 8). A participant's choice in answering this question was considered "appealing" to the participant. Unlike the previous three ratings, where each argument was evaluated separately and multiple arguments in a problem could receive a high rating, here the participants needed to compare all arguments in a problem and then select only one as the appealing option. Figure 18 illustrates the percentage of the participants who chose each argument.8
8 Since the participants were allowed to choose none of the arguments, the percentages of the participants choosing each argument in one problem did not add up to 100%.
Figure 18. The percentage of participants who considered each argument the appealing option
A within-group ANOVA was applied to determine whether the participants' choices of the appealing options differed significantly. We transformed the participants' choices of the appealing option into 4 columns, assigning the numerical value "1" to an argument if it was indicated by the student as the appealing option and "0" to the other three arguments (see Figure 19 for an illustration). Treating the 4 columns as the 4 levels of within-subject variables, the within-group ANOVA was applied to test the between-argument differences in each problem. The statistical results of the ANOVA tests are included in Appendix A, Table 42.
Student   Appealing Option        A1  A2  A3  A4
S1        A2                      0   1   0   0
S2        A1                      1   0   0   0
S3        A3                 =>   0   0   1   0
S4        A4                      0   0   0   1
S5        A3                      0   0   1   0
S6        A2                      0   1   0   0
S7        None                    0   0   0   0
Figure 19. An example of the data transformation for within group ANOVA test
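The transformation shown in Figure 19 amounts to one-hot encoding the chosen option. A minimal Python sketch (illustrative only; the student choices below reproduce the hypothetical example in the figure):

```python
ARGUMENTS = ["A1", "A2", "A3", "A4"]

def to_indicator_columns(choices):
    """Convert each participant's appealing choice into four 0/1 columns,
    assigning 1 to the chosen argument and 0 elsewhere ("None" -> all 0)."""
    return [[1 if choice == arg else 0 for arg in ARGUMENTS] for choice in choices]

# Choices of students S1..S7 from Figure 19.
choices = ["A2", "A1", "A3", "A4", "A3", "A2", "None"]
for row in to_indicator_columns(choices):
    print(row)
```

Each of the four columns then serves as one level of the within-subject variable in the ANOVA.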
Figure 20 illustrates the results presented in Table 42. In particular, the arguments
in each problem were listed sequentially from the most appealing (top) to the least
appealing (bottom). In addition, two arguments were connected using a curve if the
ratings they received were not significantly different (p > .05). In an intuitive sense, if
two arguments were connected, then they were “close” to (not significantly different from)
each other; if not connected, then the two arguments were separated from (significantly
different from) each other. For example, in Problem A, A4 was not connected to any other argument while the other three arguments were connected to each other. This indicated that the difference between A4 and any other argument was significant, while the differences between A1 & A2, A1 & A3, and A2 & A3 were not significant.
Figure 20. Illustration of how appealing the arguments were to the participants
The largest share of participants chose A4 (visual) as the appealing argument in Problem A, significantly more than those who picked any other option. The percentages of participants who chose the other three arguments (21.2%, 21.0% and 16.6%, respectively) were not significantly different from each other. The preference towards A4 was consistent with findings in previous sections and suggested that the participants preferred to adopt manipulatives as a visual aid to facilitate their reasoning.
In Problem B, the largest share of participants chose B2 (perceptual) as the appealing argument. This number was significantly larger than the numbers who had chosen B1 (inductive, 20.0%) and B4 (visual, 21.4%); however, it was not significantly larger than the number who selected B3 (algebraic, 27.3%). B2 used a "football field" as a context to demonstrate why the conjecture was true, where the explanation relied on an illustration from real life experience, while B3 was based on the Pythagoras Theorem, a well-known result referenced in the school curriculum. This rating was interesting since the sources of evidence used in the two arguments were different, yet they were perceived as appealing by similar numbers of students. This again demonstrated the diversity of students' preferred ways of reasoning. The visual argument might have been less appealing to the participants due to the complexity of its structure. Compared to the simple image of a football field, the geometric figure used in B4 involves many more components (such as rectangles, circles and lines) and the relationships among them.
In Problem C, the arguments received close ratings. Among all the pairwise comparisons, only the difference between the most appealing option (C1, inductive, preferred by 26.7% of the participants) and the least appealing option (C4, perceptual) was significant. Given the participants' responses in Problem B, where the perceptual argument was chosen as the most appealing option, it was surprising to see C4 receive the lowest rating in Problem C. Two reasons might help explain this phenomenon. First, the scene created by B2, i.e. the football field, might be more familiar to the participants than the scene created by C4, i.e. using wires to make triangles. Second, the other options provided in Problem C might be more appealing to the participants for various reasons. For example, the visual illustration in Problem C, i.e. C3, requires less analytical thinking to follow.
In Problem D, the percentage of participants who chose D1 (inductive) was significantly higher than the percentages who chose D4 (visual, 22.5%) and the least appealing option, D2 (algebraic, 18.3%). The same as in Problem C, the inductive argument was considered the most appealing option. Different from the inductive arguments in Problems A and B, C1 (inductive) offered visual images of the samples and D1 (inductive) offered a detailed calculation procedure for one case. In A1 (inductive) and B1 (inductive), such detail was not present. Therefore, we suspected that the extra details contributed to the higher ratings of C1 and D1.
The data revealed that no particular type of argument was appealing in all 4 contexts. In fact, only 19 of the 476 participants considered the same type of argument to be appealing across the 4 problems. 122 participants chose one type of argument 3 times in the 4 problems. The rest of the participants (335) didn't pick any type of argument more than 2 times. This result suggested that for a majority of the participants, the appealing reasoning methods were highly context based and didn't uniformly lean towards any particular type. However, it also suggested that some participants might have developed a more uniform preference towards certain types of arguments. Investigating the rationale behind the choices of the participants whose judgments seemed to have been consistent became one focus of the interview phase.
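The consistency counts reported above (4-of-4, 3-of-4, at most 2-of-4) can be computed with a simple tally per participant. The sketch below is illustrative, with hypothetical participant data:

```python
from collections import Counter

def max_type_frequency(choices_per_problem):
    """Return how many times the participant's most frequently chosen
    argument type appears among their appealing choices (ignoring 'None')."""
    picked = [c for c in choices_per_problem if c != "None"]
    return max(Counter(picked).values()) if picked else 0

# Hypothetical participants' appealing argument types across the 4 problems.
participants = [
    ["visual", "visual", "visual", "visual"],      # same type 4 times
    ["visual", "inductive", "visual", "visual"],   # same type 3 times
    ["visual", "inductive", "algebraic", "None"],  # no type more than once
]
print([max_type_frequency(p) for p in participants])  # [4, 3, 1]
```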
In the previous sections, the students' responses were analyzed to determine which arguments they found understandable, convincing in showing the general validity of the conjecture, helpful for explaining why the conjecture was true, and closest to how they would argue in the same context. These arguments were referred to as the most understandable, convincing, explanatory and appealing arguments, respectively. This section examines whether the participants' evaluation, using the four different ratings (i.e. understandable, convincing, explanatory and appealing), was consistent in each problem, and explains what might have caused any inconsistency. Table 11 summarizes the most understandable, convincing, explanatory and appealing arguments based on Figures 15, 16, 17 and 20. In particular, the highest rated argument, as well as those whose ratings were not significantly lower than it, were included in the proper cell of the table. In each cell, arguments to the left received higher ratings.
Table 11. Summary of the most understandable, convincing, explanatory and appealing
arguments as evaluated by the participants in each problem
Similarly, Table 12 summarizes the least understandable, convincing, explanatory and appealing arguments. In particular, the lowest rated argument, as well as those whose ratings were not significantly higher than it, were included in the appropriate cell of Table 12. In each cell, arguments to the left received lower ratings.
Table 12. Summary of the least understandable, convincing, explanatory and appealing
arguments as evaluated by the participants in each problem
Note that in Problem A, the participants’ choices in all 4 rating standards were
quite consistent. A4 (visual) was considered as the most convincing, explanatory and
appealing option. It was considered the second most understandable option, which was
not significantly lower (p > .05) than the most understandable option A1 (inductive). We
suspect that the visual image provided by A4 was close to the graphic illustration
provided in their early mathematics classroom that introduced multiplication and division.
Students’ familiarity with such a representation might have contributed to the higher
ratings. This suggested that visual representations could be reliable and helpful for students when making judgments. It also suggested that classroom experience had an impact on students' conviction. Aside from A4, the participants' ratings on the other arguments were close (see Table 12), except that A1 was considered significantly less convincing than all the other arguments. This suggested that although A1 was the most understandable option, the participants didn't consider it more convincing or explanatory than the other arguments for showing that the conjecture was always true.
In Problem B, B2 (perceptual) was considered the most appealing option, while B3 (algebraic) was considered the most convincing and explanatory option. B3 was also considered insignificantly less appealing than B2. To examine this further, we analyzed data from those who claimed to understand both B2 and B3. It was found that in this subgroup, 33.9% found B3 more appealing and 29.7% found B2 more appealing. Therefore, B3 was considered the most convincing, explanatory and appealing option among those who claimed to understand both B2 and B3. In fact, B2 was considered significantly less convincing and explanatory than B3 and B4 (see Table 12). There were significantly more participants who had found B2 understandable compared to those who claimed to understand B3, and it was those who had found B2 (but not B3) understandable who raised the overall appealing rating of B2. This result was sensible since we assumed almost every student knew what a football field looked like. Additionally, B2 used easier language while building a perceptual connection between the scenario and the conjecture, which requires less analysis. On the other hand, B3 was convincing, explanatory and appealing to those who understood it because the Pythagoras Theorem is one of the most well-known and reputable results in school geometry. Therefore, if a student understood B3, they would most likely give it a high rating. In addition, B1 (inductive) was considered the least convincing, explanatory and appealing option, which again demonstrated that inductive arguments without further explanation were not sufficiently convincing to the participants.
Note that Problem C was designed as a contrasting item: the conjecture was not true and all the arguments presented in that problem were false. In this context, only 2 participants indicated that none of the 4 arguments could show the conjecture was always true or could help them see why the conjecture was true. We found that one of them had indicated that he didn't understand any of the 4 arguments, while the other suggested she didn't think any of the arguments was close to how she would reason. Furthermore, she claimed that "the tringle9 1 is clearly bigger than tringle 2 so if you put bigger number in tringle 1 then it will be bigger than 2." With the exception of these 2 cases, all other participants selected "agree" for at least one statement suggesting that one argument in Problem C showed them or helped them see that the conjecture was true. Although the participants might not have been sure that the conjecture was always true even when they chose the "agree" option, the data suggested that no student clearly pointed out that the conjecture was false; hence there was no clear evidence to show that, when working on Problem C, any of the participants had assumed the conjecture false.
Tables 11 and 12 indicated that the participants' ratings on the arguments in Problem C were close. Not a single argument received a rating that was significantly higher than the others in any criterion. However, significantly more students found C3 (visual) and C4 (perceptual) more understandable than C2. This was not surprising since C3 and C4 used easier language, while C2's explanation involved intensive usage of abstract symbols. C2 and C4 were considered by most students as convincing. It was surprising to see that
9 The participant misspelled "triangle" as "tringle" in her response to the survey.
C2 was considered convincing by more students than C3. A possible explanation is that
mathematics classroom and its appearance might add credibility to the argument. All the
that when encountering an unfamiliar context, explanation from various aspects could
help students better understand the problem. When choosing the appealing arguments,
Argument C4 fell behind. This was also surprising since Argument C4 was rated as one
of the most convincing options. A possible explanation could be that the scenario used in
Argument C4 was not closely associated with the mathematical content presented in the
conjecture, so the students might not have seen a natural connection there. Therefore
fewer students thought they would adopt such an approach.
In Problem D, D1 (inductive) was a popular appealing option and was rated a close
second to D4 in the other two criteria. This result was consistent with the patterns
observed in Problems A & B. As mentioned in the previous analysis, we suspected that the extra
details offered in D1, i.e. the layout of the calculation procedure, contributed to its higher
ratings. Another interesting finding in Problem D was that D2 (algebraic), which shared
the same procedure as D1, was considered the least understandable, convincing,
explanatory and appealing option. Our conjecture is that the symbolic representation
made D2 more difficult to access, so students tended to choose the easier option when
they saw both, without considering that the two arguments shared the same reasoning.
Collectively we found that among the 32 cells in Tables 23 and 24, only 8
contained one single argument, while 10 contained 3 or 4 arguments. This suggested that
students rarely gave significantly higher or lower ratings to any particular argument, and
in many cases, the between-argument differences were small. Furthermore, as shown in
Figure 20, even the least appealing argument (i.e. A3) was chosen by about 1/6 of all
participants. While the difference between A3's and the other arguments' ratings was
statistically significant, it didn't mean that A3 was of no value to the students. In fact, it
was still selected as the most appealing option by a noticeable share
of the participants. Therefore, the use of A3 could provide extra opportunities
for students to approach the conjecture in Problem A. The case of A3 illustrated that
although some arguments received significantly lower ratings than others (no matter the
criterion), they were still chosen by some students and their preference shouldn't be
what criteria), they were still chosen by some students and their preference shouldn’t be
ignored.
In order to determine whether there was one type of argument that received higher
ratings than all others, we counted the number of times each type of argument appeared
in Table 11 and Table 12 (see Table 13). As shown in Table 13, there was an almost
equal number of each type in both columns. The result indicated that there wasn't any
particular type of argument that received higher or lower ratings than the others. This
finding again illustrates that, in general, "type" couldn't determine students' evaluation
of the arguments.
Argument Type   # of Appearances in Table 11 (high ratings)   # of Appearances in Table 12 (low ratings)
Inductive       8                                             10
Perceptual      9                                             9
Visual          9                                             8
Algebraic       7                                             10
Lastly, it was detected that the ratings of the same type of arguments were highly
inconsistent across the problems. For example, the inductive argument was rated high in
Problem D while receiving low ratings in Problem B; the algebraic argument was rated
high in Problem B but received low ratings in Problem D; the visual argument was rated
high in Problem A but received low ratings in Problem B; and so on. These results again
suggested that argument type alone couldn't predict students' evaluation. In the
previous analysis, we identified a few factors that we suspected had influenced students'
choices, such as the amount of detail provided, the fluidity of language, and the
familiarity of scenario. The exploration and synthesis of these factors were a major goal
of the follow-up interviews. Which arguments students found
most understandable, convincing, explanatory and appealing was distinct across the
contexts. While we were not able to make a grand conclusion about what type of
arguments were understandable, convincing, explanatory and appealing to students, the
high ratings that some arguments received made sense in their respective contexts.
Therefore, we suspected that there were factors other than the presentation and content of
the problems that had impacted the participants' choices. While we were not able to
obtain an explanation for the choices merely based on the survey results, we probed for
factors rooted in the mathematical experiences of the children as well as
non-mathematical experiences gained from life outside the school environment.
Between school comparison
To explore the potential impact of schooling on the participants' responses, we compared
data from participants who were enrolled in higher and lower performing schools. The
percentages of 8th grade mathematical proficiency of the two higher performing schools,
as measured by the 2012 state standardized 7th grade mathematics tests, were at least
10% above the state average, while the percentages of the 8th grade mathematical
proficiency of the two lower performing schools were at least 10% below the state
average. Therefore, the difference in the students' levels of mathematical proficiency
between the higher and lower performing schools, as measured by the standardized tests,
was rather large. While this comparison couldn't rule out other explanations, it
would be valuable to see if students who achieved higher scores on state standardized
tests evaluated the arguments differently, as measured by SMR.
The second comparison considered the potential impact of the participants’ gender
on their choices. The male and female students were enrolled in the same schools and
same classrooms, taught by the same teachers using the same teaching materials and
techniques. Although sitting in the same classroom didn't mean the same classroom
experience for each individual learner, if the cumulative data suggested a large
difference between female and male students' responses, it was unlikely that this
difference was caused by instruction. Therefore, it was assumed that different responses
from male and female participants could provide additional insight into learners' choices
and reasoning. Details of the two comparisons are shared in the following discussion.
117 of the participants were enrolled in higher performing schools and 311 of the
participants attended lower performing schools. For convenience, participants from the
higher and lower performing schools were referred to as Group H and Group L, respectively.
We adopted a between-group ANOVA to test the between-group differences in the
participants' responses to each question in SMR. Using the same data quantifying strategy,
"1", "0" and "-1" were assigned to "agree", "not sure" and "disagree", respectively.
Questions were also labeled in a format like "A1.2", where A1 indicates the argument,
and 2 indicates the 2nd question under this argument, which assesses whether A1 is
convincing. In addition, four variables (e.g. A5.1 – A5.4) were used to quantify students'
choices of the most appealing argument in each problem. The quantifying strategy was
illustrated in Figure 19. Table 43 (in Appendix B) illustrates the results of the between-
group comparisons of students' evaluation of each argument by the different criteria (i.e.
understandable, convincing, explanatory, and appealing).
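As an aside, the quantifying strategy and the between-group test described above can be sketched in code. The sketch below is purely illustrative, not the study's actual analysis script; the responses and group labels are hypothetical. It codes "agree"/"not sure"/"disagree" as 1/0/-1 and computes the F statistic of a one-way between-group ANOVA by hand:

```python
# Illustrative sketch (hypothetical data) of the 1/0/-1 quantifying strategy
# and a one-way between-group ANOVA for a single survey question.

def quantify(response):
    """Map a survey response to a numeric rating: agree=1, not sure=0, disagree=-1."""
    return {"agree": 1, "not sure": 0, "disagree": -1}[response]

def one_way_anova_F(groups):
    """Return the F statistic of a one-way (between-group) ANOVA.

    groups: list of lists of numeric ratings, one inner list per group.
    """
    k = len(groups)                          # number of groups
    n = sum(len(g) for g in groups)          # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: weighted squared deviation of group means.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations around each group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n - k
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical responses to one question (e.g. "A1.2: A1 is convincing").
group_H = [quantify(r) for r in ["agree", "agree", "not sure", "disagree", "agree"]]
group_L = [quantify(r) for r in ["not sure", "disagree", "agree", "disagree", "not sure"]]
F = one_way_anova_F([group_H, group_L])
print(round(F, 3))
```

In the study this computation would be repeated for each of the 64 variables, with the p-value read from the F distribution with (k-1, n-k) degrees of freedom.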
As reflected in Table 43, the between-school differences were not significant for
all but 4 of the 64 variables (questions); the exceptions were C5.3, D1.2, D4.2, and D5.4.
Considering how large the gap between the two groups was on the standardized tests
(Group H at least 10% above the state average and Group L at least 10% below it), the
differences in their SMR performance were much smaller. In particular, the two groups'
evaluations were not significantly different on 60 of the 64 variables. This result
suggested that a higher level of proficiency, as measured by standardized tests, didn't
imply a distinct way of evaluating mathematical reasoning.
A closer examination of the 4 cases where the group differences were significant
revealed that participants from Group H tended to prefer the visual arguments in both
Problems C and D (i.e. C3 and D4): 32% in Group H selected C3 as the appealing option
while only 22% in Group L did so, and 30% in Group H selected D4 as the appealing
option while only 21% in Group L did so. It made sense that Group H exhibited a higher
preference for D4, since these students might have been more familiar with graphs in a
coordinate plane, and such knowledge could contribute to higher standardized test scores
as well. This familiarity might have contributed to their higher preference for this
argument. However, Group H's higher preference towards C3 was less sensible. A
possible explanation might be that the students in Group H had more experience with
certain geometry contexts and they were more capable of visualizing the change of
geometric shapes by reading the description and static images that depict stages of the
transformation. However, this explanation didn't align with the fact that C3 was not
rated significantly more understandable by Group H. Lastly, it
was detected that Group H considered D1 (inductive) less convincing than Group L did.
Our hypothesis for this result was that there could be more students from Group H who
had realized that D1, although it showed more details than the inductive arguments in the
other problems, still only verified a few cases.
In summary, a few differences were detected between Group H and Group L; however,
in general the two groups' responses to the SMR were compatible. Therefore, the survey
data suggested that classroom experience associated with different performance levels
didn't have a decisive impact on the participants' evaluation, and we suspected there
were other factors that might have impacted the participants' choices.
Between gender comparison
Among the 476 participants of SMR, 229 were male and 237 were female. The
remaining 20 participants chose not to disclose their gender and were not included in this
comparison. The male and female students were enrolled in the same classrooms in the
same schools. They also had lived in the same communities. Therefore, we didn't assume
large differences in their classroom and community experiences. Consequently, we
suspected that the between-gender comparison might reveal some non-instructional
factors behind students' evaluation of the arguments. The same method (i.e. the between-
group ANOVA) was adopted to assess the gender differences. Table 44 (in Appendix B)
illustrates the statistical results of the comparison.
As reflected in Table 44, gender was an insignificant variable in all 64 cases
except for A2.1 and A5.2. That is, the female students considered Argument A2 (algebraic)
significantly less understandable and appealing than the male students did. This result
was surprising since A2 was stated using pure mathematical language and didn't refer to
any life experience. Therefore it was difficult to perceive how gender might have had
an impact on the evaluation of this argument. A possible explanation is that more male
students were comfortable using algebraic methods to work on the number theory
problem, though this couldn't be verified from the survey data alone.
Overall, the analysis revealed that the gender differences were small. Therefore, the
gender difference test didn't offer us insights into factors that impact students'
evaluation. Such an inquiry was left to be accomplished during the interview analysis of
the study.
Gender * School effect
Lastly, we studied the gender * school effect on the ratings provided by the
students to investigate whether the impact of gender on the ratings was significantly
different between the higher and lower performing schools. The results are included in
Table 45 (in Appendix B).
As shown in Table 45, the gender * school effect was significant (p < .05) for
A2.1, D2.1, D4.1, and D4.2. Note that A2.1, D2.1 and D4.1 measured whether A2
(algebraic), D2 (algebraic) and D4 (visual) were understandable, while D4.2
measured if D4 (visual) could show the conjecture in Problem D was always true. To
further investigate the gender * school effect on these four variables, we generated plots
using school as separate lines and gender as the horizontal axis (see Figure 21).
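The cell means behind such an interaction plot can be computed with a short sketch. The data below are hypothetical, invented for illustration rather than taken from the study; each record carries a school group, a gender, and a rating coded with the same 1/0/-1 scheme:

```python
# Illustrative sketch (hypothetical data) of the cell means behind an
# interaction plot: one line per school group, gender on the horizontal axis.
from collections import defaultdict

# Each record: (school_group, gender, rating), ratings coded 1/0/-1.
records = [
    ("H", "male", 1), ("H", "male", 1), ("H", "female", 0),
    ("H", "female", -1), ("L", "male", 0), ("L", "male", 1),
    ("L", "female", 0), ("L", "female", 1),
]

sums = defaultdict(lambda: [0, 0])        # (school, gender) -> [total, count]
for school, gender, rating in records:
    cell = sums[(school, gender)]
    cell[0] += rating
    cell[1] += 1

# Mean rating for each school * gender cell.
cell_means = {key: total / count for key, (total, count) in sums.items()}

# One "line" per school: its pair of cell means plotted over gender.
for school in ("H", "L"):
    line = [cell_means[(school, g)] for g in ("male", "female")]
    print(school, line)
```

Plotting each school's pair of means as a line over gender (e.g. with matplotlib) reproduces the kind of plot described: roughly parallel lines suggest no interaction, while lines with clearly different slopes, as in this hypothetical data, suggest a gender * school effect worth testing.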
Figure 21. Plots for variables on which the gender * school effect was significant
Figure 21 demonstrated that the gender differences on all four variables were small in
the lower performing schools. However, in the higher performing schools, the differences
were large. In particular, the male students provided significantly higher ratings than the
female students on all four variables. That is, the male students in the higher performing
schools considered A2 (algebraic), D2 (algebraic) and D4 (visual) significantly more
understandable than the female students in the same schools did. In addition, the male
students in those schools also considered D4 (visual) significantly more convincing than
their female classmates did. This result suggested that male students from the higher
performing schools might be more comfortable with algebraic and geometric contexts
than their female counterparts. In addition, they were also more likely to understand and
be convinced by graphs in a coordinate plane. Since the gender differences on the same
questions were small in the lower performing schools, we suspected that the enlarged gap
in the higher performing schools was caused by knowledge perceived from classroom
instruction. However, it was unclear why the male students in those schools would have
benefited more from such instruction.
Nevertheless, the significant gender * school effect was only found in 4 of the 64
tested variables. Therefore, the cross effect was not significant for the participants'
responses in general.
The survey data demonstrated great diversity among the participants' evaluations
of the arguments used in SMR. Among all 16 arguments used in the 4 problems, even the
least understandable argument as rated by the participants (D2, algebraic) was indicated
as understandable by nearly 60% of them; the least convincing argument (B1) was
indicated as being able to show the corresponding conjecture was true by about half of
those who understood the argument; the least explanatory argument (D2, algebraic) was
considered as being helpful to show why the conjecture was true by close to 60% of the
participants who understood the argument; and the least appealing argument (A3,
perceptual) was selected as the closest way to how they would argue by about 1/6 of the
participants.
Although such arguments might not satisfy a higher standard of mathematical rigor, they
may be most compatible with ways in which many students themselves argue. Data from
the participants' responses to SMR offered much insight into these "natural" ways. To
sum up, the study results indicated that:
- The participants' evaluation of the same argument was highly diverse among
individuals.
- Many participants could be convinced by verifying a few cases. Further support from
multiple sources, such as visual illustrations, could strengthen their conviction.
- The ratings of the same type of argument were inconsistent across the problem
contexts; the group's favorite argument types also varied across the contexts.
- Few significant differences in the ratings were found between higher and lower
performing schools or between male and female participants.
In the previous analysis, we also proposed features that might have contributed to
students' evaluation based on our understanding of the content involved in the
arguments. We suspected that concrete examples and visual illustrations made the
arguments more accessible, and perhaps examining one case in detail may have helped
students see why the conjecture was true. We conjectured that arguments that used easier
language and offered shorter explanations were more appealing. However, these
conjectures couldn't be verified merely based on the survey data. The follow-up
interviews aimed to unpack students' perception of the arguments and their rationale for
decisions they had made. The interviews also allowed us to explore the mathematical and
non-mathematical factors that had impacted the students' judgment. Results from the
interviews are presented in the following sections.
The survey results suggested that the students' preferences among the arguments were
highly diverse across the problems and between individuals. The results, however, didn't
allow us to infer what types of arguments were more appealing to students. Furthermore,
the data didn't capture specific features of the arguments that had significantly impacted
students' evaluation of the arguments. Since students' judgments were made based upon
their understanding of each argument, we believed there were hidden factors that could
have impacted their choices. In order to further investigate those factors, we relied on the
follow-up interviews with eight selected subjects. The subjects' background
information as well as the selection process were included in Chapter III. Each interview
lasted about an hour. Details regarding the interview procedure were also described in
Chapter III. This section offers analysis of the interview data. In particular, we first
present a case-by-case analysis of the participants' responses to each problem so as to
examine the potential impact of various factors on their judgment.
Survey results were insufficient to explain why the respondents had made certain
decisions. It was not assumed that an individual relied on the same factors and used the
same logic in every context; however, by comparing and analyzing his/her responses in
multiple problems, we assumed that we were more likely to detect factors that
consistently impacted his/her judgment. For each subject, we examined his/her
responses, including how they ranked the arguments from the most convincing to the
least convincing, along with their justification for the ranking. The analysis of Allen's
interview responses has been elaborated in the methodology chapter and served as an
illustration of the analysis process. Below we include findings from the other seven
subjects, which were obtained using the same analyzing techniques.
The case of Abby
Abby was an 8th grade student enrolled in an Algebra I class at the time of data
collection. In her responses to SMR, the inductive arguments were indicated to be the
closest to how she would argue in all but Problem C, where she selected C2, the algebraic
argument. Based on this result, we believed that Abby had exhibited a preference towards
inductive arguments, and she was selected as a representative of the consistent group.
Abby's interview responses are summarized in Table 14 and Table 15. Table 14
illustrates the rankings provided by her for each problem. Column One of the table
represents the order of problems that she tackled. Table 15 summarizes Abby's comments
when articulating why she found certain arguments convincing or not convincing (the
coding of each comment is explained in Table 9). These two tables served as the major
data sources for the following analysis.
Table 15. Abby's comments when explaining her rankings (coding per Table 9)

Problem D
Positive Comments:
- That's how I would normally do it, it shows like how to get there. (E2)
- I think it'd be easier to do this way than, like, have a graph. (E2, R3, R1-)
- That shows 20 times 5 equals 1, so then that'd just prove that he's right. (E2)
- I think work would be easier than trying to make a graph. (R1-, R3)
- It also says they tried it with 200 and 500, which gives more information. (E2)
- If it works for 200 and 500, why wouldn't it work for 300? (L3)
- I just think 'cuz you're multiplying it by the same thing, and if it works for those two, if you tried 300, I think it'd work. (L3)
Negative Comments:
- It doesn't show, like, how it is after the tax. (E2)
- There's just so much work… if you can make it simple, like this one [points to D1], why would you confuse yourself? (E2, P)
- I would need to try it. (E2)

Problem C
Positive Comments:
- It says the formula is base times height, divided by 2, and I think since they're all greater, then that does prove that this height would be bigger than this one, so that'd prove it's bigger. (E4, L2)
- It shows you how he got, how they're bigger areas. (E2, R1)
- It puts it in a form how you can see 1 is bigger than 2, and they did, like, equilaterals, scalene, isosceles triangles, so they did all the different triangles, and then they showed. (E2, R1, L3)
- That one would be easier. (P)
- That's just how I've been taught since I was little. (E1, E3)
Negative Comments:
- I've never heard of using wire to make a triangle. (E3)
- I've never heard of this either to do, to figure that out. (E3)
- These I've never actually done. (E3)

Problem B
Positive Comments:
- Everyone knows what a football field looks like, so you can just, like, imagine in your head that the diagonal's longer than all sides. (E3, L2)
- Anyone can draw rectangles and measure the sides, and they could obviously see the diagonal's longer. (R1, E2)
- I've seen a football field before, and I know how big they are. (E3)
- It'd be simpler to do this than figure out, make sure you did the circle right. (P)
Negative Comments:
- This one makes no sense at all. (NA)
- I've just never done it like that. (E3)

Problem A
Positive Comments:
- I know how to make, like, algebraic expressions, so if I would put it this way, I'd understand it more, and it also proves that 6 equals 3 times 2. (E4, R4)
- It shows that it could be for any number that's a multiple of 6. (R4, L6)
- This one is easy to see, visualize it. (E2, R1, P)
- I've been doing that this entire year because of algebra. (E1, E3)
- I was taught to put those in algebraic expressions. (E1, E3)
- They, like, show you how to do it, and they show you it's true. (P)
Negative Comments:
- I haven't tried, like, large numbers, say like, a thousand something, that a multiple of 6, I didn't know if that'd be a multiple of 3 too. (E2)
- It proves that it could do that, but they didn't show how they did it. (E2)
- It's confusing because they show many words in it. I just don't like word problems. (R2-)
- You could see it, how they put it, but if you were just told to figure that out, and you didn't have these in front of you, it'd be hard to tell. (E3, P)
- If they said, like, use the square cards, and you didn't have them in front of you, you'd have to think and put them together and draw them. (E3, P)

Comparing Problems A-D
Positive Comments:
- In this problem, they like, show you pictures, they like show you the triangles, and what their size is. (R1, E2)
- The size, and I can put them together, in a way… like it makes sense to how it's smaller. (E2, R1)
- If I was by myself, and I didn't have someone to explain that, that would be a better pictures. (R1)
- I know how to do this, but if I didn't, that [points at C1] would be easier to find. (E2, P)
- They show you the problem within… the words… so they gave you an idea within those. (R2)
Negative Comments:
- They don't show you how you get them. (E2)
- We were taught to do a tree, and branch off the multiples, so… I would need the tree in front of me to see. (E2, E3)
- I wouldn't know how to make the graph just off the top of my head for that certain problem. (E3-, P)
- This one confuses me. (NA)
- It doesn't show enough, like it doesn't give you enough numbers. (E2)
- Just how they worded it. (R2-)
- I can imagine a cookie box, it's just the words… because it's just so much. (R2-)

Problem E
Positive Comments:
- It shows you how they got there, like, it shows you that it won't change. (E2)
- Just showing the picture, it can help you visualize in your mind without having to do a lot of work, you just know it won't change. (E2, R1, L2)
- It shows you, like, the percentage, and it won't change. (E2, R3)
- They show the percentage, and when you double it, it still stays the same, so why would it be any different if you did 5 and 3? (E2, R3, L4)
Negative Comments:
- This isn't as easy to visualize. (R1, P)
- You'd have to think more, and work it out more. (P)
- There's not really any work to show how they got there, so if you didn't know, like, the problem, you wouldn't be able to figure this out. (E2, P)
- You wouldn't know how they got there, because they didn't show any work. (E6-)
- It's kind of hard to understand. (P)
- It had to be explained to me. (P)
- It's a confusing picture. (P)

Additional Comments
Positive Comments:
- It put the work in it. (E2)
- It put the work within the problem. (E2)
- When they did the 2 and the 3, they showed the percentage. (E2, R3)
As shown in Table 14, Abby considered the inductive arguments most convincing
in 3 of the 5 problems (i.e. Problems C, D, and E), while the visual arguments were rated
least convincing in the same three problems. The algebraic and perceptual arguments
were placed between the visual and inductive arguments. This general preference toward
inductive arguments was consistent with her responses in the SMR. However, Abby's
rankings for the arguments in the other two problems were different. In Problem A, she
considered the visual argument the most convincing and the perceptual argument the
least convincing. In Problem B, the perceptual argument was ranked the most convincing.
In order to better understand how Abby evaluated the proposed arguments and her
rationale when providing these rankings, the coding for her explanations in Table 15 was
summarized in Table 16 so as to identify aspects and features of the arguments that had
impacted her judgment.
As shown in Table 16, the total numbers of comments that referred to the
representation, evidence and link of the arguments were 22, 46, and 8, respectively,
indicating that the evidence had the largest impact on Abby's judgment. Among all types
of evidence, Abby found examples (i.e. results from an immediate test) the most reliable;
they were the most referenced type of evidence throughout the interview. Abby's
reliance on specific examples could be highlighted by her claim that "if it works for 200
and 500, why wouldn't it work for 300?" Furthermore, imaginary evidence (i.e.
scenarios recalled from or created upon previous experience) was also considered
reliable by her (referenced 12 times). For example, she suggested that "I've seen a
football field before, and I know how big they are" and hence considered B2 convincing.
In addition, she found some arguments not convincing since she had "never done it like
that." In terms of representation, there didn't seem to be a certain type that particularly
contributed to her conviction. In fact, the same type of representation could affect her
judgment negatively and positively, depending on the context. For example, in Problem
D, she found other methods "easier than trying to make a graph," while in Problem A she
found A4 convincing since it was "easy to see, visualize it." In addition, she found A3
"confusing because they show many words in it" and she didn't "like word problems."
However, when commenting on D1 she claimed it was convincing since "they show you
the problem within… the words." Abby's comments made fewer references to the link of
the arguments; the reasoning modes she mentioned included induction, transformation,
and deduction, referenced 3, 3, 1, and 1 times, respectively. She did realize that an
argument should be valid for all cases when working on Problem A. However, this
realization wasn't evident in her comments when she worked on other problems.
Therefore, she didn't seem to hold a consistent standard of generality across contexts.
Lastly, an examination of her comments on the arguments revealed that Abby tended to
consider easier arguments more convincing. This preference for simplicity
indeed had impacted her conviction. This was demonstrated by her claim that “… if you
can make it simple, like this one, why would you confuse yourself?” In addition, it was
found that the need to do extra work made an argument less convincing to her. For
example, when evaluating A4, she claimed that the need to “think and put them
[manipulatives] together and draw them” complicated the process and made the argument
less convincing to her. Another example was her comment about D4. Although she
claimed that she understood the graph, she considered it less convincing because she was
not able to "make the graph just off the top of my head." The pursuit of simplicity
explained her preference toward the use of easy examples and imaginaries created upon
previous experience, which might be the easiest ways for her to access the problems.
Based on Abby's comments when explaining her rankings, Figure 22 was generated to
summarize the features that contributed to her conviction: she found perceptual and
inductive arguments convincing, relied on examples and imaginaries as evidence, and
favored representations that were easy to understand (visual, numerical) and familiar
procedures.
Arguments A4, B2, C1, D1, and E1 provided the most accessible examples and scenarios
and hence were considered the most convincing. In particular, the manipulative model
used in A4 and the football field scenario in B2 were both familiar contexts to her. The
examples provided in C1, D1 and E1 were easier to understand than those used by other
arguments. In contrast, A3, B3, C3, D4, and E2 (visual) might be more difficult to access:
A3 was too "wordy;" the graph in D4 was difficult to create; the diagram in E2 was
"hard to understand;" and she had never "actually done" anything like what was
described in B3 and C3. Therefore, the difficult access made these arguments less
convincing to her.
The case of Alice
Alice was enrolled in an Integrated 8th Grade Mathematics class at the time of
data collection. In her responses to SMR, the perceptual arguments were indicated to be
closest to how she would argue in all but Problem A, where she chose A4, the visual
argument. Based on this result, we believed that Alice had exhibited a preference towards
perceptual arguments, and she was selected as a representative of the consistent group.
Alice's interview responses are summarized in Table 17 and Table 18. Table 17
illustrates the rankings provided by her for each problem. Column One of the table
represents the order of problems that she tackled. Table 18 summarizes Alice's comments
when articulating why she found certain arguments convincing or not convincing (the
coding of each comment is explained in Table 9). These two tables served as the major
data sources for the following analysis.
Table 18. Alice's comments when explaining her rankings (coding per Table 9)

Problem D
Positive Comments:
- It shows the picture, which helps the reader understand more. (E2, R1)
- It shows how much it is before tax and after tax, which helps you notice, or realize, how stuff is. (E2, R1)
- The height of the line before tax is, er, after tax is higher up than the line before tax. (R1, E4)
Negative Comments:
- I didn't understand it; like, I tried and tried, but I just couldn't figure out how to do it. (NA)
- It doesn't seem like it made as much sense as the first two did. (NA)

Problem B
Positive Comments:
- It shows that, like, when you put it in a circle, BD would always follow along the, well it won't always follow along the edge of the circle, but if you just imagine that it would, then it'd be longer than BA or BC. (E2, R1, L4)
- Usually, like, when you draw rectangles, and you measure the length of their sides, and then you draw a diagonal, the diagonal is always gonna be longer than the edges, because in order to get, like, from the edge and then down, like in order, like… when you draw the rectangle, like, the size of the diagonal will always be longer than this because, like, if you had a circle, and you were to bring it up, it would come up to like, right here, because the diagonal is always longer than the straight line, depending on how long the straight line is, and when it's inside a rectangle, then the diagonal will always be longer. (E2, R1, L4)
- Now that I think about it, the longer it [the side] is, the longer the diagonal will be, so that it can go from corner to corner. (E3, L4)
- I honestly understand better, stuff better when it has like, a picture, 'cuz I think better when I can see it and not read it. (E2, R1)
Negative Comments:
- When I stand on the edge of the football field, I look at the diagonal and then I look straight, it looks the same, because like, you can't see it from like, up in the air, you're on the ground looking at it, so you can't really tell the distance, and it looks the same. (E3, L2)
- When you do AB squared plus AD squared, um, like, it won't always come out to be BD squared, because when you combine AB squared and AD squared, it'll actually turn out to be farther than BD squared. (NA)
- It [Pythagorean theorem] doesn't really apply to this problem. (NA)

Problem A
Positive Comments:
- I used the multiples of 6, like she used, and for every one I tried, they're multiples of 3 as well. (E2, L3)
- When I work it out, like… when I choose a random number for n, I put it in and like, for instance, here I chose 6 for n, and 6 times 2 is 12, and I multiplied that by 3, it equals 36, and on the other side, I plugged 6 in, and 6 times 6 is 36, so it's right. (E2, L3)
- When you plug in a number for n, whatever you do in the equation will be the same on the other side, like, the answers are. (E2, R4, L4)
Negative Comments:
- I didn't really understand this one, because when they split 'em up, it showed that they were multiples of 3, but… I don't know… it's confusing. (NA)
- I didn't understand the wording that they put it in, and it made it really confusing. (NA)
- There might be some number of 6 that isn't always [contained in the discussion of A1]. (E2)

Problem C
Positive Comments:
- It's more appealing to me because, like I understand what it's saying, but if you take the sides of Triangle 1 and you cut them down, and then you make it into like, sometimes a similar triangle, then Triangle 1, then, like it can be the same shape but it won't be the same size, because you cut the sides down. (E2, R1, L4)

Comparing Problems A-D
Positive Comments:
- You can plug in any number, and you'll always, like, get the same number on both sides. (E2, R4, L6)
- I found that one more understanding because it gave you, like, the numbers that they tried to where you could try multiple numbers. (E2)
- They gave you more choices to… choose from to where it's like, not so complicated, like you have more numbers to work with. (E2, P)
- It gives you something that you can like, draw with to where like, you can even see for yourself. (E2)
- You can get many, like, possible ways. (E2)
Negative Comments:
- You have to have a specific number in order to get the answer that they're looking for. (E2, R3)
- It has more than one step, which makes it kind of harder to do. (P)
- It's confusing, like, the way they split it up… (NA)
- It was harder for me to comprehend. (P)
- They only gave you two, but like, what if the price is higher than 500. (E2, L3-)
- They only gave you two numbers to work with. (E2, L3-)

Problem E
Negative Comments:
- I think of the problem in a different way. (P)
- I'm not comprehending what they're all saying… because I have a different way of finding the answer than all of the arguments.
arguments were rated most appealing to her in the SMR; however, most of them were
placed at the bottom of the list during the interview (see Table 17). In addition, her
evaluation of the same type of arguments were inconsistent across the problems. For
example, algebraic argument was considered the most convincing in Problem A, second
Problems D and E. Therefore, the ranking provided by Alice hardly revealed any pattern
in her judgment. In order to better understand how Alice evaluated the proposed
arguments and her rationale when providing these rankings, the coding for her
Total number of references to representation: 10
Visual Narrative Numerical Symbolic
Positive 7 0 1 2
Negative 0 0 0 0
As shown in Table 19, the total numbers of Alice’s comments that focused on the representation, evidence, and link of the arguments were 10, 21, and 11, respectively. Most of Alice’s comments were about the evidence of the arguments. Among all types of evidence, Alice found examples (i.e., results from an immediate test) to be the most reliable source for establishing an argument; examples were referred to 18 times throughout the interview. Additionally, in 3 cases she also considered the arguments that
She found induction reliable in some cases (e.g. she suggested that “I plugged 6 in,
and 6 times 6 is 36, so it’s right”); however, it was detected that in some other situations
she demonstrated a need to see more than just checking a few cases. This was
exemplified by her comment on D1 that “they only gave you two, but like, what if the
price is higher than 500?” A closer examination revealed that whether an argument
impact on Alice’s judgment. This was detected 5 times, including the comments that “BD
would always follow along the, well it won’t always follow along the edge of the circle,
but if you just imagine that it would, then it’d be longer than BA or BC” and “Now that I
think about it, the longer it [the side] is, the longer the diagonal will be, so that it can go
by Alice in Table 17. Notice that she considered the visual arguments (B4 and C3) as the
D4 was also considered convincing in Problem D, where she claimed to be able to see the
constant distance between the parallel lines, which convinced her that the difference
remained the same. In Problem A, A4 (visual) was considered not
convincing. Her explanation, however, revealed that she could clearly see how one
diagram transformed to another but was not able to see how it connected to the problem
context. Hence what made the argument less convincing was not due to the use of
transformation.
however, we suspected that it had also potentially impacted her judgment in other
contexts. For example, it was found that she realized the advantage of algebraic
representation in Problem A, where she claimed that “you can plug in any number, and
you’ll always, like, get the same number on both sides.” She believed A2 (algebraic) was
more convincing than A1 (inductive) since A1 didn’t prove the conjecture was true for all
cases. However, at the same time, she needed to plug in some numbers to verify if the
formula used in A2 was true. A possible explanation was that through plugging in the
numbers in the formula she might have detected some patterns that would transfer to
other situations as well, and consequently the formula became valid to her in general
cases. If this conjecture is true, we may claim that Alice found transformational reasoning
the comments she made. This revealed that Alice was not yet able to explicitly reflect on
Four of Alice’s comments were coded “P.” Similar to what was detected in the cases
of Allen and Abby, Alice also suggested the need for simplicity of an argument to be
convincing to her. For example, she claimed that D2 was less convincing since “it has
more than one step, which makes it kind of harder to do.” In addition, it was detected that
her own way of approaching a problem had an impact on her judgment of the arguments
given. This impact was most obvious when she was working on Problem E, where she
claimed that she didn’t consider any of the arguments convincing since she had “a
different way of finding the answer than all of the arguments.” In explaining her own
method, Alice also started with specific numbers. However, she decided to give up that
approach after a few trials. What she had done didn’t seem to be different from what was
suggested in E1 (inductive). So it seemed that she didn’t understand what was offered in
E1.
suggested that Alice was likely to be convinced by arguments that were simple enough
and used approaches that were familiar to her. In particular, arguments that utilized visual
examples and engaged transformational reasoning seemed to be the most convincing type
to her.
[Figure: summary of Alice’s rationale — convincing arguments rested on examples (visual) as evidence, transformational reasoning, and the personal standards of being easy to understand and using a familiar procedure]
Amy was an 8th grade student enrolled in an Algebra I class at the time of data collection. In her responses to the SMR, the algebraic arguments were indicated as the closest to how she would argue in all but Problem C, where she selected C1, the inductive argument. Based on this result, we believed that Amy had exhibited a preference towards algebraic arguments. Therefore, she was considered to be a representative from the consistent group.
Amy’s interview responses are summarized in Table 20 and Table 21. Table 20
illustrates the rankings provided by her for each problem. Column One of the table
represents the order of problems that she tackled. Table 21 summarizes Amy’s comments
when articulating why she found certain arguments convincing or not convincing (The
coding of each comment is explained in Table 9). These two tables served as the major resource for the interview analysis.
Positive Comments Negative Comments
Problem C
Positive: That’s kind of how I thought of it in my head. (P)
Negative: When I think of stuff, I don’t put it in like, diagram and shape form, I kind of just think of it as just, like, stuff in my head, I don’t think of any shapes or any examples to it, and so I sometimes, they confuse me, this at first confused me when I looked at it. (E2-, E3, R1-)
Positive: It has the diagram of the pictures, which just makes sense in my head, and it has different cases, so that it just, there’s different things to show instead of just one example, there’s more. (R1, E2, L3)
Negative: It’s got too much numbers in it, and so it gets it confused in my head, so I have to reread it in my head a couple times. (E2-, R3-)
Positive: Showing more examples makes it more convincing. (E2, L3)
Negative: It just slows me down mostly. (P)
Positive: It made more sense and it seemed more valid to me. (P)
Negative: It doesn’t really, it starts with the b, and it doesn’t explain the a and c in the first part, and I think it probably should explain the a and c, it just explains the b. (E2)
The picture confused me a little bit. (E2-,
R1-)
It’s not very detailed. (P)
This one’s too detailed, and not really,
completely true, and that one’s not really
detailed enough; the picture shows it, but
like I said, sometimes pictures get me lost.
(R1-, P)
Problem B
Positive: This person used more examples, they did more of, I guess, trials… and so, it’s more likely to be true for this one than for other ones. (E2, L3)
Negative: It’s just one example, and so it’s not necessarily true because it just, it may not be a hundred percent true. (L3-)
Positive: It has a whole formula. (E4, R4)
Negative: It has more information to it, but I, it confused me a little bit. (P)
Positive: It looks pretty true. (R1)
Negative: I don’t think it shows it, because it says many and several. (L3-)
Table 21 continued
Positive: It used an actual formula. (E4, R4)
Negative: This is only, it just has one example, and it doesn’t have anything other than one example to back it up. (E2, L3-)
I don’t exactly think this is even correct,
because it says BQ is equal to BD, and I
might have been misunderstanding it
wrong, but it just doesn’t look equal to it,
it doesn’t seem very equal… after that, it
kind of just lost me, because I was just
like, well this isn't equal, so the rest of it
doesn't seem very true either. (E2, R1)
Problem A
Positive: You can put any number in there, and it wouldn’t make a difference. (E4, R4)
Negative: This one is just shown by pictures. (R1-)
Positive: It has the formula. (E4, R4)
Negative: With pictures, like, sometimes it can be incorrect, or not true for some things. (R1-)
Positive: I think with a formula, it makes it true for any event. (E4, R4, L6)
Negative: The person just tried a couple different things; I mean, they might have tried a lot, but they didn’t try all of them, which is important. (E2-, L3-)
Most people like thinking math with food
and whatnot, once you get into food, it just
completely loses me. (E3-, L2-)
It’s talking about cookies… I just can’t
picture that in my head. (E3-, L2-)
Problem D
Positive: It’s got the formula, and it uses x instead of an actual price, so it can be any number, and the formula is correct. (E4, R4, L6)
Negative: It doesn’t necessarily say the formula, and so I don’t one hundred percent know exactly what formula was used, in my head. (E4, R4)
Positive: It’s kind of like using this formula, just putting it onto a graph instead. (E4, R1, R4)
Negative: This one has actual prices instead of x, and so even though they use different prices, it’s not always true, because they can’t use every number… and so it’s just, you can’t tell from that one if it’s 100 percent true or not. (E2-, R4, L3-)
Positive: It shows the formula, and so I’m, it’s really clear on what they’re doing. (E4, R4)
Negative: They just don’t have a lot of stuff to back it up. (E6-)
It doesn’t have much information to back
it up that it’s true, so it’s not as clear. (E6-)
I would probably add to it that, something
about the actual, something about the
before price and the after price, instead of
just the 20 dollars and the 5 percent. (E2)
It would be better if it had, like, an x for an
actual price… instead of just showing the
tax difference. (R4)
Comparing Problems A-D
Positive: I am pretty sure that all rectangles are similar. [making rectangles using her fingers] (E2, L4)
Negative: I don’t think the diagram… the picture, I don’t think it goes with this… yeah, with the description. (E2-, R1-)
Positive: They kind of have a formula here, and then they back it up with different examples. (E2, E4, R4)
Negative: You can’t necessarily go with the picture, because the picture doesn’t show all cases. (E2-, R1-)
Positive: It [my brain] likes more figures and numbers. (E2, R1, R3)
Negative: In your brain, your brain can just skew everything if you just have one missed piece of data or anything. (E3-, L2-)
Positive: Numbers are a lot simpler than trying to think of something in my head. (E2, R3)
Negative: My brain doesn’t like to connect to imaginative stuff. (E3-, L2-)
Those just don’t convince me as much as
numbers and something that I can actually
see on a piece of paper. (E2, E3-, R3)
It just shows a couple cases, not the whole
range of cases, ’cuz there could be
basically any number, there could be tons
of different things it could be. (E2-, L3-)
Problem E
Positive: It has more than one case, and it has variables, so you can put anything into it, and so it will be true for anything, instead of just one thing. (E4, R4, L6)
Negative: It has one case, instead of all the, however many, amount of cases. (E2-, L3-)
It’s got a picture instead of a number, and
the pictures can be misinterpreted, or
mismade. (E2, R1-, R3)
It doesn’t have any pictures or numbers, it
just has words to back it up. (E2, R1, R2-,
R3)
[It was not backed up by] numbers and
objects. (E2, R3)
Additional Comments
The numbers, they seem to be right, but
they don’t really show anything else. (E2-,
L3-)
They say it in words instead of in numbers.
(E2, R2-, R3)
in 3 of the 5 problems (i.e. Problems A, D, and E), while the inductive arguments were
rated least convincing in the same three problems. The visual and perceptual arguments
were ranked between the algebraic and inductive arguments. This general preference for
algebraic arguments was consistent with her responses in the SMR. However, Amy’s
rankings for the arguments in the other two problems were different. In Problem C, she
considered the visual argument as the most convincing while the algebraic argument
received the least convincing ranking. In Problem B, the inductive argument was ranked
the most convincing while the perceptual argument was considered the least convincing.
In order to better understand how Amy evaluated the proposed arguments and her rationale when providing these rankings, the coding for her explanations in Table 21 was summarized in Table 16 so as to identify factors and features of the arguments that had influenced her judgment.
Total number of references to representation: 37
Visual Narrative Numerical Symbolic
Positive 6 0 7 13
Negative 8 2 1 0
As shown in Table 16, the total numbers of comments that focused on the representation, evidence, and link of the arguments were 37, 46, and 19, respectively, indicating that all three factors had impacted her evaluation. These numbers also indicated that much of Amy’s explanation was based on the features of the arguments instead of her personal opinions.
There were three key findings that made Amy special. First, she was the only subject who had clearly and repeatedly emphasized her preference toward algebraic arguments and made explicit claims about the logical rigor of these arguments.
respectively when she was talking about factors that convinced her. These statements were made when justifying the rankings she provided for Problems A, D, and
E. In particular, she claimed that A2 (algebraic) had “a formula; it makes it true for any
event;” D2 (algebraic) “got the formula, and it uses x instead of an actual price, so it can
be any number, and the formula is correct;” and E2 had “variables, so you can put
anything into it, and so it will be true for anything, instead of just one thing.” Her
explanation demonstrated that she was not only attracted by the symbolic format, but also understood the logical rigor of these arguments. This was explicitly addressed 3 times. As a natural
consequence of this realization, she had also repeatedly addressed the deficiency of
inductive and perceptual reasoning (4 and 8 times, respectively). This was exemplified by
her claims that “they might have tried a lot, but they didn’t try all of them, which is
important,” “this one has actual prices instead of x, and so even though they use different
prices, it’s not always true, because they can’t use every number” and “your brain can
just skew everything if you just have one missed piece of data or anything.”
Second, she was the only subject who clearly described the disadvantage of visual
illustrations, which was not about any specific image or graph, but about visual
illustration as a way to reason. Such claims include “you can’t necessarily go with the
picture, because the picture doesn’t show all cases,” “it’s got a picture instead of a number,
and the pictures can be misinterpreted, or mismade,” and “with pictures, like, sometimes
it can be incorrect, or not true for some things.” This point was addressed 8 times during
the interview.
The third finding was that although the previous two results consistently appeared
in her explanation in the number theory, algebra, and probability problems, they were not
present when she was working on the two geometry problems. This is a good example of how context may impact students’ reasoning methods. If a reasoning test were based on Problems A, D, and E, Amy would be considered as one who demonstrated the
highest level of maturity in mathematical reasoning, especially among 8th graders. So the
One reason could be that Amy tended to avoid working on visual representations
in Problems A, D, and E since she believed they might misrepresent the content.
However, in geometry problems she had to work on images and figures. Amy’s
further revealed her thinking in geometric context. When asked to compare B1 to the
inductive arguments in other contexts, Amy suggested that B1 was different because the
cases used in B1 were not numbers but rectangles. She further claimed that “all
rectangles are similar” shapes that shared common properties such as “equal opposite
sides,” and hence if the claim that “diagonal is longer than the sides” was true for some of
them, it should apply to others as well (while stating this, she used her fingers to make a
rectangle and made a movement to represent the adjustment of side lengths). This
explanation revealed that Amy utilized transformation to convince herself that B1 did
account for all cases. A similar strategy applied to her judgment in Problem C, where C3
(visual) utilized transformation. This argument was rated most convincing since she
believed it demonstrated that the conjecture was true for all cases. Further examination of
Amy’s judgment of the algebraic argument in the two geometry problems revealed that
she didn’t understand the algebraic argument in Problem C and hence considered it the
least convincing. She was convinced by B3 (algebraic) but rated it low because it
confused her slightly at the beginning. Supported by this evidence, we believe the
following three points capture Amy’s major rationale when judging whether
conjecture is true in all cases. This perception served as the primary guiding principle for
Second, she found testing a few numbers to be helpful to understand a problem better;
however, she believed that algebra was the reliable tool to guarantee the general validity
perceptual connection, and visual illustration, as reliable, and suggested they each had
[Figure: summary of Amy’s rationale — convincing arguments rested on examples and facts (symbolic, numerical) as evidence, deductive and transformational reasoning, and the standard of being true for all cases]
Third, she considered different numbers as separate cases but she viewed a group
of geometric shapes that shared certain common properties as related cases. Therefore,
to other cases. However, examples in numerical contexts were viewed as isolated
instances, hence their property might not hold in other situations. Amy’s rationale was
Beth was an 8th grade student enrolled in an Algebra I class at the time of data collection. When working on the SMR, she tended to prefer A4 (visual), B2 (perceptual), C2 (algebraic), and D4 (visual) in the respective problems, and hence she was considered to be a representative from the inconsistent group.
Beth’s interview responses are summarized in Table 23 and Table 24. Table 23
illustrates her rankings for each problem. Column One of the table represents the order of
problems that she tackled. Table 24 summarizes Beth’s comments when articulating why
she found certain arguments convincing or not convincing (The coding of each comment
is explained in Table 9). These two tables served as the major resource for the interview
analysis.
Positive Comments Negative Comments
Problem B
Positive: I’ve been on a football field, so I know what the shape is and everything, so if I imagine to myself I’m standing at the corner of a football field, like that says, and I’ve had to run football fields, and they’re called the suicide thing, so I had to run that way, and then those two ways, and that one was longer than those two when I was running. (E3, L2)
Negative: I got really confused. (NA)
Positive: It’s also because of a relatable thing, I, like, I understand what it means when it says, I can picture a rectangle being drawn, plus I’ve measured rectangles, so that’s longer than the two sides. (E3)
Negative: I guess it would be more convincing if I knew what the actual numbers were, if they actually use the actual numbers in them, and not just like, saying the square of BD, if they actually put the actual numbers. (E2, R3, R4-)
Positive: I just know more about B2, I’ve run the football field before, so I guess that’s why. (E3, L2)
Negative: We’re probably not going to have rulers during the test, so it’s going to be harder. (P)
You can kind of look at the side lengths
and see what they mean by it, instead of
having to measure it and everything. (E2,
R1)
Problem D
Positive: D1 gave a little bit more of an explanation at the end, and also just like on that one [points to previous question], they used actual numbers, so even though it wouldn’t really probably be that hard for me to insert a number in there during the test, that one's already done for me, so it's probably a lot easier to do. (E2, R3, P)
Negative: It makes sense, it’s just really short, and they don’t really give a lot of examples. (E2)
Positive: I could insert the 200 dollars and the 500 dollars that he’s suggesting is the same thing, and I could see if it was actually right. (E2, L3)
Negative: They don’t give you examples of numbers that fit into it really, they just… I guess yeah, they just don’t give you numbers to support themselves, their claims. (E2, R3)
Table 24 continued
It just has an illustration, and I’m
sometimes, most of the time, I’m a visual
learner, so it helps a lot to see it and read
what it says, and it is, it makes sense. (R1,
P)
It only gives one example, but it also
offers 200, 500 if you wanted to insert
them, so yeah, I think it does support that.
(E2, L3)
Problem A
Positive: I can kind of imagine someone having six cookies in… having a multiple of six I imagine 36 because that’s the square I guess, square root or whatever, and um, so I imagine 36 and I imagine six boxes of 36 cookies and dividing each into two and then there's three cookies in each, so… and then you can just, you can put the three cookies with the two boxes of three cookies, you can put it back into one box of six, and it's still a multiple of 36 either way. (E3)
Negative: It confused me the first time I read it, and I had to re-read it, because I wasn’t really sure what it meant by the uh, when it was, the way it was divided and everything. (NA)
Positive: You can insert a number in there… and it would make sense. (E2)
Negative: I’m guessing that they’re doing what I think they’re doing. (NA)
Positive: It’s visual, so it’s a lot easier for me to understand when it’s visual. (R1, P)
Negative: It says that she’s tried plenty of multiples of six, and three as well, and that they’re the same, but just ’cuz she’s tried a lot of them, she hasn’t tried all of them, so you could never really know, based on that statement, if she was right or not. (E2-, L3-)
Even if you just try a wide range of
numbers, you still, you never know. (E2-,
L3-)
Problem C
Positive: I can visualize that, and ’cuz I can think about it in my head. (E3, L2)
Negative: You’ve tried many cases, but you can never be sure, ’cuz you haven’t tried all the possibilities, which really, you could never do anyways. (E2-, L3-)
Positive: I like the way that C4 is explained better; I like being able to imagine it, or being able to think… ’cuz actually, I thought the area of this table surrounded by wire. (E3, L2)
Negative: It says that she shortened the sides, but it doesn’t say by how much, so she could have shortened the sides at any, she could have shortened a more than she shortened b or more than she shortened c, so she doesn’t really say how much to shorten it by. (E2)
Positive: It’s easier… to imagine. (E3, L2)
Negative: I think that to make the claim more believable, you would have to cut all the sides by the same length, we would have to cut the sides at the same length from each side. (E6)
Comparing Problems A-D
Positive: I still like that just because of the graph, and I can look at it and kind of understand what they’re saying and everything. (E2, R1)
Negative: You yourself would only be inserting a certain amount of numbers, you wouldn’t be sitting there inserting every single number in the world. (L3-)
Positive: It’s graphed with the two lines, and it shows that they’re all, that it’s one unit apart, and if you wanted to, you could kind of check that with all of them, they’re all one unit apart and make sure that it was one unit apart the whole time like they said it was. (E2, R1)
Negative: You get to trust your answers, you don’t have to trust their answers, but you also are limited to a certain number of numbers, so you can’t… it’s kind of like, half and half, good and bad. (NA)
It’s relatable for me. (P)
I’m having the football field switch in my
mind, and every rectangle that I can think
of is, it works. (E3, L4)
I can imagine them in my mind, I can
picture them. (E3, R1)
It’s more visual. (R1)
If you use them [variables] you can insert
numbers, any number that you possibly
want, and even if you wanted to insert
numbers just to see if they were wrong…
(R4)
You can insert whatever numbers you
want, you don’t have to go by what they’re
saying as much. (R4)
Problem E
Positive: If you take 2 out of 5, and you have 4 out of 10… if you take 4 out of 10, it would reduce to 2 out of 5, which is the same percent, so that’s why it makes sense. (E2, L3)
Negative: It [the narrative description] doesn’t really support what they’re saying, it kind of just doesn’t support this; it “unsupports” it not making sense; it doesn’t really support it. (E6-)
Positive: It shows that they’re the same ratio, they’re still proportionate. (E2, L3)
Negative: You can never really try all the numbers. (L3-)
Additional Comments
It gives more information about the
illustration. (R2)
As shown in Table 23, Beth identified the perceptual arguments as the most
convincing in Problems A, B, and C but least convincing in the other two problems.
Algebraic arguments were never considered the most or least convincing. Beth’s evaluation of the visual and inductive arguments was highly inconsistent across the problems; these arguments appeared at different places on the lists. In order to better understand how Beth evaluated the proposed arguments and her rationale when providing these rankings, the coding for her explanations in Table 24 was summarized in Table 25 so as to identify factors and features of the arguments that had influenced her judgment.
Total number of references to representation: 14
Visual Narrative Numerical Symbolic
Positive 7 1 3 2
Negative 0 0 0 1
As shown in Table 25, the total numbers of Beth’s comments that focused on the representation, evidence, and link of the arguments were 14, 27, and 15, respectively. Beth made more comments about the evidence of the arguments than about the representation and link. Among the types of evidence, examples (i.e., results from an immediate test) were the most frequently referenced: 13 references were to the results of immediate tests, mostly obtained by plugging in numbers (e.g., “you can insert a number in there”). There were 9 cases where imaginaries from past experience (e.g., “I just know more about B2, I’ve run the football field before”) were recalled to decide whether arguments were convincing. Formulas and theorems were not treated as reliable sources of evidence by Beth; in order for them to be convincing, she needed to plug in numbers to verify them.
prominently, she claimed that “most of the time, I’m a visual learner,” and an argument
was “a lot easier for me to understand when it’s visual.” Note that by “visual” she didn’t
only mean visualizing something that was drawn on paper, but also visualizing something
in her mind, i.e. imagining some model. She didn’t distinguish between these two types
of visualization in her explanations. Overall, there were 7 times when Beth mentioned
that arguments with visual illustration contributed to her conviction. In addition, Beth
recognized the value of numerical expression in offering her concrete example to support
a claim. She acknowledged the value of symbolic expression in allowing her to test
numbers that she wanted to check. However, she thought that neither of the expressions
was powerful enough to show that the conjecture was true in all cases. This was further
explained in her view about the link between evidence and conclusion.
Beth didn’t believe that an algebraic argument could prove a conjecture was
always true. Compared to numerical expressions, the symbolic formulas only offered the
advantage that “you can insert whatever numbers you want, you don’t have to go by what
they’re saying as much.” Despite this, she sometimes preferred numerical expressions
since “they used actual numbers, so even though it wouldn’t really probably be that hard
for me to insert a number in there during the test, that one's already done for me, so it's
probably a lot easier to do.” Beth considered an argument to be more convincing if she
“knew what the actual numbers were, if they actually use the actual numbers in them, and
not just like, saying the square of BD.” Therefore, an algebraic expression was not
Beth’s evaluation of inductive arguments was not consistent across the problems.
On the one hand, she explicitly pointed out that trying a few cases was not sufficient to
show a conjecture is always true. For example, in commenting on A1, she claimed that
“she’s tried a lot of them, she hasn’t tried all of them, so you could never really know,
based on that statement, if she was right or not.” Similar statements were articulated 5
times during the interview. However, when she was evaluating B2, even though she realized that a football field only represented a certain type of rectangle, she still considered it the most convincing argument since she could “relate” to it. A similar situation
the second most convincing, admitting that it couldn’t prove the conjecture was always
true. This suggested that being able to show the general validity of a conjecture was not a
required condition for Beth when considering an argument convincing. Other personal views revealed her need to see simple, “relatable,” and easy-to-access arguments in order for her to be convinced. Similar opinions were expressed 4 times. While “general validity”
contributed to the reliability of an argument (e.g. her comments on A1), it was not the
This explained Beth’s preference for perceptual arguments (A3, B2, and C4) in
Problems A, B and C (see Table 23), since the contexts provided in those arguments evoked familiar experiences and hence were most “relatable” to her. In contrast, the two
perceptual arguments (D3 and E3) in Problems D and E didn’t provide any “relatable”
suggested that Beth was likely to be convinced by arguments that were “relatable” to her
existing experience. In particular, arguments that create a scenario that can be visualized
examples could help her access a problem and hence contributed to her conviction.
[Figure: summary of Beth’s rationale — convincing arguments rested on examples and imaginaries (visual, narrative) as evidence, perceptual reasoning, and the personal standards of being easy to understand and offering a relatable scenario]
Betty was an 8th grade student enrolled in an Honors Algebra I class at the time of data collection. In her responses to the SMR, she considered the visual argument (A4) in Problem A, the perceptual argument (B2) in Problem B, the algebraic argument (C2) in Problem C, and the inductive argument (D1) in Problem D as the most appealing option in each context. Since she exhibited preferences towards different types of argument across the contexts, she was considered to be a representative from the inconsistent group.
Betty’s interview responses are summarized in Table 26 and Table 27. Table 26
illustrates the rankings provided by her for each problem. Column One of the table
represents the order of problems that she tackled. Table 27 summarizes Betty’s comments
when articulating why she found certain arguments convincing or not convincing (The
coding of each comment is explained in Table 9). These two tables served as the major resource for the interview analysis.
Positive Comments Negative Comments
Problem D
Positive: When they explain it and show the work [examples] that that’s right. (E2)
Negative: It wasn’t enough work to show how they got a dollar off. (P)
This one is, like, no work at all. (P)
It just, like, gives you a graph and doesn’t
explain how they formed the graph and
like, how they got from the five percent to
a dollar. (R1-, R2)
Problem C
Positive: The statement they made is true; they said the area of a triangle equals half of the product of its base and height, and that’s true. (E4)
Negative: It [the perceptual argument] just states that they’re larger. They have no idea what they’re talking about. It’s just, like, blank. (E6-)
Positive: That [the formulas] describes how they found out the answer. (E4, L5)
Negative: They basically just stated that it’s larger. (E6-)
Positive: They diagramed the triangle part ... If you cut it, you make it smaller. (E2, R1, L4)
Negative: They wouldn’t even give any, like, work [in addition to the examples] ... They didn’t explain why. (E2-, L3-)
That helps to see the actual work [formula
and related procedure] being done of how
to get the answer. (E4, R4, L5)
Problem A
It gives you an equation to solve for n, and It doesn’t really explain, it just breaks up
it comes out correct. (E4, R4) the pattern, like, the blocks. (R1-, R2)
It used cookies as an example. (E3, L2) It just says that, like, this can be an
opinion. (E6-)
It [A2] explained more of how to find the You didn’t go further in the numbers. (R2-
way… to get the answer. (E4, R4) , L3-)
You didn’t look for… like, multiples of
three and six, to see if, to compare them, to
see if they’re the same. (E2)
That’s not enough, I think they just, like,
picked random numbers. (E2-, L3-)
Problem B
I looked at the length of explanations. (P) It’s just, like, an opinion. (E5-)
They divided the rectangle ... then the It’s just, too plain… they didn’t even dig
Pythagoras Theorem ... (E4) deep and explain what they did. (E6-)
continued
187
Table 27 continued
Positive Comments Negative Comments
They are all radius [so they are equal]. There are small football field, and big one,
(E4) say NFL ... so that’s not true for all
football fields ... the size varies (E3, L3-)
They showed the length [pointed on the I think they (the diagonal and the side) are
figure]. (E2, R1) the same size. (E6)
They are true in their cases. (E2)
Comparing Problems A-D
It explains more, they give you a problem They [inductive arguments] just give you
[example] for you to find the solution to the statement, it’s not really explanations
get the answer, to see if it’s right. (E2, P) of how they found it. (E6-)
You gotta work through the problem to get It’s okay to draw a picture, but you have to
the answer. (P) explain the picture too, and they didn’t
really explain it… as well as algebra
would. (R1-, R2, R4)
Algebra, it explains it more than just
saying, just making a statement, and they
give you equations and inequalities, and
problems to find the solutions to get your
answer, rather than just making a
statement. (E4, R4, P)
Problem E
It explains how they get through to use They just drew a picture, and they didn’t
percentages and ratios. (E2, R3, P) really explain it, they just basically said
that the ratio of two ping pong balls would
be the same, therefore they won’t change.
(E6-, R1-, R2)
They use algebra, and it like, and they use It just makes a statement. (E6-)
variables to explain how they got the
answer. (R4)
They give you a percentage. (E2, R3) It just gives you algebra for you to solve
it… it basically just, it isn’t as good as
[points at inductive argument]. (R4-, R3)
There’s more math involved here
[inductive argument]. (E2, R3)
Additional Comments
If you explain how you found it. (R2, P)
How you found the answer to the problem
being asked. (P)
You have to find, go… dig it further, take
further steps. (P)
As shown in Table 26, Betty considered the algebraic arguments most convincing. She also considered the perceptual arguments the least convincing options in 4 of the 5 problems, providing consistent evaluations toward this type of argument. Therefore, although Betty was selected as a representative of the inconsistent group, she exhibited more consistent judgment of certain types of argument during the interview phase. To better understand Betty's rationale when providing these rankings, her explanations were coded; the total numbers of comments that focused on the representation, evidence and link of the arguments were 23, 30, and 8, respectively. The arguments she ranked highest utilized symbolic and numerical representations, which she mentioned 6 and 4 times, respectively, during the interview. For example, she stated that "it [the argument]
explains how they get through to use percentages and ratios” and “algebra, it explains it
more than just saying, just making a statement, and they give you equations and
inequalities, and problems to find the solutions to get your answer, rather than just
making a statement.” These statements helped to explain her rankings, where the highest
ranked arguments were written in either symbolic or numerical format. However, Betty
didn’t consider visual illustrations convincing except in the two geometry problems.
Although she did rely on visual evidence in the two geometry problems (e.g. she needed
to visually compare the length of two line segments), she didn’t consider reliance on
visual illustrations a convincing way to validate the conjecture in the other three
problems. She suggested that “it’s okay to draw a picture, but you have to explain the
picture too.” Similar opinions were repeated 3 times. Therefore, she didn’t believe simply
showing the graphs and figures without robustly unpacking their meanings made an
argument convincing. This explained why she didn’t consider visual arguments
convincing in problems that didn’t involve geometry content. A need for narrative
description (to explain examples or pictures) was mentioned 5 times. However, it seemed that arguments with only narrative representations were also not convincing to her. This was reflected in her comment that one such argument was "just, like, blank."
Betty also found that the evidence provided in an argument contributed to its
validity. In particular, she considered facts (i.e. known mathematical results) and
examples (i.e. results from an immediate test) as reliable sources to establish validity of
an argument, which were referenced 8 and 9 times, respectively. In the two geometry problems
she recognized the validity of the triangle area formula and the Pythagoras Theorem, both
of which made the corresponding arguments convincing to her. She also examined
particular shapes drawn on the paper. In the other problems, the numerical examples
served as primary source of evidence and she even added her own calculations to verify a
few statements. She also perceived arguments that built on imaginaries (football field and
Despite this, Betty didn’t think that merely checking a few examples made an
argument convincing. She mentioned this point 6 times during the interview. For instance,
she commented on A1 (inductive) that “that’s not enough, I think they just, like, picked
random numbers.” When evaluating B2 (perceptual), she claimed that “there are small
football field, and big one, say NFL ... so that’s not true for all football fields ... the size
varies.” These comments revealed that she was able to see the differences among various
examples and had realized some properties might not be generally applicable. However,
she considered the inductive arguments in Problems D and E the most convincing options.
She suggested that these arguments “explain it and show the work that that’s right” and
“explains how they get through to use percentages and ratios.” In these cases, whether an
argument was valid in general cases was ignored. To investigate why these seemingly contradictory behaviors happened, we sought explanations in Betty's personal standards. It was found that Betty repeatedly emphasized the need for "explanations." In addition, the need to see more "work" was addressed 6 times. This was highlighted by
her comments that “you have to find, go… dig it further, take further steps.” While it was
difficult to understand what exactly she meant by merely considering segments of the
interview, it became more sensible when taking into account the entire interview. When working on Problem B, she suggested that she "looked at the length of explanations" instead of the content to see which argument was more convincing. We
didn't believe the length of explanation was the single factor determining her conviction (in fact it was not, since A1 was short but considered convincing and D4 was long but considered not convincing); however, it didn't seem that Betty preferred any particular
type of explanations. Further analysis of the data revealed that whether the idea of an
argument was explained clearly could be more important to Betty than whether the idea
itself proved the proposed conjecture. This was detected in Problem B, where she
provided a ranking for the arguments but suggested that the conjecture was false and
none of the arguments could show the conjecture was always true (but B3 (algebraic) and B4
(visual) were still more convincing to her since they were “true in their cases”). A similar
situation occurred in her work on Problem A, where even after she had provided a
ranking for the four arguments, she was still unsure if the conjecture was true or false.
Therefore, we believe Betty had a personal standard of what a “convincing” argument
meant. To her, the “convincingness” of an argument was first determined by how much
detail the argument offered in order for her to understand the information, and the
purpose of the argument (i.e. to justify the general validity of a conjecture) seemed to be
less important. Based on the findings above about the representation, evidence and link of the arguments, Table 11 was created to highlight her rationale when
evaluating mathematical arguments. Betty’s interview responses suggested that she was
more likely to be convinced by explanations that were rooted in concrete examples and/or
perceptual connections.
[Figure: Betty's rationale for evaluating arguments — ritual, perceptual and transformational links; examples and facts as evidence; symbolic and numerical representations; detailed procedure → convincing arguments]
Blake was enrolled in an Integrated 8th grade Mathematics class at the time of data collection. In his responses to the SMR, he selected a perceptual argument, C1 (inductive) and D3 (perceptual) in the respective problems as the most appealing options; he was considered a representative of the inconsistent group.
Blake’s interview responses are summarized in Table 29 and Table 30. Table 29
illustrates the rankings provided by him for each problem. Column One of the table
represents the order of problems that he tackled. Table 30 summarizes Blake’s comments
when articulating why he found certain arguments convincing or not convincing (The
coding of each comment is explained in Table 9). These two tables served as the major
Table 30: Blake's comments when articulating why he found certain arguments convincing or not convincing

Problem B
Positive: It's a little bit more simple. (P)
Positive: This gives you more of a visual type thing so you can actually understand it, so you can imagine how that would actually work. (R1, E3, L2)
Positive: You just gotta try to figure it out on your own. (P)
Negative: A little bit too math-like, they're not even thinking about the problem, they're not even talking about it, they're just trying to make you all confusing with different, they're trying to make it different formulas so you can get all confused about this. (E4-, R4-, P)
Negative: A little bit too much work here for you to understand. (P)
Negative: They're just saying it in like, math, and why not just plainly say, like, it's longer. (P)
Negative: It's trying to tell you how to think… and that wouldn't really work. (P)
Negative: I've known some people where they actually want to think on how to actually figure it out, not just like, OK, here's ya how to do it, think this way. (P)

Problem C
Positive: It simply says it, it's making it not too complicated. (P)
Positive: It's actually giving you not that much visualization, but if you combine these two [inductive and perceptual] together, then you would actually get the answer, right here. (E2, E3, R1-)
Positive: The visual aids right here, they actually help you, but then there's an explanation that goes with that. (E2, R1, R2)
Positive: It [perceptual argument] gives you the explanation, but if you added this with it [points to inductive argument] it would really help. (E2, R2)
Negative: A little bit too math-like, it wouldn't really work. (R4-, R1-, P)
Negative: One of the second complicated ones, where they're trying to make you all confused. (P)
Negative: They just want to try to make you think that the one triangle equals two, so that it would just make you go all over the place, and see. (P)
Negative: It's just going all over the place, it just wants you to think something else besides that, so it wouldn't work. (P)
Negative: This is trying to mess you up completely. (P)
Negative: It's like you're trying to just blatantly out say it, how you're just trying to confuse us. (P)
Negative: It just didn't make any sense. I thought that it wasn't talking about the question at all. (NA)
Negative: They're trying to trick ya. (P)
Negative: Just trying to confuse ya. (P)
Negative: It's talking about it in a college-like term. (R4-)

Problem A
Positive: It has a visual aid like it's supposed to. (E2, R1)
Positive: This is where it talks about the geometry and everything, this is where I could easily understand it. (P)
Positive: It gives you a kind of convincing visualization; it kind of gives it easy. (E2, R1)
Positive: If you actually did the math right here with a calculator, you could actually understand it better. (E2, R3, L5)
Positive: You're thinking for yourself. (P)
Negative: They're all confused about this, because it's like blurry and everything, and it's like, oh God, it's too much. (R1-, P)
Negative: There's such thing as too much on these things. (P)
Negative: It wouldn't explain as much as I wanted it to. (P)
Negative: It's not explaining as much to where you can actually understand it; however it's easier for you to understand; however, it's not exactly what you want for an answer; it's not explaining as much as you want it to. (NA)
Negative: I could try and try, and it just gives me too much work. (P)
Negative: There's such a thing as too much work; however, thinking for yourself and actually trying to find out. (P)

Problem D
Positive: It's doing it a lot more simple, to where you can actually figure it out. (P)
Positive: It actually gives you a visualization. (E2, R1)
Positive: It just simplifies it for you… it's so you can actually figure it out for yourself. (P)
Positive: This one [perceptual] totally simplifies it for you, so you wouldn't have to do that much work; however, you're still actually learning math. (P)
Negative: They're missing something in this question… they don't tell you what the price of the bike is, so that where you could actually find it out easier. (E2, R3)
Negative: It makes it a little bit too complicated to where if you do the wrong thing, it's kaput. (P)
Negative: A lot of kids in my class where it wants variables, with dividers, with decimals, and it just completely makes them… blank. (R4-, R3-)
Negative: For a sophomore, this'd be working, but for an 8th grader… [makes sound of disagreement]. (R4-)
Negative: It's more like a word problem. (R2-)
Negative: They're just trying to make it all complicated. (P)
Negative: This is a tricky one they're trying to pull on ya. (P)
Negative: It's just telling you on what you should do, and a lot of people don't like that. (P)
Negative: You need this, this… where's my opinion in it? (P)

Comparing Problems A-D
Positive: It actually gives you visuals, to where you can actually understand it, and where you can actually try it for yourself. (R1, E2)
Positive: If you remember it, you got it. (P)
Positive: I did understand a little bit of each, because they're the skills I learned in the past. (E3)
Positive: I knew immediately what it was saying, because I'm in 8th grade, I know my geometry. (E3)
Positive: Right here, I understood, because we were talking about the Pythagorean theorem a ton during that time, so I understand it more. (E3, E4)
Negative: They're making it too complicated. (P)
Negative: They're throwing off what is being asked, so that you can get confused; it's just like with assessments, that's what they want. (P)
Negative: They're trying to trick ya, just like with assessments, they want you to get the right answer; however, they're just trying to trick you to see if you know it. (P)
Negative: They're trying to make it more complicated so that you can try figuring out… like, oh, wait a minute, that's wrong, gotta go over here. (P)
Negative: Even though they look like visuals, that's the trap. (P)
Negative: They're trying to make it look like it's easy, but when you try to do it, bam! It's wrong… (P)
Negative: They're throwing it off the question, they have little pictures to try and trap ya, but however, they're doing it in what I like to call high school and college terms, to where they're trying to make it all complicated to you, and when you understand that it's really complicated, then that's where you need to dodge out. (R1-, R4-, P)
Negative: More complicated terms than what I don't understand. (R4-)
Negative: This one, we didn't talk about that much. (E2)
Negative: Because we didn't talk about it that much, it was a little bit more complicated to figure out. (E2, P)

Problem E
Positive: I can actually understand ratios a lot. (E2, R3)
Positive: This is with a visual, to where I could actually imagine it. (E3, R1)
Positive: You can actually do the math, and you can actually figure it out. (P)
Positive: If it's with two different numbers, just like on how you did, then it's easier to figure out. (E2, R3)
Negative: It doesn't give you any numbers, doesn't give you any visuals. (R1, R3, E2)
Negative: You're talking about too many doubles, and you're just trying to confuse me. (P)
Negative: There's nothing concrete about it to where you can actually figure it out. (E2, P)
Negative: It doesn't give you any numbers… 2n? I mean, that doesn't really work for me… It ain't gonna really work if it's just with variables. (R4-, R3, E2)

Additional Comments
Positive: It's talking about ratios, to where you can figure it out. (E2)
Positive: I'm more good with trying to figure it out with a calculator and with numbers. (E2, R1, L5)
Positive: When it's with ratios… I'm real good with that. (E2, R3)
As shown in Table 29, Blake considered the perceptual arguments most
convincing in Problems B, C and D, but not convincing in the other two problems.
Algebraic arguments were generally not convincing to him, but they were ranked higher in
Problem A. The inductive arguments were considered the most convincing in Problem A,
and second most convincing in the other problems. The visual arguments were considered
the most convincing in Problem E, but not convincing in any other problem. In order to
better understand how Blake evaluated the proposed arguments and his rationale when
providing these rankings, the coding for his explanations in Table 30 was summarized in Table 31 so as to identify factors and features of the arguments that had influenced his
judgment.
As shown in Table 31, the total numbers of comments that focused on the representation, evidence and link of the arguments were 32, 27, and 3, respectively, indicating that representation and evidence had a larger impact on Blake's judgment.
The 3 occasions Blake mentioned the link between evidence and conclusion were about
the perceptual connection in B2, and the use of calculators, which was classified as ritual
operations.
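The aspect totals reported here (e.g. 32 representation, 27 evidence, 3 link comments) follow from counting the codes attached to each interview comment. As an illustrative sketch only, assuming the document's coding convention that codes beginning with R concern representation, E concern evidence, and L concern the link (with P and NA excluded as personal standards and uncodable comments), and using a small hypothetical sample of coded comments, such a tally could be computed as:

```python
from collections import Counter

# Hypothetical sample: each interview comment carries one or more codes
# such as "E2", "R1-", "L5", "P" (a trailing "-" marks a negative mention).
coded_comments = [
    ["E2", "R1"],        # "It has a visual aid like it's supposed to."
    ["R1-", "P"],        # "...it's like blurry and everything..."
    ["E2", "R3", "L5"],  # "If you actually did the math... with a calculator..."
    ["P"],               # "You're thinking for yourself."
]

# Assumed mapping from a code's leading letter to the aspect it concerns;
# codes starting with any other letter (P, NA) are skipped.
ASPECTS = {"R": "representation", "E": "evidence", "L": "link"}

def tally_aspects(comments):
    """Count how many code occurrences concern each aspect of an argument."""
    counts = Counter()
    for codes in comments:
        for code in codes:
            aspect = ASPECTS.get(code[0])
            if aspect is not None:
                counts[aspect] += 1
    return counts

print(tally_aspects(coded_comments))
# → Counter({'representation': 3, 'evidence': 2, 'link': 1})
```

Applied to a subject's full comment table, the resulting counts would indicate which aspect the subject attended to most when judging the arguments.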
Blake found arguments that were based on examples (i.e. results from an immediate test) convincing. This was mentioned 19 times during the interview. In addition, arguments referencing imaginaries were also considered convincing. He also considered the argument built on the Pythagoras Theorem convincing.
Blake mentioned 8 times that visual illustrations contributed to his conviction, but
there were also 4 times when he indicated the graphs made the arguments less convincing.
When he found a visual aid helpful, he stated that "it gives you a kind of convincing
visualization; it kind of gives it easy.” In other cases where he found the images less
helpful, he suggested they were unclear: “it’s like blurry and everything, and it’s like, oh
God, it’s too much.” This was similar to his comments on numerical representations,
where in 7 cases they made positive impact on his conviction (e.g. he commented that “if
it’s with two different numbers, just like on how you did, then it’s easier to figure out”),
and in 1 cases they made negative impact (where he expressed a dislike towards the use
preferred types to Blake. He expressed this opinion explicitly when he was criticizing E3
(perceptual), suggesting that “it doesn’t give you any numbers, doesn’t give you any
visuals" and hence it was not convincing to him. His attitude toward narrative arguments was comparable: if the description used easier language and was not long, he considered it a helpful explanation. On the other hand, he suggested the more complex descriptions looked like word problems, which he disliked. Last and perhaps most evident was Blake's negative attitude toward symbolic representations, which was detected across the interview: he considered them appropriate only for "high school or college" students, labeling them as confusing. He claimed that "it ain't gonna really work if it's just with variables." Blake also possessed personal standards of what a "convincing" argument meant to him. Among the 43 comments classified as "P," personal standards, 29 were
about how the simplicity (or complexity) made an argument convincing (or not
convincing) to him (e.g. “it’s doing it a lot more simple” and “they’re making it too
complicated”), and 12 were about his need to figure out the problem by himself instead of
being told what to do (e.g. “it’s trying to tell you how to think… and that wouldn’t really
work” and “I’ve known some people where they actually want to think on how to
actually figure it out, not just like, OK, here’s ya how to do it, think this way”). When
making his evaluations, Blake often imagined the scenarios in which he was being taught
the arguments in a mathematics class and expressed his feelings in such situations. He
claimed that some arguments were trying to “trick ya,” “confuse ya,” and “trap ya.” This
manifests the type of frustration some students experience when they face mathematical
problems that may be difficult for them to do. At the same time, it also reveals their needs
in these situations. To Blake, whether an argument was convincing didn’t depend on how
complete the argument was, instead, he liked the argument to help him access the
problem so that he could think for himself. Therefore, the argument didn’t need to be
logically correct or even mathematically complete, instead it should explain the problem,
illustrate a few simple examples, or create a context for him to better understand the task
first and then to proceed with solving it. Based on the findings above, Figure 27 was created to illustrate his rationale for evaluating mathematical arguments.

[Figure 27: Blake's rationale for evaluating arguments — perceptual and ritual links; examples and imaginaries as evidence; visual, numerical and narrative representations; easy-to-understand, non-procedural arguments → convincing arguments]
(perceptual) and C4 (perceptual) created scenarios that he could relate to the problem so that he could think for himself. D3 (perceptual) used easy language to offer an explanation, with a few examples of the numbers of interest. E2 provided a picture to show the objects
studied in the problem. All of these arguments offered him a starting point for working on
the problem even though they didn’t provide details about what exact steps he should
take. Therefore, they were considered the most convincing options. On the contrary, arguments such as D4 confused Blake and were considered the least convincing options.
Brenda was an 8th grade student enrolled in an Algebra I class at the time of data
collection. When working on the SMR, she had selected A4 (visual), B1 (inductive), C3 (visual) and D1 (inductive) in the respective problems as the most appealing options.
Brenda’s interview responses are summarized in Table 32 and Table 33. Table 32
illustrates the rankings provided by her for each problem. Column One of the table
represents the order of problems that she tackled. Table 33 summarizes her comments
when articulating why she found certain arguments convincing or not convincing (The
coding of each comment is explained in Table 9). These two tables served as the major
Table 32: Brenda's rankings of the arguments (most convincing to least convincing), in the order the problems were tackled
Problem B: B2 (perceptual) > B1 (inductive) > B4 (visual) > B3 (algebraic)
Problem D: D1 (inductive) > D2 (algebraic) > D3 (perceptual) > D4 (visual)
Problem C: C1 (inductive) > C3 (visual) > C2 (algebraic) > C4 (perceptual)
Problem A: A4 (visual) > A1 (inductive) > A3 (perceptual) > A2 (algebraic)
Problem E: E2 (visual) > E3 (perceptual) > E1 (inductive) > E4 (algebraic)
Table 33: Brenda's comments when articulating why she found certain arguments convincing or not convincing

Problem B
Positive: I can imagine a football field, so like, it's easier just to think that way. (E3, L2)
Positive: I know that it's longer. (E6)
Negative: If I did it that way, it would take longer to figure out how to do it, other than just like, thinking about how to do it, so you would actually have to measure it to realize how farther it is. (E2, P)
Negative: I don't understand them as well, that's why I don't even like them. (NA)
Negative: I don't like the Pythagorean theorem. (E4-, R4-, P)
Negative: When they add circles to the thing, it kinda confuses me, so I just don't like 'em. (E2-, R1-)

Problem D
Positive: It's easier for me if it has a number in it, to be able to know, like, it's easier just to figure out how to do it that way. (E2, R3)
Positive: It's just the one side and you get the answer. (L5)
Positive: It's already on the one side, so you just figure out the one side and get the answer, and it's a lot faster and easier. (L5, P)
Positive: Because he also said, like, it's "such as 200," you didn't say that you didn't try 300, so he could've tried it but just didn't say that he did and it could've still worked. (E2, L3)
Negative: It's a little bit harder to figure out with the x in it, so you have to figure out on both sides of the equals sign. (R4-)
Negative: There's two sides of the equals sign in this one, so you would have to figure out both sides, and then you could get the answer. (L5-, P)
Negative: This one I don't think has enough information for me to understand it really as well. (NA)
Negative: I don't really like graphs so I don't understand… I just don't get 'em that well. (R1-)

Problem C
Positive: If you take any kind of triangle and you try and fit it into, like, the first one, it's always gonna be smaller than the… the first triangle's always gonna be bigger than the second triangle because of how the sides are. (E2, R1, L4)
Positive: I understand what they're saying by cutting the lines and making it into a shorter one, and I can tell by that that there is, that it is smaller than that. (E2, R1, L4)
Positive: With the first one [argument], it shows a lot more, 'cuz there's different triangles there that are, that have different sizes. (E2, L3)
Positive: I can tell that it's right because I know that it's bh divided by two, because we already know that that's how you find the area and stuff, and then with, it would be lowercase with the Triangle 2 and it would be uppercase with the Triangle 1, so it'd have to be bigger. (E4, R1)
Negative: I don't understand it as much. (NA)
Negative: It says A is greater than a ... I just got really confused about that part. (R4-)
Negative: The last one [C4] I don't think gave enough information for me to understand what it meant by it. (L2-)

Problem A
Positive: That one has a visual effect with it, so it makes more sense that it you split up the 6, it comes into threes, and I can understand that way. (E2, R1)
Positive: That would probably be another way I would do it, so I would understand it that way, 'cuz you can divide any of those by 3, and get a multiple of 3 that way. (E2, R3)
Positive: I like visual things better than just thinking in my head about it, so that one makes more sense to me. (E2, R1)
Negative: That one kinda confused me on it, because I didn't know what was going on in the problem. (NA)
Negative: I don't like to think of it that way, I just don't get it that way, so I don't think of it that way. (P)
Negative: I don't get how you do the 6n equals 3 times 2n. (R4-)

Comparing Problems A-D
Positive: I understand sales tax more than most things, I get that better than the other things. (E2, R3)
Positive: It, like, is straightforward, and it tells me what it is and stuff, it's a lot easier to understand. (P)
Negative: The way that they did it, like they added the circle to it, and it didn't make as much sense. (E2, R1-)

Problem E
Positive: It gives me a visual effect of how it wouldn't change, because it would still be double the amount of it, which wouldn't do anything to it. (E2, R1, P)
Positive: It made sense 'cuz they explained what happened with the orange and with the white, and that it would stay the same no matter what, 'cuz the ratio would never change of how much would be in there. (E4, R2)
Positive: It gives an actual effect of how it wouldn't change. (E2, R1)
Positive: It has a picture of how it wouldn't change and it gives an explanation with it. (E2, R1, R2)
Negative: It just tells you how it wouldn't change. (E6-)
Negative: It doesn't give a picture, it's just an explanation. (E2, R1, R2-)
Negative: I wouldn't go that way with that, it's just not how I would do that. (P)
Negative: I understand it now but I don't really like it. (P)

Additional Comments
Positive: That one [E3] did because it said that it was exactly the same by just explaining how it is. (R2)
Positive: With the sales tax, I got that better because I knew that, with that, it's easier with numbers. (E2, R3)
Positive: It's easier to figure out that whatever 6 is, you can just divide by 3 and it's an, it's a normal number that is 3. (E2, R3)
Positive: It [E2] does show you how and it explains how to. (E2, R1, R2)
Positive: They explained that the ratio between the ping pong balls would still be the same no matter if it was doubled or whatever number. (E2, R2)
Negative: That one [C4] I don't think gave as much information as what I needed to figure out that it was. (E3-, L2-)
Negative: I don't understand probability as well, so when they threw in the numbers, I was kind of confused with it. (R3-)
As shown in Table 32, Brenda generally found the algebraic arguments not convincing. She rated them the least convincing in Problems A, B, and E, and the second least convincing in Problem C. The inductive arguments were considered either the most or second most convincing to her, with the exception of Problem E, where the inductive argument was considered the second least convincing. Her evaluation of visual and perceptual arguments was quite inconsistent across the problems. They appeared at
every place (most --> least convincing) in her rankings. In order to better understand how
Brenda evaluated the proposed arguments and her rationale when providing these
rankings, the coding for her explanations in Table 33 was summarized in Table 34 so as to identify factors and features of the arguments that had influenced her judgment.
As shown in Table 34, the total number of Brenda’s comments that focused on the
representation, evidence and link of the arguments were 29, 27, and 10, respectively. A preference for visual representations was conveyed 8 times. For example, she considered A4 (visual) convincing, suggesting it had
"a visual effect with it, so it makes more sense." However, some visual illustrations didn't make the argument convincing to her. She claimed to be confused by the geometric shape when circles were added to it. Explaining the meaning of pictures or graphs could make an argument more convincing to her. She
also believed that an argument was easier to understand “with numbers.” However, she
found numerical and narrative representations confusing in some cases. She
claimed that narrative arguments, with the absence of numbers or pictures, might not
offer enough information. She also found dealing with numbers in certain context (e.g.
probability) confusing. This explained her inconsistent judgment of the perceptual, visual,
and inductive arguments across the problems. Overall, it was found that visually, numerically and narratively represented arguments could all contribute to her conviction if they were
understandable to her. Symbolic representations, however, didn't contribute to her conviction. In all 4 instances where she mentioned symbolic format, she characterized it
as confusing and not understandable. For example, she claimed not to understand why
“6n equals 3 times 2n.” Considering that this equation involves only the simplest
symbolic expressions, we believe she hadn’t yet developed adequate facility with algebra
to use it in problem solving. Therefore, it was not surprising that none of the algebraic arguments was convincing to her. Brenda also paid attention to the evidence presented in the arguments; this was referenced 27 times during the interview. She often found arguments that relied on
checking a few cases (e.g. numbers or shapes) convincing. This was detected 19 times.
She suggested that she didn't like the argument using the Pythagoras Theorem although she had learned it in class; however, she found the triangle area formula convincing. This suggested that even two seemingly similar sources of evidence could be assessed differently.
It was also found that Brenda could be convinced by a variety of links between
evidence and conclusion. She found trying a few cases in Problem D adequate in showing
the conjecture was always true. She found the transformation model used in C3
convincing. She also found the perceptual connection between the football field scenario
and the property of rectangles in B2 convincing. However, she wasn’t able to make the
perceptual connection between the triangle made by wires and its geometric properties.
She didn’t explicitly comment on the link of any argument and it was not detected in the
interview that she was more likely to be convinced by any other type of link. Therefore, it
was believed that she wasn’t yet able to reflect on the logic of mathematical arguments.
Lastly, Brenda’s personal standards were studied. First, she found simple
arguments more convincing. The preference for “easier” arguments was mentioned 7
times during the interview. Second, it was found that her appreciation of certain concepts/topics had also impacted her evaluations. For example, she claimed to not "like the Pythagorean theorem" and hence considered B3 (algebraic) not convincing. She didn't consider E3 convincing since she "wouldn't go that way with that, it's just not how I would do that." Based on the findings above, Figure 28 was created to illustrate Brenda's rationale for evaluating mathematical arguments. Such personal preference was highly context based and could explain her inconsistent view of the same type of argument across the problem contexts.
Discussion
The interview responses provided insights into the subjects' rationale for evaluating the mathematical arguments. The following discussion focuses on the thinking pattern exhibited by the subjects as a whole group. We first examined whether the subjects considered any argument significantly more convincing than others in each problem context. We then studied if any factor had a larger impact on the subjects' judgment. In addition, we studied the similarities and differences among the individual subjects. Lastly, we studied the context's impact on the subjects' evaluations.
We first examined whether the subjects considered any argument more convincing than others in each problem context. In order to do so, we first assigned values to the rankings the subjects provided. When an argument was ranked as the most, second most, second least and least convincing argument, it received a score of 1, 2, 3 and 4, respectively. Therefore, the rating
represented the position of the argument in the ranking. The lower the rating, the more
convincing an argument was perceived. We then calculated the average rating for each argument. A2 (algebraic) was rated the most convincing argument in Problem A, while B3 (algebraic) was rated the least convincing argument in Problem B. C1 and C2 (inductive and algebraic, tie) were the most convincing arguments in Problem C, while C4 (perceptual) was the least convincing. In Problem D, D2 and D4 (algebraic and visual, tie) were the least convincing arguments, while D1 (inductive) was the most convincing. In Problem E, E4 (algebraic) was considered the least convincing argument. These results suggested that the subjects' evaluations of the same type of argument were highly inconsistent across the problems. The same type of argument
arguments were highly inconsistent across the problems. The same type of argument
could be considered as the most convincing option in one problem but the least
convincing one in another (e.g. A2 (algebraic) was rated most convincing in Problem A
but B3 (algebraic) was rated the least convincing in Problem B). Therefore, it was
difficult to tell whether there was any particular type of arguments that the subjects found
more convincing than others. This finding was compatible with what was detected in the
survey analysis.
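The rank-to-score conversion described above can be sketched in a few lines of Python. This is a minimal illustration with hypothetical rankings, not the study's data; the argument labels and the two subjects are made up.

```python
# Convert each subject's ranking (most to least convincing) into ratings:
# position 1 (most convincing) scores 1, ..., position 4 (least) scores 4.
# A lower average rating means an argument was perceived as more convincing.

def average_ratings(rankings):
    """rankings: list of orderings, each an ordered list of argument
    labels from most convincing to least convincing."""
    totals = {}
    counts = {}
    for order in rankings:
        for position, label in enumerate(order, start=1):
            totals[label] = totals.get(label, 0) + position
            counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}

# Two hypothetical subjects ranking the four arguments of one problem:
rankings = [
    ["A2", "A1", "A4", "A3"],   # subject 1: A2 most convincing
    ["A1", "A2", "A3", "A4"],   # subject 2: A1 most convincing
]
averages = average_ratings(rankings)   # A2 -> (1 + 2) / 2 = 1.5
```

The average rating then directly encodes an argument's typical position in the rankings, which is the quantity compared across arguments in the analysis above.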
We further tested the differences of ratings among the arguments in each problem.
Adopting the within-subject ANOVA (using the arguments in each problem as the levels),
it was found that no argument was considered significantly (p < .05) more convincing than any other option in any of the problems (see Appendix C, Table 46). Therefore, based on the subjects' rankings, there wasn't any single argument that stood out in any of the problems as the most convincing option. This result again demonstrated the diversity in the subjects' evaluations of the arguments.
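The within-subject ANOVA used here compares, for each problem, the ratings the same subjects gave to that problem's four arguments. The F computation can be sketched in plain Python as follows; this is an illustration with made-up ratings, not the study's data, and in practice a statistics package (e.g. statsmodels' `AnovaRM`) would be used.

```python
def repeated_measures_anova_f(data):
    """One-way repeated-measures (within-subject) ANOVA F statistic.
    data: one row of ratings per subject, one column per argument."""
    n = len(data)                    # number of subjects
    k = len(data[0])                 # number of arguments (levels)
    grand = sum(sum(row) for row in data) / (n * k)
    subj_means = [sum(row) / k for row in data]
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
    ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
    ss_error = ss_total - ss_subjects - ss_conditions
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    return (ss_conditions / df_cond) / (ss_error / df_error)

# Three hypothetical subjects rating four arguments (1 = most convincing):
ratings = [[1, 2, 3, 4],
           [2, 1, 4, 3],
           [1, 2, 4, 3]]
f_stat = repeated_measures_anova_f(ratings)
```

The resulting F would then be compared against the critical value of the F distribution with (k − 1) and (k − 1)(n − 1) degrees of freedom to decide significance at p < .05.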
Factors that impacted the subjects' decision
Based only on the subjects' choices, it was difficult to identify what factors might have impacted the subjects' decisions. Therefore, based on the subjects' explanations for their rankings, we calculated the total number of comments about each type of representation, evidence and link of arguments. We also calculated the percentage of each number in its own category. For example, there were 60 comments suggesting that a visual representation had positively contributed to the subjects' conviction about an argument. These comments were 31% of all the comments that referenced the representation of the arguments.
Total number of references to representation: 194

            Visual     Narrative   Numerical   Symbolic
Positive    60 (31%)   18 (9%)     33 (17%)    37 (19%)
Negative    20 (10%)   9 (5%)      3 (2%)      14 (7%)
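The percentages reported for each representation type are simply the count divided by the category's total number of references. Using the counts above, the computation can be checked as follows:

```python
# Positive/negative comment counts for each representation type.
counts = {"Visual": (60, 20), "Narrative": (18, 9),
          "Numerical": (33, 3), "Symbolic": (37, 14)}
total = sum(p + n for p, n in counts.values())   # 194 references in total

for rep, (pos, neg) in counts.items():
    print(f"{rep}: positive {pos} ({pos / total:.0%}), "
          f"negative {neg} ({neg / total:.0%})")
# First line printed: Visual: positive 60 (31%), negative 20 (10%)
```

The same count-then-normalize computation applies to the evidence and link categories, with their own totals (272 and 81 references, respectively).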
As shown in Table 36, the total number of comments that focused on the
representation, evidence and link of the arguments were 194, 272, and 81, respectively.
The data suggested that opinion (i.e. personal conviction without an explicit
reason) was not considered a reliable source of evidence by the subjects. Although there
were 3 instances when a subject's decision was made upon a personal conviction, it was much more frequent (20 times in total) for an opinion to be indicated as unreliable. This
suggested that most subjects were aware of the need to provide evidence other than
personal opinion to support a mathematical argument. When examining the impact of the
types of evidence on the subjects' decision, it was found that examples (i.e. results from an immediate test) were used most often to support an argument (140 times in total, more
than half of all evidence referenced). At the same time, it was also the second most
criticized source of evidence (only second to “opinion”). Criticism of the use of only
examples focused mostly on their logical limitation, i.e. their inability to show the
conjecture was always true, which was acknowledged by some subjects. The presence of facts (i.e. known mathematical results) was the second most referenced type of evidence (a total of 44: 42 positive and 2 negative). However, facts were mentioned much less frequently than examples. Note that it was rare (only 2 instances) that a mathematical fact was considered an unreliable source of evidence. This suggested that once the subjects accepted a statement as an established mathematical fact, they tended to treat it as reliable evidence. Furthermore, imaginaries created upon past experience were referenced as a reliable source
of evidence for 34 times. This number was close to that of “facts.” However there were
also 9 times when imaginaries were indicated to contribute negatively to the subjects’
conviction about an argument, suggesting that it was not considered a reliable source of
evidence in some subjects' view (e.g. Amy and Betty). Lastly, it was uncommon that authority or assumptions were referenced as evidence of arguments.
The most influential type of representation was visual. Although visual illustration often contributed to the subjects' conviction, it might also be considered misleading (e.g. for Amy), confusing (e.g. for Blake), or not explanatory (e.g. for Betty). In addition, symbolic representations were criticized more frequently than numerical representations (14 times vs. 3 times). This suggested that ideas expressed symbolically were not found as
convincing by some subjects (e.g. Blake and Brenda). Narrative arguments were
referenced 27 times (positive for 18 times, and negative for 9 times). The narratives had
contributed to the subjects’ conviction especially when they were used to explain a
picture or an equation (e.g. Betty). However, some subjects found narrative expressions
not clear or convincing when they were not supplemented by visual, numerical, or
symbolically expressed evidence. In sum, the visual, algebraic, and narrative representations were each received differently by different subjects. Regarding the link between evidence and conclusion, the subjects referenced induction most frequently (33 times in total). Although in 14 cases induction had positively contributed to their conviction, in 19 cases suggestions were made that showing a few examples couldn't show the conjecture was always true. This result indicated that the use of induction was still popular; however, some students had developed an awareness of its logical limitation. Transformation contributed to the subjects' conviction in nearly as many cases as induction did (14 and 13 times, respectively). However,
there was only one case where transformation was considered as not convincing, while induction was criticized far more often. Other referenced links included the analysis of certain examples and pattern seeking; the latter was uniformly recognized as a convincing link between evidence and conclusion in an argument. Furthermore, ritual operation and deductive reasoning (mostly referenced by Amy) were considered convincing as well.
Figure 29 was generated based on the numbers in Table 36. A larger font denotes
that the item was more frequently referenced by the subjects. As illustrated, when
evaluating the arguments, the subjects paid the most attention to the evidence. Among all types of evidence, examples were referenced most frequently, followed by imaginaries and mathematical facts. The representation of arguments had also impacted
the subjects’ judgment. Among all types of representations, visual illustration received
the most attention; however, it was criticized by some students as well. A similar situation applied to the view of algebraic representation, where between-subject differences were observed. The link between evidence and conclusion was the least concerned aspect among the three. Induction was referenced most frequently, but it might contribute either positively or negatively to the subjects' conviction.
Figure 29. Aspects of arguments referenced by the subjects and their contribution to the subjects' conviction: evidence (examples, imaginaries, facts), representation (visual, numerical), and link (perceptual, inductive, ritual).
In the previous discussion we revealed some general patterns about the subjects' rationale in evaluating the arguments. However, it was unclear if such patterns applied to every individual or only to some of the subjects. More importantly, the individual differences were so far described only through their choices in the survey or the rankings provided in the interview. It was unclear what factors might have caused the differences
in their ratings. The analysis of each individual subject’s interview responses had
provided the bases for the investigation of the similarities and differences of their
rationale. The following discussion offers a cross comparison among the subjects. Table 38 summarizes the similarities and differences in the subjects' rationale for evaluating arguments. This table was generated based on the study of each individual's interview responses. As shown in the table, there were similarities as well as differences among the subjects.
View of evidence
The most prominent similarity among the subjects was that they all considered
examples as a reliable source of evidence. Testing a few cases and seeing if the
conjecture was true in specific conditions had contributed to the subjects’ conviction.
This was observed in the comments from all subjects on at least a few (if not all)
arguments.
In contrast, students' view of the use of mathematical facts was less consistent.
Allen, Amy and Betty indicated that they were likely to be convinced if an argument was
based on a known mathematical fact. On the contrary, Blake seemed unwilling to use any
established result and preferred exploring the problem by himself. The other four subjects
might acknowledge that some known results (e.g. the triangle area formula) helped
convince them an argument was true; however they might not consider such results as
established mathematical fact but rather as something they had heard about.
In addition, the subjects' views of imaginaries also differed. To Abby, Beth and Blake, imaginaries were a major source of evidence, while in Amy's view, people's brains could mislead them, and another subject's judgment depended on whether the imaginary was adequately clear to him. Overall, the use of examples was accepted by all subjects as a reliable source of evidence, while their views on the use of other sources, such as known mathematical facts and their own imaginaries, differed.
View of representation
Although visual illustration received much attention, it was not considered by all subjects to have positively contributed to their
conviction of an argument. Amy and Betty clearly expressed that they were unlikely to be
convinced by visual arguments. Amy claimed that it was possible that pictures and
figures misrepresented the problem. Betty didn’t tend to perceive connection through
examining the visual demonstrations. She needed to see a narrative explanation of what
the pictures meant when there was a visual illustration. Nevertheless, visual illustrations
still seemed to be the most preferred type of presentation. Six subjects explicitly stated
that visual aids could contribute to their conviction, especially when the image was understandable to them. Although the function of numerical expressions was not as commonly mentioned as visual illustration, we still identified at least five subjects who suggested that seeing numerical expressions contributed to their conviction. To the remaining three subjects, i.e. Allen, Alice and Beth, it didn't seem that the use of numerical expressions made an argument less convincing to them. Its function was just rarely articulated in their explanations. Therefore, the subjects seemed generally receptive to numerical representation. Narrative language was adopted by every subject when explaining their understanding of each argument. However, some subjects
demonstrated a higher need for narrative explanation than others. For example, Betty
suggested that visual illustration was not convincing unless it was also accompanied by
an explanation. In contrast, Allen preferred to read equations and examine graphs and
didn’t consider an argument convincing if it was too “wordy” and not “straightforward.”
The major advantage of narrative representation was the easy language, which helped the subjects to understand an argument if adopted properly. However, it might be difficult to use narrative to describe some concepts or examples as precisely as using numeric, visual or symbolic forms. It also remained unclear whether the subjects could fully understand narratives without seeing any specific numbers, images or symbols.
Compared to the other three types of representations, the subjects showed the greatest disagreement in their views of symbolic representation. To Amy, symbolic representation could show the conjecture was true in every possible case. To Allen, symbolic representation demonstrated the ideas clearly and concisely. To Betty, symbolic representation helped her see the details of the argument procedure. Therefore, these three subjects appreciated symbolic representation. On the contrary, Blake considered symbolically represented terms confusing and not appropriate for his age group. Brenda also found symbolically represented theorems not understandable, so symbolic arguments were unconvincing to them. The symbolic representation didn't seem to have either positively
or negatively contributed to the other three subjects’ evaluation. They didn’t seem to
recognize the advantage of symbolic representation, nor did they find it confusing. This
finding was not surprising since symbolic expressions were usually more abstract than
ideas represented in the other three forms. Students who understood the ideas of symbolic
expressions might appreciate how clear and concise they were. More mathematically
mature learners might even see the general validity represented by symbolic arguments.
However, to those who hadn't yet adapted to symbolic representations, they only looked confusing and meaningless.
View of link
The link between evidence and conclusion seemed to be the least influential
aspect for the subjects' decisions. This was natural since the subjects might not start
examining the link if they didn’t find the representation a reliable format or consider the
evidence convincing. Therefore, the link appeared to be the last thing among the three to
be considered.
Only Amy insisted that the evidence used in a convincing argument must show
the conjecture was always true and found symbolic deduction the most reliable way to
guarantee this. This condition wasn’t a requirement for a convincing argument in other
subjects’ view.
Several subjects (Alice, Amy, Beth, and Betty) had articulated that showing a few
examples might help them understand an argument but were not sufficient to convince
them that a conjecture was true. This suggested that some students were aware of
limitation of induction. Although they were not yet able to appreciate deductive reasoning,
they had developed the ability to understand generic examples. For example, Alice could
visualize that some geometric property would remain the same when the shape was
changing in a certain way. Allen could see a formula in a numeric equation since a value
in the equation could be substituted by others without changing the result. Overall, it was found that perceptual connection had widely contributed to the subjects' conviction; it was observed in five students' explanations, including Abby, Beth, Betty and Blake. Perceptual connection relates a given mathematical
problem to imaginaries created upon previous experience, and in many cases such a
connection was not precisely described but was perceived by the subjects. Only Amy
pointed out such connection might not be a reliable way to build an argument. Other
students might not have been able to perceive some connections between a mathematical
problem and a real life scenario; however, they might not have realized that arguing by perception alone was not logically rigorous. Lastly, it seemed that all the subjects believed ritual operations, numerical or symbolic, were reliable.
Personal standards
This was consistent with findings in the literature (Recio & Godino, 2001). Having a personal standard of what a convincing argument should be had also impacted the subjects' evaluations.
Amy seemed to be the only person who believed a convincing argument should
be one that proved the conjecture was always true. To the other subjects, this wasn’t a
principle that guided their decision. Instead, to many of them (Abby, Alice, Beth, Blake,
and Brenda), whether an argument was easy to understand determined, largely, its
credibility. These subjects' standards for an easy argument were not mutually exclusive. Most prominent was that none of the five subjects considered algebraic arguments easy to understand. Beyond that, each subject had his or her own criteria for whether an argument was easy to understand. Blake found an argument easy to understand if it used
easy language, easy examples, and easy visual illustrations. Brenda was able to
appreciate more complex examples and visual illustrations; however she preferred an
argument that didn’t involve a complex procedure (e.g. multiple steps). Beth considered
an argument easy to understand if the argument was built upon a life scenario to which
she could relate. Abby and Alice found an argument easy to understand if the concepts
used in the argument and its reasoning procedure were familiar to them. While Beth
preferred a context rooted in her life experience, Abby and Alice also considered familiarity with the mathematical concepts and procedures important.
Allen and Betty were the only two subjects who didn't claim that a convincing
argument needed to be easy to understand. Note that Allen did prefer arguments that
involved simple procedures. This was close to Brenda’s opinion. However, Allen’s
preference toward simple procedures was not because they were easier to understand. He
claimed that he didn’t have much difficulty understanding all the arguments used in the
interview. Despite this, he still preferred "straightforward" arguments. Similarly, Betty did not claim that a convincing argument needed to be easy; however, different from Allen, she paid more attention to the details of
arguments. She found an argument that left too much space for the readers to decipher
not convincing. For example, she didn't consider visual illustrations alone convincing. She believed they must be accompanied with descriptions that explained the meaning of the images.
Summary
The cross comparison above revealed both similarities and differences among the subjects' rationale in argument evaluation (see Table 38). In general, the
subjects found examples convincing in most cases. However their view towards the use
of existing mathematical results and their own imaginaries differed. In addition, the
subjects found numerical and narrative arguments easier to understand than symbolic
ones. Visual argument could be helpful or confusing depending on the actual images or
diagrams provided. Most students didn’t realize that symbolic representation had the
potential to prove the general validity of a conjecture. However, some students found
symbolic expressions concise and clear while others viewed them as confusing. With the
exception of one participant, the subjects were not aware that the link between evidence and conclusion must show the conjecture was always true. Transformational and perceptual reasoning was widely adopted. However, the subjects' views toward induction differed; half of the subjects seemed to have realized its limitation. Lastly, the subjects held different personal standards for convincing arguments.
Five subjects found easier-to-understand arguments more convincing; however they also
used different standards to judge the “easiness.” For example some subjects found
arguments embedded in a familiar context easier to understand and hence perceived them
as more convincing. Three didn't take "easiness" into consideration but paid attention to
different aspects (logic, expressions, and reasoning procedure) of the arguments. These
differences among the individuals' rationale caused the distinct evaluations of the same arguments.
Table 38. Similarities and differences in the subjects' rationale of argument evaluation

Evidence
  Similarities: Examples were a convincing source of evidence. Authority, assumption and personal opinion were rarely considered convincing.
  Differences: Imaginaries and known mathematical results might or might not be viewed as reliable sources of evidence.

Representation
  Similarities: Numerical and narrative arguments were usually easier to understand. Seeing a few numbers in an argument was helpful in most cases. Visual illustration was helpful if the provided image was understandable.
  Differences: Visual illustration could be sufficient or not sufficient to demonstrate the validity of a conjecture. Narrative descriptions could be necessary or unnecessary. Symbolic expression could be concise and clear or confusing and meaningless. Most subjects were not aware of the logical advantage of symbolic representation.

Link
  Similarities: Deduction was rarely used or considered necessary. Transformation and perceptual connection were widely adopted. Ritual operation was rarely considered unconvincing.
  Differences: Induction could be viewed as convincing, convincing in some situations, or unconvincing.

Personal standards
  Similarities: Most subjects didn't focus on whether an argument could prove the conjecture was always true.
  Differences: Whether an argument was easy to understand was taken into consideration by some but not all the subjects. Some subjects found arguments embedded in a familiar context or using familiar reasoning techniques more convincing. The subjects had different demands for the clarity of arguments. Other various personal opinions also applied.
Although each of the subjects had exhibited some general standards in assessment of the arguments, he/she still provided different evaluations for the same type of arguments in different contexts. None of the participants chose the same type of arguments as the most convincing option in more than 3 problems. This section discusses the possible causes of such inconsistency.
Differences in complexity
The subjects’ responses revealed that whether an argument was easy to understand
had a substantial impact on their conviction. This might explain the difference in their
judgment of the same types of arguments. For visual arguments, E2 was probably the
easiest to understand since it was a direct representation of the problem content. A4 and some other visual arguments were more complex since they involved transformation of the shapes. B4 was even more complex since it involved multiple steps.
For algebraic arguments, A2 might involve the simplest equation which had only
one variable and a one-step operation (i.e. multiplication). D2 also contained only one
variable but involved multiple steps of operations. B3, C2 and E4 all contained two or
more variables and involved multiple steps of reasoning. For inductive and perceptual arguments, the provided examples or evoked imaginaries could also be easy or hard to understand.
Five subjects had clearly pointed out that whether an argument was easy for them to understand had impacted their evaluation of the argument, and they might be confused by more complex versions of the same type of argument. Therefore, the difference in complexity of the same type of arguments was a cause for their different ratings.
Differences in familiarity
The subjects' evaluations were also impacted by their familiarity with the arguments' content. Such a difference was evident in the subjects' judgment of the algebraic arguments. Brenda, for example, found B3 (algebraic) not convincing since she was not familiar with the Pythagorean Theorem. However, she found C2 (algebraic) more convincing since she was familiar with the triangle area formula.
Similarly, Allen claimed that the imaginary of a triangle made of wire was clearer to him than an imaginary of a football field. Hence he gave C4 (perceptual) a higher ranking than B2 (perceptual). Abby's view was just the opposite of Allen's. She considered B2
convincing since she knew what a football field looked like. She found C4 not
convincing since she "never heard of using wire to make a triangle." Since it was possible for any argument, regardless of its type, to provide a context that was familiar or unfamiliar to a given student, familiarity could also cause inconsistent evaluations of the same type of argument.
Differences in clarity
Even the same type of arguments could differ in the perceived clarity of the concepts and the reasoning procedure. A typical example was about the
inductive arguments (A1, B1, C1, D1 and E1). B1 was the least clear one since it just
stated that a few examples were tested but didn’t give any information about what
examples were used and what results were observed. A1 and C1 were more clear since
they provided the examples. D1 and E1 were the clearest since they didn’t only provide
examples, but also showed how an example was tested and demonstrated the operations.
These differences impacted Betty's and Allen's judgment as they considered A1, B1 and C1 the least convincing but ranked the other two higher. Clarity of arguments also had impacted Blake's decision. He tended to prefer arguments that didn't show the specific operations, leaving room for his own exploration.
Differences in function
Even though A2, B3, C2, D2 and E4 were all classified as algebraic, the function of the symbolic expressions was different in each argument. Indeed, all symbols represented variables. However, in B3 and C2, the symbolic form was the carrier of known mathematical results (i.e. the Pythagorean Theorem and the triangle area formula), which were not present in A2, D2 and E4. Additionally, some of the symbolic expressions were used to describe an ordered collection of values; however, inequality was not a focus in A2, D2 and E4.
Similar functional differences existed among the visual arguments. In A4, D4 and E2, the figures were used to represent the problem content in a visual format so that the subjects could perceive the relationship within the diagrams and then transform that understanding into the actual context of each problem. However, in the geometry problems, the figures themselves were the objects of study and the participants didn't need to relate them to anything else. Additionally, subjects needed to attend to the quantities of the objects used in the diagrams in A4 and E2; however,
spatial relationship, distance and sizes were the focus of the figures used in B4, C3, and
D4. Therefore, visual illustration was a category that involved highly diverse internal
properties.
The situation was similar for the perceptual arguments. Whereas A3, B2, and C4 were built upon contexts that were familiar to students, additional contexts were not provided in D3 and E3. Instead, D3 and E3 tended to use a narrative description to help students perceive the connection. These functional differences had impacted students' decisions. For example, students might prefer B3 (algebraic) and
C2 (algebraic) since they saw the familiar mathematical results stated in the two arguments as reliable evidence. However, they might have found the other algebraic arguments less convincing since those familiar results were not present (e.g. Allen). Students may have considered arguments that tended to make connections within the context less convincing (e.g. Beth). Students might have acknowledged illustrations of images as convincing in the geometry problems but considered visual aids in other problems less reliable (e.g. Amy and Betty).
The subjects' perception of the examples also varied across arguments. For instance, Amy seemed to view examples provided in the geometry context (i.e. different shapes) as cases that were connected to each other while she considered different numbers as separated and unrelated cases. Therefore, when she evaluated inductive arguments that used different numbers as examples, she considered them less convincing than the inductive arguments in the geometry context. Whether the illustrated examples were viewed as generic examples or
isolated cases also impacted Alice’s and Beth’s conviction. Alice considered D1
(inductive) not convincing since “they only gave two numbers for you to work with” and
“what if the price is higher than 500.” However, she considered A1 (inductive)
convincing since every multiple of 6 that she tried was a multiple of 3 as well. Beth’s
view was just the opposite. She suggested that A1 (inductive) was not convincing since
“just ’cuz she’s tried a lot of them, she hasn’t tried all of them.” However in evaluating
D1 (inductive), she claimed that “I could insert the 200 dollars and the 500 dollars that
he’s suggesting is the same thing, and I could see if it was actually right,” suggesting she
had seen properties in the given example which could transfer to other cases.
Summary
In sum, the subjects' interview responses revealed that the complexity of the arguments, students' familiarity with the context used in the arguments, and the clarity of the explanation presented seemed to have impacted the subjects' evaluation and judgment.
An argument could be difficult or easy, could use familiar or unfamiliar illustration, could
seem clear or unclear, regardless of its type. In addition, students’ perception of the
function of the same component of an argument also varied. They viewed the examples
used in one problem as isolated cases while considering the examples in another problem
as related instances. Therefore, these factors, which didn’t depend on the argument type
but aligned with the subjects’ personal standards for judgment, could have caused
students' inconsistent evaluations across the problem contexts for arguments that were of the same type.
Figure 30. Factors that caused inconsistent evaluation of the same type of arguments: complexity, familiarity, clarity, and functional differences of like components or features.
Reflecting on survey results based on findings from the interviews
The distinct rankings provided by the interview participants (see Table 35)
demonstrated that the highly diverse preferences among students toward each argument,
as observed from the survey results, was unlikely to be a result of random selection. A
comparison between the interview participants' rankings and their survey responses revealed only a few discrepancies, such as participants who had chosen certain arguments in the survey but considered them not convincing in the interview, and Beth, who showed a preference toward algebraic arguments in the interview although such a tendency was not evident in her survey responses. As suggested by the participants' interview responses, each learner had his/her own rationale for his/her decisions, and the choices were unlikely to be arbitrarily made in the survey.
Findings from the interviews offered plausible explanations for our previous
conjectures about why certain options received higher ratings in the survey. It was
hypothesized that the participants were more likely to understand an argument when it
showed more details about concrete examples or provided visual support. This conjecture
was supported by the interview results, where the use of example was perceived as the
most frequently referenced evidence to build a reliable argument. Visual illustration was
the preferred representation by 6 of the 8 interviewees. We had also suspected that the
participants were less likely to be completely convinced by checking and verifying a few
cases. This conjecture was also supported by some subjects’ interview responses, where
four subjects articulated that showing a few examples was not sufficient to convince them
that a conjecture was always true. In addition, we had speculated that arguments that had used easier language and offered shorter descriptions were more likely to be preferred by the students. This speculation was only partially supported, since some interviewees preferred the use of more abstract expressions and detailed explanations. Therefore, students' preferences for the expression of arguments differed among individuals.
The survey results indicated that students were not consistent in their evaluation
of the same type of argument across the contexts. This phenomenon was also observed in
the interview phase. An examination of the participants’ explanation revealed that the
complexity of the expression and concepts used in the arguments, students’ familiarity
with the context, the clarity of the explanation, and personal perception of specific elements used in the arguments seemed to have impacted the subjects' evaluation, which could explain such inconsistency.
It was also observed that survey responses from students enrolled in higher
performing schools (as assessed by state standardized tests) were not significantly
different from those who enrolled in lower performing schools. This result suggested that
the knowledge and skills that could help students achieve higher scores in standardized
tests may not be directly associated with greater maturity in mathematical reasoning. Related
results were also observed in the interview phase. Two participants of the interviews
(Allen and Betty) were enrolled in Honors Algebra I classes and they did demonstrate a familiarity with formulas and fluency in symbolic operations. However, neither one of them
was aware that checking a few examples was not sufficient to prove a conjecture was
always true. They didn’t realize that algebraic expressions could be used to prove the
general validity of a conjecture either. Their personal preferences that didn’t focus on the
rigor of logic of arguments had also impacted their judgment of whether a conjecture was
convincing. Therefore, they apparently didn’t fully understand the purpose of the use of
algebra in mathematics despite of their greater familiarity with symbolic skills. It was
CHAPTER 5. CONCLUSION
This chapter is dedicated to a discussion of the key findings of the study. First, an overview of the study is provided and the findings addressing the research questions are summarized. Furthermore, the study's contribution to the literature is discussed.
The study examined how 8th grade students evaluate arguments in a wide range of mathematical contexts. It sought to identify types of arguments that students found convincing, explanatory and appealing, as well as common aspects and features of arguments that impacted students' evaluation of the arguments.
The study involved two phases, a survey and a follow-up interview. Over five hundred 8th grade students from five Ohio public schools participated in the survey study, where they were provided a variety of arguments in four different mathematical contexts and were asked to determine which of these arguments were convincing, explanatory and appealing to them. Eight subjects, whose survey responses were distinct from each other, were selected to participate in the follow-up interviews, where they were asked to explain their responses.
Both quantitative and qualitative methods were utilized in data analysis.
Statistical data from the survey were used to identify types of mathematical arguments that students found convincing, explanatory and appealing. Interview data were coded using a proof classification framework, i.e. CCIA (see Figure 7), to identify the aspects and features of arguments that impacted the subjects' evaluations.
The findings from both the survey and interview are summarized below to address each research question.
Q.1. Are there certain types of mathematical arguments that students found convincing, explanatory and appealing?
First, we examined if any type (i.e. algebraic, inductive, perceptual, and visual) of argument was rated significantly higher than the others in the cumulative data of all responses obtained to all problems. The survey results suggested that no single type of argument received significantly higher ratings than the others (see Table 13). A certain type of argument might have received a higher rating in one problem but was rated low in another, and collectively, when combining the results for all problems, no categorical type stood out as the most convincing, explanatory or appealing. This result was compatible with findings from the interviews, where no argument received a significantly higher ranking than the others in any problem (see Table 46).
Second, we examined if any argument type received significantly better ratings than the others in each problem context. As mentioned, the interview results didn't reveal significant differences in any of the problems. However, the survey results did indicate that some arguments were considered more convincing, explanatory, or appealing than others in certain contexts. In particular, A4 (visual) and B3 (algebraic) were considered the most convincing, explanatory, and appealing options in their respective problem contexts (the appealing ratings for B3 and B2 were comparable), as they received significantly higher ratings than all the others. This suggested that the participants' preference of argument type was more uniform in some contexts, such as Problems A and B, than in others; in other situations their views were more diverse.
Nevertheless, even in Problems A and B, the lower rated arguments shouldn't be ignored. The most appealing arguments, A4 (visual) and B2 (perceptual), were chosen as the closest to how the students themselves would argue by no more than 40% of the participants (39% for A4 and 28% for B2), while the least appealing options, A3 (perceptual) and B1 (inductive), were chosen by 17% and 20% of the participants. Although the difference between the most appealing and least appealing options was statistically significant (p < .05), the preferences of the more than 1/6 of participants who chose the least appealing options should not be dismissed. Therefore, although some arguments received higher ratings in their respective problem contexts, we found the lower rated arguments were still convincing, explanatory and appealing to a substantial portion of the participants.
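The significance claim above (p < .05 for the gap between the most and least appealing options) can be illustrated with a standard two-proportion z-test. This is a sketch for illustration only, not the dissertation's actual analysis: the per-problem sample size of 476 is assumed, and treating the two choice proportions as if drawn from independent samples is a simplification.

```python
from math import sqrt

def two_proportion_z(p1, p2, n1, n2):
    """z-statistic for comparing two proportions, using the pooled
    estimate of the common proportion for the standard error."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Problem A: A4 chosen by 39% of respondents, A3 by 17%
# (assumed n = 476 respondents per problem)
z = two_proportion_z(0.39, 0.17, 476, 476)
print(z > 1.96)  # True: the gap clears the two-sided .05 critical value
```

Under these assumptions the statistic is far above the 1.96 critical value, consistent with the reported significance, even though the absolute share of the least appealing option remains non-trivial.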
Lastly, we examined whether a certain type of argument was preferred by each individual. The interview results suggested that none of the participants offered the
highest ranking for the same type of argument in more than 3 of the 5 problems. The survey results indicated that only 19 participants (4% of the sample) had chosen the same type of argument as the appealing option for all 4 problems, and an additional 122 participants (26%) had chosen the same type of argument as the appealing option for 3 of the 4 problems. Therefore, most participants (70%) didn't select the same type of argument as the appealing option in more than 2 of the 4 problems. This result suggested that for most individuals, there wasn't a single argument type that was preferred across the contexts.
Overall, we found that students' ratings for the same type of argument were highly inconsistent across the contexts and among individuals. Every argument was found convincing, explanatory, or appealing by at least some survey participants, while in the interview, no argument was ranked significantly lower (p < .05) than the others.
Q.2. Are there common aspects and features of arguments that significantly impact students' evaluation of the arguments?
Findings were first reported on the thinking patterns of all subjects as a group. In addition, findings from each individual interview were compared to identify similarities and differences among the subjects.
The interview responses revealed that among the three aspects of arguments identified in CCIA (i.e. evidence, representation, and link between evidence and conclusion), the evidence was the most frequently referenced by the subjects, followed by representation, while the link was the aspect of least concern when they justified their evaluations.
Among all types of evidence, examples (i.e. results from immediate tests) were referenced most frequently, followed by imaginaries (i.e. mental images created from or recalled from previous experience) and facts (i.e. well-known mathematical results).
Among all types of representations, visual illustration received the most attention. It was at times criticized as being confusing or unreliable. A similar situation arose when studying the algebraic representation: some subjects found it concise and convincing, while others found it confusing.
Among all types of links between the evidence and conclusion, induction was referenced most frequently, and its contribution to the subjects' conviction varied depending on the context and one's personal preference.
The subjects' evaluations of arguments shared some common features, but differences among the individuals also existed. The between-subject similarities and differences were systematically studied.
First, when choosing a reliable source of evidence, the participants found examples (i.e. results from immediate tests) convincing in most contexts. However, their views on the use of existing mathematical results and their own imaginaries differed. Some participants were more likely to be convinced by well-known mathematical results (e.g. a theorem or formula). Others tended to rely on their own imaginaries and previous experiences. Second, regarding representations, we found the participants were more likely to understand numerical and narrative
arguments than symbolic ones. The participants often found numerical results convincing, except on rare occasions (e.g. Brenda had difficulties working with numbers in some contexts). Some participants claimed that images/diagrams that were more difficult to understand (e.g. B4 and D4) confused them and hence were not helpful to their conviction. Furthermore, Amy claimed that visual illustration could sometimes be misleading. Betty claimed that a visual illustration by itself was not sufficient to convince her and suggested that it should be supported by additional evidence. As for symbolic representations, 3 of the 8 participants (Allen, Amy and Betty) found symbolic expressions concise and clear, while the others viewed them as confusing and unhelpful.
Third, when evaluating the link between evidence and conclusion, no participant except Amy had realized that symbolic representation had the potential to prove the general validity of a conjecture. Even Allen and Betty, who demonstrated a good understanding of the meaning of the variables in each symbolic argument, were not aware of algebra's logical advantage. In fact, with the exception of Amy, the participants were not aware that the link between evidence and conclusion must show the argument was always true. Some participants claimed an argument was convincing, but were still not sure if the
corresponding conjecture was always true (e.g. Alice and Betty). Therefore, deductive reasoning was not utilized by most of them (except for Amy in some contexts); instead, conclusions were adopted based upon trials, experience and imaginaries. The participants' views toward
induction differed. Half of them articulated that checking a few cases was not enough to prove a conjecture was always true. However, for these participants, the realization of the limitation of induction wasn't present in their responses to every problem. For example, Beth claimed that trying a few numbers in Problem A was not sufficient to show the conjecture was always true; however, she found checking a few values in Problem E adequate to demonstrate the conjecture was always true. Note that familiarity with algebraic techniques didn't necessarily help the participants realize the limitation of induction. For example, Allen was very confident working with algebra; however, he claimed that he needed to plug in a few values to make sure a formula was correct.
Lastly, the subjects described what a convincing argument meant to them (see Table 37). Only Amy insisted that a convincing argument must show the conjecture was always true. This criterion wasn't a requirement for a convincing argument in the other participants' view. Instead, five of them found easier-to-understand arguments more convincing, and some were convinced by familiar scenarios evoked by the arguments. Two participants held opposing views on how much detail a convincing argument should provide.
Overall, our data suggested that when evaluating mathematical arguments, the evidence had the largest impact on the subjects' judgment, followed by the representation, while the logical link between evidence and conclusion seemed to have the least impact. However, whether a certain type of evidence, representation, or link contributed positively or negatively to a subject's conviction varied across contexts and individuals. The subjects also had personal standards to determine if an argument was convincing.
Analysis of interview data revealed that the complexity of the expression and concepts, students' familiarity with the context used in the arguments, and the clarity of the explanation presented seemed to have impacted the subjects' evaluation and judgment (see Figure 30). An argument, regardless of the type in which it was categorized, could be perceived as difficult or easy, could use familiar or unfamiliar illustrations, and could be explained clearly or unclearly. In other words, the features that students noticed didn't align with the factors we used to categorize the arguments. Because of this misalignment, arguments in the same category were evaluated differently, and students' judgment of the same type of argument varied across the contexts. For example, some participants viewed the examples used in one problem as isolated cases but those in another problem as related instances (e.g. Alice and Beth), which led to their different judgment of inductive reasoning among the contexts. Similarly, some participants found the visual aids in certain problems helpful; however, they considered the visual aids in other problems less reliable as a way to interpret the content (e.g. Amy). These findings suggested that the features students attended to were personal and context dependent. Since those features differed from the factors used to categorize the arguments, students' inconsistent evaluations of arguments that were categorized as the same type were understandable.
This study advanced the understanding of proof learning in four aspects: an empirical report on results from a large sample, an investigation of student thinking patterns, the use of a novel theoretical framework, and an enrichment of the task reservoir.
First, the study analyzed survey results from 476 eighth grade students who were enrolled in five Ohio public schools that had demonstrated different levels of performance on state standardized tests. Compared to Healy and Hoyles's (2000) study, which focused on the performance of high-attaining 14-15 year old students, this study chose a sample that was more likely to represent the general eighth grade student population. Healy and Hoyles found that students excluded algebraic arguments when they were asked to select an argument that they found convincing and explanatory. Our results contrast with the findings reported in Healy and Hoyles's study in that our subjects didn't show bias against algebraic arguments when making their choices. In fact, the algebraic argument in each problem context was considered convincing and explanatory by at least 3/5 of the participants, and algebraic arguments were not the least appealing option in any problem except Problem D. In addition, the follow-up interviews revealed that 3 of the 8 participants exhibited a preference toward the use of symbolic expressions.
we didn’t observe that algebraic arguments were less convincing or preferred when
compared to other types of options. The most evident finding was that students’ preferred
argument type was highly inconsistent across content areas and different among
individuals. Hence, it was difficult to conclude whether a certain type of argument was
more likely to be considered convincing, explanatory and appealing by the students. This
result is compatible with findings of the previous literature that the understanding of
proof develops locally (Freudenthal, 1971; Reid, 2011), and hence an overarching
preference of proof type is unlikely to be achieved at early cognitive stages. The findings of this study suggested that there didn't seem to be any single approach that solely facilitated the local development of proof understanding. As illustrated by our data, at
least half of the students found two or more argument types convincing and explanatory
in the same context, and even the least appealing option was preferred by at least 1/6 of
the sample in each problem. Overall, this current study provided an analysis of empirical data demonstrating that students' preference of argument types was highly diverse among individuals in each of the studied contexts.
Second, the study provided insights into the aspects of arguments that impacted eighth grade students' evaluation of the arguments. The evidence used in an argument received the greatest attention from the subjects and had a major impact on their judgment. In addition, our analysis revealed that at least half of the subjects had
realized that a conjecture couldn’t be proved to be always true if only a few examples
were tested. However, most of them were not yet aware of the advantage of symbolic reasoning in establishing general validity. Balacheff's (1988) model of pragmatic justification (see Figure 1) and Waring's (2000) proof levels well explained this phenomenon. According to Balacheff's theory, the subjects in this study no longer
relied on naive empiricism. Instead, their conviction depended on crucial experiment and
generic examples. According to Waring’s model, these subjects had reached Level 2,
where they still relied on empirical checking but were more careful in choosing examples
to verify, with the potential to notice certain patterns in the process. Such an argumentation mode between induction and deduction has also been extensively documented in previous literature.
The interview responses also revealed that the link between evidence and
conclusion was the issue students seemed least concerned with when they were
evaluating the arguments. This finding was compatible with Yang and Lin’s (2008) RCGP
model (see Figure 6). As suggested by RCGP, students would not start examining the link
if they found the evidence unconvincing or the argument was represented in an unreliable
format. Therefore, it was natural that the link was not as frequently referenced. This
phenomenon could also be well explained by the broad maturation of proof structure model (Tall et al., see Figure 4). According to this model, lower cognitive stages involve less reflective forms of reasoning; only upon developing a more mature knowledge structure would students become able to reflect on the link used in the argumentation.
Additionally, factors that caused the subjects' different rankings of the same type of arguments in different contexts were discussed. Factors that were not context specific, such as the complexity of the language used in the arguments, students' familiarity with the context used in the arguments, and the clarity of the explanation presented, were identified.
However some context specific factors were also detected. For example, some subjects
were more likely to see the common properties among shapes than between numbers,
which led to a different interpretation of the inductive arguments in the number theory
and geometry contexts. In particular, they viewed examples used in the number theory
problem as isolated cases which couldn’t show the general validity of the conjecture,
while they considered examples used in the geometry problems as generic examples that
demonstrated why the corresponding conjecture was always true. In addition, visual arguments could serve different purposes in different contexts. Diagrams and figures could be used to display specific cases or to illustrate general spatial relationships, and accordingly, the subjects' views of the visual arguments could also differ. This was an important finding, since few past studies had offered an explanation for students' inconsistent evaluations across contexts.
It was also observed that for 5 subjects a convincing argument needed to be easy
to understand. This finding is compatible with Hanna & Jahnke's (1993) suggestion that arguments that use easy expressions, simple examples, and familiar scenarios are more accessible to students. Several participants used other personal standards to determine whether an argument was convincing. For example, one of the participants, Allen, believed that a convincing argument should contain more details. Blake advocated an opinion that countered Betty's, claiming that a convincing argument should not provide the complete and detailed procedure but leave some space for readers to think. Rigor was not as important as these other factors when the subjects judged whether an argument was convincing.
Third, the study relied on a novel theoretical framework, i.e. CCIA (see Figure 7), to classify aspects of proofs and different genres within each aspect for documenting students' foci when they evaluated arguments. In order to clarify ambiguities in classification, CCIA acknowledged that neither the representation, the source of conviction, nor the link between source and conclusion can be identified merely by looking at the text and content of the argument. This attention to a learner's understanding of an argument, instead of its expression, was the most distinct feature of CCIA. For instance, an argument that merely checked a few cases would normally be classified as inductive (e.g. E1); however, when a learner had perceived more general properties through the examination of one or a few cases, then a seemingly inductive argument was actually treated as a transformational one even though a description of the transformation was not included in the text (e.g. the cases of Allen, Amy and Betty). Therefore, to perceive students'
evaluation of different argument types, it was important to first understand their interpretation of the arguments. CCIA accounted for this and hence was a more accurate model for investigations of students' thinking.
Lastly, few studies have investigated middle school students' comprehension and evaluation of given proofs. This has been, in part, due to the absence of instruments that support such investigations. In this study, five problems were designed (four were included in SMR and another was used in the interview) to enrich the reservoir of tasks appropriate for students who have been introduced to symbolic expressions and proofs in school mathematics. These tasks were embedded in different branches of school mathematics and provided a variety of problem contexts. The argument types provided in each problem were aligned with those used in the other contexts to assess whether students' view of mathematical proof was consistently developed across the fields of school mathematics. The tasks can also be used with older students to examine how their views of proof develop through further schooling.
When comparing the tasks used in this study to existing materials, it was found that the problems used in Harel & Sowder's (1998) study involved more advanced mathematical topics, since that instrument was designed for college students. In addition, Harel & Sowder studied the arguments generated by students instead of their judgment of proposed items. Yang & Lin (2008), Healy & Hoyles (2000), and Stylianides & Stylianides did study students' evaluation of given arguments. However, the tasks used in Yang & Lin's study were restricted to geometry contexts. Healy & Hoyles's questionnaire contained both geometry and number theory tasks; however, only the latter were published in that work. Tasks used by Stylianides & Stylianides covered various
mathematical contexts, and some didn't involve complex mathematical concepts and hence can be used with younger students. However, the subjects of interest in their study were primarily college students. Therefore, the tasks used in this study enriched the task reservoir in three major aspects: 1) the tasks were specially designed for students who were first introduced to algebraic expressions and geometry proofs; 2) the tasks were embedded in multiple branches of school mathematics; 3) the types of arguments used in each problem aligned with those used in the other problems, which enabled a between-context comparison.
The first limitation of the study was the unconfirmed effect of multiple approaches on a learner's conviction. Since the survey and interview data had shown that students' preferences were highly diverse, the use of multiple strategies was suggested, as any single approach might only contribute to some students' conviction. However, it was unclear, merely based on our data, whether the use of multiple argument types enhanced a particular individual's conviction about the validity of an argument more than adopting the one approach that he/she found most appealing.10 Therefore, studies to verify the actual effect of adopting multiple approaches are needed.
Second, the participants' evaluation of one argument could have been altered by their reading of the other arguments. During the evaluation, the participants read all four arguments for each problem at the same time.
10 Nonetheless, Allen did indicate that arguments that involved a combination of formula and visual illustration would be “perfect.”
Since their perception of these arguments involved not only information extracted from them but also the construction of new mental images, their judgment may have been subconsciously evoked by this new knowledge. Consequently, their judgment of an argument might not have been based on that argument alone. The possible interference among subjects' evaluations of different arguments needs to be examined in future studies.
Lastly, although this study highlighted some aspects and features of arguments that largely impacted students' evaluation, the emergent patterns were not precise enough to allow for a prediction of individuals' choices when a new problem was proposed. This wasn't surprising, since we had focused on the impact of the aspects of arguments, while other personal and contextual factors, such as learners' background knowledge of specific topics, were not considered. Therefore, investigations that seek to identify personal and contextual factors and their impact on one's judgment of arguments are essential. A central goal of such work is to identify phases where students are able to identify, understand, appreciate or produce certain types of proofs or certain components of arguments, and to describe how they develop through these phases (Tall et al., 2012; Waring, 2000). In order to do so, a classification of proofs and their components is needed. Existing models differ in their foci: Yang & Lin's (2008) model and Tall et al.'s (2012) framework concerned students' understanding of different components of arguments (e.g. the evidence, concepts, and links), whereas Waring (2000), Harel & Sowder (1998), and Simon (1996) built their theories by classifying the justifications that students produced. The theoretical framework of the current study, i.e. CCIA, considered students' interpretation of these components to provide a more precise classification of the proof types, so that each argument could be classified based on its representation, source of evidence, and the link
between evidence and conclusion. These models, although different in many ways,
shared some common deficiencies. Most prominently, the types and components of
proofs identified in these theories are not content specific. For example, inductive
arguments, as categorized by Waring, Harel & Sowder, and CCIA, include verifications
by empirical tests in geometry, number theory, probability, and other mathematical areas.
Similarly, transformational arguments may involve spatial operations in geometry contexts or value shifts in number theory contexts. The ability to understand and apply definitions as
identified in Tall et al’s model describes a stage in learning geometric proofs as well as in
working on proofs in abstract algebra. Even Yang & Lin's model, which was restricted to the context of geometry, didn't consider the differences between 2-dimensional and 3-dimensional geometry, or between triangles and circles. Since the designers of these models possessed a mature understanding of mathematics, they were able to see the connection between two arguments in two
different mathematical fields or topics. Consequently, certain arguments were grouped together as one category since they shared some logical structure or possessed some other “macro” properties. However, proof learners, who haven't yet developed the ability to compare mathematical arguments across content areas, might not be able to see the connection between two arguments that were classified as the same type; their understanding might instead be tied to each specific mathematical topic.
The argument types used in this study were generated based on the researcher's understanding of proofs. The results from both the survey and the interviews suggested that students demonstrated inconsistent views toward the same type of argument in different contexts. However, this phenomenon might be explained differently. That is, the same qualities might not have been equally represented by arguments that were classified as the same type (e.g. the inductive argument used in the number theory problem might not have offered as much explanation to students as the inductive argument in the algebra problem did). As pointed out earlier, the argument type was determined by the standard set by researchers, who had a mature understanding of mathematics that was certainly different from that of school learners. Therefore, arguments classified in the
same category by researchers might not seem similar to students. As pointed out by
Lakatos (1976), even mathematicians' standards of reliable proofs change when different contexts are taken into consideration. For instance, visual illustrations were widely used in geometric proofs; however, they were no longer viewed as reliable when calculus was
taken into consideration. Therefore, it was natural for the learners to first develop their
justification skills in local contexts (Reid, 2011). Only when their reasoning skills reached certain levels in two contexts were they able to identify and compare the arguments across those contexts. Hence, a general claim about students' evaluation of, say, an inductive argument might be premature without considering the specific problem context. This was certainly evident in the results of this study, where those interviewed clearly evaluated the same argument types differently across the problem contexts, including Problem D. The status of visual arguments and their impact on students' conviction wasn't
conclusive either, since graphs and diagrams could serve distinct functions in different
contexts (e.g. in B4, D4, and E2). Currently tools that measure students’ reasoning
maturity and characteristics in specific content areas are absent. Therefore, all features
and categories we constructed and used were based upon the synthesis of what was
known about mathematical reasoning as a generalized method. The theories were not
built upon the features of local content and learners' understanding of such content. Note that we did not attempt to deny the existence of more general patterns in students' development of reasoning ability across content areas. However, we would like to stress that merely identifying these general patterns might not be sufficient to explain individual students' evaluation of specific arguments. Therefore, we call for the development of content-specific proof/argument classification frameworks.
In addition, we claim the need to take personal factors into consideration in the development of useful and explanatory theoretical models. The survey and interview results both identified great differences among individuals. The differences appeared not only in the participants' judgment of arguments, but also in their focus and rationale when making judgments. The individual differences were shaped by the participants' personal experiences and backgrounds, which certainly influence learners' mental images and how they perceive and interpret
arguments. The participants had found certain arguments convincing since they were
familiar with the scenarios provided in the arguments (e.g. “football field” in B2). They
had articulated that topics and results learnt in class (e.g. the “triangle area formula” in C2) contributed to their conviction, and such classroom experiences might also vary among individuals. However, this element has rarely been taken into account in existing proof understanding and reasoning development models. The focus has been on the arguments that students were able to produce and their judgment of certain types of mathematical statements, rather than on the sources that may have contributed to their
thinking when making decisions. Therefore, in designing future (content specific) proof
classification and proof skill development models, personal factors may be considered as
key variables. These factors shouldn’t be treated as obstacles that impede the
development of proving skills, but as valuable sources for sense making and construction
of sound arguments.
Implications for Proof Teaching
Previous research has suggested that pressing students to produce formal proofs might help them pass examinations but might also create a gap between the work they show their teachers and what they would use to convince themselves (Hanna & Jahnke, 1993; Healy & Hoyles, 2000). Therefore, pressing students to use a
rigorous reasoning format may not actually help them understand the logic embedded in a
mathematical problem. The process of nurturing mathematical reasoning should start with
an understanding of more “natural” ways in which the students argue. Those “natural”
ways are usually mathematically incomplete or at times incorrect, however they may help
learners understand and access the problem and can ultimately influence their judgment.
For example, our study revealed that students’ conviction was strongly impacted by
examination of examples. This idea coincided with the constructivist view of building understanding from concrete experiences. Although testing examples to verify a statement is not a rigorous way to prove the statement is true, it does provide a concrete context for students to work in and hence to understand the problem better.
The findings of this study highlighted the need to foster students' proof capacity within each content area. As the data suggested, learners' understanding of proof develops locally
and doesn’t automatically transfer to other fields. Students may appreciate deductive
reasoning in one area, but still find visual illustration and use of examples convincing in
other contexts (e.g. Amy). Since proof ability develops locally, instruction should incorporate multiple approaches, including the use of various evidence, representations and reasoning modes. Since the arguments that convinced students were highly diverse among individual learners, any single approach might be appropriate and effective for only a small portion of them; offering a variety of evidence, representations and reasoning modes could be essential to help all students access a problem and perceive the underlying logic. Moreover, familiarity with symbolic representations and known theorems didn't guarantee that students understood
the algebraic expression’s general validity. This was evident in Betty’s case, where she
claimed the algebraic argument that adopted the Pythagoras Theorem was convincing and
she clearly explained the meaning of the variables used in the theorem; however she still
believed there existed counterexamples to the conjecture. Therefore, students' understanding of the purpose of algebraic arguments needs to be cultivated. Findings of this study suggest that a need exists for fostering students' awareness of the generality requirement of proof in school mathematics. Note that we are not suggesting that students at the introductory level need
to be taught to examine the rigor of each reasoning step in an argument. In fact, we posit
that students should be allowed to use any type of evidence, representation and reasoning
mode to investigate a problem and to convince themselves that a conjecture is always true.
However, since it was found that most of the interviewees didn't realize that an argument couldn't be convincing if it didn't show the conjecture was always true, we argue that an awareness of this requirement should be cultivated in classroom practice. Such an awareness is the foundation for future development of logical rigor (e.g. Stylianides, 2008b).
Lastly, findings of this study emphasized the need to provide examples for students to explore. Although testing examples is not considered a valid mathematical proving process, it does give students access to the problem and provides them opportunities to make and test conjectures, to seek patterns, and to explore approaches that verify a conjecture. The value of examples has been
addressed by other scholars (e.g. Balacheff, 1988; de Villiers, 2003; Simon, 1996;
Stylianides & Stylianides, 2008a). In this study, examples, as a type of evidence, were
also the most referenced components of arguments that impacted the subjects’ judgment.
Students’ preferred type of representations and reasoning modes might differ; however,
even those who were aware of the limitation of examples considered them helpful for
their understanding of the problem. Therefore, the use of examples (could be in various
for the instruction of proofs. This implication is compatible with the main stream
REFERENCES
Armstrong, A. H. (Ed.) (1970). The Cambridge history of later Greek and early Medieval
philosophy. Cambridge, UK: Cambridge University Press.
Balacheff, N. (1991). The benefit and limits of social interaction: The case of
mathematical proof. In A. Bishop, Mellin-Olsen, E. & van Dormolen, J. (Eds.),
Mathematical knowledge: Its growth through teaching (pp. 175-192). Dordrecht,
Netherlands: Kluwer.
Ball, D. L., & Bass, H. (2000). Interweaving content and pedagogy in teaching and
learning to teach: Knowing and using mathematics. In J. Boaler (Ed.), Multiple
perspectives on the teaching and learning of mathematics (pp. 83-104). Westport,
CT: Ablex.
Ball, D. L., & Bass, H. (2003). Making mathematics reasonable in school. In J. Kilpatrick,
Martin, W. G., & Schifter, D. (Eds.), A research companion to principles and
standards for school mathematics (pp. 27-44). Reston, VA: National Council of
Teachers of Mathematics.
Biggs, J., & Collis, K. (1982). Evaluating the quality of learning: the SOLO taxonomy.
New York: Academic Press.
Bloom, B. S. (1984). Taxonomy of educational objectives: Book I cognitive domain (2nd
edition). Boston, MA: Addison Wesley Publishing Company.
Boero, P. (Ed.) (2007). Theorems in school: From history, epistemology and cognition to
classroom practice. Rotterdam, Netherlands: Sense Publishers.
Brabiner, J. V. (2009). Why Proof? Some Lessons from the History of Mathematics. In
Fou-Lai Lin, Hsieh, F., Hanna, G. & de Villiers, M. (Eds.), Proceedings of the
ICMI Study 19 conference: Proof and proving in mathematics education (Vol. 1,
pp. 12). Taipei, Taiwan: National Taiwan Normal University.
Brouwer, L. E. J. (1905/1996). Life, art and mysticism. Notre Dame Journal of Formal
Logic, 37(3), 389-429.
Brunner, K. (1987). The perception of man and the conception of society: Two approaches
to understanding society. Economic Inquiry, 25(3), 367-388.
Carnap, R. (1937). The logical syntax of language. London, UK: K. Paul Trench.
Chazan, D. (1993). High school geometry students’ justification for their views of
empirical evidence and mathematical proof. Educational Studies in Mathematics,
24(4), 359-387.
Chazan, D., & Lueke, H. M. (2009). Relationships between disciplinary knowledge and
school mathematics: Implications for understanding the place of reasoning and
proof in school mathematics. In D. A. Stylianou, Blanton, M. L., & Knuth E. J.
(Eds.), Teaching and Learning Proof Across the Grades: A K-16 Perspective (pp.
21-39). New York: Routledge.
Clements, D. H., & Battista, M. T. (1992). Geometry and spatial reasoning. In D. Grouws
(Ed.), Handbook of Research on Mathematics Teaching and Learning (pp. 420-
464). New York: NCTM/Macmillan.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). New
Jersey: Lawrence Erlbaum.
Council of Chief State School Officers (2010). Common Core State Standards
(Mathematics). National Governors Association Center for Best Practices.
Washington, D.C.
Creswell, J. W. & Plano Clark, V. L. (2011). Designing and conducting mixed methods
research. Thousand Oaks, CA: Sage Publications, Inc.
Davis, P. J. (1976). The nature of proof. In M. Carss (Ed.), Proceedings of the fifth
international congress on mathematical education. Boston, MA: Birkhauser.
de Villiers, M. (1990). The role and function of proof in mathematics. Pythagoras, 24,
17–24.
de Villiers, M. (2003). Rethinking proof with the Geometer’s Sketchpad. Emeryville, CA:
Key Curriculum Press.
Ernest, P. (1996). New angles on old rules. Times Higher Educational Supplement. Times
Supplements Ltd.
Fawcett, H. P. (1995). The nature of proof. Thirteenth Yearbook of the NCTM. New York:
Teachers College, Columbia University. (Original work published 1938).
Fischbein, E. (1982). Intuition and proof. For the Learning of Mathematics 3(2), 9–24.
Freudenthal, H. (1971). Geometry between the devil and the deep sea. Educational
Studies in Mathematics, 3(3-4), 413-435.
Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und
verwandter Systeme. Monatshefte für Mathematik und Physik, 38, 173–98.
González, G., & Herbst, P. (2006). Competing arguments for the geometry course: Why
were American high school students supposed to study geometry in the twentieth
century? International Journal for the History of Mathematics Education, 1(1), 7-33.
Hanna, G. (1983). Rigorous proof in mathematics education. Toronto, Canada: OISE Press.
Hanna, G., & Jahnke, H. N. (Eds.). (1993). Aspects of proof [Special issue]. Educational
Studies in Mathematics, 24(4).
Harel, G., & Sowder, L. (1998). Students’ proof schemes: Results from exploratory
studies. In A. H. Schoenfeld, Kaput, J., & Dubinsky, E. (Eds.), Research in
Collegiate Mathematics Education III (pp. 234-283). Providence, RI: American
Mathematical Society.
Harel, G., & Sowder, L. (2007). Toward comprehensive perspectives on the learning and
teaching of proof. In F. Lester (Ed.), Second handbook of research in mathematics
teaching and learning (pp. 805-842). Charlotte, NC: Information Age Publishing.
Healy, L., & Hoyles, C. (2000). A study of proof conceptions in algebra. Journal for
Research in Mathematics Education, 31(4), 396–428.
Heinze, A., & Reiss, K. (2009). Developing argumentation and proof competencies in the
mathematics classroom. In D. A. Stylianou, Blanton, M. L., & Knuth E. J. (Eds.),
Teaching and Learning Proof Across the Grades: A K-16 Perspective (pp. 191-
203). New York: Routledge.
Herbst, P., & Brach, C. (2006). Proving and doing proofs in high school geometry classes:
What is it that is going on for students? Cognition and Instruction, 24(1), 73–122.
Hersh, R. (2009). What I would like my students to already know about proof. In D. A.
Stylianou, Blanton, M. L., & Knuth E. J. (Eds.), Teaching and Learning Proof
Across the Grades: A K-16 Perspective (pp. 17-20). New York: Routledge.
Hilbert, D., & Bernays, P. (1934/1939). Grundlagen der Mathematik I and II, first
editions. Berlin, Germany: Verlag Julius Springer.
Hoyles, C. (1997). The curricular shaping of students’ approaches to proof. For the
Learning of Mathematics, 17(1),7-16.
IAS/PCMI (2007). International Seminar: Bridging policy and practice in the context of
reasoning and proof. Institute for Advanced Study / Park City Mathematics
Institute. Princeton, NJ.
<http://mathforum.org/pcmi/int2007.html>
Inglis, M., & Alcock, L. (2012). Expert and novice approaches to reading mathematical
proofs. Journal for Research in Mathematics Education, 43(4), 358-390.
Kakeya, S. (1917). Some problems on maximum and minimum regarding ovals. Tohoku
Science Reports, 6, 71–88.
Kieren, T., & Pirie, S. (1991). Recursion and the mathematical experience. In L. Steffe
(Ed.), The Epistemology of Mathematical Experience (pp. 78-101). New York:
Springer Verlag Psychology Series.
Knuth, E. J., Choppin, J. M., & Bieda, K. N. (2009). Middle school students’ production
of mathematical justifications. In D. A. Stylianou, Blanton, M. L., & Knuth, E. J.
(Eds.), Teaching and Learning Proof Across the Grades: A K-16 Perspective (pp.
153-170). New York: Routledge.
Longo, G. (2009). Theorems as constructive visions. In Fou-Lai Lin, Hsieh, F., Hanna, G.
& de Villiers, M. (Eds.), Proceedings of the ICMI Study 19 conference: Proof and
proving in mathematics education (Vol. 1, pp. 13-25). Taipei, Taiwan: National
Taiwan Normal University.
Marrades, R., & Gutiérrez, A. (2000). Proofs produced by secondary school students
learning geometry in a dynamic computer environment. Educational Studies in
Mathematics, 44(1&2), 87-125.
Mason, J. (2009). Mathematics education: Theory, practice and memories over 50 years.
In S. Lerman and B. Davis (Eds.), Mathematical action & structures of noticing:
Studies on John Mason’s contribution to mathematics education (pp. 1-14).
Rotterdam: Sense Publishers.
McConaughy, S. H., & Achenbach, T. M. (2001). Manual for the Semistructured Clinical
Interview for Children and Adolescents (2nd ed.). Burlington, VT: University of
Vermont, Research Center for Children, Youth, and Families.
Mejia-Ramos, J. P., Fuller, E., Weber, K., Rhoads, K., & Samkoff, A. (2012). An
assessment model for proof comprehension in undergraduate mathematics.
Educational Studies in Mathematics, 79(1), 3-18.
National Council of Teachers of Mathematics (2000). Principles and standards for school
mathematics. Reston, VA: NCTM.
Onwuegbuzie, A. J., & Leech, N. L. (2006). Linking research questions to mixed methods
data analysis procedures. The Qualitative Report, 11(3), 474-498.
Pal, J. (1920). Ueber ein elementares Variationsproblem. Kongelige Danske
Videnskabernes Selskab, Math.-Fys. Medd., 2, 1–35.
Pegg, J., & Davey, G. (1998). Interpreting student understanding in geometry: A synthesis
of two models. In R. Lehrer & D. Chazan (Eds.), Designing learning
environments for developing understanding of geometry and space (pp. 109–133).
Mahwah, NJ: Lawrence Erlbaum Associates.
Piaget, J. (1928). Judgment and reasoning in the child. New York, NY: Harcourt, Brace,
and Co.
Pirie, S., & Kieren, T. (1992). Creating constructivist environments and constructing
creative mathematics. Educational Studies in Mathematics, 23(5), 505-528.
Schoenfeld, A. H. (1988). When good teaching leads to bad results: The disasters of
“well-taught” mathematics courses. Educational Psychologist, 23(2), 145–166.
Schoenfeld, A. H. (1991). On mathematics as sense-making: An informal attack on the
unfortunate divorce of formal and informal mathematics. In J. Voss, D. N. Perkins,
& J. Segal (Eds.), Informal reasoning and education (pp. 311-343). Hillsdale, NJ:
Erlbaum.
Selden, A., & Selden, J. (2003). Validations of proofs considered as texts: Can
undergraduates tell whether an argument proves a theorem? Journal for Research
in Mathematics Education, 34(1), 4–36.
Senk, S. L. (1985). How well do students write geometry proofs? Mathematics Teacher,
78(6), 448-456.
Simon, M. A. (1996). Beyond inductive and deductive reasoning: The search for a sense
of knowing. Educational Studies in Mathematics, 30(2), 197-210.
Stylianides, A. J. (2007). Proof and proving in school mathematics. Journal for Research
in Mathematics Education, 38(3), 289-321.
Tall, D. (1999). The cognitive development of proof: Is mathematical proof for all or for
some? In Z. Usiskin (Ed.), Developments in School Mathematics Education
Around the World (Vol. 4, pp. 117–136). Reston, VA: NCTM.
Tall, D. (2002). Differing modes of proof and belief in mathematics. In F.-L. Lin (Ed.),
International Conference on Mathematics: Understanding Proving and Proving to
Understand (pp. 91–107). National Taiwan Normal University, Taipei, Taiwan.
Tall, D. (2005). The transition from embodied thought experiment and symbolic
manipulation to formal proof. In M. Bulmer, H. MacGillivray & C. Varsavsky
(Eds.), Proceedings of Kingfisher Delta’05, Fifth Southern Hemisphere
Symposium on Undergraduate Mathematics and Statistics Teaching and Learning
(pp. 23-35). Fraser Island, Australia.
Usiskin, Z. (1980). What should not be in the algebra and geometry curricula of average
college-bound students? Mathematics Teacher, 73(6), 413-424.
van Hiele, P. M. (1986). Structure and insight: A theory of mathematics education. New
York: Academic Press.
Waring, S. (2000). Can you prove it? Developing concepts of proof in primary and
secondary schools. Leicester, UK: The Mathematical Association.
Weber, K. (2001). Student difficulty in constructing proofs: The need for strategic
knowledge. Educational Studies in Mathematics, 48(1), 101-119.
Yang, K., & Lin, F. (2008). A model of reading comprehension of geometry proof.
Educational Studies in Mathematics, 67(1), 59-76.
Yin, R. K. (2009). Case study research: Design and methods (4th ed.). Thousand
Oaks, CA: SAGE Publications.
Zack, V. (1997). “You have to prove us wrong”: Proof at the elementary school level. In
E. Pehkonen (Ed.), Proceedings of the 21st Conference of the International Group
for the Psychology of Mathematics Education (Vol. 4, pp. 291-298). Lahti:
University of Helsinki.
APPENDIX A. SURVEY RESULTS: PAIRWISE COMPARISONS OF
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
A1 A2 .174* .037 .000 .102 .246
A3 .111* .036 .002 .041 .182
A4 .057 .036 .115 -.014 .127
A2 A1 -.174* .037 .000 -.246 -.102
A3 -.063 .041 .121 -.143 .017
A4 -.118* .042 .005 -.200 -.036
A3 A1 -.111* .036 .002 -.182 -.041
A2 .063 .041 .121 -.017 .143
A4 -.055 .041 .187 -.136 .027
A4 A1 -.057 .036 .115 -.127 .014
A2 .118* .042 .005 .036 .200
A3 .055 .041 .187 -.027 .136
B1 B2 -.118* .042 .005 -.200 -.035
B3 .137* .047 .004 .044 .229
B4 .183* .048 .000 .089 .276
B2 B1 .118* .042 .005 .035 .200
B3 .254* .045 .000 .166 .343
B4 .300* .046 .000 .210 .391
B3 B1 -.137* .047 .004 -.229 -.044
B2 -.254* .045 .000 -.343 -.166
B4 .046 .046 .311 -.043 .136
B4 B1 -.183* .048 .000 -.276 -.089
B2 -.300* .046 .000 -.391 -.210
B3 -.046 .046 .311 -.136 .043
Continued
Table 39. Pairwise comparisons: Participants’ ratings on whether the arguments in each
problem were understandable
Table 39 continued
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
C1 C2 .011 .044 .811 -.076 .097
C3 -.080 .044 .073 -.167 .007
C4 -.086 .044 .053 -.173 .001
C2 C1 -.011 .044 .811 -.097 .076
C3 -.090* .046 .048 -.180 -.001
C4 -.097* .044 .028 -.183 -.010
C3 C1 .080 .044 .073 -.007 .167
C2 .090* .046 .048 .001 .180
C4 -.006 .044 .887 -.093 .081
C4 C1 .086 .044 .053 -.001 .173
C2 .097* .044 .028 .010 .183
C3 .006 .044 .887 -.081 .093
D1 D2 .237* .041 .000 .156 .319
D3 .021 .043 .624 -.063 .105
D4 .208* .044 .000 .121 .295
D2 D1 -.237* .041 .000 -.319 -.156
D3 -.216* .045 .000 -.305 -.128
D4 -.029 .044 .506 -.116 .057
D3 D1 -.021 .043 .624 -.105 .063
D2 .216* .045 .000 .128 .305
D4 .187* .044 .000 .100 .273
D4 D1 -.208* .044 .000 -.295 -.121
D2 .029 .044 .506 -.057 .116
D3 -.187* .044 .000 -.273 -.100
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
A1 A2 -.318* .073 .000 -.462 -.173
A3 -.270* .075 .000 -.418 -.123
A4 -.474* .075 .000 -.622 -.326
A2 A1 .318* .073 .000 .173 .462
A3 .047 .068 .487 -.087 .182
A4 -.156* .063 .014 -.281 -.032
A3 A1 .270* .075 .000 .123 .418
A2 -.047 .068 .487 -.182 .087
A4 -.204* .064 .002 -.330 -.078
A4 A1 .474* .075 .000 .326 .622
A2 .156* .063 .014 .032 .281
A3 .204* .064 .002 .078 .330
B1 B2 -.154 .094 .103 -.339 .032
B3 -.685* .090 .000 -.862 -.508
B4 -.510* .086 .000 -.681 -.340
B2 B1 .154 .094 .103 -.032 .339
B3 -.531* .089 .000 -.708 -.355
B4 -.357* .094 .000 -.543 -.170
B3 B1 .685* .090 .000 .508 .862
B2 .531* .089 .000 .355 .708
B4 .175* .074 .020 .028 .322
B4 B1 .510* .086 .000 .340 .681
B2 .357* .094 .000 .170 .543
B3 -.175* .074 .020 -.322 -.028
Continued
Table 40. Pairwise comparisons of survey results: Participants’ ratings on whether the
arguments in each problem were convincing
Table 40 continued
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
C1 C2 -.172* .071 .017 -.313 -.031
C3 .019 .082 .815 -.142 .180
C4 -.096 .072 .184 -.237 .046
C2 C1 .172* .071 .017 .031 .313
C3 .191* .077 .014 .039 .343
C4 .076 .067 .258 -.057 .209
C3 C1 -.019 .082 .815 -.180 .142
C2 -.191* .077 .014 -.343 -.039
C4 -.115 .077 .137 -.266 .037
C4 C1 .096 .072 .184 -.046 .237
C2 -.076 .067 .258 -.209 .057
C3 .115 .077 .137 -.037 .266
D1 D2 .074 .084 .376 -.091 .240
D3 .034 .079 .671 -.123 .191
D4 -.047 .087 .588 -.219 .125
D2 D1 -.074 .084 .376 -.240 .091
D3 -.041 .084 .630 -.207 .126
D4 -.122 .081 .137 -.282 .039
D3 D1 -.034 .079 .671 -.191 .123
D2 .041 .084 .630 -.126 .207
D4 -.081 .085 .341 -.249 .087
D4 D1 .047 .087 .588 -.125 .219
D2 .122 .081 .137 -.039 .282
D3 .081 .085 .341 -.087 .249
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
A1 A2 -.076 .069 .273 -.212 .060
A3 -.076 .066 .252 -.206 .054
A4 -.242* .059 .000 -.359 -.124
A2 A1 .076 .069 .273 -.060 .212
A3 .000 .068 1.000 -.133 .133
A4 -.166* .057 .004 -.279 -.053
A3 A1 .076 .066 .252 -.054 .206
A2 .000 .068 1.000 -.133 .133
A4 -.166* .057 .004 -.279 -.053
A4 A1 .242* .059 .000 .124 .359
A2 .166* .057 .004 .053 .279
A3 .166* .057 .004 .053 .279
B1 B2 -.056 .096 .559 -.245 .133
B3 -.329* .091 .000 -.508 -.149
B4 -.168 .091 .069 -.349 .013
B2 B1 .056 .096 .559 -.133 .245
B3 -.273* .089 .003 -.448 -.097
B4 -.112 .087 .198 -.283 .059
B3 B1 .329* .091 .000 .149 .508
B2 .273* .089 .003 .097 .448
B4 .161* .075 .035 .012 .310
B4 B1 .168 .091 .069 -.013 .349
B2 .112 .087 .198 -.059 .283
B3 -.161* .075 .035 -.310 -.012
Table 41. Pairwise comparisons of survey results: Participants’ ratings on whether the
arguments in each problem were explanatory
Table 41 continued
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
C1 C2 -.076 .078 .329 -.231 .078
C3 .045 .070 .523 -.093 .182
C4 .025 .069 .712 -.110 .161
C2 C1 .076 .078 .329 -.078 .231
C3 .121 .078 .122 -.033 .275
C4 .102 .074 .171 -.044 .248
C3 C1 -.045 .070 .523 -.182 .093
C2 -.121 .078 .122 -.275 .033
C4 -.019 .065 .769 -.147 .109
C4 C1 -.025 .069 .712 -.161 .110
C2 -.102 .074 .171 -.248 .044
C3 .019 .065 .769 -.109 .147
D1 D2 .176* .070 .014 .037 .315
D3 .020 .071 .777 -.121 .161
D4 -.014 .068 .844 -.149 .122
D2 D1 -.176* .070 .014 -.315 -.037
D3 -.155* .078 .047 -.309 -.002
D4 -.189* .075 .013 -.338 -.041
D3 D1 -.020 .071 .777 -.161 .121
D2 .155* .078 .047 .002 .309
D4 -.034 .065 .606 -.163 .095
D4 D1 .014 .068 .844 -.122 .149
D2 .189* .075 .013 .041 .338
D3 .034 .065 .606 -.095 .163
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
A1 A2 .002 .030 .944 -.056 .061
A3 .046 .028 .101 -.009 .102
A4 -.176* .035 .000 -.245 -.108
A2 A1 -.002 .030 .944 -.061 .056
A3 .044 .028 .117 -.011 .099
A4 -.179* .035 .000 -.246 -.111
A3 A1 -.046 .028 .101 -.102 .009
A2 -.044 .028 .117 -.099 .011
A4 -.223* .033 .000 -.287 -.159
A4 A1 .176* .035 .000 .108 .245
A2 .179* .035 .000 .111 .246
A3 .223* .033 .000 .159 .287
B1 B2 -.086* .032 .007 -.148 -.024
B3 -.074* .031 .019 -.135 -.012
B4 -.015 .030 .618 -.073 .043
B2 B1 .086* .032 .007 .024 .148
B3 .013 .034 .713 -.055 .080
B4 .071* .032 .027 .008 .135
B3 B1 .074* .031 .019 .012 .135
B2 -.013 .034 .713 -.080 .055
B4 .059 .032 .066 -.004 .122
B4 B1 .015 .030 .618 -.043 .073
B2 -.071* .032 .027 -.135 -.008
B3 -.059 .032 .066 -.122 .004
Continued
Table 42. Pairwise comparisons of survey results: Participants’ ratings on whether the
arguments in each problem were appealing
Table 42 continued
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
C1 C2 .013 .033 .704 -.052 .078
C3 .021 .033 .523 -.044 .086
C4 .069* .031 .026 .008 .130
C2 C1 -.013 .033 .704 -.078 .052
C3 .008 .032 .796 -.055 .072
C4 .057 .031 .066 -.004 .117
C3 C1 -.021 .033 .523 -.086 .044
C2 -.008 .032 .796 -.072 .055
C4 .048 .030 .113 -.012 .108
C4 C1 -.069* .031 .026 -.130 -.008
C2 -.057 .031 .066 -.117 .004
C3 -.048 .030 .113 -.108 .012
D1 D2 .124* .032 .000 .062 .186
D3 .038 .035 .277 -.031 .106
D4 .082* .033 .014 .017 .147
D2 D1 -.124* .032 .000 -.186 -.062
D3 -.086* .031 .005 -.146 -.026
D4 -.042 .029 .151 -.099 .015
D3 D1 -.038 .035 .277 -.106 .031
D2 .086* .031 .005 .026 .146
D4 .044 .032 .171 -.019 .107
D4 D1 -.082* .033 .014 -.147 -.017
D2 .042 .029 .151 -.015 .099
D3 -.044 .032 .171 -.107 .019
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no
adjustments).
APPENDIX B. SURVEY RESULTS: COMPARISON BETWEEN SUBGROUPS
OF STUDENTS
Dependent Variable (I) School (J) School Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
A1.1 Group H Group L 0.01 0.056 0.853 -0.099 0.119
A1.2 Group H Group L 0.029 0.091 0.755 -0.151 0.208
A1.3 Group H Group L -0.068 0.078 0.383 -0.22 0.085
A2.1 Group H Group L -0.125 0.075 0.097 -0.273 0.023
A2.2 Group H Group L -0.096 0.082 0.244 -0.258 0.066
A2.3 Group H Group L -0.075 0.07 0.281 -0.213 0.062
A3.1 Group H Group L -0.016 0.072 0.825 -0.158 0.126
A3.2 Group H Group L 0.06 0.08 0.457 -0.098 0.218
A3.3 Group H Group L 0.038 0.07 0.59 -0.1 0.176
A4.1 Group H Group L -0.048 0.067 0.469 -0.179 0.083
A4.2 Group H Group L 0.079 0.077 0.308 -0.073 0.23
A4.3 Group H Group L 0.045 0.065 0.488 -0.082 0.172
A5.1 Group H Group L -0.004 0.044 0.93 -0.091 0.083
A5.2 Group H Group L -0.055 0.045 0.22 -0.144 0.033
A5.3 Group H Group L 0.01 0.04 0.8 -0.069 0.089
A5.4 Group H Group L 0.037 0.053 0.481 -0.067 0.141
B1.1 Group H Group L 0.119 0.081 0.143 -0.04 0.278
B1.2 Group H Group L -0.064 0.086 0.455 -0.234 0.105
B1.3 Group H Group L 0.029 0.082 0.722 -0.132 0.19
B2.1 Group H Group L 0.107 0.07 0.129 -0.031 0.245
B2.2 Group H Group L 0.053 0.086 0.54 -0.117 0.222
B2.3 Group H Group L 0.067 0.08 0.401 -0.09 0.224
B3.1 Group H Group L -0.052 0.086 0.548 -0.22 0.117
B3.2 Group H Group L -0.001 0.07 0.987 -0.139 0.137
B3.3 Group H Group L -0.042 0.066 0.526 -0.172 0.088
B4.1 Group H Group L 0.007 0.088 0.934 -0.165 0.18
B4.2 Group H Group L 0.047 0.071 0.504 -0.092 0.186
B4.3 Group H Group L 0.094 0.07 0.179 -0.043 0.232
B5.1 Group H Group L -0.058 0.043 0.176 -0.143 0.026
B5.2 Group H Group L 0.057 0.049 0.25 -0.04 0.154
B5.3 Group H Group L -0.069 0.048 0.148 -0.163 0.025
B5.4 Group H Group L 0.077 0.045 0.084 -0.01 0.165
Continued
Table 43 continued
Dependent Variable (I) School (J) School Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
C1.1 Group H Group L 0.012 0.081 0.88 -0.147 0.171
C1.2 Group H Group L 0.054 0.079 0.495 -0.101 0.208
C1.3 Group H Group L 0.079 0.07 0.262 -0.059 0.217
C2.1 Group H Group L 0.037 0.082 0.654 -0.125 0.198
C2.2 Group H Group L 0.012 0.073 0.873 -0.132 0.155
C2.3 Group H Group L 0.124 0.069 0.074 -0.012 0.26
C3.1 Group H Group L 0.125 0.079 0.114 -0.03 0.281
C3.2 Group H Group L 0.062 0.081 0.445 -0.098 0.222
C3.3 Group H Group L 0.076 0.073 0.3 -0.068 0.219
C4.1 Group H Group L 0.036 0.077 0.646 -0.117 0.188
C4.2 Group H Group L 0.047 0.076 0.538 -0.103 0.197
C4.3 Group H Group L 0.049 0.071 0.494 -0.091 0.189
C5.1 Group H Group L -0.044 0.048 0.366 -0.139 0.051
C5.2 Group H Group L -0.009 0.047 0.843 -0.102 0.084
C5.3 Group H Group L .103* 0.047 0.028 0.011 0.195
C5.4 Group H Group L -0.022 0.042 0.604 -0.105 0.061
D1.1 Group H Group L -0.053 0.075 0.479 -0.201 0.095
D1.2 Group H Group L -.172* 0.08 0.031 -0.329 -0.016
D1.3 Group H Group L 0.032 0.073 0.664 -0.111 0.175
D2.1 Group H Group L 0.076 0.088 0.39 -0.097 0.249
D2.2 Group H Group L 0.037 0.072 0.612 -0.105 0.179
D2.3 Group H Group L -0.001 0.069 0.99 -0.137 0.135
D3.1 Group H Group L 0.13 0.077 0.094 -0.022 0.281
D3.2 Group H Group L -0.004 0.078 0.956 -0.158 0.149
D3.3 Group H Group L 0.044 0.074 0.557 -0.102 0.189
D4.1 Group H Group L 0.013 0.085 0.882 -0.154 0.179
D4.2 Group H Group L .150* 0.075 0.045 0.003 0.297
D4.3 Group H Group L -0.042 0.068 0.542 -0.176 0.093
D5.1 Group H Group L -0.015 0.05 0.766 -0.113 0.083
D5.2 Group H Group L -0.036 0.042 0.39 -0.118 0.046
D5.3 Group H Group L -0.052 0.048 0.279 -0.147 0.042
D5.4 Group H Group L .093* 0.046 0.041 0.004 0.183
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Dependent Variable (I) Gender (J) Gender Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
A1.1 Female Male .064 .048 .190 -.032 .159
A1.2 Female Male -.082 .077 .290 -.233 .070
A1.3 Female Male .061 .067 .357 -.069 .192
A2.1 Female Male -.192* .064 .003 -.318 -.067
A2.2 Female Male -.121 .069 .080 -.257 .014
A2.3 Female Male -.106 .060 .076 -.224 .011
A3.1 Female Male -.087 .061 .152 -.206 .032
A3.2 Female Male -.054 .069 .435 -.189 .082
A3.3 Female Male -.009 .060 .882 -.127 .110
A4.1 Female Male -.076 .059 .197 -.192 .040
A4.2 Female Male -.009 .065 .895 -.137 .120
A4.3 Female Male .000 .056 .998 -.109 .109
A5.1 Female Male .040 .038 .293 -.035 .114
A5.2 Female Male -.076* .037 .043 -.149 -.002
A5.3 Female Male -.027 .034 .431 -.095 .041
A5.4 Female Male .085 .045 .060 -.004 .174
B1.1 Female Male -.054 .068 .425 -.188 .079
B1.2 Female Male -.076 .073 .299 -.220 .068
B1.3 Female Male .007 .069 .921 -.129 .143
B2.1 Female Male -.071 .060 .233 -.188 .046
B2.2 Female Male -.041 .073 .576 -.185 .103
B2.3 Female Male .046 .068 .500 -.088 .180
B3.1 Female Male -.088 .073 .226 -.231 .055
B3.2 Female Male -.055 .060 .356 -.172 .062
B3.3 Female Male -.032 .057 .573 -.144 .080
B4.1 Female Male -.057 .074 .443 -.202 .089
B4.2 Female Male -.004 .060 .943 -.123 .115
B4.3 Female Male -.045 .060 .453 -.164 .073
B5.1 Female Male .015 .037 .694 -.058 .088
B5.2 Female Male -.040 .042 .342 -.122 .042
B5.3 Female Male .064 .041 .124 -.017 .145
B5.4 Female Male -.033 .038 .385 -.108 .042
continued
Table 44 continued
Dependent Variable (I) Gender (J) Gender Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb (Lower Bound, Upper Bound)
C1.1 Female Male -.082 .069 .236 -.218 .054
C1.2 Female Male -.100 .066 .128 -.229 .029
C1.3 Female Male .020 .060 .740 -.097 .137
C2.1 Female Male -.090 .071 .203 -.229 .049
C2.2 Female Male .001 .061 .993 -.119 .120
C2.3 Female Male .033 .059 .577 -.082 .148
C3.1 Female Male -.050 .068 .465 -.184 .084
C3.2 Female Male -.053 .069 .442 -.188 .082
C3.3 Female Male .050 .063 .427 -.073 .173
C4.1 Female Male -.050 .066 .449 -.181 .080
C4.2 Female Male .017 .064 .796 -.110 .143
C4.3 Female Male .031 .060 .602 -.087 .150
C5.1 Female Male .064 .041 .124 -.017 .145
C5.2 Female Male -.022 .040 .593 -.101 .058
C5.3 Female Male -.039 .040 .336 -.117 .040
C5.4 Female Male -.002 .037 .948 -.075 .070
D1.1 Female Male .047 .064 .462 -.079 .174
D1.2 Female Male .022 .067 .739 -.110 .154
D1.3 Female Male .095 .061 .117 -.024 .215
D2.1 Female Male -.133 .074 .071 -.279 .012
D2.2 Female Male -.037 .061 .543 -.158 .083
D2.3 Female Male .025 .059 .672 -.091 .140
D3.1 Female Male .014 .066 .836 -.116 .143
D3.2 Female Male -.024 .067 .716 -.156 .107
D3.3 Female Male -.023 .063 .712 -.148 .101
D4.1 Female Male -.113 .072 .116 -.254 .028
D4.2 Female Male -.086 .063 .173 -.209 .038
D4.3 Female Male -.033 .057 .564 -.146 .080
D5.1 Female Male .058 .043 .176 -.026 .142
D5.2 Female Male .041 .036 .253 -.029 .111
D5.3 Female Male -.014 .041 .743 -.094 .067
D5.4 Female Male -.072 .039 .063 -.148 .004
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).
Source Dependent Variable Type III Sum of Squares df Mean Square F Sig. Partial Eta Squared Observed Power
Gender * School A1.1 .192 1 .192 .727 .394 .002 .136
A1.2 .364 1 .364 .512 .475 .001 .110
A1.3 .621 1 .621 1.197 .275 .003 .194
A2.1 8.728 1 8.728 19.357 .000* .045 .992
A2.2 .107 1 .107 .186 .666 .000 .072
A2.3 .669 1 .669 1.631 .202 .004 .247
A3.1 .226 1 .226 .506 .477 .001 .109
A3.2 .087 1 .087 .158 .692 .000 .068
A3.3 .041 1 .041 .097 .755 .000 .061
A4.1 .131 1 .131 .342 .559 .001 .090
A4.2 .115 1 .115 .227 .634 .001 .076
A4.3 .540 1 .540 1.497 .222 .004 .231
A5.1 .012 1 .012 .073 .788 .000 .058
A5.2 .079 1 .079 .474 .491 .001 .106
A5.3 .487 1 .487 3.540 .061 .008 .467
A5.4 .078 1 .078 .330 .566 .001 .088
B1.1 .304 1 .304 .545 .461 .001 .114
B1.2 .537 1 .537 .853 .356 .002 .152
B1.3 .002 1 .002 .004 .950 .000 .050
B2.1 .232 1 .232 .547 .460 .001 .114
B2.2 .845 1 .845 1.333 .249 .003 .210
B2.3 .667 1 .667 1.235 .267 .003 .198
B3.1 1.331 1 1.331 2.130 .145 .005 .308
B3.2 .065 1 .065 .155 .694 .000 .068
B3.3 .245 1 .245 .650 .421 .002 .127
B4.1 .446 1 .446 .681 .410 .002 .131
B4.2 .063 1 .063 .146 .703 .000 .067
B4.3 .103 1 .103 .242 .623 .001 .078
B5.1 .099 1 .099 .627 .429 .002 .124
B5.2 .246 1 .246 1.190 .276 .003 .193
B5.3 .064 1 .064 .330 .566 .001 .088
B5.4 .031 1 .031 .181 .671 .000 .071
Continued
Table 45 continued
Source Dependent Variable Type III Sum of Squares df Mean Square F Sig. Partial Eta Squared Observed Power
Gender * School C1.1 .167 1 .167 .302 .583 .001 .085
C1.2 .109 1 .109 .207 .649 .000 .074
C1.3 .617 1 .617 1.456 .228 .003 .226
C2.1 .134 1 .134 .232 .630 .001 .077
C2.2 .276 1 .276 .605 .437 .001 .121
C2.3 .051 1 .051 .123 .726 .000 .064
C3.1 .122 1 .122 .225 .635 .001 .076
C3.2 .236 1 .236 .420 .517 .001 .099
C3.3 .049 1 .049 .106 .745 .000 .062
C4.1 1.691 1 1.691 3.327 .069 .008 .444
C4.2 .440 1 .440 .888 .347 .002 .156
C4.3 .008 1 .008 .017 .895 .000 .052
C5.1 .169 1 .169 .845 .359 .002 .150
C5.2 .007 1 .007 .036 .850 .000 .054
C5.3 .030 1 .030 .160 .689 .000 .068
C5.4 .050 1 .050 .330 .566 .001 .088
D1.1 .380 1 .380 .785 .376 .002 .143
D1.2 1.210 1 1.210 2.270 .133 .005 .324
D1.3 .027 1 .027 .060 .807 .000 .057
D2.1 2.543 1 2.543 3.891 .049* .009 .503
D2.2 .149 1 .149 .332 .565 .001 .089
D2.3 .012 1 .012 .028 .867 .000 .053
D3.1 1.215 1 1.215 2.377 .124 .006 .337
D3.2 .543 1 .543 1.041 .308 .003 .175
D3.3 .495 1 .495 1.050 .306 .003 .176
D4.1 3.509 1 3.509 5.801 .016* .014 .671
D4.2 2.508 1 2.508 5.417 .020* .013 .641
D4.3 .438 1 .438 1.094 .296 .003 .181
D5.1 .181 1 .181 .849 .357 .002 .151
D5.2 .024 1 .024 .164 .686 .000 .069
D5.3 .000 1 .000 .001 .969 .000 .050
D5.4 .088 1 .088 .497 .481 .001 .108
*. The mean difference is significant at the .05 level.
APPENDIX C. INTERVIEW RESULTS: PAIRWISE COMPARISON OF THE
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.a 95% Confidence Interval for Differencea (Lower Bound, Upper Bound)
A1 A2 1.125 .639 .122 -.386 2.636
A3 .125 .666 .857 -1.451 1.701
A4 .750 .675 .303 -.846 2.346
A2 A1 -1.125 .639 .122 -2.636 .386
A3 -1.000 .500 .086 -2.182 .182
A4 -.375 .653 .584 -1.919 1.169
A3 A1 -.125 .666 .857 -1.701 1.451
A2 1.000 .500 .086 -.182 2.182
A4 .625 .625 .351 -.853 2.103
A4 A1 -.750 .675 .303 -2.346 .846
A2 .375 .653 .584 -1.169 1.919
A3 -.625 .625 .351 -2.103 .853
B1 B2 .125 .693 .862 -1.513 1.763
B3 -.125 .666 .857 -1.701 1.451
B4 .000 .598 1.000 -1.413 1.413
B2 B1 -.125 .693 .862 -1.763 1.513
B3 -.250 .881 .785 -2.334 1.834
B4 -.125 .766 .875 -1.937 1.687
B3 B1 .125 .666 .857 -1.451 1.701
B2 .250 .881 .785 -1.834 2.334
B4 .125 .441 .785 -.917 1.167
B4 B1 .000 .598 1.000 -1.413 1.413
B2 .125 .766 .875 -1.687 1.937
B3 -.125 .441 .785 -1.167 .917
continued
Table 46 continued
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.a 95% Confidence Interval for Differencea (Lower Bound, Upper Bound)
C1 C2 .000 .707 1.000 -1.672 1.672
C3 -.125 .581 .836 -1.498 1.248
C4 -.375 .653 .584 -1.919 1.169
C2 C1 .000 .707 1.000 -1.672 1.672
C3 -.125 .666 .857 -1.701 1.451
C4 -.375 .680 .598 -1.982 1.232
C3 C1 .125 .581 .836 -1.248 1.498
C2 .125 .666 .857 -1.451 1.701
C4 -.250 .796 .763 -2.133 1.633
C4 C1 .375 .653 .584 -1.169 1.919
C2 .375 .680 .598 -1.232 1.982
C3 .250 .796 .763 -1.633 2.133
D1 D2 -.625 .532 .279 -1.884 .634
D3 -.750 .648 .285 -2.282 .782
D4 -.625 .730 .420 -2.352 1.102
D2 D1 .625 .532 .279 -.634 1.884
D3 -.125 .666 .857 -1.701 1.451
D4 .000 .598 1.000 -1.413 1.413
D3 D1 .750 .648 .285 -.782 2.282
D2 .125 .666 .857 -1.451 1.701
D4 .125 .789 .879 -1.741 1.991
D4 D1 .625 .730 .420 -1.102 2.352
D2 .000 .598 1.000 -1.413 1.413
D3 -.125 .789 .879 -1.991 1.741
continued
Table 46 continued
(I) Argument (J) Argument Mean Difference (I-J) Std. Error Sig.a 95% Confidence Interval for Differencea (Lower Bound, Upper Bound)
E1 E2 -.375 .730 .623 -2.102 1.352
E3 -1.000 .627 .155 -2.482 .482
E4 -.625 .653 .370 -2.169 .919
E2 E1 .375 .730 .623 -1.352 2.102
E3 -.625 .532 .279 -1.884 .634
E4 -.250 .726 .741 -1.966 1.466
E3 E1 1.000 .627 .155 -.482 2.482
E2 .625 .532 .279 -.634 1.884
E4 .375 .625 .567 -1.103 1.853
E4 E1 .625 .653 .370 -.919 2.169
E2 .250 .726 .741 -1.466 1.966
E3 -.375 .625 .567 -1.853 1.103
Based on estimated marginal means
a. Adjustment for multiple comparisons: Least Significant Difference (equivalent to no
adjustments).
290