Cognitive Assessment: An Introduction to the Rule Space Method
Multivariate Applications Series
This series, sponsored by the Society of Multivariate Experimental Psychology, aims to
apply complex statistical methods to significant social or behavioral issues in a way that is
accessible to a nontechnical readership (e.g., nonmethodological researchers, teachers,
students, government personnel, practitioners, and other professionals). Applications from
a variety of disciplines such as psychology, public health, sociology, education, and business are
welcome. Books can be single- or multiple-authored or edited volumes that (1) demonstrate the
application of a variety of multivariate methods to a single, major area of research; (2) describe
a multivariate procedure or framework that could be applied to a number of research areas; or
(3) present a variety of perspectives on a controversial subject of interest to applied multivariate
researchers.
There are currently 14 books in the series:
• What If There Were No Significance Tests?, coedited by Lisa L. Harlow, Stanley A. Mulaik,
  and James H. Steiger (1997)
• Structural Equation Modeling with LISREL, PRELIS, and SIMPLIS: Basic Concepts,
  Applications, and Programming, written by Barbara M. Byrne (1998)
• Multivariate Applications in Substance Use Research: New Methods for New Questions,
  coedited by Jennifer S. Rose, Laurie Chassin, Clark C. Presson, and Steven J. Sherman
  (2000)
• Item Response Theory for Psychologists, coauthored by Susan E. Embretson and Steven P.
  Reise (2000)
• Structural Equation Modeling with AMOS: Basic Concepts, Applications, and Programming,
  written by Barbara M. Byrne (2001)
• Conducting Meta-Analysis Using SAS, written by Winfred Arthur, Jr., Winston Bennett,
  Jr., and Allen I. Huffcutt (2001)
• Modeling Intraindividual Variability with Repeated Measures Data: Methods and Applications,
  coedited by D. S. Moskowitz and Scott L. Hershberger (2002)
• Multilevel Modeling: Methodological Advances, Issues, and Applications, coedited by Steven
  P. Reise and Naihua Duan (2003)
• The Essence of Multivariate Thinking: Basic Themes and Methods, written by Lisa Harlow
  (2005)
• Contemporary Psychometrics: A Festschrift for Roderick P. McDonald, coedited by Albert
  Maydeu-Olivares and John J. McArdle (2005)
• Structural Equation Modeling with EQS: Basic Concepts, Applications, and Programming,
  Second Edition, written by Barbara M. Byrne (2006)
• Introduction to Statistical Mediation Analysis, written by David P. MacKinnon (2008)
• Applied Data Analytic Techniques for Turning Points Research, edited by Patricia Cohen
  (2008)
• Cognitive Assessment: An Introduction to the Rule Space Method, written by Kikumi K.
  Tatsuoka (2009)
Anyone wishing to submit a book proposal should send the following: (1) author/title; (2) time-
line including completion date; (3) brief overview of the book’s focus, including table of contents
and, ideally, a sample chapter (or chapters); (4) a brief description of competing publications; and
(5) targeted audiences.
For more information, please contact the series editor, Lisa Harlow, at Department of
Psychology, University of Rhode Island, 10 Chafee Road, Suite 8, Kingston, RI 02881-0808;
phone (401) 874-4242; fax (401) 874-5562; or e-mail LHarlow@uri.edu. Information may also be
obtained from members of the advisory board: Leona Aiken (Arizona State University), Gwyneth
Boodoo (Educational Testing Service), Barbara M. Byrne (University of Ottawa), Patrick Curran
(University of North Carolina), Scott E. Maxwell (University of Notre Dame), David Rindskopf
(City University of New York), Liora Schmelkin (Hofstra University), and Stephen West (Arizona
State University).
COGNITIVE
ASSESSMENT
An Introduction to
the Rule Space Method
Kikumi K. Tatsuoka
Routledge
Taylor & Francis Group
270 Madison Avenue
New York, NY 10016

Routledge
Taylor & Francis Group
27 Church Road
Hove, East Sussex BN3 2FA
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, trans-
mitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter
invented, including photocopying, microfilming, and recording, or in any information storage or retrieval
system, without written permission from the publishers.
Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are
used only for identification and explanation without intent to infringe.
Tatsuoka, Kikumi K.
Cognitive assessment : an introduction to the rule space method / Kikumi K.
Tatsuoka.
p. cm.
Includes index.
ISBN 978-0-8058-2828-3 (hardback) -- ISBN 978-1-84872-813-4 (pbk.)
1. Cognition--Testing. I. Title.
BF311.T295 2009
153.9’3--dc22 2008032623
Preface
As long as the tests share the same set of attributes, analysis results from
several tests consisting of different sets of items, such as the Trends in
International Mathematics and Science Study (TIMSS) and several parallel
forms of an assessment, can be combined for various secondary analyses.
The RSM, which can determine an individual’s strengths and weaknesses,
has been applied to the Preliminary SAT (PSAT) to generate scoring reports
that tell schools, teachers, and parents exactly what a total score of
500 means. This book evolved from hundreds of journal articles, technical
reports, Ph.D. theses, conference presentations, and book chapters in
which various segments of RSM were introduced and discussed. Because
RSM belongs to the family of statistical pattern recognition and classification
methods popular in engineering, this book will be useful to
graduate students in a variety of disciplines. The book is written primarily
for graduate students in quantitative and educational psychology and in
instructional technology, but it is also applicable to medical diagnosis, a
variety of applications in computer science, and engineering.
The conceptual framework of RSM is influenced by my early education
in mathematics, in which I specialized in abstract algebra, the theories
of functional space, and optimization by vector space methods, and my
experience developing software to analyze students’ online performance
using instructional materials on the Programmed Logic for Automatic
Teaching Operations (PLATO) system at the University of Illinois. These
online data were so massive and extremely complex that the traditional
psychometric theories, such as classical test theory and item response
ETS and College Board, and James Corter at Teachers College, Columbia
University. Many thanks to the chief editor, Debra Reigert, and the editorial
team at Taylor & Francis, especially the project editor, Susan Horwitz;
to Jane Hye for the developmental editing of my book; and to the reviewers
among the Society of Multivariate Experimental Psychology (SMEP) board
members for helpful comments and suggestions. Last, I still miss my late
husband, Maurice, who worked with me at the beginning stage of the
project and who developed the RSM classification procedure with
me by discussing it day and night from many angles. I am very grateful
to my younger son, Curtis (a statistician); his mathematician friend,
Dr. Ferenc Varadi; and my mathematician friend, Robert Baillie, for
writing various computer programs essential for the development of the
methodology; and to my older son, Kay, who is also a statistician and
mathematician, for valuable discussions.
1
Dimensionality of Test Data and
Aberrant Response Patterns
systems. If a new erroneous rule was discovered, then the program would
be modified to include it. The system could discover neither new erroneous
rules absent from the initial list nor common erroneous rules that arose
from a different strategy or a new method of solving a problem.
Example 1.1
A water ski tow handle makes an isosceles triangle. If one of the congru-
ent angles is 65 degrees, what is the measure of the angle?
F 65°
G 75°
H 155°
I 170°
Example 1.2
An electrician has a plastic pipe, used for underground wiring, which is
15 feet long. He needs plastic pieces that are 6.5 inches long to complete
his job. When he cuts the pipe, how many pieces will he be able to use for
his job?
Two thirds of the students answered this question correctly. We counted
the number of words used in the stem and found 52 words; however, the
problem requires translation of a word problem into an arithmetic proce-
dure in order to solve this item: P1.
Because two different units, feet and inches, are used, we have to convert feet
to inches at 12 inches per foot, so 15 feet is 180 inches: S1.
Each piece is 6.5 inches long, so the pipe yields 27 usable pieces, because
180/6.5 ≈ 27.7: P2.
Dividing 180 by a decimal number, 6.5, belongs to the content domain
of C2: C2.
There are two steps—the first to convert the unit to the common unit,
and the second to carry out the computation: P9.
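The arithmetic behind this coding can be checked with a short sketch. The variable names and the list layout are illustrative assumptions, not part of the RSM; the attribute codes are the ones assigned to the item above.

```python
import math

# Quantities stated in the item stem of Example 1.2.
PIPE_LENGTH_FEET = 15
PIECE_LENGTH_INCHES = 6.5
INCHES_PER_FOOT = 12

# Step 1 (S1): convert the pipe length to the common unit, inches.
pipe_length_inches = PIPE_LENGTH_FEET * INCHES_PER_FOOT

# Step 2 (P2, C2): divide by a decimal number and keep only whole pieces.
usable_pieces = math.floor(pipe_length_inches / PIECE_LENGTH_INCHES)
leftover_inches = pipe_length_inches - usable_pieces * PIECE_LENGTH_INCHES

# Attributes the text assigns to this item (illustrative representation).
item_attributes = ["P1", "S1", "P2", "C2", "P9"]

print(pipe_length_inches, usable_pieces, leftover_inches, item_attributes)
```

The leftover piece is shorter than 6.5 inches, which is why the quotient is rounded down rather than to the nearest whole number.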
TABLE 1.1
A Modified List of Knowledge, Skill, and Process Attributes Derived to Explain
Performance on Mathematics Items From the TIMSS-R (1999) for Population 2
(Eighth Graders) for Some State Assessment
Content Attributes
C1 Basic concepts and operations in whole numbers and integers
C2 Basic concepts and operations in fractions and decimals
EXP Powers, roots, and scientific expression of numbers are separated from C2
C3 Basic concepts and operations in elementary algebra
C4 Basic concepts and operations in two-dimensional geometry
C5 Data and basic statistics
PROB Basic concepts, properties, and computational skills
Process Attributes
P1 Translate, formulate, and understand (only for seventh graders) equations and
expressions to solve a problem
P2 Computational applications of knowledge in arithmetic and geometry
P3 Judgmental applications of knowledge in arithmetic and geometry
P4 Applying rules in algebra and solving equations (plugging in included for
seventh graders)
P5 Logical reasoning—includes case reasoning, deductive thinking skills, if-then,
necessary and sufficient conditions, and generalization skills
P6 Problem search; analytic thinking and problem restructuring; and inductive thinking
P7 Generating, visualizing, and reading figures and graphs
P9 Management of data and procedures, complex, and can set multigoals
P10 Quantitative and logical reading (less than, must, need to be, at least, best, etc.)
Tatsuoka, Corter, and Tatsuoka (2004) and Tatsuoka et al. (2006) compared
the mathematical thinking skills of 20 countries and found that the countries
teaching more geometry performed better on mathematical thinking skills. Dean
(2006) found that students learned mathematical thinking skills better in the
seventh grade and achieved much higher scores on advanced mathematics
in the 12th grade. Tatsuoka and Boodoo (2000), Kuramoto et al. (2003),
Dogan (2006), and Chen et al. (2008) applied RSM to Japanese, Turkish, and
Taiwanese tests to examine their construct validity.
total scores and to use the information for selection, grading, and predict-
ing examinees’ future performance; however, providing useful diagnostic
information for improving teaching and learning has not been important to
psychometricians. At the present time, though, many test users want test-
ing to be an integral part of instruction so that prescribed reports can guide
teachers and students to attain higher educational goals.
TABLE 1.2
Response of Four Eighth-Grade Students Who Consistently Applied Their
Erroneous Algorithms in Response to Six Addition Problems in Signed Numbers
Responses by Student
Note: Rules of the four students that were validated by interviews (Birenbaum, 1981) are as
follows: Student 1 treats parentheses as absolute value notation; Student 2 adds the two
numbers and takes the sign of the number with the larger absolute value; Student 3 mistypes
the answers to items 3, 4, and 5, or has errors in whole-number addition; and Student 4
moves the second number from the origin instead of the position of the first number on the
number line.
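The effect of consistently applied erroneous rules can be illustrated with a small sketch. The items below are hypothetical and are not the six items of Table 1.2, and the two rule functions encode one possible reading of the descriptions of Students 1 and 2 in the note; both are assumptions made only for illustration.

```python
# Hypothetical items of the form a + b. The flag records whether b is
# written in parentheses, as in -6 + (-4). Not taken from Table 1.2.
ITEMS = [(-6, -4, True), (8, -3, True), (4, 9, False), (-2, 7, False)]


def correct_rule(a, b, b_in_parens):
    """Ordinary signed-number addition."""
    return a + b


def student1_rule(a, b, b_in_parens):
    """Assumed reading: a parenthesized number is treated as its absolute value."""
    return a + abs(b) if b_in_parens else a + b


def student2_rule(a, b, b_in_parens):
    """Assumed reading: add the magnitudes, then attach the sign of the
    operand with the larger absolute value."""
    larger = a if abs(a) >= abs(b) else b
    sign = -1 if larger < 0 else 1
    return sign * (abs(a) + abs(b))


def score_pattern(rule):
    """Return the 1/0 item-score pattern a rule produces on ITEMS."""
    return [int(rule(*item) == correct_rule(*item)) for item in ITEMS]


print(score_pattern(student1_rule))  # [0, 0, 1, 1]
print(score_pattern(student2_rule))  # [1, 0, 1, 0]
```

Under these assumptions, the second rule scores the first item as correct even though the reasoning is wrong, which is the kind of response a right-or-wrong scoring system cannot distinguish from mastery.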
test crystallizes into the two item types, addition and subtraction items. A
scoring system that takes into consideration the process rather than rely-
ing solely on the outcome of the right or wrong scores resulted in signifi-
cant improvement in the psychometric properties.
Tatsuoka et al. (2004) found that the underlying knowledge and process-
ing skills of TIMSS showed multidimensionality, and their four factors
given in Table 1.3 supported the conclusions derived independently from
some other statistical analyses and background questionnaires given to
teachers and schools in 20 nations.
[Figure 1.1: plot of the number of scores adjusted (vertical axis) against task number (horizontal axis) for the 16 tasks, grouped into subtraction and addition tasks. Tasks are labeled by operand pattern, such as L + –S or –S – (–L), where L = larger absolute value and S = smaller absolute value.]
FIGURE 1.1
Number of scores adjusted in each test task.
associates closely with a majority of students who took it and the kinds of
skills and knowledge they used correctly in taking test items at different
proficiency levels.
Detecting “aberrant response patterns” thus appears to be important for
data analysis. In this section, we will discuss how one can detect aberrant
response patterns, which are different from “normal” item response patterns.
Several indices have been developed and discussed in Meijer (1994);
we will leave detailed discussion to Meijer and will introduce only those
indices that are closely related to cognitive perspectives.
Two indices were developed for measuring the degree of conformity or
consistency of an individual examinee’s response pattern on a set of items
(Tatsuoka, 1981; Tatsuoka & Tatsuoka, 1980, 1982, 1983). The first, called the
[Figure 1.2: two-dimensional multidimensional scaling plot of the 16 tasks, panel title “Original Task Scores: A 2-Dimensional Plot,” with both axes running from –1.5 to 1.5.]
FIGURE 1.2
Analyses of multidimensional scaling for original and modified datasets.
TABLE 1.3
Rotated Component Matrix From the Principal Component Analysis of All
Attributes, Performed on Mean Attribute Mastery Probability Profiles
Across 20 Countries
Attribute: Description F1 F2 F3 F4
been made to give partial credit for partial knowledge, procedures for dis-
crediting correct answers arrived at by incorrect reasons have typically
been confined to the use of formulas for correction for guessing. This lack
may not be serious for standardized ability tests, but is very important in
the context of achievement testing, which is an integral part of the instruc-
tional process (Birenbaum & Tatsuoka, 1983). Here, the test must serve the
purpose of diagnosing what type of misconception exists, so that appropri-
ate remedial instruction can be given (Glaser, 1981; Nitko, 1980; Tatsuoka,
1981). This calls for the study of cognitive processes that are used in solv-
ing problems, and identifying where the examinee went astray even when
the correct answer was produced. This type of diagnostic testing was pio-
neered by Brown and Burton (1978). Their celebrated BUGGY is essentially
an adaptive diagnostic testing system, which utilizes network theory for
routing examinees through a set of problems in subtraction of positive
integers. The branching is such that each problem serves to narrow down
the scope of “hypotheses” as to the type(s) of misconception(s) held by the
examinee until finally a unique diagnosis is made. Baillie and Tatsuoka
(1982) developed a diagnostic testing system called FBUG, which differed
from BUGGY in that the test was not adaptive but “conventional” (i.e.,
linear). The test was constructed for use in conjunction with lessons in the
addition and subtraction of signed numbers (positive and negative inte-
gers) for eighth-grade students, and consisted of four parallel subtests of
16 items each. A system of error vectors was developed for diagnosing the
type(s) of error committed.
Crucial to this system of error diagnosis is the ability to tell whether
and to what extent a response pattern is “typical” or “consistent.” We may
speak of consistency with respect to either the average response pattern
of a group or an individual’s own response pattern over time. To measure
consistency in these two senses, two related but distinct indices are developed
in this book. They are called the norm conformity index (NCI) and the
individual consistency index (ICI), respectively.
It can be shown that a certain weighted average of the NCIs
of the members of a group yields one of Cliff’s (1977) group consistency
indices, Ct1. The higher the value of Ct1, the closer the group dataset is to
being unidimensional in the sense of forming a Guttman scale. This notion
of unidimensionality differs from the factor-analytic unidimensionality
of data, and Wise (1981) constructed several counterexamples. One sample
conformed closely to the Guttman scale but was multidimensional by factor
analysis, whereas the others were unidimensional by factor analysis and
principal component analysis but did not form a Guttman scale.
Response patterns produced by erroneous rules are usually quite differ-
ent from the average response pattern and not conforming to the Guttman
scale; hence, removing individuals with low (usually negative) NCI
values—that is, those with aberrant response patterns—will yield a data-
set that is more nearly unidimensional (Tatsuoka & Tatsuoka, 1982, 1983).
The ICI, on the other hand, measures the degree to which an individual’s
response pattern remains invariant over time; thus, for example, in
the signed-number test consisting of four parallel subtests, the ICI indicates
whether an examinee’s response pattern changes markedly from
one subtest to the next or remains relatively stable. Low ICI values, indicating
instability of response pattern, would suggest that the examinee
was still in the early stages of learning, changing his or her method for
solving equivalent problems from one wave to the next. A high ICI value,
reflecting stability of response pattern, would signal the nearing of mastery
or a learning plateau.
Although the NCI and ICI can each serve useful purposes, as suggested
above and illustrated in detail below, examining them jointly opens up
various diagnostic possibilities, as does the consideration of each of them
in combination with the total test score.
$$C = \frac{2U_a}{U} - 1 \tag{1.2}$$
Example 1.3
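Let $S = (10110)$ for the items ordered by $o$; then

$$N = \bar{S}'S = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 1 \end{pmatrix} \times (1\;0\;1\;1\;0) = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{pmatrix}$$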
Example 1.4
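Let $S$ be a Guttman vector, $S = (00111)$. Then $\bar{S}'S$ will be

$$\bar{S}'S = \begin{pmatrix} 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

For the reversed pattern, $S = (11100)$, whose complement is $\bar{S} = (00011)$, it will be

$$\bar{S}'S = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{pmatrix}$$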
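The matrices in Examples 1.3 and 1.4 can be reproduced with a minimal sketch. It assumes that N is the outer product of the complement of S with S, that U in Equation 1.2 is the sum of all elements of N, and that U_a is the sum of the elements above the main diagonal; these definitions are not stated in the passage above, so treat them as assumptions. The function names are illustrative, and returning None when U = 0 is a choice made here, not a rule from the text.

```python
def complement(s):
    """Flip each binary item score (1 -> 0, 0 -> 1)."""
    return [1 - x for x in s]


def dominance_matrix(s):
    """N = (complement of S)' S: row i is S if item i was missed, else zeros."""
    return [[sb * x for x in s] for sb in complement(s)]


def nci(s):
    """C = 2 * U_a / U - 1, under the assumed definitions of U_a and U."""
    n = dominance_matrix(s)
    size = len(s)
    u = sum(sum(row) for row in n)
    if u == 0:
        # All-correct or all-incorrect patterns give an empty matrix;
        # the index is left undefined in this sketch.
        return None
    u_a = sum(n[i][j] for i in range(size) for j in range(i + 1, size))
    return 2 * u_a / u - 1


for pattern in ([1, 0, 1, 1, 0], [0, 0, 1, 1, 1], [1, 1, 1, 0, 0]):
    print(pattern, nci(pattern))
```

Under these assumptions, the Guttman vector (00111) gives C = 1, the reversed pattern (11100) gives C = -1, and the pattern (10110) of Example 1.3 falls in between.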
From the foregoing example, the first two of the following properties of
Cp(o) may be inferred. The other properties are illustrated by further exam-
ples, and intuitive arguments are given to substantiate them. Their formal
proofs are not difficult but tedious, and therefore have been omitted.
¤S ³
C p (o) (S1 , S 2 ) ¥ 1 ´ S1` S1 S `2 S 2
¦ S2 µ