Professional Documents
Culture Documents
Automatic Detection of Off-Task Behaviors in Intelligent Tutoring Systems With Machine Learning Techniques
Automatic Detection of Off-Task Behaviors in Intelligent Tutoring Systems With Machine Learning Techniques
Automatic Detection of Off-Task Behaviors in Intelligent Tutoring Systems With Machine Learning Techniques
3, JULY-SEPTEMBER 2010
Abstract—Identifying off-task behaviors in intelligent tutoring systems is a practical and challenging research topic. This paper
proposes a machine learning model that can automatically detect students’ off-task behaviors. The proposed model only utilizes the
data available from the log files that record students’ actions within the system. The model utilizes a set of time features, performance
features, and mouse movement features, and is compared to 1) a model that only utilizes time features and 2) a model that uses time
and performance features. Different students have different types of behaviors; therefore, personalized version of the proposed model
is constructed and compared to the corresponding nonpersonalized version. In order to address data sparseness problem, a robust
Ridge Regression algorithm is utilized to estimate model parameters. An extensive set of experiment results demonstrates the power
of using multiple types of evidence, the personalized model, and the robust Ridge Regression algorithm.
1 INTRODUCTION
data, systems must detect off-task behaviors using only the not learned with Ridge Regression). We show that utilizing
data from students’ actions within the tutoring software. multiple types of evidence, personalization, and the robust
This brings its own challenges since it has been found that Ridge Regression technique improves the effectiveness of
understanding students’ intentions only from their actions off-task detection.
within a system can be challenging [25]. Yet, off-task The rest of the paper is structured as follows: Section 2
behavior detectors, especially personalized off-task detec- introduces the data set used in this work. Section 3
tors that consider the interuser behavior variability (e.g., describes the Least-Squares and Ridge Regression techni-
using more or less time to solve the problems, having ques. Section 4 proposes several approaches for modeling
difficulties with particular types of questions, and/or off-task behaviors as well as the personalization of those
problems or different mouse usage types, etc.) have the modeling approaches. Section 5 discusses the experimental
potential of improving students’ learning experiences. methodology. Section 6 presents the experiment results, and
To the best of our knowledge, there is very limited prior finally, Section 7 concludes this work.
work on the automatic detection of off-task behavior
utilizing mouse movement data [10], [13] and there is no 2 DATA
prior work on the personalized detection of off-task
behaviors with multifeature models. Prior works mainly Data from a study conducted in 2008 in an elementary school
focus on detecting gaming behaviors [3], [4], [7], [27], which were used in this work. The study was conducted in
is also a quite different task from detecting off-task mathematics classrooms using a math tutoring software
behaviors and none of the researchers in these works (that had been developed by the authors). The tutoring
utilized mouse movement data. Other prior work that software taught problem solving skills for Equal Group and
Multiplicative Compare problems. These two problem types
focused on off-task behavior detection only analyzed time
are a subset of the most important mathematical word
and performance features (that can be extracted from
problem types that represent about 78 percent of the
the logs of user-system interaction) and did not utilize
problems in fourth-grade mathematics textbooks [20]. First,
mouse movement data [2]. One of the works using mouse
in the tutoring system, a conceptual instruction session is
movement data was done by De Vicente and Pain. They had
studied by the students followed by problem solving
human participants use mouse movement data as well as
sections to test their understanding. Both the conceptual
data from student-tutor interactions for motivation diag-
instruction and the problem solving parts require students to
nosis, but did not use it for automated detection of the off-
work one-on-one with the tutoring software and if students
task behaviors [13]. Cetintas et al. tracked and analyzed
fail to pass a problem solving session, they have to repeat the
mouse movement data along with performance and time
corresponding conceptual instruction and the problem
data to automatically detect off-task behaviors; however,
solving session. The tutoring software has a total of four
these researchers did not consider personalization [10]. In conceptual instruction sessions and 11 problem solving
another work, Cetintas et al. also used mouse movement sessions that have 12 questions each. The problem solving
data along with performance, problem, and time features to sessions include four sessions for Equal Group worksheets,
predict the correctness of problem solving, which is a very four sessions for Multiplicative Compare worksheets, and
different task from the off-task detection [9]. three sessions for Mixed worksheets, each of which includes
Although time and performance features are useful for six EG and six MC problems. The tutoring software is
improving the effectiveness of off-task behavior detection, supported with animations, audio (with more than 500 audio
the vast majority of these works ignore: 1) an important files), instructional hints, exercises, etc.
type of data, namely mouse tracking data, which can be The study included 12 students that include four
easily stored in and retrieved from user-system logs. students with learning disabilities, one student with
Furthermore, all prior works ignore 2) the approach of emotional disorder, and one student with emotional
personalization to capture different characteristics of disorder combined with a mild intellectual disability.
students’ that can lead to different behavior types, which Students used the tutor for several sessions that last about
are hard to identify with nonpersonalized off-task detectors 30 minutes. Students used the tutor for an average of
since these models can not recognize interuser variability 18.2500 sessions with standard deviation of 3.3878 sessions.
for different behavior types. The evidence about students’ on and off-task behaviors
This paper proposes a machine learning model that can was observed during these sessions. Outside observations
automatically identify the off-task behaviors of a student by of behavior were used for collecting the evidence. Self-
utilizing multiple types of evidences including time, report was not used to assess students’ on and off-task
performance, and mouse movement features and by behaviors due to the concern that this measure might
utilizing personalization to capture interuser behavior influence students’ off-task behaviors and learning out-
variability. To address data sparseness problem, the comes. A specific student is coded as either on-task or off-
proposed model utilizes a robust ridge regression technique task by the observer as in past observatory studies of on-
to estimate model parameters. The proposed model is task and off-task behaviors [15], [17], [18].
compared to 1) a model that only utilizes time features and Most of the observations were carried out by a single
2) a model that uses time and performance features. observer. An observation includes watching a student solve
Furthermore all models are compared to each other 1) when a problem (i.e., starting from the time s/he starts to solve it
personalization is not used and 2) when data sparseness until s/he submits her/his final answer). In an observation,
problem is not addressed (i.e., when model parameters are if a student has been seen in an off-task activity such as
230 IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 3, NO. 3, JULY-SEPTEMBER 2010
TABLE 1
Details of Observed Data for On-Task and Off-Task Behaviors
Average number of observations per student can be seen under the Average subcolumn of the # of Observations Per Student column and the
standard deviation of the number of observations among students can be seen under the Std. Deviation subcolumn.
1) talking with another student about anything other than 3 METHODS: LEAST SQUARES AND RIDGE
the subject material, 2) inactivity (e.g., putting her/his head REGRESSION
on the desk, staring into space, etc.), 3) off-task solitary
behavior (i.e., any activity that does not involve the This section first describes the technique of Least Squares,
then introduces the problem of overfitting, and finally, talks
tutoring system or another student such as surfing the
about the technique of Ridge Regression.
web) for more than 30 seconds (for any of these three types
The linear model fit has been a very important method in
of behaviors), the behavior was coded as off-task. Brief
statistics for the past 30 years and is still one of the most
reflective pauses and “gaming the system” behaviors were
powerful prediction techniques. The simplest linear model
not treated as off-task behaviors. In order to avoid any
for regression involves a linear combination of input
observer bias, students were observed sequentially. Syn-
variables as follows:
chronization of field observations (which included the
information about the problem number, student, time, and wÞ ¼ w0 þ w1 x1 þ þ wn xn ¼ wT x;
xn ; w
yðx ð1Þ
date) and the log data is straightforward since the log data
T
include the date and time pairs for every student action where x ¼ ð1; x1 ; . . . ; xn Þ is an instance of training data of
such as mouse movements, answer inputs, starting to solve D þ 1 dimensions and w ¼ ðw0 ; w1 ; . . . ; wn ÞT are model
a problem, finishing solving a problem etc. A total of coefficients (w0 is the bias or the intercept). To fit a linear
407 observations were taken, corresponding to approxi- model to a set of training data, the method of the Least
mately 34 observations per student. Details about the on- Squares is one of the most popular techniques. It has been
task and off-task observations are given in Table 1. Data noted in prior work that for machine learning agents that act
from six students were used as training data to build the as “black box” when making predictions, the exact
models for making predictions for the other six students mechanism used by the agent is secondary (i.e., any machine
(i.e., who are used as the test data). Note that students with learning method that performs function approximation will
learning disabilities and emotional disorders are split work and that determination of the model’s inputs/outputs
between the training and test groups uniformly in order is the critical issue) [5], [6]. Since this paper uses a machine
not to have any bias on the learned models. Particularly, learning agent acting as “black box” for off-task prediction,
data from two students with learning disabilities and one the Least Squares is used as the machine learning agent.
student with emotional disorder were used along with the The Least-Squares method determines the values of
data from three normal achieving students for training. In model coefficients w by minimizing the sum of the squares
the same way, data from the remaining two students with error between predictions yðx wÞ for each data point xn
xn ; w
learning disabilities and one student with emotional and the corresponding target value tn . The sum of squares
disorder combined with a mild intellectual disability error is defined as follows:
were used for testing along with the data from the three X
N X
N
remaining normal achieving students. Details about the ED ðw
wÞ ¼ ftn yðx wÞg2 ¼
xn ; w ftn wT xn g2 ; ð2Þ
training and test splits can be seen in Table 2. n¼1 n¼1
constraints on model parameters in order to discourage them many students while others take relatively much less time
from reaching large values that lead to overfitting [8], [14]. depending on factors such as difficulty level, familiarity, etc.
Ridge Regression is a technique that better controls overfitting In this work, both an absolute time feature and a relative
by adding a quadratic regularization punishment of time feature are used for time only modeling. The relative
wÞ ¼ 1=2 wT w to the data-dependent error ED ðw
EW ðw wÞ. After time feature used in this work is defined as the time spent
the addition of EW ðw wÞ, the total error function that the by a user minus the average time spent on the same
technique of the Ridge Regression aims to minimize becomes problem by all other students.
Time only modeling in this work serves as the baseline
XN 2 for all other models and will be referred as TimeOnly Mod.
ET OT AL ¼ ED ðw
wÞ þ EW ðw
wÞ ¼ tn wT xn þ w T w;
n¼1
2
4.1.2 Time and Performance-Based Modeling
ð4Þ (TimePerf Mod)
where is the regularization coefficient that controls the Time-related features are useful in many situations; how-
relative importance of data-dependent error ED ðwwÞ and the ever, there are lots of other possible data that can be good
regularization term EW ðw
wÞ. The regularization coefficient in indicators of off-task behaviors such as the probability that
this work is learned with twofold cross validation in the the student possesses the prior knowledge to answer the
training phase. The exact minimizer of the total error given question correctly. The percentage of correctness
function can be found in closed form as follows: across all previous problems for a student has recently been
used as an indicator of this feature for off-task gaming
w RIDGE ¼ ðI þ T Þ1 t
t; ð5Þ behavior detection [27] and a similar measure has been used
in Baker’s recent off-task behavior detection effort [2].
which is the Ridge Regression solution of the parameters of In addition to two time features that have been
the model. mentioned, this modeling approach incorporates eight more
features consisting of four main features each of which has
4 MODELING APPROACHES AND PERSONALIZATION one absolute and one relative version that are calculated by
student’s feature value minus the mean feature value across
This section describes several modeling approaches for off-
all students for the current worksheet. These eight new
task behavior detection as well as the personalized versions
features are used as a measure of the probability that the
of the models.
student knew the skills asked in a question. The first feature
4.1 Several Modeling Approaches is the percentage of correct answers so far in a problem solving
This section describes the models that are used for evaluation: worksheet. Each problem solving worksheet consists of
1) a model that only considers time features (i.e., Time Only 12 math word problems and a problem is counted as correct
Modeling), 2) another modeling approach that considers only if all question boxes for the problem are filled
performance features as well as time features (i.e., Time and correctly. The percentage of correctly solved problems up
to a current problem in a worksheet is a good indicator of
Performance-Based Modeling), and finally, 3) a more
students’ success for the current problem. Second, third,
advanced model that incorporates mouse movement features
and fourth features help to assess students’ partial skills
with time and performance-related features (i.e., Time,
that are needed for the solution of a problem when they
Performance, and Mouse-Tracking-Based Modeling).
cannot give a full answer. Such partial skills for a problem
4.1.1 Time Only Modeling (TimeOnly Mod) include answers to 1) diagram boxes which check students’
mapping of the information given in a problem into an
Modeling students’ off-task behaviors just by considering
abstract model, 2) an equation box which checks whether a
the time taken on an action has been considered a useful
student can form a correct equation from the information
approach in the prior work [4], [7], [22]. This modeling
given in a problem, and 3) a final answer box which checks
approach only considers time-related features as a good
whether a student can solve the asked unknown in a
discriminator of on-task and off-task behaviors. Setting a
problem correctly. The corresponding features are percen-
cutoff on how much time an action/problem should take tage of correct diagram answers so far, percentage of correct
and categorizing all actions that last longer than that cutoff equation box answers so far, and percentage of final answers so
is one of the simplest and most intuitive ways that have far. The values of these features are calculated for a current
previously been applied to determine whether a student is problem solving worksheet and provide the percentage of
reading hints carefully [22], to determine whether a student correct answers given by a student for the associated partial
is using guessing to solve a problem [7], and by Baker as a skill boxes (of each feature) of all the solved problems of the
baseline for his multifeature off-task detection model in his current worksheet.
recent work [2]. Baker uses an absolute time feature which is The approach that uses time and performance features
the time that the action of interest takes the user as well as a will be referred as TimePerf Mod.
relative time feature that is expressed in terms of the
number of standard deviations the actions’ time was faster 4.1.3 Time, Performance, and Mouse-Tracking-Based
or slower than the mean time that actions take for all other Modeling (TimePerfMouseT Mod)
students. Idea of relative time features is also quite intuitive Incorporation of performance features into the time only
since some actions/problems might take more time for modeling is an effective way of improving the off-task
232 IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 3, NO. 3, JULY-SEPTEMBER 2010
TABLE 4
Results of the Nonpersonalized Version of TimePerfMouseT Mod Model in
Comparison to Nonpersonalized Versions of TimeOnly Mod and TimePerf Mod Models
Note that the results for each model for the technique of least squares are shown under the Least Squares NP ers column, and the results for each
model for the technique of ridge regression are shown under the Ridge Regression NP ers column. The percentages in the parentheses show the
relative improvements of each model with respect to the least-squares version of the nonpersonalized version of the TimeOnly Mod model. The
performance is evaluated by the F1 measure.
TABLE 5
Results of the Personalized Version of TimePerfMouseT Mod Model in
Comparison to Personalized Versions of TimeOnly Mod and TimePerf Mod Models
Note that the results for each model for the technique of least squares are shown under the Least Squares Pers column, and the results for each
model for the technique of ridge regression are shown under the Ridge Regression Pers column. The percentages in the parentheses show the
relative improvements of each model with respect to the least-squares version of the personalized version of the TimeOnly Mod model. The
performance is evaluated by the F1 measure.
TABLE 6
Results of the Least-Squares Version of the Personalized Version of
All Models in Comparison to Nonpersonalized Versions of All Models
Note that the results for the nonpersonalized version of each model for the technique of least squares are shown under the Least Squares NPers
column, and the results for the personalized version of each model are shown under the Least Squares Pers column. The percentages in the
parentheses show the relative improvements of each model with respect to the least-squares version of the nonpersonalized version of the
TimeOnly Mod model. The performance is evaluated by the F1 measure.
TABLE 7
Results of the Ridge Regression Version of the Personalized and Nonpersonalized Versions of All
Models in Comparison to the Least-Squares Version of the Nonpersonalized Version of All Models
Note that the results for the nonpersonalized version of each model for the technique of least squares are shown under the Least Squares NPers
column, the results for the nonpersonalized version of each model for the technique of ridge regression are shown under the Ridge Regression NPers
column, and the results for the personalized version of each model for the technique of ridge regression are shown under the Ridge Regression Pers
column. The percentages in the parentheses show the relative improvements of each model with respect to the least-squares version of the
nonpersonalized version of the TimeOnly Mod model. The performance is evaluated by the F1 measure.
nonpersonalized and personalized versions. The perfor- t-tests have been applied for this set of experiments.
mance of Ridge Regression learned version of each model Although the results were not shown to be statistically
is shown in comparison to Least-Squares learned versions significant (i.e., with p-value of less than 0.05), the
in Table 4 for nonpersonalized versions of these models, personalization approaches outperform corresponding
and in Table 5 for personalized versions of these models. It approaches without personalization consistently in differ-
can be seen that the Ridge Regression learned version of ent configurations, which demonstrates the robustness and
each model outperforms Least-Squares learned versions effectiveness of using personalization. This explicitly
with its regularization framework for both of nonpersona- demonstrates the power of personalized modeling for
lized and personalized models. Paired t-tests have been off-task behavior detection in intelligent tutoring systems.
applied for this set of experiments, and statistical sig-
nificance with p-value of less than 0.05 has been achieved 7 CONCLUSION AND FUTURE WORK
in favor of using ridge regression against using least-
square regression with mouse movements. This paper proposes a novel machine learning model to
identify students’ off-task behaviors (which involves
6.3 The Performance of Utilizing the Approach of neither the system nor a learning task) while using an
Personalization intelligent tutoring system. Only the data that are available
The last set of experiments was conducted to measure the from the log files from students’ actions within the software
effect of utilizing the approach of personalization to better are used to construct the model; therefore, the model does
capture different behavior types of different students. The not need sophisticated instrumentation (e.g., microphones,
gaze trackers, etc.) that are unavailable in most school
details about this approach are given in Section 4.2.
computer labs. The proposed model makes use of a set of
More specifically, personalized version of each model is
evidences such as time, performance, and mouse move-
compared to its corresponding nonpersonalized version.
ment features, and is compared to 1) a model that only
The performance of the personalized version of each utilizes time features and 2) a model that uses time and
model is shown in comparison to nonpersonalized version performance features together. Different students have
of that model in Table 6 for Least-Squares learned different types of behaviors; therefore, personalized ver-
versions. The performance of personalized version of each sions of each model are constructed and compared to their
model is shown in comparison to nonpersonalized version corresponding nonpersonalized versions. To address data
of that model in Table 7 for Ridge Regression learned sparseness problem, the proposed model utilizes a robust
versions. It can be seen that for both of Least-Squares and Ridge Regression technique to estimate model parameters.
the Ridge Regression learned versions of each model, the An extensive set of empirical results shows that the
personalized version outperforms the nonpersonalized proposed off-task behavior detection model substantially
versions in most cases with its capability to better capture outperforms the model that only uses time features as well
the different usage styles of different students. Paired as the model that utilizes time and performance features
CETINTAS ET AL.: AUTOMATIC DETECTION OF OFF-TASK BEHAVIORS IN INTELLIGENT TUTORING SYSTEMS WITH MACHINE LEARNING... 235
together. It is also shown by the experiment results that the [15] N. Karweit and R.E. Slavin, “Time-on-Task: Issues of Timing,
Sampling, and Definition,” J. Experimental Psychology, vol. 74,
personalized version of each model outperforms the no. 6, pp. 844-851, 1982.
corresponding nonpersonalized version indicating that [16] K.R. Koedinger and J.R. Anderson, “Intelligent Tutoring Goes to
personalization helps to improve the effectiveness of off- School in the Big City,” Int’l J. Artificial Intelligence in Education,
vol. 8, pp. 30-43, 1997.
task detection. Furthermore empirical results also show that [17] H.M. Lahaderne, “Attitudinal and Intellectual Correlates of
the proposed models attain a better performance by Attention: A Study of Four Sixth-Grade Classrooms,” J. Educa-
tional Psychology, vol. 59, no. 5, pp. 320-324, 1968.
utilizing the technique of Ridge Regression over the [18] S.W. Lee, K.E. Kelly, and J.E. Nyre, “Preliminary Report on
standard Least-Squares technique. the Relation of Students’ On-Task Behavior with Completion
There are several possibilities to extend the research. For of School Work,” Psychological Reports, vol. 84, pp. 267-272,
1999.
example, features that explicitly model the difficult levels of [19] D.J. Litman and K. Forbes-Riley, “Predicting Student Emotions in
available problems will help the system better identify Computer-Human Tutoring Dialogues,” Proc. 42nd Ann. Meeting
students’ behavior. Future research will be conducted on Assoc. for Computational Linguistics, 2004.
[20] E.M. Maletsky et al., Harcourt Math, Indiana ed. Harcourt, 2004.
mainly in this direction. [21] C. Merten and C. Conati, “Eye-Tracking to Model and Adapt to
User Meta-Cognition in Intelligent Learning Environments,” Proc.
11th Int’l Conf. Intelligent User Interfaces, pp. 39-46, 2006.
ACKNOWLEDGMENTS [22] R.C. Murray and K. VanLehn, “Effects of Dissuading Unnecessary
Help Requests While Providing Proactive Help,” Proc. 12th Int’l
This research was partially supported by US National Conf. Artificial Intelligence in Education (AIED ’05), pp. 887-889,
Science Foundation grant nos. IIS-0749462, IIS-0746830, and 2005.
[23] N. Person, B. Klettke, K. Link, and R. Kreuz, “The Integration of
DRL-0822296. Any opinions, findings, conclusions, or Affective Responses into Autotutor,” Proc. Int’l Workshop Affect in
recommendations expressed in this paper are the authors’ Interactions. Towards a New Generation of Interfaces, 1999.
and do not necessarily reflect those of the sponsor. [24] C.J. van Rijsbergen, Information Retrieval, second ed. Univ. of
Glasgow, 1979.
[25] J.W. Schofield, Computers and Classroom Culture. Cambridge Univ.
Press, 1995.
REFERENCES [26] L. Suchman, Plans and Situated Actions: The Problem of Human-
[1] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Machine Communication. Cambridge Univ. Press, 1987.
pp. 75-82. Addison Wesley, 1999. [27] J.A. Walonoski and N.T. Heffernan, “Prevention of Off-Task
[2] R.S. Baker, “Modeling and Understanding Students’ Off-Task Gaming Behavior in Intelligent Tutoring Systems,” Intelligent
Behavior in Intelligent Tutoring Systems,” Proc. SIGCHI Conf. Tutoring Systems, pp. 722-724, Springer, 2006.
Human Factors in Computing Systems, pp. 1059-1068, 2007. [28] T.R. Ziemek, “Two-D or Not Two-D: Gender Implications of
[3] R.S. Baker, A.T. Corbett, K.R. Koedinger, and A.Z. Wagner, “Off- Visual Cognition in Electronic Games,” Proc. Symp. Interactive 3D
Task Behavior in the Cognitive Tutor Classroom: When Students Graphics and Games, pp. 183-190, 2006.
‘Game the System,’” Proc. SIGCHI Conf. Human Factors in
Computing Systems (CHI ’04), pp. 383-390, 2004. Suleyman Cetintas received the BS degree in
[4] R.S. Baker, I. Roll, A.T. Corbett, and K.R. Koedinger, “Do computer engineering from Bilkent University,
Performance Goals Lead Students to Game the System,” Proc. Turkey. He is currently working toward the PhD
12th Int’l Conf. Artificial Intelligence and Education (AIED ’05), degree in computer science at Purdue Univer-
pp. 57-64, 2005. sity. His primary interests include the areas of
[5] C. Beal, B.P. Woolf, J. Beck, I. Arroyo, K. Schultz, and D.M. Hart, information retrieval, machine learning, intelli-
“Gaining Confidence in Mathematics: Instructional Technology gent tutoring systems, and text mining. He has
for Girls,” Proc. Int’l Conf. Math./Science Education and Technology, also worked in the area of privacy preserving
2000. data mining. He is a member of the ACM, ACM
[6] J.E. Beck and B.P. Woolf, “High Level Student Modeling with SIGIR, and the International Artificial Intelligence
Machine Learning,” Proc. Intelligent Tutoring Systems Conf., pp. 584- in Education (IAIED) Society.
593, 2000.
[7] J. Beck, “Engagement Tracing: Using Response Times to Model
Student Disengagement,” Proc. 12th Int’l Conf. Artificial Intelligence Luo Si received the PhD degree from Carnegie
in Education (AIED ’05), pp. 88-95, 2005. Mellon University in 2006. He is an assistant
[8] C.M. Bishop, Pattern Recognition and Machine Learning, pp. 6-10, professor in the Computer Science Department
144-145. Springer, 2006. and the Statistics Department (by courtesy) at
[9] S. Cetintas, L. Si, Y.P. Xin, and C. Hord, “Predicting Correctness of Purdue University. His main research interests
Problem Solving from Low-Level Log Data in Intelligent Tutoring include information retrieval, knowledge man-
Systems,” Proc. Second Int’l Conf. Educational Data Mining (EDM agement, machine learning, intelligent tutoring
’09), pp. 230-239, 2009. systems, and text mining. His research has been
[10] S. Cetintas, L. Si, Y.P. Xin, C. Hord, and D. Zhang, “Learning to supported by the US National Science Founda-
Identify Students’ Off-Task Behavior in Intelligent Tutoring tion (NSF), State of Indiana, Purdue University,
Systems,” Proc. 14th Int’l Conf. Artificial Intelligence and Education and industry companies. He received the NSF Career Award in 2008.
(AIED ’09), pp. 701-703, 2009.
[11] T. Dalton, R.C. Martella, and N.E. Marchand-Martella, “The
Effects of a Self-Management Program in Reducing Off-Task
Behavior,” J. Behavioral Education, vol. 9, nos. 3/4, pp. 157-176,
1999.
[12] F. Davis, La Comunicación No Verbal, vol. 616, El Libro de Bolsillo,
Alianza ed., translated by L. Mourglier from Inside Intuition—
What We Knew about Non-Verbal Communication, McGraw-Hill
Book Company, 1976.
[13] A. De Vicente and H. Pain, “Informing the Detection of the
Students’ Motivational State: An Empirical Study,” Proc. Intelligent
Tutoring Systems, pp. 933-943, 2002.
[14] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical
Learning, pp. 61-68. Springer, 2001.
236 IEEE TRANSACTIONS ON LEARNING TECHNOLOGIES, VOL. 3, NO. 3, JULY-SEPTEMBER 2010
Yan Ping Xin received the PhD degree in Casey Hord is currently working toward the PhD
special education from Lehigh University. degree in special education at Purdue Univer-
Currently, she is an associate professor at sity. His primary interests include the area of
Purdue University. Her research interests include interventions for students who struggle with
effective instructional strategies in mathematics mathematics, particularly students with learning
problem solving with students with learning disabilities or mild intellectual disabilities. He has
disabilities/difficulties, cross-culture performance also worked in the area of software development
and curriculum comparison, and meta-analysis. for helping elementary and middle school
She pioneered the COnceptual Model-based students understand and solve word problems.
Problems Solving (COMPS) approach that facil- His research has contributed to the development
itates algebra readiness in elementary mathematics learning. She is of methods designed to help students at risk for failure in mathematics
currently the principal investigator (with Dr. R. Tzur in math education and utilizing techniques such as model-based instruction and the concrete-
Dr. L. Si in computer science) of a five-year multimillion dollar grant semiconcrete-abstract teaching sequence. He taught middle school in
project (funded through the US National Science Foundation, 2008-2013) special education and general education settings for a total of six years.
that aims to develop a computerized conceptual model-based problem
solving system to nurture multiplicative reasoning in students with
learning difficulties. She has published her work in prestigious journals
such as Exceptional Children, The Journal of Special Education, the
Journal for Research in Mathematics Education, and The Journal of
Educational Research. Her work is cited in detail in the recent
Instructional Practices Report from the National Mathematics Panel.