Original article

Learning styles' recognition in e-learning environments with feed-forward neural networks

J. E. Villaverde, D. Godoy† & A. Amandi†

ISISTAN Research Institute, UNICEN University, Campus Universitario, Paraje Arroyo Seco, Tandil (CP 7000), Bs. As., Argentina
†CONICET, Consejo Nacional de Investigaciones Científicas y Técnicas, Av. Rivadavia 1917, Capital Federal (CP 1033), Bs. As., Argentina

Abstract  People have unique ways of learning, which may greatly affect the learning process and, therefore, its outcome. In order to be effective, e-learning systems should be capable of adapting the content of courses to the individual characteristics of students. In this regard, some educational systems have proposed the use of questionnaires to determine a student's learning style, and then adapt their behaviour to the style of each student. However, the use of questionnaires has been shown to be not only a time-consuming investment but also an unreliable method for acquiring learning style characterisations. In this paper, we present an approach to automatically recognize the learning style of an individual student according to the actions that he or she has performed in an e-learning environment. This recognition technique is based upon feed-forward neural networks.

Keywords: learning styles, neural networks, Web-based instruction.

Accepted: 16 March 2006

Correspondence: Jorge E. Villaverde, ISISTAN Research Institute, UNICEN University, Campus Universitario, Paraje Arroyo Seco, Tandil (CP 7000), Bs. As., Argentina. E-mail: jvillave@exa.unicen.edu.ar

Introduction

Students learn in different ways. Some students prefer graphics, such as diagrams and blueprints, while others prefer written material. Some students feel more comfortable with facts, data and experimentation, whereas others prefer principles and theories (Felder & Silverman 1988). E-learning environments can take advantage of these different forms of learning by recognizing the style of each individual student using the system and adapting the content of courses to match this style.

There are a few systems that are actually capable of adapting courses' contents according to students' learning styles (Carver et al. 1999; Gilbert & Han 1999; Paredes & Rodriguez 2002; Stash & De Bra 2004). In these systems, the learning materials are presented in the way that best fits the learning style of each student, which is usually assessed through a predefined questionnaire. However, answering long questionnaires is a time-consuming task that students are not always willing to carry out and, consequently, results become unreliable (Stash et al. 2004).

This paper describes an approach to the problem of mapping students' actions within e-learning environments into learning styles. The method is based on artificial neural networks (ANNs). Neural networks are computational models for classification inspired by the neural structure of the brain, and they have proven to produce very accurate classifiers. In the proposed approach, feed-forward neural networks are used to recognize students' learning styles based upon the actions they have performed in an e-learning system.

The rest of this paper is organized as follows. The next section briefly describes learning style theory. This is followed by an overview of ANNs and the back-propagation algorithm.
The sections after that show how we have modelled the recognition of learning styles using neural networks and present the empirical results obtained in the recognition of learning styles. The penultimate section discusses related work, and the last section presents our conclusions.

Learning styles

Keefe (1979) defines learning styles as characteristic cognitive, affective and psychological behaviours that serve as relatively stable indicators of how learners perceive, interact with and respond to learning environments. In the last decade, several learning style models have been proposed (Kolb 1984; Myers & McCaulley 1985; Felder & Silverman 1988; Lawrence 1993; Litzinger & Osif 1993). A critical review of learning styles, analysing their reliability, validity and implications for pedagogy, can be found in Coffield et al. (2004a,b). The authors of this review concluded that in the field of learning styles there is a lack of theoretical coherence and a common framework (Coffield et al. 2004b, p. 145).

In spite of this fact, experimental research on the application of learning styles in computer-based education provides support for the view that learning can be enhanced through the presentation of materials that are consistent with a student's particular learning style (Budhu 2002; Peña et al. 2002; Stash et al. 2004). For example, it has been shown that the performance of students in a simple Web-based learning environment correlates with their self-reported learning preference (Walters et al. 2000).

In this paper, we have adopted the model suggested by Felder and Silverman (1988) for engineering education, which classifies students according to their position in several scales that evaluate how they perceive and process information. This model classifies students according to four dimensions:

- Perception: What type of information does the student preferentially perceive: sensory (external: sights, sounds, physical sensations) or intuitive (internal: possibilities, insights, hunches)?

  - Sensory learners: these students like facts, data and experimentation. They perceive concrete and practical information, and are oriented towards facts and procedural information. When solving problems, sensory students are routinely very patient with details and usually dislike surprises. Because of these characteristics, they show a slower reaction to problems, but they typically present a better outcome.

  - Intuitive learners: intuitive learners prefer theories and principles. They rapidly become bored with details and mechanical problem solving. Innovation is what attracts intuitive learners' attention. They generally solve problems quickly, not paying much attention to details. This makes them fast but prone to errors, so they often get lower qualifications than sensitive learners.

- Input: Through which sensory channel is external information most effectively perceived: visual (pictures, diagrams, graphs, demonstrations) or verbal (written or spoken words)?

  - Visual learners: they remember, understand and assimilate information better if it is presented to them in a visual way. They tend to remember graphics, pictures, diagrams, time lines, blueprints, presentations and any other visual material.

  - Verbal learners: cognitive scientists have established that our brains generally convert written words into their spoken equivalents and process them in the same way that they process spoken words (Felder & Brent 2005). Hence, verbal learners are not only those who prefer auditory material but also those who remember well what they hear and what they read.

- Processing: How does the student prefer to process information: actively (through engagement in physical activities or discussions) or reflectively (through introspection)?

  - Active learners: they feel more comfortable with active experimentation than with reflective observation. An active person learns by trying things out and working with others. They like doing something in the external world with the received information. Active learners work well in groups and in situations that require their participation.

  - Reflective learners: reflective learners prefer introspective examination and manipulation of information. They learn by thinking things through and by working alone or with another person.

- Understanding: How does the student progress towards understanding: sequentially (in continual steps) or globally (in large jumps, holistically)?

  - Sequential learners: sequential learners follow a line of reasoning when progressing towards the solution of a problem; they like things to be linear. They learn better if information is presented in a steady progression of complexity and difficulty (i.e. they learn in small incremental steps).

  - Global learners: global learners make intuitive leaps and may be unable to explain how they come up with solutions. They are holistic, system thinkers; they learn in large leaps. They need to understand the whole before understanding the parts that compose it; they need to get the 'big picture'.

If the dimensions were absolute, these four dimensions of learning would allow 16 different learning styles to be obtained (i.e. in the perception dimension, a student would be either sensitive or intuitive). However, each dimension can be rated on a scale; for example, the scale used in the Index of Learning Styles (ILS), available online at http://www.ncsu.edu/felder-public/ILSpage.html, is a scale of 22 degrees. With this scale, the number of different styles is 234,256 (22^4).

In the following section, we detail how ANNs can be used to recognize the learning styles of students based on the actions they perform in an e-learning system.

ANNs

The human brain is composed of cells (neurons) that are the only cells capable of communicating with each other. This is one of the capabilities that allows humans to exhibit intelligent behaviour. ANNs are computational models based on the biological neural structure of the brain, as first proposed by McCulloch and Pitts (1943). This computational model, also known as connectionism, aims to mathematically represent and reproduce the way a human nervous system works.

The neuron is the basic processing unit. Each neuron receives signals from other neurons, processes these signals and transmits an output to its neighbours. Some of the signals it receives excite it, while others inhibit it. Usually, an exciting signal is represented as a positive real value, while a negative value is assigned to an inhibiting signal. As shown in Fig 1, the net input of a neuron is the sum of the values that arrive at its input. This value represents the excitation level of the neuron. If it exceeds a certain threshold, the neuron fires an output signal to its neighbours.

[Fig 1. Main components of an artificial neuron: input signals x_1, ..., x_n are weighted by w_1, ..., w_n and summed into the net input Net_j of unit U_j; the activation function F(a_j(t), Net_j) yields the new activation a_j(t+1), which the output function f_j transforms into the output y_j.]

From this input-process-output model, neurons are classified with respect to where their signals come from. If the signals that the neuron receives come from the environment, it is called an input neuron. If it receives its input from other neurons and transmits its outcome to other neurons as well, it is called a hidden neuron. Finally, if it sends its output to the environment, then it is called an output neuron. Usually, neurons sharing similar characteristics are grouped together, forming layers of neurons. What distinguishes one layer of neurons from another are their inputs and outputs: neurons belonging to layer i receive their input from layer i-1, and they send their output to layer i+1. Three kinds of layers are distinguished in the ANN literature. If the layer contains input neurons, it is an input layer; if it contains hidden neurons, it is a hidden layer; and if it contains output neurons, then it is an output layer.

An ANN is termed feed-forward if neurons belonging to layer i receive input only from layer i-1 and send output only to layer i+1. This means that in feed-forward neural networks there are neither connections between neurons of the same layer (lateral connections) nor connections going back to previous layers (recurrent connections). Information in this class of networks only flows forward, from the input to the output, passing through the hidden layers.


Structure and operation of ANNs

The fundamental unit of processing in ANNs is the artificial neuron (AN); see Fig 1. ANs are mathematical models that represent the general characteristics of biological neurons. Let us see how these biological features are mathematically modelled and how they contribute to the learning process of ANNs.

When a signal is transmitted from processing unit i to processing unit j, the signal x_i is modified by the synaptic weight (w_ij) associated with this communication channel. The modulated signals that arrive at unit j are added to form the net input Net_j, as shown in equation (1):

$Net_j = \sum_i x_i w_{ij}$  (1)

Each neuron is characterized at any instant of time t by a real value, called its activation state or activation level, a_j(t). Also, there is a function F, called the activation function, which determines the next activation state based on the current activation state and the net input Net_j of the neuron:

$F(a_j(t), Net_j) = a_j(t+1)$  (2)

Associated with each unit there also exists an output function, f_j, that transforms the current activation level into an output signal y_j. This signal is sent through a unidirectional communication channel to other units in the network:

$f_j(a_j(t+1)) = y_j$  (3)
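For illustration, the following minimal Python sketch implements equations (1)-(3) for a single unit. It is our own illustration rather than code from the paper; the choice of tanh as F and of the identity as the output function f_j are assumptions.

```python
import numpy as np

def F(a_t, net):
    """Activation function F(a_j(t), Net_j) of equation (2); this
    illustrative choice ignores the previous state and squashes the
    net input with tanh."""
    return np.tanh(net)

def neuron_step(x, w, a_t):
    """One update of an artificial neuron, following equations (1)-(3)."""
    net_j = x @ w            # equation (1): Net_j = sum_i x_i * w_ij
    a_next = F(a_t, net_j)   # equation (2): a_j(t+1) = F(a_j(t), Net_j)
    y_j = a_next             # equation (3): output function f_j taken as identity
    return a_next, y_j

a, y = neuron_step(np.array([0.5, -1.0, 2.0]), np.array([0.1, 0.4, 0.2]), a_t=0.0)
```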
(Ti) to obtain the pattern error according to
Learning algorithm

A learning algorithm is the process by which an ANN generates internal changes so that it can adapt its behaviour in response to the environment. The modifications made by the network during this process enable it to achieve better performance, improving its output to the environment. When an external agent is involved in the learning process, this is called supervised learning. Back-propagation is a supervised learning algorithm, used in feed-forward neural networks, which reduces the global error produced by the network over the weight space.

Back-propagation (Parker 1982; Werbos 1988) is a generalization of the LMS algorithm (Rosenblatt 1958; Widrow & Hoff 1960) applied to feed-forward multi-layer perceptron networks. This network is one of the most widespread models in the connectionism field, owing to its capability to learn an association between an input space and an output. It has been demonstrated by different authors that a multi-layer perceptron is a universal function approximation mechanism, in the sense that any continuous function over a compact subset of R^n can be approximated by a multi-layer perceptron (Cybenko 1989; Hornik et al. 1989).

Back-propagation networks (BPNs) operate in two steps. In the first step, called the training process, the network is initialized with random small values in its weights; the goal of this process is to find a set of weight values that minimizes the global error of the network, given in equation (5). The second step is called generalization. In this step, the network has already learned an internal representation of the previously presented patterns and becomes able to classify novel patterns presented as inputs.

The learning process of a BPN is briefly described as follows. A pattern is a two-tuple P_i = <X_i, T_i>, where X_i is a set of values that will be presented as input to the network and T_i is a set of values that represent the desired target for the values presented at the input layer. For the network to learn a pattern P_i, the values of the pattern first have to be presented at the input layer. These values are taken as stimuli and are propagated forward until the output layer is reached. At the output, the set O of values obtained by the network (called o_j) is compared with the set of desired values (T_i) to obtain the pattern error according to equation (4):

$Err_i = \sum_{j=1}^{M} (o_j - t_j)^2$  (4)

where M is the number of neurons in the output layer, o_j ∈ O and t_j ∈ T. Hence, the global error of the network is calculated considering all patterns as follows:

$Err = \frac{1}{2P} \sum_{k=1}^{P} \sum_{j=1}^{M} \big( o_j^{(k)} - t_j^{(k)} \big)^2$  (5)

where P is the number of patterns in the training set. This error is the one that the training procedure tries to minimize. The error values are back-propagated through the network to adjust the values of its connection weights, so that the change in the weights is proportional to the gradient of the error, as in equation (6). This weight adjustment rule is known as the generalized delta rule (GDR) (Rumelhart et al. 1986; 1988) and is a generalization of the Widrow–Hoff delta rule (Widrow & Hoff 1960):

$\Delta W_{ij} = -\frac{\partial Err}{\partial W_{ij}} = -\frac{\partial}{\partial W_{ij}} \, \frac{1}{2} \sum_{k=1}^{M} (o_k - t_k)^2 = -\sum_{k=1}^{M} (o_k - t_k) \, \frac{\partial o_k}{\partial W_{ij}}$  (6)

This equation can also be rewritten as equation (7), with the addition of a momentum term (Rumelhart et al. 1986) to achieve faster convergence. The momentum part of the equation, β(W(t-1) - W(t-2)), allows the network to respond not only to the local gradient but also to recent trends in the error surface:

$W(t) = W(t-1) - \alpha \delta_j y_j + \beta \big( W(t-1) - W(t-2) \big)$  (7)

where δ_j is calculated according to the rule

$\delta_j = f'(z_j)(o_j - t_j)$ (output layer),  $\delta_j = f'(z_j) \sum_i W_{ij} \delta_i$ (hidden layer)  (8)

This process of weight adjustment is repeated until a desired error threshold is reached.
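A compact sketch of this update scheme for a network with one hidden layer of tanh units may help. It is our illustration of equations (6)-(8), not code from the paper, and the variable names are ours:

```python
import numpy as np

def backprop_step(x, t, W1, W2, prev, lr=0.02, beta=0.5):
    """One generalized-delta-rule update with momentum, equations (6)-(8).

    x, t  -- one training pattern <X_i, T_i>
    W1/W2 -- input-to-hidden / hidden-to-output weight matrices
    prev  -- dict holding the previous weight steps (the momentum memory)
    """
    h = np.tanh(W1 @ x)                       # forward pass, hidden layer
    o = np.tanh(W2 @ h)                       # forward pass, output layer

    # Equation (8), output units, using f'(z) = 1 - tanh(z)^2 = 1 - o^2
    d_out = (1.0 - o ** 2) * (o - t)
    # Equation (8), hidden units: deltas propagated back through W2
    d_hid = (1.0 - h ** 2) * (W2.T @ d_out)

    # Equation (7): W(t) = W(t-1) - alpha*delta_j*y_j + beta*(W(t-1) - W(t-2))
    for name, W, d, y in (("W2", W2, d_out, h), ("W1", W1, d_hid, x)):
        step = -lr * np.outer(d, y) + beta * prev.get(name, 0.0)
        W += step
        prev[name] = step                     # remember the step for the momentum term
    return W1, W2
```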
Modelling learning styles with feed-forward neural networks

The neural network architecture that we propose in this paper aims to find a mapping between students' actions in the system and the learning style that best fits them. To achieve this goal, we must identify the inputs of the network, its outputs and the meaning of their possible values. It is also necessary to determine other architectural parameters, such as the number of hidden layers to be used, the number of processing units in each of the hidden layers, the activation function to be used in the processing units and the learning coefficient of the network. Each of these issues is analysed in the following subsections.

Input layer

To represent the input of the network, we propose the use of one processing unit (neuron) in the input layer per observed action in the system. These actions are as follows:

- Reading material: academic units can be presented using both abstract (theories) and concrete (exercises) material. What kind of material is the student most interested in?
- Access to examples: in each academic unit, a number of examples are presented to students. In relation to the total number of available examples, how many of them has the student accessed?
- Answer changes: does the student change the answers of an exam before handing it over? If so, what is the percentage of answers changed?
- Exercises: a number of exercises are also included in academic units. In relation to the total number of available exercises, how many has the student accessed?
- Exam delivery time: each exam has an associated time to solve. What is the relation between the student's exam delivery time and the unit's time to solve?
- Exam revision: in relation to the time to solve the exam, what percentage of time did the student spend checking the correctness of the exam?
- Chat usage: the student may ignore the chat, read other students' messages, or read and write messages with others.
- Forum usage: the student may ignore the forum, read messages posted by other students, or post messages in the forum.
- Mail usage: the student may use (or not use) the e-mail.
- Information access: information in academic units is presented following a line of reasoning. How has the student followed that line of reasoning? Linearly, or has he or she visited a random sequence of items?

These values have to be encoded in the real interval [-5, +5], as expected by the neurons in the input layer of the network. This interval was intentionally selected to match the expected domain of the activation function selected for the units of the net, as shown in the subsection 'Network architecture and parameters'. Table 1 summarizes the representation of the input vector X.

Table 1. Input representation.

X    Action              -5         +5
x0   Reading material    Abstract   Concrete
x1   Access to examples  Few        Much
x2   Answer changes      Few        Much
x3   Exercises           Few        Much
x4   Exam delivery time  Quick      Slow
x5   Exam revision       Few        Much
x6   Chat usage          Don't use  Read and write
x7   Forum usage         Don't use  Read and post
x8   Mail usage          Don't use  Use
x9   Information access  Linear     Global
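As an illustration of this encoding (the statistic names and the linear scaling below are our own assumptions, not taken from the paper's implementation), each observed behaviour can be mapped onto [-5, +5] as follows:

```python
def scale(fraction):
    """Map a proportion in [0, 1] linearly onto the input interval [-5, +5]."""
    return -5.0 + 10.0 * min(max(fraction, 0.0), 1.0)

def encode_actions(log):
    """Build the 10-component input vector x0..x9 of Table 1 from a
    hypothetical dictionary of observed usage statistics."""
    return [
        scale(log["concrete_material_ratio"]),   # x0: abstract (-5) .. concrete (+5)
        scale(log["examples_accessed_ratio"]),   # x1: few .. much
        scale(log["answers_changed_ratio"]),     # x2: few .. much
        scale(log["exercises_done_ratio"]),      # x3: few .. much
        scale(log["delivery_time_ratio"]),       # x4: quick .. slow
        scale(log["revision_time_ratio"]),       # x5: few .. much
        scale(log["chat_usage_level"]),          # x6: don't use .. read and write
        scale(log["forum_usage_level"]),         # x7: don't use .. read and post
        scale(log["mail_usage_level"]),          # x8: don't use .. use
        scale(log["non_linear_access_ratio"]),   # x9: linear .. global
    ]
```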
Output layer


The output of the network should approximate the learning style of the students based on the actions presented at the input layer. In this case, we propose the use of one processing neuron in this layer per learning style dimension used in the model. In this work, only three of the four dimensions of the Felder–Silverman model have been used to model students' learning styles. These dimensions are as follows:

- Perception: this dimension determines whether the style of the student is intuitive or sensitive.
- Processing: this dimension decides whether a student's learning style better fits active or reflective.
- Understanding: this dimension informs whether the student's learning style is sequential or global.

Table 2 summarizes the output vector representation.

Table 2. Output representation.

O    Dimension      -1          +1
o0   Perception     Intuitive   Sensitive
o1   Processing     Active      Reflective
o2   Understanding  Sequential  Global
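To make this reading of the outputs concrete, here is a small sketch (our own illustration) that maps the three outputs, each in [-1, +1], onto the poles of Table 2:

```python
def decode_style(o):
    """Interpret the output vector (o0, o1, o2), each in [-1, +1], as the
    poles of the three Felder-Silverman dimensions of Table 2."""
    poles = [("intuitive", "sensitive"),    # o0: perception
             ("active", "reflective"),      # o1: processing
             ("sequential", "global")]      # o2: understanding
    dims = ("perception", "processing", "understanding")
    return {dim: (neg if value < 0 else pos)
            for value, (neg, pos), dim in zip(o, poles, dims)}

print(decode_style([-0.8, 0.3, 0.9]))
# {'perception': 'intuitive', 'processing': 'reflective', 'understanding': 'global'}
```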
Hidden layer

In a multi-layer perceptron using continuous nonlinear hidden-layer activation functions, as proposed in this work, one hidden layer with an arbitrarily large number of units suffices for the 'universal approximation property' (Hornik et al. 1989; Bishop 1995). However, there is still no theory about how many hidden units are needed to approximate any given function. Although there are some empirical rules, such as the Baum–Haussler rule (Baum & Haussler 1988), for determining the desirable number of neurons in the hidden layer of a multi-layer feed-forward neural network, we determined this architectural parameter via trial-and-error experimentation. A total of 24 units was empirically found to be appropriate for the task at hand, considering that this layer has to have enough processing units to represent the nonlinear aspects of the model, but not so many as to make the training process overly complex.

Network architecture and parameters

The network is trained using the GDR (Rumelhart et al. 1986; 1988), a generalization of the Widrow–Hoff delta rule (Widrow & Hoff 1960). The activation function used in the network units is the hyperbolic tangent function shown in equation (9):

$f(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$  (9)

Many interesting properties of this function make it suitable as the activation function of the network. On the one hand, the function domain can be restricted to [-5, +5], where the function already reaches more than 99.99% of its range; the different observed aspects are then projected onto this interval. Another interesting property is that its derivative can be defined in terms of the function's output, which makes calculation easier, as shown in equation (10). A further important property of the hyperbolic tangent function is that its range is [-1, +1]; this property can be utilized to represent the different ranges of the learning style dimensions.

$\frac{d}{dx}\tanh(x) = \frac{d}{dx}\,\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} = \frac{d}{dx}\,\frac{\sinh(x)}{\cosh(x)} = \frac{\cosh^2(x) - \sinh^2(x)}{\cosh^2(x)} = 1 - \tanh^2(x)$  (10)

This architectural parameter, along with the proper selection of learning rate and momentum coefficients, defines the specific values selected for this work. The learning rate, α, was set to a small value between 0.1 and 0.25 so that the representation acquired can be a faithful one.
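Both properties are easy to check numerically (a quick illustration of equations (9) and (10), not taken from the paper):

```python
import numpy as np

# At the ends of the restricted domain [-5, +5] the function has already
# covered more than 99.99% of its range [-1, +1]:
print(np.tanh(5.0))                          # 0.99990920...

# Equation (10): the derivative expressed through the output itself
z = 0.7
df_analytic = 1.0 - np.tanh(z) ** 2
df_numeric = (np.tanh(z + 1e-6) - np.tanh(z - 1e-6)) / 2e-6
print(abs(df_analytic - df_numeric) < 1e-9)  # True
```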


To sum up, the proposed architecture for learning style recognition is a three-layered feed-forward neural network trained with the batch gradient descent with momentum learning algorithm. In this architecture, the first layer contains a total of 10 input neurons; the hidden layer contains 24 processing units that are connected to 3 output units in the third layer. Information about the network architecture and other parameters is summarized in Table 3.

Table 3. Proposed architectural parameters.

Parameter                 Value
Number of input neurons   10
Number of hidden neurons  24
Number of output neurons  3
Activation function       Hyperbolic tangent
Learning rate             0.02
Momentum                  0.5
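Assembled in code, the architecture of Table 3 looks roughly as follows. This is a sketch under our own assumptions: the paper specifies the initial weights only as 'random small values', and training would iterate the backprop_step sketched earlier over the pattern set.

```python
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 10, 24, 3           # Table 3
LEARNING_RATE, MOMENTUM = 0.02, 0.5      # Table 3

# "Random small values" for the initial weights (range is our assumption)
W1 = rng.uniform(-0.1, 0.1, size=(N_HID, N_IN))
W2 = rng.uniform(-0.1, 0.1, size=(N_OUT, N_HID))

def predict(x):
    """Forward pass: 10 inputs -> 24 tanh hidden units -> 3 tanh outputs."""
    return np.tanh(W2 @ np.tanh(W1 @ x))

x = rng.uniform(-5.0, 5.0, N_IN)         # one encoded action vector (Table 1)
print(predict(x))                        # three values in (-1, +1), read via Table 2
```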
Temporal adaptation

To ensure that the student's learning style does not oscillate between different states (which would keep the system changing the contents of the student's courses as a consequence), historical interaction data can be supplied to the classifier to enhance the prediction of the learning style. To accomplish this goal, a parameter N is defined to represent the length of the input sequence to the network, causing the input to the classifier to be a function of the historical data. That is, the input to the system is a finite sequence X where

$X = x_t, x_{t-1}, \ldots, x_{t-N}$  (11)

This sequence of inputs is pre-processed by applying the discounted-return transformation (Sutton 1988) before becoming the real input to the network. Equation (12) shows how the actual input to the ANN is produced from the data measured at time t and from the historical information:

$x_t = \sum_{k=1}^{N} \gamma^{k-1} x_{t+k}, \qquad 0 \le \gamma \le 1$  (12)

Another important network parameter is the discount factor, γ, which determines the importance given to past experience. Along with the length of the sequence, the discount factor allows control over the relevance of past experience in the system to the recognition of the learning style.
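A sketch of this preprocessing step follows. It reflects our reading of equations (11) and (12); the exact time indexing is ambiguous in the typeset equation, so the code simply weights older observations by increasing powers of γ:

```python
def discounted_input(history, gamma=0.9):
    """Collapse a sequence of input vectors x_t, x_{t-1}, ..., x_{t-N}
    (most recent first) into one vector, discounting older observations
    by powers of gamma, in the spirit of equation (12)."""
    assert 0.0 <= gamma <= 1.0
    out = [0.0] * len(history[0])
    for k, x in enumerate(history):       # k = 0 is the most recent vector
        for i, v in enumerate(x):
            out[i] += (gamma ** k) * v
    return out
```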

Experimental results

Estimating the accuracy of classifiers induced by supervised learning algorithms (i.e. a classifier's probability of correctly recognizing a randomly selected instance) is important not only to predict their future prediction accuracy but also for choosing a classifier from a given set (model selection) or for combining classifiers (Wolpert 1992). In order to evaluate the proposed approach, we generated an artificial dataset for experimentation by simulating the actions of students.

For this task, we considered that each student has a particular learning style, denoted by a set of preferred actions, and behaves according to it. The resulting dataset built for testing the network consisted of 100 pairs of input–output values. Each of these pairs contained possible actions that students might perform in the system, associated with their corresponding learning style dimensions.

To determine the best number of processing units in the hidden layer, we trained the network varying this parameter and estimated the accuracy of each architecture using k-fold cross-validation with k = 10. Each of the 10 folds was composed of a set of 90 training patterns and 10 test patterns. To measure the accuracy of each classifier, we calculated the global error (equation (5)) that it produces when it is stimulated with the test cases. Each of the output dimensions was considered independently in the calculation.

Figure 2 shows the mean accuracy of cross-validation when varying the number of neurons in the network's hidden layer. We can observe in the figure that the best accuracy (69.3%) is reached when the number of hidden units is 24. Figure 3 shows the variance of the classifier when the number of neurons varies. In this figure, the lowest variance (1.53%) is obtained when the number of neurons is 22. Based on these results, we have chosen 24 neurons for achieving the maximum accuracy. For this value, the variance (1.70%) of the classifier is not the minimum, but it is still acceptable.

[Fig 2. Mean accuracy as a function of the number of neurons in the hidden layer of the network.]

[Fig 3. Variance as a function of the number of neurons in the hidden layer of the network.]
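The evaluation loop can be sketched as follows; `train` and `global_error` are hypothetical stand-ins for the training procedure and the error measure of equation (5) described above:

```python
import numpy as np

def cross_validate(patterns, k=10):
    """Estimate performance with k-fold cross-validation: with 100
    patterns and k = 10, each fold trains on 90 and tests on 10."""
    indices = np.arange(len(patterns))
    errors = []
    for test_idx in np.array_split(indices, k):
        train_idx = np.setdiff1d(indices, test_idx)
        net = train([patterns[i] for i in train_idx])        # hypothetical trainer
        errors.append(global_error(net,                      # equation (5), hypothetical
                                   [patterns[i] for i in test_idx]))
    return np.mean(errors), np.var(errors)
```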
Related work


Most adaptive educational systems providing individualized courses according to learning styles, such as ARTHUR (Gilbert & Han 1999), CS388 (Carver et al. 1999), MAS-PLANG (Peña et al. 2002) and TangoW (Paredes & Rodriguez 2002), assess students' learning styles by having them complete questionnaires. Each question of these questionnaires requires the student to select one of several possible answers. The main disadvantage of this approach is that the questionnaires tend to be time consuming and students might try to answer questions arbitrarily, which has a negative effect on the accuracy of the results obtained. A further disadvantage is that learning styles can change over time, so the knowledge gathered by the questionnaire can become obsolete.

Conversely, the proposed approach consists in training a neural network to determine learning styles automatically based on the students' actions. In this approach, the network is trained on a number of the most recent actions performed by students and, thus, it allows the system to deal with changes in the students' learning styles. The closest related work is AHA! (Stash & De Bra 2004), an educational system that infers the students' learning style from the observation of their browsing behaviour. In contrast to the systems mentioned above, it does not make use of questionnaires. However, this system covers a small number of learning styles, those corresponding to field-dependent/independent styles and to visual/verbal styles, whereas the approach presented in this paper is based on the Felder and Silverman theory for engineering education.

Learning styles in engineering education have been studied by Al-Holou and Clum (1999), Ayre and Nafalski (2000), Budhu (2002) and Kuri and Truzzi (2002). Kuri and Truzzi (2002) used the Felder–Silverman model to assess the distribution of responses of freshmen engineering students to the ILS questionnaire, and they concluded that the use of learning styles to accommodate teaching strategies to students has been shown to be effective in encouraging students. Other e-learning environments that are not exclusively for engineering education have also incorporated the Felder–Silverman model of learning styles (Peña et al. 2002; Kim & Moore 2005). They too rely on responses obtained from ILS questionnaires filled in at the beginning of the courses, and they adapt the content of the courses to the characteristics of instructional and learning strategies.

Conclusions

In this paper, we have described an approach based on feed-forward neural networks to infer the learning styles of students automatically. We have selected the back-propagation algorithm to train the ANN described in this work. In addition, we have described a neural network architecture that learns the associations between students' actions in e-learning environments and their corresponding Felder–Silverman learning styles for engineering education.

The advantage of this approach is twofold. First, an automatic mechanism for style recognition facilitates the gathering of information about learning preferences, making it imperceptible to students.


Second, the proposed algorithm uses the recent history of system usage, so that systems using this approach can recognize changes in learning styles, or in some of their dimensions, over time.

The recognition mechanism presented in this paper can be introduced in adaptive e-learning environments to help in the detection of students' learning styles and, thus, conveniently adapt the contents of the academic courses that are presented to them. It can also be extended to consider further input actions available in particular e-learning systems or domains.

References

Al-Holou N. & Clum J.A. (1999) Teaching electroscience using computer-based instruction. In 29th ASEE/IEEE Frontiers in Education Conference (eds D. Budny & G. Bjedov), pp. 24–29. San Juan, Puerto Rico.

Ayre M. & Nafalski A. (2000) Recognising diverse learning styles in teaching and assessment of electronic engineering. In 30th ASEE/IEEE Frontiers in Education Conference (eds D. Budny & G. Bjedov), pp. 18–23. Kansas City, MO.

Baum E.B. & Haussler D. (1988) What size net gives valid generalization? Neural Computation 1, 151–160.

Bishop C.M. (1995) Neural Networks for Pattern Recognition. Clarendon Press, Oxford.

Budhu M. (2002) Interactive web-based learning using interactive multimedia simulations. In International Conference on Engineering Education (ICEE 2002), pp. 1–6. International Network for Engineering Education & Research, Manchester, UK.

Carver C.A.J., Howard R.A. & Lane W.D. (1999) Enhancing student learning through hypermedia courseware and incorporation of student learning styles. IEEE Transactions on Education 42, 33–38.

Coffield F., Moseley D., Hall E. & Ecclestone K. (2004a) Learning styles and pedagogy in post-16 learning: a systematic and critical review. Technical Report 041543, Learning and Skills Research Centre.

Coffield F., Moseley D., Hall E. & Ecclestone K. (2004b) Should we be using learning styles? What research has to say to practice. Technical Report 041540, Learning and Skills Research Centre.

Cybenko G. (1989) Approximations by superposition of sigmoidal functions. Mathematics of Control, Signals, and Systems 2, 303–314.

Felder R.M. & Brent R. (2005) Understanding student differences. Journal of Engineering Education 94, 57–72.

Felder R.M. & Silverman L.K. (1988) Learning and teaching styles in engineering education. Journal of Engineering Education 78, 674–681.

Gilbert J.E. & Han C.Y. (1999) Arthur: adapting instruction to accommodate learning style. In World Conference on the WWW and Internet (WebNet 99) (eds P. De Bra & J.J. Leggett), pp. 433–438. Honolulu, Hawaii, USA.

Hornik K., Stinchcombe M. & White H. (1989) Multilayer feedforward networks are universal approximators. Neural Networks 2, 359–366.

Keefe J.W. (1979) Learning style: an overview. In NASSP's Student Learning Styles: Diagnosing and Prescribing Programs, pp. 1–17.

Kim K.S. & Moore J.L. (2005) Web-based learning: factors affecting students' satisfaction and learning experience. First Monday 10. Available from http://www.firstmonday.dk/issue/issue10-11/kim/index.html

Kolb D.A. (1984) Experiential Learning: Experience as the Source of Learning and Development. Prentice-Hall, New York.

Kuri N.P. & Truzzi O.M.S. (2002) Learning styles of freshmen engineering students. In International Conference on Engineering Education (ICEE 2002), Manchester, UK.

Lawrence G. (1993) People Types and Tiger Stripes: A Practical Guide to Learning Styles. Center for Applications of Psychological Type, Gainesville, FL.

Litzinger M.E. & Osif B. (1993) Accommodating diverse learning styles: designing instruction for electronic information sources. In What is Good Instruction Now? Library Instruction for the 90s (ed. L. Shirato). Pierian Press.

McCulloch W.S. & Pitts W. (1943) A logical calculus of ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133.

Myers I. & McCaulley M. (1985) Manual: A Guide to the Development and Use of the Myers–Briggs Type Indicator. Consulting Psychologists Press, Palo Alto, CA.

Paredes P. & Rodriguez P. (2002) Considering learning styles in adaptive web-based systems. In Proceedings of the 6th World Multiconference on Systemics, Cybernetics and Informatics (eds N. Callaos, M. Margenstern & B. Sanchez), pp. 481–485. Orlando, FL.

Parker D.B. (1982) Learning logic, invention report. Technical Report S81-64, File 1, Office of Technology Licensing, Stanford University.

Peña C.I., Marzo J.L. & de la Rosa J.L. (2002) Intelligent agents in a teaching and learning environment on the Web. In ICALT 2002. IEEE, Kazan, Russia.

Rosenblatt F. (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychological Review 65, 386–408.

Rumelhart D.E., Hinton G.E. & Williams R.J. (1986) Learning representations by back-propagating errors. Nature 323, 533–536.

Rumelhart D., Hinton G. & Williams R. (1988) Learning internal representations by error propagation. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition (eds D. Rumelhart & J. McClelland), pp. 318–362. MIT Press, Cambridge, MA.

Stash N. & De Bra P. (2004) Incorporating cognitive styles in AHA! (The Adaptive Hypermedia Architecture). In IASTED International Conference (ed. V. Uskov), pp. 378–383. ACTA Press, Innsbruck, Austria.

Stash N., Cristea A. & De Bra P. (2004) Authoring of learning styles in adaptive hypermedia: problems and solutions. In 13th International Conference on World Wide Web – Alternate Track Papers & Posters, pp. 114–123.

Sutton R.S. (1988) Learning to predict by the methods of temporal differences. Machine Learning 3, 9–44.

Walters D., Egert C. & Cuddihy E. (2000) Learning styles and web-based education: a quantitative approach. In Proceedings from the 9th Annual FACT Conference on Instructional Technology, pp. 115–117. Buffalo, NY.

Werbos P.J. (1988) Generalization of backpropagation with application to a recurrent gas market model. Neural Networks 1, 339–356.

Widrow B. & Hoff M.E. (1960) Adaptive switching circuits. In IRE WESCON Convention Record, pp. 96–104. New York, NY.

Wolpert D.H. (1992) Stacked generalization. Neural Networks 5, 241–259.
