
6

BUILDING THE FOUNDATIONS OF LANGUAGE
Mechanisms of curiosity-driven learning

Katherine E. Twomey and Gert Westermann

Introduction
The field of language acquisition benefits from a rich experimental literature. The majority
of this work employs carefully controlled experimental paradigms designed to isolate and
test the effect of a given set of variables on learning, and to rule out the effect of extraneous
factors. For example, experimental studies of word learning typically use novel words and
control how often a child hears these words so that the researcher can be confident any
between-condition differences in behavior are the result of the experimental manipula-
tion and not the number of times children heard the word; similarly, in sentence repetition
tasks investigating older children’s syntactic development, linguistic variables are controlled
to ensure repetition relates to grammatical structure and not to other characteristics of the
stimuli such as lexical frequency.
However, in contrast to the constrained lab environment required for good experi-
mental control, the majority of infants’ and toddlers’ real-world learning takes place in
a messy, cluttered learning environment. While a child participating in an experiment
typically sees a fixed number of stimuli in a predetermined order, outside the lab children
explore their world freely, attending to what they encounter at their own pace, in the
context of their learning history, in whatever order they choose. Importantly, then, chil-
dren play an active role in their own early learning by extracting information from the
environment as they choose and consequently imposing a structure on this environment.
Put differently, early learning is driven by the child’s intrinsic motivation, or curiosity.
Critically, curiosity is likely also to be important to language acquisition, since the child
is free to attend to linguistic and nonlinguistic stimuli and to extract information from
this rich environment. Nevertheless, the role of curiosity in early development has only
recently begun to be studied in earnest, and the mechanisms by which curiosity might
drive language acquisition are yet to be clearly defined. The current chapter reviews
classical psychological theories of curiosity, describes recent work implementing these
theories in artificial learning systems and explores current findings from developmental
psychology that point towards a link between curiosity, attention and language at the
earliest stages of development.


Theoretical approaches to curiosity

Curiosity, motivation and attention


Historically, curiosity has been closely associated with the notion of intrinsic motivation. As
adults we spend substantial time engaged in activities such as watching television or listen-
ing to music. These behaviors are intrinsically motivated; unlike extrinsically motivated
behaviors, where an action is carried out to satisfy some external criterion, intrinsically
motivated behaviors have no external goal, but rather are carried out simply for their
inherent reward (Berlyne, 1960; Ryan & Deci, 2000). For example, eating is extrinsically
motivated by the need to satiate hunger, and the action of memorizing a vocabulary list
to pass a language test is extrinsically motivated by the desire to pass the test, while playing
a computer game or singing to oneself while taking a shower attracts no external reward.
Intrinsically motivated behaviors are also evident early in development. For example, infants
begin to babble spontaneously, not driven by an externally mediated reward (Moulin-Frier,
Nguyen, & Oudeyer, 2014); similarly, infants who are expert crawlers nonetheless transition
to the initially much more effort- and errorful stage of walking (Adolph & Tamis-LeMonda,
2014). In contrast, other early behaviors, for example, tidying toys or helping an adult reach
an object out of their reach, have clearly identifiable extrinsic motivations (e.g., eliciting
praise). More broadly, however, behaviors often have both intrinsic and extrinsic motivations;
for example, someone may ride a bicycle both because it is enjoyable and because they
want to improve their health.
Curiosity is typically viewed as a form of intrinsically motivated behavior in which a learner
or agent interacts with the world to gain information without seeking an externally mediated
reward. Curiosity-driven behaviors therefore mediate between what the learner already knows
and what is to be learned; that is, curiosity can determine what information is sought and when
by determining how the child allocates his or her attention. The puzzle of curiosity, and the
focus of work across philosophy, psychology and computer science, is therefore to understand
the intrinsic motivation mechanisms which underlie curiosity, and to explain how the resulting
curiosity-driven behavior is shaped by – and indeed shapes – factors both intrinsic and extrinsic
to the child.
Due to the complex nature of curiosity, many theories of the cognitive basis of curiosity
have been proposed, until recently largely from the perspective of curiosity in adults. Below, we
review three principal curiosity mechanisms which have been the focus of substantial theoretical
and experimental work (but note many other implementations are possible; for a comprehensive
review, see Oudeyer & Kaplan, 2009). All three can be broadly described as a drive to accumulate
information; however, the precise processes by which information is acquired vary, with a range
of implications for curiosity-driven learning in early development.

Incongruity approaches
On incongruity approaches, curiosity relates to the disparity between what a learner knows and
the perceptual input they receive: when the incongruity is large, the learner attends to a stimulus
in order to address this gap (Berlyne, 1960; Hebb, 1949; Piaget, 2013). Thus the key cognitive
mechanism in incongruity approaches to curiosity is an intrinsic motivation to reduce the
disparity between the external world and the learner’s internal representations. Indeed, there
is evidence that human and non-human learners spontaneously orient their attention to new
information. Sokolov (1963) extended early work by Pavlov to identify an orienting reflex in
both behavioral and neural responses to novel stimuli, by which attention increases to new
information, but gradually decreases across time as this information is assimilated. In the context
of incongruity approaches, then, the orienting reflex represents a mechanism for reduction of
disparity between the environment and representation. Once the stimulus has been learned
as fully as possible, the learner has habituated, at which point attention is reoriented to a more
novel stimulus. Incongruity approaches relate closely to infancy research, in which researchers
frequently capitalize on this orienting reflex in looking time studies, and in particular in famil-
iarization paradigms (e.g., Althaus & Westermann, 2016; Plunkett, Hu, & Cohen, 2008), which
have demonstrated that even early in development, language can affect how infants distribute
their attention (for more discussion of the orienting reflex in infancy, see below).

Information gap approaches


While orienting to novelty may explain at least some exploratory behavior in very early learn-
ing, across development, rigidly adhering to such a strategy could lead to less-than-optimal out-
comes. For example, while falling off a cliff is certainly novel, few learners attempt to experience
it for themselves, let alone repeatedly. Information gap approaches involve the learner perceiving
that there is missing information in their mental representation of a particular situation, and
seeking to fill this gap (e.g., Festinger, 1962; Kagan, 1972; Loewenstein, 1994). For example, a
driver waiting at a road junction will seek information from traffic lights to establish whether it
is safe to proceed. Information gap approaches are therefore similar to incongruity approaches
in that, in both, curiosity involves reducing a gap in knowledge about the world. Critically,
however, information gap theories require that learners consciously notice this gap; by extension,
these approaches necessarily assume that infants can reflect on their existing knowledge in
order to identify the gap which needs to be addressed in order to achieve that goal (Gottlieb,
Oudeyer, Lopes, & Baranes, 2013). Thus, it is not clear that such approaches can account for early
curiosity-driven attention before the development of the higher-level metacognitive processes
that allow learners to reflect on what they know.

Learning progress approaches


More recent theories of curiosity assume that curiosity is an intrinsic motivation to maximize
learning (or competence) progress (e.g., Little & Sommer, 2013; Oudeyer, Gottlieb, & Lopes,
2016; Oudeyer & Smith, 2016; Schmidhuber, 2010). Broadly, like incongruity and information
gap approaches, learning progress theories involve the reduction of disparity between internal
representation and environment; however, a key difference is that learning progress accounts
hinge on the reduction of prediction error. Specifically, the learner makes predictions about the
environment (or the outcome of an action), and compares this prediction to the true state of the
world (or true action outcome). In addition to prediction error-based learning, these approaches
invoke an additional cognitive component: curiosity as an intrinsic motivation to maximize
learning progress. Curiosity-driven exploration therefore emerges from the motivation to seek
out new situations which offer the greatest possibility for maximizing learning (i.e., reducing
prediction error). These contemporary formalizations of the potential mechanisms underlying
curiosity have been implemented in a range of computational robotic models of curiosity-
driven learning, one of which is discussed in detail below, and show promise in capturing devel-
opmental phenomena (e.g., Forestier & Oudeyer, 2017).
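
The distinction between prediction error and learning progress can be written down compactly. The following is one illustrative formalization, not the definition used by any particular model cited above; the symbols (prediction $\hat{s}_t$, observed outcome $s_t$, window length $\tau$) are assumptions introduced here for exposition.

```latex
% Prediction error at step t: mismatch between predicted and observed outcome
e_t = \lVert s_t - \hat{s}_t \rVert^2

% Mean error over the most recent window of \tau steps
\bar{e}_t = \frac{1}{\tau} \sum_{i=0}^{\tau-1} e_{t-i}

% Learning progress: how much the mean error has fallen relative to the previous window
LP_t = \bar{e}_{t-\tau} - \bar{e}_t
```

On this reading, $LP_t$ is positive only while error is falling. A situation in which error is high but flat yields $LP_t \approx 0$, just as a fully mastered situation does, so neither would attract a learner motivated by learning progress.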


Evaluating competing theories


It is important to note that there is at present no consensus as to what cognitive computations
learners perform when engaging in curiosity-driven behavior, and even the behaviors them-
selves are notoriously difficult to define. This point is exemplified by Berlyne’s (1960) semi-
nal early work, in which he proposed that curiosity-based actions vary along two orthogonal
dimensions: perceptual/epistemic and specific/diversive (for a detailed review, see Loewenstein,
1994). Thus, any curiosity-driven behavior can be located anywhere along conjoint perceptual/
epistemic and specific/diversive continua.
The perceptual/epistemic continuum relates to the extent to which a behavior is the result
of concrete perception or more abstract knowledge: perceptual curiosity relates to attraction
to novelty, while epistemic curiosity is a drive to acquire knowledge. The specific-diversive
continuum relates to whether or not the learner aims to gain a particular piece of information:
specific curiosity is a drive to acquire a specific piece of knowledge, while diversive curiosity
is a desire to increase the general information content of the current situation. Thus, according
to Berlyne, any non-extrinsically motivated behavior will fall somewhere along these continua.
For example, the child attempting to discover how to open a toybox to obtain the contents
would be acting on specific-epistemic curiosity, while an infant reorienting its attention after
habituation would be engaging in perceptual-diversive curiosity. From this perspective, all three
mechanisms reviewed above could play a role in shaping children’s attention during learning.
More broadly, Berlyne’s work highlights the challenge in pinpointing how curiosity works, when
there is no agreement as to what curiosity looks like (Kidd & Hayden, 2015).

Computational curiosity
Despite this lack of consensus, substantial progress in understanding the mechanisms underly-
ing curiosity has been made in the field of computational modeling, and at present, the major-
ity of work across disciplines to focus specifically on curiosity-driven learning comes from the
field of computational reinforcement learning. In this work, researchers implement psycho-
logically inspired exploration-based learning mechanisms in artificial computational and/or
robotic learners, with the goal of building a system that can learn autonomously and optimally.
These systems are inspired by work in human reinforcement learning, which has demonstrated
that the neurotransmitter dopamine reinforces learning following rewarding events. The pre-
cise role of dopamine in learning is complex and the focus of active research (for a review,
see Bromberg-Martin, Matsumoto, & Hikosaka, 2010). Broadly, however, dopamine-releasing
neurons are understood to increase their production of the neurotransmitter in response to
external environmental input that is better than predicted, reinforcing the successful action or
response such that it is more likely to be produced in future. For example, dopamine release is
increased in response to discovery of food or water (Schultz, 1998). Computational reinforce-
ment learning approaches are based on this mechanism, employing an artificial reward system
by which the system receives a numerical reinforcing signal when it carries out an action or
encounters an environment that leads to a particular outcome (for a detailed discussion of
the relationship between biological and computational reinforcement learning, see Kaplan &
Oudeyer, 2007). The larger the reward, the more likely the learner is to repeat the action or
seek out the circumstances that elicited it. Curiosity is therefore a mechanism for driving
exploration such that reward is maximized. Consequently, a central question in this field is
how to specify how this reward is calculated, that is, to develop a mechanism for curiosity
(Oudeyer & Kaplan, 2007).
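
The idea that learning is driven by outcomes that are "better than predicted" is often illustrated with a simple delta-rule update, in which a value estimate moves towards each observed reward in proportion to the prediction error. The sketch below is a textbook-style simplification, not the mechanism of any study cited above; the function names and the `learning_rate` parameter are hypothetical.

```python
def update_value(value, reward, learning_rate=0.1):
    """Return the updated value estimate and the prediction error for one outcome."""
    # Positive when the outcome is better than predicted, negative when worse.
    prediction_error = reward - value
    return value + learning_rate * prediction_error, prediction_error


def simulate(rewards, learning_rate=0.1):
    """Track the value estimate and prediction errors across a sequence of rewards."""
    value = 0.0
    errors = []
    for reward in rewards:
        value, error = update_value(value, reward, learning_rate)
        errors.append(error)
    return value, errors


# The same reward delivered 50 times: the surprise shrinks as the reward becomes expected.
final_value, errors = simulate([1.0] * 50)
print(f"first error = {errors[0]:.3f}, last error = {errors[-1]:.3f}")
```

Because the reinforcing signal here is the prediction error rather than the reward itself, a fully expected reward stops producing a signal, which is one reason the specification of the reward function matters so much for models of curiosity.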


In fact, computational models employ a range of different implementations of curiosity,
reflecting the range of existing theories. In line with information gap approaches, computa-
tional curiosity has been modeled as prediction error reduction. For example, Schlesinger and
Amso (2013) presented an intrinsically motivated artificial learner with five sets of potential
visual exploratory sequences of a series of naturalistic images. Each set of sequences consisted of
samples of gaze data over an image. One set of samples was generated by 9-month-old infants
and one by adults, with the remaining three generated artificially based on attributes of the
images (salient areas, noisy areas or random sampling). For each set, samples were presented
one-by-one to a simple recurrent neural network (Elman, 1990) that learned across training to
predict which sample should appear next in the sequence, resulting in five networks trained on
infant, adult or image-based sequences. For each network, error across learning was recorded,
reflecting the accuracy with which the network had learned the sequences. The artificial learner
then selected a network to learn from. Next, the artificial learner received a reward that was
inversely related to that network’s error. Importantly, in the types of neural network employed in
this study, error reflects the ease with which the network can encode its inputs: stimuli that are
easy to learn result in low network error, while stimuli that are difficult to learn result in high
error. Thus, because the artificial learner received greater reward for choosing low-error
networks, and would therefore learn to choose low-error networks in future, over time the learner
would choose networks which had been trained with easy-to-learn sequences. Critically, the
artificial learner systematically preferred the network that had learned to predict the exploratory
sequences generated by infants, suggesting that infants’ image scanning generated sequences of
gaze samples that were easier to learn than either artificially or randomly generated sequences.
As a consequence, this simulation suggests that infants can spontaneously minimize prediction
error. This reinforcement learning-based simulation indicates that prediction error reduction
is a potential mechanism for curiosity-based learning and predicts that infants are capable of
generating exploratory sequences that maximize their own learning when allowed to engage in
curiosity-driven visual exploration.
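
The selection stage of such a simulation can be sketched as a learner that repeatedly chooses among several already-trained predictors and is rewarded more for choosing those with lower error. This is a minimal illustration and not the authors' implementation: the epsilon-greedy choice rule, the `1 - error` reward and every number in `PLACEHOLDER_ERRORS` are assumptions, with the error values invented solely so that one source is easier to predict than the rest.

```python
import random

# Invented error levels for illustration only; these are NOT results from the study.
PLACEHOLDER_ERRORS = {
    "salient": 0.4,
    "noisy": 0.4,
    "random": 0.4,
    "adult": 0.4,
    "infant": 0.2,
}


def run_selection(network_errors, trials=5000, epsilon=0.1, learning_rate=0.1, seed=0):
    """Learn which network to choose when reward falls as network error rises."""
    rng = random.Random(seed)
    names = list(network_errors)
    values = {name: 0.0 for name in names}   # learned estimate of each network's reward
    counts = {name: 0 for name in names}     # how often each network was chosen
    for _ in range(trials):
        if rng.random() < epsilon:
            choice = rng.choice(names)                # occasional random exploration
        else:
            choice = max(names, key=values.get)       # otherwise pick the best so far
        observed_error = max(0.0, network_errors[choice] + rng.gauss(0.0, 0.02))
        reward = 1.0 - observed_error                 # lower error, greater reward
        values[choice] += learning_rate * (reward - values[choice])
        counts[choice] += 1
    return values, counts


values, counts = run_selection(PLACEHOLDER_ERRORS)
print(max(counts, key=counts.get), counts)
```

With these placeholder values the learner starts by exploiting whichever network it tries first, then shifts to the lowest-error network once exploration has sampled it often enough, which mirrors the logic by which a preference for easy-to-learn sequences can emerge from reward alone.
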
However, models that rely only on minimizing prediction error to generate reward cannot
account for the diversity of infants’ exploration in complex learning environments. To address
this issue, Oudeyer, Kaplan, and Hafner (2007) modeled curiosity as a motivation to maximize
learning progress. In this “playground experiment”, two dog-like robots are placed on a play mat
surrounded by toy objects. One robot is the “learner”, and the other is the “teacher”. Each can
vocalize; the learner spontaneously produces vocalizations when looking at the teacher, while
the teacher imitates the learner’s vocalizations. Both robots possess a series of primitive motor
functions that allow them to explore the objects by grasping them with their mouths or pushing
them with their limbs. The learner robot attempts to predict the outcomes of its actions on the
toys, and receives a reward that depends on how much its predictions improve or worsen
following a given movement (i.e., on the change in prediction error). Thus, successfully predicted
actions are more likely to be repeated than unsuccessfully predicted actions, leading to learning
progress, in turn leading to a reward. However, once a particular action is well learned (with
reduced opportunity for learning progress), the robot explores new possibilities for action. By
initially randomly exploring the environment and receiving a reward for successful predictions,
then, the learner robot fine-tunes its exploration across experience and learns to explore action
spaces that allow it to learn most rapidly, without being explicitly told how best to learn.
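
The dynamics described here, in which an agent concentrates on whatever it is currently learning fastest and moves on once progress slows, can be reproduced in a toy simulation. The sketch below is not the playground experiment's architecture: the `Activity` class, the geometric error decay, the window size and the greedy choice rule are all simplifying assumptions made for illustration.

```python
class Activity:
    """A toy action space whose prediction error shrinks with practice."""

    def __init__(self, start_error, decay):
        self.error = start_error
        self.decay = decay  # 1.0 means practice never reduces the error

    def practice(self):
        """Return the current prediction error, then improve by one step."""
        observed = self.error
        self.error *= self.decay
        return observed


def learning_progress(history, window=5):
    """Fall in mean error from the previous window to the most recent window."""
    if len(history) < 2 * window:
        return float("inf")  # too little experience: treat as worth sampling
    older = history[-2 * window:-window]
    recent = history[-window:]
    return sum(older) / window - sum(recent) / window


def explore(activities, steps=200, window=5):
    """Always practise the activity that currently offers the most learning progress."""
    histories = {name: [] for name in activities}
    choices = []
    for _ in range(steps):
        name = max(activities, key=lambda n: learning_progress(histories[n], window))
        histories[name].append(activities[name].practice())
        choices.append(name)
    return choices


toy_world = {
    "easy": Activity(1.0, 0.80),         # error falls quickly
    "hard": Activity(1.0, 0.97),         # error falls slowly
    "mastered": Activity(0.05, 1.0),     # low error, nothing left to learn
    "unlearnable": Activity(1.0, 1.0),   # high error that never improves
}
choices = explore(toy_world)
print({name: choices.count(name) for name in toy_world})
```

After briefly sampling every activity, this agent ignores both the mastered and the unlearnable activity, practises the easy one first, and switches to the hard one once the easy one yields less progress. An agent rewarded for high error alone would instead be drawn to the unlearnable activity, and one rewarded for low error alone would stay with the mastered one.
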
Particularly relevant to the current discussion, one action space available to the robot is
vocalization. Specifically, when the learner robot vocalizes, the teacher robot is preprogrammed
to respond. Based on its intrinsic motivation to maximize global learning progress while mini-
mizing error in predicting the outcome of a particular action (here, vocalization), the learner
robot exhibits turn taking in vocalizations with the teacher, suggesting that curiosity may play
a role in the development of early vocal interaction (Oudeyer & Smith, 2016). Importantly, this
model is one of the first to address the link between curiosity and the beginnings of language, by
treating vocal behavior as part of the same learning progress maximization system as intrinsically
motivated exploration (but see also Forestier & Oudeyer, 2017).
These models are just two examples of the many computational investigations of curiosity
from the reinforcement learning field. Other models have also been shown to autonomously
maximize their own learning, using a range of implementations of curiosity (for reviews, see
Gottlieb et al., 2013; Oudeyer et al., 2007). Thus, currently there is no single optimum mecha-
nism for calculating reward in curiosity-based learning. However, since the goal of this type of
model is to build an artificial agent that can learn independently, many models, while inspired
by work in psychology, include elements that need not reflect early developmental mechanisms
(Oudeyer & Kaplan, 2009). For instance, both the models reviewed include a metacognitive
module that can “reflect” on what has been learned and decide to select a particular course of
action. However, precisely when metacognition emerges in children is currently unclear (Sch-
neider, 2008). Overall, however, this work in computational reinforcement learning signposts
critical new directions for empirical work towards understanding the mechanisms driving early
development – and consequently early language acquisition (Gottlieb et al., 2013).
One recent model aimed to simulate curiosity-driven learning in infancy without invok-
ing metacognition. Twomey and Westermann (2017) described a simple neural network model
that selected the order in which it learned from stimuli in a category learning task. The authors
used an autoencoder architecture (e.g., Mareschal & French, 2000), which learns over repeated
encounters with a set of inputs to represent what it sees. Autoencoders do not require a sepa-
rate “teacher” during learning, and are not dependent on an engineered reward module, and
are therefore a useful tool for simulating early developmental processes. Existing work with
these networks has captured empirical results in infants’ category learning using a simulated
familiarization/novelty preference task in line with the canonical empirical method of examin-
ing infants’ category learning (French, Mareschal, Mermillod, & Quinn, 2004; Westermann &
Mareschal, 2012, 2014). In these tasks, infants learn from a series of images presented in a fixed
order (e.g., Mather & Plunkett, 2011; Quinn, Eimas, & Rosenkrantz, 1993; Younger & Cohen,
1983). In contrast, this model was allowed to select the order in which it viewed stimuli based
on a novel implementation of curiosity. Curiosity was modeled broadly as an intrinsic motiva-
tion to maximize learning at each learning step, but specifically as a function of what the model
had learned already, combined with in-the-moment information from the environment and the
model’s ability to learn from that particular stimulus. By choosing the stimulus that maximized
this curiosity function, the model learned optimally relative to an identical model trained with
fixed stimulus presentation, and did so by selecting stimuli in a sequence of intermediate com-
plexity. Like the models reviewed, this work indicates that infants’ learning in the real world
should show systematic patterns of exploration, but offers a mechanism that does not depend
on cognitive capacities that develop post-infancy. However, this work leaves open the important
question of how language and curiosity-driven learning are related.

Curiosity in early development


Given its fundamental role in learning, to date surprisingly little empirical work has focused on
curiosity in infants. Observationally, however, infants show clear curiosity-type behaviors: dur-
ing play they chew and bash toys, shake and roll them, even throw them – actions which are
not motivated by any external reward. Revisiting the existing literature also provides evidence
for the prevalence of curiosity-driven behavior in infancy and early development. For example,
in line with incongruity approaches in which curiosity is a drive to minimize the discrepancy
between the external world and the learner’s internal representations, given experience over
time with a particular stimulus infants will switch their attention when a novel stimulus appears
(Fantz, 1964), a much-replicated phenomenon known as the novelty preference (see orienting
reflex, above). This phenomenon suggests that some kind of prediction error reduction mecha-
nism may be at play in early infancy: infants orient to stimuli that do not conform to their
predictions, and continue to encode them until their predictions are correct. Further evidence
comes from early work which shows that greater stimulus complexity engages infants’ atten-
tion; for example, by 6 months infants have been shown to look for longer at black and white
checkerboard stimuli consisting of many small squares than stimuli consisting of fewer, larger
squares when the overall ratio of black to white was controlled (Fantz & Fagan, 1975; see also,
e.g., Cohen, DeLoache, & Rissman, 1975; Colombo, Frick, & Gorman, 1997; Tellinghuisen &
Oakes, 1997). Thus, some of the determinants of curiosity-driven attention theorized to be
operating in adults – and, in particular, attention to novelty – are also seen early in development.
However, establishing a clear link between novelty in the environment and infant curiosity
is not trivial. With respect to the novelty preference, the picture is more complex than a simple
increase in attention to the most novel stimulus present: under some circumstances infants prefer
familiar stimuli, and the extent to which they orient their attention towards familiarity or nov-
elty in looking time studies can be manipulated experimentally by varying stimulus presentation
times (Balas & Oakes, 2015; Houston-Price & Nakai, 2004; Mather, 2013). Specifically, in these
studies infants show a familiarity preference after shorter exposures but a novelty preference after
longer exposures. Further, as Berlyne (1960) points out, novelty can be complete (a never-before-
seen toy), long-term (a toy not seen today, but played with yesterday) or short-term (a toy
played with in the past few minutes). Novelty can also be absolute, as in the never-before-seen
objects commonly used in studies of word learning, or relative, as in the out-of-category “novel”
stimuli often presented in studies of infant categorization. More broadly, patterns of curiosity-
driven exploration are not easy to predict: infants’ attention allocation and exploratory behaviors
emerge from the interaction between the infant, the infant’s particular learning history, and in-
the-moment features of the task environment. For example, while one infant may be familiar
with their favorite 3D dinosaur toy, another may have only seen dinosaurs in picture books.
When encountering a new toy dinosaur at nursery, each child’s exploration of the object will
depend not only on the features or affordances of the object itself, but also on the child’s previ-
ous experience. Thus, while novelty is an important determinant of curiosity-driven attention,
novelty itself is subjective; curiosity-driven behavior therefore depends fundamentally on the
dynamic interaction between external, perceptual input and the child’s learned representations
(Kagan, 2009; Oudeyer et al., 2016; Twomey & Westermann, 2017). Understanding this interac-
tion is a current challenge for developmental psychology.
The recent studies of curiosity in development employ a range of paradigms but offer con-
verging evidence that infants and children are active learners who systematically sample their
learning environments; moreover, these early empirical studies indicate that curiosity-driven
learning involves imposing structure or regularity on the input to learning. This ability appears
to start very young. For example, Kidd and colleagues (Kidd, Piantadosi, & Aslin, 2012) used
a looking time paradigm to explore infants’ preference for learning from stimuli of different
levels of complexity. Specifically, they presented 7- to 8-month-old infants with a series of
videos of occlusion events, in which a brightly colored occluder disappeared to reveal either an
image of an object such as a fire truck, or a blank screen. The probability of an object appear-
ing varied from 0 to 1; thus, while some objects appeared on every trial, some objects did not.
Consequently, some trial types were highly predictable (or simpler), while others were highly
unpredictable (or complex). The authors found a “Goldilocks effect”: infants preferred to look
at trials which were moderately predictable, but not highly predictable or unpredictable (for
a similar finding in auditory attention, see Kidd, Piantadosi, & Aslin, 2014; see also Yu, Bona-
witz, & Shafto, 2017).
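
One familiar way to see why appearance probabilities near 0 or 1 count as predictable and those near 0.5 as unpredictable is to compute the Shannon entropy of the two possible outcomes. This is offered only as an intuition aid: the chapter does not specify the complexity measure used in the original study, and the entropy measure and the function name `outcome_uncertainty` are assumptions introduced here.

```python
import math


def outcome_uncertainty(p_appear):
    """Shannon entropy in bits of a two-outcome event (object appears or does not)."""
    if p_appear in (0.0, 1.0):
        return 0.0  # a certain outcome carries no uncertainty
    q = 1.0 - p_appear
    return -(p_appear * math.log2(p_appear) + q * math.log2(q))


for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"p = {p:.1f}  uncertainty = {outcome_uncertainty(p):.3f} bits")
```

Under this stand-in measure, a Goldilocks pattern corresponds to a preference for events whose uncertainty is neither at the floor (always or never appearing) nor at the ceiling (appearing half the time).
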
This preference for intermediate complexity has been replicated in a category exploration
task in which complexity was indexed by perceptual distances between stimuli. Twomey, Malem,
and Westermann (2016) presented 12-month-old children and adults with a shape priming task
to explore what level of complexity infants generate when engaged in curiosity-driven category
exploration. In a 2D eye-tracking paradigm infants saw abstract shapes presented on a com-
puter screen. Shapes were drawn from a novel category of five exemplars which varied along a
continuum in which the differences between successive exemplars were controlled; specifically,
exemplars varied systematically along a shape continuum (wide to narrow) and a color contin-
uum (red to blue). Each step along the continuum was taken as a perceptual distance of 1; thus,
infants could traverse short distances (i.e., 1), medium distances (i.e., 2) or long distances (i.e., 3).
On prime trials, infants saw one of the two category-peripheral exemplars; on test trials, they
saw the remaining four exemplars, and were free to scan these for 10 seconds. For each sequence
generated in the 10-second free exploration phase, the mean of the perceptual distances was cal-
culated and served as a proxy for sequence complexity. In line with Kidd and colleagues’ (2012)
findings, infants’ visual exploration was systematic: their first looks were consistently to stimuli of
intermediate distance from the prime. Analyses of individual differences in scan sequences also
indicated that infants spontaneously imposed structure on their input: following their first look,
some (although not all) infants systematically generated sequences of intermediate difficulty. As
well as extending Kidd and colleagues’ findings to a different measure of task “complexity” (i.e.,
perceptual distance versus predictability), this work provides evidence convergent with compu-
tational work (Schlesinger & Amso, 2013; Twomey & Westermann, 2017) that infants are not
passive learners or simply random explorers: rather, in visual exploration at least, infants appear
to prefer to learn from stimuli of intermediate complexity and, importantly, systematically select
information in order to generate this level of complexity. Importantly, prior to these studies it
was not clear what level of environmental complexity would best support early learning. For
example, in experiments in which what infants see is more constrained, newborns have been
shown to learn best from simple sequences (Bulf, Johnson, & Valenza, 2011), while maximizing
complexity can support 10-month-old infants’ category learning (Mather & Plunkett, 2011).
This newly emerging research suggests that in contrast, when allowed to explore freely, infants
prefer to learn from intermediate complexity.
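
The complexity proxy used in the category exploration task, the mean perceptual distance across a scan sequence, is straightforward to compute once exemplars are given positions on the continuum. The sketch below assumes exemplars are numbered 1 to 5 in continuum order and that a scan sequence is simply the ordered list of fixated exemplars; the function names and the example sequence are hypothetical, and details such as how refixations were treated are not specified in the text.

```python
def perceptual_distances(scan_sequence):
    """Continuum steps between each pair of successively fixated exemplars."""
    return [abs(b - a) for a, b in zip(scan_sequence, scan_sequence[1:])]


def sequence_complexity(scan_sequence):
    """Mean perceptual distance across a scan sequence (None if under two looks)."""
    distances = perceptual_distances(scan_sequence)
    if not distances:
        return None
    return sum(distances) / len(distances)


# Hypothetical scan over test exemplars 2 to 5 after a prime at position 1.
example_scan = [3, 4, 2, 5, 4]
print(perceptual_distances(example_scan))  # [1, 2, 3, 1]
print(sequence_complexity(example_scan))   # 1.75
```

A sequence made up only of steps of 1 would score at the simple end of this measure and one made up only of steps of 3 at the complex end, so a mean near 2 corresponds to the intermediate level of complexity discussed above.
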
While curiosity in early development is the focus of ongoing research, these findings have
interesting links with classic theories of learning. In particular, in Vygotsky’s (1978) influential
work in educational theory, children’s zone of proximal development (ZPD; the amount – and
therefore maximum – they can develop at a given time) is defined by a more capable tutor struc-
turing their input such that it is sufficiently challenging for them to learn, but not so challenging
that learning is impossible. That is, the ZPD represents a learning space of intermediate difficulty.
These early studies of infants’ curiosity-driven learning raise the possibility that while input from
social partners is clearly important, infants themselves are capable of defining their own ZPD.

Curiosity and language


As noted, the relationship between curiosity, attention and early language development has only
recently been explicitly studied, and much of this work is ongoing. However, findings from
traditional experimental paradigms suggest that hearing language during learning can affect
how children allocate their attention. Months before the onset of speech, infants as young as
6 months have been shown to direct their attention to the correct referent when they hear a
familiar word (Bergelson & Swingley, 2012). Novel words also affect attention: 10-month-old
infants hearing words during category learning demonstrate increased attention to category-
relevant features of stimuli (Althaus & Mareschal, 2014), and the presence of language shapes the
category representations they ultimately learn (Althaus & Westermann, 2016; Plunkett, Hu, &
Cohen, 2008). In toddlers, the presence of novel words during a match-to-sample task increases
the duration of children’s fixations while scanning potential referents (Carvalho, Vales, Fausey, &
Smith, 2018) and increases the amount of time spent engaging with 3D objects (Baldwin &
Markman, 1989). Interestingly, language need not be present in-the-moment: knowing an item’s
name prompts infants to increase their attention to that item even when that item is presented
in silence (Gliga, Volein, & Csibra, 2010; Mani & Plunkett, 2010; Twomey & Westermann, 2018).
Importantly, while these studies suggest that language affects children’s attention and conse-
quently the nature of the input to learning, recent work suggests that nonlinguistic aspects of the
learning environment also have profound effects on language learning. For example, the pres-
ence of variability in the visual scene can facilitate noun acquisition and longer-term vocabu-
lary development (Goldenberg & Sandhofer, 2013; Perry, Samuelson, Malloy, & Schiffer, 2010;
Twomey, Ma, & Westermann, 2017), while visual characteristics of a potential referent determine
how children generalize words (e.g., S. S. Jones, Smith, & Landau, 1991; Landau, Smith, & Jones,
1988; Son, Smith, & Goldstone, 2008). Overall, a picture is emerging of a dynamic interaction
between linguistic and nonlinguistic input in early language acquisition. This point is critical:
evidence is building that infants can actively select information in their learning environment
to generate their own input to learning. Given that the nonlinguistic input affects early lan-
guage acquisition, curiosity-driven learning may have a profound effect on language acquisi-
tion. Understanding curiosity in development is therefore fundamental to our understanding
of language development. Current work on learning outside the constrained lab environment,
inspired jointly by computational work and recent empirical studies, is under way to generate
critical insight into this complex relationship.

Implications
As is clear from this review, current theories of curiosity are heterogeneous. Such is the com-
plexity of the problem, in fact, that Kidd and Hayden (2015) recently argued that researchers in
developmental psychology at least for now should put to one side theoretical debates and move
towards characterizing and understanding the behavior in order to ensure progress continues to
be made. At present, then, there remains much work to be done in understanding how young
children can drive and even maximize their own learning and in particular how this ability
interacts with language. Nonetheless, existing studies have begun to highlight avenues for
future research with important implications for the development of interventions in atypical
(language) development.
First, an important message from the current work is that infants can drive their own learning,
and that this exploration depends critically on what the infants already know. Taken to its logical
conclusion, then, the emerging picture of curiosity-driven learning highlights the importance
of play-based learning to early development and raises the possibility that early years educa-
tion should emphasize exploration-based learning over instruction with pre-prepared materials.
Indeed, recent work in related educational settings indicates that eliciting curiosity may prove
a useful pedagogical tool: for example, school age children have been shown to better retain
facts that elicit greater curiosity (Walin, O’Grady, & Xu, 2016), and work in robotics highlights the
benefit to learning of curiosity on the part of the instructor (e.g., Gordon, Breazeal, & Engel,
2015). Investigating the role of curiosity in the early years educational context, and in particular
in early language acquisition, is an important avenue for future work.
Second, the infants who took part in existing studies of curiosity were typically developing. It
is possible, therefore, that in atypical development the structure (input) infants generate is different
from that generated by typically developing children, and this, in addition to identified neurocogni-
tive impairments, may in part explain delays in learning. For example, in the case of autism spectrum
disorder (ASD), infants at risk for autism who go on to develop socio-communicative difficulties
show patterns of attention allocation that differ from those of low-risk controls (Bedford et al., 2012; for a review,
see Jones, Gliga, Bedford, Charman, & Johnson, 2014). Clearly, characterizing curiosity-driven
exploration in typical development is critical for developing tools to identify atypical exploratory
styles. In terms of intervention, such a characterization could also provide insight into the features of
the learning environment that best support learning, and the foundation for designing interventions
that make these features more readily accessible to atypically developing children.
Third, and importantly, while this work emphasizes infants’ impressive ability to learn inde-
pendently, clearly not all learning takes place in isolation: substantial amounts of curiosity-based
learning take place during social interactions, particularly as language develops. Thus, under-
standing how the child’s intrinsic curiosity interacts with and may be shaped by input from
caregivers is an important avenue for future research.

References
Adolph, K. E., & Tamis-LeMonda, C. S. (2014). The costs and benefits of development: The transition from
crawling to walking. Child Development Perspectives, 8(4), 187–192. doi:10.1111/cdep.12085
Althaus, N., & Mareschal, D. (2014). Labels direct infants’ attention to commonalities during novel category
learning. PloS One, 9(7), e99670. doi:10.1371/journal.pone.0099670
Althaus, N., & Westermann, G. (2016). Labels constructively shape object categories in 10-month-old
infants. Journal of Experimental Child Psychology, 151, 5–17. doi:10.1016/j.jecp.2015.11.013
Balas, B., & Oakes, L. (2015). Modeling infant visual preference as perceptual oscillation. Proceedings of the 2015
Joint IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL-
EpiRob) (pp. 26–32). IEEE. doi:10.1109/DEVLRN.2015.7345451
Baldwin, D. A., & Markman, E. M. (1989). Establishing word-object relations: A first step. Child Development,
60(2), 381–398. doi:10.2307/1130984
Bedford, R., Elsabbagh, M., Gliga, T., Pickles, A., Senju, A., Charman, T., . . . the BASIS Team. (2012).
Precursors to social and communication difficulties in infants at-risk for autism: Gaze following and
attentional engagement. Journal of Autism and Developmental Disorders, 42(10), 2208–2218.
doi:10.1007/s10803-012-1450-y
Bergelson, E., & Swingley, D. (2012). At 6–9 months, human infants know the meanings of many common
nouns. Proceedings of the National Academy of Sciences, 109(9), 3253–3258. doi:10.1073/pnas.1113380109
Berlyne, D. E. (1960). Conflict, arousal, and curiosity. New York, NY: McGraw-Hill Book Company.
doi:10.1037/11164-000
Bromberg-Martin, E. S., Matsumoto, M., & Hikosaka, O. (2010). Dopamine in motivational control:
Rewarding, aversive, and alerting. Neuron, 68(5), 815–834. doi:10.1016/j.neuron.2010.11.022
Bulf, H., Johnson, S. P., & Valenza, E. (2011). Visual statistical learning in the newborn infant. Cognition,
121(1), 127–132. doi:10.1016/j.cognition.2011.06.010
Carvalho, P. F., Vales, C., Fausey, C. M., & Smith, L. B. (2018). Novel names extend for how long preschool
children sample visual information. Journal of Experimental Child Psychology, 168, 1–18. doi:10.1016/j.
jecp.2017.12.002
Cohen, L. B., DeLoache, J. S., & Rissman, M. W. (1975). The effect of stimulus complexity on infant visual
attention and habituation. Child Development, 46(3), 611–617. doi:10.2307/1128557
Colombo, J., Frick, J. E., & Gorman, S. A. (1997). Sensitization during visual habituation sequences: Pro-
cedural effects and individual differences. Journal of Experimental Child Psychology, 67(2), 223–235.
doi:10.1006/jecp.1997.2406
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211. doi:10.1207/
s15516709cog1402_1
Fantz, R. L. (1964). Visual experience in infants: Decreased attention to familiar patterns relative to novel ones.
Science, 146, 668–670. doi:10.1126/science.146.3644.668
Fantz, R. L., & Fagan, J. F. (1975). Visual attention to size and number of pattern details by term and preterm
infants during first 6 months. Child Development, 46(1), 3–18. doi:10.2307/1128828
Festinger, L. (1962). A theory of cognitive dissonance (Vol. 2). Stanford, CA: Stanford University Press.
Forestier, S., & Oudeyer, P-Y. (2017). A unified model of speech and tool use early development. In H.
Granger & Sutton (Eds.), Proceedings of the 39th annual conference of the cognitive science society. Austin, TX:
Cognitive Science Society.
French, R. M., Mareschal, D., Mermillod, M., & Quinn, P. C. (2004). The role of bottom-up processing
in perceptual categorization by 3- to 4-month-old infants: Simulations and data. Journal of Experimental
Psychology: General, 133(3), 382–397. doi:10.1037/0096-3445.133.3.382
Gliga, T., Volein, A., & Csibra, G. (2010). Verbal labels modulate perceptual object processing in 1-year-old
children. Journal of Cognitive Neuroscience, 22(12), 2781–2789. doi:10.1162/jocn.2010.21427
Goldenberg, E. R., & Sandhofer, C. M. (2013). Same, varied, or both? Contextual support aids young
children in generalizing category labels. Journal of Experimental Child Psychology, 115(1), 150–162.
doi:10.1016/j.jecp.2012.11.011
Gordon, G., Breazeal, C., & Engel, S. (2015). Can children catch curiosity from a social robot? Proceedings of the
Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (pp. 91–98). ACM.
doi:10.1145/2696454.2696469
Gottlieb, J., Oudeyer, P-Y., Lopes, M., & Baranes, A. (2013). Information-seeking, curiosity, and attention:
Computational and neural mechanisms. Trends in Cognitive Sciences, 17(11), 585–593. doi:10.1016/j.
tics.2013.09.001
Hebb, D. (1949). The organization of behavior: A neuropsychological theory. New York, NY: Wiley.
Houston-Price, C., & Nakai, S. (2004). Distinguishing novelty and familiarity effects in infant preference
procedures. Infant and Child Development, 13(4), 341–348. doi:10.1002/icd.364
Jones, E. J. H., Gliga, T., Bedford, R., Charman, T., & Johnson, M. H. (2014). Developmental pathways to
autism: A review of prospective studies of infants at risk. Neuroscience & Biobehavioral Reviews, 39, 1–33.
doi:10.1016/j.neubiorev.2013.12.001
Jones, S. S., Smith, L. B., & Landau, B. (1991). Object properties and knowledge in early lexical learning.
Child Development, 62(3), 499–516. doi:10.2307/1131126
Kagan, J. (1972). Motives and development. Journal of Personality and Social Psychology, 22(1), 51. doi:10.1037/
h0032356
Kagan, J. (2009). Categories of novelty and states of uncertainty. Review of General Psychology, 13(4), 290.
doi:10.1037/a0017142
Kaplan, F., & Oudeyer, P-Y. (2007). In search of the neural circuits of intrinsic motivation. Frontiers in Neu-
roscience, 1(1), 225–236. doi:10.3389/neuro.01.1.1.017.2007
Kidd, C., & Hayden, B. Y. (2015). The psychology and neuroscience of curiosity. Neuron, 88(3), 449–460.
doi:10.1016/j.neuron.2015.09.010
Kidd, C., Piantadosi, S. T., & Aslin, R. N. (2012). The goldilocks effect: Human infants allocate attention
to visual sequences that are neither too simple nor too complex. PloS One, 7(5), e36399. doi:10.1371/
journal.pone.0036399
Kidd, C., Piantadosi, S. T., & Aslin, R. N. (2014). The goldilocks effect in infant auditory attention. Child
Development, 85(5), 1795–1804. doi:10.1111/cdev.12263
Landau, B., Smith, L. B., & Jones, S. S. (1988). The importance of shape in early lexical learning. Cognitive
Development, 3(3), 299–321. doi:10.1016/0885-2014(88)90014-7
Little, D. Y-J., & Sommer, F. T. (2013). Learning and exploration in action-perception loops. Frontiers in
Neural Circuits, 7, 37. doi:10.3389/fncir.2013.00037
Loewenstein, G. (1994). The psychology of curiosity: A review and reinterpretation. Psychological Bulletin,
116(1), 75–98. doi:10.1037/0033-2909.116.1.75
Mani, N., & Plunkett, K. (2010). In the infant’s mind’s ear: Evidence for implicit naming in 18-month-olds.
Psychological Science, 21(7), 908–913. doi:10.1177/0956797610373371
Mareschal, D., & French, R. (2000). Mechanisms of categorization in infancy. Infancy, 1(1), 59–76.
doi:10.1207/S15327078IN0101_06
Mather, E. (2013). Novelty, attention, and challenges for developmental psychology. Frontiers in Psychology,
4. doi:10.3389/fpsyg.2013.00491
Mather, E., & Plunkett, K. (2011). Same items, different order: Effects of temporal variability on infant
categorization. Cognition, 119(3), 438–447. doi:10.1016/j.cognition.2011.02.008
Moulin-Frier, C., Nguyen, S. M., & Oudeyer, P-Y. (2014). Self-organization of early vocal develop-
ment in infants and machines: The role of intrinsic motivation. Frontiers in Psychology, 4. doi:10.3389/
fpsyg.2013.01006
Oudeyer, P-Y., Gottlieb, J., & Lopes, M. (2016). Intrinsic motivation, curiosity, and learning: Theory and
applications in educational technologies. In B. Studer & S. Knecht (Eds.), Progress in brain research
(Vol. 229, pp. 257–284). Cambridge, MA: Elsevier. doi:10.1016/bs.pbr.2016.05.005
Oudeyer, P-Y., & Kaplan, F. (2009). What is intrinsic motivation? A typology of computational approaches.
Frontiers in Neurorobotics, 1. doi:10.3389/neuro.12.006.2007
Oudeyer, P-Y., Kaplan, F., & Hafner, V. V. (2007). Intrinsic motivation systems for autonomous mental
development. IEEE Transactions on Evolutionary Computation, 11(2), 265–286. doi:10.1109/TEVC.2006.
890271
Oudeyer, P-Y., & Smith, L. B. (2016). How evolution may work through curiosity-driven developmental
process. Topics in Cognitive Science, 8(2), 492–502. doi:10.1111/tops.12196
Perry, L. K., Samuelson, L. K., Malloy, L. M., & Schiffer, R. N. (2010). Learn locally, think globally: Exem-
plar variability supports higher-order generalization and word learning. Psychological Science, 21(12),
1894–1902. doi:10.1177/0956797610389189
Piaget, J. (2013). The construction of reality in the child. Oxon: Routledge.
Plunkett, K., Hu, J-F., & Cohen, L. B. (2008). Labels can override perceptual categories in early infancy.
Cognition, 106(2), 665–681. doi:10.1016/j.cognition.2007.04.003
Quinn, P. C., Eimas, P. D., & Rosenkrantz, S. L. (1993). Evidence for representations of perceptually similar
natural categories by 3-month-old and 4-month-old infants. Perception, 22(4), 463–475. doi:10.1068/
p220463
Ryan, R. M., & Deci, E. L. (2000). Intrinsic and extrinsic motivations: Classic definitions and new direc-
tions. Contemporary Educational Psychology, 25(1), 54–67. doi:10.1006/ceps.1999.1020
Schlesinger, M., & Amso, D. (2013). Image free-viewing as intrinsically-motivated exploration: Estimat-
ing the learnability of center-of-gaze image samples in infants and adults. Frontiers in Psychology, 4.
doi:10.3389/fpsyg.2013.00802
Schmidhuber, J. (2010). Formal theory of creativity, fun, and intrinsic motivation (1990–2010). IEEE Trans-
actions on Autonomous Mental Development, 2(3), 230–247. doi:10.1109/TAMD.2010.2056368
Schneider, W. (2008). The development of metacognitive knowledge in children and adoles-
cents: Major trends and implications for education. Mind, Brain, and Education, 2(3), 114–121.
doi:10.1111/j.1751-228X.2008.00041.x
Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80(1), 1–27.
doi:10.1152/jn.1998.80.1.1
Sokolov, E. N. (1963). Higher nervous functions: The orienting reflex. Annual Review of Physiology, 25(1),
545–580. doi:10.1146/annurev.ph.25.030163.002553
Son, J. Y., Smith, L. B., & Goldstone, R. L. (2008). Simplicity and generalization: Short-cutting
abstraction in children’s object categorizations. Cognition, 108(3), 626–638. doi:10.1016/j.
cognition.2008.05.002
Tellinghuisen, D. J., & Oakes, L. M. (1997). Distractibility in infancy: The effects of distractor charac-
teristics and type of attention. Journal of Experimental Child Psychology, 64(2), 232–254. doi:10.1006/
jecp.1996.2341
Twomey, K. E., Ma, L., & Westermann, G. (2017). All the right noises: Background variability helps early
word learning. Cognitive Science, 42(Suppl. 2), 413–438. doi:10.1111/cogs.12539
Twomey, K. E., Malem, B., & Westermann, G. (2016). Infants’ information selection in a category learning
task. In K. E. Twomey (Chair), Understanding infants’ curiosity-based learning: Empirical and
computational approaches. Symposium presented at the XX Biennial International Conference on Infant
Studies, New Orleans, LA, USA.
Twomey, K. E., & Westermann, G. (2017). Curiosity-based learning in infants: A neurocomputational
approach. Developmental Science. doi:10.1111/desc.12629
Twomey, K. E., & Westermann, G. (2018). Learned labels shape pre-speech infants’ object representations.
Infancy, 23(1), 61–73. doi:10.1111/infa.12201
Vygotsky, L. S. (1978). Mind in society: The development of higher mental processes. Cambridge, MA: Harvard
University Press.
Walin, H., O’Grady, S., & Xu, F. (2016). Curiosity and its influence on children’s memory. In A. Papafragou,
D. Grodner, D. Mirman, & J. C. Trueswell (Eds.), Proceedings of the 38th annual conference of the cognitive
science society. Austin, TX: Cognitive Science Society.
Westermann, G., & Mareschal, D. (2012). Mechanisms of developmental change in infant categorization.
Cognitive Development, 27(4), 367–382. doi:10.1016/j.cogdev.2012.08.004
Westermann, G., & Mareschal, D. (2014). From perceptual to language-mediated categorization. Philosophi-
cal Transactions of the Royal Society B: Biological Sciences, 369(1634). doi:10.1098/rstb.2012.0391
Younger, B. A., & Cohen, L. B. (1983). Infant perception of correlations among attributes. Child Develop-
ment, 54(4), 858–867. doi:10.2307/1129890
Yu, Y., Bonawitz, E., & Shafto, P. (2017). Pedagogical questions in parent–child conversations. Child
Development. Early view. doi:10.2307/1129890
