
Robotics and Autonomous Systems 143 (2021) 103834

Contents lists available at ScienceDirect

Robotics and Autonomous Systems


journal homepage: www.elsevier.com/locate/robot

A literature review of sensor heads for humanoid robots



J.A. Rojas-Quintero, M.C. Rodríguez-Liñán
CONACYT/Tecnológico Nacional de México/I.T. Ensenada, Ensenada, B.C., Mexico

Article history: Received 11 September 2019; Received in revised form 7 May 2020; Accepted 15 June 2021; Available online 22 June 2021.

Keywords: Humanoid robot heads; Human–robot interaction; Review; Control of robotic systems; Active vision.

Abstract

We conducted a literature review on sensor heads for humanoid robots. A strong case is made on topics involved in human–robot interaction. Having found that vision is the most abundant perception system among sensor heads for humanoid robots, we included a review of control techniques for humanoid active vision. We provide historical insight and inform on current robotic head design and applications. Information is chronologically organized whenever possible and exposes trends in control techniques, mechanical design, periodical advances and overall philosophy. We found that there are two main types of humanoid robot heads, which we propose to classify as either non-expressive face robot heads or expressive face robot heads. We expose their respective characteristics and provide some ideas on design and vision control considerations for humanoid robot heads involved in human–robot interactions.

© 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Humanoid robot research has been receiving much attention in recent years [1,2] because there may be potential commercial applications for these robots [3–8]. Humanoid robot design is often targeted at reproducing the capabilities of a human being in a technical system [2,9], and even their appearance [10]. However, there are numerous examples where humanoid robots rely on superhuman sensing abilities to succeed at their task, as in [11–13]. Nevertheless, a humanoid robot can be considered as a set of several robotic subsystems, among which a very important one is the robot head. Based on the conducted research, we have found that humanoid robots often feature a sensor head, that is, a head that gathers important exteroceptive sensors that enable world perception. We should note right away that this is not the only solution for humanoid robots, and some of them gather exteroceptive sensors in the torso, waist and limbs [11,12], but in this review we focus on humanoid robot sensor heads. In this context, we will therefore refer to these sensor heads as humanoid robot heads for the rest of this review. At a glance, creating a humanoid robot head appears to be a rather simple task because, in its most basic expression, a robot head can essentially be composed of a pair of actuated cameras [14], thus providing a means for the robot to interact with its surrounding environment, and even communicate through gaze motions. For a time, robot heads were quasi-exclusively of this type, because vision appears to be the most effective way to obtain information from the environment [15]. However, the literature has shown that it may not be so simple: although gaze is particularly important in human–human relationships [15], it is not the only sensing mechanism that contributes to the interaction task [16]. For example, human beings can localize sound sources just by hearing [17], or even locate gas sources [18] and get information on taste [19] with the olfactory sense. The main function of the head is that of interacting with surrounding elements by acquiring information through the senses. It is therefore not surprising that humanoid robot heads naturally evolved to include at least one more sensing system. Brooks was the first to mention the need for sound understanding by a humanoid robot, in 1994 [20], but the first example of a humanoid robot head incorporating vision and audition was the humanoid robot WABOT-2 [21], already in 1984. Audition seems to have also become a standard sensing system in humanoid robot heads (see Sections 4 and 6). In the future, a similar process might occur with the inclusion of an olfactory system as a sensing device for humanoid robot heads. The robot WE-3RIV was the first example that included an olfactory system [22], and there has been some research conducted in the area of odor detection and location that could be aimed at humanoid robotics [18]. The human head contains the brain as a part of the Central Nervous System; several sensory systems take place exclusively in the head, the main ones being: a visual system (sight), an auditory system (audition), a gustatory system (taste) and an olfactory system (smell). There is also the more specialized vestibular system, which contributes to the sense of balance and orientation. This one appears to be particularly necessary for a humanoid robot to be able to navigate its environment [23]. Note that by ‘‘navigate’’ we simply mean the possibility that a humanoid robot has to change its position and orientation through a particular route or course in its surrounding environment.

∗ Corresponding author.
E-mail addresses: jarojas@conacyt.mx (J.A. Rojas-Quintero), mcrodriguez@conacyt.mx (M.C. Rodríguez-Liñán).

https://doi.org/10.1016/j.robot.2021.103834

Fig. 1. Examples of non-expressive humanoid robot heads. From left to right: WE-2 [24], an early development featuring actuated stereo vision and ARMAR-III [9]
featuring foveated stereo vision, both feature actuated necks and hearing; Romeo [25] was designed to evoke familiarity in a passive manner; Atlas [26] featuring
a Carnegie Robotics Multisense SL with stereo vision and a LIDAR sensor. WE-2 by Takanishi Laboratory, reproduced with permission. ARMAR-III by Renate Herbst,
Romeo by Softbank Robotics Europe, and Atlas photos retrieved 2019 from Wikimedia Commons.

Fig. 2. Examples of expressive humanoid robot heads. From left to right: Kobian-R [27] displaying happiness and anger with its articulated facial traits; Albert
HUBO [28] has an android head featuring artificial skin and hair; and SociBot [29] whose face is an animation projected onto a screen-face. Kobian-R by Takanishi
Laboratory, reproduced with permission. Albert HUBO by David Hanson, retrieved 2019 from Wikimedia Commons. SociBot by Engineered Arts Ltd., reproduced with
permission.

If a humanoid robot is to dwell among us, one of its most important tasks will be to interact with surrounding human beings [30]. For this purpose, a humanoid robot needs to identify, recognize and locate external stimuli, but it must also be able to establish relationships. For a human being, the head is an essential organ for interacting with others because it can inform about the current emotional state through facial expressions [15]. This may not always be a necessary concern when developing a humanoid head; however, the sole fact that a humanoid robot is supposed to eventually navigate an environment designed for humans, and interact with them at some point, justifies the need for communication [31,32]. Therefore, a humanoid robot head should not only be viewed as a receptacle of sensory systems, but also as a means of communication in the field of Human–Robot Interaction (HRI). In this respect, we have found two types of humanoid robot heads: non-expressive face heads and expressive face heads (see Section 4). The former are exclusively designed to absorb incoming information from the environment. The latter concerns those systems that not only perceive data but can also communicate.

There are some features common to all types of humanoid robot heads. They all need to be designed as a mechanical structure that contains integrated sensing devices to perform their task, and their motion, or the motion of the larger system in which they are embedded, has to be controlled. It is important to point out that a humanoid robot head is generally shaped as something resembling the human head. This can be appreciated throughout the many examples of humanoid robots available in the literature (see some examples in Figs. 1 and 2).

This paper aims to contribute to humanoid robotics research by providing organized information on humanoid robot heads on key subjects such as design and control. Section 2 presents our research methodology. Section 3 presents a chronological analysis of the general evolution of humanoid robot heads, to offer historical insight and to allow for the identification of future perspectives. This discussion considers their external appearance, emphasizing their varied structural construction. A revision of the numerous ideas to provide them with effective sensing and expressive devices that guide their overall design is also included in Section 3. Section 4 presents the two types of humanoid robot heads encountered in the specialized literature (non-expressive face heads and expressive face heads). Section 5 presents a compilation of the most common control methods in active vision related to humanoid robot heads. A brief discussion of the physiological aspects of human vision introduces the artificial models that inspire such controllers in this section. Section 6 presents general design and control considerations that should be taken into account in order to enhance HRI involving humanoid robot heads. Section 7 concludes our work.

2. Methodology

Interested in knowing what constitutes and what characterizes a humanoid robot head, we have tried to review most of what could be found in the literature by consulting hundreds of scientific papers, focusing on the main academic publishers. There is much variety among humanoid robot heads and their applications, yet we were not able to find a formal literature review of humanoid robot heads. We consulted the scientific databases Scopus, Web of Science and Google Scholar until January 2019. References [2,33–39] were published and retrieved after this date and were added during the revision process.

Reference lists of included articles were reviewed for further citations. We began our research using the keywords ‘‘humanoid robot’’ AND ‘‘head design’’. However, our search quickly led us to use the following keywords: ‘‘active vision’’; ‘‘cerebellar model articulation control’’; ‘‘robot communication’’; ‘‘developmental robotics’’; ‘‘disaster response robot’’; ‘‘eye-head coordination’’; ‘‘entertainment robot’’; ‘‘feedback-error-learning control’’; ‘‘gaze control’’; ‘‘head stabilization’’; ‘‘humanoid robot sensing’’; ‘‘humanoid robot control’’; ‘‘human–robot interaction’’; ‘‘image stabilization’’; ‘‘manipulation of objects’’; ‘‘robot assistance’’; ‘‘robot vision’’; ‘‘saccade’’; ‘‘vestibulo-ocular reflex’’. Boolean combinations involving these keywords were also used.

We limit the scope of this paper to the review of robot heads and the associated concepts and ideas that are currently used, or that could directly be of use, in humanoid robotics. Here, we take that, in terms of general mechanical design, a humanoid robot has been defined as being inspired by the human appearance, structure and kinematics [2,9,10], ideally including a torso, hip, head, neck, and limbs [2]. Following this definition, we consider that a humanoid robot head is supposed to hold some physical and functional resemblance with a human head. It essentially plays the role of receiving and processing external information to allow for the control of robot motions, on the one hand; on the other hand, it should provide means to interact with the environment as a message carrier through para-linguistic communication signals such as facial expressions, intonation or gaze direction [40,41].

We propose to define a humanoid robot head as a device capable of acquiring information that the humanoid robot can process and act on, to successfully navigate its environment and interact with humans. If the humanoid robot head is to engage in HRI, it should be able to non-verbally communicate via para-linguistic communication signals. Therefore, a humanoid robot head should obey three main aspects:

1. context: it must be designed with the intention of being embedded in a humanoid robot;
2. functionality: it must contain at least one of the main senses that allow the humanoid robot to navigate an environment designed for human beings;
3. appearance: it should be close to human-like appearance or at least evoke human-like traits.

3. Chronological analysis of humanoid robot head features

In this section we proceed to present the most repeatedly integrated humanoid robot head senses and features (vision and other senses). We then briefly explore the topic of their integration within a humanoid robot. Finally, we present features that are useful in HRI, such as para-linguistic communication, appearance and behavior. We have grouped the ideas chronologically in each of the following subsections.

3.1. Vision and neck mobility

As we outlined in our introduction, the human head contains several sensory systems, the main ones being: audition, sight, smell and taste. The last one (taste) is yet to appear in a robotic example of a robot head in the literature. One reason may be that, unlike other senses, taste does not directly serve to guide us around our environment. A chronological analysis of the scientific literature shows that the first and most repeatedly used sensing device equipping robot heads is a pair of cameras. Vision is what seems to have gained most of the attention in this particular research area, since when it comes to robot heads, they should be able to track and follow objects as well as to recognize them. Active vision research was initiated in 1988 [42,43]. The scientific paradigm was investigated by Aloimonos et al. [42], according to whom an observer is called active when engaged in a process whose purpose is to control the geometric parameters of its sensory apparatus. Bajcsy [43], on the other hand, defined active vision as the modeling and study of control strategies to search and follow objects through vision. Taking both definitions, it appears clear that active vision research could not be possible without active vision sensors such as robot heads. Not long after this research debuted, around 1990, a number of stereo or binocular camera heads featuring exclusively stereo vision as the main guiding sense were developed as a support for computer vision research (Rochester Robot Head [44], LIRA/DIST Head [45], LIFIA/SAVA Binocular Head [46], Harvard Binocular Head [14], KTH Head [47]). At first, these robot heads were composed of a pair of cameras mounted at the end effector of an industrial robot arm (Rochester Robot Head [44] or LIFIA/SAVA Binocular Head [46], for example). This was done for practical purposes: the robot arm would act as a human neck, re-orienting and pointing the cameras in the right direction in a very effective manner. However, such a device could not be integrated into a humanoid robot. In early stages, most robot heads were essentially verging stereo vision systems mounted on a type of neck mechanism that allowed for panning and tilting motions (such as the LIRA/DIST Head [45]). Interestingly, the KTH Head [47] included optical degrees of freedom (DOF) to control focus, zoom and aperture camera parameters, and even a baseline DOF (called inter-pupillary distance in humans) that modified the horizontal distance between cameras, to ensure that the robot could focus at dynamically varying distances from the object of interest. Optical degrees of freedom were scarcely used in subsequent works due to design and control difficulties.

WE-2 [24] (see Fig. 1) equipped a humanoid robot called Hadaly, which acted as a campus assistant in 1995. It was the first to incorporate an artificial form of vestibulo-ocular reflex (VOR, see Section 5.1 for more details on this physiological reflex) through the use of inertial sensors. VOR allows us to maintain the direction of our eyes constant during head motion (object fixation). Later, a WE-2 successor, WE-3R, could adjust to brightness through the use of eyelashes [48].

The human eye offers high acuity through its fovea (a high acuity zone located at the center of the retina), but also a large field of vision (FOV) through its peripheral zone (of lower acuity). Foveated vision appeared to emulate this: a set of cameras with wide FOV would be combined with a set of cameras with narrow FOV (see COG [49], second example of Fig. 1).

An alternative to stereo cameras to obtain a 3D depth representation of the environment appeared with RGB-D sensors. These sensors integrate an RGB camera with an infra-red (IR) depth sensor and emitter to produce color images that contain depth information for each pixel in the image. Originally intended for entertainment applications, their low cost and effectiveness have made them attractive to the robotics research community. RGB-D technology presents one disadvantage in outdoor environments, though: the IR sensor cannot discern between the sunlight and the IR light from the emitter. Therefore, some robots like TORO [10] or Hubo2+ [50] additionally rely on other sensors to obtain more robust measurements.
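To make the per-pixel depth information mentioned above concrete, the following minimal sketch back-projects a single RGB-D pixel to a 3D point with the standard pinhole camera model. The intrinsic parameters and depth scale are illustrative placeholder values, not those of any particular sensor discussed in this review.

# Minimal sketch: back-projecting an RGB-D pixel to a 3D point (pinhole model).
# The intrinsics (fx, fy, cx, cy) and the depth scale below are illustrative placeholders.

def pixel_to_3d(u, v, depth_mm, fx=525.0, fy=525.0, cx=319.5, cy=239.5):
    """Convert a pixel (u, v) with depth in millimeters to camera coordinates (meters)."""
    z = depth_mm / 1000.0          # depth scale: 1 unit = 1 mm (sensor dependent)
    x = (u - cx) * z / fx          # horizontal offset from the optical axis
    y = (v - cy) * z / fy          # vertical offset from the optical axis
    return (x, y, z)

# Example: a pixel at the image center, 1.2 m away, maps to roughly (0, 0, 1.2).
print(pixel_to_3d(320, 240, 1200.0))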

3.2. Other sensing features

The humanoid robot head WE-2 had many successors with multiple versions, among which some came with very original features. WE-3RII was equipped with actuated facial traits so that it could display emotions through facial expressions [51]. WE-3RIII was the first to integrate the sense of touch (or, in this case, cutaneous sensing) [52] in 2000. Not long after, WE-3RIV was even provided with an artificial lung and a gas sensor, giving it the ability to breathe in and recognize certain smells [53]. Although the lung is not located inside the head, this was the first robot head to integrate the sense of smell.

Only human-inspired sensing features have been mentioned up to this point. However, in order to succeed at challenging tasks such as emergency relief, some humanoid robot heads have been provided with sensors that are not necessarily inspired by human sensing capabilities. Disaster-response robot structures are very diverse, but examples featuring a humanoid structure emerged after the Fukushima Daiichi nuclear disaster in 2011: CHIMP [13], DRC-HUBO+ [54], HRP-2Kai [55], THORMANG [56] and Valkyrie [12] are some examples. Their heads include light detection and ranging (LIDAR), which is a method that computes distance to objects by measuring the time that a pulse of light takes to travel from the source to the observed target and then to the detector. A LIDAR sensor can be used to create a 3D representation of the environment to enhance real-world perception, especially in cluttered environments [57]. Therefore, LIDAR sensors seem particularly adequate for disaster-response humanoid robots. The Carnegie Robotics Multisense SL/SLB combines a camera-based stereo vision system with a rolling LIDAR sensor. Such a system could be considered to be a fully operational humanoid robot head since it enables a robot to ‘‘see’’ a very large scene around it without the need to actuate cameras or neck. The Multisense SL/SLB (see Fig. 1) equips several humanoid robots having participated in the DARPA Robotics Challenge (DRC), such as Atlas-DRC and Atlas-Unplugged [26] (see Fig. 1), ESCHER [58], JAXON [11] or WALK-MAN [59].
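The ranging principle just described reduces to a one-line computation: distance is half the round-trip time of the light pulse multiplied by the speed of light. The sketch below is a generic illustration of this computation; the pulse timestamps are made-up values, not the output of any specific sensor mentioned here.

# Minimal sketch of LIDAR time-of-flight ranging: d = c * t_round_trip / 2.
C = 299_792_458.0  # speed of light in m/s

def tof_distance(t_emit_s, t_detect_s):
    """Distance to the target from pulse emission and detection timestamps (seconds)."""
    round_trip = t_detect_s - t_emit_s
    return C * round_trip / 2.0

# A pulse detected 66.7 nanoseconds after emission corresponds to a target about 10 m away.
print(tof_distance(0.0, 66.7e-9))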
3.3. Humanoid integration

Humanoid robots date back to 1973, when WABOT-1, the first full-scale humanoid robot in modern history, was unveiled [60]. It could talk, measure distances to objects and estimate directions of incoming input through artificial ears and eyes. It was an impressive development integrating most of what a robot head should have nowadays, but it was hard to visually estimate the location of its head. WABOT-2 could also converse with people and play the piano, with anthropomorphic hands, by reading a musical score via its one-eyed vision system, and by receiving auditory feedback [21]. To the best of our knowledge, this is the first humanoid robot head to have met the three aspects that we defined in Section 1.

At first, humanoid robots that included a functional head were quite rare, because advanced humanoid robot heads are generally very complex mechanisms that are not easily embedded. Some of the reasons include the large quantity of actuators when the head has actuated facial traits; it may simply be too big to be embedded. Honda did provide its humanoid robots with cameras and other useful sensing devices [61], but their heads did not meet our three requirements until the development of P2 [62]. Soon, functional humanoid robot heads were equipping all sorts of humanoid robots around the year 2000. COG was designed to explore the hypothesis that human-like intelligence requires human-like interactions with the world, and it featured visual, auditory, proprioceptive, tactile, and vestibular senses to enhance such interactions [63]. The Robita humanoid could communicate with various users, detecting sound sources and recognizing faces and gestures [64]. Infanoid was an infant humanoid robot that featured foveated stereo vision and was capable of facial expressions [31]. The ETL humanoid had a 3 DOF neck, and its eyes could not only pan and tilt but also roll, which is rare among humanoid robot heads [65]. The list of humanoid robots including a functional head vastly extends after the year 2000. Up until that year, humanoid robot head research was mainly focused on discovering ways to expand sensing capabilities; size, shape and appearance were not the main concern. Integration of sensor heads in humanoid robots is of importance in humanoid robotics. As a more complete system, humanoids enable more active HRI, such as collaborative tasks, but also the study of hypotheses about human motion [66].

3.4. Para-linguistic communication, appearance and behavior

There is evidence that human beings tend to humanize objects [67–69], especially when these present human-like features [27,68,70]. This phenomenon is known as anthropomorphism [68,70]. Therefore, one approach to enhance robot acceptance is to give them an anthropomorphic design [68] and make them capable of social characteristics for interaction and communication [67,68].

Our research has shown that para-linguistic communication signals have been employed by humanoid robot heads in order to enhance HRI. These communication cues are effective ways of carrying messages and enhancing linguistic communication. Facial expressions are one type of para-linguistic communication cue. These were implemented early on (around 1993) in the Tokyo Robot Head [71,72], which is a predecessor of the face robot SAYA [73]. This was the first to feature a form of artificial skin, along with actuated facial traits (mouth, nose and eyebrows), in order to imitate the human capability of emotion communication through facial expressions, making it the first emotive robot head. Emotions are biological states conforming a motivation system that determines the behavioral reaction to environmental events significant for the needs and goals of a creature [40,41]. The response translates into a set of para-linguistic reactions such as gestures, intonation or facial expressions that inform about a certain emotional state. Emotion expression is closely related to anthropomorphism in robot head design. Along with Kismet [40], these robots of the first era (before the year 2000) were capable of displaying a variety of emotions, thus enhancing the interaction experience and the perceived intelligence of the robot. Six basic emotions (anger, disgust, fear, happiness, sadness and surprise) [41] were commonly used to evaluate the (facial) expressive capabilities of these robots that were designed for social interaction with humans [40,71].

After the year 2000, humanoid robot platforms that featured a robot head continued to be developed. Their heads were for the most part mechanisms featuring active vision, sometimes active audition, VOR and other sensing capabilities that had already been introduced years before. In 1970, Mori raised the hypothesis that human-likeness in appearance and behavior increases robot acceptance in society, but that this only happens up to a point where extreme but not perfect resemblance would cause their rejection [74]. It is still only a hypothesis and the amount of empirical proof remains scarce [70,75], so it is unclear whether or not this hypothesis holds, simply because a humanoid robot that perfectly looks and behaves as a human does not yet exist. Regardless, there has been a struggle to overcome this uncanny valley [76], suggesting that appearance does matter (see the Philip K. Dick android [77], Geminoid HI-1 [78] or HRP-4C [79] humanoid robots).

The appearance of humanoid robot heads is considered a major concern by some, and there has been an emergence of android robot heads in humanoid robots [28,71,76–79]. Androids have been defined as humanoid robots with an appearance that closely resembles that of humans, possessing traits such as artificial skin and hair (see [80] for an example), and capable of linguistic and para-linguistic communication [76].

Aiming to identify robot aesthetics principles that would increase robot acceptance, image-based surveys where participants graded certain metrics have been considered [77,81]. In one study, the head of a small humanoid robot (Qrio) is compared with an android one (Philip K. Dick) on four metrics: realism, appeal, eeriness and familiarity [77]. The results imply that an android face is deemed more humanlike and familiar. In another study, the images of 48 humanoid robots were rated on a 1 to 5 scale from ‘‘not very human like’’ to ‘‘very human like’’ [81]. The study showed that increased head width and number of facial features (browline, eyes, mouth, nose, etc.; see Romeo [25] in Fig. 1 and the robot heads of Fig. 2) enhance the perceived humanness of a humanoid robot head. However, the authors also point out that behavior may also be responsible for robot acceptance, which is an argument shared by other studies [40,82]. In a study by Minato et al. [82], human subjects were asked questions by both a human questioner and an android (Repliee Q1) that was programmed to display micro-motions such as eye and shoulder movements. The subjects' reactions were then analyzed in both situations and compared. Results suggest that the android was deemed human-like, although not at the same level as the human questioner, and that it is the combination of both appearance and behavior that increases robot acceptance. In HRI, two conditions of robot head design gradually became implicit: that they are pleasant to the eye, and that they are easy to interact with. As a consequence, humanoid robot heads that were capable of communicating by speech and by displaying facial expressions became more abundant between 2000 and 2010: WE-4 [32]; MERTZ [83]; iCub [84]; the Philip K. Dick android [77]; Albert HUBO [28] (see Fig. 2); SAYA [85]; the CB2 child robot [86].

Some robot heads have been designed with animal-inspired looks: Leonardo [87], iCat [88], EEEX [89], Probo [90] or EMYS [91]. One idea behind these zoomorphic developments is that the uncanny valley effect can be avoided by providing the robot with a familiar but not human appearance. However, these designs were made to explore certain aspects of social HRI that are otherwise impossible to explore with a more humane appearance, such as the effect of exaggerative facial expression [89,91], or even medical treatments like animal-assisted therapy [90].

Sensorial and social functions of a robot head have gradually become two minimum requirements in HRI, and some go as far as to proclaim that significant contributions can only be made if there is solid advance in these two functions [92]. Therefore, emotion expression by robots has become more mature by exploring novel methods to display emotions. We have seen some interesting developments such as Flobi [92], Probo [90], HRP-4C [79], and KOBIAN-R [27] (see Fig. 2). The mechanical design of a robot head having not only sensorial functions but also an actuated face can become quite complex, and the resulting cost can be high. One solution to decrease mechanical design complexity while keeping emotion communication abilities consists in displaying colors associated with an emotional state, as done with iCub [84] or Nao [93]. It is even possible to animate and display certain facial traits. For example, the Twente humanoid head displays animated eyebrows and mouth [94]. By integrating displays into their design, humanoid robot heads are no longer subject to specific mechanical limitations in terms of emotion expression, and the scope of possible expressions vastly broadens. However, some developments went even further by projecting the whole face onto a display/translucent mask. Some examples include the LightHead robot [95], which later became SociBot [29] (see Fig. 2), and Mask-bot 2i [96]. With this type of mechanism, emotion expression is no longer limited by the mechanical capabilities of a specific design.

4. Types of humanoid robot heads

The development of robot heads has vastly evolved since their debut around 1975. As we outlined in the previous section, early examples consisted mainly of an embedded vision system and some other sensors, as means to obtain information from the environment. The constant addition of sensing devices and design parameters has taken current robot heads to a high level of complexity and, as a consequence, they have evolved to be very diverse in terms of appearance, function, and application. Most robot heads address the basic sensorial needs of a humanoid robot (such as sight, hearing and VOR). However, many designs seem to address the para-linguistic communication cue of facial expressions through diverse mechanisms. It is therefore not surprising that there are non-expressive face heads (see Fig. 1) as well as expressive face heads (see Fig. 2). Therefore, in this section, we will proceed to present both types.

4.1. Non-expressive face heads

Even though a humanoid robot often shows physical resemblance with the human being, it is the several sensory systems the human head holds that have initially inspired scientists. Sensory systems allow coordinating body motions. They guide towards a goal, to reach an object that can be located through sight, audition or even smell. If this logic is taken, the head acts mostly as a sensorial organ which coordinates body motions. Therefore, humanoid robot heads often feature sensors such as cameras, microphones, artificial vestibular systems and, in a few cases, electronic noses. These types of features are often included in non-expressive face heads (see Fig. 1 for some examples). Non-expressive face heads are those that cannot display dynamically changing facial expressions. Some of these have been considered mostly as research platforms to develop areas such as active vision [97,98], active audition [16,83], and software development [99–104]. We should note that most non-expressive face heads that we encountered in the literature are in fact part of a humanoid robot: HRP-2 [105], ETL-Humanoid [65], KHR-2 [106], Rollin’ Justin [107], ARMAR-4 [108], CHIMP [13], or DRC-HUBO+ [54] are some examples.

Recent non-expressive robot heads feature two DOF per eye and a neck with as many as five DOF (such as ARMAR-4 [108], allowing for lower and upper pitch, lower and upper roll, and yaw). Artificial VOR and optokinetic response (OKR, see Section 5.1 for more on this reflex) have become standard image stabilization techniques. Many resort to foveated vision systems, but it seems that RGB-D is used more and more since 2013: HaSaRam IX [109], TORO [10], HRP2 W [110], the Shenyang Head [111] or Valkyrie [12], among others. LIDAR and GPS/INS sensors seem to be used only for the more specialized disaster-response humanoid robots. Rolling LIDAR sensors often replace a neck mechanism because they allow the humanoid robot to create a map of the environment [13] without the need to re-orientate its head. However, for some of these examples, the overall design is aimed towards efficiency, and sometimes part of their perception systems are located in the torso (THORMANG [56], DRC-HUBO+ [54]) or waist and back (Valkyrie [12]), for example.

The development of non-expressive heads began long ago. At a time where actuated facial traits were an emerging feature, many non-expressive heads were specifically built with HRI applications in mind, such as service or assistance (Robonaut [112], Honda Asimo [113], Pearl [81], Robovie [114] or Romeo [25]), or as platforms for cognition research (COG [49], ROBITA [64], H6 [115], or DB [116]). We should note that even if all of these examples usually lack appearance-enhancing design parameters (such as actuated facial traits), they can still be involved in HRI.

Non-expressive face heads can still use para-linguistic communication cues such as gaze direction [15], intonation or head gestures, which are effective para-linguistic communication cues that carry messages and even convey emotions [40,68]. Intentions, which are a state of commitment to carry out actions, are another important interaction component [117] that can be easily conveyed by non-expressive face heads through motor actions (like Robovie in [114] or Baxter with its expressive face in [118]).

4.2. Expressive face heads

In the human species, the head is the external organ for social interaction. Our faces can communicate essential information about our emotions through gaze and facial expression [15] (see Fig. 2 for an example). In fact, there is evidence that words are not even needed to convey a particular emotion [119]. Whatever the means, it appears that the ability to communicate is an essential aspect of social interaction. If a robot is to freely navigate an environment created for humans, then it is natural to expect that humans will be present in such a space. Therefore, HRI becomes inevitable; exteroceptive senses do not suffice to interact with the people around. This means that the robot has to be able to direct and orientate itself by avoiding obstacles, it has to locate and manipulate objects, but it also has to be able to interact with surrounding people. Within the HRI logic, a robot head should be able to yield a particular emotional state; it must be able to verbally or physically communicate and interact with a human being; therefore its actions must be recognizable and interpretable. As was previously stated, a non-expressive head can already communicate verbally and with some para-linguistic communication cues. However, there are expressive face heads that can, in addition, make use of facial expressions to communicate intentions or emotions. Expressive face heads (see Fig. 2) display dynamically changing facial expressions through several mechanisms: actuated facial traits (see the two images on the left of Fig. 2); animated facial traits through displays; or the projection of an animated face onto a screen (see the right image of Fig. 2).

All of these expressive face heads are capable of nonverbal communication, which is an important part of our daily interactions [119,120]. It is interesting to notice that the first expressive face head was already of the android type (Tokyo Robot Face [71,72]). It is also worth noticing that no android-type expressive face heads were developed for several years after this.

Some expressive face heads combine mechanically actuated facial traits with animated traits through displays. For example, EEEX [89] (which is a zoomorphic design inspired by a bug) displays emoji-style eyes to complement its exaggerative emotion expression when its actuated ears (antennae), cheeks and jaw pop out of its face. Exaggerative emotion expression is also used by EMYS [91] (also a zoomorphic design, inspired by a turtle). Its eyes can pop out of its face, which is formed by three actuated disks that greatly deform when expressing emotions. By using exaggerative emotion expression, these robots ensure the recognition of their emotions during HRI.

As with non-expressive face heads, all the expressive face heads that we encountered in the literature have eyes, strongly suggesting that the presence of eyes and a vision system has become a minimum requirement. The same goes for audition, which seems to have become a standard sense (only a few do not feature hearing: EEEX [89], KASPAR [121], Twente [94], SHFR-III [122], LightHead [95], Mask-bot 2i [96] and SCIPRR [123]). Smell seems to have ended with WE-4 [32], since no other head has featured that sense ever since. In terms of communication, many of these heads are able to produce speech, but not all (the Tokyo Face Robot [71], Leonardo [87] and Bert2 [124] are some examples).

Android robot heads tend to defy the valley effect by taking the design to very high levels of physical resemblance with the human being (see the third image of Fig. 2). This contradicts the idea that, in order for robots to be socially accepted, their appearance should be only roughly similar to that of a human being [35]. Robot heads of this type are usually equipped with much more detailed physical traits such as artificial skin and artificial hair; most are able to communicate with speech and para-linguistic communication cues. The android type also often features an anthropomorphous neck with at least 3 DOF (with the exception of Lilly [125], who is also not able to communicate via speech). Skin and hair are also standard features for these robots. Most of these android faces are intended to have the appearance of adults, although there are some examples of child-androids like Barthoc Jr. [126] and CB2 [86]. One very interesting feature of the Repliee Q1 and Geminoid HI-1 androids [78] is the implementation of micro-motions. We consider micro-motions to be a type of appearance-enhancing design parameter because they are something that is constantly being displayed (breathing motion or shoulder movements).

There are expressive face heads in which the face or some facial traits are either displayed on LCDs, or projected onto a translucent surface with a video projector. Some of them make use of light-emitting diodes (LED) placed under a translucent mask to display animated eyebrows and mouth, like iCub [84] and the Twente University Head [94]. RoboThespian [127] displays animated eyes to simulate blinking and retinal deformation. Expressive face heads have reached a very high level of complexity in their mechatronic design, especially the android type. This complexity reflects in their production cost. If we take the example of WE-4RII [23] with its 29 DOF, one can understand that the cost of actuators alone is very high and may not be a realistic possibility for every research project. There also seems to be some skepticism towards the facial expression quality of robot heads with actuated faces [29,84,94–96,128]. A displayed or projected expressive face arises from a very simple and yet effective idea: the display or projection of an animated face on a screen that acts as a face. The cost of this type of system is therefore greatly reduced because only the neck needs to be actuated. In such devices, the quality of facial expressions depends less on the hardware itself, as opposed to actuated faces and androids. Synthetic animations are usually projected, but real faces can also be mapped to the display [129] or translucent mask [96]. The example of SCIPRR [123] is more of a vessel in which one could incorporate either a display or a retro-projected translucent mask. However, we must note that even though displayed or projected expressive face heads are very attractive, they remain subject to the valley effect; for example, a user qualified SociBot (Fig. 2) as being ‘‘creepy’’ [29], confirming that the valley effect is not so easily avoided.

5. Active vision control

Previous sections discussed general design aspects of humanoid robot heads, identifying two basic types: non-expressive face heads and expressive face heads. Within these two classes, active vision appears to be a fundamental aspect for humanoid robot heads. Active vision ensures gaze direction, coordinating eye and head motion. It also allows the robot to direct its gaze towards a detected stimulus, guaranteeing its tracking even when the target or the robot are in motion. When the head is embedded in a humanoid, active vision also allows coordinating limb motion.

Moreover, a robot that can react in a human way to a visual cue will be more attractive to humans than one whose actions appear mechanical. That is, if the robot can turn its head towards a moving object in a natural way, balancing the motion of the eyes, head and body; if it can make eye contact with the person with whom it is interacting; if it can keep track of a moving target; etc., then the robot’s attitude will appear more natural to the user. This is particularly important in HRI tasks because the robot must be able to navigate through a human environment naturally, and to interact with users without eliciting rejection from people. In order to achieve this desirable ‘‘humanity’’, one of the most important aspects to consider is the control and coordination of the robot’s eyes, head and body, because, ultimately, these controllers will be the ones that give the robot human-like abilities, if they are properly designed. From a physiological point of view, the process of human vision is composed of various features that make it possible to detect targets, keep them in focus, follow them as they move, etc.; and numerous researchers have attempted to recreate such features. However, humans also use facial (and body) gestures that enrich our interactions, and therefore their replication in humanoid robotics is also important [130]. Some of these gestures include blinks, mouth movements, eyebrow motions, etc. In a 2019 study, Ali et al. [33] note the benefits that therapy with a humanoid robot (NAO) with the ability to ‘‘blink’’ has on children with autism spectrum disorder. In this case the blinking action was simulated with LED lighting changes around the robot’s eyes. Mechanical blinking has been implemented as well, such as in [131,132]. The robot head in [132] can also move its lips and change the orientation of the head. In [36], a humanoid robot has been endowed with the ability to ‘laugh’: it can generate laughing sounds, modify the position of its eyelids, mouth and cheeks, and generate upper body motions to simulate laughter. The following sections focus on active vision, rather than on complementary gestures used in interaction. The different features that characterize human vision should be considered as desirable functionalities of a humanoid robot head if it is intended for HRI. In the following, we describe these features and the control techniques that research groups have used to recreate one or several of these.

5.1. Physiological aspects of human vision

The concept of ‘active vision’ or ‘animate vision’ was introduced by Aloimonos et al. [42], Ballard et al. [133], and Bajcsy [43]. Initially, most of the research around robot vision focused on ‘passive vision’, namely, the observation and analysis of static images. Human vision, however, is an active process; objects are not always static, and focus should be maintained in spite of changes in illumination, distance or position [42]. On the other hand, active vision reduces the computational cost associated with passive vision, by allowing the vision system to interact with the environment [133]. Naturally, in order to develop active vision in humanoid robots, it is necessary to understand the process of human vision from a physiological point of view. Through the years, several physiological studies have identified different vision phenomena (in humans and primates), such as vergence, saccades, smooth pursuit, VOR, OKR, and the Vestibulocollic Reflex (VCR). Each will be briefly described below. More detailed descriptions of the vestibular and visual processes involved in human gaze dynamics can be found in the literature [134–136].

Vergence. In animals with foveal regions in their retinas, vergence plays an important role when focusing on a target placed at a varying distance from the eyes. Both eyes must maintain focus on the target by modifying their orientation angles. Vergence is a slow motion process, performed simultaneously by both eyes but with different directions. Its purpose is to maintain binocular vision.

Vestibulo-Ocular Reflex. This reflex works to stabilize the image on the retina while the head is moving. It compensates head motion by directing eye movements in the opposite direction; this way, the image can be kept at the center of the field of vision. VOR depends on stimuli detected in the inner ear (by the vestibular system), rather than on visual events. It has been demonstrated that, in humans and other primates, this process is adaptive and is carried out by the cerebellum [137].

Optokinetic Response. This is a combination of fast and slow paced eye movements. The OKR serves to follow moving objects when the head is kept still, or when stationary targets are observed by a moving subject. It acts on information from the retinal slip, and works in combination with the VOR to stabilize the image on the retina.

Vestibulocollic Reflex. The VCR’s role is to stabilize the head using the neck muscles. Its response depends on information from the vestibular system, and complements the actions of the VOR and OKR.

Smooth pursuit. The objective of this type of response is the tracking of moving objects in a horizontal or vertical way. It comprises two stages. The first is a ballistic, open-loop response, during which the eye tries to nullify the retinal velocity (the velocity with which the object moves on the retina). In the second stage, a closed loop is formed, using visual feedback, to match the angular velocity of the eye with that of the target. It provides an online correction of pursuit velocity to compensate for retinal slip (movement of the target image on the retina).

Saccades. In contrast with vergence, saccades are fast, low latency, simultaneous movements of both eyes, reaching speeds of up to 1000°/s [138]. The human retina has a small, maximum acuity area known as the fovea. Humans process the observed environment by moving the eyes around and thus bringing interesting parts of it into the fovea. This is achieved thanks to saccadic movements. The result is a 3-D map of the area. Saccades can be further classified according to their specific objective as visually guided, antisaccade, memory guided and predictive.
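As a rough illustration of how the VOR and OKR definitions above translate into a robot head controller, the minimal sketch below combines an inertial (gyro-based) compensation term with a retinal-slip term to produce an eye velocity command. The gains and sensor readings are placeholders, not values taken from any of the cited systems.

# Minimal sketch of gaze stabilization combining VOR-like and OKR-like terms.
# head_rate: head angular velocity from an IMU gyro (rad/s), the stimulus used by VOR.
# retinal_slip: velocity of the target image on the sensor (rad/s), the stimulus used by OKR.
# The gains below are illustrative tuning parameters.

def eye_velocity_command(head_rate, retinal_slip, k_vor=1.0, k_okr=0.5):
    """Eye velocity that counter-rotates against head motion and reduces residual slip."""
    vor_term = -k_vor * head_rate      # rotate the eye opposite to the head (VOR)
    okr_term = -k_okr * retinal_slip   # cancel remaining image motion on the retina (OKR)
    return vor_term + okr_term

# Example: the head turns at 0.3 rad/s while the image still slips at 0.05 rad/s.
print(eye_velocity_command(0.3, 0.05))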

Research shows that motion control in humans is conducted by the cerebellum [137], using information from different sensors (position, velocity, acceleration, tension, etc.), which the central nervous system then uses to determine the actions that should be taken by the muscles to perform the desired task [139–141]. The development of oculo-motor control abilities starts during infancy, in a series of stages described by Piaget in 1951 [142]. This process forms connections between neurons that can be flexible, allowing for adaptive responses to different scenarios. This is called heterosynaptic plasticity, and it implies that the body’s dynamics are learnt with practice [141].

5.2. Control mechanisms for biomimetic active vision

Berthouze et al. [148] define gaze control as ‘‘the ability to keep an object of attention in the center of the image’’. More specifically, gaze control or gaze stabilization is constituted of two main tasks: target fixation and image stabilization. The first refers to the problem of acquiring a target and keeping it within the limits of the foveal region; the second is concerned with keeping motion blur (generally caused by head or body motions, or by the target’s own displacement) at a minimum. It is difficult to decide where to draw the line when defining which visual processes contribute to solving the gaze control problem. For some researchers, gaze stability is merely achieved by a combination of ocular and vestibular phenomena, so saccades and smooth pursuit are considered as independent processes that do not contribute to gaze control [159]. However, some authors consider gaze control as a combination of all the aforementioned processes, plus the motion of the neck and the head to pursue the target [160]; this can include body motion as well [161]. Thus, it can be understood that biomimicry of gaze control is a complex task that involves the design and control of various subsystems that simulate biological visual processes.

In general, the common approach to achieve gaze control is to design a low-level servo controller to command the actuators, and a higher-level gaze controller that takes over the visual aspects of gaze (namely, VOR, OKR, saccades, vergence, etc.). Most low-level controllers are based on Proportional–Integral–Derivative (PID) control, or variations of it. The higher-level control, however, has to rely on alternative methods, mainly intelligent, adaptive and/or predictive control approaches. This is the conclusion that diverse authors have reached after years of analysis of the human visual system.
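As a concrete example of the low-level layer just mentioned, the following sketch implements a discrete PID position servo for a single eye or neck joint. The gains, time step and set-point are illustrative and would have to be tuned for a real mechanism; they are not taken from any of the reviewed heads.

# Minimal sketch of a discrete PID servo for one head/eye joint (illustrative gains).
class PIDServo:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def command(self, target_angle, measured_angle):
        """Return an actuator command driving the joint toward the target angle."""
        error = target_angle - measured_angle
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# A higher-level gaze controller would supply target_angle (e.g. from a saccade or
# smooth pursuit module); the low-level servo only tracks it.
servo = PIDServo(kp=8.0, ki=0.5, kd=0.2, dt=0.01)
print(servo.command(target_angle=0.2, measured_angle=0.0))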
5.2.1. CMAC and FEL

Next, two of the main methodologies employed in active vision are described. These approaches, which appeared in the 70’s and 80’s, are inspired by the physiology of human vision. More specifically, by studying the adaptability and complexity of the central nervous system and the biological motion control process, these models attempt to replicate the cerebellum’s actions in robotic systems. Their original intention was not to be embedded into humanoid robot heads; rather, they focused on the general problem of body coordination. Nevertheless, several research groups have later exploited their features for active humanoid vision.

Albus proposed the Cerebellar Model Articulation Controller (CMAC) in 1975, with the aim of replicating the physiological structure of the cerebellum by means of a modified perceptron. The CMAC has the form of a table-based model of the cerebellum, as an alternative to the usual analytical models. In order to realize a desired task, the CMAC relates the task to the necessary actions to be performed by the actuators. First, based on information from the desired output and actual data from the sensors, the CMAC calculates which actions, and in what proportion, should contribute to the final response. This is done by reading weight values from a table. The weights on the table are active, meaning they can be adjusted by feedback, depending on the desired joint actuator value. The weighted signals are, finally, added together to obtain the computed actuator signal [140].
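To make the table-lookup idea concrete, here is a compact, one-input CMAC sketch: the input is quantized by several overlapping tilings, the active weights are summed to produce the output, and feedback from the output error adjusts those weights. The table sizes, tilings and learning rate are arbitrary illustrative choices, not those of Albus’ original formulation.

# Minimal one-input CMAC sketch: overlapping tilings index a weight table,
# the active weights are summed, and feedback error adjusts them (LMS-style update).
class TinyCMAC:
    def __init__(self, n_tilings=8, n_bins=32, x_min=-1.0, x_max=1.0, lr=0.1):
        self.n_tilings, self.n_bins, self.lr = n_tilings, n_bins, lr
        self.x_min, self.width = x_min, (x_max - x_min)
        self.weights = [[0.0] * n_bins for _ in range(n_tilings)]

    def _active_cells(self, x):
        # Each tiling is shifted slightly so that receptive fields overlap.
        cells = []
        for t in range(self.n_tilings):
            offset = t / (self.n_tilings * self.n_bins)
            frac = (x - self.x_min) / self.width + offset
            cells.append((t, min(self.n_bins - 1, max(0, int(frac * self.n_bins)))))
        return cells

    def output(self, x):
        return sum(self.weights[t][b] for t, b in self._active_cells(x))

    def train(self, x, target):
        error = target - self.output(x)              # feedback adjusts only the active weights
        for t, b in self._active_cells(x):
            self.weights[t][b] += self.lr * error / self.n_tilings

# Illustrative use: learn a simple actuator mapping y = 0.5 * x.
cmac = TinyCMAC()
for _ in range(200):
    for x in [-0.8, -0.4, 0.0, 0.4, 0.8]:
        cmac.train(x, 0.5 * x)
print(round(cmac.output(0.4), 3))   # close to 0.2 after training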
Biological inspiration also played a role in Kawato’s work. The human brain has the capability to learn desired motion patterns through observation of an external ‘teacher’. The ‘student’ tries to imitate the observed movement and corrects the path according to the desired trajectory. The model proposed by Kawato in 1987 [141] is based on this human ability, and defines a neural network model that learns the desired body dynamics from prediction of actual joint dynamics, obtained from an inverse model of the mechanism. This approach mimics the plasticity exhibited by neural cells. The method was later referred to as Feedback-Error-Learning (FEL) [146]; it starts from the feedback principle in control systems to reduce the perceived error between the actual and desired motions. Essentially, a FEL system is comprised of a feedback loop and an inverse model in a feed-forward loop. The first loop computes the motor command signals to the actuators, so that the system follows a desired trajectory. This generates an error signal which is then used as input to the inverse model. In the second stage, the calculated inverse model is used to predict actual system trajectories (via a neural network), which are then used in a feed-forward loop to reduce tracking errors [141,146].
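A very small numerical sketch of the feedback-error-learning structure just described is given next: a conventional feedback term drives the joint, its output simultaneously serves as the teaching signal for an adaptive feedforward (inverse-model) term, and as the feedforward improves, the feedback contribution shrinks. The plant model, gains, learning rate and target profile are invented for illustration only.

# Minimal feedback-error-learning (FEL) sketch for one joint.
# Total command = feedforward (learned inverse model) + feedback (conventional controller).
# The feedback output is also the training signal for the feedforward weights.

def run_fel(steps=4000, dt=0.01, kp=5.0, lr=1.0):
    w = [0.0, 0.0]            # inverse-model weights for features [desired_vel, desired_pos]
    pos = 0.0                 # simulated joint state (invented first-order plant)
    for k in range(steps):
        t = k * dt
        desired_pos = 0.3 * (t % 2.0 < 1.0)        # square-wave gaze target (illustrative)
        desired_vel = 0.0
        feats = [desired_vel, desired_pos]
        u_ff = w[0] * feats[0] + w[1] * feats[1]   # feedforward from the inverse model
        u_fb = kp * (desired_pos - pos)            # feedback term (also the learning signal)
        u = u_ff + u_fb
        pos += dt * (-pos + u)                     # invented first-order plant dynamics
        for i in range(2):                         # FEL update: the feedback error trains the weights
            w[i] += lr * u_fb * feats[i] * dt
    return w, pos

weights, final_pos = run_fel()
print(weights)   # w[1] grows toward ~1, the inverse gain of the invented plant,
                 # so the learned feedforward gradually takes over from the feedback loop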

In the table, the methodologies are grouped according to their general structure. The CMAC [140] and the Recurrent Cerebellar Architecture (RCA) [144] are both controllers based on the structure of the cerebellum and its functionality with regard to perception. Next, the table shows those models or controllers that are based on Neural Networks (NN), including the FEL methodology by Kawato [146], and Adaptive NN. Other strategies are constructed from predictive controllers (which may include Kalman filters, Smith predictors, or variations of those). Adaptive control strategies, in which the controller parameters are adjusted based on the evolution of the system, are listed next. Finally, classical controllers appear last in the table, in this case in the form of PI or PID control.

5.3.1. Saccades implementation

Saccades require very rapid motions, with minimum delay. As mentioned above, visual feedback in this case is not appropriate due to feedback's inherent latency. The tendency is therefore to implement saccades by means of intelligent techniques (particularly, neural networks), and predictive or adaptive control. As an example, one can refer to the work of Berthouze et al. [148]. They use a FEL algorithm in a sensorial robot head with foveated stereo vision (ESCHeR), where each joint (for right and left vergence, pan and tilt) is controlled by an independent FEL module. This work was later combined with a Self-Organizing Map (SOM) based on a Kohonen network [150]. The objective was to show how these learning algorithms interact with each other and with the world. A variant of FEL is presented in [149], in which the feedforward path of the controller is implemented using an Adaptive Neural Network (ANN). This modification not only allows the platform to satisfy the velocity requirements for saccades but also permits the NN to selectively grow in regions where more precision is needed. Learning methods can also be used to provide the control system with the desired position in advance (rather than fed back), as demonstrated by Dean et al. in [143]. Their implementation feeds a CMAC with signals from a trained linear NN, thus imitating the learning and teaching processes that occur naturally in the cerebellum.

It has been demonstrated that humans and primates use predictive and adaptive mechanisms to control their gaze. If there are errors in the visual tracking (originated by a poor prediction), these are accumulated and used to improve the model [157]. Accordingly, different studies have proposed the use of predictions to achieve efficient saccadic mechanisms. Predictions are used in [154] to select the visual feature that should be activated after a visual stimulus has been detected. A similar idea is found in [155], where a gaze control system is built upon saliency maps. At this stage, the points of interest in the image have already been identified. Then a Markovian switching system is used to select the gaze process that is necessary to track the object, namely saccades, smooth pursuit or fixation.

Experiments have also shown that humans use adaptation to adjust their saccadic and vestibulo-ocular systems to unexpected situations [157]. That is, the visual system can adapt and react according to the situation. This biological adaptation inspired the use of adaptive control strategies in the implementation of different visual processes, such as saccades [157,158] and smooth pursuit [157].

5.3.2. Optokinetic reflex and VOR

Retinal slip is the displacement velocity of the observed image in the retina. For example, a moving object passing in front of our eyes can appear blurry, but the image remains clear if it is the head that moves instead. The first phenomenon is related to OKR; the second, to VOR. OKR is a reflex that stabilizes the image using information from the retinal slip. The vestibulo-ocular reflex or VOR can be recreated by a feedback loop that uses inertial information (e.g. from IMUs) to obtain head velocity data [161]. It is commonly modeled using FEL architectures. It has been noted in the literature that VOR is a learning process; when retinal slip is affected, VOR is able to recover the stillness of the image in the retina by adapting to the new conditions [157]. The most popular method for modeling VOR is Kawato's FEL architecture. In 1992, Gomi and Kawato [147] modeled VOR using an adaptive FEL system, which was an extension of their original work [146]. Later, Shibata and Schaal [151] interpreted retinal slip as the error signal feeding the OKR model, likening it to a feedback control system. The output from the OKR is then used to feed a model for the VOR. However, Haith and Vijayakumar [145] noted that by using this method, the VOR-OKR system would be heavily dependent on the model's dynamics, implying that the controller would not generate appropriate signals if a variation in the dynamics occurs. Therefore, they proposed to isolate the VOR and the OKR, based on the fact that OKR does not depend on the system kinematics. Haith and Vijayakumar's approach relies on recurrent cerebellar architectures to train the OKR. The OKR's generated output then contributes to the training of the VOR subsystem.

5.3.3. Smooth pursuit

Smooth pursuit refers to the ability of the visual system to follow a moving object, once it has been detected by saccades. Most works found in the literature use adaptive or predictive models to simulate this visual feature. Predictive methods fit naturally into the smooth pursuit problem, rather than conventional feedback control, because visual feedback is a rather slow process. Predictive methods allow the system to estimate where the object will be in the future and to start directing the eyes in that direction. This is achieved by analyzing the overall movement of the object up to the current time. Examples of such implementations are found in [154,155]. Both of these works use predictions to decide which visual feature is to be activated (see Section 5.3.1).

Two different approaches to implement smooth pursuit via adaptive mechanisms are the works by Bahill and McDonald [157] and Lunghi et al. [153]. An Adaptive Neural Network framework is used in [157] to model smooth pursuit responses; this is the same approach that was used for saccadic motion, and that has been discussed in Section 5.3.1. Later, Lunghi et al. present a neural adaptive predictor to implement smooth pursuit. One interesting variation is that the gains of the predictor are updated using a fuzzy system. The predictor can effectively compensate a visual delay by modifying its parameters online [153].

Macesanu et al. present a tracking methodology for a 6-DOF humanoid vision system in which the cameras are able to detect and track a person moving in front of the robot. The method considers the delay associated with the time needed to process the images from the cameras, which, as noted in [167], can lead to instability of the control system. The motion of the cameras is achieved by a PID controller (one for each degree of freedom), while the dead-time is compensated using Smith predictors [156].

5.3.4. Vergence

As mentioned above, the role of vergence is to drive both eyes towards the observed target, with the purpose of maintaining it inside the fovea. To effectively realize vergence, each eye must be driven by its own reference, which varies depending on the position in space of the tracked object. As before, feedback is not the ideal solution to implement vergence, because of its naturally slow response time. This is why intelligent approaches, like neural networks, appear once again. Muhammad and Spratling [152] model vergence and saccade planning by means of a hierarchical neural network. The proposed model takes visual cues, as well as pose information, and maps them to the visual space.
It contrasts with other methods because it takes into account the fact that in humans, saccades are open-loop ballistic movements, whereas previous works considered them to be closed-loop. Muhammad and Spratling's interpretation of saccades leads to an adequate model of vergence as well.

5.4. Eye-head coordination for gaze control

In humans and primates, gaze control is accomplished thanks to the mutual contribution of the eyes and the head. When an observer gazes towards an object, the eyes shift in the direction of the target, and then the head follows suit [160,191]. In particular, the motion of the head serves to bring the target image into the fovea when the object is located outside of the field of view [136]. In robotics, gaze stabilization refers to the process of keeping a target focused in spite of object, head or body movements. It can include a feedback loop to process visual information and to calculate adequate head motions to attain a particular target, and a feed-forward loop that predicts head movements. When the object is within a region reachable by pure eye motion, the head's contribution to gaze shift is minimal. Then, once the target has been acquired, VOR ensures that it remains in focus in spite of head motion. Table 2 shows control approaches that have been used for gaze control when considering cooperation of the eyes and the head. The techniques in Table 2 are grouped as follows. First, there appear those methods based on cerebellar structure or functionality, such as RCA and the Locally Weighted Projection Regression (LWPR) learning method [169]. The methods based on some form of intelligent control are listed next; included here are NN and fuzzy logic models. Predictors are next, with instances of Kalman filters and Smith predictors. These are followed by Adaptive control techniques. In a separate category, one can find strategies based on Developmental Robotics [192], an emerging area of robotics that stems from human developmental theories of learning [142]. Bio-inspired techniques include those deriving from neurological models of the superior colliculus [193], or biological models of the eyes' movements, like Listing's and Donders' laws [184]. Next in the table are those solutions based on classical controllers, like PID and its variations, and finally, control schemes based on State Machines.

5.4.1. Saccades

It is clear from Table 2 that the most common architectures for implementing bioinspired saccades are the predictive and adaptive methods [174,177]. The high speeds required to implement saccades mean that predictive or neural approaches are better suited for the task, since they allow the eyes to react to a stimulus in an anticipated manner.

For Brown [177], gaze control is achieved by the collaborative work of various subsystems (the visual features) and body proprioception. Brown extended Bahill's approach [157] by taking into consideration the delays associated with gaze control. To cope with them, it is proposed to use a Smith predictor, which naturally handles delays. The algorithm also permits the coordination of different control subsystems associated with saccades, smooth pursuit, vergence, VOR and head compensation [177,194]. It should be noted, however, that in the absence of an exact model of the plant, Smith predictors are not suited to the task. In this case, a Kalman filter and predictor can offer better results [174]. In [174], supervised learning is used to select the most appropriate response from various visual features (saccade, smooth pursuit, VOR, OKR, vergence, or focus).

5.4.2. OKR and VOR

In complete eye/head coordination systems, cerebellar and neural methods seem to be favored when implementing OKR and VOR. In a study from 2010, Franci et al. compared FEL against Porril's decorrelation method, and concluded that, although FEL is computationally less demanding, the latter is better at guaranteeing a performance closer to that of humans. The analysis is realized in terms of image stabilization [137].

Gaze stabilization as a combination of VCR, VOR and OKR is considered by Vannucci et al. in [136]. They created a model of the cerebellar granular layer using a regression method (Locally Weighted Projection Regression, LWPR). This approach combines machine learning and cerebellar models to obtain a system that effectively learns the internal model of the plant. Falotico et al. [170] demonstrated that FEL is better at head stabilization tasks than traditional control methods, where VOR and OKR are considered in the model. The FEL controller outperforms the classical ones when a simulated sinusoidal perturbation is present. However, when the controllers were implemented in a real walking robot, the authors noted that the FEL controller is sensitive to perturbations caused by the initial impact of the robot's foot. Panerai et al. [171] consider VOR to achieve image stabilization in the presence of head motion. This adaptive reflex commands the eyes to move in the opposite direction of head movement by using information from the vestibular system, thus keeping the image stable in the retina. The authors use an unsupervised learning neural network that takes data from visual and inertial sensors to calculate the necessary control actions to achieve gaze stabilization.

5.4.3. Smooth pursuit

This feature can be interpreted as a feedback loop, where the reference is the object position. However, in order to successfully keep the target in the center of the retina, the visual system must predict the next position of the target based on its current pattern of motion. Hence, the most common methods for modeling smooth pursuit are predictive and adaptive strategies.

Adaptive approaches are presented by Milighetti et al. [176] and Vannucci et al. [175]. Milighetti et al. [176] propose a gaze control scheme based on an adaptive Kalman filter predictor. The proposed method is able to track an object moving along arbitrary trajectories by combining a proportional feedback loop and an adaptive feed-forward gain to predict the next state of the target. Later, Vannucci et al. [175] use an adaptive approach based on Asuni's neural controller [172]. The objective is to coordinate eye and head motions to achieve smooth pursuit of a moving object. The algorithm uses predictions of the target's motion, obtained from an extended Kalman filter, to ensure accurate tracking with zero delay and minimal tracking error.

An alternative approach is proposed by Wang et al. [189], where the visual system is built in terms of several PID controllers. The implemented active vision system accounts for saccades, smooth pursuit, VOR and vergence. Their proposal combines an open loop controller that ensures fast target fixation with a closed loop controller in charge of vergence. Additionally, smooth pursuit is achieved through the VOR system. Position control and velocity control are used jointly to satisfy the requirements of the saccadic and smooth pursuit systems. PID controllers are used for both smooth pursuit and vergence control in a feedback loop. A third PID controller is used to control head motion, which works together with VOR and vergence to ensure gaze stabilization.
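The predictive logic just described can be made concrete with a short sketch. The following Python fragment is only an illustration of the general idea, not a reproduction of any of the controllers cited above [156,175,176,189]: a constant-velocity Kalman filter estimates the target direction, extrapolates it over an assumed visual latency, and a PID velocity loop then drives a single eye pan joint towards the prediction. The plant model, gains, noise level and 30 ms latency are illustrative assumptions.

import numpy as np

class ConstantVelocityKF:
    """Kalman filter with state [position, velocity] for a single gaze axis."""
    def __init__(self, dt, q=1e-3, r=1e-5):
        self.x = np.zeros(2)                        # state estimate
        self.P = np.eye(2)                          # estimate covariance
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity model
        self.Q = q * np.eye(2)                      # process noise
        self.H = np.array([[1.0, 0.0]])             # only the position is measured
        self.R = np.array([[r]])                    # measurement noise

    def step(self, z):
        # Predict, then correct with the measured target direction z
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + (K @ y).ravel()
        self.P = (np.eye(2) - K @ self.H) @ self.P

    def predict_ahead(self, horizon):
        # Extrapolate the estimate to compensate the visual processing delay
        F = np.array([[1.0, horizon], [0.0, 1.0]])
        return (F @ self.x)[0]

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.i, self.prev = 0.0, 0.0

    def command(self, e):
        self.i += e * self.dt
        d = (e - self.prev) / self.dt
        self.prev = e
        return self.kp * e + self.ki * self.i + self.kd * d

if __name__ == "__main__":
    dt, latency = 0.005, 0.030                      # 200 Hz loop, assumed 30 ms delay
    rng = np.random.default_rng(0)
    kf, pid = ConstantVelocityKF(dt), PID(kp=4.0, ki=0.5, kd=0.02, dt=dt)
    eye = 0.0                                       # eye pan angle (rad)
    for k in range(600):
        t = k * dt
        target = 0.2 * np.sin(2 * np.pi * 0.5 * t)  # target direction (rad)
        kf.step(target + rng.normal(0.0, 0.003))    # noisy visual measurement
        predicted = kf.predict_ahead(latency)       # where the target will be
        eye += pid.command(predicted - eye) * dt    # velocity-controlled eye joint
    print(f"final tracking error: {target - eye:+.4f} rad")

In this toy loop the prediction step is what keeps the tracking error small despite the latency; dropping the extrapolation and feeding the raw measurement to the PID reproduces the lag that the predictive schemes above are designed to avoid.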
Table 2
Eye-head coordination gaze control methods (see [168]).
Technique    Implemented visual feature (Saccades, OKR/VOR, Smooth pursuit, Vergence, VCR, Foveation, Gaze direction, Head-eye coord.)
RCA [144] [137]
LWPR [169] [136] [136]
FEL [146,147] [137,170]
NN [168] [168,171] [168] [172]
ANN
Fuzzy [173]
Kalman [174] [174–176]
SP [177] [167] [167]
Ad [177] [178] [178] [179]
Dev [142] [180,181]
B-i [182,183] [184]
P [185] [186]
PD [170] [187,188]
PID [189] [189] [189]
O-L [189] [25]
SM [190]
RCA: Recurrent Cerebellar Architecture, LWPR: Locally Weighted Projection Regression, FEL: Feedback-error-learning, NN: Neural Network approaches, ANN: Adaptive Neural Network, SP: Smith Predictor, Ad: Adaptive methods, Dev: Developmental control, B-i: Bio-inspired control, P: Proportional control, PD: Proportional–Derivative control, PID: Proportional–Integral–Derivative control, O-L: Open Loop control, SM: State Machines.
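Feedback-error learning appears throughout the table above and in several of the preceding subsections, so a minimal sketch of the principle may help. In the fragment below, a linear-in-features feedforward module learns the inverse model of an assumed one-joint "eye" plant by treating the output of a PD feedback controller as its training error, so that the feedback effort shrinks over repeated trials. The plant parameters, gains, learning rate and trajectory are illustrative assumptions, and the linear model is a simplification of the neural networks used in the cited works.

import numpy as np

# Assumed linear plant: J*acc + b*vel + k*pos = u (a crude 1-DOF eye/neck joint)
J, b, k = 0.01, 0.05, 0.2
dt, T = 0.002, 2.0
steps = int(T / dt)

w = np.zeros(3)                 # feedforward inverse model: u_ff = w . phi(desired state)
eta = 0.05                      # learning rate for the feedforward weights
kp, kd = 8.0, 0.4               # PD feedback gains

def desired(t):
    """Desired angle, velocity and acceleration (a smooth, pursuit-like sinusoid)."""
    a, f = 0.3, 1.0
    wt = 2 * np.pi * f * t
    return a * np.sin(wt), a * 2 * np.pi * f * np.cos(wt), -a * (2 * np.pi * f) ** 2 * np.sin(wt)

for trial in range(5):
    pos, vel, fb_effort = 0.0, 0.0, 0.0
    for i in range(steps):
        qd, qd_dot, qd_ddot = desired(i * dt)
        phi = np.array([qd_ddot, qd_dot, qd])        # features of the desired trajectory
        u_ff = w @ phi                               # feedforward (learned inverse model)
        u_fb = kp * (qd - pos) + kd * (qd_dot - vel) # feedback (PD on tracking error)
        u = u_ff + u_fb
        # FEL update: the feedback command acts as the feedforward model's error signal
        w += eta * u_fb * phi * dt
        # Integrate the plant one step (semi-implicit Euler)
        acc = (u - b * vel - k * pos) / J
        vel += acc * dt
        pos += vel * dt
        fb_effort += abs(u_fb) * dt
    print(f"trial {trial}: mean |u_fb| = {fb_effort / T:.4f}")

Running the sketch shows the mean feedback effort decreasing from one trial to the next, which is the behavior the cerebellar interpretation of FEL is meant to capture: the feedforward path gradually takes over the work of the feedback path.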
5.4.4. Vergence

During vergence, each eye is driven by its own reference, inducing latencies that can interfere with the control system. An effective tracking solution therefore needs to account for these delays. Sharkey et al. [167] use Smith predictors to overcome the associated delays. The controller is implemented in two stages: a low-level system based on square root control (which drives the actuators), and a higher-level controller based on predictive methods that takes charge of the visual control processes. Results are demonstrated for the vergence control problem.

Another issue that should be considered is the effect that an object moving with an unpredictable trajectory can have on the tracking system. In [178], an adaptive scheme is proposed to track an object with unknown trajectory. The algorithm calculates the optimal joint actions, coordinating the eye, head and neck mechanisms, to keep the target in the center of the image. The method implements vergence control, tracking, and gaze control, considering head motions.

5.4.5. Foveation

When a scene is being surveyed, the head and eye movements allow the viewer to scan the area, bringing different parts of the scene into the fovea and thus detecting features of interest. We call this process foveation and, in humanoid robots, it is typically recreated by classical controllers, such as those based on Proportional (P) or Proportional–Derivative (PD) schemes.

Ude et al. [188] achieve foveated vision in a binocular robot head by means of PD controllers. This strategy considers the role of head motion in helping the eyes locate the target by increasing the observed area. Moreover, coordinated eye motion is implemented to provide a natural motion (rather than one eye roaming independently of the other), and to increase the chances of relocating the target if one eye loses it. Gu and Su [187] propose a gaze control algorithm for the general problem of tracking a visual target based on the kinematic equations of the system. Saccades are used to make the eyes converge on the target once it has entered the field of view. If a tracking error is detected, the head moves to compensate the error. Then, if motion of the target is detected, a series of PD controllers ensures that the eyes remain fixed on the target. Later, Omrčen and Ude [185] propose a foveated vision robot head that is controlled by means of a virtual mechanism for target tracking. This virtual link extends the model of the head with an additional joint 'fixed' to the eye, thus connecting the head to the object of interest. The problem is to control the position of the end of the virtual mechanism. The control objective is attained by means of a proportional velocity controller, with the system dynamics defined in terms of the Jacobian matrix.

5.4.6. Gaze direction

Although gaze direction can, in general terms, be subdivided into the performance of different visual actions, we have decided to set it apart to highlight some works that stand out from the works previously mentioned. We can see, from Table 2, that the proposed solutions in this category are somewhat balanced between intelligent [172,173], predictive or adaptive [167,179] and classical approaches [186,189], but a new category was also found, which we label Bio-inspired control.

Bio-inspired control includes works that aim to explicitly imitate the physiological phenomena occurring in vision. The difference between these and the previously presented methods (namely, FEL and CMAC) is that the latter use existing algorithms (like neural networks) and try to accommodate them into copying the responses of a certain natural phenomenon. Bio-inspired methods, on the other hand, create, step by step, artificial versions of those physiological responses; their purpose is to create an explicit model of the visual feature in consideration. For the problem of gaze direction we can mention the works in [182,183]. In these works, a robotic head is created to implement a neurophysiological model of gaze control by Goossens and Opstal [193]. This model includes three stages: the collicular mapping, the stimulus detection and the control of the eyes' velocities.

Intelligent control is employed in the works by Asuni et al. [172] and Yoo et al. [173]. Neural networks in the form of Self-organizing Neural Maps (SOM) are used in [172] for the control of the ARTS Lab sensorial head. A set of sensors acquires information about the motion of the head, and based on this data the neural network learns the dynamics of the robot by relating the motion of the joints with the gaze direction. Once the learning phase is over, the network provides the necessary motor commands to satisfy the reference given a desired gaze direction. The authors show evidence of effective target detection and gaze stabilization within human tolerances. Yoo et al. [173] use an integral-based fuzzy controller to solve the decision-making problem in a real-world setting. This work is explicitly directed towards Human–Robot Interaction. Stable tracking of a visual object is attained by Kryczka et al. [186] by means of an inverse dynamics based controller that drives the head orientation.
The system implements the Vestibulocollic Reflex using information obtained from IMU sensors.

Adaptive control is present in the work by Kuhn et al. from 2012 [179]. The authors propose an adaptive gaze control method based on saliency information from scene analysis. An interesting point of this proposal is that it reduces head motion by acting only when new, important data is detected by the robot's cameras.

Finally, Yoo and Kim [190] introduce a state machine system for gaze control. In this work, the objective is to teach the robot how to water a plant. The robot learns from observing a human tutor, with the gaze shifting decisions being dictated by the state machine.

5.4.7. Head-eye coordination

In the works reported, head-eye coordination is achieved by biologically inspired mechanisms, which we have classified as either bio-inspired control (based on physiological characteristics) or developmental control (based on psychological features). Instances of both are presented next.

An example of bio-inspired control applied to the head and eye coordination problem is found in [184]. The approach in this work detects saliency points that are used to activate adequate saccadic responses, based on their biological counterparts. These saccades are sent to the head and eye systems so that appropriate actions can be taken to track the object in question.

Developmental robotics is a relatively recent research area that combines robotics with developmental psychology and neuroscience. Developmental sciences state that humans acquire their abilities, from infancy, in stages [142]. For example, children are not born with the ability to walk; they first need to learn to control their body, to perform simple motions with arms and legs, etc. In developmental robotics, the objective is thus to endow robots with the necessary tools to learn how to perform complex tasks from simple ones [192,195]. Based on this premise, Law et al. [180,181] implement a neural approach for robot vision in which the robot gradually learns to coordinate eye and head movements to achieve gaze control. The approach shows that adding constraints (that enable staged learning) is beneficial to attaining the general objective.

6. Design and control considerations

Based on results available in the literature, we propose some parameters that should be considered for the global design and control of humanoid robot heads for HRI applications. In the first of the following subsections we will present a brief overview of humanoid robot head design. We will then continue our analysis in the subsequent subsections.

6.1. Overview

We have positioned several humanoid robot heads in Table 3. This table records the main sensing features presented by the listed humanoid robot heads. It also shows their neck mobility and whether or not they are capable of linguistic communication, as well as the para-linguistic communication cue of facial expressions. In this respect, one bullet corresponds to passive facial traits that cannot change dynamically but are still intended for appearance enhancement (such as Romeo [25] in Fig. 1); two bullets correspond to partially expressive faces where only some facial traits change dynamically (iCub [84] or EMYS [91]); three bullets correspond to heads where most facial traits can dynamically change, allowing a wider variety of emotions to be expressed. It appears clear that there has not been a linear evolution of humanoid robot heads. Expressive and non-expressive faces continue to be developed. It can also be appreciated that there is not one robot head which concentrates every sensing feature, because not all are needed for every application; however, vision and audition seem to be standard. A form of inertial sensing is not always present in the head because many humanoid robots locate these sensors elsewhere (for example, DRC-HUBO+ [54] has them at the pelvis). Neck mobility does not seem to generally increase, but most heads have 2 DOF or more. Communicative features such as speech and facial expressions are featured by a fair number of the robot heads in Table 3.

6.2. Vision system and neck

If a humanoid robot is to navigate its environment, the orientation of its vision system would change dynamically, hence the need for active vision schemes, as opposed to passive vision methods [42]. There are several visual features that need to be considered in the design of a truly biomimetic robot vision system: saccades, VOR and OKR, smooth pursuit, vergence, VCR, foveation, gaze direction and head-eye coordination. According to the analysis in Section 5, saccades are typically implemented using predictive and adaptive methods. VOR and OKR, which work hand in hand in stabilizing images in the retina, are mostly recreated via intelligent controllers. Smooth pursuit is traditionally achieved by Kalman filters and predictors, while vergence is generated by predictive and adaptive methods. Foveation, on the other hand, can be achieved through classical PID-type controllers. Many of the reported works on active vision treat gaze direction as a general problem where two or more of the individual features may merge. Therefore, we found that the reported implementations range from classical, to intelligent, to bio-inspired control techniques. Finally, in terms of head-eye coordination systems, developmental approaches and bio-inspired controllers are being investigated for achieving responses that are closer to human actions.

The human being has the ability to visualize a whole scene and, at the same time, focus on a specific object in the scene thanks to both peripheral and foveal vision. Foveated vision can be simulated with retina-like or space-variant sensors, where resolution is highest at the center and gradually decreases towards the periphery of the sensor [45,196]. However, these retina-like sensors are not easily available. Another solution is to use two cameras per eye, where one gives wide FOV images (peripheral vision) and the other provides narrow FOV images through a longer focal length lens (to simulate the fovea), as in ARMAR-III (Fig. 1). In terms of control, foveation can be achieved by directing and maintaining the image inside the high definition area; this is typically achieved by classical methods like P and PD controllers [185,187,188]. When the head is of the projected type, one may consider RGB-D sensors, which provide vision with depth information, and sometimes audition, in a fully integrated system. Two main RGB-D sensors abound among humanoid robot heads: the Microsoft Kinect [109,197,198] and the Asus Xtion Pro Live sensor [10,50,110].

If the vision system has actuated cameras, and the goal is to reproduce the kinematic and dynamic capabilities of the human visual apparatus, one may consider the following data. Eye angular velocities lie between 600 °/s and 1000 °/s, and angular accelerations between 35,000 °/s² and 45,000 °/s², during saccadic motions [138]. Saccadic motions are of low amplitude, generally of no more than 20°. Motions requiring velocities above 30 °/s usually elicit saccadic motions and are not in the domain of smooth pursuit [199]. Smooth pursuit and saccades are normally modeled by intelligent, adaptive or predictive methods. If the robot is expected to interact with humans within a dynamic environment, then its responses must be dynamic as well.
Table 3
Sensing and communicative features of several robot heads.
Robot head    Circa    Sensing features    Neck DOF    Verbal speech    Facial expressions
Wabot-2 [21] 1984 V A I S D L G 0 •
KTH Head [47] 1992 V A I S D L G 3
Tokyo face robot [71] 1993 V A I S D L G 0 •••
COG [49,63] 1995 V A I S D L G 3
WE-2/Hadaly [24] 1995 V A I S D L G 2
HRP-2 [105] 1998 V A I S D L G 2
WE-3 [48,53] 1998 V A I S D L G 4 • •••
Kismet [40] 1999 V A I S D L G 1 • •••
Robonaut [112] 2000 V A I S D L G 2
Asimo [113] 2000 V A I S D L G 2
WE-4 [23,32] 2002 V A I S D L G 4 • •••
Robovie [114] 2002 V A I S D L G 3
Qrio [77] 2004 V A I S D L G 2 •
Leonardo [87] 2004 V A I S D L G 4 •••
Philip K. Dick [77] 2005 V A I S D L G N/A • •••
Barthoc (Jr.) [126] 2006 V A I S D L G 4 • •••
ARMAR-III [9] 2006 V A I S D L G 4
Albert HUBO [28] 2006 V A I S D L G 3 • •••
iCub [84] 2006 V A I S D L G 3 • ••
HRP-4C [79] 2009 V A I S D L G 3 • •••
Nao [93] 2009 V A I S D L G 2 • •
Twente [94] 2009 V A I S D L G 4 ••
Flobi [92] 2010 V A I S D L G 3 • •••
EveR-2 [80] 2011 V A I S D L G 3 • •••
Mask-bot 2i [96] 2012 V A I S D L G 3 • •••
ARMAR-4 [108] 2013 V A I S D L G 5
EMYS [91] 2013 V A I S D L G 3 ••
Kobian-R [27] 2013 V A I S D L G 4 • •••
Atlas-DRC [26] 2013 V A I S D L G 1
CHIMP [13] 2013 V A I S D L G 0
RoboThespianTM [127] 2013 V A I S D L G 3 • •••
Romeo [25] 2014 V A I S D L G 4 • •
SociBot [29] 2014 V A I S D L G 2 • •••
Valkyrie [12] 2014 V A I S D L G 3
TORO [10] 2015 V A I S D L G 2 •
HRP-2Kai [55] 2015 V A I S D L G 1
DRC-HUBO+ [54] 2015 V A I S D L G 2
ISR-RobotHead [128] 2017 V A I S D L G 2 • •••
SCIPRR [123] 2018 V A I S D L G 4 •••
Armar-6 [34] 2018 V A I S D L G 2 •••
Sensing features: V (vision), A (audition), I (VOR/OKR through IMU), S (Smell), D (RGB-D), L (Lidar), G (INS/GPS). Facial
expressions: • (passive traits), •• (partially expressive face), ••• (fully expressive face).
The motion of the visual cues may be erratic, or too fast; regardless, the visual system must be able to track it. This is achievable by control structures that can predict the motion of the object, or that can adjust their parameters as needed.

As stated in Section 3 and Table 3, early non-expressive face heads had simple neck mechanisms. However, redundancy allows for a wider variety of motions while orienting the head [23]. A more anthropomorphous construction with a four or more DOF neck (roll, yaw, and lower and upper pitch), such as in [108], allows for natural motions more suitable for HRI [200]. Another important aspect to consider is the interaction of the eyes with the head (or neck). An appropriate coordination of these will allow the robot to react naturally, and to be familiar to the humans around it. In this respect, new robot design methodologies based on psychological human development [195] are being introduced and gaining popularity, and control techniques based on this theory have appeared [180,181].

6.3. Appearance and behavior

A robot head usually shapes the physical appearance of a humanoid robot into something that looks human-like (or animal-like in some examples). To enhance acceptance of humanoid robots during HRI, there are two types of considerations to bear in mind. The first one is the robot appearance. The second one has more to do with what the robot is able to do to interact and convey intentionality.

As we outlined in previous sections, non-expressive face robot heads have been designed with HRI applications in mind. One way to enhance their appearance is to design them so as to evoke familiarity to the user in a passive manner (such as Romeo [25], Fig. 1, or Pearl [81]). Passive designs may require considering factors such as head width and shape [201]; inter-pupillary distance (distance between eyes) [202,203]; or the presence of traits such as browline, eyes and mouth [81]. As an image-based survey conducted on NAO suggests, the simple addition of one facial trait greatly increased the recognition rate of anger and sadness [93].

Interestingly, live tests have confirmed that the presence of actuated facial traits enhances recognition of emotions expressed by the robot [40]. However, the face is not the only appendage that can enhance robot acceptance. In fact, it has been found that whole-body motions and behavior can greatly increase the acceptance rate of a robot [114] by conveying intentionality. However, intentionality can still be conveyed by gaze directions [15]. Motor mimicry [126], dynamical update of facial expressions, generation of emotion expression according to external stimuli [204] and well-defined emotion expression [92,204] are all recommended features in a humanoid robot head for HRI. Even though legged locomotion does not particularly improve acceptance [204], expressive full-body motions do [114,204] (here, the head is not the only agent of communication but part of it). Exaggerated emotion expression tends to elevate the recognition rate of certain emotions [91] and is an interesting feature to consider.
Regardless, it seems that all of these functions positively impact the quality of interactions, and their implementation is recommended. In terms of appearance-enhancing design parameters, it seems that eyebrows and eyelids are a very effective way to convey emotion, since they are the most repeatedly actuated facial traits. Actuated lips are also repeatedly used and usually suffice for displaying motions during speech, which can be enhanced by an actuated jaw [89,205]. We must emphasize that while it is true that a robot can convey intentionality using body motions (such as Baxter in [118]), for the non-expressive head (we are after all focusing on the head), motion patterns are a combination of neck motions with gaze directions. Even though intonation is still a possible para-linguistic signal, it is the presence of the neck that enables the motion patterns that convey intentionality. It has been shown that fine-grained variations of body language are required to give a humanoid robot the illusion of life [37]. Transporting this idea to just the head, fair neck and eye mobility along with an expressive face might be required in order to achieve such fine-grained variations. We therefore consider the neck as an important component of robot heads.

We have found that non-expressive face heads have initially been used in applications where no complex communication skills are required, such as active vision and navigation research, disaster response, or reach and grasp for manipulation. These are qualities needed for autonomously navigating an environment built for humans but do not allow for direct engagement in HRI. We also see that there are expressive face heads that display facial expressions with a mixture of both actuated and displayed facial traits, such as iCub [84] or the Twente Head [94] (which in fact lacks hearing). Finally, we see that android robot heads tend to be the most appearance-enhanced and present all the basic sensorial needs.

In terms of HRI-enhancing functions, humanoid robot heads should show an empathic and engaging behavior based on the voice pattern or the facial expressions of an interacting agent. This has been partially done in [126], and such capabilities are certainly required in HRI [35]. After all, a sociable robot is one that adapts its behavior according to stimuli perceived from the environment; thus, a proper and real-time analysis of actions and gestures performed by interacting agents is needed [35]. Ambient noise seems to affect audition capabilities of humanoids in industrial settings [34], so there is room for improvement in this respect. "Cutaneous sensing" in a humanoid robot head could be implemented by a combination of touch sensors and impedance control. It should be kept in mind that while robot head appearance and behavior greatly enhance HRI, identifying and interpreting the interacting partner's intentions may prove to be even more important in a collaborative context. Research on this issue has been conducted [118,126,206], showing that an attentive state is of great importance [87,207], but more research should be done in this direction. Additionally, a few studies have compared expressive face heads against non-expressive face heads [77,208], mostly on appearance parameters. These studies evaluate their design and perceived intelligence by the human counterpart. Other studies focus more on acceptance and usability of the overall humanoid robot [34] and are a step in the right direction. The idea that a more human-looking robot head results in a more natural interaction [206] may hold in certain circumstances; however, more field studies are needed in order to determine whether an expressive face head positively impacts active collaborative tasks when compared to a non-expressive robot head (and to what extent). After all, it seems that in a collaborative context, interacting partners look at the humanoid robot head in search of social cues [34].

6.4. Some mechanical design challenges

Interesting developments such as foveated vision [31,49,108] have arisen in sensor heads for humanoid robots. However, two cameras per eye might result in a bulky design that may negatively impact the robot appearance. Space-variant sensors are an alternative, but these are not readily available (see Section 6.2). Perhaps foveating lenses with a higher magnification at the center than at the periphery [98,209] can alternatively be considered. There is also a challenge in reproducing the kinematics of the human visual apparatus. If we take the data introduced in Section 6.2, we can see that velocities and accelerations during saccadic motions are very high. Some humanoid robot heads come near such figures (as Romeo [25], Fig. 1, for example), but fully reaching human performance would require more power, bigger actuators and bulkier designs. Additionally, mechanical design parameters such as eye and neck ranges of motion are usually taken from established literature coming from medical or statistical data of the human being [32,92]. However, these results are often taken out of the context of the more natural and dynamic set of motions that arise during interaction. It might be beneficial to perform motion analysis of the human neck to identify natural ranges of motion during interaction, as has been done for other body parts [210].

Table 3 shows that few robots are equipped with more than three sensing features. Adding more sensing features certainly complicates the design task. The challenge increases when the robot is also provided with actuated facial expressions and high neck mobility along with many sensing features (even more so by adding artificial skin and hair). The numerous required actuators elevate both the cost and the noise coming from the actuation system. Noise may reduce the quality of HRI [78], and more silent mechanisms may be required. Also, high power-to-size ratio actuators and structural optimization [108] become necessary to avoid bulky designs. Passive (face) designs are therefore appealing because they reduce the number of required drives and thus the noise they produce. However, it has been suggested that non-expressive face heads may rely on other body parts to communicate para-linguistically and obtain good acceptance results [114]. Humanoid robot heads with projected faces [29,96] also reduce the number of required drives because these do not need to actuate facial traits. These have not yet been fitted to a walking humanoid robot, and the consulted literature does not mention whether embedded projection technology is compatible with the dynamics of walking.

6.5. Perspectives in gaze control

We have seen that a lot of the existing works on gaze control have relied on predictions, adaptations and learning methods to imitate human vision. In parallel, artificial intelligence (AI) has been receiving a lot of attention from various research groups in recent years, most notably methodologies such as deep learning, evolutionary computation, and developmental robotics.

When it comes to humanoid robotic vision, there are several challenges that still need to be addressed. The challenge goes further than merely controlling the position and velocity of the eyes and/or head. Before commanding them to gaze in a certain direction, it is first necessary to develop a biomimetic visual detection algorithm that is capable of human-like responses. Target detection in complex, unstructured or cluttered scenes has been an object of study by artificial vision groups for several years, with deep learning methodologies being the most commonly used. For instance, in [211] Deep Convolutional Neural Networks (DCNN) are employed to train the Baxter robot to grasp unknown objects in real time. Other developments focus on the design of visual attention (VA) algorithms.
An example based on Genetic Programming (GP), known as Brain Programming (BP), automates the search for VA programs [212]; an application example can be found in [39]. The VAs are built on the premise that vision is performed in different areas of the brain, in a hierarchical manner, and the model is evolved to find the best solutions out of all the possible visual attention programs. The proposed VA model is shown to be better than other man-made solutions.

Once the target has been acquired by the cameras, it is necessary to design appropriate control solutions that can manage the different subsystems that contribute to reproduce the various reactions required. That is, to achieve a true biomimetic response from the robot, it is necessary to coordinate the various motions of the eyes and movements of the head and, in the case of more complex robotic heads, even manage movements of the lips, cheeks, eyelids, etc. Strategies based on evolutionary computing could prove useful to solve these issues. A new control design methodology, known as Analytic Behaviors based control, has been employed for solving complex control challenges under different constraints [38]. This methodology, which emanates from Genetic Programming, performs a search for optimal solutions, given certain restrictions. The result is a set of analytic expressions that can be used to define a controller, which can then be analyzed under the control theory framework. Developmental robotics, which is already being used in the context of humanoid vision, is another interesting AI methodology that could offer promising results in the solution of several of the open problems in HRI.

The performance evaluation of the different control strategies varies from application to application. In some works, the performance evaluation criteria focus on how well the system reacts to stimuli, how fast the 'eyes' can saccade to the moving target, or how efficient the control system is in keeping the visual objective in focus, but there is no explicit comparison with human responses. Some other evaluations focus on one specific human visual process and explicitly compare the robot's response with that of a human subject. It would be interesting to establish a benchmark to evaluate the appropriateness of each solution in terms of likeness to the human visual system, and especially, the degree of acceptance from the users.

7. Conclusion

We proposed to define a humanoid robot head as one designed with the intention of being embedded in a humanoid robot, that contains at least one guiding sense and that evokes human-like physical traits. Therefore, we focused our study on existing robot heads meeting these requirements. We have reviewed the general evolution of humanoid robot heads. These robot heads are generally composed of various sensing devices such as vision, audition and artificial vestibulo-ocular systems, but other sensors are also used in conjunction. We have found that vision is the most studied and abundant perception mechanism embedded in humanoid robot heads. Many humanoid robot heads are currently able to detect sounds, process speech, scan and recreate a 3D representation of the environment, communicate with para-linguistic signals such as facial expressions and gaze directions, and even bear a striking physical resemblance to the human being by incorporating artificial skin and hair. Very few examples are capable of detecting smells, so this seems to be an underrated feature which could well be of use as a guiding sense for humanoids. Humanoid robot heads are either of the expressive face type or the non-expressive face type. We have seen that a non-expressive head can perform well in HRI, but the para-linguistic communication cue of facial expressions remains very useful to enhance robot acceptance.

Because sight has been the first and most studied guiding sense implemented in a humanoid robot head, we have presented a review of the various control strategies that are commonly used in active vision systems for these robots. Gaze control is usually implemented in two steps: a higher-level controller that is concerned with the visual aspects of gaze, namely stimuli detection, tracking, and gaze stabilization; and a low-level control that manages the actuator system. In terms of the higher-level control, it is evident from the literature that the current trend points towards physiologically or psychologically inspired controllers that emulate the functionality of the central nervous system, the vestibular system, or the brain. Three approaches stand out: cerebellar architectures, neural networks, and adaptive or predictive methods, all of which have been observed in biological visual systems. While some works propose analytic, non bio-inspired, active vision solutions, the majority of the research in the literature is based on some biological model of vision, aiming to replicate it, especially since it has been demonstrated by various authors that such bio-inspired controllers give the most human-like responses. In recent years, artificial intelligence (AI) methodologies have been gaining popularity amongst researchers, with applications to HRI, complex systems, robotics, artificial vision, etc. Here we can find deep convolutional neural networks (DCNN), brain programming (BP) and genetic programming (GP). In particular, DCNN and BP have already been employed for recreating biomimetic artificial vision systems. GP, on the other hand, has proven useful for controlling complex systems with various restrictions. Its extension to systems composed of multiple, possibly heterogeneous, subsystems is an active area of research. Since humanoid robot heads can be studied as complex systems, the implementation of active vision systems that use scene information acquired from DCNN or BP, and then compute control actions from GPs, for instance, seems like a promising research area for humanoid roboticists. An alternative to this is the field of developmental robotics, which bridges psychology, neuroscience and robotics to create robots that learn and evolve in a human-like fashion. The premise here is to build robots that acquire new abilities to solve complex tasks from the learning of basic skills.

We have included a set of considerations regarding the global design and control strategies that could enhance HRI involving humanoid robot heads. We believe that a humanoid robot head should also be viewed as a means of communication, designed to enhance the quality of HRI. We have seen that much focus has gone towards the appearance of the robot head (be it expressive or non-expressive); however, more research should be performed in topics such as real-time emotion analysis and recognition (via audition and vision) of interacting agents, and the generation of extensive and fine variations of facial expressions and head motions. Field studies evaluating acceptance and usability of expressive face heads are also needed to assess the extent of their impact on collaborative tasks and general HRI, as compared to non-expressive face robot heads. Applications for humanoid robot heads range from customer service tasks such as client reception, guidance or user support, to clinical services such as care for the elderly and vulnerable patients, or therapy for individuals with cognitive disabilities. In applications such as emergency relief, appearance and communicative skills do not yet appear to be essential requirements. The success of current disaster-response humanoid robots only partially depends on the sensor head, whose performance has to be accurate and robust. However, humanoid robots are increasingly being used to enhance therapy by accompanying patients during rehabilitation tasks. Commercial and entertainment applications are also becoming more abundant. In these instances, HRI naturally occurs. Consequently, robot heads with a pleasant demeanor are desirable in order to enhance acceptance from the interacting agent.
Thus, there is a need to enhance the robot's social behavior characteristics, such as empathy and engagement, which can be emulated by displaying appropriate verbal and/or non-verbal signals.

Overall, substantial progress has been made in the areas of active vision, cognition, design, emotion expression and recognition, as well as HRI and human–robot collaboration. Many of these developments work well for their independent application scenarios; however, no humanoid robot head currently exists that is capable of combining all of the individual approaches. The growing complexity of the models involved may complicate their implementation into a single autonomous system, and deployment in uncontrolled human environments may be needed to assess real-world requirements and the general robustness of the humanoid robot head. Another open challenge is the issue of robot knowledge for active vision and social behavior: it is impossible to pre-program robot heads for all possible situations, therefore robot heads should be able to acquire knowledge and reasoning capabilities. It could also be useful for these robots to learn from failures (to identify objects, or when displaying the wrong expression) when an interacting agent indicates so. For collaborative tasks, there is a need to anticipate the intentions of interacting agents in order to communicate according to the needs of the situation.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by Secretaría de Educación Pública (SEP) and Consejo Nacional de Ciencia y Tecnología (CONACYT) under Grant A1-S-29824.

References

[1] B. Duran, S. Thill, Rob's robot: Current and future challenges for humanoid robots, in: The Future of Humanoid Robots - Research and Applications, InTech, 2012, http://dx.doi.org/10.5772/27197.
[2] S. Saeedvand, M. Jafari, H.S. Aghdasi, J. Baltes, A comprehensive survey on humanoid robot development, Knowl. Eng. Rev. 34 (2019) e20, http://dx.doi.org/10.1017/S0269888919000158.
[3] B. Burton, The smithsonian's new tour guide isn't human, 2018, https://www.cnet.com/news/smithsonian-museum-new-tour-guide-is-a-pepper-robot-from-softbank/.
[4] J. Brown, Navigating the crowd with SPENCER, 2016, https://news.cnrs.fr/articles/navigating-the-crowd-with-spencer.
[5] M. de la Grande Guerre, Visite à distance, 2018, https://www.museedelagrandeguerre.eu/fr/espace-pedagogique/visitez-le-musee/visite-a-distance.html.
[6] D. Carvajal, Let a robot be your museum tour guide, 2017, https://www.nytimes.com/2017/03/14/arts/design/museums-experiment-with-robots-as-guides.html.
[7] J. Archer, 'Friendly' hospital robot begins trials to help stressed nurses, 2018, https://www.telegraph.co.uk/technology/2018/09/19/friendly-hospital-robot-begins-trials-help-stressed-nurses/.
[8] J. Vincent, LG's new airport robots will guide you to your gate and clean up your trash, 2017, https://www.theverge.com/2017/7/21/16007680/lg-airport-robot-cleaning-guide-south-korea-incheon.
[9] A. Albers, S. Brudniok, J. Ottnad, C. Sauter, K. Sedchaicharn, Design of modules and components for humanoid robots, in: A.C. de Pina Filho (Ed.), Humanoid Robots, IntechOpen, Rijeka, 2007, http://dx.doi.org/10.5772/4857.
[10] J. Englsberger, A. Werner, C. Ott, B. Henze, M.A. Roa, G. Garofalo, R. Burger, A. Beyer, O. Eiberger, K. Schmid, A. Albu-Schaffer, Overview of the torque-controlled humanoid robot TORO, in: 2014 IEEE-RAS International Conference on Humanoid Robots, IEEE, 2014, http://dx.doi.org/10.1109/humanoids.2014.7041473.
[11] K. Kojima, T. Karasawa, T. Kozuki, E. Kuroiwa, S. Yukizaki, S. Iwaishi, T. Ishikawa, R. Koyama, S. Noda, F. Sugai, S. Nozawa, Y. Kakiuchi, K. Okada, M. Inaba, Development of life-sized high-power humanoid robot JAXON for real-world use, in: 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), 2015, pp. 838–843, http://dx.doi.org/10.1109/HUMANOIDS.2015.7363459.
[12] N.A. Radford, P. Strawser, K. Hambuchen, J.S. Mehling, W.K. Verdeyen, A.S. Donnan, J. Holley, J. Sanchez, V. Nguyen, L. Bridgwater, R. Berka, R. Ambrose, M. Myles Markee, N.J. Fraser-Chanpong, C. McQuin, J.D. Yamokoski, S. Hart, R. Guo, A. Parsons, B. Wightman, P. Dinh, B. Ames, C. Blakely, C. Edmondson, B. Sommers, R. Rea, C. Tobler, H. Bibby, B. Howard, L. Niu, A. Lee, M. Conover, L. Truong, R. Reed, D. Chesney, R. Platt Jr, G. Johnson, C.-L. Fok, N. Paine, L. Sentis, E. Cousineau, R. Sinnet, J. Lack, M. Powell, B. Morris, A. Ames, J. Akinyode, Valkyrie: NASA's first bipedal humanoid robot, J. Field Robotics 32 (3) (2015) 397–419, http://dx.doi.org/10.1002/rob.21560.
[13] A. Stentz, H. Herman, A. Kelly, E. Meyhofer, G.C. Haynes, D. Stager, B. Zajac, J.A. Bagnell, J. Brindza, C. Dellin, M. George, J. Gonzalez-Mora, S. Hyde, M. Jones, M. Laverne, M. Likhachev, L. Lister, M. Powers, O. Ramos, J. Ray, D. Rice, J. Scheifflee, R. Sidki, S. Srinivasa, K. Strabala, J.-P. Tardif, J.-S. Valois, J.M. Vande Weghe, M. Wagner, C. Wellington, CHIMP, the CMU highly intelligent mobile platform, J. Field Robotics 32 (2) (2015) 209–228, http://dx.doi.org/10.1002/rob.21569.
[14] N.J. Ferrier, Harvard binocular head, in: Proc. SPIE 1708, Applications of Artificial Intelligence X: Machine Vision and Robotics, Vol. 1708, 1992, http://dx.doi.org/10.1117/12.58557.
[15] N. Emery, The eyes have it: the neuroethology, function and evolution of social gaze, Neurosci. Biobehav. Rev. 24 (6) (2000) 581–604, http://dx.doi.org/10.1016/S0149-7634(00)00025-7.
[16] L. Natale, G. Metta, G. Sandini, Development of auditory-evoked reflexes: Visuo-acoustic cues integration in a binocular head, Robot. Auton. Syst. 39 (2) (2002) 87–106, http://dx.doi.org/10.1016/S0921-8890(02)00174-4.
[17] L. Rayleigh, XII. On our perception of sound direction, Philos. Mag. Ser. 6 13 (74) (1907) 214–232, http://dx.doi.org/10.1080/14786440709463595.
[18] R. Russell, A. Purnamadjaja, Odor and airflow: complementary senses for a humanoid robot, in: Proceedings of the IEEE International Conference on Robotics and Automation, IEEE, 2002, http://dx.doi.org/10.1109/robot.2002.1014809.
[19] M. Bonnefille, in: K. Goodner, R. Rouseff (Eds.), Practical Analysis of Flavor and Fragrance Materials, Wiley, 2011, pp. 111–154.
[20] R.A. Brooks, L.A. Stein, Building brains for bodies, Auton. Robots 1 (1) (1994) 7–25, http://dx.doi.org/10.1007/BF00735340.
[21] I. Kato, S. Ohteru, K. Shirai, T. Matsushima, S. Narita, S. Sugano, T. Kobayashi, E. Fujisawa, The robot musician 'wabot-2' (waseda robot-2), Robotics 3 (2) (1987) 143–155, http://dx.doi.org/10.1016/0167-8493(87)90002-7.
[22] H. Miwa, A. Takanishi, H. Takanobu, Experimental study on robot personality for humanoid head robot, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vol. 2, 2001, pp. 1183–1188, http://dx.doi.org/10.1109/IROS.2001.976329.
[23] F. Patane, C. Laschi, H. Miwa, E. Guglielmelli, P. Dario, A. Takanishi, Design and development of a biologically-inspired artificial vestibular system for robot heads, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Vol. 2, 2004, pp. 1317–1322, http://dx.doi.org/10.1109/IROS.2004.1389578.
[24] A. Takanishi, S. Ishimoto, T. Matsuno, Development of an anthropomorphic head-eye system for robot and human communication, in: Proceedings of the IEEE International Workshop on Robot and Human Communication, 1995, pp. 77–82, http://dx.doi.org/10.1109/ROMAN.1995.531938.
[25] N. Pateromichelakis, A. Mazel, M.A. Hache, T. Koumpogiannis, R. Gelin, B. Maisonnier, A. Berthoz, Head-eyes system and gaze analysis of the humanoid robot romeo, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 1374–1379, http://dx.doi.org/10.1109/IROS.2014.6942736.
[26] G. Nelson, A. Saunders, R. Playter, The PETMAN and atlas robots at boston dynamics, in: A. Goswami, P. Vadakkepat (Eds.), Humanoid Robotics: A Reference, Springer Netherlands, Dordrecht, 2019, pp. 169–186, http://dx.doi.org/10.1007/978-94-007-6046-2_15.
[27] G. Trovato, T. Kishi, N. Endo, M. Zecca, K. Hashimoto, A. Takanishi, Cross-cultural perspectives on emotion expressive humanoid robotic head: Recognition of facial expressions and symbols, Int. J. Soc. Robot. 5 (4) (2013) 515–527, http://dx.doi.org/10.1007/s12369-013-0213-z.
[28] J. h. Oh, D. Hanson, W. s. Kim, Y. Han, J. y. Kim, I. w. Park, Design of android type humanoid robot albert HUBO, in: Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006, pp. 1428–1433, http://dx.doi.org/10.1109/IROS.2006.281935.
[29] P. Wills, P. Baxter, J. Kennedy, E. Senft, T. Belpaeme, Socially contingent humanoid robot head behaviour results in increased charity donations, in: Proceedings of the 11th ACM/IEEE International Conference on Human-Robot Interaction, 2016, pp. 533–534, http://dx.doi.org/10.1109/HRI.2016.7451842.
[30] R.A. Brooks, Behavior-based humanoid robotics, in: Proceedings of the [51] A. Takanishi, H. Takanobu, I. Kato, T. Umetsu, Development of the anthro-
IEEE/RSJ International Conference on Intelligent Robots and Systems, Vol. pomorphic head-eye robot WE-3rii with an autonomous facial expression
1, 1996, pp. 1–8, http://dx.doi.org/10.1109/IROS.1996.570613. mechanism, in: Proceedings of the IEEE International Conference on
[31] H. Kozima, Infanoid, a babybot that explores the social environment, in: K. Robotics and Automation, Vol. 4, 1999, pp. 3255–3260, http://dx.doi.org/
Dautenhahn, A. Bond, L. Cañamero, B. Edmonds (Eds.), Socially Intelligent 10.1109/ROBOT.1999.774094.
Agents: Creating Relationships with Computers and Robots, Springer US, [52] A. Takanishi, K. Sato, K. Segawa, H. Takanobu, H. Miwa, An anthro-
Boston, MA, 2002, pp. 157–164, http://dx.doi.org/10.1007/0-306-47373- pomorphic head-eye robot expressing emotions based on equations of
9_19. emotion, in: Proceedings of the IEEE International Conference on Robotics
[32] H. Miwa, T. Okuchi, H. Takanobu, A. Takanishi, Development of a new and Automation, Vol. 3, 2000, pp. 2243–2249, http://dx.doi.org/10.1109/
human-like head robot WE-4, in: Proceedings of the IEEE/RSJ Interna- ROBOT.2000.846361.
tional Conference on Intelligent Robots and Systems, Vol. 3, 2002, pp. [53] H. Miwa, T. Umetsu, A. Takanishi, H. Takanohu, Human-like robot head
2443–2448, http://dx.doi.org/10.1109/IRDS.2002.1041634. that has olfactory sensation and facial color expression, in: Proceedings
[33] S. Ali, F. Mehmood, Y. Ayaz, U. Asgher, M.J. Khan, Effect of different of the IEEE International Conference on Robotics and Automation, Vol. 1,
visual stimuli on joint attention of ASD children using NAO robot, 2001, pp. 459–464, http://dx.doi.org/10.1109/ROBOT.2001.932593.
in: Advances in Neuroergonomics and Cognitive Engineering, Springer [54] J. Lim, I. Lee, I. Shim, H. Jung, H.M. Joe, H. Bae, O. Sim, J. Oh, T. Jung,
International Publishing, 2019, pp. 490–499, http://dx.doi.org/10.1007/ S. Shin, K. Joo, M. Kim, K. Lee, Y. Bok, D.-G. Choi, B. Cho, S. Kim, J.
978-3-030-20473-0_48. Heo, I. Kim, J. Lee, I.S. Kwon, J.-H. Oh, Robot system of DRC-HUBO+ and
[34] B. Busch, G. Cotugno, M. Khoramshahi, G. Skaltsas, D. Turchi, L. Urbano, control strategy of team KAIST in DARPA robotics challenge finals, J. Field
M. Wächter, Y. Zhou, T. Asfour, G. Deacon, D. Russell, A. Billard, Evaluation Robotics 34 (4) (2017) 802–829, http://dx.doi.org/10.1002/rob.21673.
[55] K. Kaneko, M. Morisawa, S. Kajita, S. Nakaoka, T. Sakaguchi, R. Cisneros,
of an industrial robotic assistant in an ecological environment, in: 2019
F. Kanehiro, Humanoid robot HRP-2kai — Improvement of HRP-2 towards
28th IEEE International Conference on Robot and Human Interactive
disaster response tasks, in: 2015 IEEE-RAS 15th International Conference
Communication (RO-MAN), 2019, pp. 1–8, http://dx.doi.org/10.1109/RO-
on Humanoid Robots (Humanoids), 2015, pp. 132–139, http://dx.doi.org/
MAN46459.2019.8956399.
10.1109/HUMANOIDS.2015.7363526.
[35] A. Greco, A. Roberto, A. Saggese, M. Vento, V. Vigilante, Emotion analysis
[56] S. Kim, M. Kim, J. Lee, S. Hwang, J. Chae, B. Park, H. Cho, J. Sim, J. Jung, H.
from faces for social robotics, in: 2019 IEEE International Conference on
Lee, S. Shin, M. Kim, N. Kwak, Y. Lee, S. Lee, M. Lee, S. Yi, K.K.C. Chang,
Systems, Man and Cybernetics (SMC), 2019, pp. 358–364, http://dx.doi.
J. Park, Approach of team SNU to the DARPA robotics challenge finals,
org/10.1109/SMC.2019.8914039.
in: 2015 IEEE-RAS 15th International Conference on Humanoid Robots
[36] C.T. Ishi, T. Minato, H. Ishiguro, Analysis and generation of laughter
(Humanoids), 2015, pp. 777–784, http://dx.doi.org/10.1109/HUMANOIDS.
motions, and evaluation in an android robot, APSIPA Trans. Signal Inf.
2015.7363458.
Process. 8 (2019) http://dx.doi.org/10.1017/atsip.2018.32. [57] M.J. Schuster, J. Okerman, H. Nguyen, J.M. Rehg, C.C. Kemp, Perceiving
[37] M. Marmpena, A. Lim, T.S. Dahl, N. Hemion, Generating robotic emotional clutter and surfaces for object placement in indoor environments, in:
body language with variational autoencoders, in: 2019 8th International 2010 10th IEEE-RAS International Conference on Humanoid Robots, 2010,
Conference on Affective Computing and Intelligent Interaction (ACII), pp. 152–159, http://dx.doi.org/10.1109/ICHR.2010.5686328.
2019, pp. 545–551, http://dx.doi.org/10.1109/ACII.2019.8925459. [58] C. Knabe, R. Griffin, J. Burton, G. Cantor-Cooke, L. Dantanarayana, G. Day,
[38] M. Meza-Sánchez, E. Clemente, M. Rodríguez-Liñán, G. Olague, Synthetic- O. Ebeling-Koning, E. Hahn, M. Hopkins, J. Neal, J. Newton, C. Nogales,
analytic behavior-based control framework: Constraining velocity in V. Orekhov, J. Peterson, M. Rouleau, J. Seminatore, Y. Sung, J. Webb,
tracking for nonholonomic wheeled mobile robots, Inform. Sci. 501 (2019) N. Wittenstein, J. Ziglar, A. Leonessa, B. Lattimer, T. Furukawa, Team
436–459, http://dx.doi.org/10.1016/j.ins.2019.06.025. valor’s ESCHER: A novel electromechanical biped for the darpa robotics
[39] G. Olague, D.E. Hernández, P. Llamas, E. Clemente, J.L. Briseño, Brain challenge, in: M. Spenko, S. Buerger, K. Iagnemma (Eds.), The DARPA
programming as a new strategy to create visual routines for object Robotics Challenge Finals: Humanoid Robots To the Rescue, Springer
tracking, Multimedia Tools Appl. 78 (5) (2019) 5881–5918, http://dx.doi. International Publishing, Cham, 2018, pp. 583–629, http://dx.doi.org/10.
org/10.1007/s11042-018-6634-9. 1007/978-3-319-74666-1_15.
[40] C. Breazeal, Emotion and sociable humanoid robots, Int. J. Hum.-Comput. [59] N.G. Tsagarakis, D. Caldwell, F. Negrello, W. Choi, L. Baccelliere, V.
Stud. 59 (1) (2003) 119–155, http://dx.doi.org/10.1016/S1071-5819(03) Loc, J. Noorden, L. Muratore, A. Margan, A. Cardellino, L. Natale, E.
00018-1. Mingo Hoffman, H. Dallali, N. Kashiri, J. Malzahn, J. Lee, P. Kryczka, D.
[41] P. Ekman, W. Friesen, Unmasking the Face: A Guide to Recognizing Kanoulas, M. Garabini, M. Catalano, M. Ferrati, V. Varricchio, L. Pallottino,
Emotions from Facial Clues, in: A Spectrum book, Malor Books, 2003. C. Pavan, A. Bicchi, A. Settimi, A. Rocchi, A. Ajoudani, WALK-MAN: A
[42] J. Aloimonos, I. Weiss, A. Bandyopadhyay, Active vision, Int. J. Comput. high-performance humanoid platform for realistic environments, J. Field
Vis. 1 (4) (1988) 333–356, http://dx.doi.org/10.1007/BF00133571. Robotics 34 (7) (2017) 1225–1259, http://dx.doi.org/10.1002/rob.21702.
[43] R. Bajcsy, Active perception, Proc. IEEE 76 (8) (1988) 966–1005, http: [60] I. Kato, Development of WABOT 1, Biomechanism 2 (1973) 173–214.
//dx.doi.org/10.1109/5.5968. [61] K. Hirai, Current and future perspective of honda humamoid robot, in:
[44] T.J. Olson, R.D. Potter, Real time vergence control, in: Proceedings Proceedings of the IEEE/RSJ International Conference on Intelligent Robots
of the IEEE Computer Society Conference on Computer Vision and and Systems, Vol. 2, 1997, pp. 500–508, http://dx.doi.org/10.1109/IROS.
Pattern Recognition, 1989, pp. 404–409, http://dx.doi.org/10.1109/CVPR. 1997.655059.
1989.37878. [62] K. Hirai, M. Hirose, Y. Haikawa, T. Takenaka, The development of honda
humanoid robot, in: Proceedings of the IEEE International Conference on
[45] C. Capurro, F. Panerai, E. Grosso, G. Sandini, A binocular active vision
Robotics and Automation, Vol. 2, 1998, pp. 1321–1326, http://dx.doi.org/
system using space variant sensors: exploiting autonomous behaviors for
10.1109/ROBOT.1998.677288.
space applications, in: Proceedings of the International Conference on
[63] B. Scassellati, A binocular, foveated active vision system, Tech. rep.,
Digital Signal Processing, 1993.
Massachusetts Institute of Technology, Cambridge, MA, 1998.
[46] J.L. Crowley, P. Bobet, M. Mesrabi, Gaze control for a binocular camera
[64] Y. Matsusaka, T. Tojo, S. Kubota, K. Furukawa, D.T.K. Hayata, Multi-person
head, in: G. Sandini (Ed.), Computer Vision — ECCV’92: Second European
conversation via multi-modal interface | a robot who communicates
Conference on Computer Vision Santa Margherita Ligure, Italy, May 19–
with multi-user, in: Proceedings of the European Conference on Speech
22, 1992 Proceedings, Springer Berlin Heidelberg, Berlin, Heidelberg,
Communication Technology, 1999, pp. 1723–1726.
1992, pp. 588–596, http://dx.doi.org/10.1007/3-540-55426-2_63. [65] A. Nagakubo, Y. Kuniyoshi, G. Cheng, Development of a high-performance
[47] K. Pahlavan, J.-O. Eklundh, Heads, eyes and head-eye systems, Int. J. upper-body humanoid system, in: Proceedings of the IEEE/RSJ Interna-
Pattern Recognit. Artif. Intell. 07 (01) (1993) 33–49, http://dx.doi.org/10. tional Conference on Intelligent Robots and Systems, Vol. 3, 2000, pp.
1142/S0218001493000030. 1577–1583, http://dx.doi.org/10.1109/IROS.2000.895198.
[48] A. Takanishi, S. Hirano, K. Sato, Development of an anthropomorphic [66] H.G. Marques, M. Jäntsch, S. Wittmeier, O. Holland, C. Alessandro, A.
head-eye system for a humanoid robot-realization of human-like head- Diamond, M. Lungarella, R. Knight, ECCE1: The first of a series of
eye motion using eyelids adjusting to brightness, in: Proceedings of the anthropomimetic musculoskeletal upper torsos, in: Proceedings of the
IEEE International Conference on Robotics and Automation, Vol. 2, 1998, 10th IEEE-RAS International Conference on Humanoid Robots, 2010, pp.
pp. 1308–1314, http://dx.doi.org/10.1109/ROBOT.1998.677285. 391–396, http://dx.doi.org/10.1109/ICHR.2010.5686344.
[49] M. Marjanovic, B. Scassellati, M. Williamson, Self-taught visually-guided [67] T. Fong, I. Nourbakhsh, K. Dautenhahn, A survey of socially interactive
pointing for a humanoid robot, in: From Animals to Animats: Proceedings robots, Robot. Auton. Syst. 42 (3) (2003) 143–166, http://dx.doi.org/10.
of the Society of Adaptive Behavior, MIT Press, 1996, pp. 35–44. 1016/S0921-8890(02)00372-X, Socially Interactive Robots.
[50] C. Rasmussen, K. Yuvraj, R. Vallett, K. Sohn, P. Oh, Towards functional [68] J. Fink, Anthropomorphism and human likeness in the design of robots
labeling of utility vehicle point clouds for humanoid driving, Intell. Serv. and human-robot interaction, in: S.S. Ge, O. Khatib, J.-J. Cabibihan, R. Sim-
Robotics 7 (3) (2014) 133–143, http://dx.doi.org/10.1007/s11370-014- mons, M.-A. Williams (Eds.), Social Robotics, Springer Berlin Heidelberg,
0157-7. Berlin, Heidelberg, 2012, pp. 199–208.

17
J.A. Rojas-Quintero and M.C. Rodríguez-Liñán Robotics and Autonomous Systems 143 (2021) 103834

[69] E.G. Urquiza-Haas, K. Kotrschal, The mind behind anthropomorphic think- [88] A.N. van Breemen, Animation engine for believable interactive user-
ing: attribution of mental states to other species, Anim. Behav. 109 (2015) interface robots, in: Proceedings of the IEEE/RSJ International Conference
167–176, http://dx.doi.org/10.1016/j.anbehav.2015.08.011. on Intelligent Robots and Systems, Vol. 3, 2004, pp. 2873–2878, http:
[70] A. Prakash, W. Rogers, Why some humanoid faces are perceived more //dx.doi.org/10.1109/IROS.2004.1389845.
positively than others: Effects of human-likeness and task, Int. J. Soc. [89] H. Song, Y.-M. Kim, J.C. Park, C.H. Kim, D.-S. Kwon, Design of a robot head
Robot. 7 (2) (2015) 309–331, http://dx.doi.org/10.1007/s12369-014-0269- for emotional expression: EEEX, in: 17th IEEE International Symposium
4. on Robot and Human Interactive Communication, 2008, pp. 207–212,
[71] H. Kobayashi, F. Hara, Study on face robot for active human interface- http://dx.doi.org/10.1109/ROMAN.2008.4600667.
mechanisms of face robot and expression of 6 basic facial expressions, in: [90] J. Saldien, K. Goris, B. Vanderborght, J. Vanderfaeillie, D. Lefeber, Express-
2nd IEEE International Workshop on Robot and Human Communication, ing emotions with the social robot probo, Int. J. Soc. Robot. 2 (4) (2010)
1993, pp. 276–281, http://dx.doi.org/10.1109/ROMAN.1993.367708. 377–389, http://dx.doi.org/10.1007/s12369-010-0067-6.
[72] F. Hara, H. Kobayashi, F. Iida, An interactive face robot able to create [91] P.E. McKenna, M.Y. Lim, A. Ghosh, R. Aylett, F. Broz, G. Rajendran, Do you
virtual communication with human, in: M. Göbel, J. Landauer, U. Lang, M. think i approve of that? Designing facial expressions for a robot, in: A.
Wapler (Eds.), Virtual Environments ’98, Springer Vienna, Vienna, 1998, Kheddar, E. Yoshida, S.S. Ge, K. Suzuki, J.-J. Cabibihan, F. Eyssel, H. He
pp. 182–194. (Eds.), Social Robotics, Springer International Publishing, Cham, 2017, pp.
[73] T. Hashimoto, I. Verner, H. Kobayashi, Human-like robot as teacher’s 188–197.
representative in a science lesson: An elementary school experiment, [92] I. Lütkebohle, F. Hegel, S. Schulz, M. Hackel, B. Wrede, S. Wachsmuth, G.
in: J.-H. Kim, E.T. Matson, H. Myung, P. Xu (Eds.), Robot Intelligence Sagerer, The bielefeld anthropomorphic robot head ‘‘flobi’’, in: Proceed-
Technology and Applications 2012: An Edition of the Presented Papers ings of the IEEE International Conference on Robotics and Automation,
from the 1st International Conference on Robot Intelligence Technology 2010, pp. 3384–3391, http://dx.doi.org/10.1109/ROBOT.2010.5509173.
and Applications, Springer, Berlin, Heidelberg, 2013, pp. 775–786, http: [93] A. De Beir, H.-L. Cao, P. Gómez Esteban, G. Van de Perre, D. Lefeber, B.
//dx.doi.org/10.1007/978-3-642-37374-9_74. Vanderborght, Enhancing emotional facial expressiveness on NAO, Int. J.
[74] M. Mori, K.F. MacDorman, N. Kageki, The uncanny valley [from the field], Soc. Robot. 8 (4) (2016) 513–521, http://dx.doi.org/10.1007/s12369-016-
IEEE Robot. Autom. Mag. 19 (2) (2012) 98–100, http://dx.doi.org/10.1109/ 0363-x.
MRA.2012.2192811. [94] R. Reilink, L.C. Visser, D.M. Brouwer, R. Carloni, S. Stramigioli, Mechatronic
[75] C. Bartneck, T. Kanda, H. Ishiguro, N. Hagita, My robotic doppelgänger - a design of the Twente humanoid head, Intell. Serv. Robot. 4 (2) (2011)
critical look at the Uncanny Valley, in: 18th IEEE International Symposium 107–118, http://dx.doi.org/10.1007/s11370-010-0077-0.
on Robot and Human Interactive Communication, 2009, pp. 269–276, [95] F. Delaunay, T. Belpaeme, Refined human-robot interaction through retro-
http://dx.doi.org/10.1109/ROMAN.2009.5326351. projected robotic heads, in: 2012 IEEE Workshop on Advanced Robotics
[76] T. Minato, M. Shimada, H. Ishiguro, S. Itakura, Development of an android and Its Social Impacts, 2012, pp. 106–107, http://dx.doi.org/10.1109/ARSO.
robot for studying human-robot interaction, in: B. Orchard, C. Yang, M. 2012.6213409.
[96] B. Pierce, T. Kuratate, C. Vogl, G. Cheng, ‘‘Mask-bot 2i": An active
Ali (Eds.), Innovations in Applied Artificial Intelligence: 17th International
customisable robotic head with interchangeable face, in: Proceedings of
Conference on Industrial and Engineering Applications of Artificial Intel-
the 12th IEEE-RAS International Conference on Humanoid Robots, 2012,
ligence and Expert Systems, IEA/AIE 2004, Ottawa, Canada, May 17-20,
pp. 520–525, http://dx.doi.org/10.1109/HUMANOIDS.2012.6651569.
2004. Proceedings, Springer Berlin Heidelberg, Berlin, Heidelberg, 2004,
[97] D.H. Ballard, Reference frames for animate vision, in: Proceedings of
pp. 424–434, http://dx.doi.org/10.1007/978-3-540-24677-0_44.
the 11th International Joint Conference on Artificial Intelligence, Vol.
[77] D. Hanson, Exploring the aesthetic range for humanoid robots, in:
2, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1989, pp.
Proceedings of the ICCS/CogSci-2006 Long Symposium: Toward Social
1635–1641.
Mechanisms of Android Science, 2006, pp. 39–42.
[98] Y. Kuniyoshi, N. Kita, T. Suehiro, S. Rougeaux, Active stereo vision system
[78] H. Ishiguro, S. Nishio, Building artificial humans to understand humans,
with foveated wide angle lenses, in: S.Z. Li, D.P. Mital, E.K. Teoh, H.
J. Artif. Organs 10 (3) (2007) 133–142, http://dx.doi.org/10.1007/s10047-
Wang (Eds.), Recent Developments in Computer Vision: Second Asian
007-0381-4.
Conference on Computer Vision, ACCV ’95 Singapore, December 5–8, 1995
[79] K. Kaneko, F. Kanehiro, M. Morisawa, K. Miura, S. Nakaoka, S. Kajita,
Invited Session Papers, Springer Berlin Heidelberg, Berlin, Heidelberg,
Cybernetic human HRP-4c, in: Proceedings of the 9th IEEE-RAS Inter-
1996, pp. 191–200, http://dx.doi.org/10.1007/3-540-60793-5_74.
national Conference on Humanoid Robots, 2009, pp. 7–14, http://dx.doi.
[99] P. Fitzpatrick, G. Metta, L. Natale, Towards long-lived robot genes, Robot.
org/10.1109/ICHR.2009.5379537.
Auton. Syst. 56 (1) (2008) 29–45, http://dx.doi.org/10.1016/j.robot.2007.
[80] H.S. Ahn, D.-W. Lee, D. Choi, D.Y. Lee, M.H. Hur, H. Lee, W.H. Shon,
09.014.
Development of an android for singing with facial expression, in: IECON [100] F. Grondin, D. Létourneau, F. Ferland, V. Rousseau, F. Michaud, The
2011 - 37th Annual Conference of the IEEE Industrial Electronics Society, manyears open framework, Auton. Robots 34 (3) (2013) 217–232, http:
2011, pp. 104–109, http://dx.doi.org/10.1109/IECON.2011.6119296. //dx.doi.org/10.1007/s10514-012-9316-x.
[81] C. DiSalvo, F. Gemperle, J. Forlizzi, S. Kiesler, All robots are not created [101] E. Marchand, F. Spindler, F. Chaumette, Visp for visual servoing: a generic
equal: The design and perception of humanoid robot heads, in: Pro- software platform with a wide class of robot control skills, IEEE Robot.
ceedings of the Conference on Designing Interactive Systems: Processes, Autom. Mag. 12 (4) (2005) 40–52, http://dx.doi.org/10.1109/MRA.2005.
Practices, Methods, and Techniques, ACM, New York, NY, 2002, pp. 1577023.
321–326, http://dx.doi.org/10.1145/778712.778756. [102] E.M. Hoffman, S. Traversaro, A. Rocchi, M. Ferrati, A. Settimi, F. Romano,
[82] T. Minato, M. Shimada, S. Itakura, K. Lee, H. Ishiguro, Evaluating the L. Natale, A. Bicchi, F. Nori, N.G. Tsagarakis, Yarp based plugins for
human likeness of an android by comparing gaze behaviors elicited by gazebo simulator, in: Modelling and Simulation for Autonomous Systems,
the android and a person, Adv. Robot.: Int. J. Robot. Soc. Japan 20 (10) Springer International Publishing, 2014, pp. 333–346, http://dx.doi.org/10.
(2006) 1147—1163, http://dx.doi.org/10.1163/156855306778522505. 1007/978-3-319-13823-7_29.
[83] L. Aryananda, J. Weber, MERTZ: a quest for a robust and scalable active [103] K. Nakadai, T. Takahashi, H.G. Okuno, H. Nakajima, Y. Hasegawa, H. Tsu-
vision humanoid head robot, in: Proceedings of the IEEE/RAS International jino, Design and implementation of robot audition system ’HARK’ – open
Conference on Humanoid Robots, Vol. 2, 2004, pp. 513–532, http://dx.doi. source software for listening to three simultaneous speakers, Adv. Robot.
org/10.1109/ICHR.2004.1442668. 24 (5–6) (2010) 739–761, http://dx.doi.org/10.1163/016918610x493561.
[84] R. Beira, M. Lopes, M. Praca, J. Santos-Victor, A. Bernardino, G. Metta, F. [104] V. Tikhanoff, A. Cangelosi, P. Fitzpatrick, G. Metta, L. Natale, F. Nori, An
Becchi, R. Saltaren, Design of the robot-cub (icub) head, in: Proceedings open-source simulator for cognitive robotics research, in: Proceedings of
of the IEEE International Conference on Robotics and Automation, 2006, the 8th Workshop on Performance Metrics for Intelligent Systems, ACM
pp. 94–100, http://dx.doi.org/10.1109/ROBOT.2006.1641167. Press, 2008, http://dx.doi.org/10.1145/1774674.1774684.
[85] T. Hashimoto, S. Hitramatsu, T. Tsuji, H. Kobayashi, Development of [105] H. Hirukawa, F. Kanehiro, K. Kaneko, S. Kajita, K. Fujiwara, Y. Kawai, F.
the face robot SAYA for rich facial expressions, in: Proceedings of the Tomita, S. Hirai, K. Tanie, T. Isozumi, K. Akachi, T. Kawasaki, S. Ota, K.
SICE-ICASE International Joint Conference, 2006, pp. 5423–5428, http: Yokoyama, H. Handa, Y. Fukase, J. ichiro Maeda, Y. Nakamura, S. Tachi, H.
//dx.doi.org/10.1109/SICE.2006.315537. Inoue, Humanoid robotics platforms developed in HRP, Robot. Auton. Syst.
[86] T. Minato, Y. Yoshikawa, T. Noda, S. Ikemoto, H. Ishiguro, M. Asada, CB2: 48 (4) (2004) 165–175, http://dx.doi.org/10.1016/j.robot.2004.07.007.
A child robot with biomimetic body for cognitive developmental robotics, [106] I.-W. Park, J.-Y. Kim, S.-W. Park, J.-H. Oh, Development of humanoid
in: Proceedings of the IEEE-RAS International Conference on Humanoid robot platform KHR-2 (KAIST humanoid robot-2), in: Proceedings of the
Robots, 2007, pp. 557–562, http://dx.doi.org/10.1109/ICHR.2007.4813926. IEEE/RAS International Conference on Humanoid Robots, Vol. 1, 2004, pp.
[87] C. Breazeal, A. Brooks, D. Chilongo, J. Gray, G. Hoffman, C. Kidd, H. Lee, 292–310, http://dx.doi.org/10.1109/ICHR.2004.1442128.
J. Lieberman, A. Lockerd, Working collaboratively with humanoid robots, [107] C. Ott, O. Eiberger, W. Friedl, B. Bauml, U. Hillenbrand, C. Borst, A.
in: Proceedings of the IEEE/RAS International Conference on Humanoid Albu-Schaffer, B. Brunner, H. Hirschmuller, S. Kielhofer, R. Konietschke,
Robots, Vol. 1, 2004, pp. 253–272, http://dx.doi.org/10.1109/ICHR.2004. M. Suppa, T. Wimbock, F. Zacharias, G. Hirzinger, A humanoid two-
1442126. arm system for dexterous manipulation, in: Proceedings of the IEEE-RAS

18
J.A. Rojas-Quintero and M.C. Rodríguez-Liñán Robotics and Autonomous Systems 143 (2021) 103834

International Conference on Humanoid Robots, 2006, pp. 276–283, http: [127] T. Hashimoto, H. Kobayashi, A. Polishuk, I. Verner, Elementary science
//dx.doi.org/10.1109/ICHR.2006.321397. lesson delivered by robot, in: Proceedings of the 8th ACM/IEEE Inter-
[108] T. Asfour, J. Schill, H. Peters, C. Klas, J. Bücker, C. Sander, S. Schulz, A. national Conference on Human-Robot Interaction, 2013, pp. 133–134,
Kargov, T. Werner, V. Bartenbach, ARMAR-4: A 63 DOF torque controlled http://dx.doi.org/10.1109/HRI.2013.6483537.
humanoid robot, in: Proceedings of the 13th IEEE-RAS International [128] R. Loureiro, A. Lopes, C. Carona, D. Almeida, F. Faria, L. Garrote, C.
Conference on Humanoid Robots, 2013, pp. 390–396, http://dx.doi.org/ Premebida, U.J. Nunes, ISR-robothead: Robotic head with LCD-based
10.1109/HUMANOIDS.2013.7030004. emotional expressiveness, in: 2017 IEEE 5th Portuguese Meeting on Bio-
[109] J.-K. Yoo, S.-B. Han, J.-H. Kim, Sway motion cancellation scheme using a engineering (ENBENG), 2017, pp. 1–4, http://dx.doi.org/10.1109/ENBENG.
RGB-d camera-based vision system for humanoid robots, in: Advances in 2017.7889437.
Intelligent Systems and Computing, Springer Berlin Heidelberg, 2013, pp. [129] M. Hackel, S. Schwope, A humanoid interaction robot for information,
263–272, http://dx.doi.org/10.1007/978-3-642-37374-9_26. negotiation and entertainment use, Int. J. Human. Robot. 01 (03) (2004)
[110] X. Chen, K. Chaudhary, Y. Tanaka, K. Nagahama, H. Yaguchi, K. Okada, M. 551–563, http://dx.doi.org/10.1142/S0219843604000198.
Inaba, Reasoning-based vision recognition for agricultural humanoid robot [130] F. Cid, J. Moreno, P. Bustos, P. Núñez, Muecas: A multi-sensor robotic
toward tomato harvesting, in: 2015 IEEE/RSJ International Conference on head for affective human robot interaction and imitation, Sensors 14 (5)
Intelligent Robots and Systems (IROS), IEEE, 2015, http://dx.doi.org/10. (2014) 7711–7737, http://dx.doi.org/10.3390/s140507711.
1109/iros.2015.7354304. [131] S. Schulz, F. Lier, A. Kipp, S. Wachsmuth, Humotion: A human inspired
gaze control framework for anthropomorphic robot heads, in: Proceedings
[111] M. Wan, M. Fu, H. Zhang, W. Zhou, Mechatronic design of a humanoid
of the Fourth International Conference on Human Agent Interaction - HAI
head robot, in: Proceedings of the IEEE International Conference on
’16, ACM Press, 2016, http://dx.doi.org/10.1145/2974804.2974827.
Mechatronics and Automation, 2016, pp. 244–248, http://dx.doi.org/10.
[132] T.C. Apostolescu, G. Ionascu, L. Bogatu, S. Petrache, L.A. Cartal, Anima-
1109/ICMA.2016.7558568.
tronic robot humanoid head with 7 DOF: Design and experimental set-up,
[112] R.O. Ambrose, H. Aldridge, R.S. Askew, R.R. Burridge, W. Bluethmann, M.
in: 2018 10th International Conference on Electronics, Computers and
Diftler, C. Lovchik, D. Magruder, F. Rehnmark, Robonaut: NASA’s space
Artificial Intelligence (ECAI), IEEE, 2018, http://dx.doi.org/10.1109/ecai.
humanoid, IEEE Intell. Syst. Appl. 15 (4) (2000) 57–63, http://dx.doi.org/
2018.8679056.
10.1109/5254.867913.
[133] D.H. Ballard, Animate vision, Artificial Intelligence 48 (1) (1991) 57–86,
[113] Y. Sakagami, R. Watanabe, C. Aoyama, S. Matsunaga, N. Higaki, K. http://dx.doi.org/10.1016/0004-3702(91)90080-4.
Fujimura, The intelligent ASIMO: system overview and integration, in: [134] P. Glimcher, Eye movements, in: Fundamental Neuroscience, Academic
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots Press, 1998.
and Systems, Vol. 3, 2002, pp. 2478–2483, http://dx.doi.org/10.1109/IRDS. [135] F. Panerai, G. Metta, G. Sandini, Learning visual stabilization reflexes
2002.1041641. in robots with moving eyes, Neurocomputing 48 (1) (2002) 323–337,
[114] T. Kanda, H. Ishiguro, T. Ono, M. Imai, R. Nakatsu, Development and http://dx.doi.org/10.1016/S0925-2312(01)00645-2.
evaluation of an interactive humanoid robot "robovie", in: Proceedings [136] L. Vannucci, S. Tolu, E. Falotico, P. Dario, H.H. Lund, C. Laschi, Adaptive
of the IEEE International Conference on Robotics and Automation, Vol. 2, gaze stabilization through cerebellar internal models in a humanoid
2002, pp. 1848–1855, http://dx.doi.org/10.1109/ROBOT.2002.1014810. robot, in: Proceedings of the 6th IEEE International Conference on
[115] K. Nishiwaki, T. Sugihara, S. Kagami, F. Kanehiro, M. Inaba, H. Inoue, Biomedical Robotics and Biomechatronics, 2016, pp. 25–30, http://dx.doi.
Design and development of research platform for perception-action org/10.1109/BIOROB.2016.7523593.
integration in humanoid robot: H6, in: Proceedings of the IEEE/RSJ [137] E. Franchi, E. Falotico, D. Zambrano, G.G. Muscolo, L. Marazzato, P.
International Conference on Intelligent Robots and Systems, Vol. 3, 2000, Dario, C. Laschi, A comparison between two bio-inspired adaptive models
pp. 1559–1564, http://dx.doi.org/10.1109/IROS.2000.895195. of vestibulo-ocular reflex (VOR) implemented on the icub robot, in:
[116] C.G. Atkeson, J.G. Hale, F. Pollick, M. Riley, S. Kotosaka, S. Schaul, T. Proceedings of the 10th IEEE-RAS International Conference on Humanoid
Shibata, G. Tevatia, A. Ude, S. Vijayakumar, E. Kawato, M. Kawato, Using Robots, 2010, pp. 251–256, http://dx.doi.org/10.1109/ICHR.2010.5686329.
humanoid robots to study human behavior, IEEE Intell. Syst. Appl. 15 (4) [138] A.L. Yarbus, Saccadic eye movements, in: Eye Movements and Vision,
(2000) 46–56, http://dx.doi.org/10.1109/5254.867912. Springer US, Boston, MA, 1967, pp. 129–146, http://dx.doi.org/10.1007/
[117] C. Breazeal, B. Scassellati, How to build robots that make friends and 978-1-4899-5379-7_5.
influence people, in: Proceedings of the IEEE/RSJ International Conference [139] J.S. Albus, A theory of cerebellar function, Math. Biosci. 10 (1–2) (1971)
on Intelligent Robots and Systems, Vol. 2, 1999, pp. 858–863, http: 25–61, http://dx.doi.org/10.1016/0025-5564(71)90051-4.
//dx.doi.org/10.1109/IROS.1999.812787. [140] J.S. Albus, A new approach to manipulator control: The cerebellar model
[118] A. Roncone, O. Mangin, B. Scassellati, Transparent role assignment and articulation controller (CMAC), J. Dyn. Syst. Meas. Control 97 (3) (1975)
task allocation in human robot collaboration, in: 2017 IEEE International 220, http://dx.doi.org/10.1115/1.3426922.
Conference on Robotics and Automation (ICRA), 2017, pp. 1014–1021, [141] M. Kawato, K. Furukawa, R. Suzuki, A hierarchical neural-network model
http://dx.doi.org/10.1109/ICRA.2017.7989122. for control and learning of voluntary movement, Biol. Cybernet. 57 (3)
[119] A. Mehrabian, Nonverbal Communication, Routledge, 2017. (1987) 169–185, http://dx.doi.org/10.1007/bf00364149.
[120] M. Knapp, Nonverbal Communication in Human Interaction, Holt, [142] J. Piaget, Play, Dreams and Immitation in Childhood, Routledge and Kegan
Rinehart and Winston, 1972. Paul Ltd, 1951.
[143] P. Dean, J.E. Mayhew, N. Thacker, P.M. Langdon, Saccade control in
[121] K. Dautenhahn, C.L. Nehaniv, M.L. Walters, B. Robins, H. Kose-Bagci, M.
a simulated robot camera-head system: neural net architectures for
Blow, KASPAR –a minimally expressive humanoid robot for human–
efficient learning of inverse kinematics, Biol. Cybernet. 66 (1991) 27–36.
robot interaction research, Appl. Bionics Biomech. 6 (3,4) (2009) 369–397,
[144] J. Porrill, P. Dean, J.V. Stone, Recurrent cerebellar architecture solves the
http://dx.doi.org/10.1080/11762320903123567.
motor-error problem, Proc. R Soc. Lond. 271 (1541) (2004) 789–796,
[122] X. Ke, B. Qiu, J. Xin, Y. Yun, Vision development of humanoid head robot
http://dx.doi.org/10.1098/rspb.2003.2658.
SHFR-III, in: Proceedings of the IEEE International Conference on Robotics
[145] A. Haith, S. Vijayakumar, Robustness of VOR and OKR adaptation under
and Biomimetics, 2015, pp. 1590–1595, http://dx.doi.org/10.1109/ROBIO.
kinematics and dynamics transformations, in: Proceedings of the IEEE
2015.7418998.
International Conference on Development and Learning, 2007, pp. 37–42,
[123] A.M. Harrison, W.M. Xu, J.G. Trafton, User-centered robot head design: A http://dx.doi.org/10.1109/DEVLRN.2007.4354055.
sensing computing interaction platform for robotics research (SCIPRR), [146] M. Kawato, Feedback-error-learning neural network for supervised motor
in: Proceedings of the ACM/IEEE International Conference on Human- learning, in: Advanced Neural Computers, Elsevier, 1990, pp. 365–372,
Robot Interaction, ACM, New York, NY, USA, 2018, pp. 215–223, http: http://dx.doi.org/10.1016/b978-0-444-88400-8.50047-9.
//dx.doi.org/10.1145/3171221.3171283. [147] H. Gomi, M. Kawato, Adaptive feedback control models of the vestibulo-
[124] D. Bazo, R. Vaidyanathan, A. Lentz, C. Melhuish, Design and testing cerebellum and spinocerebellum, Biol. Cybernet. (1992) http://dx.doi.org/
of a hybrid expressive face for a humanoid robot, in: Proceedings of 10.1007/BF00201432.
the IEEE/RSJ International Conference on Intelligent Robots and Systems, [148] L. Berthouze, S. Rougeaux, Y. Kuniyoshi, A learning stereo-head control
2010, pp. 5317–5322, http://dx.doi.org/10.1109/IROS.2010.5651469. system, in: World Automation Congress/International Symposium on
[125] J. Rajruangrabin, D.O. Popa, Robot head motion control with an emphasis Robotics and Manufacturing, ASME Press, 1996, pp. 43–47.
on realism of neck–eye coordination during object tracking, J. Intell. [149] J. Bruske, M. Hansen, L. Riehn, G. Sommer, Biologically inspired
Robot. Syst. 63 (2) (2011) 163–190, http://dx.doi.org/10.1007/s10846- calibration-free adaptive saccade control of a binocular camera-
010-9468-x. head, Biol. Cybernet. 77 (6) (1997) 433–446, http://dx.doi.org/10.1007/
[126] F. Hegel, T. Spexard, B. Wrede, G. Horstmann, T. Vogt, Playing a different s004220050403.
imitation game: Interaction with an empathic android robot, in: Proceed- [150] L. Berthouze, Y. Kuniyoshi, Emergence and categorization of coordinated
ings of the IEEE-RAS International Conference on Humanoid Robots, 2006, visual behavior through embodied interaction, Mach. Learn. 31 (1/3)
pp. 56–61, http://dx.doi.org/10.1109/ICHR.2006.321363. (1998) 187–200, http://dx.doi.org/10.1023/a:1007453010407.

19
J.A. Rojas-Quintero and M.C. Rodríguez-Liñán Robotics and Autonomous Systems 143 (2021) 103834

[151] T. Shibata, S. Schaal, Biomimetic gaze stabilization based on feedback- [175] L. Vannucci, N. Cauli, E. Falotico, A. Bernardino, C. Laschi, Adaptive visual
error-learning with nonparametric regression networks, Neural Netw. 14 pursuit involving eye-head coordination and prediction of the target
(2001) 201–216. motion, in: Proceedings of the IEEE-RAS International Conference on
[152] W. Muhammad, M.W. Spratling, A neural model of binocular saccade Humanoid Robots, IEEE, 2014, http://dx.doi.org/10.1109/humanoids.2014.
planning and vergence control, Adapt. Behav. 23 (5) (2015) 265–282, 7041415.
http://dx.doi.org/10.1177/1059712315607363. [176] G. Milighetti, L. Vallone, A.D. Luca, Adaptive predictive gaze control
[153] F. Lunghi, S. Lazzari, G. Magenes, Neural adaptive predictor for visual of a redundant humanoid robot head, in: Proceedings of the IEEE/RSJ
tracking system, in: Proceedings of the IEEE Engineering in Medicine and International Conference on Intelligent Robots and Systems, IEEE, 2011,
Biology Society, Vol. 20, IEEE, 1998, http://dx.doi.org/10.1109/iembs.1998. http://dx.doi.org/10.1109/iros.2011.6094417.
747140. [177] C. Brown, Kinematic and 3D motion prediction for gaze control, in:
[154] I. Reid, K. Bradshaw, P. McLauchlan, P. Sharkey, D. Murray, From saccades Workshop on Interpretation of 3D Scenes, IEEE Comput. Soc. Press, 1989,
to smooth pursuit: real-time gaze control using motion feedback, in: http://dx.doi.org/10.1109/tdscen.1989.68113.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots [178] N. Marturi, V. Ortenzi, J. Xiao, M. Adjigble, R. Stolkin, A. Leonardis, A
and Systems, IEEE, 1993, http://dx.doi.org/10.1109/iros.1993.583282. real-time tracking and optimised gaze control for a redundant humanoid
[155] M. Mancas, F. Pirri, M. Pizzoli, From saliency to eye gaze: Embodied robot head, in: Proceedings of the IEEE-RAS 15th International Confer-
visual selection for a pan-tilt-based robotic head, in: Advances in Visual ence on Humanoid Robots, 2015, pp. 467–474, http://dx.doi.org/10.1109/
Computing, Springer Berlin Heidelberg, 2011, pp. 135–146, http://dx.doi. HUMANOIDS.2015.7363591.
org/10.1007/978-3-642-24028-7_13. [179] B. Kuhn, B. Schauerte, K. Kroschel, R. Stiefelhagen, Multimodal saliency-
[156] G. Macesanu, V. Comnac, F. Moldoveanu, S.M. Grigorescu, A time-delay based attention: A lazy robot’s approach, in: Proceedings of the IEEE/RSJ
control approach for a stereo vision based human-machine interaction International Conference on Intelligent Robots and Systems, IEEE, 2012,
system, J. Intell. Robot. Syst. 76 (2) (2013) 297–313, http://dx.doi.org/10. http://dx.doi.org/10.1109/iros.2012.6385515.
1007/s10846-013-9994-4. [180] J. Law, P. Shaw, M. Lee, A biologically constrained architecture for
[157] A.T. Bahill, J.D. McDonald, Adaptive control models for saccadic and developmental learning of eye–head gaze control on a humanoid robot,
smooth pursuit eye movements, in: Progress in Oculomotor Research, Auton. Robots 35 (1) (2013) 77–92, http://dx.doi.org/10.1007/s10514-
Elsevier North Holland, 1981. 013-9335-2.
[158] M. Antonelli, A.J. Duran, E. Chinellato, A.P. del Pobil, Adaptive saccade [181] P. Shaw, J. Law, M. Lee, A comparison of learning strategies for biologically
controller inspired by the primates’ cerebellum, in: Proceedings of the constrained development of gaze control on an iCub robot, Auton. Robots
IEEE International Conference on Robotics and Automation, IEEE, 2015, 37 (1) (2014) 97–110, http://dx.doi.org/10.1007/s10514-013-9378-4.
http://dx.doi.org/10.1109/icra.2015.7139901. [182] E.S. Maini, L. Manfredi, C. Laschi, P. Dario, Bioinspired velocity control of
[159] E. Kowler, Eye movements: The past 25years, Vis. Res. 51 (13) (2011) fast gaze shifts on a robotic anthropomorphic head, Auton. Robots 25 (1)
1457–1483, http://dx.doi.org/10.1016/j.visres.2010.12.014. (2008) 37–58, http://dx.doi.org/10.1007/s10514-007-9078-z.
[160] D. Guitton, Control of eye-head coordination during orienting gaze shifts,
[183] C. Laschi, F. Patanè, E. Maini, L. Manfredi, G. Teti, L. Zollo, E. Guglielmelli,
Trends Neurosci. 15 (1992) 174–179.
P. Dario, An anthropomorphic robotic head for investigating gaze
[161] T. Habra, R. Ronsse, Gaze stabilization of a humanoid robot based on
control, Adv. Robot. 22 (1) (2008) 57–89, http://dx.doi.org/10.1163/
virtual linkage, in: Proceedings of the IEEE International Conference on
156855308x291845.
Biomedical Robotics and Biomechatronics, 2016, pp. 163–169, http://dx.
[184] H. He, S.S. Ge, Z. Zhang, A saliency-driven robotic head with bio-
doi.org/10.1109/BIOROB.2016.7523616.
inspired saccadic behaviors for social robotics, Auton. Robots 36 (3)
[162] D.A. Robinson, Models of the saccadic eye movement control system,
(2013) 225–240, http://dx.doi.org/10.1007/s10514-013-9346-z.
Kybernetik 14 (2) (1973) 71–83, http://dx.doi.org/10.1007/bf00288906.
[185] D. Omrčen, A. Ude, Redundant control of a humanoid robot head with
[163] W. Becker, R. Jürgens, Human oblique saccades: Quantitative analysis of
foveated vision for object tracking, in: Proceedings of the IEEE Inter-
the relation between horizontal and vertical components, Vis. Res. 30 (6)
national Conference on Robotics and Automation, 2010, pp. 4151–4156,
(1990) 893–920, http://dx.doi.org/10.1016/0042-6989(90)90057-r.
[164] E.G. Freedman, Coordination of the eyes and head during visual orienting, http://dx.doi.org/10.1109/ROBOT.2010.5509515.
[186] P. Kryczka, E. Falotico, K. Hashimoto, H. o. Lim, A. Takanishi, C. Laschi,
Exp. Brain Res. 190 (4) (2008) 369, http://dx.doi.org/10.1007/s00221-008-
P. Dario, A. Berthoz, A robotic implementation of a bio-inspired head
1504-8.
[165] M. Daemi, J.D. Crawford, A kinematic model for 3-d head-free gaze-shifts, motion stabilization model on a humanoid platform, in: Proceedings of
in: Front. Comput. Neurosci., 2015. the IEEE/RSJ International Conference on Intelligent Robots and Systems,
[166] Z. Zhu, Q. Wang, W. Zou, F. Zhang, Overview of motion control on bionic 2012, pp. 2076–2081, http://dx.doi.org/10.1109/IROS.2012.6386177.
eyes, in: Proceedings of the IEEE International Conference on Robotics [187] L. Gu, J. Su, Gaze control on humanoid robot head, in: 2006 6th
and Biomimetics, 2015, pp. 2389–2394, http://dx.doi.org/10.1109/ROBIO. World Congress on Intelligent Control and Automation, Vol. 2, 2006, pp.
2015.7419696. 9144–9148, http://dx.doi.org/10.1109/WCICA.2006.1713769.
[167] P. Sharkey, D. Murray, S. Vandevelde, I. Reid, P. McLauchlan, A modular [188] A. Ude, C. Gaskett, G. Cheng, Foveated vision systems with two cameras
head/eye platform for real-time reactive vision, Mechatronics 3 (4) (1993) per eye, in: Proceedings of the IEEE International Conference on Robotics
517–535, http://dx.doi.org/10.1016/0957-4158(93)90021-S. and Automation, 2006, pp. 3457–3462, http://dx.doi.org/10.1109/ROBOT.
[168] Y. Song, X. Zhang, An active binocular integrated system for intelligent 2006.1642230.
robot vision, in: 2012 IEEE International Conference on Intelligence [189] X. Wang, J. van de Weem, P. Jonker, An advanced active vision system
and Security Informatics, IEEE, 2012, http://dx.doi.org/10.1109/isi.2012. imitating human eye movements, in: Proceedings of the International
6284090. Conference on Advanced Robotics, IEEE, 2013, http://dx.doi.org/10.1109/
[169] A. D’Souza, S. Vijayakumar, S. Schaal, Learning inverse kinematics, in: Pro- icar.2013.6766517.
ceedings 2001 IEEE/RSJ International Conference on Intelligent Robots and [190] B.-S. Yoo, J.-H. Kim, Gaze control of humanoid robot for learning from
Systems. Expanding the Societal Role of Robotics in the Next Millennium demonstration, in: J.-H. Kim, F. Karray, J. Jo, P. Sincak, H. Myung
(Cat. No.01CH37180), IEEE. https://doi.org/10.1109/iros.2001.973374. (Eds.), Robot Intelligence Technology and Applications 4: Results from
[170] E. Falotico, N. Cauli, P. Kryczka, K. Hashimoto, A. Berthoz, A. Takanishi, the 4th International Conference on Robot Intelligence Technology and
P. Dario, C. Laschi, Head stabilization in a humanoid robot: models and Applications, Springer International Publishing, Cham, 2017, pp. 263–270,
implementations, Auton. Robots 41 (2) (2017) 349–365, http://dx.doi.org/ http://dx.doi.org/10.1007/978-3-319-31293-4_21.
10.1007/s10514-016-9583-z. [191] E.G. Freedman, Interactions between eye and head control signals can
[171] F. Panerai, G. Metta, G. Sandini, Visuo-inertial stabilization in space- account for movement kinematics, Biol. Cybernet. 84 (6) (2001) 453–462,
variant binocular systems, Robot. Auton. Syst. 30 (1) (2000) 195–214, http://dx.doi.org/10.1007/PL00007989.
http://dx.doi.org/10.1016/S0921-8890(99)00072-X. [192] A. Cangelosi, M. Schlesinger, Developmental Robotics. from Babies to
[172] G. Asuni, G. Teti, C. Laschi, E. Guglielmelli, P. Dario, A robotic head neuro- Robots, MIT Press, 2015.
controller based on biologically-inspired neural models, in: Proceedings [193] H.L.M. Goossens, A.J.V. Opstal, Human eye-head coordination in two
of the IEEE International Conference on Robotics and Automation, IEEE, dimensions under different sensorimotor conditions, Exp. Brain Res. 114
2005, http://dx.doi.org/10.1109/robot.2005.1570466. (3) (1997) 542–560, http://dx.doi.org/10.1007/pl00005663.
[173] B.-S. Yoo, J.-H. Kim, Fuzzy integral-based gaze control of a robotic head for [194] C. Brown, Gaze controls with interactions and delays, IEEE Trans. Syst.
human robot interaction, IEEE Trans. Cybernet. 45 (9) (2015) 1769–1783, Man Cybern. 20 (2) (1990) 518–527, http://dx.doi.org/10.1109/21.52563.
http://dx.doi.org/10.1109/tcyb.2014.2360205. [195] M. Lungarella, G. Metta, R. Pfeifer, G. Sandini, Developmental robotics:
[174] F. Du, M. Brady, D. Murray, Gaze control for a two-eyed robot head, a survey, Connect. Sci. 15 (4) (2003) 151–190, http://dx.doi.org/10.1080/
in: P. Mowforth (Ed.), BMVC91: Proceedings of the British Machine 09540090310001655110.
Vision Conference, Organised for the British Machine Vision Association [196] G. Sandini, G. Metta, Retina-like sensors: Motivations, technology and
By the Turing Institute 24–26 September 1991 University of Glasgow, applications, in: F.G. Barth, J.A.C. Humphrey, T.W. Secomb (Eds.), Sensors
Springer London, London, 1991, pp. 193–201, http://dx.doi.org/10.1007/ and Sensing in Biology and Engineering, Springer Vienna, Vienna, 2003,
978-1-4471-1921-0_25. pp. 251–262, http://dx.doi.org/10.1007/978-3-7091-6025-1_18.

20
J.A. Rojas-Quintero and M.C. Rodríguez-Liñán Robotics and Autonomous Systems 143 (2021) 103834


Juan Antonio Rojas-Quintero received the B.E. degree in Science and Technology of Mechanics and Engineering in 2007, the M.Sc. degree in Mechanics and Engineering Sciences in 2009, and the Ph.D. degree in Mechanics and Engineering Sciences in 2013 from the University of Poitiers, Poitiers, France. He was a postdoctoral fellow at the School of Mechanics and Engineering of the Southwest Jiaotong University, Chengdu, P.R. of China, from 2014 to 2016. He is currently a CONACYT Research Fellow with the National Technology Institute (Tecnológico Nacional de México), Ensenada Campus, Mexico. His research interests include nonlinear dynamical systems, optimal control and robotics.

María del Carmen Rodríguez-Liñán received the B.E. degree in Electronic Engineering and the M.Sc. degree in Electrical Engineering from the University of San Luis Potosí, Mexico, in 2006 and 2008, and the Ph.D. degree in Electrical and Electronic Engineering from The University of Manchester, UK, in 2013. From 2014 to 2016 she was a postdoctoral fellow at the Autonomous University of San Luis Potosí. She is currently a CONACYT Research Fellow with the National Technology Institute, at the Ensenada campus, Mexico. Her research interests include nonlinear systems, hard nonlinearities and robot control.