Real Time Artificial Vision System in An Omnidirectional Mobile Platform For HRI
Fig. 2: CAD model for the mobile base of HiBot.
232 for stereoscopic vision implementation, driven by a servomotor to control elevation and panning; and four digital encoders placed on the axes of the DC motors. Audio-visual interaction through the LCD screen will be provided using computer-graphics synthesis combined with voice capture and voice synthesis for a face avatar. The actuator subsystem consists of four DC motors with 5.95:1 planetary gearboxes. Four H-bridges are used to enable the implementation of dynamic and kinematic control strategies for the actuators. The real-time operating-system kernel will be implemented from a structure based on Petri nets in the FreeRTOS package. The high-level layer will be used for trajectory control and for advanced control laws for navigation and perception. The high-level processing system is able to designate actions in the low-level layer through simultaneous motion control of the four DC motors. This layer is implemented on a PC-104 with an embedded Linux operating system and an Intel® Core™ 2 Duo processor (2.26 GHz) with 4 GB of RAM, with support for SPX modules that allow increasing the number of analog and digital inputs.

III. Perception System and the HiBot Architecture

Since the central topic of this paper is the vision system, we will deal principally with perception within the HiBot architecture. The perception system targets the following activities:
i. Recognize humans and/or objects in the environment;
ii. Sense the environment state in order to support navigation (movement detection).

Fig. 3: Three levels of the architecture for concurrent task execution in real time. The arrows indicate the flow of information between the three modules of the architecture; the blocks indicate the components and processes running in each module.

Fig. 4: Perception subsystem based on the approach of concurrent execution of tasks for the HiBot architecture.

The implementation of the perceptual system is designed within the approach of an architecture for the concurrent execution of tasks with a hybrid structure, which improves system performance through task synchronization and response time, contrary to what happens with centralized architectures [10], [13], [2], [3]. Fig. 3 shows the architecture, based on three basic processes. The perception system consists of acquisition modules that are responsible for extracting information from the cameras, distance sensors and microphones [10], [13], as shown in Fig. 4.
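The concurrent acquisition scheme just described can be sketched as producer tasks feeding a shared perception queue. The following Python sketch is purely illustrative; the module names and sample counts are assumptions, not details taken from HiBot:

```python
import queue
import threading

# Hypothetical acquisition modules; in HiBot these would wrap the
# cameras, distance sensors and microphones shown in Fig. 4.
def acquisition_module(name, samples, out_queue):
    """Acquire a fixed number of readings and publish them to the queue."""
    for i in range(samples):
        out_queue.put((name, i))  # e.g. a frame, a range reading, an audio chunk

def run_perception_cycle():
    perception_queue = queue.Queue()
    modules = [
        threading.Thread(target=acquisition_module, args=(n, 3, perception_queue))
        for n in ("camera", "distance_sensor", "microphone")
    ]
    for t in modules:   # each acquisition module runs concurrently
        t.start()
    for t in modules:
        t.join()
    # The associative level would consume this queue asynchronously;
    # here we simply drain it to show what was gathered.
    readings = []
    while not perception_queue.empty():
        readings.append(perception_queue.get())
    return readings

readings = run_perception_cycle()
print(len(readings))  # 9: three readings from each of the three modules
```

A real implementation would keep the producers running continuously and let the consumer block on the queue, but the producer/consumer decoupling is the same.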
The artificial vision system consists of an acquisition system for stereoscopic images, controlled by two servomotors as shown in Fig. 1. With the information from the camera subsystem, an image frame is formed and sent to a classifier that aims to identify whether the environmental element near the robot is a human, an animal, or an object, with or without movement. The idea is to implement this subsystem using the SVM (Support Vector Machine) classification technique together with sensor fusion, both approaches used for recognition and object classification [17].
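As a minimal sketch of such an SVM classifier, the following NumPy code trains a tiny linear SVM with the Pegasos sub-gradient method on synthetic two-dimensional features. The features (e.g. normalized height and a motion score) and the two-class labels are illustrative assumptions, not the paper's actual training setup:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Tiny linear SVM trained with the Pegasos sub-gradient method.

    X: (n, d) feature matrix; y: labels in {-1, +1}.
    An illustrative stand-in for a full SVM library.
    """
    n, d = X.shape
    w, b, t = np.zeros(d), 0.0, 0
    rng = np.random.default_rng(0)
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)
            if y[i] * (X[i] @ w + b) < 1:    # margin violated: hinge-loss step
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                            # only regularization shrinkage
                w = (1 - eta * lam) * w
    return w, b

# Assumed toy features, e.g. (normalized height, motion score):
rng = np.random.default_rng(1)
humans  = rng.normal([1.7, 0.8], 0.05, size=(20, 2))   # label +1
objects = rng.normal([0.5, 0.1], 0.05, size=(20, 2))   # label -1
X = np.vstack([humans, objects])
y = np.array([1] * 20 + [-1] * 20)

w, b = train_linear_svm(X, y)
print((np.sign(X @ w + b) == y).mean())  # fraction correctly classified
```

A deployed version would use a library SVM with a non-linear kernel and real image/sensor features; the sketch only shows the margin-based decision rule the paragraph refers to.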
A. HiBot Architecture

To implement the HiBot architecture, a hybrid approach with three levels is used: a Cognitive level, an Associative level and a Reactive level, all implemented with a concurrent programming approach, as shown in Fig. 3. Concurrent processes are considered because the robot is responsible for locating itself autonomously in the environment, without receiving any information about its location or about the position of environmental elements, as in dynamic environments [11]. Rather, perception of the environment, localization, object recognition and mapping of the environment are the responsibility of the robot [10], [21], [22], [23].
According to Fig. 3, the RTM (Real Time Manager) process is responsible for the perception and action activities. The LCTS (Linker and Coordinated Tasks Synchronizer) process is responsible for the communication between the robot's internal processes and for their synchronization. The MMB (Manager Motivation and Behavior) process is responsible for choosing the behavior for human interaction through emotions.
1) Cognitive level: The cognitive level is responsible for choosing the learning tasks for the appropriate behavior. The tasks of this level will be implemented by the MMB process, which uses a symbolic knowledge base to encapsulate messages containing the perception information received by the robot, and a motivation process in order to organize and keep a sequence of behaviors/abilities in an emotional model that incorporates basic emotions (i.e., sadness, anger, joy, fear, neutral) [13]. While a task is being performed and the cognitive level receives information about the current state of the environment, it checks whether the current state of the environment allows the current task to keep running [10].
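That check can be pictured as a precondition test against the latest environment state. The task name and state keys below are hypothetical, invented only to illustrate the idea:

```python
# Hypothetical precondition check in the spirit of the cognitive level:
# a task keeps running only while the latest environment state still
# satisfies its preconditions.
def task_may_continue(preconditions, env_state):
    """Return True if every precondition holds in the current state."""
    return all(env_state.get(key) == value for key, value in preconditions.items())

follow_person = {"person_visible": True, "path_clear": True}

print(task_may_continue(follow_person,
                        {"person_visible": True, "path_clear": True}))   # True
print(task_may_continue(follow_person,
                        {"person_visible": True, "path_clear": False}))  # False: re-plan
```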
2) Associative level: The associative level is implemented through the LCTS process, and its main responsibility is to carry out the local tasks provided by the cognitive level. For each task, the LCTS process chooses the sequence of parameters adequate for the behaviors of the reactive level, so that the task can be performed successfully. With the perception information received, the associative level applies an inference engine with fuzzy rules to the frame received from the perception system, in order to classify the current state of the environment and feed the knowledge base of the cognitive level, which then decides which plans are applicable to the various environmental situations. That classification generates a symbolic description of the current state of the environment, which is sent by messages to the cognitive level.

3) Reactive level: The reactive level will be implemented in the RTM process, and its main function will be the implementation of the basic controllers for the robot locomotion. These basic controllers will be encapsulated and organized as behaviors in a real-time kernel. This level is responsible, through the perception module, for collecting information about the environment as shown in Fig. 4 (using the encoders and distance sensors), about human faces, and about the localization of elements in the environment from the information acquired by the camera and microphone [4].

IV. Implemented Vision System

With the purpose of achieving items i) and ii) of Section III, the vision mechanism considered as part of the perception subsystem project for the HiBot was divided into: i) facial detection and recognition, ii) object detection, and iii) tracking of moving objects. For the vision system, the Viola and Jones method [19], which is based on Haar features [9], [5], was used.

For face recognition, the Eigenfaces technique, based on PCA (Principal Component Analysis), was used. PCA is a feature-extraction technique well suited to Gaussian-distributed data, and it is not possible to ensure that the images in this study follow that distribution. Even with that limitation, however, methods that use PCA achieve high success rates [7], like Eigenfaces [16]. An example of good results obtained using PCA is shown in [1], where the Eigenfaces method obtains recognition rates higher than 90%.

The object-tracking module is based on size and color segmentation: the color of the object to be tracked is chosen, together with a base width and height. It uses the HSV (Hue, Saturation and Value) model, which is a non-linear transformation of the RGB color system.

A. Face Detection in Motion

The algorithmic scheme implemented in order to develop the detector proposed in this work is based on the proposal made by Viola and Jones in [19] and the subsequent modifications made by Maydt in [6]. This detector is based on a cascade of classifiers which is scanned over the image at multiple scales and locations. Each cascade stage relies on simple Haar-type features, which are computed efficiently, using a boosting-based classifier, AdaBoost, which is able to select the most important features. From a database containing positive and negative images (faces and non-faces), AdaBoost is applied to find the features that best separate positive from negative images, and these features are distributed into the stage classifiers [19]. The efficiency of this scheme lies in the fact that the negatives (the majority of the windows to be explored) are gradually eliminated (see Fig. 5), so that the first stages remove large numbers of them (the easiest) with little
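The HSV color-segmentation step of the tracking module can be sketched in pure Python with the standard-library `colorsys` conversion. The hue/saturation/value thresholds below are assumptions for illustration, not HiBot's tuned values:

```python
import colorsys

# Illustrative stand-in for the HSV segmentation step: RGB pixels are
# converted to HSV and kept only if they fall inside the tracked
# object's color range. (Hue wrap-around for red is ignored here.)
def hsv_mask(pixels, hue_min, hue_max, sat_min=0.3, val_min=0.2):
    """Return a binary mask marking pixels that match the target color."""
    mask = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        mask.append(hue_min <= h <= hue_max and s >= sat_min and v >= val_min)
    return mask

# A red object pixel, a green background pixel, a dark pixel:
pixels = [(200, 30, 30), (30, 200, 30), (10, 10, 10)]
# Track "red": hue near 0 (colorsys hues lie in [0, 1)).
print(hsv_mask(pixels, 0.0, 0.05))  # [True, False, False]
```

Because hue is largely invariant to brightness, thresholding in HSV is more robust to illumination changes than thresholding raw RGB, which is why the non-linear RGB-to-HSV transformation is used.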
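The Eigenfaces idea described above can be sketched with NumPy: PCA on a matrix of flattened face images, then recognition by nearest neighbor in the reduced space. The 8×8 random "faces" below are synthetic stand-ins for real training images, and the component count is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)
n_faces, n_pixels, n_components = 12, 64, 4

faces = rng.random((n_faces, n_pixels))        # rows = flattened 8x8 images
mean_face = faces.mean(axis=0)
centered = faces - mean_face

# Principal components ("eigenfaces") from the SVD of the centered data.
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = vt[:n_components]                 # (n_components, n_pixels)

def project(image):
    """Coordinates of an image in the eigenface subspace."""
    return eigenfaces @ (image - mean_face)

gallery = np.array([project(f) for f in faces])

def recognize(image):
    """Index of the closest known face in PCA space."""
    distances = np.linalg.norm(gallery - project(image), axis=1)
    return int(np.argmin(distances))

# A slightly noisy copy of face 5 should still match face 5.
probe = faces[5] + rng.normal(0, 0.01, n_pixels)
print(recognize(probe))  # 5
```

Real Eigenfaces systems work the same way, only with thousands of pixels per image and a distance threshold to reject unknown faces.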