Object Tracking in Deformed Patterns

Ashraf Aboshosha, M. Hassan(1), M. Ashour(1)

(1) Radiation Engineering Department, NCRRT, AEA, P.O. Box 29, 8th District,
Nasr City, Cairo, Egypt

Abstract- The geometrical modeling and tracking of visual targets is a key subject in several advanced technologies. The precision of visual tracking is still an open research area due to the presence of several serious problems. This paper presents a technique that exploits the camera as a real ranging sensor, by which the serious errors due to deformations are eliminated. Visual object tracking is an important subject for the precise identification and pursuit of targets (humans and objects) in deformed video streams. A three-dimensional (3D) geometrical model has been developed to determine the current pose of an object and predict its future location based on an FIR model learned by OLS, while the visual extraction relies on the YCbCr model. We propose a robust ranging technique for tracking a visual target that replaces traditional, expensive ranging sensors. The presented work is applied to real video streams and achieves high-precision results.

Index Terms- Visual interferometry, modeling, motion estimation, visual ranging, digital crime, forensic science.

I. Introduction

Visual object tracking has become a key subject of several advanced applications such as satellite and space systems, traffic systems, medical applications, surveillance, security, autonomous vehicles, and visual ranging, which is defined as the estimation of the distance between the camera and the tracked object. Camera distance estimation from objects in images has been an active area for computer vision and image processing researchers. There are different algorithms to calculate the distance between the camera and the object from digital images and videos. These methods depend on knowledge of the angle between the camera and the object, which limits their applicability; they rely on pan-and-tilt mechanisms or laser rangers to measure the angles, which restricts them further. The images and sequences coming from video systems need to be digitized in order to be processed by dedicated software that enhances the features useful for visual tracking analysis. Generally, this is done either to reduce the different kinds of corruption introduced during the acquisition, conversion, and storage of the data, or to overcome the limits of the overall system. The characteristic problems in this area are low resolution, lack of contrast, different types of noise, and changes of illumination. The precision of visual detection is still an open research area due to several serious deformation sources, such as deformations caused by motion.

A. Image Deformations, a Review

S. Harasse et al. [1] presented a method for the detection and tracking of multiple faces in a video by using a model of first and second order local moments. An iterative process is used to estimate the position and shape of multiple faces in images and to track them. X. Liu and T. Chen [2] proposed a face mosaicing approach to model both the facial appearance and geometry from pose-varying videos, and applied it to face tracking and recognition. Payman Haqiqat [3] presented a region-moment based method for tracking moving objects. Region-based methods that compute a similarity measure, such as cross-correlation, between two segmented regions in successive frames of the input sequence work well when the moving objects undergo only translational motion; they get into trouble in the presence of rotational motion. They used image moments to define a distance function between two circular regions. In [4] the authors presented a parametric skin color classifier that can be adapted to the conditions of each image or image sequence. Its high speed and high accuracy make it appropriate for real-time applications such as face tracking and mimic recognition.

B. Visual Tracking, a Review

In object tracking, several researchers have focused on thresholding, region based segmentation, region merging and splitting, edge based segmentation, clustering, and hybrid techniques to track a visual guide. Parker proposed a thresholding method capable of thresholding images produced under variable illumination [5]. Thresholding is the oldest segmentation method and remains a very popular technique in image segmentation since it is computationally inexpensive and fast [6]. Cheriet et al. proposed a more sophisticated approach, the "recursive thresholding technique", in order to segment bank cheques [7]. Ohlander proposed a thresholding technique for outdoor colored images; he adopted an approach using nine one-dimensional histograms of features such as color intensity (for red, green and blue) and hue to segment natural scenes [8].

The purpose of this paper is to present a different approach to camera distance estimation from objects. The proposed method does not depend on knowledge of the angle between the camera and the object; it can therefore be applied to images taken with generic digital cameras and eliminates the need for expensive laser rangers. Moreover, a three-dimensional (3D) geometrical model has been developed to determine the current pose of an object and predict its future location based on an FIR model learned by OLS.

The rest of the paper is organized as follows. Section II presents the image denoising technique used to improve the quality of an image corrupted by noise. Section III illustrates the color table thresholding segmentation technique used to extract the visual target and the selection of an appropriate color model. Section IV deduces the center of gravity of the tracked object. Section V presents the geometrical range calibration. Sections VI and VII describe motion estimation and prediction based on an FIR model identified by OLS, together with the experimental results. Finally, the conclusion summarizes the presented research work.
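To make this outline concrete, the following minimal Python sketch strings together the processing stages described in the later sections (denoising, color thresholding, center-of-gravity extraction, and range estimation from the segmented blob size). The OpenCV/NumPy calls, the threshold values, and the function names are illustrative assumptions rather than the authors' implementation; only the constants a and b are the calibration coefficients reported in Section V.

# Sketch (not the authors' code): one iteration of the tracking/ranging pipeline.
import cv2
import numpy as np

A_CAL, B_CAL = 30606.621, -0.03410108   # calibration constants fitted in Section V

def track_and_range(frame_bgr, lo=(0, 120, 70), hi=(80, 255, 200)):
    denoised = cv2.medianBlur(frame_bgr, 5)                 # rank-order denoising (Sec. II)
    ycc = cv2.cvtColor(denoised, cv2.COLOR_BGR2YCrCb)       # OpenCV orders channels Y, Cr, Cb
    mask = cv2.inRange(ycc, np.array(lo), np.array(hi))     # color table thresholding (Sec. III)
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                                         # target not found in this frame
    cog = (xs.mean(), ys.mean())                            # center of gravity (Sec. IV)
    N = xs.size                                             # projection size in pixels
    D = np.log(N / A_CAL) / B_CAL                           # invert N = a*exp(b*D) (Sec. V)
    return cog, D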



II. Image Preprocessing

Image denoising is a key issue in all image processing techniques and can be considered the first preprocessing step. It is used to improve the quality of an image corrupted by noise arising from the undesired conditions of the acquisition phase. The great challenge of image denoising is to preserve the edges and details of an image while reducing the noise. Noise can occur during image capture, transmission or processing, and may be dependent on or independent of the image content. Noise is usually described by its pattern and by its probabilistic characteristics. There is a wide variety of noise types; we focus on Gaussian noise, speckle noise, Poisson noise, impulse noise, and salt-and-pepper noise. A robust image improvement technique has to suppress the noise while preserving the natural information in the image. A large number of linear and nonlinear filtering algorithms [9] have been proposed to remove noise from corrupted images and enhance image quality. In this framework we apply an adaptive filter to eliminate the noise and improve the visual tracking. This is a nonlinear filter whose response is based on ordering (ranking) of the pixel values; its edge-preserving nature makes it useful in cases where edge blurring is undesirable. As shown in figure (1), false and bad tracking occurs in the presence of noise.
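The paper identifies the denoising filter only as an adaptive, rank-order (nonlinear) filter. A minimal sketch of that idea is given below, assuming a median-type filter as the rank-order operation; the kernel size and the noise level in the example are assumptions chosen for illustration.

# Sketch: rank-order (median) denoising before tracking; filter choice and
# window size are assumptions, not the authors' exact settings.
import cv2
import numpy as np

def denoise_for_tracking(frame_bgr, ksize=5):
    # Median filtering replaces each pixel by the median of its neighborhood,
    # which removes impulse/salt-and-pepper noise while preserving edges.
    return cv2.medianBlur(frame_bgr, ksize)

# Example: corrupt a synthetic frame with salt-and-pepper noise, then restore it.
frame = np.full((240, 320, 3), 127, np.uint8)   # synthetic gray frame for illustration
noisy = frame.copy()
coords = np.random.rand(*frame.shape[:2])
noisy[coords < 0.02] = 0                         # pepper
noisy[coords > 0.98] = 255                       # salt
restored = denoise_for_tracking(noisy)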

Fig. 1. The effect of denoising on visual tracking: frames corrupted with salt-and-pepper, Gaussian, Poisson, and speckle noise lead to false or bad tracking, while the corresponding restored images allow correct tracking.

III. Segmentation

To track a visual target we have to rely on a segmentation technique. Segmentation involves partitioning an image into a set of homogeneous and meaningful regions, such that the pixels in each partitioned region possess an identical set of properties or attributes. These properties may include gray levels, contrast, spectral values, or textural properties. The result of segmentation is a number of homogeneous regions, each having a unique label. An image is thus defined by a set of connected, non-overlapping regions, so that each pixel acquires a unique region label indicating the region it belongs to. The set of objects of interest in an image, once segmented, undergoes subsequent processing such as object classification and scene description. There may exist a number of possible partitions, but the selection of an appropriate set of regions depends on the choice of the property associated with the region [10-15]. In object tracking, several researchers have focused on thresholding, region based segmentation, region merging and splitting, edge based segmentation, clustering, and hybrid techniques to track a visual guide.

In this framework we employ a color table thresholding segmentation technique to extract the visual target, because it is fast and requires little computational time, which makes it suitable for real-time tracking. In this technique there are six threshold values. Most images are stored as two-dimensional arrays (i.e., matrices), in which each element of the matrix corresponds to a single pixel in the displayed image. (Pixel is derived from picture element and usually denotes a single dot on a computer display.) Some images, such as true color images, require a three-dimensional array, where the first plane in the third dimension represents the red pixel intensities, the second plane represents the green pixel intensities, and the third plane represents the blue pixel intensities.
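A minimal sketch of the six-threshold color table segmentation described above is given below for the RGB case; the concrete threshold values are placeholders chosen for illustration, not the paper's calibrated color table.

# Sketch: color table thresholding with six thresholds (min/max per channel).
# Threshold values are illustrative only.
import numpy as np

def color_table_mask(rgb, min_rgb=(150, 20, 20), max_rgb=(255, 100, 100)):
    # rgb: H x W x 3 array whose three planes hold the R, G, B intensities.
    lo = np.asarray(min_rgb)
    hi = np.asarray(max_rgb)
    # A pixel belongs to the target if every channel lies inside [min, max].
    return np.all((rgb >= lo) & (rgb <= hi), axis=2)

# Example: segment a synthetic frame containing a red guide.
frame = np.zeros((120, 160, 3), dtype=np.uint8)
frame[40:80, 60:100] = (200, 40, 40)          # the "guide"
mask = color_table_mask(frame)                # boolean target mask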
According to this representation we have six thresholds: max_red, max_green, max_blue, min_red, min_green, min_blue. During motion, deformation of the color model may occur due to changes in illumination, so it is important to segment the image precisely in order to track the object. The main goal of the vision system is to extract the object (guide) and to estimate its range geometrically. In this paper the tracking of homogeneous and inhomogeneous colored objects is studied, and the purpose is to select the appropriate color model for these two types of objects. Visual guide extraction is achieved by applying the following steps: 1) select the appropriate color model, 2) define the color of the guide, 3) apply a Gaussian filter to eliminate the noise of the image, 4) extract the object based on thresholding, and 5) deduce the x and y projections of the guide's center of gravity (COG) [16, 17, 18].

A. Selection of an Appropriate Color Model

Although many color spaces have been defined, RGB and HSV are the most commonly used models. The RGB color space represents all colors as a mixture of red, green and blue, which constitute the primary colors used by video cameras, televisions and PC monitors; when combined, these colors can create any color on the spectrum. RGB is too sensitive to changes of illumination, while HSV is unstable over the gray scale spectrum. To avoid these problems we decided to rely on the YCbCr color model. The Y in YCbCr denotes the luminance component, and Cb and Cr represent the chrominance factors. To analyze and process images in color, computer vision systems typically use data from either the RGB or YCbCr color space, depending on a given task's complexity. For example, in simple applications such as tracking highly saturated, homogeneously colored guides it is enough to use the RGB model. With more complex applications, however, such as tracking persons based on skin color under changing illumination, a vision system may require YCbCr or normalized RGB information to perform the operation. In short, the RGB model is suited for image color generation, whereas the YCbCr model is suited for image color description. The conversion from RGB to YCbCr can be calculated as follows:

\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.257 & 0.504 & 0.098 \\ -0.148 & -0.291 & 0.439 \\ 0.439 & -0.368 & -0.071 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 16 \\ 128 \\ 128 \end{bmatrix}

As shown in figure (2), both RGB and YCbCr are successful in dealing with highly saturated homogeneous objects, even under changes of illumination, so there is no need to use the YCbCr color model with its extra computational burden. For tracking inhomogeneously colored targets, such as face skin color, the YCbCr model succeeds under changes of illumination while the RGB model fails.
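As an illustration of the conversion above, a short NumPy sketch is given below; it simply applies the stated coefficient matrix and offsets to an 8-bit RGB image and is not the authors' implementation. A pixel-wise threshold on the Cb and Cr planes then plays the role of the color table in the YCbCr case.

# Sketch: RGB -> YCbCr conversion using the coefficients given above.
import numpy as np

M = np.array([[ 0.257,  0.504,  0.098],
              [-0.148, -0.291,  0.439],
              [ 0.439, -0.368, -0.071]])
OFFSET = np.array([16.0, 128.0, 128.0])

def rgb_to_ycbcr(rgb):
    # rgb: H x W x 3 array with 8-bit R, G, B planes.
    ycbcr = rgb.astype(np.float64) @ M.T + OFFSET
    return np.clip(ycbcr, 0, 255).astype(np.uint8)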

Fig. 2. RGB and YCbCr color table thresholding based tracking under illumination changes, for homogeneous and inhomogeneous colored targets.

IV. Deduction of Guide's COG

To track a guide dynamically, its center of gravity is calculated and the result is delivered to the control system to steer the robot adaptively, see figure (4). Compared with the contour detection presented in [19, 20, 21], the COG is more accurate because the object to be tracked is represented by a single point in 3D space. Moreover, the elapsed time required to extract the geometrical LUT of the object is shorter, and this representation is easy to manipulate in the control system as a definite feedback signal, see figure (4). The COG is calculated according to:

x_c = \frac{1}{n}\sum_{i=1}^{n} x_i    (4)

y_c = \frac{1}{n}\sum_{i=1}^{n} y_i    (5)

Fig. 4. Center of gravity (COG) of a target (original image; X = 191.897, Y = 108.563).
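A small sketch of equations (4) and (5) applied to a binary target mask is shown below; the mask is assumed to come from the color table thresholding of Section III.

# Sketch: center of gravity (COG) of the segmented target, eqs. (4)-(5).
import numpy as np

def center_of_gravity(mask):
    # mask: H x W boolean array marking the segmented guide pixels.
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None                    # no target pixels found
    x_c = xs.mean()                    # eq. (4): mean of the x coordinates
    y_c = ys.mean()                    # eq. (5): mean of the y coordinates
    return x_c, y_c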
V. Geometrical Calibration

Figures (5, 6) show the tracked object at different distances. The distance between the camera and the guide can be deduced mathematically using the following form:

N = a e^{bD}    (6)

where D is the distance between the robot and the guide, a and b are constants depending on the geometry of the camera's lens, and N is the projection size. To reject noise, a threshold is used to delimit the projection size.

From the size N of the object we estimate the distance D to the guide. Using least-squares (LS) approximation we obtained the coefficients a = 30606.621 and b = -0.03410108. To calculate the pose of the guide in 3D geometrically, we first use the COG rule to get the x and y axis parameters COG(x) and COG(y). Then we use the range calibration method to calculate the z axis parameter, the range D, as shown in Figure (7).
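The sketch below shows one way to fit a and b from calibration pairs (D, N) and to invert equation (6) to recover D from a measured projection size. The calibration pairs here are hypothetical stand-ins; only the printed coefficients a = 30606.621 and b = -0.03410108 come from the paper.

# Sketch: fit N = a*exp(b*D) by linear least squares on log N, then invert it.
import numpy as np

# Hypothetical calibration data: distances and the blob sizes measured at them.
D_cal = np.array([30, 40, 50, 60, 70, 80, 90, 100, 110], dtype=float)
N_cal = 30606.621 * np.exp(-0.03410108 * D_cal)    # stand-in for real measurements

# log N = log a + b*D  ->  ordinary least squares for (b, log a).
b, log_a = np.polyfit(D_cal, np.log(N_cal), 1)
a = np.exp(log_a)

def estimate_range(N, a=a, b=b):
    # Invert eq. (6): D = ln(N / a) / b.
    return np.log(N / a) / b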
VI. Motion Estimation and Prediction based on the FIR-OLS Technique

In real time, computational complexity is a vital issue; to overcome this problem we use the FIR structure. For a single-input/single-output (SISO) system, the FIR model structure is given by the discrete equation, see figure (8):

y(t) = \sum_{i=1}^{n} a_i\, u(t-i) + e(t) = \mathbf{a}^{T}\mathbf{u}(t) + e(t)    (7)

where y(t) represents the output at time t, u(t) represents the input at time t, and e(t) is additive output noise that is i.i.d. (identically and independently distributed with zero mean and variance \sigma^2);

\mathbf{a} = \begin{bmatrix} a_1 & a_2 & a_3 & \cdots & a_n \end{bmatrix}^{T}    (8)

is the FIR model kernel; and

\mathbf{u}(t) = \begin{bmatrix} u(t-1) & u(t-2) & u(t-3) & \cdots & u(t-n) \end{bmatrix}^{T}    (9)

is the vector of lagged inputs.

VII. FIR Identification using OLS

With N observations, the input-output data can be written in matrix form as

\mathbf{y} = \Phi\,\mathbf{a} + \mathbf{e}    (10)

where

\mathbf{y} = \begin{bmatrix} y(t-n+1) \\ y(t-n+2) \\ y(t-n+3) \\ \vdots \\ y(t) \end{bmatrix}    (11)

\Phi = \begin{bmatrix} \mathbf{u}^{T}(t-n+1) \\ \mathbf{u}^{T}(t-n+2) \\ \mathbf{u}^{T}(t-n+3) \\ \vdots \\ \mathbf{u}^{T}(t) \end{bmatrix}    (12)

\mathbf{e} = \begin{bmatrix} e(t-n+1) \\ e(t-n+2) \\ e(t-n+3) \\ \vdots \\ e(t) \end{bmatrix}    (13)

The OLS solution to the above problem is found by solving the standard least-squares problem

\min_{\mathbf{a}}\; (\mathbf{y} - \Phi\mathbf{a})^{T}(\mathbf{y} - \Phi\mathbf{a})    (14)

whose solution is the estimate of the FIR model kernel:

\hat{\mathbf{a}} = (\Phi^{T}\Phi)^{-1}\Phi^{T}\mathbf{y} = R^{-1}\Phi^{T}\mathbf{y}    (15)

where R = \Phi^{T}\Phi.
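The following short sketch illustrates equations (10)-(15): it builds the lagged-input regressor matrix from a measured input/output record and solves the least-squares problem for the FIR kernel. The signal length, the model order, and the use of numpy.linalg.lstsq (rather than an explicit matrix inverse) are assumptions made for the illustration.

# Sketch: FIR kernel identification by ordinary least squares, eqs. (10)-(15).
import numpy as np

def identify_fir(u, y, n):
    # u, y: equal-length input and output records; n: FIR model order.
    # Each row of Phi is the lagged-input vector [u(t-1), u(t-2), ..., u(t-n)].
    Phi = np.array([u[t - n:t][::-1] for t in range(n, len(u))])
    y_vec = np.asarray(y)[n:]
    # a_hat = (Phi^T Phi)^{-1} Phi^T y, computed via a stable least-squares solver.
    a_hat, *_ = np.linalg.lstsq(Phi, y_vec, rcond=None)
    return a_hat

def predict_fir(u, a_hat):
    # One-step-ahead prediction: y_hat(t) = a_hat^T [u(t-1), ..., u(t-n)].
    n = len(a_hat)
    return np.array([a_hat @ u[t - n:t][::-1] for t in range(n, len(u))])

# Example with a synthetic record (assumed, for illustration only).
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
true_a = np.array([0.5, 0.3, -0.2])
y = np.convolve(u, np.concatenate(([0.0], true_a)))[:len(u)] + 0.01 * rng.standard_normal(len(u))
a_hat = identify_fir(u, y, n=3)        # should be close to true_a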

Fig. 5. Object tracking at different distances (Dist = 30 to 110).

Fig. 6. The relation between the range (D) and the projection size (N).

Fig. 7. The relation between the range and the location of the object in the 3D domain.

The bias of the OLS estimator \hat{\mathbf{a}} is

\begin{aligned} \mathbf{b} &= \mathbf{a} - E[\hat{\mathbf{a}}] \\ &= \mathbf{a} - E\left[R^{-1}\Phi^{T}\mathbf{y}\right] \\ &= \mathbf{a} - R^{-1}\Phi^{T} E\left[\Phi\mathbf{a} + \mathbf{e}\right] \\ &= \mathbf{a} - R^{-1}\Phi^{T}\Phi\,\mathbf{a} \quad (\text{since } E[\mathbf{e}] = 0) \\ &= \mathbf{a} - R^{-1}R\,\mathbf{a} \\ &= 0 \end{aligned}    (16)

where a is the true FIR model coefficient vector and E is the expectation operator. Similarly, it can easily be shown that the variance of the OLS estimator \hat{\mathbf{a}} is

E\left[(\mathbf{a} - \hat{\mathbf{a}})(\mathbf{a} - \hat{\mathbf{a}})^{T}\right] = \sigma^{2} R^{-1}    (17)

and the mean square error of the OLS estimator is

E\left[(\mathbf{a} - \hat{\mathbf{a}})^{T}(\mathbf{a} - \hat{\mathbf{a}})\right] = \operatorname{trace}\left(E\left[(\mathbf{a} - \hat{\mathbf{a}})(\mathbf{a} - \hat{\mathbf{a}})^{T}\right]\right) = \sigma^{2}\operatorname{trace}(R^{-1})    (18)

Fig. 8. FIR model structure.
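As a quick numerical illustration of equations (16)-(18), the following sketch estimates the bias and covariance of the OLS estimate over repeated noise realizations; the regressor matrix, noise variance, and repetition count are assumptions made only for this experiment.

# Sketch: Monte Carlo check that the OLS estimate is unbiased with
# covariance sigma^2 * R^{-1} (eqs. 16-18). All settings are illustrative.
import numpy as np

rng = np.random.default_rng(1)
n, m, sigma, trials = 3, 200, 0.1, 2000
a_true = np.array([0.5, 0.3, -0.2])
Phi = rng.standard_normal((m, n))             # fixed regressor matrix
R_inv = np.linalg.inv(Phi.T @ Phi)

estimates = np.empty((trials, n))
for k in range(trials):
    y = Phi @ a_true + sigma * rng.standard_normal(m)
    estimates[k] = R_inv @ Phi.T @ y          # eq. (15)

bias = a_true - estimates.mean(axis=0)        # close to 0, eq. (16)
cov = np.cov(estimates, rowvar=False)         # close to sigma^2 * R_inv, eq. (17)
mse = np.trace(cov)                           # close to sigma^2 * trace(R_inv), eq. (18)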

Figure (9) shows the different models w.r.t. the actual output. Figure (10) shows the autocorrelation of all models. Figure (11) shows the output of the FIR model w.r.t. the actual output, and Figure (12) shows the capability of the model to predict the output when the system input is known.

Fig. 9. The different models w.r.t. the actual output.

Fig. 10. The autocorrelation of all models.

Fig. 11. Model output w.r.t. system output.

Fig. 12. Model output w.r.t. system output.

VIII. Conclusion

Throughout this framework we presented an approach that converts a normal visual camera into a quasi-precise ranging sensor. We proposed a new, robust and computationally efficient approach to build a 3D geometrical model of the visual tracking process: a three-dimensional (3D) geometrical model determines the current pose of an object and predicts its future location based on an FIR model learned by OLS. The applied visual tracking algorithm exhibits high reliability and achieves the targeted robustness, and the effects of noise and changes of illumination are eliminated to a significant extent. Contrary to the dominant view of RGB versus YCbCr, the RGB color model is a stable and fast technique for recognizing homogeneous saturated color targets even in the presence of illumination changes, while YCbCr is efficient with inhomogeneous visual targets such as faces. As future work, we are going to implement the applied algorithm on an embedded system to develop a visual RADAR system.

References

[1] S. Harasse, L. Bonnaud, M. Desvignes, "Finding People in Video Streams by Statistical Modeling", 3rd ICAPR International Conference on Advances in Pattern Recognition, pp. 608-617, Bath, August 2005.
[2] X. Liu and T. Chen, "Online Modeling and Tracking of Pose-Varying Faces in Video", Video Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition, 2005.
[3] Payman Haqiqat, "Using Image Moments for Tracking Rotating Objects", ICGST International Conference on Automation, Robotics and Autonomous Systems, ARA05, 19-21 December 2005.
[4] M. Wimmer, B. Radig, "Adaptive Skin Color Classificator", ICGST International Conference on Graphics, Vision and Image Processing, CICC, Cairo, Egypt, 19-21 December 2005.
[5] Parker, J. R., "Grey Level Thresholding in Badly Illuminated Images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 8, pp. 813-819, 1991.
[6] M. Sonka, V. Hlavac and R. Boyle, Image Processing, Analysis and Machine Vision, Chapman & Hall Computing, 1st Edition, pp. 112-178, 1993.
[7] Cheriet M., Said J. N. and Suen C. Y., "A recursive thresholding technique for image segmentation", IEEE Transactions on Image Processing, vol. 7, no. 6, pp. 918-920, 1998.
[8] Ohlander R. B., "Analysis of natural scenes", Ph.D. dissertation, Department of Computer Science, Carnegie-Mellon University, Pittsburgh, PA, 1975.
[9] Zack G. W., Rogers W. E. and Latt S. A., "Automatic Measurement of Sister Chromatid Exchange Frequency", 25(7), pp. 741-753, 1977.
[10] P. Rosin, "Unimodal Thresholding", Pattern Recognition, vol. 34, no. 11, pp. 2083-2096, 2001.
[11] Pavlidis T., Structural Pattern Recognition, Springer Verlag, Berlin, Heidelberg, New York, 1977.
[12] Prager J. M., "Extracting and labeling boundary segments in natural scenes", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 2, no. 1, pp. 16-27, 1980.
[13] Pal N. R. and Pal S. K., "A review on image segmentation techniques", Pattern Recognition, vol. 26, pp. 1227-1249, 1993.
[14] Rafael Gonzalez and Richard Woods, Digital Image Processing, Pearson Publications.
[15] Hanzi Wang and David Suter, "A Model-Based Range Image Segmentation Algorithm Using a Novel Robust Estimator", 3rd Int'l Workshop on Statistical and Computational Theories of Vision (in conjunction with ICCV'03), Nice, France, October 2003.
[16] Ashraf Aboshosha, "Adaptive Navigation and Motion Planning for Autonomous Mobile Robots", Ph.D. thesis, University of Tuebingen, 2004.
[17] R. Aufrere, C. Mertz, and C. Thorpe, "Multiple Sensor Fusion for Detecting Location of Curbs, Walls, and Barriers", Proceedings of the IEEE Intelligent Vehicles Symposium (IV2003), 2003.
[18] T. Belker, M. Beetz, and A. B. Cremers, "Learning of Plan Execution Policies for Indoor Navigation", AI Communications, 15(1):3-16, 2002.
[19] Alexandre Bernardino, Jose Santos-Victor, and Giulio Sandini, "Foveated active tracking with redundant 2D motion parameters", Robotics and Autonomous Systems, vol. 39, Elsevier Science, June 2002.
[20] Darius Burschka, Jeremy Geiman, and Gregory D. Hager, "Optimal Landmark Configuration for Vision-Based Control of Mobile Robots", Proc. of the International Conference on Robotics and Automation (ICRA), 2003.
[21] Darius Burschka and Gregory D. Hager, "Principles and practice of real time tracking on consumer hardware", Tutorial 1 at IEEE VR2003: Recent Methods for Image-Based Modelling and Rendering, March 2003.

