object tracking using background subtraction
Abstract: The main objective of this paper is to develop a multiple human object tracking approach based on motion estimation and detection, background subtraction, shadow removal and occlusion detection. A reference frame is initially used and considered as background information. When a new object enters the frame, the foreground information and background information are identified using the reference frame as the background model. Most of the time, the shadow of the background information is merged with the foreground object and makes the tracking process complex. In the proposed approach, morphological operations are used for identifying and removing the shadow. Occlusion is one of the most common events in object tracking, and the centroid of each object is used for detecting the occlusion and identifying each object separately. Video sequences have been captured in the laboratory and tested with the proposed algorithm. The algorithm works efficiently in the event of occlusion in the video sequences.

Keywords: Background modeling and subtraction, human motion detection, shadow removal and occlusion detection.

I. INTRODUCTION

In Computer Vision, object tracking is considered one of the most important tasks. Various methods have been proposed and reported in both academia and industry for a large number of real-time applications. Object tracking methods may broadly be categorized as template-based, probabilistic and pixel-wise. While the template-based method represents the object in a suitable way for tracking, the probabilistic method uses an intelligent searching strategy for tracking the target object. Similarly, similarity matching techniques are used for tracking the target object in pixel-based methods. However, among all the above approaches, the template-based approach is found to be suitable for many real-time applications [1, 2]. In this category of tracking methods, the similarity of the predefined target is calculated with the object translation. However, for object transformations such as translation, rotation and scaling this method often fails. This is because the target object is selected as a constant-size template. To handle this issue, varying templates are used. The inclusion of background pixels in the template introduces a positioning error, and this error accumulates continuously as the template is updated.

In the template-based category, the mean-shift method [3] and the kernel-based tracking method [4] have been proposed, where the color histogram of the target object is constructed using a kernel density estimation function. Since the color histogram is an invariant feature under rotation, scaling and translation, it is considered a suitable feature for handling changes in the scale, rotation and translation of the target object. Object tracking is carried out by comparing the color histogram of the template with that of the target object. However, the mean-shift method is not suitable for 3-D target objects and monochromatic objects. For a monochromatic target object, even a small variation in illumination produces a narrow histogram pattern, and tracking often fails.

In the object tracking problem, object representation is the difficult aspect. Various ways of representing or describing target objects have been proposed, such as object appearance [1, 2], image features [5, 6], target contour [7, 8] and color histogram [4]. In both appearance-based and color histogram based approaches, the region of the object has to be defined for describing the target. Thus, if some of the background pixels are mixed with the defined region, the tracking may fail.

While tracking non-rigid objects, probabilistic tracking methods have given better performance. Some of the approaches in this category can be found in [13, 14, 15, 19]. In one of the probabilistic methods [13], components such as a motion detector, region tracker, head detector and active shape tracker have been combined for tracking pedestrians. The assumption made in this method is that there are no people moving in the background. Since this method uses the contour as one of its features, the initial contour definition is difficult for target objects with complicated contours.

Object tracking is also performed by predicting the object position from the past information and the predicted current position. These types of methods combine both statistical computation and the parameter vector [11, 12, 16, 17]. However, for real-time object tracking systems, it is found to be difficult to construct the proper feature vectors. This method has been extended by Khan, et al. [11] for dealing with the problem of interacting targets. The Markov Random
978-1-4244-8594-9/10/$26.00 © 2010 IEEE
Field (MRF) has been used for modelling the interactions. This has been achieved by adding an interaction weighted factor. However, in this method the tracking fails when there is an overlap between targets.

In contrast to model-based tracking methods, the pixel-wise tracking methods are data-driven. In a pixel-wise tracking method, a prior model of the target is not required. A parallel K-means clustering algorithm [18] has been used by Heisele, et al. [9, 10] for segmenting the color image sequence, and the moving region is identified as the target. However, the method is computationally expensive due to the large number of clusters. Similarly, another K-means based autoregressive model has been proposed, in which the clustering is performed only on the positive samples. Thus, the tracking failure can't be detected and the failure recovery may not be possible. For tracking, the image pixels are divided into target and non-target pixels and the K-means clustering algorithm is applied on these pixels [20]. However, this method can't deal with appearance changes of the target object such as size, pose, etc. In addition, the computational cost is proportional to the number of non-target points.

It is understood from the above discussion that pixel-based methods are robust against background interfusion. In this kind of method, failure detection and automatic failure recovery can be carried out effectively.

A very fundamental and critical task in computer vision is the detection and tracking of moving objects in video sequences. Possible applications are as follows. (i) Visual surveillance: A human action recognition system processes image sequences captured by video cameras monitoring sensitive areas such as banks, departmental stores, parking lots and country borders to determine whether one or more humans are engaged in suspicious or criminal activity. (ii) Content-based video retrieval: A human behavior understanding system scans an input video and outputs an action or event specified in high-level language. This application will be very useful for sportscasters to quickly retrieve important events in particular games. (iii) Precise analysis of athletic performance: Video analysis of athlete action is becoming an important tool for sports training, since it does not intervene with the athlete.

In all these applications, fixed cameras are used with respect to a static background (e.g. a stationary surveillance camera), and the common approach of background subtraction is used to obtain an initial estimate of moving objects. First, background modeling is performed to yield a reference model. This reference model is used in background subtraction, in which each frame of the video sequence is compared against the reference model to determine possible variation. The variations between the current video frame and the reference frame in terms of pixels signify the existence of moving objects. The variation, which also represents the foreground pixels, is further processed for object localization and tracking. Ideally, background subtraction should detect real moving objects with high accuracy, limiting false negatives (objects not detected) as much as possible. At the same time, it should extract as many pixels of the moving objects as possible, while avoiding shadows, static objects and noise.

The detection of shadows as foreground objects is very common, producing undesirable consequences. For example, shadows connect different people walking in a group, generating a single object (typically called a blob) as the output of background subtraction. In such a case, it is more difficult to isolate and track each person in the group. There are several techniques for shadow detection in video sequences [21].

The main objective of this paper is to develop an algorithm that can detect human motion at a certain distance for object tracking applications. We carry out various tasks such as motion detection, background modeling and subtraction, foreground detection, shadow detection and removal, morphological operations and occlusion identification.

The paper is organized as follows. In Section 2, the object segmentation of the video frames from the HSV color space is presented. The proposed method is explained in Section 3. In Section 4, we present the experimental results, and we conclude the paper in the last section.

II. BACKGROUND SUBTRACTION

Human motion analysis and detection are the foremost tasks in computer vision based problems. Human detection aims at segmenting regions corresponding to people from the entire image. It is a significant issue in a human motion analysis system, since the subsequent processes such as tracking and action recognition follow the motion detection. The motion detection and foreground object extraction algorithm consists of several sequential processes. The algorithm is described in a flow chart and shown in Fig.1.

In general, the Sum of Absolute Difference (SAD) algorithm is used for background modelling, which is based on frame differencing techniques. It is mathematically represented as

$$D(t) = \frac{1}{N} \sum \left| I(t_i) - I(t_j) \right| \qquad (1)$$

where $N$ is the number of pixels in the frame and is also used as the scaling factor, and $I(t_i)$ and $I(t_j)$ are the frames at time $i$ and $j$ respectively. $D(t)$ is the normalized sum of absolute difference for that time instance. In an ideal case, when there is no motion, the following condition holds good.

$$I(t_i) = I(t_j) \quad \text{and} \quad D(t) = 0 \qquad (2)$$

A. Background Subtraction

Background subtraction is a popular technique to segment out the objects of interest in a frame. This technique involves taking the difference between an image that contains the object and the previous background image that has no foreground objects of interest. The area of the image plane where there is a significant difference between these images indicates the pixel location of the moving objects [22]. These objects, which are represented by groups of pixels, are then separated from the background image by using a threshold technique.
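As an illustration of Eqs. (1) and (2) and of the thresholded subtraction described in this section, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes grayscale frames stored as 2-D arrays, and the function names and the values of `motion_threshold` and `pixel_threshold` are hypothetical, since the paper does not state them.

```python
import numpy as np


def normalized_sad(frame_i, frame_j):
    """Normalized sum of absolute differences D(t) between two frames, as in Eq. (1)."""
    a = frame_i.astype(np.float64)
    b = frame_j.astype(np.float64)
    return np.abs(a - b).sum() / a.size  # N = number of pixels


def motion_detected(frame_i, frame_j, motion_threshold=1.0):
    """Eq. (2) gives D(t) = 0 without motion; a small tolerance absorbs noise.

    motion_threshold is an assumed value, not taken from the paper.
    """
    return normalized_sad(frame_i, frame_j) > motion_threshold


def foreground_mask(frame, background, pixel_threshold=30):
    """Boolean mask of pixels that differ significantly from the background.

    pixel_threshold is an assumed value, not taken from the paper.
    """
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    return diff > pixel_threshold


# Toy data: an empty 6x6 background and a frame with a bright 2x2 object.
background = np.zeros((6, 6), dtype=np.uint8)
frame = background.copy()
frame[2:4, 2:4] = 200

d = normalized_sad(background, frame)
mask = foreground_mask(frame, background)
```

Casting to a wider type before subtracting avoids the wrap-around that unsigned 8-bit arithmetic would otherwise produce.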
[Fig.1: flow chart. Motion detected? (No: End; Yes: continue) → Background modelling → Background subtraction / Foreground object extraction → Shadow detection and removal → Morphology process → Occlusions detection]

Fig.2. Background subtraction and moving object identification (a) Video Frame, (b) Background, (c) Background Subtracted Image and (d) B/W frames Showing Objects

Fig.2 shows the video frames used for the background subtraction and moving object identification. Fig. 2(b) shows the background frame that has been used for constructing the background model. The foreground information and background information are identified and finally subtracted to identify the objects present in the foreground frame, as shown in Fig. 2(c). Fig. 2(d) shows the final output, where the objects present in the frame are converted to black and white pixels for effective identification of the objects.
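The morphology stage of the flow chart operates on a black and white image such as the one in Fig. 2(d). The paper does not say which morphological operators or structuring element it uses, so the sketch below is only an assumption: a binary opening (erosion followed by dilation) with a 3x3 square element, written in plain NumPy. It removes isolated pixels and thin structures from a foreground mask. The function names are hypothetical.

```python
import numpy as np


def erode(mask):
    """Binary erosion with a 3x3 square structuring element (assumed shape)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    out = np.ones((h, w), dtype=bool)
    # A pixel survives only if all 9 pixels in its 3x3 neighbourhood are set.
    for dy in range(3):
        for dx in range(3):
            out &= padded[dy:dy + h, dx:dx + w]
    return out


def dilate(mask):
    """Binary dilation with a 3x3 square structuring element (assumed shape)."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    out = np.zeros((h, w), dtype=bool)
    # A pixel is set if any pixel in its 3x3 neighbourhood is set.
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out


def opening(mask):
    """Erosion followed by dilation: drops structures smaller than the element."""
    return dilate(erode(mask))


# Toy mask: a solid 3x3 object plus one isolated noise pixel.
noisy = np.zeros((7, 7), dtype=bool)
noisy[1:4, 1:4] = True
noisy[5, 5] = True

cleaned = opening(noisy)
```

In this toy case the 3x3 object is kept intact while the isolated pixel is removed.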
$B(i, j)$ is the background image and $T_{ij}(n, m)$ is the template on the current image.
Fig.4. Occlusion detection and background subtraction (a) Video Frame, (b) Background Subtraction, (c) BW Image Showing Objects and (d) Occlusion Detection
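The abstract states that the centroid of each object is used for detecting occlusion, but the exact criterion is not given in this part of the paper. The sketch below is therefore only one plausible reading, offered as an assumption: it computes the centroid of each object's binary mask and flags a possible occlusion when two centroids come closer than a distance `min_distance`. Both the rule and the default value are hypothetical.

```python
import numpy as np


def centroid(mask):
    """Centroid (row, col) of the set pixels in a binary object mask."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask contains no object pixels")
    return rows.mean(), cols.mean()


def possible_occlusion(mask_a, mask_b, min_distance=5.0):
    """Assumed rule: two objects may occlude each other when their
    centroids are closer than min_distance pixels."""
    (ra, ca), (rb, cb) = centroid(mask_a), centroid(mask_b)
    return np.hypot(ra - rb, ca - cb) < min_distance


# Toy data: object A stays put; object B is first far away, then close.
obj_a = np.zeros((10, 20), dtype=bool)
obj_a[4:6, 2:4] = True

obj_b_far = np.zeros((10, 20), dtype=bool)
obj_b_far[4:6, 15:17] = True

obj_b_near = np.zeros((10, 20), dtype=bool)
obj_b_near[4:6, 5:7] = True

far_flag = possible_occlusion(obj_a, obj_b_far)
near_flag = possible_occlusion(obj_a, obj_b_near)
```

A distance test on centroids is cheap, but it presumes that each object already has its own mask, for example from labelling the connected regions of the black and white image.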
Fig.6. Performance comparison (a) Kalman filter's results (b) Proposed background subtraction and shadow removal's results