
2015 IEEE International Conference on Signal and Image Processing Applications (ICSIPA)

Object Tracking using Joint Histogram of Color and Local Rhombus Pattern

Manisha Verma
Department of Mathematics
Indian Institute of Technology Roorkee
Uttarakhand, India
Email: manisha.verma.in@ieee.org

Balasubramanian Raman
Department of Computer Science and Engineering
Indian Institute of Technology Roorkee
Uttarakhand, India
Email: balaiitr@ieee.org

Abstract—Object tracking is a challenging real-world problem in traffic monitoring, crime scenes, sports, etc. A feature extraction method, named the local rhombus pattern (LRP), is proposed in this work. It differs from the conventional local binary pattern in that it extracts the local relationship among the neighboring pixels themselves instead of their relationship with the center pixel. The proposed pattern is combined with a quantized HSV (hue, saturation and value) histogram and applied to object tracking using the mean shift tracking algorithm. Experiments are carried out on road traffic and sports videos using the joint histogram of LRP and the HSV color space, and the results are compared with two state-of-the-art approaches. The experimental results show the effectiveness of the proposed method over the existing methods.

Index Terms—Local binary pattern; local rhombus pattern; mean shift tracking; object tracking.

I. INTRODUCTION

Object tracking is a crucial problem in the fields of pattern recognition and computer vision. It mainly finds applications in vehicle navigation, traffic monitoring, face tracking, etc. Object tracking involves two major tasks: first, feature extraction for the target object in the video sequence, and second, tracking of the target object through the sequence using those features. The proposed work targets feature extraction. Previously, numerous feature extraction methods based on color and texture have been proposed. An extensive survey on object tracking for different categories is presented in [1].

Tracking of non-rigid objects from a moving camera has been performed with the mean shift tracking algorithm, where dissimilarity is measured with a distance derived from the Bhattacharyya coefficient [2]. For better object tracking, shadow detection and suppression have been carried out using the HSV color information of moving objects [3]. A kernel-based object tracker was employed for non-rigid objects using a histogram as the feature space [4]. Shape features were also utilized along with the HSV color histogram, using edge histograms in different directions, and applied to object tracking [5]. An interest point based tracking algorithm was proposed in [6]. Texture recognition has been applied in the temporal domain for dynamic sequences using local binary patterns on three orthogonal planes [7]. A modified LBP, robust to illumination variation, was proposed in [8] and applied to detect moving objects in a video sequence. Takala et al. used a color histogram, a color correlogram and the local binary pattern for color and texture features; motion features were extracted using trajectories, and the combination was applied to object tracking in indoor and outdoor videos [9]. Object tracking under illumination, occlusion and object/camera motion conditions has been proposed using local features [10]. A two-layer feature learning module based on a neural network has been proposed, and the pre-learned features have been adopted in the tracking mode on video sequences [11]. A joint color-texture histogram created from LBP and the RGB color channels has been used to extract features, with the mean shift algorithm applied for object tracking [12]. A novel method, called the spatial extended center-symmetric local binary pattern, was proposed for background subtraction from image sequences [13]. The local maximum edge binary pattern (LMEBP) has been proposed, and its rotation invariant and uniform versions have been applied to object tracking using the mean shift tracking algorithm [14]. Dash et al. proposed a method based on local binary patterns and Ohta color features instead of RGB, and employed it for object tracking [15]. Multiple object tracking in a long sports video was proposed by Liu et al. using the short-term activity of each player in the game [16].

A brief survey related to the proposed feature extraction method is also given in this work. Ojala et al. proposed local binary patterns, which extract the local information of each pixel based on its neighboring pixels [17]. They have been used for many pattern recognition applications, e.g., face recognition [18], facial expression recognition [19] and object tracking [12], [15]. Further, local binary patterns were extended to uniform and rotation invariant local binary patterns [20]. LBP considers all patterns, but dominant LBP [21] considers only the dominant patterns, because the remaining patterns do not contribute much to the feature description. LBP combined with a magnitude pattern and the global mean of the image is known as the completed local binary pattern [22]. Heikkilä et al. proposed the center-symmetric local binary pattern [23], which considers only center-symmetric pixel pairs and extracts the pattern based on their relationship. A direction-based local pattern that extracts local information from pixels in four directions was proposed in [24]. To overcome the noise sensitivity of the local binary pattern, the local ternary pattern was proposed, which creates ternary patterns based on a threshold interval [25]. After LBP and LTP, the local tetra pattern, which considers the horizontal and vertical neighbors of the center pixel to generate a pattern, was proposed [26].


The proposed feature descriptor is motivated by the conventional local binary pattern. LBP is a very strong feature descriptor, but its dimension is too high for building a joint histogram. The modified rotation invariant uniform patterns [20] have a smaller feature vector length; however, they lose information through the feature reduction. The proposed method, named the local rhombus pattern (LRP), is based on the mutual relationship among neighboring pixels rather than the relationship between the center pixel and its neighbors. It has a small feature vector length to begin with, so no further reduction is required, and hence no extra information is lost in reducing the feature vector length. The RGB color histogram treats all three channels uniformly. To overcome this issue, the HSV color space has been adopted, since it separates the color, brightness and intensity components, and each channel carries different information. The joint histogram of HSV color and the local rhombus pattern is used for feature extraction, and the resulting feature vector has been employed for object tracking experiments on two different video sequences.

This paper is organized as follows: Section I presents a brief introduction to the problem and summarizes the related work. Section II describes the local binary pattern and the local rhombus pattern. The framework of the proposed method is presented in Section III. The experimental results are demonstrated in Section IV. Finally, Section V concludes the paper.

II. LOCAL PATTERNS

A. Local Binary Pattern

The local binary pattern (LBP) was proposed by Ojala et al. to capture the local information of the pixels in an image [17]. In LBP, every pixel of the image is treated as the center pixel, one at a time, and local information is acquired for each pixel depending on its neighboring pixels. The center pixel is subtracted from all neighboring pixels, and a binary number is assigned to each neighboring pixel. These binary numbers construct the local binary pattern for the center pixel. Further, the binary values are multiplied by weights and summed up into a single pattern value, which is known as the local binary pattern value of the center pixel. The local binary pattern value is calculated for each pixel in the image. For a center pixel I_c and neighboring pixels I_n, LBP is obtained as follows:

LBP_{p,r} = \sum_{n=0}^{p-1} 2^n \times S(I_n - I_c)    (1)

S(x) = \begin{cases} 1, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}

where p and r are the number of neighboring pixels and the radius, respectively. The rotation invariant uniform LBP is obtained as follows:

LBP_{p,r}^{riu2} = \begin{cases} \sum_{n=0}^{p-1} S(I_n - I_c), & \text{if } U(LBP_{p,r}) \le 2 \\ p+1, & \text{otherwise} \end{cases}    (2)

U(LBP_{p,r}) = |S(I_{p-1} - I_c) - S(I_0 - I_c)| + \sum_{n=1}^{p-1} |S(I_n - I_c) - S(I_{n-1} - I_c)|    (3)

A sample window example of the LBP computation is shown in figure 1.

B. Local Rhombus Pattern

In the proposed method, features are derived from the neighboring pixels by evaluating the mutual relationship among the neighbors instead of their relationship with the center pixel. Four neighborhood pixels, two each in the vertical and horizontal directions, are used for pattern formation. For each of these four pixels, its two adjacent neighbors are considered, and the relationship based on their comparison is extracted. As shown in figure 2(a), I_1, I_3, I_5 and I_7 are considered for pattern formation. A sample window of the image is given in figure 2(b). Figure 2(c-f) demonstrates the steps involved in the pattern creation process, and figure 2(g-i) shows how the pattern values are obtained from the local rhombus pattern. In figure 2(c), the pixel I_1 is subtracted from pixels I_2 and I_8, and based on both difference values, a bit is assigned to I_1: if the two differences have different signs, i.e., one positive and one negative, then '0' is assigned, and if they have the same sign, i.e., both positive or both negative, then '1' is assigned to that pixel. Hence, in this example, the values 0, 1, 1 and 1 are assigned to I_1, I_3, I_5 and I_7, respectively. These values are further multiplied by the weights given in figure 2(h) and summed up into a single pattern value, as shown in figure 2(i). Since the four pixels used for pattern creation form a rhombus around the center pixel, the method is named the local rhombus pattern.

For a pixel (x, y), the LRP is formulated as follows:

T_1^n = I_{n-1} - I_n,  T_2^n = I_{n+1} - I_n,  \forall n = 3, 5, 7    (4)

T_1^n = I_8 - I_n,  T_2^n = I_{n+1} - I_n,  for n = 1    (5)

F(T_1^n, T_2^n) = \begin{cases} 0, & \text{if } T_1^n \ge 0 \text{ and } T_2^n < 0 \\ 0, & \text{if } T_1^n < 0 \text{ and } T_2^n \ge 0 \\ 1, & \text{if } T_1^n \ge 0 \text{ and } T_2^n \ge 0 \\ 1, & \text{if } T_1^n < 0 \text{ and } T_2^n < 0 \end{cases}    (6)

LRP(x, y) = \sum_{i=0}^{3} 2^i \times F(T_1^{2i+1}, T_2^{2i+1})    (7)

III. FRAMEWORK OF PROPOSED ALGORITHM

A. Target object representation

The proposed method is inspired by the local binary pattern, which extracts local information from the neighboring pixels and the center pixel [17]. Ning et al. used the joint histogram of LBP and the RGB color channels for object tracking [12].

In the proposed work, the HSV color space is used for the color information of the target object. It separates the color (hue), brightness (saturation) and intensity (value) components, so that individual information regarding hue, saturation and value can be extracted. The hue, saturation and value components of the HSV color space are quantized in order to reduce the complexity of the algorithm; they are quantized into 18, 3 and 3 bins, respectively. The texture information of the object is captured by the LRP, and the joint histogram of LRP and the hue, saturation and value components is then generated. The local rhombus pattern has a total of 16 features, and hue, saturation and value have 18, 3 and 3 bins, respectively; hence, the total histogram length is 16×18×3×3. The target object is tracked in the subsequent frames using the mean shift tracking algorithm [4].

Fig. 1: Local binary pattern example
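As a minimal illustration of equations (1)-(3) and of the kind of computation depicted in figure 1 (a sketch under stated assumptions, not the authors' implementation), the following Python/NumPy snippet computes the LBP value and its rotation invariant uniform (riu2) code for one interior pixel with p = 8 and r = 1; the clockwise neighbor ordering is an assumption.

import numpy as np

def lbp_riu2(img, x, y):
    """LBP value (eq. 1) and riu2 code (eqs. 2-3) of the interior pixel (x, y), p = 8, r = 1.
    img is a 2-D grayscale NumPy array; the neighbour ordering below is an assumption."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]                        # I_0 ... I_7
    c = img[x, y]
    s = [1 if img[x + dx, y + dy] >= c else 0 for dx, dy in offsets]    # S(I_n - I_c)
    lbp = sum(bit << n for n, bit in enumerate(s))                      # eq. (1)
    u = sum(abs(s[n] - s[n - 1]) for n in range(len(s)))                # 0/1 transitions, eq. (3)
    riu2 = sum(s) if u <= 2 else len(s) + 1                             # eq. (2)
    return lbp, riu2

# example: LBP and riu2 code of the centre pixel of a small patch
patch = np.array([[5, 9, 1], [4, 6, 7], [2, 3, 8]])
print(lbp_riu2(patch, 1, 1))

For p = 8 the riu2 code takes only 10 distinct values (0 to 9), which is where the factor 10 in the LBPriu2 RGB row of Table I comes from.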

Fig. 2: Local rhombus pattern sample window example
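Similarly, a minimal sketch of equations (4)-(7) is given below (again, not the authors' code). The clockwise indexing of the neighbors I_1 ... I_8, starting from the pixel above the center, is an assumption made for illustration, since figure 2(a) is not reproduced here.

def lrp(img, x, y):
    """Local rhombus pattern value (eqs. 4-7) of the interior pixel (x, y) of a 2-D array.
    Neighbours I_1..I_8 are assumed to run clockwise starting from the top."""
    offs = {1: (-1, 0), 2: (-1, 1), 3: (0, 1), 4: (1, 1),
            5: (1, 0), 6: (1, -1), 7: (0, -1), 8: (-1, -1)}
    I = {k: int(img[x + dx, y + dy]) for k, (dx, dy) in offs.items()}
    value = 0
    for i, n in enumerate((1, 3, 5, 7)):              # the four pixels forming the rhombus
        prev = 8 if n == 1 else n - 1                 # eq. (5): I_0 wraps around to I_8
        t1, t2 = I[prev] - I[n], I[n + 1] - I[n]      # eqs. (4)-(5)
        f = 1 if (t1 >= 0) == (t2 >= 0) else 0        # eq. (6): same sign -> 1, else 0
        value += f << i                               # eq. (7): weights 1, 2, 4, 8
    return value                                      # one of 16 possible patterns (0-15)

Under this weight ordering (an assumption about figure 2(h)), the bit values 0, 1, 1 and 1 of the worked example would give 0·1 + 1·2 + 1·4 + 1·8 = 14, and the 16 possible LRP values are what contribute the factor 16 to the 16×18×3×3 joint histogram.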

The algorithm of the proposed system is given in the following sequence; a rough code sketch of the complete loop follows the listing.

B. Algorithm

Input: Video sequence with the location of the target object in the first frame.
Output: Tracked object in the full video.
1) Load the video and select the target object in the first frame for tracking.
2) Compute the LRP of the target object in the first and next frames.
3) Convert the current and next frames from RGB to the HSV color space, and quantize the hue, saturation and value components into 18, 3 and 3 bins, respectively.
4) Create the joint histogram of hue, saturation, value and LRP for the target object in the current and next frames.
5) Track the target object in the next frame using the mean shift tracking algorithm with the joint HSV and LRP histogram.
6) Repeat steps 2 to 5 till the end frame.
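A rough end-to-end sketch of steps 1-6 follows. It assumes OpenCV-style conversions (cv2.cvtColor with COLOR_BGR2HSV and COLOR_BGR2GRAY, so H lies in [0, 180) and S, V in [0, 256)), computes the LRP on the grayscale frame (an assumption, as the paper does not state the channel), reuses lrp() from the earlier sketch, and leaves the mean shift search itself as a hypothetical helper mean_shift_window; none of these names come from the paper.

import numpy as np
import cv2   # assumed available; for 8-bit images H is in [0, 180), S and V in [0, 256)

H_BINS, S_BINS, V_BINS, LRP_BINS = 18, 3, 3, 16      # 18 x 3 x 3 x 16 = 2592 bins

def lrp_image(gray):
    """LRP value of every interior pixel, using lrp() from the earlier sketch."""
    out = np.zeros(gray.shape, dtype=int)
    for i in range(1, gray.shape[0] - 1):
        for j in range(1, gray.shape[1] - 1):
            out[i, j] = lrp(gray, i, j)
    return out

def joint_histogram(hsv, lrp_map):
    """Normalised joint HSV + LRP histogram of one region (steps 3 and 4)."""
    h = np.clip(hsv[..., 0].astype(int) * H_BINS // 180, 0, H_BINS - 1)
    s = np.clip(hsv[..., 1].astype(int) * S_BINS // 256, 0, S_BINS - 1)
    v = np.clip(hsv[..., 2].astype(int) * V_BINS // 256, 0, V_BINS - 1)
    flat = ((h * S_BINS + s) * V_BINS + v) * LRP_BINS + lrp_map
    hist = np.bincount(flat.ravel(), minlength=H_BINS * S_BINS * V_BINS * LRP_BINS)
    return hist / max(hist.sum(), 1)

def track(frames, box):
    """Steps 1-6; mean_shift_window is a hypothetical helper that performs one mean
    shift search, i.e. moves the window to the candidate whose joint histogram is
    closest to the target model."""
    x, y, w, h = box                                    # step 1: target in the first frame
    def region_model(frame, bx, by):                    # steps 2-4 for one window
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # LRP on the grey frame (assumption)
        return joint_histogram(hsv[by:by + h, bx:bx + w],
                               lrp_image(gray)[by:by + h, bx:bx + w])
    target = region_model(frames[0], x, y)
    boxes = [box]
    for frame in frames[1:]:                            # step 6: repeat till the last frame
        x, y = mean_shift_window(frame, target, (x, y, w, h), region_model)   # step 5
        boxes.append((x, y, w, h))
    return boxes

In the mean shift step, the normalized target and candidate histograms are typically compared through the Bhattacharyya coefficient, as in [2], [4].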
IV. EXPERIMENTAL RESULTS AND DISCUSSIONS

In this work, two experiments on different videos have been conducted, and the proposed algorithm is compared with the following two algorithms:
LBPriu2 RGB: rotation invariant uniform local binary pattern + RGB color histogram
LEP RGB: local extrema pattern + RGB color histogram
The proposed method is abbreviated as LRP HSV. In each experiment, the target object is selected manually and marked with a red box. The algorithm first extracts the features and then tracks the required object through the subsequent frames.

In the first experiment, a video sequence of nearly identical moving cars is used. The sequence comprises 201 frames of size 640×480. A car is selected as the target object for tracking and marked in red, as shown in figure 4; in the following frames, the tracked object is likewise shown in red. The results of LBPriu2 RGB, LEP RGB and LRP HSV are shown in figure 4(a), (b) and (c), respectively. It has been observed that up to frame 63, LBPriu2 RGB and LEP RGB tracked the correct object, whereas they lost track of it shortly after frame 63. The failure can be attributed to the fact that, near frame 63, another car passed the target object, and both methods could not identify the correct car of the two. In contrast, the proposed method handled this situation and tracked the correct object till the end, as shown in figure 4(c).

In the second experiment, a video of a football game is employed to track one player. The tracking results of all three methods are demonstrated in figure 3.


Fig. 3: Results of player tracking in the football video: (a) LBPriu2 RGB, (b) LEP RGB and (c) LRP HSV

At the beginning of the video, all three algorithms worked equally well and tracked the correct object, because the target object was almost in isolation and no distracting objects were present nearby. Towards the final frames, however, LBPriu2 RGB and LEP RGB missed the target object and started tracking other, spurious objects. The reason for the incorrect tracking is that, towards the end of the video, other players in a similar (white) dress came close to the target object, and both methods failed to distinguish the target object from the others. In the proposed method, by contrast, the target object is tracked correctly till the end of the video, as shown in figure 3(c).

The feature vector lengths of all three methods are given in table I. The feature vector length of the proposed method is considerably smaller than those of the other two approaches.

TABLE I: Feature vector length of the proposed method and previous methods

Method       | Feature vector length    | Time taken
LBPriu2 RGB  | 10 × 8 × 8 × 8 = 5120    | 1:08
LEP RGB      | 16 × 8 × 8 × 8 = 8192    | 1:15
LRP HSV      | 16 × 18 × 3 × 3 = 2592   | 1:02

V. CONCLUSION

A novel algorithm in the field of object tracking is proposed. The proposed LRP extracts texture features from the local relationship among neighboring pixels, and the HSV color space is used for the color features. A joint histogram is then constructed from these color-texture features, and the proposed method is applied to object tracking. Tracking is performed on two video sequences, of traffic and sports, using the mean shift tracking algorithm. The experimental results show that the proposed method performs significantly better than the LBPriu2 RGB and LEP RGB algorithms.

REFERENCES

[1] A. Yilmaz, O. Javed, and M. Shah, “Object tracking: A survey,” ACM Computing Surveys (CSUR), vol. 38(4), 13, 2006.
[2] D. Comaniciu, V. Ramesh, and P. Meer, “Real-time tracking of non-rigid objects using mean shift,” Computer Vision and Pattern Recognition, IEEE Conference on, vol. 2, pp. 142–149, 2000.


Fig. 4: Object tracking in the road traffic video: results of (a) LBPriu2 RGB, (b) LEP RGB and (c) LRP HSV

[3] R. Cucchiara, C. Grana, M. Piccardi, A. Prati, and S. Sirotti, “Improving shadow suppression in moving object detection with HSV color information,” in Intelligent Transportation Systems, Proceedings IEEE, pp. 334–339, 2001.
[4] D. Comaniciu, V. Ramesh, and P. Meer, “Kernel-based object tracking,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 25(5), pp. 564–577, 2003.
[5] K. She, G. Bebis, H. Gu, and R. Miller, “Vehicle tracking using on-line fusion of color and shape features,” Intelligent Transportation Systems, 7th International IEEE Conference on, pp. 731–736, 2004.
[6] R. Babu, R. Venkatesh, and P. Parate, “Robust tracking with interest points: A sparse representation approach,” Image and Vision Computing, vol. 33, pp. 44–56, 2015.
[7] G. Zhao and M. Pietikainen, “Local binary pattern descriptors for dynamic texture recognition,” Pattern Recognition, 18th IEEE International Conference on, vol. 2, pp. 211–214, 2006.
[8] M. Heikkila and M. Pietikainen, “A texture-based method for modeling the background and detecting moving objects,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 28(4), pp. 657–662, 2006.
[9] V. Takala and M. Pietikainen, “Multi-object tracking using color, texture and motion,” Computer Vision and Pattern Recognition, IEEE Conference on, pp. 1–7, 2007.
[10] F. Pernici and A.D. Bimbo, “Object tracking by oversampling local features,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 36(12), pp. 2538–2551, 2014.
[11] L. Wang, T. Liu, G. Wang, K.L. Chan, and Q. Yang, “Video tracking using learned hierarchical features,” Image Processing, IEEE Transactions on, vol. 24(4), pp. 1424–1435, 2015.
[12] J. Ning, L. Zhang, D. Zhang, and C. Wu, “Robust object tracking using joint color-texture histogram,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 23(07), pp. 1245–1263, 2009.
[13] G. Xue, J. Sun, and L. Song, “Dynamic background subtraction based on spatial extended center-symmetric local binary pattern,” Multimedia and Expo, IEEE International Conference on, pp. 1050–1054, 2010.
[14] S. Murala, R.P. Maheshwari, and R. Balasubramanian, “Local maximum edge binary patterns: a new descriptor for image retrieval and object tracking,” Signal Processing, vol. 92(6), pp. 1467–1479, 2012.
[15] P.P. Dash, D. Patra, and S.K. Mishra, “Local binary pattern as a texture feature descriptor in object tracking algorithm,” in Intelligent Computing, Networking, and Informatics, pp. 541–548, Springer, 2014.
[16] J. Liu, P. Carr, R.T. Collins, and Y. Liu, “Tracking sports players with context-conditioned motion models,” in Computer Vision and Pattern Recognition (CVPR), IEEE Conference on, pp. 1830–1837, 2013.
[17] T. Ojala, M. Pietikäinen, and D. Harwood, “A comparative study of texture measures with classification based on featured distributions,” Pattern Recognition, vol. 29(1), pp. 51–59, 1996.
[18] T. Ahonen, A. Hadid, and M. Pietikäinen, “Face recognition with local binary patterns,” European Conference on Computer Vision, pp. 469–481, Springer, 2004.
[19] C. Shan, S. Gong, and P.W. McOwan, “Robust facial expression recognition using local binary patterns,” Image Processing, IEEE International Conference on, vol. 2, pp. II:914–917, 2005.
[20] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24(7), pp. 971–987, 2002.
[21] S. Liao, M.W. Law, and A.C. Chung, “Dominant local binary patterns for texture classification,” Image Processing, IEEE Transactions on, vol. 18(5), pp. 1107–1118, 2009.
[22] Z. Guo and D. Zhang, “A completed modeling of local binary pattern operator for texture classification,” Image Processing, IEEE Transactions on, vol. 19(6), pp. 1657–1663, 2010.
[23] M. Heikkilä, M. Pietikäinen, and C. Schmid, “Description of interest regions with local binary patterns,” Pattern Recognition, vol. 42(3), pp. 425–436, 2009.
[24] S. Murala, R.P. Maheshwari, and R. Balasubramanian, “Directional local extrema patterns: a new descriptor for content based image retrieval,” International Journal of Multimedia Information Retrieval, vol. 1(3), pp. 191–203, 2012.
[25] X. Tan and B. Triggs, “Enhanced local texture feature sets for face recognition under difficult lighting conditions,” in Analysis and Modeling of Faces and Gestures, pp. 168–182, Springer, 2007.
[26] S. Murala, R.P. Maheshwari, and R. Balasubramanian, “Local tetra patterns: a new feature descriptor for content-based image retrieval,” Image Processing, IEEE Transactions on, vol. 21(5), pp. 2874–2886, 2012.
