Robust, Real Time People Tracking With Shadow Removal in Open Environment


Robust, Real Time People Tracking with Shadow Removal in Open Environment

Ching-Tang Hsieh, Eugene Lai, Yeh-Kuang Wu and Chih-Kai Liang
Department of Electrical Engineering, Tamkang University, Taipei, Taiwan, R.O.C.
e-mail: hsieh@ee.tku.edu.tw
Abstract

This paper presents a framework for tracking people in unconstrained environments using the wavelet transform and the Kalman filter. We simultaneously adopt the maximum and minimum variances of the color information in the trunk of the tracked person as the tracked features. Shadows, however, are one of the environmental factors that disturb the processing of monitored images. To make the system more robust, we devise a shadow-removal scheme that uses a multi-resolution method together with color information to eliminate shadows. The Kalman filter and the diamond search algorithm (DSA) are adopted as the estimator in this system. Experiments show that the proposed system performs well against complicated backgrounds.

1 Introduction

Detecting and tracking moving people is becoming an important issue in several applications such as camera-based surveillance and human-machine interaction. The results of this research can be applied to monitoring a specific area or to planning a path for pedestrians. However, tracking objects in complicated environments or in real time remains the key problem of visual surveillance.

Shadow is one of the environmental factors that influence the processing of monitored images. An object is often accompanied by its shadow, and the motion-detecting procedure will detect the shadow as a moving object. To detect several moving objects by image processing, it is necessary to separate each object from its shadow [15][20]. McKenna et al. [15] assumed that any significant chromatic change could have been caused by shadow, and they removed the shadow by using the gradient properties of shadows. Kuo et al. [18] proposed a shadow-removal scheme that excludes, from the color histogram of a moving person, the pixels whose chromaticity is very similar to that of the corresponding background regions. But they could not clearly distinguish the shadow from black pants or other dark clothing. Zhao et al. [19] used morphological operations to remove the shadow pixels. Chang et al. [20] detected the shadows by calculating the orientation of the tracked people and the boundary lines, and proposed a Gaussian shadow model to remove the shadows. But they could not remove the shadows completely when the shadows are near the tracked object or overlap the tracked people.

Recently, several systems for people tracking have been proposed [1]-[11]. This research can be classified into two categories: one focuses on feature searching, and the other on motion compensation.

Feature extraction is an important part of a tracking system, and many different methods of feature searching have been proposed. A model of the person has been used as the tracking feature [1][2][3]. Some tracking systems use the extremes of curvature on the bounding contour of the segmented regions, the uniform brightness regions or the color regions as the tracking features [4][5][6][7][8]. Others combine two or more features, such as the object depth and the color information [9][10][11]. Tracking systems that use the texture of the color information are more robust against perturbations in illumination.

Motion estimation also plays an important role in a tracking system. The full search algorithm (FSA) gives the optimal performance at the expense of a huge amount of computation. Several fast search algorithms have been proposed to reduce the computational complexity, such as the diamond search algorithm (DSA) [12], the new three-step search algorithm (NTSS) [13] and the three-step search algorithm (TSS) [14]. Compared with the TSS algorithm, the DSA achieves comparable performance and requires less computation. The DSA is therefore suitable for estimating or compensating the motion in a real-time system.

The Kalman filter is an effective estimator [22]. Based on the Gauss-Markov signal model, a sequential minimum mean square error estimator is constructed to allow the unknown parameters to change in time. This approach calculates the value of the estimator from the estimate at the previous time and hence is recursive in nature. In our system, the Kalman filter is used as the estimator that predicts the motion of the tracked object. Good performance is achieved by a recursive procedure consisting of two stages: prediction and compensation.

In our system, we adopt the discrete wavelet transform (DWT) and the Kalman filter [21] to detect and track the moving people in the video sequence. After a two-level DWT decomposition, the coefficients of the coarsest sub-band are selected, which reduces the noise and the isolated points of the image. We simultaneously adopt the maximum and minimum variances of the color information in the trunk of the tracked person as the tracked features. The variances of the coefficients do not vary when the luminance changes during tracking. A shadow-removal technique is proposed to improve the tracking system. Hence, we can achieve good performance whether the background is complicated or not.

The paper is organized as follows. Section 2 gives a preview of the Kalman filter. The shadow-removal algorithm is described in Section 3. Section 4 presents the general architecture of the proposed system. Experimental results are reported in Section 5, and the conclusion is presented in Section 6.

2 Preview

The operation of the Kalman filter is a recursive procedure consisting of two stages: prediction and compensation. The prediction stage estimates the movement, and the compensation stage assesses the accuracy of the estimated motion vector by means of the motion estimation algorithm and the Kalman gain.

2.1 Prediction stage

We generate a predicted motion model from the motion of the previous block as in eq(1), and obtain the minimum prediction error between the current block and the previous block as in eq(2).

$$\mathrm{Pre\_P}(t) = a(t)\,P(t-1) + W(t) \qquad (1)$$

$$\mathrm{Pre\_M}(t) = a(t)^{2}\,M(t-1) \qquad (2)$$

where Pre_P(t) is the prediction of the location of the feature point at frame t, and P(t-1) is the estimated location at the previous frame t-1. a(t) is the weighting value related to the unbiased estimator, W(t) is the noise, and M(t) is the minimum prediction error.

2.2 Compensation stage

We compensate Pre_P(t) to P(t) by the Kalman gain; the procedure is described in Section 4.3. The Kalman gain and the update of P(t) are calculated by eq(3) and eq(4).

$$K(t) = \frac{\mathrm{Pre\_M}(t)}{\mathrm{Pre\_M}(t) + ra} \qquad (3)$$

$$P(t) = \mathrm{Pre\_P}(t) + K(t)\,[\,Z(t) - \mathrm{Pre\_P}(t)\,] \qquad (4)$$

where ra is the noise.

The measurement vector Z(t) is obtained by a motion estimation algorithm such as DSA. Then we update the Kalman gain K(t) by using the correlation between the minimum prediction error and the Gaussian variance, update a(t) by the ratio of P(t) to Pre_P(t), and calculate the minimum prediction error by eq(5).

$$M(t) = [\,1 - K(t)\,]\,\mathrm{Pre\_M}(t) \qquad (5)$$

3 Removal of Shadow

When the tracked people are accompanied by shadows, the accuracy of the tracking system decreases. The color distribution of a shadow possesses two properties. The first is that the chromaticity and luminance are lower in the shadow area. The second is that the shadow has a higher density of low-chromaticity values. The steps of the shadow-removal technique are described as follows.

3.1 Color extraction

According to the first property mentioned above, we extract the coefficients with low chromaticity and luminance located in the detected area. The rest, which have large luminance, such as clothes and face components, are excluded. The function of color extraction is described in eq(6).

$$F'(x,y) = \begin{cases} 255, & \text{if } F_{Y}(x,y) > V_{Y} \text{ or } F_{I,Q}(x,y) > V_{I,Q} \\ F(x,y), & \text{otherwise} \end{cases} \qquad (6)$$

where F'(.) is the coefficient of the color-extracted image, F(.) is the coefficient of the original image, F_{Y,I,Q}(.) is the respective coefficient of the tracked image in the Y, I, Q color space, and V_{Y,I,Q} is the respective threshold value in the Y, I, Q color space.

3.2 Projection

After the procedure of Section 3.1, we obtain the areas with lower chromaticity and luminance, such as shadow, hair and black suits. After a one-level decomposition by the wavelet transform, the X-axis and Y-axis information in the different sub-bands is projected to the low frequency component F'_L and the high frequency component F'_H. We then obtain the products S'_x and S'_y by eq(7).

$$S'_{x}(x) = F'_{H}(x)\,F'_{L}(x), \qquad S'_{y}(y) = F'_{H}(y)\,F'_{L}(y) \qquad (7)$$

3.3 Shadow removal

The high frequency information of the shadow area is less than that of other dark areas such as hair and black suits, because the variation in chromaticity is smaller. The values of the low frequency coefficients are also smaller, since the chromaticity and luminance of the shadow areas are lower. The areas where the values of S'_x(x) and S'_y(y) are lower than the threshold values are defined as shadow, and the background information at the corresponding positions is recovered. The block diagram of the proposed shadow-removal algorithm is shown in Figure 1.
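The three steps above can be sketched in code. The following is a minimal illustration, not the implementation used in the experiments. It assumes a Haar wavelet (the text does not name the wavelet), compares the magnitudes of the I and Q coefficients with a single threshold, and merges the three detail sub-bands into one high frequency band. The function names and all threshold values are hypothetical.

```python
def color_extract(y, i, q, v_y, v_iq):
    """Eq. (6): keep dark, low-chroma coefficients and set all others to 255."""
    out = []
    for row_y, row_i, row_q in zip(y, i, q):
        out.append([
            255.0 if (fy > v_y or abs(fi) > v_iq or abs(fq) > v_iq) else float(fy)
            for fy, fi, fq in zip(row_y, row_i, row_q)
        ])
    return out


def haar_level1(img):
    """One-level 2-D Haar split of an image with even height and width.

    Returns (low, high): low holds the 2x2 block means, high holds the summed
    magnitudes of the three detail coefficients of each block (an assumption
    made to obtain a single high frequency band).
    """
    low, high = [], []
    for r in range(0, len(img) - 1, 2):
        low_row, high_row = [], []
        for c in range(0, len(img[0]) - 1, 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            low_row.append((a + b + d + e) / 4.0)
            details = abs(a - b + d - e) + abs(a + b - d - e) + abs(a - b - d + e)
            high_row.append(details / 4.0)
        low.append(low_row)
        high.append(high_row)
    return low, high


def project(band, axis):
    """Project a band onto the x axis (column sums) or the y axis (row sums)."""
    if axis == "x":
        return [sum(col) for col in zip(*band)]
    return [sum(row) for row in band]


def shadow_product(img, axis):
    """Eq. (7): product of the high and low frequency projections."""
    low, high = haar_level1(img)
    return [h * l for h, l in zip(project(high, axis), project(low, axis))]


def shadow_positions(profile, threshold):
    """Section 3.3: positions whose product falls below the threshold."""
    return [k for k, s in enumerate(profile) if s < threshold]


# Toy 4x8 luminance image: a flat dark region on the left (shadow-like) and a
# dark but textured region on the right (clothing-like). I and Q are zero.
Y = [[40] * 4 + [30 if (r + c) % 2 == 0 else 90 for c in range(4, 8)]
     for r in range(4)]
I = [[0] * 8 for _ in range(4)]
Q = [[0] * 8 for _ in range(4)]

extracted = color_extract(Y, I, Q, v_y=100, v_iq=20)   # hypothetical thresholds
profile_x = shadow_product(extracted, "x")
print(profile_x)
print(shadow_positions(profile_x, threshold=1000.0))    # hypothetical threshold
```

In this toy example the flat region yields a zero product and is marked as shadow, while the textured region is kept. In the system described here, the marked positions would then be filled with the stored background.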

In Figure 2, (a) and (b) are the original tracked images: (a) shows a person with a shadow, and (b) shows a person wearing a black shirt. The tracked objects are detected in (c) and (d). (e) and (f) are the histograms of the x-axis projection of the low frequency components after the wavelet transform, and (g) and (h) are those of the high frequency components. The magnitudes of the histograms in the shadow areas in Figure 2(e)(g) are smaller than those in the non-shadow areas in Figure 2(f)(h).
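The tracking system of Section 4 builds on the estimator previewed in Section 2. As a reference point, one prediction and compensation cycle for a single coordinate can be sketched as below, reading eq(3) as the ratio Pre_M(t) / (Pre_M(t) + ra) and eq(5) as [1 - K(t)] Pre_M(t), the usual scalar forms. This is a simplified illustration under stated assumptions: a(t) is held constant instead of being updated by the ratio of P(t) to Pre_P(t), the noise W(t) is replaced by an assumed zero mean, and the measurement Z(t) is passed in as a plain number where the system would obtain it from DSA. The function name and all numeric values are hypothetical.

```python
def kalman_step(p_prev, m_prev, z, a=1.0, ra=1.0):
    """One prediction/compensation cycle for one coordinate of a feature point.

    p_prev, m_prev: previous estimate P(t-1) and prediction error M(t-1)
    z:              measurement Z(t), e.g. the position found by block matching
    Returns (P(t), M(t), K(t)).
    """
    pre_p = a * p_prev               # eq. (1), noise W(t) taken as zero (assumption)
    pre_m = a * a * m_prev           # eq. (2)
    k = pre_m / (pre_m + ra)         # eq. (3)
    p = pre_p + k * (z - pre_p)      # eq. (4)
    m = (1.0 - k) * pre_m            # eq. (5)
    return p, m, k


# Hypothetical x-coordinates measured in three consecutive frames.
p, m = 100.0, 4.0
for z in (110.0, 120.0, 126.0):
    p, m, k = kalman_step(p, m, z, a=1.0, ra=4.0)
    print(f"z={z:6.1f}  gain={k:.3f}  estimate={p:.2f}  error={m:.3f}")
```

With these settings the gain shrinks from frame to frame, because eq(2) carries no additional noise term, so the estimate relies less and less on new measurements.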

Figure 1: The block diagram of the shadow-removal technique.

Figure 2: (a) is the tracked frame with shadow. (b) is the tracked frame with people. (c) and (d) are the detected moving object images. (e) and (f) are the histograms of the low frequency information. (g) and (h) are the histograms of the high frequency information.

4 The Proposed Tracking System

The general architecture of the proposed system is shown in Figure 3. To reduce the influence of brightness and of disturbances in the shade of color, we transform the RGB model into the YIQ model.

4.1 Object detection

After the two-level DWT, a background image without any object is set as the reference image. We estimate the block variance to detect the object. If the difference in luminance variance between the reference and the current frame is large enough, the difference of the two frames is recorded in S(x,y). An object is detected as an entering object in the defined section if the total value of the variance in that section is large. Other entering creatures that are not human will not be detected as a tracked object, because the total value of the variances in the specific section is small.

4.2 Feature extraction

Compared with the background, a moving object produces a large variation in the luminance information. The moving object is separated from the background according to the statistics of the variance histogram. After the shadow-removal scheme has been applied, the shadows are eliminated. Compared with other parts of the body such as the arms, legs and face, the movement of the trunk is smoother when a person is moving. In this paper, we choose the maximum and the minimum variances with a 3*3 mask in the trunk region as our reference features to improve the robustness of the system. If the tracked person wears colored clothes with a complicated texture, the maximum value will be large, while smoothly shaded colors are described by the minimum variance.

4.3 Motion estimation and feature matching

In the prediction stage, we predict the approximate location of the feature in the next frame. DSA, an effective search method, is used as the measurement method of the compensation stage. Because DSA combines search paths in the horizontal, vertical and diagonal directions, it is not easily trapped in a local minimum when the motion content is large.

4.4 Feature labeling

We label the maximum and the minimum variances in the current frame, which completes the compensation stage of the Kalman filter operation. Sections 4.1-4.3 are repeated, and the system stops tracking when the tracked object leaves the monitored view.

5 Experimental results

The system has been tested on several video sequences of moving people in different indoor and outdoor environments with a fixed camera. The test frames are RGB color images. All experiments are simulated on a 1.7GHz Pentium4 PC, and the size of the frames is 320*240 pixels.

Figure 4(a) shows a tracked image with shadow, and the background image is shown in Figure 4(b). After the object detection procedure, we obtain the tracked object; the white part in Figure 4(c) shows the detected object. Then the color extraction step is carried out to determine whether each part of the detected object belongs to the person or to the shadow. Figure 4(d) shows the result of the color extraction procedure. The higher chromaticity components are removed and represented by white points. Figure 4(e) shows the image after the one-level wavelet transform. Figure 4(f) shows that we can detect the shadow and successfully recover the shadow area as background information.

We simulate the tracking system with and without the shadow-removal algorithm. In the experiments shown in Figure 5, the people tracking system without shadow removal produces some erroneous results: the shadow area is detected as a tracked person, which leads to wrong tracking results. The tracking system with the shadow-removal algorithm can distinguish the shadow from the tracked people. Figure 6 shows that it can track the people completely in the presence of the shadow.

In Figures 7 and 8, the red line indicates the tracking locus of the maximum feature, and the green line that of the minimum feature. Figure 7 shows the tracking results when a person is in an outdoor place where swinging leaves complicate the background; the person is tracked successfully in an environment with a lot of noise. In Figure 8, the tracked person in a red shirt passes through a hallway beside a red wall. The tracking of the maximum feature point is lost in Figure 8(a~d), but we can still track the person successfully by the minimum feature point.
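The maximum and minimum feature points referred to here are the 3*3 variances of Section 4.2. A minimal sketch of how such a pair could be located is given below. It assumes a sliding 3*3 window over a single channel of coefficients and a rectangular trunk region given by hypothetical bounds, neither of which is specified in the text; the function names are likewise hypothetical.

```python
def local_variance(img, r, c):
    """Variance of the 3x3 neighbourhood centred at (r, c)."""
    vals = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
    mean = sum(vals) / 9.0
    return sum((v - mean) ** 2 for v in vals) / 9.0


def trunk_features(img, top, bottom, left, right):
    """Scan the region rows [top, bottom) and columns [left, right).

    Returns ((max_variance, (row, col)), (min_variance, (row, col))).
    Window centres are kept one pixel away from the image border so that
    every 3x3 window is complete. Ties keep the first position found.
    """
    best_max = best_min = None
    for r in range(max(top, 1), min(bottom, len(img) - 1)):
        for c in range(max(left, 1), min(right, len(img[0]) - 1)):
            v = local_variance(img, r, c)
            if best_max is None or v > best_max[0]:
                best_max = (v, (r, c))
            if best_min is None or v < best_min[0]:
                best_min = (v, (r, c))
    return best_max, best_min


# Toy 5x7 channel: uniform clothing with one strongly textured spot.
img = [[50] * 7 for _ in range(5)]
img[2][5] = 140

max_feature, min_feature = trunk_features(img, 0, 5, 0, 7)
print("maximum variance feature:", max_feature)
print("minimum variance feature:", min_feature)
```

Keeping both extremes gives the tracker a fallback: a textured spot is distinctive but can blend into a similarly colored background, whereas a smooth area may remain trackable in that case, as the Figure 8 experiment suggests.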
Figure 3: The block diagram of the proposed system. (Blocks: Background, Input, New Frame, DWT, Object Detection, Shadow Removal, Feature Extraction, Kalman Filter, Motion Compensation, Feature Labeling, IDWT, Output.)

Figure 4: (a)-(f) The experimental results of the proposed shadow-removal technique.

Figure 5: (a)-(d) The results of the tracking system without shadow-removal algorithm.

Figure 6: (a)-(d) The results of the tracking system with shadow-removal algorithm.

Figure 7: (a)-(d) The tracking results when people are in a complicated outdoor background.

Figure 8: (a)-(d) The tracking results when the object's color is similar to the color of the background.

6 Conclusions

This paper studies how to use the wavelet transform and the Kalman filter for tracking people in video sequences. To obtain improved results against complicated backgrounds, we choose the maximum and the minimum variances of the color information as features. We propose a shadow-removal scheme to make the system more robust. Moreover, the prediction and matching of the feature vector are accomplished by using the Kalman filter and DSA. The successful experimental results show that the wavelet transform and the Kalman filter can be used effectively for people tracking.

References

[1] Wachter, S., and Nagel, H.-H., Tracking of persons in monocular image sequence, Proc. IEEE on Nonrigid and Articulated Motion Workshop, 1997, pp. 2-9.
[2] Tesei, A., Foresti, G.L., and Regazzoni, C.S., Human body modelling for people localization and tracking from real image sequences, Proc. Int. Conf. on Image Processing and its Applications, 1995, pp. 806-809.
[3] Ude, A., and Riley, M., Prediction of body configurations and appearance for model-based estimation of articulated human motions, Proc. IEEE on SMC, 1999, Vol.2, pp. 687-691.
[4] Yamane, T., Shirai, Y., and Miura, J., Person tracking by integrating optical flow and uniform brightness regions, Proc. IEEE Int. Conf. on Robotics and Automation, 1998, Vol.4, pp. 3267-3272.
[5] Okada, R., Shirai, Y., and Miura, J., Tracking a person with 3-D motion by integrating optical flow and depth, Proc. IEEE Int. Conf. on Automatic Face and Gesture Recognition, 2000, pp. 336-341.
[6] Segen, J., A camera-based system for tracking people in real time, Proc. Int. Conf. on Pattern Recognition, 1996, Vol.3, pp. 63-67.
[7] Rossi, M., and Bozzoli, A., Tracking and counting moving people, Proc. IEEE Int. Conf. on Image Processing, ICIP-94, 1994, Vol.3, pp. 212-216.
[8] Dobie, M.R., and Lewis, P.H., Object tracking in multimedia systems, Proc. Int. Conf. on Image Processing and its Applications, 1992, pp. 41-44.
[9] Kompatsiaris, I., Mantzaras, G., and Strintzis, M.G., Spatiotemporal segmentation and tracking of objects in color image sequences, Proc. Int. Conf. on Circuits and Systems, 2000, Vol.5, pp. 29-32.
[10] Heisele, B., Kressel, U., and Ritter, W., Tracking Non-Rigid, Moving Objects Based on Color Cluster Flow, Proc. IEEE Conf. on Computer Vision and Pattern Recognition, 1997, pp. 257-260.
[11] Heisele, B., and Ritter, W., Obstacle detection based on color blob flow, Proc. on Intelligent Vehicles '95 Symposium, 1995, pp. 282-286.
[12] Shan Z., and Ma K.K., A new diamond search algorithm for fast block-matching motion estimation, IEEE Trans. on Image Processing, Feb., 2000, Issue: 2, pp. 287-290.
[13] Li, R.X., Zeng, B., and Liou, M.L., A new three-step search algorithm for block motion estimation, IEEE Trans. on Circuits and Systems for Video Technology, Aug., 1994, Vol.4, pp. 438-442.
[14] Lu, J.H., and Liou, M.L., A simple and efficient search algorithm for block-matching motion estimation, IEEE Trans. on Circuits and Systems for Video Technology, April, 1997, Vol.7, pp. 429-433.
[15] McKenna, S.J., Jabri, S., Duric, Z., and Wechsler, H., Tracking interacting people, Proc. IEEE Int. Conf. on Automatic Face and Gesture Recognition, 2000, pp. 348-353.
[16] Wu Y.M., Ye X.Q., and Gu W.K., A shadow handler in traffic monitoring system, Proc. IEEE Int. Conf. on Vehicular Technology, 2002, pp. 303-307.
[17] Sonoda, Y., and Ogata, T., Separation of moving objects and their shadows, and application to tracking of loci in the monitoring images, Proc. Int. Conf. on Signal Processing, 1998, pp. 1261-1264.
[18] Kuo, C.M., Hsieh, C.H., Lin, H.C., and Lu, P.C., Motion estimation algorithm with kalman filter, Electronics Letters, 1994, Vol. 30, pp. 1204-1206.
[19] Zhao, T., Nevatia, R., and Fengjun Lv, Segmentation and tracking of multiple humans in complex situations, Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition, 2001, pp. 194-201.
[20] Chang, C.J., Hu, W.F., and Hsieh, J.W., Shadow elimination for effective moving object detection with Gaussian models, Proc. Int. Conf. on Pattern Recognition, 2002, pp. 540-543.
[21] Lu, W.M., and Tan, Y.P., A color histogram based people tracking system, Proc. IEEE Int. Conf. on Circuits and Systems, 2001, pp. 137-140.
[22] Kay, S.M., Fundamentals of statistical signal processing: estimation theory, Prentice Hall, 1993.
