
Adaptive algorithm to identify anomalies in moving objects using computer vision


Paola A. Mateus Cesar L. Nino
Pontifical Xaverian University Pontifical Xaverian University
mateusp@javeriana.edu.co cesar.nino@javeriana.edu.co

978-1-5090-2532-9/16/$31.00 © 2016 IEEE

Abstract - In this document, an adaptive algorithm is proposed to identify anomalies in moving objects, such as pedestrians, cars, motorcyclists and cyclists. The anomalies detected by this algorithm are: occlusions, an object partially entering or exiting a frame, alterations in the object velocities, or collisions between two or more objects in the frame. Frames are classified as anomalous or non-anomalous by finding an adaptive threshold that depends on the video sequence.

I. INTRODUCTION

Nowadays, one of the main problems of video surveillance is the detection of anomalies while monitoring multiple cameras at the same time. To solve this problem, this project proposes and develops an adaptive algorithm that works in uncontrolled scenes and automatically detects a variety of anomalies, such as occlusions, drastic changes in the velocities of the objects, or at least two objects colliding in the same frame.
The remainder of this work is organized as follows: In Section II, related works are described. In Section III, a summary of the different techniques in the algorithm is given. Some of these techniques are background subtraction, apparent mass and kinetic energy estimation, recursive least squares and an unsupervised classifier. Finally, the experimental results on different video clips are presented in Section IV.

II. RELATED WORK

Different works have presented solutions for anomaly detection in surveillance videos. Some of these solutions are based on detection, segmentation and tracking [1-6]. Other projects [7-9] used temporal features, such as local velocity and local movement. Finally, in projects [10] and [11] two kinds of anomalies are detected through kinetic energy: when people are exiting the boundary of the frame or when the frame has crowded scenes.
All the aforementioned methods use complex operations, such as object tracking with a high number of trajectories and people counting through the video, to name just a few, which only offer a partial, limited solution to the problem of detecting anomalies in video. Also, many of these projects run their tests in controlled settings, where a scene is arranged and labelled as anomalous, and then the algorithm detects this behavior when a similar scene takes place.

III. SYSTEM OVERVIEW

A. Segmentation of Moving Objects

The correct segmentation of the moving objects in each frame is very important to obtain the relevant kinetic energy and area throughout the video. These two features allow the detection of the different changes and anomalies within that frame.
The foreground extraction was developed as follows:
The first step is the use of three binarization techniques: Gaussian Mixture Modelling [12] (GMM) (see Figure 1b), Morphological Reconstruction [13] (see Figure 1d) and Motion Detection (see Figure 1c).
The second step involves an AND operation between each possible pair of the three images obtained from the previous step (see Figures 1e-g).
In the third and final step, an OR operation is applied to all the images obtained previously in order to get a single binary image I (see Figure 1h).

B. Feature Extraction

In order to identify the anomalies in the video, two features are proposed: the apparent mass and the kinetic energy.
The apparent mass is proposed knowing that the volume of an object is a physical and solid characteristic that should not change throughout the video. Drastic changes in this feature can only occur when there are occlusions or when an object partially enters or exits the frame.
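The three-step foreground combination of Section III-A (pairwise AND, then OR) can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes the three binarization outputs are already available as binary arrays of the same size, NumPy is an assumed dependency, and the function name `combine_masks` and the toy masks are hypothetical.

```python
import numpy as np


def combine_masks(gmm, morph, motion):
    """Pairwise AND of three binary masks, followed by an OR of the results."""
    gmm, morph, motion = (np.asarray(m, dtype=bool) for m in (gmm, morph, motion))
    # Step 2: AND between each possible pair of masks (Figures 1e-g).
    gmm_and_motion = gmm & motion
    gmm_and_morph = gmm & morph
    motion_and_morph = motion & morph
    # Step 3: OR of the three pairwise results gives the single binary image I.
    return gmm_and_motion | gmm_and_morph | motion_and_morph


# Toy 2x3 masks (hypothetical data standing in for the three detectors).
gmm = np.array([[1, 1, 0], [0, 1, 0]])
morph = np.array([[1, 0, 0], [0, 1, 1]])
motion = np.array([[0, 1, 0], [1, 1, 0]])
combined = combine_masks(gmm, morph, motion)
```

Note that the OR of the three pairwise ANDs keeps exactly the pixels marked as foreground by at least two of the three methods, so the combination behaves like a two-out-of-three majority vote.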
Figure 1. (a) Original image. Foreground detection with: (b) Gaussian Mixture Modelling, (c) Motion detection, (d) Morphological reconstruction. The resulting images from the AND operation between: (e) GMM and motion detection, (f) GMM and morphological reconstruction and (g) Motion detection and morphological reconstruction. (h) Foreground detection resulting from the combination of the three methods.

Drastic changes in the kinetic energy allow detecting alterations in the velocities of the objects or collisions between two or more objects in the frame.
To calculate the first feature (apparent mass), it is necessary to estimate the volume by correcting the area of each moving object using the vanishing point. The vanishing point is found by detecting two parallel lines in the original RGB image. Then, the distance $d_i$ is calculated from the centroid of each moving object i to the vanishing point.
With this distance, the apparent mass is obtained as:

$$M(u, v) = \sum_{i=1}^{n} d_i \, I_i(u, v), \qquad (1)$$

where M(u, v) is the apparent mass matrix, in which the background is equal to zero and the foreground has strictly positive values. The foreground values correspond to the distances $d_i$ of each moving object i, n is the total number of moving objects in the frame, $I_i$ is the (U, V) matrix of the binary image containing only the i-th moving object, and U and V are the height and width of the real image, respectively.
In order to estimate the second feature (kinetic energy), a pixel-by-pixel moving window P(i, j) is proposed to make a horizontal sweep of the frame. This window is created by selecting a neighbourhood of I × J pixels from the apparent mass matrix M(u, v). This neighbourhood is centered around a pixel of interest, where (i, j) are the pixel coordinates in the matrix.
According to [11], the foreground entropy H(u, v) is the dispersion of the foreground in the horizontal and vertical directions. Therefore, the new apparent mass matrix that uses this foreground entropy is defined as the Crowd Dispersion Index (CDI), and it is calculated as:

$$CDI(u, v) = \frac{\left( \sum_{i=u-\frac{U}{2}}^{u+\frac{U}{2}} \sum_{j=v-\frac{U}{2}}^{v+\frac{U}{2}} P(i, j) \right)^{2}}{H(u, v)^{3}}, \qquad (2)$$

Finally, the kinetic energy of each frame is defined as follows:

$$E_k(u, v) = CDI(u, v) \ast \sum_{i=u-\frac{U}{2}}^{u+\frac{U}{2}} \sum_{j=v-\frac{U}{2}}^{v+\frac{U}{2}} v_{ij}^{2}, \qquad (3)$$

where $v_{ij}$ is the velocity obtained through the optical flow algorithm of Brox and Malik [14].

C. Recursive Least Squares Filters

Two recursive least squares (RLS) filters are proposed: one to estimate the total apparent mass and the other to estimate the total kinetic energy in the frame k.
The RLS filter is calculated as:

$$\hat{y}(k) = w^{T}(k-1) \, g(k) \qquad (4)$$

where $\hat{y}$ is either the estimate of the total apparent mass $\hat{M}_t$ or the estimate of the total kinetic energy $\hat{E}_{kt}$, depending on the filter; w(k−1) are the respective filter coefficients (4 for the total apparent mass and 8 for the total kinetic energy); and g is the vector of the buffered input samples at the frame k, with $g = M_t$ for the total apparent mass or $g = E_{kt}$ for the total kinetic energy.
Two instantaneous errors are calculated to determine whether there are drastic changes in the total apparent mass and the total kinetic energy, respectively.
Taking into account that the proposed RLS filters track the real signal of the total apparent mass and the total kinetic energy throughout the video, the absolute error is assumed to be lognormal. Therefore, the histogram of the log error has a normal distribution when there are no anomalies.
Two frame-by-frame moving windows are used to find a classification threshold that adapts to different scenes in the video. One window is linked to the total apparent mass instantaneous error and the other one to the total kinetic energy instantaneous error. This classification method is unsupervised since there is no previous training.
These moving windows select the data from each error, and then a histogram of the log of the absolute error is created for each of them.
A probability density function (PDF) that fits each histogram has to be found. The PDFs that best fit the different distributions of the histograms are as follows. First, when the scene has no anomalies, the output of the proposed RLS filter changes smoothly, which makes the error histogram lognormal. Therefore, this distribution is modelled as a Gaussian PDF. Second, when the scene has anomalies, the histogram changes its distribution (it is no longer lognormal), so the absolute error is modelled as a Gaussian mixture PDF.
In other words, a Gaussian mixture distribution is used if the algorithm detects a dataset that contains changes in the scene (anomalous and non-anomalous frames), and a Gaussian distribution is used if the dataset has only samples without any anomaly in the apparent mass or the kinetic energy. The term anomaly refers to a drastic change in the total apparent mass or in the total kinetic energy.
In order to know which of the two PDFs (Gaussian or Gaussian mixture) better fits the histogram, the Kullback–Leibler divergence is calculated for both of them.
Finally, if the Gaussian mixture PDF is the one that fits the histogram, the threshold thr is calculated as:

$$thr = \sqrt{\gamma} + \frac{\mu_1 \sigma_2^{2} + \mu_2 \sigma_1^{2}}{\sigma_1^{2} + \sigma_2^{2}} \qquad (5)$$

where,

$$\gamma = \sigma_1^{2} \sigma_2^{2} \left( \ln(a_1)(\sigma_1^{2} \sigma_2^{2}) - (\mu_1 - \mu_2)^{2} + (\sigma_1^{2} + \sigma_2^{2}) \ln(a_2) \right) \qquad (6)$$

where $\mu_1$ and $\mu_2$ are the means of the first and second Gaussian distributions, respectively, and $\sigma_1$ and $\sigma_2$ are the standard deviations of each distribution.

IV. EXPERIMENTAL RESULTS

The adaptive algorithm was evaluated on four videos of outdoor areas. Using surveillance cameras, different videos with occlusions, shadows and lighting conditions that vary throughout the day were obtained. Each of them contains two classes (anomaly and non-anomaly), with the specifications in Table I.

          Resolution   Frame rate
Video 1   240x352      30/s
Video 2   360x640      30/s
Video 3   360x528      30/s
Video 4   252x320      30/s
Table I. Specifications of each video used by the algorithm.

Figure 2 highlights drastic changes in the total kinetic energy or total apparent mass in red and minor changes in these features in yellow.

Figure 2. Processed images of Video 1 showing drastic changes in total kinetic energy or total apparent mass.

The samples behave throughout Video 1 as shown in Figure 3, where the blue samples are the frames that have normal behavior, the green ones are the frames with a change in the total kinetic energy and the red ones are the frames with a change in the total apparent mass.
Table II shows the results obtained with the unsupervised classifier for each video.

          Non-anomaly frames   Anomaly frames   VPR      SPC      ACC
Video 1   945                  235              81.66%   71.66%   73.19%
Video 2   621                  249              80.77%   78.26%   78.66%
Video 3   1466                 307              77.78%   90.76%   88.36%
Video 4   947                  268              84.00%   81.82%   82.17%
Table II. Sensitivity (VPR), specificity (SPC) and accuracy (ACC) for the four videos.

Figure 4 shows the Receiver Operating Characteristic (ROC) curve, drawn from the sensitivity and specificity.

V. CONCLUSIONS

The proposed combination of the three methods delivers a better binary segmentation than the ones obtained from each independent method. Drastic changes in the
total apparent mass occur when there are occlusions or an object partially enters or exits a frame. Since the volume of an object is a physical and solid characteristic, it should not change throughout the video. Drastic changes in the kinetic energy allow the detection of alterations in the object velocities or collisions between two or more objects in the frame. The pixel-by-pixel windows used in the kinetic energy allow the identification of the specific pixels that change in the frame. The two PDFs proposed in this work (Gaussian mixture PDF and Gaussian PDF) are the best fit for the histograms created by the instantaneous log error of the total apparent mass and the total kinetic energy. The algorithm was tested on a ground truth that contains four videos with a total of 3979 frames with no anomaly and 1059 frames with anomaly (the totals of Table II).

Figure 3. Blue samples are the frames that have no anomaly, the green ones are the frames that have a drastic change in the total kinetic energy and the red ones are the frames that have a drastic change in the total apparent mass.

Figure 4. Receiver Operating Characteristic curve that indicates the average of ten iterations at the four parking lots.

REFERENCES
[1] H. Dee and D. Hogg. Detecting Inexplicable Behavior. In BMVC, 2004.
[2] N. Robertson and I. Reid. Behaviour understanding in video: a combined method. In Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, pages 808-815 Vol. 1, Oct. 2005.
[3] A. Basharat and A. Gritai and M. Shah. Learning object motion patterns for anomaly detection and improved object detection. In Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on, pages 1-8, June 2008.
[4] M. P. Kumar and P. H. S. Torr and A. Zisserman. Learning layered motion segmentations of video. In Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on, pages 33-40 Vol. 1, Oct. 2005.
[5] R. Mehran and A. Oyama and M. Shah. Abnormal crowd behavior detection using social force model. In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages 935-942, June 2009.
[6] B. Antić and B. Ommer. Video parsing for abnormality detection. In Computer Vision (ICCV), 2011 IEEE International Conference on, pages 2415-2422, Nov. 2011.
[7] J. Zhao and Y. Xu and X. Yang and Q. Yan. Crowd instability analysis using velocity-field based social force model. In Visual Communications and Image Processing (VCIP), 2011 IEEE, pages 1-4, Nov. 2011.
[8] R. Raghavendra and A. Del Bue and M. Cristani and V. Murino. Optimizing interaction force for global anomaly detection in crowded scenes. In Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on, pages 136-143, Nov. 2011.
[9] D. Tran and J. Yuan. Optimal spatio-temporal path discovery for video event detection. In Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on, pages 3321-3328, June 2011.
[10] G. Xiong and X. Wu and Y. L. Chen and Y. Ou. Abnormal crowd behavior detection based on the energy model. In Information and Automation (ICIA), 2011 IEEE International Conference on, pages 495-500, June 2011.
[11] T. Cao and X. Wu and J. Guo and S. Yu and Y. Xu. Abnormal crowd motion analysis. In Robotics and Biomimetics (ROBIO), 2009 IEEE International Conference on, pages 1709-1714, Dec. 2009.
[12] T. Huang and J. Qiu and T. Ikenaga. A Foreground Extraction Algorithm Based on Adaptively Adjusted Gaussian Mixture Models. In INC, IMS and IDC, 2009. NCM ’09. Fifth International Joint Conference on, pages 1662-1667, Aug. 2009.
[13] J. J. Chen and C. R. Su. Volume Image Segmentation by Dual Multi-Scale Morphological Reconstructions. In Intelligent Information Hiding and Multimedia Signal Processing, 2009. IIH-MSP ’09. Fifth International Conference on, pages 511-514, Sept. 2009.
[14] T. Brox and J. Malik. Large Displacement Optical Flow: Descriptor Matching in Variational Motion Estimation. In IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(3):500-513, 2011.
[15] W. Li and V. Mahadevan and N. Vasconcelos. Anomaly Detection and Localization in Crowded Scenes. In IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 18-32, Jan. 2014.
[16] H. Yang and Y. Cao and S. Wu and W. Lin and S. Zheng and Z. Yu. Abnormal crowd behavior detection based on local pressure model. In Signal Information Processing Association Annual Summit and Conference (APSIPA ASC), 2012 Asia-Pacific, pages 1-4, Dec. 2012.
[17] C. Xie and L. Shang. Anomaly detection in crowded scenes using genetic programming. In Evolutionary Computation (CEC), 2014 IEEE Congress on, pages 1832-1839, July 2014.
