IJPT AStudyonObjectDetection 22875-22885
net/publication/338253407
All content following this page was uploaded by Manjula S. on 31 December 2019.
ISSN: 0975-766X
CODEN: IJPTFI
Available Online through www.ijptonline.com
Research Article
A STUDY ON OBJECT DETECTION
S. Manjula (a)*, Dr. K. Lakshmi (b)
(a) Research Scholar, (b) Professor, Computer Science and Engineering, Periyar Maniammai University,
Thanjavur-613403, Tamilnadu, India.
Email: manjula_se@pmu.edu
Received on: 20.10.2016 Accepted on: 25.11.2016
Abstract
Video cameras are commonly used to monitor campuses in day-to-day life. Most such surveillance systems rely on human
operators to monitor the activities happening in the area of interest. However, relying on human operators has its own
disadvantages; to overcome these limitations, researchers are working on automated visual surveillance systems. The
visual surveillance process comprises the following steps: environment modelling, motion segmentation, object
classification, tracking, behaviour understanding, human identification and data fusion. The first and foremost step in
visual surveillance is identifying moving objects in a video sequence. The moving object of interest may be a human
being, a vehicle, etc. Object detection is the technology that deals with identifying the semantic class of the moving
object in the video sequence. Hence, object detection is vital for tracking moving objects and analysing behaviour
in a given video sequence. Considering the importance of object detection in visual surveillance, this paper presents
various methods available for object detection. Automatic visual surveillance has a wide range of applications, such as
human identification at a distance, congestion monitoring and detection of anomalous behaviours.
Segmentation.
1. Introduction
A traditional visual surveillance system uses human operators to monitor cameras and detect any unpleasant events. The
more cameras there are to monitor, the more manpower is required, which limits how far such surveillance can be
scaled. Hence the surveillance system becomes weaker. To address this problem, researchers work on automated visual
surveillance systems, which detect events requiring attention as they happen and take action immediately.
Most of the automated visual surveillance system used to detect people
Background modelling - Background modelling is a basic step in detecting moving objects in video sequences. A
general approach to background modelling is to classify whether each image pixel belongs to the background or not [1].
Motion segmentation - Motion segmentation is the process of separating regions, features, or trajectories from a
video sequence into reliable subsets. These subsets correspond to independent moving objects in the scene.
Object classification - The purpose of object classification [2-9] is to extract the region corresponding to people from all
Target tracking - Tracking is an important task in the field of computer vision. Surveillance systems track moving objects
Behaviour understanding - is the process of analysis and recognition of motion patterns, and the production of high-
Human identification - The aim of this phase is to identify the object/human entering the area under surveillance, and
Data fusion - The purpose of data fusion is to track continuously by integrating every camera [10]. The relationship
Moving object detection is an important aspect of any surveillance application, such as video analysis, video
communication, traffic control, medical imaging and military service [11]. Object detection finds its importance in the
Abnormal event detection: With the growing demand for security, it is necessary to analyze the behaviors of people
and vehicles in the area of interest to determine whether the behaviors are normal or are abnormal activities that indicate any
Human gait characterization: Gait is a daily activity and one of the main skills of human beings. Human gait
characterization observes an individual's walking style and can be used for human identification. To analyse gait,
a well-defined representation method is used. Motion information is extracted from human gaits, and
the help of human detection techniques and avoid congestion by diverting it to free pathways [12].
Fall detection of elderly people: Falls are a major cause of serious injury for the elderly population. To avoid these
problems, intelligent monitoring systems are being developed that can automatically detect a fall and alert
relatives or authorities. Object detection [10] is a very important aspect of a visual surveillance system because poor
accuracy in object detection affects subsequent processes such as object tracking and behavior understanding. Hence, this
This section provided an overview of the visual surveillance system. The rest of this paper is organized as follows.
Section 2 gives a general overview of well-known moving object detection methods. Section 3 describes the object
classification methods, the available datasets, performance metrics and evaluation. The conclusion of the paper is
presented in Section 4.
The goal of object detection is to detect all instances of objects from a known class, such as people, vehicles or faces,
in an image or video. Object detection is a difficult task because of illumination changes in the environment, rapid
variations in target appearance, similar non-target objects in the background, and occlusions. Object detection methods
use semiautomatic or automatic detection techniques. Figure 2 shows the steps such as environment modeling,
[Figure 2: flowchart. Recoverable labels: Video image sequences; Environment modeling; Background subtraction; Spatio-temporal filter; Object classification (shape-based, motion-based and texture-based methods); Detected object.]
Constructing and updating an environment model is essential for object detection. Environment models can be
classified into 2D models in the image plane and 3D models in the real world. An image can be acquired by a fixed camera,
a pure translation camera or a mobile camera [10]. Depending on the types of camera used background modelling
graph to acquire a holistic background image [14], and homography matrices can be used to describe the transformation
relationship between different images. Motion compensation is required to construct temporary background images [15]
where mobile cameras are used. If a fixed camera is used, factors such as illumination variance, shadows and shaking
branches influence the construction and updating of the background model. These factors are suppressed, and an effective
background model is constructed, by using various algorithms such as the temporal average of an image sequence [16-17],
adaptive Gaussian estimation [18], parameter estimation based on pixel processes [19-20], adaptive background estimation
and foreground detection using Kalman filtering, and recovering and updating background images based on a mixed Gaussian
model [20]. Toyama et al. [21] proposed the Wallflower algorithm, in which background subtraction is carried out at three
levels: pixel level, region level and frame level. Haritaoglu et al. [22] built a statistical model by representing each
pixel with three values: its maximum intensity value, its minimum intensity value and the maximum intensity difference
between consecutive frames. These three values are observed during a training period and updated periodically. McKenna
et al. [23] used an adaptive background model with colour and gradient information to reduce the influence of shadows and
unreliable colour cues.
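To make the temporal-average family of background models more concrete, the following minimal sketch maintains an exponentially weighted running average of grayscale frames. It is only an illustration: the function name `update_background`, the learning rate `alpha` and the synthetic frames are assumptions made for this example and are not taken from the cited works.

```python
import numpy as np


def update_background(background, frame, alpha=0.05):
    """Blend a new frame into the background estimate (running average).

    `alpha` is an assumed learning rate: larger values adapt faster to
    scene changes but absorb slow-moving objects more quickly.
    """
    return (1.0 - alpha) * background + alpha * frame.astype(np.float64)


# Synthetic data: a constant scene of intensity 100 with small sensor noise.
rng = np.random.default_rng(0)
frames = [100.0 + rng.normal(0.0, 2.0, size=(4, 4)) for _ in range(200)]

background = frames[0].copy()
for frame in frames[1:]:
    background = update_background(background, frame)
```

After enough frames the estimate settles close to the true scene intensity while the frame-to-frame noise is averaged out, which is the property a fixed-camera background model relies on.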
Motion segmentation in an image sequence aims at detecting regions that contain moving objects. Segmentation is classified
into spatial segmentation and temporal segmentation. Spatial segmentation is either local or global. Local
segmentation segments sub-images (small windows) of a whole image, so the number of pixels available to local
segmentation is lower than for global segmentation. Global segmentation is concerned with segmenting the whole image
and therefore involves a large number of pixels. Automatic video segmentation is the separation of the moving object
from the background and the identification of accurate boundaries of the object. Some of
The background subtraction method is suitable for static backgrounds. Detecting objects in a video sequence
yields a classification of the pixels into either foreground or background [24]. The model that represents the scene in
the object detection process is called the background model. Background modelling is one of the primary and most
challenging tasks in background subtraction. The background subtraction algorithm should be robust against environmental
change, i.e. capable of handling changes in illumination conditions and able to ignore the movement of small
model in a pixel-by-pixel or block-by-block fashion. Figure 3 shows an example of a background model.
Figure 3. (a) Original frame (b) Reference frame / background model (c) Difference image (d) Binary image
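The four panels of Figure 3 correspond to a simple pixel-by-pixel computation, sketched below under stated assumptions: grayscale images stored as NumPy arrays, a fixed global threshold, and a hypothetical helper named `subtract_background`. The threshold value of 30 is arbitrary and would need tuning for real footage.

```python
import numpy as np


def subtract_background(frame, background, threshold=30):
    """Return (difference image, binary foreground mask) for grayscale images."""
    # Use a signed type so the subtraction of uint8 images cannot wrap around.
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    mask = (diff > threshold).astype(np.uint8)   # 1 = foreground, 0 = background
    return diff.astype(np.uint8), mask


# (b) reference frame: a flat background of intensity 50.
background = np.full((5, 5), 50, dtype=np.uint8)

# (a) original frame: the same scene with a bright 2x2 "object".
frame = background.copy()
frame[1:3, 1:3] = 200

# (c) difference image and (d) binary image.
diff, mask = subtract_background(frame, background)
```

In this toy case only the four object pixels exceed the threshold, so the binary image isolates the object exactly; with real video the mask usually needs noise filtering as well.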
Most researchers have shown more interest in building adaptive background models in order to reduce the
influence of dynamic scene changes on motion segmentation. Early studies by Karmann and Brandt [25] and
Kilger [26] proposed an adaptive background model based on Kalman filtering to adapt to temporal changes in weather and
lighting. A detailed general survey of image change detection algorithms can be found in [27]. The Gaussian mixture
model (GMM) is one of the most commonly used methods for background subtraction in visual surveillance applications
with fixed cameras. A mixture of Gaussians is maintained for each pixel in the image. New pixel values update the
mixture of Gaussians using an online K-means approach. The update accounts for illumination changes, slight sensor
movements and noise.
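The per-pixel mixture update can be sketched as follows for a single grayscale pixel. This is a simplified illustration in the spirit of the adaptive mixture model of Stauffer and Grimson [1], not a faithful reimplementation: the class name `PixelMixture`, all parameter values, the variance floor and the use of a constant learning rate for the mean and variance are assumptions made for this example.

```python
import numpy as np


class PixelMixture:
    """Simplified mixture-of-Gaussians model for ONE grayscale pixel."""

    def __init__(self, k=3, alpha=0.05, init_var=225.0, min_var=4.0,
                 match_sigmas=2.5, bg_ratio=0.7):
        self.alpha = alpha                # learning rate (assumed value)
        self.init_var = init_var          # variance given to new components
        self.min_var = min_var            # floor that keeps sigma away from zero
        self.match_sigmas = match_sigmas  # match if within this many sigmas
        self.bg_ratio = bg_ratio          # share of weight treated as background
        self.weights = np.full(k, 1.0 / k)
        self.means = np.linspace(0.0, 255.0, k)
        self.variances = np.full(k, init_var)

    def update(self, value):
        """Fold one intensity into the mixture; return True if it is foreground."""
        sigmas = np.sqrt(self.variances)
        dist = np.abs(value - self.means)
        matches = dist < self.match_sigmas * sigmas

        if not matches.any():
            # No component explains the value: replace the weakest one.
            j = int(np.argmin(self.weights))
            self.means[j] = value
            self.variances[j] = self.init_var
            self.weights[j] = self.alpha
            self.weights /= self.weights.sum()
            return True

        # K-means-style assignment: the closest matching component wins.
        j = int(np.argmin(np.where(matches, dist, np.inf)))

        # Rank components by weight / sigma; the top ones whose cumulative
        # weight reaches bg_ratio are treated as background.
        order = np.argsort(-self.weights / sigmas)
        cumulative = np.cumsum(self.weights[order])
        n_bg = int(np.searchsorted(cumulative, self.bg_ratio)) + 1
        is_foreground = j not in order[:n_bg].tolist()

        # Online update of the matched component (constant rate alpha).
        self.weights *= 1.0 - self.alpha
        self.weights[j] += self.alpha
        self.means[j] += self.alpha * (value - self.means[j])
        residual = value - self.means[j]
        self.variances[j] += self.alpha * (residual ** 2 - self.variances[j])
        self.variances[j] = max(self.variances[j], self.min_var)
        self.weights /= self.weights.sum()
        return is_foreground


model = PixelMixture()
for _ in range(200):
    model.update(100.0)                   # a stable background intensity

background_result = model.update(100.0)   # same intensity again
foreground_result = model.update(250.0)   # sudden bright value
```

After the training loop the component that tracked intensity 100 dominates the mixture, so the repeated value is labelled background while the sudden bright value is labelled foreground. A full implementation would keep one such mixture per pixel (and per colour channel) and vectorise the update.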
A vector-based approach [28] is applied in the optical flow method to estimate the motion in video by matching points on
objects over multiple frames. Accurate measurement requires a high frame rate, and real-time implementation needs
specialized hardware due to the complexity of the algorithm. Optical flow is robust to multiple and simultaneous
camera and object motions and is suitable for crowd analysis. Meyer et al. [29] performed a monofonic operation that
computed the displacement vector field to initialize a contour-based tracking algorithm, called active rays, for the
extraction of articulated objects to be used for gait analysis. Rowley and Rehg [30] focused on the
segmentation of optical flow fields of articulated objects. Their major contributions were to add kinematic motion
constraints to each pixel and to combine motion segmentation with estimation in an expectation-maximization (EM)
computation. In Bregler's work [31], each pixel was represented by its optical flow. These flow vectors were grouped
into blobs having coherent motion and characterized by a mixture of multivariate Gaussians. Optical flow methods [32]
can be used to detect moving objects even in the presence of camera motion. Most optical flow computation methods are
specialized hardware.
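As a rough illustration of how a displacement vector can be estimated from two frames, the sketch below uses the Lucas-Kanade least-squares formulation for a single image patch. Lucas-Kanade is one common choice and is not named in the text above; the function name `lucas_kanade_window` and the synthetic test pattern are assumptions made for this example.

```python
import numpy as np


def lucas_kanade_window(prev, curr):
    """Estimate one (u, v) displacement for a whole patch by least squares.

    Solves Ix*u + Iy*v = -It over all pixels, which assumes small motion
    and constant brightness.
    """
    prev = prev.astype(np.float64)
    curr = curr.astype(np.float64)
    iy, ix = np.gradient(prev)        # derivatives along rows (y) and columns (x)
    it = curr - prev                  # temporal derivative
    a = np.stack([ix.ravel(), iy.ravel()], axis=1)
    b = -it.ravel()
    flow, *_ = np.linalg.lstsq(a, b, rcond=None)
    return flow                       # array [u, v]


# Smooth synthetic pattern that moves one pixel to the right between frames.
y, x = np.mgrid[0:64, 0:64]
prev_frame = np.sin(0.2 * x) + np.sin(0.3 * y)
curr_frame = np.sin(0.2 * (x - 1)) + np.sin(0.3 * y)

u, v = lucas_kanade_window(prev_frame, curr_frame)
```

For this pattern the estimate should be close to a one-pixel shift in x and no shift in y. Dense optical flow repeats such a solve in a small window around every pixel, which gives a sense of why the computational cost noted above is high.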
2.2.3. Temporal Differencing
Temporal differencing [2-3, 33-36] uses the pixel-wise difference between two or three consecutive frames in an image
sequence to extract moving regions. Temporal differencing is adaptive to dynamic environments, but generally does a
poor job of extracting all the relevant feature pixels, e.g. it may generate holes inside moving entities. Lipton
et al. [3] detected moving objects using temporal differencing in real video streams. VSAM [2] successfully developed
a hybrid algorithm for motion segmentation by combining an adaptive background subtraction algorithm with a
three-frame differencing technique. The hybrid algorithm is very fast and surprisingly effective for detecting moving
objects in image sequences.
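A minimal sketch of three-frame differencing is given below. It assumes the commonly described rule that a pixel of the latest frame is considered moving only if it differs significantly from both of the two preceding frames; the function name `three_frame_difference`, the threshold and the toy frames are hypothetical choices for illustration.

```python
import numpy as np


def three_frame_difference(f0, f1, f2, threshold=25):
    """Mark pixels of the latest frame f2 that differ from both earlier frames."""
    f0, f1, f2 = (f.astype(np.int16) for f in (f0, f1, f2))
    changed_since_f1 = np.abs(f2 - f1) > threshold
    changed_since_f0 = np.abs(f2 - f0) > threshold
    return (changed_since_f1 & changed_since_f0).astype(np.uint8)


def make_frame(start_col):
    """Toy frame: a 2-pixel-wide bright object on a dark background."""
    frame = np.zeros((3, 8), dtype=np.uint8)
    frame[1, start_col:start_col + 2] = 200
    return frame


# The object moves two pixels to the right in every frame.
f0, f1, f2 = make_frame(0), make_frame(2), make_frame(4)
mask = three_frame_difference(f0, f1, f2)
```

With two-frame differencing the mask would also contain the "ghost" left behind at the object's previous position; requiring agreement with both earlier frames keeps only the object's current position in this example.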
Different moving regions correspond to different moving objects in real-world image sequences. The region of the desired
moving object should be separated from the regions of other moving objects, because further activities such as
tracking and activity recognition depend on it.
The purpose of moving object classification [2-8, 37] is to precisely extract the region corresponding to people
from all moving blobs obtained by the motion segmentation methods. Common geometric or topological properties
3.1. Shape-based classification
The shape information of motion regions, such as points, boxes, silhouettes and blobs, is used to classify moving objects.
Shape-based classification is particularly useful in certain transit systems where only certain parts of the objects are
visible. In buses and trains, objects are partially occluded most of the time; in this situation, the head [8] is a
salient feature in the scene. Lin et al. [38] proposed a shape-based approach to detect humans. The key idea of this
approach is to detect humans and estimate their poses by template matching. One major disadvantage of the shape-based
method is that it cannot capture the internal motion of the object in the silhouette region.
Collins et al. [2] classified moving object blobs into four classes (single human, vehicles, human groups and
clutter) using a viewpoint-specific three-layer neural network classifier. Classification was performed on each blob at
This classification method is based on object motion characteristics and patterns. Motion can be used to recognize
methods [39] are used to identify humans at a distance. Bobick et al. [40] developed a view-based approach for the
recognition of human movements by constructing a vector image template comprising two temporal projection
operators: the binary motion-energy image and the motion-history image. Cutler et al. [41] presented a
self-similarity-based time-frequency technique to detect and analyze periodic motion for human classification. Efros
et al. [42] characterized the human motion within a spatio-temporal volume by a descriptor, which was based on
computing the optical flow, projecting the motion onto a number of motion channels and blurring with a Gaussian.
Recognition was performed in a
Local binary pattern (LBP) is a texture-based method that quantifies intensity patterns in the neighbourhood of a
pixel [43]. Zhang et al. [44] proposed the multi-block local binary pattern to encode intensities of rectangular regions
by LBP. The histogram of oriented gradients (HOG) [45] is another texture-based method, which uses high-dimensional
features based on edges and then applies an SVM to detect human body regions. Zhu et al. [46] applied HOG descriptors
in combination with the cascade-of-rejecters algorithm and introduced blocks that vary in size, location and aspect
ratio. Moctezuma et al. [47] proposed HOG with Gabor filters and showed improved performance in both person counting
and identification. A
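To show what "quantifying intensity patterns in the neighbourhood of a pixel" means in practice, the sketch below computes the basic 8-neighbour LBP code. The function name `lbp_3x3`, the clockwise bit ordering starting at the top-left neighbour and the "greater than or equal" comparison are assumptions of this example; published variants differ in these conventions.

```python
import numpy as np


def lbp_3x3(image):
    """Compute the basic 8-neighbour LBP code for every interior pixel."""
    img = image.astype(np.int32)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes.astype(np.uint8)


# Bright corners and dark edges around a centre pixel of intensity 5.
patch = np.array([[9, 1, 9],
                  [1, 5, 1],
                  [9, 1, 9]], dtype=np.uint8)
code = lbp_3x3(patch)   # one interior pixel, code 85 (binary 01010101)
```

A texture descriptor is then typically formed as a histogram of these codes over an image region and passed to a classifier.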
4. Conclusion
Detecting objects accurately in surveillance video is one of the major research areas in computer vision due to its
wide range of applications. Processing the images obtained from a surveillance video is very challenging for the
following reasons: low resolution, illumination variation, dynamic objects in the background, and small changes in the
background such as waving leaves. We have presented an overview of recent developments in object detection
methods. The detection process consists of background modeling, object detection and object classification. In this paper,
temporal filter methods and discussed the advantages and disadvantages of the methods applied to various types of
dataset. The object classification techniques are categorized into shape-based, motion-based and texture-based
methods. The state of the art of existing methods for each key issue is discussed and made to point the future work
References
1. C. Stauffer and W. Grimson. Adaptive background mixture models for real-time tracking. Proceedings of IEEE
2. R.T. Collins, et al. A system for video surveillance and monitoring: VSAM final report. CMU-TR-0012,
3. A.J. Lipton, H. Fujiyoshi, R.S. Patil. Moving target classification and tracking from real-time video. Proceedings
4. Y. Kuno, T. Watanabe, Y. Shimosakoda, S. Nakagawa. Automated detection of human for visual surveillance
5. R. Cutler, L.S. Davis. Robust real-time periodic motion detection, analysis, and applications. IEEE Trans. Pattern
6. Local application of optic flow to analyse rigid versus non-rigid motion.
http://www.eecs.lehigh.edu/FRAME/Lipton/iccvframe.html.
7. A. Selinger, L. Wixson. Classifying moving objects as rigid or non-rigid without correspondences. Proceedings of
8. M. Oren, et al. Pedestrian detection using wavelet templates. Proceedings of the IEEE CS Conference on
9. C. Stauffer. Automatic hierarchical classification using time-base co-occurrences. Proceedings of the IEEE CS
10. Weiming Hu, Tieniu Tan, Liang Wang, and Steve Maybank. A Survey on Visual Surveillance of Object Motion
and Behaviors. IEEE Trans. on Sys. Man, and Cybernetics—Part C: Applications and Reviews, Vol. 34, No. 3,
August 2004
Journal of Computer Vision and Image Understanding. vol. 70, no. 2,142–156, May 1998.
12. Manoranjan Paul, Shah M E Haque and Subrata Chakraborty. Human detection in surveillance videos and its
13. João P. Ferreira, Manuel M. Crisóstomo, and A. Paulo Coimbra. Human Gait Acquisition and Characterization.
IEEE Trans. On Ins. and measurement, vol. 58, No. 9, September 2009.
14. H.-Y. Shum, M. Han, and R. Szeliski. Interactive construction of 3D models from panoramic mosaics.
Proceedings of IEEE Conf. Computer Vision and Pattern Recognition, Santa Barbara, CA, 1998, 427–433.
15. T. Tian and C. Tomasi. Comparison of approaches to ego motion computation. Proceedings of IEEE Conf.
16. N. Friedman and S. Russell. Image segmentation in video sequences: a probabilistic approach. Proceedings of 13th
18. M. Kohle, D. Merkl, and J. Kastner. Clinical gait analysis by neural networks: Issues and experiences. Proceedings
19. H.Z. Sun, T. Feng, and T.N. Tan. Robust extraction of moving objects from image sequences. Proceedings of Asian
Conf. Computer Vision, Taiwan, R.O.C., 2000, 961-964.
20. W.E.L. Grimson, C. Stauffer, R. Romano, and L. Lee. Using adaptive tracking to classify and monitor activities in a
site. Proceedings of IEEE Conf. Computer Vision and Pattern Recognition, Santa Barbara, CA, 1998, 22-31.
21. K.P. Karmann, A. Brandt. Moving object recognition using an adaptive background memory. In Time-Varying
Image Processing and Moving Object Recognition. V. Cappellini (ed.), Vol. 2, Elsevier, Amsterdam, The
Netherlands, 1990.
22. K. Toyama, J. Krumm, B. Brumitt, and B. Meyers. Wallflower: principles and practice of background
23. I. Haritaoglu, D. Harwood, and L.S. Davis. W4: Real-time surveillance of people and their activities. IEEE
25. Wu M, Peng X. Spatio-temporal context for codebook-based dynamic background subtraction. AEU-
26. M. Kilger. A shadow handler in a video-based real-time traffic monitoring system. Proceedings of the IEEE
27. R.J. Radke, S. Andra, O. Al-Kofahi, and B. Roysam. Image change detection algorithms: A systematic survey.
28. Joshua Candamo, Matthew Shreve, Dmitry B. Goldgof, Deborah B. Sapper, Rangachar Kasturi. Understanding
29. D. Meyer, J. Denzler and H. Niemann. Model based extraction of articulated objects in image sequences for gait
30. H.A. Rowley, J.M. Rehg. Analyzing articulated motion using expectation-maximization. Proceedings of the
31. C. Bregler. Learning and recognizing human dynamics in video sequences. Proceedings of IEEE CS Conference
32. Liang Wang, Hu and Tan. Recent developments in human motion analysis. The journal of the Pattern
33. C. Anderson, P. Burt, G. van der Wal. Change detection and tracking using pyramid transform techniques.
Proceedings of the SPIE - Intelligent Robots and Computer Vision. Vol. 579, 1985, 72-78.
34. J.R. Bergen, et al. A three frame algorithm for estimating two-component image motion. IEEE Trans. Pattern
35. T Ojala, M Pietikäinen, T Mäenpää. Multi-resolution gray scale and rotation invariant texture classification with
36. Y. Kameda, M. Minoh. A human motion estimation method using 3-successive video frames. Proceedings of the
38. Judy Jenita, S., Justin Samuel, S., Abirami, S., Shalini, R.S. (2015), "An efficient policy based security mechanism
using HMAC to detect and prevent unauthorized access in cloud transactions", ARPN Journal of Engineering
39. Z Lin, LS Davis. Shape-based human detection and segmentation via hierarchical part-template matching. IEEE
40. Y.-B. Li, T.-X. Jiang, Z.-H. Qiao, and H.-J. Qian. General methods and development actuality of gait recognition.
41. AF Bobick, JW Davis. The recognition of human movement using temporal templates. IEEE Trans. Pattern
42. R Cutler, LS Davis. Robust real-time periodic motion detection, analysis, and applications. IEEE Trans. Pattern
43. A Efros, A Berg, G Mori, J Malik. Recognizing action at a distance. Proceedings of Ninth IEEE International
44. L Zhang, SZ Li, X Yuan, S Xiang. Real-time object classification in video surveillance based on appearance
learning. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2007
45. N Dalal, B Triggs. Histograms of oriented gradients for human detection. Proceedings of IEEE Conference on
Computer Vision and Pattern Recognition (CVPR 2005) (IEEE, Piscataway, 2005), 886–893.
46. Q Zhu, S Avidan, M-C Yeh, K-T Cheng. Fast human detection using a cascade of histograms of oriented
gradients. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition
47. D Moctezuma, C Conde, IM Diego, E Cabello. Person detection in surveillance environment with HoGG: Gabor
filters and histogram of oriented gradient. Proceedings of ICCV Workshops (IEEE, Piscataway, 2011), 1793–1800.