
Object Detection Using KDE, GMM, OTSU

ABSTRACT
Object detection is one of the most prevalent steps in video analytics, and performance at higher levels depends greatly on accurate object detection. Various platforms are used for designing and implementing object detection algorithms. To detect an object in an image or a video, a system needs a few components to complete the task: a model database, a feature detector, a hypothesizer, and a hypothesis verifier. In this work we present a review of techniques such as GMM, OTSU, and KDE that are used to detect objects, localize them, categorize them, extract features and appearance information, and more, in images and videos. Comments are drawn from the studied literature, and key issues relevant to object detection are identified. Information about source code and online datasets is provided to help new researchers in the object detection area. Available platforms include C programming, MATLAB and Simulink, OpenCV, etc. Among these, MATLAB is most popular with students and researchers due to its extensive features, which include matrix-based data processing, a set of toolboxes and Simulink blocks covering all technology fields, easy programming, and Help topics with numerous examples. This paper presents the implementation of object detection and tracking using MATLAB. It demonstrates the basic block diagram of object detection and explains various predefined functions and objects from different toolboxes that can be useful at each level of object detection. Useful toolboxes include image acquisition, image processing, and computer vision. This study helps new researchers in the object detection field to design and implement algorithms using MATLAB.
Chapter 1

Introduction to Object Detection

1.1 INTRODUCTION:
Video analytics is a popular segment of computer vision. It has enormous applications such as traffic monitoring, parking lot management, crowd detection, object recognition, unattended baggage detection, and secure area monitoring. Object detection is a critical step in video analytics, and performance at this step is important for scene analysis, object matching and tracking, and activity recognition. Over the years, research has moved toward innovating new concepts and improving or extending established work to improve the performance of object detection and tracking. Various object detection approaches have been developed based on statistics, fuzzy logic, neural networks, etc. Most approaches involve complex theory, and they can be evolved further through thorough understanding, implementation, and experimentation. All of these approaches can be learned by reading, reviewing, and taking a professor's expert guidance; however, implementation and experimentation require a good programmer. Various platforms are used for the design and implementation of object detection and tracking algorithms, including C programming, OpenCV, and MATLAB. An object detection system intended for real-time use should satisfy two conditions. First, the system code must be short in terms of execution time. Second, it must use memory efficiently. However, a programmer must have good programming skill when working in C or OpenCV, and it is time intensive for a new researcher to develop such efficient code for real-time use. Considering these facts, MATLAB is found to be a better platform for the design and implementation of algorithms. It contains more than seventy toolboxes covering all possible fields in technology, and all toolboxes are rich with predefined functions, System objects, and Simulink blocks. This helps to write short code and saves time in logic development at various steps of the system. MATLAB supports matrix operations, which is a huge advantage when processing an image or a frame in a video sequence, and MATLAB coding is simple and easily learned by any new researcher. This paper presents the implementation of an object detection system using MATLAB and its toolboxes. The study explores various toolboxes and identifies useful functions and objects that can be used at various levels in object detection and tracking; the main toolboxes are computer vision, image processing, and image acquisition. MATLAB 2012 version is used for this study. The paper is organized as follows: the second section describes the general block diagram of object detection; the third section covers MATLAB functions and objects that are useful in implementing an object detection system; sample code for object detection and tracking is presented in the fourth section; and the paper is concluded in the fifth section.

1.2 What is Object Detection?


Object detection is the task of finding and identifying objects in an image or video sequence. The goal of instance-level recognition is to recognize a specific object or scene. Object detection is a computer technology, related to computer vision and image processing, that deals with detecting instances of semantic objects of a certain class (such as humans, buildings, or cars) in digital images and videos.

1.3 Why object detection matters?


Object detection is a key technology behind advanced driver assistance systems (ADAS) that
enable cars to detect driving lanes or perform pedestrian detection to improve road safety.
Object detection is also useful in applications such as video surveillance or image retrieval
systems.
Today, images and video are everywhere; online photo sharing sites and social networks hold them in the billions. The field of vision research [1] has been dominated by machine learning and statistics, using images and video to detect, classify, and track objects or events in order to "understand" a real-world scene. Programming a computer and designing algorithms for understanding what is in these images is the field of computer vision. Computer vision powers applications like image search, robot navigation, medical image analysis, photo management, and many more. From a computer vision point of view, an image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. Object detection and recognition are two important computer vision tasks. Object detection determines the presence of an object and/or its scope and location in the image. Object recognition identifies the object class, from the training database, to which the object belongs. Object detection typically precedes object recognition. It can be treated as a two-class recognition problem, where one class represents the object and the other class represents non-objects. Object detection can be further divided into soft detection, which only detects the presence of an object, and hard detection, which detects both the presence and the location of the object. Detection is typically carried out by searching each part of an image: an object template is scanned across the image at different locations, scales, and rotations, and a detection is declared if the similarity between the template and the image is sufficiently high. The similarity between a template and an image region can be measured, for example, by their correlation or by the sum of squared differences (SSD). Over the last several years it has been shown that image-based object detectors are sensitive to the training data.

1.4 How is it currently being used?


Object detection is breaking into a wide range of industries, with use cases ranging from
personal security to productivity in the workplace. Facial detection is one form of it, which
can be utilized as a security measure to let only certain people into a highly classified area of
a government building, for example. It can be used to count the number of people present
within a business meeting to automatically adjust other technical tools that will help
streamline the time dedicated to that particular meeting. It can also be used within a visual
search engine to help consumers find a specific item they’re on the hunt for – Pinterest is one
example of this, as the entire social and shopping platform is built around this technology.
These features utilize people and object detection to create big data for a variety of
applications in the workplace.  

1.5 What potential does it have?


The possibilities are endless when it comes to future use cases for object detection.
Sports broadcasting will be utilizing this technology in instances such as detecting when a
football team is about to make a touchdown and notifying fans via their mobile phone or at-
home virtual reality setup in a highly creative way.
 In video collaboration, business leaders will be able to count the number of participants
within a meeting to help them automate the process further and monitor room usage to ensure
spaces are being used properly. A relatively new "people counting" method that detects heads rather than bodies and motion will allow more accurate detection to take place, specifically in crowded places (IEEE), which will enable even more applications for the security industry.

The future of object detection has massive potential across a wide range of industries. Advances in real-time intelligent vision, high-performance computing, artificial intelligence, and machine learning are enabling solutions that do not distort video while supporting AI capabilities that were previously not possible.

1.6 PROBLEM DEFINITION AND SCOPE


The problem that this work attempts to solve is concerned with the tracking and
detection of suspicious objects in surveillance videos of large public areas. A suspicious
object here is defined as one that is carried into the scene by a person and left behind while
the person exits the scene. To be classified as suspicious, such an object should remain stationary in the scene for a certain period of time without any second party showing any apparent interest in it. In addition to detecting abandoned objects, this system also detects removed objects, that is, objects that were in the scene long enough to become part of the background and were subsequently removed.
The scope of this task is to identify any such objects in real time by looking for certain pre-
defined patterns in the incoming video stream so as to raise an alarm without requiring any
human intervention. It is assumed that the data about the scene is available from only one
camera and from a fixed viewpoint.

The objectives of this system can be summarized as follows:


 It should be able to identify objects in real time and must therefore employ efficient and computationally inexpensive algorithms.

 It should be able to detect the face of the person who left the object or luggage behind (in a particular place, e.g., a railway station, an airport, or another crowded area).

 It should be robust against illumination changes, cluttered backgrounds, occlusions, ghost effects, and rapidly varying scenes.

 It should try to maximize the detection rate while at the same time minimizing false positives.

1.7 PROJECT MOTIVATION


In the last decade the topic of automated surveillance has become very important in
the field of activity analysis. Within the realm of automated surveillance, much emphasis is
being laid on the protection of transportation sites and infrastructure like airports and railway
stations. These places are the lifeline of the economy and thus particularly prone to attacks. It
is difficult to monitor these places manually because of two main reasons. First, it would be
very labor intensive and thus a very expensive proposition. Second, it is not humanly possible
to continuously monitor a scene for an extended period of time, as it requires a lot of
concentration. Therefore, as a step in that direction, we need an automated system that can
assist the security personnel in their surveillance tasks. Since a common threat to any
infrastructure establishment is through a bomb placed in an abandoned bag, we look at the
problem of detecting potentially abandoned objects in the scene.

1.8 ORGANISATION OF THE PROJECT


The report is organized as follows:
Abstract
Table of Contents
List of Figures
Chapter 1: Introduction
Chapter 2: Literature survey
Chapter 3: Block Diagram description
Chapter 4: Image Processing Intro
Chapter 5: Software
Chapter 6: Project description
Chapter 7: Simulation and results
Chapter 8: Future scope & Conclusion
References
Appendix

CHAPTER 2

LITERATURE SURVEY

In this section, the various analyses and research carried out in the field of object detection, and the results already published, are discussed, taking into account the parameters and scope of this project.

Visual surveillance is an important computer vision research problem. As more and more
surveillance cameras appear around us, the demand for automatic methods for video analysis
is increasing. Such methods have broad applications including surveillance for safety in
public transportation, public areas, and in schools and hospitals. Automatic surveillance is
also essential in the fight against terrorism, and the topic has therefore become very popular and has been projected remarkably in various areas with a large scope. Researchers have carried out numerous laboratory experiments and field observations to advance this field.
CURRENTLY EXISTING TECHNOLOGIES
Most existing techniques of abandoned (and removed) object detection
employ a modular approach with several independent steps where the output of each step
serves as the input for the next one. Many efficient algorithms exist for carrying out each of
these steps and any single complete AOD (Abandoned Object Detection) system has to
address the problem of finding a suitable combination of algorithms to suit a specific
scenario.
Following is a brief description of these steps and related methods, in the order they are
carried out:

1.2.1 Background Modeling and Subtraction (BGS): This stage creates a dynamic model
of the scene background and subtracts it from each incoming frame to detect the current
foreground regions. The output of this stage is usually a mask depicting pixels in the current
frame that do not match the current background model. Some popular background modeling
techniques include adaptive medians, running averages, mixture of Gaussians, kernel density
estimators, Eigen-backgrounds and mean-shift based estimation. There also exist methods
that employ dual backgrounds or dual foregrounds for this purpose. The BGS step often
utilizes feedback from the object tracking stage to improve its performance.
1.2.2 Foreground Analysis: The BGS step is often unable to adapt to sudden changes in the scene
(of lighting, etc.) since the background model is typically updated slowly. It might also
confuse parts of a foreground object as background if their appearance happens to be similar
to the corresponding background, thus causing a single object to be split into multiple
foreground blobs. In addition, certain foreground areas, while being detected correctly, are
not of interest for further processing. The above factors necessitate an additional refinement
stage to remove both false foreground regions, caused by factors like background state
changes and lighting variations, as well as correct but uninteresting foreground areas like
shadows.
Several methods exist for detecting sudden lighting changes, ranging from simple gradient
and texture based approaches to those that utilize complex lighting invariant features
combined with binary classifiers like support vector machines. Shadow detection is usually
carried out by performing a pixel-by-pixel comparison between the current frame and the
background image to evaluate some measure of similarity between them. These measures
include normalized cross correlation, edge-width information and illumination ratio.
1.2.3 Blob Extraction: This stage applies a connected component algorithm to the foreground
mask to detect the foreground blobs while optionally discarding too small blobs created due
to noise. Most existing methods use an efficient linear time algorithm. The popularity of this
method is owing to the fact that it requires only a single pass over the image to identify and
label all the connected components therein, as opposed to most other methods that require
two passes.

1.2.4 Blob Tracking: This is often the most critical step in the AOD process and is
concerned with finding a correspondence between the current foreground blobs and the
existing tracked blobs from the previous frame (if any). The results of this step are sometimes
used as feedback to improve the results of background modeling. Many methods exist for
carrying out this task, including finite state machines, color histogram ratios, Markov chain
Monte Carlo model, Bayesian inference, Hidden Markov models and Kalman filters.

1.2.5 Abandonment Analysis: This step classifies a static blob detected by the tracking step
as either abandoned or removed object or even a very still person. An alarm is raised if a
detected abandoned/removed object remains in the scene for a certain amount of time, as
specified by the user. The task of distinguishing between removed and abandoned objects is
generally carried out by calculating the degree of agreement between the current frame and
the background frame around the object‘s edges, under the assumption that the image without
any object would show better agreement with the immediate surroundings. There exist
several ways to calculate this degree of agreement; two of the popular methods are based on
edge energy and region growing. There also exist methods that use human tracking to look
for the object‘s owner and evaluate the owner‘s activities around the dropping point to decide
whether the object is abandoned or removed.

ANALYSIS OF PREVIOUS RESEARCH IN THIS AREA


A great deal of research has been carried out in the area of AOD owing to its significance
in anti-terrorism measures. Most methods developed recently can be classified into two major
groups: those that employ background modeling and those that rely on tracking based
detection. Most of these use Gaussian Mixture Model (GMM) for background subtraction. In
this model, the intensity at each pixel is modeled as the weighted sum of multiple Gaussian
probability distributions, with separate distributions representing the background and the
foreground. One such method first detects blobs in the foreground using pixel variance thresholds and then calculates several features for these blobs to decrease false positives. Another approach maintains two separate backgrounds (one each for long- and short-term durations) and modifies them using Bayesian learning; these are then compared with each frame to estimate dual foregrounds. One method mainly focuses on tracking an object and its owner in an indoor environment with the aim of informing the owner if someone else takes that object. Another proposed method applies GMM with three distributions for background modeling and uses these to categorize the foreground into moving, abandoned, and removed objects. A similar background modeling method has been used along with crowd filtering to isolate the moving pedestrians in the foreground from the crowd by the use of vertical line scanning. There are also approaches to background modeling that do not employ GMM; one uses an approximate median model for this purpose, and it too maintains two separate backgrounds, one of which is updated more frequently than the other.
Other approaches belong to the second class of methods, based on tracking. One tracking-based approach comprises three levels of processing: background modeling at the lowest level using feedback from higher levels, followed by person and object tracking at the middle level, and finally the person-object split at the highest level to classify an object as abandoned. Another proposed system considers the abandonment of an object to comprise four sub-events, from the arrival of the owner with the object to his departure without it; whenever the system detects an unattended object, it traces back in time to identify the person who brought it into the scene and thus identifies the owner. Tracking and detection of carried objects has also been performed using histograms, where the missing colors in the ratio histogram between the frames with and without the object are used to identify the abandoned object. Yet another method performs tracking through a trans-dimensional Markov chain Monte Carlo model that is suited to tracking generic blobs and is thus incapable of distinguishing between humans and other objects as the subject of tracking; the output of this tracking system therefore needs further processing before luggage can be identified and labeled as an object.

Literature Summary:
The object detection task can be addressed by treating the video as an unrelated sequence of frames and performing static object detection on each frame. In 2009, Felzenszwalb et al. [1] described an object detection system based on mixtures of multiscale deformable part models. Their system was able to represent highly variable object classes and achieved state-of-the-art results in the PASCAL object detection challenges. They combined a margin-sensitive approach for data-mining hard negative examples with a formalism they call latent SVM. This led to an iterative training algorithm that alternates between fixing latent values for positive examples and optimizing the latent SVM objective function. Their system relied heavily on
new methods for discriminative training of classifiers that make use of latent information. It
also relied heavily on efficient methods for matching deformable models to images. The
described framework allows for exploration of additional latent structure. For example, one
can consider deeper part hierarchies (parts with parts) or mixture models with many
components. Leibe et al. [2] in 2007, presented a novel method for detecting and localizing
objects of a visual category in cluttered real-world scenes. Their approach considered object
categorization and figure-ground segmentation as two interleaved processes that closely
collaborate towards a common goal. The tight coupling between those two processes allows
them to benefit from each other and improve the combined performance. The core part of
their approach was a highly flexible learned representation for object shape that could
combine the information observed on different training examples in a probabilistic extension
of the Generalized Hough Transform. As they showed, the resulting approach can detect
categorical objects in novel images and automatically infer a probabilistic segmentation from
the recognition result. This segmentation was then in turn used to again improve recognition
by allowing the system to focus its efforts on object pixels and to discard misleading
influences from the background. Their extensive evaluation on several large data sets showed
that the proposed system was applicable to a range of different object categories, including
both rigid and articulated objects. In addition, its flexible representation allowed it to achieve
competitive object detection performance already from training sets that were between one
and two orders of magnitude smaller than those used in comparable systems. Recently in last
decade, methods based on local image features have shown promise for texture and object
recognition tasks. Zhang et al. [3] in 2006, presented a large-scale evaluation of an approach
that represented images as distributions (signatures or histograms) of features extracted from
a sparse set of key-point locations and learnt a Support Vector Machine classifier with
kernels based on two effective measures for comparing distributions. They first evaluated the
performance of the proposed approach with different key-point detectors and descriptors, as
well as different kernels and classifiers. Then, they conducted a comparative evaluation with
several modern recognition methods on 4 texture and 5 object databases. On most of those
databases, their implementation exceeded the best reported results and achieved comparable
performance on the rest. Additionally, they also investigated the influence of background
correlations on recognition performance. In 2001, Viola and Jones [4] in a conference on
pattern recognition described a machine learning approach for visual object detection which
was capable of processing images extremely rapidly and achieving high detection rates. Their
work was distinguished by three key contributions. The first was the introduction of a new
image representation called the "integral image" which allowed the features used by their
detector to be computed very quickly. The second was a learning algorithm, based on
AdaBoost, which used to select a small number of critical visual features from a larger set
and yield extremely efficient classifiers. The third contribution was a method for combining
increasingly more complex classifiers in a "cascade" which allowed background regions of
the image to be quickly discarded while spending more computation on promising object-like
regions. The cascade could be viewed as an object specific focus-of-attention mechanism
which unlike some of the previous approaches provided statistical guarantees that discarded
regions were unlikely to contain the object of interest. They had done some testing over face
detection where the system yielded detection rates comparable to the best of previous
systems. Used in real-time applications, the detector runs at 15 frames per second without
resorting to image differencing or skin color detection. In 2000, Weber et al. [5] proposed a
method to learn heterogeneous models of object classes for visual recognition. The training
images, that they used, contained a preponderance of clutter and the learning was
unsupervised. Their models represented objects as probabilistic constellations of rigid parts
(features). The variability within a class was represented by a joint probability density
function on the shape of the constellation and the appearance of the parts. Their method
automatically identified distinctive features in the training set. The set of model parameters
was then learned using expectation maximization. When trained on different, unlabeled and
non-segmented views of a class of objects, each component of the mixture model could adapt
to represent a subset of the views. Similarly, different component models could also
specialize on sub-classes of an object class. Experiments on images of human heads, leaves
from different species of trees, and motor-cars demonstrated that the method works well over
a wide variety of objects.
CHAPTER 3

BLOCK DIAGRAM

This section explains the general block diagram of object detection and the significance of each block in the system. A common object detection pipeline mainly includes video input, preprocessing, object segmentation, and post processing. It is shown in Fig.

The significance of each block is as follows

Video Input: It can be a stored video or a real-time video stream.


Preprocessing: It mainly involves temporal and spatial smoothing, such as intensity adjustment and noise removal. For real-time systems, frame-size and frame-rate reduction are commonly used, which greatly reduces computational cost and time [1].

Object detection: This is the process of change detection; it extracts the relevant changes for further analysis and qualification. Pixels are classified as foreground if they have changed; otherwise, they are considered background. This process is called background subtraction. The degree of "change" is a key factor in segmentation and can vary depending on the application. The result of segmentation is one or more foreground blobs, a blob being a collection of connected pixels [1].

Post processing: Removes false detections caused by dynamic conditions in the background, using morphological operations and speckle-noise removal.

Several public datasets are commonly used to evaluate such systems. BMC 2012 Dataset [6]: this dataset includes real and synthetic videos and is mainly used for comparing different background subtraction techniques. Fish4Knowledge Dataset [7]: the Fish4Knowledge dataset is an underwater benchmark dataset for target detection against complex backgrounds. Carnegie Mellon Dataset [8]: the CMU sequence by Sheikh and Shah involves a camera mounted on a tall tripod; wind caused the tripod to sway back and forth, producing vibration in the scene, which makes this dataset useful for studying camera-jitter background situations.

Stored video needs to be read in an appropriate format before processing. Various related functions from the Image Processing (IP) and Computer Vision (CV) toolboxes can be used for this purpose, as sketched below.
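For example, a stored video can be read frame by frame with the VideoReader object and given simple spatial preprocessing with Image Processing Toolbox functions. The following is only an illustrative sketch; the file name is a placeholder.

    v = VideoReader('traffic_scene.avi');     % hypothetical input file
    while hasFrame(v)
        frame = readFrame(v);                 % next RGB frame
        gray  = rgb2gray(frame);              % convert to grayscale
        small = imresize(gray, 0.5);          % frame-size reduction to cut computation
        % further processing (background subtraction, etc.) goes here
    end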
CHAPTER 4

SOFTWARE

4.1 INTRODUCTION

The software used in this project is MATLAB R2015a

4.2 MATLAB R2015a

4.2.1 The Language of Technical Computing

Millions of engineers and scientists worldwide use MATLAB to analyze and design
the systems and products transforming our world. MATLAB is in automobile active safety systems, interplanetary spacecraft, health monitoring devices, smart power grids, and LTE cellular networks. It is used for machine learning, signal processing, image processing,
computer vision, communications, computational finance, control design, robotics, and much
more.
4.2.2 Math. Graphics. Programming.

The MATLAB platform is optimized for solving engineering and scientific problems.
The matrix-based MATLAB language is the world’s most natural way to express
computational mathematics. Built-in graphics make it easy to visualize and gain insights from
data. A vast library of prebuilt toolboxes lets you get started right away with algorithms
essential to your domain. The desktop environment invites experimentation, exploration, and
discovery. These MATLAB tools and capabilities are all rigorously tested and designed to
work together.

4.2.3 Scale. Integrate. Deploy.

MATLAB helps you take your ideas beyond the desktop. You can run your analyses
on larger data sets and scale up to clusters and clouds. MATLAB code can be integrated with
other languages, enabling you to deploy algorithms and applications within web, enterprise,
and production systems.

4.2.4 Key Features

 High-level language for scientific and engineering computing

 Desktop environment tuned for iterative exploration, design, and problem-solving

 Graphics for visualizing data and tools for creating custom plots

 Apps for curve fitting, data classification, signal analysis, and many other domain-specific
tasks

 Add-on toolboxes for a wide range of engineering and scientific applications

 Tools for building applications with custom user interfaces

 Interfaces to C/C++, Java, .NET, Python, SQL, Hadoop, and Microsoft Excel

 Royalty-free deployment options for sharing MATLAB programs with end users
4.2.5 Why MATLAB?

MATLAB is the easiest and most productive software for engineers and scientists.
Whether you’re analyzing data, developing algorithms, or creating models, MATLAB
provides an environment that invites exploration and discovery. It combines a high-level
language with a desktop environment tuned for iterative engineering and scientific
workflows.

4.2.6 MATLAB Speaks Math

The matrix-based MATLAB language is the world’s most natural way to express
computational mathematics. Linear algebra in MATLAB looks like linear algebra in a
textbook. This makes it straightforward to capture the mathematics behind your ideas, which
means your code is easier to write, easier to read and understand, and easier to maintain.

You can trust the results of your computations. MATLAB, which has strong roots in
the numerical analysis research community, is known for its impeccable numerics. A
MathWorks team of 350 engineers continuously verifies quality by running millions of tests
on the MATLAB code base every day.

MATLAB does the hard work to ensure your code runs quickly. Math operations are
distributed across multiple cores on your computer, library calls are heavily optimized,
and all code is just-in-time compiled. You can run your algorithms in parallel by changing
for-loops into parallel for-loops or by changing standard arrays into GPU or distributed
arrays. Run parallel algorithms in infinitely scalable public or private clouds with no code
changes.

The MATLAB language also provides features of traditional programming languages,


including flow control, error handling, object-oriented programming, unit testing, and source
control integration.

4.2.7 MATLAB Is Designed for Engineers and Scientists


MATLAB provides a desktop environment tuned for iterative engineering and
scientific workflows. Integrated tools support simultaneous exploration of data and programs,
letting you evaluate more ideas in less time.

 You can interactively preview, select, and preprocess the data you want to import.

 An extensive set of built-in math functions supports your engineering and scientific analysis.

 2D and 3D plotting functions enable you to visualize and understand your data and
communicate results.

 MATLAB apps allow you to perform common engineering tasks without having to program.
Visualize how different algorithms work with your data, and iterate until you’ve got the results
you want.

 The integrated editing and debugging tools let you quickly explore multiple options, refine your
analysis, and iterate to an optimal solution.

 You can capture your work as sharable, interactive narratives.

Comprehensive, professional documentation written by engineers and scientists is


always at your fingertips to keep you productive. Reliable, real-time technical support staff
answers your questions quickly. And you can tap into the knowledge and experience of over
100,000 community members and MathWorks engineers on MATLAB Central, an open
exchange for MATLAB and Simulink users.

MATLAB and add-on toolboxes are integrated with each other and designed to work
together. They offer professionally developed, rigorously tested, field-hardened, and fully
documented functionality specifically for scientific and engineering applications.

4.2.8 MATLAB Integrates Workflows

Major engineering and scientific challenges require broad coordination to take ideas
to implementation. Every handoff along the way adds errors and delays.

MATLAB automates the entire path from research through production. You can:

 Build and package custom MATLAB apps and toolboxes to share with other MATLAB
users.

 Create standalone executables to share with others who do not have MATLAB.
 Integrate with C/C++, Java, .NET, and Python. Call those languages directly from MATLAB,
or package MATLAB algorithms and applications for deployment within web, enterprise,
and production systems.

 Convert MATLAB algorithms to C, HDL, and PLC code to run on embedded devices.

 Deploy MATLAB code to run on production Hadoop systems.

MATLAB is also a key part of Model-Based Design, which is used for multidomain
simulation, physical and discrete-event simulation, and verification and code generation.

Figure 4.1: A MATLAB window in which the code for the project is written
CHAPTER 5

1.1 GENERAL OVERVIEW OF IMAGE PROCESSING

Image processing is a method to convert an image into digital form and perform some operations on it, in order to get an enhanced image or to extract some useful information from it. It is a type of signal processing in which the input is an image, such as a video frame or photograph, and the output may be an image or characteristics associated with that image. Usually an image processing system treats images as two-dimensional signals and applies already established signal processing methods to them.

It is among rapidly growing technologies today, with its applications in various


aspects of a business. Image Processing forms core research area within engineering and
computer science disciplines too. In imaging science, image processing is any form of signal
processing for which the input is an image, such as a photograph or video frame; the output
of image processing may be either an image or a set of characteristics or parameters related to
the image. Most image-processing techniques involve treating the image as a two-
dimensional signal and applying standard signal-processing techniques to it.

Image processing usually refers to digital image processing, but optical and analog
image processing also are possible. This article is about general techniques that apply to all of
them. The acquisition of images (producing the input image in the first place) is referred to as
imaging.
Closely related to image processing are computer graphics and computer vision.
In computer graphics, images are manually made from physical models of objects,
environments, and lighting, instead of being acquired (via imaging devices such as cameras)
from natural scenes, as in most animated movies. Computer vision, on the other hand, is often
considered high-level image processing out of which a machine/computer/software intends to
decipher the physical contents of an image or a sequence of images (e.g., videos or 3D full-
body magnetic resonance scans).

In modern sciences and technologies, images also gain much broader scopes due
to the ever growing importance of scientific visualization (of often large-scale complex
scientific/experimental data). Examples include microarray data in genetic research, or real-
time multi-asset portfolio trading in finance.

Image processing basically includes the following three steps.

 Importing the image with optical scanner or by digital photography.


 Analyzing and manipulating the image, which includes data compression, image enhancement, and spotting patterns that are not visible to human eyes, as in satellite photographs.
 Output is the last stage in which result can be altered image or report that is
based on image analysis.
1.1.1 Purpose of Image processing

The purpose of image processing is divided into 5 groups. They are:

1. Visualization - Observe the objects that are not visible.


2. Image sharpening and restoration - To create a better image.
3. Image retrieval - Seek for the image of interest.
4. Measurement of pattern – Measures various objects in an image.
5. Image Recognition – Distinguish the objects in an image.
1.1.2 Types

The two types of methods used for image processing are analog and digital image processing. Analog or visual techniques of image processing can be used for hard copies like printouts and photographs. Image analysts use various fundamentals of interpretation while using these visual techniques. Image processing in this case is not confined only to the area that has to be studied but also depends on the knowledge of the analyst. Association is another important tool in image processing through visual techniques, so analysts apply a combination of personal knowledge and collateral data to image processing.

 Digital processing techniques help in the manipulation of digital images using computers. Raw data from imaging sensors on satellite platforms contains deficiencies; to overcome such flaws and recover the original information, it has to undergo various phases of processing. The three general phases that all types of data undergo with the digital technique are pre-processing, enhancement and display, and information extraction. Fig. 1 represents the hierarchy of image processing.

Fig.1: Hierarchy of Image Processing

8.3.1. Images in MATLAB


The first step in MATLAB image processing is to understand that a digital image is
composed of a two or three dimensional matrix of pixels. Individual pixels contain a number
or numbers representing what grayscale or color value is assigned to it. Color pictures
generally contain three times as much data as grayscale pictures, depending on what color
representation scheme is used. Therefore, color pictures take three times as much
computational power to process. In this tutorial the method for conversion from color to
grayscale will be demonstrated and all processing will be done on grayscale images.
However, in order to understand how image processing works, we will begin by analyzing
simple two dimensional 8-bit matrices.

8.3.2. Loading an Image

Many times you will want to process a specific image, other times you may just want
to test a filter on an arbitrary matrix. If you choose to do this in MATLAB you will need to
load the image so you can begin processing. If the image that you have is in color, but color
is not important for the current application, then you can change the image to grayscale. This
makes processing much simpler since then there are only a third of the pixel values present in
the new image. Color may not be important in an image when you are trying to locate a
specific object that has good contrast with its surroundings. Example 4.1, below,
demonstrates how to load different images.

If colour is not an important aspect then rgb2gray can be used to change a color image
into a grayscale image. The class of the new image is the same as that of the color image. As
you can see from the example M-file in Figure 4.1, MATLAB has the capability of loading
many different image formats, two of which are shown. The function imread is used to read an image file with a specified format; consult imread in MATLAB's help to find which formats are supported. The function imshow displays an image, while figure tells MATLAB which
figure window the image should appear in. If figure does not have a number associated with
it, then figures will appear chronologically as they appear in the M-file. Figures 8, 9, 10 and
11, below, are a loaded bitmap file, the image in Figure 8 converted to a grayscale image, a
loaded JPEG file, and the image in Figure 11 converted to a grayscale image, respectively.
The images used in this example are both MATLAB example images. In order to
demonstrate how to load an image file, these images were copied and pasted into the folder
denoted in the M-file in Figure 4.1. In Example 7.1, later in this tutorial, you will see that
MATLAB images can be loaded by simply using the imread function. However, this function
will only load an image stored in:
Figure 8: Bitmap Image Figure 9: Grayscale Image

Figure 10: JPEG Image Figure 11: Grayscale Image
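Since the original M-file figure is not reproduced here, a minimal sketch of the loading step described above is given below; peppers.png is a standard MATLAB demo image, and the bitmap file name is only an illustrative placeholder.

    I1 = imread('peppers.png');    % load an RGB demo image shipped with MATLAB
    G1 = rgb2gray(I1);             % convert the colour image to grayscale
    figure(1); imshow(I1);         % display the original image in figure 1
    figure(2); imshow(G1);         % display the grayscale image in figure 2
    I2 = imread('myimage.bmp');    % loading a bitmap file works the same way (hypothetical file)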

8.3.3 Writing an Image

Sometimes an image must be saved so that it can be transferred to a disk or opened


with another program. In this case you will want to do the opposite of loading an image,
reading it, and instead write it to a file. This can be accomplished in MATLAB using the
imwrite function. This function allows you to save an image as any type of file supported by
MATLAB, which are the same as supported by imread. Figure 12 shows the image for
saving the image using m-file.
Figure 12: M-file for Saving an Image
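A minimal sketch of the saving step, mirroring the M-file in Figure 12, is given below; the grayscale image G1 from the loading sketch above is assumed, and the output file names are illustrative placeholders.

    imwrite(G1, 'gray_output.png');                  % format inferred from the extension
    imwrite(G1, 'gray_output.jpg', 'Quality', 90);   % JPEG with an explicit quality setting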

8.4. Image Properties

8.4.1. Histogram

A histogram is a bar graph that shows a distribution of data. In image processing


histograms are used to show how many of each pixel value are present in an image.
Histograms can be very useful in determining which pixel values are important in an image.
From this data you can manipulate an image to meet your specifications. Data from a
histogram can aid you in contrast enhancement and thresholding. In order to create a
histogram from an image, use the imhist function. Contrast enhancement can be performed
by the histeq function, while thresholding can be performed by using the graythresh
function and the im2bw function. See Figure 14,15,16,17 for a demonstration of imhist,
imadjust, graythresh, and im2bw. If you want to see the resulting histogram of a contrast
enhanced image, simply perform the imhist operation on the image created with histeq.
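The following sketch exercises the functions named above on a grayscale image G1 (for example, the one loaded earlier); it is illustrative only.

    figure; imhist(G1);          % histogram of pixel values
    E1 = histeq(G1);             % contrast-enhanced image
    level = graythresh(G1);      % Otsu threshold, normalised to [0,1]
    BW = im2bw(G1, level);       % binary image obtained by thresholding
    figure; imhist(E1);          % histogram of the contrast-enhanced image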

8.4.2. Negative

The negative of an image means the output image is the reversal of the input image.
In the case of an 8-bit image, the pixels with a value of 0 take on a new value of 255, while
the pixels with a value of 255 take on a new value of 0. All the pixel values in between take
on similarly reversed new values. The new image appears as the opposite of the original.
The imadjust function performs this operation. See Figure 13 for an example of how to use
imadjust to create the negative of the image. Another method for creating the negative of an
image is to use imcomplement, which is described in Figure 13.

Figure 13: M-file for Creating Histogram, Negative, Contrast Enhanced and
Binary Images from the Image

Figure 14: Histogram Figure 15: Negative


Figure 16: Contrast Enhanced Figure 17: Binary
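The two ways of creating a negative described in Section 8.4.2 can be sketched as follows for an 8-bit grayscale image G1 (an assumed input).

    N1 = imadjust(G1, [0 1], [1 0]);   % map 0 to 255 and 255 to 0
    N2 = imcomplement(G1);             % the complement gives the same negative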

8.4.3. Median Filters


Median Filters can be very useful for removing noise from images. A median filter is
like an averaging filter in some ways. The averaging filter examines the pixel in question and
its neighbor’s pixel values and returns the mean of these pixel values. The median filter
looks at this same neighborhood of pixels, but returns the median value. In this way noise
can be removed, but edges are not blurred as much, since the median filter is better at
ignoring large discrepancies in pixel values. The example below shows how to perform a median filtering operation.
This example uses two types of median filters that both output the same result. The
first filter is medfilt2, which takes the median value of the pixel in question and its neighbors.
In this case it outputs the median value of nine pixels being examined. The second filter,
ordfilt2, does the exact same thing in this configuration, but can be configured to perform
other types of filtering. In this case, it looks at every pixel in the 3x3 matrix and outputs the
value in the fifth position of rank, which is the median position. In other words it outputs a
value, where half the pixel values are greater and half are less, in the matrix.
Figure 18: Noisy Image

Figure 19: medfilt2 Figure 20: ordfilt2

Figure 18, above, depicts the noisy image. Figure 19, above, is the output of filtering the image in Figure 18 with a 3x3 two-dimensional median filter. Figure 20, above, is the same as Figure 19, but was achieved by filtering the image in Figure 18 with ordfilt2, configured to produce the same result as medfilt2. Notice how both filters produce the same result: each is able to remove the noise without blurring the edges in the image too much.
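A short sketch of the two median-filtering calls discussed above is given here; the salt-and-pepper noise level is an illustrative choice, and G1 is an assumed grayscale input.

    noisy = imnoise(G1, 'salt & pepper', 0.02);   % add synthetic noise
    M1 = medfilt2(noisy, [3 3]);                  % 3x3 median filter
    M2 = ordfilt2(noisy, 5, ones(3, 3));          % 5th of 9 sorted values, i.e. the median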

8.4.4 Edge Detectors

Edge detectors are very useful for locating objects within images. There are many
different kinds of edge detectors, but we will concentrate on two: the Sobel edge detector and
the Canny edge detector. The Sobel edge detector is able to look for strong edges in the
horizontal direction, vertical direction, or both directions. The Canny edge detector detects
all strong edges plus it will find weak edges that are associated with strong edges. Both of
these edge detectors return binary images with the edges shown in white on a black
background. The example below demonstrates the use of these edge detectors.

The Canny and Sobel edge detectors are both demonstrated in this example.
Figure 21, below, is a sample M-file for performing these operations. The image used is the
MATLAB image, rice.tif, which can be found in the manner described in Example 4.1. Two
methods for performing edge detection using the Sobel method are shown. The first method
uses the MATLAB functions, fspecial, which creates the filter, and imfilter, which applies the
filter to the image. The second method uses the MATLAB function, edge, in which you must
specify the type of edge detection method desired. Sobel was used as the first edge detection
method, while Canny was used as the next type. Figure 21, below, displays the results of these operations. The first image is the original image; the image denoted Horizontal
Sobel is the result of using fspecial and imfilter. The image labeled Sobel is the result of
using the edge filter with Sobel specified, while the image labeled Canny has Canny
specified.

The Zoom In tool was used to depict the detail in the images more clearly. As you
can see, the filter used to create the Horizontal Sobel image detects horizontal edges much
more readily than vertical edges. The filter used to create the Sobel image detected both
horizontal and vertical edges. This resulted from MATLAB looking for both horizontal and
vertical edges independently and then summing them. The Canny image demonstrates how
well the Canny method detects all edges. The Canny method does not only look for strong
edges, as in the Sobel method, but also will look for weak edges that are connected to strong
edges and show those, too.
Figure 21: Images Created by Different Edge Detection Methods
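A sketch of the two Sobel approaches and the Canny detector described above follows; note that recent MATLAB releases ship the demo image as rice.png rather than rice.tif.

    I = imread('rice.png');                 % demo image (the text refers to rice.tif)
    hSobel = fspecial('sobel');             % horizontal-edge emphasising kernel
    edgesH = imfilter(double(I), hSobel);   % method 1: fspecial + imfilter
    edgesSobel = edge(I, 'sobel');          % method 2: edge with the Sobel option
    edgesCanny = edge(I, 'canny');          % Canny: strong edges plus connected weak edges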
CHAPTER 6

BLOCK DIAGRAM AND DESCRIPTION


Figure 6.1: Block diagram of the proposed system

BLOCK DIAGRAM DESCRIPTION

Figure above shows the block diagram (flow chart) for the proposed system. The
problem is divided into different modules.

First, the video input is given to the system; in a real-time deployment this comes from a camera, but here testing is done using recorded videos of real situations. The videos given as input have the same resolution as the video from a camera.

In the second stage the video is split into frames so that it can be analyzed frame by frame; a single frame is analyzed at a time.

In the third stage we check for any faces in the incoming frames; if faces are found, that frame is saved to a memory location and the coordinates of the faces are kept in an array, as sketched below.
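The report does not name a particular face detector; one plausible realisation of this step, using the Viola-Jones cascade detector from the Computer Vision System Toolbox, is sketched here. The variable frame (the current video frame) is assumed to exist.

    faceDetector = vision.CascadeObjectDetector();    % frontal-face model by default
    bboxes = step(faceDetector, frame);               % each row is [x y width height]
    if ~isempty(bboxes)
        % centroid of each detected face, kept for the later owner search
        faceCentroids = [bboxes(:,1) + bboxes(:,3)/2, bboxes(:,2) + bboxes(:,4)/2];
        savedFrame = insertShape(frame, 'Rectangle', bboxes);   % frame with faces marked
    end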

The next stage checks whether the frame is the first frame. If it is, it is stored as the background image after conversion from RGB to grayscale. Subsequent frames are also converted to grayscale and compared with the stored background image. If any change is detected, that is, a new object (moving or stationary) appears in later frames, it is saved as the foreground image. If there is no change, the result of background subtraction is zero, i.e., we get a black image.

After change detection we ignore the moving objects and concentrate on stationary objects that were not in the scene before, so motion changes are suppressed using a thresholding method and the stationary objects are found. Stationary objects of very small area are discarded because they are likely to be noise, and only objects within a certain range of area are kept. After this operation the resulting image is a binary image in which detected objects appear as white and everything else as black.

Next, the centroids of the objects are calculated, including those of the faces saved earlier to a particular location.

Then the stationary objects are tracked in order to detect when one is abandoned. Each object is monitored, and if it remains alone, without the presence of its owner, for a particular time delay, the object is termed abandoned. When an object is detected as abandoned, the face centroid at minimum distance from the object centroid is found in order to identify the face of the owner.

Finally, an alarm is raised or a message is passed to the authority on detection of the abandoned object, and the frame from the video in which the face of the owner is marked inside a box is displayed.

If no object is detected as abandoned, the process is repeated, looking for any change in the incoming frames, i.e., looking for a foreground image. This process repeats.

6.2 PROPOSED ALGORITHM

The system architecture is shown in figure (6.1). It represents the process of the system. The system obtains a video from a video surveillance camera (e.g. CCTV) or a video file. Then objects are detected using image processing techniques. The output of the system is an event classification result acquired from a decision-making step. The result of the system processing can be viewed via a user interface, which is a TV screen or a computer screen.

A. Video Acquisition

This processing unit imports the video from a video stream and captures it as a sequence of frames.
1) Video Stream – This method receives a streaming video from a file or a CCTV camera. Currently, the following video file formats are supported: mp4, avi, bmp and others. So far, this work handles only one video, from one camera or one video file, at a time.
2) Sequence Frame – After the program reads the video file, it takes and processes each image by querying frames from the video file.
3) Captured Image Display – A window is created in which the captured images from the camera are shown.
B. Gaussian Mixture Model (GMM)
GMM is a density model that consists of several Gaussian component functions. This method can perform well when used for background extraction because of its reliability against changes in light and conditions during repeated object detection [3]. Each pixel in the video scene is modeled by a mixture of Gaussian distributions, and each pixel in the frame is compared with the model formed by the GMM. Pixels whose values fall within the standard deviation of a component with a high weight factor are considered background, while pixels with higher deviation and lower weight factor are considered foreground [7]. Each pixel is then categorized into one of the GMM candidate models: if the color of the pixel matches a background model, the pixel is given zero (0), i.e., black; if the pixel is not categorized into any background model, it is considered foreground and given one (1), i.e., white. The resulting binary image is then processed further. Foreground is a moving object that changes position in every frame of the video (dynamic), while background is an object whose position is unchanged in every video frame (static) [3]. After the foreground object is detected, a filtering process is done to fill the holes in the foreground object; this work uses morphological processing to filter the noise and fill the holes in the detected object. A minimal sketch of this stage is given below.
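The sketch below uses the Computer Vision System Toolbox GMM-based foreground detector; the parameter values and the file name are illustrative assumptions, not the exact settings of this project.

    v = VideoReader('surveillance.avi');                      % hypothetical input video
    detector = vision.ForegroundDetector('NumGaussians', 3, ...
        'NumTrainingFrames', 50, 'LearningRate', 0.005);
    se = strel('square', 3);
    while hasFrame(v)
        frame  = readFrame(v);
        fgMask = step(detector, frame);    % 1 (white) = foreground, 0 (black) = background
        fgMask = imopen(fgMask, se);       % morphological noise removal
        fgMask = imfill(fgMask, 'holes');  % fill holes inside detected objects
    end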

C. OTSU Method

Otsu's method is used to convert a gray-level image into a binary image. The two clusters are obtained by the Otsu method based on a threshold value, and the statistical measures are optimized [16]; it performs automatic thresholding of gray-level images and provides good segmentation of the object in an image. In Otsu's method, background pixels belong to one class and foreground pixels belong to another class. The weighted variances of the two classes are calculated and added, giving the within-class variance for a candidate threshold t:

σ_w²(t) = ω_b(t) σ_b²(t) + ω_f(t) σ_f²(t)

where ω_b(t) and ω_f(t) are the weights (class probabilities) of the background and foreground pixels and σ_b²(t), σ_f²(t) are their variances; the Otsu threshold is the value of t that minimizes σ_w²(t). Otsu shows that minimizing the intra-class variance is the same as maximizing the inter-class variance:

σ_B²(t) = ω_b(t) ω_f(t) [μ_b(t) - μ_f(t)]²

where ω_b, ω_f are the class probabilities and μ_b, μ_f are the class means.

The class probability is computed from the normalized histogram p(i) of the L gray levels:

ω_b(t) = Σ_{i=0}^{t-1} p(i),   ω_f(t) = Σ_{i=t}^{L-1} p(i)

while the class mean is:

μ_b(t) = [ Σ_{i=0}^{t-1} i·p(i) ] / ω_b(t),   μ_f(t) = [ Σ_{i=t}^{L-1} i·p(i) ] / ω_f(t)
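In MATLAB the Otsu threshold is computed by graythresh and applied with im2bw, as in this short sketch on a built-in demo image.

    I = imread('coins.png');      % grayscale demo image
    level = graythresh(I);        % Otsu threshold, normalised to [0,1]
    bw = im2bw(I, level);         % foreground pixels become 1 (white), background 0 (black)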

D. Kernel Density Estimation:

An approximation of the background pdf can be given by the histogram of the most recent values
classified as background values. However, as the number of samples is necessarily limited, such an
approximation suffers from significant drawbacks: the histogram, as a step function, might provide
poor modeling of the true, unknown pdf, with the “tails” of the true pdf often missing. In order to
address such issues, Elgammal et al. in [7] have proposed to model the background distribution by a
non-parametric model based on Kernel Density Estimation (KDE) on the buffer of the last n
background values. KDE guarantees a smoothed, continuous version of the histogram.

In [7], the background pdf is given as a sum of Gaussian kernels centered in the most recent n background values x_i:

P(x_t) = (1/n) Σ_{i=1}^{n} N(x_t; x_i, Σ_t)    (7)

As in (4), this appears to be a sum of Gaussians. However, the differences are substantial: in
(4), each Gaussian describes a main “mode” of the pdf and is updated over time; here, instead, each
Gaussian describes just one sample data, with n in the order of 100, and Σt is the same for all kernels.
If background values are not known, unclassified sample data can be used in their place; the initial
inaccuracy will be recovered along model updates. Based on (7), classification of xt as foreground can
be straightforwardly stated if P(xt) < T.

Model update is obtained by simply updating the buffer of background values in FIFO order with
selective update (see Sect. 2.1): in this way, “pollution” of the model (7) by foreground values is
prevented. However, complete model estimation also requires the estimation of Σt (which is assumed
diagonal for simplicity). This is a key problem in KDE. In [7], the variance is estimated in the time
domain by analysing the set of differences between consecutive values. The model proposed in [7] is
actually more complex than what has been outlined so far. First, in order to address the issue of the
time scale, two similar models are used concurrently, one for long-term and the other for short-term
memory. Second, the long-term model is updated with a blind update mechanism so as to prevent
undesired exclusion from the model of incorrectly classified background pixels. Furthermore, it
explicitly addresses the problem of spatial correlation in the modeling of values from neighbouring
pixel locations, as described hereafter. All the approaches in Sects. 2.1-2.3 model single pixel
locations independently. However, it is intuitive that neighbouring locations will exhibit spatial
correlation in the modeling and classification of values. To exploit this property, various
morphological operations have been used to refine the binary map of classified foreground pixels. In
[7], instead, this same issue is addressed at the model level, by evaluating P(xt) also in the models of
neighbouring pixels and using the maximum value found in the comparison against T.
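
As an illustration of the per-pixel classification in (7), a minimal MATLAB sketch is given below. It assumes a scalar gray value per pixel, a single fixed kernel bandwidth sigma, and a buffer holding the last n background samples; the function name kdeClassify and all parameter values are hypothetical.

function fg = kdeClassify(x, buffer, sigma, T)
% x      : current pixel value (scalar gray level)
% buffer : 1-by-n vector of the most recent background samples for this pixel
% sigma  : Gaussian kernel bandwidth (plays the role of a scalar Sigma_t)
% T      : probability threshold
n  = numel(buffer);
p  = sum(exp(-(x - buffer).^2 / (2*sigma^2))) / (n * sqrt(2*pi) * sigma);   % Eq. (7)
fg = p < T;        % classify as foreground when the background probability is low
end

For example, fg = kdeClassify(128, bgBuffer, 15, 1e-4) would test one pixel value against its buffer bgBuffer of recent background samples.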

E. Object Detection:

Object detection is the process of locating the region of interest according to the user's
requirement. Here we propose an algorithm for object detection using the frame-difference
method (one of the background subtraction algorithms). The steps are given below, and a minimal
MATLAB sketch of them follows the list:
a) Read all the image frames generated from the video, which are stored in a variable or on a
storage medium.
b) Convert each colored frame into a grayscale image using rgb2gray( ).
c) Store the first frame as the background image.
d) Calculate the difference as |frame[i] - background frame|.
e) If the difference is greater than a threshold (rth), the pixel is considered part of the
foreground; otherwise it belongs to the background (no change is detected).
f) Update the value of i by incrementing it by one.
g) Repeat steps d to f up to the last image frame.
h) End the process.
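
The sketch below follows the steps above; the file name 'input.avi' and the threshold value rth = 30 are illustrative assumptions.

v          = VideoReader('input.avi');         % hypothetical input video
background = rgb2gray(readFrame(v));           % step c: first frame as background
rth        = 30;                               % assumed difference threshold
while hasFrame(v)                              % steps d-g: loop over the remaining frames
    frame  = rgb2gray(readFrame(v));
    diffIm = abs(double(frame) - double(background));
    fgMask = diffIm > rth;                     % foreground where the change exceeds rth
    imshow(fgMask); drawnow;
end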
F. Post Processing:

The object detected in the previous phase may suffer from connectivity problems and may also
contain holes, which hinder object representation. Therefore, some post-processing is needed to
handle the holes and the connectivity of pixels within the object region. Mathematical morphological
analysis is one post-processing approach that enhances the segmented image in order to improve the
result. In the proposed method, erosion and dilation are applied iteratively so that the object
appears clearly in the foreground while the remaining spurious blobs are removed. Morphological
operations are useful for extracting meaningful components from the image, such as the object
boundary, region, shape, and skeleton.
Dilation: Dilation is an increasing transform, used to fill small holes and narrow chasms in the
objects.
Erosion: Erosion, as a morphological transformation, can be used to find the contours of the
objects; it shrinks or reduces the object regions.
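
A minimal MATLAB sketch of this post-processing is given below, assuming fgMask is the binary foreground mask from the previous step; the disk radius is an illustrative choice.

se      = strel('disk', 3);          % structuring element (radius is illustrative)
cleaned = imerode(fgMask, se);       % erosion: remove small spurious blobs
cleaned = imdilate(cleaned, se);     % dilation: restore object size, bridge narrow gaps
cleaned = imfill(cleaned, 'holes');  % fill the remaining holes inside the object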
G. Feature Selection:

Features such as the centroid, height, and width of an object are selected so that the location
of non-rigid bodies/objects can easily be plotted from frame to frame. The proposed method evaluates
the centroid of the detected object in each frame. It is assumed that, after the morphological
operations, no false objects remain. The centroid of the object in a two-dimensional frame is then
calculated as the average of the x and y coordinates of the pixels belonging to the object:

Cx = total moment in x-direction / total area ------ (1)

Cy = total moment in y-direction / total area ------ (2)
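
A minimal MATLAB sketch of Eqns (1) and (2) is given below, assuming fgMask is the post-processed binary mask of a single object.

[rows, cols] = find(fgMask);    % coordinates of all pixels belonging to the object
area = numel(rows);             % total area (number of object pixels)
Cx   = sum(cols) / area;        % Eqn (1): total moment in x-direction / total area
Cy   = sum(rows) / area;        % Eqn (2): total moment in y-direction / total area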

H. Object Representation:

Here we use the centroid and a rectangle covering the object boundary to represent the object.
After calculating the centroid, the width Wi and height Hi of the object are found by extracting
the positions of the pixels Pxmax and Pxmin that have the maximum and minimum X-coordinate values
within the object. Similarly, Pymax and Pymin are calculated for the Y coordinates.
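
A minimal MATLAB sketch of this representation is given below, again assuming fgMask is the binary mask of a single detected object and that the frame is already displayed with imshow.

[rows, cols] = find(fgMask);                % object pixel coordinates
Pxmin = min(cols);  Pxmax = max(cols);      % extreme X coordinates of the object
Pymin = min(rows);  Pymax = max(rows);      % extreme Y coordinates of the object
Wi = Pxmax - Pxmin + 1;                     % object width
Hi = Pymax - Pymin + 1;                     % object height
rectangle('Position', [Pxmin, Pymin, Wi, Hi], 'EdgeColor', 'r');  % draw the bounding box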

I. Trajectory Plot:

After object detection using the frame-differencing method, the detected components are given
as input to the tracking process to plot the trajectory. The frame-differencing algorithm provides
all the pixel values of the detected object, and the centroid of the object is calculated using
equations (1) and (2). Here the input is the pixel values of the object and the output is the
rectangular area around the object. This process calculates the centroid, height, and width of the
object for the purpose of trajectory plotting.

J. Video Analytics Processing:

Segmentation is the process of detecting changes and extracting the relevant changes for
further analysis and qualification. Pixels that have changed from their previous values are referred
to as "foreground pixels"; those that do not change are called "background pixels". The segmentation
method used here is background subtraction: the image pixels remaining after the background has been
subtracted are the foreground pixels. The key factor used to identify foreground pixels is the
"degree of change" in segmentation, which can vary depending on the application.
The segmentation result is one or more foreground blobs, where a blob is simply a collection of
connected foreground pixels.
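
A minimal MATLAB sketch of extracting the blobs from the segmented mask is given below, using the Image Processing Toolbox functions bwconncomp and regionprops on the binary mask fgMask.

cc    = bwconncomp(fgMask);                                  % group connected foreground pixels into blobs
blobs = regionprops(cc, 'Area', 'Centroid', 'BoundingBox');  % per-blob properties used by the tracker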

K. Tracking

The next process after object detection is tracking the different blobs so as to find which
blobs correspond to abandoned objects. The first step in this process is to create a set, Track,
whose elements have three variables: blobProperties, hitCount and missCount. The next step is to
analyse the incoming image for all the blobs. If the change in area and the change in centroid
position with respect to any element of the set Track are below a threshold value, we increment
hitCount and reset missCount to zero; otherwise we create a new element in the Track set,
initializing the blobProperties variable with the properties of the incoming blob and setting
hitCount and missCount to zero. We then run a loop through all the elements of the set. If the
hitCount goes above a user-defined threshold value, an alarm is triggered. If the missCount goes
above a threshold, we delete the element from the set. These two steps are repeated until there are
no incoming images.
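
A minimal MATLAB sketch of one iteration of this Track-set update is given below; the function name updateTracks, the field names, and the thresholds are illustrative assumptions, and blobs is assumed to be the regionprops output from the segmentation step.

function [tracks, alarm] = updateTracks(tracks, blobs, areaTh, distTh, hitTh, missTh)
% tracks: struct array with fields blobProps, hitCount, missCount
% blobs : struct array with fields Area and Centroid (e.g. from regionprops)
matchedTrack = false(1, numel(tracks));
for b = 1:numel(blobs)
    matched = false;
    for t = 1:numel(tracks)
        dArea = abs(blobs(b).Area - tracks(t).blobProps.Area);
        dCent = norm(blobs(b).Centroid - tracks(t).blobProps.Centroid);
        if dArea < areaTh && dCent < distTh        % blob matches an existing track
            tracks(t).hitCount  = tracks(t).hitCount + 1;
            tracks(t).missCount = 0;
            matchedTrack(t) = true;
            matched = true;
            break;
        end
    end
    if ~matched                                    % unmatched blob: start a new track
        newTrack.blobProps = blobs(b);
        newTrack.hitCount  = 0;
        newTrack.missCount = 0;
        tracks = [tracks, newTrack];               %#ok<AGROW>
    end
end
for t = 1:numel(matchedTrack)                      % count a miss for unmatched old tracks
    if ~matchedTrack(t)
        tracks(t).missCount = tracks(t).missCount + 1;
    end
end
alarm = any([tracks.hitCount] > hitTh);            % blob stationary long enough: raise alarm
tracks([tracks.missCount] > missTh) = [];          % drop tracks that have disappeared
end

The set can be initialized once with tracks = struct('blobProps', {}, 'hitCount', {}, 'missCount', {}) and then updated once per frame with [tracks, alarm] = updateTracks(tracks, blobs, areaTh, distTh, hitTh, missTh).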

L. Alarm and Display

We use the raise-alarm flag from the previous stages and highlight the part of the video for
which the alarm has been raised. We also display the face of the person who abandoned the luggage
at that particular location. The face of the person is captured by the following procedure:
we save the faces and their coordinates from every frame and calculate their centroids together
with the object centroid. When an object is detected as abandoned, we find the face centroid at
minimum distance from the object centroid, display that particular frame from the video, and
indicate the face of the person inside a box.
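
A minimal MATLAB sketch of this owner-selection step is given below. It assumes the faces are detected with the Viola-Jones detector from the Computer Vision System Toolbox, that faceCentroids collects the face centroids of a frame, and that objCentroid is the 1-by-2 centroid of the abandoned object; all variable names are illustrative.

faceDetector = vision.CascadeObjectDetector();         % Viola-Jones face detector
bboxes = step(faceDetector, frame);                    % face bounding boxes [x y w h] in one frame
faceCentroids = [bboxes(:,1) + bboxes(:,3)/2, ...      % centroid of each face box
                 bboxes(:,2) + bboxes(:,4)/2];

d = sqrt(sum((faceCentroids - repmat(objCentroid, size(faceCentroids, 1), 1)).^2, 2));
[~, idx] = min(d);                                     % face closest to the abandoned object
ownerBox = bboxes(idx, :);                             % assumed owner's face, to be shown inside a box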
CHAPTER 7

SIMULATION AND RESULTS

7.1 INTRODUCTION
The simulation of owner detection and abandoned-object detection is carried out using
MATLAB R2015a, considering the background image in a video of 300 frames. The simulation results
for all the techniques are explained below. First, the moving objects in the video images are
tracked based on image segmentation, background subtraction, and object detection techniques. The
simulation results of the algorithms are shown below. Abandoned object detection: sample frames
from the input video sequence are shown in Figure 6.1. The detected abandoned object is shown in
Figure 6.2a, marked in red. The results obtained for owner detection are shown in Figure 6.2b;
the face of the owner is marked with a green box.

Figure 6.1a: Sample frames from the video


Figure 6.1b: Sample frames from the video showing face detection

Figure 6.2a: Abandoned object detection (the object is bounded by a red box)


Figure 6.2b: Identified owner face (the owner's face is bounded by a green box)

7.2 ADVANTAGES

 Produces few false alarms and missed detections

 Provides timely security

 No special sensors are required

 It can also identify the owner's face


7.3 DISADVANTAGES

 In a high-density scenario, the object may be hidden from the camera view for most of the
time, leading to a failure in detection.

7.4 Summary

The proposed system was simulated using MATLAB R2015a and the results are satisfactory. The
simulation results for a test video are shown in the figures above, which indicate that the system
works well on different video streams from practical situations. A real-time implementation was
also attempted, but the owner of the luggage was sometimes missed because of variations in some
parameters, while object detection itself worked correctly. This system can be considered a
foundation for a truly robust framework that requires only a little calibration to perform well in
practically any scenario.
CHAPTER 8

FUTURE SCOPE & CONCLUSION
Owing to the modular nature of this system, it is quite easy to add more sophisticated
methods to any of its modules. The relative simplicity of the tracking algorithm suggests that a
DSP implementation is feasible.

This project work has been completed successfully and satisfactorily. One of the shortcomings
of the system is occlusion, since the project uses a single camera view from a fixed point;
occlusion can be avoided by using multiple camera views and a more robust object detection
algorithm. There were also some problems in face detection due to environmental changes, but the
testing results gave a satisfactory output. We therefore conclude that this system can be
considered a foundation for a truly robust framework that requires only a little calibration to
perform well in practically any scenario.
REFERENCES

[1] A. Kulkarni, S. Kulkarni, A. Patil, and G. Patil, "An Abandoned Object Detection from Real Time
Video", International Journal of Scientific & Engineering Research, vol. 5, issue 5, May 2014,
ISSN 2229-5518.

[2] F. Porikli, Y. Ivanov, and T. Haga, "Robust Abandoned Object Detection Using Dual Foregrounds",
EURASIP Journal on Advances in Signal Processing, vol. 2008, 2008.

[3] M. Bhargava et al., "Detection of Abandoned Objects in Crowded Environments", Proc. IEEE Conf. on
Advanced Video and Signal Based Surveillance, pp. 271-276, 2007.

[4] M. Spengler and B. Schiele, "Automatic Detection and Tracking of Abandoned Objects", Joint IEEE
International Workshop on Visual Surveillance and PETS, 2003.

[5] P. L. Venetianer, Z. Zhang, W. Yin, and A. J. Lipton, "Stationary Target Detection Using the
ObjectVideo Surveillance System", IEEE International Conference on Advanced Video and Signal Based
Surveillance, London, UK, September 2007.

[6] R. Cucchiara et al., "Detecting Moving Objects, Ghosts, and Shadows in Video Streams", IEEE Trans.
on Pattern Analysis and Machine Intelligence, 25(10):1337-1342, 2003.

[7] H. H. Liao, J. Y. Chang, and L. G. Chen, "A Localized Approach to Abandoned Luggage Detection with
Foreground-Mask Sampling", IEEE International Conference on Advanced Video and Signal Based
Surveillance, 2008, pp. 132-139.

[8] Y. Tian et al., "Robust Detection of Abandoned and Removed Objects in Complex Surveillance Videos",
IEEE Trans. on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. PP(99), pp. 1-12,
2010.

[9] M. Beynon, D. Hook, M. Seibert, A. Peacock, and D. Dudgeon, "Detecting Abandoned Packages in a
Multi-camera Video Surveillance System", IEEE International Conference on Advanced Video and
Signal-Based Surveillance, 2003.

[10] N. Bird, S. Atev, N. Caramelli, R. Martin, O. Masoud, and N. Papanikolopoulos, "Real Time, Online
Detection of Abandoned Objects in Public Areas", IEEE International Conference on Robotics and
Automation, May 2006.

[11] S. Ferrando, G. Gera, and C. Regazzoni, "Classification of Unattended and Stolen Objects in
Video-Surveillance System", IEEE International Conference on Video and Signal Based Surveillance
(AVSS '06), 2006.

[12] http://in.mathworks.com/products/matlab/
