
INDUSTRIAL INTERNSHIP REPORT

ON
"IMAGE ENHANCEMENT, OBJECT DETECTION AND
TRACKING USING MATLAB"
*************************************

Undertaken at
LASTEC, DRDO
(Defence Research & Development Organization)
LASER SCIENCE AND TECHNOLOGY CENTRE
METCALFE HOUSE COMPLEX, CIVIL LINES, DELHI -110054
(MAY 2017- JULY 2017)

Under the sincere supervision of

Mr. Dipak Mallik


(Scientist ‘E’, LASTEC, DRDO)
And
Dr. Ravindra Singh
(Scientist ‘G’, LASTEC, DRDO)
Submitted by:

Name – Rachit Goyal


Roll No. – 14110106
Course – B. Tech
Branch – Electrical Engineering
IIT GANDHINAGAR

DECLARATION

I hereby declare that the project work entitled "IMAGE
ENHANCEMENT, OBJECT DETECTION AND TRACKING
USING MATLAB" is an authentic record of my own work, carried
out at the Laser Science and Technology Centre (LASTEC), DRDO,
under the guidance of Mr. Dipak Mallik, Sc. 'E', in the lab headed
by Dr. Ravindra Singh, Sc. 'G', Control Interface & Tracking
(CIT), LASTEC, DRDO, from 15th May 2017 to 10th July 2017.

RACHIT GOYAL
(TRAINEE)

ACKNOWLEDGEMENT

I am grateful to the authorities of LASTEC, DRDO for having
permitted me to undertake the training on "IMAGE
ENHANCEMENT, OBJECT DETECTION AND TRACKING
USING MATLAB" and to gain industrial experience with the
MATLAB platform and other fundamentals used in video processing.

I am particularly thankful to Dr. Ravindra Singh, Sc. 'G', Group
Head (Control Interface & Tracking Group), for his valuable guidance
and advice during the course of this project.

I want to express my sincere thanks to Mr. Dipak Mallik, Sc. 'E', for
his kind and continual support, his constructive suggestions during
the course of this project, and for providing me an opportunity to
gain exposure to a real-life application of object tracking.

Finally, I express my indebtedness to all who have directly or
indirectly contributed to the successful completion of my
internship.

RACHIT GOYAL

LASER SCIENCE AND TECHNOLOGY CENTER
DEFENCE RESEARCH AND DEVELOPMENT ORGANISATION,
MINISTRY OF DEFENCE
METCALFE HOUSE, NEW DELHI-110054

CERTIFICATE
This is to certify that the training project titled "IMAGE ENHANCEMENT,
OBJECT DETECTION AND TRACKING USING MATLAB", submitted
by RACHIT GOYAL, student of B.Tech (Electrical Engineering), IIT
Gandhinagar, Gandhinagar, Gujarat, was done under my guidance in the Laser
Science and Technology Centre (LASTEC), Defence Research and
Development Organization (DRDO), Ministry of Defence, Metcalfe House,
New Delhi, from 15th May 2017 to 10th July 2017.

Dr. Ravindra Singh


Scientist ‘G’
Control Interface & Tracking Group
LASTEC, DRDO

ORGANIZATIONAL PROFILE

The Defence Research and Development Organization (DRDO) is an agency
of the Republic of India, headquartered in New Delhi, responsible for the
development of technology for use by the military. It was formed in 1958 by
the merger of the Technical Development Establishment and the Directorate of
Technical Development and Production with the Defence Science Organisation.

DRDO works in various areas of military technology, including
aeronautics, armaments, combat vehicles, electronics, instrumentation,
engineering systems, missiles, materials, naval systems, advanced computing,
simulation and life sciences. While striving to meet cutting-edge
weapons-technology requirements, DRDO provides ample spin-off benefits to
society at large, thereby contributing to nation building.

DRDO has a network of 52 laboratories, spread all over India, which are deeply
engaged in developing defence technologies covering various fields, such as
aeronautics, armaments, electronics and computer sciences, human resource
development, life sciences, materials, missiles, combat vehicle development,
and naval research and development. The organization includes more than 5,000
scientists and about 25,000 other scientific, technical and supporting personnel.

Vision: Make India prosperous by establishing a world-class science and
technology base, and provide our Defence Services a decisive edge by equipping
them with internationally competitive systems and solutions.
LASER SCIENCE AND TECHNOLOGY CENTRE (LASTEC)

The Laser Science and Technology Centre (LASTEC) is a laboratory of the
Defence Research & Development Organization (DRDO). Located in Delhi, it is
the main DRDO lab involved in the development of lasers and related
technologies. LASTEC functions under the DRDO Directorate of Electronics &
Computer Science.

The Laser Science and Technology Centre had its beginning in 1950 as the
Defence Science Laboratory (DSL), established as a nucleus laboratory of
DRDO (then known as the Defence Science Organisation). In the beginning, DSL
operated from the National Physical Laboratory building. Later, on 9 April
1960, it was shifted to Metcalfe House and inaugurated by the then Raksha
Mantri Dr Krishna Menon in the presence of Pt. Jawaharlal Nehru. DSL seeded
as many as 15 present-day DRDO labs, with core groups working in diverse
areas. In 1982, the laboratory moved to a new technical building in the Metcalfe
House complex and was rechristened the Defence Science Centre.

Vision: Be the Centre of Excellence in the field of Lasers and their defence
applications.

Mission: Develop and deliver directed energy weapon systems for the services,
and carry out advanced research in the fields of Lasers, Photonics and Opto-
electronics. LASTEC's primary focus is the research and development of
various laser materials, components and laser systems, including High Power
Lasers (HPL), for defence applications. The main charter of the lab revolves
around progress in the areas of Photonics, Electro-Optic Counter Measures
(EOCM), and Low and High Power Lasers (HPL).
TABLE OF CONTENTS

DECLARATION
ACKNOWLEDGEMENT
CERTIFICATE
ORGANIZATIONAL PROFILE

1. ABOUT THE PROJECT
   1.1 ABSTRACT
   1.2 INTRODUCTION
   1.3 SOFTWARE DETAILS
   1.4 MATLAB PROGRAMMING STYLE

2. METHODOLOGY

3. FLOWCHARTS AND ALGORITHMS
   3.1 BACKGROUND SUBTRACTION
   3.2 BASIC COMPONENTS OF OBJECT TRACKING
   3.3 OPERATIONAL FLOWCHART OF SYSTEM
   3.4 ALGORITHM

4. RESULTS AND CONCLUSION
   4.1 FOREGROUND DETECTION
   4.2 FILTERED FOREGROUND
   4.3 OBJECT DETECTION
   4.4 SPEED OF THE OBJECTS
   4.5 CONCLUSION

5. BIBLIOGRAPHY

APPENDIX
APPENDIX A: SNAPSHOT OF THE PROGRAMMING WINDOW
APPENDIX B: MATLAB PROGRAM CODES USED

1. ABOUT THE PROJECT

1.1 ABSTRACT

Object detection and tracking is an important and challenging task within the
area of computer vision, which tries to detect, recognize and track objects over
a sequence of images, called a video. Object tracking is the process of locating
an object or multiple objects using a single camera, multiple cameras or a given
video file. In object detection and tracking, the target object has to be detected
and then tracked in consecutive frames of a video file.

1.2 INTRODUCTION

There are three classes of methods for object tracking: template-based,
probabilistic and pixel-wise. Pixel-wise methods are among the best for
object tracking, as they are robust against background interference; with this
kind of method, failure detection and automatic failure recovery can be
carried out effectively.

In this project, videos from a fixed camera with a static background (e.g. a
stationary surveillance camera) are used, and the common approach of
background subtraction is applied to obtain an initial estimate of the moving
objects. First, background modelling is performed to yield a reference model.
This reference model is used in background subtraction, in which each video
frame is compared against the reference model to determine possible variation.
Variations between the current video frame and the reference frame, in terms
of pixels, signify the existence of moving objects. The varying pixels, which
represent the foreground, are further processed for object localization and
tracking. Ideally, background subtraction should detect real moving objects
with high accuracy, limiting false negatives (objects not detected) as much as
possible. At the same time, it should extract as many pixels of the moving
objects as possible, while avoiding static objects and noise.
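
As a simple illustration of this comparison step, a minimal frame-differencing
sketch in MATLAB is shown below (the 0.1 threshold and the variable names are
illustrative assumptions; the project itself uses a Gaussian mixture model,
described in Section 2):

f = im2double(rgb2gray(frame));            % current video frame, grayscale
b = im2double(rgb2gray(backgroundModel));  % reference (background) frame
mask = abs(f - b) > 0.1;                   % foreground: pixels that differ significantly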

The main objective of this project is to develop an algorithm that can detect
moving objects in a video at any distance for object tracking applications. The
various tasks involved are motion detection, background modelling and
subtraction, foreground detection and noise removal.
The object detection and tracking system for video frames consists of five major
components (a minimal end-to-end sketch follows the list):
1. Image Acquisition, which collects a series of single images from the video
scene and stores them in temporary storage.
2. Image Enhancement, which improves some characteristics of the single image
in order to provide accuracy and better performance in later stages.
3. Image Segmentation, which detects vehicle positions using image
differencing with a Gaussian mixture model.
4. Image Analysis, which analyses the positions of the reference starting point
and the reference ending point of the moving vehicles by tracking multiple
vehicles.
5. Speed Detection, which calculates the speed of each vehicle in the single
image frame using the detected vehicle position and the reference point
positions.
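
A minimal sketch of these five components in MATLAB is given below,
assuming a hypothetical input file 'traffic.mp4' and the Computer Vision
System Toolbox; the parameter values mirror those used in Appendix B:

reader   = vision.VideoFileReader('traffic.mp4');         % 1. image acquisition
detector = vision.ForegroundDetector('NumGaussians', 2, ...
    'NumTrainingFrames', 100);                            % Gaussian mixture model
blobs    = vision.BlobAnalysis('AreaOutputPort', false, ...
    'CentroidOutputPort', true, 'BoundingBoxOutputPort', true, ...
    'MinimumBlobArea', 300);
while ~isDone(reader)
    frame = step(reader);                                 % read one frame
    gray  = rgb2gray(frame);                              % 2. image enhancement
    mask  = step(detector, gray);                         % 3. image segmentation
    [centroids, boxes] = step(blobs, mask);               % 4. image analysis
    % 5. speed detection: compare each centroid with its position in the
    % previous frame (see Section 2, Speed Detection)
end
release(reader);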

1.3 SOFTWARE DETAILS

MATLAB (matrix laboratory) is a multi-paradigm numerical computing
environment and fourth-generation programming language. A proprietary
programming language developed by MathWorks, MATLAB allows matrix
manipulations, plotting of functions and data, implementation of algorithms,
creation of user interfaces, and interfacing with programs written in other
languages, including C, C++, C#, Java, Fortran and Python.

Although MATLAB is intended primarily for numerical computing, an optional
toolbox uses the MuPAD symbolic engine, allowing access to symbolic
computing abilities. An additional package, Simulink, adds graphical multi-
domain simulation and model-based design for dynamic and embedded systems.

1.4 MATLAB PROGRAMMING STYLE

The MATLAB application is built around the MATLAB scripting language.
Common usage of the MATLAB application involves using the Command
Window as an interactive mathematical shell or executing text files containing
MATLAB code. MATLAB was first adopted by researchers and practitioners in
control engineering, Little's specialty, but quickly spread to many other
domains. It is now also used in education, in particular for the teaching of linear
algebra and numerical analysis, and is popular amongst scientists involved in
image processing.

MATLAB supports developing applications with graphical user interface (GUI)
features. MATLAB includes GUIDE (GUI development environment) for
graphically designing GUIs. It also has tightly integrated graph-plotting
features.

2. METHODOLOGY

The captured video, which has a frame rate of 15 frames per second, is converted
into frames. The frames are converted from RGB to grayscale images, which
reduces the computation. After that, any noise present in the video frames is
reduced. The moving objects in each frame are then segmented with respect to
the static background, and the displacement of each moving object is calculated
by tracking it individually. Each vehicle in the video is tracked, and its
individual speed is found simultaneously.

Steps involved:

1. RGB to Binary Image Conversion

In this step, each frame of the video is converted from an RGB image into a
binary image. This reduces the complexity of processing each
frame.
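
A minimal sketch of this conversion (the file name is hypothetical; on
MATLAB releases older than R2016a, im2bw can be used in place of imbinarize):

rgbFrame  = imread('frame1.jpg');   % one extracted video frame
grayFrame = rgb2gray(rgbFrame);     % discard colour information
bwFrame   = imbinarize(grayFrame);  % threshold to a binary image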

2. Background Subtraction
A reference frame is a frame that does not contain moving objects; it is used
to remove the background of the image, which is not of interest. Background
subtraction is done with a Gaussian mixture model configured according to the
video resolution. In the Gaussian mixture model, clusters of moving pixels are
formed, which help to find the moving objects in the video frames. The
Gaussian mixture model is applied to each
frame to get the moving objects.
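
A minimal sketch of this step, using the parameter values from Appendix B
(frame is a video frame read as in the sketch in Section 1.2):

detector = vision.ForegroundDetector('NumGaussians', 2, 'NumTrainingFrames', 100);
mask = step(detector, frame);   % logical mask: true where pixels belong to moving clusters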

3. Bounding Box Creation for the Vehicles

After the Gaussian mixture model has been applied, the clusters of moving
pixels are found, and each moving cluster is enclosed in a bounding box, so
that every moving vehicle gets a bounding box. For each bounding box in each
frame, a centroid is then generated; the centroid serves as the reference
for each vehicle.
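
A minimal sketch of this step (mask is the foreground mask from the previous
step; the property values are those used in Appendix B):

blobs = vision.BlobAnalysis('AreaOutputPort', false, ...
    'CentroidOutputPort', true, 'BoundingBoxOutputPort', true, ...
    'MinimumBlobArea', 300);
[centroids, boxes] = step(blobs, mask);   % one centroid and one box per vehicle
annotated = insertShape(frame, 'rectangle', boxes, 'Color', 'green');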

4. Speed Detection
Each vehicle in the video must be identified, and the distance it covers in
consecutive frames must be found. In this approach, each vehicle is tracked by
tracing its centroid through the upcoming frames to get the distance travelled
by that vehicle. An array of structures is used to store the centroids of the
vehicles. As a vehicle arrives in the region of interest of the video, its
bounding box is created, and from it the centroid of the bounding box is
generated. When a new vehicle arrives, its centroid value is stored in the
tracks, and the centroid and track values are updated in each upcoming frame
until that vehicle passes out of the video. If there are a number of vehicles in
the video, all of them are tracked simultaneously: their centroid values are
stored in the track structure that was created and updated for all vehicles
simultaneously as each new frame arrives, for as long as they remain in the
region of interest. The distance travelled by a vehicle in a particular time
then gives its speed.
The speed is calculated by measuring the distance covered by a car from one
frame to the next, using the formula speed = (distance covered in units of
pixels) / time. The captured video has a frame rate of 15 frames per second,
so the time from one frame to the next consecutive frame is 1/15 seconds:
Speed = (distance covered in units of pixels) / (1/15)
The unit of speed is the number of pixels travelled per second.
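For example, a centroid that moves 5 pixels between two consecutive frames
corresponds to a speed of 5 / (1/15) = 75 pixels per second.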

5. Output
The speed is calculated in units of pixels travelled per second. The deviation
of the object from the centre of the frame is also calculated and stored in a
text file named 'error'. The final processed video can be saved in .mp4 or
.avi format for further use.

3. FLOWCHARTS AND ALGORITHMS

3.1 BACKGROUND SUBTRACTION

Fig 1. Background subtraction technique

3.2 BASIC COMPONENTS OF OBJECT TRACKING

Fig 2. Basic component of object tracking algorithm

3.3 OPERATIONAL FLOWCHART OF SYSTEM

Fig 3. Flowchart of various steps involved

3.4 ALGORITHM

Background subtraction is a popular technique to segment out the objects of
interest in a frame. This technique involves subtracting an image that contains
the objects from a previous background image that has no foreground objects
of interest. The areas of the image plane where there is a significant difference
between these images indicate the pixel locations of the moving objects.

• The accuracy of object detection in a video frame depends upon the
Gaussian mixture model parameters used for training the background
frames. The vision.ForegroundDetector function is used to train the
background model using a Gaussian mixture model. Its
'NumTrainingFrames' parameter changes the vehicle-detection results.
In this approach, the first 100 frames of the video are used for
training the background; the vehicle detection varies with the number
of training frames.
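
A sketch contrasting two settings of this parameter (50 is an arbitrary
alternative value chosen only for comparison):

detector100 = vision.ForegroundDetector('NumTrainingFrames', 100);  % value used here
detector50  = vision.ForegroundDetector('NumTrainingFrames', 50);   % fewer training frames
% detection results differ between the two settings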

• vision.BlobAnalysis returns the area, centroid and bounding box of
the blobs when the AreaOutputPort, CentroidOutputPort and
BoundingBoxOutputPort properties are set to true.
BoundingBoxOutputPort gives the dimensions and coordinates of the
bounding box formed around each detected object, and
CentroidOutputPort gives the coordinates of the centroid of the bounding
box.
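
A sketch with all three output ports enabled (filteredForeground is the
cleaned foreground mask; the outputs are returned in the order area,
centroid, bounding box):

blobs = vision.BlobAnalysis('AreaOutputPort', true, ...
    'CentroidOutputPort', true, 'BoundingBoxOutputPort', true);
[area, centroid, bbox] = step(blobs, filteredForeground);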

• The vision.VideoFileWriter function is used to write the processed frames
back into a video file that can be used for further reference.
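
A minimal usage sketch ('output.avi' is a hypothetical file name; result is
a processed frame, as in Appendix B):

writer = vision.VideoFileWriter('output.avi');
step(writer, result);   % write one processed frame
release(writer);        % finalize the file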

• The while loop runs until all the video frames have been read. The
centroid values are stored in the temp0 variable for the previous frame
and in the temp1 variable for the current frame; temp0 and temp1 are
updated continuously. The difference between the centroids in temp0 and
temp1 gives the velocity of each object in pixels/sec.
• The error value measures the deviation of each object from the centre of
the frame, and its values are saved into a text file.
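
A vectorized sketch of these two computations, assuming the frame centre
(650, 350) used in Appendix B and the 15 fps frame rate stated in Section 2:

d = sqrt(sum((temp1 - temp0).^2, 2));                       % centroid displacement (pixels)
velocity = d * 15;                                          % pixels/sec at 15 frames per second
err = sqrt((350 - temp1(:,2)).^2 + (650 - temp1(:,1)).^2);  % deviation from the frame centre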

4. RESULTS AND CONCLUSION

4.1 FOREGROUND DETECTION

Fig 4. A frame from the original video

The vision.ForegroundDetector function detects the objects in the foreground.

Fig 5. Detected foreground from the video frame.

4.2 FILTERED FOREGROUND

Fig 6. Cleaner foreground after filtering

4.3 OBJECT DETECTION

A bounding box is created around the objects detected. The minimum size of
the bounding box can be adjusted.

Fig 7. Bounding box created around the detected objects

4.4 SPEED OF THE OBJECTS
Speed of the objects is measured by the centroid displacement method. The
distance travelled by the centroid of the bounding box between two consecutive
frames, multiplied by the frame rate, gives the speed in pixels/sec.

Fig 8. A video frame showing speed of various cars in pixels/second

4.5 CONCLUSION

Object detection and tracking is an important task in the computer vision
field. It consists of two major processes: object detection and object
tracking. Object detection in video images obtained from a single camera
with a static background (i.e. a fixed camera) is achieved by the
background subtraction approach. The objective has been to detect moving
objects and, thereafter, calculate the speed of the moving vehicles and the
motion vectors (error). Speed is detected for multiple vehicles by processing
the video frames. Results show that the proposed model gives relatively good
performance. However, bad weather such as heavy fog, weak illumination and
night scenes may produce poor performance. The main problem under these
conditions is inaccurate detection of vehicles: as a result, bounding boxes
are not created in consecutive frames, and if a vehicle is not recognized by
its bounding box, it is not possible to calculate its speed.
Video-camera-based automatic vehicle speed detection is a very accurate
and promising technology.

There are many de-blurring techniques, such as:
• Deblurring with the Wiener Filter
• Deblurring with the Lucy-Richardson Algorithm
• Blind Deconvolution Deblurring Algorithm
• Regularized filter
• Median filter
(MATLAB code for the first two is given in Appendix B.)

Comparison of motion de-blur techniques: the Wiener and Lucy-Richardson
filters:

1. The Wiener algorithm is a linear restoration operation, while the
Lucy-Richardson (L-R) algorithm is an iterative nonlinear restoration method.

2. The Wiener filter minimizes the mean square error between the estimated
random process and the desired process; the L-R algorithm arises from a
maximum-likelihood formulation in which the image is modelled with Poisson
statistics.

3. The Wiener filter removes, or reduces to some extent, the additive noise
and inverts the blurring simultaneously; the L-R algorithm is an iterative
procedure in which the pixels of the observed image are represented using
the PSF and the latent image.

4. Wiener deconvolution can be used effectively when the frequency
characteristics of the image and additive noise are known; L-R deconvolution
is used to restore a degraded image that has been blurred by a known PSF.

5. The Wiener filter is less efficient; the L-R algorithm is more efficient.

Comparison of the Kalman Filter and the Mean Shift Algorithm for Object
Tracking:

1. The Kalman filter is an optimal recursive data-processing algorithm. In
the Mean Shift algorithm, moving objects are characterized by their colour
histograms, and the key operation is histogram estimation.

2. The Kalman filter consists of two phases: (i) prediction and (ii)
correction. The first predicts the next state using the current set of
observations and updates the current set of predicted measurements; the
second updates the predicted values and gives a much better approximation of
the next state. The primary Mean Shift algorithm, based on colour features
only, gives accurate performance especially under partial occlusions.

3. The iteration time per frame is lower for the Kalman filter than for the
Mean Shift algorithm.

4. The Kalman filter is observed to be much more flexible under any kind of
motion, whereas the Mean Shift algorithm fails to perform when the object
undergoes any motion other than pure translational motion.

5. The Kalman filter performs much better under noisy atmospheric
conditions; the Mean Shift algorithm performs worse.

6. When only translational motion is taken into account, the Kalman filter
results in much lower RMS error, while the Mean Shift algorithm results in
more errors.
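
For reference, a minimal sketch of the two Kalman phases using the Computer
Vision System Toolbox; the initial location, the measurement, and the noise
values are illustrative assumptions:

initialLocation = [100, 200];                       % first detected centroid (x, y)
kf = configureKalmanFilter('ConstantVelocity', initialLocation, ...
    [200, 50], [100, 25], 100);                     % illustrative noise parameters
predictedLocation = predict(kf);                    % phase (i): predict the next state
measuredLocation  = [104, 203];                     % e.g. a newly detected blob centroid
correctedLocation = correct(kf, measuredLocation);  % phase (ii): correct with the measurement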

5. BIBLIOGRAPHY

• https://in.mathworks.com/help/vision/ug/multiple-object-tracking.html
• https://in.mathworks.com/help/vision/ref/vision.foregrounddetector-class.html
• https://www.rroij.com/open-access/object-detection-and-object-tracking-usingbackground-subtraction-for-surveillanceapplication.php?aid=44496
• http://www.ijecscse.org/papers/apr2012/moving-object-tracking-in-video-using-matlab.pdf
• http://ethesis.nitrkl.ac.in/6256/1/E-1.pdf
• https://pdfs.semanticscholar.org/2c3f/30acd1eb4ad21d43597023571bdde77ed6eb.pdf
• http://airccse.org/journal/ijcsea/papers/4214ijcsea03.pdf
• http://www.ijcat.com/archives/volume2/issue1/ijcatr02011007.pdf
• http://www.ijraset.com/fileserve.php?FID=2931

APPENDIX A

Fig 9. A snapshot of the MATLAB programming window

Fig 10. A snapshot of the running program

APPENDIX B

MATLAB PROGRAM CODES

1. Using the Wiener filter and the Lucy-Richardson filter to enhance images

Aim: To enhance and de-blur the image.

Code:

Wiener:
I = im2double(imread('blur5.jpg'));    % load the blurred image
LEN = 21;                              % assumed motion-blur length (pixels)
THETA = 11;                            % assumed motion-blur angle (degrees)
PSF = fspecial('motion', LEN, THETA);  % point spread function of the blur
blurred = imfilter(I, PSF, 'conv', 'circular');  % simulate the same blur, for reference
wnr1 = deconvwnr(I, PSF, 0.1);         % Wiener deconvolution, NSR = 0.1
imshowpair(I, wnr1, 'montage')         % original vs. restored
Iblur1 = imgaussfilt(wnr1, 2);         % Gaussian smoothing of the restored image
i = imsharpen(I);                      % sharpened original, for comparison
imshowpair(I, i, 'montage')

Lucy-Richardson:
I = im2double(imread('blur4.jpg'));    % load the blurred image
LEN = 4;                               % Gaussian kernel size
THETA = 1000;                          % Gaussian sigma (a large value gives a nearly flat PSF)
PSF = fspecial('gaussian', LEN, THETA);
wnr1 = deconvlucy(I, PSF, 5);          % Lucy-Richardson deconvolution, 5 iterations
a = imsharpen(wnr1, 'Radius', 2, 'Amount', 1);  % sharpen the restored image
figure, imshowpair(I, a, 'montage')    % original vs. restored

2. Object detection and speed estimation using the background subtraction method

foregroundDetector = vision.ForegroundDetector('NumGaussians', 2, ...
    'NumTrainingFrames', 100);   % GMM background model trained on the first 100 frames

videoReader = vision.VideoFileReader('video.mp4');
for j = 1:150
    frame = step(videoReader);                     % read the next video frame
    foreground = step(foregroundDetector, frame);  % train the background model
end

figure; imshow(frame); title('Video Frame');
figure; imshow(foreground); title('Foreground');

se = strel('square', 3);                      % structuring element for noise removal
filteredForeground = imopen(foreground, se);  % morphological opening
figure; imshow(filteredForeground); title('Clean Foreground');

% Two blob analysers: one returns bounding boxes, the other returns centroids
blobAnalysis = vision.BlobAnalysis('BoundingBoxOutputPort', true, ...
    'AreaOutputPort', false, 'CentroidOutputPort', false, ...
    'MinimumBlobArea', 300);
centroid = vision.BlobAnalysis('BoundingBoxOutputPort', false, ...
    'AreaOutputPort', false, 'CentroidOutputPort', true, ...
    'MinimumBlobArea', 300);
bbox = step(blobAnalysis, filteredForeground);
bbox1 = step(centroid, filteredForeground);

temp0 = zeros(20,2);   % centroids in the previous frame
temp1 = zeros(20,2);   % centroids in the current frame
velocity = 0;
var = zeros(20,2);     % older centroid set
error = [];            % deviations of centroids from the frame centre

result = insertText(frame, bbox1, velocity);
result = insertShape(frame, 'rectangle', bbox, 'Color', 'green');

videoPlayer = vision.VideoPlayer('Name', 'Detected Cars');
videoPlayer.Position(3:4) = [650,400];  % window size: [width, height]
se = strel('square', 4);                % morphological filter for noise removal

videoFWriter = vision.VideoFileWriter('aabb.avi');
videoFWriter.VideoCompressor = 'None (uncompressed)';

while ~isDone(videoReader)

    frame = step(videoReader);  % read the next video frame

    % Detect the foreground in the current video frame
    foreground = step(foregroundDetector, frame);

    % Use morphological opening to remove noise in the foreground
    filteredForeground = imopen(foreground, se);

    % Detect the connected components with the specified minimum area, and
    % compute their bounding boxes and centroids
    bbox = step(blobAnalysis, filteredForeground);
    bbox1 = step(centroid, filteredForeground);

    numCars = size(bbox, 1);   % number of cars in the frame

    % Draw bounding boxes around the detected cars
    result = insertShape(frame, 'rectangle', bbox, 'Color', 'green');

    % Shift the centroid history: temp0 = previous frame, temp1 = current frame
    var = temp0;
    temp0 = temp1;
    temp1 = bbox1;

    if (size(temp0,1) == size(temp1,1))
        for i = 1:numCars
            % speed in pixels/sec: centroid displacement scaled by the frame-rate factor
            velocity(i,1) = 30*((temp0(i,2)-temp1(i,2))^2 + (temp0(i,1)-temp1(i,1))^2)^(1/2);
            if (size(velocity,1) < numCars)
                % pad the velocity vector when new cars appear
                velocity = cat(1, velocity, zeros((numCars-size(velocity,1)),1));
            elseif (size(velocity,1) > numCars)
                % trim the velocity vector when cars leave the frame
                velocity = velocity(1:numCars,:);
            end
            % annotate the boxed frame with each car's speed
            result = insertText(result, bbox1, velocity);
            % deviation of the centroid from the frame centre (650, 350)
            error1 = ((350-bbox1(i,2)).^2 + (650-bbox1(i,1)).^2).^(1/2);
            error = cat(1, error, error1);
        end
    end

    % Display the number of cars found in the video frame
    numCars = size(bbox, 1);
    result = insertText(result, [10 10], numCars, 'BoxOpacity', 1, 'FontSize', 14);

    step(videoPlayer, result);   % display the results
    step(videoFWriter, result);  % save the frame to the video file

end

save('error.txt', 'error', '-ascii');   % write the deviation values to a text file

release(videoReader);   % close the input video file
release(videoFWriter);  % finalize the output video file
