MAJOR

Virtual Mouse using Hand Gesture and Colour
Detection
A Major Project Submitted in Partial Fulfillment of the
Requirements for the Award of the Degree
of
Bachelor of Technology
By
KUNTLA AJAY (157Y1A04C0)

BHUMA NAVYA (157Y1A04E3)
Under the Guidance of
Ms. T.TANUJA
Assistant professor
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
April-2019
DECLARATION
Project Title: Virtual Mouse using Hand Gesture and Colour Detection.
Degree for which the project is submitted: B .Tech.
We declare that the presented project represents largely our own ideas and work in our own
words. Where others ideas or words have been included, we have adequately cited and
listed in the reference materials. We have adhered to all principles of academic honesty and
integrity. No falsified or fabricated data have been presented in the project. The matter
embodied in this project report has not been submitted by us to any other university for the
award of any other degree.
----------------------- --------------------------
(KUNTLA AJAY) (BHUMA NAVYA)
(157Y1A04C0) (157Y1A04E3)
Date:
i
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
CERTIFICATE
This is to certify that the project work entitled “Virtual mouse using hand gesture and
colour detection” work done by KUNTLA AJAY(15Y1A04C0) and BHUMA
NAVYA(157Y1A04E3) students of Department of Electronics and Communication
Engineering, is a record of bonafide work carried out by the members under the guidance
of Ms. T.TANUJA. This project is done as a fulfillment of obtaining Bachelor of
Technology Degree to be awarded by Jawaharlal Nehru Technological University
Hyderabad.
This is to certify that the above statement made by the candidates is correct to the best of
my knowledge.
Date: (T.TANUJA)
The Project Viva-Voce Examination of above students, has been held on……………
HOD External Examiner Principal
ii
ABSTRACT
The main idea to choose this project is improvising the previous levels and developing the
Human-Machine Interface. In today’s technological era, many technologies are evolving
day by day. The aim of this project is to move the mouse cursor on the screen without
using hardware such as a mouse and only by moving the cursor through finger movements.
In this we present a novel approach for Human Computer Interaction (HCI) where cursor
movement is controlled using a real time camera. In this project, the hand movement of a
user is mapped into mouse inputs. A web camera is set to take the live video continuously
and then from this video various images are being captured by using MATLAB. The user
must have a particular color marker or pointer in his hand so that when the web camera
takes an image it must be visible in it. This color is detected from the image pixel in
MATLAB and object detection is used to map pixel position into mouse input.
iii
ACKNOWLEDGEMENTS
The satisfaction that accompanies the successful completion of any task would be
incomplete without the mentioning of the peoples who made it possible and whose
encouragement and guidance has been a source of inspiration throughout the course of the
project.
It is great to convey our profound sense of gratitude to our principal Dr.K. Venkateswar
Reddy, and to our director Dr. R. Kotaiah, ECE at Marri Laxman Reddy Institute of
Technology and Management for having been kind enough to arrange for necessary
facilities for executing the project in college.
At the inception we would like to express our deep sense of gratitude to our Head of the
Department Mr. K. Nagabhushanam, ECE, Marri Laxman Reddy Institute of Technology
and Management. Whose valuable suggestions have been indispensable to bring about the
successful completion of our project. We wish to acknowledge a special thanks to our
guide, T.Tanuja, Assistant professor who helped us throughout the academic to complete
our project.
Finally, we are thankful to our project co-ordinator Dr. G. Amarnath and staff members
of ECE Department, and other faculty members of our institution. Finally we thank those
who directly and indirectly helped us in this regard.
( K.Ajay ) (B.Navya)
iv
CONTENTS
Declaration i
Certificate ii
Abstract iii
Acknowledgements iv
Contents v
List of Figures vii
List of Tables viii
Chapter No. Page

No.
CHAPTER 1 Introduction 1 – 22
1.1 Introduction 1
1.2 Review of physical mouse 2
1.3 Optical and laser mouse 3
1.4 Project description 6
1.5 Working of colour detection 20
CHAPTER 2 Overview of project 23-30
2.1 Block diagram-real time image from user 23

2.2 Flow diagram of color recognition 27
2.3 Functions used 28
CHAPTER 3 System development 31-43
3.1 Flipping of images 32
3.2 Colour detection 33

3.3 Filtering of images 35
3.4 Conversion of images 36
3.5 Erosion and dilation 41
v
CHAPTER 4 Methods and technologies involved 44-59
4.1 Hardware and software requirements 44
4.2 Basics of image processing 46
4.3 Colour processing 48
4.4 Image processing 49
4.5 Code 51
4.6 Results 58
CHAPTER 5 Conclusion and Future Scope 60-61
References 62
vi
LIST OF FIGURES
Figure No. Caption/Description Page

No.
FIGURE 1.1 Mechanical mouse, with top cover removed 3
FIGURE 1.2 Optical Mouse, with top cover removed 4
FIGURE 1.3 Hand gesture recognition system 8
FIGURE 1.4 Algorithm of colour detection 9
FIGURE 1.5 Algorithm of hand tracking 13
FIGURE 1.6 Flow chart of hand tracking based on colour detection 17
FIGURE 1.7 Binary representation of hand processing. 18
FIGURE 1.8 Angle of bounding 19
FIGURE 1.9 Cover rectangles with palm 20
FIGURE 1.10 Detection of fingers 21
FIGURE 2.1 Overview of system 23
FIGURE 2.2 Flow diagram of the system 27
FIGURE 2.3 Flipping of images 29
FIGURE 2.4 Colour image 30
FIGURE 2.5 Grey image 30
FIGURE 3.1 Flipping of images 32
FIGURE 3.2 Grey scale conversion of flipped image 33
FIGURE 3.3 Input image 34
FIGURE 3.4 Red plane detected 34
FIGURE 3.5 Blue plane detected 35
FIGURE 3.6 Red colour filter 35
FIGURE 3.7 Blue colour filter 36
FIGURE 3.8 Red converted to BW 37
FIGURE 3.9 Blue converted to BW 37
FIGURE 3.10 Detected center for single blue 39
FIGURE 3.11 Detected center for double blue 39
FIGURE 3.12 Move as per co-ordinates 40
FIGURE 3.13 Erosion & Dilation 41
vii
FIGURE 4.1 An image array or a matrix of pixels arranged in rows and columns 46
FIGURE 4.2 Tri-colour image 46
FIGURE 4.3 A true colour image assembled by 3 gray scales 47
FIGURE 4.4 Additive model of RGB 48
LIST OF TABLES
Table No. Name of the table Page
No.
Table 1.1 Advantage and disadvantage of the Mechanical Mouse 4
Table 1.2 Advantage and disadvantage of the Optical and Laser Mouse 5
Table 1.3 Operations performed depending upon number of fingers. 16
viii
Chapter 1: Introduction
CHAPTER
1
INTRODUCTION
1.1 Introduction
A mouse, in computing terms is a pointing device that detects two-dimensional

movements relative to a surface. This movement is converted into the movement of a pointer
on a display that allows to control the Graphical User Interface (GUI) on a computer platform.
There are a lot of different types of mouse that have already existed in the modern days
technology, there's the mechanical mouse that determines the movements by a hard rubber ball
that rolls around as the mouse is moved. Years later, the optical mouse was introduced that
replace the hard rubber ball to a LED sensor to detects table top movement and then sends off
the information to the computer for processing. On the year 2004, the laser mouse was then
introduced to improve the accuracy movement with the slightest hand movement, it overcome
the limitations of the optical mouse which is the difficulties to track high-gloss surfaces.
However, no matter how accurate can it be, there are still limitations exist within the mouse
itself in both physical and technical terms. For example, a computer mouse is a consumable
hardware device as it requires replacement in the long run, either the mouse buttons were
degraded that causes inappropriate clicks, or the whole mouse was no longer detected by the
computer itself.
Despite the limitations, the computer technology still continues to grow, so does the
importance of the human computer interactions. Ever since the introduction of a mobile device
that can be interact with touch screen technology, the world is starting to demand the same
technology to be applied on every technological device, this includes the desktop system.
However, even though the touch screen technology for the desktop system is already exist, the
price can be very steep. A hand gesture based cursor control system allows users to give mouse
1
inputs to a system without using an actual mouse. To the extreme it can also be called as
hardware because it uses an ordinary web camera. The system can usually be operated with
multiple input devices, which may include an actual mouse or a computer keyboard. This
system uses a web camera which works with the help of different image processing techniques.
A colour pointer has been used for the object recognition and tracking. Left and the right click
events of the mouse have been achieved by detecting the number of colour pointers on the
images. The hand movements of a user are mapped into mouse inputs. A web camera is set to
take images continuously. The user must have a particular colour in his hand so that when the
web camera takes image it must be visible in the acquired image. This colour is detected from
the image pixel and the pixel position is mapped into mouse input.
In this project, the mouse cursor movement and click events are controlled using a camera
based on colour detection technique. Here real time video has been captured using a Web-
Camera. The user wears coloured tapes to provide information to the system. Individual frames
of the video are separately processed. The processing techniques involve an image subtraction
algorithm to detect colours. Once the colours are detected, the system performs various
operations to track the cursor and performs control actions. No additional hardware is required
by the system other than the standard webcam which is provided in every laptop computer.
Therefore, a hand gesture based human computer interaction device replaces the physical
mouse or keyboard by using a webcam or any other image capturing devices can be an
alternative way for the touch screen. This device which is the webcam will be constantly
utilized by a software that monitors the gestures given by the user in order to process it and
translate to motion of a pointes, as similar to a physical mouse.
1.2 Review of the Physical Mouse
It is known that there are various types of physical computer mouse in the modern technology,
the following will discuss about the types and differences about the physical mouse.
Mechanical Mouse
It is Known as the trackball mouse that is commonly used in the 1990s, the ball within the
mouse are supported by two rotating rollers in order to detect the movement made by the ball
itself. One roller detects the forward/backward motion while the other detects the left/right
2
motion. The ball within the mouse are steel made that was covered with a layer of hard rubber,
so that the detection are more precise. The common functions included are the left/right
buttons and a scroll-wheel. However, due to the constant friction made between the mouse ball
and the rollers itself, the mouse are prone to degradation, as overtime usage may cause the
rollers to degrade, thus causing it to unable to detect the motion properly, rendering it useless.
Furthermore, the switches in the mouse buttons are no different as well, as long term usage
may cause the mechanics within to be loosed and will no longer detect any mouse clicks till it
was disassembled and repaire
Figure 1.1 Mechanical mouse, with top cover removed
3
The following table describes the advantages and disadvantages of the Mechanical Mouse.
ADVANTAGES DISADVANTAGES
Allows the users to control the Prone to degradation of the mouse

computer system by moving the rollers and button switches, causing
mouse. to be faulty
Provides precise mouse tracking Requires a flat surface to operate

movements
Table 1.1: Advantage and disadvantage of the Mechanical Mouse
1.3 Optical and Laser Mouse
A mouse that commonly used in these days, the motions of optical mouse rely on the
Light Emitting Diodes (LEDs) to detect movements relative to the underlying surface, while
the laser mouse is an optical mouse that uses coherent laser lights. Comparing to its
predecessor, which is the mechanical mouse, the optical mouse no longer rely on the rollers to
determine its movement, instead it uses an imaging array of photodiodes. The purpose of
implementing this is to eliminate the limitations of degradation that plagues the current
predecessor, giving it more durability while offers better resolution and precision. However,
there's still some downside, even-though the optical mouse are functional on most opaque
diffuse surface, it's unable to detect motions on the polished surface. Furthermore, long term
usage without a proper cleaning or maintenance may leads to dust particles trap between the
LEDs, which will cause both optical and laser mouse having surface detection difficulties.
Other than that, it's still prone to degradation of the button switches, which again will cause the
mouse to function improperly unless it was disassembled and repaired.
4
Figure 1.2 Optical Mouse, with top cover removed
The following table describes the advantages and disadvantages of the Optical and
Laser Mouse.
ADVANTAGES DISADVANTAGES
 Allows better precision with lesser  Prone to button switches

hand movements. degradation.
 Longer life-span.  Does not function properly

while on a polished surface.
Table 1.2: Advantage and disadvantage of the Optical and Laser Mouse
Problem Statement
It's a known fact that every technological devices have its own limitations, especially
when it comes to computer devices. After the review of various type of the physical mouse, the
problems are identified and generalized. The following describes the general problem that the
current physical mouse suffers:
 Physical mouse is subjected to mechanical wear and tear.
 Physical mouse requires special hardware and surface to operate.
 Physical mouse is not easily adaptable to different environments and its performance varies
depending on the environment.
5
 Mouse has limited functions even in present operational environments.
 All wired mouse and wireless mouse have its own lifespan.
Motivation of the proposed project
It is fair to say that the Virtual Mouse will soon to be substituting the traditional
physical mouse in the near future, as people are aiming towards the lifestyle where that every
technological devices can be controlled and interacted remotely without using any peripheral
devices such as the remote, keyboards, etc. it doesn't just provides convenience, but it's cost
effective as well.
User Convenience
It is known in order to interact with the computer system, users are required to use an
actual physical mouse, which also requires a certain area of surface to operate, not to mention
that it suffers from cable length limitations. Cursor control system requires none of it, as it only
a webcam to allow image capturing of user's hand position in order to determine the position of
the pointers that the user want it to be. For example, the user will be able to remotely control
and interact the computer system by just facing the webcam or any other image capturing
devices and moving your fingers, thus eliminating the need to manually move the physical
mouse, while able to interact with the computer system from few feet away Cost Effective.
A physical mouse normally costs depending on their functionality and features. Since the
cursor control system requires only a webcam, a physical mouse are no longer required, thus
eliminating the need to purchase one, as a single webcam is sufficient enough to allow users to
interact with the computer system through it, while some other portable computer system such
as the laptop, are already supplied with a built-in webcam, could simply utilize the software
without having any concerns about purchasing any external peripheral devices.
Cost Effective
A physical mouse normally costs depending on their functionality and features.

Since the cursor control system requires only a webcam, a physical mouse are no longer
required, thus eliminating the need to purchase one, as a single webcam is sufficient enough to
allow users to interact with the computer system through it, while some other portable
6
computer system such as the laptop, are already supplied with a built-in webcam, could simply
utilize the software without having any concerns about purchasing any external peripheral
devices.
Problem Description
There are generally two approaches for hand gesture recognition, which are hardware
based, where the user must wear a device, and the other is vision based which uses image
processing techniques with inputs from a camera. The proposed system is vision based, which
uses image processing techniques and inputs from a computer webcam. Vision based gesture
recognition tracking and gesture recognition. The input frame would be captured from the
webcam and systems are generally broken down into four stages, skin detection, hand contour
extraction, hand the skin region would be detected using skin detection. The hand contour
would then be found and used for hand tracking and gesture recognition. Hand tracking would
be used to navigate the computer cursor and hand gestures would be used to perform mouse
functions such as right click, left click, scroll up and scroll down. The scope of the project
would therefore be to design a vision based CC system, which can perform the mouse function
previously stated.
7
In short, the Flowchart for our project will be as follows:
Figure 1.3 Hand gesture recognition system
1.4 Project Description
In this section the strategies and methods used in the design and development of the vision based
CC system will be explained. The algorithm for the entire system is shown in Figure below. In order
to reduce the effects of illumination, the image can be converted to chrominance colour space which is
less sensitive to illumination changes. The HSV colour space was chosen since it.
8
Figure 1.4. Algorithm of colour detection
was found by to be the best colour space for skin detection. The next step would be to use a method
that would differentiate skin pixels from non-skin pixels in the image (skin detection). Background
subtraction was then performed to remove the face and other skin colour objects in the background.
Morphology Opening operation (erosion followed by dilation) was then applied to efficiently remove
noise. A Gaussian filter was applied to smooth the image and give better edge detection. Edge
detection was then performed to get the hand contour in the frame. Using the hand contour, the tip of
the index finger was found and used for hand tracking and controlling the mouse movements. The
contour of the hand was also used for gesture recognition. The system can be broken down in four
main components, thus in the Methodology the method used in each component of the system will be
explained separately.
9
This section is separated into the following subsections:
 Skin Detection
 Hand Contour Extraction
 Hand Tracking
 Gesture Recognition
 Cursor Control
Skin Detection
Skin detection can be defined as detecting the skin colour pixels in an image. It is a
fundamental step a wide range of image processing application such as face detection, hand tracking
and hand gesture recognition. Skin detection using colour information has recently gained a lot of
attention, since it is computationally effective and provides robust information against scaling, rotation
and partial occlusion. Skin detection using colour information can be a challenging task, since skin
appearance in images is affected by illumination, camera characteristics, background and ethnicity. In
order to reduce the effects of illumination, the image can be converted to a chrominance colour space,
which is less sensitive to illumination changes.
A chrominance colour space is one where the intensity information In the proposed method, the
HSV colour space was used with the Histogram- based skin detection method. The HSV colour space
has three channels, Hue (H), Saturation(S) and Value (V). The H and S channels hold the colour
information, while the V channel holds the intensity information. The input image from the webcam
would be in the RGB colour space, thus it would have to be converted to the HSV colour space using
the conversion Formulae. The Histogram-based skin detection method proposed by uses 32 bins H and
S histograms to achieve skin detection. Using a small skin region, the colour of this region is
converted to a chrominance colour space. A 32 bin histogram for the region is then found and is used
as the histogram model. Each pixel in the image is then evaluated on how much probability it has to a
histogram model. This method is also called Histogram Back Projection. Back projection can be
defined as recording how well pixels or patches of pixels fit the distribution of pixels in a histogram
model. The result would be a grayscale image (back projected image), where the intensity indicates
10
the likelihood that the pixel is a skin colour pixel. This method is adaptive since the histogram model
is obtained from the users ski, under the preset lighting condition.
Hand Contour Extraction
After obtaining the skin segmented binary image, the next step is to perform edge
detection to obtain the hand contour in the image. There are several edge detection methods such as,
Laplacian edge detection, canny edge detection and border finding The Open CV function cv Find
Contours() uses a order finding edge detection method to find the contours in the image. The major
advantage of the border finding edge detection method, is that all the contours found in the image is
stored in an array. This means that we can analyse each contour in the image individually, to
determine the hand contour. The Canny and Laplacian edge detectors are able to find the contours in
the image, but do not give us access to each individual contour. For this reason the border finding edge
detection method was used in the proposed design.
11
In the contour extraction process, we are interested in extracting the hand contour so that shape
analysis can be done on it to determine the hand gesture. Figure below shows the result when edge
detection was applied to the skin segmented binary image. It can be seen that besides the hand
contour, there are lots of small contours in the image. These small contours can be considered as noise
and must be ignored. The assumption was made that the hand contour is the largest contour thereby
ignoring all the noise contours in the image. This assumption can be void, if the face contour is larger
than the hand contour. To solve this problem, the face region must be eliminated from the frame. The
assumption was made that the hand is the only moving object in the image and the face remains
relatively stationary compared to the hand. This means that background subtraction can be applied to
remove the stationary pixels in the image, including the face region. This is implemented in the
function named “BackgroundSubtractorMOG2”.
Hand Tracking
The movement of the cursor was controlled by the tip of the index finger. In order to identify the tip of
the index finger, the centre of the palm must first be found. The method used for finding the hand
centre was adopted from and it has the advantage of being simple and easy to implement. The
algorithm for the method is shown in the flow chart of Figure below. The shortest distance between
each point inside the inscribed circle to the contour was measured and the point with the largest
distance was recorded as the hand centre. The distance between the hand centre and the hand contour
was taken as the radius of the hand. The hand centre was calculated for each successive frame and
using the hand centre, the tip of the index finger would be identified and used for hand tracking. The
method used for identifying the index and the other fingers are described in the following subsection.
The results for hand tracking would be demonstrated in Figure in the Results and Analysis section.
12
Figure 1.5 Algorithm hand tracking
13
Gesture Recognition
The gesture recognition method used in the proposed design is a combination of two methods, the
method proposed by Yeo and the method proposed by Balazs. The algorithm for the proposed gesture
recognition method is described in the flow chart of Figure below. It can be seen from Figure above
that the convexity defects for the hand contour must firstly be calculated. The convexity defects for
the hand contour was calculated using the Open CV inbuilt function “cv Convexity Defects”. The
parameters of the convexity defects (start point, end point and depth point) are stored in a sequence of
arrays. After the convexity defects are obtained, there are two main steps for gesture recognition:
 Finger Tip Identification
 Number of Fingers.
14
Cursor Control
Once the hand gestures are recognized, it will be a simple matter of mapping different hand gestures
to specific mouse functions. It turns out that controlling the computer cursor, in the C/C++
programming language is relatively easy. By including the User.lib library into the program, the “Send
Input” function will allow control of the computer cursor. Instructions on how to properly use this
function, was obtained from the Microsoft Developers Network MSDN website. This function is only
available for the Windows 2000 Professional operating system or later. This introduces a new
limitation on the system, such that it can only be used on newer versions of the Windows operating
system. The algorithm for the cursor control is shown in Figure below.
15
Following table shows the Operations Performed depending upon the number of fingers detected:
Number of Fingertips Detected Operations Performed
One Move Cursor
Two Left Click
Three Right Click
Four Start Button
Five My Computer
Table 1.3: Operations performed depending upon number of fingers.
Starting with the position of the index fingertip, the cursor is moved to the fingertip position. This is
done using the “Send Input” function to control the cursor movement. The next step would be to
determine if a hand gesture was performed. If a hand gesture as performed, the “Send Input” function
is again used to control the cursor function. If there is no change in fingertip position, the loop is
exited and it would be started again, when a change in fingertip position is detected.
16
Below is a summary flowchart representation of the program:
Figure 1.6. Flow chart of hand tracking based on colour detection
17
The hand tracking is based on color recognition. The program is therefore

initialized by sampling color from the hand. The hand is then extracted from the
background by using a threshold using the sampled color profile. Each color in
the profile produces a binary image which in turn are all summed together. A
nonlinear median filter is then applied to get a smooth and noise free binary
representation of the hand.
When the binary representation is generated the hand is processed in the

following way:
Figure 1.7.Binary representation of hand processing.
18
The properties determining whether a convexity defect is to be dismissed is the

angle between the lines going from the defect to the neighboring convex polygon
vertices.
Figure 1.8. Angle of Bounding
The defect is dismissed if:
Length < 0.4lbb
Angle > 800
The analysis results in data that can be of further use in gesture recognition:
Fingertip positions
Number of fingers
Number of hands
Area of hands
19
1.5 WORKING OF COLOUR DETECTION
This is how it works:
First, The program asks you to place Palm on the Rectangles:
Figure 1.9.Cover rectangles with palm
20
Now, let’s see how it tracks our palm and detects our fingers:
Figure 1.10. Detection of fingers
This detects 5 fingers and hence five five fingers.
This detects 4 fingertips and hence four fingers.
21
This detects 3 fingertips and hence three fingers.
This detects 2 fingertips and hence two fingers.
This detects 1 fingertip and hence one finger.
22
Chapter 2: Over View of project
CHAPTER
2
OVERVIEW OF PROJECT
2.1 Block Diagram
Real Time Image from User
According to the system requirements we need to give RGB colour image inputs to the system.
This colour components will be placed on the finger tips of the user. The input will be given as
a continuous frame of images. This continuous image input is captured using a webcam.
Figure 2.1 Overview of system

23
Webcam Taking Images
For the system to work we need a sensor to detect the hand movements of the user. The
webcam of the computer is used as a sensor. The webcam captures the real time video at a
fixed frame rate and resolution which is determined by the hardware of the camera. The frame
rate and resolution can be changed in the system if required.
 Computer Webcam is used to capture the Real Time Video.
 Video is divided into Image frames base on the FPS (Frames per second) of the camera.
 Processing of individual Frames.
Image Pre-processing
Image pre-processing involves flipping of the input images. When the camera
captures an image, it is inverted. This means that if we move the color pointer towards the left,
the image of the pointer moves towards the right and vice-versa. It’s similar to an image
obtained when we stand in front of a mirror (Left is detected as right and right is detected as
left). To avoid this problem, we need to vertically flip the image. The image captured is an
RGB image and flipping actions cannot be directly performed on it. So, the individual color
channels of the image are separated and then they are flipped individually. After flipping the
red, blue and green colored channels individually, they are concatenated and a flipped RGB
image is obtained.
Mouse Movements
The control actions of the mouse are performed by controlling the flags associated with
the mouse buttons. JAVA robot class is used to access these flags. The user has to perform
hand gestures in order to create the control actions. Due to the use of color pointers, the
computation time required is reduced. Furthermore the system becomes resistant to
background noise and low illumination conditions.
24
Clicking action is based on the following colour detections.
 Red indicates cursor movement.
 Green represents cursor scroll. 
 Single blue represents Left click 
 Double blue represents Right Click
 Three blue represents Double click. 
25
2.2 Flow Diagram of Colour Recognition
26
Figure 2.2: Flow diagram of the System
27
Following are the steps in working of our project:
 Capturing real time video using Web-Camera.
 Processing the individual image frame.
 Flipping of each image frame.
 Conversion of each frame to a grey scale image.
 Colour detection and extraction of the different colours (RGB) from flipped gray scale image.
 Conversion of the detected image into a binary image.
 Finding the region of the image and calculating its centroid.
 Tracking the mouse pointer using the coordinates obtained from the centroid.
 Simulating the left click and the right click events of the mouse by assigning different colour
pointers.
2.3 Functions Used
SYNTAX
B = flipdim(A,dim)
DESCRIPTION
B = flipdim(A,dim) returns A with dimension dim flipped.
When the value of dim is 1, the array is flipped row-wise down. When dim is 2, the
array is flipped column wise left to right. flipdim(A,1) is the same as flipud(A), and
flipdim(A,2) is the same as fliplr(A).
28
Figure 2.3: Flipping of images
SYNTAX
Z = imsubtract(X,Y)
DESCRIPTION
Z = imsubtract(X,Y)
subtracts each element in array Y from the corresponding element in array X and
returns the difference in the corresponding element of the output array Z.
If X is an integer array, elements of the output that exceed the range of the integer type
are truncated, and fractional values are rounded.
im2bw
Convert image to binary image, based on threshold
SYNTAX
BW = im2bw(I,level)
DESCRIPTION
BW = im2bw(I,level) converts the grayscale image I to binary image BW, by replacing

all pixels in the input image with luminance greater than level with the value1 (white) and
replacing all other pixels with the value 0 (black).
29
This range is relative to the signal levels possible for the image's class.
Therefore, a level value of 0.5 corresponds to an intensity value halfway between
the minimum and maximum value of the class.
rgb2gray
Convert RGB image or colormap to grayscale
SYNTAX
I = rgb2gray(RGB)
newmap = rgb2gray(map)
DESCRIPTION
I = rgb2gray(RGB) converts the true colour image RGB to the grayscale

intensity image I. The rgb2gray function converts RGB images to grayscale by
eliminating the hue and saturation information while retaining the luminance. If
you have Parallel Computing Toolbox™ installed, rgb2gray can perform this
conversion on a GPU.
Figure 2.4: Colour image
Figure 2.5: Gray Image
30
Chapter 3: System development
CHAPTER
3
SYSTEM DEVELOPMENT
In the object tracking application one of the main problems is object detection. Instead of
finger tips, a colour pointer has been used to make the object detection easy and fast. To
simulate the click events of the mouse, three fingers serving as three colour pointers has been
used. The basic algorithm is as follows:
 Flipping of images
 Implementation Steps
 Colour detection
 Filtering the Images
 Conversion of Images
 Removing small areas
 Find Centre
 Move the cursor
 Mouse click event
 Erosion and dilation
31
3.1 Flipping of Images
When the camera captures an image, it is inverted. This means that if we move the
colour pointer towards the left, the image of the pointer moves towards the right and vice-
versa. It’s similar to an image obtained when we stand in front of a mirror (Left is detected as
right and right is detected as left). To avoid this problem we need to vertically flip the image.
The image captured is an RGB image and flipping actions cannot be directly performed on it.
So, the individual colour channels of the image are separated and then they are flipped
individually. After flipping the red, blue and green coloured channels individually, they are
concatenated and a flipped RGB image is obtained.
Figure3.1: Flipping of Image
32
Conversion of Flipped Image into Gray scale Image:
As compared to a coloured image, computational complexity is reduced in a gray scale image.

Thus the flipped image is converted into a gray scale image. All the necessary operations were
performed after converting the image into gray scale.
Figure 3.2: Gray scale conversion of flipped image.
3.2 Colour detection
In this project, there are uses of blue and red planes (see figure 2, figure 3 and figure 4).
In order to identify the blue colour of the hand, MATLAB built in function “imsubstract” can
be used.
Z = imsubtract (X, Y)
where this function subtracts elements from one array to another which is specified as X and Y
and gives the difference in the corresponding element to the output array Z.
X and Y are real, non-sparse numeric arrays of the same size.
33
Figure 3.3 : Input Image
Figure 3.4 : Red plane detected
34
Figure 3.5 : Blue plane detected
3.3 Filtering the Images
As soon as the blue colour in the real time image is detected, the next step is to go for filtering
of this single frame. Care has to be taken about the processing speed of every frame since
every camera has different frames per second.
Median filtering gives optimum results for such operations. Result of the filtering should look
like as mentioned in Fig 5. Median filtering is basically to remove “salt and pepper” noise.
Although convolution can be used for this purpose, a median filter is more effective when the
goal is to reduce noise and preserve edges. Illumination is major part while taking the real time
images. There is not much noise when the illumination is high, as seen from the images.
Figure 3.6 : Red color filter
35
Figure 3.7: Blue color filter
3.4 Conversion of Images
As soon as filtering is done over a frame, next step is to convert an image. For conversion of
image one may also use in built function “im2bw”.
The function can be used as BW = im2bw (I, level)
where I is the image.
The BW image replaces all pixels in the input image with some threshold greater than the
Value 1 with the value 1 (white) and replaces all other pixels with the value 0 (black). Level
should be in range from 0 to 1. This level is very much understood since the image itself is
binary levels not greater than 1. Therefore, a level value of 0.5 is midway between black and
white colour, regardless of class. The threshold 0.15 gave the best result for the large range of
illumination.
36
Figure 3.8: Red converted to BW
Figure 3.9: Blue converted to BW
37
Find Centre
In order to make more precise pointer of mouse finding centroid is necessary. Here “bwlabel”
Matlab function can be used for cropping the genuine area. In other words the required region
can be detected (see figure 10). To get the properties of the region such as center point or
bounding box etc., MATLABs built in regionprops function can be used as
STATS = regionprops (BW, properties)
where it measures a set of properties for each connected component (object) in the
binary image, BW.
For the user to control the mouse pointer it is necessary to determine a point whose coordinates
can be sent to the cursor. With these coordinates, the system can control the cursor movement.
An inbuilt function in MATLAB is used to find the centroid of the detected region. The output
of function is a matrix consisting of the X (horizontal) and Y (vertical) coordinates of the
centroid. These coordinates change with time as the object moves across the screen.
38
Figure 3.10: Detected Center for single blue
Figure 3.11: Detected center for Double blue
Tracking the Mouse pointer
Once the coordinates has been determined, the mouse driver is accessed and the coordinates
are sent to the cursor. With these coordinates, the cursor places itself in the required position. It
is assumed that the object moves continuously, each time a new centroid is determined and for
each frame the cursor obtains a new position, thus creating
39
an effect of tracking. So as the user moves his hands across the field of view of the camera, the
mouse moves proportionally across the screen.
Move the cursor
Movement of cursor is the last step where actual decision has to be taken. By seeing
the centroid from above image, movement has to take place. To move the cursor to desired (X,
Y) coordinates, MATLAB has a set (0,'PointerLocation', [x,y]) function. Matlab doesn’t provide
any function for the clicking events. To move the mouse and to simulate the mouse click event
Java class java.awt. Robot which has all these abilities can be used. Resolution of camera is
directly proportional to the resolution of mouse pointer. Therefore using better quality camera
will be more beneficial. In figure 11, the resolution of the input image was 640x480 and the
resolution of the computer monitor was 1280x800. In case if the resolution of camera is less than
monitor screen then scaling should be used.
Figure 3.12: Move as per co-ordinates
Mouse click event
Clicking could be the challenging task since Matlab doesn’t provide any
function for this. In usual mouse operation the left button of the mouse performs a different
task for single click and double click. There are several ways to do this. One of the methods is
to use three pointers and if those pointers are detected decide the clicking events depending on
the time that the pointer is being detected. For movement use all pointers and for clicking use two
40
pointers. In case user wants to go for left click one pointer should be less which is from right
side and same is the case for right click.
3.5 Erosion and dilation
Morphology can be said to be a set of image processing operations that processes

images based on their shapes. A structuring element is applied to an input image that gives a
similar sized output image. The respective input image is compared with its neighbors, thereby
giving a value for the output image. The factors of the neighborhood such as shape and size can
be decided by the programmer, thereby constructing programmer-defined morphological
operations for the input image. The most basic morphological operations are dilation and
erosion (see figure 12, figure 13). Dilation adds pixels to the boundaries of objects in an image,
while erosion removes pixels on object boundaries. According to the structuring element used,
the number of pixels added or removed will differ. In the morphological dilation and erosion
operations, the features of any pixels in the output image is determined by applying a specific
rule to the corresponding pixel and its neighbors in the input image. Whether the operation is
dilation or erosion can be determined from the rule that has been applied to the image pixel.
Figure 3.13: Erosion
41
Figure 3.13: Dilation
Implementation Issues and Challenges
 Throughout the development of the application, there are several implementation issues
occurred. The following describes the issues and challenges that will likely to be encountered
throughout the development phase: The interruptions of salt and pepper noises within the
captured frames.
Salt and pepper noises occurred when the captured frame contains required
HSV values that are too small, but still underwent a series of process even though it’s not large
enough to be considered an input. To overcome this issue, the unwanted HSV pixels within the
frame must first be filtered off, this includes the area of the pixels that are too large and small.
With this method, the likelihood of interruptions of similar pixels will reduce greatly.
 Performance degradation due to high process load for low-tier system.
Since the application is required to undergo several of process to filter, process and execute
the mouse functions in real time, the application can be CPU intensive for most of the low-tier
system. If the size of the captured frames is too large, the time-taken for the application to
process the entire frame are increase drastically. Therefore, to overcome this issue, the
application is required to process only the essential part of the frames, and reduces the redundant
filtering process that could potentially slow the application down. The difficulties of calibrating
the brightness and the contrast of the frames to get the required HSV values.
42
 The difficulties of calibrating the brightness and the contrast of the frames
to get the required HSV values.
The intensity of brightness and contrast matters

greatly when it comes to acquiring the required colour pixels. In order for the
application to execute the entire mouse functions provided, all of the required
HSV values to execute the specific mouse functions must be satisfied, meaning
that the overall HSV values must be satisfied with the brightness and contrast as
well. However, the calibration can be somewhat tedious as certain intensity could
only satisfy part of the required HSV values, unless the original HSV values
were modified to prove otherwise. To overcome this issue, the application must
first start up with a calibration phase, which allows the users to choose their
desired colour pixels before directing them to the main phase.
43
Chapter 4: Methods and technologies involved
CHAPTER
4
METHODS AND TECHNOLOGIES INVOLVED
4.1 Hardware Requirement
The following describes the hardware needed in order to execute and develop the
Virtual Mouse application:
Computer Desktop or Laptop
The computer desktop or a laptop will be utilized to run the visual software in order
to display what webcam had captured. A notebook which is a small, lightweight and
inexpensive laptop computer is proposed to increase mobility.
System will be using Processor: Core2Duo Main Memory: 4GB RAM Hard Disk: 320GB
Display: 14" Monitor
Webcam
Webcam is utilized for image processing, the webcam will continuously taking
image in order for the program to process the image and find pixel position.
Software Requirement
The following describes the software needed in-order to develop the Virtual Mouse
application:
44
MATLAB
(matrix laboratory) is a multi-paradigm numerical computing environment

and proprietary programming language developed by Math Works. MATLAB allows
matrix manipulations, plotting of functions and data, implementation of algorithms, creation of
user interfaces, and interfacing with programs written in other languages, including C, C++,
C#, Java, Fortran and Python.
Although MATLAB is intended primarily for numerical computing, an optional toolbox

uses the MuPAD symbolic engine, allowing access to symbolic computing abilities. An
additional package, Simulink, adds graphical multi-domain simulation and model-based design
for dynamic and embedded systems
Version used: R2015a
C++ Language
The coding technique on developing the Virtual Mouse application will be the
C++ with the aid of the integrated development environment (IDE) that are used for developing
computer programs, known as the Microsoft Visual Studio. A C++ library provides more than
35 operators, covering basic arithmetic, bit manipulation, indirection, comparisons, logical
operations and others.
MATLAB Web-Cam toolbox
With MATLAB® Support Package for USB Webcams, you can connect to your
computer’s webcam and acquire images straight into MATLAB. Functionality is provided to
preview live images, adjust acquisition parameters, and take snapshots either individually or in
a loop. Connect to your webcam from the MATLAB desktop or through a web browser with
MATLAB Online
You can acquire images from any USB video class (UVC) compliant webcam. This
includes webcams that are built into laptops and other devices as well as those that plug into
your computer via USB port.
45
4.2 Basics of Image Processing
Image
An image is an array, or a matrix, of square pixels (picture elements) arranged in

columns and rows.
Figure 4.1: An image — an array or a matrix of pixels arranged in columns and rows
In a (8-bit) grey scale image each picture element has an assigned intensity that ranges from 0
to 255. A grey scale image is what people normally call a black and white image, but the name
emphasizes that such an image will also include many shades of grey.
Figure 4.2: Each pixel has a value from 0 (black) to 255 (white).
46
A normal grey scale image has 8 bit colour depth = 256 grey scales. A “true colour” image has
24 bit colour depth = 8 x 8 x 8 bits = 256 x 256 x 256 colours = ~16 million colours.
Figure 4.3: A true-colour image assembled from three grey scale images coloured red, green and blue. Such an
image may contain up to 16 million different colours.
Some grey scale images have more grey scales, for instance 16 bit = 65536 grey scales. In
principle three grey scale images can be combined to form an image with 281,474,976,710,656
grey scales. There are two general groups of ‘images’: vector graphics (or line art) and bitmaps
(pixel-based or ‘images’). Some of the most common file formats are:
GIF — an 8-bit (256 colour), non-destructively compressed bitmap format. Mostly used for
web. Has several sub-standards one of which is the animated GIF.
JPEG — a very efficient (i.e. much information per byte) destructively compressed 24 bit (16
million colours) bitmap format. Widely used, especially for web and Internet (bandwidth-
limited).
TIFF — the standard 24 bit publication bitmap format. Compresses non- destructively with, for
instance, Lempel-Ziv-Welch (LZW) compression.
PS — Postscript, a standard vector format. Has numerous sub-standards and can be difficult to
transport across platforms and operating systems.
PSD – a dedicated Photoshop format that keeps all the information in an image including all
the layers.
47
4.3 Colours Processing
For science communication, the two main colour spaces are RGB and CMYK.
1)RGB
The RGB colour model relates very closely to the way we perceive colour with the
r, g and b receptors in our retinas. RGB uses additive colour mixing and is the basic colour
model used in television or any other medium that projects colour with light. It is the basic
colour model used in computers and for web graphics, but it cannot be used for print
production.
The secondary colours of RGB – cyan, magenta, and yellow – are formed by
mixing two of the primary colours (red, green or blue) and excluding the third colour. Red and
green combine to make yellow, green and blue to make cyan, and blue and red form magenta.
The combination of red, green, and blue in full intensity makes white.
In Photoshop using the “screen” mode for the different layers in an

image will make the intensities mix together according to the additive colour mixing model.
This is analogous to stacking slide images on top of each other and shining light through them.
Figure 4.4: The additive model of RGB. Red, green, and blue are the primary stimuli for human colour
perception and are the primary additive colours. Courtesy of adobe.com.
48
2)CMYK
The 4-colour CMYK model used in printing lays down overlapping layers of
varying percentages of transparent cyan ©, magenta (M) and yellow (Y) inks. In addition a
layer of black (K) ink can be added. The CMYK model uses the subtractive colour model.
4.4 Image Processing
To explain better the principles of watermarking, it is necessary to explain some fundamentals

of image processing. There are two basic methods of implementing visual information in
digital form: vector and raster. Vector technique is widely used for creating new images. The
picture is implemented as a set of vectors which mathematically describe each line, curve, and
plane in the picture and their features. The other method, called raster, is related to digitizing
some existing images: for instance, photos, paintings, X-ray, UV images, and other artistic,
scientific, and engineering images. This method is created for keeping and managing huge
databases of medical, artistic, physics, etc. pictures.
In this technique, an image is described as a two-dimensional matrix, each cell of

which has coordinates on the plane x and y, so called spatial coordinates, and keeps a value of
the amplitude of the signal. For instance, in a grey scale image, the white color dot has
maximum value 255, the black one has minimum value 0, and the other levels of grey colors are
inside this interval. The square cell of this matrix is a picture element named pixel. Therefore,
as we can see, the usual mathematical representation of an image is a function of two spatial
variables: f (x, y). The value of the function f at some particular location (x,y) represents the
intensity of the image in that point. The plane of the visual information of the image is referred
as a spatial domain. The simplest picture (Figure 4.5, left) consists of black and white colours
only and therefore could be described by ones and zeros using one
49
bit for one pixel. The grey scale image (Figure 4.5, right) shows up to 256 levels of
shadows from black to white; each pixel of such image can be described by one
Byte or 8 bits. The colour image is created by a combination of three or four
matrices; each of these matrices consists of full grey scale image representing the
level of specific colour in this picture.
There are some special devices developed to transform images from real life into
digital forms, such as scanners and digital photo cameras. During scanning
process, the image has been divided into assigned number of rows and columns
then transmitted dot by dot into a digital carrier forming the matrix. The process
of dividing the image on rows and columns is referred as sampling. The value of
every pixel is calculated as an average brightness in the pixel rounded to the
nearest integer value. This process is usually referred to as amplitude quantization
or simply quantization.
The number of pixels in the unit of the measurement is referred as resolution.

This parameter defines the quality of the image. The more resolution the image
has, the more elements and details could be seen in the picture.
50
4.5 Code
% Program Name : Mouse Pointer Control
% Author : Arindam Bose
% Version : 5.5
% Copyright : ? 2013, Arindam Bose, All rights reserved.
% Description : This program controls the functions of mouse pointer
% by detecting red, green and blue colored caps
% Thanks : MATLAB Central Submission: Mouse pointer control by light
% source by Utsav Barman (29777)
% Mouse Control controls the functions of mouse pointer without
% incorporating the Mouse.
% Inputs: redThresh = Threshold for red color detection
% greenThresh = Threshold for green color detection
% blueThresh = Threshold for blue color detection
% numFrame = Total number of frames duration
% Set the default values below
% Adjust the value of thresholds for different environments
% Controls: Use 1(One) RED, 1(One) GREEN and 3(Three) BLUE Caps for
% different fingers.
% MOVE the RED finger everywhere to control the POINTER

POSITION,
51
% Show ONE BLUE finger to LEFT CLICK,
% Show TWO BLUE finger to RIGHT CLICK,
% Show THREE BLUE finger to DOUBLE CLICK,
% MOVE the GREEN finger up and down to control the MOUSE

SCROLL.
function MouseControl(redThresh, greenThresh, blueThresh, numFrame)
warning('off','vision:transition:usesOldCoordinates');
%% Initialization
if nargin < 1
redThresh = 0.22; % Threshold for Red color detection
greenThresh = 0.14; % Threshold for green color detection
blueThresh = 0.18; % Threshold for blue color detection
numFrame = 2400; % Total number of frames duration
end
cam = imaqhwinfo; % Get Camera information
cameraName = char(cam.InstalledAdaptors(end));
cameraInfo = imaqhwinfo(cameraName);
cameraId = cameraInfo.DeviceInfo.DeviceID(end);
cameraFormat = char(cameraInfo.DeviceInfo.SupportedFormats(end));
jRobot = java.awt.Robot; % Initialize the JAVA robot
vidDevice = imaq.VideoDevice(cameraName, cameraId, cameraFormat, ... %

Input Video from current adapter
52
'ReturnedColorSpace', 'RGB');
vidInfo = imaqhwinfo(vidDevice); % Acquire video information
screenSize = get(0,'ScreenSize'); % Acquire system screensize
hblob = vision.BlobAnalysis('AreaOutputPort', false, ... % Setup blob analysis

handling
'CentroidOutputPort', true, ...
'BoundingBoxOutputPort', true', ...
'MaximumBlobArea', 3000, ...
'MinimumBlobArea', 100, ...
'MaximumCount', 3);
hshapeinsBox = vision.ShapeInserter('BorderColorSource', 'Input port', ... %

Setup colored box handling
'Fill', true, ...
'FillColorSource', 'Input port', ...
'Opacity', 0.4);
hVideoIn = vision.VideoPlayer('Name', 'Final Video', ... % Setup output video

stream handling
'Position', [100 100 vidInfo.MaxWidth+20

vidInfo.MaxHeight+30]);
nFrame = 0; % Initializing variables
lCount = 0; rCount = 0; dCount = 0;
sureEvent = 5;
iPos = vidInfo.MaxWidth/2;
53
%% Frame Processing Loop
while (nFrame < numFrame)
rgbFrame = step(vidDevice); % Acquire single frame
rgbFrame = flipdim(rgbFrame,2); % Flip the frame for userfriendliness
diffFrameRed = imsubtract(rgbFrame(:,:,1), rgb2gray(rgbFrame)); % Get red

components of the image
binFrameRed = im2bw(diffFrameRed, redThresh); % Convert the image into

binary image with the red objects as white
[centroidRed, bboxRed] = step(hblob, binFrameRed); % Get the centroids and

bounding boxes of the red blobs
diffFrameGreen = imsubtract(rgbFrame(:,:,2), rgb2gray(rgbFrame)); % Get

green components of the image
binFrameGreen = im2bw(diffFrameGreen, greenThresh); % Convert the

image into binary image with the green objects as white
[centroidGreen, bboxGreen] = step(hblob, binFrameGreen); % Get the

centroids and bounding boxes of the blue blobs
diffFrameBlue = imsubtract(rgbFrame(:,:,3), rgb2gray(rgbFrame)); % Get blue

components of the image
binFrameBlue = im2bw(diffFrameBlue, blueThresh); % Convert the image

into binary image with the blue objects as white
[~, bboxBlue] = step(hblob, binFrameBlue); % Get the centroids and bounding

boxes of the blue blobs
if length(bboxRed(:,1)) == 1 % Mouse pointer movement routine
54
jRobot.mouseMove(1.5*centroidRed(:,1)*screenSize(3)/vidInfo.MaxWidth,
1.5*centroidRed(:,2)*screenSize(4)/vidInfo.MaxHeight);
end
if ~isempty(bboxBlue(:,1)) % Left Click, Right Click, Double Click routine
if length(bboxBlue(:,1)) == 1 % Left Click routine
lCount = lCount + 1;
if lCount == sureEvent % Make sure of the left click event
jRobot.mousePress(16);
pause(0.1);
jRobot.mouseRelease(16);
end
elseif length(bboxBlue(:,1)) == 2 % Right Click routine
rCount = rCount + 1;
if rCount == sureEvent % Make sure of the right click event
pause(0.1);
end
elseif length(bboxBlue(:,1)) == 3 % Double Click routine
dCount = dCount + 1;
if dCount == sureEvent % Make sure of the double click event
55
pause(0.1);
pause(0.2);
pause(0.1);
end
end
else
lCount = 0; rCount = 0; dCount = 0; % Reset the sureEvent counter
end
if ~isempty(bboxGreen(:,1)) % Scroll event routine
if (mean(centroidGreen(:,2)) - iPos) < -2
jRobot.mouseWheel(-1);
elseif (mean(centroidGreen(:,2)) - iPos) > 2
jRobot.mouseWheel(1);
end
iPos = mean(centroidGreen(:,2));
end
vidIn = step(hshapeinsBox, rgbFrame, bboxRed,single([1 0 0])); % Show the

red objects in output stream
vidIn = step(hshapeinsBox, vidIn, bboxGreen,single([0 1 0])); % Show the
56
green objects in output stream
vidIn = step(hshapeinsBox, vidIn, bboxBlue,single([0 0 1])); % Show the blue

objects in output stream
step(hVideoIn, vidIn); % Output video stream
nFrame = nFrame+1;
end
%% Clearing Memory
release(hVideoIn); % Release all memory and buffer used
release(vidDevice);
clc;
end
57
4.6 Results
a) Movement of cursor:
b) Left click event:
c) Right click event:
58
d) Double click event:
e) Cursor scroll event:
59
Chapter 5: Conclusion and Future Scope
CHAPTER
5
CONCLUSION AND FUTURE SCOPE
The Histogram-based and Explicitly threshold skin detection methods were

evaluated and based on the results, the Histogram method was deemed as more
accurate. The vision based cursor control using hand gesture system was
developed in the C++ language, using the Open CV library. The system was able
to control the movement of a Cursor by tracking the users hand. Cursor functions
were performed by using different hand gestures. The system has the potential of
being a viable replacement for the computer mouse, however due to the
constraints encountered; it cannot completely replace the computer mouse. The
major constraint of the system is that it must be operated in a well lit room. This
is the main reason why the system cannot completely replace the computer
mouse, since it is very common for computers to be used in outdoor
environments with poor lighting condition. The accuracy of the hand gesture
recognition could have been improved, if the Template Matching hand gesture
recognition method was used with a machine learning classifier. This would have
taken a lot longer to implement, but the accuracy of the gesture recognition could
have been improved. It was very difficult to control the cursor for precise cursor
movements, since the cursor was very unstable. The stability of the cursor
control could have been improved if a Kalman filter was incorporated in the
design. The Kalman filter also requires a considerable amount of time to
implement and due to time constraints, it was not implemented. All of the
operations which were intended to be performed using various gestures were
completed with satisfactory results.
60
Chapter 5: Conclusion and Future Scope
FURTURE WORK
We would improve the performance of the software especially hand tracking in

the near future. And we also want to decrease the response time of the software
for cursor movement so that it can completely be used to replace our
conventional mouse. We are also planning to design a hardware implementation
for the same in-order to improve accuracy and increase the functionality to
various domains such as a gaming controller or as a general purpose computer
controller.
Other advanced implementation include the hand gesture recognition stage to use
the Template Matching method to distinguish the hand gestures. This method
requires the use of a machine learning classifier, which takes a considerably long
time to train develop. However, it would have allow the use of lots more hand
gestures which in turn would allow the use of more mouse functions such as
zoom in and zoom out. Once the classifier is well trained, the accuracy of the
Template Matching method is expected to be better than the method used in the
proposed design. Another novel implementation of this technology would to use
the computer to train the visually or hearing impaired.
61
References
References
[1]. A. Erdem, E. Yardimci, Y. Atalay, V. Cetin, A. E.“Computer vision based

mouse”,Acoustics, Speech, and Signal Processing, Proceedings. (ICASS). IEEE
International Conference, 2002.
[2]. Hojoon Park, “A Method for Controlling the Mouse Movement using a Real
Time Camera”, Brown University, Providence, RI, USA, Department of
computer science, 2008.
[3]. Chu-Feng Lien, “Portable Vision-Based HCI – A Realtime Hand Mouse

System on Handheld Devices”, National Taiwan University, Computer Science
and Information Engineering Department.
[4]. Kamran Niyazi, Vikram Kumar, Swapnil Mahe, Swapnil Vyawahare,

“Mouse Simulation Using Two Coloured Tapes”, Department of Computer
Science, University of Pune, India, International Journal of Information Sciences
and Techniques (IJIST) Vol.2, No.2, March 2012.
[5]. K N. Shah, K R. Rathod and S. J. Agravat, “A survey on Human Computer

Interaction Mechanism Using Finger Tracking” International Journal of
Computer Trends and Technology, 7(3), 2014, 174-177.
[6]. Rafael C. Gonzalez and Richard E. Woods, Digital Image Processing, 2nd
edition, Prentice Hall, Upper Saddle River, New Jersey, 07458.
[7]. Shahzad Malik, “Real-time Hand Tracking and Finger Tracking for
Interaction”, CSC2503F Project Report, December 18, 2003.
[8]. MSDN Microsoft developers network – www.msdn.microsoft.com
[9]. Code project – www.codeproject.com/Articles/498193/Mouse-Control-via-

Webcam.
[10]. Aniket Tatipamula’s Blog -

http://anikettatipamula.blogspot.in/2012/02/hand- gesture-using-opencv.html
62

MAJOR

Uploaded by

Copyright:

Available Formats

You might also like

MAJOR

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

MAJOR

Uploaded by

Copyright:

Available Formats

Virtual Mouse using Hand Gesture and Colour

KUNTLA AJAY (157Y1A04C0)

Under the Guidance of

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

Degree for which the project is submitted: B .Tech.

HOD External Examiner Principal

List of Figures vii

List of Tables viii

Chapter No. Page

2.1 Block diagram-real time image from user 23

3.2 Colour detection 33

3.4 Conversion of images 36

3.5 Erosion and dilation 41

Figure No. Caption/Description Page

Table 1.3 Operations performed depending upon number of fingers. 16

A mouse, in computing terms is a pointing device that detects two-dimensional

1.2 Review of the Physical Mouse

Figure 1.1 Mechanical mouse, with top cover removed

Allows the users to control the Prone to degradation of the mouse

Provides precise mouse tracking Requires a flat surface to operate

Table 1.1: Advantage and disadvantage of the Mechanical Mouse

1.3 Optical and Laser Mouse

Figure 1.2 Optical Mouse, with top cover removed

 Allows better precision with lesser  Prone to button switches

 Longer life-span.  Does not function properly

 Physical mouse is subjected to mechanical wear and tear.

 Physical mouse requires special hardware and surface to operate.

 Mouse has limited functions even in present operational environments.

Motivation of the proposed project

A physical mouse normally costs depending on their functionality and features.

In short, the Flowchart for our project will be as follows:

Figure 1.3 Hand gesture recognition system

1.4 Project Description

Figure 1.4. Algorithm of colour detection

This section is separated into the following subsections:

 Hand Contour Extraction

Hand Contour Extraction

Figure 1.5 Algorithm hand tracking

 Finger Tip Identification

Number of Fingertips Detected Operations Performed

One Move Cursor

Two Left Click

Three Right Click

Four Start Button

Table 1.3: Operations performed depending upon number of fingers.

Below is a summary flowchart representation of the program:

Figure 1.6. Flow chart of hand tracking based on colour detection

The hand tracking is based on color recognition. The program is therefore

When the binary representation is generated the hand is processed in the

Figure 1.7.Binary representation of hand processing.

The properties determining whether a convexity defect is to be dismissed is the

Figure 1.8. Angle of Bounding

The defect is dismissed if:

Length < 0.4lbb

Angle > 800

1.5 WORKING OF COLOUR DETECTION

This is how it works:

First, The program asks you to place Palm on the Rectangles: