Virtual Mouse using Hand Gesture and Colour Detection
A Major Project Submitted in Partial Fulfillment of the
Requirements for the Award of the Degree
of
Bachelor of Technology

By

KUNTLA AJAY (157Y1A04C0)


BHUMA NAVYA (157Y1A04E3)

Under the Guidance of

Ms. T.TANUJA
Assistant Professor

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

April-2019
DECLARATION

Project Title: Virtual Mouse using Hand Gesture and Colour Detection.

Degree for which the project is submitted: B.Tech.

We declare that the presented project represents largely our own ideas and work in our own words. Where others' ideas or words have been included, we have adequately cited them and listed the sources in the reference material. We have adhered to all principles of academic honesty and integrity. No falsified or fabricated data have been presented in the project. The matter embodied in this project report has not been submitted by us to any other university for the award of any other degree.

----------------------- --------------------------
(KUNTLA AJAY) (BHUMA NAVYA)

(157Y1A04C0) (157Y1A04E3)

Date:

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING

CERTIFICATE

This is to certify that the project work entitled “Virtual Mouse using Hand Gesture and Colour Detection”, done by KUNTLA AJAY (157Y1A04C0) and BHUMA NAVYA (157Y1A04E3), students of the Department of Electronics and Communication Engineering, is a record of bonafide work carried out by the members under the guidance of Ms. T. TANUJA. This project is done in fulfillment of the requirements for the Bachelor of Technology degree to be awarded by Jawaharlal Nehru Technological University Hyderabad.

This is to certify that the above statement made by the candidates is correct to the best of
my knowledge.

Date: (T.TANUJA)

The Project Viva-Voce Examination of the above students has been held on ……………

HOD External Examiner Principal

ABSTRACT

The main idea behind this project is to improve on previous approaches and to develop the human-machine interface. In today's technological era, many technologies are evolving day by day. The aim of this project is to move the mouse cursor on the screen without using hardware such as a mouse, controlling the cursor through finger movements alone. We present a novel approach for Human-Computer Interaction (HCI) in which cursor movement is controlled using a real-time camera. The hand movement of a user is mapped into mouse inputs: a web camera takes live video continuously, and individual images are captured from this video using MATLAB. The user must hold a particular colour marker or pointer so that it is visible in the image the web camera takes. This colour is detected from the image pixels in MATLAB, and object detection is used to map the pixel position into mouse input.

ACKNOWLEDGEMENTS

The satisfaction that accompanies the successful completion of any task would be incomplete without mentioning the people who made it possible and whose encouragement and guidance have been a source of inspiration throughout the course of the project.

We convey our profound sense of gratitude to our principal, Dr. K. Venkateswar Reddy, and to our director, Dr. R. Kotaiah, ECE, at Marri Laxman Reddy Institute of Technology and Management for having been kind enough to arrange the necessary facilities for executing the project in college.

We would also like to express our deep sense of gratitude to our Head of the Department, Mr. K. Nagabhushanam, ECE, Marri Laxman Reddy Institute of Technology and Management, whose valuable suggestions have been indispensable in bringing about the successful completion of our project. We wish to extend special thanks to our guide, Ms. T. Tanuja, Assistant Professor, who helped us throughout the academic year to complete our project.

Finally, we are thankful to our project coordinator, Dr. G. Amarnath, the staff members of the ECE Department, and the other faculty members of our institution, and we thank all those who directly and indirectly helped us in this regard.

( K.Ajay ) (B.Navya)

CONTENTS

Declaration
Certificate
Abstract
Acknowledgements
Contents
List of Figures
List of Tables

CHAPTER 1  Introduction
1.1 Introduction
1.2 Review of the physical mouse
1.3 Optical and laser mouse
1.4 Project description
1.5 Working of colour detection

CHAPTER 2  Overview of project
2.1 Block diagram: real-time image from user
2.2 Flow diagram of colour recognition
2.3 Functions used

CHAPTER 3  System development
3.1 Flipping of images
3.2 Colour detection
3.3 Filtering of images
3.4 Conversion of images
3.5 Erosion and dilation

CHAPTER 4  Methods and technologies involved
4.1 Hardware and software requirements
4.2 Basics of image processing
4.3 Colour processing
4.4 Image processing
4.5 Code
4.6 Results

CHAPTER 5  Conclusion and Future Scope

References

LIST OF FIGURES

Figure 1.1   Mechanical mouse, with top cover removed
Figure 1.2   Optical mouse, with top cover removed
Figure 1.3   Hand gesture recognition system
Figure 1.4   Algorithm of colour detection
Figure 1.5   Algorithm of hand tracking
Figure 1.6   Flow chart of hand tracking based on colour detection
Figure 1.7   Binary representation of hand processing
Figure 1.8   Angle of bounding
Figure 1.9   Cover rectangles with palm
Figure 1.10  Detection of fingers
Figure 2.1   Overview of system
Figure 2.2   Flow diagram of the system
Figure 2.3   Flipping of images
Figure 2.4   Colour image
Figure 2.5   Grey image
Figure 3.1   Flipping of images
Figure 3.2   Grey scale conversion of flipped image
Figure 3.3   Input image
Figure 3.4   Red plane detected
Figure 3.5   Blue plane detected
Figure 3.6   Red colour filter
Figure 3.7   Blue colour filter
Figure 3.8   Red converted to BW
Figure 3.9   Blue converted to BW
Figure 3.10  Detected centre for single blue
Figure 3.11  Detected centre for double blue
Figure 3.12  Move as per coordinates
Figure 3.13  Erosion & dilation
Figure 4.1   An image: an array or matrix of pixels arranged in rows and columns
Figure 4.2   Tri-colour image
Figure 4.3   A true colour image assembled from 3 grey scales
Figure 4.4   Additive model of RGB

LIST OF TABLES

Table 1.1  Advantages and disadvantages of the Mechanical Mouse
Table 1.2  Advantages and disadvantages of the Optical and Laser Mouse
Table 1.3  Operations performed depending upon number of fingers
CHAPTER 1

INTRODUCTION

1.1 Introduction

A mouse, in computing terms, is a pointing device that detects two-dimensional movement relative to a surface. This movement is converted into the movement of a pointer on a display, allowing the user to control the Graphical User Interface (GUI) on a computer platform. Many different types of mouse already exist in modern technology. There is the mechanical mouse, which determines movement by means of a hard rubber ball that rolls around as the mouse is moved. Years later, the optical mouse was introduced; it replaced the hard rubber ball with an LED sensor that detects table-top movement and sends the information to the computer for processing. In 2004, the laser mouse was introduced to improve tracking accuracy for even the slightest hand movement, overcoming the optical mouse's difficulty in tracking high-gloss surfaces. However, no matter how accurate a mouse may be, limitations still exist in both physical and technical terms. For example, a computer mouse is a consumable hardware device, as it requires replacement in the long run: either the mouse buttons degrade, causing inappropriate clicks, or the whole mouse is no longer detected by the computer at all.

Despite these limitations, computer technology continues to grow, and so does the importance of human-computer interaction. Ever since the introduction of mobile devices that can be interacted with through touch-screen technology, the world has come to demand the same convenience on every technological device, including the desktop system. However, even though touch-screen technology for desktop systems already exists, its price can be very steep. A hand-gesture-based cursor control system allows users to give mouse inputs to a system without using an actual mouse; at the extreme, the only hardware it needs is an ordinary web camera. The system can usually be operated alongside multiple input devices, which may include an actual mouse or a computer keyboard. This system uses a web camera and works with the help of different image processing techniques.

A colour pointer is used for object recognition and tracking. Left- and right-click events of the mouse are achieved by detecting the number of colour pointers in the images. The hand movements of the user are mapped into mouse inputs. A web camera is set to take images continuously. The user must have a particular colour on his hand so that when the web camera takes an image, the colour is visible in the acquired image. This colour is detected from the image pixels, and the pixel position is mapped into mouse input.

In this project, the mouse cursor movement and click events are controlled using a camera, based on a colour detection technique. Real-time video is captured using a webcam. The user wears coloured tapes to provide information to the system. Individual frames of the video are processed separately. The processing involves an image subtraction algorithm to detect the colours. Once the colours are detected, the system performs various operations to track the cursor and carry out control actions. No additional hardware is required by the system other than the standard webcam built into most laptop computers.

Therefore, a hand-gesture-based human-computer interaction device that replaces the physical mouse or keyboard with a webcam or any other image-capturing device can be an alternative to the touch screen. The webcam is constantly utilized by software that monitors the gestures given by the user, processes them, and translates them into the motion of a pointer, similar to a physical mouse.

1.2 Review of the Physical Mouse

Various types of physical computer mouse exist in modern technology; the following discusses these types and the differences between them.

Mechanical Mouse

Known as the trackball mouse and commonly used in the 1990s, it contains a ball supported by two rotating rollers that detect the movement made by the ball itself. One roller detects forward/backward motion while the other detects left/right motion. The ball within the mouse is made of steel and covered with a layer of hard rubber so that detection is more precise. The common functions include the left/right buttons and a scroll wheel. However, due to the constant friction between the mouse ball and the rollers, the mouse is prone to degradation: usage over time causes the rollers to wear until the mouse can no longer detect motion properly, rendering it useless. The switches in the mouse buttons are no different, as long-term usage may cause the mechanism within to loosen so that mouse clicks are no longer detected until the mouse is disassembled and repaired.

Figure 1.1 Mechanical mouse, with top cover removed


The following table describes the advantages and disadvantages of the Mechanical Mouse.

ADVANTAGES
 Allows the user to control the computer system by moving the mouse.
 Provides precise mouse tracking movements.

DISADVANTAGES
 Prone to degradation of the mouse rollers and button switches, causing the mouse to become faulty.
 Requires a flat surface to operate.

Table 1.1: Advantages and disadvantages of the Mechanical Mouse

1.3 Optical and Laser Mouse

The mouse most commonly used these days. The motion of an optical mouse relies on Light Emitting Diodes (LEDs) to detect movement relative to the underlying surface, while the laser mouse is an optical mouse that uses coherent laser light. Compared to its predecessor, the mechanical mouse, the optical mouse no longer relies on rollers to determine its movement; instead, it uses an imaging array of photodiodes. The purpose of this is to eliminate the degradation that plagues its predecessor, giving it more durability while offering better resolution and precision. However, there are still downsides: even though the optical mouse functions on most opaque diffuse surfaces, it is unable to detect motion on polished surfaces. Furthermore, long-term usage without proper cleaning or maintenance may lead to dust particles becoming trapped around the LEDs, causing both optical and laser mice difficulty in detecting the surface. Beyond that, it is still prone to degradation of the button switches, which again causes the mouse to function improperly unless it is disassembled and repaired.


Figure 1.2 Optical Mouse, with top cover removed

The following table describes the advantages and disadvantages of the Optical and
Laser Mouse.

ADVANTAGES
 Allows better precision with less hand movement.
 Longer life span.

DISADVANTAGES
 Prone to button-switch degradation.
 Does not function properly on a polished surface.

Table 1.2: Advantages and disadvantages of the Optical and Laser Mouse

Problem Statement

Every technological device has its own limitations, especially when it comes to computer devices. After reviewing the various types of physical mouse, the problems were identified and generalized. The following describes the general problems from which the current physical mouse suffers:

 Physical mouse is subjected to mechanical wear and tear.

 Physical mouse requires special hardware and surface to operate.

 Physical mouse is not easily adaptable to different environments and its performance varies
depending on the environment.

 Mouse has limited functions even in present operational environments.

 Every wired and wireless mouse has a limited lifespan.

Motivation of the proposed project

It is fair to say that the virtual mouse may substitute for the traditional physical mouse in the near future, as people aim towards a lifestyle in which every technological device can be controlled and interacted with remotely, without peripheral devices such as remotes and keyboards. It not only provides convenience, it is cost effective as well.

User Convenience

To interact with a computer system, users are normally required to use an actual physical mouse, which needs a certain area of surface to operate, not to mention that it suffers from cable-length limitations. A cursor control system requires none of this: it needs only a webcam to capture the position of the user's hand and determine where the pointer should be. For example, the user can remotely control and interact with the computer system simply by facing the webcam (or any other image-capturing device) and moving his or her fingers, eliminating the need to move a physical mouse while interacting with the computer from a few feet away.

Cost Effective

What a physical mouse costs depends on its functionality and features. Since the cursor control system requires only a webcam, a physical mouse is no longer required, eliminating the need to purchase one: a single webcam is sufficient to let users interact with the computer system through it. Portable computers such as laptops, which already come with a built-in webcam, can simply utilize the software without any concern about purchasing external peripheral devices.

Problem Description

There are generally two approaches to hand gesture recognition: hardware-based, where the user must wear a device, and vision-based, which uses image processing techniques with input from a camera. The proposed system is vision-based, using image processing techniques and input from a computer webcam. Vision-based gesture recognition systems are generally broken down into four stages: skin detection, hand contour extraction, hand tracking, and gesture recognition. The input frame is captured from the webcam and the skin region is detected using skin detection. The hand contour is then found and used for hand tracking and gesture recognition. Hand tracking is used to navigate the computer cursor, and hand gestures are used to perform mouse functions such as right click, left click, scroll up and scroll down. The scope of the project is therefore to design a vision-based cursor control (CC) system that can perform the mouse functions stated above.


In short, the Flowchart for our project will be as follows:

Figure 1.3 Hand gesture recognition system

1.4 Project Description

In this section, the strategies and methods used in the design and development of the vision-based CC system are explained. The algorithm for the entire system is shown in the figure below. In order to reduce the effects of illumination, the image can be converted to a chrominance colour space, which is less sensitive to illumination changes. The HSV colour space was chosen since it was found to be the best colour space for skin detection.


Figure 1.4. Algorithm of colour detection

The next step is to use a method that differentiates skin pixels from non-skin pixels in the image (skin detection). Background subtraction is then performed to remove the face and other skin-coloured objects in the background. A morphological opening operation (erosion followed by dilation) is then applied to efficiently remove noise. A Gaussian filter is applied to smooth the image and give better edge detection. Edge detection is then performed to get the hand contour in the frame. Using the hand contour, the tip of the index finger is found and used for hand tracking and for controlling the mouse movements. The contour of the hand is also used for gesture recognition. The system can be broken down into four main components, so the method used in each component is explained separately below.


This section is separated into the following subsections:

 Skin Detection

 Hand Contour Extraction

 Hand Tracking

 Gesture Recognition

 Cursor Control

Skin Detection

Skin detection can be defined as detecting the skin-coloured pixels in an image. It is a fundamental step in a wide range of image processing applications such as face detection, hand tracking and hand gesture recognition. Skin detection using colour information has recently gained a lot of attention, since it is computationally effective and provides information that is robust against scaling, rotation and partial occlusion. It can nevertheless be a challenging task, since skin appearance in images is affected by illumination, camera characteristics, background and ethnicity. In order to reduce the effects of illumination, the image can be converted to a chrominance colour space, which is less sensitive to illumination changes.

A chrominance colour space is one in which the colour information is separated from the intensity information. In the proposed method, the HSV colour space was used with the histogram-based skin detection method. The HSV colour space has three channels: Hue (H), Saturation (S) and Value (V). The H and S channels hold the colour information, while the V channel holds the intensity information. The input image from the webcam is in the RGB colour space, so it has to be converted to the HSV colour space using the conversion formulae. The histogram-based skin detection method uses 32-bin H and S histograms to achieve skin detection. Using a small skin region, the colour of this region is converted to the chrominance colour space. A 32-bin histogram for the region is then found and used as the histogram model. Each pixel in the image is then evaluated for how probable it is under the histogram model. This method is also called histogram back projection. Back projection can be defined as recording how well pixels or patches of pixels fit the distribution of pixels in a histogram model. The result is a grayscale image (the back-projected image), where the intensity indicates the likelihood that the pixel is a skin-coloured pixel. This method is adaptive, since the histogram model is obtained from the user's skin under the present lighting conditions.
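A minimal MATLAB sketch of this back-projection step is given below; skinPatch (a small RGB sample of the user's skin) and frame (the current RGB frame) are assumed variables for illustration, not part of the report's code:

hsvPatch = rgb2hsv(skinPatch);
H = floor(hsvPatch(:,:,1) * 31) + 1;            % quantise Hue into bins 1..32
S = floor(hsvPatch(:,:,2) * 31) + 1;            % quantise Saturation likewise
model = accumarray([H(:) S(:)], 1, [32 32]);    % 32x32 H-S histogram model
model = model / max(model(:));                  % normalise to [0, 1]

hsvFrame = rgb2hsv(frame);
Hf = floor(hsvFrame(:,:,1) * 31) + 1;
Sf = floor(hsvFrame(:,:,2) * 31) + 1;
backProj = model(sub2ind([32 32], Hf, Sf));     % per-pixel skin likelihood
skinMask = backProj > 0.1;                      % threshold to a binary mask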

Hand Contour Extraction

After obtaining the skin-segmented binary image, the next step is to perform edge detection to obtain the hand contour in the image. There are several edge detection methods, such as Laplacian edge detection, Canny edge detection and border finding. The OpenCV function cvFindContours() uses a border-finding edge detection method to find the contours in the image. The major advantage of the border-finding method is that all the contours found in the image are stored in an array. This means that each contour in the image can be analysed individually to determine the hand contour. The Canny and Laplacian edge detectors are able to find the contours in the image but do not give access to each individual contour. For this reason the border-finding edge detection method was used in the proposed design.


In the contour extraction process, we are interested in extracting the hand contour so that shape analysis can be done on it to determine the hand gesture. The figure below shows the result when edge detection is applied to the skin-segmented binary image. It can be seen that besides the hand contour there are many small contours in the image. These small contours can be considered noise and must be ignored. The assumption is made that the hand contour is the largest contour, thereby ignoring all the noise contours in the image. This assumption is void if the face contour is larger than the hand contour, so the face region must be eliminated from the frame. A further assumption is that the hand is the only moving object in the image and the face remains relatively stationary compared to the hand. This means that background subtraction can be applied to remove the stationary pixels in the image, including the face region. This is implemented in the OpenCV function named “BackgroundSubtractorMOG2”.
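The contour step above is OpenCV-based; as a rough MATLAB analogue (an illustrative assumption, not the report's implementation), bwboundaries likewise returns every contour in an array so the largest can be kept:

B = bwboundaries(skinMask, 'noholes');         % all object contours in the mask
[~, idx] = max(cellfun(@(b) size(b,1), B));    % take the longest contour as the hand
handContour = B{idx};                          % N-by-2 list of [row col] points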

Hand Tracking

The movement of the cursor is controlled by the tip of the index finger. In order to identify the tip of the index finger, the centre of the palm must first be found. The method used for finding the hand centre was adopted from prior work and has the advantage of being simple and easy to implement. The algorithm for the method is shown in the flow chart of the figure below. For each point inside the inscribed circle, the shortest distance to the contour is measured, and the point with the largest such distance is recorded as the hand centre. The distance between the hand centre and the hand contour is taken as the radius of the hand. The hand centre is calculated for each successive frame, and using it, the tip of the index finger is identified and used for hand tracking. The method used for identifying the index and the other fingers is described in the following subsection. The results for hand tracking are demonstrated in the Results and Analysis section.
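The same largest-inscribed-distance search can be written compactly with a distance transform; this sketch assumes skinMask is the binary hand image from the detection step:

D = bwdist(~skinMask);              % distance of every hand pixel to the contour
[radius, idx] = max(D(:));          % largest distance marks the palm centre
[cy, cx] = ind2sub(size(D), idx);   % (cx, cy) = hand centre; radius = palm radius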


Figure 1.5: Algorithm of hand tracking


Gesture Recognition

The gesture recognition method used in the proposed design is a combination of two methods: the method proposed by Yeo and the method proposed by Balazs. The algorithm for the proposed gesture recognition method is described in the flow chart of the figure below. The convexity defects of the hand contour must first be calculated; this is done using the OpenCV built-in function “cvConvexityDefects”. The parameters of the convexity defects (start point, end point and depth point) are stored in a sequence of arrays. After the convexity defects are obtained, there are two main steps for gesture recognition:

 Finger Tip Identification

 Number of Fingers.


Cursor Control

Once the hand gestures are recognized, it is a simple matter of mapping the different hand gestures to specific mouse functions. It turns out that controlling the computer cursor in the C/C++ programming language is relatively easy: by linking the User32 library into the program, the “SendInput” function allows control of the computer cursor. Instructions on how to properly use this function were obtained from the Microsoft Developer Network (MSDN) website. The function is only available on Windows 2000 Professional or later, which introduces a new limitation: the system can only be used on newer versions of the Windows operating system. The algorithm for the cursor control is shown in the figure below.
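The MATLAB implementation listed later in this report (Section 4.5) achieves the same cursor control through the Java robot rather than SendInput; a minimal sketch:

jRobot = java.awt.Robot;       % Java robot, callable from inside MATLAB
jRobot.mouseMove(400, 300);    % place the cursor at screen pixel (400, 300)
jRobot.mousePress(16);         % 16 = BUTTON1_MASK, the left mouse button
jRobot.mouseRelease(16);       % press followed by release = one left click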


Following table shows the Operations Performed depending upon the number of fingers detected:

Number of Fingertips Detected Operations Performed

One Move Cursor

Two Left Click

Three Right Click

Four Start Button

Five My Computer

Table 1.3: Operations performed depending upon number of fingers.

Starting with the position of the index fingertip, the cursor is moved to the fingertip position. This is done using the “SendInput” function to control the cursor movement. The next step is to determine whether a hand gesture was performed. If a hand gesture was performed, the “SendInput” function is again used to trigger the corresponding cursor function. If there is no change in fingertip position, the loop is exited and started again when a change in fingertip position is detected.


Below is a summary flowchart representation of the program:

Figure 1.6. Flow chart of hand tracking based on colour detection


The hand tracking is based on colour recognition. The program is therefore initialized by sampling colour from the hand. The hand is then extracted from the background by thresholding with the sampled colour profile. Each colour in the profile produces a binary image, and these are all summed together. A nonlinear median filter is then applied to get a smooth, noise-free binary representation of the hand.

When the binary representation has been generated, the hand is processed in the following way:

Figure 1.7: Binary representation of hand processing


The property determining whether a convexity defect is to be dismissed is the angle between the lines going from the defect point to the neighbouring convex polygon vertices.

Figure 1.8. Angle of Bounding

The defect is dismissed if:

length < 0.4 · l_bb (where l_bb is the length of the bounding box)

angle > 80°

The analysis results in data that can be of further use in gesture recognition:

 Fingertip positions
 Number of fingers
 Number of hands
 Area of hands


1.5 WORKING OF COLOUR DETECTION

This is how it works:

First, the program asks you to place your palm on the rectangles:

Figure 1.9: Cover rectangles with palm


Now, let’s see how it tracks our palm and detects our fingers:

Figure 1.10. Detection of fingers

This detects 5 fingertips and hence five fingers.

This detects 4 fingertips and hence four fingers.


This detects 3 fingertips and hence three fingers.

This detects 2 fingertips and hence two fingers.

This detects 1 fingertip and hence one finger.

CHAPTER 2

OVERVIEW OF PROJECT

2.1 Block Diagram

Real Time Image from User

According to the system requirements, we need to give RGB colour image inputs to the system. These colour components are placed on the fingertips of the user. The input is given as a continuous sequence of image frames, captured using a webcam.

Figure 2.1 Overview of system



Webcam Taking Images

For the system to work we need a sensor to detect the hand movements of the user. The
webcam of the computer is used as a sensor. The webcam captures the real time video at a
fixed frame rate and resolution which is determined by the hardware of the camera. The frame
rate and resolution can be changed in the system if required.

 Computer Webcam is used to capture the Real Time Video.

 Video is divided into image frames based on the FPS (frames per second) of the camera.

 Processing of individual Frames.

Image Pre-processing

Image pre-processing involves flipping the input images. When the camera captures an image, it is mirrored: if we move the colour pointer towards the left, the image of the pointer moves towards the right, and vice versa, just like the image obtained when we stand in front of a mirror (left is detected as right and right is detected as left). To avoid this problem, we need to flip the image horizontally. The captured image is an RGB image, and here the flip is not applied to it directly; instead, the individual colour channels of the image are separated and then flipped individually. After flipping the red, blue and green channels individually, they are concatenated and a flipped RGB image is obtained.

Mouse Movements

The control actions of the mouse are performed by controlling the flags associated with the mouse buttons. The Java Robot class is used to access these flags. The user has to perform hand gestures in order to create the control actions. Due to the use of colour pointers, the computation time required is reduced. Furthermore, the system becomes resistant to background noise and low-illumination conditions.


Clicking actions are based on the following colour detections:

 Red indicates cursor movement.
 Green represents cursor scroll.
 Single blue represents left click.
 Double blue represents right click.
 Triple blue represents double click.


2.2 Flow Diagram of Colour Recognition


Figure 2.2: Flow diagram of the System


Following are the steps in the working of our project; a condensed MATLAB sketch follows the list:

 Capturing real time video using Web-Camera.

 Processing the individual image frame.

 Flipping of each image frame.

 Conversion of each frame to a grey scale image.

 Colour detection and extraction of the different colours (RGB) from flipped gray scale image.

 Conversion of the detected image into a binary image.

 Finding the region of the image and calculating its centroid.

 Tracking the mouse pointer using the coordinates obtained from the centroid.

 Simulating the left click and the right click events of the mouse by assigning different colour
pointers.
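The steps above condense into a few lines of MATLAB. This sketch assumes a frame rgbFrame from the webcam, a red pointer, and the 0.15 threshold reported in Section 3.4:

rgbFrame  = flipdim(rgbFrame, 2);                    % mirror correction
grayFrame = rgb2gray(rgbFrame);                      % grey scale copy
diffRed   = imsubtract(rgbFrame(:,:,1), grayFrame);  % extract the red plane
bwRed     = im2bw(diffRed, 0.15);                    % binary image of the pointer
bwRed     = bwareaopen(bwRed, 100);                  % remove small noise regions
stats     = regionprops(bwRed, 'Centroid');          % centroid of the region
if ~isempty(stats)
    c = stats(1).Centroid;                           % [x y] used to place the cursor
end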

2.3 Functions Used

SYNTAX

B = flipdim(A,dim)

DESCRIPTION

B = flipdim(A,dim) returns A with dimension dim flipped.

When the value of dim is 1, the array is flipped row-wise down. When dim is 2, the
array is flipped column wise left to right. flipdim(A,1) is the same as flipud(A), and
flipdim(A,2) is the same as fliplr(A).


Figure 2.3: Flipping of images

SYNTAX

Z = imsubtract(X,Y)

DESCRIPTION

Z = imsubtract(X,Y)

subtracts each element in array Y from the corresponding element in array X and
returns the difference in the corresponding element of the output array Z.

If X is an integer array, elements of the output that exceed the range of the integer type
are truncated, and fractional values are rounded.

im2bw

Convert image to binary image, based on threshold

SYNTAX

BW = im2bw(I,level)

DESCRIPTION

BW = im2bw(I,level) converts the grayscale image I to the binary image BW by replacing all pixels in the input image with luminance greater than level with the value 1 (white) and replacing all other pixels with the value 0 (black).


This range is relative to the signal levels possible for the image's class.
Therefore, a level value of 0.5 corresponds to an intensity value halfway between
the minimum and maximum value of the class.

rgb2gray

Convert RGB image or colormap to grayscale

SYNTAX

I = rgb2gray(RGB)
newmap = rgb2gray(map)

DESCRIPTION

I = rgb2gray(RGB) converts the true colour image RGB to the grayscale intensity image I. The rgb2gray function converts RGB images to grayscale by eliminating the hue and saturation information while retaining the luminance. If you have Parallel Computing Toolbox™ installed, rgb2gray can perform this conversion on a GPU.

Figure 2.4: Colour image

Figure 2.5: Gray Image

CHAPTER 3

SYSTEM DEVELOPMENT

In an object tracking application, one of the main problems is object detection. Instead of fingertips, a colour pointer has been used to make object detection easy and fast. To simulate the click events of the mouse, three fingers serving as three colour pointers have been used. The basic algorithm is as follows:

 Flipping of images

 Colour detection

 Filtering the Images

 Conversion of Images

 Removing small areas

 Find Centre

 Move the cursor

 Mouse click event

 Erosion and dilation


3.1 Flipping of Images

When the camera captures an image, it is mirrored: if we move the colour pointer towards the left, the image of the pointer moves towards the right, and vice versa, just like the image obtained when we stand in front of a mirror (left is detected as right and right is detected as left). To avoid this problem we need to flip the image horizontally. The captured image is an RGB image, and here the flip is not applied to it directly. The individual colour channels of the image are separated and then flipped individually. After flipping the red, blue and green channels individually, they are concatenated and a flipped RGB image is obtained.
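A minimal sketch of this channel-wise mirror correction, assuming a webcam frame rgbFrame:

R = flipdim(rgbFrame(:,:,1), 2);   % flip each colour channel
G = flipdim(rgbFrame(:,:,2), 2);   %   left-to-right individually
B = flipdim(rgbFrame(:,:,3), 2);
flipped = cat(3, R, G, B);         % re-concatenate into a flipped RGB image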

Figure 3.1: Flipping of images


Conversion of Flipped Image into Gray scale Image:

Compared to a coloured image, computational complexity is reduced in a grey scale image. Thus the flipped image is converted into a grey scale image, and all the necessary operations are performed after this conversion.

Figure 3.2: Gray scale conversion of flipped image.

3.2 Colour detection

In this project, the blue and red planes are used (see Figures 3.3, 3.4 and 3.5). In order to identify the blue colour of the hand, the MATLAB built-in function “imsubtract” can be used:

Z = imsubtract(X, Y)

where the function subtracts each element of array Y from the corresponding element of array X and returns the difference in the corresponding element of the output array Z. X and Y are real, non-sparse numeric arrays of the same size.
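A short sketch of the plane extraction behind Figures 3.4 and 3.5, again assuming a webcam frame rgbFrame:

grayFrame = rgb2gray(rgbFrame);                       % luminance reference
diffRed   = imsubtract(rgbFrame(:,:,1), grayFrame);   % red response only
diffBlue  = imsubtract(rgbFrame(:,:,3), grayFrame);   % blue response only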


Figure 3.3 : Input Image

Figure 3.4 : Red plane detected


Figure 3.5 : Blue plane detected

3.3 Filtering the Images

As soon as the blue colour in the real-time image is detected, the next step is to filter this single frame. Care has to be taken about the processing speed of every frame, since every camera delivers a different number of frames per second. Median filtering gives optimum results for such operations; the result of the filtering should look like Figures 3.6 and 3.7. Median filtering is used mainly to remove “salt and pepper” noise. Although convolution can be used for this purpose, a median filter is more effective when the goal is to reduce noise while preserving edges. Illumination is a major factor when taking real-time images: as seen from the images, there is not much noise when the illumination is high.
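A one-line sketch of this step using medfilt2, the Image Processing Toolbox 2-D median filter (the 3×3 neighbourhood is an assumed choice):

filtered = medfilt2(diffRed, [3 3]);   % removes salt-and-pepper noise, keeps edges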

Figure 3.6 : Red color filter


Figure 3.7: Blue color filter

3.4 Conversion of Images

As soon as filtering is done on a frame, the next step is to convert the frame to a binary image using the built-in function “im2bw”. The function can be used as

BW = im2bw(I, level)

where I is the image. im2bw replaces all pixels in the input image whose luminance is greater than level with the value 1 (white) and replaces all other pixels with the value 0 (black). Level must be in the range 0 to 1, since the output image itself has binary levels no greater than 1; a level value of 0.5 is therefore midway between black and white, regardless of class. A threshold of 0.15 gave the best results over a large range of illumination.
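A sketch of the conversion with the 0.15 threshold, applied to the filtered frame from the previous step:

bwRed = im2bw(filtered, 0.15);   % white (1) where the pointer colour is strong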


Figure 3.8: Red converted to BW

Figure 3.9: Blue converted to BW


Find Centre

In order to position the mouse pointer precisely, finding the centroid of the detected region is necessary. The MATLAB function “bwlabel” can be used to label the genuine area; in other words, the required region can be detected (see Figures 3.10 and 3.11). To get the properties of the region, such as the centre point or bounding box, MATLAB's built-in regionprops function can be used as

STATS = regionprops(BW, properties)

which measures a set of properties for each connected component (object) in the binary image BW. For the user to control the mouse pointer, it is necessary to determine a point whose coordinates can be sent to the cursor; with these coordinates, the system can control the cursor movement. This built-in MATLAB function is used to find the centroid of the detected region. Its output is a matrix consisting of the X (horizontal) and Y (vertical) coordinates of the centroid, which change with time as the object moves across the screen.
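A minimal sketch of this step on the binary image bwRed:

labelled = bwlabel(bwRed);                               % label connected regions
stats = regionprops(labelled, 'Centroid', 'BoundingBox');
if ~isempty(stats)
    centroid = stats(1).Centroid;                        % [X Y] of the first region
end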


Figure 3.10: Detected Center for single blue

Figure 3.11: Detected center for Double blue

Tracking the Mouse pointer

Once the coordinates have been determined, the mouse driver is accessed and the coordinates are sent to the cursor. With these coordinates, the cursor places itself in the required position. It is assumed that the object moves continuously: each time a new centroid is determined, the cursor obtains a new position for that frame, thus creating an effect of tracking. So as the user moves his hand across the field of view of the camera, the mouse moves proportionally across the screen.

Move the cursor

Movement of the cursor is the last step, where the actual action has to be taken. Given the centroid from the image above, the movement has to take place. To move the cursor to the desired (X, Y) coordinates, MATLAB provides the set(0, 'PointerLocation', [x, y]) call; MATLAB does not, however, provide any function for clicking events. To move the mouse and to simulate mouse click events, the Java class java.awt.Robot, which has all these abilities, can be used. The camera resolution limits the precision of the mouse pointer, so using a better-quality camera is beneficial. In Figure 3.12, the resolution of the input image was 640x480 and the resolution of the computer monitor was 1280x800. If the camera resolution is lower than the monitor resolution, scaling should be used.
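A sketch of the scaling from camera coordinates to screen coordinates for the 640x480 camera and 1280x800 monitor mentioned above (centroid is taken from the Find Centre step):

screenSize = get(0, 'ScreenSize');        % [left bottom width height]
x = centroid(1) * screenSize(3) / 640;    % scale X from camera to screen width
y = centroid(2) * screenSize(4) / 480;    % scale Y from camera to screen height
jRobot = java.awt.Robot;
jRobot.mouseMove(x, y);                   % place the cursor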

Figure 3.12: Move as per co-ordinates

Mouse click event

Clicking is the most challenging task, since MATLAB does not provide any function for it. In usual mouse operation, the left button performs different tasks for a single click and a double click. There are several ways to handle this. One method is to use three pointers and, when those pointers are detected, to decide the clicking event depending on how long the pointer is detected. All pointers are used for movement, and two pointers are used for clicking: for a left click, one pointer (the one on the right side) is hidden, and the same idea applies to the right click.

3.5 Erosion and dilation

Morphology is a set of image processing operations that process images based on their shapes. A structuring element is applied to an input image, giving an output image of similar size. Each pixel of the input image is compared with its neighbours, thereby giving a value for the output pixel. The factors of the neighbourhood, such as shape and size, can be decided by the programmer, thereby constructing programmer-defined morphological operations for the input image. The most basic morphological operations are dilation and erosion (see Figures 3.13(a) and 3.13(b)). Dilation adds pixels to the boundaries of objects in an image, while erosion removes pixels on object boundaries. The number of pixels added or removed differs according to the structuring element used. In morphological dilation and erosion operations, the value of each pixel in the output image is determined by applying a specific rule to the corresponding pixel and its neighbours in the input image; the rule applied determines whether the operation is a dilation or an erosion.
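A sketch of the two operations on the binary mask, using an assumed disk of radius 5 as the structuring element:

se = strel('disk', 5);           % disk-shaped structuring element
eroded  = imerode(bwRed, se);    % erosion removes boundary pixels
dilated = imdilate(bwRed, se);   % dilation adds boundary pixels
opened  = imopen(bwRed, se);     % opening = erosion then dilation (denoising)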

Figure 3.13(a): Erosion


Figure 3.13(b): Dilation

Implementation Issues and Challenges

Throughout the development of the application, several implementation issues occurred. The following describes the issues and challenges likely to be encountered throughout the development phase.

 The interruption of salt-and-pepper noise within the captured frames.

Salt-and-pepper noise occurs when the captured frame contains regions with the required HSV values that are too small to be considered an input but still undergo the processing pipeline. To overcome this issue, the unwanted HSV pixels within the frame must first be filtered off, including pixel areas that are too large or too small. With this method, the likelihood of interruption from similar pixels is greatly reduced.

 Performance degradation due to high processing load on low-tier systems.

Since the application must filter, process and execute the mouse functions in real time, it can be CPU-intensive on most low-tier systems. If the captured frames are too large, the time taken to process each frame increases drastically. To overcome this issue, the application should process only the essential part of each frame and avoid redundant filtering steps that could slow it down.

 The difficulty of calibrating the brightness and the contrast of the frames to get the required HSV values.

The intensity of brightness and contrast matters greatly when it comes to acquiring the required colour pixels. For the application to execute all of the mouse functions provided, all of the required HSV values must be satisfied, which means the brightness and contrast must suit them as well. The calibration can be somewhat tedious, as a certain intensity may satisfy only part of the required HSV values unless the original HSV values are modified. To overcome this issue, the application should first start with a calibration phase, which allows the users to choose their desired colour pixels before directing them to the main phase.

CHAPTER 4

METHODS AND TECHNOLOGIES INVOLVED

4.1 Hardware and Software Requirements

The following describes the hardware needed in order to execute and develop the
Virtual Mouse application:

Computer Desktop or Laptop

The computer desktop or laptop is used to run the vision software and display what the webcam has captured. A notebook, a small, lightweight and inexpensive laptop computer, is proposed to increase mobility.

The development system used:
Processor: Core 2 Duo
Main memory: 4 GB RAM
Hard disk: 320 GB
Display: 14" monitor

Webcam

The webcam is utilized for image processing: it continuously takes images so that the program can process them and find the pixel position.

Software Requirement

The following describes the software needed in order to develop the Virtual Mouse application:


MATLAB

MATLAB (matrix laboratory) is a multi-paradigm numerical computing environment and proprietary programming language developed by MathWorks. It allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, C#, Java, Fortran and Python.

Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing abilities. An additional package, Simulink, adds graphical multi-domain simulation and model-based design for dynamic and embedded systems.

Version used: R2015a

C++ Language

The Virtual Mouse application can also be developed in C++ with the aid of an integrated development environment (IDE) used for developing computer programs, Microsoft Visual Studio. The C++ language provides more than 35 operators, covering basic arithmetic, bit manipulation, indirection, comparisons, logical operations and others.

MATLAB Web-Cam toolbox

With the MATLAB® Support Package for USB Webcams, you can connect to your computer's webcam and acquire images straight into MATLAB. Functionality is provided to preview live images, adjust acquisition parameters, and take snapshots either individually or in a loop. You can connect to your webcam from the MATLAB desktop or through a web browser with MATLAB Online.

You can acquire images from any USB Video Class (UVC) compliant webcam. This includes webcams that are built into laptops and other devices as well as those that plug into your computer via a USB port.
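A minimal acquisition sketch using this support package:

cam = webcam;          % connect to the first available webcam
preview(cam);          % open a live preview window
img = snapshot(cam);   % grab one frame as an RGB array
clear cam;             % release the device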


4.2 Basics of Image Processing

Image

An image is an array, or a matrix, of square pixels (picture elements) arranged in columns and rows.

Figure 4.1: An image — an array or a matrix of pixels arranged in columns and rows

In an 8-bit grey scale image, each picture element has an assigned intensity that ranges from 0 to 255. A grey scale image is what people normally call a black-and-white image, but the name emphasizes that such an image also includes many shades of grey.

Figure 4.2: Each pixel has a value from 0 (black) to 255 (white).


A normal grey scale image has 8-bit colour depth = 256 grey scales. A “true colour” image has 24-bit colour depth = 3 × 8 bits = 256 × 256 × 256 colours ≈ 16 million colours.

Figure 4.3: A true-colour image assembled from three grey scale images coloured red, green and blue. Such an
image may contain up to 16 million different colours.

Some grey scale images have more grey scales, for instance 16-bit = 65,536 grey scales. In principle, three grey scale images can be combined to form an image with 281,474,976,710,656 shades. There are two general groups of images: vector graphics (or line art) and bitmaps (pixel-based images). Some of the most common file formats are:

GIF — an 8-bit (256 colour), non-destructively compressed bitmap format. Mostly used for
web. Has several sub-standards one of which is the animated GIF.

JPEG — a very efficient (i.e. much information per byte) destructively compressed 24-bit (16 million colours) bitmap format. Widely used, especially for the web and Internet (bandwidth-limited).

TIFF — the standard 24-bit publication bitmap format. Compresses non-destructively with, for instance, Lempel-Ziv-Welch (LZW) compression.

PS — Postscript, a standard vector format. Has numerous sub-standards and can be difficult to
transport across platforms and operating systems.

PSD – a dedicated Photoshop format that keeps all the information in an image including all
the layers.


4.3 Colour Processing

For science communication, the two main colour spaces are RGB and CMYK.

1) RGB

The RGB colour model relates very closely to the way we perceive colour with the
r, g and b receptors in our retinas. RGB uses additive colour mixing and is the basic colour
model used in television or any other medium that projects colour with light. It is the basic
colour model used in computers and for web graphics, but it cannot be used for print
production.

The secondary colours of RGB – cyan, magenta, and yellow – are formed by
mixing two of the primary colours (red, green or blue) and excluding the third colour. Red and
green combine to make yellow, green and blue to make cyan, and blue and red form magenta.
The combination of red, green, and blue in full intensity makes white.

In Photoshop, using the “screen” mode for the different layers in an image will make the intensities mix together according to the additive colour mixing model. This is analogous to stacking slide images on top of each other and shining light through them.

Figure 4.4: The additive model of RGB. Red, green, and blue are the primary stimuli for human colour
perception and are the primary additive colours. Courtesy of adobe.com.


2) CMYK

The 4-colour CMYK model used in printing lays down overlapping layers of varying percentages of transparent cyan (C), magenta (M) and yellow (Y) inks. In addition, a layer of black (K) ink can be added. The CMYK model uses the subtractive colour model.

4.4 Image Processing

To explain the principles involved, it is necessary to cover some fundamentals of image processing. There are two basic methods of implementing visual information in digital form: vector and raster. The vector technique is widely used for creating new images: the picture is implemented as a set of vectors which mathematically describe each line, curve and plane in the picture and their features. The other method, called raster, is related to digitizing existing images, for instance photos, paintings, X-ray and UV images, and other artistic, scientific and engineering images. This method was created for keeping and managing huge databases of medical, artistic, scientific and other pictures.

In this technique, an image is described as a two-dimensional matrix, each cell of which has coordinates on the plane, x and y (so-called spatial coordinates), and keeps a value of the amplitude of the signal. For instance, in a grey scale image, a white dot has the maximum value 255, a black one has the minimum value 0, and the other levels of grey lie inside this interval. The square cell of this matrix is a picture element, named a pixel. Therefore, the usual mathematical representation of an image is a function of two spatial variables: f(x, y). The value of the function f at a particular location (x, y) represents the intensity of the image at that point. The plane of the visual information of the image is referred to as the spatial domain. The simplest picture (Figure 4.5, left) consists of black and white colours only and can therefore be described by ones and zeros, using one bit per pixel. A grey scale image (Figure 4.5, right) shows up to 256 levels of shades from black to white; each pixel of such an image can be described by one byte, or 8 bits. A colour image is created by a combination of three or four matrices, each of which is a full grey scale image representing the level of a specific colour in the picture.

There are special devices developed to transform images from real life into digital form, such as scanners and digital photo cameras. During the scanning process, the image is divided into an assigned number of rows and columns and then transmitted dot by dot onto a digital carrier, forming the matrix. The process of dividing the image into rows and columns is referred to as sampling. The value of every pixel is calculated as the average brightness in the pixel, rounded to the nearest integer value. This process is usually referred to as amplitude quantization, or simply quantization.

The number of pixels per unit of measurement is referred to as resolution. This parameter defines the quality of the image: the higher the resolution, the more elements and details can be seen in the picture.


4.5 Code

% Program Name : Mouse Pointer Control
% Author       : Arindam Bose
% Version      : 5.5
% Copyright    : © 2013, Arindam Bose, All rights reserved.
% Description  : This program controls the functions of the mouse pointer
%                by detecting red, green and blue coloured caps.
% Thanks       : MATLAB Central submission "Mouse pointer control by light
%                source" by Utsav Barman (29777).
%
% Mouse Control controls the functions of the mouse pointer without
% using the physical mouse.
%
% Inputs:   redThresh   = threshold for red colour detection
%           greenThresh = threshold for green colour detection
%           blueThresh  = threshold for blue colour detection
%           numFrame    = total number of frames to process
% Set the default values below; adjust the thresholds for different
% lighting environments.
%
% Controls: use 1 (one) RED, 1 (one) GREEN and 3 (three) BLUE caps on
% different fingers.
%           MOVE the RED finger anywhere to control the POINTER POSITION,
%           show ONE BLUE finger to LEFT CLICK,
%           show TWO BLUE fingers to RIGHT CLICK,
%           show THREE BLUE fingers to DOUBLE CLICK,
%           MOVE the GREEN finger up and down to control the MOUSE SCROLL.

function MouseControl(redThresh, greenThresh, blueThresh, numFrame)
warning('off', 'vision:transition:usesOldCoordinates');

%% Initialization
if nargin < 1
    redThresh   = 0.22;   % Threshold for red colour detection
    greenThresh = 0.14;   % Threshold for green colour detection
    blueThresh  = 0.18;   % Threshold for blue colour detection
    numFrame    = 2400;   % Total number of frames to process
end

cam          = imaqhwinfo;                               % Get camera information
cameraName   = char(cam.InstalledAdaptors(end));
cameraInfo   = imaqhwinfo(cameraName);
cameraId     = cameraInfo.DeviceInfo.DeviceID(end);
cameraFormat = char(cameraInfo.DeviceInfo.SupportedFormats(end));

jRobot = java.awt.Robot;        % Initialize the Java Robot (synthesizes mouse events)

vidDevice = imaq.VideoDevice(cameraName, cameraId, cameraFormat, ... % Input video from current adapter
    'ReturnedColorSpace', 'RGB');
vidInfo    = imaqhwinfo(vidDevice);      % Acquire video information
screenSize = get(0, 'ScreenSize');       % Acquire system screen size

hblob = vision.BlobAnalysis('AreaOutputPort', false, ... % Set up blob analysis handling
    'CentroidOutputPort', true, ...
    'BoundingBoxOutputPort', true, ...
    'MaximumBlobArea', 3000, ...
    'MinimumBlobArea', 100, ...
    'MaximumCount', 3);

hshapeinsBox = vision.ShapeInserter('BorderColorSource', 'Input port', ... % Set up coloured box drawing
    'Fill', true, ...
    'FillColorSource', 'Input port', ...
    'Opacity', 0.4);

hVideoIn = vision.VideoPlayer('Name', 'Final Video', ... % Set up output video stream
    'Position', [100 100 vidInfo.MaxWidth+20 vidInfo.MaxHeight+30]);

nFrame = 0;                              % Initialize variables
lCount = 0; rCount = 0; dCount = 0;
sureEvent = 5;               % Consecutive frames required to confirm a click event
iPos = vidInfo.MaxHeight/2;  % Previous vertical position of the green marker

%% Frame Processing Loop
while (nFrame < numFrame)
    rgbFrame = step(vidDevice);          % Acquire a single frame
    rgbFrame = flipdim(rgbFrame, 2);     % Mirror the frame left-right for user-friendliness

    diffFrameRed = imsubtract(rgbFrame(:,:,1), rgb2gray(rgbFrame)); % Extract the red component of the image
    binFrameRed  = im2bw(diffFrameRed, redThresh);     % Binary image with the red objects as white
    [centroidRed, bboxRed] = step(hblob, binFrameRed); % Centroids and bounding boxes of the red blobs

    diffFrameGreen = imsubtract(rgbFrame(:,:,2), rgb2gray(rgbFrame)); % Extract the green component of the image
    binFrameGreen  = im2bw(diffFrameGreen, greenThresh);     % Binary image with the green objects as white
    [centroidGreen, bboxGreen] = step(hblob, binFrameGreen); % Centroids and bounding boxes of the green blobs

    diffFrameBlue = imsubtract(rgbFrame(:,:,3), rgb2gray(rgbFrame)); % Extract the blue component of the image
    binFrameBlue  = im2bw(diffFrameBlue, blueThresh);  % Binary image with the blue objects as white
    [~, bboxBlue] = step(hblob, binFrameBlue);         % Bounding boxes of the blue blobs

    if length(bboxRed(:,1)) == 1         % Mouse pointer movement routine
        jRobot.mouseMove(1.5*centroidRed(:,1)*screenSize(3)/vidInfo.MaxWidth, ...
            1.5*centroidRed(:,2)*screenSize(4)/vidInfo.MaxHeight);
    end

    if ~isempty(bboxBlue(:,1))           % Left click, right click, double click routines
        if length(bboxBlue(:,1)) == 1    % Left click routine
            lCount = lCount + 1;
            if lCount == sureEvent       % Confirm the left click event
                jRobot.mousePress(16);   % 16 = InputEvent.BUTTON1_MASK (left button)
                pause(0.1);
                jRobot.mouseRelease(16);
            end
        elseif length(bboxBlue(:,1)) == 2 % Right click routine
            rCount = rCount + 1;
            if rCount == sureEvent       % Confirm the right click event
                jRobot.mousePress(4);    % 4 = InputEvent.BUTTON3_MASK (right button)
                pause(0.1);
                jRobot.mouseRelease(4);
            end
        elseif length(bboxBlue(:,1)) == 3 % Double click routine
            dCount = dCount + 1;
            if dCount == sureEvent       % Confirm the double click event
                jRobot.mousePress(16);
                pause(0.1);
                jRobot.mouseRelease(16);
                pause(0.2);
                jRobot.mousePress(16);
                pause(0.1);
                jRobot.mouseRelease(16);
            end
        end
    else
        lCount = 0; rCount = 0; dCount = 0; % Reset the sureEvent counters
    end

    if ~isempty(bboxGreen(:,1))          % Scroll event routine
        if (mean(centroidGreen(:,2)) - iPos) < -2
            jRobot.mouseWheel(-1);       % Green marker moved up: scroll up
        elseif (mean(centroidGreen(:,2)) - iPos) > 2
            jRobot.mouseWheel(1);        % Green marker moved down: scroll down
        end
        iPos = mean(centroidGreen(:,2));
    end

    vidIn = step(hshapeinsBox, rgbFrame, bboxRed, single([1 0 0])); % Mark the red objects in the output stream
    vidIn = step(hshapeinsBox, vidIn, bboxGreen, single([0 1 0]));  % Mark the green objects in the output stream
    vidIn = step(hshapeinsBox, vidIn, bboxBlue, single([0 0 1]));   % Mark the blue objects in the output stream
    step(hVideoIn, vidIn);               % Display the output video stream
    nFrame = nFrame + 1;
end

%% Clearing Memory
release(hVideoIn);                       % Release all memory and buffers used
release(vidDevice);
clc;
end
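The colour segmentation works by subtracting the greyscale version of the frame
from a single colour channel (imsubtract), which suppresses white, grey and
black regions so that only strongly coloured pixels survive the im2bw
threshold. A typical invocation looks like this (the threshold values below are
illustrative, not tuned):

% Run with custom thresholds and a shorter session (values are illustrative)
MouseControl(0.25, 0.16, 0.20, 1200);
% Or simply accept the defaults defined in the function:
MouseControl();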


4.6 Results

The following pointer events were demonstrated:

a) Movement of cursor

b) Left click event

c) Right click event

d) Double click event

e) Cursor scroll event

CHAPTER 5
CONCLUSION AND FUTURE SCOPE

The histogram-based and explicit-threshold skin detection methods were
evaluated, and based on the results the histogram method was deemed more
accurate. The vision-based cursor control using hand gesture system was
developed in the C++ language using the OpenCV library. The system was able to
control the movement of a cursor by tracking the user's hand, and cursor
functions were performed using different hand gestures. The system has the
potential to be a viable replacement for the computer mouse; however, due to
the constraints encountered, it cannot completely replace it. The major
constraint is that the system must be operated in a well-lit room. This is the
main reason the system cannot completely replace the computer mouse, since it
is very common for computers to be used outdoors or under poor lighting
conditions. The accuracy of the hand gesture recognition could have been
improved if the Template Matching recognition method had been used with a
machine learning classifier. This would have taken much longer to implement,
but the accuracy of the gesture recognition could have been improved. It was
also very difficult to achieve precise cursor movements, since the cursor was
very unstable. The stability of the cursor control could have been improved if
a Kalman filter had been incorporated in the design (a minimal sketch is given
below); the Kalman filter also requires a considerable amount of time to
implement and, due to time constraints, it was not implemented. All of the
operations which were intended to be performed using the various gestures were
completed with satisfactory results.
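To make the suggestion concrete, a minimal sketch of such a filter is shown
here (not implemented in this project; it assumes a constant-velocity motion
model, and the noise parameters q and r are placeholders that would need
tuning):

% Minimal sketch: constant-velocity Kalman filter for a 2-D cursor position.
% State x = [px; py; vx; vy]; measurement z = [px; py]. q and r are assumed.
function [x, P] = cursorKalman(x, P, z, dt, q, r)
A = [1 0 dt 0; 0 1 0 dt; 0 0 1 0; 0 0 0 1]; % constant-velocity motion model
H = [1 0 0 0; 0 1 0 0];                     % only position is measured
Q = q*eye(4);                               % process noise covariance (assumed)
R = r*eye(2);                               % measurement noise covariance (assumed)
x = A*x;                                    % predict the state
P = A*P*A' + Q;                             % predict the covariance
K = P*H' / (H*P*H' + R);                    % Kalman gain
x = x + K*(z - H*x);                        % correct with the measurement
P = (eye(4) - K*H)*P;                       % update the covariance
end

Each new red-cap centroid would be passed in as z, and the smoothed position
x(1:2) used in place of the raw centroid when calling jRobot.mouseMove.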


FUTURE WORK

We intend to improve the performance of the software, especially the hand
tracking, in the near future. We also want to decrease the response time of the
software for cursor movement so that it can completely replace the conventional
mouse. We are also planning a hardware implementation of the same system in
order to improve accuracy and extend the functionality to other domains, such
as a gaming controller or a general-purpose computer controller.

Another advanced implementation would extend the hand gesture recognition stage
to use the Template Matching method to distinguish hand gestures. This method
requires a machine learning classifier, which takes considerable time to train
and develop. However, it would allow many more hand gestures, which in turn
would enable more mouse functions such as zoom in and zoom out. Once the
classifier is well trained, the accuracy of the Template Matching method is
expected to be better than that of the method used in the proposed design (a
sketch of the basic matching step follows below). Another novel application of
this technology would be to use the computer to train the visually or hearing
impaired.
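To give a flavour of the Template Matching method (an illustrative sketch only;
the image file names and the 0.7 acceptance threshold are assumptions),
normalized cross-correlation can locate a stored gesture template within a
frame:

% Illustrative sketch: find a gesture template in a greyscale frame using
% normalized cross-correlation. The image files and threshold are assumed.
frame    = rgb2gray(imread('frame.png'));        % current RGB frame (hypothetical file)
template = rgb2gray(imread('gesture_fist.png')); % stored gesture template (hypothetical file)
c        = normxcorr2(template, frame);          % correlation surface
[peak, idx] = max(c(:));                         % strongest match
if peak > 0.7                                    % assumed acceptance threshold
    [yPeak, xPeak] = ind2sub(size(c), idx);      % location of the match in c
    xTopLeft = xPeak - size(template, 2) + 1;    % top-left corner of the match in the frame
    yTopLeft = yPeak - size(template, 1) + 1;
    fprintf('Gesture matched at (%d, %d)\n', xTopLeft, yTopLeft);
end

A well-trained classifier would replace the fixed threshold with a learned
decision rule over many such correlation scores.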
