VIRTUAL MOUSE USING HAND GESTURE AND COLOUR DETECTION
A Major Project Submitted in Partial Fulfillment of the
Requirements for the Award of the Degree
of
Bachelor of Technology
By
KUNTLA AJAY (157Y1A04C0)
BHUMA NAVYA (157Y1A04E3)
Under the Guidance of
Ms. T. TANUJA
Assistant Professor
April-2019
DECLARATION
Project Title: Virtual Mouse using Hand Gesture and Colour Detection.
We declare that the presented project represents largely our own ideas and work in our own
words. Where others' ideas or words have been included, we have adequately cited and
listed in the reference materials. We have adhered to all principles of academic honesty and
integrity. No falsified or fabricated data have been presented in the project. The matter
embodied in this project report has not been submitted by us to any other university for the
award of any other degree.
----------------------- --------------------------
(KUNTLA AJAY) (BHUMA NAVYA)
(157Y1A04C0) (157Y1A04E3)
Date:
DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING
CERTIFICATE
This is to certify that the project work entitled “Virtual Mouse using Hand Gesture and
Colour Detection”, done by KUNTLA AJAY (157Y1A04C0) and BHUMA NAVYA
(157Y1A04E3), students of the Department of Electronics and Communication
Engineering, is a record of bona fide work carried out by the members under the guidance
of Ms. T. TANUJA. This project is submitted in partial fulfillment of the requirements for
the Bachelor of Technology degree to be awarded by Jawaharlal Nehru Technological
University Hyderabad.
This is to certify that the above statement made by the candidates is correct to the best of
my knowledge.
Date: (T.TANUJA)
The Project Viva-Voce Examination of the above students has been held on ……………
ABSTRACT
The main idea behind choosing this project is to improve on previous approaches and to
develop the Human-Machine Interface. In today's technological era, many technologies are
evolving day by day. The aim of this project is to move the mouse cursor on the screen
without using hardware such as a mouse, only through finger movements. We present a
novel approach for Human Computer Interaction (HCI) where cursor movement is
controlled using a real time camera. In this project, the hand movement of a user is mapped
into mouse inputs. A web camera is set up to take live video continuously, and from this
video various images are captured using MATLAB. The user must have a particular colour
marker or pointer in his hand, so that when the web camera takes an image the marker is
visible in it. This colour is detected from the image pixels in MATLAB, and object
detection is used to map the pixel position into mouse input.
ACKNOWLEDGEMENTS
The satisfaction that accompanies the successful completion of any task would be
incomplete without mentioning the people who made it possible, and whose
encouragement and guidance have been a source of inspiration throughout the course of
the project.
It is a great pleasure to convey our profound sense of gratitude to our principal
Dr. K. Venkateswar Reddy, and to our director Dr. R. Kotaiah, ECE, at Marri Laxman
Reddy Institute of Technology and Management, for having been kind enough to arrange
the necessary facilities for executing the project in college.
We would also like to express our deep sense of gratitude to our Head of the Department,
Mr. K. Nagabhushanam, ECE, Marri Laxman Reddy Institute of Technology and
Management, whose valuable suggestions have been indispensable in bringing about the
successful completion of our project. We wish to extend special thanks to our guide,
Ms. T. Tanuja, Assistant Professor, who helped us throughout the academic year to
complete our project.
Finally, we are thankful to our project coordinator Dr. G. Amarnath, the staff members of
the ECE Department, and the other faculty members of our institution, and we thank all
those who directly and indirectly helped us in this regard.
( K.Ajay ) (B.Navya)
CONTENTS
Declaration
Certificate
Abstract
Acknowledgements
Contents
CHAPTER 4: Methods and Technologies Involved
4.1 Hardware and Software Requirements
4.2 Basics of Image Processing
4.3 Colour Processing
4.4 Image Processing
4.5 Code
4.6 Results
CHAPTER 5: Conclusion and Future Scope
References
LIST OF FIGURES
Figure 4.1 An image array or a matrix of pixels arranged in rows and columns
Figure 4.2 Tri-colour image
Figure 4.3 A true colour image assembled by 3 gray scales
Figure 4.4 Additive model of RGB
LIST OF TABLES
Table 1.1 Advantage and disadvantage of the Mechanical Mouse
Table 1.2 Advantage and disadvantage of the Optical and Laser Mouse
CHAPTER 1: INTRODUCTION
1.1 Introduction
Computer technology continues to grow, and with it the importance of human computer
interaction. Ever since the introduction of mobile devices that can be interacted with
through a touch screen, the world has started to demand the same technology on every
technological device, including the desktop system. However, even though touch screen
technology for the desktop already exists, its price can be very steep. A hand gesture based
cursor control system allows users to give mouse inputs to a system without using an
actual mouse; in the extreme case it needs no dedicated hardware at all beyond an ordinary
web camera. The system can usually be operated alongside multiple input devices, which
may include an actual mouse or a computer keyboard. This system uses a web camera
which works with the help of different image processing techniques. A colour pointer has
been used for object recognition and tracking. Left and right click events of the mouse
have been achieved by detecting the number of colour pointers in the images. The hand
movements of a user are mapped into mouse inputs. A web camera is set to take images
continuously. The user must hold a particular colour in his hand so that when the web
camera takes an image it is visible in the acquired image. This colour is detected from the
image pixels and the pixel position is mapped into mouse input.
In this project, the mouse cursor movement and click events are controlled using a camera
based on a colour detection technique. Real time video is captured using a web camera.
The user wears coloured tapes to provide information to the system. Individual frames of
the video are processed separately. The processing techniques involve an image
subtraction algorithm to detect colours. Once the colours are detected, the system performs
various operations to track the cursor and performs control actions. No additional
hardware is required by the system other than the standard webcam provided in most
laptop computers. Therefore, a hand gesture based human computer interaction device that
replaces the physical mouse or keyboard by using a webcam or any other image capturing
device can be an alternative to the touch screen. This device, the webcam, is constantly
utilized by software that monitors the gestures given by the user in order to process them
and translate them into motion of a pointer, similar to a physical mouse.
There are various types of physical computer mouse in modern technology; the following
discusses their types and differences.
Mechanical Mouse
Also known as the trackball mouse, it was commonly used in the 1990s. The ball within
the mouse is supported by two rotating rollers which detect the movement made by the
ball itself. One roller detects the forward/backward motion while the other detects the
left/right motion. The ball within the mouse is made of steel covered with a layer of hard
rubber, so that the detection is more precise. The common functions included are the
left/right buttons and a scroll wheel. However, due to the constant friction between the
mouse ball and the rollers, the mouse is prone to degradation: long term usage causes the
rollers to wear until the mouse can no longer detect motion properly, rendering it useless.
The switches in the mouse buttons are no different, as long term usage may loosen the
mechanics within until the mouse no longer registers clicks unless it is disassembled and
repaired.
The following table describes the advantages and disadvantages of the Mechanical Mouse.
ADVANTAGES | DISADVANTAGES
Table 1.1: Advantage and disadvantage of the Mechanical Mouse
Optical and Laser Mouse
A mouse commonly used these days: the motion of an optical mouse relies on Light
Emitting Diodes (LEDs) to detect movements relative to the underlying surface, while the
laser mouse is an optical mouse that uses coherent laser light. Compared to its predecessor,
the mechanical mouse, the optical mouse no longer relies on rollers to determine its
movement; instead it uses an imaging array of photodiodes. The purpose of implementing
this is to eliminate the degradation that plagues its predecessor, giving it more durability
while offering better resolution and precision. However, there are still some downsides:
even though the optical mouse is functional on most opaque diffuse surfaces, it is unable
to detect motion on polished surfaces. Furthermore, long term usage without proper
cleaning or maintenance may lead to dust particles trapped near the LEDs, which causes
both optical and laser mice to have surface detection difficulties. Other than that, it is still
prone to degradation of the button switches, which again causes the mouse to function
improperly unless it is disassembled and repaired.
The following table describes the advantages and disadvantages of the Optical and
Laser Mouse.
ADVANTAGES | DISADVANTAGES
Table 1.2: Advantage and disadvantage of the Optical and Laser Mouse
Problem Statement
It is a known fact that every technological device has its own limitations, especially when
it comes to computer devices. After reviewing the various types of physical mouse, the
problems were identified and generalized. The following describes the general problems
from which the current physical mouse suffers:
The physical mouse is not easily adaptable to different environments, and its performance
varies depending on the environment.
Every wired or wireless mouse has its own limited lifespan.
It is fair to say that the Virtual Mouse may soon substitute the traditional physical mouse,
as people are aiming towards a lifestyle where every technological device can be
controlled and interacted with remotely, without using any peripheral devices such as
remotes, keyboards, etc. This does not just provide convenience; it is cost effective as
well.
User Convenience
It is known that in order to interact with the computer system, users are required to use an
actual physical mouse, which also requires a certain surface area to operate on, not to
mention that it suffers from cable length limitations. The cursor control system requires
none of this: it needs only a webcam to capture the position of the user's hand in order to
determine where the user wants the pointer to be. For example, the user is able to remotely
control and interact with the computer system just by facing the webcam, or any other
image capturing device, and moving his fingers, thus eliminating the need to manually
move a physical mouse, while still being able to interact with the computer system from a
few feet away.
Cost Effective
A physical mouse has a cost that depends on its functionality and features. Since the
cursor control system requires only a webcam, a physical mouse is no longer required,
thus eliminating the need to purchase one, as a single webcam is sufficient to allow users
to interact with the computer system through it. Portable computer systems such as
laptops, which already come with a built-in webcam, can simply utilize the software
without any concern about purchasing external peripheral devices.
Problem Description
There are generally two approaches to hand gesture recognition: hardware based, where
the user must wear a device, and vision based, which uses image processing techniques
with input from a camera. The proposed system is vision based, using image processing
techniques and input from a computer webcam. Vision based gesture recognition systems
are generally broken down into four stages: skin detection, hand contour extraction, hand
tracking and gesture recognition. The input frame is captured from the webcam and the
skin region is detected using skin detection. The hand contour is then found and used for
hand tracking and gesture recognition. Hand tracking is used to navigate the computer
cursor and hand gestures are used to perform mouse functions such as right click, left
click, scroll up and scroll down. The scope of the project is therefore to design a vision
based cursor control (CC) system, which can perform the mouse functions previously
stated.
In this section the strategies and methods used in the design and development of the vision
based CC system will be explained. The algorithm for the entire system is shown in the
figure below. In order to reduce the effects of illumination, the image can be converted to
a chrominance colour space, which is less sensitive to illumination changes. The HSV
colour space was chosen since it was found to be the best colour space for skin detection.
The next step would be to use a method
that would differentiate skin pixels from non-skin pixels in the image (skin detection). Background
subtraction was then performed to remove the face and other skin colour objects in the background.
Morphology Opening operation (erosion followed by dilation) was then applied to efficiently remove
noise. A Gaussian filter was applied to smooth the image and give better edge detection. Edge
detection was then performed to get the hand contour in the frame. Using the hand contour, the tip of
the index finger was found and used for hand tracking and controlling the mouse movements. The
contour of the hand was also used for gesture recognition. The system can be broken down
into four main components; thus in the Methodology the method used in each component
of the system will be explained separately.
Skin Detection
Hand Tracking
Gesture Recognition
Cursor Control
Skin Detection
Skin detection can be defined as detecting the skin colour pixels in an image. It is a
fundamental step in a wide range of image processing applications such as face detection,
hand tracking and hand gesture recognition. Skin detection using colour information has
recently gained a lot of attention, since it is computationally effective and provides robust
information against scaling, rotation and partial occlusion. It can nevertheless be a
challenging task, since skin appearance in images is affected by illumination, camera
characteristics, background and ethnicity. In order to reduce the effects of illumination,
the image can be converted to a chrominance colour space, which is less sensitive to
illumination changes.
A chrominance colour space is one where the intensity information is separated from the
colour information. In the proposed method, the HSV colour space was used with the
histogram-based skin detection method. The HSV colour space has three channels: Hue
(H), Saturation (S) and Value (V). The H and S channels hold the colour information,
while the V channel holds the intensity information. The input image from the webcam is
in the RGB colour space, so it has to be converted to the HSV colour space using the
conversion formulae. The histogram-based skin detection method uses 32-bin H and S
histograms to achieve skin detection. Using a small skin region, the colour of this region is
converted to the chrominance colour space. A 32-bin histogram for the region is then
found and used as the histogram model. Each pixel in the image is then evaluated for how
probable it is under the histogram model. This method is also called histogram back
projection. Back projection can be defined as recording how well pixels or patches of
pixels fit the distribution of pixels in a histogram model. The result is a grayscale image
(the back projected image), where the intensity indicates the likelihood that the pixel is a
skin colour pixel. This method is adaptive, since the histogram model is obtained from the
user's skin under the preset lighting condition.
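The back projection step described above can be sketched in a few lines. The following is a hypothetical pure-Python illustration using a single hue channel with the report's 32 bins; a real implementation would build a 2-D H-S histogram (for example with OpenCV's calcBackProject), and the sample hue values below are invented for illustration.

```python
# Sketch of histogram back projection on a single hue channel.
NBINS = 32  # same bin count as in the text


def build_histogram(samples, nbins=NBINS, max_val=180):
    """Count hue samples from a known skin patch into nbins bins, normalised."""
    hist = [0] * nbins
    for v in samples:
        hist[min(v * nbins // max_val, nbins - 1)] += 1
    total = float(len(samples))
    return [h / total for h in hist]


def back_project(image, hist, nbins=NBINS, max_val=180):
    """Replace every pixel by the model probability of its hue bin."""
    return [[hist[min(v * nbins // max_val, nbins - 1)] for v in row]
            for row in image]


skin_patch = [20, 22, 25, 21, 23, 24]   # hue values sampled from the hand
frame = [[22, 90], [150, 24]]           # toy 2x2 hue image
likelihood = back_project(frame, build_histogram(skin_patch))
# pixels with hue near the patch get a high likelihood, others get 0
```

Pixels whose hue falls in a bin populated by the skin patch receive a high value and all others receive zero, which is exactly the grayscale "back projected image" the text describes.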
After obtaining the skin segmented binary image, the next step is to perform edge
detection to obtain the hand contour in the image. There are several edge detection
methods, such as Laplacian edge detection, Canny edge detection and border following.
The OpenCV function cvFindContours() uses a border following edge detection method to
find the contours in the image. The major advantage of the border following method is
that all the contours found in the image are stored in an array. This means that we can
analyse each contour in the image individually to determine the hand contour. The Canny
and Laplacian edge detectors are able to find the contours in the image, but do not give us
access to each individual contour. For this reason the border following edge detection
method was used in the proposed design.
In the contour extraction process, we are interested in extracting the hand contour so that shape
analysis can be done on it to determine the hand gesture. Figure below shows the result when edge
detection was applied to the skin segmented binary image. It can be seen that besides the hand
contour, there are lots of small contours in the image. These small contours can be considered as noise
and must be ignored. The assumption was made that the hand contour is the largest contour thereby
ignoring all the noise contours in the image. This assumption can be violated if the face
contour is larger than the hand contour. To solve this problem, the face region must be
eliminated from the frame. The assumption was made that the hand is the only moving
object in the image and the face remains relatively stationary compared to the hand. This
means that background subtraction can be applied to remove the stationary pixels in the
image, including the face region. This is implemented using the OpenCV class
“BackgroundSubtractorMOG2”.
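The "largest contour is the hand" assumption can be illustrated with a small sketch. This is not the report's code: contours are plain lists of (x, y) points and the area comes from the shoelace formula, standing in for what cv2.contourArea would compute on contours returned by cv2.findContours.

```python
# Pick the largest contour by enclosed area, mirroring the assumption that
# the hand is the largest contour left after background subtraction.
def contour_area(points):
    """Shoelace formula for the area enclosed by a closed polygon."""
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0


def largest_contour(contours):
    return max(contours, key=contour_area)


noise = [(0, 0), (1, 0), (1, 1), (0, 1)]        # tiny noise blob, area 1
hand = [(0, 0), (10, 0), (10, 12), (0, 12)]     # hand-sized blob, area 120
assert largest_contour([noise, hand]) == hand
```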
Hand Tracking
The movement of the cursor was controlled by the tip of the index finger. In order to
identify the tip of the index finger, the centre of the palm must first be found. The method
used for finding the hand centre has the advantage of being simple and easy to implement.
The algorithm for the method is shown in the flow chart of the figure below. The shortest
distance from each point inside the inscribed circle to the contour is measured, and the
point with the largest such distance is recorded as the hand centre. The distance between
the hand centre and the hand contour is taken as the radius of the hand. The hand centre is
calculated for each successive frame, and using the hand centre, the tip of the index finger
is identified and used for hand tracking. The method used for identifying the index and the
other fingers is described in the following subsection. The results for hand tracking are
demonstrated in the Results and Analysis section.
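The hand-centre search described above can be sketched as a brute-force scan: for every candidate point, measure the distance to the nearest contour point, and keep the candidate whose distance is largest. The toy square "contour" below is illustrative only.

```python
import math


def hand_centre(contour, xs, ys):
    """Return the candidate point farthest from the contour, plus that distance.

    The winning distance doubles as the palm radius, as in the text.
    """
    best, best_d = None, -1.0
    for x in xs:
        for y in ys:
            d = min(math.hypot(x - cx, y - cy) for cx, cy in contour)
            if d > best_d:
                best, best_d = (x, y), d
    return best, best_d


# Square "palm" contour: its centre is the point farthest from every edge point.
square = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 0), (0, 2), (4, 2), (2, 4)]
centre, radius = hand_centre(square, range(5), range(5))
```

A real implementation would restrict the candidates to points inside the contour (the inscribed circle of the text) rather than scanning a full grid.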
Gesture Recognition
The gesture recognition method used in the proposed design is a combination of two
methods: the method proposed by Yeo and the method proposed by Balazs. The algorithm
for the proposed gesture recognition method is described in the flow chart of the figure
below. The convexity defects of the hand contour must first be calculated; this was done
using the OpenCV built-in function “cvConvexityDefects”. The parameters of the
convexity defects (start point, end point and depth point) are stored in a sequence of
arrays. After the convexity defects are obtained, there are two main steps for gesture
recognition, the first of which is finding the number of fingers.
Cursor Control
Once the hand gestures are recognized, it is a simple matter of mapping different hand
gestures to specific mouse functions. It turns out that controlling the computer cursor in
the C/C++ programming language is relatively easy. By linking the User32.lib library into
the program, the “SendInput” function allows control of the computer cursor. Instructions
on how to properly use this function were obtained from the Microsoft Developer
Network (MSDN) website. This function is only available on the Windows 2000
Professional operating system or later. This introduces a new limitation on the system, in
that it can only be used on newer versions of the Windows operating system. The
algorithm for the cursor control is shown in the figure below.
The following table shows the operation performed depending upon the number of fingers
detected:
Fingers | Operation
Five | My Computer
Starting with the position of the index fingertip, the cursor is moved to the fingertip
position. This is done using the “SendInput” function to control the cursor movement. The
next step is to determine whether a hand gesture was performed. If a hand gesture was
performed, the “SendInput” function is again used to trigger the corresponding cursor
function. If there is no change in fingertip position, the loop is exited and started again
when a change in fingertip position is detected.
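The dispatch from finger count to action can be sketched as a simple lookup. Only the "five fingers, My Computer" row of the report's table survives; the other entries below are illustrative placeholders, not the authors' actual mapping.

```python
# Hypothetical finger-count -> action table. Entries 1-4 are placeholders;
# only the five-finger row is documented in the report.
ACTIONS = {
    1: "move cursor",
    2: "left click",
    3: "right click",
    4: "scroll",
    5: "open My Computer",  # the row documented in the report's table
}


def action_for(fingers):
    """Look up the action for a detected finger count; default to no action."""
    return ACTIONS.get(fingers, "no action")
```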
The analysis results in data that can be of further use in gesture recognition:
Fingertip positions
Number of fingers
Number of hands
Area of hands
Now, let’s see how it tracks our palm and detects our fingers:
CHAPTER 2: OVERVIEW OF PROJECT
According to the system requirements, we need to give RGB colour inputs to the system.
These colour components will be placed on the finger tips of the user. The input is given
as a continuous sequence of image frames, captured using a webcam.
For the system to work we need a sensor to detect the hand movements of the user. The
webcam of the computer is used as a sensor. The webcam captures the real time video at a
fixed frame rate and resolution which is determined by the hardware of the camera. The frame
rate and resolution can be changed in the system if required.
Video is divided into image frames based on the FPS (frames per second) of the camera.
Image Pre-processing
Image pre-processing involves flipping the input images. When the camera captures an
image, it is mirrored. This means that if we move the colour pointer towards the left, the
image of the pointer moves towards the right and vice versa, just like the image obtained
when we stand in front of a mirror (left is detected as right and right is detected as left). To
avoid this problem, we need to flip the image left to right. The image captured is an RGB
image and flipping actions cannot be directly performed on it. So, the individual colour
channels of the image are separated and then flipped individually. After flipping the red,
blue and green channels individually, they are concatenated and a flipped RGB image is
obtained.
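The channel-wise flipping described above can be sketched in pure Python, with a 2x2 list of (R, G, B) tuples standing in for a webcam frame. In MATLAB the same effect is obtained by applying flipdim (or fliplr) to each colour plane.

```python
# Split an RGB frame into channels, mirror each channel left-to-right,
# then re-assemble, exactly as the pre-processing step describes.
def split_channels(img):
    return [[[px[c] for px in row] for row in img] for c in range(3)]


def flip_lr(channel):
    return [list(reversed(row)) for row in channel]


def merge_channels(r, g, b):
    return [[(r[i][j], g[i][j], b[i][j]) for j in range(len(r[0]))]
            for i in range(len(r))]


frame = [[(255, 0, 0), (0, 255, 0)],
         [(0, 0, 255), (255, 255, 255)]]
r, g, b = split_channels(frame)
mirrored = merge_channels(flip_lr(r), flip_lr(g), flip_lr(b))
# left and right columns swap, undoing the webcam's mirror effect
```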
Mouse Movements
The control actions of the mouse are performed by controlling the flags associated with
the mouse buttons. The Java Robot class is used to access these flags. The user has to
perform hand gestures in order to create the control actions. Due to the use of colour
pointers, the computation time required is reduced. Furthermore, the system becomes
resistant to background noise and low illumination conditions.
Colour detection and extraction of the different colours (RGB) from the flipped grayscale
image.
Tracking the mouse pointer using the coordinates obtained from the centroid.
Simulating the left click and right click events of the mouse by assigning different colour
pointers.
flipdim
SYNTAX
B = flipdim(A,dim)
DESCRIPTION
When the value of dim is 1, the array is flipped row-wise down. When dim is 2, the
array is flipped column wise left to right. flipdim(A,1) is the same as flipud(A), and
flipdim(A,2) is the same as fliplr(A).
imsubtract
SYNTAX
Z = imsubtract(X,Y)
DESCRIPTION
Z = imsubtract(X,Y)
subtracts each element in array Y from the corresponding element in array X and
returns the difference in the corresponding element of the output array Z.
If X is an integer array, elements of the output that exceed the range of the integer type
are truncated, and fractional values are rounded.
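A pure-Python sketch of this behaviour for uint8-style data, where differences below zero are clipped rather than wrapped around:

```python
# Sketch of imsubtract's behaviour for uint8 arrays: the element-wise
# difference is clipped at 0 so results never go negative.
def imsubtract(x, y):
    return [[max(a - b, 0) for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


X = [[200, 50], [120, 0]]
Y = [[100, 80], [20, 5]]
assert imsubtract(X, Y) == [[100, 0], [100, 0]]
```

The clipping at 0 is what makes the later "blue channel minus grayscale" trick work: regions that are not predominantly blue simply go to zero.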
im2bw
SYNTAX
BW = im2bw(I,level)
DESCRIPTION
im2bw converts the grayscale image I to a binary image, replacing all pixels in the input
image with luminance greater than level with the value 1 (white) and all other pixels with
the value 0 (black). The specified level must be in the range [0, 1]. This range is relative to
the signal levels possible for the image's class. Therefore, a level value of 0.5 corresponds
to an intensity value halfway between the minimum and maximum value of the class.
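A minimal sketch of im2bw's thresholding rule for a grayscale image normalised to [0, 1] (the toy pixel values are illustrative; 0.15 is the level the report settles on later):

```python
# Sketch of im2bw: pixels strictly above the level become 1 (white),
# all others become 0 (black).
def im2bw(img, level):
    return [[1 if px > level else 0 for px in row] for row in img]


gray = [[0.05, 0.40], [0.10, 0.90]]
binary = im2bw(gray, 0.15)  # 0.15 is the level used later in the report
```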
rgb2gray
SYNTAX
I = rgb2gray(RGB)
newmap = rgb2gray(map)
DESCRIPTION
rgb2gray converts the truecolor image RGB to the grayscale image I by eliminating the
hue and saturation information while retaining the luminance. Similarly, newmap =
rgb2gray(map) returns a grayscale colormap equivalent to map.
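The conversion can be sketched per pixel as a weighted sum of the R, G and B components, using the luminance coefficients MATLAB documents for rgb2gray:

```python
# Per-pixel grayscale conversion: 0.2989 R + 0.5870 G + 0.1140 B.
def rgb2gray(px):
    r, g, b = px
    return 0.2989 * r + 0.5870 * g + 0.1140 * b


assert round(rgb2gray((255, 255, 255))) == 255  # white stays white
assert rgb2gray((0, 0, 0)) == 0.0               # black stays black
```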
CHAPTER 3: SYSTEM DEVELOPMENT
In an object tracking application, one of the main problems is object detection. Instead of
finger tips, a colour pointer has been used to make object detection easy and fast. To
simulate the click events of the mouse, three fingers carrying three colour pointers have
been used. The basic algorithm consists of the following implementation steps:
Flipping of images
Colour detection
Conversion of images
Finding the centre
When the camera captures an image, it is mirrored. This means that if we move the colour
pointer towards the left, the image of the pointer moves towards the right and vice versa,
just like the image obtained when we stand in front of a mirror (left is detected as right and
right is detected as left). To avoid this problem we need to flip the image left to right. The
image captured is an RGB image and flipping actions cannot be directly performed on it.
So, the individual colour channels of the image are separated and then flipped
individually. After flipping the red, blue and green colour channels individually, they are
concatenated and a flipped RGB image is obtained.
In this project, blue and red planes are used (see figure 2, figure 3 and figure 4). In order to
identify the blue colour of the pointer, the MATLAB built-in function “imsubtract” can be
used:
Z = imsubtract(X, Y)
where this function subtracts each element in array Y from the corresponding element in
array X and gives the difference in the corresponding element of the output array Z.
As soon as the blue colour in the real time image is detected, the next step is to filter this
single frame. Care has to be taken about the processing speed for every frame, since every
camera has a different number of frames per second.
Median filtering gives optimum results for such operations; the result of the filtering
should look as shown in Fig 5. Median filtering is basically used to remove “salt and
pepper” noise. Although convolution can be used for this purpose, a median filter is more
effective when the goal is to reduce noise and preserve edges. Illumination is a major
factor when taking real time images: there is not much noise when the illumination is
high, as seen from the images.
As soon as filtering is done on a frame, the next step is to convert the image. For
conversion of the image one may use the built-in function “im2bw”.
The output BW image replaces all pixels in the input image with luminance greater than
the threshold level with the value 1 (white) and replaces all other pixels with the value 0
(black). The level should be in the range from 0 to 1, since the output image itself has
binary levels not greater than 1. A level value of 0.5 is therefore midway between black
and white, regardless of class. The threshold 0.15 gave the best result over a large range of
illumination.
Find Centre
In order to make the mouse pointer more precise, finding the centroid is necessary. Here
the MATLAB function “bwlabel” can be used for labelling the genuine area; in other
words, the required region can be detected (see figure 10). To get the properties of the
region, such as the centre point or bounding box, MATLAB's built-in regionprops
function can be used, which measures a set of properties for each connected component
(object) in the binary image BW.
For the user to control the mouse pointer it is necessary to determine a point whose
coordinates can be sent to the cursor. With these coordinates, the system can control the
cursor movement. An inbuilt function in MATLAB is used to find the centroid of the
detected region. The output of the function is a matrix consisting of the X (horizontal) and
Y (vertical) coordinates of the centroid. These coordinates change with time as the object
moves across the screen.
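The centroid computation can be sketched directly: average the coordinates of all white pixels in the binary image, which is what regionprops reports as 'Centroid' for a single connected component.

```python
# Centroid of the white (1) pixels in a binary image.
def centroid(bw):
    xs, ys, n = 0, 0, 0
    for y, row in enumerate(bw):
        for x, px in enumerate(row):
            if px:
                xs += x
                ys += y
                n += 1
    return (xs / n, ys / n) if n else None


blob = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
centre = centroid(blob)  # mean of the four white-pixel coordinates
```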
Once the coordinates have been determined, the mouse driver is accessed and the
coordinates are sent to the cursor. With these coordinates, the cursor places itself in the
required position. It is assumed that the object moves continuously; each time a new
centroid is determined, the cursor obtains a new position for each frame, thus creating an
effect of tracking. So as the user moves his hand across the field of view of the camera,
the mouse moves proportionally across the screen.
Movement of the cursor is the last step, where the actual decision has to be taken. Using
the centroid obtained above, movement has to take place. To move the cursor to the
desired (X, Y) coordinates, MATLAB has the set(0,'PointerLocation',[x,y]) function.
MATLAB does not provide any function for the clicking events. To move the mouse and
to simulate mouse click events, the Java class java.awt.Robot, which has all these
abilities, can be used. The resolution of the camera is directly proportional to the
resolution of the mouse pointer, so using a better quality camera is beneficial. In figure 11,
the resolution of the input image was 640x480 and the resolution of the computer monitor
was 1280x800. If the resolution of the camera is less than that of the monitor screen,
scaling should be used.
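The scaling mentioned above can be sketched as a proportional mapping from the 640x480 camera frame to the 1280x800 monitor:

```python
# Map a centroid found in the camera frame onto screen coordinates.
CAM_W, CAM_H = 640, 480     # camera resolution from the report
SCR_W, SCR_H = 1280, 800    # monitor resolution from the report


def to_screen(cx, cy):
    """Scale camera-frame coordinates proportionally to screen coordinates."""
    return (round(cx * SCR_W / CAM_W), round(cy * SCR_H / CAM_H))


assert to_screen(320, 240) == (640, 400)  # frame centre maps to screen centre
```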
Clicking can be a challenging task, since MATLAB does not provide any function for this.
In usual mouse operation the left button of the mouse performs different tasks for single
click and double click. There are several ways to handle this. One of the methods is to use
three pointers and, when those pointers are detected, to decide the clicking events
depending on the time for which each pointer is detected. For movement all pointers are
used, and for clicking two pointers are used: if the user wants a left click, one pointer is
hidden from the right side, and similarly from the left side for a right click.
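One possible reading of this three-pointer scheme is sketched below. The three-marker layout comes from the text; the exact decision rule (which pointer is hidden for which click) is an illustrative interpretation, not the authors' verified logic.

```python
# Hypothetical click classification from the visibility of the three markers.
def classify(pointers_visible):
    """pointers_visible: (left, middle, right) booleans for the 3 markers."""
    left, middle, right = pointers_visible
    if left and middle and right:
        return "move"                 # all pointers visible: just track
    if left and middle and not right:
        return "left click"           # right-most pointer hidden
    if middle and right and not left:
        return "right click"          # left-most pointer hidden
    return "no action"


assert classify((True, True, True)) == "move"
assert classify((True, True, False)) == "left click"
```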
Throughout the development of the application, several implementation issues occurred. The following describes the issues and challenges likely to be encountered during the development phase.
The first is the interruption caused by salt-and-pepper noise within the captured frames. Salt-and-pepper noise occurs when a captured frame contains regions with the required HSV values that are too small: these regions still undergo the full series of processing steps even though they are not large enough to be considered an input. To overcome this issue, the unwanted HSV pixels within the frame must first be filtered out, including pixel areas that are too large or too small. With this method, the likelihood of interruption by similar pixels is greatly reduced.
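The area-based filtering described above can be sketched as follows, assuming the blob areas have already been measured by the segmentation stage (the class name and threshold values are illustrative):

```java
// Illustrative sketch: keep only blobs whose pixel area lies within
// [minArea, maxArea], discarding noise specks and oversized regions,
// as the text above describes. Names and thresholds are hypothetical.
public class AreaFilter {
    static int countValid(int[] blobAreas, int minArea, int maxArea) {
        int kept = 0;
        for (int a : blobAreas)
            if (a >= minArea && a <= maxArea) kept++;
        return kept;
    }

    public static void main(String[] args) {
        int[] areas = {3, 450, 7, 800, 20000}; // two tiny specks, one huge region
        System.out.println(countValid(areas, 100, 10000)); // 2 valid blobs remain
    }
}
```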
Since the application must filter, process, and execute the mouse functions in real time, it can be CPU-intensive on most low-tier systems. If the captured frames are too large, the time taken to process an entire frame increases drastically. To overcome this issue, the application should process only the essential part of each frame and avoid redundant filtering steps that could slow it down.
A further challenge is the difficulty of calibrating the brightness and the contrast of the frames to obtain the required HSV values.
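One way to reduce the per-frame workload mentioned above is to subsample the frame before processing. A minimal Java sketch under that assumption (names are illustrative; real code would use the imaging library's own resize):

```java
// Illustrative sketch: keep every k-th pixel in each dimension, so the
// smaller frame is roughly k*k times cheaper to process.
public class FrameDownsample {
    static int[][] downsample(int[][] frame, int k) {
        int rows = (frame.length + k - 1) / k;
        int cols = (frame[0].length + k - 1) / k;
        int[][] out = new int[rows][cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                out[r][c] = frame[r * k][c * k];
        return out;
    }

    public static void main(String[] args) {
        int[][] frame = new int[480][640];     // one 640x480 grey frame
        int[][] small = downsample(frame, 2);
        System.out.println(small.length + "x" + small[0].length); // 240x320
    }
}
```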
CHAPTER
4
METHODS AND TECHNOLOGIES INVOLVED
The following describes the hardware needed to execute and develop the Virtual Mouse application:
A desktop computer or a laptop is used to run the software and display what the webcam has captured. A notebook, being a small, lightweight, and inexpensive laptop computer, is proposed to increase mobility.
The system will use:
Processor: Core 2 Duo
Main Memory: 4 GB RAM
Hard Disk: 320 GB
Display: 14" Monitor
Webcam
The webcam is used for image processing: it continuously captures images so that the program can process each frame and find the pixel positions.
Software Requirement
The following describes the software needed in order to develop the Virtual Mouse application:
MATLAB
C++ Language
The Virtual Mouse application is developed in C++ with the aid of an integrated development environment (IDE) used for developing computer programs, Microsoft Visual Studio. The C++ language provides more than 35 operators, covering basic arithmetic, bit manipulation, indirection, comparisons, logical operations, and others.
With MATLAB® Support Package for USB Webcams, you can connect to your
computer’s webcam and acquire images straight into MATLAB. Functionality is provided to
preview live images, adjust acquisition parameters, and take snapshots either individually or in
a loop. Connect to your webcam from the MATLAB desktop or through a web browser with
MATLAB Online.
You can acquire images from any USB video class (UVC) compliant webcam. This
includes webcams that are built into laptops and other devices as well as those that plug into
your computer via USB port.
Image
Figure 4.1: An image — an array or a matrix of pixels arranged in columns and rows
In an 8-bit grey scale image, each picture element has an assigned intensity that ranges from 0 to 255. A grey scale image is what people normally call a black-and-white image, but the name emphasizes that such an image also includes many shades of grey.
Figure 4.2: Each pixel has a value from 0 (black) to 255 (white).
A normal grey scale image has 8-bit colour depth, i.e. 256 grey scales. A "true colour" image has 24-bit colour depth: 8 + 8 + 8 bits give 256 x 256 x 256 colours, roughly 16 million.
Figure 4.3: A true-colour image assembled from three grey scale images coloured red, green and blue. Such an
image may contain up to 16 million different colours.
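The arithmetic above can be checked directly. In this illustrative Java sketch (hypothetical names), a 24-bit true-colour pixel is built by packing three 8-bit channels into one integer:

```java
// Illustrative sketch: a 24-bit "true colour" pixel packs three 8-bit
// channels (red, green, blue) into a single int.
public class TrueColour {
    static int pack(int r, int g, int b) {
        return (r << 16) | (g << 8) | b;
    }

    public static void main(String[] args) {
        // 8 bits per channel gives 256 levels; three channels combine to
        // 256 x 256 x 256 distinct colours.
        System.out.println(256 * 256 * 256);            // 16777216
        System.out.printf("%06X%n", pack(255, 255, 0)); // FFFF00
    }
}
```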
Some grey scale images have more grey scales, for instance 16-bit images with 65,536 grey scales. In principle, three such grey scale images can be combined to form an image with 65,536^3 = 281,474,976,710,656 colour combinations. There are two general groups of 'images': vector graphics (or line art) and bitmaps (pixel-based 'images'). Some of the most common file formats are:
GIF — an 8-bit (256 colour), non-destructively compressed bitmap format. Mostly used for
web. Has several sub-standards one of which is the animated GIF.
JPEG — a very efficient (i.e. much information per byte) destructively compressed 24 bit (16 million colours) bitmap format. Widely used, especially for the web and Internet (bandwidth-limited).
TIFF — the standard 24 bit publication bitmap format. Compresses non-destructively with, for instance, Lempel-Ziv-Welch (LZW) compression.
PS — Postscript, a standard vector format. Has numerous sub-standards and can be difficult to
transport across platforms and operating systems.
PSD — a dedicated Photoshop format that keeps all the information in an image, including all the layers.
For science communication, the two main colour spaces are RGB and CMYK.
1) RGB
The RGB colour model relates very closely to the way we perceive colour with the
r, g and b receptors in our retinas. RGB uses additive colour mixing and is the basic colour
model used in television or any other medium that projects colour with light. It is the basic
colour model used in computers and for web graphics, but it cannot be used for print
production.
The secondary colours of RGB – cyan, magenta, and yellow – are formed by
mixing two of the primary colours (red, green or blue) and excluding the third colour. Red and
green combine to make yellow, green and blue to make cyan, and blue and red form magenta.
The combination of red, green, and blue in full intensity makes white.
Figure 4.4: The additive model of RGB. Red, green, and blue are the primary stimuli for human colour
perception and are the primary additive colours. Courtesy of adobe.com.
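Additive mixing at full intensity can be demonstrated with packed 24-bit values: OR-ing two primaries yields exactly the secondary colours listed above. An illustrative Java sketch (hypothetical names):

```java
// Illustrative sketch of additive RGB mixing at full intensity, using
// packed 24-bit colour values.
public class AdditiveRGB {
    static final int RED = 0xFF0000, GREEN = 0x00FF00, BLUE = 0x0000FF;

    // At full intensity, additive mixing of two primaries is a bitwise OR.
    static int mix(int a, int b) { return a | b; }

    public static void main(String[] args) {
        System.out.printf("%06X%n", mix(RED, GREEN));             // FFFF00 (yellow)
        System.out.printf("%06X%n", mix(GREEN, BLUE));            // 00FFFF (cyan)
        System.out.printf("%06X%n", mix(BLUE, RED));              // FF00FF (magenta)
        System.out.printf("%06X%n", mix(mix(RED, GREEN), BLUE));  // FFFFFF (white)
    }
}
```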
2) CMYK
The 4-colour CMYK model used in printing lays down overlapping layers of varying percentages of transparent cyan (C), magenta (M), and yellow (Y) inks. In addition, a layer of black (K) ink can be added. The CMYK model uses subtractive colour mixing.
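One common formula for converting RGB to CMYK follows the subtractive model described above. This is a simple variant for illustration only (real print workflows use ICC colour profiles, and the class name is hypothetical):

```java
// Illustrative RGB -> CMYK conversion: K = 1 - max(R',G',B'), then the
// C, M, Y components are expressed relative to the remaining range 1-K.
public class RgbToCmyk {
    static double[] convert(int r, int g, int b) {
        double rp = r / 255.0, gp = g / 255.0, bp = b / 255.0;
        double k = 1 - Math.max(rp, Math.max(gp, bp));
        if (k == 1.0) return new double[]{0, 0, 0, 1}; // pure black
        double c = (1 - rp - k) / (1 - k);
        double m = (1 - gp - k) / (1 - k);
        double y = (1 - bp - k) / (1 - k);
        return new double[]{c, m, y, k};
    }

    public static void main(String[] args) {
        // Full red needs no cyan and no black, but full magenta and yellow.
        System.out.println(java.util.Arrays.toString(convert(255, 0, 0)));
    }
}
```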
A binary image needs only one bit for one pixel. The grey scale image (Figure 4.5, right) shows up to 256 levels of shading from black to white; each pixel of such an image can be described by one byte, or 8 bits. A colour image is created from a combination of three or four matrices; each of these matrices is a full grey scale image representing the level of a specific colour in the picture.
Special devices have been developed to transform images from real life into digital form, such as scanners and digital photo cameras. During the scanning process, the image is divided into an assigned number of rows and columns and transmitted dot by dot onto a digital carrier, forming the matrix. The process of dividing the image into rows and columns is referred to as sampling. The value of every pixel is calculated as the average brightness in the pixel, rounded to the nearest integer value. This process is usually referred to as amplitude quantization, or simply quantization.
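The quantization step described above, averaging the brightness within one sampling cell and rounding to the nearest integer, can be sketched as follows (illustrative Java, hypothetical names):

```java
// Illustrative sketch of amplitude quantization: the average brightness
// over one sampling cell, rounded to the nearest integer, becomes the
// stored pixel value.
public class Quantize {
    static int quantizeCell(double[] samples) {
        double sum = 0;
        for (double s : samples) sum += s;
        return (int) Math.round(sum / samples.length);
    }

    public static void main(String[] args) {
        // Four brightness samples in one cell average to 100.575 -> 101.
        System.out.println(quantizeCell(new double[]{100.2, 101.7, 99.5, 100.9})); // 101
    }
}
```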
4.5 Code

% Version : 5.5
% Controls: Use 1 (One) RED, 1 (One) GREEN and 3 (Three) BLUE caps for
% the different fingers.
% (Only an excerpt of the listing survives in this report; gaps are
% marked with % ... below.)

warning('off','vision:transition:usesOldCoordinates');

%% Initialization
if nargin < 1
    % Auto-detect the last installed camera adaptor and its best format.
    cam = imaqhwinfo;
    cameraName = char(cam.InstalledAdaptors(end));
    cameraInfo = imaqhwinfo(cameraName);
    cameraId = cameraInfo.DeviceInfo.DeviceID(end);
    cameraFormat = char(cameraInfo.DeviceInfo.SupportedFormats(end));
end

% ... (construction of the video device, blob analyser and shape
% inserter; only the trailing name-value arguments survive)
    'ReturnedColorSpace', 'RGB');
    'MaximumCount', 3);
    'Opacity', 0.4);
sureEvent = 5;
iPos = vidInfo.MaxWidth/2;

% ... (per-frame colour segmentation and centroid extraction)

% Cursor movement: scale the red-cap centroid to screen coordinates.
jRobot.mouseMove(1.5*centroidRed(:,1)*screenSize(3)/vidInfo.MaxWidth, ...
    1.5*centroidRed(:,2)*screenSize(4)/vidInfo.MaxHeight);
end

% Left click (16 = java.awt.event.InputEvent.BUTTON1_MASK).
lCount = lCount + 1;
jRobot.mousePress(16);
pause(0.1);
jRobot.mouseRelease(16);
end

% Right click (4 = java.awt.event.InputEvent.BUTTON3_MASK).
rCount = rCount + 1;
jRobot.mousePress(4);
pause(0.1);
jRobot.mouseRelease(4);
end

% Double click: two left clicks in quick succession.
dCount = dCount + 1;
jRobot.mousePress(16);
pause(0.1);
jRobot.mouseRelease(16);
pause(0.2);
jRobot.mousePress(16);
pause(0.1);
jRobot.mouseRelease(16);
end
end
else
end

% Scrolling, driven by the vertical position of the green cap.
jRobot.mouseWheel(-1);
jRobot.mouseWheel(1);
end
iPos = mean(centroidGreen(:,2));
end

nFrame = nFrame + 1;
end

%% Clearing Memory
release(vidDevice);
clc;
end
4.6 Results
a) Movement of cursor:
CHAPTER
5
CONCLUSION AND FUTURE SCOPE
FUTURE WORK
A more advanced implementation would extend the hand gesture recognition stage to use the Template Matching method to distinguish hand gestures. This method requires a machine learning classifier, which takes considerable time to train and develop. However, it would allow many more hand gestures, which in turn would enable more mouse functions such as zoom in and zoom out. Once the classifier is well trained, the accuracy of the Template Matching method is expected to be better than that of the method used in the proposed design. Another novel application of this technology would be to use the computer to train the visually or hearing impaired.
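As an indication of what template matching involves, a minimal sum-of-squared-differences matcher can be sketched in Java (illustrative names; a trained classifier as proposed above would replace this exhaustive search):

```java
// Illustrative sketch of template matching by sum of squared differences:
// slide the template over the image and return the offset (row, col)
// with the smallest total difference.
public class TemplateMatch {
    static int[] bestMatch(int[][] image, int[][] tmpl) {
        int bestR = 0, bestC = 0;
        long bestScore = Long.MAX_VALUE;
        for (int r = 0; r + tmpl.length <= image.length; r++) {
            for (int c = 0; c + tmpl[0].length <= image[0].length; c++) {
                long score = 0;
                for (int i = 0; i < tmpl.length; i++)
                    for (int j = 0; j < tmpl[0].length; j++) {
                        long d = image[r + i][c + j] - tmpl[i][j];
                        score += d * d;
                    }
                if (score < bestScore) { bestScore = score; bestR = r; bestC = c; }
            }
        }
        return new int[]{bestR, bestC};
    }

    public static void main(String[] args) {
        int[][] img = {
            {0, 0, 0, 0},
            {0, 9, 8, 0},
            {0, 7, 9, 0},
            {0, 0, 0, 0}
        };
        int[][] tmpl = {{9, 8}, {7, 9}};
        int[] pos = bestMatch(img, tmpl);
        System.out.println(pos[0] + "," + pos[1]); // 1,1
    }
}
```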