Final Year Report


CHAPTER 1

PREAMBLE
The aim is to address the challenges faced by quadriplegic patients in using
conventional computer interfaces. Today, users expect faster and more efficient
results from the tasks they perform, and as technology advances, those results
arrive ever more quickly. The computers we use can be made more efficient by
replacing the mouse with the eyes: the user's eye gaze acts as the mouse cursor
through eye detection and tracking. A webcam records video of the user, and the
frames are converted to grayscale format to help detect and track the eyes. After
tracking, operations such as single click, double click, right click, and left click
are performed. This saves time by completing tasks more rapidly than a
conventional mouse, and it enables people with physical disabilities who cannot
operate a standard mouse to use a computer. As computer-aided learning gains
traction, the significance of human-computer interaction is increasingly
acknowledged. The interaction between humans and computers has grown
considerably and is now indispensable in both professional and academic
domains. Consequently, a vision-driven approach is adopted to craft an efficient
human-computer interface. By leveraging a webcam, facial movements are
captured, recorded, and translated onto the computer screen to manipulate the
mouse cursor's position. The mouse's motion is dynamically adjusted according
to the anchor point's location. The camera captures real-time facial movements,
which are processed in the background by OpenCV. RGB images received from
the user are transformed into grayscale images. Distinct facial expressions, such
as winks and eye squeezes, are associated with cursor manipulation actions,
including clicking and scrolling, thereby enriching user engagement with
applications such as PDF viewers. This streamlines cursor manipulation through
facial gestures. "Controlling Mouse Pointer Using Eye and Face Gestures" is a
user-friendly project that simplifies human-computer interaction by utilizing eye
movement. This eye-tracking technique offers hands-free accessibility for
individuals with disabilities. With this application, users can navigate the
computer screen by moving the cursor and perform additional functions such as
clicking.

Essentially, it works by leveraging the natural action of blinking to improve the
accuracy and speed of identifying faces. By monitoring the blinking patterns of
individuals, the system can enhance the face detection process, making it more
efficient and effective. This method is particularly useful in scenarios where quick
and reliable face detection is crucial, such as in security systems, surveillance
cameras, or even in interactive applications like virtual reality. By incorporating
blink detection into the face recognition process, the system can better distinguish
faces from other objects or backgrounds, leading to more precise results in real
time. The concept behind this technique is to use the unique characteristics of
blinking, like the frequency and duration of blinks, to create a more robust face
detection algorithm. This can help overcome challenges such as variations in
lighting conditions, different facial expressions, or partial occlusions of the face,
ultimately improving the overall performance of the face detection system.
Overall, this method showcases an innovative way to enhance face detection
capabilities by integrating blink detection into the process. The tasks of face
detection and landmark localization are a key foundation for many facial analysis
applications, while great advancements have been achieved in recent years there
are still challenges to increase the precision of face detection.

1.1 PROBLEM STATEMENT


The goal is to create a system that accurately translates eye and face gestures into
cursor movements, providing an alternative input method for individuals with
limited mobility or for situations where hands-free interaction is desired.

1.2 SOLUTION
A hybrid cursor system based on facial movements for hands-free control of the
mouse is presented. The proposed system uses the Haar cascade algorithm for face
detection and makes use of facial landmark localization, eyeball tracking, blink
detection, and the mouth aspect ratio to successfully control the mouse.

1.3 OBJECTIVES OF PROJECT


The project objectives are as follows:
➢ To offer people with extreme disabilities an opportunity to control a computer
simply by moving their eyes or head.
➢ To design a low-cost combined eye and head tracking system for persons with
deficiency of their upper limbs.

1.4 ADVANTAGES
The project advantages are as follows:
➢ Independence: Quadriplegic patients often rely heavily on caregivers for
simple tasks. With this technology, they can regain some independence in
using computers, accessing communication tools, browsing the internet, and
engaging in various activities without constant assistance.
➢ Improved Communication: Many communication aids and devices require
mouse control. Enabling quadriplegic individuals to control the mouse pointer
with their eyes or facial gestures enhances their ability to communicate
effectively, whether through typing, selecting options, or navigating
interfaces.
➢ Access to Technology: The project opens up access to technology that was
previously inaccessible to quadriplegic individuals. They can now utilize
computers, tablets, and other devices for education, work, entertainment, and
social interaction like everyone else.
➢ Enhanced Quality of Life: Being able to control a computer independently
can significantly enhance the quality of life for quadriplegic patients. It allows
them to engage in recreational activities such as gaming, watching videos, or
browsing social media, providing entertainment and opportunities for
socialization.
➢ Increased Productivity: Many quadriplegic individuals want to remain
productive members of society. By enabling them to control computers
effectively, they can engage in work-related tasks, pursue education, or
contribute to various projects, boosting their productivity and sense of
accomplishment.
➢ Customization and Adaptability: Such systems can be tailored to the
specific needs and abilities of each individual. Customizable interfaces and
gestures can accommodate different levels of mobility and cognitive function,
ensuring that the technology is usable and effective for a wide range of users.

1.5 LITERATURE SURVEY
[1] Title of Paper: Integrated Deep Model for Face Detection and Landmark
Localization From "In The Wild" Images
Author and Year: AHMED BOURIDANE AND RICHARD JIANG, 2023
Description: The tasks of face detection and landmark localization are a key
foundation for many facial analysis applications; while great advancements
have been achieved in recent years, there are still challenges in increasing the
precision of face detection. Within this paper, we present our novel method,
the Integrated Deep Model (IDM), fusing two state-of-the-art deep learning
architectures, namely Faster R-CNN and a stacked hourglass, for improved
face detection precision and accurate landmark localization. Integration is
achieved through the application of a novel optimization function and is
shown in experimental evaluation to increase the accuracy of face detection,
specifically precision, by reducing false positive detections by an average of
62%. Our proposed IDM method is evaluated on the Annotated Faces In-The-
Wild, Annotated Facial Landmarks In-The-Wild, and the Face Detection
Dataset and Benchmark face detection test sets and shows a high level of
recall and precision when compared with previously proposed methods.
Landmark localization is evaluated on the Annotated Faces In-The-Wild and
300-W test sets; this specifically focuses on localization accuracy from
detected face bounding boxes when compared with baseline evaluations using
ground truth bounding boxes. Our findings highlight only a small 0.005%
maximum increase in error, which is more profound for the subset of facial
landmarks which border the face.
[2] Title of Paper: Simultaneous Face Detection and Pose Estimation Using
Convolutional Neural Network Cascade
Author and Year: HAO WU, KE ZHANG, AND GUOHUI TIAN
Description: Recent studies show that convolutional neural networks (CNNs)
have made a series of breakthroughs in the two tasks of face detection and pose
estimation, respectively. There are two CNN frameworks for solving these
two integrated tasks simultaneously. One is to use a face detection network to
detect faces first, and then use a pose estimation network to estimate each
face's pose; the other is to use a region proposal algorithm to generate many
candidate regions that may contain faces, and then use a single deep multi-task
CNN to process these regions for simultaneous face detection and pose
estimation. The former's problem is that pose estimation performance is
affected by the face detection network because the two networks are separate.
The latter generates many candidate regions, which brings a huge computation
cost to the CNN and cannot achieve real-time performance. To solve the above
problems, we propose a multi-task CNN cascade framework that integrates
these two tasks. We show that multi-task learning of face detection and head
pose estimation helps to extract more representative features. We exploit a
CNN feature fusion strategy to further improve head pose estimation
performance. We evaluate face detection on the FDDB benchmark and
evaluate pose estimation on the AFW benchmark. Our method achieves
results comparable with the state-of-the-art in these two tasks and can achieve
real-time performance.
[3] Title of Paper: Face Detection Method Based on Cascaded Convolutional
Networks
Author and Year: RONG QI AND RUI-SHENG JIA
Description: Deep learning achieves substantial improvements in face
detection. However, the existing methods need to input fixed-size images for
image processing, and most methods use a single network for feature
extraction, which makes the model generalization ability weak. In response
to the above problems, our framework leverages a cascaded architecture with
three stages of deep convolutional networks to improve detection
performance. The network can predict faces in a coarse-to-fine manner. We
replace the standard convolution with a combination of separable convolution
and residual structure in the network. Extensive experiments on the
challenging FDDB and WIDER FACE benchmarks demonstrate that our
method achieves accuracy competitive with the state-of-the-art techniques while
keeping real-time performance. Faces can be captured conveniently by digital
cameras, web cameras, smartphones, etc. This convenience is a double-edged
sword: it makes faces not only the most widely used but also the least trusted
biometric modality. With the fast development of face recognition, modern
face recognition algorithms place high demands on face detection. The most
classic method of face detection is the Viola-Jones (VJ) detector proposed by
Viola and Jones in 2001.

[4] Title of Paper: A Face Spoofing Detection Method Based on Domain
Adaptation and Lossless Size Adaptation

Author and Year: WENYUN SUN


Description: In this paper, a face spoofing detection method called the Fully
Convolutional Network with Domain Adaptation and Lossless Size Adaptation
(FCN-DA-LSA) is proposed. As its name suggests, the FCN-DA-LSA includes
a lossless size adaptation preprocessor followed by an FCN-based pixel-level
classifier embedded with a domain adaptation layer. The FCN local classifier
makes full use of the basic properties of face spoof distortion, namely that it is
ubiquitous and repetitive. The domain adaptation (DA) layer improves
generalization across different domains. The lossless size adaptation (LSA)
preserves the high-frequency spoof clues caused by the face recapture process.
The ablation study shows that both the DA and the LSA are necessary for
high-accuracy face spoofing detection. The FCN-LSA obtains competitive
performance among the state-of-the-art methods. With the help of small-sample
external data in the target domain (2/50, 2/50, and 1/20 subjects for
CASIA-FASD, Replay-Attack, and OULU-NPU, respectively), the FCN-DA-LSA
further improves the performance and outperforms the existing methods.

1.6 ORGANIZATION OF REPORT


Chapter 1 deals with the problem statement, solution, project objectives,
advantages, literature survey, and the organization of the project report.
Chapter 2 deals with the software requirements specification, covering the
functional overview and requirements, software requirements, hardware
requirements, and the project cycle.
Chapter 3 deals with the system design, covering the project architecture,
sequence diagram, data flow diagram, and flowchart.
Chapter 4 deals with the implementation; it contains the code used in the project.
Chapter 5 deals with the testing part of the project, covering the scope of
testing, unit testing, integration testing, functional testing, and system testing.
Chapter 6 deals with the results; it contains snapshots of the project.
Chapter 7 deals with the conclusion and future scope.

CHAPTER 2
SOFTWARE REQUIREMENTS SPECIFICATION
The functional overview and software requirements are discussed in this chapter.

2.1 FUNCTIONAL OVERVIEW


The project "Controlling Mouse Pointer Using Eye and Face Gesture for
Quadriplegic Patients" is a human-machine interaction (HMI) system designed to
provide a better way for communication between humans and computers. The
main focus of this project is to help people with disabilities, particularly
quadriplegic patients, to use computers hands-free. The system uses a webcam to
capture the user's face and eyes, and based on the movement and blink of the eyes,
it controls the movement and clicking functions of the mouse pointer.
The following are the key functional components of this project:
➢ Face and Eye Detection: The system uses facial landmark detection to
estimate the location of facial points on a person's face. After capturing live
images, the system maps the facial points and stores the coordinates of the
facial marks on the left and right eyes. Then, it draws the eyes on a black
mask, segments out the eyes, and finds the two largest contours from the
mask, which should be the eyeballs.
➢ Eyeball Movement-Based Cursor Movement: The system calculates the
ratio of eye landmark points to get the position of the eyeball and then
connects the movement of the eyeball to the cursor using a script that
automates interaction with other applications. This way, the cursor moves
according to the movement of the user's eyeballs.
➢ Blinking-Based Clicking: The system calculates the eye aspect ratio (EAR)
to detect the blinking of the eyes. When the EAR is very low, it implies that
the eye is closed, and the system performs a click event. The system can detect
left and right clicks based on the blinking of the right and left eyes,
respectively.
The system provides an average accuracy of 70-85% in controlling the mouse
pointer using eye and face gestures. The accuracy depends on various factors,
including the intensity of light and the quality of the camera. The system can be

further optimized for better accuracy by using a good camera and adjusting the
threshold value based on the lighting conditions.
The system has various applications, including helping quadriplegic
patients to use computers, reducing the burden of holding the mouse for its
operation, decreasing wrist pain, and ensuring reliable communication between
the computer and differently abled people. The system can also be extended to
the implementation of a soft keyboard, controlling appliances such as TV sets and
tube lights, detecting sleep and drowsiness of drivers, and playing video games
using eye movement and eye gazing.

2.2 USER CHARACTERISTICS


The user provides input through the camera so that the system can recognize the
face and its features. The system then detects the face using the DNN model,
after which the user is asked to perform certain input activities to calibrate the
eye aspect ratio and mouth aspect ratio. The user can then control the mouse
through left-eye and right-eye clicks to operate the cursor.

2.3 OUTPUT REQUIREMENTS


The system requirements, which include operational, performance, physical,
and support requirements, are the outcome of the needs analysis. During
performance analysis and the operation of the systems engineering programme,
these system requirements are matched to specific system features.

2.3.1 Functional Requirements


➢ The system should accurately track the movement of the user's eyes. It should
be able to detect the direction and speed of eye movements. It should
recognize blinks and distinguish them from intentional eye movements.
➢ The system should recognize facial gestures such as smiles, frowns, and
eyebrow movements. It should accurately interpret these gestures as
commands for controlling the mouse pointer.
➢ The system should be able to move the mouse pointer smoothly and
accurately according to the detected eye movements and facial gestures. It
should support functions such as left-click, right-click, and drag-and-drop.

➢ The system should have a user-friendly interface for configuring settings and
calibrating the eye and face tracking. It should provide visual feedback

to the user about the detected eye movements and facial gestures.
➢ The system should be compatible with different operating systems such as
Windows, macOS, and Linux. It should support various input devices,
including webcams and specialized eye-tracking hardware.

2.4 SOFTWARE REQUIREMENTS


➢ Operating System: Windows 10.
➢ Coding Language: Python.
➢ Software: MongoDB, Anaconda.
➢ IDE: PyCharm.

2.5 HARDWARE REQUIREMENTS


➢ System: Intel i5 processor.
➢ Hard disk: 50 GB.
➢ RAM: 4 GB (minimum) to 16 GB.
➢ Input Devices: USB interface, power adapter.

2.6 PROJECT CYCLE

Fig 2.1: Project Cycle

Here is the detailed explanation of the project cycle:

➢ Requirements Collection: In this stage, the project requirements are


gathered. For the eye and face gesture-controlled mouse pointer project, the
requirements could include:
• The system should accurately track and interpret eye and face gestures.
• The system should control the mouse pointer smoothly and responsively.
• The system should be easy to use and set up for quadriplegic patients.
• The system should ensure user privacy and data security.
➢ System Design: In this phase, the software architecture and design are
created based on the requirements. For the eye and face gesture-controlled
mouse pointer project, the design could include:
• The selection of appropriate eye and face tracking hardware and software.
• The development of algorithms to interpret gestures and control the mouse
pointer.
• The creation of a user-friendly interface for patients to set up and use the
system.
• The implementation of security measures to protect user data.
➢ Implementation: This step involves writing and integrating the code to
create the software based on the design. For the eye and face gesture-
controlled mouse pointer project, the implementation could include:
• Developing and testing the gesture recognition algorithms.
• Integrating the algorithms with the mouse pointer control system.
• Creating the user interface for the software.
• Implementing security features to protect user data.
➢ Testing: In this stage, the software is thoroughly tested to ensure it meets
the requirements and functions as intended. For the eye and face gesture-
controlled mouse pointer project, the testing could include:
• Validating the accuracy and responsiveness of the gesture recognition.
• Checking the smoothness and reliability of the mouse pointer control.
• Verifying the user interface's ease of use and accessibility.
• Assessing the security features to ensure user data protection.

➢ Deployment: This is the final step, where the software is released for use.
For the eye and face gesture-controlled mouse pointer project, the
deployment could include:
• Providing clear instructions for patients to install and set up the software.
• Offering technical support for any issues or questions.
• Regularly updating the software to fix bugs, improve performance, and
add new features.
➢ Prediction & Maintenance: Post-deployment, the system should be
monitored for any issues, and maintenance should be performed as needed.
Additionally, predictions can be made based on user data to improve the
system further.

CHAPTER 3
SYSTEM DESIGN
System design deals with defining elements of a system like modules involved in
the system, architecture, components and their interfaces based on the specified
requirements. This chapter discusses the methodology of the adopted system.

3.1 PROJECT ARCHITECTURE AND DESCRIPTION


The proposed system operates in the following manner: A comprehensive
procedure is outlined to navigate the mouse across the desktop based on users'
facial gestures. Initially, real-time input is obtained from the user, which is then
utilized to control the cursor's movement. Consequently, various operations can
be executed using this user input. The system architecture of the application is
structured as follows: Fig 3.1 illustrates the algorithm employed for face detection
and cursor movement. Initially, input data is acquired and preprocessed to ensure
compatibility with the application.

Fig 3.1: Architecture of mouse pointer controlling system

If the facial landmark shifts, the cursor adjusts accordingly until it reaches the
intended destination. Subsequently, the system verifies click functions; if
activated, corresponding actions are executed; otherwise, the system remains idle.
The system operates in a continuous loop, monitoring for any displacement or
gestures from the user's eyes to execute the desired actions.

➢ Image Acquisition: This block refers to capturing the image using a device
like a camera or scanner. The camera lens gathers light and focuses it on a
light-sensitive sensor that converts it into an electrical signal.
➢ Preprocessing: Once captured, the raw image data might undergo
preprocessing to prepare it for further processing. This may involve
correcting for lens distortion or converting the colour format.
➢ Image Enhancement: This block deals with improving the visual quality of
the image. Techniques like adjusting brightness, contrast, or sharpening can
be applied in this stage.
➢ Image Restoration: Sometimes, images might be degraded due to factors
like noise or blur. Restoration techniques aim to rectify these issues and
improve the image's fidelity.
➢ Image Analysis: This stage involves extracting meaningful information
from the image. It might involve tasks like segmenting the image into objects
or identifying specific features within the image.
➢ Image Recognition: This block applies computer vision techniques to
recognize objects, patterns, or even faces within the image. Facial recognition
is a common application of image recognition.
➢ Postprocessing: After analysis or recognition, the processed image data
might undergo postprocessing. This may involve formatting the data for
storage or display.
Facial landmark detection is a crucial step for blink detection, gaze detection, and
even face recognition. Facial landmarks are specific points on a face that
correspond to facial features like eyes, nose, mouth, and jawline. By identifying
these landmarks, we can gain insights into facial expressions and orientations.
Blink Detection: Eye aspect ratio (EAR) is a common method used for blink
detection. Facial landmarks around the eyes are used to calculate the EAR, which
represents the ratio of the eye's width to its height. Blinking causes the EAR to
decrease significantly, allowing us to detect blinks. Face Detection: Techniques

like Viola-Jones face detection or deep learning algorithms can be used to
identify faces within an image.
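As an illustration of the blink detection step described above, the sketch below computes the EAR for one eye from webcam frames and reports a blink when the ratio drops below a threshold. It is a minimal sketch only: the predictor file name "shape_predictor_68_face_landmarks.dat" and the EAR threshold of 0.2 are assumptions for illustration, not values fixed by this report.

import cv2
import dlib
from scipy.spatial import distance as dist

def eye_aspect_ratio(eye):
    # eye is a list of six (x, y) landmark points around one eye
    A = dist.euclidean(eye[1], eye[5])
    B = dist.euclidean(eye[2], eye[4])
    C = dist.euclidean(eye[0], eye[3])
    return (A + B) / (2.0 * C)

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")  # assumed file
EAR_THRESHOLD = 0.2  # assumed value; a blink drops the EAR below this

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        # Points 36-41 of the 68-point model outline one eye
        eye_points = [(shape.part(i).x, shape.part(i).y) for i in range(36, 42)]
        if eye_aspect_ratio(eye_points) < EAR_THRESHOLD:
            print("Blink detected")
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()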
3.2 SEQUENCE DIAGRAM
A sequence diagram illustrates the order of interactions between different
components or actors in a system. For the project "Cursor Control Using Eye and
Face Gestures," a simplified sequence diagram is shown below. This diagram outlines
the flow of actions in the system, starting from the user's interaction with the eye and
face tracking system, through the processing of gestures to control the cursor, and
finally, the feedback provided to the user through the interface.

Fig 3.2: Sequence diagram

Here’s the detailed explanation of the diagram Fig 3.2:

➢ Start Capturing: This step refers to capturing an image of the user's eye.
➢ Face Detection: The system detects the presence of a face in the image.
➢ Eye Detection: The system narrows down its focus to the user’s eye within
the image.
➢ Finding Aspect Ratio: The system calculates the aspect ratio of the eye,
which is the ratio of the width of the eye to its height.

➢ Eye Tracking: The system tracks the movement of the user’s eye.
➢ Eye Blink Detection: The system detects when the user blinks.
➢ Cursor Movement: The system translates the movement of the user’s eye
into cursor movement on the screen.

3.3 FLOW CHART


The flow chart shows the flow of data in our project and its classification.

Fig 3.3: Flow Chart

Fig 3.3 is the block diagram of the proposed system for controlling a mouse
pointer using eye and face gestures for quadriplegic patients. Here's a breakdown
of each block:
➢ Camera Input: This block represents the input from a camera that captures
the user's face and eyes.
➢ Detecting Face using DNN: This block uses a Deep Neural Network (DNN)
to detect the user's face within the camera input.
➢ Black Mask on Image: This block applies a black mask to the detected face,
possibly to isolate the eyes or other facial features.
➢ Finding Eyeballs using Threshold: This block uses a thresholding technique
to locate the eyeballs within the masked image.
➢ Segment our Eyes: This block further segments the eyeballs from the rest of
the face, possibly to track the movement of the eyes.

➢ Connecting Cursor to the Movement of Eyeball: This block connects the
movement of the eyeballs to the movement of the mouse cursor, allowing the
user to control the cursor with their eye movements.
The overall system appears to be designed to enable quadriplegic patients to
control a mouse pointer using their eye and face gestures, providing them with
a means of interacting with a computer. The use of DNN and thresholding
techniques suggests that the system employs machine learning and image
processing techniques to accurately detect and track the user's facial features.
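A minimal sketch of the thresholding and contour step described above is given below; the threshold value of 70 and the use of the two largest contours as eyeball candidates are illustrative assumptions, not fixed parameters of the system.

import cv2
import pyautogui

def find_eyeball_centers(eye_region_gray, threshold=70):  # threshold is an assumed value
    # Dark pixels (pupil/iris) fall below the threshold after inversion
    _, mask = cv2.threshold(eye_region_gray, threshold, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Keep the two largest contours, which should correspond to the eyeballs
    contours = sorted(contours, key=cv2.contourArea, reverse=True)[:2]
    centers = []
    for c in contours:
        M = cv2.moments(c)
        if M["m00"] > 0:
            centers.append((int(M["m10"] / M["m00"]), int(M["m01"] / M["m00"])))
    return centers

# Example: move the cursor a small step; dx and dy would be derived from the
# eyeball position relative to the centre of the eye region.
dx, dy = 10, 0
pyautogui.moveRel(dx, dy, duration=0.05)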

3.4 DATA FLOW DIAGRAM


The data flow diagram shows the flow of data in our project and its operation.

Fig 3.4: Data Flow Diagram

Fig 3.4 represents the data processing steps involved in the project. The
flowchart depicts the facial recognition pipeline: capturing a face image,
detecting the face and possibly the eyes, finding the aspect ratio, and then
performing optional steps such as eye tracking, blink detection, cursor movement
tracking, and click event detection. Facial recognition systems leverage
sophisticated algorithms to identify individuals based on their facial
characteristics. The process typically involves capturing an image, followed by
face detection to isolate the facial region. Eye detection may refine the process
further. Then, the system extracts facial features and compares them against a
database of known faces to identify a match.
➢ Start Capturing: The first step involves capturing an image of a person's face.
This image can be captured from a variety of sources, such as a security camera,
a smartphone, or a social media profile.
➢ Face Detection: Once an image is captured, the system needs to identify the
presence of a face within the image. This is typically done by using facial
detection algorithms that can identify facial features such as eyes, nose, and
mouth.
➢ Eye Detection: After a face is detected, the system may then try to identify the
specific location of the eyes within the face. This information can be useful for
later steps in the process, such as eye tracking or blink detection.
➢ Finding Aspect Ratio: The aspect ratio refers to the ratio of the width of an
image to its height. In facial recognition, finding the aspect ratio of a face can
help to normalize the image and ensure that different faces are compared in a
consistent manner.
➢ Eye Tracking (Possible): This step involves tracking the movement of a
person’s eyes. Eye tracking can be used for a variety of purposes, such as
understanding where a person is looking or determining if they are paying
attention.
➢ Eye Blink Detection (Possible): Eye blink detection can be used to determine
if a person is blinking their eyes. This information can be useful for a variety of
purposes, such as liveness detection (verifying that a person is present and not a
photograph) or fatigue detection.
➢ Cursor Movement (Possible): This step likely refers to tracking the movement
of a cursor on a computer screen. This information could be used to determine
where a person is looking on a screen or to interact with a computer program
using eye movements.
➢ Click Event (Possible): This step refers to a user clicking on a computer mouse
or touchscreen. This information could be used to track a person's interactions
with a computer program or website.

It’s important to note that not all facial recognition systems will include all of
these steps. The specific steps involved will vary depending on the application.

3.5 UTILITIES
Here is a list of utilities used in the project "Controlling Mouse Pointer using Eye
and Face Gesture for Quadriplegic Patients":
➢ Webcam: Used to capture images of the user's face and eyes.
➢ Computer: The system on which the mouse pointer control software is
installed.
➢ Mouse: The device being controlled using eye and face gestures.
➢ OpenCV: A computer vision library used for image processing, feature
detection, and object recognition.
➢ Python: The programming language used to develop the software
application.
➢ Dlib: A modern C++ toolkit containing machine learning algorithms and
tools used for face detection and facial landmark detection.
➢ NumPy: A library used for efficient numerical computation in Python.
➢ SciPy: A scientific computing library used for signal processing and
image analysis.
➢ PyAutoGUI: A cross-platform GUI automation library used to control
the mouse pointer.
➢ Face Detection Algorithm: Used to detect the user's face in the webcam
feed.
➢ Eye Detection Algorithm: Used to detect the user's eyes and track their
movements.
➢ Facial Landmark Detection Algorithm: Used to detect facial landmarks
such as the eyebrows, nose, and mouth.
➢ Gesture Recognition Algorithm: Used to recognize and interpret the
user's eye and face gestures.

CHAPTER 4
IMPLEMENTATION
The system is implemented in the Python language to control the action of the
mouse cursor. This chapter covers the software tools used and the code that is
implemented.
Cursor control using eye and face gestures is a computer interaction
technique that utilizes a webcam to track facial movements and translate them into
cursor movements on the screen. This technology offers an alternative to
traditional input methods like mice and trackpads, particularly for users with
limited hand mobility.
Face Detection and Landmarking: The system first employs computer vision
techniques to detect the user's face in the webcam feed. Facial landmarking
algorithms then pinpoint specific facial features like the eyes, nose, and mouth.
Eye Movement Tracking: Eye movements are tracked by monitoring the distance
between key facial landmarks around the eyes. Blinking or squinting can be
detected to trigger clicks or activate scrolling mode.
Facial Feature Analysis: The position and orientation of facial features,
particularly the nose in this case, are analyzed to determine the cursor movement
direction. Head movements can also be mapped to cursor movements.
Calibration and Control: The system often requires a calibration phase to
establish a baseline for facial feature positions and movements in relation to
desired cursor actions. Users can then control the cursor and interact with the
computer interface using their eye blinks, facial expressions, and head
movements.
Facial Detection and Feature Tracking: The system leverages computer vision
to identify the user's face within the webcam footage. Sophisticated algorithms
then pinpoint specific facial features like the eyes, nose, and mouth by
recognizing their relative positions. These features are tracked over time to detect
movements and expressions.
Eye Movement Interpretation: Eye movements are monitored by analyzing the
distance between key facial landmarks around the eyes. Blinking or squinting can
be used to trigger specific actions, such as left or right clicks, or to activate
scrolling mode. The system translates the position and orientation of facial
features, particularly the nose, into cursor movements on the screen. Head
movements can also be mapped to cursor movements, allowing users to control
the cursor by tilting their head in the desired direction.
Calibration and User Control: An initial calibration phase is often necessary to
establish a baseline for facial feature positions and movements relative to desired
cursor actions. Once calibrated, users can control the cursor and interact with the
computer interface through their eye blinks, facial expressions, and head
movements.

4.1 SOFTWARE TOOLS USED


Software is a set of instructions, data or programs used to operate computers and
execute specific tasks. Software is a generic term used to refer to applications,
scripts and programs that run on a device. It can be thought of as the variable part
of a computer, while hardware is the invariable part. The two main categories of
software are application software and system software. An application is software
that fulfils a specific need or performs tasks. System software is designed to run
a computer's hardware and provides a platform for applications to run on top of.
PyCharm is an integrated development environment (IDE) used for programming
in Python. It provides code analysis, a graphical debugger, an integrated unit
tester, integration with version control systems, and supports web development
with Django. PyCharm is developed by the Czech company JetBrains. It is cross-
platform, working on Microsoft Windows, macOS, and Linux. PyCharm has a
Professional Edition, released under a proprietary license and a Community
Edition released under the Apache License. PyCharm Community Edition is less
extensive than the Professional Edition.
PyCharm was released to the market of the Python-focused IDEs to
compete with PyDev (for Eclipse) or the more broadly focused Komodo
IDE by ActiveState. The beta version of the product was released in July 2010,
with the 1.0 arriving 3 months later. Version 2.0 was released on 13 December
2011, version 3.0 was released on 24 September 2013, and version 4.0 was
released on November 19, 2014. PyCharm became Open Source on 22 October
2013. The Open-Source variant is released under the name Community

Edition – while the commercial variant, Professional Edition, contains closed-
source modules.

Fig 4.1: PyCharm IDE

4.2 PROPOSED ALGORITHMS


This section consists of all the algorithms used in this project and their respective
code snippet.
4.2.1 HAAR CASCADE CLASSIFIER

Fig 4.2: Haar-like features

Haar-like features are the basic features used in Haar cascade classifiers for object
detection. These features are simple rectangles that are used to represent patterns

of light and dark regions in an image. In a Haar-like feature, the algorithm
calculates the difference between the sum of pixel values in the white rectangle
area and the sum of pixel values in the black rectangle area. By moving and
resizing these rectangles across the image, the classifier can detect patterns like
edges, corners, and other important features that help in identifying objects. The
Haar cascade classifier uses these Haar-like features to create a strong classifier
that can distinguish between objects of interest and background in an image. It
does this by combining multiple weak classifiers, which are based on these Haar-
like features, into a strong classifier that can accurately detect objects at different
scales and orientations. Overall, Haar-like features play a crucial role in the
effectiveness of Haar cascade classifiers for tasks like face detection, object
recognition, and more in computer vision applications.
Haar-like features identify patterns in images. Integral image speeds up feature
calculation.

s(x, y) = s(x, y−1) + i(x, y) ----------(eq-1)

ii(x, y) = ii(x−1, y) + s(x, y) ----------(eq-2)

These two equations describe the construction of the integral image used to speed
up the computation of Haar-like features. In eq-1, s(x, y) is the cumulative column
sum at pixel (x, y): it equals the cumulative sum of the same column up to the
previous row, s(x, y−1), plus the pixel value i(x, y), with s(x, −1) = 0. In eq-2, the
integral image value ii(x, y) equals the integral image value of the previous
column, ii(x−1, y), plus the column sum s(x, y), with ii(−1, y) = 0. With the
integral image in place, the sum of pixels inside any rectangle can be obtained
from just four array look-ups, which is what makes Haar-like features cheap to
evaluate.
Features are calculated by comparing pixel intensities. Detection is based on
Haar features that capture properties common to human faces: the eye region is
darker than the upper cheeks, and the nose bridge region is brighter than the eyes.
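As a concrete illustration of these recurrences (a minimal sketch, not part of the project code), the following NumPy routine builds the integral image and uses it to evaluate one rectangular sum, which is the basic operation behind every Haar-like feature.

import numpy as np

def integral_image(img):
    # s(x, y): cumulative sum down each column; ii(x, y): cumulative sum across columns
    s = np.cumsum(img, axis=0)
    ii = np.cumsum(s, axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    # Sum of pixels in the rectangle [top..bottom, left..right] using four look-ups
    total = ii[bottom, right]
    if top > 0:
        total -= ii[top - 1, right]
    if left > 0:
        total -= ii[bottom, left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1, left - 1]
    return total

# Example: a two-rectangle Haar-like feature is the difference between the
# sums of a light region and an adjacent dark region.
img = np.arange(36, dtype=np.float64).reshape(6, 6)
ii = integral_image(img)
feature = rect_sum(ii, 0, 0, 5, 2) - rect_sum(ii, 0, 3, 5, 5)
print(feature)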

Fig 4.3: Stages in Haar Cascade Classifier

In the Haar cascade classifier algorithm for face detection, the process begins by
extracting Haar-like features from the image. These features are calculated by
moving rectangular windows of different sizes across the image and computing
the difference between the sum of pixel values in the white and black rectangles
within each window. Once the Haar-like features are extracted, the algorithm uses
a cascade of classifiers to identify faces in the image. The cascade consists of
multiple stages, each containing several weak classifiers. These weak classifiers
are simple decision rules based on individual Haar-like features.

CODE SNIPPET
import cv2

# Load the pre-trained Haar cascade classifier for face detection
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades +
                                     'haarcascade_frontalface_default.xml')
# Load an image
image = cv2.imread('image.jpg')
# Convert the image to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Perform face detection
faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5,
                                      minSize=(30, 30))
# Draw rectangles around the detected faces
for (x, y, w, h) in faces:
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
# Display the output image
cv2.imshow('Face Detection', image)
cv2.waitKey(0)
cv2.destroyAllWindows()

During the face detection process, the algorithm evaluates each window in the
image using the cascade of classifiers. At each stage, the window is checked
against a specific weak classifier. If the window passes the weak classifier's
threshold, it moves on to the next stage. If the window fails at any stage, it is

rejected as not containing a face and the algorithm moves on to the next window.
By combining multiple weak classifiers in a cascade, the algorithm can efficiently
discard non-face regions early in the process, focusing computational efforts on
potential face regions. This hierarchical approach allows for fast and accurate face
detection in images, making the Haar cascade classifier a popular choice for real-
time applications like face recognition in cameras and video processing systems.
The Haar cascade classifier works by utilizing a machine learning technique called
AdaBoost to train a cascade of classifiers. Each stage of the cascade focuses on
a specific set of features to detect objects. These features are simple rectangular
patterns that are compared to the image at different locations and scales. The
classifier uses these features to determine if a region of the image contains the
object of interest. If a region passes all stages of the cascade, it is considered a
positive detection. This method is efficient for real-time object detection tasks
like face detection. The classifier relies on a set of pre-trained XML files that
contain information about the features used to detect objects such as faces. These
files define patterns of dark and light regions that are used to identify objects in
an image. The classifier slides these patterns over the image at different scales
and detects objects based on how well the patterns match the image content. This
method is efficient for tasks like face detection due to its speed and accuracy in
identifying specific objects in images.

4.2.2 AdaBOOST ALGORITHM


AdaBoost selects the best features for a strong classifier. A cascade of classifiers
efficiently detects faces with fewer false positives.
Boosting is an ensemble modeling technique that attempts to build a strong
classifier from a number of weak classifiers. It does this by building a model from
weak models in series. First, a model is built from the training data. Then a second
model is built that tries to correct the errors present in the first model. This
procedure continues, and models are added until either the complete training data
set is predicted correctly or the maximum number of models is added.

Fig 4.4: AdaBoost features
➢ Initialize Weights: All training data points are assigned equal weights,
signifying that each point contributes equally to the learning process.
➢ Calculate Error Rate: For each weak classifier (a simple machine learning
model) considered, the error rate is calculated. The error rate represents the
proportion of data points that the classifier misclassified.
➢ Select Best Classifier: The weak classifier with the lowest error rate is
chosen as the base learner for the current iteration.
➢ Compute Voting Power: A weight, referred to as alpha, is computed for the
chosen classifier. Alpha determines the classifier's influence on the final
prediction. Classifiers with lower error rates receive higher weights,
indicating greater influence.
➢ Update Weights: The weights of the training data points are adjusted
based on the performance of the chosen classifier. Points that were
misclassified by the classifier have their weights increased, while correctly
classified points have their weights decreased. This emphasizes the
importance of points that the current classifier has difficulty handling,
ensuring the next classifier focuses on those points.
➢ Append Classifier and Check Stopping Criterion: The chosen classifier
is incorporated into the ensemble model, and a check is performed to see if
a stopping criterion is met. This criterion might be a maximum number of
iterations or a sufficiently low overall error rate.
➢ Repeat: If the stopping criterion isn't met, steps 2 through 6 are repeated.

The algorithm iteratively refines the ensemble model by incorporating new
weak learners and adjusting data point weights.

The AdaBoost algorithm effectively leverages multiple weak learners to create


a robust ensemble classifier, achieving higher accuracy than any individual
weak learner could on its own.
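For illustration only (using scikit-learn, which is not otherwise used in this project), the sketch below trains an AdaBoost ensemble of decision stumps on toy data, mirroring the weak-learner boosting process described above; the dataset and parameters are assumptions chosen purely for the example.

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Toy two-class data standing in for "face" vs. "non-face" feature vectors
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# The default weak learner is a depth-1 decision tree (a decision stump);
# each boosting round reweights the samples the previous stump got wrong.
model = AdaBoostClassifier(n_estimators=50, random_state=0)
model.fit(X, y)
print("Training accuracy:", model.score(X, y))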

4.2.3 EYE ASPECT RATIO


When the aspect ratio of the left eye becomes less than the aspect ratio of the right
eye, the left click is implemented. Similarly, if the aspect ratio of the right eye is
smaller than the aspect ratio of the left eye, then the right click is implemented.
Click state is the variable used to change the mode of clicks; the values stored in
click state indicate different types of operations to be performed (i.e. single click,
double click, drag). Mouse count is a counter used in drag mode.
CODE SNIPPET
# Extract the landmark points for the eyes and mouth, then draw their outlines
leftEye = shape[lstart:lend]
rightEye = shape[rstart:rend]
mouthroi = shape[mstart:mend]
leftEyeHull = cv2.convexHull(leftEye)
rightEyeHull = cv2.convexHull(rightEye)
mouthHull = cv2.convexHull(mouthroi)
cv2.drawContours(image, [mouthHull], -1, (0, 255, 0), 1)
cv2.drawContours(image, [leftEyeHull], -1, (0, 255, 0), 1)
cv2.drawContours(image, [rightEyeHull], -1, (0, 255, 0), 1)

Fig 4.5: 68-point feature map

In face detection using a 68-point feature map, the algorithm works by identifying
key facial landmarks on a detected face. These landmarks are predefined points
on the face such as the corners of the eyes, nose, mouth, and jawline. The 68-
point feature map is a set of 68 specific points that are used to accurately represent
the facial structure. These points are detected by the algorithm after the face has
been identified using techniques like the Haar cascade classifier or deep learning
models. Once the algorithm has located the face, it maps the 68 points onto the
face to create a detailed representation of the facial features. This feature map
provides valuable information about the face's geometry, which can be used for
tasks like facial recognition, emotion detection, and facial alignment in
applications like virtual makeup or augmented reality filters. For the eyes, the
relevant points are the eye corners and the eyelids; by comparing the distances
between these points, the eye aspect ratio can help in detecting drowsiness and
fatigue, or can tell a facial recognition system whether a person is blinking or has
their eyes open.

Fig 4.6: Eye feature annotation points

The formula for calculating the eye aspect ratio (EAR) is:

EAR = (||P2 − P6|| + ||P3 − P5||) / (2 · ||P1 − P4||)

Where:
• P1, P2, P3, P4, P5, and P6 are specific points on the eye, typically representing
the corners and eyelids (P1 and P4 are the eye corners; P2, P3, P5, and P6 lie on
the eyelids).
• ||Pn − Pm|| represents the Euclidean distance between two points Pn and Pm.

CODE SNIPPET
# The function is used for calculation of the EAR for an eye
def EAR(point1, point2, point3, point4, point5, point6):
    ear = (dst(point2, point6) + dst(point3, point5)) / (2.0 * dst(point1, point4))
    return ear

# The function is used for calculating the Euclidean distance between two points.
# This is primarily used for calculating the EAR.
def dst(point1, point2):
    distance = np.sqrt((point1[0] - point2[0])**2 + (point1[1] - point2[1])**2)
    return distance
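As a usage note, the left and right EAR values can be obtained from the detected shape points and compared to decide which click to perform; the landmark indices below are an assumption based on the standard dlib 68-point layout, where points 42-47 and 36-41 outline the two eyes.

# Hypothetical usage: compute the EAR for each eye from the 68-point shape array
left_ear = EAR(shape[42], shape[43], shape[44], shape[45], shape[46], shape[47])
right_ear = EAR(shape[36], shape[37], shape[38], shape[39], shape[40], shape[41])
if left_ear < right_ear:
    print("Left-eye wink detected: perform left click")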

4.3 CURSOR FUNCTIONS
a. Left-click function:
   if left_EAR < right_EAR:
       if clickstate == 0: single_left_click
       else if clickstate == 1: double_left_click
       else if clickstate == 2: drag_using_left_click
   In clickstate == 2 there are two different subparts, click and release.
CODE SNIPPET
# Left click will be initiated if the EARdiff is less than the leftclick
# threshold calculated during calibration
if EARdiff < leftclick and larea < leftclickarea:
    pag.click(button='left')
    cv2.putText(blackimage, "Left Click", (0, 300), font, 1, (255, 255, 255), 2, cv2.LINE_AA)
    lclick = np.array([])
b. Click and Release:
   if mousecount == 0: press the left button
   else if mousecount == 1: release the left button
c. Right-click function:
   if left_EAR > right_EAR: perform a single right click
   else: return value 0
d. Scroll Mode
If the aspect ratio of both the left and right eye together is less than the
threshold value, scroll mode is enabled. As the cursor reaches the upper or
lower threshold value, the cursor moves up or down respectively. In scroll
mode we take the mean of the left eye aspect ratio and the right eye aspect
ratio, and this mean is stored in ear.
   if ear <= eye_AR_threshold: increment eye_counter by 1
   if eye_counter > eye_AR_frames: Scroll_Mode_Enabled
CODE SNIPPET
# Sets the condition for scrolling mode
if scroll_status == 0:
    if ((h - 250)**2 + (k - 250)**2 - 50**2) > 0:
        a = angle(shape[33])  # Calculates the angle of the nose point
        # The conditions below make the mouse move in the desired direction
        time.sleep(0.03)
        if h > 250:
            pag.moveTo(pag.position()[0] + (10 * np.cos(1.0 * a)),
                       pag.position()[1] + (10 * np.sin(1.0 * a)), duration=0.01)
        else:
            pag.moveTo(pag.position()[0] - (10 * np.cos(1.0 * a)),
                       pag.position()[1] - (10 * np.sin(1.0 * a)), duration=0.01)
        cv2.putText(blackimage, "Moving", (0, 250), font, 1, (255, 255, 255), 2, cv2.LINE_AA)
else:
    # Enabling scroll status
    cv2.putText(blackimage, 'Scroll mode ON', (0, 100), font, 1, (255, 255, 255), 2, cv2.LINE_AA)
    if k > 300:
        cv2.putText(blackimage, "Scrolling Down", (0, 300), font, 1, (255, 255, 255), 2, cv2.LINE_AA)
        pag.scroll(-1)
    elif k < 200:
        cv2.putText(blackimage, "Scrolling Up", (0, 300), font, 1, (255, 255, 255), 2, cv2.LINE_AA)
        pag.scroll(1)

e. Double Click Function


If the click state value is 1, a double click is performed.

   else if clickstate == 1: double_left_click is enabled
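The fragments above can be summarised in one hedged sketch, shown below, of how the click state is dispatched to PyAutoGUI actions; the variable names clickstate, left_EAR, right_EAR, and mousecount follow the conventions of this section, while the overall structure is an illustrative assumption rather than the exact project code.

# Illustrative dispatch of wink gestures to mouse actions via PyAutoGUI
import pyautogui as pag

def handle_clicks(left_EAR, right_EAR, clickstate, mousecount):
    if left_EAR < right_EAR:              # left-eye wink
        if clickstate == 0:
            pag.click(button='left')           # single left click
        elif clickstate == 1:
            pag.doubleClick(button='left')     # double left click
        elif clickstate == 2:
            # drag mode: press on the first wink, release on the second
            if mousecount == 0:
                pag.mouseDown(button='left')
            else:
                pag.mouseUp(button='left')
            mousecount = (mousecount + 1) % 2
    elif left_EAR > right_EAR:            # right-eye wink
        pag.click(button='right')
    return mousecount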

4.4 MOUTH ASPECT RATIO


The mouth aspect ratio (MAR) is a measurement used in facial landmark
detection to assess the shape and movement of the mouth. It is calculated by
comparing the distance between two points on the lips to the distance between
two other points on the lips. This ratio helps in determining the openness or
closure of the mouth. These landmarks are key spots, such as the corners of the
mouth and the top and bottom of the lips. By looking at the distances between
these points, we can calculate the aspect ratio. This ratio gives us a way to
understand the shape and movement of the mouth, which is very useful for tasks
like recognizing facial expressions or analyzing speech patterns. These landmarks
act as small markers that help us measure and interpret what is going on with the
mouth.

Fig 4.7: MAR formula

To compute the mouth aspect ratio, specific points on the lips are identified.
Typically, these points are the corners of the mouth and the top and bottom of the
lips. By measuring the distances between these points, the aspect ratio is
calculated. A higher aspect ratio indicates a more open mouth.

Fig 4.8: Landmarks on the mouth


In the context of the mouth aspect ratio, landmarks refer to specific points on the
lips that are used to calculate the ratio. These landmarks are crucial for accurately
measuring the shape and movement of the mouth. Common landmarks for
computing the mouth aspect ratio include the corners of the mouth, the top lip,
and the bottom lip. By identifying and measuring the distances between these key
points, the mouth aspect ratio can be calculated to assess the position and
openness of the mouth. These landmarks play a vital role in facial landmark
detection and analysis, especially in tasks like facial expression recognition and
speech processing. When it comes to the mouth aspect ratio, landmarks are like
special points on the lips that help figure out how open or closed the mouth is.
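A minimal sketch of a MAR computation is given below; the exact landmark indices (here the inner-lip points 60-67 of the 68-point model) are an assumption for illustration, since the report does not fix them.

# Illustrative mouth aspect ratio: vertical lip distances over the horizontal one
import numpy as np

def mouth_aspect_ratio(mouth):
    # mouth is the list of inner-lip landmark points (assumed indices 60-67)
    A = np.linalg.norm(np.array(mouth[1]) - np.array(mouth[7]))  # vertical distance 61-67
    B = np.linalg.norm(np.array(mouth[3]) - np.array(mouth[5]))  # vertical distance 63-65
    C = np.linalg.norm(np.array(mouth[0]) - np.array(mouth[4]))  # horizontal distance 60-64
    return (A + B) / (2.0 * C)

# A MAR above a calibrated threshold is treated as an open mouth, which this
# project uses to toggle the input function on and off.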

4.5 DATASETS
Datasets play a crucial role in training the machine learning models used for eye
and face gesture cursor control projects. These datasets consist of recordings of
various facial expressions and eye movements paired with corresponding cursor
movements or clicks. The model learns to recognize the patterns in the facial data
and translates them into cursor control actions.
CODE SNIPPET
# Importing the .dat file into the variable p, which will be used for the
# calculation of facial landmarks
p = "shape_predictor.dat"
detector = dlib.get_frontal_face_detector()  # Returns a default face detector object
predictor = dlib.shape_predictor(p)  # Outputs the set of landmark points that define
                                     # the pose of the object (here, the human face)
(lstart, lend) = face_utils.FACIAL_LANDMARKS_IDXS["right_eye"]
(rstart, rend) = face_utils.FACIAL_LANDMARKS_IDXS["left_eye"]
(mstart, mend) = face_utils.FACIAL_LANDMARKS_IDXS["mouth"]

Fig 4.9: Collection of datasets


➢ Data Collection: Researchers collect data from participants by recording
their facial movements and eye gaze while they perform specific cursor
actions on a screen. This data might include head movements, blinks,
eyebrow raises, and eye widening.
➢ Data Labelling: The collected data is then manually labelled with the
corresponding cursor movements or clicks. This labelling process helps the
machine learning model understand the relationship between facial features
and desired cursor actions.
➢ Model Training: The labelled dataset is used to train a machine learning
model, typically a convolutional neural network (CNN). The CNN learns to
identify specific facial features and their combinations associated with
particular cursor movements.

4.6 PREPROCESSING
➢ Identify and handle missing values: This might involve removing data points
with missing values, imputing missing values using statistical methods
(mean/median filling), or interpolating between existing values.
➢ Identify and handle outliers: Outliers can significantly skew analysis. You

can remove outliers, cap them to a specific range, or winsorize them (replace
with values closer to the central tendency).

a. Data Transformation:
• Scaling: Standardize or normalize features to have a common scale,
ensuring all features contribute equally during analysis. Common methods
include normalization (min-max scaling) and standardization (z-score
scaling); a brief sketch is given at the end of this section.
• Feature Engineering: Create new features that might be more relevant for
your analysis. This could involve feature selection (picking informative
features) or creating combinations of existing features.
b. Data Integration:
• If your project involves data from multiple sources (e.g., eye tracking data
along with collected data), you'll need to integrate them.
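As a brief illustration of the scaling step mentioned above (a sketch only; the feature matrix is hypothetical), min-max normalization and z-score standardization can be written directly with NumPy:

import numpy as np

# Hypothetical feature matrix: rows are frames, columns are facial measurements
X = np.array([[0.25, 12.0], [0.30, 15.0], [0.20, 9.0]])

# Min-max scaling: map each column to the [0, 1] range
X_minmax = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Z-score standardization: zero mean and unit variance per column
X_zscore = (X - X.mean(axis=0)) / X.std(axis=0)

print(X_minmax)
print(X_zscore)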

CHAPTER 5
TESTING
5.1 OVERVIEW
Testing for cursor control using eye and face gestures involves evaluating the
accuracy, speed, and ease of use of the system. A possible breakdown of the
testing process is as follows. Accuracy testing measures how well the system
translates eye and face movements into cursor movements; this might involve
tasks like clicking on specific targets or tracking moving objects on the screen.
Speed testing shows how quickly the system responds to user input, which is
important for practical use and user experience. Ease of use testing evaluates how
easy it is for users to learn and use the system, which might involve observing
user interactions. Fatigue testing checks whether using the system for extended
periods leads to fatigue or discomfort; this is important for users who may rely
on this technology for extended computer use.

5.1.1 PRINCIPLES OF TESTING


➢ All the tests should meet the customer requirements.
➢ To keep testing unbiased, our software should be tested by a third party.
➢ Exhaustive testing is not possible; we need the optimal amount of testing
based on the risk assessment of the application.
➢ All the tests to be conducted should be planned before implementing them.
➢ Testing follows the Pareto rule (80/20 rule), which states that 80% of errors
come from 20% of program components.
➢ Start testing with small parts and extend it to larger parts.

5.2 TYPES OF TESTING


➢ White box testing:
White-box testing is the detailed investigation of the internal logic and structure
of the code. White-box testing is also called glass testing or open-box testing.
In order to perform white-box testing on an application, a tester needs to know
the internal workings of the code. The tester needs to look inside the source
code and find out which unit or chunk of the code is behaving inappropriately.

➢ Black box testing:
The technique of testing without interior workings of the application is
called black box testing. The tester is oblivious to the system architecture
and does not have access to the source code. Typically, while performing
a black-box test, a tester will interact with the system's user interface by
providing inputs and examining outputs without knowing how and where
the inputs are worked upon.
➢ Grey box testing:
Grey-box testing is a technique to test the application with limited knowledge
of the internal workings of an application. In software testing, the phrase
"the more you know, the better" carries a lot of weight while testing an
application.

5.3 TEST CASES


In testing the project for controlling a mouse pointer via facial gestures for
quadriplegic patients, a series of crucial test cases must be considered. These
include verifying the system's proper initialization and accurate face detection
under varying conditions. Additionally, assessing the precision of gesture
recognition and its translation into mouse movements is vital. Usability testing
with quadriplegic patients is essential to ensure the interface's intuitiveness and
effectiveness. Performance testing under different loads and error handling in
unexpected scenarios are also key. Compatibility testing with various operating
systems and hardware configurations, as well as regression testing for new
updates, completes the comprehensive testing process, ensuring the system's
functionality, usability, and reliability.

Table 5.3.1: Description of the test cases and their results

Sl no | Test Case                                | Expected Outcome                                     | Test Result
1     | Input without sunglasses and with light  | Consistent performance in reading the input         | Met the expected outcome
2     | Input with sunglasses and with light     | Correct identification with minimal eye detection   | Fail
3     | Input with eyeglasses and with light     | Accurate identification of the eyes                 | Achieved consistent performance
4     | Input with eyeglasses and with low light | Minimal face and eye detection                      | Reached the expected outcome
5     | Input with eye lenses                    | Minimal face detection                              | Met the expected outcome
6     | Input from a user with microphthalmia    | Unable to identify the blinks and calculate the EAR | Reached the expected outcome
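As an example of how a single test case such as blink detection could be automated, the following is a minimal sketch assuming a hypothetical classify_blink helper and an illustrative EAR threshold of 0.2; both the function and the threshold are placeholders rather than the project's actual values.

def classify_blink(ear, threshold=0.2):
    # Treat a frame as a blink when the eye aspect ratio drops below the threshold.
    return ear < threshold

def test_open_eye_is_not_a_blink():
    assert not classify_blink(0.30)

def test_closed_eye_is_a_blink():
    assert classify_blink(0.12)

Such checks can be run automatically (for example with pytest) as part of regression testing whenever the detection logic is updated.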

CHAPTER 6
RESULTS
In this chapter, we discuss the snapshots of the project. These snapshots contain the results retrieved at specific points in time.

6.1 SNAP SHOTS OF THE PROJECT


The envisioned system effectively tracks faces and accurately positions the cursor
on the computer screen. Users can monitor the mouse and manipulate its
movement, along with performing various mouse functions such as clicking,
dragging, and scrolling, utilizing different eye movements. This system
undergoes testing on multiple users, including two physically disabled
individuals and one physically fit individual. The respective snapshots of the test
cases are provided below.

Fig 6.1: Reading Input


Fig 6.1 illustrates the input reading capability of the application. The input function becomes active when the user opens their mouth. The two boxes depicted in the figure represent the cursor movement speed: as the control point moves beyond the outer box, the speed increases, and when it remains within the inner box, the speed decreases. The user's mouth serves as the primary input trigger for the system application. Additionally, the input function can be disabled by opening the mouth again when the application is not in use.
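The speed-zone behaviour described above can be illustrated with a small sketch; the box sizes, step values, and the use of pyautogui are assumptions made for illustration and are not the project's exact implementation.

import pyautogui  # cursor automation library, assumed here as the backend

def move_cursor(anchor_x, anchor_y, center_x, center_y, inner=30, outer=80):
    # Offset of the tracked anchor point (e.g., the nose tip) from the frame centre.
    dx, dy = anchor_x - center_x, anchor_y - center_y
    if max(abs(dx), abs(dy)) <= inner:
        return  # inside the inner box: the cursor stays still
    # Move faster once the anchor point crosses the outer box, slower in between.
    step = 25 if max(abs(dx), abs(dy)) > outer else 10
    move_x = step if dx > inner else -step if dx < -inner else 0
    move_y = step if dy > inner else -step if dy < -inner else 0
    pyautogui.moveRel(move_x, move_y)

Calling this function once per processed frame yields slow, precise movement near the centre of the frame and faster movement as the head turns further in any direction.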

Fig. 6.2: Reading the input of left eye for EAR
Fig 6.2 demonstrates reading the input of the left eye during the calibration phase, in order to calculate the EAR and store the values, with the left-click function activated by the left eye. In the click functionality, the eyes play a crucial role, as they are the essential components for clicking on icons.

Fig. 6.3: Reading the input of mouth for MAR


Fig 6.3 demonstrates reading the input of the mouth during the calibration phase, in order to calculate the MAR and store the values used for the further functionalities.
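For reference, the eye aspect ratio (EAR) and mouth aspect ratio (MAR) computed during this calibration are commonly defined from facial landmark coordinates as in the sketch below; the landmark ordering and the MAR variant used here are stated assumptions, not values taken from the project's source code.

from math import dist  # Euclidean distance between two points (Python 3.8+)

def eye_aspect_ratio(eye):
    # eye: six (x, y) landmarks ordered p1..p6 around one eye.
    # EAR = (|p2-p6| + |p3-p5|) / (2 * |p1-p4|); it drops sharply when the eye closes.
    return (dist(eye[1], eye[5]) + dist(eye[2], eye[4])) / (2.0 * dist(eye[0], eye[3]))

def mouth_aspect_ratio(mouth):
    # mouth: eight (x, y) landmarks ordered around the inner lips.
    # One common variant: vertical lip opening divided by horizontal mouth width.
    return dist(mouth[2], mouth[6]) / dist(mouth[0], mouth[4])

# Hypothetical landmark coordinates for an open eye.
open_eye = [(0, 5), (3, 8), (6, 8), (9, 5), (6, 2), (3, 2)]
print(round(eye_aspect_ratio(open_eye), 2))

During calibration these ratios are sampled for each user, and the stored values are later compared against live readings to trigger clicks and other actions.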

Fig. 6.4: Calibration value of EAR

This shows the values that are stored after reading and calculating the eye aspect ratio of the user, which will later be used for comparisons.

Fig. 6.5: Cursor moving Downwards


Fig 6.5 shows the cursor moving downwards when the face is directed downwards, with no physical interaction between the hand and the mouse.

Fig. 6.6: Cursor moving to the right side

Fig 6.6 illustrates the right-side cursor movement functionality, which is activated by moving the face towards the right side. In the clicking process, the eyes remain pivotal, as they are the essential elements for interacting with icons.

Fig. 6.7: Reading the PDF file

Fig 6.7 illustrates opening and reading a PDF file using eye and face gestures alone, without the help of the hands to control the cursor.

Fig. 6.8: Double-click functionality on the taskbar

Fig 6.8 shows the double-click functionality being performed on the taskbar of the screen.
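Once a gesture has been recognised, dispatching the corresponding click is straightforward; the gesture names below are placeholders for whatever labels the detection stage produces, and pyautogui is assumed as the automation backend.

import pyautogui

# Map recognised gestures (placeholder names) to mouse actions.
ACTIONS = {
    "left_wink": pyautogui.click,          # single left click
    "double_blink": pyautogui.doubleClick,
    "right_wink": pyautogui.rightClick,
}

def dispatch(gesture):
    # Perform the mouse action associated with the recognised gesture, if any.
    action = ACTIONS.get(gesture)
    if action is not None:
        action()

For instance, dispatch("double_blink") would produce the double-click shown in Fig 6.8.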

CHAPTER 7
CONCLUSION

The "Controlling Mouse Pointer Using Eye and Face Gestures for Quadriplegic
Patients" system aims to enable hands- free computing by utilizing eye movements
to control computer systems. It explores various movement- based human-
computer interaction techniques, focusing on operating the mouse cursor through
eyemovement. The Viola-Jones algorithm is employed to facilitate mouse pointer
movement and clicking operations. The proposed system captures real-time input
from users via OpenCV and operates in the background. It can be deployed on
laptops and desktops equipped with either built-inor external webcams. With this
application, users can execute actions such as cursor movement in all directions,
clicking, scrolling, and dragging, rendering it particularly beneficial for
individuals with physical disabilities. An important feature of the application is
its availability as already-to-install executable file, eliminating the need for
external packages. Additionally, it can be configured to automatically launch on
system boot, eliminating the need for manual startup.
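As an illustration of how such a standalone executable could be produced, a typical PyInstaller build is sketched below; the entry script name main.py is a placeholder, and the exact options used for the project are not specified in this report.

import PyInstaller.__main__  # PyInstaller's documented programmatic entry point

# Bundle the application into a single windowed executable.
PyInstaller.__main__.run([
    "main.py",      # placeholder name for the project's entry script
    "--onefile",    # produce one self-contained executable
    "--windowed",   # do not open a console window when the app starts
])

This is equivalent to running "pyinstaller --onefile --windowed main.py" on the command line; configuring the resulting executable to run at system boot is then handled through the operating system's startup mechanism.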

7.1 FUTURE SCOPE


The project "controlling mouse pointer using eye and face gesture for
quadriplegic patients," this means that the technology developed in the project
could have a significant impact on the lives of individuals with quadriplegia. By
using eye movements and facial gestures to control a mouse pointer, quadriplegic
patients could interact with digital devices in a more natural and intuitive way,
without the need for physical input devices. The potential applications of this
technology are vast. For example, it could be used to enable quadriplegic patients
to: Navigate the internet and access online resources Communicate with others
through email, social media, and video conferencing Control smart home devices,
such as lights, thermostats, and entertainment systems Operate assistive
technology devices, such as wheelchairs and robotic arms.
Furthermore, the technology developed in this project could also have applications beyond the quadriplegic community. For example, it could be used to create more intuitive and natural interfaces for virtual and augmented reality systems, or to develop new forms of input for gaming and entertainment. Overall, the potential of HCI software to transform the way we interact with digital devices is immense, and the project "Controlling Mouse Pointer Using Eye and Face Gestures for Quadriplegic Patients" is a prime example of the exciting possibilities in this field.
