Project Report


CHAPTER 1

INTRODUCTION
To enable communication with paralyzed people, we use a face detection algorithm. The algorithm processes captured video frames and outputs the detected face as a rectangular bounding box. This output is then processed with an AdaBoost classifier to detect the eye region within the face. The detected eyes are checked for eyeball movement. If movement is present, it is tracked to decode the combination the patient is using to express a dialogue. If not, the blink pattern is processed to produce both voice and text output for the corresponding dialogue. Many methods have been introduced to let motor neuron disease (MND) patients communicate with the outside world, such as brain-wave techniques and electro-oculography. Loss of speech is hard to adjust to. It is difficult for patients to make the caretaker understand what they need, especially when they are in hospitals; it becomes difficult for them to express their feelings, and they cannot take part in conversations. The proposed system detects the voluntary blinks of the patient, sends the caretaker a message describing the requirement, and also gives voice output via a call to the caretaker. The system uses an inbuilt camera to capture video of the patient and, with the help of a facial landmark algorithm, identifies the face and eyes of the patient. The system then slides a series of images one after the other on the screen, and the patient can choose to blink over the image that conveys his desire. The system identifies the blink with the help of the eye aspect ratio, sends a message to the caretaker describing what the patient wants, and also initiates a call to the caretaker in which a voice reads out the patient's request.
Blink To Speak offers a form of independence to paralyzed people. The software platform converts eye blinks to speech. Every feature of the software can be controlled by eye movement, so the software can be operated independently by paralyzed people. Using the software, patients can record messages, recite those messages aloud, and send the messages to others.

1.1 PROBLEM STATEMENT


Patients who lose the ability to speak and write can contact the outside world only through human-computer interaction, e.g. by controlling brain waves or through eye tracking. Currently, brain-wave-controlled devices need to be worn by users, so they are not convenient to use. There exists eye-motion-based software which enables MND patients to write on a computer using only their eye functions, but when they are away from the PC and lying in bed, they cannot communicate with care providers. With the goal of helping bedridden MND patients call for other people in a simple and easy way, this research aims to develop a real-time video processing system which can successfully detect eye blinks regardless of head direction, day or night.

1.2 PROPOSED SYSTEM


The proposed project aims to provide a solution for paralyzed people without any harm to their body, externally or internally. It improves on the previously developed prototypes in this field because none of its components are in direct contact with the patient's body, which makes it safer.
Cost-effective: The main objective of developing a real-time video oculography algorithm is to provide a cost-effective system for people who cannot afford the existing communication techniques for such patients, which are too costly. It is therefore necessary to design a system that is affordable to common people and built from cost-effective components.

Fast: Few algorithms have been developed for video oculography communication systems. One objective of this project is to develop an algorithm that is extremely fast compared to the existing ones.

Accuracy: Another objective of this project is to develop an algorithm that is more accurate than the existing ones. Blink To Speak also focuses on a demographic that is often ignored. It is free and open source, the software runs on a wide variety of low-end computers, and the only required peripheral is a basic webcam. This makes the software accessible not only to paralyzed people, but to paralyzed people of almost all financial classes.

1.3 SCOPE OF THE PROJECT


 To enable paralyzed people to communicate with us.
 To provide ideal assistance for paralyzed people.
 To provide all required services to paralyzed people once they communicate their needs via blinks.

CHAPTER 2
LITERATURE SURVEY

2.1 INFERENCE ON EYE BLINK PATTERN


Sivakumar D, Ramkumar P, Sridhar V, Yamuna A and Shashi B, Department of Computer Science and Engineering, Rajarajeswari College of Engineering, Bangalore, Karnataka, India 560074.
The goal of this project is to assist those with disabilities who cannot communicate with people. The Haar cascade classifier is used for face and eye recognition, obtaining information about the eyes and the facial axes. The same classifier, based on Haar-like features, is also applied to determine the position of the eye relative to the face's axis. The paper proposes an efficient eye detection method based on the location of the detected face, and finally develops a method for detecting eye blinks.

2.2 INFERENCE ON EYE MOTION DETECTION


Assistance for Paralyzed Patient Using Eye Motion Detection.
Authors: Milan Pandey, Anoop Shinde, Kushal Chaudhari, Divyanshu Totla, Rajnish, Prof. N.D. Mali, Dept. of Computer Engineering, Sinhgad Academy of Engineering.
In paralysis, the ability to control muscle movement is limited; around the eyes, muscle movements or blinks may be the only way for the patient to communicate. There are two main groups of human-computer interfaces for a completely paralyzed patient using eye motion and eye blink detection. The first is the brain-computer interface (BCI), and the second is a system controlled by invasive devices. A BCI measures electrical brain activity and interprets the signal to control computer applications; however, the main drawbacks of BCI are intrusiveness and the need for EEG recording hardware. The invasive method makes use of a contact-lens-based tracking system in which small silicon wired coils, called scleral search coils, are embedded into a modified contact lens. Such systems are costly to implement, increase stress on the patients, and need skilled labor to set up and maintain for proper functioning; the communication interface is often intrusive, requires special hardware, or depends on active infrared sensors. The authors instead developed a non-intrusive communication interface that runs on a consumer-grade computer and takes input in the form of video frames from an inexpensive webcam without special lighting conditions. The interface detects voluntary eye blinks and pupil motion, then interprets them as control commands. The detected eye direction can be useful in applications such as medical assistance, S.O.S, and basic utilities. The video frames are processed by the OpenCV library, which is open-source software.

2.3 INFERENCE ON EYE MONITORING DEVICE


Eye Monitored Device for Disabled People.
Authors: Asfand Ateem, Mairaj Ali, Zeeshan Ali Akbar, Muhammad Asad Bashir, Department of Electrical Engineering, College of E&ME.
Technology is mostly used to benefit people. Some conditions are fixed, but people with others, such as recurrent paralysis (whether induced by genetics or caused by a combination of illnesses), are becoming more dependent on technology as time passes, and technologists are always looking for ways to overcome such challenges. A brain-computer interface is a system which allows a person to manage a PC by measuring and interpreting electrical brain signals. In addition, the authors present a model engineered for locked-in syndrome, a medical condition in which most of the body's muscles are paralyzed except for the movement of the eyes. With the improvements in information technology, object detection and recognition have wide usage in such applications.

2.4 INFERENCE ON AUTOMATED EYE BLINK DETECTION


A Fully Automated Unsupervised Algorithm for Eye-Blink Detection in EEG
Signals
Authors: Mohit Agarwal Electrical and Computer Engineering Georgia Institute of
Technology, Raghupathy Sivakumar Electrical and Computer Engineering Georgia
Institute of Technology
Eye blinks are known to significantly pollute EEG data, which has a
negative impact on how well EEG signals are decoded in a variety of medical and
scientific applications. In this study, we take into account the challenge of eye-
blink identification, which may then be used to effectively eliminate blinks from
EEG signals. We suggest the algorithm Blink, which self-learns user-specific
brainwave profiles for eye-blinks and is totally automated and unsupervised. As a
result, Blink eliminates the need for manual inspection or user training. Blink can
accurately estimate the start and finish timestamps of eye blinks when operating on
a single channel EEG.

CHAPTER 3
MATERIALS AND METHODOLOGY
3.1 METHODOLOGY
3.1.1 VISUAL COMMUNICATION
Visual communication is the practice of using visual elements to convey a message, inspire change, or evoke emotion. It's one part communication design, crafting a message that educates, motivates, and engages, and one part graphic design, using design principles to communicate that message so that it's clear and eye-catching. Effective visual communication should be equally appealing and informative.

3.1.2 IMAGE PROCESSING


Image processing is the process of transforming an image into a digital form
and performing certain operations to get some useful information from it. The
image processing system usually treats all images as 2D signals when applying
certain predetermined signal processing methods.
Types of Image Processing
There are five main types of image processing:
 Visualization - Find objects that are not visible in the image.
 Recognition - Distinguish or detect objects in the image.
 Sharpening and restoration - Create an enhanced image from the original
image.
 Pattern recognition - Measure the various patterns around the objects in the
image.
 Retrieval - Browse and search images from a large database of digital
images that are similar to the original image.

3.1.3 FACE DETECTION
Face detection has progressed from rudimentary computer vision techniques
to advances in machine learning (ML) to increasingly sophisticated artificial neural
networks (ANN) and related technologies; the result has been continuous
performance improvements. It now plays an important role as the first step in many
key applications -- including face tracking, face analysis and facial recognition.
Face detection has a significant effect on how sequential operations will perform in
the application.

In face analysis, face detection helps identify which parts of an image or video should be focused on to determine age, gender and emotions using facial expressions. In a facial recognition system, which maps an individual's facial features mathematically and stores the data as a faceprint, face detection data is required for the algorithms that discern which parts of an image or video are needed to generate a faceprint. Once identified, the new faceprint can be compared with stored faceprints to determine if there is a match.

3.1.4 EYE DETECTION


Eye tracking refers to the process of measuring where we look, also known as our point of gaze. These measurements are carried out by an eye tracker, which records the position of the eyes and the movements they make. Near-infrared light
is directed toward the center of the eyes (pupil), causing detectable reflections in
both the pupil and the cornea (the outer-most optical element of the eye). These
reflections – the vector between the cornea and the pupil – are tracked by an
infrared camera. This is the optical tracking of corneal reflections, known as pupil
center corneal reflection (PCCR).

An infrared light source (and thus detection method) is necessary as the
accuracy of gaze direction measurement is dependent on a clear demarcation (and
detection) of the pupil as well as the detection of corneal reflection. Normal light
sources (with ordinary cameras) aren’t able to provide as much contrast, meaning
that an appropriate amount of accuracy is much harder to achieve without infrared
light.
Light from the visible spectrum is likely to generate uncontrolled specular
reflection, while infrared light allows for a precise differentiation between the
pupil and the iris – while the light directly enters the pupil, it just “bounces off” the
iris. Additionally, as infrared light is not visible to humans it doesn’t cause any
distraction while the eyes are being tracked.

FIG 3.1 EYE BALL MOVEMENT


3.1.5 TYPES OF EYE TRACKING TECHNOLOGIES
 Screen Based Eye Tracking Technology.
 Eye Tracking glasses.

3.1.6 BLINK DETECTION
Blink detection is the process of using computer vision to first detect a face, with eyes, and then using a video stream (or even a series of rapidly taken still photos) to determine whether those eyes have blinked within a certain timeframe.
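As a concrete illustration, the eye aspect ratio (EAR) used later in this report can be computed from the six landmarks of one eye. This is a minimal sketch assuming the standard 68-landmark ordering; the 0.25 threshold is an illustrative value that is normally tuned per user.

    from scipy.spatial import distance as dist

    EAR_THRESHOLD = 0.25  # illustrative cut-off; below this the eye is treated as closed

    def eye_aspect_ratio(eye):
        # eye: six (x, y) landmark points for one eye, in 68-landmark order
        a = dist.euclidean(eye[1], eye[5])  # first vertical distance
        b = dist.euclidean(eye[2], eye[4])  # second vertical distance
        c = dist.euclidean(eye[0], eye[3])  # horizontal distance
        return (a + b) / (2.0 * c)

The EAR stays roughly constant while the eye is open and drops sharply toward zero during a blink, so counting consecutive frames below the threshold separates deliberate blinks from reflexive ones.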

3.2 PROGRAMMING LANGUAGE USED


3.2.1 PYTHON
Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability through the use of significant indentation via the off-side rule. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming. It is often described as a "batteries included" language due to its comprehensive standard library.

3.2.2 PYTHON LIBRARIES


Normally, a library is a collection of books, or a room or place where many books are stored to be used later. Similarly, in the programming world, a library is a collection of precompiled code that can be used later in a program for specific, well-defined operations. A Python library is a collection of related modules. It contains bundles of code that can be used repeatedly in different programs, which makes Python programming simpler and more convenient for the programmer, as we don't need to write the same code again and again for different programs. Python libraries play a very important role in fields such as machine learning, data science, and data visualization.

The following are the libraries used in our project:
 OpenCV
 Dlib
 Enum
 Time
 Subprocess
 Tkinter
 gTTS
 PIL (Pillow)
 Twilio & Tempfile

3.2.2.1 OPENCV
OpenCV is a huge open-source library for computer vision, machine learning, and image processing, and it now plays a major role in the real-time operation that is so important in today's systems. By using it, one can process images and videos to identify objects, faces, or even the handwriting of a human. When it is integrated with libraries such as NumPy, Python is capable of processing the OpenCV array structure for analysis. To identify image patterns and their various features, we use vector space and perform mathematical operations on these features.
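A minimal sketch of the kind of OpenCV capture loop this implies, assuming the default webcam at index 0:

    import cv2

    cap = cv2.VideoCapture(0)          # default webcam
    while True:
        ok, frame = cap.read()         # grab one video frame
        if not ok:
            break
        cv2.imshow("frame", frame)     # show the live feed
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to quit
            break
    cap.release()
    cv2.destroyAllWindows()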

3.2.2.2 DLIB
Dlib is a general-purpose cross-platform software library written in the programming language C++. Its design is heavily influenced by ideas from design by contract and component-based software engineering. Thus it is, first and foremost, a set of independent software components.

It is open-source software released under the Boost Software License. Since development began in 2002, Dlib has grown to include a wide variety of tools. As of 2016, it contains software components for dealing with networking, threads, graphical user interfaces, data structures, linear algebra, machine learning, image processing, data mining, XML and text parsing, numerical optimization, Bayesian networks, and many other tasks.
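A hedged sketch of dlib's face and landmark detection as it might be used here; the pre-trained model file shape_predictor_68_face_landmarks.dat must be downloaded separately, and the image file name is a placeholder.

    import cv2
    import dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

    image = cv2.imread("patient.jpg")                # placeholder file name
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):                      # one rectangle per detected face
        shape = predictor(gray, face)                # 68 facial landmarks
        # landmarks 36-41 outline the left eye, 42-47 the right eye
        left_eye = [(shape.part(i).x, shape.part(i).y) for i in range(36, 42)]
        right_eye = [(shape.part(i).x, shape.part(i).y) for i in range(42, 48)]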

3.2.2.3 ENUM
Enumerations in Python are implemented by using the module named
“enum”. Enumerations are created using classes. Enums have names and values
associated with them.
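A small example; the member names below are illustrative gaze states, not the project's actual code:

    from enum import Enum

    class Gaze(Enum):
        CENTER = 0
        LEFT = 1
        RIGHT = 2
        BLINKING = 3

    print(Gaze.LEFT.name, Gaze.LEFT.value)   # prints: LEFT 1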

3.2.2.4 TIME
As the name suggests, the Python time module allows us to work with time in Python. It provides functionality such as getting the current time and pausing program execution. Before using this module, we need to import it.
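A small example of the kind of timing this enables, for instance separating a deliberate blink from a reflexive one; the 0.5-second threshold and the sleep standing in for closed-eye frames are illustrative:

    import time

    closed_at = time.time()      # moment the eye first closes
    time.sleep(0.7)              # stand-in for frames observed with a closed eye
    if time.time() - closed_at > 0.5:
        print("voluntary blink detected")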

3.2.2.5 TKINTER
Python offers multiple options for developing a GUI (Graphical User Interface). Of all the GUI methods, Tkinter is the most commonly used: it is the standard Python interface to the Tk GUI toolkit shipped with Python. Python with Tkinter is the fastest and easiest way to create GUI applications, and creating a GUI using Tkinter is an easy task.
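A minimal Tkinter sketch of the sliding-choices idea described earlier, with illustrative captions standing in for the need images:

    import tkinter as tk

    root = tk.Tk()
    root.title("Blink To Speak")
    label = tk.Label(root, text="Water", font=("Arial", 32))
    label.pack(padx=40, pady=40)

    needs = ["Water", "Food", "Nurse", "Reposition"]

    def next_need(i=[0]):
        # advance to the next caption every two seconds
        i[0] = (i[0] + 1) % len(needs)
        label.config(text=needs[i[0]])
        root.after(2000, next_need)

    root.after(2000, next_need)
    root.mainloop()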

3.2.2.6 SUBPROCESS
Subprocess is a standard Python module that allows the user to start new processes from within a Python script. It is useful for running multiple processes in parallel or calling an external program or command from inside Python code. Subprocess allows the user to manage the inputs, outputs, and errors raised by the child process from Python code.
The parent-child relationship of processes is where the "sub" in the subprocess name comes from. Subprocess is used to launch processes that are completely separate from the user's program, whereas multiprocessing is designed for processes that communicate with each other.
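A minimal example that launches an external program, here a command-line audio player; "mpg123" and the file name are assumptions, so substitute whatever player your platform provides:

    import subprocess

    result = subprocess.run(["mpg123", "message.mp3"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        print("player failed:", result.stderr)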

3.2.2.7 GTTS
There are several APIs available to convert text to speech in Python. One such API is the Google Text-to-Speech API, commonly known as the gTTS API. gTTS is a very easy-to-use tool which converts entered text into audio that can be saved as an MP3 file. The gTTS API supports several languages including English, Hindi, Tamil, French, German and many more.

The speech can be delivered in any one of the two available audio speeds,
fast or slow. However, as of the latest update, it is not possible to change the voice
of the generated audio.
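A minimal gTTS sketch turning a selected need into an MP3 file; the sentence is illustrative:

    from gtts import gTTS

    tts = gTTS(text="The patient needs water", lang="en", slow=False)
    tts.save("message.mp3")      # ready to be played or attached to an alert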

3.2.2.8 PIL (PILLOW)
PIL stands for Python Imaging Library, and it’s the original library that
enabled Python to deal with images. PIL was discontinued in 2011 and only
supports Python 2. To use its developers’ own description, Pillow is the friendly
PIL fork that kept the library alive and includes support for Python 3.

PIL is the Python Imaging Library, which provides the Python interpreter with image editing capabilities. The Image module provides a class with the same name, which is used to represent a PIL image. The module also provides a number of factory functions, including functions to load images from files and to create new images.
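A minimal Pillow sketch loading an image and handing it to Tkinter; the file name is a placeholder, and ImageTk needs a running Tk root:

    import tkinter as tk
    from PIL import Image, ImageTk

    root = tk.Tk()
    img = Image.open("water.png").resize((200, 200))   # placeholder image file
    photo = ImageTk.PhotoImage(img)                    # Tk-compatible image object
    tk.Label(root, image=photo).pack()
    root.mainloop()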

3.2.2.9 TWILIO & TEMPFILE


Twilio is a platform that provides APIs and SDKs for developers to build
communication features and capabilities into their applications. Twilio enables
developers to use voice, text, chat, video, email, WhatsApp, and IoT channels to
create personalized customer experiences.

Twilio is used by hundreds of thousands of businesses and more than ten


million developers worldwide, including major companies like Uber, Airbnb,
Netflix, and HubSpot.

Tempfile is a Python module used in situations where we need to read multiple files, change or access the data in them, and produce output files based on the processed data. Output files produced during program execution are often no longer needed after the program finishes. Without temporary files, this creates a problem: many output files clutter the file system with unwanted files that would have to be deleted every time the program ran.
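A hedged sketch combining tempfile with gTTS: the generated speech is written to a temporary file that disappears when the with-block exits (note that on Windows a still-open NamedTemporaryFile cannot always be reopened by another process):

    import tempfile
    from gtts import gTTS

    with tempfile.NamedTemporaryFile(suffix=".mp3", delete=True) as tmp:
        gTTS(text="Emergency, please come", lang="en").save(tmp.name)
        print("temporary audio at", tmp.name)  # play it here, before the file vanishes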

3.3 SYSTEM ARCHITECTURE

FIG 3.2. SYSTEM ARCHITECTURE

3.4 WORK FLOW DIAGRAM

FIG 3.3 WORK FLOW DIAGRAM

EYE RECOGNITION DIAGRAM

FIG 3.4 EYE RECOGNITION DIAGRAM

CASE DIAGRAM

FIG 3.5. CASE DIAGRAM

STEPS FOR IMPLEMENTATION:
Step 1: Capturing a video.
Step 2: Capture images from video.
Step 3: Converting images into grayscale.
Step 4: Fix landmarks on the images.
Step 5: Detect Blinks.
Step 6: Detecting Eye-ball movements.
Step 7: Converting to text.
Step 8: Sending the text, or placing a call in an emergency situation.
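The following sketch ties Steps 1 through 8 together at a high level; only the OpenCV calls are real API, and the remaining steps are marked as comments because they are implemented by the modules described below:

    import cv2

    cap = cv2.VideoCapture(0)                          # Step 1: capturing a video
    while cap.isOpened():
        ok, frame = cap.read()                         # Step 2: one image from the video
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) # Step 3: convert to grayscale
        # Steps 4-6: fix landmarks, detect blinks and eye-ball movements
        # (implemented by the Dlib-based modules in Section 3.5)
        # Steps 7-8: convert the chosen pattern to text, then send an SMS
        # or place a call via Twilio in an emergency
    cap.release()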
3.5 MODULES
 Images From Camera
 Converting images into Grayscale
 Preprocessing
 Face Detection
 Eye Ball Movement Recognition using Dlib
 Sending Messages

3.5.1 DESCRIPTION OF A MODULE


3.5.1.1 IMAGES FROM CAMERA
The very first module collects images from the camera. The camera should be placed in front of the paralyzed person and may be fixed to their wheelchair. The camera starts capturing images when the person clicks the concerned button fixed to the wheelchair. The captured images are sent to the system for further processing.

3.5.1.2 CONVERTING IMAGES INTO GRAYSCALE:


Many image processing operations work on a plane of image data (e.g., a
single color channel) at a time.

The purposes of converting images to grayscale are:
 Simplicity
 Data reduction
Converting to grayscale is attractive, especially because of the likely reduction in processing time. However, it comes at the cost of throwing away color data that may be very helpful or required for many image processing applications.
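A small illustration of the data reduction, using a synthetic stand-in frame: a color frame carries three channels while its grayscale version carries one.

    import cv2
    import numpy as np

    frame = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in for a captured BGR frame
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # single-channel copy
    print(frame.shape, "->", gray.shape)             # (480, 640, 3) -> (480, 640)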

3.5.1.3 PREPROCESSING:
Pre-processing is the first step of the image processing pipeline. It involves data validation and imputation to assess whether the data is complete and accurate, and to correct errors and fill in missing values. We use the pre-processing method to improve the quality of the captured images: in image processing, preprocessing improves image quality by removing noise and unwanted data and by eliminating variations that arise during acquisition.

3.5.1.4 FACE DETECTION


We use face detection technology, which uses machine learning algorithms to extract human faces from larger images; such images typically contain plenty of non-face objects, such as buildings, landscapes, and various body parts.
Facial detection algorithms usually begin by seeking out human eyes, which are among the easiest facial features to detect. Next, the algorithm might try to find the mouth, nose, eyebrows, and iris. After identifying these facial features, and once the algorithm concludes that it has extracted a face, it goes through additional tests to confirm that it is, indeed, a face.

3.5.1.5 EYE BALL MOVEMENT RECOGNITION USING DLIB
Research on eye tracking is increasing owing to its ability to facilitate many
different tasks, particularly for the elderly or users with special needs. Eye tracking
is the process of measuring where one is looking (point of gaze) or the motion of
an eye relative to the head. Researchers have developed different algorithms and
techniques to automatically track the gaze position and direction, which are helpful
to find the emotions of the paralyzed person. We explore and review eye tracking
concepts, methods, and techniques by further elaborating on efficient and effective
modern approaches such as machine learning (ML).
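One common landmark-based approach, sketched here under assumptions (the threshold value of 70 and the helper name are illustrative, not necessarily the exact method used in this project), is to crop the eye region, threshold it so the dark pupil stands out, and compare the two halves:

    import cv2

    def gaze_direction(gray, eye_points):
        # gray: grayscale frame; eye_points: six (x, y) dlib landmarks of one eye
        xs = [p[0] for p in eye_points]
        ys = [p[1] for p in eye_points]
        eye = gray[min(ys):max(ys), min(xs):max(xs)]          # crop the eye region
        _, thresh = cv2.threshold(eye, 70, 255, cv2.THRESH_BINARY)
        h, w = thresh.shape
        left_white = cv2.countNonZero(thresh[:, : w // 2])
        right_white = cv2.countNonZero(thresh[:, w // 2 :])
        if left_white < right_white:
            return "LEFT"    # the dark pupil sits in the left half
        if right_white < left_white:
            return "RIGHT"
        return "CENTER"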

3.5.1.6 SENDING MESSAGES


Alert messaging (or alert notification) is machine-to-person communication that is important or time-sensitive. An alert may be a calendar reminder or a notification of a new message. After recognizing the emotions or needs of the patient, we send alert messages to the caretaker or relatives. If the patient has a health problem or faces an emergency, a voice message is played or a call is placed to the doctor or nurse. This process uses the Twilio package to call the concerned person.
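A hedged Twilio sketch of this alert step: one SMS plus one call that speaks the alert aloud. The credentials and phone numbers are placeholders.

    from twilio.rest import Client

    client = Client("ACCOUNT_SID", "AUTH_TOKEN")       # placeholder credentials

    client.messages.create(                            # SMS alert to the caretaker
        to="+10000000000", from_="+10000000001",
        body="Patient needs nursing assistance")

    client.calls.create(                               # call that speaks the alert
        to="+10000000000", from_="+10000000001",
        twiml="<Response><Say>Patient needs nursing assistance</Say></Response>")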
3.6 DEEP LEARNING
Deep learning is a branch of machine learning based on artificial neural networks. It is capable of learning complex patterns and relationships within data, and we don't need to explicitly program everything. It has become increasingly popular in recent years due to advances in processing power and the availability of large datasets. It is built on artificial neural networks (ANNs), also known as deep neural networks (DNNs); these networks are inspired by the structure and function of the human brain's biological neurons, and they are designed to learn from large amounts of data.

 Deep Learning is a subfield of Machine Learning that involves the use of
neural networks to model and solve complex problems. Neural networks
are modeled after the structure and function of the human brain and
consist of layers of interconnected nodes that process and transform data.
 The key characteristic of Deep Learning is the use of deep neural
networks, which have multiple layers of interconnected nodes. These
networks can learn complex representations of data by discovering
hierarchical patterns and features in the data. Deep Learning algorithms
can automatically learn and improve from data without the need for
manual feature engineering.
 Deep Learning has achieved significant success in various fields,
including image recognition, natural language processing, speech
recognition, and recommendation systems. Some of the popular Deep
Learning architectures include Convolutional Neural Networks (CNNs),
Recurrent Neural Networks (RNNs), and Deep Belief Networks (DBNs).
 Training deep neural networks typically requires a large amount of data
and computational resources. However, the availability of cloud
computing and the development of specialized hardware, such as Graphics
Processing Units (GPUs), has made it easier to train deep neural networks.
 In summary, Deep Learning is a subfield of Machine Learning that
involves the use of deep neural networks to model and solve complex
problems. Deep Learning has achieved significant success in various
fields, and its use is expected to continue to grow as more data becomes
available, and more powerful computing resources become available.
Deep learning is the branch of machine learning which is based on
artificial neural network architecture. An artificial neural network or ANN uses

layers of interconnected nodes called neurons that work together to process and
learn from the input data.
In a fully connected Deep neural network, there is an input layer and
one or more hidden layers connected one after the other. Each neuron receives
input from the previous layer neurons or the input layer. The output of one neuron
becomes the input to other neurons in the next layer of the network, and this
process continues until the final layer produces the output of the network. The
layers of the neural network transform the input data through a series of nonlinear
transformations, allowing the network to learn complex representations of the input
data.
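A toy forward pass through one hidden layer makes this concrete; the sigmoid activation and random weights are illustrative stand-ins for a trained model:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    x = np.random.rand(4)            # input layer (4 features)
    W1 = np.random.rand(5, 4)        # weights: input -> hidden (5 neurons)
    W2 = np.random.rand(1, 5)        # weights: hidden -> output
    hidden = sigmoid(W1 @ x)         # weighted total, then the nonlinearity
    output = sigmoid(W2 @ hidden)    # final network output
    print(output)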

3.6.1 METHODS OF DEEP LEARNING


Today Deep learning has become one of the most popular and visible areas
of machine learning, due to its success in a variety of applications, such as
computer vision, natural language processing, and Reinforcement learning.
Deep learning can be used for supervised, unsupervised, and reinforcement machine learning, each of which it processes in a different way.
SUPERVISED MACHINE LEARNING:
Supervised machine learning is the machine learning technique in which the neural network learns to make predictions or classify data based on labeled datasets. Here we input both the input features and the target variables. The neural network learns to make predictions based on the cost or error arising from the difference between the predicted and actual targets; this process is known as backpropagation. Deep learning algorithms like convolutional neural networks and recurrent neural networks are used for many supervised tasks like image classification and recognition, sentiment analysis, language translation, etc.

UNSUPERVISED MACHINE LEARNING
Unsupervised machine learning is the machine learning technique in which the neural network learns to discover patterns or to cluster a dataset based on unlabeled data. Here there are no target variables; the machine has to determine the hidden patterns or relationships within the data on its own. Deep learning algorithms like autoencoders and generative models are used for unsupervised tasks like clustering, dimensionality reduction, and anomaly detection.

REINFORCEMENT MACHINE LEARNING:


Reinforcement machine learning is the machine learning technique in which an agent learns to make decisions in an environment so as to maximize a reward signal. The agent interacts with the environment by taking actions and observing the resulting rewards. Deep learning can be used to learn policies, or sets of actions, that maximize the cumulative reward over time. Deep reinforcement learning algorithms like Deep Q-Networks (DQN) and Deep Deterministic Policy Gradient (DDPG) are used for reinforcement tasks like robotics, game playing, etc.

ARTIFICIAL NEURAL NETWORKS


Artificial neural networks are built on the principles of the structure and
operation of human neurons. It is also known as neural networks or neural nets. An
artificial neural network’s input layer, which is the first layer, receives input from
external sources and passes it on to the hidden layer, which is the second layer.
Each neuron in the hidden layer gets information from the neurons in the previous
layer, computes the weighted total, and then transfers it to the neurons in the next
layer. These connections are weighted, which means that the impacts of the inputs
from the preceding layer are more or less optimized by giving each input a distinct

weight. These weights are then adjusted during the training process to enhance the
performance of the model.

3.6.2 DEEP LEARNING ALGORITHMS:


 Convolutional Neural Networks (CNNs)
 Long Short-Term Memory Networks (LSTMs)
 Recurrent Neural Networks (RNNs)
 Generative Adversarial Networks (GANs)
 Radial Basis Function Networks (RBFNs)
 Multilayer Perceptrons (MLPs)
 Self-organizing Maps (SOMs)

ALGORITHM USED
CNN (CONVOLUTIONAL NEURAL NETWORKS)
Deep Learning has facilitated multiple approaches to computer vision,
cognitive computation and refined processing of visual data. One such instance is
the use of CNN or Convolutional Neural Networks for object or image
classification. CNN algorithms provide a massive advantage in visual-based
classification by enabling machines to perceive the world around them (in the form
of pixels) as humans do.
CNN is fundamentally a recognition algorithm that allows machines to
become trained enough to process, classify or identify a multitude of parameters
from visual data through layers. CNN-based systems learn from image-based
training data and can classify future input images or visual data on the basis of its
training model. As long as the dataset that is used for training contains a range of
useful visual cues (spatial data), the image or object classifier will be highly
accurate.

This promotes advanced object identification and image classification by enabling machines or software to accurately identify the required objects from input data. CNN models rely on classification, segmentation and localisation, and then build predictions. This allows such systems to react almost as a human brain would in any given situation, and sometimes even more effectively.
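For illustration only, here is a minimal Keras CNN for an open-versus-closed eye classifier. The architecture and the 24x24 grayscale input size are assumptions for demonstration; the report's own pipeline relies on dlib landmarks rather than a trained CNN.

    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(24, 24, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),   # open vs. closed eye
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    model.summary()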

3.6.3 TECHNIQUES AND METHODS


 Visual Communication
 Image Processing
 Face Detection
 Eye Detection
 Blink Detection

3.6.3.1 VISUAL COMMUNICATION


Visual communication is the practice of using visual elements to convey a message, inspire change, or evoke emotion. It's one part communication design, crafting a message that educates, motivates, and engages, and one part graphic design, using design principles to communicate that message so that it's clear and eye-catching. Effective visual communication should be equally appealing and informative.

3.6.3.2 IMAGE PROCESSING


Image processing is the process of transforming an image into a digital
form and performing certain operations to get some useful information from it.
Types of Image Processing
 Visualization
 Recognition

 Sharpening and Restoration
 Pattern recognition
 Retrieval

VISUALIZATION
Data visualization is the graphical representation of information and data. By
using visual elements like charts, graphs, and maps, data visualization tools
provide an accessible way to see and understand trends, outliers, and patterns in
data. Additionally, it provides an excellent way for employees or business owners
to present data to non-technical audiences without confusion. In the world of Big
Data, data visualization tools and technologies are essential to analyze massive
amounts of information and make data-driven decisions.

ADVANTAGES OF VISUALIZATION:
 Easily sharing information.
 Interactively explore opportunities.
 Visualize patterns and relationships.

RECOGNITION
Facial recognition is a way of identifying or confirming an individual’s
identity using their face. Facial recognition systems can be used to identify people
in photos, videos, or in real-time.
Facial recognition is a category of biometric security. Other forms of biometric
software include voice recognition, fingerprint recognition, and eye retina or iris
recognition. The technology is mostly used for security and law enforcement,
though there is increasing interest in other areas of use.

Many people are familiar with face recognition technology through the Face ID used to unlock iPhones (however, this is only one application of face recognition). Typically, facial recognition of this kind does not rely on a massive database of photos to determine an individual's identity; it simply identifies and recognizes one person as the sole owner of the device, while limiting access to others.
Beyond unlocking phones, facial recognition works by matching the faces
of people walking past special cameras, to images of people on a watch list. The
watch lists can contain pictures of anyone, including people who are not suspected
of any wrongdoing, and the images can come from anywhere — even from our
social media accounts. Facial technology systems can vary, but in general, they
tend to operate as follows:

STEP 1: FACE DETECTION


The camera detects and locates the image of a face, either alone or in a
crowd. The image may show the person looking straight ahead or in profile.

STEP 2: FACE ANALYSIS


Next, an image of the face is captured and analyzed. Most facial
recognition technology relies on 2D rather than 3D images because it can more
conveniently match a 2D image with public photos or those in a database. The
software reads the geometry of your face. Key factors include the distance between
your eyes, the depth of your eye sockets, the distance from forehead to chin, the
shape of your cheekbones, and the contour of the lips, ears, and chin. The aim is to
identify the facial landmarks that are key to distinguishing your face.

STEP 3: CONVERTING THE IMAGE TO DATA
The face capture process transforms analog information (a face) into a set
of digital information (data) based on the person's facial features. Your face's
analysis is essentially turned into a mathematical formula. The numerical code is
called a faceprint. In the same way that thumbprints are unique, each person has
their own faceprint.

STEP 4: FINDING A MATCH


Your faceprint is then compared against a database of other known faces.
For example, the FBI has access to up to 650 million photos, drawn from various
state databases. On Facebook, any photo tagged with a person’s name becomes a
part of Facebook's database, which may also be used for facial recognition. If your
faceprint matches an image in a facial recognition database, then a determination is
made. Of all the biometric measurements, facial recognition is considered the most
natural. Intuitively, this makes sense, since we typically recognize ourselves and
others by looking at faces, rather than thumbprints and irises. It is estimated that
over half of the world's population is touched by facial recognition technology
regularly.

SHARPENING AND RESTORATION


Before getting into the act of sharpening an image, we need to consider what sharpness actually is. The biggest problem is that, in large part, sharpness is subjective. Sharpness is a combination of two factors: resolution and acutance. Resolution is straightforward and not subjective; it's just the size, in pixels, of the image file. All other factors being equal, the higher the resolution of the image (the more pixels it has), the sharper it can be. Acutance is a little more complicated.

It's a subjective measure of the contrast at an edge. There's no unit for acutance; you either think an edge has contrast or think it doesn't. Edges that have more contrast appear more defined to the human visual system.

PATTERN RECOGNITION
Pattern recognition is a technique to classify input data into classes or
objects by recognizing patterns or feature similarities. Unlike pattern matching
which searches for exact matches, pattern recognition looks for a “most likely”
pattern to classify all information provided. This can be done in a supervised
(labelled data) learning model or unsupervised (unlabelled data) to discover new,
hidden patterns.

INFORMATION RETRIEVAL:
Information Retrieval (IR) can be defined as a software program that deals with the organization, storage, retrieval, and evaluation of information from document repositories, particularly textual information. Information retrieval is the activity of obtaining material, usually documents of an unstructured nature (i.e. usually text), that satisfies an information need from within large collections stored on computers; for example, information retrieval takes place when a user enters a query into the system.

3.6.3.3 FACE DETECTION:


Face detection, also called facial detection, is an artificial intelligence (AI)-based
computer technology used to find and identify human faces in digital images and
video. Face detection technology is often used for surveillance and tracking of
people in real time. It is used in various fields including security, biometrics, law
enforcement, entertainment and social media. Face detection uses machine learning

(ML) and artificial neural network (ANN) technology, and plays an important role
in face tracking, face analysis and facial recognition. In face analysis, face
detection uses facial expressions to identify which parts of an image or video
should be focused on to determine age, gender and emotions. In a facial
recognition system, face detection data is required to generate a faceprint and
match it with other stored faceprints.

3.6.3.4 EYE DETECTION


Eye tracking is a sensor technology that can detect a person’s presence and
follow what they are looking at in real-time. The technology converts eye
movements into a data stream that contains information such as pupil position, the
gaze vector for each eye, and gaze point. Essentially, the technology decodes eye
movements and translates them into insights that can be used in a wide range of
applications or as an additional input modality.

FIG 3.6. EYE DETECTION


3.6.3.5 BLINK DETECTION
Blink detection is the process of using computer vision to first detect a face, with eyes, and then using a video stream (or even a series of rapidly taken still photos) to determine whether those eyes have blinked or not within a
certain timeframe. There are a number of uses for blink detection. Probably the
most common use, as far as consumers are concerned, has been in cameras and
smartphones. The aim of this detection has been to help the photographer improve
their photographs, by telling them when their subjects have blinked.
The blink detection technology focuses on the eyes of the people in the photograph (it can often work with up to twenty faces). Whenever a pair of eyes is occluded, either a message is displayed on the LCD screen telling the photographer to delay taking the photograph, or the more advanced cameras are smart enough to simply snap the photo at a moment when all eyes are open.

CHAPTER 4
RESULTS

4.1 OUTPUT SCREEN

FIG 4.1 EYE REGION DETECTION

FIG 4.2 BLINK COUNTING

FIG 4.3 PATIENTS NEED SELECTION

FIG 4.4 HEALTH ASSISTANCE NEED

FIG 4.5 MOBILITY ASSISTANCE NEED

FIG 4.6 NURSING ASSISTANCE NEED

FUTURE ENHANCEMENT
In our research, we have demonstrated the system on a laptop, where it can be set up in a compact manner that allows ordinary users to operate the proposed system. The system should work according to the requirement without any human intervention. The current analysis does not work in the dark, so the system can be enhanced in that direction, and the audio and message outputs demonstrated in our study can be further automated. The main objective is to design a real-time interactive system that can assist paralysis patients in controlling appliances such as lights, fans, etc. In addition, the system can play pre-recorded audio messages triggered by a predefined number of eye blinks, and it can alert the doctor or concerned person by sending an SMS in case of emergency using the eye blink sensor. The eye blink sensor is able to distinguish an intentional blink from a normal blink, which is useful for paralysis patients, especially tetraplegic patients, to regulate their home devices easily without any help.

CONCLUSION
Although blink detection systems exist for other purposes, an
implementation of a blink detection system with the end use of controlling
appliances has not been previously accomplished. While the system is intended to
assist the paralyzed and physically challenged, it can definitely be used by all
types of individuals. The main challenge involved in the implementation of the
system is the development of a real time robust blink detection algorithm. Many
algorithms have been developed to serve the purpose, with some being more
accurate than the others. This paper presented a blink detection system based on
online template matching. The first phase involved blink detection; the
second phase involved the counting of blinks and subsequent control of
appliances through a micro controller. By enabling the paralyzed to gain control
of albeit a small part of their lives, the system can offer some level of
independence to them. The helpers who are assigned the task of tending to
paralyzed persons through the day can then be afforded a break. For practical use with continuous video input, laptops with built-in webcams or USB cameras will suffice.
algorithm and efficiency falls further under limited lighting conditions. Since the
initialization phase of the algorithm is based on differencing between consecutive
frames, background movement in the frame may lead to inaccurate operation.
Typically, background movement causes non eye pairs to be detected as eye pairs.
This is overcome to some extent by limiting the search region to the face of an
individual, by implementing a face tracking algorithm prior to blink detection.
However, this in turn can lead to reduced efficiency in blink detection. By giving
an option to the user to choose between the system with and without face
tracking, a level of flexibility can be reached. The application of the blink
detection system is not limited to the control of appliances but can also be used for a variety of other functions. Playback of audio distress messages over an intercom
system is one of the other applications of the system. Future applications of the
system may include playback of video or audio files by eye blinks and making a
VOIP call to play a distress message.

REFERENCES
1. Chinnawat Devahasdin Na Ayudhya and Thitiwan Srinark, "A Method for Real-Time Eye Blink Detection and Its Application."
2. Michael Chau and Margrit Betke, 2005. "Real Time Eye Tracking and Blink Detection with USB Cameras." Boston University Computer Science Technical Report No. 2005-12, Boston, USA.
3. Liting Wang, Xiaoqing Ding, Chi Fang, Changsong Liu and Kongqiao Wang, 2009. "Eye Blink Detection Based on Eye Contour Extraction." Proceedings of SPIE-IS&T Electronic Imaging, San Jose, CA, USA.
4. Abdul-Kader, S. A. and Woods, J., 2015. "Survey on Chatbot Design Techniques in Speech Conversation Systems." International Journal of Advanced Computer Science and Applications; Hima T., "Eye Controlled Home-Automation For Disabled," pp. 6-7 (ERTEEI'17).
5. AbuShawar, B. and Atwell, E. "ALICE Chatbot: Trials and Outputs." Computación y Sistemas, 19 (2015). doi: 10.13053/cys-19-4-2326.
6. Kohei Arai and Ronny Mardiyanto, 2011. "Comparative Study on Blink Detection and Gaze Estimation Methods for HCI, in Particular, Gabor Filter Utilized Blink Detection Method." Proceedings of the Eighth International Conference on Information Technology: New Generations, Las Vegas, USA, pp. 441-446.
7. Taner Danisman, Ian Marius Bilasco, Chabane Djeraba and Nacim Ihaddadene, "Drowsy Driver Detection System Using Eye Blink Patterns." 2010 International Conference on Machine and Web Intelligence, 29 November 2010.
8. Atish Udayashankar, Amit R. Kowshik, S. Chandramouli and H. S. Prashanth, "Assistance for the Paralyzed Using Eye Blink Detection." 2012 Fourth International Conference on Digital Home, 11 December 2012.
9. Deep Bose, "Home Automation with Eye Blink for Paralyzed Patients." BRAC University, Bangladesh, 2017.
10. F. M. Sukno, S. K. Pavani, C. Butakoff and A. F. Frangi, "Automatic Assessment of Eye Blinking Patterns through Statistical Shape Models." ICVS 2009, LNCS 5815, Springer-Verlag Berlin Heidelberg, pp. 33-42, 2009.
11. M. Divjak and H. Bischof, "Eye Blink Based Fatigue Detection for Prevention of Computer Vision Syndrome." MVA 2009 IAPR Conference on Machine Vision Applications, Yokohama, Japan, May 2009.
12. L. Wang, X. Ding, C. Fang and C. Liu, "Eye Blink Detection Based on Eye Contour Extraction." Proceedings of SPIE, vol. 7245, 72450R, 2009.
13. Michelle Alva and Neil Castellino, "An Image Based Eye Controlled Assistive System for Paralytic Patients." 2nd International Conference on Communication Systems, IEEE, 2017.
14. Milan Pandey, Anoop Shinde, Kushal Chaudhari, Divyanshu Totla, Rajnish Kumar and Prof. N. D. Mali, "Assistance for Paralyzed Patient Using Eye Motion Detection." IEEE, 2018.

