
VISUAL COMMUNICATION WITH PARALYZED

PEOPLE USING FACE DETECTION


by
R. AKASH
(Registration number: 112421622002)
of
SRI VENKATESWARA COLLEGE OF ENGINEERING
AND TECHNOLOGY

A PROJECT REPORT
submitted to the

FACULTY OF INFORMATION AND COMMUNICATION


ENGINEERING

in partial fulfillment of the


requirements for the award of
the degree

of
MASTER OF COMPUTER APPLICATIONS

ANNA UNIVERSITY
CHENNAI -600 025
SEPTEMBER-2023

BONAFIDE CERTIFICATE

Certified that the project report titled “VISUAL COMMUNICATION WITH PARALYZED
PEOPLE USING FACE DETECTION” is the bonafide work of Mr. R. AKASH (Registration number:
112421622002), who carried out the research under my supervision. Certified further that, to
the best of my knowledge, the work reported herein does not form part of any other project
report or dissertation on the basis of which a degree or award was conferred on an earlier
occasion on this or any other candidate.

Internal Guide Head of the Department

Submitted for the Project and Viva-Voce Examination held on ………………………

Internal Examiner External Examiner

ACKNOWLEDGEMENT

I would like to express my sincere thanks and gratitude to our honorable Chairman,
Thiru. Dr. S. K. PURUSHOTHAMAN, M.E., Ph.D.

I convey my gratitude to our beloved Principal, Dr. P. PALANI, M.E., Ph.D., for being a
great source of inspiration.

My thanks to my internal project guide, Mrs. JAYAKUMARI, MCA, M.Phil., Ph.D., Assistant
Professor and Head of the Department, Department of Computer Applications.

My sincere thanks also to Mrs. P. KANIMOZHI, MCA, M.E., Assistant Professor.

My sincere thanks also to Mrs. J. JEMI, MCA, M.Phil., Assistant Professor.

I wish to express my thanks to my external project guide, Mrs. M.G. VENNILA,
Technical Support, NIIT, for her continuous guidance and constant encouragement
throughout this project work, which has made this project a success.

My sincere thanks to all the teaching and non-teaching members of the Department
of Computer Applications, who have directly or indirectly helped me in this project.

R.AKASH

VISUAL COMMUNICATION WITH PARALYZED
PEOPLE USING FACE DETECTION

TABLE OF CONTENTS
S. No   Title
1       Abstract
2       Introduction
3       Existing System
4       Proposed System
5       Scope of the Project
6       Literature Survey
7       (i) Visual Communication
        (ii) Image Processing
        (iii) Types of Image Processing
        (iv) Face Detection
        (v) Eye Detection
        (vi) Types of Eye Tracking Technology
        (vii) Blink Detection
8       The Programming Language Used: Python
9       System Architecture
10      Data Flow Diagram
11      ER Diagram
12      Use Case Diagram
13      Steps for Implementation
14      Modules
15      Description of Modules
16      Deep Learning
17      Implementation
18      Future Enhancement
19      Conclusion
20      Reference

ABSTRACT

Motor Neuron Disease (MND) is an incurable medical condition in which the
patient's motor neurons are paralyzed. It also leads to weakness of the
muscles of the hands, feet, or voice. Because of this, the patient cannot
perform voluntary actions, and it is very difficult for the patient to
express his needs as he is not able to communicate with the world. The
proposed system detects eye blinks and differentiates between an
intentional long blink and a normal eye blink. The proposed system can be
used to control devices and communicate with other people. The execution of
the proposed system begins by capturing frames from a video using the
system's camera. We use the Python library OpenCV to capture images of the
patient, and we use Python to implement the communication with paralyzed
people using eye blinks.

Introduction:

To communicate with paralyzed people, we use a face detection algorithm.

The face detection algorithm processes the captured video frames and
outputs a rectangular box around the face.

This output from the face detection algorithm is then processed by an
AdaBoost classifier to detect the eye region within the face.

The detected eye region is then checked for eyeball movement. If movement
is present, it is tracked to determine the combination the patient is using
to express a dialogue.

If not, the blink pattern is processed to produce both a voice output and a
text output with the respective dialogue.

Many methods have been introduced for motor neuron disease patients to
communicate with the outside world, such as the brain wave technique and
electro-oculography.

Loss of speech can be hard to adjust to. It is difficult for patients to make
the caretaker understand what they need, especially when they are in hospitals.
It becomes difficult for the patients to express their feelings, and they
cannot even take part in conversations.

The proposed system detects the voluntary blinks of the patient, sends a
message about the requirement to the caretaker, and also gives a voice output
via a call to the caretaker. The system uses an inbuilt camera to capture
video of the patient and, with the help of a facial landmark algorithm,
identifies the face and eyes of the patient. The system then slides a series
of images one after another on the screen, and the patient can choose to
blink over the image he wants, in order to convey his desires. The system
identifies the blink with the help of the eye aspect ratio, sends a message
to the caretaker describing what the patient wants, and also initiates a
call to the caretaker in which a voice says what the patient wants.
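As a rough illustration of this idea only, the sketch below computes an eye aspect ratio (EAR) from six eye landmark points and counts a blink once the ratio has stayed below a threshold for a few consecutive frames; the threshold, frame count, and landmark ordering are illustrative assumptions, not values taken from the implementation presented later.

# Hypothetical sketch: EAR-based blink counting on six (x, y) eye landmarks.
from math import dist              # Euclidean distance (Python 3.8+)

EAR_THRESHOLD = 0.2                # assumed closed-eye threshold
CONSEC_FRAMES = 3                  # assumed frames required to register a blink
closed_frames = 0

def eye_aspect_ratio(eye):
    # eye = [p0..p5]: outer corner, two upper-lid points, inner corner, two lower-lid points
    vertical_1 = dist(eye[1], eye[5])
    vertical_2 = dist(eye[2], eye[4])
    horizontal = dist(eye[0], eye[3])
    return (vertical_1 + vertical_2) / (2.0 * horizontal)

def update_blink(eye_landmarks):
    # Returns True exactly once for each detected blink.
    global closed_frames
    if eye_aspect_ratio(eye_landmarks) < EAR_THRESHOLD:
        closed_frames += 1
        return False
    blinked = closed_frames >= CONSEC_FRAMES
    closed_frames = 0
    return blinked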

Blink To Speak offers a form of independence to paralyzed people. The


software platform converts eye blinks to speech. Every feature of the software
can be controlled by eye movement. Thus, the software can be independently
operated by paralyzed people. Using the software, patients can record
messages, recite those messages aloud, and send the messages to others.

Problem Statement:

Patients who lose the ability to speak and write can only contact the
outside world through human-computer interaction, e.g. by controlling brain
waves or by tracking eye movements. Currently, brain-wave-controlled devices
need to be worn by users, so they are not convenient to use.

There exists eye-motion-based software which enables MND patients to write
on a computer using only their eye functions. However, when they are away
from the PC and lying in bed, they cannot communicate with care providers.
With the goal of helping bedridden MND patients call for other people with a
simple and easy approach, this research aims to develop a real-time video
system which can successfully detect eye blinks regardless of head
direction, day or night.

Existing System:

The eye blink sensor senses eye blinks using infrared light. The light
reflected across the eye varies with each blink: if the eye is closed, the
output is high; otherwise, the output is low. The eye blink sensor works by
illuminating the eye and eyelid area with infrared light and then monitoring
the changes in the reflected light using a phototransistor and differentiator
circuit. The exact functionality depends greatly on the positioning and
aiming of the emitter and detector with respect to the eye. Here the input
is sampled three times per blink, and each blink is classified as either a
short or a long blink. Since eight appliances are controlled simultaneously
(8 = 2³), three blinks are used to select the appliance.

Disadvantages of Existing System:

• Not possible to use in real time with the existing equipment

• Often fails to reliably identify long blinks

• Obtrusive

• Not independent

• Synchronization problems with simulator data (in SleepEye)

Proposed System:

The proposed project aims to bring out a solution for paralyzed people
without any external or internal harm to their bodies. It outweighs the
previously developed prototypes in this field because none of the components
are in direct contact with the patient's body, hence it will prove to be safer.

Cost effective: The main objective of developing the algorithm for a
real-time video oculography system is to make it affordable for people who
cannot afford the existing techniques, which are too costly. Thus, it is
necessary to design a system which is affordable to common people and which
uses cost-effective components.

Fast: Few algorithms have been developed for video oculography systems for
communication. One objective of this project is to develop an algorithm
which is extremely fast compared to the existing ones.

Accuracy: Another objective of this project is to develop an algorithm which
is more accurate compared to the existing ones.

Most existing solutions serve only those who can afford the technology.
Blink To Speak focuses on a different demographic that is often ignored.
Blink To Speak is free and open source. The software runs on a wide variety
of low-end computers. The only required peripheral is a basic webcam. Not
only is this software accessible to paralyzed people, it is accessible to
paralyzed people of almost all financial classes as well.

Scope of the Project:

1. To enable paralyzed people to communicate with us.

2. To provide reliable assistance for paralyzed people.

3. To provide all required services for paralyzed people after they
communicate their needs via blinks.

Literature Survey

Nowadays, the rapid growth of technology has made the PC become outdated.
The tasks that we once used to do with a PC are now handled by mobiles or
other smart devices. The introduction of network-enabled devices, or IoT
devices, has led to advanced home automation systems. However, the usage is
limited for people with physical disorders, as remote control of an
appliance becomes difficult. This project is about people who are suffering
from paralysis (for example, tetraplegia patients) and the difficulties they
face while controlling home appliances. Tetraplegia is brought about by
damage to the brain or the spinal cord; such a patient still needs a way to
control appliances.

We try to take care of their issue using an eye blink sensor. An eye blink
sensor is a transducer which detects an eye blink and gives an output
voltage whenever the eye is shut. This project uses eye blinking, for
instance, in systems that monitor a paralyzed person so that he/she can
operate home appliances such as lights, fans, air conditioning and so on. It
is also connected with an Android smartphone over Bluetooth, so that patients
can communicate with others in case of emergency by sending a text SMS just
by blinking their eyes.

D. Taneral, 2016, presents a drowsy driver detection system using eye blink
patterns that is based on monitoring changes in eye blink duration.
In addition to tracking the face and the eyes to compute a drowsiness index,
Garcia has also presented a non-intrusive approach to drowsiness detection
based on computer vision. It is installed in the car and is able to work
under real operating conditions. An IR camera is placed in front of the
driver, in the dashboard, in order to detect the driver's face and obtain
drowsiness cues from their eye closure.

H. S. Prashanth, 2012. Assistance for Paralyzed Using Eye Blink Detection
(Abdul Rahaman Shaik, Journal of Engineering Research and Application). The
main aim of this paper is to design a real-time interactive system that can
assist the paralyzed in controlling appliances such as lights, fans, etc.,
or in playing pre-recorded audio messages, through a predefined number of
eye blinks. Image processing techniques have been implemented in order to
detect the eye blinks.

Deep Bose, BRAC University, Bangladesh, 2017. Home automation with eye blink
for paralyzed patients [10]. The constant demand to improve daily living
standards for paralysis patients and the general population serves as a
motivation to develop newer technology. The tasks once performed by big
traditional computers are now solved with smaller smart devices. The study
describes the development of a blink sensor device which is used in
automated home designs for disabled people.

Content:

1. Visual Communication:
Visual communication is the practice of using visual elements to convey a
message, inspire change, or evoke emotion.

It’s one part communication design—crafting a message that educates,


motivates, and engages, and one part graphic design—using design principles to
communicate that message so that it’s clear and eye-catching.

Effective visual communication should be equally appealing and


informative.

Image Processing

Image processing is the process of transforming an image into a digital


form and performing certain operations to get some useful information from it.
The image processing system usually treats all images as 2D signals when
applying certain predetermined signal processing methods.

2. Types of Image Processing

There are five main types of image processing:

• Visualization - find objects that are not visible in the image

• Recognition - distinguish or detect objects in the image

• Sharpening and restoration - create an enhanced image from the original image

• Pattern recognition - measure the various patterns around the objects in the image

• Retrieval - browse and search images from a large database of digital images
that are similar to the original image

3. Face Detection

Face detection has progressed from rudimentary computer


vision techniques to advances in machine learning (ML) to increasingly
sophisticated artificial neural networks (ANN) and related technologies; the
result has been continuous performance improvements. It now plays an
important role as the first step in many key applications -- including face
tracking, face analysis and facial recognition. Face detection has a significant
effect on how sequential operations will perform in the application.
In face analysis, face detection helps identify which parts of an image
or video should be focused on to determine age, gender and emotions using
facial expressions. In a facial recognition system -- which maps an
individual's facial features mathematically and stores the data as a faceprint
-- face detection data is required for the algorithms that discern which parts
of an image or video are needed to generate a faceprint. Once identified,
the new faceprint can be compared with stored faceprints to determine if
there is a match.

4. Eye Detection

Eye tracking refers to the process of measuring where we look, also known as
our point of gaze. These measurements are carried out by an eye tracker,
which records the position of the eyes and the movements they make.

Near-infrared light is directed toward the center of the eyes (pupils),
causing detectable reflections in both the pupil and the cornea (the
outermost optical element of the eye). These reflections – the vector between
the cornea and the pupil – are tracked by an infrared camera. This is the
optical tracking of corneal reflections, known as pupil center corneal
reflection (PCCR).

An infrared light source (and thus detection method) is necessary as


the accuracy of gaze direction measurement is dependent on a clear
demarcation (and detection) of the pupil as well as the detection of corneal
reflection. Normal light sources (with ordinary cameras) aren’t able to
provide as much contrast, meaning that an appropriate amount of accuracy
is much harder to achieve without infrared light.

Light from the visible spectrum is likely to generate uncontrolled specular
reflection, while infrared light allows for a precise differentiation
between the pupil and the iris – while the light directly enters the pupil, it
just “bounces off” the iris. Additionally, as infrared light is not visible to
humans, it doesn’t cause any distraction while the eyes are being tracked.

5. Types of Eye Tracking Technologies

1. Screen Based Eye Tracking

2. Eye Tracking glasses

6. Blink Detection

Blink detection is actually the process of using computer vision to firstly


detect a face, with eyes, and then using a video stream (or even a series of
rapidly-taken still photos) to determine whether those eyes have blinked or
not within a certain timeframe.

Programming Language Used

Python

Python is a high-level, general-purpose programming language. Its design
philosophy emphasizes code readability with the use of significant
indentation via the off-side rule.

Python is dynamically typed and garbage-collected. It supports multiple
programming paradigms, including structured (particularly procedural),
object-oriented, and functional programming. It is often described as a
"batteries included" language due to its comprehensive standard library.

Python Libraries

Normally, a library is a collection of books, or a room or place where many
books are stored to be used later. Similarly, in the programming world, a
library is a collection of precompiled code that can be used later in a
program for some specific, well-defined operations.

A Python library is a collection of related modules. It contains bundles of
code that can be used repeatedly in different programs. It makes Python
programming simpler and more convenient for the programmer, as we don't need
to write the same code again and again for different programs. Python
libraries play a very vital role in fields such as machine learning, data
science, data visualization, etc.

The following are the libraries used in our project

1. OpenCV
2. Dlib
3. Enum
4. Time
5. Tkinter
6. Subprocess
7. gTTS
8. PIL
9. Twilio
10. Tempfile

1. OpenCV

OpenCV is a huge open-source library for computer vision, machine learning,
and image processing, and it now plays a major role in real-time operation,
which is very important in today's systems.

By using it, one can process images and videos to identify objects, faces,
or even the handwriting of a human. When it is integrated with various
libraries, such as NumPy, Python is capable of processing the OpenCV array
structure for analysis.

To identify image patterns and their various features, we use vector space
and perform mathematical operations on these features.
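As a minimal sketch of how OpenCV is typically used in this context (the camera index and window name are assumptions), a single frame can be captured from the webcam and converted to grayscale as follows:

import cv2

cap = cv2.VideoCapture(0)                            # assumed default camera index
ok, frame = cap.read()                               # grab one frame from the camera
if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)   # grayscale copy used for detection
    cv2.imshow('frame', gray)                        # show the frame in a window
    cv2.waitKey(0)
cap.release()
cv2.destroyAllWindows()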

2. Dlib

Dlib is a general-purpose cross-platform software library written in the
programming language C++. Its design is heavily influenced by ideas from
design by contract and component-based software engineering. Thus it is,
first and foremost, a set of independent software components. It is
open-source software released under the Boost Software License.

Since development began in 2002, Dlib has grown to include a wide


variety of tools. As of 2016, it contains software components for dealing
with networking, threads, graphical user interfaces, data structures, linear
algebra, machine learning, image processing, data mining, XML and text
parsing, numerical optimization, Bayesian networks, and many other tasks.
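A minimal sketch of the dlib calls used for face and landmark detection follows; the model path and input image are assumptions, and the 68-point shape predictor file must be downloaded separately.

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')   # assumed local path

gray = cv2.cvtColor(cv2.imread('patient.jpg'), cv2.COLOR_BGR2GRAY)          # assumed input image
for face in detector(gray, 1):                       # upsample once while detecting
    landmarks = predictor(gray, face)
    left_eye = [(landmarks.part(i).x, landmarks.part(i).y) for i in range(36, 42)]
    print(left_eye)                                  # six (x, y) points outlining the left eye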

3. Enum

Enumerations in Python are implemented using the module named "enum".
Enumerations are created using classes. Enums have names and values
associated with them.
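A small illustrative example of an enumeration (the member names are hypothetical):

from enum import Enum

class EyeState(Enum):              # hypothetical states for this project
    OPEN = 0
    CLOSED = 1

print(EyeState.CLOSED.name, EyeState.CLOSED.value)   # -> CLOSED 1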

4. Time

As the name suggests, the Python time module allows us to work with time in
Python. It provides functionality such as getting the current time, pausing
the program's execution, and so on. Before using this module, we need to
import it.
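For example, a short illustrative use of the module:

import time

start = time.time()                # seconds since the epoch
time.sleep(0.5)                    # pause the program for half a second
print('elapsed:', time.time() - start)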

5. Tkinter

Python offers multiple options for developing a GUI (Graphical User
Interface). Of all the GUI methods, tkinter is the most commonly used. It is
the standard Python interface to the Tk GUI toolkit shipped with Python.

Python with tkinter is the fastest and easiest way to create GUI
applications. Creating a GUI using tkinter is an easy task.
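A minimal illustrative window (the title and label text are assumptions):

import tkinter as tk

root = tk.Tk()                             # main application window
root.title('Blink To Speak')               # assumed window title
tk.Label(root, text='Please blink to select').pack(padx=20, pady=20)
root.mainloop()                            # start the Tk event loop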
6. Subprocess

Subprocess is a standard Python module that allows the user to start new
processes from within a Python script. It is useful for running multiple
processes in parallel or for calling an external program or command from
inside Python code. Subprocess allows the user to manage the inputs,
outputs, and errors raised by the child process from Python code.

The parent-child relationship of processes is where the "sub" in the
subprocess name comes from. Subprocess is used to launch processes that are
completely separate from the user's program, while multiprocessing is
designed for processes that communicate with each other.
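A brief illustrative use of the module (the command run here is an arbitrary example):

import subprocess

# Run an external command and capture its output.
result = subprocess.run(['python', '--version'], capture_output=True, text=True)
print(result.returncode, result.stdout or result.stderr)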

7. gtts

There are several APIs available to convert text to speech in Python. One
such API is the Google Text-to-Speech API, commonly known as the gTTS API.
gTTS is a very easy-to-use tool which converts the entered text into audio
that can be saved as an MP3 file.

The gTTS API supports several languages including English, Hindi, Tamil,
French, German and many more.

The speech can be delivered at either of the two available speeds, fast or
slow. However, as of the latest update, it is not possible to change the
voice of the generated audio.
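A short illustrative conversion (the message text and file name are assumptions):

from gtts import gTTS

tts = gTTS(text='Water requested', lang='en', slow=False)   # assumed message text
tts.save('request.mp3')                                      # write the spoken audio to an MP3 file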

8. PIL(Pillow)

PIL stands for Python Imaging Library, and it's the original library that
enabled Python to deal with images. PIL was discontinued in 2011 and only
supports Python 2. To use its developers' own description, Pillow is the
friendly PIL fork that kept the library alive and includes support for
Python 3.

PIL is the Python Imaging Library which provides the Python interpreter with
image editing capabilities. The Image module provides a class with the same
name which is used to represent a PIL image. The module also provides a
number of factory functions, including functions to load images from files
and to create new images.
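A small illustrative use of the Image module (the file name is an assumption):

from PIL import Image

img = Image.open('meal.gif')               # assumed image shown on the selection screen
print(img.size, img.mode)                  # report dimensions and color mode
img.resize((200, 200)).save('meal_small.gif')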

9. Twilio

Twilio is a platform that provides APIs and SDKs for developers to


build communication features and capabilities into their applications.
Twilio enables developers to use voice, text, chat, video, email, WhatsApp,
and IoT channels to create personalized customer experiences.

Twilio is used by hundreds of thousands of businesses and more than


ten million developers worldwide, including major companies like Uber,
Airbnb, Netflix, and HubSpot.
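A minimal sketch of placing an outbound call with the Twilio REST client; the credentials and phone numbers below are placeholders, and the URL points at Twilio's public demo TwiML document rather than a project-specific one.

from twilio.rest import Client

client = Client('ACCOUNT_SID', 'AUTH_TOKEN')          # placeholder credentials
call = client.calls.create(
    to='+10000000000',                                 # placeholder caretaker number
    from_='+10000000001',                              # placeholder Twilio number
    url='http://demo.twilio.com/docs/voice.xml')       # TwiML instructions for the call
print(call.status)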

10. Tempfile

The tempfile module is used in situations where we need to read multiple
files, change or access the data in a file, and produce output files based
on the result of the processed data. Each of the output files produced
during the program's execution is no longer needed after the program is
done.

In this case, a problem arises: many output files are created, cluttering
the file system with unwanted files that would require deleting every time
the program runs. The tempfile module avoids this by creating temporary
files that can be removed automatically.
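A brief illustrative use of the module (the suffix and contents are arbitrary):

import tempfile

# Create a temporary file that is deleted automatically when closed.
with tempfile.NamedTemporaryFile(suffix='.mp3', delete=True) as tmp:
    print('temporary file at', tmp.name)
    tmp.write(b'audio bytes go here')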

System Architecture

Data Flow Diagram

ER Diagram

Use Case Diagram

Steps for Implementation

Step 1: Capturing a video

Step 2: Capture images from video

Step 3: Converting images into grayscale

Step 4: Fix landmarks on the images

Step 5: Detect Blinks

Step 6: Detecting Eye-ball movements

Step 7: Converting to text

Step 8: Sending the text, or placing a call in an emergency situation
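A compact, hedged sketch of how these steps could fit together is given below; the model path and loop structure are assumptions, and the full modules appear in the Implementation section.

import cv2
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')   # assumed model path
cap = cv2.VideoCapture(0)                                                   # Step 1: capture video

while True:
    ok, frame = cap.read()                                                  # Step 2: capture an image
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)                          # Step 3: convert to grayscale
    for face in detector(gray, 1):
        landmarks = predictor(gray, face)                                   # Step 4: fix landmarks
        # Steps 5-8 (blink detection, eye-ball tracking, text conversion and
        # alerts) would be driven from these landmark points.
    if cv2.waitKey(1) == ord('e'):
        break
cap.release()
cv2.destroyAllWindows()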

Modules

1. Images From Camera

2. Converting images into Grayscale

3. Preprocessing

4. Face Detection

5. Eye Ball Movement Recognition using Dlib

6. Sending Messages

Description of Modules

1. Images From Camera

The very first module is collecting images from the camera. The camera
should be placed in front of the paralyzed person and may be fixed to their
wheelchair.

The camera starts capturing images when the person clicks the concerned
button that is fixed to the wheelchair.

The captured images are then sent to the system for further processing.

2. Converting images into Grayscale

Many image processing operations work on a single plane of image data
(e.g., a single color channel) at a time.

The purposes of converting images to grayscale are:

1. Simplicity

2. Data reduction

Converting to grayscale is attractive mainly because of the likely reduction
in processing time. However, it comes at the cost of throwing away data
(color data) that may be very helpful or required for many image processing
applications.

3. Preprocessing

Preprocessing is the first step of the processing pipeline.

It involves data validation and imputation to assess whether the data is
complete and accurate, and to correct errors and fill in missing values.

We use the preprocessing step to improve the quality of the captured images.

In image processing, preprocessing improves image quality by removing noise
and unwanted data, and by eliminating variations that arise during
acquisition.

4. Face Detection

We use face detection technology, which uses machine learning and algorithms
in order to extract human faces from larger images. Such images typically
contain plenty of non-face objects, such as buildings, landscapes, and
various body parts.

Facial detection algorithms usually begin by seeking out human eyes, which
are one of the easiest facial features to detect. Next, the algorithm might
try to find the mouth, nose, eyebrows, and irises.

After identifying these facial features, and once the algorithm concludes
that it has extracted a face, it goes through additional tests to confirm
that it is, indeed, a face.

5. Eye Ball Movement Recognition using Dlib

Research on eye tracking is increasing owing to its ability to facilitate
many different tasks, particularly for the elderly or users with special
needs.

Eye tracking is the process of measuring where one is looking (the point of
gaze) or the motion of an eye relative to the head. Researchers have
developed different algorithms and techniques to automatically track the
gaze position and direction, which are helpful for finding the emotions of
the paralyzed person.

We explore and review eye tracking concepts, methods, and techniques,
further elaborating on efficient and effective modern approaches such as
machine learning (ML).

6. Sending Messages

Alert messaging (or alert notification) is machine-to-person communication
that is important or time-sensitive. An alert may be a calendar reminder or
a notification of a new message.

After recognizing the emotions or needs of the patient, we have to send
alert messages to the caretaker or the patient's relatives.

If the patient meets with a health problem or an emergency situation, the
system sends a voice message or places a call to the doctor or nurse.

This process uses the Twilio package to place the call to the concerned
person.

Deep Learning

Deep learning is a branch of machine learning which is based on artificial
neural networks. It is capable of learning complex patterns and
relationships within data. In deep learning, we don't need to explicitly
program everything. It has become increasingly popular in recent years due
to advances in processing power and the availability of large datasets. It
is based on artificial neural networks (ANNs), also known as deep neural
networks (DNNs). These neural networks are inspired by the structure and
function of the human brain's biological neurons, and they are designed to
learn from large amounts of data.

1. Deep Learning is a subfield of Machine Learning that involves the use
of neural networks to model and solve complex problems. Neural
networks are modeled after the structure and function of the human
brain and consist of layers of interconnected nodes that process and
transform data.
2. The key characteristic of Deep Learning is the use of deep neural
networks, which have multiple layers of interconnected nodes. These
networks can learn complex representations of data by discovering
hierarchical patterns and features in the data. Deep Learning
algorithms can automatically learn and improve from data without the
need for manual feature engineering.
3. Deep Learning has achieved significant success in various fields,
including image recognition, natural language processing, speech
recognition, and recommendation systems. Some of the popular Deep
Learning architectures include Convolutional Neural Networks
(CNNs), Recurrent Neural Networks (RNNs), and Deep Belief
Networks (DBNs).
4. Training deep neural networks typically requires a large amount of
data and computational resources. However, the availability of cloud
computing and the development of specialized hardware, such as
Graphics Processing Units (GPUs), has made it easier to train deep
neural networks.
In summary, Deep Learning is a subfield of Machine Learning that involves the
use of deep neural networks to model and solve complex problems. Deep
Learning has achieved significant success in various fields, and its use is
expected to continue to grow as more data becomes available, and more
powerful computing resources become available.

Deep learning is the branch of machine learning which is based on


artificial neural network architecture. An artificial neural network or ANN uses
layers of interconnected nodes called neurons that work together to process and
learn from the input data.
In a fully connected Deep neural network, there is an input layer and one
or more hidden layers connected one after the other. Each neuron receives input
from the previous layer neurons or the input layer. The output of one neuron
becomes the input to other neurons in the next layer of the network, and this
process continues until the final layer produces the output of the network. The
layers of the neural network transform the input data through a series of
nonlinear transformations, allowing the network to learn complex
representations of the input data.

Methods of Deep Learning

Today deep learning has become one of the most popular and visible areas of
machine learning, due to its success in a variety of applications such as
computer vision, natural language processing, and reinforcement learning.

Deep learning can be used for supervised, unsupervised, as well as
reinforcement machine learning; it uses a variety of ways to process these.

• Supervised machine learning: Supervised machine learning is the technique
in which the neural network learns to make predictions or classify data
based on labeled datasets. Here we provide the input features along with the
target variables. The neural network learns to make predictions based on the
cost or error that comes from the difference between the predicted and the
actual target; this process is known as backpropagation. Deep learning
algorithms like convolutional neural networks and recurrent neural networks
are used for many supervised tasks like image classification and
recognition, sentiment analysis, language translation, etc.

• Unsupervised machine learning: Unsupervised machine learning is the
technique in which the neural network learns to discover patterns or to
cluster the dataset based on unlabeled datasets. Here there are no target
variables; the machine has to determine the hidden patterns or relationships
within the datasets on its own. Deep learning algorithms like autoencoders
and generative models are used for unsupervised tasks like clustering,
dimensionality reduction, and anomaly detection.

• Reinforcement machine learning: Reinforcement machine learning is the
technique in which an agent learns to make decisions in an environment so as
to maximize a reward signal. The agent interacts with the environment by
taking actions and observing the resulting rewards. Deep learning can be
used to learn policies, or sets of actions, that maximize the cumulative
reward over time. Deep reinforcement learning algorithms like Deep
Q-Networks (DQN) and Deep Deterministic Policy Gradient (DDPG) are used for
reinforcement tasks like robotics, game playing, etc.

Artificial Neural Networks

Artificial neural networks are built on the principles of the structure and
operation of human neurons. It is also known as neural networks or neural nets.
An artificial neural network’s input layer, which is the first layer, receives input
from external sources and passes it on to the hidden layer, which is the second
layer. Each neuron in the hidden layer gets information from the neurons in the
previous layer, computes the weighted total, and then transfers it to the neurons
in the next layer. These connections are weighted, which means that the impacts
of the inputs from the preceding layer are more or less optimized by giving each

input a distinct weight. These weights are then adjusted during the training
process to enhance the performance of the model.

Deep Learning Algorithms

1. Convolutional Neural Networks (CNNs),

2. Long Short Term Memory Networks (LSTMs),

3. Recurrent Neural Networks (RNNs),

4. Generative Adversarial Networks (GANs),

5. Radial Basis Function Networks (RBFNs),

6. Multilayer Perceptrons (MLPs), and

7. Self Organizing Maps (SOMs)

Algorithm Used

CNN(Convolutional Neural Networks)

Deep Learning has facilitated multiple approaches to computer


vision, cognitive computation and refined processing of visual data. One such
instance is the use of CNN or Convolutional Neural Networks for object
or image classification.

CNN algorithms provide a massive advantage in visual-based


classification by enabling machines to perceive the world around them (in
the form of pixels) as humans do.

CNN is fundamentally a recognition algorithm that allows machines


to become trained enough to process, classify or identify a multitude of
parameters from visual data through layers.
CNN-based systems learn from image-based training data and can
classify future input images or visual data on the basis of its training model.
As long as the dataset that is used for training contains a range of useful
visual cues (spatial data), the image or object classifier will be highly
accurate.

This promotes advanced object identification and image


classification by enabling machines or software to accurately identify the
required objects from input data.

CNN models rely on classification, segmentation and localisation, and then
build predictions. This allows such systems to react almost like a human
brain would in any given situation, and sometimes even more effectively than
a human observer.
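For illustration only, a minimal CNN image classifier in Keras might look like the sketch below; the input size, layer widths, and the two output classes (eye open / eye closed) are assumptions and are not part of the reported implementation, which relies on dlib facial landmarks rather than a trained CNN.

from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical two-class (open/closed eye) CNN on 24x24 grayscale crops.
model = keras.Sequential([
    layers.Input(shape=(24, 24, 1)),
    layers.Conv2D(16, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(32, activation='relu'),
    layers.Dense(2, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()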

Techniques and Methods

• Visual Communication

• Image Processing

• Face Detection

• Eye Detection

• Blink Detection

1. Visual Communication

Visual communication is the practice of using visual elements to


convey a message, inspire change, or evoke emotion.

It’s one part communication design—crafting a message that


educates, motivates, and engages, and one part graphic design—using

design principles to communicate that message so that it's clear and
eye-catching.

Effective visual communication should be equally appealing and


informative.

2. Image Processing

Image processing is the process of transforming an image into a


digital form and performing certain operations to get some useful
information from it.

Types of Image Processing

1. Visualization
2. Recognition
3. Sharpening and Restoration
4. Pattern recognition
5. Retrieval

Visualization

Data visualization is the graphical representation of information and


data. By using visual elements like charts, graphs, and maps, data visualization
tools provide an accessible way to see and understand trends, outliers, and
patterns in data. Additionally, it provides an excellent way for employees
or business owners to present data to non-technical audiences without
confusion.

In the world of Big Data, data visualization tools and technologies


are essential to analyze massive amounts of information and make data-
driven decisions.

Advantages of visualization:

• Easily sharing information.

• Interactively exploring opportunities.

• Visualizing patterns and relationships.

Recognition

Facial recognition is a way of identifying or confirming an


individual’s identity using their face. Facial recognition systems can be
used to identify people in photos, videos, or in real-time.

Facial recognition is a category of biometric security. Other forms


of biometric software include voice recognition, fingerprint recognition,
and eye retina or iris recognition. The technology is mostly used for
security and law enforcement, though there is increasing interest in other
areas of use.

Many people are familiar with face recognition technology through


the FaceID used to unlock iPhones (however, this is only one application
of face recognition). Typically, facial recognition does not rely on a
massive database of photos to determine an individual’s identity — it
simply identifies and recognizes one person as the sole owner of the device,
while limiting access to others.

Beyond unlocking phones, facial recognition works by matching the


faces of people walking past special cameras, to images of people on a
watch list. The watch lists can contain pictures of anyone, including people
who are not suspected of any wrongdoing, and the images can come from
anywhere — even from our social media accounts. Facial technology
systems can vary, but in general, they tend to operate as follows:

Step 1: Face detection

The camera detects and locates the image of a face, either alone or
in a crowd. The image may show the person looking straight ahead or in
profile.

Step 2: Face analysis

Next, an image of the face is captured and analyzed. Most facial


recognition technology relies on 2D rather than 3D images because it can
more conveniently match a 2D image with public photos or those in a
database. The software reads the geometry of your face. Key factors
include the distance between your eyes, the depth of your eye sockets, the
distance from forehead to chin, the shape of your cheekbones, and the
contour of the lips, ears, and chin. The aim is to identify the facial
landmarks that are key to distinguishing your face.

Step 3: Converting the image to data

The face capture process transforms analog information (a face) into


a set of digital information (data) based on the person's facial features.
Your face's analysis is essentially turned into a mathematical formula. The
numerical code is called a faceprint. In the same way that thumbprints are
unique, each person has their own faceprint.

Step 4: Finding a match

Your faceprint is then compared against a database of other known


faces. For example, the FBI has access to up to 650 million photos, drawn
from various state databases. On Facebook, any photo tagged with a
person’s name becomes a part of Facebook's database, which may also be
used for facial recognition. If your faceprint matches an image in a facial
recognition database, then a determination is made.

Of all the biometric measurements, facial recognition is considered


the most natural. Intuitively, this makes sense, since we typically recognize
ourselves and others by looking at faces, rather than thumbprints and irises.
It is estimated that over half of the world's population is touched by facial
recognition technology regularly.

Sharpening and Restoration

Before getting into the act of sharpening an image, we need to


consider what sharpness actually is. The biggest problem is that, in large
part, sharpness is subjective.

Sharpness is a combination of two factors: resolution and acutance.
Resolution is straightforward and not subjective. It's just the size, in
pixels, of the image file. All other factors being equal, the higher the
resolution of the image, the more pixels it has, and the sharper it can be.
Acutance is a little more complicated. It's a subjective measure of the
contrast at an edge. There's no unit for acutance – you either think an edge
has contrast or you think it doesn't. Edges that have more contrast appear
to have a more defined edge to the human visual system.

Pattern Recognition

Pattern recognition is a technique to classify input data into classes or
objects by recognizing patterns or feature similarities. Unlike pattern
matching, which searches for exact matches, pattern recognition looks for a
"most likely" pattern to classify all the information provided. This can be
done with a supervised (labelled data) learning model, or with an
unsupervised (unlabelled data) one to discover new, hidden patterns.

Retrieval

Information Retrieval (IR) can be defined as a software program that deals
with the organization, storage, retrieval, and evaluation of information
from document repositories, particularly textual information. Information
retrieval is the activity of obtaining material, usually of an unstructured
nature (i.e. usually text), that satisfies an information need from within
large collections stored on computers. For example, information retrieval
takes place when a user enters a query into the system.

3. Face Detection

Face detection, also called facial detection, is an artificial


intelligence (AI)-based computer technology used to find and identify
human faces in digital images and video. Face detection technology is often
used for surveillance and tracking of people in real time. It is used in
various fields including security, biometrics, law enforcement,
entertainment and social media.

Face detection uses machine learning (ML) and artificial neural
network (ANN) technology, and plays an important role in face tracking,
face analysis and facial recognition. In face analysis, face detection uses
facial expressions to identify which parts of an image or video should be
focused on to determine age, gender and emotions. In a facial recognition
system, face detection data is required to generate a faceprint and match it
with other stored faceprints.

4. Eye Detection

Eye tracking is a sensor technology that can detect a person’s


presence and follow what they are looking at in real-time. The technology
converts eye movements into a data stream that contains information such
as pupil position, the gaze vector for each eye, and gaze point. Essentially,
the technology decodes eye movements and translates them into insights
that can be used in a wide range of applications or as an additional input
modality.

5. Blink Detection

Blink detection is actually the process of using computer vision to


firstly detect a face, with eyes, and then using a video stream (or even a
series of rapidly-taken still photos) to determine whether those eyes have
blinked or not within a certain timeframe.

There are a number of uses for blink detection. Probably the most
common use, as far as consumers are concerned, has been in cameras and
smartphones. The aim of this detection has been to help the photographer
improve their photographs, by telling them when their subjects have
blinked.

Blink detection technology focuses on the eyes of the people in the
photograph (it can often work with up to twenty faces). Whenever a pair of
eyes is occluded, either a message is displayed on the LCD screen telling
the photographer to delay taking the photograph, or the more advanced
cameras are smart enough to simply snap the photo at a moment when all eyes
are open.

Implementation:

video_process.py

import cv2

class VideoProcess():
    def __init__(self):
        self.camera_dev = None
        self.frame = None
        self.window_name = 'source'
        self.terminate = False

    def start_capture(self):
        self.camera_dev = cv2.VideoCapture(1)
        if not self.camera_dev.isOpened():
            print('Error: Failed to open Camera')
            self.camera_dev = None
            return False
        # height = self.camera_dev.set(cv2.CAP_PROP_FRAME_HEIGHT, 960)
        # width = self.camera_dev.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
        # print('VideoProcess: start_capture: height = ', height, 'width', width)
        return True

    def get_frame(self):
        self.frame = None

        # Capture frame-by-frame
        status, frame = self.camera_dev.read()
        if not status:
            print('Error: Failed to capture image')
            return False, self.frame

        # is it necessary to flip?
        # gray_frame = cv2.flip(gray_frame, 1)
        self.frame = frame.copy()
        return True, self.frame

    def start_process(self, display_image):
        status = self.start_capture()
        if status == False:
            print('Error: Failed to open camera')
            self.terminate = True

        while self.terminate == False:
            # Capture frame-by-frame
            status, frame = self.get_frame()
            if not status:
                print('Error: Failed to capture image')
                self.terminate = True
            else:
                if display_image == True:
                    self.show_image()

        self.terminate_process()

    def show_image(self):
        # Display the resulting frame
        cv2.imshow(self.window_name, self.frame)
        key = cv2.waitKey(1)
        if key == ord('e'):
            self.terminate = True

    def terminate_process(self):
        # release the capture
        self.camera_dev.release()
        cv2.destroyAllWindows()

dlib_process.py

'''
Final Project

'''

import cv2
import dlib
from enum import Enum
from time import time
import video_process


class LandMarkLoc(Enum):
    lt_eye_top_1 = 37
    lt_eye_btm_1 = 41
    lt_eye_top_2 = 38
    lt_eye_btm_2 = 40
    rt_eye_top_1 = 43
    rt_eye_btm_1 = 47
    rt_eye_top_2 = 44
    rt_eye_btm_2 = 46
    lt_eye_st_crn = 36
    lt_eye_end_crn = 39
    rt_eye_st_crn = 42
    rt_eye_end_crn = 45


LFT_EYE_TOP1 = 0
LFT_EYE_BTM1 = 1
LFT_EYE_TOP2 = 2
LFT_EYE_BTM2 = 3
RT_EYE_TOP1 = 4
RT_EYE_BTM1 = 5
RT_EYE_TOP2 = 6
RT_EYE_BTM2 = 7
LFT_EYE_ST_CNR = 8
LFT_EYE_END_CNR = 9
RT_EYE_ST_CNR = 10
RT_EYE_END_CNR = 11

EYE_STATE_INIT = -1
EYE_STATE_OPEN = 0
EYE_STATE_CLOSE = 1


class DlibProcess():
    def __init__(self):
        self.face_detector = dlib.get_frontal_face_detector()
        self.land_mark_predictor = dlib.shape_predictor('./models/shape_predictor_68_face_landmarks.dat')
        self.terminate = False
        self.frame = None
        self.face_obj = None
        self.land_mark_dict = {}
        self.window_name = 'landmark'
        self.blink_count = 0
        self.eye_state = EYE_STATE_INIT

        cv2.namedWindow(self.window_name)
        cv2.moveWindow(self.window_name, 100, 100)

    def set_image(self, image):
        self.frame = image.copy()
        # print('DlibProcess: image shape:', self.frame.shape)

    def get_faces(self):
        self.face_obj = None

        if self.frame is None:
            # print('DlibProcess(): get_faces: None objects detected')
            return False

        # convert image to gray scale
        gray_frame = cv2.cvtColor(self.frame, cv2.COLOR_BGR2GRAY)

        face_objs = self.face_detector(gray_frame, 1)
        print("DlibProcess: Number of faces detected: {}".format(len(face_objs)))
        if len(face_objs) != 1:
            # print("DlibProcess: Number of faces detected is not 1")
            return False

        self.face_obj = face_objs[0]
        return True

    def get_lanmark_data(self):
        self.land_mark_dict = {}

        if self.frame is None or self.face_obj is None:
            # print('DlibProcess(): get_lanmark_data: None objects detected')
            return False, self.blink_count

        # convert image to gray scale
        gray_frame = cv2.cvtColor(self.frame, cv2.COLOR_BGR2GRAY)

        land_mark_obj = self.land_mark_predictor(gray_frame, self.face_obj)

        # Left eye top/bottom x, y coordinates
        self.land_mark_dict[LFT_EYE_TOP1] = (land_mark_obj.part(LandMarkLoc.lt_eye_top_1.value).x,
                                             land_mark_obj.part(LandMarkLoc.lt_eye_top_1.value).y)
        self.land_mark_dict[LFT_EYE_BTM1] = (land_mark_obj.part(LandMarkLoc.lt_eye_btm_1.value).x,
                                             land_mark_obj.part(LandMarkLoc.lt_eye_btm_1.value).y)
        self.land_mark_dict[LFT_EYE_TOP2] = (land_mark_obj.part(LandMarkLoc.lt_eye_top_2.value).x,
                                             land_mark_obj.part(LandMarkLoc.lt_eye_top_2.value).y)
        self.land_mark_dict[LFT_EYE_BTM2] = (land_mark_obj.part(LandMarkLoc.lt_eye_btm_2.value).x,
                                             land_mark_obj.part(LandMarkLoc.lt_eye_btm_2.value).y)

        # Right eye top/bottom x, y coordinates
        self.land_mark_dict[RT_EYE_TOP1] = (land_mark_obj.part(LandMarkLoc.rt_eye_top_1.value).x,
                                            land_mark_obj.part(LandMarkLoc.rt_eye_top_1.value).y)
        self.land_mark_dict[RT_EYE_BTM1] = (land_mark_obj.part(LandMarkLoc.rt_eye_btm_1.value).x,
                                            land_mark_obj.part(LandMarkLoc.rt_eye_btm_1.value).y)
        self.land_mark_dict[RT_EYE_TOP2] = (land_mark_obj.part(LandMarkLoc.rt_eye_top_2.value).x,
                                            land_mark_obj.part(LandMarkLoc.rt_eye_top_2.value).y)
        self.land_mark_dict[RT_EYE_BTM2] = (land_mark_obj.part(LandMarkLoc.rt_eye_btm_2.value).x,
                                            land_mark_obj.part(LandMarkLoc.rt_eye_btm_2.value).y)

        # Eye corner x, y coordinates
        self.land_mark_dict[LFT_EYE_ST_CNR] = (land_mark_obj.part(LandMarkLoc.lt_eye_st_crn.value).x,
                                               land_mark_obj.part(LandMarkLoc.lt_eye_st_crn.value).y)
        self.land_mark_dict[LFT_EYE_END_CNR] = (land_mark_obj.part(LandMarkLoc.lt_eye_end_crn.value).x,
                                                land_mark_obj.part(LandMarkLoc.lt_eye_end_crn.value).y)
        self.land_mark_dict[RT_EYE_ST_CNR] = (land_mark_obj.part(LandMarkLoc.rt_eye_st_crn.value).x,
                                              land_mark_obj.part(LandMarkLoc.rt_eye_st_crn.value).y)
        self.land_mark_dict[RT_EYE_END_CNR] = (land_mark_obj.part(LandMarkLoc.rt_eye_end_crn.value).x,
                                               land_mark_obj.part(LandMarkLoc.rt_eye_end_crn.value).y)

        # Eye openness: vertical lid distances normalised by the eye width
        lt_h1 = land_mark_obj.part(LandMarkLoc.lt_eye_btm_1.value).y - land_mark_obj.part(LandMarkLoc.lt_eye_top_1.value).y
        lt_h2 = land_mark_obj.part(LandMarkLoc.lt_eye_btm_2.value).y - land_mark_obj.part(LandMarkLoc.lt_eye_top_2.value).y
        lt_width = self.land_mark_dict[LFT_EYE_END_CNR][0] - self.land_mark_dict[LFT_EYE_ST_CNR][0]
        lt_h1_ratio = lt_h1 / lt_width
        lt_h2_ratio = lt_h2 / lt_width

        # Update self.eye_state and self.blink_count
        eye_status = 'OPEN'
        if lt_h1_ratio <= 0.18:
            eye_status = 'CLOSE'
            if self.eye_state == EYE_STATE_OPEN:
                self.eye_state = EYE_STATE_CLOSE
                self.blink_count += 1
        else:
            if self.eye_state == EYE_STATE_INIT or self.eye_state == EYE_STATE_CLOSE:
                self.eye_state = EYE_STATE_OPEN

        print('LT1: ', lt_h1, 'LT2: ', lt_h2, 'lt_h1_ratio: ', lt_h1_ratio, 'lt_h2_ratio: ', lt_h2_ratio,
              'EYE status', eye_status, 'blink_count: ', self.blink_count)

        return True, self.blink_count

    def show_image(self):
        disp_img = self.frame.copy()

        # Draw a red marker if no landmarks were found, otherwise a green marker
        if len(self.land_mark_dict) == 0:
            disp_img = cv2.line(disp_img, (disp_img.shape[1] - 50, int(disp_img.shape[0] / 2)),
                                (disp_img.shape[1] - 50, int(disp_img.shape[0] / 2)), (0, 0, 255), 40)
        else:
            disp_img = cv2.line(disp_img, (disp_img.shape[1] - 50, int(disp_img.shape[0] / 2)),
                                (disp_img.shape[1] - 50, int(disp_img.shape[0] / 2)), (0, 255, 0), 40)

        cv2.putText(disp_img, "Blink Count {}".format(self.blink_count), (50, 30),
                    cv2.FONT_HERSHEY_SIMPLEX, .5, (255, 0, 0), 1, cv2.LINE_AA)

        for key, val in self.land_mark_dict.items():
            # value contains x, y coordinate points
            disp_img = cv2.line(disp_img, (int(val[0]), int(val[1])), (int(val[0]), int(val[1])), (0, 0, 255), 2)

        # print('disp_img.shape', disp_img.shape)
        disp_img = cv2.resize(disp_img, (int(disp_img.shape[0] * 0.75), int(disp_img.shape[1] * 0.75)))

        # Display the resulting frame
        cv2.imshow(self.window_name, disp_img)
        key = cv2.waitKey(1)
        if key == ord('e'):
            self.terminate = True

    def start_process(self, display_image):
        # create an object of Video Process
        vid_process = video_process.VideoProcess()

        status = vid_process.start_capture()
        if status == False:
            print('Error: DlibProcess: Failed to open camera')
            self.terminate = True

        while self.terminate == False:
            self.land_mark_dict = {}
            self.frame = None

            # Capture frame-by-frame
            status, frame = vid_process.get_frame()
            if not status:
                print('Error: DlibProcess: Failed to capture image')
                self.terminate = True
            else:
                # set the captured image for processing
                self.set_image(frame)

                # Got frame from camera. Get get_faces
                if self.get_faces() == True:
                    self.get_lanmark_data()

                self.show_image()

        vid_process.terminate_process()

    def get_blinkcount(self, duration=3):
        self.blink_count = 0

        # create an object of Video Process
        vid_process = video_process.VideoProcess()

        status = vid_process.start_capture()
        if not status:
            print('Error: DlibProcess: get_blinkcount: Failed to open camera')
            return status, self.blink_count

        start_time = time()
        while time() - start_time <= 1.0 * duration:
            self.land_mark_dict = {}
            self.frame = None

            # Capture frame-by-frame
            status, frame = vid_process.get_frame()
            if not status:
                print('Error: DlibProcess:get_blinkcount: Failed to capture image')
                self.blink_count = 0
                return status, self.blink_count
            else:
                # set the captured image for processing
                self.set_image(frame)

                # Got frame from camera. Get get_faces
                if self.get_faces() == True:
                    status = self.get_lanmark_data()

                self.show_image()

        vid_process.terminate_process()

        return True, self.blink_count

Main_view.py

from tkinter import *
from tkinter import ttk
import dlib_process
from subprocess import call
from gtts import gTTS
import os
import requests                      # needed by send_textbelt_sms below
from PIL import Image, ImageTk
from twilio.rest import Client
from tempfile import TemporaryFile


class MainView(ttk.Frame):
    data_matrix = []
    image_reference_list = []
    current_option = ""
    back_option = ""
    next_option = ""
    rows = 0
    selected_items = ""
    blink_count = 0
    selected_info_image = None
    next_can_img = None
    next_can_txt = None
    curr_can_txt = None
    curr_can_img = None

    # ---- callback function for button
    def calculate(self):
        if self.app_start == True:
            self.get_audio_text("Please blink to select your choice.")
        self.app_start = False
        # create an object of Video Process
        self.stop_cycle = False
        self.land_mark_process = dlib_process.DlibProcess()
        # start time limited blink detection
        self.status, self.blink_count = self.land_mark_process.get_blinkcount()
        if self.status == True:
            print("==========Got Blink count: ", self.blink_count)
            self.get_next_option_onblink()
        else:
            print("Failed to get Blink count")

        if self.stop_cycle == False:
            root.after(100, self.calculate)

    def update(self):
        print('update function', self.blink_count)
        root.after(300, self.update)

    def get_next_option_onblink(self):
        print("In get_next_option_onblink ::::, blink_count= ", self.blink_count)
        for i in range(self.rows):
            print("current_option = ", self.current_option)
            if self.data_matrix[i][0] == self.current_option:
                self.next_option = self.data_matrix[i][1]
                self.back_option = self.data_matrix[i][2]
                print("In get_next_option_onblink ::::, Back Option= ", self.back_option,
                      ", Current_Option = ", self.current_option, ", next_option= ", self.next_option)
                if self.back_option == "meal" or self.back_option == "health_assistance" \
                        or self.back_option == "mobility_assistance" or self.back_option == "emergency":
                    self.selected_items = ""
                    self.info_labelimg.configure(image='')

                if self.blink_count > 0:
                    if "back" in self.current_option:
                        self.selected_items = ""
                    else:
                        self.selected_items = self.current_option
                    self.set_selected_info()

                    self.back_option = self.current_option
                    self.current_option = self.next_option

                    if self.current_option == "request":
                        text = self.back_option.replace("_", " ").title() + " Requested"
                        audio_text = "Your request for " + self.back_option.replace("_", " ").title() + \
                                     " has been sent. Your help is on the way. Thank you."
                        self.requested_msg.set(audio_text)
                        print("get requested msg = ", self.requested_msg.get())
                        self.get_audio_text(audio_text)  # remove comment
                        if "Emergency" in text:
                            self.send_call()
                        self.set_items(self.back_option, self.current_option, text)

                        self.stop_cycle = True
                        # reset to the main screen after 5 seconds
                        root.after(5000, lambda: self.set_item_mainscreen("meal"))
                    else:
                        text = self.current_option.replace("_", " ").title()
                        self.set_items(self.back_option, self.current_option, text)

                elif self.blink_count == 0:
                    if "back" in self.current_option:
                        self.selected_items = self.current_option[:-5]
                        self.set_selected_info()
                    # text = "Back"
                    if "back" in self.back_option:
                        text = "Back"
                    else:
                        text = self.back_option.replace("_", " ").title()

                    self.set_items(self.current_option, self.back_option, text)
                    self.current_option = self.back_option
                break

    def set_selected_info(self):
        if self.selected_items != "":
            self.selected_info.set("You Selected " + self.selected_items.replace("_", " ").title())
            self.selected_info_image = PhotoImage(file="./gif_images/" + self.selected_items + ".gif")
            self.info_labelimg.configure(image=self.selected_info_image)
            self.info_labelimg.image = self.selected_info_image
        else:
            self.selected_info.set("")
            self.info_labelimg.configure(image='')

    @staticmethod
    def send_textbelt_sms(phone, msg, apikey):
        """
        Sends an SMS through the Textbelt API.
        :param phone: Phone number to send the SMS to.
        :param msg: SMS message. Should not be more than 160 characters.
        :param apikey: Your textbelt API key. 'textbelt' can be used for free for 1 SMS per day.
        :returns: True if the SMS could be sent. False otherwise.
        :rtype: bool
        """
        result = True
        json_success = False
        # Attempt to send the SMS through textbelt's API and a requests instance.
        try:
            resp = requests.post('https://textbelt.com/text', {
                'phone': phone,
                'message': msg,
                'key': apikey,
            })
        except:
            result = False
        # Extract boolean API result
        if result:
            try:
                json_success = resp.json()["success"]
            except:
                result = False
        # Evaluate if the SMS was successfully sent.
        if result:
            if not json_success:
                result = False
        # Give the result back to the caller.
        return result

    def main(self):
        """
        Send an SMS message for testing purposes.
        """
        phone = '+15558838530'  # <-- Enter your own phone number here
        smsmsg = "You Selected " + self.selected_items.replace("_", " ").title()
        apikey = 'textbelt'  # <-- Change to your API key, if desired
        # Attempt to send the SMS message.
        if self.send_textbelt_sms(phone, smsmsg, apikey):
            print('SMS message successfully sent!')
        else:
            print('Could not send SMS message.')

def get_audio_text(self, audio_text):


language = 'en'
myobj = gTTS(text=audio_text, lang=language, slow=False)
myobj.save("./audio_text.mp3")
# Playing the converted file
call(["ffplay", "-nodisp", "-autoexit", "./audio_text.mp3"])

def send_call(self):
# Twilio account credentials and phone numbers used to place the emergency alert call
account_sid = "AC41f98e886f567f58cec9b61e2d5cbf39"
auth_token = "e6066bc799a63a9251f687404f60dd57"
client = Client(account_sid, auth_token)
twilio_call = client.calls.create(to="+18056896547", from_="+18447477770", url="https://api.rcqatol.com/rentcafeapi.aspx?requesttype=twiliotest")
print(twilio_call.status)

# ---- read mapping file and store into matrix form


def get_data_matrix(self):
with open("./data_mapping", 'r') as f:
for line in f:
row = line.rstrip().split(', ')
self.data_matrix.append(row)
self.rows += 1
return self.data_matrix, self.rows
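Each line of the data_mapping file is split on ", " into a row of the form [current_option, next_option, back_option], which drives the circular navigation between screens in get_next_option_onblink. The actual mapping file is not reproduced in this report; the rows below are only a hypothetical illustration of the expected format, using option names taken from elsewhere in the listing (the navigation order shown is an assumption):

meal, health_assistance, meal_back
food, water, meal
water, food, meal
emergency, request, emergency_back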

# -- reset texts and information


def set_item_mainscreen(self, option):

self.selected_items = ""
self.selected_info.set("")
self.requested_msg.set("")
self.info_labelimg.configure(image='')
self.can.delete(self.next_can_img)
self.can.delete(self.next_can_txt)

self.can.delete(self.curr_can_img)
self.can.delete(self.curr_can_txt)
self.current_option = self.data_matrix[0][0]

self.image = PhotoImage(file="./gif_images/" + option + ".gif")

img_height = self.image.height()
img_width = self.image.width()
self.curr_can_img = self.can.create_image(450, 190, anchor=CENTER, image=self.image, tags="image")
self.curr_can_txt = self.can.create_text(img_width + 100, img_height - 175, anchor=CENTER, fill='white', font=("Times", 22, "bold"))
self.can.itemconfigure(self.curr_can_txt, text=option.replace("_", " ").title())
self.current_option = option
self.stop_cycle = False

def set_items(self, option, next_option, text):


distx = 10
if next_option != "":
for i in range(100):
self.can.move(self.curr_can_img, distx, 0)
self.can.move(self.curr_can_txt, distx, 0)
if i == 45:
self.image_reference_list.append(PhotoImage(file="./gif_images/" + next_option + ".gif"))
self.next_can_img = self.can.create_image(-self.image_reference_list[-1].width(), self.can.winfo_height()/3, image=self.image_reference_list[-1], anchor=NW)
self.next_can_txt = self.can.create_text(-self.image_reference_list[-1].width() + (self.image_reference_list[-1].width() / 2), self.image_reference_list[-1].height() - 150, text=text, fill='white', font=("Times", 22, "bold"))
if i >= 45:
self.can.move(self.next_can_img, distx, 0)
self.can.move(self.next_can_txt, distx, 0)
root.update()  # update the display
root.after(20)  # wait 20 ms between animation steps

self.curr_can_img = self.next_can_img
self.curr_can_txt = self.next_can_txt

def __init__(self, master):


ttk.Frame.__init__(self)
self.selected_info = StringVar()
self.requested_msg = StringVar()
self.app_start = True
# ---- set initial screen
self.data_matrix, rows = self.get_data_matrix()
self.current_option = self.data_matrix[0][0]
frame_style = ttk.Style()
frame_style.configure('Black.TLabel', background="#003455")  # other shades tried: #013243, #002366
infostyle = ttk.Style()
infostyle.configure('Black.TLabelframe', background="#000099")

# ---- top frame to set help message

top_frame = ttk.Frame(master, padding="15 15 15 15", style='Black.TLabel')
top_frame.pack(side=TOP, fill="both")
self.logo = PhotoImage(file="./gif_images/logo.png")
main_label = ttk.Label(top_frame, text="", font=("Helvetica", 12, "bold italic"), image=self.logo, background="#003455")
main_label.pack(fill="none", expand=True)
# ---- info frame to indicate what option is selected
info_frame = ttk.Frame(master, padding="1 1 1 1", style='Black.TLabelframe')
info_frame.pack(side=TOP, fill="both")

# ---- set canvas and info frame background


self.tempimg = Image.open("./gif_images/background.gif")
self.tempimg = self.tempimg.resize((root.winfo_screenwidth(), root.winfo_screenheight()))
self.background_image = ImageTk.PhotoImage(image=self.tempimg)

self.frmtempimg = Image.open("./gif_images/frame_backimg1.gif")
self.frmtempimg = self.frmtempimg.resize((root.winfo_screenwidth(), root.winfo_screenheight()))
self.frame_back_image = ImageTk.PhotoImage(image=self.frmtempimg)

self.info_label_backgrnd = ttk.Label(info_frame, image=self.frame_back_image, border=0)
self.info_label_backgrnd.pack(fill=BOTH, expand=True)

self.info_label = ttk.Label(self.info_label_backgrnd, text="", textvariable=self.selected_info, font=("Times", 20, "bold"), anchor=CENTER, foreground="white", background="#154360", border=0)
self.info_label.grid(column=1, row=0, sticky=(W, E))
self.info_labelimg = ttk.Label(self.info_label_backgrnd, image=self.selected_info_image, border=0)
self.info_labelimg.grid(column=2, row=0, sticky=(N, W, E, S))

root.columnconfigure(3, weight=1)
root.rowconfigure(0, weight=1)
for child in self.info_label_backgrnd.winfo_children():
child.grid_configure(padx=10, pady=10)

self.can = Canvas(self)
self.can.pack(side=TOP, fill="both", expand=True)
self.can.config(bd = 4, relief = "sunken")

self.image_obj = self.can.create_image((self.tempimg.size[0] // 2), (self.tempimg.size[1] // 2), image=self.background_image, anchor=CENTER)

self.set_item_mainscreen("meal")

# ---- message_frame to show final message
message_frame = ttk.Frame(master, padding="15 15 15 15", style='Black.TLabel')
message_frame.pack(side=BOTTOM, fill="both")
message_label = ttk.Label(message_frame, textvariable=self.requested_msg, foreground="white", background="#003455", font=("Times", 17, "bold"))
message_label.pack(fill="none", expand=True)

if __name__ == "__main__":
root = Tk()

main = MainView(root)
root.title("Health Talk")

root.geometry('%dx%d' % (root.winfo_screenwidth() / 1.6, root.winfo_screenheight() / 1.2))

root.update()  # update so the actual geometry of root is available


main.pack(side="top", fill="both", expand=True)
root.after(2000, main.calculate)
root.mainloop()

Output Screenshots

Screenshot 1: For Food

Screenshot 2: For Water

Screenshot 3: For Emergency

Screenshot 4: For Health Assistance

Screenshot 5: Output

Future Enhancement:

The constant demand to improve the daily living standards of paralysis patients serves as a motivation to develop newer technology. Tasks once performed by large traditional computers are now solved with smaller smart devices. Paralysis is the complete loss of muscle function in any part of the body; it occurs when there is a problem with the passage of messages between the muscles and the brain. The main objective is to design a real-time interactive system that can assist paralysis patients in controlling appliances such as lights and fans. In addition, it can play pre-recorded audio messages through a predefined number of eye blinks, and it can alert the doctor or a concerned person by sending an SMS in case of emergency using the eye blink sensor. The eye blink sensor is able to distinguish an intentional blink from a normal blink, which allows paralysis patients, especially tetraplegic patients, to operate their home devices easily without any help.
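
As an illustration only, the following minimal Python sketch shows how a counted number of intentional blinks could be mapped to such appliance and alert actions in a future version. The helper names switch_appliance, play_audio and send_sms_alert, and the particular blink-count-to-action mapping, are assumptions made for this example and are not part of the implemented system.

# Illustrative sketch only (not part of the implemented system): dispatch an
# action based on the number of intentional blinks detected in one window.
def handle_blinks(blink_count, switch_appliance, play_audio, send_sms_alert):
    if blink_count == 1:
        switch_appliance("light", on=True)      # one blink: switch the light on
    elif blink_count == 2:
        switch_appliance("fan", on=True)        # two blinks: switch the fan on
    elif blink_count == 3:
        play_audio("./audio_text.mp3")          # three blinks: play a pre-recorded message
    elif blink_count >= 4:
        send_sms_alert("Emergency: immediate assistance required")  # many blinks: SMS alert

# Example wiring with simple print-based placeholders:
if __name__ == "__main__":
    handle_blinks(2,
                  switch_appliance=lambda name, on: print(name, "switched", "ON" if on else "OFF"),
                  play_audio=lambda path: print("playing", path),
                  send_sms_alert=lambda msg: print("SMS:", msg))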

Conclusion:

Although blink detection systems exist for other purposes, an implementation of a blink detection system with the end use of controlling appliances and communicating requests has not previously been accomplished. While the system is intended to assist paralyzed and physically challenged people, it can certainly be used by all types of individuals. The main challenge in implementing the system is developing a robust real-time blink detection algorithm. Many algorithms have been developed for this purpose, some more accurate than others. This work presented a blink detection system based on online template matching: the first phase performs blink detection, and the second phase counts the blinks and uses them to control appliances through a microcontroller.

By enabling paralyzed people to regain control of even a small part of their lives, the system offers them some level of independence, and the helpers assigned to tend to them through the day can be afforded a break. The system needs only moderate processing power, making it suitable for practical use; for continuous video input, a laptop with a built-in webcam or a USB camera will suffice.
The system is limited by the efficiency of the blink detection algorithm, and that efficiency falls further under poor lighting conditions. Since the initialization phase of the algorithm is based on differencing between consecutive frames, background movement in the frame may lead to inaccurate operation.
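
As an illustration of this initialization idea, the following is a minimal sketch (not the project's actual implementation) of frame differencing restricted to a detected face region, using OpenCV's Haar cascade face detector; the threshold values and the fixed number of frames are assumptions chosen only for the example.

# Minimal sketch (illustrative assumptions only): difference consecutive frames
# inside the detected face region so that background motion outside the face
# does not disturb blink detection.
import cv2

cap = cv2.VideoCapture(0)
face_cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
prev_gray = None

for _ in range(300):                            # process a short burst of frames
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5)
    if len(faces) > 0 and prev_gray is not None:
        x, y, w, h = faces[0]                   # restrict differencing to the first detected face
        diff = cv2.absdiff(gray[y:y+h, x:x+w], prev_gray[y:y+h, x:x+w])
        motion = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)[1]
        if cv2.countNonZero(motion) > 0.02 * w * h:   # assumed motion threshold
            print("Candidate eye/blink motion detected inside the face region")
    prev_gray = gray

cap.release()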
Typically, background movement causes non-eye pairs to be detected as eye pairs. This is overcome to some extent by limiting the search region to the face of an individual, by running a face tracking algorithm prior to blink detection; however, this in turn can reduce blink detection efficiency. By giving the user the option to choose between the system with and without face tracking, a level of flexibility can be reached. The application of the blink detection system is not limited to the control of appliances; it can also be used for a variety of other functions, such as playback of audio distress messages over an intercom system. Future applications may include playback of video or audio files through eye blinks and making a VoIP call to play a distress message. In the future, the system could be implemented on a Digital Signal Processor, making it a truly embedded system that could be used as a standalone device without the need for a laptop or desktop PC.

