
ABSTRACT

After the outbreak of the worldwide COVID-19 pandemic, a severe need arose for
protection mechanisms, the face mask being the primary one. The basic aim of
the project is to detect the presence of a face mask on human faces in live
streaming video as well as in images. We have used deep learning to develop
our face detector model. The architecture used for object detection is the
Single Shot Detector (SSD), chosen for its good accuracy and high speed.
Alongside this, we have used basic concepts of transfer learning in neural
networks to finally output the presence or absence of a face mask in an image
or a video stream. Experimental results show that our model performs well on
the test data, with 100% precision and 99% recall.
INDEX

1. INTRODUCTION
1.1 INTRODUCTION TO PROJECT
2. SYSTEM ANALYSIS
2.1 INTRODUCTION
2.2 STUDY OF THE SYSTEM
2.3 HARDWARE AND SOFTWARE REQUIREMENTS
2.4 PROPOSED SYSTEM
3. FEASIBILITY REPORT
3.1 TECHNICAL FEASIBILITY
3.2 OPERATIONAL FEASIBILITY
3.3 ECONOMICAL FEASIBILITY
4. SOFTWARE REQUIREMENT SPECIFICATIONS
4.1 FUNCTIONAL REQUIREMENTS
4.2 PERFORMANCE REQUIREMENTS
4.3 SOFTWARE REQUIREMENTS
5. SYSTEM DESIGN

5.1 INTRODUCTION

5.2 SYSTEM WORK FLOW

5.3 UML DIAGRAMS

6. IMPLEMENTATION

6.1 SOURCE CODE

6.2 SCREENSHOTS

7. SYSTEM TESTING

8. CONCLUSION
INTRODUCTION
The year 2020 has shown mankind a mind-boggling series of events, amongst
which the COVID-19 pandemic is the most life-changing one, having startled the
world since the year began. Affecting the health and lives of the masses,
COVID-19 has called for strict measures to be followed in order to prevent the
spread of the disease. From the most basic hygiene standards to treatments in
hospitals, people are doing all they can for their own and society’s safety;
face masks are one piece of personal protective equipment. People wear face
masks once they step out of their homes, and authorities strictly ensure that
people wear face masks while they are in groups and public places. A strategy
is needed to monitor that people are following this basic safety principle,
and a face mask detector system can be implemented to check this. Face mask
detection means identifying whether a person is wearing a mask or not. The
first step in recognizing the presence of a mask is to detect the face, which
divides the strategy into two parts: detecting faces and detecting masks on
those faces. Face detection is one application of object detection and can be
used in many areas such as security, biometrics and law enforcement. Many
detector systems have been developed and deployed around the world. However,
all this science needs optimization: a better, more precise detector, because
the world cannot afford any further increase in corona cases.

In this project, we develop a face mask detector that is able to distinguish
between faces with masks and faces without masks. In this report, we propose a
detector which employs SSD for face detection and a neural network to detect
the presence of a face mask. The algorithm is applied to images, videos and
live video streams.
System Analysis

2.1 INTRODUCTION

Software Development Life Cycle:

There are various software development approaches defined and designed that
are employed during the development of software; these approaches are also
referred to as "Software Development Process Models". Each process model
follows a particular life cycle in order to ensure success in the process of
software development.

Requirements:

Business requirements are gathered in this phase. This phase is the main focus
of the project managers and stakeholders. Meetings with managers, stakeholders
and users are held in order to determine the requirements. Who is going to use
the system? How will they use the system? What data should be input into the
system? What data should be output by the system? These are the general
questions that get answered during the requirements gathering phase. This
phase produces a big list of functionality that the system should provide,
describing the functions the system should perform, the business logic that
processes data, what data is stored and used by the system, and how the user
interface should work. The overall result describes the system as a whole and
what it does, not how it will actually do it.

Design

The software system design is produced from the results of the requirements
phase. Architects have the ball in their court during this phase, and this is
where their focus lies. This is where the details of how the system will work
are produced. Architecture, including hardware and software, communication,
and software design (UML is produced here) are all part of the deliverables of
the design phase.

Implementation

Code is produced from the deliverables of the design phase during
implementation, and this is the longest phase of the software development life
cycle. For a developer, this is the main focus of the life cycle because this
is where the code is produced. Implementation may overlap with both the design
and testing phases. Many tools exist (CASE tools) to automate the production
of code using information gathered and produced during the design phase.

Testing

During testing, the implementation is tested against the requirements to make
sure that the product actually solves the needs addressed and gathered during
the requirements phase. Unit tests and system/acceptance tests are done during
this phase. Unit tests act on a specific component of the system, while system
tests act on the system as a whole.

So, in a nutshell, that is a very basic overview of the general software
development life cycle model. Now let’s delve into some of the traditional and
widely used variations.

SDLC METHODOLOGIES

This document plays a vital role in the development life cycle (SDLC), as it
describes the complete requirements of the system. It is meant for use by the
developers and will be the basis during the testing phase. Any changes made to
the requirements in the future will have to go through a formal change
approval process.

The SPIRAL MODEL was defined by Barry Boehm in his 1988 article, “A Spiral
Model of Software Development and Enhancement”. This model was not the first
to discuss iterative development, but it was the first to explain why
iteration matters.

As originally envisioned, the iterations were typically 6 months to 2 years long.


Each phase starts with a design goal and ends with a client reviewing the
progress thus far. Analysis and engineering efforts are applied at each phase of
the project, with an eye toward the end goal of the project.

The following diagram shows how the spiral model works:

The steps for Spiral Model can be generalized as follows:

 The new system requirements are defined in as much detail as possible.
This usually involves interviewing a number of users representing all the
external or internal users and other aspects of the existing system.

 A preliminary design is created for the new system.

 A first prototype of the new system is constructed from the


preliminary design. This is usually a scaled-down system, and
represents an approximation of the characteristics of the final product.
 A second prototype is evolved by a fourfold procedure:

1. Evaluating the first prototype in terms of its strengths, weaknesses, and
risks.

2. Defining the requirements of the second prototype.

3. Planning and designing the second prototype.

4. Constructing and testing the second prototype.

 At the customer’s option, the entire project can be aborted if the risk is
deemed too great. Risk factors might involve development cost overruns,
operating-cost miscalculation, or any other factor that could, in the
customer’s judgment, result in a less-than-satisfactory final product.

 The existing prototype is evaluated in the same manner as was the


previous prototype, and if necessary, another prototype is developed
from it according to the fourfold procedure outlined above.

 The preceding steps are iterated until the customer is satisfied that the
refined prototype represents the final product desired.

 The final system is constructed, based on the refined prototype.

 The final system is thoroughly evaluated and tested. Routine maintenance is
carried out on a continuing basis to prevent large-scale failures and to
minimize downtime.

2.2 STUDY OF THE SYSTEM

The dataset which we have used consists of 3835 total images, out of which
1916 are of masked faces and 1919 are of unmasked faces. All the images are
actual images extracted from the Bing Search API, Kaggle datasets and the RMFD
dataset, in roughly equal proportion from each of the three sources. The
images cover diverse races, i.e., Asian, Caucasian, etc. The near-equal
proportion of masked to unmasked faces indicates that the dataset is balanced.
We need to split our dataset into three parts: training dataset, test dataset
and validation dataset. The purpose of splitting the data is to avoid
overfitting, which is paying attention to minor details/noise that are not
necessary and only optimize the training dataset accuracy. We need a model
that performs well on a dataset that it has never seen (test data), which is
called generalization. The training set is the actual subset of the dataset
that we use to train the model.

The model observes and learns from this data and then optimizes its
parameters. The validation dataset is used to select hyperparameters (learning
rate, regularization parameters). When the model is performing well enough on
our validation dataset, we can stop learning using the training dataset. The
test set is the remaining subset of data, used to provide an unbiased
evaluation of the final model fit on the training dataset. Data is split as
per a split ratio which is highly dependent on the type of model we are
building and on the dataset itself. If our dataset and model are such that a
lot of training is required, then we use a larger chunk of the data just for
training, which is our case. If the model has a lot of hyperparameters that
can be tuned, then we need a larger validation dataset. Models with a smaller
number of hyperparameters are easy to tune and update, and so we can take a
smaller validation dataset. In our approach, we have dedicated 80% of the
dataset as the training data and the remaining 20% as the testing data, which
makes the split ratio 0.8:0.2 of train to test set. Out of the training data,
we have used 20% as a validation dataset. Overall, 64% of the dataset is used
for training, 16% for validation and 20% for testing.
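
The split described above can be reproduced with scikit-learn's
train_test_split. The following is a minimal sketch, assuming the images and
labels have already been loaded into arrays (the variable names images and
labels are illustrative, not taken from the project code):

    from sklearn.model_selection import train_test_split

    # 80% train / 20% test, stratified so both classes stay balanced
    train_x, test_x, train_y, test_y = train_test_split(
        images, labels, test_size=0.20, stratify=labels, random_state=42)

    # carve 20% of the training data out as a validation set
    # (20% of 80% = 16% of the full dataset, matching the figures above)
    train_x, val_x, train_y, val_y = train_test_split(
        train_x, train_y, test_size=0.20, stratify=train_y, random_state=42)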

2.3 Hardware and Software Requirements

Hardware Requirements:

• RAM: 4GB and Higher

• Processor: Intel i3 and above

• Hard Disk: 500GB: Minimum

Software Requirements:

• OS: Windows

• Python IDE: python 2.7.x and above

• Pycharm IDE
• Setuptools and pip to be installed for Python 3.6.x and above

2.4 Proposed System:

A face mask detector system can be implemented to check that people are
wearing masks. Face mask detection means identifying whether a person is
wearing a mask or not. The first step in recognizing the presence of a mask is
to detect the face, which divides the strategy into two parts: detecting faces
and detecting masks on those faces. Face detection is one application of
object detection and can be used in many areas such as security, biometrics
and law enforcement. Many detector systems have been developed and deployed
around the world. However, all this science needs optimization: a better, more
precise detector, because the world cannot afford any further increase in
corona cases.

3. Feasibility Report

The preliminary investigation examines project feasibility, the likelihood
that the system will be useful to the organization. The main objective of the
feasibility study is to test the technical, operational and economical
feasibility of adding new modules and debugging the old running system. Any
system is feasible given unlimited resources and infinite time. There are
three aspects in the feasibility study portion of the preliminary
investigation:

 Technical Feasibility
 Operational Feasibility
 Economic Feasibility

3.1. TECHNICAL FEASIBILITY

The technical issue usually raised during the feasibility stage of the
investigation includes the following:

 Does the necessary technology exist to do what is suggested?


 Does the proposed equipment have the technical capacity to hold the data
required to use the new system?
 Will the proposed system provide adequate response to inquiries, regardless
of the number or location of users?
 Can the system be upgraded if developed?
 Are there technical guarantees of accuracy, reliability, ease of access and
data security?

Earlier, no system existed to cater to the needs of the ‘Secure Infrastructure
Implementation System’. The current system developed is technically feasible.
It is a web-based user interface for audit workflow at NIC-CSD, and thus
provides easy access to the users. The database’s purpose is to create,
establish and maintain a workflow among various entities in order to
facilitate all concerned users in their various capacities or roles.
Permissions would be granted to users based on the roles specified. Therefore,
it provides the technical guarantee of accuracy, reliability and security. The
software and hardware requirements for the development of this project are few
and are already available in-house at NIC or are available free as open
source. The work for the project is done with the current equipment and
existing software technology. The necessary bandwidth exists for providing
fast feedback to the users irrespective of the number of users using the
system.

3.2. OPERATIONAL FEASIBILITY

Proposed projects are beneficial only if they can be turned into information
systems that meet the organization’s operating requirements. Operational
feasibility aspects of the project are to be taken as an important part of the
project implementation. Some of the important issues raised to test the
operational feasibility of a project include the following:

 Is there sufficient support for the management from the users?

 Will the system be used and work properly once it is developed and
implemented?

 Will there be any resistance from the users that will undermine the
possible application benefits?

This system is targeted to be in accordance with the above-mentioned issues.
Beforehand, the management issues and user requirements have been taken into
consideration. So there is no question of resistance from the users that can
undermine the possible application benefits.

The well-planned design would ensure the optimal utilization of the computer
resources and would help in the improvement of performance status.

3.3. ECONOMICAL FEASIBILITY

A system that can be developed technically, and that will be used if
installed, must still be a good investment for the organization. In the
economical feasibility study, the development cost of creating the system is
evaluated against the ultimate benefit derived from the new system. Financial
benefits must equal or exceed the costs.

The system is economically feasible. It does not require any additional
hardware or software. Since the interface for this system is developed using
the existing resources and technologies available at NIC, there is only
nominal expenditure, and economical feasibility is certain.

SOFTWARE REQUIREMENT SPECIFICATIONS

4.1 Functional Requirements

Outputs from computer systems are required primarily to communicate the
results of processing to users. They are also used to provide a permanent copy
of the results for later consultation. The various types of outputs in general
are:

 External outputs, whose destination is outside the organization.

 Internal outputs, whose destination is within the organization and which
serve as the user’s main interface with the computer.

 Operational outputs, whose use is purely within the computer department.

 Interface outputs, which involve the user in communicating directly.

 Understanding the user’s preferences, expertise level and business
requirements through a friendly questionnaire.

 Input data can be in four different forms: relational DB, text files, .xls
and XML files. For testing and demo you can choose data from any domain.
User-B can provide business data as input.
Non-Functional Requirements

1. Secure access to confidential data (user’s details). SSL can be used.

2. 24 x 7 availability.

3. Better component design to get better performance at peak time.

4. A flexible service-based architecture will be highly desirable for future
extension.

SOFTWARE REQUIREMENTS

 Language: Python
 Artificial Intelligence
 Facial Recognition
 Machine Learning libraries
IMPLEMENTATION (PYTHON):

What Is A Script?

Up to this point, I have concentrated on the interactive programming
capability of Python. This is a very useful capability that allows you to type
in a program and have it executed immediately in an interactive mode.

Scripts are reusable

Basically, a script is a text file containing the statements that comprise a
Python program. Once you have created the script, you can execute it over and
over without having to retype it each time.
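
For example, a minimal script (the file name hello.py is just an
illustration) can be saved once and then run any number of times:

    # hello.py -- a minimal Python script
    print("Hello, world!")

Running it from the command line is a single command:

    python hello.py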

Scripts are editable

Perhaps more importantly, you can make different versions of the script by
modifying the statements from one file to the next using a text editor. Then
you can execute each of the individual versions. In this way, it is easy to
create different programs with a minimum amount of typing.
You will need a text editor

Just about any text editor will suffice for creating Python script files.

You can use Microsoft Notepad, Microsoft WordPad, Microsoft Word, or just
about any word processor if you want to.

Difference between a script and a program

Script:

Scripts are distinct from the core code of the application, which is usually
written in a different language, and are often created or at least modified by
the end user. Scripts are often interpreted from source code or bytecode,
whereas the applications they control are traditionally compiled to native
machine code.

Program:

A program has an executable form that the computer can use directly to execute
the instructions. The same program in its human-readable source code form is
that from which executable programs are derived (e.g., compiled).
Python

What is Python? Chances are you are asking yourself this. You may have found
this book because you want to learn to program but don’t know anything about
programming languages. Or you may have heard of programming languages like C
and want to know what Python is and how it compares to “big name” languages.
Hopefully I can explain it for you.

Python concepts

If you’re not interested in the hows and whys of Python, feel free to skip to
the next chapter. In this chapter I will try to explain to the reader why I
think Python is one of the best languages available and why it’s a great one
to start programming with.

• Open source general-purpose language.

• Object Oriented, Procedural, Functional

• Easy to interface with C

• Easy-ish to interface with C++ (via SWIG)

• Great interactive environment

Python is a high-level, interpreted, interactive and object-oriented scripting
language. Python is designed to be highly readable. It frequently uses English
keywords where other languages use punctuation, and it has fewer syntactical
constructions than other languages.
 Python is Interpreted − Python is processed at runtime by the interpreter.
You do not need to compile your program before executing it. This is
similar to Perl and PHP.

 Python is Interactive − You can actually sit at a Python prompt and interact
with the interpreter directly to write your programs.

 Python is Object-Oriented − Python supports Object-Oriented style or


technique of programming that encapsulates code within objects.

 Python is a Beginner's Language − Python is a great language for the


beginner-level programmers and supports the development of a wide
range of applications from simple text processing to WWW browsers to
games.

History of Python

Python was developed by Guido van Rossum in the late eighties and early
nineties at the National Research Institute for Mathematics and Computer
Science in the Netherlands.

Python is derived from many other languages, including ABC, Modula-3, C, C++,
Algol-68, SmallTalk, and Unix shell and other scripting languages.

Python is copyrighted. Python source code is available under an open-source
license (the Python Software Foundation License) that permits free use and
distribution.

Python is now maintained by a core development team at the institute, although


Guido van Rossum still holds a vital role in directing its progress.
Python Features

Python's features include −

 Easy-to-learn − Python has few keywords, simple structure, and a clearly


defined syntax. This allows the student to pick up the language quickly.

 Easy-to-read − Python code is more clearly defined and visible to the eyes.

 Easy-to-maintain − Python's source code is fairly easy-to-maintain.

 A broad standard library − Python's bulk of the library is very portable and
cross-platform compatible on UNIX, Windows, and Macintosh.

 Interactive Mode − Python has support for an interactive mode which


allows interactive testing and debugging of snippets of code.

 Portable − Python can run on a wide variety of hardware platforms and has
the same interface on all platforms.

 Extendable − You can add low-level modules to the Python interpreter.


These modules enable programmers to add to or customize their tools to
be more efficient.

 Databases − Python provides interfaces to all major commercial databases.

 GUI Programming − Python supports GUI applications that can be created


and ported to many system calls, libraries and windows systems, such as
Windows MFC, Macintosh, and the X Window system of Unix.

 Scalable − Python provides a better structure and support for large


programs than shell scripting.
Apart from the above-mentioned features, Python has a big list of good features,
few are listed below −

 It supports functional and structured programming methods as well as


OOP.

 It can be used as a scripting language or can be compiled to byte-code for


building large applications.

 It provides very high-level dynamic data types and supports dynamic type
checking.

 It supports automatic garbage collection.

 It can be easily integrated with C, C++, COM, ActiveX, CORBA, and Java.

Dynamic vs Static Types

Python is a dynamically typed language. Many other languages are statically
typed, such as C. A statically typed language requires the programmer to
explicitly tell the computer what type of “thing” each data value is.

For example, in C if you had a variable that was to contain the price of something,
you would have to declare the variable as a “float” type.

This tells the compiler that the only data that can be used for that variable must
be a floating point number, i.e. a number with a decimal point.

If any other data value was assigned to that variable, the compiler would give an
error when trying to compile the program.

Python, however, doesn’t require this. You simply give your variables names and
assign values to them. The interpreter takes care of keeping track of what kinds of
objects your program is using. This also means that you can change the size of the
values as you develop the program. Say you have another decimal number (a.k.a.
a floating point number) you need in your program.

With a static typed language, you have to decide the memory size the variable can
take when you first initialize that variable. A double is a floating point value that
can handle a much larger number than a normal float (the actual memory sizes
depend on the operating environment).

If you declare a variable to be a float but later on assign a value that is too big to
it, your program will fail; you will have to go back and change that variable to be a
double.

With Python, it doesn’t matter. You simply give it whatever number you want
and Python will take care of manipulating it as needed. It even works for derived
values.

For example, say you are dividing two numbers, one a floating point number and
the other an integer. Python realizes that it’s more accurate to keep track of
decimals, so it automatically calculates the result as a floating point
number.
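
A short sketch of this behaviour (the values are arbitrary, for illustration
only):

    price = 1.99          # Python infers a float; no declaration needed
    price = "on sale"     # the same name may later hold a string
    result = 7.0 / 2      # dividing a float by an integer yields a float
    print(type(price), result)   # -> <class 'str'> 3.5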

4.3.2 ANALYSIS OF THE EUCLIDEAN ALGORITHM

Suppose a and b are integers, not both zero. The greatest common


divisor (gcd, for short) of a and b, written (a,b) or gcd(a,b), is the largest
positive integer that divides both a and b. We will be concerned almost
exclusively with the case where a and b are non-negative, but the theory goes
through with essentially no change in case a or b is negative. The
notation (a,b) might be somewhat confusing, since it is also used to denote
ordered pairs and open intervals. The meaning is usually clear from the context.

It may be worth reminding the reader of the details of the Euclidean
algorithm. Let u_0 and u_1 be positive integers.
The algorithm performs the divisions

u_0 = q_0 u_1 + u_2
u_1 = q_1 u_2 + u_3
...
u_{n-1} = q_{n-1} u_n + u_{n+1},

where 0 = u_{n+1} < u_n < ... < u_2 < u_1. Then u_n = gcd(u_0, u_1).

We define E(u_0, u_1) to be the number of division steps performed by the
algorithm on input (u_0, u_1), and we see that E(u_0, u_1) = n. It can be
proved by induction that if u > v > 0, E(u, v) = n, and u is as small as
possible, then (u, v) = (F_{n+2}, F_{n+1}), where F_k denotes the k-th
Fibonacci number, defined by F_0 = 0, F_1 = 1, and F_{k+2} = F_{k+1} + F_k for
k ≥ 0.
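
The algorithm and the step count E(u_0, u_1) translate directly into Python; a
minimal sketch (the function name euclid_steps is illustrative):

    def euclid_steps(u0, u1):
        # returns (gcd, number of division steps E(u0, u1))
        steps = 0
        while u1 != 0:
            u0, u1 = u1, u0 % u1   # one division step
            steps += 1
        return u0, steps

    # worst case: consecutive Fibonacci numbers, e.g. (F_7, F_6) = (13, 8)
    print(euclid_steps(13, 8))   # -> (1, 5)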

4.3.3. INTRODUCTION TO ARTIFICIAL INTELLIGENCE

Artificial intelligence (AI) is the ability of a digital computer or
computer-controlled robot to perform tasks commonly associated with
intelligent beings. The term is frequently applied to the project of
developing systems endowed with the intellectual processes characteristic of
humans, such as the ability to reason, discover meaning, generalize, or learn
from past experience. Since the development of the digital computer in the
1940s, it has been demonstrated that computers can be programmed to carry out
very complex tasks, such as discovering proofs for mathematical theorems or
playing chess with great proficiency.

As machines become increasingly capable, tasks considered to require


"intelligence" are often removed from the definition of AI, a phenomenon
known as the AI effect. A quip in Tesler's Theorem says “AI is whatever hasn't
been done yet”.

The traditional problems (or goals) of AI research include reasoning,


knowledge representation, planning, learning, natural language processing,
perception and the ability to move and manipulate objects. General intelligence
is among the field's long-term goals. Approaches include statistical methods,
computational intelligence, and traditional symbolic AI. Many tools are used in
AI, including versions of search and mathematical optimization, artificial neural
networks, and methods based on statistics, probability and economics. The AI
field draws upon computer science, information engineering, mathematics,
psychology, linguistics, philosophy, and many other fields.

Types of Artificial Intelligence (AI):

AI can be classified in any number of ways; two of the main classifications
follow.

Type1:

Weak AI or Narrow AI: Focused on one narrow task; the phenomenon that machines
which are not intelligent enough to do their own work can be built in such a
way that they seem smart. An example would be a poker game where a machine
beats a human, and in which all rules and moves are fed into the machine. Here
each and every possible scenario needs to be entered beforehand manually. Each
and every weak AI will contribute to the building of strong AI.

Strong AI: Machines that can actually think and perform tasks on their own,
just like a human being. There are no proper existing examples of this, but
some industry leaders are very keen on getting close to building a strong AI,
which has resulted in rapid progress.

Type 2 (based on functionalities):

Reactive Machines: This is one of the basic forms of AI. It doesn’t have past
memory and cannot use past information to inform future actions. Example: the
IBM chess program that beat Garry Kasparov in the 1990s.

Limited Memory: These AI systems can use past experiences to inform future
decisions. Some of the decision-making functions in self-driving cars have
been designed this way: observations are used to inform actions happening in
the not-so-distant future, such as a car that has changed lanes, but these
observations are not stored permanently. Apple’s chatbot Siri is another
example.

Theory of Mind: This type of AI should be able to understand people’s
emotions, beliefs, thoughts and expectations, and be able to interact
socially. Even though there have been many improvements in this field, this
kind of AI is not complete yet.

Self-awareness: An AI that has its own consciousness, is super-intelligent,
self-aware and sentient (in simple words, a complete human being). Of course,
this kind of bot doesn’t exist, and if achieved it will be one of the
milestones in the field of AI.

There are many ways AI can be achieved; the most important among them are as
follows:

Machine Learning (ML): A method where the target (goal) is defined and the
steps to reach that target are learned by the machine itself through training
(gaining experience). For example, to identify a simple object such as an
apple or an orange, the target is achieved not by explicitly specifying all
the details and coding them, but just as we teach a child: by showing multiple
different pictures of it, and thereby allowing the machine to define the steps
to identify it as an apple or an orange.
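
A minimal sketch of this teach-by-example idea using scikit-learn (the toy
features, weight in grams and surface texture, and all the data values are
invented purely for illustration):

    from sklearn import tree

    # each sample: [weight in grams, texture (0 = smooth, 1 = bumpy)]
    features = [[140, 0], [130, 0], [150, 1], [170, 1]]
    labels = ["apple", "apple", "orange", "orange"]

    clf = tree.DecisionTreeClassifier()
    clf.fit(features, labels)        # "training": learning from examples

    print(clf.predict([[160, 1]]))   # -> ['orange']
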
Natural Language Processing (NLP): Natural Language Processing is
broadly defined as the automatic manipulation of natural language, like speech
and text, by software. One of the well-known examples of this is email spam
detection as we can see how it has improved in our mail system.

Vision: This can be described as a field which enables machines to see.
Machine vision captures and analyses visual information using a camera,
analog-to-digital conversion, and digital signal processing. It can be
compared to human eyesight, but it is not bound by human limitations, which
can enable it to see through walls (now that would be interesting if we could
have implants that make us see through walls). It is usually achieved through
machine learning to get the best possible results, so we could say that these
two fields are interlinked.

Robotics: A field of engineering focused on the design and manufacturing of
robots. Robots are often used to perform tasks that are difficult for humans
to perform or to perform consistently. Examples include car assembly lines,
hospitals, office cleaning, serving and preparing food in hotels, patrolling
farm areas and even police work. Recently, machine learning has been used to
achieve good results in building robots that interact socially (Sophia).
ADVANTAGES AND DISADVANTAGES OF A.I.

Artificial Intelligence is the ability of a computer program to learn and think.


Everything can be considered Artificial intelligence if it involves a program
doing something that we would normally think would rely on the intelligence of
a human. In this article, we will discuss the different advantages and
disadvantages of Artificial Intelligence.

1.Advantages of Artificial Intelligence


2.Disadvantages of Artificial Intelligence

The advantages of Artificial intelligence applications are enormous and can


revolutionize any professional sector. Let’s see some of them.

Advantages of Artificial Intelligence

1) Reduction in Human Error

The phrase “human error” was born because humans make mistakes from time to
time. Computers, however, do not make these mistakes if they are programmed
properly. With Artificial Intelligence, decisions are taken from previously
gathered information by applying a certain set of algorithms. So errors are
reduced, and the chance of reaching accuracy with a greater degree of
precision is a possibility.

Example: In weather forecasting, the use of AI has reduced the majority of
human error.
2) Takes Risks Instead of Humans

This is one of the biggest advantages of Artificial Intelligence. We can
overcome many risky limitations of humans by developing an AI robot which in
turn can do the risky things for us. Be it going to Mars, defusing a bomb,
exploring the deepest parts of the oceans, or mining for coal and oil, it can
be used effectively in any kind of natural or man-made disaster.

Example: Have you heard about the Chernobyl nuclear power plant explosion in
Ukraine? At that time there were no AI-powered robots which could help to
minimise the effect of radiation by controlling the fire in the early stages,
as any human who went close to the core was dead in a matter of minutes. They
eventually poured sand and boron from helicopters from a mere distance.

AI robots can be used in such situations where human intervention can be
hazardous.

3) Available 24×7

An average human will work for 4-6 hours a day, excluding breaks. Humans are
built in such a way as to get some time off for refreshing themselves and
getting ready for a new day of work, and they even have weekly offs to balance
their work life and personal life. But using AI we can make machines work 24×7
without any breaks, and they don’t even get bored, unlike humans.

Example: Educational institutes and helpline centres get many queries and
issues which can be handled effectively using AI.

4) Helping in Repetitive Jobs

In our day-to-day work, we perform many repetitive tasks, like sending a
thank-you mail, verifying certain documents for errors and many more things.
Using artificial intelligence we can productively automate these mundane tasks
and can even remove “boring” tasks from humans, freeing them up to be
increasingly creative.

Example: In banks, we often see many verifications of documents in order to
get a loan, which is a repetitive task for the owner of the bank. Using AI
cognitive automation, the owner can speed up the process of verifying the
documents, by which both the customers and the owner benefit.

5) Digital Assistance

Some highly advanced organizations use digital assistants to interact with
users, which saves on human resources. Digital assistants are also used on
many websites to provide what the user wants. We can chat with them about what
we are looking for. Some chatbots are designed in such a way that it becomes
hard to determine whether we’re chatting with a chatbot or a human being.

Example: We all know that organizations have a customer support team which
needs to clarify the doubts and queries of the customers. Using AI,
organizations can set up a voicebot or chatbot which can help customers with
all their queries. We can see that many organizations have already started
using them on their websites and mobile applications.

6) Faster Decisions

Using AI alongside other technologies we can make machines take decisions
faster than a human and carry out actions quicker. While taking a decision, a
human will analyse many factors both emotionally and practically, but an
AI-powered machine works on what it is programmed for and delivers the results
faster.

Example: We all have played a chess game in Windows. It is nearly impossible
to beat the CPU in hard mode because of the AI behind the game. It will take
the best possible step in a very short time according to the algorithms used
behind it.
7) Daily Applications

Daily applications such as Apple’s Siri, Windows’ Cortana and Google’s OK
Google are frequently used in our daily routine, whether it is for searching a
location, taking a selfie, making a phone call, replying to a mail and many
more things.

Example: Around 20 years ago, when we were planning to go somewhere, we used
to ask a person who had already been there for directions. But now all we have
to do is say “OK Google, where is Visakhapatnam?”. It will show you
Visakhapatnam’s location on Google Maps and the best path between you and
Visakhapatnam.

8) New Inventions:
AI is powering many inventions in almost every domain which will help
humans solve the majority of complex problems.

As every bright side has its darker side, Artificial Intelligence also has
some disadvantages. Let’s look at some of them.

Disadvantages of Artificial Intelligence

1) High Costs of Creation

As AI is updated every day, the hardware and software need to be updated with
time to meet the latest requirements. Machines need repair and maintenance,
which incur plenty of costs. Their creation requires huge costs, as they are
very complex machines.

2) Making Humans Lazy

AI is making humans lazy with its applications automating the majority of the
work. Humans tend to get addicted to these inventions, which can cause a
problem for future generations.
3) Unemployment

As AI is replacing the majority of repetitive tasks and other work with
robots, human interference is becoming less, which will cause a major problem
in employment standards. Every organization is looking to replace the
minimum-qualified individuals with AI robots which can do similar work with
more efficiency.

4) No Emotions

There is no doubt that machines are much better when it comes to working
efficiently, but they cannot replace the human connection that makes a team.
Machines cannot develop a bond with humans, which is an essential attribute
when it comes to team management.

Why Python for AI ?

The obvious question that we need to address at this point is why we should
choose Python for AI over the others.

Python requires the least code among the alternatives, in fact about one-fifth
of the amount needed in other OOP languages. No wonder it is one of the most
popular languages in the market today.

Python has prebuilt libraries like NumPy for scientific computation, SciPy for
advanced computing and PyBrain for machine learning (Python Machine Learning),
making it one of the best languages for AI.

Python developers around the world provide comprehensive support and
assistance via forums and tutorials, making the job of the coder easier than
in any other popular language.

Python is platform-independent and is hence one of the most flexible and
popular choices for use across different platforms and technologies, with the
fewest tweaks in basic coding.

Python is the most flexible of them all, with options to choose between an
OOP approach and scripting. You can also use the IDE itself to check most
code, which is a boon for developers struggling with different algorithms.

Python, along with packages like NumPy, scikit-learn, iPython Notebook and
matplotlib, forms the basis to start your AI project.

NumPy is used as a container for generic data, comprising an N-dimensional
array object, tools for integrating C/C++ code, Fourier transforms, random
number capabilities, and other functions.
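
A tiny illustration of the N-dimensional array object (the values are
arbitrary):

    import numpy as np

    a = np.array([[1, 2, 3], [4, 5, 6]])   # a 2-D array: 2 rows, 3 columns
    print(a.shape)         # -> (2, 3)
    print(a.mean(axis=0))  # column means -> [2.5 3.5 4.5]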

Another useful library is pandas, an open-source library that provides users
with easy-to-use data structures and analytic tools for Python.

Matplotlib is another useful library: a 2D plotting library for creating
publication-quality figures. Matplotlib can be used with up to six graphical
user interface toolkits, web application servers, and Python scripts.

Some of the most commonly used Python AI libraries are AIMA, pyDatalog,
SimpleAI, EasyAI, etc. There are also Python libraries for machine learning
like PyBrain, MDP, scikit and PyML.

Python Libraries for General AI

--AIMA – Python implementation of algorithms from Russell and Norvig’s
‘Artificial Intelligence: A Modern Approach’.

--pyDatalog – Logic programming engine in Python.

--SimpleAI – Python implementation of many of the artificial intelligence
algorithms described in the book “Artificial Intelligence: A Modern Approach”.
It focuses on providing an easy-to-use, well-documented and tested library.

--EasyAI – Simple Python engine for two-player games with AI (Negamax,
transposition tables, game solving).

Python for Machine Learning (ML)

Let us look at why Python is used for Machine Learning. Python is one of the
most popular programming languages used by developers today. Guido van Rossum
created it in 1991, and ever since its inception it has been one of the most
widely used languages, along with C++, Java, etc.

In our endeavour to identify the best programming language for AI and neural
networks, Python has taken a big lead. Let us look at why Artificial
Intelligence with Python is one of the best ideas under the sun.

Python Libraries for Natural Language & Text Processing

 NLTK – Open source Python modules, linguistic data and documentation for
research and development in natural language processing and text analytics,
with distributions for Windows, Mac OS X, and Linux.

PyBrain – A flexible, simple yet effective library of algorithms for ML tasks.
It is a modular machine learning library for Python providing a variety of
predefined environments to test and compare algorithms.

PyML – A bilateral framework written in Python that focuses on SVMs and other
kernel methods. It is supported on Linux and Mac OS X.
Scikit-learn – An efficient tool for data analysis in Python. It is open
source and the most popular general-purpose machine learning library.

MDP-Toolkit – Another Python data processing framework that can be easily


expanded, it also has a collection of supervised and unsupervised learning
algorithms and other data processing units that can be combined into data
processing sequences and more complex feed-forward network architectures.

The implementation of new algorithms is easy and intuitive. The base of
available algorithms is steadily increasing and includes signal processing
methods (Principal Component Analysis, Independent Component Analysis, and
Slow Feature Analysis), manifold learning methods ([Hessian] Locally Linear
Embedding), several classifiers, probabilistic methods (Factor Analysis, RBM),
data pre-processing methods, and many others.

4.3.4 MACHINE LEARNING LIBRARIES

TKINTER

Tkinter is a graphical user interface (GUI) module for Python; with it you can
make desktop apps. You can make windows and buttons, and show text and images,
amongst other things.

Tk and Tkinter apps can run on most Unix platforms. They also work on Windows
and Mac OS X. The module Tkinter is an interface to the Tk GUI toolkit.
Example:

Tkinter module

This example opens a blank desktop window. The tkinter module is part of the
standard library. To use tkinter, import the tkinter module:

    from tkinter import *

This is tkinter with a lowercase t; the module was renamed in Python 3 (it was
Tkinter in Python 2).

Set up the window

Start Tk and create a window, then enter the event loop so the window appears:

    root = Tk()        # create the main window
    root.mainloop()    # display it and wait for events

PYGAME

Pygame is a cross-platform set of Python modules designed for writing


video games. It includes computer graphics and sound libraries designed to be
used with the Python programming language.

Pygame was originally written by Pete Shinners to replace PySDL after its
development stalled. It has been a community project since 2000 and is
released under the open-source GNU Lesser General Public License.

Game creation in any programming language is very rewarding, and also makes
for a great teaching tool. Game development often involves quite a bit of
logic, mathematics, physics, artificial intelligence, and other things, all of
which come together in game creation. Not only this, but the topic is games,
so it can be very fun.

Many times people like to visualize the programs they are creating, as it can
help them learn programming logic quickly. Games are fantastic for this, as
you are specifically programming everything you see.

SCIPY

SciPy is a free and open-source Python library used for scientific computing and
technical computing.

SciPy contains modules for optimization, linear algebra, integration,


interpolation, special functions, FFT, signal and image processing, ODE solvers
and other tasks common in science and engineering.

SciPy builds on the NumPy array object and is part of the NumPy stack, which
includes tools like Matplotlib, pandas and SymPy, and an expanding set of
scientific computing libraries. This NumPy stack has similar users to other
applications such as MATLAB, GNU Octave, and Scilab. The NumPy stack is also
sometimes referred to as the SciPy stack.

SciPy is also a family of conferences for users and developers of these tools:
SciPy (in the United States), EuroSciPy (in Europe) and SciPy.in (in India).
Enthought originated the SciPy conference in the United States and continues
to sponsor many of the international conferences as well as host the SciPy
website.

The SciPy library is currently distributed under the BSD license, and its
development is sponsored and supported by an open community of developers.
It is also supported by NumFOCUS, a community foundation for supporting
reproducible and accessible science.

DLIB

Dlib is a general purpose cross-platform software library written in the


programming language C++. Its design is heavily influenced by ideas from
design by contract and component-based software engineering. Thus it is, first
and foremost, a set of independent software components. It is open-source
software released under a Boost Software License.

Since development began in 2002, Dlib has grown to include a wide


variety of tools. As of 2016, it contains software components for dealing with
networking, threads, graphical user interfaces, data structures, linear algebra,
machine learning, image processing, data mining, XML and text parsing,
numerical optimization, Bayesian networks, and many other tasks. In recent
years, much of the development has been focused on creating a broad set of
statistical machine learning tools and in 2009 Dlib was published in the Journal
of Machine Learning Research. Since then it has been used in a wide range of
domains.

DLib-ml implements numerous machine learning algorithms: SVMs, K-Means
clustering, Bayesian Networks, and many others.

DLib also features utility functionality including threading, networking,
numerical algorithms, image processing, and data compression and integrity
algorithms.

DLib includes extensive unit testing coverage and examples using the
library. Every class and function in the library is documented. This
documentation can be found on the library's home page. DLib provides a good
framework for developing machine learning applications in C++.
DLib is much like DMTL in that it provides a generic high-performance
machine learning toolkit with many different algorithms, but DLib is more
recently updated and has more examples. DLib also contains much more
supporting functionality.

What makes DLib unique is that it is designed for both research use and
creating machine learning applications in C++ and Python.

Imutils

Translation

Translation is the shifting of an image in either the x or y direction. To
translate an image in OpenCV you need to supply the (x, y)-shift, denoted as
(t_x, t_y), to construct the translation matrix M:

    M = [ 1  0  t_x
          0  1  t_y ]

From there, you would apply the cv2.warpAffine function. Instead of manually
constructing the translation matrix M and calling cv2.warpAffine, you can
simply make a call to the translate function of imutils.
Rotation:

Rotating an image in OpenCV is accomplished by making a call to
cv2.getRotationMatrix2D and cv2.warpAffine. Further care has to be taken to
supply the (x, y)-coordinate of the point the image is to be rotated about.
These calculation calls can quickly add up and make your code bulky and less
readable. The rotate function in imutils helps resolve this problem.
Resizing:
Resizing an image in OpenCV is accomplished by calling
the cv2.resize function. However, special care needs to be taken to ensure that
the aspect ratio is maintained. This resize function of imutils maintains the
aspect ratio and provides the keyword arguments width and height so the image
can be resized to the intended width/height while (1) maintaining aspect ratio
and (2) ensuring the dimensions of the image do not have to be explicitly
computed by the developer.
Another optional keyword argument, inter, can be used to specify interpolation
method as well.
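
A short sketch combining these helpers (the file name example.jpg is a
placeholder):

    import cv2
    import imutils

    image = cv2.imread("example.jpg")            # load an image from disk

    shifted = imutils.translate(image, 25, -30)  # 25 px right, 30 px up
    rotated = imutils.rotate(image, angle=45)    # rotate about the centre
    resized = imutils.resize(image, width=300)   # width 300, aspect kept

    cv2.imshow("Resized", resized)
    cv2.waitKey(0)
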
Skeletonization
Skeletonization is the process of constructing the “topological skeleton” of an
object in an image, where the object is presumed to be white on a black
background. OpenCV does not provide a function to explicitly construct the
skeleton, but does provide the morphological and binary functions to do so.

For convenience, the skeletonize function of imutils can be used to construct the


topological skeleton of the image.
The first argument, size, is the size of the structuring element kernel. An
optional argument, structuring, can be used to control the structuring
element; it defaults to cv2.MORPH_RECT, but can be any valid structuring
element.

OPENCV

OpenCV (Open Source Computer Vision Library) is an open source computer vision
and machine learning software library. OpenCV was built to provide a common
infrastructure for computer vision applications and to accelerate the use of
machine perception in commercial products. Being a BSD-licensed product,
OpenCV makes it easy for businesses to utilize and modify the code.

The library has more than 2500 optimized algorithms, which includes a
comprehensive set of both classic and state-of-the-art computer vision and
machine learning algorithms. These algorithms can be used to detect and
recognize faces, identify objects, classify human actions in videos, track camera
movements, track moving objects, extract 3D models of objects, produce 3D
point clouds from stereo cameras, stitch images together to produce a high
resolution image of an entire scene, find similar images from an image database,
remove red eyes from images taken using flash, follow eye movements,
recognize scenery and establish markers to overlay it with augmented reality,
etc.

Along with well-established companies like Google, Yahoo, Microsoft,


Intel, IBM, Sony, Honda, Toyota that employ the library, there are many
startups such as Applied Minds, VideoSurf, and Zeitera, that make extensive
use of OpenCV. OpenCV’s deployed uses span the range from stitching
streetview images together, detecting intrusions in surveillance video in Israel,
monitoring mine equipment in China, helping robots navigate and pick up
objects at Willow Garage, detection of swimming pool drowning accidents in
Europe, running interactive art in Spain and New York, checking runways for
debris in Turkey, inspecting labels on products in factories around the world on
to rapid face detection in Japan.
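
As a small illustration of the library in the context of this project, the
following sketch detects faces in an image using one of the Haar cascades
bundled with OpenCV. This is a generic OpenCV usage example, not the SSD-based
detector proposed in this report; the file name faces.jpg is a placeholder:

    import cv2

    # load the frontal-face Haar cascade that ships with OpenCV
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    image = cv2.imread("faces.jpg")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

    # one (x, y, w, h) bounding box per detected face
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow("Faces", image)
    cv2.waitKey(0)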

SYSTEM DESIGN

5.1 INTRODUCTION

Software design sits at the technical kernel of the software engineering
process and is applied regardless of the development paradigm and area of
application. Design is the first step in the development phase for any
engineered product or system. The designer’s goal is to produce a model or
representation of an entity that will later be built. Once system requirements
have been specified and analyzed, system design is the first of the three
technical activities (design, code and test) that are required to build and
verify software.

The importance can be stated with a single word: “quality”. Design is the
place where quality is fostered in software development. Design provides us
with representations of software that we can assess for quality. Design is the
only way that we can accurately translate a customer’s view into a finished
software product or system. Software design serves as a foundation for all the
software engineering steps that follow. Without a strong design we risk
building an unstable system, one that will be difficult to test and whose
quality cannot be assessed until the last stage. The purpose of the design
phase is to plan a solution to the problem specified by the requirements
document. This phase is the first step in moving from the problem domain to
the solution domain. In other words, starting with what is needed, design
takes us toward how to satisfy the needs. The design of a system is perhaps
the most critical factor affecting the quality of the software; it has a major
impact on the later phases, particularly testing and maintenance. The output
of this phase is the design document. This document is similar to a blueprint
for the solution and is used later during implementation, testing and
maintenance. The design activity is often divided into two separate phases:
System Design and Detailed Design.
System Design, also called top-level design, aims to identify the modules that
should be in the system, the specifications of these modules, and how they
interact with each other to produce the desired results. At the end of system
design, all the major data structures, file formats, output formats, and the
major modules in the system and their specifications are decided.
During Detailed Design, the internal logic of each of the modules specified in
system design is decided. During this phase, the details of the data of a
module are usually specified in a high-level design description language,
which is independent of the target language in which the software will
eventually be implemented.

In system design the focus is on identifying the modules, whereas during
detailed design the focus is on designing the logic for each of the modules.
In other words, in system design the attention is on what components are
needed, while in detailed design the issue is how the components can be
implemented in software. Design is concerned with identifying software
components, specifying the relationships among them, specifying the software
structure, and providing a blueprint for the documentation phase. Modularity
is one of the desirable properties of large systems. It implies that the
system is divided into several parts in such a manner that the interaction
between the parts is minimal and clearly specified.

During the system design activities, developers bridge the gap between the
requirements specification, produced during requirements elicitation and
analysis, and the system that is delivered to the user.

Design is the place where quality is fostered in development. Software design
is a process through which requirements are translated into a representation
of software.

5. SYSTEM DESIGN

INTRODUCTION
Object detection is one of the trending topics in the field of image
processing and computer vision. Ranging from small-scale personal applications
to large-scale industrial applications, object detection and recognition is
employed in a wide range of industries. Some examples include image retrieval,
security and intelligence, OCR, medical imaging and agricultural monitoring.
In object detection, an image is read and one or more objects in that image
are categorized. The location of those objects is also specified by a boundary
called the bounding box. Traditionally, researchers used pattern recognition
to predict faces based on prior face models. A breakthrough face detection
technology was then developed, named the Viola-Jones detector, an optimized
technique using Haar features, digital image features used in object
recognition. However, it failed because it did not perform well on faces in
dark areas and non-frontal faces. Since then, researchers have been eager to
develop new algorithms based on deep learning to improve the models. Deep
learning allows us to learn features in an end-to-end manner, removing the
need to use prior knowledge for forming feature extractors. There are various
methods of object detection based on deep learning, which are divided into two
categories: one-stage and two-stage object detectors.
Two-stage detectors use two neural networks to detect objects, for instance
region-based convolutional neural networks (R-CNN) and Faster R-CNN. The first
neural network is used to generate region proposals, and the second one
refines these region proposals, performing a coarse-to-fine detection. This
strategy results in high detection performance at the expense of speed. The
seminal work R-CNN was proposed by R. Girshick et al. R-CNN uses selective
search to propose some candidate regions which may contain objects. After
that, the proposals are fed into a CNN model to extract features, and a
support vector machine (SVM) is used to recognize classes of objects. However,
the second stage of R-CNN is computationally expensive, since the network has
to process the proposals one by one and uses a separate SVM for final
classification. Fast R-CNN solves this problem by introducing a region of
interest (ROI) pooling layer to input all proposal regions at once. Faster
R-CNN is the evolution of R-CNN and Fast R-CNN, and as the name implies its
training and testing speed is greater than those of its predecessors. While
R-CNN and Fast R-CNN use selective search algorithms, limiting the detection
speed, Faster R-CNN learns the proposed object regions itself using a region
proposal network (RPN).
On the other hand, a one-stage detector utilizes only a single neural network for
both region proposals and detection; the primary examples are SSD (Single Shot
Detector) and YOLO (You Only Look Once). To achieve this, the bounding
boxes must be predefined. YOLO divides the image into several cells and then
matches bounding boxes to objects within each cell. This, however, is not good
for small-sized objects, so multi-scale detection was introduced in SSD, which
can detect objects of varying sizes in an image. Later, in order to improve
detection accuracy, Lin et al. proposed the Retina Network (RetinaNet), which combines
an SSD with a feature pyramid network (FPN) to increase detection accuracy and
reduce class imbalance. One-stage detectors have higher speed but trade off some
detection performance, and only then are they preferred over two-stage detectors.
Like generic object detection, face detection adopts the same one-stage and two-stage
architectures, but in order to improve face detection accuracy, more
face-like features are added. However, research focusing specifically on face mask
detection is only occasional. Some existing face mask detectors have
been modeled using OpenCV, PyTorch Lightning, MobileNet, RetinaNet and
Support Vector Machines. Here, we will discuss two such projects. The first
used the Real World Masked Face Dataset (RMFD), which contains 5,000
masked faces of 525 people and 90,000 normal faces [8]. These images are 250
x 250 pixels, cover all races and ethnicities, and are unbalanced. This
project took 100 x 100 images as input, and therefore transformed each sample
image when querying it, resizing it to 100 x 100. Moreover, since this project uses
PyTorch, the images are converted to tensors, the base data type that
PyTorch works with. Because RMFD is imbalanced (5,000 masked faces versus 90,000
non-masked faces), the ratio of the samples in the train/validation
split was kept equal using the train_test_split function of sklearn.
Moreover, to deal with the unbalanced data, they passed this information to the loss
function to avoid disproportionate step sizes of the optimizer. They did this by
assigning a weight to each class according to its representation in the dataset:
they assigned more weight to classes with a small number of samples, so that
the network is penalized more if it makes mistakes predicting the labels of
these classes, while classes with large numbers of samples were assigned a
smaller weight. This makes their network's training agnostic to the
proportion of the classes.
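As an illustration of this weighting scheme, the following minimal sketch (not the original project's code; the class counts are the RMFD figures quoted above) computes per-class weights inversely proportional to class frequency and passes them to the loss function in PyTorch:

import torch
import torch.nn as nn

# class counts from RMFD: [masked, non-masked]
class_counts = torch.tensor([5000.0, 90000.0])

# weight each class inversely to its frequency, so the rare class
# contributes larger gradient steps when it is misclassified
class_weights = class_counts.sum() / (len(class_counts) * class_counts)

# the weighted loss keeps training agnostic to class proportions
criterion = nn.CrossEntropyLoss(weight=class_weights)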

This project used a data loader: using PyTorch Lightning, the data for training
and validation was divided into batches of 32, and the work of loading was assigned
to 4 workers, a procedure that allowed multi-process data loading. Like most
projects, this project also used the Adam optimizer. If a model has a high
learning rate, it learns faster, but it bounces around a lot while reaching the global
minimum and may even diverge from it. A small learning rate, however, takes
considerably longer to train but does reach the global minimum. If the loss of
the model declines quickly under a given learning rate, then that learning rate is a
good choice. This project considered a learning rate of 0.00001 to be best for its
model to work efficiently. To train the model, they defined a model checkpointing
callback to save the weights with the best accuracy and the lowest loss. They
trained the model for 10 epochs and, after finding the optimal epoch, saved the
model from epoch 8 to test on the real data. To get rid of the problem of occlusions
of the face, which make it difficult for face detectors to detect masks in images,
they used a built-in OpenCV deep-learning face detection model. For instance, the
Haar cascade model could have been used, but its problem is that the detection
frame is a rectangle, not a square; as a result, the face frame cannot fit the
entirety of the face without capturing a portion of the background, which can
interfere with the face mask model's predictions.

In the second project [9], a dataset was created by Prajna Bhandary, a PyImageSearch
reader. This dataset consists of 1,376 images, belongs to all races, and is balanced:
there are 690 images with masks and 686 without masks. Firstly, normal images
of faces were taken, and then a customized computer vision Python script was
created to add face masks to them, thereby creating a real-world-applicable
artificial dataset. This method used facial landmarks, which allow detection of the
different parts of the face such as the eyes, eyebrows, nose, mouth and jawline. To
use the facial landmarks, a picture is taken of a person who is not wearing a mask,
and the portion of that person's face is detected. After the location of the face in
the image is known, the face Region of Interest (ROI) is extracted; after localizing
the facial landmarks, a picture of a mask is placed onto the face.
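A minimal sketch of this landmark-based overlay idea follows; it is illustrative only, not the original script, and the file names and the dlib landmark model path are assumptions:

import cv2
import dlib
import numpy as np

# dlib's frontal face detector and 68-point landmark predictor
detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

image = cv2.imread("face.jpg")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

for rect in detector(gray, 1):
    # locate the 68 facial landmarks for this face
    shape = predictor(gray, rect)
    pts = np.array([[p.x, p.y] for p in shape.parts()], dtype=np.int32)

    # the jawline (points 0-16) plus the nose bridge (point 27)
    # roughly bound the region a mask would cover
    region = np.vstack([pts[0:17], pts[27:28]])
    (x, y, w, h) = cv2.boundingRect(region)

    # resize the mask artwork to the region and paste it on
    mask_img = cv2.resize(cv2.imread("mask.png"), (w, h))
    image[y:y + h, x:x + w] = mask_img

cv2.imwrite("face_with_mask.jpg", image)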
In this project, embedded devices were used for deployment, which could reduce the
cost of manufacturing. The MobileNetV2 architecture was used, as it is a highly
efficient architecture for embedded devices with limited computational capacity,
such as the Google Coral and NVIDIA Jetson Nano. This project performed well;
however, if a large portion of the face is occluded by the mask, the model cannot
detect whether a person is wearing a mask or not. The dataset used to train the
face detector did not contain images of people wearing face masks; as a result, if
a large portion of the face is occluded, the face detector will probably fail to
detect it properly. To get rid of this problem, actual images of people wearing
masks should be gathered rather than artificially generated images.

Methodology
The working of the Single Shot Detector algorithm relies on an input image with
specified bounding boxes against the objects. The methodology of predicting an
object in an image depends upon the well-known convolution fashion. For each
pixel of a given image, a set of default bounding boxes (usually 4) with different
sizes and aspect ratios is evaluated. Moreover, for every pixel, a confidence
score for each possible object is calculated, with an additional label of 'No Object'.
This calculation is repeated over many different feature maps. In order to extract
the feature maps, we usually use predefined, pre-trained networks that were built
for high-quality classification problems; we call this part of the model the base
model. For SSD, we have the VGG-16 network as our base model. At training
time, the evaluated bounding boxes are compared with the ground truth boxes,
and in back propagation the trainable parameters are altered as required.
We truncate the VGG-16 model just before the classification layer
and add feature layers which keep decreasing in size. At each feature space,
we use a kernel to produce outcomes that depict, for each pixel, the corresponding
scores for whether any object exists there, along with the corresponding
dimensions of the resulting bounding box.

VGG-16 is a very dense network having 16 convolutional layers, which are useful
in extracting features to classify and detect objects. The reason for the selection
is that the architecture consists of stacks of convolutions with a 3x3 kernel size,
which thoroughly extract numerous feature information from the given image, along
with max pooling to pass the information flow through the model and ReLU to add
non-linearity. For additional non-linearity, it uses 1x1 convolution blocks, which
do not change the spatial dimension of the input. Due to the small-size filters
striding over the image, there are many weight parameters, which ends up giving an
improved performance. The block diagram shows the working functionality of
SSD. At the input end, we can see VGG-16 being used as the base model.
Some additional feature layers are added at the end of the base model to take
care of the offsets and confidence scores of the different bounding boxes. At the end
part of the figure, we can see the layers being flattened to make predictions for
different bounding boxes. Finally, non-maximum suppression is used, whose purpose
is to remove duplicate or very similar bounding boxes around the same objects.
There may be situations where a neighboring pixel also predicts a bounding box
for an object with slightly less confidence; such a box is finally rejected.
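The following is a minimal sketch of such an IoU-based non-maximum suppression step (an illustration of the idea, not the SSD library code); boxes are given as (startX, startY, endX, endY) arrays with one confidence score each:

import numpy as np

def non_max_suppression(boxes, scores, iou_thresh=0.5):
    # process boxes in order of decreasing confidence
    idxs = np.argsort(scores)[::-1]
    keep = []
    while len(idxs) > 0:
        i = idxs[0]
        keep.append(i)
        rest = idxs[1:]

        # intersection of the kept box with every remaining box
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.maximum(0, x2 - x1) * np.maximum(0, y2 - y1)

        # IoU = intersection / union
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter)

        # discard near-duplicate boxes around the same object
        idxs = rest[iou < iou_thresh]
    return keep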

The problem can be solved in two parts: first, detecting the presence of several
faces in a given image or stream of video, and then, in the second part, detecting
the presence or absence of a face mask on each face. In order to detect the face, we
have used the OpenCV library. The latest OpenCV includes a Deep Neural Network
(DNN) module, which comes with a pre-trained face detection convolutional
neural network (CNN). The new model enhances face detection performance
compared to the traditional models. Whenever a new test image is given, it is first
converted into a blob (Binary Large Object, a group of connected pixels
in a binary image) and then sent into the pre-trained model, which outputs the
number of detected faces. Every face detected comes with a level of
confidence, which is then compared with a threshold value to filter out the
irrelevant detections. After we have the faces, we evaluate the bounding
box around each one and send it to the second part of the model to check whether
the face has a mask or not.

The second part of the model is trained by us using a dataset consisting of images
with masks and without masks. We have used Keras along with TensorFlow to train
our model. The first part of the training involves storing all the labels of the images
in a NumPy array, and the corresponding images are also reshaped to (224, 224, 3)
for the base model. Image augmentation is a very useful technique because it
enlarges our dataset with images seen from a whole new perspective. Before
inputting, we performed the following image augmentations randomly: rotations up
to 20 degrees, zooming in and out up to 15%, width or height shifts up to 20%,
shear angles up to 15 degrees in the counterclockwise direction, horizontal flips,
and filling of points outside the boundaries of the inputs from the nearest available
pixel of the input. For image classification, it is now common practice to use
transfer learning, which means using a model that has been pre-trained on millions
of labels before; it has been shown that this method results in a significant increase
in accuracy. Obviously, the assumption here is that both problems have sufficient
similarity. Transfer learning uses a well-structured, deep neural network that has
been trained on a large dataset; due to the somewhat similar nature of our problem,
we can use the same weights, which have the capability of extracting features that
the deeper layers then convert into objects.

The base model that we have used here is MobileNetV2 with the given 'ImageNet'
weights. ImageNet is an image database on which the model has been trained on
hundreds of thousands of images, hence it helps a lot in image classification. From
the base model we truncate the head and use a series of our self-defined layers: an
average pooling layer, a flatten layer, a dense layer with output shape (None, 128)
and ReLU activation, a 50% dropout layer for optimization, and finally another
dense layer with output shape (None, 2) and sigmoid activation. The overall
process flow diagram of the algorithm is shown below.

Training

At training time, for each pixel we compare the default bounding boxes of
different sizes and aspect ratios with the ground truth boxes, and finally use the
Intersection over Union (IoU) method to select the best matching box. IoU
evaluates how much of our predicted box matches the ground reality. Its values
range from 0 to 1, and increasing values of IoU indicate more accurate
predictions, the best value being the highest IoU. The equation for IoU is:

IoU = Area of Overlap / Area of Union

i.e. the area of intersection between the predicted box and the ground truth box,
divided by the area of their union.
Hyperparameters

A hyperparameter is a parameter or variable that we need to set before applying an
algorithm to a dataset. These parameters express 'high level' properties of
the model, such as its complexity or how fast it should learn. Hyperparameters are
fixed before the actual training process begins. They can be divided into two
categories: optimizer hyperparameters and model hyperparameters. Optimizer
hyperparameters help us tune or optimize the model before the actual training
process starts; some common ones are as follows. The learning rate controls how
much we adjust the weights of our neural network with respect to the gradient.
The mini-batch size influences the resource requirements of training and impacts
the training speed and the number of iterations. The number of epochs determines
how many times the model runs over the data; one epoch is one full forward and
backward pass of the entire dataset through the neural network. Model
hyperparameters are parameters more involved in the architecture or structure of
the model; they help us define the model's complexity through its different layers,
such as the input layer, hidden layers, and output layer of a neural network.
Initially, we trained with different values of the hyperparameters, changing one
while keeping the others constant, and noted down the results in each case. We
then selected the hyperparameters that produced the better performance according
to the evaluation metrics.

We have chosen the hyperparameters as follows: the initial learning rate is taken as
0.0001, the batch size is 32, and the number of epochs is 20. In our case, the
target size is also one of the hyperparameters; we kept it at (224, 224, 3), as that
is the default input shape of MobileNetV2.
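With these values, compiling the classifier (continuing the Keras sketch above; the learning-rate decay schedule is an assumption) would look like this:

from tensorflow.keras.optimizers import Adam

INIT_LR = 1e-4  # initial learning rate
EPOCHS = 20     # number of epochs
BS = 32         # mini-batch size

opt = Adam(learning_rate=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])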

Loss functions

The loss of the overall detection problem can be broken into two main
subsets: localization loss and confidence loss. The localization loss is the
difference between the predicted bounding box (l) and the ground truth bounding
box (g). For a given centre (cx, cy), we try to alter the width and height of the box
so as to decrease the loss. The equations for the localization and confidence losses,
respectively, can be defined as follows.
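For reference, the standard SSD formulation of these losses (as defined by Liu et al., which the notation below follows) is:

L_{loc}(x,l,g) = \sum_{i \in Pos}^{N} \sum_{m \in \{cx,cy,w,h\}} x_{ij}^{k}\, \mathrm{smooth}_{L1}\!\left(l_i^{m} - \hat{g}_j^{m}\right)

with the regression targets

\hat{g}_j^{cx} = \frac{g_j^{cx} - d_i^{cx}}{d_i^{w}}, \quad \hat{g}_j^{cy} = \frac{g_j^{cy} - d_i^{cy}}{d_i^{h}}, \quad \hat{g}_j^{w} = \log\frac{g_j^{w}}{d_i^{w}}, \quad \hat{g}_j^{h} = \log\frac{g_j^{h}}{d_i^{h}}

and the confidence loss as a softmax loss over the class confidences c:

L_{conf}(x,c) = -\sum_{i \in Pos}^{N} x_{ij}^{p} \log\left(\hat{c}_i^{p}\right) - \sum_{i \in Neg} \log\left(\hat{c}_i^{0}\right), \quad \hat{c}_i^{p} = \frac{\exp(c_i^{p})}{\sum_{p}\exp(c_i^{p})}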

The notations used in the equations are as follows:


 g: ground truth bounding box
 l: predicted box
 d: default bounding box
 x^p_ij: indicator for matching the i-th predicted box with the j-th default box of category p
 cx, cy: distance from the centre of the box in the x and y directions
 w, h: width and height with respect to the image size
 c: confidence of the presence or absence of an object
Confidence loss is simply a measure of how high the probability of the presence of
an object is when an object actually exists. Similarly, the localization loss is the
measure of how much a predicted box differs from the ground truth box in its
dimensions. Our model tries to push both losses down by predicting the
presence of an object and then correctly classifying it into the right class.

Evaluation
Testing: We tried three different base models for detecting 'mask' or 'no
mask'; the exercise was done to find the best-fitting model for our scenario. The
evaluation process consists of first looking at the classification report, which gives
us insight into precision, recall and F1 score. The equations of these three
metrics are as follows:

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1 score = 2 x (Precision x Recall) / (Precision + Recall)

where TP, FP and FN denote true positives, false positives and false negatives
respectively. Using these three metrics, we can conclude which model performs most
efficiently. The second part consists of plotting the training loss, validation loss,
training accuracy and validation accuracy, which also proves helpful in choosing a
final model.

Data Flow Diagrams:

A data flow diagram is a graphical tool used to describe and analyze the movement of
data through a system, manual or automated, including the processes, stores of data,
and delays in the system. Data flow diagrams are the central tool and the basis from
which other components are developed. The transformation of data from input to
output, through processes, may be described logically and independently of the
physical components associated with the system. The DFD is also known as a data
flow graph or a bubble chart.

DFDs are a model of the proposed system. They should clearly show the
requirements on which the new system is to be built. Later, during design activity,
they are taken as the basis for drawing the system's structure charts. The basic
notations used to create a DFD are as follows:

1. Dataflow: Data move in a specific direction from an origin to a destination.

2. Process: People, procedures, or devices that use or produce (transform) data.
The physical component is not identified.
3. Source: External sources or destination of data, which may be People, programs,
organizations or other entities.

4. Data Store: Here data are stored or referenced by a process in the System.

What is a UML Class Diagram?

Class diagrams are the backbone of almost every object-oriented method


including UML. They describe the static structure of a system.

Basic Class Diagram Symbols and Notations

Classes represent an abstraction of entities with common characteristics.


Associations represent the relationships between classes.

Illustrate classes with rectangles divided into compartments. Place the name of
the class in the first partition (centered, bolded, and capitalized), list the
attributes in the second partition, and write operations into the third.

Active Class

Active classes initiate and control the flow of activity, while passive classes
store data and serve other classes. Illustrate active classes with a thicker border.

Visibility

Use visibility markers to signify who can access the information contained
within a class. Private visibility hides information from anything outside the
class partition. Public visibility allows all other classes to view the marked
information. Protected visibility allows child classes to access information they
inherited from a parent class.

Associations
Associations represent static relationships between classes. Place association
names above, on, or below the association line. Use a filled arrow to indicate
the direction of the relationship. Place roles near the end of an association.
Roles represent the way the two classes see each other.
Note: It's uncommon to name both the association and the class roles.

Multiplicity (Cardinality)

Place multiplicity notations near the ends of an association. These symbols


indicate the number of instances of one class linked to one instance of the other
class. For example, one company will have one or more employees, but each
employee works for one company only.

Constraint

Place constraints inside curly braces {}.


Simple Constraint

Composition and Aggregation

Composition is a special type of aggregation that denotes a strong ownership


between Class A, the whole, and Class B, its part. Illustrate composition with a
filled diamond. Use a hollow diamond to represent a simple aggregation
relationship, in which the "whole" class plays a more important role than the
"part" class, but the two classes are not dependent on each other. The diamond
end in both a composition and aggregation relationship points toward the
"whole" class or the aggregate

Generalization
Generalization is another name for inheritance or an "is a" relationship. It refers
to a relationship between two classes where one class is a specialized version of
another. For example, Honda is a type of car. So the class Honda would have a
generalization relationship with the class car.

In real life coding examples, the difference between inheritance and aggregation
can be confusing. If you have an aggregation relationship, the aggregate (the
whole) can access only the PUBLIC functions of the part class. On the other
hand, inheritance allows the inheriting class to access both the PUBLIC and
PROTECTED functions of the super class.

What is a UML Use Case Diagram?

Use case diagrams model the functionality of a system using actors and use
cases. Use cases are services or functions provided by the system to its users.

Basic Use Case Diagram Symbols and Notations

System

Draw your system's boundaries using a rectangle that contains use cases. Place
actors outside the system's boundaries.
Use Case

Draw use cases using ovals. Label the ovals with verbs that represent the
system's functions.

Actors

Actors are the users of a system. When one system is the actor of another
system, label the actor system with the actor stereotype.

Relationships
Illustrate relationships between an actor and a use case with a simple line. For
relationships among use cases, use arrows labeled either "uses" or "extends." A
"uses" relationship indicates that one use case is needed by another in order to
perform a task. An "extends" relationship indicates alternative options under a
certain use case.

Sequence Diagram

Sequence diagrams describe interactions among classes in terms of an exchange


of messages over time.

Basic Sequence Diagram Symbols and Notations

Class roles

Class roles describe the way an object will behave in context. Use the UML
object symbol to illustrate class roles, but don't list object attributes.
Activation

Activation boxes represent the time an object needs to complete a task.

Messages

Messages are arrows that represent communication between objects. Use half-
arrowed lines to represent asynchronous messages. Asynchronous messages are
sent from an object that will not wait for a response from the receiver before
continuing its tasks.
Various message types for Sequence and Collaboration diagrams

Lifelines

Lifelines are vertical dashed lines that indicate the object's presence over time.

Destroying Objects

Objects can be terminated early using an arrow labeled "<< destroy >>" that
points to an X.
Loops

A repetition or loop within a sequence diagram is depicted as a rectangle. Place


the condition for exiting the loop at the bottom left corner in square brackets [ ].

Collaboration Diagram

A collaboration diagram describes interactions among objects in terms of


sequenced messages. Collaboration diagrams represent a combination of
information taken from class, sequence, and use case diagrams describing both
the static structure and dynamic behavior of a system.

Basic Collaboration Diagram Symbols and Notations

Class roles

Class roles describe how objects behave. Use the UML object symbol to
illustrate class roles, but don't list object attributes.

Association roles

Association roles describe how an association will behave given a particular


situation. You can draw association roles using simple lines labeled with
stereotypes.

Messages

Unlike sequence diagrams, collaboration diagrams do not have an explicit way


to denote time and instead number messages in order of execution. Sequence
numbering can become nested using the Dewey decimal system. For example,
nested messages under the first message are labeled 1.1, 1.2, 1.3, and so on. A
condition for a message is usually placed in square brackets immediately
following the sequence number. Use a * after the sequence number to indicate a
loop.

Activity Diagram

An activity diagram illustrates the dynamic nature of a system by modeling the


flow of control from activity to activity. An activity represents an operation on
some class in the system that results in a change in the state of the system.
Typically, activity diagrams are used to model workflow or business processes
and internal operation. Because an activity diagram is a special kind of state
chart diagram, it uses some of the same modeling conventions.

Basic Activity Diagram Symbols and Notations

Action states

Action states represent the non-interruptible actions of objects. An action state
is drawn as a rectangle with rounded corners.

Action Flow
Action flow arrows illustrate the relationships among action states.

Object Flow

Object flow refers to the creation and modification of objects by activities. An


object flow arrow from an action to an object means that the action creates or
influences the object. An object flow arrow from an object to an action indicates
that the action state uses the object.

Initial State

A filled circle followed by an arrow represents the initial action state.

Final State

An arrow pointing to a filled circle nested inside another circle represents the
final action state.
Branching

A diamond represents a decision with alternate paths. The outgoing alternates


should be labeled with a condition or guard expression. You can also label one
of the paths "else."

Synchronization

A synchronization bar helps illustrate parallel transitions. Synchronization is


also called forking and joining.

Swimlanes
Swimlanes group related activities into one column.

State chart Diagram

A state chart diagram shows the behavior of classes in response to external


stimuli. This diagram models the dynamic flow of control from state to state
within a system.

Basic State chart Diagram Symbols and Notations

States

States represent situations during the life of an object. A state is drawn as a
rectangle with rounded corners.

Transition

A solid arrow represents the path between different states of an object. Label the
transition with the event that triggered it and the action that results from it.
Initial State

A filled circle followed by an arrow represents the object's initial state.

Final State

An arrow pointing to a filled circle nested inside another circle represents the
object's final state.

Synchronization and Splitting of Control

A short heavy bar with two transitions entering it represents a synchronization


of control. A short heavy bar with two transitions leaving it represents a
splitting of control that creates multiple states.
STATE CHART DIAGRAM:

What is a UML Component Diagram?

A component diagram describes the organization of the physical components in


a system.

Basic Component Diagram Symbols and Notations

Component

A component is a physical building block of the system. It is represented as a


rectangle with tabs.

Interface

An interface describes a group of operations used or created by components.

Dependencies

Draw dependencies among components using dashed arrows.


COMPONENT DIAGRAM:

What is a UML Deployment Diagram?

Deployment diagrams depict the physical resources in a system including


nodes, components, and connections.

Basic Deployment Diagram Symbols and Notations

Node

A node is a physical resource that executes code components.

Association

Association refers to a physical connection between nodes, such as Ethernet.


Components and Nodes

Place components inside the node that deploys them.

UML Diagrams Overview


UML combines the best techniques from data modeling (entity relationship
diagrams), business modeling (workflows), object modeling, and component
modeling. It can be used with all processes, throughout the software
development life cycle, and across different implementation technologies. UML
has synthesized the notations of the Booch method, the Object Modeling
Technique (OMT) and Object-Oriented Software Engineering (OOSE) by fusing
them into a single, common and widely usable modeling language. UML aims
to be a standard modeling language which can model concurrent and distributed
systems.
USE CASE DIAGRAM:

Use case diagrams are considered for high-level requirement analysis of a
system. When the requirements of a system are analyzed, the functionalities are
captured in use cases; we can therefore say that use cases are nothing but the
system's functionalities written in an organized manner. The second thing
relevant to use cases are the actors. Actors can be defined as
something that interacts with the system.

The actors can be human users, internal applications, or external applications.
So, in brief, when we are planning to draw a use case diagram, we should have
the following items identified:

 Functionalities to be represented as use cases

 Actors

 Relationships among the use cases and actors.


[Use case diagram: the user takes the dataset, prepares it, performs classification (supervised/unsupervised), extracts features and labels, carries out testing and training, predicts results using the SVM/SVR algorithms, and plots the output.]

SEQUENCE DIAGRAM:

A sequence diagram in Unified Modeling Language (UML) is a kind of


interaction diagram that shows how processes operate with one another and in
what order. It is a construct of a Message Sequence Chart. A sequence diagram
shows, as parallel vertical lines ("lifelines"), different processes or objects that
live simultaneously, and, as horizontal arrows, the messages exchanged between
them, in the order in which they occur. This allows the specification of simple
runtime scenarios in a graphical manner.
[Sequence diagram with lifelines: user, dataset, classification, supervised, classifying, testing & training, algorithms, predict(result), plotting. Messages in order of execution: 1: supervised(), 2: unsupervised(), 3: classifying(), 4: regression(), 5: features(), 6: labels(), 7: SVM algorithm(), 8: SVR algorithm(), 9: using SVM algorithm(), 10: results().]

COLLABORATION DIAGRAM:

A collaboration diagram, also called a communication diagram or interaction
diagram, is represented by modeling the objects in a system and showing the
associations between the objects as links. The interaction between the objects is
denoted by arrows. To identify the sequence of invocation of these objects, a
number is placed next to each of these arrows. A sophisticated modeling tool can
easily convert a collaboration diagram into a sequence diagram and vice versa.
Hence, the elements of a collaboration diagram are essentially the same as those
of a sequence diagram.

[Collaboration diagram: the same objects and messages as in the sequence diagram above, numbered in order of execution.]
ACTIVITY DIAGRAM:

Activity diagrams are graphical representations of workflows of stepwise
activities and actions, with support for choice, iteration and concurrency. In the
Unified Modeling Language, activity diagrams can be used to describe the
business and operational step-by-step workflows of components in a system. An
activity diagram shows the overall flow of control.

[Activity diagram: user -> dataset -> classification -> (supervised | unsupervised) -> (regression | classifying) -> features (days) / labels (price) -> testing & training -> predict the data using the SVM algorithm -> plotting.]
[Component diagram: a server node containing the datasets and the numpy, sklearn and matplotlib components.]

IMPLEMENTATION
SOURCE CODE
Detect_video.py
# USAGE
# python detect_mask_video.py

# import the necessary packages
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2
import os

def detect_and_predict_mask(frame, faceNet, maskNet):
    # grab the dimensions of the frame and then construct a blob
    # from it
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(frame, 1.0, (300, 300),
        (104.0, 177.0, 123.0))

    # pass the blob through the network and obtain the face detections
    faceNet.setInput(blob)
    detections = faceNet.forward()

    # initialize our list of faces, their corresponding locations,
    # and the list of predictions from our face mask network
    faces = []
    locs = []
    preds = []

    # loop over the detections
    for i in range(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with
        # the detection
        confidence = detections[0, 0, i, 2]

        # filter out weak detections by ensuring the confidence is
        # greater than the minimum confidence
        if confidence > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the object
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

            # ensure the bounding boxes fall within the dimensions of
            # the frame
            (startX, startY) = (max(0, startX), max(0, startY))
            (endX, endY) = (min(w - 1, endX), min(h - 1, endY))

            # extract the face ROI, convert it from BGR to RGB channel
            # ordering, resize it to 224x224, and preprocess it
            face = frame[startY:endY, startX:endX]
            face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
            face = cv2.resize(face, (224, 224))
            face = img_to_array(face)
            face = preprocess_input(face)

            # add the face and bounding boxes to their respective
            # lists
            faces.append(face)
            locs.append((startX, startY, endX, endY))

    # only make predictions if at least one face was detected
    if len(faces) > 0:
        # for faster inference we'll make batch predictions on *all*
        # faces at the same time rather than one-by-one predictions
        # in the above `for` loop
        faces = np.array(faces, dtype="float32")
        preds = maskNet.predict(faces, batch_size=32)

    # return a 2-tuple of the face locations and their corresponding
    # predictions
    return (locs, preds)

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-f", "--face", type=str, default="face_detector",
    help="path to face detector model directory")
ap.add_argument("-m", "--model", type=str, default="mask_detector.model",
    help="path to trained face mask detector model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# load our serialized face detector model from disk
print("[INFO] loading face detector model...")
prototxtPath = os.path.sep.join([args["face"], "deploy.prototxt"])
weightsPath = os.path.sep.join([args["face"],
    "res10_300x300_ssd_iter_140000.caffemodel"])
faceNet = cv2.dnn.readNet(prototxtPath, weightsPath)

# load the face mask detector model from disk
print("[INFO] loading face mask detector model...")
maskNet = load_model(args["model"])

# initialize the video stream and allow the camera sensor to warm up
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)

# loop over the frames from the video stream
while True:
    # grab the frame from the threaded video stream and resize it
    # to have a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)

    # detect faces in the frame and determine if they are wearing a
    # face mask or not
    (locs, preds) = detect_and_predict_mask(frame, faceNet, maskNet)

    # loop over the detected face locations and their corresponding
    # predictions
    for (box, pred) in zip(locs, preds):
        # unpack the bounding box and predictions
        (startX, startY, endX, endY) = box
        (mask, withoutMask) = pred

        # determine the class label and color we'll use to draw
        # the bounding box and text
        label = "Mask" if mask > withoutMask else "No Mask"
        color = (0, 255, 0) if label == "Mask" else (0, 0, 255)

        # include the probability in the label
        label = "{}: {:.2f}%".format(label, max(mask, withoutMask) * 100)

        # display the label and bounding box rectangle on the output
        # frame
        cv2.putText(frame, label, (startX, startY - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, color, 2)
        cv2.rectangle(frame, (startX, startY), (endX, endY), color, 2)

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()

Detect_image.py

# USAGE
# python detect_mask_image.py --image images/pic1.jpeg

# import the necessary packages
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing.image import img_to_array
from tensorflow.keras.models import load_model
import numpy as np
import argparse
import cv2
import os

def mask_image():
    # construct the argument parser and parse the arguments
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True,
        help="path to input image")
    ap.add_argument("-f", "--face", type=str, default="face_detector",
        help="path to face detector model directory")
    ap.add_argument("-m", "--model", type=str, default="mask_detector.model",
        help="path to trained face mask detector model")
    ap.add_argument("-c", "--confidence", type=float, default=0.5,
        help="minimum probability to filter weak detections")
    args = vars(ap.parse_args())

    # load our serialized face detector model from disk
    print("[INFO] loading face detector model...")
    prototxtPath = os.path.sep.join([args["face"], "deploy.prototxt"])
    weightsPath = os.path.sep.join([args["face"],
        "res10_300x300_ssd_iter_140000.caffemodel"])
    net = cv2.dnn.readNet(prototxtPath, weightsPath)

    # load the face mask detector model from disk
    print("[INFO] loading face mask detector model...")
    model = load_model(args["model"])

    # load the input image from disk, clone it, and grab the image
    # spatial dimensions
    image = cv2.imread(args["image"])
    orig = image.copy()
    (h, w) = image.shape[:2]

    # construct a blob from the image
    blob = cv2.dnn.blobFromImage(image, 1.0, (300, 300),
        (104.0, 177.0, 123.0))

    # pass the blob through the network and obtain the face detections
    print("[INFO] computing face detections...")
    net.setInput(blob)
    detections = net.forward()

    # loop over the detections
    for i in range(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with
        # the detection
        confidence = detections[0, 0, i, 2]

        # filter out weak detections by ensuring the confidence is
        # greater than the minimum confidence
        if confidence > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the object
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

            # ensure the bounding boxes fall within the dimensions of
            # the frame
            (startX, startY) = (max(0, startX), max(0, startY))
            (endX, endY) = (min(w - 1, endX), min(h - 1, endY))

            # extract the face ROI, convert it from BGR to RGB channel
            # ordering, resize it to 224x224, and preprocess it
            face = image[startY:endY, startX:endX]
            face = cv2.cvtColor(face, cv2.COLOR_BGR2RGB)
            face = cv2.resize(face, (224, 224))
            face = img_to_array(face)
            face = preprocess_input(face)
            face = np.expand_dims(face, axis=0)

            # pass the face through the model to determine if the face
            # has a mask or not
            (mask, withoutMask) = model.predict(face)[0]

            # determine the class label and color we'll use to draw
            # the bounding box and text
            label = "Mask" if mask > withoutMask else "No Mask"
            color = (0, 255, 0) if label == "Mask" else (0, 0, 255)

            # include the probability in the label
            label = "{}: {:.2f}%".format(label, max(mask, withoutMask) * 100)

            # display the label and bounding box rectangle on the output
            # image
            cv2.putText(image, label, (startX, startY - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 0.45, color, 2)
            cv2.rectangle(image, (startX, startY), (endX, endY), color, 2)

    # show the output image
    cv2.imshow("Output", image)
    cv2.waitKey(0)

if __name__ == "__main__":
    mask_image()
SCREENSHOTS
Conclusion

An accurate and efficient face mask detection system with facial
recognition has been developed, and it achieves metrics comparable with the
existing state-of-the-art systems. This project uses recent techniques in the fields
of computer vision and deep learning. A custom dataset was created using
LabelImg, and the evaluation was consistent. The system can be used in real-time
applications which require facial recognition for pre-processing in their
pipeline.

An important future scope would be to train the system on video sequences for
usage in tracking applications. The addition of a temporally consistent network
would enable smooth detection that is more optimal than per-frame detection.
