
A NOVEL AND EFFICIENT REAL-TIME DRIVER FATIGUE AND YAWN DETECTION-ALERT SYSTEM

Abstract

Fatigue among drivers is a major cause of road accidents every year in India. Lack of
sound sleep for six to eight hours is one of the primary reasons behind this fatigue.
Sleep deprivation impairs a driver's reaction time and decision-making behind the
wheel, which increases the likelihood of accidents. Such accidents are more likely to
result in death or severe injury because they tend to occur at high speed, and because
a driver who has fallen asleep cannot brake or swerve to avoid or reduce the impact.
It is therefore essential to build a smart system that can detect the driver's condition
and alert him or her. Although a few solutions have been proposed in this direction,
most have not been implemented successfully and many remain theoretical. In this
paper, we propose an efficient driver fatigue detection and alert system built mainly
on open-source technologies. We implement and test the system in real time, and the
results are highly encouraging compared to many existing systems.


CHAPTER 1
INTRODUCTION


INTRODUCTION
Due to the non-availability of prior information about bus arrival schedules, people have to
wait longer at bus stops, especially in the morning when they have to reach their offices on
time. The travel time of buses varies with several external factors such as accidents and
traffic. Buses get stuck in traffic and are delayed at junctions, which makes managing the
bus schedule at the stations a difficult task. Bus stations follow fixed schedules and do not
make use of intelligent systems. Many employees are deployed at the station to control the
entrance and exit of buses and to manually prepare the trip sheets containing the schedules,
which is time consuming and inaccurate. Public transport departments also have no visibility
into the utilization of their fleets, which leads to underutilization of resources. The provision
of accurate travel time information is therefore important. With the help of new technology,
an administrator can monitor bus traffic while increasing passenger satisfaction and
reducing cost through efficient operations.

Face recognition is one of the major problems in biometric technology. It identifies and/or
verifies a person using the 2D/3D physical characteristics of face images. The baseline
method of face recognition is the Eigenface approach, whose goal is to project the image
space linearly onto a feature space of lower dimensionality. A face image can be
reconstructed using only the few eigenvectors that correspond to the largest eigenvalues;
this approach is also known as the Eigenpicture, the Karhunen-Loeve transform, or principal
component analysis. Several techniques have been proposed for solving major problems in
face recognition, such as the Fisherface, elastic bunch graph matching, and the support
vector machine. However, there are still many challenging problems in face recognition,
such as facial expressions, pose variations, occlusion, and illumination change. These
variations dramatically degrade the performance of a face recognition system. Illumination
variation has the greatest impact on the appearance of face images, because intensities
fluctuate with the shadows cast by different light source directions. One of the keys to
success is therefore to increase the robustness of the face representation against these
variations.

To reduce illumination variation, many methods have been proposed in the literature.
Belhumeur et al. suggested that discarding the three most significant principal components can


reduce the illumination variation in face images. Nevertheless, the three most significant
principal components contain not only illumination variations but also some useful
information, so the system degrades as well. Wang et al. proposed the Self-Quotient
Image (SQI), which uses only a single image. The SQI is obtained using a weighted
Gaussian function as a smoothing kernel. The Total Variation Quotient Image (TVQI) and
the Logarithmic Total Variation (LTV) model have also been proposed, in which the face
image is decomposed into a small-scale (texture) image and a large-scale (cartoon) image.
The normalized image is obtained by dividing the original image by the large-scale one.
TVQI and LTV have very high computational complexity due to the second-order cone
programming used in their solution.
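The quotient idea behind the SQI can be sketched in a few lines of Python. This is an illustrative 1-D version that substitutes a simple box filter for the weighted Gaussian kernel, so it shows the shape of the method rather than the authors' exact formulation:

```python
def box_smooth(row, radius=1):
    """Smooth a 1-D intensity row with a box filter (edges replicated)."""
    n = len(row)
    out = []
    for i in range(n):
        window = [row[min(max(j, 0), n - 1)] for j in range(i - radius, i + radius + 1)]
        out.append(sum(window) / len(window))
    return out

def self_quotient(row):
    """Self-Quotient Image idea: divide the original by its smoothed version,
    so slowly varying illumination cancels while texture survives."""
    smoothed = box_smooth(row)
    return [v / s for v, s in zip(row, smoothed)]

print(self_quotient([2, 4, 8]))
```

Values near 1 indicate pixels close to their local average; the quotient removes the smooth (illumination-like) component while keeping local contrast.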
However, these methods handle only illumination variation, not other variations, whereas
face representation-based methods are more robust: they are insensitive not only to
illumination variation but also to facial expression. A well-known example is the Local
Binary Pattern (LBP) and its extensions, which were originally designed for texture
description. The LBP operator assigns a label to every pixel of an image by thresholding
each pixel in its 3x3 neighborhood against the centre pixel value; a decimal representation
is then obtained from the resulting binary sequence (8 bits). The LBP image is subsequently
divided into R non-overlapping regions of the same size, and the local histogram over each
region is calculated. Finally, the concatenated histograms form the face descriptor.
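The labelling step just described can be made concrete. This Python sketch computes the LBP code for the centre of a single 3x3 patch; the clockwise sampling order of the eight neighbours is a common convention, assumed here rather than stated in the text:

```python
def lbp_label(patch):
    """LBP code for the centre of a 3x3 patch: threshold each of the 8
    neighbours against the centre value and read the bits as a decimal label."""
    centre = patch[1][1]
    # neighbours sampled clockwise from the top-left corner (a convention)
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    bits = ''.join('1' if patch[r][c] >= centre else '0' for r, c in coords)
    return int(bits, 2)  # 8-bit binary sequence -> decimal label in 0..255

patch = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
print(lbp_label(patch))
```

Applying this to every pixel yields the LBP image; the region histograms are then just counts of these 0..255 labels over each of the R regions.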

1.1 IMAGE SEGMENTATION


In computer vision, segmentation refers to the process of partitioning a digital image
into multiple segments (sets of pixels, also known as superpixels). The goal of segmentation
is to simplify and/or change the representation of an image into something that is more
meaningful and easier to analyze. Image segmentation is typically used to locate objects and
boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process
of assigning a label to every pixel in an image such that pixels with the same label share certain
visual characteristics.
The result of image segmentation is a set of segments that collectively cover the entire
image, or a set of contours extracted from the image (see edge detection). Each of the pixels in
a region is similar with respect to some characteristic or computed property, such as colour,
intensity, or texture; adjacent regions differ significantly with respect to the same
characteristic(s).
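The per-pixel labelling view of segmentation can be illustrated with the simplest possible segmenter, a global threshold. This Python sketch is illustrative only; real segmenters use far richer criteria:

```python
def threshold_segment(image, t):
    """Assign label 1 to pixels >= t and label 0 otherwise -- the simplest
    case of giving every pixel a label so that same-label pixels share a
    visual characteristic (here, similar intensity)."""
    return [[1 if v >= t else 0 for v in row] for row in image]

image = [[12, 200, 210],
         [10,  15, 220],
         [ 9,  11,  14]]
print(threshold_segment(image, 128))
```

The bright region (label 1) and dark region (label 0) together cover the entire image, matching the definition above.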


APPLICATIONS
Some of the practical applications of image segmentation are:
➢ Medical imaging
   • Locate tumors and other pathologies
   • Measure tissue volumes
   • Computer-guided surgery
   • Diagnosis
   • Treatment planning
   • Study of anatomical structure
➢ Locate objects in satellite images (roads, forests, etc.)
➢ Face recognition
➢ Fingerprint recognition
➢ Traffic control systems
➢ Brake light detection
➢ Machine vision

Several general-purpose algorithms and techniques have been developed for image
segmentation. Since there is no general solution to the image segmentation problem, these
techniques often have to be combined with domain knowledge in order to effectively solve an
image segmentation problem for a problem domain.


CHAPTER 2
IMAGE PROCESSING


IMAGE PROCESSING
2.1 IMAGE

An image is a two-dimensional picture that has a similar appearance to some subject,
usually a physical object or a person.

An image may be two-dimensional, such as a photograph or a screen display, or
three-dimensional, such as a statue. Images may be captured by optical devices such as
cameras, mirrors, lenses, telescopes, and microscopes, or by natural objects and
phenomena such as the human eye or water surfaces.

The word image is also used in the broader sense of any two-dimensional figure such
as a map, a graph, a pie chart, or an abstract painting. In this wider sense, images can also be
rendered manually, such as by drawing, painting, carving, rendered automatically by printing
or computer graphics technology, or developed by a combination of methods, especially in a
pseudo-photograph.

Fig.2.1. 2D Image


An image is a rectangular grid of pixels. It has a definite height and a definite
width counted in pixels. Each pixel is square and has a fixed size on a given display,
though different computer monitors may use pixels of different sizes.

The pixels that constitute an image are ordered as a grid (columns and rows);
each pixel consists of numbers representing magnitudes of brightness and colour.

Fig.2.2. Image pixels table

Each pixel has a colour. The colour is a 32-bit integer. The first eight bits
determine the redness of the pixel, the next eight bits the greenness, the next eight bits
the blueness, and the remaining eight bits the transparency of the pixel.

Fig.2.3. 32 Bit Integer
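The channel layout described above can be sketched with bit shifts and masks. The red-green-blue-transparency byte order follows the text; real pixel formats differ (e.g. ARGB vs. RGBA), so this ordering is an assumption:

```python
def unpack_rgba(pixel):
    """Split a 32-bit colour integer into four 8-bit channels, using the
    red/green/blue/transparency byte order described above (an assumption:
    actual formats vary, e.g. ARGB vs. RGBA)."""
    red   = (pixel >> 24) & 0xFF   # first eight bits
    green = (pixel >> 16) & 0xFF   # next eight bits
    blue  = (pixel >> 8)  & 0xFF   # next eight bits
    alpha = pixel & 0xFF           # remaining eight bits: transparency
    return red, green, blue, alpha

print(unpack_rgba(0xFF8040C0))   # (255, 128, 64, 192)
```

Each `& 0xFF` keeps exactly eight bits, so every channel is an integer in 0..255.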

2.2 IMAGE FILE SIZES

Image file size is expressed as a number of bytes that increases with the number of
pixels composing the image and with the colour depth of those pixels. The greater the
number of rows and columns, the greater the image resolution and the larger the file.
Each pixel also increases in size as its colour depth increases: an 8-bit

pixel (1 byte) stores 256 colours, and a 24-bit pixel (3 bytes) stores 16 million colours;
the latter is known as true colour.

Image compression uses algorithms to decrease the size of a file. High-resolution
cameras produce large image files, ranging from hundreds of kilobytes to megabytes,
depending on the camera's resolution and the image-storage format. High-resolution
digital cameras record images of 12 megapixels (1 MP = 1,000,000 pixels) or more in
true colour. Consider an image recorded by a 12 MP camera: since each pixel uses
3 bytes to record true colour, the uncompressed image would occupy 36,000,000 bytes
of memory, a great amount of digital storage for one image, given that cameras must
record and store many images to be practical. Faced with such large file sizes, both
within the camera and on a storage disc, image file formats were developed to store
these large images.
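The 36,000,000-byte figure above is simple arithmetic, which this sketch reproduces (the 4000 x 3000 shape is one plausible 12 MP geometry, assumed for illustration):

```python
def uncompressed_size(width, height, bytes_per_pixel=3):
    """Uncompressed image size in bytes: one entry per pixel times colour depth."""
    return width * height * bytes_per_pixel

# the 12 MP true-colour example from the text (4000 x 3000 = 12,000,000 pixels)
print(uncompressed_size(4000, 3000))
```

The same formula with 1 byte per pixel gives the 8-bit grayscale sizes used later in the components discussion.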

2.3 IMAGE FILE FORMATS

Image file formats are standardized means of organizing and storing images.
This section concerns the digital image formats used to store photographic and other
images. Image files are composed of either pixel data or vector (geometric) data, the
latter rasterized to pixels when displayed (with few exceptions, such as a vector
graphic display). Including proprietary types, there are hundreds of image file types.
The PNG, JPEG, and GIF formats are most often used to display images on the Internet.

Fig.2.4. Raster and Vector format

In addition to straight image formats, metafile formats are portable formats
that can include both raster and vector information. The metafile format is an


intermediate format. Most Windows applications open metafiles and then save them in
their own native format.

2.3.1 RASTER FORMATS

These formats store images as bitmaps (also known as pixmaps).

JPEG/JFIF

JPEG (Joint Photographic Experts Group) is a compression method. JPEG compressed


images are usually stored in the JFIF (JPEG File Interchange Format) file format. JPEG
compression is lossy compression. Nearly every digital camera can save images in the
JPEG/JFIF format, which supports 8 bits per color (red, green, blue) for a 24-bit total,
producing relatively small files. Photographic images may be better stored in a lossless non-
JPEG format if they will be re-edited, or if small "artifacts" are unacceptable. The JPEG/JFIF
format also is used as the image compression algorithm in many Adobe PDF files.

EXIF

The EXIF (Exchangeable image file format) format is a file standard similar to the JFIF
format with TIFF extensions. It is incorporated in the JPEG writing software used in most
cameras. Its purpose is to record and to standardize the exchange of images with image
metadata between digital cameras and editing and viewing software. The metadata are recorded
for individual images and include such things as camera settings, time and date, shutter speed,
exposure, image size, compression, name of camera, color information, etc. When images are
viewed or edited by image editing software, all of this image information can be displayed.

TIFF

The TIFF (Tagged Image File Format) is a flexible format that normally saves 8 or
16 bits per colour channel (red, green, blue), for 24-bit and 48-bit totals respectively,
usually using the TIFF or TIF filename extension. TIFF supports both lossy and lossless
compression; some variants offer relatively good lossless compression for bi-level
(black & white) images. Some digital cameras can save in TIFF format, using the LZW
compression algorithm for lossless storage. The TIFF format is not widely supported by
web browsers, but TIFF remains widely accepted as a photograph file standard in the
printing business. TIFF can handle device-specific colour spaces, such as the CMYK
defined by a particular set of printing press inks.


PNG

The PNG (Portable Network Graphics) file format was created as the free, open-source
successor to GIF. The PNG format supports true colour (16 million colours), whereas GIF
supports only 256 colours. PNG excels when the image has large, uniformly coloured
areas. The lossless PNG format is best suited for editing pictures, while lossy formats
like JPEG are best for the final distribution of photographic images, because JPEG files
are smaller than PNG files. PNG is an extensible file format for the lossless, portable,
well-compressed storage of raster images; it provides a patent-free replacement for GIF
and can also replace many common uses of TIFF. Indexed-colour, grayscale, and true-colour
images are supported, plus an optional alpha channel. PNG is designed to work well in
online viewing applications, such as the World Wide Web, and it is robust, providing both
full file-integrity checking and simple detection of common transmission errors.

GIF
GIF (Graphics Interchange Format) is limited to an 8-bit palette, or 256 colors. This
makes the GIF format suitable for storing graphics with relatively few colors such as simple
diagrams, shapes, logos and cartoon style images. The GIF format supports animation and is
still widely used to provide image animation effects. It also uses a lossless compression that is
more effective when large areas have a single color, and ineffective for detailed images or
dithered images.

BMP

The BMP file format (Windows bitmap) handles graphics files within the Microsoft
Windows OS. Typically, BMP files are uncompressed, hence they are large. The advantage is
their simplicity and wide acceptance in Windows programs.

2.3.2 VECTOR FORMATS

As opposed to the raster image formats above (where the data describes the
characteristics of each individual pixel), vector image formats contain a geometric description
which can be rendered smoothly at any desired display size.

At some point, all vector graphics must be rasterized in order to be displayed on digital
monitors. However, vector images can be displayed with analog CRT technology such as that


used in some electronic test equipment, medical monitors, radar displays, laser shows and early
video games. Plotters are printers that use vector data rather than pixel data to draw graphics.

CGM

CGM (Computer Graphics Metafile) is a file format for 2D vector graphics, raster
graphics, and text. All graphical elements can be specified in a textual source file that can be
compiled into a binary file or one of two text representations. CGM provides a means of
graphics data interchange for computer representation of 2D graphical information independent
from any particular application, system, platform, or device.

SVG

SVG (Scalable Vector Graphics) is an open standard created and developed by the
World Wide Web Consortium to address the need for a versatile, scriptable, all-purpose
vector format for the web and elsewhere. The SVG format does not have a compression
scheme of its own, but due to the textual nature of XML, an SVG graphic can be compressed
using a program such as gzip.

2.4 IMAGE PROCESSING

Digital image processing, the manipulation of images by computer, is a relatively recent
development in terms of man's ancient fascination with visual stimuli. In its short history, it
has been applied to practically every type of image, with varying degrees of success. The
inherent subjective appeal of pictorial displays attracts perhaps a disproportionate amount of
attention from scientists and laymen alike. Digital image processing, like other glamour
fields, suffers from myths, misconceptions, misunderstandings, and misinformation. It is a
vast umbrella under which fall diverse aspects of optics, electronics, mathematics,
photography, graphics, and computer technology. It is a truly multidisciplinary endeavour
plagued with imprecise jargon.

Several factors combine to indicate a lively future for digital image processing. A major
factor is the declining cost of computer equipment. Several new technological trends promise
to further promote digital image processing, including parallel processing made practical by
low-cost microprocessors, and the use of charge-coupled devices (CCDs) for digitizing, for
storage during processing and display, and for large low-cost image storage arrays.


2.5 FUNDAMENTAL STEPS IN DIGITAL IMAGE PROCESSING

Fig.2.5. Block Diagram Digital Image Processing

2.5.1 IMAGE ACQUISITION


Image acquisition is the process of acquiring a digital image. It requires an image sensor
and the capability to digitize the signal produced by the sensor. The sensor could be a
monochrome or colour TV camera that produces an entire image of the problem domain every
1/30 s. The image sensor could also be a line-scan camera that produces a single image line
at a time; in this case, the object's motion past the line produces a two-dimensional image.

Fig.2.5.1. Digital Camera


Fig.2.5.1. Scanner

2.5.2 IMAGE ENHANCEMENT

Image enhancement is among the simplest and most appealing areas of digital image
processing. The idea behind enhancement techniques is to bring out detail that is obscured,
or simply to highlight certain features of interest in an image. A familiar example of
enhancement is increasing the contrast of an image because "it looks better." It is important
to keep in mind that enhancement is a very subjective area of image processing.

Fig.2.8. Image Enhancement

2.5.3 IMAGE RESTORATION

Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the sense
that restoration techniques tend to be based on mathematical or probabilistic models of image
degradation.


Fig.2.9. Image Restoration

Enhancement, on the other hand, is based on human subjective preferences regarding


what constitutes a “good” enhancement result. For example, contrast stretching is considered
an enhancement technique because it is based primarily on the pleasing aspects it might present
to the viewer, whereas removal of image blur by applying a deblurring function is considered
a restoration technique.
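Contrast stretching, the enhancement example mentioned above, can be sketched as a linear min-max remapping of intensities; this is a minimal illustrative version, not a production routine:

```python
def contrast_stretch(row, lo=0, hi=255):
    """Linearly remap the row's own min..max intensity range onto lo..hi,
    spreading the values across the full display range."""
    mn, mx = min(row), max(row)
    if mx == mn:                      # flat image: nothing to stretch
        return [lo] * len(row)
    return [int(lo + (v - mn) * (hi - lo) / (mx - mn)) for v in row]

print(contrast_stretch([50, 100, 150]))   # mid-grey values spread over 0..255
```

Nothing here models the degradation process; the mapping is chosen purely because the result looks better, which is exactly what makes it enhancement rather than restoration.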

2.5.4 COLOR IMAGE PROCESSING

The use of colour in image processing is motivated by two principal factors. First, colour
is a powerful descriptor that often simplifies object identification and extraction from a scene.
Second, humans can discern thousands of colour shades and intensities, compared to only
about two dozen shades of grey. This second factor is particularly important in manual image
analysis.

Fig.2.5.4. Color image processing

2.5.5 WAVELETS AND MULTI RESOLUTION PROCESSING

Wavelets are the foundation for representing images in various degrees of resolution.
Although the Fourier transform has been the mainstay of transform-based image processing
since the late 1950s, a more recent transformation, called the wavelet transform, now makes
it even easier to compress, transmit, and analyse many images. Unlike the Fourier transform,
whose basis functions are sinusoids, wavelet transforms are based on small waves, called
wavelets, of varying frequency and limited duration.

Fig.2.11. Wavelets and multi resolution processing image

Wavelets were first shown to be the foundation of a powerful new approach to signal
processing and analysis called multiresolution theory. Multiresolution theory incorporates
and unifies techniques from a variety of disciplines, including subband coding from signal
processing, quadrature mirror filtering from digital speech recognition, and pyramidal image
processing.
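One level of the Haar transform, the simplest wavelet, illustrates the idea of splitting a signal into a coarse approximation and limited-duration detail. This sketch uses the average/half-difference variant (the orthonormal form scales by 1/sqrt(2) instead), so the normalization is an assumption:

```python
def haar_step(signal):
    """One level of the Haar wavelet transform: pairwise averages give a
    half-resolution approximation, pairwise half-differences the localized
    detail needed to reconstruct the original."""
    assert len(signal) % 2 == 0, "needs an even number of samples"
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

approx, detail = haar_step([4, 6, 10, 12])
print(approx, detail)
```

Repeating the step on each successive approximation yields the multiresolution pyramid the text refers to; near-zero details compress well, which is why wavelets suit image compression.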

2.5.6 COMPRESSION

Compression, as the name implies, deals with techniques for reducing the storage
required to save an image, or the bandwidth required to transmit it. Although storage
technology has improved significantly over the past decade, the same cannot be said for
transmission capacity. This is true particularly in uses of the Internet, which are characterized
by significant pictorial content. Image compression is familiar to most users of computers in
the form of image file extensions, such as the .jpg extension used in the JPEG (Joint
Photographic Experts Group) image compression standard.

2.5.7 MORPHOLOGICAL PROCESSING


Morphological processing deals with tools for extracting image components that are
useful in the representation and description of shape. The language of mathematical
morphology is set theory; as such, morphology offers a unified and powerful approach to
numerous image processing problems. Sets in mathematical morphology represent objects in
an image. For example, the set of all black pixels in a binary image is a complete morphological
description of the image.

Fig.2.5.7. Morphological processing image

In binary images, the sets in question are members of the 2-D integer space Z2, where
each element of a set is a 2-D vector whose coordinates are the (x, y) coordinates of a black (or
white) pixel in the image. Gray-scale digital images can be represented as sets whose
components are in Z3. In this case, two components of each element of the set refer to the
coordinates of a pixel, and the third corresponds to its discrete gray-level value.
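The set formulation can be made concrete with the two basic morphological operators, dilation and erosion, acting directly on sets of (x, y) pixel coordinates; the cross-shaped structuring element and the two-pixel object below are arbitrary illustrative choices:

```python
def dilate(obj, struct):
    """Dilation: translate the object set by every offset in the structuring
    element and take the union (the object grows)."""
    return {(x + dx, y + dy) for (x, y) in obj for (dx, dy) in struct}

def erode(obj, struct):
    """Erosion: keep only the pixels where every translated element offset
    still lands inside the object (the object shrinks)."""
    return {(x, y) for (x, y) in obj
            if all((x + dx, y + dy) in obj for (dx, dy) in struct)}

cross = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}   # 4-connected element
obj = {(0, 0), (1, 0)}                               # a two-pixel object
print(sorted(dilate(obj, cross)))
print(erode(obj, cross))
```

Because everything is expressed as sets in Z^2, the same two primitives compose into opening, closing, and the other tools mentioned above.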

2.5.8 SEGMENTATION

Segmentation procedures partition an image into its constituent parts or objects. In


general, autonomous segmentation is one of the most difficult tasks in digital image processing.
A rugged segmentation procedure brings the process a long way toward successful solution of
imaging problems that require objects to be identified individually.


Fig.2.13. Segmented images

On the other hand, weak or erratic segmentation algorithms almost always guarantee
eventual failure. In general, the more accurate the segmentation, the more likely recognition is
to succeed.

2.5.9 REPRESENTATION AND DESCRIPTION

Representation and description almost always follow the output of a segmentation


stage, which usually is raw pixel data, constituting either the boundary of a region (i.e., the set
of pixels separating one image region from another) or all the points in the region itself. In
either case, converting the data to a form suitable for computer processing is necessary. The
first decision that must be made is whether the data should be represented as a boundary or as
a complete region. Boundary representation is appropriate when the focus is on external shape
characteristics, such as corners and inflections.

Regional representation is appropriate when the focus is on internal properties, such as


texture or skeletal shape. In some applications, these representations complement each other.
Choosing a representation is only part of the solution for transforming raw data into a form
suitable for subsequent computer processing. A method must also be specified for describing
the data so that features of interest are highlighted. Description, also called feature selection,
deals with extracting attributes that result in some quantitative information of interest or are
basic for differentiating one class of objects from another.

2.5.10 OBJECT RECOGNITION

The last stage involves recognition and interpretation. Recognition is the process that
assigns a label to an object based on the information provided by its descriptors. Interpretation
involves assigning meaning to an ensemble of recognized objects.

2.5.11 KNOWLEDGEBASE

Knowledge about a problem domain is coded into an image processing system in the form
of a knowledge database. This knowledge may be as simple as detailing the regions of an
image where the information of interest is known to be located, thus limiting the search that
has to be conducted in seeking that information. The knowledge base can also be quite
complex, such as an interrelated list of all major possible defects in a materials-inspection
problem, or an image database containing high-resolution satellite images of a region in
connection with

change-detection applications. In addition to guiding the operation of each processing module,
the knowledge base also controls the interaction between modules. The system must be
endowed with the knowledge to recognize the significance of the location of a string with
respect to other components of an address field. This knowledge guides not only the operation
of each module, but also aids in feedback operations between modules through the knowledge
base. We implemented the pre-processing techniques using MATLAB.

2.6 COMPONENTS OF AN IMAGE PROCESSING SYSTEM

As recently as the mid-1980s, numerous models of image processing systems being


sold throughout the world were rather substantial peripheral devices that attached to equally
substantial host computers. Late in the 1980s and early in the 1990s, the market shifted to
image processing hardware in the form of single boards designed to be compatible with
industry standard buses and to fit into engineering workstation cabinets and personal
computers. In addition to lowering costs, this market shift also served as a catalyst for a
significant number of new companies whose specialty is the development of software written
specifically for image processing.

Fig.2.6 Components of an Image Processing System


Although large-scale image processing systems are still sold for massive imaging
applications, such as processing of satellite images, the trend continues toward miniaturization
and the blending of general-purpose small computers with specialized image processing
hardware. Fig.2.6 shows the basic components of a typical general-purpose system used for
digital image processing. The function of each component is discussed in the following
paragraphs, starting with image sensing.

➢ Image sensors: With reference to sensing, two elements are required to acquire digital
images. The first is a physical device that is sensitive to the energy radiated by the
object we wish to image. The second, called a digitizer, is a device for converting the
output of the physical sensing device into digital form. For instance, in a digital video
camera, the sensors produce an electrical output proportional to light intensity. The
digitizer converts these outputs to digital data.
➢ Specialized image processing hardware: Specialized image processing hardware
usually consists of the digitizer just mentioned, plus hardware that performs other
primitive operations, such as an arithmetic logic unit (ALU), which performs arithmetic
and logical operations in parallel on entire images. One example of how an ALU is used
is in averaging images as quickly as they are digitized, for the purpose of noise
reduction. This type of hardware sometimes is called a front-end subsystem, and its
most distinguishing characteristic is speed. In other words, this unit performs functions
that require fast data throughputs (e.g., digitizing and averaging video images at 30
frames) that the typical main computer cannot handle.
➢ Computer: The computer in an image processing system is a general-purpose
computer and can range from a PC to a supercomputer. In dedicated applications,
sometimes specially designed computers are used to achieve a required level of
performance, but our interest here is on general-purpose image processing systems. In
these systems, almost any well-equipped PC-type machine is suitable for offline image
processing tasks.
➢ Image processing software: Software for image processing consists of specialized
modules that perform specific tasks. A well-designed package also includes the
capability for the user to write code that, as a minimum, utilizes the specialized
modules. More sophisticated software packages allow the integration of those modules
and general-purpose software commands from at least one computer language.


➢ Mass storage: Mass storage capability is a must in image processing applications. An


image of size 1024*1024 pixels, in which the intensity of each pixel is an 8-bit quantity,
requires one megabyte of storage space if the image is not compressed. When dealing
with thousands, or even millions, of images, providing adequate storage in an image
processing system can be a challenge. Digital storage for image processing applications
falls into three principal categories: (1) short-term storage for use during processing, (2)
on-line storage for relatively fast recall, and (3) archival storage, characterized by
infrequent access. Storage is measured in bytes (eight bits), Kbytes (one thousand
bytes), Mbytes (one million bytes), Gbytes (meaning giga, or one billion, bytes), and
Tbytes (meaning tera, or one trillion, bytes).
One method of providing short-term storage is computer memory. Another is
by specialized boards, called frame buffers that store one or more images and can be
accessed rapidly, usually at video rates. The latter method allows virtually
instantaneous image zoom, as well as scroll (vertical shifts) and pan (horizontal shifts).
Frame buffers usually are housed in the specialized image processing hardware unit
shown in Fig.2.6. Online storage generally takes the form of magnetic disks or optical-
media storage. The key factor characterizing on-line storage is frequent access to the
stored data. Finally, archival storage is characterized by massive storage requirements
but infrequent need for access. Magnetic tapes and optical disks housed in “jukeboxes”
are the usual media for archival applications.
➢ Image displays: Image displays in use today are mainly color (preferably flat screen)
TV monitors. Monitors are driven by the outputs of image and graphics display cards
that are an integral part of the computer system. Seldom are there requirements for
image display applications that cannot be met by display cards available commercially
as part of the computer system. In some cases, it is necessary to have stereo displays,
and these are implemented in the form of headgear containing two small displays
embedded in goggles worn by the user.
➢ Hardcopy: Hardcopy devices for recording images include laser printers, film cameras,
heat-sensitive devices, inkjet units, and digital units, such as optical and CD-ROM
disks. Film provides the highest possible resolution, but paper is the obvious medium
of choice for written material. For presentations, images are displayed on film
transparencies or in a digital medium if image projection equipment is used. The latter
approach is gaining acceptance as the standard for image presentations.


➢ Network: Networking is almost a default function in any computer system in use today.
Because of the large amount of data inherent in image processing applications, the key
consideration in image transmission is bandwidth. In dedicated networks, this typically
is not a problem, but communications with remote sites via the Internet are not always
as efficient. Fortunately, this situation is improving quickly as a result of optical fiber
and other broadband technologies.


CHAPTER 3
LITERATURE SURVEY


LITERATURE SURVEY

3.1 FACE DESCRIPTION WITH LOCAL BINARY PATTERNS APPLICATION TO


FACE RECOGNITION

The procedure consists of using the texture descriptor to build several local descriptions
of the face and combining them into a global description. Instead of striving for a holistic
description, the facial image is divided into local regions and texture descriptors are extracted
from each region independently. The LBP labels for the histogram contain information about
the patterns at a pixel level, the labels are summed over a small region to produce information
at a regional level, and the regional histograms are concatenated to build a global description
of the face.

MERITS

➢ The method extracts the pattern of the image. So, it can identify the face effectively.

DEMERITS

➢ The important features must be selected from the LBP representation, which is a difficult task.

3.2 EVALUATION OF FACE RECOGNITION TECHNIQUES FOR APPLICATION


TO FACEBOOK

In conclusion, we have utilized a new, real-world source of images to test a variety of
algorithms for holistic performance with respect to the potential application to Facebook. The
results from PCA, LDA, ICA, and SVMs show that no single or hybrid method tried is ideally
suited to a widespread application for use by millions of Facebook users. SVM and ILDA
methods yield a fair 65% accuracy at the cost of high computation and memory requirements.
Further, they must be completely retrained with each new image. Conversely, the individual
IPCA approach is well suited to a real-world implementation, but yields a low accuracy
intolerable to most users. However, if the scope were scaled back from full autonomy, individual
IPCA could aid in tagging by automatically detecting faces and suggesting the most likely
identities.


MERITS

➢ The method provides better recognition results than the existing system.

DEMERITS

➢ The accuracy of the classifier is very low.
3.3 LOCALIZING PARTS OF FACES USING A CONSENSUS OF EXEMPLARS

Our work focuses on localizing parts in natural face images, taken under a wide range
of poses, lighting conditions, and facial expressions, and in the presence of occluding objects
such as sunglasses or microphones.

MERITS

➢ The system can effectively localize the facial parts.

DEMERITS

➢ The system has high computational complexity.

3.4 PLASTIC SURGERY A NEW DIMENSION TO FACE RECOGNITION

Plastic surgery has been an unexplored area in the face recognition domain, and it poses
ethical, social, and engineering challenges. The procedures can significantly change the facial
regions both locally and globally, altering the appearance, facial features, and texture, thereby
posing a serious challenge to face recognition systems. Existing face recognition algorithms
generally rely on local and global facial features, and any variation can affect recognition
performance. Here, PCA features are extracted and then recognized by a GNN.

MERITS

➢ The method provides higher accuracy than the existing system.

DEMERITS

➢ The method takes more time for execution.


3.5 OPEN SET FACE RECOGNITION USING TRANSDUCTION

Biometric systems in general, and face recognition engines in particular, require
significant tuning and calibration, for setting the detection thresholds among other things,
which makes them suitable candidates for the open set recognition problem. Towards that end
we introduced the Open Set TCM-kNN (Transduction Confidence Machine - k Nearest
Neighbours), a novel realization of transductive inference that is suitable for open set
multi-class classification and includes a rejection option. Outlier detection corresponds to
change detection when faces or patterns change their appearance. Feature selection for
enhanced pattern recognition can be further achieved using strangeness and the p-value
function: the stranger the feature values are, the better the discrimination between the
patterns.

MERITS

➢ It yields overall much better performance compared to PCA components.

DEMERITS

➢ There is overlap between the unlabeled data.


CHAPTER 4
EXISTING SYSTEM


LBP BASED HOG METHOD

Driver drowsiness, especially among long-distance truck drivers, public service vehicle
drivers, and private vehicle drivers, is a significant concern. Despite taking several measures,
the government has failed to rectify the problem. Steps such as checking whether the driver is
drunk, among other actions, have been considered but are unable to address the problem. A
few systems are available in the market; however, their cost is high, discouraging drivers from
purchasing them. Therefore, we need to come up with an affordable solution for the lower
middle class, keeping in mind the accidents associated with drowsiness in public service
vehicles.

Flow chart

Local Binary Patterns (LBP) is a type of feature used for classification in computer vision.
First described by Ojala et al., it has since been found to be a powerful feature for texture
classification. The LBP feature vector, in its simplest form, is created in the following manner:


• Divide the examined window into cells (e.g. 16x16 pixels for each cell).
• For each pixel in a cell, compare the pixel to each of its 8 neighbours (on its left-top,
left-middle, left-bottom, right-top, etc.). Follow the pixels along a circle, i.e. clockwise
or counter-clockwise.
• Where the centre pixel's value is greater than the neighbour's, write "0"; otherwise,
write "1". This gives an 8-digit binary number (which is usually converted to decimal for
convenience).
• Compute the histogram, over the cell, of the frequency of each "number" occurring (i.e.,
each combination of which pixels are smaller and which are greater than the centre).
• Optionally normalize the histogram.
• Concatenate the normalized histograms of all cells. This gives the feature vector for the
window.
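The steps above can be sketched as follows. This is an illustrative Python fragment rather than the report's MATLAB code, using the convention that a neighbour greater than or equal to the centre pixel contributes a 1 bit; the function names are hypothetical.

```python
# Minimal LBP sketch: compute the 8-bit LBP code for each interior pixel
# of a grayscale image (a list of lists), then histogram the codes.

def lbp_codes(img):
    h, w = len(img), len(img[0])
    # Clockwise neighbour offsets starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            code = 0
            for dy, dx in offsets:
                code <<= 1
                # Write 1 when the neighbour is >= the centre pixel.
                if img[y + dy][x + dx] >= center:
                    code |= 1
            codes.append(code)
    return codes

def lbp_histogram(img):
    hist = [0] * 256          # one bin per possible 8-bit pattern
    for c in lbp_codes(img):
        hist[c] += 1
    return hist

img = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]]
print(lbp_codes(img))  # a 3x3 image has a single interior pixel -> one code
```

In a full pipeline, `lbp_histogram` would be evaluated per cell and the per-cell histograms concatenated, as in the last step above.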

Binary pattern

Local binary pattern (LBP) is a popular technique used for image/face representation and
classification. LBP has been widely applied in various applications, such as texture analysis
and object recognition, due to its high discriminative power and tolerance against illumination
changes. It was originally introduced by Ojala et al. [10] for gray-scale and rotation-invariant
texture classification. Basically, LBP is invariant to monotonic gray-scale transformations.
The basic idea is that each 3x3 neighborhood in an image is thresholded by the value of its
center pixel, and a decimal representation is then obtained by taking the binary sequence
(Figure 1) as a binary number, such that LBP ∈ [0, 255].

Figure 1. LBP operator: (left) the binary sequence (8 bits) and (right) the weighted threshold

For each pixel, LBP accounts only for its relative relationship with its neighbors, while
discarding the information of amplitude, and this makes the resulting LBP values very
insensitive to illumination intensities. LBP is originally described as

LBP(x_c, y_c) = Σ_{n=0}^{7} s(i_n − i_c) · 2^n

where i_c corresponds to the grey value of the center pixel (x_c, y_c) and i_n (n = 0, …, 7) to
the grey values of the 8 surrounding pixels. s(·) is defined as:

s(x) = 1 if x ≥ 0, and s(x) = 0 otherwise.

The original LBP was later extended to multi-scale LBP [11], which uses circular
neighborhoods of different radius sizes with bilinear interpolation. LBP(P,R) indicates P
sampling pixels on a circle of radius R. An example of the multi-scale LBP operator is
illustrated in Figure 2. Another extension, called uniform patterns [11], keeps only patterns
which contain at most two bit-wise 0-to-1 or 1-to-0 transitions (in the circular binary code).
For example, the patterns 11111111 (0 transitions), 00000110 (2 transitions), and 10000111
(2 transitions) are uniform, whereas the pattern 11001001 (4 transitions) is not. These
uniform LBPs represent micro-features such as lines, edges, and corners.
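The uniform-pattern test (at most two circular bitwise transitions) can be checked directly; a small Python sketch for 8-bit codes, with hypothetical helper names:

```python
def transitions(code, bits=8):
    # Count circular 0->1 and 1->0 transitions in an n-bit pattern.
    b = [(code >> i) & 1 for i in range(bits)]
    return sum(b[i] != b[(i + 1) % bits] for i in range(bits))

def is_uniform(code):
    # A pattern is "uniform" if it has at most two circular transitions.
    return transitions(code) <= 2

print(is_uniform(0b11111111))  # 0 transitions -> True
print(is_uniform(0b00000110))  # 2 transitions -> True
print(is_uniform(0b11001001))  # 4 transitions -> False
```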

Figure 2. The multi-scale LBP operator with (8,1) and (8,2) neighborhoods. Pixel values are
bilinearly interpolated for points which are not in the center pixel.

Enhanced Local Binary Pattern

An operator that considers both local shape and texture information instead of raw grayscale
information, and that is robust to illumination variation, is the improved Local Binary
Patterns (ILBP) [8]. The main difference between ILBP and LBP is the comparison: ILBP
thresholds the pixels of the neighborhood, including the center, against the neighborhood
mean rather than against the center pixel alone.

The HOG features method [16] is a feature descriptor used for various object detection
applications in the field of computer vision. The key idea of HOG features is to group
gradient magnitudes into histogram bins based on their orientation. The image is divided
into 4 blocks of size 16x16 pixels, with each block overlapping half of the region covered by
the preceding block. Each block has 4 cells of size 8x8 pixels. Next, the gradients are
computed for each pixel inside the cells using Sobel filters. These gradient magnitudes are
then plotted in a histogram which has magnitudes on the y-axis and orientations on the
x-axis. The x-axis is divided into 9 bins, each bin having a width of 20 degrees. The gradient
magnitudes are then arranged into these 9-bin histograms based on their orientation.
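The orientation binning just described can be sketched as follows (illustrative Python, not the report's implementation; `hog_cell_histogram` is a hypothetical helper operating on the per-pixel gradients of one cell):

```python
import math

def hog_cell_histogram(gx, gy):
    # Bin gradient magnitudes into 9 orientation bins (0-180 degrees,
    # each 20 degrees wide), as done per 8x8 cell in the HOG descriptor.
    hist = [0.0] * 9
    for gxi, gyi in zip(gx, gy):
        magnitude = math.hypot(gxi, gyi)
        # Unsigned orientation folded into [0, 180).
        angle = math.degrees(math.atan2(gyi, gxi)) % 180.0
        b = min(int(angle // 20), 8)
        hist[b] += magnitude
    return hist

# Two example gradients: one at 0 degrees (bin 0), one at 90 degrees (bin 4).
print(hog_cell_histogram([3.0, 0.0], [0.0, 4.0]))
```

A block descriptor would concatenate the histograms of its 4 cells and normalize them.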

Disadvantages
➢ Its efficiency depends on environmental conditions.
➢ Due to the use of infrared illumination to acquire the video, infrared rays damage the
lens and cornea of the eye.
➢ Environmental conditions can make detection of facial expressions difficult.


CHAPTER 5
PROPOSED SYSTEM


PROPOSED SYSTEM
5.1 Introduction
Face location can be used to design a video camera system that tracks a person's face
in a room. It can be used as part of an intelligent vision system or simply in video surveillance.

Although research on face segmentation has been pursued at a feverish pace, there are
still many problems yet to be fully and convincingly solved, as the level of difficulty depends
highly on the complexity of the image content and on the application. In this paper, we discuss
the color analysis approach to face segmentation. The discussion includes the derivation of a
universal model of human skin color, the use of an appropriate color space, and the limitations
of color segmentation. We then present a practical solution to the face-segmentation problem,
including how to derive a robust skin-color reference map and how to overcome the limitations
of color segmentation.

This paper is organized as follows. The color analysis approach to face segmentation is
presented in Section II. In Section III, we present our contributions to this field of research,
which include our proposed skin-color reference map and methodology to face segmentation.
The simulation results of our proposed algorithm along with some discussion is provided in
Section IV. This is followed by Section V, which describes a video coding technique that uses
the face-segmentation results. The conclusions and further research directions are presented in
Section VI.

5.2. COLOR ANALYSIS


The use of color information for the face-locating problem was introduced in recent
years and has gained increasing attention since then. The color information is typically used
for region rather than edge segmentation. We classify region segmentation into two general
approaches, as illustrated in Fig. 5.1. One approach is to employ color as a feature for
partitioning an image into a set of homogeneous regions. The other approach makes use of
color as a feature for identifying a specific object in an image. In this case, skin color can be
used to identify the human face. This is feasible because human faces have a special color
distribution that differs significantly (although not entirely) from those of the background
objects. Hence this approach requires a color map that models the skin-color distribution
characteristics.


Fig.5.1. The use of color information for region segmentation.

Fig.5.2. Foreman image with a white contour highlighting the facial region

In another approach, the skin-color map can be designed by adopting a histogramming
technique on a given set of training images. Therefore, this individual color feature can simply
be defined by the presence of Cr values within, say, 136 and 156, and Cb values within 110 and
123. Using these ranges of values, we managed to locate the subject's face in another frame of
Foreman and also in a different scene (a standard test image called Carphone), as can be seen
in Fig. 4. This approach was suggested in the past by Li and Forchheimer; however, a detailed
procedure on the modeling of individual color features and their choice of color space was not
disclosed.


Fig. 5.3. Histograms of Cr and Cb components in the facial region

Fig. 5.4. Foreman and Carphone images, and their color segmentation results, obtained by
using the same predefined skin-color map.


COLOR SPACE

An image can be presented in a number of different color-space models:

RGB: This stands for the three primary colors: red, green, and blue. It is a hardware-oriented
model and is well known for its color-monitor display purpose.

HSV: An acronym for hue-saturation-value. Hue is a color attribute that describes a pure
color, while saturation defines the relative purity or the amount of white light mixed with a
hue; value refers to the brightness of the image. This model is commonly used for image
analysis.

YCrCb: This is yet another hardware-oriented model. However, unlike the RGB space, here
the luminance is separated from the chrominance data. The Y value represents the luminance
(or brightness) component, while the Cr and Cb values, also known as the color difference
signals, represent the chrominance component of the image.

The skin color segmentation is applied in the YCbCr color space. So, first of all, the RGB
color space is converted to the YCbCr color space. Y represents the luminance, and Cb and Cr
represent the chrominance. The RGB color space is converted to YCbCr using the following
equations:

Y = 0.299R + 0.587G +0.114B

Cb = (B-Y)*0.564 + 128

Cr = (R-Y)*0.713 + 128        (1)

The skin color segmentation is used to classify each pixel as a skin pixel or a non-skin pixel.
As the face is a connected component made of skin pixels, we obtain the facial region after
skin color segmentation. Steps for skin color segmentation:

1. The first step in skin color segmentation is to specify the range for skin pixels in the
YCbCr color space, i.e. lower and upper bounds for the Cb and Cr components.

2. Find the pixels p that fall within the range defined above.
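The conversion in equation (1) and the skin test can be sketched as follows — an illustrative Python fragment (the report's implementation is MATLAB-based), using the Cr range 136-156 and Cb range 110-123 quoted earlier for the Foreman sequence; the function names are hypothetical:

```python
def rgb_to_ycbcr(r, g, b):
    # Equation (1): luminance Y and chrominance components Cb, Cr.
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = (b - y) * 0.564 + 128
    cr = (r - y) * 0.713 + 128
    return y, cb, cr

def is_skin(r, g, b, cr_range=(136, 156), cb_range=(110, 123)):
    # Step 1: the range for skin pixels; step 2: test membership.
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return cr_range[0] <= cr <= cr_range[1] and cb_range[0] <= cb <= cb_range[1]

print(is_skin(170, 130, 120))  # a skin-like RGB triple -> True
print(is_skin(0, 0, 255))      # pure blue -> False
```

Applying `is_skin` to every pixel yields the binary skin map from which the face region is extracted as a connected component.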


The edges of an image are considered to be the most important image attributes that provide
valuable information for human image perception. Edge detection is a terminology in image
processing, particularly in the area of feature extraction, referring to algorithms that aim at
identifying points in a digital image at which the image brightness changes sharply. The data
involved in edge detection is very large, so the speed of image processing is a difficult problem.
The main objective of image processing is to improve the quality of the images for human
interpretation or for machine perception. This paper focuses on pixel-to-pixel processing of an
image and on the modification of pixel neighborhoods; the transformation can be applied to
the whole image or only to a partial region. The need to process the image in real time leads
to implementation at the hardware level, which offers parallelism and thus significantly
reduces the processing time. This is why we decided to use a block-based tool with a graphical
interface under MATLAB, Simulink, which makes it very easy to handle compared with other
hardware-description software.

Canny’s Edge Detection Algorithm

The Canny edge detection algorithm is known to many as the optimal edge detector.
Canny's intention was to enhance the many edge detectors already available at the time he
started his work. He was very successful in achieving his goal, and his ideas and methods can
be found in his paper, "A Computational Approach to Edge Detection". In his paper, he
followed a list of criteria to improve the then-current methods of edge detection. The first and
most obvious is a low error rate: it is important that edges occurring in images are not missed
and that there are no responses to non-edges. The second criterion is that the edge points be
well localized; in other words, the distance between the edge pixels as found by the detector
and the actual edge is to be at a minimum. A third criterion is to have only one response to a
single edge. This was implemented because the first two criteria were not substantial enough
to completely eliminate the possibility of multiple responses to an edge.


Based on these criteria, the Canny edge detector first smoothes the image to eliminate noise.
It then finds the image gradient to highlight regions with high spatial derivatives. The algorithm
then tracks along these regions and suppresses any pixel that is not at the maximum
(non-maximum suppression). The gradient array is further reduced by hysteresis, which is
used to track along the remaining pixels that have not been suppressed. Hysteresis uses two
thresholds, a low threshold T1 and a high threshold T2. If the magnitude is below T1, the pixel
is set to zero (made a non-edge). If the magnitude is above T2, it is made an edge. And if the
magnitude is between the two thresholds, it is set to zero unless there is a path from this pixel
to a pixel with a gradient above T2.
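The hysteresis step just described can be sketched as follows. This is an illustrative Python fragment on a one-dimensional row of gradient magnitudes (real Canny tracks 2-D connectivity); T1 and T2 are the low and high thresholds:

```python
def hysteresis(magnitudes, t1, t2):
    # Classify: >= t2 -> strong edge; < t1 -> non-edge; in between ->
    # edge only if connected (directly or transitively) to a strong pixel.
    strong = [m >= t2 for m in magnitudes]
    weak = [t1 <= m < t2 for m in magnitudes]
    edges = strong[:]
    changed = True
    while changed:  # propagate edge status through chains of weak pixels
        changed = False
        for i, w in enumerate(weak):
            if w and not edges[i]:
                left = i > 0 and edges[i - 1]
                right = i + 1 < len(edges) and edges[i + 1]
                if left or right:
                    edges[i] = True
                    changed = True
    return edges

grad = [10, 40, 90, 45, 5, 50, 20]
print(hysteresis(grad, t1=30, t2=80))
```

The weak pixels at 40 and 45 survive because they touch the strong pixel at 90, while the isolated weak pixel at 50 is discarded.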

SOBEL EDGE DETECTION OPERATOR:

In the case of Sobel edge detection there are two masks: one mask identifies the
horizontal edges and the other identifies the vertical edges. The mask which finds
the horizontal edges is equivalent to taking the gradient in the vertical direction, and
the mask which computes the vertical edges is equivalent to taking the gradient in the
horizontal direction. The Sobel masks are given in figure 1.


Gy (horizontal edges):        Gx (vertical edges):

    -1  -2  -1                     1   0  -1
     0   0   0                     2   0  -2
     1   2   1                     1   0  -1

Figure 1: Sobel operators

By passing these two masks over the intensity image, the gradient along the x
direction (Gx) and the gradient along the y direction (Gy) can be computed at different
locations in the image. The strength and the direction of the edge at a particular
location can then be computed using the gradients Gx and Gy. The gradient of an image
f(x, y) at location (x, y) is defined as the vector

∇f = [Gx, Gy]^T = [∂f/∂x, ∂f/∂y]^T

where Gx is the partial derivative of f along the x direction and Gy is the partial
derivative of f along the y direction. The magnitude of the gradient is computed by
squaring the two components Gx and Gy, adding them, and taking the square root of
the sum:

|∇f| = sqrt(Gx² + Gy²)

The approximation of the gradient magnitude is taken to be the magnitude of the
gradient Gx in the x direction plus the magnitude of Gy in the y direction:

|∇f| ≈ |Gx| + |Gy|

The magnitude tells the strength of the edge at location (x, y); it does not tell anything
about the direction of the edge [9][10]. To compute the direction of the gradient, let
θ(x, y) represent the direction angle of the vector ∇f at (x, y); then

θ(x, y) = tan⁻¹(Gy / Gx)

The Sobel edge operator gives an averaging effect over the image, so the effect due to the
presence of spurious noise in the image is taken care of to some extent by the Sobel
operator. The Sobel operator also gives a smoothing effect, which reduces the spurious
edges that can be generated because of the noise present in the image.

5.3 Sobel Filter Analysis:

Filtering is the process of applying masks to images; the application of a mask to an
input image produces an output image of the same size as the input image. The three
steps of convolution necessary for filtering are given below.

Step 1. For each pixel in the input image, the mask is conceptually placed lying on that
pixel.

Step 2. The values of each input image pixel under the mask are multiplied by the values
of the corresponding mask weights.

Step 3. The results are summed together to yield a single output value that is placed in
the output image at the location of the pixel being processed in the input.

The pixel values of an original image are shown in figure 2, and the Sobel masks for the
horizontal and vertical scans are shown in figure 1. Now compute Gx and Gy, the gradients
of the image, by performing the convolution of the Sobel kernels with the image, using
zero-padding to extend the image. The process of computing Gx and Gy using convolution
and zero padding is given in figure 3 and figure 4. The gradients Gx and Gy of the image are
shown in figure 5 and figure 6.
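The three convolution steps with zero padding can be sketched as follows (illustrative Python rather than the MATLAB/Simulink implementation; the 3x3 test image is a made-up vertical step edge and the helper name is hypothetical):

```python
def convolve3x3(img, kernel):
    # Steps 1-3: place the 3x3 mask over each pixel (zero-padding the
    # border), multiply by the mask weights, and sum into the output pixel.
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    pixel = img[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
                    acc += pixel * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out

SOBEL_GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # horizontal edges
SOBEL_GX = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]   # vertical edges

img = [[0, 0, 255],
       [0, 0, 255],
       [0, 0, 255]]
gx = convolve3x3(img, SOBEL_GX)
gy = convolve3x3(img, SOBEL_GY)
# |G| ~ |Gx| + |Gy|, the approximation used in the text.
mag = [[abs(a) + abs(b) for a, b in zip(ra, rb)] for ra, rb in zip(gx, gy)]
print(mag[1])  # strongest response at the vertical step edge
```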


5.4 SOBEL EDGE DETECTION IMPLEMENTATION:

This paper presents edge detection using the Sobel operator in digital image processing
and its implementation using MATLAB. Firstly, a jpg image is input and converted into a
binary image with the help of MATLAB: acquire a jpg image, which is by default in an RGB
color space, convert this RGB image to a grey-level image, and then convert the grey-level
image into a binary image. This binary image is very large, so it is resized and written into a
text file, as shown in figure 8. The Sobel operator is commonly used in edge detection. It is a
classic first-order edge detection operator, computing an approximation of the gradient of the
image intensity function. At each point in the image, the result of the Sobel operator is the
corresponding norm of the gradient vector. The Sobel operator only considers the two
orientations of 0° and 90° convolution kernels. The operator uses two kernels which are
convolved with the original image to calculate approximations of the gradient. As given
above, the gradients are calculated along with the magnitude. Read the text file generated by
MATLAB into memory, store it in RAM, and then extract the raster window.


Two 3 x 3 windows are shown in figure 7. Next, scan the text file with the window and find
the values of the center pixel and the north, south, east, west, south-east, south-west,
north-east, and north-west pixels for the whole binary image, row-wise and column-wise;
this completes the horizontal and vertical scanning. The process of raster scanning is shown
in figure 9.

Figure 9: Process flow of raster window scanning

Further, two convolution masks are designed to respond maximally to edges running
vertically and horizontally relative to the pixel grid, one mask for each of the two perpendicular
orientations. The masks can be applied separately to the input image to produce separate
measurements of the vertical and horizontal gradients. These can be combined to find the
absolute magnitude of the gradient at each point. The process flow is shown in figure 10.


5.5 K-NEAREST NEIGHBOURS (KNN)
In pattern recognition, the k-nearest neighbours (KNN) classifier is widely used for
classifying objects based on the nearest training examples in the feature space. The
k-nearest neighbour algorithm is among the simplest of all machine learning classifiers. In
this classifier, an image is classified by a majority vote of its neighbours. In KNN, the
Euclidean distance between the testing image feature and every training image feature is
computed to form a distance matrix. The summed values of the distance matrix are sorted in
increasing order, the first k elements are selected, and the majority class value is determined
to classify the image accurately.

The KNN algorithm is used for classification in this research because it is one of the
simplest machine learning algorithms and is very easy to implement. It is a technique based
on the closest training samples in the feature space. When a test sample is given, the distance
between the test sample and all the training samples is first calculated using the Euclidean
distance. Then, the k nearest neighbours, which have the minimum distances, are determined.
Once the nearest neighbours are found, the test sample is classified according to the majority
vote. In the testing phase, the unlabelled query image is simply assigned the label of its k
nearest neighbours.

Fig. 4: KNN classifier

Typically, the test data is classified based on the majority labels of its k nearest neighbours.
For k = 1, the class label of the test image is assigned as the class of its nearest object. If
there are only two classes, k must be an odd integer; for multiclass classification, ties can
occur even though k is an odd integer. The Euclidean distance d between the training feature
vector X = (x1, x2, …, xn) and the test feature vector Y = (y1, y2, …, yn) of fixed length is
calculated using the following equation:

d(X, Y) = sqrt((x1 − y1)² + (x2 − y2)² + … + (xn − yn)²)

The accuracy of the KNN classifier is found by choosing different values of k. We obtained
the best classification accuracy of 76% at k = 5. If the value of k is increased further, there
is no significant improvement in the performance.
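The classification procedure described above can be sketched as follows (illustrative Python using only the standard library; the feature vectors and labels are made-up examples, not the system's real features):

```python
import math
from collections import Counter

def euclidean(x, y):
    # Euclidean distance between two fixed-length feature vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn_classify(train, labels, query, k=3):
    # Distance from the query to every training sample, nearest first...
    dists = sorted((euclidean(t, query), lbl) for t, lbl in zip(train, labels))
    # ...then a majority vote among the k nearest neighbours.
    votes = Counter(lbl for _, lbl in dists[:k])
    return votes.most_common(1)[0][0]

train = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
labels = ["drowsy", "drowsy", "alert", "alert"]
print(knn_classify(train, labels, (0.15, 0.15), k=3))  # -> "drowsy"
```

With k = 3, the two nearby "drowsy" samples outvote the single "alert" neighbour; the report's system uses k = 5 over its extracted image features.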
5.6 SVM
In machine learning, support vector machines (SVMs) are supervised learning models with
associated learning algorithms that analyze data used for classification and regression analysis.
In addition to performing linear classification, SVMs can efficiently perform a non-linear
classification using what is called the kernel trick, implicitly mapping their inputs into
high-dimensional feature spaces.

When data are not labeled, supervised learning is not possible, and an unsupervised learning
approach is required, which attempts to find a natural clustering of the data into groups and
then map new data to these formed groups. The clustering algorithm which provides an
improvement to support vector machines is called support vector clustering and is often used
in industrial applications, either when data is not labeled or when only some data is labeled,
as a pre-processing step for a classification pass. More formally, a support vector machine
constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which
can be used for classification, regression, or other tasks. Intuitively, a good separation is
achieved by the hyperplane that has the largest distance to the nearest training-data point of
any class (the so-called functional margin), since in general the larger the margin, the lower
the generalization error of the classifier.

Figure 4.2: Illustrating a linear SVM along with its graph


Maximum-margin hyperplane and margins for an SVM trained with samples from two
classes. Samples on the margin are called the support vectors. Here w is the (not necessarily
normalized) normal vector to the hyperplane, and the parameter b determines the offset of
the hyperplane from the origin along the normal vector w.
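Once the normal vector w and offset b are known, classifying a sample reduces to the sign of w · x + b; a minimal sketch with made-up parameters (not a trained model):

```python
def svm_predict(w, b, x):
    # Decision function of a linear SVM: sign(w . x + b).
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

w, b = [2.0, -1.0], -0.5   # hypothetical learned parameters
print(svm_predict(w, b, [1.0, 0.5]))   # 2.0 - 0.5 - 0.5 = 1.0  -> +1
print(svm_predict(w, b, [0.0, 1.0]))   # -1.0 - 0.5 = -1.5      -> -1
```

Training (finding the maximum-margin w and b) is the optimization step the text describes; at prediction time only this inner product is needed.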

5.7 FEATURE EXTRACTION

The performance of a face recognition system also depends upon feature extraction
and classification to get accurate results. Feature extraction is achieved using feature-based
techniques or holistic techniques. In some holistic techniques we can make use of
dimensionality reduction before classification. We compared the results of different holistic
approaches used for feature extraction and classification in a real-time scenario. Normally,
textures may be random but with consistent properties; such textures can be described by
their statistical properties. Moments of intensity play a major role in describing the texture
in a region. Suppose in a region we construct the histogram of the intensities; then the
moments of the 1-D (one-dimensional) histogram can be computed.

The mean intensity discussed above is the first moment. The variance, which describes how
similar the intensities are within the region, is the second central moment. Color moment
methods mainly compute statistics for the first-order, second-order, and third-order moments
of each color component. For image retrieval, the color moment is a simple and effective
representation of color features. Color moments such as the first-order (mean), second-order
(variance), and third-order (skewness) moments are proved to be very effective in presenting
the color distribution of images. The three color moments are defined in the figures.
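The three moments can be computed per color channel as follows (an illustrative Python sketch, not the report's MATLAB code; here the third-order moment is taken as the third central moment, whose normalized form is the skewness):

```python
def color_moments(channel):
    # First moment: mean intensity; second: variance (second central
    # moment); third: third central moment of the channel's intensities.
    n = len(channel)
    mean = sum(channel) / n
    var = sum((v - mean) ** 2 for v in channel) / n
    third = sum((v - mean) ** 3 for v in channel) / n
    return mean, var, third

pixels = [100, 110, 120, 130]  # made-up intensities of one channel
print(color_moments(pixels))   # symmetric values -> third moment is 0
```

A full color-moment descriptor would concatenate these three values for each of the image's color components.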


CHAPTER 6
INTRODUCTION TO MATLAB


INTRODUCTION TO MATLAB

The fundamental requirements for this project can be categorized into two general classes.

1. Hardware prerequisite,
2. Software prerequisite.

6.1 HARDWARE REQUIREMENT

On the hardware side, a standard PC on which MATLAB can run effectively is required,
i.e., a machine with a minimum configuration of 512 MB RAM, a 20 GB hard disk, and a
Pentium III processor.

6.2 SOFTWARE REQUIREMENT

On the software side, the MATLAB software itself is the basic requirement. Some of the
advantages of MATLAB for image handling are:

➢ Easy to work with, since images are handled as matrices

➢ Built-in functions for complex operations and algorithms (e.g., FFT, DCT, etc.)

➢ Image processing toolbox

➢ Supports most image formats (.bmp, .jpg, .gif, .tiff, etc.)

6.3 INTRODUCTION TO MATLAB

MATLAB is a high-performance language for technical computing. It integrates
computation, visualization, and programming in an easy-to-use environment where problems
and solutions are expressed in familiar mathematical notation. MATLAB stands for matrix
laboratory, and was written originally to provide easy access to matrix software developed by
the LINPACK (linear system package) and EISPACK (eigen system package) projects. MATLAB
is therefore built on a foundation of sophisticated matrix software in which the basic element
is an array that does not require pre-dimensioning, which makes it possible to solve many
technical computing problems, especially those with matrix and vector formulations, in a
fraction of the time.

MATLAB features a family of application-specific solutions called toolboxes. Very
important to most users of MATLAB, toolboxes allow learning and applying specialized
technology. These are comprehensive collections of MATLAB functions (M-files) that extend
the MATLAB environment to solve particular classes of problems. Areas in which toolboxes
are available include signal processing, control systems, neural networks, fuzzy logic, wavelets,
simulation and many others.

Typical uses of MATLAB include: math and computation; algorithm development;
data acquisition; modeling, simulation, and prototyping; data analysis, exploration, and
visualization; scientific and engineering graphics; and application development, including
graphical user interface building.

MATLAB is a program that was originally designed to simplify the implementation of
numerical linear algebra routines. It has since grown into something much bigger, and it is used
to implement numerical algorithms for a wide range of applications. The basic language used
is very similar to standard linear algebra notation, but there are a few extensions that will likely
cause you some problems at first.

6.3.1 BASIC BUILDING BLOCKS OF MATLAB


The basic building block of MATLAB is the matrix. The fundamental data type is the
array; vectors, scalars, real matrices and complex matrices are handled as specific classes of
this basic data type. The built-in functions are optimized for vector operations, and no
dimension statements are required for vectors or arrays.

MATLAB WINDOW

MATLAB works with the following windows: the Command window, the Workspace
window, the Current Directory window, the Command History window, the Editor window,
the Graphics (Figure) window and the Online-help window.

Command Window: The command window is where the user types MATLAB commands
and expressions at the prompt (>>) and where the output of those commands is displayed. It is
opened when the application program is launched. All commands, including user-written
programs, are typed in this window at the MATLAB prompt for execution.

Work Space Window: MATLAB defines the workspace as the set of variables that the user
creates in a work session. The workspace browser shows these variables and some
information about them. Double clicking on a variable in the workspace browser launches the
Array Editor, which can be used to obtain information.


Current Directory Window: The Current Directory tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the Windows
operating system the path might be C:\MATLAB\Work, indicating that the directory
“work” is a subdirectory of the main directory “MATLAB”, which is installed in drive C.
Clicking on the arrow in the current directory window shows a list of recently used paths.
MATLAB uses a search path to find M-files and other MATLAB-related files. Any file run in
MATLAB must reside in the current directory or in a directory that is on the search path.

Command History Window: The Command History window contains a record of the
commands a user has entered in the command window, including both current and previous
MATLAB sessions. Previously entered MATLAB commands can be selected and re-executed
from the command history window by right clicking on a command or sequence of commands.
This launches a menu from which various options can be selected in addition to executing the
commands, a useful feature when experimenting with commands in a work session.

Editor Window: The MATLAB editor is both a text editor specialized for creating M-files
and a graphical MATLAB debugger. The editor can appear in a window by itself, or it can be
a sub window in the desktop. In this window one can write, edit, create and save programs in
files called M-files.

MATLAB editor window has numerous pull-down menus for tasks such as saving,
viewing, and debugging files. Because it performs some simple checks and also uses color to
differentiate between various elements of code, this text editor is recommended as the tool of
choice for writing and editing M-functions.

Graphics or Figure Window: The output of all graphic commands typed in the command
window is seen in this window.

Online Help Window: MATLAB provides online help for all its built-in functions and
programming language constructs. The principal way to get help online is to use the MATLAB
Help Browser, opened as a separate window either by clicking on the question mark symbol on
the desktop toolbar, or by typing helpbrowser at the prompt in the command window. The Help
Browser is a web browser integrated into the MATLAB desktop that displays Hypertext
Markup Language (HTML) documents. It consists of two panes: the help navigator pane, used
to find information, and the display pane, used to view the information. Self-explanatory tabs
in the navigator pane are used to perform a search.


6.3.2 MATLAB FILES

MATLAB has two main types of files for storing information: M-files and MAT-files.

M-FILES
These are standard ASCII text files with a .m extension. The user can create matrices and
programs in M-files, which are text files containing MATLAB code. The MATLAB editor or
another text editor is used to create a file containing the same statements that would be typed
at the MATLAB command line, and the file is saved under a name that ends in .m. There are
two types of M-files:

Script Files: A script file is an M-file with a set of MATLAB commands in it; it is executed
by typing the name of the file on the command line. Script files operate on the variables
currently present in the workspace.

Function Files: A function file is also an M-file, except that the variables in a function file
are all local. This type of file begins with a function definition line.

MAT-FILES
These are binary data files with a .mat extension that are created by MATLAB when data
is saved. The data is written in a special format that only MATLAB can read. These files are
loaded into MATLAB with the ‘load’ command.

6.3.3 THE MATLAB SYSTEM

The MATLAB system consists of five main parts:

Development Environment: This is the set of tools and facilities that help you use MATLAB
functions and files. Many of these tools are graphical user interfaces. It includes the MATLAB
desktop and Command Window, a command history, an editor and debugger, and browsers for
viewing help, the workspace, files, and the search path.

The MATLAB Mathematical Function Library: This is a vast collection of computational
algorithms ranging from elementary functions like sum, sine, cosine, and complex arithmetic,
to more sophisticated functions like matrix inverse, matrix eigenvalues, Bessel functions, and
fast Fourier transforms.

The MATLAB Language: This is a high-level matrix/array language with control flow
statements, functions, data structures, input/output, and object-oriented programming features.
It allows both "programming in the small" to rapidly create quick and dirty throw-away
programs, and "programming in the large" to create complete large and complex
application programs.

Graphics: MATLAB has extensive facilities for displaying vectors and matrices as graphs, as
well as annotating and printing these graphs. It includes high-level functions for two-
dimensional and three-dimensional data visualization, image processing, animation, and
presentation graphics. It also includes low-level functions that allow you to fully customize the
appearance of graphics as well as to build complete graphical user interfaces on your MATLAB
applications.

MATLAB Application Program Interface (API): This is a library that allows you to write
C and FORTRAN programs that interact with MATLAB. It includes facilities for calling
routines from MATLAB (dynamic linking), calling MATLAB as a computational engine, and
for reading and writing MAT-files.

6.3.4 MATLAB WORKING ENVIRONMENT


MATLAB DESKTOP
The MATLAB desktop is the main MATLAB application window. The desktop contains
five sub-windows: the command window, the workspace browser, the current directory
window, the command history window, and one or more figure windows, which are shown
only when the user displays a graphic.

The command window is where the user types MATLAB commands and expressions
at the prompt (>>) and where the output of those commands is displayed. MATLAB defines
the workspace as the set of variables that the user creates in a work session.


The workspace browser shows these variables and some information about them.
Double clicking on a variable in the workspace browser launches the Array Editor, which can
be used to obtain information about, and in some instances edit, certain properties of the
variable.

The Current Directory tab above the workspace tab shows the contents of the current
directory, whose path is shown in the current directory window. For example, in the Windows
operating system the path might be C:\MATLAB\Work, indicating that the directory “work”
is a subdirectory of the main directory “MATLAB”, which is installed in drive C. Clicking on
the arrow in the current directory window shows a list of recently used paths. Clicking on the
button to the right of the window allows the user to change the current directory.

MATLAB uses a search path to find M-files and other MATLAB-related files, which
are organized in directories in the computer file system. Any file run in MATLAB must reside
in the current directory or in a directory that is on the search path. By default, the files supplied
with MATLAB and MathWorks toolboxes are included in the search path. The easiest way to
see which directories are on the search path, or to add or modify the search path, is to select
Set Path from the File menu on the desktop, and then use the Set Path dialog box. It is good
practice to add any commonly used directories to the search path to avoid repeatedly having
to change the current directory.

The Command History Window contains a record of the commands a user has entered
in the command window, including both current and previous MATLAB sessions. Previously
entered MATLAB commands can be selected and re-executed from the command history
window by right clicking on a command or sequence of commands.

This action launches a menu from which to select various options in addition to
executing the commands, which is a useful feature when experimenting with various
commands in a work session.

USING THE MATLAB EDITOR TO CREATE M-FILES

The MATLAB editor is both a text editor specialized for creating M-files and a
graphical MATLAB debugger. The editor can appear in a window by itself, or it can be a sub-
window in the desktop. M-files are denoted by the extension .m, as in pixelup.m.

The MATLAB editor window has numerous pull-down menus for tasks such as saving,
viewing, and debugging files. Because it performs some simple checks and also uses color to
differentiate between various elements of code, this text editor is recommended as the tool of
choice for writing and editing M-functions.


To open the editor, type edit at the prompt; typing edit filename.m opens the M-file
filename.m in an editor window, ready for editing. As noted earlier, the file must be in the
current directory, or in a directory on the search path.

GETTING HELP
The principal way to get help online is to use the MATLAB Help Browser, opened as a
separate window either by clicking on the question mark symbol (?) on the desktop toolbar, or
by typing helpbrowser at the prompt in the command window. The Help Browser is a web
browser integrated into the MATLAB desktop that displays Hypertext Markup Language
(HTML) documents. It consists of two panes: the help navigator pane, used to find information,
and the display pane, used to view the information.


CHAPTER 7
RESULTS


Figure: sample output frame classified as non-fatigue.


CHAPTER 8
ADVANTAGES AND APPLICATIONS


ADVANTAGES AND APPLICATIONS


8.1 ADVANTAGES

➢ Helps art historians verify the authenticity and identify the artist and sitter of a portrait

➢ Provides a system for identifying forensic & composite sketches to help law enforcement

➢ Can potentially help identify handwriting & paleography (ancient writing)

8.2 APPLICATIONS
1. Education
2. Industrial
3. Military


CONCLUSION


CONCLUSION
This paper presents a real-time implementation of drowsiness detection that is
invariant to illumination and performs well under various lighting conditions. Correlation-
coefficient template matching provides a very fast way to track the eyes and mouth. The
proposed system achieves an overall accuracy of 94.58% across four test cases, the highest in
comparison with recent methods. A high detection rate and reduced false alarms ensure that
this system can help reduce the number of fatalities every year.
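The eye and mouth tracking mentioned above relies on correlation-coefficient template matching. As a hedged illustration of the underlying score (a minimal sketch, not the actual implementation), the pure-Python function below computes the normalized correlation coefficient between a template and an equal-sized image patch:

```python
import math

def corr_coefficient(patch, template):
    """Normalized correlation coefficient between two equal-sized
    grayscale regions, given as flat lists of pixel intensities.
    Returns a value in [-1, 1]; values near 1 mean a strong match."""
    n = len(patch)
    mp = sum(patch) / n
    mt = sum(template) / n
    num = sum((p - mp) * (t - mt) for p, t in zip(patch, template))
    dp = math.sqrt(sum((p - mp) ** 2 for p in patch))
    dt = math.sqrt(sum((t - mt) ** 2 for t in template))
    if dp == 0 or dt == 0:
        return 0.0  # flat region: correlation undefined, treat as no match
    return num / (dp * dt)

template = [10, 50, 90, 130]
# A patch that is the template plus a constant brightness offset
# still matches, which is what makes the score illumination invariant.
print(corr_coefficient([12, 52, 92, 132], template))  # approximately 1.0
```

In a tracker, this score would be evaluated at every candidate position in a search window around the previous eye or mouth location, and the position with the maximum score would be selected.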

Despite the highly satisfactory performance, the system was unable to predict
drowsiness when the head was tilted to the right or left, and head-lowering prediction with a
suitable threshold is yet to be included. The accuracy drops by 8% when the driver wears
spectacles. In the future, efforts will be made to make the system rotation invariant.

