
Enhancing Road Safety: Unveiling the Power of Driver Cabin Cameras in Modern Vehicles Using Image Processing and Data Analysis

Abstract

Developing a driver monitoring system capable of assessing the driver's condition is essential
for enhancing road safety. This article explores the role of driver cabin cameras in modern vehicles
and their potential to revolutionize safety on the road. Through sophisticated image processing
technology, these cameras have the capability to recognize driver faces and activities, thereby
mitigating risks associated with distraction and drowsiness. This paper examines the underlying
mechanisms of driver cabin cameras, their implications for road safety, ethical considerations, and
future prospects. Deep learning has shown promise in achieving high accuracy for such systems,
provided that high-quality datasets are accessible. These datasets encompass various parameters such
as driver head pose, heart rate, and in-cabin behavior like drowsiness and seatbelt status.

Utilizing this dataset, deep learning models can be trained and evaluated to estimate the
driver's physical and mental well-being, level of focus, and activities within the cabin. Furthermore,
the dataset was assessed using the multi-task temporal shift convolutional attention network (MTTS-CAN) algorithm, which yielded a mean absolute error of 16.375 heartbeats per minute.

Keywords: driver monitoring dataset, driver state, driver activity recognition, face detection, face recognition, digital image processing, PCA.

1. Introduction

The proliferation of driver assistance systems and autonomous driving technologies underscores a growing commitment to improving road safety. Among these innovations, driver cabin
cameras represent a pivotal development, offering real-time monitoring of driver behavior. By
leveraging image processing algorithms, these cameras can detect signs of distraction and drowsiness,
contributing to accident prevention and mitigation.

Road accidents claim the lives of hundreds of thousands of individuals annually, ranking
among the top ten causes of death in low- and middle-income countries, as reported by the World
Health Organization [1]. These accidents impact not only drivers and passengers but also pedestrians,
with human error being the primary cause in most cases. In response, significant efforts have been
directed towards the development of fully automated vehicles operated by Artificial Intelligence (AI)
to mitigate the human factor.

As automated vehicles become more prevalent worldwide, driving will transition into a
shared responsibility between humans and machines. This shift underscores the need for systems
capable of assessing the driver's readiness to assume control of the vehicle at any given moment.

Recent attention has been focused on the development of driver monitoring systems capable
of estimating the driver's state. These systems are designed to enhance road safety by alerting drivers
and include the following features:

1. Detection of vital signs such as heart rate, blood pressure, oxygen saturation, and respiratory rate.

2. Assessment of the driver's mental state, particularly regarding fatigue.

3. Measurement of the driver's attention and concentration levels.

4. Identification of the driver's activities within the vehicle cabin.

Researchers have devised various methods for detecting driver fatigue, including those reliant
on biological signals such as heart rate [10,11] and others focusing on physical features like facial
expressions and eye movements [12,13].

Face recognition has emerged as a recent focal point in research. The challenge lies in making such systems adept at detecting faces even in adverse conditions, such as low light or wet surfaces, when individuals wear accessories like glasses or goggles, or when facial appearance changes, for example through the growth of a beard.

In this paper, we introduce an annotated dataset called DriverMVT (Driver Monitoring dataset with Videos and Telemetry) for monitoring drivers within the vehicle cabin. This dataset
serves as a valuable resource for training and evaluating deep learning models aimed at estimating the
driver's state, including fatigue, distraction, and overall health status. Developing models capable of
identifying such critical behaviors and alerting drivers has the potential to prevent numerous accidents
and enhance road safety significantly.

2. Related Works

In the Related Work section, we provide a concise overview of various methods and datasets
used for driver monitoring. Researchers have introduced several datasets aimed at understanding
driver behavior and attention levels.

One dataset, introduced by researchers, comprises a diverse benchmark of video sequences depicting normal, critical, and accidental scenarios outside the vehicle. The dataset aims to predict
driving accidents based on driver attention levels.

Another dataset, called DrivFace, contains image sequences of drivers in real-world scenarios, annotated with head pose angles and view direction. Researchers also propose a method for
estimating attention levels based on head pose angles.

Additionally, the MPIIGaze dataset includes images collected during everyday computer use,
focusing on estimating gaze angles to determine attention levels. The DriveAHead dataset provides
infrared and depth images of drivers' head poses in real driving situations, along with annotations for
head pose labels and facial occlusions.

Furthermore, datasets aim to monitor drowsiness and driver distraction, such as the dataset
collected from individuals undergoing experiments with sleep deprivation and another dataset
capturing drivers' actions in various driving scenarios using depth cameras.

Finally, the Multimodal Spontaneous Expression Heart Rate (MMSE-HR) dataset includes
videos and associated heart rate and blood pressure information collected from participants using face
and contact sensors.

In contrast, our DriverMVT dataset offers detailed annotations for driver health indices,
mental state, and head pose estimation, recorded in real driving environments to provide authentic
data for a wide range of driver-related tasks.
Table 1. Comparison of public driver monitoring datasets.

3. Face Recognition System

A facial recognition system, as described in [18], is a computer system commonly utilized to identify individuals from images stored in a database. The process involves comparing an input image
with the images or video frames stored in the database. Initially, facial features are selected using
various algorithms, and these features are then compared between the input and stored images.

Facial recognition systems are predominantly employed in security and biometric applications for identification purposes. Biometric systems more broadly rely on traits such as facial features, fingerprints, jaw shape, and iris patterns to identify individuals accurately.

Face detection algorithms play a crucial role in identifying faces by extracting facial features based on the relative position, size, and shape of the eyes, nose, cheeks, and jawline. These features facilitate the comparison of input images with stored images, resulting in accurate identification.

4. Transformation of Face Recognition

A. Face Detection: Face detection plays a pivotal role in various applications utilizing computer
technology to verify or identify human faces from databases. Algorithms for face detection primarily
focus on accurately detecting frontal faces. Images are stored in pixel form, and each pixel is matched
bit by bit between the input and stored database images. Any discrepancies due to issues during image
acquisition can hinder the matching process.
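To make the detection step concrete, here is a minimal sketch of frontal face detection using OpenCV's bundled Haar cascade; the image path and detector parameters below are illustrative assumptions rather than settings from the system described in this paper.

```python
import cv2

# Load OpenCV's stock frontal-face Haar cascade classifier.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

# "cabin_frame.jpg" is a placeholder path for one cabin-camera frame.
frame = cv2.imread("cabin_frame.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# detectMultiScale returns (x, y, w, h) boxes; scaleFactor and
# minNeighbors trade off recall against false positives.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                 minNeighbors=5, minSize=(60, 60))
for (x, y, w, h) in faces:
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```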
B. Preprocessing: Preprocessing is integral to reducing the False Acceptance Rate (FAR) and speeding up the system. This step improves computational efficiency and reduces false positives by optimizing image quality.

C. Feature Extraction: Following preprocessing, feature extraction is employed to extract distinctive features from human faces such as mouth shape, jawline, nose, eyes, and eyebrows. These features are
compared to identify the exact face from the given database. Feature extraction also encompasses the
extraction of facial features used to distinguish facial expressions. It is categorized into three
subcategories: general feature extraction, feature selection, and feature decomposition. Initially,
general information about features is extracted, followed by feature selection using algorithms like
PCA, LDA, and ICA. Finally, feature decomposition aids in further refining the features extracted
from the face.
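As an illustration of the feature selection step, the sketch below chains PCA and LDA using scikit-learn on stand-in data; the dimensions and random inputs are assumptions for demonstration, not values from this paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Stand-in data: 200 flattened 32x32 face crops from 10 subjects.
rng = np.random.default_rng(0)
face_vectors = rng.normal(size=(200, 1024))
labels = np.repeat(np.arange(10), 20)

# PCA: unsupervised reduction along directions of maximal variance.
pca = PCA(n_components=50)
pca_features = pca.fit_transform(face_vectors)

# LDA: supervised projection maximizing between-class separation;
# it yields at most (number of classes - 1) components.
lda = LinearDiscriminantAnalysis(n_components=9)
lda_features = lda.fit_transform(pca_features, labels)
```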

Fig. 1. Multiview faces overlaid with labeled graphs.

Fig. 2. Examples of critical events from the dataset.


5. Data Evaluation

In our data evaluation process, we conducted experiments using the multi-task temporal shift
convolutional attention network (MTTS-CAN) [24], a state-of-the-art algorithm for heart rate
estimation. This evaluation was performed on a subset of our dataset containing heart rate
information, consisting of 12 videos. The results showed a mean absolute error (MAE) of 16.375 heartbeats per minute and a root mean square error (RMSE) of 19.495, indicating a relatively high error rate.
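For clarity, the two error metrics reported above can be computed as follows; the prediction and reference arrays here are hypothetical stand-ins for per-video heart rate estimates, not values from our dataset.

```python
import numpy as np

# Hypothetical ground-truth and MTTS-CAN-style predictions (bpm).
hr_true = np.array([72.0, 65.0, 88.0, 70.0])
hr_pred = np.array([85.0, 80.0, 75.0, 90.0])

mae = np.mean(np.abs(hr_pred - hr_true))           # mean absolute error
rmse = np.sqrt(np.mean((hr_pred - hr_true) ** 2))  # root mean square error
print(f"MAE = {mae:.3f} bpm, RMSE = {rmse:.3f} bpm")
```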

Additionally, we conducted a separate experiment to evaluate respiratory rate estimation. Utilizing an algorithm proposed in [25], we aimed to detect respiratory rate when the vehicle speed was either zero or close to zero. Our experiments demonstrated that we could accurately measure respiratory rate when the vehicle speed was less than 3 km/h. The algorithm involves the following steps (a code sketch follows the list):

1. Estimating the position of the chest keypoint using the OpenPose human pose estimation model.
2. Calculating keypoint displacement using an optical flow-based neural network (SelFlow).

3. Cleaning the displacement signal through filtering and detrending.

4. Counting the number of peaks/troughs in a time window of one minute to determine respiratory
rate.
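Below is a minimal sketch of steps 3 and 4, assuming the chest-keypoint displacement signal from steps 1 and 2 is already available as a 1-D array; the frame rate and band limits are illustrative assumptions, not parameters taken from [25].

```python
import numpy as np
from scipy.signal import butter, detrend, filtfilt, find_peaks

def respiratory_rate(displacement, fs=30.0):
    """Estimate breaths per minute from a chest-displacement signal.

    displacement: 1-D vertical displacement of the chest keypoint.
    fs: camera frame rate in Hz (assumed 30 fps here).
    """
    # Step 3: remove slow drift, then band-pass to the plausible
    # breathing band (~0.1-0.5 Hz, i.e. 6-30 breaths per minute).
    signal = detrend(displacement)
    b, a = butter(2, [0.1, 0.5], btype="band", fs=fs)
    signal = filtfilt(b, a, signal)

    # Step 4: count peaks in a one-minute window; a minimum peak
    # spacing of 2 s avoids counting noise as breaths.
    window = signal[: int(60 * fs)]
    peaks, _ = find_peaks(window, distance=int(2 * fs))
    return len(peaks)
```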
6. Functionality of Driver Cabin Cameras

Driver cabin cameras employ advanced image processing algorithms to analyze driver
behavior in real time. Facial recognition technology enables the identification of the driver, while
algorithms assess factors such as gaze direction, head pose, and eyelid closure to detect signs of
distraction or drowsiness. In the event of concerning behavior, the system can issue alerts to prompt
driver intervention or autonomous vehicle response.
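One widely used heuristic for the eyelid-closure cue mentioned above (a common technique from the literature, not a method specific to this paper) is the eye aspect ratio (EAR), computed from six eye-contour landmarks; the landmark ordering and the 0.2 threshold below are conventional assumptions.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """eye: (6, 2) array of landmarks ordered around the eye contour
    (p1..p6), as produced by typical facial-landmark detectors."""
    # Vertical distances between upper and lower eyelid landmarks.
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    # Horizontal distance between the eye corners.
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

# EAR drops toward zero when the eye closes; a sustained value below
# ~0.2 over consecutive frames is a common drowsiness trigger.
def is_drowsy(ear_history, threshold=0.2, min_frames=48):
    recent = ear_history[-min_frames:]
    return len(recent) == min_frames and all(e < threshold for e in recent)
```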

7. Principal Component Analysis

The Eigenface algorithm utilizes Principal Component Analysis (PCA) for dimensionality
reduction, aiming to identify vectors that best represent the distribution of face images within the
entire image space [14]. These vectors define the face space, where all faces in the training set are
projected to determine a set of weights describing each vector's contribution. To identify a test image,
its projection onto the face space yields a corresponding set of weights. By comparing these weights
with those of faces in the training set, the test image's face can be identified.

PCA relies on the Karhunen-Loève transformation [18], treating image elements as random variables within a stochastic process. The basis vectors of PCA are the eigenvectors of the scatter matrix \(S_T\):

\[S_T = \sum_{i=1}^{N} (x_i - \mu)(x_i - \mu)^T\]

where \(x_i\) represents the image elements and \(\mu\) is the mean. The transformation matrix \(W_{PCA}\) comprises the eigenvectors corresponding to the \(d\) largest eigenvalues. Fig. 7 illustrates a 2D example of PCA, while Fig. 9 displays eigenvectors (i.e., eigenfaces) and Fig. 8 showcases the average face, derived from the ORL face database [19]. After projection, the input vector in an \(n\)-dimensional space is reduced to a feature vector in a \(d\)-dimensional subspace.

Furthermore, eigenvectors corresponding to the 7 smallest eigenvalues, shown in Fig. 10, are typically
considered noise and are disregarded during identification for most applications.
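A compact NumPy sketch of the projection and identification steps described above follows; it uses SVD on the centered data rather than forming \(S_T\) explicitly (the two share eigenvectors), and the function signatures are assumptions for illustration.

```python
import numpy as np

def fit_eigenfaces(X, d):
    """X: (N, n) matrix of N flattened training faces; d: subspace size."""
    mu = X.mean(axis=0)
    A = X - mu
    # Right singular vectors of A are eigenvectors of the scatter
    # matrix S_T, ordered by decreasing eigenvalue.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    W = Vt[:d].T                      # (n, d) basis of the face space
    return mu, W

def project(x, mu, W):
    # Reduce an n-dimensional image to a d-dimensional weight vector.
    return (x - mu) @ W

def identify(x, mu, W, train_weights, train_labels):
    # Nearest neighbor in the face space identifies the test image.
    w = project(x, mu, W)
    i = int(np.argmin(np.linalg.norm(train_weights - w, axis=1)))
    return train_labels[i]
```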
8. Future Prospects

As automotive technology continues to evolve, driver cabin cameras are poised to play an
increasingly prominent role in ensuring road safety. Advancements in artificial intelligence and
machine learning hold the promise of even more sophisticated driver monitoring capabilities.
Moreover, the integration of cabin cameras with vehicle-to-vehicle communication systems and
autonomous driving technology heralds a future where accidents are not only mitigated but potentially
eliminated altogether.

9. Conclusion

Driver cabin cameras represent a powerful tool for enhancing road safety in modern vehicles. By leveraging image processing technology, these cameras can effectively detect and
mitigate risks associated with driver distraction and drowsiness. However, ethical considerations
regarding privacy and data security must be carefully addressed to ensure widespread acceptance and
adoption of this technology. As we look to the future, driver cabin cameras stand as a beacon of
innovation, paving the way towards safer roads for all.

Furthermore, our dataset was assessed using the MTTS-CAN algorithm, which yielded a mean absolute error of 16.375 heartbeats per minute. We encourage fellow researchers to explore our dataset
for novel applications. Ultimately, our primary objective is for research and models derived from the
DriverMVT dataset to contribute to saving lives on the roads.

References

1) Xiaoguang Lu, Dept. of Computer Science & Engineering, Michigan State University, East Lansing, MI 48824. Email: lvxiaogu@cse.msu.edu
2) Deepika Dubey, Assistant Professor, SRCEM Banmore; Dr. G. S. Tomar, Director, THDC-IHET Tehri. Emails: deepika.sa1304@gmail.com, gstomar@ieee.org
3) Information Technology and Programming Faculty, ITMO University, 197101 St. Petersburg, Russia
