
Vehicle Distance Estimator

A Major Project-I Report


Submitted in partial fulfillment of the requirements for the award of
Bachelor of Technology in CSE

Submitted to
RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA
BHOPAL (M.P)

MAJOR PROJECT REPORT


Submitted by

Piyush Dogne [0103CS201119] Yash Deharia [0103CS201202]


Pratham Parashar [0103CS201126] Nitikesh Parse [0103CS201114]

Under the supervision of


Dr. Susheel Gupta
(Prof.)

Department of CSE
Lakshmi Narain College of Technology, Bhopal (M.P.)

Session
2023-24
LAKSHMI NARAIN COLLEGE OF TECHNOLOGY, BHOPAL

DEPARTMENT OF CSE

CERTIFICATE

This is to certify that the work embodied in this project work entitled "Vehicle
Distance Estimator" has been satisfactorily completed by Piyush Dogne
(0103CS201119), Yash Deharia (0103CS201202), Pratham Parashar
(0103CS201126), and Nitikesh Parse (0103CS201114). It is a bonafide piece of work,
carried out under guidance in the Department of CSE, Lakshmi Narain College of
Technology, Bhopal, in partial fulfillment of the Bachelor of Technology during
the academic year 2023-2024.

Dr. Susheel Gupta


Professor

Approved By

Dr. Sadhana K Mishra

Professor & Head


Department of Computer Science & Engineering
LAKSHMI NARAIN COLLEGE OF TECHNOLOGY, BHOPAL

DEPARTMENT OF CSE

ACKNOWLEDGEMENT

We express our deep sense of gratitude to Dr. Susheel Gupta (Guide), Department of
CSE, L.N.C.T., Bhopal, whose kindness, valuable guidance, and timely help encouraged
us to complete this project.

Special thanks go to Dr. Sadhna K Mishra (HOD), who helped us in completing this
project work and shared interesting ideas and thoughts that made this project
work successful.

We would also thank our institution and all the faculty members without whom this
project work would have been a distant reality.

Signature

Piyush Dogne [0103CS201119] Yash Deharia [0103CS201202]

Pratham Parashar [0103CS201126] Nitikesh Parse [0103CS201114]


Table of Contents

Certificate
Declaration
Acknowledgement
Abstract

1 Introduction
  1.1 Problem Definition
  1.2 Research Gaps

2 Literature Review & Survey
  2.1 Summary

3 UML Diagrams
  3.1 Vehicle Distance Estimation Sequence Diagram
  3.2 Vehicle Distance Estimation Use Case Diagram
  3.3 Vehicle Distance Estimation Activity Diagram
  3.4 Drowsiness Detection Diagram
  3.5 Lane Detection Diagram

4 Project Scheduling
  4.1 Phase 1
    4.1.1 Idea & Analysis
    4.1.2 Learning Tools
    4.1.3 Setup
    4.1.4 Object Detection
    4.1.5 Distance Estimation
    4.1.6 Testing
    4.1.7 Documentation
  4.2 Phase 2
    4.2.1 Idea & Analysis
    4.2.2 Learning Tools
    4.2.3 Setup
    4.2.4 Lane Detection
    4.2.5 Drowsiness Detection
    4.2.6 Testing
    4.2.7 Documentation

5 Tools and Technology Used
  5.1 Python
  5.2 NumPy
  5.3 Pandas
  5.4 Matplotlib
  5.5 TensorFlow
  5.6 Google Colaboratory
  5.7 YOLO
  5.8 KITTI

6 Modules & Methodology
  6.1 Vehicle Distance Estimation
    6.1.1 Object Detection
    6.1.2 Intermediate Results Calculation
    6.1.3 Distance Estimation
    6.1.4 Visualize Test Results
  6.2 Drowsiness Detection
    6.2.1 Data Collection
    6.2.2 Eye Detection
    6.2.3 EAR Calculation
    6.2.4 Drowsiness Threshold
    6.2.5 Drowsiness Detection
  6.3 Lane Detection
    6.3.1 Image Capturing
    6.3.2 Greyscale Conversion
    6.3.3 Gaussian Blurring
    6.3.4 Canny Edge Detection
    6.3.5 Extraction of ROI
    6.3.6 Hough Transform
    6.3.7 Image Marking

7 Result Analysis

8 Conclusion

References

List of Figures
Figure 1: Deaths per 100 Accidents in India in 2021 [1]
Figure 2: Road Accidents in India severity during 2000 to 2021 [1]
Figure 3: Vehicle Distance Estimation Sequence Diagram
Figure 4: Vehicle Distance Estimation Use Case Diagram
Figure 5: Vehicle Distance Estimation Activity Diagram
Figure 6: Drowsiness Detection Diagram
Figure 7: Lane Detection Diagram
Figure 8: Python Logo
Figure 9: Numpy Logo
Figure 10: Pandas Logo
Figure 11: Matplotlib Logo
Figure 12: Tensorflow Logo
Figure 13: Colaboratory Logo
Figure 14: Example YOLO object detection
Figure 15: Example KITTI video frames
Figure 16: Project architecture
Figure 17: Object Coordinate Sheet
Figure 18: Object Distance Estimated from Coordinates of bounding box
Figure 19: Frame from video with distance written on bounding box
Figure 20: Detecting Eye Points
Figure 21: DLib Shape Predictor 68
Figure 22: Awake Driver
Figure 23: Sleepy Driver
Figure 24: Image Captured from Camera
Figure 25: Greyscale Conversion
Figure 26: Gaussian Blurring
Figure 27: Canny Edge Detection
Figure 28: Extraction of ROI
Figure 29: Hough Transform
Figure 30: Marked Image
Figure 31: Lane Detection Process
Figure 32: Actual Distance Vs. Predicted Distance
Figure 33: Distance Vs. Error Rate
Figure 34: Confusion Matrix for Drowsiness Detection
Figure 35: Drowsiness Detection
Figure 36: Lane Detection

1 Introduction
According to the World Health Organization (WHO), an estimated 1.35
million people die each year from road traffic accidents, making it the
leading cause of death for people aged 5-29 years. The Centers for Dis-
ease Control and Prevention (CDC) reports that unintentional injuries,
which include accidental deaths, are the third leading cause of death in
the United States, accounting for over 200,000 deaths each year. In the
European Union, an estimated 90,000 people die each year from acciden-
tal injuries, including traffic accidents, falls, and poisonings, according
to the European Commission.

Figure 1: Deaths per 100 Accidents in India in 2021 [1]

Figure 2: Road Accidents in India severity during 2000 to 2021 [1]

A study conducted by the AAA Foundation for Traffic Safety found
that 7% of all motor vehicle accidents and 16.5% of all fatal accidents
involve drowsy driving. These statistics highlight the serious and po-
tentially deadly consequences of driving while fatigued. According to
the National Highway Traffic Safety Administration (NHTSA), drowsy
driving results in an estimated 100,000 crashes each year in the United
States alone, causing 40,000 injuries and 1,550 fatalities.

These statistics highlight the need for effective driving assistance sys-
tems that can help prevent accidents caused by driver fatigue. Driving
assistance systems are technologies that are designed to help drivers
operate their vehicles more safely and efficiently. These systems can in-
clude a wide range of features, such as lane departure warning, adaptive
cruise control, collision avoidance, and more.

Driving assistance systems are beneficial to drivers as they provide
many features, including:

1. Improved safety: By helping drivers stay in their lanes, maintain
a safe following distance, and avoid collisions, driving assistance
systems can help reduce the risk of accidents and injuries on the
road.

2. Enhanced efficiency: Systems like adaptive cruise control can help
drivers maintain a more consistent speed, which can lead to more
efficient and fuel-efficient driving.

3. Increased comfort: Features like lane departure warning and collision
avoidance can help drivers feel more confident and relaxed on
the road, especially in challenging driving conditions.

4. Reduced driver fatigue: Driving assistance systems can help alleviate
some of the mental and physical demands of driving, which can
be particularly useful for long-distance driving or in monotonous
conditions.

5. Improved driver performance: By providing drivers with additional
information and support, driving assistance systems can help improve
overall driving performance and reduce mistakes or errors.

Driving assistance systems are broadly divided into two categories.
The first focuses on the outside of the vehicle, taking into consideration
the lane, other vehicles, and information about the distance to these
vehicles.
These types of driving assistance systems can reduce the chance of
collisions and other accidents that happen when the vehicles come dan-
gerously close to each other by measuring the vehicle distances and also
by keeping track of the lane the vehicle is in and warning the driver
when the vehicle leaves the lane.

The other type of assistance system focuses on the driver inside. These
systems work by analyzing the driver's face to detect mood and emotion,
and to detect whether the driver is feeling drowsy. They can help ensure that the

driver is focused on the road and does not get tired. These two systems
also work hand in hand to provide a comprehensive assistance system
for better support.

In Phase 1 of the major project, we focused on the outside part of the
vehicle and implemented the vehicle distance estimation system. The
vehicle distance estimation system works by estimating the distance to
the vehicles moving forward on the road.
The system implemented was able to identify a variety of objects and
find distances to them. This can also be used to detect if any human
comes in front of the car in any emergency situation.

There are several ways to implement vehicle distance estimation:

1. Radar: Radar uses radio waves to detect the range, angle, and
velocity of objects in the surrounding environment. It is often used
for distance estimation in vehicles, as it can operate in a variety of
conditions and is relatively inexpensive.

2. Lidar: Lidar (light detection and ranging) uses lasers to measure
the distance to objects by timing how long it takes for the laser
pulse to bounce back. Lidar can be very accurate, but it is more
expensive and may be less reliable in certain weather conditions.

3. Cameras: Cameras can be used to create a model of the surrounding
environment. This can be used to estimate distances to objects
based on their size and position in the model.

4. Ultrasonic sensors: Ultrasonic sensors use high-frequency sound
waves to measure the distance to objects. They are relatively in-
expensive and can be used in a variety of conditions, but their
accuracy may be limited at longer distances.

The methods involving Radar, Lidar, and Ultrasonic sensors are not
implemented in this project because of the requirement of additional
hardware which is required to send and capture the waves. This hard-
ware imposes an extra cost that all drivers may not be willing to pay,
so to make the system affordable and usable with the least hardware
requirement, we have focused on the use of cameras, mainly dashboard
cameras, which many people install to monitor the road and
to capture accidents if they occur. These dashboard cams produce video
feeds of medium quality which can be used in most cases to accurately
measure the distance to vehicles in front of it.

Computer vision and machine learning techniques can be used to es-
timate distances in a vehicle by analyzing images or video from cameras
mounted on the vehicle. By training a machine learning model on a
large dataset of images with known distances, it is possible to learn the
relationship between the appearance of an object in an image and its
distance from the camera.

In Phase 2 of the major project, we focused on Lane Detection and
Drowsiness Detection. Drowsy driving is a significant cause of accidents
on the roads, and it can be deadly. Drivers who are fatigued can experi-
ence slower reaction times, impaired judgment, and decreased awareness
of their surroundings. In some cases, they may even fall asleep behind
the wheel, leading to disastrous consequences. To address this issue,
researchers have been developing drowsiness detection systems that can
alert drivers when they are at risk of falling asleep. We explore the
importance of drowsiness detection, the technologies involved, and the
potential impact it can have on road safety.

To detect driver fatigue, various drowsiness detection systems are be-
ing developed that rely on different technologies such as eye-tracking,
which monitors eye movements and blinks to determine the driver’s level

of alertness. The system can detect when a driver’s eyes close or droop,
indicating they are becoming drowsy or falling asleep. Drowsiness de-
tection through eye tracking involves using cameras and infrared sensors
to monitor a driver’s eye movements and blinks. By analyzing changes
in the driver’s eye behavior, the system can detect when the driver is
becoming drowsy or falling asleep. This technology has the potential to
significantly reduce the number of accidents caused by driver fatigue.

Drowsiness detection technology has the potential to significantly re-
duce the number of accidents caused by drowsy driving. By alerting
drivers when they are at risk of falling asleep, the systems can prevent
drivers from making critical errors that could lead to accidents. Addi-
tionally, by detecting drowsiness early on, drivers can take preventive
measures such as taking a break or drinking a cup of coffee, which can
help them stay alert and focused on the road.

Lane detection using computer vision is a field of study that involves
developing algorithms and techniques to identify the lane boundaries on
a road or highway using images or video feeds from cameras mounted on
vehicles. This technology is becoming increasingly important in the au-
tomotive industry, as it can help improve road safety by assisting drivers
with lane departure warnings and automated steering systems.

The basic idea behind lane detection using computer vision is to ana-
lyze the image or video feed from a camera to identify the lane markings
on the road. The system uses various image processing techniques to
isolate the lane markings and extract relevant features, such as the lane
width and curvature. These features are then used to estimate the ve-
hicle’s position relative to the lane boundaries and provide feedback to
the driver.

Lane detection using computer vision has many potential applica-
tions, such as lane departure warning systems, automatic lane-keeping
systems, and autonomous driving. In lane departure warning systems,
the technology is used to alert the driver when the vehicle is drifting out
of its lane. In automatic lane-keeping systems, the technology is used to
keep the vehicle centered within the lane boundaries without the driver’s
input. In autonomous driving, the technology is used to enable the ve-
hicle to navigate the road network without human intervention.

1.1 Problem Definition


The problem addressed in this project is the reduction of accidents
that happen on the roads due to driver carelessness.

1. How to Avoid Collisions? Collisions can be avoided using
an emergency braking system. The system will leverage the distance
from the vehicles in front estimated using a vehicle distance estimation
system. Vehicle distance estimation is a critical component of many DAS
systems, as it can help prevent accidents caused by rear-end collisions
or unsafe following distances. However, one of the main challenges fac-
ing vehicle distance estimation technology is dealing with varying traffic
conditions, such as changes in traffic speed or density. Additionally, ve-
hicle distance estimation systems must be able to accurately detect and
track multiple vehicles in real time, which can be challenging in con-
gested or high-speed traffic conditions. Another challenge is ensuring
that the system can accurately estimate the distance between vehicles,
even when there are obstacles or other factors that may interfere with
the signal.

2. How to warn the driver when a vehicle dangerously leaves
the lane? A lane departure warning system can be created to address
this problem. The system relies upon lane detection, which is a criti-
cal component of many DAS systems, as it can help prevent accidents

caused by lane departures or lane changes. However, one of the main
challenges facing lane detection technology is dealing with complex road
geometries, such as roundabouts, intersections, and multi-lane highways.
Another challenge is dealing with varying lighting and weather condi-
tions, which can affect the performance of the system. Additionally, lane
detection systems must be able to accurately detect lane markings, even
when they are faded or obscured by debris on the road.

3. What steps can be taken to maintain the attentiveness
and concentration of drivers on the road to avoid accidents?
Drivers can be alerted through a system that determines whether the driver
is focused on the road by monitoring the driver's attention and sleepiness.
Drowsiness Detection helps prevent accidents caused by driver fatigue or
falling asleep at the wheel. However, drowsiness detection systems must
be able to accurately detect the driver’s level of alertness in real-time,
which can be challenging. One of the main challenges facing drowsiness
detection technology is dealing with individual differences in behavior
and physiology, as some drivers may exhibit different signs of drowsiness
than others. Additionally, drowsiness detection systems must be able to
operate in varying lighting and weather conditions and must be able to
detect changes in the driver’s behavior quickly and reliably.

1.2 Research Gaps


Vehicle distance estimation is an important aspect of driving that helps
ensure safe and efficient traffic flow. However, there are several gaps in
current methods of vehicle distance estimation. One major gap is the
reliance on visual estimation by drivers. While visual estimation can
be useful, it is subject to human error and can be affected by factors
such as weather conditions and driver distraction. Additionally, visual
estimation does not take into account the speed and direction of other
vehicles, which can affect their distance from the observer.

Furthermore, there is a lack of standardization in the methods used
to estimate vehicle distance. Different drivers and technologies may use
different metrics or units, making it difficult to compare and commu-
nicate distance information accurately. Overall, addressing these gaps
in vehicle distance estimation is crucial for improving road safety and
efficiency. New technologies and standardization efforts can help reduce
human error and increase the reliability and accuracy of distance esti-
mation methods.

Drowsiness detection using eye aspect ratio (EAR) is a promising
method for improving driver safety. However, there are several gaps in
the technology that need to be addressed for it to be more effective and
reliable. These gaps include the variability in results, the lack of real-
time monitoring, sensitivity to environmental conditions, and the lack of
standardization in the measurement and interpretation of EAR values.
These gaps can result in false alarms or missed detections, which can
compromise the effectiveness of the system.

Researchers are working on developing more robust algorithms that
can account for environmental conditions and individual differences in
eye shape and size, providing more consistent results. They are also
exploring new approaches, such as using multiple cameras or sensors,
to improve the reliability of the system under different environmental
conditions. Standardization of measurement and interpretation of EAR
values will allow for more consistent and reliable detection of drowsiness
across studies. As research in this area continues, it is likely that we will
see further advancements in EAR-based drowsiness detection, making
our roads safer and more efficient.

Lane detection is an important aspect of autonomous driving and
advanced driver assistance systems. However, there are several gaps
in current methods of lane detection. While deep learning approaches

have shown promise in accurately detecting lane boundaries, they can
struggle in adverse weather conditions, such as heavy rain, fog, or snow.
Additionally, these algorithms can struggle to detect lanes with poor or
faded markings.

Another gap is the lack of standardization in lane marking regula-
tions. Different countries and regions may have different regulations
regarding lane markings, which can make it difficult for autonomous
vehicles to navigate effectively in different environments. Furthermore,
current lane detection methods may not be robust enough to handle
complex driving scenarios, such as intersections, roundabouts, and con-
struction zones. These scenarios may require additional sensors and
algorithms to accurately detect and navigate lanes.

In conclusion, while DAS technology has shown great promise in im-
proving driver safety and reducing the number of accidents on the road,
there are still several challenges that need to be addressed in order to
develop more effective and reliable systems in the areas of lane detec-
tion, drowsiness detection, and vehicle distance estimation. These chal-
lenges include dealing with complex road geometries, varying lighting
and weather conditions, individual differences in behavior and physiol-
ogy, changing traffic conditions, and accurately detecting and tracking
multiple vehicles in real time. As research in this area continues, it is
likely that we will see further advancements in DAS technology that will
make our roads safer and more efficient.

2 Literature Review & Survey
In their paper titled "Vehicle detection and inter-vehicle distance estima-
tion using single-lens video camera on urban/suburb roads", Ahmed
Ali et al. [2] discuss a method for estimating the distance of a vehi-
cle from a single camera view in real time. The proposed method uses
single-view geometry to estimate the distance, which involves analyzing
the size and orientation of the vehicle in the image, as well as the cam-
era’s intrinsic parameters. The authors tested the proposed method on
a dataset of real-world images and found that it was able to estimate the
distance of vehicles with an average error of less than 3% and a frame
rate of 30 frames per second.
The authors also compared the proposed method to other distance
estimation methods and found that it outperformed them in terms of
accuracy and speed. The paper concludes that the proposed method is
a reliable and efficient method for real-time vehicle distance estimation
using single-view geometry.

In their paper "Vehicle Routing Optimization System with Smart
Geopositioning Updates", Radoslav and Mateuz [3] present a system
for optimizing the routes of a fleet of vehicles. The system uses smart
geopositioning updates, which involve continuously updating the loca-
tion of the vehicles in real-time, to optimize their routes. The authors
of the paper developed a mathematical model for the problem and used
it to design an algorithm for finding the optimal routes.
They tested the proposed system on a dataset of real-world vehicle
routing problems and found that it was able to significantly reduce the
total distance traveled by the fleet and improve the efficiency of the
routes. The paper concludes that the proposed system is an effective
and efficient solution for vehicle routing optimization with smart geopo-
sitioning updates.

"Real-time vehicle tracking and classification using deep learning"[4]

presents a method for tracking and classifying vehicles in real-time using
deep learning. The authors of the paper propose a convolutional neural
network (CNN) based approach for detecting and classifying vehicles in
video streams.
They tested the proposed method on a dataset of real-world video
streams and found that it was able to achieve high accuracy in both
tracking and classification tasks. The authors also compared the perfor-
mance of the proposed method to other state-of-the-art methods. The
paper concludes that the proposed method is an effective and efficient
solution for real-time vehicle tracking and classification using deep learn-
ing.
Overall, these three papers demonstrate the various approaches and
technologies being developed for improving vehicle distance estimation,
routing optimization, and tracking and classification. The authors pro-
vide evidence of the effectiveness and efficiency of their proposed meth-
ods[2][3][4] through testing on real-world datasets and comparisons to
other methods. The advancements and innovations in these areas have
the potential to improve road safety and efficiency in transportation sys-
tems.

The paper titled "Development of drowsiness detection system" by
H. Ueno, M. Kaneda, and M. Tsukino [5] describes the design and imple-
mentation of a drowsiness detection system for use in vehicles. The
system is intended to help prevent accidents caused by drowsy driving.
The paper begins by discussing the importance of detecting drowsiness
in drivers and the potential consequences of drowsy driving. The paper
then presents the design of the drowsiness detection system, which uti-
lizes a camera to monitor the driver’s face and analyze the driver’s eye
movements and facial expressions.
The system uses computer vision algorithms to analyze the driver’s eye
movements and facial expressions to determine the level of drowsiness.
The system also takes into account other factors such as vehicle speed

and time of day to improve the accuracy of the drowsiness detection.
The paper presents the results of testing the system in real-world driv-
ing conditions. The system was tested on a group of test subjects, and
the results showed that the system was able to accurately detect drowsi-
ness in the majority of cases.
The paper concludes that the drowsiness detection system has the
potential to help prevent accidents caused by drowsy driving. The sys-
tem is relatively inexpensive and can be easily integrated into existing
vehicle systems. The system is also non-intrusive and does not require
the driver to wear any special equipment.
In summary, the paper describes the design and implementation of
a drowsiness detection system for use in vehicles. The system utilizes a
camera and computer vision algorithms to analyze the driver’s eye move-
ments and facial expressions to determine the level of drowsiness. The
system was tested in real-world driving conditions and showed promis-
ing results in accurately detecting drowsiness. The paper concludes that
the system has the potential to help prevent accidents caused by drowsy
driving and can be easily integrated into existing vehicle systems.

The paper titled "Driver Drowsiness Detection System and Tech-
niques: A Review" by Vandna Saini [6] provides a comprehensive overview
of various techniques and systems used for detecting driver drowsiness.
The paper begins by discussing the importance of detecting drowsiness
in drivers and the potential risks associated with drowsy driving. The
paper then proceeds to review various physiological and behavioral in-
dicators of drowsiness, such as eye closure duration, yawning frequency,
and heart rate variability.
The paper then provides an in-depth review of different techniques
used for detecting drowsiness, such as image processing techniques, ma-
chine learning algorithms, and physiological sensors. The paper dis-
cusses the advantages and limitations of each technique and provides
examples of studies that have utilized these techniques. The paper also

reviews various drowsiness detection systems that have been developed
for use in real-world driving scenarios, such as the face-based drowsiness
detection system, steering wheel-based drowsiness detection system, and
wearable sensors-based drowsiness detection system.
Furthermore, the paper discusses the challenges associated with de-
veloping drowsiness detection systems, such as the variation in indi-
vidual sleep patterns, the difficulty in accurately measuring drowsiness,
and the need for real-time processing of data. The paper concludes by
suggesting future research directions in the field of drowsiness detec-
tion, such as the development of non-invasive and real-time detection
systems that are capable of detecting drowsiness in a wider range of
driving scenarios.
In summary, the paper provides a comprehensive review of different
techniques and systems used for detecting driver drowsiness. The pa-
per discusses the advantages and limitations of each technique, reviews
various drowsiness detection systems, and highlights the challenges as-
sociated with developing drowsiness detection systems. The paper con-
cludes by suggesting future research directions in the field of drowsiness
detection.

The paper titled "Recent progress in road and lane detection: a sur-
vey" by Aharon Bar Hillel, Ronen Lerner, Dan Levi, and Guy Raz [7]
provides a comprehensive survey of recent progress in road and lane
detection. The paper begins by discussing the importance of road and
lane detection in the development of autonomous vehicles and advanced
driver assistance systems. The paper then provides an overview of the
different approaches used for road and lane detection, including tradi-
tional computer vision algorithms and deep learning techniques.
The paper reviews various computer vision techniques that have been
used for road and lane detection, such as edge detection, Hough trans-
forms, and template matching. The paper then discusses the advantages
and limitations of deep learning techniques, such as convolutional neural

networks (CNNs) and recurrent neural networks (RNNs), in road and
lane detection. The paper also reviews recent advances in road and lane
detection, such as the use of stereo vision and LiDAR sensors. The paper
discusses the advantages and limitations of these sensors and provides
examples of studies that have utilized these techniques.
Furthermore, the paper discusses the challenges associated with road
and lane detection, such as the variation in road and lane markings,
environmental conditions, and real-time processing of data. The paper
also highlights the need for robust and accurate road and lane detection
algorithms for the safe and reliable operation of autonomous vehicles.
The paper concludes by suggesting future research directions in the field
of road and lane detection, such as the development of more robust and
accurate algorithms that can handle varying road and weather condi-
tions, and the integration of multiple sensor modalities for improved
detection performance.
In summary, the paper provides a comprehensive survey of recent
progress in road and lane detection. The paper reviews various com-
puter vision techniques and deep learning techniques that have been
used for road and lane detection, discusses recent advances in sensor
technologies, and highlights the challenges associated with road and lane
detection. The paper concludes by suggesting future research directions
in the field of road and lane detection.

The paper "Lane Detection and Tracking by video sensors" by J.
Goldbeck and B. Huertgen [8] presents a method for detecting and track-
ing lanes on the road using video sensors. The aim of this research is
to provide a reliable and accurate system for lane detection that can
be used in advanced driver assistance systems (ADAS) and autonomous
vehicles. The proposed method is based on a combination of edge detec-
tion, and Hough transform. First, edge detection is used to extract the
edges of the lane markings from the video frames. The Hough transform
is then applied to these edges to detect the lines corresponding to the

lane markings. The detected lines are used to estimate the position and
orientation of the lane over time.
To improve the accuracy of lane detection, the authors propose sev-
eral techniques. First, the edges are filtered using a Gaussian kernel to
reduce noise and improve the robustness of the Hough transform. Sec-
ond, a region of interest (ROI) is defined around the expected location
of the lanes, to reduce the search space and improve the efficiency of
the algorithm. The performance of the proposed method was evaluated
using real-world video data collected from a moving vehicle. The results
show that the method is able to detect and track the lane markings
with high accuracy and robustness, even in challenging scenarios such
as occlusions and curved roads. The authors also compare their method
to other existing lane detection methods and show that it outperforms
them in terms of accuracy and efficiency.
In conclusion, the proposed method for lane detection and tracking
using video sensors is a promising approach for ADAS and autonomous
vehicles. Its robustness, accuracy, and efficiency make it suitable for
real-world applications, and it represents a significant improvement over
existing methods.

2.1 Summary
These papers review several technologies that are being developed to in-
crease road safety and transportation efficiency. There are studies that
present methods for determining a vehicle’s distance in real-time using a
single camera image. Also, studies have been conducted for optimizing
vehicle routes using smart geopositioning updates. Deep learning-based
systems are also being researched for tracking and identifying automo-
biles in real-time. An in-depth look at the various strategies and systems
shows the technological advances that have the potential to increase road
safety and transit efficiency.

3 UML Diagrams
3.1 Vehicle Distance Estimation Sequence Diagram

[Sequence: Start → User opens the system → User is shown the "get help instruction" option → System verifies that the camera works well → System starts to detect objects → System estimates the distance of detected objects → User exits the system → End]

Figure 3: Vehicle Distance Estimation Sequence Diagram

3.2 Vehicle Distance Estimation Use Case Diagram

[Use case diagram: the Driver and the Camera are the actors of the Driving Assistance System. Use cases include: open system (<<extends>> get help instructions), setup environment (<<includes>> ensure that camera works as expected and ensure that requirements are satisfied), get video stream with all info written, estimate distance (<<includes>> detect objects and get video from stream), and exit system.]

Figure 4: Vehicle Distance Estimation Use Case Diagram

3.3 Vehicle Distance Estimation Activity Diagram

[Activity diagram: Start → Initialize Variables → Capture Image → Pre-Process Image → Extract Features → Match Features → Calculate Distance → Display Distance → End]

Figure 5: Vehicle Distance Estimation Activity Diagram

Explanation of the steps:

1. Start: This indicates the start of the process.

2. Initialize variables: This step involves setting up any necessary
variables that will be used in the process, such as image dimensions,
camera parameters, etc.

3. Capture image: This step involves using a camera to capture an
image of the scene, which will be used to estimate the distance to
the vehicle.

4. Pre-process image: This step involves any necessary pre-processing
of the image, such as cropping, resizing, or applying filters.

5. Extract features: This step involves extracting relevant features
from the image, such as edges, corners, or certain patterns.

6. Match features: This step involves matching the extracted features
to a database of known features to determine the distance to
the vehicle.

7. Calculate distance: This step involves using the matched features
and any necessary calculations to determine the distance to
the vehicle.

8. Display distance: This step involves displaying the calculated
distance to the user.

9. End: This indicates the end of the process.

3.4 Drowsiness Detection Diagram

[Flowchart: Start → Initialize Camera → Detect Face. If no face is detected, detection continues. If a face is detected → Extract Features → Detect Eyes and Detect Yawn. If the eyes stay closed for more frames than a threshold, or the number of yawns exceeds a threshold → Alert.]

Figure 6: Drowsiness Detection Diagram

3.5 Lane Detection Diagram

[Flowchart: Start → Original Image Captured → Greyscale Conversion → Gaussian Blur → Canny Edges → Extract Region of Interest → Hough Transformation → Mark Lanes → End]

Figure 7: Lane Detection Diagram

4 Project Scheduling
4.1 Phase 1
[Gantt chart: Semester VII, July to December — Idea & Analysis, Learning Tools, Setup, Object Detection, Distance Estimation, Testing, Documentation]

Phase 1 of the project was planned and executed in multiple
steps as follows:

4.1.1 Idea & Analysis


The first step was to find the idea and topic on which to base the ma-
jor project. This involved going through various research papers and
understanding the problems. The most important criteria we used in
evaluating ideas were uniqueness and whether the idea solves a real
problem; as a result, we changed our minds many times.
After selecting the idea, we proceeded to search for more information
about the problem to learn about all its aspects, the available
solutions, and the advantages and disadvantages of those solutions,
if they exist.
The information we got through our search helped us build an intuition
about how we could solve this problem and how to address the
shortcomings of currently available solutions.

4.1.2 Learning Tools


The next step was to read about the tools used for vehicle detection and
the languages and models required for the same. In this step, we studied

the current research, the relevant libraries in Python, and the datasets that
are related to vehicle distance estimation.

4.1.3 Setup
In the setup process, we tried installing TensorFlow on local computers
and set up the different Python environments for the same. This setup
did not work efficiently due to issues with TensorFlow on local machines,
so Google Colaboratory was selected as the platform on which to
research our project.

4.1.4 Object Detection


In this step, we focused on developing a machine-learning model to de-
tect the objects in the image as this data will be used in the next steps
to estimate the distance from the vehicle.

4.1.5 Distance Estimation


This was the most time-consuming phase in the project development,
as in this phase we focused on machine learning algorithms and models
to estimate the distance of the vehicle from the objects marked in the
previous step.

4.1.6 Testing
In the next phase, we test the created model to adjust the parameters
and optimize the model. We also test the model in various conditions
to find how it performs in varied conditions.

4.1.7 Documentation
In the final phase of this semester, we focus on creating the documen-
tation and creating the report and the presentation.

4.2 Phase 2
[Gantt chart: Semester VIII, January to April — Idea & Analysis, Learning Tools, Setup, Lane Detection, Drowsiness Detection, Testing, Documentation]

Phase 2 of the project was planned and executed in multiple
steps as follows:

4.2.1 Idea & Analysis


We started phase 2 of the project with research about Lane Detection
and Drowsiness Detection and read about the latest research being con-
ducted on the topics and the state-of-the-art systems available in the
market at the moment.
This research helped us build an intuition about how we could solve
this problem and how to address the shortcomings of currently available solutions.

4.2.2 Learning Tools


The next step was to read about the tools used for Lane Detection and
Drowsiness Detection and the languages and models required for the
same. In this step, we studied the current research and the libraries
in Python, and the datasets that are related to Lane Detection and
Drowsiness Detection.

4.2.3 Setup
In the setup process, we tried making the video camera work on Google
Colaboratory for drowsiness detection, but as it was not working, we
switched to local development for drowsiness detection. For lane detection,
however, we went with Google Colaboratory as the platform on which to research
our project.

4.2.4 Lane Detection


In this step, we focused on applying computer vision algorithms to the
camera feed, through which we can detect the lanes in a video.

4.2.5 Drowsiness Detection


In this step, we focused on finding the best face detection models through
which we can analyze the eye and the mouth for signs of drowsiness.

4.2.6 Testing
In the next phase, we test the created model to adjust the parameters
and optimize the model. We also test the model in various conditions
to find how it performs in varied conditions.

4.2.7 Documentation
In the final phase of this semester, we focus on creating the documen-
tation and creating the report and the presentation.

5 Tools and Technology Used
5.1 Python
Python [9] is a popular programming language for machine learning due
to its rich ecosystem of libraries and tools. It has a large and active
community of users and developers, which means that there is a wealth
of documentation, libraries, and resources available for those working
with the language. It has a simple and readable syntax, which makes
it easy for developers to express their ideas in code. It is flexible and
can be used for a wide range of tasks, from simple scripts to complex
applications.

Figure 8: Python Logo

5.2 NumPy
NumPy[10] is a Python library that is used for scientific computing and
data analysis. It provides functionality for working with large, multi-
dimensional arrays and matrices of numerical data, as well as for per-
forming mathematical operations on these.
NumPy is designed to be efficient and to allow for the manipulation
of large arrays of data. It provides a high-performance multidimensional
array object, as well as tools for working with these arrays.

Figure 9: Numpy Logo

5.3 Pandas
Pandas[11] is a software library written for the Python programming
language for data manipulation and analysis. It provides a fast and
flexible way to work with data and is particularly useful for tabular
data (data in the form of rows and columns, similar to a spreadsheet).
Some of the key features of Pandas include data structures for holding
and manipulating large amounts of data in a tabular format, including
data frames (two-dimensional arrays with rows and columns) and series
(one-dimensional arrays), as well as tools for reading and writing data to
and from a variety of formats, including CSV and Excel, among other features.
Pandas is a powerful tool for working with data in Python and is
widely used in a variety of fields, including finance, economics, statistics,
and data science.
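For instance, a per-frame detection sheet like the one used later in this project (Section 6.1.2) can be held in a DataFrame and round-tripped through CSV; the values and file name here are purely illustrative:

    import pandas as pd

    # Illustrative rows: one detected object per row, grouped by frame number.
    df = pd.DataFrame({
        'frame': [0, 0, 1],
        'xmin':  [448.0, 120.5, 450.2],
        'ymin':  [171.0, 200.0, 169.8],
        'xmax':  [575.0, 180.0, 577.1],
        'ymax':  [288.0, 260.0, 290.3],
    })
    df.to_csv('detections.csv', index=False)     # write the sheet to disk
    print(pd.read_csv('detections.csv').head())  # read it back for inspection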

Figure 10: Pandas Logo

5.4 Matplotlib
Matplotlib[12] is a library for data visualization in Python. It provides
functions for creating a wide range of charts, plots, and other visualiza-
tions, which can be useful for understanding and interpreting machine
learning models and data.
Matplotlib is particularly useful for visualizing the results of ma-
chine learning algorithms, as it provides a way to view trends, patterns,
and relationships in the data. In addition to its visualization capabil-
ities, Matplotlib also provides a number of tools for customizing and
fine-tuning the appearance of plots and charts, allowing users to create
professional-quality visualizations that are tailored to their needs.
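A minimal sketch of the kind of plot used in the result analysis (cf. Figure 32, Actual Distance Vs. Predicted Distance); the data points here are invented for illustration:

    import matplotlib.pyplot as plt

    actual = [10, 20, 30, 40, 50]               # ground-truth distances (m), illustrative
    predicted = [10.8, 19.1, 31.2, 38.5, 52.3]  # model outputs, illustrative

    plt.plot(actual, actual, label='Actual')
    plt.plot(actual, predicted, 'o--', label='Predicted')
    plt.xlabel('Actual distance (m)')
    plt.ylabel('Distance (m)')
    plt.legend()
    plt.show()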

Figure 11: Matplotlib Logo

5.5 TensorFlow
TensorFlow[13] is an open-source machine learning library developed by
Google. It provides a flexible and powerful platform for building and
training machine learning models, including support for deep learning
and neural networks.
TensorFlow is based on the concept of "tensors," which are mul-
tidimensional arrays of data. The library includes a set of tools and
libraries for defining, training, and evaluating machine learning models
using these tensors. It also includes a suite of visualization and debug-
ging tools, making it easier to understand and optimize machine learning
models.
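As a minimal example of the tensor idea, the widths and heights of a batch of bounding boxes can be computed with elementwise tensor operations (the coordinates are illustrative):

    import tensorflow as tf

    # One bounding box per row: (xmin, ymin, xmax, ymax); values are illustrative.
    boxes = tf.constant([[448.0, 171.0, 575.0, 288.0],
                         [120.5, 200.0, 180.0, 260.0]])

    widths = boxes[:, 2] - boxes[:, 0]      # xmax - xmin
    heights = boxes[:, 3] - boxes[:, 1]     # ymax - ymin
    print(widths.numpy(), heights.numpy())  # eager execution yields values directly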

Figure 12: Tensorflow Logo

5.6 Google Colaboratory


Google Colaboratory[14], or Google Colab for short, is a free online
Jupyter notebook environment that allows users to write and execute
code, as well as create and share documents that contain live code,
equations, visualizations, and narrative text. It provides a convenient

way for users to run and experiment with code using a variety of pro-
gramming languages, including Python, without the need to install any
software locally. Google Colab can be used for a wide range of purposes,
including machine learning and data analysis.

Figure 13: Colaboratory Logo

5.7 YOLO
YOLO (You Only Look Once)[15] is a real-time object detection system
developed by Joseph Redmon and Ali Farhadi. It is a convolutional
neural network (CNN) based model that is able to identify and locate
objects in images and videos in real time. It is able to process images
and videos at a high frame rate, making it capable of detecting and
classifying objects in real time.
The YOLO model works by dividing the input image or frame into a
grid of cells and predicting the presence and location of objects within
each cell. It is trained on a large dataset of images labeled with the
locations and classes of objects, and it uses this training data to learn
to recognize and locate objects in new images.

Figure 14: Example YOLO object detection

5.8 KITTI
The KITTI (Karlsruhe Institute of Technology and Toyota Technological
Institute at Chicago) dataset[16] is a collection of visual and LiDAR
(Light Detection and Ranging) data collected from a driving scenario
in Karlsruhe, Germany. The dataset was collected by the Perception
Group at the Karlsruhe Institute of Technology as part of their research
on autonomous driving and computer vision.
The KITTI dataset includes a variety of data types, including stereo
images, monocular images, 3D object labels, LiDAR point clouds, and
camera calibration data. It is widely used for research and development
in areas such as object detection, tracking, and pose estimation, and
it has become a benchmark dataset for evaluating the performance of
various algorithms and systems.

Figure 15: Example KITTI video frames

6 Modules & Methodology
6.1 Vehicle Distance Estimation
The various modules and the architecture of the system are as follows:
1. Object Detection

2. Intermediate Result Calculation

3. Distance Estimation

4. Visualization

[Architecture: the Camera produces video, which is passed to the Object Detection Model; the detected objects and their info (coordinates, width, height, class, etc.) are passed to the Distance Estimation Model; the detected objects and their estimated distances are written onto the video, which is shown on the End User Interface.]

Figure 16: Project architecture

6.1.1 Object Detection


YOLO (You Only Look Once) is used for detecting the objects in the
images. It divides the image into a grid system and each cell in the
grid is responsible for detecting objects within itself. It uses neural
networks to provide real-time object detection and is thus a popular
choice because of its speed and accuracy.
YOLO employs convolutional neural networks to detect objects in
real time and requires only a single forward propagation through the
network to detect objects. This improves detection results when
compared to RetinaNet and Fast R-CNN.
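As a minimal sketch of this step (assuming a pre-trained YOLOv5 model loaded through torch.hub; the exact detector variant, confidence threshold, and input file name are illustrative rather than the project's fixed choices):

    import torch

    # Load a small pre-trained YOLO model (assumption: the ultralytics/yolov5 hub entry).
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
    model.conf = 0.4  # illustrative confidence threshold

    results = model('frame.jpg')           # single forward pass over one image
    detections = results.pandas().xyxy[0]  # columns include xmin, ymin, xmax,
                                           # ymax, confidence, class, name
    print(detections[['xmin', 'ymin', 'xmax', 'ymax', 'name']])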

6.1.2 Intermediate Results Calculation


Our goal is to build a system that can estimate the distances between
a vehicle and various objects in a scene, such as cars, pedestrians, and
trucks.
To facilitate this process, we have modified the output format of our
object detection model to include a generated sheet containing detected
object coordinates. This sheet is organized with a row for each frame
of video, and a column with a coordinate (xmin, ymin, xmax, ymax)
for each detected object. We will use this sheet in the next step of our
process, which involves estimating the distances between the vehicle and
the detected objects.

Figure 17: Object Coordinate Sheet
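A sketch of how such a sheet can be generated; the detector is the YOLO model from the previous sketch, and the input video and output file names are hypothetical:

    import cv2
    import pandas as pd
    import torch

    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

    rows = []
    cap = cv2.VideoCapture('dashcam.mp4')  # hypothetical input video
    frame_id = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        det = model(frame[..., ::-1]).pandas().xyxy[0]  # BGR -> RGB for the detector
        for _, d in det.iterrows():
            rows.append({'frame': frame_id, 'name': d['name'],
                         'xmin': d['xmin'], 'ymin': d['ymin'],
                         'xmax': d['xmax'], 'ymax': d['ymax']})
        frame_id += 1
    cap.release()
    pd.DataFrame(rows).to_csv('object_coordinates.csv', index=False)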

In addition to generating this sheet of object coordinates, we have
also saved annotated video frames so that we can write our distance
estimation results back to them later. This will allow us to visualize
the distance estimates by overlaying them onto the video, which can be

a useful way to verify the accuracy of our estimates and identify any
potential issues.

6.1.3 Distance Estimation


To train a machine-learning model that can estimate the distances be-
tween a vehicle and objects in a scene based on the bounding box co-
ordinates of the detected objects, we will need a large dataset of video
frames with annotated object coordinates and corresponding distance
estimates.

This is achieved using the KITTI dataset. The KITTI dataset is a
widely-used dataset for evaluating and benchmarking computer vision
algorithms for autonomous driving applications. It contains a suite of
tasks that have been built using an autonomous driving platform.

The KITTI object detection dataset is a subset of the full KITTI
dataset that is specifically focused on object detection. It includes
monocular images and bounding boxes that annotate the locations of
various objects in the scene, such as cars, pedestrians, and trucks.

We will use this dataset to train a deep learning model that can
take in the bounding box coordinates of a detected object as input and
output an estimate of the distance to the object. This model will be
trained using a supervised learning approach, which involves providing
it with a set of input-output pairs and adjusting the model’s parameters
to minimize the error between the predicted output and the true output.

Once the model has been trained, we use it to estimate the distances
between the vehicle and objects in real-time as the vehicle moves through
the scene. To predict the distance (z) between a vehicle and objects in a
scene we use bounding box coordinates. The bounding box coordinates
(xmin, ymin, xmax, ymax) define a rectangular region in the image that

encloses an object. By analyzing the size and position of the bounding
box within the image, it is possible to estimate the distance (z) of the
object from the vehicle.

Figure 18: Object Distance Estimated from Coordinates of bounding box
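A minimal sketch of such a regressor in TensorFlow, assuming box/distance pairs have already been extracted from the KITTI labels into the two arrays below (the file names, layer sizes, and training settings are illustrative assumptions, not the project's fixed configuration):

    import numpy as np
    import tensorflow as tf

    X = np.load('kitti_boxes.npy')      # shape (n, 4): xmin, ymin, xmax, ymax (assumption)
    y = np.load('kitti_distances.npy')  # shape (n,): ground-truth distance z (assumption)

    regressor = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu', input_shape=(4,)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(1),       # predicted distance z (metres)
    ])
    regressor.compile(optimizer='adam', loss='mse', metrics=['mae'])
    regressor.fit(X, y, epochs=50, batch_size=32, validation_split=0.1)

    # Estimate the distance for one detected bounding box.
    z = regressor.predict(np.array([[448.0, 171.0, 575.0, 288.0]]))[0, 0]
    print(f'estimated distance: {z:.1f} m')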

6.1.4 Visualize Test Results


We have developed a basic visualizer that can be used to visualize the
results of our distance estimation system. The visualizer has two main
functions: writing estimated data to video frames and generating a video
from the frames.

Writing estimated data to video frames involves overlaying the dis-
tance estimates onto the video frames so that the estimates can be vi-
sually displayed as the video plays. This can be a useful way to verify
the accuracy of the distance estimates and identify any potential issues.

Generating a video from the frames involves creating a video file from
the individual video frames that have been annotated with the distance
estimates. This is done using a Python library (such as OpenCV), which
stitches the frames together and saves them as a video file in a specified format.

Figure 19: Frame from video with distance written on bounding box
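A sketch of the two visualizer functions, assuming OpenCV is used (the output file name, codec, and frame rate are illustrative choices):

    import cv2

    def annotate(frame, box, z):
        """Draw a bounding box and its estimated distance z on a frame."""
        xmin, ymin, xmax, ymax = [int(v) for v in box]
        cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
        cv2.putText(frame, f'{z:.1f} m', (xmin, ymin - 8),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
        return frame

    def frames_to_video(frames, path='annotated.mp4', fps=30):
        """Stitch annotated frames together and save them as a video file."""
        h, w = frames[0].shape[:2]
        out = cv2.VideoWriter(path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
        for f in frames:
            out.write(f)
        out.release()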

By using this visualizer, we can easily view the distance estimates as
a video, which can help us understand how the estimates change over
time and how they are affected by various factors such as the position
and orientation of the objects in the scene. This can be a useful tool for
debugging and improving the distance estimation system.

6.2 Drowsiness Detection


The methodology for drowsiness detection using EAR (Eye Aspect Ra-
tio) typically involves the following steps:

6.2.1 Data Collection


Video footage of the subject's face and eyes is collected using a camera
or webcam. The video is recorded while the subject performs a task

that requires attention, such as driving, working, or studying.

6.2.2 Eye Detection


The region of interest (ROI) is identified, which is usually the eyes and
surrounding areas. This is done using computer vision techniques.

6.2.3 EAR Calculation


The EAR is calculated using the horizontal and vertical distances be-
tween the landmarks of the eyes, which are typically the inner corner,
outer corner, and center of the eye. The EAR is a measure of how
open or closed the eyes are and is calculated as the ratio of the vertical
distance to the horizontal distance between the landmarks.

6.2.4 Drowsiness Threshold


A threshold value is set for the EAR, below which the person is consid-
ered drowsy. This threshold can be determined using statistical analysis
of the EAR values collected from a large number of subjects.

6.2.5 Drowsiness Detection


The EAR values are continuously monitored, and if the value falls below
the threshold, an alert is triggered to indicate that the person is
drowsy. This alert can take the form of an alarm, vibration, or visual
cue, depending on the application.
In summary, the methodology for drowsiness detection using EAR involves
collecting video footage, detecting the eyes, calculating the EAR, setting
a drowsiness threshold, monitoring the EAR values, and triggering an alert
when the threshold is crossed.

Eye Aspect Ratio (EAR) is a measurement used in computer vision and image
processing to detect eye blinks and estimate the level of drowsiness of a
person. EAR is calculated by measuring the ratio of the vertical and
horizontal distances between specific landmarks on the eye.

The two measurements used to calculate the EAR are the vertical distance
between the upper and lower eyelids and the horizontal distance between
the left and right corners of the eye. The vertical distance is taken as
the average of the distances between the upper and lower eyelid at two
points along the eye contour, and the horizontal distance as the distance
between the two outermost points on the eye contour.

EAR is calculated by dividing the vertical distance by the horizontal
distance. A higher EAR value indicates that the eye is more open, while a
lower EAR value indicates that the eye is more closed. When a person
blinks, the EAR value drops sharply towards zero, indicating that the eye
is fully closed.

Figure 20: Detecting Eye Points

EAR = (|p2 − p6| + |p3 − p5|) / (2 × |p1 − p4|)

EAR can be used to detect eye blinks in video footage or in real time
using a camera or webcam. It can also be used to estimate the level of
drowsiness of a person, as drowsiness is typically associated with a
decrease in the frequency of eye blinks and a decrease in the EAR value.
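A direct translation of this formula into code might look like the
following sketch, assuming the six eye landmarks p1..p6 are given as
(x, y) pairs in the order used above.

import numpy as np

def eye_aspect_ratio(eye):
    # eye: array of shape (6, 2); rows 0..5 correspond to points p1..p6
    vertical_1 = np.linalg.norm(eye[1] - eye[5])   # |p2 - p6|
    vertical_2 = np.linalg.norm(eye[2] - eye[4])   # |p3 - p5|
    horizontal = np.linalg.norm(eye[0] - eye[3])   # |p1 - p4|
    return (vertical_1 + vertical_2) / (2.0 * horizontal)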

For the calculation of EAR we require feature points of the face, which
are obtained using the dlib 68-point shape predictor.
’shape_predictor_68_face_landmarks’ is a pre-trained model used in
computer vision and facial recognition to detect and identify specific
landmarks on a human face. The model is trained using a machine learning
algorithm called a regression tree ensemble, on a dataset of faces with
labeled landmarks.

The model is called ’shape_predictor_68_face_landmarks’ because it
predicts the locations of 68 specific landmarks on a human face, such as
the corners of the eyes, the tip of the nose, and the corners of the
mouth. These landmarks are useful for a variety of applications, including
facial recognition, emotion detection, and facial expression analysis.

Figure 21: DLib Shape Predictor 68

The model takes an input image of a face and uses a set of regression
trees to predict the locations of the 68 landmarks. Each regression tree
predicts the location of one landmark, and the output of all the trees is
combined to produce the final set of landmark locations. The model is
optimized to produce accurate predictions while minimizing the error
between the predicted and ground truth locations.

The ’shape_predictor_68_face_landmarks’ model is widely used in
computer vision and facial recognition applications because it provides
a fast and accurate way to detect and identify specific landmarks on
a human face. The model has been used in a variety of applications,
including facial recognition systems, emotion detection systems, and
virtual makeup applications.
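Putting the landmark model and the EAR formula together, a frame-by-frame
drowsiness check could be sketched as below. The 0.25 threshold and the
20-frame count are assumed illustrative values rather than the exact
parameters tuned in the project, and eye_aspect_ratio is the helper
defined in the earlier sketch.

import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

EAR_THRESHOLD = 0.25   # below this, the eyes are treated as closed
CLOSED_FRAMES = 20     # consecutive closed-eye frames before alerting
counter = 0

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for face in detector(gray):
        shape = predictor(gray, face)
        pts = np.array([(shape.part(i).x, shape.part(i).y) for i in range(68)])
        # In the 68-point layout, indices 36-41 and 42-47 are the two eyes.
        ear = (eye_aspect_ratio(pts[36:42]) + eye_aspect_ratio(pts[42:48])) / 2.0
        counter = counter + 1 if ear < EAR_THRESHOLD else 0
        if counter >= CLOSED_FRAMES:
            cv2.putText(frame, "DROWSINESS ALERT", (30, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 0, 255), 2)
    cv2.imshow("drowsiness", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()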

Figure 22: Awake Driver

Figure 23: Sleepy Driver

6.3 Lane Detection
The methodology followed for Lane Detection is described as follows:

6.3.1 Image Capturing


The live image feed is captured from a camera device attached to the
machine. This video footage is used to detect the lane markings on the
road by processing the incoming stream frame by frame.

Figure 24: Image Captured from Camera

6.3.2 Greyscale Conversion


Grayscale conversion is a common technique used in image processing to
simplify an image and reduce its color information to a single channel.
Color images are usually represented as a combination of three color
channels (Red, Green, and Blue), each of which contains intensity values
for that color at each pixel location in the image. The image processing
algorithms used in the project only require a single channel of
information to work with, and grayscale conversion is a way to achieve
this.
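With OpenCV this step is a single call; the input file name below is a
hypothetical placeholder for one frame of the dashcam feed.

import cv2

frame = cv2.imread("road_frame.jpg")            # one frame of the feed
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # collapse BGR to one channel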

Figure 25: Greyscale Conversion

6.3.3 Gaussian Blurring
In the next step we apply Gaussian blurring to the image. This is done to
reduce noise in the image, as noise may hinder processing and lead to
inaccurate results. The noise may have crept in because of the image
sensor, compression, or transmission errors.

Gaussian blurring works by convolving the image with a Gaussian function,
which is a type of probability distribution. The Gaussian function assigns
higher weights to the central pixels and lower weights to the surrounding
pixels. This means that the pixels in the center of the Gaussian kernel
contribute more to the blurred image than the pixels at the edges.

Figure 26: Gaussian Blurring

The amount of blurring applied by the Gaussian filter depends on the size
of the kernel. A larger kernel size means that more pixels are averaged,
which results in more blurring. In general, a kernel size of 3x3 or 5x5 is
commonly used in image processing.

Gaussian blurring is particularly useful for preprocessing images before
performing edge detection or other image analysis techniques. By reducing
noise and detail in the image, it makes it easier to detect edges and
other features of interest.
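Continuing from the grayscale frame above, a 5x5 kernel (one of the
commonly used sizes) can be applied as follows; passing 0 for sigma lets
OpenCV derive the standard deviation from the kernel size.

import cv2

gray = cv2.cvtColor(cv2.imread("road_frame.jpg"), cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)  # 5x5 Gaussian kernel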

6.3.4 Canny Edge Detection
Canny edge detection is a popular technique used in image processing to
identify edges in an image. It is named after its inventor, John Canny,
and is widely used due to its accuracy and robustness.

The edges in an image often correspond to important features, such as
object boundaries or contours. By detecting these edges, it is possible to
extract these features and use them for further analysis, such as
detecting lanes in the image.

Figure 27: Canny Edge Detection

Canny edge detection includes a step called "non-maximum suppression,"
which keeps only the pixels that are local maxima of the gradient,
thinning and sharpening the detected edges. Combined with hysteresis
thresholding, this makes it possible to detect edges in noisy images more
accurately. Canny edge detection provides control over several parameters,
such as the low and high threshold values. These parameters can be tuned
to adapt to different images and to optimize edge detection for specific
applications. Canny edge detection is relatively fast and can be applied
to a wide range of image types and sizes, making it suitable for use with
low-powered devices for real-time lane detection.
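Applied to the blurred frame, edge detection reduces to one call; the
50/150 low and high thresholds below are common starting values, not the
project's final tuned parameters.

import cv2

gray = cv2.cvtColor(cv2.imread("road_frame.jpg"), cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blurred, 50, 150)  # low/high hysteresis thresholds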

6.3.5 Extraction of ROI


In image processing, a region of interest (ROI) refers to a specific area
or region within an image that contains the object or feature of interest.
It is a crucial step in many image processing applications, such as object
detection and tracking, as it helps to reduce processing time and improve
accuracy by focusing only on the relevant part of the image.

The region of interest can be defined by specifying the coordinates of the
vertices of a polygon that encloses the area of interest. Once the polygon
is defined, the pixels outside the polygon can be ignored or set to a
specific value, such as black or white, depending on the application.

Figure 28: Extraction of ROI

By applying a region of interest, the processing time and memory usage can
be reduced, since the algorithm only needs to analyze the relevant part of
the image, and the noise or background information outside the region of
interest can be ignored. This can also help to improve the accuracy of the
algorithm, since the focus is on the most relevant information.
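A sketch of this masking step, continuing from the Canny output above, is
shown below; the triangular vertices are hypothetical and would have to be
adjusted to the dashcam's mounting position and angle.

import cv2
import numpy as np

def region_of_interest(edge_img, vertices):
    # Keep only the pixels inside the polygon; zero out everything else.
    mask = np.zeros_like(edge_img)
    cv2.fillPoly(mask, [vertices], 255)
    return cv2.bitwise_and(edge_img, mask)

# `edges` is the Canny output from the previous step.
h, w = edges.shape
roi_vertices = np.array([(100, h), (w // 2, h // 2), (w - 100, h)],
                        dtype=np.int32)
masked = region_of_interest(edges, roi_vertices)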

6.3.6 Hough Transform


Hough transform is a popular technique used in image processing for
detecting lines, circles, and other shapes in an image. It was invented by
Paul Hough in 1962 and has since been widely used in computer vision and
image analysis applications.

The Hough transform works by transforming the image space into a parameter
space, where each point in the parameter space corresponds to a particular
line or circle in the image space. The Hough transform algorithm can
detect these lines or circles by finding the intersection points in the
parameter space, which correspond to the lines or circles in the original
image.

Figure 29: Hough Transform

The basic idea of the Hough transform is to represent each line in the
image space as a point in the parameter space. A line in the image space
can be represented by its slope and intercept, while a circle can be
represented by its center coordinates and radius. Once these parameters
are represented in the parameter space, it becomes easier to detect the
lines or circles in the image.
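In practice the probabilistic variant is typically applied to the masked
edge image; the resolution, vote threshold, and line-length parameters
below are reasonable starting points rather than the project's tuned
values, and `masked` is the ROI output from the previous sketch.

import cv2
import numpy as np

lines = cv2.HoughLinesP(masked, rho=2, theta=np.pi / 180, threshold=50,
                        minLineLength=40, maxLineGap=100)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        print(f"segment ({x1},{y1}) -> ({x2},{y2})")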

6.3.7 Image Marking


The lines obtained from the Hough transform are drawn on a blank image,
which is then blended back onto the original frame as a weighted overlay.
This helps to visualize the lanes detected in the incoming video frames,
and the marked frames can then be used for various purposes such as
steering assist and lane departure warnings.
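The overlay step could look like the following, continuing from the Hough
sketch above; the 0.8/1.0 blending weights are typical values for a
clearly visible overlay.

import cv2
import numpy as np

# `frame` is the original colour frame; `lines` comes from cv2.HoughLinesP.
line_img = np.zeros_like(frame)
if lines is not None:
    for x1, y1, x2, y2 in lines[:, 0]:
        cv2.line(line_img, (x1, y1), (x2, y2), (0, 0, 255), thickness=5)
marked = cv2.addWeighted(frame, 0.8, line_img, 1.0, 0.0)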

Figure 30: Marked Image

Figure 31: Lane Detection Process

7 Result Analysis
The work done in phase 1 of the project enables us to estimate the
distance of other vehicles and objects through the workflow pipeline
mentioned earlier. This can be done with low computing power and
can be implemented to work in real time without much overhead. The
machine learning model can be improved to increase its accuracy and
provide better estimation results.

The accuracy of the distance estimation system varied with the distance of
the objects from the vehicle or camera. The distance estimation works best
when the objects are in the range of 10 to 30 meters, as the error
percentage of the predicted distance versus the actual distance is the
lowest there. This is mainly because vehicles generally maintain a
distance of 5 to 30 meters from the vehicle in front, so most of the
training dataset contained videos with distances of the same order;
because of this, the model is able to estimate the distance of objects in
this range with the highest degree of precision.

Figure 32: Actual Distance Vs. Predicted Distance

The distance estimation is also affected by the camera's field of view, as
this strongly affects the relative size of objects like cars and humans in
the video. With a very large field of view, the estimated distance is
found to be lower than the actual distance, and as the field of view
becomes smaller, the predicted distance increases compared to the actual
distance. The testing in this project was done with videos of a similar
field of view to get better results.
The graph shows that the predicted distance is overestimated as the actual
distance from the vehicle increases. This behavior can be accounted for,
and the final estimated distance can be adjusted accordingly. However, the
main function of this distance estimation is to save the driver from the
danger of collision, which matters most when other objects are near the
vehicle itself. In this case the model works as expected: as objects get
closer to the camera and the vehicle, the error rate decreases.

Figure 33: Distance Vs. Error Rate

The object detection module of the system is trained on videos from the
KITTI dataset. The dataset contains videos shot in and around Karlsruhe,
Germany. The vehicles on the roads in Germany differ from the ones found
on Indian roads; because of this, object detection takes a slight hit in
cases where the scenes contain vehicles like tractors, trolleys, and
rickshaws, as these are not present in the videos of the KITTI dataset.

The results from Phase 2 of the major project relate to Drowsiness
Detection and Lane Detection. The model created for the analysis of
drowsiness works well in well-lit conditions and provides an accuracy of
over 90.5%, as seen in the confusion matrix (Figure 34).

The module is able to detect if the driver is drowsy based on two
parameters: the EAR (Eye Aspect Ratio) and the MAR (Mouth Aspect Ratio).
The EAR enables us to find out if the eyes of the driver are drooping, and
it also performs well in tests. The MAR helps us to know if the driver is
yawning, and after a certain number of yawns an alert can be generated.
The MAR produces good results in lab conditions, but in the real world the
accuracy dips a bit if the driver is talking to someone, as the mouth
movement gets accidentally tagged as yawning.

Figure 34: Confusion Matrix for Drowsiness Detection

Figure 35: Drowsiness Detection

The accuracy is not consistent for everyone, as people have different eye
shapes. For instance, individuals with Asian ancestry typically have wider
eye shapes, leading to lower EAR values even when their eyes are open.
This may cause incorrect detection of drowsiness if a fixed EAR threshold
is used. To solve this problem, we can calculate the EAR values for each
individual with their eyes open and set a personalized threshold for each
person. This would improve the accuracy of EAR-based drowsiness detection
for people with different eye shapes.
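A minimal sketch of this calibration idea is shown below; the choice of
the median and the 0.75 scaling factor are assumptions for illustration,
not values validated in the project.

import numpy as np

def calibrate_threshold(open_eye_ears, factor=0.75):
    # open_eye_ears: EAR samples recorded while the driver's eyes are open.
    baseline = float(np.median(open_eye_ears))  # median is robust to blinks
    return factor * baseline                    # personalized drowsiness cutoff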

The lane detection is based on computer vision techniques, which allow
fast processing of the input frames while still giving reliable results.
The model is able to achieve an accuracy of 87% in tests performed with
videos sourced from the internet. The model requires specific tuning of
the Canny edge detection thresholds and the Hough transform parameters for
line detection if the input video changes drastically. The region of
interest mask for the image also has to be modified based on the position
and angle of the dashcam, as the road view changes accordingly.

Figure 36: Lane Detection

The lane detection works well for roads with clear, straight lane
markings, but when the markings are obstructed by other vehicles or the
road curves at a steep angle, the results are not very accurate. This can
be improved by using the fact that lane lines are highly correlated across
adjacent frames, i.e., the markings do not change by a great degree from
one frame to another. In tough-to-detect scenarios, the markings of the
previous frames can be fed to a regression model to predict the likely
lane marking for the given frame and optimize it against the available
video frame. This can help to further increase the accuracy and make the
lane detection model more robust.
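One simple way to exploit this frame-to-frame correlation is exponential
smoothing of each lane line's parameters, sketched below; representing a
line by (slope, intercept) and the 0.2 blend factor are assumptions for
illustration, not the project's implementation.

def smooth_lane(prev_params, new_params, alpha=0.2):
    # prev_params / new_params: (slope, intercept) of one lane line, or None.
    if new_params is None:    # nothing detected: fall back to the old estimate
        return prev_params
    if prev_params is None:   # first detection: adopt it directly
        return new_params
    slope = (1 - alpha) * prev_params[0] + alpha * new_params[0]
    intercept = (1 - alpha) * prev_params[1] + alpha * new_params[1]
    return (slope, intercept)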

8 Conclusion
Drowsy driving is a significant problem that can lead to fatal accidents.
Drowsiness detection technology offers a solution by detecting signs of
fatigue and alerting drivers when they are at risk of falling asleep. The
technologies involved in drowsiness detection, such as eye-tracking and
facial recognition, are constantly evolving and improving, making the
systems more effective in preventing accidents. The potential impact
on road safety is significant, as evidenced by the reduction in accidents
observed in studies conducted on drowsiness detection systems.

Lane detection using computer vision is a challenging problem due to the
wide range of environmental conditions and the variability of the lane
markings themselves. Sensors other than dashcams, like infrared cameras or
thermal imaging, have the potential to overcome environmental factors and
deliver better accuracy. Also, with advances in computer vision algorithms
and hardware, lane detection systems are becoming increasingly accurate
and reliable.

Vehicle distance estimation is an important task in many computer vision
applications, such as autonomous driving, advanced driver assistance
systems (ADAS), and traffic monitoring. The accuracy of distance
estimation algorithms depends on several factors, including the quality of
the input images, the choice of features and metrics, and the complexity
of the model.

In general, the performance of estimation algorithms improves with the
increasing complexity of the model and the use of multiple sensors.
However, more complex models may be computationally expensive and may
require more processing power. More computationally expensive versions of
the model created in the project can be built to achieve higher accuracy
at the cost of more processing time and an increase in power consumption.

The systems explored in the project can be adopted by fleets in the
transportation sector, as these help them to increase their productivity
while keeping their drivers and vehicles safe even on longer drives.
Nowadays many people install dashcams in their vehicles, and this system
can be integrated with the dashcams to facilitate better availability and
reach. Many of the new cars being released today have some sort of driving
assistance system, but there are no such systems for the cars that have
already been sold and are on the road. For those vehicles, dashcams with
the corresponding software are one of the ways to make driving safer.
