
DESIGN AND SIMULATION OF OPTIMIZED TRAFFIC

LIGHT CONTROL SYSTEM USING YOLOV8

A Thesis Presented to the

College of Science and Technology

Adventist University of the Philippines

In Partial Fulfillment of the

Requirements for the degree of

Bachelor of Science in Electronics Engineering

By:

Bualoy, Jan Eliel

Domingo, Golden Dhan

Erer, Ivan Floyd

May 2024
APPROVAL SHEET

This engineering design entitled “Design and Simulation of Optimized Traffic Light Control System Using YOLOv8” by Jan Eliel Bualoy, Golden Dhan Domingo, and Ivan Floyd Erer has been examined and approved by the Panel of Oral Examiners

for the degree of

BACHELOR OF SCIENCE IN ELECTRONICS ENGINEERING

Winelfred G. Pasamba
Adviser

PANEL OF EXAMINERS

Examined and approved by the Committee on Oral Examination:

Engr. Elmer P. Joaquin Engr. Melquiades B. Garrino


Panel Member Panel Member

Engr. Melvin T. Valdez Engr. Jonalyn D. Castaño


Panel Member Panel Member

Dr. Lorcelie B. Taclan


External Panel Member

Dr. Edwin A. Balila


Chair

Accepted as partial fulfillment of the requirements for the degree of Bachelor of Science in Electronics Engineering

Engr. Jonalyn D. Castaño Dr. Edwin A. Balila


Chair, Electronics Engineering Dept. Dean, College of Science and Technology

ABSTRACT

This research explores the development and implementation of an optimized traffic

light system using YOLOv8 as a computer vision model to address the evolving dynamics

of urban traffic. Leveraging deep learning models such as YOLOv8n for object detection

and image classification, the system aims to improve traffic flow efficiency and safety at

intersections. The study evaluates the performance of the model through rigorous

validation, demonstrating promising results in detecting vehicles and classifying traffic

density under various environmental conditions. Detection at nighttime tends to decrease the model's accuracy, while classification reached a top-1 accuracy of 98% on average across different environmental conditions, underscoring the weight of the classification model used alongside the detection model as a hybrid technique to optimize traffic light timing systems.

Safety concerns dictate that on-site implementation is deferred until the system reaches a

mature stage of development. The researchers navigate these challenges by developing a miniature-scale prototype representative of real-world conditions, resulting in a decrease in mean %extra green time from 38.28% to 10.81%. Recommendations for further research

emphasize the need for continuous improvement, scalability, and integration of user

interface development to enhance the system's adaptability and responsiveness. Through

continuous refinement and adherence to safety standards, the optimized traffic light system

presents a promising solution to mitigate urban traffic congestion and enhance road safety.

Keywords: computer vision, traffic flow, object detection, image classification,

traffic density

ACKNOWLEDGEMENT

We would like to express our deepest gratitude to everyone who made this thesis

capstone research possible.

First, to our Almighty God for the wisdom, strength, and guidance that have sustained us throughout this research journey.

Special thanks to the Adventist University of the Philippines, College of

Science and Technology, Electronics Engineering Department, for providing the

necessary resources and support to pursue this study.

We extend our heartfelt appreciation to Winelfred Pasamba, our thesis adviser,

for his invaluable guidance, encouragement, and expertise that significantly contributed

to the completion of this work.

Our sincere thanks to Engr. Jona Castaño, Engineering Department Chair, Engr.

Elmer Joaquin, Machine Learning Advisor, and Dr. Lorcelie Taclan, External

Consultant, for their insightful feedback and assistance that helped us grow in this

research field that we decided to pursue.

Lastly, we are grateful to our family and friends for their unwavering love,

understanding, and encouragement throughout this endeavor.

We extend our heartfelt gratitude to you all.

TO GOD BE THE GLORY!

TABLE OF CONTENTS

TITLE PAGE ....................................................................................................................... i


APPROVAL SHEET .......................................................................................................... ii
ABSTRACT....................................................................................................................... iii
ACKNOWLEDGEMENT ................................................................................................. iv
TABLE OF CONTENTS.....................................................................................................v
LIST OF FIGURES ........................................................................................................... vi
LIST OF TABLES ............................................................................................................ vii

CHAPTER 1
THE PROBLEM AND ITS BACKGROUND ..........................................................1
Introduction ..............................................................................................................1
General Objectives ...................................................................................................3
Specific Objectives ..................................................................................................3
Significance of the study ..........................................................................................3
Scope and Limitation ...............................................................................................4
CHAPTER 2
REVIEW OF RELATED LITERATURE ................................................................5
Definition of Terms................................................................................................11
Theoretical Framework: .........................................................................................12
CHAPTER 3
MATERIALS AND METHODOLOGY .................................................................15
MATERIALS
Software Components ......................................................................................15
Transfer Learning Models................................................................................15
Datasets ............................................................................................................17
Electronic Components: ...................................................................................18
METHODOLOGY
I. Traffic Density Model ..................................................................................21
A. Deep Learning Pipeline .........................................................................21
B. Image Classification and Object Detection ...........................................24
C. Inference Processing .............................................................................25

II. CCTV System and Traffic Light System with Integrated Optimized System
..........................................................................................................................32
III. Image Detection through Varied Environmental Conditions ....................39
DESIGN CONSTRAINTS ....................................................................................41
CHAPTER 4
RESULTS AND DISCUSSION ................................................................................42
I. YOLOv8 Model Validation Results .............................................................42
II. Optimized Traffic Light Timing .................................................................48
III. Environmental Conditions Performance ....................................................50
CHAPTER 5
CONCLUSION AND RECOMMENDATION .......................................................53
REFERENCES ..................................................................................................................57
APPENDICES ...................................................................................................................60
Appendix A: Initial Draft Review Matrix ....................................................................60
Appendix B: Local Government Cooperation Summary.............................................61
Appendix C: Change Matrix ........................................................................................63
Appendix D: Request Letter for CCTV and Traffic Light Access ..............................65
Appendix E: Traffic Light Lanes Raw Data ................................................................66
Appendix F: Source Code ............................................................................................68
Appendix G: Similarity Index......................................................................................76
Appendix H: Certificate of Technical Editing .............................................................77
CURRICULUM VITAE ....................................................................................................78

LIST OF FIGURES

Figure 1. Deterministic Component of Delay Models. ........................................................6


Figure 2. YOLOv8 Comparison to Previous Versions of YOLO........................................9
Figure 3. Comparison of YOLO and other state of the art detectors. ...............................10
Figure 4. Hybrid Traffic Light Optimization Framework .................................................13
Figure 5. Arduino Uno and specifications. ........................................................................18
Figure 6. Beaglebone Black and specifications. ................................................................19
Figure 7. ESP32 cam and specifications. ...........................................................................19
Figure 8. Traffic Light Module and specifications. ...........................................................20

Figure 9. Logic Level Bi-directional Converter and specifications. ..................................20
Figure 10. Image Classification and Object Detection Deep Learning Pipeline ...............22
Figure 11. Accuracy, Precision and Recall Equations .......................................................22
Figure 12. Sample Precision-Recall Curve ........................................................................23
Figure 13. YOLOv8 Architecture, visualization................................................................24
Figure 14. Sample image classification and sample object detection ................................25
Figure 15a. Model Inference Process Flow. ......................................................................26
Figure 15b. Timer and Variables Initiation Process Flow. ................................................27
Figure 15c. Pedestrian Signal Process Flow. .....................................................................28
Figure 15d. Next Signal Lane Process Flow. .....................................................................30
Figure 15e. Warning Signal Process Flow. ........................................................................31
Figure 16. Block diagram of CCTV system to Traffic Light system ..............................33
Figure 17a. Traffic Light System Process..........................................................................34
Figure 17b. Automatic Cycle of Traffic Light Process......................................................35
Figure 17c. Manual Cycle of Traffic Light Process. .........................................................36
Figure 18. Miniature Traffic Light system with proposed dimensions. ............................37
Figure 19. Traffic Light Controller and Optimized Actuator schematic. ..........................38
Figure 20. Mean %Extra Green Time formula. .................................................................39
Figure 21. Simulated environmental conditions. ...............................................................39
Figure 22. Object Detection Precision-Recall Curve .........................................................43
Figure 23. Object Detection Validation Results. ...............................................................43
Figure 24. Object Detection Confusion Matrix. ................................................................44
Figure 25. Sample Object Detection Prediction. ..............................................45
Figure 26. Image Classification Validation Results. .........................................................46
Figure 27. Image Classification Confusion Matrix............................................................46
Figure 28. Sample Image Classification Prediction. ..........................................................47
Figure 29. Sample validation of different environmental conditions. ...............................52

LIST OF TABLES
Table 1. YOLOv8n Performance .......................................................................................16
Table 2. YOLOv8n-cls Performance .................................................................................17
Table 3. Traffic Light Lanes Average Timing Comparison ..............................................49
Table 4. Validation Performance per Environmental Condition .......................................51

CHAPTER 1

THE PROBLEM AND ITS BACKGROUND

The Philippines is one of the fastest growing countries in Southeast Asia. With more industries launching and urbanization expanding, the capital region, Metro Manila, ranks eighth worst in the world in hours spent in traffic, based on a recent study by UK-based insurance technology site GoShorty (Zaldarriaga, 2023). This traffic congestion problem also costs the country an economic loss of approximately 3.5 billion pesos daily (JICA, 2022). Thus, the pursuit of a solution to alleviate this problem in the country is continuous.

One of the solutions devised by the MMDA (Metropolitan Manila Development Authority) Road Safety Unit Traffic Operations was upgrading its fixed countdown traffic lights with advanced traffic signals that operate based on an intersection's volume of vehicles (CNN Philippines Staff, 2022). This upgrade cost 295 million pesos (Gamil, 2014). Some of the technologies utilized in this project include closed circuit television cameras (CCTVs) and underground loop detectors. However, many traffic light systems outside the scope of the MMDA still use fixed-timer technology to control traffic lights.

Induction loop detectors are an expensive upgrade, as shown by the past MMDA project, which focused on giving motorists more green-light time. Though this system can be reliable and requires minimal maintenance, it can only account for an approximate overall volume of traffic.

Most crossings with a traffic light, with or without a vehicle detector, include a CCTV (Closed Circuit Television) camera monitored for motorists' safety. These camera systems are regularly monitored but do not contribute to the control of the traffic lights, since most systems do not have a central control system but use a fixed timer or a countdown system.

The optimization of these traffic light systems can build on the CCTV systems already installed at signalized intersections using computer vision. Different techniques are being studied and developed that have led to the use of computer vision as a solution to the traffic congestion problem. One notable research effort in the Philippines using computer vision proposed an intelligent traffic light system that measures the traffic density in an intersection regardless of the time of day (Nodado et al., 2018). This method used a technique called image segmentation. Image segmentation is a key part of computer vision, as it allows features to be defined that consolidate into a traffic density variable.

Deep learning using popular object detection techniques such as YOLO (“You Only Look Once”) is also being studied by counting detections and tracking objects, but this proved challenging during night scenes, when the objects to be detected do not meet the detection criteria (Sharma et al., 2021). Electronic sensor methods such as PIR (Passive Infrared) sensors have also been considered; that method focuses on motion detection and density estimation (Agarwal, 2016). Across the implementations and methods studied, most traffic light optimizations focus on traffic density or on counting multiple detections.

This study focuses on detecting objects and calculating traffic density through a hybrid of image classification and object detection techniques merged into a single pipeline to optimize the traffic light control system.
General Objectives

To design and implement an optimized traffic light system with a complete

electronic physical system and a custom model pipeline to be evaluated for different

environmental conditions.

Specific Objectives

1. To develop a traffic density calculator pipeline using YOLOv8 classification and detection models that perform accurately both in daytime and at night.

2. To integrate the optimized system, with minimal requirements, into a traffic light system prototype with multiple video IP stream inputs to minimize extra green signal time.

3. To evaluate the performance of the proposed traffic light control system under different environmental conditions such as, but not limited to:

a. Daytime

b. Night

c. Fog

d. Rain

Significance of the study

Computer vision techniques have proved their importance to society by utilizing advances in imaging technology to improve efficiency and to automate and accelerate a variety of tasks. The study will benefit current traffic light implementations by providing an additional computer vision technique and a low-cost installation, operation, and maintenance of the traffic light control system. It will utilize existing CCTV systems, thus eliminating the need for additional infrastructure.

Scope and Limitation

The study is limited by the following in designing and implementing the optimized traffic light control system using a hybrid of image classification and object detection:

1. The design will be a prototype implementation focused on the workflow and integration of the computer vision system and NOT on the overall stability of the separate traffic light control system circuit.

2. Model performance is dependent on the available datasets such as COCO and

datasets with open-source attribution.

3. The CCTV system will be proxied through video streams instead of a real-time

stream due to security and safety factors. CCTV video will be sourced at a local 4-

way intersection or T intersection using on-site footage.

4. The system optimization is only applicable during non-peak traffic hours; at peak traffic hours, the system adopts the maximum value based on the local intersection's signal timing.
CHAPTER 2

REVIEW OF RELATED LITERATURE

Traffic light control can quickly grow complex in terms of the distinct phases to define, the number of lanes in an intersection, the type of lighting patterns, the light actuation type, and other overlooked parameters. In this study, the scope and limitation of the prototype implementation are defined with a focus on a simple workflow of the system.
Phase rotation is a particularly important concept when trying to optimize a traffic light. Not all intersections are the same in terms of their cycles, timing, and phases. The study focuses on a cycle with exclusive phasing, which allows only a single lane at a time to flow to the rest of the lanes. Thus, it does not allow concurrent phases, or phases that can share green time, for example, opposing lanes in a 4-way intersection turning left at the same time (Buckholz, 2024).

Another basic concept for signal timing models is built on the assumptions that there is a zero initial queue at the start of each phase with uniform arrival and departure flow rates. Notice in Figure 1 that after Qs, or the queue, clears, there is always extra effective green time allotted for changes in traffic flow rates at every cycle. This means that when traffic density increases, this cycle fails; on average, when the queue is finished, the wasted green light time is up to 25%, hence the need for another control strategy (Rouphail et al., 1997).

Figure 1. Deterministic Component of Delay Models. Adapted from Rouphail et al.,

1997.

There are three major traffic control strategies used to solve traffic light timing: fixed-time, actuated, and adaptive. Fixed-time is the conventional and most common way to time traffic lights, using a fixed duration for each cycle. The second is actuated control, which this research seeks to expand on by applying sensors (in this research, video-based sensors, i.e., CCTVs) that can apply simple logic criteria such as green light extension, gap-out to eliminate timing gaps, and max-out to let the maximum traffic flow in each cycle (Eom & Kim, 2020).

To further the study, a manual from the US Department of Transportation Federal Highway Administration Office of Operations defines a key mode that will be simulated by the research: fully-actuated control. This refers to a mode in which, for a given intersection, all phases are actuated, and detection is therefore needed on every approach (US Department of Transportation, 2021).

The actuated mode of control is required for digital detection to be integrated into the system. This type of control is available in the Philippines, with Cebu City spending 480 million pesos on fully adaptive countdown timers, maximizing CCTVs with artificial intelligence detection for fully actuated control (Sunnexdesk, 2023). This type of project is already available and in production. However, computer vision solutions are still being studied and present problems regarding the quality of inferences at night. Thus, the research presents a hybrid method of image classification and object detection.

Studies regarding the use of computer vision for traffic light control present models that use image classification, object detection, or even image segmentation, with a single model deployed on the desired application system. One proposed method uses object detection with the YOLO (“You Only Look Once”) technique and SORT (Simple Online and Real-time Tracking) to track objects across frames, then uses the count of objects to facilitate the timing of the traffic lights. However, because computer vision relies on the camera input to provide an inference, the accuracy of inference suffers when vehicles are close to the camera frame, cast large shadows, or appear at night (Sharma et al., 2021). Using YOLO as a single approach, or object detection on its own in general, tends to affect the calculation of traffic density, for which many other approaches are also studied.

Simple computer vision techniques such as image processing have also been studied in the Philippines. Image segmentation through edge detection is a method of manually looking for features that would define traffic density. Edge detection produces contours, which then segment the image into different contour regions. The binary regions of the vehicle contours were counted, which equates to manually defining a vehicle using contours, and the count is then used to calculate the overall traffic density (Nodado et al., 2018).

Other techniques studied are variations of image processing that use kernels to do edge detection and then use a feature-based matching technique to calculate traffic density. A reference image without vehicles is matched against a target image with vehicles, and the matching confidence percentage defines the traffic density: the higher the confidence level of the feature matching, the lower the traffic density (Meng et al., 2021).

Thus, in line with the direction of the research, two different computer vision model approaches are used to create inferences for the traffic light. With image classification and object detection techniques working simultaneously at runtime, the researchers seek to create a reliable method for actuating traffic signals. For this purpose, the YOLOv8n model was chosen for its reliability and fast inference times, effective for real-time processing. Compared with previous YOLO versions, YOLOv8 is the highest performing model even at its smallest model size, as shown in Figure 2 (Reis et al., 2023).
Figure 2. YOLOv8 Comparison to Previous Versions of YOLO. Adapted from

Ultralytics, 2024.

This model architecture for object detection, YOLO, short for "You Only Look Once," is an innovative approach that maps detection onto a grid system and uses a single end-to-end convolutional neural network, which makes it an extremely fast and simple model fit for real-time applications. The model continues to scale, grow, and be modified to fit the growing demands of object detection (Redmon et al., 2016). Many implementations of YOLO perform significantly better than other state-of-the-art detectors, especially in inference times; even its smaller versions achieve mAP comparable to the normal-sized models of other detectors (Long et al., 2020).
Figure 3. Comparison of YOLO and other state of the art detectors. Adapted from Long

et al., 2020.

With the fast development of versions of this model architecture, YOLOv8 was created and is still in active development at the time of writing. This state-of-the-art YOLO model is a new way to follow up on previous studies. The model has a high rate of accuracy measured on Microsoft COCO and Roboflow 100, and it comes with many features that help in developing models for both image classification and object detection (Solawetz & Francesco, 2023). This model will be employed by the researchers to provide a hybrid approach to traffic light optimization using computer vision.

Definition of Terms

● Computer Vision – the use of computer machines to interpret and recognize visual

objects.

● Machine Learning – a subset of Artificial Intelligence focused on a computer

system that can learn a specific way of thinking by past data to predict outcomes

and learn patterns.

● Deep Learning – a subset of Machine Learning that uses artificial neural networks

to create more complex predictive models that imitate the human brain learning

process.

● Artificial Intelligence – a process or a method that simulates human behavior by

creating complex neural networks.

● Image Processing – the process of transforming and manipulating images to

extract useful information.

● Image Segmentation – the use of image processing to partition an image into

different regions based on features defined.

● Object Detection – computer vision technique to locate instances of objects in an

image or video.

● Transfer Learning – a machine learning method where a model is reused for

another task.

● One Stage Model – a class of object detection architectures; in this research it is treated as synonymous with transfer learning, since the model is reused for another task.

● Inference (Machine Learning) – the process of running data points into a machine

learning model to calculate an output such as a single numerical score.

● Dataset – data gathered for training the model; in this research, it is referred to as

images collected or an already curated dataset that contain substantial amounts of

images.

● Python – a computer programming language capable of using multiple scientific libraries and commonly used for machine learning applications.

● Traffic Light Control – the control of signaling devices commonly positioned at

road intersections.

● Traffic Density – the measurement of how much traffic is at a given location area.

● Exclusive Phase – a traffic signal phase in which, at a go/green signal, a single lane goes through the intersection exclusively, in any direction.

Theoretical Framework:

As displayed in Figure 4, there are four stages the whole optimized light control

system must complete: the Dataset preparation, the Model pipeline, the Model inference

processing, and the Traffic Light Control System Application.

Figure 4. Hybrid Image Classification and Object Detection for Traffic Light Optimization
Framework

The Dataset stage is a process where the dataset is gathered and prepared. There are two datasets prepared: one for the image classification, with labels Fire, Accident, Sparse Traffic, and Dense Traffic; and one for the object detection, with labels Person, Car, Bicycle, Motorcycle, Bus, and Truck.

The Model (Machine Learning) Pipeline stage starts at the input of the dataset and ends with the model exportation or packaging, after which the model is used for inference. For image classification and object detection, there are two YOLOv8 models chosen, namely yolov8n-cls for classification and yolov8n for detection. These models will be analyzed and validated to maximize their output for use in the pipeline.
The Model Inference Processing stage is where the computation for the Traffic Light Control System happens. Both image classification and object detection run inference, and both are considered in the interpretation of the traffic light signal. This output then becomes the input for the Application stage, composed of the Traffic Light Control System circuit.

CHAPTER 3

MATERIALS AND METHODOLOGY

The research materials are composed of the software components, the transfer learning models to be used, and the electronic components on which the inferences will be run and with which the traffic light prototype system will be built.

MATERIALS

Software Components:

● Python

● Ultralytics

● Google Colab

Transfer Learning Models:

● YOLOv8n

Table 1 shows that YOLOv8n is the fastest of all the YOLOv8 models due to its

smaller parameter size at 3.2 million parameters. It has a fast inference speed at

0.99ms on an A100 TensorRT. Thus, this model is chosen to give the fastest

inferences while running with additional computation scripts.

Table 1

YOLOv8n Performance

Model       size      mAPval   Speed CPU    Speed A100      Params   FLOPs
            (pixels)  50-95    ONNX (ms)    TensorRT (ms)   (M)      (B)

YOLOv8n     640       37.3     80.4         0.99            3.2      8.7
YOLOv8s     640       44.9     128.4        1.20            11.2     28.6
YOLOv8m     640       50.2     234.7        1.83            25.9     78.9
YOLOv8l     640       52.9     375.2        2.39            43.7     165.2
YOLOv8x     640       53.9     479.1        3.53            68.2     257.8

Note. Adapted from Ultralytics

● YOLOv8n-cls

Table 2 displays that this model is also the fastest of all the YOLOv8 classification

models due to its smaller parameter size at 2.7 million parameters. It has a fast

inference speed of 0.31ms on an A100 TensorRT.

Table 2

YOLOv8n-cls Performance

Model         size      acc    acc    Speed CPU   Speed A100      Params   FLOPs
              (pixels)  top1   top5   ONNX (ms)   TensorRT (ms)   (M)      (B) at 640

YOLOv8n-cls   224       69.0   88.3   12.9        0.31            2.7      4.3
YOLOv8s-cls   224       73.8   91.7   23.4        0.35            6.4      13.5
YOLOv8m-cls   224       76.8   93.5   85.4        0.62            17.0     42.7
YOLOv8l-cls   224       76.8   93.5   163.0       0.87            37.5     99.7
YOLOv8x-cls   224       79.0   94.6   232.0       1.01            57.4     154.8

Note. Adapted from Ultralytics

Datasets:

● Traffic-Net Dataset

● Common Objects in Context (COCO)

Electronic Components:

Arduino Uno - The Arduino Uno, as exhibited in Figure 5, functions as the GPIO extension of the computing device. It controls and manages the electronic inputs and outputs, which are driven from the computer using Pyfirmata.

ARDUINO UNO: SPECIFICATIONS

● Clock Speed: 16MHz


● Operating Voltage: 5V
● Supply Voltage: 7-12V
● Analog Inputs: 6
● Digital IO Pins: 14

Figure 5. Arduino Uno and specifications. Adapted from Arduino, 2024.
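As an illustration of this role, the following minimal Python sketch drives an actuation pin through Pyfirmata; the serial port name and pin assignment are hypothetical placeholders, not the exact values of the prototype (the full program appears in Appendix F).

    import time
    from pyfirmata import Arduino

    board = Arduino('/dev/ttyACM0')   # hypothetical serial port of the Uno

    NEXT_LANE_PIN = 8                 # hypothetical digital pin assignment

    def send_pulse(pin, width_s=0.1):
        # Drive the pin HIGH briefly; the controller reads the rising edge.
        board.digital[pin].write(1)
        time.sleep(width_s)
        board.digital[pin].write(0)

    send_pulse(NEXT_LANE_PIN)         # e.g., actuate the Next Signal Lane input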

Beaglebone Black - The Beaglebone Black as presented in Figure 6, serves as the

Traffic light controller processing unit. It receives signal actuations to effectively

change signals on the traffic lights. It is mediated by a logic level converter to

receive 3.3V signals and actuations.

BEAGLEBONE BLACK: SPECIFICATIONS

● Processor: Sitara AM3359AZCZ100 1


GHz, 2000 MIPS
● Memory: 512MB DDR3L, 4KB EEPROM,
4GB Embedded MMC
● Boot Modes: eMMC Boot, SD Boot, Serial
Boot, USB Boot
● Power Sources: USB power, 5V DC 1A
● Indicators: 1 Power, 2 Ethernet, 4 Blue
LEDs

Figure 6. Beaglebone Black and specifications. Adapted from Beagleboard.org

Foundation, 2024.

ESP32-CAM - The ESP32-CAM, as shown in Figure 7, serves as a CCTV proxy and the camera of the system.

ESP32: SPECIFICATIONS

● Operating Voltage: 4.75V-5.25V


● IO Pins: 9
● Frequency at 240MHz
● Flash:4MB
● RAM 320KB

Figure 7. ESP32CAM and specifications. Adapted from PlatformIO, 2024.

Traffic Light Module (Red, Green, Yellow) - This module in Figure 8, serves as

the physical interface of the traffic light.

TRAFFIC LIGHT MODULE


SPECIFICATIONS

● 8mm Red, Yellow and Green LEDs


● LEDs active HIGH
● Built-in current limiting resistors
● VCC: 3.3-5V
● ITyp: 8-9mA @5V
● Dimensions: LxW (PCB) 56mm x 13mm
(2.2 x 0.51”)

Figure 8. Traffic Light Module and specifications.

Logic Level Bi-directional Converter - This converter as seen in Figure 9, serves

as the mediating interface between the 5V Arduino Uno and the 3.3V Beaglebone

Black. It helps the GPIO input and output of the devices communicate with each

other.

LOGIC LEVEL BI-DIRECTIONAL


CONVERTER SPECIFICATIONS

● Lower Voltage 1.5-7V


● Higher Voltage up to 18V
● 4 Bidirectional channels

Figure 9. Logic Level Bi-directional Converter and specifications.

METHODOLOGY

The methods are composed of the solutions for the specific research objectives: the Model Pipeline; the CCTV and Proposed System Integration Prototype; and Image Detection at Varied Environmental Conditions.

I. Traffic Density Model

A. Deep Learning Pipeline

The machine learning pipeline is defined in Figure 10, with input from the Image Classification dataset and the Object Detection dataset. Data Preparation organizes the structure of the dataset, including the data split of 70% training set, 15% test set, and 15% validation set. After splitting, the data is fed to model training, for which a transfer model is also chosen. The transfer models used are based on YOLOv8, with optimized weights in its layers that define the feature scaling of the image inputs. The training environment is the Google Colaboratory environment (Google Colab, or Colab), which hosts Jupyter Notebook instances. Google Colab provides up to 12 gigabytes of RAM and a dedicated GPU for training the model. Model Analysis and Validation is also run in the same Colab instance, after which the model is retrained and refined. The model is then exported after a series of retraining and validation. The export is in a PyTorch model format, which is easily converted for inference on multiple edge devices. The model is then deployed to the platform and the device it will run on.

Figure 10. Image Classification and Object Detection Deep Learning Pipeline
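A minimal sketch of the training and validation steps of this pipeline, using the Ultralytics API, is shown below; the dataset paths are hypothetical placeholders.

    from ultralytics import YOLO

    # Object detection: fine-tune the YOLOv8n transfer weights
    det = YOLO('yolov8n.pt')
    det.train(data='traffic.yaml', epochs=100, imgsz=640, optimizer='AdamW')
    det_metrics = det.val()               # Model Analysis and Validation step
    print(det_metrics.box.map)            # mAP50-95

    # Image classification: fine-tune yolov8n-cls on a folder-structured dataset
    cls = YOLO('yolov8n-cls.pt')
    cls.train(data='traffic_net/', epochs=100, imgsz=224, optimizer='AdamW')
    cls_metrics = cls.val()
    print(cls_metrics.top1)               # top-1 accuracy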

To evaluate the models, certain metrics are applied. The image classification metric is Accuracy, as displayed in Figure 11. As for object detection, the Precision-Recall curve exhibited in Figure 12 is utilized as a substitute for Accuracy, and the mAP (mean Average Precision) is applied, which is the area under the curve of the Precision-Recall plot.

Accuracy = Number of correct predictions / Total number of predictions

Precision = True Positives / (True Positives + False Positives)

Recall = True Positives / (True Positives + False Negatives)

Figure 11. Accuracy, Precision and Recall Equations (top to bottom)

Figure 12. Sample Precision-Recall Curve

The model architecture is defined in Figure 13. The model has 225 layers, 3,012,018 parameters, 0 gradients, and 8.2 GFLOPs (billions of floating-point operations). Transfer models typically freeze backbone layers and add layers at the head for the output, but the YOLO implementation typically does not freeze layers during the transfer learning process; rather, it updates the weights of the final layers using gradient descent. In this case, the researchers utilized the AdamW optimizer to fine-tune the model. The difference between the image classification and the object detection model is the last layer, the Bbox and Cls layers, both of which calculate the loss. The researchers fine-tuned the whole model and froze the last Distribution Focal Loss (DFL) layer, which is already fine-tuned to an optimal state.

Figure 13. YOLOv8 Architecture, visualization made by GitHub user RangeKing.

Adapted from Solawetz et al., 2023.

B. Image Classification and Object Detection

The research is a hybrid of combining two computer vision techniques to

effectively make a reliable detection system. In Figure 14, we can see a sample

Image Classification (on the left), with which the whole image is processed and

inputted to the Image Classification model and then outputs a result from the labels

with which the model is trained with. With this sample, there are two labels namely,

Dense Traffic and Sparse Traffic. The image is classified as a Dense Traffic Image

with a confidence level of 1.00, which means that, if the label Dense Traffic is

trained with heavy traffic images, the image is predicted to be having heavy traffic.

Figure 14. Sample image classification of traffic density and sample object

detection of cars, from left to right.

The sample object detection (on the right) in Figure 14 takes the same image as the sample image classification input. The input image is evaluated to look for objects identified as cars, and a bounding box is drawn around each. The object detection model's output is then used to count the number of car detections and to calculate their velocity or follow their paths. Combining these two computer vision techniques produces better output quality by using multiple

sources of useful information.
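A minimal sketch of this hybrid step, assuming hypothetical fine-tuned weight files, runs both models on the same frame with the Ultralytics API:

    import cv2
    from ultralytics import YOLO

    det = YOLO('detect_best.pt')      # hypothetical fine-tuned yolov8n weights
    cls = YOLO('cls_best.pt')         # hypothetical fine-tuned yolov8n-cls weights

    frame = cv2.imread('intersection.jpg')

    # Object detection: count detections per class
    det_result = det(frame)[0]
    counts = {}
    for box in det_result.boxes:
        name = det_result.names[int(box.cls)]
        counts[name] = counts.get(name, 0) + 1

    # Image classification: whole-frame traffic state
    cls_result = cls(frame)[0]
    state = cls_result.names[cls_result.probs.top1]
    confidence = float(cls_result.probs.top1conf)

    print(counts, state, confidence)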

C. Inference Processing

The model is deployed and is now ready for inference. As seen in Figure 15a, the program can be divided into sections and processes. The process starts with initiating the timer and variables. This is followed by the output of the model inferences of the object detection and image classification, which are then processed according to the needs of the traffic light controller. The Pedestrian Signal Process flow
has the Person class input and then outputs a Pedestrian Signal pulse. The Next

Signal Lane Process flow uses the classes Bicycle, Car, Motorcycle, Bus and Truck

of object detection and the Sparse Traffic and Dense Traffic of the image

classification. This sends a Next Signal Lane pulse to the controller. The Warning

Signal Process flow uses the Fire and Accident output classes of the image

classification that sends a Warning Signal pulse to the controller.

Figure 15a. Model Inference Process Flow.

The variables initiated at the start are shown in Figure 15b. The process starts with the lane clock timer, then initiates the needed variables and user-defined constants. The variables are initiated so they can be used consistently throughout the entire process.

Figure 15b. Timer and Variables Initiation Process Flow.

Figure 15c shows how the system processes the signal for the pedestrian actuation. The process starts by counting the pedestrian inferences of the output; if the count is nonzero, it starts a Pedestrian Wait Clock timer. The process continues by counting the pedestrians again, and if the pedestrian count is greater than the minimum count, it proceeds to add priority to the next lane. This can also be triggered when a pedestrian has already waited the maximum amount of time to cross the intersection.

Figure 15c. Pedestrian Signal Process Flow.
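A minimal sketch of this logic, with the count and wait thresholds as hypothetical user-defined constants, is given below:

    import time

    MIN_PED_COUNT = 3        # assumed minimum pedestrians before adding priority
    MAX_PED_WAIT_S = 60.0    # assumed maximum wait before forcing priority

    ped_wait_start = None

    def pedestrian_priority(person_count):
        # Returns True when the next lane change should include the pedestrian signal.
        global ped_wait_start
        if person_count == 0:
            ped_wait_start = None               # no one waiting; reset the wait clock
            return False
        if ped_wait_start is None:
            ped_wait_start = time.monotonic()   # start the Pedestrian Wait Clock
        waited = time.monotonic() - ped_wait_start
        return person_count >= MIN_PED_COUNT or waited >= MAX_PED_WAIT_S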

The next process shows how the system changes the next signal lane, the next lane to receive a go signal. It is important to note that the lane phasing relies on exclusive phasing, with which the lane at a go signal can proceed to all directions of the intersection.

The process in Figure 15d starts with two inputs, from the detection and the classification. The detection output from the model is counted along with arbitrary weights based on the speed of each class in going through the intersection, the space it consumes, and how it interacts with other motorists. The sum of the weighted count is then added to the Total Lane Signal time. The classification output is processed simultaneously by checking the traffic density through the Sparse Traffic and Dense Traffic classes, which is then multiplied by a corresponding multiplier, to decrease or increase the time, and added to the Total Lane Signal time. When the running time, or Current Lane time, is greater than or equal to the optimized Total Lane Signal time or the Maximum Signal time, the process outputs an actuation signal prompting the controller that the next lane should have a go signal.

Figure 15d. Next Signal Lane Process Flow.
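A minimal sketch of this computation is shown below; the class weights, density multipliers, and time bounds are hypothetical placeholders for the user-defined constants of the actual system:

    CLASS_WEIGHTS = {'car': 2.0, 'motorcycle': 1.0, 'bicycle': 1.0,
                     'bus': 3.5, 'truck': 3.5}        # seconds per detection (assumed)
    DENSITY_MULTIPLIER = {'sparse_traffic': 0.8,      # shorten when sparse (assumed)
                          'dense_traffic': 1.2}       # extend when dense (assumed)

    def total_lane_signal_time(counts, density_label, base_s=10.0, max_s=75.0):
        # Weighted detection count plus base time, scaled by the density class,
        # clamped at the local intersection's maximum signal time.
        weighted = sum(CLASS_WEIGHTS.get(name, 0.0) * n for name, n in counts.items())
        total = (base_s + weighted) * DENSITY_MULTIPLIER.get(density_label, 1.0)
        return min(total, max_s)

    # e.g., 8 cars and a bus under dense traffic
    print(total_lane_signal_time({'car': 8, 'bus': 1}, 'dense_traffic'))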

Figure 15e shows the process that outputs the warning signal. The input of this operation comes from the Fire and Accident classes of the classification model. The confidence levels are received, and if one of the classes is greater than the threshold, a validation delay is initiated. The validation delay verifies whether the inference received is consistent or just a spike in the confidence value of the class. After the validation delay, the process is repeated, and if the confidence is still greater than the threshold, a warning signal is initiated.

Figure 15e. Warning Signal Process Flow.
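A minimal sketch of this validation-delay check, with the threshold and delay values as assumptions, follows:

    import time

    WARNING_THRESHOLD = 0.85   # assumed confidence threshold
    VALIDATION_DELAY_S = 2.0   # assumed delay for filtering confidence spikes

    def warning_signal(get_confidences):
        # get_confidences() returns the latest {'fire': p, 'accident': p} inference.
        first = get_confidences()
        if max(first.values()) < WARNING_THRESHOLD:
            return False
        time.sleep(VALIDATION_DELAY_S)     # validation delay
        second = get_confidences()         # re-check for a consistent inference
        return max(second.values()) >= WARNING_THRESHOLD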

There are three main outputs of the Model Inference Processing: initiating the warning blinking yellow lights, which sends a pulse to the receiver and waits for feedback to end the warning blinking; initiating the next signal lane, through a pulse that immediately resets the inference process; and starting the pedestrian signal, by sending a pulse and waiting for feedback signaling the end of the pedestrian crossing. These outputs are on the inference side, separated from the Traffic Light Controller, as discussed in further sections.

II. CCTV System and Traffic Light System with Integrated Optimized System

The modern CCTV (Closed Circuit Television) system captures video and functions as a video stream, with multiple video frames in a single stream or a single video on different streams; such systems are also called IP cameras (Internet Protocol cameras). With this definition, it is possible to replicate their function with an ESP32 along with its accessory OV2640 camera. It has a WiFi module that can stream the captured video wirelessly using

IP, which can be accessed by a given IP address on the Local Network.
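A minimal sketch of consuming such a stream with OpenCV is shown below; the address and port are assumptions for a typical ESP32-CAM on a local network:

    import cv2

    # Hypothetical address of the ESP32-CAM MJPEG stream on the local network
    cap = cv2.VideoCapture('http://192.168.1.50:81/stream')

    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # 'frame' is a regular image array, ready for the inference models
        cv2.imshow('CCTV proxy', frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()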

In Figure 16, the integration and optimization of the system is shown in the block diagram. The CCTV System Proxy and the Traffic Light System Prototype are two separate systems integrated using the proposed optimization system. All three systems are electrically isolated so they can be individually troubleshot and scaled. This also translates to the current systems applicable to this study.
Figure 16. Block diagram of CCTV system to Traffic Light system with integration of

optimized system.

The CCTV or IP camera captures live footage of the intersection. Video footage is taken with a drone and processed at 1080p resolution. Afterwards, the video is streamed directly to the computer through an IP streaming protocol. The programmed model outputs inferences, which are subsequently processed through a USB-connected Arduino Uno. The Arduino Uno's function is not to run the program itself but to act as an extension of the computer, utilizing its IO ports as a signal actuator for the various signals needed by the Traffic Light System prototype. There are three actuation signals received as HIGH by the Traffic Light System: the Pedestrian Signal, the Next Signal Lane signal, and the Warning Blink signal.

The Traffic Light System prototype consists of the controller and the lights. It can function in a Fixed-timer Mode (Automatic Mode) as well as a Fully Actuated Mode (Manual Mode) that receives signals from the Arduino Uno through the GPIO pins. The signals exchanged between the GPIO pins of the Arduino and the Traffic Light Controller are 0V-5V. The signal received by the controller replicates the ability of the Traffic Light System to be actuated or switched by a separate signal. The pulses sent by the Optimization System are 5V pulses interpreted at the rising edge by the Traffic Light System. The electrically isolated traffic lights are connected to multiple relays controlled by the controller.

The Traffic Light System is scaled down to a miniature prototype representative of the industry specifications of modern traffic light systems sourced locally. In Figure 17a, the Traffic Light System process flow is described. The traffic light includes arbitrary variables that can be changed during runtime to fit the observable timings of related local intersections. The system receives actuations as 3.3V logic high to fulfill conditions.

Figure 17a. Traffic Light System Process.

Figure 17b shows how the automatic cycle works for the traffic light system controller. The process loops the green or “go” signal, the yellow or “ready to stop” signal, and the “stop” signal. The values and timing are inherited from the timing signals of local intersection traffic lights. The signal is looped by iterating through an n-value, which is later returned to the Check Actuation Mode condition, as seen in Figure 17a. The pedestrian signal is also part of the loop and the cycle of the whole traffic light.

Figure 17b. Automatic Cycle of Traffic Light Process.

Figure 17c exhibits how the manual cycle works for the traffic light system controller. The process loops depending on the input actuations of the following: Pedestrian, Next Lane Signal, and Warning. These are synonymous with the outputs of the model inference. Each output corresponds to a specific sequence of timing and traffic lights. After a single process, it also iterates the n-value to go over all the phases of the intersection.

Figure 17c. Manual Cycle of Traffic Light Process.

The traffic light system flow will be implemented in a small demonstrable physical model. Figure 18 displays the dimensions of the miniature traffic light system model. It will be a four-way intersection with a traffic light LED module on each lane. An extra traffic light at the center of the intersection demonstrates the pedestrian crossing at all lanes.
Figure 18. Miniature Traffic Light system with proposed dimensions.

The miniature model traffic light system will be constructed around the Beaglebone Black (BBB) single-board computer. Figure 19 shows the schematics designed and considered. The traffic light system model built on the BBB will interface with the Arduino Uno through a Logic Level Converter that converts 5V to 3.3V so that both devices can effectively communicate logic levels without overdriving the lower-voltage device.
Figure 19. Traffic Light Controller and Optimized Actuator schematic.

To assess the optimized system, the mean %extra green time, as seen in Figure 20, must be compared to the actual values observed at local intersections. Observations were made at locations along Tagaytay-Sta. Rosa Rd., such as the Paseo-LTI intersection and the Greenfield Auto Parkway. Researchers observed each lane's total red time, the total time that the cars are stopped at the red signal; the total green time, the effective green signal during which the traffic is moving; the queue finish, the time it takes for the last vehicle that stopped at the red signal to cross the intersection; the extra green time, the time after the queue finishes while still at a green signal; and the yellow time, the total time the yellow signal flashes. This is tested using the captured footage to estimate the extra green time.

mean %Extra Green Time = (Σ extra green time / Σ total green time) × 100%

Figure 20. Mean %Extra Green Time formula.
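A minimal sketch of this computation, aggregating extra green time over total green time across the observed lanes, reproduces the values reported later in Table 3:

    def mean_pct_extra_green(lanes):
        # lanes: list of (extra_green_time_s, total_green_time_s) per observed lane
        total_extra = sum(extra for extra, _ in lanes)
        total_green = sum(green for _, green in lanes)
        return total_extra / total_green * 100.0

    # The twelve actual lanes observed in Table 3
    actual = [(29.3, 58), (3.0, 20), (25.7, 75), (6.7, 25),
              (28.0, 60), (4.7, 15), (26.7, 60), (6.3, 15),
              (14.7, 35), (16.3, 35), (4.7, 35), (5.0, 14)]
    print(round(mean_pct_extra_green(actual), 2))      # 38.28

    optimized = [(8.5, 51), (2.5, 32), (4.0, 45), (1.0, 20)]
    print(round(mean_pct_extra_green(optimized), 2))   # 10.81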

III. Image Detection through Varied Environmental Conditions

The optimized traffic light system is trained to work through different environmental conditions. Simulated environmental conditions are illustrated in Figure 21 to provide a visual idea of the image noise to be solved. The ideal condition is daylight with ample lighting and minimal noise in the image; this is the benchmark for the other environmental conditions that have a significant effect on camera image quality. Images will be collected and gathered through the open-source datasets COCO and Traffic-Net and will be augmented to simulate each condition.

Figure 21. Simulated environmental conditions.

Sample conditions are tested and compared to their actual data. The same procedure is done for each variable, which is to verify the model with curated validation data. Because the research is limited in its access to CCTV footage at available traffic light intersections, data gathering for multiple conditions is done through aerial footage, as in Figure 18, where a sample aerial shot is captured at a local highway. This footage is added to the dataset and augmented to simulate different environmental conditions. It is also labeled and annotated using bounding box annotations and exported to the YOLO format. Different highways will be captured at different time intervals and environmental conditions, which will be used for testing the model functioning in real time using the miniature traffic light system.

The augmentation techniques used to simulate the different environmental conditions are the following: Daylight – Brightness between +0% and +20%, on footage captured at daytime; Night – Noise up to 3% of pixels, on footage already captured at nighttime; Fog – Blur up to 4.5px and Exposure between -10% and +10%, on daytime footage; and Rain – Blur up to 3px and Noise up to 8% of pixels, also on daytime footage. Notice that the Fog and Rain conditions use the daytime footage, since at night, further augmentation would lead to undesirable results or render the footage unusable. This augmentation is done for both Object Detection and Image Classification, but the validation dataset for image classification is purely augmented, with the difference that the “Night” variable has an augmentation step of Brightness between -40% and 0% and

a Cutout of 5 boxes with 20% size each.
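A minimal sketch approximating two of these augmentations with OpenCV and NumPy is shown below; the exact augmentation implementation may differ, and the parameter values simply mirror the listed ranges:

    import cv2
    import numpy as np

    rng = np.random.default_rng()

    def augment_night(img, noise_frac=0.03):
        # Random noise on up to 3% of pixels (applied to 3-channel nighttime frames)
        out = img.copy()
        h, w = out.shape[:2]
        n = int(noise_frac * h * w)
        ys, xs = rng.integers(0, h, n), rng.integers(0, w, n)
        out[ys, xs] = rng.integers(0, 256, (n, 3))
        return out

    def augment_fog(img, max_blur_px=4.5, exposure_pct=0.10):
        # Blur up to 4.5 px plus a -10%..+10% exposure shift (daytime footage)
        k = 2 * int(rng.uniform(1, max_blur_px)) + 1     # odd Gaussian kernel size
        out = cv2.GaussianBlur(img, (k, k), 0)
        gain = 1.0 + rng.uniform(-exposure_pct, exposure_pct)
        return cv2.convertScaleAbs(out, alpha=gain, beta=0)

    img = cv2.imread('daytime_frame.jpg')    # hypothetical input frame
    cv2.imwrite('fog_frame.jpg', augment_fog(img))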

DESIGN CONSTRAINTS

1. Regulatory Constraint

Due to the restrictions imposed by Republic Act 10173, the Data Privacy Act of 2012, and security measures enforced by the Local Government Unit, access to CCTV footage for inference purposes is limited. Despite requesting access to video footage, the researchers encountered obstacles in obtaining sufficient information. Consequently, the implementation of the traffic light optimization system was hindered by the lack of cooperation from local government authorities. As a workaround, the researchers opted to develop and test the system on a miniature scale, which is representative of the data available from the local government.

2. Safety Constraint

The project cannot be implemented on-site due to safety concerns

associated with its current stage of development. Since the system is still in its

design, development, or experimental phase, deploying it in real-world traffic

intersections poses potential risks. Therefore, to ensure the safety of both the

researchers and the public, testing and experimentation are confined to controlled

environments until the system is adequately refined and proven to meet safety

standards.

CHAPTER 4

RESULTS AND DISCUSSION

This study determined the performance of the model and the prototype implementation, as presented in the following tables, figures, and graphs.

I. YOLOv8 Model Validation Results

The model results consist of object detection and image classification metrics

during the training and the validation.

a. Object Detection

The first metric for object detection is the Precision-Recall (PR) curve. As shown in Figure 22, the graph follows a curve toward the upper right of the plot, which indicates good performance for an object detection model with a limited dataset and a separate set of validation data. The area under the PR curve across all classes is called the mean average precision (mAP). Varied factors affect the mAP, such as the quality of the data it is validated on and the addition of different augmentation techniques in the validation data to produce results that account for simulated environmental conditions. The mAP50-95 of YOLOv8n validated on COCO 2017 is 37.3 (Jocher et al., 2023), and as seen in Figure 23, the trained model produced a 38.6 mAP50-95 score on a custom validation dataset. The model also started converging at around 20 epochs and continued to decrease its losses through the end of the 100th epoch, a good sign that the model is not overfitting.
Figure 22. Object Detection Precision-Recall Curve

Figure 23. Object Detection Validation Results.

Figure 24 offers insight into the classes that contributed most to the model performance. The best performing class is the car class. This class is also the most represented, with around 6000 instances, while the next labels are represented at around 3000 instances. This is important because the car class contributes an average of 90% of all inferences over time. The model struggles to validate the person class, since at the 640px validation image size, distant persons are confused with the background.

Figure 24. Object Detection Confusion Matrix.

Figure 25 displays sample inference output on actual images. It can be observed that there are some errors in the predictions, such as predicting in the background, not properly classifying the class of a bounding box, or not predicting at all. These errors are expected and behave unpredictably in different actual scenarios; thus, they are mitigated by the other inference model, the image classification.

Figure 25. Sample Object Detection Prediction. (left prediction, right actual).

b. Image Classification

The main metric for the classification is accuracy. Figure 26 exhibits the top-1 accuracy, the rate at which the class with the highest confidence on inference is correct, to be 94.3%. The baseline accuracy for the YOLOv8n-cls model is 69.0% (Jocher et al., 2023); the classifier thus contributes much of the reliability to the overall inference process that the object detection method cannot provide alone.

Figure 26. Image Classification Validation Results.

The model performs well in its accuracy for predicting the state of a given traffic intersection image. Looking at the classes' individual performance, as shown in Figure 27, the model performed best at detecting accidents and fire, with confidences of 0.97 and 0.99, respectively. The sparse traffic and dense traffic classes are used together to perform continuous inferences, or evaluation, on the frames during the real-time processing of traffic timing.

Figure 27. Image Classification Confusion Matrix.

The model relies on available datasets to train the image classifier; thus, with the given dataset, the manual curation of the dataset affected the model's performance.

Figure 28 shows sample inference output of the classification model. The difference between the prediction and the actual label may be reasonable, since the model may have given weight to the sign, predicting an accident rather than mere sparse_traffic. The same holds for the dense_traffic actual data, for which the model gave a reasonable prediction of sparse_traffic, since the image does not clearly show the density of cars.

Figure 28. Sample Image Classification Prediction (left prediction, right actual).

II. Optimized Traffic Light Timing

The optimized system is integrated with the traffic light system and a proxy CCTV system via local video files, IP streams, or an ESP32-CAM with an OV2640 sensor (when conditions permit outdoor demonstration). The video frames are at Full HD resolution (1920x1080), the standard output resolution of common CCTV systems. The scripts were tested on a MacBook Pro (M1) to handle inference with the trained YOLOv8n and YOLOv8n-cls. These are small models, with PyTorch weight files of around 6 MB each; with a strong processing unit and small models, inference runs at 1.6-5.1 ms per frame. This translates readily to edge devices, since YOLOv8 models can be exported directly to other formats, such as TensorFlow Lite, to run inference on Coral edge devices.
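A minimal sketch of such an export, using the Ultralytics export API; the Edge TPU variant noted in the comment is one option for targeting Coral accelerators.

    from ultralytics import YOLO

    # Load the trained detector (path as used in Appendix F).
    model = YOLO("runs/detect/train3/weights/best.pt")

    # Export to TensorFlow Lite; Ultralytics also supports an Edge TPU
    # format ("edgetpu") targeting Coral accelerators.
    model.export(format="tflite", imgsz=640)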

The inferences are also consumed by the miniature system, which performs as expected, switching the required lanes in step with the inferences made on the inference machine. The integration with the miniature system can easily be scaled up, since inference runs on a separate device that only actuates the logic required by the controller. The actuation signal can also be transferred wirelessly over IP, with a script adapted to any other device's input requirements.
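As an illustration of that wireless hand-off, a minimal sketch using a plain TCP socket; the host, port, and one-byte "next lane" command are hypothetical choices, not the exact protocol used in this study.

    import socket

    CONTROLLER = ("192.168.1.50", 5000)  # hypothetical controller address

    def send_next_lane_signal():
        # Open a short-lived TCP connection and send a one-byte command
        # telling the controller to advance to the next lane phase.
        with socket.create_connection(CONTROLLER, timeout=2) as sock:
            sock.sendall(b"N")

    send_next_lane_signal()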

Table 3 shows observed traffic data from local intersections. At non-peak hours the queue finishes early, leaving a mean of 38.28% extra green (go) time after the queue clears. The LTI-Paseo intersection is where the optimized system was tested; there, the optimized system decreased the mean %extra green time to 10.81%.

Table 3

Traffic Light Lanes Average Timing Comparison

Traffic Intersection         Lane  Total Red  Total Green  Queue Finish  Extra Green  Yellow
                             no.   Time (s)   Time (s)     Time (s)      Time (s)     Time (s)
Greenfield Autopark          1     137        58           28.7          29.3         3
                             2     175        20           17.0          3.0          3
                             3     120        75           49.3          25.7         3
                             4     170        25           18.3          6.7          3
Laguna Bel-Air               1     62         60           32.0          28.0         3
                             2     107        15           10.3          4.7          3
                             3     62         60           33.3          26.7         3
                             4     107        15           8.7           6.3          3
LTI-Paseo Actual             1     82         35           20.3          14.7         3
                             2     82         35           18.7          16.3         3
                             3     85         35           30.3          4.7          3
                             4     100        14           9.0           5.0          3
LTI-Paseo Optimized System   1     N/A        51           42.5          8.5          3
at gathered footage          2     N/A        32           29.5          2.5          3
                             3     N/A        45           41.0          4.0          3
                             4     N/A        20           19.0          1.0          3

Note. Averaged timing data at local intersections in Santa Rosa, Laguna during non-peak hours.
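As a check on the headline figures, the reported means can be reproduced from Table 3, assuming %extra green is computed as the total extra green time divided by the total green time across all lanes; a minimal sketch:

    # Extra green and total green times per lane, read from Table 3.
    actual_extra = [29.3, 3.0, 25.7, 6.7, 28.0, 4.7, 26.7, 6.3, 14.7, 16.3, 4.7, 5.0]
    actual_green = [58, 20, 75, 25, 60, 15, 60, 15, 35, 35, 35, 14]
    opt_extra = [8.5, 2.5, 4.0, 1.0]
    opt_green = [51, 32, 45, 20]

    pct = lambda extra, green: 100 * sum(extra) / sum(green)
    print(f"Actual:    {pct(actual_extra, actual_green):.2f}% extra green")  # 38.28%
    print(f"Optimized: {pct(opt_extra, opt_green):.2f}% extra green")        # 10.81%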

The results do not count total red time, as it depends on the density of traffic. The total green time accounts for the traffic density and the queue in the region where the system detects traffic. Pedestrian crossing points are present at the LTI-Paseo intersection, but they are seldom used. The system can detect pedestrians at the corners where they would start crossing. The pedestrian feature, however, is an extra feature that assumes that when many pedestrians need to cross, all lanes must stop.

Another assumption made in optimizing the lanes is that each lane should have an exclusive phase. The phases in which inferences are made follow a rotation that allows concurrent lanes to share a single phase, which means the cycles of the two systems differ. Although they differ, the assumption removes complexity from the timing situations and allows better optimization of the green light time (Buckholz, 2024).

III. Environmental Conditions Performance

The conditions are validated using augmented images of the validation dataset; this does not fully reflect performance under actual conditions. For example, the rain condition applies noise augmentations and some exposure augmentations. Table 4 shows that the validation accuracy of the image classifier reached 98% or greater in every condition, giving weight to the importance of using image classification alongside an object detection model. The object detection validation scores fall short at nighttime, since the camera cannot capture useful information in the dark without first processing or equalizing the image to reveal details.
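A minimal sketch of the kind of augmentation used to simulate such conditions, here approximating the rain condition with an exposure shift and Gaussian noise via OpenCV; the exact augmentation pipeline used in this study is assumed, not reproduced.

    import cv2
    import numpy as np

    def simulate_rain_condition(image):
        # Darken exposure slightly, then add Gaussian noise, a rough
        # stand-in for the noise/exposure augmentations described above.
        darkened = cv2.convertScaleAbs(image, alpha=0.8, beta=-15)
        noise = np.random.normal(0, 12, darkened.shape).astype(np.int16)
        noisy = np.clip(darkened.astype(np.int16) + noise, 0, 255)
        return noisy.astype(np.uint8)

    frame = cv2.imread("intersection_frame.jpg")  # hypothetical sample frame
    augmented = simulate_rain_condition(frame)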

Table 4

Validation Performance per Environmental Condition

Environmental Condition    mAP50 (Detect)    Top-1 Accuracy (Classify)
Daytime                    0.862             0.997
Night                      0.424             0.987
Fog                        0.755             0.993
Rain                       0.687             0.992

Note. Augmented environmental conditions are used to simulate the conditions.

Figure 29 shows the observable performance under each environmental condition. Inference outputs correct results for detection and classification, with some exceptions in the night condition, where confidences and bounding boxes become confused. Night footage could be better utilized by applying equalization and other image processing to improve image usability.
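One such equalization step is sketched below with OpenCV's CLAHE applied to the luminance channel; this is an illustrative pre-processing option, not a step of the implemented pipeline.

    import cv2

    def equalize_night_frame(frame):
        # Apply CLAHE to the L channel in LAB space so that contrast is
        # boosted locally without distorting the image's colors.
        lab = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
        l, a, b = cv2.split(lab)
        clahe = cv2.createCLAHE(clipLimit=3.0, tileGridSize=(8, 8))
        enhanced = cv2.merge((clahe.apply(l), a, b))
        return cv2.cvtColor(enhanced, cv2.COLOR_LAB2BGR)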

Figure 29. Sample validation of different environmental conditions.

CHAPTER 5

CONCLUSION AND RECOMMENDATIONS

Conclusion

This study aimed to optimize traffic light control through the integration of computer vision techniques and a hybrid approach to traffic signal timing. Aiming to contribute to current efforts to address traffic and land transportation issues in the Philippines, the researchers designed a hybrid computer vision system, composed of image classification and object detection techniques, to optimize traffic light control. The study demonstrated its effectiveness in validation results by accurately detecting and classifying traffic conditions.

Thus, through extensive development and careful evaluation, this research yields

several key takeaways:

1. YOLOv8 Model Performance

The object detection model, based on YOLOv8n, achieved a mean average precision (mAP50-95) of 38.6 on a custom validation dataset, surpassing the baseline of 37.3. This indicates the robustness of the model in accurately detecting various objects, particularly cars, which contribute significantly to the overall traffic flow. However, challenges remain in accurately detecting pedestrians, especially at distances where they may be confused with the background.

Furthermore, the image classification model, utilizing YOLOv8n-cls, achieved a top-1 accuracy of 98% or greater across the augmented environmental-condition validation sets (Table 4). This high accuracy enhances the reliability of the overall inference process, particularly in identifying critical traffic conditions such as accidents and fires. The model's performance underscores the importance of combining image classification with object detection to provide comprehensive insights into traffic density and conditions.

2. Optimized Traffic Light System Efficiency

The optimized traffic light system significantly reduced the mean %extra

green time from 38.28% at traditional intersections to 10.81% at the LTI-Paseo

intersection. This substantial reduction in wasted green light time demonstrates the

practical benefits of using computer vision-based optimization strategies.

Additionally, the integration of the optimized system's actuation with the miniature traffic light prototype demonstrated its scalability and adaptability for real-world deployment.

3. Challenges and Opportunities in Environmental Conditions

Challenges persist in addressing environmental conditions, particularly during nighttime and adverse weather. While the system

demonstrated satisfactory performance in simulated environmental conditions,

further enhancements are needed to improve accuracy and reliability in real-world

scenarios. Strategies such as image processing and equalization could be explored

to enhance the system's performance under challenging conditions.

In summary, this research highlights the potential of computer vision-based approaches to transform traffic light control systems. By leveraging advanced computer vision techniques, such as object detection and image classification using YOLOv8, significant improvements in traffic flow optimization can be achieved. Moving forward, continued research and development efforts are essential to refine and optimize these systems for widespread adoption, leading to safer and more efficient urban transportation networks.

Recommendation

Based on the findings of this study, the following recommendations are proposed:

1. Further Research and Improvement

Continuous research is necessary to improve the performance of the object

detection model under challenging environmental conditions, such as nighttime,

fog, and rain. Exploring advanced image processing techniques and data

augmentation methods could enhance the model's accuracy and reliability.

Improving the camera technology would also help the overall computer vision system.

The system is also reliant on the model used and the dataset it was trained on; thus, the researchers advocate for aligning the future direction of this research

with the principle of Continuous Integration/Continuous Delivery (CI/CD) to

address the dynamic nature of traffic data over time. This entails the seamless

deployment of updates and improvements to the traffic light optimization system,

ensuring that it remains adaptive and responsive to evolving traffic conditions.

2. Field Testing

Conducting field tests of the optimized traffic light system in real-world

traffic intersections would provide valuable insights into its effectiveness and

practicality. Collaborating with local transportation authorities and municipalities

to implement pilot projects could validate the system's performance in live traffic

scenarios.

3. Scalability and Deployment

Considerations should be given to the scalability and deployment of the

optimized system across various intersections and urban environments. The studied

system is designed to scale to edge devices and up to large inference machines by

leveraging export formats of YOLOv8 and the interoperability of the actuator

design.

With the implementation of these recommendations, the trajectory of traffic light

optimization through computer vision technology is positioned for significant

advancement. By prioritizing further research, field testing, scalability, and continuous

improvement, the potential for real-world deployment and impact is enhanced. This

ensures that traffic management systems remain adaptable and responsive to the changing

dynamics of urban traffic, contributing to safer and more efficient transportation networks

in the cities.

REFERENCES

Agarwal, S., & Kumar, P. (2016). Smart traffic control system based on vehicle density [Bachelor of Technology thesis, Electronics and Communication Engineering]. http://www.ir.juit.ac.in:8080/jspui/bitstream/123456789/7563/1/Smart%20Traffic%20Control%20System%20Based%20on%20Vehicle%20Density.pdf

Arduino. (2024). UNO R3 | Arduino documentation. Docs.arduino.cc. https://docs.arduino.cc/hardware/uno-rev3/

Beagleboard.org Foundation. (2024). BeagleBone Black — BeagleBoard documentation. Docs.beagleboard.org. https://docs.beagleboard.org/latest/boards/beaglebone/black/

Buckholz, J. (2024). Introduction to traffic signal phasing. CEDengineering.com. https://www.cedengineering.com/userfiles/Introduction%20to%20Traffic%20Signal%20Phasing-R1.pdf

CNN Philippines Staff. (2022, August 1). CNN Philippines. Retrieved December 5, 2023, from http://www.cnnphilippines.com/news/2022/8/1/MMDA-countdown-stoplights-adaptive-traffic-signals.html

Eom, M., & Kim, B.-I. (2020). The traffic signal control problem for intersections: A review. European Transport Research Review, 12(1). https://doi.org/10.1186/s12544-020-00440-8

Gamil, J. T. (2014, January 10). MMDA upgrades traffic system. INQUIRER.net. https://newsinfo.inquirer.net/561705/mmda-upgrades-traffic-system

JICA. (2022, February 10). Ex-ante evaluation: Southeast Asia Division 5, Southeast Asia and Pacific Department. https://www2.jica.go.jp/en/evaluation/pdf/2021_PH-P275_1_s.pdf

Jocher, G., Chaurasia, A., & Qiu, J. (2023). Ultralytics YOLO (Version 8.0.0) [Computer software]. https://github.com/ultralytics/ultralytics

Long, X., Deng, K., Wang, G., Zhang, Y., Dang, Q., Gao, Y., Shen, H., Ren, J., Han, S., Ding, E., & Wen, S. (2020). PP-YOLO: An effective and efficient implementation of object detector. ArXiv (Cornell University). https://doi.org/10.48550/arxiv.2007.12099

Meng, B. C. C., Damanhuri, N. S., & Othman, N. A. (2021). Smart traffic light control system using image processing. IOP Conference Series: Materials Science and Engineering, 1088(1), 012021. https://doi.org/10.1088/1757-899x/1088/1/012021

Nodado, J. T. G., Morales, H. C. P., Abugan, M. A. P., Olisea, J. L., Aralar, A. C., & Loresco, P. J. M. (2018, October 1). Intelligent traffic light system using computer vision with Android monitoring and control. IEEE Xplore. https://doi.org/10.1109/TENCON.2018.8650084

PlatformIO. (2024). AI Thinker ESP32-CAM — PlatformIO v6.1 documentation. Docs.platformio.org. Retrieved April 29, 2024, from https://docs.platformio.org/en/latest/boards/espressif32/esp32cam.html

Redmon, J., Divvala, S., Girshick, R., & Farhadi, A. (2016, May 9). You only look once: Unified, real-time object detection. arXiv.org. https://arxiv.org/abs/1506.02640

Reis, D., Kupec, J., Hong, J., & Daoudi, A. (2023). Real-time flying object detection with YOLOv8. https://doi.org/10.48550/arxiv.2305.09972

Rouphail, N., Tarko, A., & Li, J. (1997). Traffic flow at signalized intersections. Turner-Fairbank Highway Research Center. https://www.fhwa.dot.gov/publications/research/operations/tft/chap9.pdf

Sharma, M., Bansal, A., Kashyap, V., Goyal, P., & Sheikh, T. H. (2021). Intelligent traffic light control system based on traffic environment using deep learning. IOP Conference Series: Materials Science and Engineering, 1022, 012122. https://doi.org/10.1088/1757-899x/1022/1/012122

Solawetz, J., & Francesco. (2023, January 11). What is YOLOv8? The ultimate guide. Roboflow Blog. https://blog.roboflow.com/whats-new-in-yolov8/

Sunnexdesk. (2023, January 23). Cebu's traffic light system "most advanced" in world. SunStar Publishing Inc. https://www.sunstar.com.ph/cebu/local-news/cebus-traffic-light-system-most-advanced-in-world

US Department of Transportation Federal Highway Administration Office of Operations. (2021). Traffic signal timing manual: Chapter 5. https://ops.fhwa.dot.gov/publications/fhwahop08024/chapter5.htm

Zaldarriaga, J. (2023, January 31). MMDA's Artes on the right track in pushing road safety. Philippine News Agency. https://www.pna.gov.ph/opinion/pieces/623-mmdas-artes-on-the-right-track-in-pushing-road-safety

APPENDICES

Appendix A: Initial Draft Review Matrix

Comment No. 1
Query: The title "Traffic Light System Using Computer Vision" is a little too broad and does not give a good representation of the research.
Action: The panel changed the title to "Design and Implementation of Optimized Traffic Light System Using Image Classification and Object Detection Techniques."
Manuscript Changes: Added to the Introduction: "Detecting objects and calculating the traffic density is also the focus of this research; focusing on the hybrid image classification and object detection techniques merged into a pipeline to optimize the traffic light control system."

Comment No. 2
Query: Working with the LGU on traffic light systems is difficult enough; is this feasible for the given timeline?
Action: The proposal to implement this at a local government may not be feasible with the requirements needed; thus, the idea was shifted to implementation at the university instead, focusing on the output prototype. The traffic light system is complex enough for the project.
Manuscript Changes: Changed the Scope and Limitation of the study from "Project to be tested on local intersections and T roads in Calamba or Silang" to "Implementation will be made at a local 4-way intersection at the Adventist University of the Philippines, Silang, Cavite, Philippines."

Comment No. 3
Query: The novelty of the proposed optimized traffic light idea; there have been a lot of advanced traffic light systems installed in the Philippines.
Action: The project started with using existing models to account for traffic density. The methodology was reworked to reflect the thesis proposal's novelty using the hybrid of image classification and object detection. The project team will create its own model pipeline and focus more on the objectives.
Manuscript Changes: Rewrote the Methodology from using the Traffic-Net model to predict traffic density to creating a pipeline for training and testing an image classification and object detection model. Added a new framework focused on the pipeline rather than on a ready-made model.

Appendix B: Local Government Cooperation Summary

Santa Rosa City – Command Center

and Santa Rosa City – City Traffic and Management Enforcement Unit

(Left to Right)

The researchers visited the Local Government Unit offices of the Santa Rosa City CTMEU (City Traffic and Management Enforcement Unit) and the Santa Rosa City Command Center to ask for advice and additional information regarding the traffic light system in the locality. The information acquired is as follows:

● The Traffic Light System has an Automatic and Manual Actuation Mode that can

be controlled both by the Command Center and on-site at the Traffic Light

Systems.

● The Automatic Actuation Mode is a fixed timer that can be set to separate times at

the Command Center.

● The Manual Actuation mode can be controlled by the CTMEU personnel when

there is a need for manual control of the traffic flow at heavy-traffic time slots.

● The Traffic Light System is powered by 220V, and its internal microcontrollers

are at 9-24V.

● The CCTV system is controlled by the Command Center through IP.

This information is essential for understanding the systems currently used in the Philippines. It is also the basis of the arbitrary and base constants the researchers used to implement the traffic light system prototype.

The researchers also requested some video footage for inference. However, due to Republic Act 10173, the Data Privacy Act of 2012, and the Local Government Unit's security measures, limited information was given and CCTV footage access was restricted. The system's implementation requires cooperation from the local government, which was denied to the researchers. Thus, the researchers decided to implement the traffic light optimization system on a small miniature scale representative of the information acquired from the local government.

There were revisions and updates after the decision to scale down the project, described in the Change Matrix.

Appendix C: Change Matrix

Change Code A1 (Statement of the Problem, p. 6)
Rationale: No existing CCTV system and traffic light system accessible to the researchers.
Original: "2. Simple integration of the proposed system to an existing CCTV system and Traffic Light System."
Revised: "2. Simple integration of the proposed system to a prototype CCTV system and Traffic Light System."

Change Code A2 (Statement of the Problem, p. 6)
Rationale: Researchers will integrate the optimization system with a traffic light system prototype and proxy CCTV video streams.
Original: "2. Integrate with minimal requirements into a traffic light system with multiple CCTV input."
Revised: "2. Integrate minimal requirements into a traffic light system prototype with multiple video IP streams."

Change Code B1 (Scope and Limitation, p. 7)
Rationale: Unable to access local intersection CCTV footage from Local Government Units.
Original: "2. Model performance is dependent on the available datasets such as COCO, datasets with open-source attribution and local intersection CCTV footages from Local Government Unit."
Revised: "2. Model performance is dependent on the available datasets such as COCO and datasets with open-source attribution."

Change Code B2 (Scope and Limitation, p. 7)
Rationale: Implementation is not possible for existing traffic light systems. A makeshift system is possible but may prove unsafe for implementation.
Original: "3. Implementation will be done at a local 4-way intersection at the CALAX - Sta. Rosa Tagaytay Rd. T intersection and Nuvali Greenfield Pkwy. - Sta. Rosa Tagaytay Rd. 4-way intersection."
Revised: Redacted.

Change Code B3 (Scope and Limitation, appended revision, p. 7)
Rationale: Researchers agreed to use a proxy stream to IP or an ESP32-CAM as a CCTV/IP camera alternative, or locally accessible CCTV footage of the highway. Researchers assume that any highway with traffic flow is representative of an inference at a green light lane.
Revised (appended): "3. The CCTV system will be proxied through video streams instead of a real-time stream due to security and safety factors. CCTV video will be sourced at a local 4-way intersection or T intersection using on-site footage."

Appendix D: Request Letter for CCTV and Traffic Light Access

65
Appendix E: Traffic Light Lanes Raw Data

Traffic Intersection         Lane  Total Red  Total Green  Queue Finish  Extra Green  Yellow
                             no.   Time (s)   Time (s)     Time (s)      Time (s)     Time (s)
Greenfield Autopark          1     137        58           23            35           3
                                   137        58           33            25           3
                                   137        58           30            28           3
                             2     175        20           18            2            3
                                   175        20           17            3            3
                                   175        20           16            4            3
                             3     120        75           50            25           3
                                   120        75           50            25           3
                                   120        75           48            27           3
                             4     170        25           12            13           3
                                   170        25           23            2            3
                                   170        25           20            5            3
Laguna Bel-Air               1     62         60           27            33           3
                                   62         60           37            23           3
                                   62         60           32            28           3
                             2     107        15           8             7            3
                                   107        15           11            4            3
                                   107        15           12            3            3
                             3     62         60           30            30           3
                                   62         60           42            18           3
                                   62         60           28            32           3
                             4     107        15           10            5            3
                                   107        15           7             8            3
                                   107        15           9             6            3
LTI-Paseo Actual             1     82         35           20            15           3
                                   82         35           23            12           3
                                   82         35           18            17           3
                             2     82         35           17            18           3
                                   82         35           19            16           3
                                   82         35           20            15           3
                             3     85         35           29            6            3
                                   85         35           32            3            3
                                   85         35           30            5            3
                             4     100        14           11            3            3
                                   100        14           7             7            3
                                   100        14           9             5            3
LTI-Paseo Optimized System   1     N/A        51           44            7            3
at gathered footage                N/A        51           41            10           3
                             2     N/A        32           29            3            3
                                   N/A        32           30            2            3
                             3     N/A        45           40            5            3
                                   N/A        45           42            3            3
                             4     N/A        20           19            1            3
                                   N/A        20           19            1            3

Appendix F: Source Code

Repository Link – https://github.com/dyi-el/optimized-traffic-light-computer-vision

import argparse
from collections import defaultdict
from pathlib import Path
import time
import pyfirmata
import asyncio

import cv2
import numpy as np
from shapely.geometry import Polygon
from shapely.geometry.point import Point

from ultralytics import YOLO
from ultralytics.utils.files import increment_path
from ultralytics.utils.plotting import Annotator, colors

track_history = defaultdict(list)

start_region = None
counting_regions = [
    {
        "name": "Pedestrian Region",
        "polygon": Polygon([(900, 700), (1200, 700), (1200, 500), (900, 500)]),  # Polygon points
        "counts": 0,
        "dragging": False,
        "region_color": (255, 155, 100),  # BGR Value
        "text_color": (255, 255, 255),  # Region Text Color
    },
    {
        "name": "Multiplier Region",
        "polygon": Polygon([(400, 650), (700, 650), (650, 50), (500, 50)]),  # Polygon points
        "counts": 0,
        "dragging": False,
        "region_color": (100, 200, 255),  # BGR Value
        "text_color": (0, 0, 0),  # Region Text Color
    },
]


async def next_lane_go():
    """Pulse the controller pin that advances the lights to the next lane."""
    global board, next_source

    board.digital[12].write(1)
    await asyncio.sleep(1)
    board.digital[12].write(0)
    next_source = True


async def pedestrian_go():
    """Pulse the pedestrian pin, hold the crossing, then advance the lane."""
    global board, next_source

    board.digital[8].write(1)
    await asyncio.sleep(1)
    board.digital[8].write(0)

    await asyncio.sleep(10)
    board.digital[8].write(1)
    await asyncio.sleep(0.5)
    board.digital[8].write(0)

    board.digital[12].write(1)
    await asyncio.sleep(1)
    board.digital[12].write(0)

    # next_source = True


async def warning_go():
    """Pulse the warning pin (used for fire/accident states)."""
    global board, next_source

    board.digital[7].write(1)
    await asyncio.sleep(1)
    board.digital[7].write(0)

    await asyncio.sleep(10)

    board.digital[7].write(1)
    await asyncio.sleep(1)
    board.digital[7].write(0)

    # next_source = True


def reset_variables():
    """Reset or reinitialize variables."""
    global max_lane_signal_time, total_lane_signal_time, start_lane_time
    global max_pedestrian_wait_time, start_pedestrian_wait_time, pedestrian_count
    global pedestrian_detected, detect_added, classify_added, next_source

    max_lane_signal_time = 120
    total_lane_signal_time = 0
    start_lane_time = time.time()

    max_pedestrian_wait_time = 10
    start_pedestrian_wait_time = 0
    pedestrian_count = 0
    pedestrian_detected = False

    detect_added = 0
    classify_added = 43

    next_source = False


def next_lane_signal(detect_added, classify_added, results_cls, top5_labels, top5_conf_np):
    """Combine detection and classification scores into the lane green time."""
    global total_lane_signal_time, start_lane_time

    for r in results_cls:
        # Iterate over each class in the classification results
        for label, prob in zip(top5_labels, top5_conf_np):
            # Apply conditions based on class probabilities
            if label == "sparse_traffic" and prob > 0.80:
                classify_added *= 0.5
            elif label == "dense_traffic" and prob > 0.30:
                classify_added *= 1.2
            elif label == "fire" and prob > 0.60:
                # cv2.putText(frame, "WARNING", (600, 300), cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 140, 255), 2)
                # asyncio.run(warning_go())
                continue
            elif label == "accident" and prob > 0.60:
                # cv2.putText(frame, "WARNING", (600, 300), cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 140, 255), 2)
                # asyncio.run(warning_go())
                continue

    total_lane_signal_time = detect_added + classify_added
    lane_elapsed_time = time.time() - start_lane_time

    if min(max_lane_signal_time, total_lane_signal_time) <= lane_elapsed_time:
        # cv2.putText(frame, "Next Lane", (600, 300), cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 255, 0), 2)
        asyncio.run(next_lane_go())
        return total_lane_signal_time, lane_elapsed_time
    else:
        return total_lane_signal_time, lane_elapsed_time


def pedestrian_signal(pedestrian_count):
    """Track waiting pedestrians and trigger the crossing when warranted."""
    global start_pedestrian_wait_time, pedestrian_detected, next_source

    if not pedestrian_detected:
        if pedestrian_count >= 3:
            start_pedestrian_wait_time = time.time()  # Start the pedestrian wait time counter
            print("Pedestrian wait time started")
            pedestrian_detected = True
        else:
            print("Pedestrian signal: Stop")
    else:
        pedestrian_elapsed_time = time.time() - start_pedestrian_wait_time
        print(f"elapsed_pedestrian_time: {pedestrian_elapsed_time}")

        if max_pedestrian_wait_time <= pedestrian_elapsed_time or pedestrian_count > 7:
            # cv2.putText(frame, "Pedestrian Go", (600, 300), cv2.FONT_HERSHEY_SIMPLEX, 3, (0, 255, 0), 2)
            asyncio.run(pedestrian_go())
            next_source = True
            time.sleep(3)
            asyncio.run(next_lane_go())


def mouse_callback(event, x, y, flags, param):
    """Allow the counting regions to be dragged with the mouse."""
    global start_region

    # Mouse left button down event
    if event == cv2.EVENT_LBUTTONDOWN:
        for region in counting_regions:
            if region["polygon"].contains(Point((x, y))):
                start_region = region
                start_region["dragging"] = True
                start_region["offset_x"] = x
                start_region["offset_y"] = y

    # Mouse move event
    elif event == cv2.EVENT_MOUSEMOVE:
        if start_region is not None and start_region["dragging"]:
            dx = x - start_region["offset_x"]
            dy = y - start_region["offset_y"]
            start_region["polygon"] = Polygon(
                [(p[0] + dx, p[1] + dy) for p in start_region["polygon"].exterior.coords]
            )
            start_region["offset_x"] = x
            start_region["offset_y"] = y

    # Mouse left button up event
    elif event == cv2.EVENT_LBUTTONUP:
        if start_region is not None and start_region["dragging"]:
            start_region["dragging"] = False


def run(
    source=None,
    device="cpu",
    hide_img=False,
    board_port="/dev/tty.usbmodem1401",
    line_thickness=2,
    track_thickness=2,
    region_thickness=4,
):
    global total_lane_signal_time, start_pedestrian_wait_time, detect_added
    global classify_added, pedestrian_count, start_lane_time, board, next_source

    board = pyfirmata.Arduino(board_port)
    print("Communication Successfully started")

    vid_frame_count = 0

    # Check source path
    if not Path(source).exists():
        raise FileNotFoundError(f"Source path '{source}' does not exist.")

    # Setup models
    model = YOLO("runs/detect/train3/weights/best.pt")
    model.to("cuda") if device == "0" else model.to(device)

    model_cls = YOLO("runs/classify/train2/weights/best.pt")
    model_cls.to("cuda") if device == "0" else model_cls.to(device)

    # Extract class names
    names = model.model.names
    names_cls = model_cls.model.names

    # Video setup
    videocapture = cv2.VideoCapture(source)

    # Iterate over video frames
    while videocapture.isOpened():
        success, frame = videocapture.read()
        if not success:
            break
        vid_frame_count += 1

        # Extract the results
        results = model.track(frame, persist=True, classes=None)
        results_cls = model_cls(frame)

        # Classification results
        for r in results_cls:
            # Top-5 confidences and their class names
            top5_conf_np = r.probs.top5conf
            top5_labels = [names_cls[i] for i in r.probs.top5]

            # Display class names and probabilities
            y_offset = 50
            for label, prob in zip(top5_labels, top5_conf_np):
                text = f"{label}: {prob:.2f}"
                cv2.putText(frame, text, (50, y_offset), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 255), 2)
                y_offset += 30

        if results[0].boxes.id is not None:
            boxes = results[0].boxes.xyxy.cpu()
            # track_ids = results[0].boxes.id.int().cpu().tolist()
            clss = results[0].boxes.cls.cpu().tolist()

            annotator = Annotator(frame, line_width=line_thickness, example=str(names))

            classes_with_multiplier = {"bicycle": 0, "bus": 0, "car": 0, "motorcycle": 0, "truck": 0}

            for box, cls in zip(boxes, clss):
                annotator.box_label(box, str(names[cls]), color=colors(cls, True))

                bbox_center = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2  # Bbox center
                '''
                track = track_history[track_id]  # Tracking Lines plot
                track.append((float(bbox_center[0]), float(bbox_center[1])))
                if len(track) > 30:
                    track.pop(0)
                points = np.hstack(track).astype(np.int32).reshape((-1, 1, 2))
                cv2.polylines(frame, [points], isClosed=False, color=colors(cls, True), thickness=track_thickness)
                '''
                # Check if detection is inside the multiplier region
                for region in counting_regions:
                    if region["name"] == "Multiplier Region" and region["polygon"].contains(Point((bbox_center[0], bbox_center[1]))):
                        # Apply multipliers based on class
                        if cls == 0:  # bicycle
                            classes_with_multiplier["bicycle"] += 2
                        elif cls == 1:  # bus
                            classes_with_multiplier["bus"] += 5
                        elif cls == 2:  # car
                            classes_with_multiplier["car"] += 3
                        elif cls == 3:  # motorcycle
                            classes_with_multiplier["motorcycle"] += 2
                        elif cls == 5:  # truck
                            classes_with_multiplier["truck"] += 7

                # Check if detection is inside the pedestrian region
                for region in counting_regions:
                    if region["name"] == "Pedestrian Region" and cls == 4 and region["polygon"].contains(Point((bbox_center[0], bbox_center[1]))):
                        region["counts"] += 1

            detect_added = sum(classes_with_multiplier.values())

        # Draw regions (Polygons/Rectangles)
        for region in counting_regions:
            region_label = ""
            if region["name"] == "Pedestrian Region":
                if region["counts"] > 0:
                    region_label = f"Person Count: {region['counts']}"
                    pedestrian_count = region["counts"]
                else:
                    region_label = "Person Count: 0"
            else:
                region_label = f"Time Added: {detect_added:.2f}"

            region_color = region["region_color"]
            region_text_color = region["text_color"]

            polygon_coords = np.array(region["polygon"].exterior.coords, dtype=np.int32)
            centroid_x, centroid_y = int(region["polygon"].centroid.x), int(region["polygon"].centroid.y)

            text_size, _ = cv2.getTextSize(
                region_label, cv2.FONT_HERSHEY_SIMPLEX, fontScale=0.7, thickness=line_thickness
            )
            text_x = centroid_x - text_size[0] // 2
            text_y = centroid_y + text_size[1] // 2
            cv2.rectangle(
                frame,
                (text_x - 5, text_y - text_size[1] - 5),
                (text_x + text_size[0] + 5, text_y + 5),
                region_color,
                -1,
            )
            cv2.putText(
                frame, region_label, (text_x, text_y), cv2.FONT_HERSHEY_SIMPLEX, 0.7, region_text_color, line_thickness
            )
            cv2.polylines(frame, [polygon_coords], isClosed=True, color=region_color, thickness=region_thickness)

        total_time, elapsed_time = next_lane_signal(detect_added, classify_added, results_cls, top5_labels, top5_conf_np)
        print("Total time:", total_time)
        print("Elapsed time:", elapsed_time)
        text_total = f"Total Optimized Time: {total_time:.2f}"
        text_elapsed = f"Elapsed Time: {elapsed_time:.2f}"

        cv2.putText(frame, text_total, (700, 50), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 255), 2)
        cv2.putText(frame, text_elapsed, (700, 80), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 255), 2)

        pedestrian_signal(pedestrian_count)

        if not hide_img:
            if vid_frame_count == 1:
                cv2.namedWindow("Optimized Traffic Light Inference Mode")
                cv2.setMouseCallback("Optimized Traffic Light Inference Mode", mouse_callback)
            cv2.imshow("Optimized Traffic Light Inference Mode", frame)

        for region in counting_regions:  # Reinitialize count for each region
            region["counts"] = 0

        if next_source:
            break

        if cv2.waitKey(1) & 0xFF == ord("q"):
            break

    del vid_frame_count
    videocapture.release()
    cv2.destroyAllWindows()


def parse_opt():
    """Parse command line arguments."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", default="cpu", help="cuda device, cpu or mps")
    parser.add_argument("--hide-img", action="store_true", help="hide results")
    parser.add_argument("--board-port", required=True, help="ls /dev/tty.*")

    return parser.parse_args()


def main(opt):
    """Cycle through the per-lane footage, resetting state for each source."""
    # List of source paths
    source_list = [
        "traffic-view/Paseo-1.mp4",
        "traffic-view/Paseo-2.mp4",
        "traffic-view/Paseo-3.mp4",
        "traffic-view/Paseo-4.mp4",
        "traffic-view/Paseo-5.mp4",
        "traffic-view/Paseo-6.mp4",
        "traffic-view/Paseo-7.mp4",
        "traffic-view/Paseo-8.mp4",
    ]

    while True:
        for source_path in source_list:
            reset_variables()
            opt.source = source_path
            run(**vars(opt))


if __name__ == "__main__":
    opt = parse_opt()
    main(opt)

Appendix G: Similarity Index

REFERENCE NUMBER: ID: oid:29780:59215556

COLLEGE OF SCIENCE & TECHNOLOGY

Name of Researcher(s): Bualoy, Jan Eliel; Domingo, Golden Dhan; Erer, Ivan Floyd
Program Course: Bachelor of Science in Electronics Engineering
Adviser: Winelfred G. Pasamba, MIS

Similarity Index: 3%
Artificial Intelligence: --
Date of Test Conducted: Tue, 14 May 2024 15:28:08 +0800 PM PST

DESIGN AND SIMULATION OF OPTIMIZED TRAFFIC LIGHT CONTROL SYSTEM


USING YOLOV8
System Generated Certificate by AUP-JLDM Library
Tue, 14 May 2024 15:28:35 +0800 PM PST

Copyright 2019. All Rights Reserved.

WINELFRED G. PASAMBA, MIS /

ARTEMIO T. CONCORDIA JR. / 05-14-2024



Appendix H: Certificate of Technical Editing

May 14, 2024

This is to certify that I have edited the Final Draft of the thesis manuscript entitled:

DESIGN AND SIMULATION OF OPTIMIZED TRAFFIC


LIGHT CONTROL SYSTEM USING YOLOV8

Prepared by

Bualoy, Jan Eliel


Domingo, Golden Dhan
Erer, Ivan Floyd

and that it underwent technical editing and is complete in terms of inventive steps, grammar, and APA format as prescribed by this office.

Mayoliewin N. Taclan
Editor

CURRICULUM VITAE

