License Plate Detection Using YOLOv8
ON
BACHELOR’S DEGREE IN
COMPUTER SCIENCE AND SYSTEMS ENGINEERING
BY
Ishita Gupta 2028095
Pratik Choudhary 2028101
Priyal Vaidya 2028103
Sayam Samal 2028107
December 2023
CERTIFICATE
This is to certify that the project entitled
This is a record of bonafide work carried out by them, in partial fulfillment of
the requirements for the award of Degree of Bachelor of Engineering (Computer
Science & Engineering) at KIIT Deemed to be University, Bhubaneswar. This work
was done during the year 2022-2023, under our guidance.
Acknowledgments
Firstly, we would like to thank our project guide Dr Soumya Ranjan Mishra, for his
continuous support and guidance throughout the project. His insights and suggestions
have been invaluable in shaping the direction and scope of this report.
We are also grateful to our university, for providing us with the necessary resources and
facilities to carry out this project. Constant feedback and encouragement from our university
helped us to stay motivated and focused on our objectives.
Finally, we would like to thank our families and friends for their unwavering support and
encouragement throughout this project. Their constant motivation and belief in us have been
the driving force behind our success.
Ishita Gupta
Pratik Choudhary
Priyal Vaidya
Sayam Samal
Abstract
This report presents the development and integration of a robust license plate detection
system leveraging the capabilities of YOLO (You Only Look Once) for real-time object
detection and EasyOCR for Optical Character Recognition (OCR). The primary objective is
to create a comprehensive solution capable of accurately localizing license plates within
images and extracting alphanumeric characters efficiently.
The system addresses challenges inherent in diverse license plate designs and varying lighting
conditions, critical for applications like traffic surveillance and security. The integration of
YOLO enables swift identification of vehicles and their corresponding license plates, ensuring
efficient real-time detection. EasyOCR is employed to perform detailed character recognition
on the detected license plates, ensuring accurate extraction of license plate numbers. The
combined system aims for high accuracy in detection while maintaining quick processing
speed, contributing significantly to the effectiveness of the solution.
Emphasis is placed on user-friendliness, with a streamlined process for image input and
seamless retrieval of reliable license plate information. The success of the project hinges on
achieving high accuracy in detection, optimal processing speed, and a user-friendly interface.
The accompanying documentation provides insights into the system's architecture,
functionalities, and guidelines for effortless utilization. The successful implementation of this
integrated solution holds promise for enhancing the accuracy and efficiency of license plate
detection, extending its applicability across various practical domains.
This report details the methodology, results, and implications of the developed system,
providing a comprehensive overview of the license plate detection solution and its potential
impact on real-world applications.
Contents
S. No.  Title
1  Project Title
2  Certificate
3  Acknowledgements
4  Abstract
5  Contents
6  Introduction
7  Basic Concepts
8  Problem Statement
9  Implementation
10  Conclusion and Future Scope
10.1  Conclusion
10.2  Future Scope
11  References
12  Individual Contribution
Chapter 1
Introduction
1.1 About YOLO
Detecting license plate numbers using computer vision, especially with the integration of
YOLO (You Only Look Once) and Optical Character Recognition (OCR), offers a range of
practical applications and benefits. Here's an introduction to why this process is valuable:
In the ever-evolving landscape of technology, the integration of computer vision has brought
forth innovative solutions, and one particularly compelling application is the detection of
license plate numbers. This technology has proven to be immensely useful in various domains,
offering a blend of efficiency, accuracy, and automation.
One primary utility of license plate detection through computer vision lies in enhancing
security and surveillance systems. By leveraging advanced algorithms, such as YOLO, to
detect license plates in real-time, it becomes possible to monitor and track vehicles seamlessly.
This is particularly advantageous in the context of law enforcement, where quick identification
of vehicles is essential for tasks ranging from traffic management to criminal investigations.
The use of YOLO in license plate detection adds a layer of sophistication to the process.
YOLO's ability to process images in a single pass, providing rapid and accurate bounding box
predictions, is crucial in scenarios where real-time responses are paramount. This efficiency
makes it well-suited for applications like toll booth management, parking enforcement, and
access control systems.
Once license plates are detected using YOLO, the integration of OCR becomes pivotal. OCR
allows for the extraction of alphanumeric characters from the detected license plates. This step
transforms the detected images into actionable data, enabling the retrieval of license plate
numbers in a machine-readable format. The synergy between YOLO and OCR ensures a
comprehensive solution for accurate and reliable license plate recognition.
The benefits of this approach extend beyond security and law enforcement. Industries such as
transportation, parking management, and smart city initiatives can leverage license plate
detection to streamline operations, enhance efficiency, and improve overall service delivery.
YOLOv8 is a pioneering object detection algorithm renowned for its efficiency, speed, and
accuracy. Its single-shot detection methodology, backbone architecture, feature pyramid
network, and grid cell approach collectively contribute to its ability to swiftly and accurately
detect objects within images, making it a pivotal tool in various computer vision applications,
including license plate detection.
Chapter 2
Basic Concepts
This section contains the basic concepts about the related tools and techniques used in this
project.
Understanding these core aspects of YOLO illuminates its significance in advancing object
detection capabilities. Its innovative approach has not only transformed the field of computer
vision but has also significantly influenced diverse industries by providing efficient and
accurate solutions for object localization and classification tasks. Incorporating YOLO into
systems like license plate detection showcases its potential to enhance efficiency and accuracy
in practical applications demanding real-time object recognition.
Chapter 3
Problem Statement
The goal of this project is to develop a comprehensive license plate detection system by
combining the capabilities of YOLO (You Only Look Once), an efficient object detection
algorithm, with EasyOCR, a Python library for Optical Character Recognition. The primary
objective is to create a robust solution capable of accurately localizing license plates within
images and extracting alphanumeric characters from those plates. The integration of YOLO
facilitates real-time object detection, enabling the system to identify vehicles and their
corresponding license plates efficiently. EasyOCR will then be employed to perform detailed
character recognition on the detected license plates, ensuring accurate extraction of license
plate numbers. The project will address challenges such as varying lighting conditions, diverse
license plate designs, and the need for quick and precise detection in scenarios such as traffic
surveillance or security applications. Additionally, the system will be designed for ease of use,
providing a streamlined process for users to input images and receive reliable license plate
information. The successful implementation of this integrated solution is expected to
significantly enhance the accuracy and efficiency of license plate detection, making it
applicable to a range of practical applications.
Chapter 4
Implementation
4.1 Methodology
4.1.2 Data Collection
For training our YOLOv8 model, we are using two datasets, namely:
1. Indian vehicle number plate yolo annotation
At the time of our research, this dataset contained 161 images of Indian vehicle
number plates along with their appropriate YOLO annotations.
2. Car Number Plate Detection
At the time of our research, this dataset contained 931 images of the front and rear
sides of cars, mostly of the kind found in India. This dataset, however, lacked the
required YOLO annotations.
2. After installing the package, we open the terminal and run the ‘labelImg’ command.
This opens up the labelImg GUI.
3. We then proceed to open the image directory where we downloaded the Car Number
Plate Detection dataset.
4. We then add the required annotations using the easy-to-use GUI, making sure to go
through all the images in the dataset.
5. Next, we set the save directory using the Change Save Dir option, and set it to a folder
named “labels”.
6. Finally, we check to ensure that the format is set to YOLO and not to PascalVOC, and
then we click on the Save button to save the annotations.
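For reference, each line of a YOLO-format label file describes one box as a class index followed by the box centre, width, and height, all normalized by the image size. The sketch below converts a pixel-coordinate box into such a line. The function name and the sample numbers are illustrative assumptions, and class index 0 is assumed to denote the license plate.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a YOLO label line (values normalized to 0..1)."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical plate box in a 1280x720 image
line = to_yolo_line(0, 480, 400, 800, 480, 1280, 720)
print(line)  # 0 0.500000 0.611111 0.250000 0.111111
```

Because the values are relative to the image size, the same label stays valid if the image is later resized.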
To train as well as validate our model, we then separate the data into two groups:
1. We create a folder called “train”, containing 80% of the images and their
corresponding labels.
2. Next, we create a folder called “val”, containing the remaining 20% of the images and their
corresponding labels.
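The split itself can be done by hand or with a few lines of code. The following is a minimal sketch of a reproducible 80/20 split over a list of image names; the function name, the fixed seed, and the sample file names are assumptions for illustration, not the procedure actually used in this project.

```python
import random


def split_train_val(image_names, train_fraction=0.8, seed=0):
    """Shuffle image names reproducibly and split them into train and val lists."""
    names = sorted(image_names)
    random.Random(seed).shuffle(names)
    cut = int(len(names) * train_fraction)
    return names[:cut], names[cut:]


# Hypothetical file names standing in for the annotated dataset
images = [f"car_{i:03d}.jpg" for i in range(10)]
train, val = split_train_val(images)
print(len(train), len(val))  # 8 2
```

Each image's label file (same base name, with a .txt extension) has to be moved into the same group as the image.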
Finally, we should be left with the following directory structure inside the data directory:
# number of classes
nc: 1
# class names
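import os
from ultralytics import YOLO  # provides the YOLO class used below

ROOT_DIR = '/content/data'  # the data directory described above
model = YOLO("yolov8n.yaml")  # build a YOLOv8-nano model from its architecture file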
After successfully executing the above code, we have completed the training and validation of
our customized YOLOv8 model. We obtain the file “runs/detect/train3/weights/best.pt”,
which is the trained weight file that we will use in the further steps to infer the license plates
from a given video.
We also obtain the train results for different batches, via labeled images. These are stored in
our case at “/content/runs/detect/train3”.
4.1.3 Using the Trained YOLOv8 model to detect license plate number
We use the trained model to detect the license plate and then use the EasyOCR library to
extract the license plate number and display it on top of the video. The code that we used to do
this is given below:
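import cv2
import numpy as np
import torch
from ultralytics import YOLO

import util
from sort.sort import *
from util import get_car, read_license_plate, write_csv

torch.cuda.set_device(0)  # Set to your desired GPU number
# Assumption: `device` is used below but never defined in the listing
device = torch.device('cuda:0')

results = {}
mot_tracker = Sort()

# load models
coco_model = YOLO('yolov8n.pt').to(device)
license_plate_detector = YOLO('./best.pt').to(device)

# load video
cap = cv2.VideoCapture('./sample.mp4')

# COCO class ids of the vehicles to track: car, motorcycle, bus, truck
vehicles = [2, 3, 5, 7]

# read frames
frame_nmr = -1
ret = True
while ret:
    frame_nmr += 1
    ret, frame = cap.read()
    if ret:
        results[frame_nmr] = {}

        # detect vehicles
        detections = coco_model(frame)[0]
        detections_ = []
        for detection in detections.boxes.data.tolist():
            x1, y1, x2, y2, score, class_id = detection
            if int(class_id) in vehicles:
                detections_.append([x1, y1, x2, y2, score])

        # track vehicles
        track_ids = mot_tracker.update(np.asarray(detections_))

        # The lines between tracking and writing the results are missing from
        # this listing; only the guard `if car_id != -1:` survives. Per the
        # description below, they detect the license plate, crop it, convert
        # the crop to grayscale, invert it and pass it to EasyOCR.

# write results
write_csv(results, './test.csv')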
In this code, we do the following,
1. Import all the necessary libraries
2. We load the SORT algorithm which is a simple online and realtime tracking algorithm
for 2D multiple object tracking in video sequences. This helps us get a smooth video
and avoid jitters when displaying the license plates.
3. Next, we load the base yolov8n (as coco_model) model as well as our trained custom
model best.pt (as license_plate_detector).
4. Next, we load the video (sample.mp4) using the OpenCV library.
5. Next, we define the vehicles we want to track using the vehicles list.
Here, [2, 3, 5, 7] are the COCO class ids for [car, motorcycle, bus, truck], respectively.
6. In the following while loop, we iterate through each frame of the video to first detect
the vehicle classes we defined in the previous step. If we predict a vehicle in a frame,
only then do we proceed to detect a license plate, followed by cropping that region,
converting it to a grayscale image, and inverting the grayscale image before passing it
over the EasyOCR library.
7. Finally, we write the results onto a file called test.csv
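Step 6 depends on matching each detected plate to a tracked vehicle. The listing imports a helper named get_car for this purpose, but its body is not shown, so the following is only a sketch of one plausible approach. It assumes that a plate belongs to the car whose bounding box fully contains it, that each tracker row has the form [x1, y1, x2, y2, track_id], and that -1 signals "no matching car". The function name assign_plate_to_car is hypothetical.

```python
def assign_plate_to_car(plate_box, tracked_cars):
    """Return the id of the tracked car whose box contains the plate, or -1."""
    px1, py1, px2, py2 = plate_box
    for cx1, cy1, cx2, cy2, car_id in tracked_cars:
        if px1 > cx1 and py1 > cy1 and px2 < cx2 and py2 < cy2:
            return car_id
    return -1


# Hypothetical tracker output: [x1, y1, x2, y2, track_id] per vehicle
tracks = [[100, 100, 500, 400, 7], [600, 120, 900, 380, 9]]
inside = assign_plate_to_car((650, 300, 760, 340), tracks)   # plate inside car 9
outside = assign_plate_to_car((10, 10, 60, 30), tracks)      # no enclosing car
print(inside, outside)  # 9 -1
```

Plates that cannot be tied to a tracked vehicle are skipped, which avoids reading text from unrelated parts of the frame.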
an idea about the license plate number. So, we use data interpolation to fill these empty frames
in the video. Here is the code we use to achieve the same:
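import csv
import numpy as np
from scipy.interpolate import interp1d


def interpolate_bounding_boxes(data):
    # Extract necessary data columns from input data
    frame_numbers = np.array([int(row['frame_nmr']) for row in data])
    car_ids = np.array([int(float(row['car_id'])) for row in data])
    car_bboxes = np.array([list(map(float, row['car_bbox'][1:-1].split())) for row in data])
    license_plate_bboxes = np.array([list(map(float, row['license_plate_bbox'][1:-1].split())) for row in data])

    interpolated_data = []
    unique_car_ids = np.unique(car_ids)
    for car_id in unique_car_ids:
        # Assumed reconstruction: the listing uses these names without defining them
        car_mask = car_ids == car_id
        car_frame_numbers = frame_numbers[car_mask]
        car_bboxes_interpolated = []
        license_plate_bboxes_interpolated = []

        first_frame_number = car_frame_numbers[0]
        last_frame_number = car_frame_numbers[-1]

        for i in range(len(car_bboxes[car_mask])):
            frame_number = car_frame_numbers[i]
            car_bbox = car_bboxes[car_mask][i]
            license_plate_bbox = license_plate_bboxes[car_mask][i]

            if i > 0:
                prev_frame_number = car_frame_numbers[i-1]
                prev_car_bbox = car_bboxes_interpolated[-1]
                prev_license_plate_bbox = license_plate_bboxes_interpolated[-1]

                if frame_number - prev_frame_number > 1:
                    # Assumed reconstruction of the missing interpolation step:
                    # linear interpolation over the frames in the gap
                    gap_frames = np.arange(prev_frame_number, frame_number)
                    endpoints = np.array([prev_frame_number, frame_number])
                    interpolated_car_bboxes = interp1d(
                        endpoints, np.vstack((prev_car_bbox, car_bbox)), axis=0)(gap_frames)
                    interpolated_license_plate_bboxes = interp1d(
                        endpoints, np.vstack((prev_license_plate_bbox, license_plate_bbox)), axis=0)(gap_frames)

                    # Skip index 0, which repeats the previous frame's box
                    car_bboxes_interpolated.extend(interpolated_car_bboxes[1:])
                    license_plate_bboxes_interpolated.extend(interpolated_license_plate_bboxes[1:])

            car_bboxes_interpolated.append(car_bbox)
            license_plate_bboxes_interpolated.append(license_plate_bbox)

        # Emit one row per frame between the car's first and last detection
        for i in range(len(car_bboxes_interpolated)):
            frame_number = first_frame_number + i
            row = {}
            row['frame_nmr'] = str(frame_number)
            row['car_id'] = str(car_id)
            row['car_bbox'] = ' '.join(map(str, car_bboxes_interpolated[i]))
            row['license_plate_bbox'] = ' '.join(map(str, license_plate_bboxes_interpolated[i]))
            interpolated_data.append(row)

    return interpolated_data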
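To make the interpolation idea concrete, here is a small standalone sketch that fills the frames between two detections by linear interpolation, using plain Python instead of SciPy. The function name and the sample boxes are hypothetical. It only illustrates the principle and is not a drop-in replacement for the function above.

```python
def interpolate_boxes(frame_a, box_a, frame_b, box_b):
    """Linearly interpolate boxes for the frames strictly between frame_a and frame_b."""
    filled = {}
    span = frame_b - frame_a
    for frame in range(frame_a + 1, frame_b):
        t = (frame - frame_a) / span  # fraction of the way from frame_a to frame_b
        filled[frame] = [a + t * (b - a) for a, b in zip(box_a, box_b)]
    return filled


# Hypothetical detections at frames 10 and 14; frames 11-13 are missing
gaps = interpolate_boxes(10, [100, 200, 300, 400], 14, [140, 200, 340, 400])
for frame, box in gaps.items():
    print(frame, box)
# 11 [110.0, 200.0, 310.0, 400.0]
# 12 [120.0, 200.0, 320.0, 400.0]
# 13 [130.0, 200.0, 330.0, 400.0]
```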
After successful execution of this code, we get the output test_interpolated.csv, which we
then use to visualize the data using OpenCV.
import ast
import cv2
import numpy as np
import pandas as pd


# Assumed signature, inferred from the call below; the top of this function
# (including the lines for the other corners) is missing from the listing.
def draw_border(img, top_left, bottom_right, color, thickness, line_length_x, line_length_y):
    x1, y1 = top_left
    x2, y2 = bottom_right

    cv2.line(img, (x2, y1), (x2 - line_length_x, y1), color, thickness)  #-- top-right
    cv2.line(img, (x2, y1), (x2, y1 + line_length_y), color, thickness)

    return img


results = pd.read_csv('./test_interpolated.csv')

# load video
video_path = 'sample.mp4'
cap = cv2.VideoCapture(video_path)

# (the creation of the video writer `out` is missing from the listing)

# for every car, keep the reading with the highest OCR score
license_plate = {}
for car_id in np.unique(results['car_id']):
    max_ = np.amax(results[results['car_id'] == car_id]['license_number_score'])
    best = results[(results['car_id'] == car_id) & (results['license_number_score'] == max_)]
    license_plate[car_id] = {'license_crop': None,
                             'license_plate_number': best['license_number'].iloc[0]}
    cap.set(cv2.CAP_PROP_POS_FRAMES, best['frame_nmr'].iloc[0])
    ret, frame = cap.read()

    # (the lines that cut `license_crop` out of `frame` are missing from the listing)
    license_plate[car_id]['license_crop'] = license_crop

frame_nmr = -1
cap.set(cv2.CAP_PROP_POS_FRAMES, 0)

# read frames
ret = True
while ret:
    ret, frame = cap.read()
    frame_nmr += 1
    if ret:
        df_ = results[results['frame_nmr'] == frame_nmr]
        for row_indx in range(len(df_)):
            # draw car; the bbox string is turned into a Python list literal
            # (the widths of the space runs in the replace calls are reconstructed)
            car_x1, car_y1, car_x2, car_y2 = ast.literal_eval(
                df_.iloc[row_indx]['car_bbox'].replace('[ ', '[').replace('   ', ' ').replace('  ', ' ').replace(' ', ','))
            draw_border(frame, (int(car_x1), int(car_y1)), (int(car_x2), int(car_y2)), (0, 255, 0), 25,
                        line_length_x=200, line_length_y=200)

            # (the lines that look up `license_crop` and compute `text_width`
            # and `text_height` are missing from the listing)
            H, W, _ = license_crop.shape

            try:
                # paste the plate crop above the car and print the number over it
                frame[int(car_y1) - H - 100:int(car_y1) - 100,
                      int((car_x2 + car_x1 - W) / 2):int((car_x2 + car_x1 + W) / 2), :] = license_crop

                cv2.putText(frame,
                            license_plate[df_.iloc[row_indx]['car_id']]['license_plate_number'],
                            (int((car_x2 + car_x1 - text_width) / 2), int(car_y1 - H - 250 + (text_height / 2))),
                            cv2.FONT_HERSHEY_SIMPLEX,
                            4.3,
                            (0, 0, 0),
                            17)
            except Exception:
                # skip the overlay if it cannot be drawn, e.g. when it would fall outside the frame
                pass

        out.write(frame)
        frame = cv2.resize(frame, (1280, 720))

        # cv2.imshow('frame', frame)
        # cv2.waitKey(0)

out.release()
cap.release()
After the successful execution of this file, we get the resultant video as out.mp4 which
contains the annotated video with the inferred license numbers.
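Before rendering, the visualization script keeps a single plate reading per car: the one with the highest license_number_score. The same selection can be sketched without pandas as follows. The function name and the sample rows, including the plate strings, are hypothetical.

```python
def best_reading_per_car(rows):
    """Keep, for each car_id, the row with the highest license_number_score."""
    best = {}
    for row in rows:
        car_id = row["car_id"]
        if car_id not in best or row["license_number_score"] > best[car_id]["license_number_score"]:
            best[car_id] = row
    return best


# Hypothetical per-frame OCR results
rows = [
    {"car_id": 1, "frame_nmr": 3, "license_number": "MH12AB1284", "license_number_score": 0.41},
    {"car_id": 1, "frame_nmr": 9, "license_number": "MH12AB1234", "license_number_score": 0.87},
    {"car_id": 2, "frame_nmr": 5, "license_number": "KA05XY9876", "license_number_score": 0.63},
]
best = best_reading_per_car(rows)
print(best[1]["license_number"], best[2]["license_number"])  # MH12AB1234 KA05XY9876
```

Showing only the highest-scoring reading keeps the displayed number stable across frames, even when individual frames are misread.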
Precision 0.819
Recall 0.545
mAP50 0.617
mAP50-95 0.251
Precision is the ability of a model to identify only the relevant objects. It answers the
question: what proportion of positive identifications was actually correct? A model that
produces no false positives has a precision of 1.0. However, precision can be 1.0 even when
the model misses bounding boxes that it should have detected, because missed detections
(false negatives) do not enter the calculation.
After the successful training of the model, we obtained a good precision of 0.819.
Recall is the ability of a model to find all ground-truth bounding boxes. It answers the
question: what proportion of actual positives was identified correctly? A model that produces
no false negatives (i.e., there are no undetected bounding boxes that should be detected) has a
recall of 1.0. However, recall can be 1.0 even when the model over-detects and outputs wrong
bounding boxes, because false positives do not enter the calculation.
After the successful training of the model, we obtained a recall of 0.545, meaning that
roughly 45% of the ground-truth license plates were not detected.
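The two definitions translate directly into code. The sketch below computes both metrics from counts of true positives (TP), false positives (FP), and false negatives (FN). The counts are hypothetical and are not the ones behind the reported results. Returning 0.0 when a denominator is zero is a convention chosen here.

```python
def precision(tp, fp):
    """Fraction of predicted boxes that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of ground-truth boxes that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts: 60 correct detections, 15 spurious ones, 40 missed plates
tp, fp, fn = 60, 15, 40
print(round(precision(tp, fp), 3))  # 0.8
print(round(recall(tp, fn), 3))     # 0.6
```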
When comparing the performance of two machine learning models, the model whose Precision-
Recall curve lies higher performs better. Plotting and comparing the curves is time-consuming,
however, and because a Precision-Recall curve often zigzags, judging a model from the plot
alone is subjective.
A more intuitive way to evaluate models is the AP (Average Precision), which is the area
under the Precision-Recall curve. The closer the curve is to the upper right corner, the larger
the area, the higher the AP, and the better the model.
The mAP (mean Average Precision) is the mean of the AP values over all classes. mAP50 is
computed at an IoU threshold of 0.5, while mAP50-95 averages over IoU thresholds from 0.5 to 0.95.
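As an illustration of how AP is obtained from a ranked list of detections, the sketch below computes the area under the precision-recall curve using all-point interpolation, a common convention. The exact procedure used by the training library may differ in detail. The function name and the sample detections are hypothetical.

```python
def average_precision(scored_hits, num_ground_truth):
    """Area under the precision-recall curve (all-point interpolation).

    scored_hits: list of (confidence, is_true_positive), one per detection.
    """
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    tp = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        tp += 1 if is_tp else 0
        precisions.append(tp / rank)
        recalls.append(tp / num_ground_truth)

    # Replace each precision by the best precision at any later rank,
    # which removes the zigzag from the curve
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the rectangles under the smoothed curve
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Hypothetical detections for one class with 4 ground-truth plates
detections = [(0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False)]
ap = average_precision(detections, num_ground_truth=4)
print(ap)  # 0.6875
```

With a single class, as in this project (nc: 1), the mAP equals the AP of that class.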
An F1-Confidence curve is a graphical representation that shows how the F1 score varies with
different confidence thresholds in a binary classification system.
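The F1 score is the harmonic mean of precision and recall. The sketch below first evaluates it for the precision and recall reported above, then shows how an F1-Confidence curve arises by recomputing F1 while discarding detections below each confidence threshold. The function names and the sample detections are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def f1_by_threshold(scored_hits, num_ground_truth, thresholds):
    """F1 obtained when only detections at or above each threshold are kept."""
    curve = {}
    for threshold in thresholds:
        kept = [is_tp for conf, is_tp in scored_hits if conf >= threshold]
        tp = sum(kept)
        p = tp / len(kept) if kept else 0.0
        r = tp / num_ground_truth
        curve[threshold] = f1_score(p, r)
    return curve


# F1 implied by the reported precision (0.819) and recall (0.545)
reported_f1 = f1_score(0.819, 0.545)
print(round(reported_f1, 3))  # 0.654

# Hypothetical detections: (confidence, is_true_positive), 4 ground-truth plates
hits = [(0.95, True), (0.90, True), (0.80, False), (0.70, True), (0.60, False)]
curve = f1_by_threshold(hits, 4, [0.5, 0.75, 0.92])
for threshold, f1 in curve.items():
    print(threshold, round(f1, 3))
```

In this toy example, raising the threshold lowers F1 because the recall lost outweighs the precision gained. The peak of the curve indicates the confidence threshold that best balances the two.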
Chapter 5
Conclusion and Future Scope
5.1 Conclusion
The integration of YOLO's object detection capabilities with EasyOCR's character recognition
yields a working pipeline for license plate detection. The system localizes license plates
and extracts their alphanumeric content across a range of plate designs and lighting
conditions. After training, the detector reached a precision of 0.819 and a recall of 0.545,
so the plates it reports are usually correct, although a substantial share of plates is still
missed. Together with YOLO's single-pass inference, this makes the system a promising basis
for applications like traffic surveillance and security, where swift and accurate data
extraction is paramount.
Notably, the system's excellence extends beyond its technical capabilities to its user interface,
offering an intuitive and seamless experience. Users can effortlessly input images, and the
system promptly furnishes reliable license plate information. This ease of interaction
underscores the project's commitment to not only technical proficiency but also user-centric
design, ensuring accessibility and convenience.
The impact of this project is transformative. It significantly elevates the precision, speed, and
user-friendliness of license plate detection systems, marking a paradigm shift in their usability
and effectiveness. Its broadened utility transcends singular domains, positioning itself as a
pivotal tool across diverse real-world applications.
This amalgamation of cutting-edge detection and recognition technologies signifies a new era
in efficient, accurate, and accessible license plate detection systems. Its success lies not only in
the technical prowess it embodies but also in its potential to revolutionize various sectors
where accurate identification and data extraction from license plates are pivotal. As this
system seamlessly marries advanced capabilities, it stands as a testament to the evolution of
sophisticated, adaptable, and indispensable technologies in the realm of license plate detection.
5.2 Future Scope
2. Adaptation to New License Plate Formats: As license plate designs evolve or new formats
emerge, the system can be updated to accommodate these changes. This includes handling
different font styles, symbols, or variations in plate sizes across various regions.
4. Integration with Surveillance Systems: Integrating this technology into existing surveillance
infrastructure, such as CCTV networks or traffic cameras, could bolster law enforcement,
traffic management, and security measures.
5. Edge Computing Implementation: Optimizing the system for edge computing devices can
enable on-device processing, reducing reliance on cloud services. This would enhance privacy,
decrease latency, and make the system more adaptable for remote or resource-constrained
environments.
6. Machine Learning for Error Correction: Implementing machine learning algorithms to learn
from detection and recognition errors could refine the system's accuracy over time,
minimizing false positives or negatives.
Individual contribution and findings: I was responsible for the conclusion of this project,
summarizing the overall idea and contributing to the project's final design. During the
development phase, I helped refine the code structure to keep it coherent and effective.
I also took part in writing the report, offering insights and clarity to improve its overall
quality. In addition, I collaborated with Priyal Vaidya on data collection and preparation for
YOLO model training, which involved sourcing relevant images, annotating them, and optimizing
them for the YOLO model's requirements. My contributions thus spanned ideation, code
refinement, and collaborative report development.
Abstract: This report outlines the development of a robust license plate detection system by
merging YOLOv8 for real-time object detection with EasyOCR for accurate Optical Character
Recognition (OCR). It aims to precisely locate license plates in images and efficiently extract
alphanumeric characters, addressing challenges like diverse designs and lighting conditions.
The system prioritizes accuracy, speed, and user-friendliness, with documentation covering
architecture and usage guidelines, promising enhanced detection effectiveness.
Individual contribution and findings: Within this project, my main focus was the
initial phase. I played a key role in working out the basic concepts required for the project,
which shaped its foundational design. In the report, I contributed the introduction and the
basic concepts, enriching the overall quality of the log normalization aspect. I also
collaborated with Ishita Gupta on data collection and preparation for YOLO model training,
which involved sourcing relevant images, annotating them, and optimizing them for the YOLO
model's requirements.
Individual contribution and findings: In the project, a pivotal aspect of my role was
the meticulous verification of data and finding results. This involved a systematic and
thorough check to guarantee that the data consistently conformed to the required format
throughout the project lifecycle. This responsibility underscored the importance of my role in
upholding data quality and supporting the effectiveness of the model.