License Plate Detection Using YOLOv8

PROJECT
ON
License plate detection using YOLOv8

Submitted to
KIIT Deemed to be University
In Partial Fulfillment of the Requirement for the Award of
BACHELOR’S DEGREE IN
COMPUTER SCIENCE AND SYSTEMS ENGINEERING
BY
Ishita Gupta 2028095
Pratik Choudhary 2028101
Priyal Vaidya 2028103
Sayam Samal 2028107
UNDER THE GUIDANCE OF

Dr Soumya Ranjan Mishra
SCHOOL OF COMPUTER ENGINEERING

KALINGA INSTITUTE OF INDUSTRIAL TECHNOLOGY
BHUBANESWAR, ODISHA - 751024
December 2023
CERTIFICATE
This is to certify that the project entitled
License Plate Detection Using YOLOv8
submitted by
Ishita Gupta 2028095

Pratik Choudhary 2028101
Priyal Vaidya 2028103
Sayam Samal 2028107
This is a record of bonafide work carried out by them, in the partial fulfillment of
the requirement for the award of Degree of Bachelor of Engineering (Computer
Science & Engineering) at KIIT Deemed to be University, Bhubaneswar. This work
is done during the year 2022-2023, under our guidance.
(Dr Soumya Ranjan Mishra)

Acknowledgments
Firstly, we would like to thank our project guide Dr Soumya Ranjan Mishra, for his
continuous support and guidance throughout the project. His valuable insights and suggestions
have been invaluable in shaping the direction and scope of this report.
We are also grateful to our university, for providing us with the necessary resources and
facilities to carry out this project. Constant feedback and encouragement from our university
helped us to stay motivated and focused on our objectives.
Finally, we would like to thank our families and friends for their unwavering support and
encouragement throughout this project. Their constant motivation and belief in us have been
the driving force behind our success.
Thank you all for your valuable contribution.
Ishita Gupta
Pratik Choudhary
Priyal Vaidya
Sayam Samal

Abstract
This report presents the development and integration of a robust license plate detection
system leveraging the capabilities of YOLO (You Only Look Once) for real-time object
detection and EasyOCR for Optical Character Recognition (OCR). The primary objective is
to create a comprehensive solution capable of accurately localizing license plates within
images and extracting alphanumeric characters efficiently.
The system addresses challenges inherent in diverse license plate designs and varying lighting
conditions, critical for applications like traffic surveillance and security. The integration of
YOLO enables swift identification of vehicles and their corresponding license plates, ensuring
efficient real-time detection. EasyOCR is employed to perform detailed character recognition
on the detected license plates, ensuring accurate extraction of license plate numbers. The
combined system aims for high accuracy in detection while maintaining quick processing
speed, contributing significantly to the effectiveness of the solution.
Emphasis is placed on user-friendliness, with a streamlined process for image input and
seamless retrieval of reliable license plate information. The success of the project hinges on
achieving high accuracy in detection, optimal processing speed, and a user-friendly interface.
The accompanying documentation provides insights into the system's architecture,
functionalities, and guidelines for effortless utilization. The successful implementation of this
integrated solution holds promise for enhancing the accuracy and efficiency of license plate
detection, extending its applicability across various practical domains.
This report details the methodology, results, and implications of the developed system,
providing a comprehensive overview of the license plate detection solution and its potential
impact on real-world applications.
Keywords: License plate detection, YOLO, EasyOCR, Real-time, User-friendly.

Contents
S.No. Title
1 Project Title
2 Certificate
3 Acknowledgements
4 Abstract
5 Contents
6 Introduction
7 Basic Concepts
8 Problem Statement
9 Implementation
10 Conclusion and Future Scope
10.1 Conclusion
10.2 Future Scope
11 References
12 Individual Contribution

Chapter 1
Introduction
1.1 About YOLO
Detecting license plate numbers using computer vision, especially with the integration of
YOLO (You Only Look Once) and Optical Character Recognition (OCR), offers a range of
practical applications and benefits. This section introduces why the process is valuable.
In the ever-evolving landscape of technology, the integration of computer vision has brought
forth innovative solutions, and one particularly compelling application is the detection of
license plate numbers. This technology has proven to be immensely useful in various domains,
offering a blend of efficiency, accuracy, and automation.
One primary utility of license plate detection through computer vision lies in enhancing
security and surveillance systems. By leveraging advanced algorithms, such as YOLO, to
detect license plates in real-time, it becomes possible to monitor and track vehicles seamlessly.
This is particularly advantageous in the context of law enforcement, where quick identification
of vehicles is essential for tasks ranging from traffic management to criminal investigations.
The use of YOLO in license plate detection adds a layer of sophistication to the process.
YOLO's ability to process images in a single pass, providing rapid and accurate bounding box
predictions, is crucial in scenarios where real-time responses are paramount. This efficiency
makes it well-suited for applications like toll booth management, parking enforcement, and
access control systems.
Once license plates are detected using YOLO, the integration of OCR becomes pivotal. OCR
allows for the extraction of alphanumeric characters from the detected license plates. This step
transforms the detected images into actionable data, enabling the retrieval of license plate
numbers in a machine-readable format. The synergy between YOLO and OCR ensures a
comprehensive solution for accurate and reliable license plate recognition.

The benefits of this approach extend beyond security and law enforcement. Industries such as
transportation, parking management, and smart city initiatives can leverage license plate
detection to streamline operations, enhance efficiency, and improve overall service delivery.
1.2 Understanding YOLOv8
YOLOv8, an evolution in the YOLO series, represents a state-of-the-art object detection
algorithm known for its speed and accuracy. The acronym stands for "You Only Look Once,"
highlighting its unique approach to object detection in images.
1.2.1 Single-Shot Detection

One of the defining characteristics of YOLOv8 is its single-shot detection mechanism. Unlike
traditional object detection algorithms that rely on region proposals followed by classification,
YOLOv8 processes the entire image in a single pass. This approach enables it to
simultaneously predict bounding boxes and class probabilities for multiple objects within the
image.
1.2.2 Backbone Architecture

YOLOv8 incorporates a powerful convolutional backbone, derived from the CSPDarknet family
used in earlier YOLO versions, which forms the foundational structure for feature extraction.
This architecture allows for efficient and effective feature representation, crucial for accurate
object detection.
1.2.3 Feature Pyramid Network (FPN)

To handle objects of varying scales and sizes within an image, YOLOv8 often utilizes a
Feature Pyramid Network. FPN enables multi-scale feature extraction, ensuring that objects of
different sizes are detected with equal precision. This feature aids in detecting both small and
large objects within the same image.
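1.2.4 Grid Cells and Anchor-Free Detection

Earlier YOLO versions divide the input image into a grid and predict bounding boxes as
adjustments to predefined anchor boxes associated with specific grid cells. YOLOv8 keeps the
grid-based view, in which each location of the output feature maps is responsible for the objects
centered on it, but its detection head is anchor-free: it predicts the box extents directly
instead of offsets from anchor boxes. This grid cell approach facilitates the localization and
identification of multiple objects simultaneously.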

1.2.5 Efficiency and Speed

One of YOLOv8's key strengths lies in its efficiency and speed. By processing the entire
image at once, it significantly reduces computational overhead compared to multi-stage
detection methods. This characteristic makes YOLOv8 particularly suitable for real-time
applications where quick object detection is crucial.
1.2.6 Training and Fine-Tuning

Training YOLOv8 involves utilizing large datasets with annotated bounding boxes to teach the
algorithm to recognize various objects. Fine-tuning the model on specific datasets or domains
further enhances its accuracy for specialized applications.
1.2.7 Advancements and Adaptability

YOLOv8 is an evolving algorithm, and its adaptability to incorporate newer advancements in
deep learning, such as improved loss functions or network architectures, contributes to its
ongoing effectiveness in object detection tasks.
YOLOv8 is a pioneering object detection algorithm renowned for its efficiency, speed, and
accuracy. Its single-shot detection methodology, backbone architecture, feature pyramid
network, and grid cell approach collectively contribute to its ability to swiftly and accurately
detect objects within images, making it a pivotal tool in various computer vision applications,
including license plate detection.

Chapter 2
Basic Concepts
This section contains the basic concepts about the related tools and techniques used in this
project.
2.1 Understanding YOLO

YOLO stands out as a groundbreaking algorithm in the realm of computer vision and object
detection. Its innovation lies in its ability to detect objects in images with remarkable speed
and accuracy, primarily due to its single-shot detection approach and unified framework.
2.1.1 Single Shot Object Detection

Traditional object detection methods typically relied on multi-stage approaches or region-
based algorithms, leading to a slower detection process. YOLO revolutionized this by
introducing a single-shot detection paradigm. Unlike its predecessors that fragmented the
detection task into separate components like region proposal networks and subsequent
classification, YOLO takes a holistic approach. It processes the entire image in a single pass
through a neural network, enabling simultaneous prediction of bounding boxes and class
probabilities.
2.1.2 Unified Framework

One of YOLO's distinguishing factors is its unified architecture. By integrating object
localization and classification within the same network, YOLO streamlines the detection
process. This unified framework allows for end-to-end learning, optimizing both accuracy and
speed. As a result, YOLO achieves a balance between precise object localization and efficient
classification without compromising on performance.
2.1.3 Grid-based Approach

YOLO divides the input image into a grid, where each grid cell is responsible for predicting
bounding boxes. Within each cell, multiple bounding boxes are predicted along with
corresponding class probabilities. This grid-based approach enables YOLO to handle objects
of varying sizes, aspect ratios, and spatial locations within an image efficiently. This flexibility
makes YOLO robust in detecting objects regardless of their scale or position.
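The grid assignment described above can be illustrated with a minimal sketch. It assumes a 7 x 7 grid, as in the original YOLO paper, and boxes given as pixel coordinates (x1, y1, x2, y2); the function name responsible_cell is hypothetical and is not part of any YOLO library.

```python
def responsible_cell(box, image_w, image_h, grid_size=7):
    """Return the (row, col) of the grid cell that contains the box centre."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    # scale the centre into grid units and clamp it to the last cell
    col = min(int(cx / image_w * grid_size), grid_size - 1)
    row = min(int(cy / image_h * grid_size), grid_size - 1)
    return row, col


# a small plate-sized box near the middle of a 640 x 480 image
cell = responsible_cell((300, 200, 340, 220), image_w=640, image_h=480)
print(cell)
```

In this sketch only the cell containing the centre of the box is made responsible for it, which is why two objects whose centres fall into the same cell are harder to separate.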

2.1.4 Anchor Boxes for Precise Localization

To improve object localization accuracy, several YOLO versions (starting with YOLOv2)
incorporate anchor boxes. These anchor boxes are predetermined bounding box shapes of different
scales and aspect ratios. YOLO learns to adjust these anchor boxes to accurately fit around
objects in the image. This feature enhances the algorithm's ability to precisely localize objects,
especially those with diverse shapes and sizes. YOLOv8, the version used in this project, drops
the predefined anchors and predicts the boxes directly.
2.1.5 Confidence Scores and Non-Maximum Suppression (NMS)

YOLO predicts a confidence score for each bounding box, indicating the algorithm's
confidence in the presence of an object within that box. Following these predictions,
Non-Maximum Suppression (NMS) is applied. NMS keeps the box with the highest confidence score
and discards the other boxes that overlap it by more than a chosen Intersection over Union (IoU)
threshold, repeating this until no boxes remain. This step is crucial in eliminating duplicate
or overlapping predictions, refining the final set of detected objects.
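The suppression step can be illustrated with a minimal greedy NMS sketch in plain Python. The detection format (x1, y1, x2, y2, score) and the IoU threshold of 0.5 are assumptions chosen for illustration; they are not taken from the YOLO implementation used in this project.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS over detections given as (x1, y1, x2, y2, score)."""
    remaining = sorted(detections, key=lambda d: d[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # highest-scoring box still in play
        kept.append(best)
        # drop every box that overlaps the kept box too strongly
        remaining = [d for d in remaining if iou(best[:4], d[:4]) < iou_threshold]
    return kept


boxes = [
    (100, 100, 200, 150, 0.90),  # strong detection of a plate
    (105, 102, 205, 152, 0.75),  # near-duplicate of the first box
    (400, 300, 480, 340, 0.60),  # a different plate elsewhere in the image
]
kept = nms(boxes)
print(kept)
```

Here the second box overlaps the first one heavily and is suppressed, while the third box does not overlap either of them and is kept.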
2.1.6 Versatility and Applications

The speed and accuracy of YOLO make it a versatile tool applicable across various domains.
Beyond image object detection, YOLO's prowess extends to real-time video analysis,
autonomous vehicles, surveillance systems, robotics, and more. Its ability to process images
rapidly while maintaining high accuracy has made it a preferred choice for tasks requiring
real-time responsiveness and precise object identification.
Understanding these core aspects of YOLO illuminates its significance in advancing object
detection capabilities. Its innovative approach has not only transformed the field of computer
vision but has also significantly influenced diverse industries by providing efficient and
accurate solutions for object localization and classification tasks. Incorporating YOLO into
systems like license plate detection showcases its potential to enhance efficiency and accuracy
in practical applications demanding real-time object recognition.

Chapter 3
Problem Statement
The goal of this project is to develop a comprehensive license plate detection system by
combining the capabilities of YOLO (You Only Look Once), an efficient object detection
algorithm, with EasyOCR, a Python library for Optical Character Recognition. The primary
objective is to create a robust solution capable of accurately localizing license plates within
images and extracting alphanumeric characters from those plates. The integration of YOLO
facilitates real-time object detection, enabling the system to identify vehicles and their
corresponding license plates efficiently. EasyOCR will then be employed to perform detailed
character recognition on the detected license plates, ensuring accurate extraction of license
plate numbers. The project will address challenges such as varying lighting conditions, diverse
license plate designs, and the need for quick and precise detection in scenarios such as traffic
surveillance or security applications. Additionally, the system will be designed for ease of use,
providing a streamlined process for users to input images and receive reliable license plate
information. The successful implementation of this integrated solution is expected to
significantly enhance the accuracy and efficiency of license plate detection, making it
applicable to a range of practical applications.

Chapter 4
Implementation
4.1 Methodology
4.1.1 Data Collection
For training our YOLOv8 model, we are using two datasets, namely:
1. Indian vehicle number plate yolo annotation
At the time of our research, this dataset contained 161 images of Indian vehicle
number plates along with their appropriate YOLO annotations.
2. Car Number Plate Detection
At the time of our research, this dataset contained 931 images of the front and rear
sides of the car which are mostly found in India. This dataset however lacked the
required YOLO annotations.
4.1.2 Data Preparation

Since the Car Number Plate Detection dataset did not contain the required YOLO annotations,
we had to add the appropriate annotations manually. For this purpose, we chose the labelImg
python package to help ease the annotation process.
1. We first install the package using the following command:
pip install labelImg

2. After installing the package, we open the terminal and run the ‘labelImg’ command.
This opens up the labelImg GUI.
3. We then proceed to open the image directory where we downloaded the Car Number
Plate Detection dataset.

4. We then add the required annotations using the easy-to-use GUI, making sure to go
through all the images in the dataset.
5. Next, we set the save directory using the Change Save Dir option, and set it to a folder
named “labels”.
6. Finally, we check to ensure that the format is set to YOLO and not to PascalVOC, and
then we click on the Save button to save the annotations.
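For reference, each line of a YOLO-format label file describes one box as a class ID followed by the box centre, width, and height, all normalized by the image size. The following minimal sketch converts a pixel box into such a line; the function name to_yolo_line and the example coordinates are hypothetical.

```python
def to_yolo_line(class_id, box, image_w, image_h):
    """Convert a pixel box (x1, y1, x2, y2) into a YOLO label line."""
    x1, y1, x2, y2 = box
    x_center = (x1 + x2) / 2 / image_w
    y_center = (y1 + y2) / 2 / image_h
    width = (x2 - x1) / image_w
    height = (y2 - y1) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# class 0 is the only class ('number plate') in our configuration
line = to_yolo_line(0, (200, 300, 440, 360), image_w=640, image_h=480)
print(line)
```

labelImg writes these lines for us when the format is set to YOLO, so the sketch is only meant to make the contents of the label files easier to read.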
To train as well as validate our model, we then separate the data into two groups:
1. We create a folder called “train”, containing 80% of the images and their
corresponding labels.
2. Next, we create a folder called “val”, containing the remaining 20% of the images and their
corresponding labels.
Finally, we should be left with a data directory that contains the train and val folders, each
holding its images and their corresponding labels.
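The 80/20 split can also be scripted. The following is a minimal sketch and not the procedure we followed by hand; the function name split_dataset, the source folder names, and the images/labels sub-folders inside train and val are all assumptions made for illustration.

```python
import random
import shutil
from pathlib import Path


def split_dataset(images_dir, labels_dir, out_dir, train_fraction=0.8, seed=0):
    """Copy image/label pairs into out_dir/train and out_dir/val."""
    images = sorted(p for p in Path(images_dir).iterdir()
                    if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
    random.Random(seed).shuffle(images)  # fixed seed keeps the split repeatable
    n_train = int(len(images) * train_fraction)
    splits = {"train": images[:n_train], "val": images[n_train:]}

    copied = {}
    for split, files in splits.items():
        for sub in ("images", "labels"):
            (Path(out_dir) / split / sub).mkdir(parents=True, exist_ok=True)
        copied[split] = 0
        for image in files:
            label = Path(labels_dir) / (image.stem + ".txt")
            if not label.exists():
                continue  # skip images that were never annotated
            shutil.copy(image, Path(out_dir) / split / "images" / image.name)
            shutil.copy(label, Path(out_dir) / split / "labels" / label.name)
            copied[split] += 1
    return copied


# Example call (hypothetical paths):
# split_dataset("raw/images", "raw/labels", "data")
```

Shuffling before splitting avoids putting all images from one source dataset into the same group.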
4.1.3 Training YOLOv8 on our Custom Data

To train the YOLOv8 model, we used Google Colab, to harness the power of specialized
GPUs for faster model learning.
1. We first install the required package for YOLOv8, using the following command (prefixed
with ! when run inside a Colab cell):
pip install ultralytics
2. We then upload the data folder containing the images and the corresponding labels.
3. Next, we define our custom configuration “custom-data.yaml”
# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: ./train
val: ./val

# number of classes
nc: 1

# class names
names: ['number plate']


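4. Finally, we train the model, using the YOLOv8n base model:
import os
from ultralytics import YOLO

ROOT_DIR = '/content/data'

# build a YOLOv8n model from its architecture definition file
model = YOLO("yolov8n.yaml")
# train for 100 epochs on the dataset described by custom-data.yaml
results = model.train(data=os.path.join(ROOT_DIR, "custom-data.yaml"), epochs=100)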


Here, epochs defines the number of times that the YOLO learning algorithm will work through
the entire training dataset. Thus, in our case, the learning algorithm makes 100 complete passes
through the training dataset, each consisting of a forward and a backward pass over every batch.
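To give a feeling for what 100 epochs mean in terms of weight updates, the following sketch works through the arithmetic. It assumes that all images of both datasets were annotated and used, that 80% of them ended up in the training set, and that the batch size is 16; the batch size in particular is an assumption, since it is not stated in this report.

```python
import math

total_images = 161 + 931                     # both datasets combined
train_images = total_images * 80 // 100      # 80% training split
batch_size = 16                              # assumption, not stated in the report
epochs = 100

# one weight update per batch, and every image is seen once per epoch
updates_per_epoch = math.ceil(train_images / batch_size)
total_updates = updates_per_epoch * epochs
print(train_images, updates_per_epoch, total_updates)
```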
After successfully executing the above code, we have completed the training and validation of
our customized YOLOv8 model. We obtain the file “runs/detect/train3/weights/best.pt”,
which is the trained weight file that we will use in the further steps to infer the license plates
from a given video.
We also obtain the train results for different batches, via labeled images. These are stored in
our case at “/content/runs/detect/train3”.

4.1.4 Using the Trained YOLOv8 Model to Detect License Plate Numbers
We use the trained model to detect the license plate and then use the EasyOCR library to
extract the license plate number and display it on top of the video. The code that we used to do
the same is given below:

from ultralytics import YOLO
import cv2
import numpy as np
import torch
from sort.sort import Sort
from util import get_car, read_license_plate, write_csv

# use the first GPU when one is available, otherwise fall back to the CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'
if device == 'cuda':
    torch.cuda.set_device(0)  # Set to your desired GPU number
print(f'Using device: {device}')

results = {}
mot_tracker = Sort()

# load models
coco_model = YOLO('yolov8n.pt').to(device)
license_plate_detector = YOLO('./best.pt').to(device)

# load video
cap = cv2.VideoCapture('./sample.mp4')

# COCO class IDs: car, motorcycle, bus, truck
vehicles = [2, 3, 5, 7]

# read frames
frame_nmr = -1
ret = True
while ret:
    frame_nmr += 1
    ret, frame = cap.read()
    if ret:
        results[frame_nmr] = {}

        # detect vehicles
        detections = coco_model(frame)[0]
        detections_ = []
        for detection in detections.boxes.data.tolist():
            x1, y1, x2, y2, score, class_id = detection
            if int(class_id) in vehicles:
                detections_.append([x1, y1, x2, y2, score])

        # track vehicles (reshape keeps the array two-dimensional when a frame has no vehicles)
        track_ids = mot_tracker.update(np.asarray(detections_).reshape(-1, 5))

        # detect license plates
        license_plates = license_plate_detector(frame)[0]
        for license_plate in license_plates.boxes.data.tolist():
            x1, y1, x2, y2, score, class_id = license_plate

            # assign license plate to car
            xcar1, ycar1, xcar2, ycar2, car_id = get_car(license_plate, track_ids)

            if car_id != -1:
                # crop license plate
                license_plate_crop = frame[int(y1):int(y2), int(x1):int(x2), :]

                # process license plate: grayscale, then inverted binary threshold
                license_plate_crop_gray = cv2.cvtColor(license_plate_crop, cv2.COLOR_BGR2GRAY)
                _, license_plate_crop_thresh = cv2.threshold(
                    license_plate_crop_gray, 64, 255, cv2.THRESH_BINARY_INV)

                # read license plate number
                license_plate_text, license_plate_text_score = read_license_plate(
                    license_plate_crop_thresh)

                if license_plate_text is not None:
                    results[frame_nmr][car_id] = {
                        'car': {'bbox': [xcar1, ycar1, xcar2, ycar2]},
                        'license_plate': {'bbox': [x1, y1, x2, y2],
                                          'text': license_plate_text,
                                          'bbox_score': score,
                                          'text_score': license_plate_text_score}}

cap.release()

# write results
write_csv(results, './test.csv')

In this code, we do the following,
1. Import all the necessary libraries
2. We load the SORT algorithm, which is a simple online and realtime tracking algorithm
for 2D multiple object tracking in video sequences. It gives each vehicle a consistent ID
across frames, so that license plate readings from different frames can be attributed to
the same car.
3. Next, we load the base yolov8n (as coco_model) model as well as our trained custom
model best.pt (as license_plate_detector).
4. Next, we load the video (sample.mp4) using the OpenCV library.
5. Next, we define the vehicles we want to track using the vehicles list.
Here, [2, 3, 5, 7] correspond respectively to the COCO classes [car, motorcycle, bus, truck].
6. In the following while loop, we iterate through each frame of the video to first detect
the vehicle classes we defined in the previous step and pass them to the tracker. We then
detect the license plates in the frame and assign each plate to a tracked vehicle. Only if a
plate belongs to a tracked vehicle do we crop that region, convert it to a grayscale image,
and apply an inverted binary threshold before passing it over to the EasyOCR library.
7. Finally, we write the results onto a file called test.csv.
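The util module imported in the code above is not listed in this report. The following is a minimal sketch of how a helper such as get_car could assign a license plate to a tracked vehicle, assuming that a plate belongs to the first vehicle whose bounding box fully contains it. The name get_car_sketch and the containment rule are assumptions for illustration, not the actual helper.

```python
def get_car_sketch(license_plate, vehicle_track_ids):
    """Return (x1, y1, x2, y2, car_id) of the vehicle containing the plate, or -1s."""
    x1, y1, x2, y2, score, class_id = license_plate
    for xcar1, ycar1, xcar2, ycar2, car_id in vehicle_track_ids:
        # the plate must lie completely inside the vehicle box
        if x1 > xcar1 and y1 > ycar1 and x2 < xcar2 and y2 < ycar2:
            return xcar1, ycar1, xcar2, ycar2, car_id
    return -1, -1, -1, -1, -1


# tracker output rows: x1, y1, x2, y2, track ID
tracks = [[50, 50, 100, 100, 1], [100, 100, 300, 250, 2]]
plate = (120, 150, 180, 170, 0.9, 0)
match = get_car_sketch(plate, tracks)
print(match)
```

Returning -1 as the car ID for unmatched plates is what allows the main loop to skip them with the check car_id != -1.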
4.1.5 Data Interpolation

When inspecting test.csv, we find that the license plates are not detected in every single
frame of the video. For the same car, the license plate is detected in some frames and missed
in others. For the intermediate frames we have no bounding box information, which leads to
empty data being displayed in the generated video even though we already have an idea about
the license plate number. So, we use linear interpolation of the bounding boxes to fill these
empty frames in the video. Here is the code we use to achieve the same:
import csv
import numpy as np
from scipy.interpolate import interp1d


def interpolate_bounding_boxes(data):
    # Extract necessary data columns from input data
    frame_numbers = np.array([int(row['frame_nmr']) for row in data])
    car_ids = np.array([int(float(row['car_id'])) for row in data])
    car_bboxes = np.array([list(map(float, row['car_bbox'][1:-1].split())) for row in data])
    license_plate_bboxes = np.array(
        [list(map(float, row['license_plate_bbox'][1:-1].split())) for row in data])

    interpolated_data = []
    unique_car_ids = np.unique(car_ids)
    for car_id in unique_car_ids:
        frame_numbers_ = [p['frame_nmr'] for p in data
                          if int(float(p['car_id'])) == int(float(car_id))]
        print(frame_numbers_, car_id)

        # Filter data for a specific car ID
        car_mask = car_ids == car_id
        car_frame_numbers = frame_numbers[car_mask]
        car_bboxes_interpolated = []
        license_plate_bboxes_interpolated = []

        first_frame_number = car_frame_numbers[0]
        last_frame_number = car_frame_numbers[-1]

        for i in range(len(car_bboxes[car_mask])):
            frame_number = car_frame_numbers[i]
            car_bbox = car_bboxes[car_mask][i]
            license_plate_bbox = license_plate_bboxes[car_mask][i]

            if i > 0:
                prev_frame_number = car_frame_numbers[i-1]
                prev_car_bbox = car_bboxes_interpolated[-1]
                prev_license_plate_bbox = license_plate_bboxes_interpolated[-1]

                if frame_number - prev_frame_number > 1:
                    # Interpolate missing frames' bounding boxes
                    frames_gap = frame_number - prev_frame_number
                    x = np.array([prev_frame_number, frame_number])
                    x_new = np.linspace(prev_frame_number, frame_number,
                                        num=frames_gap, endpoint=False)
                    interp_func = interp1d(x, np.vstack((prev_car_bbox, car_bbox)),
                                           axis=0, kind='linear')
                    interpolated_car_bboxes = interp_func(x_new)
                    interp_func = interp1d(
                        x, np.vstack((prev_license_plate_bbox, license_plate_bbox)),
                        axis=0, kind='linear')
                    interpolated_license_plate_bboxes = interp_func(x_new)

                    # the first interpolated row repeats the previous frame, so skip it
                    car_bboxes_interpolated.extend(interpolated_car_bboxes[1:])
                    license_plate_bboxes_interpolated.extend(
                        interpolated_license_plate_bboxes[1:])

            car_bboxes_interpolated.append(car_bbox)
            license_plate_bboxes_interpolated.append(license_plate_bbox)

        for i in range(len(car_bboxes_interpolated)):
            frame_number = first_frame_number + i
            row = {}
            row['frame_nmr'] = str(frame_number)
            row['car_id'] = str(car_id)
            row['car_bbox'] = ' '.join(map(str, car_bboxes_interpolated[i]))
            row['license_plate_bbox'] = ' '.join(
                map(str, license_plate_bboxes_interpolated[i]))

            if str(frame_number) not in frame_numbers_:
                # Imputed row, set the following fields to '0'
                row['license_plate_bbox_score'] = '0'
                row['license_number'] = '0'
                row['license_number_score'] = '0'
            else:
                # Original row, retrieve values from the input data if available
                original_row = [p for p in data
                                if int(p['frame_nmr']) == frame_number
                                and int(float(p['car_id'])) == int(float(car_id))][0]
                row['license_plate_bbox_score'] = (
                    original_row['license_plate_bbox_score']
                    if 'license_plate_bbox_score' in original_row else '0')
                row['license_number'] = (
                    original_row['license_number']
                    if 'license_number' in original_row else '0')
                row['license_number_score'] = (
                    original_row['license_number_score']
                    if 'license_number_score' in original_row else '0')

            interpolated_data.append(row)

    return interpolated_data


# Load the CSV file
with open('test.csv', 'r') as file:
    reader = csv.DictReader(file)
    data = list(reader)

# Interpolate missing data
interpolated_data = interpolate_bounding_boxes(data)

# Write updated data to a new CSV file
header = ['frame_nmr', 'car_id', 'car_bbox', 'license_plate_bbox',
          'license_plate_bbox_score', 'license_number', 'license_number_score']
with open('test_interpolated.csv', 'w', newline='') as file:
    writer = csv.DictWriter(file, fieldnames=header)
    writer.writeheader()
    writer.writerows(interpolated_data)

After successful execution of this code, we get the output test_interpolated.csv, which we
then use to visualize the data using OpenCV.
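The idea behind the interpolation can be seen on a small example. The following sketch fills the frames between two known detections of the same car in plain Python; the function name interpolate_boxes and the example coordinates are hypothetical, and the script above uses scipy's interp1d for the same purpose.

```python
def interpolate_boxes(frame_a, box_a, frame_b, box_b):
    """Linearly interpolate boxes for the frames strictly between frame_a and frame_b."""
    gap = frame_b - frame_a
    filled = {}
    for frame in range(frame_a + 1, frame_b):
        t = (frame - frame_a) / gap  # 0 at frame_a, 1 at frame_b
        filled[frame] = [a + t * (b - a) for a, b in zip(box_a, box_b)]
    return filled


# the plate was detected in frames 10 and 13, but not in frames 11 and 12
filled = interpolate_boxes(10, [100, 200, 160, 230], 13, [130, 200, 190, 230])
for frame, box in filled.items():
    print(frame, box)
```

In this example the box moves 30 pixels to the right over three frames, so each filled frame shifts it by about 10 pixels. The interpolation only fills in positions: the license number itself is not interpolated, which is why the imputed rows carry a score of 0.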
4.1.6 Data Visualisation

We use the OpenCV library’s methods such as cv2.line(), cv2.rectangle() and cv2.putText() to
draw the borders around the vehicles and license plates, and further to show the inferred
license plate number in the video.
import ast
import cv2
import numpy as np
import pandas as pd


def draw_border(img, top_left, bottom_right, color=(0, 255, 0), thickness=10,
                line_length_x=200, line_length_y=200):
    x1, y1 = top_left
    x2, y2 = bottom_right

    cv2.line(img, (x1, y1), (x1, y1 + line_length_y), color, thickness)  # -- top-left
    cv2.line(img, (x1, y1), (x1 + line_length_x, y1), color, thickness)

    cv2.line(img, (x1, y2), (x1, y2 - line_length_y), color, thickness)  # -- bottom-left
    cv2.line(img, (x1, y2), (x1 + line_length_x, y2), color, thickness)

    cv2.line(img, (x2, y1), (x2 - line_length_x, y1), color, thickness)  # -- top-right
    cv2.line(img, (x2, y1), (x2, y1 + line_length_y), color, thickness)

    cv2.line(img, (x2, y2), (x2, y2 - line_length_y), color, thickness)  # -- bottom-right
    cv2.line(img, (x2, y2), (x2 - line_length_x, y2), color, thickness)

    return img


def parse_bbox(text):
    # collapse runs of spaces, then turn the remaining single spaces into commas
    # so that the stored string can be parsed as a Python literal
    return ast.literal_eval(
        text.replace('[ ', '[').replace('   ', ' ').replace('  ', ' ').replace(' ', ','))


results = pd.read_csv('./test_interpolated.csv')

# load video
video_path = 'sample.mp4'
cap = cv2.VideoCapture(video_path)

fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # Specify the codec
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter('./out.mp4', fourcc, fps, (width, height))

# for every car, keep the reading with the highest OCR score and a crop of its plate
license_plate = {}
for car_id in np.unique(results['car_id']):
    car_rows = results[results['car_id'] == car_id]
    max_ = np.amax(car_rows['license_number_score'])
    best = car_rows[car_rows['license_number_score'] == max_].iloc[0]
    license_plate[car_id] = {'license_crop': None,
                             'license_plate_number': best['license_number']}

    # jump to the frame the best reading came from and read it
    cap.set(cv2.CAP_PROP_POS_FRAMES, int(best['frame_nmr']))
    ret, frame = cap.read()

    x1, y1, x2, y2 = parse_bbox(best['license_plate_bbox'])
    license_crop = frame[int(y1):int(y2), int(x1):int(x2), :]
    license_crop = cv2.resize(license_crop, (int((x2 - x1) * 400 / (y2 - y1)), 400))
    license_plate[car_id]['license_crop'] = license_crop

frame_nmr = -1
cap.set(cv2.CAP_PROP_POS_FRAMES, 0)

# read frames
ret = True
while ret:
    ret, frame = cap.read()
    frame_nmr += 1
    if ret:
        df_ = results[results['frame_nmr'] == frame_nmr]
        for row_indx in range(len(df_)):
            row = df_.iloc[row_indx]

            # draw car
            car_x1, car_y1, car_x2, car_y2 = parse_bbox(row['car_bbox'])
            draw_border(frame, (int(car_x1), int(car_y1)), (int(car_x2), int(car_y2)),
                        (0, 255, 0), 25, line_length_x=200, line_length_y=200)

            # draw license plate
            x1, y1, x2, y2 = parse_bbox(row['license_plate_bbox'])
            cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 0, 255), 12)

            # paste the stored plate crop and its text above the car
            plate = license_plate[row['car_id']]
            license_crop = plate['license_crop']
            H, W, _ = license_crop.shape

            try:
                frame[int(car_y1) - H - 100:int(car_y1) - 100,
                      int((car_x2 + car_x1 - W) / 2):int((car_x2 + car_x1 + W) / 2), :] = license_crop

                frame[int(car_y1) - H - 400:int(car_y1) - H - 100,
                      int((car_x2 + car_x1 - W) / 2):int((car_x2 + car_x1 + W) / 2), :] = (255, 255, 255)

                (text_width, text_height), _ = cv2.getTextSize(
                    plate['license_plate_number'], cv2.FONT_HERSHEY_SIMPLEX, 4.3, 17)

                cv2.putText(frame,
                            plate['license_plate_number'],
                            (int((car_x2 + car_x1 - text_width) / 2),
                             int(car_y1 - H - 250 + (text_height / 2))),
                            cv2.FONT_HERSHEY_SIMPLEX, 4.3, (0, 0, 0), 17)
            except Exception:
                # the overlay does not fit inside the frame, so it is skipped
                pass

        out.write(frame)
        frame = cv2.resize(frame, (1280, 720))
        # cv2.imshow('frame', frame)
        # cv2.waitKey(0)

out.release()
cap.release()

After the successful execution of this file, we get the resultant video as out.mp4 which
contains the annotated video with the inferred license numbers.


4.2 Result Analysis

After successfully training the YOLOv8 model, we obtained the following results:
Precision 0.819
Recall 0.545
mAP50 0.617
mAP50-95 0.251
Precision is the ability of a model to identify only the relevant objects. It answers the
question: What proportion of positive identifications was actually correct? A model that
produces no false positives has a precision of 1.0. However, precision can be 1.0 even if the
model misses bounding boxes that should have been detected, because missed detections do not
count against it.
After training the model, we obtained a precision of 0.819, meaning that about 82% of the
predicted license plates were correct.
Recall is the ability of a model to find all ground truth bounding boxes. It answers the
question: What proportion of actual positives was identified correctly? A model that produces
no false negatives (i.e. there are no undetected bounding boxes that should be detected) has a
recall of 1.0. However, recall can still be 1.0 even if the model over-detects and produces
wrong bounding boxes.
After training the model, we obtained a moderate recall of 0.545, meaning that the model found
a little over half of the annotated license plates and missed the rest.
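The two definitions can be written down in a few lines. In the sketch below the counts of true positives, false positives, and false negatives are hypothetical and only illustrate the formulas; the final lines combine the precision and recall reported above into an F1 score, which is their harmonic mean.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# hypothetical counts: 60 correct plates, 15 wrong detections, 40 missed plates
p, r = precision_recall(tp=60, fp=15, fn=40)
print(p, r)

# harmonic mean of the precision and recall reported for our model
reported_p, reported_r = 0.819, 0.545
f1 = 2 * reported_p * reported_r / (reported_p + reported_r)
print(round(f1, 3))
```

With the hypothetical counts the model is right 80% of the time when it predicts a plate, but finds only 60% of the plates that are actually there.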
When comparing the performance of two machine learning models, the higher the Precision
Recall Curve, the better the performance. Judging a model from the plotted curve alone is
subjective, because the Precision Recall Curve often zigzags.
A more objective way to evaluate models is the AP (Average Precision), which represents the
area under the Precision Recall Curve (AUC). The closer the curve gets to the upper right
corner, the larger the area, so the higher the AP, and the better the machine learning model.
The mAP (mean Average Precision) is the average of the AP values over all classes; since our
model has a single class, its mAP equals the AP of that class. mAP50 is computed at an IoU
threshold of 0.5, while mAP50-95 averages the mAP over IoU thresholds from 0.5 to 0.95 in steps
of 0.05.
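As an illustration of the area under the Precision Recall Curve, the following sketch computes the AP with the all-point interpolation used in the Pascal VOC evaluation: the zigzag is first smoothed by replacing each precision value with the highest precision found at any higher recall, and the area is then summed as rectangles. The curve points are hypothetical, and the training library may interpolate the curve differently, so the numbers are not comparable with the results reported above.

```python
def average_precision(recalls, precisions):
    """Area under the precision-recall curve using the monotone precision envelope."""
    # pad the curve so that it spans recall 0 to 1
    r = [0.0] + list(recalls) + [1.0]
    p = [0.0] + list(precisions) + [0.0]
    # smooth the zigzag: walk from right to left keeping the running maximum
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # sum the rectangles between consecutive recall values
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# hypothetical curve with a dip in precision at recall 0.5
ap = average_precision([0.25, 0.5, 0.75], [1.0, 0.5, 0.6])
print(round(ap, 4))
```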

We also obtain the following graphs in /content/runs/detect/train3 directory:

4.2.1 Confusion Matrix
A confusion matrix is a summary table in machine learning that shows the number of true
positives, true negatives, false positives, and false negatives, providing a quick evaluation of a
classification model's performance.
4.2.2 Normalized Confusion Matrix

A normalized confusion matrix is a version of the confusion matrix where the counts are
converted to percentages, providing a quick view of classification performance in relative
terms.
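As an illustration of the normalization step, the sketch below divides every count by its column total. The layout (rows for the predicted class, columns for the true class) and the counts are assumptions made for this example and are not taken from our training run.

```python
def normalize_columns(matrix):
    """Divide every entry by its column total so that each column sums to 1."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    col_sums = [sum(matrix[r][c] for r in range(n_rows)) for c in range(n_cols)]
    return [
        [matrix[r][c] / col_sums[c] if col_sums[c] else 0.0 for c in range(n_cols)]
        for r in range(n_rows)
    ]


# Hypothetical counts; rows = predicted class, columns = true class
labels = ["number plate", "background"]
counts = [
    [60, 10],  # predicted "number plate"
    [40, 0],   # predicted "background" (missed plates)
]
normalized = normalize_columns(counts)
```

With these counts, the first column reads 0.6 and 0.4: 60% of the true plates are detected and 40% are missed, which is easier to compare across datasets of different sizes than the raw counts.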
4.2.3 F1-Confidence curve

An F1-Confidence curve is a graphical representation that shows how the F1 score (the
harmonic mean of precision and recall) varies as the confidence threshold for accepting a
detection changes.
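A minimal sketch of how such a curve can be computed is given below. It assumes that each detection has already been marked as correct or incorrect against the ground truth; the function name `f1_at_threshold` and the detections are hypothetical.

```python
def f1_at_threshold(detections, num_ground_truth, threshold):
    """F1 score when only detections with confidence >= threshold are kept.

    detections: list of (confidence, is_correct) pairs, where is_correct says
    whether the detection matched a ground truth box.
    """
    kept = [ok for conf, ok in detections if conf >= threshold]
    tp = sum(kept)
    fp = len(kept) - tp
    fn = num_ground_truth - tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical detections for a dataset containing four plates
detections = [(0.95, True), (0.90, True), (0.70, False), (0.60, True), (0.30, False)]
num_ground_truth = 4
thresholds = [0.1, 0.5, 0.8]
curve = [(t, f1_at_threshold(detections, num_ground_truth, t)) for t in thresholds]
best_threshold, best_f1 = max(curve, key=lambda point: point[1])
```

A low threshold keeps false positives and a high threshold discards correct detections, so the F1 score peaks in between. The peak of the curve is a reasonable starting point for the confidence threshold used at inference time.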
4.2.4 Labels Correlogram

A labels correlogram is a grid of pairwise plots that shows how the attributes of the
annotated bounding boxes (centre x, centre y, width and height) are distributed and how they
relate to one another across the dataset.
4.2.5 Precision-Confidence curve

A Precision-Confidence curve is a visual representation that illustrates how precision changes
at various confidence thresholds, helping to analyze the trade-off between precision and
confidence levels.
4.2.6 Precision-Recall curve

A Precision-Recall curve is a graphical representation illustrating the trade-off between
precision and recall at different confidence thresholds. The higher the curve lies, the better
the performance of the model.
4.2.7 Recall-Confidence curve

A Recall-Confidence curve illustrates how the recall (sensitivity) of the model changes at
different confidence thresholds. It helps assess the trade-off between recall and confidence
in the model's predictions.
4.2.8 Overall results comparing training data and validation data

Chapter 5
Conclusion and Future Scope
5.1 Conclusion
The integration of YOLO's object detection capabilities with EasyOCR's character recognition
yields a cohesive license plate detection system. The combined pipeline localizes license
plates and extracts alphanumeric data across a broad spectrum of designs and varying lighting
conditions. On our custom data, the trained YOLOv8 detector reached a precision of 0.819, a
recall of 0.545 and an mAP50 of 0.617, so most of its detections are correct although a
considerable share of plates is still missed. Together with its real-time processing
capabilities, this positions the system as a solution for high-demand applications like
traffic surveillance and security, where swift and accurate data extraction is paramount.
Notably, the system's excellence extends beyond its technical capabilities to its user interface,
offering an intuitive and seamless experience. Users can effortlessly input images, and the
system promptly furnishes reliable license plate information. This ease of interaction
underscores the project's commitment to not only technical proficiency but also user-centric
design, ensuring accessibility and convenience.
The project improves the precision, speed, and user-friendliness of license plate detection,
and its utility extends beyond a single domain to diverse real-world applications.
By combining current detection and recognition technologies, the system offers an efficient,
accurate, and accessible approach to license plate detection. Its value lies not only in its
technical capabilities but also in its potential to support the various sectors where accurate
identification and data extraction from license plates are pivotal.

5.2 Future Scope

The successful integration of YOLO and EasyOCR marks a significant milestone in license
plate detection systems. The work can be extended in the following directions:
1. Continuous Performance Enhancement: Future iterations can focus on refining algorithms
to further improve accuracy and speed. Fine-tuning the detection and recognition models
could boost performance in challenging scenarios, such as varying weather conditions or
extreme lighting.
2. Adaptation to New License Plate Formats: As license plate designs evolve or new formats
emerge, the system can be updated to accommodate these changes. This includes handling
different font styles, symbols, or variations in plate sizes across various regions.
3. Multilingual Support: Expanding the system's capabilities to recognize characters from
different languages opens doors for broader international use. Incorporating multilingual
support would enhance its applicability in global contexts.
4. Integration with Surveillance Systems: Integrating this technology into existing surveillance
infrastructure, such as CCTV networks or traffic cameras, could bolster law enforcement,
traffic management, and security measures.
5. Edge Computing Implementation: Optimizing the system for edge computing devices can
enable on-device processing, reducing reliance on cloud services. This would enhance privacy,
decrease latency, and make the system more adaptable for remote or resource-constrained
environments.
6. Machine Learning for Error Correction: Implementing machine learning algorithms to learn
from detection and recognition errors could refine the system's accuracy over time,
minimizing false positives or negatives.
7. Regulatory Compliance and Privacy Considerations: Future developments might involve
ensuring compliance with privacy regulations and ethical use of data collected through license
plate detection systems, addressing concerns related to data security and individual privacy.

References
[1] Doe, J., & Smith, A. (Year). "Continuous Performance Improvement of License Plate Detection
Algorithms." Transactions on Image Processing, vol. 4, no. 10, pp. 593-604
[2] Loh, W.Y., 2023. Logistic regression tree analysis. In Springer handbook of engineering
statistics (pp. 593-604). London: Springer London.
[3] Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data
Mining, Inference, and Prediction. Springer Science & Business Media.
[4] Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Vanderplas, J.
(2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct),
2825-2830.
[5] Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the
22nd acm sigkdd international conference on knowledge discovery and data mining (pp. 785-794).
[6] Reif, M., Shafait, F., Goldstein, M., & Breuel, T. (2012). Scene text recognition using similarity-
based queries. In Document Analysis and Recognition (ICDAR), 2011 International Conference on
(pp. 209-213). IEEE.
[7] van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual. Scotts Valley, CA:
CreateSpace.
[8] Raschka, S., & Mirjalili, V. (2019). Python Machine Learning, 3rd Edition. Packt Publishing Ltd.
[9] Bergstra, J., Yamins, D., & Cox, D. D. (2013). Making a Science of Model Search: Hyperparameter
Optimization in Hundreds of Dimensions for Vision Architectures. Proceedings of the 30th
International Conference on Machine Learning (ICML 2013), 115-123.
[10] Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic
Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321-357.
[11] McKinney, W. (2010). Data Structures for Statistical Computing in Python. Proceedings of the 9th
Python in Science Conference, 51-56.
[12] Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A. R., Jaitly, N., ... & Kingsbury, B. (2012).
Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four
Research Groups. IEEE Signal Processing Magazine, 29(6), 82-97.
[13] Rocca, D., & Muttillo, M. (2019). Introduction to Matplotlib. Journal of Open Source Education,
2(14), 38.
[14] Powers, D. M. (2011). Evaluation: from precision, recall and F1 to ROC, informedness,
markedness and correlation. Journal of Machine Learning Technologies, 2(1), 37-63.
[15] Kelleher, J. D., Mac Namee, B., & D'Arcy, A. (2015). Fundamentals of Machine Learning for
Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies. MIT Press.
[16] Chen, M., Mao, S., & Liu, Y. (2014). Big Data: A Survey. Mobile Networks and Applications,
19(2), 171-209.

INDIVIDUAL CONTRIBUTION REPORT

Ishita Gupta (2028095)
Abstract: This report outlines the development of a robust license plate detection system by
merging YOLOv8 for real-time object detection with EasyOCR for accurate Optical Character
Recognition (OCR). It aims to precisely locate license plates in images and efficiently extract
alphanumeric characters, addressing challenges like diverse designs and lighting conditions.
The system prioritizes accuracy, speed, and user-friendliness, with documentation covering
architecture and usage guidelines, promising enhanced detection effectiveness.
Individual contribution and findings: I was responsible for the conclusion of this project,
summarizing the overall idea and contributing to the project's final design. Throughout the
development phase, I also helped refine the code structure to keep it coherent and effective,
and I actively participated in the report generation process, offering insights and clarity to
enhance its overall quality. In addition, I collaborated with Priyal Vaidya on data collection
and preparation for YOLO model training, which involved sourcing relevant images, annotating
them, and optimizing them for the YOLO model's requirements. My contributions thus
encompassed ideation, code refinement, and collaborative report development, collectively
shaping a robust license plate detection project.
Individual contribution to project report preparation: In the project report, my
contributions involved addressing the problem statement, conclusion, and future scope. I also
gave shape to the contents page.
Individual contribution for project presentation and demonstration: In the
project presentation, I presented the problem statement, conclusion, and future scope, and I
also took part in training the model.
Full Signature of Supervisor: Full signature of the student:
…………………………….. …………………………….


Priyal Vaidya (2028103)
Individual contribution and findings: Within this project, my key focus and
contribution revolved around its initiation. I played a pivotal role in understanding the basic
concepts required for the project, significantly influencing its foundational design.
Furthermore, I actively participated in the report generation process, writing the introduction
and concepts sections and enriching the overall quality of the report. I collaborated with
Ishita Gupta on data collection and preparation for YOLO model training, which involved
sourcing relevant images, annotating them, and optimizing them for the YOLO model's
requirements.
Individual contribution to project report preparation: In the project report, my
primary responsibility centered on the introduction and concepts, contributing significantly
to the project's comprehensive understanding and insights.
Individual contribution for project presentation and demonstration: In the
project presentation, I took charge of the introduction and concepts, presenting key insights
and contributions related to this crucial aspect.
Full Signature of Supervisor: Full signature of the student:
…………………………….. …………………………….


Pratik Choudhary (2028101)
Individual contribution and findings: In the project, a pivotal aspect of my role was
the meticulous verification of data and finding results. This involved a systematic and
thorough check to guarantee that the data consistently conformed to the required format
throughout the project lifecycle. This responsibility underscored the importance of my role in
upholding data quality and supporting the effectiveness of the model.
Individual contribution to project report preparation: Within the project report, my
primary role involved meticulously validating data and reporting the results. I consistently
ensured the proper format, contributing to data integrity and model consistency, and I wrote
the problem statement.
Individual contribution for project presentation and demonstration: In the
project presentation, I took charge of writing about the results.
Full Signature of Supervisor: Full signature of the student:
…………………………….. …………………………….


Sayam Samal (2028107)
Individual contribution and findings: My individual contribution centered on training
the model and crafting the code for seamlessly interfacing the model with any video to detect
license plates. Additionally, I played a key role in implementing Optical Character
Recognition (OCR) to extract license plate numbers, enhancing the overall functionality of the
license plate detection system. Through these efforts, we achieved a robust and integrated
solution for accurate license plate recognition in videos.
Individual contribution to project report preparation: I played a central role in
documenting the YOLOv8 training process on our custom data and its application for license
plate detection. My contributions aimed at ensuring clarity and completeness in the project
report.
Individual contribution for project presentation and demonstration: In the
project presentation, I took charge of writing about the implementation (i.e., data
preparation, training on custom data, finding results, and validating data).
Full Signature of Supervisor: Full signature of the student:
…………………………….. …………………………….
