Traffic Sign Classification
BY
QASIM LAKDAWALA (19BT04020)
HUSAIN KALOLWALA (19BT04016)
INTRODUCTION
Traffic sign classification is the process of automatically recognizing traffic signs along the
road, including speed limit signs, yield signs, merge signs, etc. Being able to automatically
recognize traffic signs enables us to build “smarter cars”.
Self-driving cars need traffic sign recognition in order to properly parse and understand the
roadway. Similarly, “driver alert” systems inside cars need to understand the roadway around
them to help aid and protect drivers.
Traffic sign recognition is just one of the problems that computer vision and deep learning can
solve.
Our traffic sign dataset
The dataset we’ll be using to train our own custom traffic sign
classifier is the German Traffic Sign Recognition Benchmark
(GTSRB).
The top class (Speed limit 50km/h) has over 2,000 examples while the least
represented class (Speed limit 20km/h) has under 200 examples — that’s an
order of magnitude difference!
Our project contains three main directories and one Python module:

gtsrb-german-traffic-sign/
examples/
pyimagesearch

along with two scripts, train.py and predict.py. Our training script loads the data, compiles the model, trains, and outputs the serialized model and training history plot to disk. From there, our prediction script generates annotated images for visual validation purposes.
For this article, you’ll need to have the following packages installed:
OpenCV
NumPy
scikit-learn
scikit-image
imutils
Matplotlib
Luckily each of these is easily installed with pip, a Python package manager.
Let’s install the packages now, ideally into a virtual environment as shown (you’ll need to
create the environment first):

$ workon traffic_signs
$ pip install opencv-contrib-python numpy scikit-learn scikit-image imutils matplotlib
Using pip to install OpenCV is hands-down the fastest and easiest way to get
started with OpenCV. This method (as opposed to compiling from source) simply
checks prerequisites and places a precompiled binary that will work on most
systems into your virtual environment’s site-packages directory.
Our tf.keras imports are listed on Lines 2-9. We will be taking advantage of
Keras’ Sequential API to build our TrafficSignNet CNN (Line 2).

Line 11 defines our TrafficSignNet class, followed by Line 13, which defines our
build method. The build method accepts four parameters: the image width and height,
the depth, and the number of classes in the dataset.

Lines 16-19 initialize our Sequential model and specify the CNN’s inputShape.
Let’s define our first set of layers, a single CONV => RELU => BN => POOL block. This
set of layers uses a 5×5 kernel to learn larger features; it will help to distinguish
between different traffic sign shapes and color blobs on the traffic signs themselves.
# first set of (CONV => RELU => CONV => RELU) * 2 => POOL
model.add(Conv2D(16, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(Conv2D(16, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
# second set of (CONV => RELU => CONV => RELU) * 2 => POOL
model.add(Conv2D(32, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(Conv2D(32, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
These sets of layers deepen the network by stacking two sets of (CONV => RELU => BN)
layers before each pooling operation, allowing the network to learn richer features.
The head of our network consists of two sets of fully connected layers and a softmax
classifier:
Dropout is applied as a form of regularization which aims to prevent overfitting. The result is
often a more generalizable model.
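What dropout does can be illustrated with a minimal NumPy sketch, independent of Keras; the mask and rescaling below mirror standard "inverted dropout" (the batch of all-ones activations is fabricated for demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.ones((4, 8))   # a toy batch of activations
p = 0.5               # dropout rate

# randomly zero out roughly half the activations, scaling the
# survivors so the expected activation magnitude is unchanged
mask = rng.random(x.shape) >= p
x_dropped = x * mask / (1.0 - p)
```

At test time dropout is disabled; Keras handles this train/test switching automatically inside the Dropout layer.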
This completes our TrafficSignNet model definition; we will compile and train the
model in our train.py script next.
Inside train.py, our imports include matplotlib for plotting the training history and
scikit-learn’s classification_report for evaluating the model on the test set.
Let’s go ahead and define a function to load our data from disk:
def load_split(basePath, csvPath):
	# initialize the list of data and labels
	data = []
	labels = []

	# load the contents of the CSV file, remove the first line (since
	# it contains the CSV header), and shuffle the rows (otherwise
	# all examples of a particular class will be in sequential order)
	rows = open(csvPath).read().strip().split("\n")[1:]
	random.shuffle(rows)
The GTSRB dataset is pre-split into training/testing splits for us. Line 20 defines
load_split to load each split. It accepts a path to the base of the dataset as well
as a .csv file path which contains the class label for each image.

Lines 22 and 23 initialize our data and labels lists, which this function will soon
populate and return.

Line 28 loads our .csv file, strips whitespace, and grabs each row via the newline
delimiter, skipping the first header row. Line 29 then shuffles the rows randomly.

The result of Lines 28 and 29 can be seen here (i.e. if you were to print the first
three rows in the list via print(rows[:3])):
['33,35,5,5,28,29,13,Train/13/00013_00001_00009.png',
'36,36,5,5,31,31,38,Train/38/00038_00049_00021.png',
'75,77,6,7,69,71,35,Train/35/00035_00039_00024.png']
Let’s loop over the rows now and extract + preprocess the data that we need:

# loop over the rows of the CSV file
for (i, row) in enumerate(rows):
	# show a status update for every 1000th image processed
	if i > 0 and i % 1000 == 0:
		print("[INFO] processed {} total images".format(i))
	# split the row into components and then grab the class ID
	# and image path
	(label, imagePath) = row.strip().split(",")[-2:]
	# derive the full path to the image file and load the
	# image with scikit-image
	imagePath = os.path.sep.join([basePath, imagePath])
	image = io.imread(imagePath)

Line 32 begins a loop over the rows. Inside the loop, we proceed to:
Display a status update to the terminal for every 1000th image processed
(Lines 34 and 35).
Extract the ClassID (label) and imagePath from the row (Line 39).
Derive the full path to the image file + load the image with scikit-image (Lines
42 and 43).
Let’s preprocess our images by applying CLAHE now:

# resize the image to be 32x32 pixels, ignoring aspect ratio,
# and then perform Contrast Limited Adaptive Histogram
# Equalization (CLAHE)
image = transform.resize(image, (32, 32))
image = exposure.equalize_adapthist(image, clip_limit=0.1)

# update the list of data and labels, respectively
data.append(image)
labels.append(int(label))
# convert the data and labels to NumPy arrays
data = np.array(data)
labels = np.array(labels)

# return a tuple of the data and labels
return (data, labels)

After looping over all of the rows, we convert the data and labels lists into NumPy
arrays and return them to the calling function.
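To make the row-parsing logic concrete, here is a minimal, self-contained sketch using synthetic rows in the same format (no image files are read; the two rows are the example strings printed above, and the header line is assumed to follow the usual Kaggle GTSRB layout):

```python
import random

# synthetic Train.csv contents; the last two columns are the
# ClassId and image path
csv_text = """Width,Height,Roi.X1,Roi.Y1,Roi.X2,Roi.Y2,ClassId,Path
33,35,5,5,28,29,13,Train/13/00013_00001_00009.png
36,36,5,5,31,31,38,Train/38/00038_00049_00021.png"""

# drop the header row and shuffle (seeded for reproducibility)
rows = csv_text.strip().split("\n")[1:]
random.seed(42)
random.shuffle(rows)

# grab the class ID and image path from each row
labels = [int(row.strip().split(",")[-2]) for row in rows]
paths = [row.strip().split(",")[-1] for row in rows]
```

This is exactly the slicing the training script performs before handing each path to scikit-image for loading.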
With our
load_split
function defined, now we can move on to parsing command line
arguments:
# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
	help="path to input GTSRB dataset")
ap.add_argument("-m", "--model", required=True,
	help="path to output model")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
	help="path to training history plot")
args = vars(ap.parse_args())

Our --dataset, --model, and --plot command line arguments point to the input GTSRB
dataset, the output serialized model, and the output training history plot,
respectively.
NUM_EPOCHS = 30
INIT_LR = 1e-3
BS = 64
labelNames = open("signnames.csv").read().strip().split("\n")[1:]
labelNames = [l.split(",")[1] for l in labelNames]

Our class labelNames are loaded from the signnames.csv file; the header row is
skipped and the unnecessary ClassId column is discarded, leaving only the
human-readable sign names.
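For illustration, the same name extraction can be run on a couple of fabricated rows (the two sample sign names below are assumed; the real signnames.csv ships with the Kaggle GTSRB dataset):

```python
# hypothetical first rows of signnames.csv (ClassId,SignName)
signnames = """ClassId,SignName
0,Speed limit (20km/h)
1,Speed limit (30km/h)"""

# skip the header row and keep only the human-readable name
rows = signnames.strip().split("\n")[1:]
labelNames = [r.split(",", 1)[1] for r in rows]
```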
Now let’s go ahead and load + preprocess our data:
classTotals = trainY.sum(axis=0)
Derive paths to the training and testing splits (Lines 83 and 84).
Use our
load_split
function to load each of the training/testing splits, respectively (Lines 88 and 89).
Preprocess the images by scaling them to the range [0, 1] (Lines 92 and 93).
One-hot encode the training/testing class labels (Lines 96-98).
Account for skew in our dataset (i.e. the fact that we have significantly more images
for some classes than others). Lines 101 and 102 assign a weight to each class for use
during training.
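The class-weighting step can be sketched with a toy example (pure NumPy; the variable names mirror the script, and the 5-vs-1 class split is made up):

```python
import numpy as np

# toy one-hot label matrix: 5 examples of class 0, 1 of class 1
trainY = np.array([[1, 0]] * 5 + [[0, 1]], dtype="float32")

# total examples per class, then weight each class inversely
# proportional to its frequency
classTotals = trainY.sum(axis=0)
classWeight = {i: float(classTotals.max() / classTotals[i])
               for i in range(len(classTotals))}
```

Under-represented classes such as "Speed limit 20km/h" therefore contribute more to the loss, counteracting the order-of-magnitude class imbalance noted earlier.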
Let’s prepare our data augmentation object and train our model:
# construct the image generator for data augmentation
aug = ImageDataGenerator(
rotation_range=10,
zoom_range=0.15,
width_shift_range=0.1,
height_shift_range=0.1,
shear_range=0.15,
horizontal_flip=False,
vertical_flip=False,
fill_mode="nearest")
# initialize the optimizer and compile the model
opt = Adam(lr=INIT_LR, decay=INIT_LR / NUM_EPOCHS)
model = TrafficSignNet.build(width=32, height=32, depth=3,
	classes=numLabels)
model.compile(loss="categorical_crossentropy", optimizer=opt,
	metrics=["accuracy"])

# train the network
H = model.fit_generator(
	aug.flow(trainX, trainY, batch_size=BS),
	validation_data=(testX, testY),
	steps_per_epoch=trainX.shape[0] // BS,
	epochs=NUM_EPOCHS,
	class_weight=classWeight,
	verbose=1)
We build and compile our TrafficSignNet model with the Adam optimizer and learning
rate decay.
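Keras’ legacy optimizers apply time-based decay once per iteration. A sketch of the resulting schedule, assuming decay = INIT_LR / NUM_EPOCHS (the decay value itself is an assumption for illustration):

```python
INIT_LR = 1e-3
NUM_EPOCHS = 30

# assumed decay setting for this sketch
decay = INIT_LR / NUM_EPOCHS

def lr_at(iteration):
    # Keras' legacy time-based decay formula:
    # lr = lr0 / (1 + decay * iteration)
    return INIT_LR / (1.0 + decay * iteration)
```

The learning rate starts at INIT_LR and shrinks smoothly as training iterations accumulate.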
Lines 125-131 train the
model
using Keras’ fit_generator method. Notice the
class_weight
parameter is passed to accommodate the skew in our dataset.
Next, we will evaluate the
model
and serialize it to disk:
# evaluate the network
predictions = model.predict(testX, batch_size=BS)
print(classification_report(testY.argmax(axis=1),
	predictions.argmax(axis=1), target_names=labelNames))

# save the network to disk
model.save(args["model"])
Here we make predictions with the model on the testing set. From there, Lines 136
and 137 print a classification report in the terminal.
Line 141 serializes the Keras
model
to disk so that we can later use it for inference in our
prediction script.
Finally, the following code block plots the training accuracy/loss curves and exports the plot to
an image file on disk:
# plot the training loss and accuracy
N = np.arange(0, NUM_EPOCHS)
plt.style.use("ggplot")
plt.figure()
plt.plot(N, H.history["loss"], label="train_loss")
plt.plot(N, H.history["val_loss"], label="val_loss")
plt.plot(N, H.history["accuracy"], label="train_acc")
plt.plot(N, H.history["val_accuracy"], label="val_acc")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(args["plot"])
Take special note here that TensorFlow 2.0 has renamed the training history keys:
H.history["acc"]
is now
H.history["accuracy"]
.
H.history["val_acc"]
is now
H.history["val_accuracy"]
.
At this point, you should be using TensorFlow 2.0 (with Keras built-in), but if you aren’t, you
can adjust the key names (Lines 149 and 150).
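A defensive lookup covers both key spellings without editing the plot code by hand (the history dict below is fabricated for illustration):

```python
# fabricated training history using the old pre-TF2 key names
history = {"loss": [0.9, 0.4], "val_loss": [1.0, 0.5],
           "acc": [0.6, 0.8], "val_acc": [0.55, 0.75]}

# prefer the TensorFlow 2.0 names, fall back to the old ones
acc_key = "accuracy" if "accuracy" in history else "acc"
val_acc_key = "val_accuracy" if "val_accuracy" in history else "val_acc"
```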
Summary
In this project, we performed traffic sign classification and recognition with Keras
and deep learning, training our custom TrafficSignNet CNN on the GTSRB dataset.