TITLE: Automatic recognition of license plates

PROJECT PERIOD: February 5 - May 31, 2002

PROJECT GROUP: IN6-621

GROUP MEMBERS:
Henrik Hansen
Anders Wang Kristensen
Morten Porsborg Køhler
Allan Weber Mikkelsen
Jens Mejdahl Pedersen
Michael Trangeled

SUPERVISOR: Thomas Moeslund

NUMBER OF COPIES: 9

REPORT PAGES: 123

APPENDIX PAGES: 13

SYNOPSIS:
This report describes the analysis, design and implementation of a system for automatic recognition of license plates, considered a subsystem of an automatic speed control. The input to the system is a series of color images of moving vehicles, and the output consists of the registration number of the license plate.
Extraction of the desired information is done in three steps. First, the license plate is extracted from the original image, then the seven characters are isolated, and finally each character is identified using statistical pattern recognition and correlation.
The algorithms were developed using a set of training images and tested on images taken under varying conditions. The final program is capable of extracting the desired information in a high percentage of the test images.
This report has been written as a 6th semester project at the Institute of Electronic Systems at Aalborg University. The main theme of the semester is the gathering and description of information, and the goal is to collect physical data, represent these symbolically, and demonstrate techniques for processing these data. The report is mainly intended for the examiner and supervisor, along with future 6th semester students in Informatics.
This report includes the analysis, design and test of a system designed to automatically recognize license plates from color images. Source code and the corresponding executable program are included on the attached CD (see Appendix C for the full contents of the CD, as well as instructions for use).
We would like to thank the Aalborg Police Department for information used in the report, and for their guided tour of the present speed control system. Also, we would like to thank our supervisor on this project, Thomas Moeslund.
——————————— ———————————
Henrik Hansen Anders Wang Kristensen
——————————— ———————————
Morten Porsborg Køhler Allan Weber Mikkelsen
——————————— ———————————
Jens Mejdahl Pedersen Michael Trangeled
Contents
Introduction 13
I Analysis 15
1 Traffic control 17
1.1 Current system . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.1.1 Disadvantages of the current system . . . . . . . . . . . 18
1.2 Improved system . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.3 The components of the system . . . . . . . . . . . . . . . . . . . 21
1.3.1 Differences from the existing system . . . . . . . . . . . . 22
1.3.2 Project focus . . . . . . . . . . . . . . . . . . . . . . . . 23
1.4 License plates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.4.1 Dimensions . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.4.2 Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.4.3 Mounting and material . . . . . . . . . . . . . . . . . . . 24
1.5 System definition . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.6 Delimitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.6.1 License plate formats . . . . . . . . . . . . . . . . . . . . 25
1.6.2 Video processing . . . . . . . . . . . . . . . . . . . . . . 25
1.6.3 Identification of driver . . . . . . . . . . . . . . . . . . . 26
1.6.4 Transportation of data . . . . . . . . . . . . . . . . . . . 26
1.6.5 Quality of decisions . . . . . . . . . . . . . . . . . . . . . 26
II Design 27
3 Character isolation 51
3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2 Isolating the characters . . . . . . . . . . . . . . . . . . . . . . . 52
3.2.1 Static bounds . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2.2 Pixel count . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.2.3 Connected components . . . . . . . . . . . . . . . . . . . 53
3.2.4 Improving image quality . . . . . . . . . . . . . . . . . . 55
3.2.5 Combined strategy . . . . . . . . . . . . . . . . . . . . . 56
3.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4 Character identification 59
4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.2 Template matching . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3 Statistical pattern recognition . . . . . . . . . . . . . . . . . . . 60
4.4 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.4.1 Area . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
III Test 83
5 Extraction test 85
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.2 Region growing and Hough transform . . . . . . . . . . . . . . . 85
5.2.1 Criteria of success . . . . . . . . . . . . . . . . . . . . . . 85
5.2.2 Test data . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.3 Test description . . . . . . . . . . . . . . . . . . . . . . . 86
5.2.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
5.3 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.3.1 Criteria of success . . . . . . . . . . . . . . . . . . . . . . 88
5.3.2 Test data . . . . . . . . . . . . . . . . . . . . . . . . . . 88
5.3.3 Test description . . . . . . . . . . . . . . . . . . . . . . . 88
5.3.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
5.4 Combined method . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.4.1 Criteria of success . . . . . . . . . . . . . . . . . . . . . . 90
5.4.2 Test data . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.4.3 Test description . . . . . . . . . . . . . . . . . . . . . . . 90
5.4.4 Result of test . . . . . . . . . . . . . . . . . . . . . . . . 90
5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6 Isolation test 93
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.2 Criteria of success . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.3 Test description . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7 Identification test 99
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7.2 Criteria of success . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7.3 Test data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
7.4 Feature based identification . . . . . . . . . . . . . . . . . . . . 100
7.4.1 Test description . . . . . . . . . . . . . . . . . . . . . . . 100
7.4.2 Result of test using Euclidian distance . . . . . . . . . . 100
7.4.3 Result of test using Mahalanobis distance . . . . . . . . 102
7.5 Identification through correlation . . . . . . . . . . . . . . . . . 104
7.5.1 Test description . . . . . . . . . . . . . . . . . . . . . . . 104
7.5.2 Result of test . . . . . . . . . . . . . . . . . . . . . . . . 104
7.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
IV Conclusion 111
9 Conclusion 113
V Appendix 115
A Videocamera 117
A.1 Physical components . . . . . . . . . . . . . . . . . . . . . . . . 118
C Contents of CD 125
C.1 The program . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
VI Literature 127
List of Figures
List of Tables
8.1 Overall result for the system . . . . . . . . . . . . . . . . . . . . 108
Introduction

Many traffic accidents in Denmark are caused by high speed, and reports show that speed control reduces speed because it has a deterrent effect on drivers [8]. Therefore a new and more efficient form of speed control has been introduced: a partially automatic control where a photo of the offender is taken, instead of the traditional method where the police had to pull over each driver.
Although the current system is far more effective than the old system, it still
has some shortcomings. The goal of this project is to investigate the possibility
of creating an improved system which alleviates some of these shortcomings.
We want to examine the possibility of automating the work flow of the system. The reason behind this choice is that the major disadvantage of the current system is its use of manual labor for tasks that we believe could nowadays be handled just as efficiently by computers. A secondary goal of this project is to explore the possibility of replacing parts of the costly equipment currently used with more reasonably priced equipment.
Part I
Analysis
Chapter 1
Traffic control
The goal of this chapter is to create a system definition which describes the sys-
tem we want to develop. To identify the shortcomings of the current partially
automatic speed control system, and to identify the parts of the system which
arguably can be further automated, both the current system and a proposal
for an improved system will be described in this chapter.
the pictures were originally taken. Here they receive a digitized version of the
processed film. The entire procedure is illustrated in Figure 1.2.
Two persons are required for the manual registration of the license plates. The first person uses a software tool to optimize the pictures so that driver and plate appear clearly. The license plate is then registered along with the speed of the vehicle for each picture. If the vehicle was not speeding, or if the face of the driver does not appear clearly, the picture is discarded. The second person verifies that the characters of the license plate have been correctly identified. If the vehicle is identifiable, a fine is issued to the offender. The entire process takes about 3-4 days from the time the picture is taken until the fine is received by the offender.
1.2 Improved system
[Figure 1.2: the current partially automatic speed control system. A manned control station with radar and camera records offenders; the film is processed at a central development laboratory, after which the license plates are manually registered and manually verified.]
vehicle; this is an additional 700,000 DKK for the speed control equipment itself. Apart from the cost of materials, the manual registration of license plates requires that the personnel are trained in the use of the previously mentioned software tool.
[Figure 1.3: the improved system. An unmanned control station records a video sequence of passing vehicles, and registration of license plates is automatic.]
image analysis on a computer using the sequence as input. The lefthand side
of Figure 1.3 shows this part.
The purpose of the image analysis is twofold: first, it should be determined for each car in the sequence whether an actual traffic violation took place. This means determining the speed of the vehicles based on the video sequence. Second, any offenders should be identified by reading the license plates of their cars, so that they can later be held responsible.
Developing reliable systems for extracting the information needed for speed
estimation and vehicle identification is very important. In the case of estimat-
ing the speed of vehicles, the consequence of reaching the conclusion that a
vehicle drove too fast, when in reality it did not, is much greater than letting
a real offender slip by.
In the case of identifying license plates it is even worse if a wrong conclusion
is reached, since this would mean that an innocent person would be fined for
the actions of someone else. Therefore, the algorithms must be designed to
refuse to make such a decision if it is estimated that the probability of being
correct is below some predetermined threshold. An alternative would be a
manual decision in cases of doubt. Since the current system requires manual verification, it is reasonable to assume that the same person could perform this task, verifying the computer's output instead of that of a co-worker.

1.3 The components of the system
[Figure: flowchart of the improved system. The speed of each vehicle is determined; if a speed violation occurred, the characters of the license plate are isolated.]
with a vehicle exceeding the speed limit. The subsystem consists of four tasks: the first task selects the optimal frame of the incoming video sequence; the second extracts the region believed to contain the license plate; the third isolates the seven characters; and the last identifies the individual characters.
[Figure 1.5: the four tasks of the recognition subsystem, from the input video sequence to the identified characters.]
Figure 1.5 shows the four tasks of the system. The right half of the figure
shows the input and output of these steps all the way from the input video
sequence to the final identified characters of the license plate. Alternatively
this progression could be viewed as the reduction or suppression of unwanted
information from the information carrying signal, here a video sequence con-
taining vast amounts of irrelevant information, to abstract symbols in the form
of the characters of a license plate.
Component                          In project
Speed
  - Recording video
  - Speed determination
Recognition
  - Selecting optimal frame
  - Extracting license plate        ✓
  - Isolating characters in plate   ✓
  - Identifying characters          ✓

Table 1.2: Areas of the improved system addressed in this project

1.4 License plates

1.4.1 Dimensions
All license plates must be either rectangular or square. The dimensions of the plate can vary depending on the vehicle type, but the most common license plate is the rectangular one shown in Figure 1.6. The plate has the dimensions (w × h) 504 × 120 mm and has all seven characters written in a single line.
Some vehicles use 'square' plates with characters written in two lines, but according to the guideline for vehicle inspection [2], all front side plates must be rectangular, and therefore this is the only license plate shape this project will focus on. Other license plate dimensions are allowed for different vehicle types such as tractors and motorbikes.
1.4.2 Layout
1.4.2 Layout

The front side license plate on all Danish vehicles is either white enclosed by a red frame, or yellow enclosed by a black frame. All standard license plates have two capital letters ranging from A to Z, followed by five digits ranging from 0 to 9. Besides the standard license plate, there is the option of buying a so-called wish-plate, a customized license plate where the vehicle owner can freely choose from 1 to 7 letters or digits to fill the plate. All letters and digits are written in black, on both yellow and white license plates.
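The standard layout described above (two capital letters followed by five digits) can be checked with a simple pattern. The following sketch is illustrative only and not part of the report's implementation; the function name and the use of Python are assumptions:

```python
import re

# Standard Danish plate: two capital letters followed by five digits.
# Customized wish-plates (1 to 7 freely chosen characters) are excluded here,
# matching the project's focus on the standard format.
STANDARD_PLATE = re.compile(r"^[A-Z]{2}[0-9]{5}$")

def is_standard_plate(text: str) -> bool:
    """Check whether a recognized character string has the standard layout."""
    return STANDARD_PLATE.fullmatch(text) is not None
```

Such a check could serve as a final sanity test on the recognized character string.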
1.4.3 Mounting and material

The front side license plate must be mounted horizontally, and in an upright position seen from the side of the car. In other words, the textual part must face directly forward. The plate may not be changed in shape, decorated or embellished, and the plate may not be covered by any means. Bolts used for mounting may not affect the readability of the plate and must be painted in the same color as the mounting point on the plate. The reference guide says nothing about the material, but generally the plates are made of a reflective material for increased readability.
1.5 System definition
This project will focus on the design of algorithms used for extracting the license plate from a single image, isolating the characters of the plate and identifying the individual characters.
1.6 Delimitation
This section concludes the chapter by summing up the delimitations for the
system. The delimitations are grouped by subject.
Part II
Design
Chapter 2
License plate extraction
2.1 Introduction
Before isolating the characters of the license plate in the image, it is advanta-
geous to extract the license plate.
This chapter presents three different extraction strategies: the Hough transform, template matching and region growing. First the theory behind each method is developed, then its strengths and weaknesses are summarized, and finally, in the last section of the chapter, the three strategies are combined. The common goal of all the strategies is, given an input image, to extract the license plate from it.
The first two assumptions are trivially fulfilled, since the license plate is
bright white or yellow and the size is known (see Section 1.4). The last two
assumptions are based on the fact that the camera should always be aligned
with the road. This is necessary in order to reduce perspective distortion.
Several other objects commonly found in the streets of urban areas fit the
above description as well. These are primarily signs of various kinds. This
should be taken into account when finding locations for placing the camera
in the first place. Even if some of these objects are mistakenly identified,
this is not a great problem since later processing steps will cause these to be
discarded.
Throughout the chapter the same source image is used in all examples.
This image is shown in Figure 2.1. The methods have been developed using
36 training images, and they can all be found on the attached CD-ROM along
with 72 images used for testing.
2.2 Hough transform
The first step is to threshold the gray scale source image. Then the resulting
image is passed through two parallel sequences, in order to extract horizontal
and vertical line segments respectively.
The first step in both of these sequences is to extract edges. The result is
a binary image with edges highlighted. This image is then used as input to
the Hough transform, which produces a list of lines in the form of accumulator
cells. These cells are then analyzed and line segments are computed.
Finally the list of horizontal and vertical line segments are combined and
any rectangular regions matching the dimensions of a license plate are kept as
candidate regions. This is also the output of the algorithm.
[Figure: abstraction levels of the Hough transform method. Thresholding produces a binary image; edge detection yields a binary image with edges highlighted; the Hough transform produces accumulator cells corresponding to lines; and combining line segments yields the candidate regions.]
The choice of kernels was partly based on experiments and partly because
they produce edges with the thickness of a single pixel, which is desirable input
to the Hough transform.
Figure 2.3 shows the two kernels applied to the inverted thresholded source image. The image is inverted to make it easier to show in this report. Since the two filters are high pass filters, which approximate the partial derivatives in either the horizontal or the vertical direction, a value of plus one in the resulting image corresponds to a transition from black to white, and a value of minus one corresponds to the opposite situation. Since it is of no importance which transition it is, the absolute value of each pixel is taken before proceeding.
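As a sketch of this step (the report's exact kernels are not reproduced here; a simple two-element difference kernel is assumed for illustration), the directional edge extraction with the absolute value applied might look like:

```python
import numpy as np

def directional_edges(binary, axis):
    """Approximate the partial derivative of a thresholded (0/1) image along
    one axis and keep the magnitude, so that both black-to-white and
    white-to-black transitions appear as edges one pixel thick.

    A two-element difference kernel [-1, 1] is assumed here; the report's
    actual kernels may differ.
    """
    diff = np.diff(binary.astype(int), axis=axis)
    # Both +1 and -1 transitions are kept, since the direction of the
    # transition is of no importance.
    return np.abs(diff)
```

Differentiating along axis 1 (columns) responds to vertical edges, the input for detecting vertical lines; axis 0 likewise for horizontal edges.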
yi = a·xi + b    (2.2)
⇔ b = −xi·a + yi    (2.3)
For a fixed (xi , yi ), Equation (2.3) yields a line in parameter space and a
single point on this line corresponds to a line through (xi , yi ) in the original im-
age. Finding lines in an image now simply corresponds to finding intersections
between lines in parameter space.
In practice, Equation (2.3) is never used, since the parameter a approaches ∞ as the line becomes vertical. Instead the following normal form is used:

ρ = x·cos θ + y·sin θ    (2.4)
In Equation (2.4) the θ parameter is the angle between the normal to the
line and the x-axis and the ρ parameter is the perpendicular distance between
the line and the origin. This is also illustrated in Figure 2.4. Also in contrast
to the previous method, where points in the image corresponded to lines in
[Figure 2.4: the normal parameterization. Points (x1, y1) and (x2, y2) on a line with parameters (ρ1, θ1) in the image map to the single accumulator cell (ρ1, θ1) after the Hough transform.]
is that only horizontal and vertical lines are considered. This corresponds to
computing the Hough transform only in a small interval around either 0 or
π/2. Since only a single ‘column’ of the accumulator array is ever computed,
this vastly decreases both the memory requirements and processing time of the
algorithm.
After the algorithm has iterated over all points, the accumulator cells con-
tain the number of points which contributed to that particular line. Finding
lines is then a matter of searching the accumulator array for local maxima.
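A minimal version of this restricted voting scheme could look as follows; array sizes, resolutions and the function name are illustrative assumptions, not the report's implementation:

```python
import numpy as np

def hough_accumulator(points, thetas, rho_res=1.0, rho_max=600.0):
    """Vote edge points into a (rho, theta) accumulator using the normal
    form rho = x*cos(theta) + y*sin(theta).

    `thetas` is a small set of angles (e.g. around 0 or pi/2 only), which is
    the restriction described above: only near-horizontal and near-vertical
    lines are searched, so only a few 'columns' of the array are computed.
    """
    n_rho = int(2 * rho_max / rho_res) + 1
    acc = np.zeros((n_rho, len(thetas)), dtype=int)
    for x, y in points:
        for j, theta in enumerate(thetas):
            rho = x * np.cos(theta) + y * np.sin(theta)
            i = int(round((rho + rho_max) / rho_res))  # shift rho to a valid index
            if 0 <= i < n_rho:
                acc[i, j] += 1
    return acc
```

After voting, each cell holds the number of contributing points, and searching for local maxima yields the detected lines.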
Figure 2.5 shows an ideal case, where the longest line segments from the
image have been extracted. The short line segments in the letters and in the
top part have been eliminated.
The first step is sorting the points along the line. This is done by first
finding the parameterized equation of the line. A cell with parameters (ρi , θi )
corresponds to a line with Equation (2.5).
(x, y) = t·(cos(θi + π/2), sin(θi + π/2)) + (ρi·cos θi, ρi·sin θi)    (2.5)
Computing t for each point and sorting by t is now trivial. The result of this step is a list of line segments.
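This sorting step can be sketched as a direct transcription of Equation (2.5); the function name below is an assumption:

```python
import numpy as np

def sort_points_along_line(points, rho, theta):
    """Sort points by their parameter t in the line equation
    (x, y) = t*(cos(theta+pi/2), sin(theta+pi/2)) + (rho*cos(theta), rho*sin(theta)).

    t is recovered by projecting each point, shifted by the foot of the
    normal (rho*cos(theta), rho*sin(theta)), onto the unit direction vector.
    """
    d = np.array([np.cos(theta + np.pi / 2), np.sin(theta + np.pi / 2)])
    p0 = np.array([rho * np.cos(theta), rho * np.sin(theta)])
    ts = [float(np.dot(np.array(p) - p0, d)) for p in points]
    order = np.argsort(ts)
    return [points[i] for i in order]
```

Once the points are ordered along the line, consecutive runs of nearby points form the line segments.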
2. The ratio between their average length and distance should equal that
of a standard license plate.
The resulting two lists of regions, one with horizontal and one with ver-
tical pairs, are then compared and any region not contained in both lists are
discarded. The remaining regions are the final candidate regions.
Strengths Explanation
Scaling invariant: Since the algorithm does not look for regions of a particular size, it is invariant to scaling of the license plate.
Weaknesses Explanation
Trouble detecting vertical lines: Vertical lines in the license plate are typically more than a factor of four shorter than the horizontal lines, and thus more susceptible to noise.
Finds more than just the license plate: All rectangular regions with dimensions equal to those of a license plate are identified, which is sometimes many. This makes it difficult to choose the correct candidate later.

Table 2.2: Weaknesses of the Hough transform method

2.3 Template matching

Image preprocessing
When examining the source images in the training set, it is quite obvious that there are two major differences among the plates: the plates vary in size because the cars are at different distances, and they vary in light intensity due to the lighting conditions when the images are taken. Both are issues that cannot be handled by the template construction. The size issue cannot be helped at all. This observation might very well turn out to be critical for this approach, but for now that fact will be ignored and the attention turned to the lighting issue, which can be helped by proper image preprocessing.
A bit simplified, the goal is to make the plates look similar. This can be done by thresholding the image. The threshold value is not calculated dynamically, because the image lighting is unpredictable, and the different locations and colors of cars make it impossible to predetermine the black (or white) pixel percentage needed for dynamic thresholding. Therefore, the value giving the best result on a number of test images is chosen. Figure 2.6 shows two images with very similar license plates but different overall lighting, to demonstrate that a dynamic threshold is hard to accomplish.
Also, a license plate has a border, which in the processed image will appear solid black, and a bit thicker at the bottom. It is, however, not a good idea to add this feature to the template: even a small rotation of the license plate means that the correlation would yield a large error.
A subtemplate for the letters can be constructed by adding a layer for each letter of the alphabet and then merging the layers, giving each layer an opacity equal to the probability of the letter. In the same way a subtemplate can be made for the digits. Ignoring the diversity of the license plates on the road today and combination restrictions, each letter is given the same probability, as is each digit. Assuming that letters have the same probability does not reduce the efficiency of the system. Doing so, we end up with a gray-scale template, as seen in Figure 2.7.
Figure 2.7: Template created through opacity-merging of the probabilities. The left
hand side shows a letter template, and the right hand side a digit template
d²f,t(u, v) = Σx Σy [f(x, y) − t(x − u, y − v)]²    (2.6)
[Figure: the template t(x − u, y − v) is shifted over the input image, producing an output image of match values.]
is constant, since it is the value of the sum of pixels in the entire template
squared.
Assuming that the term Σx Σy f²(x, y) can be regarded as constant as well means assuming that the light intensity of the image does not vary over regions the size of the template, anywhere in the image.
This is useful, because based on this assumption, the remaining term as
expressed in Equation (2.8) becomes a measure for the similarity between
the image and the template. This measure for similarity is called the cross
correlation.
c(u, v) = Σx Σy f(x, y)·t(x − u, y − v)    (2.8)
The assumption on which the validity of the measure is based is, however, somewhat frail. The term Σx Σy f²(x, y) is only approximately constant for images in which the image energy varies only slightly. In most images this is not the case. The effect of this is that the correlation value might be higher in bright areas than in areas where the template is actually matched. Also, the range of the measure is totally dependent on the size of the template.
These issues are addressed in the normalized cross correlation.
γ(u, v) = Σx Σy [f(x, y) − f̄u,v]·[t(x − u, y − v) − t̄] / √( Σx Σy [f(x, y) − f̄u,v]² · Σx Σy [t(x − u, y − v) − t̄]² )    (2.9)

Here f̄u,v is the mean value of the image pixels in the region covered by the template, and t̄ is the mean value of the template. The value of γ lies between −1 and 1, where −1 corresponds to a reversed match and 1 to a perfect match; the value approaches 0 when there is no match.
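A direct transcription of Equation (2.9) at a single offset might look like this; the function name and indexing convention are assumptions, and a real implementation evaluates it for every (u, v):

```python
import numpy as np

def normalized_cross_correlation(f, t, u, v):
    """Compute gamma(u, v) of Equation (2.9) at one offset: the template t is
    compared against the equally sized window of image f whose top-left
    corner is at column u, row v. Both window and template are mean-centered
    and the product sum is normalized by the two energies, so the result lies
    in [-1, 1] regardless of template size and local image brightness."""
    h, w = t.shape
    window = f[v:v + h, u:u + w].astype(float)
    t = t.astype(float)
    fw = window - window.mean()   # f(x, y) - mean over the covered region
    tm = t - t.mean()             # t(x - u, y - v) - template mean
    denom = np.sqrt((fw ** 2).sum() * (tm ** 2).sum())
    if denom == 0.0:
        return 0.0  # flat window or flat template: no meaningful match
    return float((fw * tm).sum() / denom)
```

A perfect match gives 1 and a reversed (inverted) match gives −1, matching the properties stated above.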
As can be seen from the expression for the cross correlation coefficient (Equation (2.9)), computing it is an expensive task: the coefficient has to be calculated for each pixel in the output image. Assuming an image of size M², a template of size N², and not including the normalization (only the numerator of Equation (2.9)), the calculation involves approximately N²(M − N + 1)² multiplications and the same number of additions [5]¹.
¹ For all the complexity evaluations it should be noted that these are estimates, and vary depending on the method of implementation.
Weaknesses Explanation
Slow algorithm: A large input image and a smaller template will make performing the simple calculations in the many nested summations a demanding task.
Not invariant to rotation and perspective distortion: If the region sought after is rotated or distorted in the input image, the region may very well bear little or no resemblance to the template on a pixel by pixel basis. This means the similarity measurement will fail.
Not invariant to scaling: Scaling of input images proves to be an insurmountable problem. It is impossible to examine the input image using all possible sizes for the template, and even the smallest variation in size will often lead to a wrong result.
Static threshold: The images vary a great deal in overall brightness, depending on the surroundings.

Table 2.4: Weaknesses of the template matching method
2.4 Region growing
[Figure: flowchart of the region growing algorithm. Each pixel is tested ('Does pixel fulfil requirements?') and, if so, the region is expanded for as long as expansion is possible.]

Since license plates are made from a reflective material (see Section 1.4), they will usually appear
brighter than the rest of the vehicle. Exceptions occur with white cars, but
in general, looking at the brightness is an effective way of finding possible
candidates for license plates. Of course, it has to be considered, that the
characters inside the plate are black. This is helpful when a license plate has
to be distinguished from e.g. other parts of a white vehicle. License plates also
have well defined dimensions.
Combining the two features above, all bright rectangular regions with a
certain height-width ratio should be considered candidates for license plates.
2.4.2 Preprocessing
Before searching through the image for pixels bright enough to be part of a license plate, the image is prepared by converting it to binary. This is done since color plays no role when looking for bright pixels. In performing this conversion, the threshold value is critical to whether or not the license plate can be distinguished from its surroundings. On the one hand, the threshold should be low enough to convert all pixels of the license plate background into white; on the other hand, it should be chosen high enough to convert as much of the rest of the image as possible into black. Figure 2.10 shows
an example of such a binary image. The license plate appears as a bright
rectangle, but there are several other white regions.
A problem arises since the overall brightness of the images is not known in advance; therefore it is reasonable to select a relatively low threshold, to guarantee that the background of the license plates is always converted into white. Other criteria, such as the ratio mentioned above, will then have to help in selecting the most probable candidate for a license plate from among the list of white regions.
Since the method of transforming a region into the largest contained rect-
angle is susceptible to noise, it is reasonable to assume that the second method
will prove to give the best results.
An enhancement to the algorithm can be achieved by setting a dynamic criterion for when a neighboring pixel belongs to the same region. Instead of a static threshold dividing the picture into black and white pixels, the criterion could be that the neighboring pixel must not differ by more than a given margin in brightness. This would mean that license plates partly covered in shadow could be seen as one coherent region, but it also introduces the risk of letting the license plate region expand beyond the real license plate, if the border is not abrupt.
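The expansion step can be sketched as follows. This is a minimal iterative variant operating on an already thresholded binary image (the report discusses a recursive formulation; the function names and the ratio tolerance below are illustrative assumptions):

```python
from collections import deque

def grow_region(binary, seed):
    """Collect the 4-connected white region containing `seed` in a binary
    image (list of rows of 0/1). A queue is used instead of recursion, which
    avoids the memory cost of deep recursive calls."""
    rows, cols = len(binary), len(binary[0])
    r0, c0 = seed
    if binary[r0][c0] != 1:
        return set()
    region = {(r0, c0)}
    queue = deque([(r0, c0)])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and binary[nr][nc] == 1 and (nr, nc) not in region):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

def plate_like(region, target_ratio=504 / 120, tolerance=0.3):
    """Judge a region by the height-width ratio criterion: the bounding box
    of a candidate should roughly match the 504 x 120 mm standard plate.
    The tolerance value here is an assumption for illustration."""
    rs = [r for r, _ in region]
    cs = [c for _, c in region]
    height = max(rs) - min(rs) + 1
    width = max(cs) - min(cs) + 1
    return abs(width / height - target_ratio) <= tolerance * target_ratio
```

Each white pixel seeds at most one region, and regions passing the ratio test become license plate candidates.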
Strengths Explanation
Fast algorithm: Each pixel is examined no more than once for each neighbor. This implies an O(n) algorithm.
Invariant to distance between camera and vehicle: The method extracts candidates with the correct shape; it does not depend on the size of regions.
Resistant to noise: The region is expanded to the largest possible rectangle based on maximum and minimum values.

Table 2.5: Strengths of the region growing method

Weaknesses Explanation
High demands for memory: The recursive nature of the algorithm stores temporary results for each call to the recursive function.
Static threshold: The images vary a great deal in overall brightness, depending on the surroundings.

Table 2.6: Weaknesses of the region growing method
2.5 Combining the methods
It cannot be guaranteed that the images are taken at the same distance from each vehicle. Thus, the criteria must be invariant with respect to the distance between camera and vehicle; in other words, it is crucial that the method is scaling invariant.
Due to the scaling problem, template matching will not be able to extract license plates single-handedly. It might, however, be very useful as an extension to the other methods examined in this chapter, because it can aid in the evaluation of the intermediate results.
In real life, the assumption that license plates are aligned with the axes might not hold. It might not be possible to place the camera under optimal conditions, resulting in perspective and/or rotation distortion. Therefore, a more robust system with some invariance to these factors would be desirable. Here the region growing method seems like a good choice. But since the system will be expected to function in all weather conditions, it would be preferable that the method is as insensitive to changes in lighting conditions as possible. The Hough transform has that quality to some extent.
Combining the two methods ideally makes sure the system finds the plate
under all conditions. Both methods will produce a set of possible regions,
which serves as input to a method which is to determine the most probable
candidate.
Correlation
First, correlation takes its turn to sort out regions that bear no resemblance to license plates. This step removes regions such as bright areas of the sky.
Peak-and-valley
This method is designed to sort out any uniform regions, such as pieces of the road. It works by examining a horizontal projection of the candidate region. In this projection, the number of sudden transitions from a high number of black pixels to a low number, and vice versa, is counted. If this number is below an experimentally determined threshold, the region cannot contain a license plate.
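The counting step can be sketched as follows, assuming the projection is given as a list of per-column black-pixel counts and that a "sudden transition" means crossing a fixed level; the level and the example projections are hypothetical:

```python
def peak_and_valley_count(projection, level):
    """Count transitions in a projection: each crossing of `level`
    (in either direction) is one sudden high/low transition."""
    transitions = 0
    prev_high = projection[0] > level
    for value in projection[1:]:
        high = value > level
        if high != prev_high:
            transitions += 1
        prev_high = high
    return transitions

road = [2, 3, 2, 2, 3, 2, 2]               # uniform region: no transitions
plate = [1, 9, 1, 8, 1, 9, 1, 8, 1]        # character strokes: many transitions
print(peak_and_valley_count(road, 5), peak_and_valley_count(plate, 5))  # → 0 8
```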
Height-width ratio
Although a region with the license plate will sometimes contain some edges around the actual plate, the ratio provides a means of sorting out many regions that could not possibly contain a license plate. If more than one region passes all three criteria, the one with the best ratio is selected.
These methods have been designed to complement each other, so that various types of regions can be sorted out from true license plates. Figure 2.13 illustrates how the three steps each sort out a certain type of region. The first region from the left will not pass a correlation test, since the black pixels do not cover the correct areas. When resized, both regions 2 and 3 pass this test, but region 2 is too uniform to pass the peak-and-valley test. Now only region 3 and the real license plate are left, and the height-width ratio easily determines which is the better alternative.
The performance of the combined method will be examined in Section 5.4, along with a discussion of the order in which the methods should be applied.
2.6 Summary
This chapter introduced three methods for finding candidate regions for the
license plate. While region growing and the Hough transform seem to be viable algorithms, the correlation scheme suffers from a scaling problem, which prevents it from being a real alternative. It was also demonstrated how the selection of the most probable candidate takes place, with three different methods that complement each other's weaknesses. One of these was correlation, which proved to be a good discriminator when sorting out regions that do not contain a license plate.
Chapter 3
Character isolation
3.1 Introduction
To ease the process of identifying the characters, it is preferable to divide the
extracted plate into seven images, each containing one isolated character. This
chapter describes several methods for the task of isolating the characters.
Since no color information is relevant, the image is converted to binary colors before any further processing takes place. Figure 3.1 shows the ideal process of dividing the plate into seven images, each containing one character.
The advantage of this method is its simplicity, and that its success does not depend on the image quality (assuming that the license plate extraction is performed satisfactorily). Its weakness is the fairly high risk of choosing wrong bounds and thereby making the identification of the characters difficult. This risk depends directly on the quality of the license plate extraction output.
3.2 Isolating the characters
The first character on the lowest plate in Figure 3.2 shows an example
output from the plate isolation, where a portion of the mounting frame was
included. Another weakness is that the method only separates the characters
instead of finding the exact character bounds.
Provided that the license plate extraction and the removal of the frame succeed, this method is very useful, since it is independent of the character positions. The downside is that it is very dependent upon image quality and on the result of the license plate extraction.
to fail.
The image received from the extraction often contains more than just the
license plate, for example the mounting frame as in Figure 3.1. This frame can
be removed by further isolation of the actual license plate.
The isolation of the license plate from any superfluous background is performed using a histogram that reflects the number of black pixels in each row and in each column. In most cases, projecting the amount of black pixels both vertically and horizontally reveals the actual position of the plate. An example of the projections is shown in Figure 3.5.
A simple but effective method for removing the frame is based on the assumption that the vertical projection has exactly one wide peak, created by the rows of the characters. The start of the widest peak on the vertical projection must therefore be the top of the characters, and the end of the peak the bottom of the characters. It is expected that the horizontal projection has exactly seven wide peaks and eight valleys (one before each character, and one after the last character).
The success of this method depends on the assumption that the plate is horizontal. In some cases the method will leave a small part of the frame in parts of the image, for example if the angle of the plate is too large; although, depending on the thickness of the remaining frame, it may still be possible to separate the characters.
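The widest-peak rule for the vertical projection can be sketched like this; the function name, the black-count threshold, and the toy image are illustrative assumptions rather than the report's implementation:

```python
def character_rows(binary, min_black):
    """Return (start, end) of the widest run of rows whose black-pixel
    count exceeds `min_black`; this run approximates the top and bottom
    of the characters. `binary` holds 1 for black, 0 for white; `end`
    is exclusive."""
    counts = [sum(row) for row in binary]
    best, start = (0, 0), None
    for i, c in enumerate(counts + [0]):      # sentinel closes a trailing run
        if c > min_black and start is None:
            start = i
        elif c <= min_black and start is not None:
            if i - start > best[1] - best[0]:
                best = (start, i)
            start = None
    return best

toy = [
    [1, 0, 0, 0, 1],   # frame row: few black pixels
    [1, 1, 0, 1, 1],   # character row
    [1, 1, 1, 1, 0],   # character row
    [0, 1, 0, 0, 1],   # frame row
]
print(character_rows(toy, 2))  # → (1, 3)
```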
Dynamic threshold
If the original image is dark or the license plate is dirty, the binary image created from the standard threshold value can be very dark and filled with unwanted noise. There are various methods to eliminate this problem; here, the threshold used when the original image is converted from color to binary is calculated dynamically, based on the assumption that an ideal license plate image on average contains approximately 69% white/yellow pixels and 31% black pixels, including the frame¹. The idea is first to make the image binary. Second, the ratio between black and white pixels is calculated and compared with the expected value. Then a new threshold value is selected and the original image is converted again, until a satisfactory ratio has been achieved. Although this method is affected when the edges of the plate have not been cut off, it is far more reliable than a static threshold.
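One way to realize the iteration is a search over threshold values that stops when the black-pixel ratio is close to the expected 31%. The bisection form, the tolerance, and the example pixel values below are assumptions made for illustration:

```python
def dynamic_threshold(gray_pixels, target_black=0.31, tol=0.05, lo=0, hi=255):
    """Search for a threshold so that the fraction of pixels darker than
    it approximates the expected black ratio of an ideal plate (~31%)."""
    n = len(gray_pixels)
    while lo < hi:
        t = (lo + hi) // 2
        black = sum(1 for p in gray_pixels if p < t) / n
        if abs(black - target_black) <= tol:
            return t
        if black < target_black:
            lo = t + 1          # too few black pixels: raise the threshold
        else:
            hi = t              # too many black pixels: lower the threshold
    return lo

pix = [50] * 31 + [200] * 69    # hypothetical plate: 31% dark, 69% bright
print(dynamic_threshold(pix))   # → 127
```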
A different approach is to remove the unwanted pixels from the binary image. Knowing that we are looking for seven fairly large regions of black pixels, it can be useful to perform a number of erosions on the image before searching for the bounds, thereby removing unwanted objects, for example bolts.
First, convert the image into binary colors using a dynamic threshold to achieve the best result. If still unsuccessful, try the pixel-count method to search for the character bounds on the horizontal projection. If this method also fails, use static bounds as the final option.
3.3 Summary
In this chapter, several approaches to isolating the characters from an input image containing the entire license plate were described. None of the methods were capable of providing reliable results on their own, due to the varying input image quality, but a combination of the methods ensures a very robust isolation scheme. The results of the combination can be seen in Section 6.8.
Chapter 4
Character identification
4.1 Introduction
After splitting the extracted license plate into seven images, the character
in each image can be identified. Identifying the character can be done in a
number of ways. In this chapter, methods for this will be addressed.
First, a solution based on the previously discussed template matching (Section 2.3) will be presented. Thereafter, a method based on statistical pattern recognition will be introduced, and a number of features used by this method will be described. Then, an algorithm for choosing the best features, called SEPCOR, will be presented. After finding a proper set of features, a means of selecting the most probable class is needed. For this purpose, Bayes decision rule will be introduced, along with theory on discriminant functions and parameter estimation. Finally, the methods will be compared against each other, and it will be examined whether there is anything to be gained from combining the two.
uncorrelated features have been found, a way of deciding which class an observation belongs to is needed. For this purpose Bayes decision rule will be introduced, and finally an expression for the classification will be proposed.
4.4 Features
All of the features are extracted from a binary image, because most of the features require this. The area and the circumference of the digits are the simplest features to distinguish. These features are not sufficient, because different digits can have approximately the same area and circumference, e.g. the digits ‘6’ and ‘9’. To distinguish in such cases, the number of endpoints in the upper half and lower half of the image is taken into account. Here it is assumed that the digit ‘6’ has one endpoint in the upper half and zero endpoints in the lower half, and that the endpoints of the digit ‘9’ are the reverse of a ‘6’. To distinguish between the digits ‘0’ and ‘8’, it is not possible to use the endpoint feature, because they both have zero endpoints. Here the number of connected compounds is an appropriate feature: the number of compounds in the digit ‘8’ is three, whereas a ‘0’ has only two.
Furthermore the value of each pixel is chosen as a feature. A final feature
is the area of each row and column, meaning the number of black pixels in
each of the horizontal or vertical lines in the image.
The features are listed below:
Area
Circumference
Compounds
Most of the features require that the characters have the same size in pixels, and therefore the character images are initially normalized to the same height. The height is chosen because there is no difference between the heights of the characters, whereas the widths of the different digits differ; e.g. a ‘1’ and an ‘8’ have the same height but not the same width. The following describes the methods for extracting each feature.
4.4.1 Area
When calculating the area of a character, assuming background pixels are white and character pixels are black, the number of black pixels is counted. For the area of each row, the number of black pixels in every horizontal line is counted. As seen in Figure 4.1, these vertical projections are distinct for many of the digits. The horizontal projections are more similar in structure, with either a single wide peak, or a peak at the beginning and end of the digit. Although more similar, they will still be used to distinguish between the digits.
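These features follow directly from the definitions above; the crude ‘1’ below is a made-up, height-normalized example:

```python
def area_features(binary):
    """Area plus the row/column black-pixel counts for a binary character
    image (1 = black character pixel, 0 = white background)."""
    row_area = [sum(r) for r in binary]
    col_area = [sum(c) for c in zip(*binary)]
    return sum(row_area), row_area, col_area

one = [            # a crude '1'
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
print(area_features(one))  # → (4, [1, 1, 1, 1], [0, 4, 0])
```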
character, the algorithm is divided into two parts. The first part deletes edges
in east, south or the northwest corner. The second part deletes edges in west,
north or the southeast corner.
During the processing of the two parts, pixels that satisfy the conditions
listed below are flagged for deletion. The deletion is not applied until the
entire image has been processed, so that it does not affect the analysis of the
remaining pixels.
The first part of the algorithm deletes pixels if all of the following conditions
are satisfied.
(a) 2 ≤ N (p1) ≤ 6
(b) Z(p1) = 1
(c) p2 · p4 · p6 = 0
(d) p4 · p6 · p8 = 0
where the 3×3 neighborhood of p1 is labeled:

p9 p2 p3
p8 p1 p4
p7 p6 p5
The second part of the algorithm deletes pixels if condition (a) and (b)
combined with (c’) and (d’) are satisfied.
(c’) p2 · p4 · p8 = 0
(d’) p2 · p6 · p8 = 0
Conditions (c’) and (d’) ensure that the pixel is located in the west, the north, or the southeast corner; see the gray pixels in Figure 4.4c.
An example of using the thinning algorithm is depicted in Figure 4.4. The gray pixels are those marked for deletion. (a) shows the original image, (b) is the result of applying part 1 once, (c) is the output of applying part 2 once, (d) is part 1 once more, and (e) is the final result of the thinning algorithm.
The algorithm terminates when no further pixels are flagged for deletion. When using the thinning algorithm, even small structures are present in the final skeleton of the image; an example of this can be seen in Figure 4.5.
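The two-pass procedure above corresponds to the classical Zhang–Suen thinning scheme. The sketch below assumes that N(p1) denotes the number of black neighbors and Z(p1) the number of 0→1 transitions in the ordered sequence p2, p3, …, p9, p2 (the standard reading; the report's exact definitions appear on pages not shown here):

```python
def thin(img):
    """Two-pass Zhang-Suen-style thinning. `img` is a list of rows with
    1 = black (character) and 0 = white; the border is assumed white."""
    def neighbors(r, c):            # p2..p9, clockwise from north
        return [img[r-1][c], img[r-1][c+1], img[r][c+1], img[r+1][c+1],
                img[r+1][c], img[r+1][c-1], img[r][c-1], img[r-1][c-1]]
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            flagged = []
            for r in range(1, len(img) - 1):
                for c in range(1, len(img[0]) - 1):
                    if img[r][c] != 1:
                        continue
                    p = neighbors(r, c)
                    n = sum(p)                                  # N(p1)
                    z = sum(p[i] == 0 and p[(i + 1) % 8] == 1   # Z(p1)
                            for i in range(8))
                    if step == 0:   # conditions (c) and (d)
                        cond = p[0]*p[2]*p[4] == 0 and p[2]*p[4]*p[6] == 0
                    else:           # conditions (c') and (d')
                        cond = p[0]*p[2]*p[6] == 0 and p[0]*p[4]*p[6] == 0
                    if 2 <= n <= 6 and z == 1 and cond:
                        flagged.append((r, c))
            for r, c in flagged:    # delete only after the full pass
                img[r][c] = 0
            changed = changed or bool(flagged)
    return img

block = [[0]*5, [0,1,1,1,0], [0,1,1,1,0], [0,1,1,1,0], [0]*5]
skeleton = thin(block)
print(sum(map(sum, skeleton)))  # → 1 (a solid 3x3 block thins to one pixel)
```

Deferring the deletion until a whole pass has finished, as the text requires, is what keeps the analysis of the remaining pixels unaffected.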
Normally a ‘0’ does not have any endpoints, but in this case it gets one endpoint. The way of finding endpoints is therefore slightly modified: structures that are less than 3 pixels long before reaching a connection point are not taken into account. The digit ‘0’ in Figure 4.5 then has no endpoints, which was expected.
4.4.3 Circumference
Another possible feature to distinguish between characters is their circumference. As can be seen in Figure 4.6, a ‘2’ has a somewhat larger circumference than a ‘1’.
Figure 4.6: Two digits with different circumferences (100 for the ‘1’ and 174 for the ‘2’). The circumference is shown below each digit.
To find the circumference of a character, the pixels on the outer edge are counted. This is done by traversing the outer edge of the character and counting the number of pixels until the start position has been reached. The circumference is very dependent upon the selected threshold value, but relatively resistant to noise.
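A full contour-following routine is somewhat involved, so the sketch below approximates the same measure by counting the black pixels that touch the background; this border-pixel count is a simplified stand-in for the traversal described above, not the report's implementation:

```python
def circumference(binary):
    """Approximate circumference: the number of black pixels with at
    least one white 4-neighbor (pixels outside the image count as white)."""
    rows, cols = len(binary), len(binary[0])
    def white(r, c):
        return r < 0 or r >= rows or c < 0 or c >= cols or binary[r][c] == 0
    return sum(
        1
        for r in range(rows) for c in range(cols)
        if binary[r][c] == 1 and any(
            white(r + dr, c + dc)
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))))

bar = [[1, 1, 1, 1, 1]]        # thin bar: every pixel lies on the border
full = [[1, 1, 1],             # solid 3x3 block: only the center is interior
        [1, 1, 1],
        [1, 1, 1]]
print(circumference(bar), circumference(full))  # → 5 8
```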
4.4.4 Compounds
Each character consists of one or more connected compounds. A compound is
defined as an area that contains pixels with similar values. Since the image is
binary, a compound consists of either ones or zeros. The background is only counted as a single compound.
Figure 4.7: (a) shows the two compounds in a ‘0’, and (b) the one in a ‘1’
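Counting the compounds amounts to counting connected groups of white pixels (the background plus any holes). A small flood-fill sketch, with a made-up ‘0’; 4-connectivity is an assumption:

```python
from collections import deque

def compounds(binary):
    """Count connected compounds of white (0) pixels: 2 for a '0'
    (background + hole), 3 for an '8', 1 for a '1'."""
    rows, cols = len(binary), len(binary[0])
    seen, count = set(), 0
    for r0 in range(rows):
        for c0 in range(cols):
            if binary[r0][c0] == 0 and (r0, c0) not in seen:
                count += 1                     # a new white compound
                queue = deque([(r0, c0)])
                seen.add((r0, c0))
                while queue:
                    r, c = queue.popleft()
                    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] == 0
                                and (nr, nc) not in seen):
                            seen.add((nr, nc))
                            queue.append((nr, nc))
    return count

zero = [           # a crude '0' with one enclosed hole
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
print(compounds(zero))  # → 2 (the background and the hole)
```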
L[f(x, y)] = ∂²f/∂x² + ∂²f/∂y²    (4.1)
This leads to the following digital mask:
0 1 0
1 −4 1
0 1 0
Convolving the original binary image of the character with this mask, the result is an image with a black background and the edges represented by a one-pixel-thick white line; see Figure 4.8.
Figure 4.8: Result of applying a Laplacian filter to a ‘3’. (a) is the original and (b), is
the resulting image.
4.5 Feature space dimensionality
2. Repeat:
(a) Remove and save the feature with the largest V-value.
This algorithm runs until a desired number of features have been removed
from the list, or the list is empty.
Calculating the correlation coefficient as mentioned in step 2.b is done as
shown in Equation (4.3).
c = | σij / √(σii · σjj) |    (4.3)
The correlation coefficient is actually the normalized correlation coefficient as presented in Section 2.3.2. Here σij simply corresponds to the (i, j) entry in the covariance matrix, which is the covariance between features i and j, and σii is the variance of feature i (see Section 4.6).
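Equation (4.3) translates directly into code, and the selection loop can be sketched around it. The `sepcor` helper is an assumed reading of the algorithm's step 2 (the computation of the V-values themselves is described on pages not shown here), and the numbers are invented:

```python
def correlation_coefficient(cov, i, j):
    """Normalized correlation coefficient, Equation (4.3):
    c = |sigma_ij / sqrt(sigma_ii * sigma_jj)|."""
    return abs(cov[i][j] / (cov[i][i] * cov[j][j]) ** 0.5)

def sepcor(v_values, cov, max_corr):
    """SEPCOR-style selection sketch: repeatedly keep the remaining
    feature with the largest V-value and drop the features that are
    too correlated with it."""
    remaining = sorted(range(len(v_values)), key=lambda i: -v_values[i])
    selected = []
    while remaining:
        best = remaining.pop(0)
        selected.append(best)
        remaining = [i for i in remaining
                     if correlation_coefficient(cov, best, i) <= max_corr]
    return selected

cov = [[4.0, 3.0, 0.0],
       [3.0, 9.0, 0.0],
       [0.0, 0.0, 1.0]]
print(correlation_coefficient(cov, 0, 1))   # → 0.5
print(sepcor([3.0, 2.0, 1.0], cov, 0.4))    # → [0, 2] (feature 1 is dropped)
```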
4.6 Decision theory
Figure 4.9: Number of features selected as a function of the maximum correlation coefficient, for both the training set and the test set.
not the data set used in this project is normally distributed will be investigated
by making various tests.
When using the Bayes classifier it is necessary to know the class conditional
probability density functions. Since the distribution of the data is assumed
normal, this means that the parameters of this particular distribution have to be estimated. A method for doing this is also introduced.
P(ωj | x̄) = p(x̄ | ωj) · P(ωj) / p(x̄)    (4.4)

in which

p(x̄) = Σ_{j=1}^{s} p(x̄ | ωj) · P(ωj)    (4.5)
x̄ is the feature vector of the sample being investigated. A finite set of s classes is defined as Ω = {ω1, ω2, . . . , ωs}. In this case, the classes represent the different characters that can occur on a license plate. Similarly, a set of actions is declared as A = {α1, α2, . . . , αs}. These actions can be described as the
where P (ωj | x̄) is the probability that ωj is the true class given x̄.
This can be used when decisions have to be made. One is always interested in choosing the action that minimizes the loss; R(αi | x̄) is called the conditional risk. What is wanted is a decision rule, α(x̄), that determines the action to be taken on a given data sample. This means that for a given x̄, α(x̄) evaluates to one of the actions in A. In doing so, the action that leads to the smallest risk should be chosen, in accordance with Equation (4.6).
Minimum-error-rate classification
As already stated, it is desirable to choose the action that reduces the risk the most. It can also be said that if action αi corresponds to the correct class ωj, then the decision is correct for i = j, and incorrect for i ≠ j. The symmetrical loss function is built on this principle:

λ(αi | ωj) = 0 for i = j, and 1 for i ≠ j,    i, j = 1, . . . , s    (4.7)
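Under the symmetrical loss, minimizing the conditional risk reduces to choosing the class with the largest posterior, i.e. the largest p(x̄ | ωj) · P(ωj), since p(x̄) is a common factor. A small sketch with invented numbers:

```python
def classify(likelihoods, priors):
    """Minimum-error-rate decision: return the index of the class with
    the largest p(x|w_j) * P(w_j)."""
    scores = [p * q for p, q in zip(likelihoods, priors)]
    return scores.index(max(scores))

likelihoods = [0.02, 0.10, 0.05]     # p(x|w_j) for three hypothetical classes
print(classify(likelihoods, [1/3, 1/3, 1/3]))   # → 1 (highest likelihood wins)
print(classify(likelihoods, [0.8, 0.1, 0.1]))   # → 0 (a strong prior can flip it)
```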
function gi(x̄) for each class is defined. The classifier then assigns vector x̄ to class ωi if gi(x̄) > gj(x̄) for all j ≠ i. An example of dividing the feature space into decision regions that achieve the minimum error rate can be seen in Figure 4.10.
µ̄ = E[x̄]    (4.11)

µi = E[xi]    (4.13)

µi = (1/n) Σ_{k=1}^{n} x_ik    (4.15)

σij = (1/(n − 1)) Σ_{k=1}^{n} (x_ik − µi)(x_jk − µj)    (4.16)
In the covariance matrix the diagonal entry, σii, is the variance of feature i, and the off-diagonal entry, σij, is the covariance of features i and j. Samples drawn from a normally distributed population tend to gather in a single cloud. The center of such a cloud is determined by the mean vector, and the shape is dependent on the covariance matrix. The interpretation of the covariance matrix and the shape of the clouds can be divided into five different cases. For simplicity the examples have only two features.
If the covariance matrix equals the identity, the cloud is circular, as depicted in Figure 4.11, where the axes represent the two features.
If all the off-diagonal entries are 0, then the features are uncorrelated and thus statistically independent, and the cloud is formed as an ellipse whose axes are the eigenvectors of the covariance matrix, with lengths given by the corresponding eigenvalues³. In these cases the ellipse is aligned with the feature axes. If σ11 > σ22, the major axis is parallel with the axis of feature one, and if σ11 < σ22, the minor axis is parallel with feature one. Figure 4.12 shows the two cases.
is negative. The orientation of the clouds in the two situations is depicted in Figure 4.13.
In the case of multiple features the clouds are instead hyper-ellipsoids, where the principal axes again are given by the eigenvectors, and the eigenvalues determine the lengths of these axes. For the hyper-ellipsoids the quadratic form:
r² = (x̄ − µ̄)ᵗ Σ̄⁻¹ (x̄ − µ̄)    (4.20)
is constant and sometimes called the squared Mahalanobis distance from x̄
to µ̄ [9].
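For two features, Equation (4.20) can be evaluated directly by inverting the 2×2 covariance matrix; the numbers below are illustrative:

```python
def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance (4.20) for two features."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [x[0] - mu[0], x[1] - mu[1]]
    tmp = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
           inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return diff[0] * tmp[0] + diff[1] * tmp[1]

# With the identity covariance it reduces to the squared Euclidean distance:
print(mahalanobis_sq([3.0, 4.0], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]))  # → 25.0
# A large variance in one feature down-weights deviations along that axis:
print(mahalanobis_sq([2.0, 0.0], [0.0, 0.0], [[4.0, 0.0], [0.0, 1.0]]))  # → 1.0
```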
To achieve the minimum error rate, the classification can be performed using discriminant functions for the normal density. Since the discriminant function equals gi(x̄) = p(x̄ | ωi)P(ωi), it can be rewritten as:
condition number for the matrices, which for all matrices was greater than 10²⁰. The condition number of a matrix A is defined as the product of the norm of A and the norm of A⁻¹.
Under the assumption that the classes have the same probability and that the determinant of the covariance matrix nearly equals zero, the classification is instead done simply on the basis of the squared Mahalanobis distance, as represented in Equation (4.20).
As part of computing the squared Mahalanobis distance, the inverse covariance matrix must be computed. As stated above, this was made difficult by the fact that these matrices were often ill-conditioned, and traditional methods for computing the inverse matrix fail in these cases.
An alternative to computing the traditional inverse is to compute the pseudo-inverse. The pseudo-inverse is defined in terms of the singular value decomposition and extends the notion of inverse matrices to singular matrices [10].
As an alternative distance measure, the Euclidean distance can be used. The squared Euclidean distance is defined as [4]:

r² = (x̄ − µ̄)ᵗ (x̄ − µ̄)

When using the Euclidean distance, the inverse covariance matrix is replaced with the identity matrix. This means that, unlike the Mahalanobis distance, the Euclidean distance is not a weighted distance.
The Matlab plot displays true normal distributions as straight lines, so it is expected that the data for the circumference is approximately linear. Figure 4.15 shows that this is almost the case.
[Figure 4.15: normal probability plot of the circumference data (axes: Data versus Probability).]
The goodness of fit test can be based on the χ2 (chi-square) distribution [1].
The observations made in a single class, e.g. the number ‘0’, are then grouped
into k categories. The observed frequencies of sample data in the k categories
are then compared to the expected frequencies of the k categories, under the
assumption that the population has a normal distribution. The measure of the difference between the observed and the expected frequencies over all categories is calculated
by:

χ² = Σ_{i=1}^{k} (oi − ei)² / ei    (4.24)
where oi is the observed frequency for category i, ei is the expected fre-
quency for category i, and k is the number of categories.
The χ²-test requires the expected frequencies to be five or more for all categories.
When using the χ²-test to check whether or not the data is normally distributed, the data is assumed to be defined by the mean µ and the variance σ². The null and alternative hypotheses are based on these assumptions.
H0 : The population has a normal probability distribution, N (µ, σ 2 ).
Ha : The population does not have a normal probability distribution.
The rejection rule: reject H0 if χ² > χ²α, where α is the level of significance and the test has k − 3 degrees of freedom.
When defining the categories, the starting point is the standard normal
distribution, N(0,1), and the strategy is then to define approximately equally
sized intervals. Here k = 6 categories are created, with the limits for the
intervals defined as ]-∞;-1], ]-1;-0.44], ]-0.44;0], ]0;0.44], ]0.44;1] and ]1;∞[.
The intervals and the areas of probability are depicted in Figure 4.16. The degrees of freedom can then be calculated as k − 3 = 3.
Figure 4.16: Standard normal distribution, with the 6 categories and the corresponding
areas of probability
The observations made from the training set are used to estimate the mean circumference, µ, and the standard deviation, s, of the normal distribution.
µ = Σ xi / n = 11360 / 38 = 298.95    (4.25)
s = √( Σ (xi − µ)² / (n − 1) ) = √( 3977.87 / 37 ) = 10.37    (4.26)
The limits for the categories can then be calculated on the basis of the
defined intervals, the mean and the standard deviation.
χ² = Σ_{i=1}^{k} (oi − ei)² / ei = 2.74 < χ²_{0.10} = 6.251
The null hypothesis, H0, will therefore not be rejected at a 10% level of significance with 3 degrees of freedom.
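The statistic of Equation (4.24) and the category probabilities of the standard normal distribution can be computed as follows; the observed frequencies below are hypothetical, not the report's data:

```python
import math

def chi_square(observed, expected):
    """Chi-square statistic, Equation (4.24)."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def normal_category_probs(limits):
    """Probability mass of N(0,1) in each interval defined by `limits`,
    via the standard normal CDF."""
    cdf = lambda z: 0.5 * (1 + math.erf(z / math.sqrt(2)))
    points = [-math.inf] + list(limits) + [math.inf]
    return [cdf(b) - cdf(a) for a, b in zip(points, points[1:])]

# The six categories used in the text, with n = 38 and made-up counts:
probs = normal_category_probs([-1, -0.44, 0, 0.44, 1])
expected = [38 * p for p in probs]
observed = [6, 7, 6, 7, 6, 6]          # hypothetical frequencies
print(round(chi_square(observed, expected), 2))
```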
The results for the other features were not always as close to a normal distribution as desired. A feature such as the number of compounds is very stable and does not have any variance whatsoever. Other features have seemingly random distributions, which to a large extent could be caused by the relatively small amount of training data.
Although some of the features are not normally distributed, the majority are. Therefore, in this project it is assumed that all features are normally distributed, and hence Bayes decision rule can be applied. Section 4.6.4 will discuss methods for estimating the parameters that fit best, under the assumption that the data is normally distributed.
p(X | θ̄) = Π_{k=1}^{n} p(x̄k | θ̄)    (4.27)
Equation (4.27) and which makes p(x̄ | ωj ; θ̄j ) the best fit for the given sample
set.
l(θ̄) = log p(X | θ̄) = Σ_{k=1}^{n} log p(x̄k | θ̄)    (4.28)

∇θ̄ = ( ∂/∂θ1, ∂/∂θ2, . . . , ∂/∂θd )ᵗ    (4.29)
All critical points of Equation (4.28), including θ̂, are then solutions to:
∇θ̄ l = Σ_{k=1}^{n} ∇θ̄ log p(x̄k | θ̄) = 0    (4.30)
l(θ̄) = log p(X | θ̄)    (4.31)
     = Σ_{k=1}^{n} log p(xk | θ̄)    (4.32)
     = Σ_{k=1}^{n} log [ (1 / (√(2π) √θ2)) exp( −(1/2) ((xk − θ1) / √θ2)² ) ]    (4.33)
     = Σ_{k=1}^{n} [ −(1/2) log 2πθ2 − (1/(2θ2)) (xk − θ1)² ]    (4.34)

⁴ The reason for taking the logarithm is that it eases some algebraic steps later on (this is clearly legal, since the logarithm is a monotonically increasing function).
The next step is to insert the last equation into Equation (4.30):

∇θ̄ l = ( ∂/∂θ1 ; ∂/∂θ2 )ᵗ Σ_{k=1}^{n} [ −(1/2) log 2πθ2 − (1/(2θ2)) (xk − θ1)² ]    (4.35)

     = Σ_{k=1}^{n} ( (1/θ2)(xk − θ1) ; −1/(2θ2) + (xk − θ1)²/(2θ2²) )ᵗ = 0    (4.36)
Finally, µ and σ² are back-substituted and solved for. This yields:

µ = (1/n) Σ_{k=1}^{n} xk    (4.37)

σ² = (1/n) Σ_{k=1}^{n} (xk − µ)²    (4.38)

In the multivariate case the corresponding estimates are:

µ̄ = (1/n) Σ_{k=1}^{n} x̄k    (4.39)

Σ̄ = (1/n) Σ_{k=1}^{n} (x̄k − µ̄)(x̄k − µ̄)ᵗ    (4.40)
These results correspond well to Equations (4.15) and (4.16). Thus the
maximum likelihood estimates are as expected the sample mean and the sample
covariance matrix.
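Equations (4.39) and (4.40) are straightforward to implement; note the 1/n factor in the maximum likelihood covariance, as opposed to the 1/(n − 1) factor of Equation (4.16). The two-sample data set is only an illustration:

```python
def ml_estimates(samples):
    """Maximum likelihood estimates: sample mean (4.39) and covariance
    matrix with the 1/n factor (4.40). `samples` is a list of vectors."""
    n, d = len(samples), len(samples[0])
    mu = [sum(x[i] for x in samples) / n for i in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in samples) / n
            for j in range(d)] for i in range(d)]
    return mu, cov

mu, cov = ml_estimates([[1.0, 2.0], [3.0, 6.0]])
print(mu, cov)  # → [2.0, 4.0] [[1.0, 2.0], [2.0, 4.0]]
```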
4.8 Summary
In this chapter the two methods for identifying characters, template matching and statistical pattern recognition, were described. In connection with the statistical recognition, a method for reducing the number of features, called SEPCOR, was introduced. The Mahalanobis distance, derived from Bayes decision rule, was stated as a measure for identifying the characters; as an alternative measure, the Euclidean distance was used. A means of testing whether data is normally distributed was presented, and parameter estimation for the Bayes classifier was described. Finally, the two identification strategies were compared.
Part III
Test
Chapter 5
Extraction test
5.1 Introduction
This chapter describes the tests performed to verify that the license plate
extraction performs adequately. Since region growing and Hough transform
have the same objective, they will be described together, whereas correlation
is described separately, since it is used for verifying whether or not a region
could potentially contain a license plate. Finally the combined method will be
tested, in order to give an estimate of the actual performance of the subsystem.
form of edges. If the region contains more than just the license plate, it will
be harder to distinguish the correct region from the entire list of regions.
As a secondary goal, the methods should not produce an excessive number of regions. A very large number of regions makes it harder to select the most probable license plate candidate.
5.2 Region growing and Hough transform
5.2.4 Results
The region growing method turned out to be a very robust way of finding the license plate. As Table 5.1 indicates, it almost always found the plate; in only 2 of the 72 test images was it not capable of locating a satisfactory region. The total number of regions lies in the lower hundreds, which may at first glance seem like a lot. This reflects the fact that several regions are duplicated because of the iterations with different threshold values.
There were some common characteristics of the two plates that were not found. They were both rather dark images, taken so that the shadow fell in front of the vehicle. Also, they were both yellow plates, further decreasing the brightness of the plates in contrast to the surroundings. A further scaling of the threshold value might have solved this issue, but for lower thresholds the total number of regions rises quickly, so this might not provide better overall results anyway.
For the Hough transform, the percentage of found plates was somewhat lower. Again, the yellow plates were more troublesome, with a hit percentage of 43 compared to a rate of 64% for white plates. Also, the number of candidate regions was approximately a factor of 2 higher than for region growing in the test set, but with very large differences from image to image. In the training set the number was lower, again with large differences between the individual images. As with region growing, the results were slightly better for the training set, but the difference is so minuscule that it does not suggest that the algorithm had been tailored to the specific set of training images.
As is also seen in Table 5.1, the number of regions found by the Hough transform is higher for the test set than for the training set. The reason is that a parking lot in the background of some of the images causes a very high number of edges to be detected. Region growing actually finds fewer regions in the test set, but the numbers represent an average with very high deviations between the individual images.
The plates that were not found by region growing were not found by the Hough transform either. Therefore, the conclusion must be that the two methods do not complement each other.
In conclusion, the region growing method will guarantee a very high success rate, but also a rather large number of regions to be sorted out before finding the most probable candidate. A very large percentage of the regions can easily be discerned as random regions without much similarity to a license plate. The algorithms proved to be almost as efficient when applied to images that were not part of the training set and were taken under different lighting conditions and at different distances.
5.3 Correlation
Correlation was ruled out as an extraction method, due to the fact that it is not scale invariant. Another use was found for it in the selection among the candidate regions identified by the other two methods, and it is for this use that the correlation method will be tested.
manually examined to see if all plates were contained, and which types of other regions passed.
5.3.4 Results
As can be seen in Table 5.2, all of the plates have been found.
The table also shows that regions that are not license plates are not always ruled out. This does not matter, as the method will be used in connection with other methods; the only requirement is that the combination rules them out.
Looking at the non-plate regions that were not ruled out, there are some common characteristics (see Figure 5.1). Many of the regions are, as in a), bright regions with dark vertical lines, very similar to a license plate. The same goes for b), where bright regions with darker ends are mistaken for plates. It should be noted that the correlation values for these types are much lower than a typical license plate would be expected to produce. Still, the threshold value is set relatively low, so that even dark or slightly rotated license plates are not omitted. c) is a different matter altogether: it might very well be difficult to distinguish between a whole plate and a plate with the end cut off, and at least it is expected that such a region is not discarded. Still, the template has been designed so that the actual plate, where the characters are placed as on the template, will yield a higher correlation coefficient.
[Figure 5.1: Examples a), b) and c) of non-plate regions that were not ruled out.]
This aside, the test shows that if the method were to be used alone,
more effort would have to be put into determining threshold values and perhaps
into template construction.
The license plate that was not found was a blurred, small yellow plate; by
tweaking the threshold values in the different steps, this plate could also be
recognized. The number of non-plates accepted as plates could be seriously
reduced by tweaking the threshold values. For instance, a proper selection of
the threshold for the correlation coefficient makes the correlation validation
more accurate. But it is pointless to tune the individual threshold values out
of context with the other methods: if the threshold for the coefficient is
increased, it becomes more likely that an actual plate is discarded.
5.5 Summary
The test showed that the region growing method was capable of finding nearly
all license plates. The Hough transform found a smaller number of plates,
and it could not be established that a combination would provide better results
than region growing on its own.
The combined method for selecting the most probable license plate candi-
date is very effective. The three steps are each capable of sorting out a different
type of region, and the combination makes it possible to find the correct license
plate in a large collection of non-plate regions.
The number of successfully extracted license plates is not as high as the
results above would indicate, however. This is because the number of regions
found varies greatly from one image to another; as a result, some plates are
lost when searching for the best region.
Chapter 6
Isolation test
6.1 Introduction
This chapter describes the test of the preprocessing step of isolating the char-
acters in the license plate. The purpose of the chapter is to verify that the
implementation of the isolation method described in Chapter 3 performs effi-
ciently.
The sequence of the subimages has to be in the correct order. This means
that the first character of the plate is the first of the subimages.
depends on the resulting output images. The test has been performed using the
connected component method, the pixel count method and the method using
static bounds. All methods are tested independently. Finally, the combined
method, also described in Chapter 3, is tested as well.
6.6 Pixel count
improved drastically, and this is clearly reflected in the result of the test.
Where a total of 78 images succeeded before, the improved method was able to
divide another 23 images. For both sets combined, a total of 101 of the 106
images resulted in successful isolation. In general, the resulting images from
the method are of good quality, meaning that the bounds found by the method
are accurate.
From the test it is also clear that there is no remarkable difference between
the results for the two data sets. The results are slightly better for the test
set, but for both sets a success rate above 90 % is achieved.
Figure 6.2 shows an example of a plate that could not be successfully
divided into 7 subimages using the connected component method. The reason
is that the third and fourth digits are part of the same component after the
image has been thresholded, and therefore they cannot be isolated.
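This failure mode follows directly from how connected-component labeling works: once two characters share a foreground pixel after thresholding, they receive the same label. A minimal 4-connected labeling sketch, hypothetical rather than the project's implementation, makes this visible:

```python
from collections import deque

def connected_components(binary):
    """Label 4-connected foreground components in a binary image
    (list of lists of 0/1). Returns the label image and the count."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                # New unlabeled foreground pixel: start a new component.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Two separated strokes yield two labels, but a single bridging pixel between them collapses the result to one component, exactly as with the merged third and fourth digits above.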
The use of a dynamic threshold gave a significantly better result. One image
which succeeded before dynamic thresholding was applied now failed, but a total
of 72 images succeeded. This means that, combining the output without
improvement and the output when using dynamic thresholding, a total of 73
of 106 images resulted in successful isolation.
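The report does not detail the dynamic thresholding scheme; one common choice, shown here as a hedged sketch only, is the iterative isodata threshold, which adapts to each plate's brightness instead of using a fixed cut-off.

```python
def dynamic_threshold(pixels, eps=0.5):
    """Isodata-style threshold selection: start from the global mean
    and repeatedly move the threshold to the midpoint of the means of
    the two classes it induces, until it stabilizes. One plausible
    'dynamic threshold'; the report's exact scheme is not specified."""
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t
```

For a plate with dark characters around intensity 10 on a background around 200, the threshold converges near the midpoint, separating the two classes regardless of overall image brightness.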
As in the previous test, there was no significant difference between the
success percentages of the two data sets when isolating without preprocessing.
As with the connected components, the results were slightly better for the test
set when using dynamic thresholding. However, the 5 % difference is small
enough to be caused by statistical uncertainty.
Naturally, the method always succeeds in dividing the image into seven
subimages, and thereby the first and third criteria of success are always
fulfilled. The success of the method depends merely on the quality of the
isolated characters, that is, on whether the second criterion is fulfilled.
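Static bounds can be sketched as slicing the plate into seven fixed slots, which is why the method can never fail to produce seven subimages. Equal slot widths are an assumption for illustration; the actual bounds would follow the fixed layout of a Danish plate (two letters and five digits).

```python
def static_bounds(width, n_chars=7):
    """Divide a plate image of `width` pixels into `n_chars` equally
    wide slots, returning (start, end) column pairs. Equal widths are
    a simplifying assumption; real static bounds would encode the
    known character positions of the plate layout."""
    return [(i * width // n_chars, (i + 1) * width // n_chars)
            for i in range(n_chars)]
```

The slots always cover the full width and are always seven in number, so the first and third success criteria hold by construction; only the second criterion, subimage quality, is at risk.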
6.8 Combined method
The images that failed both the connected component and the pixel count
method are all images that can be adequately divided using static bounds.
The connected component method produces the best output images, since
the bounds it finds are very accurate, but when the method fails, both the
pixel count method and static bounds proved useful.
6.8.1 Summary
The success criterion for the isolation of the characters in the license plate was
that all seven characters should be isolated in the correct order and without
being diminished in any way.
The test shows that although the individual methods for isolating the char-
acters described in Chapter 3 cannot perform the task by themselves, the com-
bination proved able to do so in all of the images used for testing.
Chapter 7
Identification test
7.1 Introduction
The final step in recognizing a license plate is identifying the individual
characters extracted in the previous steps. Two methods for doing this were
presented earlier: the first tested is the method based on statistical pattern
recognition, and the second the normalized cross-correlation coefficient.
[Figure 7.1: Identification percentage on the test set as a function of the number of features used.]
7.4 Feature based identification
The test was performed using a maximum correlation of 1, and this yields an
identification rate of 100 %. This was expected, since the system should be
able to identify the characters it was trained with.
Max. corr.   Features used   Successful    Percentage
0.1          5               130 of 350    37.1 %
0.2          13              247 of 350    70.6 %
0.3          25              267 of 350    76.3 %
0.4          42              316 of 350    90.3 %
0.5          69              337 of 350    96.3 %
0.6          96              338 of 350    96.6 %
0.7          142             344 of 350    98.3 %
0.8          210             344 of 350    98.3 %
0.9          262             345 of 350    98.6 %
1.0          306             346 of 350    98.9 %

Table 7.1: Result of the test on the test set.
Results
Correct              180
Training set size    180
Percentage (%)       100.0

Table 7.2: Result of the test on the training set.
As can be seen in both Table 7.1 and Figure 7.1, the recognition percentage
rises quickly while the maximum correlation is below 0.5, and then settles
at about 98 %. The gain achieved from using a maximum correlation of 1 com-
pared to 0.7 is only 0.57 %. But since there are no hard timing restrictions,
the higher identification rate, although small, is preferable in this system.
Figure 7.2: (a) shows the unidentifiable character, and (b) the optimal result.
The errors made using the value 0.7 are spread over three different license
plates, and over two when using 1 as the maximum correlation. It should be
noted that, using the value 0.7, one license plate was responsible for four of
the six errors. Using a maximum correlation of 1, the same plate accounts for
three of the four errors. An example of an error originating from this plate can
be seen in Figure 7.2. This image actually produces some very nice subimages
of the digits, but the digits in these images are a bit small. A digit should fill
its subimage horizontally, but fails to do so, probably because of the use of
static bounds.
This is believed to be the cause of the inability to identify the digits in this
plate.
Table 7.4 shows the results of identifying the set not used for training. Not
surprisingly, the identification percentage is lower for this set. Also, contrary
to the results with the Euclidean distance, adding more features does not always
increase the identification percentage.
The results of the tests are also shown in Figure 7.3 and Figure 7.4 as a
function of the number of features used. Here it becomes even more apparent
that a higher number of features does not always produce a better result.
The reason seems to be that the SEPCOR algorithm sometimes removes
good features which, although correlated with other selected features, had
a positive impact on the identification percentage. These features are then
replaced with other features, which provide an inferior result. To put it in
other words: when removing features, the SEPCOR algorithm does not look
             Test 1                            Test 2
Max.         Features                          Features
corr.        used      Correct       (%)       used      Correct      (%)
0.1          5         165 of 350    47.1      7         75 of 180    41.7
0.2          12        133 of 350    38.0      13        163 of 180   90.6
0.3          27        181 of 350    51.7      31        116 of 180   64.4
0.4          50        229 of 350    65.4      47        146 of 180   81.1
0.5          85        236 of 350    67.4      73        115 of 180   63.9
0.6          123       249 of 350    71.1      106       125 of 180   69.4
0.7          173       268 of 350    76.6      160       150 of 180   83.3
0.8          233       270 of 350    77.1      208       166 of 180   92.2
0.9          275       299 of 350    85.4      258       161 of 180   89.4
1.0          306       301 of 350    86.0      306       162 of 180   90.0

Table 7.4: Result of the tests.
at how good a feature is, that is, how high its V-value is, before removing it.
This means that any correlated feature, no matter how good, can potentially
be removed.

[Figure 7.3: Identification percentage as a function of the number of features, for the training and test sets.]
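The selection behaviour just described can be sketched as follows. The interface, per-feature V-values and a precomputed feature-correlation matrix, is assumed for illustration and is not taken from the report's implementation:

```python
def sepcor(v_values, corr, max_corr):
    """SEPCOR-style feature selection sketch: consider features in
    order of decreasing V-value (class separability) and accept a
    feature only if its absolute correlation with every already
    selected feature is below `max_corr`. `corr[i][j]` is the
    correlation between features i and j (assumed precomputed)."""
    order = sorted(range(len(v_values)), key=lambda i: -v_values[i])
    selected = []
    for i in order:
        if all(abs(corr[i][j]) < max_corr for j in selected):
            selected.append(i)
    return selected
```

Note how the rejection test only inspects correlation, never the candidate's own V-value, so a strong feature that happens to correlate with an earlier pick is discarded, which is exactly the weakness discussed above.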
Comparing identification percentages with those from the Euclidean dis-
tance, it is clear that identification using the Euclidean distance produces bet-
ter results. There can be several reasons why this is the case. First, the
amount of data used for training the system in the two tests might have been
too small, although the tests actually show that identification using the smaller
training set performs better. This might be a coincidence, though. Secondly,
the assumption that the distribution of the data is normal might be wrong;
as mentioned in Section 4.6.3, not all features can be considered normally
distributed.
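The two distance measures being compared can be sketched as below. The diagonal-covariance simplification in the Mahalanobis variant is an assumption made here for illustration; the report's classifier may use the full covariance matrix.

```python
import math

def euclidean(x, mean):
    """Plain Euclidean distance from feature vector x to a class mean."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mean)))

def mahalanobis_diag(x, mean, var):
    """Mahalanobis distance under the simplifying assumption of a
    diagonal covariance matrix (independent features): each squared
    deviation is scaled by that feature's variance `var[i]`."""
    return math.sqrt(sum((a - b) ** 2 / v
                         for a, b, v in zip(x, mean, var)))
```

The Mahalanobis variant weights each feature by its spread in the training data, which is only an advantage if the estimated variances are reliable and the normality assumption roughly holds; otherwise the unweighted Euclidean distance can win, as observed in these tests.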
[Figure 7.4: Identification percentage as a function of the number of features, for the training and test sets.]
7.5 Identification through correlation
put into the construction of the digit templates. Also, there are more distinct
letters than digits.
Many of the errors were due to the poor quality of the input images. The
major problems are connected to the size of the input images and their overall
brightness.
Using the plate in Figure 7.5 as an example, it is noted that in spite of the
small size of the plate in the image, the plate is extracted and the characters
are isolated without problems. The character images are, however, of such
poor quality that the correlation identification fails.
Many of the extracted license plates were so dark that no proper threshold
value could be found to remove the background of the plate and make the
characters stand out satisfactorily. The plate in Figure 7.6 serves as an
example of such a plate.
The problem is exaggerated in that image, as a better isolation of the
characters can be achieved using dynamic thresholding, but it illustrates the
problem with this type of plate.
Almost half of the errors in the training set occur in plates with more
than one error, indicating that the input images from those plates are not of
sufficient quality. Many of the other errors can also be attributed to these two
factors, as the plates are borderline examples of the situations described.
7.6 Summary
Although the hit rates are almost the same in the two tests, Euclidean
distance and template matching, three out of four errors in feature based iden-
tification are found in the same plate, whereas the errors in the template
method are found in different plates. This supports what was stated
in Section 4.7, where the sensitivity to noise of the feature based method was
discussed.
In Section 4.7 it was also stated that template matching should be less
susceptible to noise-induced errors than feature based identification. This is
hard to see from these results, as the template method's hit rate is lower. This
indicates that it is in fact the chosen features that are less susceptible to noise
than the template strategy.
All in all, the conclusion of the tests must be that feature based identifi-
cation suits our purpose best, but that identification using the Mahalanobis
distance either needs a larger training set, or the assumption that the data is
normally distributed is not entirely correct.
Chapter 8
System test
8.1 Introduction
In the previous test chapters, the components of the system were examined
separately in terms of performance. It is also relevant to test the combination
of the components. When testing the individual components, only useful input
was used. In real life, an error made in the license plate isolation will ripple
through the system and affect the overall performance. This chapter describes
the tests performed on the final system.
8.5 Results
The results obtained for this test are shown in Table 8.1.
Region growing, combined with the method for finding the most probable
region, found a total of 57 license plates. When using both region growing
and the Hough transform, the number of license plates found was actually
slightly lower, due to the extra number of random regions to be sorted out.
The remaining algorithms therefore automatically received these 57 license
plates, and 15 random regions, as input. Of course, the further results for the
15 random regions are of little interest.
The isolation of the individual characters worked in all of the 57 license
plate regions.
As mentioned, all three methods were tested for the identification of the
characters. Here the Mahalanobis distance turned out to provide the least
impressive results. In many of the plates only a single character was
misidentified, but in this test such a plate is considered a failure.
8.6 Summary
On the other hand, several poor license plate images were correctly identi-
fied. Two examples of such plates are shown in Figure 8.2. These also contain
marks from the bolts, but the classifier is still capable of providing the correct
reading for each character.
8.6 Summary
The test of the overall system established that, by using region growing for
license plate extraction and a Euclidean distance measure for the feature based
identification, a total of 76.4 % of the test images were successfully identified.
The main cause of failure was the license plate extraction, which caused 15
of the 17 errors. Since it was shown in Section 5.2.4 that 70 license plates were
actually found, the algorithm for extracting the most probable region throws
away 13 license plates in favor of regions containing random information.
The results could be further improved if some restrictions were imposed
on the input images. If, for instance, the images were taken at approximately
the same distance from the vehicle (as in the currently used system), the
extraction of the license plate would be significantly easier to accomplish.
Part IV
Conclusion
Chapter 9
Conclusion
The purpose of this project has been to investigate the possibility of making
a system for automatic recognition of license plates, to be used by the police
force to catch speed violators. In the current system, manual labor is needed
to register the license plate of the vehicle violating the speed limit. The
majority of the tasks involved are trivial, and the extensive and expensive police
training is not used in any way. Therefore, an automatic traffic control system,
eliminating or at least reducing these tasks, has been proposed.
We wanted to investigate the possibility of making such an automatic sys-
tem, specifically the part dealing with automatically recognizing the license
plates. Given an input image, it should be able first to extract the license
plate, then to isolate the characters contained in the plate, and finally to
identify those characters. For each task, a set of methods was developed and
tested.
For the extraction part, the Hough transform and, in particular, the region
growing method proved capable of extracting the plate from an image. Both
locate a large number of candidate regions, and template matching was utilized
to select between them. Template matching, combined with the height-width
ratio and peak-valley methods, provided a successful way of selecting the
correct region.
The methods developed for isolating the characters proved very reliable.
The method for finding character bounds using an algorithm that searches for
connected components turned out to be the most useful, and combined with
pixel count and static bounds, the approach was extremely successful.
For the actual identification process, two related methods were developed:
one based on statistical pattern recognition and the other on template match-
ing. Both methods proved highly successful, with feature based identification
slightly better than template matching. This is because more features are
included in the statistical pattern recognition, and therefore this method
was chosen to identify the digits on the plate. No attempt was made to identify
the letters using feature based identification, and this is the only reason why
template matching was used to identify the letters.
In order for the system to be useful, it should be able to combine the three
different tasks and recognize a high percentage of the license plates, so that the
use of manual labor is reduced as much as possible. This implies that the
success rate for the individual parts should be close to 100 %. The results
obtained are summarized in Table 9.1.
As can be seen, the individual parts perform very satisfactorily, all with
success rates close to 100 %. The plate extraction succeeds in 98.1 % of the
test images, failing in only two of them. Since the extraction works in more
than 98 % of the images, the criterion for this task is fulfilled.
The part isolating the characters contained in the license plate succeeds
in 100 % of the cases, and is thus very successful, achieving the goal set for
this task.
Of the isolated digits, 98.9 % were correctly identified. This is also a
very high success rate, and it must be taken into account that the wrongly
identified digits originate from only two plates.
The overall performance is not as high as for the individual tasks, but a
large proportion of the license plates, namely 76.4 %, is still correctly identified.
In general, the conclusion of this report is that a system for automatic
license plate recognition can be constructed. We have successfully designed
and implemented the key element of the system, the actual recognition of the
license plate.
The main theme of the semester was the gathering and description of infor-
mation, and the goal was to collect physical data, represent it symbolically, and
demonstrate processing techniques for this data. This goal has been accom-
plished in this project.
Part V
Appendix
Appendix A
Videocamera
Resolution
A certain resolution is required in order to distinguish between the indi-
vidual characters of a license plate. Generally, raising the resolution
provides better images, but this improvement in quality comes at the cost of
larger amounts of data and more expensive equipment.
Color
Since Danish license plates are either yellow or white with black char-
acters, no color is needed to separate the characters from the background.
However, most modern equipment defaults to color images. This does
not pose a problem, as a software conversion to grayscale images is pos-
sible. As with the resolution, an increase in color depth results in more
data to be processed.
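The grayscale conversion mentioned above can be sketched as a weighted sum of the color channels. The standard luminance weights 0.299/0.587/0.114 are an assumption here; the report only states that a software conversion is possible, not which weights were used.

```python
def to_grayscale(rgb_image):
    """Convert an RGB image (nested lists of (r, g, b) tuples) to
    grayscale using the common luminance weights; green contributes
    most because the eye is most sensitive to it."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b)
             for (r, g, b) in row]
            for row in rgb_image]
```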
Light sensitivity
Since the equipment will be placed outdoors, the weather will have a large
impact on the images. In particular, the camera should be able to adjust
its light sensitivity to the conditions automatically. If the light sensitivity
is either too high or too low, the images tend to become blurry, so
that even high-resolution images are useless. A camera with a good ability
to obtain sharp images under varying lighting conditions is said to have a
large dynamic range.
Lens
The camera lens sharpens images by focusing the light from the source at
a particular point, where the CCD chip converts the light to an electrical
signal. A camera usually contains several different lenses, to compensate
for, e.g., the fact that light of different wavelengths refracts differently.
Lenses are also used for zooming.
Shutter
The shutter is simply a device that lets different amounts of light pass
into the camera, depending on the overall brightness of the surround-
ings. The shutter usually consists of six ‘plates’ that synchronously move
away from, or closer to, each other.
CCD-chip
The CCD chip converts an amount of light to an electrical signal. In its
simplest design, no concern is given to the color of the absorbed light.
The more sophisticated color CCD chips work in one of two ways: either
the light is split into its three base colors by lenses, with one CCD chip
absorbing each color, or the chip uses a color filter array such as the
Bayer pattern (see Figure A.1) to distinguish between colors.
A.1 Physical components
When an entire line has been written to output, a new line is shifted
downwards and so forth.
Appendix B
Visiting the police force
In order to get an understanding of the case, we visited the local police de-
partment to see first hand how the system currently in use operates.
The presentation of the system was split into two parts. First we saw how
the pictures are processed at the local police office, and the different stages
were explained. Secondly, we saw how the images are recorded, as we drove out
to one of the surveillance vans. There we got a feel for the circumstances under
which the images are recorded.
B.3 Evaluation of the current system
speed limit to trigger the camera is set, and the address of its position is also
registered.
When setting the “trigger speed limit”, a 3 km/h uncertainty of the radar is
taken into account, and only offenders driving more than 10 % faster than the
speed limit of the road are fined. This means that for a section of road with a
speed limit of 50 km/h, the trigger speed is set to 59 km/h.
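The arithmetic can be checked with a small sketch. The exact rounding rule is not stated in the report, so this is one plausible interpretation that reproduces the 50 → 59 km/h example:

```python
def trigger_speed(limit_kmh, tolerance=0.10, radar_margin_kmh=3):
    """Lowest whole-number measured speed that triggers the camera.

    Offenders must exceed the limit by more than `tolerance`, and the
    radar's `radar_margin_kmh` uncertainty is added on top. For a
    50 km/h limit: 50 * 1.10 + 3 = 58, so the first triggering integer
    speed is 59 km/h. The rounding to the next integer is an assumed
    interpretation of the report's example, not a stated rule."""
    threshold = limit_kmh * (1 + tolerance) + radar_margin_kmh
    return int(threshold) + 1
```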
Once the system has been initialized, no further human involvement is needed.
The need for an officer to be present stems from the fact that different vehicle
types have different speed limits. If buses and lorries are to be fined, the officer
has to trigger the camera manually when they pass by. In fact, what he does
is switch the system to a different trigger speed limit.
The radar is a sensitive piece of equipment. In addition to the 3 km/h
margin of error, it is also affected by acceleration and deceleration. The use
of radar for measuring the speed of the vehicles also restricts the locations
available for surveillance: large solid objects such as billboards can confuse
the radar.
Appendix C
Contents of CD
The purpose of this appendix is to give an overview of the attached CD. Below,
a quick overview of the contents is displayed:
Directory: CD
Acrobat Reader 5.05
Referenced web pages
Images
Test
Training
Source code
Final
Image
Misc
Installation
Required DLL’s
All of the instructions required to install and use the program are included
on the CD. If a browser window does not appear automatically when the CD
is inserted, manually start your browser and select the file index.html in the
root of the CD. The report is also available for viewing via a link in the
index.html file.
on the CD also contains the project files used when building the executable.
The imported libraries were the Intel Image Processing Library (IPL) and the
Intel JPEG Library (IJL). In addition, the Open Computer Vision library
(OpenCV) was utilized. All of these libraries contain routines for manipulating
images, including morphology operations, per-pixel operations and color
manipulation.