
Expert Systems With Applications 131 (2019) 219–239

Contents lists available at ScienceDirect

Expert Systems With Applications


journal homepage: www.elsevier.com/locate/eswa

A novel character segmentation-reconstruction approach for license plate recognition
Vijeta Khare a, Palaiahnakote Shivakumara b, Chee Seng Chan b, Tong Lu c,∗, Liang Kim Meng d, Hon Hock Woon d, Michael Blumenstein e
a Department of Computer Science and Software Engineering, Concordia University, Montreal, Canada
b Faculty of Computer Systems and Information Technology, University of Malaya, Malaysia
c National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China
d Advanced Informatics Lab, MIMOS Berhad, Kuala Lumpur, Malaysia
e Faculty of Engineering and Information Technology, University of Technology Sydney, Australia

Article info

Article history: Received 26 November 2018; Revised 2 March 2019; Accepted 16 April 2019; Available online 18 April 2019

Keywords: Character segmentation; Character reconstruction; Stroke width; Zero crossing; Gradient vector flow; License plate recognition

Abstract

Developing an automatic license plate recognition system that can cope with multiple factors is challenging and interesting in the current scenario. In this paper, we introduce a new concept called partial character reconstruction to segment characters of license plates to enhance the performance of license plate recognition systems. Partial character reconstruction is proposed based on the characteristics of stroke width in the Laplacian and gradient domain in a novel way. This results in character components with incomplete shapes. The angular information of character components determined by PCA and the major axis is then studied by considering regular spacing between characters and aspect ratios of character components in a new way for segmenting characters. Next, the same stroke width properties are used for reconstructing the complete shape of each character in the gray domain rather than in the gradient domain, which helps in improving the recognition rate. Experimental results on benchmark license plate databases, namely, MIMOS, Medialab, UCSD data, Uninsbria data, Challenged data, as well as video databases, namely, ICDAR 2015, YVT video, and natural scene data, namely, ICDAR 2013, ICDAR 2015, SVT, MSRA, show that the proposed technique is effective and useful.
© 2019 Elsevier Ltd. All rights reserved.

∗ Corresponding author.
E-mail addresses: shiva@um.edu.my (P. Shivakumara), cs.chan@um.edu.my (C.S. Chan), lutong@nju.edu.cn (T. Lu), liang.kimmeng@mimos.my (L.K. Meng), hockwoon.hon@mimos.my (H.H. Woon), Michael.Blumenstein@uts.edu.au (M. Blumenstein).

https://doi.org/10.1016/j.eswa.2019.04.030
0957-4174/© 2019 Elsevier Ltd. All rights reserved.

1. Introduction

Creating a smart/digital/safe city has been one of the important emerging trends in both developing and developed countries in recent times. As a result, developing automatic systems has become an integral part of the above-mentioned initiatives (Rathore, Ahmad, Paul, & Rho, 2016; Yuan et al., 2017). One such example is to develop intelligent transport systems for safety and mobility, and to enhance public welfare with the help of advanced technologies by recognizing license plates (Anagnostopoulos, Anagnostopoulos, Loumos, & Kayafas, 2006; Du, Ibrahim, Shehata, & Badawy, 2013; Suresh, Kumar, & Rajagopalan, 2007). There are transport systems proposed for recognizing license plates in the literature for applications such as the automatic collection of toll fees, automatic monitoring of car speeds on the road, automatic estimation of traffic volume at different traffic junctions, detection of illegal parking and incorrect traffic flows (Abolghasemi & Ahmadyafrd, 2009; Azam & Islam, 2016; Tadic, Popovic, & Odry, 2016). However, such a system only works well for a particular application since it is not developed for multiple applications. This is because any particular system can cope with a single adverse factor but not multiple factors, which affect license plate visuals (Zhou, Li, Lu, & Tian, 2012). In addition, most of the existing systems that have been developed use conventional binarization methods, which are proposed for plain background document images to localize and recognize license plates (Ghaili, Mashohor, Ramli & Ismail, 2013; Du et al., 2013; Yu, Li, Zhang, Liu, & Meng, 2015). It is obvious that for different real-time applications, multiple environmental effects are common (e.g., low resolution, low contrast, complex backgrounds, blur due to camera or vehicle movements, illumination effects due to sunlight, headlights, degradation effects due to rain, fog or haze, and distortion effects due to camera angle variations).

The illustration shown in Fig. 1 demonstrates that input image-1 is affected by perspective distortion, while input image-2 is

Fig. 1. Binarization and recognition results of license plate images affected by different effects, which result in varying recognition results.

affected by blur as shown in Fig. 1(a). For these two license plate images, the binarization method (Zhou, Feild, Learned-Miller, & Wang, 2013), which is the state-of-the-art method and works well for low contrast and complex background images, fails to give good results for input image-1, but gives better results for input image-2 as shown in Fig. 1(b). However, Tesseract OCR returns nothing for input image-1 due to touching, and incorrect results for input image-2 due to shape loss as shown in Fig. 1(b). On the other hand, the proposed method works well except for the first character in input image-1 through reconstruction-segmentation with the same OCR. With this illustration, one can conclude that there is an urgent need for developing a system which can withstand multiple adverse factors such that the same system can be used for several real-time applications successfully.

2. Related work

The proposed license plate recognition system involves character segmentation through partial reconstruction, and complete reconstruction for recognition. Therefore, we review the research related to character segmentation, character recognition and character reconstruction.

Character Segmentation: Phan, Shivakumara, Su, and Tan (2011) proposed a gradient-vector-flow based method for video character segmentation. The method uses text line length for finding seed points that are unreliable, and then uses minimum cost path estimation for finding spaces between characters. Sharma, Shivakumara, Pal, Blumenstein, and Tan (2013) proposed a new method for character segmentation from multi-oriented video words. The method is sensitive to dominant points. Liang, Shivakumara, Lu, and Tan (2015) proposed a new wavelet Laplacian method for arbitrarily-oriented character segmentation in video text lines. This method explores zero crossing points to find spaces between words or characters. The performance of the method degrades when an image contains noisy backgrounds. There are methods proposed for segmenting characters from license plate images. For example, Tian, Wang, Wang, Liu, and Xia (2015a) proposed a two-stage character segmentation method for Chinese license plates. This method relies on binarization for segmentation. Sedighi and Vafadust (2011) proposed a new and robust method for character segmentation and recognition in license plate images. This method uses a classifier, and binarization for segmentation. As a result, the method is dataset dependent. Khare, Shivakumara, Raveendran, Meng, and Woon (2015) proposed a new sharpness-based approach for character segmentation of license plate images. The method explores gradient vector and sharpness for segmentation. However, the method is said to be sensitive to seed point selection and blur presence. Kim, Song, Lee, and Ko (2016) proposed an effective character segmentation approach for license plate recognition under varying illumination environments. The method uses binarization and the super pixel concept for segmentation. However, the method focuses on a single cause but not multiple causes.

In the same way, recently, Dhar, Guha, Biswas, and Abedin (2018) proposed a system design for license plate recognition using edge detection and convolutional neural networks. The method uses character segmentation as a preprocessing step for license plate recognition. For character segmentation, the method explores edge detection, morphological operations and region properties. However, the method is good for the images with simple backgrounds but not for images affected by many challenges. Ingole and Gundre (2017) proposed character feature-based vehicle license plate detection and recognition. First, the method segments characters from license plate regions for recognition. For character segmentation, the method proposes vertical and horizontal projection profile-based features. The proposed projection profile-based features may not be robust for the images with complex backgrounds. Radchenko, Zarovsky, and Kazymyr (2017) proposed a segmentation and recognition method for Ukrainian license plates. The method segments characters based on connected component analysis. The connected component analysis works well when the input image is binarized without the loss of the character shapes and touching between the characters. However, for the images with complex backgrounds, it is hard to propose a binarization method to separate foreground and background information.

In summary, from the above context, we can conclude that most of the methods made an attempt to solve the problem of low resolution or illumination effects, but do not include other distortions such as blur, touching and complex backgrounds. In addition, none of the methods explore the concept of reconstruction for segmenting characters from license plate images.

Character Recognition: To recognize characters in text lines of video, natural scene images and license plate images, there are methods that use either binarization methods or classifiers (Ye & Doermann, 2015). For example, Zhou et al. (2013) proposed scene

text binarization via inverse rendering. The method proposes a different idea for adapting parameters that tune the method according to image complexity. However, the assumptions made for proposing a number of criteria limit its ability to work on different applications. Wang, Shi, Xiao, and Wang (2015) proposed MRF-based text binarization for complex images using stroke features. The success of the method depends on how well it selects seed pixels from the foreground and background. Similarly, Anagnostopoulos et al. (2006) proposed a license plate recognition algorithm for intelligent transportation applications. Since the method involves binarization and a classifier for recognition, it may not work well for images affected by multiple adverse effects such as low resolution, blur and touching. Saha, Basu, and Nasipuri (2015) proposed automatic license plate recognition for Indian license plate images. The method involves edge map generation, the Hough transform and a classifier for recognition. The success of the method depends on edge map generation and a classifier. Gou, Wang, Yao, and Li (2016) proposed vehicle license plate recognition based on extremal regions and restricted Boltzmann machines. The method extracts HoG features for detected characters, and then uses a classifier for recognition. In summary, it is noted from the above review of license plate recognition approaches that most of the methods consider binarization algorithms and classifiers for recognition. In addition, the methods do not consider images affected by multiple factors for achieving their results. Therefore, the methods lose generality and the ability to work on license plate images of different background and foreground complexities.

Deep Learning Models for Character Recognition: Jaderberg, Simonyan, Vedaldi, and Zisserman (2016) proposed an approach for reading texts in the wild with a convolutional neural network, which explores deep learning for achieving high recognition results for texts in natural scene images. Goodfellow, Bulatov, Ibarz, Arnoud, and Shet (2013) proposed multi-digit number recognition from street view imagery using deep convolutional neural networks, which explores deep learning at the pixel level. Despite both methods addressing the challenges caused by natural scene images, they are limited to text recognition from high contrast images but not from low resolution license plate images and video images. Raghunandan et al. (2017) proposed a Riesz fractional-based model for enhancing license plate detection and recognition. This method makes an attempt to address the causes which affect license plate detection and recognition. Based on the experimental results, it is noted that enhancement of license plate images may improve the recognition results but it is not adequate for real time applications. Al-Shemarry, Li, and Abdulla (2018) proposed an ensemble of adaboost cascades of 3L-LBPs classifiers for license plate detection from low quality images. The method explores texture features based on LBP operations and uses a classifier for license plate detection from images affected by multiple adverse factors. However, the performance of the method heavily depends on learning and the number of labeled samples. In addition, the scope is limited to text detection but not recognition as in the proposed work. Text detection is easier than recognition in this case because detection does not require the full shapes of characters.

Recently, inspired by the strong ability and discriminating power of deep learning models, some methods have explored different deep learning models for license plate recognition. For example, Dong, He, Luo, Liu, and Zeng (2017) proposed a CNN-based approach for automatic license plate recognition in the wild. The method explores an R-CNN for license plate recognition. Bulan, Kozitsky, Ramesh, and Shreve (2017) proposed segmentation- and annotation-free license plate recognition with deep localization and failure identification. The method explores CNNs for detecting a set of candidate regions. Then it filters false positives from the candidate regions based on strong CNNs. Silva and Jung (2018) proposed license plate detection and recognition in unconstrained scenarios. The method explores CNNs for addressing challenges caused by degradation. It detects the license plate region first and then the detected region is fed to an OCR for recognition. Lin, Lin, and Liu (2018) proposed an efficient license plate recognition system using convolution neural networks. The method detects vehicles for license plate region detection and then it explores CNNs for recognition. Yang et al. (2018) proposed Chinese vehicle license plate recognition using kernel-based extreme learning machines with deep convolutional features. The method explores the combination of CNN and ELM (extreme learning machines) for license plate recognition. It is found from the above discussion on deep learning models that the methods work well when we have a huge number of labeled predefined samples. However, it is hard to choose predefined samples that represent all possible variations in license plate recognition, especially for the images affected by multiple adverse factors as in the proposed work. In addition, deep learning has its own inherent limitations such as optimizing parameters for different databases and maintaining stability of deep neural networks (Liu et al., 2017). It can be noted from the above discussion that there is a gap between the state-of-the-art methods and the present demand. This observation motivated us to propose a new method for license plate recognition without depending much on classifiers and a large number of labeled samples, as in the existing methods.

Character Reconstruction: Similar to the proposed work, there are methods in the literature which reconstruct character shapes to improve recognition rates without the help of classifiers and binarization algorithms. Shivakumara, Phan, Bhowmick, Tan, and Pal (2013) proposed a ring radius transform for character shape reconstruction in video. Its performance is good as long as Canny produces the correct character structures. However, it is true that Canny is sensitive to blur and other distortions. To overcome this drawback, Tian, Shivakumara, Phan, Lu, and Tan (2015b) proposed a method for character shape restoration using gradient orientations. It finds the medial axis in the gradient domain with different directions. However, the method does not work well for characters having blur and complex backgrounds. In addition, the primary objective of this work is to reconstruct the characters from video, which suffer from low resolution and low contrast, but does not deal with license plate images.

In light of the above discussions on the review of character segmentation from license plate images, character recognition from license plate images and character reconstruction, most of the methods focus on a particular dataset and certain applications, such as natural scene images or video images or license plate images. As a result, the scope of the above methods is limited to specific applications and objectives. This motivated us to propose a method that can work well for license plate images, natural scenes and video images. In addition, license plate images are generally affected by multiple adverse factors due to background and foreground variations, making the problem of recognition more complex and interesting.

Inspired by the work (Shivakumara et al., 2019) where keyword spotting is addressed for multiple types of images with powerful feature extraction, we propose a novel idea for recognizing characters from license plates affected by multiple factors. The key contributions of the proposed work are as follows: (1) Proposing partial reconstruction for segmenting characters from license plate images is novel; (2) Reconstructing complete shapes of characters from segmented characters without binarization, which can work well for not only license plate images but also natural scene and video images, is also novel; (3) The combination of reconstruction and character segmentation in a new way is another interesting step to achieve good recognition rates for multi-type images. The main advantage of the proposed method is that since the

proposed reconstruction approach preserves character shapes, the
performance of the method does not depend much on classifiers
and the number of training samples.
The proposed method is structured as follows. Stroke width pair
candidate detection is illustrated by estimating stroke width dis-
tances for each pixel in the images in Section 3.1. In Section 3.2,
we propose symmetry properties based on stroke width distances
to obtain partial reconstruction results. Section 3.3 proposes char-
acter segmentation using partial reconstruction results based on
principal and major axis information of the character components.
We describe the steps for complete reconstruction in the gray do-
main in Section 3.4.

3. Proposed technique

This work considers license plates affected by multiple factors according to various applications, such as low resolution, low contrast, complex backgrounds, multiple fonts or font sizes, blur, multi-orientation, touching elements and distortion due to illumination effects, as input for character segmentation and recognition.

To overcome the problem of low contrast and low resolution, inspired by Laplacian and gradient operations, which usually enhance high contrast information at the edges or near edges by suppressing background information (Phan et al., 2011; Liang et al. 2015; Khare et al., 2015), we propose Laplacian and gradient information for finding pixels which represent stroke width (thickness of the stroke) of characters in license plate images. This is justified because the Laplacian process, which is the second order derivative, gives high positive and negative values at the edges and near edges, respectively. Similarly, the gradient, which is the first order derivative, gives high positive values at the edges and near edges. This information is used for Stroke Width Pair (SWP) candidate detection. It is true that stroke width or stroke width distance and color remain constant throughout characters regardless of font or font size variations (Epshtein, Ofek, & Wexler, 2010) at the character level. Most of the time, license plates are prepared using upper case letters. Furthermore, the spacing between characters in license plate images is almost constant. Based on these facts, we propose new symmetry features which use Laplacian and gradient properties at the SWP candidates to find neighboring SWPs. However, due to complex backgrounds, severe illumination effects and blur, there is a possibility for SWPs to fail in satisfying the symmetry features. This results in the loss of information and hence we consider the output of this step as partial reconstruction. We believe that the output of partial reconstruction preserves the structure of character components. This may lead to under- and over-segmentation.

It is understood that Eigen vectors of PCA give angles based on the number of pixels which contribute to the direction of character components (Shivakumara, Dutta, Tan, & Pal, 2014). In other words, to estimate the possible angle of the whole character, PCA does not require the full character information. As per our experiments, in general, if the character contains more than 50% of pixels, one can expect almost the same angle of the actual character. The same thing is true for angle estimation via the major axis of the character. With this motivation, we use angle information given by PCA and the Major Axis (MA) to estimate angles of character components. The angle information between PCA and MA is explored for character segmentation. Since the proposed symmetry properties are sensitive to blur, touching and complex backgrounds, we propose the same symmetry properties with weak conditions in the gray domain instead of Laplacian and gradient domains to reconstruct the full character shape with the help of the Canny edge image of the input image. This is possible because there is no influence from neighboring characters after segmenting characters from the image. The reconstructed characters are passed to Tesseract OCR for recognition. The flow of the proposed method is shown in Fig. 2.

Fig. 2. Pipeline of the proposed method.

3.1. Stroke width pair candidates detection

As mentioned in the previous section, the stroke width distances (thickness of the stroke) of characters in a license plate image are usually the same as shown in Fig. 3(a). To extract stroke width distance, we propose a Laplacian operation which gives high positive and negative responses for the transition from background to foreground and vice versa, respectively. This results in searching two zero crossing points that define stroke width distance as shown in Fig. 3(b) and 3(c), where a pictorial representation of the marked region in Fig. 3(b) is shown. Since the input images considered have complex backgrounds and small orientations due to angle variations, we use the following mask to extract horizontal, vertical and diagonal zero crossing points.

$$\text{Laplace Mask} = \begin{bmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

Due to background variations and noise introduced by the Laplacian operation as shown in Fig. 3(b), background and noise pixels may contribute to defining stroke width distances. Therefore, to overcome this issue, we plot a histogram for stroke width distances as shown in Fig. 3(c). The distances are chosen from those contributing to the highest peak as candidate stroke width pairs, which are shown in Fig. 3(d), where one can see all the red pixels denoting stroke width pair candidates. This is justified because the stroke pixel pairs that define the actual stroke width distance are more numerous than the pixel pairs defined by background or noise pixels. In this way, the proposed step can withstand background noise and degradations. It may be noted from Fig. 3(d) that Stroke Width Pair (SWP) candidates represent character strokes. In addition, each character has a set of SWPs. It is evident from Fig. 3(e) that the proposed technique detects SWPs for the complex image in Fig. 1(a), where touching exists due to perspective distortion.

It is noted from Fig. 3(d) that the number of red pixels differs from one character to another. This is because the proposed steps estimate stroke width distance by considering all the pixels of characters but not the pixels of individual characters. Since we consider the common stroke width distance of the pixels in the image, the number of stroke width pairs varies from one region to another due to background complexity. As a result, all the pixels of characters may not contribute to the highest peak in the histogram. Therefore, one cannot predict the number of stroke width pairs for each character as shown in Fig. 3(d). However, the proposed method has the ability to restore the character shape with one stroke width pair of each character by the partial reconstruction step. We believe that each character gets at least one stroke width pair from the histogram operation for the partial reconstruction step because they follow the same font size and typeface.

Fig. 3. Stroke width pair candidate detection.

3.2. Partial character reconstruction

The proposed technique considers SWP candidates given by the previous section as the representatives to find neighboring SWP candidates, which define stroke width of the character. To achieve this, for each SWP candidate, the proposed technique considers eight neighbors of two stroke pixels and then checks all the combinations to identify the correct SWP as shown in Fig. 4(a), where we can see the process of searching for the right neighbor SWP. In this work, the proposed method uses an 8-directional code for searching the correct stroke width pair; one can expect 8 neighbor pixels for each stroke pixel of the pair. Therefore, the total number of combinations is 8 × 8 = 64 pairs. The reason to consider 8 neighbors for each stroke pixel is to ensure that the step does not miss checking any pair of pixels. Since stroke pixels represent edge pixels of characters, we can expect high gradient values compared to their background. Similarly, the pixel values between the stroke pixels represent a homogeneous background, and the gradient gives low values for the pixels compared to the gradient values of the stroke pixels as shown in Fig. 4(b) (Khare et al., 2015). Therefore, we study the gradual changes from high to low and low to high as shown in Fig. 4(c), where we can see gradual changes in gradient values which are defined as the Gradient Symmetry (GS) feature. When we look at the Gradient Vector Flow (GVF) of the stroke pixels, as shown in Fig. 4(d), we can observe arrows which are pointing towards the edges; the arrows of the two stroke pixels point in opposite directions. This is called the GVF Symmetry (GVFS) feature as shown in Fig. 4(e). Similarly, we consider the value of a positive peak of the Laplacian and the difference between the positive and negative peak values for finding symmetry. In this way, we find the neighboring SWP of each SWP candidate as shown in Fig. 4(f), where one can see positive and positive-negative peaks. This is called the Laplacian Symmetry (LS) feature. The proposed technique extracts four symmetry features for each SWP candidate, and then checks the four symmetry features with all 64 combinations. Subsequently, it chooses the combination which satisfies the four symmetries as the neighboring SWP, and the pair will be displayed as white pixels. The identified neighbor SWP is considered as an SWP candidate, and again the whole process repeats recursively to find all the neighboring SWPs in the image. This process stops when it visits all SWPs. However, the number of iterations depends on the complexity of the characters and the number of SWPs of each character. As long as the stroke width pair satisfies the symmetry properties, the partial reconstruction step restores the contour pixels of the characters. When SWPs fail to satisfy the symmetry properties or there are no more SWPs to visit, the iterative process terminates.
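The candidate detection step of Section 3.1 (Laplacian response, zero crossing pairs, histogram peak) can be illustrated with a minimal sketch. This is not the authors' implementation: it scans rows only (the paper also considers vertical and diagonal directions), leaves border pixels at zero, and the function names `laplacian`, `zero_crossings` and `swp_candidates` are hypothetical.

```python
from collections import Counter

# 8-neighbour Laplacian mask given in Section 3.1
LAPLACE_MASK = [[1, 1, 1],
                [1, -8, 1],
                [1, 1, 1]]


def laplacian(image):
    """Apply LAPLACE_MASK to a grayscale image given as a list of rows.

    Border pixels are left at 0 (a simplifying assumption of this sketch).
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                LAPLACE_MASK[dy + 1][dx + 1] * image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
    return out


def zero_crossings(row):
    """Indices i where the response changes sign between i and i + 1."""
    return [i for i in range(len(row) - 1) if row[i] * row[i + 1] < 0]


def swp_candidates(image):
    """Pair consecutive zero crossings per row; keep the most frequent distance.

    Returns (peak_distance, pairs) where each pair is
    (row, start_col, end_col, distance). Horizontal scan only (assumption).
    """
    lap = laplacian(image)
    pairs = []
    for y, row in enumerate(lap):
        zc = zero_crossings(row)
        for a, b in zip(zc, zc[1:]):
            pairs.append((y, a, b, b - a))
    if not pairs:
        return None, []
    histogram = Counter(p[3] for p in pairs)  # stroke width histogram
    peak_distance = histogram.most_common(1)[0][0]
    return peak_distance, [p for p in pairs if p[3] == peak_distance]
```

On a synthetic image containing a single vertical bar three pixels wide, the dominant distance returned by `swp_candidates` is 3, which mirrors the idea that pairs produced by real strokes outnumber those produced by background or noise.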

Fig. 4. Exploiting symmetrical features for finding neighbor SWPs from 64 combinations.
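The 64-combination neighbor search of Section 3.2 can be sketched as follows. This is an illustrative simplification, not the authors' code: it uses a queue instead of recursion, accepts every neighboring pair that passes the test rather than a single one, compares feature values only at the two end pixels of a pair, and takes an optional tolerance `tol` that the paper does not define. The per-pixel feature maps `grad`, `gvf_angle` and `lap` are assumed to be dictionaries keyed by `(row, col)`.

```python
from collections import deque

# 8-directional offsets (dy, dx) around a pixel
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]


def neighbor_pairs(pair):
    """Yield the 8 x 8 = 64 candidate pairs around a stroke width pair."""
    (sy, sx), (ey, ex) = pair
    for dy1, dx1 in OFFSETS:
        for dy2, dx2 in OFFSETS:
            yield ((sy + dy1, sx + dx1), (ey + dy2, ex + dx2))


def make_symmetry_test(grad, gvf_angle, lap, tol=0.0):
    """Build a predicate comparing a reference SWP with a neighbor pair.

    Simplification: only end-pixel values are compared, standing in for the
    gradient, GVF and Laplacian symmetry tests of the paper.
    """
    def close(a, b):
        return abs(a - b) <= tol

    def is_symmetric(ref, cand):
        if any(p not in grad for p in ref + cand):
            return False  # pixel outside the feature maps
        (rs, re_), (cs, ce) = ref, cand
        gs = close(grad[rs], grad[cs]) and close(grad[re_], grad[ce])
        gvfs = (close(gvf_angle[rs], gvf_angle[cs])
                and close(gvf_angle[re_], gvf_angle[ce]))
        ls_peak = close(lap[rs], lap[cs]) and close(lap[re_], lap[ce])
        ls_diff = close(lap[rs] - lap[re_], lap[cs] - lap[ce])
        return gs and gvfs and ls_peak and ls_diff

    return is_symmetric


def grow_swps(seeds, is_symmetric):
    """Grow SWPs from seed candidates until no new symmetric neighbor remains."""
    visited = set(seeds)
    queue = deque(seeds)
    while queue:
        current = queue.popleft()
        for candidate in neighbor_pairs(current):
            if candidate not in visited and is_symmetric(current, candidate):
                visited.add(candidate)
                queue.append(candidate)
    return visited
```

Starting from one seed pair on a straight vertical stroke with uniform feature values, `grow_swps` reaches every pixel along both stroke edges, which corresponds to the restoration of contour pixels described above. The search stops on its own once no unvisited pair satisfies the predicate.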

This is the reason to obtain the partial shape of the character by (i) If GSW = {gSW1 ,gSW2 ……,gSWn } and, GNP = {gNP1 ,gNP2 ,……,
partial reconstruction as shown in Fig. 5, where we can see the in- gNPn }, where n is the size of the stroke width (SW), gSWn
termediate steps for the partial reconstruction results. It can also and gNPn represents the gradient value of the stroke width
be noted from Fig. 5 that the partial reconstruction results provide and Neighbor Pair (NP) at location n, respectively.
the structures of the characters with some loss of information. Then NP = =1iff {gSW1 = =gNP1 ,gSW2 = =gNP2 ,……gSWn = =gNPn }
The four symmetrical features are defined specifically as fol- Gradient symmetries can be visualized as in
lows. Fig. 4(b).
V. Khare, P. Shivakumara and C.S. Chan et al. / Expert Systems With Applications 131 (2019) 219–239 225

Fig. 5. Intermediate and final partial reconstruction results.

(ii) The angle information of the GVF at the starting point (sp) and end point (ep) of the stroke width is represented as $GVF_{SW}(sp)$ and $GVF_{SW}(ep)$. Then $NP = 1$ iff $GVF_{NP}(sp) = GVF_{SW}(sp) \wedge GVF_{NP}(ep) = GVF_{SW}(ep)$, where $GVF_{NP}(sp)$ and $GVF_{NP}(ep)$ represent the angle information of the GVF at the starting point and end point of NP, respectively. GVF angle symmetry can be visualized as in Fig. 3(e).
(iii) The peak values of the stroke width Laplace (L) at the starting point and end point are represented as $P\_L_{SW}(sp)$ and $P\_L_{SW}(ep)$, respectively, and the peak values of the neighbor pair Laplace at the starting point and end point are denoted by $P\_L_{NP}(sp)$ and $P\_L_{NP}(ep)$, respectively. Then $NP = 1$ iff $P\_L_{NP}(sp) = P\_L_{SW}(sp) \wedge P\_L_{NP}(ep) = P\_L_{SW}(ep)$.
(iv) Similarly, the difference between the highest peak and the lowest peak of the Laplace zero-crossing is also used for comparing neighbor pairs. The highest and lowest peaks of the Laplace zero-crossing points are represented as $hP\_L_{SW}$ and $lP\_L_{SW}$ for the stroke width, and as $hP\_L_{NP}$ and $lP\_L_{NP}$ for the neighbor pair. The high-to-low differences are defined as $Diff_{SW} = hP\_L_{SW} - lP\_L_{SW}$ and $Diff_{NP} = hP\_L_{NP} - lP\_L_{NP}$. Then $NP = 1$ iff $Diff_{NP} = Diff_{SW}$.

Laplace symmetries (iii) and (iv) can be visualized as in Fig. 4(f).

3.3. Character segmentation

When we look at the partial reconstruction results given by the previous section, as shown in Fig. 5(f) and 5(g), one can see that even though there is a loss of shape, the results still provide enough structure to find the spacing between characters and the character regions for segmentation. As mentioned in the proposed Methodology Section, Principal Component Analysis (PCA) and the Major Axis (MA) do not require the full character shape to estimate possible directions of character components. It is also noted that most license plate images, including Malaysian license plates, contain upper case letters with numerals, but not a combination of upper case and lower case letters. According to the statement in Yao, Bai, Liu, Ma, and Tu (2012) that “for most text lines, the major orientations of characters are nearly perpendicular to the major orientation of the text line”, both PCA and MA should give approximately 90° if the characters in the text are aligned in the horizontal direction. The above observations can be confirmed from the sample results of partial reconstruction on alphabets, namely, A to Z, and numerals, namely, 0–9, chosen from the databases shown in Fig. 6, where we note that for both alphabet and numeral images, PCA (yellow color axis) and MA (red color axis) give angles which are almost the same and approximately 90°, because all the images are inclined in the vertical direction. Similarly, the same conclusion can be drawn from the results shown in Fig. 7(a)-7(b), where we present PCA and MA angle information for the images affected by low contrast, complex backgrounds, multi-fonts, multi-font sizes, blur and perspective distortion. In the same way, the sample partial reconstruction results shown in Fig. 8(a)-8(b) for the images of two character components show that PCA and MA give angles of almost 0°, because the character components are aligned towards the horizontal direction.

The results in Figs. 6 and 7 show that partial reconstruction has the ability to preserve character shapes regardless of different causes, while PCA and MA have the ability to give the angle of character orientation without the complete shape of the character components. This observation leads us to define the following hypotheses for character segmentation. If both the axes give almost 90° with a ±26° difference, then the component is considered as a full character; else if both the axes give almost zero degrees with a ±26° difference, then the component is considered to be an under-segmentation. This is possible when two character components are joined together as shown in Fig. 8. Otherwise, the component is considered as a case of over-segmentation. This occurs when a character loses shape. The value of ±26° is determined based on experimental results, which will be presented in the Experimental Section. The reason to fix such a threshold is that segmentation requires either a vertical or horizontal orientation. With this idea, the proposed technique classifies components from the partial reconstruction results into three cases.

In general, characters in license plate images share the same aspect ratio, especially the height of characters, as shown in Fig. 5(a). This observation motivated us to find the width of components of the three cases. If partial reconstruction outputs characters with clear

Fig. 6. Angle information given by PCA and MA for the alphabets and numerals of license plate images. The MA axis is represented by a red color and the PCA axis is
represented by a yellow color.

Fig. 7. PCA and MA angle information of the partial reconstruction result for the different distorted images.

Fig. 8. PCA and MA angle information of the partial reconstruction results for the image of two character components.
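The three-way decision based on the PCA and MA angles illustrated in Figs. 6-8 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the representation of a component as a list of (x, y) pixel coordinates, and the wrapping of angles to the range 0-180° are all hypothetical, and the MA angle is taken as an input because its computation is not specified here. Only the 90° and 0° cases and the ±26° tolerance come from the text.

```python
import math


def pca_angle_deg(points):
    """Orientation in [0, 180) degrees of the first principal axis of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed form for the leading eigenvector direction of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return math.degrees(theta) % 180.0


def axis_distance_deg(angle, target):
    """Smallest difference between two axis orientations (axes repeat every 180 degrees)."""
    return abs((angle - target + 90.0) % 180.0 - 90.0)


def classify_component(pca_angle, ma_angle, tol=26.0):
    """Return 'full', 'under' or 'over' following the three hypotheses in the text."""
    if all(axis_distance_deg(a, 90.0) <= tol for a in (pca_angle, ma_angle)):
        return "full"   # both axes near vertical: one complete character
    if all(axis_distance_deg(a, 0.0) <= tol for a in (pca_angle, ma_angle)):
        return "under"  # both axes near horizontal: characters joined together
    return "over"       # anything else: a fragment of a character


if __name__ == "__main__":
    # Hypothetical component: a 3-pixel-wide, 20-pixel-tall vertical stroke.
    stroke = [(x, y) for x in range(3) for y in range(20)]
    angle = pca_angle_deg(stroke)
    print(angle, classify_component(angle, angle))
```

Treating the angle as an axis orientation rather than a direction means that 3° and 177° are both considered close to horizontal, which avoids a discontinuity at 0°.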

shapes, and all the components are classified as an ideal character case according to angular information, the proposed technique considers the width which contributes to the highest peak in the histogram as the probable width. If the proposed technique does not find a peak on the basis of width, it considers the average of the widths of the characters as the probable width. The same probable width is used for segmenting characters as shown in Fig. 9, where for the input license plate images in Fig. 9(a) and Fig. 9(b), the proposed technique plots histograms using the probable width as shown in Fig. 9(c), and the segmentation results given by the probable width are shown in Fig. 9(d) and Fig. 9(e), respectively. Fig. 9(d) and 9(e) show that segmentation using the probable width segments almost all the characters of image-1 in Fig. 9(a) except for “12”. For image-2 in Fig. 9(b), it segments almost all the characters except for “W” and “U”. Therefore, segmentation with probable widths is good in ideal cases as shown in Fig. 9(f), where for the complex image in Fig. 3(e), the probable width segments all the characters successfully using the partial reconstruction results. However, this is not true for all cases. For example, it results in under-segmentation and over-segmentation as shown in Fig. 9(d) and Fig. 9(e), respectively.

To solve the problem of under-segmentation given by the probable width, we propose an iterative-shrinking algorithm, which removes small portions of components from the right side with a step size of five pixels in the partial reconstruction results, and then checks the angle information of ideal characters. The proposed technique investigates iteratively whether the angle difference between PCA and MA leads to an angle of 90° or not. When the angle

Fig. 9. Character segmentation using probable widths.
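The choice of the probable width used in Fig. 9 can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, and the rule for deciding that the histogram "has a peak" (one width occurring more than once and more often than any other) is an assumption, since the text does not define it. The fallback to the average width follows the text.

```python
from collections import Counter


def probable_width(widths):
    """Return the histogram-peak width, or the mean width when no clear peak exists."""
    if not widths:
        raise ValueError("at least one component width is required")
    ranked = Counter(widths).most_common(2)
    best_width, best_count = ranked[0]
    # Assumed peak criterion: a single width that repeats and beats the runner-up.
    has_peak = best_count > 1 and (len(ranked) == 1 or ranked[1][1] < best_count)
    if has_peak:
        return float(best_width)
    return sum(widths) / len(widths)


if __name__ == "__main__":
    # Hypothetical component widths in pixels; 30 would be a merged pair of characters.
    print(probable_width([12, 12, 13, 12, 30]))
    print(probable_width([10, 20, 30]))
```

In practice the widths would probably be binned before counting, because exact pixel widths rarely repeat in noisy images; the bin size would be one more parameter to tune.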

Fig. 10. Iterative-shrinking process for under-segmentation. (a) gives the case of under segmentation, and (b) shows the intermediate results of the iterative process.
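The iterative-shrinking idea shown in Fig. 10 can be sketched as follows. The sketch makes several assumptions that are not in the source: a component is a list of (x, y) pixel coordinates, a single PCA orientation stands in for the PCA and MA pair, and the function names are hypothetical. The five-pixel step, the trimming from the right side and the ±26° tolerance around 90° come from the text.

```python
import math


def axis_angle_deg(points):
    """Orientation in [0, 180) degrees of the first principal axis of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    return math.degrees(0.5 * math.atan2(2.0 * sxy, sxx - syy)) % 180.0


def shrink_until_upright(points, step=5, tol=26.0):
    """Trim columns from the right, `step` pixels at a time, until the kept part is upright.

    Returns (kept, removed). If no upright part is found, the component is
    returned unchanged with an empty `removed` list.
    """
    x_min = min(x for x, _ in points)
    right = max(x for x, _ in points)
    while right >= x_min:
        kept = [(x, y) for x, y in points if x <= right]
        if abs(axis_angle_deg(kept) - 90.0) <= tol:
            removed = [(x, y) for x, y in points if x > right]
            return kept, removed
        right -= step
    return list(points), []


if __name__ == "__main__":
    # Hypothetical under-segmented block: two vertical strokes side by side.
    left_bar = [(x, y) for x in range(0, 3) for y in range(20)]
    right_bar = [(x, y) for x in range(14, 17) for y in range(20)]
    kept, removed = shrink_until_upright(left_bar + right_bar)
    print(len(kept), len(removed))
```

The removed part can be passed through the same function again when a block holds more than two characters.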

difference satisfies the condition of an ideal character, the iterative process stops, and the character is considered as an individual component. Since under-segmentation usually contains two characters such as “12”, the iterative process segments such cases successfully. This process is tested on all the components from the results of partial reconstruction to solve the problem of under-segmentation.

The process of iterative-shrinking is illustrated in Fig. 10, where (a) is a sample of an under-segmentation case, (b) gives the intermediate results of the iterative process, and (c) shows the final results. It is observed from Fig. 10(b) that the angle difference between the axes given by PCA and MA reduces as the iterations continue, and the process subsequently stops when both the axes give the same angle.

In the same way as iterative-shrinking for under-segmentation, we propose iterative-expansion to solve the over-segmentation cases. For each component given by the probable width, the proposed technique expands with a step size of five pixels from the left side. At the same time, in the partial reconstruction results, it calculates the angle differences of PCA and MA. This process continues until it gets an angle of almost zero degrees. When two characters are merged, the iterative process gets an angle of zero degrees from both PCA and MA. At this point, the iterative process stops and then we use the iterative-shrinking algorithm to segment both the characters. Therefore, the proposed iterative-expansion uses iterative-shrinking for solving the over-segmentation problem. Note that the proposed technique first employs iterative-shrinking to solve under-segmentation, and then uses iterative-expansion to solve over-segmentation. This is because iterative-expansion requires iterative-shrinking. The reason to propose an iterative procedure for both shrinking and expansion is that when a character component is split into small

Fig. 11. Iterative-expansion process for over-segmentation.

fragments due to adverse factors or when character components are joined together, it is necessary to study local information in order to identify the vertical and horizontal cases. The process of iterative-expansion is illustrated in Fig. 11, where (a) shows the cases of over-segmentation, (b) shows intermediate results of partial reconstruction of (a), (c) gives the results of iterative-expansion followed by shrinking for correct segmentation, and (d) gives the final character segmentation results.

3.4. Complete character reconstruction

Section 3.2 described the method to obtain partial character reconstruction for input license plate images, and the method presented in Section 3.3 uses the advantage of partial reconstruction for character segmentation. Since characters are segmented well from license plate images even when they are affected by multiple factors, we apply Canny to obtain edges to reconstruct complete shapes of characters for each incomplete shape given by partial reconstruction. This is because Canny gives fine edges for low and high contrast images when we supply individual characters rather than the whole license plate image (Saha et al., 2015). Therefore, we consider the output of Canny as the input for reconstructing missing information in partial reconstruction results.

For the Canny edge of the input character image shown in Fig. 12(a), the proposed technique finds the Stroke Width Pair (SWP) candidates as described in Section 3.2, where we can see the characters “W” and “5” given by partial reconstruction with lost shapes. The SWP are considered as representatives for reconstruction in this Section. For each SWP, just as the proposed technique defines symmetrical features using gradient values, gradient vector flow and Laplacian, we define the same symmetry features using gray information rather than gradient information. This is because, according to our analysis of the experimental results, the gradient does not give good responses for low contrast, low resolution and distorted images. This is the main reason for the loss of shapes, which in turn has led to partial reconstruction. Since characters are segmented and pixels have uniform color values, we propose symmetry features in the gray domain to restore the rest of the incomplete information for partial reconstruction results to obtain complete character shapes.

For SWP, the proposed technique calculates a tangent angle as defined below:

$Angle_{\tan} = \tan\left((y - y_1)/(x - x_1)\right)$

where (x, y) is the starting pixel location of the SWP, and (x1, y1) is the location of its neighbor pixel. Since the tangent angle between the pixel of SWP and the neighbor pixel gives a direction, the proposed technique finds the neighbor pixel in the same direction with the same stroke width distance to restore the neighbor SWP. As long as the difference between the tangent angle of the current pixel and the neighbor pixel remains the same, and the neighbor pair satisfies the stroke width distance of SWP, the proposed technique moves in the same direction to restore the neighbor SWP. This process works well when straight strokes are present, whilst at curves and corners the tangent angle gives a high difference. Moreover, this tangent-based restoration works well for individual characters but not for the whole license plate image, where the tangent direction may lead to touching, adjacent characters. In this situation, the proposed technique recalculates the stroke width using eight neighbors of SWP pixels as we calculated in Section 3.2. To find the right SWP combination out of 64, we define two symmetry features. First, the intensity values at the first pixel and the second pixel should be almost the same, as shown in Fig. 12(b), which is called the Peak Intensity Symmetry (PIS). Second, the intensity values between the first and second pixels of SWP should have gradual changes from high to low and low to high as shown in Fig. 12(c), which is called the Intensity Symmetry (IS). If the combination of SWP satisfies the above two symmetry features, the pair is considered as actual contour pixels and displayed as white pixels, which are shown in Fig. 12(d), where one can see that the lost information in Fig. 12(a) is restored. The potential of complete character reconstruction for the license plate images shown in Fig. 13(a) can be seen in Fig. 13(b), where shapes are restored, and the recognition results in Fig. 13(c) illustrate correct OCR recognition results for both the license plate images.

In summary, the gradient domain helps us to define symmetry properties, but at the same time, it misses vital pixels of characters due to sensitivity to low contrast and low resolution, which results in partial character reconstruction. To overcome this problem, the proposed method defines the same properties using gray values rather than gradient values. This is because the segmented character is not influenced by complex backgrounds and it is understood that the pixels of the characters have almost uniform values. Therefore, the combination of the properties in the gradient and gray domains helps us to restore the missing information. In other words, the partial reconstruction helps in the accurate segmentation of characters, while segmentation helps in restoring the complete shape using intensity values in the gray domain.

4. Experimental results

To evaluate the effectiveness of the proposed technique for real-time applications, we consider the dataset provided by MIMOS, which is the institute funded by the Government of Malaysia where License Plate Recognition (LPR) is a live ongoing project. The dataset consists of 680 complex license plate images with various challenges, such as poor quality images where we can expect low contrast, blurred images, and character-touching images due to illumination effects, sun light, or headlights at night.

Fig. 12. Complete reconstruction in the gray domain.
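The gray-domain checks illustrated in Fig. 12 can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the function names, the intensity tolerance of 10 gray levels, and the representation of a stroke-width pair as a list of gray values sampled from its first pixel to its second pixel are assumptions. The direction between a pixel and its neighbor is computed with `math.atan2`, which assumes that an inverse tangent is intended and avoids a division by zero for vertical neighbors.

```python
import math


def tangent_angle_deg(pixel, neighbor):
    """Direction in degrees from `neighbor` (x1, y1) to `pixel` (x, y)."""
    (x, y), (x1, y1) = pixel, neighbor
    return math.degrees(math.atan2(y - y1, x - x1))


def peak_intensity_symmetry(profile, tol=10):
    """PIS: the first and last pixels of the pair have almost the same gray value."""
    return abs(profile[0] - profile[-1]) <= tol


def intensity_symmetry(profile):
    """IS: gray values fall to a single interior minimum and then rise again."""
    k = profile.index(min(profile))
    falling = all(a >= b for a, b in zip(profile[:k + 1], profile[1:k + 1]))
    rising = all(a <= b for a, b in zip(profile[k:], profile[k + 1:]))
    return 0 < k < len(profile) - 1 and falling and rising


def is_contour_pair(profile, tol=10):
    """A candidate pair is kept only when both symmetry features hold."""
    return peak_intensity_symmetry(profile, tol) and intensity_symmetry(profile)


if __name__ == "__main__":
    # Hypothetical gray values across a dark stroke on a bright background.
    across_stroke = [200, 150, 90, 60, 95, 160, 198]
    print(is_contour_pair(across_stroke))
    print(tangent_angle_deg((5, 5), (4, 4)))
```

The sketch assumes dark strokes on a brighter background; for bright characters on a dark plate the profile would be inverted before the test.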

Fig. 13. Effectiveness of the complete reconstruction algorithm.

To demonstrate the merit of the proposed technique, we consider standard datasets that are available publicly, namely, the UCSD dataset (Zamberletti, Gallo, & Noce, 2015) with 1547 images, which have a variety of challenges including the presence of blur, license plate images with very small font captured from a substantial distance, and low resolution images. The Medialab dataset (Zamberletti et al., 2015) contains 680 license plate images, which have a variety of font sizes, illumination effects, and shadow effects. The Uninsubria dataset (Zamberletti et al., 2015) contains 503 license plate images captured from nearby, which are of better quality compared to the UCSD and Medialab datasets, but generally have more complex backgrounds. In total, we considered 3410 license plate images for experimentation, covering the multiple factors that were mentioned in the Introduction section. In addition, we chose 100 license plate images that are affected by multiple adverse factors as mentioned above from all the license plate datasets to test the ability and effectiveness of the proposed technique; these are termed challenging data. This data does not include ‘good’ (easy) images like in other datasets.

Since the proposed technique is capable of handling multiple causes, we test the ability of the proposed technique on other standard datasets, such as ICDAR 2013 which has 28 videos (Karatzas et al., 2013), YVT which has 29 videos (Nguyen, Wang, & Belongie, 2014), and ICDAR 2015 which has 49 videos (Karatzas et al., 2015). These datasets are popular for evaluating text detection and recognition methods. They include low resolution, low contrast, complex backgrounds, and multiple fonts, sizes, or orientations. Similarly, for natural scene datasets, we use ICDAR 2013 (Karatzas et al., 2013), which has 551 images, SVT which has 350 images (Wang & Belongie, 2010), MSRA-500 which has 500 images (Yao et al., 2012) and ICDAR 2015 which has 462 images (Karatzas et al., 2015). The reason to consider natural scene datasets for experimentation is to show that when the proposed technique works well for low resolution and low contrast images, it will also work for high resolution and high contrast images. The main differences between the video datasets and these datasets are contrast and resolution. In other words, video datasets suffer from low resolution and low contrast, while natural scene datasets provide high contrast and high resolution images. In total, 3510 license plate images, 106 videos, and 1863 natural scene images are considered for experimentation to demonstrate that the proposed technique is robust, generic and effective.

The proposed technique involves a reconstruction step and a character segmentation step. To evaluate the reconstruction step, we follow the standard measures and scheme used in Peyrard, Baccouche, Mamalet, and Garcia (2015) for calculating measures, namely, Peak Signal to Noise Ratio (PSNR), Root Mean Square Error (RMSE), and Mean Structural Similarity (MSSIM) as defined below. Since the measures used in Peyrard et al. (2015) were proposed for evaluating the quality of handwritten images, we prefer these measures for evaluating the reconstruction steps of the proposed technique.

$PSNR = \frac{1}{N}\sum_{i=1}^{N} PSNR_i$ (1)

$RMSE = \frac{1}{N}\sum_{i=1}^{N} RMSE_i$ (2)

$MSSIM = \frac{1}{N}\sum_{i=1}^{N} MSSIM_i$ (3)

For character segmentation, we use the standard measures proposed in Phan et al. (2011), where the same measures are used for character segmentation, namely, Recall (R), Precision (P), F-measure (F), UnderSegmentation (U) and OverSegmentation (O). The definitions for the measures are as follows.
Truly Detected Character (TDC): A segmented block that contains correctly-segmented characters.
Under-Segmented Blocks (USB): A segmented block which contains more than one character.
Over-Segmented Blocks (OSB): A segmented block that contains no complete characters.
False Detected Block (FDB): A segmented block that does not contain any characters; for example, intermediate objects, boundary or a blank space.
The measures can be calculated as follows.

Recall (R) = TDC / ANC
Precision (P) = TDC / (TDC + FDB)
F-measure (F) = (2 * P * R) / (P + R)
UnderSegmentation (U) = USB / ANC
OverSegmentation (O) = OSB / ANC

To validate that the reconstruction step preserves character shapes, we consider the character recognition rate as a measure for reconstructed images with the publicly available Tesseract OCR (2016). For the purpose of the evaluation of the recognition results, we follow the definitions of Recall (RR), Precision (RP) and F-measure (RF) as in Ben-Ami, Basha, and Avidan (2012), because these definitions are proposed for Bib number recognition. Since Bib numbers and license numbers are similar, we prefer to use these measures.

RR is defined as the percentage of correctly recognized characters out of the total number of characters (ground truth), and RP is defined as the percentage of correctly recognized characters out of the total number of recognized characters. For the F-measure, we use the same formula employed for evaluating the segmentation step for combining RR and RP into one measure.

Note that since there is no ground truth available for the license plate datasets, for MIMOS, UCSD, Medialab, and Uninsubria, we manually count the Actual Number of Characters (ANC) as the ground truth. For standard video and scene image datasets, we use the available ground truth and evaluation schemes as instructed in the ground truth.

In order to show the usefulness and effectiveness of the proposed technique, we implement existing character segmentation methods to facilitate comparative studies, namely, the method of Phan et al. (2011), which uses minimum cost path estimation for character segmentation in video; the method of Khare et al. (2015), which proposes sharpness features for character segmentation in license plate images; and the method of Sharma et al. (2013), which uses the combination of cluster analysis and minimum cost path estimation for character segmentation in video. The main reason for selecting these existing methods is that an existing method which focuses on a single factor may not work well for license plate images affected by multiple factors. Phan et al.’s method addresses low resolution and low contrast factors, Khare et al.’s method is a recent one that addresses license plate issues to some extent, and Sharma et al.’s method addresses multi-oriented and touching factors. Dhar et al. (2018) proposed a system design for license plate recognition by using edge detection and convolutional neural networks. Ingole and Gundre (2017) proposed character feature-based vehicle license plate detection and recognition. Radchenko et al. (2017) proposed a method of segmentation and recognition of Ukrainian license plates. The reason to choose these methods is that the objective of the methods is the same as the proposed work. However, the methods are confined to specific applications.

In the same way, we choose the state-of-the-art recognition methods, namely, the method of Zhou et al. (2013), which is a robust binarization approach that works well for high resolution and low contrast images; the method of Tian et al. (2015a), which is a recent approach proposed for the recognition of video characters through shape restoration; and the method of Anagnostopoulos et al. (2006), which proposes an artificial neural network for character recognition in license plate images. The motivation to choose these methods for the comparative study is that Zhou et al.’s method is the state-of-the-art approach which represents recognition of scene characters through binarization, Tian et al.’s method is the state-of-the-art approach which represents recognition of video characters through reconstruction, and the method of Anagnostopoulos et al. (2006) is the state-of-the-art approach recognizing characters in license plates through classifiers. Since the proposed technique is robust to multiple factors, we chose these methods to work on different datasets for undertaking a comparative study to validate the strengths of the proposed technique. Additionally, we also consider the following methods that explore the recent deep learning models for license plate recognition. Bulan et al. (2017) proposed segmentation- and annotation-free license plate recognition with deep localization and failure identification. The method explores CNNs for detecting a set of candidate regions. Silva and Jung (2018) proposed license plate detection and recognition in unconstrained scenarios. The method explores CNNs for addressing challenges caused by degradation. Lin et al. (2018) proposed an efficient license plate recognition system using convolutional neural networks.

For finding the values for the parameters, threshold, symmetry properties and conditions, we randomly chose 500 sample images from the dataset for experimentation. Since the proposed method does not involve classifiers for training, we prefer to choose samples randomly from all the databases considered in this work for experimentation. We use a system with an Intel Core i5 CPU with 8 GB RAM configuration for all experiments. According to our experiments, the proposed method consumes 30 ms for each image, which includes partial reconstruction, character segmentation, complete character reconstruction and recognition.

In Section 3.3, we define three hypotheses for ideal character detection, over-segmentation and under-segmentation based on the principal (PCA) and major axes (MA). It is expected that the PCA gives the same angles for the ideal characters. However, it is not the case due to the complexity of the problem. Therefore, we

Fig. 14. Determining the optimal value for the threshold of PCA and MA to check whether a segmented character is ideal or not.
Note: at an angle value of 26°, the recognition rate is high compared to the other values.

Fig. 15. Determining the percentage of missing pixels to define partial reconstruction and the threshold value for angle difference between PCA and MA angles.
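The percentage of missing pixels referred to in Fig. 15 could be measured against a ground-truth mask as sketched below. This is an assumption-laden illustration: the function name and the representation of images as nested lists of 0/1 values are hypothetical, and "missing" is taken to mean a character pixel of the ground truth that is absent from the reconstructed result.

```python
def missing_pixel_percentage(ground_truth, reconstructed):
    """Percentage of ground-truth character pixels absent from the reconstruction.

    Both arguments are equally sized 2-D lists where a truthy value marks a
    character (white) pixel.
    """
    total = 0
    missing = 0
    for gt_row, rec_row in zip(ground_truth, reconstructed):
        for gt_pixel, rec_pixel in zip(gt_row, rec_row):
            if gt_pixel:
                total += 1
                if not rec_pixel:
                    missing += 1
    if total == 0:
        raise ValueError("ground truth contains no character pixels")
    return 100.0 * missing / total


if __name__ == "__main__":
    # Hypothetical 2x2 masks: three of the four character pixels are lost.
    truth = [[1, 1], [1, 1]]
    result = [[1, 0], [0, 0]]
    print(missing_pixel_percentage(truth, result))
```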

set ±26° as a threshold for character segmentation using partial reconstruction results. To determine the value, we conduct experiments on 500 samples chosen randomly by varying the angle value against the recognition rate as shown in Fig. 14, where it is observed that for an angle of 26°, the proposed method reports a high recognition rate. Hence, we choose the same value for all the experiments in this work.

In Section 3.2, the proposed method introduces the partial reconstruction concept for character segmentation. It is expected that the partial reconstruction step outputs the structure of the character shape such that at least a human could read the character. The question is how to define the partial reconstruction in terms of quantity. Therefore, we conducted experiments by estimating the number of missing pixels compared to the pixels in the ground truth. In this experiment, we manually add noise and blur at different levels to make the character images complex such that they lose pixels. We calculate the percentage of missing pixels with the help of the ground truth. We illustrate sample results for different percentages of missing pixels during partial reconstruction as shown in Fig. 15(a), where we can see the angles given by PCA and MA, the difference between the PCA and MA angles, and different percentages of missing white pixels. It is observed from Fig. 15(a) and 15(b) that for 90% to 40%, the proposed method constructs the complete shape of the character and obtains correct recognition results. But for lower than 40%, the proposed method loses the shape of the character, which results in incorrect recognition. Based on this experimental analysis, we consider 40% as the threshold to define partial reconstruction results in this work. It is also noted from Fig. 15(a) that for an angle difference of 28.2°, the proposed criteria for character segmentation fail as the OCR gives incorrect results. It is evident that ±26° is a feasible threshold to achieve better results.

4.1. Experiments for analyzing the contributions of individual steps of the proposed technique

The major contributions of the proposed technique are partial reconstruction, character segmentation and complete reconstruction. To understand the effectiveness of each step, we conducted experiments on the MIMOS dataset and calculated the respective measures as reported in Table 1. The reason for selecting the

Table 1
Performances of individual steps of the Proposed Technique on the MIMOS dataset.

Steps     Quality measures      Segmentation measures              Recognition
          PSNR   RMSE   MSSIM   R      P      F      O      U      RR     RP     RF
PR        12.3   69.4   0.29    -      -      -      -      -      23.7   13.1   18.5
SWR       -      -      -       16.3   8.4    11.1   33.3   26.6   -      -      -
CRWS      8.6    78.6   0.24    -      -      -      -      -      14.4   13.2   13.8
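The segmentation measures reported in Table 1 (R, P, F, O, U) can be computed from block counts as sketched below. The function name and the example counts are hypothetical. The precision denominator TDC + FDB is an assumption based on the usual definition of precision; the other ratios follow the definitions given in Section 4.

```python
def segmentation_measures(tdc, fdb, usb, osb, anc):
    """Compute recall, precision, F-measure, under- and over-segmentation rates.

    tdc: truly detected characters      fdb: false detected blocks
    usb: under-segmented blocks         osb: over-segmented blocks
    anc: actual number of characters (manually counted ground truth)
    """
    recall = tdc / anc
    # Assumed denominator: all blocks that were reported as characters.
    precision = tdc / (tdc + fdb) if (tdc + fdb) else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "R": recall,
        "P": precision,
        "F": f_measure,
        "U": usb / anc,
        "O": osb / anc,
    }


if __name__ == "__main__":
    # Hypothetical counts for a plate set with 100 characters.
    for name, value in segmentation_measures(80, 20, 5, 10, 100).items():
        print(f"{name}: {100 * value:.1f}")
```

The recognition measures RR, RP and RF combine in the same way, with correctly recognized characters in place of TDC.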

MIMOS dataset is that it consists of live data provided by a research institute. To estimate the quality measures for partial reconstruction and complete reconstruction, we use the Canny edge images of English alphabets created artificially as the ground truth. It is noted from the quality measures of the partial reconstruction reported in Table 1 that except MSSIM, the other two measures report poor results. This shows that the partial reconstruction steps preserve the character structures but, at the same time, some information is lost. It is evident from the recognition results of the partial reconstruction reported in Table 1 that all three measures report low results. Therefore, one can ascertain that partial reconstruction alone may not help us to achieve better results. To analyze the effectiveness of the segmentation step, we apply the segmentation step on the Canny edge images of the input characters without partial reconstruction (SWR) results. It is observed from the measures of segmentation that they all report low results; in particular, under- and over-segmentation report poor results. This shows that the segmentation step alone is inadequate for solving the problem of segmentation for license plate images. Similarly, we apply the steps of the complete reconstruction algorithm on the Canny edge image of each input image without segmenting characters (CRWS). The results reported in Table 1 show that the quality measures report low results except for MSSIM, and the measures of recognition also report poor results. Therefore, we can argue that the symmetry features proposed for complete reconstruction are not good when we apply them on the whole image without segmentation. Overall, we can conclude that reconstruction and character segmentation complement each other to achieve better results.

In the case of license plate recognition, when the images are affected by multiple causes, we can sometimes expect a little elongation, such as the effect of perspective distortion. To show the effect of elongation created by multiple causes, we implemented the method in Dhar et al. (2018), which considers extrema points for correcting small tilts to the horizontal direction. In this work, we calculate quality measures, segmentation measures and recognition rate before and after rectification on the MIMOS dataset as reported in Table 2. Before rectification, the images are considered as input without correcting the small tilt in the horizontal direction for experimentation. After rectification, the corrected images are considered for experimentation. It is found from Table 2 that all the steps, including the proposed method, give slightly better results after rectification compared to before rectification. However, the difference is marginal. Therefore, we can conclude that overall, if we use rectification before recognizing the license plates, the recognition rate improves slightly.

Table 2
Performance of the individual steps and the proposed method before and after rectification on the MIMOS dataset.

Before rectification
Steps     Quality measures      Segmentation measures              Recognition
          PSNR   RMSE   MSSIM   R      P      F      O      U      RR     RP     RF
PR        12.3   69.4   0.29    -      -      -      -      -      23.7   13.1   18.5
SWR       -      -      -       16.3   8.4    11.1   33.3   26.6   -      -      -
CRWS      8.6    78.6   0.24    -      -      -      -      -      14.4   13.2   13.8
Proposed  32.1   7.1    0.65    86.8   82.6   84.6   10.8   2.4    88.4   84.3   86.3

After rectification
Steps     Quality measures      Segmentation measures              Recognition
          PSNR   RMSE   MSSIM   R      P      F      O      U      RR     RP     RF
PR        13.7   65.4   0.32    -      -      -      -      -      27.6   15.3   21.4
SWR       -      -      -       18.9   10.6   14.7   30.7   24.4   -      -      -
CRWS      10.4   74.3   0.21    -      -      -      -      -      ...    ...    ...
Proposed  34.7   6.4    0.62    88.9   84.3   86.6   9.4    2.1    90.6   87.3   88.9

Note: in the after-rectification block, only one set of recognition values (27.6, 15.3, 21.4) is available for PR and CRWS; it is listed under PR because it is closest to the PR values before rectification, and the CRWS entries are left elided.

4.2. Experiments on the proposed character segmentation approach

Qualitative results of the proposed technique on license plate images of different datasets, namely, MIMOS, Medialab, UCSD and Uninsubria, are shown in Fig. 16(a) and 16(b), where we can see that the complexity of the input images varies from one dataset to another due to the multiple factors of the datasets. For such images, the proposed technique segments characters successfully. It is evident that the proposed technique is robust to multiple factors. Quantitative results of the proposed and existing techniques for the above-mentioned datasets are reported in Table 3, where we note that the proposed technique is the best at all the measures, especially the under- and over-segmentation rates, which report a low score compared to the existing techniques. Table 3 shows that all the methods, including the proposed technique, provide good accuracies on the MIMOS dataset and the lowest for the UCSD dataset. This is because the number of distorted images is higher in the case of the UCSD dataset compared to MIMOS and the other datasets. The results of the proposed and existing methods

Table 3
Performance of the proposed and existing techniques for character segmentation on different license plate datasets.

Datasets                  Measures  Phan et al.    Khare et al.   Sharma et al.  Dhar et al.    Ingole and     Radchenko      Proposed
                                    (2011)         (2015)         (2013)         (2018)         Gundre (2017)  et al. (2017)
MIMOS                     R         39.4           58.4           68.3           73.4           74.6           65.3           86.8
                          P         38.4           57.3           66.9           72.3           70.4           63.3           82.6
                          F         38.7           57.5           67.5           72.8           72.5           64.3           84.6
                          O         21.1           23.2           14.9           14.8           15.3           18.3           10.8
                          U         38.7           18.4           16.7           12.4           12.2           17.4           2.4
Medialab                  R         34.3           51.3           54.7           69.7           70             59.4           82.1
                          P         33.6           47.3           42.1           64.2           67.4           55.4           81.6
                          F         33.9           49.3           48.3           66.9           68.7           57.4           81.6
                          O         24.2           25.2           19.7           21.1           20.4           42.6           10.1
                          U         39.6           20.6           22.6           12.7           10.9           23.3           7.9
UCSD                      R         21.3           26.1           41.3           35.2           47.2           29.6           56.7
                          P         20.4           22.4           36.9           30.6           40.7           27.4           53.4
                          F         20.8           24.6           39.1           32.9           43.9           28.5           55.1
                          O         35.5           43.1           26.4           39.7           34             35.9           12.9
                          U         45.7           30.6           32.6           27.4           22.1           35.6           29.8
Uninsubria                R         31.4           42.7           61.3           41.6           53.6           48.7           75.7
                          P         30.5           41.6           57.4           39.8           50.9           46.1           66.4
                          F         30.9           42.1           59.3           40.7           52.2           47.4           71.1
                          O         35.7           28.9           22.3           31.9           26.9           24.3           12.3
                          U         32.9           28.4           16.4           27.4           20.8           28.3           12.6
Only Challenged Images    R         33.4           47.2           57.4           43.6           54.8           51.6           72.1
                          P         36.2           42.3           52.3           41.1           50.4           47.9           73.4
                          F         34.7           44.7           54.8           42.3           52.6           49.7           72.6
                          O         34.3           29.7           24.6           33.5           24.8           26.8           13.6
                          U         30.9           25.5           20.6           24.1           22.6           23.4           13.8
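
The F values in Table 3 appear to follow the usual harmonic mean of recall (R) and precision (P). This section does not define F, so the formula below is an assumption; the function name and the two sample rows (Proposed column, MIMOS and UCSD) are chosen only for illustration. A minimal sketch:

```python
def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both on the same scale, e.g. percent)."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# (R, P) pairs for the proposed technique, copied from Table 3.
table3_proposed = {
    "MIMOS": (86.8, 82.6),
    "UCSD": (56.7, 53.4),
}

for dataset, (r, p) in table3_proposed.items():
    print(f"{dataset}: F = {f_measure(r, p):.1f}")
```

For these two rows the computed values fall within about 0.1 of the reported F, which is compatible with rounding of R and P before tabulation.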

Fig. 16. Qualitative results of the proposed technique for character segmentation on different datasets.

on challenging data show that the proposed method performs almost the same as on the other license plate datasets, despite the fact that the challenging data does not include any ‘good’ (easy) images as the other datasets do. The reason for the poor results of the existing methods is that the main goal of all three is to detect text in video or natural scene images rather than in license plate images. Similarly, although the methods of Dhar et al. (2018), Ingole and Gundre (2017) and Radchenko et al. (2017) were developed for character segmentation from license plate images, they do not perform well on all the datasets compared to the proposed method. The reason is that, like conventional document analysis methods, these methods depend on profile-based features, binarization and the specific nature of the dataset.

Similarly, quantitative results of the proposed and existing techniques for video and natural scene images are reported in Table 4, where it is observed that the proposed technique is the best in terms of the F-measure, with low under- and over-segmentation rates, compared to the existing techniques. It may be noted from Table 4 that the proposed technique gives consistent results for all the datasets except the MSRA-TD-500 dataset. This is because this dataset contains arbitrarily oriented texts. Since our aim is to develop a technique for license plate images, where we may not find arbitrary orientations, the proposed technique gives poor results when the characters are in arbitrary orientations, such as curved texts. The reason for the poor results of the existing methods is that all three are sensitive to the starting point, as they need to estimate the minimum cost path. On the other hand, the proposed technique requires neither seed points nor starting points to find spaces between characters. Overall, the segmentation experiments show that the proposed technique is capable of handling license plates as well as video and natural scene images.

4.3. Experiments on the proposed character recognition technique through reconstruction

Qualitative results of the proposed and existing techniques for the recognition of license plate images are shown in Fig. 17(a)–17(d) for MIMOS, Medialab, UCSD and Uninsubria, respectively. The recognition step takes the output of the segmentation step, shown in Fig. 17, as input to reconstruct the shapes of the segmented characters. The results indicate that the proposed technique reconstructs shapes well for characters of different datasets affected by different factors. This can be validated by the recognition results given by the OCR, shown in double quotes in Fig. 17. Thus, we can assert that the proposed technique does not require binarization for recognition. To evaluate the reconstruction results given by the proposed and existing techniques, we estimate quality measures, which are reported in Table 5 for the license plate, video and natural scene image datasets. Since Tian et al.’s method (Tian et al., 2015a) outputs

Table 4
Performance of the proposed and existing techniques for character segmentation on different video and natural scene datasets.

Datasets                  Measures  Phan et al.    Khare et al.   Sharma et al.  Dhar et al.    Ingole and     Radchenko      Proposed
                                    (2011)         (2015)         (2013)         (2018)         Gundre (2017)  et al. (2017)
ICDAR 2015 Video          R         22.6           37.9           60.7           39.4           55.3           46.8           66.9
                          P         24.6           34.2           58.3           37.4           53.9           42.8           62.4
                          F         23.2           36.1           59.5           38.4           54.6           44.8           64.6
                          O         38.7           28.1           23.4           31.9           24.1           30.9           18.7
                          U         36.4           35.8           17.1           29.7           21.3           24.3           18.1
YVT Video                 R         30.6           38.2           51.6           51             65.9           57.4           74.9
                          P         29.7           41.6           52.4           49.1           60.3           56.3           73.4
                          F         30.1           39.9           52.1           50             63.1           56.8           73.8
                          O         32.7           32.6           24.3           29.3           24.5           24.8           16.2
                          U         36.1           29.8           23.6           20.7           12.4           18.3           14.9
ICDAR 2013 Video          R         28.9           37.4           52.6           54.2           66.8           54.5           71.2
                          P         27.6           39.1           51.4           50.3           63.4           52.2           70.9
                          F         28.1           38.5           51.9           52.2           65.1           53.3           71.1
                          O         42.4           33.2           17.8           30.3           20.8           29.2           13.5
                          U         30.6           28.9           29.4           17.4           14.1           17.4           18.4
ICDAR 2015 Scene Dataset  R         31.4           42.7           61.3           56.7           58.3           54.1           71.3
                          P         30.5           41.6           57.4           52.2           52.8           48.7           69.4
                          F         30.9           42.1           59.3           54.7           55.5           51.4           70.7
                          O         35.7           28.9           22.3           29.1           31.9           27             14.2
                          U         32.9           28.4           16.4           16.4           12.5           21.6           16.8
ICDAR 2013 Scene Dataset  R         32.6           40.7           61.1           59.3           54.3           53.8           76.8
                          P         32.5           46.5           52.2           52.6           53.7           47.2           72.3
                          F         32.5           43.4           56.6           55.9           54             50.5           74.5
                          O         37.1           29.7           21.3           23.6           25.3           25.4           16.3
                          U         32.4           26.9           22.0           20.4           20.7           24.1           9.1
SVT Scene Dataset         R         21.4           38.6           61.3           43.4           50.6           47.9           64.7
                          P         20.4           31.4           57.4           39.2           45.8           42.7           61.9
                          F         20.9           35.0           59.3           41.3           48.2           45.3           63.3
                          O         41.3           22.9           22.3           31.6           27.5           30.8           12.6
                          U         37.2           42.1           16.4           27.1           24.3           23.9           24.1
MSRA-TD-500 Dataset       R         22.4           26.7           42.1           30.7           38.4           34.3           59.3
                          P         23.6           24.4           32.1           28.4           35.7           29.4           57.3
                          F         22.7           25.6           37.1           29.5           37             31.8           58.6
                          O         32.4           41.2           29.3           42.6           36.6           39.9           22.7
                          U         46.9           35.7           31.8           27.8           26.3           28.2           28.9

Fig. 17. Qualitative results of the proposed technique for reconstruction and recognition on different license plate images.

Fig. 18. Overall performance of the proposed method on the images affected by multiple adverse factors. Column-1-Column-5 denote input images of different causes, the
results of partial reconstruction, the result of character segmentation, the result of full reconstruction and recognition, respectively.
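
Two of the quality measures reported in Table 5, RMSE and PSNR, can be sketched as follows. This section does not give the formulas, so the standard definitions for 8-bit grayscale images are assumed here; the function names, the nested-list image representation and the toy pixel values are hypothetical and not taken from the paper. MSSIM is omitted because it needs a windowed structural-similarity computation.

```python
import math


def rmse(reference, reconstructed):
    """Root-mean-square error between two equally sized grayscale images (nested lists)."""
    squared_diffs = [
        (a - b) ** 2
        for row_ref, row_rec in zip(reference, reconstructed)
        for a, b in zip(row_ref, row_rec)
    ]
    if not squared_diffs:
        raise ValueError("images must be non-empty")
    return math.sqrt(sum(squared_diffs) / len(squared_diffs))


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the two images are identical."""
    err = rmse(reference, reconstructed)
    if err == 0:
        return math.inf
    return 20.0 * math.log10(peak / err)


# Toy 2x2 example with hypothetical pixel values (not data from the paper).
ref = [[0, 255], [255, 0]]
rec = [[10, 245], [255, 0]]
print(f"RMSE = {rmse(ref, rec):.2f}, PSNR = {psnr(ref, rec):.2f} dB")
```

Under these definitions a lower RMSE and a higher PSNR both indicate a reconstruction closer to the reference, which is the direction in which the two columns of Table 5 are compared.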

Table 5
Performance of the proposed and existing techniques for reconstruction on different license, video and natural scene datasets.

Methods               Tian et al., 2015a      Proposed
Datasets              RMSE    PSNR    MSSIM   RMSE    PSNR    MSSIM
MIMOS                 22.7    19.9    0.74    7.1     32.1    0.65
Medialab              42.7    21.8    0.79    12.4    26.3    0.60
UCSD                  69.0    19.7    0.59    31.7    22.4    0.6
Uninsubria            72.4    8.4     0.52    26.3    23.8    0.4
ICDAR 2015 Video      63.5    11.7    0.61    19.7    23.9    0.63
YVT Video             55.3    16.1    0.68    16.3    24.5    0.67
ICDAR 2013 Video      63.6    11.7    0.61    18.4    24.0    0.60
ICDAR 2015 Scene      57.3    15.4    0.67    18.41   24.0    0.64
ICDAR 2013 Scene      57.3    15.4    0.67    16.2    24.6    0.65
SVT Scene             62.1    12.4    0.62    22.3    23.8    0.61
MSRA-TD 500 Scene     68.7    10.7    0.59    26.1    23.7    0.55
Only Challenged       39.2    12.5    0.57    15.6    20.4    0.48

reconstruction results for recognition, as our technique does and unlike the other existing methods, the proposed technique is compared only with Tian et al.’s method (Tian et al., 2015a). Table 5 shows that the proposed technique is better than the existing method in terms of three quality measures for all the three types of datasets. It is also observed from Table 5 that, when applied to the challenging data, the proposed method performs almost the same as on the other datasets. The main reason for the poor results of the existing method is that it depends on gradient information, which gives good responses for high contrast images when reconstructing character images, while the proposed technique uses both gradient and intensity information for reconstruction to handle images affected by multiple factors.

Quantitative results for the recognition of the proposed and existing techniques on license plate, video and natural scene image datasets are reported in Table 6. These experiments include recognition results using Canny edge images of the input character images, where we pass the Canny edge images to the OCR directly for recognition without reconstruction. We conducted these experiments to demonstrate that a Canny edge image alone, without reconstruction, is inadequate to achieve good results. This can be verified from the results reported in Table 6, where the recognition results with Canny images are far from those of the proposed technique in terms of all the three measures. This is due to the fact that Canny is sensitive to blur and complex backgrounds. We can also observe from Table 6 that the proposed technique achieves better results than the other existing methods for the complex datasets, namely, MIMOS, UCSD, YVT video, SVT, MSRA and the challenging dataset. For the other datasets, the existing method of Silva and Jung (2018) achieves better results than all the methods, including the proposed method. This is justifiable because that method explores a powerful deep learning model for unconstrained license plate recognition. It is evident that the methods of Bulan et al. (2017) and Lin et al. (2018), which also explore deep learning models for license plate recognition, achieve better results than all the other existing methods, but these two perform worse than the proposed method. However, the difference between the method of Silva and Jung (2018) and the proposed method is marginal. Besides, the results on difficult data show that the proposed method is effective in tackling challenges, as it reports almost the same results as on the other datasets. Therefore, the proposed technique is robust and generic compared to the existing methods. The major weaknesses of the existing methods are as follows. Since the gradient used in Tian et al.’s method is good for high contrast images, it gives poor results for low contrast images; Zhou et al.’s method is developed for high contrast and homogeneous background images, and hence gives poor results; and Anagnostopoulos et al.’s method involves binarization and parameter tuning, and so gives poor results for images affected by multiple factors. Conversely, the proposed technique does not depend on binarization and explores the

Table 6
Performance of the proposed and existing techniques for recognition on different license, video and natural scene datasets.

Datasets                  Measures  Canny            Anagnostopoulos  Zhou et al.      Tian et al.      Bulan et al.     Silva and        Lin et al.       Proposed
                                                     et al. (2006)    (2013)           (2015a)          (2017)           Jung (2018)      (2018)
MIMOS                     RR        58.7             63.2             47.4             57.6             86.3             89.3             78.3             88.4
                          RP        54.3             64.7             52.3             59.7             82.6             83.2             74.9             84.3
                          RF        56.4             63.8             50.3             58.6             84.5             86.2             76.6             86.3
Medialab                  RR        59.3             64.7             52.4             61.2             83.7             86.4             75.6             82.3
                          RP        52.4             66.9             56.8             62.7             75.3             82.3             71.9             79.3
                          RF        55.3             65.7             54.6             61.6             79.5             84.3             73.7             81.3
UCSD                      RR        29.2             42.3             47.2             44.9             52.4             58.3             51.7             65.7
                          RP        32.7             44.7             48.1             46.2             47.4             55.3             49.5             62.1
                          RF        31.3             43.6             47.6             45.5             49.9             56.8             50.6             63.9
Uninsubria                RR        62.4             65.3             68.3             64.3             76.4             78.4             77.1             78.7
                          RP        66.7             68.7             69.4             69.4             72.4             75.3             77.4             80.3
                          RF        64.8             66.9             68.8             67.1             74.4             76.8             77.2             79.5
ICDAR 2015 Video          RR        66.2             68.9             71.8             72.6             83.4             86.4             84.3             78.6
                          RP        61.3             75.7             72.3             72.7             81.3             80.3             78.9             73.4
                          RF        63.7             72.7             72.1             72.6             82.3             83.3             81.6             76.2
YVT Video                 RR        72.4             72.9             66.9             71.4             83.4             85.9             84.9             78.3
                          RP        65.3             77.8             70.3             74.8             79.2             81.4             78.8             82.6
                          RF        68.4             75.8             68.7             72.9             81.3             83.6             81.8             80.5
ICDAR 2013 Video          RR        68.2             78.7             71.3             74.9             81.6             83.2             79.5             83.7
                          RP        61.3             79.3             68.9             71.3             80.4             81.5             78.4             84.2
                          RF        65.7             78.5             69.8             72.8             81               82.3             78.9             83.5
ICDAR 2015 Scene          RR        66.8             77.3             72.1             65.3             82.3             85.7             80.2             80.3
                          RP        67.3             72.1             74.3             62.1             81.4             84.4             80.3             82.1
                          RF        66.9             75.2             73.6             64.4             81.8             85               80.2             81.5
ICDAR 2013 Scene          RR        59.3             72.3             71.3             65.6             83.1             86.1             81.4             78.3
                          RP        56.3             72.4             68.7             64.3             81.5             84.3             80.4             73.2
                          RF        58.6             72.3             70.1             64.9             82.3             85.2             80.9             75.8
SVT Scene                 RR        58.3             76.4             71.4             66.3             78.2             79.3             76.4             80.4
                          RP        59.7             78.3             74.7             67.2             77.3             76.3             74.8             81.6
                          RF        58.6             77.9             73.1             66.8             77.7             77.8             75.6             81.0
MSRA-TD-500 Scene         RR        64.3             73.9             75.9             72.4             78.4             81.3             74.9             82.4
                          RP        65.8             76.4             74.3             77.3             74.8             80.4             73.8             81.6
                          RF        64.9             75.9             75.1             75.4             76.6             80.8             74.3             81.9
Only Challenged Images    RR        58.7             51.9             47.4             57.6             54.8             57.6             55.9             62.9
                          RP        54.3             52.3             52.3             59.7             51.7             56.6             51.4             65.7
                          RF        56.5             52.1             49.8             58.6             53.2             57.1             53.6             64.3

combination of gradient and intensity for reconstruction through character segmentation, and it performs better than the existing methods, especially on the datasets whose images contain multiple challenging factors.

Overall, to show that the proposed method is robust to the multiple adverse factors mentioned in the Introduction and Proposed Methodology sections, we present sample results of each step on different images affected by low contrast, complex background, multi-fonts, multi-font sizes, blur and distortion due to perspective angle, as shown respectively in Fig. 18(a)–18(f), which include the results of partial reconstruction, character segmentation, full reconstruction and recognition. One can assert from the results shown in Fig. 18 that the proposed method has significant benefits in handling multiple adverse factors. If a license plate image contains any logo or symbol, as shown in Fig. 18(c), the segmentation algorithm dissects the symbols as characters. However, when the result is sent to an OCR, recognition of these symbols fails, as shown in Fig. 18(c). As a result, the presence of symbols in license plate images does not affect the overall performance of the technique. It is evident from the results on challenging data reported in Tables 3, 5 and 6 that the proposed method performs almost the same as on the other data. It is noted that for the recognition experiments, we use a publicly available OCR. This OCR has inherent limitations with respect to image size, font variation, and orientation. As a result, despite the fact that the proposed method reconstructs character shapes, it fails to achieve a high accuracy (more than 90%). Since our target is to address the above challenges and to develop a generalized method, we prefer to use available OCR approaches to demonstrate the effectiveness and usefulness of the proposed method rather than using language models, lexicons and learning models, because these restrict generality. Therefore, we believe that the proposed work makes an important statement: there is a way to handle adverse factors such that one can use machine learning or deep learning, instead of traditional OCR, to achieve high accuracy by taking the reconstructed results as input. Our next target is to achieve high accuracy on the MIMOS dataset by exploring deep learning.

To show the effectiveness of the proposed method on license plate images of different countries, we also test the steps of the proposed method on American license plate images, as shown in Fig. 19, where it is noted that the steps and the proposed method work well for American license plate images. This is the advantage of the steps proposed in this work, i.e. stroke width pair candidate detection, partial reconstruction, character segmentation, complete reconstruction and recognition. This shows that the proposed method is independent of scripts.

Fig. 19. Examples of the proposed stroke width pair candidate detection, reconstruction, segmentation and recognition approaches for American license plate images.

Fig. 20. Recognition rate of the proposed method for different scales to find the lower and upper boundary for scaling up and down.

To test the effect of scaling on license plate recognition, we calculate the recognition rate of the proposed method at different scales, as shown in Fig. 20. If the image is too small (i.e. the size of the character image is 4 × 4), the proposed method reports poor results, as shown in Fig. 20. Such a small size is rare in license plate recognition. However, for sizes greater than 16 × 16, the proposed method gives better results. This shows that different scales may not have much of an effect on the overall performance of the proposed method. Therefore, we can conclude that the proposed method is invariant to scaling. This is justifiable because the proposed features based on stroke width distance are invariant to scaling.

5. Conclusions and future work

We have proposed a novel technique for recognizing license plates, video and natural scene images through reconstruction. The proposed technique explores gradient and Laplacian symmetrical features based on stroke width distance to obtain partial reconstruction for segmenting characters. To segment characters affected by multiple factors such as low contrast, blur, complex backgrounds, and illumination variations, we introduce angular information for partial reconstruction results based on character structures, which solves under- and over-segmentation successfully. For segmented characters, the proposed technique explores symmetry features based on stroke width distance and tangent direction in the gray domain to restore complete shapes from the partial reconstruction results. Comprehensive experiments are conducted on large datasets, which include license plates, video and natural scene images, to show that the proposed technique is robust and generic compared to existing methods. The same idea can be extended with the help of deep learning to images of different scripts from other countries, such as Indian, Russian, Arabic and European, to develop a generic system in the near future.

Acknowledgements

This work was supported by the Natural Science Foundation of China under Grant nos. 61672273 and 61832008, and the Science Foundation for Distinguished Young Scholars of Jiangsu under Grant BK20160021. This work is also partly supported by the University of Malaya under Grant no: UM.0000520/HRU.BK (BKS003-2018).

The authors would like to thank the anonymous reviewers and the Editor for their constructive comments and suggestions to improve the quality and clarity of this paper.

Conflict of Interest

None.

References

Abolghasemi, V., & Ahmadyfard, A. (2009). An edge-based color-aided method for license plate detection. Image and Vision Computing, 27(8), 1134–1142.
Al-Ghaili, A. M., Mashohor, S., Ramli, A. R., & Ismail, A. (2013). Vertical-edge-based car-license-plate detection method. IEEE Transactions on Vehicular Technology, 62(1), 26–38.
Al-Shemarry, M. S., Li, Y., & Abdulla, S. (2018). Ensemble of adaboost cascades of 3L-LBPs classifiers for license plates detection with low quality images. Expert Systems with Applications, 92, 216–235.
Anagnostopoulos, C. N. E., Anagnostopoulos, I. E., Loumos, V., & Kayafas, E. (2006). A license plate-recognition algorithm for intelligent transportation system applications. IEEE Transactions on Intelligent Transportation Systems, 7(3), 377–392.
Azam, S., & Islam, M. M. (2016). Automatic license plate detection in hazardous condition. Journal of Visual Communication and Image Representation, 36, 172–186.
Ben-Ami, I., Basha, T., & Avidan, S. (2012). Racing bib numbers recognition. In Proceedings of the BMVC (pp. 1–10).
Bulan, O., Kozitsky, V., Ramesh, P., & Shreve, M. (2017). Segmentation- and annotation-free license plate recognition with deep localization and failure identification. IEEE Transactions ITS, 18(9), 2351–2363.
Dhar, P., Guha, S., Biswas, T., & Abedin, M. Z. (2018). A system design for license plate recognition by using edge detection and convolution neural network. In Proceedings of the IC4ME2 (pp. 1–4).
Dong, M., He, D., Luo, C., Liu, D., & Zeng, W. (2017). A CNN-based approach for automatic license plate recognition in the wild. In Proceedings of the BMVC (pp. 1–12).
Du, S., Ibrahim, M., Shehata, M., & Badawy, W. (2013). Automatic license plate recognition (ALPR): A state-of-the-art review. IEEE Transactions on Circuits and Systems for Video Technology, 23(2), 311–325.
Epshtein, B., Ofek, E., & Wexler, Y. (2010). Detecting text in natural scenes with stroke width transform. In Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR) (pp. 2963–2970). IEEE.
Goodfellow, I. J., Bulatov, Y., Ibarz, J., Arnoud, S., & Shet, V. (2013). Multi-digit number recognition from street view imagery using deep convolutional neural networks. arXiv:1312.6082.
Gou, C., Wang, K., Yao, Y., & Li, Z. (2016). Vehicle license plate recognition based on extremal regions and restricted Boltzmann machines. IEEE Transactions on Intelligent Transportation Systems, 17(4), 1096–1107.
Ingole, S. K., & Gundre, S. B. (2017). Characters feature based Indian Vehicle license plate detection and recognition. In Proceedings of the I2C2 (pp. 1–5).
Jaderberg, M., Simonyan, K., Vedaldi, A., & Zisserman, A. (2016). Reading text in the wild with convolutional neural networks. International Journal of Computer Vision, 116(1), 1–20.
Karatzas, D., Gomez-Bigorda, L., Nicolaou, A., Ghosh, S., Bagdanov, A., Iwamura, M., et al. (2015). ICDAR 2015 competition on robust reading. In Proceedings of the 13th international conference on document analysis and recognition (ICDAR) (pp. 1156–1160). IEEE.
Karatzas, D., Shafait, F., Uchida, S., Iwamura, M., Bigorda, L. G., Mestre, S. R., et al. (2013). ICDAR 2013 robust reading competition. In Proceedings of the 12th international conference on document analysis and recognition (ICDAR) (pp. 1484–1493). IEEE.
Khare, V., Shivakumara, P., Raveendran, P., Meng, L. K., & Woon, H. H. (2015). A new sharpness based approach for character segmentation in License plate images. In Proceedings of the 3rd IAPR Asian conference on pattern recognition (ACPR) (pp. 544–548). IEEE.
Kim, D., Song, T., Lee, Y., & Ko, H. (2016). Effective character segmentation for license plate recognition under illumination changing environment. In Proceedings of the IEEE international conference on consumer electronics (ICCE) (pp. 532–533). IEEE.
Liang, G., Shivakumara, P., Lu, T., & Tan, C. L. (2015). A new wavelet-Laplacian method for arbitrarily-oriented character segmentation in video text lines. In Proceedings of the 13th international conference on document analysis and recognition (ICDAR) (pp. 926–930). IEEE.
Lin, C. H., Lin, Y. S., & Liu, W. C. (2018). An efficient license plate recognition system using convolution neural networks. In Proceedings of the ICASI (pp. 224–227).
Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., & Alsaadi, F. E. (2017). A survey of deep neural network architectures and their applications. Neurocomputing, 234, 11–26.
Nguyen, P. X., Wang, K., & Belongie, S. (2014). Video text detection and recognition: Dataset and benchmark. In Proceedings of the IEEE winter conference on applications of computer vision (WACV) (pp. 776–783). IEEE.
Peyrard, C., Baccouche, M., Mamalet, F., & Garcia, C. (2015). ICDAR2015 competition on text image super-resolution. In Proceedings of the 13th international conference on document analysis and recognition (ICDAR) (pp. 1201–1205). IEEE.
Phan, T. Q., Shivakumara, P., Su, B., & Tan, C. L. (2011). A gradient vector flow-based method for video character segmentation. In Proceedings of the international conference on document analysis and recognition (ICDAR) (pp. 1024–1028). IEEE.
Radchenko, A., Zarovsky, R., & Kazymyr, V. (2017). Method of segmentation and recognition of Ukrainian license plates. In Proceedings of the YSF (pp. 62–65).
Raghunandan, K. S., Shivakumara, P., Jalab, H. A., Ibrahim, R. W., Kumar, G. H., Pal, U., et al. (2017). Riesz fractional based model for enhancing license plate detection and recognition. IEEE Transactions on Circuits and Systems for Video Technology, 28(9), 2276–2288.
Rathore, M. M., Ahmad, A., Paul, A., & Rho, S. (2016). Urban planning and building smart cities based on the internet of things using big data analytics. Computer Networks, 101, 63–80.
Saha, S., Basu, S., & Nasipuri, M. (2015). iLPR: An Indian license plate recognition system. Multimedia Tools and Applications, 74(23), 10621–10656.
Sedighi, A., & Vafadust, M. (2011). A new and robust method for character segmentation and recognition in license plate images. Expert Systems with Applications, 38(11), 13497–13504.
Sharma, N., Shivakumara, P., Pal, U., Blumenstein, M., & Tan, C. L. (2013). A new method for character segmentation from multi-oriented video words. In Proceedings of the 12th international conference on document analysis and recognition (ICDAR) (pp. 413–417). IEEE.
Shivakumara, P., Dutta, A., Tan, C. L., & Pal, U. (2014). Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing. Multimedia Tools and Applications, 72(1), 515–539.
Shivakumara, P., Phan, T. Q., Bhowmick, S., Tan, C. L., & Pal, U. (2013). A novel ring radius transform for video character reconstruction. Pattern Recognition, 46(1), 131–140.
Shivakumara, P., Roy, S., Jalab, H. A., Ibrahim, R. W., Pal, U., Lu, T., et al. (2019). Fractional means based method for multi-oriented keyword spotting in video/scene/license plate images. Expert Systems with Applications, 118, 1–19.
Silva, S. M., & Jung, C. R. (2018). License plate detection and recognition in unconstrained scenarios. In Proceedings of the ECCV (pp. 593–609).
Suresh, K. V., Kumar, G. M., & Rajagopalan, A. N. (2007). Superresolution of license plates in real traffic videos. IEEE Transactions on Intelligent Transportation Systems, 8(2), 321–331.
Tadic, V., Popovic, M., & Odry, P. (2016). Fuzzified Gabor filter for license plate detection. Engineering Applications of Artificial Intelligence, 48, 40–58.
Tesseract OCR software (2016). http://vision.ucsd.edu/belongie-grp/research/carRec/car_rec.html
Tian, J., Wang, R., Wang, G., Liu, J., & Xia, Y. (2015a). A two-stage character segmentation method for Chinese license plate. Computers & Electrical Engineering, 46, 539–553.
Tian, S., Shivakumara, P., Phan, T. Q., Lu, T., & Tan, C. L. (2015b). Character shape restoration system through medial axis points in video. Neurocomputing, 161, 183–198.
Wang, K., & Belongie, S. (2010). Word spotting in the wild. In Proceedings of the European conference on computer vision (pp. 591–604). Springer.

Wang, Y., Shi, C., Xiao, B., & Wang, C. (2015). MRF based text binarization in complex images using stroke feature. In Proceedings of the 13th international conference on document analysis and recognition (ICDAR) (pp. 821–825). IEEE.
Yang, Y., Li, D., & Duan, Z. (2018). Chinese vehicle license plate recognition using kernel-based extreme learning machine with deep convolutional features. IET Intelligent Transport System, 12(3), 213–219.
Yao, C., Bai, X., Liu, W., Ma, Y., & Tu, Z. (2012). Detecting texts of arbitrary orientations in natural images. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 1083–1090). IEEE.
Ye, Q., & Doermann, D. (2015). Text detection and recognition in imagery: A survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(7), 1480–1500.
Yu, S., Li, B., Zhang, Q., Liu, C., & Meng, M. Q. H. (2015). A novel license plate location method based on wavelet transform and EMD analysis. Pattern Recognition, 48(1), 114–125.
Yuan, Y., Zou, W., Zhao, Y., Wang, Xin’an, Hu, X., & Komodakis, N. (2017). A robust and efficient approach to license plate detection. IEEE Transactions Image Processing, 26(3), 1102–1114.
Zamberletti, A., Gallo, I., & Noce, L. (2015). Augmented text character proposals and convolutional neural networks for text spotting from scene images. In Proceedings of the 3rd IAPR Asian conference on pattern recognition (ACPR) (pp. 196–200). IEEE.
Zhou, W., Li, H., Lu, Y., & Tian, Q. (2012). Principal visual word discovery for automatic license plate detection. IEEE Transactions on Image Processing, 21(9), 4269–4279.
Zhou, Y., Feild, J., Learned-Miller, E., & Wang, R. (2013). Scene text segmentation via inverse rendering. In Proceedings of the 12th international conference on document analysis and recognition (ICDAR) (pp. 457–461). IEEE.
