
Improving Farsi Font Recognition Accuracy by Using Proposed Directional Elliptic Gabor Filters


Majid Ziaratban
Electrical Engineering Department, Golestan University, Gorgan, Iran
m.ziaratban@gu.ac.ir

Fatemeh Bagheri
Computer Engineering Department, Golestan University, Gorgan, Iran
f.bagheri@gu.ac.ir

Abstract-- In this paper a directional filter is proposed to describe the curvedness of textures. The proposed filter is inspired by the basic Gabor filter and has an elliptic form. Thus, the proposed filters are called directional elliptic Gabor (DEG) filters. Characters and subwords in Farsi machine-printed texts are constructed from both straight and curved segments. Moreover, the amounts of curvedness of various Farsi fonts are different. Therefore, the features based on the proposed filter can be useful in Farsi font recognition. Better describing straightness and curvedness of text components increases the separability among various fonts. Experiments demonstrate that using both Gabor filters and the proposed DEG filters for texture feature extraction improves the Farsi font recognition accuracy.

Keywords—Farsi font recognition, Directional elliptic Gabor filters.

I. INTRODUCTION

Converting document images into editable text files is one of the interesting goals of optical character recognition (OCR). The accuracy of OCR in machine-printed texts with known fonts is significantly higher than that in texts with unknown fonts [1]. Several works have been done in optical font recognition in various languages such as Latin [2-12], Chinese [6, 13-16], Arabic [17-19], and Farsi [20,21]. The related works on optical font recognition are briefly listed in Table I. In all of these works, the font of the whole text in a document image was assumed to be uniform. For the cases of complex multi-font text images, a preprocessing stage is required to segment a multi-font document into several single-font text parts. This preprocessing stage is a wide subject in the OCR field and is not in the scope of this paper. Thus, like all other previous works, we propose an algorithm to recognize an unknown uniform font of a machine-printed document.

Font recognition approaches can be roughly divided into three main categories: typographical feature-based methods, frequently used component-based methods, and global texture analysis.

Typographical feature-based algorithms [3-5] extract some features, like character skews, between-characters and between-words space widths, and projections in upper, center and lower zones of the line from the printed texts. The main drawback of these approaches is that they require high-quality and noise-free document images [21].

The approaches based on frequently used components [13,17,18] have been proposed for content-independent font recognition applications. In these methods, the learning set consists of a number of samples of frequently used components in all font classes. In these approaches, the text font is determined based on the fonts of the detected samples of the components in a document image. The algorithms in this category require computing the matching scores between all predetermined most-frequent components in all font classes and all components of a test image. Due to the large number of required matchings, these methods are very time-consuming. By considering more font classes and a larger number of words in the test document images, the complexity and the processing time will increase considerably.

Texture analysis was used in many approaches [2,6,8,10,14,19-21] to determine the fonts of text blocks. These approaches first normalize the spaces between text lines, words, and characters in text images. Text blocks are normalized by filling the empty spaces at the end of text blocks. Then, the texture features are extracted from the normalized text blocks.

A Gabor filter was used in [20] to extract the texture features from Farsi document images. Seven font types and four font styles were considered and a recognition rate of 85% was obtained by using a weighted Euclidean distance (WED) classifier.

Recently, Khosravi and Kabir [21] proposed a Sobel-Roberts feature extraction method for Farsi font recognition. These features statistically describe the texture of the texts. In that work, 15000 training and 5000 test samples were used and a recognition rate of 94.16% was achieved for ten Farsi font types.

The rest of the paper is organized as follows: The proposed DEG filters, which are used for Farsi font recognition, are discussed in Section II. In Section III, the experimental results are presented. Finally, conclusions are drawn in Section IV.

978-1-4673-6206-1/13/$31.00 ©2013 IEEE


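TABLE I
RELATED WORKS ON OPTICAL FONT RECOGNITION

Work | Language | Font types | Font styles | Font sizes | All font classes | Font recognition rate (%) | Approach | Category
Zramdini 1993 [3] | French | 7 | 4 | 4 | 112 | 96.75 | Projection profiles + Multivariate Bayesian classifier | Typographical features
Zramdini 1998 [4] | French | 10 | 4 | 7 | 280 | 97.35 | Global typographical features + Multivariate Bayesian classifier | Typographical features
Jung 1999 [5] | English | 7 | 1 | 6 | 42 | 94.84 | Typographical features + Neural network | Typographical features
Zhu 2001 [6] | Chinese | 6 | 4 | 1 | 24 | 98.6 | Global texture analysis (Gabor filter) + WED classifier | Global texture analysis
Zhu 2001 [6] | English | 8 | 4 | 1 | 32 | 99.2 | Global texture analysis (Gabor filter) + WED classifier | Global texture analysis
Ha 2005 [14] | Chinese | 4 | 1 | 1 | 4 | 99.07 | Optimized Gabor filter with GA + Post processing | Global texture analysis
Borji 2007 [20] | Farsi | 7 | 4 | 1 | 28 | 82 | Global texture analysis (Gabor filter) + SVM | Global texture analysis
Borji 2007 [20] | Farsi | 7 | 4 | 1 | 28 | 85 | Global texture analysis (Gabor filter) + WED | Global texture analysis
Khosravi 2010 [21] | Farsi | 10 | 1 | 1 | 10 | 94.16 | Global texture analysis (Sobel-Roberts features) + MLP | Global texture analysis
Cruz 2005 [8] | Spanish | 8 | 4 | 1 | 32 | 100 | High-order statistical texture analysis (an artificial computer-generated dataset was used) | Global texture analysis
Ma 2005 [2] | Latin, Greek and Cyrillic | 5 | 1 | 1 | 5 | 95.93 | Grating cells + BPNN classifier | Global texture analysis
Ben Moussa 2005 [19] | Arabic | 9 | 1 | 1 | 9 | 94.44 | Fractal dimension + KPPV classifier | Global texture analysis
Lin 2001 [13] | Chinese | 5 | 1 | 1 | 5 | 99.32 | 40 most-frequent characters + Most-frequent font | Based on frequently used components
Abuhaiba 2003 [17] | Arabic | 3 | 4 | 3 | 36 | 77.4 | Template matching + Most-frequent font | Based on frequently used components
Abuhaiba 2005 [18] | Arabic | 3 | 4 | 3 | 36 | 90.8 | 100 most-frequent words + Decision tree | Based on frequently used components
Ziaratban [22] | Farsi | 11 | 4 | 1 | 34 | 100 | Dynamic Most-Frequent Connected Components | Based on frequently used components
Ziaratban [22] | Farsi | 10 | 1 | 1 | 10 | 96.58 | Dynamic Most-Frequent Connected Components | Based on frequently used components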

Fig. 1. Various curvedness of four different Farsi fonts. The texts of all text images are the same.

II. PROPOSED METHOD

The texture-based methods usually use directional filters to extract texture features. A directional filter, which is applied to a document image, is sensitive to the foreground pixels which are located in the corresponding direction. For example, in Gabor filters, the pixel values of a filtered image corresponding to the straight lines of the original image which are in the same direction as the applied Gabor filter are very high. The directional filters are widely used in texture analysis and present good results, but they cannot describe the amount of curvature of the textures. Farsi machine-printed texts consist of both straight line segments and curved segments. Furthermore, some Farsi fonts have higher curvedness than others. Four Farsi fonts are illustrated in Fig. 1. The left column in this figure shows the font name. The texts in these four text images are the same. As can be seen, the number of straight segments and sharp corners in Fig. 1(a) is higher than that of the others. Also, the curvedness in Fig. 1(c) is greater than that in the other text images. In Fig. 1(c), the corresponding corners are smoother than those in Fig. 1(a).

In this paper, we propose a filter which is most sensitive to curvatures and thus can describe the curvedness of the textures. The basic Gabor filter is as follows:

$$ g(x,y) = \exp\left\{-\frac{x^2+y^2}{2\sigma^2}\right\} \cdot \exp\left\{ i\,\frac{2\pi}{\lambda}\,(x\cos\theta + y\sin\theta)\right\} \tag{1} $$

A Gabor filter with λ=25, σ=22, and θ=0 is shown in Fig. 2(a). The proposed filter e(x,y) is the multiplication of two terms: an exponential term and a sinusoidal term as follows:
$$ e(x,y) = \exp\left\{-\frac{\left(\sqrt{x'^2+\gamma y'^2}-r\right)^2}{2\sigma^2}\right\} \cdot \sin\left\{\frac{2\pi}{\lambda}\sqrt{x'^2+\gamma y'^2}\right\} \tag{2} $$

$$ x' = x\cos\theta + y\sin\theta \tag{3} $$

$$ y' = -x\sin\theta + y\cos\theta \tag{4} $$

The location of the maximum values of the exponential term has an elliptic form. γ controls the ratio between the diameters of the ellipse. If γ=1, the ellipse is converted into a circle and in this case, r is the radius of the circle. The values of the exponential term inside and outside the ellipse reduce exponentially. The rate of the reduction is inversely proportional to σ. In Fig. 2(b) the exponential term is depicted for which γ=1, σ=8, and r=40.

In the sinusoidal term, λ is the spatial period. Fig. 2(c) shows the sinusoidal term for which γ=1 and λ=10. x' and y' are the rotated coordinates with angle equal to θ. The multiplication of Fig. 2(b) by 2(c) is shown in Fig. 2(d). As shown in Fig. 2(d-f), the proposed filter can be considered as the elliptic version of the Gabor filters in various directions. Consequently, we call the proposed filters Directional Elliptic Gabor (DEG) Filters.

To have stronger features and to achieve a higher accuracy, combinations of different features are used in the experiments. To concatenate different feature sets, they should be first normalized. Suppose that A and B are two feature sets. Atrain and Atest are the training and test subsets of A, respectively. The combined training and test features are obtained as follows:

$$ C_{train} = \mathrm{Concat}\{A_{train}/\mu_A,\; B_{train}/\mu_B\} \tag{5} $$

$$ C_{test} = \mathrm{Concat}\{A_{test}/\mu_A,\; B_{test}/\mu_B\} \tag{6} $$

where μA and μB are the means of Atrain and Btrain, respectively. Concat{A,B} concatenates A and B together.
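Eqs. (5) and (6) can be illustrated with a minimal Python sketch. It assumes that each feature set is a list of per-sample feature vectors and reads μA and μB as scalar means taken over all values of the corresponding training subset. The paper does not say whether the mean is a scalar or is taken per feature, so this is only one plausible reading, and the function names are hypothetical.

```python
def scalar_mean(feature_set):
    """Mean over every value in a list of feature vectors (assumed reading of mu)."""
    values = [v for row in feature_set for v in row]
    return sum(values) / len(values)


def concat_normalized(a, b, mu_a, mu_b):
    """Concat{A / mu_A, B / mu_B}: scale each set, then join the vectors sample by sample."""
    if len(a) != len(b):
        raise ValueError("both feature sets must describe the same samples")
    return [[v / mu_a for v in row_a] + [v / mu_b for v in row_b]
            for row_a, row_b in zip(a, b)]


# Toy data: 2 training samples and 1 test sample, with 2 features in A and 1 in B.
a_train = [[2.0, 4.0], [6.0, 8.0]]
b_train = [[100.0], [300.0]]
a_test = [[5.0, 10.0]]
b_test = [[400.0]]

# The means come from the training subsets only and are reused for the test subset.
mu_a = scalar_mean(a_train)   # (2 + 4 + 6 + 8) / 4
mu_b = scalar_mean(b_train)   # (100 + 300) / 2

c_train = concat_normalized(a_train, b_train, mu_a, mu_b)
c_test = concat_normalized(a_test, b_test, mu_a, mu_b)
print(c_train)
print(c_test)
```

Dividing by the training means brings feature sets with very different magnitudes to a comparable scale before they are joined, and reusing the training means for the test subset keeps test data out of the normalization.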

Fig. 2. (a) A Gabor filter (σ=21, λ=25, θ=0), (b) and (c) are the exponential term (γ=1, σ=8, r=40) and the sinusoidal term (γ=1, λ=10) of the proposed filter, respectively, (d-f) Three proposed directional elliptic Gabor filters: (d) γ=1, σ=7, r=40, λ=10; (e) γ=2, σ=18, r=65, λ=25, θ=0; (f) γ=4, σ=11, r=50, λ=25, θ=45.

III. EXPERIMENTAL RESULTS

We used the dataset which was gathered by Khosravi and Kabir [21] for the evaluations. This dataset consists of 15000 Farsi text blocks for training and 5000 blocks for testing. Texts were printed with two different printers, and the printing quality was changed several times to obtain images of different qualities. Two different scanners were used to scan the printed pages and the scanning resolution was set to 100 dpi. Ten font types were considered in this dataset as follows: ‘Times New Roman’, ‘Mitra’, ‘Traffic’, ‘Yagut’, ‘Homa’, ‘Lotus’, ‘Nazanin’, ‘Tahoma’, ‘Titr’ and ‘Zar’.

Gabor filter-based and Sobel-Roberts-based [21] algorithms were implemented for performance comparison. All algorithms were developed with MATLAB 7.4 and run on a 3.4 GHz Pentium 4 PC with 2 GB of RAM. Just like in [21], 512 Sobel-Roberts features for 16 different directions were extracted. 32 Gabor filters in 16 different directions and two different values of λ were applied to text blocks to extract Gabor texture features. The two values of λ were set to 3 and 5. Like other studies such as [2], the value of σ in Gabor filters was set to 0.56λ. Mean, variance, and maximum values of each filtered image are considered as Gabor features. Hence, a total of 16×2×3 = 96 Gabor features are extracted from each text block. The values of different parameters of the proposed DEG filter are as follows:
σ=1.5,
λ=5,
r=5 and 8,
γ=4, 2, and 1,
θ=0, 22.5, 45, 67.5, 90, 112.5, 135, and 157.5 (degrees).
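The parameter grid above, together with Eqs. (2)-(4), can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the elliptic radius is taken as the square root of x'² + γy'², the sampling window (`half_size`) is not given in the paper and is chosen arbitrarily here, and all function names are hypothetical. Keeping a single filter for γ=1 reflects the fact that a circular filter does not depend on θ.

```python
import math


def deg_filter(x, y, gamma, sigma, r, lam, theta_deg):
    """Value of the DEG filter e(x, y) of Eq. (2) at one pixel offset (x, y)."""
    theta = math.radians(theta_deg)
    # Rotated coordinates, Eqs. (3) and (4).
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    # Elliptic "radius" (assumed square-root form).
    rho = math.sqrt(xr * xr + gamma * yr * yr)
    envelope = math.exp(-((rho - r) ** 2) / (2.0 * sigma ** 2))  # peaks on the ellipse rho = r
    carrier = math.sin(2.0 * math.pi * rho / lam)
    return envelope * carrier


def deg_kernel(half_size, **params):
    """Sample the filter on a (2*half_size + 1)-wide square grid centred at the origin."""
    span = range(-half_size, half_size + 1)
    return [[deg_filter(x, y, **params) for x in span] for y in span]


def deg_bank_parameters():
    """Enumerate the parameter sets listed in the text; gamma = 1 is not directional."""
    thetas = [0.0, 22.5, 45.0, 67.5, 90.0, 112.5, 135.0, 157.5]
    bank = []
    for r in (5, 8):
        for gamma in (4, 2, 1):
            # A circular filter (gamma = 1) is the same in every direction, so keep one.
            for theta in (thetas if gamma != 1 else [0.0]):
                bank.append({"gamma": gamma, "sigma": 1.5, "r": r,
                             "lam": 5, "theta_deg": theta})
    return bank


bank = deg_bank_parameters()
kernels = [deg_kernel(12, **p) for p in bank]   # half_size = 12 is an assumed window
print(len(bank), "filters,", 3 * len(bank), "features")
```

With two radii, two directional values of γ in eight directions, and one circular filter per radius, the enumeration gives 2 × (2 × 8 + 1) = 34 parameter sets, which corresponds to 102 features when three statistics are taken per filtered image.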
Obviously, for γ=1 the filter has a circular shape and is not directional. Therefore, a total of 34 DEG filters are used in our approach. By extracting 3 features (mean, variance, and maximum values) from each filtered image, a total of 102 DEG features are obtained.

Similar to [21], in the classification stage, four MLP neural networks were used with the AdaBoost M2 training method [23]. AdaBoost M2 is a popular boosting algorithm which increases the correct classification rate of a weak classifier by creating several classifiers. The first classifier is regularly trained. For each new classifier, the distribution of the training samples is changed based on the results of the previous classifier. In this algorithm, the samples misclassified by the current classifier receive more emphasis than other samples when the following classifier is trained [21].

A comparison between several feature sets is made and the results are listed in Table II. The results in this table show that the highest font recognition rate was achieved by using the combination of the proposed DEG and basic Gabor features. The reason is that the Gabor features are sensitive to the directional straight segments and the proposed features describe the curvature of textures. Textures of the normalized Farsi text blocks consist of both straight and curved segments. Furthermore, various Farsi fonts have different directional straightness and curvature values. Therefore, the combination of the Gabor features and the proposed DEG features improved the accuracy of the Farsi font recognition.

TABLE II
FEATURE COMPARISON

Gabor features | Sobel-Roberts features | Proposed DEG features | Feature length | Processing time (sec) | Font recognition rate (%)
X | | | 96 | 0.468 | 93.92
 | X | | 512 | 0.023 | 94.16
 | | X | 102 | 0.481 | 88.75
X | | X | 198 | 0.949 | 95.84
 | X | X | 614 | 0.504 | 95.02

Font recognition rates of different methods over noisy text blocks are shown in Fig. 3. Although a good font recognition rate was obtained by using the combination of Sobel-Roberts and DEG features, the robustness of these combined features against noise is not acceptable. The reason is that in the Sobel-Roberts approach, both Sobel and Roberts are high-pass filters and the frequency of noise is usually high. Although the Sobel-Roberts method is the fastest one, it is the most sensitive to noise. Fig. 4 shows the noise sensitivity of various filters over a sample text block.

Fig. 3. Font recognition rate (%) over noisy text blocks. [Plot: font recognition rate (%) versus SNR (dB), from 50 down to 15, for the Gabor, Sobel-Roberts, DEG, Gabor & DEG, and Sobel-Roberts & DEG features.]
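The combined rows of Table II are consistent with plain concatenation: the feature lengths and the processing times of the single feature sets add up. A short Python check, using only the values reported in Table II (the dictionary layout and names are illustrative):

```python
# Values copied from Table II: (feature length, processing time in s, recognition rate in %).
single = {
    "Gabor": (96, 0.468, 93.92),
    "Sobel-Roberts": (512, 0.023, 94.16),
    "DEG": (102, 0.481, 88.75),
}
combined = {
    ("Gabor", "DEG"): (198, 0.949, 95.84),
    ("Sobel-Roberts", "DEG"): (614, 0.504, 95.02),
}

for (first, second), (length, seconds, rate) in combined.items():
    # Concatenation adds the lengths, and extracting both sets adds the times.
    assert length == single[first][0] + single[second][0]
    assert abs(seconds - (single[first][1] + single[second][1])) < 1e-9
    # Gain of the combination over the better of its two parts.
    gain = rate - max(single[first][2], single[second][2])
    print(f"{first} + {second}: {length} features, {seconds:.3f} s, +{gain:.2f} points")
```

The check also makes the trade-off visible: the Gabor and DEG combination gives the largest gain in recognition rate but is the slowest option, while the Sobel-Roberts and DEG combination costs about half the time.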

Fig. 4. Noise sensitivity of various feature extraction methods. From the left: (1st column) sample text blocks; (2nd and 3rd columns) Sobel and Roberts phase images, respectively; (4th and 5th columns) Gabor and DEG filtered images, respectively. (1st row) Results for the original text block; (2nd row) results for the noisy text block with an SNR of 25 dB.
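The noisy blocks in Figs. 3 and 4 are characterized only by their SNR in dB. As a rough illustration of what such a value means, the Python sketch below adds zero-mean Gaussian noise to a toy pixel row so that 10·log10(P_signal / P_noise) matches a target. The Gaussian noise model, the definition of signal power as the mean squared pixel value, and all names are assumptions made for this example; the paper does not describe how its noisy images were generated.

```python
import math
import random


def noise_std_for_snr(pixels, snr_db):
    """Standard deviation of zero-mean noise giving the requested SNR (assumed definition)."""
    signal_power = sum(p * p for p in pixels) / len(pixels)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return math.sqrt(noise_power)


def add_gaussian_noise(pixels, snr_db, seed=0):
    """Return a noisy copy of the pixel list; the seed only makes the sketch repeatable."""
    rng = random.Random(seed)
    std = noise_std_for_snr(pixels, snr_db)
    return [p + rng.gauss(0.0, std) for p in pixels]


def measured_snr_db(clean, noisy):
    """SNR in dB computed from the clean signal and the realised noise."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Toy "text block": a long row alternating background (255) and ink (0) runs.
clean = ([255.0] * 6 + [0.0] * 4) * 2000
noisy = add_gaussian_noise(clean, snr_db=25.0)
print(round(measured_snr_db(clean, noisy), 1))
```

Lowering `snr_db` by 10 multiplies the noise power by ten, which is why the recognition rates in Fig. 3 are plotted against a decreasing SNR axis.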
IV. CONCLUSION

In this paper, directional elliptic Gabor features were proposed to describe the curvature of the textures. The combination of Gabor features and the proposed DEG features described both straightness and curvature of different Farsi fonts and presented the best performance. Combining the proposed DEG features with the basic Gabor features improves the Farsi font recognition rate by about 1.9 and 1.7 percentage points over the basic Gabor and Sobel-Roberts features, respectively. Furthermore, the basic Gabor features, the DEG features, and their combination were more robust to noise than the Sobel-Roberts-based features.

V. REFERENCES

[1] H.S. Baird and G. Nagy, “A Self-Correcting 100-Font Classifier,” In Proc. SPIE, Vol. 2181, pp. 106-115, 1994.
[2] H. Ma and D. Doermann, “Font Identification Using the Grating Cell Texture Operator,” In Proc. of DRR, 2005, pp. 148-156.
[3] A. Zramdini and R. Ingold, “Optical Font Recognition from Projection Profiles,” Electronic Publishing, Vol. 6, No. 3, 1993, pp. 249-260.
[4] A. Zramdini and R. Ingold, “Optical Font Recognition Using Typographical Features,” IEEE Trans. on PAMI, Vol. 20, No. 8, 1998, pp. 877-882.
[5] M.C. Jung, Y.C. Shin and S.N. Srihari, “Multifont Classification using Typographical Attributes,” In Proc. of ICDAR, India, 1999, pp. 353-356.
[6] Y. Zhu, T. Tan and Y. Wang, “Font Recognition Based on Global Texture Analysis,” IEEE Trans. on PAMI, Vol. 23, No. 10, 2001, pp. 1192-1200.
[7] S.H. Kim, “Word-Level Optical Font Recognition Using Typographical Features,” IJPRAI, Vol. 18, No. 4, 2004, pp. 541-561.
[8] C.A. Cruz, R.R. Kuoppa, M.R. Ayala, A.A. Gonzalez and R.E. Perez, “High-order Statistical Texture Analysis-Font Recognition Applied,” Pattern Recognition Letters, Vol. 26, 2005, pp. 135-145.
[9] B.B. Chaudhuri and U. Garain, “Extraction of Type Style-based Meta-information from Image Documents,” IJDAR, Vol. 3, 2001, pp. 138-149.
[10] B. Allier and H. Emptoz, “Font Type Extraction and Character Prototyping Using Gabor Filters,” In Proc. of ICDAR, 2003, pp. 799-803.
[11] H. Shi and T. Pavlidis, “Font Recognition and Contextual Processing for More Accurate Text Recognition,” In Proc. of ICDAR, 1997, pp. 39-44.
[12] S.L. Manna, A.M. Colla and A. Sperduti, “Optical Font Recognition for Multi-Font OCR and Document Processing,” In Proc. of 10th Int. Workshop on Database & Expert Systems Applications, 1999, pp. 549-553.
[13] C.F. Lin, Y.F. Fang and Y.T. Juang, “Chinese text distinction and font identification by recognizing most frequently used characters,” Image and Vision Computing, Vol. 19, 2001, pp. 329-338.
[14] M.H. Ha, X.D. Tian and Z.R. Zhang, “Optical Font Recognition Based on Gabor Filter,” In Proc. of Int. Conf. on Machine Learning and Cybernetics, 2005, pp. 4864-4869.
[15] Z. Yang, L. Yang, D. Qi and C.Y. Suen, “An EMD-based Recognition Method for Chinese Fonts and Styles,” Pattern Recognition Letters, Vol. 27, 2006, pp. 1692-1701.
[16] X. Ding, L. Chen and T. Wu, “Character Independent Font Recognition on a Single Chinese Character,” IEEE Trans. on PAMI, Vol. 29, No. 2, 2007, pp. 197-204.
[17] I.S.I. Abuhaiba, “Arabic Font Recognition Based on Templates,” Int. Arab Journal of Information Technology, Vol. 1, 2003, pp. 33-39.
[18] I.S.I. Abuhaiba, “Arabic Font Recognition Using Decision Trees Built from Common Words,” Journal of Computing and Information Technology (CIT), Vol. 13, No. 3, 2005, pp. 211-223.
[19] B. Moussa, A. Zahour, M.A. Alimi and A. Benabdelhafid, “Can Fractal Dimension Be Used in Font Classification,” In Proc. of ICDAR, 2005, pp. 146-150.
[20] A. Borji and M. Hamidi, “Support Vector Machine for Persian Font Recognition,” Int. Journal of Intelligent Systems and Technologies, Vol. 2, 2007, pp. 178-183.
[21] H. Khosravi and E. Kabir, “Farsi font recognition based on Sobel–Roberts features,” Pattern Recognition Letters, Vol. 31, 2010, pp. 75–82.
[22] M. Ziaratban, K. Faez, F. Bagheri, “Content-Independent Farsi Font Recognition Based on Dynamic Most-Frequent Connected Components,” 21st International Conference on Pattern Recognition, ICPR’12, Japan, pp. 729-733, 2012.
[23] Y. Freund, R.E. Schapire, “Experiments with a new boosting algorithm,” In Proc. Int. Conf. on Machine Learning, Bari, Italy, pp. 148–156, 1996.
