Combination of Hough Transform and Neural Network On Recognizing Mathematical Symbols
Abstract—Offline printed mathematical symbol recognition is a particularly difficult task. Recognizing mathematical symbols is one stage within the overall system for recognition of mathematical documents. We describe many experiments using MultiLayer Perceptron (MLP), Hough Transform (HT), k Nearest Neighbors
2021 8th International Conference on ICT & Accessibility (ICTA) | 978-1-6654-6641-7/21/$31.00 ©2021 IEEE | DOI: 10.1109/ICTA54582.2021.9809779
This paper addresses the problem of recognizing mathematical notation, which poses challenges from its two-dimensional pattern, the rich set of symbols used, the range of similar-looking symbols whose bold, calligraphic, and italic varieties must be recognized distinctly, the imbalance and paucity of available training data, and the impossibility of final verification through spell check. Many researchers have worked on Handwritten Mathematical Expression (HME) recognition for many years, starting with Anderson's effort in the 60s [5] [6]. In this work, the general objective is to contribute to an automatic system that recognizes mathematical formulas automatically extracted from images of scientific documents [7]. Note that mathematical formula recognition is mainly composed of three major steps:
II. PREVIOUS RELATED RESEARCH WORK
As reported in [8], the failure of conventional OCR systems to handle mathematical symbols has several consequences:
- Readers of mathematical documents cannot automatically search for earlier occurrences of a variable or operator when tracing the notation and definitions used by a journal article.
- The appearance of mathematics on the same line as text
Authorized licensed use limited to: Zhejiang University. Downloaded on August 18,2023 at 22:46:06 UTC from IEEE Xplore. Restrictions apply.
to improve the MLP results by combining them with HT as a front end. We call this hybrid neural structural method HT-MLPs.
B. HT-MLPs, neural structural classifier
The idea is to extract segments from the symbol bitmap using HT (after a skeletonization step using a thinning algorithm) and then, according to their number (0, 1, or 2 or more segments), to call the appropriate MLP, as shown in Figure 8. This implies that the initial MLP is split into three specialized MLPs: MLP0, MLP1 and MLP2+, which learn and recognize symbols composed of 0, 1, and 2 or more segments, respectively. The best results were obtained with the following configurations of the three MLPs:
- MLP0: input layer of 80 neurons, 1 hidden layer of 4 neurons, 14 iterations.
- MLP1: input layer of 80 neurons, 1 hidden layer of 3 neurons, 100 iterations.
- MLP2+: input layer of 80 neurons, 1 hidden layer of 12 neurons, 1100 iterations.
Figure 7 summarizes the results obtained with the proposed neural structural method, HT-MLPs. We observe a distinct improvement over the initial MLP: the symbol misrecognition [11] rate is reduced by 21% and the recognition rate increases from 72% to 93%. However, some confusion remains between similar symbols composed of the same number of segments (the examples given involve ∏, ∩, ∑, (, [ and ]). We considered some of the misrecognitions to be too difficult for any classifier to resolve on the basis of the number of segments alone. Further structural primitives should be taken into account to avoid such confusions.
C. KNN classifier using Hu moments
KNN is among the simplest of all machine learning algorithms. We apply it here to symbol recognition: a symbol is classified by a majority vote of its neighbors and assigned to the class most common among its k nearest neighbors, where k is a positive integer, typically small. If k = 1, the symbol is simply assigned to the class of its nearest neighbor. The neighbors are taken from a set of symbols for which the correct classification is known. This set can be thought of as the training set for the algorithm, though no explicit training step is required. To identify neighbors, the symbols are represented here by Hu moments in a multidimensional feature space. We used the Euclidean distance to compare the Hu moment values of input and model symbols. After several tests varying k from 1 to 4, we found that the best recognition rate is obtained with k = 4 and the first three Hu moments.
D. Structural Freeman chain code
Another method, based on contours, is the structural Freeman chain code of the symbol's border. To detect the symbol border, we tested different filters, such as the Roberts and Sobel filters. We found that the Sobel filter is the simplest and gives good results. Then, we used the Freeman code to describe the contours of symbols. Recall that this code moves along a digital curve or a sequence of border pixels based on eight-connectivity. Each movement is classified as a direction and encoded with a numbering scheme i (i = 0, 1, 2, ..., 7), denoting an angle of 45i° counterclockwise from the positive x-axis. In this method, the chain code stores the absolute position of the first pixel and the relative positions of successive pixels along the symbol's border. Thus, input and model symbols are converted to sequences of digits, called chain codes. Digit sequences are compared using histograms of slopes along the contour and histogram matching techniques. We used various distance metrics (e.g. Euclidean, Manhattan, and Canberra distances) and similarities (histogram intersection) to compare two histograms (stored as vectors). With a distance, we look for the nearest elements (minimizing the distance); with a similarity, we look for the most similar elements (maximizing the similarity index). This method achieved a recognition rate of about 46.67%.
IV. CONCLUSION
In this paper, we described a successful classification method based on three specialized MLPs selected using HT. We evaluated the effects of this hybridization by comparing its recognition rates with those of the initial MLP and of other methods based on KNN and the Freeman code. The results of this hybridization are very satisfactory given the small number of samples: it achieved 93% single-symbol recognition. These results show that 1) giving slightly higher weight to the structural information (the number of segments) and 2) specializing the MLP produce better results. However, the problem of symbol recognition remains open, especially as the ultimate objective is to recognize any mathematical symbol, whether isolated or in the formula context, handwritten or printed. It is important to note that the rates of recognition, confusion and rejection are not the only criteria for evaluating this work, since the formula context should play a decisive role in determining the real identity of a symbol. To improve this work, we plan further research, in particular: 1) to test the efficiency and performance of the proposed neural structural method on a wider set of mathematical symbols. So far, we have built a set of reasonable size. Classical methods have limitations, but the development of robust methods depends on the availability of large databases to test the performance, robustness and reliability of the proposed methods and to conduct meaningful statistical tests comparing them against each other; 2) to distinguish the symbols that the HT-MLPs classifier still confuses by considering additional features such as segment orientation and other structural primitives like the presence of loops, branch points [11] and endpoints, which could distinguish many variants of symbols; and 3) to combine the results of several recognizers, which should give more accurate results than using only one.
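The routing step of the HT-MLPs classifier (Section B) can be illustrated with a minimal sketch. It assumes the number of segments has already been extracted by HT from the skeletonized symbol; the names `MLPConfig`, `CONFIGS` and `select_mlp` are hypothetical and do not come from the paper. Only the layer sizes and iteration counts are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MLPConfig:
    """Hyperparameters of one specialized MLP (hypothetical container)."""
    name: str
    n_inputs: int
    n_hidden: int
    n_iterations: int


# Values as reported in Section B; the key is the segment count (2 means "2 or more").
CONFIGS = {
    0: MLPConfig("MLP0", n_inputs=80, n_hidden=4, n_iterations=14),
    1: MLPConfig("MLP1", n_inputs=80, n_hidden=3, n_iterations=100),
    2: MLPConfig("MLP2+", n_inputs=80, n_hidden=12, n_iterations=1100),
}


def select_mlp(n_segments: int) -> MLPConfig:
    """Pick the specialized MLP for a symbol with `n_segments` HT segments."""
    if n_segments < 0:
        raise ValueError("segment count cannot be negative")
    # Every count of 2 or more is handled by the same network, MLP2+.
    return CONFIGS[min(n_segments, 2)]


chosen = select_mlp(3)
print(chosen.name, chosen.n_hidden)
```

Keeping the routing separate from the networks themselves makes it straightforward to add further structural primitives to the selection key later, as the conclusion suggests.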
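The KNN decision rule of Section C can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each symbol is assumed to be already reduced to a vector of its first three Hu moments, and the feature values and labels in `training` are invented for the example.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=4):
    """Classify `query` by majority vote among its k nearest neighbors.

    `training_set` is a list of (feature_vector, label) pairs with known
    labels; distances are Euclidean, as in Section C.
    """
    if not 1 <= k <= len(training_set):
        raise ValueError("k must be between 1 and the training set size")
    # Rank the labeled symbols by distance to the query.
    ranked = sorted(training_set, key=lambda item: math.dist(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    # On a tie, the label whose closest member is nearest to the query wins.
    return votes.most_common(1)[0][0]


# Hypothetical feature vectors standing in for the first three Hu moments.
training = [
    ((0.20, 0.01, 0.00), "sum"),
    ((0.21, 0.02, 0.00), "sum"),
    ((0.22, 0.01, 0.01), "sum"),
    ((0.60, 0.30, 0.10), "integral"),
    ((0.62, 0.31, 0.12), "integral"),
]

print(knn_classify((0.205, 0.015, 0.0), training, k=4))
```

No training step is needed: the labeled vectors are simply stored and searched at classification time.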
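The chain-code comparison of Section D can be sketched as below, assuming the border pixels are already available as an ordered, 8-connected list (border detection with the Sobel filter is not shown). The function names are hypothetical, and the sketch assumes the y-axis points upward; for image coordinates the sign of the vertical step would be flipped.

```python
import math

# (dx, dy) step -> Freeman direction i, an angle of 45*i degrees measured
# counterclockwise from the positive x-axis (assumption: y grows upward).
FREEMAN_DIRECTION = {
    (1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
    (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7,
}


def chain_code(border):
    """Convert an ordered list of 8-connected (x, y) pixels to chain codes."""
    return [FREEMAN_DIRECTION[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(border, border[1:])]


def direction_histogram(code):
    """Normalized 8-bin histogram of chain-code directions."""
    hist = [0.0] * 8
    for direction in code:
        hist[direction] += 1.0
    n = len(code)
    return [h / n for h in hist] if n else hist


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))


def canberra(p, q):
    # Bins where both values are zero contribute nothing.
    return sum(abs(a - b) / (abs(a) + abs(b)) for a, b in zip(p, q) if a or b)


def histogram_intersection(p, q):
    # A similarity: larger means more alike (1.0 for identical normalized histograms).
    return sum(min(a, b) for a, b in zip(p, q))


square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
diagonal = [(0, 0), (1, 1), (2, 2)]
h_square = direction_histogram(chain_code(square))
h_diag = direction_histogram(chain_code(diagonal))
print(chain_code(square), manhattan(h_square, h_diag))
```

The three distances are minimized when matching an input symbol against the models, whereas the histogram intersection is maximized.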
REFERENCES
[1] C. Bianchini, F. Borgia and M. De Marsico, "A concrete example of inclusive design: deaf-oriented accessibility", arXiv preprint arXiv:1911.13207, 2019.
[2] K. K. Ayeb, Y. Meguebli and A. K. Echi, "Deep Learning Architecture for Off-Line Recognition of Handwritten Math Symbols", Mediterranean Conference on Pattern Recognition and Artificial Intelligence, 2020.
[3] K. K. Ayeb, A. K. Echi and A. Belaïd, "Arabic/Latin and handwritten/machine-printed formula classification and recognition", 1st International Workshop on Arabic Script Analysis and Recognition (ASAR), 2017.
[4] A. Nazemi, N. Tavakolian, D. Fitzpatrick, C. Y. Suen and others, "Offline handwritten mathematical symbol recognition utilising deep learning", arXiv preprint arXiv:1910.07395, 2019.
[5] R. H. Anderson, "Syntax-directed Recognition of Hand-printed Two-dimensional Mathematics", in Symposium on Interactive Systems for Experimental Applied Mathematics: Proceedings of the Association for Computing Machinery Inc. Symposium, 1967, pp. 436–459.
[6] D. Blostein and A. Grbavec, "Recognition of Mathematical Notation", in Handbook of Character Recognition and Document Image Analysis, pp. 557–582.
[7] A. Kacem, A. Belaïd and M. Benahmed, "Automatic extraction of printed mathematical formulas using fuzzy logic and propagation of context", in Jour. of IJDAR, volume 4, number 2, pp. 97–108, December 2001.
[8] C. Malon, S. Uchida and M. Suzuki, "Mathematical symbol recognition with support vector machines", Pattern Recognition Letters, volume 29, number 9, pp. 1326–1332, 2008.
[9] M. Yang and R. Fateman, "Extracting mathematical expressions from postscript documents", Proceedings of the 2004 International Symposium on Symbolic and Algebraic Computation, pp. 305–311, 2004.
[10] F. Alvaro and R. Zanibbi, "A shape-based layout descriptor for classifying spatial relationships in handwritten math", Proceedings of the 2013 ACM Symposium on Document Engineering, pp. 123–126, 2013.
[11] M. Koschinski, H.J. Winkler and M. Lang, "Segmentation and recognition of symbols within handwritten mathematical expressions", in Proc. of ICASSP, vol. 4, Detroit, MI, pp. 2439–2442, 1995.
[12] H.J. Winkler, H. Fahner and M. Lang, "A soft-decision approach for structural analysis of handwritten mathematical expressions", in Proc. of ICASSP, vol. 4, Detroit, MI, pp. 2459–2462, 1995.
[13] A. Kosmala, G. Rigoll, S. Lavirotte and L. Pottier, "Online handwritten formula recognition using hidden Markov models and context dependent graph grammars", in Proc. of ICDAR, Bangalore, Karnataka, India, pp. 107–110, 1999.
[14] Z. Xuejun, L. Xinyu, P. Boachang and Y. Tang, "Online recognition handwritten mathematical symbols", in Proc. of ICDAR, Ulm, Germany, pp. 645–648, 1997.
[15] E. Tapia and R. Rojas, "Recognition of online handwritten mathematical formulas in the E-Chalk system", in Proc. of ICDAR, Edinburgh, U.K., pp. 980–984, 2003.
[16] U. Garain, B. Chaudhuri, and A. Ray Chaudhuri, "Identification of embedded mathematical expressions in scanned documents", in Proc. of ICPR, Cambridge, UK, pp. 138–149, 2004.
Fig. 1. The CROHME competition for mathematical formula and symbol recognition.
Fig. 2. MLP design with 4 pixel portion black pixel densities as input.
Fig. 4. Symbol confusion cases.
Aouadi Nabil
Latice Laboratory
Tunis, Tunisia
nabil.aouadi@utic.rnu.tn