DRR2014 StructureAnalysisforPlaneGeometryFigures
net/publication/266203349
Conference Paper in Proceedings of SPIE - The International Society for Optical Engineering · February 2014
DOI: 10.1117/12.2042462
5 authors, including:
Zhi Tang
Sichuan University
All content following this page was uploaded by lu Liu on 30 September 2014.
ABSTRACT
As the number of digital documents for educational purposes increases, we observe that no retrieval application exists
for mathematical plane geometry images. In this paper, we propose a method for retrieving plane geometry figures
(PGFs), which often appear in geometry books and digital documents. First, detection algorithms are applied to detect
common basic geometry shapes in a PGF image. Based on all the basic shapes, we analyze the structural relationships between
pairs of basic shapes and combine some pairs into compound shapes to build the PGF descriptor. Afterwards, we apply a
matching function to retrieve and rank candidate PGF images. The main contribution of the paper is a structure
analysis method that better describes the spatial relationships in images composed of many overlapping shapes. Experimental
results demonstrate that our analysis method and shape descriptor obtain good retrieval results with relatively high
effectiveness and efficiency.
Keywords—plane geometry figure, structure analysis, compound shape, document image retrieval
1 INTRODUCTION
Computer-aided instruction has become increasingly popular in recent years. Consequently, a growing amount of
teaching content has been digitized and stored electronically, and pattern recognition for document
images and figures has become important. A plane geometry figure (PGF, Figure 1) is a type of document graph. To the best
of our knowledge, existing plane geometry question retrieval systems generally focus on the keywords in the text of a
plane geometry problem. However, such keywords may not represent the questions sufficiently, which leads to
inaccurate retrieval.
One key issue in graph recognition is to design descriptors that reflect the nature of a shape. Description
techniques can generally be classified into two classes: region-based methods and contour-based methods [1].
Region-based features are primarily intended for overall statistics or the analysis of the entire region of a shape, such as
moment invariants [2] or transforms [3, 4]. Although these methods are robust to noise and therefore give satisfactory
discrimination results, they fail to capture the structural features of a shape. By contrast, contour-based methods [5, 6]
1 paddy5625@gmail.com
2 lvxiaoqing@pku.edu.cn
Document Recognition and Retrieval XXI, edited by Bertrand Coüasnon, Eric K. Ringger,
Proc. of SPIE-IS&T Electronic Imaging, Vol. 9021, 90210R · © 2014 SPIE-IS&T
CCC code: 0277-786X/14/$18 · doi: 10.1117/12.2042462
Figure 2 An example for basic shape detection: one circle and three triangles have been detected in given plane geometry figure.
In the example shown in Figure 2, one circle and three overlapping triangles have been detected.
The main detection steps are listed in Table 1.
We also use a 4×4 matrix to represent the relationships between two quadrilaterals; and a 4×3 matrix to represent
the relationships between a quadrilateral and a triangle. Hereafter, a quadrilateral can only be a rectangle, a trapezoid, or
a parallelogram.
$$ M_{44}(i, j) = R_{ss}(a_i, b_j), \quad i = 1 \ldots 4, \; j = 1 \ldots 4 \tag{2} $$
$$ M_{43}(i, j) = R_{ss}(a_i, b_j), \quad i = 1 \ldots 4, \; j = 1 \ldots 3 \tag{3} $$
$$ M = \begin{pmatrix} 0 & 0 & 0 \\ 3 & 3 & 4 \\ 3 & 8 & 0 \end{pmatrix} $$
Figure 4 The matrix which represents the relationship between two given triangles
For example, as shown in Figure 4, the matrix M is the original representation of the two given triangles on the left. The
number 8 is the maximum in the matrix and represents the relationship between the segments a1 and b1. After we
remove the row and column of the matrix in which the number 8 is located, the remaining maximum is 4, which represents the
parallel segments a2 and b2. Since a3 and b3 are neither parallel nor intersecting, their relationship score is zero. The
relationship between these two triangles is shown in Equation (5).
(5)
If S1 and S2 are both quadrilaterals, then R will have four values; otherwise, R will only have three values. We use this
method to represent the relationships between triangles or quadrilaterals. However, this method is not suitable for
representing the relationships between a circle and another basic shape.
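The reduction described for Figure 4 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `greedy_relationship` is hypothetical, ties are broken by taking the first maximum found (the text does not say how ties are handled), and the matrix is the one printed with Figure 4.

```python
def greedy_relationship(matrix):
    """Repeatedly take the largest remaining score, then drop its row and column.

    Returns one score per matched segment pair, so a 3x3 or 4x3 matrix yields
    three values and a 4x4 matrix yields four.
    """
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    picked = []
    while rows and cols:
        # Largest score among the rows and columns that are still available.
        r, c = max(((i, j) for i in rows for j in cols),
                   key=lambda ij: matrix[ij[0]][ij[1]])
        picked.append(matrix[r][c])
        rows.remove(r)
        cols.remove(c)
    return picked


# Matrix M from Figure 4 (two triangles).
M = [[0, 0, 0],
     [3, 3, 4],
     [3, 8, 0]]
print(greedy_relationship(M))  # [8, 4, 0]
```

The three values follow the walk-through in the text: 8 is selected first, then 4 after its row and column are removed, and the last remaining entry is 0.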
$$ N_d = \begin{cases} 1, & \exists e \, (e \in S_2.\mathrm{edge} \wedge e \text{ is a diameter of } S_1) \\ 0, & \text{otherwise} \end{cases} \tag{10} $$
The relationship between the circle and the triangle is defined in the following equation:
$$ R(S_1, S_2) = (N_v, N_e, N_c, N_d) \tag{11} $$
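As an illustration of the N_d component in Equation (10), the following Python sketch tests whether any edge of a polygon is a diameter of a circle. The function names, the representation of points as (x, y) tuples, and the tolerance `tol` are assumptions made for this example; the other three components of Equation (11) are not covered here.

```python
import math


def is_diameter(p, q, center, radius, tol=1e-6):
    """True if segment pq is centered on the circle center and has length 2 * radius."""
    mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    length = math.dist(p, q)
    return math.dist(mid, center) <= tol and abs(length - 2 * radius) <= tol


def n_d(polygon, center, radius, tol=1e-6):
    """Equation (10): 1 if some edge of the polygon is a diameter of the circle, else 0."""
    n = len(polygon)
    edges = [(polygon[i], polygon[(i + 1) % n]) for i in range(n)]
    return int(any(is_diameter(p, q, center, radius, tol) for p, q in edges))


# A triangle inscribed in the unit circle with one edge along the x-axis diameter.
print(n_d([(-1, 0), (1, 0), (0, 1)], center=(0, 0), radius=1))
```

On detected shapes the tolerance would need to be chosen relative to the image resolution, since detected endpoints rarely lie exactly on the circle.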
2.2.4 Relationships between Two Circles
Let d represent the distance between the two centroids of the circles, and r1 and r2 represent their radii. Table 4 shows
five scenarios in which we compare d, r1+ r2, and |r1- r2|.
d = |r1 − r2|    Inscribed
d < |r1 − r2|    Included
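The comparison of d with r1 + r2 and |r1 − r2| can be sketched as a small classifier. Only the labels "inscribed" and "included" come from the rows of Table 4 shown above; the other three labels, the function name, and the tolerance are assumptions made for this example.

```python
import math


def circle_relationship(c1, r1, c2, r2, tol=1e-6):
    """Classify two circles by comparing d with r1 + r2 and |r1 - r2|."""
    d = math.dist(c1, c2)
    outer = r1 + r2
    inner = abs(r1 - r2)
    if abs(d - inner) <= tol:
        return "inscribed"           # d = |r1 - r2| (label from Table 4)
    if d < inner:
        return "included"            # d < |r1 - r2| (label from Table 4)
    if abs(d - outer) <= tol:
        return "externally tangent"  # d = r1 + r2 (assumed label)
    if d > outer:
        return "separate"            # d > r1 + r2 (assumed label)
    return "intersecting"            # |r1 - r2| < d < r1 + r2 (assumed label)


print(circle_relationship((0, 0), 2, (1, 0), 1))
print(circle_relationship((0, 0), 3, (1, 0), 1))
```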
Theoretically, we could analyze all relationships between any two basic shapes with the method introduced in Sec.
2.2, as the number of types of basic shapes is finite. However, enumerating all relationships would be computationally
expensive. Moreover, not all kinds of relationships are equally important for describing the structural features of
PGFs. It is necessary to select the important relationships and exclude the unimportant ones. In the next section,
compound shapes are adopted for this purpose.
Configuration  Nv  Ne  Nc  Nd
1              3   0   0   1
2              3   0   0   0
3              2   0   1   0
4              2   1   0   1
5              1   1   1   0
6              0   3   0   0
2.3.3 A Circle and a Quadrilateral
A quadrilateral can be a rectangle, a trapezoid, or a parallelogram. In the case of a rectangle, we use the condition
Nv + Ne >= 4, which generates four situations. For a trapezoid, we only keep the condition in which all the vertices of
the trapezoid fall within the circumference of the circle. We discard all the relationships between a circle and a
parallelogram.
Through analysis and a set of experiments on combinations of two shapes, 41 compound shapes are regarded as
strong relationships, as shown in Figure 5. Generally, both compound and basic shapes are used to represent the features
of a PGF.
Consequently, we obtain the updated set G = {P1, ..., Pn}; in this set, P can be a basic or compound shape. We
maintain the perimeter and the area in each P. If P is a compound shape, then we use the sum of the perimeters and areas
of the two basic shapes to represent those of the compound shape P.
Figure 5 List of all the compound shapes.
Table 6 Algorithm 2: the algorithm which composes the compound shapes based on the basic shapes.
Sort the basic shapes (descending) by area.
x = 0;
for i = 1 to n
    for j = 1 to i-1
        if (Si is ‘used’ and Sj is ‘used’) continue; end(if)
        if R(Si, Sj) is strong (Figure 5)
            x = x+1;
            Generate a compound shape Dx;
            Mark Si and Sj ‘used’;
        end(if)
    end(for)
end(for)
Remove ‘used’ basic shapes in set G.
Add compound shapes D1…Dx to set G.
Sort all the elements (descending) by area in set G.
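Algorithm 2 can be sketched in Python as follows. This is a hedged illustration rather than the implementation used in the experiments: shapes are represented as dictionaries with hypothetical keys `kind`, `area`, and `perimeter`, and the lookup against the strong relationships of Figure 5 is replaced by a caller-supplied predicate `is_strong`. The skip condition is copied as written in Table 6, and a compound shape takes the summed area and perimeter of its two parts.

```python
def compose_compound_shapes(shapes, is_strong):
    """Pair basic shapes into compound shapes, following the steps of Table 6."""
    basics = sorted(shapes, key=lambda s: s["area"], reverse=True)
    used = [False] * len(basics)
    compounds = []
    for i in range(len(basics)):
        for j in range(i):
            # Skip condition as stated in Table 6: both shapes are already used.
            if used[i] and used[j]:
                continue
            if is_strong(basics[i], basics[j]):
                compounds.append({
                    "kind": "compound",
                    "parts": (basics[j]["kind"], basics[i]["kind"]),
                    # Area and perimeter of a compound shape are the sums of its parts.
                    "area": basics[i]["area"] + basics[j]["area"],
                    "perimeter": basics[i]["perimeter"] + basics[j]["perimeter"],
                })
                used[i] = used[j] = True
    remaining = [s for s, u in zip(basics, used) if not u]
    return sorted(remaining + compounds, key=lambda s: s["area"], reverse=True)


# Toy input; the predicate below is a stand-in for the Figure 5 lookup.
shapes = [
    {"kind": "triangle", "area": 4.0, "perimeter": 9.0},
    {"kind": "circle", "area": 12.0, "perimeter": 12.3},
    {"kind": "rectangle", "area": 2.0, "perimeter": 6.0},
]
strong = lambda a, b: {a["kind"], b["kind"]} == {"circle", "triangle"}
result = compose_compound_shapes(shapes, strong)
print([s["kind"] for s in result])  # ['compound', 'rectangle']
```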
Table 7 The quantity of visually salient basic shapes and total basic shapes in our dataset
                           Circles  Triangles  Rectangles  Trapezoids  Parallelograms
Visually salient quantity  219      134        31          26          5
Total quantity             219      1955       38          65          40
Rate                       100%     6.9%       82%         40%         13%
Let P be a basic shape in figure G1 and Q be a basic shape in figure G2. When P and Q are the same type of basic
shape, we set the class weight to 50 times the corresponding rate in Table 7, which reflects the different
significance of each type. In our actual experiments, the weights C(P, Q) are replaced with nearby integers, as shown in
the last column of Table 8.
On the other hand, if P and Q are of different types, we assume that they do not match, so the weight C(P, Q)
is set to zero.
P                Q                Rate × 50   C(P, Q)
Circle           Circle           50          50
Rectangle        Rectangle        41          40
Trapezoid        Trapezoid        20          20
Parallelogram    Parallelogram    6.5         7
Triangle         Triangle         3.45        3
Different types                   --          0
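The class weight can be expressed as a simple lookup. The sketch below uses the rates of Table 7 and the integer weights of Table 8; the dictionary and function names are illustrative assumptions.

```python
# Rates from Table 7 (visually salient quantity / total quantity).
RATES = {"circle": 1.00, "rectangle": 0.82, "trapezoid": 0.40,
         "parallelogram": 0.13, "triangle": 0.069}

# Integer class weights from the last column of Table 8.
CLASS_WEIGHT = {"circle": 50, "rectangle": 40, "trapezoid": 20,
                "parallelogram": 7, "triangle": 3}


def class_weight(kind_p, kind_q):
    """C(P, Q): the Table 8 weight for shapes of the same type, zero otherwise."""
    return CLASS_WEIGHT[kind_p] if kind_p == kind_q else 0


for kind, rate in RATES.items():
    # Compare the raw value (rate * 50) with the integer used in the experiments.
    print(kind, round(rate * 50, 2), CLASS_WEIGHT[kind])
```

Note that the integer weights are chosen values close to rate × 50, not the result of strict rounding (for example, 41 becomes 40 for rectangles).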
We also consider the partial circle rate of P and Q as
$$ A(P, Q) = 1 - \left( PCR(P) - PCR(Q) \right) \cdot 2\pi \tag{12} $$
where
$$ PCR(S) = \frac{Area(S)}{Perimeter(S)}, \quad S \in \{P, Q\} \tag{13} $$
We should also generate a size factor to keep the scale invariance, as follows:
$$ S(P, Q) = \frac{Area(P) / Area(Q)}{Scale(P) / Scale(Q)}; \quad \text{if } S(P, Q) > 1, \; S(P, Q) = \frac{1}{S(P, Q)} \tag{14} $$
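A minimal Python sketch of the size factor in Equation (14) is given below. The function name is hypothetical, and the scale values are passed in by the caller; the example assumes that the scale of a figure is the area of its biggest basic shape, which is an interpretation rather than something the equation itself states.

```python
def size_factor(area_p, area_q, scale_p, scale_q):
    """Equation (14): ratio of relative sizes, inverted when above 1 so it stays in (0, 1]."""
    s = (area_p / area_q) / (scale_p / scale_q)
    return 1 / s if s > 1 else s


# Figure 2 is figure 1 with all areas multiplied by 4, so the factor is 1.
print(size_factor(area_p=2.0, area_q=8.0, scale_p=10.0, scale_q=40.0))
# P is relatively larger within its own figure than Q is within its figure.
print(size_factor(area_p=5.0, area_q=8.0, scale_p=10.0, scale_q=40.0))
```

Dividing by the scale ratio is what keeps the factor unchanged when one of the two figures is uniformly enlarged.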
Equation (14) indicates that S ( P ) is the biggest basic shape S1 of the set G1 which can reflect size. The values of
both A and S are obviously between 0 and 1. Subsequently, we define the weight between P and Q as
3.4 Match
We establish a match matrix for G1 = {P1, ..., Pm} and G2 = {Q1, ..., Qn} as follows:
$$ M(G_1, G_2) = \begin{pmatrix} W(P_1, Q_1) & W(P_1, Q_2) & \cdots & W(P_1, Q_n) \\ W(P_2, Q_1) & W(P_2, Q_2) & \cdots & W(P_2, Q_n) \\ \vdots & \vdots & \ddots & \vdots \\ W(P_m, Q_1) & W(P_m, Q_2) & \cdots & W(P_m, Q_n) \end{pmatrix} \tag{22} $$
$$ KM(G_1, G_2) = \mathrm{Kuhn\_Munkres}(M(G_1, G_2)) \tag{23} $$
We finally obtain the matching score as:
$$ Match(G_1, G_2) = \frac{2 \cdot KM(G_1, G_2)}{KM(G_1, G_1) + KM(G_2, G_2)} \tag{24} $$
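Equations (22) to (24) can be illustrated with the following Python sketch. Two caveats apply. First, the optimal assignment is found here by exhaustive search over permutations, which returns the same maximum total weight as the Kuhn-Munkres algorithm but is only practical for small sets. Second, the weight function W(P, Q) is supplied by the caller; the toy weight used in the example keeps only the class weight of Table 8 and is an assumption made for demonstration.

```python
from itertools import permutations


def max_assignment(matrix):
    """Maximum total weight of a one-to-one assignment (brute-force stand-in for Kuhn-Munkres)."""
    if not matrix or not matrix[0]:
        return 0
    m, n = len(matrix), len(matrix[0])
    if m > n:
        # Transpose so that there are never more rows than columns.
        matrix = [list(col) for col in zip(*matrix)]
        m, n = n, m
    return max(sum(matrix[i][cols[i]] for i in range(m))
               for cols in permutations(range(n), m))


def km(g1, g2, weight):
    """Equations (22) and (23): build the match matrix and take the best assignment."""
    return max_assignment([[weight(p, q) for q in g2] for p in g1])


def match(g1, g2, weight):
    """Equation (24): normalized matching score, equal to 1 for identical sets."""
    return 2 * km(g1, g2, weight) / (km(g1, g1, weight) + km(g2, g2, weight))


# Toy weight (assumption): class weight only, using the integers of Table 8.
TOY = {"circle": 50, "rectangle": 40, "trapezoid": 20, "parallelogram": 7, "triangle": 3}
toy_weight = lambda p, q: TOY[p] if p == q else 0

g1 = ["circle", "triangle"]
g2 = ["circle", "rectangle"]
print(match(g1, g1, toy_weight))  # 1.0
print(match(g1, g2, toy_weight))
```

The normalization by the two self-matching scores makes the score symmetric in G1 and G2 and bounds it by 1 when the weight of a shape with itself is the largest weight it can receive.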
Figure 6 Sum of all vote scores for N = 20, 25, 30, 35, 40, 45, 50 (vertical axis ticks: 2270, 2280, 2290)
The parameter N in Equation (19) (see Sec. 3.4) is set to 40 after trying the values N = 20, 25, 30, 35, 40, 45, and 50.
The sum of all vote scores for each value is shown in Figure 6.
(Panel labels: ZMD, BSS, Proposed method)
Figure 7 Experimental results for the three methods
In Figure 7, the leftmost graph in each row is the query. A comparison of the second and third graphs in each
row reveals that the proposed method and BSS perform better than the ZMD method. We can also conclude from the
fourth and fifth columns in Figure 7 that the proposed description based on compound shapes performs better than
BSS, because both the numerous triangles in the circle and their relationships are closer to the query.
The summed scores of the BSS and ZMD methods are shown in Figure 8. Both our method and BSS
perform much better than the ZMD method. The proposed method performs better than BSS because the BSS method loses
the relationships between basic shapes.
We also sum the vote scores of the queries and their top-1 matched PGFs, which are shown in the right bars of Figure 8.
Our method scores 517 and again performs best.
Experiments also show some limitations of our method. For PGFs that do not contain compound shapes, the
retrieval results are not satisfactory. For example, the query in Figure 9(a) contains a circle, a trapezoid, and two
triangles, but our algorithm fails to find any effective compound shapes. Consequently, none of the retrieved figures in
Figure 9(b) is similar to the query, although they contain similar circles, triangles, and trapezoids.
(a) (b)
ACKNOWLEDGMENTS
This work is supported by Natural Science Foundation of Beijing under Grant 4132033. We are deeply indebted to
the collaborators for many insightful remarks and valuable suggestions.
REFERENCES
[1] Zhang, D. and Lu, G., "Review of shape representation and description techniques," Patt. Reco. Soc., 37(1), (2004).
[2] Liao, S. X. and Pawlak, M., "On image analysis by moments," IEEE Trans. Patt. Analy. and Mach. Intel., 18(3), (1996).
[3] Zhang, D. and Lu, G., "Generic Fourier descriptor for shape-based image retrieval," Proc. of the IEEE ICME, 1,
425-428(2002).
[4] Tabbone, S., Ramos Terrades, O. and Barrat, S., "Histogram of Radon transform. A useful descriptor for shape retrieval,"
Proc. IEEE ICPR, 1-4(2008).
[5] Ling, H. and Jacobs, D. W., "Shape classification using the inner-distance," IEEE Trans. Patt. Analy. and
Mach. Intel., 29(2), (2007).
[6] Mokhtarian, F., Abbasi, S. and Kittler, J., "Curvature scale space image in shape similarity retrieval," Multi. System,
7(6), (1999).
[7] Ballard, D. H., "Generalizing the Hough Transform to detect arbitrary shapes," Patt. Reco. 13, 111-222(1981).
[8] Lamiroy, B., Gaucher, O. and Fritz, L., "Robust circle detection," Proc. ICDAR, 526-530(2007).
[9] Chung, K. L., Huang, Y. H., Shen, S. M., Krylov, A. S., Yurin, D. V. and Semeikina, E. V., "Efficient sampling strategy
and refinement strategy for randomized circle detection," Patt. Reco. 45, 252-263(2012).
[10] Akinlar, C. and Topal, C., "A real-time circle detector with a false detection control," ICASSP, 1309-1312(2012).
[11] Cuevas, E., Enciso, V. O., Wario, F., Zaldivar, D. and Cisneros, M. P., "Automatic multiple circle detection based
on artificial immune systems," Expert Systems with Application, 39, 713-722(2012).
[12] Duda, R. O. and Hart, P. E., "Use of the Hough transformation to detect lines and curves in pictures," Comm. ACM
15, 11-15(1972).
[13] Lin, C. and Nevatia, R., "Building detection and description from a single intensity image," Comp. Vision Image
Under. 72(2), 101-121(1998).
[14] Nayef, N. and Breuel, T. M., "Efficient symbol retrieval by building a symbol index from a collection of line
drawings," DRR Feb 5-7 San Francisco CA U.S.A., (2013).
[15] Li, K., Lu, X., Ling, H., Liu, T., Feng, T. and Tang, Z., "Detection of overlapped quadrangles in plane geometry
figures," ICDAR Aug 25-28 Washington DC U.S.A., (2013).
[16] Bourgeois, F. and Lassalle, J. C., "An extension of the Munkres algorithm for the assignment problem to rectangular
matrices," Comm. ACM, 14(12), 802-804(1971).
[17] Kim, W. Y. and Kim, Y. S., "A region-based shape descriptor using Zernike moments," Sig. Proc. Image Comm. 16,
95-102(2000).