Download as pdf or txt
Download as pdf or txt
You are on page 1of 10

Shape Matching: Similarity Measures and Algorithms

Remco C. Veltkamp
Dept. Computing Science, Utrecht University
Padualaan 14, 3584 CH Utrecht, The Netherlands
Remco.Veltkamp@cs.uu.nl

Abstract Shape matching deals with transforming a shape, and


measuring the resemblance with another one, using some
Shape matching is an important ingredient in shape re- similarity measure. So, shape similarity measures are an
trieval, recognition and class$cation, alignment and regis- essential ingredient in shape matching. Although the term
tration, and approximation and simpl$cation. This paper similarity is often used, dissimilarity corresponds to the no-
treats various aspects that are needed to solve shape match- tion of distance: small distance means small dissimilarity,
ing problems: choosing the precise problem, selecting the and large similarity.
properties of the similarity measure that are needed for the The algorithm to compute the similarity often depends
problem, choosing the specijic similarity measure, and con- on the precise measure, which depends on the required
structing the algorithm to compute the similarity. The focus properties, which in turn depends on the particular match-
is on methods that lie close to the field of computational ing problem for the application at hand. Therefore, section 2
geometry. is about the classification of matching problems, section 13
is about similarity measure properties, section 4 presents .a
number of specific similarity measures, and section 5 treats
1. Introduction a few matching algorithms.
There are various ways to approach the shape matching
There is no universal definition of what shape is. Impres- problem. In this article we focus on methods that lie closme
sions of shape can be conveyed by color or intensity patterns to computational geometry, the subarea of algorithms de-
(texture), from which a geometrical representation can be sign that deals with the design and analysis of algorithms
derived. This is shown already in Plato’s work Meno (380 for geometric problems involving objects like points, lines,
BC) [41]. This is one of the so-called Socratic dialogues, polygons, and polyhedra. The standard approach taken in
where two persons digcuss aspects of virtue; to honor and computational geometry is the development of exact, prov-
memorialize him, one person is called Socrates. In the di- ably correct and efficient solutions to geometric problems.
alogue, Socrates describes shape as follows (the word ‘fig- First some related work is mentioned in the next subsection.
ure’ is used for shape):

“figure is the only existing thing that is found al- 1.1. Related work
ways following color”.

This does not satisfy Meno, after which Socrates gives a


Matching has been approached in a number of ways,
definition in “terms employed in geometrical problems”:
including tree pruning [55],the generalized Hough trans-
“figure is limit of solid’. form [8] or pose clustering [51], geometric hashing [591,
the alignment method [27], statistics [40], deformable tem-
Here we also consider shape as something geometrical. plates [50], relaxation labeling [44], Fourier descriptors
We will use the term shape for a geometrical pattern, con- [35], wavelet transform [3 13, curvature scale space [36], and
sisting of a set of points, curves, surfaces, solids, etc. This neural networks [21]. The following subsections treat a few
is commonly done, although ‘shape’ is sometimes used for methods in more detail. They are based on shape repre-
a geometrical pattern modulo some transformation group, sentations that depend on the global shape. Therefore, they
in particular similarity transformations (combinations of are not robust against occlusion, and do not allow partial
translations, rotations, and scalings). matching.

188
$10.000 2001 IEEE
0-7695-0853-7/01
1.1.1 Moments the contour C be parameterized by arc-length s: C(s) =
(x(s),y(s)). The coordinate functions of C are convolved
When a complete object in an image has been identified,
with a Gaussian kernel 4, of width 0:
it can be described by a set of moments mp,q.The (p, 4)-
moment of an object 0 C R2 is given by

mp,q = d,,,E* xpyq dx dy.


and the same for y(s). With increasing value of 0,the re-
sulting contour gets smoother, see figure 1, and the number
For finite point sets the integral can be replaced by a sum-
of zero crossings of the curvature along it decreases, until
mation. The infinite sequence of moments, p , q = 0 , 1 , . . . ,
finally the contour is convex and the curvature is positive
uniquely determines the shape, and vice versa. Variations
everywhere.
such as Zernike moments are described in [32] and [12].
Based on such moments, a number of functions, moment
invariants, can be defined that are invariant under certain
transformations such as translation, scaling, and rotation.
Using only a limited number of low order moment invari-
ants, the less critical and noisy high order moments are dis-
carded. A number of such moment invariants can be put
into a feature vector, which can be used for matching.
Algebraic moments and other global object features such
as area, circularity, eccentricity, compactness, major axis
orientation, Euler number, concavity tree, shape numbers,
can all be used for shape description [9], [42].

1.1.2 Modal matching


Rather than working with the area of a 2D object, the
boundary can be used instead. Samples of the boundary
can be described with Fourier descriptors, the coefficients Figure 1. Contour smoothing, reducing cur-
of the discrete Fourier transform [56]. vature changes.
Another form of shape decomposition is the decomposi-
tion into an ordered set of eigenvectors, also called principal
components. Again, the noisy high order components can For continuously increasing 0,the positions of the cur-
be discarded, using only the most robust components. The vature zero-crossings continuously move along the contour,
idea is to consider n points on the boundary of an object, until two such positions meet and annihilate. Matching of
and to define a matrix D such that element Dij determines two objects can be done by matching points of annihila-
how boundary points i and j of the object interact, typically tion in the ( s , 0)plane [36]. Another way of reducing cur-
involving the distance between points i and j. vature changes is based on the turning angle function (see
The eigenvectors ei of D , satisfying D e i = Xei, i = section 4.5) [34].
1,.. . ,n, are the modes of D,also called eigenshapes. To Matching with the curvature scale space is robust against
match two shapes, take the eigenvectors ei of one object, slight affine distortion, as has been experimentally deter-
and the eigenvectors e; of the other object, and compute a mined [37]. Be careful, however, to use this property for
mismatch value m ( e i ,e;). For simplicity, let us assume that fish recognition, see section 3.
the eigenvectors have the same length. For a fixed i = io,
determine the value j o of j for which m(ei,, e>)is minimal. 2. Matching problems
.io)
If the value of i for which m ( e i , is minimal is equal to
io, then point i of one shape and point j of the other shape Shape matching is studied in various forms. Given two
match. See for example [22] and [49] for variations on this patterns and a dissimilarity measure, we can:
basic technique of modal matching.
(computation problem) compute the dissimilarity be-
1.1.3 Curvature scale space tween the two patterns,

Another approach is the use of a scale space representa- 0 (decision problem) for a given threshold, decide
tion of the curvature of the contour of 2D objects. Let whether the dissimilarity is smaller than the threshold,

189
(decision problem) for a given threshold, decide Whether or not specific properties are wanted will depend
whether there exists a transformation such that the on the application at hand, sometimes a property will be
dissimilarity between the transformed pattern and the useful, sometimes it will be undesirable. Some combina-
other pattern is smaller than the threshold, tions of properties are contradictory, so that no distance
function can be found satisfying them. A shape dissimilar-
(optimization problem) find the transformation that ity measure, or distance function, on a collection of shapes
minimizes the dissimilarity between the transformed
pattern and the other pattern.
S is a function d : S x S + E% In the properties listed
below, it is understood that they must hold for all shapes A,
Sometimes the time complexities to solve these problems B , or C in S.
are rather high, so that it makes sense to devise approxima- Metric properties Nonnegativity means d(A,B ) 2 [D.
tion algorithms: The identity property is that d(A,A ) = 0. Uniqueness says
that d ( A , B ) = 0 implies A = B. A strong form of the
(approximate optimization problem) find a transforma-
tion that gives a dissimilarity between the two patterns
+
triangle inequality is d ( A ,B ) d ( A , C ) 2 d ( B , C). A
distance function satisfying identity, uniqueness, and strong
that is within a constant multiplicative factor from the
triangle inequality is called a metric. If a function satisfies
minimum dissimilarity.
only identity and strong triangle inequality, then it is called
These problems play an important role in the following a semimetric. Symmetry, d ( A , B ) = d ( B ,A ) , follows from
categories of applications. the strong triangle inequality and identity. Symmetry is not
always wanted. Indeed, human perception does not always
Shape retrieval: search for all shapes in a typically large
find that shape A is equally similar to B, as B is to A . [n
database of shapes that are similar to a query shape.
particular, a variant A of prototype B is often found more
Usually all shapes within a given distance from the
similar to B than vice versa [53].
query are determined (decision problem), or the first
A more frequently encountered formulation of the tri-
few shapes that have the smallest distance (computa-
tion problem). If the database is large, it may be infea-
+
angle inequality is the following: d(A,B ) d ( B , C ) 2
d ( A , C). Similarity measures for partial matching, giving
sible to compute the similarity between the query and
a small distance d ( A , B ) if a part of A matches a part aof
every database shape. An indexing structure can help
B , do not obey the triangle inequality. An illustration is
to exclude large parts of the database from considera-
given in figure 2: the distance from the man to the cen-
tion at an early stage of the search, often using some
taur is small, the distance from the centaur to the horse is
form of triangle inequality property, see section 3.
small, but the distance from the man to the horse is large, :so
Shape recognition and classification: determine whether +
d ( m a n ,centaur) d(centaur,horse) > d(man,horse)
a given shape matches a model sufficiently close (de- does not hold. It therefore makes sense to formulate an
cision problem), or which of k class representatives is even weaker form, the relaxed triangle inequality [ 18 1:
most similar (IC computation problems). c ( d ( A ,B ) + d ( B , C ) ) 2 d ( A , C), for some constant c 2 1.

Shape alignment and registration: transform one shape


so that it best matches another shape (optimization
problem), in whole or in part.
Shape approximation and simplification: construct a
shape of fewer elements (points, segments, triangles,
etc.), that is still similar to the original. There are
many heuristics for approximating polygonal curves
[45] and polyhedral surfaces [26]. Optimal methods
construct an approximation with the fewest elements
Figure 2. Under partial matching, the triangle
given a maximal dissimilarity, or with the smallest
inequality does not hold.
dissimilarity given the maximal number of elements.
(Checking the former dissimilarity is a decision
problem, the latter is a computation problem.) Continuity properties It is often desirable that a simii-
larity function has some continuity properties. The follow-
3. Properties ing four properties are about robustness, a form of conti-
nuity. Such properties are for example useful to be robust
In this section we list a number of properties. It can against the effects of discretization. Perturbation robust-
be desirable that a similarity measures has such properties. ness: for each E > 0, there is an open set F of deformations

190
sufficiently close to the identity, such that d(f(A), A) < E 4. Similarity Measures
for all f E F . For example, it can be desirable that a
distance function is robust against small affine distortion. 4.1. Discrete Metric
Crack robustness: for each each E > 0, and each “crack”
x in bd(A), the boundary of A, an open neighborhood U
Finding an affine invariant metric for patterns is not so
of x exists such that for all B, A - U = B - U implies
difficult. Indeed, a metric that is invariant not only for
d ( A , B ) < E . Blur robustness: for each c > 0, an open
affine transformations, but for general homeomorphisms is
neighborhood U of bd(A) exists, such that d(A, B ) < E for
the discrete metric:
all B satisfying B-U = A - U and bd(A) E b d ( B ) . Noise
and occlusion robustness: for each x E R2 - A, and each 0 if A equals B
> 0, an open neighborhood U of x exists such that for all d ( A , B )=
1 otherwise
B,B-U=A-Uimpliesd(A,B) <E.

However, this metric lacks useful properties. For example,


Invariance A distance function d is invariant under if a pattern A is only slightly distorted to form a pattern A’,
a chosen group of transformations G if for all g E G, the discrete distance d(A, A’) is already maximal.
d(g(A),g(B)) = d ( A , B ) . For object recognition, it is Under the discrete metric, computing the smallest
often desirable that the similarity measure is invariant un- d(A, B ) over all transformations in a transformation group
der affine transformations, since this is a good approxima- G is equivalent to deciding whether there is some transfor-
tion of weak perspective projections of points lying in or mation g in G such that g(A) equals B. This is known
close to a plane. However, it depends on the application as exact congruence matching. For sets of n points in Rk,
whether a large invariance group is wanted. For example, the algorithms with the best known time complexity run in
Sir d’Arcy Thompson [52] showed that the outlines of two O ( nlog n ) time if G is translations, scaling, or homotheties
hatchetfishes of different genus, Argyropefecus olfersi and (translation plus scaling), and logn) time for ro-
Sternoptyx diaphana, can be transformed into each other by tations, isometries, and similarities [ 1 I].
shear and scaling, see figure 3. So, the two fishes will be
found to match the same model if the matching is invariant 4.2. L, Distance, Minkowski Distance
under affine transformations.
Many similarity measures on shapes are based on the L,
distance between two points. For two points x,y in Rk,the
L, distance is defined as L,(x,y) = (&, /xi- yilP)l/P.
This is also often called the Minkowski distance. For p = 2,
this yields the Euclidean distance Ls. For p = 1, we get
the Manhattan, city block, or taxicab distance L1. For p
approaching 00, we get the max metric: maxi(Jxi - yiJ):
For all p 2 1, the L, distances are metrics. For p < 1
it is not a metric anymore, since the triangle inequality does
not hold.

4.3. Bottleneck Distance

Let A and B be two point sets of size n, and d(a,b) a dis-


tance between two points. The bottleneck distance F ( A , B )
is the minimum over all 1 - 1 correspondences f between
A and B of the maximum distance d(a, f(a)). For the dis-
tance d(a, b ) between two points, an L, distance could be
chosen. An alternative is to compute an approximation F
Figure 3. Two hatchetfishes of different to the real bottleneck distance F . An approximate match-
genus: Argyropelecus olfersi and Sternopvx di- ing between A and B with F the furthest matched pair, such
aphana. After [52]. +
that F < F < (1 c)F, can be computed with a less com-
plex algorithm [ 171.
The decision problem for translations, deciding whether
there exists a translation C such that F ( A + C, B ) < E can

191
also be solved, but takes considerably more time [ 171. Be- rected distances are defined as the k-th value i; increasing
cause of the high degree in the computational complexity, order of the distance/from a point in A to B: hk(A,B ) =
it is interesting to look at approximations with a factor E : minbcB d ( a , b). The partial Hausdorff distance is
F(A + e, + +
B ) < (1 e ) F ( A e*,T ) ,where e* is the op- not a metric since it fails the triangle inequality. Decid-
timal translation. Finding such a translation can be done in ing whether there is a translation plus scaling that brings
(?(n2.5) time [48]. the partial Hausdorff distance under a given threshold is
Variations on the bottleneck distance are the minimum done in [30] by means of a transformation space subdivi-
weight distance, the most uniform distance, and the mini- sion scheme. The running time depends on the depth of
mum deviation distance. subdivision of transformation space.
For pattern matching, the Hausdorff metric is often tool
4.4. Hausdorff Distance sensitive to noise. For finite point sets, the partial Haus-
dorff distance is not that sensitive, but it is no metric.
In many applications, for example stereo matching, not Alternatively, [7] observes that the Hausdorff distance of
all points from A need to have a corresponding point in B , A , B E X , X having a finite number of elements, can be
due to occlusion and noise. Typically, the two point sets are written as H ( A , B ) = supzEx Id(z,A) - d(z,B)I. The
of different size, so that no one-to-one correspondence ex- supremum is then replaced by an average: A p ( A , B ) =:
ists between all points. In that case, a dissimilarity measure (A CzEX Id(%,A ) - d ( z , B)lP)'lP, where d ( z , A ) =:
that is often used is the Hausdorff distance. The Hausdorff infaEA d ( z , U ) , resulting in the p-th order mean Hausdodf
distance is defined not only for finite point sets, it is defined distance. This is a metric less sensitive to noise, but it is
on nonempty closed bounded subsets_of any metric space. still not robust. It can also not be used for partial matching,
The directed Hausdorff distance h(A,B ) is defined as since it depends on all points.
the lowest upper bound+(supremum) over all points in A
of the distances to B: h(A,B ) = supaEAinfbGBd ( a ,b ) , 4.5. Turning Function Distance
with d(a, b) the underlying distance, for example the Eu-
clidean distance (Lz). The Hausdorff distance H ( A ,B ) The cumulative angle function, or turning function,
O A ( S )of a polygon or polyline A gives the angle between
is the maximum of i ( A , B ) and L ( B , A ) : H ( A , B ) =
the counterclockwise tangent and the z-axis as a function of
max{$A,B),a?(B,A)}. For finite point sets, it can the arc length s. O A ( S keeps
) track of the turning that takes
be computed using Voronoi diagrams in time O ( ( m +
place, increasing with left hand turns, and decreasing with
n )l o d m + n ) )VI. right hand turns, see figure 4.
Given two finit point sets A and B , the translation e*
that minimizes the Hausdorff distance d ( A + e, B ) can be
determined in time O(mn(1ogm n ) 2 )when the underlying
metric is L1 or L , [ 141. For other L, metrics, p = 2 , 3 , . . .
+
it can be computed in time U(mn(m n)a(mn)log(m +
n ) )[29]. ( a ( n )is the inverse Ackermann function, a very
slowly increasing function.) This is done using the upper
envelopes of Voronoi surfaces. Figure 4. Polygonal curve and turning func-
Computing the optimal rigid motion T (translation plus tion.
rotation), minimizing H ( r ( A ) ,B ) can be done in O ( ( m +
n)610g(mn)) time [28]. This is done using dynamic
Voronoi diagrams. Given a real value E , deciding if there Clearly, this function is invariant under translation of the
is a rigid motion such that H ( r ( A ) ,B ) < 6 can be done in polyline. Rotating a polyline over an angle 8 results in a ver-
+
time O ( ( m n)m2n2logmn) [13]. tical shift of the function with an amount 8. For polygons
Given the high complexities of these problems, it makes and polylines, the turning function is a piecewise constarit
sense to look at approximations. Computing an approxi- function, increasing or decreasing at the vertices, and con-
mate optimal Hausdorff distance under translation and rigid stant between two consecutive vertices.
motion is discussed in [ 11. In [6] the turning angle function is used to match poly-
The Hausdorff distance is very sensitive to noise: a gons. First the size of the polygons are scaled so that they
single outlier can determine the distance value. For fi- have equal perimeter. The L, metric on function spaces, ap-
nite point sets, a similar measure that is not as sensi- plied to O A and O B , gives a dissimilarity measure on A and
tive is the partial Hausdorfs distance. It is the max- B: ~ ( A , B=) (J I o A ( S ) - o ~ ( ~d s)) l Il PP ,see figure 5 .
imum of the two directed pa@l Hausdorff distances: In [ 5 8 ] ,for the purpose of retrieving hieroglyphic shape:;,
Hk(A,B) = max{hk(A,B),hk(B,A)}, where the di- polyline curves do not have the same length, so that partial

192
A variation of the FrCchet distance is obtained by drop-
ping the monotonicity condition of the parameterization.
The resulting FrCchet distance d ( A , B ) is a semimetric:
zero distance need not mean that the objects are the same.
Another variation is to consider partial matching: finding
the part of one curve to which the other has the smallest
FrCchet distance.
Parameterized contours are curves where the starting
Figure 5. Rectangles enclosed by OA(S), point and ending point are the same. However, the starting
O,(s), and dotted lines are used for evalu- and ending point could as well lie somewhere else on the
ation of dissimilarity. contour, without changing the shape of the contour curve.
For convex contours, the Frtchet distance is equal to the
Hausdorff distance [4].
matching can be performed. In that case the starting point
of the shorter one is moved along the longer one, consider- 4.7. Nonlinear elastic matching distance
ing only the turning function where the arc lengths overlap.
This is a variation of the algorithms for matching closed Let A = { a l , . . . ,a,} and B = { b l , . . . , b,} be two
polygons with respect to the turning function, which can be finite sets of ordered contour points, and let f be a corre-
done in O(mnlog(mn)) time [6]. spondence between all points in A and all points in B such
Partial matching under scaling, in addition to transla- that there are no a1 < a2, with f ( a 1 ) > f ( a 2 ) . The stretch
tion and rotation, is more involved. It can be done in time s(ai,b j ) of ( a i ,f(ui) = b j ) is 1 if either f ( a i - 1 ) = bj or
O(m2n2), see [15]. f ( a i ) = b j - 1 , or 0 otherwise. The nonlinear elastic match-
ing distance N E M ( A ,B ) is the minimum over all corre-
4.6. Frichet Distance spondences f of +
s(ai,b j ) d(ai, b j ) , with d ( a i , b j ) the
difference between the tangent angles at ai and b j . It can be
The Hausdorff distance is often not appropriate to, mea- computed using dynamic programming [ 161. This measure
sure the dissimilarity between curves. For all points, on A , is not a metric, since it does not obey the triangle inequality.
the distance to the closest point on B may be small:, but The relaxed nonlinear elastic matching distance N E M T
if we walk forward along curves A and B simultaneously, is a variation of N E M , where the stretch s ( a i , b j ) of
and measure the distance between corresponding points, the ( a i ,f ( a i ) = b j ) is T (rather than 1) if either f ( a i - 1 ) = bj
maximum of these distances may be larger. This is what is or f ( a i ) = bj-1, or 0 otherwise, where T 2 1 is a chosen
called the FrCchet distance. More formally, let A and B constant. The resulting distance is not a metric, but it does
be two parameterized curves A ( a ( t ) )and B(P(t)),and let obey the relaxed triangle inequality [ 181.
their parameterizations (Y and ,8 be continuous functions of
the same parameter t E [0,1],such that a(0) = p(0) = 0, 4.8. Reflection Distance
and a(1) = p(l) = 1. The FrCchet distance is the mini-
mum over all monotone increasing parameterizations a ( t ) The rejection metric [24] is an affine-invariant metric
and p(t) of the maximal distance d ( A ( a ( t ) )B(P(t))),
, tE that is defined on finite unions of curves in the plane or sur-
[O, 11. faces in space. They are converted into real-valued func-
[4] considers the computation of the FrCchet distance for tions on the plane. Then, these functions are compared us-
the special case of polylines. Deciding whether the FrCchet ing integration, resulting in a similarity measure for the cor-
distance is smaller than a given constant, can be done in responding patterns.
time O(mn). Based on this result, and the 'parametric The functions are formed as follows, for each finite union
search' technique, it is derived that the computation of the of curves A. For each z E R2, the visibility star V$
FrCchet distance can be done in time O(mnlog(")). Al- is defined as the union of open line segments connecting
though the algorithm has low asymptotic complexity, it is points of A that are visible from z: V z = U{= 1 a E
not really practical. The parametric search technique used A and A n = a}.The rejection star R i is defined by
here makes use of a sorting network with very high con- intersecting V$ with its reflection in z: R$ = {x w E +
stants in the running time. A simpler sorting algorithm leads +
R2 I z - w E VT a n d z w E V;}, see figure 6. The
to an asymptotic running time of o(mn(10gmn)~).Still, function P A : R2 -+ R is the area of the reflection star in
the parametric search is not easy to implement. A simpler each point: p ~ ( z =) area(R5). Observe that for points z
+
algorithm, which runs in time O(mn(m n) log(mn)) is outside the convex hull of A , this area is always zero. The
given in [20]. reflection metric between patterns A and B defines a nor-

193
tu(&))}, where Ai and Bi are subsets of Et2, with asso-
ciates weights w(Ai), tu(&). The transport distance be-
tween A and B is the minimum amount of work needed to
transform A into B.(The notion of work as in physics: the
amount of energy needed to displace a mass.) This is a dis-
crete form of what is also called the Monge-Kantorovich
distance. The transport distance is used for various pur-
poses, and goes under different names. Monge first stated! it
in 1781 to describe the most efficient way to transport piles
of soil. It is called the Augean Stable by David Mumford,
after the story in Greek mythology about the moving of
piles of dirt from the huge cattle stables of king Augeas by
Hercules [ 5 ] . The name Earth Mover’s Distance is coined
A by Jorge Stolfi. The transport distance has been used in heat
transform problems [43], object contour matching [ 191, and
color-based image retrieval [46].
Figure 6. Reflection star at z (dark grey).
Let f i j be the flow from location Ai to B j , F the matrix
of elements f i j , and dij the ‘ground distance’ between Ai
malized difference of the corresponding functions PA and and Bj.Then the transport distance is
PB:
m i n CL1
~ Cj”=1f i j d i j
1
C L Cy=,f i j
under the following conditions: fij 2 0 for all i
From the definition follows that the reflection metric is and j , fij 5 w(Bj), Cy=,fij I w(Ai), and
invariant under all affine transformations. In contrast to the cE1 fij = m i n { C L l w(Ai), w(Bj)}. If
FrCchet distance, this metric is defined also for patterns con- the total weights of the two sets are equal, and the ground
sisting of multiple curves. In addition, the reflection met- distance is metric, then the transport distance is a metric.
ric is deformation, blur, crack, and noise occlusion robust. It can be computed by linear programming in polynomial
Computing the reflection metric by explicitly constructing time. The often used simplex algorithm can give exponen-
the overlay of two visibility graphs results in a high time tial time complexity in the worst case, but often performs
complexity, Q(c(m+n)), with c the complexity ofthe over- linear in practice.
+
lay, which is O ( ( m n)4) [23].
5. Algorithms
4.9. Area of Symmetric Difference, Template Metric
In the previous section, algorithms were mentioned
For two compact sets A and B , the area of symmet- along with the description of the measure, when the algo-
ric difference, also called template metric, is defined as rithm is specific for that measure. This section mentiom a
area((A - B ) U ( B - A ) ) . Unlike the area of overlap, few algorithms that are more general.
this measure is a metric.
Translating convex polygons so that their centroids coin- 5.1. Voting schemes
cide also gives an approximate solution for the symmetric
difference, which is at most 11/3 of the optimal solution Geometric hashing [33, 591 is a method that determines
under translations [3]. This also holds for a set of transfor- if there is a transformed subset of the query point set that
mations F other than translations, if the following holds: matches a subset of a target point set. The method first con-
the centroid of A, c ( A ) ,is equivariant under the transfor- structs a single hash table for all target point sets together.
mations, i.e. c ( f ( A ) )= f ( c ( A ) )for all f in F , and F is It is described here for 2D. Each point is represented as
closed under composition with translation. + +
eo K(e1 - eo) A(e2 - e o ) , for some fixed choice of
points eo,el,e2, and the ( ~ , X ) - p l a n eis quantized into a
4.10. Transport Distance, Earth Mover’s Distance two-dimensional table, mapping each real coordinate pair
( K , A) to an integer index pair ( I C , e).
Given two weighted point patterns A = { ( A l , w ( A i ) ) , Let there be N target point sets Bi. For each target point
... , ( A m , w ( A m ) )and
} B = {(&,w(&)), ... ? ( B n 1 set, the following is done. For each three non-collinear

194
points eo, e l , e2 from the point set, express the other points mal transformation is approximated to any desired accuracy.
as eo + +
K(e1 - eo) X(e2 - eo), and append the tuple The matching can be done with respect to any transforma-
( 2 , eo, e l , e2) to entry (k,e). If there are 8 ( m ) points in tion group G, for example similarity (translation, rotation,
each target point set, the construction of the hash table is of and scaling), or affine transformations (translation, rotation,
complexity S ( N m 4 ) . scaling, and shear).
NOW, given a query point set A , choose three non- The algorithm uses a lower bound X(C, A, B ) for the
collinear points eh, e:, e: from the point set, and express similarity d ( g ( A ) ,B ) over g E C C G, where C is a set of
each other point as eh +.(e: - eh) +.\(e; - eh), and tally a transformations represented by a rectangular cell in parame-
vote for each tuple (i, eo, e l , e2) in entry (k,l ) of the table. ter space. The algorithm starts with a cell C that contains all
The tuple (i, eo, e l , e2) that receives most votes indicates possible minima, which is inserted in a priority queue, us-
the target point set Ti containing the query point set. The ing the lower bound X(C, A, B ) as a key. In each iteration,
affine transformation that maps (eh, e:, e:) to the winner the cell having a minimal associated value of X is extracted
(eo, e l , e2) is assumed to be the transformation between the from the priority queue. If the size of the corresponding cell
two shapes. The complexity of matching a single query set is so small that it achieves the desired accuracy, some trans-
of n points is O ( n ) .There are several variations of this ba- formation in that cell is reported as a (pseudo-)minimum.
sic method, such as balancing the hashing table, or avoiding Otherwise the cell is split in two, the lower bounds of its
taking all possible S ( n 3 )3-tuples. sub-cells will be evaluated, and subsequently inserted into
The generalized Hough transform [8], or pose clustering the priority queue.
[51], is also a voting scheme. Here, affine transformations The previous algorithm is simple, and tests show that
are represented by six coefficients. The quantized transfor- it has a typical running time of a few minutes for trans-
mation space is represented as a six-dimensional table. Now lation, translation plus scaling, rigid transformations and
for each triplet of points in one set, and each triplet of points sometimes even for affine transformation. Since transfor-
from the other set, compute the transformation between the mation space is split up recursively, the algorithm is also
two triples, and tally a vote in the corresponding entry of expected to work well in applications in which the minima
the table. Again the winner entry determines the matching lie in small clusters. A speed-up can be achieved by com-
transformation. The complexity of matching a single query bining the progressive subdivision with alignment [38].
set is S ( N m 3 n 3 ) .
In the alignment method (27, 541, for each triplet of 6. Concluding remarks
points from the query set, and each triplet from the target
set, we compute the transformation between them. With
We have discussed a number of shape similarity prop-
each such transformation, all the other points from the tar-
erties. More possibly useful properties are formulated in
get set are transformed. If they match with query points, the
[57]. It is a challenging research task to construct similar-
transformation receives a vote, and if the number of votes
ity measure with a chosen set of properties. We can use a
is above a chosen threshold, the transformation is assumed
number of constructions to achieve some properties, such as
to be the matching transformation between the query and
remapping, normalization, going from semi-metric to met-
the target. The complexity of matching a single query set is
ric, defining semi-metrics on orbits, extension of pattern
S(”4n3).
space with the empty set, vantageing, and imbedding pat-
Variations of these methods also work for geometric fea-
terns in a function space, see [57].
tures other than points, such as segments, or points with nor-
A difficult problem is partial matching. Applications
mal vectors [ lo], and for other transformations than affine
where this plays a vital role is the registration of scan-
transformations. A comparison between geometric hashing,
ning data sets from multiple views, reverse engineering, and
pose clustering, and the alignment method is made in [60].
shape database retrieval. The difficulty is that the distance
Other voting schemes exist, for example taking a probabilis-
measure must be suitable for partial matching. The dissim-
tic approach [39].
ilarity must be small when two shapes contain similar re-
gions, and the measure should not penalize for regions that
5.2. Subdivision Schemes do not match. Also, the number of local minima of the dis-
tance can be large. For example, even for rigid motions in
As mentioned above, deciding whether there is a trans- 2D, the lower bound on the worst case number of minima of
lation plus scaling that brings the partial Hausdorff distance the Hausdorff distance is s2(n5)[47]. So, for large values
under a given threshold is done in [30] by a progressive of n, the time complexity is prohibitively high if all local
subdivision of transformation space. The subdivision of minima should be evaluated. Finding a good approximate
transformation space is generalized to a general ‘geomet- transformation is essential. After that, numerical optimiza-
ric branch and bound’ framework in [25]. Here the opti- tion techniques can perhaps be used to find the optimum.

195
This cannot be done right from the start, because such tech- [I21 C. C. Chen. Improved moment invariants for shape discrim-
niques get easily stuck in local minima. ination. Pattern Recognition, 26(5):683-686, 1993.
Another approach to partial matching is to first decom- [I31 L. P. Chew, M. T. Goodrich. D. P. Huttenlocher, K. Kedem,
pose the shapes into smaller parts, and do the matching with J. M. Kleinberg, and D. Kravets. Geometric pattern match-
the parts. For example, retrieving shapes similar to the cen- ing under Euclidean motion. Cornprrtational Geometn, Thc-
ory and Applications, 7: 1 13-124, 1997.
taur from figure 2 with partial matching, should yield both
the man and the horse. If these shape are decomposed into a [ 141 P. Chew and K. Kedem. Improvements on approximate pat-
tern matching. In 3rd Scandinavian Workshop on Algorithm
buste and a body, than matching is much easier. The advan-
T h e o n , Lecture Notes in Computer Science 621, pages 3 I E)-
tages of taking apart buste and body was already recognized 325. Springer, 1992.
by Xenophon in his work Cyropaedia (370s BC) [61]:
[I51 S. D. Cohen and L. J. Guibas. Partial matching of planx
“Indeed, my state will be better than being grown polylines under similarity transformations. In Proceedings
of the 8th Annual Symposiron on Discrete Algorithms, pages
together in one piece. [...I so what else shall I
777-786, 1997.
be than a.centaur that can be taken apart and put
[I61 G. Cortelazzo, G. A. Mian, G. Vezzi, and P. Zamper-
together.”
oni. Trademark shapes description by string-matching tech-
niques. Pattern Recognition, 27: 1005-1018, 1994.
References [I71 A. Efrat and A. Itai. Improvements on bottleneck matching
and related problems using geometry. Proceedings of the
[I] 0. Aichholzer, H. Alt, and G. Rote. Matching shapes with a 12th Symposium on Computational G e o m e t n . pages 30 1-
reference point. In International Journal of Computational 310, 1996.
G e o m e t n and Applications, volume 7, pages 349-363, Au- [I81 R. Fagin and L. Stockmeyer. Relaxing the triangle inequal-
gust 1997. ity in pattern matching. Int. Journal of Computer Vision,
[2] H. Alt, B. Behrends, and J. Blomer. Approximate matching 28(3):21’+231, 1998.
of polygonal shapes. Annals of Mathematics and ArtiJcial [ 191 D.Fry. Shape Recognition using M e t r i a on the Space of
Intelligence, pages 25 1-265, 1995. Shapes. PhD thesis, Harvard University, Department of
[3] H. Alt, U. Fuchs, G. Rote, and G. Weber. Matching con- Mathematics, 1993.
vex shapes with respect to the symmetric difference. In [20] M. Godau. A natural metric for curves - computing the dis-
Algorithms ESA ‘96, Proceedings of the 4th Annual Euro- tance for polygonal chains and approximation algorithms.
pean Symposiiun on Algorithms, Barcelona, Spain, Septem- In Proceedings of the S~niposiumon Theoretical Aspects of
ber ’96, pages 320-333. LNCS 1136, Springer, 1996. Conilmter Science (STACS),Lecture Notes in Computer Sci-
[4] H. Alt and M. Godeau. Computing the Frechet distance be- ence 480, pages 127-136. Springer, 1991.
tween two polygonal curves. International Journal of Com- [2 11 S . Gold. Matching and Learning Stnrctitral and Spatial Rep-
putational G e o m e t n & Applications, pages 75-9 I , 1995. resentations with Neural Networks. PhD thesis, Yale Univcr-
[5] Apollodorus. Library 100-200A D . Sir James George Frazer sity, 1995.
(ed.), Harvard University Press, 1921, ISBN 0674991354, [22] B. Giinsel and A. M. Tekalp. Shape similarity matching
0674991362. S e e a l s o h t t p : / / w w w . p e r s e u s . t u f t s . for query-by-example. Pattern Recognition, 3 1(7):93 1-944,
edu/cgi-bin/ptext?lookup=Apollod.+2.5.5. 1998.
[6] E. Arkin, P. Chew, D. Huttenlocher, K. Kedem, and
[23] M. Hagedoorn, M. Overmars, and R. C. Veltkamp. A new
J. Mitchel. An efficiently computable metric for comparing
visibility partition for affine pattern matching. In Discrete
polygonal shapes. IEEE Transactions on Pattern Analysis
Geometry for Computer hnagen: 9th International Confir-
and Machine Intelligence, 13(3):209-215, 1991.
ence Proceedings DGCI 2000, LNCS 1953, pages 358-370.
[7] A. J . Baddeley. An error metric for binary images. In Robust
Springer, 2000.
Computer Vision: Quality of Vision Algorithms. Proc. of
the Int. Workshop on Robust Computer Vision, Bonn, 1992,
[24] M. Hagedoorn and R. C. Veltkamp. Metric pattern
pages 59-78. Wichmann, 1992. spaces. ’Technical Report UU-CS-1999-03, Utrecht Univer-
[8] D. H. Ballard. Generalized Hough transform to detect arbi- sity, 1999.
trary patterns. IEEE Transactions on Pattern Analysis and [25] M. Hagedoom and R. C. Veltkamp. Reliable and efficient
Machine Intelligence, 13(2):111-122, 1981. pattern matching using an affine invariant metric. Interna-
[9] D.H. Ballard and C. M. Brown. Computer Vision. Prentice tional Journal of Computer Vision, 3 l (2/3):203-225, 199‘3.
Hall, 1982. [26] P. S. Heckbert and M. Garland. Survey of polygonal sur-
[IO] G. Barequet and M. Sharir. Partial surface matching by using face simplification algorithms. Technical report, School
directed footprints. In Proceedings of the 12th Annual Sym- of Computer Science, Camegie Mellon University, Pitts-
posium on Computational Geometry, pages 409-410, 1996. burgh, 1997. Multiresolution Surface Modeling Course S I G
[ 1 I] P. Brass and C. Knauer. Testing the congruence of d- GRAPH ’97.
dimensional point sets. In Proceedings 16th Annual ACM [27] D. Huttenlocher and S . Ullman. Object recognition using
Symposium on Computational Geometry, pages 3 10-3 14, alignment. In Proceedings of the International Conference
2000. on Computer Vision, London, pages 102-1 1I , 1987.

196
[28] D. P. Huttenlocher, K. Kedem, and J. M. Kleinberg. On [45] P. L. Rosin. Techniques for assessing polygonal approxima-
dynamic Voronoi diagrams and the minimum Hausdorff dis- tions of curves. IEEE Transactions on Pattern Recognition
tance for point sets under Euclidean motion in the plane. In and Machine Intelligence, 19(5):659-666, 1997.
Proceedings of the 8th Annuual ACM Symposium on Com- [46] Y.Rubner, C. Tomassi, and L. Guibas. A metric for dis-
putational Geometry, pages 1 10-1 20, 1992. tributions with applications to image databases. In Proc. of
[29] D. P. Huttenlocher, K. Kedem, and M. Sharir. The upper the IEEE Int. Con$ on Comp. Vision, Bombay, India, pages
envelope of Voronoi surfaces and its applications. Discrete 59-66, 1998.
and Computational Geometry, 9:267-291, 1993. [47] W. J. Rucklidge. Lower bounds for the complexity of the
[30] D. P. Huttenlocher, G. A. Klanderman, and W. J. Ruck- graph of the Hausdorff distance as a function of transfor-
lidge. Comparing images using the Hausdorff distance. mation. Discrete & Computational Geometry, 16:135-1 53,
IEEE Transactions on Pattem Analysis and Machinen In- September 1996.
telligence, 15:850-863, 1993. [48] S. Schirra. Approximate decision algorithms for approxi-
[31] C. Jacobs, A. Finkelstein, and D. Salesin. Fast multireso- mate congruence. Information Processing Letters, 43:29-
lution image querying. In Computer Graphics Proceedings 34, 1992.
SIGGRAPH, pages 271-286, 1995. [49] S. Sclaroff. Deformable prototypes for encoding shape cate-
[32] A. Khotanzad and Y.H. Hong. Invariant image recongnition gories in image databases. Pattern Recognition, 30(4):627-
by Zemike moments. IEEE Transactions on Pattem Analy- 641, 1997.
sis and Machine Intelligence, 12(5):489497, 1990. [SO] S. Sclaroff and A. P. Pentland. Modal matching for corre-
[33] Y. Lamdan and H. J. Wolfson. Geometric hashing: a general spondence and recognition. IEEE Transactions on Pattern
and efficient model-based recognition scheme. In 2nd Inter: Analysis and Machine Intelligence, 17(6),June 1995.
Con$ on Comput. Vision, pages 238-249, 1988. [51] G. Stockman. Object recognition and localization via pose
clustering. Computer Vision, Graphics, and Image Process-
[34] L. J. Latecki and R. Lakamper. Convexity rule for shape de-
ing, 40(3):361-387, 1987.
composition based on discrete contour evolution. Computer
[52] D. W. Thomsom. Morphology and mathematics. Transac-
Vision and Image Understanding, 73(3):441454, 1999.
tions of the Royal Society of Edinburgh, Volume L, Part IV,
[35] S. Loncaric. A survey of shape analysis techniques. Pattern
NO. 27:857-895, 1915.
Recognition, 3 1(8):983-1001, 1998.
[53] A. Tversky. Features of similarity. Psychological Review,
[36] F. Mokhtarian, S. Abbasi, and J. Kittler. Efficient and robust 84(4):327-352, 1977.
retrieval by shape content through curvature scale space. In [54] S. Ullman. High-Level Vision. MIT Press, 1996.
Image Databases and Multi-Media Search, proceedings of [55] S. Umeyama. Parameterized point pattem matching and
the First Intemational Workshop IDB-MMS'96, Amsterdam, its application to recognition of object families. IEEE
The Netherlands, pages 35-42, 1996. Transactions on Pattem Analysis and Machine Intelligence,
[37] F. Moktarian and S. Abassi. Retrieval of similar shapes un- 15(1):136-144, 1993.
der affine transform. In Visual Information and Information [56] P. J. van Otterloo. A Contour-Oriented Approach to Shape
Systems, Proceedings VISUAL'99, pages 566-574. LNCS Analysis. Hemel Hampstead, Prentice Hall, 1992.
1614, Springer, 1999. [57] R. C. Veltkamp and M. Hagedoom. Shape similarities,prop-
[38] D. M. Mount, N. S. Netanyahu, and J. LeMoigne. Efficient erties, and constructions. In Advances in Visual Information
algorithms for robust point pattem matching and applica- Systems, Proceedings of the 4th Intemational Conference,
tions to image registration. Pattem Recognition, 32: 17-38, VISUAL 2000, Lyon, France, November 2000, LNCS 1929,
1999. pages 467-476. Springer, 2000.
[39] C. F. Olson. Efficient pose clustering using a random- [58] J. Vleugels and R. C. Veltkamp. Efficient image retrieval
ized algorithm. Intemational Journal of Computer Vision, through vantage objects. In D. P. Huijsmans and A. W. M.
23(2):131-147,'1997. Smeulders, editors, Visual Information and Information
[40] R. Osada, T. Funkhouser, B. Chazelle, and D. Dobkin. Systems - Proceedings of the Third International Confer-
Matching 3d models with shape distributions. In this pro- ence VISUAL'99, Amsterdam, The Netherlands, June 1999,
ceedings, 2001. LNCS 1614, pages 575-584. Springer, 1999.
[41] Plato. Meno, 380 BC. W. Lamb (ed.), Plato in Twelve [59] H. Wolfson and I. Rigoutsos. Geometric hashing: an
Volumes, vol. 3. Harvard University Press, 1967. ISBN overview. IEEE Computational Science & Engineering,
0674991842. S e e a l s o h t t p : / / w w w . p e r s e u s . t u f t s . pages 10-21, October-December 1997.
edu/cgi-bin/ptext?lookup=Plat.+Meno+70a. [60] H. J. Wolfson. Model-based object recognition by geometric
[42] R. J. Prokop and A. P. Reeves. A survey of moment-based hashing. In Proceedings of the 1st European Conference on
techniques for unoccluded object representation and recog- Computer Vision, Lecture Notes in Computer Science 427,
nition. CVGIP: Graphics Models and Image Processing, pages 526-536. Springer, 1990.
54(5):438-460, 1992. [61] Xenophon. Cyropaedia, 370s BC. W. Miller (ed.),
[43] S. Rachev. The Monge-Kantorovich mass transference prob- Xenophon in Seven Volumes, vol. 5,6, Harvard Univer-
lem and its stochastical applications. Theory of Probability sity Press, 1983, 1979. ISBN 0674990579, 0674990587.
and Applications, 29:647-676, 1985. See also h t t p : //www.perseus. t u f t s . e d u / c g i -
[U] S. Ranade and A. Rosenfeld. Point pattem matching by re- bin/ptext?lookup=Xen.+Cyrop.+4.3.1.
laxation. Pattern Recognition, 12:269-275, 1980.

197

You might also like